Integrating the Mind: Domain General Versus Domain Specific Processes in Higher Cognition

Author: Maxwell J. Roberts

Integrating the Mind

There are currently several debates taking place simultaneously in various fields of psychology that address the same fundamental issue: to what extent are the processes and resources that underlie higher cognition domain general versus domain specific? Extreme domain specificity argues that people are effective thinkers only in contexts which they have directly experienced, or in which evolution has equipped them with effective solutions. The role of general cognitive abilities is ignored, or denied altogether. This book evaluates the evidence and arguments put forward in support of domain specific cognition, at the expense of domain generality. The contributions reflect a range of expertise, and present research into logical reasoning, problem solving, judgement and decision making, cognitive development, and intelligence. The contributors suggest that domain general processes are essential, and that domain specific processes cannot function without them. Rather than continuing to divide the mind's function into ever more specific units, this book argues that psychologists should look for greater integration and for people's general cognitive skills to be viewed as an integral part of their lives. Integrating the Mind will be valuable reading for students and researchers in psychology interested in the fields of cognition, cognitive development, intelligence and skilled behaviour.

Maxwell J. Roberts is a psychology lecturer at the University of Essex.

Integrating the Mind

Domain general versus domain specific processes in higher cognition

Edited by Maxwell J. Roberts

First published 2007 by Psychology Press 27 Church Road, Hove, East Sussex BN3 2FA Simultaneously published in the USA and Canada by Psychology Press 270 Madison Avenue, New York, NY 10016 This edition published in the Taylor & Francis e-Library, 2008. “To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.”

Psychology Press is an imprint of the Taylor & Francis Group, an informa business

Copyright © 2007 Psychology Press

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data
Roberts, Maxwell J.
Integrating the mind : domain general versus domain specific processes in higher cognition / Maxwell J. Roberts.
p. cm.
ISBN 1-84169-587-4 (hardback)
1. Cognition. I. Title.
BF311.R54 2007
153.4–dc22
2006027001

ISBN 0-203-92669-2 Master e-book ISBN

ISBN: 978-1-84169-587-7 (Print Edition)

Contents

Contributors

Introduction
MAXWELL J. ROBERTS

PART I
Extreme domain specificity and higher cognition

1 Contextual facilitation methodology as a means of investigating domain specific cognition
MAXWELL J. ROBERTS

2 To what extent do social contracts affect performance on Wason's selection task?
IRA A. NOVECK, HUGO MERCIER, AND JEAN-BAPTISTE VAN DER HENST

3 What sorts of reasoning modules have been provided by evolution? Some experiments conducted among Tukano speakers in Brazilian Amazônia concerning reasoning about conditional propositions and about conditional probabilities
DAVID P. O'BRIEN, ANTÔNIO ROAZZI, RENATO ATHIAS, AND MARIA DO CARMO BRANDÃO

4 Content-independent conditional inference
DAVID E. OVER

5 Ontological commitments and domain specific categorisation
STEVEN SLOMAN, TANIA LOMBROZO, AND BARBARA MALT

6 Perspectives on the “tools” of decision-making
BEN R. NEWELL AND DAVID R. SHANKS

7 Domain general contributions to social reasoning: The perspective from cognitive neuroscience
MARGARET C. MCKINNON, BRIAN LEVINE, AND MORRIS MOSCOVITCH

8 Explaining the domain generality of human cognition
KEITH STENNING AND MICHIEL VAN LAMBALGEN

PART II
Extreme domain specificity and cognitive development

9 Domain general processes in higher cognition: Analogical reasoning, schema induction and capacity limitations
GRAEME S. HALFORD AND GLENDA ANDREWS

10 A competence–procedural and developmental approach to logical reasoning
WILLIS F. OVERTON AND ANTHONY STEVEN DICK

11 Less specificity in higher cognitive mechanisms: Evidence from theory of mind
KEITH R. HAPPANEY AND PHILIP DAVID ZELAZO

12 Interactions between domain general and domain specific processes in the development of children's theories of mind
LOUIS J. MOSES AND MARK A. SABBAGH

13 Do we need a number sense?
KELLY S. MIX AND CATHERINE M. SANDHOFER

PART III
Extreme domain specificity versus domain general intelligence

14 Do problem solvers need to be intelligent?
MAXWELL J. ROBERTS

15 Creativity: Specialised expertise or general cognitive processes?
DEAN KEITH SIMONTON

16 The CASE for a general factor in intelligence
PHILIP ADEY

17 Innovation, fatal accidents, and the evolution of general intelligence
LINDA S. GOTTFREDSON

18 Heritability and the nomological network of g
NATHAN BRODY

19 Cognitive and neurobiological mechanisms of the Law of General Intelligence
CHRISTOPHER F. CHABRIS

Author index

Subject index

Contributors

Philip Adey, Department of Education and Professional Studies, King's College, Franklin-Wilkins Building (Waterloo Bridge Wing), Waterloo Road, London SE1 9NH, UK.
Glenda Andrews, School of Psychology, Griffith University, Gold Coast Campus, PMB 50, Gold Coast Mail Centre, 9726, Australia.
Renato Athias, Departamento de Antropologia, Universidade Federal de Pernambuco, Av. Acad. Hélio Ramos, s/n – CFCH, 11º Andar, Recife 50670-901 PE, Brazil.
Nathan Brody, 50 Walbridge Road, West Hartford, CT 06119, USA.
Christopher F. Chabris, Department of Psychology, Harvard University, 33 Kirkland Street, Cambridge, MA 02138, USA.
Anthony Steven Dick, Department of Neurology, University of Chicago, The University of Chicago Hospitals, 5841 S. Maryland Avenue, Chicago, IL 60637, USA.
Maria do Carmo Brandão, Departamento de Antropologia, Universidade Federal de Pernambuco, Av. Acad. Hélio Ramos, s/n – CFCH, 11º Andar, Recife 50670-901 PE, Brazil.
Linda S. Gottfredson, School of Education, University of Delaware, Newark, DE 19716, USA.
Graeme S. Halford, Applied Cognitive Neuroscience Research Centre, School of Psychology, Mt Gravatt Campus, Griffith University, Queensland 4111, Australia.
Keith R. Happaney, Department of Psychology, Lehman College (CUNY), 250 Bedford Park Blvd. West, Bronx, NY 10468-1589, USA.
Brian Levine, Rotman Research Institute, Baycrest Centre for Geriatric Care, 3560 Bathurst Street, Toronto, ON, M6A 2E1, Canada.
Tania Lombrozo, Department of Psychology, University of California at Berkeley, 3210 Tolman Hall, Berkeley, CA 94720, USA.
Barbara Malt, Department of Psychology, Lehigh University, 17 Memorial Drive East, Bethlehem, PA 18015, USA.
Margaret C. McKinnon, Mood Disorders Program, Centre for Mountain Health Services, St. Joseph's Healthcare, Rm K-109D, 100 West 5th St., Box 585, Hamilton, ON, L8N 3K7, Canada.
Hugo Mercier, Institut Jean Nicod, CNRS-EHESS-ENS, 1 bis Avenue de Lowendal, 75007 Paris, France.
Kelly S. Mix, Department of Counseling, Educational Psychology and Special Education, College of Education, Michigan State University, 447 Erickson Hall, East Lansing, MI 48824, USA.
Morris Moscovitch, Rotman Research Institute, Baycrest Centre for Geriatric Care, 3560 Bathurst Street, Toronto, ON, M6A 2E1, Canada.
Louis J. Moses, Department of Psychology, 1227 University of Oregon, Eugene, OR 97405-1227, USA.
Ben R. Newell, School of Psychology, University of New South Wales, Sydney, 2052, Australia.
Ira A. Noveck, Institut des Sciences Cognitives, CNRS & Université Lyon 1, 67 Boulevard Pinel, 69675 Bron, France.
David P. O'Brien, Department of Psychology, Baruch College of the City University of New York, Box B8-215, 1 Bernard Baruch Way, New York, NY 10010, USA.
David E. Over, Department of Psychology, University of Durham, South Road, Durham DH1 3LE, UK.
Willis F. Overton, Department of Psychology, Temple University, 1701 North 13th Street, Philadelphia, PA 19122-6085, USA.
Antônio Roazzi, Departamento de Psicologia, Universidade Federal de Pernambuco, Av. Acad. Hélio Ramos, s/n – CFCH, 8º Andar, Recife 50670-901 PE, Brazil.
Maxwell J. Roberts, Department of Psychology, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK.
Mark A. Sabbagh, Psychology Department, Queen's University, Kingston, ON, K7L 3N6, Canada.
Catherine M. Sandhofer, Department of Psychology, University of California, Los Angeles, CA 90095, USA.
David R. Shanks, Department of Psychology, University College London, Gower Street, London WC1E 6BT, UK.
Dean Keith Simonton, Psychology Department, University of California, One Shields Avenue, Davis, CA 95616, USA.
Steven Sloman, Cognitive and Linguistic Sciences, Brown University, Box 1978, Providence, RI 02912, USA.
Keith Stenning, The Human Communication Research Centre, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, UK.
Jean-Baptiste Van der Henst, Institut des Sciences Cognitives, CNRS & Université Lyon 1, 67 Boulevard Pinel, 69675 Bron, France.
Michiel van Lambalgen, Institute for Logic, Language and Computation, Department of Philosophy, University of Amsterdam, Nieuwe Doelenstraat 15, 1012 CP Amsterdam, The Netherlands.
Philip David Zelazo, Department of Psychology, University of Toronto, 100 St George Street, Toronto, ON, M5S 3G3, Canada.

Introduction

Maxwell J. Roberts

Currently, there are several debates taking place simultaneously in various fields of psychology that address the same fundamental issue: to what extent are the processes and resources that underlie higher cognition domain general versus domain specific? Domain general processes function identically on represented information, irrespective of content and context. Furthermore, their success of operation depends on the availability of domain general resources such as working memory capacity. However, many researchers believe that domain general processes serve little purpose in the real world, and may even not exist, and that domain general resources are an irrelevance. Instead, people's performance is determined by domain specific processes that operate only in narrow contexts. Underlying these might be innately programmed modules, originating from the problems that humans needed to solve when they first became social hunter-gatherers (hence, massive modularity). An alternative domain specific position is that, although these procedures dominate cognition, they are learnt (extreme contextualism). Combining these, it is possible that innate predispositions to process particular types of information channel the development of modules, which therefore originate from learning, but indirectly from genetic programming. Theoretical approaches that deny the importance of domain general procedures and resources will be named extreme domain specificity.

The domain general/domain specific debate is extremely wide ranging, and has widespread implications for cognitive science and philosophy of mind. Specifically, it impinges most obviously on cognitive psychologists, but also has immediate implications for developmental, differential, and educational psychologists. For researchers attempting to understand cognitive development, if domain general processes or resources are important, then the study of their acquisition during childhood is a worthwhile exercise.
If not, then the development of domain specific modules, knowledge, and other context-specific inference structures must be studied instead. Proponents of extreme domain specificity also deny that measuring individual differences in general ability, i.e., intelligence, is a useful activity. At best, all that is being measured is a highly specific puzzle-solving ability that has little relevance to the skills that people display in the real world. Also, in education, if domain general reasoning processes are of importance, then there could be some value in teaching general thinking skills. If not, then educators should focus on specific domains.

How seriously is extreme domain specificity taken as a theoretical position? In one influential collection, we have been told that: “A growing number of researchers have concluded that many cognitive abilities are specialised to handle specific types of information. In short, much of human cognition is domain specific” (Hirschfeld & Gelman, 1994, p. 3). Twelve years later, it seems that little has changed. In fact attitudes may even have hardened: “The upshot is that there is nothing in the human psyche that requires any significant retreat from a thesis of massively modular mental organization” (Carruthers, 2004, p. 259). This is despite one of the early proponents of modularity expressing scepticism that higher cognitive processes, so-called central systems, could be modular (Fodor, 1983), and his subsequent failure to be convinced by the arguments: “I'm going to argue that there's no a priori reason that massive modularity should be true; that the most extreme versions of massive modularity can't be true; and that there is, in fact, no convincing evidence that anything of the sort is true” (Fodor, 2000, pp. 64–65; emphasis in original).

The purpose of the current book is to bring together chapters from a wide variety of researchers, all of whom believe that the plausibility of extreme domain specificity warrants close scrutiny. Broadly, they argue that evidence in its support has been overstated, interpretational difficulties have been played down, contradictory data have been ignored, and theoretical difficulties have not been addressed.

The purpose of this introduction is not to review and discuss the various manifestations of extreme domain specificity and the arguments for and against it.
This task is ably accomplished by the individual contributions. Nor will I give a chapter-by-chapter summary. Instead, I will give an overview of the structure of this book and how its chapters relate to each other, finishing by outlining some of the recurring strands and key issues that will be encountered. Many chapters focus on particular manifestations of extreme domain specificity, either massive modularity or extreme contextualism, reflecting the current locus of attention within particular topics. However, evidence against one manifestation frequently counts as evidence against another (see Roberts, Chapter 1) and readers should always bear this in mind while reading.

The chapters are loosely organised by psychological topic, and this book is divided into three parts. The first series of chapters (1 to 8) focuses on the plausibility of extreme domain specificity, both in general and as applied to particular topics in higher cognition, from methodological, empirical, theoretical, and philosophical standpoints. Topics discussed include deductive and inductive reasoning, categorisation, judgement and decision making, and social reasoning. The second series of chapters (9 to 13) focuses on cognitive development within domains central to the debate, including logical competence, theory of mind ability, and number sense. Does the development of these represent only the maturation of modules, or the acquisition of domain specific knowledge (perhaps directed by biological predispositions), or are these domains less “special” than some researchers have suggested? The third series of chapters (14 to 19) focuses on the quintessential domain general ability: intelligence. Proponents of extreme domain specificity reject the entire notion of important general individual differences in cognition. Consistent individual differences in performance, predictable in advance by general cognitive ability, are at best awkward and at worst devastating for extreme domain specificity. This final series of chapters discusses evidence put forward in an attempt to deny the importance of general intelligence, and outlines evidence counter to this, suggesting a wide-ranging, important domain general ability that is difficult to ignore.

Notwithstanding the overall organisation of the book, readers will see that there are many recurring themes throughout. Much of the research in support of extreme domain specificity, whatever its manifestation, is based on use of contextual facilitation methodology. Generally, this compares performance between (1) tasks set in appropriate contexts, versus (2) impoverished abstract tasks that are (supposedly) isomorphic. Good performance at the contextually appropriate material is taken as evidence for the existence of context-specific processes. Roberts (Chapter 1) argues that irrespective of domain, such conclusions drawn from this methodology are suspect. Attempts to manipulate context also manipulate other extraneous variables, and hence performance has been facilitated for reasons irrelevant to context. This line of argument is expanded in the next two chapters, which focus on the Wason Selection Task.
Previously, findings from this task have been taken to imply, at the very least, learnt domain specific reasoning schemas, and possibly modules sensitive to the presence of hazards, and to cheaters evading social contracts. Noveck, Mercier, and Van der Henst (Chapter 2) have investigated several extraneous task variables that have previously been inadvertently manipulated simultaneously with context, and that may have facilitated performance in past studies. O'Brien, Roazzi, Athias, and Brandão (Chapter 3) have performed a cross-cultural study, concluding that different problem formats are differentially interesting to participants, rather than differentially engaging of domain specific reasoning mechanisms. Overton and Dick (Chapter 10) suggest that there are numerous difficulties with studies that attempt to show (1) that contextual selection tasks facilitate children's performance and (2) that this is due to activating domain specific reasoning procedures. Proponents of extreme domain specificity ignore humans' general logical competence and understanding at their peril: Overton and Dick go on to argue that domain specific approaches to cognition entirely neglect the general logical competence that adults show, including the understanding of key domain general concepts such as logical necessity, as well as neglecting the developmental trajectory of logical reasoning ability and logical concepts.

Claims for extreme domain specificity have been put forward not just for reasoning, but also for judgement and decision making. On one hand, suggestions have been made that, owing to our evolutionary history, humans are better equipped to reason about number problems when these are expressed in terms of frequencies rather than proportions. O'Brien, Roazzi, Athias, and Brandão (Chapter 3) argue that these conclusions are based on poorly designed items. With more appropriately designed ones, there is no evidence for differences, even cross-culturally. Over (Chapter 4) argues that many theoretical claims are based on confusions about the natures of frequencies versus proportions, and that it is possible to account for people's behaviour with an extended model of conditional reasoning that takes the uncertainty of information into account. Another recent domain specific claim is that people's judgements are made on the basis of “fast and frugal” heuristics: collections of innate strategies that can be applied to specific tasks. These rapidly generate reasonable answers without overwhelming cognitive resources with extensive computations. However, Newell and Shanks (Chapter 6) suggest that the evidence for domain specificity is poor; rather, people have domain general strategies that are adjustable for particular situations. Furthermore, McKinnon, Levine, and Moscovitch (Chapter 7) argue that even for social reasoning, such as for tasks requiring moral judgements and empathy, the availability of domain general cognitive resources is important.

Similar to the Wason selection task, there can also be difficulties with the use of “matched” control tasks when investigating theory of mind reasoning. Happaney and Zelazo (Chapter 11) suggest that “false-belief” tasks, which supposedly tap into a theory-of-mind module, versus “equivalent” control items, which supposedly do not, are often incorrectly matched in terms of their cognitive demands.
The selection task and false-belief task strands are brought together by Stenning and van Lambalgen (Chapter 8), who suggest that different task contexts trigger the use of different domain general logics. Hence, the selection of domain general processes is context-sensitive. Some of these logics are more demanding in terms of general cognitive resources than others. This theme is echoed by Moses and Sabbagh (Chapter 12), who suggest that, although domain general executive processes and resources are necessary for the solution of false-belief/theory of mind tasks, they are not sufficient. Also necessary is domain specific knowledge of the various properties of beliefs. Even so, reasoning about certain aspects of theory of mind is more demanding of domain general resources than reasoning about others. The importance of the availability of domain general resources is also noted by Overton and Dick (Chapter 10), in terms of success or failure at accessing underlying deductive competence, and by Halford and Andrews (Chapter 9), in terms of ubiquitous relationships between objective measures of task complexity and task performance. Turning to more specific relationships between measures of general executive function, task demands, and reasoning performance, the need for domain general resources is also noted by Happaney and Zelazo (Chapter 11) concerning theory of mind reasoning, and by McKinnon, Levine, and Moscovitch (Chapter 7) concerning social decision making in general. The general point that a domain specific architecture, at the very least, requires a domain general co-ordinator, is explicitly noted by Gottfredson (Chapter 17), Halford and Andrews (Chapter 9), and Happaney and Zelazo (Chapter 11).

A point often neglected by advocates of extreme domain specificity is that apparent specificity can arise as a result of the use of domain general procedures. Hence, Halford and Andrews (Chapter 9) note that although analogies are specific to their relevant domains, schema creation and the associated control processes of schema usage must be domain general. Sloman, Lombrozo, and Malt (Chapter 5) also make this observation in relation to inductive reasoning and categorisation, reaching the conclusion that apparent domain specific differences in the ways in which living things versus artifacts are categorised result largely from the properties of the objects themselves. Hence, minimal assumptions are necessary in order for a domain general categorisation mechanism to exhibit such differences. Furthermore, Mix and Sandhofer (Chapter 13) argue that claims that humans have predispositions to be sensitive to basic number concepts can be explained in terms of much more general basic perceptual and attentional mechanisms. When looking at developmental trajectories, acquisition of number concepts mirrors others for which no special status has been argued.

We can go on to suggest that individual differences in general cognitive skill (i.e., intelligence) are entirely counter to claims made by at least some proponents of extreme domain specificity. If the importance of domain general processes and resources is downplayed, this should imply that no individuals will be consistently better at problem solving than others.
Roberts (Chapter 14) argues that attempts to show that intelligence is unrelated to general problem-solving skill have largely failed. Simonton (Chapter 15) explodes the myths that (1) creative geniuses manifest extreme domain specificity in their skills, and (2) their only differences compared with non-geniuses are in terms of motivation and practice with respect to their domains of expertise: In sharp contrast to established theories of expertise, the achievements of such people imply high levels of domain general problem-solving skills. Brody (Chapter 18) and Gottfredson (Chapter 17) demonstrate that there are substantial relationships between intelligence and performance in the classroom, and when facing the trials of day-to-day life, respectively. This is especially true when wide-ranging indices are taken, aggregating success across many measures.

In recent years, new methodologies have developed, and older methodologies have been refined. Hence, Brody (Chapter 18) shows that modern behaviour genetics methods have largely answered the complaints of critics, and that a substantial proportion of individual differences in intelligence is due to genetic differences. In turn, this genetic influence can be traced through to school achievement. On the other hand, Adey (Chapter 16) and colleagues have devised a series of instruction programmes for schools that are firmly based on cognitive psychology, and also what is achievable given children's normal developmental trajectories. Here, they have found the elusive far-transfer effect: that programmes targeted at certain school topics raise performance in others, providing inspiration for those who seek to improve domain general problem solving skill.

Recent advances in brain imaging, along with ongoing studies of patients with brain lesions, also suggest greater degrees of domain generality than researchers have argued in the past. Hence, McKinnon, Levine, and Moscovitch (Chapter 7) show that many tasks, in ostensibly different domains, are mediated in part by the same brain regions involved in a variety of domain general cognitive and affective processes. Overall, they argue that tasks differ along two major dimensions – (1) cognitive capacity demands, and (2) emotional involvement – rather than the specific domain in which each task is grounded. Also taking a neuropsychology perspective, Chabris (Chapter 19) reviews recent literature on (1) general intelligence in nonhuman animals and (2) human neuroimaging studies. He argues that despite cognition necessarily arising from the action of separate networks or “modules”, the ubiquitous positive correlation among cognitive measures should be recognised as a behavioural law, resulting from neurological properties that have effect throughout the brain, and reflecting common cognitive/neural resources in the performance of many different tasks.

What are the evolutionary pressures that our ancestors really faced? What sort of cognitive architecture would these favour?
Several chapters engage with this difficult question (e.g., Over, Chapter 4; Happaney & Zelazo, Chapter 11), and Stenning and van Lambalgen (Chapter 8) and Gottfredson (Chapter 17) tackle it in detail. Stenning and van Lambalgen assert that the enormous – and neglected – discontinuity with our ancestors is our domain generality: we have an unprecedented ability to solve problems irrespective of context and environment. Compared with our ancestors, there appears to be an increased ability for on-line planning, necessary for verbal communication, which is in turn necessitated by the long period of social upbringing that our children require. This birth of very immature young is in turn necessitated by the biological constraints of narrow-hipped bipeds giving birth to large-brained babies – the latter making them better able eventually to communicate, plan and problem solve. Also on the theme of planning and foresight, Gottfredson (Chapter 17) takes a different line, that the rapid evolution of a general problem-solving intelligence was inevitable once our ancestors started innovating: the most intelligent of them created effective new technologies (e.g., fire, hunting, blades, guns, cars, and powertools) whose dangers outstripped the benefits for the least intelligent of our ancestors. There commenced a drive for a rapid upward spiral in human intelligence, which may still be operating today.

Although the chapters have been written by authors from different areas of research, there are many recurring themes and threads that we can identify. The first we can extract is that apportioning cognition to domain specific versus domain general mechanisms is extremely difficult. Many methodological difficulties with previous research have been highlighted by the authors, all of which cast doubt on the strengths of conclusions that have been drawn in the past. Research into this topic, as we have seen, is extremely important, but the manipulation of variables, and the construction of tasks, is not something that should be taken for granted. There are many lessons concerning research design that can be learnt from this book.

Second, we should note that the differences between domain general and domain specific processes are not as clear-cut as might first appear. Processes differ in how specifically they may be applied. For example, a permission schema is domain specific in that it can only be applied to contexts in which permission must be obtained before performing an action. Nonetheless, this is still more general than an innate module that functions only in social contexts, those in which cheaters are potentially evading social contracts. Domain general procedures are never totally domain general. For example, counting is domain specific in the sense that such a procedure cannot be used to evaluate a categorical syllogism, but is domain general in the sense that no matter what is being counted, whether toys, animals, or vegetables, the procedure is the same. Hence, we have a complex continuum rather than a dichotomy. Even so, we can still ask the question: is a particular procedure available to all contexts, or is it encapsulated within a context-specific module or knowledge structure?

Third, we should note that cognition is influenced by context. The same cognitive procedures are not applied blindly no matter what the situation.
If different contexts require different procedures, people can respond to this. People are domain specific in the sense that they can, and do, behave differently in different domains, but this by itself does not entail a one-process-per-context cognition, and numerous chapters in this book argue that a suite of domain general procedures, selected on the basis of context where necessary, can account for human performance. None of the authors in this book is arguing for entirely domain general cognition. However, many argue that domain general processes can be intelligently modified so as to fit context, guided by knowledge and learning as necessary.

Fourth, irrespective of the domain generality (or not) of our cognitive processes, we see a recurring role for domain general resources, such as working memory capacity. Hence, differences in task demands have important influences on performance. Furthermore, individual and developmental differences in the availability of domain general resources are also important. What does this mean exactly? For the authors in this book, this implies at the very least a domain general co-ordinator, whose effectiveness varies from task to task (depending on complexity) and from person to person (depending on the personal availability of resources). For advocates of
extreme domain specificity, this is a very difficult finding to grapple with. Some researchers have sought to deflect it by proposing hybrid domain general/domain specific mechanisms. Presumably, the capacity effects are entirely subsumed by the domain general component. However, this still does not answer the question: to what extent is the operation of domain specific processes (whether modular or learned) influenced by task demands, as opposed to task content, and why? Unfortunately, domain specific theories (or components of theories) are almost entirely descriptive, a catalogue of how the mind is sliced up, rather than an explanation of how modules or other contextual mechanisms take input and convert it to output (contrast this with our current state of knowledge of low-level vision processes). For example, the Wason selection task can be made contextual (and easier) by adding a suitable cover story, such as a policeman trying to catch people who are drinking illegally by breaking the following rule (see Roberts, Chapter 1):

If you are drinking alcohol, you must be over 18 years old

Same context, same correct answer, but different rule:

Either you drink a soft drink, or you must be over 18 years old

The author consistently finds poorer performance for the second rule compared with the first. In domain general terms, this is easy to explain because disjunctions are harder to reason with than conditionals: they require additional processing steps irrespective of context. What would a domain specific theory make of this difference, one that denied general capacity limitations as the source of differences in performance at contextual tasks? At the moment, we simply do not know.

Thirty-four authors have contributed to this book. It would be unprecedented for such a large number of psychologists to be in full agreement with each other when discussing any topic, and this book is no exception.
However, as I hope the reader will see, their agreements outweigh their disagreements: All agree that the case for extreme domain specificity has been overstated for various reasons, and that there must be (relatively) domain general aspects to human cognition. Contextual effects are not denied, but one conclusion from them – extreme domain specificity – is disputed. On this basis, we should stop trying to slice up the mind into an ever-growing number of uncoordinated modules for ever more domains. Now it is time to take a step back and evaluate which aspects of cognition are likely to be domain general and which are domain specific. In short, it is now necessary to integrate the mind.
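
The claim made earlier about the two drinking-age rules (same context, same correct answer, different rule) can be checked with a small truth table. The following Python sketch is illustrative only: it assumes that "drinking a soft drink" simply means "not drinking alcohol" and that "either ... or" is read inclusively, and the function and variable names are hypothetical rather than taken from the studies described.

```python
from itertools import product


def conditional_rule(drinks_alcohol: bool, over_18: bool) -> bool:
    """'If you are drinking alcohol, you must be over 18 years old.'"""
    return (not drinks_alcohol) or over_18


def disjunctive_rule(drinks_alcohol: bool, over_18: bool) -> bool:
    """'Either you drink a soft drink, or you must be over 18 years old.'

    Assumption: a soft drink is any non-alcoholic drink, and the
    disjunction is inclusive.
    """
    drinks_soft = not drinks_alcohol
    return drinks_soft or over_18


# Every combination of (drinks_alcohol, over_18).
cases = list(product([True, False], repeat=2))

# A violator is any drinker for whom the rule comes out false.
violators_conditional = [c for c in cases if not conditional_rule(*c)]
violators_disjunctive = [c for c in cases if not disjunctive_rule(*c)]

print(violators_conditional == violators_disjunctive)
```

Under these assumptions both rules pick out the same single violator, an under-age person drinking alcohol, so any difference in difficulty between the two versions cannot be attributed to a difference in the correct answer.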

References

Carruthers, P. (2004). Practical reasoning in a modular mind. Mind & Language, 19, 259–278.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor, J. A. (2000). The mind doesn't work that way. Cambridge, MA: MIT Press.
Hirschfeld, L. A., & Gelman, S. A. (1994). Overview. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 3–35). New York: Cambridge University Press.

Part I

Extreme domain specificity and higher cognition

1

Contextual facilitation methodology as a means of investigating domain specific cognition
Maxwell J. Roberts

How do people make inferences? Do they rely on domain general procedures: versatile cognitive processes that operate no matter what the material being considered? Alternatively, do people rely on highly domain specific procedures, which operate only within narrow contexts, and function only with difficulty, if at all, if contexts deviate from the ideal? If we take a more plausible view, that both types of process are important, what are the circumstances in which each is likely to be relied on? These are important questions to answer, but in order to do so, psychologists need to be confident that they know exactly how to identify each type of process. If their confidence is unwarranted, then so are some of the stronger claims that have been made, such as:

An alternative view that has been developing recently is that there are no such things as operations of thought and logic detached from the content and context in which they operate. (Richardson, 1991, p. 129)

Domain specific cognitive mechanisms . . . can be expected to systematically outperform (and hence preclude or replace) more general mechanisms that fail to exploit these features. (Cosmides & Tooby, 1994, p. 89)

However, we would also repeat our point that the basic pragmatic constraint of relevance permeates human reasoning so deeply that it is doubtful that a psychologically plausible natural logic can be encapsulated from pragmatic considerations. (Holyoak & Cheng, 1995a, p. 385)

All these researchers are suggesting that cognition is effectively domain specific: It makes no sense to speak in terms of context-free domain general thought processes. They do not exist, or if they do, they serve little or no purpose. The focus of this chapter will be the plausibility of one type of evidence in support of such assertions.

Domain general versus domain specific accounts of cognition

Domain general theories posit general-purpose inference processes that operate similarly, irrespective of content and context. For example, the basis of the mental models theory (e.g., Johnson-Laird & Byrne, 1991) is that inferences take place via the representation of information on spatial arrays. These correspond to states of the world, for example, as described in a set of premises, and additional inferences can be made from the representations. Other researchers have proposed theories of reasoning based on abstract propositional inference rules (e.g., Braine & O'Brien, 1998; Rips, 1994), in which inferences can be generated from represented information by applying sequences of rules. The two types of theory have in common the property of domain generality. The reasoning procedures can operate on material in any domain, irrespective of content and context. They also have in common a requirement for domain general resources, such as working memory capacity, in order for people to reason successfully. Finally, the proponents of both types of theory acknowledge fully the importance of content and context effects in reasoning. The researchers may accept the need for additional procedures to account for them, such as those outlined below. Alternatively, they may suggest that content and context affect the operation of domain general reasoning procedures by influencing which information is represented, and how, and whether the output from these procedures is deemed plausible, and accepted, or deemed implausible, and further inferences sought.

Evidence in support of domain general theories mainly consists of logic effects. Irrespective of content and context, inferences tend to differ in their difficulty, and this can be related to the logical structure of the items.
Overall, inferences will be harder to make if the logic of a problem means that many inference rules must be applied, or many mental models constructed. Hence, domain general processes require domain general resources, and excessive demands on resources prevent them from operating effectively.

Domain specific theories can likewise be categorised into different types. On one hand, we have theories that stress the importance of knowledge, schemas, or other learnt context-based procedures. Where it is asserted additionally that domain general procedures and resources do not exist or are irrelevant, such theories will be termed extreme contextualism. One example of a learnt context-based domain specific procedure is the schema theory of Cheng and Holyoak (e.g., Cheng & Holyoak, 1985; Holyoak & Cheng, 1995b). This posits that, as a result of past experience, people possess memory structures known as schemas, which enable them to make appropriate decisions and take appropriate actions. To operate, schemas require situations whose content can be mapped onto their structures. For example, where certain preconditions must be fulfilled in order to have permission to perform an action, a pragmatic permission schema is activated, a structure consisting of the rules (1) if you perform action A, you
must fulfil precondition P, (2) if you do not perform action A, you need not fulfil precondition P, (3) if you fulfil precondition P, you may perform action A, and (4) if you have not fulfilled precondition P, you must not perform action A. Armed with such a schema, and suitable content (e.g., A is the action of drinking alcohol, P is the precondition of being the legal drinking age) it is possible for a person to determine whether he or she is permitted to take an action or, alternatively, whether another person is illegally performing an action.

For some memory structures, such as the permission schema, relevant experience during upbringing seems to be universal. Evidence for widespread possession of an obligation schema is less clear (e.g., Noveck & O'Brien, 1996). This schema specifies the circumstances in which a person is obliged to perform a particular behaviour. Less common still is the convergence schema (e.g., Gick & Holyoak, 1983), which must be taught to university students before the majority are likely to use it. Hence, schema theory can account for individual differences in reasoning: People's performance at a task is determined by whether the appropriate schema is in place, rather than by whether domain general processes operate successfully (e.g., Gick & Holyoak, 1983, p. 32; Richardson, 1991). People differ by opportunity, and once the necessary memory structures are possessed, they function impeccably when triggered, irrespective of the availability of domain general resources. The possibility that people may differ in their ability to acquire schemas is also not entertained, even though this itself is a domain general procedure (e.g., Halford & Andrews, Chapter 9, this volume).

Similar in operation to schemas, if not the origin of the processes, are the massive modularity theories of evolutionary psychologists.
The underlying premise is that past environments presented our distant ancestors with sufficient numbers of similar survival-critical problems that natural selection favoured the evolution of pre-programmed domain specific modules able rapidly to generate appropriate solutions. The alternative, use of domain general procedures, or even the learning of domain specific ones from scratch, is said to be too slow and risky, so that people who needed to rely on them would have been at a disadvantage compared with those pre-equipped. Our ancestors had many different types of such problems, resulting in many domain specific modules, hence massive modularity. For example, it has been claimed that, as social beings, individuals' survival depended on cooperation with their peers, but the advantages to be had by obtaining the assistance of others without reciprocation would be considerable (e.g., Cosmides, 1989). The result is an evolved module, sensitive to detecting cheaters, that is, people who seek to break social contracts of the form if you take a benefit, then you must pay a cost. In a context where there is the possibility that such cheating will take place, presumably such a module activates, and caution is observed. More recently, the existence of a hazard management module has been proposed on a similar basis (see Fiddick, Cosmides, & Tooby, 2000). Because successful operation of these
modules is said to be critical to survival, they have evolved to operate impeccably, independently of the availability of domain general resources, which are therefore also irrelevant to successful reasoning.

At first sight, learned versus evolved context-dependent domain specific reasoning mechanisms appear almost to be opposites. However, despite their different origins, there is important common ground between them. The basic reasoning procedure is the same for both. When content is appropriate, context-based reasoning procedures are activated. These operate independently of domain general resources, such as working memory capacity, efficiently generating appropriate inferences. Hence, when people perform well, this is because the appropriate schema or module has activated. If the reasoning procedures are not triggered, the picture is less clear. Domain general procedures might take a stab, except that proponents of both approaches explicitly deny their existence, or at least their utility. In which case, where domain specific processes are not activated, reasoning performance should be at floor level. Even where domain general processes and resources are admitted, their importance tends to be downplayed, or they are taken as not being applicable within the domains for which specific procedures are proposed.

Of course, nothing is known about the human cognitive architecture to suggest that any of the proposals put forward are mutually exclusive. Indeed, proponents of massive modularity who deny the possibility of domain general reasoning processes are then dependent on the parallel validity of extreme contextualism. We have not evolved modules for medicine, law, chess, plumbing, or solving intelligence test items. Differences in people's ability to perform in modern domains must entirely depend on the existence of schemas, or other contextual reasoning procedures.
Overall, any difficulties with extreme contextualism, or any evidence to suggest that domain general procedures or resources might be of importance when learning or performing skilled tasks (see Roberts, Chapter 14; Gottfredson, Chapter 17, this volume), automatically count as evidence against massive modularity.

Given the similarities in operation between domain specific theories, the types of evidence said to support each are likewise similar. One example is from applying the contextual facilitation methodology. This is undoubtedly a cornerstone for domain specificity theorists, and the purpose of this chapter is to scrutinise this, and the claims made on its basis. Methodological and theoretical difficulties with such evidence will be equally damaging to massive modularity and extreme contextualism.

The contextual facilitation methodology

Contextual facilitation methodology is an experimental technique in which performance is compared on at least two different types of problem: an impoverished version, and an "isomorphic" enriched version. For the impoverished problem, performance will usually be poor. A domain general
explanation of this would be that the task is too demanding for the sample tested, either (1) because the subjects have insufficient domain general resources in order to solve it, for example because too many processing steps are required and working memory capacity is insufficient, or (2) because people in the sample do not possess the appropriate underlying domain general competence to enable them to do so. In each of these cases, it might be argued that people who cannot solve the task are insufficiently advanced in their cognitive development, or else that they are insufficiently intelligent. In contrast, a domain specific explanation would suggest that the impoverished problem is inappropriately framed. Its format has failed to activate the necessary schema (assuming it is possessed) or module, or indeed any reasoning procedures at all.

Superior performance is then demonstrated via an enriched version of the task. This is asserted to be logically isomorphic to the impoverished task, and therefore just as demanding to solve, but with embellishments that set the problem in a suitable context: one that is compatible with the necessary input for a schema or module. Improved performance for the embellished task is interpreted to different degrees by different researchers, but typically at least some of (1) to (4) below will be drawn.

(1) Different levels of performance for the impoverished versus embellished task suggest that different reasoning procedures have been applied to each. The improvement is not because of domain general processes working more effectively.

(2) Good performance is observed only when appropriate context is present, suggesting that the reasoning processes applied to the embellished task must be highly context specific.

(3) The impoverished problem underestimated the true capabilities of the person. With suitable context, people are perfectly capable of making complex logical inferences.

(4) People do not reason "out of context". They reason using domain specific procedures that activate only in appropriate contexts. Domain general processes or resources serve little if any purpose, if they exist at all.

In this chapter, I will argue that owing to difficulties with the contextual facilitation methodology, none of these conclusions is safe. This will be illustrated using examples from a wide range of topics, all with the same underlying problems.

(1) Embellished problems are supposedly isomorphic with impoverished problems, hence "There are now many completely convincing demonstrations that children perform much better in one condition, than in another, despite the fact that both contexts make exactly the same logical and cognitive demands" (Roazzi & Bryant, 1992, p. 14). In reality, taking an abstract meaningless problem, and then embellishing it with suitable details, changes it in all sorts of ways, much more than simply adding content and context. Despite claims to the contrary, alterations often inadvertently highlight key components of the problem, or even change its logic, ultimately changing the cognitive demands of the task. Hence, researchers fail to consider the possibility that manipulating context has also manipulated extraneous variables.

(2) The arguments above are least controversial when performance at the impoverished task is at floor level. In other words, without context, people are no better than guessing. It is therefore easy to argue that the reasoning that has taken place is either zero or context-based. Once people are able to solve the impoverished task better than chance, it becomes difficult to deny the existence of domain general processes. Extreme contextualism theorists have an easier time explaining this away than massive modularity theorists, as they have available the rationalisation that some of their subjects must have been in possession of a collection of puzzle-solving schemas. These are useful only for solving limited puzzles, otherwise they could constitute domain general reasoning procedures. In general, the better the performance at the impoverished task, the greater the improvement necessary at the embellished task for arguments against domain generality to carry any plausibility. However, no matter what the difference, the conclusion is false, as can be illustrated from the analogy used by Roberts and Stevenson (1996, p. 529):

Taking a reasoning task and showing that performance improves when appropriate context is added cannot disprove the existence of domain-free reasoning processes. This would be analogous to finding that people travel faster with cars than without and concluding that legs do not exist, or that if they do, that they serve no purpose in the real world.

(3) Assuming only domain specific reasoning processes, there should be just two scores possible for an individual, either floor (no procedures activated) or ceiling (context activates appropriate procedures that generate the correct answer). The logic of a task should be irrelevant. In reality, logic effects abound in reasoning, cognitive development, and intelligence. These can be observed not just in impoverished tasks, but in embellished ones also. Logic effects support the existence of domain general procedures and the necessity of domain general resources. If domain general processes and resources do not exist, these logic effects should not be found anywhere, even on abstract tasks. If they are merely unimportant, then at the very least there should be no logic effects in contextualised versions of tasks. If we do find logic effects in contextualised tasks, then this at least shows that reasoning
does depend upon the availability of domain general resources, which constrain the effectiveness of domain specific processes. Furthermore, the possibility that domain general processes contributed to the inference cannot be ruled out.

Overall, it will not be denied that context is important. Of course, people will be at an advantage in a situation in which they possess appropriate knowledge, or a schema, than in a situation where they do not. What will be denied is that evidence from the contextual facilitation methodology can be used to "disprove" the existence of domain general processes and resources, or consign them to the periphery of cognition.

Piagetian problems

The contextual facilitation methodology came to prominence during the 1970s, one strand as a means of investigating the framework of cognitive development devised by Piaget and his followers (e.g., Piaget, 1950). Piaget's research programme and the "difficulties" in interpreting his findings are well known. The framework itself is nonetheless a powerful one. Put simply, Piaget observed that (1) older children compared with younger children have a clear advantage in their ability to make more complicated, domain general inferences, and (2) there is an ordered, consistent progression in the acquisition of cognitive skills during childhood owing to qualitative differences in the ways in which problems are approached. Various competencies appeared to be correlated, and needed to be mastered before more advanced ones could be acquired, inevitably implying some sort of progressive theory of cognitive development.

A 4-year-old child is shown a table with three mountains with various landmarks and features on them. The child is given a set of photographs. He is asked to select the one that shows the same view as would be seen by someone sitting at a different place on the table. The child is unable to do this. The child is unable to imagine the scene from different points of view.

An adult places two rows of counters on a table, each row with the same number, such that they are aligned precisely. A 6-year-old child is asked whether there are more counters in one row than another, or whether they contain an equal number. The child (correctly) says that they are equal. Now the adult spreads out the counters in one of the rows so that, although the numbers remain the same, it is now longer than the other row. When the child is asked the same question again, he answers (incorrectly) that the longer row contains more counters. The child does not understand that despite the transform, the number of counters has been conserved.

Both of these examples suggest that children have "failed" certain tasks. In other words, successful performance required a skill, or competence, that the child apparently did not possess. However, many researchers did not accept that there were important phenomena to explain: i.e., that (1) at certain ages, children perform certain tasks with difficulty, if at all; whereas (2) at later ages, the identical tasks are performed effortlessly, and therefore (3) something interesting has happened in between. Instead complaints were made that Piaget's tasks were "unfair", underestimating what children could do. For example, the original mountains task was said to be impoverished because no context was set up to give any basis for asking the question, making it meaningless, abstract, and unmotivating, failing to engage the child's reasoning processes. Add suitable context and:

A 4-year-old child is shown a set of intersecting walls on a table, forming a cross, along with a boy doll and a policeman. The child is asked whether the policeman can see the boy from various positions, and is asked to hide the boy from the policeman. Feedback is given, and corrections if necessary. A second policeman is introduced and the child is asked to hide the boy doll from both policemen. The child is able to perform all tasks with few errors. The child can imagine a scene from different points of view after all. (Adapted from Donaldson, 1978; see also Hughes, 1978)

As Adey (1997), for example, notes, the policeman version of the task simply requires a concrete determination of whether straight lines between toys are obstructed or unobstructed. The original mountain version requires the imagination of objects from different viewpoints. It cannot be claimed that the two tasks are isomorphic. The mountains task is more difficult logically. Furthermore, in redesigning tasks to make them more "appropriate", the danger of false positives is considerable (Sophian, 1997).

An adult places two rows of counters on a table, each row with the same number, such that they are aligned precisely. A 6-year-old child is asked whether there are more counters in one row than another, or whether they contain an equal number. The child (correctly) says that they are equal. Now "naughty teddy" spreads out the counters in one of the rows. When the child is asked the same question again, he answers (correctly) that both rows still contain equal numbers of counters. The child does understand that despite the transform, the number of counters has been conserved. (Adapted from McGarrigle & Donaldson, 1975)

Here, the original task was said to be impoverished because there was no basis for asking the question twice. Indeed, this could be misleading, and moving the counters could be taken to be a false hint. However, if a child
genuinely does understand conservation, why should asking the same question twice cause a change of answer (Nelson, Dockrell, & McKechnie, 1983)? Embellishment with "naughty teddy" supposedly sets the context for the repeat question and gives the task "human sense". Even then, 30% of children still responded erroneously in the original study. Unfortunately, Moore and Frye (1986) found that the improvement in performance was not straightforward. When subitising was not possible, even if "naughty teddy" (as opposed to the experimenter) increased the length of one row, a high level of (correct) "equal numbers" responses was not observed. Furthermore, whether or not subitising was possible, if "naughty teddy" really did add counters to one row (without changing its length), there were actually fewer (now correct) "more counters" responses compared with the abstract, unmotivating, "experimenter only" condition. A complicated pattern of strategy, assumption, and response bias was therefore revealed (see also Light & Gilmour, 1983; Sophian, 1997).

Overall then, from these two examples, we can see that there can be important difficulties in the application of contextual facilitation methodology. The reason for facilitation need not be the embedding of tasks in appropriate context. Performance improved, either because the task was logically easier, or because the correct answer matched a task format-induced response bias. In neither case can we say that hidden competence was unequivocally revealed. Any use of social context cues appears to be a default strategy applied when the fundamental concepts being tested are not fully understood (Sophian, 1997). From a cognitive development perspective, how these issues are interpreted depends on the criterion by which a competence/understanding must be demonstrated in order for it to be established that it is possessed.
At one extreme we have (1) the skill must be routinely displayed no matter what the context, and at the other we have (2) just one demonstration is enough, no matter how contrived the task. (2) ignores the fact that older children tend to manifest competences easily in a versatile domain general way, whereas younger children require carefully designed tasks, displaying these supposedly same skills sporadically in a specific domain or context, if at all. Clearly something has changed with age, and it is necessary to know what and why. One explanation by advocates of domain specificity is that knowledge of social contexts and interactions has been acquired, enabling children to understand "testing contexts". This seems unconvincing at best, and the remaining examples in this chapter cannot be explained away in this way.

Card tricks

The Wason selection task attracted the contextual facilitation methodology at much the same time as Piagetian tasks. Indeed, Piaget developed analogous precursors to this task (see Overton & Dick, Chapter 10, this
volume). The selection task has generated much interest among reasoning researchers because, in its original version, typically fewer than 10% of people are able to make logically correct responses. This standard, abstract (i.e., impoverished) task consists of a brief explanatory paragraph, a conditional rule (of the form If p then q) and pictures of four cards to which the rule refers (with the cards showing the four logical values p, not-p, q, and not-q). Subjects are told that each card has two mutually exclusive symbols (e.g., letters and numbers), one on each side, and that they need to determine which of the cards should be turned over in order to determine whether the rule is true or false. For example, if the rule is If there is an E on one side of the card then there is a 4 on the other side and the visible sides of the four cards show E, K, 4, and 7, then the logically correct response is E and 7 (p and not-q), but by far the most frequent two-card selection is E and 4 (p and q). There are numerous explanations of poor performance. These tend to be based on rule interpretations (e.g., Stenning & van Lambalgen, Chapter 8, this volume) and attention to inappropriate cards (e.g., Roberts & Newton, 2001; Stanovich & West, 1998). However, more relevant to the current discussion is the application of contextual facilitation methodology. The selection task is particularly "suitable" for contextual facilitation methodology because, even among adults, performance at the impoverished task appears to be at floor level. Hence, it is easy to argue that there is no actual evidence of domain general processes manifesting themselves in any way. The next step is to see whether embellishing the task can improve performance, and if so, how.
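
The logic of the abstract task described above can be made concrete with a short Python sketch. It is illustrative only: the sets of possible letters and numbers, and the names `violates` and `must_turn`, are assumptions introduced here and not part of the task materials. The idea is that a card needs turning only if some possible hidden face could falsify the rule.

```python
# Assumed possible faces; the real task only requires that each card
# has a letter on one side and a number on the other.
LETTERS = ["E", "K"]
NUMBERS = ["4", "7"]


def violates(letter: str, number: str) -> bool:
    """True if a card with these two faces falsifies
    'If there is an E on one side then there is a 4 on the other'."""
    return letter == "E" and number != "4"


def must_turn(visible: str) -> bool:
    """A card must be turned if any possible hidden face would
    make the card a counterexample to the rule."""
    if visible in LETTERS:
        return any(violates(visible, n) for n in NUMBERS)
    return any(violates(l, visible) for l in LETTERS)


cards = ["E", "K", "4", "7"]
selection = [card for card in cards if must_turn(card)]
print(selection)
```

Under these assumptions the selection is E and 7, matching the logically correct p and not-q response; the 4 card is never selected because no hidden letter can turn it into a counterexample.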
In general, successful facilitation is found with problems phrased so that they "make sense" to subjects: typically adding context so as to give good reason for such a rule, and good reason to perform the task, although in the process, "testing the truth or falsity of the rule" generally mutates into "looking for potential rule violators".

Imagine you are a policeman in a pub, seeking people who may be violating the local drinking rule, which is "If you are drinking alcohol, then you must be over eighteen years old." The cards below show four drinkers. On one side is their drink, on the other, their age. For two people, you know how old they are, for two you know what they are drinking. What cards should you turn over in order to see whether the rule is being broken? [Four cards show (1) Beer; (2) Coke; (3) 16 years old; (4) 21 years old.] (After Griggs & Cox, 1982)

Compared with the standard abstract task, performance improves considerably, with up to 80% of subjects correctly choosing the "Beer" and "16 years old" cards. Among domain specificity theorists, the jury is still out as to whether facilitation, where it occurs, is owing to module activation (whether hazard or social contract; see Cosmides, 1989; Fiddick, Cosmides, & Tooby, 2000) or schema application (e.g., permission and obligation
schemas; see Cheng & Holyoak, 1985; Holyoak & Cheng, 1995b). Personal knowledge of the context described is not necessary, but undoubtedly helps. Few selection task scenarios facilitate as much as the drinking age rule.

Do we have evidence for domain specific reasoning processes to the exclusion of domain general ones? Unfortunately not. Embellishing the standard abstract task in order to put it into a meaningful context carries a number of pitfalls. This can be seen simply by looking at the greatly extended cover stories for contextual versus abstract versions. In brief, embellished versions may provide material that is more interesting to subjects (see O'Brien, Roazzi, Athias, & Brandão, Chapter 3, this volume), may be more likely to trigger appropriate interpretations (Stenning & van Lambalgen, Chapter 8, this volume), or may emphasise the costs and benefits regulated by the rule (Noveck, Mercier, & Van der Henst, Chapter 2, this volume). Elsewhere in the literature, we see numerous other possibilities. Identifying rule violators (as required by typical contextual versions) is easier than testing rules for truth (as required by typical abstract versions) (e.g., Manktelow & Over, 1991; Noveck & O'Brien, 1996). Early contextual items used explicit negatives, which may reduce difficulty (e.g., a card showing "not over 18 years" would be an explicit negation of the drinking rule, but "16 years" would be an implicit negation; e.g., Noveck & O'Brien, 1996).
The extended cover stories on contextual versions seem inadvertently to contain elements that direct people's attention towards correct cards (e.g., Love & Kessler, 1995; Platt & Griggs, 1995), can make imagining violators easier (e.g., Green & Larking, 1995; Liberman & Klar, 1996; Platt & Griggs, 1995), can clarify the meaning of the rule (e.g., Liberman & Klar, 1996), and can emphasise the likelihood that illegal combinations may occur and the seriousness of making a mistake at the task (e.g., Love & Kessler, 1995). Over and above this, embellished contextual tasks seem to contain a number of textual enrichments that are difficult to pin down, but seem to make the task easier, and nonetheless should be irrelevant to the activation of contextual processes (e.g., Noveck, Mercier, & Van der Henst, Chapter 2, this volume; Noveck & O'Brien, 1996). All these factors are strictly extraneous to the addition of context, and have been shown to assist people's solution.

Although performance at the abstract selection task typically appears to be at floor level, there is evidence for logic effects, a marker for the importance of domain general reasoning processes and/or resources. The standard abstract selection task rule (If p then q) is just one item out of several possible ones. To find logic effects, we need to look at all possible items, the full negations paradigm (e.g., Evans & Lynch, 1973). This involves multiple selection tasks, and the evaluation of cards using all possible permutations of negations within the rule: If p then q, If p then not-q, If not-p then q, and If not-p then not-q. In a meta-analysis (N = 419), Roberts (2002) found that the overall percentage of logical choices, calculated from

Roberts

the proportion of cards correctly selected plus the proportion of cards correctly not selected, came to 60%, well above the chance score of 50%, so that the requirement for a floor level in performance is not even met. Not only this, but performance at the abstract selection task is correlated with intelligence test score (see Stanovich & West, 1998; Valentine, 1975, but see also Newstead, Handley, Harley, Wright, & Farrelly, 2004).

To find logic effects, we also need to investigate different logical connectives, in addition to the conditional (If there is p then there is q). Roberts (2002) found that the disjunctive version (e.g., Either there is not p or there is q; 58% logical choices) was much harder than a categorical version (e.g., All cards with p have q; 68% logical choices). Similarly, Evans, Legrenzi, and Girotto (1999, Experiment 3) found a range of rule difficulty, with negative conjunctions the easiest (There is not both p and not q, 59% logical choices), then standard conditionals (57%), then disjunctions (54%), then only-if rules (There is p only if there is q, 51%). If all reasoning is domain specific, and there are no domain general influences on reasoning, then there should be no such logic effects for abstract tasks.

It is also necessary to see whether logic effects can be found in contextual versions of the task. For example, are drinking age contexts with isomorphic rules equally easy? According to domain specific approaches, they should be:

1 If a person is drinking alcohol, then he/she must be over 18 years old (conditional);
2 Either a person has a soft drink, or he/she must be over 18 years old (disjunctive);
3 All people drinking alcohol must be over 18 years old (affirmative categorical);
4 No people drinking alcohol may be under 18 years old (negative categorical).
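The claim that these four phrasings are logically isomorphic, and the "percentage of logical choices" measure described earlier, can both be made concrete with a short sketch. This is only an illustration under stated assumptions: the toy world has one alcoholic drink and one soft drink, the two ages (16 and 21) avoid the boundary case of exactly 18, and every name in the code (`RULES`, `cards_to_turn`, `logical_choice_score`) is hypothetical rather than taken from any of the studies cited.

```python
from itertools import combinations, product

# Toy world (an assumption for illustration): one alcoholic and one soft
# drink, one age under 18 and one over 18.
DRINKS = ("beer", "coke")
AGES = (16, 21)
PEOPLE = list(product(DRINKS, AGES))  # every (drink, age) combination


def is_alcohol(drink):
    return drink == "beer"


def over_18(age):
    return age > 18


# Each rule answers: "is this pub (a collection of drinkers) law-abiding?"
RULES = {
    "conditional": lambda pub: all(
        over_18(a) if is_alcohol(d) else True for d, a in pub),
    "disjunctive": lambda pub: all(
        (not is_alcohol(d)) or over_18(a) for d, a in pub),
    "affirmative categorical": lambda pub: all(
        over_18(a) for d, a in pub if is_alcohol(d)),
    "negative categorical": lambda pub: not any(
        is_alcohol(d) and not over_18(a) for d, a in pub),
}


def rules_agree_everywhere():
    """True if all four phrasings give the same verdict for every possible pub."""
    pubs = [pub for size in range(len(PEOPLE) + 1)
            for pub in combinations(PEOPLE, size)]
    return all(len({rule(pub) for rule in RULES.values()}) == 1 for pub in pubs)


def cards_to_turn(rule):
    """Cards whose hidden side could reveal a violation of the rule."""
    chosen = []
    for drink in DRINKS:  # drink visible, age hidden
        if any(not rule([(drink, age)]) for age in AGES):
            chosen.append(drink)
    for age in AGES:  # age visible, drink hidden
        if any(not rule([(drink, age)]) for drink in DRINKS):
            chosen.append(age)
    return chosen


def logical_choice_score(selected, correct, cards):
    """Proportion of cards correctly selected plus correctly not selected."""
    right = sum((card in selected) == (card in correct) for card in cards)
    return right / len(cards)


if __name__ == "__main__":
    print(rules_agree_everywhere())
    for name, rule in RULES.items():
        print(name, cards_to_turn(rule))
    cards = ["beer", "coke", 16, 21]
    print(logical_choice_score({"beer", 21}, {"beer", 16}, cards))
```

In this toy world all four rules pick out the same two cards, so any difference in difficulty between the phrasings cannot be attributed to their logical content. The scoring function also shows why chance is 50%: each card is either treated correctly or not, so a subject choosing cards at random is expected to get half of them right.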

In four unpublished studies (total N = 163) using factorial designs that compare (1) contextual versus abstract items and (2) conditional versus disjunctive rules, the author has found main effects of both context and logic, but no interaction. The aggregate percentage of logical choices is 83% for contextual conditional rules, but only 69% for disjunctive rules in the same context. Furthermore, affirmative categorical rules seem as easy as conditional rules (N = 40, conditional rules 82% logical choices, versus 81% for affirmative categoricals) but negative categorical rules seem even easier (N = 48, conditional rules 93% logical choices, versus 98% for negative categoricals).

Overall, the evidence obtained by applying the contextual facilitation methodology to the selection task is inconclusive. The problem, that manipulating context simultaneously manipulates extraneous variables, is usually not adequately addressed, and when it is, effect sizes diminish

considerably (e.g., Noveck, Mercier, & Van der Henst, Chapter 2, this volume). Performance at the standard conditional abstract task is not at floor level, and there is good reason to believe that there are logic effects, both with and without context.

Testing intelligence tests

Moving on to intelligence tests, the reasons for asserting the importance of domain specific processes at the expense of domain generality become clear. Once the importance of domain general processes is admitted, it makes sense to investigate whether there are individual differences in these, to measure them, and to make decisions about people's futures on the basis of test scores. To people for whom all intelligence testing is an abhorrence, even though the tests are valid (see Gottfredson, Chapter 17, this volume), one possible way of curtailing the use of testing is to discredit the entire notion of domain general processes and resources.

Primarily, intelligence tests comprise sets of logic puzzles. These usually involve inductive reasoning: identifying rules from regularities and then applying them. Raven's Progressive Matrices (Raven, Raven, & Court, 1993) have acquired the reputation of being an intelligence test gold standard, the best possible measure of general intelligence, or g (e.g., Carpenter, Just, & Shell, 1990; Snow, Kyllonen, & Marshalek, 1984). Each item consists of abstract shapes and patterns (elements), organised in a 3 × 3 matrix of cells. The ninth cell in the lower right-hand corner of each matrix is always empty. The contents of the remaining eight cells, and hence the empty one, are determined by various arbitrary abstract rules. Solving each item therefore requires identifying its elements and the rules linking them, using these to generate the answer.
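To make the structure of such an item concrete, here is a minimal toy sketch of the two steps just described: identify a rule for each element attribute from the eight given cells, then apply it to generate the ninth. It is not a model of the actual Raven's items. The cell encoding (a tuple of shape, count, and size), the three rule types tried, and the function names are all assumptions made for illustration.

```python
def induce_attribute(rows):
    """Infer the missing value for one attribute of a 3 x 3 item.

    rows holds three lists of values; the last value of the third row is None.
    Returns the inferred value, or None if no assumed rule fits.
    """
    complete, last = rows[:2], rows[2]

    # Assumed rule 1: the value is constant within each row.
    if all(len(set(r)) == 1 for r in complete) and last[0] == last[1]:
        return last[0]

    # Assumed rule 2: a fixed numeric step between neighbouring cells.
    numeric = all(isinstance(v, int) for r in complete for v in r)
    if numeric and all(isinstance(v, int) for v in last[:2]):
        steps = {r[i + 1] - r[i] for r in complete for i in range(2)}
        if len(steps) == 1 and (last[1] - last[0]) in steps:
            return last[1] + steps.pop()

    # Assumed rule 3: the same three values appear once in every row.
    values = set(complete[0])
    if len(values) == 3 and all(set(r) == values for r in complete):
        missing = values - set(last[:2])
        if len(missing) == 1:
            return missing.pop()

    return None


def solve(matrix):
    """Generate the ninth cell by solving each attribute independently."""
    n_attrs = len(matrix[0][0])
    answer = []
    for k in range(n_attrs):
        rows = [[cell[k] if cell is not None else None for cell in row]
                for row in matrix]
        answer.append(induce_attribute(rows))
    return tuple(answer)


# Hypothetical item: each cell is (shape, count, size).
ITEM = [
    [("circle", 1, 1), ("square", 1, 2), ("triangle", 1, 3)],
    [("square", 2, 1), ("triangle", 2, 2), ("circle", 2, 3)],
    [("triangle", 3, 1), ("circle", 3, 2), None],
]

if __name__ == "__main__":
    print(solve(ITEM))
```

Even in this toy form, the solver has to separate each cell into its attributes before any rule can be tested, and every extra attribute or rule adds another pass over the matrix.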
One interpretation of scores at this task is that they indicate a level of domain general cognitive skill, necessary not just for performance at other logic puzzles, but also for success at school, work, and day-to-day life. However, an alternative explanation, where performance is poor, is that some people are not familiar with the necessary puzzle-solving context that the test entails. In people's own familiar contexts, they are perfectly capable of making complex inferences, but away from these their performance is poor, and their true abilities underestimated.

This was the starting point for Richardson (1991), based on the premise that standard intelligence test items are abstract and meaningless to children, and thus fail to engage the necessary reasoning procedures. The first 10 items of Set E of the Standard Progressive Matrices were redesigned to preserve the rules, but display the items as socially meaningful contexts (for example, people departing, or real objects being moved), hence the contextual facilitation methodology was applied. When administered to 10-year-old children, higher scores were obtained for the Contextual matrices (81% correct for these, 26% correct for Standard items; the chance score is 12.5%). Richardson argued that the

new items had engaged children's reasoning processes, thus demonstrating that appropriately designed materials had revealed an ability for complex reasoning that would have been underestimated had the Standard items only been used.

Unfortunately, Roberts and Stevenson (1996; see also Richardson, 1996; Roberts, 1996) showed that there was a confound: Each Contextual matrix also had a verbal commentary, read alongside its item while the experimenters pointed to relevant cells. These were ostensibly intended to activate the hypothesised schemas for reasoning about social contexts. The Standard matrices were simply solved in silence. More interestingly, some commentaries seemed to do more than activate contexts. Despite the claim that "only sufficient information is given to 'set up' the rationale for the problem. No explicit mention is made of the actual rules or covariations to be induced, and which lead to the solution of the problem" (Richardson, 1996, p. 133), it was clear that some contained cues as to the actual rules themselves. Roberts and Stevenson (1996) divided the commentaries into weak guidance and strong guidance versions. For example, here is a strong guidance commentary (Item 6):

This is a story about Mrs Smith who goes shopping one morning. When she gets home, she puts the things on the table like this. Then she puts some things in the cupboard like this, then the table looks like this . . .

When logically and informationally equivalent commentaries were prepared for the Standard matrices, and like compared with like, the original advantage for children was considerably reduced (60% for Contextual matrices averaged over with versus without commentary, versus 36% correct for Standard matrices). Also, the stronger guidance commentaries were associated with stronger facilitation than the weaker guidance ones.
This effect of guidance was confirmed when Roberts, Welfare, Livermore, and Theadom (2000) prepared a new set of commentaries for all matrices, reversing the guidance strength, and found that performance levels reversed to match the new guidance levels. For example, the Item 6 Contextual commentary was rewritten to have weak guidance:

This is a story about Mrs Smith returning home with her shopping. First we see this, then we see this, and then we see this . . .

Combining relevant data from Roberts and Stevenson (1996) and Roberts et al. (2000), we have performance data for every item, both Standard and Contextual, at every level of guidance: none, weak, and strong (see Figure 1.1). The figure shows how performance improves steadily as guidance increases. Using everyday items as elements and adding commentaries does far more than simply add context. Roberts et al. (2000) suggested that even

[Figure 1.1 plot: % Correct (0 to 90) against Level of Guidance (None, Weak, Strong) for Standard matrices and Contextual matrices. Legend: R&S (1996), RWL&T (2000) E1, and RWL&T (2000) E2, each plotted for all ten items, Items 1,6,7,9,10, and Items 2,3,5,6,8.]

Figure 1.1 Aggregated performance at Standard and Contextual matrices taken from Roberts and Stevenson (1996, N = 72), and Roberts, Welfare, Livermore, and Theadom (2000), Experiment 1 (N = 164) and Experiment 2 (N = 115). This shows that the level of guidance provided by commentaries is almost equally beneficial for both item types.

at the weak level of guidance, attention is directed to the overall format of the matrices, encouraging children to approach each item more systematically. And of course, strong guidance went further by giving cues as to the rules. Overall, Roberts and Stevenson's (1996) conclusion seems sound: that contrary to claims, children are perfectly capable of reasoning with abstract, meaningless rules. Their main source of difficulty is identifying the rules to begin with. Assistance with this improves performance, depending on how much is given.

Figure 1.1 also shows a consistent main effect advantage for the Contextual matrices. Are they nonetheless triggering domain specific procedures used for solution? On balance, probably not. Roberts et al. (2000) argued that there were other confounds in creating the Contextual matrices. For the Standard ones, elements are often difficult to name, and are often superimposed. For Richardson's Contextual items, the elements were familiar, easy to identify and name (e.g., people, furniture, toys), and did not overlap. Roberts et al. (2000) created a new set of Simplified matrices that used logically equivalent rules to the Standard and Contextual versions, were just as abstract and meaningless as the Standard matrices, but used simpler,

non-overlapped familiar/easy-to-name elements. This eradicated the main effect advantage for the Contextual items (61% correct for Simplified matrices versus 59% for Contextual matrices, and 30% for the Standard matrices). Hence, the advantage for the Contextual matrices can be explained simply in terms of the recognisability of their elements, rather than the triggering of special cognitive processes for solving them. Children cannot reason if they cannot identify what they are reasoning about. Any assistance, whether contextual or not, will improve performance.

Thus far, the contextual facilitation methodology has again been confounded because of the manipulation of extraneous variables simultaneously with context. We can also see, from Figure 1.1, that even at their most disadvantaged, performance at the Standard matrices is above chance level. Do we have evidence for logic effects? There is plenty. Much past research has shown that matrix difficulty is attributable to working memory capacity and management, with item difficulty predicted by numbers of elements, numbers of rules, and types of rules (Carpenter, Just, & Shell, 1990; Primi, 2002; Vodegel Matzen, Van der Molen, & Dudink, 1994). Add to this the finding that element appearance is important (e.g., Roberts et al., 2000; Meo, Roberts, & Marucci, in press), and we have a very clear picture of why some matrix items are harder than others, and why some people find them harder than other people. Furthermore, Set E of the Standard Matrices comprises the last 12 items of a set of 60. Even for children of the age tested above, earlier items would present little difficulty for reasons entirely compatible with the above working memory and element identifiability explanations. There is also evidence for logic effects in the Contextual matrices.
Richardson's (1991) experiment was within-subjects: all children attempted both types of item, and here the correlation in numbers of correct answers for the 10 pairs of items is extremely high, around .7. Roberts and colleagues have used only between-subjects designs. Despite this, combining all relevant data across their experiments, and looking at the correlations in the percentages of correct answers for 30 pairs of items (scores for the 10 Standard–Contextual item pairs for each of the three levels of guidance), we find a correlation of .68, p < .01, a very strong logic effect. Standard items that are hard are also hard when they are converted into Contextual items. This is presumably because of the number of elements that each item contains, and the number and types of rules. If there were no domain general processes at work, or domain general resources were irrelevant, it should not be possible to predict the difficulty of a Contextual item on the basis of the difficulty of its equivalent abstract, arbitrary, non-contextual Standard item.

False analogies

Analogies of the sort A is to B as C is to ??? are another common item in intelligence testing, and they have also received the attention of researchers

who believe in extreme domain specificity. For this task, an appropriate relationship needs to be identified that links the A and B terms. This then needs to be applied to the C term in order to identify the correct answer. Analogies may be nonverbal (involving transitions between shapes) but, unlike Raven's Matrices, verbal analogies may also be used, for example Bird is to Air as Fish is to ???.

Item difficulty is relatively straightforward to predict for nonverbal analogies. As for Raven's Progressive Matrices, working memory demand is implicated: the more elements contained within each term, and the more transitions that must be applied to the A and C terms in order to convert them into the B and D terms respectively, the harder the analogy is to solve (Mulholland, Pellegrino, & Glaser, 1980). Difficulties of verbal analogies are harder to predict, especially a priori (e.g., see Lohman, 2000, for a review). Factors related to this include not just the vocabulary and knowledge required, and whether words have multiple meanings, but also the ambiguity and multiplicity of relationships linking the A and B and the C and D terms, the abstractness of the target relationship, and the extent to which the distracters contain plausible but incorrect foils.

Using similar methodology to Richardson (1991), Richardson and Webster (1996) sought to show that performance at items that used abstract meaningless shapes could be boosted. Ten items were taken from the AH4 test (Heim, 1970), and equivalent ones prepared using everyday objects. However, this study was performed without simultaneous commentaries assisting the children. They were also on average 11 years old, and therefore on the verge of proficiency at solving such tasks. Not surprisingly, the effect size was smaller: 65% correct for Contextual items, versus 49% correct for the Standard abstract items. Roberts et al.
(2000, Experiment 3) suggested that this effect could be entirely attributed to element salience: Easier to identify shapes meant easier to identify relationships, and hence easier to identify solutions. They demonstrated this by producing another set of abstract meaningless items, but again using highly distinctive shapes and patterns, finding that these facilitated performance even more (78% correct) than the Contextual items (64% correct), with 53% correct for the Standard items. Also, looking back at the data, there is a clear logic effect. Taking 20 Standard–Contextual item pairs (in order to control the study, each item was tested with two different sets of distracters), the correlation between the two is r = .54, p < .01. Again, the difficulty at Contextual items is predicted by Standard abstract items.

Goswami (e.g., 1992) has conducted extensive research into analogies in an attempt to challenge views that qualitative changes occur in children's ability to solve them. However, more recently it appears that the importance of domain general processing is also rejected (Goswami, Leevers, Pressley, & Wheelwright, 1998). Hence, the position also appears to be one of extreme domain specificity: The only barrier to solution among children is said to be knowledge. If children know and understand the relationship

between the A and B terms, and the C term and its answer, then they will display competence at this type of reasoning. Furthermore, it is explicitly claimed that knowledge is the only limitation on performance; domain general resources, even working memory capacity, are irrelevant (Goswami et al., 1998, p. 566):

The two experiments reported here support the claim made by knowledge-based theorists that analogical reasoning is determined by the child's possession of the requisite relational knowledge, rather than the processing capacity claim that younger children are unable to solve relationally complex analogies based on pairs of relations due to capacity limitations.

Hence, with specially created items, using only relationships that children were likely to know (e.g., play-doh is to cut play-doh as apple is to ???), and with items presented and answered only pictorially, even children as young as 3 years could solve analogies, with this ability correlated with their understanding of the underlying causal relationships (e.g., Goswami & Brown, 1989).

With the review of the contextual facilitation methodology thus far, the potential pitfalls of improving performance should be clear. Facilitation might indeed take place because reasoning only requires appropriate knowledge. However, it might also be the case that procedural aspects of the studies, along with the specially chosen items, mean that there are other interpretations. First, note that the children were given the tasks in a very structured way:

The experimenter then put down the picture cards for a given trial, asking the child to name the pictures as they appeared. This was to ensure that the pictures were recognised by the child. The experimenter laid out the five choice cards [. . .] in a random order, again asking the child to name them as he or she did so. Any misapprehensions about the pictures were corrected at this point. (Goswami & Brown, 1989, pp.
75–76; similar assistance is reported in Goswami & Brown, 1990)

If a child misconceived a picture, most importantly by focusing on an inappropriate aspect of the transition by misnaming the terms, this was corrected. Worse, children were corrected if they made mistakes immediately after each item (Goswami & Brown, 1989). It is therefore plausible that the children identified a strategy of applying the adjective from the B term ("cut") to the noun in the C term ("apple"). Hence, rather than demonstrating a hidden competence, or an ability to apply knowledge in a sophisticated way, it could be argued that the highly structured, guided nature of the task encouraged children to adopt an effective task-specific solution strategy. In the light of this, we reach the rather bland conclusion

that the more assistance children are given, the more likely it is that they will solve a task successfully.

Second, is there a confound in these studies? Relationships for which knowledge about them develops relatively late might also happen to be those that are more demanding to understand and reason about in terms of cognitive processes and resources. In the light of this, statements such as "Children could be failing because they do not understand these relations rather than because the ability to reason about higher-order relations is late developing" (Goswami & Brown, 1989, p. 71) are merely circular. When we are told that "When children are familiar with the relations on which the analogies depend, they are able to reason about higher order relations without difficulty" (p. 93), we might begin to wonder about analogies such as bird is to air as fish is to ??? and black is to white as hard is to ??? (difficult to solve before the age of 12). These are dismissed on the basis that children cannot understand their "quite difficult relations" (Goswami & Brown, 1989) such as habitat and opposite. This is a very informal analysis, and one that is difficult to square with children's knowledge. Word games, such as producing opposites, are understood by children as young as 5 years old, provided that the meanings of the words are understood (e.g., Wiig & Secord, 1992). Ultimately, Goswami's use of terms such as "too difficult" is tautological: Any analogy unsolvable by any age group is automatically "too difficult" for them to solve, no matter whether the A/B and C/D terms might be linked by relationships that children might be expected to know.

Goswami has demonstrated facilitated performance from specially chosen analogies, presented pictorially, in tasks through which children are carefully guided step by step. One interpretation of this is that, if children know and understand the relationships under test, they can solve analogies with ease.
Based on some of the methodology we have seen, we might reinterpret the findings as showing that if children are guided and prompted, they can solve analogies with ease. Overall, an explanation based on domain general resources is compatible with these results. Indeed, looking at nonverbal shape analogies, i.e., ones in which relationships are entirely abstract and arbitrary, Goswami (1989) found clear age and item effects, which mirror the working memory demand findings described above. Again the preferred explanation was that the harder items have relationships that are "too difficult" for the children, but here a domain general resources explanation is even harder to evade. If the rules and numbers of elements used for an item mean that it is too demanding on, say, working memory capacity for easy solution, then of course the relationship is too difficult and has not been understood.

A day at the amusement arcade

The final example concerns a study of learning that again ostensibly shows the importance of context via a demonstration of facilitation, but again is

mired in methodological and interpretational difficulties. The task required children to enter, via a keyboard, a prediction for which part of a screen a shape would travel to (for the fullest description, see Ceci, Bronfenbrenner, & Baker, 1988; see also Ceci, 1996). For example, white shapes travelled up, black shapes down. Circles travelled left, squares right, and triangles travelled only along the vertical axis. Large shapes travelled further than small shapes. Despite up to 750 practice trials, on average 10- to 12-year-old children stubbornly remained below 30% accuracy. A domain general interpretation of this finding might be that the task was too complicated for children of this age, and was therefore beyond their competence or resources (perhaps intelligence). Furthermore, any children able to learn the relationships might be said to be more intelligent than their peers.

Preferring a different interpretation, and observing children's enthusiasm at playing games in amusement arcades, Ceci et al. (1988) first transformed the task to include real objects, but still requiring a keyboard input (this manipulation was a failure), and then added several further modifications all at once:

Instead of typing numbers on a keyboard to represent children's estimates of the objects' termination point, we asked them to shoot down the objects with a cannon that was under the control of a joystick. They were awarded different amounts of points for hits of varying objects, sizes, and colours (Ceci et al., 1988, p. 254; Ceci, 1996, also notes that sound effects were added to the embellished version, but Ceci & Roazzi, 1994, neglect to mention the transition from keyboard to joystick, simply saying that subjects "point to" the location or "place a cursor", calling the changes "slight", pp. 77–79).

With children now responding enthusiastically, most were able to learn the task, but was this because they were now being tested in a familiar context?
On balance, it is difficult to draw any conclusions at all. We should first note that there are many differences between the embellished and the original impoverished version. We should also wonder whether any prior experience of an arcade game context is necessary in order to find the embellished task easier to learn. It is not by accident that joysticks rather than keyboards are used in arcade games, and context-based learning (effortless because of a familiar context) is not the same as channelled learning (effortless because of a format congruent with what humans are able to learn most easily). Hence, another explanation of the findings is that the relationship between shape movement and response (hand movement via a joystick) is more compatible, and therefore more easily learnt, than pressing buttons on a keyboard. It may be that converting the task in this way turns it from one of intellectual coordination to one of motor coordination, more akin to learning to ride a bicycle. Indeed, Ceci et al. (1988) themselves also hint at such an explanation.

In any case, even in this task, logic effects can be discerned in both impoverished and contextual versions. For the prediction task outlined above, final performance approached 90% in the arcade-game condition, but only just exceeded 20% in the abstract condition. Using a more complicated rule that involved a single object moving according to several sine functions, the equivalent performance, again after 750 trials, was approximately 45% in the arcade-game condition, versus just over 15% in the abstract one.
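The four figures above form a 2 × 2 table (task version by rule complexity), and laying them out that way makes the two effects easier to see. The sketch below is only an arithmetic illustration: the values are rounded from the approximate percentages quoted in the text, and the labels "simple" and "complex", along with the function names, are assumptions introduced here.

```python
# Approximate final accuracy (%) after 750 trials, rounded from the text.
# "simple" = the original prediction task; "complex" = the sine-function rule.
PERFORMANCE = {
    ("arcade", "simple"): 90,
    ("abstract", "simple"): 20,
    ("arcade", "complex"): 45,
    ("abstract", "complex"): 15,
}


def rule_effect(version):
    """Drop in accuracy from the simple to the complex rule within one version."""
    return PERFORMANCE[(version, "simple")] - PERFORMANCE[(version, "complex")]


def context_effect(rule):
    """Gain in accuracy from the abstract to the arcade version for one rule."""
    return PERFORMANCE[("arcade", rule)] - PERFORMANCE[("abstract", rule)]


if __name__ == "__main__":
    for version in ("arcade", "abstract"):
        print(version, "rule effect:", rule_effect(version))
    for rule in ("simple", "complex"):
        print(rule, "context effect:", context_effect(rule))
```

On these rounded values the more complicated rule lowers accuracy in both versions, by roughly 45 points in the arcade game and roughly 5 points in the abstract task, where performance was already low. The context gain itself shrinks from about 70 points to about 30 as the rule becomes harder.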

Game over

In this review of the contextual facilitation methodology, we see a number of recurring pitfalls. Fundamentally, adding context simultaneously manipulates extraneous variables, invalidating claims of logical isomorphism and equal complexity. This results in studies with confounded variables and empty cells. When these cells are filled, effect sizes become less impressive, and main effects of logic abound, even in contextualised versions. Claims of logical isomorphism should always be treated with scepticism. Everywhere we have looked, we have seen "easy" contextualised tasks, supposedly logically isomorphic to "difficult" abstract equivalents, which turn out not to be quite so, because the logic has been changed, or key aspects have been made more salient, or additional help has been added, or even a combination of all three.

Even so, there are still many extremely important phenomena to explain: that some tasks are easier than others, and that some people are better than others, and that this is predictable, for example by age and intelligence. Domain specific explanations have been put forward that attempt to account for these in their entirety: that some tasks are easier than others because they activate a module, or appropriate knowledge; that some people are better than others because, for whatever reason, they have had experience at a wider range of contexts, and/or more experience at those relevant to the tasks in question. These explanations still fall down because they cannot explain logic effects: systematic differences in task difficulty, predictable from the abstract structure of the task, that are irrelevant to context, but operate in parallel with this. Strictly, there should be none at all. Ad hoc explanations for these effects can be put forward; for example, suggesting that some logic formats are better than others at activating context-specific reasoning mechanisms.
Such retrospective denials of the importance of domain general processes and resources need better evidence than is currently available. The problem is that, fundamentally, domain specific theories of inference are imprecisely expressed. Often, they amount to little more than descriptions of situations in which people are likely to perform at their best. Without proper process models, which explain how knowledge structures or modules are activated, and how their processes are applied, logic effects can never be explained.

Is the game over for the contextual facilitation methodology? Probably not; its limitations and confounds are somewhat subtle, and the effects found using it can be so striking that even the strongest conclusions drawn can seem reasonable. However, it seems unlikely that the methodology can be improved in order to remove its fundamental problems. Even so, it should be emphasised that the use of the term extraneous throughout this chapter should be taken to mean extraneous to the hypothesis under test rather than extraneous to psychological interest. It is the confound between contextual manipulations and extraneous variables that makes the experiments using this methodology difficult to interpret.

Despite this, we have probably learnt a lot from its use. Once we strip away the trivial conclusions, such as the more help people are given, the easier they find a task, we see that there are important phenomena to explain. The context-relatedness of cognition is undoubtedly important, just not exclusively so. There are also ubiquitous logic effects even within contextualised tasks, at the very least implying the universal importance of domain general resources, and quite possibly domain general processes too. There is the finding that, in the right circumstances, even young children can reason about meaningless shapes and patterns, but that a significant source of difficulty comes from identifying the rules that link them, so that assistance with this for matrices and analogies considerably improves performance. It is important that all effects on performance are understood, likewise all individual differences and developmental changes. For example, if manipulations mean that attention or memory deficits can be overcome in certain circumstances, or that key elements of a task can be more effectively identified and represented, why is this, and how can it be capitalised on so that everyone can perform at their best, and learn at their fastest?
To conclude, the evidence for special memory structures such as schemas and specialist modules, derived from the contextual facilitation methodology, is far from clear-cut. Of course, a defective methodology does not preclude their existence; it merely limits the conclusions that we can draw when relying on it. On balance, the safest conclusions so far are (1) knowledge can substitute for inference, and (2) context enables domain general processes to operate more effectively by emphasising key aspects of problems, and hence reducing demands on domain general resources. Stronger conclusions require evidence from other methodologies.

Acknowledgements

The unpublished data looking at logic effects in contextual Wason selection tasks were collected by Rebecca Barton, Hannah Cooper, Patrick Gurden, and Toni Nelson. The author would like to thank Kelly Mix, Elizabeth Newton, and David Over for their comments on earlier versions of this chapter, and Jill Baird for bringing to his attention the presence of logic effects in the Contextual Matrices.

1. Contextual facilitation methodology

References

Adey, P. (1997). It all depends on the context, doesn't it? Searching for general, educable dragons. Studies in Science Education, 29, 45–92.
Braine, M. D. S., & O'Brien, D. P. (Eds.). (1998). Mental logic. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices Test. Psychological Review, 97, 404–431.
Ceci, S. J. (1996). On intelligence. Cambridge, MA: Harvard University Press.
Ceci, S. J., Bronfenbrenner, U., & Baker, J. G. (1988). Memory in context: The case for prospective remembering. In F. E. Weinert & M. Perlmutter (Eds.), Memory development: Universal changes and individual differences (pp. 243–256). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Ceci, S. J., & Roazzi, A. (1994). The effects of context on cognition: Postcards from Brazil. In R. J. Sternberg & R. K. Wagner (Eds.), Mind in context: Interactionist perspectives on human intelligence (pp. 74–101). Cambridge: Cambridge University Press.
Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391–416.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Cosmides, L., & Tooby, J. (1994). Origins of domain specificity: The evolution of functional organisation. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind (pp. 85–116). Cambridge: Cambridge University Press.
Donaldson, M. (1978). Children's minds. London: Fontana Press.
Evans, J. St B. T., Legrenzi, P., & Girotto, V. (1999). The influence of linguistic form on reasoning: The case of matching bias. Quarterly Journal of Experimental Psychology, 52A, 185–216.
Evans, J. St B. T., & Lynch, J. S. (1973). Matching bias in the selection task. British Journal of Psychology, 64, 391–397.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: The role of domain specific representations and inferences in the Wason selection task. Cognition, 77, 1–79.
Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1–38.
Goswami, U. (1989). Relational complexity and the development of analogical reasoning. Cognitive Development, 4, 251–268.
Goswami, U. (1992). Analogical reasoning in children. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Goswami, U., & Brown, A. L. (1989). Melting chocolate and melting snowmen: Analogical reasoning and causal relations. Cognition, 35, 69–95.
Goswami, U., & Brown, A. L. (1990). Higher-order structure and relational reasoning: Contrasting analogical and thematic relations. Cognition, 36, 207–226.
Goswami, U., Leevers, H., Pressley, S., & Wheelwright, S. (1998). Causal reasoning about pairs of relations and analogical reasoning in young children. British Journal of Developmental Psychology, 16, 553–569.
Green, D. W., & Larking, R. (1995). The locus of facilitation in the abstract selection task. Thinking and Reasoning, 1, 183–199.

Griggs, R. A., & Cox, J. R. (1982). The elusive thematic materials effect in the Wason selection task. British Journal of Psychology, 73, 407–420.
Heim, A. W. (1970). Manual for the AH4 group test of general intelligence. Windsor, UK: NFER-Nelson.
Holyoak, K. J., & Cheng, P. W. (1995a). Pragmatic reasoning from multiple points of view: A response. Thinking and Reasoning, 1, 373–388.
Holyoak, K. J., & Cheng, P. W. (1995b). Pragmatic reasoning with a point of view. Thinking and Reasoning, 1, 289–314.
Hughes, M. (1978). Selecting pictures of another person's view. British Journal of Educational Psychology, 48, 210–219.
Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Hove, UK: Psychology Press.
Liberman, N., & Klar, Y. (1996). Hypothesis testing in Wason's selection task: Social exchange detection or task understanding. Cognition, 58, 127–156.
Light, P., & Gilmour, A. (1983). Conservation or conversation? Contextual facilitation of inappropriate conservation judgments. Journal of Experimental Child Psychology, 36, 356–363.
Lohman, D. F. (2000). Complex information processing and intelligence. In R. J. Sternberg (Ed.), Handbook of intelligence (pp. 285–340). Cambridge: Cambridge University Press.
Love, R. E., & Kessler, C. M. (1995). Focusing in Wason's selection task: Content and instruction effects. Thinking and Reasoning, 1, 153–182.
McGarrigle, J., & Donaldson, M. (1975). Conservation accidents. Cognition, 3, 341–350.
Manktelow, K. I., & Over, D. E. (1991). Social roles and utility in reasoning with deontic conditionals. Cognition, 39, 85–105.
Meo, M., Roberts, M. J., & Marucci, F. S. (in press). Element salience as a predictor of item difficulty for Raven's Progressive Matrices. Intelligence.
Moore, C., & Frye, D. (1986). The effect of experimenter's intention on the child's understanding of conservation. Cognition, 22, 283–298.
Mulholland, T. M., Pellegrino, J. W., & Glaser, R. (1980). Components of geometric analogy solution. Cognitive Psychology, 12, 252–284.
Nelson, I., Dockrell, J., & McKechnie, J. (1983). Does repetition of the question influence children's performance in conservation tasks? British Journal of Developmental Psychology, 1, 163–174.
Newstead, S. E., Handley, S. J., Harley, C., Wright, H., & Farrelly, D. (2004). Individual differences in deductive reasoning. Quarterly Journal of Experimental Psychology, 57A, 33–60.
Noveck, I. A., & O'Brien, D. P. (1996). To what extent do pragmatic reasoning schemas affect performance on Wason's selection task? Quarterly Journal of Experimental Psychology, 49A, 463–489.
Piaget, J. (1950). The psychology of intelligence (reprinted 2005). London: Routledge.
Platt, R. D., & Griggs, R. A. (1995). Facilitation and matching bias in the abstract selection task. Thinking and Reasoning, 1, 55–70.
Primi, R. (2002). Complexity of geometric inductive reasoning tasks, contribution to the understanding of fluid intelligence. Intelligence, 30, 41–70.
Raven, J., Raven, J. C., & Court, J. H. (1993). Manual for Raven's Progressive Matrices and Mill Hill Vocabulary Scales. Oxford: Oxford Psychologists Press.

Richardson, K. (1991). Reasoning with Raven – in and out of context. British Journal of Educational Psychology, 61, 129–138.
Richardson, K. (1996). Putting Raven into context: A response to Roberts and Stevenson. British Journal of Educational Psychology, 66, 533–538.
Richardson, K., & Webster, D. S. (1996). Analogical reasoning and the nature of context: A research note. British Journal of Educational Psychology, 66, 23–32.
Rips, L. J. (1994). The psychology of proof. Cambridge, MA: MIT Press.
Roazzi, A., & Bryant, P. (1992). Social class, context and cognitive development. In P. Light & G. Butterworth (Eds.), Context and cognition (pp. 14–27). New York: Harvester Wheatsheaf.
Roberts, M. J. (1996). Putting context into context: A rejoinder to Richardson. British Journal of Educational Psychology, 66, 539–542.
Roberts, M. J. (2002). The elusive matching bias effect in the disjunctive selection task. Experimental Psychology, 49, 89–97.
Roberts, M. J., & Newton, E. J. (2001). Inspection times, the change task, and the rapid-response selection task. Quarterly Journal of Experimental Psychology, 54A, 1031–1048.
Roberts, M. J., & Stevenson, N. J. (1996). Reasoning with Raven – with and without help. British Journal of Educational Psychology, 66, 519–532.
Roberts, M. J., Welfare, H., Livermore, D. P., & Theadom, A. M. (2000). Context, visual salience, and inductive reasoning. Thinking and Reasoning, 6, 349–374.
Snow, R. E., Kyllonen, P. C., & Marshalek, B. (1984). The topography of ability and learning correlations. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 2, pp. 47–103). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Sophian, C. (1997). Beyond competence: The significance of performance for conceptual development. Cognitive Development, 12, 281–303.
Stanovich, K. E., & West, R. F. (1998). Cognitive ability and variation in selection task performance. Thinking and Reasoning, 4, 193–231.
Valentine, E. R. (1975). Performance on two reasoning tasks in relation to intelligence, divergence, and interference proneness. British Journal of Educational Psychology, 45, 198–205.
Vodegel Matzen, L. B. L., Van der Molen, M. W., & Dudink, A. C. M. (1994). Error analysis of Raven test performance. Personality and Individual Differences, 16, 433–445.
Wiig, E. H., & Secord, W. (1992). Test of word knowledge, examiner's manual. San Antonio, TX: The Psychological Corporation.

2

To what extent do social contracts affect performance on Wason's selection task?

Ira A. Noveck, Hugo Mercier, and Jean-Baptiste Van der Henst

It is only fair for us to say that all three of us endorse the notion that evolutionary factors play a crucial role in reasoning, as well as in other cognitive activities. It is also in the interest of the reader to know that we are not unsympathetic to what the editor of this volume has called the extreme domain specificity hypothesis. That said, we also think it is important to highlight how difficult it is to investigate the theoretical arguments of evolutionary accounts using an experimental paradigm, especially in reasoning. In this respect, we are in agreement with at least one aim of the book, the one that points to the extreme interpretational difficulties encountered when a specific evolutionary theory claims to account for a specific set of data. This chapter will focus on Wason's selection task, a reasoning problem that has become one of the staples in the cognitive literature, as well as an arena of sorts for competing accounts concerned with the role of content in facilitating performance. It is also the task employed by Cosmides (1989) – a proponent of one of the "extreme" views this volume addresses – to underline the import of social contracts in the evolution of conditional reasoning. In this chapter, we will focus on how subtle features of the selection task play a very important role in facilitating "correct" performance, and how these often overshadow, or raise doubts about, the more theoretically driven aspects that are claimed to be sources of facilitation. Clearly, it is in everyone's interest to separate extraneous variables from the one or two factors that a given account considers to be of genuine theoretical interest. Our plan for this chapter is to provide some historical background on the content effect related to the selection task. This leads to Cheng and Holyoak's Pragmatic Reasoning Schema theory and its account of the content effect.
We then show how prior investigations, which addressed the role of potential confounds with respect to this account, have led to more carefully constructed selection tasks. These prior efforts have shown that (1) correct performance on the selection task is often due to influences that have little to do with theoretical claims, and (2) such studies can provide insight into the role of potentially confounding factors in Cosmides'

research. We then present the results of three experiments that investigate the role of extraneous factors in Cosmides' tasks.

The selection task

Wason's Selection Task hardly needs an introduction to most readers of this volume. Nonetheless, it always pays to present the task before describing the experimental manipulations made to it. In the Standard Abstract problem, subjects are presented with four cards showing, for example, A, B, 4, and 7, and told that each of these has a letter on one side and a number on the other. The original problem requires subjects to consider a universally quantified conditional rule concerning a relationship between the two sides of the cards, e.g., if a card has an "A" on one side then it has a "4" on its other side. The task is to reason-about a rule, i.e., decide which of the cards would need to be turned over to determine whether it is true or false. The appropriate answer from the perspective of standard logic (hereafter referred to as the "correct" answer) is to choose the "A" and the "7" cards. In the event that one finds a number other than "4" on the other side of the "A" or an "A" on the other side of the "7", the rule has been falsified. The probability of a correct response by chance is .0625, and the rate at which this typically occurs does not differ significantly from chance (see Evans, 1989; Evans, Newstead, & Byrne, 1993; Johnson-Laird & Wason, 1970). The modal responses are to turn over the A and the 4, or just the A card. Interest in the selection task stems largely from findings of a content effect, that is, several realistic-content versions of the task elicit correct responses. For example, the facilitative postal problem (Johnson-Laird, Legrenzi, & Legrenzi, 1972) has the rule "If a letter is sealed then it has a 50 lire stamp on it" along with four envelopes that mirror the sorts of cards one finds in the standard task: the back of an envelope showing that it is sealed, the back of an envelope showing that it is unsealed, an envelope's face having a 50 lire stamp, and an envelope's face showing a 10 lire stamp.
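The logic of the Standard Abstract problem can be made concrete with a minimal sketch. It assumes the example cards A, B, 4, and 7 from the text, and it assumes that a "chance" responder is equally likely to pick any subset of the four cards, including the empty selection; under that assumption there are 16 possible selections, which is consistent with the .0625 figure. The function and variable names are illustrative only.

```python
from itertools import combinations

# Visible faces of the four cards, labelled by their logical role for the
# rule "if a card has an A on one side then it has a 4 on its other side".
CARDS = {"A": "P", "B": "not-P", "4": "Q", "7": "not-Q"}


def can_reveal_violation(role):
    """Only a P card (whose hidden side may be not-Q) or a not-Q card
    (whose hidden side may be P) can expose a P-and-not-Q counterexample."""
    return role in ("P", "not-Q")


def all_selections(cards):
    """Every subset of the cards, including the empty selection."""
    names = list(cards)
    return [frozenset(combo)
            for size in range(len(names) + 1)
            for combo in combinations(names, size)]


# The logically "correct" selection: exactly the cards that could falsify the rule.
correct_selection = frozenset(
    name for name, role in CARDS.items() if can_reveal_violation(role)
)
selections = all_selections(CARDS)

# Assumption of this sketch: each subset is equally likely under guessing.
chance_rate = 1 / len(selections)

print(sorted(correct_selection), len(selections), chance_rate)
```

Turning the B or the 4 card cannot falsify the rule whatever is on the hidden side, which is why only one of the 16 selections counts as correct.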
Such problems yield rates of correct responses that are usually above 50%, well above chance. The selection task therefore became an important paradigm in which to test theoretically driven explanations of content effects in reasoning. Cheng and Holyoak's (1985) pragmatic reasoning schema account claimed that, as a result of repeated exposure to particular classes of contents, people induce and store domain specific inference structures in clusters called pragmatic reasoning schemas. These were defined in terms of classes of goals and content (e.g., permissions, obligations, and causations) and were described as being context-sensitive, in that they apply only when appropriate goals and contents are present. In other words, a schema becomes available when a situation warrants it. According to the pragmatic schemas theory, reasoning with thematically familiar materials typically uses such knowledge structures. Part of the appeal of this theory was that it provided an

apparently straightforward explanation for the content effect: up to that point, most of the realistic-content versions that had elicited facilitation could have been understood as presenting veiled pragmatic rules. For example, the rule in the postal problem could be viewed as an obligation schema (If Situation S arises then Action A must be done). The triggering of this schema prompts four production rules:

I. If (Situation) S arises then (Action) A must be done;
II. If S does not occur then A need not be done;
III. If A is done, then S might have (or might not have) occurred;
IV. If A has not been done, then S must not have occurred.
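The four production rules can be sketched as a small lookup table, shown here as an illustration rather than as the theory's own notation. The boolean flag marking whether a rule constrains the hidden side of a card, and the mapping of the postal problem's envelopes onto the schema's cases, are assumptions introduced for this sketch.

```python
# Illustrative encoding of the obligation schema
# ("If Situation S arises then Action A must be done").
# Each entry: (rule number, visible case, conclusion, constrains hidden side?).
# The boolean is an assumption of this sketch: it is True when the rule
# demands something definite of the other side of the card.
OBLIGATION_RULES = [
    ("I", "S arises", "A must be done", True),
    ("II", "S does not occur", "A need not be done", False),
    ("III", "A is done", "S might or might not have occurred", False),
    ("IV", "A has not been done", "S must not have occurred", True),
]

# Hypothetical mapping of the postal problem's envelopes onto the schema,
# with S = "the letter is sealed" and A = "it carries a 50 lire stamp".
POSTAL_CARDS = {
    "sealed": "S arises",
    "unsealed": "S does not occur",
    "50 lire stamp": "A is done",
    "10 lire stamp": "A has not been done",
}


def cards_to_turn(cards, rules):
    """Return the cards whose visible case is covered by a constraining rule,
    i.e. the only cards whose hidden side could show a violation."""
    constraining = {case for _, case, _, constrains in rules if constrains}
    return [card for card, case in cards.items() if case in constraining]


selected = cards_to_turn(POSTAL_CARDS, OBLIGATION_RULES)
print(selected)
```

Only Rules I and IV constrain the hidden side, so the sketch selects the sealed envelope and the 10 lire envelope, which corresponds to the P and not-Q cards of the abstract task.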

These, in effect, walk one through the correct response to the selection task. Cheng and Holyoak's (1985) strongest evidence came from abstract-content versions of the selection task that facilitated the correct response pattern. These employed rules derived from their abstractly worded schemas along with cards that were worded similarly (e.g., Situation S arises, Situation S does not arise). Cheng and Holyoak claimed that their abstract permission version (If one is to take Action A, then one must first satisfy Precondition P) elicited correct response patterns because the wording in the problem's rule triggered the entire permission schema, whereas the rules in the Standard Abstract problems did not. Although it could be argued that Cheng and Holyoak's main claims have not been completely refuted on experimental grounds (see Holyoak & Cheng, 1995), several experiments have shown that much of the facilitation originally reported (61% correct on the abstract permission problem versus 19% on the control) is due to extraneous factors. For example, Noveck and O'Brien (1996) showed that a permission rule by itself does nothing to elicit solution: Only 8% of subjects solved the least successful permission-rule problems. Adding certain details to the task, such as making negative information explicit in the cards and using what were called "reasoning-from" problems, increased the percentage of subjects solving the problem to 40%, and adding a set of other elaborating features increased the percentage to 61%, which is the same value reported by Cheng and Holyoak (1985) for the abstract permission problem.1 These enriching features – not

1 "Not P" can be expressed either explicitly ("has not fulfilled precondition P") or implicitly ("has fulfilled precondition Q"). Unlike the reasoning-about problems used in Wason's original task, which require participants to determine whether a rule is true or false, reasoning-from problems present the rule as true and as a basis for finding violators. One example of an elaborating factor is that the overall length of the original "permission" problem is roughly 50% greater than its control problem. Part of the extra length is due to an elaboration on the given permission rule, e.g., by saying "In other words . . ." which did not exist for the control problems.

in the scope of Cheng and Holyoak's theoretical framework – substantially increased the number of participants solving the task, and thus played a crucial role in achieving the level of success previously reported. Such work clarifies how apparently innocent details affect performance on Wason's selection task, and demonstrates that caution is called for when one is introducing new variables to the paradigm. The present work aims to enlarge the scope of this approach by focusing on Cosmides' social contract theory. In a landmark paper, Cosmides (1989) argued that content effects are due not to acquired pragmatic schemas, but to an innate cheater detection module. This account, inspired by evolutionary theory, can be summarised as follows: Human beings cooperate, and we seem to have done so ever since we emerged as a species. A possible explanation for the appearance of cooperation is reciprocal altruism (Trivers, 1971): Individuals follow the rule "You scratch my back and I'll scratch yours." By benefiting both parties, this mechanism allows for the evolution of cooperation. However, some conditions addressing cheaters must be met because cooperation could ultimately be undermined if cheating is unrestricted. By failing to give something in return, a cheater ends up taking an illicit benefit. As computer models have shown (Axelrod, 1984), cheaters who go unpunished will take advantage of others and subvert the evolution of cooperation. Therefore, in any species practising reciprocal altruism, it makes sense to look for mechanisms designed to detect and punish cheaters. Cosmides hypothesised that this cheater detection module is the key to success on the selection task.
Central to her arguments were data showing that tasks requiring participants to find violators of "If you take the benefit then you must pay the cost" rules had facilitated rates of performance.2 However, these claims are dubious because, much like Cheng and Holyoak's reasoning problems, Cosmides' social contract tasks contain many narrative details and elaborations not found among companion control tasks. To make our argument quickly, it suffices to point to a superficial measure – problem length – of one of the social contract tasks, the Kaluame problem. This variation of the selection task – referred to as the original USSC ("unfamiliar standard social contract") problem – uses 392 words to describe in colourful detail the participant's task, which is to imagine being a member of a foreign culture and enforcing its strict laws. It yields a rate of correct responses of around 70%. Its rule – "If a man eats cassava root then he must have a tattoo on his face" – comes with a very long narrative describing the benefits and scarcity of cassava root, as well as those instances when one finds tattoos ("only married men have tattoos on

2 More recently, Fiddick, Cosmides, and Tooby (2000) have refined their account, one upshot being that benefits are defined in these contexts as requirements. For the sake of simplicity we retain their original language.

their faces"). The abstract problem that comes closest to Wason's original task (to be called the standard abstract task) contains only 141 words, and a second descriptive control problem has 320. In Cosmides' experiments, both of these yield rates of correct responses that are around 20–25% (note that Wason's original problem usually yields a much lower rate). This simple measure shows that, for the social contract problems heralded by Cosmides, there are potential advantages favouring comprehension built into them. Below, we compare in greater detail the original USSC problem to the descriptive control – the two that are closest in terms of length – in order to reveal three advantages inherent to the original USSC problem. First, there is an urgency written into the original USSC task that is absent in the descriptive control. In the introduction to the USSC problem, participants are told that "the elders have entrusted you with enforcing [rules] and to fail would disgrace you and your family." In the descriptive problem, the participants are told to imagine being an "anthropologist studying the Kaluame people" and the rule is presented as dubious: "You decide to investigate your colleague's peculiar claim." Not only has the literature shown that role-playing in the selection task can have a significant impact on performance (Politzer & Nguyen-Xuan, 1992), but the introduction for the original USSC task arguably motivates the participants more. Second, a level of detail is ascribed to the benefits and costs in the original USSC problem that one does not find in the descriptive problem. Whereas USSC sentences introduce the beneficial cassava root, explaining why it is so treasured (103 words elaborate on how "cassava root is a powerful aphrodisiac"), the descriptive problem mentions cassava root only in the most general of ways (sentences containing the word "cassava" add up to only 55 words).
Someone defending the experimental validity of these two narratives might say that elaborating on the costs and benefits in the original USSC problem, while only sketching these in the descriptive controls, is essential to Cosmides' claims. However, even if this is the case, the differences could have been implemented experimentally in sounder and less stark ways. Many narrative details in the original USSC problem repeatedly state the main take-home message about the aphrodisiac, which is that cassava root is strongly desired and carefully rationed. This could have been not only avoided but eliminated, because the rule itself, along with minimal information pointing out what is a cost and what is a benefit, should suffice to trigger the cheater detection module. Third, the descriptive problem includes irrelevant information: that cassava root is found in the north of the island and that people eat cassava root or molo nuts, but not both. It could be, then, that the original USSC problem, which does not contain these obfuscating details, is not necessarily facilitative, and that it is instead the descriptive control that blocks facilitation. To summarise, like Cheng and Holyoak's initial work, Cosmides' original social contract problems contain features that make these tasks look very different from their controls, and in a way that is not justified on theoretical

grounds.3 The shortcomings of Cosmides' studies are arguably more egregious than those in Cheng and Holyoak's. They are also diffuse, making it hard to see how one can easily remove these while testing the relevant features of the social contract thesis. Platt and Griggs (1993) endeavoured to separate the influence of theoretically based claims from experimental confounds by investigating a host of issues that are raised by Cosmides' tasks. They compared (1) participants who received one selection task problem versus many; (2) the presence versus absence of cost–benefit information; (3) the presence versus absence of explicit negations in the cards; (4) the presence versus absence of an authority-taking "perspective"; and (5) the presence versus absence of the modal must in the social contract problems. Of these, only (2) is directly relevant to activation of a cheater detection module, and even this aspect is overrepresented in the original USSC problem when compared to descriptive controls. This is emblematic of the kind of research one must do in order to distil out the relevant theoretical features. It is no small task and it is an unfortunate diversion from theoretical development. Platt and Griggs presented evidence showing that cost–benefit information does affect rates of correct performance. In their Experiment 2, they removed (or maintained) what they deemed to be cost–benefit information from the body of three different selection task problems. For two of these (Cosmides' Namka and School problems), this was detrimental to rates of correct performance, and for the last one – the Kaluame problem, the original USSC problem – it was not. Even so, other factors were shown to contribute to the high rate of correct performance with the Kaluame problem (e.g., the word must in the rule).
The authors concluded that "the cost–benefit structure [is] necessary for substantial facilitation" on Cosmides' problems, and that their findings are strongly supportive of Cosmides' account (p. 187). Although Platt and Griggs do provide some support for Cosmides' account, there are three reasons to remain dubious. First, the fact that the manipulated cost–benefit information was part of an extensive elaboration of the rule raises doubts about whether social contract claims need apply. If success on a task depends on more and more elaboration on a specific theme, then the modular aspect of the cheater detection device seems weak. The long narrative describing the drawbacks of the costs, and the importance of the benefits, in Cosmides' tasks, should not be necessary and is in itself controversial. If cost–benefit information is indeed sufficient for facilitating performance, this should be self-evident in the rule, and not require extensive

3 Cosmides' tasks can be criticised on other grounds as well. For example, as Fiddick et al. (2000) are aware, finding a cheater is not the same thing as finding a violator to a logical rule. Sperber and Girotto (2002) point out how such a distinction makes social contract problems unique in reasoning paradigms.

explanation. Second, Platt and Griggs used Cosmides' original tasks as a kind of standard before removing specific sentences. Given that Cosmides' tasks are practically stories, to remove lines summarily from them potentially interrupts the narrative flow that was arguably present in the original. The upshot is that whenever problems yield lower success rates, this might indeed be the result of the removal of critical pieces of information (as claimed) or it might be due to a disrupted narrative flow. We agree with Platt and Griggs' experimental intent, but not with the way they carried it out. Our strategy will be to take a problem that contains the bare minimum of theoretically relevant information (i.e., one that does not come with a plethora of unnecessary details) and then import features, such as cost–benefit information. This way the control problem is assured to be sensible before the variables are introduced. Third, as Platt and Griggs point out themselves, in some cases it is not clear how one should characterise sentences and fragments (e.g., as cost–benefit information or not). Some of their own decisions are not convincing. For example, they considered the phrase "Cassava root is so powerful an aphrodisiac, that many men are tempted to cheat on this law whenever the elders are not looking" as cost–benefit information (as opposed to information about rule enforcement). We think an audit of such classifications is called for. These doubts led us to carry out our own set of experiments that follow up on Platt and Griggs (1993) and address the methodological drawbacks in Cosmides' original study. In one study, we compare Cosmides' original USSC problem to a version that has nearly all extra-theoretical information removed (Experiment 1A). In the same spirit, but coming from the opposite direction, we compare a version of a short abstract control problem to one that has only relevant, minimal cost–benefit information added (Experiment 1B).
In Experiment 2, we investigate the role of cost–benefit information and rule-enforcement information found outside the provided rule in the Kaluame problem (which still facilitated even with the deletions in Platt and Griggs' studies). Our strategy is to start with a minimal set of relevant features before importing details. Our ultimate aim is to capture the influence of relevant theoretical factors (i.e., cost–benefit information) inside the rule (Experiments 1A and 1B) and outside the rule (Experiment 2), while separating out the influence of non-theoretical, and potentially confounding, information.

Experiments 1A and 1B

According to Cosmides, the cheater detection module ought to be activated as soon as the costs and benefits involving a social contract situation are detected (Cosmides, 1989, pp. 199–200). In other words, an appropriately worded rule ought to prompt a cheater detection module as much as one in a richly detailed context. We intend to determine the extent to which this can be supported.

In much the same way as Noveck and O'Brien and others (Girotto, Mazzocco, & Cherubini, 1992; Griggs & Cox, 1993; Jackson & Griggs, 1990; Kroger, Cheng, & Holyoak, 1993) investigated extraneous factors in the pragmatic reasoning schema account, we determine the extent to which extraneous information influences performance on the problems used to support Cosmides' social contract account. This is why we compared Cosmides' original USSC problem to a version that was shorn of nearly all its unnecessary details (Experiment 1A) and why we compared an abstract control problem to one whose rule ought to provoke a cheater detection module (Experiment 1B). Each experiment contained a version of a previously run selection task that allows us to verify that our samples resemble those found in the literature.

Experiment 1A

In Experiment 1A we compared a French translation of Cosmides' original Kaluame USSC problem to another we call the concrete aphrodisiac–married problem. For the original USSC, note that its rule is "If a man eats cassava root then he must have a tattoo on his face." The novel problem included cost–benefit information in the rule only, directly by using the term aphrodisiac instead of cassava root and married instead of tattoo on his face. If the concrete aphrodisiac–married problem produces a rate of correct performance that resembles the original USSC problem, then that would support Cosmides' claims. If the details in the narrative are important, then we would expect a significantly higher rate of correct response in the original USSC problem.

Method

Eighty-three French undergraduates in History participated (mean age: 19.7 years). Each received two sheets. The first contained, unlike Cosmides (1989), short instructions about the participant's task. The second contained one of the two following problems: Cosmides' original USSC problem or the concrete aphrodisiac–married problem.4 These were randomly assigned and were run prior to a History class in a lecture-hall.
The concrete aphrodisiac±married task (translated) looked like this: Imagine that you are an authority among the Kaluame, a Polynesian tribe. Among the Kaluame, there is a very important rule that you must make sure is respected. If a man takes an aphrodisiac, then he must be married.

4 Wording for previously used problems, not given here, can be readily obtained from original sources, textbooks, and the internet.

2. Social contracts and the selection task


Table 2.1 Response patterns to the problems of Experiments 1A and 1B, and for the four novel problems and Cosmides' original USSC problem in Experiment 2. The correct response is to choose the P-and-not-Q cards.

Problem                                     n    P & not-Q    P     P & Q   Not-Q   Other

Experiment 1A
  Unfamiliar standard social contract       39      69%      10%     5%      2%     14%
  Concrete aphrodisiac–married              44      25%      33%    19%      4%     19%

Experiment 1B
  Standard abstract                         38      16%      18%    30%      0%     36%
  Abstract cost–benefit                     41      46%      10%    10%     19%     15%

Experiment 2
  Unfamiliar standard social contract       37      73%       3%     5%     11%      8%
  New selection task problems
  (cost–benefit / rule-enforcement)
    unelaborated / unelaborated             43      37%      17%    12%      5%     30%
    unelaborated / elaborated               47      40%       2%    17%      8%     32%
    elaborated / unelaborated               39      53%       5%    13%      5%     23%
    elaborated / elaborated                 46      54%      13%    13%      7%     13%

The cards below contain information about four young Kaluame men. Each card represents a man. On the face side of the card, it shows what the man ate, and the other side shows whether or not he is married. In order to verify that the rule is violated, which card(s) below do you need to turn over? Turn over only those cards that are necessary.

The four cards were illustrated with "took aphrodisiac", "did not take aphrodisiac", "married", and "not married", in that order.

Results

Table 2.1 shows the percentage of correct answers (the P & not-Q cards) for each problem. We highlight two findings. First, the original USSC problem yielded a rate of correct responses consistent with the literature, 69%, indicating that our participants are comparable to others. Second, the concrete aphrodisiac–married problem yielded a rate of correct responses, 25%, that was much lower. The difference between the two problems is significant, χ²(1) = 18.5, p < .01.

Experiment 1B

Experiment 1B also compared two problems. One was a standard abstract problem, which typically produces low rates of correct responses. The


version used was a reasoning-from problem with explicit negations. The other was labelled the abstract cost–benefit problem. Much like in Cheng and Holyoak's abstract problems, the rule was presented as "If one takes Benefit 'B', then one must pay the Cost 'C'". If the salience of costs and benefits is enough to prompt a cheater detection module, this novel problem ought to provide a rate of correct responses that is higher than the standard abstract problem.

Method

Seventy-nine French undergraduates in History participated in this experiment (mean age: 19.8 years). The novel problem presented the rule with arbitrary references to costs and benefits. Here is an English version of the problem:

Imagine that you are an authority who needs to verify whether or not people respect the following rule: If someone takes benefit "B" then he must pay cost "C". The cards below contain information about four people. One side of the card indicates whether the person took the benefit "B" or not and the other indicates whether the same person paid the cost "C" or not. In order to verify whether or not the rule has been violated, which card(s) below would you turn over? Turn over only those card(s) that are necessary.

The cards were presented as having taken benefit "B", not having taken benefit "B", having paid cost "C", not having paid cost "C". The study's procedure was identical to that of Experiment 1A.

Results

Table 2.1 shows the percentage of correct answers (the P & not-Q cards) for the two problems. The standard problem yielded a rate of correct responses (16%) among our participants that is consistent with the literature. The novel abstract cost–benefit problem prompted a rather high rate of correct responses (46%). The difference between the two problems is significant, χ²(1) = 8.05, p < .01.
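The scoring used in these experiments treats the P and not-Q cards as the correct selection because only those two cards can conceal a case that breaks a rule of the form "if P then Q". The following is a minimal sketch of that logic, not part of the original study; the function names are hypothetical, and P and Q stand for "took benefit B" and "paid cost C".

```python
def violates(p, q):
    """A case breaks 'if P then Q' only when P holds and Q does not."""
    return p and not q


def cards_to_turn():
    """Return the cards whose hidden side could reveal a violation."""
    # Each card shows one fact; the hidden side could be either value.
    visible = {
        "P": ("p", True),       # took benefit B
        "not-P": ("p", False),  # did not take benefit B
        "Q": ("q", True),       # paid cost C
        "not-Q": ("q", False),  # did not pay cost C
    }
    needed = []
    for label, (side, value) in visible.items():
        for hidden in (True, False):
            p, q = (value, hidden) if side == "p" else (hidden, value)
            if violates(p, q):
                needed.append(label)
                break
    return needed


print(cards_to_turn())
```

The not-P and Q cards are never selected because no hidden value on their other side can make the rule false, which is why choosing them counts as an error in Table 2.1.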
Discussion of Experiments 1A and 1B

Our investigation shows that extraneous features are crucial to successful performance on social contract problems originating from Cosmides. When the original USSC problem is reduced so that only relevant theoretical


features are included, remaining mostly in the rule of the concrete aphrodisiac–married problem, rates of correct responses drop dramatically. Even though this problem has enough features to trigger a cheater detection module, participants largely fail to find all potential cheaters. In Experiment 2, we will determine which of the extraneous features of the original USSC problem are responsible for facilitation.

A second result is that the abstract cost–benefit problem in Experiment 1B was successful at facilitation. This finding potentially supports Cosmides' hypothesis. Moreover, Table 2.1 shows that one of its prominent response patterns is to choose the not-Q card only. This is noteworthy because, when only one of the two "correct" cards is selected, it is usually the P card. It seems then that the abstract cost–benefit problem not only leads to a relatively high rate of correct responses, but improves performance because it puts the focus on the false consequent. We will not pursue this further here, but it could form a basis for future research.

Overall, the results are mixed. On one hand, it appears that a systematic reduction of detail on the original USSC task lowers the rate of correct responses. On the other, the addition of clear cost–benefit information to an abstract rule prompts facilitation. Nevertheless, neither of the two novel problems here prompts rates of correct responses comparable to Cosmides' original USSC problem.

Experiment 2

In light of the findings from Experiments 1A and 1B, we investigate two features of the original USSC problem here: one that could arguably be considered support for Cosmides' theory (cost–benefit information) and another that clearly cannot (rule enforcement). We look at each of these in turn.

One possible explanation for the low rate of correct performance in the concrete aphrodisiac–married problem is that the cost–benefit structure is only in the rule – if a man takes an aphrodisiac, then he has to be married. Taking an aphrodisiac may not be viewed as an obvious benefit, and being married may not be considered a cost. Perhaps the extraneous information in the original USSC problem is necessary in order to emphasise the benefits of the aphrodisiac and the costs of being married. How this squares with Cosmides' theory is not clear. A conservative argument would be that the theory should stand without extensive cost–benefit elaborations in the problem. A more generous account would be that costs and benefits need to be clearly spelled out. In any case, much of the information in the body of the original USSC problem can be characterised as being devoted to costs and benefits.

The other factor that could account for the high rates of correct performance on the original USSC problem and the lower rates on the concrete aphrodisiac–married problem is the rule-enforcement aspect of the task.


Much of the extraneous information in the original USSC problem includes phrases such as "To fail would disgrace your family" or "If any get past you, you and your family will be disgraced". These exhortations should not be necessary if cheater detection is modular. Moreover, it could be that these theoretically irrelevant features facilitate correct performance in the same way as do (1) reasoning-from tasks, as opposed to reasoning-about tasks; or (2) negative information made explicit on the cards (see note 1).

Platt and Griggs (1993) likewise investigated these two factors. In their second experiment, they used three problems from Cosmides (1989) and isolated factors that they considered to be either "cost–benefit information" or what they called "subject's perspective (cheating versus no cheating)" information. Their technique was essentially to remove either the information they deemed relevant to the cost–benefit aspects or the information they viewed as relevant to cheating detection (what we call information relative to rule enforcement). Their results were mixed. The removal of the cost–benefit information had no effect on the original USSC problem (Kaluame), though it did have an effect on Cosmides' School and Namka problems. Their original USSC problem yielded a rate of correct responses of 64% even when both sorts of information were removed. This is surprising for the following four reasons. First, in two of their other problems (School and Namka), Platt and Griggs did find effects based on the presence or absence of cost–benefit information. Second, Platt and Griggs removed sections of the original USSC problem (representing over 225 words) that one would think would be useful for facilitation. Third, other studies, using slightly different tasks, have yielded results that are inconsistent with Platt and Griggs (e.g., Gigerenzer & Hug, 1992).
Finally, findings from Experiment 1B here – showing that information eliciting the cost–benefit aspects of the rule positively affects performance – are inconsistent with Platt and Griggs' findings.

We thus implemented an experiment similar to Platt and Griggs' Experiment 2, adopting a different strategy. Rather than starting with Cosmides' original USSC problem and then removing information, we first devised a minimalist version of Cosmides' original USSC problem, using its rule plus the minimal amount of narrative information necessary to make sense of it, and then we added the features we wanted to investigate. We thus included four main problems – one with neither cost–benefit information nor rule-enforcement information added (CB−/RE−), one with only cost–benefit information added (CB/RE−), one with only rule-enforcement information added (CB−/RE), and one with both added (CB/RE). Our way of categorising information is slightly different from that of Platt and Griggs. For example, we considered the phrase "many men are tempted to cheat on this law whenever the elders are not looking" as part of the rule-enforcement aspect of the task, whereas Platt and Griggs considered the phrase to be cost–benefit information. More importantly, our method


allowed us to remove entire sections from the original USSC problem that had no relevance to the factors investigated. For example, this problem includes much narrative that ought to be unnecessary to test Cosmides' claims ("molo nuts taste bad"; "You are very sensual people . . .", etc.). Our longest version (when translated into English) contains only 268 words. Nevertheless, we made sure that even our most basic version was sensible. This would allow us to see how extraneous information might influence correct performance even when comparing the original USSC problem to our new versions.

Our prediction was that the minimalist version would yield a relatively low rate of correct responses (much like the concrete aphrodisiac–married problem in the first experiment) because neither the cost–benefit nor the rule-enforcement aspects of the problem are made salient. The inclusion of one or both of the two factors should reveal what role each plays in the facilitation found on Cosmides' original USSC problem.

Method

Two hundred and twelve French undergraduates in History participated (mean age: 19.7 years). The procedure was identical to the one used in the prior experiments. The problems were randomly assigned (Table 2.1 shows how many participants received each problem). The basic wording of the problem was as follows (text in italics refers to added rule-enforcement information; text in bold refers to added cost–benefit information).

You are a Kaluame, a member of a Polynesian culture that is found only on the Maku Island in the Pacific. The Kaluame have many strict laws which must be enforced and the elders have entrusted you with enforcing them. To fail would disgrace you and your family. Among the Kaluame, when a man marries, he gets a tattoo on his face; only married men have tattoos on their faces. A facial tattoo means that a man is married, an unmarked face means that a man is a bachelor.
Cassava root is a powerful aphrodisiac – it makes the man who eats it irresistible to women. Moreover, it is delicious and nutritious – and very scarce. Although everyone craves cassava root, eating it is a privilege that your people closely ration. Among the Kaluame, there is an important rule concerning rationing privileges that you must enforce. The ancestors have created the laws. The one you must enforce is the following: If a man eats cassava root, then he must have a tattoo on his face.

Many men are tempted to cheat on this law whenever the elders are not looking. The cards below contain information about four young Kaluame men. Each card represents one man. On the face side of the card, it shows what the man ate, and the other side shows whether or not he has a tattoo. In order to verify that the rule is violated, which card(s) below do you need to turn over? Turn over only those cards that are necessary to turn over. [In rule-enforcement conditions, "necessary to turn over" was substituted with "necessary to see if any of these men are breaking the law."]

The cards were then presented as "eats cassava root", "does not eat cassava root", "tattoo", and "no tattoo".

Results and discussion

Table 2.1 shows the percentage of correct answers (P & not-Q) for each type of problem. The results are clear-cut. Cosmides' original problem yields the highest rate of correct response (73%) and this is significantly higher than the two that have no cost–benefit information (for the comparison between USSC and CB−/RE, χ²(1) = 8.85, p < .01 and for the comparison between USSC and CB−/RE−, χ²(1) = 10.2, p < .01). There are no other significant effects when any two problems are compared to one another. However, when types of problem are investigated (and we leave out the original USSC problem), one finds that cost–benefit information has a significant effect on performance (for the comparison between the two CB problems versus the two CB− problems, χ²(1) = 4.1, p < .05) while the rule-enforcement information has no effect at all (for the comparison between the two RE problems versus the two RE− problems, χ²(1) = .08, p = .77).

The fact that Cosmides' original USSC problem yields the highest rate of correct performance, and that this is significantly above at least two of the others, shows that extraneous narrative information facilitates correct performance. Lines of text such as "Unlike cassava root, molo nuts are very common . . ." and "You are a very sensual people . . . The elders disapprove of relations between unmarried people and particularly distrust the motives and intentions of bachelors" apparently have facilitative effects.

The most impoverished of the problems (CB−/RE−) yields a rate of correct responses that is of interest (37%) because it is above that predicted by chance, χ²(1) = 69.78, p < .01. Using the standard abstract problem of Experiment 1B as a benchmark (16% of participants gave the correct answer), the rate of correct performance in the most impoverished problem here is still significantly higher.
This tells us that the rule itself in the original USSC problem is facilitative in much the same way that the abstract cost–benefit rule was in Experiment 1B.

Overall, one can find two shifts of improving performance. Rates of correct performance increase from around 38% to 54% due to elaborations


on cost–benefit information in the body of the problem. There is, however, a secondary increase (to around 71%) that is visible when comparing the two problems that have cost–benefit information to the two original USSC problems in Experiments 1A and 2 (χ²(1) = 4.89, p < .05). This second increase can only be due to the other elaborative information included in the original USSC problem, but excluded from our social contract problems.
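The pooled comparisons reported for Experiment 2 are χ² tests on 2 × 2 tables (problem type by correct versus incorrect response). The sketch below shows how such a statistic can be computed with the Python standard library only. It is an illustration under stated assumptions rather than the authors' analysis: the counts of correct responses are reconstructed from the rounded percentages and group sizes in Table 2.1, no continuity correction is applied, and the function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


def p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# ASSUMPTION: (correct responses, group size), reconstructed from the rounded
# percentages in Table 2.1; the true counts are not given in the text.
CORRECT = {
    "CB-/RE-": (16, 43),
    "CB-/RE": (19, 47),
    "CB/RE-": (21, 39),
    "CB/RE": (25, 46),
}


def pooled(keys):
    """Pool several problems into (correct, incorrect) counts."""
    correct = sum(CORRECT[k][0] for k in keys)
    total = sum(CORRECT[k][1] for k in keys)
    return correct, total - correct


# Cost-benefit elaborated versus unelaborated problems.
cb_chi2 = chi_square_2x2(*pooled(["CB/RE-", "CB/RE"]), *pooled(["CB-/RE-", "CB-/RE"]))
# Rule-enforcement elaborated versus unelaborated problems.
re_chi2 = chi_square_2x2(*pooled(["CB-/RE", "CB/RE"]), *pooled(["CB-/RE-", "CB/RE-"]))

print(round(cb_chi2, 2), round(p_value_1df(cb_chi2), 3))
print(round(re_chi2, 2), round(p_value_1df(re_chi2), 3))
```

With these assumed counts the two statistics land close to the reported values of 4.1 and .08. A different rounding of the reconstructed counts would shift them, so the sketch should be read as a check on plausibility and not as a replication.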

General discussion

We began this chapter by pointing out that caution is called for when testing theoretical claims with the selection task. Its apparent simplicity makes it seem an appropriate tool for testing content-based accounts of reasoning. However, it is not a simple matter to introduce variables into this task (see also Roberts, Chapter 1, this volume).

The net results of our experiments are clear. Cost–benefit language does have an impact on the selection task. Experiment 1B revealed that the rate of correct performance increases significantly when an abstract rule using the words cost 'C' and benefit 'B' is employed and compared to a standard abstract rule. Strictly speaking, this is the best case for the claims of the social contract approach because the change is limited to the conditional rule. If one wants to go beyond the rule to look for confirmatory evidence, one can cite how the elaboration of cost–benefit information in the body of the original USSC problem increases the rate of correct performance from about 38% to about 54%. This result corrects the literature, because a prior attempt by Platt and Griggs (1993) did not succeed in isolating facilitative cost–benefit information with this specific task.

However, when one looks at the three problems whose cost–benefit information is limited to the rule (the concrete aphrodisiac–married problem of Experiment 1A, the abstract cost–benefit problem of Experiment 1B, and the CB−/RE− problem of Experiment 2), one notices two things. First, there is some variability. The concrete aphrodisiac–married problem yields a rate of correct responses of 25%, the abstract cost–benefit problem 46%, and the CB−/RE− problem 37%. The second and third rates are higher than what one would find in standard abstract problems, but the first one is not. Thus, it is not sufficient just to use any rule that could be interpreted as having a cost and a benefit (or a cost and a requirement).
One needs a rule that presents these clearly (i.e., getting a tattoo on the face upon marriage is viewed as being more costly than getting married). Second, they show that the relatively high rate of correct performance reported on the original USSC problem (rates of correct responses of around 70–75%) is largely due to elaborations that occur outside the rule. This implies that finding an appropriate solution to the selection task is incremental. As more relevant information is presented, the appropriate strategy for this task becomes more obvious. This does not seem to describe a modular cheater detection system.


There is also another factor (or set of factors) – having nothing to do with elaborations of costs and benefits in Cosmides' original USSC problem – that further raises rates of correct performance from around 54% to 71%. The cause of this is hard to nail down because there are many candidates. It could be due to the negative characterisation of molo nuts that is in the original USSC problem and not in our CB/RE version. It could be the style and focus of the long narrative (mentioning the importance of remaining chaste etc.) that simply makes the task more engaging in its original version (see also O'Brien, Roazzi, Athias, & Brandão, Chapter 3, this volume). It is difficult to know. We do know that something other than cost–benefit information is a facilitating factor on these tasks.

Overall, if one could say that rates of correct performance start out at around 16% for standard abstract problems and range from 25% to 73% on problems derived from the original USSC format, it can be said that at most 38 of the potential 57 percentage point increase is due to a theoretically relevant factor (up to 54% provide correct responses due to what are arguably cost–benefit-related claims, while 16% respond correctly even without cost–benefit information). If one confines oneself to the rule, one can claim that anywhere from only 9 to 30 percentage points can be attributed to cost–benefit features. Note that this leaves 43% of participants to account for, who either find the correct response without cost–benefit information or do not answer correctly despite a great deal of cost–benefit information. Put in this light, the theoretical claims do not completely match the data. Does this mean one should abandon evolutionary accounts? No. That costs and benefits can assist reasoning to any extent is of interest in itself. Are there other evolutionary accounts that can incorporate or address Cosmides' findings? Yes.
Relevance theory (Sperber & Wilson, 1985/1996), to which we now turn, employs two factors, effort and effect, to describe comprehension, and Sperber, Cara and Girotto (1995) operationalised these to account for selection task performance. Although these factors resonate with costs and benefits, they are not confined to types of rules or to a specific cheater detection module.

Relevance theory develops two general claims or "principles" about the role of relevance in cognition and in communication. The first, the cognitive principle of relevance, predicts that our perceptual mechanisms tend spontaneously to pick out potentially relevant stimuli, that our retrieval mechanisms tend spontaneously to activate potentially relevant assumptions, and that our inferential mechanisms tend spontaneously to process them in the most productive way. This principle, moreover, has important implications for human communication. In order to communicate, the communicator needs her audience's attention. If, as claimed by the cognitive principle of relevance, attention tends automatically to go to what is most relevant at the time, then the success of communication depends on the audience taking the utterance to be relevant enough to be worthy of attention. Wanting her communication to succeed, the communicator, by
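The percentage-point accounting in the paragraph above can be retraced step by step. The following is a minimal sketch using the rounded rates of correct responses quoted in the text; the constant and key names are illustrative labels and not part of the original analysis.

```python
# Rounded rates of correct (P & not-Q) responses, in percent, as quoted in the text.
STANDARD_ABSTRACT = 16   # baseline: standard abstract problem, Experiment 1B
USSC_HIGHEST = 73        # original USSC problem, Experiment 2
CB_ELABORATED = 54       # highest rate among problems with elaborated cost-benefit text
RULE_ONLY = {
    "concrete aphrodisiac-married": 25,  # Experiment 1A
    "abstract cost-benefit": 46,         # Experiment 1B
}


def decompose():
    """Split the gain over the abstract baseline into the parts named in the text."""
    gains_from_rule = [rate - STANDARD_ABSTRACT for rate in RULE_ONLY.values()]
    return {
        # Total room for improvement over the abstract baseline.
        "potential": USSC_HIGHEST - STANDARD_ABSTRACT,
        # Upper bound on what cost-benefit information can explain.
        "cost_benefit": CB_ELABORATED - STANDARD_ABSTRACT,
        # Range attributable to cost-benefit wording in the rule alone.
        "rule_only_range": (min(gains_from_rule), max(gains_from_rule)),
        # Correct without cost-benefit information, plus incorrect despite it.
        "unaccounted": STANDARD_ABSTRACT + (100 - USSC_HIGHEST),
    }


print(decompose())
```

The "unaccounted" figure adds the 16% who succeed on the abstract baseline to the 27% who fail even on the original USSC problem, which is how the 43% in the text arises.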


the very act of communicating, indicates that she wants her utterance to be seen as relevant by the audience, and this is what the communicative principle of relevance states. According to relevance theory, the presumption of optimal relevance conveyed by every utterance is precise enough to ground a specific comprehension heuristic:

Presumption of optimal relevance:
a  The utterance is relevant enough to be worth processing;
b  It is the most relevant one compatible with the communicator's abilities and preferences.

Relevance-guided comprehension heuristic:
a  Follow a path of least effort in constructing an interpretation of the utterance (and in particular, in resolving ambiguities and referential indeterminacies, in going beyond linguistic meaning, in computing implicatures, etc.);
b  Stop when your expectations of relevance are satisfied.

Sperber et al. (1995) showed how one can conjoin these principles in order to build an "easy" selection task. Their "recipe" can be boiled down to this: Minimise the effort of finding denial of conditional cases (i.e., P-and-not-Q cases) and maximise effects by making the production of P-and-not-Q cases desirable representations. In a series of four experiments, they showed how this could be done. In the experiment that presents the most convincing evidence in support of their account (Experiment 4), they presented a scenario in which a machine presents numbers on one side and letters on the other. The rule was "If the card has a 6 on the front then it has an E on the back." What distinguished each of the four conditions was the cognitive effort required and the cognitive effects produced in order to find P-and-not-Q cases, with the prediction that the problem that maximises effects produced while minimising effort needed would be the most likely to produce correct responses. One way their manipulation minimised the participant's effort was by simply saying that there are either 4s or 6s on the front rather than "numbers"; one way their manipulation maximised effects was by adding that the machine did not always produce the letter E. As predicted, the scenario that maximised effects and minimised effort yielded the highest rate of correct responses. The one that minimised effects and maximised effort yielded the lowest.

The analysis from Sperber et al. (1995) can account for Cosmides' outcomes. The long narrative in the original USSC problem includes details that arguably maximise effects by encouraging participants to find P-and-not-Q cases. The discussion of molo nuts, for example, tells the reader to ignore cards that mention them and the extended descriptions describing which men can have facial tattoos tell the reader that the absence of tattoos


is critical. It would be a painstaking process to uncover all the details that encourage this search for P-and-not-Q cases, but work from Sperber and colleagues (Girotto, Kemmelmeier, Sperber, & Van der Henst, 2001; Sperber et al., 1995) gives a principled way to look for them.

Relevance principles stand in sharp contrast with Cosmides' domain specific cheater detection module. Relevance theory assumes that abilities for solving any communicative task are fairly domain general. Through the communicative principle of relevance, premises are taken as portions of a communicative act and communicative intentions are derived from it. In contrast, a cheater detection module takes as input only information strictly related to costs and benefits in a social contract situation. However, before one concludes that relevance is a non-modular mechanism, two caveats deserve mention. First, pragmatic abilities are not truly domain general: they cannot be (successfully) applied to non-communicative acts (Sperber, 2000, p. 133), so their range of input is in some ways limited. The second – and perhaps more important – point is that our pragmatic abilities show the true landmark of modular mechanisms: they are informationally encapsulated. Such mechanisms do not have access to our entire mental database to function: they have to rely on their own, proprietary database (Fodor, 2001; Sperber, 2005). This is clearly the case for our pragmatic abilities since there is a lot of information (e.g., sensory information) for which they have no use and that does not bear on their inner workings. So the relevance account, though relatively general when compared to a cheater detection module, is fully compatible with the massive modularity hypothesis, even if it forces us to loosen an overly stringent definition of modules (à la Fodor, 1983) and to pay closer attention to their different properties (Sperber, 2001; Sperber & Wilson, 2002).
To summarise, the original work from Cosmides shows how the selection task can come with traps if one uses it too liberally. A modification of content can seem harmless enough, but theoretical investigations can compel the experimenter to make wholesale changes to the task itself. These modifications often include extraneous details that prompt participants to give the "correct" response. These ultimately overshadow the theoretical insight that initiated the investigation in the first place.

Acknowledgements

The authors wish to thank Monica Martinat and Anne Béroujon for access to their students as well as Nathalie Bedoin for discussions pertaining to social contracts and experimentation.

References

Axelrod, R. (1984). The evolution of cooperation. New York: Basic Books.
Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391–416.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Evans, J. St B. T. (1989). Bias in human reasoning. Hove, UK: Lawrence Erlbaum Associates Ltd.
Evans, J. St B. T., Newstead, S. E., & Byrne, R. M. J. (1993). Human reasoning: The psychology of deduction. Hove, UK: Lawrence Erlbaum Associates Ltd.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: The role of domain specific representations in the Wason selection task. Cognition, 77, 1–79.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor, J. A. (2001). The mind doesn't work that way. Cambridge, MA: MIT Press.
Gigerenzer, G., & Hug, K. (1992). Domain specific reasoning: Social contracts, cheating, and perspective change. Cognition, 43, 127–171.
Girotto, V., Kemmelmeier, M., Sperber, D., & Van der Henst, J. B. (2001). Inept reasoners or pragmatic virtuosos? Relevance and the deontic selection task. Cognition, 81, 69–76.
Girotto, V., Mazzocco, A., & Cherubini, P. (1992). Judgements of deontic relevance in reasoning: A reply to Jackson and Griggs. Quarterly Journal of Experimental Psychology, 45A, 547–574.
Griggs, R. A., & Cox, J. R. (1993). Permission schemas and the selection task. Quarterly Journal of Experimental Psychology, 46A, 637–651.
Holyoak, K. J., & Cheng, P. W. (1995). Pragmatic reasoning about human voluntary action: Evidence from Wason's selection task. In J. St B. T. Evans & S. E. Newstead (Eds.), Perspectives on thinking and reasoning: Essays in honour of Peter Wason (pp. 67–89). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Jackson, S. L., & Griggs, R. A. (1990). The elusive pragmatic reasoning schema effect. Quarterly Journal of Experimental Psychology, 42A, 353–373.
Johnson-Laird, P. N., Legrenzi, P., & Legrenzi, M. S. (1972). Reasoning and a sense of reality. British Journal of Psychology, 63, 395–400.
Johnson-Laird, P. N., & Wason, P. C. (1970). Insight into a logical relation. Quarterly Journal of Experimental Psychology, 22, 49–61.
Kroger, J. K., Cheng, P. W., & Holyoak, K. J. (1993). Evoking the permission schema: The impact of explicit negations and a violation-checking context. Quarterly Journal of Experimental Psychology, 46A, 615–635.
Noveck, I. A., & O'Brien, D. P. (1996). To what extent do pragmatic reasoning schemas affect performance on Wason's selection task? Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 2, 463–489.
Platt, R. D., & Griggs, R. A. (1993). Darwinian algorithms and the Wason selection task: A factorial analysis of social contract selection task problems. Cognition, 48, 163–192.
Politzer, G., & Nguyen-Xuan, A. (1992). Reasoning about conditional promises and warnings: Darwinian algorithms, mental models, relevance judgements or pragmatic schemas? Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 44A, 401–421.
Sperber, D. (2000). Metarepresentations in an evolutionary perspective. In D. Sperber (Ed.), Metarepresentations: A multidisciplinary perspective (pp. 117–137). Oxford: Oxford University Press.
Sperber, D. (2001). In defense of massive modularity. In E. Dupoux (Ed.), Language, brain and cognitive development: Essays in honor of Jacques Mehler (pp. 47–57). Cambridge, MA: MIT Press.
Sperber, D. (2005). Modularity and relevance: How can a massively modular mind be flexible and context-sensitive? In P. Carruthers, S. Laurence, & S. Stich (Eds.), The innate mind: Structure and contents. Oxford: Oxford University Press.
Sperber, D., Cara, F., & Girotto, V. (1995). Relevance theory explains the selection task. Cognition, 52, 3–39.
Sperber, D., & Girotto, V. (2002). Use or misuse of the selection task? Rejoinder to Fiddick, Cosmides, and Tooby. Cognition, 85, 277–290.
Sperber, D., & Wilson, D. (1985/1996). Relevance: Communication and cognition. Oxford: Basil Blackwell.
Sperber, D., & Wilson, D. (2002). Pragmatics, modularity and mind-reading. Mind and Language, 17, 3–23.
Trivers, R. L. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology, 46, 35–57.

3

What sorts of reasoning modules have been provided by evolution? Some experiments conducted among Tukano speakers in Brazilian Amazônia concerning reasoning about conditional propositions and about conditional probabilities

David P. O'Brien, Antônio Roazzi, Renato Athias, and Maria do Carmo Brandão

We investigate two claims made by Cosmides and her associates about content-specific reasoning processes (e.g., Barkow, Cosmides, & Tooby, 1992; Brase, Cosmides, & Tooby, 1998; Cosmides, 1989; Cosmides & Tooby, 1992, 1994; Fiddick, Cosmides, & Tooby, 2000; Tooby & Cosmides, 1992). The first is a bioevolutionary argument that the environmental pressures on our Pleistocene ancestors resulted in a content-specific reasoning module for identifying violators of social contracts, but not in any content-general modules such as a mental logic for conditionals, of the sort we have proposed (e.g., O'Brien, 2004; O'Brien, Roazzi, Athias, Dias, Brandão, & Brooks, 2003; see Braine & O'Brien (1998) for the most complete presentation of mental-logic theory and evidence in its support). The second claim is that the same human bioevolutionary history has provided a module for representing and reasoning about frequencies of events, but not for the probabilities of single events. We reject the claim that human bioevolutionary history requires the sort of content-specific modules that are proposed by Cosmides and her associates. We do not claim that content does not matter when people reason, but we disagree that adequate reasons have been provided – either theoretical or empirical – to think that content-general procedures are not among the basic reasoning processes. First, we describe the bioevolutionary proposals and the arguments and evidence presented for them. Then we describe data from three experiments conducted with Tukano speakers in the northwest Amazon basin of Brazil, as well as with university students in New York, that show that even in an illiterate indigenous population without formal education, none of the response tendencies that would be predicted from the proposals of Cosmides and her associates was found.

The social-contract hypothesis and Wason's selection task

We turn first to the proposal concerning a content-specific module that enables identification of violators of social contracts (e.g., Cosmides, 1989; Fiddick et al., 2000). The proposal was that the mind contains a specialised reasoning module that was adaptive for the social world of our hunter/gatherer ancestors, whose societies would have needed to make judgments about who was violating contracts pertaining to the granting of benefits and the extracting of costs. A society based on a cost/benefit system of social exchanges would require that its members have the necessary cognitive equipment to detect cheaters, and therefore the mind must contain a bioevolutionarily derived module for identifying them. Social exchanges have a conditional form – if one takes the benefit, one must pay the cost – but according to the social-contract theorists these social pressures would not have led to the development of any general logical reasoning abilities that would apply for evaluating conditionals across broad sorts of content. No other environmental pressures would have led to any content-general logical reasoning procedures. From the perspective of social-contract theory, reasoning thus is based on some fairly narrow content-specific modules.

The empirical support for the social-contract hypothesis has come from comparisons between two general versions of Wason's selection task: one that presents a conditional social contract and requires identification of potential cheaters, and another that presents an indicative conditional assertion and either requires identification of potential evidence that the assertion is false, or identification of potential violators. The general form of the task (Wason, 1968) presents a conditional, if p then q, together with four cards showing p, not p, q, and not q, respectively. Participants are told that on one side of each card is a value for p (i.e., either p or not p) and on the other side a value for q (i.e., either q or not q) and are asked to select those cards whose inspection could reveal that the conditional is false, or that it has been violated. Social-contract versions of the problem (e.g., Cosmides, 1989; Fiddick et al., 2000) present a rule of the form if a person takes a benefit then that person must pay the cost together with four cards showing a person taking a benefit, a person not taking the benefit, a person who has paid the cost, and a person who has not paid the cost. Participants were asked to identify the cards with the potential to identify cheaters, that is the p and not q cards. For example, for a rule that if someone in a particular society eats a particular kind of meat, that person must have a tattoo, people would need to choose cards showing a person eating this kind of meat and a person without a tattoo.

A typical control problem asked participants (in Cambridge, Massachusetts) to imagine that they had read a report on commuting habits that concluded if a person goes into Boston, then that person takes the subway. Each of the four cards showed a destination on one side (either Boston or Arlington) and a means of transportation on the other side (subway or taxicab). Participants were asked to select the cards that could identify whether the conditional rule was being violated. The putatively correct answer was to select the cards showing Boston (p) and taxicab (not q). Cosmides reported that typically people chose the p and not q cards more often on social-contract problems than on such a control problem, concluding that if the human mind had evolved procedures for detecting logical violations of conditional rules, people ought to choose the p and not q instances on the control problems as well. The fact that such selections are rare, she argued, indicates that the mind has not evolved domain general logical procedures, but instead has evolved specialised procedures applicable only to the social-contract situation. Thus, people give p and not q responses to the social-contract problems because the content triggers the cheater-detection module, which cannot be triggered by a problem without a social contract.

We find the argument problematic. First, note the idea that observation of people's commuting habits would lead to a rule about relations between destinations and means of transportation. Although observations about the typical behaviours of commuters plausibly could lead to assertions about their habits, these hardly seem to qualify as rules that could be violated. This makes the control problem intuitively odd in a way that the social-contract version is not. When one is asking people to seek counterexamples to a conditional assertion, it would be better to ask for potentially falsifying instances rather than potential rule violators; people typically do not think that assertions can be violated, although they do think of them as open to falsification (see O'Brien, Roazzi, Dias, Cantor, & Brooks, 2004).
Further, although one can think of a rule as being universally applicable (e.g., driving on a specific side of the highway is applicable to everyone), inductive generalisations about commuters' tendencies based on observation hardly qualify as universal. Indeed, it is likely that the students who received such problems would have understood the quantification of the conditional assertion not as universal but as typical; that is, although one typically takes the subway to Boston, exceptions such as taxicabs, bicycles, and walking also occur without threatening the truth of the assertion. The control problems thus seem to provide a very different sort of quantification from the social-contract problems, which nullifies the notion that an instance of p and not q is either a violation or a falsifier (see also Stenning & van Lambalgen, Chapter 8, this volume).

The most directly relevant data presented in support of the bioevolutionary claim about the reasoning of hunter/gatherers were presented by Sugiyama, Tooby, and Cosmides (2002). These data came from fieldwork with a remote indigenous group, the Shiwiar, in Ecuadorian Amazônia. The data included a comparison between a social-contract version and a descriptive version of the selection task. The conditionals were embedded within story contexts, with every problem referring to a conditional that was not previously known to the Shiwiar. The descriptive problem presented, in the Shiwiar language, the conditional “if there is a green butterfly in the picture on the top part of the card, then there is a red flower in the picture on the bottom part of the card.” The social contract presented the rule, “if you eat mongongo nut (which was described as an aphrodisiac), then you must have a tattoo on your chest (which was described as denoting married status).” Fourteen of the 21 Shiwiar participants selected the p and not q cards on the social-contract problem, whereas only three of the 21 did so on the descriptive problem. This difference stemmed entirely from the q and not q cards, with moderately more selections of q (.67 vs .48) and fewer of not q (.52 vs .86) for descriptive than for social-contract problems. Sugiyama et al. interpreted the differences in terms of the availability of a violator-checking module for the social-contract problem that they claimed is missing for the descriptive problem. Of course, one could choose instead to interpret the difference in terms of a differential interest in an aphrodisiac in the social-contract problem versus de-contextualised and unmotivated pictures of butterflies and flowers in the control problem.
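
The normative analysis that underlies both versions of the selection task described in this section can be made concrete with a short sketch. The Python below is an illustration only and is not drawn from any of the studies discussed; the function names and the card encoding are hypothetical choices made here. It assumes that a conditional if p then q is falsified (or a rule violated) only by a case of p together with not q, and then asks, for each of the four cards, whether some value on the hidden side could complete such a case.

```python
# Illustrative sketch only: the names and the card encoding are assumptions
# made for this example, not materials from the studies discussed.

def falsifies(p, q):
    """A single case falsifies 'if p then q' only when p holds and q does not."""
    return p and not q


def cards_that_could_reveal_violation():
    """Return the cards whose hidden side could complete a falsifying case."""
    # Each card shows the value of one proposition; the other value is hidden.
    cards = {
        "p": ("p", True),
        "not p": ("p", False),
        "q": ("q", True),
        "not q": ("q", False),
    }
    selected = []
    for label, (shown, value) in cards.items():
        for hidden in (True, False):
            # Combine the visible value with each possible hidden value.
            p, q = (value, hidden) if shown == "p" else (hidden, value)
            if falsifies(p, q):
                selected.append(label)
                break
    return selected


print(cards_that_could_reveal_violation())  # ['p', 'not q']
```

Under these assumptions the sketch returns the p and not q cards, the selection treated as correct in the studies described above. It is silent on how participants interpret the quantification or the pragmatics of a particular conditional, which is the issue raised in this section.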

Bayesian-reasoning tasks and the frequentist hypothesis

Cosmides and her associates (e.g., Cosmides & Tooby, 1996; Brase et al., 1998) have argued that bioevolutionary processes would have resulted in representational formats and reasoning processes for event frequencies, but not for the probabilities of single events. Together with an overlapping proposal by Gigerenzer and his associates (e.g., Gigerenzer & Hoffrage, 1995), this has become known as the frequentist hypothesis. The bioevolutionary argument claims that because one cannot observe the probability of a single event – the event either occurs or it does not – it follows that it is impossible for nature to build a sense organ for detecting single-event probabilities: “No organism can evolve cognitive mechanisms designed to reason about, or receive as input, information in a format that did not regularly exist” (Cosmides & Tooby, 1996, p. 15). Unlike probabilities, which were claimed to be completely separate from observation, Cosmides and Tooby assumed that frequencies are open directly to observation – as Cosmides and Tooby (1996) put it, frequencies are “available in the environment” (p. 15), and Brase et al. (1998, p. 3) wrote, “the ‘probability’ of a single event cannot be observed by an individual . . . an individual can, however, observe the frequency with which events occur.” On their argument, then, our hunter/gatherer ancestors could observe, for example, that 15 out of the past 20 hunting expeditions to some particular location were successful. As Cosmides and Tooby (1996, pp. 15–16) expressed it, “our hominid ancestors were immersed in a rich flow of observable frequencies that could be used to improve decision-making, given procedures that could take advantage of them. So if we have adaptations for inductive reasoning, they should take frequency information as input.”

What should one make of their claim that probabilities are so completely divorced from observation that it is conceptually impossible to consider that nature could produce any representational formats or reasoning procedures that implement them, but that frequencies exist at the level of observation? Cosmides and her associates presented this without any further justification, as though it were simply intuitively obvious. Their claim seems odd on reflection, however, when one realises that knowledge about frequencies, as well as about probabilities, requires the intervention of computation between observation and quantificational representation. Apart from the fact that the categorisation of events is a complicated business (no two hunts, successful or not, are identical), one needs some way of recording these categorised events in memory, as well as a way of counting these recorded events, if one is to know the frequency with which events of a certain sort have occurred. One should not confuse the ability to observe an event with the ability to observe the frequency of events that belong to the same category. It is straightforwardly obvious that one needs to compute a sum of events to have a frequency count for them. One therefore has no more right to claim that frequencies occur at the level of observation than to claim that probabilities occur at this level. Both require memory, ways of categorising, and computation. In this, both seem open to the sort of analysis that was provided by Hume about causation, which, as he pointed out, is inferred by the observer rather than observed directly in the event.1

1 Hume (1737/1957) showed that causation does not exist in the observation of an event, but is imposed on the event by the observer. Indeed, given Hume's analysis of causation, the argument made by Cosmides and her associates would lead to the conclusion that evolution could not have derived a way to represent causes in the human mind. We doubt that most readers would want to accept an argument that nature could have provided neither a way for the mind to represent causation nor a way to make causal inferences.

The computational difference that is involved in saying that 15 of 20 hunts were successful, rather than saying that 75% of the hunts were successful, is not great. The frequency description requires one to calculate the sums both for successful hunts (15) and unsuccessful hunts (5), to then add them together to obtain the denominator (that there were 20 hunts in all), and then place only the successful hunts in the numerator; only then can one know the frequency description, “15 out of 20.” The probability description – the single event probability of .75 is equal to 75% of hunts being successful – requires one additional computation, that is, dividing the numerator by the denominator to obtain .75. To claim, as Cosmides and her associates have, that the probability description requires computation, whereas the frequency description can be observed directly, is patently mistaken.

Even though the theoretical argument presented by Cosmides and her associates is mistaken philosophically (see also Over, Chapter 4, this volume), they have presented data that seem to support their contention that people solve Bayesian-reasoning problems when these are presented in terms of frequency information, but not when presented in terms of probabilities of single events (e.g., Cosmides & Tooby, 1996; see also Gigerenzer & Hoffrage, 1995). This means that even when one dismisses their bioevolutionary argument, one still needs to deal with the empirical side of their presentation. Roazzi, O'Brien, and Dias (2003) showed, however, that there is a confound in the comparisons between frequency and probability versions of the Bayesian-reasoning problems presented by Cosmides and Tooby, and by Gigerenzer and Hoffrage, which can account for the reported differences. Roazzi et al. noted that the two sorts of problems consistently were presented with different response formats (“___%” for probability problems vs. “___ out of ___” for frequency problems). When these response formats were balanced across problem types the advantage for frequency problems disappeared. We thus are not convinced by the published data that people are more apt to provide Bayesian answers to frequency-based than to probability-based problems.
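
The point made earlier in this section about computation, that a frequency description and a probability description both depend on categorising, storing, and counting events, can be illustrated with a short sketch. The Python below is illustrative only; the function names and the representation of hunts as a list of True/False outcomes are assumptions introduced here, and the numbers are the 15-out-of-20 hunting example from the text.

```python
# Illustrative sketch only: the function names and the True/False encoding of
# hunts are assumptions made for this example.

def frequency_description(outcomes):
    """Return (successes, total) for a remembered series of categorised events."""
    successes = sum(1 for outcome in outcomes if outcome)
    failures = sum(1 for outcome in outcomes if not outcome)
    total = successes + failures  # the denominator has to be computed as well
    return successes, total


def probability_description(outcomes):
    """Return the proportion of successes: the frequency count plus one division."""
    successes, total = frequency_description(outcomes)
    return successes / total


# Fifteen successful and five unsuccessful hunts, as in the example in the text.
hunts = [True] * 15 + [False] * 5
successes, total = frequency_description(hunts)
print(f"{successes} out of {total}")   # 15 out of 20
print(probability_description(hunts))  # 0.75
```

In this sketch the probability function does nothing more than call the frequency function and perform one division, which mirrors the argument that neither description is available without memory, categorisation, and counting.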

Some experiments with Tukano speakers in Brazilian Amazônia

We have been engaged in an investigation of reasoning among speakers of the Tukano language in the Iauaratê district on the Rio Uaupês along the border between Brazil and Colombia. Iauaratê is in the Terra Indígena do Alto Rio Negro in the State of Amazonas in Brazil. Until recently this population lived in semi-nomadic groups as hunter/gatherers with some intermittent farming. In the past two to three decades they have been settling into communities largely supported by subsistence farming and fishing. The principal purpose of the project is to investigate deductive reasoning. Given the central place in the literature of the debates concerning the social-contract hypothesis and the putative representational and reasoning advantages of frequencies over probabilities, a large part of a trip to the Iauaratê district in 2004 was devoted to collection of data relevant to these issues.

The Terra Indígena Alto Rio Negro is a reserved area that stretches west and north from the town of São Gabriel da Cachoeira on the Rio Negro to the Colombian border, with an area of 79,993 square kilometres (approximately twice the size of Switzerland). The official population is only 14,599, that is, one person for each 5.5 square kilometres (FOIRN-ISA, 2000). Access to the area is limited to the indigenous population, to government functionaries such as members of the Brazilian military and the federal police, and workers from FUNASA (the Brazilian federal ministry for health). Only a few researchers are given permits to travel there and the general population is not permitted into the area. The trip from São Gabriel da Cachoeira to Iauaratê is approximately 375 km of river travel and can be accomplished in 12 hours when the rivers are full, using a small, uncovered boat with an outboard motor, and three days during the dry season when river travel is more difficult. Because of its inaccessibility to outside populations, the area and its inhabitants are extremely isolated.

Younger members of the Tukano-speaking communities have learned to speak some Portuguese, have received some formal schooling provided by the Amazonas state government, and have learned at least the rudiments – often much more – of Portuguese literacy. Their daily communications, however, still take place almost exclusively in the Tukano language. Older members typically do not speak Portuguese beyond a rudimentary level, if at all, have not been exposed to any formal schooling, and do not have any literacy skills. Of course, there are exceptions, and some older members of the community are literate and conversant in Portuguese and some younger members are lacking in these skills. For the most part, however, the presence or absence of exposure to schooling is a function of age. Typically, both younger and older members of the community are multilingual, almost all of them speaking Tukano, and many speak three or more of the other local indigenous languages, including Miriti-Tapuia, Dessana, Tariano, Siriana, Piratapuia, Uanano, Hupde, Baré, and a scattering of others. Younger members, however, are far less likely to speak multiple indigenous languages. This situation allowed us to compare two types of participants – those with formal school experience and literacy skills versus those without – allowing an assessment of any possible effects of exposure to Western languages, or of schooling and literacy, on obtained judgments. We were also able to recruit three young male colleagues in their mid-twenties from the Tukano-speaking community in Iauaratê, fluent both in Tukano and in Portuguese.

Experiment 1: Selection task problems

The original plan was to present stimuli in a manner similar to what was reported by Sugiyama, Tooby, and Cosmides (2002) with the Shiwiar population in Ecuadorian Amazônia, who were shown pictures with parts of them covered. This method was abandoned, however, as our participants demonstrated great confusion with that format. Our informants told us that theirs was an oral tradition, so we changed to a presentation that was entirely oral, and embedded the problems within a story context. Two sets of problems were presented, and each participant received either a Social-Contract or an Indicative Problem Set. Each set contained two initial training and four subsequent experimental problems. The training problems introduced the scenario for the entire set, describing a man named Tapî, who recently travelled from his home village to the Land of the Sacred Forest, where he encountered some social rules that were different from those in his homeland. When he returned home he told his neighbours about these rules. For example, in the Land of the Sacred Forest if a man gets help in building his house, he must kill a large animal called a tapir so that he can hold a big feast.

The experimenter explained that sometimes people in the Land of the Sacred Forest do not obey the rules, and we want to catch them. An initial question asked, “So, tell me, what situation would show that someone is not obeying this rule?” For the training problems only, when participants gave a response that was unhelpful from our perspective, the experimenter asked, in a “hint” question, whether it would be helpful to find out whether someone did or did not get help building a house or whether someone did or did not kill a tapir in order to hold a big feast. Examples of “unhelpful” responses included using magic to find rule violators, or that we should ask people if they broke the rules.

For each problem, answers to the initial question were followed by asking four additional “selection task” questions (with question orders randomised across problems). Each began by reminding participants “we want to catch people who are not obeying the rule that if a man gets help in building his house, he must kill a large animal called a tapir so that he can hold a big feast.” This was followed by the question, “Imagine we find a man who did receive help constructing his home. Would this be a situation in which we might catch someone who is not obeying the rule? Why?” The other three questions asked about a man who did not receive help constructing his home, a man who did kill a tapir, and a man who did not kill a tapir. The four questions are analogous to showing the four cards, p, not p, q, and not q, respectively, in the traditional selection task, and requiring an individual evaluation for every card.

The second Social-Contract training task, and the four experimental problems, were structurally the same: the second training task presented the rule that if a man goes hunting, he must wear a feathered head dress. The four experimental problems did not include any “hint” questions, and they presented the following rules.

1. If a man catches a ride on a boat on the Sacred River, he must give fish to the owner of the boat.
2. If a man receives the meat from a wild pig as a gift, he must repay the gift with honey.
3. If a woman gives a man a fishing net, he must carry her firewood in repayment.
4. If a man is treated by a medicine man, that man must fast for two days before receiving treatment.

The Indicative Problems had the same structure. The first training problem introduced a man named Dari who is well known by the people for saying things that are not true.2 Dari and his cousin, Mohan, recently visited a place called the Land of the Sacred Forest, where the plants, animals, and objects differ from where they live. Dari told Mohan that in the Land of the Sacred Forest there is a type of tree called the Chacroma tree, and he said that if Mohan eats the fruit of the Chacroma tree, he will grow an extra finger. Participants were told to assume that what Dari said was not true and that we want to catch Dari in this lie. “So, tell me. What would show that what Dari said is not true? Remember Dari said that if Mohan eats the fruit of the Chacroma tree, he will grow an extra finger. What would show that this is not true?” Again, as in the practice problems for the Social-Contract Problem Set, the experimenter gave “hint” questions if the response to the initial question was not of the sort that we were seeking.

2 Although Dari might be a “cheater” in terms of a Gricean violation of expectations that he speaks the truth, these scenarios do not involve social contracts per se, that is, cheating in order to obtain a benefit without paying a cost.

The four final questions asked for each problem were presented in the following way. “Remember, we want to catch Dari in this lie. Dari said that if Mohan eats the fruit of the Chacroma tree, he will grow an extra finger. Imagine that Mohan does eat the fruit of the Chacroma tree. Would this be a situation in which we might catch Dari in his lie? Why?” The pattern for the three other questions was the same as described earlier for the Social-Contract Problem Set. The indicative statement for the second training task was if Mohan travelled across the river, a snake would bite him. The conditionals for the four indicative experimental problems were as follows.

1. If Mohan looks inside the red box that is in his room, he will find a large amount of money.
2. If a Jucu snake bites Mohan, he will die.
3. If Mohan eats the fruit of the Banira tree, he will get a big stomach ache.
4. If a pregnant woman drinks the milk from a jaguar, she will have a son.

Each participant received either the Social-Contract Problem set or the Indicative Problem set, with the four experimental problems presented in random order after the two training ones. The problems were presented orally and administered individually. Participants were asked to repeat the rule or assertion Tapî or Dari had said, and the experimenter repeated the information when necessary until the participant demonstrated that it had been remembered correctly. Altogether, 34 illiterate Tukano speakers, 36 literate Tukano speakers, and 50 university students from New York City participated.

Results

We turn first to the responses to the initial questions about what would falsify or violate the conditionals (see Table 3.1). The modal response for all three populations, for both the Indicative and the Social-Contract Problems, was p and not q, ranging from 49% to 72% of responses. In addition to the p and not q response, the two other most popular responses were for participants to respond by saying that they would check the p situation or that they would check the not q situation. When one adds these two responses to the selections of p and not q, between 72% and 92% of responses are accounted for, for both problem types and across all three groups, with no apparent differences between Social-Contract and Indicative Problem Sets.

Table 3.1 Proportions of p and not q selection patterns for the selection task in Experiment 1, and the evaluation task in Experiment 2. Also, scores on the five-point scale for the probability and frequency problems in Experiment 3

                               Social contract            Indicative
                               Initial     Follow-up      Initial     Follow-up
                               question    questions      question    questions
Experiment 1 (Selection task)
  Illiterate Tukano            .72 (.78)   .15            .49 (.72)   .29
  Literate Tukano              .64 (.88)   .26            .63 (.83)   .29
  University students          .66 (.83)   .38            .56 (.92)   .47
Experiment 2 (Evaluation task)
  Illiterate Tukano            .83                        .70
  Literate Tukano              .87                        .84
  University students          .90                        .85

                               Frequency                  Probability
                               Initial     Final          Initial     Final
                               question    question       question    question
Experiment 3 (Probability and frequency problems)
  Illiterate Tukano            4.56        2.89           4.38        2.71
  Literate Tukano              4.33        2.94           4.45        3.00
  University students          4.73        3.33           4.67        3.17

Proportions inside parentheses include participants who responded that they would select only p or that they would select only not q.

We turn now to the responses to the four follow-up “selection task” questions. An initial ANOVA was computed with responses summed for each problem across the four questions, which revealed no differences among the problems, so we computed an ANOVA with responses summed across problems. The sole significant effect was for participant type, F(2, 114) = 27.32, p < .01, with university students responding “yes” less often than either the illiterate or the literate Tukanos (proportions = .39, .58, and .63, respectively). This stemmed largely from a subset of university students who responded “no” to all four proposition types (on a total of 16% of problems versus only a single problem for Tukano speakers). University students seemed to believe that these conditionals were sufficiently strange that they were not open to ordinary sorts of evaluation. This type of response was not found among the Tukano speakers, for whom the stimulus sentences were not so strange. (Indeed, the problems had been constructed within the context of Tukano society.)

We now turn to the p and not q response pattern for each of the problem types for each of the three groups of participants, for the responses given when each proposition was questioned individually, after the initial question. This pattern was made by the illiterate Tukanos 15% and 29% of the time to the Social-Contract and Indicative Problems, respectively, by the literate Tukanos 26% and 29% of the time, and by the university students 38% and 47% of the time. An ANOVA revealed only a significant main effect for group, F(2, 114) = 7.20, p < .01, with university students demonstrating more p and not q responses than did either of the Tukano groups. Pre-planned t-tests revealed that all three groups made p and not q selections more often than would be expected by chance alone (the chance proportion = .0625, that is, 1 of the 16 possible patterns of yes/no answers to the four questions), with t scores = 2.46, 3.46, and 6.78 for the illiterate Tukano, literate Tukano, and university groups, respectively. The ANOVA revealed neither a significant difference for problem type nor an interaction between group and problem type.

Overall, although all three groups of participants were more likely than chance to give p and not q responses, this tendency was more pronounced for the university students. This may reflect that the university students were more accustomed to taking tests, or more likely to set aside pragmatically based inferences. The superior performance on the initial questions, in comparison to the subsequent selection-task questions, requires some reflection.
Even though logically appropriate p and not q responses occurred at above-chance levels for both social-contract and indicative problems, both for the initial question and for the follow-up ``selection task'' questions, there were far fewer such responses on the ``selection task'' questions. For illiterate Tukano speakers, for example, 49% gave p and not q as an answer to the initial questions for the indicative problems, but only 29% provided such responses to the follow-up questions for the same problems. Clearly, the subsequent ``selection task'' questions were less apt to reveal an appropriate understanding of what can falsify or violate a conditional than were the initial questions. One speculative possibility is that repeated questioning on the same problem might have led some people to question whether their initial answer was correct, and then to seek an alternative response. Perhaps people can be forced to ``over-think'' on these problems when questioned repeatedly. Whatever the reason, the tendency to provide logically more appropriate responses on the initial questions was found for all three types of participants. None of this should divert the reader, however, from realising that all three groups of participants revealed a basic, appropriate, and equal ability to understand what violates a conditional social contract

O'Brien, Roazzi, Athias, Brandão

and what falsifies an indicative conditional. Most importantly, in the context of the theoretical issues the study was intended to assess, there was no indication that the problems in the Social-Contract Set were better understood than the problems in the Indicative Set.

Experiment 2: Evaluation task problems

The problems here used what Evans (1982) and O'Brien and Overton (1982) referred to as an evaluation task, presenting conditionals together with four exemplars of the forms p and q, p and not q, not p and q, and not p and not q, asking participants to identify which one of the forms falsified an indicative conditional, or violated a conditional social contract. Evaluation tasks were presented because, in our experience, they are easier than the selection task. We wanted to present maximally simple opportunities for the three populations to identify violating and falsifying instances for social-contract and indicative conditionals, respectively.

A total of 10 social-contract and 10 indicative problems were constructed. We turn first to the Social-Contract Set, which included five problems that had clear benefit/cost rules and five that did not, but that presented conditional obligations.³ The Social-Contract Problem Set was presented with a short vignette that introduced a man named Tarî, who made a long trip to many strange places. When he returned to his home village he told his neighbours about the social rules that he had encountered in the Land of the Sacred Forest, adding that, although the people in this land had these rules, they did not always behave in the way they ought. The participants were told that they would hear about some of these rules and would be asked to choose which picture from among a set would show the person who was not obeying the rule. Each problem presented its rule twice for emphasis, once in conditional form, and a second time in universal form.
For example, one benefit/cost problem stated, "There is a rule in the Land of the Sacred Forest that if a man gets a ride in a boat, he must pay the owner of the boat with bananas. Yes, that's right. Every time a man gets a ride in a boat he must pay the boat's owner bananas." Each problem presented the conditional rule and then showed four pictures that illustrated situations for p and q, not p and q, p and not q, and not p and not q; each problem presented the pictures in a different random order. Participants were instructed to point to the picture that showed someone violating the rule. The other four cost/benefit rules stated, "if a man is treated by a medicine man he must give the medicine man a basket," "if a man is given arrows to go hunting, he must give a

³ We thought it would be of interest to discover whether problems with clear benefit/cost rules would differ from pragmatic rules of the sort outlined in the theory of Cheng and Holyoak (1985). No differences were found between these two types of problem.

3. Reasoning by Tukano speakers

parrot in return to the man who gave him the arrows," "if a man is given the meat of a tapir, he must climb a tree to get honey," and "if a man is given a fishing net by a woman, he must carry her firewood." The rules for the five conditional obligation problems were, "if a man dances beneath a rainbow he must paint his face red," "if a woman is widowed, she must get a tattoo on her face," "if a woman is pregnant she must wear a red necklace," "if a hunter carries a tapir he must wear a feathered head dress called a cocar," and "if a man plays drums, he must wear a green hat."

The 10 Indicative Problems had the same structure as the Social-Contract Problems, although with a different opening vignette. Participants were told about a man named Tipi who visited the Land of the Sacred Forest. When he returned to his own village he told his neighbours many strange facts about what he had seen. His neighbours, however, knew Tipi to be a man who tended to say things that were not true. Participants were instructed to point to the picture that showed that what he was saying this time was not true. The ten indicative problems presented the following conditional assertions: "if a woman has children she always is bald," "if a man is playing drums he always is wearing a green hat," "if a man is hunting, he always is wearing a feather hat," "if someone sees a lightning, that person always covers his eyes," "if a man is not married, his house always has a red door," "if a village has (a large communal house), it always has a cow," "if a house has a green door, it always has two windows," "if a canoe transports five passengers, it always has a motor," "if a man is wearing sandals, he always wears a hat," and "if a house has a metal roof, it always has a red roof." As with the Social-Contract Problems, each Indicative conditional was presented twice, once in conditional form and then immediately in universal form.
Each participant received either the set of Social-Contract Problems or the set of Indicative Problems. Eighteen illiterate Tukano speakers received the Indicative Problems and 17 the Social-Contract Problems, 22 literate Tukano speakers received the Indicative Problems and 24 the Social-Contract Problems, and 27 university students from the City University of New York received the Indicative Problems and 27 the Social-Contract Problems.

Results

The vast majority of selections were for the picture showing p and not q (84%, with only 3%, 8%, and 4% for p and q, not p and q, and not p and not q, respectively, see Table 3.1). These responses ranged from 70% for the illiterate Tukanos on Social-Contract Problems to 90% for university students with Indicative Problems. Thus, independently of population tested or of type of problem, the participants understood that p and not q was the exemplar type that was relevant to falsifying or violating the conditionals.

Analyses were computed using only the data for the p and not q selections, which were summed across the 10 problems to yield a maximum score of 10 per participant. The ANOVA yielded a significant main effect for group, F(2, 129) = 7.81, p < .01, with the illiterate Tukanos making fewer p and not q selections than did either of the two literate groups, and a significant main effect for problem type, F(1, 129) = 9.62, p < .01, with the social-contract problems leading to fewer p and not q responses than did the indicative problems. The interaction of problem type × participant type was not significant, F(1, 129) = 1.72, p > .05. Given that all three participant types overwhelmingly preferred selection of the p and not q instances on all problems, the difference between the illiterate Tukano group and the two literate groups seemed to stem only from the illiterate participants' lesser experience with test taking. Indeed, all three participant types tended to make p and not q selections for both types of problem, and the difference between the two problem types concerns only a slightly larger tendency to choose p and not q instances for the indicative problems than for the social-contract problems. This is counter-evidence to the social-contract theory, although an absence of any difference also would have been evidence against it. Social-contract theory predicted that the social-contract problems would have the most p and not q responses.

Experiment 3

We turn now to the claim that bioevolutionary history provided humans with an ability to represent and reason about event frequencies, but not about the probabilities of single events. The modal error reported in the literature, for the sorts of Bayesian reasoning problems that have been presented, is "base-rate neglect," which can be illustrated with the prototypical problem in the literature – the taxicab problem (Kahneman & Tversky, 1972; Bar-Hillel, 1980).
Participants are told about a city that has two taxicab companies, with a Blue Cab Company having 15% of the cabs and a Green Cab Company the remaining 85%. A cab is involved in an accident and a witness who is accurate 80% of the time identifies it as blue. Participants tend to judge the probability that the cab in the accident is blue as .80, thus apparently ignoring the base rate that only 15% of the cabs in the city are blue. Someone constructing a Bayesian line of reasoning would compute the ratio of the probability of a cab being correctly identified as blue, divided by the probability of any cab being identified as blue (i.e., blue cabs correctly identified plus green cabs incorrectly identified) (= .12/[.12 + .17] = .41). Similar findings of base-rate neglect have been reported for problems that presented medical tests for a disease (e.g., Cosmides & Tooby, 1996; Gigerenzer & Hoffrage, 1995) or diagnosis of mechanical problems (e.g., Birnbaum & Mellors, 1983). Bayes's theorem, of course, applies across all sorts of content in situations other than those in which a witness (or a test) identifies an accident

(or a disease). Indeed, a probability-theory textbook is apt to present coloured beads of various shapes and colours in an urn, and then to ask, for example, about the probability that a circular bead is red. If an investigator wants to assess whether experimental participants are able to resist "base-rate neglect," one needs only to assess whether they adjust their judgments about a probability following the introduction of the information that makes a judgment conditional. For example, imagine presenting an array of beads, most of which are red, but with a few that are green. We ask a participant to judge the probability that a randomly selected bead is red, and they report that the probability is very high. We then point out that most of the red beads are triangles, and only a few of the red beads are circles, whereas most of the green beads are circles. We now ask our participants what the probability is that a randomly selected circular bead is red. If the participants now respond that the (conditional) probability that the circular bead is red is lower than was their original (non-conditional) judgment about its probability, we have found that they are not exhibiting base-rate neglect, but that they have exhibited Bayesian conditional reasoning in considering the base rates of colours and shapes of the beads. Note also that the demonstration of Bayesian reasoning in this hypothetical example would have been made on a problem presented without any numerical information, and without requiring a numerical response; all the information about probabilities in this example was presented using non-numerical quantifier terms.

During debriefings of undergraduate college students in previous studies, they often told us that the problems are difficult because they do not remember from classroom mathematics teaching, or have not yet learned, how to perform them.
These students thus seemed to think that they were being tested about what they have learned in mathematics classes rather than being assessed for their psychological intuitions. We believe that such beliefs about the problems, when they are numerically presented, make them a poor way to assess basic intuitions about probabilities. Ferreira (2003) therefore conducted a study at the Federal University of Pernambuco in Recife, Brazil, that replaced the numbers in the problems with non-numerical quantifier terms. For example, the numerical information in the standard taxicab problem was replaced with "most of the cabs in the city are green and only a few of them are blue," and "the witness correctly identified the blue taxicab most of the time and misidentified the colour of the green taxicabs only once-in-a-while." Both the numerical and the non-numerical problems used five-point Likert-type scales, allowing the numerical terms in one scale to be matched with non-numerical terms in the other scale (e.g., "the probability is more than 80%" was matched with "the probability is extremely high"). Extensive pretesting was conducted, which demonstrated that the numerical and non-numerical descriptions were understood as conveying basically the same information, and results on the final experiment showed no differences between numerical and non-numerical presentations. This showed that one

can legitimately assess Bayesian reasoning without presenting numerically based "maths problems," so that populations that have no schooling in mathematics, and thus are innumerate, can be tested.

In the present experiment, two problems were constructed, one that presented information in terms of probabilities and asked for judgments about the probabilities of single events, and the other that presented information in terms of event frequencies and asked for judgments about the frequencies of events. Of course, if the theoretical proposals of Cosmides and her colleagues are correct, that human bioevolutionary history has provided representational formats and reasoning processes for frequencies and not for the probabilities of single events, then the frequency problems ought to be the easier. Both problems referred to canoe races between two men, one of whom had a reputation as a good racer, and the other of whom had a reputation as a poor racer. Each problem had two parts: an initial part in which only the skill of the racer was mentioned as a factor in predicting race outcomes, and a final part in which an additional factor of boat length was introduced. Base-rate neglect would be indicated to the extent that judgments were not adjusted as the new information was introduced, that is, as the problems came to require a judgment about a conditional probability.

For the probability problem, the experimenter explained that in the village of Inca Rapids the people often hold canoe races on which people often gamble. The canoes are always 7 or 8 m in length, and 8 m boats are faster and more likely to win. Two men, Moligon the Good Racer and Kiniwi the Weak Racer, sometimes race canoes. Both men weigh the same – 190 pounds – but Moligon the Good Racer is more athletic, muscular, strong and, most importantly, as is shown by his nickname, he is a very good racer of canoes. Kiniwi the Weak Racer is very different.
He is sedentary, flabby, weak, and most importantly, he is not a very good racer of canoes. The initial question about the canoe race was as follows: "So, if Moligon and Kiniwi have a canoe race, and both men use 8 m canoes, who is more likely to win the race? Moligon or Kiniwi? Or are the two men equally likely to finish at the same time?" If a participant said that one or the other of the two men was more likely to win, the experimenter asked whether this man was a little more likely to win the race or much more likely to win. This method of asking the question allowed the response to be scored using a five-point scale where 1 = Kiniwi is much more likely to win, 2 = Kiniwi is a little more likely to win, 3 = the two are about equally likely to win, 4 = Moligon is a little more likely to win, and 5 = Moligon is much more likely to win.

After the response was recorded, the experimenter said that at times there are races in which the two competitors use canoes of different sizes, with one racer using a 7 m canoe and the other an 8 m canoe. The canoes are identical in quality, materials, shape, and weight, and the only difference is in the length, with a canoe of 8 m being much faster than a canoe of 7 m.

There is going to be another race between Moligon the Good Racer and Kiniwi the Weak Racer. This time Moligon is going to race using a slow canoe of 7 m and Kiniwi is going to use a fast canoe of 8 m. The second and final question then was posed as follows. "So, if Moligon the Good Racer and Kiniwi the Weak Racer have a race when Moligon uses a much slower canoe and Kiniwi uses a much faster canoe, who is more likely to win the race? Moligon or Kiniwi, or are they both likely to finish at the same time?" The same follow-up questions were used so that the response could be scored using the same five-point scale.

The frequency version of the problem was identical except that all references to probabilities were replaced with references to frequencies. For example, an 8 m canoe usually wins, a good racer wins a vast majority of his races, and the two questions referred to the two men having many races and asked who would win more of them; if one racer was predicted to be the winner of more races, a judgment then was required as to whether this would be only a few more races or many more races.

For all three populations – literate and illiterate Tukano speakers and City University of New York undergraduate students – the problems were administered individually and orally. Participants were asked to repeat each piece of information until the experimenter was satisfied that it had been heard and remembered, and participants were asked to explain each response. Eighteen illiterate Tukano speakers were presented the probability problem and 21 the frequency problem. Eighteen literate Tukano speakers were presented the probability problem and 20 the frequency problem. Thirty-six university students in New York City were presented the probability problem, and 33 the frequency problem.
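The five-point scoring of the two canoe-race questions can be sketched in a few lines of Python. This is only an illustration of the scale as described in the procedure, not the scoring materials actually used; the function names score_response and adjusted_for_new_information, and the string labels for the answers, are hypothetical.

```python
# Illustrative sketch only: the names and labels below are assumptions,
# not part of the original experimental materials.


def score_response(winner, margin=None):
    """Map an oral answer onto the five-point scale described in the text.

    winner: "Kiniwi", "Moligon", or "equal"
    margin: "little" or "much" (required when a winner is named)
    """
    if winner == "equal":
        return 3
    if margin not in ("little", "much"):
        raise ValueError("a named winner needs a margin of 'little' or 'much'")
    if winner == "Kiniwi":
        return 1 if margin == "much" else 2
    if winner == "Moligon":
        return 5 if margin == "much" else 4
    raise ValueError(f"unknown winner: {winner!r}")


def adjusted_for_new_information(initial_score, final_score):
    """True when the judgment moved away from Moligon once boat length was introduced."""
    return final_score < initial_score


if __name__ == "__main__":
    initial = score_response("Moligon", "much")  # both men in 8 m canoes
    final = score_response("equal")              # Moligon now in the slower 7 m canoe
    print(initial, final, adjusted_for_new_information(initial, final))
```

In this hypothetical case the participant moves from 5 to 3, so the script prints 5 3 True. A participant who gave the same score to both questions would be showing the failure to adjust that the text treats as base-rate neglect.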

Results

The means for the initial question were 4.58 and 4.53 for the frequency and probability versions, respectively, whereas the means for the final question were 3.33 and 3.17, respectively (see Table 3.1). The only significant difference was between the initial and final questions, F(1, 140) = 320.01, p < .01 (means = 4.56 and 3.06 for the initial and final problems, respectively), showing that all three groups were sensitive to the need to adjust their answers based on the additional conditional information. The consistent tendency for both types of problems, and for all three populations, was to adjust the judgments in a way that was appropriate for Bayesian reasoning. No differences between groups occurred, nor were there any differences between frequency and probability forms of the problems, thus disconfirming the notion that reasoning about frequencies would be easier than reasoning about probabilities. All three of the populations tested thus showed no indication of base-rate neglect on either problem type, demonstrating judgments that were appropriate from a Bayesian perspective.
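For readers who want to check the arithmetic behind the taxicab problem that introduced this experiment, the Bayesian computation can be written out as a minimal sketch. The function name posterior and its parameter names are hypothetical, and the 20% rate at which green cabs are misidentified as blue is an assumption drawn from the statement that the witness is accurate 80% of the time.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Bayes's theorem for a binary hypothesis.

    prior: base rate of the hypothesis (here, the proportion of blue cabs)
    hit_rate: P(witness says blue | cab is blue)
    false_alarm_rate: P(witness says blue | cab is green)
    """
    true_positive = prior * hit_rate                   # blue cabs correctly identified
    false_positive = (1 - prior) * false_alarm_rate    # green cabs called blue
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # Taxicab problem: 15% blue cabs, witness accurate 80% of the time.
    # Assumption: the witness therefore calls a green cab blue 20% of the time.
    p_blue = posterior(prior=0.15, hit_rate=0.80, false_alarm_rate=0.20)
    print(round(p_blue, 2))  # 0.41
```

The value of .41 is well below the modal response of .80, which equals the witness's accuracy and would be the correct answer only if blue and green cabs were equally common.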

Discussion

The data presented in Experiments 1 and 2, with an uneducated and illiterate indigenous population in the Amazon basin, as well as with two educated and literate populations, are unlike the data that were reported by Sugiyama et al. (2002) with an uneducated illiterate Amazonian population, and by Cosmides (1989) and Fiddick et al. (2000) with educated literate populations. Whereas Cosmides and her associates reported that people can identify violators of social contracts but cannot identify counterexamples for conditional assertions, data from our Experiments 1 and 2 did not reveal any differences between the two types of problem. All three of the populations we tested favoured instances of p and not q as evidence for social-contract and indicative problems equally.

Why do our data show people identifying instances of p and not q as falsifiers of conditional assertions when the studies presented by Cosmides and her associates showed that people did not do so? We suspect that the indicative problems in the experiments reported here succeeded because (1) we made them equal to the social-contract problems in interest and interpretability, which was not the case in the previously reported experiments, and (2) we made explicit the requirement to find falsifying information in the indicative problems. Just as the social-contract problems always have made the task of finding violators obvious, we made the task of finding falsifying evidence obvious; indeed, the scenarios emphasised the likelihood that the indicative conditionals were false. In brief, the problems we presented were devoid of extraneous disadvantages for the indicative problems that one finds in most of the literature (see also Noveck, Mercier, & van der Henst, Chapter 2, this volume).
In Experiment 3, all three of the populations were able to resist the base-rate fallacy when presented with Bayesian reasoning problems, with no differences between problems presented in frequency versus probability formats. Once again, one can ask why our problems resulted in responses that were equally appropriate for probability and for frequency versions when previous studies (e.g., Cosmides & Tooby, 1996; Gigerenzer & Hoffrage, 1995) had reported significant differences. First, as was shown by Roazzi et al. (2003), the findings reported in previous studies had different response formats for probability versus frequency formatted problems, and when these were removed, the differences between problem types disappeared. No such confounds existed in the present experiment. Additionally, the problems presented in Experiment 3 removed the "mathematical" appearance of the problems by presenting information about frequencies and probabilities with non-numerical quantifier terms in place of numbers. Without these extraneous aspects to the problems, the intuitions that people revealed were appropriate and identical for the two problem types.

The data presented in these three experiments thus indicate that the findings reported by previous investigators both for Wason's selection task and for

Bayesian reasoning tasks are interpretable as artifacts of particular task features that were extraneous to the issues of theoretical interest (see also Roberts, Chapter 1, this volume).

In evaluating evolutionary explanations offered by biologists for functional anatomy, Rosen (1982) commented that such explanations are constrained only by the inventiveness of those who author them, and by the gullibility of their audience. Gray, Heaney, and Fairhall (2003) came to a similar conclusion about explanations by evolutionary psychologists generally, and we extend the precaution to evolutionary explanations offered about reasoning processes in particular. We think of at least some bioevolutionary explanations as instances of a "Little Red Riding Hood" approach to evolutionary explanation, where one assumes that the wolf has big eyes so that it can see Little Red Riding Hood better. Of course, the folk tale never considers other possibilities as reasons for a wolf to have big eyes, such as making the wolf more attractive to wolves of the opposite sex. It also leaves unanswered why large eyes per se would improve a wolf's eyesight, much less why the improved eyesight is specific for seeing Little Red Riding Hood. The fact that Cosmides and her associates (e.g., Cosmides, 1989; Cosmides & Tooby, 1994, 1996; Fiddick et al., 2000; Tooby & Cosmides, 1992) have reported that people tended to solve versions of Wason's selection task when presented with social contracts, but did not do so with other sorts of conditionals, does not entail that evolution has provided humans with special reasoning modules for identifying violators of social contracts, any more than the wolf's big eyes were designed specifically for seeing Little Red Riding Hood; nor do such findings justify a conclusion that evolution has not provided any content-general logical reasoning processes, any more than the folk tale eliminated other possible reasons for evolution having resulted in a wolf's big eyes.
A similar caution should be extended to interpreting data purportedly showing that people are able to reason about frequencies but not about probabilities. As we have noted elsewhere (e.g., Noveck & O'Brien, 1996; O'Brien et al., 2004; Roazzi et al., 2003), it is parsimonious to explain such findings as resulting from artifacts of the particular features of the experiments, but not as resulting from the bioevolutionary history of our species. The data for the three experiments reported here support this conclusion.

Claims that content-dependent processes dominate human reasoning have become fashionable in recent years also among researchers who have not adopted the sorts of bioevolutionary arguments presented by Cosmides and her associates (e.g., Cheng & Holyoak, 1985; Evans, 1982). We suspect, however, that if such content-dependent claims were being made about language processing rather than about thinking and reasoning, they would not find such ready acceptance. Most readers would agree, we think, that ordinary sentence comprehension relies on knowledge about a sentence's content, but the fact that people take content into account is evidence

neither that all linguistic judgments are governed only by content-dependent processes, nor that content-general processes have no place of importance in linguistics. We reject the notion that there is no place in linguistics for a content-general theory of syntax because we recognise that many of our linguistic judgments are best understood in terms of such processes. We also reject an argument that there is no place in the psychology of thinking and reasoning for a content-general theory because many human reasoning judgments are likewise best understood in terms of content-general processes. Because of space limitations we shall not describe here the evidence for reasoning judgments that cannot be explained without reference to a set of content-general logical inference processes; we refer the reader to Braine and O'Brien (1998) and O'Brien (2004), which provided reviews of empirical evidence for logic judgments that cannot be explained in terms of problem content.

Given that the cognitive notion of modularity was introduced with the proposal of a linguistic module that applies across all sorts of semantic content (e.g., Chomsky, 1988; Fodor, 1983), proposals that modularity implies narrowly defined content domains in human reasoning are ironic. The concept of modularity carries with it, of course, a notion of code specificity that applies to input, processing, and output. For example, a language module is constrained so as to take only linguistic input and provide only linguistic output. The notion of specificity, however, can be applied narrowly or broadly. Pragmatic-reasoning-schemas theory (e.g., Cheng & Holyoak, 1985) describes rules that apply only to the extremely limited content of pragmatic actions and preconditions; social-contract theory provides a checking algorithm that applies only to the extremely limited content of social costs and benefits. These two theories thus are at the narrow end of a spectrum of content specificity.
At the broad end of this spectrum one can find mental-logic theories that propose inference processes that are not constrained by content. Mental-logic theory proposes representational specificity in that input is limited to propositional strings, but it allows the propositions to express any sort of content. (The broad end of the spectrum also includes the content-general mental-models theory of Johnson-Laird and his associates, e.g., Johnson-Laird & Byrne, 2002, although that theory does not constrain its input to propositional strings.)

Clearly, there would be a bioevolutionary advantage for a species to have inference-making processes that apply without constraints on the kinds of content to which they can apply, so that as new domains of content are encountered, the same inference processes can be applied. In the absence of any inference-making process that can be applied across content domains, a reasoner either would need to acquire new processes, or would not be able to make inferences when new domains were encountered. The bioevolutionary advantage for a species to have some content-general inference processes is obvious; such processes would allow adaptation when environmental changes are encountered (see O'Brien, 1993, for further discussion

on this point). We are not proposing that content-dependent processes play no role in reasoning, but we are pointing out that a cognitive system without content-general processes would be at a serious bioevolutionary disadvantage when new environmental demands were encountered. Let us not turn a blind eye to the multiplicity of uses to which a wolf might turn its large eyes.

Acknowledgments

This material is based on work supported by the National Science Foundation under Grant No. 0104503 to David P. O'Brien and Patricia J. Brooks, and a grant from CNPq of Brazil (proc. número 910023/01-8) to Antonio Roazzi and Maria G. Dias.

References Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. Acta Psychologia, 44, 211±233. Barkow, J. H., Cosmides, L., & Tooby, J. (Eds.). (1992). The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Birnbaum, M. H., & Mellors, B. A. (1983). Bayesian inference: Combining base rates with opinions of sources who vary in credibility. Journal of Personality and Social Psychology, 45, 792±804. Braine, M. D. S., & O'Brien, D. P. (Eds.). (1998). Mental logic. Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Brase, G. L., Cosmides, L., & Tooby, J. (1998). Individuation, counting, and statistical inference: The role of frequency and whole-object representations in judgment under uncertainty. Journal of Experimental Psychology: General, 127, 3±21. Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391±416. Chomsky, N. (1988). Language and problems of knowledge. Cambridge, MA: MIT Press. Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187±276. Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind (pp. 163±228). New York: Oxford University Press. Cosmides, L., & Tooby, J. (1994). Beyond intuition and instinct blindness: Toward an evolutionarily rigorous cognitive science. Cognition, 50, 41±77. Cosmides, L., & Tooby, J. (1996). Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition, 58, 1±73. Evans, J. St B. T. (1982). Psychology of deductive reasoning. London: Routledge & Kegan Paul. Ferreira, S. M. (2003). Reasoning about conditional probabilities: Beyond the debate

80

O'Brien, Roazzi, Athias, BrandaÄo

between the heuristics theorists and the frequentist theorists. Masters thesis, Federal University of Pernambuco, Recife, Brazil.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: The role of domain specific representations and inferences in the Wason selection task. Cognition, 77, 1–79.
Fodor, J. A. (1983). Modularity of mind. Cambridge, MA: MIT Press.
FOIRN-ISA (2000). Povos indígenas do Alto e Médio Rio Negro: Uma introdução à diversidade cultural e ambiental do noroeste da Amazônia brasileira. Brasília, Brazil: MEC/SEF/DPEF, Federação das Organizações Indígenas do Rio Negro & Instituto Socioambiental.
Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review, 102, 684–704.
Gray, R. D., Heaney, M., & Fairhall, S. (2003). Evolutionary psychology and the challenge of adaptive explanation. In K. Sterelny & J. Fitness (Eds.), From mating to mentality: Evaluating evolutionary psychology (pp. 247–268). Hove, UK: Psychology Press.
Hume, D. (1957). Enquiries concerning the human understanding and concerning the principles of morals. Oxford, UK: Oxford University Press. (Original work published 1737)
Johnson-Laird, P. N., & Byrne, R. M. J. (2002). Conditionals: A theory of meaning, pragmatics, and inference. Psychological Review, 109, 646–678.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430–454.
Noveck, I. A., & O'Brien, D. P. (1996). To what extent are pragmatic reasoning schemas responsible for performance on Wason's selection task? Quarterly Journal of Experimental Psychology, 49A, 463–489.
O'Brien, D. P. (1993). Mental logic and irrationality: We can put a man on the moon, so why can't we solve those logical reasoning problems? In K. I. Manktelow & D. E. Over (Eds.), Rationality: Psychological and philosophical perspectives (pp. 110–135). London: Routledge. Reprinted in M. D. S. Braine & D. P. O'Brien (Eds.). (1998). Mental logic (pp. 23–43). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
O'Brien, D. P. (2004). Mental-logic theory: What it proposes, and reasons to take this proposal seriously. In J. Leighton & R. J. Sternberg (Eds.), The nature of reasoning (pp. 205–233). New Haven, CT: Cambridge University Press.
O'Brien, D. P., & Overton, W. F. (1982). Conditional reasoning and the competence–performance issue: A developmental analysis of a training task. Journal of Experimental Child Psychology, 34, 274–290.
O'Brien, D. P., Roazzi, A., Athias, R., Dias, M. G., Brandão, M. C., & Brooks, P. J. (2003). The language of thought and the existence of a mental logic: Experimental investigations in the laboratory and in the field. Psychologica, 32, 263–284.
O'Brien, D. P., Roazzi, A., Dias, M. G., Cantor, J. B., & Brooks, P. J. (2004). Violations, lies, broken promises, and just plain mistakes: The pragmatics of counterexamples, logical semantics, and the evaluation of conditional assertions, regulation, and promises in variants of Wason's selection task. In K. I. Manktelow (Ed.), The psychology of reasoning: Historical and philosophical perspectives (pp. 95–126). Hove, UK: Psychology Press.
Roazzi, A., O'Brien, D., & Dias, M. G. B. B. (2003). Sobre o debate Freqüentista versus Probabilista: "Sorte de tolo" torna-se uma explicação plausível. Psicologia: Reflexão e Crítica, 16, 201–221.
Rosen, D. E. (1982). Teleostean interrelationships, morphological function and evolutionary inference. American Zoologist, 22, 261–273.
Sugiyama, L. S., Tooby, J., & Cosmides, L. (2002). Cross-cultural evidence of cognitive adaptations for social exchange among the Shiwiar of Ecuadorian Amazonia. Proceedings of the National Academy of Sciences, 99, 11537–11542.
Tooby, J., & Cosmides, L. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind. Oxford: Oxford University Press.
Wason, P. C. (1968). Reasoning about a rule. Quarterly Journal of Experimental Psychology, 20, 273–281.

4

Content-independent conditional inference

David E. Over

Some evolutionary psychologists have argued that people do not have a natural logical ability to use content-independent inference rules (Cosmides & Tooby, 1992, 1994; Gigerenzer, 2000; Gigerenzer, Todd, & the ABC Research Group, 1999; Tooby & Cosmides, 1992). According to them, the mind is like a Swiss army knife: it contains many different content-dependent modules for processing domain specific information about adaptive problems, but no general-purpose "blade". (See Cosmides & Tooby, 1994, on the knife analogy, and Over, 2003, for problems with it.) This view of the mind has been called the massive modularity hypothesis (Samuels, 1998; Sperber, 1994). The argument for it is that "content free, general-purpose systems could not evolve, could not manage their own reproduction, and would be grossly inefficient and easily outcompeted" by domain specific mechanisms (Tooby & Cosmides, 1992, p. 112). Cosmides and Tooby (1994) criticise the dual architecture theory of Fodor (1983), according to which the mind contains both specific modules and a content-independent capacity for logical inference (see also Fodor, 2000).

Others have carried massive modularity to the still further extreme of implying that the content-independent rules of logic and probability theory are surplus "baggage" for solving adaptive problems. Todd and Gigerenzer (1999, p. 365) claim that "thought processes that forgo the baggage of logic can solve real-world adaptive problems quickly and well" (also note this theme in Gigerenzer, 2000). The mind is described as an "adaptive toolbox" that only implements content-dependent "fast and frugal" heuristics (Gigerenzer et al., 1999). But even more than that, logic apparently should never be used as the norm to assess human rationality. (See Gigerenzer, 2006; this extreme view is implied in many of the papers of Gigerenzer, 2000.)

One influential argument that evolutionary psychologists have used to try to support massive modularity concerns probability and frequency. They have argued that people do not follow content-independent rules to solve certain word problems about frequencies (Brase, Cosmides, & Tooby, 1998; Cosmides & Tooby, 1996; Gigerenzer & Hoffrage, 1995, 1999). Instead, people rely on an adaptive domain specific ability to think about the frequencies of real-world events. I shall argue, in reply, that these frequency word problems are in fact solved by following content-independent rules (see also Over, 2003, and Sloman & Over, 2003). To solve these, people cannot "forgo the baggage of logic", but must follow elementary logical rules, which are the normative and descriptive basis of the solution. I will begin, however, with more general points about logical inference and what psychological experiments have shown about it.

Logical inference

Logically valid inferences depend only on logical form, and not at all on specific non-logical content, and hence these are content-independent. Consider the logically valid inference rule of modus ponens (MP). Following MP, given the major premise "if p then q" and the minor premise p, we would infer the conclusion q from them. For example:

(MP) If you have attended all the classes then you will pass the module.
     You have attended all the classes.
     Therefore, you will pass the module.

The specific content of this example is irrelevant to the logical validity of MP. In general, an inference form is logically valid if and only if it is logically impossible for its conclusion to be false given that its premises are true. No matter what the content of p and q, it is logically impossible for the conclusion q of MP to be false, provided that its premises are true.

Affirmation of the consequent (AC) is a logically invalid inference form. Following AC, given the major premise "if p then q" and the minor premise q, one might infer the conclusion p from them. For example:

(AC) If you have attended all the classes then you will pass the module.
     You will pass the module.
     Therefore, you have attended all the classes.

The specific content of this example is also irrelevant to the logical invalidity of AC: it is logically possible for the conclusion p to be false when the premises are true. In some pragmatic contexts, a person can be epistemologically justified (in a wider sense than formal logic provides) in following AC as a reasonable inference. However, logical validity is an all or nothing matter. There are possible contents for p and q for which the premises of AC are true and the conclusion false, and that is what makes AC a logically invalid inference form.

Some valid inference forms are of such fundamental importance that people can only be said to understand the meanings of logical terms if they have at least some ability to follow them. For example, people with no ability to follow MP, when they firmly believe its premises, could hardly be said to understand the meaning of "if" (although people will rightly hesitate to accept or believe the conclusion of MP when they are uncertain of its premises; see Stevenson & Over, 1995, 2001, and the next section). Another example relevant to this chapter is that of "and", and the inference form and-elimination, which is inferring q (or p) as a conclusion from the single premise "p & q". It is clearly impossible for q to be false given that "p & q" is true, and that makes and-elimination a logically valid inference. This rule is so fundamental that, if people did not have some ability to follow it, they could not be said to understand the meaning of "and".
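
The definition of validity given above can be checked mechanically for these three forms. The following is a minimal sketch, not part of the chapter's own argument. It assumes, purely for illustration, that "if p then q" is read as the truth-functional (material) conditional; the helper names `implies` and `is_valid` are hypothetical. An inference form counts as valid when no assignment of truth values to p and q makes every premise true and the conclusion false.

```python
from itertools import product


def implies(p, q):
    """Illustrative assumption: 'if p then q' read as the material conditional."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Valid iff no truth assignment makes all premises true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False  # found a counterexample row
    return True


# MP: from "if p then q" and p, infer q.
modus_ponens = is_valid([implies, lambda p, q: p], lambda p, q: q)
# AC: from "if p then q" and q, infer p.
affirm_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)
# And-elimination: from "p & q", infer q.
and_elimination = is_valid([lambda p, q: p and q], lambda p, q: q)

print(modus_ponens, affirm_consequent, and_elimination)  # True False True
```

The check never inspects what p and q are about, which is the sense in which these rules are content-independent. The counterexample to AC is the row where p is false and q is true.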

Belief bias and the conjunction fallacy

Evolutionary psychologists have embraced massive modularity, to the total exclusion of logic, partly because of experimental results that seem to them to imply that people do not have the ability to follow even elementary inference forms. As an example, consider how people's ability to follow MP can sometimes be affected by belief bias. This is a tendency not to infer the conclusion of MP when one does not believe it, even though one has been asked to assume the premises. Another example is the conjunction fallacy, which people commit when they think that a conjunction is more probable than one of its conjuncts. That is, they judge P(p & q) to be higher than P(q). Of course, it should be evident from the logical validity of and-elimination that "p & q" cannot be the more probable.

Some evolutionary psychologists, especially Gigerenzer and his collaborators, have strongly denied that biases and fallacies, such as belief bias and the conjunction fallacy, are faults in reasoning and judgement. Instead, they have argued that human beings have been highly successful, under natural conditions, as a result of efficient and reliable domain-dependent adaptations. They have inferred that logical ability is not really needed for adaptive behaviour in the real world. They think of "biases" and "fallacies" (only so-called, in their eyes) as the results of "fast and frugal" heuristics that can be efficient ways to think and make judgements about realistic matters of fact. These heuristics are supposedly non-logical, content-specific ways of efficiently achieving adaptive behaviour. The content-independent inference rules of logic are charged with being inefficient and time-wasting for realistic thought and action (Gigerenzer et al., 1999; Gigerenzer, 2000). It is a mistake, however, to imagine that people's tendency to have biases and to commit fallacies shows that they do not have domain general logical abilities.

Consider again belief bias. A logically valid inference has a conclusion that is guaranteed true only when its premises are actually true. If one or more of the premises may actually be false, then the conclusion may be false. An inference with even one false premise, whether logically valid or not, therefore does not give logical grounds for believing its conclusion. People can justifiably reject the conclusion of a logically valid inference if its premises clash with their well-justified beliefs. And if they have grounds for holding that the conclusion of a logically valid inference is false, then they have grounds for inferring that one of the premises is false.

A useful distinction here, sometimes made in logic, is between a logically valid and a logically sound inference. A logically valid inference is as we have already defined it, and a logically sound inference is a logically valid inference in which the premises are true, in fact, and not just by assumption. There could hardly be stronger evidence from psychological experiments that ordinary people endorse MP as a valid inference form. But it appears that they do not always sharply distinguish logical validity and logical soundness. People often seem to operate with a simpler distinction between a "good" logical inference and a "bad" one. The former would be a sound inference, being valid and having true premises, and the latter an unsound inference, either being invalid or having a false premise. Nevertheless, there is strong evidence that people grasp the underlying notion of logical validity. For example, logical validity versus invalidity influences endorsement rates when the believability of the conclusion is controlled (Evans & Over, 1996, 2004).

An unsympathetic researcher could allege that ordinary people show a lack of ability when they fail to take account of the instructions in a standard experiment on belief bias. These state that the participants are to assume that the premises are true. Since the participants do not do so, and because they are affected by whether the inference is sound, they can be said to be biased. However, this failure to follow instructions can hardly be called a logical failure. It is certainly not evidence that people are unable to follow content-independent inference rules. In everyday affairs, it would not be rational to try to extend one's knowledge from mere assumptions that implausible statements are correct, rather than from well-justified beliefs. Any proposition, no matter how subjectively or objectively improbable, can be assumed "for the sake of argument".
But it is often highly efficient, in normal reasoning, for people to use automatic and fast heuristics for judging the plausibility of conclusions, rather than to suspend their well-founded beliefs and make assumptions. This does not mean that people are unable to make logical inferences from implausible or improbable statements. Belief bias effects can be reduced by stressing that the premises of an inference are to be assumed true (see Evans & Over, 1996, Chapter 6, and 2004, Chapter 6, for a discussion of this point).

In ordinary affairs, inferences from premises that one does not believe are essential for certain purposes. Consider the reductio ad absurdum inference form, a valid content-independent inference procedure. Using this, we assume a point of view in order to derive, by valid inferences, a contradiction. Since we have used valid inferences, we can infer, from the derived contradiction, that at least one of our assumptions must have been false. We can use this form to discover and correct a falsehood among our own beliefs, but also to attack someone else's beliefs as "absurd".

Even more generally, where someone else holds a point of view that we reject, we can use content-independent rules to make inferences as to what this person is also likely to believe. We do not agree with people's bigoted beliefs, but we can sometimes use simple valid inferences, taking those beliefs as assumptions, to infer the conclusions they will themselves draw. Suppose they have dogmatic beliefs in "if p then q" and p, which are prejudices that we do not share. We can still infer that they will also believe q, whatever the content of p and q. Indeed, we could not have an adequate model of other people's minds if we could not make content-independent inferences from their false beliefs taken as assumptions. An increasing ability to perform these inferences could well have increased the reproductive success of our evolutionary ancestors. Without such inferences, they could not have had the full advantages of social knowledge (Humphrey, 1976; Tomasello, 1999). Evolutionary psychologists who talk about a "theory of mind module" (Cosmides & Tooby, 1992) are presupposing that people do have the ability to perform content-independent inferences from what they do not believe (see also Happaney & Zelazo, Chapter 11; McKinnon, Levine, & Moscovitch, Chapter 7; Stenning & Van Lambalgen, Chapter 8; Moses & Sabbagh, Chapter 12, this volume).

It is also wrong to conclude, on the basis of experiments on the conjunction fallacy, that people do not grasp the logical validity of and-elimination. Consider what Tversky and Kahneman (1983) show in detail in their seminal paper. Participants in one of their experiments were asked about a survey of adult males in British Columbia. One question that participants were asked was of the form "p & q":

What percentage of the men surveyed are both over 55 years old and have had one or more heart attacks?

Another question participants were asked was of the form q:

What percentage of the men surveyed have had one or more heart attacks?

Participants gave a higher percentage as the answer to the question about "p & q" than they gave to the question about q.
They thereby committed the conjunction fallacy, but Tversky and Kahneman (1983) did not conclude that their participants were unable to grasp the elementary logic of "and". Their conclusion was rather that people often use quick and efficient, but sometimes unreliable, heuristics to make probability judgements. (For other important work on the conjunction fallacy, see Sides, Osherson, Bonini, & Viale, 2002, and Stanovich & West, 1998.) Tversky and Kahneman (1983) showed, in fact, how to activate people's basic logical understanding by reformulating the questions: They asked about the subsets of a set of 100 adult males participating in a survey in British Columbia. The "p & q" question was reformulated as:

How many of the 100 participants are both over 55 years old and have had one or more heart attacks?

The reformulated q question was:

How many of the 100 participants have had one or more heart attacks?

Asked these new questions, people tended to avoid the conjunction fallacy. They specified a smaller subset as the answer to the "p & q" question than they specified for the q question. There is, of course, a logical embedding relation between a "p & q" subset and a q subset of 100 (or of any other number of) objects. The content of p and q does not matter. For example, if 10 men out of the 100 have had heart attacks, then the subset of men over 55 years old who have had heart attacks must be a subset of those 10. Perhaps 8 of the 10 are over 55 years old. In general, there is a logically necessary connection between elementary logic and relations in finite set theory. One of the unlimited applications of content-independent set theory is to express facts about sample frequencies from surveys and other sampling procedures.

Hence Tversky and Kahneman (1983) could have claimed to show that people avoid the conjunction fallacy when they think in terms of frequencies rather than percentages. Instead they argued that the frequency questions made the subset relations "transparent". Another way to put this is that this format made the logic of "and" transparently relevant. People do have the content-independent ability to follow this logic, but whether that ability is exercised on any given occasion can depend on the way a problem or question is put to them. Tversky and Kahneman held that people often use fast and efficient heuristics to make probability judgements. These have "ecological validity" (Tversky & Kahneman, 1973) in their general reliability, but do not always lead to judgements that conform to logical validity. Tversky and Kahneman (1983) were so far from denying that people have elementary logical ability that they demonstrated how to give probability problems a representation, in terms of embedded finite sets, that makes them easy to solve by following logically valid inference rules.
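
The embedding relation between a "p & q" subset and a q subset can be illustrated with ordinary finite sets. This is a minimal sketch under stated assumptions: the men are represented by arbitrary integer labels, the 12 further men over 55 are borrowed from the chapter's later natural sampling example, and the function name `conjunction_rule_holds` is hypothetical. The random trials stand in for "any content" and are an illustration of the rule, not a proof of it.

```python
import random


def conjunction_rule_holds(p_set, q_set):
    """The 'p & q' set must be a subset of the q set, so it can never be larger."""
    both = p_set & q_set
    return both <= q_set and len(both) <= len(q_set)


# The example from the text: 100 men, 10 with heart attacks, 8 of those over 55.
had_heart_attack = set(range(10))             # q: men labelled 0..9 (arbitrary labels)
over_55 = set(range(8)) | set(range(10, 22))  # p: 8 of those 10, plus 12 other men
both = over_55 & had_heart_attack             # the "p & q" subset

# The same relation holds whatever p and q are about: try arbitrary memberships.
rng = random.Random(0)
survey = range(100)
all_hold = all(
    conjunction_rule_holds(
        {m for m in survey if rng.random() < 0.5},
        {m for m in survey if rng.random() < 0.5},
    )
    for _ in range(1000)
)

print(len(both), len(had_heart_attack), all_hold)  # 8 10 True
```

Nothing in `conjunction_rule_holds` refers to heart attacks or ages; only the set relations matter.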

Natural sampling

Continuing the work of Tversky and Kahneman (1983), evolutionary psychologists have shown in experiments that reformulating probability problems in terms of embedded finite sets can help participants to solve them correctly. Gigerenzer (1991) seemed to argue at first that frequency problems in general are easier to solve correctly than single-case problems. But, under the influence of Kleiter (1994), there was a quick retreat to a much weaker position. This was the claim that people find frequency problems easier to solve when these are given in the form that Kleiter called natural sampling. Cosmides and Tooby (1996) and Brase et al. (1998) were clear in giving an evolutionary explanation for why natural sampling problems are easier to solve than single-case probability problems: Only an ability to solve the former was adaptive under primitive conditions. They concluded that human beings have a dedicated, domain specific module only for solving natural sampling problems. However, Gigerenzer and Hoffrage (1995, 1999) introduced some confusion by advancing both this evolutionary explanation, and the explanation that a natural sampling problem is easier to solve because it is computationally simpler than a single-case problem. They did not say how the two explanations are consistent. (For critical points about the claims covered in this paragraph, see Girotto & Gonzales, 2001, 2002; Kleiter, 1996; Over, 2002, 2003; Sloman & Over, 2003; Sloman, Over, Slovak, & Stibel, 2003.)

We can illustrate a natural sampling problem by assuming that, out of 100 men that we know, we can recall that 10 have had heart attacks and that 8 of those were over 55 years old. Out of the 90 who have not had heart attacks, suppose we can recall that 12 were over 55 years old. We can quickly see that 20 men (8 + 12) we know are over this age. With these memories, we can easily answer a question about the frequency of heart attacks among the men we know over 55 years old. Our answer will of course be 8 out of 20. There appears, at first sight at least, no difficulty in giving this answer, and the experiments of Gigerenzer and Hoffrage (1995), Cosmides and Tooby (1996), and Brase et al. (1998) confirm that participants tend to solve correctly problems like this one about natural sampling. In contrast, consider the problem of inferring the single-case probability that a certain man we know has had a heart attack, given that he is over 55 years old.
Suppose that there are the following single-case probabilities: (1) The probability that the man has had a heart attack is 0.1; (2) the probability that he is over 55 years old given that he has had a heart attack is 0.8; and (3) the probability that he is over 55 years old given that he has not had a heart attack is about 0.13. From this information, we can answer the question by applying Bayes's theorem to calculate the single-case probability. The answer is about 0.4. This calculation for the single-case question seems harder than inferring the answer of 8 out of 20 for the natural sampling frequency format. The experiments of Gigerenzer and Hoffrage (1995), Cosmides and Tooby (1996), and Brase et al. (1998) again confirm that significantly fewer participants in experiments correctly solve such problems.

We have results from experiments on natural sampling that demand explanation. But in trying to explain them, evolutionary psychologists systematically confuse sample frequencies with actual objective frequencies. In their terminology, the result of natural sampling is knowledge of a "natural frequency" (Gigerenzer, 1998; Gigerenzer & Hoffrage, 1995, 1999), the implication being that a "natural frequency" is an objective frequency. However, what natural sampling gives us is a sample frequency, which could be so badly biased that it is nowhere near the objective one. Cosmides and Tooby (1996) are especially strong in claiming that people can observe objective frequencies but cannot observe single-case probabilities. But philosophers of science would immediately point out that people cannot observe objective frequencies. What people observe and record or remember are sample frequencies, such as how many heads there have been out of so many spins of a coin. One must then use the sample frequencies to try to infer the objective frequencies, which will be unobservable theoretical entities, such as hypothetical infinite sequences of spinning events, or an underlying theoretical propensity, that is, the fairness of the coin (Over, 2002, 2003; Sloman & Over, 2003). After some number of spins, one might infer that the coin was fair with a high degree of confidence, but one could always be wrong about that as an objective fact.

Some evolutionary psychologists proclaim that a great advantage of natural sampling is that it is not normalised (Gigerenzer, 1998; Gigerenzer & Hoffrage, 1995, 1999). They write almost as if mathematicians invented normalisation just to make sums difficult for ordinary people! But there are practical advantages in normalising (Over, 2002, 2003). Suppose that we have shared 8 of 20 nuts we have gathered with an acquaintance on the understanding that he will share nuts with us in the future. He later offers us 18 nuts out of 50 that he has gathered. Is he cheating us or not? The question is trivial after normalising out of 100. It becomes the question whether 40 out of 100 is the same as 36 out of 100. Normalising has made it transparent that 36 nuts are a proper subset of 40 nuts. The use of scientific sampling methods and analysis, to avoid biased samples, requires normalisation, which does not depend on the content of the terms, such as "has had a heart attack" or "has come up heads", that are used in the sampling procedure.
Perhaps normalisation is disparaged by those who champion massive modularity precisely because it is content-independent.

Why do so many more participants get the correct answer in natural sampling word problems, like the example about heart attacks in men over 55 years old, than in single-case problems? As we have already noted, evolutionary psychologists proffered two possible explanations. Their clear evolutionary claim was that the natural sampling problems have a specific format that human beings were well prepared by natural selection to solve (Cosmides & Tooby, 1996; Brase et al., 1998; Gigerenzer & Hoffrage, 1995; Gigerenzer, 1998). The argument was that human beings could observe frequencies but not single-case probabilities, and that being able to think effectively about frequencies increased reproductive success under primitive conditions. But there was also the second point, which depended on Kleiter (1994), that natural sampling word problems are computationally simpler than single-case problems (Gigerenzer & Hoffrage, 1995). The specific claim was that natural sampling problems have fewer computational steps than single-case problems. The emphasis on this computational claim increased in later work (Gigerenzer & Hoffrage, 1999).
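
The two formats of the heart-attack problem, and the nut-sharing example, can be worked through side by side. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the third probability is entered as the exact fraction 12/90, which is assumed here to be the value behind the text's "about 0.13". Exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction


def natural_sampling(d_and_s, not_d_and_s):
    """Frequency format: cases with d and s, out of all cases with s."""
    return Fraction(d_and_s, d_and_s + not_d_and_s)


def bayes(prior_d, p_s_given_d, p_s_given_not_d):
    """Single-case format: P(d | s) by Bayes's theorem."""
    joint_d = prior_d * p_s_given_d
    joint_not_d = (1 - prior_d) * p_s_given_not_d
    return joint_d / (joint_d + joint_not_d)


def per_hundred(part, whole):
    """Normalise a sample frequency to 'out of 100'."""
    return Fraction(part, whole) * 100


# Heart attacks among men over 55: 8 with, 12 without.
frequency_answer = natural_sampling(8, 12)
# The same problem as single-case probabilities: 0.1, 0.8, and 12/90.
probability_answer = bayes(Fraction(1, 10), Fraction(8, 10), Fraction(12, 90))

# Nut sharing: 8 of 20 given, 18 of 50 offered in return.
we_gave = per_hundred(8, 20)
he_gave = per_hundred(18, 50)

print(frequency_answer, probability_answer)  # 2/5 2/5
print(we_gave, he_gave)                      # 40 36
```

Both routes give the same answer, 8 out of 20, and neither function depends on what d and s stand for. The frequency route simply needs fewer arithmetic operations.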


There are notorious problems with inferring objective frequencies from sample frequencies. How good people are at this depends, to a large extent, on how well they are able to avoid, or compensate for, biases in the sampling. Our evolutionary ancestors were capable, to some extent, of making inferences about frequencies that affected reproductive success under primitive conditions. There is much to study about the dedicated perceptual and memory skills that enabled them to collect sample frequencies about contents related to reproductive success, and the cognitive abilities that enabled them to detect biases in those samples, insofar as they were able to do that. However, this cannot tell us why people tend to solve natural sampling word problems correctly, but not single-case word problems, in the experiments of Gigerenzer and Hoffrage (1995), Cosmides and Tooby (1996), and Brase et al. (1998). Those natural sampling problems do not require a participant to use perception and memory to notice and recall sample frequencies, nor to recognise biases in the samples. They only ask the participant to understand the simplest logical relations between sets of objects, and do the most elementary arithmetic (Howson & Urbach, 1993).

Gigerenzer and Hoffrage (1995, 1999) claim that the natural sampling word problems are computationally simpler than the single-case probability problems because they have fewer steps. Consider again the example of heart attacks among the men we know over 55 years old. There are intuitively fewer steps in the natural sampling version of this, where the answer can be practically read off immediately as 8 out of 20, than in the single-case version, where Bayes's theorem is applied to get the answer of about 0.4. But we can hardly use this fact to argue for a domain specific module that processes sample frequency information and not single-case probability information.
There are fewer steps in adding two numbers rather than 10 numbers, and people are faster and make fewer mistakes when adding two numbers than 10 numbers, but that is not evidence for a module dedicated to adding two numbers but not 10 numbers.

Moreover, it is too simple to use an intuitive and unanalysed idea of the number of "steps" needed to solve a problem as a measure of its computational complexity. The intuitive "steps" could be of greater or less difficulty. In the natural sampling problem, only the most elementary logical and set operations are necessary. Participants have to do little more than form the union of the set of people who have the symptom (or other criterion, e.g., being over 55 years old) with the disease (e.g., have had a heart attack) and the set of people who have the symptom without the disease. They add up the total number of people in this union and compare that to the number of people who have the symptom with the disease. But not all operations on sets, which are intuitive "steps", are so elementary. For example, one could construct a problem about sample frequencies that required the "step" of considering the set of all subsets of the sampled objects or events. Of course, if there are n members in a set, then there are 2^n subsets of the set, and so considering the set of all subsets of a set increases the difficulty of the problem exponentially.

We must say something deeper than that there are fewer intuitive "steps" to solve natural sampling problems. Looking more closely at the elementary steps for solving them shows why it is wrong to hold that the solution results only from content-dependent mental processes. We cannot "forgo the baggage of logic" if we want to get the solution. Take the very first step, which always consists of the logical partitioning of a set. It is the partition that matters, not what is in the set. This might be the set of all the men we know, or the set of patients treated over the lifetime of a physician, or the set of hunting trips a group of our primitive ancestors made in the dry season one year in the Pleistocene. The initial set is partitioned into a d subset and a not-d subset. The d and not-d subsets might be the set of patients who have had a disease versus the set of those who have not. But it does not matter at all to the procedure for solving the problem what the content of d is, whether this is a statement about men we know, patients with a disease, or successful hunting trips. All that matters is following the logical, content-independent rule that a set of objects can be partitioned into two exclusive and exhaustive subsets. The natural sampling problems in the experiments make it even easier for us. We are told outright how many objects are in the two subsets, and need only the logical understanding that each object must be in either the d or the not-d subset and that no object is in both.

The solution proceeds by the further partition, according to elementary logic, of the d subset, and the not-d subset, into s and not-s subsets. Again, we are told outright in the experiments how many objects are in these. And specific content still does not matter at all: the procedure is fully content-independent.
The content of s might be about what is intuitively the symptom of a disease, but it could also be about a "symptom" that makes it probable that a hunting trip, or anything else, will be "successful". Both the d and the not-d subsets are partitioned in turn into s and not-s subsets. The way that these partitions embed the subsets can be visually displayed in the logical tree of Figure 4.1 (see Kleiter, 1994, on this use of logical trees and Howson, 1997, on their underlying use in logic itself). We can speak of the left branch of the tree that begins with a node for the d subset, and the right branch of the tree that begins with a node for the not-d subset. Then the tree branches again under both of these nodes, with nodes for the s and not-s subsets. All that remains to complete the solution in any particular case is to fill in the number of objects, or events, in the subsets at the various nodes. After that is done, the solution can be practically read off the tree. But note that, to make the final step correctly, one must grasp that the "d & s" set is also a subset of the s set. Getting the correct answer comes down to making the logical relation between "d & s" and s "transparent", as Tversky and Kahneman (1983) originally termed it (Over, 2004).
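
The partitioning procedure just described can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the chapter's own formalism: the function names are hypothetical, and the counts are those of the heart-attack example, with the two not-s counts (2 and 78) derived by subtraction from the totals of 10 and 90 given in the text.

```python
def build_tree(d_and_s, d_and_not_s, not_d_and_s, not_d_and_not_s):
    """Partition a set into d / not-d, then each branch into s / not-s."""
    return {
        "d": {"s": d_and_s, "not-s": d_and_not_s},
        "not-d": {"s": not_d_and_s, "not-s": not_d_and_not_s},
    }


def branch_total(tree, branch):
    """Size of the d or not-d subset: the sum of its exclusive, exhaustive parts."""
    return sum(tree[branch].values())


def read_off(tree):
    """Final step: the 'd & s' set is a subset of the s set, so report it out of s."""
    s_total = tree["d"]["s"] + tree["not-d"]["s"]
    return tree["d"]["s"], s_total


# d = has had a heart attack, s = is over 55 years old.
tree = build_tree(8, 2, 12, 78)
total = branch_total(tree, "d") + branch_total(tree, "not-d")
answer = read_off(tree)

print(total, answer)  # 100 (8, 20)
```

Relabelling d and s as anything else, such as successful hunting trips and their "symptoms", leaves every function unchanged, which is the point of calling the procedure content-independent.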

4. Content-independent conditional inference

Figure 4.1, upper tree (logical form):

  d or not-d
    d
      s
      not-s
    not-d
      s
      not-s

Figure 4.1, lower tree (specific example):

  m or not-m (100) total enrolment
    m (90) passed module
      c (9) attended all classes
      not-c (81) did not attend all classes
    not-m (10) failed module
      c (1) attended all classes
      not-c (9) did not attend all classes

Figure 4.1 The logical form of a tree for a natural sampling word problem (upper), along with a specific example (lower).

Ordinary people who solve a natural sampling word problem do not have to have an explicit mental model of the logical tree (though that might be implied by the mental model account of the word problems in Johnson-Laird, Legrenzi, Girotto, Legrenzi, & Caverni, 1999). However, they will be following, in a step-by-step way, the logical rules that underlie the construction of the tree. They will also have to follow the logical rule of and-elimination at the last step. To infer that the "d & s" set is a subset of the s set is to follow that logical rule, thanks to the (content-independent) necessary relation between elementary logic and relations in finite set theory.

Gigerenzer (2006, p. 118) criticises Tversky and Kahneman's studies of the conjunction fallacy because they "retained logic as the norm for the rational judgement". He denies that logical ability makes human beings "smart": it is supposedly "fast and frugal" heuristics that do that (see also Gigerenzer, 2000). Now logic is an unbounded normative theory, and its standards can at times be too high to assess human rationality. Moreover, we need a much wider standard than logic to evaluate the rationality of

Over

action and the rationality of inductive and other non-logical inference (Evans & Over, 1996; Over, 2004). However, we must retain logic as the norm that is used for assessing some inferences. If people could not follow elementary logical rules, they could not be said to understand the meanings of such fundamentally important words as "and" and "if". In particular, the ability to solve a natural sampling word problem absolutely depends on content-independent logical operations. And the final step in the solution is that of following the content-independent conjunction rule. Understanding that the "d & s" set is necessarily a subset of the s set, no matter what the content of d and s, reflects the same logical ability as grasping that and-elimination is a logically valid inference rule.

As a new example (CM), suppose that a lecturer gives the following advice to one of her students:

(CM) If you have attended all the classes (c) then you will pass the module (m).

The lecturer may even use natural sampling to justify this advice. If she has read the literature on biases, she will be unlikely to rely on her memory. She will more probably consult her written or computer records. She cannot directly observe the objective frequency, but she can make a more reliable inference from her records than from her memory about the number of students who have taken the module, the number who passed, and the number who attended all classes. After going to this trouble, she may be able to use CM to give advice to her student. In spite of what the evolutionary psychologists argue, the whole point of natural sampling is often to make justified single-case probability judgements. But continuing with the example, and keeping it simple, let us assume that the lecturer finds the following in her records. There have been 100 students who took the module; 90 of these passed and 10 failed.
Out of the 90 who passed, 9 attended all the classes, and out of the 10 who failed, 1 attended all the classes. We can summarise this information in the logical tree of Figure 4.1. One does not have to assume that the sample frequencies summarised in Figure 4.1 are close to the objective frequencies in order to solve a word problem just about the samples. The lecturer could restrict herself to inferences about the sample frequencies, without believing that these correspond to the objective facts. But let us assume that she believes that the samples are unbiased, and that she can make her probability judgements equal to the sample frequencies. With these laid out so transparently, she can easily solve the problem of giving the frequency with which students passed the module, out of those who attended all the classes, which will obviously be 9 out of 10. Note that this result of natural sampling might well not be enough for her to give the advice in CM. She might do more than mindless natural sampling and reflect that whether or not her student passes the module is independent of whether he attends all the classes, as

P(m) = P(m/c). She might also wonder whether stronger students are more likely to attend all the classes than weaker ones, and suspect that the relevant causal factor is whether students pay attention or take good notes in the classes they attend. These further points illustrate that natural sampling on its own would not be as effective as well-justified inferences, sometimes with single-case conclusions, about causation (Over, 2002, 2003; Sloman & Over, 2003). Even if natural sampling gave us an unbiased sample frequency, this might only inform us about a correlation and not causation, and be useless for giving advice and making decisions.

However, let us suppose that the lecturer is rather naive and does base her advice to the student on the natural sampling we have described. She is unlikely to give the advice directly in terms of the rather technical conditional probability form, "the conditional probability of passing the module given that you have attended all classes is 0.9". She is much more likely to use the ordinary indicative conditional CM to give her advice. Why is that?
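The arithmetic behind the lecturer's two observations can be checked with a short Python sketch. The counts are those of the lower tree in Figure 4.1; the dictionary layout and variable names are illustrative assumptions, and the probabilities are simply the sample frequencies, as the text supposes.

```python
from fractions import Fraction

# Counts at the leaves of the lower tree in Figure 4.1.
counts = {
    ("m", "c"): 9,          # passed, attended all classes
    ("m", "not-c"): 81,     # passed, did not attend all classes
    ("not-m", "c"): 1,      # failed, attended all classes
    ("not-m", "not-c"): 9,  # failed, did not attend all classes
}

total = sum(counts.values())
attended = counts[("m", "c")] + counts[("not-m", "c")]
passed = counts[("m", "c")] + counts[("m", "not-c")]

# Frequency of passing among those who attended all classes.
p_m_given_c = Fraction(counts[("m", "c")], attended)
# Overall frequency of passing.
p_m = Fraction(passed, total)

# Passing is independent of attendance in this sample when the two are equal.
independent = p_m_given_c == p_m
print(p_m_given_c, p_m, independent)  # 9/10 9/10 True
```

The equality of the two fractions is what the text expresses as P(m) = P(m/c): in this sample, attending all the classes makes no difference to the frequency of passing.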

Conditionals and conditional probability

Philosophical logicians have long argued that there is a deep connection between ordinary indicative conditionals in natural language and conditional probability (see Bennett, 2003, and Edgington, 1995, for reviews and critical discussion). The position is that people's confidence in a conditional should be their confidence in the consequent of the conditional given its antecedent. In more technical terms, a subjective probability judgement about "if p then q", P(if p then q), should be the subjective conditional probability, P(q/p). An empirical version of this view is that the subjective probability of "if p then q" will be the conditional subjective probability of q given p, P(q/p). For example, people's confidence in CM will be P(m/c). Let us call this version the (empirical) conditional probability hypothesis. There is now very strong experimental support for this. Participants indeed tend to give P(if p then q) as P(q/p). This tendency is especially strong for the most realistic conditionals that apply to ordinary matters of fact (Evans, Handley, & Over, 2003; Evans & Over, 2004; Hadjichristidis, Stevenson, Over, Sloman, Evans, & Feeney, 2001; Oberauer & Wilhelm, 2003; Over & Evans, 2003; Over, Hadjichristidis, Evans, Handley, & Sloman, in press).

It is still unclear exactly what the relation is between the ordinary indicative conditional and conditional probability. It may be that the sole underlying function of this type of conditional is to express a conditional probability judgement. It may also be that the mental evaluation of the ordinary indicative conditional is more complex than that. But people may tend, for reasons of efficiency or pragmatics, to focus primarily on the conditional probability when they give a probability for this type of conditional (Evans & Over, 2004; Over et al., in press).

Not only do people tend to give the conditional probability for the probability of the conditional, they take account of this when they decide whether to endorse the conclusion of a conditional inference (for a review of the evidence, see Evans & Over, 2004). Even if the probability judgements are based on biased sample frequencies, we can use logical trees for the transparent representation of probabilistic conditional inferences. Note that there is a necessary connection between probability and logical validity, which Tversky and Kahneman (1983) appealed to in their discussion of the conjunction fallacy. For all coherent probability judgments, when one proposition, e.g., "p & q", logically implies another, e.g., q, it is impossible for the former to be more probable than the latter. In general, logical validity can simply be defined in terms of probability. When there is more than one premise, and when conditional probability is used as the probability of a conditional, it is easier to state this definition not directly in terms of probability, but rather in a logically equivalent notion of uncertainty. In this sense, the uncertainty of a proposition p, U(p), is equal to 1 − P(p). For example, if the probability of "p & q" is 0.6, the uncertainty of "p & q" is 0.4. Now we can define an inference as logically valid if and only if, for all coherent probability measures, the uncertainty of its conclusion does not exceed the sum of the uncertainties of its premises (Adams, 1998). For example, the uncertainty of q cannot possibly exceed 0.4 if the uncertainty of "p & q" is 0.4. Let us see how we can apply these definitions to CM and the example of a valid inference, MP. Suppose that the teacher can find the relevant sample frequencies in her records, and let us evaluate the probability of a conditional as being the conditional probability.
Then we can infer that the probability of the major premise for MP, P(CM), is the conditional probability, P(m/c), which, as shown earlier, was 0.9, making the uncertainty of the major premise, U(CM), equal to 0.1. The probability of the minor premise for MP, P(c), the probability that a student has attended all classes, is 0.1 (see Figure 4.1) and its uncertainty, U(c), is therefore 0.9. The sum of the uncertainties of the premises of this instance of MP is 0.1 + 0.9 = 1. The probability of the conclusion that the module will be passed, P(m), is 0.9 and the uncertainty, U(m), is 0.1. We see that the uncertainty of the conclusion of this instance of MP, 0.1, is not greater than the sum of the uncertainties of the premises, 1. In general, the uncertainty of the conclusion of MP can never exceed the sum of the uncertainties of the premises using coherent probability measures. The uncertainty of the conclusion of MP can, however, be high, when the uncertainty in the premises is high, and then of course MP should not give us confidence in its conclusion. But this is just the probabilistic extension of the point that a logically valid inference with false premises can have a false conclusion. Now let us turn to the example of an invalid inference, AC, that we gave earlier:

(AC)
If you have attended all the classes then you will pass the module. U(CM) = 0.1
You will pass the module. U(m) = 0.1
Therefore, you have attended all the classes. U(c) = 0.9
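The uncertainty comparison for the MP instance above and for this AC instance can be checked mechanically. The following Python sketch takes the probabilities from Figure 4.1 and treats P(CM) as P(m/c), as the text proposes; the function names `uncertainty` and `within_bound` are illustrative assumptions. Note what the check can and cannot show: one measure that satisfies the bound does not establish validity, which requires the bound to hold for all coherent measures, but one measure that violates it is enough to establish invalidity.

```python
from fractions import Fraction

def uncertainty(probability):
    """U(p) = 1 - P(p)."""
    return 1 - probability

def within_bound(premise_uncertainties, conclusion_uncertainty):
    """True if the conclusion's uncertainty does not exceed the summed premise uncertainties."""
    return conclusion_uncertainty <= sum(premise_uncertainties)

# The text's example: P(p & q) = 0.6 gives U(p & q) = 0.4.
u_p_and_q = uncertainty(Fraction(6, 10))

# Probabilities read off Figure 4.1.
p_cm = Fraction(9, 10)    # P(CM), evaluated as P(m/c)
p_c = Fraction(10, 100)   # P(c): attended all classes
p_m = Fraction(90, 100)   # P(m): passed the module

# MP: premises CM and c, conclusion m.
mp_ok = within_bound([uncertainty(p_cm), uncertainty(p_c)], uncertainty(p_m))
# AC: premises CM and m, conclusion c.
ac_ok = within_bound([uncertainty(p_cm), uncertainty(p_m)], uncertainty(p_c))

print(u_p_and_q, mp_ok, ac_ok)  # 2/5 True False
```

Exact fractions are used rather than floating-point numbers so that sums such as 0.1 + 0.9 come out as exactly 1.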

The sum of the uncertainties of the premises U(CM) and U(m) of this instance of AC is 0.1 + 0.1 = 0.2, and the uncertainty of the conclusion, U(c), is 0.9. Now we immediately see that the uncertainty of the conclusion, at 0.9, greatly exceeds the sum of the uncertainties of the premises, at 0.2, and it is transparent that this instance of AC is a very poorly justified inference. Not all cases of AC are as bad as this. One can sometimes have a reasonable pragmatic justification for inferring that P(q/p) and P(p/q) are both high when someone states "if p then q" (for example, if you do well, then you will get a prize). But our above example demonstrates that AC is a logically invalid inference.

It might be replied that, once we have laid out the tree, we do not need to perform any inferences, logically valid or invalid. All we have to do is to look at the probability, or equivalently uncertainty, of the conclusion, whether it is U(c) or U(m), and believe it, or doubt it, to that extent. However, this reply misses the point. The tree is a logical tree, and to lay it out is to perform a series of logically valid steps. It makes no difference what the contents of c and m are. All we need to know, for the logical tree, is how many instances there are of c and of m out of the total sample. In our example, we had 10 cases of c and 90 of m out of a total sample of 100, and when we have these numbers, the contents of c and m do not matter. When conditionals are evaluated by conditional probabilities, conditional reasoning becomes Bayesian reasoning. And a logical tree, whether it is displayed visually or thought through step by step, can make this reasoning utterly transparent. The transparency results from a series of elementary logical operations that do not depend on any particular content. (See Oaksford, Chater, & Larkin, 2000, for a very stimulating probabilistic account of conditional inference.
This is restricted in applying, so far, only to the case in which the minor premise is certain, but it could be generalised and made transparent by using logical trees.) The non-logical element in the trees we have described comes from the sampling, but of course logic must always begin with assumptions. Whether a sample frequency is biased in the first place is not a logical question but one about matters of fact. If the sample frequencies in a logical tree are biased, then there will be no good reason to believe that the series of elementary inferences has led to a conclusion that correctly represents an objective probability. Logically valid inferences with false assumptions can have false conclusions. As always, if the assumptions are rubbish, the conclusion can be rubbish (where "rubbish" means false). The teacher can use the logical tree to justify a conclusion about an objective probability under the assumption that her sample frequencies are unbiased, but she will

have to use non-logical means, about how she has actually done the sampling, to justify that assumption.

The Bayesian problems originally studied by evolutionary psychologists can be seen as questions about how much confidence to have in an ordinary indicative conditional. Consider again the conditional CM. We can ask two questions about the subjective probability of an ordinary conditional like this, one normative and the other descriptive. How much confidence should the teacher have in CM? How much confidence will the teacher have in CM? As we have pointed out above, some influential philosophical logicians have long answered the normative question by claiming that the subjective probability of an indicative conditional should be the conditional probability. That is, P(CM) should be P(m/c). Ramsey (1931/1990) was the first to suggest that conditionals should be evaluated in this way. He made a famous suggestion about how to make this evaluation that has come to be known as the Ramsey test (see Bennett, 2003, for a recent philosophical discussion of this, and Evans & Over, 2004, for its psychological significance). Suppose that people are debating how much confidence to have in a conditional "if p then q". They will be, according to Ramsey, "adding p hypothetically to their stock of knowledge and arguing on that basis about q". The result of this hypothetical thought will be a subjective conditional probability judgement, "fixing their degrees of belief in q given p". Unfortunately, he said nothing about the psychological procedures that would implement this. Other philosophical logicians have said much more about the philosophical implications of the Ramsey test, but almost nothing about its psychology.
More recently, many psychologists have become interested in the descriptive question and investigated the conditional probability hypothesis, that people will evaluate P(CM) as P(m/c), in experiments (see Evans & Over, 2004, for a review) and there has been very strong confirmation of this. The Ramsey test can also be seen as a high-level description of an indefinite number of psychological processes that result in conditional probability judgements (Evans & Over, 2004). Many of the heuristics studied by psychologists for making conditional probability judgements could be part of a Ramsey test. The heuristic that seems closest to this test is the simulation heuristic of Kahneman and Tversky (1982). They suggested that people could "run" a "simulation model" from a condition p in order to make a judgement about a conclusion q. But the simulation heuristic must itself be seen as a high-level description of other, more specific heuristics or psychological models, such as causal models (Sloman, 2005). Consider how the more specific availability heuristic (Tversky & Kahneman, 1973) could be applied in a Ramsey test or a simulation. Note that, when P(c & m) is higher (lower) than P(c & not-m), P(m/c) is relatively high (relatively low). Many instances of attending all the classes and passing the module may be available to the teacher, but an individual student who attended all the classes and yet failed

the module may find his own, sad case the most available to his mind. The availability heuristic would then lead the teacher to judge P(m/c) to be higher than the student would judge this. The conditional probability hypothesis implies that people's judgements about P(CM) will depend on such individual differences in their use of heuristics.

Conditional subjective probability judgements, and so by the hypothesis, subjective judgements about conditionals, can of course depend on knowledge of sample frequencies. Logical trees can make it transparent which probability judgements we should make with the sample frequencies we have (under the assumption that these are unbiased). The conditional probability hypothesis implies that people will judge P(CM) to be 0.9 when the sample frequencies are displayed in a logical tree like Figure 4.1. This prediction has yet to be tested in experiments, although there is already very strong experimental support, of different types, for the conditional probability hypothesis (see, most recently, Over et al., in press).
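The relation noted above between P(m/c) and the two conjunctions P(c & m) and P(c & not-m) can be written out explicitly. The following is a sketch in the chapter's notation, assuming P(c) > 0; taking "relatively high" to mean above one half is an interpretive assumption.

```latex
P(m/c) = \frac{P(c \,\&\, m)}{P(c \,\&\, m) + P(c \,\&\, \text{not-}m)},
\qquad\text{so}\qquad
P(m/c) > \tfrac{1}{2} \iff P(c \,\&\, m) > P(c \,\&\, \text{not-}m).
```

With the sample frequencies of Figure 4.1, P(c & m) = 0.09 and P(c & not-m) = 0.01, so P(m/c) = 0.09 / 0.10 = 0.9, the value that the conditional probability hypothesis predicts for P(CM).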

Conclusion

People's ability to solve natural sampling word problems cannot be explained by the massive modularity hypothesis. Still less can we dismiss logic as the norm for solving these word problems. To solve them, one must be able to perform elementary logical inferences that are independent of content, whether that is about attending classes and passing modules, or being over 55 years old and having heart attacks. We must turn to dual process theory to account for two separate cognitive capacities. There are domain specific abilities that are necessary for actual natural sampling. For example, to sample men over 55 years old, people would depend on a module for face recognition (Nakayama, 2001). There is also a separate content-independent ability to solve natural sampling word problems about sample frequencies – which result from actual sampling (or which have been made up by experimenters).

In dual process theories of the mind, there is a distinction between what can simply be called System 1 versus System 2 (Evans & Over, 1996, 2004; Kahneman & Frederick, 2002; Sloman, 1996; Stanovich, 1999, 2004; Stanovich & West, 2000, 2003). System 1 includes content-specific encapsulated modules that work automatically and pre-consciously. The modules implement heuristics for processing information quickly in associative or connectionist networks. They do not explicitly follow rules for content-independent inference, but rather implicitly comply with, or conform to, rules for processing information. (For more on this distinction, see Smith, Langston, & Nisbett, 1992, and Evans & Over, 1996.) System 2 is controlled and conscious and can explicitly follow content-independent logical rules. System 2 is sequential and slow in its step-by-step operation, being limited by working memory capacity, but it can override System 1 when the latter's results are likely to be inaccurate or biased.

Dual process theory, as such, leaves open the empirical question of the number of System 1 domain specific modules. It also leaves open how far there is biological preparedness in the acquisition of any of these. No doubt there is so much preparedness for some modules that these can be called "innate" in a strong sense. But certainly System 1 includes processes that enable us, like other animals in varying degrees, to act on sample frequencies of events that are useful for us to know about. These processes are not absolutely reliable by any means, and can cause us to recognise and remember biased frequencies, but the heuristics they implement have a reasonable degree of ecological validity (Tversky & Kahneman, 1973). System 2 can explicitly follow content-independent rules for the elementary logical operations that are used to solve problems about the results of natural sampling. System 1 is primarily responsible for actual natural sampling: the observation and memory of sample frequencies. Once sample frequencies are expressed in a problem, System 2 becomes relevant for solving it with the help of logic. There are individual differences in the activation of System 2 (Stanovich, 1999; Stanovich & West, 2000). But its use to solve probability problems, whether about sample frequencies or single cases, is facilitated by making the logic of the problems transparent (see Sloman & Over, 2003, and Sloman et al., 2003, on how to do this using Euler circles). Most dual process theorists argue that System 2 abilities are the direct product of natural selection and evolved later than System 1 abilities, and uniquely in human beings (Evans & Over, 2004; Over & Evans, 2000; Stanovich, 2004; Stanovich & West, 2003).
Even some former supporters of massive modularity have apparently moved towards dual process theory and now seem to argue that content-independent logical abilities could have evolved by natural selection (see Cosmides & Tooby, 2000, for this apparent argument, and Over, 2003, for a comment). There is bound to be controversy about the nature and origin of System 2 logical abilities. Nevertheless, the existence of these is established by the very experiments on natural sampling word problems that were originally designed to support the massive modularity hypothesis.

Acknowledgements

Thanks to Maxwell Roberts and Steven Sloman for very helpful comments on a draft of this chapter.

References

Adams, E. (1998). A primer of probability logic. Stanford, CA: CSLI Publications.
Bennett, J. (2003). A philosophical guide to conditionals. Oxford: Oxford University Press.

Brase, G. L., Cosmides, L., & Tooby, J. (1998). Individuation, counting, and statistical inference: The role of frequency and whole-object representations in judgment under uncertainty. Journal of Experimental Psychology, 127, 3–21.
Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 163–227). Oxford: Oxford University Press.
Cosmides, L., & Tooby, J. (1994). Beyond intuition and instinct blindness: Toward an evolutionarily rigorous cognitive science. Cognition, 50, 41–77.
Cosmides, L., & Tooby, J. (1996). Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition, 58, 1–73.
Cosmides, L., & Tooby, J. (2000). Consider the source: The evolution of adaptations for decoupling and metarepresentations. In D. Sperber (Ed.), Metarepresentations: A multidisciplinary perspective (pp. 53–116). Oxford: Oxford University Press.
Edgington, D. (1995). On conditionals. Mind, 104, 235–329.
Evans, J. St B. T., Handley, S. J., & Over, D. E. (2003). Conditionals and conditional probability. Journal of Experimental Psychology: Learning, Memory and Cognition, 29, 321–355.
Evans, J. St B. T., & Over, D. E. (1996). Rationality and reasoning. Hove, UK: Psychology Press.
Evans, J. St B. T., & Over, D. E. (2004). If. Oxford: Oxford University Press.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor, J. A. (2000). The mind doesn't work that way. Cambridge, MA: MIT Press.
Gigerenzer, G. (1991). How to make cognitive illusions disappear: Beyond "heuristics and biases". In W. Stroebe & M. Hewstone (Eds.), European review of social psychology (Vol. 2, pp. 83–115). Chichester, UK: Wiley.
Gigerenzer, G. (1998). Ecological intelligence. In D. D. Cummins & C. Allen (Eds.), The evolution of mind (pp. 3–29). Oxford: Oxford University Press.
Gigerenzer, G. (2000). Adaptive thinking: Rationality in the real world. Oxford: Oxford University Press.
Gigerenzer, G. (2006). Bounded and rational. In R. J. Stainton (Ed.), Contemporary debates in cognitive science (pp. 115–133). Oxford: Blackwell.
Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review, 102, 684–704.
Gigerenzer, G., & Hoffrage, U. (1999). Overcoming difficulties in Bayesian reasoning: Reply to Lewis and Keren (1999) and Mellers and McGraw (1999). Psychological Review, 106, 425–430.
Gigerenzer, G., Todd, P. M., & the ABC Research Group (1999). Simple heuristics that make us smart. New York: Oxford University Press.
Girotto, V., & Gonzalez, M. (2001). Solving probabilistic and statistical problems: A matter of information structure and question form. Cognition, 78, 247–276.
Girotto, V., & Gonzalez, M. (2002). Chances and frequencies in probabilistic reasoning: Rejoinder to Hoffrage, Gigerenzer, Krauss, and Martignon. Cognition, 84, 353–359.
Hadjichristidis, C., Stevenson, R. J., Over, D. E., Sloman, S. A., Evans, J. St B. T., & Feeney, A. (2001). On the evaluation of If p then q conditionals. Proceedings of the 23rd Annual Meeting of the Cognitive Science Society, Edinburgh, UK.

Howson, C. (1997). Logic with trees: Introduction to symbolic logic. London: Routledge.
Howson, C., & Urbach, P. (1993). Scientific reasoning: The Bayesian approach (2nd ed.). La Salle, IL: Open Court.
Humphrey, N. (1976). The social function of intellect. In P. P. G. Bateson & R. A. Hinde (Eds.), Growing points in ethology (pp. 303–317). London: Faber & Faber.
Johnson-Laird, P. N., Legrenzi, P., Girotto, V., Legrenzi, M. S., & Caverni, J.-P. (1999). Naive probability: A mental model theory of extensional reasoning. Psychological Review, 106, 62–88.
Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgment. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment (pp. 49–81). Cambridge: Cambridge University Press.
Kahneman, D., & Tversky, A. (1982). The simulation heuristic. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 201–210). Cambridge: Cambridge University Press.
Kleiter, G. (1994). Natural sampling: Rationality without base rates. In G. H. Fisher & D. Laming (Eds.), Contributions to mathematical psychology, psychometrics, and methodology (pp. 375–388). New York: Springer-Verlag.
Kleiter, G. (1996). Critical and natural sensitivity to base rates. Behavioral and Brain Sciences, 19, 27–29.
Nakayama, K. (2001). Modularity in perception, its relation to cognition and knowledge. In E. B. Goldstein (Ed.), Blackwell handbook of perception (pp. 737–759). Oxford: Blackwell.
Oaksford, M., Chater, N., & Larkin, J. (2000). Probabilities and polarity biases in conditional inference. Journal of Experimental Psychology: Learning, Memory and Cognition, 26, 883–889.
Oberauer, K., & Wilhelm, O. (2003). The meaning(s) of conditionals: Conditional probabilities, mental models and personal utilities. Journal of Experimental Psychology: Learning, Memory and Cognition, 29, 688–693.
Over, D. E. (2002). The rationality of evolutionary psychology. In J. L. Bermúdez & A. Millar (Eds.), Reason and nature: Essays in the theory of rationality (pp. 187–207). Oxford: Oxford University Press.
Over, D. E. (2003). From massive modularity to metarepresentation: The evolution of higher cognition. In D. E. Over (Ed.), Evolution and the psychology of thinking: The debate (pp. 121–144). Hove, UK: Psychology Press.
Over, D. E. (2004). Rationality and the normative/descriptive distinction. In D. Koehler & N. Harvey (Eds.), Blackwell handbook of judgment and decision making (pp. 1–25). Oxford: Blackwell.
Over, D. E., & Evans, J. St B. T. (2000). Rational distinctions and adaptations. Behavioral and Brain Sciences, 23, 693–694.
Over, D. E., & Evans, J. St B. T. (2003). The probability of conditionals: The psychological evidence. Mind & Language, 18, 340–358.
Over, D. E., Hadjichristidis, C., Evans, J. St B. T., Handley, S. J., & Sloman, S. A. (in press). The probability of causal conditionals. Cognitive Psychology.
Ramsey, F. P. (1990). General propositions and causality. In D. H. Mellor (Ed.), Philosophical papers (pp. 145–163). Cambridge: Cambridge University Press. (Original work published 1931)

Samuels, R. (1998). Evolutionary psychology and the massive modularity hypothesis. British Journal for the Philosophy of Science, 49, 575–602.
Sides, A., Osherson, D. N., Bonini, N., & Viale, R. (2002). On the reality of the conjunction fallacy. Memory & Cognition, 30, 191–198.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3–22.
Sloman, S. A. (2005). Causal models. Oxford: Oxford University Press.
Sloman, S. A., & Over, D. E. (2003). Probability judgment from the inside and out. In D. E. Over (Ed.), Evolution and the psychology of thinking: The debate (pp. 145–169). Hove, UK: Psychology Press.
Sloman, S. A., Over, D. E., Slovak, L., & Stibel, J. M. (2003). Frequency illusions and other fallacies. Organizational Behavior and Human Decision Processes, 91, 296–309.
Smith, E. E., Langston, C., & Nisbett, R. E. (1992). The case for rules in reasoning. Cognitive Science, 16, 1–40.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 39–67). Cambridge: Cambridge University Press.
Stanovich, K. E. (1999). Who is rational? Studies in individual differences in reasoning. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Stanovich, K. E. (2004). The robot's rebellion: Finding meaning in the age of Darwin. Chicago: Chicago University Press.
Stanovich, K. E., & West, R. F. (1998). Individual differences in framing and conjunction effects. Thinking and Reasoning, 4, 289–317.
Stanovich, K. E., & West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences, 23, 645–726.
Stanovich, K. E., & West, R. F. (2003). Evolutionary versus instrumental goals: How evolutionary psychology misconceives human rationality. In D. E. Over (Ed.), Evolution and the psychology of thinking (pp. 171–230). Hove, UK: Psychology Press.
Stevenson, R. J., & Over, D. E. (1995). Deduction from uncertain premises. The Quarterly Journal of Experimental Psychology, 48A, 613–643.
Stevenson, R. J., & Over, D. E. (2001). Reasoning from uncertain premises: Effects of expertise and conversational context. Thinking and Reasoning, 7, 367–390.
Todd, P. M., & Gigerenzer, G. (1999). What we have learned (so far). In G. Gigerenzer, P. M. Todd, & the ABC Research Group. Simple heuristics that make us smart (pp. 357–365). New York: Oxford University Press.
Tomasello, M. (1999). The cultural origins of human cognition. Cambridge, MA: Harvard University Press.
Tooby, J., & Cosmides, L. (1992). The psychological foundations of culture. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 19–136). New York: Oxford University Press.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293–315.

5

Ontological commitments and domain specific categorisation

Steven Sloman, Tania Lombrozo, and Barbara Malt

Categorisation research is Janus-faced, with two orientations that rarely look at one another. One side is busy developing and testing algorithms for human classification that are presumed to be domain general (Ashby, 1992; Kruschke, 1992; Nosofsky, 1992), paying scant attention to the possibility that human categorisation processes differ depending on the kind of object being considered. The other side spends its time carefully documenting how categorisation differs across different object domains (e.g., Atran, 1998; Gelman, 2003), with little notice of the many well-specified and rigorous domain general models of categorisation that have been proposed. One difference between the purveyors of these two literatures is their sense of what constitutes psychological explanation. The first group places more value on the precision and clarity afforded by formal models, the second on generalisability beyond the laboratory. But the contrasting assumptions of domain generality and domain specificity may reflect a more profound theoretical divide.

Cognitive scientists have documented a variety of ways in which patterns of categorisation and induction differ as a function of domain. For example, children and adults tend to privilege external appearance and functional properties when reasoning about artifacts like chairs, but care more about internal properties and an entity's origin when reasoning about natural kinds like dogs (Keil, 1989). Differential patterns of categorisation and inference can be explained by appeal to domain specific representations or processes, and indeed, they frequently are. However, it is also possible that children and adults make differential judgments on the basis of properties that merely correlate with domain, like similarity (e.g., animals tend to have more in common with one another than with artifacts) or causal role (e.g., artifacts are created to serve human needs and natural kinds typically are not), without engaging representations or processes tailored to the different domains. If so, domain differences may tell us little about the most central aspects of categorisation.

In this chapter, we attempt to bridge the gap between these two orientations. We begin by spelling out the range of current theoretical views on domain specificity in categorisation, laying out claims about the origins of domain distinctions and the different mechanisms involved. The presentation will make explicit some of the assumptions latent in the various positions and allow us to evaluate the extent to which each finds support in existing research.

To help specify the range of views, we offer a hierarchy of positions, beginning with the most strongly domain specific and ending with the most domain general. We take as our guide an analysis by Sloman and Rips (1998) of the role of similarity among objects and categories in human thought which distinguished four views: the extreme position that similarity is real, primitive, and cognitively efficacious in human reasoning (strong similarity), two variants of weaker claims, and the opposite extreme view that similarity is "invidious, insidious, a pretender, a quack" (Goodman, 1955), called "no similarity." We will suggest here that a set of positions on domain specificity in categorisation can be identified that also makes increasingly weaker claims, this time about the role of domain specific representations and processes.

At present, a compelling argument for invoking domain specific representations or processes in theories of categorisation comes from claims about psychological essentialism: that some categories are ascribed hidden, underlying essential properties that are causally responsible for an object's observable properties, and that these ascriptions play a central role in category judgments and inductions (e.g., Medin & Ortony, 1989). Most advocates of essentialism claim that only some categories are ascribed essences, and that such "selective essentialism" respects domain boundaries. However, debates about domain specificity turn on questions beyond essentialism. We have identified five key questions whose answers distinguish positions on domain specificity in categorisation (see also Table 5.1).

1. Are there innate modules that are differentially sensitive to objects from different domains?
2. Are there cognitive mechanisms (including representations, processes, or both) arising from any source that are differentially sensitive to objects from different domains?
3. Do people ascribe essential properties to (some) categories?
4. Are objects' causal roles critical to categorisation?
5. Are some domain general representations or processes operative in categorisation?

The strongest position we consider, extra-strong ontology, adopts the view that categorisation is governed by domain specific, innate modules that yield ontological kinds that are either essentialised or not. The weakest position we consider, no ontology, denies both domain specificity and essentialism. We also consider three intermediate positions based on claims in the literature, which leads us to the following five positions on the role of ontology in categorisation.

1. Extra-strong ontology. Domain differences in categorisation and induction are real and cognitively primitive (irreducible to other cognitive mechanisms). They result from innate modules evolved to pick out systematic regularities in the environment, including causal regularities. Modules have a one-to-one correspondence with domains (e.g., natural kinds versus artifacts), and have an all-or-none policy on essentialism: either all entities corresponding to a module are essentialised or none are. In a given classification or inference, an object is processed the way it is in virtue of the module that covers it.
2. Strong ontology. Domain differences are real and cognitively primitive. They result from learning mechanisms sensitive to properties like causal structure that partition objects in the world along deep ontological lines. Domains have an all-or-none policy on essentialism: either all entities from a domain are essentialised or none are. In a given classification or inference, an object is processed the way it is in virtue of the domain that covers it.
3. Medium ontology. Domain differences in categorisation and inference are real, but not cognitively primitive. They emerge from domain general, causal learning mechanisms that ascribe essences to some entities but not others. To the extent that the causal assumptions underlying the ascription of an essence correspond to causal differences across domains, essentialism will obey domain boundaries. In a given classification or inference, an object is processed the way it is in virtue of whether it is essentialised, which will correlate imperfectly with its domain.
4. Mild ontology. Domain differences in categorisation and inference are systematic, but not cognitively primitive. People reason and classify using domain general causal reasoning mechanisms. To the extent that domains correspond to causal discontinuities in the world, systematic differences between domains may emerge, and domain thus serves as a useful shorthand for theorists to classify roughly different types of processing. However, in a given classification or inference, an object is processed the way it is in virtue of its causal history and other causal roles, which will correlate imperfectly with its domain.
5. No ontology. Systematic domain differences in categorisation and inference may exist, but are not cognitively primitive nor due to causal differences across domains. People classify and reason using domain general reasoning mechanisms. To the extent that domains differ with respect to the properties such mechanisms track, systematic differences between domains may emerge. However, in a given classification or inference, an object is processed the way it is in virtue of its non-causal, domain general properties, which will correlate imperfectly (if at all) with its domain.

Table 5.1 Properties of categorisation theories
Determinants of category judgments and inductions (✓ = invoked by the theory, - = not invoked)

Theory                  Innate    Domain specific   Essentialist   Causal   Domain general
                        modules   mechanisms        beliefs        roles    properties
Extra-strong ontology   ✓         ✓                 ✓              ✓        ✓
Strong ontology         -         ✓                 ✓              ✓        ✓
Medium ontology         -         -                 ✓              ✓        ✓
Mild ontology           -         -                 -              ✓        ✓
No ontology             -         -                 -              -        ✓

These positions are ordered according to the strength of the claims they make: Stronger claims entail weaker claims, not vice versa (see Table 5.1). For example, according to extra-strong ontology, innate modules correspond to domains, which in turn determine whether an essence is ascribed. The modules serve the adaptive function of reasoning about a particular domain, and thereby encode systematic properties of the domain (Fodor, 1983, discusses what a module is). Such systematic properties might include the causal roles that entities in the domain typically play, as well as regularities that are not entirely causal, like the physical form of objects. Nonetheless, an advocate for extra-strong ontology will differ from other theorists in predictions about classification and inference in that judgments must be grounded in the relevant innate module, and not in essence or causal role.

While the theories in Table 5.1 entail the claims of theories lower on the hierarchy, myriad positions that do not obey this nested structure are logically possible. If each column is treated as independent, there are 2^5 = 32 positions, yet we only find five, with close variants, represented in the literature. Partly this is because the criteria in some columns have implications for others; partly it is a fact about theorists' predilections. Nonetheless, it is worth acknowledging the possibility of uncharted yet coherent views. For example, a theorist might advocate essentialism while denying the existence of meaningful domains, or advocate domain differences without appeal to essentialism. In fact, we have some sympathy with this latter position.

We next identify some of the theories associated with each of the five views and review the critical data that support or contradict each one. For each, we focus on evidence for and against the claim that distinguishes that position from its neighbour one step lower on the hierarchy. This review is not intended to be exhaustive but merely to clarify the claims and what they entail. Although this hierarchy can be fruitfully applied to a range of domains, in the interest of space we focus on the distinction between natural kinds and artifacts. We conclude by advocating the weakest position that can account for existing data: mild ontology supplemented with domain-level generalisations.
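The nested structure of the hierarchy, and the count of logically possible positions, can be illustrated with a short sketch. It assumes that each column of Table 5.1 is an independent yes/no commitment; the identifiers (`COLUMNS`, `POSITIONS`, `is_nested`, `all_combinations`) are our own shorthand for illustration and do not come from the literature reviewed here.

```python
from itertools import product

# Column labels follow Table 5.1; the identifiers are illustrative shorthand.
COLUMNS = (
    "innate_modules",
    "domain_specific_mechanisms",
    "essentialist_beliefs",
    "causal_roles",
    "domain_general_properties",
)

# Each position is the set of determinants it invokes, ordered strongest first.
POSITIONS = {
    "extra-strong": set(COLUMNS),
    "strong": set(COLUMNS[1:]),
    "medium": set(COLUMNS[2:]),
    "mild": set(COLUMNS[3:]),
    "none": set(COLUMNS[4:]),
}


def is_nested(positions):
    """True if every position's commitments strictly contain the next one's."""
    ordered = list(positions.values())
    return all(weaker < stronger for stronger, weaker in zip(ordered, ordered[1:]))


def all_combinations(columns):
    """Every yes/no assignment over the columns, as the set of 'yes' columns."""
    return [
        {col for col, on in zip(columns, flags) if on}
        for flags in product((False, True), repeat=len(columns))
    ]


combos = all_combinations(COLUMNS)
# Logically possible combinations that match none of the five positions.
uncharted = [c for c in combos if c not in POSITIONS.values()]

print(is_nested(POSITIONS), len(combos), len(uncharted))
```

Under these assumptions, a view that keeps domain specific mechanisms but drops essentialist beliefs (the uncharted position mentioned above) is one of the combinations that falls outside the five nested positions.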

Extra-strong ontology

Extra-strong ontology is committed to the existence of modules that pick out domains and determine what properties found in those domains are essential. These modules are proposed to arise in humans through natural selection, because sensitivity to causal and other regularities in the environment provides a selective advantage. Extra-strong ontology is differentiated from strong ontology by the assumption that ontological distinctions result from innate mechanisms; that is, a module encodes some key structural aspect of categories that emerges regardless of exposure to objects in the relevant domain. While people need not represent ontological kinds explicitly (although they might), the view requires that different ontological kinds (primarily living things) be treated in a unique way by the processes that assign category membership and make inductions. People might even encode differences that do not correspond to actual differences in the world, such as race (Templeton, 1998) and biological species (Mayr, 1982; Sober, 1994).

This view is articulated most fully by Atran (1998), who claims that a distinct module exists to make classifications and inductions related to plants and animals, and that this evolved via natural selection to reflect the history of interaction between people and these sorts of ontological kinds. He writes: "Universal taxonomy is a core module, that is, an innately determined cognitive structure that embodies the naturally selected ontological commitments of human beings and provides a domain-specific mode of causally construing the phenomena in its domain" (p. 555). Atran claims that people automatically generalise biological claims about an object to the generic-species level, a level that corresponds both to biological genus and species. He proposes that people make the "commonsense assumption that each generic species has an underlying causal nature, or essence, that is uniquely responsible for the typical appearance, behaviour, and ecological preferences of the kind" (Atran, 1998, p. 548), and that this assumption is innate. By "essence," he means "an intrinsic (i.e., nonartifactual) teleological agent, which physically (i.e., nonintentionally) causes the biologically relevant parts and properties of a generic species to function and cohere 'for the sake of' the generic species itself" (pp. 550–551). Note that essences need not be specific, identifiable parts or attributes; the claim is that people believe something essential exists whether or not it actually does and whether or not they have more specific beliefs about it. This position is echoed by Pinker (1997).

Support for extra-strong ontology

Categories in different domains do tend to have different causal histories. Living things evolve within ecosystems whereas artifacts are the product of intentional human design. To the extent that natural selection shapes the mind to appreciate such aspects of the world, natural selection might build in this distinction. It would be particularly important if the information needed to be accessed quickly and automatically.

Beyond such speculation, the assumption of hardwired domain specific modules is consistent with the following observations.

1. Living things are categorised in largely the same way cross-culturally (Atran, 1990; Berlin, Breedlove, & Raven, 1973; Malt, 1995). Atran concludes there is a universal "general-purpose" taxonomy of living things (though see Ghiselin, 1998). Universality would be a natural, though not logically necessary, consequence of innate modularity.
2. Some evidence indicates a cross-cultural tendency to prefer inductive inferences that are made at the generic-species level (e.g., vulture), even when this does not correspond to the basic level of categorisation as determined by other measures, like naming and feature listing (Coley, Medin, & Atran, 1997). For example, people are almost as willing to project a blank biological predicate (like "has enzyme X") to the generic-species level as they are to more specific levels, but they hesitate to project such properties beyond the generic-species level. This finding holds true not only for populations like the Itzaj Maya, for whom the generic-species level is the basic level of categorisation, but also for American college students, who typically treat the life-form level (e.g., bird) as the basic level on tasks other than inductive inference, like naming. This inductive tendency can be explained by appealing to an innate tendency to generalise at the generic-species level.
3. Evidence has been reported for selective neuropsychological impairment of knowledge about animals (e.g., Caramazza & Shelton, 1998), consistent with the claim of an innate module that must have some form of neural representation.
4. Even young children show systematic differences when making inferences about natural kinds versus artifacts. In particular, nonobservable internal features are given more weight in classification and induction of living things and observable external features are given more weight for artifacts (reviewed in Gelman, 2003).
5. The animate/inanimate distinction is made very early in development, in infancy (Rakison & Poulin-Dubois, 2001).

Problems with extra-strong ontology

Although a lot of data are consistent with extra-strong ontology, this view makes strong assumptions while providing little more than a description of the data, rather than an explanation in terms of cognitive mechanisms. The notion of module merely mirrors the observation of domain differences without explaining how they arise or offering novel predictions. Any unambiguous support for the extra-strong ontology view must come from some other source, perhaps an evolutionary argument for innateness as offered by Atran (1998). Evaluating such an argument is beyond the scope of this chapter. When doing so, of course, it is critical to consider not just whether a plausible evolutionary account can be given of the modular view, but also whether one can be given of its antithesis, a non-modular view (see Sloman & Over, 2003; Sterelny, 2003).

Extra-strong ontology also has difficulty accounting for cross-cultural variation in biological kind classification and inference. Despite broad similarities, animal categorisation exhibits some cultural specificity: For instance, cows have a religious significance that sets them apart in India, whereas they are often grouped with sheep and goats and other livestock in Wisconsin. The flexibility of biological categorisation has led theorists to propose promiscuous realism (Dupré, 1981), the idea that there is no single, privileged way to carve up kinds in the environment. Variation in animal categorisation is not arbitrary: it tracks the different roles that animals play in the economic, social, and religious lives of different cultures (Diesendruck, Markson, & Bloom, 2003, and Malt, 1995, provide references). Extra-strong ontology could account for cultural variation by assuming the module that governs biological classification is sensitive to complex causal roles that crosscut domains. But to the extent that categorisation cuts across domains by virtue of learning these causal roles, the innate and domain specific components of the module cease to account for the data.

Moreover, taxonomies do not always support induction. Medin, Coley, and Storms (2003) provide several examples of inductions mediated by causal reasoning that break category boundaries. Sloman (1998) reports cases where people fail to use category hierarchies to make inductions in the most transparent situations. For instance, people do not necessarily infer that all sparrows have a property when told that all birds do, even when they affirm that sparrows are birds. The problem for extra-strong ontologists is that the only inferential mechanism they propose is taxonomic, and yet taxonomic reasoning is surprisingly rare.

Finally, views like extra-strong ontology provide no natural account of expertise that transcends evolutionary endowment. For example, biologists appeal to categories and processes that reflect non-obvious scientific discoveries, and have even redefined domain boundaries (e.g., by interpreting life as a biological rather than a spiritual phenomenon). If such knowledge emerges from the same mechanisms that govern folk biological inference, then the constraints imposed by mechanisms in the biology module must be too weak to warrant strong claims about the domain specific nature of categories and inference. Alternatively, such knowledge could result from inferential mechanisms outside the folk-biology module. But if such nonmodular inferential mechanisms are available, why have a module for folk biology in the first place?

One possibility is that innate, domain specific representations and processes are needed to get folk biology off the ground, but that once an initial scaffold of biological knowledge is in place, other mechanisms take over and account for scientific expertise and cultural variation. Proposals that such representations are required at the outset are often based on the observation that perceptual information is insufficient to explain people's categories. For example, children might be observed not to treat toy dogs and real dogs in the same way; therefore theorists conclude that they use domain-dependent conceptual knowledge to make critical distinctions. This argument is based on treating domain specific conceptual processes and domain general perceptual heuristics as the only possible categorisation processes (Atran, 1998, makes this leap explicitly on p. 554). Clearly, the dichotomy is not exhaustive. In our discussion of mild ontology below, we offer an alternative in terms of domain general conceptual processing.

Strong ontology

The idea of strong ontology is that, as a result of an essentialist bias and causal learning that picks out regularities in the causal structure of the world, people have domain specific mechanisms for classification and inference. Strong ontology differs from extra-strong ontology in denying innate modules, but shares the assumptions that domains are associated with an all-or-none policy on essentialism and that the entities from some domains (like living things) are essentialised while others (like artifacts) are not. As Gelman, Coley, and Gottfried (1994) put it, essentialism "readily applies to new domains that have never before been encountered" but does not "sensibly apply to all domains . . . An essentialist assumption may be 'designed' in such a way that it functions only when it meets domains with the appropriate features" (p. 358). Strong ontology differs from medium ontology in the assumption that ontological distinctions emerge from domain differences, and that essentialism respects these differences.

At various points, strong ontology seems to be the view espoused by Gelman (2003). She writes: "Essentialism is a domain-general assumption that is invoked differently in different domains depending on the causal structure of each domain" (p. 312). However, the result of differentially applying these domain general assumptions seems to be domain specific, essentialist reasoning: "Although the proposed principles are domain-general, essentialising is not. We do not essentialise wastebaskets or gumballs" (p. 321). At other points, Gelman (2003) seems more consistent with the position we review in the next section, medium ontology.

Support for strong ontology

The main advantage of this approach is that it too is consistent with all of the phenomena used to motivate the extra-strong ontology view, but is unburdened by assumptions of innateness or those concerning modularity. The supposition that some domains and not others are "essentialised" is intended to account for the very data that motivate extra-strong ontology. And by allowing a role for learning, strong ontology is consistent with the cross-cultural variation in biological classification that threatens extra-strong ontology.

Problems with strong ontology

Strong ontology has difficulty accounting for cases in which classification and induction do not respect domain differences. After all, objects, including living things, play many causal roles that are not tied directly to their biological domain. Hence inferences often depend on contingent aspects of the object, its environment, and the specific goal of the inference. For instance, a dog plays many causal roles in the day-to-day activity of a family, including drawing family members outside and causing them to administer medication. Some inductions, for example, about behaviour and disease, will depend in part on these other causal roles. Only a few, like those having to do with body parts, will be based purely on the dog's biological history (and even then, not if she loses a leg in a car accident). A dog's disposition may be as much a result of a boisterous household as of her evolutionary niche. Moreover, some important groupings have nothing to do with ontological domain. The class "things to take out of a burning house" (Barsalou, 1991) includes living things (like a dog) and artifacts (photos and jewellery), but not cockroaches and wastebaskets. To account for such inferences and groupings, strong ontology is forced to appeal to ad hoc alternative explanations.

Moreover, essentialist assumptions do not cleanly line up with domain assumptions. For instance, it is natural to distinguish the domains of living things and artifacts, yet one might be prone to assume that both living things and some artifacts have essences. This tension is clear in Keil's work (Keil, 1995). His position is close to strong ontology for he asserts a set of dimensions that normally distinguish natural kinds and artifacts, but he allows exceptions: "[T]hese contrasts with natural kinds turn out to be more like rough rules of thumb than strict criteria. Artifacts and natural kinds appear to be arrayed along several related continua rather than in sharply contrasting bins, as is seen with more complex artifacts, such as televisions, cars, and computers, and with designed living kinds, such as plants and animals subject to intensive breeding" (Keil, 1995, p. 235). Indeed, some theorists assert that all artifacts have essences, namely the intent of the artifact's creator (Bloom, 1996; Matan & Carey, 2001). Of course, if artifacts do have essences, the value of essentialism for explaining domain specific phenomena that distinguish artifacts and living things is undermined. Essentialism could account for such phenomena by positing that the essences of artifacts and living things have different content, but then it is the content, and not essentialism, that does the necessary explanatory work.

The one weakness of the strong ontology programme relative to extra-strong ontology is the inability to explain the very early acquisition of domain specific beliefs because strong ontology entails a period of causal learning of domain differences (see Rakison & Poulin-Dubois, 2001). Quinn and Eimas (1996) found that young infants distinguish animals and artifacts. Pauen (2002) has shown that 8-month-olds distinguish animals and furniture, even when some perceptual properties are equated across the two domains. However, these results might reflect perception of differences across those dimensions that were not equated (like texture), and not necessarily domain specific knowledge.

Medium ontology

Medium ontology allows that, although people are assumed to essentialise, essentialist beliefs do not necessarily line up neatly with domains. Inferences are made by virtue of generalisations over domains only insofar as domains pick out causal regularities, not by virtue of the domains per se. We believe that medium ontology represents the modal view in the literature on cognitive development, as many authors assume that a belief in essence is critical to categorisation processes while denying that essences are invariably assumed in some domains but not others (Keil, 1989, 1995; Kelemen & Carey, 2007). This is the view found, for example, in Gelman and Hirschfeld (1999). They argue that the "early and nearly parallel emergence of essentialist reasoning in these different domains is consistent with the maturation of a single conceptual bias for essentialist reasoning" (p. 421), but they suggest that the essentialist bias crosses domains: "Essentialism may not map cleanly onto domains. Events and specific entities . . . may be essentialised without essentialising the larger domain of which it is part" (p. 437).

Essentialist beliefs are described in various ways by various authors (e.g., Keil, 1989, 1995; Medin & Ortony, 1989). Gelman and Hirschfeld (1999) say a causal essence is "the substance, power, quality, process, relationships, or entity that causes other category-typical properties to emerge and be sustained and confers identity" (p. 406, emphasis in original). Gelman (2003) identifies essentialism as a set of assumptions that people, including young children, make about various domains, as follows.

1. Appearance is distinct from reality. This explains why non-obvious properties of objects can be the most central.
2. Properties come in clusters that support induction.
3. Properties have deterministic causes. The most central properties are root causes.
4. Objects maintain their identity over time. Therefore, understanding an object's origin is important.
5. People must defer to experts because one does not necessarily know an object's causal properties or origins. An essence is merely a placeholder for an ultimate cause. This forces people to accept some category anomalies.

Other construals of psychological essentialism bear a family resemblance to Gelman's definition. The notion of psychological essentialism is related to but distinct from Putnam's (1975) and Kripke's (1980) notions of linguistic essentialism. Rips (2001) points out that an essence can only serve to distinguish categories on the assumption that it constitutes a necessary and sufficient condition for category membership. Gelman (2003) calls such a defining set of features a "sortal essence" and denies that the causal essences she attributes to human thought have this property. But if a causal essence were not necessary for category membership, then there would be category members without the causal essence that would have few if any properties of the category because they would be missing the root cause of those properties. And if the causal essence were not sufficient for category membership, then there would be nonmembers of the category that nevertheless have the category essence. Because they have the causal essence, they would have most of the properties of the category and yet not be category members. Neither of these conclusions makes sense. Therefore, denying necessity and sufficiency seems to undermine the work that essences purport to accomplish, and the property of being defining should be added to Gelman's list.

Support for medium ontology

This position offers greater explanatory depth than the previous positions. It recommends a theoretical programme: that of spelling out the dimensions on which categories differ and specifying precisely how those dimensions influence classification and induction decisions, and when and how essentialist beliefs play a part. Notice that the explanatory work is done by the assertion that causal regularities are learned and not by the assumption of essentialism. The value of the essentialist assumption may be to explain what motivates people to learn those regularities.

Problems with medium ontology

Like the innate modules of extra-strong ontology, the essentialist claim seems more of a restatement of a set of phenomena than an explanation. Calling Gelman's (2003) list of insights about human thought "essentialism" does no more than label the insights. The absence of a coherent theory that puts the phenomena together in a principled way may be why different theorists draw essentialist lines in different ways.

Furthermore, several experiments have tried to operationalise the notion of essence and found it wanting. For instance, Malt (1994) found that people's beliefs about the presence or amount of H2O in liquids did not serve to distinguish what they did and did not name water. Kalish (1995) found that people were not willing to treat even natural kinds as having absolute category membership. Braisby, Franks, and Hampton (1996) found that the presence or absence of normal cat essence did not determine whether people classified as cats animals that, in some cases, turned out to be robots controlled from Mars. Gelman and Hirschfeld (1999) argue that all these operationalisations are inadequate in one way or another, but this raises the question of the value of a concept that is so difficult to operationalise.

One major weakness of this view, along with its stronger counterparts, is its assumption that essentialist beliefs are at least sometimes operative in the categorisation process. Several authors have argued that causal knowledge can do all the explanatory work done by essentialism without any appeal to essences per se (Rips, 2001; Sloman & Malt, 2003; Strevens, 2001). Arguments in favour of medium ontology often rely on the same dichotomy between domain specific conceptual processes versus domain general perceptual heuristics that one finds among extra-strong ontologists. One argument has the form "People don't classify or induce based on perceptual similarity, so they must be relying on a hidden essence. And essences have a different status in different domains, therefore classification and induction are domain specific." But this argument ignores the mild possibility, discussed below, of domain general conceptual processes that result in different patterns of categorisation and reasoning across domains.

Mild ontology

This view assumes neither that essentialist beliefs guide our classifications and inductions, nor that classifications and inductions are done differently in different domains. Instead, people have reasoning mechanisms that use whatever causal knowledge is available as a means of explaining how the world works. As in medium ontology, the assumption is not that objects are processed by virtue of their ontological domain, but rather that objects from similar domains tend to be processed in the same way because causal regularities happen to be associated with domains in the world. Instead of positing essentialism, this programme simply assumes that people are interested in explaining the world around them and they build causal models to do so. These are assumptions that must be made by medium and strong ontology as well.

A more detailed proposal about how people use causal regularities to classify and make inductions follows. Its assumptions require neither essentialism nor domain specificity and yet are consistent with the phenomena described so far. They are motivated by the literature on causal Bayesianism; Sloman (2005) offers an introduction and Gopnik and Schulz (in press) review recent advances. Scholl (2005) offers a related Bayesian (though not causal) resolution to the innateness/learning debate.

· Causal determination. Events and properties are assumed to have causes.

· Inductive leverage. Causes and effects relate to other causes and effects in systematic ways that are measurable. For one, certain patterns of causal relations map onto patterns of dependence and independence in probability distributions. For example, the heart's pumping causes oxygenated blood to travel to the limbs, which in turn causes transfer of oxygen to cells in the limbs. This explains why there is a statistical dependence between the heart pumping and oxygenated cells in the limbs. It also predicts that knowing whether or not blood is travelling to the limbs would render the other two independent. As a result of this mapping between causal structure and statistical dependence, causal structure can be induced from observation and experimental intervention. The structural properties of causal relations include such things as "explaining away": The presence of one explanation for an effect decreases the credibility of a second, independent explanation.

· Multiple scales. People are sensitive to causal structure at multiple scales. For instance, they learn and use structure involving coarse generalities like "evolution causes adaptations" as well as structure involving more specific mechanisms like "molars are for chewing."

· Precision/error trade-off. To make inductions, people choose a level of causal structure that maximises inductive strength. This involves trading off precision and accuracy. Coarse causal knowledge is likely to lead to correct but imprecise inference (e.g., organs that evolve through natural selection are likely to have some adaptive function). Fine-grained causal knowledge will lead to precise conclusions, but is more prone to error (e.g., if dogs use molars for chewing, then all animals use molars for chewing). In the language of statistics, the proportion of variance accounted for must be traded off with the number of parameters required by the model. Specific models may capture more variance, but they make more assumptions and therefore are more likely to generalise incorrectly. More generally, the best theory maximises empirical coverage (and is therefore specific) while maintaining explanatory simplicity (and is therefore not too specific).

· Pragmatic tailoring. Individuals (aided by culture) are pretty good at maximising inductive strength. For different tasks, this maximisation leads to different levels and, more generally, different relevant causal structures. The relevant structure for naming might involve satisfying some functional goal (I want to open the door, so the function of the handle is important to selecting a name) or might involve reference (I want you to hand me that particular object and so I name according to physical properties). The causal structure for making inductions about behaviour could have to do with biological predispositions (Labradors like to retrieve) or with environmental context (dogs that belong to that surly guy are all vicious).
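
The inductive leverage assumption in the list above can be illustrated numerically. The sketch below is not part of the original proposal: it assumes binary variables and invented probabilities (every number and name in it is hypothetical). It enumerates the joint distribution for the chain heart pumping, blood flow, oxygenated cells, and then for a structure in which two independent causes share one effect, to show the two patterns described: a middle variable that renders its neighbours independent, and explaining away.

```python
from itertools import product

def bern(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1.0 - p

def prob(joint, query, given=lambda *s: True):
    """P(query | given), by brute-force enumeration over three binary variables."""
    states = list(product([True, False], repeat=3))
    den = sum(joint(*s) for s in states if given(*s))
    num = sum(joint(*s) for s in states if given(*s) and query(*s))
    return num / den

# --- Causal chain: pump -> flow -> oxy (all numbers are hypothetical) ---
P_PUMP = 0.9
P_FLOW = {True: 0.95, False: 0.05}   # P(blood reaches limbs | pumping?)
P_OXY = {True: 0.90, False: 0.10}    # P(cells oxygenated | blood flow?)

def chain(pump, flow, oxy):
    return bern(P_PUMP, pump) * bern(P_FLOW[pump], flow) * bern(P_OXY[flow], oxy)

# Marginally, pumping and oxygenation are statistically dependent.
oxy_given_pump = prob(chain, lambda p, f, o: o, lambda p, f, o: p)
oxy_given_no_pump = prob(chain, lambda p, f, o: o, lambda p, f, o: not p)

# Once blood flow is known, pumping carries no further information.
oxy_given_flow_pump = prob(chain, lambda p, f, o: o, lambda p, f, o: f and p)
oxy_given_flow_no_pump = prob(chain, lambda p, f, o: o, lambda p, f, o: f and not p)

# --- Common effect: cause_a -> effect <- cause_b (noisy-OR, hypothetical) ---
P_A, P_B = 0.1, 0.1
STRENGTH = 0.8   # each cause that is present independently produces the effect

def common_effect(a, b, e):
    p_effect = 1.0 - (1.0 - STRENGTH * a) * (1.0 - STRENGTH * b)
    return bern(P_A, a) * bern(P_B, b) * bern(p_effect, e)

# Explaining away: learning that b is present lowers the credibility of a.
a_given_effect = prob(common_effect, lambda a, b, e: a, lambda a, b, e: e)
a_given_effect_and_b = prob(common_effect, lambda a, b, e: a, lambda a, b, e: e and b)

print(oxy_given_pump, oxy_given_no_pump)
print(oxy_given_flow_pump, oxy_given_flow_no_pump)
print(a_given_effect, a_given_effect_and_b)
```

Enumeration is used here only because three binary variables are small enough to list exhaustively; it is not a claim about how people, or the causal Bayesian models cited above, carry out the computation.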

Mild ontology interprets causal relations broadly. Any causal role can potentially be relevant to classification and induction. A view of this type is offered by Barsalou, Sloman, and Chaigneau (2005), who use a graphical probabilistic framework to describe the HIPE theory of the multiple levels of causal knowledge that people have about various categories. The theory describes what people know at a high level of abstraction, knowledge that cuts across domains. People believe that every object has a history (H) that causes its physical make-up (P) that, along with other necessary causes, also causes functional events (E):

H → P → E

The "I" in HIPE reflects the fact that this knowledge reflects an intentional perspective. This generic domain general knowledge implies that nonobservable features relating to the past play a central role in the category's structure, suggesting that domain-dependent knowledge is not required to explain the importance that both children and adults put on nonobservable properties when categorising. Moreover, no property plays any essential role in this causal model. What makes a representation viable is the coherence of its causal relations. Some support for the relevance of domain general reasoning comes from Schulz and Gopnik (2004), who showed that 4-year-olds can make inferences about causal structure from probabilistic data, and that the inferences are domain general in that they are made in multiple domains and across domains.

Knowledge also exists at a lower level of abstraction according to HIPE; H, P, and E can all be more fully specified for particular domains, like living things and artifacts. A summary form of slightly more detailed models is shown in Figure 5.1. Hence, according to the theory, whereas different domains are associated with different models at a particular level of abstraction, they are associated with the same model at higher levels. More important, those models are just rough characterisations of the relations among relevant causal mechanisms. The models are described using a generic language in which nodes represent events and properties, and links represent causal mechanisms. Different models could be relevant depending on the reasoner's purpose; sometimes they might be unrelated to the models depicted above.

Rehder (in press) offers a more detailed model that appears consistent with mild ontology, though at a single scale. Although he is willing to talk about "essences," Rehder's idea is that classification is mediated by knowledge of generative causal structure and that properties are important to the degree that they are consistent with causal knowledge. Rehder and Hastie (2004) offer a model of induction that appeals to causal structure without any direct appeal to belief in essences or to domain specific knowledge.
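
The claim that domains differ at one level of abstraction but share a model at a higher level can be sketched as a small data structure. The code below is only an illustration under assumptions: it encodes the generic chain from history to physical make-up to functional events as directed links, and fills in the three roles for living things and artifacts with labels taken from Figure 5.1. The assignment of labels to roles and all function names are hypothetical readings, not part of the HIPE theory as published, and the other nodes in the figure (such as goals and actions) are left out.

```python
# Generic model: nodes are roles, links are causal mechanisms (H -> P -> E).
GENERIC_LINKS = [("H", "P"), ("P", "E")]

# Assumed role fillers, read off the node labels of Figure 5.1.
DOMAIN_ROLES = {
    "living things": {"H": "natural selection",
                      "P": "physical structure",
                      "E": "behaviour"},
    "artifacts": {"H": "creator's intent",
                  "P": "physical structure",
                  "E": "functional outcome"},
}

def specify(domain):
    """Instantiate the generic links with one domain's role fillers."""
    roles = DOMAIN_ROLES[domain]
    return [(roles[cause], roles[effect]) for cause, effect in GENERIC_LINKS]

def abstract(domain, links):
    """Map a domain-specific model back onto the generic roles."""
    to_role = {label: role for role, label in DOMAIN_ROLES[domain].items()}
    return [(to_role[cause], to_role[effect]) for cause, effect in links]

living = specify("living things")
artifacts = specify("artifacts")

# The two domains have different specific models ...
print(living)
print(artifacts)
# ... but both reduce to the same generic model at the higher level.
print(abstract("living things", living) == abstract("artifacts", artifacts))
```

Keeping the generic links separate from the role fillers mirrors the point in the text: the domain-level content varies, while the structure that organises it does not.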

[Figure 5.1: two causal diagrams. Living things panel: natural selection, physical structure, behaviour, individual goals. Artifacts panel: creator's intent, physical structure, functional outcome, agent's goals, agent's actions.]

Figure 5.1 Examples of HIPE models with relevant details specified.

Support for mild ontology

This view inherits the advantages of medium ontology while making even fewer assumptions. Medium ontology assumes that people have an essentialist bias and also that they are biased to learn causal regularities in the world. Mild ontology accepts the centrality of causal structure and differs from medium ontology merely in denying that this structure arises from essentialism. Because medium ontology's explanations for the phenomena are all based on causal beliefs, not on essentialism per se, mild ontology inherits all medium ontology's explanations for the data so far discussed. That is, it assumes that children quickly learn causal regularities and use them to classify and make inductions. The only difference is that medium ontology says that children do it due to an essentialist bias, whereas mild ontology says they do it by virtue of a desire for causal explanation. As medium ontology likewise needs to assume that children seek explanations in order to pick out causal regularities, mild ontology seems to be making one fewer assumption.

To see the power of the causal learning assumptions of mild ontology, consider how they explain preference for the generic-species level for inductive projection: That level is a frequent (though not exclusive) level of preferred projection for living things due to the convergence of causal structure at that level. The convergence is largely due to the fact that this is the level that supports procreation, which has enormous causal significance with regard to evolution, common anatomical and physiological structure, genetic disease, behavioural adaptation, as well as determining current goals, social behaviours, family structure, etc. An enormous set of causal properties revolves around procreation, and that is what people are responding to when they indicate a preference to project across the generic-species level, not beliefs about the inductive potential of the level of generic species per se.

Mild ontology has the additional advantage that it is consistent with the deviations from Atran's (1998) folk-biological taxonomy. For instance, Coley et al. (1997) found that people's preferred level for naming could be higher than their preferred level of induction, the generic-species level. A causal constraint on naming that arises from its communicative demands is that all interlocutors must know the term being used. This is not a constraint on induction. Hence, the basic level of naming will sometimes be at a higher level than induction owing to differential causal constraints imposed by different categorisation tasks. A more complete analysis of the various tasks that define the basic level would reveal other differences. For example, the feature listing task (Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976) depends on knowledge that can be articulated whereas induction does not (people might know not to generalise from oaks to elms even without being able to specify how they differ). For a fuller analysis of categorisation task constraints, see Malt and Sloman (2007).

The relatively weak assumptions of mild ontology are also necessary to explain the fact that induction is often mediated by causal relations that have nothing to do with ontological knowledge. For example, Heit and Rubinstein (1994) show that induction depends not just on ontological kind, but on the requirements to support particular predicates (e.g., animals that eat at night require certain sensory capacities regardless of their species). Medin et al. (2003) found that people make causal inferences using variables that have no correspondence to folk-biological taxonomies. Inductions can be based on nontaxonomic relations like containment (diseases are more likely to be transferred from mice to owls/cats than from mice to squirrels, because owls and cats eat mice).

Mild ontology is also consistent with the many different ways that entities in any domain can be grouped into categories. For instance, in the domain of natural kinds, tomatoes and mushrooms might be categorised together as sauce ingredients by virtue of their causal properties with respect to human taste. Similarly, Malt, Sloman, Gennari, Shi, and Wang (1999) showed that artifact naming is highly language-specific, a fact that has to do with different linguistic conventions in different cultures. The different naming patterns might reflect – at least in part – different causal histories in different societies. The fact that French has no single word that corresponds to the English "jacket" may reflect societal differences in attitudes toward function and fashion. It also turns out that the judged causal centrality of properties can be manipulated by introducing new causal relations among properties both in classification (Ahn, 1998) and in induction (Hadjichristidis, Sloman, Stevenson, & Over, 2004): A property is judged more central when judges are told that other properties depend on it. These effects imply a flexible use of causal structure in classification and inference.

To be fair, essentialism does not rule out the possibility that objects can be grouped in multiple ways. But it does imply that one grouping has some priority, namely, the grouping that conforms to the proposed essences. If other, non-essentialist groupings also exist, there must be cognitive machinery to create and use these. We have claimed that this cognitive machinery involves causal models and that it can explain both "essentialist" and "non-essentialist" groupings. At this point, the onus is on the essentialist theorist to explain why additional machinery is required for the "essentialist" cases.

Problems with mild ontology

Such a broad construal of causal roles requires the mild ontologist to be explicit about causal representation and learning, because the causal roles that properties can play are too varied and unpredictable to have been encoded directly by evolution. So work must be done to explain how causal structure is learned and deployed online. Of course, a cognitive account of any of the phenomena requires this work anyway, because an appeal to evolution does not provide any direct understanding of cognitive processes or representations that govern categorisation.

Atran (1998) argues that the weakness of views like those encompassed by mild ontology (and probably medium ontology as well) is their inability to explain the "global relationships linking (e.g., generating) species and groups of species to and from one another" (p. 566). The point seems to be that cultures and individuals differ immensely in their degree of integration of causal principles and in their knowledge of causal structure, yet folk biology is largely universal. Indeed, much causal knowledge is extremely vague (Keil, 2003), making it hard to see how learning would result in common principles of taxonomic classification and inference. However, causal knowledge exists at multiple scales and even vague knowledge is knowledge. Few of the details of procreation need to be understood to appreciate that anatomical, physiological, behavioural, etc. systematicities are likely to be shared at the level that generally supports reproduction, the generic-species level. Even someone who lacks understanding or who does not believe that the generic-species level supports procreation will note all the correlated structure at that level and will thus give it priority.

Perhaps a more critical problem with the mild ontology view is that it provides no account of why people have an essentialist intuition. That is, why do people assert that some objects have a true nature that determines their kind? Why do people have disputes about whether a mule is "really" a donkey or a horse? The mild ontologist would first note that people's intuitions are hardly clear on this point. In a systematic study of people's assertions, Kalish (1995) concluded that people assert that membership in even animal categories is a matter of degree. But to the extent that intuitions about the existence of essences are real, they do not necessarily arise from the same
source as most everyday categorisation decisions (Armstrong, Gleitman, & Gleitman, 1983).

Intuitions that categories have essences could arise as an overextension of the prerequisites of communication. Communication requires that assertions have truth conditions. Successful communication depends on interlocutors assuming that either a message passer believes their message to be true or the person has some rational motivation for passing it anyway. So conversation and other forms of communication take place under the usually implicit assumption that the categories under discussion are real and enduring and that they correspond to the way the world actually is. What people often neglect is that what determines whether an assertion is true or false is just a matter of convention with respect to a particular community. People select groupings for objects, and names to capture those groupings, in order to conform to whatever conventions are relevant at the moment (conventions that sometimes include rigorous scientific criteria). When probed, instead of ascribing the validity of an assertion to these conventions, for the sake of simplicity people ascribe it to some deeper, context-independent, essential property that may play no role either in the grouping or its name.

For example, English speakers distinguish "fruit" from "vegetable" via a convention related to the role of the food in their meals (derived from properties like taste and texture), not from deeper botanical or biological properties (Malt, 1990). But people tend to believe that there is some deeper commonality that makes an apple and an orange appropriately called "fruit" in everyday English, a property not shared by eggplant or zucchini. A person's affirmation that a song is typical rhythm and blues stems from the belief that there is a set of people with expertise in music who would concur. The label indicates a desire to conform to a community. The essentialist bias refers to the persistent further intuition that the convention reflects some essential quality of R&B whether or not it does. It is simpler for people to believe that a category has a reality across contexts, and therefore an essence, than to specify the context-specific conventions that support a given assertion.

No ontology

The theories reviewed so far have not been formalised, yet a number of mathematical theories of the categorisation process have been proposed. With the exception of Rehder (in press), none makes any reference to essences or to domain specific processes (Ashby, 1992; Kruschke, 1992; Nosofsky, 1992). These theories do not make reference to ontological distinctions (for an exception see Lamberts & Shapiro, 2002), and the processing principles they advocate are completely domain general. Depending on the theory, these processing principles include the following: categories maximise the ability to predict new features (Anderson, 1991), category boundaries maximise separation among distributions of instances (Ashby),
and categorisation involves comparison to exemplars (Nosofsky, Kruschke). Although they have not addressed the domain specificity issue, the implicit claim of all these models would seem to be that domain differences do not have systematic effects on the central processes of categorisation, or at least that no explanatory leverage on processes of classification and induction is gained by drawing distinctions among ontological domains. Other theorists have proposed that classification and induction are mediated by a general causal reasoning system without claiming that the system is specifically tuned to domain differences (Lien & Cheng, 2000; Sloman, 2005). Of course, to the extent that such hypotheses would lead to models that learn causal structure that systematically varies across domains, they will effectively be implementing mild ontology.

Support for no ontology

The no ontology position inherits the advantages of mild ontology. By virtue of being completely domain general, it is consistent with the deviations in biological classification that favour mild ontology.

Problems with no ontology

Although the no ontology theories offered do not say anything inconsistent with the data reviewed, this position offers no leverage to explain the differences that do exist between natural kind and artifact categories. Many of the claims about categorisation made by these models have not been tested with objects from different domains, so their generality is unknown. In fact, most of the tests of these ideas have not even used real categories, but artificial ones based on simple physical attributes.

The data we have reviewed do seem to imply that a notion of causal structure is indispensable for explaining facts like those concerning the relative importance of internal and external properties. If differences in causal structure were uncorrelated with domain differences, then knowing an object's domain would give no purchase on understanding how people classify and make inductions with it, and the no ontology approach would be viable. But causal structure is correlated with domains, insofar as there are overall differences between the causal roles of living things (e.g., they are shaped by natural selection, they tend to be composed of homeostatic systems, see Keil, 1995) and artifacts (e.g., they are created by people, they tend to serve a human function). The mild ontology view recognises these differences and provides a means of talking about them.

The no ontology theorists have the equipment to become mild ontologists. They merely need to state explicitly how their theories distinguish the distributions of properties of, say, living things and artifacts. A superficial account could assert that there are distributional differences by stipulating, for instance, that nonobservable features are given relatively
more weight in categorisation than observable features, for living things over artifacts. A richer explanation would provide an account of how that difference comes about. However, doing so would require an account of causal learning and representation, something that most no ontology theorists have not offered.

Conclusions

Because we see no motivation for essentialism separate from causal knowledge, we do not subscribe to medium, strong, or extra-strong ontology. We do believe, however, that there are important distributional differences in the way people represent living things, artifacts, and surely other domains, such as nominal kinds. Therefore we believe that a stronger stand than no ontology is called for. This leaves mild ontology. And in fact the best explanation for the evidence that we have covered is that people use specific causal knowledge to make inductions and to classify when they can.

However, there is reason to suppose that people use generic domain knowledge when lacking specific causal knowledge. For example, Goodman (1955) posits beliefs that mediate induction called "overhypotheses" that can hold over a fairly abstract domain. An example might be "different kinds of animals have characteristic mating behaviours." Such beliefs are clearly tied directly to knowledge about a domain. Shipley (1993) provides evidence for the psychological reality of such beliefs in the process of induction.

Further evidence that taxonomic knowledge serves as a fallback in the absence of specific causal knowledge can be found in comparing novice to expert induction. American undergraduates without much knowledge of trees show the diversity effect: Evidence is treated as warranting projection to the degree that the evidence comes from dissimilar sources (Osherson, Smith, Wilkie, Lopez, & Shafir, 1990). For instance, when told that two types of tree are susceptible to a disease, the extent to which they project susceptibility to a third type depends on the degree to which the first two types are dissimilar. Tree experts, in contrast, tend not to show the diversity effect (Coley et al., 1997). Instead, they reason in terms of ecological niche. When people have specific causal knowledge, they reason on the basis of it. But when they do not, they use taxonomic knowledge that may be tied either to a domain like living things (as Osherson et al. suggest) or to a feature-based representation that uses more generic knowledge (Sloman, 1993).

A final point in favour of mild ontology augmented by domain-level generalisations or overhypotheses comes from its generality. The proposal nicely accommodates the data we have reviewed on categorisation and induction, but also extends to other judgments, like the appropriateness of causal explanations. For example, most Western adults judge teleological explanations – explanations in terms of a function or goal – as appropriate for biological kinds, but not other natural kinds (Keil, 1992; Kelemen,
1999): one says that people have eyes "for seeing," but not that there are clouds "for raining." These judgments appear to be a function of the domain whenever tasks do not provide participants with a specific causal history for the property being explained. For example, Kelemen (1999) asked children and adults questions like "why are rocks pointy?" and provided a teleological and a non-teleological explanation. Adults systematically chose the non-teleological explanation for non-biological kinds like rocks, but the teleological explanation for artifacts and biological parts. This is presumably because they made an inference about the causal process likely to have led to what was being explained, and did so on the basis of the object's domain: a function-driven process like natural selection is a plausible cause for biological parts, while seemingly random physical processes are plausible causes for non-biological natural kinds. Lombrozo and Carey (2006) presented adults with similar questions, but for which the causal history of the property being explained was provided. In such cases, judgments of the appropriateness of a teleological explanation depended only on the causal history provided, with no effect of the object's domain. Such findings suggest that previously documented domain-level differences deserve closer scrutiny in terms of underlying causal differences.

The suggestion that people use kind membership when they do not have the specific causal knowledge needed to categorise is analogous to Quine's (1970) claim that similarity devolves into theory given enough knowledge. Quine argued that people construct categories based on similarity (e.g., whales go with fish) until they have enough theoretical knowledge to support their categories (many people come to believe that whales are mammals, not fish). We are suggesting that, analogously, ontology devolves into causal structure given enough knowledge. People use causal knowledge when they can; ontological knowledge serves as a proxy otherwise. The value of domain specific knowledge depends on what kind of causal structure could mediate the particular induction or classification at hand. If an induction involves what your mother would like her coat to be made of, or what your colleague would like to be served for dinner, domain specific knowledge at the life-form level is relevant (e.g., plants are OK, animals are bad) and maybe at the specific level (Spanish tomatoes are good, Northern Ontario tomatoes are bad), not necessarily only at the generic-species level.

In conclusion, we have tried to specify how domain specific knowledge is relevant to processes of categorisation, based on an analysis of what is required to explain the extant data, and not merely on intuition. We freely admit that the belief that certain kinds are what they are by virtue of an underlying essence can be intuitively compelling. Kalish (1998) showed that both children and adults treat both animals and artifacts as objective at the basic level, and they treat animals as more objective at the superordinate level. What we deny is that this intuition tells us very much about the cognitive processes associated with classification and induction. People can have content in their beliefs about category membership that is not utilised
in their categorisation decisions. Malt (1994) has demonstrated this for a natural kind: Whether people call a substance "water" is independent of their beliefs about how much H2O it contains. Malt et al. (1999) have demonstrated it for artifacts: The way people sort containers and dishes is not determined by the names they give the objects. Despite people's claims about the properties that constrain the categories they form, the actual categories are influenced by factors they are not aware of. As a result, the content of the beliefs and knowledge people hold may differ, and yet processes of categorisation and inference may be universal.

Hence we view the contrasting assumptions of the two traditions of categorisation research as a real theoretical divide but not an unbridgeable one. We see the hope of a rapprochement between those who favour formal theories focusing on domain general mechanisms and those who grapple with the messier real world and who have been primarily concerned with essentialism and domain specificity. Greater appreciation for the role and nature of causal reasoning would give the formal theorists the power to begin to explain the deep insights into the nature of human categorisation that have come from the other side.

Acknowledgments

The authors would like to thank Tom Griffiths and Daniel Weiskopf for helpful comments on an earlier version of the manuscript.

References Ahn, W. K. (1998). Why are different features central for natural kinds and artifacts? The role of causal status in determining feature centrality. Cognition, 69, 135±178. Anderson, J. R. (1991). The adaptive nature of human categorization. Psychological Review, 98, 409±429. Armstrong, S. L., Gleitman, L. R., & Gleitman, H. (1983). On what some concepts might not be. Cognition, 13, 263±308. Ashby, F. G. (1992). Multidimensional models of categorization. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition (pp. 449±483). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Atran, S. (1990). Cognitive foundations of natural history: Towards an anthropology of science. Cambridge: Cambridge University Press. Atran, S. (1998). Folk biology and the anthropology of science: Cognitive universals and cultural particulars. Behavioral and Brain Sciences, 21, 547±610. Barsalou, L. W. (1991). Deriving categories to achieve goals. In G. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 27, pp. 1±64). San Diego, CA: Academic Press. Barsalou, L. W., Sloman, S. A., & Chaigneau, S. E. (2005). The HIPE theory of function. In L. C. E. v. d. Zee (Ed.), Functional features in language and space:

5. Domain speci®city in categorisation

127

Insights from perception, categorization, and development (pp. 131–148). New York: Oxford University Press.
Berlin, B., Breedlove, D., & Raven, P. (1973). General principles of classification and nomenclature in folk biology. American Anthropologist, 74, 214–242.
Bloom, P. (1996). Intention, history, and artifact concepts. Cognition, 60, 1–29.
Braisby, N. R., Franks, B., & Hampton, J. A. (1996). Essentialism, word use, and concepts. Cognition, 59, 247–274.
Caramazza, A., & Shelton, J. R. (1998). Domain specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.
Coley, J. D., Medin, D. L., & Atran, S. (1997). Does rank have its privilege? Inductive inferences within folkbiological taxonomies. Cognition, 64, 73–112.
Diesendruck, G., Markson, L., & Bloom, P. (2003). Children's reliance on creator's intent in extending names for artifacts. Psychological Science, 14, 164–168.
Dupré, J. (1981). Natural kinds and biological taxa. The Philosophical Review, 90, 66–90.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Gelman, S. A. (2003). The essential child. New York: Oxford University Press.
Gelman, S. A., Coley, J. D., & Gottfried, G. (1994). Essentialist beliefs in children. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind (pp. 341–365). Cambridge: Cambridge University Press.
Gelman, S. A., & Hirschfeld, L. A. (1999). How biological is essentialism? In D. L. Medin & S. Atran (Eds.), Folkbiology (pp. 403–446). Cambridge, MA: MIT Press.
Ghiselin, M. T. (1998). Folk metaphysics and the anthropology of science. Behavioral and Brain Sciences, 21, 573–574.
Goodman, N. (1955). Fact, fiction and forecast. Cambridge, MA: Harvard University Press.
Gopnik, A., & Schulz, L. E. (Eds.). (in press). Causal learning: Psychology, philosophy and computation. Oxford: Oxford University Press.
Hadjichristidis, C., Sloman, S. A., Stevenson, R. J., & Over, D. E. (2004). Feature centrality and property induction. Cognitive Science, 28, 45–74.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in inductive reasoning. Journal of Experimental Psychology: Learning Memory and Cognition, 20, 411–422.
Kalish, C. W. (1995). Essentialism and graded membership in animal and artifact categories. Memory & Cognition, 23, 335–353.
Kalish, C. W. (1998). Natural and artificial kinds: Are children realists or relativists about categories? Developmental Psychology, 34, 376–391.
Keil, F. C. (1989). Concepts, kinds and cognitive development. Cambridge, MA: MIT Press.
Keil, F. C. (1992). The origins of an autonomous biology. In M. R. Gunnar & M. Maratsos (Eds.), Modularity and constraints in language and cognition, vol. 25. Minnesota symposium on child psychology (pp. 103–138). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Keil, F. C. (1995). The growth of causal understanding of natural kinds. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary approach (pp. 234–262). New York: Oxford University Press.


Keil, F. C. (2003). Folkscience: Coarse interpretations of a complex reality. Trends in Cognitive Science, 7, 368–373.
Kelemen, D. (1999). Why are rocks pointy? Children's preference for teleological explanations of the natural world. Developmental Psychology, 35, 1440–1452.
Kelemen, D., & Carey, S. (2007). The essence of artifacts: Developing the design stance. In S. Laurence & E. Margolis (Eds.), Creations of the mind: Theories of artifacts and their representation. Oxford: Oxford University Press.
Kripke, S. (1980). Naming and necessity. Oxford: Blackwell.
Kruschke, J. K. (1992). Alcove: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Lamberts, K., & Shapiro, L. (2002). Exemplar models and category-specific deficits. In E. Forde & G. W. Humphreys (Eds.), Category-specificity in brain and mind (pp. 291–314). Hove, UK: Psychology Press.
Lien, Y., & Cheng, P. W. (2000). Distinguishing genuine from spurious causes: A coherence hypothesis. Cognitive Psychology, 40, 87–137.
Lombrozo, T., & Carey, S. (2006). Functional explanation and the function of explanation. Cognition, 99, 167–204.
Malt, B. C. (1990). Features and beliefs in the mental representation of categories. Journal of Memory and Language, 29, 289–315.
Malt, B. C. (1994). Water is not H2O. Cognitive Psychology, 27, 41–70.
Malt, B. C. (1995). Category coherence in cross-cultural perspective. Cognitive Psychology, 29, 85–148.
Malt, B. C., & Sloman, S. A. (2007). Artifact categorization: The good, the bad, and the ugly. In E. Margolis & S. Laurence (Eds.), Creations of the mind: Essays on artifacts and their representation. New York: Oxford University Press.
Malt, B. C., Sloman, S. A., Gennari, S., Shi, M., & Wang, Y. (1999). Knowing versus naming: Similarity and the linguistic categorization of artifacts. Journal of Memory and Language, 40, 230–262.
Matan, A., & Carey, S. (2001). Developmental changes within the core of artifact concepts. Cognition, 78, 1–26.
Mayr, E. (1982). The growth of biological thought. Cambridge, MA: Harvard University Press.
Medin, D. L., Coley, J. D., & Storms, G. H. B. (2003). A relevance theory of induction. Psychonomic Bulletin and Review, 10, 517–532.
Medin, D. L., & Ortony, A. (1989). Psychological essentialism. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 179–195). New York: Cambridge University Press.
Nosofsky, R. M. (1992). Exemplar-based approach to relating categorization, identification, and recognition. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition (pp. 363–393). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Osherson, D. N., Smith, E. E., Wilkie, O., Lopez, A., & Shafir, E. (1990). Category-based induction. Psychological Review, 97, 185–200.
Pauen, S. (2002). Evidence for knowledge-based category discrimination in infancy. Child Development, 73, 1016–1033.
Pinker, S. (1997). How the mind works. New York: Norton.
Putnam, H. (1975). The meaning of ‘meaning’. In K. Gunderson (Ed.), Language, mind and knowledge (pp. 131–193). Minneapolis: University of Minnesota Press.


Quine, W. V. O. (1970). Natural kinds. In N. Rescher (Ed.), Essays in honor of Carl G. Hempel (pp. 5–23). Dordrecht, The Netherlands: D. Reidel.
Quinn, P. C., & Eimas, P. D. (1996). Perceptual cues that permit categorical differentiation of animal species by infants. Journal of Experimental Child Psychology, 63, 189–211.
Rakison, D. H., & Poulin-Dubois, D. (2001). Developmental origin of the animate–inanimate distinction. Psychological Bulletin, 127, 209–228.
Rehder, B. (in press). Essentialism as a generative theory of classification. In A. Gopnik & L. E. Schulz (Eds.), Causal learning: Psychology, philosophy and computation. Oxford: Oxford University Press.
Rehder, B., & Hastie, R. (2004). Category coherence and category-based property induction. Cognition, 91, 113–153.
Rips, L. J. (2001). Necessity and natural categories. Psychological Bulletin, 127, 827–852.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.
Scholl, B. J. (2005). Innateness and (Bayesian) visual perception: Reconciling nativism and development. In P. Carruthers, S. Laurence, & S. Stich (Eds.), The structure of the innate mind (pp. 34–52). Cambridge: Cambridge University Press.
Schulz, L. E., & Gopnik, A. (2004). Causal learning across domains. Developmental Psychology, 40, 162–176.
Shipley, E. F. (1993). Categories, hierarchies, and induction. In D. L. Medin (Ed.), The psychology of learning and motivation (Vol. 30, pp. 265–301). San Diego, CA: Academic Press.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Sloman, S. A. (1998). Categorical inference is not a tree: The myth of inheritance hierarchies. Cognitive Psychology, 35, 1–33.
Sloman, S. A. (2005). Causal models: How people think about the world and its alternatives. New York: Oxford University Press.
Sloman, S. A., & Malt, B. C. (2003). Artifacts are not ascribed essences, nor are they treated as belonging to kinds. Language and Cognitive Processes, 18, 563–582.
Sloman, S. A., & Over, D. E. (2003). Probability judgment from the inside and out. In D. Over (Ed.), Evolution and the psychology of thinking: The debate (pp. 145–169). Hove, UK: Psychology Press.
Sloman, S. A., & Rips, L. J. (1998). Similarity as an explanatory construct. Cognition, 65, 87–101.
Sober, E. (1994). From a biological point of view. New York: Cambridge University Press.
Sterelny, K. (2003). Thought in a hostile world: The evolution of human cognition. Oxford: Blackwell.
Strevens, M. (2001). The essentialist aspect of naive theories. Cognition, 74, 149–175.
Templeton, A. R. (1998). Human races: A genetic and evolutionary perspective. American Anthropologist, 100, 632–650.

6

Perspectives on the “tools” of decision-making

Ben R. Newell and David R. Shanks

Decisions, decisions . . . Every day, throughout our lives, we are faced by the need to make a plethora of decisions, choices and judgements: what to have for lunch, where to go on holiday, what car to buy, whom to hire for a new faculty position, whom to marry, etc. Such examples illustrate the abundance of decisions in our lives and thus the importance of understanding the how and why of decision-making. Highlighting this range of situations emphasises the huge diversity of cognitive activities that are grouped under the general heading “decision-making”. Given such diversity, any theoretical perspective clearly needs to be wide-ranging in its purview, and readily adaptable to a variety of tasks and conditions. How can such scope and adaptability be achieved? In this chapter we examine a framework that claims to be capable of achieving precisely this, while retaining psychological plausibility. The key, according to its proponents, lies in a “collection of specialised cognitive mechanisms, that evolution has built into the mind, for specific domains of inference and reasoning” (Gigerenzer & Todd, 1999, p. 30). This chapter takes the following form: First we review the background and claims for such an adaptive toolbox, then we present summaries of empirical tests of some of the proposed tools of decision-making, and finally we sketch an alternative domain-free perspective, which, we argue, provides a better and more parsimonious account of the cognitive mechanisms underlying our judgements and decisions.

Swiss army knives, adaptive toolboxes and adjustable spanners

Over (2003) points out that many evolutionary psychologists have used tools as vivid metaphors for characterising the mind as comprising a range of specific modules – the so-called massive modularity hypothesis (e.g., Buss, 1999; Cosmides & Tooby, 1994; Pinker, 1997). For example, Cosmides and Tooby (1994) suggested that the mind be viewed like a Swiss army knife, with individual blades specialised for particular “survival-related” tasks, but no general-purpose, adaptable blade: just as we find the screwdriver more useful than a general-purpose blade for tightening screws, so the mind uses a content-specific module (e.g., a cheater-detection module), rather than a content-free mechanism, when confronted with conditional reasoning problems (Cosmides, 1989; Over, 2003). Overall, evolution favours “efficient” content-specific over “inefficient” content-independent mechanisms, and so the latter are selected against (Cosmides & Tooby, 1992).

Similar to the Swiss army knife, Gigerenzer, Todd and the ABC Group (1999; see also Chase, Hertwig, & Gigerenzer, 1998) propose a “toolbox” containing a variety of special tools but no single power-tool. The idea is that the mind has evolved mechanisms or heuristics that are suited to particular tasks, such as choosing between alternatives, categorising items, estimating quantities, selecting a mate, judging habitat quality, even determining how much to invest in one's children. Two types of domain specificity then act to determine which heuristic is used for a particular task:

There are two (overlapping) forms of domain specificity . . . that can determine heuristic choice: specific adaptive tasks, such as mate choice and parental investment; and specific inference tasks such as categorisation or estimation. Clearly, a heuristic designed to make a choice between two alternatives will not be suitable for categorisation, nor will a mate choice heuristic help in judging habitat quality. The domain specific bins in the adaptive toolbox could often only hold a single appropriate tool. (Gigerenzer & Todd, 1999, p. 32; emphasis added)

Extending the tool metaphor, Gigerenzer and Todd argue that just as a car mechanic uses specific wrenches, pliers and spanners in maintaining a car engine rather than hitting everything with a hammer, so too the mind relies on unique one-function devices to provide serviceable solutions to individual problems.

There is, however, an important distinction between the Swiss army knife and the toolbox metaphors. The blades of the Swiss army knife are discrete and encapsulated. In contrast, the heuristics in the cognitive toolbox are made up from simpler components or “building blocks” which provide possibilities for recombining and nesting to produce new heuristics. Gigerenzer and Todd note that in the same way as a handle can be added to a chopping stone to create an axe, so the building blocks of principles for information search, stopping search and deciding can be combined in different ways to construct new tools. This constructive view of the mind is important because, as we will argue, it may be simpler to propose that the elements of search, stopping and decision are all that is required, and that the combination of these basic principles within a single general mechanism (an adjustable spanner?) can do just as well in accounting for our decision-making behaviour (Newell, 2005).


A simple tool for a simple task

Although the adaptive toolbox is said to contain tools for adaptive tasks and for inference tasks, it is only the latter that have been explored in any detail in the literature. We shall return to the question of whether separate tools are needed for these two types of task later, but first we turn to a simple inference tool for determining a choice between alternatives, a tool that has stimulated a great deal of debate.

Imagine you are facing a choice between two alternatives – such as two companies to invest in – and your task is to pick the one that is better with regard to some criterion (e.g., future returns on investments). “Take-the-best” (TTB) is designed for just such a situation. TTB operates according to two principles. The first – the recognition principle – states that for any decision made under uncertainty, if only one among a range of alternatives is recognised, then the recognised alternative will be chosen. The second principle is invoked when more than one alternative is recognised, and the recognition principle therefore cannot discriminate. In such cases, people are assumed to have access to a reference class of cues or features, which are searched in descending order of feature validity until one that discriminates between alternatives is discovered. Search then stops and this single best discriminating feature is used to make the choice. The procedure is thus not rational in a formal sense because, rather than using all discriminatory pieces of information (as, for example, linear regression would), it bases its choice on a single piece. A flowchart of the processing steps used by TTB is shown in Figure 6.1.

These simple steps for searching, stopping and deciding might seem rather trivial, but Gigerenzer and Goldstein (1996) showed convincingly that the TTB procedure is as accurate as – and sometimes even slightly more accurate than – more computationally complex and time consuming ones (e.g., linear regression).
These initial results, from a task in which the goal was to decide which of two cities had a higher population, were replicated in a variety of real-world environments ranging from predicting professorial salaries to the amount of sleep engaged in by different mammals (Czerlinski, Gigerenzer, & Goldstein, 1999). TTB works by exploiting what Gigerenzer and colleagues term “ecological rationality”. This standard is concerned only with whether judgement procedures perform well in the real world (like the datasets investigated by Czerlinski et al., 1999), regardless of their adherence to formal inference methods (e.g., multiple regression, Bayes's theorem). Ecological rationality is thus concerned solely with the correspondence criterion of rationality (Hammond, 1996). The approach is an extension of the ideas of Brunswik (e.g., Brunswik, Hammond, & Stewart, 2000) and also Simon (1956), who argued that, when defining optimal behaviour, it is imperative to consider both the cognitive limitations of an organism and the role played by the environment in which the organism finds itself.
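The search, stopping and decision steps of TTB can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by Gigerenzer and Goldstein: the function name `take_the_best`, the encoding of cue values (+1, -1 and `None` for unknown) and the `"guess"` return value are assumptions made for the example, and the cues are assumed to arrive already sorted in descending order of validity.

```python
from typing import Optional, Sequence

# Hypothetical encoding of cue values for this sketch.
POSITIVE, NEGATIVE, UNKNOWN = 1, -1, None


def take_the_best(
    recognised_a: bool,
    recognised_b: bool,
    cues_a: Sequence[Optional[int]],
    cues_b: Sequence[Optional[int]],
) -> str:
    """Return "a", "b" or "guess" for a choice between two alternatives.

    cues_a and cues_b hold the cue values of each alternative, listed in
    descending order of cue validity (assumed to be known in advance).
    """
    # Recognition principle: if only one alternative is recognised, choose it.
    if recognised_a != recognised_b:
        return "a" if recognised_a else "b"
    # Neither alternative is recognised: nothing to go on, so guess.
    if not recognised_a:
        return "guess"
    # Both recognised: search cues one at a time, best cue first.
    for value_a, value_b in zip(cues_a, cues_b):
        # A cue discriminates when one value is positive and the other
        # is negative or unknown; search stops at the first such cue.
        if value_a == POSITIVE and value_b != POSITIVE:
            return "a"
        if value_b == POSITIVE and value_a != POSITIVE:
            return "b"
    # No cue discriminates: guess.
    return "guess"


# Both recognised; the first cue ties, so the second cue decides.
print(take_the_best(True, True, [POSITIVE, NEGATIVE], [POSITIVE, POSITIVE]))
```

The sketch makes the non-compensatory character of the procedure visible: once a discriminating cue is found the function returns, so no later cue can overturn the choice.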

[Figure 6.1: flowchart with the nodes “Recognition”, “Other cues known?”, “Choose the best cue”, “Choose the alternative to which the cue points” and “Guess”.]

Figure 6.1 Flowchart of the processing steps in the TTB procedure. “+” indicates a positive cue value; “−” indicates a negative cue value, and “?” indicates that the cue value is unknown. For example, if one knows that one city has a football team (+) and either knows for sure that the other does not (−) or is uncertain as to whether it has (?), then according to TTB one uses this single piece of discriminating information to make a judgement. Adapted with permission from Gigerenzer and Goldstein (1996).

The heuristics in the adaptive toolbox take account of the existence of cognitive limitations by utilising the least necessary amount of information (frugal) in the shortest time (fast). They also save us from the ignominy of poor and biased judgements by capitalising on the fit between a heuristic and an environment – i.e., they are ecologically rational (Gigerenzer, 2001). The appeal of this framework to cognitive scientists is the promise of simple, psychologically plausible procedures that, counterintuitively, perform as well as – or sometimes even better than – more complex ones. For the broader community, the temptation of easy shortcuts “that make us smart” has proved too much to resist. Consequently, the fast and frugal framework has generated extensive debate in the literature as well as being examined in a number of applied contexts (e.g., Dhami & Ayton, 2001; Dhami & Harries, 2002; Elwyn, Edwards, Eccles, & Rovner, 2001).

But does the existing evidence support the bold claims made for the toolbox of domain specific, unique, one-function devices? Two kinds of evidence can be brought to bear on this issue: theoretical arguments that question some of the premises underlying the adaptive toolbox, and empirical data that either conform to or violate the predictions of the simple heuristics.


Theoretical arguments

What is adaptive in the adaptive toolbox?

A heuristic is defined as being ecologically rational to the extent that it fits with the environment (Gigerenzer, 2001). The idea is that each heuristic, situated in its “domain specific bin”, has been shaped by natural selection to produce an accurate representation of its particular domain. The sole criterion of success for an ecologically rational heuristic is, therefore, its relative accuracy. This notion is clearly intended to be grounded in evolutionary psychology: “Ultimately, ecological rationality depends on decision-making that furthers an organism's adaptive goals in the physical or social environment” (Todd & Gigerenzer, 1999, p. 364). However, as Over (2000a, 2000b) and Stanovich and West (2003) point out, there is inherent ambiguity in defining ecological rationality in this way. First, what is meant by adaptive in the preceding quote? Schmitt and Pilcher (2004) note that the concept of adaptation can be used both as a verb (i.e., adaptation as the process of evolution) and a noun (i.e., adaptation as a product of evolution). If it is used as a noun then the adaptation can refer to “any attribute that helps a creature survive and reproduce at the moment” (p. 643), or it can refer to the product of historical evolution, that is, as features that were “functionally designed by the process of evolution by selection acting in nature in the past” (Thornhill, 1997, p. 4). In a textual analysis of Gigerenzer et al. (1999), Stanovich and West (2003) pointed out that the phrase “organism's adaptive goals” is used interchangeably to refer both to the goals of the organism at the moment, and to the goals of the organism's genes.
They go on to argue that because of this ambiguity between the genes' goals and the organism's goals it is unclear whether the physical or social environment mentioned in the quote refers to the current environment, or the environment of our hunter-gatherer predecessors (the so-called environment of evolutionary adaptedness, EEA). This ambiguity in the fundamental thesis underlying the toolbox allows proponents to adopt whichever interpretation is most convenient for the argument being made. For example, Todd, Fiddick, and Krauss (2000) claim that tools from the toolbox are adaptive in current environments “without privileging problems with fitness consequences”, and go on to state that these same tools are “good candidates for evolved mechanisms” (p. 379). How can such tools be products of natural selection if they have no fitness consequences? It appears that advocates of the toolbox approach want the biological plausibility provided by evolutionary adaptation, but choose to ignore other problematic implications of evolutionary explanations (Stanovich & West, 2003).

A case in point is the recognition heuristic – shown as the first step of the inference procedure in Figure 6.1. Goldstein and Gigerenzer (2002) describe this simple rule for selecting recognised objects as a “cognitive adaptation” (p. 88) which can be used in domains in which (1) recognition is partial (i.e., a person has heard of some but not all of the items in a domain) and (2) recognition is correlated with the criterion. It is plausible that relying solely on recognition might have conferred some evolutionary advantage on our ancestors (e.g., only eating foods that one recognises), but surely a complete unwillingness to try anything new (e.g., a different type of fuel for a fire, a different type of wood for a tool handle) would not be a successful strategy for survival. Speculation about the EEA aside, how adaptive is it to rely on recognition in our current environment? Stanovich and West (2003) note that relying solely on recognition would lead to us always buying more expensive drinks and snacks (e.g., Starbucks is more recognisable than an independent and probably cheaper coffee shop), paying higher bank fees (because larger banks charge higher fees) and incurring credit card debt instead of paying cash. None of these behaviours is adaptive, yet all of them are arguably triggered by the recognition heuristic (Stanovich & West, 2003).

Despite these intuitions about the inadvisability of relying solely on recognition, Borges, Goldstein, Ortmann, and Gigerenzer (1999) claim that recognition alone can beat the stock market and thus could be used as a fast-and-frugal investment strategy. They make much of the finding that a portfolio of stocks recognised by over 90% of Munich pedestrians beat both portfolios selected by experts and those of two benchmark mutual funds during the period December 1996 to June 1997. However, as Boyd (2001) suggested, this effect may simply have been a “big firm” effect. High capitalisation and high recognition tend to go together (Over, 2000a) and in the strong “bull market” of those months, the high capitalisation stocks of the big firms tended to do very well. Boyd (2001) emphasised this point by testing the recognition heuristic in a “bear” or down market (June to December 2000).
The results were opposite to those found by Borges et al. (1999): stocks recognised by less than 10% of the non-expert participants achieved a return 30% greater than that achieved by the stocks recognised by more than 90%, and a 20% higher return than the market index. The message from Boyd's test seems to be that the original finding of Borges et al. was a simple effect of the timing of their study. Recognition can beat the stock market, but only in very special circumstances, and in other conditions employing the heuristic would lead to very poor decision making.

In summary, it can be argued that the adaptive toolbox framework is a little too “fast and frugal in its theoretical foundations” (Over, 2000a, p. 191), especially with regard to what is meant by “adaptive” and what role evolution is thought to play in the formulation of different tools. Next we turn to arguments about the psychological plausibility of the tools.

Simulating plausibility?

The argument for the psychological reality of the inference tools in the adaptive toolbox is based on an appeal to their plausibility as mechanisms for inference. The argument goes something like this: The methods of classical rationality are time-consuming or even intractable and are thus beyond the bounds of human decision-makers. In contrast, simple mechanisms like TTB can be carried out under conditions of limited time and knowledge; simulations showing that simple models like TTB often match or outperform competing rational models in terms of accuracy are thus proof that the fast and frugal models are viable accounts of human decision processes (cf. Chater, Oaksford, Nakisa, & Redington, 2003; Gigerenzer & Todd, 1999; Lee & Cummins, 2004). Several commentators have raised questions regarding this appeal to plausibility (e.g., Bröder & Schiffer, 2003a; Chater et al., 2003; Juslin & Persson, 2002; Newell, 2005; Newell & Shanks, 2003, 2004; Newell, Weston, & Shanks, 2003; Rakow, Hinvest, Jackson, & Palmer, 2004).

One question is: How simple are the models? Figure 6.1 shows that TTB can be described as a simple three-step procedure but its successful execution relies on a large amount of precomputation. Recall that in the cases when both alternatives are recognised and the recognition principle cannot be invoked, people are assumed to search cues in descending order of “ecological validity”. But before search can begin the cues need to be hierarchically organised in validity order – how is such a hierarchy constructed? The ecological validity of a cue is defined as the relative frequency with which it selects the correct answer when applied to all pairwise comparisons in a given environment. The “drosophila” environment used by Gigerenzer and Goldstein (1996) consists of 83 German cities, and to compute the validity hierarchy for this environment the procedure (TTB) needs to ascertain whether the cue discriminates between the cities and, if it does, whether it points to the correct or wrong answer.
Juslin and Persson calculated that for a set of 80 cities TTB would need to perform 28,440 such checks in order to establish the hierarchy. As they note: “regardless of whether this [checking] is done continuously as the objects are encountered, as a matter of automatic processing of frequency, or at the time of judgment, it amounts to extensive computation” (Juslin & Persson, 2002, p. 13). As we will see in the section on empirical evidence, it turns out that participants find constructing such hierarchies difficult, and tend to be more sensitive to aspects other than a cue's ecological validity (Newell, Rakow, Weston, & Shanks, 2004; Rakow et al., 2004; Rakow, Newell, Fayers, & Hersby, 2005).

Another issue regarding the appeal to plausibility is the extent to which we should be persuaded by the argument that TTB is fast (and therefore more plausible) because it searches for fewer pieces of information. The interpretation of speed relies on assumptions about the architecture of the cognitive system. A serial architecture, in which it is assumed that information is searched sequentially at a constant rate, would show an advantage for TTB, but in a parallel architecture large amounts of information can be searched simultaneously, and therefore speed and amount of information will be unrelated (Chater et al., 2003).
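The scale of the precomputation behind the validity hierarchy can be made concrete with a short sketch. The code below is illustrative only: the function names are hypothetical, cue values are assumed to be binary (1 or 0), validity is computed over the pairs on which a cue discriminates, and the figure of nine cues per city is an assumption that is not stated in the passage above, although it does reproduce the total of 28,440 checks for 80 cities.

```python
from itertools import combinations
from math import comb
from typing import Optional, Sequence


def cue_validity(cue: Sequence[int], criterion: Sequence[float]) -> Optional[float]:
    """Proportion of discriminating pairs on which the cue points to the
    object with the larger criterion value (criterion values assumed distinct).
    Returns None when the cue never discriminates."""
    right = 0
    discriminating = 0
    for i, j in combinations(range(len(criterion)), 2):
        if cue[i] == cue[j]:
            continue  # the cue does not discriminate this pair
        discriminating += 1
        favoured = i if cue[i] > cue[j] else j
        larger = i if criterion[i] > criterion[j] else j
        if favoured == larger:
            right += 1
    return right / discriminating if discriminating else None


def pairwise_checks(n_objects: int, n_cues: int) -> int:
    """Number of (pair, cue) checks needed to rank every cue by validity."""
    return comb(n_objects, 2) * n_cues


# 80 cities give 3,160 pairs; with an assumed 9 cues that is 28,440 checks.
print(pairwise_checks(80, 9))
```

Even in this toy form, ranking the cues requires a pass over every pair of objects for every cue before a single "fast and frugal" judgement can be made.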


Chater et al. (2003) argue that given the extensive success of connectionist architectures and instance-based models as general accounts of cognitive processes (e.g., Rumelhart, McClelland, & the PDP Research Group, 1986; Nosofsky & Johansen, 2000), we should exercise considerable caution in accepting a measure of speed that presupposes a serial architecture. Juslin and Persson (2002) developed this idea by implementing and testing an instance-based procedure (PROBEX) for the types of binary choice problem for which TTB was designed. In contrast to the sequential cue-by-cue search mechanism of TTB, PROBEX (which stands for Probabilities from Exemplars) makes inferences by relying on the similarity between current and previously experienced exemplars. This similarity matching mechanism is “lazy” in that it assumes no precomputed abstractions (i.e., cue validities) and thus the model satisfies the constraints of bounded rationality. In both simulations and fits to human data, PROBEX was impressive and, Juslin and Persson argued, at least as psychologically plausible as TTB. (See also Chater et al., 2003 for other comparisons of instance-based models and TTB.)

Theoretical arguments about the viability of the adaptive toolbox framework are deeply entrenched in much wider questions regarding the nature of human cognition: the role of evolution, automatic versus controlled processing (of frequencies), using rules or similarity to make inferences, implementing connectionist or serial architectures, etc. Clearly it is beyond the scope of this chapter to go into these debates in detail; suffice it to say that to some extent the viability of the framework depends on one's theoretical predilections. Less debatable, however, is the empirical evidence for the use of the proposed tools. It is to this that we now turn.
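To make the contrast between cue-by-cue search and the instance-based alternative discussed above more concrete, the following sketch shows a generic exemplar-similarity inference. It is not PROBEX itself, whose details are not given here: the multiplicative similarity rule, the parameter `s`, the similarity-weighted average and all function names are assumptions chosen for illustration. The point it shares with the description above is that nothing is precomputed; stored exemplars are consulted only at the time of judgement.

```python
from typing import List, Sequence, Tuple

# A stored exemplar: (cue values, known criterion value). Hypothetical format.
Exemplar = Tuple[Sequence[int], float]


def similarity(probe: Sequence[int], stored: Sequence[int], s: float = 0.5) -> float:
    """Multiplicative similarity: each mismatching cue scales similarity by s (0 < s < 1)."""
    sim = 1.0
    for p, e in zip(probe, stored):
        if p != e:
            sim *= s
    return sim


def estimate(probe: Sequence[int], memory: List[Exemplar], s: float = 0.5) -> float:
    """Similarity-weighted average of the criterion values of stored exemplars."""
    weights = [similarity(probe, cues, s) for cues, _ in memory]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, memory))
    return weighted_sum / sum(weights)


def choose(probe_a: Sequence[int], probe_b: Sequence[int],
           memory: List[Exemplar], s: float = 0.5) -> str:
    """Pick the alternative with the higher estimated criterion value."""
    est_a, est_b = estimate(probe_a, memory, s), estimate(probe_b, memory, s)
    if est_a == est_b:
        return "guess"
    return "a" if est_a > est_b else "b"


# Two remembered cities: one with both cues present, one with neither.
memory = [((1, 1), 100.0), ((0, 0), 10.0)]
print(choose((1, 1), (0, 0), memory))
```

Unlike the sequential search in TTB, every cue and every stored exemplar contributes to the estimate, which is why the speed of such a procedure depends on whether the comparisons are assumed to run serially or in parallel.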

Empirical evidence

Bröder and Schiffer (2003a), in discussing the evidence for the adaptive toolbox, remarked: “We feel that plausibility is a weak advisor in the scientific endeavour and prefer empirical evidence if it is attainable” (p. 278). We wholeheartedly agree with this view. It is interesting to note that despite the seductive appeal of a set of readily testable models, the adaptive toolbox framework was, at the time it was proposed, supported by very little empirical evidence demonstrating the use of the heuristics in the environments in which they were claimed to operate – a shortcoming noted by a number of commentators (e.g., Shanks & Lagnado, 2000; Oaksford, 2000).

Since the original framework was introduced, the gulf between the bold vision and the empirical reality has begun to be filled. Researchers have started to question (1) whether there is any evidence for a set of fast and frugal heuristics contained within an “adaptive toolbox”; (2) whether we can determine how one heuristic is selected or “triggered” over another in particular situations; and (3) whether it is possible to distinguish between a toolbox of strategies and a single evidence accumulation strategy. To address these questions, we examine some of the empirical evidence for the two heuristics that have received the most attention in the literature – recognition and TTB. Figure 6.1 shows these as part of the same inference procedure, but we will examine them separately.

Recognition and the inconsequentiality of further knowledge

Most of the evidence for the recognition heuristic comes from the “cities task” in which participants are required to estimate which of two cities is larger (e.g., Goldstein & Gigerenzer, 2002). In the case of pairs in which participants recognise only one of the two cities, they tend to choose the recognised city as having a larger population approximately 90% of the time. There are at least two interpretations of how the recognition heuristic operates. One is that we rely on this when we have no other information and no possibility of obtaining further information to include in our decisions. If that is the case, then the claim is not so bold – any method, such as utility maximising, unit weighting, or “one-reason” decision-making would make the same prediction. If all that is, or could be, available is one piece of information – i.e., recognition of one object but not the other – then we should rely on this to make an inference, regardless of the mechanism underlying that decision process. However, this does not seem to be the intended interpretation. Goldstein and Gigerenzer (2002) stated that the recognition heuristic is used in a noncompensatory fashion.
Even when other information about a recognised alternative can be obtained, it never overrides the weight placed on simple recognition: “The recognition heuristic is a non-compensatory strategy: If one object is recognised and the other is not, then the inference is determined; no other information about the recognised object is searched for and therefore no other information can reverse the choice determined by recognition” (Goldstein & Gigerenzer, 2002, p. 82; emphasis added). The idea that when we recognise one object but not another we make an inference solely on the basis of recognition, with no possibility of it being overridden, is a very strong claim to make. This, however, is essential for distinguishing the heuristic from other decision-making tools. The noncompensatory nature defines its “stopping rule” (stop search when only one of two objects is recognised) and also its decision rule (choose the recognised alternative). Compensatory use of recognition, in which further cues might compensate for the information provided by this, is inconsistent with the notion of fast, frugal, one-function devices contained in the adaptive toolbox.

Is there good empirical evidence to support the non-compensatory claim? As Oppenheimer (2003) points out, given the inconsequentiality of further knowledge, the recognition heuristic predicts that individuals would judge a recognised city as larger than an unrecognised city even if the recognised city were known to be small. Oppenheimer (2003) tested this counterintuitive prediction in an experiment in which he paired cities that were recognisable (owing to their proximity to the university where the study was conducted) but known to be small (e.g., Cupertino), with fictional cities that, by definition, could not be recognised (e.g., Rhavadran). On average participants judged the local – recognised – city to be larger on only 37% of trials. This result, which contrasts starkly with the prediction of the recognition heuristic, led Oppenheimer to conclude: “people clearly are using information beyond recognition when making judgments about city size” (p. B4).

A potential criticism of Oppenheimer's study is that selective sampling of local cities known to be small may have alerted participants to the fact that the recognition heuristic was non-adaptive in that environment (Oppenheimer, 2003). Note that this heuristic is general in the sense that it is a “tool” that could be used whenever some (but not all) objects in a set are recognised; but it is also specific in the sense that its use is appropriate only when recognition is correlated positively with the criterion of interest. In Oppenheimer's experiment this positive correlation was not present, thus arguably making the experiment an unfair test of the applicability of the recognition heuristic and of whether the heuristic is used in a noncompensatory manner.

To address this issue, Newell and Fernandez (2006) returned to the original evidence that Goldstein and Gigerenzer (2002) proposed for the noncompensatory use of recognition and attempted a replication. In Goldstein and Gigerenzer's Experiment 2, participants were presented with a series of pairs of German cities and asked to choose the city they believed had the larger population.
In addition to the city names, participants were taught extra information about some of the cities in the sample that could, Goldstein and Gigerenzer argued, be incorporated into the decisions. Before beginning the cities task, participants were given a training phase in which they were told that nine of the 30 largest cities in Germany have soccer teams and that the nine cities with teams are larger than the 21 without teams in 78% of all possible pairs. They were also taught the names of four well-known cities that have soccer teams and four that do not. The critical pairs in the ensuing cities task were those that included one unrecognised city and one recognised city that did not have a soccer team. Goldstein and Gigerenzer argued that, equipped with the knowledge from the training phase, and placing no special emphasis on recognition (i.e., contrary to their own position and predictions), participants should choose the unrecognised city in such pairs. This is because, from the information given, participants could work out that if a city does not have a soccer team then, even if it is recognised, it is likely to be larger than a city with a team in only 22% of all possible pairs. Thus any chance that the unrecognised city has a soccer team should lead participants to choose against the prediction of the recognition heuristic.
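
The non-compensatory stopping and decision rules described above can be made concrete with a minimal sketch. The function name, the `other_knowledge` argument and the city labels are hypothetical illustrations, not part of Goldstein and Gigerenzer's materials; the point is only that any further knowledge is never consulted once recognition discriminates.

```python
from typing import Dict, Optional, Set


def recognition_heuristic(
    a: str,
    b: str,
    recognised: Set[str],
    other_knowledge: Optional[Dict[str, Dict[str, bool]]] = None,
) -> Optional[str]:
    """Infer which of two objects scores higher on the criterion.

    Stopping rule: stop search when exactly one object is recognised.
    Decision rule: choose the recognised object.
    `other_knowledge` is accepted but deliberately ignored, which is what
    makes this sketch non-compensatory.
    """
    a_known = a in recognised
    b_known = b in recognised
    if a_known and not b_known:
        return a
    if b_known and not a_known:
        return b
    # Recognition does not discriminate; another procedure is needed.
    return None


# Hypothetical critical pair: a recognised city known to lack a soccer team
# versus an unrecognised city.
recognised_cities = {"City R"}
knowledge = {"City R": {"soccer_team": False}}
print(recognition_heuristic("City R", "City U", recognised_cities, knowledge))
```

Under this sketch the soccer team cue cannot change the choice, so any sensitivity of participants' choices to that cue counts against a strictly non-compensatory account.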

Goldstein and Gigerenzer reported that despite being provided with this conflicting information, participants' inferences followed those of the recognition heuristic on an average of 92% of the critical pairs. This finding was their key evidence for the non-compensatory use of recognition information.

Newell and Fernandez (2006) attempted to replicate Goldstein and Gigerenzer's results but also included a condition in which participants were provided with information about the soccer teams that was higher in predictive validity. In the replication group there was a 5 in 22 chance that an unrecognised city had a soccer team; in the new condition this was raised to a 3 in 4 chance. Newell and Fernandez reasoned that if extra information was truly inconsequential then there would be no difference in reliance on recognition in the two groups: the soccer team information would be ignored regardless of its predictive validity. In stark contrast to this, participants in the 5 in 22 condition chose the recognised city in the critical pairs on 73% of occasions on average, versus only 57% for those in the 3 in 4 condition. Furthermore, Newell and Fernandez found that when participants recognised a city and knew it had a soccer team (a "corresponding pair"), recognised cities were chosen on 98% of occasions (collapsed across validity conditions), but when a city was recognised but known not to have a soccer team (a "conflicting pair"), choice of the recognised city dropped to 64%. This large difference is not predicted by the recognition heuristic: search is supposed to stop as soon as one city is recognised and the other is not, so the direction in which the soccer team cue points should not affect the extent to which the recognised city is chosen. Unfortunately, Goldstein and Gigerenzer did not report their results for corresponding pairs, so we do not know whether their participants exhibited the same behaviour (see Ayton & Önkal, 2004, for similar findings).

Newell and Fernandez' results are important because they illustrate that even in an environment that is appropriate for testing the recognition heuristic, evidence for its defining characteristic (i.e., its non-compensatory nature) is difficult to find. Without evidence of process, claims for the existence of a specific, unique recognition heuristic are weakened considerably. Next we need to consider what happens when recognition information cannot be relied on: What is the empirical evidence concerning the subsequent steps illustrated in Figure 6.1?

Do people take the best?

When both – or all – alternatives in a choice set are recognised, the TTB procedure states that people will continue searching until they discover a cue that discriminates between the alternatives. In the context of the cities task, if a person recognises Berlin and Munich, she would then consult her memory for further information (e.g., is either city the capital, does either have a university or an airport, has either hosted the Olympic Games). We noted earlier that TTB presupposes a serial search process (despite evidence for parallel processing) – but in what order should search proceed?

Search rules

TTB has a precise answer: Search in validity order, where validity is defined as the relative frequency with which a cue selects the correct answer when applied to all pairwise comparisons in a given environment. In the German cities environment this will lead a person to check the "is the city the capital?" cue first. This is likely to be highly valid (in the sense that it will lead to the correct answer) because the capital city has a very large population. However, it is also likely to be redundant in many cases because it will provide an answer only when one of the cities in the pair actually is the capital – in the vast majority of possible pairs, neither will be the capital, so this cue will be non-discriminating. In contrast, knowing whether or not the city has a university will have much less redundancy – many, but not all, cities have universities, whereas only one is the capital – but it will not be as valid an indicator because it is not always the case that universities are in the largest cities. The overall usefulness of a cue must then take account of both its validity and its redundancy – or ability to discriminate between the two options. Useful cues are those that can frequently be used to make an inference (i.e., have a high discrimination rate) and, when used, usually point in the correct direction (i.e., have high validity). In support of this, Newell et al. (2004) found that in a simulated stock market environment involving a series of predictions about pairs of companies, participants' predecision search strategies conformed to a pattern that revealed sensitivity to both the validity and discrimination rate of cues.
Given sufficient practice in the environment, participants searched through cues according to how "successful" they were for predicting the correct outcome (see Martignon & Hoffrage, 1999, for a detailed discussion and definition of "success" – it is a function of the validity and discrimination rate of cues). Thus, rather than using a "validity" search rule, participants tended to use a "success" search rule (see also Rakow et al., 2005).

Stopping rules

According to TTB, search stops once one discriminating cue has been discovered (see Figure 6.1). Bröder (2000) was the first to test whether people would indeed adhere to such a frugal strategy. In four experiments using a variety of cover stories, Bröder trained participants to learn the validities of a set of cues, and then gave a test phase in which participants made forced-choice decisions between two alternatives. In the test phase Bröder included critical pairs that revealed whether participants were relying on a single-cue stopping rule, or whether other cues (which should not affect the decision according to TTB) did indeed influence choice.
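
The cue measures discussed under search rules can be illustrated with a short sketch. It computes validity and discrimination rate over all pairwise comparisons, as defined in the text. The "success" formula used here, which treats success as the expected proportion of correct choices when only that cue is consulted and the decision maker guesses whenever the cue does not discriminate, is an assumption made for illustration; the five-city data set is hypothetical.

```python
from itertools import combinations
from typing import Any, Dict, List, Tuple


def cue_statistics(
    objects: List[Dict[str, Any]], cue: str, criterion: str
) -> Tuple[float, float, float]:
    """Return (validity, discrimination_rate, success) for one binary cue."""
    total = 0           # all pairwise comparisons
    discriminating = 0  # pairs on which the cue values differ
    correct = 0         # discriminating pairs where the cue points the right way
    for x, y in combinations(objects, 2):
        total += 1
        if x[cue] == y[cue]:
            continue
        discriminating += 1
        favoured, other = (x, y) if x[cue] > y[cue] else (y, x)
        if favoured[criterion] > other[criterion]:
            correct += 1
    discrimination_rate = discriminating / total
    # Validity is undefined if the cue never discriminates; 0.5 is a placeholder.
    validity = correct / discriminating if discriminating else 0.5
    # Assumed definition: use the cue when it discriminates, otherwise guess.
    success = discrimination_rate * validity + (1 - discrimination_rate) * 0.5
    return validity, discrimination_rate, success


# Hypothetical environment: one capital, three university cities.
cities = [
    {"name": "A", "capital": 1, "university": 1, "population": 3_500_000},
    {"name": "B", "capital": 0, "university": 1, "population": 1_800_000},
    {"name": "C", "capital": 0, "university": 0, "population": 1_300_000},
    {"name": "D", "capital": 0, "university": 1, "population": 600_000},
    {"name": "E", "capital": 0, "university": 0, "population": 500_000},
]

for cue in ("capital", "university"):
    v, d, s = cue_statistics(cities, cue, "population")
    print(f"{cue}: validity={v:.2f}, discrimination rate={d:.2f}, success={s:.2f}")
```

In this toy environment the capital cue has the higher validity but discriminates less often than the university cue, so a validity ordering and a success ordering need not agree.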

For example, confronted with the pair of alternatives A [+ − − −] versus B [− + + +] (where + is a positive cue value, − a negative cue value, and cues are listed left to right in order of validity), TTB predicts selection of alternative A because the first (most valid) cue discriminates between A and B, rendering the remaining cues irrelevant. In contrast, a compensatory strategy would choose B, because B has a greater number of positive cue values (3:1). Bröder analysed individuals' choices and reported that 28% of individuals in one experiment and 53% in another made choices roughly consistent with the use of TTB or a similar non-compensatory strategy.

In an extension of Bröder's experiments, Newell and Shanks (2003) used a process-tracing design to monitor individuals' patterns of information acquisition (see Payne, Bettman, & Johnson, 1993). Participants were told they would play the role of a stockbroker and had to choose to invest in one of two companies on each trial. In a training phase they were given the opportunity to learn the validities of four pieces of information concerning the companies' performance (e.g., employee turnover, investment in new projects). Following the training there was a test phase in which participants were no longer provided with information about the companies but had the opportunity to buy as many pieces as they wished before making their choice. According to TTB one would expect people to buy information only up to the point where it allowed discrimination between the two alternatives (i.e., a YES for one share and a NO for the other). However, people typically went beyond this, buying more information than predicted by the TTB stopping rule. Crucially, when extra information conflicted with information already acquired, participants changed their decisions accordingly.
This behaviour was especially prevalent when the relative cost of information was low (i.e., when the cost of each piece of information was small relative to the expected pay-off for a correct answer, participants tended to buy lots of information). Other participants went to the other extreme, tending to guess rather than buying any information. Newell et al. (2003) found similar results when the number of cues in the environment was either reduced to two or increased to six.

The results reported by Newell and colleagues, in conjunction with other empirical investigations (e.g., Bröder, 2000, 2003; Juslin, Jones, Olsson, & Winman, 2003; Lee & Cummins, 2004), demonstrate that TTB is clearly not universally adopted by participants – even under conditions strongly constrained to promote its use. However, some other investigations have provided tentative evidence that TTB or some similar non-compensatory heuristic may be one of the strategies that people adopt in certain situations. For example, Dhami and Ayton (2001) reported that a close variant of TTB (the matching heuristic) provided a better fit than two compensatory mechanisms for a substantial minority of lay magistrates' judgments of whether defendants should be granted bail in the English legal system. Dhami and Harries (2002) found in a medical context that this same matching heuristic did as well as logistic regression in capturing doctors' prescription decisions. Conclusions drawn on the basis of this may be a little premature, however: Bröder and Schiffer (2003b) noted that the free parameters in the heuristic led it to be the best fitting model even for data sets generated randomly! This superior fit to such data suggests that the model is doing something beyond what a psychological model should do (that is, overfitting; Cutting, 2000).

The picture painted by the empirical data suggests mixed support for TTB. There is some evidence that people use "something like" TTB some of the time (Bröder, 2000; Dhami & Ayton, 2001; Dhami & Harries, 2002), but equally a growing body of evidence suggests wide individual differences and a poor fit with TTB's core principles (Juslin & Persson, 2002; Newell & Shanks, 2003; Newell et al., 2003, 2004; Rakow et al., 2005).
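
The logic of the critical pairs used to separate TTB from compensatory strategies can be sketched as follows. This is a minimal illustration under stated assumptions: cue values are coded 1 (positive) and 0 (negative), cues are already sorted by validity, and the compensatory strategy is a simple unit-weight tally. The function names are hypothetical.

```python
from typing import Optional, Sequence


def take_the_best(a: Sequence[int], b: Sequence[int]) -> Optional[str]:
    """Search cues in validity order and decide on the first that discriminates."""
    for cue_a, cue_b in zip(a, b):
        if cue_a != cue_b:          # stopping rule: first discriminating cue
            return "A" if cue_a > cue_b else "B"
    return None                     # no cue discriminates: guess


def tally(a: Sequence[int], b: Sequence[int]) -> Optional[str]:
    """Unit-weight compensatory rule: choose the option with more positive cues."""
    if sum(a) == sum(b):
        return None
    return "A" if sum(a) > sum(b) else "B"


# A critical pair: the most valid cue favours A, the remaining cues favour B.
option_a = [1, 0, 0, 0]
option_b = [0, 1, 1, 1]
print("TTB chooses:", take_the_best(option_a, option_b))
print("Tally chooses:", tally(option_a, option_b))
```

Because the two strategies disagree on such pairs, a participant's choices on them indicate which stopping rule better describes that participant's behaviour.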

Different tools for different tasks?

Earlier we noted that proponents of the adaptive toolbox distinguish between two broad classes of tools – those for specific inference tasks and those for specific adaptive tasks. The empirical evidence reviewed so far suggests limited support for the inference task tools. What about those for the adaptive tasks?

One specific adaptive task discussed by Todd and Miller (1999; see also Miller & Todd, 1998) is that of searching for a mate. Here it appears that they are using adaptive in the sense of adaptive for the genes' goals rather than the organism's goals, although this is not stated explicitly. The procedure they propose for mate search is called "take-the-next-best". Already, one might suspect that this "specific tool" is perhaps not all that distinct from the TTB heuristic discussed earlier. A closer inspection confirms this.

Todd and Miller (1999) consider a simplified mate-search problem in which the aim is to pick the "best" mate from a known population of individuals. The decision maker needs to know the total number of individuals in the population (N) and needs to specify the number or percentage of them that he or she wishes to check out (C). The individual then needs to store this percentage and remember the highest "mate value" or dowry (D) observed in that sample. "Take-the-next-best" stipulates that the decision maker should choose the next potential mate with a dowry greater than D. Todd and Miller use simulations to demonstrate that even when low values of C are set (e.g., 14%) one still has a high chance (over 80%) of selecting a mate who has a value in the top 10% of all the individuals in the population. They thus argue that "take-the-next-best" satisfies the constraints of bounded rationality because reasonable performance can be achieved through checking only a manageable number of individuals. No empirical evidence for the use of this strategy is provided.
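
The take-the-next-best procedure described above lends itself to a small simulation. The sketch below is an illustration only, not Todd and Miller's own code: it assumes distinct mate values presented in random order, no rejection by candidates, and that the decision maker settles for the last candidate if none exceeds the aspiration level D. The function names and default parameters are hypothetical.

```python
import random
from typing import List


def take_the_next_best(values: List[float], check_fraction: float) -> float:
    """Check the first C candidates, then take the next one better than all of them."""
    n_check = max(1, int(len(values) * check_fraction))
    aspiration = max(values[:n_check])      # highest dowry D seen in the sample
    for value in values[n_check:]:
        if value > aspiration:              # stopping rule
            return value
    return values[-1]                       # assumption: settle for the last candidate


def top_decile_rate(
    n: int = 100, check_fraction: float = 0.14, trials: int = 10_000, seed: int = 0
) -> float:
    """Estimate how often the chosen mate falls in the top 10% of the population."""
    rng = random.Random(seed)
    cutoff = n - n // 10                    # values above this are in the top 10%
    hits = 0
    for _ in range(trials):
        values = list(range(1, n + 1))      # distinct mate values, n is the best
        rng.shuffle(values)
        if take_the_next_best(values, check_fraction) > cutoff:
            hits += 1
    return hits / trials


print(f"Estimated top-10% rate with C = 14%: {top_decile_rate():.2f}")
```

Running the simulation with other values of `check_fraction` gives a rough sense of how performance depends on C; the estimate can be compared with the figure Todd and Miller report, bearing in mind that the assumptions here may differ from theirs.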
The discrete search, stop and decision elements of this take-the-next-best heuristic seem to map rather closely onto the discrete aspects of the TTB heuristic. Information is ordered, a stopping rule is selected, and search terminates when the stopping rule has been satisfied. Like TTB, it also assumes a high degree of precomputation (knowing the total population) and thus its "frugality" is questionable for the same reasons. Thus, despite claims for single bins in the toolbox holding single tools for specific tasks, it is not clear that such a multitude of tools is necessary to explain our decision-making behaviour. If the phenomena of interest can be captured simply in terms of fundamental rules for accumulating information, and making decisions on the basis of that information, why does one need to posit the extra "baggage" of specific tools for specific jobs?

How many tools in the adaptive toolbox?

Our review of the empirical evidence highlights wide individual differences in the use of simple heuristics. These deviations from the deterministic procedures present a fundamental problem for the fast-and-frugal framework. Although the framework allows for people to have access to numerous strategies or heuristics in their adaptive toolbox, it assumes that it is the environment that determines strategy selection, and not the individual. Without such an assumption the framework necessarily falls foul of the homunculus problem of needing a metaheuristic to select the appropriate tool for the job (Goldstein et al., 2001). Individual differences in strategy use are not easily reconciled with the proposed role for the environment in triggering particular strategies. Why do people with the same cognitive apparatus, operating in the same environment (for which there is often a single "ecologically rational" strategy), differ (cf. Lee & Cummins, 2004)? The reason this question poses a problem for the fast-and-frugal framework is that although the models are clearly specified, there is no indication of the degree of empirical deviation permissible from the deterministic search, stopping and decision rules.
Any deviations are merely ignored as noise, allowing proponents to claim 64% of choices consistent with TTB as an excellent result (e.g., Bröder & Schiffer, 2003a) while others might regard the failure to fit 36% as a considerable problem. To advance the debate and increase the testability of the heuristics, clearer specification as to what constitutes a good fit between a heuristic and data is required. Furthermore, it is essential to incorporate an error theory into the toolbox to account for the stochastic deviation from the heuristics' deterministic rules. Without such advances we risk getting mired in arguments about the "fullness of the glass" (Newell, 2005; Newell et al., 2003).

An adjustable spanner?

Perhaps there is another way. Could these patterns of individual variability be reconciled within a single model rather than attributing behaviour to different heuristics? Lee and Cummins (2004), who studied patterns of individual variability in a test of TTB, suggested that the beginnings of a unifying explanation could come from recognising that behaviours that conform to heuristics are special cases of a more general approach to decision-making. They developed this argument by presenting an evidence-accumulation model, which relies on a random walk sequential sampling process. Such models have been applied to a wide range of tasks. Although specific mechanisms differ, in general the models assume that rather than taking a predetermined quantity of information, sampling of each option occurs until evidence sufficient to favour one over the other has been accumulated (e.g., Busemeyer & Townsend, 1993; Dror, Busemeyer, & Basola, 1999; Lee & Cummins, 2004; Nosofsky & Palmeri, 1997; Ratcliff, 1978).

The important feature of an evidence-accumulation model in the context of a binary choice problem is that it can mimic the performance of TTB's stopping rule, or the recognition heuristic, or a strategy that incorporates more evidence (e.g., a weighted additive rule) by adjusting the evidence required before a decision is made. Thus one way of explaining individual variability is to suggest that everyone uses an evidence-accumulation strategy, but some people require more evidence than others before making their decisions. Lee and Cummins (2004) found that such a unified model accounted for 85% of the decisions made by participants – more than that accounted for by either TTB or a compensatory strategy alone. Importantly, through the application of model selection criteria, it was demonstrated that the improved accuracy was not due to the additional complexity of the unified model (i.e., its two free parameters compared to the parameter-free TTB). Maybe we are all using the same tool – an adjustable spanner (Newell, 2005)?
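
How a single evidence threshold can mimic different strategies may be easier to see in a sketch. The code below is a simplified, deterministic illustration in the spirit of a sequential sampling account, not the model fitted by Lee and Cummins (2004). It assumes that cues are sampled in validity order, that each discriminating cue contributes the log-odds of its validity as evidence, and that a decision is made when the accumulated evidence reaches a positive threshold or the cues run out. The validities are hypothetical.

```python
import math
from typing import Sequence


def accumulate(
    a: Sequence[int],
    b: Sequence[int],
    validities: Sequence[float],
    threshold: float,
) -> str:
    """Accumulate signed evidence cue by cue until |evidence| >= threshold.

    Positive evidence favours A, negative evidence favours B.
    Validities must lie strictly between 0 and 1; threshold must be positive.
    """
    evidence = 0.0
    for cue_a, cue_b, validity in zip(a, b, validities):
        if cue_a != cue_b:
            weight = math.log(validity / (1 - validity))  # assumed log-odds weighting
            evidence += weight if cue_a > cue_b else -weight
        if abs(evidence) >= threshold:
            break                                         # enough evidence: stop search
    if evidence == 0:
        return "guess"
    return "A" if evidence > 0 else "B"


option_a = [1, 0, 0, 0]
option_b = [0, 1, 1, 1]
cue_validities = [0.9, 0.8, 0.7, 0.6]   # hypothetical, in descending order

print("Low threshold:", accumulate(option_a, option_b, cue_validities, 1.0))
print("High threshold:", accumulate(option_a, option_b, cue_validities, 10.0))
```

With a low threshold the first discriminating cue settles the choice, as under TTB; with a threshold too high to be reached, every cue is consulted and the choice follows the sign of the summed evidence, as under a weighted additive rule. Individual differences would then correspond to different threshold settings rather than different tools.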
A domain independent "power tool"

For a domain independent perspective to supply the favoured explanation, we need to develop empirical techniques that allow us to distinguish between the two accounts (e.g., Lee & Cummins, 2004) and also be able to specify how the evidence threshold is affected by factors such as the cost of information, time pressure, the costs and benefits of correct and incorrect decisions, and perhaps the effect of individual characteristics such as intelligence (e.g., Bröder, 2003). Even if this empirical challenge proves difficult, the parsimony afforded by such a domain independent power tool over a toolbox of heuristics should not be underestimated. If one of our goals in science is data reduction, then arguably we are more likely to achieve this by constraining a model through the empirical specification of its parameters than by generating a panoply of heuristics. Gigerenzer and Todd (1999, p. 18) are aware of the danger of too much specificity but claim that the heuristics circumvent the problem by virtue of their simplicity:

if a different heuristic were required for every slightly different decision-making environment, we would need an unworkable multitude of heuristics to reason with, and we would not be able to generalise to previously unencountered environments. Fast and frugal heuristics avoid this trap by their very simplicity which allows them to be robust in the face of environmental change and enables them to generalise well to new situations.

This admission of the importance of generalising across environments does not sit well with the claims of specific tools for specific tasks (see the quote in the introduction to this chapter). Such a "specific but a bit general" account raises the question of how different an environment needs to be before a new tool is selected – and, as noted earlier, who or what does the selecting?

In conclusion, we suggest that by encumbering themselves with the baggage of a collection of domain specific, evolutionarily programmed heuristics, proponents of the adaptive toolbox framework have painted themselves into a corner. With largely equivocal evidence for the only two heuristics to undergo detailed empirical scrutiny, claims for a toolbox of ecologically rational heuristics that are constructed and triggered into action by the properties of particular environments have been weakened. In general, the domain-free mechanism that we sketch as an alternative, while by no means a panacea, has the potential to account for the phenomena of interest far more parsimoniously.

Acknowledgements

The writing of this chapter was supported by a Discovery Project Grant from the Australian Research Council and a Faculty Research Grant from The University of New South Wales. The work was part of the programme of the UK ESRC Research Centre for Economic Learning and Social Evolution, University College London.

References

Ayton, P., & Önkal, D. (2004). Effects of ignorance and information on judgmental forecasting. Manuscript submitted for publication.
Borges, B., Goldstein, D. G., Ortmann, A., & Gigerenzer, G. (1999). Can ignorance beat the stock market? In G. Gigerenzer, P. M. Todd & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 3–34). Oxford: Oxford University Press.
Boyd, M. (2001). On ignorance, intuition and investing: A bear market test of the recognition heuristic. The Journal of Psychology and Financial Markets, 2, 150–156.

Bröder, A. (2000). Assessing the empirical validity of the "Take-The-Best" heuristic as a model of human probabilistic inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1332–1346.
Bröder, A. (2003). Decision-making with the adaptive toolbox: Influence of environmental structure, personality, intelligence, and working memory load. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 611–625.
Bröder, A., & Schiffer, S. (2003a). "Take-the-Best" versus simultaneous feature matching: Probabilistic inferences from memory and the effects of representation format. Journal of Experimental Psychology: General, 132, 277–293.
Bröder, A., & Schiffer, S. (2003b). Bayesian strategy assessment in multi-attribute decision-making. Journal of Behavioral Decision Making, 16, 193–213.
Brunswik, E., Hammond, K. R., & Stewart, T. (Eds.). (2000). The essential Brunswik: Beginnings, explications, and applications. Oxford: Oxford University Press.
Busemeyer, J. R., & Townsend, J. T. (1993). Decision field theory: A dynamic cognition approach to decision making. Psychological Review, 100, 432–459.
Buss, D. (1999). Evolutionary psychology: The new science of the mind. Boston: Allyn & Bacon.
Chase, V., Hertwig, R., & Gigerenzer, G. (1998). Visions of rationality. Trends in Cognitive Sciences, 6, 206–214.
Chater, N., Oaksford, M., Nakisa, R., & Redington, M. (2003). Fast, frugal and rational: How rational norms explain behavior. Organizational Behavior and Human Decision Processes, 90, 63–80.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 163–228). New York: Oxford University Press.
Cosmides, L., & Tooby, J. (1994). Beyond intuition and instinct blindness: Toward an evolutionarily rigorous cognitive science. Cognition, 50, 41–77.
Cutting, J. E. (2000). Accuracy, scope and flexibility of models. Journal of Mathematical Psychology, 44, 3–19.
Czerlinski, J., Gigerenzer, G., & Goldstein, D. G. (1999). How good are simple heuristics? In G. Gigerenzer, P. M. Todd & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 97–118). Oxford: Oxford University Press.
Dhami, M. K., & Ayton, P. (2001). Bailing and jailing the fast and frugal way. Journal of Behavioral Decision Making, 14, 141–168.
Dhami, M. K., & Harries, C. (2002). Fast and frugal versus regression models of human judgment. Thinking and Reasoning, 7, 5–27.
Dror, I. E., Busemeyer, J. R., & Basola, B. (1999). Decision making under time pressure: An independent test of sequential sampling models. Memory & Cognition, 27, 713–725.
Elwyn, G., Edwards, A., Eccles, M., & Rovner, D. (2001). Decision analysis in patient care. The Lancet, 358, 571–574.

Gigerenzer, G. (2001). The adaptive toolbox. In G. Gigerenzer & R. Selten (Eds.), Bounded rationality: The adaptive toolbox (pp. 37–51). Cambridge, MA: MIT Press.
Gigerenzer, G., & Goldstein, D. G. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103, 650–669.
Gigerenzer, G., & Todd, P. M. (1999). Fast and frugal heuristics: The adaptive toolbox. In G. Gigerenzer, P. M. Todd & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 3–34). Oxford: Oxford University Press.
Gigerenzer, G., Todd, P. M., & The ABC Research Group (1999). Simple heuristics that make us smart. Oxford: Oxford University Press.
Goldstein, D. G., & Gigerenzer, G. (2002). Models of ecological rationality: The recognition heuristic. Psychological Review, 109, 75–90.
Goldstein, D. G., Gigerenzer, G., Hogarth, R. M., Kacelnik, A., Kareev, Y., Klein, G., et al. (2001). Why and when do simple heuristics work? In G. Gigerenzer & R. Selten (Eds.), Bounded rationality: The adaptive toolbox (pp. 173–190). Cambridge, MA: MIT Press.
Hammond, K. R. (1996). Human judgment and social policy: Irreducible uncertainty, inevitable error, unavoidable injustice. Oxford: Oxford University Press.
Juslin, P., Jones, S., Olsson, H., & Winman, A. (2003). Cue abstraction and exemplar memory in categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 924–941.
Juslin, P., & Persson, M. (2002). Probabilities from Exemplars (PROBEX): A "lazy" algorithm for probabilistic inference from generic knowledge. Cognitive Science, 95, 1–45.
Lee, M. D., & Cummins, T. D. R. (2004). Evidence accumulation in decision making: Unifying "Take the Best" and "rational" models. Psychonomic Bulletin & Review, 11, 343–352.
Martignon, L., & Hoffrage, U. (1999). Why does one-reason decision making work? A case study in ecological rationality. In G. Gigerenzer, P. M. Todd & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 119–140). Oxford: Oxford University Press.
Miller, G. F., & Todd, P. M. (1998). Mate search turns cognitive. Trends in Cognitive Sciences, 2, 190–198.
Newell, B. R. (2005). Re-visions of rationality? Trends in Cognitive Sciences, 9, 11–15.
Newell, B. R., & Fernandez, D. (2006). On the binary quality of recognition and the inconsequentiality of further knowledge: Two critical tests of the recognition heuristic. Journal of Behavioral Decision Making, 19, 333–346.
Newell, B. R., Rakow, T., Weston, N. J., & Shanks, D. R. (2004). Search strategies in decision-making: The success of success. Journal of Behavioral Decision Making, 17, 117–137.
Newell, B. R., & Shanks, D. R. (2003). Take-the-best or look at the rest? Factors influencing "one-reason" decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 53–65.
Newell, B. R., & Shanks, D. R. (2004). On the role of recognition in decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 923–935.

Newell, B. R., Weston, N. J., & Shanks, D. R. (2003). Empirical tests of a fast and frugal heuristic: Not everyone "takes-the-best". Organizational Behavior and Human Decision Processes, 91, 82–96.
Nosofsky, R. M., & Johansen, M. J. (2000). Exemplar-based accounts of "multiple-system" phenomena in perceptual categorization. Psychonomic Bulletin & Review, 7, 375–402.
Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of speeded classification. Psychological Review, 104, 266–300.
Oaksford, M. (2000). Speed, frugality and the empirical basis of Take-the-Best. Behavioral and Brain Sciences, 23, 760–761.
Oppenheimer, D. M. (2003). Not so fast! (And not so frugal): Rethinking the recognition heuristic. Cognition, 90, B1–B9.
Over, D. E. (2000a). Ecological rationality and its heuristics. Thinking and Reasoning, 6, 182–192.
Over, D. E. (2000b). Ecological issues: A reply to Todd, Fiddick & Krauss. Thinking and Reasoning, 6, 385–388.
Over, D. E. (2003). From massive modularity to metarepresentation: The evolution of higher cognition. In D. E. Over (Ed.), Evolution and the psychology of thinking: The debate (pp. 121–144). Hove, UK: Psychology Press.
Payne, J. W., Bettman, J. R., & Johnson, E. (1993). The adaptive decision maker. New York: Cambridge University Press.
Pinker, S. (1997). How the mind works. New York: Penguin.
Rakow, T., Hinvest, N., Jackson, E., & Palmer, M. (2004). Simple heuristics from the adaptive toolbox: Can we perform the requisite learning? Thinking and Reasoning, 10, 1–29.
Rakow, T., Newell, B. R., Fayers, K., & Hersby, M. (2005). Evaluating three criteria for establishing cue-search hierarchies in inferential judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 1088–1104.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59–108.
Rumelhart, D. E., McClelland, J. L., & the PDP Research Group (1986). Parallel distributed processing: Explorations in the microstructure of cognition (Vols 1 & 2). Cambridge, MA: MIT Press.
Schmitt, D. P., & Pilcher, J. J. (2004). Evaluating evidence of a psychological adaptation. Psychological Science, 15, 643–649.
Shanks, D. R., & Lagnado, D. A. (2000). Sub-optimal reasons for rejecting optimality. Behavioral and Brain Sciences, 23, 761–762.
Simon, H. A. (1956). Rational choice and the structure of the environment. Psychological Review, 63, 129–138.
Stanovich, K. E., & West, R. F. (2003). Evolutionary versus instrumental goals: How evolutionary psychology misconceives human rationality. In D. E. Over (Ed.), Evolution and the psychology of thinking: The debate (pp. 171–230). Hove, UK: Psychology Press.
Thornhill, R. (1997). The concept of an evolved adaptation. In G. R. Bock & G. Cardew (Eds.), Characterizing human psychological adaptations (pp. 4–22). Chichester, UK: Wiley.
Todd, P. M., Fiddick, L., & Krauss, S. (2000). Ecological rationality and its contents. Thinking and Reasoning, 6, 375–384.

6. Tools of decision-making

Todd, P. M., & Gigerenzer, G. (1999). What we have learned (so far). In G. Gigerenzer, P. M. Todd & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 357–365). Oxford: Oxford University Press.
Todd, P. M., & Miller, G. F. (1999). From pride and prejudice to persuasion: Satisficing in mate search. In G. Gigerenzer, P. M. Todd & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 287–308). Oxford: Oxford University Press.

7

Domain general contributions to social reasoning: The perspective from cognitive neuroscience

Margaret C. McKinnon, Brian Levine, and Morris Moscovitch

Consider the following dilemma: You are the captain of a military submarine. An onboard explosion has caused you to lose most of your air supply and has injured one of your crew, who is quickly losing blood. The injured crew member is going to die from his wounds no matter what. There isn't enough air for the whole crew. The only way to save them is to shoot dead the injured crew member so that there will be just enough air for the rest to survive. As a participant in a research study, you are asked: "Is it OK for you to shoot the fatally injured crew member in order to save the lives of the remaining ones?"

In deciding what to do in this trying circumstance, what sort of information would you require? Some examples spring readily to mind, such as knowing the current level of distress of the injured man, how the remaining crew would react if you refused to shoot the injured crew member, and social norms regarding the killing of a young man. Along with a consideration of Western ethical principles, other types of information processing are less apparent but no less necessary to forming a response, such as the ability to hold in mind the conflicting perspectives of the injured crew member and the remaining crew (including yourself) and to inhibit a prepotent response of disgust to harming someone in your care.

Recent interest in social cognitive neuroscience has led to a growing body of research focused on identifying the neural and behavioural correlates of a wide variety of social reasoning tasks, including empathy (Eslinger, 1998), moral reasoning (Greene, Sommerville, Nystrom, Darley, & Cohen, 2001), theory of mind (ToM; Frye, Zelazo, & Palfai, 1995; Stuss, Gallup, & Alexander, 2001) and social norm violations (Moll, de Oliveira-Souza, Bramati, & Grafman, 2002a).
Shared reliance between these complex tasks – and the domain general and domain specific processing resources upon which they draw – appears likely as researchers attempt to discern how individuals represent conflicting perspectives, identify shifting patterns of emotional response, and hold in mind the key elements of social interactions over time. These are all processes required to operate effectively in the social world, and to complete the sorts of social reasoning tasks devised by research investigators.

McKinnon, Levine, Moscovitch

In this chapter, we review and integrate research surrounding domain general and domain specific contributions to social reasoning. Following a brief account of neuropsychological constructs of modules and central processes, we survey research on four different types of social reasoning: theory of mind, deontic reasoning, moral reasoning, and empathy. We follow by describing the relation between domain general and domain specific contributions to performance on these related tasks, highlighting the overlap in putative mechanisms involved. Our emphasis is on studies of patients with brain disease, and evidence from functional neuroimaging. Where available, evidence from development is also included. Our review points towards both cognitive and affective contributions to social reasoning tasks.

Modules and central systems

Modules have been described as domain specific systems that operate in an automatic and rapid fashion, producing shallow output that must be processed further by higher-order systems. Such modules are also said to be informationally encapsulated and invulnerable to the input of top-down processes (Fodor, 1983). Moscovitch and Umiltà (1990) have translated these criteria to the neuropsychological level. Domain specificity refers to the notion that modules can only accept information of a restricted or circumscribed nature. Neuropsychologically, adherence to this criterion can be demonstrated by showing that damage to a particular region or system (the equivalent of a module) leads to impairment of the modular domain, with relative sparing of function in other domains. Because central systems can also be localised to circumscribed neural regions, however, this condition is not sufficient in itself.

Informational encapsulation means that modules resist the effects of higher-order knowledge on processing and remain impenetrable to probes of their content and operation. This is demonstrated when the operation of a module is unaffected by gross intellectual decline caused by degeneration or focal damage to structures other than those mediating the module itself. For example, patients with generalised deficits caused by Alzheimer's dementia may fail to understand even simple words or appreciate the function of objects, but can still read relatively well (Schwartz, Saffran, & Marin, 1980) and have a good three-dimensional representation of objects (Moscovitch & Umiltà, 1990; Warrington & Taylor, 1978). Conversely, informational encapsulation can also be demonstrated when a domain specific deficit occurs in the face of preserved intellectual functions and semantic knowledge.
Patients with associative agnosia may not recognise an object visually but can provide detailed semantic information about the object when given its name, yet often cannot use this knowledge to identify the object visually (Moscovitch & Umiltà, 1990; Riddoch & Humphreys, 1987). Finally, shallow output cannot be interpreted beyond the value assigned to it by the module, where the processes leading to instantiation are not

7. Social reasoning and neuroscience

available to conscious inspection. Here, domain specific output is generated, but cannot be interpreted semantically, as in patients with associative agnosia, who retain the ability to process objects, faces and words at the structural, presemantic level, but cannot assign meaning to this information once computed (Bauer, 1984; Moscovitch & Umiltà, 1990; Warrington & Taylor, 1978).

Central systems, by contrast, integrate information across diverse domains, producing output that is deep or meaningful through interlevel representations that may be available to consciousness. Unlike modules, central systems are open to top-down influences, ultimately determining the meaning and relevance of the mind's contents. For example, strategic memory processes mediated by the frontal lobes assist in functions such as monitoring retrieved memories and placing them in proper spatiotemporal context with other memories (Moscovitch, 1992; Moscovitch & Winocur, 1992). Here, the frontal lobes work with memory to perform its diverse strategic functions. Importantly, the operation of central systems is domain general, available across tasks.

We have no reason to believe that central systems and modules operate exclusively of one another. Indeed, central systems may act in tandem with modular processes (e.g., Leslie, Friedman, & German, 2004).1 Early claims of dedicated and domain specific modules for social reasoning, as in ToM (Baron-Cohen, 1995) and deontic reasoning (Cosmides, 1989), however, appeared to exclude the possibility that common domain general resources, such as working memory, attention, and inhibition, contribute additionally to task performance. By contrast, certain current formulations (Leslie et al., 2004) have attempted to reconcile evidence of domain general contributions to these tasks by positing both modular and nonmodular components.
Indeed, rigorous tests, involving careful identification of the processing demands of the wide variety of social tasks reported in the literature, are required to identify accurately and to dissociate domain general and domain specific contributions to social reasoning in general.

1 Future research may determine whether maintaining and coordinating activity among modules, responses and other cognitive operations makes demands on central processes and resources. Alternatively, though modules may require very little attention or cognitive resources to operate, their operations may not be completely resource-free. Thus, some very demanding tasks may "overwhelm" the operation even of modules, even though, by strict definition, modules are not thought to have domain general resource limitations.

Evidence for domain general contributions to theory of mind and deontic reasoning

Most of the evidence presented in favour of modularity concerning social reasoning is found in the literature concerning theory of mind, the ability to understand the mental states of others, including their beliefs, desires and intentions, and to distinguish these mental states from our own. This debate has played out primarily in developmental studies of emergent ToM, where evidence for strict modularity is drawn largely from extensive research indicating that children with autism are impaired in their ability to understand the thoughts and feelings of others (e.g., Baron-Cohen, 1995). In contrast to their impaired ToM abilities, these children demonstrate relatively intact performance on matched control tasks that do not require an understanding of another's mental states (Charman & Baron-Cohen, 1992; but see Happaney & Zelazo, Chapter 11, this volume). Children with autism, however, are impaired across a wide range of tasks, including tests of executive function (e.g., Ozonoff, Pennington, & Rogers, 1991), and correlations between performance on executive functioning tasks and ToM performance are observed in this population (e.g., Zelazo, Jacques, Burack, & Frye, 2002), and also in children with Down's Syndrome (Zelazo, Burack, Benedetto, & Frye, 1996). Moreover, in normally developing children, numerous studies reveal evidence of an association between domain general central processes – such as working memory (Gordon & Olson, 1998), strategic behaviour (Hughes, 1998), mental set shifting (Frye et al., 1995), and inhibition (Carlson & Moses, 2001) – and ToM function. Indeed, most current theories of social reasoning in children, and in particular of ToM, posit the contribution of modular and nonmodular components to performance (Leslie et al., 2004; Moses, 2001).

In general, these revised theories form two camps. One view emphasises the role of executive functions (EFs) in the initial construction of a conceptual understanding that other people form desires and beliefs (the emergence account). These concepts, once formed, are thought to no longer require EFs for their operation, relying on the operation of a newly formed "module" (e.g., Moses, 2001; Perner & Lang, 1999).
Proponents of these theories view developmental changes in ToM as evidence for increases in conceptual competence, with only a minor role ascribed to general capacity change. The other view emphasises the online contribution of EFs to social reasoning (the performance account; e.g., Leslie et al., 2004). In this case, EFs are thought to remain involved with ToM in perpetuity, provided that the demands of the ToM tasks are such that they are required (e.g., if multiple levels of inhibition are necessary). Here, ToM reasoning in young children reflects the underlying contribution of both innately specified candidate beliefs (the theory of mind mechanism; ToMM) and nonmodular EFs (the selection process; SP), the latter likely inhibitory in nature. In this view, progressive changes in executive functioning lead to the ability to perform increasingly complex social reasoning tasks (e.g., first- and second-order false belief tasks – for similar formulations of cognitive capacity and performance, see Case, 1992). "Theory of mind" EFs, however, are thought to be wholly or partly distinct from the domain general executive functions subserving other tasks (e.g., formal reasoning; Leslie et al., 2004).

Evidence from cognitive neuroscience

Claims of modularity of ToM function can also be found in the neuropsychological literature, where ToM has been described as having a selective brain basis and dedicated cognitive module that is vulnerable to selective malfunction (Happé, Winner, & Brownell, 1998; Rowe, Bullock, Polkey, & Morris, 2001). Adherents to the view that ToM relies on domain specific processing resources acknowledge that ToM function may recruit a distributed network of cortical regions (e.g., Stone, Baron-Cohen, & Knight, 1998). They do not, however, think that domain general resources are crucial for ToM. This view is roughly analogous to the developmental emergence account (Moses, 2001), where social reasoning performance, past a period of early childhood development, is thought to rely preferentially on the operation of a cognitive module.

One candidate neural substrate that has received much attention in both developmental and adult neuropsychological literature is the prefrontal cortex (PFC). For example, although studies examining the direct relation between ToM and PFC development have yet to be conducted, numerous workers note that the emergence of higher levels of consciousness – including ToM and self-awareness – occurs in close proximity to the development of frontally mediated executive skills, including the ability to perform source memory tasks (Gopnik & Graf, 1988), to recall event details such as time and place (Hudson & Fivush, 1991), to perform tests of working memory and executive functioning (Case, 1992), and to maintain a consistent representation of the self across time (Povinelli, Landau, & Perilloux, 1996).
These changes co-occur with synaptic pruning of cortical grey matter, and myelination increases in young children, which may increase the overall interconnectivity and efficiency of prefrontal systems while allowing for the development of additional circuits shaped by environmental stimulation and input from other cortical areas (for a review, see Levine, 2004). Accordingly, numerous studies reveal a correlation between the onset of advanced ToM reasoning and performance on tests of executive functioning, including inhibitory control, mental flexibility, reasoning, and working memory (e.g., Hughes, 2002; Stuss & Anderson, 2004).

More direct evidence for the contribution of the PFC to ToM reasoning comes from studies of patients with damage to this region, in particular to the ventromedial PFC, who are unable to complete ToM tasks involving, for example, visual perspective taking, deception (Stuss et al., 2001) and identification of the presence of a social faux pas in a complex social scenario (Stone et al., 1998). In many cases, this deficit appears dissociable from performance on tests of more general inferential reasoning. In a pattern similar to that found in autism, however, in some patients, performance on more complex (e.g., second-order false belief) ToM tasks appears correlated with performance on measures of executive function, such as working memory (e.g., Bibby & McDonald, 2005).

Specifically, whereas studies including patients with primarily dorsolateral frontal lobe damage reveal evidence of an association between ToM and tests of executive function (e.g., Channon & Crawford, 2000; Stone et al., 1998), studies involving patients with damage to primarily medial aspects of the frontal lobes show little evidence of such an association (Rowe et al., 2001; Stuss et al., 2001). These results suggest there may be multiple routes to impairment on typical ToM tasks, with damage to those regions commonly associated with cognitive processes, such as working memory (e.g., dorsolateral PFC), likely disrupting cognitively-based aspects of task performance. By contrast, damage to regions associated primarily with emotional processing (e.g., ventromedial PFC) may disrupt aspects of task performance relying more heavily on emotional processing, an argument similar to that made historically for other social reasoning tasks, including empathy (e.g., Eslinger, 1998).

One recent study provides partial support for this conclusion. Shamay-Tsoory, Tomer, Berger, Goldsher, & Aharon-Peretz (2005) compared the performance of patients with ventromedial (VMPFC; including orbitofrontal and medial frontal regions), dorsolateral (DLPFC) and mixed (VMPFC and DLPFC) prefrontal damage on "affective" ToM tasks. In contrast to patients with dorsolateral PFC damage, the VMPFC and mixed patients were impaired on the performance of two ToM tasks described by the authors as drawing heavily on affective processing: detection of social faux pas and detection of irony. Performance on a measure of affective empathy was also impaired, leading these authors to conclude that the affective components of social reasoning (shared across tasks) are mediated, in part, by ventromedial prefrontal regions.
By contrast, "cognitive" processes (e.g., working memory, attention) involved in social reasoning may be subserved by the same dorsolateral prefrontal regions that mediate their performance across a wide variety of tasks (e.g., formal reasoning, autobiographical memory).

Other studies of patient populations have examined the contribution of the amygdala to ToM reasoning. Indeed, the amygdala has been intimately tied with emotion (Hamann, Ely, Grafton, & Kilts, 1999) and the processing of social information (Stone, Baron-Cohen, Calder, Keane, & Young, 2003). Consistent with its role in processing of both pleasant and unpleasant stimuli, the amygdala is instrumental to enhanced memory and perception for emotionally arousing events (Anderson & Phelps, 2001). These amygdala–cortical modulatory influences are thought to be supported by the substantial connections between the amygdala and polysensory cortices (Williams, Morris, McGlone, Abbott, & Mattingley, 2004), as well as the orbitofrontal cortex (Holland & Gallagher, 2004). Amygdala activation has been reported in neurologically intact controls while they judged mental states from pictures of people's eyes (Baron-Cohen et al., 1999). Moreover, in one recent study, Stone et al. (2003) reported the case of D.R. and S.E., adults with acquired bilateral amygdala damage. These participants were impaired at two different theory of mind tasks: detection of social faux pas and judging mental states from pictures of people's eyes. No evidence was reported to suggest that task complexity accounted for performance. These results complement earlier findings showing that ToM deficits arise following amygdala damage occurring in early childhood (Fine, Lumsden, & Blair, 2001), and suggest that, like PFC, the amygdala is involved in the childhood development of ToM, remaining active during adulthood.

Several neuroimaging studies (Baron-Cohen, Ring, Moriarty, Schmitz, Costa, & Ell, 1994; Gallagher, Happé, Brunswick, Fletcher, Frith, & Frith, 2000; Goel, Jordan, Sadato, & Hallett, 1995) also reveal evidence of the prefrontal lobes' involvement in ToM tasks. As for other complex tasks (e.g., autobiographical memory; Svoboda, McKinnon, & Levine, 2006), these studies reveal a diverse network of brain regions involved in both cognitive and emotional processing. Activation of regions, including the medial prefrontal (Baron-Cohen et al., 1994; Goel et al., 1995), anterior temporal (Goel et al., 1995), and anterior paracingulate cortex (Gallagher et al., 2000; McCabe, Houser, Ryan, Smith, & Trouard, 2001) – linked previously to affective processing – has been reported across several neuroimaging studies of ToM involving diverse testing paradigms (e.g., recognition of mental state terms, detection of social faux pas). Medial frontal (as well as amygdala) activation has also been reported while participants judge mental states from pictures of individual people's eyes (Baron-Cohen et al., 1999, single-perspective task). Additional regions, such as the superior temporal sulcus (linked to the perception of intentional behaviour) as well as posterior cingulate (Gallagher et al., 2000) and temporal poles (thought to be involved in the retrieval of autobiographical experiences), are also activated consistently across neuroimaging studies of ToM.
Taken together, these findings point towards ToM, in particular, and social reasoning, in general, as complex, multifaceted processes, recruiting in their service neural regions involved in affective processing, including emotion recognition, and also cognitive processing, such as working memory and attention.

Evidence from ageing

More indirect evidence for the neural and behavioural correlates of ToM reasoning comes from studies of older adults. These participants are known to have age-related decrements across a range of tasks tapping domain general resources, including executive functioning (Craik, 1977). Furthermore, PFC regions, implicated in working memory (e.g., Ragland et al., 2002), are among the first structures to deteriorate with age (Raz, 2000). Thus, correlations between older adults' ability to complete logical reasoning tasks and their performance on tests of executive function (e.g., working memory; Salthouse, 1992) suggest declines in central processing resources. Because ToM tasks appear to rely, at least in part, on these same domain general resources, it is likely that older adults should have impairment on them. Existing experiments, however, reveal a conflicting pattern of findings regarding older adults' performance. For example, whereas several studies have revealed evidence of age-related impairments on ToM tasks, others show no such impairment. We suspect that these apparent discrepancies stem from method variance across studies, similar to that observed in patients with frontal dysfunction.

Specifically, ToM deficits have been reported in older adults under conditions involving high demands to recall information (Maylor, Moulson, Muncer, & Taylor, 2002, Experiment 1). Similar age-related deficits emerge on ToM tasks involving explicit demands to integrate the conflicting perspectives of two different people, as in the knower/guesser task, where participants must reconcile the differing perspectives of the experimenter and a confederate (Saltzman, Strauss, Hunter, & Archibald, 2000). This finding is mirrored in other domains, such as moral reasoning (Pratt, Diessner, Pratt, Hunsberger, & Pancer, 1996). By contrast, relatively intact performance has been reported on ToM tasks that involve less explicit demands to hold in mind and compare conflicting pieces of information (McKinnon & Moscovitch, in press; Saltzman et al., 2000) and when older adults, ostensibly superior in verbal and intellectual functioning, are tested (Happé, Winner, & Brownell, 1998; cf. Maylor et al., 2002). Interestingly, participants with Alzheimer's disease also fail ToM tasks when demands on domain general resources appear high, as is the case for second-order false belief tasks. No such impairments are reported among this population for tasks involving lower-level reasoning (i.e., first-order false belief tasks; Gregory et al., 2002).
We recently tested older adults on ToM tasks that placed differential demands on central processing resources (McKinnon & Moscovitch, in press). If performance on social reasoning tasks in older adults relies on long-established cognitive constructs that no longer require executive functions for their operation (the emergence account), performance on tasks tapping these is likely to be relatively spared by ageing-associated depletion of central processing resources (Craik, 1994). Alternatively, if, as we suspect, performance on social reasoning tasks draws, at least in part, on the same cognitive resources and general abilities as does performance on other tasks that deteriorate with age, then older adults should show an age-related deficit on them. Moreover, this performance should be impacted differently by varying the load placed on EFs (e.g., high and low working memory) by these social reasoning tasks.

After reading complex social scenarios, normally ageing older adults were asked to answer first-order (e.g., What does A think?) and second-order (e.g., How does A think B feels?) ToM questions regarding the thoughts and feelings of characters in the scenarios. Relative to younger adults, the older adults were impaired on the second-order ToM questions that required them to hold in mind and to integrate two competing perspectives simultaneously; no such deficits were observed for first-order ToM questions where participants had to consider the thoughts and feelings of one person only (see Figure 7.1). One possible explanation for this is that the cognitive demands of the second-order ToM task impacted negatively on performance among older adults. Indeed, this finding is consistent with processing accounts of ToM reasoning (e.g., Leslie et al., 2004) in which older adults should be differentially influenced by varying levels of demands. Our results, however, are at odds with emergence accounts of social reasoning (e.g., Moses, 2001), which suggest that, after a critical period, performance of ToM tasks is subserved by long-standing constructs that are impervious to domain general processing demands.

Similar deficits have been reported in older adults on another complex social reasoning task, versions of the deontic selection task involving either an unfamiliar social contract (Cosmides, 1989) or unfamiliar hazardous conditions (Fiddick, Cosmides, & Tooby, 2000); both versions result in enhanced performance in college-aged students when compared to versions involving descriptive or abstract reasoning, leading several authors to suggest that performance on these relies on the operation of innately specified cognitive modules (see also Roberts, Chapter 1; Noveck, Mercier, & Van der Henst, Chapter 2; O'Brien, Roazzi, Athias, & Brandão, Chapter 3, this volume). In deontic selection tasks, participants select from four cards those that are necessary to solve a reasoning problem presented in the text of a story. Different types of information are displayed on each side of these cards (one side of which must be imagined) and participants consider simultaneously multiple pieces of information regarding societal rules and obligations.
When selecting from these cards, they must relate the information presented in the story to two possible outcomes inferred from the reading of its text. Simultaneous consideration of this information requires not only that previously displayed information be recalled, but also that it be integrated to determine the appropriate response selection. The demands of this task appear similar to those for second-order ToM tasks that require participants to hold in mind and compare contrasting beliefs or perspectives ("A thinks that B feels X"). Accordingly, these tasks may draw heavily on central processing resources, such as working memory, that are thought typically to mediate demands to hold in mind and integrate different pieces of information (see also Stenning & van Lambalgen, Chapter 8, this volume). Although there may be an important role for putative pre-existing cognitive biases for reasoning about situations involving danger or cheating, and for understanding the thoughts and feelings of others, it seems unlikely that additional domain general resources do not also play a role where such cognitive demands exist. Indeed, as for ToM, we suspect that potentially modular (e.g., cognitive reasoning biases) and nonmodular

[Figure 7.1: three panels, each plotting mean number of correct responses for old and young groups. Upper panel, ToM tasks: first-order vs. second-order. Middle panel, social contract tasks: social contract vs. descriptive. Lower panel, precaution tasks: precaution vs. descriptive.]

Figure 7.1 Mean scores across older and younger groups on ToM tasks (upper), social contract tasks (middle), and precaution tasks (lower).

components (e.g., working memory) contribute to performance on social versions of these deontic reasoning tasks.

One recent study, however, provides evidence of a selective deficit in reasoning about social contracts arising from brain injury. Specifically, Stone, Cosmides, Tooby, Kroll, and Knight (2002) report the case of R.M., who suffered extensive bilateral damage to the OFC, temporal pole and amygdala as the result of a bicycle accident. When R.M.'s performance was compared across social and descriptive versions of the deontic selection task, R.M. was impaired on the social contract version only. Importantly, this deficit did not emerge for hazard detection versions. While this result provides some support for the massive modularity hypothesis (Pinker, 1997), which assumes that modules (e.g., those specialised for reasoning about social contracts) are localised in specific brain regions, we speculate that additional factors could have accounted for the selective deficit observed. For example, due to enhanced familiarity, social contract versions of the deontic task may be more readily schematised or "chunked" than descriptive (or precaution) versions, reducing overall cognitive load. Because R.M.'s damage was to limbic regions involved in emotional processing, we have no reason to believe that his deficit would not extend to other social reasoning tasks that may rely on the same processing resources (e.g., emotion comprehension in ToM). Finally, in the absence of a double dissociation, there is little reason to suppose that a similar deficit in social contract reasoning would not arise in patients with damage to regions (e.g., DLPFC) involved in cognitive processing, albeit for different reasons. Hence, we asked older adults to complete both social contract and precaution versions of the deontic selection task, along with descriptive versions.
Here, we observed impairment in older adults on both versions of the deontic task (see Figure 7.1) on which younger adults have previously been shown to demonstrate benefits for social reasoning. These findings are in line with earlier demonstrations that older adults perform poorly on ostensibly unfamiliar (and presumably more difficult; cf. Pollack, Overton, Rosenfeld, & Rosenfeld, 1995) versions of deontic selection tasks (Overton, Yaure, & Ward, 1986) where domain general processing requirements also appeared to affect task performance. Had performance on these tasks relied preferentially on modular components that no longer require EFs for their expression, one would expect a preservation of performance with cognitive ageing. The opposite proved to be true in our experiments, however, with declines in performance even when social reasoning versions were tested. These results appear similar to our findings for older adults tested on ToM, where performance decrements were observed on resource-demanding second-order tasks.

Although we suspect that declines in domain general resources, such as working memory, play a substantive role in the pattern of impaired performance observed among older adults on complex social reasoning tasks, the evidence for this contribution remains indirect, based as it is on the purported causes of the age differences observed. Equally plausible is that ageing leads to deterioration of domain specific modules needed to mediate performance on social reasoning tasks.

Evidence from dual-task studies

In order to provide a direct demonstration of the contribution of domain general resources to social reasoning, we tested younger adults' performance under dual-task interference on first- and second-order ToM tasks and on social versions of the deontic selection task. Participants were asked to complete simultaneously an auditory version of the 2-back task – thought to draw heavily on domain general working memory resources – and social reasoning tasks. In a pattern broadly consistent with that which has been observed in older adults, young adults under divided attention showed impairment on both ToM and social versions of the deontic selection task. Moreover, in each case, interference was also observed on the working memory task performed simultaneously. As in older adults, the magnitude of interference was greatest for more complex tasks, which presumably drew more heavily on EFs, as is the case for second-order ToM compared to first-order. We believe these interference effects stem from competition between the working memory and social reasoning tasks for the shared central processing resources on which each relies. These results provide a direct demonstration of the contribution of domain general resources to social reasoning that is complementary to the indirect evidence from older adults. Furthermore, the results of these dual-task experiments provide additional evidence that EFs continue to contribute to social reasoning in younger adults well past early belief formation in childhood (Leslie et al., 2004). Taken together, our experiments demonstrate that limits placed on social reasoning by executive functions contribute a great deal to performance, even in old age, and in healthy younger adults under conditions of divided attention.
These findings are strongly indicative of the reliance of both types of social reasoning tasks on shared resources. Indeed, they are the first direct demonstration of this. Given this evidence, we cannot support the view – advanced by some proponents of performance accounts of ToM reasoning (Leslie et al., 2004) – that “theory of mind” executive functions specific to social reasoning in the ToM domain can be wholly or partially dissociated from domain general executive functions. We believe that future studies aimed at identifying shared processing resources across social reasoning tasks (e.g., ToM, empathy) will be important in furthering our understanding of domain specific and domain general contributions to social reasoning.

Evidence for domain general contributions to moral reasoning

Moral reasoning is another facet of social reasoning that has received much attention in the social cognitive neuroscience literature. Much like ToM,

7. Social reasoning and neuroscience

performance on moral reasoning tasks appears to be mediated by a diverse network of brain regions involved in cognitive (e.g., dorsolateral prefrontal regions) and affective (e.g., medial frontal regions) processing. Indeed, convergence between brain regions involved in moral reasoning and in ToM provides yet another provocative suggestion that the processes underlying performance on social reasoning tasks rely, in part, on shared domain general processing resources. Neuroimaging studies, utilising complex moral dilemmas similar to the submarine dilemma described earlier, indicate that moral reasoning, like ToM and deontic reasoning, is subserved by a complex network of neural regions. Notably, a number of these regions coincide with those described above in relation to ToM. These include the medial and orbito/ventromedial PFC (Moll et al., 2002a; Moll, Eslinger, & Oliveira-Souza, 2001), which together are thought to subserve domain general affective processes involving the integration of emotion into decision making, and the representation of reward and punishment, respectively (see Greene & Haidt, 2002, for a review). Similar to ToM, the amygdala (Moll et al., 2002b), a region commonly associated with the recognition of emotion, is activated in a limited number of moral reasoning imaging studies. Furthermore, in a pattern remarkable for its similarity to that for ToM, posterior cingulate, temporal poles and retrosplenial regions (Farrow et al., 2001; Greene et al., 2001; Moll et al., 2001) – which are implicated in numerous studies of autobiographical memory – are activated across neuroimaging studies of moral reasoning. Finally, dorsolateral prefrontal regions (Greene et al., 2001) – involved in working memory and other cognitive functions – appear active across these studies. However, the parietal–temporal junction – implicated in many neuroimaging studies of ToM – does not appear active in studies of moral reasoning.
Activation of this region – thought to be linked to the processing of biological motion as well as mentalising – in studies of ToM may reflect, in part, the visual–spatial (e.g., cartoons, movement sequences) nature of many of the stimuli used in these experiments. Greene et al. (2001) have examined further cognitive and affective contributions to moral reasoning by comparing performance on two types of task: those involving “personal” versus “impersonal” moral violations and judgements. Whereas “personal” moral dilemmas involve (1) inflicting serious harm, (2) onto a particular person, (3) through your own agency, an impersonal dilemma would not satisfy criterion (3). In the dilemma given earlier, a “personal” dilemma would involve having to decide whether to kill the injured crew member yourself. In an “impersonal” dilemma, the decision would involve deciding whether to order another crew member to perform the act. Greene et al. (2001) found that responding to “personal” dilemmas, as compared to “impersonal” and non-moral dilemmas, was associated with selective activation of regions associated with social/affective processing, including the medial prefrontal gyrus, posterior cingulate gyrus and the
bilateral superior temporal sulcus. By contrast, relative to “personal” dilemmas, responding to “impersonal” and non-moral dilemmas resulted in enhanced activation of dorsolateral prefrontal and parietal areas, which are associated with working memory and cognitive control operations. Moreover, subjects, on average, took longer to approve of “personal” moral violations as compared to condemning personal moral violations. Approval or condemnation of “impersonal” moral violations and non-moral judgements was also made more quickly than approval of personal violations (that require one to overcome a negative emotional response). Hence, whereas judgements of “impersonal” dilemmas may occur in a fashion akin to that for non-moral dilemmas – which recruit cognitive processes mediated by dorsolateral prefrontal cortex – reasoning about “personal” moral dilemmas may involve a more affective style of reasoning, recruiting regions associated with emotional processing. Interestingly, however, regions involved in affective processing (e.g., the amygdala and medial frontal gyrus) are also recruited on ToM reasoning tasks that, ostensibly, are impersonal. Moral reasoning has also been examined in developmental studies and in studies of patients with brain damage. Pratt et al. (1996) examined moral reasoning in older adults with presumed frontal lobe dysfunction. In a pattern similar to ToM tasks (McKinnon & Moscovitch, in press), older adults performed well on tasks that required them to think about a single perspective, but poorly on tasks that relied on an understanding of two or more people's perspectives. Interestingly, as for ToM, progressive changes in moral reasoning abilities in young children coincide with the emergence of frontally mediated, high-level, reasoning skills (see above). Studies of patients with damage to prefrontal regions also reveal evidence of frontal lobe contributions to moral reasoning.
In one influential conception, Damasio (1996) links damage to prefrontal regions – involved in affective processing – to impaired performance on tasks involving social judgement. Specifically, patients with damage to ventral and medial prefrontal regions are unable to utilise effectively “somatic markers” involving neural representations of bodily states (e.g., anxiety, tension) that, when recognised appropriately, contribute affective significance to behavioural decisions, guiding performance on complex cognitive and affective tasks. Accordingly, these patients show abnormal responses on the Iowa gambling task, thought to simulate real-life affective decision-making (see Levine, Black, Cheung, Campbell, O'Toole, & Schwartz, 2005, for domain general, cognitive contributions to this task). At the same time, these patients show abnormal responses on skin-conductance measures believed to measure autonomic arousal (e.g., while making a risky decision). Interestingly, these deficits occur in the face of intact cognitive performance (e.g., working memory, IQ) and preserved social knowledge (e.g., moral knowledge). In line with this, several studies link violent criminal behaviour to medial prefrontal and limbic dysfunction (e.g., Kiehl et al., 2001; see Greene & Haidt, 2002, for a review). Furthermore, these same regions may be
involved in a variety of psychiatric illnesses (e.g., depression) involving altered affective processing (e.g., Mayberg, 1997). Lesion studies that specifically examine moral reasoning performance are very rare. One study examined moral reasoning in two adult patients with damage to ventral, medial and polar prefrontal cortex acquired during early childhood (Anderson & Phelps, 2001). Like Damasio's patients with adult-onset damage, these patients performed poorly on the Iowa gambling task despite intact performance on other domain general cognitive tasks. In contrast to the patients with damage acquired during adulthood, however, Anderson's patients showed deficient knowledge of social and moral norms, reflected in their “preconventional” or egocentric reasoning on tests of complex moral reasoning. Moreover, these patients' real-life moral functioning was grossly impaired, with longstanding patterns of lying, stealing, reckless financial and sexual activities and verbal and physical aggression. Hence, early damage to prefrontal structures appears to impair acquisition of complex social conventions and moral rules, resulting in a pattern of behaviour resembling psychopathy (Greene & Haidt, 2002). We have recently had the opportunity to test adult patients with frontal (fvFTD) and temporal (tvFTD) variants of frontotemporal dementia (FTD; Neary et al., 1998) on tests of moral reasoning (McKinnon, Talmi et al., in preparation). In the frontal variant (fvFTD), unilateral or bilateral volume loss in the prefrontal cortex gives rise to behavioural changes. fvFTD is distinguished from other forms of dementia by the appearance of personality change, social comportment deficits, and impaired self-regulation (Miller, Darby, Benson, Cummings, & Miller, 1997).
This pattern is most apparent when damage affects the ventral portion of the prefrontal cortex (Stuss & Benson, 1986), an area particularly vulnerable to tissue loss in early-stage FTD (Rosen et al., 2002a) and implicated in numerous social reasoning tasks. By contrast, in tvFTD, unilateral right volume loss gives rise to behavioural changes (Edwards-Lee et al., 1997), whereas unilateral left volume loss gives rise to semantic dementia (SD; Hodges, Patterson, Oxbury, & Funnell, 1992). In association with this volume loss to anterior temporal regions (including the amygdala), patients with tvFTD (including SD) often have PFC damage (Rosen et al., 2002a). Recent work has shown that patients with FTD exhibit deficits in numerous domains of social reasoning, including empathy (Rankin, Kramer, & Miller, 2005), theory of mind (Channon & Crawford, 2000), and recognition of the emotional content of speech (McKinnon, Schellenberg et al., in preparation), music (McKinnon, Schellenberg et al., in preparation) and faces (Fernandez-Duque & Black, 2005; Rosen et al., 2002b). It is unclear, however, whether these deficits extend to moral reasoning. Participants with FTD, and neurologically intact control participants, completed two different moral reasoning tasks: a standardised questionnaire (Gibbs, Basinger, & Fuller, 1992) and a modified version of the personal/impersonal
moral dilemmas task used in the neuroimaging studies of moral reasoning described above (Greene et al., 2001). Preliminary results from our study suggest that patients with tvFTD and fvFTD are impaired on both tasks, providing responses that not only are concrete and inflexible, but often simply repeat well-known social dictums, such as “I wouldn't do it because it is wrong to kill”. Under most moral reasoning schemes (e.g., Colby & Kohlberg, 1987), these responses would be classified as “pre-conventional”, approached largely from the egocentric view of avoiding punishment. Interestingly, these patients respond similarly to personal and impersonal versions of the task, showing no difference in response to dilemmas that involve oneself as the agent of action (e.g., killing the injured crew member yourself) or another (e.g., ordering another crew member to do it). In a related study, we report the case of J.S. (McKinnon, Freedman, et al., in preparation), a patient with progressive prefrontal lobe damage arising from fvFTD. At the early stages of his disorder, J.S. repeatedly engaged in a pattern of controlled “lying” and strategic deception that was followed at later stages by gross confabulation of a fantastic nature. We attribute J.S.'s changes to progressive loss of frontal lobe tissue involved in cognitive processing. Specifically, at initial diagnosis, J.S.'s prefrontal damage was confined primarily to orbital and medial prefrontal cortex, accounting for his dysregulated behaviour. Subsequently, and coinciding with J.S.'s inability to maintain the complex pattern of deception and strategic lying he had exhibited earlier, damage progressed to dorsolateral prefrontal regions involved in cognitive processes, such as working memory and attention. It was at this time that we first began to note evidence of confabulation (see Gilboa & Moscovitch, 2002, for evidence of ventromedial PFC contributions to confabulation).

Evidence for domain general contributions to empathy

Empathy, like moral reasoning and ToM, is an important component of social cognition that contributes to our ability to understand and respond adaptively to others' emotions, succeed in emotional communication, and promote prosocial behaviour. The term “empathy” in general refers to the consequences of perceiving accurately the feeling state of another. Despite the prominence of this construct in developmental research (sometimes referred to as theory of mind; Sagi & Hoffman, 1976), and cross-species investigations of empathic capabilities (Rice & Gainer, 1962), the neural correlates of empathy remain elusive. Similarly to theories of ToM and moral reasoning, current theories of empathy emphasise the contribution of cognitive and affective mechanisms, including recognition of others' mental and feeling states, self–other awareness, and adoption of the subjective perspective of the other in forming an empathic response (e.g., Decety & Jackson, 2004). Collectively, these separate “systems” have been described as producing empathic responding
(Boyer & Barrett, 2004). Although writers in this field seldom discuss the notion of a cognitive “module” specialised for empathic responding, emphasis has been placed on the joint contribution of domain general processes, such as working memory, and biological “predispositions”, such as the capacity to distinguish agents from others, and to form implicit models of mental states, that are thought to be mediated by distributed networks of neural regions (Decety & Jackson, 2004). Most of the literature concerning empathy focuses on a distinction between cognitive and emotional components. These have assumed various definitions. Put simply, however, emotional empathy can be thought of as an emotional reaction (e.g., distress) to the emotional response (e.g., sadness) of another. This reaction is not dependent on a cognitive understanding of why a person is suffering (Rankin et al., 2005). By contrast, cognitive empathy involves an intellectual or imaginative apprehension of another's emotional state, often described as overlapping with the construct of theory of mind or perspective taking, and used interchangeably by some authors (Lawrence, Shaw, Baker, Baron-Cohen, & David, 2004). This cognitive awareness is thought to be independent of the emotional states engendered by an emotional empathic response. In support of this proposed dissociation, recent work with psychiatric and neurological populations shows that these two components of empathy may be affected differentially by the presence of neurological (Rankin et al., 2005) or psychiatric disease (Kaplan & Arbuthnot, 1985). For example, Eslinger (1998) reported that whereas focal lesion patients with primarily dorsolateral prefrontal damage suffer impairments in cognitive (e.g., perspective taking), but not emotional empathy, patients with damage limited to primarily orbital prefrontal cortex show the reverse effect.
Similar results were reported by Grattan, Bloomer, Archambault, and Eslinger (1994), who found that lesions to the orbitofrontal cortex impaired empathic ability, but had little effect on executive functioning. By contrast, lesions to dorsolateral frontal cortex showed the reverse effect. More recently, Rankin et al. (2005) reported that patients with fvFTD involving diffuse damage to orbitofrontal, medial, and dorsal prefrontal regions show impairments of cognitive empathy. Patients with the temporal variant (tvFTD), involving damage restricted primarily to the anterior temporal lobes, amygdala and ventromedial orbitofrontal regions, however, were impaired at both cognitive and affective components of empathic responding (see Perry, Rosen, Kramer, Beer, Levenson, & Miller, 2001, for similar findings). Interestingly, patients with Alzheimer's dementia, involving damage to more posterior brain regions, showed no such impairment. Both components of empathic responding correlated with measures of executive function (semantic and phonological fluency; abstract reasoning). Importantly, however, the Eslinger, Grattan, and Rankin studies relied on informant (significant other) responses to self-report questionnaires. These scales, the Hogan Empathy Scale and the Interpersonal Reactivity Index, demonstrate questionable
reliability and validity (Froman & Peloquin, 2001). Moreover, no norms are available for informant versions of these measures. Other workers have suggested that empathy may be subserved by two dissociable networks described by Preston and de Waal (2002). The first, involved in perception and emotion regulation, is thought to comprise the amygdala, cingulate and orbitofrontal cortices. The second, necessary for holding in mind and manipulating social information, comprises dorsolateral (and ventromedial) prefrontal regions. Farrow et al. (2001) attempted to distinguish empathic judgements from those involved in inferring others' intentions. Relative to the baseline inference task (which may have engaged ToM reasoning), empathic responding resulted in increased activation of anterior temporal, dorsolateral and orbitofrontal regions, and the precuneus, suggesting the engagement of regions involved in both the cognitive and affective components of empathic processing, as described by Preston and de Waal's model. Notably, these regions are similar to those activated in the ToM and moral reasoning studies described above. Other research suggests that having an empathic response to someone else's experience (e.g., disgust, touch) may involve activation of the same neuronal networks as would be involved in experiencing the event oneself (Keysers, Wicker, Gazzola, Anton, Fogassi, & Gallese, 2004). Singer and Frith (2005) reported that the same anterior cingulate and anterior insular regions that are activated by a painful stimulus applied to oneself are activated when one is told that a loved one is receiving a painful stimulus. Because no such activation was observed in primary somatosensory cortex – involved in motor responses to pain – the authors concluded that empathy for pain engenders an affective, but not sensory, response.
By contrast, Avenanti, Bueti, Galati, and Aglioti (2005) showed that watching a needle prick an unknown confederate's hand is associated with reduced motor excitability, as probed with transcranial magnetic stimulation, in the same muscle of the observer: the same response found when participants experience pain themselves. Taken together, these studies suggest that the neural correlates of empathy for strangers and for loved ones may differ with the level of affective significance assigned to the target, with regions involved in affective processing (e.g., cingulate) engaged uniquely in response to feeling empathy for a person of high emotional significance to the perceiver.

Summary and conclusions

In this chapter, we undertake to provide a review of the current corpus of neuropsychological evidence surrounding domain general and domain specific contributions to performance on four different social reasoning tasks: theory of mind, deontic reasoning, moral reasoning, and empathy. Taken together, studies involving patients with brain disorders, functional neuroimaging, children, and normally ageing adults, set boundary conditions on what it means for social reasoning to be modular. The evidence
suggests that both domain general and domain specific components play a crucial role in task performance. Certainly, social reasoning performance can be shown to be affected by damage to, or competition for, central processing resources that, presumably, would have little or no involvement in task performance if execution were mediated only by the modular components of social reasoning. This view is complementary to the elegant set of constraints already posited in the developmental literature for ToM (e.g., Leslie et al., 2004) where both modular and nonmodular components are thought to underlie task performance. At present, however, there is little reason to believe that modular components are not recruited when central processing demands are low on social reasoning tasks (e.g., first-order ToM tasks). Indeed, select modular components of social reasoning (e.g., emotion comprehension; see Winston, Strange, O'Doherty, & Dolan, 2002, for evidence of the automaticity of this function) may operate in tandem with nonmodular components, even under high levels of executive functioning demands. We believe that additional studies involving other social reasoning paradigms and different subject populations may reveal further whether potentially modular aspects of social reasoning act in tandem (or in isolation) with domain general resources, such as working memory, inhibition and attention, to satisfy the multiple processing requirements of typical social reasoning tasks. Another question, often occupying investigators concerned with issues of modularity in speech and language perception, is whether domain general effects operate on the output of the modules, or on the modules themselves. The former would allow for the existence of modules, whereas the latter would question their very existence.
In addition to highlighting the shared contribution of domain general and domain specific resources to social reasoning tasks, our review suggests areas of potential overlap between cognitive and affective processing resources that contribute to performance across diverse social reasoning tasks. Neuroimaging and patient studies of ToM, moral reasoning and empathy implicate the same neural regions linked to domain general processes that serve diverse cognitive (e.g., dorsolateral prefrontal cortex), affective (e.g., orbitofrontal and medial frontal; amygdala) and memory (e.g., posterior cingulate, temporal poles, and retrosplenial regions) functions. These same neural regions appear implicated in emergent social reasoning in young children and in declines in social reasoning performance in older adults. Future studies aimed at identifying shared processing resources across social reasoning tasks (e.g., ToM, empathy) will be important in furthering our understanding of domain specific and domain general contributions to social reasoning. Additional studies may reveal neural regions (e.g., temporal–parietal junction) activated uniquely in response to the social reasoning tasks described here. On balance, however, our analysis points towards social reasoning as a complex, multifaceted form of reasoning that recruits a wide variety of processing resources in its service.

References

Anderson, A. K., & Phelps, E. A. (2001). Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature, 411, 305–309.
Avenanti, A., Bueti, D., Galati, G., & Aglioti, S. M. (2005). Transcranial magnetic stimulation highlights the sensorimotor side of empathy for pain. Nature Neuroscience, 8, 955–960.
Baron-Cohen, S. (1995). Mindblindness. Cambridge, MA: MIT Press.
Baron-Cohen, S., Ring, H. A., Moriarty, J., Schmitz, B., Costa, D., & Ell, P. (1994). Recognition of mental state terms. Clinical findings in children with autism and a functional neuroimaging study of normal adults. British Journal of Psychiatry, 165, 640–649.
Baron-Cohen, S., Ring, H. A., Wheelwright, S., Bullmore, E. T., Brammer, M. J., Simmons, A., et al. (1999). Social intelligence in the normal and autistic brain: An fMRI study. European Journal of Neuroscience, 11, 1891–1898.
Bauer, R. M. (1984). Autonomic recognition of names and faces in prosopagnosia: A neuropsychological application of the guilty knowledge test. Neuropsychologia, 22, 457–469.
Bibby, H., & McDonald, S. (2005). Theory of mind after traumatic brain injury. Neuropsychologia, 43, 99–114.
Boyer, P., & Barrett, H. C. (2004). Evolved intuitive ontology: Integrating neural, behavioural, and developmental aspects of domain specificity. In D. Buss (Ed.), Handbook of Evolutionary Psychology (pp. 96–188). Cambridge, MA: MIT Press.
Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and children's theory of mind. Child Development, 72, 1032–1053.
Case, R. (1992). The mind's staircase: Exploring the conceptual underpinnings of children's thoughts and knowledge. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Channon, S., & Crawford, S. (2000). The effects of anterior lesions on performance on a story comprehension test: Left anterior impairment on a theory of mind-type task. Neuropsychologia, 38, 1006–1017.
Charman, T., & Baron-Cohen, S. (1992). Understanding drawings and beliefs: A further test of the metarepresentation theory of autism: A research note. Journal of Child Psychology & Psychiatry, 33, 1105–1112.
Colby, A., & Kohlberg, L. (1987). The measurement of moral judgement. Cambridge: Cambridge University Press.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Craik, F. I. M. (1977). Age differences in human memory. In J. E. Birren & K. W. Schaie (Eds.), Handbook of the psychology of aging (pp. 384–420). Princeton, NJ: Van Nostrand Reinhold.
Craik, F. I. M. (1994). Memory changes in normal aging. Current Directions in Psychological Science, 3, 155–158.
Damasio, A. R. (1996). The somatic marker hypothesis and the possible functions of the prefrontal cortex. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 351, 1413–1420.
Decety, J., & Jackson, P. L. (2004). The functional architecture of human empathy. Behavioral & Cognitive Neuroscience Reviews, 3, 406–412.
Edwards-Lee, T., Miller, B. L., Benson, D. F., Cummings, J. L., Russell, G. L., Boone, K., et al. (1997). The temporal variant of frontotemporal dementia. Brain, 120, 1027–1040.
Eslinger, P. J. (1998). Neurological and neuropsychological bases of empathy. European Journal of Neurology, 39, 193–199.
Farrow, T. F., Zheng, Y., Wilkinson, I. D., Spence, S. A., Deakin, J. F., Tarrier, N., et al. (2001). Investigating the functional anatomy of empathy and forgiveness. Neuroreport, 12, 2433–2438.
Fernandez-Duque, D., & Black, S. E. (2005). Impaired recognition of negative facial emotions in patients with frontotemporal dementia. Neuropsychologia, 43, 1673–1687.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: the role of domain specific representations and inferences in the Wason selection task. Cognition, 77, 1–79.
Fine, C., Lumsden, J., & Blair, R. J. (2001). Dissociation between “theory of mind” and executive functions in a patient with early left amygdala damage. Brain, 124, 287–298.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: Bradford.
Froman, R. D., & Peloquin, S. M. (2001). Rethinking the use of the Hogan Empathy Scale: a critical psychometric analysis. American Journal of Occupational Therapy, 55, 566–572.
Frye, D., Zelazo, P. D., & Palfai, T. (1995). Theory of mind and rule-based reasoning. Cognitive Development, 10, 483–527.
Gallagher, H. L., Happé, F., Brunswick, N., Fletcher, P. C., Frith, U., & Frith, C. D. (2000). Reading the mind in cartoons and stories: An fMRI study of “theory of mind” in verbal and nonverbal tasks. Neuropsychologia, 38, 11–21.
Gibbs, J. C., Basinger, K. S., & Fuller, D. (1992). Moral maturity: Measuring the development of sociomoral reflection. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gilboa, A., & Moscovitch, M. (2002). The cognitive neuroscience of confabulation: A review and a model. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The Handbook of Memory Disorders (2nd ed.) (pp. 315–342). Chichester, UK: Wiley.
Goel, V., Jordan, G., Sadato, N., & Hallett, M. (1995). Modeling other minds. Neuroreport, 6, 1741–1746.
Gopnik, A., & Graf, P. (1988). Knowing how you know: Young children's ability to identify and remember the sources of their beliefs. Child Development, 59, 1366–1371.
Gordon, A. C., & Olson, D. R. (1998). The relation between acquisition of a theory of mind and the capacity to hold in mind. Journal of Experimental Child Psychology, 68, 70–83.
Grattan, L. M., Bloomer, R. H., Archambault, F. X., & Eslinger, P. J. (1994). Cognitive flexibility and empathy after frontal lobe lesion. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 7, 251–259.
Greene, J. D., & Haidt, J. (2002). How (and where) does moral judgment work? Trends in Cognitive Sciences, 6, 517–523.
Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment. Science, 293, 2105–2108.
Gregory, C., Lough, S., Stone, V. E., Erzinclioglu, S., Martin, L., Baron-Cohen, S., et al. (2002). Theory of mind in patients with frontal variant frontotemporal dementia and Alzheimer's disease: Theoretical and practical implications. Brain, 125, 752–764.
Hamann, S. B., Ely, T. D., Grafton, S. T., & Kilts, C. D. (1999). Amygdala activity related to enhanced memory for pleasant and aversive stimuli. Nature Neuroscience, 2, 289–293.
Happé, F. G., Winner, E., & Brownell, H. (1998). The getting of wisdom: Theory of mind in old age. Developmental Psychology, 34, 358–362.
Hodges, J. R., Patterson, K., Oxbury, S., & Funnell, E. (1992). Semantic dementia. Progressive fluent aphasia with temporal lobe atrophy. Brain, 115, 1783–1806.
Holland, P. C., & Gallagher, M. (2004). Amygdala–frontal interactions and reward expectancy. Current Opinion in Neurobiology, 14, 148–155.
Hudson, J. A., & Fivush, R. (1991). As time goes by: Sixth graders remember a kindergarten experience. Applied Cognitive Psychology, 5, 347–360.
Hughes, C. (1998). Finding your marbles: Does preschoolers' strategic behavior predict later understanding of mind? Developmental Psychology, 34, 1326–1339.
Hughes, C. (2002). Executive functions and development: Emerging themes. Infant and Child Development, 11, 201–209.
Kaplan, P. J., & Arbuthnot, J. (1985). Affective empathy and cognitive role-taking in delinquent and nondelinquent youth. Adolescence, 20, 323–333.
Keysers, C., Wicker, B., Gazzola, V., Anton, J. L., Fogassi, L., & Gallese, V. (2004). A touching sight: SII/PV activation during the observation and experience of touch. Neuron, 42, 335–346.
Kiehl, K. A., Smith, A. M., Hare, R. D., Mendrek, A., Forster, B. B., Brink, J., et al. (2001). Limbic abnormalities in affective processing by criminal psychopaths as revealed by functional magnetic resonance imaging. Biological Psychiatry, 50, 677–684.
Lawrence, E. J., Shaw, P., Baker, D., Baron-Cohen, S., & David, A. S. (2004). Measuring empathy: reliability and validity of the Empathy Quotient. Psychological Medicine, 34, 911–919.
Leslie, A. M., Friedman, O., & German, T. P. (2004). Core mechanisms in “theory of mind”. Trends in Cognitive Sciences, 8, 528–533.
Levine, B. (2004). Autobiographical memory and the self in time: Brain lesion effects, functional neuroanatomy, and lifespan development. Brain and Cognition, 55, 54–68.
Levine, B., Black, S. E., Cheung, G., Campbell, A., O'Toole, C., & Schwartz, M. L. (2005). Gambling task performance in traumatic brain injury: Relationships to injury severity, atrophy, lesion location, and cognitive and psychosocial outcome. Cognitive and Behavioral Neurology, 18, 45–54.
Mayberg, H. S. (1997). Limbic-cortical dysregulation: A proposed model of depression. Journal of Neuropsychiatry and Clinical Neuroscience, 9, 471–481.
Maylor, E. A., Moulson, J. M., Muncer, A.-M., & Taylor, L. A. (2002). Does performance on theory of mind tasks decline with age? British Journal of Psychology, 93, 465–485.
McCabe, K., Houser, D., Ryan, L., Smith, V., & Trouard, T. (2001). A functional imaging study of cooperation in two-person reciprocal exchange. Proceedings of the National Academy of Sciences USA, 98, 11832–11835.
McKinnon, M. C., Freedman, M., Spreng, N., & Levine, B. (in preparation). When a liar becomes a confabulator: Progression from ventromedial to dorsolateral prefrontal involvement in frontotemporal dementia. Manuscript in preparation.
McKinnon, M. C., & Moscovitch, M. (in press). Domain general contributions to social reasoning: Theory of mind and deontic reasoning re-explored. Cognition.
McKinnon, M. C., Schellenberg, E. G., Turner, G., Miller, B., Black, S., Freedman, M., et al. (in preparation). Decoding the emotional content of speech and music: Emotion comprehension deficits in frontotemporal dementia. Manuscript in preparation.
McKinnon, M. C., Talmi, D., Jaswal, G., Miller, B. L., Black, S. E., Freedman, M., et al. (in preparation). Impairment of moral reasoning performance in frontotemporal dementia. Manuscript in preparation.
Miller, B. L., Darby, A., Benson, D. F., Cummings, J. L., & Miller, M. H. (1997). Aggressive, socially disruptive and antisocial behaviour associated with frontotemporal dementia. British Journal of Psychiatry, 170, 150–154.
Moll, J., de Oliveira-Souza, R., Bramati, I. E., & Grafman, J. (2002a). Functional networks in emotional moral and nonmoral social judgments. Neuroimage, 16, 696–703.
Moll, J., de Oliveira-Souza, R., Eslinger, P. J., Bramati, I. E., Mourao-Miranda, J., Andreiuolo, P. A., et al. (2002b). The neural correlates of moral sensitivity: A functional magnetic resonance imaging investigation of basic and moral emotions. Journal of Neuroscience, 22, 2730–2736.
Moll, J., Eslinger, P. J., & Oliveira-Souza, R. (2001). Frontopolar and anterior temporal cortex activation in a moral judgment task: preliminary functional MRI results in normal subjects. Arquivos de Neuro-Psiquiatria, 59, 657–664.
Moscovitch, M. (1992). Memory and working-with-memory: A component process model based on modules and central systems. Journal of Cognitive Neuroscience, 4, 257–267.
Moscovitch, M., & Umiltà, C. (1990). Modularity and neuropsychology: Implications for the organization of attention and memory in normal and brain-damaged people. In M. F. Schwartz (Ed.), Modular deficits in Alzheimer-type dementia (pp. 1–59). Cambridge, MA: MIT Press.
Moscovitch, M., & Winocur, G. (1992). The neuropsychology of memory and aging. In T. A. Salthouse & F. I. M. Craik (Eds.), The handbook of aging and cognition (pp. 315–372). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Moses, L. J. (2001). Executive accounts of theory-of-mind development. Child Development, 72, 688–690.
Neary, D., Snowden, J. S., Gustafson, L., Passant, U., Stuss, D. T., Black, S., et al. (1998). Frontotemporal lobar degeneration: a consensus on clinical diagnostic criteria. Neurology, 51, 1546–1554.
Overton, W. F., Yaure, R., & Ward, S. L. (1986). Deductive reasoning in young and elderly adults. Paper presented to the Conference on Human Development, Nashville, TN.
Ozonoff, S., Pennington, B. F., & Rogers, S. J. (1991). Executive function deficits in high-functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology & Psychiatry, 32, 1081–1105.
Perner, J., & Lang, B. (1999). Development of theory of mind and executive control. Trends in Cognitive Sciences, 3, 337–344.
Perry, R. J., Rosen, H. R., Kramer, J. H., Beer, J. S., Levenson, R. L., & Miller, B. L. (2001). Hemispheric dominance for emotions, empathy and social behaviour:

176

McKinnon, Levine, Moscovitch

evidence from right and left handers with frontotemporal dementia. Neurocase, 7, 145±160. Pinker, S. (1997). How the Mind Works. New York: W.W. Norton. Pollack, R. D., Overton, W. F., Rosenfeld, A., & Rosenfeld, R. (1995). Formal reasoning in late adulthood: The role of semantic content and metacognitive strategy. Journal of Adult Development, 2, 1±14. Povinelli, D. J., Landau, K. R., & Perilloux, H. K. (1996). Self-recognition in young children using delayed versus live feedback: evidence of a developmental asynchrony. Child Development, 67, 1540±1554. Pratt, M. W., Diessner, R., Pratt, A., Hunsberger, B., & Pancer, S. M. (1996). Moral and social reasoning and perspective taking in later life: A longitudinal study. Psychology and Aging, 11, 66±73. Preston, S. D., & de Waal, F. B. (2002). Empathy: Its ultimate and proximate bases. Behavioral and Brain Sciences, 25, 1±20. Ragland, J. D., Turetsky, B. I., Gur, R. C., Gunning-Dixon, F. M., Turner, T., Schroeder, L., et al. (2002). Working memory for complex ®gures: an fMRI comparison of letter and fractal n-back tasks. Neuropsychology, 16, 370±379. Rankin, K. P., Kramer, J. H., & Miller, B. L. (2005). Patterns of cognitive and emotional empathy in frontotemporal lobar degeneration. Cognitive and Behavioral Neurology, 18, 28±36. Raz, N. (2000). Aging of the brain and its impact on cognitive performance: Integration of structural and functional ®ndings. In F. I. M. Craik & T. A. Salthouse (Eds.), The handbook of aging and cognition (2nd ed.) (pp. 1±90). Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Rice, G. E., & Gainer, P. (1962). ``Altruism'' in the albino rat. Journal of Comparative and Physiological Psychology, 55, 123±125. Riddoch, M. J., & Humphreys, G. W. (1987). Visual object processing in optic aphasia: A case of semantic access agnosia. Cognitive Neuropsychology, 4, 131±186. Rosen, H. J., Gorno-Tempini, M. L., Goldman, W. P., Perry, R. J., Schuff, N., Weiner, M., et al. (2002a). 
Patterns of brain atrophy in frontotemporal dementia and semantic dementia. Neurology, 58, 198±208. Rosen, H. J., Perry, R. J., Murphy, J., Kramer, J. H., Mychack, P., Schuff, N., et al. (2002b). Emotion comprehension in the temporal variant of frontotemporal dementia. Brain, 125, 2286±2295. Rowe, A. D., Bullock, P. R., Polkey, C. E., & Morris, R. G. (2001). ``Theory of mind'' impairments and their relationship to executive functioning following frontal lobe excisions. Brain, 124, 600±616. Sagi, A., & Hoffman, M. L. (1976). Empathic distress in the newborn. Developmental Psychology, 12, 175±176. Salthouse, T. A. (1992). Working-memory mediation of adult age differences in integrative reasoning. Memory & Cognition, 20, 413±423. Saltzman, J., Strauss, E., Hunter, M., & Archibald, S. (2000). Theory of mind and executive functions in normal human aging and Parkinson's disease. Journal of the International Neuropsychological Society, 6, 781±788. Schwartz, M. F., Saffran, E. M., & Marin, O. S. M. (1980). Fractionating the reading process in dementia: Evidence for word-speci®c print-to-sound associations. London: Routledge and Kegan Paul. Shamay-Tsoory, S. G., Tomer, R., Berger, B. D., Goldsher, D., & Aharon-Peretz, J.

7. Social reasoning and neuroscience

177

(2005). Impaired ``affective theory of mind'' is associated with right ventromedial prefrontal damage. Cognitive and Behavioral Neurology, 18, 55±67. Singer, T., & Frith, C. D. (2005). The painful side of empathy. Nature Neuroscience, 8, 845±846. Stone, V. E., Baron-Cohen, S., Calder, A., Keane, J., & Young, A. (2003). Acquired theory of mind impairments in individuals with bilateral amygdala lesions. Neuropsychologia, 41, 209±220. Stone, V. E., Cosmides, L., Tooby, J., Kroll, N., & Knight, R. T. (2002). Selective impairment of reasoning about social interchange in a patient with bilateral limbic system damage. Proceedings of the National Academy of Sciences of the USA, 99, 11531±11536. Stone, V. E., Baron-Cohen, S., & Knight, R. T. (1998). Frontal lobe contributions to theory of mind. Journal of Cognitive Neuroscience, 10, 640±656. Stuss, D. T., & Anderson, V. (2004). The frontal lobes and theory of mind: Developmental concepts from adult focal lesion research. Brain and Cognition, 55, 69±83. Stuss, D. T., & Benson, D. F. (1986). The frontal lobes. New York: Raven Press. Stuss, D. T., Gallup, G. G., & Alexander, M. P. (2001). The frontal lobes are necessary for ``theory of mind''. Brain, 124, 279±286. Svoboda, E., McKinnon, M. C., & Levine, B. (2006). The functional neuroanatomy of autobiographical memory: A meta-analysis. Neuropsychologia, 44, 2189±2208. Warrington, E. K., & Taylor, A. M. (1978). Two categorical stages of object recognition. Perception, 7, 695±705. Williams, M. A., Morris, A. P., McGlone, F., Abbott, D. F., & Mattingley, J. B. (2004). Amygdala responses to fearful and happy facial expressions under conditions of binocular suppression. Journal of Neuroscience, 24, 2898±2904. Winston, J. S., Strange, B. A., O'Doherty, J., & Dolan, R. J. (2002). Automatic and intentional brain responses during evaluation of trustworthiness of faces. Nature Neuroscience, 5, 277±283. Zelazo, P. D., Burack, J. A., Benedetto, E., & Frye, D. (1996). 
Theory of mind and rule use in individuals with Down's syndrome: A test of the uniqueness and speci®city claims. Journal of Child Psychology and Psychiatry, 37, 479±484. Zelazo, P. D., Jacques, S., Burack, J. A., & Frye, D. (2002). The relation between theory of mind and rule use: Evidence from persons with autism-spectrum disorders. Infant & Child Development, 11, 171±195.

8

Explaining the domain generality of human cognition

Keith Stenning and Michiel van Lambalgen

From an evolutionary point of view, what is most overwhelmingly in need of explanation is the generality of cognition in humans, as compared to that of our immediate ancestors. Humans sing operas, build diesel engines and agonise about domain dependence. Above all they can sustain the kinds of societies required for each of these activities. What part a domain specialist apparatus plays in this new-found encyclopaedic cognition is a good question. But we believe that the wholesale current appeal to domain specificity has led to the neglect of what should be the guiding observation: the hugely more general capacities for reasoning in human beings. The rejection of homogeneous universal learning mechanisms, or of monolithic universally interpreted languages and logics, as simplistic bases for explaining the generality of human cognition, has led to the "massive modularity" view of reasoning (Sperber, 1994, 2002), but we will argue that this is a remarkably poor basis for explaining what is overwhelmingly in need of explanation.

One implausibility of lumbering us with a cognition of a thousand domains, each implemented in a new module, is the briefness of the evolutionary interval that led from the cognition of our most recent common ancestor with apes, to our present singing, dieseling and philosophising communities. A very small number of changes led to a radical change in versatility. The biological conclusion is that the vastly greater part of the apparatus was there already, though it may have been doing something very different at the time. Of course, this means that whatever modularisation there was in our immediate ancestor will have largely passed through this process into our own modularisation. So, we may well have a cognition of many modules. But if this was the evolutionary process, the modules are unlikely to be completely new (rather than modified old ones), and modularisation per se will play little part in explaining the novel generality.
Generality is not achieved (or explained) by adding a large number of new modules, though it may well be achieved by repurposing old ones. The problem is, of course, that with cognitive function, the relationship between the old purpose and the new may be hard to see.

We have absolutely no brief for general learning theories, or monolithic homogeneous classical logic, which are always claimed by domain specificity theorists to be the creed of their opposition. Our purpose here is to reject these straw men, along with many of the conclusions that have been drawn from their impossibilities. We will re-examine some of the evidence that has been used to argue for domain specificity, ask how varieties of reasoning might usefully be conceptualised, and suggest how an approach through the multiplicity of logics might contribute to a theory that is aimed at explaining the so much greater domain generality of human cognition.

Just as with domain specificity, so also with modularity. We have no doubt that the mind is modular, and we agree entirely with Sperber that the best reason for believing that our mental apparatus is modular is the general biological reason, that organisms are modular. What we reject is quite narrowly focused: the idea that mental modules can be identified simply by observing mental capacities newly arisen in an animal's phenotype, and assigning their implementation to new modules. In fact, the best approach is just the opposite: to identify what ancestral capacity might be the basis of evolutionary continuity, and then to identify how this capacity might be augmented by modifying ancestral modules to achieve the discontinuity. At a first guess, the modularisation of the new capacity will consist of a small tweak of the old modularisation. Of course, with behavioural innovations, the ancestral module may have been doing something superficially radically different from its redeployment, and this is one problem that makes cognitive evolution challenging.

Reasoning is the appropriate battleground for this argument and will be our focus here. The idea that the domain generality of human cognition results from the human ability to reason logically over languages, and about language, goes back at least to Aristotle.
More recently, in the opposite direction, the psychological study of reasoning has been the origin of some of the most celebrated arguments for domain specificity. So, unpicking both these opposing arguments is an important initial step.

Another supposedly content-defined domain, highly relevant to reasoning research, though usually studied by developmentalists, is "theory-of-mind" reasoning abilities (Leslie, 1987). This domain has also been attributed to a novel brain module (see also Halford & Andrews, Chapter 9; Happaney & Zelazo, Chapter 11; Moses & Sabbagh, Chapter 12, this volume). The status given to reasoning about beliefs, desires and intentions (and other mental states) has important ramifications for the evolution of cognition. Some have argued that in ontogeny, reasoning about minds requires language (de Villiers & de Villiers, 2000) and is enabled by the arrival of certain syntactic structures in the child's language. On the other hand, reasoning about intentions is fundamental to the use of language, as much of the research on linguistic pragmatics testifies. So, the relations between reasoning, thought, and language have a chicken-and-egg flavour in ontogeny as well as in phylogeny.

Finally, reasoning is the appropriate focus for arguments about modularity because, taken to its extremes, the cognition-as-a-thousand-modules view pretty much eliminates reasoning as a field of interest. The generality of reasoning comes to be thought of as an illusion. Restoring reasoning to serious cognitive analysis must be central to restoring the aim of explaining human cognitive flexibility.

It may help to give the reader some indication of where we are headed, by describing two problems that crop up repeatedly along the way. A pervasive equivocation in arguments about the domain specificity of reasoning is between specialisation of the processor and specialisation of the content processed. For example, arguments for the specialised nature of, say, language processors slide into conclusions about the specialisation of the content reasoned about in those languages. This slide is so egregious that it can, like the gorilla in the basketball game, go unnoticed while we count the points.

The other pervasive equivocation that concerns us throughout is that between reasoning domains and reasoning tasks. We strongly believe that people have the capacity to perform a rich variety of reasoning tasks, and that these require a wide range of logics and perhaps processors. So we are all for variety. But we are sceptical about content-determined domains of reasoning (as opposed to, say, content-determined domains in perception). The content of reasoning problems has enormous influence on reasoning, but through its influence on the interpretation of the task, and without interpretation (what we will describe below as logical form) there is no reasoning. However, we will argue that the evidence is that once an interpretation is chosen, reasoning proceeds rather abstractly, unless it hits an impasse that invokes the possibility of the need for reinterpretation.
Above all, humans are capable of formalisation, and formalisation goes on in everyday reasoning just as much as in the theoretician's practice.

We will start out, as so many forays by domain specificity theorists have done, from the Wason (1968) selection task, the locus classicus for modular evolutionary theories of reasoning. This task irritates many researchers, but we beg the reader's patience as we believe its study will repay, in vividness of detail, what it initially appears to lack in ecological validity. When conceptualised the way Wason invited us to do, the task may be a party trick of not much significance. It can nevertheless, sensibly reinterpreted, serve as a microcosm for understanding much about domains in reasoning.

In the second section, we generalise the argument by fitting it into a modern logical framework. On this view, there is a rich landscape of logical systems, and reasoners must choose an interpretation by setting syntactic, semantic and logical parameters within this space before reasoning is well defined. We believe this is a good picture of what people do in their daily reasoning, not just of what the theorist does in doing semantics.

In the third section, we attempt to relate this cognitive picture to the biological problem that is faced by accounts of the evolution of human
cognition. Organisms are constructed in a modular fashion, and biology has many lessons for those who would modularise cognition. Specifically, biology counsels against simple identifications of modern function with originating adaptive pressure, and gives wonderful examples of the repurposing of old machinery. Biology also underlines the importance of simultaneously giving accounts of continuity and discontinuity in evolution. We end with a highly speculative proposal designed to contrast with the kinds of pictures of cognitive evolution currently on offer.

Domains in the selection task

The Wason (1968) selection task was the departure point for domain specific accounts of reasoning. Wason's original "abstract" version presented subjects with four cards bearing letters on one side and numbers on the other, along with a conditional rule: "If there's a vowel on one side, there's an even number on the other." The visible sides of the cards bore the symbols A, K, 4 and 7. Subjects were told to turn over those cards (and only those cards) necessary to judge the truth of the rule with respect to these four cards only. This task achieved fame through Wason's argument that subjects should turn only the vowel (true antecedent) and the odd number (false consequent) cards, and his observation that only about 5% of highly selected intelligent undergraduates did so.

The search by many experimenters for a version of the task that was easier yielded the now famous "drinking age" (thematic) materials (Griggs & Cox, 1982). This presented subjects with four cards, each representing a drinker, with their drink on one side and their age on the other, along with a conditional rule: "If you drink alcohol, then you must be over 18." The visible sides of the cards bore "whiskey", "orange", "16", and "19". Subjects were told to turn over those cards (and only those cards) necessary to decide which of these four drinkers were obeying the rule. The results were that about 85% of a sample from a similar population of subjects now turned "whiskey" and "16" only, the cards corresponding to Wason's criterion of correctness.

Cosmides (1989) argued for her domain specific approach to reasoning by claiming that facilitation with thematic materials demonstrated that human beings had an innate module for reasoning about social contracts, evolved during our sojourn on the savannah in the Pleistocene era.
After all, here is an obvious domain dependence of reasoning: The very same problem with different content evokes completely different performances. We would, of course, agree that here is evidence for variety in reasoning. We would also remark that the second rule is not a social contract. One has to have "signed up" to, and have potential control over, compliance with a social contract, and over one's age one has little control. The experiment also lacks connection to modules, innateness, the savannah or the Pleistocene. The content of the materials in the two versions of the task may be
different, but by far the most important difference is the task set and its relation to the logic of each rule. In the drinking age experiment, the task is to identify the cases that fail to comply with the law. There is no mention of truth-values because a (legal) law does not have truth values. In the "abstract" vowels and consonants experiment, the task is to test the truth of the rule. Testing the truth (or falsity) of natural language conditionals is notoriously problematic. Wason himself assumed that the abstract rule was to be interpreted as a classical logical material implication: true only when there were no cases of true antecedents with false consequents, and then true simply in the light of this fact. But undergraduate subjects were already, in 1968, well known to be vanishingly unlikely to interpret an apparently law-like conditional described as a rule as a material implication. Specifying that it applies to only the four cards just makes its description as a rule all the harder to bear. This issue is further developed in Stenning and van Lambalgen (2004, p. 502). There is a significant sense in which Wason got his own task wrong, at the very least failing to understand that his criterion of correct performance could be controversial (Oaksford & Chater, 1996; Nickerson, 1996).

Stenning and van Lambalgen (2004) argue that taking the semantics of law-like natural language conditionals seriously as defaults, and simultaneously taking the logic of legal laws to be deontic, allows some precise predictions of the problems of interpretation subjects will exhibit in Wason's original task, but not the deontic drinking-age one. Fundamentally, trying to test law-like conditionals, robust to exceptions, by examining cases is obviously fraught. As one of their subjects said: "OK. If I turn that A and find a 7, it doesn't fit but that could just be an exception". When experiments designed on the basis of logical analysis are done, the predictions are borne out.
Wason's version of the task can be made systematically "easier" by removing the several interpretational problems that arise. These problems do not arise in deciding whether cases obey deontic laws. So when we read (e.g., in Canessa et al., 2005) that the difference in content between identical logical forms of the abstract and deontic selection tasks determines whether the right cerebral hemisphere gets involved, we need to reinterpret this as the difference in logical task (or form) switching on different bits of brain.

So where does this leave us with the domain specificity of reasoning? Different reasoning tasks certainly yield different reasoning performances. But the domain specificity theorist needs more than that, i.e., that the two problems are isomorphs of each other. Our analysis of the task identifies at least three varieties of reasoning: (1) the conditional in Wason's interpretation (and that of about 5–10% of the subjects) is a classical logical material implication, one variety of reasoning; (2) the bulk of the subjects' initial interpretation of the conditional in this task is as a defeasible default conditional, "If there's a vowel on one side and there's nothing abnormal, then there's an even number on the other", a second variety; and (3) the deontic
interpretation of the drinking-age task is a third variety. In each of these domains, people reason qualitatively differently: they better had; isomorphs these interpretations are not (see also Noveck, Mercier, & van der Henst, Chapter 2, this volume; Roberts, Chapter 1, this volume). Stenning and van Lambalgen (2004) show in their review of the many experiments on the selection task that a remarkably high proportion of the variance in reasoning is accounted for by the single factor of whether the subject interprets the conditional deontically or descriptively.

So, how are these varieties of reasoning to be treated theoretically? In the next section we propose that logic has much to offer here. These varieties of reasoning are absolutely not content-defined. The various logics involved in the interpretations are each abstract. The illusion of content dependence results from the choice of interpretations being cued by content because the natural language sentence form is only a weak guide to interpretation. The selection task provides excellent examples of just how fine-grained this cueing can be. Famously, Johnson-Laird, Legrenzi, and Legrenzi (1972) showed that a version of Wason's task using a now defunct UK postal regulation ("If a letter has a second class stamp, it is left unsealed") produced near ceiling performance. Though they described the facilitation in terms of familiarity, what was critical was that the rule, though stated indicatively, was interpreted deontically by their knowledgeable UK subjects. The same rule was later found by Griggs and Cox (1982) to fail to facilitate the performance of American subjects unfamiliar with the postal regulation. Again, we believe that this was because such subjects, lacking contextual knowledge, did not interpret the rule deontically but as a descriptive generalisation, with all the consequences described above. Here are different varieties of reasoning at their finest grain being cued by content.
But Cosmides (1989) herself showed that familiarity has little to do with reasoning performance. Provided with suitable cues, subjects reason deontically about the behaviour of unfamiliar exotic tribes. Perhaps our three varieties of reasoning listed above might also be proposed as "content" differences. If this is "content" then it is a very different kind to that proposed by domain specificity theories. These varieties are best construed as different reasoning tasks. Different tasks call for different reasoning, but domain specificity in the literature is the notion that two tasks that are formally the very same task (isomorphs), with corresponding correct answers, in different domains, elicit very different performances.

In summary of the lessons from the selection task, a careful analysis exhibits various distinct varieties of highly abstract reasoning that are variously applied to different versions. These constitute quite distinct reasoning tasks, and distinct tasks are rather abstractly defined as governed by different logics. Abstract as the logics are, recognising which one is most appropriately applied to natural language task materials is itself a task driven by highly specialised content knowledge and is exquisitely sensitive
to context. Different content plays its role through triggering different assignments of form. The question of whether these forms require different processors, and how such processes are modularised, evidently can now be asked only in the context of human discourse processing more generally. There is, of course, a huge amount of research in psycholinguistics and computational linguistics on the question of how discourse processing can be modularised. Basing research on the semantics of conditionals is an important step in the direction of connecting these literatures. With this very different view of the evidence about varieties of reasoning drawn from a single experimental paradigm, we next generalise the discussion by seeking a logical framework.
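Wason's criterion of correctness, described at the start of this section, can be stated mechanically: under the material implication reading, a card needs turning only if some possible hidden side would make it a case of true antecedent with false consequent. The following is a minimal illustrative sketch, not part of the original studies. The function name `cards_to_turn` and the encoding of each card as a (label, antecedent, consequent) triple are assumptions made for the example, and the code encodes only Wason's classical criterion, not the default or deontic interpretations discussed above.

```python
def cards_to_turn(cards):
    """Return labels of cards that could falsify 'if p then q' (material implication).

    Each card is (label, p, q); p or q is None when that side is hidden.
    A card is selected if at least one way of filling in the hidden side
    yields p true and q false.
    """
    chosen = []
    for label, p, q in cards:
        p_options = [True, False] if p is None else [p]
        q_options = [True, False] if q is None else [q]
        could_violate = any(pv and not qv
                            for pv in p_options
                            for qv in q_options)
        if could_violate:
            chosen.append(label)
    return chosen


# Abstract version: p = "vowel on the letter side", q = "even number on the other side".
abstract = [("A", True, None), ("K", False, None),
            ("4", None, True), ("7", None, False)]

# Drinking-age version: p = "drinks alcohol", q = "is over 18".
drinking = [("whiskey", True, None), ("orange", False, None),
            ("16", None, False), ("19", None, True)]

print(cards_to_turn(abstract))  # cards selected under the classical criterion
print(cards_to_turn(drinking))
```

That the same function applies to both card sets reflects only the shared classical criterion; it says nothing about whether subjects assign the two rules the same logical form, which is the point at issue in this section.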

What logic has to say about domain specificity

Preliminaries: What is involved in logical reasoning?

Discussions of the purported domain dependence of logical reasoning are generally marred by a grossly oversimplified view of what is involved, perhaps best summed up as "conditionals are conditionals are conditionals". It is typically assumed that logical reasoning ideally consists of a mindless application of classical logic (i.e., that conditionals are classical logical conditionals), on the principle that, since logical validity depends on form not content, and there is only one logic, any kind of content can be substituted for the variables in logical forms. Thus, in the selection task the logical form of the rule is taken to be $p \to q$, where $\to$ is the classical material implication, and $p$, $q$ can be replaced by statements about letters and numbers, or by statements describing the terms of a social contract. It is the purpose of this section to break the hold of this picture, by showing that it is entirely unpsychological, and, equally importantly, ignorant of the logician's view of reasoning. We start with the latter; in this way we will be able to explain the considerable mental processing accompanying reasoning to logical form.

Logical form

It is deeply unfortunate that much experimental work on reasoning in psychology (evolutionary or otherwise) takes classical logic to be the unique and undoubted norm. Even though psychologists may talk of "pragmatic modulation" as a process that takes meanings considerably beyond classical logical meanings, they still resort to classical logic in judging the correctness of their subjects' conclusions (Johnson-Laird & Byrne, 2002). This overlooks the fact that a logical system can be justified only relative to a semantics for the system, and furthermore that this semantics must be appropriate to the domain over which one intends to reason. Before we
delve into technicalities, we offer an analogy that we hope will be instructive. Logical reasoning is like statistical inference, in that it is crucially based on the choice of a model. Once a model has been chosen, the impact of the data can be calculated, and in this sense statistics is a field of mathematics. But the choice of a model is not mathematically determined, and can in fact be the subject of considerable controversy. For the simplest kind of example, the choice of independence assumptions radically affects choice of model. The analogy with logical reasoning is that once a semantics is chosen, it in principle fixes the corresponding logical laws. But the choice of a semantics appropriate to a domain is, in general, a matter of debate, and in any case is not a purely logical or mathematical question.

At this point the reader may object: "I can see that this is so in statistics, where the choice of a model clearly depends on the features one is interested in; but in logic there is essentially one semantics only (that of classical predicate logic), so there is nothing to choose! Furthermore, choosing a statistical model is a deliberative process, whereas there is no evidence that people engage in such a process in the case of logic". The immediate aim of this section is to show not merely that, from a formal point of view, there are many semantics to choose from, but also that, from a psychological point of view, we actually do continually make corresponding choices. We are so well versed in this that choosing an appropriate semantics by and large proceeds automatically, without the conscious deliberation that is characteristic of explicit statistical modelling. Psychologists studying deductive reasoning were slow to recognise the regularity that deontic interpretation is the crucial determinant of ease of reasoning in the selection task. One might speculate that the automaticity of choosing a semantics is a large part of the reason why.
Let us first consider how the variety of logical forms arises formally. So far we have been talking about a variety of semantics, but actually there are several other parameters that can be varied. Together, these determine what we will call logical form. Assigning logical form to a natural language sentence at the very least involves choosing specific values for the following parameters:

1. L, a formal language into which (a fragment of) natural language is to be translated
2. the sentence Sen in the formal language which corresponds to the translation of the given sentence in natural language
3. the semantics Sem for L
4. the definition of validity of arguments ψ1, . . ., ψn/φ with premises ψi and conclusion φ.

For each type of parameter on this list, there are many possibilities for variation. In Stenning and van Lambalgen (2004) this topic is treated in detail; here we provide only some pertinent examples.

8. Domain generality of cognition


Much research in psychology is concerned with conditional reasoning, and it is often assumed that the natural language conditional "if . . . then" is exhaustively defined by means of its truth table in classical logic (p → q is false if p is true while q is false, and true otherwise); "exhaustive" meaning that every inference pattern involving → follows from its truth table.1 This identification should, however, be viewed as a consequence of the following parameter settings.

1. The formal language into which a sentence "if (clause), then (clause)" is to be translated is propositional logic, so that the internal structure of the clauses in the antecedent and the consequent of the conditional, and their relation, does not play a role.
2. The propositional connective used to translate "if . . . then" is the binary implication connective →. This may seem obvious, since "if (clause), then (clause)" seems to involve two variables for clauses; but as we shall see in "Unknown preconditions" below, there are cases where translation as a ternary connective is much more appropriate.
3. The classical semantics for this propositional language is characterised by the properties of truth-functionality (the truth value of a complex sentence is completely determined by the truth values of its atomic components) and bivalence (there are only two truth values, 0 (false) and 1 (true), such that not-false = true).
4. Even after parameters 1-3 have been set, the truth table of p → q is not completely determined as a function of the truth values of p and q. For this, one needs some semantic intuitions about which argument patterns should be (in)valid, together with a formal definition of validity.2 The classical definition of validity (semantic consequence) says that "an argument is valid if the conclusion is true in all models of the premises". This definition can be put to work as follows. If ⊨ denotes this concept of validity, then the following argument patterns involving → are classically considered to be valid: q ⊨ p → q (read as: on every model in which q is true, p → q is also true), ¬p ⊨ p → q. By contrast, we have p ⊭ p → q. The classical definition of validity, combined with these intuitions, then forces p → q to be true if p, q are true and if ¬p, q are true. This determines two rows of the truth table

1 The fact that there has been discussion in the literature of "truth-value" gaps in reconstructions of subjects' truth tables makes it all the more surprising that there has been no consideration of whether non-classical logics might accommodate these observations.
2 Note that any syntactic rules of inference are merely formalisations of whatever semantic intuitions of validity are chosen. Modern logic uses ⊨ to represent the underlying semantic relation of semantic consequence, contrasting with ⊢, which represents the syntactic relation of derivability by some set of inference rules. Inference rules are not basic to the formulation of logical systems. If they capture (formalise) the semantic consequence relation soundly and completely, all well and good. But there are logics with no such sets of rules.


(the first and the third); the other rows can be worked out similarly. This illustrates how basic semantic intuitions of consequence give rise to truth tables. We could go on to show how the truth tables give rise to and constrain sets of inference rules.
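The way validity intuitions pin down a truth table can be checked mechanically. The following Python fragment is an illustration only and is not part of the original analysis; the names `entails`, `candidate_tables` and `satisfies_intuitions` are hypothetical. It enumerates all sixteen binary truth functions and keeps those for which q ⊨ p → q and ¬p ⊨ p → q hold while p ⊨ p → q fails, under the classical definition of validity.

```python
from itertools import product

VALUATIONS = list(product([True, False], repeat=2))  # all (p, q) pairs

def entails(premise, conclusion):
    """Classical validity: the conclusion is true in every model of the premise."""
    return all(conclusion(p, q) for p, q in VALUATIONS if premise(p, q))

def candidate_tables():
    """Yield every binary truth function as a dict from (p, q) to a truth value."""
    for outputs in product([True, False], repeat=4):
        yield dict(zip(VALUATIONS, outputs))

def satisfies_intuitions(table):
    """Check the three semantic intuitions against one candidate connective."""
    arrow = lambda p, q: table[(p, q)]
    return (
        entails(lambda p, q: q, arrow)            # q entails p -> q
        and entails(lambda p, q: not p, arrow)    # not-p entails p -> q
        and not entails(lambda p, q: p, arrow)    # p does not entail p -> q
    )

surviving = [t for t in candidate_tables() if satisfies_intuitions(t)]
material = {(p, q): (not p) or q for p, q in VALUATIONS}
```

Under these assumptions exactly one candidate survives, and it coincides with `material`, the classical table for the conditional. Changing any of the three intuitions, or the definition inside `entails`, changes which connectives survive.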

This is not simply our version of a "just so" story, showing how classical logic might have arisen. The possibility for interpretational variation is real, as will be shown in "The many faces of closed-world reasoning" below using several examples of conditional reasoning of psychological interest where the parameters have been set differently. In particular we will consider cases where an enriched representation of the conditional is used and where the classical definition of validity is changed. It should be noted that fixing the logical form of a set of expressions or sentences (including a definition of validity) is only the first step in logical reasoning. Once this has been done, proof search must start, which brings its own considerable difficulties. We shall not say much about this area here, except in the next section, where its role in the Wason selection task is highlighted.

Logical form and (pre)processing

The notion of logical form introduced above leads to the claim that, before one can reason according to a particular logic, a number of parameters have to be fixed. For readers who find this implausible we give a brief sketch of our work on the selection task. Wason conceived his task to be about reasoning with the material implication, and about subjects' "irrationality" while doing so. In Stenning and van Lambalgen (2004) we have given evidence that subjects' main difficulty is in fact in coming to this interpretation, because the notion of conditional reasoning they start out from is so different. When subjects read the rule to be tested against the cards,3 they will interpret "if . . . then" according to their natural language understanding of this expression. As will be elaborated in "Unknown preconditions" below and as a moment's reflection should make clear, conditionals in natural language allow exceptions and are hardly ever made false by a counterexample.
As quoted above, this was said in so many words by some of our subjects, who responded to a card with 7 on the front and A on the back with "It could be just a little exception, you see?" But once the rule is allowed to have exceptions, the task is unsolvable in that no choice of cards could conclusively establish truth or falsity of the rule. One would require knowledge of what counts as an exception and what as a counterexample:

3 Say, the descriptive rule "if there is a vowel on one side of a card, there is an even number on the other", where the cards show A K 4 7.


all ravens are black, with the exception of ravens' eggs, albino ravens, ravens that have fallen into bleach, etc. The selection task doesn't supply the relevant information. However, the pragmatics of the experimental situation suggest that the task must be solvable, ergo the conditional must not allow exceptions. But then there are the mirror image difficulties with such conditionals: merely observing that any finite set of cases fits the rule (are neither exceptions nor counterexamples) doesn't make the rule true. This is evidenced when some subjects note correctly that turning a card can at most establish that this card is not an exception or a counterexample to the rule; but they then refuse to identify "not false" (i.e., no counterexamples) with "true" (Stenning & van Lambalgen, 2004, pp. 501-2). Of course the task is only solvable if this identification is made. But even then the difficulties are not over yet. Suppose a subject has come round to the idea that the clue to solving the task is to determine whether the search for counterexamples succeeds or fails. This search is immediately frustrated because there are no real cards to be turned. That is, with real cards the search could proceed sequentially according to a rule such as: first turn A to see what is on the back, and if this does not yield a counterexample, continue with 7. When given only pictures of cards, what the subject has to do instead is to reflect on the possible outcomes of the search, if it were performed. There is ample evidence in our material that subjects experience these difficulties (see Stenning & van Lambalgen, 2004), and that presenting the task in such a way that sequential choice is possible greatly improves performance.4 This issue is exacerbated by the instruction to turn as few cards as necessary.
This sketch should suffice to convince the reader, first, that considerable preprocessing of a reasoning problem is necessary before logic can be applied, and second, that even if a subject has managed to assign a logical form and understands abstractly how the problem must be solved, processing difficulties associated with proof search may stand in the way of a solution. It is shown in Stenning and van Lambalgen (2004) that deontically interpreted versions of the selection task, with a rule such as "if you want to drink alcohol you have to be over 18", elicit very different logical forms, which considerably reduce processing load in this task. Although we will come back to this issue, it should already be clear that this paints the domain generality or otherwise of logical reasoning in a different light. Far

4 The logically minded reader may find the following analogy helpful. The method of semantic tableaux allows one to check the validity of an argument (say in propositional logic) by a systematic search for counterexamples. Now imagine that one is given a concrete argument with the instruction not to construct the corresponding tableau, but to show that if the tableau were constructed it would yield the desired answer. Reflection on what it is to construct a tableau is more difficult than the rule-guided process of the construction itself.


from two formally identical tasks being differentiated by content, with a module just for processing "social contracts", the two are formally very distinct, where the choice of interpretation is often only cued by content (and the deontic "domain" is very much more abstract than social contracts). The deontic interpretation combines with other features of the task to make processing particularly easy; the descriptive one combines with the same features to make processing particularly hard.

The many faces of closed-world reasoning

As we have seen above, the classical definition of validity considers all models of the premises. This type of validity is useful in mathematics, where the discovery of a single counterexample to a theorem is taken to imply that its derivation is flawed. But there are many examples of reasoning in daily life where one considers only a subset of the set of all models of the premises. In this section we review some examples of closed-world reasoning and its formalisation, gradually working towards reasoning in "theory of mind" tasks, which have been proposed as a marker for another evolutionarily critical module.

One example is furnished by train schedules. In principle a schedule lists only positive information, and the world would still be a model of the schedule if there were more trains running than listed on the schedule. But the proper interpretation of a schedule is as a closed world: trains not listed are inferred not to exist. Note that there is a difference here with the superficially similar case of a telephone directory. If a telephone number is not listed, we do not therefore conclude that the person does not have a telephone; she might after all have an ex-directory number. In fact a moment's reflection suggests that such examples can be found within the "train schedule" domain.
From the point of view of a prospective passenger, the inference that there is no train between two adjacently listed trains may be valid, but for a trainspotter interested in trains passing through on the track, trains "not in service" may well occur between listed services. Thus, world knowledge is necessary to decide which logic is applicable, a feature that wreaks havoc with any simplistic domain specific/domain general dichotomy.

Unknown preconditions

Real-world actions come with scores of preconditions that often go unnoticed. My action of switching on the light is successful only if the switch is functioning properly, the house is not cut off from electricity, the laws of electromagnetism still apply, etc. It would be impossible to verify all those preconditions; we may not even check the light bulb although its failure occurs all too often. We thus have a conditional "if turn switch then light on" that does not become false the moment we turn the switch only to


find that the light does not go on, as would be the case for the classical material implication. An enriched representation of the conditional as a ternary connective shows more clearly what is at issue here: "if turn switch and nothing funny is going on then light on". If we turn the switch but find that the light is not on, we conclude that something is amiss and start looking for that something. But (and this is the important point) in the absence of positive information to the effect that something is amiss, we assume that there is nothing funny going on. This is the closed-world assumption for reasoning with abnormalities, CWA(ab).

This phenomenon can be seen in a controlled setting in an experiment designed by Claire Hughes and James Russell, the "box task", which lends itself particularly well to a logical analysis using closed-world reasoning (Hughes & Russell, 1993). The task is to get a marble, which is lying on a platform, from inside the box where the platform is located. Platform and marble can be viewed through a circular opening in the box, and the first impulse is therefore to reach through the opening to retrieve the marble. However, when the subject puts her hand through the opening, a trapdoor in the platform opens and the marble drops out of reach. This is because there is an infrared light-beam behind the opening that, when interrupted, activates the trapdoor mechanism. A switch on the left side of the box deactivates the whole mechanism, so that to get the marble you have to flip the switch first. In the standard setup, the subject is shown how manipulating the switch allows one to retrieve the marble after she has first been tripped up by the trapdoor mechanism.

A more formal analysis of the box task could go as follows. The main premise can be formulated as

(1) if you reach for the marble through the opening and there is nothing funny going on, you can retrieve the marble

where the italicised conjunct, "there is nothing funny going on", is the variable, assumed to be present always, for an unknown precondition. This conjunct occasions closed-world reasoning of the form

(2) I haven't seen anything funny. So, there is nothing funny going on.

Backward chaining then leads to the plan

(3) to get the marble, put your hand through the opening.

Now a problem occurs: the marble drops out of reach before it can be retrieved. Premise (1) is not thereby declared to be false, but is now used to derive

(4) something funny is going on.


To determine what's so funny, the information about the switch is recruited, which can be formulated as a rule "repairing" (1) as in (5a) or (5b).

(5a) If you set the switch to the correct position and there is nothing funny going on, then you can retrieve the marble.
(5b) If the switch is in the wrong position, there is something funny going on.

Closed-world reasoning with (5b) now yields

(6) if the switch is in the wrong position, there is something funny going on, but only then.

Backward chaining then leads to a new plan

(7) To get the marble, set the switch to the correct position and put your hand through the opening.
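The progression from the first plan to the repaired plan can be mimicked in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' formalisation: the labels, the `REPAIR` table and the functions `something_funny`, `make_plan` and `execute` are all hypothetical, and the apparatus is reduced to a single abnormality condition.

```python
# Hypothetical labels for the actions and conditions of the box task.
REACH = "put your hand through the opening"
SET_SWITCH = "set the switch to the correct position"
SWITCH_WRONG = "the switch is in the wrong position"

# Assumed pairing of each abnormality condition with the action that removes it.
REPAIR = {SWITCH_WRONG: SET_SWITCH}

def something_funny(known_conditions, observed_facts):
    """CWA(ab): ab holds only if a known abnormality condition is observed."""
    return any(c in observed_facts for c in known_conditions)

def make_plan(known_conditions):
    """Backward chaining: neutralise every known abnormality, then reach."""
    return [REPAIR[c] for c in sorted(known_conditions)] + [REACH]

def execute(plan, facts):
    """Toy apparatus: reaching succeeds only once the switch has been set."""
    facts = set(facts)  # work on a copy so the caller's world is unchanged
    for action in plan:
        if action == SET_SWITCH:
            facts.discard(SWITCH_WRONG)
        elif action == REACH:
            return SWITCH_WRONG not in facts
    return False

world = {SWITCH_WRONG}
known = set()                                       # nothing funny seen yet
assumed_normal = not something_funny(known, world)  # step (2)
first_plan = make_plan(known)                       # step (3)
first_try = execute(first_plan, world)              # marble drops: step (4)
if not first_try:
    known.add(SWITCH_WRONG)                         # rule (5b), closed off as in (6)
second_plan = make_plan(known)                      # step (7)
second_try = execute(second_plan, world)
```

The point of the design is that the failed attempt does not delete the original rule: the second plan is the first plan with one repair action placed in front of it, which is what the variable for the unknown precondition makes possible.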

One interesting feature of this analysis is thus that the new plan (7) is constructed from the old one by utilising the variable for the unknown precondition. This is not reasoning as it is usually studied in psychology, but it is reasoning nonetheless, with a discernible formal structure, and applicability across a wide range of domains. In fact CWA(ab) can be viewed as a definition of validity, as follows.5 Suppose we have an enriched conditional of the form p ∧ ¬ab → q, where ab is a proposition letter indicating some abnormality. Suppose furthermore that we have as information about ab the following implications: q1 → ab, . . ., qk → ab, and that this is all the available information about ab. The implication ⊥ → ab is always true: any material implication with a false antecedent is true, whether ab is true or ¬ab is true. We may therefore also include this (admittedly trivial) statement in the information available about ab. We now want to say that, given p, ¬q1, . . ., ¬qk, q may be concluded. This is tantamount to replacing the information about ab with the single premise ab ↔ q1 ∨ . . . ∨ qk ∨ ⊥ and applying classical validity. Note that as a consequence of this definition, if there is no nontrivial information about ab, the right-hand side

5 We use ∧, ∨ and ↔ for conjunction, disjunction and biconditional (p if and only if q) respectively. Furthermore, ⊥ denotes a formula that is always false, and ⊤ a formula which is always true.


of the preceding bi-implication reduces to a falsehood (i.e., ⊥), and the bi-implication itself to ab ↔ ⊥, which is equivalent to ¬ab. In short, if there is no nontrivial information about ab, we may infer ¬ab. Note that although classical reasoning is used here in explaining the machinery, the closed-world inference itself is nonclassical: in classical logic nothing can be concluded from the premises p, ¬q1, . . ., ¬qk.

Van Lambalgen and Smid (in press) have proposed that this logical framework can be used to give a perspicuous description of some executive function deficits in autism. The main idea here is that classic executive failure, "inability to inhibit the prepotent response", should also affect the inhibitory force of the abnormality in the enriched representation of conditional rules. This prediction is borne out in their experiment that uses Byrne's (1989) "suppression" task. Given the premises "She has an essay to write", "If she has an essay she studies late in the library" and "If the library is open she studies late in the library", autistic subjects tend to draw the modus ponens conclusion that she studies late in the library, where undergraduate subjects tend to withhold this conclusion. This is because the third premise highlights a possible abnormality in the second premise, and normal subjects are sensitive to this. In executive function theory terms, for autistic subjects, the highlighting of a possible abnormality by the third premise fails to inhibit the prepotent response of the modus ponens inference.

Another classical reasoning observation can be enriched from this point of view: Scribner's study of reasoning among the illiterate Kpelle tribe in Liberia (see Scribner, 1997). Here is a sample argument given to her subjects:

All Kpelle men are rice farmers.
Mr Smith6 is not a rice farmer.
Is Mr Smith a Kpelle man?

Subjects refused to answer the question definitively, instead giving evasive answers such as "If one knows a person, one can answer questions about him, but if one doesn't know that person, it is difficult". Scribner went on to show that a few years of schooling in general led to the competence answer. This result, like those of Luria in the 1930s (see Luria, 1976), has been taken as evidence that the illiterate subjects do not understand what is being asked of them: to answer the question solely on the basis of (an inference from) the premises given. Instead, so it is argued, they prefer to answer from personal experience, or to refrain from answering if they have no relevant experience. But this interpretation presupposes that the Kpelle subject

6 That "Mr Smith" is not a possible Kpelle name may have a bearing here.


adopts the material implication as the logical form of the first premise. If, as is more plausible, he adopts an interpretation of the conditional that allows exceptions, he can be charged only with not applying closed-world reasoning to Mr Smith. That is, if the Kpelle subject believes he has too little information to decide whether Mr Smith is abnormal, he is justified in refusing to draw the modus tollens inference. On this account, what the couple of years' elementary schooling teaches the child is a range of kinds of discourse in which exactly what to close the world on, and what to leave open, varies with some rather subtle contextual cues.
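The definition of validity given by CWA(ab), and its bearing on the suppression task and the Kpelle example, can be illustrated by brute-force model enumeration. The Python fragment below is a hedged sketch, not the authors' formal system: the function names are hypothetical, and reading the third suppression-task premise as "library not open → ab" is an assumption made here for illustration.

```python
from itertools import product

def entails(atoms, premises, conclusion):
    """Classical validity: the conclusion holds in every model of the premises."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(premise(v) for premise in premises) and not conclusion(v):
            return False
    return True

def completion(conditions):
    """CWA(ab): ab <-> (q1 or ... or qk or False) for the listed conditions."""
    return lambda v: v["ab"] == any(q(v) for q in conditions)

# Enriched conditional: essay and not-ab -> library.
atoms = ["essay", "library", "ab", "open"]
essay = lambda v: v["essay"]
conditional = lambda v: (not (v["essay"] and not v["ab"])) or v["library"]
studies = lambda v: v["library"]

# Two premises: classical logic is silent; closing the world on ab gives modus ponens.
classical_mp = entails(atoms, [essay, conditional], studies)
closed_mp = entails(atoms, [essay, conditional, completion([])], studies)

# Third premise read (by assumption) as: library not open -> ab.
library_closed = lambda v: not v["open"]
suppressed_mp = entails(
    atoms, [essay, conditional, completion([library_closed])], studies
)

# Kpelle example: kpelle and not-ab -> farmer; Mr Smith is not a farmer.
k_atoms = ["kpelle", "farmer", "ab"]
k_rule = lambda v: (not (v["kpelle"] and not v["ab"])) or v["farmer"]
not_farmer = lambda v: not v["farmer"]
not_kpelle = lambda v: not v["kpelle"]
open_mt = entails(k_atoms, [k_rule, not_farmer], not_kpelle)
closed_mt = entails(k_atoms, [k_rule, not_farmer, completion([])], not_kpelle)
```

On this encoding, modus ponens goes through only when the world is closed on ab with no known abnormality, it is withheld once the library premise supplies one, and modus tollens for Mr Smith likewise depends on being willing to close the world on him.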

Causal reasoning

There appears to be a strong empirical correlation between "theory of mind" reasoning and causal reasoning. In fact, recent findings on counterfactual reasoning in children have been used to claim that children's developing capacities in the domain of "theory of mind" might reflect the emergence of the ability to engage in counterfactual thinking over the preschool period (e.g., German & Nichols, 2003; Riggs, Peterson, Robinson, & Mitchell, 1998). For instance, Riggs et al. devised a "counterfactual" adaptation of the standard false belief task, in which a mother doll bakes a chocolate cake, in the process of which the chocolate moves from the fridge (its original location) to the cupboard. The question asked of the child is now:

(*) Where would the chocolate be if mother hadn't baked a cake?

This question is about alternative courses of events, not about mental states, and hence seems to use causal reasoning instead of reasoning about beliefs; but still there is a good correlation with performance on standard false belief tasks. We will have more to say on this correlation in "Attribution of beliefs and intentions" below.

Pragmatically, the formulation of question (*) suggests it must have an answer. The answer cannot come from classical logic, starting from the description of the situation alone: classical logic compels one to ask "what else could be the case?", reflecting the obligation to consider all models of the data. In particular there would be models to consider in which mother eats all the chocolate, or in which the chocolate evaporates inside the fridge (an event of extremely small, but still nonzero, probability). Of course nothing of the sort happens in actual causal reasoning. There a "principle of inertia" applies, which roughly says: "things and properties remain as they are, unless there is explicit information to the contrary".
This can be further spelled out as the closed-world assumption for reasoning about causality, CWA(c):

1. One assumes that only those events (affecting the entity of interest) occur that are forced to occur by the data; here the only such event is the chocolate's change of location from fridge to cupboard.
2. One also assumes that events only have those causal effects that are described by one's background theory; e.g., turning on the oven does not have a causal effect on the location of the chocolate.
3. No spontaneous changes occur, that is, every change of state or property can be attributed to the occurrence of an event with specified causal influence.
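The three assumptions can be made concrete in a small simulation. The Python fragment below is a minimal sketch and not the event calculus itself; the event names, the `EFFECTS` table and the function `project` are hypothetical choices made for illustration.

```python
# Background theory (assumption 2): the only listed causal effect is that
# baking the cake moves the chocolate; turning on the oven has none.
EFFECTS = {"mother bakes cake": {"chocolate": "cupboard"}}

def project(initial, events, effects=EFFECTS):
    """Inertia (assumption 3): a property keeps its value unless a listed
    event with a listed effect changes it."""
    state = dict(initial)
    for event in events:  # only events given in the data occur (assumption 1)
        state.update(effects.get(event, {}))
    return state

initial = {"chocolate": "fridge"}
story = ["mother turns on oven", "mother bakes cake"]

# What happened according to the story.
actual = project(initial, story)

# Counterfactual question (*): the same story without the baking event.
without_baking = [e for e in story if e != "mother bakes cake"]
counterfactual = project(initial, without_baking)
```

Under these assumptions `actual` places the chocolate in the cupboard and `counterfactual` leaves it in the fridge. Models in which the chocolate is eaten or evaporates never arise, because no listed event has such an effect.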

Together these principles suffice to derive an answer to (*). In fact this type of reasoning can be fully formalised in the "event calculus" originally developed in AI (see van Lambalgen & Hamm, 2004, for extensive treatment and references). Its logical structure is similar to the one detailed in "Unknown preconditions" above as regards properties 1 and 2, but property 3 brings in a new ingredient relating to development over time. Formally, this can be viewed as yet another twist to the definition of validity: one now obtains a notion according to which the conclusion is evaluated at a later instant than the evaluation time of the premises. The classical definition of validity assumes that the conclusion of an argument is evaluated on models of the premises, thus validating a property like p ⊨ p, that is, "on every model on which p is true, p is true". The definition of validity used in CWA(c) allows that models of the conclusion are temporal developments of models of the premises, and in this case we need no longer have p ⊨ p. Suppose the models for the premises are evaluated at time t, and the models for the conclusion are temporal developments of these models considered at time t′ > t. Clearly, even if p is true at time t, that same proposition p may be false at t′.

These considerations allow us to see the connection between closed-world reasoning and planning. One feature distinguishing human planning from that of other species is the much increased capacity for offline planning. This involves mentally constructing a model of (the relevant part of) the world and computing the effect of actions in that model over time, taking into account likely events and the causal consequences of the actions performed. The various forms of closed-world reasoning introduced so far have to be combined here to enable the construction of the model and the computation of its development over time.
What is interesting here for discussions of domain specificity is that the procedures used to construct models in offline planning can be used as well to construct models of linguistic discourse, for instance the structure of the events described by the discourse (see van Lambalgen & Hamm, 2004). It is proposed in the reference cited that offline planning has been exapted for the purposes of language comprehension, viewed as the ability to construct discourse models. If true, this would show an incursion of very general reasoning procedures into the purportedly domain specific language module.


Attribution of beliefs and intentions

Now consider a standard false belief task such as Wimmer and Perner's (1983) "Maxi task":

Maxi and Mummy are in the kitchen. They put some chocolate in the fridge. Then Maxi goes away to play with his friend. Mummy decides to bake a cake. She takes the chocolate from the fridge, makes the cake, and puts the rest of the chocolate in the cupboard. Maxi is returning now from visiting his friend and wants some chocolate.

Children are then asked the test question: Where does Maxi think the chocolate is? Normally developing children will be able to attribute a "false belief" to Maxi and answer "In the fridge" from around age 4 or so. It is illuminating to view the reasoning leading up to this answer as an instance of closed-world reasoning. What is needed first of all is an awareness of the causal relation between perception and belief, which can be stated in the form: "if φ is true in scene S, and agent a sees S, then a comes to believe φ", where φ is a metavariable ranging over proposition letters p, q, . . . . In other words, seeing is a cause of believing. Thus Maxi comes to believe that the chocolate is in the fridge. An application of the principle of inertia (cf. 3 above) yields that Maxi's belief concerning the location of the chocolate persists unless an event occurs that causes him to have a new belief, incompatible with the former. The story does not mention such an event, whence it is reasonable to assume (using 1 and 2) that Maxi still believes that the chocolate is in the fridge when he returns from visiting his friend.

Viewed in this way, attribution of belief is a special case of causal reasoning, and some correlation with performance on counterfactual reasoning tasks is to be expected. The tasks are not quite the same, however. The causal relation between perception and belief is an essential ingredient in the false belief task, absent in the counterfactual task.
There are two sides to this: positively, that a belief may form after seeing something; and negatively, that there are only a few specified ways in which beliefs can form, e.g., by seeing, by being told, and by inference. This negative aspect is an application of closed-world reasoning. Children failing the false belief task could master causal reasoning generally, but fail on the aspects just mentioned. So, assimilating the reasoning involved in "theory of mind" tasks to defeasible reasoning potentially provides both a basis for continuity with earlier developmental or evolutionary precursors and a basis for discontinuity: it is causal reasoning by closed-world assumptions, but causal reasoning by closed-world assumptions of a specific kind. Reasoning about minds is


reasoning in a specific domain, but its characterisation may be possible by a rather small extension of a logical framework for other domains.

In summary, a wide range of everyday tasks involve closed-world reasoning: planning, and adapting to failures of plans during their execution, diagnosis of causes, causal reasoning itself, reasoning about mental behaviour and states, interpreting speakers' intentions underlying discourse, etc. The treatment of this defeasible reasoning in a great variety of nonmonotonic logics is one important branch of logic, which provides productive frameworks for studying the contextual and content sensitivity of human reasoning.

How does all this help one to understand domain specificity or generality of reasoning?

We may adopt the following as a working definition for "module" in the sense of evolutionary psychology: a module is a set of computational processes that output answers just about a particular domain. (This captures the domain specificity, but not the encapsulation, of the original Fodorian modules.) Cosmides' results on the Wason task were taken to imply that there is no computational process corresponding to logical reasoning over all domains. Only isolated domains such as social contracts would have a correlated reasoning module (in this case cheater detection), but there is no transference to other domains. The backdrop of Cosmides' discussion is a view of logic that emphasises its domain generality. This is exemplified, for instance, in Ryle's 1954 description of logical constants (for example all, some, not, and, or, if) as being indifferent to subject-matter, or topic neutral. Characterisations such as this are motivated by the classical, albeit informal, definition of validity, say for the syllogism

All A are B. All B are C. Therefore, all A are C.

as implying

Whatever you substitute for the predicates A, B and C, if the premises are true for this substitution, then so is the conclusion.

This would hold regardless of whether you intend to talk about mathematical entities, physical objects or actions, or indeed anything else: logical constants are topic-neutral. The topic neutrality of logical inferences is then identified with "domain generality". This identification is used to argue that actual human reasoning cannot be viewed as the application of formal logical patterns because it is not domain general in this sense. It is the


purpose of this section to argue, using the discussion of closed-world reasoning above, that human reasoning is both domain speci®c and domain general, and that none of this stands in the way of viewing human reasoning as proceeding according to formal logical laws. Domain speci®city of logic. As the example of the difference between train schedule and telephone directory showed, reasoning principles are sometimes domain speci®c. The prospective passenger applies closed-world reasoning to train schedules, but no-one would apply it to the telephone directory. Nevertheless, the reasoning involved in the former case is in a restricted sense topic-neutral; it is not, for example, about a particular train schedule at Amsterdam Central Station, but about schedules generally. How the formal patterns are determined. It requires world knowledge, semantic knowledge and reasoning to determine what formal logic corresponds to a proposed domain. We know that train schedules are to be used as closed worlds, whereas we can easily think of counterexamples to closedworld reasoning for telephone directories. Natural language conditionals, such as those describing regularities, are exception-tolerant, from which we can infer that they are not governed by classical logic, although forms of closed-world reasoning may be appropriate, as we have seen when discussing unknown preconditions. Knowledge is required to set the parameters involved in logical form, and each setting de®nes a domain. Once the parameters have been set, however, reasoning can proceed according to formal laws.7 Domain generality of logic. While being domain speci®c in the sense that different domains require different laws, logical reasoning is at the same time amazingly domain general. We have seen that closed world reasoning is a general reasoning scheme that occurs across a wide variety of domains ± for example in action planning, natural language comprehension, and attribution of beliefs. 
This is not the place for mathematical disquisitions, so the reader should take on trust that this generality of closed-world reasoning can be represented by means of a single definition involving a number of parameters. Domain generality of logic does not mean that a single set of rules can be applied across the board. It does mean, however, that human reasoning can be viewed as a general procedure that initiates a search for parameters (in which domain are we?), and that generates logical laws appropriate to the domain. This procedure can be called by several hypothesised modules, such as those for language and for theory of mind.
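The idea of one general procedure that first settles a domain parameter and then reasons by the laws that parameter selects can be illustrated with a minimal sketch. Everything named here (`Domain`, `closed_world`, `query`, and the sample entries) is hypothetical and chosen only for illustration; it is not the authors' formal definition, which involves more parameters than a single flag.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Domain:
    """A domain is characterised here by one illustrative parameter."""
    name: str
    closed_world: bool  # is the listing treated as complete?


def query(domain: Domain, listed: Set[str], item: str) -> Optional[bool]:
    """Answer True, False, or None (unknown) for a single item.

    The procedure is the same for every domain; only the parameter
    decides what may be concluded from an item's absence.
    """
    if item in listed:
        return True
    # Closed world: what is not listed is taken to be false.
    # Open world: absence licenses no conclusion.
    return False if domain.closed_world else None


train_schedule = Domain("train schedule", closed_world=True)
phone_directory = Domain("telephone directory", closed_world=False)

departures = {"09:05 Utrecht", "09:35 Utrecht"}   # made-up entries
subscribers = {"A. Jansen", "B. de Vries"}        # made-up entries

print(query(train_schedule, departures, "09:20 Utrecht"))  # no such train
print(query(phone_directory, subscribers, "C. Bakker"))    # cannot tell
```

The same `query` function serves both domains, which is the sense in which the procedure is general; the differing answers come entirely from the parameter setting, which is the sense in which the resulting laws are specific.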

7 The reader may ask: "What about the reasoning involved in determining logical form? Is this itself logical and formalisable?" The tutorial dialogues on the Wason task cited in Stenning and van Lambalgen (2004) show that there is indeed identifiable logical reasoning going on, albeit sometimes as islands in a sea of confusion. In any case, if the analysis given here is correct, reasoning towards a logical form is fully general: it is about the relation between natural language expressions and domains.

8. Domain generality of cognition


There is thus no need to assume that the computations going on in theory of mind are disjoint from those in language comprehension, which weakens their claim to be called modules.

Processing, logic and modules. Processing is a neglected area in evolutionary psychology, but it is of paramount importance for reasoning. Different domains may have different processing requirements resulting from their different formal structure – what we have termed logical form – but this provides no argument for domain specificity. The selection task in its descriptive guise is a case in point. Descriptive semantics mean that we have to test truth, which requires testing relations between sets of cards. In the very particular circumstances of the selection task, this creates contingencies between choices, depending on feedback about what is on the hidden side of the cards; if there is no such feedback, the working memory load is high. The processing load for descriptive conditionals is much higher than that for deontic conditionals, where, as a consequence of the semantics, the problem of contingencies cannot arise. Therefore, general limitations on working memory or executive function may have strong effects on reasoning in one domain but not another. Performance differences thus cannot be used to argue for an evolutionary adaptation to a particular domain, in the form of a module that computes only over that domain. We have argued that closed-world reasoning occurs in such diverse domains as action planning, natural language interpretation and attribution of belief. Processing in these domains is fast and largely unconscious and, on the face of it, very different from that in standard reasoning tasks presented to subjects in the format "given premises Γ, does conclusion φ follow?" Elsewhere, we have argued in detail that the processing underlying closed-world reasoning is an instance of a much more general form of mental processing.
One direct approach to the question of what processing is going on is to consider neurally plausible implementations. A great deal of work in computational complexity theory has shown just how hard it is to implement classical logic. By contrast, closed-world reasoning can be implemented rather efficiently in neural networks. This is because, as explained by Stenning and van Lambalgen (2005, 2007), determining the set of propositions that are true in a closed world can be viewed as computing stable states of suitable neural networks. This observation may go some way toward explaining why these inferences proceed automatically, at least in the domains studied here (action, language comprehension, belief attribution). It is in fact the existence of highly efficient neural implementations of closed-world reasoning that makes it a plausible candidate for automatic psychological processes. At a very rough first approximation, one might hypothesise that what Evans (2003), Stanovich (1999), and others have called Type I automatic processes are captured by defeasible closed-world reasoning, whereas conscious deliberate Type II processes might either be classical or be a class of "repair" processes still within closed-world


reasoning. Type I and Type II processes are empirically observed to be very different, so if their underlying logics are different, this would be a promising place to look for diversity of logics determining diversity of processing. The conclusion to be drawn from this discussion is that, in the case of reasoning, domain specificity and domain generality are false opposites: reasoning is both specific and general. In so far as reasoning is specific, however, there is no ground whatsoever to associate it with a module. First, specificity results from parameter-setting in a general concept of reasoning. Second, observed performance differences in different domains do not indicate that we have a module for one and not the other, but only that the processing demands are different.
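The earlier remark that determining what is true in a closed world amounts to computing a stable state can be made concrete with a small sketch. The code below is not the neural implementation described by Stenning and van Lambalgen; it is a plain iteration over rules until nothing changes, which is the kind of fixed point such a network would be expected to settle into. The rule format, the `strata` argument, and the atoms (`essay`, `library`, `ab`, `library_closed`) are hypothetical illustrations.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

# A rule is (head, positive_body, negated_body):
# the head holds if every positive atom is true and no negated atom is true.
Rule = Tuple[str, Sequence[str], Sequence[str]]


def closed_world_model(facts: Set[str],
                       rules: List[Rule],
                       strata: List[FrozenSet[str]]) -> Set[str]:
    """Return the atoms that are true; everything else counts as false.

    `strata` orders the derived atoms so that an atom is only negated
    after the layer defining it has reached its stable state.
    """
    true = set(facts)
    for layer in strata:
        changed = True
        while changed:                      # iterate to a stable state
            changed = False
            for head, pos, neg in rules:
                if (head in layer and head not in true
                        and all(p in true for p in pos)
                        and all(n not in true for n in neg)):
                    true.add(head)
                    changed = True
    return true


rules: List[Rule] = [
    ("ab", ["library_closed"], []),         # a known exception
    ("library", ["essay"], ["ab"]),         # a rule that tolerates exceptions
]
strata = [frozenset({"ab"}), frozenset({"library"})]

print(closed_world_model({"essay"}, rules, strata))
print(closed_world_model({"essay", "library_closed"}, rules, strata))
```

With only `essay` given, nothing supports `ab`, so it is taken to be false and `library` is derived. Adding `library_closed` makes `ab` true and withdraws the conclusion, which is the defeasible behaviour that distinguishes closed-world reasoning from classical logic.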

How did human reasoning get general?

We have argued that humans reason abstractly using a wide variety of logics appropriate to different tasks, but that the identification of which formalisation to adopt is generally cued by content. We propose taking formalisation seriously as a process engaged by everyday reasoning. Understanding logical form as being decided by the setting of many parameters, giving a landscape of possible systems, leaves many questions about processing undecided, but strongly militates against the view that each completely specified system is a module. The very idea of setting parameters suggests that the resulting systems share functional characteristics when they share parameters. In this final section we will speculate about how a logical approach might map onto a biological one in which modularisation plays a prominent part. The human mind is modular because organisms are modular. We have argued that finding human cognitive novelties and identifying them with novel modules is not a good approach to analysing human evolution. If the human mind didn't arise through the addition of a large number of new modules, each identified as adding a human innovation, is there some more biologically plausible view of how it might have arisen that stands a chance of explaining the greater generality of human reasoning abilities? A more biological approach is to seek cognitive continuities with our ancestors, and then against that background of continuity, to seek ways to specify innovations and discontinuities. With the giraffe, it is easy to see that necks are the continuity, and the innovation is in length. With cognitive functions, the homologies will not be so obvious. Interestingly, a number of disparate lines of research suggest that we may not need to go much beyond our analysis of logical reasoning to find some of the relevantly continuous apparatus. Systems of closed-world reasoning are logics of planning.
We plan with respect to our expectations of the world, not, as we would have to if we planned using classical logic, with respect to all logical possibilities.


Maintaining a model of the current state of the immediate environment relevant to action is a primitive biological function; calculating what is true in all logically possible models of the current sense data is not. These planning logics are just as much what one needs for planning low-level motor actions such as reaching and grasping as they are for planning chess moves (see Shanahan, 2000, for examples of such use in robotics).8 Recursion is a very important part of planning. We plan with respect to a main goal, and this then creates subgoals and subsubgoals with respect to which we plan recursively. Our early primate ancestors became planners of much more complex motor actions when they took to the trees and acquired the kind of forward-facing binocular overlap needed for the depth perception that brachiation requires. Recursion no doubt got a huge boost from our arboreal habits. As neuroscientists have pointed out, it is intriguing that the motor areas for planning speech are right next to the motor areas for planning action. This hypothesis about the cognitive continuities between primate ancestors and humans has been elaborated by many researchers (Arbib & Rizzolatti, 1997; Greenfield, 1991). Approaching from the direction of the syntactic and semantic analysis of temporal expressions of natural languages also directs attention to planning as underlying our faculties for language (Steedman, 2002; van Lambalgen & Hamm, 2004). Another striking demonstration of unexpected continuities in planning in internal mental functions is provided by work on monkeys' working memory. Cebus apella have been shown to have hierarchical planning capabilities with respect to their working memories remarkably similar to the hierarchical chunking strategies that are evidenced in human list recall. When humans are given a suitably categorised list of words, the animal-words all come out clustered together, followed by the vegetable-words, etc.
McGonigle, Chalmers, and Dickinson (2003) trained monkeys on a touchscreen task requiring that they touch each of a set of hierarchically classifiable icons exhaustively. Of course, if the positions of the icons remain constant between touches, very simple spatial-sweep strategies suffice. So the icons are spatially shuffled after each touch. The monkeys have to remember, with no external record, just which they have already touched, and they still succeed. McGonigle showed that the monkeys were capable of efficiently exhausting an array of nine items, and, more interestingly, they did so by touching all the icons of one shape, followed by all of those of another, just as humans' recall sequence is clustered. Here is recursive hierarchical planning in individual (rather than socially expressed) working memory, in the service of strategic planning of action sequences.
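The structure of the touchscreen task can be sketched as a short simulation. This is an illustration of the task logic under stated assumptions, not a model taken from McGonigle, Chalmers, and Dickinson: the function `exhaust_icons`, the three shapes, and the rule "finish the current shape before starting another" are all hypothetical. The point it shows is that, once positions are shuffled after every touch, success depends on an internal record, and a category-by-category plan yields the clustered sequence described above.

```python
import random
from typing import List, Tuple

Icon = Tuple[str, int]  # (shape, identifier within that shape)


def exhaust_icons(icons: List[Icon], seed: int = 0) -> List[Icon]:
    """Touch every icon exactly once, clustering touches by shape."""
    rng = random.Random(seed)
    screen = list(icons)
    touched: List[Icon] = []      # internal memory; the screen keeps no record
    current_shape = None
    while len(touched) < len(icons):
        rng.shuffle(screen)       # positions change after every touch
        remaining = [icon for icon in screen if icon not in touched]
        same_shape = [icon for icon in remaining if icon[0] == current_shape]
        # Stay within the current category while it has untouched members;
        # otherwise open a new category with whatever comes first on screen.
        choice = (same_shape or remaining)[0]
        current_shape = choice[0]
        touched.append(choice)
    return touched


icons = [(shape, n) for shape in ("circle", "square", "triangle")
         for n in range(3)]       # nine items, as in the array described
order = exhaust_icons(icons, seed=1)
print([shape for shape, _ in order])
```

Because the spatial position of an icon carries no information from one touch to the next, a sweep strategy has nothing to work with here; only the `touched` list and the current category guide the next choice.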

8 This should remind the reader that there is nothing particularly linguistic about logic, which is one reason why logical analysis may be particularly useful for finding evolutionary continuities between prelinguistic and postlinguistic cognition.


Seeing such sophisticated strategic planning with respect to a private memory function in monkeys (which are more than 20 million years diverged from our ancestral line) is rather suggestive of the hypothesis that human innovations may have more to do with introducing the social expression of recursive planning in communication than with any lack of recursion in our ancestors' individual mental capacities.9 Planning therefore provides a good basis for understanding cognitive continuities at various timescales in our evolution. At least some of the simpler versions of closed-world reasoning, unlike classical logic, carry out the biologically primitive function of maintaining a model of the current environment, and, as mentioned above, are demonstrably extremely efficiently neurally implementable. What about discontinuities? If externalisation of planning, and plan recognition abilities, is one candidate area for what is special about human cognition, what does closed-world reasoning have to offer as a framework for understanding the transition to modern humans? In our discussion of causal reasoning, and attribution of beliefs and intentions, we suggested that reasoning about beliefs can be viewed as an extension of causal reasoning in which perception is a kind of causal effect on beliefs. Then there is still more complex reasoning about false beliefs, and about intentions for intentions to be recognised. Without pretending to have a full cognitive account of reasoning about minds in terms of closed-world reasoning, we would claim that this is a good potential basis for understanding the cognitive discontinuities of our reasoning capacities as well as the continuities. It is a good framework because it offers many gradations of reasoning and implementation. For example, we saw above that closed-world reasoning is a whole family of modes that can model many qualitative differences in what inferences are valid.
The psychological literature on reasoning about mental states indicates the need for such gradations. At what stage "theory of mind" abilities emerge is at present controversial in human development. False-belief tasks were proposed as diagnosing a lack of these abilities in normal 3-year-olds and their presence in normal 4-year-olds (Leslie, 1987). Others have proposed that the irrelevant linguistic demands of these tasks deceptively depress 3-year-olds' performance. For example, in the "Sally-Anne" task, the child sees the doll see the sweet placed in one box, and then the child, but not the doll, sees the sweet moved to another. Now if the child is asked "Where will the doll look for the sweet first?" instead of "Where will the doll look for the sweet?", then children as young as two

9 This approach contrasts with Chomsky's belief that recursion is what is novel about human language. But at a deeper level it is closely aligned with Chomsky's own preference for the hypothesis that language may have evolved from a private "language of thought" whose function was internal mental calculation, and that has only rather recently become expressed as a medium of social communication (e.g., Chomsky, 2002, p. 76).


can sometimes solve the problem (Siegal & Beattie, 1991). Intriguingly, this might be read as evidence of the 3-year-olds in the original task adopting a deontic reading of the question (Where should the doll look?) rather than a descriptive one (Where will the doll look first?).10 These arguments push reasoning about intentions earlier in ontogeny. Above we suggested that some theory-of-mind failures might be more perspicuously described as failures of reasoning about possibilities (rather than specifically about mental states), so there is a great need for a more analytical classification of these reasoning tasks. An approach based on the variety of logics and their contextual triggers offers a gradation of models of performance that can plausibly explain such developmental sequences – whereas posing a theory-of-mind module itself offers little help. This approach through continuities and discontinuities of function still needs to be supplemented by a much more biological grounding. We will end by illustrating what one highly speculative grounding might look like (Stenning, 2003). Our purpose here is to provide an example of how the different levels of evidence – from cognitive function to genetics – might actually come together in a biological account of human speciation that does justice to the large-scale innovations of the human mind and the far greater generality of our reasoning. One of the great biological distinctions of the species Homo sapiens is the immature state of its offspring at birth, and the lengthened period of their maturation (known in biology as altriciality). We will take this example of a biological innovation and trace out some of its consequences and how it relates evidence at many levels.
Here is a novel feature that engages biologically both with our social reorganisation into a group-breeding species – intensified by the cost of rearing altricial infants – and with the distinctive cognitive changes that enable language and culture in such a group-breeding species. Remember that the purpose of our example is to provide a contrast with the kinds of stories on offer that identify cognitive innovations as modules. Altriciality is an example that shows how one selection pressure can plausibly give rise to changes in large numbers of other modular systems. It is an excellent illustrative example of biological grounding because it brings together effects at so many levels. Evolution does not, by and large, proceed by adding new modules, but by retiming the control of developmental processes of old ones. Altriciality is just such a process where plausibly few control elements resequence the operation of macro modules in development. Evolutionary stories must begin with a characterisation of selection pressures. A prime candidate for the driving force behind human altriciality

10 There are of course other possibilities. One, which also echoes a problem in the selection task, is that the younger child's problem may be with sequencing contingencies in their responses.


is that constraints on growth rates of neural tissue, along with maternal anatomy, may mean that altriciality was the only way to develop a big-brained narrow-hipped species of biped. Whatever the pressure for larger brains, these would have forced more altricial birth, given constraints on maternal pelvic dimensions – "Get out while you can!" being the wise human infant's motto. There may even be a positive feedback loop here – more altricial birth means greater dependence and more pressure for social coordination to group-rear infants. But social coordination is one candidate for driving the need for bigger brains, and so it goes round. It is easy to see that altriciality has radically affected the data to which human learning mechanisms are exposed, and the sequence of exposure. Maturational mechanisms that previously occurred in the womb now happen in the external environment. Humans are, for example, unique in the amount of postnatal neural growth they exhibit. The human changes in the duration and sequencing of brain development, which constitute an important part of altriciality, are prime candidates for being one cause of the kind of widespread modular repurposing that took place in human speciation. Altriciality also provides a good way into the finest grain of evolutionary explanation – the molecular genetics level. There is, for example, a recently discovered phenomenon called "imprinting" whereby alleles of maternal origin have different effects to the same alleles of paternal origin (Isles & Wilkinson, 2000). Imprinting applies to relatively few, and only mammalian genes, which overwhelmingly control aspects of resource allocation in development. An increase in parental investment in offspring is one defining characteristic of mammals.
The dominant theory of imprinting is that it is part of a competition between mother's and father's interests in reproductive resource allocation, although there is a good case that a non-competitive explanation for imprinting on the X sex-chromosome is required (Iwasa, 1998). Interestingly, the X chromosome appears to influence the development of cognitive traits disproportionately (Zechner, Wilda, Kehrer-Sawatzki, Vogel, Fundele, & Hameister, 2001). Be that as it may, imprinted genes are independently identifiable, and are evidently prime candidates for controlling the changed pattern of foetal maturation in Homo sapiens, and especially of brain development – big brains are supremely metabolically expensive, as well as being difficult to give birth to. The molecular signatures of imprinting have enabled researchers to track down several genes controlling brain growth during maturation (mostly in mice – 70 million years diverged from our ancestral line). Maternally imprinted genes (most though not all imprinting is maternal) are responsible for controlling the growth of the frontal lobes: one of the areas of greatest human innovation in development, and a prime seat of the planning functions we are focused on in our cognitive analysis. Paternally imprinted genes control growth of the hypothalamus (Goos & Silverman, 2001). To put it facetiously, mum controls the development of civilisation


and dad the animal urges. Slightly more intricately, the unique degree to which human brains, and especially the planning frontal lobes, continue growing after birth may prove to be an innovation controlled by mothers' interests in reproductive resource allocation – specifically her interests in her offspring's capacities for understanding other minds. If such accounts hold up, then they provide evidence about the originating selection pressures for the explosive growth of human planning, reasoning, and communication capabilities. Just-so stories about evolution get grounded in selection pressures as evidenced by molecular genetics. These capabilities may have proved useful for mammoth hunting or for mate attraction, but to the extent that they are controlled by imprinted genes, their development is more likely to have been driven by changes in maternal reproductive resource allocation, or pressures for sexual dimorphy. Focusing on altriciality as a biologically recognisable human innovation offers the possibility of connecting cognition to the burgeoning information sources offered by genomics. Perhaps genetic imprinting can offer a methodology to identify some candidate genes for involvement in human speciation? Already a number of human behavioural syndromes are known to be related to abnormalities of imprinting (e.g., Prader-Willi, Angelman, and Beckwith-Wiedemann syndromes; Everman & Cassidy, 2000). Claims have been made that imprinting plays a role in determining the sex-ratio of cases of autism (Skuse, 2000). It does in mouse models of autism where a particular gene can be shown to affect analogues of executive function tasks (Davies et al., 2005). This is all highly speculative, but serves to highlight the opportunities that are emerging for linking levels of account.
Here may be a place to look for the developmental and neurological substrates of humans' vastly more general planning capacities, and the evolutionary processes that pressed them into service for communication and reasoning. The search for neural and genetic substrates of human innovation is unlikely to be successful until it is based on a plausible functional analysis of the very general capacities of human reasoning. To return to the issue of domain generality of reasoning, we hypothesise that the intensification of humans' focus on reasoning about the intentions of conspecifics, and particularly their communicative intentions, arose as an outgrowth of existing capacities for reasoning in other kinds of planning. One very important contribution of altriciality to human cognitive evolution is its pressures toward cooperation – first between mother and infant, then between adults in small groups, and outwards to larger societies. Hare, Brown, Williamson, and Tomasello (2002) have pointed to the predominantly competitive interactions between chimpanzees as being a major brake on the development of their mind-reading abilities. Even domestic dogs are superior in this regard, and, interestingly, domestic dogs are also altricial. There are good general reasons for believing that the social coordination required for the establishment and preservation of language conventions


requires a highly cooperative setting such as that altriciality provides. For example, Davidson's (1973) arguments for the "principle of charity" in interpretation provide just such reasons. If we were really so focused in our social dealings on whether we are being cheated or lied to, it seems unlikely that human communication would ever have got off the ground. Reasoning about interpretations while, for example, learning a language, is quite hard enough under assumptions of cooperation, and the evidence of misalignment is all too easy to come by once one expects deliberate misalignment. Policing of contracts may be important at the margins, but is not a plausible explanation of the initial establishment of cooperation. The example of altriciality illustrates how the impact of a single coherent biological change can ramify throughout the biology of a species altering myriad ancestral functions, each itself already modularised in complex ways. It also illustrates how such a synoptic picture can be essential to analysing detail. Its most obvious contribution here is to focus attention on a different life-stage. "Evolutionary" psychologists have overwhelmingly concentrated on adult characteristics as adaptations. Cosmides' cheating detector module is a typical example, that confers reproductive advantage on the exchange of goods and services by adults. It is founded on evidence from supposed failure of language comprehension in the selection task, but it contributes nothing to relating that "failure" to the evident successes of human communication – that is presumably left to other modules. The account through a novel module also contributes nothing to our understanding of how cheating detection is related to pre-existing ancestral machinery (say in the affective/emotional systems).
In summary, our sketch of an answer to how human cognition became so much more general than that of our immediate ancestors is a variant of the traditional one that communication through language (suitably understood) is central. But it is a variant with considerable twists. Humans gained the ability to plan and reason about the complex intentions involved in communication, and so to process the multiple interpretations in the multiple logics required for natural language use. Considerable planning in thought may have been possible before it could be externalised in communication. The languages of thought may have led to the languages of expression. One of the important final driving forces may have been altriciality, which had the effect of shifting the focus of our cognition onto social and mental reasoning.

References

Arbib, M. A., & Rizzolatti, G. (1997). Neural expectations: A possible evolutionary path from manual skills to language. Communication and Cognition, 29, 393–424.
Byrne, R. M. J. (1989). Suppressing valid inferences with conditionals. Cognition, 31, 61–83.
Canessa, N., Gorini, A., Cappa, S., Piattelli-Palmarini, M., Danna, M., Fazio, F., et al. (2005). The effect of social content on deductive reasoning: An fMRI study. Human Brain Mapping, 26, 30–43.
Chomsky, N. (2002). On nature and language. Cambridge: Cambridge University Press.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Davidson, D. (1973). Radical interpretation. Dialectica, 27, 289–301. (Reprinted in Davidson, D. (2001). Inquiries into truth and interpretation (2nd ed.). Oxford: Clarendon Press.)
Davies, W., Isles, A., Smith, R., Karunadasa, D., Burrmann, D., Humby, T., et al. (2005). Xlr3b is a new imprinted candidate for X-linked parent-of-origin effects on cognitive function in mice. Nature Genetics, 37, 625–629.
de Villiers, J. G., & de Villiers, P. A. (2000). Linguistic determinism and the understanding of false beliefs. In P. Mitchell & K. Riggs (Eds.), Children's reasoning and the mind (pp. 191–228). Hove, UK: Psychology Press.
Evans, J. St B. T. (2003). In two minds: Dual-process accounts of reasoning. Trends in Cognitive Sciences, 7, 454–459.
Everman, D. B., & Cassidy, S. B. (2000). Genetics of childhood disorders: XII. Genomic imprinting: Breaking the rules. Journal of the American Academy of Child and Adolescent Psychiatry, 39, 386–389.
German, T. P., & Nichols, S. (2003). Children's counterfactual inferences about long and short causal chains. Developmental Science, 6, 514–523.
Goos, L., & Silverman, I. (2001). The influence of genetic imprinting on brain development and behaviour. Evolution and Human Behavior, 22, 385–407.
Greenfield, P. (1991). Language, tools and the brain: The ontogeny and phylogeny of hierarchically organized sequential behavior. Behavioral and Brain Sciences, 14, 531–595.
Griggs, R. A., & Cox, J. R. (1982). The elusive thematic-materials effect in Wason's selection task. British Journal of Psychology, 73, 407–420.
Hare, B., Brown, M., Williamson, C., & Tomasello, M. (2002). The domestication of social cognition in dogs. Science, 298, 1634–1636.
Hughes, C., & Russell, J. (1993). Autistic children's difficulty with mental disengagement from an object: Its implications for theories of autism. Developmental Psychology, 29, 498–510.
Isles, A. R., & Wilkinson, L. S. (2000). Imprinted genes, cognition and behaviour. Trends in Cognitive Sciences, 4, 309–318.
Iwasa, Y. (1998). The conflict theory of genomic imprinting: How much can be explained? Current Topics in Developmental Biology, 40, 255–293.
Johnson-Laird, P. N., & Byrne, R. M. J. (2002). Conditionals: A theory of meaning, pragmatics and inference. Psychological Review, 109, 646–678.
Johnson-Laird, P. N., Legrenzi, P., & Legrenzi, M. S. (1972). Reasoning and a sense of reality. British Journal of Psychology, 63, 395–400.
Leslie, A. M. (1987). Pretence and representation: The origins of a "theory of mind". Psychological Review, 94, 412–426.
Luria, A. R. (1976). Cognitive development: Its cultural and social foundations. Cambridge, MA: Harvard University Press.
McGonigle, B., Chalmers, M., & Dickinson, A. (2003). Concurrent disjoint and reciprocal classification by Cebus apella in serial ordering tasks: Evidence for hierarchical organization. Animal Cognition, 6, 185–197.
Nickerson, R. S. (1996). Hempel's paradox and Wason's selection task: Logical and psychological puzzles of confirmation. Thinking and Reasoning, 2, 1–31.
Oaksford, M., & Chater, N. (1996). Rational explanation of the selection task. Psychological Review, 103, 381–392.
Riggs, K., Peterson, D., Robinson, E., & Mitchell, P. (1998). Are errors in false belief tasks symptomatic of a broader difficulty with counterfactuality? Cognitive Development, 13, 73–90.
Ryle, G. (1954). Dilemmas. Cambridge: Cambridge University Press.
Scribner, S. (1997). Mind and social practice: Selected writings of Sylvia Scribner. Cambridge: Cambridge University Press.
Shanahan, M. (2000). Reinventing Shakey. In J. Minker (Ed.), Logic-based artificial intelligence (pp. 233–253). Dordrecht, The Netherlands: Kluwer.
Siegal, M., & Beattie, K. (1991). Where to look first for children's knowledge of false beliefs. Cognition, 38, 1–12.
Skuse, D. (2000). Imprinting, the X-chromosome, and the male-brain: Explaining sex differences in the liability to autism. Pediatric Research, 47, 9–16.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 39–67). New York: Cambridge University Press. (Revised version in Sperber, D. (1996). Explaining culture: A naturalistic approach. Oxford: Blackwell.)
Sperber, D. (2002). In defence of massive modularity. In E. Dupoux (Ed.), Language, brain and cognitive development: Essays in honor of Jacques Mehler (pp. 47–57). Cambridge, MA: MIT Press.
Stanovich, K. E. (1999). Who is rational? Studies of individual differences in reasoning. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Steedman, M. (2002). Plans, affordances and combinatory grammar. Linguistics and Philosophy, 25, 725–753.
Stenning, K. (2003). How did we get here? A question about human cognitive evolution. Amsterdam: Amsterdam University Press.
Stenning, K., & van Lambalgen, M. (2004). A little logic goes a long way: Basing experiment on semantic theory in the cognitive science of conditional reasoning. Cognitive Science, 28, 481–530.
Stenning, K., & van Lambalgen, M. (2005). Semantic interpretation as reasoning in nonmonotonic logic. Cognitive Science, 29, 919–960.
Stenning, K., & van Lambalgen, M. (2007). Human reasoning and cognitive science. Cambridge, MA: MIT Press.
van Lambalgen, M., & Hamm, F. (2004). The proper treatment of events. Oxford: Blackwell.
van Lambalgen, M., & Smid, H. (in press). Reasoning patterns in autism: Rules and exceptions. In L. Perez Miranda & J. Larrazabal (Eds.), Proceedings of the eighth international colloquium on cognitive science Donostia/San Sebastian. Dordrecht, The Netherlands: Kluwer.
Wason, P. C. (1968). Reasoning about a rule. Quarterly Journal of Experimental Psychology, 20, 273–281.
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13, 103–128.
Zechner, U., Wilda, M., Kehrer-Sawatzki, H., Vogel, W., Fundele, R., & Hameister, H. (2001). A high density of X-linked genes for general cognitive ability: A runaway process shaping human evolution? Trends in Genetics, 17, 697–701.

Part II

Extreme domain specificity and cognitive development

9

Domain general processes in higher cognition

Analogical reasoning, schema induction and capacity limitations

Graeme S. Halford and Glenda Andrews

The existence of modular processes in human cognition is now widely accepted as one of the "givens" of the field. The notion that all human cognition is undertaken by a unitary system – from the most basic perceptual processes to thought and language – is virtually impossible to sustain. On the other hand, it is also very hard to accept that all cognitive processes are performed by independent modules, if only because some mechanism would be needed to coordinate them, and this process would require some degree of domain independence. Both Baddeley (1986) and Newell (1990) have recognised this, the first with the concept of the central executive, and the second with the quest for unified theories of cognition. Given that we recognise the existence of both modular and domain general processes, we need objective means for distinguishing between them. We are left then with the following problems.

1. We have to distinguish modular processes from the alternative, which might be variously characterised as domain-independent, central, or control processes. We will refer to these collectively as domain general processes. Making this distinction includes determining how modular and general processes communicate with each other.
2. In order to do this, we have to define the properties of each level of process.

We will propose that some phenomena that have been attributed to modular processes may instead be explained by higher cognitive processes that are of considerable generality, and that the apparent plausibility of modular explanations may be due, at least in part, to the relative lack of effort to investigate the role of more general processes in the relevant contexts. To advance this thesis we will first consider how the modularity of a given cognitive process may be demonstrated empirically. We will do this by examining what is arguably an example par excellence of a modular process, early vision (Marr, 1982; Pylyshyn, 1999). Then we will consider the Wason four-card selection task. A widely accepted explanation of performance on this task is based on modular, or at least domain specific, cognitive processes, and we will show how performance might instead be explicable by highly general cognitive processes: schema induction and analogical reasoning. Then we will consider other properties of cognition that demonstrate some degree of domain independence, focusing on cognitive complexity and capacity limitations. Finally, we will consider implications for interpretation of the cognitive development literature.

Distinguishing modular processes

How can modular processes be distinguished from domain general processes? An influential conception of modularity was proposed by Fodor (1983). Modular systems are primarily concerned with the processing of perceptual information, and are rapid, automatic, informationally encapsulated and relatively unmodifiable. Our distinction will be based on these criteria.

Early vision

Pylyshyn (1999) has proposed that early vision – using the term in the sense used by Marr (1982) to refer to preinterpretive processes – is an example of a modular process. An example of early vision would be the construction of a representation of a three-dimensional object from a two-dimensional retinal stimulation. This process is automatic and impenetrable by higher cognitive processes. The information that does influence it comes, according to Pylyshyn, from within the visual system. Higher cognition influences postperceptual processes such as categorisation and inference, rather than perception per se. It is perception that enables us to see the three-dimensional object, but higher cognition that tells us it is a car, that it is dangerous if coming towards us, that we can use it for transport, and so on. If we accept Pylyshyn's thesis, then early vision is an example par excellence of a modular process.

Given the first of our aims above, we can use Pylyshyn's analysis to indicate the sort of evidence that distinguishes modular from nonmodular processes. Pylyshyn argues that early vision is modular for the following reasons.

1. It is encapsulated from cognition, and is cognitively impenetrable. The best explanation of cognitive penetrability seems to be that of Pylyshyn (1999, p. 3): "if a system is cognitively penetrable then the function it computes is sensitive, in a semantically coherent way, to the organism's goals and beliefs, i.e., it can be altered in a way that bears some logical relation to what the person knows". Early vision is cognitively impenetrable because the top-down processes that do occur, such as filling-in or Gestalt-like figural completion, come from within the visual system rather than from outside it. Nonvisual influences include focal attention and signals originating in both the visual and motor systems.
2. Perception is resistant to rational cognitive influence, so knowing that something is an illusion does not make it disappear. Perceptual principles are responsive only to visually presented information, and not to the cognitive context. Thus, if you view two persons in an Ames room, they appear very different in size, and this illusion is not reduced by your knowledge that the two people are similar in size, nor is it reduced by even the most detailed and precise knowledge of the processes that contribute to the Ames illusion.
3. Neuroscience evidence, based on functional-anatomical studies of visual pathways, indicates partial independence of vision and other cortical functions. Cells in the visual system can be activated by other parts of the visual system or the motor system, rather than by higher cognitive processes.
4. Clinical neurological evidence (Pylyshyn, 1999, pp. 11, 34) of visual agnosia patients indicates a dissociation between ability to integrate sources of visual information to form a representation of an object, and the ability to recognise the object.

Processes in the visual system may be, and often are, immensely complex. However, they are informationally encapsulated and are not generally modifiable by higher cognitive processes. They are not sensitive to cognitive contexts, such as goals or consequences associated with what is perceived. Our purpose here is not to advocate any specific thesis with regard to the visual system, nor do we want to join any controversies on this issue. Rather, we want to draw attention to the kinds of evidence that are used to identify a modular process. The case for an early vision module is based on convergence of several lines of evidence from behavioural observations, neuroscience, and clinical neurology. The specific evidence necessary will differ for each domain, but it might save a lot of confusion and unproductive controversy if claims for modules in cognitive development and higher cognition were as disciplined as the claim for modularity of early vision.

Deontic reasoning

Another example of a process that is sometimes claimed, or assumed, to be modular is deontic reasoning. Its significance was recognised in the context of the Wason four-card selection task (Wason, 1966). In the classic, abstract version, participants are shown four cards, on each of which is printed one character, such as A, B, 4, 7. They are told that each card has a letter on one side and a number on the other, and that there is a rule that "If there is an A on one side there will be a 4 on the other". They are then asked to say which cards should be turned over to verify the rule, the correct answers being A and 7. Multiple studies have shown that performance is poor in the abstract version, but improves considerably with a familiar rule, such as: "If a person is drinking an alcoholic beverage, the person must be 18 years of age". A related finding of importance was that performance improved markedly with an abstract version of a permission rule, such as: "If one is to take action A, one must first satisfy precondition P" (Cheng, Holyoak, Nisbett, & Oliver, 1986). The cards then contained wording such as, "Has taken action A", "Has not taken action A", "Has fulfilled precondition P", and "Has not fulfilled precondition P". Here it is much easier to see that the correct choices are Taken Action A and Not Fulfilled Precondition P. This suggests that it is not concreteness or familiarity simpliciter that brings about the improvement in performance, but the structure of the task.

Cheng et al. proposed that improved performance was due to pragmatic reasoning schemas, which are abstract knowledge structures induced from life experience. Examples of schemas are permission and obligation, which belong in the category of deontic principles. We have plenty of experience with permission situations, from which a general schema can be induced. A driver's licence is permission to drive a car; a theatre ticket is permission to enter a theatre; an introduction, or even a smile, is permission to enter conversation with a stranger, and so on. Schema induction processes have been investigated by Halford, Bain, Maybery, and Andrews (1998a) and Hummel and Holyoak (2003). Another proposal concerning the mechanism of deontic reasoning is that this depends on an innate cheater detection schema (Cosmides & Tooby, 1992).
Alternatively, the model of Oaksford and Chater (1994) proposes that the cards are chosen according to their utility in the deontic task, and according to information gain in the abstract task. The proposal that deontic reasoning is special has been challenged by Almor and Sloman (1996), who showed similar results in nondeontic contexts. Pragmatic schema theory is a "middle-of-the-road" approach in that it acknowledges the significance of domain specific knowledge but also encompasses processes, such as schema induction, that apply across several domains. We will not pursue the controversy concerning the precise mechanism of facilitation here, but we want to emphasise that it might not be necessary for deontic reasoning to be based on a domain specific cheater detection schema in order to explain the empirical findings.

Elsewhere (Halford, 1993) it has been proposed that the permission version of the rule might produce improved performance because it has a structure that is more like the conditional, p implies q (A implies 4), whereas the abstract version of the rule is more similar to the biconditional, p implies q and q implies p (A implies 4 and 4 implies A, or 4 if and only if A). The permission rule is like the conditional because both are asymmetrical. Permission means that if you are performing a particular action, you must have permission, but it does not mean that if you have permission, you must be performing the action. For example, if you drive a car, you must have a licence, but this does not mean that if you have a licence you drive a car. Correspondingly, the conditional means that if p then q, but it does not mean that if q then p. The canonical, or conventionally correct solution to the Wason selection task is based on a conditional interpretation of the rule. Thus "if there is an A on one side there will be a 4 on the other" does not imply that "if there is a 4 on one side there will be an A on the other". Consequently there is no point in turning over the 4 because even if there is no A on the other side this will not falsify the rule. Therefore anything that induces an interpretation of the rule that is structurally similar (or isomorphic) to the conditional will tend to facilitate correct performance.

By contrast, the abstract version of the task could be interpreted as a prediction schema. That is, the rule "if there is an A on one side there will be a 4 on the other" could be interpreted as the A predicting the 4 on the other side. However, a prediction schema more closely resembles a biconditional, because a biconditional rule (4 if, and only if, A) is a better predictor than the conditional, 4 if A (see also Stenning and van Lambalgen, Chapter 8, this volume).

The permission schema is at an intermediate level of domain specificity because it applies to certain classes of situations, those that require permission in some form, but it has generality, because permission applies to such a wide range of contexts. However, whereas the permission schema might have some degree of domain specificity, the reasoning processes are domain general. That is, application of the permission schema depends on analogy, which applies, in principle, to any domain. The problem is handled by mapping content into the permission schema (Halford, 1993), which leads to a representation that is more like the conditional than the biconditional, and this yields more correct answers because the norm for correctness is based on the conditional. Acquisition of the permission schema is also based on a domain general process, schema induction. Thus Wason selection task performance is consistent with domain specific knowledge being obtained, and applied, by domain general processes: analogy and schema induction.
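The difference between the conditional and biconditional readings of the abstract rule can be made concrete with a small sketch. It is an illustration only, not a model of how participants reason: it assumes, for simplicity, that the only letters are A and B and the only numbers are 4 and 7, and the function names are hypothetical. A card is worth turning over if at least one possible hidden face would falsify the rule under the chosen reading.

```python
LETTERS = ("A", "B")   # assumed alphabet: only the letters shown on the cards
NUMBERS = ("4", "7")   # assumed alphabet: only the numbers shown on the cards


def conditional(letter, number):
    """'If A then 4': false only when an A is paired with a non-4."""
    return letter != "A" or number == "4"


def biconditional(letter, number):
    """'4 if and only if A': false whenever exactly one of the two holds."""
    return (letter == "A") == (number == "4")


def cards_to_turn(rule, visible=("A", "B", "4", "7")):
    """Return the visible faces whose hidden side could falsify the rule."""
    chosen = []
    for face in visible:
        if face in LETTERS:
            completions = [(face, number) for number in NUMBERS]
        else:
            completions = [(letter, face) for letter in LETTERS]
        # Turn the card only if some hidden face would violate the rule.
        if any(not rule(letter, number) for letter, number in completions):
            chosen.append(face)
    return chosen


print(cards_to_turn(conditional))    # ['A', '7']
print(cards_to_turn(biconditional))  # ['A', 'B', '4', '7']
```

Under the conditional reading only A and 7 can falsify the rule, which is the canonical answer. Under the biconditional reading every card is potentially informative, so a reasoner who interprets the rule that way has no basis for rejecting the 4.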

Domain specific knowledge emerging from domain general processes

Our thesis is that the success of deontic rules in the Wason selection task might not be evidence for modular processes such as cheater detection, or even for the unique role of deontic domain knowledge, but it might instead reflect two domain general processes. The first of these is schema induction. The second is analogical reasoning. That is, the permission schema gives improved performance, not just in concrete versions but also, where it is applicable, in abstract versions, because it is an analogue of the conditional.

Permission is a domain specific schema but it reflects a process, schema induction, that is domain general. We propose that both schema induction and analogical reasoning might account for many processes that might appear to be modular, yet neither has been adequately investigated in this context. Schema induction and analogical reasoning demonstrate that domain specific knowledge can result from the operation of domain general processes.

Schema induction

Schema induction processes have been investigated by Halford et al. (1998a), Holland, Holyoak, Nisbett, and Thagard (1986), and Hummel and Holyoak (2003). Induction of structured representations has been demonstrated in neural net models, such as the model of the balance scale by McClelland (1995). The formation of prototypes, a type of category representation, has also been demonstrated with neural net models by McClelland and Rumelhart (1985) and by Quinn and Johnson (1997).

In the balance scale model of McClelland (1995) there are four sets of five input units, representing one to five weights on pegs, one to five steps from the fulcrum, on both left and right sides. There are four hidden units, two of which compare weights and two of which compare distances. The output units compute the state – left side down, right side down, or balance – that is predicted to result from the inputs. Training results in those input units that represent larger weights, or larger distances, having greater connection strengths to the hidden units. This recognition of the relative effects of different weights and distances emerges as a result of being trained to compute an appropriate balance state for a given input of weights and distances on left and right, and is not predefined in the net. It is a case of structured internal representations emerging as a result of learning to compute input–output functions.
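The input and output sides of the balance scale task described above can be sketched as follows. This is not McClelland's network and contains no learning. It shows only one plausible encoding of the four sets of five input units, together with the balance state a training signal would supply. The unit ordering, the function names, and the use of the torque rule (weight multiplied by distance) to generate targets are assumptions made for illustration.

```python
def one_hot(value, size=5):
    """Encode a value from 1..size as a bank of `size` binary units."""
    if not 1 <= value <= size:
        raise ValueError("value must lie between 1 and %d" % size)
    return [1 if unit == value else 0 for unit in range(1, size + 1)]


def encode(left_weight, left_distance, right_weight, right_distance):
    """Concatenate four banks of five units (ordering is an assumption)."""
    return (one_hot(left_weight) + one_hot(left_distance)
            + one_hot(right_weight) + one_hot(right_distance))


def target_state(left_weight, left_distance, right_weight, right_distance):
    """Balance state used as the training target, assuming the torque rule."""
    left_torque = left_weight * left_distance
    right_torque = right_weight * right_distance
    if left_torque > right_torque:
        return "left down"
    if right_torque > left_torque:
        return "right down"
    return "balance"


pattern = encode(3, 2, 2, 3)
print(len(pattern), sum(pattern))   # 20 4
print(target_state(3, 2, 2, 3))     # balance
```

Nothing in the encoding tells a network that the unit for five weights stands for more than the unit for one weight. In the model, that ordering has to emerge in the connection strengths through training.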
McClelland's model is important for many reasons, but it has implications for the domain specificity debate, because it demonstrates an important form of knowledge acquisition. The structured knowledge that is acquired is domain specific, as it relates to relative values of weights on the balance beam. The processes by which it was acquired, however, being based on backpropagation, have considerable generality. Thus domain specific knowledge can emerge from domain general processes. This example illustrates another point: that we need to know more about the processes that can yield structured knowledge before we make final judgements about modularity.

Another form of structured knowledge acquisition is the relational schema induction paradigm (Halford et al., 1998a). Participants are trained on an artificial structure comprising six state elements and three operators, which effect changes from one state to another. The states are represented by stimuli that have no natural ordering (e.g., first names), and the operators are represented by arbitrary geometric figures. The states can be thought of as being arranged on the apices of a hexagon. Operator C moves one step clockwise, A moves one step anticlockwise, and S is a null operator (no move). For learning, an initial state and operator are presented and the participant has to select an outcome state by clicking on one of the state elements presented in the display. Feedback is then given. Participants can rearrange the state elements by dragging with a mouse. This provides a mnemonic aid and the arrangements provide data about representation of structure, including its emergence as learning progresses. Participants are trained on all 18 items formed by six states × three operators. The structure can also be learned by an implicit learning process similar to the Reber paradigm (Reber, 1989).

When a problem is learned to criterion, an isomorphic problem is presented. An appropriate set of eight items is sufficient to map the states and operators of the isomorphic problems into the representation of structure learned. Then participants are required to generate components of isomorphic tasks. That is, they predict components of a previously unknown task by analogy with the task learned. This generativity test was found to be a good measure for knowledge of structure. Because it has no requirements for articulation, it is suitable for prelinguistic and nonlinguistic participants, and no demands are imposed beyond those in original learning. The relational schema induction paradigm is another way that structured knowledge can be acquired for a specific domain by using domain general schema induction processes. The paradigm can be applied to topics such as mathematics education (English & Halford, 1995).

Analogical reasoning

Analogy has long been recognised as an important component of intelligence (Binet & Simon, 1905/1980; Piaget, 1947/1950) and is arguably fundamental to higher cognitive processes (Halford, 1993; Hofstadter, 2001).
It plays a significant role in mathematics (English & Halford, 1995; Polya, 1954), science (Dunbar, 2001; Gentner et al., 1997; Holyoak & Thagard, 1995), politics, art, religion, pedagogy, communication, humour and law (Holyoak & Thagard, 1995). Its basic role in reasoning is indicated by the fact that mental models, which are important to some theories of human reasoning (Johnson-Laird, 1983; Johnson-Laird & Byrne, 1991) and cognitive development (Halford, 1993), are essentially analogues. Analogy is used to effect transfer between isomorphic tasks (Halford et al., 1998a; Reed, Ackinclose, & Voss, 1990), and is particularly important for transfer between domains (Gentner & Gentner, 1983; Gick & Holyoak, 1983).

An analogy is a structural correspondence between two cognitive representations, one called a base or source, the other a target (Gentner, 1983; Holland et al., 1986). Contemporary models of analogy (Gentner, Holyoak & Kokinov, 2001; Holyoak, 2005) appear to have reached a consensus that the following two principles are basic: uniqueness of mapping, so that each element in one structure is mapped to one and only one element in the other structure, and symbol–argument consistency, so that if a relation symbol r in one structure is mapped to the relation symbol r′ in the other structure, the arguments of r are mapped to the arguments of r′ and vice versa. These are essentially structural correspondence criteria.

The importance of analogy to the domain specificity debate is hard to overestimate, because it can potentially offer alternative explanations for many findings. Improved performance in the Wason selection task with versions based on familiar concepts such as permission might simply reflect the use of permission and similar schemas as analogues. Elsewhere (Halford, 1993) we have shown how the abstract Wason selection task can be performed by mapping the task into a representation, such as permission, that is isomorphic to the conditional.

It is also possible that laboratory studies have underestimated the potential power of analogical reasoning. Many developmental studies have been addressed primarily to issues concerned with age of attainment (Goswami, 2001). Furthermore, attempts to induce analogical reasoning might have underestimated the power of analogy, because it has been shown that participant-generated analogies are more beneficial than those induced by experimenters (Dunbar, 2001). One of the most important points about analogy is that representation of some abstract concepts, such as variables, can be based on structural correspondence. Establishing structural correspondence is the basic process in analogical reasoning, so representation of variables emerges from the core of analogical reasoning processes.

Variables and structural correspondence. Symbolic processes, such as those used in thinking and language, use representations that include variables and representations that have some independence of content.
Analogical mappings that meet the criteria for structural correspondence, defined earlier, align entities in corresponding slots, so that the slots effectively function as variables. Thus structural correspondence can yield certain forms of abstraction. For example, the relation older-than(–, –) has two arguments or roles, one for an older entity and one for a younger entity. These can be instantiated in a variety of ways such as older-than(Tom, John), older-than(John, Peter), older-than(Tom, Peter). If these representations are aligned, the first slot corresponds to the older entity and the second slot to the younger entity. Thus the slots function as representations of the age variable.
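The two structural correspondence criteria can be illustrated with the hexagon structure from the relational schema induction paradigm. The sketch below is a simplification under stated assumptions rather than a model of analogical mapping: the state names, the target operator labels X, Y and Z, and all function names are hypothetical. It builds a base task and an isomorphic target task, checks whether a proposed mapping is one-to-one and consistent, and then uses the mapping to predict an item of the target task, in the spirit of the generativity test.

```python
def build_task(states, clockwise, anticlockwise, null):
    """Return the items (state, operator) -> outcome for states on a ring."""
    n = len(states)
    items = {}
    for i, state in enumerate(states):
        items[(state, clockwise)] = states[(i + 1) % n]
        items[(state, anticlockwise)] = states[(i - 1) % n]
        items[(state, null)] = state
    return items


def is_structural_correspondence(base, target, mapping):
    """Check uniqueness of mapping and consistency of mapped items."""
    # Uniqueness: no two base elements may share a target element.
    if len(set(mapping.values())) != len(mapping):
        return False
    if len(base) != len(target):
        return False
    # Consistency: every base item must map onto an item of the target.
    for (state, operator), outcome in base.items():
        if not all(e in mapping for e in (state, operator, outcome)):
            return False
        if target.get((mapping[state], mapping[operator])) != mapping[outcome]:
            return False
    return True


def predict(base, mapping, target_state, target_operator):
    """Predict a target outcome by analogy with the learned base task."""
    inverse = {t: b for b, t in mapping.items()}
    base_outcome = base[(inverse[target_state], inverse[target_operator])]
    return mapping[base_outcome]


BASE_STATES = ["Ann", "Bob", "Cat", "Dan", "Eve", "Fay"]   # hypothetical names
TARGET_STATES = ["p", "q", "r", "s", "t", "u"]             # hypothetical names
base = build_task(BASE_STATES, "C", "A", "S")
target = build_task(TARGET_STATES, "X", "Y", "Z")
mapping = dict(zip(BASE_STATES + ["C", "A", "S"],
                   TARGET_STATES + ["X", "Y", "Z"]))

print(len(base))                                            # 18
print(is_structural_correspondence(base, target, mapping))  # True
print(predict(base, mapping, "u", "X"))                     # p
```

Because the check refers only to which element occupies which slot, the same code applies whatever the states and operators happen to stand for, which is the sense in which the slots behave as variables.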

Factors that apply across domains

There are factors that affect cognitive processes and that apply across a number of domains. One such factor is processing capacity: limitations have been observed that appear to have a significant degree of generality.

Processing capacity effects across domains

One of the most important demonstrations of a domain general capacity is working memory capacity, which now appears to be limited to four items (Cowan, 2001; Luck & Vogel, 1997). Furthermore, this limitation appears to be consistent across a wide variety of domains.

Processing capacity limitations also appear to have more domain generality than was once believed. This has been demonstrated by research based on two recently developed cognitive complexity metrics. These are cognitive complexity and control theory (Frye & Zelazo, 1998) and the relational complexity metric (Halford, Wilson, & Phillips, 1998b). These have considerable common ground, and both have been applied to a wide range of phenomena (Halford & Andrews, 2006). We will focus on the relational complexity metric here, but there are also a number of developmentally important studies based on the other metric.

The relational complexity metric defines complexity as a function not of the total information in a task, but of the number of variables that can be related in a single cognitive representation. This corresponds to the arity, or number of arguments (slots) of a relation (an n-ary relation is a set of points in n-dimensional space). Data indicate that quaternary relations (four related variables) are the most complex that can be processed in parallel by most adult humans, though a minority can process quinary relations under optimal conditions (Halford, Baker, McCredden, & Bain, 2005). Norms indicate that the median ages at which each level of complexity is attained are 1 year for unary relations, 2 years for binary relations, 5 years for ternary relations, and 11 years for quaternary relations. Complex tasks can be segmented into components that do not overload the capacity to process information in parallel.
However, relations between variables in different segments become inaccessible (just as a three-way interaction would be inaccessible if only two-way analyses were performed). Processing loads can also be reduced by conceptual chunking, which is equivalent to compressing variables (analogous to collapsing factors in a multivariate experimental design). For example, velocity = distance/time, but can be recoded to a binding between a variable and a constant (e.g., speed = 80 km/h) (Halford et al., 1998b, Section 3.4.1). Conceptual chunking reduces processing load, but chunked relations become inaccessible. For example, if we think of velocity as a single variable, we cannot determine what happens to velocity if we travel the same distance in half the time. Complexity analyses are based on the principle that variables can be chunked or segmented only if relations between them do not need to be processed. Tasks that impose high loads are those where chunking and segmentation are constrained.

For example, knights and knaves tasks are difficult because, although they permit some serial processing, their structure requires inferences based on a large amount of contingent information. The rules are that there is an island inhabited solely by knights who always tell the truth and knaves who always lie. In a particular problem A says "I am a knave and B is a knave". B says "A is a knave". What are A and B? A good strategy might be to assume that A is a knight. The first inference is that as knights always tell the truth, and A says that he is a knave, this contradicts the proposition that A is a knight, and therefore A is a knave. The second inference is that as A is a knave, his statement must be false, yet he has said correctly that he is a knave, therefore his statement that B is a knave must be false, so B is a knight (the proposition that A is a knave and B is a knave is therefore false). The third inference is that this is consistent with B's statement that A is a knave. The first and second inferences are each based on a considerable amount of information, which has been quantified by the relational complexity metric in a way that predicts problem difficulty (Birney & Halford, 2002). Notice that relational complexity takes account of both serial and parallel processing.

Relational complexity analyses yielded the prediction that 2-year-olds could process weight or distance information in the balance scale because these are binary relational (Halford, 1993). This was contrary to previous theory (Case, 1985) and empirical observation (Siegler, 1981) but has been confirmed (Halford, Andrews, Dalton, Boag, & Zielinski, 2002). The study showed that young children demonstrate competence on balance scale items that are structurally simple (binary relational) but not on more complex (ternary relational) items.

Processing capacity effects have therefore been demonstrated in many cognitive phenomena, only a small sample of which are reviewed above. Furthermore, processing capacity appears to have similar effects across a number of different domains.
The performance of 4- to 8-year-old children was found to be similarly affected by cognitive complexity in the domains of transitivity, hierarchical classification, cardinality, comprehension of relative clause sentences, hypothesis testing, and class inclusion (Andrews & Halford, 1998, 2002). A single factor accounted for approximately 50% of variance, and factor scores correlated with fluid intelligence (r = .79) and working memory (r = .66). Subsequent studies have observed correspondence between hierarchical categories, transitive inference, and class inclusion (Halford, Andrews, & Jensen, 2002), between processing sentences with embedded clauses, hierarchical classification, and transitivity (Andrews, Halford, & Prasad, 1998) and between concept of mind, transitivity, hierarchical classification, and cardinality (Andrews, Halford, Bunch, Bowden, & Jones, 2003). There is therefore substantial evidence showing that processing capacity effects operate in similar fashion across a number of domains.

The cases we have been considering suggest that there are phenomena that set limits to the explanatory power of modular theories. The domain general processes of schema induction and analogical reasoning can provide alternative explanations for some phenomena that have seemed to provide evidence for modularity. Complexity is a factor that appears to apply somewhat independently of domain. Now we consider some cases where modularity might provide a partial rather than a complete explanation.
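The knights and knaves problem discussed earlier in this section (A says "I am a knave and B is a knave"; B says "A is a knave") can be checked mechanically. The sketch below simply enumerates the four possible assignments and keeps those in which every knight's statement is true and every knave's statement is false. It is a verification of the answer, not a model of the serial inferences described in the text and not a measure of relational complexity, and the function names are hypothetical.

```python
from itertools import product


def consistent(is_knight, statement_is_true):
    """Knights say only true things; knaves say only false things."""
    return is_knight == statement_is_true


def solve():
    """Return every assignment of roles consistent with both statements."""
    solutions = []
    for a_is_knight, b_is_knight in product([True, False], repeat=2):
        # A says: "I am a knave and B is a knave".
        a_statement = (not a_is_knight) and (not b_is_knight)
        # B says: "A is a knave".
        b_statement = not a_is_knight
        if (consistent(a_is_knight, a_statement)
                and consistent(b_is_knight, b_statement)):
            solutions.append({
                "A": "knight" if a_is_knight else "knave",
                "B": "knight" if b_is_knight else "knave",
            })
    return solutions


print(solve())   # [{'A': 'knave', 'B': 'knight'}]
```

Exhaustive search makes the problem trivial for a program because each candidate assignment is fully specified before it is tested. A human reasoner instead has to hold a supposition, the content of a statement, and the truth-telling rule in mind together, which is where the processing load arises.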

Modular processes as partial explanations

The view that the mind consists wholly (or largely) of modular systems stems from two premises: the mind is computationally realised; and amodular or holistic processes are computationally intractable (Carruthers, 2002). Modularity is said to avoid combinatorial problems by focusing computational resources on a strictly limited set of potentially useful ideas (Bryson, 2002). However, as noted by those who question this approach, it is possible to accept that infants are born with predispositions to focus on certain classes of perceptual stimuli, without endorsing modular accounts of human thought (Clark, 2002; Evans & Over, 2002; Nelson, 2002).

Many of the modules proposed by researchers in cognitive psychology bear little resemblance to those originally proposed by Fodor (1983). Carruthers (2002) distinguished between input and output modules such as early vision, face recognition, and language on one hand, and modules that process conceptual information concerning particular domains on the other. The latter conceptual modules fall short of Fodor's criteria in that they (1) do not have proprietary transducers, (2) might not have dedicated neural hardware, and (3) might not be fully encapsulated. However, according to Carruthers, they are innately channelled, dedicated computational systems that generate information in accordance with algorithms not shared with or accessible to other systems. The system proposed by Cosmides and Tooby (1992) for keeping track of social contracts and detecting cheaters (referred to earlier) is an example of this type of module. We will consider two domains for which conceptual modules have been proposed: theory of mind (e.g., Leslie, German, & Polizzi, 2005), and an intuitive number system (Dehaene & Changeux, 1993; Gallistel & Gelman, 1992; Wynn, 1992).

Theory of mind

Individuals who can attribute mental states (e.g., beliefs, desires) to other persons are said to possess a theory of mind.
Before the age of 4 or 5 years, most normally developing children experience difficulty with false-belief and other tasks that assess theory of mind ability (Wellman, Cross, & Watson, 2001). To succeed on false-belief tasks, children need to understand that individuals act in accordance with their beliefs even when these are false. For example, I look on the kitchen table for my keys even though they are in fact elsewhere. My belief (that my keys are on the table) is false. My behaviour (looking on the table) is driven by my false belief, not by reality (the keys are in my son's pocket).

While it has been claimed that theory of mind abilities involve a domain specific module, there is a growing consensus that such explanations are, at best, partial. In recognition of this, Leslie and his colleagues (e.g., Friedman & Leslie, 2004; Leslie et al., 2005) proposed a learning mechanism that is in part modular (theory of mind module, ToMM) and in part penetrable (selection processor, SP). The ToMM allows children to attend to and detect mental states, and to attribute beliefs to others. The ToMM is biased toward representing beliefs as true. Success on false-belief tasks depends on overcoming the default true-belief attributions, which are generated automatically by the ToMM. The role of the SP is to inhibit true-belief attributions so that beliefs can be represented as false when circumstances warrant it. The SP is the domain general component, and it is relatively slow to develop. It is characterised in terms of inhibition, but inhibition accounts have also been questioned (Andrews et al., 2003; Perner, Lang, & Kloo, 2002; Sabbagh, 2004). Leslie et al. (2005) have themselves acknowledged that other accounts of the SP are plausible (see also Happaney & Zelazo, Chapter 11; Moses & Sabbagh, Chapter 12, this volume).

The hybrid ToMM–SP model can account for the early emergence of belief and desire concepts in the second year of life, and also for children's difficulty with false-belief tasks. It is also consistent with the findings from brain imaging research, which suggest the involvement of two separate brain networks in theory of mind tasks. Siegal and Varley (2002) reviewed the imaging, autism, and lesion research and concluded that a dedicated, domain specific system for interpreting the mental states of others lies at the core of performance on theory of mind tasks. This is centred on the amygdala system and its connections with prefrontal and temporal lobe structures. They distinguished this "core" system from other "coopted" systems that are more domain general. These coopted systems are involved in performance of theory of mind tasks as well as tests of language and executive functions.
On the basis of a review of extant research and their own event-related potential (ERP) studies, Sabbagh (2004; Sabbagh, Moulson, & Harkness, 2004; Sabbagh & Taylor, 2000) proposed, like Siegal and Varley (2002), that ToM reasoning can be fractionated into two functionally and anatomically distinct neural circuits. The ability to detect and decode others' mental states from observable cues (e.g., facial expressions, tone of voice) might rely on an orbitofrontal/medial temporal circuit in the right hemisphere, whereas the ability to reason about others' mental states might rely on medial frontal regions in the left hemisphere. However, they stopped short of endorsing the innate modular interpretation of the orbitofrontal circuit as proposed by Siegal and Varley (2002). They proposed that this circuit might be important for decoding the meaning of stimuli in non-social domains as well as for decoding mental states. Orbitofrontal regions are known to be involved in decoding the content of emotionally rewarding stimuli in gustatory and olfactory domains (Rolls, 2000) and in performance of gambling (Bechara, Damasio, Damasio, & Anderson, 1994), reversal learning (Overman, Bachevalier, Schuhmann, & Ryan, 1996) and delayed nonmatching to sample tasks (Elliott, Dolan, & Frith, 2000), all of which require learning and/or updating the links between stimuli and rewards. It remains possible then that the involvement of orbitofrontal regions reflects learning of the reinforcement values that are associated with detecting and decoding mental states (see also McKinnon, Levine, & Moscovitch, Chapter 7, this volume).

There are other unresolved issues surrounding modular accounts of theory of mind. One issue concerns the number of modules that are necessary and the relations between them. For example, do the conceptual modules for theory of mind and social contracts differ from each other, and from the modules involved in the visual processing of faces? Evans and Over (2002) have noted that much domain specific cognition is the result of domain general learning processes. Halford (1993) proposed an account of theory of mind that involves domain general acquisition processes including contingencies and condition–action rules, analogy, and a role for complexity. Children's initial understanding of mental states is based on the representation of environmental rules and contingencies that are acquired through basic learning mechanisms including reinforcement. These rules and contingencies are the building blocks of children's understanding of mind. They are derived from direct experience with situations involving their own and others' mental activities. Attributing mental states to others might involve the use of analogy. That is, if we know the mental state that we ourselves would experience in a particular situation, we might attribute that same mental state to another person faced with a similar situation. Many theory of mind tasks (e.g., false belief and appearance reality) involve complex inferences. Andrews et al. (2003) have demonstrated that children's performance on these tasks is related to their ability to process complex relational information in content domains that do not involve mental states. Other research (Carlson & Moses, 2001; Frye, Zelazo, & Palfai, 1995) also indicates the involvement of domain general abilities. In summary, while it is possible that there is a domain specific module for theory of mind, there are alternative explanations.
It would be premature to conclude that such a module exists until these are investigated and eliminated. However, even if the module is shown to exist, it seems clear that it will not be sufficient for reasoning about mental states. Although theory of mind ability includes decoding mental states from observable cues, it goes far beyond this. The complex inferences required to succeed on false-belief and appearance reality tasks, and to appreciate fully situations that involve concepts such as deception and irony, seem to recruit abilities that are better described as domain general.

Intuitive number systems

Basic quantitative abilities have been demonstrated in many nonhuman species and in human infants. Evidence for quantitative understanding in early infancy is provided by studies using the habituation/dishabituation and violation of expectations techniques, which assume that surprising events will capture infants' attention and they will look longer at them. Choice and manual search paradigms are also used with older infants.

Results of studies using these methodologies have been interpreted as evidence for two core systems for numerical processing (Feigenson, Dehaene, & Spelke, 2004a; Xu, 2003). One system is suited to large quantities, which are represented in an approximate way using analogue-magnitude representations. In this system “number” is treated in a similar way to continuous quantities such as distance and time. Representations become increasingly fuzzy as numerical quantity increases, and accuracy of magnitude comparison is a function of the ratio of the quantities (Weber's law). Accumulator mechanisms have been proposed to account for the empirical findings. Most of these models involve serial enumeration (Gallistel & Gelman, 1992; Wynn, 1992) but at least one parallel model has also been proposed (Dehaene & Changeux, 1993). Brain imaging and lesioning research suggest there is a distinct neural circuitry for this analogue-magnitude system, with the horizontal segment of the intraparietal sulcus bilaterally being a major site of activation in neuroimaging studies involving adults (Dehaene, Piazza, Pinel, & Cohen, 2003). Dehaene et al. concluded that this site is involved in nonverbal representation of numerical quantity, possibly a mental number line, which supports an intuitive understanding of what a given quantity means and the proximity relations between quantities.

A second system represents the exact numerosity of very small sets of discrete objects. This is thought to reflect the operation of an object-file system that allows humans and nonhumans to track up to four moving objects at a time (Simon, 1997). This system might also underlie “subitising”, whereby adults are able directly to apprehend and enumerate sets of up to four or five items. The brain circuitry underlying this system is less well understood, but one possibility is that subitising is a basic automatic function of the extrastriate areas (Feigenson et al., 2004a).
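The ratio dependence attributed above to the analogue-magnitude system (Weber's law) can be illustrated with a small numerical sketch. It assumes a simple scalar-variability model in which each quantity n is represented as a Gaussian with mean n and standard deviation w × n. This model, the function name p_correct, and the Weber fraction w = 0.2 are illustrative assumptions, not parameters taken from the studies cited.

```python
import math


def p_correct(n1: float, n2: float, w: float = 0.2) -> float:
    """Probability of correctly judging which of two positive quantities is larger.

    Assumption (hypothetical model, not from the chapter): quantity n is
    represented as a Gaussian with mean n and standard deviation w * n,
    so representations get fuzzier as n grows.
    """
    diff = abs(n1 - n2)
    # Standard deviation of the difference between the two noisy representations.
    noise = w * math.sqrt(n1 ** 2 + n2 ** 2)
    # Standard normal CDF evaluated at diff / noise.
    return 0.5 * (1.0 + math.erf(diff / (noise * math.sqrt(2.0))))


# Same 1:2 ratio at different absolute sizes gives the same predicted accuracy.
same_ratio_small = p_correct(8, 16)
same_ratio_large = p_correct(16, 32)

# A harder 2:3 ratio gives lower predicted accuracy than 1:2.
harder_ratio = p_correct(8, 12)
```

Under these assumptions, predicted accuracy depends only on the ratio of the two quantities, not on their absolute difference, which is the signature pattern the text describes.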
Both systems are said to (1) be present in human infants and in nonhuman species, with the implication that they are not acquired through individual learning or cultural transmission, (2) be deployed in an automatic fashion, (3) be tuned to specific types of information, and (4) continue to function throughout the lifespan. However, the first (innateness) and third (domain specificity) claims have been challenged (see also Mix & Sandhofer, Chapter 13, this volume).

Consistent with the innateness claim, Dehaene and Changeux's (1993) neural model incorporated a wired-in numerosity detector. However, Verguts and Fias (2004) demonstrated that an approximate number line representation with the same properties as Dehaene and Changeux's could arise in uncommitted neurones under unsupervised learning conditions (but see Feigenson, Dehaene, & Spelke, 2004b, for a counterclaim). The brain imaging and lesion data provide support for the analogue-magnitude system. However, most of these studies have used adult participants. The only brain imaging support for this system in children is a single ERP study with 5-year-olds (Temple & Posner, 1998). If analogue-magnitude representations are innate, they should be present prior to this age, but the relevant brain imaging studies involving younger children and infants are yet to be conducted.

The domain specificity claim has also been questioned. Johnson and Munakata (2005) agreed that both systems are relevant to the number domain, but noted that these systems can be engaged by non-numerical tasks. Indeed, in reviewing the brain imaging and lesion studies, Dehaene et al. (2003) acknowledged that the horizontal segment of the intraparietal sulcus, thought to underlie the mental number line, and which is activated during numerical comparison tasks, is also activated during comparison tasks involving other categories that have a strong spatial or serial component (alphabet, days of the week, etc.). The distance effects, which are taken as evidence for magnitude coding and the mental number line, have been observed with many types of stimuli and dimensions. Marshuetz, Smith, Jonides, DeGutis, and Chenevert (2000) conducted an fMRI study in which they compared recall of items versus recall of order information in a short-term memory task. They found that the areas that were significantly more activated in the order condition included the parietal and prefrontal cortex. Parietal activations overlapped those involved in number processing, suggesting that the underlying representations of order and number share a common process, coding for magnitude. Regarding the second system, processes such as subitising might well be innate and reflect modular processing. However, it seems more likely that it is the module for early vision, rather than a module for number, that allows us to enumerate and keep track of small numbers of objects.

The modular status of these systems will no doubt continue to be debated. By contrast, there is general agreement regarding their limitations. As noted by Feigenson et al. (2004a), both core systems have limited representational power.
They are insufficient to support the acquisition of the number words of the language and their correct sequence, elementary arithmetic, understanding of the compositional relationships within sets, and the conservation of number. Mastery of these concepts requires children to integrate their early quantitative knowledge with the symbolic number system of their culture. The manner in which children achieve this is not yet clear. However, elsewhere (Andrews & Halford, 2002; Halford & Andrews, 2006) we have argued that children might use mental models. A major source of the difficulty that children encounter appears to be the complexity of mental models that underlie the particular numerical concepts. The age of acquisition of these concepts is well predicted by the complexity of the relations incorporated in them.

Conclusion

To conclude, it is likely that modular processes play a role in certain aspects of both child and adult cognition, but it is equally clear that cognition cannot consist solely of modules. We caution that the distinction between modular and domain general processes needs to be made rigorously. The criteria currently used for conceptual modules (Carruthers, 2002) are much weaker than those described for early vision (Pylyshyn, 1999). This leads to a lack of discipline in module identification, and to the danger of an unjustified proliferation of modules. We have suggested that some interpretations of reasoning based on mechanisms specialised for deontic reasoning, or an innate cheater detection schema, arguably entail some overinterpretation of the data. Modular processes in theory of mind appear to be restricted to the ability to detect and decode others' mental states from observable cues (e.g., facial expressions, tone of voice). All this suggests that modularity plays an important, but restricted, role in human cognition.

In addition to stringent criteria, there is a need for converging evidence from several sources, comparable to the behavioural, neuroscience, and clinical neurological evidence adduced for a module of early vision. Despite much research in recent years, on current evidence it is doubtful that deontic reasoning, theory of mind, and number meet even the relaxed criteria for conceptual modules. The modular explanation is further limited by the increasingly well-established fact that processing capacity affects cognition in similar ways in many domains. Furthermore, some of the observations that tempt us to propose modular explanations might be equally well or better explained by domain general processes. We outlined two candidates here, schema induction and analogy, both of which we suggest have not received the attention they deserve in this context. These and other domain general processes could prove a more fruitful focus of research efforts in the future.

References

Almor, A., & Sloman, S. A. (1996). Is deontic reasoning special? Psychological Review, 103, 374–380.
Andrews, G., & Halford, G. S. (1998). Children's ability to make transitive inferences: The importance of premise integration and structural complexity. Cognitive Development, 13, 479–513.
Andrews, G., & Halford, G. S. (2002). A cognitive complexity metric applied to cognitive development. Cognitive Psychology, 45, 153–219.
Andrews, G., Halford, G. S., Bunch, K. M., Bowden, D., & Jones, T. (2003). Theory of mind and relational complexity. Child Development, 74, 1476–1499.
Andrews, G., Halford, G. S., & Prasad, A. (1998). Processing load and children's comprehension of relative clause sentences. ERIC Document Reproduction Service No. ED 420091.
Baddeley, A. D. (1986). Working memory. Oxford: Clarendon Press.
Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50, 7–14.

Binet, A., & Simon, T. (1980). The development of intelligence in children. Nashville, TN: Williams Printing Co. (Original work published 1905)
Birney, D. P., & Halford, G. S. (2002). Cognitive complexity of suppositional reasoning: An application of the relational complexity metric to the knight–knave task. Thinking and Reasoning, 8, 109–134.
Bryson, J. J. (2002). Language isn't quite that special. Behavioral and Brain Sciences, 25, 679–680.
Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and children's theory of mind. Child Development, 72, 1032–1053.
Carruthers, P. (2002). The cognitive functions of language. Behavioral and Brain Sciences, 25, 657–674.
Case, R. (1985). Intellectual development: Birth to adulthood. New York: Academic Press.
Cheng, P. W., Holyoak, K. J., Nisbett, R. E., & Oliver, L. M. (1986). Pragmatic versus syntactic approaches to training deductive reasoning. Cognitive Psychology, 18, 293–328.
Clark, A. (2002). Anchors not inner codes, coordination not translation (and hold the modules please). Behavioral and Brain Sciences, 25, 681.
Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 163–228). New York: Oxford University Press.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87–185.
Dehaene, S., & Changeux, J.-P. (1993). Development of elementary numerical abilities: A neuronal model. Journal of Cognitive Neuroscience, 5, 390–407.
Dehaene, S., Piazza, M., Pinel, P., & Cohen, L. (2003). Three parietal circuits for number processing. Cognitive Neuropsychology, 20, 487–506.
Dunbar, K. (2001). The analogical paradox: Why analogy is so easy in naturalistic settings, yet so difficult in the psychological laboratory. In D. Gentner, K. J. Holyoak, & B. N. Kokinov (Eds.), The analogical mind: Perspectives from cognitive science (pp. 313–334). Cambridge, MA: MIT Press.
Elliott, R., Dolan, R. J., & Frith, C. D. (2000). Dissociable functions in the medial and lateral orbitofrontal cortex: Evidence from human neuroimaging studies. Cerebral Cortex, 10, 308–317.
English, L. D., & Halford, G. S. (1995). Mathematics education: Models and processes. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Evans, J. St B. T., & Over, D. E. (2002). The role of language in the dual process theory of thinking. Behavioral and Brain Sciences, 25, 684–685.
Feigenson, L., Dehaene, S., & Spelke, E. (2004a). Core systems of number. Trends in Cognitive Sciences, 8, 307–314.
Feigenson, L., Dehaene, S., & Spelke, E. S. (2004b). Origins and endpoints of the core systems of number: Reply to Fias and Verguts. Trends in Cognitive Sciences, 8, 448–489.
Fodor, J. A. (1983). Modularity of mind: An essay on faculty psychology. Cambridge, MA: MIT Press.
Friedman, O., & Leslie, A. M. (2004). Mechanisms of belief–desire reasoning. Psychological Science, 15, 547–553.

Frye, D., & Zelazo, P. D. (1998). Complexity: From formal analysis to final action. Behavioral and Brain Sciences, 21, 836–837.
Frye, D., Zelazo, P. D., & Palfai, T. (1995). Theory of mind and rule-based reasoning. Cognitive Development, 10, 483–527.
Gallistel, C. R., & Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44, 43–74.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170.
Gentner, D., Brem, S., Ferguson, R. W., Markman, A. W., Levidow, B. B., Wolff, P. et al. (1997). Analogical reasoning and conceptual change: A case study of Johannes Kepler. Journal of the Learning Sciences, 6, 3–40.
Gentner, D., & Gentner, D. R. (1983). Flowing waters or teeming crowds: Mental models of electricity. In D. Gentner & A. L. Stevens (Eds.), Mental models (pp. 99–129). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gentner, D., Holyoak, K. J., & Kokinov, B. N. (Eds.). (2001). The analogical mind: Perspectives from cognitive science. Cambridge, MA: MIT Press.
Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1–38.
Goswami, U. (2001). Analogical reasoning in children. In D. Gentner, K. J. Holyoak, & B. N. Kokinov (Eds.), The analogical mind: Perspectives from cognitive science (pp. 437–470). Cambridge, MA: MIT Press.
Halford, G. S. (1993). Children's understanding: The development of mental models. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Halford, G. S., & Andrews, G. (2006). Reasoning and problem solving. In D. Kuhn & R. Siegler (Eds.), Handbook of child psychology: Volume 2. Cognitive, language and perceptual development (6th ed.). Hoboken, NJ: Wiley.
Halford, G. S., Andrews, G., Dalton, C., Boag, C., & Zielinski, T. (2002). Young children's performance on the balance scale: The influence of relational complexity. Journal of Experimental Child Psychology, 81, 417–445.
Halford, G. S., Andrews, G., & Jensen, I. (2002). Integration of category induction and hierarchical classification: One paradigm at two levels of complexity. Journal of Cognition and Development, 3, 143–177.
Halford, G. S., Bain, J. D., Maybery, M., & Andrews, G. (1998a). Induction of relational schemas: Common processes in reasoning and complex learning. Cognitive Psychology, 35, 201–245.
Halford, G. S., Baker, R., McCredden, J. E., & Bain, J. D. (2005). How many variables can humans process? Psychological Science, 16, 70–76.
Halford, G. S., Wilson, W. H., & Phillips, S. (1998b). Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology. Behavioral and Brain Sciences, 21, 803–831.
Hofstadter, D. R. (2001). Analogy as the core of cognition. In D. Gentner, K. J. Holyoak, & B. N. Kokinov (Eds.), The analogical mind: Perspectives from cognitive science (pp. 499–538). Cambridge, MA: MIT Press.
Holland, J. H., Holyoak, K. J., Nisbett, R. E., & Thagard, P. R. (1986). Induction: Processes of inference, learning and discovery. Cambridge, MA: MIT Press.
Holyoak, K. J. (2005). Analogy. In K. J. Holyoak & R. G. Morrison (Eds.), The Cambridge handbook of thinking and reasoning. New York: Cambridge University Press.
Holyoak, K. J., & Thagard, P. (1995). Mental leaps. Cambridge, MA: MIT Press.

Hummel, J. E., & Holyoak, K. J. (2003). A symbolic-connectionist theory of relational inference and generalization. Psychological Review, 110, 220–264.
Johnson, M. H., & Munakata, Y. (2005). Processes of change in brain and cognitive development. Trends in Cognitive Sciences, 9, 152–158.
Johnson-Laird, P. N. (1983). Mental models. Cambridge: Cambridge University Press.
Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Leslie, A. M., German, T. P., & Polizzi, P. (2005). Belief–desire reasoning as a process of selection. Cognitive Psychology, 50, 45–85.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
McClelland, J. L. (1995). A connectionist perspective on knowledge and development. In T. Simon & G. S. Halford (Eds.), Developing cognitive competence: New approaches to cognitive modelling (pp. 157–204). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
McClelland, J. L., & Rumelhart, D. E. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General, 114, 159–188.
Marshuetz, C., Smith, E. S., Jonides, J., DeGutis, J., & Chenevert, T. L. (2000). Order information in working memory: fMRI evidence for parietal and prefrontal mechanisms. Journal of Cognitive Neuroscience, 12, Supplement 2, 130–144.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W.H. Freeman.
Nelson, K. (2002). Developing dual-representation processes. Behavioral and Brain Sciences, 25, 693–694.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.
Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal data selection. Psychological Review, 101, 608–631.
Overman, W. H., Bachevalier, J., Schuhmann, E., & Ryan, P. (1996). Cognitive gender differences in very young children parallel biologically based cognitive gender differences in monkeys. Behavioral Neuroscience, 110, 673–684.
Perner, J., Lang, B., & Kloo, D. (2002). Theory of mind and self-control: More than a common problem of inhibition. Child Development, 73, 752–767.
Piaget, J. (1950). The psychology of intelligence (M. Piercy & D. E. Berlyne, Trans.). London: Routledge & Kegan Paul. (Original work published 1947)
Polya, G. (1954). Mathematics and plausible reasoning (I): Induction and analogy in mathematics. Princeton, NJ: Princeton University Press.
Pylyshyn, Z. W. (1999). Is vision continuous with cognition? The case for cognitive penetrability of visual perception. Behavioral and Brain Sciences, 22, 341–423.
Quinn, P. C., & Johnson, M. H. (1997). The emergence of perceptual category representations in young infants: A connectionist analysis. Journal of Experimental Child Psychology, 66, 236–263.
Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219–235.
Reed, S. K., Ackinclose, C. C., & Voss, A. A. (1990). Selecting analogous problems: Similarity versus inclusiveness. Memory and Cognition, 18, 83–98.

Rolls, E. T. (2000). The orbitofrontal cortex and reward. Cerebral Cortex, 10, 284–294.
Sabbagh, M. A. (2004). Understanding orbitofrontal contributions to theory-of-mind reasoning: Implications for autism. Brain and Cognition, 55, 209–219.
Sabbagh, M. A., Moulson, M. C., & Harkness, K. L. (2004). Neural correlates of mental state decoding in human adults: An event-related potential study. Journal of Cognitive Neuroscience, 16, 415–426.
Sabbagh, M. A., & Taylor, M. (2000). Neural correlates of theory-of-mind reasoning: An event-related potential study. Psychological Science, 11, 46–50.
Siegal, M., & Varley, R. (2002). Neural systems involved in “theory of mind”. Nature Reviews: Neuroscience, 3, 463–471.
Siegler, R. S. (1981). Developmental sequences within and between concepts. Monographs of the Society for Research in Child Development, 46, 1–84.
Simon, T. J. (1997). Reconceptualizing the origins of number knowledge: A “non-numerical” account. Cognitive Development, 12, 349–372.
Temple, E., & Posner, M. I. (1998). Brain mechanisms of quantity are similar in 5-year-olds and adults. Proceedings of the National Academy of Sciences USA, 95, 7836–7841.
Verguts, T., & Fias, W. (2004). Representation of number in animals and humans: A neural model. Journal of Cognitive Neuroscience, 16, 1493–1504.
Wason, P. C. (1966). Reasoning. Harmondsworth, UK: Penguin.
Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind development: The truth about false belief. Child Development, 72, 655–684.
Wynn, K. (1992). Evidence against empiricist accounts of the origins of numerical knowledge. Mind and Language, 7, 315–332.
Xu, F. (2003). Numerosity discrimination in infants: Evidence for two systems of representations. Cognition, 89, B15–B25.

10 A competence–procedural and developmental approach to logical reasoning

Willis F. Overton and Anthony Steven Dick

Despite several decades of debate, disagreement still exists about the nature and development of logical reasoning. The present chapter focuses on two areas that are central to this debate. The first deals with the place of a mental logic in the reasoning process. On one side of the debate this is considered essential because it provides a structure on which propositional content is affixed. Difficulties in deductive reasoning can be understood as deviations from this normative structure. On the other side, mental logic is considered unnecessary and is held to contribute nothing to the understanding of reasoning. From this perspective, performance on tasks that appear to entail deductive reasoning is explained by procedural, information-processing features of thought that are unrelated to any formal logic structure. This latter understanding leads to a second area of disagreement, which concerns whether the processes involved in reasoning are domain specific or domain general. This chapter attempts to reconcile these two debates by presenting a competence–procedural theory of reasoning and its development. According to this theory, reasoning requires the development of a domain general mental logical competence, but access and implementation of this competence are limited by procedural difficulties that may be domain general or domain specific.

Reasoning and logic

The literature on deductive reasoning is complex, and some definitional clarification is needed before beginning an analysis of the issues that frame contemporary investigations into its nature and development. This initial review of some basic concepts distinguishes logical reasoning and the structure of logic, which is critical for an unambiguous understanding of the field. Reasoning is goal-directed thought that coordinates inferences (Johnson-Laird & Byrne, 1991). Among the kinds of thinking that reasoning comprises, deductive or logical reasoning is unique because it is the only type in which general propositions lead to particular conclusions, and in which premises provide conclusive or necessary support for the certainty of those conclusions.

We will see later that this notion of “logical necessity” is a central feature that, unfortunately, is often ignored in various debates in the field. Deductive or logical reasoning is distinguished from other types, such as reasoning based on knowledge of context, or based on probability, as these involve induction, where inference proceeds from the particular to the general. In these latter cases, the premises provide probable, but not necessary, conclusions. The following examples illustrate deductive and inductive reasoning.

(1) All flights to Chicago have a layover in Cincinnati. The flight in Gate 10 goes to Chicago. Therefore, the flight in Gate 10 has a layover in Cincinnati.

(2) All the planes I have seen that fly to Chicago have a layover in Cincinnati. Therefore, all planes to Chicago have a layover in Cincinnati.

The first example (1) is a deductive inference, and the second (2) is an inductive inference. Whereas the goal of induction is to achieve a probable solution, the goal of deduction is to draw an absolutely valid (“logically necessary”) conclusion from the premises.

With the distinction between deductive and inductive reasoning in hand, we move to the next critical distinction, that between logic and logical reasoning. Logic itself as a discipline is not concerned with reasoning processes per se. Logicians focus on the products of these, which they term arguments. From a psychological perspective, examples (1) and (2) involve different forms of inference, but from the perspective of logic they are deductive and inductive arguments, respectively. Logic is thus concerned with arguments that are accepted as correct or incorrect, and logical reasoning is concerned with the mental processes and structures that are in some way related to logical arguments.

From the perspective of logic, an argument is a set of propositions in which a conclusion follows from the premises, with these providing evidence for the truth of the conclusion. Deductive arguments also have the feature of being valid (correct) or invalid (incorrect). When we say that a deductive argument is valid, we are saying that it is absolutely impossible to find a situation (i.e., a possible world) in which the argument has true premises and a false conclusion. Thus, in a valid deductive argument, the conclusion necessarily follows from the premises. In contrast to deductive arguments, inductive arguments cannot be valid or invalid because the premises only provide probable evidence for the truth of the conclusion (see also Over, Chapter 4, this volume). No matter how many planes to Chicago I have seen that have layovers in Cincinnati, there may be some that go straight through.
Logicians use simple sentences and the "sentential connectives" that join them together (i.e., "not", called negation; "and", called conjunction; "or", called disjunction; "if . . . then", called the conditional; "if and only if", called the biconditional). They then demonstrate valid and invalid arguments related to these connectives. Elementary valid arguments are the building blocks for elementary argument forms or rules of inference that are used to analyse more complex arguments. The simplest of these forms are modus ponens (if p then q, p, therefore q), modus tollens (if p then q, not-q, therefore not-p), and hypothetical syllogism (if p then q, if q then r, therefore if p then r). Here any content can be substituted for p, q, and r and the argument remains valid. The general deductive system that incorporates the connectives and inference rules is called a propositional or sentential logic. A more powerful system that goes beyond propositional logic is variously called quantificational, predicate, or first-order logic. Deductive propositional and predicate logics ultimately are formalisations of the commonsense correct deductive arguments that people engage in on a day-to-day basis. The logics are attempts by logicians to codify the rules according to which such arguments proceed, or, to say this in a slightly different fashion, a logic is a theory of the nature of arguments. By establishing the rules of valid deductive arguments, a logic also pinpoints the nature of error. With these distinctions concerning logic, we can turn to various theories of the relation of logic and reasoning found in the study of deductive reasoning.
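The logician's notion of validity described above (no possible situation with true premises and a false conclusion) can be illustrated with a short sketch. The code below is an illustration only, not a model of how people reason: it checks an argument form by brute-force enumeration of truth assignments, treating "if p then q" as the material conditional. The function names `implies` and `is_valid` are hypothetical helpers introduced for this example.

```python
from itertools import product


def implies(a, b):
    # Material conditional (an assumption of this sketch):
    # false only when a is true and b is false.
    return (not a) or b


def is_valid(premises, conclusion, n_vars):
    """Return True if no truth assignment makes every premise true
    while making the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # a counterexample "possible world" exists
    return True


# Modus ponens: if p then q; p; therefore q.
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q, 2)

# Modus tollens: if p then q; not-q; therefore not-p.
modus_tollens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: not q],
    lambda p, q: not p, 2)

# Hypothetical syllogism: if p then q; if q then r; therefore if p then r.
hypothetical_syllogism = is_valid(
    [lambda p, q, r: implies(p, q), lambda p, q, r: implies(q, r)],
    lambda p, q, r: implies(p, r), 3)

# Affirming the consequent: if p then q; q; therefore p (an invalid form).
affirmed_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p, 2)

print(modus_ponens, modus_tollens, hypothetical_syllogism, affirmed_consequent)
```

Because the check depends only on the form of the argument, any content substituted for p, q, and r gives the same verdict, which is the sense in which the three forms named above remain valid under substitution.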

Theories of deductive reasoning

There are a number of theories about the nature, origin, and development of logical reasoning. A major area of theoretical debate concerns the place of mental logic in this. If logic entails rules of arguments, then the mental logic of logical reasoning describes the mental processes associated with these rules. However, here the field has generally organised itself around two very divergent options: (1) on one hand, the broad rules of logic are taken to operate as a model of the dynamic system of psychological processes entailed by logical reasoning. In this case, the model functions as a theoretical explanation for predicted or actual behaviour; (2) on the other hand, the rule system is considered to be irrelevant to reasoning, and the appearance of the reasoning as being "logical" is considered an epiphenomenon. Those who pursue option (1), and thus favour the use of logical systems as models of the dynamic organisation of mental processes, have been referred to as a mental logic group (Braine & O'Brien, 1998; Rips, 1994), or as competence theorists (Overton, 1985, 1990). The basic position here is that the rules that have been derived to represent the structure of valid arguments may be taken as relatively adequate representations of normative, idealised, abstract operations of mind in this domain. However, as Russell suggests, this competence is not "to be regarded as 'mental representations' that the adult thinker uses when she reasons, but [as] idealisations of the system of thought to which the 'normal adult' has access. Sometimes the access is good, sometimes poor" (1987, p. 41). Competence theorists thus argue that this logical competence is an idealised model of the dynamic organisation of the individual's capability in the domain in question. And they argue that this competence must play a central and privileged role in any account of logical reasoning.

One issue facing such theorists is how best to describe this competence. On one point, however, all competence theorists agree: The particular notation system used to describe the organisation of a mental logic does not need to correspond closely to that found in logic textbooks. For example, truth tables are a standard notation often used by logicians to describe argument forms, but no one argues that people generally reason according to truth tables or that a mental logic model should incorporate them. Piaget and his colleagues (Beth & Piaget, 1966; Inhelder & Piaget, 1958) initially described formal mental logical competence in terms of the propositional calculus; in other words, a propositional logic system that formalises compound statements in terms of the truth values of their connectives. Later, they moved towards a description of competence as a natural deductive system; in other words, a system that focuses on the validity of arguments rather than logical truth, and employs only inference rules rather than axioms (Piaget, 1986, 1987a, 1987b; Piaget & Garcia, 1991). Other influential proposals have suggested that "inference schemas" or deduction rules constitute the best description of logical competence (Braine & O'Brien, 1998; O'Brien, Braine, & Yang, 1994; Rips, 1994). A second issue that such theorists face concerns the development of the competence system.
It has been suggested that this is primarily innate (e.g., Macnamara, 1986); that it has a bioevolutionary origin (e.g., O'Brien, 2004); that it is partially innate and partially learned (Braine, 1990); that it emerges out of language (Falmagne, 1990); and finally there is the position held in the present paper, that the origins of competence are found in the embodied actions of the infant, the child, and the adolescent as the person coacts with the social and physical world (Lakoff & Johnson, 1999; Overton, 2003; Piaget, 1986). In contrast to theories that incorporate a mental logical competence, those that deny the very existence of this can best be thought of as procedural theories. Arguing that there is no formal system of logic or mental logical rules that underlie logical reasoning, procedural theorists tend to have a strong commitment to information processing and computational models as explanatory devices. There has been a range of specific procedural theories designed to account for performance on deductive reasoning tasks. These range from simple information processing models (Griggs, 1983; Mandler, 1983), which claim that counterexamples stored in memory are sufficient to explain what appears to be "logical" reasoning, to theories that rely on domain specific reasoning via specialised schemas (e.g., Cheng & Holyoak, 1985, 1989), domain specific cognitive algorithms (Cosmides, 1989; Cummins, 1996b), or mental models (Johnson-Laird & Byrne, 1991, 2002).

In examining theories of the nature and development of deductive reasoning, we thus find two broad classes of theory: competence theories and procedural theories. Competence theories describe the nature and development of relatively enduring and universal operations of mind. When the domain of study is reasoning, this has been described in terms of the rules of deductive logic, and is thus primarily formal or syntactic in structure. In contrast, procedural or process theories focus on specific representations and procedures as these occur in real time and are related to local problems. Procedures are primarily semantic in nature, and involve specific interpreted representations. To clarify, deduction rule theories such as that proposed by Rips (1994) are not procedural in this sense because they specify particular behavioural operations that are neither independent of the syntactic structure nor context-sensitive. The rules presented by Rips simply describe a computational process directly relevant to the syntactical structures of the formal rules. Competence and procedural theories can only be seen as incompatible to the extent that one type of theory is considered a replacement for the other. This incompatibility disappears if the two theories are seen as distinct but interrelated components of a general theory of logical reasoning. This understanding forms the basis for the construction of a general competence–procedural developmental theory of deductive/logical reasoning. From this perspective, competence serves to explain the universal and necessary features of logical reasoning. However, competence, as the dynamic organisation of mental logic, is neutral as to how this system is accessed and implemented. Procedures, on the other hand, are exactly designed to function as the real-time processing that accesses and implements the competence.
This distinction is similar to the availability and accessibility distinction introduced by Tulving and Pearlstone (1966) with respect to memory. There is much more that is available for processing than can be accessed at any one moment. The discrepancy between availability of competence and its accessibility implies a critical role for procedural mechanisms necessary to access and process an available competence.

A competence–procedural developmental theory

A theory that proposes a distinction between a formal competence and procedures that access that competence must be explicit about how such a scheme would incorporate a developmental or change dimension. Overton and his colleagues (Müller, Overton, & Reene, 2001; Overton, 1990; Overton, Ward, Noveck, Black, & O'Brien, 1987; Ward, Byrnes, & Overton, 1990; Ward & Overton, 1990) have consistently maintained that the development of a relatively complete logical competence necessarily precedes success on reasoning problems that are fully deductive in nature, but that real-time information processing plays a key role in determining any specific performance. From this developmental perspective, some features of logical reasoning may be available quite early, and through embodied actions in the world these features become increasingly differentiated and reintegrated until a relatively complete mental logic is formed in adolescence (Müller et al., 2001) or early adulthood. Processes of differentiation and reintegration imply that there will be qualitative transformations in both mental logical competence and procedures across the course of development.

Overton (1985) thus describes the person as having two cognitive systems. The competence system functions to comprehend the world. It is a dynamic system composed of the relatively stable, relatively enduring, and universal logical operations (e.g., "and", "or", "if . . . then", "negation"). The system is considered to be complete, as in the highest level of deductive competence, or incomplete, as in the sensorimotor embodied action systems found in infancy. Thus, the development of mental logic should not be characterised in terms of presence versus absence of a general competence. Rather, when it comes to classifying different conditional reasoning problems, it is better to investigate the cognitive competence that is necessary to achieve a particular interpretation (Staudenmayer & Bourne, 1977). A second system, the procedural system, functions to assure success on problems. This is composed of individuated real-time action systems that may be sequentially ordered, but are not enduring in the way that the competence system endures. A procedure is an action means to an end or goal. It is context-dependent, and context includes both the available competence and information inputs (e.g., success at baking a cake requires stirring, mixing, and beating, but a recipe represents the underlying competence that forms the necessary context for these procedures). Procedures are considered to be sufficient or insufficient, rather than complete or incomplete.
This distinction between the completeness/incompleteness of competence and the sufficiency/insufficiency of procedures is particularly important for deductive reasoning.

Competence and necessity

Within a competence–procedural theory, the development or acquisition of the concept of necessity (i.e., that which must be) becomes important. Competence develops first at the action level of embodiment (Overton, 1990, 2003). Thus, the infant's early understandings of the real (that which is), the necessary (that which must be), and the possible (that which might be) are constructions derived from organised embodied action and the resistance it meets in the real world. This process by which general logical competence arises through interactions with the world conforms to Piaget's assimilation/accommodation process. Assimilation is the phase of an action where current meanings or expectations (either sensorimotor or symbolic) are projected onto the world. Accommodation refers to the phase of an action that, following the partial success of an assimilation, feeds back and results in a change of the original meaning or expectation. Necessity is the expression of assimilatory action. Possibility is the expression of accommodative action.

To illustrate, consider the example of newborn sucking. This act emerges from a preadapted action organisation (preadapted sensorimotor assimilatory structures or initial competence), gives meaning to the object of action, and, thus, "creates" the object's meaning as a sensorimotor "suckable". This assimilation also yields a result (e.g., obtains milk). It is when sucking meets a resistance (in the sense of failing to yield a result) that variation occurs, and one of the variants leads to success. This variant of the original act feeds back to the initial assimilatory competence system as an accommodation. The variant thus creates a new possibility that, by modifying the initial system, becomes a new assimilation. Here then we have the first sensorimotor differentiation into necessity (assimilation), possibility (variations or accommodations), and reality (continued resistances) at a sensorimotor action level of integration. This embodied procedural origin of necessity and possibility establishes the base for progressive (higher level conceptual and propositional) coordinations of (1) possibilities (with their flexibility) and (2) necessities (with their self-regulating system character) into increasingly complete competence systems. This means that the understanding of logical necessity that becomes evident in childhood and adolescence is never the direct product of a hardware level of explanation, as a nativist might suggest, nor is it the product of inductive generalisation drawn from direct observations of the world. Logical necessity is characteristic of a logical model (from the point of view of logic) or the deductive competence system (from the point of view of psychology).
The developmental question is not how either the brain or the world generates feelings of "must". The developmental question is how an action understanding of "must" becomes transformed into a propositional understanding of "logical necessity". Having seen a thousand planes going to Chicago and stopping in Cincinnati may yield a contingent truth (i.e., an assertion based on empirical knowledge), and a very strong feeling that this plane must stop at Cincinnati. It does not, however, yield logical necessity. Possessing a system of understanding that generates integrations such as "all As are B; this is an A; therefore, this is a B" does yield logical necessity, and this is independent of any particular set of empirical observations concerning As or Bs. If all planes to Chicago stop in Cincinnati, and this is a plane to Chicago, then this plane will stop in Cincinnati. This statement is a logical necessity, and this has nothing to do with empirical observations of planes. The logical necessity comes from the formal or syntactical organisation among the parts of the sentence or, more generally, among sets of propositions. It is this dynamic organisation, beginning in sensorimotor acts as an initial pattern of action, becoming transformed, through actions in the world, into the sensorimotor "must", and undergoing transformations into symbolic representations and the integration of symbolic representations, that comes to constitute the complete deductive competence system of the adult.

An account including logical necessity requires a focus on competence and its development. The concept of logical necessity, in turn, requires an understanding of the central concepts of implication and relevance. Implication is a relation that holds between premises and conclusion, or between antecedent and consequent, as in the proposition "if p, then q". When the relation of implication holds, we can then say that p implies q (a deductive argument). This argument is defined as implication if and only if it cannot be the case that the antecedent (premise) is true and the consequent (conclusion) false. Thus, the implication relation is defined by being a logical necessity. Implication, then, is a symbolic concept entailing a necessary relation that leads from one state of affairs to another. All concepts, including implication and necessity, become transformed across the several levels of development. We have already seen that, from an embodiment perspective (Overton, 2003), an understanding of necessity originates out of the organised actions of the individual. Thus, implication as a symbolic propositional concept as, for example, in the modus ponens argument ("if p, then q; p, therefore q"), finds its primitive and incomplete origins in the preconceptual action level (e.g., "if I pull on the blanket, then I will bring the toy on top of the blanket closer"). At the preconceptual sensorimotor action level, the primitive analogue of implication is a clearly defined relation between actions and, thus, an intentional and intuitive sense of must.
The progressive transformation of this analogue to the final propositional understanding of logical necessity is a development that entails the differentiation and reintegration of competence systems beginning at the sensorimotor action level, moving to the symbolic (at around age 2), and from there to the reflective symbolic level (beginning around age 4) and finally to higher order reflective symbolic systems (beginning at approximately age 6 and again at around age 14), culminating in the relatively complete logical competence system found in adolescence or early adulthood. Although any discussion of the development of implication and necessity demonstrates the critical role of meaning, it does not capture the full importance of meaning in the system. The introduction of the logical concept of relevance both forms an important bridge between meaning and implication and expands on the definition of implication itself. For any conditional proposition ("if p, then q") there may be some identifiable meaningful relation, linkage, or connection between the antecedent (p) and consequent (q) clauses, or there may be none. For example, for the conditional proposition, "if he is a bachelor, then he is unmarried", there is a meaning relation that is definitional in nature. This is distinguished from material implication in which there is no meaningful connection between antecedent and consequent (e.g., "if the moon is made of blue cheese, then oceans are full of water"). Thus, the concepts implication, logical truth, entailment, and validity require not only a necessary relation between antecedent and consequent as discussed earlier, but a relevance relation as well.

Piaget (Piaget & Garcia, 1991) used the relevance relation, which he termed "meaning implication" or "signifying implication", to establish further the central role of implication in a competence model. This, in turn, further demonstrated both that the apparent extensional system develops out of the meaning system and that primitive forms of implication precede the relatively complete form evident in formal deductive reasoning. Piaget accomplished these purposes by arguing that a relevance connection, defined as assimilation, is basic to any action sequence that involves knowing, from the sensorimotor to reflective symbolic levels. At the sensorimotor action level, the relevance connection occurs between actual actions (e.g., "if I pull on the blanket, then I will bring the toy on top of the blanket closer"). If toy–blanket is not assimilated (given a meaning linkage) to pulling the blanket, then the action implication vanishes. At the next developmental level, the action meanings become incorporated in language and become symbolic (e.g., "if it rains, then the bicycle will get wet"). Here, the causal implication occurs only if a relevance relation is formed such that "wet bicycle" is given a meaning linkage to "rain". Finally then, at higher reflective propositional levels, the relevances and logical necessities join to form the deductive competence system that operates as a structured whole. This system, representing a developmental transformation of the earlier ones, permits the kind of logical understandings that involve genuine implication, entailment, logical truth, and validity that are evident in traditional deductive reasoning problems.
In summary, the relatively complete deductive competence system takes the form of a logical deductive model, and the system differentiates out of the embodied actions of the child. As both the competence system and the procedural system become transformed and develop to higher levels of knowing, deductive competence becomes transformed and increasingly more adequately serves the function of logical comprehension or logical understanding. Interpretative/implementation procedures increasingly serve the function of providing access to, and implementation of, the competence.

Developmental evidence

If we begin with an understanding of necessity as originating in action, two major predictions can be made about the pattern of the development of deductive reasoning. First, we would expect that a propositional reasoning competence would be a relatively late developmental acquisition. Second, we would expect that factors related to procedural information processing should influence access to and implementation of that competence. There are a number of empirical investigations that offer support for these hypotheses. For example, Overton and colleagues (Chapell & Overton, 1998; Müller et al., 2001; O'Brien & Overton, 1980, 1982; Overton, Byrnes, & O'Brien, 1985; Ward & Overton, 1990) have consistently found significant developmental advances in reasoning performance from late childhood through adolescence on a number of different formal deductive reasoning tasks, including inference and evaluation tasks. In inference tasks, participants are presented with an incomplete conditional rule, and asked to use a set of exemplars to make an inference about the missing component of the rule. In evaluation tasks, participants are given a conditional rule, and asked to evaluate a set of propositional combinations that may or may not prove the rule to be false. Successful performance on these tasks does not appear consistently before middle to late adolescence. Additionally, whereas younger adolescents do not benefit from training or the introduction of contradictory evidence, older adolescents do. Training is procedural in nature and procedures cannot access a competence that is not available (Overton et al., 1985).

Evidence for qualitative developments in deductive reasoning also comes from investigations using the Wason selection task. The original version (Wason, 1968) assessed conditional/deductive reasoning through the verification of the truth of an abstract conditional rule ("if there is a vowel on one side of the card, then there is an even number on the other side"). In this task participants are shown the conditional rule and an array of four cards showing, respectively, the letter E, the letter K, the number 4, and the number 7 on one side of each card. Instructions include being told that each card presents a letter on one side and a number on the other, along with the rule if there is an "E" on one side of the card, there must be a "4" on the other side, and the task is to select those cards that would have to be turned over to decide whether the rule is true or false.
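The normative structure of the task just described can be sketched in a few lines of code. This is an illustration of the logic of the task only, not of the processes participants use; it applies the vowel/even-number version of the rule and asks, for each visible face, whether some hidden face could make the rule false. The sets of possible hidden letters and numbers, and all function names, are assumptions introduced for this example.

```python
VOWELS = set("AEIOU")


def is_vowel(face):
    return face in VOWELS


def is_even_number(face):
    return face.isdigit() and int(face) % 2 == 0


def violates_rule(letter, number):
    # The rule "if vowel then even number" is false only for
    # a vowel paired with a number that is not even (p and not-q).
    return is_vowel(letter) and not is_even_number(number)


def could_falsify(visible, possible_letters, possible_numbers):
    """True if some hidden face would make this card a counterexample."""
    if visible.isalpha():
        return any(violates_rule(visible, n) for n in possible_numbers)
    return any(violates_rule(l, visible) for l in possible_letters)


# Hypothetical hidden-face alternatives; any set containing at least
# one vowel and one odd number gives the same selections.
POSSIBLE_LETTERS = ["E", "K"]
POSSIBLE_NUMBERS = ["4", "7"]

cards = ["E", "K", "4", "7"]
to_turn = [c for c in cards
           if could_falsify(c, POSSIBLE_LETTERS, POSSIBLE_NUMBERS)]
print(to_turn)
```

Only the cards whose hidden side could complete a vowel-with-odd-number pairing are returned; the other two cards cannot falsify the rule whatever is on their reverse.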
Selecting both the E (p) and the 7 (not-q) cards is correct because this solution provides evidence for the recognition that only a falsification strategy will provide logical certainty (i.e., a logical necessity). Not selecting the K (not-p) and 4 (q) along with selecting E and 7 demonstrates a coordination among the permissible and impermissible instances (i.e., coordination among the four argument forms modus ponens, modus tollens, denied antecedent, affirmed consequent) and this coordination defines a full understanding of the advanced logical concept of "implication". As noted above, implication is a necessary relation between premises and conclusion that is present if and only if it could not be the case that the antecedent p (premise) is true and the consequent q (conclusion) is false. In fact, when p implies q, the argument is defined as a deductive argument. The "abstract version" of the Wason selection task entails reasoning about indicative rules (i.e., requires reasoning about the truth of the situation; Noveck & O'Brien, 1996; Sperber, Cara, & Girotto, 1995) because it requires the identification of instances that would falsify the rule.

In early research using this task, successful performance was shown to be extremely difficult for both adolescents (Girotto, Gilly, Blaye, & Light, 1989; Overton et al., 1987) and adults (Evans, Newstead, & Byrne, 1993). Only about 10% of all adults solve the abstract version correctly. The dominant pattern of response was the selection of the E (p) and 4 (q) cards, or only the card E (p). Wason (1983) himself pointed out that the difficulty with the abstract version is related to the heavy memory load placed on participants, with the consequence that information cannot adequately be represented and maintained as a coherent whole. This suggests that the abstract task fails to access competence because of procedural issues (i.e., working memory demands). On the other hand, when meaningful, concrete, or thematic content is used in place of abstract content, the majority of adolescents and adults exhibit successful deductive reasoning (Foltz, Overton, & Ricco, 1995; Markovits & Vachon, 1990; Müller et al., 2001; Overton et al., 1987; Ward & Overton, 1990; see Evans et al., 1993 for a review of the adult literature). This suggests that when procedural interferences are reduced (e.g., working memory demands), competence becomes more readily accessed.

One factor that has been found to be associated with access to a mature competence is the content and familiarity of the conditional rule. Work by Overton and colleagues has examined the specific role of meaningful content and performance on variations of the Wason selection task in children and adolescents (Overton et al., 1987; Ward & Overton, 1990). Overton et al. (1987) demonstrated that children succeed more often with meaningful and familiar social rules than with meaningless and unfamiliar rules. However, in this study, a crucial finding for the competence–procedural theory was that familiar content did not lead to improved reasoning before 13 years of age. Further research (Ward & Overton, 1990) focused on the meaningfulness of the relation between antecedent and consequent (i.e., logical relevance).
Here, it was demonstrated that selection task problems with a meaningful relation between the two led to better performance for 14- and 17-year-olds, but not 11-year-olds, who performed equally poorly with meaningful and meaningless relations. Developmental changes in deductive reasoning skills of older children and adolescents have also been demonstrated in other cross-sectional studies by Overton and colleagues (Byrnes & Overton, 1986, 1988; Chapell & Overton, 1998, 2002), by cross-sectional investigations in other labs (Klaczynski & Narasimham, 1998; Klaczynski, Schuneman, & Daniel, 2004; Markovits, Fleury, Quinn, & Venet, 1998; Markovits, Schleifer, & Fortier, 1989; Markovits & Vachon, 1989, 1990; Venet & Markovits, 2001), and by recent longitudinal data (Müller et al., 2001). Additional empirical evidence comes from investigations using statistical analysis of latent dimensions, such as Rasch analysis, which have reported qualitative developmental changes in class and propositional reasoning (Müller, Sokol, & Overton, 1999), and in deductive reasoning (Spiel, Glück, & Gössler, 2001). Unidimensional latent structure analyses are particularly helpful in identifying qualitative developmental change taking place along a theoretically uniform dimension (e.g., logical competence). Taken together, the empirical findings broadly support Overton's (1990, 1991) assertions that (1) access to, and implementation of, a deductive reasoning competence is a function of procedural determinants (e.g., working memory limitations, differences in problem representation, cognitive style, anxiety), and that (2) this formal deductive competence is not consistently available before adolescence.

Domain general versus domain specific reasoning

Along with the debate regarding competence and procedures, recent discussion has centred on the question of whether processes involved in thinking and reasoning should be considered domain-free or domain specific (Beller & Spada, 2003; Roberts, Welfare, Livermore, & Theadom, 2000). Domain-free reasoning would entail relatively global cognitive processes, while domain specific reasoning would entail more limited cognitive processes that function with specific information and specific contexts. It should be clear by now that the competence–procedural theory proposes both kinds of reasoning processes. Formal reflective symbolic competence is a domain general competence, the access to which can be influenced by the processing of domain specific information. In contrast to competence–procedural theory, several others have favoured an exclusive understanding of reasoning as being either domain general or domain specific. All such extant theories are procedural, holding the shared assumption that a formal logical competence is irrelevant to the nature and development of deductive reasoning. We of course argue that any exclusively procedural theory either fails to account for the logical features of logical reasoning, especially logical necessity, or implicitly assumes these features as background.

One influential procedural domain general model that illustrates this problem is presented by Johnson-Laird and colleagues (Johnson-Laird & Byrne, 1991, 2002). This mental-models theory proposes that, when reasoning, people construct mental representations of the possibilities presented in the problem. The theory assumes that people represent many possibilities or outcomes captured by the problem, that these are represented iconically so that the structure of the mental model corresponds to what it represents, and that mental models represent what is true, but not what is false (Johnson-Laird & Byrne, 2002).
In order to represent negation in deductive reasoning problems, the person doing the reasoning must modify the model by ``tagging'' this, and by ¯eshing out the initial model into a fully explicit one. For example, the statement John is in Philadelphia would be represented by a single token, say p. The negation of this statement (John is not in Philadelphia) would be ``tagged'' by an explicit negation sign (:) preceding the token, as in :p. Using these tokens as symbols, and with each row representing a model, a kind of mental diagram can be created. The statement Either John is in Philadelphia or Sally is in Chicago, but not both might be ¯eshed out in (1) to contain (a) in the ®rst row, a

10. Competence±procedural theory

245

token for John is in Philadelphia, and a negated token for Sally is in Chicago; (b) in the second row, a negated token for John is in Philadelphia and a token for Sally is in Chicago.

(1)   [p]    [¬q]
      [¬p]   [q]
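As a rough illustration of the token notation in (1), the following Python sketch enumerates the fully explicit models of an exclusive disjunction and then discards the models that are inconsistent with a further premise. It is only a truth-table style approximation under stated assumptions: the function names (`exclusive_or_models`, `update`, `follows`, `show`) are hypothetical, and nothing here is claimed to be the mechanism that mental-models theory itself proposes.

```python
from itertools import product


def exclusive_or_models():
    """Enumerate the rows in which exactly one of p, q is true.

    Each model is a dict mapping a token name to its truth value;
    a False value plays the role of a negation-tagged token.
    """
    return [{"p": p, "q": q}
            for p, q in product([True, False], repeat=2)
            if p != q]


def update(models, token, value):
    """Keep only the models consistent with a categorical premise."""
    return [m for m in models if m[token] == value]


def follows(models, token):
    """A conclusion follows if it holds in every remaining model."""
    return bool(models) and all(m[token] for m in models)


def show(model):
    """Render one model as a row of bracketed tokens."""
    return " ".join(f"[{'' if value else '¬'}{name}]"
                    for name, value in model.items())


models = exclusive_or_models()
for row in models:
    print(show(row))

# Add the premise "John is not in Philadelphia" (p is false).
remaining = update(models, "p", False)
print("Sally is in Chicago follows:", follows(remaining, "q"))
```

In this sketch, working-memory load would correspond loosely to the number of rows that have to be held at once, which is one way to picture the procedural limitation that the theory describes.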

Neither situation [p] [q] nor [¬p] [¬q] is represented in this model because they are ruled out as false possibilities by the principle that mental models do not represent what is false. Models can be fleshed out to make deductive inferences, such as in the following: John is in Philadelphia or Sally is in Chicago. But John is not in Philadelphia. So, Sally is in Chicago. The fleshing-out process can require the maintenance of several representational parts, or tokens, in a limited working memory, which constitutes a procedural limitation to reasoning, making reasoning difficult in some situations but not others.

Developmentally, the mental-models approach has been expanded by Johnson-Laird (1990) and by Markovits and colleagues (Markovits & Barrouillet, 2002). Accordingly, developmental difficulties in reasoning are attributed to the immature ability to construct, maintain, and process a greater number of mental models in working memory (Markovits & Barrouillet, 2004). In this way, the modelling procedures are considered to be domain general because such cognitive processes are the source of constraints on successful reasoning.

It should be recognised that the mental models theory is, in fact, compatible with a competence theory as long as it is not seen as a substitute for logical competence. As a procedural theory, the principal value of the mental models approach has been in demonstrating procedural effects on reasoning (e.g., the facilitating effect of local content, or as an explanation for why some inferences are harder to make than others), rather than presenting a comprehensive account of deductive reasoning. In this context, mental models may represent one half of the explanation (see Roberts, 1993, for a similar discussion). Indeed, one can readily agree with Braine (1990) when he says "We think it very likely that subjects often use mental models in reasoning. It is almost certain that a complete account of deductive reasoning will need a subtheory of mental models" (p. 147).

However, missing from the mental models theory is a general account of the deductive process itself. As others have pointed out, mental models have never been able to provide an account of the conception of logical understanding itself (Russell, 1987; Scholnick, 1990), or how a construction of a model of local


Overton and Dick

content can lead to the understanding of the universal scope of the proposition (O'Brien, 2004). Additionally, it is difficult to see how a token-based mental-models theory could be used to describe competence in any way. Key aspects of logic cannot simply emerge from manipulating spatial tokens in working memory. In the absence of certain fundamental logical concepts, a theory that only assumes mental-models procedures is hard pressed to explain how most logical inferences would be possible. A mental-models procedural theory can be effective only if someone who has the necessary logical competence is interpreting it (see Roberts, 1993, and Rips, 1989, for similar arguments).

This same criticism is applicable to theories that attempt to provide a domain specific account of deductive reasoning. In this camp, reasoning about social situations or problems has received the most attention, and several investigators have emphasised a set of specialised principles that are derived via domain specific experiences (Cheng & Holyoak, 1985, 1989) or innately determined via evolutionary pressures operating within social situations (Cosmides, 1989; Cummins, 1996b). Those who argue that reasoning is domain specific have typically concentrated on a form of conditional reasoning, within the social domain, called deontic reasoning. As contrasted with indicative conditionals, which require reasoning about the truth of a rule, deontic conditionals require reasoning about obligations, permissions, and prohibitions, and whether these have been violated (Cummins, 1996b; Manktelow & Over, 1991). A relatively consistent finding is that participants perform better with deontic conditionals when compared to indicative conditionals (see Cummins, 1996b; Evans, Newstead, & Byrne, 1993).
Several theories have been presented to account for this difference (Chater & Oaksford, 1996; Cheng & Holyoak, 1985, 1989; Cosmides, 1989; Cummins, 1996b; Evans & Over, 1996; Johnson-Laird & Byrne, 1992; Manktelow & Over, 1991; Noveck & O'Brien, 1996; Sperber et al., 1995), but here we concentrate on two domain specific explanations of the deontic/indicative distinction.

The first is pragmatic reasoning theory (Chao & Cheng, 2000; Cheng & Holyoak, 1985, 1989; Cheng, Holyoak, Nisbett, & Oliver, 1986), which claims that deductive reasoning emerges from schemas. These are sets of induced generalised rules related to specific goals and environmental situations. Cheng and Holyoak's work has primarily focused on two such structures: the permission and obligation schemas. The permission schema is relevant to social regulations where the consequent specifies a precondition that must be met in order for the action specified in the antecedent to be taken. The obligation schema deals with situations in which the consequent specifies an action that must be taken when the condition specified in the antecedent occurs. This position holds that when a specific situation invokes the permission or the obligation schema, a set of rules for dealing with all possible outcomes becomes available. Permission and obligation rules are deontic rules because they are prescriptive and not descriptive.


This prescriptive feature is usually expressed by modal verbs such as "must" and "may". Support for pragmatic reasoning schema theory comes from performance on certain versions of the selection task. Adults perform significantly better when the rule is formulated as a permission rather than an arbitrary rule (Cheng & Holyoak, 1985). Based on this evidence, it has been argued that people produce correct solutions on these problems without logical reasoning, and that success on deductive problems only appears to follow the rules of logic, and this is merely coincidental (Girotto & Light, 1993).

Theories of the origin of pragmatic reasoning schemas have argued that there are no qualitative developmental changes in deductive reasoning (Girotto & Light, 1993), and that the reasoning schemas are abstracted from everyday experience at an early age (Cheng & Holyoak, 1985). As noted earlier, research has already supported the contention that there are, in fact, qualitative developmental changes, and this weakens the pragmatic reasoning position. The major issue that remains is the argument for developmental precocity. Claims of precocious performance in preschoolers have been made (Chao & Cheng, 2000; Cummins, 1996a, 1996b; Harris & Nuñes, 1996), but problems with the method of investigation give cause to question any conclusion concerning precocity. For example, in an often-cited study by Harris and Nuñes (1996), in which the claim was made that preschool children can reason with permissions, the children were simply given an evaluation task and asked to choose the picture that would violate a given permission rule. Children were told a story and given a permission rule (e.g., "if you are going to play outside, you must put your coat on"), and then were shown two pictures and asked to point to the picture of the story character "being naughty and not doing what her mum told her" (i.e., the picture of the character without a coat on).
This task requires the preschooler to point to an action that would violate an "if . . . then" rule, which is not a surprising ability even for a 3-year-old. In other words, characters who match the rule (i.e., who are wearing coats) are obeying it, and characters who don't match the rule are disobeying it. Demonstrating that children can point to violations of an "if . . . then" rule does not itself constitute evidence of deductive reasoning. As pointed out earlier, logical reasoning with conditional propositions entails an understanding of the logical concept of "implication", and this understanding is demonstrated only in cases in which children demonstrate a coordination among the permissible and impermissible instances that define implication (i.e., simultaneously understanding the two valid and two invalid argument forms). The ability to pick a violation of a rule may be an early developmental precursor to the modus tollens argument, but it does not constitute an understanding of implication and the logical necessity implication entails.

The same criticism is relevant to the research conducted by Girotto and colleagues (Girotto et al., 1989; Light, Blaye, Gilly, & Girotto, 1989; Light, Girotto, & Legrenzi, 1990) exploring pragmatic reasoning schemas and conditional reasoning in children. In the Girotto et al. (1989) study, the selection task is formulated as a permission rule (e.g., "if one is sitting in the front of the car, then one must wear seat-belts"). The authors reported that 10-year-old children performed well with unfamiliar but plausible rules, but performance was significantly worse when the rule was unfamiliar and implausible. However, in the studies using the full Wason task, successful performance was defined as the choice of the p and not-q cards, and not as a coordination among the four argument forms (i.e., selection of the correct cards coupled with avoidance of the incorrect cards). The latter pattern of selection, termed complete falsification, implies recognition that only a falsification strategy will lead to the correct solution. In many of the reported experiments, children who selected the correct cards also selected incorrect cards.

Moreover, the findings of Girotto and colleagues contradict more recent longitudinal work conducted by Overton and colleagues (Müller et al., 2001). Using the same problems from prior cross-sectional work (Ward & Overton, 1990), Müller et al. replicated the same developmental trajectory that had been found in these earlier studies (i.e., qualitative change over time). When problems are scored for complete falsification in a longitudinal study, qualitative changes in reasoning competence are revealed. That said, the discrepancy between these longitudinal findings and the cross-sectional research reported by Girotto and colleagues warrants further research.

The pragmatic reasoning schema interpretation can also be challenged on other fronts. First, only the permission rule, but not the obligation rule, reliably facilitates performance on the selection task (Müller et al., 2001; Noveck & O'Brien, 1996).
Second, pragmatic reasoning schemas appear to be task specific as they are associated with improved performance on the selection task, but not on an evaluation task (Markovits & Savary, 1992; Thompson, 1995), and they have difficulty explaining good performance on other kinds of deductive reasoning tasks (see Rips, 1989). Similarly, it may be that other task features are confounded with the presence of the deontic rule, and it is thus not the deontic rule itself that is associated with improved performance (Klaczynski & Narasimham, 1998; Noveck & O'Brien, 1996; Thompson, 1995; see also Noveck, Mercier, & van der Henst, Chapter 2; Roberts, Chapter 1, this volume). For example, participants perform better in enriched contexts, regardless of whether the rule is pragmatic (Noveck & O'Brien, 1996), and they perform well with nonpragmatic rules that suggest the creation of alternative antecedents (Klaczynski & Narasimham, 1998; Thompson, 1995, 2000).

Like mental models, pragmatic reasoning schemas fall short when presenting a complete account of deductive reasoning. However, also like mental models, pragmatic reasoning schemas may account for part of the explanation. In this regard, the comments from Noveck and O'Brien (1996, p. 484) are consistent with a competence–procedural account of deductive reasoning:


content-independent inferences of mental logic do not explain the influence of permission content on these problems; note, however, that pragmatic schemas alone also do not provide a complete account of the data reported here, and some of the features that influence performance are content-independent. More is going on with these problems than pragmatic schemas alone can explain.

Again, these different accounts are incompatible only if they are seen as substitutes for each other. From a competence–procedural approach, both are necessary pieces of an account of deductive reasoning and its development.

The above theoretical and empirical arguments concerning factors unrelated to the deontic status of the rule may also pose a problem for another class of domain specific theories: those that emerge from an evolutionary perspective. Such theories argue that the architecture of human reasoning is innate and results from selective pressures to solve adaptive problems within specific domains. Cosmides (1989; Cosmides & Tooby, 1992, 2005), for example, proposed that differences in performance between deontic and indicative versions of the selection task could be explained by social contract theory. According to this theory, there are domain specific cognitive processes dedicated to reasoning about social contracts in social situations, and these specific processes are modular in their organisation. The theory proposes that reasoning will be better in situations in which it is beneficial to detect violators of a social contract (or to detect cheaters; Gigerenzer & Hug, 1992), such as in situations where a person accepts a benefit without paying a cost. Detection of violators in such social situations would be adaptive and advantageous, and thus specific cognitive algorithms would be selected for that would allow our ancestors to detect cheaters efficiently (Cosmides & Tooby, 1992; Gigerenzer & Hug, 1992).
Following this logic, other modular reasoning processes have been proposed, such as a special system for reasoning about risk reduction in hazardous situations (Fiddick, Cosmides, & Tooby, 2000) or reasoning about sharing-rules (Hiraishi & Hasegawa, 2001).

Variants of this evolutionary approach have also been presented. For example, Cummins (1996b) has argued that reasoning about social contracts describes a piece of a larger adaptive system for reasoning about social norms. Thus, enhanced reasoning about deontic conditionals is taken to reflect enhanced reasoning about violations of any such norms, and not just reasoning about violations of reciprocity. Again, a focus of support for evolutionary psychological theories stems from research that finds early differences between deontic and indicative reasoning in children (Chao & Cheng, 2000; Cummins, 1996a, 1996b; Harris & Nuñes, 1996), or little qualitative change in reasoning competence across the lifespan (Girotto & Light, 1993).

As noted earlier, all domain specific theories, evolutionary theories included, falter when confronted by the ample evidence that reasoning improves across the lifespan, especially when this improvement is qualitative in nature. However, there are additional causes to question the viability of evolutionary explanations. For example, Fodor suggests that the indicative–deontic distinction is related to the nature of the rules and is an artifact of the structure of the task (Fodor, 2000a, 2000b). The result is that the deontic conditional promotes a strategy of searching for violators, a strategy that is most appropriate in the assessment of a formal logical understanding. There is also evidence that successful performance can be elicited in nondeontic contexts (Almor & Sloman, 1996, 2000; Girotto, Kemmelmeier, Sperber, & van der Henst, 2001; Griggs, 1989; Liberman & Klar, 1996; Platt & Griggs, 1993; Sperber et al., 1995), and in contexts that do not involve social exchange (Cheng & Holyoak, 1985).

There is considerable evidence that cheating detection content is neither necessary nor sufficient to elicit the logically correct responses on the selection task (Liberman & Klar, 1996). For example, in the classic example of facilitation with unfamiliar content, the cassava root selection problem (Cosmides, 1989), subjects are presented with the rule: If a man eats cassava root, then he must have a tattoo on his face. The rule is preceded by a story in which cassava root is explained to be reserved for married men, who are distinguished from unmarried men by the presence of a tattoo. The task is to determine who is breaking the law, and the four selection cards are presented (i.e., [p] "eats cassava root", [not-q] "no tattoo", [not-p] "eats molo nuts", [q] "tattoo"). Cosmides (1989) and Gigerenzer and Hug (1992) reported very good performance on this version of the selection task, with poor performance in similar no-cheating versions, and this finding was replicated by Liberman and Klar (1996).
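To make the logic of the cassava root cards concrete, the sketch below works out which of the four cards could reveal a violation of a rule of the form "if p then q", and scores a selection for complete falsification in the sense used in this chapter (the p and not-q cards chosen, the other two avoided). The data layout and the function names (`can_falsify`, `correct_selection`, `is_complete_falsification`) are illustrative assumptions and are not taken from any of the cited studies.

```python
# Each card shows one side of the rule "if p then q"; the other side is hidden.
# p = "eats cassava root", q = "has a tattoo".
CARDS = {
    "eats cassava root": ("p", True),
    "eats molo nuts": ("p", False),
    "tattoo": ("q", True),
    "no tattoo": ("q", False),
}


def can_falsify(side, value):
    """Return True if some hidden value would make p true and q false."""
    for hidden in (True, False):
        p, q = (value, hidden) if side == "p" else (hidden, value)
        if p and not q:
            return True
    return False


def correct_selection():
    """Cards that must be turned over to test the rule."""
    return {label for label, (side, value) in CARDS.items()
            if can_falsify(side, value)}


def is_complete_falsification(chosen):
    """Correct cards selected and incorrect cards avoided."""
    return set(chosen) == correct_selection()


print(sorted(correct_selection()))
print(is_complete_falsification(["eats cassava root", "no tattoo"]))
print(is_complete_falsification(["eats cassava root", "no tattoo", "tattoo"]))
```

Under this scoring, a participant who picks the two correct cards but also turns over the "tattoo" card is not credited, which is the stricter criterion that distinguishes complete falsification from simply choosing p and not-q.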
However, Liberman and Klar implemented two other conditions that eliminated confounds related to task interpretation. In one problem, an unconfounded no-cheating version of the cassava root problem, the same rule was presented with the rationale that people without tattoos cannot digest cassava roots. Despite the lack of a cheater detection context or a social contract, performance was equivalent to that in cheating contexts (71%, on average, across five different scenarios of both unconfounded no-cheating and original cheating versions of the task). Thus, a cheating context is clearly not a necessary condition for successful performance. This finding is particularly troublesome for evolutionary theories because there is no specific adaptive reason to expect successful performance in non-cheating versions. There is no social context, nor is there a cheater or violator of any social contract (Cosmides, 1989; Gigerenzer & Hug, 1992).

On both theoretical and empirical grounds neither mental-models nor domain specific theories provide complete explanations for the development of, and successful performance of, deductive reasoning. First, these theories fail to predict the qualitative changes in the development of deductive reasoning. Second, evolutionary theories must explain empirical results that conflict with the theory, including the finding that social contexts and social contracts are not necessary preconditions for facilitated performance in reasoning tasks. Finally, while accounting for many of the facts of logical reasoning, these theories skirt the core issue of logical necessity. An account including logical necessity requires a focus on competence and the development of competence.

Conclusion

In this chapter we have argued that neither domain specific nor domain general models are adequate accounts of logical reasoning, as each fails to handle various critical empirical data. Rather, a comprehensive understanding of logical reasoning requires, on one hand, a theory that can explain the universal features of the logic that competent adults express. On the other hand, this theory also needs to include procedural explanations of how this available competence is accessed and implemented. From our perspective, the most adequate approach to construction of such a competence–procedural theory must entail recognition that ultimately reasoning is the reflection of an embodied development, and not the result of isolated biological mechanisms or isolated cultural influences.

The competence–procedural approach described in this chapter begins from an embodiment assumption (Overton, 2003), and describes the development of reasoning as a series of differentiations and integrations of knowledge structures. These begin with actual physical actions and at each new major level of integration the structures become transformed into richer and more complete propositional systems. With respect to offering a comprehensive theory of deductive reasoning, a key feature of this approach is that it recognises the necessity of both competence and procedural systems. Procedural approaches without a formal competence, as we have described in the present chapter, are unable to explain how one achieves a universal understanding of logical necessity, implication, and relevance.

References

Almor, A., & Sloman, S. A. (1996). Is deontic reasoning special? Psychological Review, 103, 374–380.
Almor, A., & Sloman, S. A. (2000). Reasoning versus text processing in the Wason selection task: A nondeontic perspective on perspective effects. Memory and Cognition, 28, 1060–1070.
Beller, S., & Spada, H. (2003). The logic of content effects in propositional reasoning: The case of conditional reasoning with a point of view. Thinking and Reasoning, 9, 335–378.
Beth, E. W., & Piaget, J. (1966). Mathematical epistemology and psychology. Dordrecht, The Netherlands: Reidel.


Braine, M. D. S. (1990). The "natural logic" approach to reasoning. In W. F. Overton (Ed.), Reasoning, necessity, and logic: Developmental perspectives (pp. 133–157). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Braine, M. D. S., & O'Brien, D. P. (Eds.). (1998). Mental logic. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Byrnes, J. P., & Overton, W. F. (1986). Reasoning about certainty and uncertainty in concrete, causal, and propositional contexts. Developmental Psychology, 22, 793–799.
Byrnes, J. P., & Overton, W. F. (1988). Reasoning about logical connectives: A developmental analysis. Journal of Experimental Child Psychology, 46, 194–218.
Chao, S., & Cheng, P. W. (2000). The emergence of inferential rules: The use of pragmatic reasoning schemas by preschoolers. Cognitive Development, 15, 39–62.
Chapell, M. S., & Overton, W. F. (1998). Development of logical reasoning in the context of parental style and test anxiety. Merrill-Palmer Quarterly, 44, 141–156.
Chapell, M. S., & Overton, W. F. (2002). Development of logical reasoning and the school performance of African American adolescents in relation to socioeconomic status, ethnic identity, and self-esteem. Journal of Black Psychology, 28, 295–317.
Chater, N., & Oaksford, M. (1996). Deontic reasoning, modules, and innateness: A second look. Mind and Language, 11, 191–202.
Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391–416.
Cheng, P. W., & Holyoak, K. J. (1989). On the natural selection of reasoning theories. Cognition, 33, 285–313.
Cheng, P. W., Holyoak, K. J., Nisbett, R. E., & Oliver, L. M. (1986). Pragmatic versus syntactic approaches to training deductive reasoning. Cognitive Psychology, 18, 293–328.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 163–228). New York: Oxford University Press.
Cosmides, L., & Tooby, J. (2005). Neurocognitive adaptations designed for social exchange. In D. Buss (Ed.), The handbook of evolutionary psychology (pp. 584–627). New York: Wiley.
Cummins, D. D. (1996a). Evidence of deontic reasoning in 3- and 4-year-olds. Memory and Cognition, 24, 823–829.
Cummins, D. D. (1996b). Evidence for the innateness of deontic reasoning. Mind and Language, 11, 160–202.
Evans, J. St B. T., Newstead, S. E., & Byrne, R. M. J. (1993). Human reasoning. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Evans, J. St B. T., & Over, D. E. (1996). Rationality in the selection task: Epistemic utility versus uncertainty reduction. Psychological Review, 103, 356–363.
Falmagne, R. J. (1990). Language and the acquisition of logical knowledge. In W. F. Overton (Ed.), Reasoning, necessity, and logic: Developmental perspectives (pp. 111–131). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: The role of domain specific representations in the Wason selection task. Cognition, 77, 1–79.
Fodor, J. A. (2000a). The mind doesn't work that way: The scope and limitations of computational psychology. Cambridge, MA: MIT Press.
Fodor, J. A. (2000b). Why we are so good at catching cheaters. Cognition, 75, 29–32.
Foltz, C., Overton, W. F., & Ricco, R. (1995). Proof construction: Adolescent development from inductive to deductive problem solving strategies. Journal of Experimental Child Psychology, 59, 179–195.
Gigerenzer, G., & Hug, K. (1992). Domain specific reasoning, social contracts, and perspective change. Cognition, 43, 127–171.
Girotto, V., Gilly, M., Blaye, A., & Light, P. (1989). Children's performance in the selection task: Plausibility and familiarity. British Journal of Psychology, 80, 79–95.
Girotto, V., Kemmelmeier, M., Sperber, D., & van der Henst, J.-B. (2001). Inept reasoners or pragmatic virtuosos? Relevance and the deontic selection task. Cognition, 81, B69–B76.
Girotto, V., & Light, P. (1993). The pragmatic bases of children's reasoning. In P. Light & G. Butterworth (Eds.), Context and cognition: Ways of learning and knowing (pp. 134–156). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Griggs, R. A. (1983). The role of problem content in the selection task and in the THOG problem. In J. St B. T. Evans (Ed.), Thinking and reasoning: Psychological approaches (pp. 16–47). London: Routledge & Kegan Paul.
Griggs, R. A. (1989). To "see" or not to "see": That is the selection task. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 41A, 517–529.
Harris, P. L., & Nuñes, M. (1996). Understanding of permission rules by preschool children. Child Development, 67, 1572–1591.
Hiraishi, K., & Hasegawa, T. (2001). Sharing-rule and detection of free-riders in cooperative groups: Evolutionarily important deontic reasoning in the Wason selection task. Thinking and Reasoning, 7, 255–294.
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to adolescence. New York: Wiley.
Johnson-Laird, P. N. (1990). The development of reasoning ability. In G. Butterworth & P. Bryant (Eds.), Causes of development: Interdisciplinary perspectives (pp. 85–110). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Johnson-Laird, P. N., & Byrne, R. M. J. (1992). Modal reasoning, models, and Manktelow and Over. Cognition, 43, 173–182.
Johnson-Laird, P. N., & Byrne, R. M. J. (2002). Conditionals: A theory of meaning, pragmatics, and inference. Psychological Review, 109, 646–678.
Klaczynski, P. A., & Narasimham, G. (1998). Representations as mediators of adolescent deductive reasoning. Developmental Psychology, 34, 865–881.
Klaczynski, P. A., Schuneman, M. J., & Daniel, D. B. (2004). Theories of conditional reasoning: A developmental examination of competing hypotheses. Developmental Psychology, 40, 559–571.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge to western thought. New York: Basic Books.


Liberman, N., & Klar, Y. (1996). Hypothesis testing in Wason's selection task: Social exchange, cheating detection, or task understanding. Cognition, 58, 127–156.
Light, P., Blaye, A., Gilly, M., & Girotto, V. (1989). Pragmatic schemas and logical reasoning in 6- to 8-year-old children. Cognitive Development, 4, 49–64.
Light, P., Girotto, V., & Legrenzi, P. (1990). Children's reasoning on conditional promises and permissions. Cognitive Development, 5, 369–383.
Macnamara, J. (1986). A border dispute: The place of logic in psychology. Cambridge, MA: MIT Press.
Mandler, J. M. (1983). Structural invariants in development. In L. S. Liben (Ed.), Piaget and the foundations of knowledge (pp. 97–124). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Manktelow, K. I., & Over, D. E. (1991). Social roles and utilities in reasoning with deontic conditionals. Cognition, 39, 85–105.
Markovits, H., & Barrouillet, P. (2002). The development of conditional reasoning: A mental model account. Developmental Review, 22, 5–36.
Markovits, H., & Barrouillet, P. (2004). Introduction: Why is understanding the development of reasoning important? Thinking and Reasoning, 10, 113–121.
Markovits, H., Fleury, M., Quinn, S., & Venet, M. (1998). The development of conditional reasoning and the structure of semantic memory. Child Development, 69, 742–755.
Markovits, H., & Savary, F. (1992). Pragmatic schemas and the selection task: To reason or not to reason. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 45A, 133–148.
Markovits, H., Schleifer, M., & Fortier, L. (1989). The development of elementary deductive reasoning in young children. Developmental Psychology, 25, 787–793.
Markovits, H., & Vachon, R. (1989). Reasoning with contrary to fact propositions. Journal of Experimental Child Psychology, 47, 398–412.
Markovits, H., & Vachon, R. (1990). Conditional reasoning, representation, and level of abstraction. Developmental Psychology, 26, 942–951.
Müller, U., Overton, W. F., & Reene, K. (2001). Development of conditional reasoning: A longitudinal study. Journal of Cognition and Development, 2, 27–49.
Müller, U., Sokol, B., & Overton, W. F. (1999). Developmental sequences in class reasoning and propositional reasoning. Journal of Experimental Child Psychology, 74, 69–106.
Noveck, I. A., & O'Brien, D. P. (1996). To what extent do pragmatic reasoning schemas affect performance on Wason's selection task? Quarterly Journal of Experimental Psychology, 49A, 463–489.
O'Brien, D. P. (2004). Mental-logic theory: What it proposes, and reasons to take this proposal seriously. In J. P. Leighton & R. J. Sternberg (Eds.), The nature of reasoning (pp. 205–233). Cambridge: Cambridge University Press.
O'Brien, D. P., Braine, M. D. S., & Yang, Y. (1994). Propositional reasoning by model: Simple to refute in principle and in practice. Psychological Review, 101, 711–724.
O'Brien, D. P., & Overton, W. F. (1980). Conditional reasoning following contradictory evidence: A developmental analysis. Journal of Experimental Child Psychology, 30, 44–60.
O'Brien, D. P., & Overton, W. F. (1982). Conditional reasoning and the competence–performance issue: A developmental analysis of a training task. Journal of Experimental Child Psychology, 34, 274–290.
Overton, W. F. (1985). Scientific methodologies and the competence–moderator–performance issue. In E. Neimark, R. DeLisi, & J. Newman (Eds.), Moderators of competence (pp. 15–41). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Overton, W. F. (1990). Competence and procedures: Constraints on the development of logical reasoning. In W. F. Overton (Ed.), Reasoning, necessity, and logic: Developmental perspectives (pp. 1–32). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Overton, W. F. (1991). Competence, procedures, and hardware: Conceptual and empirical considerations. In M. Chandler & M. Chapman (Eds.), Criteria for competence: Controversies in the assessment of children's abilities (pp. 19–42). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Overton, W. F. (2003). Development across the lifespan: Philosophy, concepts, theory. In R. M. Lerner, M. A. Easterbrooks, & J. Mistry (Eds.), Comprehensive handbook of psychology: Developmental psychology (Vol. 6, pp. 13–42). New York: Wiley.
Overton, W. F., Byrnes, J. P., & O'Brien, D. P. (1985). Developmental and individual differences in conditional reasoning: The role of contradiction training and cognitive style. Developmental Psychology, 21, 692–701.
Overton, W. F., Ward, S. L., Noveck, I. A., Black, J., & O'Brien, D. P. (1987). Form and content in the development of deductive reasoning. Developmental Psychology, 23, 22–30.
Piaget, J. (1986). Essay on necessity. Human Development, 29, 301–314.
Piaget, J. (1987a). Possibility and necessity. Volume 1. The role of possibility in cognitive development. Minneapolis: University of Minnesota Press.
Piaget, J. (1987b). Possibility and necessity. Volume 2. The role of necessity in cognitive development. Minneapolis: University of Minnesota Press.
Piaget, J., & Garcia, R. (1991). Toward a logic of meanings. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Platt, R. D., & Griggs, R. A. (1993). Facilitation in the abstract selection task: The effects of attentional and instructional factors. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 46A, 591–613.
Rips, L. J. (1989). The psychology of knights and knaves. Cognition, 31, 85–116.
Rips, L. J. (1994). The psychology of proof: Deductive reasoning in human thinking. Cambridge, MA: MIT Press.
Roberts, M. J. (1993). Human reasoning: Deduction rules or mental models, or both? Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 46A, 569–589.
Roberts, M. J., Welfare, H., Livermore, D. P., & Theadom, A. M. (2000). Context, visual salience, and inductive reasoning. Thinking and Reasoning, 6, 349–374.
Russell, J. (1987). Rule-following, mental models, and the developmental view. In M. Chapman & R. A. Dixon (Eds.), Meaning and the growth of understanding (pp. 23–48). New York: Springer-Verlag.
Scholnick, E. K. (1990). The three faces of If. In W. F. Overton (Ed.), Reasoning, necessity, and logic: Developmental perspectives (pp. 159–181). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Sperber, D., Cara, F., & Girotto, V. (1995). Relevance theory explains the selection task. Cognition, 57, 31–95.

256

Overton and Dick

Spiel, C., GluÈck, J., & GoÈssler, H. (2001). Stability and change of unidimensionality: The sample case of deductive reasoning. Journal of Adolescent Research, 16, 150±168. Staudenmayer, H., & Bourne, L. E., Jr (1977). Learning to interpret conditional sentences: A developmental study. Developmental Psychology, 13, 616±623. Thompson, V. A. (1995). Conditional reasoning: The necessary and suf®cient conditions. Canadian Journal of Experimental Psychology, 49, 1±60. Thompson, V. A. (2000). The task speci®c nature of domain general reasoning. Cognition, 76, 209±268. Tulving, E., & Pearlstone, Z. (1966). Availability versus accessibility of information in memory for words. Journal of Verbal Learning and Verbal Behavior, 5, 381±391. Venet, M., & Markovits, H. (2001). Understanding uncertainty with abstract conditional premises. Merrill-Palmer Quarterly, 47, 74±99. Ward, S. L., Byrnes, J. P., & Overton, W. F. (1990). Organization of knowledge and conditional reasoning. Journal of Educational Psychology, 82, 832±837. Ward, S. L., & Overton, W. F. (1990). Semantic familiarity, relevance, and the development of deductive reasoning. Developmental Psychology, 26, 488±493. Wason, P. C. (1968). Reasoning about a rule. Quarterly Journal of Experimental Psychology, 20, 273±281. Wason, P. C. (1983). Realism and rationality in the selection task. In J. St B. T. Evans (Ed.), Thinking and reasoning: Psychological approaches (pp. 44±75). London: Routledge & Kegan Paul.

11 Less specificity in higher cognitive mechanisms: Evidence from theory of mind

Keith R. Happaney and Philip David Zelazo

Throughout evolution, the emphasis has shifted from the brain invested with rigid, fixed functions (thalamus) to the brain capable of flexible adaptation (cortex). The advent of neocortex may have represented an evolutionary repudiation of strong modularity as the dominant principle of neural organization, and a shift toward a more interactive principle of neural organization dominated by emergent properties. (Goldberg, 1995, p. 203)

There is a longstanding debate over whether the greater part of cognition is best characterised as being domain specific. In keeping with Roberts' introduction to this volume, we will refer to the view that most, if not all, cognition is domain specific as extreme domain specificity, one version of which is massive modularity. The view that domain specificity and/or modularity characterises human cognition has been popular since the 1980s, deriving support from an apparent convergence among philosophical, evolutionary, and empirical considerations. Indeed, its perceived consilience with evolutionary biology has been argued to be a particular strength of this viewpoint (see Cosmides & Tooby, 1994). In his seminal book on modularity, for which domain specificity was seen as an important criterion, Fodor (1983) focused on peripheral "input systems." While many theorists could accept modular claims for these, the claim for modularity of more central, conceptual systems, such as theory of mind (ToM), was another story. In fact, Fodor (1983) argued that these latter systems were decidedly nonmodular. What are the arguments for why modularity might characterise higher level cognitive processes? ToM serves as an excellent example for discussing the plausibility of extreme domain specificity, with implications for other conceptual systems for which there is generally less support for domain specificity.


Aims of this Chapter

In this chapter, we first review several arguments for domain specificity (or against domain generality), particularly from the perspective of evolutionary psychology. We note several limitations of these, and argue that a more domain general view is equally consistent with evolutionary considerations. Evidence for the modularity of ToM is then reviewed briefly. Finally, we suggest an alternative view, according to which many seemingly specific functions may emerge in phylogeny and ontogeny by making different uses of overlapping sets of neural mechanisms. These process information in particular ways, but are not limited to information from particular domains of experience – in origin or in current practice. Rather, we believe that these mechanisms evolved precisely because they simultaneously solved multiple adaptive problems. This is likely to be especially true of higher cognitive functions, such as executive function, which roughly refers to operations involved in the control of thought and action (Zelazo, Carter, Reznick, & Frye, 1997). Indeed, there appears to be an inverse relation between extent of domain specificity and level of information processing, with higher cognitive functions being the least domain specific.

Concepts and conceptualisations

For the sake of clarity, we first provide brief definitions of key constructs.

Domain specific mechanisms and modules

Referring to a mechanism as domain specific implies that it only accepts as input information of a certain type (i.e., from its proprietary domain). Normally, this information is processed differently than information from other domains. Thus, as discussed by Fodor (2000), domain specificity concerns both the domain of information being processed and the processes applied to that information.

To be considered a module, a cognitive mechanism must meet further criteria in addition to domain specificity (Fodor, 1983): Modules must be informationally encapsulated (i.e., cognitively impenetrable), process information rapidly, have only limited access to central mechanisms, function in an obligatory fashion, have shallow – nonconscious – outputs, exhibit characteristic and specific patterns of breakdown, and show a regular pattern of ontogenetic development.

Evolutionary psychology

One version of evolutionary psychology (Cosmides & Tooby, 1994; see also Sperber, 1994) is concerned with identifying the cognitive mechanisms that solved recurrent adaptive problems – problems with implications for reproduction – faced by our ancestors in the environment of evolutionary adaptation (EEA). Few would argue with the assumption that human competencies have evolutionary origins. The issue is how such competencies came about, whether they depend on other competencies, and whether they are domain specific or even modular. Cosmides and Tooby (1994) suggest that cognitive adaptations are necessarily domain specific and possess characteristics of modularity (see below), although a strictly modular view has been waning. It should be noted, however, that Sperber (1994) differentiated between what he has termed proper domains of information, which a mechanism was designed to act on, and actual domains that include all the information that a mechanism can, in fact, process. To Sperber (1994), many modern-day problems are solved by the cooption of proper domain specific mechanisms or modules. This is an idea shared in spirit by most evolutionary psychologists, including Cosmides and Tooby (1994).

Arguments concerning domain specificity and responses

A central argument for domain specificity is that domain general mechanisms could not possibly have evolved because all adaptive problems are specific problems. Cosmides and Tooby (1994) therefore suggest that domain general mechanisms, such as general learning, could not have evolved because there is no general adaptive problem. Moreover, it is assumed that any domain general solution to an adaptive problem will always be inferior to a domain specific solution because any mechanism that needs to solve many problems will not be as good at solving any specific one (or in other words: the jack of all trades is a master of none).

There are several problems with this argument. For example, it assumes that evolution is a unidimensional, monotonic process, rather than one occurring in a highly complex environment, in which rapid change might have occurred (see Gould & Lewontin, 1979). In fact, individuals often face multiple adaptive problems together, and relatively general mechanisms that simultaneously solve a number of specific problems may have proved more adaptive than specific mechanisms that solved only a subset. In this way, adaptation may be seen as a process of multiple constraint satisfaction (e.g., McClelland & Rumelhart, 1988). Therefore, particularly in terms of higher cognitive functions, we suggest that mechanisms best suited to broad classes of adaptive problems may, in many instances, have been selected over mechanisms that solved one specific problem, but failed to address the others. Mechanisms that solve several adaptive problems simultaneously will likely be responsive to a set of inputs that are more abstractly defined. In this way, numerous specific adaptive problems considered together correspond to a "general problem"; the general problem is what several specific problems have in common.
Thus, some aspects of cognitive function (e.g., those mediating jealousy or incest avoidance) may well be modular in many respects (though see, e.g., Leavitt, 1990), but other aspects of cognitive function may be so general, in terms of both current function and evolutionary origin, that there is little point in calling them modular. This is not to say that there are no constraints at all on information processing.

Another point to consider when weighing the efficacy of domain general versus domain specific mechanisms is that human beings evolved in a largely social context. Although solving a specific problem on one's own might have required extremely high competency within the proper domain (perhaps, but not necessarily, favouring a domain specific mechanism), several individuals within a group might have been capable of solving such a problem utilising existing cognitive structures. That is, what constitutes solvability from a social perspective may be very different from what would be expected given solitary problem solving. Thus, just as it is important to consider the entire adaptive environment rather than isolated problems within that environment, it is important to consider the social context in which evolution occurred (Caporael, 1997). For example, evolutionarily sensitive psychologists and anthropologists alike have argued that gaining assistance from conspecifics, in carrying out such tasks as hunting large animals, influenced selection pressures in the development of the considerable social competencies of our human ancestors. Unfortunately, the bidirectional effect on the other competencies required for hunting (e.g., spatial competencies) has been largely ignored.

The notion that mechanisms may have evolved to solve several problems simultaneously is, ironically, consistent with Sperber's (1994) distinction between actual and proper domains. That is, it is agreed that some mechanisms are currently flexible enough to apply to a range of "actual" domains, as well as the "proper" domains they are presumed to have evolved to solve.
In fact, however, it may be difficult to know which domain was the proper domain and which domains are actual ones. It is possible that the mechanisms evolved precisely because they applied simultaneously to a range of proper domains. In keeping with this latter suggestion, we hypothesise that as a general rule (at least for central mechanisms), the larger the number of adaptive problems that a given mechanism solved simultaneously, the more likely this mechanism would be to have evolved. For example, although female superiority in object location memory has been attributed to females' greater role in foraging during the Pleistocene period (see Silverman & Eals, 1992) – a domain specific, adaptationist account – recent evidence shows that this superiority may be due to a more general mechanism that includes verbal memory (Choi & L'Hirondelle, 2005). Additionally, Cahill, Uncapher, Kilpatrick, Alkire, and Turner (2004) demonstrated a crossover interaction between sex and hemispheric involvement in emotional memory, with left versus right amygdala involvement in females and males, respectively. This pattern is consistent with left hemispheric involvement in fine-grained processing and nonglobal aspects of linguistic processing, for which females show superiority, and right hemispheric involvement in nonverbal, spatial cognition, for which males demonstrate superiority (see e.g., Deacon, 1997; Elman, Bates, Johnson, Karmiloff-Smith, Parisi, & Plunkett, 1996; Kimura, 1999; Tucker, 1981). Thus, while we agree with evolutionary psychologists that many mechanisms, including those revealing sex differences, have evolutionary origins, it is difficult to discern the proper domain of their adaptation. Additionally, rather than providing separate accounts for narrowly defined competencies (e.g., aspects of linguistic, spatial, and emotional processing), it is important to consider more general processes that may underlie several competencies. Furthermore, in keeping with Sperber's (1994) correct statement that new adaptations make use of existing structures (see also Duchaine, Cosmides, & Tooby, 2001, for plausible overlapping supportive neural mechanisms), mechanisms effective in solving multiple problems would have been the most biologically economical.

In relation to the claim that domain general mechanisms, such as general learning, could not have evolved (Cosmides & Tooby, 1994, p. 89), Chiappe and MacDonald (2005) have argued that whereas domain specific mechanisms may have responded to recurrent aspects of the EEA, domain general mechanisms would have been more responsive to rapidly changing aspects of human environments. Indeed, as suggested by Vygotsky (1978), a quintessential aspect of humans is their ability to change the environments in which they live, through the use of tools. Moreover, as suggested by Chiappe and MacDonald (2005), analogical reasoning allows for the application of existing knowledge to novel circumstances, representing a more active process than the triggering of proper domain specific mechanisms by content from the actual domain (see also Halford & Andrews, Chapter 9, this volume).
Additionally, there is considerable evidence for the predictive utility and heritability of intelligence, a domain general construct (see Chiappe & MacDonald, 2005; Miller, 2000; Brody, Chapter 18; Gottfredson, Chapter 17, this volume).

Finally, even if a domain specific mechanism were better than a domain general one, all things considered (i.e., the entire set of adaptive problems taken into account simultaneously and modulated by social context, as well as adaptation to novel problems in addition to recurrent ones), this does not mean that the domain specific mechanism would be more likely to have evolved. All that matters, from an evolutionary perspective, is that a mechanism be "good enough." To argue otherwise would be to take a teleological view of the evolutionary process, essentially positing a sighted watchmaker (cf. Dawkins, 1986) or argument from design (see Darwin, 1872, for a response to this argument).

The need for domain general mechanisms

One reason why domain general processes are required is to coordinate the output of possible modular processes and to regulate their expression. For example, it has been suggested that basic emotions have their evolutionary origins in adaptive behaviours (e.g., the teeth-displaying grimace that accompanies anger; Darwin, 1872), with universally shared facial expressions and neural underpinnings (Ekman et al., 1987). However, there are display rules that modulate or mask emotional expressions, and these vary from culture to culture (Ekman et al., 1987). Additionally, what happens when the desire for sexual contact comes into conflict with immediate pragmatic concerns? For example, suppose that the target of one's desire is the boss's wife or, in EEA terms, what if the female is the mate of a stronger and more influential male within one's clan?

In relation to the adaptive regulation of behaviour, Cosmides and Tooby (1994) have argued that what counts as error in one context may not count as error in another, and the relative weighting of errors may change rapidly. Thus, they argue for separate mechanisms that each detect errors within different domains (e.g., the desire for sex with an attractive member of the opposite sex and the need to avoid incest). However, some sort of mechanism would seem to be required to assess such changing contexts, consider behaviour in the light of these, and weigh alternative options. It is important to note that such considerations need not be of a conscious, deliberative nature, especially within settings of considerable motivational and emotional (affective) relevance; see below.

Indeed, the prefrontal cortex appears to fulfill just such a function in a very domain general fashion. Via its interactions with numerous other brain regions, the prefrontal cortex receives information about the current context and about one's needs or wants. It then modulates behaviour in the light of this context – retrieving or formulating appropriate rules for governing this (Bunge & Zelazo, 2006).
Information processed via prefrontal mechanisms often comes from a wide variety of domains – a feature that may be tied to the fact that this information is often consciously accessible. For example, working memory, which depends on the dorsolateral prefrontal cortex (Petrides & Milner, 1982), allows for the short-term maintenance and manipulation of information of many different types. Even in cases where evidence indicates that different parts of the prefrontal cortex may be recruited for memory for different types of information – such as the contingency value of an object as opposed to the object's identity (see Schoenbaum & Setlow, 2001) – the distinction among types does not correspond to the different content domains (vision, language, number, kin, etc.) posited by adherents to extreme domain specificity.

Taking an extreme domain specific perspective may also lead to an underappreciation of more general principles of neural organisation (e.g., dependencies between the location of brain structures and their functions). The brain appears to be organised as a complex, hierarchical interactive system – not solely as a set of separate modules. Although there is specificity, there is also integration. The prefrontal cortex is one broad region that serves to integrate information that is processed in a more specific fashion elsewhere in the brain. The extent to which this region accounts for higher cognitive processes remains to be determined, but evidence suggests that it has a wide range of applicability in this respect (e.g., Stuss & Knight, 2002).

Domain specificity and theory of mind: Empirical evidence and responses

Evidence for domain specificity of theory of mind: The case of autism

Much of the evidence for domain specificity in theory of mind has come from research on people with autism of varying severity. Autism is a neurodevelopmental disorder in which those afflicted display a lack of social engagement and communication. More specifically, individuals with autism possess impairments with regard to social interaction, language and communication, and pretend play, and show highly stereotyped behaviour and an insistence on sameness (Wing & Gould, 1979). Based on the social deficits of autism, along with the assumed social function of theory of mind, Baron-Cohen and colleagues (e.g., Baron-Cohen, Leslie, & Frith, 1985) posited that autism may represent a selective deficit in ToM, an idea referred to as the theory of mind hypothesis of autism. According to this, a faulty ToM mechanism leads to the symptoms seen in this disorder. In this way, autism has been likened to disorders caused by malfunctioning bodily organs or organ systems. Put another way, autism is argued to result from problems within the ToM organ.

Just how the purported ToM deficits of autism lead to its symptoms has also been discussed. For example, an ineffective ToM mechanism should disallow or considerably attenuate social interaction, since one would have neither the motivation nor understanding to interact effectively under such situations. Further, as suggested by Leslie (1987), since pretend play may require one to understand mental representation in others (i.e., how they are currently representing a given object, as opposed to the actual identity and function of that object), engaging in joint pretense has been argued to be a relatively early manifestation of ToM (but see Harris, 1994; Kavanaugh & Harris, 1994, for an alternative explanation and findings).
Additionally, in keeping with Gricean principles (Grice, 1957), since verbal communication, particularly its pragmatic aspects, requires understanding the communicative needs and intentions of interlocutors, it too should require a functional ToM mechanism.

Evidence for the ToM hypothesis of autism would, thus, involve showing that the ToM mechanism acts independently of others (of either a modular or a nonmodular nature). As the logic goes, the extent to which those with autism show a specific impairment in ToM, but perform normally on tasks from other domains, would serve as evidence that ToM is its own distinct mechanism, the operation of which is not dependent on other cognitive competencies.

The evidence

In keeping with the ToM hypothesis of autism, Baron-Cohen et al. (1985) found that, unlike typically developing 3-year-old children and children with Down's syndrome, relatively high-functioning individuals with autism failed to attribute a false belief to a protagonist. Similarly, Baron-Cohen, Leslie, and Frith (1986) found that these individuals performed as well as, or better than, individuals with Down's syndrome and normal preschoolers on causal mechanical and behavioural picture sequencing tasks, but worse on sequences in which they were required to ascribe intentional states to a protagonist. Scott and Baron-Cohen (1996) reported that children with autism performed comparably to typically developing children, and children with mental retardation, on tests of transitive inference and analogical reasoning, but performed significantly worse on a false-belief task. Scott, Baron-Cohen, and Leslie (1999) extended these findings to include a test of counterfactual reasoning, with and without prompts to pretend, and found that in the absence of pretense prompts, individuals with autism performed well relative to control participants. Similarly, Leekam and Perner (1991) found that children with autism, but not typically developing children, performed worse on a false-belief task than on a false-photograph task, which required reasoning about outdated physical representations rather than mental representations. This basic pattern was also found by Leslie and Thaiss (1992) and by Charman and Baron-Cohen (1992, 1995).

At first glance, these findings appear to provide support for the domain specificity of ToM: people with autism fail ToM tests but show an intact ability to reason about other things. However, this interpretation can be questioned for several reasons.
It is now clear that individuals with autism have difficulty with ToM, but the specificity of this difficulty is based on weak evidence. For example, Baron-Cohen et al. (1985) only assessed ToM, so the group differences could be attributed to more general cognitive deficits that lead to difficulty on ToM reasoning. In other words, this finding on its own does not provide evidence of specificity. Moreover, Baron-Cohen et al.'s (1986) claim for the specificity of ToM deficits was not replicated by two other groups of researchers. Oswald and Ollendick (1989) found no differences in the picture sequencing task when they compared a relatively low-functioning group of individuals with autism to an age- and nonverbal-IQ-matched group of mentally retarded adolescents. Similarly, Ozonoff, Pennington, and Rogers (1991a) found no group differences on the intentional subtest (i.e., the ToM items) in a study with a relatively large group of high-functioning individuals with autism, who were compared to a well-matched clinical control group. However, the comparison group performed better on each of the mechanical subtests (i.e., the physical causality items) and on one of the two behavioural subtests.

Consider next the "syllogisms" used by Scott and Baron-Cohen (1996), which were treated as control tests to demonstrate the specificity of difficulties with ToM. In one test, participants were told, for example, "All bananas are pink; John is eating a banana" and then asked, "Is the banana pink?" To answer the test question, children simply had to consult the first premise; the second premise could be ignored. This is hardly a fair comparison for the reasoning involved in ToM tasks.

There are similar difficulties with the study by Scott et al. (1999) that make its results just as difficult to interpret. It is unclear, for example, whether the "syllogisms" actually required children to integrate the information provided in the premises. Thus, the test hardly assesses the ability to make logical inferences. Even more problematic is the fact that the correct response on every single trial was "yes," so successful performance could be attributed to a "yes" bias, rather than to intact counterfactual reasoning. Indeed, Leevers and Harris (2000) found that autistic individuals performed very poorly on counterfactual syllogisms when "no" responses were required.

Thus, at present, the empirical support for the domain specificity of ToM is weak. In all the cases reviewed, there are inadequate demonstrations of differential deficits in performance between ToM versus control tasks because these tasks also vary in relative difficulty or complexity. Moreover, there is a growing body of evidence, from both typically developing and autistic individuals, that ToM depends on a domain general set of cognitive mechanisms (i.e., aspects of executive function; see below).

Savant skills

Another argument for domain specificity comes from individuals with autism who display unusual talents.
These have been taken as further evidence for the independence of particular psychological functions from general cognitive ability, or, from a more modular perspective, as the sparing of specific modules despite widespread neurological damage (see also Roberts, Chapter 14, this volume). However, as Spitz (1995) has pointed out, in many cases, the unusual skills of savants might be attributed to the more general phenomenon of expertise, perhaps with consequent automatisation and emergent modularisation (see Goldberg, 1995; Karmiloff-Smith, 1992). As an example, the majority of calendar calculators (calendar calculation being a skill displayed by some autistic savants) spend an enormous amount of time looking over and creating perpetual calendars. This type of behaviour would provide the necessary practice to detect regularities that can subsequently be utilised. Indeed, as discussed by Spitz (1995), knowing about regularities between the years from 1600 to 2000 means that dates in years outside this period can be calculated by employing a few heuristics. Further, these regularities are probably applied in an implicit, nonconscious manner, as suggested by the inability of such individuals to report such strategies (cf. Segar, 1994). Thus, rather than suggesting the preservation of a specific modular function (e.g., a calendar-calculating module), these savant skills can perhaps best be explained as the acquisition of expertise in an area that is unlikely to sustain the interest of most people.

Where some savant skills do not adhere to an account based on overpractice/expertise and subsequent emergent modularisation, they may result from deficits in adaptive cognitive functions normally present in nonautistic individuals. For example, consider the excellent rote memorial abilities in autism. Hypermnestic autistic savants studied by Mottron, Belleville, Stip, and Morasse (1998), in comparison with nonautistic IQ-matched controls, showed no retroactive interference and only mild proactive interference in list recall. Thus, as suggested by Mottron et al., the outstanding recall seen in autism might result from the absence of interference, which would normally impair performance. While a lack of interference might allow outstanding rote recall, such excellent memory is often maladaptive, as described by Luria (1987) in his observations of the profound everyday difficulties of a hypermnestic individual. That is, from an adaptive standpoint, memory in humans should represent a balance between memory for relevant items and the forgetting of information not likely to be of use (see Potter, 1990).

Non-social deficits: The role of hot executive function

In addition to its more social deficits (i.e., social interaction, communication, and pretense), autism also involves a number of other seemingly non-ToM-relevant symptoms.
These include, as discussed by Frith and Happé (1994), islets of abilities (more commonly referred to as splinter skills), particularly spatial skills, self-stimulation, restricted repertoire of interests, a preoccupation with parts of objects, excellent rote memory, and overly selective attention and responding.

One could quite reasonably argue that some of the non-social deficits of autism (i.e., those not entailing social interaction, communication, and pretense) may be consequences of these more primary social deficits. Self-stimulatory behaviours may arise from a lack of social and physical stimulation (often provided by interaction with others), with behavioural or cognitive stereotypy being a potential consequence of the anxiety elicited by interaction with a socially unpredictable environment. For example, as reported by Koegel (1995), clinicians often report an attenuation of self-stimulation and other stereotyped behaviours when autistic children gain communication and social skills.

A reasonable alternative, however, is that individuals with autism have a primary deficit of a more domain general nature. For example, there is direct evidence that individuals with autism show deficits in prefrontally mediated executive function (e.g., Colvert, Custance, & Swettenham, 2002; Ozonoff et al., 1991a; Zelazo, Jacques, Burack, & Frye, 2002). Importantly, in one study, executive function deficits were more common among those with autism and Asperger's syndrome than were ToM deficits (Ozonoff et al., 1991b). In this study, high functioning autistic subjects performed poorly on the Wisconsin card sorting task (WCST), a test of cognitive flexibility, and on the Tower of Hanoi task (ToH), which measures the ability to plan actions internally prior to carrying them out, as well as to coordinate recursive subgoals. Interestingly, in keeping with the autistic tendency toward overfocusing of attention, those with autism have been found to perform at normal and even superior levels at maintaining set on the WCST (see Bryson, Landry, & Wainwright, 1997). Thus, although autistic individuals appear to show difficulty in strategically shifting attentional focus or strategy, they are normal to superior at sustaining attentional focus, potentially as a result of their narrow spotlight of attention (Townsend & Courchesne, 1994).

From this perspective, deficits in ToM, and the difficulties exhibited by individuals with autism, might reasonably be characterised in more domain general terms, as due to failures in aspects of prefrontally mediated executive function. In particular, individuals with autism appear to have difficulty in integrating and regulating the outputs of more domain specific processes. They have difficulties in representing context and using it to constrain responding. Consider, for example, the ability to detect faux pas (socially inappropriate utterances and/or gestures), which is heavily dependent on one's sensitivity to context. To detect faux pas one must know when a statement or behaviour will be appreciated and when it will be offensive.
Interestingly, the orbitofrontal cortex (OFC, a region of the ventromedial prefrontal cortex, VM-PFC) has been tied to the ability to detect faux pas (Stone, Baron-Cohen, & Knight, 1998) as well as the ability to consider context when making affectively based decisions (as in the Iowa gambling task; Bechara, Damasio, Damasio, & Anderson, 1994), and to update the incentive value of stimuli based on feedback, as in response reversal tasks (see Rolls, Hornak, Wade, & McGrath, 1994). Zelazo and colleagues (e.g., Kerr & Zelazo, 2004) have termed these ``hot'' executive functions since they concern the control of thought and action within affectively signi®cant settings. Of relevance to faux pas, the right hemisphere has been tied to the detection of anomaly (Ramachandran & Blakeslee, 1998; Smith, Tays, Dixon, & Bulman-Fleming, 2002), which we have suggested is important for many aspects of VM-PFC/OFC function (Happaney, Zelazo, & Stuss, 2004). For instance, performance on the Iowa gambling task, as well as everyday decision-making, response reversal, and impulse control, all fall within the normal range even after left VM-PFC damage, but are severely impaired by damage to right VM-PFC (Tranel, Bechara, & Denburg, 2003; see also Happaney et al., 2004, for further discussion). We believe that ToM depends not on a module, but rather on a more domain general appreciation of context and anomalies with respect to that context.

The importance of anomaly detection can be seen in the detection of false belief. Identifying a false belief may be conceptualised as finding a discrepancy or anomaly with respect to the current context. For example, in arguing against the comparability of false-belief and false-photograph tasks, Leslie, German, and Polizzi (2005) have argued that false, outdated beliefs are more anomalous than false, outdated photographs, since beliefs are supposed to be true, as opposed to photographs, which are almost by definition outdated (see also Moses & Sabbagh, Chapter 12, this volume). Some recent work (e.g., Saxe & Kanwisher, 2003) has suggested that reasoning about mental states (as opposed to emotional or physical states, as well as nonmental representations) may specifically depend on activation of the temporo-parietal junction (TPJ), but this region appears to play an important role in the detection of anomaly more generally. Indeed, Behrmann, Geng, and Shomstein (2004) discuss the role of the TPJ in detecting infrequent or unexpected events (essentially anomalous ones) within the modality in which input is presented, whether visual, auditory, or tactile, as well as in goal-directed situations when there is a change in the modality being monitored. Recently, Saxe and Wexler (2005) found specific right TPJ involvement in situations in which a protagonist's desires were in conflict with expectations. Although the authors interpreted this finding in terms of mental state attributions, the results are in keeping with a right hemisphere/anomaly detection account. Finally, the TPJ has also been particularly implicated in spatial neglect syndrome, which has been used as a model for understanding the attentional deficits in autism (see Bryson, Wainwright-Sharp, & Smith, 1990).

This approach is also consistent with other cognitive neuroscientific work on ToM. For example, Stone, Baron-Cohen and colleagues (Stone, Baron-Cohen, Calder, Keane, & Young, 2003; Stone et al., 1998) linked ToM competence to brain structures known to be involved in the processing of emotional and social signals. Stone et al. (1998) showed that patients with damage to the orbitofrontal cortex had difficulty in detecting faux pas, which can easily be conceptualised as representing anomalous social behaviour. More recently, Stone et al. (2003) found evidence that patients with amygdala damage also show ToM deficits, as assessed by faux pas recognition, and also by a measure that requires participants to infer people's emotions from their eyes. Although these results have been interpreted in a domain specific fashion, there is no evidence to suggest this type of mechanism. It is more likely that aspects of executive function contribute considerably to ToM competence and also underlie other emotionally and socially relevant processes (e.g., affective decision-making) that do not concern ToM. In fact, performance on the Iowa gambling task (a test of affective decision-making) has been found to relate to emotional intelligence while showing no such relations with more standard intelligence measures (Bar-On, Tranel, Denburg, & Bechara, 2003). In this respect, ToM may be seen as one aspect of affective decision-making, or hot executive function (see Kerr & Zelazo, 2004; Zelazo, Qu, & Müller, 2005).

Conclusion

The meaning of the term "domain" depends on the context in which it is used. In recent work influenced by evolutionary psychology, the term has come to refer to information falling within the relevant environment in which an adaptive problem was solved. From this perspective, domain specific mechanisms are those that process specific types of information, with these processes usually assumed to be the product of natural selection. While the search for domain specific mechanisms is important in its own right, room must be made for the potential dependence and interaction of these mechanisms with other specific and/or general ones. Extreme domain specificity is thus likely to face the same fate as extreme equipotential perspectives. For example, just as exceptions to general learning principles have been reliably demonstrated (e.g., Garcia & Koelling, 1966), it is likely that a good number of arguably specific mechanisms, such as ToM, will prove to be less specific than has been argued by adherents to a domain specific viewpoint.

One aspect of cognition that is particularly difficult to reconcile with a domain specific approach is executive function. Unlike some cognitive functions, executive function is not limited to control within circumscribed domains, even though non-domain specific differentiations between these functions appear to be empirically justified and heuristically useful (see Happaney et al., 2004). Using standard criteria for domain specificity, and particularly modularity, requires mechanisms that, given some of the assumptions of domain specific mechanisms (e.g., informational encapsulation, automaticity, mandatory operations) are unlikely in the case of executive function. Not surprisingly then, executive function is perhaps the greatest thorn in the side of extreme domain specificity. The assumption that domain general mechanisms could not possibly have evolved (Cosmides & Tooby, 1994) appears flawed. Executive function is largely domain general in nature, and this casts doubt on this key assumption made by proponents of extreme domain specificity. Mechanisms that solve several adaptive problems simultaneously may be especially likely to evolve – indeed, in many instances, more likely than mechanisms that solve fewer adaptive problems.

Aspects of executive function appear to underlie ToM, and defects in these appear to be a key aspect of the impairments seen in autism. Research on ToM in autism is widely regarded as among the strongest support for extreme domain specific views of higher cognitive processes. However, on closer inspection, evidence that autism reflects a domain specific deficit in a ToM mechanism appears weak. Rather, we believe that the systems involved in ToM show a high degree of domain generality. While we agree that such systems may serve social and emotional (hot) functions, this hardly corresponds to the kind of domain specificity envisioned by evolutionary psychology.

References

Bar-On, R., Tranel, D., Denburg, N. L., & Bechara, A. (2003). Exploring the neurological substrate of emotional and social intelligence. Brain, 126, 1790–1800.
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autistic child have a theory of mind? Cognition, 21, 37–46.
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1986). Mechanical, behavioural, and intentional understanding of picture stories in autistic children. British Journal of Developmental Psychology, 4, 113–125.
Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50, 7–15.
Behrmann, M., Geng, J. J., & Shomstein, S. (2004). Parietal cortex and attention. Current Opinion in Neurobiology, 14, 212–217.
Bryson, S. E., Landry, R., & Wainwright, J. A. (1997). A componential view of executive dysfunction in autism: Review of recent evidence. In J. A. Burack & J. T. Enns (Eds.), Attention, development, and psychopathology (pp. 232–259). New York: Guilford.
Bryson, S. E., Wainwright-Sharp, J. A., & Smith, I. M. (1990). Autism: A developmental spatial neglect syndrome? In J. Enns (Ed.), The development of attention: Research and theory (pp. 405–427). Oxford: North-Holland.
Bunge, S., & Zelazo, P. D. (2006). A brain-based account of the development of rule use in childhood. Current Directions in Psychological Science, 15, 118–121.
Cahill, L., Uncapher, M., Kilpatrick, L., Alkire, M. T., & Turner, J. (2004). Sex-related hemispheric lateralization of amygdala function in emotionally influenced memory: An fMRI investigation. Learning & Memory, 11, 261–266.
Caporael, L. R. (1997). The evolution of truly social cognition: The core configurations model. Personality & Social Psychology Review, 1, 276–298.
Charman, T., & Baron-Cohen, S. (1992). Understanding drawings and beliefs: A further test of the metarepresentation theory of autism: A research note. Journal of Child Psychology and Psychiatry, 33, 1105–1112.
Charman, T., & Baron-Cohen, S. (1995). Understanding photos, models, and beliefs: A test of the modularity thesis of theory of mind. Cognitive Development, 10, 287–298.
Chiappe, D., & MacDonald, K. (2005). The evolution of domain general mechanisms in intelligence and learning. Journal of General Psychology, 132, 5–40.
Choi, J., & L'Hirondelle, N. (2005). Object location memory: A direct test of the verbal memory hypothesis. Learning & Individual Differences, 15, 237–245.
Colvert, E., Custance, D., & Swettenham, J. (2002). Rule-based reasoning and theory of mind in autism: A commentary on the work of Zelazo, Jacques, Burack and Frye. Infant & Child Development, 11, 197–200.
Cosmides, L., & Tooby, J. (1994). Origins of domain specificity: The evolution of functional organization. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 85–116). New York: Cambridge University Press.
Darwin, C. (1872). Expression of the emotions in man and animals. London: John Murray.
Dawkins, R. (1986). The blind watchmaker: Why the evidence of evolution reveals a universe without design. New York: Norton.
Deacon, T. W. (1997). The symbolic species: The co-evolution of language and the brain. New York: Norton.
Duchaine, B., Cosmides, L., & Tooby, J. (2001). Evolutionary psychology and the brain. Current Opinion in Neurobiology, 11, 225–230.
Ekman, P., Friesen, W. V., O'Sullivan, M., Chan, A., Tarlatzis, I. D., Heider, K., et al. (1987). Universals and cultural differences in the judgments of facial expressions of emotion. Journal of Personality and Social Psychology, 53, 712–717.
Elman, J., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor, J. A. (2000). The mind doesn't work that way: The scope and limits of computational psychology. Cambridge, MA: MIT Press.
Frith, U., & Happé, F. (1994). Autism: Beyond theory of mind. Cognition, 50, 115–132.
Garcia, J., & Koelling, R. (1966). Relation of cue to consequence in avoidance learning. Psychonomic Science, 4, 123–124.
Goldberg, E. (1995). Rise and fall of modular orthodoxy. Journal of Clinical and Experimental Neuropsychology, 17, 193–208.
Gould, S. J., & Lewontin, R. C. (1979). The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist program. Proceedings of the Royal Society of London, 205, 281–288.
Grice, H. P. (1957). Meaning. Philosophical Review, 66, 377–388.
Happaney, K., Zelazo, P. D., & Stuss, D. T. (2004). Development of orbitofrontal function: Current themes and future directions. Brain and Cognition, 55, 1–10.
Harris, P. L. (1994). Understanding pretence. In P. Mitchell & C. Lewis (Eds.), Children's early understanding of mind: Origins and development (pp. 235–269). Hove, UK: Lawrence Erlbaum Associates Ltd.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge, MA: MIT Press.
Kavanaugh, R. D., & Harris, P. L. (1994). Imagining the outcome of pretend transformations: Assessing the competence of normal children and children with autism. Developmental Psychology, 30, 847–854.
Kerr, A., & Zelazo, P. D. (2004). Development of "hot" executive function: The children's gambling task. Brain and Cognition, 55, 148–157.
Kimura, D. (1999). Sex & cognition. Cambridge, MA: MIT Press.
Koegel, L. K. (1995). Communication and language intervention. In R. L. Koegel & L. K. Koegel (Eds.), Teaching children with autism: Strategies for initiating positive interactions and improving learning opportunities (pp. 17–32). Baltimore, MD: Brookes.
Leavitt, G. C. (1990). Sociobiological explanations of incest avoidance: A critical review of evidential claims. American Anthropologist, 92, 971–993.
Leekam, S. R., & Perner, J. (1991). Does the autistic child have a metarepresentational deficit? Cognition, 40, 203–218.
Leevers, H. J., & Harris, P. L. (2000). Counterfactual syllogistic reasoning in normal four-year-olds, children with learning disabilities, and children with autism. Journal of Experimental Child Psychology, 76, 64–87.
Leslie, A. M. (1987). Pretense and metarepresentation. Psychological Review, 94, 412–426.
Leslie, A. M., German, T. P., & Polizzi, P. (2005). Belief–desire reasoning as a process of selection. Cognitive Psychology, 50, 45–85.
Leslie, A. M., & Thaiss, L. (1992). Domain specificity in conceptual development: Neuropsychological evidence from autism. Cognition, 43, 225–251.
Luria, A. R. (1987). The mind of a mnemonist: A little book about a vast memory. Cambridge, MA: Harvard University Press.
McClelland, J. L., & Rumelhart, D. E. (1988). Explorations in parallel distributed processing: A handbook of models, programs, and exercises. Cambridge, MA: MIT Press.
Miller, G. (2000). How to keep our metatheories adaptive: Beyond Cosmides, Tooby, and Lakatos. Psychological Inquiry, 11, 42–46.
Mottron, L., Belleville, S., Stip, E., & Morasse, K. (1998). Atypical memory performance in an autistic savant. Memory, 6, 593–607.
Oswald, D. P., & Ollendick, T. H. (1989). Role taking and social competence in autism and mental retardation. Journal of Autism and Developmental Disorders, 19, 119–127.
Ozonoff, S., Pennington, B. F., & Rogers, S. J. (1991a). Executive function deficits in high functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology and Psychiatry, 32, 1081–1105.
Ozonoff, S., Rogers, S. J., & Pennington, B. F. (1991b). Asperger's syndrome: Evidence of an empirical distinction from high-functioning autism. Journal of Child Psychology & Psychiatry, 32, 1107–1122.
Petrides, M., & Milner, B. (1982). Deficits on subject-ordered tasks after frontal- and temporal-lobe lesions in man. Neuropsychologia, 20, 249–262.
Potter, M. (1990). Remembering. In D. N. Osherson & E. E. Smith (Eds.), Thinking: An invitation to cognitive science, Vol. 3 (pp. 3–32). Cambridge, MA: MIT Press.
Ramachandran, V. S., & Blakeslee, S. (1998). Phantoms in the brain: Probing the mysteries of the human mind. New York: Morrow.
Rolls, E. T., Hornak, J., Wade, D., & McGrath, J. (1994). Emotion-related learning in patients with social and emotional changes associated with frontal lobe damage. Journal of Neurology, Neurosurgery & Psychiatry, 57, 1518–1524.
Saxe, R., & Kanwisher, N. (2003). People thinking about thinking people: The role of the temporo-parietal junction in "theory of mind." NeuroImage, 19, 1835–1842.
Saxe, R., & Wexler, A. (2005). Making sense of another mind: The role of the right temporo-parietal junction. Neuropsychologia, 43, 1391–1399.
Schoenbaum, G., & Setlow, B. (2001). Integrating orbitofrontal cortex into prefrontal theory: Common processing themes across species and subdivisions. Learning & Memory, 8, 134–147.
Scott, F. J., & Baron-Cohen, S. (1996). Logical, analogical, and psychological reasoning in autism: A test of the Cosmides theory. Development and Psychopathology, 8, 235–245.
Scott, F. J., Baron-Cohen, S., & Leslie, A. M. (1999). "If pigs could fly": A test of counterfactual reasoning and pretence in children with autism. British Journal of Developmental Psychology, 17, 349–362.
Segar, C. A. (1994). Implicit learning. Psychological Bulletin, 115, 163–196.
Silverman, I., & Eals, M. (1992). Sex differences in spatial abilities: Evolutionary theory and data. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 533–549). London: Oxford University Press.
Smith, S. D., Tays, W. J., Dixon, M. J., & Bulman-Fleming, M. B. (2002). The right hemisphere as an anomaly detector: Evidence from visual perception. Brain & Cognition, 48, 574–579.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 85–116). New York: Cambridge University Press.
Spitz, H. H. (1995). Calendar calculating idiot savants. New Ideas in Psychology, 13, 167–182.
Stone, V. E., Baron-Cohen, S., Calder, A., Keane, J., & Young, A. (2003). Acquired theory of mind impairments in individuals with bilateral amygdala damage. Neuropsychologia, 41, 209–220.
Stone, V. E., Baron-Cohen, S., & Knight, R. T. (1998). Frontal lobe contributions to theory of mind. Journal of Cognitive Neuroscience, 10, 640–656.
Stuss, D. T., & Knight, R. T. (2002). Principles of frontal lobe function. London: Oxford University Press.
Townsend, J., & Courchesne, E. (1994). Parietal damage and narrow "spotlight" spatial attention. Journal of Cognitive Neuroscience, 6, 220–232.
Tranel, D., Bechara, A., & Denburg, N. L. (2003). Asymmetric functional roles of right and left ventromedial prefrontal cortices in social conduct, decision-making, and emotional processing. Cortex, 38, 589–612.
Tucker, D. M. (1981). Lateral brain function, emotion, and conceptualization. Psychological Bulletin, 89, 19–46.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Wing, L., & Gould, J. (1979). Severe impairments of social interaction and associated abnormalities in children: Epidemiology and classification. Journal of Autism & Developmental Disorders, 9, 11–29.
Zelazo, P. D., Carter, A., Reznick, S. J., & Frye, D. (1997). Early development of executive function: A problem-solving framework. Review of General Psychology, 1, 198–226.
Zelazo, P. D., Jacques, S., Burack, J. A., & Frye, D. (2002). The relation between theory of mind and rule use: Evidence from persons with autism-spectrum disorders. Infant and Child Development, 11, 171–195.
Zelazo, P. D., Qu, L., & Müller, U. (2005). Hot and cool aspects of executive function: Relations in early development. In W. Schneider, R. Schumann-Hengsteler, & B. Sodian (Eds.), Young children's cognitive development: Interrelationships among executive functioning, working memory, verbal ability, and theory of mind (pp. 71–93). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

12 Interactions between domain general and domain specific processes in the development of children's theories of mind

Louis J. Moses and Mark A. Sabbagh

Much of our everyday cognition is social. We allocate substantial intellectual resources to predicting and interpreting human action and interaction in terms of underlying mental states. The acquisition of a "theory of mind" (ToM) is thus a cornerstone developmental achievement. Moreover, ToM deficits have been increasingly linked to clinical disorders such as autism (Baron-Cohen, Tager-Flusberg, & Cohen, 2000) and schizophrenia (Langdon, 2005). Hence, it is vital to determine when such a theory is acquired, what governs its emergence and expression, how it changes with development, and what disrupts its functioning.

The timeline of ToM development is marked by a series of increasingly sophisticated achievements. By 2 years of age, children have a fledgling appreciation of the subjectivity of human cognition. They recognise that others may have desires, intentions, emotions, and attentional states that differ from their own (Wellman, 2002). However, it is not until the end of the preschool years that a so-called representational ToM is thought to emerge (Perner, 1991). For example, 3-year-olds typically perform very poorly on measures assessing their understanding that beliefs can be false (Wimmer & Perner, 1983); that appearances may not reflect reality (Flavell, Flavell, & Green, 1983); and that different individuals may perceive the same scene in different ways (Flavell, Everett, Croft, & Flavell, 1981). These younger preschoolers appear to think that beliefs and appearances never misrepresent reality, and that there can only be a single perspective on any state of affairs. Yet, by the time they are five, children have an appreciation of mental life that is beginning to resemble that of adults, although further significant advances occur all the way through adolescence (Chandler & Sokol, 1999; Pillow, 1999).

But what exactly is a theory of mind and how is it acquired? Several competing proposals have been advanced. The first is the "theory-theory" (Gopnik & Wellman, 1994; Perner, 1991) according to which children progressively replace inadequate theories with ever more satisfactory ones in a manner somewhat akin to scientific theory change. In this constructivist view, children form mental state concepts on the basis of their experiences and observations in the world around them (including whatever they experience of their own inner mental life). The theories of mind at which children arrive could, in part, be constrained both by the nature of our cognitive architecture and by the cultural milieu in which children develop. A second approach is simulation theory (Goldman, 2001; Harris, 1991). Proponents of this suggest that mental states are not theoretical constructions. Rather, individuals have more or less direct access to their own mental states and, by a process of imaginative simulation, they are able to determine the mental states of others. Modularity theory is a third approach (Baron-Cohen, 1995; Leslie, 1994). Here, specific neural systems are dedicated to processing social information relevant to mental states. When these systems or modules mature, children have the capacity to make appropriate mental state attributions. A fourth proposal posits that the development of mental state concepts is constrained by the development of linguistic capacity (de Villiers & de Villiers, 1999). Specifically, children are not able to master a concept such as belief until they have mastered the syntactic frames in which talk about beliefs is embedded. Finally, information-processing approaches emphasise the roles of factors such as executive function or processing capacity in developing mental state understanding (Andrews, Halford, Bunch, Bowden, & Jones, 2003; Carlson & Moses, 2001; Frye, Zelazo, & Palfai, 1995; Happaney & Zelazo, Chapter 11, this volume; Moses & Carlson, 2004; Russell, 1996). Such factors may be implicated in either the emergence of ToM concepts or their expression (Moses, 2001).

These various approaches may be characterised as emphasising either domain general or domain specific cognition. Information-processing approaches are domain general. They posit that cognitive capacities, which might be applied to many different kinds of content, constrain the development of children's theories of mind.
The same is true of simulation theory: A general capacity – the ability to imagine alternative states of affairs – is said to be central to ToM reasoning (Harris, 2000). At the other end of the spectrum are domain specific approaches. The clearest example is modularity theory: On this account, advances in ToM depend on the maturation of neural architecture specifically dedicated to processing information about mental states. The syntactic approach might also be characterised as domain specific: one domain specific system (syntactic processing) constrains developments in another (ToM). Finally, the theory-theory is typically thought of as domain specific. Proponents of this view argue that children acquire theories in many specific domains (psychology, physics, biology, and so on), that these theories emerge on quite different timetables, and that they contain their own idiosyncratic explanatory frameworks (Wellman & Gelman, 1998).

However, a dichotomous domain general versus domain specific characterisation is much too simple. Most of the approaches just described are in one way or another hybrids. For example, although modularity theory emphasises dedicated computational architecture for processing mental state content, the theory may also include a domain general component. In that regard, Leslie and his colleagues have argued that the ToM module outputs plausible mental state content, which is then subject to further information processing. Specifically, salient but incorrect options must be inhibited and correct options selected. Thus, although the ToM module might mature early, successful performance on theory of mind tasks also depends on advances in information-processing capacities (Leslie, Friedman, & German, 2004). Similarly, although the theory-theory strongly emphasises the emergence of domain specific knowledge structures, the processes through which these structures are said to develop are in fact domain general. That is, like Piagetian theory, the theory-theory is thoroughly constructivist. Changes in children's theories are generated by cognitive conflict between children's current ways of thinking and counterevidence that emerges as children observe or interact with the environment. The same mechanisms would be at work irrespective of the content of the theory in question (see also Halford & Andrews, Chapter 9; Sloman, Lombrozo, & Malt, Chapter 5, this volume). Finally, information-processing approaches, by themselves, cannot fully explain advances in ToM. They might reveal what factors constrain or facilitate the emergence or expression of concepts, but they must be embellished by other factors in order to explain how conceptual insights are achieved. These factors could well be domain specific.

In what follows, we explore in some detail the contribution of one domain general processing factor – executive functioning (EF) – to ToM development. We will argue that EF is critical for ToM development, that it works in combination with domain specific experiential and conceptual factors, and that it interacts with these in producing developmental outcomes.

Executive function and theory of mind

Executive function is a broad construct that encompasses skills and processes such as inhibitory control, planning, set shifting, error detection and correction, and working memory (Welsh, Pennington, & Groisser, 1991; Zelazo, Carter, Reznick, & Frye, 1997). What these abilities have in common is that they are the kinds of skills that are necessary in novel situations, in which habitual ways of acting need to be set aside (Baddeley, 1996; Luria, 1973; Norman & Shallice, 1986).

The development of executive skills is tied to the maturation of the prefrontal cortex of the brain, and damage to this causes executive deficits (Luria, 1973). These are associated with a lack of self-control, poor judgment, failure to plan ahead, perseveration with inappropriate behaviour, difficulty in sustaining attention, and difficulty in processing multiple sources of information. Individuals with attention-deficit hyperactivity disorder (ADHD; Barkley, 1997) and autism (Russell, 1997) show significant deficits in executive skills.

The prefrontal cortex matures late, continuing to develop all the way through adolescence and into early adulthood (Diamond, 2002). As a result, executive functions follow a protracted developmental course (Diamond, 2002; Zelazo et al., 1997). They begin to emerge in infancy, develop markedly in the preschool years, and then again in middle childhood. Although many executive abilities are relatively mature by early adolescence, they continue to be refined and consolidated into early adulthood (Anderson, 2002; Zelazo & Mueller, 2002).

The possibility that EF might contribute in some way to the development of ToM has been increasingly scrutinised over the past decade (Moses, Carlson, & Sabbagh, 2005; Perner & Lang, 1999). Evidence in support of this possibility comes from two sources. First, when the executive demands of ToM tasks are increased, children's performance deteriorates (Friedman & Leslie, 2004; Leslie & Polizzi, 1998), and when these demands are reduced, performance improves (Carlson, Moses, & Hix, 1998). Second, moderate to strong correlations between EF and ToM have now been found in many studies (see Moses et al., 2005, for a review). These relationships persist when confounding factors that are associated with EF and ToM are controlled. These include age, sex, verbal ability, general intelligence, and family size (Carlson & Moses, 2001; Carlson, Moses, & Breton, 2002; Carlson, Moses, & Claxton, 2004). The relationship between EF and ToM would thus appear to be genuine.

Executive function includes a somewhat heterogeneous collection of abilities.
There is increasing evidence, however, that two of these – inhibitory control and working memory – are critical to EF–ToM relations (Carlson et al., 2002; Hala, Hug, & Henderson, 2003). In particular, the combination of these two skills appears to be important. EF tasks that require inhibitory control but little working memory (e.g., delay tasks), or that require working memory but little inhibition (e.g., span tasks), tend not to be highly related to ToM. However, tasks that demand both inhibition and working memory tend to be related strongly. Such tasks require (1) a set of rules to be held in mind and (2) a dominant response to be suppressed. An example is the dimensional change card sorting task, which consistently correlates with ToM (Frye et al., 1995; Perner & Lang, 1999). In this task, children are presented with cards that vary along two dimensions (e.g., shape and colour). They are first asked to sort some of the cards by one dimension and are then asked to sort the remainder by the second. The task requires that children hold in mind a pair of rules (e.g., red cards go in location A, blue cards in location B) and, after the relevant dimension has changed, to inhibit a prepotent response (responding in terms of the now irrelevant dimension).
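
The structure of the dimensional change card sorting task can be made concrete with a small simulation. The Python sketch below is purely illustrative and is not taken from any of the studies cited: the shape values (rabbit, boat), the names `Card`, `RULES`, `sort_card`, and `run_task`, and the decision to model a failure of inhibition as continued use of the pre-switch dimension are all assumptions made for the example. It shows why cards on which the two dimensions conflict distinguish a sorter who inhibits the old rule from one who perseverates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    shape: str   # hypothetical values: "rabbit" or "boat"
    colour: str  # "red" or "blue", as in the example rule pair


# One pair of rules per dimension. The colour pair follows the example in the
# text (red -> A, blue -> B); the shape pair is an assumption for illustration.
RULES = {
    "colour": {"red": "A", "blue": "B"},
    "shape": {"rabbit": "A", "boat": "B"},
}


def sort_card(card: Card, dimension: str) -> str:
    """Return the location a card goes to when sorted by `dimension`."""
    return RULES[dimension][getattr(card, dimension)]


def run_task(cards, pre_dim, post_dim, switch_at, inhibits):
    """Count correct sorts across the pre-switch and post-switch phases.

    A sorter with `inhibits=False` keeps responding by the pre-switch
    dimension after the switch (an assumed model of perseveration).
    """
    correct = 0
    for i, card in enumerate(cards):
        relevant = pre_dim if i < switch_at else post_dim
        used = relevant if (inhibits or i < switch_at) else pre_dim
        if sort_card(card, used) == sort_card(card, relevant):
            correct += 1
    return correct


if __name__ == "__main__":
    # Conflict cards: colour and shape point to different locations.
    cards = [Card("rabbit", "blue"), Card("boat", "red")] * 3
    print(run_task(cards, "colour", "shape", switch_at=3, inhibits=True))
    print(run_task(cards, "colour", "shape", switch_at=3, inhibits=False))
```

With six conflict cards and the switch after the third card, the inhibiting sorter places all six correctly, whereas the perseverating sorter is correct only on the three pre-switch cards. Both sorters hold the same rule pairs in mind; they differ only in whether the now irrelevant dimension is suppressed.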

Why does EF relate to ToM?

The evidence suggests that the relationship between EF and ToM is not a spurious by-product of maturational or general cognitive factors, yet it does little to explain the nature of that relationship. In this regard there are at least four possibilities: (1) EF and ToM could be alternative measures of the same thing; (2) EF might affect the online expression of ToM understanding; (3) EF might affect the very emergence of ToM concepts; or (4) ToM might be necessary for the emergence of EF skills. We take up each of these possibilities in turn.

EF and ToM are coextensive

One possibility is that measures of ToM and EF might be empirically related because these two constructs are actually one and the same theoretical entity. How could this be so when the tasks used to measure the constructs are, at least on the surface, so dissimilar? Sorting cards by shape versus colour, for example, appears to be an altogether different form of cognitive activity than explaining what actors have done in terms of their beliefs and desires. Nonetheless, the proposal is perhaps not as implausible as it might look. Theory of mind is a central component of metacognition – cognition about cognition. Metacognitive processes, however, overlap substantially with executive processes (Fernandez-Duque, Baird, & Posner, 2000; Jarman, Vavrik, & Walton, 1995; Moses & Baird, 1999). Whereas metacognition is fundamentally concerned with monitoring and control of thought, executive function encompasses the monitoring and control of action as well. In some sense, then, metacognition is an executive function. Little wonder then that there is an empirical relationship between measures of metacognition (ToM) and measures of EF!

The proposal that ToM and EF are coextensive, or at least that the former is a part of the latter, does capture something important. Nonetheless, there is some slippage in the argument. Theory of mind is indeed an aspect of metacognition. Metacognition, however, has classically been divided into two related but conceptually distinct components: knowledge of cognition and regulation of cognition (Flavell, Miller, & Miller, 2002). Theory of mind (especially if one takes the "theory" aspect seriously) constitutes knowledge of cognition – a set of interrelated concepts involving the mind and its relationship to action. Regulation of cognition, however, is not part of one's theory of mind, although the latter might well influence the former. But if metacognition is an executive function then this can only be true of the regulation component, not the knowledge component. Concepts are not executive functions.

The empirical evidence further clarifies that metacognitive knowledge (and hence ToM) is conceptually distinct from metacognitive regulation

Moses and Sabbagh

(and hence EF). First, although ToM and EF are strongly related in the preschool period, the same is not always true when one looks more generally at metacognitive knowledge and metacognitive regulation. For example, one of the driving forces behind much educational research on metacognition has been the prospect that gains in metacognitive knowledge would translate into improvements in metacognitive regulation (and then into enhanced academic performance). However, attempts to link the two constructs have often been disappointing (Cavanaugh & Perlmutter, 1982; but see Schneider & Bjorklund, 1998 for a more optimistic view). Second, Carlson and Moses (2001) found that when performance on executive tasks was partialled out (in addition to age, sex, and verbal ability), measures of theory of mind nonetheless remained significantly interrelated, suggesting an underlying conceptual core. Moreover, in a factor analysis of the same data, Carlson (1997) found that the EF and ToM variables for the most part loaded on separate dimensions. Hence, whereas any empirical relations between metacognitive regulation and EF could well be explained by the fact that these two theoretical constructs are essentially one and the same beast, the source of any relations between ToM and EF would have to lie elsewhere.

EF affects the expression of ToM

One distinct possibility is that EF affects the online expression of ToM (Carlson, Moses, & Hix, 1998; Russell, Mauthner, Sharpe, & Tidswell, 1991). That is, children might already have the relevant conceptual knowledge in hand, but be unable to express that knowledge effectively, because the tasks with which ToM is measured are laced with burdensome executive demands. We have noted, for example, that inhibitory control and working memory appear to be central to the EF–ToM relationship (Carlson et al., 2002; Hala et al., 2003). In line with this, children might fail false-belief tasks either because they are unable to suppress their own salient knowledge about the world (inhibition) or because they have difficulty holding in mind two competing representations of the world (working memory). The evidence alluded to earlier – suggesting that manipulating the executive demands of ToM tasks systematically affects children's performance – is fully consistent with this expression account.

That said, a meta-analysis of false-belief studies showed that, although reducing executive demands affected the performance of older preschoolers, doing so failed to move the performance of very young preschoolers above chance levels (Wellman, Cross, & Watson, 2001). Something in addition to online executive demands would appear to be constraining children's ToM performance. Further evidence against the expression hypothesis comes from findings suggesting that EF correlates just as highly with tasks that do not exhibit
strong executive demands as with those that do (Perner, Lang, & Kloo, 2002; Moses, Carlson, Stieglitz, & Claxton, 2006). For example, Moses et al. (2005) gave children a task testing their knowledge of the distinction between "know" and "think" – in particular, that "know" indicates greater certainty than does "think". In this task children hear a puppet state, for example, that he thinks a target object is in the red box while another puppet states that he knows the object is in the blue box. Children are then simply asked where the object is. The task is difficult for preschoolers and their performance is correlated with that on false-belief tasks (Moore, Pure, & Furrow, 1990). Yet the task would not appear to impose much in the way of executive demands. Indeed, across trials, children's errors are unsystematic. That is, they do not regularly choose the incorrect location, as might be expected if failures of inhibition were driving their performance. Nonetheless, Moses et al. (2006) found that their performance on executive tasks was as highly correlated with performance on the know–think task as it was with that on false-belief tasks. Moreover, the pattern remained when age, verbal ability, and simple working memory were controlled. These findings are difficult to reconcile with the expression hypothesis.

We further tested the expression hypothesis recently in an examination of Chinese children's executive functioning and theory of mind development (Sabbagh, Xu, Carlson, Moses, & Lee, 2006b). We believed that Chinese children might be advanced in their executive skills: Cultural psychologists have noted that Chinese parents expect their children to have achieved a significant level of impulse control by the age of 2 years (Ho, 1994). The importance of control is also apparent in preschool settings where large groups of children are attended by very few adults (at least relative to Western standards), thereby placing a heavy demand on children's ability to control their own behaviour (Tobin, Wu, & Davidson, 1989). These practices within Chinese culture may provide children with many opportunities for acquiring and exercising executive skills. To the extent that they take advantage of these opportunities, Chinese children might show an advanced timetable of executive development relative to their Western counterparts.

Genetic factors also suggest that cultural differences in the development of EF might be found. For example, the 7-repeat allele of the dopamine receptor gene (DRD4) has been associated with ADHD (Faraone, Doyle, Mick, & Biederman, 2001; Swanson et al., 1998). From a phenotypic standpoint, a key feature of ADHD is poor performance on executive tasks (Schachar, Tannock, Marriott, & Logan, 1995). Moreover, research with adults shows that the presence of the 7-repeat allele of DRD4 predicts poorer performance on executive tasks (Fan, Fossella, Sommer, Wu, & Posner, 2003). Intriguingly, this allele is rare in East and South Asia (including China), having a population prevalence of just 1.9%, compared with 48.3% in the Americas (Chang, Kidd, Kivak, Pakstis, & Kidd, 1996). Thus, as well as the experiential factors, Chinese children may have a biologically based advantage on executive skills.
This hypothesised executive advantage raised the question of whether Chinese children might also show an advantage on ToM tasks. If the expression hypothesis is correct, then we would expect such an advantage. That is, if executive difficulties are preventing young preschoolers from expressing already present ToM concepts, then executive advances should readily generate corresponding advances in ToM performance.

We explored these issues with a sample of over 100 Chinese preschoolers, tested in Beijing, China. They were given the same battery of tasks that was given to US preschoolers in Carlson and Moses' (2001) large-scale study of EF and ToM. Some of these tasks mainly tap inhibitory control whereas others tap a combination of inhibition and working memory. The equivalence of method allowed a direct comparison of executive function across the two cultures. We found that Chinese children outperformed US children on all seven measures we analysed, on average being about 6 months ahead of US preschoolers. The findings for theory of mind, however, were very different: Chinese children performed no better than US children on any of the four tasks they were given. This pattern of findings is consistent with other research on Chinese children's ToM (see Liu, Wellman, Tardif, & Sabbagh, 2004, for a review).

It appears then that EF and ToM are dissociable: Advances in EF may occur without concomitant advances in ToM. The findings argue against the expression hypothesis. If children had a nascent understanding of ToM that is masked by the executive demands inherent in theory of mind tasks, then Chinese children should have outperformed their US counterparts on ToM. These children would seem to have a surfeit of executive ability, and so should have experienced much less difficulty in negotiating whatever executive demands are present in ToM tasks.

EF affects the emergence of ToM

Our data from Chinese preschoolers, as well as from ToM tasks without apparent executive demands, are problematic for the expression hypothesis. They can, however, be accommodated within an alternative emergence hypothesis (Moses, 2001). This holds that executive functioning is implicated in the acquisition of ToM concepts, as opposed to their expression. On this view, children could not acquire abstract mental state concepts unless they had the capacity not only to hold in mind perspectives other than their own, but also to suppress their own perspective when required. To the extent that children's thinking is dominated by a single perspective, they would be unable to reflect on the possibility of alternative ways of thinking about the world.

The emergence hypothesis can explain why performance on the "know–think" task is just as highly correlated with executive ability as is performance on the false-belief task. Even though the former task imposes minimal executive demands, it requires that children possess concepts of knowing
and thinking. But, on the emergence hypothesis, children would require executive ability to develop these in the first place. Hence children's executive ability should predict performance on any task requiring an understanding of these concepts.

The emergence hypothesis is also consistent with our Chinese findings. On this account, executive advances are necessary for the development of ToM. They are not, however, sufficient. A well-developed executive system could never, in and of itself, generate ToM concepts. Critical experiences are likely necessary as well if those concepts are to be formed. Moreover, executive advances may well interact with experiential factors in promoting ToM development. First, children with more mature executive skills might be better able to participate in the kinds of social interactions that are necessary for the acquisition of ToM (Hughes, 1998). Second, executive skills might enable children to capitalise more fully on the theory of mind-relevant information that can be gleaned from such interactions. In either case, opportunities to engage in social interaction would be critical to theory of mind development.

As we have seen, Chinese children are advanced on EF but not on ToM relative to US children. These children may have failed to show the corresponding ToM advantage because they have relatively less exposure to the experiential factors believed to be important for ToM development. One such factor is number of siblings. Preschoolers' ToM performance can be predicted from the number of older siblings living in the household (e.g., Ruffman, Perner, Naito, Parkin, & Clements, 1998). Although the precise mechanism by which this effect occurs is not fully understood, the presence of siblings may provide an opportunity for young children to discuss others' mental states (Brown, Donelan-McCall, & Dunn, 1996). If so, Chinese preschoolers, who by law have no siblings, will have fewer opportunities to have such discussions than their US counterparts. This experiential difference could explain, at least in part, why Chinese children are not advanced in ToM even though they have superior executive skills. Put more generally, this explanation illustrates the potential importance of both domain general cognitive factors and domain specific experiential factors in guiding children's ToM development. Moreover, such factors may well interact in generating ToM advances.

ToM affects the emergence of EF

Most of the findings we have discussed are correlational. We have argued that EF influences the emergence of ToM, but could the causal direction run the other way? Indeed, some have argued exactly that: EF–ToM relations arise because advances in theory of mind generate advances in executive skills. In particular, Perner and his colleagues (Perner & Lang, 1999; Perner et al., 2002) have suggested that the metacognitive abilities inherent in ToM are also required for the executive skills that develop over
the preschool years. In the case of false belief, for example, children must understand that outdated information will lead the protagonist to look in the wrong place. In the case of inhibition, children must recognise that the prepotent action will lead to goal failure, and so must be suppressed in favour of the novel action. Hence, Perner argues, both executive inhibition and false-belief understanding involve metarepresentation: For inhibition, representation of a maladaptive action and its relationship to a goal; for false belief, representation of a mental state and its relationship to action.

Although this proposal is intriguing, we do not think that it can plausibly account for the full range of existing data. The evidence against the proposal has been detailed elsewhere (Moses, 2005). Here we allude to three relevant sources of evidence. First, longitudinal data favour a causal role for EF as opposed to a causal role for ToM. Specifically, in the preschool period, earlier EF predicts later ToM more strongly than earlier ToM predicts later EF (Carlson, Mandell, & Williams, 2004; Hughes, 1998). Second, while the proposal offers an account of how ToM might be required for executive inhibition, it does not offer a similar account of how advances in ToM could generate advances in working memory. Yet, as discussed earlier, working memory in combination with inhibition appears to be central to EF–ToM relations. Finally, the proposal appears to be incompatible with the Chinese data we have described. The Chinese children performed just as well as US children on ToM tasks. However, if advances in ToM were necessary for advances in EF, then it is difficult to explain the Chinese advantage on EF. That is, the Chinese children are advanced on EF relative to their North American counterparts, yet they appear to have achieved that advance in the absence of any corresponding advance in ToM.

Executive functioning interacts with theory of mind

We outlined earlier how domain general executive skills might interact with domain specific experience in advancing ToM. Furthermore, domain general executive abilities may interact with domain specific conceptual content. We will discuss two sources of evidence in relation to this issue. The first concerns relationships between EF and children's appreciation of different kinds of mental states. The second concerns relationships between EF and different kinds of representations.

With respect to mental states, we have examined how EF relates to children's understanding of beliefs, desires, and pretense (Moses et al., 2006). Recognising that a belief is false, or that a desire is unfulfilled, or that a pretend representation differs from reality, all appear to require similar executive skill. In each case two representations need to be held in mind – a representation of the true state of affairs and a representation of the relevant mental state – and in each case one of these representations needs to be inhibited in making the relevant mental state attribution.
Nonetheless, we have argued that the executive demands required for these attributions are actually quite different. In the case of belief, the true state of affairs is highly relevant: our beliefs are intended to match reality. Hence, there may be pressure to collapse beliefs onto reality. In contrast, there is no such pressure in the case of pretense. We rarely intend that pretense matches reality: The whole point of pretense is to construct an interesting counterfactual scenario. Similarly, although we might wish that our desires match reality, we most often try to change reality to meet our desires rather than the reverse. Hence, the conceptual relevance of the true state of affairs is very different for each of these mental states. Moreover, the executive demands that need to be overcome in making mental state attributions should be greater in those contexts in which the true state of affairs is highly relevant, as is the case in belief contexts. Here, the true state of affairs should be considerably more prepotent, and consequently more difficult to suppress. This prediction was borne out in a study in which children were given closely matched belief, desire, and pretense attribution tasks as well as executive tasks. Executive ability was correlated strongly with performance on belief tasks, weakly with that on desire tasks, and not at all with that on pretense tasks (Moses et al., 2006). The need for executive functioning appears to interact with domain specific mental state content.

We have also examined how executive ability relates to children's appreciation of belief representations in comparison to photographic representations (Sabbagh, Moses, & Shiverick, 2006a). We compared how EF relates to performance on false-belief tasks with how EF relates to performance on so-called "false" photograph tasks (Zaitchik, 1990). The false photograph task is designed to parallel the false-belief task closely in terms of formal structure and processing demands. The major difference is that children are required to make judgments about photographic representations that do not accurately reflect reality, rather than about mental representations that do not. In false photograph tasks, children listen to a story in which a character puts an object in one place and then takes a Polaroid photograph depicting the object there. While the photo is developing, the object is moved to a new location, thereby rendering the photograph inaccurate with respect to the location of the object. As in the false-belief task, children are asked to report the contents of the false representation (i.e., "In the picture, where is [the object]?"). Consistent with these formal similarities, performance on the two tasks shares a similar developmental trajectory – 3-year-olds consistently fail both tasks, and systematically correct performance emerges sometime around the age of four or five (Davis & Pratt, 1995; Leekam & Perner, 1991; Leslie & Thaiss, 1992; Zaitchik, 1990).

Yet, despite the apparent similarities, false-belief and false-photograph reasoning have quite different ontogenetic and neurocognitive profiles. For instance, performance on the two tasks is only weakly correlated (e.g.,
Davis & Pratt, 1995); training in one ability does not generate improved performance in the other (Slaughter, 1998); individuals with autism are impaired on false-belief tasks but not false-photograph tasks (Leekam & Perner, 1991; Leslie & Thaiss, 1992); and, finally, in adults, the neural systems activated during false-belief reasoning are distinct from those associated with false-photograph reasoning (Sabbagh & Taylor, 2000).

It is typically assumed that these dissociations arise because of domain specific differences between mental and nonmental representations (Leslie & Thaiss, 1992). This assumption has plausibility because the domain general reasoning requirements of the tasks appear to be equivalent. First, the working memory demands associated with tracking the true state of affairs while holding in mind the previous (believed or photographed) state of affairs should be equal. Second, both tasks require cognitive inhibition to disengage from a salient real-world situation in order to attend to an unseen representation.

That said, we have argued that, just as is the case for different mental state representations, the relationship of belief versus photographic representations to current reality is actually quite different. Specifically, whereas we would like our beliefs to update along with changes in the external world, the same is not true for photographic representations. Indeed, we would be dismayed if our prized photos continually updated to match the true state of the world. Instead, photos are typically intended to reflect reality at the time at which they are taken. Hence, the relevance of the current state of affairs (and thus its prepotency) should be much greater in the case of beliefs than in the case of photographs. If this line of reasoning is correct, we would expect the executive demands of false-belief tasks to be substantially greater than for false-photograph tasks. To test this, we compared the relationship between preschoolers' EF and their performance on belief and photo tasks respectively. Consistent with our hypothesis, performance on false-photograph tasks bore no relationship to executive ability, whereas false-belief performance was once again strongly related to EF.

We tested this further in a study in which children were given false-sign tasks (Parkin & Perner, 1996) as well as the tasks given in the earlier study (Sabbagh et al., 2006a). Like photos, directional signs indicating the location of objects are nonmental representations. Like beliefs, however, signs are intended to be up-to-date representations of reality. Thus, reasoning about false signs requires inhibiting a prepotent assumption about the usual veracity of signs, just as false-belief reasoning requires inhibiting a prepotent assumption about the veracity of beliefs. We would thus expect a strong correlation between executive functioning and reasoning about false signs. Our findings were very much consistent with this: The correlation between performance on the EF and false-sign tasks was just as sizeable as the correlation between EF and false-belief performance. In contrast, the correlation between performance on EF and false-photo tasks was again weak. These findings illustrate further the
ways in which domain general executive skills interact with domain specific conceptual content (see also Stenning & van Lambalgen, Chapter 8, this volume).
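
Several of the findings reviewed in this chapter depend on partialling out covariates such as age or verbal ability before an EF–ToM correlation is interpreted. The following is only a minimal sketch of that logic, not the analysis used in any study cited here: it assumes simulated standardised scores, a single covariate (age), and hypothetical function names (`pearson`, `partial_corr`, `simulate`) and a hypothetical `ef_effect` parameter introduced purely for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, ef_effect=0.0, seed=0):
    """Simulated (not real) scores: age drives both EF and ToM.

    ef_effect is a hypothetical knob: when it is 0, EF and ToM are related
    only through age; when it is positive, EF also contributes to ToM.
    """
    rng = random.Random(seed)
    age = [rng.gauss(0, 1) for _ in range(n)]
    ef = [a + rng.gauss(0, 1) for a in age]
    tom = [a + ef_effect * e + rng.gauss(0, 1) for a, e in zip(age, ef)]
    return age, ef, tom


if __name__ == "__main__":
    for effect in (0.0, 0.5):
        age, ef, tom = simulate(ef_effect=effect)
        print(f"ef_effect={effect}: raw r={pearson(ef, tom):.2f}, "
              f"partial r={partial_corr(ef, tom, age):.2f}")
```

When the simulated EF and ToM scores share only age as a common cause, the raw correlation is sizeable but the partial correlation falls to around zero; when EF also contributes to ToM, a positive partial correlation survives the control for age. The second pattern is the one the chapter reports for the empirical EF–ToM relationship.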

Conclusions

Theoretical accounts that place the impetus for ToM development exclusively in either domain specific or domain general processes offer little promise of explaining the phenomenon. The acquisition of a theory of mind is a complex, multifaceted cognitive achievement to which domain general as well as domain specific processes surely contribute. Moreover, the findings we have reviewed suggest that the contributions of these processes are unlikely to compound in any straightforward, additive fashion. Rather, we have argued that children's executive skills interact with both domain specific social experiences and domain specific conceptual content in driving the acquisition of a theory of mind. What these interactions illustrate is that adaptive executive functioning is not a blunt instrument operating in blind fashion irrespective of context. Instead, successful executive functioning is sensitive to both domain specific context and content. Negotiating the myriad social interactions to which our species is constantly exposed is thoroughly dependent on the flexible deployment of such domain general skills.

Acknowledgment

Thanks to Max Roberts for constructive comments on an earlier draft of this chapter.

References

Anderson, P. (2002). Assessment and development of executive function (EF) during childhood. Child Neuropsychology, 8, 71–82.
Andrews, G., Halford, G. S., Bunch, K. M., Bowden, D., & Jones, T. (2003). Theory of mind and relational complexity. Child Development, 74, 1476–1499.
Baddeley, A. D. (1986). Working memory. Oxford: Clarendon Press.
Barkley, R. A. (1997). ADHD and the nature of self-control. New York: Guilford Press.
Baron-Cohen, S. (1995). Mindblindness: An essay on autism and theory of mind. Cambridge, MA: MIT Press.
Baron-Cohen, S., Tager-Flusberg, H., & Cohen, D. J. (Eds.). (2000). Understanding other minds (2nd ed.). Oxford: Oxford University Press.
Brown, J. R., Donelan-McCall, N., & Dunn, J. (1996). Why talk about mental states? The significance of children's conversations with friends, siblings, and mothers. Child Development, 67, 836–849.
Carlson, S. M. (1997). Individual differences in inhibitory control and children's theory of mind. Doctoral dissertation, University of Oregon, Eugene.
Carlson, S. M., Mandell, D. J., & Williams, L. (2004). Executive function and theory of mind: Stability and prediction from age 2 to 3. Developmental Psychology, 40, 1105–1122.
Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and children's theory of mind. Child Development, 72, 1032–1053.
Carlson, S. M., Moses, L. J., & Breton, C. (2002). How specific is the relation between executive function and theory of mind? Contributions of inhibitory control and working memory. Infant and Child Development, 11, 73–92.
Carlson, S. M., Moses, L. J., & Claxton, L. J. (2004). Executive function and theory of mind: The role of inhibitory control and planning ability. Journal of Experimental Child Psychology, 87, 299–319.
Carlson, S. M., Moses, L. J., & Hix, H. R. (1998). The role of inhibitory control in young children's difficulties with deception and false belief. Child Development, 69, 672–691.
Cavanaugh, J., & Perlmutter, M. (1982). Metamemory: A critical examination. Child Development, 53, 11–28.
Chandler, M. J., & Sokol, B. W. (1999). Representation once removed: Children's developing conceptions of representational life. In I. Sigel (Ed.), Development of mental representation: Theories and applications (pp. 201–230). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Chang, F.-M., Kidd, J. R., Kivak, K. J., Pakstis, A. J., & Kidd, K. K. (1996). The world-wide distribution of allele frequencies at the human dopamine D4 receptor locus. Human Genetics, 98, 91–101.
Davis, H. L., & Pratt, C. (1995). The development of children's theory of mind: The working memory explanation. Australian Journal of Psychology, 47, 25–31.
de Villiers, J. G., & de Villiers, P. A. (1999). Linguistic determinism and false belief. In P. Mitchell & K. Riggs (Eds.), Children's reasoning about the mind (pp. 191–228). Hove, UK: Psychology Press.
Diamond, A. (2002). Normal development of prefrontal cortex from birth to young adulthood: Cognitive functions, anatomy, and biochemistry. In D. Stuss & R. Knight (Eds.), Principles of frontal lobe function (pp. 463–503). New York: Oxford University Press.
Fan, J., Fossella, J., Sommer, T., Wu, Y., & Posner, M. I. (2003). Mapping the genetic variation of executive attention onto brain activity. Proceedings of the National Academy of Sciences, USA, 100, 7406–7411.
Faraone, S. V., Doyle, A. E., Mick, E., & Biederman, J. (2001). Meta-analysis of the association between the 7-repeat allele of the dopamine D(4) receptor gene and attention deficit hyperactivity disorder. American Journal of Psychiatry, 158, 1052–1057.
Fernandez-Duque, D., Baird, J. A., & Posner, M. I. (2000). Executive attention and metacognitive regulation. Consciousness & Cognition, 9, 288–307.
Flavell, J. H., Everett, B. A., Croft, K., & Flavell, E. R. (1981). Young children's knowledge about visual perception: Further evidence for the level 1–level 2 distinction. Developmental Psychology, 17, 99–103.
Flavell, J. H., Flavell, E. R., & Green, F. L. (1983). Development of the appearance–reality distinction. Cognitive Psychology, 15, 95–120.
Flavell, J. H., Miller, P. H., & Miller, S. A. (2002). Cognitive development (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Friedman, O., & Leslie, A. M. (2004). Mechanisms of belief–desire reasoning: Inhibition and selection. Psychological Science, 15, 547–552.
Frye, D., Zelazo, P. D., & Palfai, T. (1995). Theory of mind and rule-based reasoning. Cognitive Development, 10, 483–527.
Goldman, A. I. (2001). Desire, intention, and the simulation theory. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and intentionality: Foundations of social cognition (pp. 207–224). Cambridge, MA: MIT Press.
Gopnik, A., & Wellman, H. M. (1994). The theory theory. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 257–293). New York: Cambridge University Press.
Hala, S., Hug, S., & Henderson, A. (2003). Executive functioning and false belief understanding in preschool children: Two tasks are harder than one. Journal of Cognition and Development, 4, 275–298.
Harris, P. L. (1991). The work of the imagination. In A. Whiten (Ed.), Natural theories of mind (pp. 283–304). Oxford: Blackwell.
Harris, P. L. (2000). The work of the imagination. Oxford: Blackwell.
Ho, D. Y. F. (1994). Cognitive socialization in Confucian heritage cultures. In P. M. Greenfield & R. R. Cocking (Eds.), Cross-cultural roots of minority development (pp. 285–313). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Hughes, C. (1998). Finding your marbles: Does preschoolers' strategic behavior predict later understanding of mind? Developmental Psychology, 34, 1326–1339.
Jarman, R. F., Vavrik, J., & Walton, P. D. (1995). Metacognitive and frontal lobe processes: At the interface of cognitive psychology and neuropsychology. Genetic, Social, and General Psychology Monographs, 121, 153–210.
Langdon, R. (2005). Theory of mind in schizophrenia. In B. F. Malle & S. D. Hodges (Eds.), Other minds (pp. 323–342). New York: Guilford Press.
Leekam, S. R., & Perner, J. (1991). Does the autistic child have a metarepresentational deficit? Cognition, 40, 203–218.
Leslie, A. M. (1994). ToMM, ToBY, and agency: Core architecture and domain specificity. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 119–148). New York: Cambridge University Press.
Leslie, A. M., Friedman, O., & German, T. P. (2004). Core mechanisms in "theory of mind". Trends in Cognitive Sciences, 8, 528–533.
Leslie, A. M., & Polizzi, P. (1998). Inhibitory processing in the false belief task: Two conjectures. Developmental Science, 1, 247–253.
Leslie, A. M., & Thaiss, L. (1992). Domain specificity in conceptual development: Neuropsychological evidence from autism. Cognition, 43, 225–251.
Liu, D., Wellman, H. M., Tardif, T., & Sabbagh, M. A. (2004, August). Development of Chinese and North American children's theory of mind. Paper presented at the 28th International Congress of Psychology, Beijing, China.
Luria, A. R. (1973). The working brain: An introduction to neuropsychology. New York: Basic Books.
Moore, C., Pure, K., & Furrow, D. (1990). Children's understanding of the modal expression of certainty and uncertainty and its relation to the development of a representational theory of mind. Child Development, 61, 722–730.
Moses, L. J. (2001). Executive accounts of theory of mind development. Child Development, 72, 688–690.
Moses, L. J. (2005). Executive functioning and children's theories of mind. In B. F.
Malle & S. D. Hodges (Eds.), Other minds: How humans bridge the divide between self and other (pp. 11–25). New York: Guilford Press.
Moses, L. J., & Baird, J. A. (1999). Metacognition. In R. A. Wilson & F. C. Keil (Eds.), The MIT encyclopedia of the cognitive sciences (pp. 533–535). Cambridge, MA: MIT Press.
Moses, L. J., & Carlson, S. M. (2004). Self regulation and children's theories of mind. In C. Lightfoot, C. Lalonde, & M. J. Chandler (Eds.), Changing conceptions of psychological life (pp. 127–146). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Moses, L. J., Carlson, S. M., & Sabbagh, M. A. (2005). On the specificity of the relation between executive function and theory of mind. In W. Schneider, R. Schumann-Hengsteler, & B. Sodian (Eds.), Young children's cognitive development: Interrelationships among executive functioning, working memory, verbal ability, and theory of mind (pp. 131–145). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Moses, L. J., Carlson, S. M., Stieglitz, S., & Claxton, L. J. (2006). Executive function, prepotency, and children's theories of mind. Manuscript in preparation, University of Oregon.
Norman, D., & Shallice, T. (1986). Attention to action: Willed and automatic control of behavior. In R. Davidson, G. Schwartz, & D. Shapiro (Eds.), Consciousness and self regulation: Advances in research and theory (Vol. 4, pp. 1–18). New York: Plenum.
Parkin, L. J., & Perner, J. (1996). Wrong directions in children's theory of mind: What it means to understand belief as representation. Unpublished manuscript, University of Sussex, Brighton.
Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press.
Perner, J., & Lang, B. (1999). Development of theory of mind and executive control. Trends in Cognitive Sciences, 3, 337–344.
Perner, J., Lang, B., & Kloo, D. (2002). Theory of mind and self control: More than a common problem of inhibition. Child Development, 73, 752–767.
Pillow, B. H. (1999). Epistemological development in adolescence and adulthood: A multidimensional framework. Genetic, Social, and General Psychology Monographs, 125, 413–432.
Ruffman, T., Perner, J., Naito, M., Parkin, L. J., & Clements, W. (1998). Older (but not younger) siblings facilitate false belief understanding. Developmental Psychology, 34, 161–174.
Russell, J. (1996). Agency: Its role in mental development. Hove, UK: Lawrence Erlbaum Associates Ltd.
Russell, J. (Ed.) (1997). Autism as an executive disorder. Oxford: Oxford University Press.
Russell, J., Mauthner, N., Sharpe, S., & Tidswell, T. (1991). The "windows task" as a measure of strategic deception in preschoolers and autistic subjects. British Journal of Developmental Psychology, 9, 331–349.
Sabbagh, M. A., Moses, L. J., & Shiverick, S. M. (2006a). Executive functioning and preschoolers' understanding of false beliefs, false photographs and false signs. Child Development, 77, 1034–1049.
Sabbagh, M. A., & Taylor, M. (2000). Neural correlates of theory of mind reasoning: An event-related potential study. Psychological Science, 11, 46–50.

Sabbagh, M. A., Xu, F., Carlson, S. M., Moses, L. J., & Lee, K. (2006b). The development of executive functioning and theory of mind: A comparison of Chinese and U.S. preschoolers. Psychological Science, 17, 74–81.
Schachar, R., Tannock, R., Marriott, M., & Logan, G. (1995). Deficient inhibitory control in attention deficit hyperactivity disorder. Journal of Abnormal Child Psychology, 23, 411–437.
Schneider, W., & Bjorklund, D. F. (1998). Memory. In D. Kuhn & R. S. Siegler (Eds.), Cognitive, language, and perceptual development (Vol. 2, pp. 467–521). In W. Damon (General Ed.), Handbook of child psychology (5th ed.). New York: Wiley.
Slaughter, V. (1998). Children's understanding of pictorial and mental representations. Child Development, 69, 321–332.
Swanson, J. M., Sunohara, G. A., Kennedy, J. L., Regino, R., Fineberg, E., Wigal, T., et al. (1998). Association of the dopamine receptor D4 (DRD4) gene with a refined phenotype of attention deficit hyperactivity disorder (ADHD): a family-based approach. Molecular Psychiatry, 3, 38–41.
Tobin, J. J., Wu, D. Y. H., & Davidson, D. H. (1989). Preschool in three cultures: Japan, China and the United States. New Haven, CT: Yale University Press.
Wellman, H. M. (2002). Understanding the psychological world: Developing a theory of mind. In U. Goswami (Ed.), Blackwell handbook of childhood cognitive development (pp. 167–187). Oxford: Blackwell.
Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory of mind development: The truth about false belief. Child Development, 72, 655–684.
Wellman, H. M., & Gelman, S. A. (1998). Knowledge acquisition in foundational domains. In D. Kuhn & R. S. Siegler (Eds.), Cognition, perception, and language (Vol. 2, pp. 523–573). In W. Damon (General Ed.), Handbook of child psychology (5th ed.). New York: Wiley.
Welsh, M. C., Pennington, B. F., & Groisser, D. B. (1991). A normative-developmental study of executive function: A window on prefrontal function in children. Developmental Neuropsychology, 7, 131–149.
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13, 103–128.
Zaitchik, D. (1990). When representations conflict with reality: The preschooler's problem with false beliefs and "false" photographs. Cognition, 35, 41–68.
Zelazo, P. D., Carter, A., Reznick, J. S., & Frye, D. (1997). Early development of executive function: A problem-solving framework. Review of General Psychology, 1, 1–29.
Zelazo, P. D., & Mueller, U. (2002). Executive functioning in typical and atypical children. In U. Goswami (Ed.), Blackwell handbook of childhood cognitive development (pp. 445–469). Oxford: Blackwell.

13 Do we need a number sense?

Kelly S. Mix and Catherine M. Sandhofer

From year to year, species after species, ever more specialised mental organs have blossomed within the brain to better process the enormous flux of sensory information received, and to adapt the organism's reactions to competitive or even hostile environments. One of the brain's specialised mental organs is a primitive number processor that prefigures, without quite matching it, the arithmetic that is taught in our schools . . . This "number sense" provides animals and humans alike with a direct intuition of what numbers mean. (Dehaene, 1997, pp. 4–5)

I began this article by asking how we come to have knowledge of number. I hope to have provided the answer – we are built specifically to do so. (Wynn, 1998, p. 302)

In its simplest form, the question of domain specificity asks only: When people process information, do they use specific processes for specific tasks, or do they use general purpose processes for many different tasks? For those who study adult cognition, this question is relatively straightforward. But for those who study cognitive development, domain specificity has taken on special meaning because it has been invoked to explain not only how information is processed, but also how concepts originate and how learning takes place. Domain specificity is often linked with nativism, leading to the proposal that human infants have instinctive or core knowledge for certain domains that gives us a leg up in learning (Chomsky, 1980; Dehaene, 1997; Fodor, 1983; Gelman, 1990; Leslie, 1994; Spelke & Tsivkin, 2001; Wynn, 1998). In this chapter, we evaluate the evidence regarding one such domain: number. Do humans need a number sense to learn the meanings of small numbers, or can domain general processes better explain what we know about this process?

Do we need special help for number learning?

Domain specific accounts of numerical development often begin with the assertion that number learning would be difficult, if not impossible, without the help of a domain specific mechanism (Gelman, 1991; Spelke & Tsivkin, 2001). As Spelke and Tsivkin put it, "Number is arguably our most abstract system of knowledge . . . How can children ever come to understand counting if they do not already understand the entities that counting singles out?" (p. 84). This assertion has merit because the property of numerosity, though omnipresent, is not particularly obvious. Consider, for example, what is required to learn the meaning of "three." As Quine (1960) pointed out, the referents for concrete nouns, like "rabbit," are remarkably indeterminate on closer analysis. But for attributes like threeness, the indeterminacy problem is even more profound. Because number is a property of sets, there is not an object to point toward. Instead, the boundaries of a collection must be established before it, as an entity, can act as a referent for a count word.

There also is extraneous information to ignore. To learn what "three" means, children must ignore the properties of each individual object in the collection, just as they must ignore the specific properties of a rabbit, such as soft and furry, to learn the word "rabbit." However, they must also ignore properties of the collection as a whole, such as its total length, density, or area.

Beyond isolating the property of number, young learners also must recognise numerical equivalence classes for each set size and learn what to call these groups. Forming equivalence classes requires the realisation that diverse individuals with a range of distinctive features are somehow alike. So, just as children come to see many breeds of rabbit as similar, they also must see many collections of three as similar. However, unlike rabbits, collections of three may bear little, if any, resemblance to one another. Consider the commonalities between three planets and three jumps.

To learn the names for these equivalence classes, children must map the words in their native language to the abstract ideas to which they refer. But, unlike the names for nouns and other attributes, children learn the count words as part of the counting sequence, as well as learning them as labels for different set sizes. Thus, to understand the meaning of "three," they must sort out both these and many other meanings and usages of numerals and number words (see Mix, Sandhofer, & Baroody, 2005, for further discussion).

Clearly, number learning has unique challenges. But does this mean that it requires unique processes? Some have argued that without number-specific representations built in, children would be unable to surmount the complexity of the number learning problem.

Domain specific models of number development

Over the past 20 years, two models have emerged as the leading domain specific explanations for early numerical development: skeletal principles and core knowledge.


Skeletal principles

On this view, number learning is guided by counting principles that are embodied in an innate representation for number called the accumulator mechanism (Gallistel & Gelman, 1992; Gelman, 1991). This mechanism works by emitting pulses of energy at a constant rate. When an item is counted, a gate opens that passes energy into a storage unit (i.e., an accumulator). Although there is not a one-to-one relation between pulses and items, the amount of energy per item is roughly equal. Thus, the resulting fullness of an accumulator represents the cardinality of the set. It has been argued that accumulators are used to remember set sizes, compare one set to another (e.g., 3 > 2), and solve calculation problems (e.g., 1 + 2 = ?) (e.g., Wynn, 1995, 1998; Gelman, 1991). The accumulator representation has no upper limit. However, as magnitude increases, the variability, or noise, around the exact numerosities increases. Therefore, in accordance with Weber's law, discriminability decreases as either the set sizes increase, the difference between set sizes decreases, or both.

A key claim is that the accumulator representation operates according to the same principles as verbal counting (Gallistel & Gelman, 1992; Gelman, 1991). For example, to count a set of pebbles verbally, one tags each pebble once and only once with a count word (one-to-one principle). If a pebble is omitted, or is tagged more than once, the last count word will not represent the cardinality of the set. According to the skeletal principles view, the accumulator also obeys the one-to-one principle because energy is gated into the accumulator once and only once for each item. It has been argued that the accumulator obeys all five counting principles, including this and stable order, cardinality, abstraction, and order irrelevance (Gallistel & Gelman, 1992).
In fact, these investigators saw so many parallels between the accumulator and verbal counting that they called enumeration via accumulator "preverbal counting" (Gallistel & Gelman, 1992). Still, whereas skeletal principles are thought to provide an outline for developing a concept of number, experience with objects and verbal counting is needed to flesh this framework out. Thus, the idea of skeletal principles is rather like a language acquisition device for number – nature provides the conceptual slots, but experience is required to fill them (Gelman, 1993, 1998).

Clearly, an inborn counting mechanism would go a long way toward surmounting the challenges of number learning. First, by directing attention toward discrete number, this mechanism would solve a large part of the indeterminacy problem. As Gelman (1998) put it, "first principles [contribute] by focusing attention on inputs that are relevant for acquisition of concepts and providing a way to store incoming data in a coherent fashion" (p. 562). On this view, competition from other percepts would be diminished because domain specific structures give number information privileged status and salience. Second, by providing an amodal representation of cardinal number, the accumulator would help children see disparate sets as equivalent. Though various collections may differ in many respects, these differences would be stripped away once the collections were represented as featureless magnitudes. This abstraction not only would support the development of number categories, but could also facilitate the mapping of number words by providing unambiguous referents. Children, who are predisposed to isolate number from the perceptual stream, should also be more likely to map number words to referents correctly. And as an additional benefit, children should have less trouble acquiring conventional counting skills because the accumulator follows all the principles of verbal counting. If the rules that govern counting are familiar, then learning to count would boil down to implementing known rules with language-specific terminology – a far cry from deducing the rules while also learning the words.

Core knowledge

Similar developmental benefits are provided in a second domain specific learning account that endows infants with core knowledge for number (Spelke, 2003; Spelke & Tsivkin, 2001; Wynn, 1992a, 1995, 1998). In Spelke's version, the proposed core consists of two distinct systems for representing number. One system uses a tracking mechanism that assigns a mental token to each object in a group. These tokens function as pointers to the objects' locations. Because there is a one-to-one relation between tokens and objects, the set of tokens can be used to represent the exact number of objects. However, only a few pointers can be active at any one time due to constraints on selective attention. Furthermore, although the representation preserves the individuality of the objects, it does not provide a representation of the whole group (i.e., in the way that a number word like "three" verbally represents a set's cardinality). The other system (also identified as the sole core knowledge structure by Wynn, 1995, 1998) represents large sets, but only approximately.
It is based on the accumulator mechanism described above. However, proponents of the core knowledge view do not emphasise the accumulator's parallels to verbal counting. In fact, Wynn (1995, 1998) has argued that the lack of functional parallels between counting and the accumulator is what makes learning to count so difficult. Instead, these investigators focus on the strengths and limitations of the accumulator – in Spelke's case, with relation to the object tracking representation. In this regard, she notes that the accumulator is inherently inexact, even for small sets, because there is not a one-to-one relation between pulses and items (though there is a one-to-one relation between gate openings and items). Also, in contrast to the exact system, this representation does not preserve the individuality of the items, though it does represent the group as a whole.

Thus, Spelke's core knowledge account holds that both systems have inherent limitations – the first being limited to set sizes that the object tracking mechanism can handle (i.e., < 4) and the other being limited to rough estimates. Furthermore, though both systems represent an aspect of number, they do not interact so as to provide the basis for a complete number concept (i.e., the ability to represent a collection composed of individual items). Only verbal counting, she (Spelke, 2003; Spelke & Tsivkin, 2001) has asserted, allows people to represent collections composed of individuals, and do so for all set sizes exactly.

Despite their limitations, it is argued that these core knowledge systems constitute the conceptual foundation for subsequent learning and, therefore, play a crucial role in numerical development. They solve many of the same problems that preverbal counting solves in the skeletal principles view. Like preverbal counting, core knowledge serves to direct attention toward discrete number, thus making numerical interpretations of the count words more likely. Like preverbal counting, core knowledge provides amodal representations of cardinal number, thereby supporting the formation of numerical equivalence categories. The main difference is that in the core knowledge view, the rules of conventional counting are not innately available.

In summary, though these views vary in their treatment of certain points, they share one key assumption – that children use number-specific knowledge systems to acquire more mature concepts and conventional skills. In each account, the problems of number learning are reduced by endowing children with knowledge structures that direct attention toward number, support comparisons and abstraction, and provide an organising framework for mapping words to meaning. Though these domain specific systems are incomplete in and of themselves, they are thought to reduce enough variability to make number learning tractable.
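
The two proposed representations can be caricatured in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a model taken from Gallistel and Gelman (1992) or Spelke (2003). The function names, the Weber fraction of 0.2, and the choice to let the energy passed per gate opening vary from trial to trial (so that noise grows in proportion to set size) are all hypothetical. The tracking limit of three objects follows the "< 4" bound mentioned above.

```python
import random


def accumulator_fullness(n_items, weber_fraction=0.2, rng=random):
    """Approximate system: fullness of the accumulator after one pass over a set.

    Assumption (not from the chapter): the energy passed per gate opening is
    drawn once per trial, so variability scales with the size of the set.
    """
    energy_per_item = rng.gauss(1.0, weber_fraction)
    fullness = 0.0
    for _ in range(n_items):  # the gate opens once, and only once, per item
        fullness += energy_per_item
    return fullness


def proportion_correct(small, large, trials=5000, seed=1):
    """Estimate how often the larger set yields the fuller accumulator."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if accumulator_fullness(large, rng=rng) > accumulator_fullness(small, rng=rng):
            hits += 1
    return hits / trials


def track_objects(locations, capacity=3):
    """Exact system: one pointer per object, available only for small sets."""
    if len(locations) > capacity:
        return None  # the set exceeds the attentional limit
    return list(locations)  # individuals are preserved; no symbol for the whole set


if __name__ == "__main__":
    for small, large in [(2, 3), (8, 9), (2, 4), (8, 16)]:
        print(small, large, proportion_correct(small, large))
    print(track_objects(["left", "middle", "right"]))
    print(track_objects(["a", "b", "c", "d", "e"]))
```

Under these assumptions, 2 versus 3 is discriminated more reliably than 8 versus 9, while 2 versus 4 and 8 versus 16 are discriminated about equally well, which is the ratio dependence that Weber's law describes. The tracking function, by contrast, is exact but simply fails once the set exceeds its capacity.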
Plan for the chapter

In this chapter, we aim to evaluate whether number learning actually is supported and guided by domain specific processes – by an innate number sense. This is challenging because, whereas it may be possible to show that processes exist, it is nearly impossible to prove they do not. To draw an extreme analogy, one could propose that number concepts develop after the number fairy sprinkles magic dust on children in their sleep. Though you could interpret a lot of behavioural evidence in terms of this explanation, it would be impossible to falsify. For example, if you videotaped a sample of children every night and never saw the number fairy, a fairy-theorist could argue that she was there, but she was too small to be seen by the naked eye.

A further complication is that domain specific learning and domain general learning are not mutually exclusive. In fact, there is widespread agreement that domain general processes are involved in all learning, including number learning. The argument of domain specific theorists is that these processes are not sufficient on their own. Thus, simply showing that number learning involves domain general processing cannot rule out domain specific processing.

How, then, can the question of domain specificity be addressed? In the present chapter we take three approaches. First, we ask whether particular domain general mechanisms can solve particular number learning problems. This is different from simply claiming that domain general processes are involved in number learning. At a general level, any mental activity requires domain general components, such as attention, perception, or memory. But when we consider domain general mechanisms that are more detailed, they often are either distinguishable from, or redundant with, the proposed domain specific mechanisms.

Second, we will look for behavioural signatures of these domain general processes that have been documented in other concept learning. Because the explanations we entertain are detailed, they produce idiosyncratic patterns of learning. These patterns have been reported for learning a variety of noun and adjective categories (e.g., colour) as well as analogical reasoning and conceptual mapping tasks. If the same signatures were observed in number development, it would be strong evidence that the same mechanisms are involved.

Third, we consider whether there is any additional evidence that compels a domain specific account. Even if the basic problems of number learning can be explained with domain general processes, it is possible that some behavioural evidence cannot. Indeed, the idea of domain specific number learning arose largely in response to evidence for numerical sensitivity in infants. We consider both the validity of this evidence and whether it requires a domain specific explanation.

A domain general account of number development

Though number development involves many conceptual components, we will focus on three achievements in particular: (1) isolating number from the perceptual stream; (2) forming small number categories; and (3) bringing meaning to number words. These accomplishments likely involve both verbal and nonverbal processing. As such, they not only emerge in the age range most likely to benefit from either skeletal principles or core knowledge, but also form the foundation for a range of other skills and ideas. We have already discussed the particular challenges inherent to these developments in the realm of number, and we have seen how prominent domain specific accounts can explain children's achievement of them. Now, let us consider whether domain general mechanisms provide a plausible alternative.

How do children isolate number from the perceptual stream?

Number applies only to collections of individuals. Thus, to have a notion of number, one must maintain and coordinate two interpretations of reality: (1) there are individual objects and (2) some of these objects form a coherent whole – a collection. To explain the origins of these interpretations, domain specific theorists build them into the baby. For example, Spelke (2003; Spelke & Tsivkin, 2001) contends that object tracking yields a representation of individuals, and the accumulator yields a representation of collections as wholes. Though she has argued that the coordination of these notions requires language, she maintains that the ideas are supplied by evolution.

Setting aside, for the moment, the claim that these knowledge systems are innate, we should point out that neither object tracking nor the accumulator is a specialised mechanism for number. Object tracking is a perceptual mechanism for representing objects and their locations (Kahneman, Treisman, & Gibbs, 1992; Pylyshyn, 1989). The ability to parse a scene into discrete objects and track these through space serves many functions, including navigation, object representation, object identification/naming, and object manipulation. Indeed, object individuation is so fundamental to human cognition that it is hard to imagine how most other processes could operate without it. Though the link between numerical cognition and object representation is obvious (i.e., it would be impossible to perceive number without individuation), that does not mean that the processes underlying individuation were evolved to enable humans to think about number (Scholl & Leslie, 1999). Instead, object tracking may be the quintessential domain general process.

Similarly, the accumulator could apply to a variety of mental activities. There is evidence from rats that it underlies the estimation of time (Meck & Church, 1983). It could also, in principle, support estimates of intensity, size, and spatial extent. In fact, one could argue that the accumulator is better suited to continuous applications such as these because they do not require the effortful and potentially error-prone step of gating energy in segments (Mix, Huttenlocher, & Levine, 2002a). From this perspective, it seems more likely that the accumulator was evolved for non-numerical uses and was perhaps co-opted for numerical processing, rather than the other way around. Of course, there are other ways children could come to see collections as coherent wholes besides representing them with an accumulator. We return to this issue later. For now, we wish only to acknowledge that this process does not constitute a domain specific endowment, even if it turns out to underlie numerical development.

So, one possible answer to the question of how children isolate number without domain specific processes is that they rely on processes that are inborn, but domain general. However, this redescription still builds quite a lot into the baby unnecessarily, given recent discoveries about very early learning. Multiple studies have demonstrated that infants readily extract statistically reliable patterns from a variety of perceptual data, including auditory sequences (Saffran, Aslin, & Newport, 1996) and visual scenes (Kirkham, Slemmer, & Johnson, 2002), after even brief exposures.
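
One common way to make "statistically reliable patterns" concrete is to compute transitional probabilities between adjacent elements of a sequence. The sketch below assumes that formalisation purely for illustration; the three made-up "words", the stream length, and the function names are hypothetical and are not taken from the studies cited above.

```python
import random
from collections import Counter


def transitional_probabilities(stream):
    """Return P(next element | current element) for every adjacent pair."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: count / first_counts[pair[0]] for pair, count in pair_counts.items()}


def make_stream(words, n_words=300, seed=0):
    """Concatenate randomly chosen words into one unbroken syllable stream."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_words):
        stream.extend(rng.choice(words))
    return stream


# Hypothetical three-syllable "words"; the syllables have no meaning.
WORDS = [("bi", "da", "ku"), ("pa", "do", "ti"), ("go", "la", "bu")]

stream = make_stream(WORDS)
tp = transitional_probabilities(stream)

within_word = tp[("bi", "da")]           # "bi" is always followed by "da"
across_words = tp.get(("ku", "pa"), 0.0)  # "ku" may be followed by any word onset

if __name__ == "__main__":
    print("within word:", within_word)
    print("across words:", across_words)
```

In a stream built this way, syllable pairs inside a word predict each other perfectly, whereas pairs that straddle a word boundary do so only some of the time. A learner that tracks nothing more than these co-occurrence statistics therefore has a cue to where the units are, without any built-in knowledge of the units themselves.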


Moreover, infants recognise these patterns in subsequent, unfamiliar situations (Gomez & Gerken, 1999). This means that the conceptual precursors to number (i.e., individuation and colligation) could develop rapidly over the first year of life, rather than being innate.

Individuation via statistical learning

Infants are bombarded with information about the physical world starting at least as early as birth (and perhaps earlier if we include encounters with one's own hands and feet). The number of distinct objects infants observe and contact in a single day at home far outstrips the number of stimuli presented in a statistical learning experiment. There can be no doubt that everyday experience provides enough data about objects to support the extraction of statistically reliable patterns. And according to the literature on object concepts, not only do such patterns exist, but infants respond to them in a predictable sequence that suggests gradual abstraction over time.

Adults use many cues to parse the environment into separate objects, including colour, texture, and shape. But perhaps the most reliable test is whether all the parts move together or separately. Indeed, even adults overlook dramatic changes in an object's surface features (e.g., one person changing into another), as long as the object occupies the same position or trajectory in space (Scholl, Pylyshyn, & Feldman, 2001; Simons & Levin, 1997). This use of movement and space may reflect an innate bias, but it also could be learned. A newborn baby, who lacks the strength to even sit up, nonetheless observes other animates, most notably people and household pets, in nearly constant motion. This movement provides exceptionally reliable cues that Mum, for example, is not part of the wall, the table, or the bed. The limitations of newborns' visual systems actually may serve to increase their sensitivity to the patterns in movement information by reducing the salience of competing featural cues. As in other "less-is-more" accounts (e.g., Newport, 1990; Regier & Gahl, 2004), early lack of visual acuity may make infants particularly good at using movement cues because these would make up the bulk of their input, by default.

Statistical learners are sensitive to correlations among features. So, once infants see moving objects as unitary, they should be able to extract other reliable patterns based on the correlated features of these units. For example, movement may tell them that the family dog is not part of the rug or the furniture. But this moving blob also consists of several other correlated features. It always has roughly the same shape. It is covered with brown fur. It moves a certain way. In contrast, another moving blob (e.g., Mum) may be tall. It may talk and smile. Good things might happen when it picks you up. Enough exposure to these bundles of correlated features should allow statistical learners to realise that colour, shape, texture, and sound also indicate unity and distinctiveness. Eventually, these cues may be enough, in and of themselves, for infants to perceive individuality. And the learning we have described so far is what could occur in immobile infants. Once infants begin to move around and manipulate objects, the amount of information they receive about individuality would increase exponentially. In summary, given the correlated structure of individual objects and the rapidity with which infants can learn correlated structures, it is quite plausible that the perception of individuals is learned.

Specific patterns in the object individuation literature lend support to this hypothesis. Multiple studies have described a developmental progression in the types of cues infants use to parse their visual world (Kellman & Spelke, 1983; Needham, 1999; Slater, Morrison, Somers, Mattock, Brown, & Taylor, 1990; Wilcox, 1999; Wilcox & Baillargeon, 1998; Xu & Carey, 1996; Xu, Carey, & Welch, 1999). Consistent with our account, this progression begins with the use of movement or spatio-temporal cues. In Kellman and Spelke's (1983) seminal rod and box experiments, 4-month-olds perceived two ends of a rod protruding from behind a screen as one continuous piece, as long as the two ends moved together. Subsequent research has shown that older babies also use movement cues to tell objects apart. For example, when shown a duck and a truck emerging simultaneously from behind a screen and then returning, 10-month-olds seem surprised to see only one object when the screen is lowered (i.e., they look longer at one object versus two) (Xu & Carey, 1996). Apparently, they realise that the same object cannot appear in more than one place at the same time. However, when only featural information is available (i.e., when the duck was displayed alone and then hidden behind the screen while the truck was displayed), 10-month-olds respond as if they no longer represent the objects as distinct individuals. The fact that the duck and the truck did not look at all alike was not enough information to tell babies that these were separate objects.
They apparently needed to see the objects occupying different locations. Thus, movement/spatiotemporal information seems to be the fundamental cue to individuation used by infants.

Still, though use of this information emerges early, it is not present from birth. When newborns were tested with the rod and box procedure, they reacted to the test displays as if they perceived two small rods, rather than one continuous piece (Slater et al., 1990). This indicates that even movement cues may be learned during the first four months of life.

Further research has mapped out the use of various featural cues in infancy (Needham, 1999; Wilcox, 1999; Wilcox & Baillargeon, 1998; Xu & Carey, 1996). Though differences in testing procedures have led to disagreement about the particular ages involved, the existing studies all show that infants begin to use different cues at different times, in the same basic progression. This starts with the use of movement and form features, including size and shape. Somewhat later, surface features are used, beginning with pattern. Relatively late in development (at 11.5 months according to Wilcox) infants begin to use colour. This gradual acquisition of cues to individuation is consistent with the idea that infants learn what features go together after massive experience with objects. From this perspective, there are at least two reasons why some cues would be noticed before others. One is, as noted above, that changes in infants' visual acuity may increase the salience of certain information (e.g., movement, size, and shape) over information that requires better vision to discern (e.g., pattern). Another reason may be that some cues are more tightly coupled with objecthood than others. For example, because many objects are multicoloured, colour may be a less reliable cue to individuation than shape. If so, then infants may not expect colour information to indicate separate objects until they have amassed enough experience to know that it can sometimes be diagnostic.

Colligation via categorisation

Next, we turn to the second conceptual precursor to number: the notion of collections as undifferentiated wholes. Recall that in both of the domain specific accounts, the accumulator supplies this notion by converting perceived collections into mental magnitudes for which the individuality of their constituents is obscured. It may be true that such a representation would accomplish this. However, it is not clear how the accumulator solves an arguably more basic problem – namely, that of perceiving the collections in the first place. In other words, there is a chicken–egg problem inherent to the domain specific argument. To enumerate a collection using the accumulator, the infant must first see a particular subset of objects as a collection. Yet, if they see a subset of objects as a collection worthy of enumerating, then they must already perceive the collection as a homogeneous entity at some level. In this light, the only contribution of the accumulator is to assign a quantitative value to a pre-existing percept.

But how does this perception of collections itself originate? As we have discussed, number categories piggyback on other categories.
You cannot enumerate fish until you know what fish are and can group fish separately from non-fish. From a developmental standpoint, this means that numerical awareness should not be possible until at least one category can be recognised. Furthermore, subsequent number perception should emerge gradually, in one context and then another, as other categories are learned. Whether or not infants use an accumulator to enumerate sets, the necessity of non-numerical categorisation is a given. And because we can assume that non-numerical categorisation is taking place, there is no reason to posit a domain specific process for perceiving collections as wholes. Experience at forming and contrasting groups would be sufficient. For example, imagine a baby playing with a pile of stuffed animals. To perceive various subsets of animals (i.e., collections), the infant would need to discover ways that the animals are similar. Extensive research suggests that adults and children discover dimensions of similarity via holistic or high-similarity comparisons. Perhaps seeing two highly similar toys in a restricted space, such as a container or one's own hands, would be enough to

13. Do we need a number sense?

induce the first comparison. As each dimension is isolated, it can serve as the basis for subsequent comparisons that may, themselves, support the discovery of additional, new dimensions (Gentner, 2003; Gentner & Medina, 1997; Gentner & Namy, 2004; Goldstone, 1996; Medin, Goldstone, & Gentner, 1993; Smith, 1989). Furthermore, the contrast between matching items and nonmatching items serves to focus attention on particular dimensions (Paik & Mix, in press). That is, when two items are not only highly similar to each other but also highly distinctive from the surrounding objects, it is maximally likely that children will compare them. Thus, we can conceptualise the development of categorisation as a series of groupings, contrasts, and regroupings as more and more dimensions of similarity are discovered. Because comparisons between groupings require abstraction of shared features, the domain general process of categorisation has, inherent to it, the notion of collections as undifferentiated wholes.

It is an open question whether infants engage in this type of categorisation. It has been argued that even very young infants perceive object categories. But this claim is based on evidence that infants respond to category boundaries in habituation experiments (e.g., Madole & Oakes, 1999; Quinn & Eimas, 1996). For example, Quinn and Eimas showed infants a series of cat heads until looking time decreased. When shown a dog head at test, infants looked significantly longer (i.e., they dishabituated). A strong interpretation of such data is that infants formed a category of cat during habituation, compared the dog head to their remembered category at test, and rejected the dog as a member of the cat category. But this interpretation assumes that habituation–dishabituation reflects an explicit comparison process when it could instead reflect an implicit attentional process (see Cohen & Marks, 2002; Schoner & Thelen, 2001).
Furthermore, these experiments provide no evidence that infants impose such categories on their perception of complex, real world scenes. In other words, do infants look into their family living room and mentally parse the scene into cats and non-cats? There is no way to tell from existing habituation experiments, because the categories in these tasks are provided by the experimenter. Competing stimuli are stripped away so that infants need only react to the regularities presented before them. And as we have seen, there is good reason to believe that infants are well equipped to respond to perceptual regularities. This does not necessarily mean that they "have" these categories yet, or that they see the world differently because of them. Similarly, in infant number experiments, the groupings to be enumerated are bounded by the experimenter. There is usually nothing to look at except the computer screen that contains a collection of two or three pictures. In this way, it is the experimenter that completes the categorisation step.

Though it is unclear whether categorisation qua grouping is present in infancy, there are stronger indications that it has emerged by toddlerhood. Children touch objects in sequences that are consistent with explicit object grouping starting at 12 months of age (Bauer & Mandler, 1989; Sugarman,

1981). Children begin grouping similar objects around 2 years of age. Thus, we conclude that the ability to group items, and thereby view collections as wholes, does not develop simultaneously with individuation, as claimed in the domain specific accounts. Instead, colligation appears to develop somewhat later. This makes sense from a learning perspective because, to form groups, one must see individuals as similar. And to see individuals as similar, one must see individuals. The other developmental implication here is that even if infants are endowed with numerical representations, without something to enumerate (i.e., an explicit grouping perceived by the infant), there would be no reason to use them. Thus, an important challenge to the domain specific position on number learning is to show what categories, if any, infants naturally recognise and enumerate in their everyday experience.

In summary, number arises from the coordination of two ideas: (1) objects can be seen as individuals and (2) collections can be seen as wholes. Domain specific accounts assume that these notions are embodied in innate processes for representing number. However, we have argued that the same ideas can and do develop from domain general processes of object representation and categorisation. There is sufficient correlational structure in objects, and sufficient sensitivity to correlational structures in infants, for these notions to emerge through experience. There is good evidence for Spelke's claim that number language plays a critical role in coordinating these notions and transforming them into an explicit sense of number (Mix et al., 2005). But at the earliest stages, these ideas may not be numerical at all – regardless of how much they may seem so to those who already understand how individuals, collections, and number are related.

How do children form small number categories?

Like most concepts, the core of number concepts consists of equivalence classes.
The idea of dog is largely defined by the subset of entities in the world we call "dogs." Similarly, the idea of three is largely defined by the subset of collections in the world we call "three" (e.g., Russell, 1919). Put another way, to know what three is means to recognise threeness in a variety of situations – to see that many otherwise disparate collections can be the same in terms of number. But what draws children's attention to number when there are so many competing properties, most of which can be analysed at the object level, rather than the group level? Domain specific accounts solve this problem by building in specialised processes that support number categorisation in two important ways. First, they direct attention toward number. That is, domain specific processes not only enable our brains to think about numbers, but also cause them to actively seek out numerical information, much like the language acquisition device tunes children in to human language. Second, these processes provide amodal media that should facilitate numerical comparisons. Children who see three cookies on a plate and three dogs in the backyard may

not perceive these collections as equivalent. But if they represent them using identical mental tokens (whether three pointers or three gatings into an accumulator), then the likelihood of noticing this similarity should be increased. Indeed, an explicit claim of these domain specific accounts is that infants use both object tracking and accumulator representations to compare collections and judge similarity. If so, then the apparent obstacles to number categorisation would be overcome largely by genetic endowment.

However, the challenges of number categorisation, though unique in some ways, are not all that different from categorisation in general. Because we know children form many other categories without the aid of domain specific processes, we can assume that effective domain general alternatives exist that might be recruited for use in number categorisation. These include (1) abstracting dimensions of similarity by making comparisons and (2) highlighting similarity by giving shared dimensions the same name. As the following review will show, these processes not only are sufficient to explain the development of number categories in principle, but are reflected in the particular patterns that have been observed in studies of numerical and non-numerical categorisation alike.

Categorisation via comparisons

As discussed earlier, many studies demonstrate that people isolate new dimensions of similarity by aligning items for some other reason (Gentner & Markman, 1994; Goldstone, 1996; Kotovsky & Gentner, 1997; Markman, 1997; Smith, 1989, 1993). For example, children may not realise that dogs have tails, but if they start to examine and compare two dogs for some other reason (e.g., the way the dogs moved or sounded), they might discover that both dogs have tails, whereas people do not.
Similarly, children might discover that objects in two collections can be aligned (thereby discovering numerical equivalence) because they noticed how the objects in one set can be matched one-to-one with the objects in another (e.g., cups onto saucers).

There is abundant evidence of the gradual identification and accrual of different points of alignment in non-numerical categorisation. One indication is that early comparisons depend on a high degree of similarity along many dimensions – not just those relevant to a particular task (Brown & Kane, 1986; DeLoache, 1989; Gentner & Rattermann, 1991; Gentner & Toupin, 1986; Holyoak, Junn, & Billman, 1984; Smith, 1993). For example, DeLoache (1989) tested children's understanding of models by hiding a toy in either a full-size room or a model room and then having children search in the analogous space (e.g., if the toy were hidden in the room, they would search in the model, and vice versa). Children performed much better in this task when the surface similarity between the room and its model was high – that is, when the furniture had the same fabric, when the tables were the same shape and colour, and so forth – even though these features were irrelevant to the search task.

Smith (1989) reported similar effects in an object-grouping task. Using a follow-the-leader procedure, she asked children to group objects that were the same colour. For example, if she chose a red triangle and a red circle from a pile of several objects, the child was supposed to infer the commonality and produce another pair of objects in the same category (e.g., red things). Because the youngest children Smith tested could only pair items that had a high degree of similarity overall (e.g., two red circles), she concluded that children do not isolate separate dimensions of similarity at first. Instead, they initially group items with a high degree of overlap.

Additional studies also have demonstrated that exposure to high-similarity comparisons can induce children to discover new dimensions of similarity (Gentner & Markman, 1994; Gentner & Namy, 2004; Kotovsky & Gentner, 1997; Marzolf & DeLoache, 1994; Medin, Goldstone, & Gentner, 1993; Spalding & Ross, 1995; Waxman & Klibanoff, 2000). For example, Kotovsky and Gentner (1997) found that 4-year-olds had great difficulty in recognising the relation between circles that increased in size and squares that increased in darkness. However, when children were trained on same-dimension comparisons (e.g., all sets that increased in size), their performance on cross-dimension comparisons increased significantly. Along similar lines, Sandhofer (2003) found that 24-month-olds isolated the dimension of texture more readily when they were encouraged to compare and contrast objects. Children were trained to recognise different textures in one of two conditions. In non-comparison training, they were given three objects, one at a time, and asked, "Is this scratchy?" In comparison training, they were given the same three objects simultaneously and instructed to point to the scratchy one.
Although children in both conditions learned the texture words, only children who had received comparison training could match same-textured objects in a subsequent generalisation task. Thus, comparing objects seemed to support the discovery and abstraction of new dimensions of similarity.

If these processes underlie learning about number, then children's numerical equivalence judgments also should progress from high- to low-similarity matches. This should be evident in the natural progression of children's learning, as well as the effects of high-similarity training. With regard to the first point, a progression from high- to low-similarity matches is precisely what Mix and others have found in the development of numerical equivalence judgments (Huttenlocher, Jordan, & Levine, 1994; Mix, 1999a, 1999b, 2002, 2004; Mix et al., 1996; Siegel, 1971, 1974). Starting at age 3 years, children can match nearly identical sets, such as two black dots and two black disks.[1]

[1] Though the equivalent sets in this condition were matched along several non-numerical dimensions, such as colour and shape, the same was true of the distracter sets, also black dots. Thus, to be correct in the high-similarity comparison, children had to take quantity into account.

However, children fail to match numerically equivalent sets

where the objects are not identical, such as two black dots and two lion figurines, until 3–3.5 years of age (Mix, 1999a, 1999b, 2002; Mix et al., 1996; Sandhofer & Mix, 2003a). By 4 to 4.5 years, children can match heterogeneous sets where there are no items in common (Mix, 1999b; Siegel, 1974). Four-year-olds also recognise quite disparate numerical matches between sets of sounds and items in a visual display (Mix et al., 1996). However, number categories are not fully inclusive at 4 years of age. It takes an additional year for children to recognise numerical equivalence for dissimilar sets when one of the distracters is an identical object match (e.g., two flowers equals two trucks but not three flowers) (Mix, 2002). This condition is analogous to the cross-mapping condition that has been used in other preschool comparison research (Rattermann, Gentner, & DeLoache, 1990) and represents the complete decoupling of numerical similarity from surface or object level similarity.

These studies indicate that children do not notice numerical similarity immediately for a range of comparisons, as one might expect given an amodal representation for number. Instead, they seem to build number categories gradually, beginning with comparisons that share a high degree of non-numerical similarity, and moving over a period of years toward number in complete isolation. The length of time involved in this progression, and the particular way it unfolds, is consistent with the progressions described for non-numerical category development. This is strong evidence that the same domain general processes are at work.

Additional evidence for these processes comes from a training study with 30-month-olds (Sandhofer & Mix, 2003b). As in Sandhofer's (2003) texture training study, children were taught to identify small set sizes in one of two conditions.
Children in the non-comparison condition were shown cards one at a time, and asked, for example, "Is this three or four?" Children in the comparison condition were shown three cards all at once and asked, "Which card has three?" Thus, children in the comparison condition compared sets of objects, whereas children in the non-comparison condition compared verbal labels. As for texture, children in both conditions learned the number words and could accurately identify named sets. However, only children who completed comparison training matched disparate sets in terms of numerical equivalence. This study indicates that, as with other properties like texture, children isolate and abstract the property of number by comparing collections.

But if comparison is the mechanism by which number categories are built, a key question is what makes number salient. With a range of competing cues, why would children ever notice an obscure property like number? As we will see, verbal labelling of numerical set sizes may play a major role. However, there are other potential sources of information for young children. Toddlers engage in a variety of play activities that involve implicit comparisons between sets in terms of one-to-one correspondence (Mix, 2002; Anderson & Mix, 2004). For example, toddlers often distribute

objects to people. This activity does not require a priori knowledge of numerical equivalence – the right number of objects can be achieved by making local matches (i.e., empty hand gets a toy, full hand does not). Yet the results of these efforts open a window to the idea of numerical equivalence. Toddlers also encounter a variety of objects whose functional relations with other objects encourage one-to-one mappings, such as cups and saucers, plastic eggs and egg cartons, and so forth (Mix, 2002). For example, many children play with shape puzzles, in which each piece has its own uniquely shaped hole in a wooden board. Putting the pieces in their holes is an exercise in one-to-one correspondence. Furthermore, it is self-correcting. The objects themselves tell you if they fit, if you have enough, or if you need more. In many instances, the holes have identical pictures of the pieces, or thematic cues that link them (e.g., a horse in the hole for the barn piece), thereby encouraging correct mappings via local pairs that have multiple points of alignable similarity. After generating enough one-to-one correspondences in situations that provide massive contextual support, children may become able to compare objects one-to-one in situations that do not (i.e., garden variety comparisons between groupings). This could be enough to support the isolation of number as a property.

Categorisation via shared labels

A second domain general mechanism that promotes categorisation is the use of language to name common features and relations. Like the amodal representations credited to infants in the domain specific accounts, language performs two important functions in this regard. First, shared labels signal that there is a commonality (Gentner & Rattermann, 1991; Rattermann & Gentner, 1998; Sandhofer & Smith, 1999; Waxman & Markow, 1998). Like shared surface features, a shared label can initiate comparisons that are themselves a means of discovering new dimensions.
This process is reflected in the finding that objects with the same label are rated as more similar than objects with different labels (Sloutsky, Lo, & Fisher, 2001). Learning a label also facilitates the recognition of shared properties and matching (Imai, Gentner, & Uchida, 1994; Markman, 1989; Rattermann & Gentner, 1998; Sandhofer & Smith, 1999; Smith, 1993; Waxman & Hall, 1993; Waxman & Markow, 1998). For example, 21-month-olds in a triad task made more taxonomic matches (cookie–cookie) than thematic matches (cookie–Cookie Monster) if the items had been given the same nonsense label (Waxman & Hall, 1993). Labelling also has dramatic effects on preschoolers' performance in cross-mapping comparisons (i.e., where relational similarity is pitted against surface similarity). For example, 3-year-olds initially failed a sticker search task in which they had to identify an item's relational match (same relative size) but ignore its identity match. However, when the experimenter labelled the items "Daddy, Mummy, Baby," 3-year-olds performed well

above chance, reaching the same levels of accuracy as children two years older (Rattermann & Gentner, 1998).

Though labels eventually do facilitate categorisation, a distinctive feature of this process is a marked lag between correct labelling and correct grouping. That is, children can correctly label items along a particular dimension without seeing the items as similar. For example, Smith (1993) found that children can say that this truck is red and that ball is red, but still may not recognise that the two items belong in the same class of red things. Further tests involving a connectionist model confirmed that these two senses of "same" are quite distinct. Smith, Gasser, and Sandhofer (1997) trained a network to label three properties of a given input. For example, given a smooth red triangle and asked "What colour is it?" the network learned to respond "red," and when asked "What shape is it?" the network learned to respond "triangle." However, even after learning to label objects by colour the network failed to represent objects that were the same on a given property as equivalent. That is, when the network was asked, "What colour is it?" and was presented with a smooth red triangle, the pattern of activation on the hidden layer was different than when the network was presented with a bumpy red square and asked, "What colour is it?" The network apparently failed to isolate the property of colour right away and continued to represent aspects of the shape and texture of the objects for some time even though these were irrelevant to the task at hand.

A second function of shared labels is to direct attention. Simply labelling objects in a scene improves memory for having seen the object (Gentner, 2003). This suggests that labelling increases attention toward one object in particular, and this increased processing is reflected in better memory.
Shared labels also can direct attention toward a particular dimension, even if the precise meaning of the label is unknown. For example, hearing the word "red" orients children toward the dimension of colour even though they may not know exactly what "red" means (Landau & Gleitman, 1985; Backscheider & Shatz, 1993; Sandhofer & Smith, 1999). This is reflected in the fact that when 2-year-olds are asked, "What colour is it?" they tend to provide a colour word even though their responses are usually incorrect.

In summary, naming promotes categorisation by signalling a commonality between two entities and by drawing attention toward a particular dimension. The way these processes typically unfold produces three distinct patterns: (1) labelling increases matching; (2) labels are learned prior to abstract categorisation; and (3) labels direct attention toward an overall dimension before specific word meanings are learned. Let us consider next whether these same patterns are evident in the development of number categories.

Labelling increases number matching

Across several experiments, children recognised more numerical matches if they knew the labels for at least a few small set sizes (e.g., could count

to two and produce sets of one and two on demand) (Mix, 1999a, 1999b; Mix et al., 1996). In fact, children who failed to demonstrate at least this level of counting ability could not recognise numerical equivalence except for sets whose elements were nearly identical. This suggests, albeit indirectly, that knowing the labels for small collections facilitates numerical comparisons.

Cross-cultural research also indicates that number words facilitate numerical comparisons. Gordon (2004) studied numerical equivalence judgments in the Pirahã, an isolated group of hunter-gatherers in the Brazilian Amazon. The Pirahã have little contact with mainstream Brazilians and are essentially monolingual. Remarkably, the Pirahã lack a true counting system. According to Gordon, the Pirahã number words correspond to "one," "two," and "many" only. Moreover, these words are inexact. For example, the word for "one" frequently refers to quantities of two, three, or more objects. When members of the Pirahã tribe were asked to remember the numerosity of various sets, their performance was clearly impaired, particularly for numerosities of three or more. For example, after inspecting a set of nuts for several seconds, they watched as the nuts were placed in a can and then withdrawn one at a time. After each nut was withdrawn, participants were asked whether nuts remained in the can. Even for sets as small as two nuts, the Pirahã people were only 70% correct. For sets of four, performance dropped to 40% correct. When asked to discriminate between very small quantities, such as three versus four, Pirahã performance was at chance. These findings suggest that learning words for exact quantities provides critical support for numerical reasoning.

Number labels emerge before abstract categorisation

Though number naming can promote categorisation, knowing number words does not result in immediate abstraction.
Recall that children fail to recognise numerical equivalence between very disparate sets, even though they can accurately label small sets (e.g., Mix, 1999a, 1999b). Thus, as for other properties, children may label number in isolated instances before they know that these disparate situations are related. Sandhofer and Mix's (2003b) number training study provides additional evidence. Although 30-month-olds successfully learned the meanings of small number words via non-comparison training (inasmuch as they could identify displays of each set size when requested), they were unable to match numerically equivalent sets. The same pattern has been reported in naturalistic observations (Mix et al., 2005). In brief, toddlers accurately label small sets in restricted contexts for many months before they can match the same set sizes in experimental tasks. (See Mix et al., 2005, for details.) This protracted time course does not seem consistent with the domain specific claim that children map number words to pre-existing, amodal representations. Instead, it suggests a more gradual learning process in which children first map number words to specific

contexts, eventually juxtapose these contexts, make the necessary comparisons, and finally abstract numerical equivalence.

Number labels direct attention to the dimension of number

Finally, there is evidence that number labels direct attention toward the dimension of number before specific meanings are acquired. Wynn (1992a) showed preschoolers pairs of cards with different numbers of pictures (e.g., one fish versus four fish), and asked them to point to the card with a certain number of items (e.g., "Can you show me the card with four fish?"). By 2.5 years of age, children correctly inferred that count words greater than one referred to sets of multiples. This was evident because they pointed to the correct card as long as it was paired with a singleton. However, these children performed randomly when both cards depicted multiples.

Sandhofer and Mix (2003a) also found evidence of this pattern when they tracked children's acquisition of number language and concepts from 36 to 54 months of age. On average, children began to identify small sets correctly at 42 months of age. That is, they accurately produced sets of one, two, or three when asked. However, this accomplishment was preceded by an awareness that the number words refer to numerosity at age 37 months. In particular, children who were asked, "how many?" usually responded with a number word, even though the specific word did not always match the specific quantity. Note that both of these developments preceded the ability to match disparate sets on the basis of numerical equivalence (observed, on average, at 48 months).

How do children bring meaning to small number words?

Domain specific theorists assume, following Fodor (1983), that one can only learn words for concepts that one can already represent (Spelke & Tsivkin, 2001). From this perspective, it is natural to posit that innate representations of number provide conceptual referents for the number words.
Without such representations, how else could number words be learned? And as it happens, the proposed representations make remarkably good referents. They are abstract and amodal. The category boundaries are clear, at least for small numerosities. Therefore, in these accounts, the main challenge to learning small number words is determining which words refer to which numerosities. Once these mappings have been sorted out, children can achieve new levels of understanding. For example, Spelke (2003; Spelke & Tsivkin, 2001) has argued that mapping small number words to both the object tracking and accumulator representations for those set sizes allows children to combine the ideas of individual and collection, thereby achieving true concepts of number. In other conceptualisations, the litmus test for innate concepts is whether they appear prior to language mastery (e.g., Carey,

2001). The argument is that if children exhibit some understanding prior to mastering the words for it, then it must be innate. If they do not exhibit the understanding until after they have mastered the language for it, then it must be a cultural construction. Thus, like passing through a doorway, children move from one level of understanding to another, by way of small number word acquisition.

Though we agree that exposure to number language likely precipitates new conceptual growth, we have argued that number words do not map neatly onto pre-existing representations (Mix et al., 2005). Instead, the process is much more iterative, continuous, and interwoven than these accounts suggest. In particular, we have argued that partial understanding of number language, and even the attention-directing role of unfamiliar labels, contributes to the construction of number concepts even though neither the concepts nor the words have been mastered. Because the details of this proposal are presented elsewhere (see Mix et al., 2005) we will not reiterate them here. However, we will review four key points that are particularly relevant to the question of domain specificity: (1) Number words are part of early input; (2) Initial mappings are context-specific; (3) Number language is acquired like other language; and (4) Number word learning varies across children.

Number words are part of early input

Domain specific accounts imply that number words do not influence quantitative thought until children master them, around age 4 years. Prior to this milestone, children are thought to rely on their innate representations (i.e., object tracking and accumulator) to perform numerical tasks. Furthermore, these representations are thought to change very little, if at all, during the preverbal period. However, it is important to acknowledge that number words are part of children's input from very early in life (e.g., Durkin, Shire, Riem, Crowther, & Rutter, 1986).
This means that there is a large window of time between children's first exposures to the number words and their eventual mastery of them. Within this window, it is possible that conceptual change is precipitated and shaped by exposure to, and partial understanding of, these words.

Let's consider how this might work. Words are potent organisers of attention even when children are unsure of their meanings (e.g., Gentner, 2003). So when Mum points to two cups and says, "two," children will at least look at the cups even if they don't know what Mum is talking about. Seeing two cups that are distinct from other objects in the scene may be enough to impart the idea of "same," or the category of "cup" (Paik & Mix, in press). Indeed, there is evidence that children's early uses of "two" reflect a confusion between numerosity and similarity (Mix, 2004; Mix et al., 2005). That is, they use "two" to mean "same," but overwhelmingly do so for pairs, perhaps because pairs are easier to compare. This means that,

13. Do we need a number sense?


although children may not understand "two" at first, exposure to this word is likely directing their attention toward situations that pave the way for that understanding to develop (i.e., pairs of easily compared, high-similarity objects).

Diary and longitudinal research provides further evidence of this iterative, bootstrapping process. Correct usage of small number words develops in a stepwise progression that extends over a rather protracted time period (Mix et al., 2005; Wagner & Walters, 1982). Children first use "two" correctly in informal situations. Soon after, they begin to use "one" correctly. After several months, they begin to use "three" and "four" but are frequently incorrect. After approximately one year of correct labelling with the word "two," they start to label correctly using the word "three," but only in informal activities. Throughout this period, children provide no evidence that they comprehend any of the count words on experimental tasks. It is not until children consistently use "one," "two" and "three" with perfect discrimination in everyday situations that they begin to demonstrate correct comprehension and production of these terms in experimental tasks. Soon after, they begin labelling sets of four correctly in informal usage. At this point, nearly two years after children's initial uses of the word "two," they discover the connection between counting and cardinality (Wynn, 1990, 1992a). That is, they realise that the count "1–2–3–4" means the collection has four items in it.

Over the same time period, children's "nonverbal" number concepts also undergo significant, seemingly continuous change (Mix, Huttenlocher, & Levine, 2002b; Schaeffer, Eggleston, & Scott, 1974).
As we have discussed, they accrue experience with a variety of one-to-one mappings, starting with sets that can be mapped easily via local pairings, including socially reinforced activities (distributing objects or turn-taking) and objects that invite one-to-one pairing (peg–hole, peg–hole, etc.). These activities gradually give way to set-to-set comparisons where local pairings are less obvious (car–tree, car–tree, etc.). Around the same time (3 years of age) children begin to match high-similarity, equivalent sets explicitly in experimental tasks (Mix, 1999a, 1999b; Mix et al., 1996). From there, it takes almost two years before they can match equivalent sets that are cross-mapped with object similarity (Mix, 2002). Along the way, they gradually recognise equivalence in increasingly abstract comparisons, including those between non-identical object sets, heterogeneous object sets, and sets of events and objects.

This pattern of acquisition suggests that, rather than mapping number words to pre-existing concepts, language and concepts both develop – if not hand in hand, then at least concurrently. In both cases, development is piecemeal. Learning number words involves the gradual accrual of partial understandings. So does learning number categories. This means that at any given point in time, children have an array of partial understandings at their disposal, both verbal and nonverbal, that can be assembled in


Mix and Sandhofer

different combinations depending on the task. This suggests that words and concepts interact all the way down the line, not only at the point when children seem to understand what the words mean.

Initial mappings are context-specific

Domain specific accounts describe the mapping of number words to concepts as if it takes place at an abstract level, divorced from any particular context. This makes sense because the innate representations of number supposedly provide a context-free, amodal redescription of different set sizes. Words, like these representations, also are arbitrary symbols not tied to any particular context. If these are the components that children are mapping, then it is reasonable to think that they would do so at an abstract level. However, children's early uses of count words do not reflect abstract mappings. Instead, they are decidedly context-specific.

Mix (2004) found that when her son, Spencer, began saying number words, he mapped them to referents in a series of distinct, context-specific situations. In his earliest mappings, he did not reference sets of objects at all. Instead, he used number words to label written numerals. This began with the numerals that appeared in several of his board books, but he eventually came to recognise numerals on signs, license plates and addresses as well. At 23 months, he began using number words to label sets of objects. His first mappings were restricted to the number "two" and they always occurred within a particular linguistic frame: "Two ___. One. Two." For about a week, he labelled only sets of shoes using this frame (i.e., "Two shoes. One. Two."). Then he extended to other object sets, including two dogs, two spoons, and two straws, using the same frame. At 24 months, he began using the variant "___. ___. Two ___." For example, for two trains, he would say, "Train. Train. Two trains." This frame appeared frequently for the next 6 weeks and, during this period, he did not label sets numerically without using it.

Throughout this period, Spencer failed all tests of conventional counting. In the Give-a-Number Test, he failed to produce two objects on request and when asked how many objects were in a set of two, he responded with an idiosyncratic string of number words. Thus, although he correctly labelled different sets of two, his use of the number word "two" was far from decontextualised. In fact, it was deeply contextualised in two ways. First, it was initially restricted to specific situations – first labelling numerals, then labelling shoes. Second, these early attempts were embedded in specific linguistic frames.

A similar pattern was reported in a diary study that tracked the development of another young boy (Blake) from 18 to 49 months of age (Mix et al., 2005). Blake's first number word also was "two," initially used only when asked his age (this response had been reinforced in preparation for his birthday). Although this was likely a simple association

without cardinal meaning, it is noteworthy that his first use of a number word occurred only in this situation.

Number language is acquired like other language

If the acquisition of number language is guided by skeletal principles, or supported by core knowledge, then it should develop differently from acquisition of other language. Indeed, the key claim of domain specific accounts in development is that learning in certain domains is not like other learning. However, diary and longitudinal studies indicate that children learn number words exactly the same way as they learn other words – most notably, the names of other properties.

We have touched on several of these parallels already. Children realise that number words refer to numerosity before they know the specific cardinal meanings of these words (Sandhofer & Mix, 2003a; Wynn, 1992a), just as they realise that colour words refer to the dimension of colour before they know the meanings of individual colour words (Backscheider & Shatz, 1993; Landau & Gleitman, 1985; Sandhofer & Smith, 1999). Furthermore, as in learning colour terms, local mappings of number word to referent often precede the formation of numerical equivalence classes. Both Blake and Spencer spontaneously labelled various object sets, containing the same number of items, for many months before they could match equivalent sets in a forced-choice task (Mix et al., 2005).

There are interesting parallels between the order of the mappings children perform for number and those observed for word learning more generally. As we have seen, equivalence classes are affected by the degree of similarity between objects. When the target and choice objects are highly similar, for example a red aeroplane and a similar red aeroplane, even children who do not comprehend colour terms can match these objects by colour (Soja, 1994).
But when the target and choice objects are less similar or when there is competing similarity from a distracter object, children fail to match objects by colour until long after they have learned to comprehend and produce colour terms correctly (Rice, 1980; Sandhofer & Smith, 1999; Smith, 1984). This is precisely the same pattern we described previously for numerical equivalence judgments, number words, and object similarity (Mix, 1999a, 1999b, 2004; Mix et al., 1996).

A second ordering of interest involves first uses of the number words. Both Spencer and Blake mapped number words onto written numerals early in development. In fact, these constituted their first number-word mappings. This makes sense given that children tend to interpret new words in terms of shape as their vocabularies increase (Smith et al., 2002). Indeed, children with a strong shape bias can identify more letters of the alphabet than children who lack the shape bias, presumably because learning letter names requires careful attention to shape (Longfield, 2004). When children map number words to written numerals, they may be extending the shape bias to numbers. This is particularly likely given that numerically equivalent

sets do not have a consistent shape. Thus, written numerals would provide a more straightforward mapping.

Finally, Spencer's use of number frames is reminiscent of children's use of pivot grammar more generally. Bloom (1993) noted that children often use the same simple sentence structures to incorporate new vocabulary. For example, they might learn the frame "Give ___" to request items and then use this frame repeatedly as they acquire new words (e.g., "Give milk," "Give toy," "Give cookie," etc.). Spencer's number frames have much the same quality. They provided a way for him to incorporate new sets into his category of "twoness."

Thus, we see many parallels between number development and development in other domains. The significance of these parallels is that they indicate common underlying processes. When children associate number words with the dimension of number, they are likely responding to patterns in linguistic input as they do when learning other words (Bloom & Wynn, 1997). When children map number words to shape, they are using the same strategy that works in other word learning situations (e.g., Landau, Smith, & Jones, 1988; Smith, Jones, & Landau, 1992). When they overgeneralise, they are struggling to reconcile their understanding of the underlying categories with the socially accepted categories to which words refer (e.g., Mervis, 1985). These parallels provide important insights into the mechanisms by which children integrate number language with conceptual understanding – mechanisms that are sufficient to explain this accomplishment without invoking domain specific processing.

Number word learning varies across children

We have argued that number word acquisition progresses in a consistent pattern, much like the acquisition of words for other properties. We have explained this consistency in terms of shared domain general processing. However, another way to look at this consistency is that it reflects the universality of domain specific processes.
The argument is that without universal, domain specific processes built-in, the diversity, inadequacy, and pluripotentiality of children's experience with number would yield a wide range of developmental outcomes. Of course, universality is a relative thing. Domain specific theorists assume that interactions between innate structures and the environment are necessary, and therefore expect a certain amount of variability across children (Gelman, 1998). However, evidence for uniformity across children is generally taken as evidence for the constraints and focus provided by domain specific learning.

To sort these issues out, it is helpful to consider in what particular ways children differ and in what particular ways they do not. For example, we have found that although most children will reach the same endpoint in number word learning (i.e., forming number categories for small set sizes and mapping the number words to these categories), individual children

take different pathways to get there (Sandhofer & Mix, 2003a). In a longitudinal study of children's numerical equivalence judgments and number word acquisition, we identified two distinct patterns of acquisition. These patterns arose from children's performance on two tasks: (1) a number categorisation task in which children were asked to, for example, "Give me three," and (2) an equivalence matching task in which they matched a standard to its numerical equivalent.

One group of children learned counting skills early. These children produced the correct number of objects in the give-a-number task from early on. In fact, these children demonstrated this understanding of the small number words an average of 2 months before they could match any numerically equivalent sets. When these children finally began to match sets by number, they did so for all of the small set sizes nearly simultaneously. This pattern of acquisition suggests that children were mapping the number words to different instantiations without seeing these instantiations as related.

The other group of children appeared to master each number, one at a time, succeeding on the number categorisation and equivalence task for the quantities of two before mastering the tasks for quantities of three. This pattern of acquisition suggests that children were fully working out the meaning of each number word, including its corresponding equivalence task, before moving on to the next. Unlike the language-first group, these children seemed to have some sense of numerical equivalence, but one that may have been contextually encapsulated until they learned the associated number words.

What might account for these differences in development? Earlier in this chapter, we outlined a variety of domain general processes that might promote numerical development.
These included use of labels to cue similarity, formation of number categories via one-to-one mappings and implicit categorisation, and local mappings of words to specific situations. It seems that every child would not need every mechanism to solve the problems of early number development. Hence the individual differences we have described here could indicate that different children recruit different processes depending on their learning histories. For example, children who are surrounded by playthings that invite one-to-one correspondence may be more likely to form equivalence categories first. Children whose parents label sets for them often may be more likely to lead with categorisation based on shared labels. A fruitful direction for future research may be to explain why some children recruit different learning mechanisms from others, based on variations in parent input, children's play environments, and preferred learning styles (e.g., object manipulation versus conversation).

What about the babies?

Thus far, we have argued that domain specific mechanisms are not needed to explain how children learn the meanings of small numbers. We have

described domain general processes that could support attention toward discrete number as well as the formation of numerical equivalence categories. We have shown that these processes not only provide plausible alternatives to domain specific processing, but seem to be reflected in the details of early numerical development. Based on parsimony alone, the evidence strongly favours a domain general account. However, we cannot be sure that nature is parsimonious. If there were additional evidence that could be explained only with recourse to domain specific mechanisms, then we would have to concede their existence. And if these processes exist, then it is reasonable to assume that they solve all the problems that domain specific theorists claim that they solve.

One class of evidence emerged in the 1980s and seemed to require a domain specific explanation of number development – namely, evidence of numerical sensitivity in infants. In short, a series of studies revealed numerical processing that was so sophisticated, in babies who were so young, that it seemed impossible to explain its existence without assuming that humans are born with a number sense (e.g., Antell & Keating, 1983; Starkey, Spelke, & Gelman, 1990; Wynn, 1992b). From there, specific representations of number were proposed (i.e., object tracking, preverbal counting, etc.) that were consistent with the particular patterns of strengths and weaknesses in infants' numerical performance. This was the genesis of current domain specific accounts.

Findings of numerical sensitivity in infants may be consistent with the domain specific accounts. However, we do not believe that they require them. First, the evidence of numerical processing in infants is not as solid as initially believed. In a recent review, Mix, Huttenlocher and Levine (2002a) concluded that none of the existing studies had succeeded in demonstrating sensitivity to number per se.
Instead, all the procedures used with infants allowed at least one non-numerical cue to covary with number. For example, the most replicable evidence of numerical sensitivity in infants comes from looking time experiments in which even newborns have responded to changes in set size (Antell & Keating, 1983; Starkey & Cooper, 1980; Starkey et al., 1990; Wynn, 1998; Xu & Spelke, 2000). In these experiments, infants are shown a series of displays with the same number of pictures (e.g., two dots). Over time, infants lose interest in these displays and look at them for less time. Once looking time has decreased by about half, infants are shown displays with a new set size (e.g., three dots). In general, infants' looking time has increased significantly in response to this change, suggesting that they remembered the first set size and noticed that the second set size was different. The problem is that when number changes in these displays, so do a variety of other cues, including contour length, density, complexity, surface area, brightness, and spatial frequency. This means that infants could be responding to a change in one or all of these non-numerical cues, rather than number.

Direct support for this interpretation came from a study in which number was pitted against contour length at test (Clearfield & Mix, 1999). Infants

were habituated to one set size of black squares. At test, they saw either the same number of squares with a different total contour length, or a different number of squares with the same contour length. Though there was significant dishabituation to the contour length change, infants did not respond to the change in number when contour length was held constant. This provided strong evidence that in previous studies where non-numerical cues were not controlled, infants were responding to changes in those cues, and not to the changes in number.

Other studies have demonstrated more sophisticated numerical abilities in infants. These include habituation–dishabituation to events or sounds (Sharon & Wynn, 1998; Wynn, 1996), violation-of-expectation experiments involving simple addition and subtraction problems (Simon, Hespos, & Rochat, 1995; Wynn, 1992b), and intermodal matching of numerically equivalent sets (Starkey et al., 1990). However, in every study, there was at least one non-numerical cue confounded with number (Mix et al., 2002a). We are unaware of any work published since that time that has succeeded in overcoming these confounds. And in several cases, when even one of these non-numerical cues has been controlled, infants have failed to respond to number (Demany, McKenzie, & Vurpillot, 1977; Feigenson, Carey, & Spelke, 2002; Lewkowicz, Dickson, & Kraebel, 2001). Thus, the very evidence that inspired domain specific theories of number development appears to reflect domain general perceptual processing instead.

But what if an infant study did show unequivocal sensitivity to number? Would that necessitate a domain specific explanation? We are not convinced. As we discussed earlier, there is strong evidence that human infants are quite sensitive to statistically reliable patterns in perceptual experience.
With relatively brief exposure to correlated percepts, they not only isolate recurring patterns, but also recognise these in novel situations (Gomez & Gerken, 1999; Kirkham et al., 2002; Saffran et al., 1996). Most infant number experiments have been carried out with babies 5 months old or older. But even a 1-month-old has had so much exposure to objects and visual scenes, literally millions of data points (Blumberg, 2005), that it would be plausible for them to respond to changes in set size based on early learning, rather than an innate endowment. During our review of the object individuation literature, we noted that infants acquire a range of cues in a predictable sequence – one that is consistent with the gradual isolation of multiple, statistically reliable associations among correlated features. Thus, even if we assume, for the sake of argument, that infants exhibit true numerical sensitivity, there is no reason to assume that this sensitivity is built in, rather than constructed via domain general processes.
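
The contour-length confound discussed in this section can be made concrete with a little arithmetic. The sketch below is a hypothetical illustration only: the function names and the side lengths are assumptions chosen for convenience, not the stimulus dimensions used by Clearfield and Mix (1999) or by any other study. It shows that when identical squares are added to a display, total contour length rises with number, and that number can be changed while contour length stays constant only if the individual items are resized.

```python
# Hypothetical illustration: all sizes are made-up values, not real stimuli.

def total_contour_length(n_squares: int, side: float) -> float:
    """Summed perimeter of n identical squares with the given side length."""
    return 4 * side * n_squares


def side_for_contour(n_squares: int, contour: float) -> float:
    """Side length that gives n identical squares the requested total contour."""
    return contour / (4 * n_squares)


# Habituation display: two squares, each with side 3.0 (arbitrary units).
habituation = total_contour_length(2, side=3.0)

# Uncontrolled test display: add a third square of the same size.
# Number changes, but so does total contour length.
uncontrolled = total_contour_length(3, side=3.0)

# Controlled test display: three smaller squares whose summed perimeter
# equals that of the habituation display. Only number changes.
controlled_side = side_for_contour(3, habituation)
controlled = total_contour_length(3, controlled_side)

print(habituation, uncontrolled, controlled_side, controlled)
```

Note that shrinking the squares to equate contour length alters other cues such as item size and total area, which is one reason a single display cannot control every non-numerical cue at once.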

Conclusions

This chapter has focused on the key problems children must solve as they construct a concept of number. Domain specific accounts have been

developed to explain how children solve these problems. However, our review suggests that number development can be explained solely in terms of domain general processes, including those for pattern recognition, categorisation, comparison, and naming. Not only do these processes provide plausible explanations for number development, but they are evident in the specific patterns obtained in the extant studies. That is, number development bears the signatures of the particular processes that have been documented in the development of other concepts and categories. The main source of evidence that seems to compel a domain specific account, numerical sensitivity in infants, is undermined by non-numerical confounds. However, at a more basic level, we question whether awareness of any property in infants requires a domain specific knowledge structure when such awareness could arise from sensitivity to statistical patterns and massive sensory input. In summary, while it is not possible to rule out domain specific processing for number conclusively, we find nothing in the evidence that compels it.

Research on early number concepts has focused on the development of domain specific accounts, and the refutation of them, for some time. Although (or perhaps because) domain general processes are assumed to exist by most investigators, there has been little interest in understanding how they contribute to numerical development. This is unfortunate because, from any perspective, these processes must play a central role. They are the blue-chip stock of developmental psychology – well-established and well-understood mechanisms of conceptual change. If there are domain specific components to number learning, then these components must interact with domain general processes. If there are not domain specific components, then domain general processes shoulder the entire explanatory burden.
Either way, we hope that this chapter will lead to greater acknowledgement of the power of domain general processing and interest in its impact on early number development.

Acknowledgments

This work was supported by a grant from the Spencer Foundation to the first author. We thank Brian Bowdle, Susan Jones, Laura Namy, and Linda Smith for helpful discussion and comments on previous drafts.

References

Anderson, J., & Mix, K. S. (2004, May). A longitudinal analysis of children's one-to-one correspondence behaviors. Poster session presented at the 14th biennial meeting of the International Conference on Infant Studies, Chicago.
Antell, S., & Keating, D. P. (1983). Perception of numerical invariance in neonates. Child Development, 54, 695–701.

Backscheider, A. G., & Shatz, M. (1993, April). Children's acquisition of the lexical domain of color. Twenty-ninth annual Proceedings of the Chicago Linguistic Society, Chicago.
Bauer, P. J., & Mandler, J. M. (1989). Taxonomies and triads: Conceptual organization in one- to two-year-olds. Cognitive Psychology, 21, 156–184.
Bloom, L. (1993). The transition from infancy to language: Acquiring the power of expression. New York: Cambridge University Press.
Bloom, P., & Wynn, K. (1997). Linguistic cues in the acquisition of number words. Journal of Child Language, 24, 511–533.
Blumberg, M. S. (2005). Basic instinct: The genesis of behavior. New York: Thunder's Mouth Press.
Brown, A. L., & Kane, M. J. (1986). Preschool children can learn to transfer: Learning to learn and learning from example. Cognitive Psychology, 20, 493–523.
Carey, S. (2001). Whorf versus continuity theorists: Bringing data to bear on the debate. In M. Bowerman & S. C. Levinson (Eds.), Language Acquisition and Conceptual Development (pp. 185–214). New York: Cambridge University Press.
Chomsky, N. (1980). Rules and representation. New York: Columbia University Press.
Clearfield, M. W., & Mix, K. S. (1999). Number versus contour length in infants' discrimination of small visual sets. Psychological Science, 10, 408–411.
Cohen, L. B., & Marks, K. S. (2002). How infants process addition and subtraction events. Developmental Science, 5, 186–201.
Dehaene, S. (1997). The number sense: How the mind creates mathematics. Oxford: Oxford University Press.
DeLoache, J. S. (1989). Young children's understanding of the correspondence between a scale model and a larger space. Cognitive Development, 4, 121–139.
Demany, L., McKenzie, B., & Vurpillot, E. (1977). Rhythm perception in early infancy. Nature, 266, 718–719.
Durkin, K., Shire, B., Riem, R., Crowther, R. D., & Rutter, D. R. (1986). The social and linguistic context of early number word use. British Journal of Developmental Psychology, 4, 269–288.
Feigenson, L., Carey, S., & Spelke, E. (2002). Infants' discrimination of number vs. continuous extent. Cognitive Psychology, 44, 33–36.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Gallistel, C. R., & Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44, 43–74.
Gelman, R. (1990). First principles organize attention to and learning about relevant data: Number and the animate–inanimate distinction as examples. Cognitive Science, 14, 79–106.
Gelman, R. (1991). Epigenetic foundations of knowledge structures: Initial and transcendent constructions. In S. Carey & R. Gelman (Eds.), The epigenesis of mind: Essays on biology and cognition (pp. 293–322). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gelman, R. (1993). A rational-constructivist account of early learning about numbers and objects. The Psychology of Learning and Motivation, 30, 61–95.
Gelman, R. (1998). Domain specificity in cognitive development: Universals and nonuniversals. In M. Sabourin, F. Craik, & M. Robert (Eds.), Advances in psychological science, Vol. 2: Biological and cognitive aspects (pp. 557–579). Hove, UK: Psychology Press/Lawrence Erlbaum Associates Ltd.

Gentner, D. (2003). Why we're so smart. In D. Gentner & S. Goldin-Meadow (Eds.), Language in mind: Advances in the study of language and thought (pp. 195–235). Cambridge, MA: MIT Press.
Gentner, D., & Markman, A. B. (1994). Structural alignment in comparison: No difference without similarity. Psychological Science, 5, 152–158.
Gentner, D., & Medina, J. (1997). Comparison and the development of cognition and language. Cognitive Studies: Bulletin of the Japanese Cognitive Science Society, 4, 112–149.
Gentner, D., & Namy, L. L. (2004). The role of comparison in children's early word learning. In S. R. Waxman & D. G. Hall (Eds.), From many strands: Weaving a lexicon (pp. 533–568). Cambridge, MA: MIT Press.
Gentner, D., & Rattermann, M. J. (1991). Language and the career of similarity. In S. A. Gelman & J. P. Byrnes (Eds.), Perspectives on language and thought: Interrelations in development (pp. 225–277). Cambridge: Cambridge University Press.
Gentner, D., & Toupin, C. (1986). Systematicity and surface similarity in the development of analogy. Cognitive Science, 10, 277–300.
Goldstone, R. L. (1996). Alignment-based nonmonotonicities in similarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 988–1001.
Gomez, R. L., & Gerken, L. A. (1999). Artificial grammar learning by one-year-olds leads to specific and abstract knowledge. Cognition, 70, 109–135.
Gordon, P. (2004). Numerical cognition without words: Evidence from Amazonia. Science, 306, 496–499.
Holyoak, K. J., Junn, E. N., & Billman, D. O. (1984). Development of analogical problem-solving skill. Child Development, 55, 2042–2055.
Huttenlocher, J., Jordan, N. C., & Levine, S. C. (1994). A mental model for early arithmetic. Journal of Experimental Psychology: General, 123, 284–296.
Imai, M., Gentner, D., & Uchida, N. (1994). Children's theories of word meanings: The role of shape similarity in early acquisition. Cognitive Development, 9, 45–75.
Kahneman, D., Treisman, A., & Gibbs, B. (1992). The reviewing of object files: Object specific integration of information. Cognitive Psychology, 24, 175–219.
Kellman, P. J., & Spelke, E. S. (1983). Perception of partly occluded objects in infancy. Cognitive Psychology, 15, 483–524.
Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in infancy: Evidence for a domain general learning mechanism. Cognition, 83, B35–B42.
Kotovsky, L., & Gentner, D. (1997). Comparison and categorization in the development of relational similarity. Child Development, 67, 2797–2822.
Landau, B., & Gleitman, L. R. (1985). Language and experience. Cambridge, MA: Harvard University Press.
Landau, B., Smith, L. B., & Jones, S. S. (1988). The importance of shape in early lexical learning. Cognitive Development, 3, 299–321.
Leslie, A. M. (1994). ToMM, ToBY, and agency: Core architecture and domain specificity. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 119–148). Cambridge: Cambridge University Press.
Lewkowicz, D. J., Dickson, L., & Kraebel, K. (2001, April). Perception of multimodal rhythm in human infants. Poster session presented at the biennial meeting of the Society for Research in Child Development, Minneapolis, MN.

Longfield, E. (2004). Recognition of letters in the absence of a shape bias. Unpublished honors thesis, Indiana University, Bloomington.
Madole, K. L., & Oakes, L. M. (1999). Making sense of infant categorization: Stable processes and changing representations. Developmental Review, 19, 263–296.
Markman, A. B. (1997). Constraints on analogical inference. Cognitive Science, 21, 373–418.
Markman, E. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.
Marzolf, D., & DeLoache, J. S. (1994). Transfer in young children's understanding of spatial representations. Child Development, 65, 1–15.
Meck, W. H., & Church, R. M. (1983). A mode control model of counting and timing processes. Journal of Experimental Psychology: Animal Behavior Processes, 9, 320–334.
Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological Review, 100, 254–278.
Mervis, C. B. (1985). On the existence of prelinguistic categories: A case study. Infant Behavior and Development, 8, 293–300.
Mix, K. S. (1999a). Preschoolers' recognition of numerical equivalence: Sequential sets. Journal of Experimental Child Psychology, 74, 309–332.
Mix, K. S. (1999b). Similarity and numerical equivalence: Appearances count. Cognitive Development, 14, 269–297.
Mix, K. S. (2002). The construction of number concepts. Cognitive Development, 17, 1345–1363.
Mix, K. S. (2004). How Spencer made number: First uses of the number words. (Technical Report No. 255). Bloomington: Indiana University Cognitive Science Program.
Mix, K. S., Huttenlocher, J., & Levine, S. C. (1996). Do preschool children recognize auditory–visual numerical correspondences? Child Development, 67, 1592–1608.
Mix, K. S., Huttenlocher, J., & Levine, S. C. (2002a). Multiple cues for quantification in infancy: Is number one of them? Psychological Bulletin, 128, 278–294.
Mix, K. S., Huttenlocher, J., & Levine, S. C. (2002b). Quantitative development in infancy and early childhood. New York: Oxford University Press.
Mix, K. S., Sandhofer, C. M., & Baroody, A. (2005). Number words and number concepts: The interplay of verbal and nonverbal processes in early quantitative development. In R. V. Kail (Ed.), Advances in child development and behavior (Vol. 33, pp. 305–346). New York: Academic Press.
Needham, A. (1999). The role of shape in 4-month-old infants' segregation of adjacent objects. Infant Behavior and Development, 22, 161–178.
Newport, E. L. (1990). Maturational constraints on language learning. Cognitive Science, 14, 11–28.
Paik, J. H., & Mix, K. S. (in press). Preschoolers' use of surface similarity in object comparisons: Taking context into account. Journal of Experimental Child Psychology.
Pylyshyn, Z. W. (1989). The role of location indexes in spatial perception: A sketch of the FINST spatial-index model. Cognition, 32, 65–97.
Quine, W. V. O. (1960). Word and object. Cambridge, MA: MIT Press.
Quinn, P. C., & Eimas, P. D. (1996). Perceptual cues that permit categorical

324

Mix and Sandhofer

differentiation of animal species by infants. Journal of Experimental Child Psychology, 63, 189±211. Rattermann, M. J., & Gentner, D. (1998). More evidence for a relational shift in the development of analogy: Children's performance on a causal-mapping task. Cognitive Development, 13, 573±595. Rattermann, M. J., Gentner, D., & DeLoache, J. S. (1990). Effects of labels on children's use of relational similarity. Proceedings of the Twelfth Annual Meeting of the Cognitive Science Society (pp. 22±29). Boston. Regier, T., & Gahl, S. (2004). Learning the unlearnable: The role of missing evidence. Cognition, 93, 147±155. Rice, N. (1980). Cognition to language. Baltimore: University Park Press. Russell, B. (1919). Introduction to mathematical philosophy. London: George Allen & Unwin. Saffron, J., Aslin, R., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926±1928. Sandhofer, C. (2003, April). Comparison training promotes abstraction of dimensions. Poster presented at the biennial meeting of the Society for Research in Child Development, Tampa, FL. Sandhofer, C., & Mix, K. (2003a, April). Number language and number concepts: Evidence from a long-range microgenetic study. Paper presented at the biennial meeting of the Society for Research in Child Development, Tampa, FL. Sandhofer, C., & Mix, K. (2003b, October). Children learning properties: Are domain speci®c mechanisms necessary? Paper presented at the biennial meeting of the Cognitive Development Society, Park City, UT. Sandhofer, C., & Smith, L. B. (1999). Learning color words involves learning a system of mappings. Developmental Psychology, 35, 668±679. Schaeffer, B., Eggleston, V. H., & Scott, J. L. (1974). Number development in young children. Cognitive Psychology, 6, 357±379. Scholl, B. J., & Leslie, A. M. (1999). Modularity, development and ``theory of mind.'' Mind & Language, 14, 131±153. Scholl, B. J., Pylyshyn, Z. W., & Feldman, J. (2001). What is a visual object? 
Evidence from target merging in multiple object tracking. Cognition, 80, 159±177. Schoner, G., & Thelen, E. (2001). A dynamic ®eld model of infant habituation. Unpublished manuscript. Sharon, T., & Wynn, K. (1998). Individuation of actions from continuous motion. Psychological Science, 9, 357±362. Siegel, L. S. (1971). The sequence of development of certain number concepts in preschool children. Developmental Psychology, 5, 357±361. Siegel, L. S. (1974). Development of number concepts: Ordering and correspondence operations and the role of length cues. Developmental Psychology, 10, 907±912. Simon, T. J., Hespos, S. J., & Rochat, P. (1995). Do infants understand simple arithmetic? A replication of Wynn (1992). Cognitive Development, 10, 253±269. Simons, D. J., & Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1, 261±267. Slater, A., Morrison, V., Somers, M., Mattock, A., Brown, E., & Taylor, D. (1990). Newborn and older infants' perception of partly occluded objects. Infant Behavior & Development, 13, 33±49. Sloutsky, V. M., Lo, Y., & Fisher, A. V. (2001). How much does a shared name

13. Do we need a number sense?

325

make things similar? Linguistic labels, similarity, and the development of inductive inference. Child Development, 72, 1695±1709. Smith, L. B. (1984). Young children's understanding of attributes and dimensions: A comparison of conceptual and linguistic measures. Child Development, 55, 363±380. Smith, L. B. (1989). From global similarities to kinds of similarities: The construction of dimensions in development. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogy (pp. 146±178). New York: Cambridge University Press. Smith, L. B. (1993). The concept of same. In H. W. Reese (Ed.), Advances in Child Development and Behavior (Vol. 24, pp. 215±252). San Diego, CA: Academic Press. Smith, L. B., Gasser, M., & Sandhofer, C. M. (1997). Learning to talk about the properties of objects: A network model of the development of dimensions. In R. L. Goldstone, P. G. Schyns, & D. L. Medin (Eds.), Perceptual Learning, 36, The Psychology of Learning and Motivation (pp. 219±255). San Diego, CA: Academic Press. Smith, L. B., Jones, S. S., & Landau, B. (1992). Count nouns, adjectives, and perceptual properties in children's novel word interpretations. Developmental Psychology, 28, 273±286. Smith, L. B., Jones, S. S., Landau, B., Gershkoff-Stowe, L., & Samuelson, L. (2002). Object name learning provides on-the-job training for attention. Psychological Sciences, 13, 13±19. Soja, N. N. (1994). Young children's concept of color and its relation to the acquisition of color words. Child Development, 65, 918±937. Spalding, T. L., & Ross, B. H. (1995). Comparison-based learning: Effects of comparing instances during category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1251±1263. Spelke, E. S. (2003). What makes us smart? Core knowledge and natural language. In D. Gentner & S. Goldin-Meadow (Eds.), Language in mind (pp. 277±311). Cambridge, MA: MIT Press. Spelke, E. S., & Tsivkin, S. (2001). Innate knowledge and conceptual change: Space and number. 
In M. Bowerman & S. C. Levinson (Eds.), Language acquisition and conceptual development (pp. 70±97). New York: Cambridge University Press. Starkey, P., & Cooper, R. G., Jr (1980). Perception of numbers by human infants. Science, 210, 1033±1035. Starkey, P., Spelke, E. S., & Gelman, R. (1990). Numerical abstraction by human infants. Cognition, 36, 97±127. Sugarman, S. (1981). The cognitive basis of classi®cation in very young children: An analysis of object-ordering trends. Child Development, 52, 1172±1178. Wagner, S. H., & Walters, J. (1982). A longitudinal analysis of early number concepts. In G. Foreman (Ed.), Action and thought: From sensorimotor schemes to symbolic operations (pp. 137±161). New York: Academic Press. Waxman, S. R., & Hall, D. G. (1993). The development of a linkage between count nouns and object categories: Evidence from 15- to 21-month-old infants. Child Development, 64, 1224±1241. Waxman, S. R., & Klibanoff, R. S. (2000). The role of comparison in the extension of novel adjectives, Developmental Psychology, 36, 571±581. Waxman, S. R., & Markow, D. B. (1998). Object properties and object kind:

326

Mix and Sandhofer

Twenty-one-month-old infants' extension of novel adjectives. Child Development, 69, 1313±1329. Wilcox, T. (1999). Object individuation: Infants' use of shape, size, pattern, and color. Cognition, 72, 125±166. Wilcox, T., & Baillargeon, R. (1998). Object individuation in infancy: The use of featural information in reasoning about occlusion events. Cognitive Psychology, 37, 97±155. Wynn, K. (1990). Children's understanding of counting. Cognition, 36, 155±193. Wynn, K. (1992a). Evidence against empiricist accounts of the origins of numerical knowledge. Mind & Language, 7, 315±332. Wynn, K. (1992b). Addition and subtraction by human infants. Nature, 358, 749±750. Wynn, K. (1995). Infants possess a system of numerical knowledge. Current Directions in Psychological Science, 4, 172±177. Wynn, K. (1996). Infants' enumeration of actions. Psychological Science, 7, 164±169. Wynn, K. (1998). Psychological foundations of number: Numerical competence in human infants. Trends in Cognitive Science, 2, 296±303. Xu, F., & Carey, S. (1996). Infants' metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111±153. Xu, F., Carey, S., & Welch, J. (1999). Infants' ability to use object kind information for object individuation. Cognition, 70, 137±166. Xu, F., & Spelke, E. S. (2000). Large number discrimination in 6-month-old infants. Cognition, 74, 1±11.

Part III

Extreme domain specificity versus domain general intelligence

14 Do problem solvers need to be intelligent?

Maxwell J. Roberts

Imagine that you are about to undergo hospital treatment and, much to your surprise, you are given a choice: You can be operated on either by a doctor with an IQ of 120, and 10 years of surgical experience, or by a doctor with an IQ of 150, who has just qualified from medical school. Most people would question the sanity of the person who offered them such a choice. However, this precisely reflects the "straw-man" theory of the relationship between intelligence and problem solving that some advocates of extreme domain specificity have sought to disprove. On this theory, intelligence is the only predictor of performance: an absolute talent model. Any believers in this model would have to choose the high-IQ doctor. Such a view is implicated by Ericsson, Krampe, and Tesch-Römer (1993, p. 392): "the popular 'talent' view that asserts that differences in practice and experience cannot account for differences in expert performance". Disproving such a model underlies Simon's (1990, p. 15) assertion that:

In all domains, differences in knowledge (which must include learned skills as well as factual knowledge) prove to be a dominant source of differences in performance . . . Of course, this finding should not be taken to deny the existence of "innate differences", but rather to account for their relative (quantitative) insignificance in explaining differences in adult skilled performance.

This position is held by many advocates of extreme domain specificity, who thus deny any important relationship between intelligence and the execution of problem-solving procedures, or the acquisition of domain specific problem-solving skills. No proponent of domain general ability would claim that differences in general intelligence are the only explanation of differences in problem solving, so let us make the original question more interesting.
This time you can choose between a doctor with an IQ of 120, and 10 years' surgical experience, and a doctor with an IQ of 150, who is otherwise identical, including 10 years' experience. Advocates of extreme domain specificity would assert that the doctors would perform equally well, but on what grounds can they claim this? In reality, for a routine procedure, it probably would make little difference, but what if the procedure was unusual, or there was the possibility of complications? Would the doctors really be identically proficient?

Claims of extreme domain specificity typically comprise two strands. First, cognition is based on processes that operate entirely, and only, within narrow contexts. These might be innate and modular: Over the millennia, individuals have been favoured who have any genetic predisposition to be preprogrammed with suitable solutions to recurring survival-relevant problems. Examples include how to catch people who seek to break social contracts, and hazard detection and management (e.g., Fiddick, Cosmides, & Tooby, 2000, but see also Gottfredson, Chapter 17, this volume). Another domain specific alternative to this massive modularity is extreme contextualism. Here, the source of domain specific processes is learning and experience: People acquire context-specific knowledge that enables them to perform in those contexts, but very poorly out of them (e.g., Ceci & Roazzi, 1994, pp. 92–93; Richardson, 1991b, p. 78). Second, proponents of extreme domain specificity deny individual differences in domain general ability. These might reflect the proficiency with which general problem-solving procedures operate; for example, constrained by the availability of domain general resources such as working memory capacity.

In reality, there is nothing mutually exclusive about domain specific versus domain general processes. Indeed, the availability of domain general resources could constrain the effectiveness with which domain specific processes are acquired or operate. However, extreme domain specificity rejects this, positing that the mere possession of a module or knowledge structure is necessary and sufficient for good performance, and that all sufficiently motivated people can acquire domain specific problem-solving skills equally well.1 Any predictive power of intelligence tests for problem-solving performance thus potentially refutes this position. It is therefore necessary for proponents of extreme domain specificity to show that individual differences in domain general ability are nonexistent

1 We must be wary of political (as opposed to empirical) bases to extreme domain specificity. Intelligence can be a sensitive topic because of the perceived injustices of testing (e.g., Gould, 1997; Rose, Lewontin, & Kamin, 1990). Extreme domain specificity is often adopted as a defence against these "excesses". However, once it is shown that people differ in important domain general ways, the egalitarian underpinnings of extreme domain specificity collapse. Among people who regard intelligence testing as an abhorrence, this leads to a strident eagerness to discredit any domain general skill, which might be tested and used to predict and affect people's futures. All this state of denial achieves is the neglect of important parameters in psychological modelling (see Chabris, Chapter 19, this volume) and of the real and recurring day-to-day problems faced by people with low intelligence (see Gottfredson, Chapter 17, this volume). This can only reduce the effectiveness and fairness of psychological testing.


or irrelevant, and hence that intelligence is not important. Their problem is that intelligence test scores have a persistent and reasonably high predictive validity in schools, in the workplace, and in society (see Brody, Chapter 18, and Gottfredson, Chapter 17, this volume). One strategy, therefore, is to allege that test scores are a proxy for something else (e.g., Howe, 1988, 1990).2 Another is to divert attention by citing cases of poor predictive validity. It is effectively claimed that the ability measured by intelligence tests is narrow and specialised and has little utility, so that people's real-world problem solving is not predictable by their intelligence. All people therefore have equal potential, irrespective of measured intelligence. The plausibility of this argument will be evaluated in the current chapter.

The focus of this chapter will be a set of laboratory and naturalistic studies of problem solving that have entered psychological folklore: They are cited as demonstrating zero association between intelligence test score and problem-solving performance, and hence they allegedly disprove the importance of domain general ability. Unfortunately, much of the evidence is anecdotal; for example, Sternberg, Wagner, Williams, and Horvath's (1995) account of refuse collection, and also their surprise that mentally retarded children could counter security precautions and escape from a classroom, but could not solve the Porteus Maze Test. The latter is hardly surprising given that (1) the children had several weeks to study the security, but not the test, and (2) the security was defeated collectively, but the test was solved individually. In a similar vein, Ceci and Liker (1986a, p. 140) seek to convince readers that academics are unsophisticated in their personal affairs – and hence their superior performance in domains of expertise is entirely contextual – entirely on their say-so. In any case, absent-minded professors are hardly representative of people with high intelligence. As another anecdotal example, Ceci and Roazzi (1994, p. 80) expect us to take seriously a zero relationship between intelligence and performance based upon an N of 2.

2 Typical claims are that intelligence tests really measure social status, specific temperamental dispositions (e.g., test-taking willingness), or more general ones (e.g., general motivation, or personality traits such as neuroticism or extraversion). Hence, with the right approach to all aspects of life, including intelligence testing, a person would generally be expected to succeed. Space precludes a discussion of these, and cognitive ability, dispositional, and motivational traits can be separately identified and measured, as can interactions between them in determining performance (e.g., Ackerman & Heggestad, 1997). An alternative is that intelligence might be a labelling phenomenon, although evidence in support is inconsistent (e.g., Spitz, 1999). All "proxy" arguments have a fatal flaw. Suppose intelligence tests are acting as proxy measures for some other dimension. As accidental measures, they must be measuring imperfectly, and a test explicitly designed to measure the mystery dimension should be more valid, and therefore would predict life's outcomes better than intelligence tests. So far, this has not been achieved. Personality inventories are poor predictors of occupational success (e.g., Barrick & Mount, 1991; Blinkhorn & Johnson, 1990), and intelligence test score remains the best predictor of school and occupational success.
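The "fatal flaw" in proxy arguments can be written out as a standard attenuation identity. This is only a sketch under assumptions that the chapter does not state: let D be the hypothesised mystery dimension, Y a life outcome, and T = D + e the intelligence test score acting as an accidental proxy, with error e uncorrelated with both D and Y. The symbols D, T, e, and Y are introduced here purely for illustration.

```latex
\begin{align*}
\operatorname{Cov}(T, Y) &= \operatorname{Cov}(D + e,\, Y) = \operatorname{Cov}(D, Y), \\
\rho_{TY} &= \frac{\operatorname{Cov}(D, Y)}{\sigma_T\,\sigma_Y}
           = \rho_{DY}\,\frac{\sigma_D}{\sigma_T}
           = \rho_{DY}\,\sqrt{\frac{\sigma_D^{2}}{\sigma_D^{2} + \sigma_e^{2}}}, \\
\lvert \rho_{TY} \rvert &\le \lvert \rho_{DY} \rvert .
\end{align*}
```

Under these assumptions, a test that measured D directly (a smaller error variance) would predict Y at least as well as the proxy does, which is the expectation that, according to the footnote, no alternative measure has yet met.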


Although anecdotes may illustrate a point, that is all that they can do. The lack of methodological rigour makes them otherwise worthless, and carefully selected ones without base-rates and counterexamples are misleading. Gottfredson (2003) notes that these no more sever the link between intelligence and general problem solving than accounts of long-lived smokers break the link between smoking and cancer. However, if sounder evidence is reported selectively (without available counterfindings), the problem of bias is precisely the same. Even dismissing the anecdotes, the studies to be discussed suffer from the following difficulties.

1 Inadequate data. There are often small sample sizes and poor measures of problem-solving performance, for example unreliable or prone to ceiling effects.

2 Inadequate design. Many studies involve quasi-experimental group comparisons (e.g., experts versus novices), thus limiting the conclusions that can be drawn.

3 Non-isomorphic comparison tasks. Several studies depend on a comparison of contextual versus non-contextual tasks, requiring that these are isomorphic, i.e., logically identical. This is not achieved (see also Roberts, Chapter 1; Stenning & van Lambalgen, Chapter 8, this volume).

4 Obsolete theoretical approaches. Recent developments in theories of strategy acquisition and selection render many conclusions obsolete.

5 Developmental trajectories neglected. Cross-sectional studies of well-learnt domain specific problem-solving skills cannot tell us about the process of skill acquisition, nor whether differences in domain general ability are associated with this.

6 Straw-men theories refuted. Two can be identified. First, as described earlier, the absolute talent model. Second, an absolute barrier model of intelligence, which implies that with an IQ of, say, 80, there are tasks that are too complicated for a person to learn. This invites researchers to come forward with examples of people with low IQ scores who can apparently do complex things. In reality, intelligence researchers are never so deterministic. Instead the emphasis is on how easily tasks can be learned and performed, with this increasing as intelligence level increases.

My argument, therefore, will be that the traditionally cited exception studies, which ostensibly disprove that a domain general intelligence predicts general problem-solving proficiency, fail to disprove anything at all.

Is skilled problem solving possible irrespective of intelligence?

Historically, the increase in emphasis on domain specific processing corresponded to a disillusionment with domain general problem-solving procedures such as hill-climbing and means–ends analysis (also known as "weak methods", see Newell & Simon, 1972). Despite requiring no background knowledge, and despite the initial success of computers programmed to use such methods, their utility began to be called into question (e.g., Copeland, 1993, but see Simonton, Chapter 15, this volume). The focus turned towards knowledge-rich methods, such as analogies (e.g., Gick & Holyoak, 1983) and the use of knowledge by experts (e.g., Chi, Feltovich, & Glaser, 1981; Larkin, McDermott, Simon, & Simon, 1980). Simultaneously, this switch resulted in a subtle change in the conceptualisation of problem solving. Success in general was said to require the recall and application of memorised domain specific procedures, rather than the use of domain general methods to find an unknown solution. This was suggested even for tasks not obviously linked to domains of expertise, such as psychometric test item solution (e.g., Richardson, 1991a; Schiano, Cooper, Glaser, & Zhang, 1989). Hence, success at intelligence tests was likewise said to reflect a domain specific expertise. Many of the studies to be described are products of this change in emphasis, investigating the "solving" of routine problems by people with many years of experience at them. Such studies should be considered investigations of memory rather than problem solving, limiting their generalisation to genuinely novel tasks. Hence, domain general ability may or may not be related to routine problem solving, but what about where a solution has not been well rehearsed? However, the difficulties with the following studies mean that even the conclusion that a person's level of cognitive ability is irrelevant to routine problem solving is called into question.

Intelligence tests themselves contain a substantial problem-solving element by any definition.
For example, Raven's Progressive Matrices (Raven, Raven, & Court, 1993) require people to solve nonverbal logic puzzles, in which rules linking various shapes and patterns must be identified and then applied. It is a problem-solving task in its own right, and is also one of the best single tasks for measuring psychometric g, or domain general intelligence (e.g., Snow, Kyllonen, & Marshalek, 1984). In the psychology laboratory, the problem-solving skills required to solve intelligence tests also seem to be necessary for other types of logic puzzles and decision-making tasks (Stanovich & West, 2000); problem-solving tasks involving rearrangement (Carpenter, Just, & Shell, 1990); and also computer games (Rabbitt, Banerji, & Szemanski, 1989). Of course, these are all relatively simple tasks, and it is possible to argue that they are not representative of the complexities of the real world, whether on the basis of relevance, procedure, or complexity. The use of intelligence tests in order to predict complex problem solving in longitudinal laboratory studies is lamentably infrequent, but relationships can be found. For example, looking at an aircraft landing task, Ackerman (1992) found correlations between cognitive ability and performance, and


Schunn and Reder (2001, Study 3) between cognitive ability and strategy flexibility. However, much importance has previously been attached to a particular set of null results. Sternberg et al. (1995) and Ceci and Liker (1986a) cite the work of Dörner and colleagues as showing that intelligence test scores fail to predict performance at complex dynamic computer-based simulation games (Brehmer & Dörner, 1993; Dörner & Schaub, 1994). This might imply that once a certain degree of task complexity is reached, even in laboratory problem-solving tasks, intelligence tests fail to index the necessary abilities, and successful problem solving is possible irrespective of measured intelligence. However, null findings can result from methodological difficulties. In particular, the more complex a problem-solving task, the harder it is to devise reliable and valid measures of success. This has now been addressed for the dynamic computer-based simulations (Rigas, Carling, & Brehmer, 2002) and very high correlations between cognitive ability and performance are reported (e.g., Gonzalez, Thomas, & Vanyukov, 2005), superseding Dörner's null results. Overall, even among homogeneous samples in the psychology laboratory, across a range of tasks varying in complexity, some people on average are better at problem solving than others, and this is associated with measured intelligence.

Many of the arguments that domain general intelligence – as measured by standard tests – is irrelevant to general problem solving therefore focus on the clever things that people achieve in real life.3 The implication is that laboratory problem-solving tasks tap rather specialist skills, perhaps simply a disposition to perform well in the laboratory. Instead, the complexities of the real world present people with different challenges that intelligence tests and laboratory tasks cannot capture. Real-life problem solving requires knowledge application, and knowledge acquisition requires experience and motivation, not domain general intelligence. Without domain specific knowledge, people flounder. With such knowledge and suitable context, they succeed. The evidence for this comes from demonstrations of people whose behaviour is more "complicated" than might be expected from their measured intelligence. Don't write ordinary people off, it is implied. Just look at what they can achieve in the real world. How could people manage such things unless they are highly "intelligent"?

One focus of researchers seeking complex behaviour is the workplace. Scribner (1986) gave many examples, and findings in a milk-packing plant

3 Many of the examples to be discussed have been touted as "practical thinking" or "practical intelligence", even though a straightforward definition – to distinguish between the practical and the nonpractical – is impossible. Words such as dynamic and flexible are used, and practical thinking is supposedly a response to varying environments and goals (e.g., Scribner, 1986). Even so, successful "nonpractical" thinking has also been shown to have precisely these characteristics (e.g., Schoenfeld, 1987; Schunn & Reder, 2001; Siegler & Jenkins, 1989).


are typical (see also Scribner, 1984). Experienced people had devised numerous strategies that they were able to apply flexibly, enabling them to perform efficiently, and minimise overall physical or mental effort. For example, when packing orders, workers used visualisation strategies that dispensed with counting out components. These short-cut strategies avoided the need for exhaustive calculations, but were task-specific in the sense that unusual permutations defeated them, forcing use of exhaustive strategies (see Roberts & Newton, 2003). Not surprisingly, inexperienced people were less likely to use short-cut strategies (depending on their prior experience at the particular task) and performed worse in the same setting. Complete novices (students) performed worst of all. This pattern was found for packing, pricing, and warehouse management. Even so, there were individual differences in strategy usage over and above the level of experience (Scribner, 1984, p. 38):

In each comparison, one or more individuals who apparently lacked on-the-job experience with the task showed the same fluency in optimizing solution strategies as practitioners. Conversely, in every case, one or more individuals from the occupation in question did not turn in a skilled performance.

However, these works do not document any lack of relationship between performance and schooling/cognitive ability. At best, when test problems were administered that defeated short-cuts, student/novice versus expert groups performed equally when using exhaustive strategies. Even here, the experts would have been more experienced at using them, and so we cannot conclude that general abilities are irrelevant for strategy execution.

Scribner's work broke away from deterministic conceptualisations of problem-solving behaviour, in which humans were permitted to demonstrate little flexibility, only executing the same processes faster and faster with practice. Inter- and intra-individual differences in strategy usage were minimised as experimental noise. Today, the supposed uniqueness of behaviour in practical domains, and the (alleged) lack of association between success and general intelligence, is rendered obsolete by contemporary models of strategy development (e.g., Crowley, Shrager, & Siegler, 1997; Roberts & Newton, 2005; Siegler, 1999; Siegler & Jenkins, 1989). In general, when performing tasks, irrespective of domain, people spontaneously discover new strategies, performing more efficiently as a result. Interestingly, success-based models are currently favoured, in which the more successful performers initially are those who more rapidly discover improved strategies, leaving the less successful performers with the original inefficient ones (e.g., Roberts, Gilmore, & Wood, 1997; Siegler & Jenkins, 1989; Siegler & Svetina, 2002; Wood, 1978). Initial unpractised performance has been shown to be predicted by cognitive ability (e.g., Roberts & Newton, 2005; Siegler & Svetina, 2002). Furthermore, deciding which


strategy is the best sometimes demands high levels of cognitive ability (e.g., Dierckx & Vandierendonck, 2005). There is therefore nothing mysterious about the earlier findings, all of which can be explained by contemporary cognitive accounts. As might be expected for any work conducted before the necessary frameworks had been pieced together, Scribner made no attempts to answer the more interesting questions: Who discovers improved strategies, when, and how, and how are they chosen between? These are far more relevant to the understanding of cognition than mere observations that experienced people can do "clever" complex things in their own contexts.

Arithmetic in the supermarket and the street

In the 1970s and 1980s, the study of naturalistic problem solving developed into a minor industry. Another often-cited body of work is by Lave (e.g., 1988), reporting the strategies used by adults at supermarkets. It was again found that people had developed a variety of strategies for evaluating value for money, and these had not necessarily been taught at school. Again, these were task-specific short-cuts, devisable because it is often possible to identify best buys without having to calculate exact unit prices, but difficult to generalise to other tasks. There were explicit claims that performance at supermarket-style evaluation tasks was not predicted by arithmetic ability or schooling. The underlying theme was the context specificity of cognition, ruling out general ability as a predictor. With the hindsight of a body of work on strategy development, it now seems obvious that adults can develop useful arithmetic strategies spontaneously, and apply them flexibly.
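The contrast between a task-specific short-cut and an exhaustive calculation can be made concrete with a small sketch. The dominance rule below is a hypothetical short-cut chosen for illustration (it is not one of the strategies that Lave documents), and the function name `best_buy` and the (price, quantity) encoding are assumptions. An offer that is no dearer and no smaller wins without any division; only when that rule is defeated is an exact unit price computed.

```python
from fractions import Fraction


def best_buy(offer_a, offer_b):
    """Compare two offers, each given as (price, quantity) in whole units.

    Returns (winner, strategy). This is an illustrative sketch, not a
    model of any shopper's documented behaviour.
    """
    price_a, qty_a = offer_a
    price_b, qty_b = offer_b

    if offer_a != offer_b:
        # Hypothetical short-cut: an offer that costs no more and
        # contains no less is the better buy, with no division needed.
        if price_a <= price_b and qty_a >= qty_b:
            return "a", "short-cut"
        if price_b <= price_a and qty_b >= qty_a:
            return "b", "short-cut"

    # Exhaustive fallback: compare exact unit prices.
    unit_a = Fraction(price_a, qty_a)
    unit_b = Fraction(price_b, qty_b)
    if unit_a < unit_b:
        return "a", "exhaustive"
    if unit_b < unit_a:
        return "b", "exhaustive"
    return "tie", "exhaustive"
```

For instance, `best_buy((250, 500), (300, 400))` is settled by the short-cut, whereas `best_buy((250, 500), (400, 1000))` defeats it and falls back to unit prices, much as unusual permutations are said to force the use of exhaustive strategies.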
We also know that taking a diverse body of adults, and seeing whether their performance can be predicted by demographic measures and assessments of cognitive ability, will underestimate the association between cognitive ability and performance, because different people will have had different opportunities for developing new strategies. Had the developmental trajectory of strategy acquisition been studied, we would undoubtedly have observed individual differences associated with basic arithmetic ability (cf. Siegler & Jenkins, 1989). What is again therefore lacking is an attempt to understand the origin of these strategies. We should also note that in this study, performance on the naturalistic test problems was at ceiling level (95% correct). This is not a surprise: They are mostly very easy; a selection is given by Lave (1988, pp. 72–73 and 104–105). Their format – forced choices in which the best value deal must be selected, usually with a clear winner – is very different from the formal arithmetic assessments used, in which exact answers had to be generated. No wonder there were no correlations between these measures.

Moving from American supermarkets to Brazilian street markets (e.g., Nunes, Schliemann, & Carraher, 1993) there are similar findings. In one study (N = 5), calculation strategies for determining market prices – mainly

14. Problem solving and intelligence

337

based on grouping ± had been spontaneously developed by children to ®ll the void caused by an absence of schooling (but how did these develop?). The strategies were executed very well (not surprising given the practice at using them) and success at using these did not correlate with formal written measures of arithmetic ability (again not surprising given the ceiling effect, 98% correct for the market tasks, p. 21). The formal tests were designed to mimic the market tasks in operations and setting, but the poorer performance at them is likely to be due to a lack of generalisation of short-cut strategies, in turn caused by a lack of understanding of basic arithmetic principles. Hence, one plausible hypothesis is that the higher the level of arithmetic skills and understanding ± not the same as years of schooling ± the more rapidly task-speci®c short-cut strategies will be generated, and the more likely that these can be capitalised on in other settings, i.e., generalised (also noted by Schliemann & Carraher, 1992, p. 48; cf. Siegler & Jenkins, 1989). Indeed, in a follow-up study, looking at children who were receiving schooling and working as street vendors, performance at the formal versus market tasks was more evenly matched (p. 38), and the main in¯uence on performance was whether oral versus written procedures were chosen. Presumably, use of the former permitted well-learnt market strategies to be generalised to the formal tests. Also, it seems, written solutions permit a level of abstractedness that can cause dif®culty if the grasp of arithmetical concepts is less than thorough. There was evidence throughout the studies of people who had been taught principles but did not really understand them: not surprising given Nunes et al.'s (1993) criticisms of Brazilian schooling. 
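The way a ceiling effect can mask an underlying association may be illustrated with a small simulation. The sketch below uses entirely hypothetical data, not data from Lave (1988) or Nunes et al. (1993): each simulated person has a latent arithmetic ability and answers a set of test items whose easiness can be varied. The function name `simulate_test`, the logistic response model, and all parameter values are assumptions chosen purely for illustration.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def simulate_test(easiness, n_people=2000, n_items=20, seed=1):
    """Return (mean percent correct, correlation of score with latent ability).

    Hypothetical model: the probability of answering an item correctly is a
    logistic function of (ability + easiness). Larger easiness values push
    everyone towards the ceiling.
    """
    rng = random.Random(seed)
    abilities, scores = [], []
    for _ in range(n_people):
        ability = rng.gauss(0.0, 1.0)
        p_correct = 1.0 / (1.0 + math.exp(-(ability + easiness)))
        score = sum(rng.random() < p_correct for _ in range(n_items))
        abilities.append(ability)
        scores.append(score)
    mean_pct = 100.0 * statistics.fmean(scores) / n_items
    return mean_pct, pearson(abilities, scores)


if __name__ == "__main__":
    for label, easiness in [("moderate items", 0.0), ("very easy items", 4.0)]:
        pct, r = simulate_test(easiness)
        print(f"{label}: {pct:.1f}% correct, r with ability = {r:.2f}")
```

Under these assumptions, the score on moderately difficult items tracks ability closely, whereas on very easy items nearly everyone obtains almost the same score and the correlation with ability is markedly weaker, even though ability influences performance in exactly the same way in both cases.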
Overall, data by Schliemann and Carraher (1992) suggest that the strategies learnt by unschooled street traders enable them to perform excellently – outperforming inexperienced people – only as long as they do not stray too far from their domain of experience and preferred presentation. Beyond these, performance is related to the understanding of basic arithmetic concepts.

Other research merely adds confusion by not controlling for task difficulty (see also Roberts, Chapter 1, this volume). For example, Ceci and Roazzi (1994) describe a comparison between school children and street vendors. A Piagetian class inclusion task was compared with performance at calculations and class inclusions at a market (child street vendors) or at a simulated market (schoolchildren). The results initially imply a contextual double-dissociation: School children performed well only at the Piagetian class inclusion task (formal context); street vendors performed well only at the market task (market context). However, (1) the market tasks, unlike the Piagetian task, involved the calculation of multiple purchase prices plus class inclusion judgements, and (2) even for street vendors, the market task was relatively difficult. Together, these imply that in terms of memory demands and processing steps, the market tasks are harder. Street vendors performed relatively well at the market task (60% at highest level) because their learned task-specific strategies were able to function, but poorly at the Piagetian task (23%), indicating a lack of development of relevant domain general understanding (possibly as a result of lack of schooling). Schoolchildren performed well at the Piagetian task (54%), and poorly at the market task (27%), because the latter was the more demanding, and they had no task-specific short-cuts to assist.

What is clear from all these examples is that, in repeatedly encountered settings, in which there is intensive need for similar types of problems to be solved, people are able to capitalise on repeating patterns, developing suites of strategies for solving them. However, despite their apparent ingenuity, these strategies are not necessarily the best. For example, Nunes et al. (1993, pp. 18–19) describe a Brazilian street trader who, when asked to price 10 items, underwent a lengthy decomposition procedure rather than simply adding a zero to the unit price. Clearly the development of a successful strategy is interesting in the face of such adversity – general arithmetic understanding so poor that one of the basic truths of Base 10 counting was not comprehended – but we can speculate that such strategies may be only "just good enough". Being in possession of them, but not of basic arithmetic understanding, could result in difficulty, such as for a complicated transaction in which a determined customer wished to exploit these weaknesses.

We should also note the claimed situation specificity of strategies, a cornerstone of extreme domain specificity.4 Anecdotes abound of people who can use a strategy in one situation, but not another (e.g., Lave, 1988, pp. 65–66). Thus there are reports of people who, when taken away from familiar contexts, not only fail to generalise their own strategies, but fail to apply any worthwhile ones at all, or even to show basic understanding. Fortunately, this failure is by no means a human universal, because such weaknesses would make everyone easy targets for exploitation and fraud.
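The contrast between a task-specific grouping strategy and the Base 10 short-cut for pricing 10 items can be made concrete. The sketch below is hypothetical: it does not reproduce the procedure reported by Nunes et al. (1993), and the function names, the group size of three, and the unit price of 35 are illustrative assumptions.

```python
def price_by_grouping(unit_price, quantity, group_size=3):
    """Build up a total from a known group price, then single items.

    Returns (total, number_of_additions). The group price is assumed to be
    already memorised from repeated sales, so it costs no step to retrieve.
    """
    group_price = unit_price * group_size  # assumed memorised, not computed
    total, items, steps = 0, 0, 0
    # Add whole groups while they still fit within the quantity requested.
    while items + group_size <= quantity:
        total += group_price
        items += group_size
        steps += 1
    # Top up with single items until the quantity is reached.
    while items < quantity:
        total += unit_price
        items += 1
        steps += 1
    return total, steps


def price_ten_by_base_ten(unit_price):
    """Price 10 items with the Base 10 short-cut: append a zero (one step)."""
    return unit_price * 10, 1


if __name__ == "__main__":
    print(price_by_grouping(35, 10))   # (350, 4)
    print(price_ten_by_base_ten(35))   # (350, 1)
```

Both routes reach the same total, but the grouping route needs several running additions, each a chance for error, whereas the short-cut relies on understanding a general property of the number system.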
In the Western world we must evaluate mortgages, pension plans, insurance schemes, credit card and loan terms, mobile phone and other utility contracts, deals on new cars, and extended warranties on electrical goods. What hope is there if, in order to make reasonable decisions, we must develop domain specific strategies to perform every individual task? Street vendors (and farmers, carpenters, construction foremen, and fishermen) in Brazil are undoubtedly good at what they do, and are better at their tasks than schooled but inexperienced people, the latter equipped only with general tools rather than specialist strategies. But are any researchers claiming that basic arithmetic skills and understanding are irrelevant to, and uncorrelated with, making sense of the complexities faced by people in the modern world, or that the financial examples above are unrepresentative, trivial, or irrelevant to most people's lives? Task-specific short-cut strategies do not transfer easily from situation to situation, but a domain general understanding of basic number concepts always will, and acquisition of this at school is predicted, in turn, by domain general intelligence.

4 Situation specificity is not always found, even by the people who expect to find it. For example, Nunes et al. (1993, p. 120) found that Brazilian fishermen were able to apply strategies outside their domain of expertise.

Fishing at the races

Ceci and Liker (1986b) is one of the few naturalistic studies in which the potential importance of domain general intelligence was taken seriously enough to warrant measuring it directly. They identified 30 horse-racing enthusiasts (who had visited a racetrack at least twice a week for at least seven years) and asked them to forecast odds and favourites on the basis of past form. Their predictions were compared with those of professional forecasters. The judgements of 14 people matched the professionals well and they were classed as "experts". Procedurally, the main distinction between experts and nonexperts was the use of a "seven-way interactive model" – a slightly misleading name, as its complexity was nowhere near that of a similar-sounding statistical term from Analysis of Variance, "seven-way interaction". Similarly to other studies, how people acquired the complex strategy was not investigated. Surprisingly, when the IQ scores of the experts versus nonexperts were compared, there was no difference (mean IQ = 99 versus 101 respectively), nor any relationship between IQ and use of the "seven-way interactive model". People with lower IQ were as likely to use the "complex" method as people with higher IQ. As Liker and Ceci (1987, p. 304) put it: "a growing number of . . . studies . . . find individuals with relatively low IQs who are able to perform practical cognitive tasks that surpass the level of complexity that their scores suggest they can handle".
Making absolute judgements of what people can and cannot do on the basis of an IQ score (the absolute barrier model straw man) is a parody of what intelligence researchers attempt, and it is simplistic to draw conclusions concerning the relationship between IQ and problem-solving performance from a single measure, taken years after the commencement of a hobby. Hence, intelligence level could have determined how quickly the most dedicated race-goers were able to learn the algorithm (Regan, 1987). More acerbically, Detterman and Spry (1988) noted that IQ did make a prediction; it was negatively correlated with the number of years spent attending races (r = −.42). Hence the more intelligent people realised that they were wasting their time trying to "beat the system", and gave up. Another difficulty, suggested by Brody (1992), is that Ceci and Liker assumed a high degree of cognitive complexity for their "seven-way interactive model", but how complicated is the model in relative terms? Low-IQ people can become expert handicappers, but could they just as easily become expert doctors, physicists, or chess players? Without a formal task analysis, we cannot say. Furthermore, if a person with a low IQ eventually completed medical school, would the knowledge acquired be applied effectively?

Brody also suggested that the skill of matching forecasters is a rather peculiar incidental expertise to acquire. If a person wishes to beat the system and make money by gambling, then it is necessary to spot the horses that are undervalued, using different procedures to the professionals. Although Ceci and Liker (1986b) asserted that the handicapping skill is needed in order to make money, their claim was not substantiated, for example by comparing the winnings of the participants. The most surprising finding of all, which gives cold comfort to people who deny the importance of domain general ability, is that nothing distinguished the experts from the nonexperts. The usual claimed alternative to intelligence as a predictor of problem-solving skill is motivation and practice, but the experts had attended the track for 15 years on average, whereas the nonexperts had attended for 17 years. What then accounted for the individual differences? This study is held by many to demonstrate that intelligence is of limited utility as a predictor of problem-solving success in the real world, but it equally demonstrates that experience is of limited utility. Ultimately, this study refutes both of the major theories of problem-solving performance.

Problem solving by anomalous savants

Another argument against a domain general intelligence, necessary for problem solving, comes from the study of anomalous savants (also known by the less politically correct name, idiots savants). Here, people with intelligence levels that are moderately to severely below the normal range are nonetheless capable of exceptional problem-solving performance in an isolated domain (see Howe, 1989; Miller, 1999). These are often in artistic, musical, or mathematical domains, one example of the latter being calendrical calculators (e.g., Spitz, 1995). The underlying argument is two-pronged: If the problem-solving skills of anomalous savants were displayed by normal people, (1) these would be accepted as examples of high intelligence, but (2) because the measured intelligence of anomalous savants is low, their achievements seem to be occurring without the need for high intelligence. Therefore, intelligence is not necessary for skilled problem solving (e.g., Howe, 1989).5 Overall, the claim can go one stage further than Ceci and Liker (1986b). Previously, it was asserted that people with high versus low intelligence could be equally skilled. Now it can be asserted that people with low intelligence can, in particular domains, be more skilled than highly intelligent people could ever be.

5 Sometimes it is proposed that anomalous savant skills are evidence for the preservation of innate modules, although the skills are generally too specialist to sit comfortably with this. For example, taking calendrical calculation, there is little evidence for any of the other skills that might be expected had a likely candidate module (e.g., number) been preserved. Instead, general performance is more in line with what would be expected given the intelligence level (e.g., O'Connor & Hermelin, 1990).

The conclusion about the non-necessity of domain general intelligence is again dependent on an absolute barrier model. Even if this position were tenable, such arguments would require a formal understanding of (1) task complexity, along with (2) the relationship between knowledge, intelligence, and strategy acquisition (e.g., Roberts & Newton, 2005). Such analyses are absent from the literature, or are at best merely anecdotal. In reality, any notional intelligence-based barrier to performance is much more flexible than is assumed, and represents a complex relationship between what can be achieved in practical terms, in relation to the amount of time and effort that can realistically be expended.

We also need to tackle the informal definition of intelligent behaviour above, which entails an assessment dependent on the end-product of skill acquisition. The process of acquisition has been neglected, but is crucial to interpretation. If we met an apparently normal person with surprising calendrical calculation ability, our assessment should be on the basis of how the necessary algorithms had been acquired. If this had been incidental, acquired while performing other tasks, such as setting up meetings, we would surely be impressed at how effortlessly this person was able to identify sound rules and make appropriate generalisations (skills that are necessary for solving intelligence test items such as Raven's Progressive Matrices). On the other hand, if this same person had acquired the skills as a result of months of studying calendars to the exclusion of all other activities, we would surely revise our judgement.
There is much evidence to suggest that the domain specific problem-solving skills of anomalous savants are achieved as a result of months or even years of study (e.g., Ericsson & Faivre, 1988; Miller, 1999; Nettelbeck & Young, 1996) in a wide range of domains. Indeed, Howe (1989, p. 165) appears to reach this conclusion. The acquisition of skills may also be supported by specific abilities less retarded than general intelligence (e.g., Heaton & Wallace, 2004; Miller, 1999; Nettelbeck & Young, 1996), and the rules and principles are likely to be acquired implicitly, i.e., without conscious awareness (Spitz, 1995). Overall, the problem-solving skills of anomalous savants are entirely consistent with resulting from acquired expertise (see also Happaney & Zelazo, Chapter 11, this volume). For example, Heavey, Pring, and Hermelin (1999) found that the memory of calendrical calculators versus matched controls exactly mimicked patterns of expertise findings (see below), with nonexceptional general memory, but superior memory for dates that had recently been calculated as answers. Furthermore, the acquisition of such skills can be facilitated by other difficulties associated with the mental deficits, leading to unusual levels of interest in the otherwise mundane (e.g., Howe, 1989). However, certain artistic, perceptual, and musical skills, and unusual levels of memory appear to be due to lack of interference by a central processor, for example leading to surface qualities being directly attended to, rather than their meaning.

Despite anomalous savants demonstrating intensive study and practice within their domains, are their capabilities more than we might expect given their intelligence? This cannot be answered because of the lack of appropriate controls. If individuals with normal or high intelligence studied calendars as obsessively as calendrical calculators, and with as few day-to-day distractions (for example, the need to earn a living or run a household), would they learn calendar regularities and identify algorithms more rapidly than individuals with low intelligence? For arguments against domain general intelligence to have any weight, the answer would have to be no: There should either be no differences, or (for an exceptional module account) the individuals destined to become anomalous savants should learn faster than the rest. There is just one reported attempt by a highly intelligent individual to learn calendar calculation algorithms (see Ericsson & Faivre, 1988; Spitz, 1995), eventually automating them and performing as quickly as anomalous savants, but all that we can conclude from this is that nothing has been ruled out.

From investigations of anomalous savants, we can reject the absolute barrier model, although this is a conceptualisation that should not be taken seriously. Assuming that the skills of anomalous savants, such as solving calendrical calculation problems, represent genuine problem-solving behaviour, we can accept that problem solvers do not need to be intelligent, in the sense that high levels of domain general ability are not required. However, these studies provide no evidence to contradict the notion that people with higher intelligence will be better at learning and problem solving in general than people with lower intelligence.
Indeed, among anomalous savants, level of problem-solving performance within the domain is related to measured intelligence, as is the extent to which the skill can be generalised beyond this, especially when intelligence is measured nonverbally (see Miller, 1999). This is important because intelligence is often conceptualised in terms of a domain general ability to understand, generalise, and transfer knowledge and skills. All too often, for anomalous savants, a problem-solving skill is acquired, but is rarely understood or generalised, and hence the range of skill and understanding demonstrated is exactly what would have been expected given the level of intelligence. Add to this that there are difficulties in measuring intelligence accurately among the mentally handicapped (for example, Anderson, O'Connor, & Hermelin, 1998, found that, depending on the assessment method used, one such person had an IQ of around 74, or 140), and we see that the case against domain general intelligence from the anomalous savant literature is shaky to say the least. Furthermore, savant skills are much rarer where IQ is very low (e.g., Miller, 1999; Nettelbeck & Young, 1996).

Overall, we can form reasonable generalisations of what people with particular levels of intelligence might normally be expected to achieve (see Gottfredson, Chapter 17, this volume), but with the caveat that either higher or lower than average levels of motivation and practice might buck the expectations, especially when the level of application is zero for people with high intelligence, or becomes obsessive for people with low intelligence. Anomalous savants do not provide counterevidence to the notion of a domain general intelligence. Indeed, it is not even clear that a more elaborate domain-general-intelligence-plus-modules theory (cf. Anderson, 1992) is required in order to account for them.

Expertise with motivation and practice, irrespective of intelligence?

Many of the studies discussed here have effectively investigated expert–novice differences in problem solving. Hence, comparisons have been made between people with appropriate domain specific knowledge ("experts") versus people without ("novices"). The claim has been that experience, not differences in domain general intelligence, distinguishes the two types of person. Indeed, this claim has also been made for the more traditional domains of expertise, such as chess, physics, and medicine. The requirement of knowledge (and hence practice) for expertise is not disputed here, nor are the major findings. Comparing experts versus (1) complete novices and (2) people in the process of becoming experts, the knowledge of experts is highly structured, enabling them to tackle problems in their domains at the most appropriate level, apply appropriate strategies, and not be sidetracked by irrelevant details. As a by-product of expertise, they show domain-dependent memory effects, such that domain-relevant material is particularly well remembered compared with novices (e.g., Ericsson & Charness, 1994; Ericsson & Lehmann, 1996), although it is now known that randomised domain material (such as chess positions) does not necessarily eliminate this advantage (Gobet & Simon, 1996). However, the supposed equivalence of all domains of expertise, whether physics, medicine, chess, music, sport, or soap operas, in terms of ease and process of skill acquisition, remains conjecture.
So far so good, but expertise researchers seek to go further, asserting that individual differences in domain general intelligence are irrelevant to expertise acquisition and application. All that is needed is sufficient (and deliberate; Ericsson et al., 1993) practice for 10 years, and elite performance in any domain is a certainty. Hence the quote by Simon (1990) earlier. A clear understanding of the relationship between general cognitive ability, practice, and problem-solving performance is not facilitated by the straw-man model being tested: the absolute talent model. The credence given to this suggests that expertise researchers believe that their conclusions are tenable because they are working with a (false) dichotomy: Either practice accounts for performance, or talent, but not both. However, for people who do not subscribe to this, our knowledge that even Mozart had to practise does not preclude that he had a domain general ability head start. Indeed, although he was hot-housed by his father, so was his elder sister (Nannerl; see Howe, 1990, p. 22), whom few people have heard of. Expertise researchers therefore also have to dispose of the predictive validity of intelligence tests, which they generally do by citing a handful of ostensibly null results (e.g., Ericsson & Charness, 1994, p. 730).

Given the explicit and adamant denial of the importance of domain general ability for acquiring or using expertise, one might expect intelligence tests to be an integral part of expertise studies, along with longitudinal attempts to track expertise development, and identify whether some people acquire this more rapidly than others, trying to understand those who fail to improve sufficiently well, eventually dropping out. Unfortunately, such studies appear to be nonexistent. Instead, the methodologies used (cross-sectional studies, diaries, retrospective reports, and biographies – sometimes of people dead for hundreds of years) are too crude to enable us to evaluate the intelligence–expertise acquisition/use relationship (see also Waters, Gobet, & Leyden, 2002). Biographies, in particular, are inconclusive. They may show that even Mozart had to practise, but it is invalid to infer from this that anyone can be a Mozart. The "10 year rule" mentioned earlier refers to people who succeeded in becoming experts, a self-selected group, for whom their steady improvement with practice was, presumably, sufficient to indicate that proceeding was worthwhile. Had the dropouts persevered, would they have taken 15 or 20 years to reach the same level of expertise? This concern and others have been discussed by Sternberg (1996), who chastises expertise researchers for their reluctance to acknowledge contradictory findings, and for ignoring, and hence failing to explain, why it should be the case that the best performers are not necessarily those who practised the most (see also Simonton, Chapter 15, this volume, for a discussion of this and other anomalous expertise effects).
We should also note that general motivation, the ubiquitous alternative for explaining away the general predictive powers of intelligence test scores, is itself a problematic construct. Although motivation is Howe's (1990) preferred explanation, it fails his own requirement for explanatory power in exactly the same way as he alleges intelligence does (Howe, 1988): Why do people perform well? Because they were motivated. How do we know they were motivated? Because they perform well. Add to this Howe's (1990, p. 187) claim that "determination to succeed can sometimes be counterproductive", and we have a dangerously unfalsifiable framework, in which people who succeed are those who are motivated and practise in the right way (Ericsson et al., 1993), whereas people who have failed to succeed, despite their attempts, might have been too motivated, or practised in the wrong way, or both.

In fact, Gobet, de Voogt, and Retschitzki (2004) have concluded that there does seem to be an association between intelligence and skill at adversarial board games such as chess. Overall, we should not be too surprised that intelligence test score might predict the rate of expertise acquisition, and the eventual end point reached – especially in traditional domains such as chess and physics – because intelligence test score predicts precisely this at school and at work (e.g., Gottfredson, 2002; Jensen, 1998; Schmidt & Hunter, 1998; see also Brody, Chapter 18; Gottfredson, Chapter 17, this volume). School attendance, compulsory in the Western world for much of childhood, means that there is little attenuation of range, and correlations between measured intelligence and outcome are particularly substantial, with median predictive validities of around .5 (e.g., Jensen, 1998). Observed values are higher the earlier children are tested because there is less restriction in range of IQ. In terms of achievement at school subjects, intelligence test score predicts expertise acquisition, measured at the completion of compulsory schooling, and also the extent to which optional higher levels of expertise are attempted and succeeded at (e.g., Jensen, 1998). This persistent finding is rarely even acknowledged by expertise researchers. Even when predicting occupational success, correlations are nonzero, and the very low predictive validities claimed by some for intelligence tests (e.g., around .2 by Ericsson et al., 1993, p. 364) appear to be due to misinterpretations of older studies (see Gottfredson, 2003) and are almost certainly underestimates. Hence, whether a person attains the classification expert physicist in terms of school or professional achievement, a lack of any predictive validity of intelligence test score would be a most peculiar exception to the rule of general predictive validity (see also Gottfredson, 2002).
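The effect of restriction of range on an observed correlation can be illustrated with a brief simulation. The sketch below uses hypothetical standardised scores rather than real test data: a population correlation of .5 is assumed (echoing the median validity mentioned above), and the "selected" group consists of simulated people more than one standard deviation above the mean. The function name `simulate_selection` and every parameter value are assumptions made for illustration.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def simulate_selection(rho=0.5, cutoff=1.0, n_people=5000, seed=7):
    """Return (r in the full sample, r among those scoring above the cutoff).

    Test score and outcome are standard normal with population correlation
    rho; selection keeps only people whose test score exceeds the cutoff.
    """
    rng = random.Random(seed)
    test, outcome = [], []
    for _ in range(n_people):
        z = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        test.append(z)
        outcome.append(rho * z + math.sqrt(1.0 - rho ** 2) * noise)
    r_full = pearson(test, outcome)
    selected = [(x, y) for x, y in zip(test, outcome) if x > cutoff]
    r_restricted = pearson([x for x, _ in selected], [y for _, y in selected])
    return r_full, r_restricted


if __name__ == "__main__":
    r_full, r_restricted = simulate_selection()
    print(f"full range: r = {r_full:.2f}; selected group: r = {r_restricted:.2f}")
```

Under these assumptions the correlation within the selected group is noticeably smaller than in the full sample, although the underlying relationship between test score and outcome is identical for everyone. This is one reason why low validities observed in highly selected groups are likely to be underestimates.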

High intelligence isn't necessary for problem solving, but it always helps

Are people with high intelligence always better at problem solving than people with low intelligence? Not necessarily, and no researcher would ever expect this to be the case. However, all other things being equal, on average people with higher intelligence will be at an advantage in problem solving compared with people with lower intelligence. In support of this is the number of times that scores at intelligence tests have correlated with performance in the laboratory, and in real-life settings.

To deny any association between intelligence and problem-solving performance, a number of studies are traditionally cited. However, these constitute the null hypothesis, and there are two reasons why researchers may fail to reject this: because (1) it is true for all intents and purposes, or (2) the research was inappropriately designed or implemented. It is easy to fail to find a relationship between intelligence and problem-solving performance if any of the following errors are made: inadequate data; inadequate design; nonisomorphic comparison tasks; obsolete theoretical approaches; neglect of developmental trajectories; and the refutation of straw-man theories. Recurring difficulties include cross-sectional comparisons by classification variable (e.g., experts versus novices) and the neglect of contemporary theories of strategy development (e.g., how are strategies acquired? Who acquires them the fastest?). Overall, in conjunction with the predictive validities of intelligence test scores, the importance of domain general cognitive ability survives intact.

All too often, attempts to break the association between intelligence and problem solving assume a straw-man hypothesis. The absolute talent model is not taken seriously by intelligence researchers. Instead, a more subtle model must be tested in which domain general cognitive ability influences (1) the acquisition of relevant domain specific problem-solving skills, and (2) the use of domain general problem-solving procedures when encountering novel problems in circumstances where domain specific skills are insufficient (e.g., Hunter, 1986). The absolute barrier model is also not taken seriously by intelligence researchers. Instead, a more appropriate conceptualisation, and one that does not place an implicit restriction on what can be achieved, is to think of intelligence as facilitating the performance of complex tasks: The higher the level of intelligence, the more easily a task can be performed. If a person chooses to solve a problem, study hard at school, train for a career, or pursue a hobby, then he or she will be at an advantage with high intelligence compared with low intelligence (see also Gottfredson, 2002, for a conceptualisation of this in terms of "can do", "will do" and "have done" variables).

Much research remains to be done in order to understand fully the link between intelligence and problem solving, both in and out of the psychology laboratory, but there is little reason to doubt that such a relationship exists. In general, people exhibit consistent domain general differences in their reasoning, problem-solving, and learning ability, with important consequences for real life.

Acknowledgements

The author is grateful to Nat Brody, Dean Keith Simonton, and Linda Gottfredson for comments on an earlier draft.

References

Ackerman, P. L. (1992). Predicting individual differences in complex skill acquisition: Dynamics of ability determinants. Journal of Applied Psychology, 77, 598–614.
Ackerman, P. L., & Heggestad, E. D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121, 219–245.
Anderson, M. (1992). Intelligence and development. Oxford: Blackwell.
Anderson, M., O'Connor, N., & Hermelin, B. (1998). A specific calculating ability. Intelligence, 26, 383–403.
Barrick, M. R., & Mount, M. K. (1991). The big five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1–26.
Blinkhorn, S., & Johnson, C. (1990). The insignificance of personality testing. Nature, 348, 671–672.

Brehmer, B., & Dörner, D. (1993). Experiments with computer-simulated microworlds: Escaping both the narrow straits of the laboratory and the deep blue sea of the field study. Computers in Human Behavior, 9, 171–184.
Brody, N. (1992). Intelligence. San Diego, CA: Academic Press.
Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices Test. Psychological Review, 97, 404–431.
Ceci, S. J., & Liker, J. K. (1986a). Academic versus non-academic intelligence: An experimental separation. In R. J. Sternberg & R. K. Wagner (Eds.), Practical intelligence: Nature and origins of competence in the everyday world (pp. 119–142). Cambridge: Cambridge University Press.
Ceci, S. J., & Liker, J. K. (1986b). A day at the races: A study of IQ, expertise and cognitive complexity. Journal of Experimental Psychology: General, 115, 255–266.
Ceci, S. J., & Roazzi, A. (1994). The effects of context on cognition: Postcards from Brazil. In R. J. Sternberg & R. K. Wagner (Eds.), Mind in context: Interactionist perspectives on human intelligence (pp. 74–101). Cambridge: Cambridge University Press.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorisation and representation of physics problems by experts and novices. Cognitive Science, 5, 121–152.
Copeland, B. J. (1993). Artificial intelligence: A philosophical introduction. Oxford: Blackwell.
Crowley, K., Shrager, J., & Siegler, R. S. (1997). Strategy discovery as a competitive negotiation between metacognitive and associative mechanisms. Developmental Review, 17, 462–489.
Detterman, D. K., & Spry, K. M. (1988). Is it smart to play on the horses? Comment on "A day at the races: A study of IQ, expertise and cognitive complexity" (Ceci & Liker, 1986). Journal of Experimental Psychology: General, 117, 91–95.
Dierckx, V., & Vandierendonck, A. (2005). Adaptive strategy application in linear reasoning. In M. J. Roberts & E. J. Newton (Eds.), Methods of thought: Individual differences in reasoning strategies (pp. 107–127). Hove, UK: Psychology Press.
Dörner, D., & Schaub, H. (1994). Errors in planning and decision making and the nature of human information processing. Applied Psychology: An International Review, 43, 433–453.
Ericsson, K. A., & Charness, N. (1994). Expert performance: Its structure and acquisition. American Psychologist, 49, 725–747.
Ericsson, K. A., & Faivre, I. A. (1988). What's exceptional about exceptional abilities? In L. Obler & D. Fein (Eds.), The exceptional brain (pp. 436–474). New York: Guilford Press.
Ericsson, K. A., Krampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100, 363–406.
Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance: Evidence of maximal adaptation to task constraints. Annual Review of Psychology, 47, 273–305.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: The role of domain specific representations and inferences in the Wason selection task. Cognition, 77, 1–79.


Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1–38.
Gobet, F., & Simon, H. A. (1996). Recall of rapidly presented random chess positions is a function of skill. Psychonomic Bulletin & Review, 3, 159–163.
Gobet, F., de Vooght, A., & Retschitzki, J. (2004). Moves in mind: The psychology of board games. Hove, UK: Psychology Press.
Gonzalez, C., Thomas, R. P., & Vanyukov, P. (2005). The relationships between cognitive ability and dynamic decision making. Intelligence, 33, 169–186.
Gottfredson, L. S. (2002). g: Highly general and highly practical. In R. J. Sternberg & E. L. Grigorenko (Eds.), The general factor of intelligence: How general is it? (pp. 331–380). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Gottfredson, L. S. (2003). Dissecting practical intelligence theory: Its claims and evidence. Intelligence, 31, 343–397.
Gould, S. J. (1997). The mismeasure of man. London: Penguin.
Heaton, P., & Wallace, G. L. (2004). Annotation: The savant syndrome. Journal of Child Psychology and Psychiatry, 45, 899–911.
Heavey, L., Pring, L., & Hermelin, B. (1999). A date to remember: The nature of memory in savant calendrical calculators. Psychological Medicine, 29, 145–160.
Howe, M. J. A. (1988). Intelligence as an explanation. British Journal of Psychology, 79, 349–360.
Howe, M. J. A. (1989). Fragments of genius. London: Routledge.
Howe, M. J. A. (1990). The origins of exceptional abilities. Oxford: Blackwell.
Hunter, J. E. (1986). Cognitive ability, cognitive aptitudes, job knowledge, and job performance. Journal of Vocational Behavior, 29, 340–362.
Jensen, A. R. (1998). The g factor. Westport, CT: Praeger.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980). Models of competence in solving physics problems. Cognitive Science, 4, 317–348.
Lave, J. (1988). Cognition in practice. Cambridge: Cambridge University Press.
Liker, J. K., & Ceci, S. J. (1987). IQ and reasoning complexity: The role of experience. Journal of Experimental Psychology: General, 116, 304–306.
Miller, L. K. (1999). The savant syndrome: Intellectual impairment and exceptional skill. Psychological Bulletin, 125, 31–46.
Nettlebeck, T., & Young, R. (1996). Intelligence and savant syndrome: Is the whole greater than the sum of the fragments? Intelligence, 22, 49–67.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Nunes, T., Schliemann, A. D., & Carraher, D. W. (1993). Street mathematics and school mathematics. Cambridge: Cambridge University Press.
O'Connor, N., & Hermelin, B. (1990). The recognition failure and graphic success of idiot-savant artists. Journal of Child Psychology and Psychiatry, 31, 203–215.
Rabbitt, P. M. A., Banerji, N., & Szemanski, A. (1989). Space Fortress as an IQ test? Predictions of learning and of practised performance in a complex videogame. Acta Psychologica, 71, 243–257.
Raven, J., Raven, J. C., & Court, J. H. (1993). Manual for Raven's Progressive Matrices and Mill Hill Vocabulary Scales. Oxford: Oxford Psychologists Press.
Regan, R. T. (1987). Complexity of IQ: Comment on Ceci and Liker (1986). Journal of Experimental Psychology: General, 116, 302–303.
Richardson, K. (1991a). Reasoning with Raven – in and out of context. British Journal of Educational Psychology, 61, 129–138.


Richardson, K. (1991b). Understanding intelligence. Milton Keynes, UK: Open University Press.
Rigas, G., Carling, E., & Brehmer, B. (2002). Reliability and validity of performance measures in microworlds. Intelligence, 30, 463–480.
Roberts, M. J., Gilmore, D. J., & Wood, D. J. (1997). Individual differences and strategy selection in reasoning. British Journal of Psychology, 88, 473–492.
Roberts, M. J., & Newton, E. J. (2003). Individual differences in the development of reasoning strategies. In D. Hardman & L. Macci (Eds.), Thinking: Psychological perspectives on reasoning, judgment, and decision making (pp. 23–43). Chichester, UK: Wiley.
Roberts, M. J., & Newton, E. J. (Eds.). (2005). Methods of thought: Individual differences in reasoning strategies. Hove, UK: Psychology Press.
Rose, S., Lewontin, R. C., & Kamin, L. J. (1990). Not in our genes. London: Penguin.
Schiano, D. J., Cooper, L. A., Glaser, R., & Zhang, H. C. (1989). Highs are to lows as experts are to novices: Individual differences in the representation and solution of standardised figural analogies. Human Performance, 2, 225–248.
Schliemann, A. D., & Carraher, D. W. (1992). Proportional reasoning in and out of school. In P. Light & G. Butterworth (Eds.), Context and cognition (pp. 47–73). New York: Harvester-Wheatsheaf.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.
Schoenfeld, A. H. (1987). What's all the fuss about metacognition? In A. H. Schoenfeld (Ed.), Cognitive science and mathematics education (pp. 189–215). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Schunn, C. D., & Reder, L. M. (2001). Another source of individual differences: Strategy adaptivity to changing rates of success. Journal of Experimental Psychology: General, 130, 59–76.
Scribner, S. (1984). Studying working intelligence. In B. Rogoff & J. Lave (Eds.), Everyday cognition: Its development in social context (pp. 9–40). Cambridge, MA: Harvard University Press.
Scribner, S. (1986). Thinking in action: Some characteristics of practical intelligence. In R. J. Sternberg & R. K. Wagner (Eds.), Practical intelligence: Nature and origins of competence in the everyday world (pp. 13–30). Cambridge: Cambridge University Press.
Siegler, R. S. (1999). Strategic development. Trends in Cognitive Sciences, 3, 430–435.
Siegler, R. S., & Jenkins, E. A. (1989). How children discover new strategies. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Siegler, R. S., & Svetina, M. (2002). A microgenetic/cross-sectional study of matrix completion: Comparing short-term and long-term change. Child Development, 73, 793–809.
Simon, H. A. (1990). Invariants of human behavior. Annual Review of Psychology, 41, 1–21.
Snow, R. E., Kyllonen, P. C., & Marshalek, B. (1984). The topography of ability and learning correlations. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 2, pp. 47–103). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.


Spitz, H. H. (1995). Calendar calculating idiots savants and the smart unconscious. New Ideas in Psychology, 13, 167–182.
Spitz, H. H. (1999). Beleaguered Pygmalion: A history of the controversy over claims that teacher expectancy raises intelligence. Intelligence, 27, 199–234.
Stanovich, K. E., & West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences, 23, 645–665.
Sternberg, R. J. (1996). Costs of expertise. In K. A. Ericsson (Ed.), The road to excellence (pp. 347–354). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Sternberg, R. J., Wagner, R. K., Williams, W. M., & Horvath, J. A. (1995). Testing common sense. American Psychologist, 50, 912–927.
Waters, A. J., Gobet, F., & Leyden, G. (2002). Visuo-spatial abilities in chess players. British Journal of Psychology, 30, 303–311.
Wood, D. J. (1978). Problem solving – the nature and development of strategies. In G. Underwood (Ed.), Strategies in information processing (pp. 329–356). London: Academic Press.

15 Creativity: Specialised expertise or general cognitive processes?

Dean Keith Simonton

Could Albert Einstein have been a Pablo Picasso? Were the thought processes that led to the creation of the special theory of relativity essentially the same as those responsible for the emergence of Guernica? Or is scientific creativity so different from artistic creativity that the two minds operated in totally different worlds? Indeed, is creative problem solving even more specialised than this distinction implies? For instance, were the cognitive operations of Picasso, Igor Stravinsky, Henrik Ibsen, and Martha Graham all as divergent from each other as they were from those of Einstein? Is the creative process so domain specific that it is necessary to speak of separate mental processes for art, music, literature, and dance? Or are there one or more information-processing procedures that cut across all forms of creativity no matter what the domain? These questions represent one of the big debates in the theoretical and empirical research on creative problem solving (see e.g., Diakidoy & Spanoudis, 2002; Kaufman & Baer, 2002; Plucker, 2004). As often happens in intellectual controversies, many researchers have taken extreme positions on this issue. This chapter opens with an overview of these and then closes with an attempt at a reconciliation that avoids the extremism of either stance.

The debate

Some argue that creativity constitutes a general cognitive process. Just as memory would presumably function the same way in Einstein as it does in Picasso, so would creativity. Others take the position that the creative process is highly domain specific, so much so that there are very few commonalities across different domains. Let me discuss each position in more detail.

Domain generality

Most of the early research on the creative process assumed that it consisted of one or more mental operations that are ubiquitous in all forms of creativity. A classic illustration is the work of Wallas (1926), who argued that creativity entailed four major steps: preparation, incubation, illumination, and verification. Presumably this sequence was universal, transcending the specific domain of application. In any case, these stages were inferred from the examination of the introspective reports of highly creative individuals. Other investigators have used more mainstream methods to identify the general operations or abilities that supposedly underlie creativity in a diversity of domains. These methods fall into two major categories: experimental and psychometric.

Experimental research

The Gestalt psychologists were the first to apply experimental methods to the study of creativity (Wertheimer, 1945/1982). They argued that the key process entailed a perceptual or conceptual restructuring of the problem. The processes underlying successful problem solving were so broadly applicable that they could even be discerned in the behaviour of chimpanzees (Köhler, 1925). The cross-species generality of the creative process was also affirmed by behaviourists who tackled this problem, most notably B. F. Skinner and his students (Epstein, 1991). In this case, creativity operated according to a variation-selection or "trial-and-error" mechanism, the same mechanism that provides the basis for operant conditioning in both human and nonhuman animals (Epstein, 1990). With the advent of cognitive science, creativity was viewed as a form of problem solving, and experimental psychologists scrutinised the latter phenomenon to determine the basic mental processes involved (Newell, Shaw, & Simon, 1958; Newell & Simon, 1972). Among those processes were a set of heuristics that could facilitate the discovery of solutions in a wide range of situations (e.g., means–end analysis and hill climbing).
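To make one of these heuristics concrete, the following is a minimal sketch of hill climbing in Python. It is an illustration only, not a reconstruction of any program cited in this chapter; the function names and the toy scoring problem are assumptions introduced here. The point it shows is that the procedure itself contains no domain knowledge: the domain enters only through the scoring function and the neighbour function that are passed in.

```python
def hill_climb(score, start, neighbours, max_steps=1000):
    """Greedy hill climbing: keep moving to the best-scoring neighbour
    until no neighbour improves on the current state."""
    current = start
    for _ in range(max_steps):
        best = max(neighbours(current), key=score, default=current)
        if score(best) <= score(current):
            return current  # local optimum: no neighbour scores higher
        current = best
    return current


# Hypothetical toy problem: find the integer x that maximises -(x - 7)^2.
def toy_score(x):
    return -(x - 7) ** 2


def toy_neighbours(x):
    # The only moves available from x are one step down or one step up.
    return [x - 1, x + 1]


result = hill_climb(toy_score, start=0, neighbours=toy_neighbours)
```

Because the search only ever accepts improvements, it can stall on a local optimum in less well-behaved problems, which is one reason such heuristics cannot guarantee a solution.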
Although these cognitive psychologists had much less interest in showing that the same mental operations functioned in nonhuman thinking, they were often committed to implementing these operations on computers. There thus arose "general problem solvers" and "discovery programs" that purported to display creativity in a variety of domains (e.g., Shrager & Langley, 1990). An excellent example is the program known as BACON that has rediscovered several important laws and principles cutting across several scientific disciplines (Bradshaw, Langley, & Simon, 1983; Langley, Simon, Bradshaw, & Zythow, 1987). So powerful are these problem-solving methods that they have inspired "self-help books" that purport to enable their readers to become much more adept at creative problem solving (e.g., Hayes, 1989).

Psychometric research

Where experimental psychologists were interested in discerning the cognitive processes behind most forms of problem solving, differential psychologists were applying psychometric techniques to determine whether individual differences in certain general mental capacities provided the basis for cross-sectional variation in creative accomplishments. Long ago, Galton (1869) had maintained that all forms of exceptional achievement, including those involving creativity, were dependent on the possession of an exceptional degree of natural ability. Although his attempts to assess this individual-difference variable were not successful, this hypothesis was later tested using more modern psychometric instruments, such as the Stanford-Binet Intelligence Test (Terman, 1925–1959). Unfortunately, although a low IQ usually precludes the manifestation of creativity, the relation between intelligence and creativity increasingly disappears after an IQ threshold of around 120 (Barron & Harrington, 1981). This lack of strong congruence in the upper ranges helped inspire the development of various tests that more directly gauge individual differences in creative ability (Simonton, 2003). One example is Mednick's (1962) Remote Associates Test (RAT). The fundamental assumption of the RAT is that creativity requires the ability to generate associations between remote ideas. Even more influential are various measures of individual differences in divergent thinking, that is, the capacity to produce numerous and varied responses to a given stimulus (Guilford, 1967). Divergent thought is believed to represent an essential cognitive operation behind all creative acts. So popular have these tests become that they are often referred to as "creativity tests."

Domain specificity

The preceding research notwithstanding, several investigators have challenged the conclusion that creativity represents a domain-free process or set of processes (e.g., Weisberg, 1992). This challenge is based on both negative and positive arguments.
Negative arguments

Proponents of domain specificity argue that their opponents have failed to make a strong empirical case for domain generality. In the case of the psychometric literature, various creativity measures fail to satisfy high standards for validity. For instance, the correlations among alternative measures not only are small, but also are about the same magnitude as their correlations with intelligence tests (McNemar, 1964; see, e.g., Getzels & Jackson, 1962). Hence, they appear to lack convergent and discriminant validity. Even worse, they seem to lack predictive validity as well because the correlations between test performance and overt creative behaviour tend to be relatively modest (Simonton, 2003; see, e.g., Gough, 1979). Accordingly, the intellectual capacities assessed by these instruments, such as remote association and divergent thinking, may have relatively little to do with real-life creativity.

The relevance of the experimental literature could likewise be challenged. Most of the original research conducted by Gestalt, behaviourist, and cognitive psychologists used relatively simple problems, especially "insight" problems that required no special knowledge (Sternberg & Davidson, 1995). Hence, the work was implicitly dedicated to discerning the cognitive or behavioural operations that underlie the solution of riddles, puzzles, and the like. Yet creativity in the real world does not operate this way. To make creative contributions to a given domain presupposes that the individual first acquires the knowledge and skills that define that domain (Ericsson, 1996b). Einstein could not have been a Picasso because he never devoted any effort to mastering painting and sculpture, just as Picasso could not have been an Einstein because he took no courses and received no training in physics and mathematics.

Positive arguments

In line with this last argument, proponents of domain specificity have conducted research establishing the empirical validity of the so-called "10-year rule" (Ericsson, 1996a; Hayes, 1989). According to this principle, the capacity for exceptional achievement requires that an individual devote at least a decade to extensive training and practice. During this phase of expertise acquisition the individual acquires knowledge about the key problems and solution techniques that characterise a particular domain of creative activity. Although some components of this expertise may be present in closely related domains, a significant proportion is likely to be domain specific. Hence, Einstein's expertise consisted of analytical strategies that could help him resolve the paradoxes that led to his relativity theories, but these same skills would prove useless were he to decide to paint his own version of Guernica.
Indeed, it is probable that the bulk of the expertise that Einstein acquired would not even prove useful were he to venture into chemistry, biology, or geology. Likewise, Picasso's prodigious expertise in the visual arts would prove useless not just in theoretical physics, but in literature and music besides.

One interesting feature of this interpretation is that creativity becomes entirely the consequence of highly specific environmental experiences. As a result, creative ability is most likely attributable to nurture rather than nature. Even the most outstanding genius is made, not born (Howe, 1999). Anyone can become a creative genius given the right environment and sufficient practice. This inference ensues from the assumption that genetic endowment is very unlikely to be domain specific (e.g., there exists no gene or even set of genes peculiar to, say, theoretical physics). In contrast, were creativity the function of capacities that are not tied to any single domain, then it is at least possible that some subset of these capacities has non-zero heritability coefficients (Simonton, 1999b). For instance, to the extent that creativity depends on general intelligence, it would be subject to some degree of biological inheritance.

Reconciliation

To reconcile these two opposing positions requires that I accomplish two tasks. First, I must show that the argument for domain specificity is not as strong as advocates imply. Second, I must delineate the distinctive contributions of general processes and capacities in creative thought.

Problems with conceiving creativity exclusively as domain specific expertise

As already noted, advocates of the expertise-acquisition account of creativity have relied on both negative and positive arguments. Both of these have serious flaws that open the way for potential contributions from domain general processes and capacities.

Negative arguments

Although the validity coefficients for various creativity measures are not impressively high, we would not expect them to be otherwise. Aside from certain methodological deficiencies in much of the research (e.g., truncated variation on the variables), creativity is a very complex phenomenon with multiple determinants, some cognitive and others dispositional (Eysenck, 1995; Simonton, 1999a). Because so many variables are involved in the makeup of the creative individual, the contribution of any single factor will be necessarily small. Making matters even more complicated, it is probable that these cognitive and dispositional variables participate in an interactive manner (Eysenck, 1995; Simonton, 1999b). That is, rather than being the additive function of the separate components, creative capacity may be a multiplicative function of them. The multiplicative nature of the function not only leads to a smaller validity coefficient for any single predictor, but also accounts for the highly skewed distribution of creative behaviour (Simonton, 1999b; cf. Simonton, 1997). The multiplicative function is also important because there is evidence that the genetic traits that underlie creativity are also inherited in a non-additive, "emergenic" fashion (Waller, Bouchard, Lykken, Tellegen, & Blacker, 1993).
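The contrast between additive and multiplicative combination can be illustrated with a small simulation. The sketch below is not drawn from the cited studies. It assumes, purely for illustration, six independent traits each drawn uniformly between 0 and 1, and it compares a composite formed by summing them with one formed by multiplying them. Under these assumptions the multiplicative composite should show a lower correlation with any single trait and a markedly right-skewed distribution, while the additive composite stays roughly symmetric.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def skewness(values):
    """Skewness as the mean cubed standardised deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate(n_people=20000, n_traits=6, seed=0):
    """Simulate hypothetical people whose 'creative capacity' is either
    the sum or the product of the same independent traits."""
    rng = random.Random(seed)
    first_trait, additive, multiplicative = [], [], []
    for _ in range(n_people):
        traits = [rng.random() for _ in range(n_traits)]
        first_trait.append(traits[0])
        additive.append(sum(traits))
        multiplicative.append(math.prod(traits))
    return {
        "r_additive": pearson(first_trait, additive),
        "r_multiplicative": pearson(first_trait, multiplicative),
        "skew_additive": skewness(additive),
        "skew_multiplicative": skewness(multiplicative),
    }


results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The number of traits, their distribution, and their independence are all simplifying assumptions; the sketch shows the direction of the two effects, not their real-world size.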
Criticisms of the experimental literature may also be off the mark, albeit for different reasons. Although creative problem solving in the real world presumes the acquisition of sufficient expertise in a given domain, that provision does not mandate that the cognitive operations be domain specific. On the contrary, the expertise may merely provide the content of those mental processes, while the processes themselves remain relatively broad in application. This is not to assert that there are no domain specific problem-solving tactics and strategies. Even so, the existence of such processes does not automatically preclude the participation of more general cognitive operations and capacities. This fact will become more obvious when we turn to the next issue.

Positive arguments

The implications of the 10-year rule are much more ambiguous than first meets the eye (Simonton, 1996, 2000a). In the first place, the empirical research on this principle did not look at individual differences in the relationship between the duration of the expertise-acquisition period and the level of creativity eventually displayed. When investigators attend to this question, surprising results emerge (e.g., Simonton, 1991, 1992). In particular, the level of creativity exhibited by an individual is inversely related to the amount of time devoted to the acquisition of domain specific expertise. In other words, the more productive and influential creators spend less time in such preparation. This makes no sense from the standpoint of the expertise-acquisition explanation. After all, the more knowledge and skill a person masters, the greater should be his or her creativity (Ericsson, 1996a). However, it is possible to interpret this inverse association in terms of a generic human capacity, namely general intelligence. In its original definition, the intelligence quotient (IQ) was the ratio of mental age to chronological age multiplied by 100 (Cox, 1926; Terman, 1925–1959). Although IQ is normally no longer defined in this manner, it still indicates that exceptional intelligence is related to accelerated mastery of knowledge and skills. As a consequence, this inverse relationship may show that those with greater general intelligence master the needed domain specific expertise in less time, and that these same people later become more productive and influential.
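The ratio definition of IQ mentioned above can be written out as a worked formula. The symbols MA (mental age) and CA (chronological age) are shorthand introduced here for illustration; they are not the author's notation.

```latex
\mathrm{IQ} = \frac{\mathrm{MA}}{\mathrm{CA}} \times 100
\qquad\text{e.g.}\qquad
\frac{13}{10} \times 100 = 130
```

On this definition a 10-year-old who performs at the level typical of a 13-year-old receives an IQ of 130, which is why a high ratio IQ can be read as a sign of accelerated mastery.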
This latter inference fits the finding that general intelligence is positively correlated with achieved eminence in domains of creativity (Cox, 1926; Simonton, 1976a; Walberg, Rasher, & Hase, 1978).

This is not the only place where expertise-acquisition does not operate in a manner consistent with the notion that creativity depends heavily on domain specific expertise (Simonton, 1996). Consider the three empirical findings that highly creative contributors to a particular domain (1) display a wide range of interests, including some well outside their speciality area, (2) work on lots of different topics or genres simultaneously rather than specialising in any one for a considerable period of time, and (3) tend to be highly versatile, making contributions even to domains outside their main area of expertise (see, e.g., Root-Bernstein, Bernstein, & Garnier, 1993, 1995; Simonton, 2000a, 2004; Sulloway, 1996; White, 1931). Moreover, the higher the level of eminence attained, the more prominent these three correlates. To illustrate, consider (1) that Einstein would often interrupt his endeavours in theoretical physics to go sailing on the lake or to play Mozart violin sonatas, (2) that Darwin's work on the theory of evolution was intermixed with other inquiries concerning topics as diverse as coral formations, volcanic islands and mountain chains, Welsh and Scottish geology, glacial action, fossil and modern cirripedes, the role of bees in the fertilisation of certain flowers, the role of earthworms in soil maintenance, seed vitality, and half a dozen other questions, and (3) that the major contributions of the following notables cannot be pegged to a single category of creativity, namely, Alberti, Archimedes, Aristotle, Borodin, Buckminster Fuller, Cardano, Descartes, Erasmus, Benjamin Franklin, Galton, Gauss, Goethe, Hooke, Huygens, Hypatia, Leibniz, Lomonosov, Michelangelo, von Neumann, Newton, Pascal, Pasteur, Quételet, Rhazes, Herbert Simon, Thomas Young, Leonardo da Vinci, Voltaire, and Wren. These cases would seem to contradict the idea that the cognitive skills required for creativity are highly specific to particular domains, or even specific to genres or topics within a given domain. Indeed, the results suggest that creativity may be enhanced, not hindered, by a breadth of knowledge and skills.

One final empirical finding concerns the creativity of output across the lifespan. If creativity were simply a matter of acquiring domain specific problem-solving techniques, then we would expect that the impact of creative products should be a positive monotonic function of career age. Perhaps impact might level off in the latter part of the career, as the creator attains complete mastery of all the relevant tactics and strategies, but we certainly would not predict that creativity would decline after some career peak. Yet that is precisely what happens (Simonton, 1988a, 1997).
Worse yet, the best way to revive creativity during such a decline is to work on genres or topics that are tangential to the main course of creative activity (Simonton, 2000a). To broaden the scope of one's endeavours is more likely to maintain creativity than to narrow the focus on more specialised genres or topics.

Domain-free processes and capacities in creativity

I have already suggested one way that a general ability might be linked to creativity: General intelligence is very likely directly involved in the acquisition of domain specific expertise. Not only would higher intelligence be associated with accelerated acquisition, but also a threshold level of intelligence may be required for mastery to be attainable. For instance, it is very unlikely that someone has the capacity to become a creative scientist without an IQ of at least 120, and for certain disciplines, such as theoretical physics, the minimum level would be closer to 130 (Gibson & Light, 1967; Roe, 1953). There are yet other ways that domain-free factors are involved in creativity. Below I examine these with respect to experimental and psychometric research.


Experimental literature

Earlier I mentioned heuristics as general processes that can be applied to problems in a diversity of domains. These are sometimes called "weak methods" to distinguish them from "strong methods" (Klahr, 2000). Strong methods operate in a more algorithmic manner, almost invariably guaranteeing a solution. Strong methods are among the most important tools and skills that an individual acquires when obtaining expertise in a particular domain. In other words, strong methods tend to be highly domain specific. So if such methods are so effective, why would a creator even want to resort to weak methods? Why would some heuristic be called upon when a tried and tested algorithm will assure a solution?

The answer to this question is the key to appreciating the role of domain-free thinking in creativity. The kinds of problem that can be solved using strong or algorithmic methods tend not to be the kinds of problem that demand a high degree of creativity. Quite the contrary, they tend to be the "routine" or "run-of-the-mill" problems that can be solved by anyone who has acquired the same expertise in the same domain. Because an essential component of creativity is that the product or idea be original, these solutions are not going to be seen as being as creative as solutions to problems that cannot be solved using strong methods (Simonton, 1999b). This opens the door to the weak or heuristic strategies, methods that are almost entirely domain-free. Although several heuristics exist, there is a specific one that has a supreme place: trial and error. If a creator has no strong method to go on, then he or she can explore various possibilities, trying out this or that line of attack, in the hope that something will work. Hence, an artist may fill up sketchbooks with diverse ideas that may eventually contribute to the solution (Arnheim, 1962), and a scientist may do the same with laboratory notebooks (Gruber, 1974).
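Trial and error in this sense can be sketched as a bare generate-and-test loop. The Python below is a hypothetical illustration, not a model taken from the literature cited here: the function names and the digit puzzle are assumptions chosen only to keep the example small. What it shows is that the loop itself knows nothing about the problem. The domain enters only through the function that proposes candidates and the function that checks them.

```python
import random


def trial_and_error(propose, works, max_trials=10_000, seed=0):
    """Blindly propose candidates until one passes the test
    or the trial budget runs out. Returns (candidate, trials_used)."""
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        candidate = propose(rng)
        if works(candidate):
            return candidate, trial
    return None, max_trials  # no guarantee of success: a "weak" method


# Hypothetical toy problem: three digits that sum to 20
# and whose product is 252.
def propose_digits(rng):
    return tuple(rng.randint(0, 9) for _ in range(3))


def is_solution(digits):
    a, b, c = digits
    return a + b + c == 20 and a * b * c == 252


solution, trials_used = trial_and_error(propose_digits, is_solution)
print(solution, "found after", trials_used, "trials")
```

Swapping in a different pair of propose and test functions would apply the same loop to an entirely different problem, which is the sense in which the method is domain-free.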
Significantly, not only is trial and error a legitimate heuristic in its own right, but it also can be considered the one and only meta-heuristic (Simonton, 2004). One characteristic of heuristics is that they cannot guarantee a solution (this is the very reason why they are called "weak" methods); hence problem solving often requires the application of more than one heuristic until the creator finds one that finally works. This amounts to the trial-and-error application of heuristics. This elevates trial and error to a very special place. At the same time, it is easy to see that this combined meta-heuristic and heuristic is intrinsically domain-free. It can be applied to any form of creativity in both the arts and the sciences (Simonton, 1999a).

The broad significance of trial and error is suggested by other research traditions within experimental psychology (Simonton, 2004). A case in point is the Geneplore model put forward as part of the creative cognition approach (Finke, Ward, & Smith, 1992). This proposes that creativity consists of generating new ideas that are then explored for their potential utility. This process is strikingly similar to the variation-selection model of creativity advanced by Donald Campbell (1960) and others (Simonton, 1988b). This holds that the creative process consists of the ability to produce ideational variations and then select those that satisfy some criterion. This model can be considered a Darwinian theory of creativity insofar as biological evolution operates by an analogous variation-selection or trial-and-error mechanism (Simonton, 1999a). This connection is provocative because it ties these models with computer programs that function according to the same principles, namely, genetic algorithms and genetic programming (Holland, 1975, 1992; Koza, 1992, 1994). These programs generate combinations of ideas that are then subjected to a selection process. Unlike the computer programs mentioned earlier in this chapter, these evolutionary programs have actually made genuine discoveries rather than just rediscovering what has already been known. Putting this all together it can be argued that the domain-free process that can be found in all forms of creativity entails some kind of capacity for generating ideational variations or combinations that are then subjected to an appropriate selection procedure.

Psychometric literature

Those who argue that creativity depends on a highly specific domain expertise have somehow to explain the fact that experts in a domain are not the same as creators in that domain (Simonton, 1996). Einstein knew far less physics and mathematics than many other physicists of his day, and yet what Einstein lacked in knowledge he made up for with imagination. Indeed, empirical research indicates that creators can be differentiated from experts on a host of cognitive and dispositional variables (e.g., Rostan, 1994).
It is telling that these individual differences appear to be precisely the kinds of traits that would facilitate the trial-and-error or variation-selection operations discussed in the previous section (Simonton, 1999a, 2004). With respect to cognitive traits, it should be clear that divergent thinking has a direct relation with the capacity to produce ideational variations (Simonton, 1999a, 2004). This is apparent, for instance, in the Unusual Uses Test, in which the respondent must generate as many functions as possible for a common object, such as a paper clip or toothpick (Simonton, 2003). The uses generated are then scored for fluency (the number of responses), originality (the rarity of the responses), and flexibility (the number of different categories the responses can be assigned to). Clearly, this test is gauging the degree to which a person can conceive numerous, diverse, and novel ideas. Although it may be less immediately obvious, the capacity for remote associations, such as that assessed by the RAT, is also intimately connected with the ability to generate ideational variations or combinations (Simonton, 1999a). Individuals who can make associative linkages between seemingly

unrelated ideas have the capacity to connect ideas that normally are not placed together. The richer the associative interconnections, the greater are the number and variety of those ideational combinations (Simonton, 2004). To be sure, as noted earlier, neither the RAT nor the divergent thinking tests display substantial validity coefficients. However, we would not expect it to be otherwise. Because these measures use everyday concepts as stimuli, they are really designed to assess individuals on everyday creativity. To the extent that a creative domain relies on concepts that are specific to that domain, the stimuli have to be tailored to that domain. Einstein had to work out the many implications of the relativity principle, but not the many uses of a paper clip. In fact, there is empirical evidence that the predictive validity of such psychometric instruments increases when their content is made specific to the domain in which creativity is being assessed (Baer, 1993, 1994; Han, 2003; Mumford, Marks, Connelly, Zaccaro, & Johnson, 1998). The same holds for the capacity to generate remote associations (Gough, 1976). Note that this is not equivalent to admitting that the thought processes involved in creativity are domain specific. Divergent thinking and remote association represent the processes, the domain specific concepts the content. The situation somewhat parallels that in psycholinguistics. Certain general mental processes and capacities underlie the acquisition and application of any language, even though the specific lexical and syntactic features vary across languages. It should be pointed out that several other individual-difference variables connected with creativity also appear to be linked to the capacity to generate numerous and diverse ideas or combinations of ideas (Simonton, 2004). For instance, creativity is positively correlated with openness to experience (McCrae, 1987).
This individual-difference variable signifies openness to a diversity of novel and unconventional stimuli and ideas. Because such persons are receptive to, and even actively seek, a greater variety of unpredictable sensory and intellectual input, a broader range of associative pathways will be primed, including unanticipated pathways that may lead to possible solutions to unsolved problems. This openness thus provides a basis for the "opportunistic assimilation" of unexpected stimuli during the incubation phase of the creative process (Seifert, Meyer, Davidson, Patalano, & Yaniv, 1995). Somewhat related to the preceding factor are individual differences in the capacity to filter out presumably extraneous information (Eysenck, 1995). A specific example is latent inhibition, that is, the ability to direct attention away from stimuli previously determined to be irrelevant for a particular task situation. Although obviously a useful cognitive skill in many everyday contexts, this ability may become something of a liability in the case of creative problem solving. For instance, research on insight has shown how often the solution is overlooked because of "functional fixedness" – the tendency to see objects as having only their most commonplace functions. This obstacle has an obvious connection with the inability to conjure up

original functions for commonplace objects in the Unusual Uses Test. Hence, it should be expected that creative behaviour would be associated with a lowered capacity for latent inhibition, and empirical research bears out this prediction (Carson, Peterson, & Higgins, 2003; Stavridou & Furnham, 1996). There are three aspects of this advantageously impaired filtering process that deserve special note. First, it is often domain specific expertise that interferes with creative problem solving by determining on a priori grounds what is acceptable and what is unacceptable in a solution. This is one reason why creative performance is so often a curvilinear, inverted-U function of "practice" in a given artistic genre or research topic (Simonton, 1988a, 2000a). Thus, when algorithmic or strong methods will not work, and even the most straightforward heuristics prove useless, the creator must often "think outside the box" imposed by domain specific strategies and tactics. When the individual has no other recourse but to rely on trial-and-error, variation-selection, or exploratory processes, it becomes particularly necessary to relax the discipline-imposed constraints on the range of potential solutions. Einstein's relativity theory could only result from rejecting the restrictions of classical physics, such as the independent status and absolute nature of space and time. Similarly, Picasso's analytic cubism could only emerge from rebellion against the rule that artists had to adhere to a single perspective on objects, which had dominated Western representative art since the Renaissance. Second, although such "defocused attention" has clear adaptive value in solving problems that are intractable from the standpoint of traditional approaches, in excess this mental state can have dysfunctional consequences.
Indeed, reduced negative priming and latent inhibition are also positively correlated with psychoticism, a scale of the Eysenck Personality Questionnaire (Eysenck, 1993, 1995; Stavridou & Furnham, 1996). Yet, at the same time, psychoticism is positively correlated with the capacity for unusual associations, and with appreciation for highly complex stimuli, abilities intimately connected with creativity (Eysenck, 1994; see also Peterson, Smith, & Carson, 2002). Hence, creative thought requires a subtle balance between the highly controlled information processing that characterises routine thinking, and the totally chaotic ramblings that are characteristic of those suffering from severe mental illness (Simonton, 1999a; see also Barron, 1969; Ghadirian, Gregoire, & Kosmidis, 2001; Wuthrich & Bates, 2001). Third, the optimum point between these two extremes varies according to the specific domain of creativity in which an individual is active. For example, scientific creativity generally operates under much more domain specific constraints than does artistic creativity (Simonton, 2004). No matter how creative Einstein's theorising may have been, it still had to conform to certain basic standards of subject matter, scientific logic, and

experimental fact. Picasso's imagination, in contrast, had far more free rein, the only major restriction being some adherence to the doctrine of representationalism (i.e., the objects had to be at least partially recognisable). Yet it is necessary to emphasise that contrasts can also be found within the arts and sciences. In the case of the sciences, for example, creativity in paradigmatic disciplines (such as physics) has more domain specific constraints than creativity in nonparadigmatic disciplines (such as psychology), and within paradigmatic disciplines, revolutionary scientists, such as Einstein, operate under fewer restrictions than do other scientists, such as Hendrik Lorentz. These differences are important because they tend to correspond to the observed frequency and intensity of psychopathology across various creative domains (Simonton, 2004). For instance, mental illness is higher in the nonparadigmatic sciences than in the paradigmatic sciences (Ludwig, 1995). Likewise, psychopathology is more common in the highly expressive and romantic arts than in the highly controlled, formal, and classical arts (Ludwig, 1998).

Conclusion

Research has shown that highly influential psychologists tend to take extreme positions on the various issues that divide the discipline (Simonton, 2000c; see also Simonton, 1976b). A classic example is the notorious nature–nurture issue, a debate on which many psychologists have staked their reputations by arguing for either extreme environmentalism (e.g., John B. Watson) or extreme hereditarianism (e.g., Francis Galton). So perhaps the extremism displayed by both sides of the current issue represents an excellent strategy for optimising an investigator's visibility in the field. Some attain distinction by advocating the domain-free status of the creative process, whereas others achieve comparable eminence by arguing for domain specificity. The problem with this radical polarisation is that it may not be good science. When Einstein recognised that certain aspects of Newtonian mechanics contradicted Maxwell's equations for electromagnetism, he did not choose sides but rather found a theoretical integration via relativity theory. Similarly, when physicists debated whether light was a wave or a particle, Niels Bohr resolved the conflict by arguing that light has properties of both phenomena. Of course, most psychologists now believe that human behaviour is a function of both genetic and experiential influences. So some reconciliation is certainly possible in the present controversy as well. This takes the following form. With respect to domain specificity, there can be no doubt that outstanding creativity must be founded on expertise, not ignorance. It is extremely rare for an individual to make major contributions to a particular domain without first acquiring a minimal level of domain specific knowledge and skill. Even Albert Einstein could not walk into Picasso's studio and paint up a masterpiece. In this sense, the 10-year rule in expertise acquisition

remains valid. Furthermore, among the various domain specific items acquired is a set of strong methods or algorithms that enable the creator to solve routine problems. A good part of any creative person's career consists of such solutions. For instance, most of the authors of chapters in this edited volume probably have the task of reviewing their field or summarising their results down to a fairly straightforward procedure. No big breakthroughs or flashes of insight are usually required to make a contribution to most edited volumes. Yet the very ease with which these routine tasks can be executed is their limitation. Few authors of these chapters would consider their contribution an instance of their most creative work. According to the accepted definition of the phenomenon, creativity presupposes originality (Simonton, 2000b). Such essays are less likely to be considered highly original than, say, an article published in a refereed journal. Even among refereed journal articles there may exist considerable variation in originality, and hence creativity. Einstein followed his first 1905 paper on special relativity with elaborations and extensions of the principle, none of which could be viewed as equally creative (even the paper in which he announced the famous E = mc²). Not until over a decade later, when he began publishing on general relativity, could he be said to have matched or exceeded that earlier accomplishment. The same holds for Picasso, who would often introduce a revolutionary artistic idea in a particular work – such as the 1907 Les Demoiselles d'Avignon – and then spend several years developing and enlarging upon that major innovation. To the extent that a creator aims at the highest level of originality, he or she must rely on domain-free processes.
These enable the individual to break away from the constraints imposed by the domain, even constraints that the individual personally introduced into the domain in previous works. Einstein's special theory of relativity was constrained by its presumption of non-accelerating frames of reference. His general theory of relativity removed that assumption. But what are these processes? There are actually a great many of them, such as trial and error, remote association, divergent thinking, exploration and playful tinkering, defocused attention, and ideational variation or recombination. The specifics do not matter. What matters is that the creative intellect opens up to a wider range of possibilities, a range that does not automatically exclude an idea just because it has already been ruled out of court by the discipline. On the other hand, these processes do not operate in a vacuum. Their content has to be embedded in the context of a particular domain. Einstein did not conjure up his theories by painting, nor did Picasso get ideas for his compositions by contemplating solutions to physics problems. Hence, the process of major creative insights is domain-free while the content remains domain-bound. Nevertheless, the latter restriction does not mean that the creative process is necessarily domain specific. On the contrary, the more creative the solution, the lower the likelihood that it depended on domain

specific thought. Hence, when Einstein and Picasso were engaged in routine work, they were miles apart in their thinking habits, each relying on distinctive algorithms to produce straightforward ideas. Only when they were each occupied with their most original work did they converge in the cognitive processes and capacities that they brought to bear on their creative thought.

References

Arnheim, R. (1962). Picasso's Guernica: The genesis of a painting. Berkeley: University of California Press.
Baer, J. (1993). Creativity and divergent thinking: A task-specific approach. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Baer, J. (1994). Divergent thinking is not a general trait: A multidomain training experiment. Creativity Research Journal, 7, 35–46.
Barron, F. X. (1969). Creative person and creative process. New York: Holt, Rinehart & Winston.
Barron, F. X., & Harrington, D. M. (1981). Creativity, intelligence, and personality. Annual Review of Psychology, 32, 439–476.
Bradshaw, G. F., Langley, P., & Simon, H. A. (1983). Studying scientific discovery by computer simulation. Science, 222, 971–975.
Campbell, D. T. (1960). Blind variation and selective retention in creative thought as in other knowledge processes. Psychological Review, 67, 380–400.
Carson, S., Peterson, J. B., & Higgins, D. M. (2003). Decreased latent inhibition is associated with increased creative achievement in high-functioning individuals. Journal of Personality and Social Psychology, 85, 499–506.
Cox, C. (1926). The early mental traits of three hundred geniuses. Stanford, CA: Stanford University Press.
Diakidoy, I.-A. N., & Spanoudis, G. (2002). Domain specificity in creativity testing: A comparison of performance on a general divergent-thinking test and a parallel, content-specific test. Journal of Creative Behavior, 36, 41–61.
Epstein, R. (1990). Generativity theory and creativity. In M. Runco & R. Albert (Eds.), Theories of creativity (pp. 116–140). Newbury Park, CA: Sage Publications.
Epstein, R. (1991). Skinner, creativity, and the problem of spontaneous behavior. Psychological Science, 2, 362–370.
Ericsson, K. A. (1996a). The acquisition of expert performance: An introduction to some of the issues. In K. A. Ericsson (Ed.), The road to expert performance: Empirical evidence from the arts and sciences, sports, and games (pp. 1–50).
Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Ericsson, K. A. (Ed.). (1996b). The road to expert performance: Empirical evidence from the arts and sciences, sports, and games. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Eysenck, H. J. (1993). Creativity and personality: Suggestions for a theory. Psychological Inquiry, 4, 147–178.
Eysenck, H. J. (1994). Creativity and personality: Word association, origence, and psychoticism. Creativity Research Journal, 7, 209–216.

Eysenck, H. J. (1995). Genius: The natural history of creativity. Cambridge: Cambridge University Press.
Finke, R. A., Ward, T. B., & Smith, S. M. (1992). Creative cognition: Theory, research, applications. Cambridge, MA: MIT Press.
Ghadirian, A.-M., Gregoire, P., & Kosmidis, H. (2001). Creativity and the evolution of psychopathologies. Creativity Research Journal, 13, 145–148.
Galton, F. (1869). Hereditary genius: An inquiry into its laws and consequences. London: Macmillan.
Getzels, J., & Jackson, P. W. (1962). Creativity and intelligence: Explorations with gifted students. New York: Wiley.
Gibson, J., & Light, P. (1967). Intelligence among university scientists. Nature, 213, 441–443.
Gough, H. G. (1976). Studying creativity by means of word association tests. Journal of Applied Psychology, 61, 348–353.
Gough, H. G. (1979). A creative personality scale for the adjective check list. Journal of Personality and Social Psychology, 37, 1398–1405.
Gruber, H. E. (1974). Darwin on man: A psychological study of scientific creativity. New York: Dutton.
Guilford, J. P. (1967). The nature of human intelligence. New York: McGraw-Hill.
Han, K. S. (2003). Domain specificity of creativity in young children: How quantitative and qualitative data support it. Journal of Creative Behavior, 37, 117–142.
Hayes, J. R. (1989). The complete problem solver (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor: University of Michigan Press.
Holland, J. H. (1992). Genetic algorithms. Scientific American, 267, 66–72.
Howe, M. J. A. (1999). The psychology of high abilities. New York: New York University Press.
Kaufman, J. C., & Baer, J. (2002). Could Steven Spielberg manage the Yankees? Creative thinking in different domains. Korean Journal of Thinking & Problem Solving, 12, 5–14.
Klahr, D. (2000). Exploring science: The cognition and development of discovery processes.
Cambridge, MA: MIT Press.
Köhler, W. (1925). The mentality of apes (E. Winter, Trans.). New York: Harcourt, Brace.
Koza, J. R. (1992). Genetic programming: On the programming of computers by means of natural selection. Cambridge, MA: MIT Press.
Koza, J. R. (1994). Genetic programming II: Automatic discovery of reusable programs. Cambridge, MA: MIT Press.
Langley, P., Simon, H. A., Bradshaw, G. L., & Zytkow, J. M. (1987). Scientific discovery. Cambridge, MA: MIT Press.
Ludwig, A. M. (1995). The price of greatness: Resolving the creativity and madness controversy. New York: Guilford Press.
Ludwig, A. M. (1998). Method and madness in the arts and sciences. Creativity Research Journal, 11, 93–101.
McCrae, R. R. (1987). Creativity, divergent thinking, and openness to experience. Journal of Personality and Social Psychology, 52, 1258–1265.
McNemar, Q. (1964). Lost: Our intelligence? Why? American Psychologist, 19, 871–882.

Mednick, S. A. (1962). The associative basis of the creative process. Psychological Review, 69, 220–232.
Mumford, M. D., Marks, M. A., Connelly, M. S., Zaccaro, S. J., & Johnson, T. F. (1998). Domain based scoring of divergent thinking tests: Validation evidence in an occupational sample. Creativity Research Journal, 11, 151–164.
Newell, A., Shaw, J. C., & Simon, H. A. (1958). Elements of a theory of human problem solving. Psychological Review, 65, 151–166.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Peterson, J. B., Smith, K. W., & Carson, S. (2002). Openness and extraversion are associated with reduced latent inhibition: Replication and commentary. Personality & Individual Differences, 33, 1137–1147.
Plucker, J. A. (2004). Generalization of creativity across domains: Examination of the method effect hypothesis. Journal of Creative Behavior, 38, 1–12.
Roe, A. (1953). The making of a scientist. New York: Dodd, Mead.
Root-Bernstein, R. S., Bernstein, M., & Garnier, H. (1993). Identification of scientists making long-term, high-impact contributions, with notes on their methods of working. Creativity Research Journal, 6, 329–343.
Root-Bernstein, R. S., Bernstein, M., & Garnier, H. (1995). Correlations between avocations, scientific style, work habits, and professional impact of scientists. Creativity Research Journal, 8, 115–137.
Rostan, S. M. (1994). Problem finding, problem solving, and cognitive controls: An empirical investigation of critically acclaimed productivity. Creativity Research Journal, 7, 97–110.
Seifert, C. M., Meyer, D. E., Davidson, N., Patalano, A. L., & Yaniv, I. (1995). Demystification of cognitive insight: Opportunistic assimilation and the prepared-mind perspective. In R. J. Sternberg & J. E. Davidson (Eds.), The nature of insight (pp. 65–124). Cambridge, MA: MIT Press.
Shrager, J., & Langley, P. (Eds.). (1990). Computational models of scientific discovery and theory formation.
San Mateo, CA: Kaufmann.
Simonton, D. K. (1976a). Biographical determinants of achieved eminence: A multivariate approach to the Cox data. Journal of Personality and Social Psychology, 33, 218–226.
Simonton, D. K. (1976b). Philosophical eminence, beliefs, and zeitgeist: An individual-generational analysis. Journal of Personality and Social Psychology, 34, 630–640.
Simonton, D. K. (1988a). Age and outstanding achievement: What do we know after a century of research? Psychological Bulletin, 104, 251–267.
Simonton, D. K. (1988b). Scientific genius: A psychology of science. Cambridge: Cambridge University Press.
Simonton, D. K. (1991). Emergence and realization of genius: The lives and works of 120 classical composers. Journal of Personality and Social Psychology, 61, 829–840.
Simonton, D. K. (1992). Leaders of American psychology, 1879–1967: Career development, creative output, and professional achievement. Journal of Personality and Social Psychology, 62, 5–17.
Simonton, D. K. (1996). Creative expertise: A life-span developmental perspective. In K. A. Ericsson (Ed.), The road to expert performance: Empirical evidence from

the arts and sciences, sports, and games (pp. 227–253). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Simonton, D. K. (1997). Creative productivity: A predictive and explanatory model of career trajectories and landmarks. Psychological Review, 104, 66–89.
Simonton, D. K. (1999a). Origins of genius: Darwinian perspectives on creativity. New York: Oxford University Press.
Simonton, D. K. (1999b). Talent and its development: An emergenic and epigenetic model. Psychological Review, 106, 435–457.
Simonton, D. K. (2000a). Creative development as acquired expertise: Theoretical issues and an empirical test. Developmental Review, 20, 283–318.
Simonton, D. K. (2000b). Creativity: Cognitive, developmental, personal, and social aspects. American Psychologist, 55, 151–158.
Simonton, D. K. (2000c). Methodological and theoretical orientation and the long-term disciplinary impact of 54 eminent psychologists. Review of General Psychology, 4, 1–13.
Simonton, D. K. (2003). Creativity assessment. In R. Fernández-Ballesteros (Ed.), Encyclopedia of psychological assessment (Vol. 1, pp. 276–280). London: Sage Publications.
Simonton, D. K. (2004). Creativity in science: Chance, logic, genius, and zeitgeist. Cambridge: Cambridge University Press.
Stavridou, A., & Furnham, A. (1996). The relationship between psychoticism, trait-creativity and the attentional mechanism of cognitive inhibition. Personality and Individual Differences, 21, 143–153.
Sternberg, R. J., & Davidson, J. E. (Eds.). (1995). The nature of insight. Cambridge, MA: MIT Press.
Sulloway, F. J. (1996). Born to rebel: Birth order, family dynamics, and creative lives. New York: Pantheon.
Terman, L. M. (1925–1959). Genetic studies of genius (5 vols.). Stanford, CA: Stanford University Press.
Walberg, H. J., Rasher, S. P., & Hase, K. (1978). IQ correlates with high eminence. Gifted Child Quarterly, 22, 196–200.
Wallas, G. (1926). The art of thought. New York: Harcourt, Brace.
Waller, N.
G., Bouchard, T. J., Jr., Lykken, D. T., Tellegen, A., & Blacker, D. M. (1993). Creativity, heritability, familiality: Which word does not belong? Psychological Inquiry, 4, 235–237.
Weisberg, R. W. (1992). Creativity: Beyond the myth of genius. New York: Freeman.
Wertheimer, M. (1982). Productive thinking (M. Wertheimer, Ed.). Chicago: University of Chicago Press. (Original work published 1945)
White, R. K. (1931). The versatility of genius. Journal of Social Psychology, 2, 460–489.
Wuthrich, V., & Bates, T. C. (2001). Schizotypy and latent inhibition: Non-linear linkage between psychometric and cognitive markers. Personality and Individual Differences, 30, 783–798.

16 The CASE1 for a general factor in intelligence Philip Adey

The image of intelligence in education

The concept of intelligence has had a bumpy ride in education. Alfred Binet is credited with being the first to have tried to characterise scientifically the notion of general intellectual ability. This idea had been around at least since ancient Greece: Why else would Plato report Socrates as being able to teach "even a slave" geometric theorems, if it were not that the slave was perceived as having inferior intelligence? Binet (1909) devised a wide range of questions, which included executing three simultaneous commands, comparing two objects from memory, or arranging five blocks in order of weight. Unfortunately, he also used some rather different types of item, such as distinguishing an ugly from a pretty face, which we might now think to be somewhat culturally biased! Nonetheless, a characteristic of most of these questions is that they required the subject to attend to a number of features at one time, which raises the idea of connectivity in the mind, to which I shall return later. Binet introduced the idea of "mental age" – the mental ability of the average child of a given age. Stern then introduced the idea of dividing mental age by chronological age to give an "intelligence quotient", IQ (Perkins, 1995). The Stanford-Binet scale (Thorndike, Hagen, & Sattler, 1986) provided a standard testing instrument that yielded a score whose population mean was 100 with a standard deviation of 15. Cyril Burt (1927) was employed by the London County Council to develop these methods for use in selecting the children most likely to benefit from an academic secondary education. All this early work on intelligence seems to have assumed that it is a unitary construct, or at least has a common factor underlying it, and that intelligence is the most important factor in determining a child's likely

1 CASE stands for Cognitive Acceleration through Science Education. Thus this chapter will argue evidence for generality of intelligence from a particular educational perspective, rather than the general ``case'', which has been ably covered in other chapters.

success in education. In spite of the rise of various factor models of intelligence (Thurstone, 1924; Guilford, 1967), these assumptions broadly held sway in education until the 1960s, when in England, with a selective school system, questions started to be raised about the equity of determining a child's future by a single test at the age of 11. Critics ranged from those who accepted the general idea of a unitary intelligence, but questioned the validity of the measuring instrument and its use on only one occasion (thus missing "late developers"), to those who rejected completely the idea of a single intellectual factor and/or the importance of intellect in education. More generally, the process of IQ testing was brought into disrepute within educational circles by somewhat strident and populist claims (Herrnstein & Murray, 1994; Jensen, 1973) that intelligence, as represented by IQ, was (1) a powerful determinant of life success and (2) highly hereditable. Compounded with the apparent evidence that some races and some classes had lower mean IQs than others, this seemed to provide support for racist and classist policies. And this was just at a time when, especially in the United States, a more equitable society was being legislated for in an attempt to shake off a long history of oppression and segregation. I will return later to what I see as the major fallacy in the hereditists' argument, but here we need consider only the reaction of much of the educational world, which persists to this day. It is simply to reject all or some of the propositions that:

1. there is validity in recognising a general factor in intellectual ability as "intelligence"
2. intelligence is important in life
3. having a measure of an individual's intelligence might help one to direct their education programme
4. intelligence can be assessed by psychometric tests.

In California, this reaction led to legislation banning all group testing of intelligence – a wonderful example of an ostrich response. More generally, it led to the popularity among teachers of "multiple intelligences" (Gardner, 1993) – a set of supposedly orthogonal forms of intelligence, each of which may develop individually without even correlational, let alone causal, relationship to one another. That this flies in the face of all the evidence (Anderson, 1992; Carroll, 1993) counts as nothing against educators' very proper drive to attend with equity to all their charges, coupled with the misinterpretation of "intelligence" as a divisive and inequitable construct. Certainly, the notion of intelligence can be damaging if it is used in a dismissive or deterministic manner, which takes little account of the wide range of rates of development, and I do not doubt that such misuse is still widespread, but we do not overcome this problem by wishing it away, substituting even more problematic constructs such as motivation, or flying in the face of the substantial evidence for a general factor in intelligence.

Generality does not imply hereditability

There is a common misconception that the generality of a psychological construct necessarily implies its hereditability. That is, that a cognitive or personality trait that is shown to have quite general effects on behaviour must be hard-wired into our system, and that variance in values of that construct, across individuals, is largely genetically determined. In fact we need to tease apart the ideas of generality and hereditability, and consider plasticity, or openness to influence by the environment. A general function is one that operates across a wide range of contexts. To talk of general intelligence is to imply a unitary entity that will show itself in ability to learn French, physics, or football. For a function to be hereditable, it must be possible to attribute significant amounts of variance in that function to the genetic make-up of individuals, and to be able to correlate values of the function with those of close family members, with environmental factors partialled out. A function that is plastic is one that can be influenced by the environment, whether consciously manipulated – as in an educational programme – or not. To the extent that something is hereditable, it cannot be plastic, since hereditability implies something fixed at the moment of conception. But functions that are general may be hereditable, or they may be plastic. The relationship between these can be imagined in a two-dimensional space where one dimension ranges from extreme hereditability to extreme plasticity, and the other ranges from highly context-specific abilities to those that are general across virtually all of our psychological lives. Memorising facts is an example of making use of the brain's plastic capability to learn information with rather specific function (no information is ever completely context-free). The reflexes we are born with, such as sucking and gripping, are specific and inherited.
They are not learned in any normal sense of the word, and may be supposed to be part of our evolutionarily developed genetic make-up. There are many more general capabilities that we "learn", either consciously or unconsciously, as we grow up, such as general effective ways of studying, or acceptable social behaviour. The form in which these develop depends on our particular cultural and educational experiences. But there are also general capabilities with which we seem to be genetically programmed, such as the ability to learn language. The rate and efficiency with which human infants learn language are far greater than could be explained by any associationist model, and despite massive efforts by researchers, no other animal has demonstrated an ability to learn language beyond that mastered easily by human 3-year-olds. Whether or not one buys into Chomsky's (1986) model of a general language processor, it is clear that the ability to learn language is both very general and a unique human attribute (Pinker, 1995). At the extreme of the generality dimension, we enter the area of controversy over the hereditability/plasticity of intelligence. The geneticists referred to earlier would consider intelligence as a hard-wired function of

Adey

the mind, largely determined in an individual by where the genetic roulette wheel settles at the moment of conception. But many of us take a far more optimistic view, that the development of general intellectual processing power can be significantly influenced by the environment, and so place it higher on the plasticity scale. In practice we find that general intelligence has both inherited and plastic components, and the argument is about the relative proportions of each, or indeed, whether it makes any sense at all to talk of "proportions of variance in intelligence that are inherited". The classical approach to this problem has been through comparisons of identical twins who have been reared together, and those who have been separated at birth and brought up in different environments. The methodological problems with such studies are well known, but perhaps the most recent data from such studies (Plomin & DeFries, 1998) suggest that at birth 20% of variance in intelligence is accounted for by heredity. This proportion changes as the child grows older, since she is generally being reared by at least one of the people from whom her genes come. The optimistic educator would take this figure and say "that gives us 80% to work with", which is pretty good. But one might also argue that putting any percentages on the nature–nurture balance is meaningless. Genes, after all, only give a propensity for an individual to develop in a certain way. The actual development depends on the provision of appropriate conditions in the environment. Within 12 days of conception, some cells in the human foetus are already specialised as the neural tube, which will become the brain and spinal cord. Imagine the impact on the development of the brain of such simple influences as the mother smoking or taking other drugs, or of hormonal changes caused by stress, let alone the possible effects of playing Mozart (Sutton, 2004).
An important point that hereditists frequently overlook is that their figures (typically 60% heredity) refer to relative contributions, which in turn reflect typical genes in a typical environment. Schooling has been shown to lead to gains in intelligence (e.g., Ceci & Williams, 1997; Vernon, 1969). It follows that hereditability estimates will be far higher in all-school cultures, compared with cultures where schooling is optional and many people do not take the option: for the latter, the environments will differ more from person to person, and therefore they will contribute more to the variance.

To conclude this section, I have argued that a function – and I have been focusing on intelligence – may be both general and plastic. Further, I have cited some evidence that a large proportion of variance in intelligence is open to environmental influence. This presents educators with both an opportunity and a challenge. The opportunity is to maximise the intellectual development of as many children as possible, and the challenge is, to start with, to develop and formalise ways of doing this. There is the subsequent challenge of persuading educational policy-makers of the value and practicability of the enterprise – but that is beyond the scope of this chapter. In the next section I will look at some models of general

intelligence, and draw implications from these models about what intellectual stimulation might look like.
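The earlier point that hereditability estimates depend on how much environments differ from person to person can be illustrated with a minimal sketch. It assumes the simplest textbook decomposition, in which observed variance is the sum of independent genetic and environmental components and hereditability is the genetic share of that total. The function name and all variance values are hypothetical, chosen only to show the direction of the effect, and are not taken from any study cited here.

```python
def hereditability(var_genetic, var_environment):
    """Genetic share of total variance under a purely additive model.

    Assumption: genetic and environmental components are independent,
    so total variance is simply their sum.
    """
    return var_genetic / (var_genetic + var_environment)


# Hypothetical variance components (arbitrary units), for illustration only.
VAR_GENETIC = 60.0           # same gene pool in both cultures
VAR_ENV_ALL_SCHOOL = 40.0    # everyone attends school: environments are alike
VAR_ENV_OPTIONAL = 140.0     # schooling optional: environments differ more

h_all_school = hereditability(VAR_GENETIC, VAR_ENV_ALL_SCHOOL)
h_optional = hereditability(VAR_GENETIC, VAR_ENV_OPTIONAL)

print(f"all-school culture:      {h_all_school:.2f}")
print(f"optional-school culture: {h_optional:.2f}")
```

With identical genetic variance, the estimate falls simply because the environmental term grows, which is why a hereditability figure describes a population in a particular range of environments rather than a fixed property of the trait.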

Cognitive models of intelligence

I have been asking teachers for many years what counts in their experience as intelligent behaviour. When one of their students says something, or writes something, or does something that makes them say "Yes! that is smart", what sort of thing is this? The answers I get are remarkably consistent. They always include things about seeing patterns in data, anticipating what the teacher is going to do next, asking probing questions, and above all applying knowledge from one context to a different one altogether. All these features require that the students have made some sort of connection in their mind. The professionals who are in the business of teaching and learning therefore see connectivity as a central characteristic of intelligent behaviour. "Connectivity" here is used in the simple sense of the conscious or unconscious making of connections in the mind between one idea and another. In this sense, making comparisons, relating causes to effects, or the elucidation of any relationship between variables, all involve connectivity.

If this connectivity feature of intelligence has general validity, then whatever the cognitive model of the central processing mechanism proposed, it must afford connectivity a major role. We have seen that even in Binet's original characterisation of intelligence, the idea of seeing relationships between different elements was an important feature, and this connectivity idea has remained constant as models of intelligence have become more sophisticated. Prime examples are the information-processing models of Case (1985) and Pascual-Leone (1976), which link the development of cognition to growth in working memory capacity, with increasing capacity allowing for increasing numbers of bits of information to be processed in parallel (see also Halford & Andrews, Chapter 9, this volume).
The idea of working memory as a key element in the mechanism by which information from the outside world becomes constructed in long-term memory is now well established in cognitive psychology (Baddeley, 1990; Logie, 1999). Another useful approach to intelligence, especially when one is seeking pedagogic methods of cognitive stimulation, is that of one of Binet's students, Jean Piaget. With Bärbel Inhelder, Piaget describes the highest level of intellectual performance as "formal operations" – the ability, for example, to operate on many variables in mind at once, to use abstract ideas in conjunction with one another, and to see actual events as a subset of many possible events (Inhelder & Piaget, 1964; Smith, 1992). Again we see the connectivity idea very clearly, so teachers' intuitive ideas of intelligence turn out to tally very closely with those held by the original analysts of the nature of intelligence. Note that, at first sight, the psychometric approach

originating from Binet and the developmentalist approach that stems from Piaget may seem incompatible. However, Styles & Andrich (1997; see also Styles, 1999) have used Rasch scaling of a standard intelligence test – Raven's Progressive Matrices (Raven, 1960) – and Piagetian measures to show that both form a unidimensional scale on a single latent trait. This is postulated to be that of abstract reasoning. Here again we have strong evidence that intelligence has an important general component. We might recognise Deary's (2000) concerns about the danger of reductionism – trying to explain something as complex as intelligence by some simple neurophysiological mechanism – but nevertheless accept the evidence that the complexity presents itself as a unitary construct.

Intervention principles

One reason why American intelligence researchers "know" that environmental effects on intelligence are low is that major intervention programmes such as "Head Start" initially appeared to fail (e.g., Jensen, 1998). However, several criticisms have been made of their implementation. For example, after a Head Start programme is completed, children, especially African Americans, attend the same poor-quality schools that they would have attended in any case (Currie & Thomas, 2000). Not only this, but Levitt and Dubner (2005) complain about uncompetitive salaries for Head Start teachers compared with other teachers, and their consequent low levels of qualifications. In support of the hypothesis that the poor implementation of Head Start is responsible for its apparent failure, Oden, Schweinhart, Weikart, Marcus, and Xie (2000) report a 17-year follow-up of children who had followed a structured Head Start programme called High/Scope. These children showed a significantly higher grade point average throughout their schooling, and experienced less than half as many criminal convictions by age 22, compared with matched controls.

Despite the problems with Head Start, the early negative evaluations may, in any case, be the result of looking for instant effects from a programme that was aimed at making fundamental changes in cognitive development, which are necessarily slow. Garces, Thomas, and Currie (2000) have provided long-term evidence that, among whites, participation in Head Start is associated with a significantly increased probability of completing high school and attending college, and elevated earnings in their early twenties. African Americans who participated in Head Start are significantly less likely to have been charged with or convicted of a crime. The evidence also suggests that there are positive spillovers from older children who attended Head Start to their younger siblings.
In developing an intervention designed to maximise the development of intelligence, we have interrogated the complementary models of cognitive growth, outlined above, for features that could be manipulated to enhance the positive cognitive stimulation. In Piaget's model of cognitive

development, the notion of equilibration is central (Piaget, 1977). This is the process by which existing cognitive structures must accommodate in the face of cognitive conflict: a two-way process between the mind and the environment. Faced with a cognitive task set by the environment, the mind must accommodate its mental structures to meet the task, in order to be able to assimilate the information. If the demand is low, accommodation will be minimal, and if it is very high, accommodation will not be possible. This suggests that a promising approach to cognitive acceleration would be to provide students with activities that generate cognitive conflict, challenges that are carefully pitched at a demand level appropriate to the child's current processing power. This is not straightforward. For one thing, there is the question of the distance beyond the students' current capability that would be optimal for inducing cognitive restructuring, and for another the significant practical difficulty faced by teachers whose pupils in the same class have widely varying "current capabilities".

For guidance on just how difficult a cognitive challenge should be – the level of cognitive dissonance that creates productive cognitive conflict – we can turn to Vygotsky's (1978) notion of a zone of proximal development. This idea allows us to conceptualise the range of levels of demand, from that which an individual can meet without difficulty and which therefore causes no conflict, to that which is beyond his capability under even the most favourable conditions of support. The zone of proximal development defines a range within which cognitive conflict can be productive, but with this proviso: the conflict has to be positively and efficiently managed by the teacher. The provision of a cognitive challenge by itself cannot be expected to provoke accommodation and cognitive growth.
It is just as likely, in the absence of support and what Bruner (1974) calls "scaffolding" by the teacher, to lead to rejection and a sense of helplessness in the student. The lesson that emerges is that intervention programmes for cognitive acceleration always depend critically on the skill of the teacher, and can never be delivered by inanimate materials alone. The issue of the professional development of teachers for cognitive acceleration is beyond the scope of this chapter; see Adey, Hewitt, Hewitt, and Landau (2004) for a full discussion.

Having considered Vygotsky, we can take from him a second cardinal principle for cognitive stimulation: that of social construction. Faced with a cognitive problem and in a supportive classroom atmosphere, students want to talk with one another and to share their difficulties and their attempts at solutions. This social mechanism described by Vygotsky was therefore recognised as being the main driver of cognitive development. From this avowedly Soviet perspective, not just knowledge, but intelligence itself is thus constructed socially in a group as students interact not only with the teacher but with each other, listening, arguing, challenging, and generally exploring a situation and making meaning together. Thus, early on, the explicit promotion of social construction was added to cognitive conflict as the second main "pillar" of cognitive acceleration.

During an iterative process of developing stimulating classroom activities, implementing them in real classrooms, observing and reading, it soon became apparent that the most generative classrooms demonstrated another feature, which became the third central pillar in our model of cognitive acceleration. This was metacognition. Making explicit their thinking, revisiting a problem-solving procedure, and inspecting the difficulties and false trails encountered were seen to be important elements in the process of maximising cognitive development. Metacognition can be seen as the "reflective abstraction" that Piaget describes as a feature of formal operations, but this is to suggest that it only becomes available to able adolescents (our survey of 14,000 school students in the 1970s, Shayer & Adey, 1981, showed that only 30% of 16-year-olds seem capable of using formal operations at all). In fact, work with 5- and 6-year-olds shows that they are quite capable of metacognition, in a real sense (Larkin, 2001).

Thus far I have shown how the "pillars" of cognitive acceleration – cognitive conflict, social construction, and metacognition – are offered to us by the theories of Piaget and Vygotsky. But having earlier made strong claims for the role of working memory capacity in intelligence, I should try now to hypothesise how those pillars might relate to the growth or more efficient use of working memory. If working memory capacity lies at the centre of the central processing mechanism, then one way in which intelligence develops must be by an increase in working memory capacity.
However, if, following Pascual-Leone's (1976) and Case's (1974) early work, we accept that the number of bits of information that can be independently managed by working memory grows gradually from zero at birth to about seven at maturity, that suggests a rather slow rate of development expressed as bits per year – something less than one bit every two years on average (see footnote 2). While it is likely that working memory capacity does increase under the influence of both maturation of the central nervous system and cognitive stimulation, the potential of even the most successful of stimulation programmes must be constrained by the normal range of the rate of growth of capacity. Thus, while accepting the probability that well-managed cognitive conflict, as described above, will have some effect on the growth of working memory, it seems unlikely that this mechanism alone could account for the large effect sizes (to be outlined later) we have obtained in the cognitive acceleration projects. We therefore need to seek additional ways in which central processing can be enhanced, or made more efficient.

2 In a later formulation, Case (1985) suggests that total working memory space remains constant, but more sophisticated and efficient operational capability means that less space needs to be devoted to operations, leaving more space for storage of bits of information. The practical outcome from our point of view is not very different.

There seem to be two possible mechanisms by which a given working memory capacity could be used more efficiently: the use of "chunking", and the use of schemata. These are related but distinct, and may act separately or together. Chunking means the linking together of two or more variables as a compound variable which, as long as it is not deconstructed, can be treated as a single variable and so dealt with in one working memory space instead of two or three. A schema is a general way of thinking about the ordering and organisation of variables encountered in the world. If we did not have schemata available to us then we would have to treat each problem completely anew and could transfer no processing from one context to another.

To our 5-year-olds, the practical problem of placing 10 sticks in order of length is not a trivial one. The schema of seriation is not well established and they have to apply their early concrete operations to the single variable, length, successively with each overlapping set of three sticks. Their tendency is to attend to only two sticks at a time, which means that they establish that, say, A is longer than B, then that A is longer than C, and then they order them A, C, B without thinking of the possibility that B may be longer than C. They do not have a general schema of seriation.

When still a few years short of fully developing formal operations, the 12-year-olds we work with have significant difficulty in seeing, at one time, three independent variables, holding two constant while trying two values of the third, and at the same time relating each value of the varying independent variable to a corresponding value of a dependent variable. This demands at least six spaces in working memory. I believe that it is unrealistic to claim that the effect of perhaps four control-of-variables activities within a cognitive acceleration programme over some two months could, by itself, expand students' working memory capacity from four to six.
Nor do the activities and methods of cognitive acceleration encourage the learning of rules ("change only one variable at a time") since such rules are almost inevitably misapplied without the formal understanding of why it matters (see footnote 3). I suggest that the cognitive construction during this 8-week period is no more than the beginning of the schema of control of variables, together with an awareness that these experiments are not as simple as they look, require more attention to the variables than one might think, and so continued help will be required from both the teacher and more able peers. In other words, an element of uncertainty is introduced that operates as an internal mechanism for creating cognitive conflict whenever that type of problem is encountered.

3 My colleague Tony Hamaker was once forbidden by a class of 12-year-olds to take off his sweater while conducting an experiment rolling balls of various masses and materials down a runway, because he "had to keep everything the same except for the mass of the ball".
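The working-memory argument above is essentially a comparison of two growth rates, which the following minimal sketch puts into numbers. The figures of zero to seven bits, four to six spaces, and two months come from the text; the age of 16 years for "maturity" and the function name are assumptions made only for this illustration.

```python
from fractions import Fraction


def bits_per_year(start_bits, end_bits, months):
    """Mean growth in working-memory capacity, in bits per year,
    over a period given in months (exact rational arithmetic)."""
    return Fraction(end_bits - start_bits) * 12 / months


# Assumption: "maturity" is taken here as 16 years of age.
MATURITY_MONTHS = 16 * 12

# Normal development: from zero at birth to about seven bits at maturity.
normal_rate = bits_per_year(0, 7, MATURITY_MONTHS)

# Rate that would be needed if two months of activities alone were to
# expand capacity from four to six spaces.
implied_rate = bits_per_year(4, 6, 2)

print(f"normal rate:  {float(normal_rate):.2f} bits per year")
print(f"implied rate: {float(implied_rate):.2f} bits per year")
print(f"ratio:        {float(implied_rate / normal_rate):.1f}")
```

Under these assumptions the normal rate is below one bit every two years, while the implied rate is more than twenty times greater, which is why growth in capacity alone looks like an implausible account of what a few weeks of activities achieve.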

An important message that emerges is that cognitive stimulation must necessarily be a slow process. There is no question of inserting a "quick fix" into the school curriculum, a few special lessons to teach children how to think. For cognitive intervention we have to think in terms of a year or more, and of changing the mindset of teachers, at least on occasion, from "delivering content" to "stimulating the intellect" – a very different pedagogical process.

An intervention programme

In this section I will show how the three-pillar model of cognitive stimulation has been operationalised as curriculum activities and pedagogic methods. Firstly, we should consider the issue of the contexts in which the activities might be set. If we are talking about intelligence as a general function, then we should be able to gain entry to this using the principles developed in the last section, through any of the traditional school subjects. When we started this work in the early 1980s, the project director, Michael Shayer, the principal researcher who joined us, Carolyn Yates, and I all had backgrounds as science teachers. Added to that, the schemata of formal operations described by Inhelder & Piaget (1958) had a very "scientific" look. Thus, our first target within schools was the science department, and the original project was called CASE: Cognitive Acceleration through Science Education. Subsequently the same principles of cognitive acceleration have been applied in mathematics, technology, and the expressive arts (art, music, and drama).

Secondly, what seemed likely to be the age at which students would benefit most from such cognitive intervention? Since our early work was focused on the promotion of formal operations, the early years of secondary schooling, with students aged 12–14 years, seemed to be the most natural place to go. Many of these students are on the threshold of formal operations, the final main patterns of neural pathways in the forebrain have not yet been established, and it seemed likely that intervention at this point could have long-term effects.

In the original CASE project (1982–87) a set of 30 activities was developed based on the three-pillar model described above. These were introduced into science classes of students aged 12 to 14 years (Years 7 and 8 in England, equivalent to Grades 6 and 7 in the US), to be used instead of regular science lessons, at the rate of one every 2 weeks over 2 years.
The expression of the three pillars can be illustrated with one activity from Thinking Science (Adey, Shayer, & Yates, 2001), the curriculum materials of the project: Activity 3, Tubes, is based on the schema of control of variables and exclusion of irrelevant variables. Students are given a set of tubes which

vary in length, width, and material (copper or plastic). They are asked to find out what factors affect the note made when one blows across the tube. Initially they play with the tubes more or less at random to become familiar with the variables of length, width, and material and the pitch of the note produced. They are now encouraged to test the tubes just two at a time. Typically, 11- and 12-year-old students tend to choose their pairs at random, without any specific strategy for testing one variable at a time. The cognitive conflict must be induced by the teacher circulating around the class, asking groups what they can conclude from any pair of tubes they have tested, challenging them to explain how a particular difference in note can be attributed to a change in width, when both width and length have been changed, but never giving specific instructions to "vary only one thing at a time". Groups will be set tasks to explain their reasoning to each other, so anyone can be called to explain it to the whole class. At certain times, the teacher may call the whole class's attention and ask different groups what they have found, how they can justify their conclusion, and ask other groups to say whether they agree with the reasoning being put forward. Here we have social construction in action, which requires that the students learn how to question one another, and even disagree with one another, in a polite manner. Generally towards the end of the lesson (although this may occur at any time) the teacher invites students to reflect on what they have learnt, how they have learnt it, what problems they encountered and how they tried to overcome these problems. This is the metacognitive phase of the activity. Frequently, no clear conclusion is reached. It is not expected that at the end of every such activity all students will have fully constructed the control of variables schema for themselves.
One of the many things that teachers have to learn is that in a cognitive acceleration lesson no immediately observable content objectives need be attained. Rather, they are in for the long term, with a series of such activities spread over 2 years providing repeated exposure to well-managed cognitive conflict and social construction.

More recently, we have been applying the same model to much younger children, aged 5 and 6 years, in Year 1 in the English school system. The pillars of cognitive acceleration remain the same, but with these young children the "content" is provided by the schemata of concrete operations, such as seriation, simple classification, and points of view. There are also a number of practical differences associated with the fact that cognitive acceleration at Year 1 depends on primary class teachers, rather than on secondary subject teachers. The Year 1 cognitive acceleration activities are published as Let's Think! (Adey, Robertson, & Venville, 2001) and are used with just six children at a time. The teacher supports this group while other children get on with their own work. Each day of the week another group gets the Let's Think! activity so that the whole class is covered in a week.

The 30 activities are delivered at the rate of one a week for one year. Here is one example, concerned with the schema of "points of view" (spatial perception).

Crossroads: The teacher sits at one side of a table, with two children at each of the other three sides. On the table is a model of a city crossroads, with various buildings, vehicles, a bus stop, a tree, a horse, and a duck. From any one side of the table it is not possible to see all of the objects since some are hidden by buildings. The teacher starts by ensuring that the children can name all of the objects and that phrases such as "in front of", "behind", and "beside" are understood by all. Then each pair of children is given a set of picture cards, and asked to choose the card which represents the scene as they see it. Often they will choose a card which shows an object they cannot actually see, because they know it is there. This leads to some discussion until they understand that the picture they choose must show just what they can actually see. So far this is not too demanding for most 6-year-olds. Now they are asked to imagine themselves to be sitting at a different side of the table and to choose the picture which represents what they would see from there. They have mentally to place themselves in a different position and "decentre" from their actual seating position. Each pair chooses their pictures, discussing it between them and arguing for their choice so that the cognitive conflict entailed in thinking themselves into another position leads to active social construction of an agreed image of what would be seen from that position. The teacher encourages each pair to share their choice together with their reasoning, and this can be challenged by the other pairs. From time to time, the teacher asks the children to reflect on what they did, how they did it, what difficulties they encountered, and any strategies they may have discovered for solving the problem.
This is the metacognitive phase. Typically these activities last about 30 minutes, by which time both teacher and children are quite exhausted – an effect we have repeatedly noticed from productive cognitive acceleration activities.

Effects in brief

All cognitive acceleration work has been subject to intense evaluation, which has been widely reported (e.g., Adey, Robertson, & Venville, 2002; Adey & Shayer, 1993, 1994; Shayer, 1996; Shayer & Adey, 2002). In brief, from the work with adolescents it has been shown that students who participate in cognitive acceleration programmes when they are aged 12–14 years go on to show significantly higher levels of cognitive development at the end of the programme, and significantly higher grades in academic achievement, compared with controls. These effects have been found up to

Table 16.1 Effect sizes from the original CASE experiment: CASE over controls. Only significant (p < .01) effect sizes are shown. All effects are of CASE classes > non-CASE control classes

                        Immediate post-1987       Delayed post-1988    GCSE (1), 1989 or 1990
                        Cognitive
                        development   Science     Science              Science   Maths   English
Year 7 start   Girls    –             –           0.60                 0.67      0.72    0.69
Year 7 start   Boys     –             –           –                    –         –       –
Year 8 start   Girls    –             –           –                    –         –       0.44
Year 8 start   Boys     0.75          –           0.72                 0.96      0.50    0.32

(1) General Certificate of Secondary Education, the national public examination taken in the UK at 16 years.

3 years after the programme in subject areas far removed from science. Table 16.1 summarises the data from the original 1984–87 experiment. The effect sizes here are derived from residualised gain scores, that is, the gains made by experimental groups over and above that which would be expected on the basis of the control groups' "normal" development over the same period. Thus each effect size is the mean residualised gain score of the experimental group divided by the pooled standard deviation of the gain scores of experimental and control groups.

From Table 16.1, it will be seen that substantial effect sizes are achieved. Although not all subgroups' effects reached statistical significance in the early experiment, there is clear evidence of a substantial long-term far-transfer effect. An intervention set in a science context produced gains in national public examinations taken two or three years later, in English and mathematics as well as in science. It is on this far-transfer effect that I would base the claim that the cognitive acceleration programme is tapping into a general intellectual processing mechanism of the mind. It may be plausible to attribute gains in mathematics scores to the similarity of mathematical and scientific schemata, but this explanation cannot be applied to the gains in English. The English gains were achieved in spite of the fact that the English teachers knew nothing of the cognitive acceleration programme taking place in the science department – we thus have a double-blind design. While one cannot definitively rule out the possibilities of socialisation, or simple language development, these seem unlikely explanations for effects of an intervention of about 30 hours, spread over two years in a science context, on English GCSE grades. I find it very difficult to account for the gains in English scores other than in terms of a general intelligence that is amenable to stimulation, that is, which is somewhat plastic.
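The effect-size definition given above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the CASE evaluations: it assumes that "expected" development is modelled by a straight-line regression of post-test on pre-test scores in the control group, and that the pooled standard deviation is taken over the residualised gains of both groups. All function names and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residualised_gains(pre, post, slope, intercept):
    """Actual post-test score minus the score predicted from the pre-test."""
    return [po - (intercept + slope * pr) for pr, po in zip(pre, post)]


def pooled_sd(a, b):
    """Pooled standard deviation of two samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def effect_size(ctrl_pre, ctrl_post, exp_pre, exp_post):
    """Mean residualised gain of the experimental group in pooled-SD units.

    Assumption: 'normal' development is the control group's regression line.
    """
    slope, intercept = fit_line(ctrl_pre, ctrl_post)
    rg_ctrl = residualised_gains(ctrl_pre, ctrl_post, slope, intercept)
    rg_exp = residualised_gains(exp_pre, exp_post, slope, intercept)
    return mean(rg_exp) / pooled_sd(rg_exp, rg_ctrl)


# Hypothetical scores, for illustration only.
ctrl_pre = [4.0, 5.0, 6.0, 7.0, 8.0]
ctrl_post = [5.1, 5.9, 7.2, 7.8, 9.0]
exp_pre = [4.0, 5.0, 6.0, 7.0, 8.0]
exp_post = [6.0, 6.8, 8.3, 8.7, 10.2]

d = effect_size(ctrl_pre, ctrl_post, exp_pre, exp_post)
print(f"effect size: {d:.2f} standard deviations")
```

By construction the control group's own residualised gains average zero, so the statistic expresses how far the experimental group has moved beyond the development that would have been expected without the intervention.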
When these results were published in 1995, we found ourselves at King's College London to be in great demand by schools who wished to buy into some of this "magic dust". Even after we had made clear the amount of

[Figure 16.1: scatter plot with one point per school. y-axis: mean GCSE grade (A to E); x-axis: mean Year 7 (12 years old) school intake (percentile, 20 to 70). Key: X = control schools, 0 = CASE schools, * = national average.]

Figure 16.1 Value-added effect of CASE on GCSE English (note that the scale on the x-axis is not an equal-interval scale).

commitment and work that was required, that cognitive stimulation was no "quick fix", we were able to accept a group of schools onto the first of our 2-year professional development programmes for science departments. In August 2005 we started our 15th cohort of schools, commencing the programme in September. This continual stream of schools adopting the programme has provided ample opportunities for replication of the original research. However, once we had something in our hands which we believed to work, we could not ethically deny it to some classes as controls, but had to adopt a different research design. This, for example, involves comparing the "value-added" effects (gains from school entry levels to GCSE grades five years later) of whole schools that have adopted cognitive acceleration with schools that have not. Figure 16.1 gives one example of such data. Here, each point represents one school, the x-axis is the mean cognitive developmental level of the students entering the school aged 12, and the y-axis is the mean grade obtained in that school in the GCSE examination taken five years later. Unsurprisingly there is a strong relationship between the ability intake of the school and its mean performance. What is clear is that schools that have used CASE show significant gains in GCSE grades compared with non-CASE schools, whatever their intake. Data for English are shown, but similar effects were obtained for science and mathematics.

Work with the younger students, aged 5–6, is more recent and we do not yet have any long-term results, but significant gains of experimentals over controls include effect sizes of 0.35 and 0.59 standard deviations for boys

16. A general factor in intelligence

383

and girls respectively on a spatial task and of 0.55 standard deviations for girls on a conservation task. Effects in tests of conservation demonstrate transfer, as there are no conservation activities in the Let's Think! materials.

Conclusion

In this chapter I have traced something of the background of the construct of intelligence, suggested why it has become something of a dirty word in educational environments, and tried to build a case for its reinstatement. The evidence that intelligence has a significant general component is based on correlations between apparently widely different intellectual abilities (Anderson, 1992) and on the massive reworking of factor-analytic studies by John Carroll (1993). Educators have nothing to fear from the generality of intelligence – indeed, there is cause for celebration. Our evidence accumulated over the past 20 years lends strong support to the construct of intelligence as both general and plastic. Teachers have available to them the possibility of raising the intellectual level of all their students, and this is actually a very efficient approach to education. Time devoted in the early years to stimulation of students' general intellect pays off handsomely as they start to learn everything more fluently and with greater depth of understanding. If we are condemned to seven or eight or nine independent “intelligences”, then each in turn will need that kind of stimulation. Worse, if knowledge is perceived as a series of essentially unconnected subjects, then the effects of stimulating work in one area will remain isolated in that area.

None of this is to say that intellectual stimulation is an easy process, nor that we fully understand the mechanisms by which it operates. Cognitive acceleration is a process of stimulation of natural development, and there are natural limits to the rate at which it can be speeded up, and so the whole process must be seen as long-term. It requires a pedagogy that is, in many ways, the antithesis of what is recognised in national curricula and inspection systems as “good teaching”. The necessary re-engineering demands faith on the part of education managers and a willingness to rethink their beliefs about learning on the part of teachers, as well as a lot of straight hard work. Our experience over the years has been that hundreds of schools and thousands of teachers have the will and capability of achieving the necessary shifts in practice.

As for the cognitive mechanism, there is still much to understand. Quantitatively inclined psychologists sometimes ask us to tease apart the three “pillars” of cognitive conflict, social construction, and metacognition, to see if we can isolate the effect of each. The problem is that we are not working with laboratory rats, but with children and teachers in the incredibly complex environments of real classrooms, where the control of variables is a practical impossibility. Understanding the mechanism will require much detailed qualitative work in those classrooms.

Adey

References

Adey, P., Hewitt, G., Hewitt, J., & Landau, N. (2004). The professional development of teachers: Practice and theory. Dordrecht, The Netherlands: Kluwer Academic.
Adey, P., Robertson, A., & Venville, G. (2001). Let's Think! Slough, UK: NFER-Nelson.
Adey, P., Robertson, A., & Venville, G. (2002). Effects of a cognitive stimulation programme on Year 1 pupils. British Journal of Educational Psychology, 72, 1–25.
Adey, P., & Shayer, M. (1993). An exploration of long-term far-transfer effects following an extended intervention programme in the high school science curriculum. Cognition and Instruction, 11, 1–29.
Adey, P., & Shayer, M. (1994). Really raising standards: Cognitive intervention and academic achievement. London: Routledge.
Adey, P., Shayer, M., & Yates, C. (2001). Thinking science: The curriculum materials of the CASE project (3rd ed.). London: Nelson Thornes.
Anderson, M. (1992). Intelligence and development: A cognitive theory. London: Blackwell.
Baddeley, A. D. (1990). Human memory: Theory and practice. Hove, UK: Lawrence Erlbaum Associates Ltd.
Binet, A. (1909). Les idées modernes sur les enfants. Paris: Ernest Flammarion.
Bruner, J. (1974). The relevance of education. Harmondsworth, UK: Penguin Education.
Burt, C. (1927). The measurement of individual capacities: A review of the psychology of individual differences. London: Oliver & Boyd.
Carroll, J. B. (1993). Human cognitive abilities. Cambridge: Cambridge University Press.
Case, R. (1974). Structures and strictures: Some functional limits to cognitive growth. Cognitive Psychology, 6, 544–574.
Case, R. (1985). Intellectual development: Birth to adulthood. New York: Academic Press.
Ceci, S. J., & Williams, W. M. (1997). Schooling, intelligence, and income. American Psychologist, 52, 1051–1058.
Chomsky, N. (1986). Knowledge of language: Its nature, origin, and use. Westport, CT: Praeger.
Currie, J., & Thomas, D. (2000). School quality and the longer-term effects of Head Start. The Journal of Human Resources, 35, 755–774.
Deary, I. J. (2000). Looking down on human intelligence: From psychometrics to the brain. Oxford: Oxford University Press.
Garces, E., Thomas, D., & Currie, J. (2000). Longer term effects of Head Start. National Bureau of Economic Research Working Paper No. 8054, Cambridge, MA.
Gardner, H. (1993). Frames of mind (2nd ed.). New York: Basic Books.
Guilford, J. P. (1967). The structure of human intelligence. New York: McGraw-Hill.
Herrnstein, R. J., & Murray, C. (1994). The bell curve: Intelligence and class structure in American life. New York: Free Press.
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking. London: Routledge and Kegan Paul.
Inhelder, B., & Piaget, J. (1964). The early growth of logic in the child: A classification and seriation. London: Routledge and Kegan Paul.
Jensen, A. R. (1973). Educability and group differences. London: Methuen.
Jensen, A. R. (1998). The g factor. Westport, CT: Praeger.
Larkin, S. (2001). Creating metacognitive experiences for 5- and 6-year old children. In M. Shayer & P. S. Adey (Eds.), Learning intelligence (pp. 65–79). Buckingham, UK: Open University Press.
Levitt, S. D., & Dubner, S. J. (2005). Freakonomics. London: Penguin Allen Lane.
Logie, R. H. (1999). Working memory. The Psychologist, 12, 174–178.
Oden, S., Schweinhart, L., Weikart, D., Marcus, S., & Xie, Y. (2000). Into adulthood: A study of the effects of Head Start. Ypsilanti, MI: High/Scope Press.
Pascual-Leone, J. (1976). On learning and development, Piagetian style. Canadian Psychological Review, 17, 270–297.
Perkins, D. (1995). Outsmarting IQ. New York: The Free Press.
Piaget, J. (1977). The development of thought: Equilibration of cognitive structures. Oxford: Blackwell.
Pinker, S. (1995). The language instinct: The new science of language and mind. London: Penguin.
Plomin, R., & DeFries, J. C. (1998). The genetics of cognitive abilities and disabilities. Scientific American, 278, 62–69.
Raven, J. C. (1960). Guide to the Standard Progressive Matrices. London: H. K. Lewis.
Shayer, M. (1996). Long term effects of Cognitive Acceleration through Science Education on achievement. Technical Report, Centre for the Advancement of Thinking, King's College London, November.
Shayer, M., & Adey, P. (1981). Towards a science of science teaching. London: Heinemann.
Shayer, M., & Adey, P. (Eds.). (2002). Learning intelligence: Cognitive acceleration across the curriculum from 5 to 15 years. Milton Keynes, UK: Open University Press.
Smith, L. (1992). Necessary knowledge: Piagetian perspectives on constructivism. Hove, UK: Lawrence Erlbaum Associates Ltd.
Styles, I. (1999). The study of intelligence – the interplay between theory and measurement. In M. Anderson (Ed.), The development of intelligence (pp. 19–42). Hove, UK: Psychology Press.
Styles, I., & Andrich, D. (1997). Faire le lien entre variables psychométriques et variables cognitivo-développementales régissant le fonctionnement intellectuel. Psychologie et Psychométrie, 18, 51–69.
Sutton, J. (2004). Even more brief. The Psychologist, 17, 708.
Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). Stanford-Binet intelligence scale (4th ed.). Chicago: Riverside.
Thurstone, L. L. (1924). The nature of intelligence. London: Kegan, Paul, Trubner, & Co.
Vernon, P. E. (1969). Intelligence, culture, and environment. London: Methuen.
Vygotsky, L. S. (1978). Mind in Society. Cambridge, MA: Harvard University Press.

17 Innovation, fatal accidents, and the evolution of general intelligence

Linda S. Gottfredson

How did humans evolve such remarkable intellectual powers? This is surely one of the most enduring and captivating questions in the life sciences, from paleoanthropology to neuroscience. Modern humans (Homo sapiens sapiens) far exceed all other species in their ability to learn, reason, and solve novel problems. We are, most strikingly, the only species whose members routinely use words and other abstract symbols to communicate with each other, record ideas in material form, and imagine alternative futures. Perhaps for these reasons we are the only species ever to have developed complex technologies that allow us radically to transform the physical environments we inhabit.

Human intelligence is tied in some manner to the large increase in brain size going up the human evolutionary tree (Geary, 2005; Holloway, 1996). When the encephalisation quotient (EQ; Jerison, 2002) is used to measure brain size relative to body size, modern humans are three times as encephalised (EQ = 6) as other primates (EQ = 2) and six times the average for all living mammals (EQ = 1, the reference group). This phylogenetic increase represents a disproportionate expansion of the brain's prefrontal cortex (Schoenemann, Sheehan, & Glotzer, 2005), which matures last and is most essential for the highest cognitive functions, including weighing alternatives, planning, understanding the temporal order of events (and thus cause-and-effect relations), and making decisions. Moreover, encephalisation of the human line proceeded rather quickly in evolutionary terms after the first hominids (Australopithecines, EQ = 3) split off from their common ancestor with chimpanzees (EQ = 2) about 5 million years ago.
Encephalisation was especially rapid during the past 500,000 to 1 million years (Aiello & Wheeler, 1995; Holloway, 1996; Ruff, Trinkaus, & Holliday, 1997), when relative brain size increased from under EQ = 4 for Homo erectus (arguably the first species of Homo) to about EQ = 6 for living humans (the only surviving subspecies of Homo sapiens). Brains are metabolically expensive. In humans they account for 2% of body weight but consume 20% of metabolic energy (Aiello & Wheeler, 1995). Hence, the rapid increase in relative brain size suggests that higher intelligence conferred a strong adaptive advantage. Attempts to identify the selection forces driving up intelligence in the human environment of evolutionary adaptedness (EEA) often look to the ecological, behavioural, and life history correlates of encephalisation, either in the paleontological record or through comparative studies of living species.

Evolutionary psychologists agree that increases in brain size are crucial in tracing the evolution of humans' extraordinary intelligence, but they say relatively little about what that intelligence actually is. They agree that humans have impressive reasoning abilities, which in turn confer valuable behavioural flexibility, but they conceptualise human intelligence in very different ways. The debate has focused on whether intelligence is “domain specific” (e.g., has “massive modularity”) or “domain general.” Proponents of domain specificity emphasise the morphological modularity of the human brain, likening it to a Swiss army knife, and argue that human intellectual prowess consists of a large collection of separate abilities that evolved independently to solve different specific adaptive problems, such as “cheater detection” (e.g., Tooby & Cosmides, 1992). Humans, they argue, have not evolved any meaningful content- and context-free general reasoning or learning ability, but are smart because the human brain evolved myriad “fast and frugal heuristics” (Gigerenzer & Todd, 1999). The domain generalists, emphasising the highly interconnected circuitry of the brain's distinct parts, argue that human intelligence is best understood as a generalised capacity that facilitates reasoning and adaptive problem solving, especially in novel, changing, or otherwise complex situations (e.g., Geary, 2005). These theorists acknowledge the modular elements of the brain and mind, but consider them subject to the more general learning and reasoning mechanisms that they believe humans have evolved.
This chapter aims to show not only that our species' distinctive intelligence is domain general at the phenotypic, genetic, and functional levels, but also how a general intelligence could have evolved. Drawing evidence from sister disciplines not often consulted by evolutionary psychologists, I first describe how general mental ability, g, represents a suite of generic critical thinking skills that provides individuals with pervasive practical advantages in coping with many life challenges, especially when tasks are complex. As will be illustrated, the cognitive demands of even the most mundane daily tasks are sufficient to put less intelligent persons at a higher relative risk for many unfavourable life outcomes, including premature death. One particularly large class of deaths – fatal accidents – will be used to illustrate how individual differences in g might contribute to differential mortality as people go about their daily lives. The prevalence, aetiology, and demographic patterning of accidental deaths in both modern and hunter-gatherer societies provide clues to how these could have winnowed away a group's less intelligent members throughout human evolution: fatal accidents (unintentional injuries) kill a disproportionate number of reproductive-age males, their accidents are generally associated with provisioning activities, and preventing these is a cognitively demanding process. Accidents have a high chance component, are diverse in type, but only rarely result in death, which dulls our appreciation of them. These attributes are also precisely what make them a potentially powerful force for evolving a general-purpose problem-solving mechanism rather than, for example, specific hazard-detection modules.

As oft noted, there must have been something unique in the Homo EEA to trigger the peculiarly rapid increase in hominid brain size and mental power. That trigger may have been human innovations during the past half million years, especially since the emergence of Homo sapiens sapiens just 50–150 thousand years ago. My hypothesis is that innovations in obtaining and processing food (e.g., fire, weapons, tools) lowered age-specific mortality rates relative to other primates, but they also created novel physical hazards that widened differences in risk of accidental death within human groups. Differences in risk within a population are, of course, the engine for natural selection. With each new innovation, humans could have strengthened natural selection for g.

Do humans possess a domain general intelligence?

Domain specificity theories of intelligence rest on the commonsense (but mistaken) notion that different tasks require different abilities. Indeed, until the 1980s, most experts on the topic believed that good performance on mental tests, in school, and at work, required having the particular constellation of specialised skills and abilities that best matched the idiosyncratic cognitive demands posed by particular tasks in particular settings. Most assumed, for example, that tests of mathematical ability would predict achievement well in maths but not in language, whereas tests of verbal ability would do the reverse. They likewise assumed that even in the same occupation (e.g., clerk) good performance required notably different sets of abilities when the work was performed in different companies, or units within them. Most social scientists therefore explicitly rejected the notion that any putative general intelligence could be useful in many endeavours, if it even existed. The evidence contradicting these early specificity theories of cognitive ability (e.g., Jensen, 1984; Schmidt, Law, Hunter, Rothstein, Pearlman, & McDaniel, 1993) is equally relevant in refuting domain specificity theories elsewhere in psychology. I begin with evidence for generality in the cognitive abilities that humans possess, and turn later to evidence on the abilities that everyday tasks require of us.

Generality of human intelligence (g) at the phenotypic level

There are many distinct cognitive abilities, and there are large ability differences within all human populations, including hunter-gatherers (Reuning, 1988). One of the first discoveries about such variation, however, was that individuals who perform well on one mental test tend to perform well on all others, even ones often presumed not to have any mental component (e.g., multi-limb coordination and tactile–kinaesthetic sensitivity). This is the case regardless of test content or format. A century of factor analyses (Carroll, 1993, Figure 15.1) has delineated the structure underlying this covariation in cognitive abilities. Perhaps its most important finding is that, to some degree, all tests tap the same ability (dubbed g, for the general mental ability factor). Next, abilities are best distinguished by level of generality–specificity, with the most general (g) at the apex of the hierarchy and highly specific abilities along its base. The most influential hierarchical model is Carroll's three-stratum theory (1993). He confirmed only one highly general factor, g, at the Stratum III apex, then 8–10 narrower but still broad abilities at the Stratum II level (broad “group factors” such as spatial, memory, and auditory abilities), and many specific aptitudes at the next lower level of generality (Stratum I or “primary” abilities such as ideational fluency, perceptual speed, and absolute pitch). Another crucial finding was that the g factor is not an amalgam of the narrower abilities in the strata below it, but provides the common core of them all. Each stratum dominates the composition of abilities in the stratum below. The Stratum II abilities have thus been aptly described as differently “flavoured” (spatially, verbally, etc.) versions of g. They, in turn, dominate the Stratum I abilities, each of which in turn represents a particular mix of the broad abilities above it, and of experience in deploying them in particular contexts. Narrower abilities are, accordingly, more content-specific and less heritable. The Strata I and II abilities, though still having a large g component, represent the more modularised and more environmentally sensitive ability differences among us.
They illustrate that highly specialised skills (extracting armadillos from their burrows, driving a car) do not necessarily require specialised innate reasoning modules, but just sufficient practice in mobilising the pertinent combinations of abilities (cognitive, psychomotor) required to master specific tasks in specific settings. Other research has shown that differences in g are manifested in behaviour as differences in generic thinking skills – such as learning, reasoning, and abstract thinking – and hence in the ability to apprehend, transform, and understand information of virtually any kind. Differences in g are measured well, though not perfectly, by IQ tests. Moreover, when a general factor is extracted (1) from different IQ test batteries and (2) for test takers of different ages, sexes, races, and nationalities, all the resulting general factors are nearly identical and converge on the same psychometrically “true” g factor (Jensen, 1998). This signals that within all groups, the g continuum is a shared fact of nature, not the product of any particular culture (see also Chabris, Chapter 19, this volume). Why demographic groups tend to be spread somewhat differently along this common g continuum is a separate issue.

Carroll (1993) tentatively placed two highly correlated but still distinguishable g factors at the Stratum II level: fluid g, which can be conceived as raw mental horsepower, and crystallised g, which reflects knowledge crystallised from sustained application of fluid g over the life course. The former is usually found to be isomorphic with the Stratum III g factor, and so all references to g in this chapter are to fluid g. Of psychometricians who still use the term “intelligence,” most now restrict it to the single Stratum III ability, g, as I do here. Some social scientists (e.g., Gardner, 1983) ignore the general factor and label the Stratum II abilities as multiple intelligences (“linguistic,” “visuospatial,” “musical,” “intrapersonal,” etc.). Others stretch the label to include all human competencies, broad or narrow, cognitive or not (“successful intelligence”; Sternberg, 1997). Domain specificity theorists also apply the term “intelligence” to a large collectivity of abilities that are no more general than Stratum I abilities, said to be independent, and which perhaps extend outside the cognitive realm.

General intelligence is often described as the ability to learn, the implicit reference being to those natural settings in which people notice big differences in learning proficiency (school, jobs) and thus to tasks where learning well depends on reasoning well. Individual differences in g are most highly correlated with differences in learning proficiency when learning is intentional, hierarchical, meaningful, insightful, and age-related (easier for older than younger children), and when learning requires the transfer of prior knowledge to new tasks, allows everyone the same fixed amount of time, and is moderately difficult. Like other life tasks, learning ranges from low to high in cognitive complexity, and thus in amount of reasoning required. High g confers little advantage when learning must be by rote or mere association.
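
The pattern described earlier in this section, in which every test correlates positively with every other and a single general factor forms their common core, can be made concrete with a small numerical sketch. It is an illustration only: the five test names and their g loadings are hypothetical values chosen for the example, not figures from Carroll (1993), and the first principal component is used here as a simple stand-in for the hierarchical factor analyses the chapter refers to.

```python
import math

# Hypothetical g loadings for five illustrative tests
# (assumed values for this sketch, not estimates from any real battery).
LOADINGS = {
    "vocabulary": 0.8,
    "number series": 0.7,
    "block design": 0.6,
    "digit span": 0.5,
    "perceptual speed": 0.4,
}


def implied_correlations(loadings):
    """Correlation matrix implied by a one-factor model: r_ij = l_i * l_j for i != j."""
    return [[1.0 if i == j else a * b for j, b in enumerate(loadings)]
            for i, a in enumerate(loadings)]


def first_principal_component(matrix, iterations=500):
    """Power iteration: returns (largest eigenvalue, unit-length eigenvector)."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return eigenvalue, vector


lam = list(LOADINGS.values())
R = implied_correlations(lam)
eigenvalue, vector = first_principal_component(R)
share = eigenvalue / len(lam)  # proportion of total variance on the first component

for name, weight in zip(LOADINGS, vector):
    print(f"{name:17s} weight on first component: {weight:.2f}")
print(f"first component accounts for {share:.0%} of total variance")
```

Under these assumptions every off-diagonal correlation is positive, all five tests carry weights of the same sign on the first component, and the tests with the higher assumed loadings carry the larger weights, which is the sense in which a general factor runs through otherwise different tests.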
Figure 17.1 makes the practical consequences of ability differences more concrete: Adults near the threshold for mild mental retardation (IQ 70) can usually learn simple work tasks (mopping a floor, answering a telephone, etc.) if given sufficient hands-on, one-on-one, repetitive instruction and supervision. Persons of average psychometric intelligence (IQ 100) can learn a wide variety of routine procedures via written materials and demonstration. Individuals near the threshold for mild giftedness (IQ 130) can be self-instructing. Most individuals toward the left tail of the IQ distribution can learn simple ideas and procedures, but only individuals toward the right tail are likely to generate new ones. The latter are also the most proficient at picking up the knowledge and solving the problems that a broader culture generates, as well as being the most likely to lead it in new directions. No culture can sustain new practices, however, that impose cognitive demands on the general populace that are beyond the capacity of its large cognitive middle.

On the whole, g is not correlated with differences in personality, temperament, or physical strength, and it is only moderately correlated with interpersonal and psychomotor skills when all are measured in a psychometrically sound manner (Campbell & Knapp, 2001) – important because it is easy to get falsely low or inconsistent correlations with unreliable measures or range-restricted samples. The g factor is therefore distinct from certain other abilities and propensities, sometimes referred to as intelligences, but for which there are no validated tests; for example, intrapersonal, emotional, kinesthetic, and Machiavellian (social) intelligence.

Life chances (lowest to highest IQ range) and % of population: “High risk” 5%; “Up-Hill Battle” 20%; “Keeping Up” 50%; “Out Ahead” 20%; “Yours to Lose” 5%

Training potential (labels along the IQ axis): Slow, simple, supervised; Very explicit, hands-on; Mastery learning, hands-on; Written materials, plus experience; College format; Gathers, infers own information

Career potential (labels along the IQ axis): Assembler, Food Service, Nurse's Aide; Clerk, Teller, Police Officer, Machinist, Sales; Manager, Teacher, Accountant; Attorney, Chemist, Executive

WAIS IQ:     70   75   80   85   90   95  100  105  110  115  120  125  130
WPT score:    6    8   10   13   15   17   20   23   25   28   30   33   36

Outcome (%)                                  “High risk”   “Up-Hill Battle”   “Keeping Up”   “Out Ahead”   “Yours to Lose”
Out of labour force 1+ mo/yr (men)               22              19                15             14              10
Unemployed 1+ mo/yr (men)                        12              10                 7              7               2
Divorced in 5 yrs                                21              22                23             15               9
Had illegitimate child (women)                   32              17                 8              4               2
Lives in poverty                                 30              16                 6              3               2
Went on welfare after first child (women)        55              21                12              4               1
Doing time/ever incarcerated (men)                7               7                 3              1               0
Chronic welfare recipient (mothers)              31              17                 8              2               0
High school dropout                              55              35                 6              0.4             0

Figure 17.1 Training and career potential in different IQ ranges, and percentage of young white adults in each range who experience various negative outcomes. WPT = Wonderlic Personnel Test. (Adapted from Figure 3 and Table 10 in Gottfredson (1997). Copyright 1997 by Elsevier Science. Reprinted with permission.)

There is a growing tendency in evolutionary psychology, however, to equate general intelligence with a “social intelligence,” said to have evolved from an evolutionary arms race to acquire skills for outwitting peers and competitors (Dunbar, 1998). The relevant set of social skills is never delineated, but they appear to range from the mostly preprogrammed (e.g., face recognition, cheater detection) to more consciously controlled, culturally recognised behaviours (e.g., coalition building). Many of the latter encompass fairly global people-related strengths, whose correlations with each other, g, and various forms of life success have already been charted by differential psychologists. These range from the “big five” dimensions of personality (openness, conscientiousness, extraversion, agreeableness, and neuroticism), which are mostly independent of g, to particular aptitudes in influencing people that depend somewhat on g (e.g., persuading, instructing, managing, leading; Barrick, Stewart, Neubert, & Mount, 1998; Campbell & Knapp, 2001). The skills for manipulating abstract information (g) and those for controlling other people (any putative social or Machiavellian intelligence) are only partially overlapping sets, and therefore can also be expected to have somewhat divergent genetic and evolutionary origins.

Divergent origins for intellectual and interpersonal competence are also suggested by the large, consistent, and worldwide sex differences in socioemotional competencies, temperament, interests in people versus things, nonverbal behaviour and perceptiveness (including face recognition), and ways of dealing with other persons (Baron-Cohen, 2003; Campbell & Knapp, 2001). In contrast, there are at most only slight sex differences in general intelligence. The clearest sex differences in cognitive ability are seen in the narrower, more modularised abilities, such as spatial and verbal ability, some of which cluster with the sex differences in temperament and interests (Ackerman & Heggestad, 1997). Cognitive abilities hardly exhaust the palette of human competence.
But to understand the evolutionary origins of general intelligence, g, inquiry must target the more strictly cognitive skills by which ancestral Homo sapiens met its environmental challenges, both human and not.

Generality of human intelligence (g) at the genotypic level

Human intelligence is also general at the genotypic level (Plomin, DeFries, McClearn, & McGuffin, 2001; see also Brody, Chapter 18, this volume). The heritability of IQ is moderately high, rising from under 40% in the preschool years, to 60% by adolescence, to 80% in adulthood. The Stratum II abilities are also moderately heritable, but they share most of their heritability with g. The high genetic overlap of the Stratum II abilities with g means that the same genes are responsible for much of the variation in all of them. This means, in turn, that all the distinct, broad abilities (and any associated brain modules) tend to function either in tandem or, if functioning independently, with similar efficiency owing to common physiological constraints (e.g., neural speed) (see also Chabris, Chapter 19, this volume).

Recent brain imaging studies do, in fact, indicate that complex g-loaded cognitive tasks activate multiple brain areas (e.g., Gray, Chabris, & Braver, 2003). Other research confirms the human brain's great connectivity by documenting a vast neurological web for transmitting information among all its parts. Indeed, as Homo evolved bigger brains, white matter (in essence, the relay stations for reciprocal transmission of information throughout the brain) increased faster than grey matter in the crucial prefrontal area of the brain (Schoenemann et al., 2005). So, instead of representing either the sum total of modular processes or simply a domain specific adaptation, psychometric g may support or constitute a general executive or integrative capacity that selectively mobilises, inhibits, and coordinates many of the brain's more specialised functions for gathering information and acting on it (see Happaney & Zelazo, Chapter 11; McKinnon, Levine, & Moscovitch, Chapter 7; Moses & Sabbagh, Chapter 12, this volume).

A wide range of heritable metabolic, chemical, electrical, and structural features of the human brain correlate with differences in g, from volume of the whole brain and of its grey matter, to rate of glucose metabolism and complexity of brain waves. These features are found to correlate with g at the genetic level too, when the requisite behaviour genetic analyses have been possible (Jensen, 1998; Toga & Thompson, 2005). The many heritable physiological correlates of psychometric g have led some researchers to suspect that g represents a general property of the brain's neural substrate (nerve conduction velocity, dendritic branching, etc.) that affects how all its parts function.

Generality of human intelligence (g) at the functional level

Finally, g level has highly generalised effects on individuals' wellbeing, from physical health to social status (Deary, Whiteman, Starr, Whalley, & Fox, 2004; Gottfredson, 2002). In fact, whether g predicts well or poorly, it is generally the best single predictor – better than socioeconomic status – of both the good and bad life outcomes that concern policy makers (e.g., success in school and work, delinquency). Figure 17.1 illustrates that g-related gradients of risk are much steeper for some life outcomes than others. Compare, for instance, the risks facing young white adults of very low IQ (below 75) to those of very high IQ (above 125): The former are twice as likely to become divorced within 5 years (21% versus 9%), but their risk of unemployment is 6-fold greater (12% versus 2%) and their risk of living in poverty is 15-fold greater (30% versus 2%). But whatever the odds, they all tilt against persons lower in intelligence. And large or small, these greater risks to wellbeing pervade the lives of less intellectually able individuals, piling up one risk after another.

Varied kinds of evidence indicate that g's role in the thick network of correlated life outcomes is causal – that differences in g level create differences in performance in school, work, and everyday self-maintenance, and that they do so independently of social class. For instance, not only do siblings in the same household differ two-thirds as much in IQ (for mostly genetic reasons) as do random strangers, but these within-family IQ differences portend much the same inequality in life outcomes among siblings as they do in the general population (Murray, 1998). In addition, income, occupational, and educational levels are themselves moderately heritable (40–50%, 50%, and 60–70%, respectively), with from half to two-thirds of their heritability overlapping that for g (Rowe, Vesterdal, & Rodgers, 1998).

The breadth of g's utility means that a wide variety of ostensibly different ecological demands could have selected for this general cognitive capacity. It is essential to note, however, that whereas higher g enhances performance in perhaps all kinds of instrumental tasks, its influence seems far weaker when individuals are dealing with socioemotional challenges, such as family, peer, and coworker relations. Additional personal strengths are crucial to being an effective leader, manager, salesperson, team-mate, citizen, or caregiver. This partial disjunction in the functional utility of cognitive versus socioemotional skills, together with the psychometric evidence that they are somewhat independent, suggests that g evolved more in response to the instrumental demands of humankind's early environment than to its social or emotional demands.
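
The risk gradients quoted in this section for the lowest and highest IQ ranges can be checked with simple arithmetic. The short sketch below does so using only the three pairs of percentages given in the text for Figure 17.1; the function and variable names are introduced here purely for illustration.

```python
# Percentages quoted in the text from Figure 17.1:
# (IQ below 75, IQ above 125) for young white adults.
outcomes = {
    "divorced within 5 years": (21, 9),
    "unemployed 1+ month per year (men)": (12, 2),
    "lives in poverty": (30, 2),
}


def risk_ratio(low_band_pct, high_band_pct):
    """How many times more common the outcome is in the low band than in the high band."""
    return low_band_pct / high_band_pct


ratios = {name: risk_ratio(low, high) for name, (low, high) in outcomes.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} times as likely in the lowest IQ range")
```

The ratios come out at roughly 2.3, 6, and 15, matching the "twice", "6-fold", and "15-fold" comparisons in the text and showing how much steeper the gradient is for poverty than for divorce.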

Do human environments make domain general cognitive demands?

There are excellent discussions of the generality of psychometric g in evolutionary psychology (e.g., Geary, 2005) as well as in psychometrics (Jensen, 1998), but neither discipline has said much about what the cognitive demands of daily life actually are. These deserve close analysis, however, because they provide the ingredients of external forces that select for g. Abilities are, by definition, qualities that enhance an individual's performance in some particular range of tasks. This means that a general ability is one that is useful in a great variety of them. It is "generalisable." Much research is available for two cognitive activities, seemingly at the extremes of real-world practicality – taking mental tests and performing specialised jobs.

Task demands of mental tests

Just as humans differ in intelligence level (g), tasks differ in how well they call forth or measure individual differences in g (their g loadedness). More g-loaded tests require more complex information processing, either on the spot (tests of fluid g) or mostly in the past (tests of crystallised g), but their complexity has nothing to do with either their format or their manifest content. Nor does it depend on whether test items require some bit of cultural knowledge; are built with numbers, words, pictures, or symbols;
are administered individually or in groups; or whether test takers respond orally or in writing. Rather, complexity increases when tasks require more mental manipulation; for example, when the information to be processed is more voluminous, abstract, ambiguous, uncertain, incomplete, novel, or embedded in distracting material; and when the task requires spotting regularities, judging relevance, drawing inferences, integrating information, or otherwise evaluating and mentally transforming information to some end. Virtually any format or content, academic or not, can be used to build differentially complex cognitive tasks; for example, more versus less g-loaded tests of domain specific aptitudes (e.g., mathematical reasoning versus arithmetic computation; reading comprehension versus spelling), subtests on an IQ battery (digits backward versus digits forward), or items in a particular subtest (9-block versus 4-block diagrams to copy in block design). Increments in complexity can be seen in the vocabulary subtest of the Wechsler Adult Intelligence Scale (WAIS), where the proportion of adults able to define common words drops as the words become more abstract: bed (a practice item; 100% get at least partial credit), sentence (83%), domestic (65%), tranquil (36%), and travesty (5%; Gottfredson, 1997). Rising complexity is also readily apparent in the following three Number Series Completion items: 2, 4, 6, __, __; 2, 4, 8, __, __; and 2, 3, 4, 3, 4, 5, __, __. Perhaps the most important insight from psychometrics, for present purposes, is that individual test items need not measure g very well for a large number of them to create an excellent test of g.
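This aggregation principle is conventionally quantified with the Spearman-Brown prophecy formula, which predicts the reliability of a composite of k parallel items from the reliability of a single item. The sketch below is illustrative only: the function names are invented here, the single-item value of 0.05 is a hypothetical number rather than one taken from the chapter, and the formula assumes that the items are parallel (equally reliable, with uncorrelated errors).

```python
def spearman_brown(rho_item, k):
    """Predicted reliability of a composite of k parallel items,
    each with reliability rho_item (Spearman-Brown prophecy formula)."""
    return k * rho_item / (1 + (k - 1) * rho_item)


def items_needed(rho_item, rho_target):
    """Number of parallel items required to reach rho_target
    (the same formula solved for k)."""
    return rho_target * (1 - rho_item) / (rho_item * (1 - rho_target))


# Hypothetical example: each item alone is a very weak measure (0.05).
weak_item = 0.05
composite_100 = spearman_brown(weak_item, 100)   # reliability of a 100-item test
k_for_090 = items_needed(weak_item, 0.90)        # items needed for reliability 0.90

print(f"100 items: {composite_100:.2f}")
print(f"items needed for 0.90: {k_for_090:.0f}")
```

Under these assumptions, items that are individually almost useless still combine into a highly reliable composite once there are enough of them.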
If g is the only thing that the items measure in common, and as long as there are enough items, the error (non-g) components of the different items will cancel each other out and leave the items' small g components to cumulate and create a highly reliable measure of virtually nothing but g (the Spearman-Brown prophecy formula indicates how many items are needed). In like manner, a good measure of g can be extracted from a broad collection of everyday knowledge tests (politics, religion, sports, health, etc.) despite none of them individually correlating highly with g (Lubinski & Humphreys, 1997). The lesson for evolutionary psychology, explored below, is that consistent effects, even when individually quite small, can cumulate over time to have large consequences – much as does a gambling house's small advantage at roulette (Gordon, 1997). Complexity is likewise the active ingredient in tests of functional literacy, where items simulate everyday tasks that all adults are routinely expected to perform in modern societies (e.g., reading maps and menus, filling out bank deposit slips and job applications, grasping the main point of a short news article). The US Department of Education's National Adult Literacy Survey (NALS; Kirsch, Jungeblut, Jenkins, & Kolstad, 1993) set out to measure three separate kinds of functional literacy (prose, document, and quantitative). All three NALS scales, however, produced nearly identical results and measured virtually nothing but a single general factor. That
factor was not readability, per se (e.g., word or sentence length). Rather, it was "processing complexity." More specifically, formal analyses showed that differences in item difficulty (percentage of people passing an item) reflected degree of inference required, abstractness, and amount of distracting information – in essence, the item's g loadedness. Whether an individual is proficient at any particular NALS task seldom matters much. What does hurt a significant proportion of adults is their being routinely unable to perform a wide variety of such daily tasks. To illustrate, here are the percentages of American adults who are routinely able to perform tasks comparable in complexity to the following: locate the time of a meeting on a form – 77%; determine the correct change using information in a menu – 21%; and interpret a brief phrase from a lengthy news article – 3% (Kirsch et al., 1993, pp. 113–115). Being highly g loaded, all three NALS scales not surprisingly predict socioeconomic wellbeing (whether living in poverty, utilising welfare, looking for work, etc.) in the same pattern as presented earlier for IQ (Figure 17.1). The NALS results led one national panel to conclude that almost half of American adults do not have sufficient functional literacy (Level 3 or above) to compete in the global economy, or engage their rights and responsibilities as citizens. These NALS results provide a concrete example of how the seemingly inconsequential minutiae of daily life can yield major differences in personal wellbeing when, like the items on a mental test, they consistently play to the strengths of some individuals, but not others, in avoiding common mistakes (Gordon, 1997).

Cognitive demands of work

The US Department of Labor's (1991) Dictionary of Occupational Titles provides separate descriptions for almost 18,000 job titles, so today's workplace might seem to represent the height of functional specialisation.
Provisioning one's family in the Pleistocene clearly was not so specialised. It may even have been far less cognitively demanding than most jobs today. But we cannot thereby assume that the distinctions in ability that jobs render most important today were not also highly consequential throughout human evolution. Nor can we easily infer which distinctions were most important at some particular time just by comparing the cultural artifacts left behind in different epochs. Many activities leave no artifacts, and the sophistication of those that do remain may represent the ability level only of some critical mass of individuals sufficiently bright to invent (or import) and sustain those practices within the group. Moreover, any such critical mass, or carrying capacity, might sometimes have been achieved by increases in a population's size rather than its average intelligence level. Large-scale job analysis studies routinely show that occupations today, like mental tests, differ most fundamentally in the general complexity of the work they require incumbents to perform, and not in their manifest content
(medicine, law, technology, art). Content-specific task demands, such as dealing with people rather than things or data (three of the Dictionary's rating scales), become important only when distinguishing occupations of similar complexity level (e.g., mid-level sales versus crafts or clerical work). Drawing from such diverse job analyses, one study profiled the particular worker tasks, worker aptitudes, and working conditions that contribute most to a job's overall complexity (Gottfredson, 1997). A job's complexity depends on the amount, level, and variety of information processing that it requires. Specific tasks correlating highly with complexity include compiling (r = .90), analysing (.83), and transmitting relevant information, whatever form it takes (written, .84; quantitative, .68; oral, .68; behavioural, .59; pictorial, .44, etc.). Tasks involving high-level controlled information processing (e.g., reasoning, .83; analysing, .83; planning, .83; decision-making, .82) contribute more to overall job complexity than do more elemental processes (e.g., recognise, .36; remember, .40; transcribe, .51; and code/decode, .68). Working conditions and task configurations can also increase complexity, and they include working: under distractions (.78) or time pressure (.55); in varied and changing circumstances (.41); and with much need for updating knowledge (.85) and self-direction (.88) in which tasks to perform, when, and how – all of which characterise many professional and executive jobs. Low-complexity jobs (e.g., packer, custodian, food service worker) entail quite the reverse: mostly activities that are repetitive and continuous (−.79 with complexity), highly structured (−.79), and closely supervised (.73).
Middle-complexity jobs (much clerical, sales, and skilled trades work) require moderate levels of planning, analysis, judgment, and pertinent training, but their constituent tasks are narrower in scope, more fully specified, and more predictable than those in complex occupations (and hence more fully trainable). Not surprisingly, IQ level best predicts differences in performance in high-level jobs, the correlations with IQ ranging from about 0.2 in simple jobs to 0.8 in the most complex (corrected for unreliability and restriction in range on incumbents' IQ). Being more cognitively facile aids performance at least a bit in all jobs, but these correlations show that the edge it provides grows with the complexity of a task. The same edge no doubt exists outside the workplace too, because most tasks that workers are paid to perform (transporting, instructing, advising, building, repairing, healing, etc.) mirror domestic tasks that the typical adult also undertakes. Arvey (1986) characterised task demands more globally, showing more directly that overall job complexity calls forth the very abilities often used to describe general intelligence itself: effective learning (e.g., "learn and recall job-related information," r = .71 with the study's dominant Judgment and Reasoning factor; "learn new procedures quickly," .66), reasoning ("reason and make judgments," .69), and problem solving ("apply common sense to solve problems," .66). Perhaps more importantly, the study
highlights an underappreciated contributor to the complexity and criticality of work: dealing with unexpected, lurking, and nonobvious problems ("deal with unexpected situations," .75; "identify problem situations quickly," .69; "react swiftly when unexpected problems occur," .67). That is, jobs are more cognitively complex when they require not only solving known problems, but also spotting and diagnosing new ones: not just finding solutions, but seeing the problems in the first place (see also Simonton, Chapter 15, this volume). Indeed, aptness in conceptualising risk and opportunity, in visualising the unseen and unexpected, may be the most distinctive aspect of highly complex jobs – and of human intelligence itself. It represents what is sometimes dubbed the mind's eye: the ability that only humans have to conceptualise a world beyond the stimuli immediately in front of them, to create images of a reality not concretely present, and to realise they are effecting that separation. The mind's eye does not restrict its gaze to any particular content domain, but surveys many. It entails the ability to abstract salient features of the environment and to perceive a separate, intentional, self-directed self within that environment. Aided by language, that uniquely human storehouse of concepts, the mind's eye confers the ability to "time travel," "read minds," and construct scenarios for any realm of life, whether physical, biological, social, or spiritual. Its breadth of vision contradicts the notion that the brain and mind consist only of specialised modules that evolved to solve highly domain-restricted problems. So does its very existence, precisely because the mind's eye represents humankind freeing itself somewhat from the dictates of immediate experience – dictates that modularists are probably correct in supposing would foster modularity.
Importantly, it allows humans to inhibit natural reactions to present circumstances in order to enhance future wellbeing. It seems mistaken to assume that the fundamental advantages of having a higher g than one's contemporaries are different today than during the human EEA. These advantages may also be far more elemental than most of us had supposed – namely, to infer or imagine what cannot be seen directly. Homo sapiens may be Man the Toolmaker, the Hunter, the Hunted, Scavenger, Warrior, Coalition Builder, and much more, but his distinctive attribute is more profound – he is an imaginist.

Does higher intelligence predict lower mortality?

Selection proceeds, however, only when there is differential reproduction or mortality of different (genetically influenced) phenotypes in the species. Data for modern populations provide valuable clues, once again, for how more proficient reasoning (higher g) in daily life might have enabled brighter individuals in the EEA to leave more genetic descendants than their contemporaries.

Modern states have lowered their overall rates of morbidity and mortality by providing better medical care and buffering their inhabitants from many kinds of illness and injury (better sanitation, immunisations, safer cars and roads). If cognitive competence helps predict mortality in modern states, then it probably predicted mortality in early human environments too, where individuals had to rely more fully than now on their own resources and good judgment.

IQ-related differences in health self-care

Cohort studies reveal robust relations between childhood IQ and adult mortality. For example, three large cohort studies in the Scottish Mental Surveys found that higher IQ at age 11 forecast lower all-cause mortality, fewer deaths from stomach and lung cancer, less late-onset dementia, and more functional independence among persons followed up at ages 55 to 70 (Deary et al., 2004). A significant association between IQ and premature death remained after controlling for confounding variables. A large cohort study of Australian male army veterans followed to about age 40 found that higher IQ at induction (~age 18) predicted lower all-cause mortality, and fewer deaths from suicide and motor vehicle accidents (the two major causes of death), even after controlling for other personal factors, including prior health (O'Toole & Stankov, 1992). Both sets of analyses reported that each additional IQ point (e.g., 97 versus 96) was associated with about a 1% reduction in relative risk of death, meaning that a one standard deviation difference in IQ (15 points) was associated with about a 15 per cent difference in mortality. Relatively little research is available on IQ's relation to health, but much has been done in relating health to other personal attributes that provide differentially valid surrogates for IQ. The closest surrogate is functional literacy, discussed earlier.
Better performance on tests of health literacy (a general capacity to learn, reason, and solve problems in the health domain) predicts lower health costs, less hospitalisation, better understanding of one's chronic disease, and more effective adherence to treatment regimens (Gottfredson, 2004). Again, differences in risk are not much reduced after controlling for income, health insurance, and other risk factors. Years of education, occupational level, and income in adulthood provide progressively weaker surrogates for IQ because they are successively weaker correlates of it (from about 0.6 for years of education to 0.3 for income). All these surrogates correlate with health knowledge, health habits, morbidity, and mortality, but in order of their validity as surrogates for IQ. This consistent pattern for IQ surrogates, where income is the weakest, and functional literacy is the best correlate of both IQ and health, suggests that higher relative (not absolute) risk for poor health is rooted more in people's differences in mental than material resources. Health scientists often treat IQ as just a marker for socioeconomic status (SES), but the

opposite is a safer bet. That is, social class may predict health differences within a population mostly because it provides a weak but valid signal for the cognitive capabilities that allow people to prevent and effectively manage illness and injury. Possessing material resources is not enough; they mean little if not exploited wisely. Supporting evidence for the cognitive resources hypothesis comes from failed efforts to equalise health by equalising relevant material resources. For example, when Great Britain established free national health care in the 1950s, health inequalities increased rather than decreased. Although health improved overall, it improved less in the lower occupational classes than in the higher ones. Absolute risk decreased, but class-related relative risk (i.e., differences in risk) increased. This is also the usual effect when new preventive techniques become available (e.g., Pap smears and mammograms), even when they are provided free of charge. SES-related gaps in knowledge likewise grow when vital health information (e.g., signs and symptoms of cancer and diabetes) is disseminated more widely to the general population, as is also the case for other educational interventions. Perhaps the strongest evidence for the causal importance of cognitive resources comes from reversals in g-related risk gradients when new hazards are discovered. Heart disease and certain cancers once disproportionately afflicted the higher classes, who were better able to afford cigarettes and red meat, but the risk gradients flipped to disfavour the lower classes once these luxuries were found to increase the risk of chronic disease. Other research suggests why: Childhood IQ predicted who, in a cohort of individuals born in 1921, quit smoking after its dangers became known in mid-century (Deary et al., 2004). Health literacy research converges on the same explanation for why inequalities grow even as a population's health improves.
Researchers concluded that individuals who score poorly on tests of health literacy (misread medicine labels, etc.) do so primarily because they learn and reason poorly. They are thereby less able to profit from advances in health knowledge and medical technology. They less often seek the preventive care available to them, less often recognise when they need medical care, and adhere less effectively to the medical treatments they are prescribed (see Gottfredson, 2004). In an important sense, each of us is our own primary healthcare provider. Health self-care is a lifelong job, and it is becoming ever more complex as health information proliferates and treatments become more complicated. Arvey's (1986) job analysis, when applied to the job of health self-care, warns that it will increasingly require us to "learn and recall job [health]-related information," "learn new procedures [treatments] quickly," "deal with unexpected situations [health emergencies]," "identify problem situations [symptoms of disease] quickly," and "reason and make judgments [in the daily management of a chronic illness]." The mind's eye is especially important in motivating adherence to treatment when deadly
diseases such as hypertension have no outward symptoms or, as with diabetes, lax self-care (blood sugar frequently too high) causes no immediate, obvious harm, but the internal damage builds inexorably toward disability and death.

SES-related risk of fatal accidents

Relatively few people in developed nations die today from infectious diseases such as malaria and cholera, which still kill many people in developing countries. Instead, they succumb to chronic diseases such as cancer, stroke, and heart disease, usually long after their reproductive years have ended. What is common to all societies, however, is that injuries are a major killer (Baker et al., 1992; Smith & Barss, 1991). These may be either intentional (homicide and suicide) or unintentional ("accidents"). In 1999, unintentional injury was the single largest cause of death in the USA for ages 1–34, and it was the second and third largest, respectively, among persons aged 35–44 and 45–54 (National Center for Injury Prevention and Control, 2002). Developing countries show the same basic pattern. In the transition from hunter-gatherer societies to modern states, death rates from homicide and warfare fall, and rates of suicide rise, and these rates vary more by nation than do rates of death from unintentional injury (Smith & Barss, 1991). The large toll from unintentional (accidental) injury thus appears to be the more stable component of human mortality. Nations invest much less effort in preventing deaths from unintentional injury than from illness and intentional injury. Reports on the matter invariably refer to accidents as a large but neglected public health problem (National Research Council, 1985; Smith & Barss, 1991). This may be partly explained by unintentional injuries generally being thought of as accidental, as unlucky rolls of the dice. Chance plays a role, of course, but unintentional injury rates are highly patterned in all societies. They do not strike randomly by age, sex, or social class.
Even death by lightning, the seemingly paradigmatic chance event, most often strikes adolescent males. As described later, human behaviour is deeply implicated in the cause and course of accidents. In fact, public health researchers describe how notoriously difficult it is to persuade people to behave in safer and more healthful ways (e.g., not smoke in bed, not drink and drive, eat right and exercise). Even laws that prohibit unsafe behaviour (speeding) and mandate protective gear (helmets, seat belts) have only limited efficacy in changing behaviour (National Research Council, 1985). Table 17.1 outlines the pattern of injury mortality in the United States in 1986, the most recent year for which such a detailed portrait has been compiled. The last column provides death rates per 100,000 for all categories of injury. For example, it shows that 64 of every 100,000 Americans in 1980–86 died from an injury, almost two-thirds (41 per 100,000) unintentional. Whereas chronic disease typically kills late in life, injuries often

Table 17.1 Rates of death from injury per 100,000 population, and relative risk (odds ratio) by per capita income of area of residence, 69 causes [a], 1980–1986, United States

                                                    Odds ratio by per capita income of neighbourhood
Cause                                               < $6K     $10–11K   $14K      Deaths per 100,000 pop.

69. Total (causes 1–68)                             3.5       1.0       0.5       64.04
58. Suicide (50–57)                                 0.9       1.0       0.8       12.24
64. Homicide (59–63) [b]                            0.9       1.0       0.3       9.15

Unintentional injuries, total
6. Motor vehicle accidents, traffic (1–5)           2.1       1.0       0.7       19.96
48. Other unintentional (7–47)                      2.0       1.0       0.8       21.20

Primarily the very young and old
27. Falls (21–26) (elderly)                         1.0       1.0       0.9       5.21
40. Suffocation (infants)                           1.3       1.0       0.8       0.38
5. Pedestrian, traffic (elderly)                    1.3       1.0       0.6       3.19
38. Aspiration, food (infants, elderly)             1.5       1.0       0.9       0.78
42. Collision w/object/person (very old)            1.8       1.0       0.8       0.11
39. Aspiration, nonfood (infants, elderly)          2.1       1.0       0.9       0.68
31. Fires/burns (28–30) (1–4 and elderly)           2.5       1.0       0.6       2.30
7. Pedestrian, non-traffic (1–4, e.g., driveways)   2.7       1.0       0.6       0.20
34. Excessive cold (infants, elderly)               3.1       1.0       0.6       0.34
33. Excessive heat (infants, elderly)               4.4       1.0       0.6       0.22
35. Exposure/neglect (infants, elderly)             7.4       1.0       0.8       0.12

Primarily young males
3. Motorcyclists, traffic                           0.7       1.0       0.5       1.51
4. Cyclists, traffic                                0.9       1.0       0.6       0.36
12. Drowning (10–11)                                2.0       1.0       0.6       2.60
2. Motor vehicle, occupant                          2.4       1.0       0.7       14.88
1. Motor vehicle, train                             3.2       1.0       0.6       0.26
36. Lightning                                       3.4       1.0       0.7       0.04
32. Firearm                                         4.4       1.0       0.6       0.73

Primarily adult males
9. Aircraft (mostly small private)                  0.9       1.0       1.2       0.60
18. Poisoning, solids/liquids [c]                   0.6       1.0       0.7       1.57
20. Poisoning, gas/vapour [d]                       1.3       1.0       0.9       0.50
8. Pedestrian, train                                1.4       1.0       1.2       0.18
43. Caught/crushed                                  1.5       1.0       1.0       0.05
45. Cutting/piercing                                2.0       1.0       0.6       0.05
47. Electric current                                2.1       1.0       0.5       0.40
46. Explosion                                       2.9       1.0       0.6       0.12
41. Struck by falling object                        4.6       1.0       1.3       0.42
44. Machinery                                       5.0       1.0       0.5       0.57

Risk rises gradually with age, both sexes
37. Natural disaster                                5.0       1.0       1.0       0.06

Source: Based on Table 7 in Gottfredson (2004). Reprinted by permission of the American Psychological Association.

[a] Some of the 69 are subtotals of others.
[b] Four homicide categories are excluded here: homicide due to legal intervention with firearm (65), undetermined firearm (66), undetermined poisoning (67), and total undetermined (68).
[c] Solid/liquid poisonings include opiates (13), barbiturates (14), tranquilizers (15), antidepressants (16), alcohol (17).
[d] Gas/vapour poisonings include but are not limited to motor vehicle exhaust (19).

take people at the peak of their productive potential. Years of life lost and lifetime dollar cost per death are thus many times higher than for cancer and cardiovascular disease (Baker, O'Neill, Ginsburg, & Li, 1992). Moreover, fatalities represent only a small proportion of all injuries: Injuries can create many adaptive problems short of death. They need not be fatal to stress a family emotionally and financially, especially if the victim is permanently disabled. Table 17.1 lists specific causes of unintentional mortality according to the age–sex groups most subject to them, because different sexes and ages perish from notably different kinds of injury (see Baker et al., 1992). Only natural disasters seem to affect age–sex groups equally. The very young and very old die disproportionately from falls, aspiration (choking), burns, exposure, neglect, and being struck by vehicles. Relative to other age groups, they are cognitively weak, physically vulnerable, and dependent on caretakers, so they have less capacity for escape and recovery from harm. Young males are the major accident victims of drowning, lightning, weapons, and vehicles of many types (motorcycles, bicycles, automobiles). Many such deaths involve alcohol and reckless behaviour, and may result from the testosterone-driven displays of masculinity that surge at this age. Adult males are the group most subject to injuries involving production-related technology and activity, about half such deaths occurring at work and half at home: vapour poisoning, piercing, crushing, electrocution, explosions, falling objects, and machinery. Not surprisingly, male provisioners die disproportionately from the hazards associated with their provisioning activities. A second pattern in vulnerability to accidental death can be seen in the first three columns of Table 17.1, which quantify relative risk by the victim's area of residence. Relative risk is measured here with the odds ratio (OR).
An odds ratio is, as it sounds, simply the ratio of two odds: the odds that members of Group A will experience versus not experience the outcome in question, divided by the analogous odds for a reference group, Group R. For example, if 25% of Group A died from a certain disease but only 20% of Group R did, then the two odds would be 25/75 (0.33) and 20/80 (0.25), producing an odds ratio of 1.33. Table 17.1 provides ratios for residents of the lowest-income and highest-income neighbourhoods relative to residents of average-income areas in 1986 in the USA. Thus, the odds ratio of 3.5 for total injury mortality among residents in the poorest neighbourhoods (per capita income under $6000 in 1986) means that those residents were 3.5 times as likely to sustain a fatal injury as residents of the reference neighbourhoods ($10,000–$11,000 per capita). The risk gradients differ greatly depending on cause of death. They are shallow (that is, the ORs change little across the three income groups) for causes such as falls, suffocation, and gas/vapour poisoning. They are steeper – and comparable to those for most chronic diseases – for excessive cold, fires/burns, drowning, vehicle accidents (occupant or train), lightning,
and being cut/pierced, electrocuted, or killed in an explosion. They are especially steep for excessive heat, exposure/neglect, firearms, falling objects, machinery, and natural disasters. Disadvantaged circumstances (poor housing, dangerous jobs, etc.) may elevate risk by exposing individuals to more hazards, but the risk gradients do not track material disadvantage, at least in any obvious way. For example, although many adult men die in accidents associated with the tools of their trade, half those accidents occur at home (Baker et al., 1992). Voluntary self-exposure is likewise indicated by alcohol abuse being a factor in many drownings, vehicular fatalities, and burns. It is also hard to find a compelling reason why differences in material resources should have their most dramatic effect on relative risk of (infants and the elderly) dying from exposure and neglect. The relation between SES and accidental death varies in magnitude, depending on cause, but seldom in direction. Relative risk rises as neighbourhood income falls for 23 of the 29 specific causes, and it is reversed for only one (plane crashes). Mortality gradients disfavouring lower socioeconomic groups are also found worldwide for most illnesses, regardless of their aetiology, preventability, treatability, or organ system involved (Adler, Boyce, Chesney, Folkman, & Syme, 1993). Additionally, SES usually has a dose–response (linear) relation to morbidity and mortality, meaning that each additional increment in education, occupation, or income level is associated with yet better health outcomes, even beyond the resource levels that seem more than sufficient for good health.
As health scientists note, social class differences in material resources cannot explain either the ubiquity or the linearity of the SES–health gradients across time, place, and malady, so they hypothesise a more fundamental cause or generalised susceptibility they cannot yet identify (Link & Phelan, 1995) but which, as discussed above, is most likely g. The distribution of fatal accidents in human populations today reveals how these might have contributed to selection for higher g. First, although any one form of death may be relatively rare in any given year, accidents are a major cause of death in all societies. Second, victims are disproportionately males of reproductive age. Third, most types of accidental death strike disproportionately often in the lower socioeconomic strata, some markedly so. Because adults in the lower social classes tend to have lower IQs, and because differences in IQ are 80% heritable in adulthood (i.e., not due to social class), higher mortality in the lower socioeconomic strata may actually reflect the impact of lower g, not fewer material resources. Recall that IQ was the best predictor of motor vehicle fatalities in the Australian veterans study. Those IQ-related differences in mortality rate were also large: 146.7, 92.2, and 51.5 deaths per 10,000, respectively, for men of IQs 80–85, 85–100, and 100–115 (O'Toole, 1990; neither the Australian nor American militaries may induct individuals below the 10th percentile, which is about IQ 80).
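The odds-ratio arithmetic used in this section to read Table 17.1 can be sketched in a few lines. The first calculation reproduces the worked example given in the text (25% versus 20%). The second uses hypothetical probabilities, chosen here only for illustration, to show why an odds ratio can be read as "times as likely" when the outcome is rare, as fatal injuries are: for small probabilities the odds ratio and the simple risk ratio nearly coincide. Function and variable names are illustrative.

```python
def odds(p):
    """Odds of an outcome that occurs with probability p."""
    return p / (1 - p)


def odds_ratio(p_group, p_reference):
    """Odds for the group of interest divided by odds for the reference group."""
    return odds(p_group) / odds(p_reference)


def risk_ratio(p_group, p_reference):
    """Simple ratio of the two probabilities."""
    return p_group / p_reference


# Worked example from the text: 25% of Group A versus 20% of Group R.
example_or = odds_ratio(0.25, 0.20)   # (25/75) / (20/80)
example_rr = risk_ratio(0.25, 0.20)

# Hypothetical rare outcome (probabilities assumed for illustration only).
rare_or = odds_ratio(0.0007, 0.0002)
rare_rr = risk_ratio(0.0007, 0.0002)

print(f"common outcome: OR = {example_or:.2f}, risk ratio = {example_rr:.2f}")
print(f"rare outcome:   OR = {rare_or:.3f}, risk ratio = {rare_rr:.3f}")
```

For the common outcome the two measures differ noticeably, whereas for the rare outcome they agree to within a fraction of a per cent.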


Cognitive nature of accident prevention and containment

Accident researchers have concluded that the key question is not what causes accidents, but what prevents them (Hale & Glendon, 1987). Hazards are ubiquitous, surrounding us from birth, lying in wait every day of our lives. Accident prevention consists of managing hazards so that they do not cause injury. The accident process begins when a system under control (e.g., driving safely down a familiar road, one's children are playing happily) becomes destabilised. Injury actually occurs fairly late in the accident process, after someone has failed to detect or diagnose the hazard (a car is following too closely, matches are within the children's reach) and failed to take appropriate action to bring the situation back under control (move out of the car's way, remove the matches). Individual action is critical not only for preventing and containing incidents, but also for limiting the damage they do. People often fail to take advance precautions, such as wearing protective gear (seatbelt, safety goggles) or installing warning systems (smoke alarms) that could limit harm. Catastrophic accidents (e.g., Challenger space shuttle explosion, Piper Alpha oil platform fire) usually involve the concatenation of multiple errors by different people. Victims and their caretakers are seldom responsible for all the human errors that led to the victims' injury, but most if not all have missed opportunities to prevent or minimise it. For instance, studies of accidents involving pedestrians and workers in gold mines have documented that most victims failed to respond appropriately, if at all, to visible imminent danger (approaching vehicle, falling rock). The issue here is not who bears most responsibility for causing a given accident, but whether people routinely use what opportunities they have to protect themselves. Relying on others alone to shield us from danger is foolhardy.
We must practise “defensive driving” along all of life's paths.

A recent study (Buffardi, Fleishman, Morath, & McCarthy, 2000) illustrates the importance of cognitive competence for preventing the human errors that can precipitate accidents, or fail to halt them. It found that error rates – human error probabilities (HEPs) – on work tasks in Air Force and nuclear power plant jobs generally correlated 0.5 to 0.6 with the number and level of cognitive abilities that the tasks required. This means that brighter workers are less likely than others to make errors on those tasks, an expectation that is consistent with meta-analyses showing that brighter workers outperform their coworkers (on average) in all jobs, but especially so in complex ones (Schmidt & Hunter, 2004). All people make cognitive mistakes, but higher-g persons make relatively fewer of them when holding difficulty level of the task constant, whether on mental tests or in real life.

Students of the accident process have long argued that accident prevention and control is a quintessentially cognitive process. Hazards are ubiquitous and many incubate without visible evidence (e.g., in a machine not serviced), so it is often unclear in the kaleidoscope of daily life what constitutes a hazard or how dangerous it might be. Avoiding accidental death, like exercising effective health self-care, thus requires the same information-processing skills as do complex jobs: continually monitoring large arrays of information, discerning patterns and anomalies, understanding causal relations, assessing probabilities, and forecasting future events. In essence, accident prevention requires imagining the unseen, the nascent, the “what-if?” Just as discoveries come more often to the prepared mind, so does effective accident prevention and containment.

The conditions that make effective monitoring, detection, and estimation more difficult mirror the factors previously discussed as contributing to job complexity: situation changing rapidly, situation not as expected, ambiguity and uncertainty, working under distractions, and nonroutine tasks (Hale & Glendon, 1987). Lack of knowledge and training for handling contingencies also impedes timely detection of, and response to, systems going out of control. Even individuals who are fully aware of a particular danger, who are trained to deal with it, and who attempt to exercise control may nonetheless fall victim if they are distracted, fatigued, stressed, or impaired by drugs or alcohol. In short, the same task requirements that typify complex jobs are also at the heart of preventing unintentional injury: dealing with unexpected situations, identifying problem situations quickly, and reacting swiftly when unexpected problems occur.

Were accidents an important cause of death in precontact hunter-gatherer societies?

Homo sapiens speciated 100,000–150,000 years ago, and then began radiating out of sub-Saharan Africa about 50,000 years ago (Sarich & Miele, 2004). Perhaps the closest we can come to observing the ecological circumstances associated with this is to study surviving hunter-gatherer societies. The Northern Ache of Eastern Paraguay provide the clearest such living window into our subspecies' EEA, because they are the only foraging group whose life before peaceful contact with the outside world has been carefully documented. Hill and Hurtado (1996) report fertility and mortality among the Northern Ache during three periods: precontact, when they lived entirely by foraging in the rainforest (before 1971); the initial period of peaceful contact (1971–1977); and after resettlement onto reservations (1978–1993). The Ache are not representative of all hunter-gatherers, current or prehistoric, but their environmental stressors and modes of adapting to them violate common presumptions about technologically primitive societies.

Precontact Ache lived in bands of 15–70 individuals, with bands frequently shifting in size and composition. Bands were autonomous economic and residential units, moving camp frequently (often daily) and living entirely from hunting (e.g., monkeys, peccaries, armadillos) and gathering (e.g., palm fibre, fruits, honey, insect larvae). On average, women had their menarche at age 15, their first child at age 19, their last child at age 42, and a total of eight live births by age 45. Male fertility was more variable, with men fathering their first child at mean age 24 and their last at age 48. Marriages were short, especially in early adulthood, and women averaged a total of 10 by age 30. Both the probable and possible biological fathers of each child were ritually acknowledged. Children were generally weaned around age two and a half. Half of all males and females survived to age 40, at which point they had a life expectancy of another 22 years (males) to 26 years (females).

Small groups of men hunted for game on average 7 hours a day, collected honey when available, and shared their proceeds evenly among all adults in the band. Hunters used large bows and arrows but also killed small game by hand. Meat provided 87% of the band's calories. Women spent an average of two hours per day foraging for plant and insect products, which were not as widely shared in the band. Women spent another two hours moving camp, with men cutting a trail through the dense underbrush. Adults transported all children until age five, after which children had to walk on their own to the new camp. Girls started producing as much food as the average adult woman beginning around age 10–12. Boys carried bows and arrows by that age, but they did not reach adult male production levels till their twenties.

The many hazards of forest life included, among others, poisonous snakes and spiders, jaguars, stinging insects, parasites, malaria, and warfare with non-Ache, all of which could temporarily disable individuals, if not killing them outright. Temperatures sometimes dropped below freezing at night, and children and adults lost and without firebrands risked dying of exposure if they failed to return to camp, a common hazard also among the !Kung hunter-gatherers of sub-Saharan Africa (Howell, 2000, pp. 58–59).
Of the 1423 Northern Ache born between 1890 and 1994, 881 had died by 1994 (843 with cause reported), of whom 382 died during the forest period (before 1971). Most of the Ache mortality data reported in Table 17.2 were collected retrospectively in interviews during 1981–1992. Ache informants provided reliable and forthright accounts of deaths from injuries, including homicide. Before peaceful contact, warfare (e.g., raiding) was the second most common cause of death (128 of 363), but it accounted for none of the 104 during the reservation period. The interim period is omitted here because nearly one third of all Ache died from epidemics after first peaceful contact. Even during the forest period, however, somewhat more Ache (135) died of injuries not sustained during warfare (50 from accidents and 85 from homicide by other Ache). Baksh and Johnson (1990, p. 204) likewise report a large proportion of deaths from fatal accidents among the Machiguenga Indians in the Amazon Forest.

Table 17.2 Number of deaths from specific causes among the Ache before peaceful contact (before 1971)

Cause of death                 0–3           4–14         15–59           60+          Total
                             F   M   T     F   M   T     F   M   T     F   M   T     F   M   T
Illness                     19  17  36     8   7  15     9  26  35     2   3   5    38  53  91
Congenital/degenerative      8  11  19     0   0   0     1   0   1     2   4   6    11  15  26
Childbirth                   -   -   -     -   -   -     3   0   3     -   -   -     3   0   3
Unintentional injury         1   2   3     1  10  11     6  23  29     4   3   7    12  38  50
  eaten by jaguar            -   -   -     -   -   -     1   7   8     0   1   1     1   8   9
  snakebite                  -   -   -     0   3   3     3  12  15     1   2   3     4  17  21
  accidentally suffocated    1   1   2     -   -   -     -   -   -     -   -   -     1   1   2
  hit by lightning           0   1   1     0   3   3     1   2   3     -   -   -     1   6   7
  drowned                    -   -   -     1   0   1     -   -   -     -   -   -     1   0   1
  lost                       -   -   -     0   3   3     0   1   1     3   0   3     3   4   7
  hit by falling tree        -   -   -     0   1   1     1   0   1     -   -   -     1   1   2
  fell from tree             -   -   -     -   -   -     0   1   1     -   -   -     0   1   1
Homicide/neglect            26  26  52    14   3  17     4   7  11     1   4   5    45  40  85
  sacrificed with adult      7   4  11    10   1  11     -   -   -     -   -   -    17   5  22
  mother died                1   1   2     -   -   -     -   -   -     -   -   -     1   1   2
  child homicide             9  15  24     3   0   3     -   -   -     -   -   -    12  15  27
  infanticide                6   1   7     -   -   -     -   -   -     -   -   -     6   1   7
  neglect                    1   1   2     -   -   -     -   -   -     -   -   -     1   1   2
  buried alive               0   1   1     -   -   -     1   0   1     1   0   1     2   1   3
  left behind                2   3   5     1   2   3     1   0   1     0   2   2     4   7  11
  ritual club fights         -   -   -     -   -   -     0   6   6     0   2   2     0   8   8
  nonsanctioned murder       -   -   -     -   -   -     2   1   3     -   -   -     2   1   3
Total (nonwarfare)          54  56 110    23  20  43    23  56  79     9  14  23   109 146 255
Warfare                      9  12  21    27  29  56    16  31  47     2   2   4    54  74 128

Note: F = female, M = male, T = total; a dash (-) marks an empty cell.
Source: Hill & Hurtado (1996, pp. 171–173), reproduced with permission.

Ache rates of fatal injury, both intentional and unintentional, decreased considerably between the forest and reservation periods owing to state intervention (Hill & Hurtado, 1996). Homicide fell from 33% to 9% of all nonwarfare deaths, and fatal accidents from 20% to 6%. Although their absolute number dropped, deaths from illness nearly doubled as a percentage of all nonwarfare mortality (from 47% to 85%). The Ache mortality pattern in the reservation period is quite similar to that of the Yanomamo and !Kung societies (Hill & Hurtado, 1996) and the United States (National Center for Injury Prevention and Control, 2002), where illness accounted for 80–90% of all deaths, and the remainder was split about equally between fatal accidents and intentional injury. No suicides were reported in any of the three foraging societies, but in the United States suicide accounted for almost as many deaths as did homicide.

The percentage of Ache deaths from illness did not differ by age, whether before or after contact. Fatalities from injury, however, differed greatly by both age and sex in both periods. In the forest period, as Table 17.2 shows, lethal accidents claimed more lives than did homicides during adolescence and middle adulthood (29 versus 11 deaths for ages 15–59), but the opposite was true for children (14 versus 69 for ages 0–14). This general pattern held for reservation life too: adults died relatively more often from accidents, and children from homicide. In the United States, however, accidental injury was a bigger killer than intentional injury (suicide and homicide) at all ages. Perhaps the most striking difference between the two societies is the reversal in the ratio of accidental to intentional deaths among infants and toddlers: whereas 3% of Ache nonwarfare deaths from ages zero to three resulted from accidents and 47% from homicide, the disproportion is reversed in the United States – 40% versus 5% – for a similar age group (1–4).

Both the nature and number of Ache homicides during the forest period differed by age and sex.
The only three unsanctioned murders (e.g., killing a wife in anger) were of adults. Another eight intentional deaths, all of them adult males, occurred during ritual club fights. All band members who could not keep up because of age, illness, or disability (e.g., blindness) were eventually left behind (eight of the eleven being children) or buried alive (two of three being adults), sometimes at their own request (to avoid being eaten alive by vultures when left behind on the trail). Most Ache homicides, however, involved the killing of children, sometimes by parents themselves. Girls were more subject to infanticide and sacrifice at adult burials, but boys were somewhat more likely to be killed after infancy.

Table 17.2 also reveals several important age and sex differences in the cause of fatal accidents during the forest period. One pattern, which is still found worldwide (including among the !Kung; Howell, 2000), is that fatal accidents killed many more Ache males (23) than females (6) during their adolescent and middle-adult years (ages 15–59). Furthermore, the great disproportion, by sex, in fatal illnesses in this age range (26 male, 9 female), but not at other ages, suggests that many of the men's fatal “illnesses” (fevers, infections, and sores) were actually sequelae from injury (cf. Howell, 2000, Chapter 3 on the !Kung). Cuts, punctures, and bites provide entry points for infections that can debilitate or kill when modern medical treatment is not available. If the 19 surplus fatal illnesses among males (26 male minus 9 female) are reclassified as delayed fatalities from injury, then the resulting 42 (23 reported plus 19 surplus) accidental deaths among males aged 15–59 constitute 75 per cent of their 56 nonwarfare deaths, and nearly half of their total 87 for the forest period.

Most accidental deaths among adults of both sexes resulted from hazards in provisioning, and from basically the same causes (e.g., snakebites). However, since women spent only a quarter as many hours foraging as men spent hunting, they exposed themselves to fewer hazards and thus were injured less often. As occurs in the USA today, fatal accidents among adult Ache males, in the forest period, were usually associated with the trades by which men provisioned the band. Although dying at the teeth of a lurking jaguar or snake might not seem analogous to dying while using modern machines and tools, such deaths probably result from the same general cognitive failures: inexperience, and lapses in monitoring the environment for signs of imminent danger, while engrossed in one's primary activity. For instance, most snakebites occurred when the individual stepped on a snake while looking up into the forest canopy for arboreal game. This is also one of the chief hazards for primate researchers (Hart & Sussman, 2005, p. 113).

A second pattern is that older Ache children died more often from accidents than did children aged zero to three, but the age-related increase involved males only. The causes of accidental death among the older boys reflected both their inexperience in the forest (getting lost) and exposing themselves to the needless injuries associated with inattentive male provisioning (snakebites).
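The percentage shares quoted in this section for the forest period can be rechecked from the counts in Table 17.2. The sketch below is only an arithmetic check under stated assumptions: the variable and function names are illustrative, the 47% illness share is taken to include congenital/degenerative and childbirth deaths, and the surplus of 19 male illness deaths is simply the figure used in the text.

```python
# Forest-period deaths by cause, from the Total (T) column of Table 17.2.
forest_totals = {
    "illness": 91,
    "congenital/degenerative": 26,
    "childbirth": 3,
    "unintentional injury": 50,
    "homicide/neglect": 85,
}


def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to a whole number."""
    return round(100 * part / whole)


nonwarfare_total = sum(forest_totals.values())  # all nonwarfare deaths

# Assumption: the text's "illness" share pools illness, congenital/degenerative
# and childbirth deaths, i.e. every nonwarfare death that was not an injury.
illness_all = (forest_totals["illness"]
               + forest_totals["congenital/degenerative"]
               + forest_totals["childbirth"])

shares = {
    "illness": share(illness_all, nonwarfare_total),
    "accidents": share(forest_totals["unintentional injury"], nonwarfare_total),
    "homicide": share(forest_totals["homicide/neglect"], nonwarfare_total),
}

# Ages 0-3: accidents and homicides as shares of that age group's nonwarfare deaths.
infant_accident_share = share(3, 110)
infant_homicide_share = share(52, 110)

# Men aged 15-59: reported accidents plus the surplus illness deaths the text
# reclassifies as delayed injury deaths (19 is the text's own figure).
men_injury = 23 + 19
men_share_of_nonwarfare = share(men_injury, 56)   # 56 nonwarfare deaths
men_share_of_all = share(men_injury, 56 + 31)     # plus 31 warfare deaths

print(nonwarfare_total, shares)
print(infant_accident_share, infant_homicide_share)
print(men_injury, men_share_of_nonwarfare, men_share_of_all)
```

Keeping the counts in one place makes it easy to see which rows of the table each quoted percentage depends on.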
Combining the data for ages 4–14 and 15–59, seven females died from accidental injury whereas up to 52 males did (10 boys plus 23 men plus 19 surplus “illnesses”). Accidents thus removed 45 more male than female provisioners, current or imminent, from the population. Warfare, in contrast, removed only 17 more males than females aged 4–59, because many females were captured or killed. Only homicide among older children (ages 4–14) removed more girls (14) than boys (3) from the population.

Third, from birth to age three, the two sexes died in equal number and mostly from illness and homicide. Small children rarely died of unintentional injury, despite the many hazards of forest life, because they were carefully watched by their mothers and other caretakers. Children under age one spent about 93 per cent of their daylight time in tactile contact with their mother or father, and even at age three or four they were still spending three quarters of their daylight time no more than 1 metre from the mother. Caretakers were acutely aware of common dangers to small children, and protecting them from these predictable dangers was their primary activity (cf. Howell, 2000, on similar preventive efforts among the !Kung).

Looking at the larger pattern, the two most striking epidemiological facts are the high loss of reproductive-age males to provisioning-related accidents and the even higher loss of children to homicide (respectively, 42 and 69 of all 255 nonwarfare deaths during the forest period). Each reproductive-age adult who died prematurely from any cause lost the opportunity to produce more offspring, in proportion to the prematurity of death. But the impact of such deaths was yet more profound in evolutionary terms because most child homicides followed the death of an adult (and were more sex-balanced than provisioning deaths). Important men were typically buried with a living child, usually a girl under age five. The children chosen for sacrifice were usually ill, injured, defective, or orphaned, which meant they also had the fewest advocates during band discussions of whom to sacrifice. Infanticide and child homicide often followed the loss of one or both parents through death or divorce. Some of these children were killed immediately, but others later in childhood, after other band members grew resentful of being coerced into caring for them. Children without mothers were 4.5 times as likely to be killed during each year of childhood, and infants losing their mother in their first year of life had a 100% probability of being killed by another Ache. Children without fathers and those with divorced parents were, respectively, 3.9 and 2.8 times as likely to be killed in each year of childhood. Overall, death of the mother affected the youngest children primarily, but death of the father or parental divorce greatly increased the homicide rate of children at all ages. Moreover, father's death was more common than mother's death, and divorce was most common of all. As Hill and Hurtado (1996, p. 437) sum it up, “The impact of parental absence on childhood homicide rates is quite astounding.” They also conclude that, in contrast, “The presence or absence or number of grandparents, aunts, uncles, and adult siblings seem to have little or no impact on child survival” (p. 424).
Loss of a provisioning adult put nutritional stress on the band, or particular families within it. A nursing infant who lost its mother lost its only possible provisioner. The more common loss, that of fathers through death or abandonment, put tremendous stress on the wife and biological children he left behind because it meant the family lost one of its two major provisioners. Recall that meat, the primary source of calories, was split evenly among adults, who then passed portions to their children. A child's father need not have been an effective hunter for his children to flourish, but he had to stay alive, with the band, and preferably with their mother. Children with no parents – orphans – were hated and frequently sacrificed for burial with adult males, because they were constantly begging for food (as did many fatherless children). Thus, although the loss of a good hunter nutritionally stressed the whole band, sharing norms concentrated the band's loss on the victim's own family (cf. Howell, 2000, pp. 51–53, on the !Kung), which in turn concentrated its loss on particular individuals within the family (usually the child still requiring the most investment to reach reproductive age). A man who had fatal lapses in judgment, or in detecting hazards, not only foreclosed all future genetic contributions, but also erased some of his past contributions. Even the temporary loss of a provisioner from nonfatal injuries endangered dependents' lives (Hill & Hurtado, 1996, pp. 154–155). The forest-period Ache lived under constant nutritional stress, even if usually mild, during the study period. If they could not hunt for three days because of continuous rain, they had little food for three days. They did not live in the “original affluent society” (Hill & Hurtado, 1996, p. 320), as some anthropologists have fantasised about the foraging life.

Legal and social sanctions in state societies now discourage infanticide, although faint footprints of the practice can be observed in mortality reports, especially for developing countries. In contrast, unintentional mortality, although tending to be ignored, leaves an unmistakable swath of destruction across all societies. Accounts of injuries in developing countries (Smith & Barss, 1991) and peasant societies (Baksh & Johnson, 1990) are particularly revealing because they find that, while particular hazards differ from one time and place to another, accidents maim and kill in the same few ways: primarily, drowning (e.g., falling into ponds, wells, drainage or irrigation ditches; falling off boats and bridges), burns and scalds (e.g., hot oil, clothing or dwellings catching fire, falling into open fires), animal attacks (dog bites, goring by cattle, water buffalo, and wild pigs), lacerations and punctures (machetes, knives, spears, digging sticks, arrows shot into the air), poisoning (venomous snakes, improperly distilled alcohol, nicked by poisoned arrow), falls (off beds, bridges, and buildings; out of trees and windows), and falling objects that cause internal damage (trees being cut down, coconuts being harvested).
The introduction of new technologies (e.g., electricity, motorised vehicles) produces new ways to be injured (electrical burns, fatal collisions), but even so-called technologically primitive societies pose innumerable manmade threats to life and limb.

What environmental factor was unique to Homo sapiens and could have accelerated the evolution of general intelligence?

Any explanation for the rapid encephalisation of Homo sapiens, and the remarkable intelligence of its only surviving line (Homo sapiens sapiens) has to provide a correspondingly unique selection agent, or confluence of them, for the evolutionary increase. It should also offer some “nitty-gritty real-life selection walks” (Holloway, 1995) for how the selection, triggered by that agent, would actually play out within a population and allow its higher-g members to contribute proportionately more genetic descendants to future selection walks.

Many previously proposed selection forces do not meet the uniqueness criterion, including tool use, warfare, living in social groups, cooperative predation, and climate change. Other theories attempt to meet it by proposing runaway sexual selection; for example, arms races in mating displays (Miller, 2000) or for developing a social intelligence to outwit and outcompete fellow humans (Dunbar, 1998). Runaway selection supplies a unique (species-specific) trigger by definition, because the term is a label for, not a demonstration of, selection processes that operate independently of the species' external environment. But the runaway theories cannot explain what triggered the postulated arms races. The competition-for-mates proposals supply no trigger except chance, and the social-competition proposals supply an implausible one, namely, that within-species selection forces were unleashed when humans effectively nullified external ones by gaining “ecological dominance” (Geary, 2005). As shown earlier, however, technological feats that raise average levels of human welfare need not eliminate, and may even increase, the power of external environments to cull populations differentially by g level. The social intelligence hypothesis also fails to detail a “selection walk” by which spiraling intragroup competition and cooperation would have skewed mortality or reproduction by g level, especially when groups are said to have effectively mastered their physical environments.

More promising are hypotheses about how genes and cultures coevolve, which envision humans transacting with, not divorcing themselves from, their physical environments. Improvisation and innovation in dealing with ecological challenges are the sorts of transactions that could sustain directional selection (Lumsden & Wilson, 1983). I specify more fully below a deadly innovations hypothesis for how human innovation could have created, and then amplified, g-related relative risks of premature death.

Human innovation changes the physical environment – for better and worse

Humans have not adapted to their environments so much as they have modified them to suit their needs. The Homo sapiens EEA was therefore never one of extreme constancy and continuity, nor were humans ever merely passive adapters to external circumstance (Campbell, 1996). Early in the Pleistocene, humans began shaping the environments that shaped them, just as individual persons still do today (on extended phenotypes, see Bouchard, Lykken, Tellegen, & McGue, 1996; Plomin et al., 2001). Each innovation that fundamentally altered the EEA had the potential, in turn, to redirect human evolution. Lumsden and Wilson (1983) refer to this autocatalytic process as a Promethean fire, after the Greek myth.

Consider, fittingly, humankind's controlled use of fire during the past 500,000 years, one of our Homo ancestors' “most remarkable” achievements (Campbell, 1996, p. 47). By externalising some digestive functions (grinding, metabolising, detoxifying, etc.), cooking allowed early humans to digest a wider range of foods more efficiently. It literally transformed the human body. The gut could now be much smaller, allowing the brain to be larger for any given metabolic investment (Aiello & Wheeler, 1995; Kaplan, Hill, Lancaster, & Hurtado, 2000). This gut–brain trade-off coevolved with a suite of other life-history changes that differentiate modern humans anatomically from earlier hominids, including a longer developmental period, neoteny (more infant-like appearance), and a more gracile skeletal structure (less dense bones, thinner skull, smaller jaw and teeth, etc.). The shift was marked: the brain of the standard 65 kg modern human male weighs more than his gastrointestinal tract (1.3 versus 1.1 kg), but a nonhuman higher primate male of similar size has a brain only a quarter the size of its gut (0.45 versus 1.881 kg; Aiello & Wheeler, 1995). This is almost a gram-for-gram evolutionary trade-off between gut and brain.

Much human innovation improved the efficiency of provisioning. Cooking and hunting with fire is an early example. Projectile weapons (spears, bows and arrows, etc.) are another, because they allowed killing game quickly and at a distance, making hunting for large game both safer and more feasible. Boats, rafts, and canoes would later be yet others, because they allowed provisioners to exploit territory and food sources not otherwise readily accessible. Each innovation likely improved the general welfare and lowered age-specific mortality rates relative to other primates (Hill, Boesch, Goodall, Pusey, Williams, & Wrangham, 2001).

Each, however, was a double-edged sword. Innovations in hunting, gathering, growing, storing, and preparing food created novel hazards by altering either the physical environment itself (open fires, sharp tools, weapons, enclosures, platforms) or how the body engages it (attending to the treetops rather than hazards on the trail in order to shoot arboreal game, clearing thorny or otherwise hazardous vegetation to build gardens or shelters, felling trees for fuel or shelter, navigating bodies of water). As Howell (2000, p.
55) describes, “Probably the most serious cause of hunting accidents, in the sense of injuries leading to death, is not the animals themselves, but the weapons [with poisoned shafts] that the !Kung use to kill those animals.” Altering or engaging the physical environment in evolutionarily novel ways increases the risk of incurring biomechanical and other physical traumas that exceed human limits (e.g., lacerations; drowning; falls and falling objects that break bones, crush internal organs, or slam the brain against the skull). Moreover, anatomically modern humans probably became more vulnerable to such trauma by the Late Pleistocene/Upper Paleolithic, because the long Homo trend toward greater body mass had reversed by then. By the time art and artifacts began to flower in Europe around 35,000 years ago, the region's Homo sapiens sapiens had become notably smaller, as well as somewhat less skeletally robust. This decrease in body mass was larger than the decrease in brain size, which raised EQ (Ruff et al., 1997) and perhaps reflected a new trade-off between cognitive and physical strengths in a now much-transformed human EEA.

Humans also introduced new physical hazards into their work and home environments when they domesticated animals (canines for herding and hunting; ungulates for food, transportation, and ploughing) and adopted virtually anything as pets. Dogs are still a major source of injury worldwide. And as material innovations spread to housing, transportation, agriculture, manufacture, and recreation, so did new physical hazards. There were new objects to fall from (beds, stairs, ladders, buildings or their open windows, aircraft); new ways to be crushed, pierced, or gashed (farm machinery, electric saws); new ways to be poisoned (radiation, pesticides, and even prescription medicines); and so on. Old hazards could become more lethal, as when transportation increased in velocity. Many such hazards were generated too recently in human culture to account for the evolution of intelligence in prehistoric Homo sapiens, but they illustrate why the species might have evolved a general protective mechanism to survive the ever multiplying, ever shifting hazards with which it was inundating its environment.

However, the distribution of manmade hazards continually changes as humans generate new ones, spread them to new sectors of the population and arenas of life, and develop cultural practices that attempt to mitigate the new risks. Because manmade hazards provide dispersed, ever-moving targets for genetic adaptation, humans cannot evolve separate adaptations to each of them (cf. Low, 1990) as they might to specific pathogens (sickle cell anaemia for malaria) or extreme climates (body shape for thermoregulation). Fiddick, Cosmides, and Tooby (2000) argue that humans have evolved a set of content-specialised inference systems for managing recurring hazards, but their conditional reasoning experiments specify no particular hazards, identify no particular forms of precautionary reasoning, rely mostly on samples restricted in range on IQ (college students), and fail to control for task complexity (abstractness, degree of inference required, etc.), prior learning, and other factors known to affect item difficulty.
Amorphous ecological challenges foil the evolution not only of physiological adaptations and innate mental heuristics, but of learned ones too. Humans are distinctive, of course, for having language, which facilitates transfer and storage of knowledge, as well as a long developmental period for learning both. Information sharing is one reason human groups can usually outrun their Four Horsemen of the Apocalypse – starvation, warfare, pestilence, and extreme weather. Food sharing also buffers all of a group's members from the inevitable shortfalls each is likely to experience from time to time. Single individuals do not die from starvation in hunter-gatherer societies, except when there is neglect or abuse (Baksh & Johnson, 1990). Accidental death is therefore quite unlike the Four Horsemen, whose stark terrors rivet attention and mobilise collective countermeasures. Hazards are side-effects of a group's survival activity, not its focus of concern. They are myriad in number, which fractures attention further, and individually they tend to be low-probability killers, which dissipates concern for any single one. By often foiling even learned heuristics, this shifting panoply of low-risk hazards puts a premium on the independent exercise of g by single individuals.


As reviewed earlier, the cognitive demands of accident prevention do not reside so much in the obvious attributes of situations and technologies as in what is latent, nascent, and merely possible in them. The mind's eye must imagine what one's two eyes cannot see, for example, the possible presence of a jaguar, an exposed electrical wire, or a faulty tire, in order to prevent a dangerous incident, rather than just waiting to escape or recover from it. It must also find portents in the physical reality that the two eyes can register; for instance, to apprehend the imminent danger of falling rocks or rising waters, which many victims fail to do. Innovation-related hazards thus provide a plausible mechanism, though hardly the only one, for evolving a highly general intelligence in the Homo line. Fatal accidents are still a major cause of death in all societies, so they provide continuing opportunity for natural selection. Preventing accidents will always be cognitively demanding, so we should not presume that selection on g has ceased, let alone that modern humans have the same mind and brain as their Pleistocene ancestors. Recent haplotyping studies indicate, in fact, that at least two genes affecting brain size were still evolving as recently as 5,800 and 37,000 years ago (Evans et al., 2005; Mekel-Bobrov et al., 2005). Differences in relative brain size by latitude, among both archaic (extinct) and modern human groups (Ruff et al., 1997), as well as current differences by ancestral region (race) in IQ, musculoskeletal features, and other life history traits (Rushton & Rushton, 2001), also suggest that g continued to evolve long after Homo sapiens radiated out of sub-Saharan Africa 50,000 years ago. So, rather than loosening the bonds of natural selection, human innovation may only have substituted new and potentially more powerful ones.
Human innovation magnifies g-based differences in risk and opportunity

Human innovation introduced evolutionarily novel risks by changing the physical environment and human transactions with it. It thereby created ecological pressure for evolving higher g. But how would innovation have accelerated selection for g, that is, widened the differences in mortality between individuals of higher and lower general intelligence, during the past half million years? I focus here on amplifiers that work by steepening g-related gradients of relative risk for accidental death, the following five being plausible candidates.

Double jeopardy

Innovations are created or imported more frequently by individuals at the top of the g bell curve, because they are the most able to engage in the “what if” thinking necessary for innovation – that is, for disengaging thought from the tyranny of immediate reality, in order to imagine an alternative and how to achieve it. Because humans are both verbal and social, the product or technique will soon spread, if useful. But it tends to spread from the top of the intelligence continuum downwards, because learning to replicate and effectively use an innovation – and even see its potential – also entails some exercise of g. To the extent that there is diffusion down the g continuum, replication and use will also become more error-prone and less effective. Realised benefits thus shrink as innovations diffuse down the bell curve, much as do the payoffs of schooling today (see Figure 17.1). And recall what happens when modern nations introduce new medical treatments or free national healthcare: everyone benefits, but brighter individuals capitalise more effectively on the new resources. While benefits steadily fall, risks of fatal injury steadily increase as innovations diffuse down the bell curve. The risk gradients may be steeper for some hazards than others, but as the odds ratios in Table 17.1 illustrate, most of them tilt against persons of lower g. Recall also that brighter individuals exploit better even the innovations intended to mitigate the dangers of prior innovations – for example, by more often using protective gear. Fewer people die after a safety campaign, but the remaining fatalities become more concentrated at the lower end of the g continuum. Innovation thus magnifies its selective power by doubly disadvantaging a group's less intellectually able members. Growing disparities in accidental injury were probably a stronger selection force than the new disparities in benefits, however, because small hunter-gatherer bands, like large societies today, redistribute the fruits of higher competence and good luck so that all can share more equally in them. In contrast, an innovation's downside in injury and death is experienced more exclusively by the direct victims of lower competence. Accidents and injuries cannot be evenly redistributed like the meat from a hunt.
It might also be noted that any social or Machiavellian intelligence would affect mostly the negotiation or avoidance of sharing norms (distributive justice), but g would still dominate the production of benefits to be shared and the management of associated hazards.

Spearman-Brown pump

The increase in g-based relative risk of mortality, owing to cultural innovation, need not be large in absolute terms to drive selection, but only pervasive and persistent. The many hazards in life can be thought of as the many lightly g-loaded items in life's mental test for avoiding premature death (Gordon, 1997) and, as forecast by the Spearman-Brown prophecy formula for test reliability, more items would allow the test to make more reliable distinctions in ability. The test need not even be very reliable within any single generation, because when taken generation after generation, the small effects in successive generations would aggregate to produce a dramatic evolutionary shift. To illustrate, only a weak selection rate (s = 0.03) on only a modestly heritable (30%) trait could create a 1% change in a generation, which is many multiples of the rate needed for the observed evolutionary increase in Homo brain size (Williams, 1992, p. 132). Occasional jumps in the number of items on humans' self-made test for hazards management could nudge up this selection ratio – imagine more instruments of death or rows in Table 17.1.

Spiralling complexity

Innovation could also amplify selection for g by ramping up the complexity (g loadedness) of individual “items” in life's test of ability for extracting the benefits of an innovation while also avoiding its new hazards. For example, a new provisioning technique (e.g., horticulture) might require higher levels of learning or reasoning than old ones (gathering) for effective reproduction, use, and self-protection, by requiring individuals to understand longer chains of cause and effect or look more steps ahead (Gordon, 1997). Complexity could also be ramped up by new task conditions or configurations, as the job analyses showed. For instance, simply dealing with two tasks (e.g., potential hazards) at the same time, such as driving while talking on a cell phone, is more cognitively demanding than dealing with them serially, because multitasking erodes the ability to execute each one effectively. Lower-g individuals are far more vulnerable to such cognitive overload than most high-g people imagine. Training and practice can, of course, reduce the complexity (g loading) of most daily tasks, and even automatise the performance of some (e.g., aspects of driving a car, playing the piano, using a tool, following rules of etiquette), as is their purpose. Novel tasks do not long remain novel, except in the evolutionary sense, but education and training can never fully neutralise all the additional complexity that innovations pump into the cognitive environment, as the job analyses demonstrated.
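As an aside on the Spearman-Brown pump argument made earlier in this section, the two quantitative steps can be sketched in a few lines of Python. The chapter does not print either formula, so the sketch below rests on assumptions: it uses the standard Spearman-Brown prophecy formula for a test lengthened k-fold, and it reads the "1% change in a generation" figure through the breeder's equation (response = heritability × selection differential), which is one interpretation under which s = 0.03 and 30% heritability give roughly 1%. The function names and the single-item reliability of 0.01 are hypothetical, chosen only for illustration.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Predicted reliability of a test lengthened k-fold.

    r_single is the reliability of the original (here: one hazard "item").
    Standard Spearman-Brown prophecy formula.
    """
    return k * r_single / (1.0 + (k - 1.0) * r_single)


def response_to_selection(heritability: float, selection_differential: float) -> float:
    """Breeder's equation R = h^2 * S (assumed reading of the chapter's example).

    Both the differential and the response are expressed as proportions
    of the trait mean per generation.
    """
    return heritability * selection_differential


if __name__ == "__main__":
    # Hypothetical: each hazard is a very weakly g-loaded "item".
    for n_hazards in (1, 10, 100, 1000):
        print(n_hazards, round(spearman_brown(0.01, n_hazards), 3))

    # The chapter's illustration: s = 0.03, heritability = 30%.
    print(response_to_selection(0.30, 0.03))
```

Under these assumptions, adding items raises the reliability of the whole "test" even when each item is nearly uninformative on its own, which is the sense in which more hazards could sharpen selection.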
The large residual complexity of many already-practiced tasks explains why higher g (say, by one SD) – but not greater experience (say, by 3 years) – continues to yield higher average levels of job performance in successively more experienced groups of workers (McDaniel, Schmidt, & Hunter, 1988).

Contagion of error

Social processes diffuse useful knowledge through a population, but also propagate misinformation (wild rumors, health-damaging practices). Not all “help” is helpful and some is downright dangerous. Neighbourhoods often differ greatly (1–3 standard deviations) in average IQ level (Maller, 1933), so the ratio of constructive to destructive help is higher in some settings than others (Gordon, 1997). Individuals who are embedded in less favourable IQ contexts (families, tribes, etc.) are exposed more frequently, whatever their own IQ level, to the cognitive errors committed by others.


This systematic difference in exposure can be visualised by imagining that the columns of risk rates in Figure 17.1 represent different IQ contexts. Not only do people who make stupid mistakes occasionally pay with their own lives, but sometimes so do their kin and coworkers. Mortality reports for the Ache, !Kung, and other technologically primitive groups typically include such accounts; for example, of individuals being killed by trees felled by kin, and of infants and toddlers perishing from preventable burns, falls, crushing, and poisonings (Baksh & Johnson, 1990; Hill & Hurtado, 1996; Howell, 2000). Recall also the large number of Ache infanticides and child homicides that followed the death of provisioning parents, thereby magnifying the evolutionary consequences of those deaths. The propagation of deadly error through a kin network may be the evil twin of inclusive fitness (assisting the survival of individuals in proportion to genes shared).

Migration ratchet

Greater population density and resulting scarcity of resources led early human groups to migrate into previously unexploited territory. The ancestors of modern Homo sapiens dispersed out of sub-Saharan Africa to populate North Africa and the Mediterranean, then the temperate regions of Eurasia, and eventually the Arctic regions of the world. Each higher latitude and new ice age posed new survival challenges. The brightest members of a group, though a small contingent, always constitute a cognitive surplus on which the group can draw when confronted by new threats to survival, such as colder or more variable weather or food sources. Prodded by adversity, this pool of potential imaginators developed physical techniques to make the environment less extreme and more predictable (Low, 1990).
More protective clothing, better shelters, more tools for different uses, ways to preserve and store food, and much more, enabled their groups to thrive in climates for which the human body is not otherwise physiologically adapted. Although migrating to new climates (or climate change in situ) may have sparked much innovation, it was these innovations that made daily life more cognitively complex. Each technological advance in taming adversity (e.g., hearths for cooking and heating inside enclosed shelters) could increase the need to anticipate, recognise, prioritise, and quickly mitigate its potential side-effects. Migration into successively less hospitable climes spurred new technologies that, individually or collectively, could ratchet up the g-related risk gradients for accidental death. A migration ratchet effect comports with the pattern of genetic divergence among current human populations who have ancestral origins in different regions of the world (Ingman, Kaessmann, Pääbo, & Gyllensten, 2000; Underhill et al., 2000). Increments in technological complexity need not have been large to be effective and, once again, were probably much smaller than moderns would assume necessary. For instance, it takes an extra 3 years of mental development for most children to progress from being able to copy a square (age four, on average) to copying a diamond (age seven; Jensen, 1980), the diamond being more cognitively complex for reasons that readers will readily recognise, and students of the first human tools (flaked stones) would appreciate (e.g., Wynn, 1996).

Conclusion

The deadly-innovations hypothesis is grounded in a vast nomological network of evidence on human intelligence in modern populations. It is consistent with recent evidence on trends in relative brain size and genetic divergence of human populations, archaic and modern, across time and place. But the scenario remains to be tested against competing hypotheses, such as that higher intelligence evolved as a result of sexual, not natural selection, because it signals to potential mates, not greater practical acumen, but superior genetic fitness or robust health (say, lower mutation load or greater developmental stability; Miller, 2000, Chapter 4). Whatever its validity, the chief strength of the deadly-innovations hypothesis may lie in the counterintuitive insights it introduces from other disciplines. Theories on the evolution of intelligence have focused on the same ecological demands that our ancestors focused on, namely, how to survive the most glaring, most certain threats to survival – starvation, disease, war, predation, and the elements. A general intelligence may indeed be useful for surviving these but, by themselves, they do not seem sufficient to evolve one. Instead, selection for a highly generalisable intelligence (g) may have been driven by what captures our attention least – the myriad, seemingly remote threats to life and limb that pervade the humdrum of daily life so thoroughly that they lull us into complacency. Fatal accidents pick us off one by one, unexpectedly, infrequently, and for reasons we often cannot fully control or even perceive, so we tend to chalk them up to bad luck. It also takes scarce time and energy to manage hazards effectively, especially for individuals who have few cognitive resources to spare for the task, so people often neglect it to focus on more central concerns.
Moreover, such neglect seldom leads to serious injury – like playing Russian roulette with a gun having one live round and a thousand blanks – so many of us are willing to tempt fate or be goaded into doing so. But evolution works precisely by playing tiny odds in whole populations over vast spans of time. When our ancestors began increasing those odds, one hazard at a time, they speeded us on our path toward evolving a remarkable domain general intelligence.

References

Ackerman, P. L., & Heggestad, E. D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121, 218–245.


Adler, N. E., Boyce, W. T., Chesney, M. A., Folkman, S., & Syme, L. (1993). Socioeconomic inequalities in health: No easy solution. Journal of the American Medical Association, 269, 3140–3145.
Aiello, L. C., & Wheeler, P. (1995). The expensive-tissue hypothesis: The brain and the digestive system in human and primate evolution. Current Anthropology, 36, 199–221.
Arvey, R. D. (1986). General ability in employment: A discussion. Journal of Vocational Behavior, 29, 415–420.
Baker, S. P., O'Neill, B., Ginsburg, M. J., & Li, G. (1992). The injury fact book (2nd ed.). New York: Oxford University Press.
Baksh, M., & Johnson, A. (1990). Insurance policies among the Machiguenga: An ethnographic analysis of risk management in a non-Western society. In E. Cashdan (Ed.), Risk and uncertainty in tribal and peasant economies (pp. 193–227). San Francisco: Westview Press.
Baron-Cohen, S. (2003). The essential difference: Male and female brains and the truth about autism. New York: Basic Books.
Barrick, M. R., Stewart, G. L., Neubert, M. J., & Mount, M. K. (1998). Relating member ability and personality to work-team processes and team effectiveness. Journal of Applied Psychology, 83, 377–391.
Bouchard, T. J., Jr., Lykken, D. T., Tellegen, A., & McGue, M. (1996). Genes, drives, environment, and experience: EPD theory revised. In C. P. Benbow & D. Lubinski (Eds.), Intellectual talent: Psychometric and social issues (pp. 5–43). Baltimore: Johns Hopkins University Press.
Buffardi, L. C., Fleishman, E. A., Morath, R. A., & McCarthy, P. M. (2000). Relationships between ability requirements and human errors in job tasks. Journal of Applied Psychology, 85, 551–564.
Campbell, B. G. (1996). An outline of human phylogeny. In A. Lock & C. R. Peters (Eds.), Handbook of human symbolic evolution (pp. 31–52). Oxford: Clarendon Press.
Campbell, J. P., & Knapp, D. J. (Eds.). (2001). Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Deary, I. J., Whiteman, M. C., Starr, J. M., Whalley, L. J., & Fox, H. C. (2004). The impact of childhood intelligence on later life: Following up the Scottish Mental Surveys of 1932 and 1947. Journal of Personality and Social Psychology, 86, 130–147.
Dunbar, R. I. M. (1998). The social brain hypothesis. Evolutionary Anthropology, 6, 178–190.
Evans, P. D., Gilbert, S. L., Mekel-Bobrov, N., Vallender, E. J., Anderson, J. R., Vaez-Azizi, L. M., et al. (2005). Microcephalin, a gene regulating brain size, continues to evolve adaptively in humans. Science, 309, 1717–1720.
Fiddick, L., Cosmides, L., & Tooby, J. (2000). No interpretation without representation: The role of domain specific representations and inferences in the Wason selection task. Cognition, 77, 1–79.
Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic Books.
Geary, D. C. (2005). The origin of mind: Evolution of brain, cognition, and general intelligence. Washington, DC: American Psychological Association.


Gigerenzer, G., & Todd, P. M. (Eds.). (1999). Simple heuristics that make us smart. Oxford: Oxford University Press.
Gordon, R. A. (1997). Everyday life as an intelligence test: Effects of intelligence and intelligence context. Intelligence, 24, 203–320.
Gottfredson, L. S. (1997). Why g matters: The complexity of everyday life. Intelligence, 24, 79–132.
Gottfredson, L. S. (2002). g: Highly general and highly practical. In R. J. Sternberg & E. L. Grigorenko (Eds.), The general factor of intelligence: How general is it? (pp. 331–380). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Gottfredson, L. S. (2004). Intelligence: Is it the epidemiologists' elusive “fundamental cause” of social class inequalities in health? Journal of Personality and Social Psychology, 86, 174–199.
Gray, J. R., Chabris, C. F., & Braver, T. S. (2003). Neural mechanisms of general fluid intelligence. Nature Neuroscience, 6, 316–322.
Hale, A. R., & Glendon, A. I. (1987). Individual behaviour in the control of danger. (Industrial Safety Series, 2). New York: Elsevier.
Hart, D., & Sussman, R. W. (2005). Man the hunted: Primates, predators, and human evolution. New York: Westview.
Hill, K., Boesch, C., Goodall, J., Pusey, A., Williams, J., & Wrangham, R. (2001). Mortality rates among wild chimpanzees. Journal of Human Evolution, 40, 437–450.
Hill, K., & Hurtado, A. M. (1996). Ache life history: The ecology and demography of a foraging people. New York: Aldine de Gruyter.
Holloway, R. L. (1995). Comment on “The expensive-tissue hypothesis.” Current Anthropology, 36, 213–214.
Holloway, R. L. (1996). Evolution of the human brain. In A. Lock & C. R. Peters (Eds.), Handbook of human symbolic evolution (pp. 74–116). New York: Oxford University Press.
Howell, N. (2000). Demography of the Dobe !Kung (2nd ed.). Hawthorne, NY: Aldine de Gruyter.
Ingman, M., Kaessmann, H., Pääbo, S., & Gyllensten, U. (2000). Mitochondrial genome variation and the origin of modern humans. Nature, 408, 708–713.
Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.
Jensen, A. R. (1984). Test validity: g versus the specificity doctrine. Journal of Social and Biological Structures, 7, 93–118.
Jensen, A. R. (1998). The g factor: The science of mental ability. New York: Praeger.
Jerison, H. J. (2002). On theory in comparative psychology. In R. J. Sternberg & J. C. Kaufman (Eds.), The evolution of intelligence (pp. 251–288). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Kaplan, H., Hill, K., Lancaster, J., & Hurtado, A. M. (2000). A theory of human life history evolution: Diet, intelligence, and longevity. Evolutionary Anthropology, 9, 156–185.
Kirsch, I. S., Jungeblut, A., Jenkins, L., & Kolstad, A. (1993). Adult literacy in America: A first look at the results of the National Adult Literacy Survey. Washington, DC: National Center for Education Statistics.
Link, B. G., & Phelan, J. (1995). Social conditions as fundamental causes of disease. Journal of Health and Social Behavior (Extra issue), 80–94.
Low, B. S. (1990). Human responses to environmental extremeness and uncertainty: A cross-cultural perspective. In E. Cashdan (Ed.), Risk and uncertainty in tribal and peasant economies (pp. 229–255). San Francisco: Westview Press.
Lubinski, D., & Humphreys, L. G. (1997). Incorporating general intelligence into epidemiology and the social sciences. Intelligence, 24, 159–201.
Lumsden, C. J., & Wilson, E. O. (1983). Promethean fire: Reflections on the origin of the mind. Cambridge, MA: Harvard University Press.
McDaniel, M. A., Schmidt, F. L., & Hunter, J. E. (1988). Job experience correlates of job performance. Journal of Applied Psychology, 73, 327–330.
Maller, J. B. (1933). Vital indices and their relation to psychological and social factors. Human Biology, 5, 94–121.
Mekel-Bobrov, N., Gilbert, S. L., Evans, P. D., Vallender, E. J., Anderson, J. R., Hudson, R. R., et al. (2005). Ongoing adaptive evolution of ASPM, a brain size determinant in Homo sapiens. Science, 309, 1720–1722.
Miller, G. F. (2000). The mating mind: How sexual choice shaped the evolution of human nature. New York: Doubleday.
Murray, C. (1998). Income inequality and IQ. Washington, DC: AEI Press.
National Center for Injury Prevention and Control (2002). Injury fact book, 2001–2002. Atlanta, GA: Centers for Disease Control and Prevention. Retrieved March 8, 2005, from http://www.cdc.gov/ncipc/fact_book/04_Introduction.htm
National Research Council (1985). Injury in America: A continuing public health problem. Washington, DC: National Academy Press.
O'Toole, B. I. (1990). Intelligence and behavior and motor vehicle accident mortality. Accident Analysis and Prevention, 22, 211–221.
O'Toole, B. I., & Stankov, L. (1992). Ultimate validity of psychological tests. Personality and Individual Differences, 13, 699–716.
Plomin, R., DeFries, J. C., McClearn, G. E., & McGuffin, P. (2001). Behavioral genetics (4th ed.). New York: Worth.
Reuning, H. (1988). Testing bushmen in the central Kalahari. In S. H. Irvine & J. W. Berry (Eds.), Human abilities in cultural context (pp. 453–486). Cambridge: Cambridge University Press.
Rowe, D. C., Vesterdal, W. J., & Rodgers, J. L. (1998). Herrnstein's syllogism: Genetic and shared environmental influences on IQ, education, and income. Intelligence, 26, 405–423.
Ruff, C. B., Trinkaus, E., & Holliday, T. W. (1997). Body mass and encephalization in Pleistocene Homo. Nature, 387, 173–176.
Rushton, J. P., & Rushton, E. W. (2001). Brain size, IQ, and racial-group differences: Evidence from musculoskeletal traits. Intelligence, 31, 139–155.
Sarich, V., & Miele, F. (2004). Race: The reality of human differences. Boulder, CO: Westview Press.
Schmidt, F. L., & Hunter, J. E. (2004). General mental ability in the world of work: Occupational attainment and job performance. Journal of Personality and Social Psychology, 86, 162–173.
Schmidt, F. L., Law, K., Hunter, J. E., Rothstein, H. R., Pearlman, K., & McDaniel, M. A. (1993). Refinements in validity generalization methods: Implications for the situational specificity hypothesis. Journal of Applied Psychology, 78, 3–13.
Schoenemann, P. T., Sheehan, M. J., & Glotzer, L. D. (2005). Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nature Neuroscience, 8, 242–252.


Smith, G. S., & Barss, P. (1991). Unintentional injuries in developing countries: The epidemiology of a neglected problem. Epidemiologic Reviews, 13, 228–266.
Sternberg, R. J. (1997). Successful intelligence: How practical and creative intelligence determine success in life. New York: Plume.
Toga, A. W., & Thompson, P. M. (2005). Genetics of brain structure and intelligence. Annual Review of Neuroscience, 28, 1–23.
Tooby, J., & Cosmides, L. (1992). The psychological foundation of culture. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 19–136). New York: Oxford University Press.
Underhill, P. A., Shen, P., Lin, A. A., Jin, L., Passarino, G., Yang, W. H., et al. (2000). Y chromosome sequence variation and the history of human populations. Nature Genetics, 26, 358–361.
US Department of Labor (1991). Dictionary of occupational titles (4th ed. rev.). Washington, DC: US Department of Labor, Employment and Training Administration, US Employment Service.
Williams, G. C. (1992). Natural selection: Domains, levels, and challenges. Oxford: Oxford University Press.
Wynn, T. G. (1996). The evolution of tools and symbolic behaviour. In A. Lock & C. R. Peters (Eds.), Handbook of human symbolic evolution (pp. 263–287). Oxford: Clarendon Press.

18 Heritability and the nomological network of g

Nathan Brody

General intelligence (g) is a hypothetical molar component of intellect that is postulated to explain a series of relationships that collectively serve to triangulate the construct in a nomological network. Four relationships that provide empirical support for g are considered in this chapter. These are: (1) relationships between measures of intelligence obtained at different times over the lifespan of individuals; (2) relationships between intelligence and academic achievement; (3) relationships among different cognitive abilities; and (4) relationships between intelligence and relatively simple “elementary information processing” measures. This chapter reviews behavioural genetic analyses of the covariances considered in each of these relationships. They result in a partitioning of the covariance into three components: (1) a shared environmental component that leads to the development of relationships among two or more measures for individuals reared in the same family; (2) a nonshared environmental component that leads to relationships among two or more measures as a result of influences not shared by individuals reared in the same family; and (3) a genetic component that leads to relationships between measures for individuals who have similar or identical genes. The analyses indicate that the relationships that define g derive neither from shared family environmental influences nor from idiosyncratic influences that are not shared by individuals reared in the same family. Rather, they derive predominantly from genetic influences. These results indicate that there is a common genetic g that contributes to the stability of measures of intelligence over the lifespan, to the development of relationships between intelligence and academic achievements, to relationships among measures of different cognitive abilities, and to relationships between elementary processing measures and complex intellectual abilities and achievements.
The theoretical analysis presented in this chapter indicates that g, arguably the most important cognitive ability, is a molar construct that is not an adventitious component of the mind. Rather it is rooted in the structure and function of the nervous system. The chapter concludes with a brief critique of modular and bioecological views of the structure of intellect that provide a conception of intelligence that is not compatible with the evidence reviewed in this chapter.
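The three-way partitioning of covariance described in the introduction above can be illustrated with a small Python sketch. This is not the model-fitting procedure used in the studies the chapter reviews; it is a simplified Falconer-style calculation that assumes a classical twin design, purely additive genetic effects, and equal shared environments for identical (MZ) and fraternal (DZ) twins. The function names and all numerical inputs are hypothetical.

```python
from typing import NamedTuple


class Components(NamedTuple):
    genetic: float        # share attributed to additive genetic influences
    shared_env: float     # share attributed to family (shared) environment
    nonshared_env: float  # share attributed to nonshared environment and error


def partition_variance(r_mz: float, r_dz: float) -> Components:
    """Falconer-style split of one measure's standardized variance.

    r_mz and r_dz are the twin correlations for MZ and DZ pairs.
    """
    return Components(2.0 * (r_mz - r_dz), 2.0 * r_dz - r_mz, 1.0 - r_mz)


def partition_covariance(r_xy: float, cross_mz: float, cross_dz: float) -> Components:
    """Same logic applied to the correlation between two measures x and y.

    cross_mz and cross_dz are cross-twin cross-trait correlations
    (measure x in one twin with measure y in the co-twin).
    The three components sum to the phenotypic correlation r_xy.
    """
    return Components(
        2.0 * (cross_mz - cross_dz),
        2.0 * cross_dz - cross_mz,
        r_xy - cross_mz,
    )


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    parts = partition_covariance(r_xy=0.50, cross_mz=0.45, cross_dz=0.25)
    print(parts)
    print("genetic share of the correlation:", parts.genetic / 0.50)
```

With inputs of this hypothetical kind, most of the correlation between the two measures is assigned to the genetic component, which is the pattern the chapter goes on to argue for in each of the four relationships.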

Longitudinal considerations

The composition of tests of cognitive ability changes dramatically from infancy to adulthood. Despite this, there are longitudinal continuities in individual differences in cognitive ability that provide evidence for the stability of a latent trait.

Phenotypic relationships

Performance on measures of infant information processing during the first year of life, such as fixation times in a visual habituation study, is predictive of performance on tests of intelligence up to age 11 (the latest age for which information is available in longitudinal studies; see Colombo, 1993). Kavjek (2004) performed a meta-analysis of studies relating infant information processing measures to childhood IQ. He obtained a meta-analytic correlation of .37 between infant measures obtained prior to age 1 and childhood IQ from age 1.5 to age 11. Corrections for attenuation (infant measures are not highly reliable) suggest that infant measures may account for over 50% of the variance in childhood IQ. IQ tests administered after age 11 also exhibit substantial continuities. Pinneau (1961) reported a correlation of .96 between an aggregate IQ based on performance at ages 11, 12, and 13 and an aggregate IQ for ages 17 and 18. Preadolescent IQ is also predictive of IQ over the adult lifespan. The most dramatic evidence for the continuity of intellectual disposition derives from the analyses of the cohort of Scottish children who took a group intelligence test (the Moray House Test) at age 11 and at age 77. Deary, Whalley, Lemmon, Crawford, and Starr (2000) obtained a test–retest correlation of .73, corrected for restrictions in range of talent. This correction was necessary because IQ at age 11 is inversely related to mortality. Individuals with low IQ are less likely to be represented in the age 77 follow-up than individuals with high IQ. Deary et al. (2000) found that the time-lagged correlation between the Moray House Test administered at age 11 and the Raven Test administered at age 77 was .48. This value may be compared to the concurrent correlation between the Raven and Moray House Tests, both administered at age 77, of .57. The relatively small difference between concurrent and time-lagged correlations indicates that the latent disposition of general intelligence, imperfectly assessed by both of these tests, remains relatively invariant from age 11 to age 77. The longitudinal stability of intelligence implies that this trait is not highly responsive to the many environmental variations that individuals are likely to encounter over their lifespans. For example, there is evidence that variations in educational exposure are not likely to have a large enduring influence on intelligence. I believe that research on preschool interventions, educational deprivations, and the intelligence of deaf individuals provides evidence for this assertion. The Abecedarian Project is an experimental study of the long-term effects of intensive educational interventions on children assumed to be at risk of inadequate intellectual development. The Project randomly assigned children reared in extreme poverty either to an experimental intervention, in which they were provided with intensive early childhood education beginning shortly after birth and extending through the preschool period, or to a control group. Children assigned to the experimental group had higher scores on tests of intelligence, but the effect sizes for the intervention declined from .38 at age 6.5, to .31 at age 12, to .19 at age 21 – the most recent follow-up data obtained 16 years after the end of the intervention (Campbell, in press). Campbell reported the results for a regression analysis in which childhood verbal ability scores were entered as a variable prior to the entry of the dummy coded experimental variable. The effect sizes for intelligence in this analysis declined from .10 at age 6.5, to −.21 at age 12, to −.38 at age 21. The negative values may be interpreted by assuming that childhood verbal ability is influenced by two components of variance – one indicative of a core intellectual disposition and the other reflecting the effects of early intervention. This latter component declines over the adult lifespan of an individual. Predictions of adult intelligence that include the latter component are likely to err by overpredicting performance. These results are compatible with the assertion that intensive early intervention has only marginal and declining influence on core intellectual disposition.
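Returning briefly to the correction for attenuation applied earlier in this section to the meta-analytic correlation of .37, the arithmetic can be sketched in Python. The chapter does not report the reliabilities used, so the values below (.30 for the infant measure, .90 for childhood IQ) are hypothetical and chosen only to show how a modest observed correlation can imply more than half of the variance once unreliability is taken into account. The function name is likewise an assumption; the formula is Spearman's standard correction.

```python
import math


def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation.

    Estimates the correlation between two constructs from the observed
    correlation and the reliabilities of the two measures.
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


if __name__ == "__main__":
    # Hypothetical reliabilities; only r = .37 comes from the text.
    r_true = disattenuate(0.37, rel_x=0.30, rel_y=0.90)
    print("corrected correlation:", round(r_true, 3))
    print("variance accounted for:", round(r_true ** 2, 3))
```

The lower the assumed reliability of the infant measure, the larger the corrected estimate, so conclusions of this kind are sensitive to the reliability figures chosen.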
DeGroot (1951) studied the intelligence of Dutch adolescents who had been deprived of one or more years of formal education owing to the German occupation of The Netherlands during the Second World War. He found that deprived children exhibited declines in intelligence, and this study is often cited as providing evidence for the in¯uence of education on intelligence. Less often noticed is that the oldest cohort in DeGroot's study, who had experienced educational deprivations several years earlier, did not exhibit any decline at all. These results thus provide simultaneous evidence of the effects of educational deprivation on intelligence, and the resilience of intelligence in response to educational deprivation. The effects of educational deprivation were not long-lived. Braden (1994) comprehensively reviewed the performance of deaf individuals on tests of intelligence, ®nding that they exhibit cumulative de®cits on measures of verbal ability and verbal achievement, relative to hearing individuals, over the course of their educational career. Despite the dif®culties they exhibit in acquiring verbal skills, deaf individuals do not exhibit de®cits on nonverbal measures of abstract reasoning ability that are markers for ¯uid intelligence. Thus these individuals do not exhibit declines in g despite the apparent dif®culties they have in bene®ting from tuition requiring verbal skills.


In principle, environmental variations associated with educational deprivations or educational interventions could result in cumulative changes in IQ. The data do not support this outcome. Rather, the results reviewed above suggest that relatively dramatic changes in the environment have vanishingly small influences on general intelligence in the long run, although they may have large short-term effects. These results, combined with evidence for stability of IQ, suggest that environmental variations commonly encountered do not have enduring influences on cognitive ability – g is a relatively resilient trait whose short-term perturbations are accompanied by a tendency for its phenotypic manifestations to revert to an enduring stable value, manifested initially in early childhood or infancy.

This conclusion is buttressed by the results of a longitudinal analysis of changes in IQ reported by Moffitt, Caspi, Harkness, and Silva (1993), who administered the Wechsler Intelligence Scale for Children (WISC) test to a representative sample of children when they were age 7, 9, 11, and 13. They obtained test–retest correlations varying between .74 and .84, and concluded that for close to 90% of the children in their sample, variations in IQ over this period were small and attributable to random errors of measurement. They also found a subset of children who exhibited larger changes in IQ over this period. They identified 37 different environmental measures that might be related to changes in IQ, including socioeconomic status, changes in family composition, and such biological influences as impaired vision or perinatal problems. They found that this set of environmental variables was not associated with changes in IQ.

Genotypic relationships

Behavioural genetic investigations of longitudinal relationships provide a method for assessing genetic and environmental influences on stability and change in intelligence.
These analyses indicate that the heritability of intelligence increases from childhood to postadolescence. In a longitudinal study, Plomin, Fulker, Corley, and DeFries (1997) repeatedly administered intelligence tests to adopted children for the first 16 years of life. Their study also included a control sample of children reared in the same community by their natural parents. Correlations between the IQs of the biological parents and the IQs of their adopted-away children, at different ages, were compared to correlations between the IQs of the adoptive parents and these same adopted children. Furthermore, correlations were obtained between the IQs of the biological parents in the control families and the IQs of their own biological children at comparable ages. The correlations between all biological parents and their children were quite similar, both for the biological parents who were rearing their children and for the biological parents whose children had been adopted-away. Both sets of correlations exhibited monotonic increases with age. The correlation between the IQs of biological parents and their adopted-away children at age 16 was .39 for mothers and .32 for fathers. The comparable correlations for biological mothers and fathers in the control group, who were rearing their biological children, were .28 and .33, respectively. The comparable correlations between adoptive mothers and adoptive fathers and their adopted children were −.05 and .11, respectively. The increasing correlation between the IQs of biological parents and the IQs of their children – whether adopted away or not – combined with evidence for a lack of relationship between the IQs of adoptive parents and their adopted children provides clear evidence of an increase in the heritability of IQ from ages 1 to 16.

Longitudinal twin studies also provide evidence for an increasing heritability of IQ with age. Wilson (1983) assessed IQ in one such study from infancy until age 15. Monozygotic (MZ) twin correlations approached their asymptotic value (above .85) at age 3 and remained relatively invariant until age 15. Dizygotic (DZ) twin correlations declined from .78 to .54 over this period. The increased discrepancy between MZ and DZ twin correlations for IQ implies that its heritability increases from early childhood to age 15.

Spinath and Plomin (2003) obtained heritability estimates for IQ from a large and representative cohort of British twins using a longitudinal design. They obtained a composite measure of intellectual ability for these twins at ages 2, 3, and 4. They also analysed data for the same twins when they were 7. They found that the shared environmental influence on intelligence declined from .75 at the earlier ages to .31 for age 7. Heritability estimates, by contrast, increased from .22 at the earlier ages to .57 by age 7. These data indicate that there is a dramatic amplification of genetic influences from early to late childhood accompanied by a dramatic decline in shared family influence.

Multivariate genetic designs may be used to evaluate changes in IQ and the stability of IQ in a longitudinal design.
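The twin correlations reported by Wilson can be turned into rough variance estimates with Falconer's formulas, a standard shortcut that the chapter itself does not spell out. The Python sketch below is illustrative only: the function name `falconer_estimates` is hypothetical, and the MZ correlation of .85 is an assumed round value, since the text says only that it was above .85.

```python
def falconer_estimates(r_mz, r_dz):
    """Rough variance components from MZ and DZ twin correlations.

    Uses Falconer's formulas; this is a back-of-envelope approximation,
    not the model-fitting approach used in the studies cited.
    """
    h2 = 2 * (r_mz - r_dz)   # heritability (additive genetic variance)
    c2 = 2 * r_dz - r_mz     # shared (family) environmental variance
    e2 = 1 - r_mz            # nonshared environment plus measurement error
    return {"h2": round(h2, 2), "c2": round(c2, 2), "e2": round(e2, 2)}


# Assumed input: r_mz = .85 at both ages (the text says only "above .85").
# The DZ correlations of .78 and .54 are the values quoted from Wilson (1983).
early_childhood = falconer_estimates(r_mz=0.85, r_dz=0.78)
age_15 = falconer_estimates(r_mz=0.85, r_dz=0.54)

print("early childhood:", early_childhood)
print("age 15:", age_15)
```

Under these assumptions the heritability estimate rises from about .14 in early childhood to about .62 at age 15, while the shared environment estimate falls from about .71 to about .23. This matches the direction of the trend described in the text; formal model-fitting analyses would give somewhat different values.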
Behavioural genetic methods may be used to analyse genetic and environmental relationships between measures. This method is based on an analysis of crosscorrelations between measures for individuals who differ in their degree of genetic resemblance. Consider a hypothetical example. A crosscorrelation of .60 between verbal and spatial ability is obtained for MZ twins by correlating scores on a test of verbal ability for one member of an MZ twin pair with the score on a test of spatial ability for his or her co-twin. The comparable correlation for DZ twin pairs is .30. These correlations support three inferences. First, genetic influences contribute to the relationship between verbal and spatial ability test scores. This inference is supported by the finding that the MZ twin crosscorrelation is higher than the DZ correlation. Second, shared environmental influences do not contribute to the relationship. If the DZ crosscorrelation had been greater than one-half the value of the MZ crosscorrelation, it would be possible to infer that shared environmental influences contribute to the relationship. The MZ twin correlation is twice the value of the DZ twin correlation, thus the difference is accounted for by genetic differences between MZ and DZ twins. Third, nonshared environmental influences contribute to the relationship. The MZ correlation is less than 1.00.

It is possible to estimate the genetic correlation between two variables – providing an indication of the degree to which the same genes influence both. It is also possible to estimate the degree to which a common genetic influence accounts for the phenotypic correlation between two variables. In the example given above, ignoring the complexities of multivariate analyses, 60% of the phenotypic correlation between verbal and spatial ability is attributable to genetic influences and 40% is attributable to nonshared environmental influences. Extending this to a multivariate analysis of a battery of cognitive tests, it is possible to analyse genetic and environmental contributions to the phenotypic factor structure derived from multivariate analyses of relationships among diverse measures of cognitive ability.

Petrill et al. (2004) used a multivariate analysis to assess genetic and environmental influences on stability and change in IQ using data from the Colorado Adoption Project. They obtained correlations between IQ scores, obtained at different ages, for adoptive unrelated siblings reared in the same family, and for biologically related siblings reared in the same family. Their model fitting analyses indicated that stability in IQ was largely attributable to genetic influences. Common family environmental influences on IQ stability and change were dropped from the model without subsequent degradation of the fit. Nonshared environmental influences (including errors of measurement) were responsible for age-specific instability in IQ but were not related to stability over time. The estimated genetic correlations between IQ at age 16 and IQ tested at ages 1, 2, 3, 4, 7, 9, and 12 were .08, .71, .69, .69, 1.00, .82, and 1.00.
The age 1 data are based on the Bayley test, which is now known to have relatively low predictive validity for subsequent IQ. Using infant information processing measures instead might have increased the genetic correlation substantially.

Longitudinal analyses of changes in intelligence indicate that its phenotypic measure is increasingly determined by a person's genotype, and as this influence becomes dominant, stability in IQ is determined by genetic influences that are similar at different ages. There is relatively little evidence of age-specific genetic influences, or of environmental influences that could be alternative causes of the stability of IQ.

Caspi and Moffitt (1991) presented a model of personality change that may be relevant to understanding changes in intelligence. They assumed that personality traits determine the response to novel events and that, subsequently, the response to these novel experiences will accentuate prior differences in personality. Caspi and Moffitt obtained data that support this model in a study of age of menarche. They analysed behaviour problems in girls who had experienced menarche early, assuming that this would be a stressful novel experience. They obtained a measure of behaviour problems prior to the onset of menarche and used a median split to create groups of girls who were high or low in them. They found that differences between behaviour problem groups increased after this novel stressful experience.

Changes in intelligence may be hypothesised as taking place in a similar way. Prior differences in cognitive ability determine people's responses to novel encounters, and the outcome is the enhancement and sharpening of pre-existent differences in cognitive ability. The increase in heritability of intelligence, and the decrease in influence of shared family environments associated with exposure to formal schooling, may be understood in terms of the reaction of individuals to novel environments as influenced by their pre-existing cognitive abilities.
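The hypothetical verbal and spatial ability example given earlier in this section can be worked through numerically. The Python sketch below follows the chapter's own simplification, treating the crosscorrelations like ordinary twin correlations and ignoring the complexities of a full multivariate analysis. The function name `decompose_crosscorrelation` and the doubling of the MZ and DZ difference are assumptions made for illustration; they do not reproduce any specific published model.

```python
def decompose_crosscorrelation(r_mz_cross, r_dz_cross):
    """Split a cross-twin, cross-trait correlation into rough components.

    Simplified logic (an assumption mirroring the hypothetical example):
    MZ twins share all their genes and DZ twins about half, so doubling
    the MZ-DZ difference approximates the genetic part.
    """
    genetic = 2 * (r_mz_cross - r_dz_cross)
    shared_env = 2 * r_dz_cross - r_mz_cross   # zero when MZ is exactly twice DZ
    nonshared_env = 1 - r_mz_cross             # the part MZ co-twins do not share
    return {
        "genetic": round(genetic, 2),
        "shared_env": round(shared_env, 2),
        "nonshared_env": round(nonshared_env, 2),
    }


# Values from the hypothetical example: MZ crosscorrelation .60, DZ .30.
example = decompose_crosscorrelation(r_mz_cross=0.60, r_dz_cross=0.30)
print(example)
```

With the example values the genetic component is .60, the shared environmental component is zero, and the nonshared component is .40, which corresponds to the 60% and 40% split stated in the text.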

From ability to achievement and attainment

The term “ability” has the connotation of being present prior to its actualisation. Abilities refer to capacities to accomplish something. The accomplishment need not be actualised. The term, like “solubility,” is a disposition term – i.e., a property of something that is capable of being manifested in appropriate circumstances. This analysis may be extended to an expanded continuum of ability and achievement. Genotypes for intelligence constitute a logical anchor for one end of the continuum. What Cattell (1971) called historical fluid ability constitutes the next term of the continuum. Cattell understood this construct to represent a prior state of ability that had influenced the current level manifested in obtained test scores. In Cattell's investment theory of intelligence, historical fluid ability is invested in current fluid ability which, in turn, may be invested in the development of crystallised abilities. Crystallised abilities may be invested in the development of academic achievements, which may be construed as defining the opposite end of the ability–achievement continuum.

The longitudinal behavioural genetic analyses reviewed above imply that changes in psychometric indices of intelligence usually lead to an increased congruence between these measured scores and a latent ability trait (defined at its extreme in terms of the genotypes that represent a nascent influence on intelligence, which are present at the moment of conception). Measured ability may reflect the current status of cumulative genetic influences on the kinds of investments that will go on to determine subsequent abilities and achievements. This analysis implies that there should be both a phenotypic and a genotypic relationship between intelligence and academic achievement.

There is an extensive literature documenting the ubiquitous relationship between scores on tests of intelligence and various measures of achievement.
The magnitude of this relationship is easily underestimated by the use of single measures of achievement that do not reflect the broad acquisition of knowledge associated with exposure to formal education. One of the most comprehensive studies of the relationship between g and academic performance has recently been completed by Deary (2004). For a representative sample of 70,000 British students, he analysed the relationship between performance on tests of intelligence at age 11, and performance on 25 tests of achievement. These constitute the required assessment of academic competence of students finishing their compulsory education in England at age 16, and represent all the subject areas taught in British secondary schools. Students typically take a subset of these tests – usually about eight – in order to obtain evidence of their academic attainment. The correlation between his measure of g obtained at age 11 and the total score of academic qualifications obtained by a student at age 16 was .69. Deary also performed a latent trait analysis for these data, obtaining a correlation of .81 between the latent trait for intelligence at age 11 and the latent trait of achievement (using the five most common tests of the latter). Deary's results indicate that a single measure of g, obtained prior to the start of secondary school, is highly predictive of a comprehensive assessment of the total academic achievement of students on completion.

The relationship between intelligence and achievement may be analysed by behavioural genetic methods. Heritability estimates obtained for academic achievement vary widely. Petrill and Wilkerson (2000) attribute some of the variability to the age of testing and suggest that, like intelligence, academic achievement increases in heritability from childhood to postadolescence. Multivariate studies of the relationship between intelligence and achievement indicate that there are shared genetic influences linking the two measures. Thompson, Detterman and Plomin (1991) analysed the relationship between performance on tests of verbal and spatial abilities and measures of academic achievement in reading, mathematics, and language skills in a sample of 6- to 12-year-old twins.
Their analyses indicated that the academic achievement tests had relatively low heritabilities ranging from .17 to .27, and relatively high shared environmental influences ranging from .65 to .73. The measures of intelligence used had substantially higher heritabilities (.54 and .70 for verbal and spatial ability measures, respectively) and much lower shared family environmental influences – close to zero. An analysis of the covariance between intelligence and achievement indicated that shared genetic influences accounted for approximately 80% of the phenotypic covariance between intelligence and achievement. The phenotypic correlation between intelligence and achievement was not influenced by shared family environment (a result that is compatible with its minimal influence on measures of intelligence).

Wadsworth (1994) obtained similar results in a study of 7-year-old biologically related and unrelated siblings using data obtained from the Colorado Adoption Project. This study indicated that shared genetic influences accounted for 33% to 64% of the phenotypic covariance between measures of ability derived from the WISC Test and measures of academic achievement derived from the Peabody Test.

Bartels, Rietveld, van Baal, and Boomsma (2002) obtained IQ data from a longitudinal study of twins at ages 5, 7, 10, and 12. They analysed the covariance between IQ and a comprehensive test of academic achievement at age 12, used to assess children at the completion of elementary school in The Netherlands. The estimated genetic contributions to the covariance between IQ and achievement were .40, .75, .83, and .41. The corresponding shared environmental contributions were .60, .25, .17 and .51. The longitudinal data for ages 5, 7, and 10 indicate a pattern of increasing genetic covariance between IQ and achievement as children grow older. The age 12 results provide a somewhat surprising reversal of this pattern and may be influenced by age-specific environmental effects that do not contribute to continuities in IQ or achievement.

Wainwright, Wright, Geffen, Luciano, and Martin (2005) reported a twin study that investigated the relationship between the Multidimensional Aptitude Battery (a multiple-choice test of intelligence that is based on the Wechsler Adult Intelligence Scale (WAIS)) and performance on the Queensland Core Skills Test (an academic achievement test administered to students in their final year of secondary school). The Queensland Test is designed to assess general academic competencies rather than those achievements that are closely tied to the contents of the curriculum. This study is one of the few genetically informative ones of relationships between intelligence and achievement assessed in students at the end of required formal education. IQ was assessed at age 16 and achievement was assessed at age 17. They obtained a heritability estimate of .67 for their measure of academic achievement. A common genetic factor influencing performance on the Queensland Test and on verbal IQ (VIQ) and performance IQ (PIQ) measures accounted for 55%, 62% and 23% of the variance on these measures, respectively. Genetic influences accounted for 72% of the covariance between VIQ and scores on the Queensland Test and 72% of the covariance between PIQ and scores on the Queensland Test.
The genetic correlation between VIQ and the Queensland Test was .91 – the comparable correlation between PIQ and the Queensland Test was .64. VIQ is a measure of crystallised ability, and is closer on the ability–achievement continuum to achievement than PIQ. The results indicate that the phenotypic correlation between verbal ability and the Queensland measure is almost completely determined by common genetic influences on both measures.

Academic achievement is positively related to educational attainment – typically measured in terms of the number of years of education completed. Educational attainment is heritable: Baker, Treloar, Reynolds, Heath, and Martin (1996) estimated its heritability to be .57 for a sample of Australian twins. Academic achievement and intelligence are genetically covariant; educational attainment is heritable. Are educational attainment and intelligence genetically covariant? Rowe, Vesterdal and Rodgers (1998) analysed relationships between a crystallised measure of intelligence and educational attainment from full and half siblings participating in the National Longitudinal Survey of Youth. They used these data to obtain estimated heritabilities for IQ and educational attainment of .64 and .68, respectively.


A common genetic factor accounted for 68% of the covariance between educational attainment and IQ.

Summary

Intelligence may be construed as a latent trait that is initially manifested in the ways in which neonates respond to stimuli in the first year of life. The trait is related to the acquisition of more complex reasoning abilities and also to academic achievements. Genetic covariance analyses indicate that the continuum of changes from basic information-processing abilities in infancy, to complex reasoning skills present in measures of fluid ability, to more crystallised skills and academic achievements, are all influenced by a common genetic g that contributes to continuities in individual differences in intelligence assessed at different ages. The genetic g factor also accounts for a substantial amount of the covariance between achievements and abilities. The covariant relationships may be understood in terms of a model that assumes that there is a relationship between heritable characteristics of persons and their response to novel environmental encounters. Individual differences determine the response to novel encounters and these differences are, in turn, clarified and accentuated by the responses, leading to an increase in the heritability of manifestations of the underlying disposition. The changing characteristics of the trait, manifested in part by its extended relationships from primordial manifestations to broader socially relevant attainments, are ultimately determined by genetic characteristics present at the moment of conception.

This sketch of the role of genetic influences on intellectual development is subject to two qualifications. First, the unfolding process of genetic influence presupposes a sufficiently supportive environment in which initial differences may be expanded and amplified. There is evidence indicating that the heritability of intelligence is relatively low for children whose parents have little or no formal education (Rowe, 2003).
Adoption studies such as the Colorado Study include families who vary widely in IQ and social class. Owing to the vetting procedures used to select adoptive families, such studies include few families living at the margins of society. The addicted, the unemployed, and the broadly dysfunctional are not well represented. Thus the unfolding process sketched above may not be descriptive of outcomes for a subset of individuals encountering less privileged environments.

Second, the simplified model sketched above is a unidirectional causal model in which a genotypic intellectual ability is causally related to the development of complex skills and educational achievements and attainments. The studies reviewed do not provide refined analyses of the causal sequence that is postulated. The precise relationships among the unfolding sequence of relationships are subject to many complex reciprocal causal connections and causal disconnects.


Dolan, Colom, Abad, Wicherts, Hessen, and van der Sluis (2006) performed a study that is illustrative of some of the complexities involved in assuming a simple unidirectional causal analysis of the relationship between intelligence and educational attainment. They analysed WAIS standardisation data for a representative sample of the population of Spain. They used multivariate analyses to obtain measures of g and of first-order factors of Verbal Ability, Perceptual Organisation, Working Memory and Perceptual Speed. They analysed the relationship between these various cognitive ability factors and educational attainment for their sample, finding that educational attainment was related to first-order factors and to g. Educational attainment, considered as an independent variable, influenced first-order factors, and, independently of its influence on these factors, it also influenced g. With educational attainment considered as a dependent variable, the pattern of influence was different. Educational attainment was influenced by first-order factors but not by g. These results imply that g (as opposed to first-order factors) is not causally related to educational attainment. It should be noted in interpreting this result that the study is not longitudinal. Educational attainment was determined for adults on a single occasion. Thus the results represent an analysis of relationships at the end of a completed process. Longitudinal analyses, in which there is a multivariate decomposition of intellectual ability, might implicate the g factor – as opposed to second-stratum abilities in Carroll's model – in causal relationships with educational attainment.

Behavioural genetics and modularity and molarity

Early in life there is very little evidence of genetic correlation between different measures of ability. Consider the results of a study by Price, Eley, Dale, Saudino, and Plomin (2000) on a large representative sample of a cohort of 2-year-old English twins. Verbal and nonverbal ability measures based on parental observations were obtained. The genetic correlation between the measures was .30. A common genetic influence did not have a substantial contribution to the phenotypic correlation between these two measures (r = .42). Shared environmental influences were larger than genetic influences on each of the measures, and on the relationship between them.

Rietveld, van Baal, Dolan, and Boomsma (2000) administered six subtests of the RAKIT – a Dutch intelligence test – to a sample of 5-year-old twins. They tested a number of models of genetic and environmental influences on the structure of relationships among these tests. Their best fitting genetic model was one that postulated a genetic nonverbal factor and a genetic verbal factor as well as test-specific genetic factors. The model fit was not degraded by the assumption that the genetic correlation between the verbal and nonverbal factors was zero. This model was not congruent with the phenotypic factor structure. The verbal and nonverbal factors derived from the phenotypic analysis were positively correlated. They also found evidence for a general shared environmental factor and for test-specific nonshared environmental influences. The Rietveld et al. study and the Price et al. study both provide support for an absence of strong genetic relationships between verbal and nonverbal abilities in childhood.

Analyses of genetic relationships for older samples provide evidence for the presence of a genetic g factor that influences performance on all the tests in a battery. Consider the results obtained by Luciano et al. (2003a) for a study of adolescent twin performance on five subtests of the Multidimensional Aptitude Battery – a group-administered test based on the WAIS – and the digit symbol subtest of the WAIS. Their phenotypic factor analysis supported a two-factor verbal and performance model. The factors were correlated, supporting the presence of a g factor. The genotypic factor structure was isomorphic with the phenotypic one, with the genetic verbal and performance factors correlated .47. This study provides evidence for genetic influences that are operative at three levels of generality.

1 There is a genetic g that has a variable influence on performance on all the subtests of the battery.
2 There are independent genetic influences on all tests that together load on the same factor. Thus there is a genetic influence on verbal tests that is independent of the genetic g factor.
3 There are genetic influences on specific tests that are both independent of the genetic g factor and independent of genetic influences on all the other tests that load on the same factor.

At this age, shared environmental influences are weaker and do not influence verbal and performance factor scores.

There is additional evidence for a common genetic influence on diverse measures of ability in older samples. Petrill (2002) analysed the results of nine studies of genetic and environmental influences on verbal, spatial, perceptual speed, and memory abilities. Each of the studies analysed heritability, phenotypic correlations and genotypic correlations for these measures. They had samples that varied in age from 4 to over 80. Petrill analysed the contribution of a common genetic factor to the total phenotypic variance in each of these abilities in each study. Table 18.1 presents a reanalysis of these data that indicates the magnitude of a common genetic influence on total phenotypic variance for each of these abilities. The data presented indicate that genetic g accounts for more of the variance in tests of ability in adults than in childhood. The genetic g factor (i.e., a factor representing common genetic influences on all the tests) accounts for more variance in the tests than test-specific genetic influences (for a dissenting view see Grigorenko, 2002). These results imply that genetic influences are primarily those that support molar relationships among diverse cognitive abilities. Shared environmental influences decline in importance in adulthood as contributors to variance on individual tests or to relationships between tests. Nonshared environmental influences contribute to phenotypic variance on individual tests but not to relationships among tests of different cognitive abilities.

Table 18.1 Contribution of genetic g to phenotypic variance in studies with samples varying in age

         Abilities
Age      Verbal   Spatial   Perceptual speed   Memory
4        .04      .10       .10                .04
7        .21      .11       .10                .09
6–16     .22      .12       .39                .05
40–80    .32      .38       .37                .15

Source: Based on Petrill (2002).
1 Age 4 data are derived from a single study – Rice et al. (1989). Age 7 data are derived from a single study – Cardon, Fulker, DeFries, and Plomin (1992). Age 6–16 data are derived from two studies – Luo, Petrill, and Thompson (1994) and Casto, DeFries, and Fulker (1995). Data for adults are derived from five studies – Tambs, Sundet, and Magnus (1986) and Finkel, Pedersen, Plomin, and McClearn (1998) (this study includes separate analyses of three data sets); and Petrill et al. (1998).
2 Estimates are based on data presented by Petrill (2002) in Figures 11.2, 11.3, 11.4, and 11.5.
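The age trend in Table 18.1 can be checked with a few lines of Python. The sketch below assumes that each row of values in the table belongs to one ability and runs across the four age groups in the order 4, 7, 6–16, and 40–80. The unweighted mean across abilities is a summary chosen here for illustration and is not a statistic reported by Petrill (2002).

```python
# Age groups in the order listed in Table 18.1.
AGE_GROUPS = ["4", "7", "6-16", "40-80"]

# Proportion of phenotypic variance attributed to genetic g, as read from
# Table 18.1 (assumed layout: one list per ability, one value per age group).
GENETIC_G_VARIANCE = {
    "verbal":           [0.04, 0.21, 0.22, 0.32],
    "spatial":          [0.10, 0.11, 0.12, 0.38],
    "perceptual_speed": [0.10, 0.10, 0.39, 0.37],
    "memory":           [0.04, 0.09, 0.05, 0.15],
}


def mean_by_age(table, age_groups):
    """Unweighted mean contribution of genetic g across abilities, per age group."""
    means = {}
    for i, age in enumerate(age_groups):
        values = [row[i] for row in table.values()]
        means[age] = round(sum(values) / len(values), 4)
    return means


means = mean_by_age(GENETIC_G_VARIANCE, AGE_GROUPS)
for age in AGE_GROUPS:
    print(age, means[age])
```

On this reading the mean contribution rises with each successive age group, from about .07 at age 4 to about .31 for the adult samples, consistent with the statement that genetic g accounts for more of the variance in adults than in childhood.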

Genetic molarity: Speculations

There are many uncertainties surrounding evidence of genetic contributions to relationships between cognitive abilities. Current phenotypic research on the structure of abilities, most comprehensively reviewed by Carroll (1993), provides evidence for a detailed taxonomy with g at the apex of a hierarchical structure with several different second-stratum abilities, each of which is related to several narrow factors. To what extent is the phenotypic taxonomy isomorphic with the genotypic taxonomy? Owing in part to the fact that the taxonomy of cognitive abilities is extensive and detailed, none of the available studies has explored the relationship in depth. Nevertheless it is a plausible working hypothesis that the taxonomic hierarchy of cognitive abilities is mirrored by (derived from?) an isomorphic structure of genetic influences. Such a hypothesis implies that there are genetic influences contributing to g, genetic influences contributing to second-stratum factors, and genetic influences contributing to still narrower factors.

What accounts for the increasing importance of molar genetic influences from childhood to the adult years? The declining influence of shared environmental influences on cognitive ability provides a partial explanation of this phenomenon. Early in life, shared environmental influences contribute to variance on individual measures and to the covariance between them. The decline of such influences provides a wider scope for genetic
influences. This does not, however, explain the evidence for genetic molarity. The decline of shared environmental influences could, in principle, be accompanied by an increased contribution of genetic influences related to lower strata factors that do not contribute to phenotypic g.

There is an account of genetic contributions to g that is derived from theoretical speculations about its development. "Bottom-up" accounts assume that g derives from heritable differences in basic information-processing abilities that are determined by heritable properties of the nervous system (see Jensen, 1998). There are a number of relatively simple laboratory-based measures of basic information-processing tasks that are related to psychometric intelligence (see Deary, 2000). Among the more widely studied tasks are choice reaction time and inspection time. Inspection time tasks present subjects with two stimuli that are easily distinguished – e.g., vertical lines differing in length – followed by a masking stimulus that occludes the original. Subjects are required to judge which of the two stimuli was longer. Thresholds for the minimal inspection time for accurate judgment prior to the onset of the mask are obtained. Meta-analysis of these studies indicates that inspection time is inversely related to IQ (see Grudnik & Kranzler, 2001).

Studies of the relationship between "elementary information processing" performance and IQ are sometimes construed as providing evidence for a "bottom-up" theory of the development of intelligence, in which these basic tasks are assumed to reflect heritable differences in information processing, which influence the development of complex cognitive abilities. There are, however, several uncertainties surrounding the interpretation of these relationships.
Some analyses of performance on these tasks indicate that their apparent simplicity is deceptive and that a full understanding of performance on them may involve an understanding of several complex cognitive processes. And there is some dispute about whether or not these relationships are mediated by a relationship between g and these tasks or, alternatively, by a relationship with narrower factors such as a speed factor or a perceptual organisation factor (see Burns & Nettelbeck, 2003).

Behavioural genetic analyses of the relationship between elementary information-processing measures and psychometric indices of intelligence have consistently indicated that there are substantial genetic correlations between these measures (see Posthuma, de Geus, & Boomsma, 2003, for a comprehensive review of these studies). Among the tasks that have been studied in this way are several that are based on reaction times – including decision time in a choice reaction time paradigm, performance on Sternberg memory scanning tasks, and Posner letter matching tasks (see Luciano, Wright, Smith, Geffen, Geffen, & Martin, 2003b; Neubauer, Spinath, Rieman, Angleitner, & Borkenau, 2000; Rijsdijk, Vernon, & Boomsma, 1998). These studies report genetic correlations between speeded measures and psychometric indices that range from −.40 to −.70. A substantial portion of the phenotypic covariance is mediated by common genes. Studies
that use aggregate indices of several reaction time tasks have reported higher genetic correlations. For example, Baker, Vernon and Ho (1991) reported a genetic correlation of −.92 between an aggregated index of reaction time and psychometric g.

There is a genetic correlation between certain elementary information-processing tasks and psychometric g. And there is a genetic correlation between psychometric g and academic achievement. This suggests that there are common genetic influences that contribute to relationships between elementary information processing and academic achievement. Luo, Thompson, and Detterman (2003) administered a battery of six elementary information-processing tasks to a sample of 6- to 12-year-old twins. They also obtained measures of language, mathematics and reading achievement from the Metropolitan Achievement Test. They were able to derive general factors from their battery of elementary information-processing measures and their achievement measures. The phenotypic correlation between these general factors was .54. Behavioural genetic analyses indicated that genetic influences and shared environmental influences contributed to variance on each of these general factors and to the covariance between them. Common genetic influences accounted for more of the covariance between the g factors than shared environmental influences. Luo, Thompson, and Detterman interpret their results by assuming that speed of cognitive processing is a common genetic causal factor for both general intelligence and academic achievement.

If genetic influences on elementary cognitive processes and intelligence derive from genetic influences on structural and functional properties of the nervous system, it should be possible to obtain evidence of common influences on biological processes, elementary information processing and intelligence. There are relatively few available studies that provide direct evidence of such relationships.
One such study was reported by Rijsdijk and Boomsma (1997). They obtained a phenotypic correlation of .15 between a measure of speed of peripheral nerve conduction and IQ in a sample of 18-year-old twins. Common genetic influences accounted for the totality of the phenotypic correlation. There are of course a plethora of biologically based measures that are potential correlates of g. For example, Beauchamp (2005) obtained correlations in excess of .5 between IQ and EEG measures of P-300 amplitude to correctly judged tones in an auditory oddball task, in which subjects were required to judge whether a backwardly masked tone was one that appeared on 85% of the trials or was the "oddball" tone that appeared on 15% of the trials. van Beijsterveldt, van Baal, Molenaar, Boomsma, and de Geus (2001) obtained evidence indicating that this P-300 amplitude was heritable. Owing to the substantial relationship between this P-300 amplitude and IQ, a behavioural genetic analysis of the relationship between them might be informative. At present there are few studies dealing with genetic relationships among biological correlates of intelligence, or of the relationships between these correlates and IQ.
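
The statements in this section that common genes mediate part or all of a phenotypic correlation can be illustrated with the standard bivariate decomposition used in behavioural genetics, in which the genetic contribution to a phenotypic correlation is the product of the square roots of the two heritabilities and the genetic correlation. The sketch below is illustrative only: the function name and all numerical values are assumptions chosen for the example, and none of them is taken from the studies cited above.

```python
import math


def genetic_share_of_correlation(h2_x, h2_y, r_g, r_p):
    """Return (genetic contribution to r_p, proportion of r_p it represents).

    Uses the standard bivariate decomposition: the part of the phenotypic
    correlation r_p attributable to shared genes is
    sqrt(h2_x) * r_g * sqrt(h2_y).
    """
    if not (0.0 <= h2_x <= 1.0 and 0.0 <= h2_y <= 1.0):
        raise ValueError("heritabilities must lie in [0, 1]")
    if r_p == 0:
        raise ValueError("phenotypic correlation must be non-zero")
    contribution = math.sqrt(h2_x) * r_g * math.sqrt(h2_y)
    return contribution, contribution / r_p


# Hypothetical inputs (NOT reported values): heritability of a speed
# measure, heritability of IQ, their genetic correlation, and their
# phenotypic correlation.
contribution, share = genetic_share_of_correlation(0.49, 0.64, -0.50, -0.35)
print(round(contribution, 3), round(share, 3))
```

With these assumed inputs most, but not all, of the phenotypic correlation is carried by shared genes. A share of 1.0 would correspond to the case in which common genetic influences account for the totality of the phenotypic correlation.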

Conclusion: Modularity, bioecology, and g

Modularity and g

Cosmides and Tooby (2002) are proponents of a theory of intelligence based on the assumption that there are dedicated modules, shaped by evolution, that enable individuals to solve various cognitive problems. They do, however, assume that there is a need for a construct that they call improvisational intelligence that is, in some ways, similar to g, in that it involves inferences that are not domain specific. Improvisational intelligence is, however, closely linked to the modular dedicated intelligences that are designed to solve specific problems. Cosmides and Tooby assert that "having a brain that is well endowed with intelligence₁ is a precondition for intelligence₂, improvisational intelligence" (Cosmides & Tooby, 2002, p. 171). They write, "improvisational intelligence might have been achieved . . . through bundling an increased number of specialised intelligences together and . . . embedding them in . . . an elaborate set of computational adaptations for regulating the interaction of transient and contingent information sets within a multi-modular mind" (Cosmides & Tooby, 2002, p. 147).

Improvisational intelligence allows humans to develop novel solutions by considering information that is transiently and contingently valid. It is designed to represent the unique features of particular combinations of evolutionarily recurrent categories and requires mechanisms that translate information from dedicated intelligences into common formats. Modular adaptations are invariably triggered by specific external stimuli; improvisational intelligence, by contrast, permits the use of knowledge derived from domain specific inference systems in the absence of a triggering stimulus. Hence, humans can even reason about the consequences of what is not true or what is not present. Modular theorists and psychometric theorists both assume that it is necessary to postulate some form of general intelligence.
There are both superficial similarities and fundamental distinctions between the g construct and improvisational intelligence as construed by Cosmides and Tooby. These are:

1 Both of these constructs derive from relationships between narrow components of intellect and a general component. Although g may be statistically separated from narrow components of intellect, any phenotypic measure of intelligence will contain non-g variance. There are no pure measures of g.

2 Spearman (1927), the originator of the g construct, famously defined g in terms of the ability to educe relationships. He gave as examples of this ability the awareness that beer tastes something like weak quinine, and the ability to imagine a musical note that is a fifth higher than a note that is heard. This way of thinking about the cognitive processes that are central to g is similar to the definition of improvisational intelligence provided by Cosmides and Tooby.

3 Improvisational intelligence is a species general construct whereas g is an individual difference construct.

4 The most profound difference between improvisational intelligence and g, as construed in this chapter, derives from the assumed relationship between specific components of intelligence and general components of intelligence. It is impossible to exaggerate the extent to which improvisational intelligence is construed as a construct that is intertwined with specific modular components of the mind. g, by contrast, is genetically covariant with factors of intellect that are not modular, such as verbal and spatial ability. Still more problematic for a modular theorist is research indicating that there are genetic linkages between g and elementary information-processing tasks. Choice reaction times and inspection times may be partially dependent on abilities that have adaptive value, related to the ability to respond rapidly to stimuli or to discriminate stimuli under difficult conditions. These abilities are not, however, the kinds of domain specific knowledge structures discussed by modular theorists. Elementary information-processing tasks are relatively devoid of specific content and have little or nothing to do with the development of specialised knowledge structures – they are context-independent components of intellect.

In contrast, Fodor (1983) is a modular theorist who also assumes that a modular mind requires a more general construct of intelligence. He states, "there must be some mechanisms which cross the boundaries that input systems establish . . . I assume that there must be relatively nondenominational (i.e., domain inspecific) psychological systems which operate, inter alia, to exploit the information that input systems provide" (Fodor, 1983, p. 103, emphasis in original).

Bioecology and g

Bronfenbrenner and Ceci (1994; Ceci, Rosenblum, de Bruyn, & Lee, 1997) developed a bioecological theory of the development of intelligence that assumes that the actualisation of a child's genetic intellectual ability depends on proximal processes defined as "reciprocal interactions between the developing child and other persons, objects, and symbols" (Ceci et al., 1997, p. 311). Ceci et al. use parental monitoring of homework and a child's after-school activities as an example of the kind of proximal process that is described as the engine of intellectual development. The efficiency of proximal processes is dependent on distal environmental resources. For example, a parent may choose to monitor a student's algebra homework. If he or she does not understand algebra, the assistance that may be offered to
the student is limited. Over and above an emphasis on environmental contexts that lead to the development of intellect, Bronfenbrenner and Ceci emphasise the importance of context-specific local knowledge that influences the ability to reason in a particular domain. (For an analysis of one such example see my discussion of a paper by Ceci and Liker dealing with expertise in handicapping horse races; Brody, 1992; Ceci & Liker, 1986; see also Roberts, Chapter 14, this volume).

Two types of studies reviewed in this chapter contradict the central tenets of bioecological theory. Behavioural genetic studies of the intelligence of adopted children indicate that proximal and distal environmental encounters, as construed by Bronfenbrenner and Ceci, are not the major influences on the development of individual differences in intelligence. Consider the Plomin et al. (1997) findings. Adoptive parents surely differed in the characteristic proximal processes they arranged for their adopted children. Yet these putative differences were unrelated to the development of individual differences in intelligence. Biological parents of adopted children had no role in arranging proximal processes for their children. Nor were they implicated in creating variations in the children's distal environmental supports. Yet variations in cognitive ability among biological parents were related to their adopted-away children's intellectual development.

The strong relationships between performance on some context-independent elementary information-processing tasks and g, as well as evidence for genetic linkages between g and these tasks, provide evidence for a construct of general intelligence that is relatively context-independent. Consider for example the results obtained by Beauchamp (2005) in his study of auditory oddball tasks, which yielded correlations in excess of .5 between P-300 amplitude measures for correct responses and intelligence.
He also obtained correlations in excess of .5 for a behavioural measure derived from these tasks – the probability of correct response. The behavioural index and the EEG-derived index were relatively uncorrelated. Since each of these indexes accounts for over 25% of the variance in IQ, a multiple correlation with two variables derived from the same task would have R² values close to .5, uncorrected for restrictions in range of talent in a university sample. If 50% of the variance on a test of intelligence is related to performance on a task requiring subjects to make judgments about backwardly masked tones, it is difficult to argue for a context-dependent model of intellect.

g again

Although there are many components of intelligence, virtually all the manifestations of individual differences in human intellectual performance are related to each other. The relationships that serve to triangulate g within a nomological network extending from biological processes to elementary information-processing abilities to complex academic achievements are
related to common genetic influences. The research reviewed in this chapter provides support for a molar intellectual construct that is neither modular nor particularly context-dependent.
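
The multiple-correlation argument made earlier about the two indices from the auditory oddball task follows from the textbook formula for the squared multiple correlation with two predictors, R² = (r₁² + r₂² − 2·r₁·r₂·r₁₂) / (1 − r₁₂²), where r₁ and r₂ are the correlations of each predictor with the criterion and r₁₂ is the correlation between the predictors. The sketch below is a minimal illustration: the function name is hypothetical, the value .5 is used as a lower bound for correlations described as "in excess of .5", and r₁₂ = 0 idealises predictors described as "relatively uncorrelated".

```python
def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12
    return numerator / (1.0 - r_12 ** 2)


# Idealised case: two uncorrelated predictors, each correlating .5 with IQ.
print(multiple_r_squared(0.5, 0.5, 0.0))

# Assumed, purely illustrative overlap between the predictors (r_12 = .3)
# to show how quickly the combined R squared falls.
print(round(multiple_r_squared(0.5, 0.5, 0.3), 3))
```

When the predictors are uncorrelated the formula reduces to the sum of the two squared correlations, which is why two indices that each account for about 25% of the variance combine to about 50%. Any overlap between the predictors lowers that figure.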

References

Baker, L. A., Treloar, S. A., Reynolds, C. A., Heath, A. C., & Martin, N. G. (1996). Genetics of educational attainment in Australian twins: Sex differences and secular changes. Behavior Genetics, 26, 89–102.
Baker, L. A., Vernon, P. A., & Ho, H. Z. (1991). The genetic correlation between intelligence and speed of information processing. Behavior Genetics, 21, 351–367.
Bartels, M., Rietveld, M. J. H., van Baal, G. C. M., & Boomsma, D. I. (2002). Heritability of academic achievement in 12-year-olds and the overlap with cognitive ability. Twin Research, 5, 544–553.
Beauchamp, C. M. (2005). Mental ability and event-related potentials in an auditory oddball task with backward masking: From description to explanation. Unpublished doctoral dissertation, University of Ottawa.
Braden, J. A. (1994). Deafness, deprivation and IQ. New York: Plenum.
Brody, N. (1992). Intelligence (2nd ed.). San Diego, CA: Academic Press.
Bronfenbrenner, U., & Ceci, S. J. (1994). Nature–nurture reconceptualized in developmental perspective: A bio-ecological perspective. Psychological Review, 101, 568–586.
Burns, N. R., & Nettelbeck, T. (2003). Inspection time in the structure of cognitive abilities: Where does IT fit? Intelligence, 31, 237–255.
Campbell, F. A. (in press). Early childhood intervention. In P. Kyllonen, R. Roberts, & L. Stankov (Eds.), Extending intelligence: Enhancements and new constructs. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Cardon, L. R., Fulker, D. W., DeFries, J. C., & Plomin, R. (1992). Multivariate genetic analyses of specific cognitive abilities in the Colorado Adoption Project at age 7. Intelligence, 16, 383–400.
Carroll, J. B. (1993). Human cognitive abilities. New York: Cambridge University Press.
Caspi, A., & Moffitt, T. E. (1991). Individual differences are accentuated during periods of social change. Journal of Personality and Social Psychology, 61, 157–168.
Casto, S. D., DeFries, J. C., & Fulker, D. W. (1995). Multivariate genetic analysis of Wechsler Intelligence Scale for Children-Revised (WISC-R) factors. Behavior Genetics, 25, 25–32.
Cattell, R. B. (1971). Abilities: Their structure, growth, and action. Boston: Houghton Mifflin.
Ceci, S. J., & Liker, J. K. (1986). A day at the races: A study of IQ, expertise, and cognitive complexity. Journal of Experimental Psychology: General, 115, 255–266.
Ceci, S. J., Rosenblum, T., de Bruyn, E., & Lee, D. Y. (1997). A bio-ecological model of intellectual development: Moving beyond h². In R. J. Sternberg & E. Grigorenko (Eds.), Intelligence, heredity, and environment (pp. 303–322). Cambridge: Cambridge University Press.
Colombo, J. (1993). Infant cognition: Predicting later intellectual functioning. Newbury Park, CA: Sage.
Cosmides, L., & Tooby, J. (2002). Unraveling the enigma of human intelligence. In R. J. Sternberg & J. C. Kaufman (Eds.), The evolution of intelligence (pp. 145–198). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Deary, I. J. (2000). Looking down on human intelligence: From psychometrics to the brain. New York: Oxford University Press.
Deary, I. J. (2004, December). Spearman (1904) revisited: General intelligence and educational achievement. Paper presented at the Fifth Annual Conference of the International Society for Intelligence Research, New Orleans, LA.
Deary, I. J., Whalley, L. J., Lemmon, H., Crawford, J. R., & Starr, J. M. (2000). The stability of mental ability from childhood to old age: Follow-up of the 1932 Scottish Mental Survey. Intelligence, 28, 49–55.
DeGroot, A. D. (1951). War and the intelligence of youth. Journal of Abnormal and Social Psychology, 46, 596–597.
Dolan, C. V., Colom, R., Abad, F. J., Wicherts, J., Hessan, D. J., & van de Sluis, S. (2006). Multi-group covariance and mean structure modeling of the relationship between WAIS III common factors and sex and educational attainment in Spain. Intelligence, 34, 193–210.
Finkel, D., Pedersen, N. L., Plomin, R., & McClearn, G. E. (1998). Longitudinal and cross-sectional twin data on cognitive abilities in adulthood: The Swedish Adoption/Twin Study of Aging. Developmental Psychology, 34, 1400–1413.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Grigorenko, E. L. (2002). Other than g: The value of persistence. In R. J. Sternberg & E. L. Grigorenko (Eds.), The general factor of intelligence: How general is it? (pp. 299–327). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Grudnik, J. L., & Kranzler, J. H. (2001). Meta-analysis of the relationship between intelligence and inspection time. Intelligence, 29, 523–535.
Jensen, A. R. (1998). The g factor. Westport, CT: Praeger.
Kavšek, M. (2004). Predicting later IQ from infant visual habituation and dishabituation: A meta-analysis. Journal of Applied Developmental Psychology, 25, 369–393.
Luciano, M., Wright, M. J., Geffen, G. M., Geffen, L. B., Smith, G. A., Evans, D. M., et al. (2003a). A genetic two-factor model of the covariation among a subset of Multidimensional Aptitude Battery and Wechsler Adult Intelligence Scale-Revised subtests. Intelligence, 31, 589–605.
Luciano, M., Wright, M. J., Smith, G. A., Geffen, G. M., Geffen, L. B., & Martin, N. G. (2003b). Genetic covariance between processing speed and IQ. In R. Plomin, J. C. DeFries, I. W. Craig, & P. McGuffin (Eds.), Behavioral genetics in the postgenomic era (pp. 163–181). Washington, DC: American Psychological Association.
Luo, D., Petrill, S. A., & Thompson, L. A. (1994). An exploration of genetic g: Hierarchical factor analysis of cognitive data from the Western Reserve Twin Project. Intelligence, 18, 335–347.
Luo, D., Thompson, L. A., & Detterman, D. K. (2003). Phenotypic and behavioral genetic covariation between elemental cognitive components and scholastic measures. Behavior Genetics, 33, 221–246.
Moffitt, T. E., Caspi, A., Harkness, A. R., & Silva, P. A. (1993). The natural history of change in intellectual performance: Who changes? How much? Is it meaningful? Journal of Child Psychology and Psychiatry, 14, 455–506.
Neubauer, A., Spinath, F. M., Rieman, R., Angleitner, A., & Borkenau, P. (2000). Genetic and environmental influences on two measures of speed of information processing and their relationship to psychometric intelligence: Evidence from the German Observational Study of Adult Twins. Intelligence, 28, 267–289.
Petrill, S. A. (2002). The case for general intelligence: A behavioral genetic perspective. In R. Sternberg & E. Grigorenko (Eds.), The general factor of intelligence: How general is it? (pp. 281–298). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Petrill, S. A., Lipton, P. A., Hewitt, J. K., Plomin, R., Cherney, S. C., Corley, R., et al. (2004). Genetic and environmental contributions to general cognitive ability through the first 16 years of life. Developmental Psychology, 40, 805–812.
Petrill, S. A., Plomin, R., Berg, S., Johansson, B., Pedersen, N. L., Ahern, F., et al. (1998). The genetic and environmental relationship between general and specific cognitive abilities in twins age 80 and older. Psychological Science, 9, 183–189.
Petrill, S. A., & Wilkerson, B. (2000). Intelligence and achievement: A behavioral genetic perspective. Educational Psychology Review, 12, 185–199.
Pinneau, S. R. (1961). Changes in intelligence quotient: Infancy to maturity. Boston: Houghton Mifflin.
Plomin, R., Fulker, D. W., Corley, R., & DeFries, J. C. (1997). Nature, nurture, and cognitive development from 1 to 16 years: A parent–offspring adoption study. Psychological Science, 8, 442–447.
Posthuma, D., de Geus, E. J. C., & Boomsma, D. I. (2003). Genetic contributions to anatomical, behavioral, and neurophysiological indices of cognition. In R. Plomin, J. C. DeFries, I. W. Craig, & P. McGuffin (Eds.), Behavioral genetics in the postgenomic era (pp. 141–161). Washington, DC: American Psychological Association.
Price, T. S., Eley, T. C., Dale, P. S., Saudino, K., & Plomin, R. (2000). Genetic and environmental covariation between verbal and nonverbal cognitive development in infancy. Child Development, 71, 948–959.
Rice, T., Carey, G., Fulker, D. W., & DeFries, J. C. (1989). Multivariate path analysis of specific cognitive abilities in the Colorado Adoption Project: Conditional path model of assortative mating. Behavior Genetics, 19, 195–207.
Rietveld, M. J. H., van Baal, G. C. M., Dolan, C. V., & Boomsma, D. I. (2000). Genetic factor analysis of specific cognitive abilities in 5-year old Dutch children. Behavior Genetics, 30, 29–40.
Rijsdijk, F. W., & Boomsma, D. I. (1997). Genetic mediation of the correlation between peripheral nerve conduction velocity and IQ. Behavior Genetics, 27, 87–98.
Rijsdijk, F. W., Vernon, P. A., & Boomsma, D. I. (1998). The genetic basis of the relationship between speed of information processing and IQ. Behavioral Brain Research, 95, 77–84.
Rowe, D. C. (2003). Assessing genotype–environment interactions and correlations in the postgenomic era. In R. Plomin, J. C. DeFries, I. W. Craig, & P. McGuffin (Eds.), Behavioral genetics in the postgenomic era (pp. 71–86). Washington, DC: American Psychological Association.
Rowe, D. C., Vesterdal, W. J., & Rodgers, J. L. (1998). Herrnstein's syllogism: Genetic and shared environmental influences on IQ, education, and income. Intelligence, 26, 405–423.
Spearman, C. (1927). The abilities of man. London: Macmillan.
Spinath, F. M., & Plomin, R. (2003, December). The amplification of genetic influences on g from early childhood to the early school years. Paper presented at the International Society for the Study of Intelligence Meetings, Irvine, CA.
Tambs, K., Sundet, J. M., & Magnus, P. (1986). Genetic and environmental contributions to the covariation between the Wechsler Adult Intelligence Scale (WAIS) subtests: A study of twins. Intelligence, 16, 475–487.
Thompson, L. A., Detterman, D. K., & Plomin, R. (1991). Associations between scholastic abilities and cognitive achievement: Genetic overlap but environmental differences. Psychological Science, 2, 158–165.
van Beijsterveldt, C. E. M., van Baal, G. C. M., Molenaar, P. C. M., Boomsma, D. I., & de Geus, E. J. C. (2001). Stability of genetic and environmental influences on P300 amplitude: A longitudinal study in adolescent twins. Behavior Genetics, 31, 533–543.
Wadsworth, S. J. (1994). School achievement. In J. C. DeFries, R. Plomin, & D. W. Fulker (Eds.), Nature and nurture during middle childhood (pp. 86–101). Cambridge, MA: Blackwell.
Wainwright, M. A., Wright, M. J., Geffen, G. M., Luciano, M., & Martin, N. G. (2005). The genetic basis of academic achievement on the Queensland core skills test and its shared genetic variance with IQ. Behavior Genetics, 35, 133–145.
Wilson, R. S. (1983). The Louisville Twin Study: Developmental synchronies in behavior. Child Development, 54, 298–316.

19 Cognitive and neurobiological mechanisms of the Law of General Intelligence

Christopher F. Chabris

Why do scores on most cognitive ability tests correlate positively? The fact that people who score highly on one test tend to score highly on others – that some people are more intelligent than others – is so intuitively obvious that most psychologists, at least since it was first discovered by Spearman (1904), take it for granted. Even the harshest critics of IQ profess no surprise at the positive correlations among tests; Stephen J. Gould (1981, p. 315) wrote, "The fact of pervasive intercorrelation between mental tests must be among the most unsurprising major discoveries in the history of science."

Psychologists are generally not interested in talking about, much less thinking about, let alone investigating, individual differences in cognitive ability. And there is nothing wrong with this; research topics and approaches fade in and out of popularity. But if we could step out of our normal modes of thinking, if we could free our minds of the "debauchery of learning" that William James famously noted (1890, as quoted by Cosmides & Tooby, 1994), we might wonder why there should be any correlation among cognitive tests, let alone the strong, consistent, positive correlations found in hundreds of studies over the past century (Carroll, 1993).

Evolutionary psychology, with its emphasis on specialised adaptations and instincts forged in a long-ago stone age, has little to say about current variation in intelligence. Few theories in cognitive psychology specify how their component processes or representations might vary from person to person. Neuroscience likewise gives short shrift to individual differences, though it would appear to contain within its ranks fewer outright opponents of the very concept.
Notwithstanding the plea of Cronbach (1957) for unification of the "two disciplines" of psychology, instances of collaboration between students of human commonalities and students of human differences, or instances of individual researchers who effectively operate within both of these camps, are still the exception, not the rule. Those who employ the Analysis of Variance rarely actually analyse variance (Plomin & Kosslyn, 2001).

The ubiquitous positive correlations among cognitive tests are a challenge for theories of how the mind works that focus on independent, domain specific, encapsulated processes, or "modules" (Fodor, 1983) as the
elementary units of study. This is especially so if one believes that these modules must have evolved to solve critical problems rapidly and efficiently (Cosmides & Tooby, 1994): Why should there be any correlated variation in the efficacy of all these carefully designed processes? Paradoxically, the notion of intelligence as an individual difference, a fact that most humans take for granted, is outside the predictive power of one of the leading theoretical frameworks in modern cognitive psychology.

Two general approaches have been taken toward reconciling modularity with general intelligence. One is to argue that the general factor revealed by the correlations among cognitive tests – the g factor – is a psychometrically unitary construct, but that it is caused by multiple biological factors (Jensen, 1998b). In a letter to Commentary, Jensen elaborated on this hypothesis:

the design features of the brain – its neural structures and functions – that are necessary for the many distinct processes that enter into information-processing, or intelligence (such as attention, perception, discrimination, generalization, learning, memory, language, thinking, problem-solving, and the like) are essentially the same for all biologically normal Homo sapiens, i.e., those free of chromosomal and major gene anomalies or brain damage. Correlated individual differences in the functioning of these various information processes are a result of other quantitative biochemical and physiological conditions in the brain, most of them highly heritable, that are separate from the brain's essential design features, or "hardwiring," but are, as it were, superimposed on all of them in common, and affect the overall speed and efficiency of their functioning.
The domain general "speed and efficiency" of neural functioning can be assessed in several ways: by measuring speed of response in simple cognitive tasks, by measuring speed of perceptual intake via a simple psychophysical task, by measuring nerve conduction velocity, and the like. Global neural resources can be quantified by measures such as total brain volume, total grey matter, and total white matter. Indeed, numerous studies support the correlation between these sorts of measures and performance on intelligence tests, especially the general factor.

The alternative approach is that certain neural systems, perhaps ones that are only semimodular, are especially related to g because they are responsible for controlling the operation of other processes, or for managing limited central resources that other processes require in order to function optimally. Variability in the efficacy of these special systems could account for g (e.g., Kyllonen & Christal, 1990). As others have noted, this view encompasses two distinct ideas: that of a limited pool of central memory capacity, perhaps also constraining the action of domain specific processes, versus that of a limited ability of a central process to control the operation
of domain specific processes (e.g., Thomas & Karmiloff-Smith, 2003). Region-specific neural resources corresponding to these processes could be indexed by the volumes of specific structures, or the degree to which those structures are activated in response to specific cognitive challenges.

Each of these approaches (general neural efficiency versus central resources or processes) has the potential to reconcile the modular view of the mind – an orchestra of multiple independent processes designed for specific purposes and playing specific parts in the overall performance of behaviour – with the fact of general intelligence. In this chapter I will selectively review the evidence in favour of each of these hypotheses, focusing on recent studies of brain anatomy and function. This evidence indicates that these hypotheses are not in contradiction, and that both are likely to be true. However, I will show that we have not collected the data needed to determine the relative importance of all the neurobiological factors potentially underlying g, nor have we applied the best tools for modelling the human cognitive architecture to the problem of understanding the causes of g. I will outline a research programme for the next stage of research on the mechanisms of human intelligence, and conclude with a reminder of what intelligence is not.

The Law of General Intelligence

As Brand (1996) eloquently points out, a folk-psychology understanding of cognition need not predict a general intelligence:

"a normal expectation is that time spent in one activity is time that is lost for another: an evening spent doing crossword puzzles or metaphysics is an evening lost to practising jigsaws or swatting up metallurgy. Thus, in so far as 'practice makes perfect' and time is finite, the pervasive intercorrelation between mental abilities should actually tend to be negative; and a prediction of negative correlation should particularly be made by anyone who, like Gould, is inclined to treat measured IQ-type abilities as collections of attainments."

Contrary to this, in the preface to his masterful volume The g factor, Jensen (1998b, p. xii) remarks almost offhandedly:

"I have come to view g as one of the most central phenomena in all of behavioral science, with broad explanatory power at least as important for understanding human affairs as E. L. Thorndike's Law of Effect (or Skinner's reinforcement principle). Moreover, it became apparent that the g construct extends well beyond its psychometric origin and definition. The g factor is actually a biologically based variable, which, like any other biological functions in the human species, is necessarily a product of the evolutionary process. The human condition in all its aspects cannot be adequately described or understood in a scientific sense without taking into account the powerful explanatory role of the g factor."

In essence, Jensen proposed that the concept of g was sufficiently established and important to merit the status of a behavioural or biological law. Needless to say, this suggestion has not been taken up enthusiastically, or even halfheartedly, by psychologists. But even though only 8 years have passed, the evidence in favour of a law of general intelligence has increased markedly. This law would state that measurements of cognitive ability tend to correlate positively across individuals, with a corollary that the first principal component or general factor extracted from any such correlation matrix – assuming a diverse battery of mental tests and a diverse sample of subjects – will account for a substantial fraction of the variance. The proposal that general intelligence is a behavioural law contrasts sharply with the suggestion by Ardila (1999, p. 117) that "the psychometric concept of general intelligence should be deleted from cognitive and neurological sciences." It is good to be reminded that, despite its seeming ubiquity, g is not an automatic, trivial consequence of other established theories and findings in cognitive science: In our world, cognitive abilities could be correlated or uncorrelated, and whether they in fact are correlated is still a point of controversy (Chabris, 1998b; Korb, 1994). In this regard the Law of General Intelligence lacks the widespread acceptance of Weber's Law, the Matching Law, or any of the other major behavioural laws (reviewed by Teigen, 2002). In this section I will review evidence in favour of the Law of General Intelligence and argue that it is likely to apply not just to human beings, but to all species with sufficient cognitive complexity to allow for individual differences in problem-solving behaviour.

General intelligence in humans

The Law of General Intelligence is illustrated in a straightforward way by Table 19.1a, a correlation matrix from a Scottish population sample performing the Wechsler Adult Intelligence Scale – Revised (WAIS-R), a prominent intelligence test. In this case all 55 correlations among the subtests are positive, although they range from .14 to .72. This is the "positive manifold" typically observed in IQ tests. Table 19.1b demonstrates that the law is in effect even when evidence for it is not being sought. I collected these data (in collaboration with several colleagues and research assistants) as part of a study of individual differences in decision-making in a sample of students and adults in the Boston area. Correlations are shown among seven cognitive tasks, none of which is traditionally included in IQ tests, ranging from -.02 to .48, with 19 out of 21 positive. Note that although the samples are different (population sample in Scotland versus convenience sample in the USA), the tests are different (psychometrically

Table 19.1 Correlation matrices of scores, with loadings on the first principal component (1st PC), from two batteries of tests administered to human adults

(a)
                       V     S     I     C     PA    BD    A     PC    DSp   OA    1st PC
Vocabulary                                                                         .83
Similarities           .67                                                         .80
Information            .72   .59                                                   .80
Comprehension          .70   .58   .59                                             .75
Picture arrangement    .51   .53   .50   .42                                       .70
Block design           .45   .46   .45   .39   .43                                 .70
Arithmetic             .48   .43   .55   .45   .41   .44                           .68
Picture completion     .49   .52   .52   .46   .48   .45   .30                     .68
Digit span             .46   .40   .36   .36   .31   .32   .47   .23               .56
Object assembly        .32   .40   .32   .29   .36   .58   .33   .41   .14         .56
Digit symbol           .32   .33   .26   .30   .28   .36   .28   .26   .27   .25   .48

(b)
                                 Raven's   WM     Verbal   RT     Rotation   Coord   1st PC
Raven's matrices (12-item APM)                                                       .50
Working memory (3-back d')       .39                                                 .46
Verbal fluency                   .36       .48                                       .42
Response time                    .41       .28    .41                                .39
Mental rotation                  .41       .29    .15      .21                       .34
Coordinate spatial encoding      .32       .30    .07      -.02   .04                .25
Categorical spatial encoding     .21       .12    -.02     .13    .16        .21     .20

For all tasks, higher scores correspond to better performance. Correlations are uncorrected for attenuation in range. Those in boldface are significant at p < .05 or better. (a) Data from a population sample of 365 Scottish subjects completing all 11 subtests of the Wechsler Adult Intelligence Scale – Revised (principal component analysis calculated, and correlations redisplayed, from data presented in Deary, 2000, p. 7). The 1st PC accounts for 48% of the variance. (b) Data from 111 Boston-area subjects (54 males, 57 females; age range 18 to 60; approximately one-half students) completing seven timed cognitive tasks as part of a larger study. All tasks except Raven's APM and verbal fluency were conducted by computer. The 1st PC accounts for 36% of the variance.
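The two properties that the Law of General Intelligence asserts, a positive manifold and a dominant first principal component, can be checked directly on a correlation matrix. The sketch below does this for the Table 19.1b values. It is an illustration only, not the procedure used to produce the table: the helper names (`full_matrix`, `first_eigenvalue`) are hypothetical, and plain power iteration is assumed here as a stand-in for whatever PCA routine was actually used.

```python
# Lower triangle of the Table 19.1b correlation matrix, one list per row.
TASKS = ["Raven's", "Working memory", "Verbal fluency", "Response time",
         "Mental rotation", "Coordinate encoding", "Categorical encoding"]
LOWER = [
    [],
    [.39],
    [.36, .48],
    [.41, .28, .41],
    [.41, .29, .15, .21],
    [.32, .30, .07, -.02, .04],
    [.21, .12, -.02, .13, .16, .21],
]


def full_matrix(lower):
    """Expand a lower triangle into a symmetric matrix with a unit diagonal."""
    n = len(lower)
    matrix = [[1.0] * n for _ in range(n)]
    for i, row in enumerate(lower):
        for j, r in enumerate(row):
            matrix[i][j] = matrix[j][i] = r
    return matrix


def first_eigenvalue(matrix, iterations=200):
    """Approximate the largest eigenvalue by power iteration (assumed method)."""
    n = len(matrix)
    v = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    rv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, rv))  # Rayleigh quotient of the final vector


# Positive manifold: how many of the off-diagonal correlations are above zero?
positives = sum(r > 0 for row in LOWER for r in row)
total = sum(len(row) for row in LOWER)

# For a correlation matrix, the share of variance carried by the first
# component is its eigenvalue divided by the number of variables.
share = first_eigenvalue(full_matrix(LOWER)) / len(TASKS)

print(f"{positives} of {total} correlations are positive")
print(f"first principal component share of variance: {share:.2f}")
```

Under these assumptions the count reproduces the 19 out of 21 positive correlations, and the variance share comes out close to the 36% reported in the table note.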

validated IQ subtests versus mainly cognitive tests), and the context is different (mainly untimed, verbal responses versus mainly timed, computerised administration), the overall patterns are strikingly similar. When the correlation matrices are subjected to principal components analysis,[1] the first component accounts for 48% of the total variance in Table 19.1a and 36% in Table 19.1b. The differences can partly be explained by the superior reliability of the WAIS-R as a test battery, and range restriction in the smaller, more idiosyncratic Boston sample.

Some correlational studies of cognitive tasks in humans have failed to find a general factor, but on closer examination they turn out to be the exceptions that prove the rule. For example, Kosslyn, Brunn, Cave, and Wallach (1984) administered a set of mental imagery tasks to 50 subjects, mostly not students, and reported correlations among the 13 derived measures ranging from r = -.44 to .79, with a mean of .28. According to a reanalysis of their published matrix, the first three principal components account for 22%, 18%, and 13% of the total variance. However, most of the measures derived from the various tasks were not absolute performance levels, such as total accuracy, but rather theoretically relevant measures such as slopes of mental rotation functions, or introspective judgments of the blurriness or vividness of mental images. (Indeed, Kosslyn et al. report analyses confirming a good fit between (1) the observed correlations and (2) predictions derived from a detailed computational model of visual mental imagery processing.) Slopes or change scores inherently tend to be independent of individual differences in total speed or accuracy, and introspective judgments are not necessarily measurements of performance that would be regarded as demonstrating "intelligence" in the same way as speed or accuracy of responding.
If one considers only the tasks for which objective total performance measures are given,[2] all 10 correlations are positive, and the first principal component accounts for 45% of the variance, results in line with the Law of General Intelligence. Adding the performance measures from the four non-imagery tasks also used by Kosslyn et al. results in 33 out of 36 positive correlations, and 35% of the variance accounted for by the first principal component, comparable to the pattern in Table 19.1b – again, from a dataset using entirely different tasks and subjects, collected for a purpose other than discovering a g factor.

[1] There are many methods of factor analysis that can be used to discover the architecture of a correlation matrix of mental tests. Here I use only unrotated principal components analysis (PCA) because it is the simplest, and involves no decision-making on the part of the analyst. The general factor extracted as the first principal component is essentially the same no matter what analytic method is used (Ree & Earles, 1991).
[2] These measures are: Line Drawing Memory, Line Drawing Time, Reorganisation Probe, Nonreorganisation Probe, and Form Board (see Kosslyn et al., 1984, for details).

Extensive cross-cultural studies (reviewed by contributors to Irvine & Berry, 1988) have established that the g factor is observed whenever a battery of diverse, complex cognitive tasks is administered to a human sample. For example, Reuning (1988) summarised 8 years of studies administering custom-designed mental tests to 512 Kalahari Bushmen and concluded that "the patterns of intercorrelations suggest that a fairly strong general intellectual factor is operative in all sets of data" (p. 479) with positive correlations among cognitive and perceptual tests ranging from .20 to .78 – almost the same range as shown in Table 19.1a. Standardisation samples for IQ tests confirm that the factor structure is also consistent across sexes and ethnic groups within the United States and other countries (e.g., Taylor & Ziegler, 1987), and that cognitive ability level can be measured with high reliability (Mackintosh, 1998, pp. 55–62). And intelligence has been repeatedly shown to be the best psychometric predictor of job success (Ree & Earles, 1992) and many other important life outcomes, both positive and negative (Gottfredson, 1997; Herrnstein & Murray, 1994), as diverse as the propensity to invest in financial markets (Benjamin, Brown, & Shapiro, 2006; see also Frederick, 2006 on intelligence and normative decision-making) and the risk of developing mental illness in response to traumatic stress (Macklin et al., 1998). Even death is a higher risk for individuals of lower intelligence (e.g., Deary, Whiteman, Starr, Whalley, & Fox, 2004). (See also Brody, Chapter 18, and Gottfredson, Chapter 17, this volume.) So far we have seen how the Law of General Intelligence describes a regularity of human behaviour. Note that this regularity applies not to patterns of behaviour within an individual, in contrast to Weber's Law or the Power Law of Practice, but to patterns across individuals.
In this way it resembles the "three laws of behaviour genetics" proposed by Turkheimer (2000), which describe regularities in how trait variance appears to be caused by genetic, shared environmental, and nonshared environmental variance. It is a quantitative law in that it describes a systematic relation among quantities (test scores) but, like Turkheimer's laws, and qualitative statements such as the Gestalt laws of perceptual organisation, it does not make specific numerical predictions of empirical results.

General intelligence in other species

Thousands of studies have been conducted to demonstrate the existence of g in human samples, and to measure the reliability and validity of intelligence measures. Arguments that g is a mere statistical artifact, or is necessarily bound to possibly racist Western notions of measurement or ability (e.g., Gould, 1981) have been refuted (Bartholomew, 2004; Chabris, 1998b; Davis, 1983; Korb, 1994), but there has been relatively little work on g in nonhuman species (Locurto, 1997; Plomin, 2001). Note that the question of animal g is not the same as whether different species are on average more or less "intelligent" (MacPhail, 1987), or whether different breeds within a species seem to differ in "intelligence" (e.g., Coren, 1994). The case for a Law of General Intelligence would be strengthened considerably by finding a consistent g-factor in animals, such that the individuals within a species who perform well on one "cognitive" task are likely to perform well on other tasks.

Table 19.2a shows data from a study of 84 outbred laboratory mice (Galsworthy et al., 2005) in which each animal performed six tests of learning and problem-solving that are typically used in cognitive studies with mice. Paralleling Table 19.1, all 15 correlations are positive (r = .05 to .52) and the first principal component accounts for 35% of the variance. Indeed, the three datasets shown so far differ in so many respects on the surface that their fundamental similarity should be striking. Despite the idiosyncrasy of species, nationalities, sampling methods, and tasks, the correlation matrices share the properties of having almost all positive correlations (55 out of 55, 19 out of 21, and 15 out of 15) and of the first principal component accounting for between one third and one half of the total variance.

Table 19.3 summarises published studies of g in animals, conducted to the end of 2005. Several noteworthy points emerge from an examination of these results. First, in all but one of the studies the average intertask correlation is positive, and in the majority of studies (11 out of 19) the first principal component accounts for more than twice as much variance as the second principal component – the Law of General Intelligence appears to hold across species. Second, there are more studies on mice than on all other species combined, and all but one of these have been conducted in the past 10 years. The earliest study was conducted by Bagg (1920), but he did not present his data in a correlation matrix format. Galsworthy et al. (2005) report a reanalysis of Bagg's data in this format, as well as two studies of their own, and observe that the first principal component (1st PC) accounted for 61%, 41%, and 23% of the variance in the three studies. The most recent common ancestor of mice and humans lived 80 million years ago, and many cognitive capabilities – or modules – found in the two species must have evolved separately, during that time, to face different environments and challenges. Nonetheless, as with humans, mice display correlated variation in mental ability levels across domains: The grand average intertask r across the studies of mice is .28 (weighted by sample size, r = .24). Third, as shown in Table 19.2b and c, the same test battery has been applied to samples of rhesus monkeys – a fairly close primate relative – and human children. In each case the average task correlations were positive; however, this pattern was much stronger in the human sample (perhaps due in part to these being low-birthweight children at risk for low intelligence). The human sample also completed a standard, age-appropriate IQ test, which correlated strongly with each of the tasks in the battery that loaded on the g factor. Taken together, these results link the g factors found in animals and humans. Finally, it should be noted that most of the studies

Table 19.2 Correlation matrices of scores, with loadings on the first principal component (1st PC), from cognitive test batteries administered to mice, rhesus monkeys, and human children

(a)
                                       BP     HWM(l)   PP     HWM(e)   MWML   1st PC
Burrowing puzzle (latency)                                                    .66
Hebb-Williams maze (latency)           .21                                    .65
Plug puzzle (latency)                  .52    .30                             .62
Hebb-Williams maze (errors)            .12    .32      .13                    .60
Morris water maze learning (latency)   .25    .39      .05    .18             .56
T-maze (errors)                        .10    .22      .06    .17      .14    .40

(b)
                                       PR     DMTS   TRD    CPR    1st PC
Progressive ratio                                                  .75
Delayed match to sample                .31                         .71
Temporal response differentiation      .40    .21                  .64
Conditioned position responding        .18    .36    .14           .57
Incremental repeated acquisition       -.25   -.04   .02    .21    .56

(c)
                                       IQ     PR     DMTS   TRD    CPR    1st PC
Progressive ratio                      .08                                .13
Delayed match to sample                .41    .01                         .76
Temporal response differentiation      .40    .10    .23                  .53
Conditioned position responding        .48    .02    .54    .32           .84
Incremental repeated acquisition       .46    .09    .37    .17    .48    .72

For all tests, higher scores correspond to better performance. Correlations in boldface are significant at p < .05 or better. (a) Data from 84 outbred heterogeneous stock mice (42 males, 42 females) completing a battery of six cognitive tests (redisplayed from Study 1 of Galsworthy et al., 2005). The 1st PC accounts for 35% of the variance. (b) Data from 69 male rhesus monkeys (aged 2–3 years) completing, over the course of 1 year, a battery of five operant tasks designed for toxicology studies with monkeys (redisplayed from Paule, 1990). Correlations are uncorrected for attenuation in range (for some task pairs, only 44 or 64 subjects did both tasks). The 1st PC accounts for 35% of the variance. (c) Data from 85 preterm, low-birthweight human children tested on an analog of the monkey battery at age 6.5 years (additional data provided by authors from Paule et al., 1999). An additional column shows the correlation of each test with full-scale IQ obtained at age 5 years with the Wechsler Preschool and Primary Scale of Intelligence (WPPSI). Correlations are uncorrected for attenuation in range. The 1st PC accounts for 42% of the variance.
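The "variance accounted for" figures in these table notes can be related to the tabled loadings. If the 1st PC values are loadings in the usual sense (each test's correlation with the component), then for a PCA of a correlation matrix the component's share of variance is the mean of the squared loadings. The sketch below applies that identity to three of the batteries. The function name `variance_share` is hypothetical, and the assumption that the tabled values are loadings in this sense is an interpretation, not something stated in the notes.

```python
# 1st PC columns as read from Tables 19.1a, 19.2a, and 19.2c.
LOADINGS = {
    "WAIS-R adults (Table 19.1a)": [.83, .80, .80, .75, .70, .70, .68, .68, .56, .56, .48],
    "Mice (Table 19.2a)": [.66, .65, .62, .60, .56, .40],
    "Children (Table 19.2c)": [.13, .76, .53, .84, .72],
}


def variance_share(loadings):
    """Mean squared loading, i.e. the eigenvalue divided by the number of tests."""
    return sum(x * x for x in loadings) / len(loadings)


shares = {name: variance_share(values) for name, values in LOADINGS.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of variance")
```

On this reading the three batteries give shares close to the 48%, 35%, and 42% stated in the notes, which suggests the columns are on the conventional loading scale.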

Table 19.3 Studies of general cognitive ability in nonhuman species

Reference                              No. of     No. of     +/total        Mean r   1st PC %
                                       subjects   measures   correlations            variance
Mice
Bagg (1920) (a)                        71         8          28/28          .58      .61
Locurto & Scanlon (1998) (b)
  Sample A (F2 cross)                  34
    Speed                                         6          15/15          .51      .58
    Accuracy                                      4          6/6            .26      (.44)
  Sample B (CD-1 outbred)              41
    Speed                                         6          15/15          .47      .55
    Accuracy                                      4          6/6            .31      .48
Galsworthy et al. (2002)               40         8          26/28          .20      (.31)
Locurto et al. (2003) (c)              60         6          11/15          .13      (.27)
Matzel et al. (2003)                   56         5          10/10          .22      .38
Galsworthy et al. (2005)
  Study 1                              84         6          15/15          .22      (.35)
  Study 2                              167        11         41/55          .09      (.18)
Kolata et al. (2005) (d)               21         7          21/21          .35      .45
Locurto et al. (2005) (c)
  Experiment 1                         47         5          4/10           -.03     (.28)
  Experiment 2                         51         5          9/10           .15      (.34)

Dogs
Anastasi et al. (1955) (e)             73         10         30/45          .10      (.26)
Nippak & Milgram (2005) (f)            13         3          3/3            .89      .92

Cats
Warren (1961) (g)                      21         6          14/15          .50      .57
Livesey (1970) (g)                     8          4          5/6            .40      .58

Rhesus monkeys
Paule (1990)                           44–69      5          8/10           .16      (.36)
Herndon et al. (1997)
  Analysis 1                           30         6          n/a            n/a      .48
  Analysis 2 (h)                       53         3          n/a            n/a      .62

Humans (for comparison)
WAIS-R data (Table 19.1a)              365        11         55/55          .43      .48
Cognitive data (Table 19.1b)           111        7          19/21          .24      .36
Operant battery data (Table 19.2c)     85         5          10/10          .24      .42

Studies were included only if there were at least two published for a particular species, if they presented quantitative performance data (on at least three tasks) in a correlation matrix suitable for factor analysis, and if they were known not to have used inbred or mutant strains. Measures of performance on problem-solving tasks were included; measures of global activity, preference, "personality," and the like were excluded (see below). Analysis is based on product-moment correlations (r) unless otherwise noted; mean r was calculated using zr transformation. Numbers in parentheses indicate studies in which the percentage of variance accounted for by the first principal component was less than twice the variance accounted for by the second. (a) From data presented by Galsworthy et al. (2005). (b) Activity measures excluded. (c) Activity, stress, and anxiety measures excluded. (d) Open-field measure excluded; working-memory measures included. (e) Subjects were from six different breeds; leash control, motor skills, and obedience measures excluded. (f) All subjects were beagles; 30 s delay version of the delayed nonmatch to sample task was used; all measures were speed. (g) Based on rank correlations. (h) Subjects were a superset of those in Analysis 1, adding individuals who completed only three of the six total tasks.
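The note above states that mean correlations were computed with the zr (Fisher) transformation: each r is converted to z = atanh(r), the z values are averaged, and the mean is converted back with tanh. The sketch below applies this to the mice rows, both unweighted and weighted by sample size. The pairing of sample sizes with studies follows the table as read here, the helper name `mean_correlation` is hypothetical, and weighting by N rather than, say, N - 3 is an assumption.

```python
import math

# (sample size, mean r) for the thirteen mice rows of Table 19.3.
MICE = [
    (71, .58),                # Bagg (1920)
    (34, .51), (34, .26),     # Locurto & Scanlon, Sample A: speed, accuracy
    (41, .47), (41, .31),     # Locurto & Scanlon, Sample B: speed, accuracy
    (40, .20),                # Galsworthy et al. (2002)
    (60, .13),                # Locurto et al. (2003)
    (56, .22),                # Matzel et al. (2003)
    (84, .22), (167, .09),    # Galsworthy et al. (2005), Studies 1 and 2
    (21, .35),                # Kolata et al. (2005)
    (47, -.03), (51, .15),    # Locurto et al. (2005), Experiments 1 and 2
]


def mean_correlation(pairs, weighted=False):
    """Average correlations on the Fisher z scale, then transform back to r."""
    weights = [n if weighted else 1 for n, _ in pairs]
    z_sum = sum(w * math.atanh(r) for w, (_, r) in zip(weights, pairs))
    return math.tanh(z_sum / sum(weights))


unweighted = mean_correlation(MICE)
by_sample_size = mean_correlation(MICE, weighted=True)

print(f"grand mean r (unweighted): {unweighted:.2f}")
print(f"grand mean r (weighted by N): {by_sample_size:.2f}")
```

Under these assumptions the two averages land near the grand mean of .28 and the sample-size-weighted mean of .24 quoted in the text for mice.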

reported here may be biased against finding a general factor, because laboratory animals are raised in environments that are not only somewhat impoverished but also identical, or at least very similar across individuals, removing a major source of individual differences, and increasing the homogeneity of behaviour.

There was a small vogue for studying g in rats and chickens during the 1930s and 1940s, but, as Locurto (1997) points out in a review of comparative studies of intelligence, few lessons can be drawn from this work because it often used inbred, mutant (e.g., albino), or poorly-characterised laboratory animals, or focused exclusively on different maze-running tasks (see also Plomin, 2001). Samples with artificial genetic similarity, which include even the more recent, well-designed studies of rats (e.g., Anderson, 1993, 1995), are less likely to reveal a g factor, as are studies without a diverse set of cognitive challenges (these could instead identify a factor specific to the type of task used). Interestingly, two large recent studies that failed to find strong evidence of g in the Sprague-Dawley rat strain did find a positive manifold when animals were subjected to a variety of brain lesions early in development, a treatment that presumably increased the diversity of their surviving brain function by counteracting the effects of genetic similarity (Crinella & Yu, 1995; Thompson, Crinella, & Yu, 1990). Crinella and Yu (1995) present data from Thompson et al. (1990) comparing the unlesioned animals (N = 75) to the full sample (N = 424) on four performance measures; the average correlation among tasks in the unlesioned group is r = .01, compared to .24 for the full sample. In a replication with seven measures intended to sample a broader range of g loadings, Crinella and Yu found an average correlation of r = .02 in 24 unlesioned animals, increasing to .19 when 96 lesioned animals were added to the sample.[3]

Hints of evidence for general intelligence have also emerged in recent studies of nonmammals, specifically insects. Honeybees were divided into two groups according to their performance on a test of latent inhibition (high or low inhibition); the offspring of these groups showed a similar difference on latent inhibition, and high-inhibition bees performed worse on a reversal learning task (Chandra, Hosler, & Smith, 2000; but see Ferguson, Cobey, & Smith, 2001 for inconsistent findings). While this study observed a positive association between just two tasks, it is intriguing that reversal learning is a test of executive function, which is related to fluid intelligence in humans (see next section). The development of more diverse batteries of tests, as has been done for mice, could facilitate studies of general intelligence in suitable insect species, such as honeybees and fruit flies, as well as other species in which individual differences have been found at the single-neurone level, such as molluscs (Matzel & Gandhi, 2000).

[3] This effect is reminiscent of Spearman's (1927) finding that the average intertask correlation is larger in low-scoring than in high-scoring groups of human subjects.

The recent focus on mice has enabled the development of consensus methodologies (e.g., Crawley & Paylor, 1997) and a database of relatively comparable results, as well as facilitating the study of genetic modifications (e.g., Tang et al., 1999), but it cannot address some issues as well as can work on other species. Nonhuman primates such as monkeys should be studied because of their close evolutionary relationship to humans.[4] Dogs present an interesting case for several reasons, including their high social ability and coevolution with humans (Hare & Tomasello, 2005), as well as their potentially greater suitability as a model for understanding human cognitive diseases and for developing therapies (e.g., Milgram et al., 2005; Tapp et al., 2004; Tapp, Head, Head, Milgram, Muggenburg, & Su, 2006). Studies of dogs and other nonprimate species that show comparable cognitive and neurological complexity to primates, such as dolphins (Marino, 2004), and even highly intelligent nonmammals such as corvids (e.g., crows and ravens; Clayton & Emery, 2005), afford the opportunity to ask whether g represents a universal property of diverse brains that evolved on long-separate lineages. Many specific cognitive abilities have evolved convergently, such as self-recognition in cetaceans and primates (Marino, 2002), and tool use and episodic memory in corvids and primates (Emery & Clayton, 2004). There appear to be considerable differences among ravens in their ability to solve a novel, complex problem (gathering up a string with their bills and feet to acquire a piece of meat tied to the end; Heinrich & Bugnyar, 2005). It is therefore plausible that covariation among cognitive abilities, perhaps owing to their need for shared central information-processing resources, can arise separately within species as well.

Cognitive and neurobiological mechanisms

A second important corollary of the Law of General Intelligence is that individual differences in g result from differences in the amount or efficiency of information-processing resources brought to bear in solving problems. These resources could be specified at the cognitive or biological levels.[5] Jensen (1998b) devoted 99 pages to the correlates of g, focusing on brain volume, brain electrical activity, processing speed, and working memory. In the ensuing years, the biological foundation of g has been strengthened considerably, and it has been broadened by the use of new brain-imaging technologies, primarily based on magnetic resonance imaging (MRI), and new cognitive approaches. Also important is the identification by Duncan and colleagues (Duncan, Burgess, & Emslie, 1995; Duncan, Emslie, Williams, Johnson, & Freer, 1996), and others, of executive function, as mediated by the frontal lobes, as being critical for general fluid intelligence (gF).[6] Table 19.4 summarises recent empirical studies of the major cognitive and neurobiological correlates of general mental ability. In some cases, such as brain volume, the results of a meta-analysis are given, while in others, such as response time, the results of a single, large, well-designed study are presented. In the case of some newer MRI technologies, only a single small study may have been reported to date. In this section I will review some of the most important studies of the major dimensions of cognitive and neurobiological variation that have been shown, or proposed, to correlate with general intelligence.

[4] In addition to the batteries of tasks tested on rhesus monkeys mentioned earlier, a version of the CANTAB neuropsychological test battery has been developed with eight tests suitable for rhesus (Weed et al., 1999) and perhaps marmosets. Later studies of similar tasks have shown considerable individual differences among monkeys (Taffe et al., 2004), but no factor analyses of inter-task correlations have been published with this battery.
[5] Note that other biological differences between people, such as genotypes, nutrition, and general health, hormones, and other physiological biomarkers are not considered here, because their undoubted effects on intelligence differences must in some way be mediated by differences in brain structure or function.
[6] In discussing variables that are correlated with intelligence in this chapter, I use "g," "IQ," "cognitive ability," and similar terms interchangeably. However, this does not mean that they are synonymous. Bartholomew (2004) points out that measures of g based on factor analysis are normally superior to IQ scores, which are not necessarily constructed to optimally measure the general factor. Also, in some cases correlations with second-level factors such as general fluid intelligence, or general crystallised intelligence, or performance IQ (reflecting the nonverbal abilities measured by an IQ test) will be specified rather than correlations with measures of the g-factor itself.

Brain volume

The idea that greater information-processing resources should facilitate greater cognitive ability is a simple one – so simple that it appears simplistic. Yet the fact that brain volume is correlated with intelligence comes as a surprise to many contemporary cognitive scientists, who have apparently absorbed critiques of allegedly "racist" studies conducted a century ago (e.g., Gould, 1981). Their attitude is well expressed by Budiansky's (1998) assertion: "Correlation between intelligence and brain size has been soundly rebuffed in humans." Indeed, according to a press release (dated 29 March 2006), Elias A. Zerhouni, the head of the US National Institutes of Health, believes this to be the conclusion of empirical research: "Studies of brains have taught us that people with higher IQs do not have larger brains." The general ethos of modularity, which prevails in cognitive psychology and evolutionary psychology, and which is in opposition to notions of equipotentiality, also feeds skepticism of the importance of global parameters of brain function and structure. Nonetheless, beginning with Andreasen et al. (1993), a lengthening series of studies using quantitative analysis of structural magnetic resonance images and modern psychometric tests has documented a robust association accounting for about 10% of the variance in IQ, according to a meta-analysis of 37 studies

Table 19.4 Selected estimated simple correlations between full-scale IQ scores or other measures of general intelligence and possible causal factors

Measure                         Correlation (r)   Reference                     Notes
                                with IQ
Whole brain volume              .33               McDaniel (2005)               meta-analysis of 37 MRI studies, N = 1530
Grey matter volume              .27               Gignac et al. (2003)          meta-analysis of 7 MRI studies, N = 428
White matter volume             .31
Frontal grey matter volume      .41               Thompson et al. (2001)        N = 40 (20 twin pairs)
Response time (RT):                               Deary et al. (2001)           N = 900, population sample aged 54 to 58, modified Hick task
  simple reaction time          -.31
  4-choice RT                   -.49
  variability of RT             -.26
  number of 4-choice errors     .07
Inspection time (IT)            -.51              Grudnik & Kranzler (2001)     meta-analysis of 92 studies, N = 4197, attenuation correction (uncorrected r = -.30)
Nerve conduction velocity       .10               Reed et al. (2004)            N = 387, P100 VEP latency, average over 3 stimulus conditions
White matter lesions            .09               Gunning-Dixon & Raz (2000)    meta-analysis of 11 studies (4 crystallised, 7 fluid intelligence)
White matter organisation       .44               Schmithorst et al. (2005)     N = 47, DTI of FA, average of 7 brain areas showing positive correlations
White matter integrity          .51               Jung et al. (2005)            N = 27, proton MRS of NAA in occipital-parietal region
Working memory (WM):                              Gray et al. (2003)            N = 58, averaged across 3 trial types
  accuracy in 3-back task       .34
  RT in 3-back task             .01
Frontal activity during WM:                       Gray et al. (2003)            N = 48, fMRI signal, weighted average of 10 significant clusters
  specific to control           .51
  sustained across tasks        -.11
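Correlations such as those in Table 19.4 are often restated as proportions of shared variance, which is simply the square of r. The short sketch below works through two such conversions with values read from the table: the whole-brain volume correlation, and the comparison between a response time parameter and four-choice accuracy. The function name `variance_explained` is hypothetical, and the choice of which rows to compare is made here for illustration.

```python
def variance_explained(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r * r


# Whole brain volume and IQ (McDaniel, 2005): r = .33
brain_volume_share = variance_explained(.33)

# Deary et al. (2001): variability of RT (r = -.26) versus
# number of four-choice errors (r = .07), the smallest RT-to-accuracy ratio.
rt_to_accuracy_ratio = variance_explained(-.26) / variance_explained(.07)

print(f"brain volume shares {brain_volume_share:.1%} of IQ variance")
print(f"IQ shares {rt_to_accuracy_ratio:.1f} times more variance with RT variability than with errors")
```

The first figure is roughly 11%, in line with a description of "about 10%" of the variance, and the second ratio is a little under 14, consistent with a claim of at least 13 times more variance for the response time parameters than for accuracy.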

involving 1530 subjects (McDaniel, 2005). Controlling for height (as a measure of body size) typically does not alter the results (e.g., Witelson, Beresh, & Kigar, 2005). An earlier meta-analysis by Gignac, Vernon, and Wickett (2003) showed that the IQ±volume association was the same for grey matter (which consists mainly of the cell bodies of neurones) and white matter (which consists of axons and myelin, which carry information

19. The Law of General Intelligence

463

between neurones). As we shall see, however, properties of the white matter other than its total volume may be critical for intelligence. Modern image analysis technologies allow the volumes of particular structures and regions to be estimated from structural MRI data. Using such a method, Thompson et al. (2001) reported a correlation of .41 between frontal lobe volume and IQ (controlling for whole-brain volume), which was the largest and only signi®cant relationship of cortical lobe volumes to IQ. Studies using more automated techniques, such as voxel-based morphometry (VBM; Ashburner & Friston, 2000), have revealed varying patterns of results. In a study of 146 children aged 5 to 19, Wilke, Sohn, Byars, and Holland (2003) found that grey matter volume was most strongly correlated with IQ in the anterior cingulate cortex (ACC; r = .29). Interestingly, this correlation was highest in the oldest third of the subjects (r = .53). Frangou, Chitins, and Williams (2004) used a similar technique with 40 older children (ages 12 to 21) and replicated the correlation between IQ and ACC grey matter volume. They also found IQ correlations with orbitofrontal (bilateral), precuneus, thalamic, and cerebellar grey matter volume. In contrast to these developmental studies, Haier, Jung, Yeo, Head, and Alkire (2004) used VBM analysis from different sites with two adult samples (total N = 47) and found much more widespread regions of IQ± volume correlation, in both grey and white matter. As shown in Figure 19.1, the strongest correlations between grey matter concentration and IQ occurred in the medial prefrontal cortex, and the superior, middle, and inferior frontal gyri bilaterally. White matter concentration correlations with IQ were strongest in the frontal and temporal lobes. Using both VBM and a manual technique, Gong et al. (2005) found a correlation only in the medial PFC (speci®cally the ACC) between grey matter volume and g. 
Thus, while these studies disagree in the particulars of their findings, the overall pattern is consistent with Thompson et al.'s (2001) conclusion that frontal cortex grey matter is closely related to IQ. The disagreement may result from the differing samples: During development, the frontal cortex continues to mature, so the IQ–volume effect may grow more apparent with increasing age. It is also possible that the reliability of automated warping techniques like VBM is sufficient to capture robust trends of correlation over large regions, such as the lateral or medial prefrontal cortex (PFC), but not smaller areas such as specific gyri.

Information-processing speed

Just as a larger brain can be a more intelligent brain, a faster brain can also confer greater cognitive ability. This has been found even for the most simple measurements of processing speed. Deary, Der, and Ford (2001) administered a task with two conditions to a representative population sample of 900 Scottish adults aged 54 to 58. In one condition, subjects had to press one of four buttons according to which of four corresponding lights illuminated

Chabris

Figure 19.1 Maps of brain areas, shown by chequered regions, where tissue volume correlates with full-scale IQ, in each of two separate samples totalling 47 adults (reprinted from Haier et al., 2004 with permission from Elsevier). Labels indicate regions associated with general intelligence in this and other studies discussed in the text.

on each trial; in the other, only one light and one button were involved, so there was no "choice" component involving a decision of which button to press. Response time (RT) on the four-choice condition correlated r = −.49 with IQ (higher-IQ subjects responded faster), but this correlation remained at r = −.31 on the simple reaction time condition. Higher-IQ subjects also showed less variable response times (r = −.26 between IQ and the standard deviation of RT). Notably, IQ explained at least 13 times more variance in RT parameters than it did in accuracy on the four-choice task, ruling out the possibility of a speed–accuracy trade-off.

RT is not necessarily a pure measure of information-processing speed. Despite the apparent simplicity of even the simple RT task, it requires the sequencing of several operations (from detection of the light to initiation of the motor response, plus the speed of moving one's fingers). The "inspection time" (IT) task aims to measure a circumscribed component of perceptual processing by separating out the motor response component. On each trial, following a cue, two line segments are presented side by side on a screen for a very brief period of time, followed by a masking stimulus; the

subject must decide which segment is longer. The dependent measure, IT, is not speed of response; rather, it is the minimum stimulus display time at which the subject can reach a predefined accuracy criterion, such as 75% of trials correct. Early studies reported IT–IQ correlations as high as r = −.85 (higher-IQ subjects are able to reach criterion at shorter stimulus display times; Deary, 2000), but a more recent meta-analysis of 92 studies, involving 4197 subjects, estimated the population correlation at −.51 (Grudnik & Kranzler, 2001), essentially the same as the choice RT–IQ correlation found by Deary et al. (2001).

White matter efficacy

Larger brains and faster brains are both associated with higher IQ, as demonstrated by the volume, RT, and IT findings. Even so, it is not clear that larger brains should also be faster ones. But faster processing could certainly result from the more efficient function of white matter – the axons that communicate information between neurones, and the myelin that insulates the axons to prevent signal degradation over distance. Miller (1994) argued that the biological substrate for IQ differences should be found in white matter, and as we have seen, white-matter volume, both globally and in specific regions, does predict IQ.

Two of the white-matter factors presented in Table 19.4 have small correlations with g. Nerve conduction velocity, which can be measured in different ways, shows a consistent but very small association with g, as illustrated by the large recent study of Reed, Vernon, and Johnson (2004). The presence of white-matter lesions, which are areas of hyperintense MRI signal, and are often associated with ageing, correlates weakly with g, though perhaps more highly with executive function, according to a meta-analytic review (Gunning-Dixon & Raz, 2000). However, two newer MRI techniques enable direct measurement of the efficacy and integrity of white-matter tissue.
Diffusion tensor imaging (DTI) measures how much water diffuses in different directions within each voxel (three-dimensional "chunk") of the tissue being sampled. Water diffuses more along the direction of axon projection than perpendicular to it, since the perpendicular direction may be restricted by the axon membrane and the surrounding myelin sheaths. Fractional anisotropy (FA) is the degree to which water diffuses in a single direction. Thus, if a voxel contains mainly axons lined up in a single direction, it should have high FA. Klingberg et al. (2000) treated FA as an individual-differences parameter that indexes the integrity of the microstructure of white matter, and found that FA in a region of the left temporal-parietal cortex correlated as high as r = .84 with reading test score in a sample of 17 adults. Tuch, Salat, Wisco, Zaleta, Hevelone, and Rosas (2005) directly related FA to response time on a four-choice task in 12 young adults aged 19–26, and observed negative correlations (faster responders had higher FA values) in left parietal and superior temporal areas (cf.

Madden, Whiting, Huettel, White, MacFall, & Provenzale, 2004). Finally, Schmithorst, Wilke, Dardzinski, and Holland (2005) measured FA and IQ (using the Wechsler Intelligence Scale for Children – III Test) in 47 children aged 5 to 18. In addition to a bilateral region in the frontal lobes, several areas of positive correlation between FA and IQ were found in areas associated with visual–spatial processing, such as the right occipital–parietal region, with a collective average correlation of r = .44. (See Shenkin, Bastin, MacGillivray, Deary, Starr, & Wardlaw, 2003; Shenkin et al., 2005 for further studies of FA and intelligence in samples of older adults.)

Magnetic resonance spectroscopy (MRS) measures the relative concentrations of different molecules in tissue, but typically cannot be performed with the same resolution or whole-brain coverage as structural MRI, DTI, or functional MRI (fMRI). A typical study samples from one or a few large regions of interest (ROIs) in the brain. Only certain molecules have spectroscopic signatures that enable quantification; the most commonly used is N-acetyl-aspartate (NAA), which is a marker of neuronal health. In the most recent MRS study of intelligence, Jung et al. (2005) measured NAA in an ROI in a right occipital–parietal white-matter region, and observed a correlation of r = .51 with IQ in a sample of 27 adults (cf. Jung, Brooks, Yeo, Chiulli, Weers, & Sibbitt, 1999a). When the same region was examined in a separate study of 45 college students (Jung et al., 1999b), performance on a set of timed cognitive tasks was more highly correlated with NAA (r = .65) than was performance on a set of untimed tasks (r = .28), a specificity to processing speed consistent with the ROI's placement within white matter.
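The fractional anisotropy index used in the DTI studies above can be made concrete with a small sketch. It uses the standard definition of FA in terms of the three eigenvalues of a voxel's diffusion tensor; the function name and the example eigenvalues are illustrative assumptions, not values taken from any of the studies cited.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """Standard FA formula from the three eigenvalues of a diffusion tensor.

    Returns 0 when diffusion is equal in all directions and approaches 1
    when diffusion is confined to a single direction.
    """
    numerator = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0:
        return 0.0  # no measurable diffusion in this voxel
    return math.sqrt(0.5 * numerator / denominator)

# Hypothetical eigenvalues (arbitrary units), chosen only for illustration.
isotropic = fractional_anisotropy(1.0, 1.0, 1.0)    # equal diffusion in all directions
fibre_like = fractional_anisotropy(1.7, 0.3, 0.3)   # diffusion mostly along one axis
single_axis = fractional_anisotropy(1.0, 0.0, 0.0)  # idealised limiting case

print(isotropic, round(fibre_like, 3), single_axis)
```

A voxel filled with axons running in parallel behaves like the second case, which is why FA can serve as an index of white-matter microstructure.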
Working memory and cognitive control

IQ differences may arise from variation in information-processing efficiency across brain areas (as measured by RT or IT), which may in turn reflect differences in available neural resources (as measured by global or region-specific brain volume, or white-matter efficacy). Two complementary possibilities are that IQ differences arise from variation in the capacity of limited resource pools – collectively described as working memory (WM) – that are used by multiple other processes, or from variation in the efficacy of supervisory processes (executive function or cognitive control) that regulate the operation of other processes (Duncan & Owen, 2000).[7]

[7] For the purposes of this chapter I will consider "executive function" and "cognitive control" to be synonymous, but in the broader literature they are not always taken to mean the same thing. Executive function may be viewed as a large-scale category of neuropsychological functions, which itself can be fractionated into subcomponents (Miyake, Friedman, Emerson, Witzki, Howerter, & Wager, 2000), each of which may correlate more or less with general intelligence. Indeed, Friedman, Miyake, Corley, Young, DeFries, and Hewitt (2006) report that measures of the ability to update working memory correlate significantly with IQ, while measures of set-shifting and inhibitory abilities do not.

In either case, the requirement that the limited resource or specific process be

used to perform many different cognitive tasks would explain the positive correlations among cognitive tests.

Working memory is assessed, albeit crudely, by the digit span test that is standard in IQ batteries like the WAIS. The relationship between IQ and working memory capacity is so well supported by empirical evidence that some theorists have argued the two concepts are coextensive (e.g., Kyllonen & Christal, 1990), but current views are less radical (Conway, Kane, & Engle, 2003).

The connection between IQ and executive function is a more recent development. Classical neuropsychological dogma claimed that measured IQ did not significantly decrease after lesions to the prefrontal cortex (e.g., Warrington, James, & Maciejewski, 1986). Paradoxically, a separate specific neuropsychological evaluation, such as the Wisconsin Card Sorting Test (WCST), was required to detect frontal damage. In the WCST, the subject must learn to sort cards into piles by inferring a simple rule the examiner has in mind, such as sorting by the number of shapes on the card (1, 2, 3, or 4) rather than the colour of the shapes (red, blue, green, or yellow). After the subject has followed the rule correctly for a series of trials, the examiner changes the rule. Perseverative errors – continuing to follow the old, invalid rule – are made by patients with frontal damage, and are interpreted as reflecting decreased executive function. Duncan et al. (1995) resolved this paradox by showing that frontal damage impairs performance on tests of fluid intelligence disproportionately, compared to the overall IQ score, which in traditional test batteries such as the WAIS was weighted towards measures of crystallised intelligence such as vocabulary. Frontal lobe patients scored significantly lower on the Culture-Fair Test than on the WAIS, whereas matched control subjects, and nonfrontal patients, performed similarly on the two tests.
The Culture-Fair Test includes "matrix reasoning" – the quintessential measure of fluid intelligence, and the measure that typically loads most highly on the g factor in a battery of cognitive tests. A matrix task requires subjects to examine an array, whose nonverbal elements are linked by (unstated) rules, and determine which of several options is the correct one to fill an empty space in the array. Items get progressively harder as the test proceeds. (The newest edition of the WAIS, the WAIS-III, has added a matrix reasoning test.) Duncan et al. (1996) showed that frontal patients and low-IQ subjects exhibited a phenomenon of "goal neglect" – an impairment of cognitive control, in which instructions to switch attention between two different stimulus streams are ignored, even though they are understood.

A recent study by Gray, Chabris, and Braver (2003) illustrates the relationship between IQ, WM, and cognitive control. Fifty-eight adults completed Raven's Advanced Progressive Matrices (RAPM, the leading matrix reasoning test) and the 3-back working memory task. For this task, in each of a series of trial blocks, the subject views a series of words and presses one button if the current word is the same as the one that came three earlier, and another button if it is different. This is a deceptively

difficult task; in addition to keeping three words in memory while waiting for the next one, which is relatively easy by itself, the subject must discard the oldest word and add the new one while keeping the order correct. Accuracy on this task across all trials correlated r = .34 with RAPM score. The task was made even more difficult by the inclusion of "lure" trials, in which the stimulus is the same as the one seen two, four, or five trials earlier. On these trials, the familiarity of the word tempts the subject to respond incorrectly that it is the 3-back word, and performance is worse. On "nonlure" trials, when the stimulus matches that from one, six, or seven trials back, or is entirely new, performance is better. Cognitive control must be selectively increased on lure trials to suppress the incorrect response; indeed, RAPM score explained more variance on lure trials than on nonlure trials (partial r = .27 for the RAPM–lure correlation when controlling for nonlure accuracy).

Functional architecture

Several lines of evidence reviewed thus far point to the prefrontal cortex and cognitive control as critical anatomical and cognitive substrates of g. Gray et al. (2003) tested the hypothesis that individual differences in fluid intelligence are mediated by differences in the functioning of the neural system responsible for working memory. This system involves the lateral frontal cortex, anterior cingulate cortex, and cerebellum. Forty-eight of the subjects who performed the 3-back task discussed above had their neural activity measured with fMRI on a trial-by-trial basis, and activity during lure trials was correlated with RAPM scores. The WM task activated a widespread network of brain regions, and RAPM predicted lure trial activity in the expected lateral frontal, ACC, and cerebellar regions, as well as several other regions, especially in the parietal lobes.
Figure 19.2 illustrates this effect for an ROI in the left lateral frontal cortex, in which the percentage signal change from baseline correlated r = .55 with RAPM score. High-gF subjects increased activation of this area during lure trials, while low-gF subjects actually decreased activation.[8]

[8] This study, and the other fMRI studies reviewed in this section, report that neural activity is positively correlated with g. This appears to contradict the findings of several studies that measured neural activity in terms of cerebral glucose metabolism with PET, and consistently found negative correlations with g (for a review, see Haier, 1993). Gray et al. (2003) and others actually show that differences in activity between difficult and easier task conditions correlate positively with g, whereas the PET studies tend to show that the absolute glucose metabolic rate correlates negatively with g. Larson, Haier, LaCasse, and Hazen (1995) used PET to compare glucose metabolism between difficult and easy versions of a digit span test customised to the ability level of each of 28 subjects. High-RAPM scorers showed higher brain-wide activity in the difficult condition than in the easy condition, but low-RAPM scorers showed lower activity when the task was more difficult – analogous to the pattern observed by Gray et al. (Figure 19.2).

These correlations

Figure 19.2 Results of an individual differences study of the neural mechanisms of general fluid intelligence (Gray et al., 2003). A: Chequered areas indicate brain regions in which gF, measured by Raven's APM score, predicted neural activity on the high-interference lure trials of a 3-back working memory task. Lateral views of the cerebral cortex (top) and cerebellum (below) are shown for the left hemisphere (LH) and right hemisphere (RH). In addition to the regions shown here, gF–activity correlations were also observed in the left anterior cingulate cortex (not pictured). The three highlighted regions in the left frontal, left parietal, and right parietal cortex collectively mediated nearly all of the behavioural correlation between gF and 3-back lure trial accuracy. B: Positive relationship between gF and neural activity (% signal change from baseline) in the circled region of the left lateral frontal cortex (lure trials). Note that sustained activity during the 3-back task (compared to periods of rest) was not significantly correlated with gF in this region. C: Contrasting activation timecourses of high-gF and low-gF groups (defined by a median split on RAPM score) during performance of lure trials.

between gF and lure trial activity remained in analyses controlling for activity on other trial types, as well as for accuracy on lure trials.

Two separate relationships have been established thus far: (1) RAPM predicts WM performance, especially on lure trials; and (2) RAPM predicts neural activity specific to lure trials, especially in the lateral PFC. These findings are consistent with the idea that increased activation of this network explains why high-IQ subjects perform better on lure trials. To demonstrate directly that this neural activity is responsible for the relationship between fluid intelligence and working memory, Gray et al. (2003) used mediation analysis. Mediation occurs when the relationship between two variables, A and B, is transmitted through a third variable C, that is correlated with both A and B (MacKinnon, Lockwood, Hoffman, West, & Sheets, 2002). Activity differences in three regions – the left lateral PFC

region highlighted in Figure 19.2, as well as a left parietal and a right parietal region – collectively accounted for nearly all of the relationship between gF and lure trial accuracy. Interestingly, the coordinates of these critical regions, whose activity reflects individual differences in fluid intelligence, are similar to those identified in an fMRI study of task-switching (Sohn, Ursu, Anderson, Stenger, & Carter, 2000), consistent with the findings of Duncan et al. (1996) on goal neglect.

Further evidence that the lateral frontal cortex and parietal cortex mediate general intelligence comes from studies using what Jensen (1998b) has described as the method of correlated vectors. According to Jensen, if a biological variable is related to g, its relationship with test performance should strengthen as the test's g-loading increases; that is, a vector of test g-loadings (vector A) should correlate significantly with a vector of correlations between the test scores and the biological variable (vector B). Lee et al. (2005) had 36 high school-age subjects complete the WAIS-R and RAPM, and then perform a matrix reasoning task during fMRI. Vector A contained the g-loadings of the WAIS-R subtests and the RAPM (as reported in a published psychometric study). They selected a set of five brain ROIs in which activity was greater during a difficult than during an easy version of the matrix task; these areas were the ACC, left and right PFC, and left and right posterior parietal cortex. For each of these ROIs, they created a vector of correlations between performance of the WAIS-R and RAPM tests and activity in that ROI (each of these is a Vector B: B1, B2, ..., B5). Finally, when Vector A was correlated with each of the Vector Bs, the highest correlations were observed for posterior parietal cortex ROIs bilaterally.
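The logic of the method of correlated vectors can be sketched in a few lines. Everything numerical below is hypothetical: the g-loadings, the ROI names, and the test–activity correlations are invented for illustration and are not the values reported by Lee et al. (2005). The sketch only shows the computation, namely correlating vector A with each vector B.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Vector A: hypothetical g-loadings of five tests.
g_loadings = [0.45, 0.55, 0.62, 0.70, 0.78]

# Vectors B: hypothetical correlations between each test's score and
# activity in a given ROI (one list per ROI, same test order as vector A).
roi_correlations = {
    "roi_1": [0.10, 0.16, 0.21, 0.27, 0.33],  # rises with g-loading
    "roi_2": [0.22, 0.18, 0.25, 0.19, 0.23],  # unrelated to g-loading
}

# Correlate vector A with each vector B.
vector_correlations = {
    roi: pearson(g_loadings, b) for roi, b in roi_correlations.items()
}

for roi, r in vector_correlations.items():
    print(roi, round(r, 2))
```

In this invented example the first ROI would be read as g-related, because the tests that load most highly on g are also the ones whose scores track its activity most closely.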
That is, in the parietal cortex, as the g-loading of a test increased, so did the relationship between the test score and neural activity: The most g-loaded tests were the best predictors of individual differences in neural activity in this ROI compared with elsewhere in the brain.

Duncan and Owen (2000) recorded neural activity with positron emission tomography (PET) while 13 young adults performed two pairs of tasks, one spatial pair and one verbal pair. Within each pair, one of the tasks had a higher g-loading than the other (spatial, .59 versus .37; verbal, .55 versus .41). The dorsolateral PFC was the only brain area in which activity was greater in the higher-g than in the lower-g version of both tasks. This conjunction criterion eliminated areas of activity specific to particular stimulus and response types. Neither of these two studies agreed completely with the Gray et al. (2003) findings, but each had a smaller sample size than Gray et al., and the lateral PFC and posterior parietal cortex were each strongly implicated in two out of the three studies.

Several other small studies relating neural activity to individual differences in intelligence have recently been reported. In a sample of 19 adults, Horn, Dolan, Elliott, Deakin, and Woodruff (2003) measured IQ using the National Adult Reading Test (NART) and neural activity using fMRI during a response-inhibition task, which demands cognitive control. NART

predicted activation in a left lateral frontal area, as well as a precentral gyrus region on the same side. Geake and Hansen (2005) had 12 subjects perform a "fluid analogy" task involving nonsense letter strings during fMRI, and found that a measure of verbal intelligence predicted task-specific activation in two left lateral frontal areas. Despite possibly being weighted towards crystallised rather than fluid g, the results of these two studies are broadly similar to those of Gray et al. (2003). Haier, White, and Alkire (2003) took a different tack from other studies by measuring g with RAPM and correlating it with PET activity while the 22 subjects simply watched videos, without performing any task. In contrast with the results discussed so far, this study's strongest finding was a difference in activation between high- and low-g subjects in an occipital-temporal region, likely involved in object recognition.

The broad pattern that emerges from these studies of the functional architecture of g is in line with the studies of regional grey-matter volume: The prefrontal cortex is the most critical node in a network that implements working memory and cognitive control, which are intimately related to each other and to fluid intelligence (Kane & Engle, 2002). However, there is heterogeneity of g-related function in the frontal lobe, with the lateral frontal cortex appearing more closely related to individual differences than is the medial frontal or anterior cingulate cortex. The best evidence for this PFC–g relationship comes from the mediator analyses reported by Gray et al. (2003): As a rule, such analyses provide a stronger understanding of the neural mechanisms of g because they can reveal those mechanisms that account for the relationship between g and specific aspects of cognitive task performance, such as cognitive control. The evidence for posterior parietal cortex association with g comes from Gray et al. and from Lee et al.
(2005), especially the correlated vectors analysis performed in the latter study. Interestingly, the parietal region does not show as strong a volumetric relationship to g as does the frontal cortex; perhaps variability in parietal function does not depend on its size, whereas frontal function increases with size. Future studies of the functional architecture of g should explore these frontal and parietal regions, as well as the thalamus, basal ganglia, and cerebellum, using multivariate methods such as mediation and correlated vectors analyses.[9]

[9] Studies of individual differences in neural activity to date have focused on variability in the magnitude of activation within regions, but it is also possible to explore the distribution of activation across the brain; for example, variation in locations, in correlations between locations, and in other properties (e.g., Glabus et al., 2003).

Learning ability and "neural plasticity"

The possibility that general intelligence results from variation in the ability to learn new skills and facts has been raised repeatedly, from Spearman

(1927) to Garlick (2002, 2003). Garlick offers "neural plasticity," measured by a learning rate value in a model neural network, as a single "golden parameter" (Thomas & Karmiloff-Smith, 2003) to explain g. In support of this hypothesis, he demonstrates that simple neural networks, using error-feedback learning rules, reach asymptotic performance with less training when their learning rates are higher. Garlick's learning rate parameter would seem to correspond most closely to the speed of acquisition of simple associations, or to the speed of conditioning.

The idea that g reflects learning ability is entirely reasonable (Mackintosh, 1998) and accords well with common sense, but learning ability is conspicuously absent from Table 19.4. One reason is that whereas the ability to learn complex material invokes a variety of cognitive abilities, and does correlate with g (Jensen, 1998b), it is not easy to measure reliably the elementary parameters of learning, such as acquisition rates or conditionability, that might underlie or explain individual differences in intelligence. This difficulty arises because measures of change, difference, or slope are inherently less reliable than measures of absolute performance level (Jensen, 1998c). For example, the reliability of a difference score depends on the reliabilities of two separate measures, and subtracting the two measures essentially sums the "noise" associated with each. Thus, IQ may appear to correlate poorly with a measure of learning simply owing to poor reliability. Future research on learning and g might profit from developing reliable and valid ways to measure an individual subject's learning rate.

Garlick's specific proposal is problematic for at least two reasons. First, as the above discussion implies, there is little direct evidence that IQ or g predicts performance on elementary learning tasks.
Studies of implicit learning, which is an unconscious, associative process, have demonstrated only modest IQ correlations (e.g., r = .25 in Reber, Walkenfeld, & Hernstadt, 1991; r = .05 in Feldman, Kerr, & Streissguth, 1995; and r = .12 in McGeorge, Crawford, & Kelly, 1997). Second, Garlick's neural models confirm nothing more than a tautology: that for some tasks, increasing the learning rate leads to faster learning. This fails to account for situations in which learning is too rapid and results in overfitting (learning associations that are too specific to the training examples provided, and thus generalise poorly to novel situations), or in a failure of the model to reach a stable solution. Learning ability is undoubtedly related to intelligence, but almost certainly not in such a simplistic way.
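The reliability argument made above for difference scores can be illustrated with two textbook formulas from classical test theory. The sketch assumes that the two component measures have equal variances, and all the numbers are hypothetical; the function names are introduced here for illustration only.

```python
import math

def difference_score_reliability(rel_x, rel_y, r_xy):
    """Reliability of the difference X - Y under classical test theory,
    assuming X and Y have equal variances.

    rel_x, rel_y: reliabilities of the two measures
    r_xy: correlation between the two measures
    """
    return (0.5 * (rel_x + rel_y) - r_xy) / (1.0 - r_xy)

def attenuated_correlation(true_r, rel_a, rel_b):
    """Observed correlation expected when two measures with the given
    reliabilities have a true (error-free) correlation of true_r."""
    return true_r * math.sqrt(rel_a * rel_b)

# Hypothetical case: two performance measures, each with reliability .80,
# correlating .60 with one another (e.g., early and late learning trials).
rel_learning = difference_score_reliability(0.80, 0.80, 0.60)

# If the true IQ-learning correlation were .50 and IQ had reliability .90,
# the correlation actually observed would be considerably smaller.
observed = attenuated_correlation(0.50, 0.90, rel_learning)

print(round(rel_learning, 2), round(observed, 2))
```

Under these assumptions, two respectably reliable measures yield a difference score whose reliability is far lower than that of either component, and the observed correlation with IQ shrinks accordingly.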

Overview and future directions

Across the wide variety of different measures reviewed here, a remarkably consistent result holds: Virtually all of the cognitive and neurobiological variables correlate with IQ in the range of r = .30 to .50. This implies

that both of the proposed accounts of the mechanistic basis of the Law of General Intelligence are correct, as the measures related to overall neural efficiency (e.g., gross brain volume, response time) and the measures of common resources or control processes (e.g., frontal lobe volume, working memory) are equally predictive of g. As a rule, none of the measures in Table 19.4 is able to go far beyond the .50 threshold, and some that come close (e.g., fMRI and DTI measures) come from regions of interest in the brain that are selected post-hoc, and thus may represent overestimates of correlations that would be found with prospectively chosen ROIs.

One's interpretation of this general pattern may depend on one's prior beliefs about how strong correlations "should be" to be taken seriously. Some might argue that doing complex medical scans, at several hundred dollars per subject, or behaviourally testing a sample of several hundred individuals, and explaining "only 9 to 25%" of the variance in IQ is an unimpressive feat. But if g were a simple, linear product of a simple, obvious biological variable, that variable would probably have been identified already, and the Law of General Intelligence would be much less interesting than it actually is.

While some of the correlations reported here are based on meta-analyses of numerous studies, others need further replication and generalisation. Establishing firm empirical bases for the roles of white-matter efficiency and region-specific brain structure and function are important goals for further research, as are expanding the study of g to neural features such as cortical thickness and shape. In the remainder of this section I will propose three other critical questions regarding the mechanisms of g, and discuss possible means of answering them.
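The "9 to 25%" figure quoted above follows directly from squaring the bounds of the typical correlation range, since the proportion of variance in one variable that is linearly accounted for by another is r squared. A minimal sketch (the function name is introduced here for illustration):

```python
def variance_explained(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r * r

# The typical range of correlations with IQ cited in the text.
lower = variance_explained(0.30)
upper = variance_explained(0.50)

print(f"{lower:.0%} to {upper:.0%} of the variance")
```

The same arithmetic shows why a correlation must reach about .71 before a single variable accounts for half of the variance.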

Mechanisms of g in other species

First, as the Law of General Intelligence appears to apply to other species as well as it does to humans, its corollary on information-processing efficiency should likewise be species-general. There are some indications from the limited literature on animal g that this is the case. The relationship between g and working memory is observed in mice (Kolata, Light, Townsend, Hale, Grossman, & Matzel, 2005), and frontal cortex volume predicts performance of working memory and executive function tasks in dogs (Tapp et al., 2004). As Thomas and Karmiloff-Smith (2003) note, novelty-preference in human infants predicts higher intelligence at later ages. Likewise, across at least two studies (Kolata et al., 2005; Matzel et al., 2003), the propensity of mice to explore in an open field is correlated significantly with performance on multiple cognitive tasks. Anderson (1993) found that in a sample of 20 Long-Evans rats, brain volume correlated significantly with the general factor, derived from a battery of four

measures designed for high g-loading.[10] This battery included a novelty-preference measure, which loaded .43 on the g factor. Future studies of g in animals, as proposed earlier, should incorporate biological measures whenever possible; given the availability of MRI and PET devices designed for mammals, it is possible to obtain virtually the same range of neurobiological assays that can be studied in humans, as well as additional measures that cannot be performed on humans. For example, Anderson (1995) observed significant variability across individual Long-Evans rats in the degree of dendritic arborisation, but did not find differences in this measure between small groups of high-g versus low-g animals. In principle, such microanatomical features and cellular mechanisms, such as firing rate and long-term potentiation, could be measured in individual animals and correlated with differences in cognitive ability, affording examination of the mechanisms of g at a level of detail that is difficult to reach in human studies.

Discovering new mechanisms of g

Second, as illustrated by the unique possibilities of animal studies, we must continue to seek new mechanisms that could underlie g. New technologies and discoveries in cognitive neuroscience may come with obvious applications to understanding individual differences – but researchers interested in individual differences must monitor these developments, because most cognitive neuroscientists and neuroimaging specialists are not interested in individual differences. But instead of merely waiting for new developments in other fields, students of human intelligence might use heuristic strategies for identifying factors likely to covary with g. For example, cognitive ability, especially fluid intelligence, tends to decline with age. There is a burgeoning literature on age-associated changes in the brain, and in specific cognitive processes.
We should ask which of these specific brain changes parallel those in general cognitive ability. Neurone number decreases by approximately 10% from age 20 to 90 (Pakkenberg & Gundersen, 1997), as does the number of glial cells (Pakkenberg et al., 2003), but total myelinated fibre length decreases by 40% to 50% during ageing (Marner, Nyengaard, Tang, & Pakkenberg, 2003). These findings are consistent with the known association of brain volume and IQ, and point to white-matter factors as potentially even more important, given their greater sensitivity to ageing.

[10] Shirley (1928) reported that "the relationship between maze learning and brain weight, in so far as one exists at all, is, then, the heavier his brain, the more the rat blunders" (p. 194). One imagines a rat with an oversized, overweight head causing it to stumble around a maze, bumping off walls as though drunk. As with negative studies of rat g around the same time, the cognitive measures were insufficiently diverse, and the genetic history of the animals insufficiently specified, to allow any conclusions to be drawn from this work.

19. The Law of General Intelligence


As well as asking how the brain changes with age, one could ask how the human brain differs from that of other species. In human studies, brain volume is correlated with IQ, with or without correcting for individual differences in body size. In comparative studies of brain evolution, however, the large differences in body size among species must be taken into account. Brain and body size tend to scale together within a lineage, so that species in which the brain is larger than "expected" given the body size can be identified as outliers that have developed unusually large cognitive capacity. Humans rank highest in this measure of "encephalisation quotient." However, some species have larger brains than humans in terms of absolute volume, yet do not appear to have greater cognitive abilities, to the extent that species can be compared in "intelligence." Whales, bottlenose dolphins, and African elephants have brains that are at least as large as the human brain, but the human brain probably has slightly more cortical neurones (Roth & Dicke, 2005). However, neurone count is not the only difference between the human and other large brains: As a rule, white-matter volume scales up faster than grey matter as total brain size increases (Bush & Allman, 2003). Axons are thicker in primates than in cetaceans and elephants, resulting in greater conduction velocity, and distances are shorter (because of packing more neurones into a smaller volume), resulting in more efficient communication and synchronisation between computational modules (Roth & Dicke, 2005). Thus, evidence from brain evolution and brain ageing converges to highlight white matter as a potentially crucial source of individual differences in intelligence. Further comparative research combining behavioural measures of processing speed and histological measures of white-matter structure may shed light on these relationships. Not all parts of the brain have evolved in parallel.
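The notion of a brain that is larger than "expected" for a given body size can be written compactly. The notation here is an assumption introduced for illustration, since the chapter itself gives no formula: E is brain size, M is body size, and a and b are constants that would be estimated by regressing log brain size on log body size across the species of a lineage.

```latex
\[
E_{\text{expected}} = a\,M^{b}
\quad\Longleftrightarrow\quad
\log E_{\text{expected}} = \log a + b \log M ,
\qquad
\mathrm{EQ} = \frac{E_{\text{observed}}}{E_{\text{expected}}}
\]
```

On this reading, a species with EQ above 1 has a larger brain than the lineage trend predicts for its body size, and the values of a and b depend on which set of species is used to fit the trend.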
In primate evolution in particular, the cortex has grown more rapidly than subcortical structures, and the frontal cortex has grown more rapidly than the rest of the cortex (Bush & Allman, 2004; see also Schoenemann, Sheehan, & Glotzer, 2005). One can also ask which new features of the brain evolved most recently, and could still vary widely within the human population. Allman and colleagues (Allman, Hakeem, & Watson, 2002; Allman, Watson, Tetreault, & Hakeem, 2005) have proposed that a brain region (Brodmann area 10) and a cell type (known as "spindle neurones" or "Von Economo neurones") – found only in great apes and humans – may be associated with cognitive capacities in which humans excel, such as self-control.[11] Area 10 is at the frontal pole, and Von Economo neurones are found in the ACC and fronto-insular cortex, all in or near regions of the frontal lobe associated with intelligence. Area 10 is at least twice as large in humans (as a fraction of brain size) as in apes, and Von Economo neurones are about 25 times more numerous in humans than in apes. Studies of ageing again converge nicely with the general evolutionary finding: The frontal cortex declines in size with age faster than any other cortical region (Allen, Bruss, Brown, & Damasio, 2005, found that age explained 37% of the variance in frontal lobe grey-matter volume, compared with 32%, 22%, and 8% for the temporal, parietal, and occipital lobes). Further application of the ageing and evolution heuristics may point to factors other than white matter and the frontal cortex, and there are doubtless other useful heuristics that may be used to search for novel mechanisms of general intelligence. Genomic investigations may prove especially fruitful in this regard (Gilbert, Dobyns, & Lahn, 2005).[12]

[11] We may not think of ourselves as creatures with great self-control, but compared to some other primates, humans are extremely patient, as are adults compared to children. Cotton-top tamarin monkeys, with extremely small frontal lobes, apparently cannot learn to wait more than about 8 seconds for a food reward, even if doing so would triple the reward's size (Stevens, Hallinan, & Hauser, 2005).

Developing statistical and computational models of g

The most critical question about all the current (and future) studies that find "moderate" correlations of r = .30 to .50 may be whether they are really all rediscovering the same correlation, or whether they are identifying substantially independent contributors to variation in cognitive ability. That is, does having a larger brain and more efficient white matter yield higher intelligence than just having one or the other? If so, what are the independent contributions of these two variables, and do they interact, or are they merely additive? The same can be asked of response time, working memory capacity, cognitive control, and all the cognitive and neurobiological factors associated with g. And exactly how do these interact? The answers to these questions can only come from developing detailed models of general intelligence (Gray & Thompson, 2004). These must be of two types: statistical and computational.
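The difference between rediscovering the same correlation and finding independent contributors can be made concrete with the textbook identity for the squared multiple correlation of a criterion on two standardised predictors. The sketch below is an illustration only: the function name `two_predictor_r2` and the example correlations are assumptions picked from the "moderate" range quoted above, not estimates from any study, and the identity covers additive effects only, not interactions.

```python
def two_predictor_r2(r1: float, r2: float, r12: float) -> float:
    """Squared multiple correlation of a criterion on two predictors.

    r1, r2: correlation of each predictor with the criterion (e.g. g).
    r12:    correlation between the two predictors themselves.
    """
    if abs(r12) >= 1.0:
        raise ValueError("predictors must not be perfectly collinear")
    return (r1**2 + r2**2 - 2.0 * r1 * r2 * r12) / (1.0 - r12**2)


# Hypothetical values: both predictors correlate .40 with g.
single = 0.40**2                                  # either predictor alone
independent = two_predictor_r2(0.40, 0.40, 0.0)   # predictors unrelated
overlapping = two_predictor_r2(0.40, 0.40, 0.8)   # predictors largely redundant

print(f"one predictor:          {single:.3f}")
print(f"two, uncorrelated:      {independent:.3f}")
print(f"two, correlated at .80: {overlapping:.3f}")
```

With these made-up inputs, two unrelated predictors explain twice the variance of one, whereas two largely redundant predictors add very little. Telling the cases apart requires the correlation between the predictors, which is available only when both are measured in the same sample.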
A statistical model would specify the relationships among the many cognitive and biological factors associated with g. Such a model cannot be constructed from data in published studies, since these almost always evaluate only the relationship between a single measure and g. There are scattered recent examples of studies that relate IQ to multiple covariates (e.g., Deary et al., 2006; Gray et al., 2003; Schretlen et al., 2000), sometimes in genetically informative designs (e.g., Hansell, Wright, Luciano, Geffen, Geffen, & Martin, 2005), but these studies rarely incorporate more than one measure of brain function. (A notable exception to this rule is the work of Walhovd et al., 2005, who showed that cortical volume and a measure of brain electrical activity were additive in explaining variance in performance IQ.) What is needed is a single study in which the same sample of subjects receives a comprehensive battery of assessments including gold-standard IQ tests (e.g., the WAIS-III, Raven's APM), cognitive measures (e.g., response time, working memory, inspection time, novelty preference, inhibition, and implicit learning), and a wide-ranging neurofunctional examination comprising at least high-resolution structural MRI, fMRI during performance of cognitive tasks with varying g-loadings, diffusion MRI, and MR spectroscopy, as well as EEG measures of the latency of various elementary processes. The best practices currently followed in each of these specific research areas should be adopted, beginning with the collection of a participant sample that is demographically representative of a large population, such as a medium-size city and its environs. Multivariate analyses such as regression, structural equation modelling, mediator and path analysis, and even factor analysis of the biological measures would be applied. With sufficient funding, subjects could be followed over time so that changes in biological and cognitive factors could be related to changes in intelligence, as well as to life outcomes. On the model of the National Longitudinal Survey of Youth (NLSY), which has been mined for numerous studies on intelligence (e.g., Herrnstein & Murray, 1994; Benjamin et al., 2006), such a study would generate a unique resource for current and future investigators, yielding returns far beyond the required initial investment.

[12] According to one research group, genes that are found to have evolved unusually rapidly in the primate lineage, or to show evidence of ongoing selection in the human population, are plausible candidates to explain individual differences in human phenotypes. Dorus et al. (2004) argue that genes involved in nervous system development are undergoing extra-rapid evolution (compared to genes for neural transmission, and to genes for cellular "housekeeping" processes), and Mekel-Bobrov et al. (2005) and Evans et al. (2005) claim that two of these genes, whose mutations cause microcephaly, are still undergoing selection.

A statistical model of the factors that correlate with individual differences in cognitive ability will not provide a complete picture of the nature of g.
The crucial missing element is a computational model – a simulation of interacting neural mechanisms that is sufficiently complex to allow all the relevant factors to be specified and manipulated (see Newell, 1990, for discussion of the value of computational models in psychology). The statistical model, as well as hierarchical factor analyses of cognitive tasks, would serve as constraints on the class of acceptable models. A diverse set of cognitive tasks would be chosen such that they could be implemented within the model and could be administered in a battery to human subjects so as to measure the task g-loadings. Parameters of the model would then be manipulated to simulate individual differences in g-related factors, such as the size of different brain regions, or the speed of information processing, or the efficiency of communication between processes. To the extent that the model is valid, varying the parameter should produce an expected change in performance of the model on the simulated tasks. Critically, performance changes on specific tasks should be related to how g-loaded the tasks are. For example, if the model included a parameter for the size of the frontal cortex, which correlates with IQ (Thompson et al., 2001), then increasing this parameter – within some reasonable range – should improve performance on the simulated tasks, with greater improvements on highly g-loaded tasks, such as working memory, than on simpler tasks such as perceptual completion.

Several existing frameworks could be used to model individual differences in cognitive ability, including ACT-R (Anderson et al., 2004; Daily, Lovett, & Reder, 2001), but neural network models may be the best choice. These are abstract simulations of networks of simple processing units reminiscent of neurones: Many such interconnected units all operate in parallel, computing simple transformations of input to output and communicating signals of information to one another. A further advantage of these models is the correspondence between their tuneable parameters and the factors known to correlate with g. O'Reilly and Munakata (2000) describe a neural network framework in which a wide variety of tasks, of varying levels of complexity, have been implemented. Figure 19.3 [not included in eBook] shows a model developed using this framework by O'Reilly, Noelle, Braver, and Cohen (2002) to simulate the intradimensional/extradimensional (ID/ED) set-shifting task, which is sensitive to frontal lobe damage (and can be learned by animals). The task model includes, in addition to representations of the input and output, four separate modules of processing units, each corresponding to a different anatomical region: orbital frontal cortex, lateral frontal cortex, the ventral tegmental area, and a generic "posterior cortex." Unidirectional and bidirectional connections within and between some of these modules are also implemented. The details of the model implementation are not essential here; what is more important is the fact that, in principle, the size and processing efficiency of each module, as well as the quality of connections among the modules, can all be varied. Modules can also, of course, be added or removed to simulate different strategies, or combinations of processes, that might be used to complete a task.
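To illustrate what it means for module size and connection quality to be tuneable parameters, here is a deliberately tiny sketch in plain Python. It is not the O'Reilly and Munakata framework and not the ID/ED model: the names `Module`, `size`, and `gain` are hypothetical, the weights are random rather than learned, and the modules are simply chained in a feedforward line.

```python
import math
import random


class Module:
    """A layer of simple units; `size` and `gain` are the tuneable parameters."""

    def __init__(self, n_inputs: int, size: int, gain: float, rng: random.Random):
        self.size = size    # number of processing units (stand-in for region size)
        self.gain = gain    # scales incoming signals (stand-in for connection quality)
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                        for _ in range(size)]

    def forward(self, inputs: list[float]) -> list[float]:
        # Each unit computes a weighted sum of its inputs and squashes it.
        return [math.tanh(self.gain * sum(w * x for w, x in zip(row, inputs)))
                for row in self.weights]


def build_network(n_inputs: int, hidden_sizes: list[int], n_outputs: int,
                  gain: float = 1.0, seed: int = 0) -> list[Module]:
    """Chain modules: input -> hidden modules -> output."""
    rng = random.Random(seed)
    modules, fan_in = [], n_inputs
    for size in hidden_sizes + [n_outputs]:
        modules.append(Module(fan_in, size, gain, rng))
        fan_in = size
    return modules


def run(network: list[Module], stimulus: list[float]) -> list[float]:
    """Pass a stimulus through every module in turn."""
    activity = stimulus
    for module in network:
        activity = module.forward(activity)
    return activity


stimulus = [0.5, -0.2, 0.9]
small = build_network(3, hidden_sizes=[4, 4], n_outputs=2)
large = build_network(3, hidden_sizes=[16, 16], n_outputs=2)
weak = build_network(3, hidden_sizes=[4, 4], n_outputs=2, gain=0.0)

print(run(small, stimulus))
print(run(large, stimulus))
print(run(weak, stimulus))   # a gain of 0 silences every connection
```

In an actual modelling study the weights would be trained on each simulated task, and task performance would then be compared across settings of parameters such as these. The sketch shows only where such parameters would sit.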
A simple qualitative analysis suggests that neural network models can account for some of the major findings regarding g. For example, consider three of the tasks included in the battery whose data are shown in Table 19.1b: 3-back working memory, categorical spatial encoding, and coordinate spatial encoding. The second and third tasks, which involve deciding whether a dot is above or below a line (categorical encoding), or whether it is within a certain precise distance from the line (coordinate encoding), had the lowest g-loadings of the battery (.20 and .25). The 3-back working memory task (which was essentially the same task used by Gray et al., 2003) had the second highest g-loading (.46); the first principal component accounted for over three times as much variance in 3-back performance as it did in either spatial encoding task. Kosslyn, Chabris, Marsolek, and Koenig (1992) and Baker, Chabris, and Kosslyn (1999) implemented network simulations of the spatial encoding tasks, and were able to account for several aspects of human performance with a simple feedforward network consisting of input, output, and two "hidden" layers. O'Reilly and Munakata (2000, Chapter 9) implemented a simplified model of working memory involving four neural modules in addition to input and output, a level of complexity comparable to that of the ID/ED model in Figure 19.3 (but simpler than would be required for a full-scale 3-back task). As one would expect, simulating the more highly g-loaded task requires a more complex model; moreover, if all three tasks were implemented within a sufficiently sophisticated model – with "posterior cortex" corresponding to the hidden layers in the original models of the spatial tasks – there would be more potential sites of damage that would impair performance of the 3-back than the spatial encoding tasks. This is precisely what was found in the lesion studies of rat g that were mentioned earlier (Crinella & Yu, 1995; Thompson et al., 1990): the more highly g-loaded a task, the more anatomical locations at which damage impaired the task. And in humans, Colom, Jung, and Haier (2006) report that the g-loading of a WAIS-R subtest predicts the number of areas in the brain where grey matter concentration correlates with test performance. More complex models than that shown in Figure 19.3, including well over 200 processing units, can be readily simulated with currently available computational resources (e.g., O'Reilly & Frank, 2006). Similar models have been used to simulate the effects of Parkinson's disease on learning (Frank, Seeberger, & O'Reilly, 2004), self-organisation of task-specific rules in the frontal cortex (Rougier, Noelle, Braver, Cohen, & O'Reilly, 2005), the interactions of the hippocampus and cortex in recognition memory (Norman & O'Reilly, 2003), and the relationship between behavioural and fMRI data (e.g., Herd, Banich, & O'Reilly, 2006).
The discovery of parameters within these models that explain individual differences may suggest novel experiments in human subjects, including neuroimaging studies to test specific hypotheses about the activation of systems during g-related tasks, and the relationship between neural activity, g, and task performance (that is, the models may generate predictions that can be tested by studies designed similarly to Gray et al., 2003). Finally, the learning process in such a system can itself be modified, to properly test the hypothesis that differences in learning can account for differences in intelligence (Garlick, 2002).
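The variance comparison quoted earlier for the 3-back and spatial encoding tasks follows from squaring the g-loadings, since a task's squared loading on the first principal component is the share of its variance that the component accounts for. A short check, using only the three loadings given in the text (the dictionary keys are labels chosen here for illustration):

```python
# g-loadings reported in the text for three tasks in the battery.
# The text does not say which spatial task had .20 and which .25,
# so the assignment below simply follows the order in which they are listed.
loadings = {
    "3-back working memory": 0.46,
    "categorical spatial encoding": 0.20,
    "coordinate spatial encoding": 0.25,
}

# Share of each task's variance attributable to the first principal component.
variance_share = {task: loading**2 for task, loading in loadings.items()}

for task, share in variance_share.items():
    ratio = variance_share["3-back working memory"] / share
    print(f"{task}: {share:.3f} of variance (3-back is {ratio:.1f}x this)")
```

Whichever spatial task carried which loading, the 3-back share is roughly 3.4 and 5.3 times the two spatial shares, consistent with the "over three times" figure in the text.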

Conclusion

General intelligence is a property of the web of positive correlations among performance scores on cognitive tasks. The general factor that invariably emerges – in humans and possibly other species – is itself embedded in a larger web of correlations with basic properties of information processing (speed, working memory, control, learning) and of the brain (total and regional volume, white matter integrity, neurochemical concentrations). Surprisingly, these latter correlations are all within a relatively narrow range, with no single variable explaining more than 25% of the variance in intelligence. Nonetheless, they establish a biological basis for g that is firmer than that of any other human psychological trait. The challenges for future research on the Law of General Intelligence are to establish its species-universality, to discover its cognitive and neurobiological mechanisms, to quantify the relative importance of those different mechanisms, and to model the mechanisms underlying g with biologically plausible computer simulations.

As Jensen (1998b) pointed out, there is no real conflict between the notion of the mind as a collection of separate processes or modules and the notion of correlated individual differences in the efficiency of cognition. Evolution has provided us all with bodies constructed according to the same genetic design: We all have two arms, two legs, one liver, one heart, and so on, but some of us are faster, stronger, and healthier than others. The Law of General Intelligence does not stand in contradiction to the hypothesis of a mind that has evolved via adaptation to solve specific problems, any more than a "law of general athletic prowess" – that people who perform well in one sport will tend to perform well in others – clashes with the modular construction of human musculature. Biological studies indicate that individual genes and molecules have functions in multiple organ systems of the body, so it is natural to believe that within the brain, mechanisms will have considerable generality and overlap – it would be difficult for a biological system to develop for one specific purpose, except by piggybacking on other existing functions and using genes and pathways already established (for example, the Notch signalling pathway is involved in both cell differentiation during development and memory formation during adulthood; Costa, Drew, & Silva, 2005). Given this principle of the reuse of existing mechanisms for new functions, it is inevitable that variation in the existing mechanism will transfer to variation in the new one, yielding correlated individual differences in mental ability.
This outcome is in clear opposition to extreme notions of domain-specificity that assert each module's complete independence from other cognitive and neural mechanisms.

Although this chapter has made the case for a Law of General Intelligence, it will end with a caution against putting intelligence, IQ, or g on a pedestal above the many other dimensions along which individual human beings differ: creativity, personality, confidence, patience, ethicality, and the like. Intelligence may be the single best predictor of many life outcomes, but those of us who study intelligence should be especially vigilant against the tendency to associate it with moral worth or to exalt it as the only important human trait. Rather than rename other mental abilities like social skill as "intelligences" and pit them against general intelligence (Gardner, 1993; Goleman, 1995), we should study each for its own value in understanding the diversity of human behaviour.

Finally, research on individual differences in cognition is often decried because of the belief that such work inevitably implicates genetic mechanisms, and thus it will further the agenda of those who seek to discriminate on the basis of ethnicity (Chabris, 1998a). But a true understanding of differences among individuals is actually inimical to a racist agenda: People must be evaluated, rewarded, punished, and treated according to their own personal actions and abilities, not to those of whatever groups they have involuntarily joined. More information about how people differ in intelligence, and why, can only help to replace false beliefs with knowledge. The right response to the overuse of stereotypes is not to pretend that all people are the same, but to discover precisely how each person is truly unique.

Acknowledgments

Jonathon Schuldt provided invaluable research assistance and, along with Aerfen Whittle, Kirill Babikov, Jacob Sattelmair, Carrie Morris, Sarah Murphy, Lee Chung, and Thomas Jerde, assisted in collecting and processing data. Stephen M. Kosslyn and members of his laboratory provided useful suggestions. Michael Galsworthy, Jeremy Gray, Mark Moss, and Randy O'Reilly provided helpful advice and references. Merle Paule and his colleagues provided advice and data from their studies. And Maxwell Roberts proved to be an excellent (and extremely patient) editor. The author and this research were supported by a DCI Postdoctoral Fellowship, as well as an NSF ROLE grant to J. Richard Hackman and Stephen M. Kosslyn. This work is dedicated to the memory of Professor Sheldon H. White, Department of Psychology, Harvard University.

References

Allen, J. S., Bruss, J., Brown, C. K., & Damasio, H. (2005). Normal neuroanatomical variation due to age: The major lobes and a parcellation of the temporal region. Neurobiology of Aging, 26, 1245–1260.
Allman, J. M., Hakeem, A. Y., & Watson, K. K. (2002). Two phylogenetic specializations in the human brain. The Neuroscientist, 8, 335–346.
Allman, J. M., Watson, K. K., Tetreault, N. A., & Hakeem, A. Y. (2005). Intuition and autism: A possible role for Von Economo neurons. Trends in Cognitive Sciences, 9, 367–373.
Anastasi, A., Fuller, J. L., Scott, J. P., & Schmitt, J. R. (1955). A factor analysis of the performance of dogs on certain learning tests. Zoologica, 40, 33–46.
Anderson, B. (1993). Evidence from the rat for a general factor that underlies cognitive performance and that relates to brain size: Intelligence? Neuroscience Letters, 153, 98–102.
Anderson, B. (1995). Dendrites and cognition: A negative pilot study in the rat. Intelligence, 20, 291–308.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111, 1036–1060.
Andreasen, N. C., Flaum, M., Swayze 2nd, V., O'Leary, D. S., Alliger, R., Cohen, G., et al. (1993). Intelligence and brain structure in normal individuals. American Journal of Psychiatry, 150, 130–134.


Ardila, A. (1999). A neuropsychological approach to intelligence. Neuropsychology Review, 9, 117–136.
Ashburner, J., & Friston, K. J. (2000). Voxel-based morphometry – the methods. Neuroimage, 11, 805–821.
Bagg, H. J. (1920). Individual differences and family resemblances in animal behavior. Archives of Psychology, 43, 1–58.
Baker, D. P., Chabris, C. F., & Kosslyn, S. M. (1999). Encoding categorical and coordinate spatial relations without input–output correlations: New simulation models. Cognitive Science, 23, 33–51.
Bartholomew, D. J. (2004). Measuring intelligence: Facts and fallacies. Cambridge: Cambridge University Press.
Benjamin, D. J., Brown, S. A., & Shapiro, J. M. (2006). Who is "behavioral"? Cognitive ability and anomalous preferences. Unpublished manuscript.
Brand, C. (1996). The g factor. Chichester, UK: Wiley.
Budiansky, S. (1998). If a lion could talk: Animal intelligence and the evolution of consciousness. New York: Free Press.
Bush, E. C., & Allman, J. M. (2003). The scaling of white matter to grey matter in cerebellum and neocortex. Brain, Behavior, and Evolution, 61, 1–5.
Bush, E. C., & Allman, J. M. (2004). The scaling of frontal cortex in primates and carnivores. Proceedings of the National Academy of Sciences, 101, 3962–3966.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge: Cambridge University Press.
Chabris, C. F. (1998a). Does IQ matter? Commentary, 106, 13–23.
Chabris, C. F. (1998b). IQ since "The Bell Curve". Commentary, 106, 33–40.
Chandra, S. B. C., Hosler, J. S., & Smith, B. H. (2000). Heritable variation for latent inhibition and its correlation with reversal learning in honeybees (Apis mellifera). Journal of Comparative Psychology, 114, 86–97.
Clayton, N. S., & Emery, N. J. (2005). Corvid cognition. Current Biology, 15, 80–81.
Colom, R., Jung, R. E., & Haier, R. J. (2006). Distributed brain sites for the g-factor of intelligence. Neuroimage, 31, 1359–1365.
Conway, A. R., Kane, M. J., & Engle, R. W. (2003). Working memory capacity and its relation to general intelligence. Trends in Cognitive Sciences, 7, 547–552.
Coren, S. (1994). The intelligence of dogs: Canine consciousness and capabilities. New York: Free Press.
Cosmides, L., & Tooby, J. (1994). Beyond intuition and instinct blindness: The case for an evolutionarily rigorous cognitive science. Cognition, 50, 41–77.
Costa, R. M., Drew, C., & Silva, A. J. (2005). Notch to remember. Trends in Neurosciences, 28, 429–435.
Crawley, J. N., & Paylor, R. (1997). A proposed test battery and constellations of specific behavioral paradigms to investigate the behavioral phenotypes of transgenic and knockout mice. Hormones and Behavior, 31, 197–211.
Crinella, F. M., & Yu, J. (1995). Brain mechanisms in problem solving and intelligence: A replication and extension. Intelligence, 21, 225–246.
Cronbach, L. J. (1957). The two disciplines of scientific psychology. American Psychologist, 12, 671–684.
Daily, L. Z., Lovett, M. C., & Reder, L. M. (2001). Modeling individual differences in working memory performance: A source activation account in ACT-R. Cognitive Science, 25, 315–353.


Davis, B. D. (1983). Neo-Lysenkoism, IQ, and the press. The Public Interest, 74, 41–59.
Deary, I. J. (2000). Looking down on human intelligence: From psychometrics to the brain. Oxford: Oxford University Press.
Deary, I. J., Bastin, M. E., Alison, P., Clayden, J. D., Whalley, L. J., Starr, J. M., et al. (2006). White matter integrity and cognition in childhood and old age. Neurology, 66, 505–512.
Deary, I. J., Der, G., & Ford, G. (2001). Reaction times and intelligence differences: A population-based cohort study. Intelligence, 29, 389–399.
Deary, I. J., Whiteman, M. C., Starr, J. M., Whalley, L. J., & Fox, H. C. (2004). The impact of childhood intelligence on later life: Following up the Scottish mental surveys of 1932 and 1947. Journal of Personality and Social Psychology, 86, 130–147.
Dorus, S., Vallender, E. J., Evans, P. D., Anderson, J. R., Gilbert, S. L., Mahowald, M., et al. (2004). Accelerated evolution of nervous system genes in the origin of Homo sapiens. Cell, 119, 1027–1040.
Duncan, J., Burgess, P., & Emslie, H. (1995). Fluid intelligence after frontal lobe lesions. Neuropsychologia, 33, 261–268.
Duncan, J., Emslie, H., Williams, P., Johnson, R., & Freer, C. (1996). Intelligence and the frontal lobe: The organization of goal-directed behavior. Cognitive Psychology, 30, 257–303.
Duncan, J., & Owen, A. M. (2000). Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends in Neurosciences, 23, 475–483.
Emery, N. J., & Clayton, N. S. (2004). The mentality of crows: Convergent evolution of intelligence in corvids and apes. Science, 306, 1903–1907.
Evans, P. D., Gilbert, S. L., Mekel-Bobrov, N., Vallender, E. J., Anderson, J. R., Vaez-Azizi, L. M., et al. (2005). Microcephalin, a gene regulating brain size, continues to evolve adaptively in humans. Science, 309, 1717–1720.
Feldman, J., Kerr, B., & Streissguth, A. P. (1995). Correlational analyses of procedural and declarative learning performance. Intelligence, 20, 87–114.
Ferguson, H. J., Cobey, S., & Smith, B. H. (2001). Sensitivity to a change in reward is heritable in the honeybee, Apis mellifera. Animal Behaviour, 61, 527–534.
Fodor, J. A. (1983). The modularity of mind: An essay on faculty psychology. Cambridge, MA: MIT Press.
Frangou, S., Chitins, X., & Williams, S. C. (2004). Mapping IQ and gray matter density in healthy young people. Neuroimage, 23, 800–805.
Frank, M. J., Seeberger, L. C., & O'Reilly, R. C. (2004). By carrot or by stick: Cognitive reinforcement learning in Parkinsonism. Science, 306, 1940–1943.
Frederick, S. (2006). Cognitive reflection and decision making. Journal of Economic Perspectives, 19, 24–42.
Friedman, N. P., Miyake, A., Corley, R. P., Young, S. E., DeFries, J. C., & Hewitt, J. K. (2006). Not all executive functions are related to intelligence. Psychological Science, 17, 172–179.
Galsworthy, M. J., Paya-Cano, J. L., Liu, L., Monleon, S., Gregoryan, G., Fernandes, C., et al. (2005). Assessing reliability, heritability and general cognitive ability in a battery of cognitive tasks for laboratory mice. Behavior Genetics, 35, 675–692.
Galsworthy, M. J., Paya-Cano, J. L., Monleon, S., & Plomin, R. (2002). Evidence for general cognitive ability (g) in heterogeneous stock mice and an analysis of potential confounds. Genes, Brain, and Behavior, 1, 88–95.
Gardner, H. (1993). Frames of mind: The theory of multiple intelligences (2nd ed.). New York: Basic Books.
Garlick, D. (2002). Understanding the nature of the general factor of intelligence: The role of individual differences in neural plasticity as an explanatory mechanism. Psychological Review, 109, 116–136.
Garlick, D. (2003). Integrating brain science research with intelligence research. Current Directions in Psychological Science, 12, 185–189.
Geake, J. G., & Hansen, P. C. (2005). Neural correlates of intelligence as revealed by fMRI of fluid analogies. Neuroimage, 26, 555–564.
Gignac, G., Vernon, P. A., & Wickett, J. C. (2003). Factors influencing the relationship between brain size and intelligence. In H. Nyborg (Ed.), The scientific study of general intelligence: Tribute to Arthur R. Jensen (pp. 93–106). Amsterdam: Pergamon.
Gilbert, S. L., Dobyns, W. B., & Lahn, B. T. (2005). Genetic links between brain development and brain evolution. Nature Reviews Genetics, 6, 581–590.
Glabus, M. F., Horwitz, B., Holt, J. L., Kohn, P. D., Gerton, B. K., Callicott, J. H., et al. (2003). Interindividual differences in functional interactions among prefrontal, parietal and parahippocampal regions during working memory. Cerebral Cortex, 13, 1352–1361.
Goleman, D. (1995). Emotional intelligence: Why it can matter more than IQ. New York: Bantam.
Gong, Q. Y., Sluming, V., Mayes, A., Keller, S., Barrick, T., Cezayirli, E., et al. (2005). Voxel-based morphometry and stereology provide convergent evidence of the importance of medial prefrontal cortex for fluid intelligence in healthy adults. Neuroimage, 25, 1175–1186.
Gottfredson, L. S. (1997). Why g matters: The complexity of everyday life. Intelligence, 24, 79–132.
Gould, S. J. (1981). The mismeasure of man. New York: Norton.
Gray, J. R., Chabris, C. F., & Braver, T. S. (2003). Neural mechanisms of general fluid intelligence. Nature Neuroscience, 6, 316–322.
Gray, J. R., & Thompson, P. M. (2004). Neurobiology of intelligence: Science and ethics. Nature Reviews Neuroscience, 5, 471–482.
Grudnik, J. L., & Kranzler, J. H. (2001). Meta-analysis of the relationship between intelligence and inspection time. Intelligence, 29, 523–535.
Gunning-Dixon, F. M., & Raz, N. (2000). The cognitive correlates of white matter abnormalities in normal aging: A quantitative review. Neuropsychology, 14, 224–232.
Haier, R. J. (1993). Cerebral glucose metabolism and intelligence. In P. Vernon (Ed.), Biological approaches to the study of human intelligence (pp. 317–332). Norwood, NJ: Ablex.
Haier, R. J., White, N. S., & Alkire, M. T. (2003). Individual differences in general intelligence correlate with brain function during nonreasoning tasks. Intelligence, 31, 429–441.
Haier, R. J., Jung, R. E., Yeo, R. A., Head, K., & Alkire, M. T. (2004). Structural brain variation and general intelligence. Neuroimage, 23, 425–433.
Hansell, N. K., Wright, M. J., Luciano, M., Geffen, G. M., Geffen, L. B., & Martin, N. G. (2005). Genetic covariation between event-related potential (ERP) and behavioral non-ERP measures of working-memory, processing speed, and IQ. Behavior Genetics, 35, 695–706.
Hare, B., & Tomasello, M. (2005). Human-like social skills in dogs? Trends in Cognitive Sciences, 9, 439–444.
Heinrich, B., & Bugnyar, T. (2005). Testing problem-solving in ravens: String-pulling to reach food. Ethology, 111, 962–976.
Herd, S. A., Banich, M. T., & O'Reilly, R. C. (2006). Neural mechanisms of cognitive control: An integrative model of Stroop task performance and fMRI data. Journal of Cognitive Neuroscience, 18, 22–32.
Herndon, J. G., Moss, M. B., Rosene, D. L., & Killiany, R. J. (1997). Patterns of cognitive decline in aged rhesus monkeys. Behavioural Brain Research, 87, 25–34.
Herrnstein, R. J., & Murray, C. (1994). The bell curve: Intelligence and class structure in American life. New York: Free Press.
Horn, N. R., Dolan, M., Elliott, R., Deakin, J. F., & Woodruff, P. W. (2003). Response inhibition and impulsivity: An fMRI study. Neuropsychologia, 41, 1959–1966.
Irvine, S. H., & Berry, J. W. (Eds.). (1988). Human abilities in cultural context. Cambridge: Cambridge University Press.
James, W. (1890). The principles of psychology. New York: Henry Holt.
Jensen, A. R. (1998a). Response to "IQ since The Bell Curve". Commentary, 106, 20–21.
Jensen, A. R. (1998b). The g factor: The science of mental ability. Westport, CT: Praeger.
Jensen, A. R. (1998c). The suppressed relationship between IQ and the reaction time slope parameter of the Hick function. Intelligence, 26, 43–52.
Jung, R. E., Brooks, W. M., Yeo, R. A., Chiulli, S. J., Weers, D. C., & Sibbitt, Jr, W. L. (1999a). Biochemical markers of intelligence: A proton MR spectroscopy study of normal human brain. Proceedings of the Royal Society of London B, 266, 1375–1379.
Jung, R. E., Yeo, R. A., Chiulli, S. J., Sibbitt, Jr, W. L., Weers, D. C., Hart, B. L., et al. (1999b). Biochemical markers of cognition: A proton MR spectroscopy study of normal human brain. Neuroreport, 10, 3327–3331.
Jung, R. 
E., Haier, R. J., Yeo, R. A., Rowland, L. M., Petropoulos, H., Levine, A. S., et al. (2005). Sex differences in N-acetylaspartate correlates of general intelligence: An 1H-MRS study of normal human brain. Neuroimage, 26, 965±972. Kane, M. J., & Engle, R. W. (2002). The role of prefrontal cortex in workingmemory capacity, executive attention, and general ¯uid intelligence: An individual-differences perspective. Psychonomic Bulletin and Review, 9, 637±671. Klingberg, T., Hedehus, M., Temple, E., Salz, T., Gabrieli, J. D., Moseley, M. E., et al. (2000). Microstructure of temporo-parietal white matter as a basis for reading ability: Evidence from diffusion tensor magnetic resonance imaging. Neuron, 25, 493±500. Kolata, S., Light, K., Townsend, D. A., Hale, G., Grossman, H. C., & Matzel, L. D. (2005). Variations in working memory capacity predict individual differences in general learning abilities among genetically diverse mice. Neurobiology of Learning and Memory, 84, 241±246. Korb, K. B. (1994). Stephen Jay Gould on intelligence. Cognition, 52, 111±123.

19. The Law of General Intelligence

487

Kosslyn, S. M., Brunn, J. L., Cave, K. R., & Wallach, R. W. (1984). Individual differences in visual imagery: A computational analysis. Cognition, 18, 195±243. Kosslyn, S. M., Chabris, C. F., Marsolek, C. J., & Koenig, O. (1992). Categorical versus coordinate spatial relations: Computational analyses and computer simulations. Journal of Experimental Psychology: Human Perception and Performance, 18, 562±577. Kyllonen, P. C., & Christal, R. E. (1990). Reasoning ability is (little more than) working-memory capacity? Intelligence, 14, 389±433. Larson, G. E., Haier, R. J., LaCasse, L., & Hazen, K. (1995). Evaluaton of a ``mental effort'' hypothesis for correlations between cortical metabolism and intelligence. Intelligence, 21, 267±278. Lee, K. H., Choi, Y. Y., Gray, J. R., Cho, S. H., Chae, J. H., Lee, S., et al. (2005). Neural correlates of superior intelligence: Stronger recruitment of posterior parietal cortex. Neuroimage, 29, 578±586. Livesey, P. J. (1970). A consideration of the neural basis of intelligent behavior: Comparative studies. Behavioral Science, 15, 164±170. Locurto, C. (1997). On the comparative generality of g. In W. Tomic, & J. Kigman (Eds.), Advances in cognition and education, Vol. 4: Re¯ections on the concept of intelligence (pp. 79±100). Greenwich, CT: JAI Press. Locurto, C., Benoit, A., Crowley, C., & Miele, A. (2005). The structure of individual differences in batteries of rapid acquisition tasks in mice. Unpublished manuscript, Department of Psychology, College of the Holy Cross. Locurto, C., Fortin, E., & Sullivan, R. (2003). The structure of individual differences in heterogeneous stock mice across problem types and motivational systems. Genes, Brain, and Behavior, 2, 40±55. Locurto, C., & Scanlon, C. (1998). Individual differences and a spatial learning factor in two strains of mice. Journal of Comparative Psychology, 112, 344±352. McDaniel, M. A. (2005). 
Big-brained people are smarter: A meta-analysis of the relationship between in vivo brain volume and intelligence. Intelligence, 33, 337±346. McGeorge, P., Crawford, J. R., & Kelly, S. W. (1997). The relationships between psychometric intelligence and learning in an explicit and an implicit task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 239±245. MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83±104. Mackintosh, N. J. (1998). IQ and human intelligence. Oxford: Oxford University Press. Macklin, M. L., Metzger, L. J., Litz, B. T., McNally, R. J., Lasko, N. B., Orr, S. P., et al. (1998). Lower precombat intelligence is a risk factor for posttraumatic stress disorder. Journal of Consulting and Clinical Psychology, 66, 323±326. Macphail, E. M. (1987). The comparative psychology of intelligence. Behavioral and Brain Sciences, 10, 645±656. Madden, D. J., Whiting, W. L., Huettel, S. A., White, L. E., MacFall, J. R., & Provenzale, J. M. (2004). Diffusion tensor imaging of adult age differences in cerebral white matter: Relation to response time. Neuroimage, 21, 1174±1181. Marino, L. (2002). Convergence of complex cognitive abilities in cetaceans and primates. Brain, Behavior and Evolution, 59, 21±32. Marino, L. (2004). Dolphin cognition. Current Biology, 14, 910±911.

488

Chabris

Marner, L., Nyengaard, J. R., Tang, Y., & Pakkenberg, B. (2003). Marked loss of myelinated nerve ®bers in the human brain with age. Journal of Comparative Neurology, 462, 144±152. Matzel, L. D., & Gandhi, C. C. (2000). The tractable contribution of synapses and their component molecules to individual differences in learning. Behavioural Brain Research, 110, 53±66. Matzel, L. D., Han, Y. R., Grossman, H. C., Karnik, M. S., Patel, D., Scott, N., et al. (2003). Individual differences in the expression of a ``general'' learning ability in mice. Journal of Neuroscience, 23, 6423±6433. Mekel-Bobrov, N., Gilbert, S. L., Evans, P. D., Vallender, E. J., Anderson, J. R., Hudson, R. R., et al. (2005). Ongoing adaptive evolution of ASPM, a brain size determinant in Homo sapiens. Science, 309, 1720±1722. Milgram, N. W., Head, E., Zicker, S. C., Ikeda-Douglas, C. J., Murphey, H., Muggenburg, B. A., et al. (2005). Learning ability in aged beagle dogs is preserved by behavioral enrichment and dietary forti®cation: A two-year longitudinal study. Neurobiology of Aging, 26, 77±90. Miller, E. M. (1994). Intelligence and brain myelination: A hypothesis. Personality and Individual Differences, 17, 803±832. Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex ``frontal lobe'' tasks: A latent variable analysis. Cognitive Psychology, 41, 49±100. Newell, A. (1990). Uni®ed theories of cognition. Cambridge, MA: Harvard University Press. Nippak, P. M., & Milgram, N. W. (2005). An investigation of the relationship between response latency across several cognitive tasks in the beagle dog. Progress in Neuropsychopharmacology and Biological Psychiatry, 29, 371±377. Norman, K. A., & O'Reilly, R. C. (2003). Modeling hippocampal and neocortical contributions to recognition memory: A complementary-learning-systems approach. Psychological Review, 110, 611±646. O'Reilly, R. 
C., & Frank, M. J. (2006). Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Computation 18, 283±328. O'Reilly, R. C., & Munakata, Y. (2000). Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. Cambridge, MA: MIT Press. O'Reilly, R. C., Noelle, D. C., Braver, T. S., & Cohen, J. D. (2002). Prefrontal cortex and dynamic categorization tasks: Representational organization and neuromodulatory control. Cerebral Cortex, 12, 246±257. Pakkenberg, B., & Gundersen, H. J. (1997). Neocortical neuron number in humans: Effect of sex and age. Journal of Comparative Neurology, 384, 312±320. Pakkenberg, B., Pelvig, D., Marner, L., Bundgaard, M. J., Gundersen, H. J., Nyengaard, J. R., et al. (2003). Aging and the human neocortex. Experimental Gerontology, 38, 95±99. Paule, M. G. (1990). Use of the NCTR operant test battery in nonhuman primates. Neurotoxicology and Teratology, 12, 413±418. Paule, M. G., Chelonis, J. G., Buffalo, E. A., Blake, D. J., & Casey, P. H. (1999). Operant test battery performance in children: Correlation with IQ. Neurotoxicology and Teratology, 21, 223±230.

19. The Law of General Intelligence

489

Plomin, R. (2001). The genetics of g in human and mouse. Nature Reviews Neuroscience, 2, 136±141. Plomin, R., & Kosslyn, S. M. (2001). Genes, brain and cognition. Nature Neuroscience, 4, 1153±1154. Reber, A. S., Walkenfeld, F. F., & Hernstadt, R. (1991). Implicit and explicit learning: Individual differences and IQ. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 888±896. Ree, M. J., & Earles, J. A. (1991). The stability of g across different methods of estimation. Intelligence, 15, 271±278. Ree, M. J., & Earles, J. A. (1992). Intelligence is the best predictor of job performance. Current Directions in Psychological Science, 1, 86±89. Reed, T. E., Vernon, P. A., & Johnson, A. M. (2004). Con®rmation of correlation between brain nerve conduction velocity and intelligence level in normal adults. Intelligence, 32, 563±572. Reuning, H. (1988). Testing Bushmen in the Central Kalahari. In S. H. Irvine & J. W. Berry (Eds.), Human abilities in cultural context (pp. 453±486). Cambridge: Cambridge University Press. Roth, G., & Dicke, U. (2005). Evolution of the brain and intelligence. Trends in Cognitive Sciences, 9, 250±257. Rougier, N. P., Noelle, D. C., Braver, T. S., Cohen, J. D., & O'Reilly, R. C. (2005). Prefrontal cortex and ¯exible cognitive control: Rules without symbols. Proceedings of the National Academy of Sciences, 102, 7338±7343. Schmithorst, V. J., Wilke, M., Dardzinski, B. J., & Holland, S. K. (2005). Cognitive functions correlate with white matter architecture in a normal pediatric population: A diffusion tensor MRI study. Human Brain Mapping, 26, 139±147. Schoenemann, P. T., Sheehan, M. J., & Glotzer, L. D. (2005). Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nature Neuroscience, 8, 242±252. Schretlen, D., Pearlson, G. D., Anthony, J. C., Aylward, E. H., Augustine, A. M., Davis, A. et al. (2000). 
Elucidating the contributions of processing speed, executive ability, and frontal lobe volume to normal age-related differences in ¯uid intelligence. Journal of the International Neuropsychological Society, 6, 52±61. Shenkin, S. D., Bastin, M. E., MacGillivray, T. J., Deary, I. J., Starr, J. M., Rivers, C. S., et al. (2005). Cognitive correlates of cerebral white matter lesions and water diffusion tensor parameters in community-dwelling older people. Cerebrovascular Diseases, 20, 310±318. Shenkin, S. D., Bastin, M. E., MacGillivray, T. J., Deary, I. J., Starr, J. M., & Wardlaw, J. M. (2003). Childhood and current cognitive function in healthy 80year-olds: A DT-MRI study. Neuroreport, 14, 345±349. Shirley, M. (1928). Studies in activity: IV. The relation of activity to maze learning and to brain weight. Journal of Comparative Psychology, 8, 187±195. Sohn, M.-H., Ursu, S., Anderson, J. R., Stenger, V. A., & Carter, C. S. (2000). The role of prefrontal cortex and posterior parietal cortex in task-switching. Proceedings of the National Academy of Sciences, 97, 13448±13453. Spearman, C. (1904). ``General intelligence,'' objectively determined and measured. American Journal of Psychology, 15, 201±293. Spearman, C. (1927). The abilities of man: Their nature and measurement. Oxford: Macmillan.

490

Chabris

Stevens, J. R., Hallinan, E. V., & Hauser, M. D. (2005). The ecology and evolution of patience in two New World monkeys. Biological Letters, 1, 223±226. Taffe, M. A., Weed, M. R., Gutierrez, T., Davis, S. A., & Gold, L. H. (2004). Modeling a task that is sensitive to dementia of the Alzheimer's type: Individual differences in acquisition of a visuo-spatial paired-associate learning task in rhesus monkeys. Behavioural Brain Research, 149, 123±133. Tang, Y. P., Shimizu, E., Dube, G. R., Rampon, C., Kerchner, G. A., Zhuo, M., et al. (1999). Genetic enhancement of learning and memory in mice. Nature, 401, 63±69. Tapp, P. D., Head, K., Head, E., Milgram, N. W., Muggenburg, B. A., & Su, M. Y. (2006). Application of an automated voxel-based morphometry technique to assess regional gray and white matter brain atrophy in a canine model of aging. Neuroimage, 29, 234±244. Tapp, P. D., Siwak, C. T., Gao, F. Q., Chiou, J. Y., Black, S. E., Head, E., et al. (2004). Frontal lobe volume, function, and beta-amyloid pathology in a canine model of aging. Journal of Neuroscience, 24, 8205±8213. Taylor, R. L., & Ziegler, E. W. (1987). Comparison of the ®rst principal factor on the WISC across ethnic groups. Educational and Psychological Measurement, 47, 691±694. Teigen, K. H. (2002). One hundred years of laws in psychology. American Journal of Psychology, 115, 103±118. Thomas, M., & Karmiloff-Smith, A. (2003). Connectionist models of development, developmental disorders and individual differences. In R. J. Sternberg, J. Lautrey, & T. Lubart (Eds.), Models of intelligence: International perspectives (pp. 133±150). Washington, DC: American Psychological Association. Thompson, P. M., Cannon, T. D., Narr, K. L., van Erp, T., Poutanen, V. P., Huttunen, M., et al. (2001). Genetic in¯uences on brain structure. Nature Neuroscience, 4, 1253±1258. Thompson, R. M., Crinella, F. M., & Yu, J. (1990). Brain mechanisms in problem solving and intelligence: A lesion survey of the rat brain. 
New York: Plenum. Tuch, D. S., Salat, D. H., Wisco, J. J., Zaleta, A. K., Hevelone, N. D., & Rosas, H. D. (2005). Choice reaction time performance correlates with diffusion anisotropy in white matter pathways supporting visuospatial attention. Proceedings of the National Academy of Sciences, 102, 12212±12217. Turkheimer, E. (2000). Three laws of behavior genetics and what they mean. Current Directions in Psychological Science, 9, 160±164. Walhovd, K. B., Fjell, A. M., Reinvang, I., Lundervold, A., Fischl, B., Salat, D. H. et al. (2005). Cortical volume and speed-of-processing are complementary in prediction of performance intelligence. Neuropsychologia, 43, 704±713. Warren, J. M. (1961). Individual differences in discrimination learning by cats. Journal of Genetic Psychology, 98, 89±93. Warrington, E. K., James, M., & Maciejewski, C. (1986). The WAIS as a lateralizing and localizing diagnostic instrument: A study of 656 patients with unilateral cerebral lesions. Neuropsychologia, 24, 223±239. Weed, M. R., Taffe, M. A., Polis, I., Roberts, A. C., Robbins, T. W., Koob, G. F., et al. (1999). Performance norms for a rhesus monkey neuropsychological testing battery: Acquisition and long-term performance. Cognitive Brain Research, 8, 185±201.

19. The Law of General Intelligence

491

Wilke, M., Sohn, J. H., Byars, A. W., & Holland, S. K. (2003). Bright spots: Correlations of grey matter volume with IQ in a normal pediatric population. Neuroimage, 20, 202±215. Witelson, S. F., Beresh, H., & Kigar, D. L. (2005). Intelligence and brain size in 100 postmortem brains: Sex, lateralization and age factors. Brain, 129, 386±398.

Author index

Abad, F. J., 437 Abbott, D. F., 158 Ackerman, P. L., 331, 333, 346, 393 Ackinclose, C. C., 219 Adams, E., 96 Adey, P., 20, 375±376, 378±380 Adler, N. E., 405 Aglioti, S. M., 170 Aharon-Peretz, J., 158 Ahern, F., 439 Ahn, W. K., 120 Aiello, L. C., 415 Akire, M. T., 471 Alexander, M. P., 153, 157±158 Alison, P., 476 Alkire, M. T., 260, 463±464 Allen, J. S., 476 Alliger, R., 461 Allman, J. M., 475 Almor, A., 216, 250 Anastasi, A., 458 Anderson, A. K., 158, 167 Anderson, B., 459, 473±474 Anderson, J., 307 Anderson, J. R., 122, 417, 470, 476, 478 Anderson, M., 342±343, 346, 370, 383 Anderson, P., 278 Anderson, S. W., 224, 267 Anderson, V., 157 Andreasen, N. C., 461

Andreiuolo, P. A., 165 Andrews, G., 216, 218±219, 221±222, 224±225, 227, 276 Andrich, D., 374 Angleitner, A., 440 Antell, S., 318 Anthony, J. C., 476 Anton, J. L., 170 Arbib, M. A., 201 Arbuthnot, J., 169 Archambault, F. X., 169 Archibald, S., 160 Ardila, A., 452 Armstrong, S. L., 122 Arnheim, R., 358 Arvey, R. D., 398, 401 Ashburner, J., 463 Ashby, F. G., 105, 122 Aslin, R., 299, 319 Athias, R., 59 Atran, S., 105, 109±110, 112, 120±121, 124 Augustine, A. M., 476 Avenanti, A., 170 Axelrod, R., 42 Aylward, E. H., 476 Ayton, P., 134, 141, 143±144 Bachevalier, J., 224 Backscheider, A. G., 309, 315 Baddeley, A. D., 213, 227, 373 Baer, J., 351, 360 Bagg, H. J., 456, 458 Baillargeon, R., 301

Bain, J. D., 216, 218±219, 221 Baird, J. A., 279 Baker, D., 169 Baker, D. P., 478 Baker, J. G., 32 Baker, L. A., 435, 441 Baker, R., 221 Baker, S. P., 402, 404±405 Baksh, M., 408, 413, 416, 420 Banerji, N., 333 Banich, M. T., 480 Bar-Hillel, M., 72 Bar-On, R., 268 Barkley, R. A., 278 Barkow, J. H., 59 Baron-Cohen, S., 155±160, 169, 263±265, 267±268, 275±276, 393 Baroody, A., 294, 304, 310, 312±315 Barrett, H. C., 169 Barrick, M. R., 331, 346, 393 Barrick, T., 463 Barron, F. X., 353, 361 Barrouillet, P., 245 Barsalou, L. W., 113, 118 Barss, P., 402, 413 Bartels, M., 434 Bartholomew, D. J., 435, 461 Basinger, K. S., 168 Basola, B., 146 Bastin, M. E., 466, 476 Bates, E. A., 261 Bates, T. C., 361

Author index Bauer, P. J., 303 Bauer, R. M., 155 Beattie, K., 203 Beauchamp, C. M., 441, 444 Bechara, A., 224, 267±268 Beer, J. S., 169 Behrmann, M., 268 Beller, S., 244 Belleville, S., 266 Benedetto, E., 156 Benjamin, D. J., 455, 477 Bennett, J., 95, 98 Benoit, A., 458 Benson, D. F., 167 Beresh, H., 462 Berg, S., 439 Berger, B. D., 158 Berlin, B., 110 Bernstein, M., 356 Berry, J. W., 454 Beth, E. W., 236 Bettman, J. R., 143 Bibby, H., 157 Biederman, J., 281 Billman, D. O., 305 Binet, A., 219, 369, 373±374 Birnbaum, M. H., 72 Birney, D. P., 222 Bjorklund, D. F., 280 Black, J., 237, 242±243 Black, S. E., 166±167, 460, 473 Blacker, D. M., 355 Blair, R. J., 159 Blake, D. J., 457 Blakeslee, S., 267 Blaye, A., 242, 247±248 Blinkhorn, S., 331, 346 Bloom, L., 316 Bloom, P., 111, 113, 316 Bloomer, R. H., 169 Blumberg, M. S., 319 Boag, C., 222 Boesch, C., 415 Bonini, N., 87 Boomsma, D. I., 434, 437±438, 440±441 Boone, K., 167 Borges, B., 136 Borkenau, P., 440 Bothell, D., 478

Bouchard, T. J., Jr., 355, 414 Bourne, L. E., Jr., 238 Bowden, D., 222, 224±225, 276 Boyce, W. T., 405 Boyd, M., 136 Boyer, P., 169 Boyes-Braem, P., 120 Braden, J. A., 429 Bradshaw, G. F., 352 Bradshaw, G. L., 352 Braine, M. D. S., 14, 59, 78, 235±236, 245 Braisby, N. R., 116 Bramati, I. E., 153, 165 Brammer, M. J., 158±159 Brand, C., 451 Brandao, M. C., 59 Brase, G. L., 59, 62, 84, 89±91 Braver, T. S., 394, 462, 467±471, 476, 478±480 Breedlove, D., 110 Brehmer, B., 334 Brem, S., 219 Breton, C., 278, 280 Brink, J., 166 BroÈder, A., 137±138, 142±146 Brody, N., 339±340, 444 Bronfenbrenner, U., 32, 443±444 Brooks, P. J., 59, 61, 77 Brooks, W. M., 466 Brown, A. L., 30±31, 305 Brown, C. K., 476 Brown, E., 301 Brown, J. R., 283 Brown, M., 205 Brown, S. A., 455, 477 Brownell, H., 157, 160 Bruner, J., 375 Brunn, J. L., 454 Brunswick, N., 159 Brunswik, E., 133 Bruss, J., 476 Bryant, P., 17 Bryson, J. J., 223 Bryson, S. E., 267±268 Budiansky, S., 461 Bueti, D., 170 Buffalo, E. A., 457 Buffardi, L. C., 406

493

Bugnyar, T., 460 Bullmore, E. T., 158±159 Bullock, P. R., 157±158 Bulman-Fleming, M. B., 267 Bunch, K. M., 222, 224±225, 276 Bundgaard, M. J., 474 Bunge, S., 262 Burack, J. A., 156, 267 Burgess, P., 461, 467 Burns, N. R., 440 Burrmann, D., 205 Burt, C., 369 Busemeyer, J. R., 146 Bush, E. C., 475 Buss, D., 131 Byars, A. W., 463 Byrne, M. D., 478 Byrne, R. M. J., 14, 40, 78, 185, 193, 219, 233, 236, 243±244, 246 Byrnes, J. P., 237, 242±243 Cahill, L., 260 Calder, A., 158, 268 Callicott, J. H., 471 Campbell, A., 166 Campbell, B. G., 414 Campbell, D. T., 359 Campbell, F. A., 429 Campbell, J. P., 391, 393 Canessa, N., 183 Cannon, T. D., 462±463, 477 Cantor, J. B., 61, 77 Caporael, L. R., 260 Cappa, S., 183 Cara, F., 54±55, 56, 242, 246, 250, 258±261 Caramazza, A., 110 Cardon, L. R., 439 Carey, G., 439 Carey, S., 113±114, 125, 301, 311, 319 Carling, E., 334 Carlson, S. M., 153, 225, 276, 278, 280±282, 284±285 Carpenter, P. A., 25, 28, 333 Carraher, D. W., 336±338

494

Author index

Carroll, J. B., 370, 383, 390, 437, 439, 449 Carruthers, P., 2, 223, 228 Carson, S., 361 Carter, A., 258, 277±278 Carter, C. S., 470 Case, R., 156±157, 222, 373, 376 Casey, P. H., 457 Caspi, A., 430, 432 Cassidy, S. B., 205 Casto, S. D., 439 Cattell, R. B., 433 Cavanaugh, J., 280 Cave, K. R., 454 Caverni, J.-P., 93 Ceci, S. J., 32, 330±331, 334, 337, 339±340, 372, 443±444 Cezayirli, E., 463 Chabris, C. F., 394, 452, 455, 462, 467±471, 476, 478, 480±481 Chae, J. H., 470±471 Chaigneau, S. E., 118 Chalmers, M., 201 Chan, A., 262 Chandler, M. J., 276 Chandra, S. B. C., 459 Chang, F.-M., 281 Changeux, J.-P., 223, 226 Channon, S., 158, 167 Chao, S., 246±247, 249 Chapell, M. S., 241, 243 Charman, T., 156, 264 Charness, N., 343±344 Chase, V., 132 Chater, N., 97, 137±138, 183, 216, 246 Chelonis, J. G., 457 Chenevert, T. L., 227 Cheng P. W., 13±14, 23, 39±44, 46, 48, 70, 77±78, 123, 216, 236, 246±247, 249±250 Cherney, S. C., 432 Cherubini, P., 46 Chesney, M. A., 405 Cheung, G., 166 Chi, M. T. H., 333 Chiappe, D., 261 Chiou, J. Y., 460, 473 Chitins, X., 463

Chiulli, S. J., 466 Cho, S. H., 470±471 Choi, J., 260 Choi, Y. Y., 470±471 Chomsky, N., 78, 202, 293, 371 Christal, R. E., 450, 467 Church, R. M., 299 Clark, A., 223 Claxton, L. J., 278, 281, 284±285 Clayden, J. D., 476 Clayton, N. S., 460 Clear®eld, M. W., 318 Clements, W., 283 Cobey, S., 459 Cohen, D. J., 275 Cohen, G., 461 Cohen, J. D., 153, 160, 165, 168, 478±480 Cohen, L., 226±227 Cohen, L. B., 303 Colby, A., 168 Coley, J. D., 110±112, 120, 124 Colom, R., 437, 480 Colombo, J., 428 Colvert, E., 266 Connelly, M. S., 360 Conway, A. R., 467 Cooper, L. A., 333 Cooper, R. G. Jr., 318 Copeland, B. J., 333 Coren, S., 455 Corley, R., 430, 432, 444 Corley, R. P., 466 Cosmides, L., 13, 15, 22, 39, 42±52, 54±55, 56, 59±65, 72, 74, 76±77, 83±84, 87, 89±91, 100, 131±132, 155, 161, 163, 182, 184, 197, 206, 216, 223, 236, 246, 249±250, 257±259, 261±262, 269, 330, 388, 416, 442, 449±450 Costa, D., 159 Costa, R. M., 481 Courchesne, E., 267 Court, J. H., 25, 333 Cowan, N., 221 Cox, C., 356 Cox, J. R., 22, 46, 182, 184

Craik, F. I. M., 159±160 Crawford, J. R., 428, 472 Crawford, S., 158, 167 Crawley, J. N., 460 Crinella, F. M., 459, 480 Croft, K., 275 Cronbach, L. J., 449 Cross, D., 223, 280 Crowley, C., 458 Crowley, K., 335 Crowther, R. D., 312 Cummings, J. L., 167 Cummins, D. D., 236, 246±247, 249 Cummins, T. D. R., 137, 143, 145±146 Currie, J., 374 Custance, D., 266 Cutting, J. E., 144 Czerlinski, J., 133 Daily, L. Z., 478 Dale, P. S., 437±438 Dalton, C., 222 Damasio, A. R., 166±167, 224, 267 Damasio, H., 224, 267, 476 Daniel, D. B., 243 Danna, M., 183 Darby, A., 167 Dardzinski, B. J., 462, 466 Darley, J. M., 153, 160, 165, 168 Darwin C., 261±262, 357 David, A. S., 169 Davidson, D., 206 Davidson, D. H., 281 Davidson, J. E., 354 Davidson, N., 360 Davies, W., 205 Davis, A., 476 Davis, B. D., 455 Davis, H. L., 285±286 Davis, S. A., 460 Dawkins, R., 261 de Bruyn, E., 443 de Geus, E. J. C., 440±441 de Oliveira-Souza, R., 153, 165 de Villiers, J. G., 180, 276 de Villiers, P. A., 180, 276

Author index de Vooght, A., 344 de Waal, F. B., 170 Deacon, T. W., 261 Deakin, J. F., 165, 170, 470 Deary, I. J., 374, 394, 400±401, 428, 433±434, 440, 452, 455, 462±463, 465±466, 476 Decety, J., 168±169 DeFries, J. C., 372, 393, 414, 430, 439, 444, 466 DeGroot, A. D., 429 DeGutis, J., 227 Dehaene, S., 223, 226±227, 293 DeLoache, J. S., 305±307 Demany, L., 319 Denburg, N. L., 267±268 Der, G., 462±463, 465 Detterman, D. K., 339, 434, 441 Dhami, M. K., 134, 143±144 Diakidoy, I.-A. N., 351 Diamond A., 278 Dias, M. G., 59, 61, 64, 76±77 Dicke, U., 475 Dickinson, A., 201 Dickson, L., 319 Dierckx, V., 336 Diesendruck, G., 111 Diessner, R., 160, 166 Dixon, M. J., 267 Dobyns, W. B., 476 Dockrell, J., 21 Dolan, C. V., 437±438 Dolan, M., 470 Dolan, R. J., 171, 224 Donaldson, M., 20 Donelan-McCall, N., 283 DoÈrner, D., 334 Dorus, S., 476 Douglass, S., 478 Doyle, A. E., 281 Drew, C., 481 Dror, I. E., 146 Dube, G. R., 460 Dubner, S. J., 374 Duchaine, B., 261 Dudink A. C. M., 28 Dunbar, K., 219±220

Dunbar, R. I. M., 393, 414 Duncan, J., 461, 466±467, 470 Dunn, J., 283 DupreÂ, J., 111 Durkin, K., 312 Eals, M., 260 Earles, J. A., 454±455 Eccles, M., 134 Edgington, D., 95 Edwards, A., 134 Edwards-Lee, T., 167 Eggleston, V. H., 313 Eimas, P. D., 114, 303 Ekman, P., 262 Eley, T. C., 437±438 Ell, P., 159 Elliott, R., 224, 470 Elman, J., 261 Elwyn, G., 134 Ely, T. D., 158 Emerson, M. J., 466 Emery, N. J., 460 Emslie, H., 461, 467, 470 Engle, R. W., 467, 471 English, L. D., 219 Epstein, R., 352 Ericsson, K. A., 329, 341±345, 354, 356 Erzinclioglu, S., 160 Eslinger, P. J., 153, 158, 165, 169 Evans, D. M., 438 Evans, J. St B. T., 23, 24, 40, 70, 77, 86, 94±96, 98±100, 199, 223, 225, 243, 246 Evans, P. D., 417, 476 Everett, B. A., 275 Everman, D. B., 205 Eysenck, H. J., 355, 360±361 Fairhall, S., 77 Faivre, I. A., 341±342 Falmagne, R. J., 236 Fan, J., 281 Faraone, S. V., 281 Farrelly, D., 24 Farrow, T. F., 165, 170 Fayers, K., 137, 142, 144 Fazio, F., 183

495

Feeney, A., 95 Feigenson, L., 226±227, 319 Feldman, J., 300, 472 Feltovich, P. J., 333 Ferguson, H. J., 459 Ferguson, R. W., 219 Fernandes, C., 456, 458 Fernandez, D., 140±141 Fernandez-Duque, D., 167, 279 Ferreira, S. M., 73 Fias, W., 226 Fiddick, L., 15, 22, 42, 44, 59±60, 76±77, 135, 161, 249, 330, 416 Fine, C., 159 Fineberg, E., 281 Finke, R. A., 358 Fischl, B., 477 Fisher, A. V., 308 Fivush, R., 157 Fjell, A. M., 477 Flaum, M., 461 Flavell, E. R., 275 Flavell, J. H., 275, 279 Fleishman, E. A., 406 Fletcher, P. C., 159 Fleury, M., 243 Fodor, J. A., 2, 56, 78, 83, 108, 154, 214, 223, 250, 257±258, 293, 311, 443, 450 Fogassi, L., 170 FOIRN-ISA, 64 Folkman, S., 405 Foltz, C., 243 Ford, G., 462±463, 465 Forster, B. B., 166 Fortier, L., 243 Fortin, E., 458 Fossella, J., 281 Fox, H. C., 394, 400±401, 455 Frangou, S., 463 Frank, M. J., 480 Franks, B., 116 Frederick, S., 99, 455 Freedman, M., 167±168 Freer, C., 461, 467, 470 Friedman, N. P., 466 Friedman, O., 155±156, 161, 164, 171, 223, 277±278

496

Author index

Friesen, W. V., 262 Friston, K. J., 463 Frith, C. D., 159, 170, 224 Frith, U., 159, 263±264, 266 Froman, R. D., 170 Frye, D., 21, 153, 156, 221, 225, 258, 267, 276±278 Fulker, D. W., 430, 439, 444 Fuller, D., 168 Fuller, J. L., 458 Fundele, R. H. H., 204 Funnell, E., 167 Furnham, A., 361 Furrow, D., 281 Gabrieli, J. D., 465 Gahl, S., 300 Gainer, P., 168 Galati, G., 170 Gallagher, H. L., 159 Gallagher, M., 158 Gallese, V., 170 Gallistel, C. R., 223, 226, 295 Gallup, G. G., 153, 157±158 Galsworthy, M. J., 456, 458 Galton, F., 353, 362 Gandhi, C. C., 459 Gao, F. Q., 460, 473 Garces, E., 374 Garcia, J., 269 Garcia, R., 236, 241 Gardner, H., 370, 391 Garlick, D., 472, 480±481 Garnier, H., 356 Gasser, M., 309 Gazzola, V., 170 Geake, J. G., 471 Geary, D. C., 387±388, 395, 414 Geffen, G. M., 435, 438, 440, 476 Geffen, L. B., 438, 440, 476 Gelman, R., 223, 226, 293±295, 316, 318±319 Gelman, S. A., 2, 105, 110, 112, 114±116, 276

Geng, J. J., 268 Gennari, S., 120, 126 Gentner, D., 219, 303, 305, 306±309, 312 Gentner, D. R., 219 Gerken, L. A., 300, 319 German, T. P., 155±156, 161, 164, 171, 194, 223±224, 268, 277 Gershkoff-Stowe, L., 315 Gerton, B. K., 471 Getzels, J., 353 Ghadirian, A.-M., 361 Ghiselin, M. T., 110 Gibbs, B., 299 Gibbs, J. C., 168 Gibson, J., 357 Gick, M. L., 15, 219, 333 Gigenenzer, G., 50, 62, 64, 72, 76, 83, 85, 88±91, 93, 131±137, 139±141, 145±146, 249±250, 388 Gignac, G., 462 Gilbert, S. L., 417, 476 Gilboa, A., 168 Gilly, M., 242, 247±248 Gilmore, D. J., 335 Gilmour, A., 21 Ginsburg, M. J., 402, 404±405 Girotto, V., 24, 44, 46, 54±56, 89, 93, 242, 246±250, 258±260 Glabus, M. F., 471 Glaser, R., 29, 333 Gleitman, H., 122 Gleitman, L. R., 122, 309, 315 Glendon, A. I., 406±407 Glotzer, L. D., 387, 394, 475 GluÈck, J., 243 Gobet, F., 343±344 Goel, V., 159 Gold, L. H., 460 Goldberg, E., 257, 265 Goldman, A. I., 276 Goldman, W. P., 167 Goldsher, D., 158 Goldstein, D. G., 133±137, 139±141, 145 Goldstone, R. L., 303, 305±306

Goleman, D., 481 Gomez, R. L., 300, 319 Gong, Q. Y., 463 Gonzalez, C., 334 Gonzalez, M., 89 Goodall, J., 415 Goodman, N., 106, 124 Goos, L., 204 Gopnik, A., 116, 118, 157, 275 Gordon, A. C., 156 Gordon, P., 310 Gordon, R. A., 396±397, 418±419 Gorini, A., 183 Gorno-Tempini, M. L., 167 GoÈssler, H., 243 Goswami, U., 29±31 Gottfredson, L. S., 332, 345±346, 392, 394, 396, 398, 400±401, 403, 455 Gottfried, G., 112 Gough, H. G., 353, 360 Gould, J., 263 Gould, S. J., 259, 330 Graf, P., 157 Grafman, J., 153, 165 Grafton, S. T., 158 Grattan, L. M., 169 Gray, J. R., 394, 462, 467±471, 476, 478, 480 Gray, R. D., 77 Gray, W. D., 120 Green, D. W., 23 Green, F. L., 275 Greene, J. D., 153, 160, 165±168 Green®eld, P., 201 Gregoire, P., 361 Gregory, C., 160 Gregoryan, G., 456, 458 Grice, H. P., 263 Griggs, R. A., 22, 23, 44±46, 50, 53, 182, 184, 236, 250 Grigorenko, E. L., 438 Groisser, D. B., 277 Grossman, H. C., 458, 473 Gruber, H. E., 358 Grudnik, J. L., 440, 462, 465 Guilford, J. P., 353, 370

Author index Gundersen, H. J., 474 Gunning-Dixon, F. M., 159, 462, 465 Gur, R. C., 159 Gustafson, L., 167 Gutierrez, T., 460 Gyllensten, U., 420 Hadjichristidis, C., 95, 99, 120 Hagen, E. P., 369 Haidt, J., 165±167 Haier R. J., 462±464, 466, 468, 471, 480 Hakeem, A. Y., 475 Hala, S., 278, 280 Hale, A. R., 406±407 Hale, G., 458, 473 Halford, G. S., 216±222, 224±225, 227, 276 Hall, D. G., 308 Hallett, M., 159 Hallinan, E. V., 475 Hamann, S. B., 158 Hamm, F., 195, 201 Hammond, K. R., 133 Hampton, J. A., 116 Han, K. S., 360 Han, Y. R., 458, 473 Handley, S. J., 24, 95, 99 Hansell, N. K., 476 Hansen, P. C., 471 Happaney, K., 267, 269 Happe, F., 159, 266 HappeÂ, F. G., 157, 160 Hare, B., 205, 460 Hare, R. D., 166 Harkness, A. R., 430 Harkness, K. L., 224 Harley, C., 24 Harries, C., 134, 143±144 Harrington, D. M., 353 Harris, P. L., 247, 249, 263, 265, 276 Hart, B. L., 466 Hart, D., 411 Hase, K., 356 Hasegawa, T., 249 Hastie, R., 118 Hauser, M. D., 475 Hayes, J. R., 352, 354 Hazen, K., 468 Head, E., 460, 473 Head, K., 460, 463±464

Heaney, M., 77 Heath, A. C., 435 Heaton, P., 341 Heavey, L., 341 Hedehus, M., 465 Heggestad, E. D., 331, 346, 393 Heider, K., 262 Heim, A. W., 29 Heinrich, B., 460 Heit, E., 120 Henderson, A., 278, 280 Herd, S. A., 480 Hermelin, B., 340±342, 346 Herndon, J. G., 459 Hernstadt, R., 472 Herrnstein, R. J., 370, 455, 477 Hersby, M., 137, 142, 144 Hertwig, R., 132 Hespos, S. J., 319 Hessan, D. J., 437 Hevelone, N. D., 465 Hewitt, G., 375 Hewitt, J., 375 Hewitt, J. K., 432, 466 Higgins, D. M., 361 Hill, K., 407, 409±410, 412±413, 415, 420 Hinvest, N., 137 Hiraishi, K., 249 Hirschfeld, L. A., 2, 114, 116 Hix, H. R., 278, 280 Ho, D. Y. F., 281 Ho, H. Z., 441 Hodges, J. R., 167 Hoffman, J. M., 469 Hoffman, M. L., 168 Hoffrage, U., 62, 64, 72, 76, 83, 89±91, 142 Hofstadter, D. R., 219 Hogarth, R. M., 145 Holland, J. H., 218±219, 359 Holland, P. C., 158 Holland, S. K., 462±463, 466 Holliday, T. W., 387, 415, 417 Holloway, R. L., 387, 413 Holt, J. L., 471

Holyoak, K. J., 13±15, 23, 39±44, 46, 48, 70, 77±78, 216, 218±219, 236, 246±247, 250, 305, 333 Horn, N. R., 470 Hornak, J., 267 Horvath, J. A., 331, 334 Horwitz, B., 471 Hosler, J. S., 459 Houser, D., 159 Howe, M. J. A., 331, 340±341, 343±344, 354 Howell, N., 408, 410±412, 415, 420 Howerter, A., 466 Howson, C., 91±92 Hudson, J. A., 157 Hudson, R. R., 417, 476 Huettel, S. A., 465 Hug, K., 50, 249±250 Hug, S., 278, 280 Hughes, C., 156±157, 191, 283±284 Hughes, M., 20 Humby, T., 205 Hume, D., 63 Hummel, J. E., 216, 218 Humphrey, N., 87 Humphreys, G. W., 154 Humphreys, L. G., 396 Hunsberger, B., 160, 166 Hunter, J. E., 345±346, 389, 406, 419 Hunter, M., 160 Hurtado, A. M., 407, 409±410, 412±413, 415, 420 Huttenlocher, J., 299, 306±307, 310, 313, 315, 318±319 Huttunen, M., 462±463, 477 Ikeda-Douglas, C. J., 460 Imai, M., 308 Ingman, M., 420 Inhelder, B., 236, 373, 378 Irvine, S. H., 454 Isles, A. R., 204 Iwasa, Y., 204

Jackson, E., 137 Jackson, P. L., 168±169 Jackson, P. W., 353 Jackson, S. L., 46 Jacques, S., 156, 267 James, M., 154±155, 467 James, W., 449, 459 Jarman, R. F., 279 Jaswal, G., 167 Jenkins, E. A., 334±337 Jenkins, L., 396±397 Jensen, A. R., 345, 370, 374, 389±390, 394±395, 421, 440, 450±452, 460, 470, 472, 481 Jensen, I., 222 Jerison, H. J., 387 Jin, L., 420 Johansen, M. J., 138 Johansson, B., 439 Johnson, A., 408, 413, 416, 420 Johnson, A. M., 462, 465 Johnson, C., 331 Johnson, D. M., 120 Johnson, E., 143 Johnson, M., 236 Johnson, M. H., 218, 227, 261 Johnson, R., 461, 467, 470 Johnson, S. P., 299, 319 Johnson, T. F., 360 Johnson-Laird, P. N., 14, 40, 78, 93, 184±185, 219, 233, 236, 244±246 Jones, S., 143 Jones, S. S., 315±316 Jones, T., 222, 224±225, 276 Jonides, J., 227 Jordan, G., 159 Jordan, N. C., 306 Jung, R. E., 462±464, 466, 480 Jungeblut, A., 396±397 Junn, E. N., 305 Juslin, P., 137±138, 143±144 Just, M. A., 25, 28, 333 Kacelnik, A., 145 Kaessmann, H., 420

Kahneman, D., 72, 87±88, 92±93, 96, 98±99, 100, 299 Kalish, C. W., 115, 121, 125 Kamin, L. J., 330 Kane, M. J., 305, 467, 471 Kanwisher, N., 268 Kaplan, H., 415 Kaplan, P. J., 169 Kareev, Y., 145 Karmiloff-Smith, A., 261, 265, 451, 472±473 Karnik, M. S., 458, 473 Karunadasa, D., 205 Kaufman, J. C., 351 Kavanaugh, R. D., 263 Kavjek, M., 428 Keane, J., 158, 268 Keating, D. P., 318 Kehrer-Sawatzki, H. W. V., 204 Keil, F. C., 105, 113±114, 121, 123±124 Kelemen, D., 114, 124±125 Keller, S., 463 Kellman, P. J., 301 Kelly, S. W., 472 Kemmelmeier, M., 56, 250 Kennedy, J. L., 281 Kerchner, G. A., 460 Kerr, A., 267, 269 Kerr, B., 472 Kessler, C. M., 23 Keysers, C., 170 Kidd, J. R., 281 Kidd, K. K., 281 Kiehl, K. A., 166 Kigar, D. L., 462 Killiany, R. J., 459 Kilpatrick, L., 260 Kilts, C. D., 158 Kimura, D., 261 Kirkham, N. Z., 299, 319 Kirsch, I. S., 396±397 Kivak, K. J., 281 Klaczynski, P. A., 243, 248 Klahr, D., 358 Klar, Y., 23, 250 Klein, G., 145

Kleiter, G., 88±90, 92 Klibanoff, R. S., 306 Klingberg, T., 465 Kloo, D., 224, 281, 283 Knapp, D. J., 391, 393 Knight, R. T., 157±158, 163, 263, 267±268 Koegel, L. K., 266 Koelling, R., 269 Koenig, O., 478 Kohlberg, L., 168 KoÈhler, W., 352 Kohn, P. D., 471 Kokinov, B. K., 219 Kolata, S., 458, 473 Kolstad, A., 396±397 Koob, G. F., 460 Korb, K. B., 452, 455 Kosmidis, H., 361 Kosslyn, S. M., 449, 454, 478 Kotovsky, L., 305±306 Koza, J. R., 359 Kraebel, K., 319 Kramer, J. H., 167, 169 Krampe, R. T., 329, 343±345 Kranzler, J. H., 440, 462, 465 Kraus, S., 135 Kripke, S., 115 Kroger, J. K., 46 Kroll, N., 163 Kruschke, J. K., 105, 122±123 Kyllonen, P. C., 25, 333, 450, 467 L'Hirondelle, N., 260 LaCasse, L., 468 Lagnado, D. A., 138 Lahn, B. T., 476 Lakoff, G., 236 Lamberts, K., 122 Lancaster, J., 415 Landau, B., 309, 315±316 Landau, K. R., 157 Landau, N., 375 Landry, R., 267 Lang, B., 156, 224, 278, 281, 283 Langdon, R., 275 Langley, P., 352 Langston, C., 99

Larkin, J., 97 Larkin, J. H., 333 Larkin, S., 376 Larking, R., 23 Larson, G. E., 468 Lasko, N. B., 455 Lave, J., 336, 338 Law, K., 389 Lawrence, E. J., 169 Leavitt, G. C., 260 Lebiere, C., 478 Lee, D. Y., 443 Lee, K., 281 Lee, K. H., 470–471 Lee, M. D., 137, 143, 145–146 Lee, S., 470–471 Leekam, S. R., 264, 285–286 Leevers, H. J., 29–30, 265 Legrenzi, M. S., 40, 93, 184 Legrenzi, P., 24, 40, 93, 184, 247 Lehmann, A. C., 343 Lemmon, H., 428 Leslie, A. M., 155–156, 161, 164, 171, 180, 202, 223–224, 263–265, 268, 276–278, 285–286, 293, 299 Levenson, R. L., 169 Levidow, B. B., 219 Levin, D. T., 300 Levine, A. S., 462, 466 Levine, B., 157, 159, 166, 168 Levine, S. C., 299, 306–307, 310, 313, 315, 318–319 Levitt, S. D., 374 Lewkowicz, D. J., 319 Lewontin, R. C., 259, 330 Leyden, G., 344 Li, G., 402, 404–405 Liberman, N., 23, 250 Lien, Y., 123 Light, K., 458, 473 Light, P., 21, 242, 247–249, 357 Liker, J. K., 331, 334, 339–340, 444 Lin, A. A., 420 Link, B. G., 405

Lipton, P. A., 432 Litz, B. T., 455 Liu, D., 282 Liu, L., 456, 458 Livermore, D. P., 26±29, 244 Livesey, P. J., 458 Lo, Y., 308 Lockwood, C. M., 469 Locurto, C., 455, 458±459 Logan, G., 281 Logie, R. H., 373 Lohman, D. F., 29 Lombrozo, T., 125 Long®eld, E., 315 Lopez, A., 124 Lough, S., 160 Love, R. E., 23 Lovett, M. C., 478 Low, B. S., 416, 420 Lubinski, D., 396 Luciano, M., 435, 438, 440, 476 Luck, S. J., 221 Ludwig, A. M., 362 Lumsden, C. J., 414 Lumsden, J., 159 Lundervold, A., 477 Luo, D., 439, 441 Luria, A. R., 193, 266, 277 Lykken, D. T., 355, 414 Lynch, J. S., 23 MacDonald, K., 261 MacFall, J. R., 465 MacGillivray, T. J., 466 Maciejewski, C., 154±155, 467 MacKinnon, D. P., 469 Mackintosh, N. J., 455, 472 Macklin, M. L., 455 Macnamara, J., 236 Macphail, E. M., 455 Madden, D. J., 466 Madole, K. L., 303 Magnus, P., 439 Mahowald, M., 476 Maller, J. B., 419 Malt, B. C., 110±111, 115±116, 120, 122, 126 Mandell, D. J., 284

Mandler, J. M., 236, 303 Manktelow, K. I., 23, 246 Marcheutz, C., 227 Marcus, S., 374 Marin, O. S. M., 154 Marino, L., 460 Markman, A. B., 305±306 Markman, A. W., 219 Markman, E., 308 Markovits, H., 243, 245, 248 Markow, D. B., 308 Marks, K. S., 303 Marks, M. A., 360 Markson, L., 111 Marner, L., 474 Marr, D., 213±214 Marriott, M., 281 Marshalek, B., 25, 333 Marsolek, C. J., 478 Martignon, L., 142 Martin, L., 160 Martin, N. G., 435, 440, 476 Marucci, F. S., 28 Marzolf, D., 306 Matan, A., 113 Mattingley, J. B., 158 Mattock, A., 301 Matzel, L. D., 458±459, 473 Mauthner, N., 280 Mayberg, H. S., 167 Maybery, M., 216, 218±219 Mayes, A., 463 Maylor, E. A., 160 Mayr, E., 109 Mazzocco, A., 46 McCabe, K., 159 McCarthy, P. M., 406 McClearn, G. E., 393, 414 McClelland, J. L., 138, 218, 259 McCrae, R. R., 360 McCredden, J. E., 221 McDaniel, M. A., 389, 419, 462 McDermott, J., 333 McDonald, S., 157 McGarrigle, J., 20 McGeorge, P., 472

McGlone, F., 158 McGonigle, B., 201 McGrath, J., 267 McGue, M., 414 McGuf®n, P., 393, 414 McKechnie, J., 21 McKenzie, B., 319 McKinnon, M. C., 159±160, 166±168 McNally, R. J., 455 McNemar, Q., 353 Meck, W. H., 299 Medin, D. L., 106, 110±111, 114, 120, 124, 303, 306 Medina, J., 303 Mednick, S. A., 353 Mekel-Bobrov, N., 417, 476 Mellors, B. A., 72 Mendrek, A., 166 Meo, M., 28 Mervis, C. B., 120, 316 Metzger, L. J., 455 Meyer, D. E., 360 Mick, E., 281 Miele, A., 458 Miele, F., 407 Milgram, N. W., 458, 460 Miller, B. L., 167, 169 Miller, E. M., 465 Miller, G. F., 144, 261, 413, 421 Miller, L. K., 340±342 Miller, M. H., 167 Miller, P. H., 279 Miller, S. A., 279 Milner, B., 262 Mitchell, P., 194 Mix, K. S., 294, 299, 303±304, 306±308, 310±315, 317±319 Miyake, A., 466 Mof®tt, T. E., 430, 432 Molenaar, P. C. M., 441 Moll, J., 153, 165 Monleon, S., 456, 458 Moore, C., 21, 281 Morasse, K., 266 Morath, R. A., 406 Moriarty, J., 159 Morris, A. P., 158 Morris, R. G., 157±158 Morrison, V., 301

Moscovitch, M., 154±155, 160, 166, 168 Moseley, M. E., 465 Moses, L. J., 153, 156±157, 161, 225, 276, 278±282, 284±286 Moss, M. B., 459 Mottron, L., 266 Moulson, J. M., 160 Moulson, M. C., 224 Mount, M. K., 331, 393 Mourao-Miranda, J., 165 Mueller, U., 278 Muggenburg, B. A., 460 Mulholland, T. M., 29 MuÈller, U., 237±238, 242±243, 248, 269 Mumford, M. D., 360 Munakata, Y., 227, 478 Muncer, A.-M., 160 Murphey, H., 460 Murphy, J., 167 Murray, C., 370, 395, 455, 477 Mychack, P., 167 Naito, M., 283 Nakayama, K., 99 Nakisa, R., 137±138 Namy, L. L., 303, 306 Narasimham, G., 243, 248 Narr, K. L., 462±463, 477 National Center for Injury Prevention and Control, 402, 410 National Research Council, 402 Neary, D., 167 Needham, A., 301 Nelson, I., 21 Nelson, K., 223 Nettelbeck, T., 341±342, 440 Neubauer, A., 440 Neubert, M. J., 393 Newell, A., 213, 333, 352, 477 Newell, B. R., 132, 137, 140±146 Newport, E. L., 299±300, 319 Newstead, S. E., 24, 40, 243, 246

Newton, E. J., 22, 335, 341 Ngyuen-Xuan, A., 43 Nichols, S., 194 Nickerson, R. S., 183 Nippak, P. M., 458 Nisbett, R. E., 99, 216, 218±219, 246 Noelle, D. C., 478±480 Norman, D., 277 Norman, K. A., 480 Nosofsky, R. M., 105, 122±123, 138, 146 Noveck, I. A., 15, 23, 41, 46, 77, 237, 242±243, 246, 248 NunÄes, M., 247, 249 Nunes, T., 336±338 Nyengaard, J. R., 474 Nystrom, L. E., 153, 160, 165, 168 O'Brien, D. P., 14±15, 23, 41, 46, 59, 61, 64, 70, 76±78, 235±237, 242±243, 246, 248 O'Connor, N., 340, 342 O'Doherty, J., 171 O'Leary, D. S., 461 O'Neill, B., 402, 404±405 O'Reilly, R. C., 478±480 O'Sullivan, M., 262 O'Toole, B. I., 400, 405 O'Toole, C., 166 Oakes, L. M., 303 Oaksford, M., 97, 137±138, 183, 216, 246 Oberauer, K., 95 Oden, S., 374 Oliveira-Souza, R., 165 Oliver, L. M., 216, 246 Ollendick, T. H., 264 Olson, D. R., 156 Olsson, H., 143 È nkal, D., 141 O Oppenheimer, D. M., 139±140 Orr, S. P., 455 Ortmann, A., 136 Ortony, A., 106, 114 Osherson, D. N., 87, 124 Oswald, D. P., 264

Over, D. E., 23, 83–86, 89–90, 92, 94–96, 98–99, 100, 111, 120, 131–132, 135–136, 223, 225, 246 Overman, W. H., 224 Overton, W. F., 70, 163, 235–238, 240–244, 248, 251 Owen, A. M., 466, 470 Oxbury, S., 167 Ozonoff, S., 156, 264, 266–267 Pääbo, S., 420 Paik, J. H., 303, 312 Pakkenberg, B., 474 Pakstis, A. J., 281 Palfai, T., 153, 156, 225, 276, 278 Palmer, M., 137 Palmeri, T. J., 146 Pancer, S. M., 160, 166 Parisi, D., 261 Parkin, L. J., 283, 286 Pascual-Leone, J., 373, 376 Passant, U., 167 Passarino, G., 420 Patalano, A. L., 360 Patel, D., 458, 473 Patterson, K., 167 Pauen, S., 114 Paule, M. G., 457, 459 Paya-Cano, J. L., 456, 458 Paylor, R., 460 Payne, J. W., 143 Pearlman, K., 389 Pearlson, G. D., 476 Pearlstone, Z., 237 Pedersen, N. L., 439 Pellegrino, J. W., 29 Peloquin, S. M., 170 Pelvig, D., 474 Pennington, B. F., 156, 264, 266–267, 277 Perilloux, H. K., 157 Perkins, D., 369 Perlmutter, M., 280 Perner, J., 156, 196, 224, 264, 275, 278, 281, 283, 285–286 Perry, R. J., 167, 169

Persson M., 137±138, 144 Peterson, D., 194 Peterson, J. B., 361 Petrides, M., 262 Petrill, S. A., 432, 434, 438±439 Petropoulos, H., 462, 466 Phelan, J., 405 Phelps, E. A., 158, 167 Phillips, S., 221 Piaget, J., 19, 219, 236, 238, 241, 373±376, 378 Piattelli-Palmarini, M., 183 Piazza, M., 226±227 Pilcher, J. J., 135 Pillow, B. H., 275 Pinel, P., 226±227 Pinker, S., 109, 131, 163, 371 Pinneau, S. R., 428 Platt, R. D., 23, 44±45, 50, 53, 250 Plomin, R., 372, 393, 414, 430±432, 434, 437±439, 444, 449, 455, 458±459 Plucker, J. A., 351 Plunkett, K., 261 Polis, I., 460 Politzer, G., 43 Polizzi, P., 223±224, 268 Polkey, C. E., 157±158 Pollack, R. D., 163 Polya, G., 219 Posner, M. I., 226, 279, 281 Posthuma, D., 440 Potter, M., 266 Poulin-Dubois, D., 110, 114 Poutanen, V. P., 462±463, 477 Povinelli, D. J., 157 Prasad, A., 222 Pratt, A., 160, 166 Pratt, C., 285±286 Pratt, M. W., 160, 166 Pressley, S., 29±30 Preston, S. D., 170 Price, T. S., 437±438 Primi, R., 28 Pring, L., 341 Provenzale, J. M., 465 Pure, K., 281

Pusey, A., 415 Putnam, H., 115 Pylyshyn, Z. W., 213±215, 228, 300 Qin, Y., 478 Qu, L., 269 Quine, W. V. O., 125, 294 Quinn, P. C., 114, 218, 303 Quinn, S., 243 Rabbitt, P. M. A., 333 Ragland, J. D., 159 Rakison, D. H., 110, 114 Rakow, T., 137, 142, 144 Ramachandran, V. S., 267 Rampon, C., 460 Ramsey, F. P., 98 Rankin, K. P., 167, 169 Rasher, S. P., 356 Ratcliff, R., 146 Rattermann, M. J., 305, 307±309 Raven, J., 25, 333 Raven, J. C., 25, 333, 374 Raven, P., 110 Raz, N., 159, 462, 465 Reber, A. S., 219, 472 Reder, L. M., 334, 478 Redington, M., 137±138 Ree, M. J., 454, 455 Reed, S. K., 219 Reed, T. E., 462, 465 Reene, K., 237±238, 242±243, 248 Regan, R. T., 339 Regier, T., 300 Regino, R., 281 Rehder, B., 118, 122 Reinvang, I., 477 Retschitzki, J., 344 Reuning, H., 389, 455 Reynolds, C. A., 435 Reznick, J. S., 258, 277±278 Ricco, R., 243 Rice, G. E., 168 Rice, N., 315 Rice, T., 439 Richardson, K., 13, 15, 25±29, 330, 333 Riddoch, M. J., 154

Riem, R., 312 Rieman, R., 440 Rietveld, M. J. H., 434, 437±438 Rigas, G., 334 Riggs, K., 194 Rijsdijk, F. W., 440±441 Ring, H. A., 158±159 Rips, L. J., 14, 106, 115±116, 235±237, 246, 248 Rivers, C. S., 466 Rizzolatti, G., 201 Roazzi, A., 17, 32, 59, 61, 64, 76±77, 330±331, 337 Robbins, T. W., 460 Roberts, A. C., 460 Roberts, M. J., 18, 22±24, 26±29, 244±246, 335, 341 Robertson, A., 379±380 Robinson, E., 194 Rochat, P., 319 Rodgers, J. L., 395, 435 Roe, A., 357 Rogers, S. J., 156, 264, 266±267 Rolls, E. T., 224, 267 Root-Bernstein, R. S., 356 Rosas, H. D., 465 Rosch, E., 120 Rose, S., 330 Rosen, D. E., 77 Rosen, H. J., 167 Rosen, H. R., 169 Rosenblum, T., 443 Rosene, D. L., 459 Rosenfeld, A., 163 Rosenfeld, R., 163 Ross, B. H., 306 Rostan, S. M., 359 Roth, G., 475 Rothstein, H. R., 389 Rougier, N. P., 480 Rovner, D., 134 Rowe, A. D., 157±158 Rowe, D. C., 395, 435±436 Rowland, L. M., 462, 466 Rubinstein, J., 120 Ruff, C. B., 387, 415, 417 Ruffman, T., 283

Rumelhart, D. E., 138, 218, 259 Rushton, E. W., 417 Rushton, J. P., 417 Russell, B., 304 Russell, G. L., 167 Russell, J., 191, 235±236, 245, 276, 278, 280 Rutter, D. R., 312 Ryan, L., 159 Ryan, P., 224 Ryle, G., 197 Sabbagh, M. A., 224, 278, 281±282, 285±286 Sadato, N., 159 Saffran, E. M., 154 Saffron, J., 299, 319 Sagi, A., 168 Salat, D. H., 465, 477 Salthouse, T. A., 159 Saltzman, J., 160 Salz, T., 465 Samuels, R., 83 Samuelson, L., 315 Sandhofer, C. M., 294, 304, 306±315, 317 Sarich, V., 407 Sattler, J. M., 369 Saudino, K., 437±438 Savary, F., 248 Saxe R., 268 Scanlon, C., 458 Schachar, R., 281 Schaeffer, B., 313 Schaub, H., 334 Schellenberg, E. G., 167 Schiano, D. J., 333 Schiffer, S., 137±138, 144±145 Schleifer, M., 243 Schliemann, A. D., 336±338 Schmidt, F. L., 345, 389, 406, 419 Schmithorst, V. J., 462, 466 Schmitt, D. P., 135 Schmitt, J. R., 458 Schmitz, B., 159 Schneider, W., 280 Schoenbaum, G., 262 Schoenemann, P. T., 387, 394, 475

Schoenfeld, A. H., 334 Scholl, B. J., 116, 299±300 Scholnick, E. K., 245 Schoner, G., 303 Schretlen, D., 476 Schroeder, L., 159 Schuff, N., 167 Schuhmann, E., 224 Schulz, L. E., 116, 118 Schuneman, M. J., 243 Schunn, C. D., 334 Schwartz, M. F., 154 Schwartz, M. L., 166 Schweinhart, L., 374 Scott, F. J., 264±265 Scott, J. L., 313 Scott, J. P., 458 Scott, N., 458, 473 Scribner, S., 334±336 Secord, W., 31 Seeberger, L. C., 480 Segar, C. A., 266 Seifert, C. M., 360 Setlow, B., 262 Sha®r, E., 124 Shallice, T., 277 Shamay-Tsoory, S. G., 158 Shanahan, M., 201 Shanks, D. R., 137±138, 142±145 Shapiro, J. M., 455, 477 Shapiro, L., 122 Sharon, T., 319 Sharpe, S., 280 Shatz, M., 309, 315 Shaw, J. C., 352 Shaw, P., 169 Shayer, M., 376, 378, 380 Sheehan, M. J., 387, 394, 475 Sheets, V., 469 Shell, P., 25, 28, 333 Shelton, J. R., 110 Shen, P., 420 Shenkin, S. D., 466 Shi, M., 120, 126 Shimizu, E., 460 Shipley, E. F., 124 Shire, B., 312 Shirley, M., 474 Shiverick, S. M., 285±286 Shomstein, S., 268

Shrager, J., 335, 352 Sibbitt Jr., W. L., 466 Sides, A., 87 Siegal, M., 203, 224 Siegel, L. S., 306–307 Siegler, R. S., 222, 334–337 Silva, A. J., 481 Silva, P. A., 430 Silverman, I., 204, 260 Simmons, A., 158–159 Simon, D. P., 333 Simon, H. A., 133, 329, 333, 343, 352 Simon, T., 219 Simon, T. J., 226, 319 Simons, D. J., 300 Simonton, D. K., 353, 355–363 Singer, T., 170 Siwak, C. T., 460, 473 Skuse, D., 205 Slater, A., 301 Slaughter, V., 286 Slemmer, J. A., 299, 319 Sloman, S. A., 84, 89–90, 95, 98–100, 106, 111, 116, 118, 120, 123–124, 126, 216, 250 Sloutsky, V. M., 308 Slovak, L., 89, 100 Sluming, V., 463 Smid, H., 193 Smith, A. M., 166 Smith, B. H., 459 Smith, E. E., 99, 124 Smith, E. S., 227 Smith, G. A., 438, 440 Smith, G. S., 402, 413 Smith, I. M., 268 Smith, K. W., 361 Smith, L., 373 Smith, L. B., 303, 305–306, 308–309, 315–306 Smith, R., 205 Smith, S. D., 267 Smith, S. M., 358 Smith, V., 159 Snow, R. E., 25, 333 Snowden, J. S., 167 Sober, E., 109 Sohn, J. H., 463 Sohn, M.-H., 470

Soja, N. N., 315 Sokol, B. W., 243, 276 Somers, M., 301 Sommer, T., 281 Sommerville, R. B., 153, 160, 165, 168 Sophian, C., 20±21 Spada, H., 244 Spalding, T. L., 306 Spanoudis, G., 351 Spearman, C., 442, 449, 459, 471 Spelke, E. S., 226±227, 293±294, 296±297, 299, 301, 304, 311, 318±319 Spence, S. A., 165, 170 Sperber, D., 44, 54±56, 84, 179±180, 242, 246, 250, 258±261 Spiel, C., 243 Spinath, F. M., 431, 440 Spitz, H. H., 265, 331, 340±342 Spreng, N., 168 Spry, K. M., 339 Stankov, L., 400 Stanovich, K. E., 22, 24, 87, 99±100, 135±136, 199, 333 Starkey, P., 318±319 Starr, J. M., 394, 400±401, 428, 455, 466, 476 Staudenmayer, H., 238 Stavridou, A., 361 Steedman, M., 201 Stenger, V. A., 470 Stenning, K., 111, 183±184, 186, 188±189, 198±199, 203 Sternberg, R. J., 331, 334, 344, 354, 391 Stevens, J. R., 475 Stevenson, N. J., 18, 26±27 Stevenson, R. J., 85, 95, 120 Stewart, G. L., 393 Stewart, T., 133 Stibel, J. M., 89, 100 Stieglitz, S., 281, 284±285 Stip, E., 266 Stone, V. E., 157±158, 160, 163, 267±268

Storms, G. H. B., 111, 120 Strange, B. A., 171 Strauss, E., 160 Streissguth, A. P., 472 Strevens, M., 116, 130 Stuss, D. T., 153, 157±158, 167, 263, 267, 269 Styles, I., 374 Su, M. Y., 460 Sugarman, S., 303 Sugiyama, L. S., 61±62, 65, 76 Sullivan, R., 458 Sulloway, F. J., 356 Sundet, J. M., 439 Sunohara, G. A., 281 Sussman, R. W., 411 Sutton, J., 372 Svetina, M., 335 Svoboda, E., 159 Swanson, J. M., 281 Swayze 2nd, V., 461 Swettenham, J., 266 Syme, L., 405 Szemanski, A., 333 Taffe, M. A., 460 Tager-Flusberg, H., 275 Talmi, D., 167 Tambs, K., 439 Tang, Y., 474 Tang, Y. P., 460 Tannock, R., 281 Tapp, P. D., 460, 473 Tardif, T., 282 Tarlatzis, I. D., 262 Tarrier, N., 165, 170 Taylor, D., 301 Taylor, L. A., 160 Taylor, M., 224, 286 Taylor, R. L., 455 Tays, W. J., 267 Teigen, K. H., 452 Tellegen, A., 355, 414 Temple, E., 226, 465 Templeton, A. R., 109 Terman, L. M., 353, 356 Tesch-RoÈmer, C., 329, 343±345 Tetreault, N. A., 475 Thagard, P. R., 218±219 Thaiss, L., 264, 285±286

Theadom, A. M., 26±29, 244 Thelen, E., 303 Thomas, D., 374 Thomas, M., 451, 472±473 Thomas, R. P., 334 Thompson, L. A., 434, 439, 441 Thompson, P. M., 394, 462±463, 476±477 Thompson, R. M., 459, 480 Thompson, V. A., 248 Thorndike, R. L., 369 Thornhill, R., 135 Thurstone, L. L., 370 Tidswell, T., 280 Tobin, J. J., 281 Todd, P. M., 83, 85, 131±132, 135, 137, 144, 146, 388 Toga, A. W., 394 Tomasello, M., 87, 205, 460 Tomer, R., 158 Tooby, J., 13, 15, 22, 42, 44, 59±62, 64±65, 72, 76±77, 83±84, 87, 89±91, 100, 131±132, 161, 163, 216, 223, 249, 257±259, 261±262, 269, 330, 388, 416, 442, 449±450 Toupin, C., 305 Townsend, D. A., 458, 473 Townsend, J., 267 Townsend, J. T., 146 Tranel, D., 267±268 Treisman, A., 299 Treoloar, S. A., 435 Trinkhaus, E., 387, 415, 417 Trivers, R. L., 42 Trouard, T., 159 Tsivkin, S., 293±294, 296±297, 299, 311 Tuch, D. S., 465 Tucker, D. M., 261 Tulving, E., 237 Turetsky, B. I., 159 Turkheimer, E., 455 Turner, G., 167

Turner, J., 260 Turner, T., 159 Tversky, A., 72, 87±88, 92±93, 96, 98, 100 U. S. Department of Labor, 397 Uchida, N., 308 UmiltaÁ, C., 154±155 Uncapher, M., 260 Underhill, P. A., 420 Urbach, P., 91 Ursu, S., 470 Vachon, R., 243 Vaez-Azizi, L. M., 417, 476 Valentine, E. R., 24 Vallender, E. J., 417, 476 van Baal, G. C. M., 434, 437±438, 441 van Beijsterveldt, C. E. M., 441 van de Sluis, S., 437 van der Henst, J.-B., 56, 250 van der Molen, M. W., 28 van Erp, T., 462±463, 477 van Lambalgen, M., 183±184, 186, 188±189, 193, 195, 198±199, 201 Vandierendonck, A., 336 Vanyukov, P., 334 Varley, R., 224 Vavrik, J., 279 Venet, M., 243 Venville, G., 379±380 Verguts, T., 226 Vernon, P. A., 440±441, 462, 465 Vernon, P. E., 372 Vesterdal, W. J., 395, 435 Viale, R., 87 Vodegel Matzen L. B. L., 28 Vogel, E. K., 221 Voss, A. A., 219 Vurpillot, E., 319 Vygotsky, L. S., 261, 375±376 Wade, D., 267 Wadsworth, S. J., 434 Wager, T. D., 466

Wagner, R. K., 331, 334 Wagner, S. H., 313 Wainwright, J. A., 267 Wainwright, M. A., 435 Wainwright-Sharp, J. A., 268 Walberg, H. J., 356 Walhovd, K. B., 477 Walkenfeld, F. F., 472 Wallace, G. L., 341 Wallach, R. W., 454 Wallas, G., 352 Waller, N. G., 355 Walters, J., 313 Walton, P. D., 279 Wang, Y., 120, 126 Ward, S. L., 163, 237, 242±243, 248 Ward, T. B., 358 Wardlaw, J. M., 466 Warren, J. M., 458 Warrington, E. K., 154±155, 467 Wason, P. C., 40, 60, 181±184, 188, 215, 242±243 Waters, A. J., 344 Watson, J., 223, 280 Watson, J. B., 362 Watson, K., 475 Watson, K. K., 475 Waxman, S. R., 306, 308 Webster, D. S., 29 Weed, M. R., 460 Weers, D. C., 466 Weikart, D., 374 Weiner, M., 167 Weisberg, R. W., 353 Welch, J., 301 Welfare, H., 26±29, 244 Wellman, H. M., 223, 275±276, 280, 282 Welsh, M. C., 277 Wertheimer, M., 352 West, R. F., 22, 24, 87, 99±100, 135±136, 333 West, S. G., 469 Weston, N. J., 137, 142±145 Wexler, A., 268 Whalley, L. J., 394, 400±401, 428, 455, 476

Wheeler, P., 415 Wheelwright, S., 29–30, 158–159 White, L. E., 465 White, N. S., 471 White, R. K., 356 Whiteman, M. C., 394, 400–401, 455 Whiting, W. L., 466 Wicherts, J., 437 Wicker, B., 170 Wickett, J. C., 462 Wigal, T., 281 Wiig, E. H., 31 Wilcox, T., 301 Wilda, M., 204 Wilhelm, O., 95 Wilke, M., 462–463, 466 Wilkerson, B., 434 Wilkie, O., 124 Wilkinson, I. D., 165, 170 Wilkinson, L. S., 204 Williams, G. C., 419 Williams, J., 415 Williams, L., 284 Williams, M. A., 158 Williams, P., 461, 467, 470 Williams, S. C., 463

Williams, W. M., 331, 334, 372 Williamson, C., 205 Wilson, D., 56 Wilson, E. O., 414 Wilson, R. S., 431 Wilson, W. H., 221 Wimmer, H., 196, 275 Wing, L., 263 Winman, A., 143 Winner, E., 157, 160 Winocur, G., 155 Winston, J. S., 171 Wisco, J. J., 465 Witelson, S. F., 462 Witzki, A. H., 466 Wolff, P., 219 Wood, D. J., 335 Woodruff, P. W., 470 Wrangham, R., 415 Wright, H., 24 Wright, M. J., 435, 438, 440, 476 Wu, D. Y. H., 281 Wu, Y., 281 Wuthrich, V., 361 Wynn, K., 223, 226, 293, 295±296, 311, 313, 315±316, 318±319 Wynn, T. G., 421

Xie, Y., 374 Xu, F., 226, 281, 301, 318 Yang, W. H., 420 Yang, Y., 236 Yaniv, I., 360 Yates, C., 378 Yaure, R., 163 Yeo, R. A., 462±464, 466 Young, A., 158, 268 Young, R., 341±342 Young, S. E., 466 Yu, J., 459, 480 Zaccaro, S. J., 360 Zaitchik, D., 285 Zaleta, A. K., 465 Zechner, U., 204 Zelazo, P. D., 153, 156, 221, 225, 258, 262, 267, 269, 276±278 Zhang, H. C., 333 Zheng, Y., 165, 170 Zhuo, M., 460 Zicker, S. C., 460 Ziegler, E. W., 455 Zielinski, T., 222 Zythow, J. M., 352

Subject index

Ability (general/cognitive), 2–3, 5, 19, 25, 91, 160, 225, 265, 329–333, 335–336, 340, 342–344, 346, 357, 360, 389–390, 393, 395, 406, 427–428, 430, 432–433, 437–440, 444, 449, 452, 455, 458, 460–461, 463, 472, 474–478
Adaptive toolbox (Swiss army knife metaphor), 83–84, 131–136, 138–139, 144–147, 388
Ageing and cognition, 159–164, 403–405, 428, 463, 474, 476
Amygdala, 158–159, 163, 165–167, 169–171, 224, 260, 268
Analogical reasoning, 5, 28–31, 34, 214, 217–222, 225–226, 228, 240, 261, 264, 298, 305, 333, 471
Animal cognition, 6, 100, 168, 180, 195, 201–202, 204–205, 225–226, 293, 299, 352, 371; see also intelligence
Attention, 22–23, 27, 34, 54, 155, 158–159, 164, 168, 171, 215, 224–225, 266–268, 275, 278, 281, 286, 295–298, 303–304, 309, 311–313, 315, 318, 360–361, 363, 369, 377, 415, 450, 467
Autism, 156–157, 193, 205, 224, 263–269, 275, 278, 286
Bayesian reasoning, 62–64, 72–77, 89–91, 97–98, 116, 133
Beliefs (reasoning with), 4, 85–87, 109, 113–116, 118–119, 121–122, 124–126, 155–157, 160–161, 164, 180, 194, 196, 198–199, 202, 223–225, 264, 268, 275–276, 279, 284–286
Brain see neuroscience research
Categorisation, 5, 63, 105–126, 132, 214, 218, 222, 296–298, 302–314, 316–318, 320

Causal learning and reasoning, 30, 63, 95, 98, 105–109, 111–121, 123–126, 194–197, 202, 241, 264–264, 373, 387, 407
Central systems and processes, 2, 5, 7, 99, 154–156, 159–161, 164, 166, 171, 213–214, 257, 258, 261, 267, 269–270, 279, 341, 373, 376, 394, 398, 450–451, 460, 466–468, 470–471, 473, 476; see also executive function
Cheater detection module see modules
Cognitive complexity see difficulty
Cognitive limitations see working memory capacity; executive function
Communication, 6, 54–56, 120, 122, 168, 202, 205–206, 219, 263, 266, 387
Conditionals (reasoning with), 4, 8, 24–25, 39–40, 55, 59–62, 70–72, 76, 95–99, 132, 182–185, 187–194, 198–199, 216–217, 235, 238–243, 246–250, 416; see also Wason Selection Task
Connectionism see neural net models
Content/context and reasoning, 1–4, 6–8, 13–34, 39–45, 53, 59–62, 77–78, 83–88, 92, 113, 117, 132, 180–185, 190, 194, 197, 200, 203, 217–220, 237–238, 243–246, 248, 250–251, 261–262, 267, 284–287, 310–312, 314, 317, 330–339, 355, 363, 371, 373, 377–379, 388, 390, 416, 443–444
Counterfactual reasoning, 61, 76, 87, 188–190, 194–196, 198, 202, 264–268, 275–277, 284–285; see also false belief tasks
Cross-cultural research, 3–4, 59, 61–62, 64–72, 74–76, 110–111, 113, 120–121, 193–194, 262, 281–284, 310, 454–455, 407–413, 420

Decision making see judgement and decision making
Deductive reasoning see logical reasoning
Deontic reasoning, 154–155, 165, 171, 183–184, 186, 190, 199, 203, 215–216, 228, 246, 248–250; see also social contracts
Descriptive selection task see Wason Selection Task
Difficulty of tasks (in relation to demands and logic), 4, 6–8, 14, 17–18, 20, 23–25, 27–31, 33–34, 89–92, 133–134, 155–157, 159–164, 171, 188–190, 199–200, 221–222, 225, 227, 265, 278, 280–282, 284–287, 332, 336–340, 388, 391, 395–399, 401, 406–407, 419–421
Domain free see domain general
Domain general procedures, 1, 4–5, 7–8, 13–19, 22, 25, 28–29, 31, 33–34, 59–60, 77–79, 83, 132, 147, 161, 171, 213, 217–218, 222, 228, 244, 261, 276–277, 287, 293, 297–299, 303–304, 307, 316–320, 330, 353, 355, 357–359, 362–363
Domain general resources and processing requirements, 1, 4–8, 14–19, 23, 25, 28, 30–34, 153, 155–156, 157, 159–161, 163–165, 171, 199, 221, 330, 400–401, 421, 450–451, 460–461, 466, 473; see also working memory
Domain specific procedures, 1, 3–8, 13–19, 22–25, 27, 33, 40, 56, 83, 89, 91; see also knowledge; modules; schemas
Domain specificity, 1–6, 8, 16, 21–22, 29, 39; see also massive modularity
Ecological validity, 88, 100, 137, 181
Education and schooling, 2, 6, 25, 65, 74, 193, 219, 280, 331, 335–339, 345–346, 369–383, 378, 381, 383, 389, 391, 394–396, 401, 405, 418–419, 428–430, 433–437, 443
Emotions and empathy, 4, 6, 153–154, 158–159, 163–171, 206, 224–225, 228, 260–262, 268, 270, 275, 392–393, 395
Evolution and evolutionary psychology, 4, 6, 15–16, 39, 42, 54, 59–64, 72, 74, 77–79, 83, 85, 87–92, 98, 100, 106, 109–111, 113, 117, 119, 121, 131–136, 138, 147, 179–182, 185, 190, 196–197, 199–206, 236, 246, 249–251, 257–262, 269–270, 299, 371, 387–389, 392–397, 399, 407, 412–421, 442, 449–451, 456–461, 475–476, 481
Executive function, 4, 28, 155–160, 164, 169, 171, 193, 199, 205, 213, 224, 258, 265–269, 276–287, 393–395, 459, 465–467, 473; see also central systems
Experience see motivation and practice
Expertise (acquisition and application), 5, 111, 114, 122, 124, 136, 265–266, 329, 331–335, 339–345, 354–364, 444
Face perception see modules
False belief, photograph and sign tasks, 4, 156–157, 160, 194, 196, 223–225, 264, 268, 280–282, 284–286; see also counterfactual reasoning
Fast-and-frugal heuristics see heuristics
Frontal brain (including prefrontal cortex, PFC), 155, 157–159, 160, 163, 165–171, 204–205, 224, 227, 262, 266–267, 277–278, 387, 394, 461–463, 466–471, 473, 475–480
g factor see intelligence
Gambling behaviour, 166–167, 224, 267–268, 340
General ability see cognitive ability
Genetics, 1, 5–6, 13–5, 14–4, 203–206, 258, 281, 285, 305, 330, 354–355, 359, 362, 371–372, 387–388, 393–395, 399, 411, 413–414, 416–417, 420–421, 427, 430–441, 443–445, 459–460, 474, 476, 478, 481
Grey and white matter, 157, 394, 450, 462–466, 471, 473–476, 480
Hazard management see modules
Heritability see intelligence
Heuristics: availability, 98–99; fast and frugal, 4, 83, 85, 93, 134, 136–139, 142, 145, 147, 388; hill-climbing, 333, 352; means–ends analysis, 333, 352; recognition, 133, 135–137, 139–141, 146; relevance, 55; simulation, 98; strong methods, 358; take the best (TTB), 133–134, 137–139, 141–146; trial and error, 352, 358–359, 361, 363; weak methods, 333, 358

Implicit cognitive processes, 99, 122, 169, 219, 265–266, 303, 307, 317, 341, 472, 477
Individual differences in cognitive processes, 4, 21, 99–100, 144–145, 317, 335–339, 340
Individual differences (quantitative), 3, 5, 7–8, 1–6, 25, 100, 329–330, 332, 343, 353, 356, 359–360, 388, 391, 393–395, 428, 432–433, 436, 440, 444, 449–450, 452, 454, 460, 465–466, 468–472, 474–477, 480–481
Inductive reasoning, 5, 25–28, 34, 61–62, 9–4, 105–107, 109–111, 113–120, 123–125, 234, 239, 333; see also categorisation; schemas; causal reasoning; analogical reasoning
Inference rules, 14, 59, 78, 83, 85–86, 88, 187–188, 233, 235–238, 246, 249
Informational encapsulation, 56, 99, 132, 154, 197, 214–215, 223, 258, 449
Inhibition, 153, 155–157, 171, 193, 224, 277–278, 280–282, 284, 286, 360–361, 394, 399, 459, 466, 470, 477; see also executive function
Intelligence: in animals, 387, 452–460, 473–476, 478–481; evolution, 387–388, 393, 413–421; g factor, 25, 333, 388–397, 399, 401, 405, 414, 416–420, 427, 429–430, 433–434, 436–444, 450–461, 463, 465, 467–474, 476–478, 480–481; heritability, 5–6, 261, 355, 370–372, 393, 395, 428–431, 433–436; improvement of, 374–383, 429–430; predictive validity in occupational, educational, and real-life settings, 261, 331, 339, 344–346, 353, 355–357, 370, 391–392, 394–407, 433–436; in relation to cognition, 5, 24, 33, 146, 219, 222, 329–334, 339–345, 373–374, 376, 452–455, 460–474; structure of, 370, 383, 388–393, 437–441; testing of, 16, 25–31, 331, 342, 353, 395–396; twin and adoption studies, 372, 430, 431, 432, 434, 435, 436, 437, 438, 441, 462; versus domain specificity, 1, 3, 5, 329–334, 339–346, 370, 388, 442–444, 449–450; see also cognitive ability
Interference, 164, 182, 243, 266, 341, 361, 469; see also executive function

Judgment (under uncertainty) and decision making, 2, 4–5, 6–2, 73–75, 78, 85, 87–88, 93–96, 98–99, 105–108, 115, 122, 124–126, 131–134, 135–137, 139–140, 142–147, 165–166, 267–269, 333, 338–339, 387, 398, 400, 412, 452, 454–455
Knowledge and cognition, 1, 3, 7, 14, 19, 21, 23, 29–31, 33–34, 63, 77, 85–89, 90, 110–112, 114, 116–118, 120–121, 124–125, 139–140, 154, 167, 184, 188, 190, 198, 215–219, 227, 234, 239, 251, 261, 277, 279–281, 293–294, 296–299, 308, 315, 320, 329–330, 333–334, 339–343, 354, 356–357, 359, 362, 373, 395, 443–444
Language, 77–78, 95, 120, 171, 179–181, 183–184, 186–188, 195, 198–199, 201–203, 205–206, 220, 236, 241, 263, 294–296, 299, 304, 308, 311–313, 315–317, 360, 399, 416
Learning and skill acquisition, 1, 3, 5, 7–8, 14–16, 19, 31–32, 34, 107, 111–117, 119, 121, 123, 124, 165, 167, 179–180, 204, 218, 223–227, 236, 238, 241, 259, 261, 266, 269, 275, 282–283, 287, 293–302, 304, 306–317, 319–320, 329–330, 332, 336–337, 339, 341–346, 354–357, 360, 362, 371, 377, 388, 390–393, 398, 401, 416, 418–419, 433, 436, 451, 456, 459, 471–472, 475, 480
Logic and logical reasoning, 2–4, 7–8, 13–14, 17, 18, 22, 24–25, 33, 39–40, 55, 59–62, 64, 68–72, 76, 78, 83–88, 91–99, 132, 159, 179–203, 205–206, 216–217, 221–222, 233–251, 264–265, 416; see also Wason Selection Task
Logic in relation to task performance see difficulty
Massive modularity, 1–3, 6–7, 15–19, 22, 33–34, 42–46, 48–50, 53–54, 56, 59–62, 77–78, 83, 85, 87, 89–90, 99–100, 106–112, 115, 131–132, 155–157, 161–164, 170–171, 179–182, 190, 197–200, 203, 206, 213, 215, 217–218, 222–225, 227–228, 249, 257, 259–261, 276, 330, 343, 388–389, 399, 442–443, 450, 461, 481
Matrix reasoning, 25–28, 34, 333, 341, 374, 453, 467–471, 477

Memory, 63, 91, 94, 100, 141, 157–159, 163, 165–166, 171, 201–202, 227, 236–237, 246, 260, 262, 266, 277–278, 280, 284, 298, 309, 333, 341, 343, 351, 368, 390, 437–440, 460, 480–481, 443, 449–451, 478–481; see also schemas; working memory capacity
Mental logic see inference rules
Mental models, 14, 78, 93, 219, 227, 236, 244–246, 248, 250
Mental state judgement see Theory of Mind; emotions
Metacognition, planning, foresight, 6, 99, 157, 167–168, 195, 197–202, 205–206, 267, 277, 279–281, 283, 376, 379–380, 383, 387, 398–399, 401–402, 416–417; see also central executive; planning
Methodological problems (confounds, extraneous manipulations, and nonisomorphic tasks), 3, 16–18, 20, 23–24, 26–29, 31, 33–34, 39–41, 44–46, 48–52, 56, 64, 76–77, 183–184, 217, 219–220, 248, 250, 264–265, 317–320, 332, 345
Modularity (non-massive), 2, 6, 53–56, 78, 83, 99–100, 154–156, 161–163, 168, 171, 179–180, 182, 185, 197, 200, 203–204, 206, 213–215, 223–225, 227–228, 249, 257–262, 266, 269, 277
Modules: cheater detection, 3, 7, 15, 22, 42–46, 48–50, 53–54, 56, 59–62, 77, 132, 161–164, 182, 190, 197, 206, 215–217, 225, 228, 330, 388; face perception, 99, 225; folk biology, 106–112, 115; hazard management, 3, 15, 22, 161–164, 249, 330, 389; language, 78, 195, 198; natural sampling, 89, 91; number concepts, 223, 227, 266, 295–300, 340; theory of mind, 4, 87, 155–157, 171, 180, 190, 198–199, 203, 223–225, 257, 263, 267, 276–277; vision, 213–215, 227–228
Motivation, practice, and experience, 5, 14–15, 20–21, 32–33, 43, 62, 65, 72, 138, 216, 225, 246, 258, 263, 266, 275–276, 283–284, 287, 295, 300, 302, 304, 313, 316, 319, 329–331, 333–344, 354, 360–361, 370, 390, 419, 432–433; see also learning; expertise
Neural net models, 99, 138, 199, 218, 309, 472, 478–480


Neuroscience research, 6, 110, 153–155, 157–159, 163–171, 201, 215, 224, 226–228, 262–263, 265, 267–268, 277, 387–389, 394, 441, 444, 449–451, 459–480
Numerical reasoning and judgement, 4, 19–21, 62–64, 72–77, 83–84, 88–95, 99, 225–227, 265–266, 293–320, 336–343
Permission schema see schemas
Planning and foresight see metacognition
Practice see motivation
Pragmatic reasoning schemas see schemas
Prefrontal cortex see frontal brain
Problem-solving: naturalistic/real life, 19–21, 31–33, 144–145, 190, 259–261, 331, 334–340, 397–402, 406–407
Processing capacity limitations see domain general resources
Progressive Matrices see matrix reasoning
Rationality, 83, 86, 93–94, 122, 133–135, 137–138, 144–145, 147, 188, 215
Savant skills see expertise
Schemas (formation and reasoning with), 3, 5, 7, 14–19, 22–23, 26, 34, 39–42, 46, 78, 163, 214, 216–220, 222, 228, 236, 246–249, 377–381; control of variables, 377–379; convergence, 15, 22; general puzzle-solving, 18; obligation, 15, 22, 41, 216, 246, 248; permission, 7, 14–15, 41, 216–218, 220, 246, 248; prediction, 217; seriation, 377; social contexts, 26
Selection Task see Wason selection task
Set-shifting, 156, 267, 277–279, 466–467, 478–479
Sex differences, 204–205, 260–261, 393, 402, 410–411
Similarity (in relation to categorisation), 105–106, 116, 125, 138, 302–303, 305–308, 312–313, 315, 317
Social cognition, reasoning in a social context, 1–2, 6, 15, 21, 25–26, 60, 87, 111, 135, 153–154, 156–160, 164–169, 170–171, 202–206, 236, 260–261, 263, 267–268, 275–276, 283, 313, 371, 375–376, 379–380, 392–393, 395, 413–414, 417–418, 419–420
Social contracts (reasoning with), 3, 7, 15, 22, 39, 42–54, 56, 59–62, 64–72, 76–78, 132, 161–163, 182, 185, 190, 197, 206, 216–217, 223, 225, 228, 249–251, 330, 388, 393; see also modules; massive modularity
Task demands see difficulty
Theory of Mind (ToM) module see modules
Theory of Mind (ToM) reasoning, 4–5, 153–171, 180, 190, 194, 196, 198–199, 202–203, 223–225, 228, 257, 263–265, 267–269, 275–277, 279–287

Transfer of learning, 6, 197, 219, 338, 342, 377, 381, 383, 391, 416
Wason Selection Task, 3–4, 8, 21–25, 39–56, 60–62, 65–70, 76–77, 161–164, 181–189, 199, 203, 206, 213, 215–217, 220, 242–243, 246–251
White matter see grey matter
Working memory capacity/demands, 1, 7, 14, 16–17, 28–31, 34, 99, 155–161, 163–164, 166, 168–169, 171, 199, 221–222, 243–245, 278, 280–282, 286, 330, 337, 373, 376–377, 450, 462, 466–471, 473, 476–480
Working memory management see central executive
