PROBLEM SOLVING
Problem solving
S. Ian Robertson
University of Luton, UK
First published 2001 by Psychology Press
27 Church Road, Hove, East Sussex, BN3 2FA

This edition published in the Taylor & Francis e-Library, 2005.

“To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.”

www.psypress.co.uk

Simultaneously published in the USA and Canada by Taylor & Francis Inc
325 Chestnut Street, 8th Floor, Philadelphia, PA 19106

Psychology Press is part of the Taylor & Francis Group

© 2001 Psychology Press Ltd

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data
Robertson, S. Ian, 1951–
Problem solving/S. Ian Robertson.
p. cm.
Including bibliographical references (p. ) and index.
ISBN 0-415-20299-X
ISBN 0-415-20300-7 (pbk.)
1. Problem solving. I. Title
BF449.R63 2001
153.4'3–dc21 00–059071

ISBN 0-203-45795-1 Master e-book ISBN
ISBN 0-203-76619-9 (Adobe eReader Format)
ISBN 0-415-20299-X (hbk)
ISBN 0-415-20300-7 (pbk)

Cover design by Leigh Hurlock
For Kathryn, Poppy, and Aidan (and then there were three)
Contents

Illustrations
Preface
Acknowledgements

Part One: Introduction

1. Introduction to the study of problem solving
   What is a problem?
   Categorising problems
   Methods of investigating problem solving
   About this book: Three issues in problem solving
   Summary

Part Two: Problem representation and problem-solving processes

   Introduction

2. Characterising problem solving
   The information-processing approach
   Analysing well-defined problems
   The interaction of the problem solver and the task environment
   Heuristic search strategies
   Summary

3. Problem representation: The case of insight
   Building a problem representation
   Re-representing problems
   Gestalt accounts of problem solving
   Information-processing accounts of insight
   The relationship between insight problems and other problems
   Influencing problem representations: The effect of instructions
   Summary

Part Three: Analogical problem solving

   Introduction

4. Transfer of learning
   Positive and negative transfer
   Set and Einstellung as examples of negative transfer
   Hypothesis testing theory
   Transfer in well-defined problems
   Specific transfer
   General transfer
   What kinds of knowledge transfer?
   Summary

5. Problem similarity
   Types of similarity
   Relational similarity
   Structural similarity
   Pragmatic constraints
   The relation between surface and structural similarity
   Summary

6. Analogical problem solving
   The importance of analogising
   Studies of analogical problem solving
   Accessing a source to solve a target
   Generating your own analogies
   Expository analogies
   “Aesthetic” analogies
   Summary

7. Textbook problem solving
   Difficulties facing textbook writers
   The role of examples in textbooks
   The processes involved in textbook problem solving
   Laboratory studies of within-domain and textbook problem solving
   Understanding problems revisited
   The role of diagrams and pictures in aiding understanding
   Providing a schema in texts
   Summary

Part Four: Learning and the development of expertise

   Introduction

8. Expertise and how to acquire it
   Induction
   Schema induction
   Schema-based knowledge
   AI models of problem solving and learning
   Skill learning in ACT-R
   The power law of learning
   Criticisms of production system models
   Neurological evidence for the development of procedural knowledge
   Summary

9. Experts, novices and complex problem solving
   What distinguishes experts and novices
   Are experts smarter? Are there differences in abilities?
   Skill development
   Knowledge organisation
   Cognitive processes
   Writing expertise: A case study
   Summary

10. Conclusions
   Problems, problems
   Problem representation
   Transfer
   Learning

Answers to questions
Glossary
References
Author index
Subject index
Illustrations

FIGURES

1.1 Problem solving
1.2 Representing a problem in terms of the problem statement, solution procedure, and goal state
2.1 A trivial version of the Tower of Hanoi puzzle
2.2 The sequence of imagined moves in solving the two-ring Tower of Hanoi puzzle
2.3 The initial state, goal state, operators and restrictions in the Tower of Hanoi puzzle
2.4 The possible states that can be reached after two moves in the Tower of Hanoi puzzle
2.5 State space of all legal moves for the three-ring Tower of Hanoi puzzle
2.6 Sources of information used to determine a problem space
2.7 Solution and state-action graph of the Hobbits and Orcs problem
2.8 The goal recursion strategy of solving the Tower of Hanoi puzzle
3.1 The Mutilated Chequer Board problem
3.2 A simplified version of the Mutilated Chequer Board problem
3.3 Simpler version of the Dots problem
3.4 A square “emerges” from the configuration of black dots
3.5 The problem can be seen as adding up an increasing number of boxes
3.6 By doubling the number of squares in Figure 3.5 the problem becomes the simple one of finding the area of a rectangle
3.7 Maier’s Two-string problem
3.8 Luchins’ Water Jars problem
3.9 Duncker’s Candle Holder problem (Source: Robertson, 1999, pp. 85–86)
3.10 Pruning the search tree
3.11 The different representations of the Mutilated Chequer Board problem (Source: Robertson, 1999, p. 50)
3.12 Diagrams used by Duncker in describing the Radiation problem (adapted from Duncker, 1945)
3.13 The Nine Dots problem
3.14 Two variants of the Nine Dots problem presented in MacGregor et al. (in press)
3.15 The Mad Bird problem from the bird’s point of view
3.16 The Mad Bird problem from the trains’ point of view
3.17 The River Current problem from the observer’s point of view
3.18 The River Current problem ignoring the observer
4.1 The 4-ring Tower of Hanoi state space showing the substructure of 1-, 2-, and 3-ring sub-spaces. Smaller sub-spaces embedded within larger sub-spaces illustrate the recursive nature of the problem. (Reprinted from Acta Psychologica, 1978, Luger and Bauer. “Transfer effects in isomorphic problem situations” pp. 121–131. © 1978, reprinted with permission from Elsevier Science)
4.2 Four cases where two tasks (A and B in each case) share a number of productions
4.3 Control panel device (adapted from Kieras & Bovair, 1986)
4.4 Three types of diagrammatic representations (adapted from Novick & Hmelo, 1994, p. 1299)
5.1 Intermediate representation of the Fortress problem
5.2 A continuum of similarity
5.3 A semantic hierarchy of transport systems
5.4 A semantic network showing connections between “car” and other forms of transport
5.5 A semantic network showing connections between “car” and pollution
5.6 A summary of processes needed to solve the analogy: hot:cold::dry:?
5.7 Relational mappings used to understand effects of those relations in a target domain
5.8 Similarity space: Classes of similarity based on the kinds of predicates shared. (From Gentner, 1989, p. 207. Reproduced by permission of the publishers)
5.9 An architecture for analogical processing (adapted from Gentner, 1989, p. 216)
6.1 Accessing and adapting a source problem to solve a target
6.2 The relation between a problem and its solution involving an implicit schema
6.3 Implicit schema cued by accessing source and used to solve target
6.4 Effects of the analogy between batteries and water reservoirs
6.5 Batteries and resistors arranged in series and in parallel
6.6 Near and far analogies and their likely effect on understanding
6.7 Hawking’s analogy of the universe with the earth (adapted from Hawking, 1988, p. 138)
7.1 Is a category of problems best explained by presenting a number of close variants or a range of examples including distant variants?
7.2 A hierarchy of Rate problems
7.3 Applying the source to Target 4 involves two levels of the hierarchy
7.4 Intermediate bridging representations
7.5 A bridging analogy or intermediate representation (B′) helps the analogiser to make the leap from a concept (A) to a concrete example (B)
8.1 The example-analogy view of schema induction. Using an ill-understood example to solve an ill-understood problem leads to the partial formation of a problem schema
8.2 The process of solving a problem using an example or analogy generates a schema for the problem type. When faced with a later problem of the same type, a solver will either access the schema or an earlier example (or both)
8.3 Repeated exposure to examples means that the schema becomes increasingly decontextualised
8.4 A problem model as a homomorphism
8.5 The architecture of ACT-R (Anderson, 1993)
8.6 The analogy mechanism in ACT-R
8.7 Mapping correspondences between the example and the current goal
8.8 The Power Law of Learning: (a) represents a hypothetical practice curve and (b) is the same curve represented as a logarithmic scale
9.1 Dimensions in which one can understand or study expertise
9.2 Raufaste and co-workers’ (1999) reinterpretation of the intermediate effect
9.3 Examples of problem diagrams used in Chi and co-workers’ studies (adapted from Chi et al., 1981)
9.4 The Hayes-Flower model of composition (based on Hayes, 1989; Hayes & Flower, 1980)
9.5 Structure of the knowledge-telling model (adapted from Bereiter & Scardamalia, 1987)
9.6 Structure of the knowledge-transforming model (adapted from Bereiter & Scardamalia, 1987)
10.1 External and internal factors affecting problem solving and learning

TABLES

2.1 Average number of moves in each part of the Hobbits and Orcs task
3.1 The Monster problem
4.1 Conditions in the text editing experiments of Singley and Anderson
5.1 “Creative” uses of analogy
5.2 Surface and structural features of problems
5.3 Mapping similar objects
5.4 Attribute mapping
5.5 Attribute mappings with variables replacing specific values
6.1 Correspondences between two convergence problems and their schema
6.2 Mappings between the surface features (objects) in the Fortress and Radiation problems
6.3 Study-test relations in Ross (1987)
6.4 Mappings between the source domains of water and crowds and the target domain of electricity
6.5 Different ways new concepts were presented in Donnelly and McDaniel (1993)
6.6 Examples of texts with and without an analogy (from Giora, 1993)
7.1 Close and distant variants of different problem types
7.2 Examples of rate×time problems
7.3 The benefits of an explanatory schema and of diagrams
8.1 Schema representation of the problem
8.2 Information presented in evaluation and generation tasks in McKendree and Anderson (1987)
9.1 Protocol excerpts from an expert, showing early schema invocation, tuning, and flexibility
Preface
Despite the title, I’m afraid this book will not necessarily help you solve your personal problems. Instead it is about the psychological processes involved in problem solving: how we go about solving various types of problems, what kinds of things make it hard to solve problems, and how we learn from the experience of solving them.

Who is it for?

The book was written mainly (but not exclusively) with undergraduate students of psychology in mind, and should be particularly relevant to courses in cognitive psychology, cognitive science, and problem solving and thinking. Students of artificial intelligence and computer modelling should also find several parts that are relevant to their studies.

A large section of the book concerns “transfer of learning”. In other words, a lot of the book is about the effectiveness or otherwise of certain teaching contexts. As a result there is much here that would be of interest to educationalists, including classroom teachers.

Business, industry, and even spin doctors are interested in how to solve problems and how to present information in an effective way. Part Two of the book shows that the way information is presented can influence how well it is understood, and hence affects the likelihood or otherwise of solving a particular problem.

Although the book was originally written for a particular audience, I have tried to ensure that it is accessible to the interested lay reader. Some concepts and phrases that are familiar to students of psychology are therefore explained, and there is a glossary at the back of the book to remind you of what these concepts mean.

What’s in it?

I should make it clear right away that this book does not cover such topics as models of deductive reasoning, or judgement and decision making under uncertainty. It sticks to a fairly restricted conception of problem solving, mainly from an information-processing perspective.

There are two implicit themes throughout the book. The first is simply the question of what kinds of things make it difficult to solve problems. The second is that problem solving and thinking in general involve processing the information given; that is, we are strongly influenced by the way information is presented and by those aspects of a problem or situation that appear to be salient.

The book contains 10 chapters divided in turn into four parts. Part One and Chapter 1 introduce the subject and some of the main concepts that will be dealt with in the rest of the book.

Part Two includes an overview of the subject, some of the history of problem-solving research, mainly by Gestalt psychologists, and how this research developed into the modern information-processing account. It deals with how we represent problems mentally and the kinds of processes we use as we go about solving them.

Part Three deals with research into how we apply what we learn in one situation to another situation. Common sense tells us that we must be able to transfer our learning from one context (learning French in a classroom) to another (going on holiday to France). However, research suggests that evidence of such transfer is sometimes very hard to come by. Part Three goes on to cover analogical problem solving, the creative use of analogy, and the pedagogical role of analogies as a means of teaching or explaining new concepts. It ends with a look at textbook problem solving, as a great deal of what we learn at school and college in science, mathematics, and computer programming comes in the form of textbooks.

Part Four includes two chapters on learning from examples and the eventual development of expertise. The penultimate chapter looks at writing as a complex problem-solving task with the aim of integrating many of the themes that have come up in the book. It also includes a general summary and conclusion.

Throughout the book there are “Activities” to get you to think about the material presented. It is the role of a teaching text to teach, and, in the absence of a human teacher, the text itself must do what it can to help the student learn. I hope you find these activities and questions fun to do. There are also “Study Boxes” that provide detailed information about specific studies, and “Information Boxes” explaining some concepts in more detail. The aim of these is to cover certain studies and concepts in more depth than might otherwise be possible, and to prevent the flow of the text from being interrupted.

ACKNOWLEDGEMENTS

I am indebted to those who commented on early drafts of parts of this book. My thanks go to Thom Baguley, Norma Pritchett, Jill Cohen, Stephen Payne, Kathryn, and several anonymous reviewers for keeping me on the straight and narrow.

The quotation by S. Mithen is taken from The prehistory of the mind. © 1996, Thames and Hudson. Reprinted with permission of the publishers. Quotations from Newell, A. are reprinted by permission of the publisher from Unified Theories of Cognition by Allen Newell, Cambridge, Mass: Harvard University Press. © 1990, by the President and Fellows of Harvard College.

S. Ian Robertson
University of Luton
PART ONE Introduction
CHAPTER ONE Introduction to the study of problem solving
We face problems of one type or another every day of our lives. Two-year-olds face the problem of how to climb out of their cot unaided. Teenagers face the problem of how to live on less pocket money than all their friends. Students have mathematics problems to do for homework. We have problems thinking what to have for dinner, how to organise a dinner party, what to buy Poppy for Christmas, how to fix the vacuum cleaner when it seems to have stopped sucking. Problems come in all shapes and sizes, from the small and simple to the large and complex, and from the small and complex to the large and simple. In some cases at least it is fairly obvious to those with experience what people should do to solve the problems. This book deals with what people actually do in their attempts to solve them.

To help get a handle on the issues involved in the psychology of solving problems, we need a way of defining our terms: what is a problem, exactly, and what is involved in problem solving? Are there useful ways of classifying problems that would help in working out how they are typically dealt with? What methods are used in studying problem solving? What issues are raised by the study of human (and machine) problem solving? The aim of this introduction is to touch on some of these questions and to explain how this book is structured and the kinds of things you will find in it. To begin with we will look at what constitutes a “problem”; a number of definitions and concepts that recur throughout the book are introduced here.

WHAT IS A PROBLEM?

You have a problem when you are sitting at home in Manchester and you want to be on the beach at St Tropez, or when you read a section of a textbook and then try an exercise problem at the end of the section and haven’t a clue how to do it. In these examples there is a difference between where you are now (e.g., in Manchester) and where you want to be (e.g., in St Tropez). In each case “where you want to be” is an imagined (or written) state that you would like to be in. In other words, a distinguishing feature of a problem is that there is a goal to be reached and how you get there is not immediately obvious.

To put it at its simplest, you have a problem when you are required to act but you don’t know what to do. However, perhaps a fuller definition might be more appropriate (Duncker, 1945, p. 1):

    A problem arises when a living creature has a goal but does not know how this goal is to be reached. Whenever one cannot go from the given situation to the desired situation simply by action, then there is recourse to thinking… Such thinking has the task of devising some action which may mediate between the existing and the desired situations.
Figure 1.1. Problem solving.
What this definition highlights is that if you know exactly what action (or series of actions) to take in a given situation, then you don’t really have a problem. It’s only when you don’t know exactly what steps to take to get to the goal and have to take some mediating action that you have a problem. Figure 1.1 illustrates in an abstract form what is involved in a problem, and two forms of mediating action. The first action (Figure 1.1b) is to try to take steps along a path that seems to lead in the direction of the goal. This is not a particularly sensible thing to do if your problem is how to get to St Tropez. The second is to perform some other action, such as going to a travel agent, that will make it easier to get to the goal (Figure 1.1c). This is a reasonable thing to do, as you will probably get everything you need to know from the travel agent.
ACTIVITY 1.1
According to the definition just given, would you class the following as problems? If so, why? If not, why not?

a. 2+3=?
b. 237×38=?
In Activity 1.1, (a) is not a real problem because all you need to do is retrieve the answer directly from memory. It does not require a sequence of steps for its solution. For (b) things are a little bit trickier. You have a goal, but you probably do know what to do to achieve it, so in one sense you don’t really have a problem. On the other hand, you would probably have recourse to thinking.
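The contrast between the two items can be made concrete with a small sketch. It is only an illustration: the lookup table standing in for memorised number facts and the function names are assumptions introduced here, not anything proposed in the studies discussed in this book. Item (a) is answered in a single retrieval step, whereas item (b) is answered by running a known procedure that passes through intermediate results.

```python
# Hypothetical illustration of "retrieve the answer" versus "work it out".

# A stand-in for memorised number facts (single-digit sums only).
MEMORISED_SUMS = {(a, b): a + b for a in range(10) for b in range(10)}


def retrieve_sum(a, b):
    """Answer by direct lookup, as with 2+3: one step, no working out."""
    return MEMORISED_SUMS[(a, b)]


def long_multiply(x, y):
    """Answer by a known procedure: one partial product per digit of y."""
    partial_products = []
    total = 0
    for place, digit in enumerate(reversed(str(y))):
        partial = x * int(digit) * 10 ** place
        partial_products.append(partial)
        total += partial
    return total, partial_products


print(retrieve_sum(2, 3))
answer, steps = long_multiply(237, 38)
print(steps, answer)
```

The second function knows exactly what to do at every point, which is why (b) is not a problem in Duncker’s sense, even though reaching the answer still takes a sequence of steps.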
Figure 1.2. Representing a problem in terms of the problem statement, solution procedure, and goal state.
J. R. Anderson once wrote “Problem solving is defined as any goal-directed sequence of cognitive operations” (Anderson, 1980, p. 257). This definition does not distinguish between a sequence of actions that one knows will achieve a goal and a sequence of actions one undertakes when one does not immediately know how to reach the goal. The first is the result of experience and the second is the situation a novice faces. On the other hand, the view of problem solving emphasised in Duncker’s quotation can be summarised by “What you do when you don’t know what to do” (Wheatley, 1984, p. 1, cited in Frensch & Funke, 1995). More recently Anderson’s definitions of problem solving have distinguished between early attempts at solving problems and the automated episodes that are still “no less problem solving” (Anderson, 2000a, p. 241; 2000b, p. 311).

Problem solving starts off from an initial given situation or statement of a problem (known as the initial state of the problem). Based on the problem situation and your prior knowledge you have to work towards a solution. When you reach it you are in the goal state of the problem. On the way from the initial state to the goal state you pass through a number of intermediate problem states. (This aspect is dealt with in more detail in Chapter 2.)

In some cases you don’t know what the answer is in advance and you have to find it. How you find the answer might not be particularly relevant. In other problems it is precisely how you get the answer that is important: “I have to get back by six and my car’s broken down”; “I have to get this guy into checkmate”. The point of doing exercise problems in textbooks is to learn how to solve problems of a particular type. If the answer was all you were interested in, you could look that up at the back of the book. For example, in cases where you have to prove something, the “something” is given and it is the proof that is important.

Finally, you can have problems where you have only a vague idea what the answer is (although you could probably recognise it when you saw it) and an even vaguer idea of how to get there. Writing a 4000-word experimental report as an assignment is an example. The only thing you can be really sure of is whether you have written 4000 words or not. Another example might be finding a compromise solution in an industrial dispute: you are very unlikely to know in advance what the compromise position is going to be, or how you are going to find it.

The term “solution” can refer to two aspects of problem solving: either the final solution (what I have been calling the “answer”) or the means of finding the answer (the solution procedure). It is usually obvious from the context what the word “solution” refers to, so it will be used throughout this book to cover both cases, except where it would be clearer if one were used rather than another.

In Figure 1.2, the A box represents the problem statement or the given situation. The B box is the goal or rather the goal state of the problem, and the line that links the A box to the B box is the solution procedure. This line hides what is often a complex sequence of steps, relations between the parts of a problem situation, and information accessed from memory.
CATEGORISING PROBLEMS

Problems range from trying to find somewhere to have lunch, to solving the daily crossword, to trying to find a cure for cancer. Given this range of problems it would be useful to be able to categorise them. Have a look at the problems in Activity 1.2 as briefly as you like. As you think about each one, think also about how you are trying to solve it, and how it might be categorised.

There are a number of ways in which the problems can be categorised. One way is to look at the types of solutions required. For example, they can be categorised according to:

1. the prior knowledge required to solve them;
2. the nature of the goal involved;
3. their complexity;
4. whether the problem tells you everything you need to know to solve it, or whether you need to work out for yourself what you are supposed to do;
5. whether it is the same as one you’ve solved before;
6. whether it needs a lot of working out or whether you can solve it in one step if you could only think what that step was.
ACTIVITY 1.2
Have you encountered these problems before? What kinds of knowledge are required to solve them?

1. Write a reverse-funcall macro that calls a function with its arguments reversed.
2. Solve: 3(x+4)+x=20
3. The Dots Problem: Starting with an arrangement of dots given below: rearrange them so that they end up in the following arrangement: A dot can only move to an empty space adjacent to it. A dot can jump over only ONE dot of either colour. White dots can only move to the right and grey dots to the left.
4. Given 16 matches arranged in 5 squares like this: move 3 matches to form 4 squares.
5. What policies should the government adopt to reduce crime?
Knowledge needed to solve novel problems

Faced with a novel problem, we bring to bear what we already know in order to solve it. However, I would guess that problem 1 in Activity 1.2 (a problem in the Lisp programming language) is completely meaningless to most readers. At times we may have no knowledge of the problem or can remember nothing like it in the past. Problem 2 may be straightforward to many. Problem 3 may also be new to you, but this time you can fall back on a few general strategies that have worked in the past with a variety of problems. In problem 3 you might start moving pieces to see where it gets you, and try to get closer and closer to the goal one step at a time. In other unfamiliar problems you might start generating hypotheses and testing them out, and keep going until you find one that fits. These strategies often work in a variety of spheres of knowledge, or domains, so they can be called domain general.

On the other hand, to solve problems 1 and 2 you would need a fair amount of knowledge specific to that kind of problem. Most new problems in a familiar domain remind us of similar problems we have had experience with in the past. In such cases we can use domain-specific strategies: ones that work only in this particular type of situation. For example, if you have to organise your very first holiday in Crete, your strategy might be to go along to a travel agent. This strategy only works in the domain of organising holidays. The travel agent would be rather confused if your problem was what to eat at a dinner party.

Some of the problems in Activity 1.2 have a simple structure. Problems 3 and 4 have simple rules and a clear description of how everything is set up at the beginning of the problem (the initial state of the problem) and what the problem should look like at the end (the goal state). There are also some restrictions on what you are allowed to do in such problems. For example, in problem 4 you cannot move more than three matches. Puzzle problems such as these are examples of knowledge-lean problems that require very little knowledge to solve them.

Problems 1 and 5, on the other hand, are knowledge-rich problems. Problem 1 requires a lot of knowledge of Lisp programming. If you have that knowledge then it may be quite simple. If you are a novice Lisp programmer it might be quite difficult. If you have no knowledge of Lisp the question is completely meaningless. Similarly, problem 5 requires a lot of knowledge of the causes of crime that may involve politics, sociology, criminal law, psychology, and so on.
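The strategy of generating hypotheses and testing them until one fits, mentioned above as domain general, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the candidate range (whole numbers from -10 to 10) is an arbitrary choice made here. It is applied to problem 2 purely to show that the strategy needs no knowledge of algebra, only a way of checking whether a guess fits.

```python
def generate_and_test(candidates, fits):
    """Try each candidate in turn and return the first that passes the test."""
    for guess in candidates:
        if fits(guess):
            return guess
    return None  # none of the candidates fitted


def satisfies_problem_2(x):
    """Test for problem 2 in Activity 1.2: does 3(x+4)+x equal 20?"""
    return 3 * (x + 4) + x == 20


# ASSUMPTION: the answer is a whole number between -10 and 10.
solution = generate_and_test(range(-10, 11), satisfies_problem_2)
print(solution)
```

The price of such generality is efficiency: the search succeeds only if the answer happens to lie among the guesses, whereas someone with domain-specific knowledge of algebra can go straight to it.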
Different types of goal

As mentioned earlier, some of the problems explicitly state what the goal “looks like”. The task is to find a means of getting there. The Dots problem explicitly gives the goal. The problem here is: how do you get to it? The goal in problem 6 in Activity 1.3 is slightly different. Here the goal is to come up with an answer (a certain number of francs) but you won’t necessarily know if it is correct (see next section). In problem 7, although you can probably readily evaluate whether you have reached your goal or not, there appear to be very few limits on how you get to it. You would probably impose your own constraints, however: getting caught might not be a good idea. Problem 8 has an infinite number of solutions that are probably rather hard to evaluate; it may well be an insoluble problem, unfortunately.
ACTIVITY 1.3
If you managed to achieve the goal in the Dots problem in Activity 1.2 and the three problems below, how confident would you be that the solution was adequate?

6. A tourist in St Tropez wants to convert £85 to French francs. If the exchange rate is 28FF to the pound how many francs will she get?
7. “Your task is in two parts… One, to locate and destroy the landlines in the area of the northern MSR [main supply route, Iraq]. Two, to find and destroy Scud… We’re not really bothered how you do it, as long as it gets done.” (McNab, 1994, p. 35)
8. How do I write a textbook on problem solving that everyone will understand?
Well-defined and ill-defined problems

Problems such as the Dots problem contain all the information needed to solve them, in that they describe the problem as it stands now (the initial state), what the situation should be when you have solved the problem (the goal state), and exactly what actions you have to take to solve it (the operators). In the Dots problem the operation is MOVE (move a single dot). You are also told exactly what you are not allowed to do (the operator restrictions). Because thinking is done in your head, the operations are performed by mental operators. Although multiplication, say, is an arithmetic operator, you still have to know about it in order to apply it. Operators are therefore knowledge structures. (A useful mnemonic is to remember that the acronym from Initial state, Goal state, Operators, and Restrictions spells IGOR.) A problem that provides all the information required to solve it is therefore well-defined.

Actually, although you know where you are starting from, where you are going to, and what actions to perform to get there, it is not quite true to say that you have been given all the necessary information, as you are not told what objects to perform the action on. For example, in algebra (e.g., problem 2) there are four possible basic arithmetic operations (multiplication, division, addition and subtraction) but you might not know how to apply those operators or in what order to apply them to get the answer.

ACTIVITY 1.4
Try this problem:

9. The tale is told of a young man who once, as a joke, went to see a fortune-teller to have his palm read. When he heard her predictions, he laughed and exclaimed that fortune telling was nonsense. What the young man did not know was that the fortune-teller was a powerful witch, and, unfortunately, he had offended her so much that she cast a spell on him. Her spell turned him into both a compulsive gambler and also a consistent loser. He had to gamble but he never ever won a penny. The young man may not have been lucky at cards but it turned out that he was exceedingly lucky in love. In fact, he soon married a wealthy businesswoman who took great delight in accompanying him every day to the casino. She gave him money, and smiled happily when he lost it all at the roulette table. In this way, they both lived happily ever after. Why was the man’s wife so happy to see him lose?

Problem 9 in Activity 1.4 is different. The initial state of the problem is given, and the goal is to find a reason why the man’s wife is apparently happy to watch him lose money, but you are not told what operators to apply to solve the problem or what the restrictions are, if any. Since these two elements are missing this problem is ill-defined.
ACTIVITY 1.5 Is problem 2 (repeated below) well-defined or ill-defined?
Solve: 3(x+4)+x=20
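One way to see which operators problem 2 calls on is to write out a solution path. The steps below are a sketch of one possible order of operations, with the operator used at each step noted alongside. Other orders would reach the same goal state.

```latex
\begin{align*}
3(x+4)+x &= 20 \\
3x+12+x  &= 20 && \text{(multiply out the bracket)} \\
4x+12    &= 20 && \text{(add like terms)} \\
4x       &= 8  && \text{(subtract 12 from both sides)} \\
x        &= 2  && \text{(divide both sides by 4)}
\end{align*}
```

Substituting back gives 3(2+4)+2=20, so the final line satisfies the original equation.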
In Activity 1.5, if you thought problem 2 was well-defined then you are assuming that the solver knows what operators are available. You would also have to assume that the solver has a rough idea what the goal state should look like. This is a knowledge-rich domain—you need a certain amount of knowledge to solve the problem. If you can assume that the solver has the relevant knowledge and can readily infer the missing bits of information (e.g., the operators, the goal state), then the problem is well-defined; otherwise the problem is ill-defined. For most of you the Lisp problem in Activity 1.2 is ill-defined, because you probably haven’t a clue how to start writing a Lisp function, whereas for a Lisp programmer it may be perfectly obvious what the operators and goal are (although this does not of itself make the problem easy). MacGregor, Ormerod and Chronicle (in press) also refer to abstractly defined problems. These are problems that have all the components of a well-defined problem except that the goal is defined in abstract terms. An example would be the Matchsticks problem in Activity 1.2. If you solve the problem you end up with a state that satisfies the description of the goal. Semantically rich and semantically lean problems Another way of characterising problems is in terms of how much pre-existing knowledge the solver brings to the problem. When we encounter a word for the first time its meaning may be unclear. With repeated encounters in a variety of contexts its meaning is likely to become clearer—it will have a lot of semantic associations. Semantics is the study of meanings, and the more you know of a word or concept (the more associations and relations you form) the bigger will its semantic space become. 
When someone is presented with puzzles such as the Dots problem or a matchstick puzzle for the first time, the puzzle is semantically lean (or semantically poor) as far as the solver is concerned; that is, there is very little knowledge or prior experience that the solver can call on to help solve the problem. The same would be true of someone learning to play chess for the first time. The learner does not have a great deal of experience on which to draw. For a chess expert, on the other hand, the game of chess is semantically rich. The expert can bring to bear a rich body of knowledge and experience about varieties of opening moves, defensive positions, strategies, thousands of previously encountered patterns of pieces, whole games even. Problems sharing the same structure Consider Activity 1.6. Now look at problem 6 in Activity 1.3. On the surface the problem about exchanging pounds sterling for French francs and the problem about the distance travelled by the car seem completely different. However, both problems share the same underlying solution principle. That is, both problems use the same equation in their solution: a=b×c. In the car problem, the distance travelled equals the speed the car travels
ACTIVITY 1.6 Does this problem remind you of any others you have seen recently?
10. A car travelling at an average speed of 40 mph reaches its destination after 5 hours. How far has it travelled?
times the time taken. In the currency problem, the number of francs equals the rate of exchange times the number of pounds sterling. Although these problems differ in terms of their surface features (they seem to be dealing with entirely different things), they both nevertheless share the same underlying structural features: they share the basic equation that you would use to solve them. Despite the fact that they both involve the same simple equation, the chances are that you wouldn’t have noticed their similarity until it was pointed out.

Multistep problems and insight
ACTIVITY 1.7 Solve this problem:
11. I have a drawer full of black socks and blue socks in the ratio 4 to 5 respectively. How many socks will I have to pull out of the drawer to ensure I get a pair of the same colour?
Some problems can be characterised by the fact that they have a crucial step necessary for a solution. Once that step has been taken the solution becomes obvious either immediately or very quickly afterwards. These are referred to as insight problems. You may have to think for a while about problem 11 in Activity 1.7 (for example, you may imagine yourself pulling out socks at random or you may concentrate on the likely effect of the 4:5 ratio) before you suddenly see the answer. Similarly in the Matchsticks problem (problem 4) you may shuffle the matchsticks around for a while before an answer suddenly hits you. Problem 9 about the fortune-teller may be puzzling for a long time before the penny drops. The metaphors in the last three sentences (“you suddenly see the answer”, “the answer hits you”, “the penny drops”) serve to emphasise that insight is usually regarded as a sudden phenomenon in which the solution appears before you without any obvious step-by-step progression towards it, such as would be necessary in the Dots problem, say. It is because of the immediacy of the phenomenon of insight (sometimes referred to as the “Aha!” or “Eureka” experience) that some psychologists, particularly the Gestalt psychologists in the first half of the twentieth century, have regarded insight as something special in problem solving. Indeed, it was one of the first types of problem solving to be systematically studied.

METHODS OF INVESTIGATING PROBLEM SOLVING

Studying how people solve problems is relatively straightforward. Experimenters tend to present their subjects with problems and then sit back and watch what happens. Problems can be manipulated in various ways. Typically, researchers have manipulated the way instructions are presented, whether hints or clues are given, the number of problems presented, and so on.
Experimenters can take a variety of measures such as the number of problems correctly solved, the time taken to solve them, the number and type of errors made, how much subjects can remember, variations in the speed of problem solving during a problem-solving task, etc. However, a lot of insight into human problem solving has been gained by simply asking subjects to “think aloud” while solving problems. By analysing these think-aloud verbal protocols, we may be able to infer the processes underlying the subjects’ problem solving. Furthermore, as verbal reports display the sequential nature of problem solving, they can be used as the basis of computer models of problem solving.
Protocol analysis

The use of verbal reports as data is based on the fact that human thinking can be construed as manipulating or processing information. A computer is also an information-processing system. As I type these words, the computer encodes them as strings of 0s and 1s. These strings can be understood as information. As I type, I also make mistakes; I frequently type “teh” for “the”. This means I have to go back, double click on the word “teh” to highlight it, and type over it. That is the observed effect; the computer, however, has to perform computations on the strings of 0s and 1s to perform that edit. I could also get the computer to go through the whole text replacing all instances of “teh” with “the”; once again the computer would perform computations on the strings of digits to perform that action. Performing computations on strings of digits is the way the computer processes information. The actual processes involved in carrying out these computations are invisible to the writer but the effects are visible on the computer screen.

Human beings can also be regarded as information-processing systems that have a limited-capacity short-term working memory and a vast long-term memory (LTM). The limitations to the short-term memory (STM) mean that we can attend to and store only a few things at a time (Miller, 1956). In the past STM was thought of as a relatively passive memory system. Working memory (WM), on the other hand, is a “dynamic” view of STM. WM is involved in selecting, manipulating, and interpreting information in a temporary memory store (see e.g., Baddeley, 1997). According to this more recent view of short-term working memory, we tend to encode information primarily visually or phonologically (e.g., Baddeley, 1981): we can see images in our mind’s eye, and hear an inner voice. As with the computer, the processes that produce these images or voices are not accessible because they are very fast and below the level of consciousness.
What we do have access to are the results of those processes, and those results, it is argued (Ericsson & Simon, 1993), are verbalisable: we can talk about them. The job of the researcher is to analyse verbal data in line with some theoretical rationale that allows her or him to make inferences about the processes underlying the verbal report. According to Simon and Kaplan (1989, p. 23):

Verbal protocols generally provide explicit information about the knowledge and information needed in solving a problem rather than about the processes used. Consequently it is usually necessary to infer the processes from the verbal reports of information needed instead of attempting to code processes directly.

Ericsson and Simon’s (1993) theory of thinking aloud is described in more detail in Information Box 1.1.
INFORMATION BOX 1.1 Ericsson and Simon’s (1993) theory of think-aloud protocols

Ericsson and Simon’s theory of thinking aloud asserts that there are three kinds of verbalisation. Type I verbalisations are direct verbalisations. This is where subjects simply say out loud what their inner voice is “saying”. For example, you can look up a phone number and keep it in STM long enough to dial the number by rehearsing it; that is, by repeating it to yourself. It is quite easy therefore to say the number out loud, as it is already in a verbal code in STM. Direct verbalisations are also involved in more complex tasks. Most non-mathematicians would use a verbal code to solve a problem such as 48×24 in their heads. Saying it out loud instead is direct verbalisation. This kind of verbalisation does not involve reporting on one’s own thought processes by saying things such as “I am imagining the 4 below the 8”. Type I direct verbalisations should not therefore interfere with normal problem solving by either slowing it down or affecting the sequence of problem-solving steps.
Type II verbal reports involve recoding the contents of STM. This type of verbalisation does slow down problem solving to some extent as it requires the subject to recode information. The most common example is where subjects are asked to verbalise their thoughts when performing an imagery task. Describing an image involves recoding into a verbal code. When the processing load becomes too great the subjects find it harder to verbalise, as verbalising uses up the attentional resources they are trying to devote to the imagery task (e.g., Kaplan & Simon, 1990). Ericsson and Simon’s theory predicts that Type II reports should not affect the sequence of problem solving.

Type III verbal reports involve explanations. Unlike Type I and Type II verbal reports, verbalisations that include explanations or reasons for doing something can have a strong effect on problem solving. Subjects instructed to give a verbal or written rationale for why they performed a particular action improved on their subsequent problem solving (Ahlum-Heath & DiVesta, 1986; Berry, 1983). Similarly, subjects asked to elaborate on a text aloud recalled more than a silent control group (Ballstaedt & Mandle, 1984). Although there are benefits for problem solving and recall due to elaborating and providing reasons, such verbalisations are liable to disrupt the task in hand. Providing explanations involves interrupting a task to explain an action. Furthermore, we cannot always be sure that subjects’ explanations accurately reflect the processes they actually used. For these reasons Type I and Type II protocols are the most common in analysing human problem solving.
Artificial intelligence models To the extent that human thinking can be regarded as information processing, it should be possible to simulate or model human thinking on other information-processing devices. The information-processing device in most common use is the digital computer and so it has been used to model aspects of human thinking. Such models are only as good as the theory behind them and can only be built if the theory is presented in enough detail. If our theory of human thinking is specific enough, then it can be used as the basis of a computer program that instantiates (incorporates in a concrete form) that theory. There are two main contenders for modelling the structure of the mind, each of which tends to emphasise different aspects of human cognition. Because they represent the structure of the mind, they are known as cognitive architectures. To understand what a cognitive architecture is, one can think of the architecture of a house (Anderson, 1993). The occupants of the house need to be protected from the rain so some form of roof is required. Structures are needed to hold this roof up and walls are needed to keep the occupants warm. Certain parts of the house have to be reserved for certain functions; the occupants need somewhere to prepare food, to sleep, to relax, and so on. There also needs to be some way of getting in and out, and windows to let the light in. Like a house, a cognitive architecture contains certain structures that are required to support cognition. Different parts have different functions; some parts might store information, others might process information from the outside, and so on. One architecture, known as production systems, places emphasis on the fact that much of our behaviour, particularly problem-solving behaviour, is rule-governed, or can be construed as being rule-governed, and is often sequential in nature—we think of one thing after another. In the other corner is connectionism. 
The emphasis of this architecture is on learning and categorising, and on how we can spontaneously generalise from examples. As this book deals mainly with our (usually) conscious experience of problem solving, there is a discussion of production systems in Chapters 4 and 7 in particular. Through lack of space, connectionism is not discussed further.
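To make the idea of rule-governed, sequential behaviour a little more concrete, here is a toy production system in Python. It is a sketch under stated assumptions rather than any of the architectures cited in this book: the function name `run_production_system`, the rule names, and the "fire the first matching rule" policy are all invented for illustration. Each rule is a condition-action pair that inspects and changes a small working memory, and the rules happen to solve an equation of the form ax+b=c.

```python
def run_production_system(rules, working_memory, max_cycles=20):
    """Fire one matching rule per cycle until no rule matches."""
    trace = []
    for _ in range(max_cycles):
        for name, condition, action in rules:
            if condition(working_memory):
                action(working_memory)   # the rule changes working memory
                trace.append(name)
                break                    # assumed policy: first match wins
        else:
            break                        # no rule matched, so halt
    return trace


def subtract_constant(wm):
    wm["c"] -= wm["b"]
    wm["b"] = 0


def divide_by_coefficient(wm):
    wm["c"] /= wm["a"]
    wm["a"] = 1


# IF there is a constant on the left THEN subtract it from both sides.
# IF only a*x is left and a is not 1 THEN divide both sides by a.
rules = [
    ("subtract-constant", lambda wm: wm["b"] != 0, subtract_constant),
    ("divide-by-coefficient",
     lambda wm: wm["b"] == 0 and wm["a"] != 1, divide_by_coefficient),
]

memory = {"a": 4, "b": 12, "c": 20}      # stands for 4x + 12 = 20
trace = run_production_system(rules, memory)
print(trace)
print(memory["c"])
```

The trace records one rule firing after another, which is the sequential character the text attributes to production systems. How a real architecture chooses between several matching rules is a design question this sketch sidesteps.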
ABOUT THIS BOOK: THREE ISSUES IN PROBLEM SOLVING This book deals with three main questions in the psychological study of problem solving. How do we mentally represent a given problem and what processes then come into play to solve it? To what extent do we apply what we learn in one context in a different context? How do we learn from experience of solving problems? To answer these questions the rest of this book is divided into three parts. Part Two covers problem representation and the processes and strategies used to solve problems. Part Three covers transfer of learning and reasoning by analogy. Part Four covers learning and the development of expertise. Representing problems and the processes involved in solving them The “big issue” is how we form a representation of a problem in the first place. What is this representation like? How can we characterise it? Can we make it any easier to improve our representations so that problem solving can be made easier? A general aspect of human thinking that crops up throughout this book is the idea that it is a lot easier to use the information available to us to solve problems than to engage in generating inferences. That is, we are adept at using the information that appears salient in the environment (see also Gigerenzer & Todd, 1999). Examples of how the problem statement strongly influences how people will attempt to solve the problem occur throughout the book, but the power of salience is dealt with in particular in Chapters 3 and 8. Part Two discusses the nature of problem representation and how we process the information available to us. Chapter 2 elaborates on some of the points mentioned in this introduction and looks principally at welldefined problems. 
The study of problem-solving skill has to start somewhere, and puzzle-like problems are useful in that the researcher knows just about everything there is to know about the task domain (the problem in its context) and can concentrate on the representations and processes the solver uses without having to worry about how much previous knowledge the solver brings to it. Chapter 3 looks at the ill-defined problems used by the Gestalt psychologists in their study of insight. Solving insight problems is a question of finding the correct representation of the problem. Early studies of insight problems and the like have tended to dwell on why we often fail to solve them. Recent reappraisals of insight are discussed in the light of information-processing psychology. The problem of transfer of learning Part Three looks at transfer of learning and analogical problem solving (solving one problem by analogy to another). Transfer is a problem for psychologists because there is a paradox that has emerged from research on analogical transfer. In the first place it seems to stand to reason that we use our past experience of problems to solve new problems of the same type. Unfortunately we seem to be very bad at recognising that a problem we have just done can actually help us solve the problem we are currently attempting. More generally it seems that learning one subject doesn’t usually help you learn another. Skills are not readily transferable— they are generally domain-specific. Nevertheless there are some aspects of learning that must be transferable to some extent. Learning how to structure an argument in an essay on some topic in geography is likely to come in useful if we find we have to write an essay in sociology. Chapter 4 deals with some of the general issues involved in transfer. If we are to use one example to help solve another, there must be some kind of similarity between the two problems. 
Chapter 5 examines in some depth how we can characterise different types of similarity. Problems can be similar for several reasons—they may be superficially similar (for example, they may seem to be about the same kinds of objects); on the other hand, they may be similar because of some abstract
quality (for example, they may be German verbs with separable prefixes); more importantly, they may be similar because they have the same solution structure (that is they are structurally similar). Chapter 6 deals with the processes involved in analogical problem solving. If you are faced with a new problem and something about it reminds you of an earlier one, the chances are that this reminding is based on surface similarities between problems. Surface similarity is like saying that one book is similar to another because they both have blue covers. When a new problem shares few surface similarities with one you already know, the chances are you won’t be reminded of the earlier one. In fact, we are not very good at recognising the analogy between a problem we have done in the past and a current one unless we are explicitly reminded of the earlier one or told that they are in some way the same. In analogical problem solving the analogue is presumed to be in long-term memory and the problem is mostly one of retrieving it. The other difficulty in analogical problem solving is adapting the analogue to solve the current problem. Chapter 6 also looks at the role of analogies as powerful teaching devices. Chapter 7 looks at how analogies can be used to enhance teaching textbooks. The kind of analogical problem solving that one finds in teaching texts is different from the kind examined in Chapter 6, which concentrates on analogies in different domains. Textbooks use examples that students are supposed to use to solve exercise problems. This is within-domain analogising. When students are learning something for the first time they are unlikely to have an earlier problem in long-term memory that they can retrieve in the first place. Instead they have a textbook. 
Textbooks usually have examples to explain a solution procedure, but because students are novices they are unlikely to have a complete understanding of the examples, so adapting them to solve an exercise problem can be particularly difficult.

How do we ever manage to learn anything?

Part Four follows on from the section on transfer and textbook problem solving by looking at how we learn from examples. Chapter 8 concentrates on two interrelated aspects of problem solving and learning:
• how we manage to abstract out a problem structure from experience. This problem structure (or problem schema) allows us to identify new problems of the same type and to access potentially useful solution strategies;
• how we abstract out general rules from specific examples. This is known as rule induction.
Chapter 8 also covers artificial intelligence (AI) models of problem solving and learning, and cognitive architectures. The ultimate goal of learning, at least for some people, is to become an “expert”. Chapter 9 therefore discusses studies of experts and novices, and looks at what distinguishes them. The chapter also discusses complex problem solving, the kind that experts engage in and that is far removed from the puzzle-like problems with which the study of problem solving began. The chapter ends with a brief look at writing expertise.

SUMMARY

1. Problem solving involves finding your way towards a goal. Sometimes the goal is easy to see and sometimes you will recognise it only when you see it.
2. Problems can be categorised as being:
• Knowledge-lean, where little prior knowledge is needed to solve them; you may be able to get by using only domain-general knowledge, that is, general knowledge of strategies and methods that apply to many types of problem.
• Knowledge-rich, where a lot of prior knowledge is usually required; such knowledge is domain-specific: it applies only to that domain.
• Well-defined, where all the information needed to solve the problem is either explicitly given or can be inferred.
• Ill-defined, where some aspect of the problem, such as what you are supposed to do, is only vaguely stated.
• Semantically lean, where the solver has little experience of the problem type.
• Semantically rich, where the solver can bring to bear a lot of experience of the problem type.
• Insight problems, where the solution is usually preceded by an “Aha!” experience.
3. Methods of investigating problem solving include:
• “Laboratory” experiments, where variables are controlled, and the nature of the problem to be solved is under the control of the researcher.
• Analysis of verbal protocols, where people talk aloud while solving problems and the resulting protocol is then analysed.
• Artificial intelligence models, where theories of human problem solving are built into a computer program and tested by running the program. Different types of “architecture” can model different aspects of human thinking.
4. Three important issues in relation to how people solve problems are addressed throughout the book. These are:
• How we generate an initial representation of a problem and the problem-solving strategies we adopt when we don’t know what to do.
• The extent to which information or skills learned in one context can be transferred to another context. This also encompasses the role of analogy in thinking.
• The processes involved in learning and the development of expertise.
PART TWO Problem representation and problem-solving processes
Part Two: Introduction
Imagine for a moment that you are a parent and that you normally drop your daughter off at school on your way to work. One day you both pile into the car and discover that the car won’t start. What do you do? Let’s suppose, for the sake of argument, that it is important you get to work and your daughter gets to school as soon as possible. However, you have to get your car fixed. You decide to call a taxi and phone a garage from work to ask them to go out and look at the car. You then realise the garage will need a key for the car. While you are waiting for the taxi you call the garage and explain the situation. You say you will drop the keys off at the garage on your way to work. Okay so far. The next problem is how to get home and pick your daughter up from school in the evening. You decide that the simplest thing would be to take a bus from work that stops close to the garage and pick up the car (assuming there’s nothing seriously wrong with it). You can then go and pick up your daughter in the car. She may have to wait in school for a while but, with a bit of luck, she shouldn’t have to wait longer than about quarter of an hour. Five minutes later, the taxi arrives. The point of this little story is that all you have done to solve the problem is to stand beside the telephone and think. It is exceedingly useful to be able to imagine— to think about—the results of an action or series of actions before you actually perform them. But what exactly does it mean to think? Thinking involves reasoning about a situation, and to do that we must have some kind of dynamic “model” of the situation in our heads. Any changes we make to this mental model of the world should ideally mirror changes in the real world. So where does this mental model or mental representation come from? Before engaging in any kind of conscious problem-solving activity the solver needs to understand the problem in the first place. 
Understanding a problem means building some kind of representation of the problem in one’s mind, based on what the situation is or what the problem statement says and on one’s prior knowledge. It is then possible to reason about the problem within this mental representation. Generating a useful mental representation is therefore the most important single factor for successful problem solving. What kind of representation do we form of a problem? When we talk of a mental representation we are referring to the way that information is encoded. The word “rabbit” can be represented visually as a visual code, and by what it sounds like as a phonological code. We also know what “rabbit” means, so there must be a semantic code (a “meaning code”). The way we encode information from a problem situation is often based on what we are told or what we read. When we read a piece of text, for example, we not only encode the information that is explicitly stated, but we also have to make inferences as we read to make sense of the text. Most of these inferences are so automatic that we are often unaware that we have made any inferences at all. Bransford, Barclay, and Franks (1972) presented people with the following sentence: 1. Three turtles rested on a floating log and a fish swam beneath them.
They later gave a recognition test to some of the subjects that included the sentence: 2. Three turtles rested on a floating log and a fish swam beneath it. Bransford et al. had hypothesised that subjects would draw the inference that the fish swam beneath the log (notice that this is not stated in the original sentence). Indeed, the subjects who were presented with the second sentence on the recognition task were as confident that it had been presented originally as those subjects who had been given the original sentence on the recognition task. The point here is that one’s memory of a situation, based on a reading of a text, may include the inferences that were drawn at the time the representation of the text was constructed or retrieved. Problem solving, then, involves building a mental representation of a problem situation, including any inferences you make on reading the problem, that will allow you to carry out some action. In other words, for the mental representation to be of any use it has to include some idea of what you can do that will allow you to move from the initial problem situation to the goal. Now it follows that if you don’t know much about the domain, or you have never attempted this kind of problem before, then your understanding of the problem is unlikely to be all that good. Glaser (1984, p. 93) explains why: At the initial stage of problem analysis, the problem solver attempts to ‘understand’ the problem by construing an initial problem representation. The quality, completeness, and coherence of this internal representation determine the efficiency and accuracy of further thinking. And these characteristics of the problem representation are determined by the knowledge available to the problem solver and the way the knowledge is organised. This does not mean that the representation has to be “complete” before any problem solving can take place. 
If you had a “complete” representation of a problem then you wouldn’t have a problem, as you would know exactly how to get from where you are now to where you want to be. A problem only exists when it is not immediately obvious how to get from where you are now to your goal. An adequate representation should at least allow you to see what moves you can possibly make and allow you to start heading towards your goal. Chapter 2 deals with how we build mental representations of problems and how we use them to work towards a solution. (The topic is revisited in Chapter 7.) However, there is also the case of those problems where our initial representation gets us nowhere. The difficulty here lies in finding a different way of representing the problem (this is sometimes referred to as “lateral thinking”, although the concept of re-representing problems in order to find a solution goes back long before the term was invented). Unfortunately, knowing that you should find a new way of representing the problem does not in fact help you very much—you still have to find this “new way of representing” it. Nevertheless when a new representation comes to mind a solution is often immediately obvious; or, at least, you often know what to do so that a solution can be reached very quickly. The kind of problem solving that involves being stuck for a while until you suddenly see what to do is called insight, and is the subject of Chapter 3. Insight is not confined to so-called “insight problems”. The same phenomenon may occur when solving typical textbook algebra problems, for example, or in solving simple everyday problems. Here again the initial representation may be extremely unhelpful, and only when a new representation is found can the poor student or DIY enthusiast apply the solution procedure that is made obvious by the new representation. The study of insight problems can therefore tell us something about everyday problems and textbook problems.
CHAPTER TWO Characterising problem solving
As mentioned in Chapter 1, some types of problems can be solved by following a clear and complete set of rules that will allow you, in principle, to get from the situation described at the start of the problem (the initial state) to the goal. Many of them are also knowledge-lean; that is, very little domain knowledge is needed to solve them. It was also pointed out that such problems are rare in everyday life, yet these are the types of problems that information-processing psychologists have spent a lot of time studying in the past. So why study them? First of all, the experimenter knows in advance all the possible paths that a “fully rational” solver can take to solve the problem. The experimenter can therefore concentrate on what strategies people use, rather than on the nature of the problem. One can also examine how people improve after repeated attempts at solving the problem. Indeed, if you are prepared to wait long enough you can watch the emergence of expertise. Second, the subjects in experiments using well-defined puzzle problems generally tend to know nothing in advance about the knowledge-lean problems they are usually presented with. This allows the psychologist to examine how someone builds up a useful representation of the problem. Furthermore, to get at the heart of the processes involved in solving problems it is often necessary to control for individual subject variables, and knowledge is a variable that bedevils experiments in problem solving. Third, as the problem structure is usually fairly clear, psychologists can examine the generalisation of transfer of learning from one problem to another by presenting two problems with identical structures (known as isomorphs) but a different cover story. For example, there are several isomorphs of the Tower of Hanoi puzzle (see Figure 2.1) involving acrobats, monsters with globes, a Himalayan tea ceremony, and so on. The description of each of these problems involves a different cover story. 
Thus, we can look at how people differ in how well they understand (represent) problems with different cover stories; or we can look at the circumstances under which we can apply what we learn in one problem to another problem of the same type.
THE INFORMATION-PROCESSING APPROACH
The information-processing approach to thinking and problem solving owes a very great deal to the work of Allen Newell and Herb Simon, and is described in detail in their book Human Problem Solving (Newell & Simon, 1972). Indeed, their model of human and computer problem solving could be termed the modal model of problem solving, given that it is used to explain a wide variety of studies of thinking (see, e.g., Ericsson & Hastie, 1994). This would be the problem-solving equivalent of the modal model of memory (Atkinson & Shiffrin, 1968).
Figure 2.1. A trivial version of the Tower of Hanoi puzzle.
To understand Newell and Simon’s model we shall take a simple example of thinking. Take a few moments to do Activity 2.1. As you solve the problem, try to be aware of how you go about doing it; that is, try to imagine the individual steps you would go through.
ACTIVITY 2.1
Look at the fairly trivial Tower of Hanoi problem in Figure 2.1. Using only your imagination, how would you get the two rings from peg A to peg C in three moves bearing in mind that: • you can move only one ring at a time from one peg to another and; • you cannot put the large ring on top of the small one.
Your mental solution to the problem may have been something like that represented in Figure 2.2. Using the diagram in Figure 2.1 you probably formed a mental image similar to that in “thought 1” in Figure 2.2. Activity 2.1 describes the “state” of the problem at the start (the initial state) and “thought 1” is simply a mental representation that is analogous to this initial state of the problem. The state of the problem at the end (the goal state) can be represented as in “thought 4”. Notice here that there is no diagram given in Activity 2.1 to correspond to “thought 4”. Instead you had to construct the representation from the verbal description of the goal. In between there are intermediate states, “thought 2” and “thought 3”, which you reach after moving one ring each time. All of these thoughts are not states in the real world, but states inside your head. You created a model of the problem in your head and solved the problem within that model. When we read a statement of a problem, such as the one in Activity 2.1, we put together what we think are the relevant bits of information to construct this internal model of the problem situation. An internal model corresponds to a specific concrete situation in the external world and allows us to reason about the external situation. To do so you used information about the problem presented in the problem statement. The process of understanding, then, refers to constructing an initial mental representation of what the problem is, based on the information in the problem statement about the goal, the initial state, what you are not allowed to do, and what operator to apply, as well as your own personal past experience. Past experience, for example, would tell you that moving the small ring from peg A to peg C, then moving it to peg B, is a complete waste of time. As your knowledge of the problem at each state was inside your head, each “thought” corresponds to a knowledge state. 
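The sequence of knowledge states described above can be written out explicitly. The following is only a minimal sketch: the dictionary representation, the ring sizes (2 for large, 1 for small), and the function name `move` are assumptions made for illustration, not part of Newell and Simon's model.

```python
# Each knowledge state maps a peg to the rings on it, listed bottom to top.
# Ring sizes are an illustrative assumption: 2 = large ring, 1 = small ring.

def move(state, source, target):
    """Return the new state after moving the top ring from source to target."""
    if not state[source]:
        raise ValueError(f"peg {source} is empty")
    ring = state[source][-1]
    # Restriction: a larger ring may not be placed on a smaller one.
    if state[target] and state[target][-1] < ring:
        raise ValueError("cannot put a larger ring on a smaller one")
    new_state = {peg: list(rings) for peg, rings in state.items()}
    new_state[source].pop()
    new_state[target].append(ring)
    return new_state

initial = {"A": [2, 1], "B": [], "C": []}   # corresponds to "thought 1"
goal = {"A": [], "B": [], "C": [2, 1]}      # corresponds to "thought 4"

# Apply the move operator three times, keeping every intermediate state.
states = [initial]
for source, target in [("A", "B"), ("A", "C"), ("B", "C")]:
    states.append(move(states[-1], source, target))

for number, state in enumerate(states, start=1):
    print(f"thought {number}: {state}")
```

Each call to `move` applies the single operator and checks the restriction, so the list `states` plays the role of the four "thoughts": an initial state, two intermediate states, and the goal state.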
A second aspect of your thought processes that you may have noticed is that they were sequential. This simply means that you had one thought after another, as in Figure 2.2. One consequence of the fact that much of our thinking is conscious and sequential in nature is that we can often easily verbalise what we are thinking about. You may have heard a voice in your head saying something like “okay, the small ring goes
Figure 2.2. The sequence of imagined moves in solving the two-ring Tower of Hanoi puzzle.
there…no, there. And then the large ring goes there…” and so on. In fact, for many people saying the problem out loud helps them to solve it, possibly because the memory of hearing what they have just said helps to reduce the load on working memory (that part of memory that briefly stores and manipulates information). The fact that a lot of problem solving is verbalisable in this way provides psychologists with a means of finding out how people solve such problems (Ericsson & Simon, 1993). Next, notice that you moved from one state to another as if you were following a path through the problem. If you glance ahead at Figures 2.3 and 2.4 you will see that harder versions of the problem involve a number of choices. Indeed, in the simple version you had a choice to start with of moving the small ring to peg B or peg C. As such problems get harder it becomes less obvious which choices you should make to solve the problem in the smallest number of moves. In this case the ideal path through the problem is unclear and you have to search for the right path. One further aspect you may have noticed was the difficulty you may have had keeping all the necessary information in your mind at once. Try Activity 2.2 and keep a mental watch on your thought processes as you do so.
ACTIVITY 2.2 Try to multiply 243 by 47 in your head.
Tasks such as the one in Activity 2.2 are tricky because the capacity of our working memory is limited—we can only keep so much information in our heads at any one time; overload it and we forget things or lose track of where we are in the problem. Other examples of the limits to our capacity to process information are given in Information Box 2.1.
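To see why the task loads working memory, it can help to list what has to be held in mind. The sketch below assumes the solver uses ordinary long multiplication, which is only one of several possible strategies; the function name `long_multiply` is hypothetical.

```python
def long_multiply(a, b):
    """Multiply a by b one digit of b at a time, recording each partial product."""
    partials = []
    for position, digit in enumerate(reversed(str(b))):
        # Each partial product must be remembered while the next is computed.
        partials.append(a * int(digit) * 10 ** position)
    return partials, sum(partials)

partials, total = long_multiply(243, 47)
print(partials)  # the intermediate results to be held in working memory
print(total)     # the final answer
```

On paper the partial products are written down and need not be remembered; done mentally, each one has to be kept active while the next is worked out and while the final addition is carried out.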
INFORMATION BOX 2.1 Processing limits and symbol systems
The information-processing account of problem solving sees it as an interaction between the information-processing system (the problem solver; either human or computer) and the task environment (the problem in its context). By characterising the human problem solver as an information-processing system (IPS), Newell and Simon saw no qualitative distinction between human information processors and any other kind, the digital computer being the most obvious example. An IPS processes information in the form of symbols and groups of symbols, and a human IPS has built-in limitations to how well it processes information. This Information Box gives a very brief sketch of some of the processing limits of human problem solving and what is meant by symbols, symbol structures, and symbol tokens.
Processing limitations
The human information-processing system has certain limitations. It is limited in:
• how much it can keep active in working memory at any one time;
• its ability to encode information—we may not be able to recognise what aspects of a task are relevant; we don’t have the capacity to encode all the information coming through our senses at any one time;
• its ability to store information—memories laid down at one time can suffer interference from memories laid down later, or may be distorted in line with prior expectations (see e.g., Brewer & Treyens, 1981);
• its ability to retrieve information—human memory, as you may have noticed, is fallible;
• its ability to maintain optimum levels of motivation and arousal—we get bored, we get tired.
Symbols, symbol structures, and tokens
Newell and Simon regarded the human problem solver as an information-processing system. An IPS encodes individual bits of information as symbols. In other words a symbol is a representation of something. Symbols include things like words in sentences, objects in pictures, numbers and arithmetic operators in equations, and so on. These symbols are grouped into patterns known as symbol structures. Knowledge is stored as symbols and symbol structures. Here is an example of a symbol structure for “cat”.
A specific occurrence of the word “cat” in the phrase “the cat sat on the mat” is known as a symbol token. A symbol token refers the information processor to the symbol itself. As you read the phrase “the cat sat on the mat”, you process the individual words. Processing the word “cat” means accessing your stored knowledge associated with the word “cat” and retrieving something that permits the processing (also referred to as “computation”) to continue:
…when processing ‘the cat sat on the mat’ (which is itself a physical structure of some sort) the local computation at some point encounters ‘cat’; it must go from ‘cat’ to a body of (encoded) knowledge associated with ‘cat’ and bring back something that represents that a cat is being referred to, that the word ‘cat’ is a noun (and perhaps other possibilities), and so on. Exactly what knowledge is retrieved and how it is organized depend on the processing scheme. In all events, the structure of the token ‘cat’ does not contain all the needed knowledge. It is elsewhere and must be accessed and retrieved.
(Newell, 1990, p. 74)
ANALYSING WELL-DEFINED PROBLEMS
In order to investigate the processes used in problem solving we first need to find a way to characterise or analyse the task. We will begin with well-defined problems such as the Tower of Hanoi that conform to the IGOR format—that is, they have a well-defined initial state, a clear goal state, and the operators that should be applied are given as well as the restrictions that apply. The initial state, goal state, operators, and restrictions for the Tower of Hanoi problem are given in Figure 2.3. Figure 2.4 shows the different states that can be reached when the move operator is applied twice. In state 1 only the smallest ring can move and there are two free pegs it can move to. If the solver places it on peg C then the problem is now in state 2. In state 2 there are three moves that can be made. The smallest ring can move from peg C back to peg A which takes the solver back to the initial state, state 1. Alternatively the
Figure 2.3. The initial state, goal state, operators and restrictions in the Tower of Hanoi puzzle.
smallest ring can move from peg C to peg B leading to state 4, or the middle-sized ring can move to peg B leading to state 3. If you carry on this type of analysis then you end up with a diagram containing all possible states and all possible moves leading from one state to another. Although the only action you need to perform in the Tower of Hanoi is “move”, other problems may involve a variety of mental operators. For this reason the diagram you end up with is known as a state-action diagram or state space diagram.
State-action spaces
Thinking through a problem can be a bit like trying to find a room in an unfamiliar complex of buildings such as a university campus or hospital. Suppose you have to get to room A313 in a complex of buildings. Initial attempts to find your way may involve some brief exploration of the buildings themselves. The initial exploration, where you are trying to gather useful information about your “problem environment”, can be characterised as an attempt to understand the problem. You discover that the buildings have names but not letters. However, one building is the Amundsen Building. As it begins with an “A” you assume (hypothesise) that this is the building you are looking for, so you decide to find out (test this hypothesis). You enter and look around for some means of getting to the third floor (accessing relevant operators). You see a stairway and a lift next to it. You take the lift. When you get out on the third floor you see swing doors leading to corridors to the right and left. Outside the swing doors on the left is a notice saying “301–304, 321–324” and outside the one on the right is the sign “305–308, 317–320”. You want 313, so now what do you do? (The answer, by the way, is on page 235.) The room-finding analogy likens problem solving to a search through a three-dimensional space. Some features of the environment in which the problem is embedded are helpful, and some less so, leaving you to make inferences.
The room numbers are sequential to some extent, although it’s not clear why there are gaps. Furthermore, in trying to find your way through this space you call on past knowledge to guide your
Figure 2.4. The possible states that can be reached after two moves in the Tower of Hanoi puzzle.
search. The problem of finding the room is actually fairly well-defined—you know where you are, you know where you want to be, you know how to get there (walk, take the lift, take the stairs) even if you don’t actually know the way. Activity 2.1 showed that problem solving can be regarded as moving from one “place” in the problem to another. As you move through the problem your knowledge about where you are in the problem has to be updated; that is, you go through a sequence of knowledge states. If you carry on the analysis of what choices are available at each state of the Tower of Hanoi problem, as in Figure 2.4, you end up with a complete search graph for the problem (Figure 2.5). In the figure you can see that each state is linked to three others except at the extremities of the triangle where there is a choice of only two moves: at states 1, 15, and 27. Figure 2.5 constitutes a state-action space, or more simply state space, of all legal moves (actions) for the three-ring Tower of Hanoi problem, and all possible states that can be reached. In tree diagrams of this sort, the points at which the tree branches are called nodes. All of the numbered states are therefore nodes of the state space diagram. The Dots problem introduced earlier has some dead ends, as well as a goal state from which no other legal moves are possible. These would therefore constitute terminal nodes or leaf nodes. The space of all possible states in a problem, as exemplified in Figure 2.5, “represents an omniscient observer’s view of the structure of a problem” (Kahney, 1993, p. 42). For instance, there are hundreds of possible states you can reach in the Dots problem in Activity 1.2. No system, human or computer, can encompass the entire state space of a game of chess, for example. Indeed the size of the problem space for a typical chess game is estimated to be 10^120.
This means that our mental representation of the problem is likely to be impoverished in some way, which in turn means that the path through the problem may not be all that clear. Newell and Simon (1972) referred to the representation we build of a problem as the problem space.
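The "omniscient observer's view" of the three-ring Tower of Hanoi is small enough to enumerate mechanically. The sketch below is an illustration under stated assumptions: a state is encoded as a tuple giving the peg of each ring (smallest first), and the names `legal_moves`, `state_space`, and `shortest_path_length` are hypothetical. It does not reproduce the state numbering used in Figure 2.5.

```python
from collections import deque
from itertools import product

PEGS = "ABC"
N_RINGS = 3

def legal_moves(state):
    """Yield every state reachable in one move.

    state[i] is the peg holding ring i, where ring 0 is the smallest.
    """
    for ring, peg in enumerate(state):
        # A ring can move only if no smaller ring sits on top of it.
        if any(state[smaller] == peg for smaller in range(ring)):
            continue
        for target in PEGS:
            if target == peg:
                continue
            # The target peg must not already hold a smaller ring.
            if any(state[smaller] == target for smaller in range(ring)):
                continue
            yield state[:ring] + (target,) + state[ring + 1:]

# Nodes are states; links are the legal moves between them.
state_space = {s: list(legal_moves(s)) for s in product(PEGS, repeat=N_RINGS)}

def shortest_path_length(start, goal):
    """Breadth-first search; returns the minimum number of moves."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return distance[current]
        for neighbour in state_space[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return None

corners = [s for s, moves in state_space.items() if len(moves) == 2]
print(len(state_space))   # number of nodes in the state space
print(corners)            # states with only two legal moves
print(shortest_path_length(("A",) * N_RINGS, ("C",) * N_RINGS))
```

Exhaustive enumeration of this kind is feasible only because the puzzle is tiny. For a game such as chess the same approach is out of reach, which is why a solver's own problem space is necessarily a partial view.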
Figure 2.5. State space of all legal moves for the three-ring Tower of Hanoi puzzle.
A person’s mental representation of a problem, being a personal representation, cannot be “pointed to and described as an objective fact” (Newell & Simon, 1972, p. 59). Various sources of information combine to produce a problem representation. The main sources of information are:
The task environment. The problem itself is the main source of information about how to construct a relevant problem space. It defines the initial state and goal state, and may provide information about possible operators and restrictions. People are also particularly influenced by parts of the problem statement that appear particularly salient. In addition, the problem is embedded in a context that may also influence how a task is carried out.
Inferences about states, operators, and restrictions. Any information missing from the problem statement may have to be inferred from the person’s long-term memory. For a problem such as “Solve: (3x+4)+x=20” no operators are provided and the solver has to access the necessary arithmetic operators from memory. It is also left to the solver to infer what the final goal state is likely to look like, so it can be recognised when it is reached.
Text-based inferences. Other inferences may have to be generated from the text of a problem. For example, if the problem involves one car leaving half an hour after another and overtaking it, the solver will (probably) infer that both cars have travelled the same distance when they meet (see also Kintsch, 1998 on understanding text; Nathan, Kintsch, & Young, 1992).
Previous experience with the problem. The solver may have had experience with either the current or a similar problem before, and can call on this experience to help solve it.
Previous experience with an analogous problem. The solver may recognise that the structure of an earlier problem that, superficially at least, seems unrelated to the current one is actually relevant to the solution of the current one.
For example, the equation in a problem involving the distance covered by a car travelling at a certain speed may be identical to one involving currency exchange rates, even though both problems are from different domains. The likelihood of this happening is usually fairly low. When it does happen it may constitute an “insight” (see Chapter 3).
Misinformation. The solver may construct a problem space based on a misapprehension of some aspect of the problem. According to Newell and Simon (1972, p. 76):
States represented by nodes of the search space need not correspond with realizable states of the outside world but can be imaginable states—literally so since they are internal to the problem solver. These states, in turn, may be generated, in turn, by operators that do not satisfy all the conditions for admissibility.
For example, someone might decide to move all three rings of the three-ring Tower of Hanoi problem at once, not realising or remembering that there is a restriction on the number of rings that can be moved at the same time.
Procedures for dealing with problems. From general problem-solving experience the solver has a number of procedures for combining information in the external environment (e.g., the written statement of the problem, the state of the rings and pegs of the Tower of Hanoi problem) with information in LTM. Thus the solver might generate context-specific heuristics for dealing with the problem. For example, in a crossword puzzle the solver may pick out the letters believed to form an anagram and write them down in a circle to make it easier to solve.
External memory. The external environment may contain clues to the particular state a problem is in. The Tower of Hanoi, for example, provides constant information about the state of the problem, as it changes to a visibly new state each time you move a ring. Similarly the little scribbles or 1s one might write on a subtraction or addition problem serve as an external memory to save us having to try to maintain the information in working memory. These days, diaries, palmtops, electronic organisers, telephone numbers stored in mobile phones, etc., all serve as forms of external memory.
Instructions. Newell (1990) further argued that problem spaces come from instructions.
In a psychological experiment involving reaction times, for example, there are liable to be trial-specific instructions a participant would be given immediately before a particular trial of the experiment. There may also be prior general instructions before the experiment begins, and introductory instructions when the participant walks through the door that provide the general context for the experiment and what the participant is expected to do. All these sources of information together constitute the “space” in which a person’s problem solving takes place. Together they allow us to define the nodes (the states) of a problem and links between them along with a possible strategy for moving from node to node. Hayes (1989a, p. 53) provides the following analogy for a problem space: As a metaphor for the problem solver’s search for a solution, we imagine a person going through a maze. The entrance to the maze is the initial state of the problem and the exit is the goal. The paths in the maze, including all its byways and blind alleys, correspond to the problem space—that is, to all the possible sequences of moves available to the solver. According to Newell and Simon, problem solving involves finding your way through this problem space (searching for a way through the maze). Because of the limits of working memory we can only see a very few moves ahead and may not remember all the states that have been visited before. Figure 2.6 attempts to capture some of the information an individual might access when trying to understand a novel problem. The shading in the figure represents the fact that we can only see a few moves ahead and are likely to have only
Figure 2.6. Sources of information used to determine a problem space.
a couple of previous states in working memory at one time. The state of the problem that is in focal attention is the currently active knowledge state. Although I have made a distinction between a state space as the space of all possible moves in a problem, and a problem space as the individual representation a person has of the problem, you will often see the term “problem space” used to describe something like a search tree such as the one in Figure 2.5. Reed, Ernst, and Banerji (1974), for example, refer to their version of Figure 2.5 as the problem space of legal moves. Similarly, Hunt (1994, p. 221) describes the game of chess thus: “In chess the starting node is the board configuration at the beginning of the game. The goal node is any node in which the opponent’s king is checkmated. The (large) set of all legal positions constitutes what Newell and Simon call the problem space”. The kind of problem space referred to in both cases is the objective problem space—the set of all possible paths and states a solver could theoretically reach given the initial state, operators, and some way of evaluating when the goal has been reached. The solver (Newell and Simon refer to the information-processing system) “incorporates the problem space, not in the sense of spanning its whole extent, but in possessing symbol structures and programs that provide access to that space via the system’s processes for generating, testing and so on” (Newell & Simon, 1972, p. 78). Apart from being unable to encompass the sheer size of some state spaces, people are usually just not aware of them. As VanLehn points out (1989, pp. 531–532):
An incontestable principle of cognition is that people are not necessarily aware of the deductive consequences of their beliefs, and this principle applies to problem spaces as well. Although the state space is a deductive consequence of the initial state and operators, people are not aware of all of it.
The actual path solvers take through a state space does not depend on them being aware of all of it. People usually manage to find ways of limiting the search space in some way, and that is what problem-solving research is interested in.
THE INTERACTION OF THE PROBLEM SOLVER AND THE TASK ENVIRONMENT
Newell and Simon stressed two main processes involved in solving unfamiliar problems: understanding and search. As mentioned earlier, understanding is the process that allows you to construct a representation of the problem from a combination of what it says in the problem statement, what inferences you can draw from the problem statement based on general knowledge, and your past problem-solving experience. Armed with this representation the information-processing system engages in a search to find a path through the problem that will lead to a solution. Concentrating on the task environment sounds as though an analysis of the task itself is enough to show how a person can perform the task (i.e., solve the problem). This, of course, is not the case. People are not perfectly rational, and analysing the task in detail does not necessarily tell us how or if the solver can actually solve the problem. Nor does a task analysis tell us that the solver will use that particular representation (problem space) to solve the problem. So what is the point of such analyses? Laird and Rosenbloom (1996) refer to the “principle of rationality”, ascribed to Newell, that governs the behaviour of an intelligent “agent” whereby “the actions it intends are those that its knowledge indicates will achieve its goals” (p. 2).
Newell and Simon argue that behaviour is usually rational in the sense that it is adaptive. This means that people’s problem-solving behaviour is an appropriate response to the task, assuming that they are motivated to achieve the goal demanded by the task. Gigerenzer and Todd (1999) would also add that we have developed problem-solving short-cuts (“fast and frugal heuristics”) that achieve our goals with the least computational effort. Newell and Simon (1972) make the point that “if there is such a thing as behavior demanded by a situation, and if a subject exhibits it, then his [sic] behavior tells us more about the task environment than about him” (p. 53). If we want to know about the psychology of problem-solving behaviour, then examining behaviour that is entirely governed by the task environment tells us nothing. If, on the other hand, our problem analysis reveals (as far as such a thing is possible) how a perfectly rational person would solve the problem and we then compare this to what a subject actually does when confronted with the problem, then the difference between the two tells us something about the psychology of the solver. The first thing an IPS (solver) does is generate an internal representation of the task environment based on the problem statement. This representation involves the selection of a problem space. The choice of a problem space can be influenced by manipulating the salience of objects in the task environment or the givens in a problem statement. The selection of a problem space results in the system choosing appropriate problem-solving methods. A method is “a process that bears some rational relation to attaining a problem solution” (Newell & Simon, 1972, p. 88). Problem-solving methods come in two general types: “strong” and “weak”. Strong methods are domain-specific, learned methods that are pretty much guaranteed to get a solution; that is, strong methods are used when you already know how to go about solving the problem. 
Of course if, as a result of reading a problem, you already know what to do (you have an available strong
method), then you don’t really have a problem. Weak methods are general-purpose problem-solving strategies that solvers fall back on when they don’t know what to do directly to solve the problem. These methods are discussed in the next section. The particular method chosen controls further problem solving thereafter. The outcome of applying problem-solving methods is monitored; that is, there is feedback about the results of applying any particular step in the application of the method. This feedback may result in a change in the representation of the problem. Suppose you are trying to find the video channel on a television with which you are unfamiliar. There are two buttons marked −P and +P, and you can’t work out what they stand for. Lacking any “strong” method for finding the correct channel, you fall back on weak methods. There are three choices facing you. You can either press −P, press +P or press both together. Past experience might tell you that pressing both together is probably not a good idea—at best they may just cancel each other out. That reduces the choice to two. What you are left with is a trial and error method, which is about as weak a method as you can get. You decide to press +P and a recognisable channel appears along with the channel number on the top left of the screen. As a result, you may infer that pressing +P steps through the channels in an ascending order, and −P in a descending order. Applying a trial and error method once may therefore be enough to allow you to re-represent the problem based on monitoring the result of your action. On the other hand, if the button-pressing strategy fails and nothing appears on the screen you may have to change the problem space to one that involves searching for a remote control, or finding someone who knows how this television works.
HEURISTIC SEARCH STRATEGIES
Searching for a solution path is not usually governed by trial and error, except as a last resort or where the search space is very small.
Instead people try to use heuristics to help them in their search. Heuristics are rules of thumb that help constrain the problem in certain ways (in other words they help you to avoid falling back on blind trial and error), but they don’t guarantee that you will find a solution. Heuristics are often contrasted with algorithms that will guarantee that you find a solution—it may take forever, but if the problem is algorithmic you will get there. However, heuristics are also algorithms. The clearest description of the two has been made by Dennett (1996, p. 210): There is…a tradition within computer science and mathematics of contrasting heuristic methods with algorithmic methods: heuristic methods are risky, not guaranteed to yield results, whereas algorithms come with a guarantee. How do we resolve this “contradiction”? There is no contradiction at all. Heuristic algorithms are, like all algorithms, mechanical procedures that are guaranteed to do what they do, but what they do is engage in risky search! They are not guaranteed to find anything—or at least they are not guaranteed to find the specific thing sought in the time available. But, like well run tournaments of skill, good heuristic algorithms tend to yield highly interesting, reliable results in reasonable amounts of time. Examples of both are provided in Information Box 2.2. Heuristics serve to narrow your options and thus provide useful constraints on problem solving. However, there are other, more general heuristics that a solver might apply. When you don’t know the best thing to do in a problem, the next best thing is to choose to do something that will reduce the difference between where you are now and where you want to be. Suppose you have a 2000–word essay to write and you don’t know how to go about writing a very good introductory paragraph. The next best thing is to write down something that seems vaguely relevant. It might not be particularly good, but at least you’ve only got
1800 words left to write. You are a bit nearer your goal. In the Tower of Hanoi problem this means that solvers will look at the state they are in now, compare it with where they want to be (usually the goal state), and choose a path that takes them away from the initial state and nearer to the goal state. This general type of heuristic is called difference reduction, the most important examples of which are known as hill climbing and means-ends analysis.
Hill climbing
The term hill climbing is a metaphor for problem solving in the dark, as it were. Imagine that you are lost in a thick fog and you want to climb out of it to see where you are. You have a choice of four directions: north, south, east and west. You take
INFORMATION BOX 2.2 Algorithms and heuristics
To illustrate the difference between algorithms and domain-specific heuristics, imagine how you might go about solving a jigsaw puzzle.
Algorithmic approach: Starting off with a pile of jigsaw pieces, an algorithm that is guaranteed to solve the jigsaw might proceed as follows:
1. Select piece from pile and place on table.
2. Check >0 pieces left in pile; IF YES go to 4 ELSE go to 3
3. Check >0 pieces in discard-pile; IF YES discard-pile becomes pile; go to 2 ELSE go to 7
4. Select new-piece from pile.
5. Check whether new-piece fits piece (or pieces) on table; IF YES go to 6 ELSE put new-piece on discard-pile; go to 2
6. Check colour-match; IF YES go to 2 ELSE put new-piece on discard-pile; go to 2
7. FINISH
Heuristic approach: Divide pieces into categories in terms of: colour, straight edges, pieces belonging to same object. Assemble categorised pieces. Look for specific pieces that are required to fill perceived gaps, etc.
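The algorithmic approach in the box can be sketched in code using a deliberately simplified toy jigsaw. Everything about the representation is an assumption made for illustration: pieces are integers forming one row, a piece "fits" if a numerically adjacent piece is already on the table, and the hypothetical `fits` function stands in for both the shape check (step 5) and the colour check (step 6).

```python
import random
from collections import deque

def fits(piece, table):
    """Toy stand-in for steps 5 and 6: a piece fits if a neighbour is on the table."""
    return piece - 1 in table or piece + 1 in table

def solve_jigsaw(pieces):
    """Exhaustively try pieces until all are placed (assumes a solvable puzzle)."""
    pile = deque(pieces)
    discard_pile = deque()
    table = {pile.popleft()}            # step 1: place a first piece
    comparisons = 0
    while pile or discard_pile:         # steps 2, 3 and 7: stop when both are empty
        if not pile:                    # step 3: the discard pile becomes the pile
            pile, discard_pile = discard_pile, deque()
        new_piece = pile.popleft()      # step 4: select a new piece
        comparisons += 1
        if fits(new_piece, table):      # steps 5 and 6: does it fit and match?
            table.add(new_piece)
        else:
            discard_pile.append(new_piece)
    return table, comparisons

pieces = list(range(20))
random.Random(1).shuffle(pieces)
table, comparisons = solve_jigsaw(pieces)
print(len(table), "pieces placed after", comparisons, "fit checks")
```

The procedure never inspects colour groupings or edge pieces in advance. It simply keeps cycling through the pile, which is what makes it guaranteed (for a solvable puzzle) but slow compared with the heuristic approach of sorting pieces into categories first.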
a step north—it seems to lead down, so you withdraw your foot and turn 90° to the east and take another step. This time it seems to lead upwards, so you complete the step and try another step in the same direction. It also seems to lead upward so you complete the step and try again. This time the path leads
downwards, so you withdraw your foot, turn 90° and try again. You carry on doing this until there comes a time when, no matter which direction you turn, all steps seem to lead down. In this case you are at the top of the hill. This kind of problem-solving heuristic is useful if, say, the revs on your car seem to be too high and you try to adjust the petrol/air mixture. You make fine adjustments by turning a screw one way; if the revs increase then you start turning it the other way. If the revs slow down so much that the car is in danger of stalling then you turn the screw back the other way. You continue in this way until the engine sounds as if it is running neither too fast nor too slow. Although hill climbing will eventually take you to the top of a hill, there is no guarantee that it is the top of the highest hill. You may emerge from the fog only to find you are in the foothills and the mountain peaks are still some way off. That is, in some cases you do not necessarily know if the solution you have arrived at is the correct one, or the optimal one. (Dennett, 1996, has argued that this “local” selecting of the next step is how Darwinian selection operates.) Another problem with the method is that it only applies if there is some way of measuring whether you are getting closer to the goal. If no matter which way you step the ground remains flat, then any direction is as good as any other and your search for a way out of the fog will be random. Anyone who has ever got lost in the middle of an algebra problem will know what this feels like. You might end up multiplying things, subtracting things, and nothing you do seems to be getting anywhere near the answer. A puzzle that has been often used to examine hill climbing is the so-called Missionaries and Cannibals problem. Subjects tend to use hill climbing as their main strategy to reduce the difference between where they are in the problem and where they want to be. 
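The fog metaphor can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the terrain function `landscape` is hypothetical, and the walker moves along a single line rather than in four compass directions. It shows how a purely local rule can stop on a foothill.

```python
def hill_climb(height, start, step=1):
    """Greedy ascent: keep stepping to a higher neighbour until none exists."""
    position = start
    while True:
        neighbours = [position - step, position + step]
        best = max(neighbours, key=height)
        if height(best) <= height(position):
            # Every step leads down (or stays level), so we stop here.
            # On flat ground this sketch simply halts; it has no gradient to follow.
            return position
        position = best


def landscape(x):
    """Hypothetical terrain: a foothill peaking at x=2 and a mountain peaking at x=8."""
    foothill = 3 - abs(x - 2)
    mountain = 10 - 2 * abs(x - 8)
    return max(foothill, mountain, 0)


if __name__ == "__main__":
    print(hill_climb(landscape, start=0))  # stops on the foothill
    print(hill_climb(landscape, start=5))  # reaches the mountain peak
```

Starting from 0 the climber halts at 2, the top of the foothill, even though the terrain is higher at 8; starting from 5 it reaches 8. Which peak is found depends entirely on where the search begins.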
The basic version of the problem is shown in Activity 2.3. Have a go at it before reading on to see the complete problem space of legal moves.
ACTIVITY 2.3 Three missionaries and three cannibals, having to cross a river at a ferry, find a boat but the boat is so small that it can contain no more than two persons. If the missionaries on either bank of the river, or in the boat, are outnumbered at any time by cannibals, the cannibals will eat the missionaries. Find the simplest schedule of crossings that will permit all the missionaries and cannibals to cross the river safely. It is assumed that all passengers on the boat disembark before the next trip and at least one person has to be in the boat for each crossing.
(Reed et al., 1974, p. 437)

There are a number of variations of the basic problem including: the Hobbits and Orcs problem (Greeno, 1974; Thomas, 1974) where Orcs will gobble up Hobbits if there are fewer Hobbits than Orcs; book-burners and book-lovers (Sternberg, 1996a) where book-burners will burn the books of the book-lovers if they outnumber them; and scenarios where, if there are fewer cannibals than missionaries, the missionaries will convert the cannibals (Eisenstadt, 1988; Solso, 1995). The structure of the problem is interesting because at most points in the problem there are only two legal moves. One will take you back to the previous state and one will take you nearer the solution (Figure 2.7). Figure 2.7 contains a pictorial representation and a state-action graph of the Hobbits and Orcs problem to make it a little easier to see what the graph represents. The figures divided by a colon represent the numbers of Hobbits and Orcs in that order on both banks at any one time (the figure on the left always refers to the number of Hobbits and the figure on the right always refers to the number of Orcs). Thus 33:00 means that there are 3 Hobbits and 3 Orcs on the left bank (left of the colon) and 0 Hobbits and 0 Orcs on
Figure 2.7. Solution and state-action graph of the Hobbits and Orcs problem.
the right bank (right of the colon). On the lines linking the ovals containing these, there are figures
representing who is in the boat. Thus, on the line linking state B to state E, the 10 means that there is 1 Hobbit and no Orcs on the boat. From the initial state (state A) there are three legal moves that will lead you to states B, C, or D. If one Orc were to take the boat across from the initial state then there would be 3 Hobbits and 2 Orcs on the left bank and no Hobbits and 1 Orc on the right bank as in state D. This is a pointless move, as the only possible next move is for the Orc to take the boat back to the left bank (i.e., back to state A). It makes more sense for two persons to take the boat, either two Orcs (leading to state C) or one Hobbit and one Orc (leading to state B). It looks from Figure 2.7 that, if you avoid illegal moves and avoid going back to an earlier state, you ought to get straight to the solution. So what makes this problem hard? If hill climbing is the method used to solve this task then one can make some predictions about where people will have difficulty. If you are standing in a metaphorical fog at night and you are trying to take steps that will lead you uphill, and if you get to a “local maximum” (where every step you take seems to take you downhill), then you are stuck. If the same thing happens in the middle of a problem then problem solving will be slowed down or you will make mistakes. The Hobbits and Orcs problem was also studied by Thomas (1974) and by Greeno (1974). Thomas’ study is described in Study Box 2.1. If you look back at state H in Figure 2.7 you will see that there are two Hobbits and two Orcs on the right bank. If the subjects are using some form of hill-climbing strategy, and trying to reduce the difference between where they are at state H and where they want to be (increase the number of people on the right bank), then state H poses a problem. To move on in the problem, one Hobbit and one Orc have to move back to the left bank, so the subject seems to be moving away from the goal at this point. 
This is equivalent to taking a step that seems to lead downhill when trying to climb uphill using a hill-climbing strategy. There was also another point at which subjects in Thomas’s study seemed to get stuck. At state E in Figure 2.7, there are three legal moves leading to states F, B, and C. It is the only place in the problem where subjects can move backwards without returning to an earlier problem state. In fact subjects may find themselves caught in a loop at this point, moving, for example, from B to E to C to A to B then back to E. Another difficulty subjects seemed to come up against was not realising that a particular move would lead to an error. In other words their Hobbits kept getting eaten by the Orcs. Again the problems centred around states E and H where subjects made an average of 4 errors compared with between 0 and 1 error elsewhere.
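The state space described above is small enough to search exhaustively. The sketch below is an illustration rather than a model of how subjects behave: it uses breadth-first search (not hill climbing), and the state encoding `(hobbits_left, orcs_left, boat_on_left)` and all function names are assumptions made for the example.

```python
from collections import deque

TOTAL = 3  # three Hobbits and three Orcs
BOAT_LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # (Hobbits, Orcs) in the boat


def is_safe(hobbits_left, orcs_left):
    """Hobbits may not be outnumbered on either bank (a bank with no Hobbits is fine)."""
    hobbits_right, orcs_right = TOTAL - hobbits_left, TOTAL - orcs_left
    left_ok = hobbits_left == 0 or hobbits_left >= orcs_left
    right_ok = hobbits_right == 0 or hobbits_right >= orcs_right
    return left_ok and right_ok


def legal_moves(state):
    """Yield every legal successor of (hobbits_left, orcs_left, boat_on_left)."""
    hobbits_left, orcs_left, boat_on_left = state
    direction = -1 if boat_on_left else 1  # a crossing from the left empties the left bank
    for h, o in BOAT_LOADS:
        new_h = hobbits_left + direction * h
        new_o = orcs_left + direction * o
        if 0 <= new_h <= TOTAL and 0 <= new_o <= TOTAL and is_safe(new_h, new_o):
            yield (new_h, new_o, not boat_on_left)


def shortest_path(start=(TOTAL, TOTAL, True), goal=(0, 0, False)):
    """Breadth-first search; returns the list of states from start to goal."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in legal_moves(state):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None


if __name__ == "__main__":
    path = shortest_path()
    print(len(path) - 1, "crossings")
    # People on the right bank after each state: the count must drop on every return trip.
    print([2 * TOTAL - h - o for h, o, _ in path])
```

In this encoding the state with two Hobbits and two Orcs on the right bank is `(1, 1, False)`, and its only legal successor other than going back is `(2, 2, True)`: two people return to the left bank. A solver who only accepts moves that increase the number of people on the right bank would reject exactly this move.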
STUDY BOX 2.1 Thomas’ Hobbits and Orcs Study
Thomas (1974) used two groups of subjects. The control group were asked to solve the problem on a computer and the times taken to solve the first half and second half of the problem were noted. A second group (known as the “part-whole” group) were given prior experience of the second part of the problem starting from state H and were then asked to solve the whole problem.

Results

TABLE 2.1 Average number of moves in each part of the Hobbits and Orcs task

Group        First part of task   Second part of task   First attempt (part-whole group)
Control      13.0                 15.5                  –
Part-whole   10.8                 14.3                  12.0
Discussion The part-whole group performed significantly better than the control group on the first half of the task but there was no significant difference between them on the second part. This suggests that the prior experience the part-whole group had on the second half of the problem benefited them when they came to do the first part of the whole problem afterwards. However, the fact that there was no difference between the groups on the second half suggests that there was a strong context effect when they came to state H and were reluctant to make a “detour”. The results show that means-ends analysis exerts a strong influence on problem solving, making subjects reluctant to choose moves that seem to be leading them away from the goal.
Figures showing delays before making a move (the latencies) at each state follow a very similar pattern, with states E and H once again causing difficulties for the subjects. The longest response time occurs at state A where subjects are thinking about the problem and planning how to move. On successive attempts at the problem this delay is greatly reduced. What these studies show us is that you don’t need to have a complicated state space for problem solving to be hindered. If the main method used is hill climbing, then you are likely to experience difficulty when you have to move away from the goal state to solve the problem. There is, however, a more powerful difference-reduction strategy for searching through a problem space.

Means-ends analysis
The most important general problem-solving heuristic identified by Newell and Simon was means-ends analysis. In means-ends analysis the solvers also try to reduce the difference between where they are in a problem and where they want to be. They do so by choosing a mental operator or choosing one path rather than another that will reduce that difference, but the difference between this heuristic and hill climbing is that the problem solver is able to break the problem down into sub-problems. For instance, if you have a 2000-word essay to write and an essay title in front of you then, rather than starting to write immediately, you might generate the sub-goal of planning what you are going to say in your introduction first. In order to write the introduction you need to have some idea of the issues, the pros and cons, and how you are going to deal with them, so you generate a sub-goal of generating an essay plan, and so on. Here is a further example to illustrate this. Suppose you want to go on holiday to Georgioúpoli in Crete. Let’s assume for the sake of the example that you like to organise holidays by yourself and avoid package holidays as much as possible.
You can begin to characterise the problem in the following way:
initial state: you at home in Milton Keynes (well, somebody has to live there)
goal state: you at Georgioúpoli
There are several means (operators) by which you normally travel from one place to another. You can go: on foot, by boat, by train, by plane, by taxi, and so on. However, your general knowledge of the world tells you that Crete is a long way away and that it is an island. This knowledge allows you to restrict your choice of operators.
restrictions: Crete is a long way away; you want to get there as quickly as possible
Your general knowledge tells you that the fastest form of transport over land and sea is the plane. So,
operator: go by plane
Unfortunately your problem is not yet solved. There are certain preconditions to travelling by plane, not the least of which is that there has to be an airport to travel from and there is no airport in Milton Keynes. You are therefore forced to set up a sub-goal to reduce the difference between you at home and you at the airport. Once again you have to search for the relevant means to achieve your end of getting to the airport. Some operators are not feasible due to the distance (walking), others you might reject due to your knowledge of, say, the cost of parking at airports, the cost of a taxi from home to the airport, and so on. So you decide to go by train.
initial state: you at home
sub-goal: you at the airport
restrictions: you don’t want to spend a lot of money; you don’t want to walk
operator: go by train
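The way each unmet precondition spawns a new sub-goal can be sketched in a few lines. This is a deliberately minimal illustration, not a full means-ends analyser: the `OPERATORS` table is hypothetical, each goal has a single operator with a single precondition, and the operator for reaching the station ("walk to the station") is an assumption that the text does not specify.

```python
# Hypothetical operator table: goal place -> (operator, place you must already be at).
OPERATORS = {
    "Georgioúpoli": ("go by plane", "airport"),
    "airport": ("go by train", "station"),
    "station": ("walk to the station", "home"),  # assumed operator, not from the text
}


def means_ends(current, goal):
    """Return the chain of operators leading from current to goal."""
    if current == goal:
        return []  # no difference left to reduce
    operator, departure_point = OPERATORS[goal]
    # If we are not yet at the departure point, reaching it becomes a sub-goal.
    plan = means_ends(current, departure_point)
    return plan + [operator]


if __name__ == "__main__":
    print(means_ends("home", "Georgioúpoli"))
```

The plan is built from the goal backwards (plane, then train, then getting to the station) but is returned in the order in which the operators would be carried out.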
Once again the preconditions for travelling by train are not met, as trains don’t stop outside your house, so you set up a new sub-goal of getting from your home to the station; and so on. Means-ends analysis therefore involves breaking a problem down into its goal-sub-goal structure and should provide a chain of operators that eventually leads you to the goal. This method of problem solving is also known as sub-goaling. It can also be applied to the Tower of Hanoi problem. Simon (1975) outlined three different strategies that involved decomposing the goal into sub-goals. One of them is the “goal recursion strategy”. If the goal is to move all three rings from peg A (see Figure 2.8) to peg C, then first move the two top rings from peg A to peg B so that the largest ring can move to peg C (Figure 2.8b). Then set up the sub-goal of moving the two-ring pyramid on peg B to peg C. Another more “trivial” sub-goal involves moving the two-ring pyramid from one peg to another. If the goal is to move two rings from peg A to peg B then first move the smallest ring from peg A to peg C (Figure 2.8c).

Figure 2.8. The goal recursion strategy of solving the Tower of Hanoi puzzle.

This form of sub-goaling is also known as working backwards because you are analysing the problem by starting off from the goal state and working backwards from it to see what needs to be done (i.e., what sub-goals need to be achieved). The reason it is called a “recursion” strategy is that the procedure for moving entire pyramids of rings (e.g., three rings, five rings, 64 rings) involves moving the entire pyramid minus one ring. And the procedure for moving the pyramid-minus-one-ring involves moving the pyramid-minus-one-ring minus one ring, and so on. A diagrammatic representation of the recursion strategy can be found in Figure 4.1 in Chapter 4. The recursive procedure can be written as follows (Simon, 1975, p. 270):

To move Pyramid (k) from A to C
    Move Pyramid (k−1) from A to B
    Move Disk (k) from A to C
    Move Pyramid (k−1) from B to C

where k is an (odd) number of rings (the substructure of the Tower of Hanoi problem is illustrated in Figure 4.1 in Chapter 4). The difficulty with this strategy for solving the problem is the potential load on short-term memory. “For a subject to use this recursive strategy, he must have some way of representing goals internally, and holding them in short-term memory while he carries out the sub-goals. How large a STM this strategy calls for depends on how many goals have to be retained in STM simultaneously” (Simon, 1975, p. 270). The idea that we have a “stack” of goals that we keep in mind as we solve some types of problem comes from computer science. Some computer programs are hierarchically ordered so that in order to achieve goal A you first have to achieve goal B and to achieve goal B you have to achieve goal C, and so on.
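Simon's recursive procedure translates almost line for line into code. The sketch below is one possible rendering; the function name `move_pyramid` and the representation of a move as a `(ring, from_peg, to_peg)` tuple are assumptions made for the example.

```python
def move_pyramid(k, source, target, spare, moves):
    """Append (ring, from_peg, to_peg) tuples that move a k-ring pyramid to target."""
    if k == 0:
        return  # an empty pyramid needs no moves
    move_pyramid(k - 1, source, spare, target, moves)  # Move Pyramid (k-1) from A to B
    moves.append((k, source, target))                  # Move Disk (k) from A to C
    move_pyramid(k - 1, spare, target, source, moves)  # Move Pyramid (k-1) from B to C


if __name__ == "__main__":
    moves = []
    move_pyramid(3, "A", "C", "B", moves)
    for ring, src, dst in moves:
        print(f"move ring {ring} from {src} to {dst}")
```

For three rings the procedure produces seven moves, beginning with the smallest ring going from peg A to peg C. While the innermost call runs, every enclosing call is still waiting to finish, which is loosely analogous to the goals a human solver would have to hold in short-term memory.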
Newell and Simon (1972) give the example of driving a child to school (the main goal), but the car doesn’t work so you have to fix the car first (a new goal) and to fix the car you have to call the garage (a new goal), and so on. Notice that these goals and sub-goals are interdependent, each one depending on the next one “down”. When you embark on a new goal (that is, when a new goal comes to the top of the goal stack), this is called pushing the stack. When the goal is satisfied (for example, when you have made the phone call to the garage), the
goal of phoning is dropped from the stack (known as popping the stack) and you move to the next goal down: in this case waiting for the mechanic to repair the car. The psychological reality of human beings using such a goal stack as they solve problems is discussed in Anderson (1993), aspects of which are covered in Information Box 2.3. Newell and Simon’s theory of human problem solving has had a profound influence on our understanding of human and machine thinking. Their model of a limited-capacity information-processing system operating in an environment that provides the system with information still influences current models of problem-solving behaviour. The notion that there are general problem-solving heuristics such as means-ends analysis explains how we can often solve problems that we have never before encountered, and allows this kind of behaviour to be modelled on a computer.

SUMMARY
1. Information-processing accounts of problem solving emphasise the interaction of the problem solver and the environment (in this case the problem). Newell and Simon (1972) suggested that problem solving involved two co-operating processes called understanding and search. Understanding refers to the process of building a representation of a problem, initially by reading the text of the problem. This representation constitutes a “problem space”. Search is the process whereby the solver attempts to find a solution within this problem space.
2. Search and understanding interact. The search process might lead to the solver revising the mental representation, or the solver may re-read the problem or parts
INFORMATION BOX 2.3 The psychological validity of goal stacks
Anderson (1993) argues that much of our behaviour can be described as arising from a hierarchy of goals and sub-goals. For example, a subject in a laboratory experiment “may be trying to solve a Tower of Hanoi problem in order to satisfy a subject participation requirement in order to pass a psychology course in order to get a college degree in order to get a job in order to earn social respect” (1993, p. 48). This hierarchy of goals is modelled in the ACT-R model of cognition (see Chapter 8) as a goal stack. The goals, as in the example of the laboratory subject, are interdependent, but one can only make progress by dealing with the currently active goal (the one at the top of the goal stack). Because of this interdependency of goals, and the logic of having to deal with the currently active one, “goal stacks are a rational adaptation to the structure of the environment” (1993, p. 49). As goal stacks are an “adaptation to the environment” Anderson argues that: (a) we have evolved cerebral structures that co-ordinate planning (the pre-frontal cortex); and (b) we can find evidence of hierarchical planning in other species. “The decision to design a tool is a significant sub-goal (a means to an end), and the construction of a tool can involve complex co-ordination of sub-goals under that. Neither hierarchical planning nor novel tool use are uniquely human accomplishments and they are found to various degrees in other primates” (1993, p. 49). There is an argument that certain types of forgetting are not dealt with adequately in this scenario. For example, people sometimes forget to remove the original from a photocopier after making photocopies. This can be explained by arguing that making and collecting the photocopies is the goal and this does not include removing the original. Byrne and Bovair (1997, p. 36) argue that, when the super-goal of making and getting copies is satisfied, it is popped from the stack along with its associated sub-goals, and so forgetting the original should happen every time.
Although this explanation has intuitive appeal, it does not make entirely plausible predictions. According to this account, people should make this error every time—the goal structure for tasks like the photocopier never changes, so the error should persist indefinitely.
On the other hand, if the “post-completion” step (removing the original) is always placed on the stack as part of the overall goal then the error should never happen.

of it (an aspect of problem understanding) which in turn may suggest ways in which the search for the solution can continue.
3. To guide search through a problem space people tend to use strategies called heuristics. The main types involve trying to find a way of reducing the difference between where you are now and where you want to be. One such fairly blind method is called hill climbing, where the solver heads in a direction that seems to lead to the solution. A more powerful method is means-ends analysis which can take into account the goal-sub-goal structure of problems.
4. Working memory can be seen as a “goal stack”. A goal stack means that behaviour involves making plans, which in turn involves breaking plans down into goals and sub-goals. Goal stacks are a rational adaptation to a world that is structured and in which we can identify causes and effects.
5. The modern sense of rationality implies that (a) our ways of thinking are the product of evolutionary processes, (b) as far as we can we use our knowledge in pursuit of our goals, (c) our thinking is likely to be biased by the knowledge we have available.
CHAPTER THREE Problem representation: The case of insight
My favorite example of problem solving in action is the following true story. When I was a graduate student, I attended a departmental colloquium at which a candidate for a faculty position was to present his research. As he started his talk, he realized that his first slide was projected too low on the screen. A flurry of activity around the projector ensued, one professor asking out loud, “Does anyone have a book or something?” Someone volunteered a book, the professor tried it but it was too thick—the slide was now too high. “No, this one’s too big. Anyone got a thinner one?” he continued. After several more seconds of hurried searching for something thinner, another professor finally exclaimed, “Well, for Pete’s sake, I don’t believe this!” He marched over to the projector, grabbed the book, opened it halfway, and then put it under the projector. He looked around at the lecture hall and shook his head, saying, “I can’t believe it. A roomful of PhDs, and no one knows how to open a book!” (Ashcraft, 1994, p. 576) This chapter is about why a roomful of PhDs (bar one) could not solve this problem. In particular it elaborates on the point made in the last chapter that we often do not represent a problem completely. That is, there is often something missing from our representation of a problem, such as the fact that you can open a book; or, indeed, that our representation of a problem may be totally inappropriate. Another point to be made is that intelligence does not necessarily guarantee successful problem solving. In examples such as the one above, the answer is “obvious”, there is no missing bit of knowledge that prevents people from solving it. Finally, this everyday example of problem solving is one where one can imagine a lightbulb lighting up above someone’s head as the person suddenly realises what to do. This lightbulb blinking on—the “Aha!” experience—is known as insight. 
BUILDING A PROBLEM REPRESENTATION
In the real world we are rarely faced with problems where all the relevant information is provided. Usually we need to supplement what the problem says with information from long-term memory, so that we can make inferences and possibly choose operators that will allow us to make changes to our mental model that reflect what would happen if we manipulated the concrete world.

The difficulty facing us when we have to make inferences is two-fold. First, we may build entirely the wrong mental model from the information we read or hear. If you hear, for example, that a man walked into a bar and fell on the floor, the chances are that you have some kind of image in your head about the man’s position and surroundings and so on. If you then hear that it was an iron bar, whatever mental model you had built will probably have to be changed.
Figure 3.1. The Mutilated Chequer Board problem.
You probably know that the word “bar” has at least two meanings, yet I would guess that you picked on only one of them and did not consider any other (at least, not consciously). Our habit of sticking to one mental model once we have built it is the reason we find some types of problem hard to solve, as the next few sections will show. The second difficulty facing us is that we may well build a reasonably correct initial representation of a problem, but this representation may be impoverished in some way because we have no idea what inferences are relevant, such as the fact that a book can be opened; or, indeed, in the worst case we may be unable to make any inferences. This makes the problem hard. An example of this can be found in Activity 3.1.

ACTIVITY 3.1 Imagine that you have a normal chequer board containing 64 black and white squares. You also have 32 dominoes each one of which exactly covers two squares on the chequer board. It is therefore possible, and quite straightforward, to cover the entire board with all 32 dominoes. Now suppose that the chequer board were “mutilated” in such a way that two squares were removed from diagonally opposite corners as in Figure 3.1. You now have 62 squares. How can you cover those 62 squares with 31 dominoes?

RE-REPRESENTING PROBLEMS
You may have tried to solve the Mutilated Chequer Board problem by mentally trying to place dominoes on the board. After a while you will have realised that there are just so many possible permutations of pieces (758,148, in fact) that you can’t keep track of them all. You can’t solve the problem that way because your limited-capacity working memory rapidly becomes overloaded. The solution can be found a lot more easily if you can change your representation of the problem. Here are three ways in which you might try to re-represent the Mutilated Chequer Board problem.
Figure 3.2. A simplified version of the Mutilated Chequer Board problem.
Focus on a different aspect of the problem
First, you may notice that a domino has to cover both a black square and a white square. If you have 31 dominoes then they have to cover 31 black squares and 31 white squares. However, there are only 30 black squares and there are 32 white ones, so it is impossible for 31 dominoes to cover them.

Look at extreme cases
A second way of re-representing the problem is to think of a much simpler version of it. Any conclusions you reach may be scaled up to apply to the full-scale version of the problem. This technique, however, is not always guaranteed to work. It is therefore a heuristic. The simplified version of the Mutilated Chequer Board in Figure 3.2 works for the same reason as the one given in the last paragraph. In this version you can probably imagine placing dominoes on the squares and you can easily see that it is impossible to cover them all no matter how they are arranged. Nevertheless, you can’t really be sure that the same applies to the full-size version unless you understand why. Another example is the Dots problem introduced in Activity 1.1 (Chapter 1). If you start with the simplest form of the puzzle (Figure 3.3) and see what’s involved in solving that, then look at a slightly more difficult version and see what’s involved in solving that, then you can build up a picture of the strategies that are needed to solve the full-size Dots problem.

Find an analogous solution
ACTIVITY 3.2 In the dance floor problem there are 32 dancing couples—32 men and 32 women. If two of the women leave, can the remaining 62 people form heterosexual dancing couples? Explain your answer. (Source: Gick & McGarry, 1992)
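The dance floor question has the same structure as the colour-counting argument given earlier for the Mutilated Chequer Board, and that count can be checked mechanically. The sketch below assumes a conventional colouring in which the square at row 0, column 0 is black; the function name `colour_counts` and the coordinate scheme are hypothetical choices made for illustration.

```python
def colour_counts(size=8, removed=((0, 0), (7, 7))):
    """Count the squares of each colour left on the board after removing some squares."""
    counts = {"black": 0, "white": 0}
    for row in range(size):
        for col in range(size):
            if (row, col) in removed:
                continue  # this square has been cut away
            # Assumed colouring: squares whose coordinates sum to an even number are black.
            colour = "black" if (row + col) % 2 == 0 else "white"
            counts[colour] += 1
    return counts


if __name__ == "__main__":
    counts = colour_counts()
    print(counts)
    # Each domino covers one black and one white square, so equal counts are
    # a necessary condition for covering the board.
    print("cover possible in principle:", counts["black"] == counts["white"])
```

Diagonally opposite corners always share a colour, so removing them leaves 30 squares of one colour and 32 of the other. Equal counts are only a necessary condition for a covering, but unequal counts are enough to rule one out.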
Figure 3.3. Simpler version of the Dots problem.
Figure 3.4. A square “emerges” from the configuration of black dots.
The third way of re-representing the problem is to use an appropriate analogy, although the likelihood of your thinking of one spontaneously is vanishingly small (see Chapter 6). A problem that is identical in structure to the Mutilated Chequer Board problem is the dance floor isomorph given in Activity 3.2. The answer to the dance floor isomorph is obviously no. There are two spare men with no dancing partners. The same reasoning can be applied to the Mutilated Chequer Board problem where there are two spare white squares. If you want to make people really suffer, you can make the Mutilated Chequer Board problem much more difficult by having only white squares.

GESTALT ACCOUNTS OF PROBLEM SOLVING
It was the question of how we represent problems and how we might re-represent them that interested the Gestalt psychologists. The Gestalt school is well-known for its study of perceptual processes. For Gestalt psychologists, the relationships between elements in the visual field gave rise to “wholes”. In Figure 3.4 the viewer sees a square rather than a series of black dots with chunks cut out. As with their study of perceptual phenomena, the Gestalt school provided a description of a number of problem-solving phenomena and also some useful labels for them. With the rise of the behaviourist school, however, the study of problem solving and such inaccessible mental events as insight declined in the middle part of the twentieth century. As we saw in Chapter 2, the emergence of cognitive science has brought renewed interest in problem solving, including the study of insight, within the information-processing framework.
ACTIVITY 3.3 Is the following number divisible by 9? 1,000,000,000,000,000,000,000,008
Figure 3.5. The problem can be seen as adding up an increasing number of boxes.
Gestalt psychologists laid great stress on how we “structure” problems. When we have difficulties solving a problem, insight into its solution can come about by restructuring the problem. Nowadays we would talk of having the wrong, or at least an unhelpful, initial representation of a problem. A solution comes about after we have “re-represented” it correctly. Activity 3.3 gives an example of the types of mathematical problems that Gestalt psychologists were interested in. The initial way in which you may have approached the question in the activity was probably to try to divide the number by 9 to see what happens. This would be a perfectly natural approach to what appears to be a division problem. Our past experience of division problems predisposes us to attempt it by applying the procedure we have learned for dealing with division problems. However, sometimes a learned procedure is not the easiest way of solving the problem. In Activity 3.3, notice what happens when you subtract 9 from the number. Now can you say whether it is divisible by 9? It is always possible that you tried to solve the problem by the “method of extreme cases”. If you did so, you will have noticed that the simple case of 18 is divisible by 9, 108 is divisible by 9, 1008 is divisible by 9, and so on. You may have boldly extrapolated to the number in the Activity and assumed it was divisible by 9 as well. You may even have worked out why.

Another example of this kind of restructuring can often be found on the walls of mathematics classrooms in secondary schools. You may sometimes find a poster there describing how the 6-year-old Karl Gauss, who later became a prominent mathematician (the gauss, a unit of magnetic flux density, is named after him), solved a tedious arithmetic problem very quickly by reconstruing the problem. His teacher, thinking to give himself a few minutes’ peace, had asked the class to add up the numbers 1+2+3+4 etc. up to 100.
Hardly had the class begun laboriously to add up all the numbers when Gauss put his hand up with the answer. How had he done it so quickly? Gauss had seen the problem structured in a way similar to Figure 3.5. The figure looks a bit like a rectangle with a side 100 units long cut diagonally in half. All Gauss did was to complete the rectangle by imagining Figure 3.5 duplicated, flipped over and added to itself as in Figure 3.6. You end up with a rectangle 100×101 giving an area of 10,100. Adding the series 1+2+3+4+5 up to 100 is therefore the same as taking half of 10,100; that is, 5050 (see also Gilhooly, 1996). The young Gauss was able to see relationships between the parts of the problem that the other children in the class were unable to see. In other words he had understood the underlying structure of the problem. Wertheimer (1945) referred to the kind of thinking exhibited by Gauss as productive thinking. This can be contrasted with reproductive thinking where the solver attempts to solve the problem according to previously learned methods—in this case by simply adding up the numbers 1 to 100. Reproductive thinking of this
3. PROBLEM REPRESENTATION: INSIGHT
Figure 3.6. By doubling the number of squares in Figure 3.5 the problem becomes the simple one of finding the area of a rectangle.
latter kind is structurally blind. There appears to be no real understanding of the underlying structure of the problem. Lack of understanding of a problem (or a concept, or a system of relations) can lead to superficial answers to problems, as well as to blindly following a procedure. Peter Hughes, chief examiner in the public awareness of science for the University of Cambridge Local Examinations Syndicate, bemoans the fact that answers to science questions in an English A-Level science exam showed too great a reliance on factual knowledge and little understanding. For example, students “assumed that electric cars would be greener without considering the relative efficiency of petrol and electric vehicles, or exploring other ways of generating electricity, say, by wind or nuclear power” (Hughes, 1996, p. 45). Set effects As Wertheimer’s analysis showed, one of Gestalt psychology’s achievements in the study of problem solving was to point out the difficulties people often have in solving problems because of the inappropriate use of prior knowledge. Using our knowledge can sometimes make us psychologically set in our ways. Applying a learned rule or procedure for doing something when there is a simpler way of doing it is therefore called a set effect. The Gestalt term for using a learned method for solving problems, where a simpler method might be quicker and more appropriate, is Einstellung. Einstellung can be regarded as “the blinding effects of habit” (Luchins & Luchins, 1950). Learned procedures for doing things are extremely useful most of the time. We wouldn’t get on too well if we didn’t apply the rules we had learned in the past to new occurrences of the same situation. Just how many novel ways are there of making a cup of tea or of doing multiplication problems? Do we need to think up novel ways of doing them? Nevertheless, Luchins (1942, p. 
93) argued that a mechanised procedure for solving a particular problem type ceases to be a tool “when…instead of the individual mastering the habit, the habit masters the individual”.
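The Gauss example described earlier gives a compact picture of a learned procedure competing with a simpler one. The sketch below contrasts the reproductive route (adding the numbers one at a time) with the restructured route (taking half the area of an n by n+1 rectangle). The function names are illustrative and do not come from the text.

```python
def sum_by_adding(n):
    """Reproductive route: add 1, 2, 3, ... up to n, one number at a time."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_by_rectangle(n):
    """Restructured route: half the area of an n x (n + 1) rectangle."""
    return n * (n + 1) // 2


print(sum_by_adding(100))     # 100 separate additions
print(sum_by_rectangle(100))  # one multiplication and one halving
```

Both calls print 5050. The first needs a hundred steps and the second needs two, which is the sense in which the learned procedure works but is not the easiest way of solving the problem.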
Another type of mental set is functional fixedness (or functional fixity) where we are unable to see how we might use an object as a tool to help us solve a problem, because that is not the normal function of the object. To get an idea of what functional fixedness means, have a go at Activity 3.4 before reading on.
ACTIVITY 3.4 Imagine you and a friend are on a picnic and you have brought along a bottle of wine. Having found a convenient spot and cooled the wine in a nearby babbling brook, you are ready to eat and drink. At this point you realise you have forgotten to bring a corkscrew. You both frantically empty your pockets looking for something you can use. You find you have a £10 note, a cigarette lighter, a piece of string, a £20 note, some coins of various denominations, a pen, and a box of matches. With a flash of insight you realise how you can completely remove the cork from the bottle. How do you do it?
The point here is that the objects you have found in your pocket all have specific functions, none of which has anything to do with removing corks from bottles. Functional fixedness is being unable to forget for a moment the normal function of an object to be able to use it for a totally novel purpose. In doing Activity 3.4 you may have realised that you can force the cork into the bottle using the pen. You may even have done so in the past. However, the cork is still in the bottle and tends to get in the way when you are pouring the wine. If you tie a fairly large knot in the string and push it into the bottle below the cork, you can then pull on the string which causes the knot to pull the cork out of the bottle. Restructuring, Einstellung and functional fixedness can be illustrated by three classic experiments conducted by Maier (1931), Luchins and Luchins (1959), and Duncker (1945). Study Box 3.1 describes Maier’s account of restructuring in an insight problem; Study Box 3.2 describes Luchins and Luchins’ study of Einstellung; and Study Box 3.3 describes studies by Duncker into functional fixedness.
STUDY BOX 3.1 Maier’s Two-string problem Rationale The aim of the study was to see how people can solve insight problems by “restructuring” the problem, and how they might be led to do so. Method In Maier’s (1931) experiment subjects were brought into a room with the experimenter where there were two strings hanging from the ceiling and some objects lying around on the floor (pliers, poles, extension cords). The subjects’ task was to tie the two strings together. However, the subjects soon found out that if they held onto one string, the other was too far away for them to reach (Figure 3.7).
The only way to solve the problem is to use the objects lying around on the floor. In particular, Maier was interested in how the subjects might use the pliers. After the subjects had been trying to solve the problem for a while, Maier gave one of two hints.
• The experimenter “accidentally” brushed against one of the strings, causing it to swing.
• If the first hint failed, after a few minutes the subject was handed the pliers and told that the problem could be solved using them.
Figure 3.7. Maier’s Two-string problem.
Discussion According to Maier, apparently accidentally brushing against the string often led to a “restructuring” of the problem. Interestingly, very few of the participants who needed a hint to solve the problem seemed to be aware that they had been given any kind of hint at all. They also seemed to fall into two categories based on what they reported afterwards. There were those who reported that the solution “just came to me”, and those who seemed to go through a series of stages: “Let’s see, if I could move the cord”, “throw things at it”, “blow at it”, “swing it like a pendulum”, “aha!”. Using these failed attempts at solving the problem to help refine what the problem is, and thereby to work towards a solution, is known as solution development (Duncker, 1945; Gick & Holyoak, 1980).
STUDY BOX 3.2 The Water Jars problem (Luchins & Luchins, 1959)
Rationale The aim of much of Luchins’ work was to examine the effects of learning a solution procedure on subsequent problem solving, particularly when the subsequent problems can be solved using a far simpler procedure. In such cases learning actually impedes performance on subsequent tasks.
Method In these studies subjects were given a series of problems based on water jars that can contain different amounts of water. Using those jars as measuring jugs, the participants had to end up with a set amount of water. For example, if jar A can hold 18 litres, jar B can hold 43 litres, and jar C can hold 10 litres, how can you end up with 5 litres? The answer is to fill jar B with 43 litres, from it fill jar A and pour out the water from A, then fill C from B twice, emptying C each time. You end up with 5 litres in jar B. After a series of such problems, subjects begin to realise that pouring water from one jar to another always follows the same pattern. In fact, the pattern or rule for pouring is B−A−2C (from the contents of jar B take out enough to fill A, and enough to fill C twice; e.g., 43−18−(2×10)=5). Examples 1, 2, and 3 below all follow that rule (try 2, 3, and 4 for yourself).
Figure 3.8. Luchins’ Water Jars problem.
Study Box 3.2 (continued)
Example   Jug A   Jug B   Jug C   Goal
1         18      43      10      5
2         9       42      6       21
3         14      36      8       6
4         28      76      3       25
Results and discussion When subjects reached problems at the very end of the series, the rule changed as in example 4. Subjects who had induced the rule B−A−2C did much worse than control subjects on this problem. In fact, it can be solved very easily following the rule A−C, which is pretty trivial. Not only that, but when the problem could be solved by either rule (i.e., B−A−2C or A−C), subjects who had learned the complicated rule applied it without noticing that the simpler rule applied. For example, both rules will work in example 3. Did you notice the easier solution in 3? The results showed the effects of Einstellung. Having learned a rule that worked, participants followed it blindly and thus failed to see that a simpler solution could be used. Furthermore, those who had learned a rule got stuck on problems such as example 4 for far longer than those who had not learned a rule, showing the deleterious effects of reproductive thinking.
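The arithmetic behind the four examples can be checked directly. The sketch below, in Python, applies the learned rule B−A−2C and the simple rule A−C to each example and reports which of them reaches the goal. The jar capacities and goals are those given in the Study Box; the function and variable names are illustrative only.

```python
# Capacities of jars A, B, C and the goal amount for examples 1 to 4.
EXAMPLES = {
    1: (18, 43, 10, 5),
    2: (9, 42, 6, 21),
    3: (14, 36, 8, 6),
    4: (28, 76, 3, 25),
}


def learned_rule(a, b, c):
    """B - A - 2C: fill B, then pour off enough to fill A once and C twice."""
    return b - a - 2 * c


def simple_rule(a, b, c):
    """A - C: fill A, then pour off enough to fill C once."""
    return a - c


def rules_that_work(a, b, c, goal):
    """Return the names of the rules that leave exactly the goal amount."""
    names = []
    if learned_rule(a, b, c) == goal:
        names.append("B-A-2C")
    if simple_rule(a, b, c) == goal:
        names.append("A-C")
    return names


for number, (a, b, c, goal) in EXAMPLES.items():
    print(number, rules_that_work(a, b, c, goal))
```

Only the learned rule works for examples 1 and 2, both rules work for example 3, and only the simple rule works for example 4. A solver who applies B−A−2C to example 4 ends up with 42 litres rather than 25.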
STUDY BOX 3.3 The Candle Holder problem (Duncker, 1945) Rationale The aim here was to examine the effects of the “functional fixedness of real solution-objects”. Can people ignore the usual function of objects in order to use them for a different function to solve a particular problem? Method In this study, subjects were presented with the items shown in Figure 3.9. Their task was to fix three candles to a door so that when lit the wax would not drip onto the floor. The experiment was repeated using a number of conditions. In one condition subjects were asked to fix three candles to a door. On the table before them were three matchbox-size boxes, tacks, and candles “among many other objects” (Figure 3.9a). In the second
Figure 3.9a. Duncker’s Candle Holder problem: the “without pre-utilisation” condition.
Figure 3.9b. The “after pre-utilisation” condition (Source: Robertson, 1999, pp. 85–86).
condition, the boxes were filled with candles, matches, and tacks. As the boxes were being used to hold objects, and subjects would have to empty the boxes before using them, this condition was known as the “after pre-utilisation” condition (Figure 3.9b). The other condition was known as the “without pre-utilisation” condition, as the boxes were not used for anything beforehand.
Results All subjects solved the problem in the “without pre-utilisation” condition but only three out of seven solved it in the “after pre-utilisation” condition. In a third condition the boxes were filled with “neutral” objects such as buttons. Here only one subject solved the problem.
Discussion Subjects could not reconceptualise a box containing matches, for example, as a candle holder, due to “fixating” on its function as a matchbox. In Duncker’s words: “the crucial object [the box] is embedded in a particular context, in a functional whole, which is to some degree dynamically segregated” (1945, p. 100). If the functional whole disintegrates, as in Figure 3.9a, the elements (boxes, candles, tacks) are “released from its grasp”.
While the Gestalt psychologists provided descriptions of the situations in which insight occurred or failed to occur, as well as useful methods for examining problem solving (such as the use of verbal protocols), they were less precise about the processes underlying insight. Explanations such as “short-circuiting” normal problem-solving processes don’t tell us how or why such a short-circuit takes place. More recently, therefore, information-processing explanations have been put forward to explain insightful problem solving.
INFORMATION-PROCESSING ACCOUNTS OF INSIGHT
Insight poses a problem for modern information-processing accounts of problem solving. Why should working away at a problem produce no results, whereas a complete answer can sometimes suddenly pop into one’s head? According to Ohlsson (1992, p. 2):
Insight poses a particularly severe challenge to information-processing theories of thinking. If thinking is a species of symbol manipulation (Newell, 1980), then it follows that problem solutions are constructed piecemeal, through heuristic search, means-ends analysis, planning or other step-wise processes. Also there is little reason to believe that extensive problem-solving can happen outside consciousness, in parallel with deliberate efforts, because the capacity of human beings for processing information is limited. The sudden appearance of a complete solution within consciousness constitutes an anomaly, which cannot be accommodated within a computational theory without auxiliary assumptions.
Ohlsson then goes on to explain just how insight might readily be plugged in to an information-processing framework (see page 65). Nevertheless, there are some disagreements among modern-day psychologists about just how special insight is.
Insight as a special property
Does solving the following two types of problem involve the same basic cognitive processes, or is there something special about the first one?
1. A stranger approached a museum curator and offered him an ancient bronze coin. The coin had an authentic appearance and was marked with the date 544 BC. The curator had happily made acquisitions from suspicious sources before, but this time he promptly called the police and had the stranger arrested. Why? (From: Metcalfe, 1986, p. 624)
2. (3x² + 2x + 10)(3x) = ? (From: Metcalfe & Wiebe, 1987, p. 245)
Some psychologists argue that solving these two problems involves the same sorts of processes.
For example, the argument that restructuring in Gestalt insight problems comes about through the same type of search process as described by Newell and Simon was put forward by Weisberg and Alba (1981, 1982). They argued that insightful problem solving involved both a search through the problem space and a search through memory. “Restructuring of a problem comes about as a result of further searches of memory, cued by new information accrued as the subject works through the problem. This is in contrast to the Gestalt view that restructuration is spontaneous” (Weisberg & Alba, 1982, p. 328).
Janet Metcalfe, on the other hand, argued that, if the same memorial processes were at work in insightful as in non-insightful problem solving, then one should find that the metacognitive processes would also be the same. Metacognition (also sometimes referred to as “metamemory” or “metaknowledge”) means “knowing what you know”. If you have played “Trivial Pursuit” you may well have experienced the tip-of-the-tongue phenomenon, where you are sure you know the answer but just can’t quite get it out. It has been shown that people are quite good at estimating how likely they are to find an answer to a question that produces a tip-of-the-tongue state, given time or a hint such as the first letter (Cohen, 1996; Lachman, Lachman, & Thronesberry, 1979). In fact you can produce a gradient of “feeling-of-knowing” (FOK) from “definitely do not know” to “could recall the answer if given more hints and more time”. If, therefore,
insight problems involved a search through memory using the current state of the problem as a cue, one might reasonably expect that one could estimate one’s FOK (in this case feeling that you know how close you are getting to an answer) just as readily for insight problems as for trivia knowledge or even algebra problems. Metcalfe (1986) found that there was a significant positive correlation between subjects’ estimates of FOK and their subsequent performance for trivia questions but not for insight problems. Furthermore, as subjects solved algebra problems, deductive reasoning problems, or the Tower of Hanoi problem, they were able to produce “warmth” ratings as they got closer to a solution: the closer they were to a solution the “warmer” they were (the more confident that they were close to a solution) (Metcalfe & Wiebe, 1987). Indeed, these types of problems showed gradual increases in the subjects’ warmth ratings from 1 (“cold”) to 7 (“very warm”) every 15 seconds. For insight problems, on the other hand, there were hardly any increases in feelings of warmth until immediately before a solution was found. Metcalfe and Wiebe therefore concluded that “insight problems…require a sudden illumination for their solution” (p. 292). They also argued that their study “shows an empirically demonstrable distinction between problems that people thought were insight problems and those that are generally considered not to require insight such as algebra or multistep problems” (p. 243). They even argued that such “warmth protocols” might be used to diagnose problem types.
Insight as nothing special
Despite Metcalfe’s conclusion, there have been several attempts at explaining Gestalt insight problems in “classical” information-processing terms.
Most of the accounts have a lot in common, as they appeal to a number of cognitive processes usually involved in other forms of problem solving such as retrieval of information from long-term memory, search through a problem space, search for relevant operators from memory, problem understanding, and so on. Kaplan and Simon’s account of insight The title of Kaplan and Simon’s (1990) paper In Search of Insight is intended to show that search is part of insightful problem solving. The difference is that, rather than developing a representation of a problem and then searching through that representation (the problem space), the solver has to search for the appropriate representation. Kaplan and Simon use the metaphor of searching for a diamond in a darkened room to illustrate insight. At first you might grope blindly on your hands and knees as you search for the diamond. After a while, though, you may feel that this is getting you nowhere fast. You therefore look for a different way of trying to find the diamond and decide to start searching for a light switch instead. If you find one and turn it on, the diamond can be seen almost at once. Kaplan and Simon used variants of the Mutilated Chequer Board problem to examine the process of search in insightful problem solving. If you re-read the statement of the problem in Activity 3.2, the most obvious apparent solution is to try covering the squares with dominoes. This method of searching for a solution is equivalent to groping in the dark for the diamond. The reason why this strategy is likely to fail is that the search space is too big (one graduate student spent over 18 hours trying to solve it this way). Another way of looking at it is that there are not enough constraints on the problem—there are too many possible paths to search. A problem constraint allows you to “prune the search tree”. Figure 3.10 makes this idea a little clearer. 
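The difference between groping through the search space and finding a constraint that prunes it can be sketched in code. The Python below assumes the usual form of the Mutilated Chequer Board problem, a square board with two diagonally opposite corners removed (Activity 3.2 is not reproduced in this section, so this is an assumption). It uses a scaled-down 4×4 board so that the exhaustive search finishes quickly. One function tries domino placements by backtracking; the other simply counts the squares of each colour. All names are illustrative.

```python
def mutilated_board(n):
    """Squares of an n x n board with two diagonally opposite corners removed."""
    cells = {(r, c) for r in range(n) for c in range(n)}
    return cells - {(0, 0), (n - 1, n - 1)}


def colour_counts(cells):
    """Count squares of each colour; (row + column) even is one colour."""
    even = sum(1 for r, c in cells if (r + c) % 2 == 0)
    return even, len(cells) - even


def tile_by_search(cells):
    """Exhaustive backtracking. Returns (tileable, placements_tried)."""
    tried = 0

    def solve(free):
        nonlocal tried
        if not free:
            return True
        r, c = min(free)  # first uncovered square in row-major order
        # Everything before it is covered, so its domino must extend right or down.
        for neighbour in ((r, c + 1), (r + 1, c)):
            if neighbour in free:
                tried += 1
                if solve(free - {(r, c), neighbour}):
                    return True
        return False

    return solve(frozenset(cells)), tried


board = mutilated_board(4)
tileable, tried = tile_by_search(board)
print("search:", tileable, "after", tried, "placements")
print("colour counts:", colour_counts(board))
```

The search reports failure only after trying every arrangement it can reach. The colour count gives unequal totals in a single pass, and since each domino covers one square of each colour, unequal totals rule out every tiling at once. In this sense a single constraint does the work of pruning the entire tree.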
Kaplan and Simon argued that the problem was hard because there were not enough constraints. The only way to find a solution is to stop searching through the initial representation (the problem space of covering
Figure 3.10a. This depicts an imaginary problem showing all possible stages (grey circles) and paths you can follow (lines) from the start (white circle) to the goal (black circle).
Figure 3.10b. This depicts the effects of pruning the search tree. Constraints thereby allow you to concentrate on fewer paths and steps through the problem space.
squares with dominoes) and search for a representation that provides more constraints. Figure 3.11 depicts this switch from a search through a single representation to a search through a meta-representation (the problem space of problem spaces). Figure 3.11 also contains some of the problem spaces used by subjects and identified from think-aloud protocols. All subjects at first searched for a solution by “covering” squares with dominoes. When this failed to work they eventually switched to another representation of the problem. Some attempted to search for a mathematical solution; some attempted to manipulate the board by, for example, dividing it into separate areas; some sought an analogy or another similar problem. Eventually all tried to find a solution based on parity (that is, a solution based on the fact that the dominoes had to cover two squares of different colours). In sum, Kaplan and Simon argued that insight is not a special type of problem-solving phenomenon, but involves the same processes as other forms of problem solving: “The same processes that are ordinarily used to search within problem space can be used to search for a problem space (representation)” (Kaplan & Simon, 1990, p. 376). Subjects’ difficulty in solving the problem was mainly due to an inappropriate and underconstrained representation.
Figure 3.11. The different representations of the Mutilated Chequer Board problem (Source: Robertson, 1999, p. 50).
Keane’s information-processing explanation of functional fixedness
Another reinterpretation of Gestalt insight problems in information-processing terms comes from Mark Keane. According to Keane (1989a) the difficulties in solving “construction” problems (such as Maier’s Two-string problem and Duncker’s Candle Holder problem) are due to (a) the subjects not having a complete representation of the properties of the objects involved, and (b) consequently failing to access a suitable solution plan from memory. The failure to represent completely the properties of objects leads to functional fixedness. When we think of objects and categories, we access context-independent information about those objects and categories. When we think of a skunk we think of a bad smell; when we think of pliers we think of something for gripping small objects and made of metal; when we think of a box we think of a container; and so on. This context-independent information represents an object’s conceptual core (Barsalou, 1989). However, our knowledge of an object or category may include context-dependent information under the right circumstances. As Barsalou (1989) puts it, “When people read frog in isolation, for example, eaten by humans typically remains inactive in memory. However, eaten by humans becomes active when reading about frogs in a French restaurant” (p. 77). Furthermore, you might more readily retrieve such context-dependent information if it was mentioned or encountered recently. For example, edible may be retrieved when someone thinks of frogs if that person had recently eaten frogs’ legs in a restaurant. Applying this idea of how we recall the properties of objects to the Gestalt “construction” problems, Keane points out that it is the salience of these context-independent properties that either cues or fails to cue a solution plan. If you see a pair of pliers next to an electrician’s toolbox, the property of gripping tightly would be salient and hence incorporated into your representation of the tool. If, on the other hand, you saw the pliers beside the bleeding head of a corpse, you might represent the tool in terms of its heaviness because that property is salient in this context.
In Maier’s Two-string problem the context-independent properties of the pliers (gripping tightly, made of metal) are likely to be represented, and the pliers’ heaviness may not be seen as salient. In Duncker’s Candle Holder problem the salience of the boxes as
containers is emphasised because they contain tacks, matches, etc. Other possible properties of the box are therefore not immediately represented.
The second aspect of Keane’s theory involves seeing the process of problem solving as attempting to access relevant solution plans from long-term memory. The objects lying around the room in Maier’s and Duncker’s problems are “bad cues” for retrieving relevant solution plans. “This view of problem solving as a sort of memory recall task conflicts with the traditional view of problem solving; that it is something more than simply remembering. However, remembering the right thing at the right time is not a trivial task; in fact, it could be viewed as the very basis of human intelligence” (Keane, 1989b, p. 4).
In summary, Keane’s view of solving certain types of Gestalt insight problems involves looking at the solution process in terms of the way we represent the objects and the kinds of retrieval processes that come into play based on those object representations. To the extent that our representations of objects and categories are unstable (Barsalou, 1989) (that is, the information about an object or category that we deem to be relevant may vary from context to context) we are often unable to pick out the salient properties of the objects. This leads to functional fixedness, for example, when only context-independent properties are represented. However, the salience of an object’s properties can be manipulated, to make it more likely that it will cue a relevant solution plan. Thus, in Duncker’s Candle Holder experiment, when subjects were given the candles, matches, and tacks inside boxes, the container property of the boxes was seen as salient. When all the objects including the boxes were presented separately, the “container” property of the boxes was less salient and made it more likely that subjects would find a solution.
Ohlsson’s theory of insight One of the features of insight problems is that, once you have read the problem and made a few incorrect attempts at solving it, you get stuck. Problem solving grinds to a halt. Getting stuck in the course of problem solving is usually referred to as reaching an impasse (e.g., Laird, Newell, & Rosenbloom, 1987; Newell, 1990; VanLehn, 1990). A second aspect of insight problems is that people are usually able to solve them but they don’t realise it. The answer is often obvious once you hear it. Understanding how to solve an insight problem is therefore a bit like getting a joke (Koestler, 1970). Jokes often rely on the listener generating a typical, but in this case wrong, interpretation of a situation. One could turn the statement “a man walked into a bar and fell on the floor” into an insight problem by adding the question “Why?” The answer (the punch line) is that it was an iron bar. The point here is that you could have solved the problem (got the joke) if you had accessed the relevant meaning of “bar”. It is not that you didn’t know that bar had two meanings, it’s just that you didn’t realise which one was relevant. Thus, “insight occurs in the context of an impasse, which is unmerited in the sense that the thinker is, in fact, competent to solve the problem” (Ohlsson, 1992, p. 4). The corollary of this is that if you do not have a particular competence then you cannot have an insight. You are “terminally stuck” as Ohlsson puts it. Once again a joke can illustrate this point: Question: How many Heisenbergs does it take to change a light bulb? Answer: If you knew that you wouldn’t know where the light bulb was.
ACTIVITY 3.5 The Radiation problem
Suppose you are a doctor faced with a patient who has a malignant tumour in his stomach. It is impossible to operate on the patient, but unless the tumour is destroyed the patient will die. There is a kind of ray that can be used to destroy the tumour. If the rays reach the tumour all at once at sufficiently high intensity, the tumour will be destroyed. Unfortunately, at this intensity the healthy tissue that the rays pass through on the way to the tumour will also be destroyed. At lower intensities the rays are harmless to healthy tissue, but they will not affect the tumour either. What type of procedure might be used to destroy the tumour with the rays, and at the same time avoid destroying the healthy tissue?
(Gick & Holyoak, 1980, pp. 307–308)
If you don’t know anything about Heisenberg’s Uncertainty Principle then you won’t get the joke: you would be terminally stuck. Similarly if an insight problem requires for its solution knowledge that you do not have, then there is no way you can get out of the impasse.
A third aspect of insight problems is that, as Weisberg (1995) pointed out, you can have an “Aha!” experience on problems that would not normally be classed as insight problems. One might suddenly have an insight into how to solve an algebra problem. Furthermore, an insight may well be completely wrong. Some of these and other points made by Ohlsson will be illustrated with the Radiation problem originally used by Duncker (1945). The version here (Activity 3.5), however, is taken from Gick and Holyoak (1980).
Ohlsson’s theory is based on five problem-solving principles.
1. Reading a problem generates a mental representation of the problem’s givens (the situation described by the problem) and some solution criterion.
2. Based on this mental representation we access a set of mental operators that we think might apply. Associated with the operators is information about pre-requisites and the effects of applying them. The solver can “see” what happens in his or her mind’s eye when an operator is applied.
3. Problem solving is sequential, hence only one operator can be selected and applied at a time from those retrieved from memory. On the basis of some heuristic or plan, an operator is chosen from the ones retrieved from memory. Naturally, any operators that are not retrieved cannot be executed.
4. Retrieving a relevant operator from the vast number in memory is not trivial. It comes about through spreading activation (bits of information in a semantic network that are related to the current context are activated, some more strongly than others). Activation spreads from information currently in working memory or in the goal stack. Spreading activation is an unconscious process.
5. The mental representation we form of the problem situation acts as a memory probe for relevant operators in long-term memory. The operators retrieved will have some semantic relationship with the problem situation and goal. Operators that have no such semantic relationship will not be retrieved. Notice that this aspect of the theory resembles Keane’s, in which salient features of the problem activate related operators.
When a problem is unfamiliar we may not interpret it in an optimal way. We therefore encounter an impasse when we generate a representation based on an interpretation that does not allow us to retrieve relevant operators from memory. When solvers hit an impasse, the only way out is to construct a new representation of the problem (we “restructure” the problem). The new representation generates a different spread of activation. In the Radiation problem any interpretation that involves rays projected at full power is not going to work. Firing the rays down the oesophagus, for example, won’t work as there is no straight path that way to the tumour. Opening the stomach to clear the way for the rays is expressly forbidden. However, providing hints
Figure 3.12. Diagrams used by Duncker in describing the Radiation problem (adapted from Duncker, 1945).
or rephrasing the problem statement can have an effect by acting as different memory probes, allowing different mental operators and hence different solutions to be attempted. Duncker gave two diagrams with the Radiation problem to two groups of people (Figures 3.12a and 3.12b). Figure 3.12b was more effective at generating a solution. He also tried rephrasing the problem statement causing the participants to focus on different aspects of the problem:
1. How could one prevent the rays from injuring…
2. How could one protect the healthy tissues from being injured?
Giving different diagrams and presenting the question in different ways influenced the representation subjects formed of the problem. There are, according to Ohlsson, three ways that one can change an initial representation:
Elaboration. The solver might notice features of the problem that he or she had not noticed before. This might enrich or extend the representation of the problem. In the Mutilated Chequer Board problem, for example, the solver might notice that the domino has to cover one square of each colour, so if two squares of the same colour are missing then the Chequer Board cannot be entirely covered by dominoes. Elaboration can also come about by retrieving relevant information from long-term memory.
Re-encoding. The representation of the problem may be mistaken rather than incomplete. In Duncker’s Candle Holder problem, the solver has to re-encode the boxes from containers to platforms. Similarly the thinker has to re-encode the pliers in Maier’s Two-string problem as a pendulum weight.
Constraint relaxation. Sometimes the solver may have assumed that there were constraints placed on the problem that were not in fact there. In the Radiation problem there is nothing to stop you using more than one ray machine or changing the intensity of the rays. To solve it you have to relax any constraints you may have imposed about the number of ray machines at the doctor’s disposal.
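The idea that a representation acts as a memory probe, and that changing the representation changes which operators are retrieved, can be illustrated with a toy sketch in Python. Everything in it is hypothetical: the operator descriptions, the cue labels, and the all-or-none matching are simplifications chosen for illustration, and the graded spreading activation that Ohlsson describes is not modelled. Constraint relaxation is also left out.

```python
# Hypothetical operator memory: each operator is indexed by the object
# properties that would cue its retrieval.
OPERATOR_CUES = {
    "store small items in the box": {"container"},
    "tack the box to the door as a shelf": {"platform"},
    "grip something with the pliers": {"gripping tool"},
    "tie the pliers to a string as a weight": {"heavy object"},
}


def retrieve(representation):
    """Return operators that share at least one cue with the representation."""
    return sorted(op for op, cues in OPERATOR_CUES.items() if cues & representation)


def re_encode(representation, old, new):
    """Re-encoding: replace a mistaken property with a different one."""
    return (representation - {old}) | {new}


def elaborate(representation, extra):
    """Elaboration: add a property that had not been noticed before."""
    return representation | {extra}


initial = {"container", "gripping tool"}
print(retrieve(initial))

boxes_as_platforms = re_encode(initial, "container", "platform")
print(retrieve(boxes_as_platforms))

pliers_as_weight = elaborate(boxes_as_platforms, "heavy object")
print(retrieve(pliers_as_weight))
```

With the initial representation only the everyday uses of the box and pliers are retrieved. After re-encoding the box as a platform, the shelf operator becomes available, and after elaborating the pliers as a heavy object, the pendulum-weight operator does too. Operators with no link to the current representation are never retrieved, which is the sketch’s stand-in for an impasse.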
Completing the solution
If an impasse is successfully broken by forming a re-representation of the problem, a new set of operators becomes available. This leads to either a partial insight or full insight. A partial insight occurs when a new path through the problem can suddenly be seen. A full insight is where the answer is suddenly staring you in
Figure 3.13. The Nine Dots problem.
the face. The latter is hard to explain within an information-processing framework. Ohlsson suggests that a solution is constructed at the moment of insight. This process is fast, hence it seems immediate (the “Aha!” experience). How such a construction process might come about can be seen from a new view of a well-known insight problem. The Nine Dots problem is an example of a problem where the difficulty seems to be in overcoming self-imposed constraints (Figure 3.13 in Activity 3.6).
ACTIVITY 3.6 Draw four straight lines that connect all nine dots without taking the pencil from the paper. There is a hint in the text (answer on page 235).
A Gestalt explanation of the Nine Dots problem is that there is a perceptual constraint: subjects assume they have to stay within the square formed by the nine dots (Scheerer, 1963). However, there has been some argument over the exact nature of the constraints involved in this problem. Weisberg and Alba (1981) found that only 20% of their subjects produced a correct solution after being given a hint to go outside the dots. As so few people solved it despite the constraint being presumably “relaxed”, Weisberg and Alba argued that the difficulty could not be due to a perceptual constraint. Lung and Dominowski (1985) argued that the constraint was that people felt that lines had to begin and end on dots. In a series of studies, MacGregor, Ormerod, and Chronicle (in press; Chronicle, Ormerod, & MacGregor, in press; Ormerod, Chronicle, & MacGregor, 1997) argue that the self-imposed constraints are due to the information-processing demands of the problem. For example, Chronicle et al. found that, despite being given visual cues to guide them towards constraint relaxation, participants were still unlikely to solve the problem.
MacGregor et al. therefore propose a model of performance on the Nine Dots problem based on two general problem-solving heuristics: a type of means-ends analysis, and progress monitoring based on some criterion. The Nine Dots problem is an abstractly defined problem, so there is no explicit goal state (no “end”, as it were) that can readily be reached by difference reduction. Nevertheless, MacGregor et al. argue that, in abstractly defined problems, people try to use operators that are “locally rational”. Such an operator would allow a solver to reduce the difference between where they are in a problem and some “local sub-goal state”. Specifically, people try to cancel as many dots as possible in one move. This means-end strategy MacGregor et al. refer to as a “maximisation heuristic”. The second strategy is to monitor progress through a problem.
One way to do this is to look one or more moves ahead. Unfortunately, capacity limitations restrict our ability to do this. Another way to monitor progress is to evaluate prospective moves based on some kind of criterion: in this case, cancel an average of 2.25 dots per move. MacGregor et al. argue that fixating on the square or failing to consider non-dot turning points are the results of applying the local means-end heuristic to the basic Nine Dots problem. It also explains the finding
Figure 3.14. Two variants of the Nine Dots problem presented in MacGregor et al. (in press).
shown in Figure 3.14. In the figure, the first line provided in (a) was less likely to lead to a correct solution than the diagonal line in (b), despite the fact that the horizontal line in (a) seems to indicate a relaxation of a constraint that says you should stay within the square. The model also predicts that experience of criterion failure in problems of this kind (possibly brought about by various manipulations) may cause the problem space to be expanded and new operators sought that may in turn lead to an insightful solution (e.g., drawing a line outside the dot array). Such a move may produce a “promising state” that may in turn lead to a re-conceptualisation of the problem. MacGregor et al. seem to have succeeded in building a detailed process model of success and failure on abstractly defined problems such as the Nine Dots problem. A traditional insight problem has yielded to an information-processing account that can predict success and failure on the problem and its variants. Not only that, but their model also shows “how insight may be achieved incrementally through experience of one or many partial solutions”.
THE RELATIONSHIP BETWEEN INSIGHT PROBLEMS AND OTHER PROBLEMS
The necessity to re-represent problems is not confined to those normally referred to as insight problems. All so-called “word” problems in algebra and physics
ACTIVITY 3.7 Rate problems 1. The Mad Bird problem Two train stations are 50 miles apart. At 2 p.m. one Saturday afternoon two trains start towards each other, one from each station. Just as the trains pull out of the stations, a bird springs into the air in front of the first train and flies ahead at 100 mph to the front of the second train. When the bird reaches the second train it turns back and flies towards the first train. The bird continues to do this until the trains meet. If both trains travel at the rate of 25 miles per hour, how many miles will the bird have flown before the trains meet? (Posner, 1973, pp. 150–151)
2. The River Current problem You are standing by the side of a river which is flowing past you at the rate of 5 mph. You spot a raft 1 mile upstream on which there are two boys helplessly adrift. Then you spot the boys’ parents 1 mile downstream paddling upstream to save them. You know that in still water the parents can paddle at the rate of 4 mph. How long will it be before the parents reach the boys? (Hayes, 1989a, p. 25)
(problems involving some kind of cover story rather than a simple sum) require the solver to represent the problem appropriately before a way of solving it (a solution procedure) becomes obvious. The study of insight problems therefore has important consequences for our understanding of how we solve all kinds of problems. Activity 3.7 contains two examples of distance=rate×time problems and ways of representing them. There are two features of these kinds of algebra problems that are important. The first is that you often have to throw common sense out of the window (Birds don’t fly at 100 mph and anyway why would it fly back and forth like that? How do the parents know their sons are adrift on a raft two miles away?). Common sense, or general world knowledge, can often interfere with such problem solving. Second, both problems are written in such a way that certain features are more salient than others—that is, they stand out. One effect of this is that you may be led into representing the problems in a certain way. In the first case you may be led to represent the problem from the point of view of the bird. In the second case you may be led to represent the problem from the point of view of the person on the bank. Both these representations of the problems make the solutions harder. A representation of the Mad Bird problem from the bird’s point of view, as in Figure 3.15, involves trying to add up how far the bird travels each time it flies from one train to the other. This makes the solution rather difficult. However, the problem is readily solved if we ignore the bird for the moment and concentrate on the trains as in Figure 3.16. All you need to find out to begin with is how long it takes the trains to meet. Both trains travel at a very sluggish 25 miles an hour and meet after travelling 25 miles so
Figure 3.15. The Mad Bird problem from the bird’s point of view.
Figure 3.16. The Mad Bird problem from the trains’ point of view.
Figure 3.17. The River Current problem from the observer’s point of view.
they take an hour to meet. The problem now becomes how far does the bird travel in an hour at a speed of 100 miles an hour. The answer is obviously 100 miles. The River Current problem can be represented in two ways as in Figures 3.17 and 3.18. In Figure 3.17 the parents are paddling at 4 mph but the river current is 5 mph so they are actually heading away from the observer at 1 mph. The boys are travelling in the same direction at 5 mph so the difference between the two speeds is 4 mph. That is, the boys are approaching their parents at 4 mph. If they travel 4 miles in one hour then they will travel 2 miles in half an hour. In Figure 3.18 we can forget about the observer and take the point of view of the boys on the raft. We can also forget about the river current, as it affects the parents and the boys equally (similarly, if you are sitting on a train travelling at 80 mph you don’t normally think of a ticket inspector as moving at 82 mph—you can ignore the train’s speed as it affects you both equally and regard the inspector as moving at a much more sedate 2 mph). The only relevant figure therefore is the speed at which the parents are paddling, i.e., 4 mph. In which case they will reach the boys in half an hour.
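The two representations of the Mad Bird problem can be compared numerically. The sketch below is illustrative only: the default figures come from Activity 3.7, while the function names and the number of legs summed are assumptions. The first function takes the trains' point of view, and the second adds up the bird's flights leg by leg.

```python
def bird_distance_via_trains(gap=50.0, train_speed=25.0, bird_speed=100.0):
    """Trains' point of view: find the time to meet, then distance = rate x time."""
    time_to_meet = gap / (2 * train_speed)  # the trains close the gap together
    return bird_speed * time_to_meet


def bird_distance_via_legs(gap=50.0, train_speed=25.0, bird_speed=100.0, legs=200):
    """Bird's point of view: add up each leg flown between the two trains."""
    total = 0.0
    for _ in range(legs):
        # The bird and the oncoming train approach at their combined speed.
        leg_time = gap / (bird_speed + train_speed)
        total += bird_speed * leg_time
        # Both trains keep moving during the leg, so the gap shrinks.
        gap -= 2 * train_speed * leg_time
    return total


print(bird_distance_via_trains())          # 100.0
print(round(bird_distance_via_legs(), 6))  # approaches 100 as more legs are summed
```

Both routes arrive at the same distance, but the second needs a loop over ever shorter legs where the first needs a single division and a single multiplication.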
Figure 3.18. The River Current problem ignoring the observer.
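The two representations of the River Current problem (Figures 3.17 and 3.18) can likewise be written out side by side. This is a hedged sketch: the figures are those given in Activity 3.7, and the function names and sign convention (downstream counted as positive) are assumptions.

```python
def hours_to_meet_observer_frame(gap_miles=2.0, current=5.0, paddling=4.0):
    """Observer's point of view: speeds are measured relative to the bank."""
    boys_velocity = current                # the raft drifts downstream
    parents_velocity = current - paddling  # paddling upstream against the current
    closing_speed = boys_velocity - parents_velocity
    return gap_miles / closing_speed


def hours_to_meet_raft_frame(gap_miles=2.0, paddling=4.0):
    """Boys' point of view: the current affects both parties equally and drops out."""
    return gap_miles / paddling


print(hours_to_meet_observer_frame())  # 0.5
print(hours_to_meet_raft_frame())      # 0.5
```

The current appears in the first function only to cancel itself out, which is the point the second representation makes directly.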
INFLUENCING PROBLEM REPRESENTATIONS: THE EFFECT OF INSTRUCTIONS
The two examples just quoted illustrate how we are often “persuaded” to form a particular representation of a problem by the way a problem is worded. The Mad Bird problem is likely to generate a mental representation involving a bird flying between the two trains. The River Current problem invites you to represent the problem in terms of the speed of the river past an observer, so that’s what you do. You are given information, some of which appears to be particularly salient by the way the problem is worded, and so the representation you form is based on that salient information. The effect of instructions on the representations people form of problems was investigated by Hayes and Simon (1974; Simon & Hayes, 1976) using two variants of the Tower of Hanoi problem. When two problems have an identical underlying structure (as revealed, for example, by state-space analysis) they are said to be isomorphic (see also Chapter 4). The two isomorphs of the Tower of Hanoi problem used by Simon and Hayes are shown in Table 3.1. In the move isomorph, globes are transferred from one monster to another in much the same way that disks in the Tower of Hanoi are moved from one peg to another and with the same constraints: if a peg (monster) has more than one disk (globe) only the smallest can be transferred to another peg (monster); a large disk (globe) cannot be placed on a peg (moved to a monster) that has a smaller disk
TABLE 3.1 The Monster problem
A move isomorph
Three five-handed extraterrestrial monsters were holding three crystal globes. Because of the quantum-mechanical peculiarities of their neighbourhood, both monsters and globes come in exactly three sizes with no others permitted: small, medium, and large. The medium-sized monster was holding the small globe; the small monster was holding the large globe; and the large monster was holding the medium-sized globe.
As this situation offended their keenly developed sense of symmetry, they proceeded to transfer globes from one monster to another so that each monster would have a globe proportionate to his own size. Monster etiquette complicated the solution of the problem as it requires:
1. that only one globe may be transferred at a time;
2. that if a monster is holding two globes, only the larger of the two may be transferred;
3. that a globe may not be transferred to a monster who is holding a larger globe.
By what sequence of transfers could the monsters have solved the problem?
A change isomorph
Three five-handed extraterrestrial monsters were holding three crystal globes. Because of the quantum-mechanical peculiarities of their neighbourhood, both monsters and globes come in exactly three sizes with no others permitted: small, medium, and large. The medium-sized monster was holding the small globe; the small monster was holding the large globe; and the large monster was holding the medium-sized globe. As this situation offended their keenly developed sense of symmetry, they proceeded to shrink and expand the globes so that each monster would have a globe proportionate to his own size. Monster etiquette complicated the solution of the problem as it requires:
1. that only one globe may be changed at a time;
2. that if two globes are of the same size, only the globe held by the larger monster can be changed;
3. that a globe may not be changed by a monster who is holding a larger globe.
By what sequence of changes could the monsters have solved the problem?
(Adapted from Simon & Hayes, 1976, pp. 168–169)
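The state-space analysis mentioned in connection with Table 3.1 can be illustrated for the move isomorph. The following Python sketch is not the UNDERSTAND program or the General Problem Solver; it is a plain breadth-first search written for illustration. The tuple encoding of states, the names `START`, `GOAL`, `legal_moves`, and `solve`, and the reading of rule 2 as "only the largest globe a monster holds may be transferred" (so that it also covers a monster holding three globes) are assumptions.

```python
from collections import deque

SIZES = ("small", "medium", "large")  # an index (0, 1, 2) doubles as a size rank

# A state records, for the small, medium, and large globe in turn,
# which monster (by size index) is holding it.
START = (1, 2, 0)  # small globe: medium monster; medium: large; large: small
GOAL = (0, 1, 2)   # every monster holds the globe of its own size


def legal_moves(state):
    """Yield (globe, target, new_state) for each transfer that etiquette allows."""
    for globe, holder in enumerate(state):
        # Rule 2 (as read here): only the largest globe a monster holds may move.
        if any(other > globe and state[other] == holder for other in range(3)):
            continue
        for target in range(3):
            if target == holder:
                continue
            # Rule 3: the receiving monster must not be holding a larger globe.
            if any(other > globe and state[other] == target for other in range(3)):
                continue
            new_state = list(state)
            new_state[globe] = target  # Rule 1: one globe per transfer
            yield globe, target, tuple(new_state)


def solve(start=START, goal=GOAL):
    """Breadth-first search returning a shortest list of (globe, from, to) transfers."""
    queue = deque([start])
    came_from = {start: None}
    while queue:
        state = queue.popleft()
        if state == goal:
            break
        for globe, target, nxt in legal_moves(state):
            if nxt not in came_from:
                came_from[nxt] = (state, globe, state[globe], target)
                queue.append(nxt)
    moves = []
    state = goal
    while came_from[state] is not None:
        state, globe, source, target = came_from[state]
        moves.append((SIZES[globe], SIZES[source], SIZES[target]))
    return moves[::-1]


for globe, source, target in solve():
    print(f"{globe} globe: {source} monster -> {target} monster")
```

Because the search works only on the abstract states and rules, the same code would serve for a Tower of Hanoi with relabelled pegs and disks, which is what it means for the two problems to be isomorphic.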
(globe). The change isomorph is trickier. It involves changing the sizes of globes, and the globes stay with a single monster. Actually what has happened is that the equivalents of pegs and disks have swapped round in the two representations. The equivalent of a disk in the move isomorph is actually a peg in the change isomorph.
Simon and Hayes developed a computer program called UNDERSTAND that would take as input the problem instructions and identify the goal, states, and operators that applied to the problem. This information was then fed to a general-purpose program called the General Problem Solver. The UNDERSTAND program predicted that the representations solvers would generate (in terms of states and operators) would be entirely determined by the problem instructions. For example, a move in the change isomorph would be represented as a particular monster changing the size of its globe. When Simon and Hayes (1976) analysed the verbal protocols of people attempting to solve the two variants, they found that their subjects were powerfully influenced by the version of the story they were given, as UNDERSTAND predicted.
This chapter has shown that our initial representation of a problem can often determine its difficulty, or even whether it is possible to solve it at all. In insight problems in particular we often hit an impasse early on that prevents further search for a solution. Whereas in well-defined problems a solution may be found by searching through a problem space, in insight problems a solution can only be found by searching for a useful problem space. Recently theorists have shown that the processes involved in trying to solve ill-defined or abstractly defined insight problems are essentially similar to those used in solving well-defined problems. We are now in a position to say much more precisely why such problems are difficult.
SUMMARY
1. The way we represent problems when we encounter them has a powerful influence on our ability to solve them.
Occasionally, our representation of a problem is so poor that we get stuck: we reach an impasse.
2. It is sometimes possible to get out of an impasse by such means as:
• focusing on a different aspect of the problem;
• looking at extreme conditions;
• trying to find an analogy;
• re-encoding aspects of the problem, that is, trying to see them in different ways;
• relaxing constraints that we have inadvertently placed on the problem.
3. Gestalt psychologists were interested in how we represent and “restructure” problems. They viewed thinking as often either reproductive, where we use previously learned procedures without taking too much account of the problem structure, or productive, where thinking is based on a deep understanding of a problem’s structure and is not structurally blind.
4. They were also interested in the failures of thinking due to:
• functional fixedness, when we fail to notice that an object can have more than one use;
• the effects of set, when we apply previously learned procedures when a simpler procedure would work.
5. Insight problems would appear to pose problems for information-processing theories of problem solving because they do not appear to involve sequential, conscious, heuristic search. Consequently some researchers have viewed insight as a special case of problem solving. Others have tried to fit insight into traditional information-processing accounts:
• Kaplan and Simon saw insight as a search for a representation rather than a search in a representation.
• Keane pointed out that functional fixedness could be understood in terms of the salient properties of objects. That is, functional fixedness is due to the organisation of semantic memory.
• Ohlsson argued that we access one operator at a time based on our initial interpretation of a problem. Retrieving operators is an unconscious process involving spreading activation (and hence also depends on the organisation of semantic memory). When we cannot retrieve an operator based on our initial representation we reach an impasse and have to change the representation before we can access a relevant operator.
6. MacGregor et al. have produced a detailed model of the processes involved in solving abstractly defined insight problems such as the Nine Dots problem. The model incorporates heuristics such as means-ends analysis, and shows how the conditions for an insightful solution might arise.
7. The study of insight problems is important because the processes involved are often the same as those involved in establishing an appropriate representation of a situation or word problem. Without an appropriate representation, no relevant operators can be accessed and problem solving reaches an impasse or becomes more difficult.
8. The way a problem is phrased can have a powerful effect on the representation that we form of it. Simon and Hayes showed how an initial representation can be formed from problem instructions and how it can be modelled.
PART THREE Analogical problem solving
Part Three: Introduction
Very often the simplest way of solving a problem is to think of a similar one we have solved in the past and use that solution. The general term for using an earlier problem to solve a new one is analogical problem solving. Although we can often use an earlier problem as an analogy, the term analogy can be used to cover a wide range of phenomena. For example, if we are trying to explain an unfamiliar concept to someone it is common to use some kind of analogy with a concept the person already knows. Analogies in the form of metaphors can also provide a new way of looking at things with which we are already familiar, allowing us to see new and perhaps unforeseen relationships. Analogies, then, come in all shapes and sizes. Because of their ubiquity, they are used for a variety of purposes. They can be found in:
Poetry: Shall I compare thee to a summer’s day?
Pithy sayings: Experience is the comb nature gives us when we are bald.
Textbook explanations: Russian dolls can be used to explain recursion.
Explanations of systems: The heart can be seen as a pump.
Witty repartee:
Unknown interlocutor: Ah, I see we are to have a battle of wits.
Oliver Wendell Holmes: Sir, I never fight an unarmed man.
Aesthetic prose: “No bull ever abused a china shop as sex abuses the DNA archives” (Dawkins, 1995, p. 40).
Persuasive argument: President Johnson believed there would be a domino effect if Vietnam fell to the communists and therefore escalated the war in Vietnam.
Instructional texts: Science, mathematics, and computing textbooks usually contain example problems and exercise problems; the student uses the examples to solve the exercise problems.
Several important issues concerning using past examples or experience to solve a current problem or make sense of a current situation are dealt with over the next four chapters. Generally, the most important thing to bear in mind about the usefulness of analogies in solving new problems is that you have to understand the analogue if it is to be of any use. Similarly, if analogies are used to illustrate new concepts in teaching texts, the analogies have to be understandable, otherwise the new concepts may not be well understood. This aspect of analogical problem solving has important ramifications for teaching with analogies. Curtis and Reigeluth (1984) found a total of 216 analogies used in 26 science textbooks to elucidate scientific concepts. Analogies are also particularly relevant to the kind of textbook problem solving discussed later in Chapter 7.
Another important aspect of analogies is that the similarity between the analogue and the problem you are trying to solve (or the concept you are trying to understand) can vary enormously. Bulls in china shops seem to have very little on the surface to do with genetic inheritance through sex. On the other hand, an example involving cars overtaking one another is very similar to an exercise problem involving trucks overtaking one another. All analogies have to be adapted in some way in order to solve a new problem. As they need to be adapted in some way, there is always a point at which an analogy breaks down. The surface features of analogues and the need to adapt the analogue to solve the new problem have a powerful influence on analogical problem solving. For a novice, identifying the relevant similarities and ignoring the irrelevant differences can be tricky. Thus analogies may help novices by shifting their focus away from the surface features of problems to the underlying structure. When we are presented with numerous examples of the same type of thing we normally eventually abstract out the commonalities between them. As toddlers we saw many examples of dogs and thereby learned to recognise new instances of dogs and to distinguish them from cats. Now, if we see a strange and unusual breed of dog for the first time, we can readily categorise it as a dog. The process of abstracting out the common features of things in this way is known as induction. Applying what we have learned— generalising from one or a few examples to a whole range of new examples—is known as deduction. Induction is the process of moving from the particular to the general, and deduction is the process of going from the general to the particular. Most of what we have learned about the world is through induction. For this reason a large part of this book is devoted to the inductive role of analogies and examples in learning.
CHAPTER FOUR Transfer of learning
The two problems described in Activity 4.1 are different in their surface features (one is about currency exchange rates and the other is about a car’s journey) and similar in their underlying structure—in both you get the answer by multiplying the two known quantities (both are therefore solved using the equation: a=b×c). The elements (45 mph, 5 hours) or surface features of a problem are related to each other, and the set of relations (represented here by the structure of the equation) constitutes the problem’s structure. As the examples in Activity 4.1 show, the set of relations may indicate a procedure that has to be followed. Over the next few chapters you will find that the emphasis is sometimes on the relational structure and sometimes on the procedure that derives from it, although the two are intertwined.
ACTIVITY 4.1 In what ways are these two problems similar and different?
A. A tourist in St Tropez wants to convert £85 to Euros. If the exchange rate is 1.8€ to the pound how many Euros will she get?
B. A car travelling at an average speed of 40 mph reaches its destination after 5 hours. How far has it travelled?
1. How likely is it that learning to do problem A will make it easier to do problem B?
2. Suppose you have just exchanged pounds for Euros and are idly motoring through the French countryside pondering question B; how likely are you to be reminded of problem A?
3. If so, why; and if not, why not?
4. Under what conditions are you likely to use A to solve B?
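The shared structure of the two problems in Activity 4.1 can be made explicit in a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and the numbers are those stated in problems A and B.

```python
def multiply_known_quantities(b, c):
    """The relational structure common to both problems: a = b x c."""
    return b * c


# Problem A: the surface features are pounds and an exchange rate.
euros = multiply_known_quantities(85, 1.8)

# Problem B: the surface features are a speed and a journey time.
miles = multiply_known_quantities(40, 5)

print(round(euros, 2), miles)  # 153.0 200
```

The same procedure serves both problems. Only the labels attached to the quantities, that is, the surface features, differ.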
Using the experience gained on solving an earlier problem to solve a new one of the same type is called analogical transfer. The seeming ease with which human beings are able to generalise, classify, and indeed to generate stereotypes, suggests that transfer of learning is a simple and straightforward affair. As Jill Larkin (1989, p. 283) puts it:
Everyone believes in transfer. We believe that through experience in learning we get better at learning. The second language you learn (human or computer) is supposed to be easier than the first… All these common beliefs reflect the sensible idea that, when one has acquired knowledge in one setting, it should save time and perhaps increase effectiveness for future learning in related settings.
However, there are some surprising limits to our ability to transfer what we learn, and much research has gone into the study of the conditions under which transfer occurs or fails to occur. At the beginning of the twentieth century, and for a long time before that, it was assumed that what children learned from one subject would help them learn a host of others. The prime example of such a subject that was supposed to transfer to all sorts of other disciplines was Latin. Its “logical” structure was thought to improve the learning of mathematics and the sciences as well as languages. More recently the computer-programming language Logo has replaced Latin as the domain that is believed by some to produce transfer of learning to other domains (Klahr & Carver, 1988; Papert, 1980, 1993).
Another view at the beginning of the twentieth century, which goes back to Thorndike’s theory of identical elements (Thorndike, 1913; Thorndike & Woodworth, 1901), is that we cannot expect transfer when there are no similar surface elements, even if two problems share the same underlying features. Two tasks must share the same perceptually obvious features such as colour or shape before one can be used to cue the solution to the other, or they must share the same specific stimulus-response associations. A consequence of this view is that transfer is relatively rare. Latin cannot help one do mathematics because they don’t share identical surface elements. Learning a computer language such as Lisp is not going to help anyone learn a new language such as Prolog; and so on.
Recently the tension between these opposing views (transfer is ubiquitous versus transfer is rare) has tended to centre on the relative roles played by domain-general and domain-specific skills (of which more later). General transfer involves the learning of generalisable skills or habits. If you learn to write essays that get good marks in, say, politics or economics, then the chances are that you will be able to use your knowledge of how to structure an argument when you come to write essays in a totally different field such as history or psychology. You don’t have to learn to write essays from scratch when you shift to a different field. Furthermore, there is a phenomenon known as learning to learn, whereby when someone performs a novel task once, such as learning a list of words or a list of paired associates (pairs of words in which the first word of the pair is often later used to cue the second), then performance on this type of task is greatly improved the second time it is performed even though the words used are entirely different (Postman & Schwartz, 1964; Thune, 1951).
Specific transfer is related to Thorndike’s idea of identical elements. Some mathematics textbooks provide extensive practice at solving problems that are very similar. Once the students have learned how to solve the example they should, in principle, have little difficulty in solving the remaining problems. However, it should be noted that transfer really refers to transferring what one has learned on one task to a task that is different in some way from the earlier one. If two problems are very similar or identical, then what one is likely to find is improvement rather than transfer.
POSITIVE AND NEGATIVE TRANSFER
We are constantly using old knowledge in new situations. When a solver can successfully use a solution procedure used in the past to solve a target problem this is known as positive transfer.
If you have already learned to drive a Vauxhall, then the length of time it will take you to learn to drive a Ford will be considerably reduced. This is because it is easy to transfer what has been learned from one situation
(learning to drive a Vauxhall) to the new situation (learning to drive a Ford), as the two tasks are similar. At the same time there are likely to be differences between the two cars that necessitate learning something new such as the layout of the dashboard. However, it is also the case that a procedure learned in the past can impede one’s learning of a new procedure. This is known as negative transfer. In this case what you have learned prevents you from solving a new problem or at least prevents you from seeing an optimal solution. Imagine, for example, that you have learned to drive a car where the indicator lever is on the right of the steering column and the windscreen washer is on the left. After a few years you buy a new car where the indicator lever is on the left and the windscreen washer is on the right of the steering column. In this case learning to use the levers in your old car will make it harder to learn to use the levers in your new one. You try to apply what you have learned in the wrong circumstances.
ACTIVITY 4.2 There were some examples of negative transfer in the previous chapter. Can you think of what they might be?
It should be borne in mind that negative transfer is usually very hard to find. It is often confused with partial positive transfer. Suppose you learn to do word processing with an application that uses certain key combinations to do certain tasks until you are quite proficient at it. You are then presented with a totally new word processing package that uses different sets of key combinations. Just as one might get confused in a car whose indicators and windscreen wiper controls are reversed, so you might find it irritatingly hard to learn the new key combinations. You might readily feel that the old knowledge is interfering with the new. However, if you were to compare the time it took you to learn the original word processor for the first time with the time taken to learn the new one you would probably find that you learned the new one faster than the old one despite the initial confusion (Singley & Anderson, 1985; VanLehn, 1989). Whether this is true of the windscreen wiper/indicator scenario is a moot point.
SET AND EINSTELLUNG AS EXAMPLES OF NEGATIVE TRANSFER
Some of the difficulties in problem solving identified by the Gestalt psychologists can be classified as negative transfer. The most obvious example is Einstellung. In Luchins and Luchins’ Water Jar experiments subjects learned a rule for generating a solution that prevented them from seeing a simpler solution to a later example of the same type of problem. Similarly, functional fixedness prevents you from seeing a solution to a problem. Learning that a tool such as pliers is used for grasping things tightly may prevent you from seeing that the tool can be used as a pendulum weight. What you have learned in one context prevents you from solving a problem in a different context using the same tool.
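The Einstellung effect in water-jar tasks can be illustrated with a short Python sketch. The jar capacities and targets below are illustrative values of the kind used in such tasks and are not taken from this chapter; the rule names and function names are likewise assumptions. The sketch shows a practised rule that keeps working on a later problem for which a much simpler rule would also do.

```python
def practised_rule(a, b, c):
    """The rule built up over the early problems: fill B, pour off A once and C twice."""
    return b - a - 2 * c


def simpler_rules(a, b, c):
    """Shorter alternatives that a solver under the influence of set may overlook."""
    return {"A - C": a - c, "A + C": a + c}


# (capacity of A, capacity of B, capacity of C, target amount): illustrative values
training_problems = [(21, 127, 3, 100), (14, 163, 25, 99), (18, 43, 10, 5)]
later_problem = (23, 49, 3, 20)

for a, b, c, target in training_problems:
    assert practised_rule(a, b, c) == target  # the practised rule solves each one

a, b, c, target = later_problem
still_works = practised_rule(a, b, c) == target
shortcuts = [name for name, amount in simpler_rules(a, b, c).items() if amount == target]
print(still_works, shortcuts)  # True ['A - C']
```

Because the practised rule still succeeds on the later problem, nothing prompts the solver to look for the two-jar shortcut.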
HYPOTHESIS TESTING THEORY
To try to make sense of why human problem solving is affected by negative and positive transfer, John Sweller (1980; Sweller & Gee, 1978) conducted a series of neat little experiments to show how the same set of training examples could produce both positive and negative transfer. According to Sweller (1980) positive transfer accounts for the sequence effect (Hull, 1920). The sequence effect means simply that if there is a series of similar problems graded in complexity, then it is easier to solve them in an easy-to-complex sequence than in a complex-to-easy sequence. A solver’s experience of solving the early examples speeds up the solution to the later ones. Furthermore, a modification of Levine’s (1975) hypothesis testing theory can account for this type of positive transfer as well as for the kind of negative transfer produced by Einstellung (Luchins, 1942). I should point out that the following discussion uses “hypotheses” and “rules” almost interchangeably. This is because they can both be framed in the same way. “If you do X then Y will happen” can be both a rule and a hypothesis. The hypothesis testing theory involves three assumptions:
1. “If people perceive a series of problems as being related, they tend to begin each problem by testing hypotheses as closely related as practicable to their previously correct hypotheses…the corollary is that to the extent that people do not perceive a series of problems as being related, their choice of hypotheses on subsequent problems will not be influenced by previous problems” (Sweller, 1980, p. 234). So if you have a rule (or closely related rules) that worked on similar problems before, then use it. The more these or similar rules have worked in the past, the more likely they will work again. On the other hand, if you don’t see any similarity between the problem you are working on and one you did in the past, then you are hardly likely to use the earlier one as a basis for solving the current one for the simple reason that it does not occur to you to do so.
2. A hypothesis that is fairly simple and has few similar related hypotheses will be more salient than a complex hypothesis or one that has a number of similar related hypotheses (see Activity 4.3). In other words a simple rule will appear more obvious than a complex one.
3. Assumption number 1 overrides 2 if there is a conflict between them.
For example, you are more likely to retrieve and use a rule or hypothesis that has been successful with similar problems in the past, no matter how complex, than to retrieve a rule that is simple rather than complex.
ACTIVITY 4.3 A. A simple hypothesis: Complete the following series:
B. A complex hypothesis: Complete the following series:
(Answers on pages 235–236) When we learn something new we generally start off with simple examples. With learning and practice the examples can get harder and harder until eventually we can solve complex problems in that domain. When we have learned to solve problems of a certain type, features of a new problem will trigger the relevant procedure. As we saw in the Water Jars problem, features of the problem can trigger a complex procedure when a much simpler one exists. A simple problem whose features resemble a complex one can therefore trigger the complex learned procedure. Sweller’s experiment into transfer effects is shown in Study Box 4.1. The results show that you can solve a difficult problem if you have had experience with similar problems involving related rules. With experience of related rules people can make successful hypotheses about a new problem that is similar to
earlier ones, on condition that the sequence of problems starts off with one requiring a relatively simple hypothesis and becomes increasingly complex. The downside to this effect is that a simple problem can be made difficult for the same reason. The same rule-learning ability that makes an otherwise insoluble problem soluble produces
STUDY BOX 4.1 Sweller’s (1980) study of transfer effects Rationale The assumptions of the hypothesis testing theory predict that there will be different transfer effects (both positive and negative) despite very similar learning conditions in problems where subjects have to induce a rule. The same “training set” of problems can therefore turn a simple problem into a hard one (due to the effects of negative transfer) and a hard problem into a simple one (due to the effects of positive transfer and the sequence effect). Method Subjects were given a two-digit number such as 49 and told to give a number in response. They were then given the correct answer. This was repeated until the subjects got the answer correct 6 out of 7 times, or until they had 40 trials. There were two types of possible response. A simple Add response where the correct answer was simply the sum of the two digits. So, for example, the two digit number 49 should elicit the answer 13 (by adding 4 and 9). The second type of response was based on a set of increasingly complex rules. The first problem was very simple. No matter what two-digit number was given to the participant the answer was always 10. In the second problem two responses alternated. For the first two-digit number the answer was 12, for the second two-digit number it was 7, for the third 12 again, and so on. The third problem required the subject to pick out a sequence of three numbers, and so on. Results None of the subjects who had learned the increasingly complex procedure were able in 10 attempts to solve the problem requiring them simply to add the numbers. All bar one (12 out of 13) subjects in the control group who had had no training managed to solve the simple Add problem within 10 attempts. No one managed the complex problem without practice.
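The response rules described in Study Box 4.1 can be made concrete with a small sketch. The function names are illustrative only and are not taken from Sweller (1980); the answers used (the digit sum for the Add problem, a constant 10, and an alternation of 12 and 7) are the ones given in the box.

```python
def add_rule(stimulus: int) -> int:
    """Simple Add rule: the answer is the sum of the two digits."""
    tens, units = divmod(stimulus, 10)
    return tens + units


def constant_rule(trial: int) -> int:
    """First training problem: the answer is always 10, whatever the stimulus."""
    return 10


def alternating_rule(trial: int) -> int:
    """Second training problem: answers alternate 12, 7, 12, ... across trials."""
    return 12 if trial % 2 == 0 else 7


print(add_rule(49))                             # 13
print([alternating_rule(t) for t in range(4)])  # [12, 7, 12, 7]
```

Written this way, the training rules depend only on the position of a trial in the sequence, whereas the Add rule depends only on the stimulus. One plausible reading of the result is that subjects trained on sequence rules kept testing sequence hypotheses and so never examined the digits themselves.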
Einstellung when a new problem is presented that requires a different although simpler hypothesis. As Sweller puts it, “Identical variations in preliminary problems either can transform a normally simple problem into an insoluble one or alternatively an insoluble problem into a simple one with the only alteration being in the degree of similarity between the hypotheses required for the initial problems and the critical problem” (1980, p. 237). Sweller’s studies are important for the light they cast on human thinking. Sweller regards the process by which we induce rules from experience as a natural adaptation. As a result positive and negative transfer are two sides to the same coin. Einstellung is not an example of human rigid or irrational thinking, but a consequence of the otherwise rather powerful thinking processes we have evolved to induce rules that in turn make our world predictable. TRANSFER IN WELL-DEFINED PROBLEMS Using the methods described in Chapter 2 to analyse problem structures, we can tell if two problems have the same structure or not. If there are differences in people’s ability to solve two problems that share the same structure then the difficulty must lie in some aspect of the harder problem other than the structure
itself. As stated in Chapter 3, problems that have an identical structure and have the same restrictions are known as isomorphs. The Monster problems in Table 3.1 described two such isomorphs. Have a look now at Activity 4.4. Activity 4.4 illustrates one further variant of the Tower of Hanoi problem. The only difference between the Tower of Hanoi problem and the three isomorphs you have seen is in their cover stories. Because the problems look so different on the surface you may not have realised that the underlying structure was identical with the Tower of Hanoi problem. All four problems therefore differ in terms of their surface features but are similar in their underlying structural features. As pointed out in Chapter 3, despite their underlying similarity, it is the surface features that most influence people. The Himalayan Tea Ceremony is an example of a “Transfer
ACTIVITY 4.4 The Himalayan Tea Ceremony In the inns of certain Himalayan villages is practised a most civilised and refined tea ceremony. The ceremony involves a host and exactly two guests. When his guests have arrived and have seated themselves at his table, the host performs three services for them. These services are listed below in the order of the nobility that the Himalayans attribute to them: stoking the fire, fanning the flames, passing the rice cakes.
During the ceremony, any of those present may ask another, “Honoured Sir, may I perform this onerous task for you?” However, a person may request of another only the least noble of the tasks that the other is performing. Furthermore, if a person is performing any tasks, then he may not request a task that is nobler than the noblest task he is already performing. Custom requires that by the time the tea ceremony is over, all of the tasks will have been transferred from the host to the most senior of his guests. How may this be accomplished?
(Adapted from Simon & Hayes, 1976)
problem” where tasks are transferred from one participant to another, like the first Monster problem in Table 3.1. Simon and Hayes (1976) used 13 isomorphs of the Tower of Hanoi problem, all of which were variations of the Monster problem with two forms: Change and Transfer. They found that subjects were strongly influenced by the way the problem instructions were written. None of their subjects tried to map the Monster problem onto the Tower of Hanoi to make it easier to solve, and “only two or three even thought of trying or noticed the analogy” (1976, p. 166).
ACTIVITY 4.5 The Jealous Husbands problem Three jealous husbands and their wives having to cross a river at a ferry find a boat, but the boat is so small that it can contain no more than two persons. Find the simplest schedule of crossings that will permit all six people to cross the river so that none of the women shall be left in company with any of the men, unless her
husband is present. It is assumed that all passengers on the boat disembark before the next trip, and at least one person has to be in the boat for each crossing.
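The constraint in Activity 4.5 can be stated precisely enough to search the problem space mechanically. The sketch below is one possible encoding, not the representation used by Reed et al.: each person is a (role, couple) pair, the safety rule is applied to both banks and to the boat after every crossing, and a breadth-first search returns a shortest schedule. All names are hypothetical.

```python
from collections import deque
from itertools import combinations

COUPLES = 3
PEOPLE = frozenset((role, i) for role in "HW" for i in range(COUPLES))


def is_safe(group):
    """Safe if no wife is with any man while her own husband is absent."""
    husbands = {i for role, i in group if role == "H"}
    return all(i in husbands or not husbands
               for role, i in group if role == "W")


def shortest_schedule():
    """Breadth-first search over states (people on start bank, boat on start bank)."""
    start = (PEOPLE, True)
    goal = (frozenset(), False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        left, boat_left = state
        bank = left if boat_left else PEOPLE - left
        for size in (1, 2):  # the boat carries one or two people
            for crew in combinations(sorted(bank), size):
                crew = frozenset(crew)
                new_left = left - crew if boat_left else left | crew
                nxt = (new_left, not boat_left)
                if (nxt not in parent and is_safe(crew)
                        and is_safe(new_left) and is_safe(PEOPLE - new_left)):
                    parent[nxt] = state
                    queue.append(nxt)
    return None


path = shortest_schedule()
print(len(path) - 1)  # number of crossings: 11
```

Most of the work is done by `is_safe`, which reflects the point made about this problem: the hard part for a solver is checking the legality of a move rather than navigating the structure of the space.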
The problem in Activity 4.5 should, however, be recognisable. It was used by Reed et al. (1974) in their study of transfer between similar problems. The Jealous Husbands and Hobbits and Orcs problems both have an identical structure (see Figure 2.7). However, notice that there is one further restriction in the Jealous Husbands problem. A woman cannot be left in the company of other men unless her husband is present. “Moving two missionaries corresponds to moving any of three possible pairs of husbands since all husbands are not equivalent. But only one of the three possible moves may be legal, so there is a greater constraint on moves in the Jealous Husbands problem” (Reed et al., 1974, p. 438). This added constraint means that the Missionaries and Cannibals and the Jealous Husbands problems are not exactly isomorphic but are homomorphic. That is, the problems are similar but not identical. Reed, Ernst, and Banerji’s experiments are outlined in Study Box 4.2. Reed et al. found a number of things. First, the extra constraint imposed by the Jealous Husbands problem made it harder to check whether a move was legal. The effect of this was to increase the length of time subjects took to make a move and the number of illegal moves made. Second, transfer was asymmetrical; that is, there was transfer only from Jealous Husbands to Missionaries and Cannibals but not the other way round. Third, there was no transfer between the two different problems unless the subjects were explicitly told to use the earlier problem (e.g., “whenever you moved a husband previously, you should now move a missionary, etc.”). Fourth, subjects claimed to make little use of the earlier problem even when a hint was given to use it. No one claimed to remember the correct sequence of moves from the earlier problem. Instead subjects must have remembered the earlier
STUDY BOX 4.2 Reed, Ernst, and Banerji (1974) Rationale The aim of Reed and co-workers’ experiments was “to explore the role of analogy in problem solving”. They examined under what circumstances there would be an improvement on a second task. Method In Experiment 1 subjects were asked to solve both the Missionaries and Cannibals (MC) problem and the Jealous Husbands (JH) problem. Approximately half did the MC problem before doing the JH problem and approximately half did them in the reverse order. No transfer was found between the two problems. The next two experiments were therefore designed to find out how transfer could be produced. Experiment 2 looked for signs of improvement between repetitions of the same problem. Half the subjects did the MC problem twice and half did the JH problem twice. Experiment 3 tested whether being told the relationship between the problems would produce transfer. The dependent variables were: the time taken, the number of moves, and the number of illegal moves made by the subjects. Results and discussion In all cases there was no significant difference in the number of moves made, but in some conditions there was a reduction in time to solve the problem and a reduction in the number of illegal moves in the second presentation of a problem. As one might expect there was some improvement on the same problem presented twice. However, the only evidence for any transfer was from the Jealous Husbands to the Missionaries and Cannibals problem, and then only when there was a hint that the two problems were the same. Some transfer of learning took place from the harder to the simpler problem only.
problem at a “more global level”. That is, they remembered some general strategies (“balance missionaries and cannibals, move all missionaries first”, etc.). One other point emerges from Reed and co-workers’ experiments. As with Simon and Hayes’ (1976) Tower of Hanoi isomorphs, it was not the problem structure that was the source of the difficulty in problem solving. The main difficulty was knowing whether a particular move was legal or not. This aspect was examined further by Luger and Bauer (1978). They found no transfer asymmetry between the Tower of Hanoi (TOH) and Himalayan Tea Ceremony (HTC) problems—there was a transfer of learning no matter which of the two problems was presented first. One reason they give for the difference between their results and those of Reed et al. is that the TOH has “an interesting substructure of nested isomorphic subproblems” (p. 130) (see Figure 4.1). What this means is that learning to solve one of the two variants involves decomposing the problem into sub-problems, the solution of which can be used in the other isomorphic problem. The Missionaries and Cannibals variants lack this kind of problem substructure. We will look first at studies that have examined the extent to which domain-specific knowledge can be transferred. By representing procedures as IF-THEN rules (known as production rules) some researchers claim to be able to predict the extent to which the rules that apply in one area can transfer to another. We will then look at the kinds of domain-general knowledge that can be transferred. This generally refers to strategic knowledge; metacognitive skills such as time, mood, and task management; and learning to learn. SPECIFIC TRANSFER The persistent failure to find transfer effects in many studies of learning unless some kind of hint is given seems to fly in the face of our everyday experience. 
We are constantly faced with new situations and we constantly use our past experience to deal with these new situations, mostly successfully. We must therefore have learned something from the past and we must be able to use what we learned to cope in the present. To examine what aspects of past experience we can readily apply to new situations we need a way to characterise or represent that past experience. As mentioned at the beginning of the chapter, one way to think of the similarity between one problem or bit of knowledge is in terms of their shared “elements”. When Thorndike referred to identical elements he had in mind a set of stimulus-response associations. Some aspect of the environment would lead to a specific behaviour learned in one context; when the same (perceptually similar) aspect of the environment was encountered later the same behaviour would be elicited even if aspects of the task were slightly different. However, the contexts would have to be awfully similar for the behaviour to be elicited in both. As a result we should expect very little transfer of learning from one situation to another unless the two situations were very similar. A more recent version of this theory of transfer has been developed by Singley and Anderson (Anderson & Singley, 1993; Singley & Anderson, 1985, 1989). This modified version of the identical elements theory of transfer is known as the identical productions theory of transfer. The emphasis here is on the transfer of skill where a skill can be broken down into individual bits or productions. Before looking at the theory in more detail, the next section gives some background on types of knowledge. Declarative and procedural knowledge Assume for the moment that you are learning to drive a car for the first time (or think back to a time when you did not know how to drive). In order to know what to do inside a car you either have to be told or you have to read the instructions for driving this particular car. 
You might know that you have to change from second to third gear but you might not know how to do it. Knowing that is called declarative knowledge.
Figure 4.1. The 4-ring Tower of Hanoi state space showing the substructure of 1-, 2-, and 3-ring sub-spaces. Smaller sub-spaces embedded within larger sub-spaces illustrate the recursive nature of the problem. (Reprinted from Acta Psychologica, 1978, Luger and Bauer. “Transfer effects in isomorphic problem situations,” pp. 121–131. © 1978, reprinted with permission from Elsevier Science.)
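The nested substructure shown in Figure 4.1 corresponds to the familiar recursive solution to the Tower of Hanoi, in which an n-ring problem contains two (n-1)-ring sub-problems. The sketch below is a standard textbook recursion, offered only as an illustration of that nesting; the peg labels A, B, and C are arbitrary, and it is not taken from Luger and Bauer (1978).

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (ring, from_peg, to_peg) moves for an n-ring tower."""
    if n == 0:
        return []
    return (hanoi(n - 1, source, spare, target)    # solve an (n-1)-ring sub-problem
            + [(n, source, target)]                # move the largest ring once
            + hanoi(n - 1, spare, target, source)) # solve the same sub-problem again


moves = hanoi(4)
print(len(moves))  # 15, i.e. 2**4 - 1
```

Because the 3-ring solution appears twice inside the 4-ring solution, anything learned about the smaller problem is directly reusable in the larger one.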
Somehow or other you have to convert that declarative knowledge into a set of actions that will perform a gear change—a procedure for changing gear. Knowledge of how to do something is called procedural knowledge (see Information Box 4.1). Notice that procedural knowledge is not so much either right or wrong as more or less useful (as in the last rule given in Information Box 4.1). The difference between declarative and procedural knowledge can be seen in the example of driving a car. You can memorise the instructions for driving a car so that you are
INFORMATION BOX 4.1 Declarative and procedural knowledge Declarative knowledge (“knowing that”) is our knowledge of facts, and includes general knowledge about the world as well as episodic and autobiographical knowledge. Examples include:
• Rome is the capital of Italy
• We are coming out of recession
• e = mc²
• Instructions for driving a car or a bike
• I am going to have chicken for dinner
Generally declarative knowledge tends to be either true or false. Procedural knowledge (“knowing how”) is our knowledge of how to do something. This knowledge underlies our skilled performance and is often not accessible to consciousness. That is we can’t get back the declarative knowledge from which our procedural knowledge was originally built. Declarative knowledge can be turned into procedural knowledge by stating it as a rule for action; compare the declarative version e = mc² above with the procedural version below. Examples include:
• How to drive a car or ride a bike
• IF you want to find out the energy (e) in a system THEN multiply its mass (m) by the square of the speed of light (c²)
• IF you want to know if PERSON is a man or a woman THEN check to see if PERSON has long hair
word perfect, but this won’t make you a skilled driver. In fact, if you have never driven before then you’d be lucky to pull away from the kerb without stalling. Declarative knowledge can be of three types (Ohlsson & Rees, 1991):
1. Factual knowledge such as the instructions for driving a car or any other form of general knowledge. Factual knowledge consists of assertions about specific objects or events such as “The gear lever is currently in second gear”.
2. Episodic knowledge, in fact a subset of (1), which is knowledge of personal experiences. This kind of knowledge consists of assertions about a specific spatio-temporal context: “The last time I tried to change gear the car stalled”.
3. Abstract knowledge or general principles. This kind of knowledge can be applied to an infinite number of cases. For example, the arithmetic principle “A set of numbers always yields the same sum, regardless of the order in which they are added” applies to any set of numbers you like: 3+7+2956+675 will give the same result as 7+2956+3+675.
Procedural knowledge specifies the actions that are to be taken under certain conditions. It often refers to the ability to perform skilled actions. The learner driver may have a great deal of declarative knowledge about the sequences of actions involved when changing gear, but this does not mean that he or she will be able to execute that sequence smoothly when called upon to drive a car for the first time. Procedural knowledge comes about through practice at performing the actions of driving. In many theories of problem solving and skill acquisition declarative knowledge is a necessary precursor to procedural knowledge (Anderson, 1983; Byrnes, 1992). Another distinction that can be made between declarative and procedural knowledge is in terms of the use that can be made of them. The general semantic and episodic forms of declarative knowledge are goal-independent.
They can be used in a variety of circumstances and can serve a variety of goals—there is no
“commitment” as to how they should be used. Furthermore, declarative knowledge can be true or false: “We are coming out of the recession”, “Italy is the capital of the Ukraine”, “The button on top of the gear lever operates the ejector seat”. Procedural knowledge, on the other hand, can only be understood in terms of the goals one is trying to achieve. It involves the knowledge of what to do given a certain set of circumstances. Rather than being strictly true or false, procedural knowledge is more or less effective in achieving one’s goals:
IF the goal is to get to the city centre by 5.00 p.m.
THEN look up the bus timetable
AND walk to the bus stop
AND get on the bus
AND buy a ticket
etc.
IF the goal is to change from second into third gear
AND the car is currently in second gear
THEN put your foot down on the clutch pedal
AND push the gear lever into the neutral position
AND move the gear lever to the right
AND push the gear lever forward into the third position
AND take your foot slowly off the clutch pedal
Note that in these examples, the procedures are ones that have already been learned. In the second example the declarative knowledge has been incorporated into a procedure and a number of individual procedures have been chained together. How declarative knowledge is turned into procedural knowledge in this way is dealt with in Chapter 8 on learning. Not everyone accepts that there is a rigid distinction between declarative and procedural knowledge (Laird et al., 1987; Newell, 1990; Silver, 1986). Silver (1986), for example, points out that procedural fluency (driving a car) does not have to rest on conceptual fluency (knowledge of internal combustion). He argues that conceptual knowledge is neither necessary nor sufficient for procedural knowledge, but rather that the two are interrelated.
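The gear-change rule above can be written as a small data structure to show what it means for procedural knowledge to be tied to a goal. This is a minimal sketch under simplifying assumptions: the class and function names are invented for illustration, and matching is by exact string comparison, which is far cruder than any real production-system architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    """An IF-THEN rule: it applies when the goal and all conditions hold."""
    goal: str
    conditions: frozenset
    actions: tuple


CHANGE_UP = Production(
    goal="change from second into third gear",
    conditions=frozenset({"car is in second gear"}),
    actions=(
        "put foot down on clutch pedal",
        "push gear lever into neutral",
        "move gear lever to the right",
        "push gear lever forward into third",
        "take foot slowly off clutch pedal",
    ),
)


def fire(rule, goal, working_memory):
    """Return the rule's actions if it matches the goal and current facts, else ()."""
    if rule.goal == goal and rule.conditions <= working_memory:
        return rule.actions
    return ()


facts = {"car is in second gear", "engine is running"}
print(fire(CHANGE_UP, "change from second into third gear", facts))
print(fire(CHANGE_UP, "get to the city centre by 5.00 p.m.", facts))  # ()
```

The fact "car is in second gear" sits in working memory without any commitment to a use; it only produces action once a rule with a matching goal picks it up.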
Nevertheless, conceptual knowledge can be equated with abstract knowledge, and this type of declarative knowledge may not be the relevant type when considering procedural learning. Procedural learning may still need a solid base of factual and episodic knowledge. Declarative knowledge of the type “Put your hand on the gear lever and push it forward” would be more appropriate. Identical productions theory of transfer If we look at learning to drive a car or learning to solve a problem as the acquisition of procedures, then we can begin to look at the question of how knowledge can transfer from one situation to another. Going back again to the infamous car you have been learning to drive, suppose now that it is a left-hand-drive car. Once
Figure 4.2. Four cases where two tasks (A and B in each case) share a number of productions.
you have learned to drive this car it should be relatively easy for you to go off and drive another left-hand-drive car. You might find it a bit more difficult to drive a truck, however, or a right-hand-drive car. In each case there are going to be similarities and differences between the new vehicle and the one you learned to drive. The similarities should make it easier to learn to drive the new vehicle. In other words there will be positive transfer from the old context to the new one. Where there are differences between the vehicles in terms of the procedures for driving them you will have to start from scratch to learn some new procedures. Finally, as pointed out near the beginning of the chapter there may even be differences (reversed position of windscreen wiper and indicator levers) that will produce a degree of negative transfer to the new vehicle. If we can characterise the similarities and differences in terms of specific procedures, can we predict how much will be transferred from one situation to another? Indeed, we can, according to Singley and Anderson (Anderson & Singley, 1993; Singley & Anderson, 1985, 1989). They argued that the theory of identical elements was a useful way of explaining various aspects of transfer, although not quite in the stimulus-response terms that Thorndike proposed. Rather, if one looks at transfer as the overlapping of shared productions (see VanLehn, 1989) then one can make predictions about the effects of learning one task on how well one learns another (Figure 4.2). Singley and Anderson (1985, 1989) performed a number of experiments looking at text-editing skill. Their subjects learned various screen editors and Singley and Anderson were interested in seeing the effect learning one text editor would have on learning a different one, or the same one with certain changes made to the commands that could be performed. The text-editing tasks were also modelled as a production system.
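The idea of transfer as overlapping productions can be illustrated by treating each task as a set of rule names and measuring how much of a new task is already covered. This is a deliberately crude sketch, not Singley and Anderson's model: the production names are invented, and the simple proportion used here is an assumption made for illustration only.

```python
def shared_fraction(learned: set, new_task: set) -> float:
    """Fraction of the new task's productions that are already known."""
    if not new_task:
        return 0.0
    return len(learned & new_task) / len(new_task)


# Hypothetical production names, invented purely for illustration.
line_editor_a = {"locate-line", "type-text", "delete-line-by-number", "print-line"}
line_editor_b = {"locate-line", "type-text", "delete-line-by-number", "show-line"}
screen_editor = {"type-text", "move-cursor", "delete-char", "delete-word"}

print(shared_fraction(line_editor_a, line_editor_b))  # 0.75
print(shared_fraction(line_editor_a, screen_editor))  # 0.25
```

Note that the measure can never fall below zero, so on this view learning one task can at worst contribute nothing to the next.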
Singley and Anderson found that the number of productions one text editor shared with another could predict the time saved by subjects learning the second editor. For example, two “line editors” ED and EDT shared a great many productions and there was a consequent high level of transfer from one to the other. However, they shared relatively few productions with a third editor, EMACS, and there was relatively little transfer between the line editors and EMACS. The structure of the experiments is shown in Table 4.1.
TABLE 4.1 Conditions in the text-editing experiments of Singley and Anderson (Adapted from Anderson & Singley, 1993, p. 189)
                     Days 1 & 2   Days 3 & 4       Days 5 & 6
Experiment 1
  One editor         EMACS        EMACS            EMACS
  Two editors        ED           ED               EMACS
                     EDT          EDT              EMACS
  Three editors      ED           EDT              EMACS
                     EDT          ED               EMACS
  Control            type         type             EMACS
Experiment 2
  Negative transfer  EMACS        Perverse EMACS   EMACS
On day 3 the three-editor group switched from one editor to another. Nevertheless there was no difference in times to perform a number of edits compared to the two-editor groups. In fact the scores for the three-editor group and the two-editor groups remained the same throughout the experiment. There was therefore almost total transfer from one line editor to another. On day 5 all groups switched to EMACS. There was a significant difference between the line-editor groups and the EMACS group, with the latter performing better. However, the line-editor groups were still faster than the EMACS group had been on day 1, so there had been a small amount of positive transfer. One of the things to notice from these results is that there seems to be nowhere for negative transfer to fit. Either some productions are shared between task A and task B, in which case there will be at least some speed-up in learning task B, or none are, in which case there will be no speed-up. There appears to be no mechanism for negative transfer to take place. Singley and Anderson tested this prediction (Experiment 2 in Table 4.1). They created a version of EMACS where the key assignments were changed round. For example, in EMACS the key combinations Ctrl-D erased a letter, Esc-D erased a word, and Ctrl-N moved down one line. In “Perverse EMACS” Ctrl-D moved down one line and Esc-R erased a letter. There were two groups of subjects: one learned pure EMACS for six days, whereas the other learned EMACS for the first two days, Perverse EMACS on days 3 and 4, and EMACS again on days 5 and 6. Although the performance of the Perverse EMACS group was worse on Day 3, when they switched to Perverse EMACS, than on Day 2, it was still better than on Day 1. In other words what looked like negative transfer was in fact positive, just as the identical elements theory of transfer would predict.
Anderson argues that his ACT-R production system (or indeed most other production-system models) can represent the use of knowledge in a more flexible way than Thorndike’s fairly strict theory of identical elements (Singley & Anderson, 1989, p. 112): The analysis of transfer in the text-editing experiment has implications for the identical productions model of transfer. The very high level of positive transfer observed between text editors that shared few commands reinforces the position that superficial identical elements models of the type that Thorndike advocated are inadequate. In other words it is not useful to look at transfer in terms of superficial identical elements but rather in terms of identical productions. A similar study of transfer effects was carried out by Kieras and Bovair (1986). They examined how their participants converted declarative knowledge, in the form of instructions for carrying out a sequence of
actions, into procedural knowledge. They then looked at how well the participants were able to learn new procedures. Their study is covered in Study Box 4.3 and Information Box 4.2. If a procedure can be broken down and described in detail, that is in terms of specific production rules, then any effort required to learn a new related procedure depends on how different a new production is from previously learned ones. As was mentioned earlier, declarative knowledge is not “committed”. Once embedded in a production it is committed because it is used to achieve a specific goal. If the use of the knowledge changes then there is likely to be very little transfer even if the use of knowledge in the two situations relies on the same basic facts (declarative knowledge) (Pennington & Rehder, 1996). Given the identical productions view of transfer, how can we account for the effects of Einstellung described, for example, by Luchins and Luchins and Sweller (see Study Box 3.2 and Study Box 4.1)? One way of explaining aspects of negative transfer is by assuming that the features of a problem remind one of a procedure that worked in the past. Set effects are caused by accessing irrelevant procedures. They persist either because solvers receive no feedback about what they should do to get a correct answer, or because solvers do get a correct answer, albeit in an unnecessarily laborious way. Singley and Anderson did not find negative transfer because their subjects were not kept in the dark about what they should do.
STUDY BOX 4.3 Kieras and Bovair (1986) Background Kieras & Bovair were interested in how the model of cognitive skill acquisition proposed by Anderson could be examined. In particular they wanted to find out (a) how declarative knowledge as represented by the instructions for carrying out various procedures were converted into production rules; and (b) if a representation of procedural knowledge in terms of production rules allows one to predict the amount of transfer from one task to a related one, Method
4. TRANSFER OF LEARNING
Participants had to learn a number of procedures for controlling a device (shown in Figure 4.3). There were two “normal” procedures and eight “malfunction” procedures. Examples are taken from Kieras and Bovair (1986, pp. 508–509). Example of a normal procedure: If the command is to do the MA procedure, then do the following:
Step 1. Turn the SP switch to ON.
Step 2. Set the ES selector to MA.
Step 3. Press the FM button, and then release it.
Step 4. If the PF indicator flashes, then notice that the operation is successful.
Step 5. When the PF indicator stops flashing, set the ES selector to N.
Step 6. Turn the SP switch to OFF.
Step 7. If the procedure was successful, then type “S” for success.
Step 8. Procedure is finished.
Figure 4.3. Control panel device (adapted from Kieras & Bovair, 1986).
Study Box 4.3 (continued) Example of a malfunction procedure: If the command is to do the MA procedure, then do the following:
Step 1. Turn the SP switch to ON.
Step 2. Set the ES selector to MA.
Step 3. Press the FM button, and then release it.
Step 4. If the PF indicator does not flash, then notice that there is a malfunction.
Step 5. If the EB indicator is on, and the MA indicator is off, then notice that the malfunction might be compensated for.
Step 6. Set the ES selector to SA.
Step 7. Press the FS button, and then release it.
Step 8. If the PF indicator does not flash, then notice that the malfunction cannot be compensated for.
Step 9. Set the ES selector to N.
Step 10. Turn the SP switch to OFF.
Step 11. If the malfunction could not be compensated for, then type “N” for not compensated.
Step 12. Procedure is finished.

To take account of the possibility that there might be an order effect, there were three different training orders. Participants had to perform each procedure correctly three times in a row before moving on to the next procedure. A program in LISP was written to simulate transfer effects on the basis of the production rules involved. To get a flavour of what a production system model of a task is like see Information Box 4.1. Kieras and Bovair’s simulation program reported “the number of rules considered identical to existing rules, the number that could be generalized with existing rules, and the number of new rules added to the total” (1986, p. 513).

Results and discussion
There was a great deal of overlap between the observed performance of the participants and the effects predicted by the simulation program. Furthermore the length of time required to learn a new procedure depends on the number of new production rules that have to be learned. For one of the training orders the length of time to learn new procedures levels off, as later procedures require no new production rules or only one or two. (Other training orders reveal a different zig-zag pattern depending on the number of new productions that have to be learned.)
As the simulation program is based on a production system representation of the rules for carrying out the procedures, and given the degree of agreement between the observed and predicted values, Kieras and Bovair argue that “a production rule representation can provide a very precise characterisation of the relative difficulty of learning a set of related procedures” (p. 507) and that “the amount of transfer is predicted very well from the similarities between the production system representations for the procedures” (p. 507).
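The logic of predicting transfer from rule overlap can be illustrated with a short sketch. It is not Kieras and Bovair's LISP simulation: it assumes, purely for illustration, that each step of a procedure corresponds to one rule, that two rules count as identical only when their wording matches exactly, and it ignores the "generalised" category altogether. The function name `count_new_rules` and the step wordings are hypothetical simplifications of the two MA procedures shown in this Study Box.

```python
# Hypothetical sketch of an "identical rules" count; not the original simulation.
# Assumption: one rule per step, and rules match only if their text is identical.

PROCEDURES = {
    "MA-normal": [
        "turn SP switch ON",
        "set ES selector to MA",
        "press and release FM button",
        "if PF flashes, note success",
        "when PF stops flashing, set ES selector to N",
        "turn SP switch OFF",
        "if successful, type S",
        "finish",
    ],
    "MA-malfunction": [
        "turn SP switch ON",
        "set ES selector to MA",
        "press and release FM button",
        "if PF does not flash, note malfunction",
        "if EB on and MA off, note possible compensation",
        "set ES selector to SA",
        "press and release FS button",
        "if PF does not flash, note cannot compensate",
        "set ES selector to N",
        "turn SP switch OFF",
        "if not compensated, type N",
        "finish",
    ],
}


def count_new_rules(training_order, procedures):
    """Return (procedure, identical_rules, new_rules) for each procedure in order."""
    known = set()          # rules learned from earlier procedures
    report = []
    for name in training_order:
        rules = procedures[name]
        new = [rule for rule in rules if rule not in known]
        report.append((name, len(rules) - len(new), len(new)))
        known.update(rules)
    return report


for row in count_new_rules(["MA-normal", "MA-malfunction"], PROCEDURES):
    print(row)
```

Under these assumptions the malfunction procedure needs 7 new rules if it is learned after the normal procedure, but 12 if it is learned first. That difference is the kind of order effect the study set out to measure.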
INFORMATION BOX 4.2 Example of production rules from Kieras and Bovair (1986)

(MA-N-START
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (NOT (TEST-GOAL DO ??? STEP)))
  THEN ((ADD-GOAL DO SP-ON STEP)))

(MA-N-SP-ON
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO SP-ON STEP))
  THEN ((OPERATE-CONTROL *SP ON)
        (WAIT-FOR-DEVICE)
        (DELETE-GOAL DO SP-ON STEP)
        (ADD-GOAL DO ES-SELECT STEP)))

(MA-N-ES-SELECT
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO ES-SELECT STEP))
  THEN ((OPERATE-CONTROL *ESS MA)
        (WAIT-FOR-DEVICE)
        (DELETE-GOAL DO ES-SELECT STEP)
        (ADD-GOAL DO FM-PUSH STEP)))

(MA-N-FM-PUSH
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO FM-PUSH STEP))
  THEN ((OPERATE-CONTROL *FM PUSH)
        (WAIT-FOR-DEVICE)
        (OPERATE-CONTROL *FM RELEASED)
        (DELETE-GOAL DO FM-PUSH STEP)
        (ADD-GOAL DO PFI-CHECK STEP)))

(MA-N-PFI-CHECK
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO PFI-CHECK STEP)
          (LOOK *PFI FLASHING))
  THEN ((ADD-NOTE OPERATION SUCCESSFUL)
        (DELETE-GOAL DO PFI-CHECK STEP)
        (ADD-GOAL DO ES-N STEP)))

(MA-N-ES-N
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO ES-N STEP)
          (LOOK *PFI OFF))
  THEN ((OPERATE-CONTROL *ESS N)
        (WAIT-FOR-DEVICE)
        (DELETE-GOAL DO ES-N STEP)
        (ADD-GOAL DO SP-OFF STEP)))

(MA-N-SP-OFF
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO SP-OFF STEP))
  THEN ((OPERATE-CONTROL *SP OFF)
        (WAIT-FOR-DEVICE)
        (DELETE-GOAL DO SP-OFF STEP)
        (ADD-GOAL DO TAP STEP)))

(MA-N-TAP
  IF (AND (TEST-GOAL DO MA PROCEDURE)
          (TEST-GOAL DO TAP STEP)
          (TEST-NOTE OPERATION SUCCESSFUL))
  THEN ((DELETE-NOTE OPERATION SUCCESSFUL)
        (ADD-NOTE TYPE S-FOR SUCCESS)
        (DELETE-GOAL DO TAP STEP)
        (ADD-GOAL DO FINISH STEP)))

Name of rule: MA-N-START
Condition: IF the goal is to do the MA procedure AND you haven’t started on any step yet
Action: THEN add the goal to do the SP-ON step to WM

Name of rule: MA-N-SP-ON
Condition: IF the goal is to do the MA procedure AND to do the SP-ON step
Action: THEN turn the SP switch to ON, wait for the device to respond, delete the goal to do the SP-ON step, and add the goal to do the ES-SELECT step

Name of rule: MA-N-ES-SELECT
Condition: IF the goal is to do the MA procedure AND to do the ES-SELECT step
Action: THEN turn the ESS to MA, wait for the device to respond, delete the goal to do the ES-SELECT step, and add the goal to do the FM-PUSH step
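To see how rules of this kind drive behaviour, it may help to watch a few of them fire. The sketch below is a hypothetical Python re-expression of the first three rules in Information Box 4.2, not Kieras and Bovair's own interpreter. Working memory is assumed to be a set of goal tuples, the device a plain dictionary, and WAIT-FOR-DEVICE is omitted. The helper names (`step`, `make_step_rule`, `run`) are illustrative only.

```python
# Hypothetical mini production system; the rule names follow Information Box 4.2,
# everything else (data layout, helper names) is an assumption for illustration.

MA = ("GOAL", "MA", "PROCEDURE")


def step(name):
    """Build the working-memory element for 'the goal is to do <name> step'."""
    return ("GOAL", name, "STEP")


def make_step_rule(rule_name, this_step, control, setting, next_step):
    """Rule: IF doing MA and this step THEN operate control, swap step goals."""
    def condition(wm):
        return MA in wm and step(this_step) in wm

    def action(wm, device):
        device[control] = setting      # OPERATE-CONTROL
        wm.discard(step(this_step))    # DELETE-GOAL
        wm.add(step(next_step))        # ADD-GOAL

    return rule_name, condition, action


def start_condition(wm):
    # MA goal present and no step goal yet (the NOT ... ??? STEP test)
    return MA in wm and not any(goal[2] == "STEP" for goal in wm)


def start_action(wm, device):
    wm.add(step("SP-ON"))


RULES = [
    ("MA-N-START", start_condition, start_action),
    make_step_rule("MA-N-SP-ON", "SP-ON", "SP", "ON", "ES-SELECT"),
    make_step_rule("MA-N-ES-SELECT", "ES-SELECT", "ESS", "MA", "FM-PUSH"),
]


def run(wm, device, rules=RULES, max_cycles=20):
    """Recognise-act cycle: fire the first matching rule until none matches."""
    fired = []
    for _ in range(max_cycles):
        match = next((rule for rule in rules if rule[1](wm)), None)
        if match is None:
            break
        match[2](wm, device)
        fired.append(match[0])
    return fired


working_memory = {MA}
device_state = {}
print(run(working_memory, device_state))
print(device_state)
```

Starting with only the goal to do the MA procedure, the three rules fire in sequence and the system halts with the FM-PUSH goal in working memory, because no rule for that step has been supplied. Each rule is small and self-contained, which is what makes it possible to count the rules two procedures share.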
There is evidence that an identical elements view of transfer is not the whole story. Payne, Squibb, and Howes (1990) gave subjects a set of text-editing tasks and looked at transfer between two text editors. They found evidence for conceptual transfer. That is, subjects given one type of text editor (MacWrite) induced the concept of a string (strings of characters including blank spaces and punctuation) and were able to transfer this conceptual knowledge to a task involving another editor (IBM Personal Editor). This was despite the fact that the specific procedures for editing were different. “This aspect of transfer is not readily admitted in the common elements/productions model of transfer and is outside the scope of any model of user knowledge in which methods are the only encoding of expertise” (Payne et al., p. 442).

GENERAL TRANSFER

The discussion so far has looked at specific transfer from one problem to a similar problem of the same type where there is some overlap in the use of knowledge required. General transfer may include advice about how to choose operators, about looking at extreme conditions, changing representations, loosening constraints, focusing on different aspects of a problem, generating an analogy, and so on. Other domain-general skills such as essay writing, using information and communication technology, writing reports and the like, rely on learning sets of schemas. A schema is a knowledge structure composed of bits of semantic knowledge and the relations between them. Your knowledge about houses, for example, would include the fact that they have roofs, windows, walls, and a door. It’s hard to imagine a house without one of these features. These are the “fixed values” that would be incorporated into a house schema. You can also use your knowledge of houses to understand me if I tell you I have locked myself out. You can mentally represent my predicament because you know that houses have doors and doors have locks that need keys.
These are “default” assumptions you can make. So if information is missing from an account you can fill in the missing bits from your general knowledge of houses, that is, from your house schema. Now it could be that I have locked myself out because I have one of these fancy push-button locks and have forgotten the number, but it’s not what you would immediately assume. Schemas have been proposed for a number of domains including problem solving. With experience of houses, or cinema-going, or “distance=rate×time” problems, you learn to recognise and classify them, and
Figure 4.4. Three types of diagrammatic representations (adapted from Novick & Hmelo, 1994, p. 1299). Copyright © 1994 by the American Psychological Association. Adapted with permission.
to have certain expectations about, for example, what goes on in a cinema or what you are likely to have to do to solve a particular type of problem. Schemas are abstractions from our everyday experience and can be at different levels of abstraction. But what about transfer at a fairly abstract level? Are there aspects of some problems that can be transferred to other problems that are completely different?

Representational transfer

These questions have been explored by Laura Novick (Novick, 1990; Novick & Hmelo, 1994). Whereas analogical problem solving is normally thought of as looking back to an earlier problem to find a solution procedure that can be used in the current problem, representational transfer is where a useful way of representing a problem is transferred where there is no common solution procedure. Novick and Hmelo (1994) examined four types of representation: networks, hierarchies, matrices, and part-whole representations (see Figure 4.4). They found that having access to appropriate representational schemas allowed solvers to transfer their learning of these schemas to new problems whose underlying structures lent themselves to a particular symbolic representation. In Chapter 2, I mentioned that Wertheimer made a distinction between reproductive and productive thinking. Reproductive thinking involves “blindly” using previously learned knowledge to solve a problem or perform a task without necessarily taking into account the underlying structure of the problem. Productive thinking, on the other hand, needs an understanding of the deeper structure of a problem. Now it has been found that “good students” often engage in “self-explanations” (Chi et al., 1989; Chi & Bassok, 1989; Chi, de Leeuw, Chiu, & LaVancher, 1994) when learning new material, and that these explanations seem to allow those students to gain a more elaborate representation of the material they are learning and
hence a better understanding of a problem’s structure. Further problem solving and transfer are thereby enhanced. On the other hand, when students attempt to learn something difficult for the first time, especially when the explanations given are relatively poor, then any transfer is likely to be limited to declarative knowledge, as few procedures have been learned (Robertson, 2000; Robertson & Kahney, 1996). Robertson (2000) found that transfer in such circumstances tended to be limited to those bits of declarative information that could be readily mapped across from one problem to another because they looked the same. According to Pennington and Rehder (1996, p. 265) what the foregoing discussion suggests is that:

learning by rote results in transfer to highly similar problems (i.e., procedural transfer) but learning by “understanding” results in transfer to less similar or novel problems as well[…] In the same way that information-processing concepts have allowed identical elements theories of transfer to progress from the early formulations, further investigation of the details of declarative elaboration and its translation into effective procedures may assist in bringing “understanding” as well into a theory of transfer.

Implicit transfer

Recent studies have shown that transfer need not involve the explicit mapping of one problem onto a source analogue. Schunn and Dunbar (1996), for example, found that knowledge of a concept in one domain could be spontaneously transferred to another domain despite large differences in the surface features of the problems in the two domains. One group of subjects was given a computer simulation of a problem in biochemistry designed to illustrate the role of inhibition. A control group was given a similar task not involving inhibition. Next day both groups were given a task in the domain of molecular genetics. The concept of inhibition was also relevant to this task.
Schunn and Dunbar found that the experimental group who had been given the inhibition problem the previous day was more likely to find a solution than the control group. Concurrent verbal protocols and post-task questionnaires revealed that the subjects were unaware that they were using the relevant concept (inhibition) from the earlier domain. Schunn and Dunbar interpret these findings as evidence for conceptual priming. They point out, however, that only a single relation was involved rather than a system of relations. As analogical reasoning involves using a system of relations it is computationally expensive. Priming, on the other hand, “is computationally inexpensive and does not require a complex mental structure to occur” (p. 272).

WHAT KINDS OF KNOWLEDGE TRANSFER?

The discussion so far has shown that knowledge or skill can be transferred from one domain to another only under certain circumstances. Larkin (1989) has reframed the question of what transfers as: To what extent does domain-general knowledge transfer from one context to another, and to what extent can domain-specific knowledge transfer? The answer to the latter question would appear to depend on the overlap of the elements of a skill shared by two different tasks. As to the former, it depends on what kind of knowledge is being referred to. Domain-general knowledge is really knowledge about useful strategies such as means-ends analysis, hill climbing, and so on. Singley and Anderson (1985) have pointed out that such knowledge is well known to most adults anyway. Improving such skills (in some general problem-solving course, say) may be difficult and would not lead to much of an improvement when someone has to learn a new domain. Furthermore, Larkin points out that the domain-specific knowledge that someone needs to solve problems far outweighs the domain-general knowledge that a person needs.
For example, domains such as calculus or experimental design involve a lot of domain-specific problem-solving skills and relatively few domain-general skills. If you spend time improving your domain-general skills then you are only going to make a small improvement to a small part of the skills used in the domain. On the other hand, if you had spent your time learning the domain-specific skills then you would probably have improved your grade even more. A debate about whether (or even if) knowledge and skill can transfer from one domain or context to another has been going on in the pages of the journal Educational Researcher. Proponents of situated learning emphasise the degree to which learning is bound to a specific context, particularly a social context (Cobb & Bowers, 1999; Greeno, Moore, & Smith, 1993; Lave & Wenger, 1991). In one famous case Carraher, Carraher, and Schliemann (1985) showed that Brazilian street children were able to perform complex mathematical calculations in the street but not in a school context. Anderson, Reder, and Simon (1996) reject a strong version of this claim that all knowledge, both specific and general, does not transfer. Learning to read in a classroom does not mean you can’t read outside it. Learning arithmetic in school does not mean that you can’t make calculations in a supermarket. Furthermore, as we have seen in this chapter, “there can be a large amounts of transfer, a modest amount, no transfer at all, or even negative transfer” (Anderson et al., 1996, p. 8). Lave and Wenger (1991) regard learning as “legitimate peripheral participation”. By this is meant that the acquisition of knowledge and skill involves engaging in the socio-cultural practices of the community of which a learner is a part. Thus you cannot and should not extract the learning from the social context in which it is embedded. Again, Anderson et al.
beg to differ and see no problem in a tax accountant learning tax code separately from interacting with a client. Learning to use a calculator does not require a client, or anyone else for that matter, to be present. As you may realise, this is far from being an academic debate. The effectiveness of and the prerequisites for transfer are fundamental to how we educate our children.

SUMMARY

1. Transfer is fundamental to the educational process. We assume that what is learned in a classroom can transfer to situations outside the classroom. However, there is a lot of debate about the extent to which this can happen.

2. Transfer comes in various flavours:
• Positive transfer means that learning something in one context makes learning something new easier.
• Negative transfer means that learning something in one context makes learning something new harder.
• Both positive and negative transfer occur because of the same underlying learning processes (e.g., skill learning); it may be that a particular situation is inappropriate for the skill, leading to negative transfer.
• Specific transfer is the most frequent type of transfer and occurs when there is an overlap of the specific knowledge used in two situations.
• General transfer occurs when general strategies for problem solving or representational schemas are learned in one context and can be applied in a (superficially) completely different one.
• Implicit transfer occurs when there is conceptual priming, i.e., a piece of declarative (conceptual) knowledge is carried over unconsciously from one domain to another when there is little delay between presentation of the two situations.
3. People find it difficult to transfer what they have learned in one context to another context unless:
• a hint is provided to use the earlier problem;
• the earlier problem is harder than the current one;
• there is a clear goal-sub-goal structure that maps across from one problem to another;
• solvers can represent the problems in a “useful” way;
• the two problems share productions and the problems are seen as similar;
• two different domains share the same knowledge (e.g., a knowledge of mathematics gained while learning physics would make learning chemistry easier; or, at a more abstract level, two different domains may share the same forms of argument or rationale, e.g., scientific method in the case of chemistry and physics);
• one is aware of and can access general strategies for representing problem types.
4. The relative roles of domain-specific knowledge, domain-general knowledge, and the effects of context on learning are extremely important for educational practice.
CHAPTER FIVE Problem similarity
Love is like the measles: all the worse when it comes late in life
Douglas Jerrold

Love is like quicksilver in the hand. Leave the fingers open and it stays. Close it and it darts away
Attributed to Dorothy Parker

Love is like war: easy to begin but very hard to stop
H.L.Mencken

In these examples love is being likened to something else. In each case the similarity between love and that other thing is explained. At one level, of course, love is nothing like the thing it is being compared with. Love is not a dense silver metal that is liquid at room temperature; it does not normally bring you out in a rash; it does not normally involve tanks, artillery, and death. The similarity therefore lies somewhere else: at a “deeper” level. In his discussion of metaphors Black (1993, p. 30) has stated: “Every metaphor is the tip of a submerged model”. (I include similes such as the examples at the beginning of the chapter in my discussion of metaphor.) Clearly, if we are to understand what is going on when analogies are noticed and used, we need some way to describe what is meant by “similarity”. That is, we need some way to express precisely in what way two (or more) problems or ideas can be said to be similar and in what ways they can be said to differ. We will look first at the similarities between ideas or concepts, and then at the similarities between the relations linking concepts before looking at the similarities between problems that involve complex systems of relations.

TYPES OF SIMILARITY

Analogising involves taking a problem, concept, or situation that you already know and understand and applying it to a new problem, concept, or situation. The problem, concept, or situation that you already know is known as the source, and the problem, concept, or situation to which it is being applied is known as the target. “To propose an analogy or simply to understand one, requires taking a kind of mental leap.
Like a spark that jumps across a gap, an idea from the source analogue is carried over to the target. The two analogues may initially appear unrelated, but the act of making an analogy creates new connections between
them” (Holyoak & Thagard, 1995, p. 7). Table 5.1 gives some examples of a source being used to understand or as inspiration for a target.

TABLE 5.1 “Creative” uses of analogy

Source | (maps to) Target | Analogiser
Burrs sticking to dog’s fur | Velcro | Georges de Mestral
Sitting in a tram imagining a clock tower receding at the speed of light | Theory of relativity | Albert Einstein
Bones of human ear operated by thin and delicate membrane | Telephone | Alexander Graham Bell
Water pumps | The heart | William Harvey

Metaphors
“The Falklands thing was a fight between two bald men over a comb”
Bald men | Britain & Argentina | Jorge Luis Borges
Comb | Falklands |
“Currency speculation is the AIDS of the world economy”
AIDS | Currency speculation | Jacques Chirac

Other uses of analogy
Example problems in textbook | Exercise problem in textbook |
Flow of water | Flow of electricity |
Although few of the source and target items in Table 5.1 seem to have any kind of similarity on the surface, we can nevertheless see that there is an underlying similarity in each case. Water pumps, for example, are man-made mechanical devices usually made of metal. Water is drawn in through an inlet pipe and forced out through an outlet pipe by the action of a revolving metal (or possibly rubber) armature or by the action of a piston. The heart is not made of metal, has no pipes, and does not have an internal revolving armature of any kind. In fact the operating principles of water pumps and the heart are entirely different, and yet we have no problem with seeing the heart as a pump. This suggests that there are differences between the source and target that do not matter and similarities that do matter.

Structural similarities and differences

Before going on, try to solve the Fortress problem in Activity 5.1.
ACTIVITY 5.1 The Fortress problem A small country fell under the iron rule of a dictator. The dictator ruled the country from a strong fortress. The fortress was situated in the middle of the country, surrounded by farms and villages. Many roads radiated outward from the fortress like spokes on a wheel. A great general arose who raised a large army at the border and vowed to capture the fortress and free the country of the dictator. The general knew that if his entire army could attack the fortress at once it could be captured. His troops were poised at the head of one of the roads leading to the fortress, ready to attack. However, a spy brought the general a disturbing report. The ruthless
dictator had planted mines on each of the roads. The mines were set so that small bodies of men could pass over them safely, since the dictator needed to be able to move troops and workers to and from the fortress. However, any large force would detonate the mines. Not only would this blow up the road and render it impassable, but the dictator would then destroy many villages in retaliation. A full-scale direct attack on the fortress therefore appeared impossible.
(Gick & Holyoak, 1980, p. 351)

How did the general succeed in capturing the fortress?

Figure 5.1. Intermediate representation of the Fortress problem.
The Fortress problem is an example of an ill-defined problem. However, it is also an example of a problem whose solution is analogous to a problem you have already encountered in Chapter 3. Indeed, the solution would be fairly straightforward if you happened to remember it. Admittedly that was some time ago and you have probably been presented with a lot of fresh information, experiences, problems, etc., in between. If you still haven’t remembered the solution there is a further hint in Figure 5.1. Figure 5.1 is a graphical re-representation of the problem to help you to see the relationship between the problem statement and the solution. If you still can’t remember what the earlier problem might have been, the earlier source problem is reproduced below. Holyoak (1984) has described four ways in which two problems can be said to be similar or different.

1. Identities. These are elements that are the same in both the source and the target (the analogues). They are more likely to be found in within-domain analogies. For example, if you are shown an example in mechanics involving a pulley and a slope and then are given a problem also involving a pulley and a slope, there are likely to be several elements in the two problems that are identical. However, identities also include generalised rules that apply to both analogues such as “using a force to overcome a target”. These identities are equivalent to the schema that is “implicit” in the source.

2. Indeterminate correspondences. These are those elements of a problem that the problem solver has not yet mapped. In the course of problem solving the solver may have to work at mapping elements from one problem to another. It may be that some elements have been mapped and others have not yet been considered. The ones that have not yet been considered (or that the solver is unsure of) are the indeterminate correspondences. Trying to map the elements of the Monster Change problem mentioned in Chapter 3 onto the Tower of Hanoi can be quite tricky and there may be aspects whose mappings are unclear.
3. Structure-preserving differences. These refer to the surface features of problems that, even when changed, do not affect the solution. Examples would be “armies” and “rays”, and “fortress” and “tumour”. Although entirely different, such surface features do not affect the underlying solution structure. Nevertheless, notice that they play the same roles in both problems: armies and rays are agents of destruction; fortress and tumour are the respective targets. Structure-preserving differences also include those irrelevant aspects of a word problem that play no part in the solution. If the Fortress problem had included the line, “One fine summer’s day, a general arrived with an army…”, the weather would be completely irrelevant.

4. Structure-violating differences. These differences do affect the underlying solution structure. The Fortress and Radiation problems use a solution involving “division” and “convergence”. However, whereas armies can be divided into smaller groups, a ray machine cannot; nor is it immediately obvious how the rays can be divided. The solution to the Radiation problem involves getting hold of several machines and reducing the intensity of the rays. The operators involved in the Fortress problem have to be modified in the Radiation problem.

Similarity forms a continuum. At one extreme two problems may be entirely identical. At the other extreme two problems may be similar simply because they
ANALOGUE AND SOLUTION TO THE FORTRESS PROBLEM: Duncker’s (1945) Radiation problem Suppose you are a doctor faced with a patient who has a malignant tumour in his stomach. It is impossible to operate on the patient, but unless the tumour is destroyed the patient will die. There is a kind of ray that can be used to destroy the tumour. If the rays reach the tumour all at once at a sufficiently high intensity, the tumour will be destroyed. Unfortunately at this intensity the healthy tissue that the rays pass through will also be destroyed. At lower intensities the rays are harmless to healthy tissue, but they will not affect the tumour either. What type of procedure might be used to destroy the tumour with the rays, and at the same time avoid destroying the healthy tissue?
(Gick & Holyoak, 1980, pp. 307–308) Solution: The solution is to split up the large force (ray/army) into several rays/groups, and get them to converge simultaneously (attack) on the target (tumour/fortress).
are both problems. Most of the time we are likely to encounter problems that are neither identical to ones we have already solved nor differ so much that all we can say about them is that they are problems. That is, the problems we encounter are often variants of ones we have encountered before. These may be close variants to one we know or they may be distant variants (Figure 5.2). The degree to which problems of a particular type vary is likely to have an effect on our ability to notice that the problem we are engaged in is similar to an earlier one, and on our ability to use that source problem to solve the target. Figure 5.2 also illustrates that there are two respects in which problems can vary. Their surface features can change while the underlying solution procedure remains the same; or the surface features may be the same but the underlying solution procedure may differ. Table 5.2 shows examples of problems varying in their surface and structural features. In Table 5.2 notice that some of the surface features or “objects” play a role in the solution (400 miles, 50 mph) whereas others do not (Glasgow, London). Notice also that, despite the similarity in the objects in the
Figure 5.2. A continuum of similarity. A target can vary from a source with respect to its (structure-preserving) surface features or its solution method—the greater the number of structure-violating differences the greater the problems will vary until they are no longer the same type of problem. Where structural and surface features are the same (the left of the figure) the two problems are identical.
“similar-surface/dissimilar-structure” part, the solution procedure requires you to multiply two quantities whereas the source problem requires you to divide two quantities. Now try to identify similarities and differences between the objects and solution method in the Fortress and Radiation problems. To try to find a way of describing what these differences and similarities are and why they might or might not matter in analogising we will begin with surface similarities.

TABLE 5.2 Surface and structural features of problems

Source problem: A car travels 400 miles from Glasgow to London at an average speed of 50 mph. How long does it take to get there?
Surface “objects”: car, 400 miles, Glasgow, London, 50, mph, “how long…”
Solution structure: time=distance÷speed (a=b÷c)

Surface | Similar structure (a=b÷c) | Dissimilar structure (b=a×c)
Similar | A truck travels 480 kilometres from Paris to Lyons at an average speed of 80 kilometres per hour. How long does it take to get there? | A car takes 10 hours to travel from Glasgow to London at an average speed of 40 miles per hour. What distance does it travel?
Dissimilar | A tourist exchanges $300 for pounds sterling at an exchange rate of $1.5 to the pound. How many pounds does she get? | A tourist exchanges her dollars for £200 pounds sterling at an exchange rate of $1.5 to the pound. How many dollars did she get?

Figure 5.3. A semantic hierarchy of transport systems.

Surface similarity

A problem involving a car travelling at a certain speed for a certain length of time is similar to another problem involving a car travelling at the same speed but for
a different length of time. Here all the objects (cars, speeds, times) are identical but the value of one of them (time) is different. In the similar/similar part of Table 5.2, the car has been replaced by a truck. The problems are still likely to be recognised as similar because the objects (car and truck) are closely semantically related. How, then, can we describe or represent semantic similarity? One way objects can be represented is as a hierarchy of semantically related concepts. “Car” is similar to “truck” as both are examples of road transport. “Aeroplanes” and “helicopters” are examples of air transport. “Ferries” and “cruise liners” are examples of sea transport. These examples are similar to each other because they are all forms of transport (Figure 5.3). The general term “transport” at the top of the hierarchy is the superordinate category. “Road transport” not only has a superordinate but also contains subordinate categories at a level below it. Because of their semantic similarity, a problem involving a helicopter travelling over a certain distance at a certain speed may remind you of the earlier problem involving the car. Although semantic similarity can be represented as a hierarchy, this does not mean that the mind represents concepts in exactly this way. Rather we can think of the items at each level of the hierarchy as nodes in a semantic network (Figure 5.3). The nodes are linked and some links can be stronger than others. That is, items may be more strongly associated with some items than others (Figure 5.4). Furthermore, there are likely to be links with associated concepts other than transport. Hearing or seeing the word “car” is likely to lead to the most strongly associated concepts being activated, which in turn may activate further related concepts. This is known as “spreading activation” (Anderson, 1983; Collins & Loftus, 1975; Collins & Quillian, 1969). Links between nodes can have their activation or “weights” temporarily strengthened.
For example, if you are discussing transport then other forms of transport are likely to be activated (see Figure 5.4). If, on the other hand, you are discussing pollution then you are more likely to have links between things like cars, exhaust gases, sewage discharges, etc., activated (Figure 5.5). Indeed, representing problems can be seen as patterns of activation in a semantic network. If the representation is inappropriate, then finding a correct representation can be hampered by the fact that some links in the semantic network are temporarily strengthened and others, weakened. It is thus hard for the system to take a detour, as it were, through a “weakened” link which may lead to a solution. In such cases it may be best to forget about the problem (or
Figure 5.4. A semantic network showing connections between “car” and other forms of transport (black lines). Other links have been inhibited (grey lines).
the name of that actress that is on the tip of your tongue) until the strengthened links have dropped back to their resting state. You may then find that the answer you have been looking for unexpectedly “pops” into your head. In problem solving this phenomenon is known as incubation (see Simon, 1966; Yaniv & Meyer, 1987, for different explanations of incubation). Spreading activation is a mechanism proposed by Ohlsson (1992) to explain how we might find solutions to insight problems (see Chapter 3). In analogical problem solving objects that are seen as semantically similar are more likely to be mapped across. The mapping of related objects from one situation to another is illustrated in Table 5.3.
TABLE 5.3 Mapping similar objects
Source          (maps to)       Target
Identical objects
car                             car
Similar (semantically related) objects
car                             truck
laser beams                     X-rays
If you are on a picnic and find you have a bottle of wine and no corkscrew, you may remember that you read once in a problem-solving book about a way to remove a cork with a piece of string and a pencil. However, having searched your pockets you can only come up with a biro and a thin piece of electric cable you just happen to have with you. It is very probable that you would realise that you can substitute the biro for the pencil and the electric cable for the string. When objects are strongly semantically related they often also share a number of features. Not only can objects be similar in the sense of belonging to the same category but they can also be similar because they share the same properties. Now you may have noticed
Figure 5.5. A semantic network showing connections between “car” and pollution (black lines). Other links have been inhibited (grey lines).
that string and electric cable are not particularly close in terms of function or category membership. If you didn’t notice, it is probably because they are very similar in terms of their shared properties or attributes. Both are relatively long, both are very thin, both are flexible. In problems that involve vehicles it is usually the functional aspects of vehicles that are important—vehicles are for carrying things from point A to point B. With the biro and pencil mapping it is not the fact that both belong to the same category or have the same sort of function that is important in solving the problem, but rather that both are fairly narrow and rigid. Objects can be alike, then, because they share some important attributes and those attributes may (or may not) be useful in solving problems. Semantically related objects are likely to share several attributes in common. A tomato and a strawberry are similar in that both are red, edible, soft, sweet, and contain seeds. On the other hand, objects that are semantically unrelated can have attributes in common: a tomato and a car may be alike in that they are both red; and redness may well be the only attribute they have in common. One way of representing a piece of knowledge such as the statement “the tomato is red” is to use a propositional representation in which the attribute is stated first and the thing the attribute refers to is placed in brackets after it. Thus, “the tomato is red” becomes: red (tomato), and “my car is red” becomes: red (my_car). Notice that any object can be slotted into the brackets to represent the proposition that that object is red. So an attribute is followed by a single slot that can be filled by any object that has that attribute. The underscore linking “my” and “car” in the last example indicates that “my_car” is one object. We now have a way of comparing objects in terms of their attributes. 
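As a rough sketch of how this notation supports comparison, each proposition can be stored as an attribute paired with its slot filler, and objects can then be compared by the attributes they share. The class and function names (`Proposition`, `attribute`, `shared_attributes`, `candidates`) are hypothetical, and the particular set of facts is only a sample drawn from the examples in this chapter.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    predicate: str   # the attribute, e.g. "red"
    argument: str    # the slot filler, e.g. "tomato"


def attribute(name, *objects):
    """Build one proposition per object, e.g. red (tomato), red (my_car)."""
    return {Proposition(name, obj) for obj in objects}


# A sample of facts taken from the examples in the text.
facts = (
    attribute("red", "tomato", "strawberry", "my_car")
    | attribute("edible", "tomato", "strawberry")
    | attribute("soft", "tomato", "strawberry")
    | attribute("rigid", "pencil", "pen", "twig")
    | attribute("narrow", "pencil", "pen", "twig", "liquorice")
    | attribute("flexible", "string", "cable")
    | attribute("thin", "string", "cable")
)


def attributes_of(obj, facts):
    """All attributes whose slot is filled by `obj`."""
    return {p.predicate for p in facts if p.argument == obj}


def shared_attributes(a, b, facts):
    """Attributes that two objects have in common."""
    return attributes_of(a, facts) & attributes_of(b, facts)


def candidates(required, facts):
    """Objects that have every one of the required attributes."""
    objects = {p.argument for p in facts}
    return {obj for obj in objects if required <= attributes_of(obj, facts)}
```

Here `shared_attributes("tomato", "my_car", facts)` gives only `{"red"}`, whereas the tomato and the strawberry share several attributes. Asking for `candidates({"rigid", "narrow"}, facts)` picks out the pencil, the pen, and the twig, but not the liquorice, which is narrow without being rigid.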
The important similarity between the pencil and the pen in the picnic problem is not that they are both writing implements but that both are narrow enough to fit inside the neck of a bottle and both are rigid. A narrow green twig might do just as well. Table 5.4 therefore shows how to make attribute mappings.
TABLE 5.4 Attribute mapping
Source              (maps to)       Target
rigid (pencil)                      rigid (pen)
narrow (pencil)                     narrow (pen)
flexible (string)                   flexible (cable)
thin (string)                       thin (cable)
There are a couple of important points to notice here. First, the slot fillers for the rigid and narrow attributes and the flexible and thin attributes are to a great extent irrelevant. Almost anything rigid and narrow enough will do. Anything flexible and thin enough to tie a knot in will do to pull the cork out of the bottle. Holyoak and Thagard (1995) point out: “The capacity to focus selectively on a particular attribute of an object, and hence on particular similarities between objects, is an important cognitive advance, because it breaks the dependence on global similarity” (p. 25). Second, in each case both attributes are necessary in the same object to solve the problem. Whatever object you find that is narrow enough to go down the neck of the bottle must also be rigid enough to push the cork down into the bottle. A stalk of liquorice won’t do. Table 5.5 shows how we can generalise from the pen/pencil mappings.
TABLE 5.5 Attribute mappings with variables replacing specific values
Source              (maps to)       Target
rigid (X)                           rigid (Y)
narrow (X)                          narrow (Y)
flexible (A)                        flexible (B)
thin (A)                            thin (B)
We have in a sense moved up one level of abstraction from mapping “pen” to “pencil”, because of their general semantic similarity, to mapping attributes of those
objects. However, attributes can also be semantically related. “Tiny” and “minute” and “green” and “turquoise” are pairs of values of the attributes of size and colour respectively. A tiny armchair and a minute book may be recognised as similar enough objects to be put together in a doll’s house.
Semantic constraints on analogising
Holyoak and Koh (1987) studied subjects’ ability to access and use analogous problems involving the “convergence” solution used in the Fortress and Radiation problems. What they found was that the more semantically similar the objects in the analogues were, the more subjects were likely to access and use the earlier problem to solve the target. For example, subjects given a problem involving a broken lightbulb filament that required converging laser beams to repair it were more likely to solve the Radiation problem, where X-rays are used to destroy a tumour, than the subjects who were first given the Fortress problem, where armies attack a castle. This is because laser beams and X-rays are similar sorts of things: both problems involve converging beams of radiation. Furthermore, their semantic similarity suggests that they probably play similar roles in the problems and therefore have high transparency. “Armies” and “rays” are dissimilar objects, and have low transparency. Accessing a previous problem has been found to depend
96
PROBLEM SOLVING
strongly on the amount of semantic similarity between objects in different problems (Gentner, Rattermann, & Forbus, 1993; Gentner & Toupin, 1986; Holyoak & Koh, 1987; Ross, 1987). RELATIONAL SIMILARITY So far we have seen how we can map objects in two situations or problems that are similar in that the objects are either identical or semantically related. Attribute mapping allows us to recognise and use similarities at a level above the level of the objects themselves and therefore to map the attributes of one object on to the attributes of another. However, we can go even further and look at similarities in the relations between two or more objects. In the sentence “Joe opened the wine bottle” there are two objects, “Joe” and “wine bottle”, and there is a relation between them represented by the verb “opened”. This proposition, which can be represented in natural language as “Joe opened the wine bottle”, can also be represented by: opened (Joe, wine_bottle). In this example there are two slot fillers, and the term before the brackets is a relation rather than an attribute. The relations and attributes that are outside the brackets are collectively known as predicates. The things inside the brackets are arguments. A predicate can have many arguments. The verb “give”, for example, involves a giver, a receiver, and an object that is given. “Charles gave Daphne a bunch of flowers” can be represented as: gave (Charles, Daphne, bunch_of_flowers). It is because we can understand the nature of relational similarity that we can make sense of analogies such as “Currency speculation is the AIDS of the world economy”. In Table 5.1 “AIDS” was shown as mapping onto “currency speculation”. However, this is a bit of a simplification. The point of making the analogy in the first place was because we expected to infer from the statement that currency speculation has a debilitating effect on the world economy and that there is nothing much that can be done about it. 
How are we able to make these inferences? In fact, there is something missing from the statement that we have to put back in. AIDS is not a disease that affects the world economy. It affects human beings. So another way of understanding the statement is to read it as “Currency speculation bears the same relation to the world economy as AIDS does to human beings”. Before analysing the implications of this kind of analogy, I am going to take you on a brief diversion into proportional analogies.
Proportional analogies
Proportional analogies are the kinds of items often found in IQ tests. They often take the form: 2:4 :: 6:? (“2 is to 4 as 6 is to ?”, that is, “what is the relation between 2 and 4, and what would you get if you applied it to 6?”). You are then given four options:
a. 8
b. 7
c. 9
d. 12
The difference between 6 and whatever the answer is, is the same “proportion” as the difference between 2 and 4. The goal is to work out what the relation is between 2 and 4 and apply that relation to the 6. You
have to infer a relation between the 2 and the 4 and generalise it to the 6. In fact several relations can be inferred from the first part of the analogy. The relation could be “increase by 2”, or “double”, or any “higher number”. If you infer that the relation is “increase by 2” and you apply that to the 6 then the answer would be 8; if you infer that the relation is “double” then the answer is 12; and so on. Verbal analogies are of the same form, as in: cat: kitten :: dog:? Proportional analogies of the form A:B :: C:D are known as 4–term analogies. The processes involved in solving proportional analogies are dealt with in more detail later. In the mean time we shall be looking at a simpler, if not simplistic, model for the purposes of illustrating relational mappings. Suppose you were asked to complete the 3-term analogy: hot:cold :: dry:? First of all you would have to know what the words meant (Figure 5.6a). Based on the information you retrieve about the words, you can generate a hypothesis about the likely relation between “hot” and “cold” (Figure 5.6b). You would then attempt to apply this relation to “dry” (Figure 5.6c) and generate the answer “wet” (Figure 5.6d). By inferring or inducing a relation between “hot” and “cold” you create the structure: opposite_of (hot, cold). This inference process is known as induction. When you come to apply it to another item in the analogy you are generalising from that one instance. This gives rise to the structure: opposite_of (A, B). Applying a generalisation to a specific instance is known as deduction. Indeed, you can apply this structure to items other than “dry” to generate: opposite_of (tall, short), opposite_of (fat, thin), etc. 
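The two steps just described, inducing a relation from the first pair and then applying it to the third term, can be sketched as follows. The candidate relations for the numeric item are the two hypotheses named above; the small store of verbal relations, the relation name `has_young`, and the function names are assumptions made purely for illustration.

```python
# Candidate relations that might be induced from 2:4 (hypotheses from the text).
NUMERIC_RELATIONS = {
    "increase by 2": lambda x: x + 2,
    "double": lambda x: x * 2,
}


def induce(a, b, relations):
    """Induction: return every candidate relation that turns a into b."""
    return [name for name, apply in relations.items() if apply(a) == b]


def solve(a, b, c, options, relations=NUMERIC_RELATIONS):
    """Apply each induced relation to c and keep answers offered as options."""
    answers = {}
    for name in induce(a, b, relations):
        result = relations[name](c)
        if result in options:
            answers[name] = result
    return answers


# A tiny, hypothetical store of verbal relations: relation -> {first: second}.
VERBAL_RELATIONS = {
    "opposite_of": {"hot": "cold", "dry": "wet", "tall": "short", "fat": "thin"},
    "has_young": {"cat": "kitten", "dog": "puppy"},
}


def complete_verbal(a, b, c, knowledge=VERBAL_RELATIONS):
    """Find a relation holding between a and b, then apply it to c."""
    for name, pairs in knowledge.items():
        if pairs.get(a) == b and c in pairs:
            return name, pairs[c]
    return None
```

For the item 2:4 :: 6:? with the options 8, 7, 9, and 12, `solve(2, 4, 6, [8, 7, 9, 12])` returns both 8 (for “increase by 2”) and 12 (for “double”), which shows why the item is ambiguous until one relation is chosen. `complete_verbal("hot", "cold", "dry")` returns the relation `opposite_of` together with the answer “wet”.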
In these examples it is the relation that is being mapped across from the source to the target, and the items that are inside the brackets are largely irrelevant except to establish the relation in the first place and to provide the specific instance to which the relation should be applied. We are now in a better position to understand Jacques Chirac’s analogy: “Currency speculation is the AIDS of the world economy”. Whereas establishing the higher-order relation “opposite_of” was a means to the end of establishing the missing term, in Jacques Chirac’s analogy the end is to establish the higher-order relation itself. In other words, if you understand the effects of AIDS then you can understand the effects of currency speculation on the world economy (Figure 5.7). Relational similarity: Reproductive or productive? Earlier I mentioned that the preliminary requirement to finding a relation between two concepts was a search through our semantic system. One of the features of the metaphors just discussed is that two ideas have been juxtaposed that may never have been juxtaposed before. Generating such analogies and, indeed, understanding them is a creative ability. Compare the following two analogies. 1. uncle:nephew :: aunt:? 2. alcohol:proof :: gold:? In the first case there is a well known set of relations between uncle and nephew. The latter is the offspring of the former’s brother or sister. Both are male. The next item, “aunt”, invites us to generate a parallel known
Figure 5.6. A summary of processes needed to solve the analogy: hot: cold::dry:?
relation for a female offspring. We are reconstructing known sets of relations when we complete this analogy.
ACTIVITY 5.2 Using Figure 5.7 as a model, how would you analyse the metaphor: “The Falklands thing was a fight between two bald men over a comb”. (Answer on p. 236)
Figure 5.7. Relational mappings used to understand effects of those relations in a target domain.
In the second case, as in the case of metaphors, the solver has possibly never juxtaposed these items before. Any solution the solver generates will therefore be constructed for the very first time. That is, the similarity between the relations (here the second item is a measure of the purity of the first item) is a novel one. Indeed if we couldn’t invent new relations on the spur of the moment, then we wouldn’t be able to understand what Jacques Chirac was going on about. For this reason, our ability to generate and understand a presumably infinite number of possible relations is seen as a productive ability (Bejar, Chaffin, & Embretson, 1991; Chaffin & Herrmann, 1988; Johnson-Laird, Herrmann, & Chaffin, 1984).
Relational elements theory Bejar et al. (1991) argue that relations are of two types: those that the person generating the relation already knows, and those that have to be constructed on the spot. “The variety of relations suggests that people are capable of recognizing an indefinitely large number of distinct semantic relations. In this case, the explicit listing of all possible relations required in a network representation of relations will not be possible” (p. 18). Bejar et al. argue that theories of concept formation and representation do not readily explain how relations can be generated. Johnson-Laird (1989) said something similar when he argued that finding relevant relations (category membership relations) was an algorithmic process. However, the “deep” analogies involved in complex problem solving or in understanding a new domain required further explanation. Bejar and co-workers’ (1991) theory covers A:B :: C:D analogies where A and B are given and the subject has to choose from alternative C: D pairs which ones are similar. They provide evidence that people can readily evaluate the similarity of relations fairly quickly (Bejar et al., 1991, p. 26). The fact that people evaluate the similarity of relations is important because it implies that relations are decomposable into more primitive elements. To judge that two things are similar it is necessary to identify ways in which they are the same and ways in which they are different (Tversky, 1977). The aphorism “You can’t compare apples and oranges” expresses this point. Apples and oranges cannot be compared if they are considered to be unitary, unanalysable wholes. On the other hand, if they are decomposed into aspects in which they are the same—size, shape, nutritional value—and different— texture, taste, color—then the comparison can be made. The ability to readily compare relations means that relations are readily decomposable into more primitive elements. Bejar et al. 
give the example of types of “inclusion relations”. These include: spatial inclusion where something is in something else (MILK: BOTTLE); part-whole inclusion where something is part of something else (ENGINE: CAR); and class inclusion where something is an example of something else (ROBIN: BIRD). These three inclusion relations vary in their complexity. Spatial inclusion is fairly simple and refers to something being in, on, under, etc., something. Part-whole inclusion not only requires that something be in, on, under, etc., something but also that the part is connected to the whole in some way. Bejar et al. go on to argue that class inclusion involves one object being in the set of some general (superordinate) category, being connected to that category, and, further, having certain features that make the object similar to the general category. These inclusion relations can therefore be characterised by three relational elements: <inclusion>, <connection>, and <similarity>. The inclusion relations can have one or more of these relational elements. If you want to compare relations this can only be done on the basis of shared relational elements. Bejar et al. give the following examples:
The wheel is part of the bike       part-whole
The bike is in the garage           spatial
You can look upon these statements as premises and draw conclusions from them. However, the conclusions can only be based on the shared relational elements. In other words, we can readily infer that if the wheel is part of the bike and the bike is in the garage then the wheel must be in the garage, too. However, we would not readily conclude that the wheel is part of the garage. This is because only the relational element <inclusion> is common to both statements.
The wheel is in the garage          spatial
The wheel is part of the garage     part-whole
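The wheel, bike, and garage example can be sketched by treating each inclusion relation as a set of relational elements. The sketch assumes that the three elements are inclusion, connection, and similarity, as the descriptions of spatial, part-whole, and class inclusion suggest; the dictionary and function names are hypothetical and are not taken from Bejar et al.

```python
# Each inclusion relation described as a set of relational elements.
# The element names are an assumption based on the descriptions in the text.
ELEMENTS = {
    "spatial": {"inclusion"},
    "part-whole": {"inclusion", "connection"},
    "class": {"inclusion", "connection", "similarity"},
}


def shared_elements(first, second):
    """Relational elements common to two relations."""
    return ELEMENTS[first] & ELEMENTS[second]


def licensed_conclusions(first, second):
    """Relations whose elements are all shared by both premises."""
    common = shared_elements(first, second)
    return [name for name, elements in ELEMENTS.items() if elements <= common]
```

Given the premises “the wheel is part of the bike” (part-whole) and “the bike is in the garage” (spatial), `licensed_conclusions("part-whole", "spatial")` returns only `["spatial"]`. The wheel is in the garage, but nothing licenses the conclusion that it is part of the garage.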
By breaking down relations into “relational elements” Bejar et al. were able to divide a very large number of IQ test items into 10 categories which varied in difficulty. In fact, they argued that the processes involved in solving the analogy problems were different for different categories. Most other models of proportional analogies downplay the role of semantics (e.g., Pellegrino & Glaser, 1982). However, Bejar et al. found that the semantic characteristics of the test items were important determinants of how easily the items were likely to be solved. Variations in the semantic features of the relations explain much of the individual differences between subjects on verbal analogy IQ tests. Furthermore, Sternberg and Nigro (1983, p. 36) used proportional analogies as a means of investigating how metaphors in general are understood:
On the present theory, an interaction between tenor [target] and vehicle [source] occurs when the semantic subspace containing the tenor of a metaphor is mentally superimposed upon the semantic subspace containing the vehicle of a metaphor […] in some cases this mapping results in a shift in one’s perception of the respective natures of the tenor and vehicle.
STRUCTURAL SIMILARITY
So far we have seen that two things can be seen as similar if they are closely semantically related (e.g., trucks and vans); they can also be similar if they share the same attributes (yellow trucks and yellow books); and the relations between objects can also be regarded as similar. We have been climbing up a hierarchy of similarity, as it were. However, we can go even further and look at similarity between even more complex structures. The best example of how similar hierarchical structures can be mapped can be seen in the work of Dedre Gentner.
Gentner’s structure-mapping theory
According to Dedre Gentner, an analogy is not simply saying that one thing is like another. “Puppies are like kittens” or “milk is like water” are not analogies.
They are in Gentner’s terms “literally similar” because they share the same attributes, such as “small” in the first case and “liquid” in the second, as well as the same relations. Gentner (e.g., Falkenhainer, Forbus, & Gentner, 1989; Gentner, 1989; Gentner & Toupin, 1986) and Vosniadou (1989) argue that “real” analogies involve a causal relation. In the analogy “puppies are to dogs as kittens are to cats”, there is a relation between “puppies” and “dogs” which also applies between “kittens” and “cats”, and this relation can readily be explained. An analogy properly so-called involves mapping an explanatory structure from a base (source) domain (puppies and dogs) to a target (kittens and cats). Analogising, according to Gentner (1983), involves mapping a relational structure that holds in one domain onto another. This is known as the principle of systematicity, which states that people prefer to map hierarchical systems of relations in which the higher-order relations constrain the lower-order ones. In other words, it is only because we can infer that two bald men fighting over a comb is pointless that we can map the bald men to Britain and Argentina and the comb to the Falklands and generate the assumption that the fight was pointless.
The structure-mapping theory is implemented as the Structure Mapping Engine (Falkenhainer et al., 1989) and uses a predicate calculus of various orders to represent the structure of an analogy. At the lowest order, order 0, are the objects of the analogy (the salient surface features), such as “army”, “fortress”, “roads”, “general”. When an example is being used as an analogy, the objects in one domain are assumed to be “put in correspondence with” the objects in another to obtain the best match that fits the structure of the analogy. At the next level the objects at level 0 can become arguments to a predicate. So “army” and “fortress” can be related using the predicate “attack” giving “attack (army, fortress)”. This would have order 1. A predicate has the order 1 plus the maximum of the order of its arguments. “Army” and “fortress” are order 0; “greater_than (X, Y)” would be order 1; but “CAUSE [greater_than (X, Y), break (Y)]” would be order 2, because at least one of its arguments is already order 1. CAUSE, IMPLIES, and DEPENDS ON are typical higher-order relations. “On this definition, the order of an item indicates the depth of structure below it. Arguments with many layers of justifications will give rise to representation structures of higher order” (Gentner, 1989, p. 208). Mapping an explanatory structure allows one to make inferences in the new domain or problem, as the relations that apply in the source can be applied in the target. Notice that the inferences are based on purely structural grounds. A structure such as: CAUSE [STRIKE (ORANGE, TREE), FLATTEN (ORANGE)] can be used in a Tom and Jerry cartoon to decide what happens when Tom smashes into a tree—CAUSE [STRIKE (TOM, TREE),?]—to generate the inference CAUSE [STRIKE (TOM, TREE), FLATTEN (TOM)]. The missing part concerning what happens when Tom strikes a tree can be filled in by referring to the structure expressing what happens when an orange hits the tree. 
The predicate FLATTEN (ORANGE) is inserted into the target with ORANGE mapped to TOM.
CAUSE [STRIKE (ORANGE, TREE), FLATTEN (ORANGE)]
CAUSE [STRIKE (TOM, TREE), ?]
CAUSE [STRIKE (ORANGE, TREE), FLATTEN (ORANGE)]
CAUSE [STRIKE (TOM, TREE), FLATTEN (TOM)]
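Both the rule for the order of a predicate and the way the missing part of the target is filled in can be illustrated with a short sketch. It represents a structure as a nested tuple whose first element is the predicate, which is a choice made here for convenience. It is a toy illustration under that assumption and is not the Structure Mapping Engine itself; the function names are hypothetical.

```python
# An expression is either an object (a string) or a tuple:
# (predicate, argument, argument, ...). "?" marks the missing part of a target.

def order(expr):
    """Objects have order 0; a predicate has 1 + the highest order among its arguments."""
    if isinstance(expr, str):
        return 0
    return 1 + max((order(arg) for arg in expr[1:]), default=0)


def substitute(expr, mapping):
    """Replace source objects by their target counterparts."""
    if isinstance(expr, str):
        return mapping.get(expr, expr)
    return (expr[0],) + tuple(substitute(arg, mapping) for arg in expr[1:])


def align(source, target, mapping):
    """Collect object correspondences from the parts the two structures share."""
    if target == "?":
        return
    if isinstance(source, str) != isinstance(target, str):
        raise ValueError("structures do not match")
    if isinstance(source, str):
        # Each source object may correspond to only one target object.
        if mapping.setdefault(source, target) != target:
            raise ValueError("inconsistent object mapping")
        return
    if source[0] != target[0] or len(source) != len(target):
        raise ValueError("structures do not match")
    for s, t in zip(source[1:], target[1:]):
        align(s, t, mapping)


def fill(source, target, mapping):
    """Copy the source structure into every '?' slot, with objects mapped across."""
    if target == "?":
        return substitute(source, mapping)
    if isinstance(target, str):
        return target
    return (target[0],) + tuple(
        fill(s, t, mapping) for s, t in zip(source[1:], target[1:])
    )


def infer(source, target):
    mapping = {}
    align(source, target, mapping)
    return fill(source, target, mapping)


source = ("CAUSE", ("STRIKE", "ORANGE", "TREE"), ("FLATTEN", "ORANGE"))
target = ("CAUSE", ("STRIKE", "TOM", "TREE"), "?")
inference = infer(source, target)
```

Aligning the two STRIKE predicates puts ORANGE in correspondence with TOM, so the “?” is filled with FLATTEN (TOM). Note that `order(source)` is 2, because CAUSE takes order-1 predicates as its arguments, and that the inference is made on purely structural grounds: nothing in the sketch knows anything about oranges or cats.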
For Gentner, analogising involves a one-to-one systematic mapping of the structure of the base domain onto the target. The surface features (the object descriptions) are not mapped onto the target because they play no role in the relational structure of the analogy. For example, the colour of the orange or of Jerry is irrelevant because it plays no part in the structure. The problems dealt with by Gick and Holyoak (1980, 1983) are therefore analogous under this definition. Some researchers are happy to give analogy a very broad definition (e.g., Anderson & Thompson, 1989). Holyoak (1985), for example, refers to analogies as ranging from the “mundane to the metaphorical”. However, Gentner (e.g., 1989) is much stricter in her definition. She provides a taxonomy of similarities between problems in which analogy is distinct from other types of similarity (Falkenhainer et al., 1989; Gentner, 1989; Gentner et al., 1993). Figure 5.8 shows different types of similarity and where they might lie on a continuum (see also Gentner, 1989, p. 207).
Literal similarity. When two problems or situations are literally similar, both the structure of the problem and the surface features (or attributes) are similar. Literal similarity is almost always found in within-domain comparisons. In the example “milk is like water” many of the attributes of water can also be applied to milk. There is a fuzzy boundary between what is technically analogy, where just the relational structure is shared between problems, and literal similarity, which also includes object descriptions.
Figure 5.8. Similarity space: Classes of similarity based on the kinds of predicates shared. (From Gentner, 1989, p. 207. Reproduced by permission of Cambridge University Press).
Mere appearance match. With mere appearance match only the lower-order predicates match. The relational structure is ignored. Gentner gives the example “the glass tabletop gleamed like water” in which only the physical description is shared between the source and target. She claims that novices are prone to mere appearance matches (see Chapter 9). Abstraction mapping. Abstraction mapping involves mapping variables or abstract principles to a target such as “heat is a through variable”. Abstraction mapping assumes prior knowledge on the part of the reader because, as in the example just given, the abstraction can be used to categorise a problem or instantiate a procedure or rule. Abstraction mapping also forms a continuum with analogy. Mapping one structure on to another Gentner and her co-workers have produced a great deal of evidence to back up her claim that analogising comes about by mapping entire relational structures. Study Box 5.1 gives one example by Gentner and Toupin (1986). In Gentner and Toupin’s study, where the children understood the story’s underlying rationale, they were able to adapt their characters to fit the story’s structure. The rest of the time they simply imitated the sequence of actions taken by the semantically similar counterparts in the earlier story. The younger children may not have had an adequate understanding of the earlier story to be able to apply the rationale behind it. These results do not confine themselves to children. Gentner and Schumacher (1987; Schumacher & Gentner, 1988) found the same results with adults. Their subjects had to learn a procedure for operating a computer-simulated device and then use it to learn a new device. Once again the systematicity and
transparency were manipulated. The systematicity was varied by providing either a causal model of the device or simply a set of operating procedures. The transparency referred to the type of device components. The results showed that systematicity constrained learning and transfer to the target device. Transparency also had strong effects on transfer. The speed of learning the new device was greater when corresponding pairs of components were similar than when they were dissimilar.
PRAGMATIC CONSTRAINTS
So far we have looked at similarity in terms of the objects, attributes, relations, and structure. Each of these can be said to act as a constraint on analogising. There is one further constraint on which analogy depends, and that is the purpose of the analogy (Holyoak, 1985). The main constraints are therefore:
• semantic: referring to the degree of similarity between the objects, the features or attributes of the objects, and the relations between objects;
• syntactic: referring to the structure of the analogy along the lines of Gentner’s systematicity principle;
• pragmatic: referring to the analogiser’s goals in making an analogy.
Holyoak argues that the context and the goals of the thinker are important constraints on thinking. Pragmatic constraints, like syntactic constraints, reduce the amount of information about a problem or situation that we need to attend to. If your goal was to make children laugh at cartoons by having a cat named Tom chasing a mouse and crashing into a tree, then you might not want the result to be too horrible. So you would limit your analogy to the flattening of Tom’s face, say, just as an orange would flatten if hurled against a tree. If, on the other hand, you were a cartoonist for “The Simpsons” and were making fun of the nasty things that happen to Tom in “Tom and Jerry” cartoons, then SPURT_FROM (JUICE,
STUDY BOX 5.1 Gentner and Toupin (1986)
Rationale
The aim of the study was to examine the effects of varying the systematicity of a storyline and the transparency of objects in the stories (the degree of semantic similarity between the characters in the stories and the role they played in them). As the study was conducted with children, there was a further aim of finding out if there was a developmental trend in children’s ability to use structure mapping in analogising.
Method
Gentner and Toupin (1986) presented children with stories they had to act out. The transparency of object correspondences and the systematicity of the first (source) story were varied in later stories. Transparency was varied by changing the characters that appeared in them. For example, the original story (Gentner refers to the source as the base) included characters such as a chipmunk who helped a moose to escape from a frog. In the high-transparency condition, characters became squirrel, elk, and toad. In the medium-transparency condition, different types of animals (not semantically related to the ones in the source) were used. In the low-transparency condition, the roles of the characters were reversed (leading to “cross-mapping”): in the new story the elk played the role of the chipmunk in the original, the toad played the role of the moose, and the squirrel that of the frog. Systematicity was varied by adding a sentence to the beginning to provide a setting and a final sentence to the story in the form of a moral summary (the systematic condition). The non-systematic condition had neither a moral nor a setting.
The children were given certain roles to play and asked to act out the original story and then to act it out again with different characters.
Results and discussion
Both age groups did well when there was high transparency (similar characters playing similar roles) in spite of a lack of systematicity. However, for the medium- and low-transparency conditions the younger age group were more influenced by the similarity between the characters than the older age group; that is, the surface similarity led them to produce a non-systematic solution. For the older age group the systematicity allowed them to “hold onto” the proper mappings.
ORANGE) in the source might be applied to the target. (I’ll leave you to imagine the mapping.) Holyoak has made the strong claim that “an analogy is ultimately defined with respect to the system’s goals in exploring it”, and that “syntactic approaches [such as Gentner’s systematicity principle], which do not consider the impact of goals on analogical transfer, are doomed to fail” (Holyoak, 1985, p. 70). The higher-order predicates in Gentner’s scheme (such as CAUSES, IMPLIES, DEPENDS ON, EXPLAINS, ENTAILS, PREVENTS, etc.) are the “causal elements that are pragmatically important to goal attainment” (1985, p. 73). A knowledge of such causal relations can guide analogical mapping. Not surprisingly, Gentner has not been too happy with this characterisation of her structure-mapping theory. She has referred to analogies as being either in context or in isolation (Gentner, 1989). By an analogy in context, Gentner is referring to analogical problem solving proper, where the solution structure in one domain can be extrapolated and used in another, as in Gick and Holyoak’s variations of the Radiation problem. Analogies in isolation refers to analogies one finds in metaphors such as Francis Bacon’s “All rising to a great place is by a winding stair”. Gentner argues that a pragmatic account of analogical mapping requires the analogiser to know what is relevant before the analogy can be understood. It therefore has problems coping with analogies in isolation such as Bacon’s metaphor, and creative analogies such as one finds in scientific discovery. The role of pragmatic considerations is better placed “outside” the analogiser (see Figure 5.9). For more on the role of pragmatic versus structural influences on analogical problem solving see Keane (1988). THE RELATION BETWEEN SURFACE AND STRUCTURAL SIMILARITY If the surface characteristics of problems or objects or situations told us nothing about them, then we would be in trouble. 
The surface features of objects or situations usually indicate something about their underlying nature. Sweller (1980) has defined similarity in terms of “shared representational predicates” which are the lower-order predicates in Gentner’s systematicity theory. These representational predicates are accessible surface properties. Sweller makes the point that shared surface features are a useful heuristic for accessing earlier problems. It is reasonable that two problems that look the same on the surface also share an underlying structure. Accessing a problem through shared surface features, he argues, is a good constraint to impose on the predicates that compose mental representations. Vosniadou (1989) also argues that salient attributes are a way of linking both surface and structural similarity. An attribute is salient because such a feature, by the fact that it is salient, provides a direct link to the underlying structure. Salience is important because people naturally expect a causal relation between surface and structural similarity. For example, the “redness” of a tomato (a surface feature) has a causal relation with the underlying feature of “ripeness”. This knowledge can come about either from experience with tomatoes or from an analogy with the relation between redness and ripeness in apples. If novices are presented with an example in which the same surface features occur, then they might reasonably infer that
Figure 5.9. An architecture for analogical processing (adapted from Gentner, 1989, p. 216. Reproduced by permission of Cambridge University Press).
the example will involve the same underlying relation and that this relation can be applied in the current problem. By the same token, when there are no obvious featural similarities between problems, there is no particular reason why the features of the target problem will allow you to access a relevant source. The relation between surface and structure is also pertinent in concept formation. Medin and Ortony (1989) argue that similarity judgements are based on our representations of objects (and of problems) not on actual entities themselves. The descriptive properties of objects, or their surface features, are usually related to deeper, less accessible properties. It therefore makes sense to attend to the surface properties of objects because they are a good heuristic for accessing the underlying structure. It is not only novices that use surface features as a useful heuristic in accessing possibly relevant source problems or in categorising problems. Hinsley, Hayes, and Simon (1977) found that their expert subjects could categorise problems after hearing only a few phrases. Blessing and Ross (1996, p. 806) argue that
there is a correlation between problem types and their contents (surface features) and that experienced solvers use these correlations to make hypotheses about the problem type:
Quick access on the basis of surface content, even if it is not guaranteed to be correct, may be an attractive initial hypothesis given the longer time required for determining the deep structure…experts would be able to begin formulating the problem for solution while still reading it, thus saving time in solving the problem.
It is a consistent finding in problem-solving research and in everyday life that people are strongly influenced by the surface features of problems or situations. We tend to remember, and hence are influenced by, the information that stands out. We may remember a US presidential candidate falling off a podium and base our judgement of his adequacy as a president on that. If one reads about genetically modified foods, one may recall the phrase “Frankenstein foods” applied to them, and, although it is meaningless, the phrase may influence our view of such foods. Gentner et al. (1993) have posed the question: “How can the human mind, at times so elegant and rigorous, be limited to this primitive retrieval mechanism?” (p. 567). They suggest that accessing information based on surface features is an older mechanism in evolutionary terms than reasoning. Indeed, there are strong arguments suggesting that accessing information on the basis of surface features is an extremely fast and efficient way of “thinking” even if it does occasionally lead to error. After all “if something looks like a tiger, it probably is a tiger” (1993, p. 567). Gentner et al. refer to this as the kind world hypothesis. In the environment, salient surface features are strongly correlated with structural features. Gigerenzer and Todd (1999) have argued that we have adapted to such an environment, and the fast and efficient ways we have developed to deal with it make us smart.
Although relying on surface features can lead to errors, most of the time it allows quick and effort-free decisions. You gain more on the swings than you lose on the roundabout.
SUMMARY
1. Analogical problem solving requires the solver to see some kind of similarity between a current problem (the target) and an earlier problem in memory (a source). The degree of similarity depends on how the problems are represented.
2. Similarity can be of several types:
• identities: where the objects or surface features of both source and target are identical;
• semantic: where the surface features in the source and target are semantically related;
• featural: where there is similarity between the attributes of objects in the source and target;
• relational: where the same relation exists between objects in the source as in the target;
• syntactic: where the same hierarchical structure involving higher-order relations exists in both a source and target.
3. Semantic similarity and syntactic similarity provide constraints or limits on analogising:
• Semantic constraints limit what can be mapped from the source to the target to objects that are related.
• Syntactic constraints limit mappings to the objects that fill the same slots in a hierarchical structure, whatever those objects are.
• Pragmatic constraints limit mappings to objects and relations that reflect the analogiser’s goals.
4. Problems differ with respect to both surface features and the underlying structure. Structure-preserving differences allow the analogiser to use the source to make inferences about relevant operators in the target. The analogiser can therefore perform the same kinds of actions in the target as in the source without having to be overly concerned about the specific objects involved. Where there are structure-violating differences, the solver is going to have to adapt the source analogue in some way. Structure-violating differences prevent relevant operators from being applied and cause the analogy to break down.
5. There is (most of the time) a correlation between surface features and the underlying structure of problems. Relying on surface features to access what might be a relevant source problem is therefore a useful heuristic. Indeed, that seems to be how the human cognitive system has evolved to operate.
CHAPTER SIX Analogical problem solving
The importance that has been attached to the role of analogising in human thinking cannot be emphasised too strongly. Many artificial intelligence models that attempt to capture aspects of human thinking have some kind of mechanism for analogical mapping (e.g., Anderson, 1993; Carbonell, 1983; Gentner & Forbus, 1991a, b; Hofstadter, 1997; Holyoak & Thagard, 1989a; Kolodner, 1993). Anderson’s ACT-R and the various analogical models produced by Hofstadter and the Fluid Analogies Research Group, for example, consider analogising to be fundamental to human thinking and learning. Analogies in the form of metaphors and similes pervade language; see, for example: http://metaphor.uoregon.edu/metaphor.htm http://www.le.ac.uk/psychology/metaphor/metaphor.html Theories of the evolutionary origins of human thought regard the ability to see analogies between different domains of thought that were once sealed off from each other as the hallmark of human intelligence. Arguments from past cases in the legal and tax systems are used to interpret the law. This is within-domain analogising, often called case-based reasoning. Diagrammatic and pictorial representations of information play a similar role to analogy. They often allow one to abstract out the important relationships between concepts that might otherwise be hard to see. The map of the London Underground loses a lot of “surface information” such as accurate representations of distances and positions, and instead allows us to see relational information. Diagrams often perform the same function as analogies in that they highlight a structure by stripping off irrelevant surface features. This chapter and the next deal with the studies of analogical problem solving (APS) and thinking, and with the uses of analogy in textbooks. 
As with the studies of transfer in relatively well-defined problems discussed in Chapter 4, the main difficulties in APS are first accessing a relevant analogue, and second adapting and using it to solve a new problem or understand a new concept or situation. The second part of the chapter looks at teaching and explaining using analogies.
THE IMPORTANCE OF ANALOGISING
Mithen (1996) argues that the evolutionary development of the human mind involves the development of special-purpose “intelligences”. Alongside a general-purpose intelligence there arose: a Natural History intelligence, a Social intelligence, a Technical intelligence, and language growing out of Social intelligence. Furthermore, consciousness developed out of the need to have a theory of mind to deal with complex social relationships; this idea is based on those of Nicholas Humphrey (1984, 1992). These special-purpose intelligences were originally “informationally encapsulated”; that is, they were each “sealed off” from the others and so there was no cross talk between the different intelligences. In modern humans, however, there was a breakdown of the barriers between the intelligences. Blending combinations of intelligences led
to art, religion, and science. Indeed, the ability to see similarities and draw analogies between disparate domains has caused the explosion of thought that has led to human civilisation. Mithen uses three main analogies: a Swiss army knife represents multiple intelligences, in that the early hominid brain had a separate intellectual tool for different kinds of thought; a cathedral represents the mind, in that there are different chapels for different types of thought, each built more or less separately over time and whose walls in modern humans have been pierced to provide free access between them; and a play represents our prehistory, with different Acts standing for different stages in our evolution. To provide further examples of the importance that has been ascribed to analogical thought, I can do no better than quote Mithen himself (1996, pp. 153–154): …Jerry Fodor (1985, p. 4) finds the passion for the analogical to be a central feature of the distinctly non-modular central processes of the mind and […] Howard Gardner (Gardner, 1983, p. 279) believes that in the modern mind multiple intelligences function ‘together smoothly, even seamlessly in order to execute complex human activities’. […] Paul Rozin (1976, p. 262) concluded that the ‘hall mark for the evolution of intelligence…is that a capacity first appears in a narrow context and later becomes extended into other domains’ and Dan Sperber (1994, p. 61) had reached a similar idea with his notion of a metarepresentational module, the evolution of which would create no less than a ‘cultural explosion’. [Annette Karmiloff-Smith (1994, p. 706) argued that] the human mind ‘re-represents knowledge’, so that ‘knowledge thereby becomes applicable beyond the special-purpose goals for which it is normally used and representational links across different domains can be forged’, which is so similar to the notion of ‘mapping across different knowledge systems’ as proposed by Susan Carey and Elizabeth Spelke (1994, p. 
184), and the ideas of Margaret Boden (1994, p. 522) regarding how creativity arises from the ‘transformation of conceptual spaces’. STUDIES OF ANALOGICAL PROBLEM SOLVING “The essence of analogical thinking is the transfer of knowledge from one situation to another by a process of mapping—finding a set of one-to-one correspondences (often incomplete) between aspects of one body of information and aspects of another” (Gick & Holyoak, 1983, p. 2). Ah, but you have to find the right analogue first… In Chapter 5 you were presented with the Fortress problem, and I mentioned that you had encountered the solution before in Chapter 3. Most people are unlikely to have noticed the similarity between the Fortress problem and the Radiation problem in an earlier chapter. This is a difficulty facing students reading mathematics or science textbooks. They are often presented with exercise problems at the end of chapters, and there is often an implicit assumption by textbook writers that the poor student can remember and understand everything that has been presented and explained in earlier chapters. Textbooks contain examples and exercise problems within the same domain (and these will be dealt with in the next chapter). Much research, however, has been into the effects of analogising between different domains. Gick and Holyoak (1980, 1983) were interested in the effect of previous experience with an analogous problem on solving Duncker’s Radiation problem. They used various manipulations. Some subjects were given different solutions to the Fortress problem to find out what effect that would have on the solutions they gave for the Radiation problem. For example, when subjects were given a solution to the Fortress problem whereby the general attacked down an “open supply route” (an unmined road) they tended to suggest a solution to the Radiation problem involving sending rays down the oesophagus. 
If the general dug a tunnel, then more subjects suggested operating on the patient with the tumour. The solution involving
dividing the army into groups and converging simultaneously on the fortress is known as the “divide and converge” solution (or the “convergence” solution). Thus the type of solution presented in the early problem influenced the types of solution suggested for the later one. Another important point about Gick and Holyoak’s studies was that their subjects were often very poor at noticing an analogy and would only use one when they were given a hint to do so. Only about 10% of those in a control group who did not receive an analogy managed to solve the problem using the “divide and converge” solution. Of those who were given an analogy, only 30% used the Fortress problem analogue without being given a hint to do so; and between 75% and 80% used the analogy when given a hint. So, if you subtract the 10% who manage to solve the problem spontaneously, this means that only about 20% noticed that the two problems were similar. This is in line with the findings by Simon and Hayes (1976) and Reed et al. (1974) who found that their subjects were very poor at noticing that the well-defined problems with which they were presented were analogous. Although the Fortress and Radiation problems are not well-defined, they nevertheless have the same underlying solution structure. That is, they differ in their surface features, but the similarity lies in the underlying structural features of the problems. Gick and Holyoak have pointed out this structural similarity in Table 6.1. The first major obstacle to using relevant past experience to help solve a current problem is accessing the relevant past experience in the first place. Why should this be the case? Notice that the two “cover stories” differ. The Fortress problem is about military strategy and the Radiation problem involves a surgical procedure. They are from different domains so there is no real reason for us to connect the two.
It would be time consuming, not to say foolish, to try to access information from our knowledge of mediaeval history when attempting to solve a problem in thermodynamics. We are therefore not predisposed to seek an answer to a surgical problem in the domain of military strategy.
TABLE 6.1 Correspondences between two convergence problems and their schema
Fortress problem
  Initial state
    Goal: Use army to capture fortress
    Resources: Sufficiently large army
    Operators: Divide army, move army, attack with army
  Solution plan: Send small groups along multiple roads simultaneously
  Outcome: Fortress captured by army
Radiation problem
  Initial state
    Goal: Use rays to destroy tumour
    Resources: Sufficiently powerful rays
    Operators: Reduce ray intensity, move ray source, administer rays
    Constraints: Unable to administer high-intensity rays from one direction safely
  Solution plan: Administer low-intensity rays from multiple directions simultaneously
  Outcome: Tumour destroyed by rays
Convergence schema
  Initial state
    Goal: Use force to overcome a central target
    Resources: Sufficiently great force
    Operators: Reduce force intensity, move source of force, apply force
    Constraints: Unable to apply force along one path safely
  Solution plan: Apply weak force along multiple paths simultaneously
  Outcome: Central target overcome by force
Table from “Analogical problem solving” by Gick and Holyoak in Cognitive Psychology, 12, 1–38. © 1980 by Academic Press, reproduced by permission of the publisher.
The second major obstacle to using past experience is that of adapting the past experience to fit the current problem. Despite being told that the Fortress problem and the Radiation problem involved the same “divide and converge” solution strategy, there were about 20% of subjects who still failed to solve the Radiation problem. The difficulty here is adapting the Fortress problem to solve the Radiation
problem, and there are various reasons why people might fail to map one solution onto the other. These include the effects of general knowledge, imposing constraints on a solution that are not in the problem statement, and the fact that there is not always a clear mapping between objects in one problem and objects in the other: World knowledge. Various bits of general knowledge might prevent one from seeing how a solution might work. For example, a ray machine may be thought of as a large machine—you might expect one such machine in an operating theatre but not several. Where is the surgeon going to get hold of several ray machines? In Britain the National Health Service always seems short of cash, so how is the hospital where the surgeon works going to afford several ray machines? Furthermore, the solver may have a faulty mental model of how rays work. Is it possible to reduce the intensity of the rays? Someone once told me that he couldn’t imagine the effect of several weak rays focusing on a tumour. He wasn’t sure their effect would be cumulative. There could be many such sources of interference from other world knowledge on the ability to generate a solution. Imposing unnecessary constraints. In Chapter 3, I mentioned that solvers sometimes place constraints on problem solving that are not actually stated in the problem. For example, in the Nine Dots problem one might feel constrained to begin and end on a dot although no such constraint is imposed by the question. In the Radiation problem it says: “There is a kind of ray that can be used to destroy the tumour”. This might be taken to mean that there is only one ray machine (although it doesn’t explicitly say that). If there is only one machine then the Fortress solution will not work because several sources of weak rays are needed. Mapping elements. Some of the objects in both problems play the same roles in the solution structure. Others do not. 
TABLE 6.2 Mappings between the surface features (objects) in the Fortress and Radiation problems
     Source                               (maps to)   Target
Objects
  1  tyrant                                           tumour
  2  army                                             rays
  3  villages                                         healthy tissue
  4  roads                                            ?
  5  mines                                            ?
Relations
  6  divide (army, groups)                            divide (rays, ?)
  7  disperse (groups, ends_of_roads)                 ?
  8  converge (groups, fortress)                      converge (rays, tumour)
Table 6.2 shows where there are (relatively) obvious mappings and where the mappings are likely to break down. Matches can
be found for tyrant, army, and villages and for converging the groups simultaneously onto the fortress. However, look at 4, 5, 6, and 7 in Table 6.2. In 6 and 7 in particular, it is not obvious how one might go about dividing a ray machine or rays, or dispersing whatever results from the dividing around the tumour. Another point that Table 6.2 brings out is that to adapt a solution one often has to find the most appropriate level of abstraction. ACCESSING A SOURCE TO SOLVE A TARGET Figure 6.1 presents an “ideal” model of analogical problem solving. In the Figure there is a source problem (A) that is assumed to be in long-term memory. The memory also includes information about how to solve the problem—represented in the Figure by the line linking the A box to the B box. C in the figure is the current problem you have to solve. If problem A is to be of any use, then the solver has to know that A and C are the same type of problem. Accessing a relevant source problem means that there is some perceived similarity between the two problems; that is, A and C in the figure must be seen as being similar in some way. Once a relevant source is accessed, the solution procedure in the source problem (P) can then be applied to the new problem. As the solution procedure is applied, the objects in A are mapped onto the objects in C that appear to perform the same role (Chapter 5 dealt with this in more detail). The procedure in the source is adapted where necessary to fit the new problem, resulting in a slightly modified version of the original solution procedure (P′). For example, in Table 6.2, where objects and relations between objects are missing, new ones have to be inferred, such as reducing the intensity of the rays and either finding several ray machines or constructing some complicated system of mirrors. The reason Figure 6.1 represents an idealised process of analogical reasoning is because actual human behaviour may differ from this. 
For example, Heller (1979) found that people trying to solve proportional analogies of the type A:B :: C:D (A is to B as C is to D—see Chapter 5) made a variety of mappings, only a few of which could be called relevant. A solver might engage in a number of possible comparison processes between the A and B terms, the A and C terms, the B and D terms, the C and D terms, and so on. Some of Heller’s subjects also made comparisons that violated the structure of these problems. For example, they made irrelevant inferences and comparisons between the A and D terms, the B and C terms, the A-B-C terms, and so on (see Kahney, 1993, for a fuller discussion of Heller’s model). Similarly, in looking at subjects’ problem-solving performance using instructional texts, Kahney (1982) and Conway and Kahney (1987) found that the subjects would often look back to the solutions of previous problems, ignore the relation between the example problem statement and its solution, and instead try to map the exercise problem statement onto the example solution (the C term onto the B term).
Figure 6.1. Accessing and adapting a source problem to solve a target.
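The mapping and adaptation steps of this idealised model can be sketched in code. What follows is a minimal illustration rather than a model from the literature: the function name `transfer` and the dictionary names are hypothetical, the "obvious" matches restate the Objects rows of Table 6.2, and the "inferred" matches are an assumption that borrows wording from the Radiation solution plan in Table 6.1.

```python
# Illustrative sketch only; all names here are hypothetical.

# Source solution procedure P (Fortress problem), written as
# (relation, arguments) pairs taken from the Relations rows of Table 6.2.
SOURCE_PROCEDURE = [
    ("divide", ("army", "groups")),
    ("disperse", ("groups", "ends_of_roads")),
    ("converge", ("groups", "fortress")),
]

# Object matches that Table 6.2 treats as relatively obvious.
OBVIOUS_MAPPING = {
    "tyrant": "tumour",
    "army": "rays",
    "villages": "healthy tissue",
}

# Assumed extra correspondences that a solver would have to infer.
INFERRED_MAPPING = {
    "groups": "low-intensity rays",
    "ends_of_roads": "multiple directions",
    "fortress": "tumour",
}


def transfer(procedure, mapping):
    """Map each step of the source procedure onto the target.

    Arguments with no known counterpart become "?" and are reported as gaps.
    """
    adapted, gaps = [], set()
    for relation, args in procedure:
        new_args = []
        for arg in args:
            if arg in mapping:
                new_args.append(mapping[arg])
            else:
                new_args.append("?")
                gaps.add(arg)
        adapted.append((relation, tuple(new_args)))
    return adapted, sorted(gaps)


# First pass: only the obvious matches are available, so gaps remain.
first_pass, gaps = transfer(SOURCE_PROCEDURE, OBVIOUS_MAPPING)

# Second pass: the inferred correspondences fill the gaps, giving P'.
adapted_procedure, remaining = transfer(
    SOURCE_PROCEDURE, {**OBVIOUS_MAPPING, **INFERRED_MAPPING}
)
```

With only the obvious matches, every step of the procedure contains at least one "?", which corresponds to the point where the analogy is likely to break down. The second pass stands in for the adaptation from P to P′: nothing in the source procedure changes except the objects that fill its slots.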
GENERATING YOUR OWN ANALOGIES
Blanchette and Dunbar (2000) have tried to elucidate the paradox of why people tend to find it hard to retrieve a relevant source when presented with a target based on the source’s structural features. They argue that most laboratory experiments use a “reception paradigm” where subjects are given a source and a target by the experimenter. Using a “production paradigm” they got subjects to generate their own source analogues to a target. They found that subjects chose source analogues whose structural features matched those of the target (arguments about the desirability or otherwise of the Canadian government’s budget deficit). On the other hand, when they were put back into a reception paradigm and presented with sources that were either superficially similar or structurally similar, then a source analogue tended to be chosen based on superficial characteristics. Blanchette and Dunbar (2000, p. 109) argue that “the reception paradigm may constrain the search for structural relations and provide a picture of analogical reasoning that underestimates the subjects’ abilities to use deep structural features in the retrieval of source analogues. In real world contexts people generate their own analogies.” They also argue that one explanation for retrieval based on superficial features (despite being asked which of the source texts “might make a good analogy to this one”) might be that the original sources were encoded on the basis of superficial features (subjects were asked to evaluate the pleasantness of the text). Nevertheless, the study does not encompass situations where the target is a problem that subjects are not readily able to solve despite the fact that an analogy may have been presented earlier.
Principle cueing
Ross (1984, 1987, 1989b) discusses two possible scenarios in APS: the principle-cueing view and the example-analogy view.
In the principle-cueing view, learners may be reminded of an earlier example by some feature or combination of features of the current one. This reminding triggers or cues the abstract information or principle involved in the earlier problem which is relevant to the current one. In the case of Gick and Holyoak’s work, the principle here would be the divide and converge solution schema; in algebra
Figure 6.2. The relation between a problem and its solution involving an implicit schema.
Figure 6.3. Implicit schema cued by accessing source and used to solve target.
problems it might be an equation such as distance=rate×time. The principle thus accessed can then be used to make sense of a new situation or solve a new problem with the same structure. The role of the surface features of problems is to cue or access a possible relevant source problem. Holland, Holyoak, Nisbett, and Thagard (1986) refer to analogues as having an “implicit” schema which is reconstructed during the solution process. In Figure 6.2, A represents a problem statement and B the solution. The relation or set of relations between A and B is represented by the line linking them. If the problem is an instance of a category of problems then the solution procedure used to get from A to B can be applied to other problems of the same type. There is therefore a schema implicit in the solution that can be applied to a range of problems of the same type. This is shown as the S box in Figure 6.2. When a source problem is accessed (A and B in Figure 6.3) then the principle underlying the solution to the source is accessed (the S on the line linking A and B) and applied to the target (C) to generate the solution (D). In the principle-cueing view, when people are reminded of an analogy the remindings serve to categorise the current problem. When presented with a problem involving two boats on a river going at different speeds one might be reminded of an earlier problem (or problems) of the same type and hence categorise
the current problem as a riverboat problem, or a rate times time problem, or whatever. The studies by Gick and Holyoak have shown that only one presentation of a problem type is required for the solver to abstract out the underlying schema. This is probably true only when the underlying schema is relatively straightforward and easily understood. More complex concepts such as recursion, say (see Chapter 2 and Figure 4.1), would presumably require several examples before a schema would emerge that would help solve new examples. Whether the source problem be simple or complex, the abstract principle that the source exemplified has to be already understood by the learner, and the surface details in the source problem are no longer required to solve the target. The original source problem was nevertheless important in that it allowed the learner to understand the principle in question and how it is used. The principle-cueing view implies that solving another problem from an example involves abstracting out the principle or procedure from the example and applying it to the target. This smacks of abstraction mapping where an abstract principle such as an equation is mapped on to the target problem rather than the specific elements in the example (see page 124). In the present case, the abstraction is “hidden” or implicit within the example and has to be extracted before it is applied. Much of the literature on expert-novice differences has concentrated on how the correct perception of a problem can cue access to the “problem schema” (Chi, Glaser, & Rees, 1982; Larkin, 1978). This problem schema in turn suggests a straightforward, stereotypical solution method. Novices, however, are often unable to identify the problem schema or categorise problems accordingly. 
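The principle-cueing view described above can be made concrete with a small sketch. It is a toy illustration under stated assumptions, not a cognitive model from the literature: the stored example, its feature set, the overlap threshold, and the function names `distance` and `cue_principle` are all hypothetical. Surface features of the target cue a stored example, and the example in turn hands over its abstract principle (here distance = rate × time).

```python
# Illustrative sketch only; all names and feature sets are hypothetical.

def distance(rate, time):
    """The abstract principle: distance = rate x time."""
    return rate * time


# Memory holds one earlier example: its surface features plus the
# principle that it exemplified.
MEMORY = [
    {
        "name": "riverboat example",
        "features": {"boat", "river", "speed"},
        "principle": distance,
    },
]


def cue_principle(target_features, memory, threshold=1):
    """Return the principle of the stored example that shares the most
    surface features with the target, or None if nothing is cued."""
    best, best_overlap = None, 0
    for example in memory:
        overlap = len(example["features"] & target_features)
        if overlap > best_overlap:
            best, best_overlap = example, overlap
    if best is None or best_overlap < threshold:
        return None
    return best["principle"]


# A target that looks like the earlier example cues the principle...
cued = cue_principle({"boat", "river", "hours"}, MEMORY)
answer = cued(rate=12, time=3)

# ...but a structurally identical target with different surface features
# cues nothing, so the principle is never retrieved.
not_cued = cue_principle({"train", "station", "hours"}, MEMORY)
```

Once the principle has been cued, the surface details of the stored example play no further part in the calculation. The second call shows the cost of this arrangement: a target with the same structure but no shared surface features retrieves nothing.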
Furthermore, principle cueing by definition presupposes that the analogiser understands the principle in the first place, and how it can be instantiated in a particular problem (e.g., what values map onto the variables of an equation). For novices studying a new subject, that may not necessarily be the case. If the solver is trying to use a complex example as an analogy in a domain that is unfamiliar, it would be unwarranted to assume that the solver has a schema, implicit or otherwise, for a problem. There may be a schema implicit in the problem but there is no guarantee that it is represented in the mind of the solver. It is likely that principle cueing is limited to either relatively simple problems where a lot of prior general knowledge can be brought to bear, or to problems where the solver already has at least a partial schema for a problem type.
Using an example as an analogy
The view that the role of superficial features is simply to access a previous problem has been challenged by Ross in a series of experiments. According to the second view of analogising, the example-analogy view (Ross, 1987, p. 629):
the principle is understood only in terms of the earlier example. That is the principle and example are bound together. Thus even if learners were given the principle or formula, they would use the details of the earlier problem in figuring out how to apply that principle to the current problem.
Much of Ross’s work was concerned with the effects of superficial similarities in problem access and use. An example of Ross’s work is given in Study Box 6.1.
STUDY BOX 6.1 Effects of problem similarity on problem solving (Ross, 1987)
Rationale
6. ANALOGICAL PROBLEM SOLVING
In Ross (1987) the superficial similarity between example and test problems was varied in terms of the story line and the object correspondences (the extent to which similar objects in the source and target problems appeared to map onto one another). The correspondences between objects were either similar (similar objects played the same role in both problems), reversed (where the objects played different roles), or unrelated to the study problem (the objects in the cover stories were different). (Notice that the general structure of the problem is similar to the study described in Gentner & Toupin (1986) in Chapter 5—see Study Box 5.1—although the aims of the two experiments were different.)
Methodology
Table 6.3 summarises the conditions used. The problems were probability problems with various story lines such as IBM mechanics choosing what company car to work on. In the same/same condition there were only minor superficial changes to the problem. The underlying solution structure remained the same. In the same/reversed condition it was the IBM salespeople who chose which mechanics should work on their cars, so the same objects were used in the study and test problems but the roles they played were reversed. The same/unrelated condition involved computers and offices in an IBM building. The unrelated/unrelated condition involved ticket sales for a high-school athletic team whose objects (teams and teachers) were unrelated to the example problem.
TABLE 6.3 Study-test relations in Ross (1987)
                       Study-test relation
Condition              Story line   Objects     Correspondence
same/same              same         same        same
same/reversed          same         same        reversed
same/unrelated         same         unrelated   unrelated
unrelated/unrelated    unrelated    unrelated   unrelated
Results
Ross found that the ability of the subjects to use the relevant formula, even when the formula was given to them, still depended on the superficial similarity of the problems. The similarity between objects in the problems with the same story line was used to instantiate the formula, so that the objects were assigned to the same variable roles as in the example. Thus, in the same/same condition (e.g. where mechanics chose the cars in the example and in the test), performance was higher than for the unrelated group. If the object correspondences were reversed, the same/reversed condition, then performance was lower than in the unrelated condition. Where it was difficult to tell which formula to use, the superficial similarity of problems with the same underlying structure led to the best performance.
When trying to make an analogy between two problems without an adequate representation of the problem structure, the usual means of instantiating variables (plugging specific values into variable slots) through an understanding of what they represent is very difficult. Without that understanding, novices can only rely on superficial similarities. This means that, even when learners are provided with a formula at test, they will still make use of an earlier example in which the principle is incorporated in order to solve the current problem. Ross’s results are therefore at odds with those one would expect from a principle-cueing view in which the example plays no role other than as an instantiation of a schema or principle which is either already known or readily induced.
EXPOSITORY ANALOGIES
Instructing with analogies refers to “the presentation of analogical information that can be used in the form of analogical reasoning in learning” (Simons, 1984, p. 513). Another aim of using analogies in texts is to make the prose more interesting. Now, it may be that these two aims are sometimes in conflict. Not only that, but there is always a point at which analogies break down. As a result writers have to be very careful about the analogies they use. In expository texts writers make much use of analogies to explain new concepts, and they are often quite effective in helping students understand them. The flow of water, for instance, is traditionally used to explain the flow of current in electricity. Indeed, “flow” and “current” applied to electricity derive from that analogy. Rutherford made the structure of the atom more comprehensible by drawing an analogy to the structure of the solar system. When analogies are used in a pedagogical context, their aim is often to allow the student to abstract out the shared underlying structure between the analogy and the new concept or procedure the student has to learn. Writers hope that it can thereafter be used to solve new problems or understand new concepts involving that shared structure. The kind of analogy people develop or are told can have a strong influence on their subsequent thinking. If you think that some countries in the Far East stand like a row of dominoes, then when one falls under Communist control the rest will fall inevitably one after the other—that’s what happens to rows of dominoes. If you think that Saddam Hussein is like Hitler and that the invasion of Kuwait was like the invasion of Poland in the Second World War, then this will influence your response to the invasion and to your dealings with Iraq. 
If your theory of psychosexual energy (Freud) or of animal behaviour (Tinbergen) takes hydraulics as an analogy, then you will naturally see aspects of human behaviour as the results of pressure building up in one place and finding release in another.
Flowing waters or teeming crowds: Mental models of electricity
Our understanding of physical systems is often built on analogy. There are, for example, two useful analogies of electricity: one that involves the flow of water and one that involves the movement of people. These analogies can either help or hinder one’s understanding of the flow of electricity. Gentner and Gentner (1983) used these two analogies to examine their effects on how the flow of electricity is understood. The analogy with a plumbing system can be quite useful in understanding electricity. Several aspects of plumbing systems map onto electrical systems (see Figures 6.4 and 6.5 and Table 6.4). Gentner and Gentner found that subjects who were given an analogy with flowing water were more likely to make inferences about the effects of the flow of current in batteries that were in series and in parallel than those subjects who were given a moving-people analogy. However, when the problem concerned resistors, more subjects who had been given the moving-people analogy were able to infer the effects of the resistors than those who had been given the flowing-water analogy. Those who were reasoning from a flowing-water analogy believed the flow would be restricted no matter how the resistors were arranged. People using the analogy with people moving through turnstiles had a better mental model of the flow of electricity through the resistors.
TABLE 6.4 Mappings between the source domains of water and crowds and the target domain of electricity
Target      Source: flowing water      Source: teeming crowds
resistor    narrow section of pipe     turnstile
current     rate of flow of water      movement of people
voltage     pressure                   number of people
battery     reservoir                  ?
Figure 6.4. Effects of the analogy between batteries and water reservoirs.
Figure 6.5. Batteries and resistors arranged in series and in parallel.
Issing, Hannemann, and Haack (1989) performed an experiment in which they examined the effects of different types of representation on transfer. The representations were pictorial analogies of the functioning of a transistor. They presented subjects with an expository text alone, the text plus a sluice analogy, the text plus a human analogy, and finally the text plus an electronics diagram of the transistor. Issing et al. found that the sluice analogy (involving the flow of water) led to better understanding of the function of the transistor. The human analogy was less effective and the diagram and text alone were the least effective. Issing et al. argue that analogies depicting human-like situations are regarded as artificial and take on more of a motivating than a cognitive function. This, they say, explains why the human analogy fails to be as effective as the flow of water. This is, perhaps, a strange conclusion given that they are taking into account Gentner’s view of structure-mapping (see Chapter 5). A more likely reason is that the water analogy shares more higher-order relations with the operation of transistors than the human analogy, and this would account for the stronger effect of the sluice analogy.
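The resistor finding reported by Gentner and Gentner can be made concrete with a little arithmetic. The sketch below is only an illustration: it uses the standard circuit rules for resistors in series and in parallel together with Ohm's law, and the function names and the values (a 12-volt battery, two 10-ohm resistors) are assumptions chosen for the example, not materials from the original study.

```python
# Illustrative sketch only: standard series/parallel rules and Ohm's law.
# The names and numbers here are hypothetical, not taken from Gentner and Gentner (1983).

def series_resistance(resistances):
    """Resistors one after the other: the resistances add up."""
    return sum(resistances)


def parallel_resistance(resistances):
    """Resistors side by side: the reciprocals of the resistances add up."""
    return 1.0 / sum(1.0 / r for r in resistances)


def current(voltage, resistance):
    """Ohm's law: current = voltage / resistance."""
    return voltage / resistance


VOLTAGE = 12.0   # hypothetical battery, in volts
R = 10.0         # hypothetical resistor, in ohms

single = current(VOLTAGE, R)
in_series = current(VOLTAGE, series_resistance([R, R]))
in_parallel = current(VOLTAGE, parallel_resistance([R, R]))

print(f"one resistor:      {single:.2f} A")
print(f"two in series:     {in_series:.2f} A")
print(f"two in parallel:   {in_parallel:.2f} A")
```

Adding a second resistor in series lowers the current, but adding it in parallel raises the current above that for a single resistor. A solver who reasons that any narrow pipe must restrict the flow will miss the second case, whereas a second turnstile beside the first plainly lets more people through.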
Donnelly and McDaniel (1993) performed a series of experiments on the effects of analogies on learning scientific concepts. Subjects were given either a literal version of the scientific concept, an analogy version, or a familiar-domain version (their Experiment 4). Three examples are given in Table 6.5. The study showed several important features of the effectiveness of analogies in teaching. First, the analogies helped subjects answer inference questions but did not help with the recall of facts in the target domain. One can understand this in terms of the goals of learning: providing an analogy obliges the learner to look at how the structure of the analogy can help their understanding of the new concept. Providing just a literal version obliges the learner to concentrate more on the surface features. To put it yet another way, people will tend to concentrate on the surface features of new concepts (or situations or problems) unless given a way of reanalysing the concept (or situation or problem).
TABLE 6.5 Different ways new concepts were presented in Donnelly and McDaniel (1993)
Literal version
Collapsing Stars. Collapsing stars spin faster and faster as they fold in on themselves and their size decreases. This phenomenon of spinning faster as the star’s size shrinks occurs because of a principle called “conservation of angular momentum”.
Analogy version
Collapsing Stars. Collapsing stars spin faster and faster as their size shrinks. Stars are thus like ice skaters, who pirouette faster as they pull in their arms. Both stars and skaters operate by a principle called “conservation of angular momentum”.
Familiar-domain version
Imagine how an ice skater pirouettes faster as he or she pulls in his or her arms. This ice skater is operating by a principle called “conservation of angular movement” (p. 987).
Second, the analogies helped novices but not advanced learners. If the concept being tested is in a domain that is already known, then the analogy serves no useful purpose. One can therefore solve problems using one’s pre-existing domain-relevant knowledge (see also Novick & Holyoak, 1991). Analogies work for novices because they provide an anchor, or advance organiser (Ausubel, 1968). That is, novices can use a domain they know to make inferences about a domain they do not know. Third, Donnelly and McDaniel suggest that subjects were able to argue (make inferences) from the analogy prior to the induction of a schema (that is, before a schema has been formed from experience of individual instances). In other words, there was no integration of the source analogy with the target. This emphasises that schema induction is a by-product of analogising and that schemas are built up gradually (depending on the complexity of the domain).
Giora’s study
Giora (1993) has argued that analogies in expository texts are superfluous and tend to impede understanding and recall of texts. In a series of experiments she gave subjects a number of passages and asked them at the end “What is this passage about?” Some of the passages contained analogies (see Table 6.6). Giora found that the inclusion of an analogy impaired both recall of facts and understanding of the passage. In fact, the longer the analogy the worse the subjects’ performance became. These results appear to go against other studies that seem to demonstrate the power of analogies in influencing thinking. Giora argues that “Analogy, an example from a distant domain, is a digression from
relevance and will therefore require more processing” (p. 597). She also argues that “a well-formed informative text is easy to understand since it contains enough redundancy to be easily assigned meaning and its ordering [the structure of the argument] is indicative of ‘importance,’ which I would rather term informativeness” (p. 596). In other words, a well-written text does not need analogies.
TABLE 6.6 Examples of texts with and without an analogy (from Giora, 1993)
Literal version
It has often occurred in the history of science that an important discovery was come upon by chance. A scientist looking into one matter, unexpectedly came upon another which was far more important than the one he was looking into. Penicillin is a result of such a discovery.
Analogy version
It has often occurred in the history of science that an important discovery was come upon by chance. A scientist looking into one matter, unexpectedly came upon another which was far more important than the one he was looking into. Such scientists resemble Saul, who, while looking for donkeys, found a kingdom. Penicillin is a result of such a discovery (Giora, 1993, p. 592).
If one compares the texts in the Donnelly and McDaniel study with the texts used in Giora’s study then we might begin to see why there are differences in interpreting the role of analogies. The stories and analogies Giora used were rather simple. A concept such as the fact that many discoveries in the history of science have been accidental is simpler than concepts such as “conservation of angular momentum”. The subjects in Giora’s experiments did not need to use the analogies to understand the texts. They were indeed digressions. “Conservation of angular momentum”, on the other hand, is not a simple concept to understand when it is encountered for the first time and so an analogy would be useful in helping people understand it.
The influence of far and near analogies
The studies so far have shown that distant analogies can have an influence on the way we reason about a new concept. Presumably, then, close analogies would have an even greater effect on understanding because there would be more shared surface features between the concept to be learned and the analogue as well as shared underlying structural features. This assumption was tested by Halpern, Hanson, and Riefer (1990). They gave subjects booklets to study containing passages on the lymphatic system, electricity, and enzymes. The booklets contained combinations of near, far, and no analogies. For example, the far-domain analogy with the lymph system involved the movement of water through a sponge; the near domain was a comparison with blood through the veins. Halpern et al. found that the far-domain analogies produced greater recall of the target texts, both immediately and after a week, and subjects were able to answer more inference questions than in the near-domain and no-analogy conditions (see also Iding, 1993, 1997). Halpern et al. argued that the far-domain analogies led to the subjects putting more effort into mapping the structure of the analogy onto the target domain. 
It was this “effort after meaning” (Bartlett, 1932), or the encoding of the information in the source and target at a “deeper” level, that produced the strong results from the far analogies. Generally speaking, when far domains are used in teaching texts they are more likely to enhance understanding than near domains, because the far domains tend to be ones that the learner already knows and understands. Textbook writers choose them for that very reason. If the domain is a new one and the material is difficult then a near-domain analogy is also likely to be difficult and hence is of little use to analogise from. This is summed up in Figure 6.6.
Figure 6.6. Near and far analogies and their likely effect on understanding.
Influencing thought
It is obvious from the preceding discussion that analogies can play an important role in influencing thinking. One way that analogies can operate is by activating a schema that in turn influences the way we think about a current situation. Even relatively simple metaphors such as a “flood” of asylum seekers may bring to mind something uncontrollable and damaging. It is certainly more emotive than simply saying “a large number” of asylum seekers. Genetically modified (GM) food was labelled “Frankenstein food” by one British newspaper. The phrase probably helped influence the direction in which the debate about GM food in Britain was going at the time. Similarly, Keane (1997) points out that using a war analogy when discussing drug pushing is likely to bring to mind “solutions that are based on police action and penal legislation rather than solutions that involve the funding of treatment centres or better education (an illness-of-society analogy might have the opposite effect)” (p. 946). We have to be extremely careful about the nature of the analogies we present when demonstrating something, as they can have a profound effect on thinking about the current situation. Before we go on to look at possible problems with using analogies to teach with, here is one last example about how analogies can be used to affect people’s thinking (Halpern, 1996, pp. 84–85):
We frequently use analogies to persuade someone that X is analogous to Y, therefore what is true for X is also true for Y. A good example of this sort of “reasoning by analogy” was presented by Bransford, Arbitman-Smith, Stein, & Vye (1985). They told about a legal trial that was described in the book, Till death us do part (Bugliosi, 1978). Much of the evidence presented at the trial was circumstantial. The attorney for the defense argued that the evidence was like a chain, and like a chain, it was only as strong as its weakest link. 
He went on to argue that there were several weak links in the evidence: therefore, the jurors should not convict the accused. The prosecutor also used an analogy to make his point. He argued that the evidence was like a rope made of many independent strands. Several strands can be weak and break, and you will still have a strong rope. Similarly, even though some of the evidence was weak, there was still enough strong evidence to convict the accused. (The prosecutor won.)
Figure 6.7. Hawking’s analogy of the universe with the earth (adapted from Hawking, 1988, p. 138).
“AESTHETIC” ANALOGIES
A second way in which analogies are used in texts is as aesthetic devices to make the prose more interesting. However, often the boundaries between the two uses are a little unclear. Here are some examples from developmental psychology, evolution and cosmology:
…emotion is a Cinderella of cognitive development. I won’t pretend to be her fairy godmother, but I can bring together some of the recent work which might be put together into the magic coach that will take her to the ball (Meadows, 1993, pp. 356–357).
My ‘river’ is a river of DNA, flowing and branching through geological time, and the metaphor of steep banks confining each species’ genetic games turns out to be a surprisingly powerful and helpful explanatory device (Dawkins, 1995, p. xii).
Concepts such as the fact that time ceases to exist when we go back to the beginning of the universe are hard to grasp. “Before” and “after” seem so natural that we can’t readily imagine what it means that there was no “before” the Big Bang. It is a consequence of Einstein’s theory of relativity that matter and time are intimately bound up and that both were created together. How might one therefore make such concepts easier to grasp? Stephen Hawking has tried by presenting an analogy with the earth (Figure 6.7). Nevertheless, there is always a point at which analogies break down. Structure-mapping theory argues that lower-order relations are ignored when making an analogy. For example, Rutherford made the analogy between the structure of the atom and the structure of the solar system. In Gentner’s structure-mapping theory the size of the sun and the fact that it is very hot are irrelevant to the analogy and play no part in it. However, sometimes there are higher-order relations that are ignored when making an analogy. Compare Hawking’s analogy in Figure 6.7 with these comments by Darling (1996, p. 49):
One of the most specious analogies that cosmologists have come up with is between the origin of the Universe and the North Pole. Just as there is nothing north of the North Pole, so there is nothing before the Big Bang. Voilà! We are supposed to be convinced by that, especially since it was Stephen Hawking who dreamt it up. But it will not do. The Earth did not grow from its North Pole. There was not ever a disembodied point from which the material of the planet sprang. The North Pole only exists because the Earth exists—not the other way round.
The moral of the story is: you can only take an analogy so far. There is always a point at which an analogy breaks down. In the next chapter we will look at what makes problems or situations similar, and just what aspects can and cannot reasonably be mapped.
SUMMARY
1. Analogising has been regarded as a fundamental property of human thinking.
2. Studies of analogical problem solving (APS) have shown that retrieving a relevant analogy is not often a spontaneous process. The context or domain in which a situation is embedded can profoundly affect the likelihood that it will be retrieved in a different context later.
3. Holyoak has listed in various places the processes assumed to be involved in APS (see Information Box 6.1).
INFORMATION BOX 6.1 The processes involved in analogical problem solving
Holyoak has listed several steps involved in analogical problem solving*. Each step depends on the previous ones although there is likely to be some iteration (i.e., the analogiser may have to go through some stages more than once) particularly in stages 3 and 4. They are:
1. Forming a representation of the source problem and the target problem.
2. Accessing “a plausibly relevant analogue in memory” (Holyoak & Thagard, 1989b). This depends on the form of representation the solver has of the problems. If solvers are to access a previously encountered problem from memory, then the representation they form of the target must match that of the source. This point begs the question of what constitutes problem similarity.
3. Mapping across corresponding elements in the source and target. Correspondences are features that appear to play the same role in the source and target. Narrow sections of pipe restricting the flow of water have to be seen as corresponding to resistors in an electric circuit before an analogy between the flow of water and the flow of electricity in complex systems can be understood.
4. Adapting the mapping, constrained by the shared underlying structure of the problems, to generate a solution in the target. This step is also known as analogical inference or transfer. Knowing what happens to water when it passes through two narrow sections of pipe one after the other should allow students to infer what will happen when electricity has to pass through two resistors arranged in series.
5. Learning. This takes the form of schema abstraction from comparing examples (either implicitly or explicitly). With experience in solving a type of problem, students abstract out the underlying structure and can thereafter apply it to new versions of the problem type without having to refer to a previous solution.
Mapping and adapting are not regarded as being strictly sequential. Mapping “can be conducted in a hierarchical manner, the process may be iterated at different levels of abstraction” (Holyoak, 1985).
Analogical problem solving assumes that the solver has a representation of the source in LTM, and that the underlying structure of the problem is understood well enough so that the solver can map it across and generate inferences in the target. *Holyoak lists different steps at different times. The five presented here are derived from different sources (Catrambone & Holyoak, 1989; Holyoak, 1984, 1985; Holyoak & Thagard, 1989a, b).
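The mapping and adapting steps in Information Box 6.1 can be sketched very roughly in code. What follows is a toy illustration rather than a model of structure mapping: the dictionary reuses the water-to-electricity correspondences of Table 6.4, while the `transfer` function, the tuple format for relations, and the example fact are all hypothetical devices invented for this sketch.

```python
# Toy sketch of "mapping" and "adapting" (analogical inference).
# The correspondences come from Table 6.4; everything else is hypothetical.

WATER_TO_ELECTRICITY = {
    "narrow section of pipe": "resistor",
    "rate of flow of water": "current",
    "pressure": "voltage",
    "reservoir": "battery",
}


def transfer(statement, mapping):
    """Carry a relation known in the source domain across to the target.

    A statement is a (relation, arguments) pair. The relation itself is
    kept unchanged; each argument is replaced by its target counterpart
    if one exists, and is left as it is otherwise.
    """
    relation, arguments = statement
    return (relation, tuple(mapping.get(arg, arg) for arg in arguments))


# Something the solver already knows about plumbing (the source).
source_fact = ("reduces", ("narrow section of pipe", "rate of flow of water"))

# The inference generated for electricity (the target).
target_inference = transfer(source_fact, WATER_TO_ELECTRICITY)
print(target_inference)
```

The sketch keeps the relation fixed and swaps only the objects, which is the sense in which the shared structure constrains the inference. It also shows the limitation noted in the box: nothing can be transferred unless the solver already holds a usable representation of the source and of the correspondences.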
4. Schunn and Dunbar have argued that the reason people seem to be poor at analogising is due to the nature of the experiment. When asked to generate their own analogies, subjects produced structurally appropriate ones.
5. The end result of the comparison of analogous problems is learning that takes the form of schema induction. Learning from examples is considered in more depth in Chapter 8.
6. Expository analogies are ones that are used to help learners understand new material. They provide an anchor or bridge between what the novice already knows and the new material that has to be understood. The nature of whatever analogy is used in teaching is important as it is likely to influence the way the new material is understood; and, indeed, the way it is likely to be misunderstood.
7. Analogies in texts have an effect when the material to be learned is difficult. When the material is already easily understood, analogies have little impact. Analogies can also have a purely aesthetic effect in this context—they liven up the text rather than add to it.
CHAPTER SEVEN Textbook problem solving
I once attempted to learn the rudiments of the programming language Prolog from textbooks. At one point I found myself reading a section on recursion. The section attempted to explain the concept of recursion and presented about half a dozen examples. As I read the section I felt that the concept was quite well explained and that the examples were readily understandable. At the end of the section there were some exercise problems and so I got a sheet of paper and had a go at the first one. I quickly discovered that I hadn’t a clue where to begin to solve the problem. Now a couple of reasons might be adduced for this state of affairs. My first, fortunately fleeting, thought was that I just couldn’t “do” Prolog; I wasn’t cut out to be a computer programmer. (This thought could lead to what Bransford and Stein, 1993, called a “mental escape”: when presented with a problem from a particular domain the solver exclaims “Oh no, I can’t do algebra” or “geometry” or “statistics” or whatever. The solver gives up before even reading the problem because of a kind of learned helplessness.) My second thought was that I had simply missed something when I had read the section. So I went back over the worked examples looking for one that looked similar to the exercise problem I was trying to solve. I couldn’t find one. I still had no idea how to solve the very first exercise problem, so I did what most people probably do in such circumstances—I looked up the answer at the back of the book. This was revealing—not only because it told me the answer, but also because it showed me that the section I had just read had not, in fact, told me how to solve this particular exercise problem. If I had gained a deep understanding of the concept of recursion from reading the
ACTIVITY 7.1 Have you ever had to do exercise problems in textbooks? Have you ever been stuck on one? Who did you “blame”?
section then perhaps I could have solved it, and presumably the author believed that he had provided enough of an explanation to produce just such a deep understanding. Nevertheless I would venture to suggest that students reading textbooks in domains they are unfamiliar with do not always understand things particularly well on a first reading. It is hard to imagine a modern culture where most of its teaching is not done through textbooks or an electronic equivalent. It is equally hard to imagine anyone not getting stuck on exercise problems, or getting them wrong, at least some of the time. In Activity 7.1, I would hazard a guess that your answer to question 3 was you yourself. It is a reasonable bet that most people would tend to “blame” themselves (if they thought
7. TEXTBOOK PROBLEM SOLVING
about it) for failing to solve a textbook problem. We tend to take it on faith that the textbook writer has provided all the information we need to solve exercise problems. This may well not be the case for a number of reasons, some of which we will look at in the next section. One way of examining the difficulties facing students engaged in textbook problem solving is by looking at the processes involved in using examples to solve problems. There is an important distinction between the kinds of analogical problem solving discussed in the last two chapters and the use of an example as an analogy in a textbook. So far, we have concentrated on analogy as the transfer or mapping of knowledge from a familiar domain onto a less familiar one. In the case of metaphors, the transfer is often from one familiar domain to another familiar one. In textbook examples, on the other hand, the student is trying to use an example from an unfamiliar domain to solve a problem in the same unfamiliar domain. Analogical reasoning works when you can reason from a domain that you understand well to solve a present problem that is puzzling. In textbook problem solving the student is presented with examples to use as analogies to solve exercise problems. The big difference here is that the example and the exercise problem are both in the same domain, and the student (who is presumably a novice) does not yet understand the domain, otherwise the student would not be a novice. This, in a nutshell, is what makes problem solving from textbook examples difficult.
DIFFICULTIES FACING TEXTBOOK WRITERS
Textbook writers face a number of difficulties when writing textbooks. To make the task a little easier they have to make some assumptions about the reader. 
Assumed prior knowledge
If a textbook is aimed at a readership that has presumably reached a certain level of competence in a specific domain (for example, by passing exams in that domain), then the writer has a good idea of the prior knowledge of the readers. If the textbook is aimed at a more general readership, then there are likely to be parts that are better understood by some readers than by others and conversely some parts that are likely to be well-known to some and completely new to others. The Lisp textbook by Winston and Horn (1989), for example, presents an example of a recursive function in Lisp that computes the Fibonacci series (p. 73). If you already know what the Fibonacci series is, then you may have little problem understanding what the function is trying to do. If you don’t, then fortunately the textbook spends half a page explaining what it is. Having to explain what the problem statement means before explaining what the solution is can present an added level of difficulty for the reader.
Assumed recall of already presented information
The writer has to make assumptions about how much the reader remembers from previous chapters. When a problem is presented in chapter 5, say, it would probably be foolish to suppose that everything presented in chapters 1 to 4 will be readily recalled. In a study into novice programmers, one of Anderson, Farrell, and Sauers’ (1984) subjects had to be reminded of an earlier example from a previous chapter (problem 2–5 in chapter 2 of the Winston and Horn textbook) before she could go on and solve the problem she was working on (problem 3–1 in chapter 3).
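For readers who have not met it, a recursive function for the Fibonacci series of the kind mentioned above can be sketched as follows. This is not Winston and Horn's function: it is written in Python rather than Lisp, and both the function name and the convention that the series starts 0, 1 are assumptions made for the illustration.

```python
# Illustrative sketch only: a recursive Fibonacci function in Python.
# This is not the Lisp function from Winston and Horn (1989); the name and
# the starting convention (fibonacci(0) = 0, fibonacci(1) = 1) are assumptions.

def fibonacci(n):
    """Return the n-th number in the Fibonacci series."""
    if n < 0:
        raise ValueError("n must not be negative")
    if n < 2:
        # Base cases: these stop the recursion.
        return n
    # Recursive case: each number is the sum of the two before it.
    return fibonacci(n - 1) + fibonacci(n - 2)


first_ten = [fibonacci(i) for i in range(10)]
print(first_ten)
```

Even in so short a function the reader has to hold two things in mind at once: what the series is (the problem statement) and how a function can be defined in terms of itself (the solution method). A reader who lacks the first is unlikely to follow the second.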
Assumed understanding
Another assumption that it would be unwise to make is that all the material from earlier chapters or the current one has been understood. Several studies have shown that learners do not always have a clear idea of how much they understand from a textbook (Chi et al., 1989, 1994; Ferguson-Hessler & de Jong, 1990; VanLehn, Jones, & Chi, 1992). In analysing the study processes of students studying physics texts, Ferguson-Hessler and de Jong, for example, found that “poor” students said “everything is clear” three times more often than good students, whereas their performance showed that this was not the case. They also found that poor performers processed declarative knowledge more than good performers, who concentrated more on situational and procedural information. The finding that poorer students felt that everything was clear more often than “better” students can be explained in terms of Hiebert and Lefèvre’s (1986) distinction between primary understanding and reflective understanding. Primary understanding occurs when the student understands a new domain at a surface level; that is, at the same level of abstractness as, or at a less abstract level than, the information being presented. This type of understanding is highly context-specific. The specific examples presented seem to be clear but the student is unlikely to see how the examples can be adapted or applied to another problem. Reflective understanding is at a more abstract level when students recognise the deeper structural features of problems and can relate them to previous knowledge.
Assumed schematic knowledge of the structure of teaching texts
Writers need to ensure that they do not violate the student’s schema for what the layout of a scientific textbook should look like. 
Students are likely to have expectations about how textbooks are laid out in formal domains such as mathematics, science, and computer programming because such textbooks tend to have a stereotypical layout (Beck & McKeown, 1989; Kieras, 1985; Robertson & Kahney, 1996; Sweller & Cooper, 1985). With experience of such textbooks, students come to develop a schema for that type of text. Such a schema includes the default assumptions that solutions follow statements of the problem rather than vice versa, and that a particular section of a textbook will give them enough information to solve exercise problems at the end of that section. However, it may be the case that textbooks are not structured that way (see Britton, Van Dusen, Gulgoz, & Glynn, 1989).
Assumptions about generalisation
Another difficulty facing writers is whether to present close variants of the problem type or a range of variants (see Figure 7.1). Principles, concepts, how to generate an equation, etc., can be understood better by presenting a concrete example (such as Example 1 in Figure 7.1). This concrete example often acts as an exemplar or paradigm representing the problem type. Table 7.1 shows some concrete examples. However, it may be hard for the reader to recognise whether a concept, principle, or solution procedure is relevant or applicable based on one example alone. Often, therefore, a range of examples is presented. If this range of examples is composed mainly of close variants of the exemplar, then the reader might be better able to abstract out the commonalities between them and hence be better able to understand the concept, principle, or solution procedure, and to automate the procedure for solving a subset of such problems. This, however, would be at the expense of demonstrating the range of applicability of the concept, etc. (Cooper & Sweller, 1987). If students are expected to solve distant variants of a problem, writers have to provide explicit information about the relationship between source examples and other problems of the same type (Conway & Kahney, 1987; Reed et al., 1974). As examples provide information about a category of problems, the more information about the features of that category that is given to the reader the better.

Figure 7.1. Is a category of problems best explained by presenting a number of close variants or a range of examples including distant variants?

TABLE 7.1
Close and distant variants of different problem types

Domain: Prolog
Problem category: Recursion
Paradigm: has_flu(bill). has_flu(X) :- kisses(X, Y), has_flu(Y).
Close variant: on_top(floor). on_top(A) :- above(A, B), on_top(B).
Distant variant: member(X, [X|Tail]). member(X, [Head|Tail]) :- member(X, Tail).

Domain: French grammar
Problem category: Agreement of past participle with preceding direct object
Paradigm: Je l’ai achetée hier.
Close variant: Je les ai achetés la semaine dernière.
Distant variant: C’est bien la voiture que j’ai vue.

Domain: Algebra
Problem category: Rate1×Time1=Rate2×Time2
Paradigm: A car travelling at a speed of 30 mph left a certain point at 10.00 a.m. At 11.30 a.m., another car departed from the same place at 40 mph and travelled the same route. In how many hours will the second car overtake the first?
Close variant: A car travels south at the rate of 30 mph. Two hours later, a second car leaves to overtake the first car, using the same route and going 45 mph. In how many hours will the second car overtake the first car?
Distant variant: A pick-up truck leaves 3 hours after a large delivery truck but overtakes it by travelling 15 mph faster. If it takes the pick-up truck 7 hours to reach the delivery truck, find the rate of each vehicle.

THE ROLE OF EXAMPLES IN TEXTBOOKS
Why bother with examples at all? Isn’t a textual explanation sufficient? If people are asked to perform a complex procedure, then the best method of teaching it is by demonstration. Once people have read the text
in a subject such as physics, mathematics, or computer programming, they tend not to re-read it if they can help it. Instead they concentrate on worked-out examples if they are trying to solve exercise problems at the end of a section. Ross (1989a) has referred to examples as “potent teachers” and there is a lot of evidence suggesting that they are more important for problem solving than the rest of the text. This phenomenon has been commented on in a number of studies (LeFèvre, 1987). For example, Pirolli (1991) states: “When a learner is faced with novel goals, the preferred method of problem solving involves the use of example solutions as analogies for the target solution” (p. 209). VanLehn (1990) also argues that concrete examples of problem solving are more important than the textual explanations that accompany them. Study Box 7.1 contains some more specific examples. VanLehn (1986) has referred to a “folk model” of how students learn from textbook examples and explanations. The explanations are seen as the important aspect of instruction and are assumed to be adequate for students to learn procedures for solving problems. In contrast he argues that explanations serve as guides to help students make more accurate inductions from examples. So why are examples so important for learning? According to Ross (1989b) being reminded of an earlier example can have four possible effects which amount to different roles played by worked-out examples. First, it may allow the learner to remember the details of a solution procedure rather than an abstract principle or rule (such as an equation). “The memory for what was done last time is highly interconnected and redundant, allowing the learner to piece it together without remembering separately each part and its position in the sequence” (Ross, 1989b, p. 439). Second, even if the learner can remember the rule or principle that is supposed to be applied to a problem, the learner may not know how to apply it. 
Activity 7.2 gives some indication of what this means. For novices in algebra, being told the principle (the relevant equation) underlying the problem may be of no use because they do not necessarily know how to instantiate the variables. The novices still have to make a number of inferences based on domain knowledge before they can solve the problem.
STUDY BOX 7.1 LeFevre and Dixon (1986) found that students learning a procedural task prefer to use examples as a source of information and that written instructions tend to be ignored. VanLehn (1986, 1990) has built a theory of children’s errors on the evidence he has gleaned that people prefer to use examples rather than written explanations. VanLehn (1986) has estimated that some 85% of children’s systematic errors are due to misunderstanding textbook explanations of problems. Pirolli (1991; Pirolli & Anderson, 1985) found that novice programmers relied heavily on examples rather than instructions to help solve LISP recursion problems. Carroll, Smith-Kerker, Ford, and Mazur-Rimetz (1987–1988) redesigned computer training manuals partly to take account of the fact that learners are put off by the “verbiage” in traditional training manuals.
ACTIVITY 7.2
Car A leaves a certain place at 10 a.m. travelling at 40 mph, and car B leaves at 11.30 a.m. travelling at 55 mph. How long does it take car B to overtake car A? The equation to use is:
RateCarA × TimeCarA = RateCarB × TimeCarB
What figures would you use to replace the variables in the equation?
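One way to see what instantiating the variables involves is to write the substitution out step by step. The sketch below is only an illustration and is not part of the original activity: the function name and its parameters are assumptions introduced here, and the unknown is taken to be the number of hours car B travels, so that car A's time is that value plus its head start.

```python
from fractions import Fraction


def overtake_time(rate_a, rate_b, head_start_hours):
    """Hours the later car (B) travels before it draws level with car A.

    Starts from RateCarA x TimeCarA = RateCarB x TimeCarB, with
    TimeCarA = t + head_start_hours and TimeCarB = t.
    """
    if rate_b <= rate_a:
        raise ValueError("car B must be faster than car A to overtake it")
    # rate_a * (t + h) = rate_b * t   =>   t = rate_a * h / (rate_b - rate_a)
    return Fraction(rate_a) * Fraction(head_start_hours) / (Fraction(rate_b) - Fraction(rate_a))


# Activity 7.2: car A at 40 mph from 10 a.m., car B at 55 mph from 11.30 a.m.
head_start = Fraction(3, 2)   # 11.30 a.m. minus 10 a.m. is one and a half hours
t_b = overtake_time(40, 55, head_start)
t_a = t_b + head_start
print(t_b, t_a)               # hours travelled by car B and by car A
assert 40 * t_a == 55 * t_b   # both cars have covered the same distance
```

The only figures stated outright in the problem are the two rates. Both times have to be inferred, which is the step a novice who has merely been handed the equation may not know how to take.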
Third, novices may not understand the concepts embodied in the rule or principle, or may have misinterpreted them. For example, the equation in Activity 7.2 is based on the more general equation Distance=Rate×Time. As both cars travel the same distance then DistanceCarA=DistanceCarB; and as the distances are equal the Rate×Time for both cars must be equal too: hence the form of the equation. Now if you know something about algebra or mathematics in general, then that explanation might make sense and you can understand where the equation comes from. If you have little knowledge of mathematics, then the origin of the equation may be rather obscure. That is, you may not understand the concepts involved in the equation. Fourth, trying to solve a current problem may force novices to extract more information from an earlier problem than they did at the time. If you saw how to solve the Fortress problem based on the Radiation problem, you may have been able to abstract out information from the Radiation problem that was more relevant to “divide and converge” problems. Reimann and Schult (1996) point to three problems that the use of examples helps to overcome. These are the “interpretation problem”, the “control problem”, and the “generalisation problem”. The interpretation problem. Examples show how theoretical principles can be interpreted and instantiated in a problem (the second of Ross’s four roles). They show the relationship between a problem description and the concepts or principles they embody. The control problem. At any one time in the middle of an algebra problem there may be a number of possible operators that you can apply (related to the first of Ross’s roles). Examples show what specific operators apply and therefore demonstrate the specific solution procedure. The generalisation problem. It is always difficult for novices to know what the salient aspects of a problem (or a textbook for that matter) are. 
That is, they are poor at distinguishing between those surface features that are related to the structural ones and those that are irrelevant (see the discussion of Gentner’s structure-mapping theory in Chapter 5). Only the superficial features can be generalised over. THE PROCESSES INVOLVED IN TEXTBOOK PROBLEM SOLVING Various representations can be derived from a textual presentation of a problem. The first thing a student finds when confronted with a word problem to solve in a textbook is a piece of text. The first thing the student has to do, therefore, is to make sense of the text itself. This in turn requires several layers of representation. First of all there are the individual words that compose the text. Understanding these comes through our semantic knowledge of the items in our mental lexicon—our mental dictionary. From the individual words and the context of the sentence, our overall understanding of the text of a problem is constructed, and so on. Kintsch (e.g., 1986, 1998; Nathan et al., 1992; Van Dijk & Kintsch, 1983) has argued that word problems require the solver to generate a number of different representations. The initial representation of the text is a propositional representation called the textbase. However, knowing what the text of a question means does not therefore entail an understanding of the problem. Kintsch (1986, p. 89) gives the example of trying to understand a computer manual: all too often we seem to “understand” the manual all right but remain at a loss about what to do; more attention to the text as such would be of little help. The problem is not with the words and phrases, nor even with the overall structure of the text; indeed, we could memorize the text and still not know which button to press. The problem is with understanding the situation described by the text. Clearly understanding the text as such is not a sufficient condition for understanding what to do.
From the textbase students have to develop a representation of the situation described in the text. This is a mental model which Van Dijk and Kintsch (1983) termed a situation model. For problem solving to be successful, the solver has to generate all the necessary inferences in order to build a representation of the problem that is useful enough to solve it. This, in turn, means that novices have to have enough domain-relevant knowledge to do so. In a later formulation of the theory, Nathan, Kintsch, and Young (1992) divided the situation model into two. They explicitly distinguished between the situation model and the problem model. The situation model includes elaborated inferences generated from an understanding of the text. Such inferences might include the fact that if two cars leave from the same point at different times and the second car overtakes the first then both cars will have travelled the same distance at that point. The fact that both cars travelled the same distance may not be explicitly mentioned in the text. Nathan et al. (1992, p. 335) also add that:
because of the added demands of inference making, readers will make inferences only when they seem necessary. Poor problem solvers will tend to omit them from their representations, and so they will omit the associated equations (supporting relations) from their solutions to story problems. Problem solvers who reason situationally will tend to include these inference-based equations.
The other representational form proposed by Nathan et al. is the problem model which includes formal knowledge about the arithmetic structure derived from the text, for example, or the operating procedure constructed from information in the text. The ability to make inferences from texts in order to derive a useful problem model depends on the relevant prior domain knowledge of the learner. Kintsch (1986) argues that the text determines what situation model is constructed and how it is constructed.
The situation model is important for learning and the textbase is important for remembering text (bear in mind that the situation model and problem model are conflated here). In a study of problem solving and retrieval of earlier problems, he found that recall of word problems that had already been solved was determined both by the properties of the textbase and the model constructed to solve a problem. It was the situation model that provided recall of earlier problems and not a reproduction of the textbase. Learning, according to Kintsch, depended on the problem model constructed from examples, and remembering depended on the coherence of the text. For example, common terms repeated in succeeding sentences lead to greater coherence and greater recall (Kintsch & Van Dijk, 1978). He argued that it was easier (at least for children) to form an appropriate situation model if there is a concrete, familiar structure. However, other studies (e.g., Chen & Daehler, 1989; Novick, 1990) have shown that this is not the whole story. Problem-solving transfer by adults and children from abstract representations can also take place (see Chapter 4). The distinction between a propositional (textbase) representation of a text and the elaborated situation model was examined by Tardieu, Ehrlich, and Gyselinck (1992). They argued that novices and experts in a particular domain would not differ in the propositional representation they derived from a text, but that there would be differences between the two groups in the situation model (here again the situation model and the problem model are synonymous). Tardieu et al. found that there was no difference between experts and novices on their ability to paraphrase a text (i.e. they both generated much the same textbase) but experts performed better on inference questions than novices (they had derived different situation models from the textbase).
These hierarchical forms of representation have two implications for how novices understand textbook explanations and examples. First, as they are unfamiliar with the domain, they tend to have only a propositional representation of the surface features of the examples. Using examples to solve further problems means matching propositions and is unlikely to be guided by an understanding of the deeper
relational structure. Second, novices may not know enough to make necessary elaborative inferences to generate a complete situation or problem model. The next section presents an example of a study where the students were unable to generate a complete situation or problem model.
LABORATORY STUDIES OF WITHIN-DOMAIN AND TEXTBOOK PROBLEM SOLVING
Figure 7.2 represents a hierarchy of “Rate” problems. The lower down the hierarchy, the more specific or concrete the problem becomes. The distance between any two nodes in the hierarchy represents a crude measure of the amount of transfer that would be involved between them. Generally speaking, solving problems using examples in textbooks involves problems that would be adjacent in the hierarchy. Reed, Dempster, and Ettinger (1985) describe four experiments in which one example problem and solution is presented and the student is thereafter expected to solve a transfer problem, or a problem whose solution procedure was unrelated to the practice problem. In the terminology of Reed et al., the transfer problems were called “equivalent” or “similar”. We will look at the experiments in general and at some of the algebra word problems in particular with a view to discovering just what the solution explanations that were provided failed to explain.
Figure 7.2. A hierarchy of Rate problems.
STUDY BOX 7.2 Reed et al. (1985)
Rationale
Reed et al. were interested in establishing how transfer could be produced in within-domain problem solving. They used the kinds of problems one finds in mathematics textbooks and gave explanations of how to solve them (which were generally better than the explanations one normally finds in such textbooks). Using one example problem and associated explanation they looked for transfer to close and distant variants of the example problem.
Methodology
Subjects were given six minutes to solve the following problem and then given the solution. In the discussion that follows this is referred to as the source problem.
A car travelling at a speed of 30 miles per hour (mph) left a certain place at 10.00 a.m. At 11.30 a.m. another car departed from the same place at 40 mph and travelled the same route. In how many hours will the second car overtake the first car?
The problem is a distance-rate-time problem in which
Distance = Rate × Time
We begin by constructing a table to represent the distance, rate, and time for each of the two cars. We want to find how long the second car travels before it overtakes the first car. We let t represent the number that we want to find and enter it into the table. The first car then travels t + 3/2 hr because it left 1½ hrs earlier. The rates are 30 mph for the first car and 40 mph for the second car. Notice that the first car must travel at a slower rate if the second car overtakes it. We can now represent the distance each car travels by multiplying the rate and the time for each car. These values are shown in the following table.

Car      Distance (miles)   Rate (mph)   Time (hours)
First    30(t + 3/2)        30           t + 3/2
Second   40 × t             40           t

Because both cars have travelled the same distance when the second car overtakes the first, we set the two distances equal to each other:
30(t + 3/2) = 40t
Solving for t yields the following:
30t + 45 = 40t; 45 = 10t; t = 4.5 hours
Three types of test problem were presented: an unrelated problem that did not use the same equation or have the same surface features; a close variant, where the solver had to find the time taken for the vehicles to meet (“Target 1” in Table 7.2); and a distant variant, where the solver had to find the rates of travel of the two vehicles (“Target 4” in Table 7.2). In the first experiment the explanation was removed after two minutes and subjects were asked to solve a target problem.
Results
Subjects were extremely poor at solving the targets without having the source in front of them (only 6% were successful); in subsequent experiments most groups were allowed to consult the source solution. Despite the fact that the equation is the same as the one in the source problem, only 22% of students successfully solved the distant variant (Target 4) when the complete text of the problems was in front of them.
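The gap between the close and the distant variant can be made concrete with a small sketch. It is an illustration added here, not material from Reed et al., and both function names are hypothetical. Each function starts from the same equation, Rate1×Time1=Rate2×Time2. The first solves it for a time, as the source solution does. The second rearranges it for a rate, a step the source solution never demonstrates. The figures in the second call are those of Target 4: a pick-up truck leaves 3 hours after a delivery truck, travels 15 mph faster, and catches up after 7 hours.

```python
from fractions import Fraction


def time_to_overtake(rate_first, rate_second, head_start):
    """Source problem and close variants: the unknown is a time.

    rate_first * (t + head_start) = rate_second * t, solved for t.
    """
    return Fraction(rate_first) * head_start / (Fraction(rate_second) - rate_first)


def rates_from_times(head_start, catch_up_time, rate_difference):
    """Distant variant (Target 4): the unknowns are the two rates.

    r * (catch_up_time + head_start) = (r + rate_difference) * catch_up_time,
    which simplifies to r * head_start = rate_difference * catch_up_time.
    """
    rate_first = Fraction(rate_difference) * catch_up_time / head_start
    return rate_first, rate_first + rate_difference


# Source problem: 30 mph, then 40 mph leaving one and a half hours later.
print(time_to_overtake(30, 40, Fraction(3, 2)))   # hours for the second car

# Target 4: 3-hour head start, caught after 7 hours, 15 mph faster.
delivery, pickup = rates_from_times(3, 7, 15)
print(delivery, pickup)                           # mph for each truck
assert delivery * (7 + 3) == pickup * 7           # equal distances when they meet
```

The two functions share a starting equation but not a procedure: the algebra that isolates a time is different from the algebra that isolates a rate, and only the first appears in the worked example.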
TABLE 7.2
Examples of rate×time problems

Source problem
A car travelling at a speed of 30 miles per hour (mph) left a certain place at 10.00 a.m. At 11.30 a.m. another car departed from the same place at 40 mph and travelled the same route. In how many hours will the second car overtake the first car?

Target 1
A car travels south at the rate of 30 mph. Two hours later, a second car leaves to overtake the first car, using the same route and going 45 mph. In how many hours will the second car overtake the first car?
Target 1 is almost identical to the source. The solution can be found “syntactically” using the example; that is, the values given (30 mph, 45 mph, 11.00 and 12.30) can be substituted for the values in the source problem and the solver simply copies everything that was done in the source problem. There is no need for any “understanding”.

Target 2
Car A leaves a certain place at 10 a.m. travelling at 40 mph and car B leaves one and a half hours later travelling 15 mph faster. How long does it take car B to overtake car A?
Here the values cannot be directly substituted apart from RateCarA. The solver has to apply different arithmetic operations from those in the source. For example, to find RateCarB the solver has to add 15 to 40 in the target.

Target 3
A truck leaves a certain place one and a half hours after another truck and tries to overtake it by travelling 15 mph faster. If the first truck travels at 40 mph, how long does it take the second truck to catch up?
Here the cars have been replaced by trucks. Nevertheless, it should be easy for the solver to generalise from cars to trucks so this should pose few problems, just as in Holyoak and Koh (1987). However, there is the possibility that the solver will confuse the trucks because the truck that leaves second is mentioned first. Some solvers are therefore likely to assign TruckA to the first truck mentioned and so slot the wrong values into the equation, and indeed that is what some do (Robertson, 2000).

Target 4
A pick-up truck leaves 3 hours after a large delivery truck but overtakes it by travelling 15 mph faster. If it takes the pick-up truck 7 hours to reach the delivery truck, find the rate of each vehicle.
In this case not only is the order of the trucks swapped round but the question is asking for the rates of both vehicles rather than the time taken. This is an example of far transfer, as the solver is expected to solve a problem different from the one given as an example. The example gives a procedure for finding the time taken for one vehicle to overtake another. Solvers have not been told the procedure for finding rates. This explains the fact that Reed et al. failed to find that their explanation of how to do time problems was transferred to rate problems (see text). Figure 7.3 gives an indication of the “distance” between variants of the problem. In essence, Reed et al. had asked their subjects to solve a problem (the distant variant) that they had not been shown how to solve.

Figure 7.3. Applying the source to target 4 involves two levels of the hierarchy.

Some aspects of the relation between example and test problems were examined by Novick and Holyoak (1991). Using algebra word problems they looked at the effects of giving subjects specific numerical mappings for transfer problems. For example, in the “number mapping hint” condition subjects were presented with hints such as: “the 12, 8, and 3 in the band problem are like the 10, 4, and 5 in the garden problem” (p. 415). Transfer success was much more likely to occur with number mappings than if the subjects were given “concept mappings” such as: “your goal in this problem is to arrange the band members into rows or columns so that each row (or each column) has the same number of people in it, with no-one left over. That’s like the goal you had in the garden problem of grouping plants into different types so that there were the same number of plants of each type, with no extra spaces left in the garden” (p. 415). Novick and Holyoak found that the numerical mappings were a necessary (but not sufficient) prerequisite for transfer. The difficulty came when subjects had to adapt the procedure to solve a transfer problem. Problem adaptation takes a variety of forms, each of which presents a particular kind of difficulty for the solver. Novick and Holyoak list three of them:
1. Substitute numbers from the test problem into the source operators. Elements of one problem have to correspond to elements in the other. An example would be substituting 30 mph in one problem for 40 mph in another in distance-speed-time problems. Failure to map the correct numbers led to what Reed et al. called a “quantity error”, and using a number in a source problem without changing it in the target led to a “matching error”. This form of adaptation is not a major source of difficulty when the problems to be solved are very close variants.
2. Postulate new test-problem elements not described in that problem. This occurs when the target is a distant variant of the source problem. If the source contains “time taken” but the target does not, or if the target gives “rates of travel” but none is given in the source, then the subject will have to generate new test-problem elements. This was the difficulty faced by the subjects of Reed et al.
3. Generalise source procedure in ways that preserve the essential structure of the procedure. If the procedure involves generating an equation in an example, then the equation has to have the same form in any test problem. Failure to preserve the problem structure would lead to what Reed et al.
call a “frame error”, where the form of the equation is wrong.
How difficult a problem is depends on the level of the student’s understanding of the domain and of the nature of the problem. Generating a situation model from the textbase and thence a problem model depends on the student’s prior knowledge, both factual and conceptual. So what does it mean to “understand” a problem in an unfamiliar domain?
UNDERSTANDING PROBLEMS REVISITED
“Understanding is arguably the most important component of problem solving (e.g., Duncker, 1945; Greeno, 1977), and representations indicate how solvers understand problems… One cannot solve a problem one does not understand (except by chance)” (Novick, 1990, p. 129, italics added). On the other hand, you can solve problems you don’t understand if they involve very little adaptation and all you are doing is copying the example (Robertson, 2000). There are several reasons why novices may fail to make elaborative inferences from a reading of a problem. First, their representations of the text are often fragmentary and incomplete, as they may not know what aspects of the text are important or relevant to the solution. Second, they require practice at solving problems before they develop the necessary inference rules. In other words, their declarative knowledge of the domain (or conceptual knowledge) is not in a form that can support inferences.
Many models of learning such as ACT-R, SOAR, Sierra, PI, etc. (see Chapter 8), emphasise the deriving of knowledge from experience. That is, procedures are created from experience of instances of problem solving. During this phase learners might show procedural skill but not necessarily much conceptual understanding. An example would be learning to write recursive functions in LISP from a purely “syntactic” point of view (Hasemer & Domingue, 1989)— one can learn to write recursive functions successfully without at first a full understanding of how recursion works. Even when concepts are not fully understood they can nevertheless be useful tools for thought. Procedures in mathematics still allow problems to be solved even though the problem or procedure is not fully understood. VanLehn (1990), among others, has shown that having a procedure that successfully solves a problem involving a concept does not mean that the student has a proper understanding of the concept (Brown & Burton, 1978; Brown & VanLehn, 1980; VanLehn, 1986, 1990; Young & O’Shea, 1981). Of course, it is also possible to understand a concept without knowing how it can be used; that is, without knowing how it can be instantiated. One can understand concepts at an abstract level such as recursion or “the past participle agrees with the preceding direct object” and yet rarely apply them. Even though a concept is understood by a student, this does not mean that the student has acquired the necessary procedures to implement that concept (Ohlsson & Rees, 1991). Byrnes (1992) proposed a “dynamic-interaction” model to account for the interface between conceptual and procedural understanding. His model assumes that concepts and procedures are stored in separate memory systems. The interface between the two types of knowledge involves the activation of procedures that are indexed to concepts. The procedures, however, do not define concepts. 
Silver (1979, 1986), in contrast, believes that the two are more fundamentally interrelated and that there is little point in distinguishing between them. Just as there is a conceptual basis for procedures, there is also a procedural basis for concepts. For example, the concept “equilateral triangle” may include procedures for distinguishing between examples and non-examples of such triangles. Concepts can therefore be defined by the operations that apply to them (see also Ross, 1996). Conceptual knowledge has been referred to as “knowledge rich in relationships” (Hiebert & Lefèvre, 1986). When concepts are learned, they are by definition learned with meaning. Procedures, on the other hand, may or may not be learned with meaning. This leads to a paradox. If conceptual (declarative) understanding comes first, then procedures are surely also learned with meaning, as those procedures make reference to known concepts. If, however, procedures are acquired first, then conceptual understanding comes after the procedures have been learned. In other words, it is possible for procedural knowledge to precede certain types of declarative knowledge. Some of these issues are dealt with further in the next chapter. THE ROLE OF DIAGRAMS AND PICTURES IN AIDING UNDERSTANDING Many concepts in a new domain do not lie within people’s previous experience. For this reason it is often useful to provide some intermediate means of relating the new concepts to something that the learner does know. People understand new concepts better if they are “anchored” to existing knowledge schemas. If the text succeeds in providing this anchor then readers will more readily understand and remember new material. One way to explain concepts in textbooks is therefore to try to relate them to the assumed prior knowledge of the reader. This general point has already been discussed in previous chapters. Here we will discuss one form of aiding conceptual understanding using illustrations.
Figure 7.4. Intermediate bridging representations
Bridging analogies and intermediate representations Graphs, figures, and tables are ways of representing information pictorially. In many cases they therefore serve the role of making textual information easier to understand. They can also serve to re-represent a question statement or a solution procedure. In this role they are “intermediate representations”. One example is what is known as an “intermediate bridging analogy” (Brown & Clement, 1989). The role of such analogies is the same as any other intermediate representation. For example, people often find it hard to understand that a surface such as a table top is deformed by, and has to push up with, a force exactly equal to the force of gravity exerted by an object—even a sheet of paper—sitting on the surface. To help understand the concept, imagine a heavy block resting on a sheet of rubber; the deformation of the rubber sheet should be obvious. Now imagine the block on a sheet of hardboard; the deformation should be less but still noticeable. Finally imagine the heavy block on a table top. The previous imagined situations should allow you to see that the table top is likewise deformed by the weight but that the deformation is not noticeable to the human eye (Figure 7.4). Diagrams, graphs, and tables are often used in textbooks to provide an intermediate representation of the material presented there. Where learners find it difficult to understand the structure of novel abstract concepts, or relate such concepts to the concrete examples in texts, they may find it easier to understand the relation if an intermediate representation is used. In Figure 7.5, R represents the relation between a concept (A) and a concrete example (B); R1 represents the relation between the concept and some intermediate representation (B'); and R2 represents the relation between the intermediate representation and the concrete example. 
These intermediate representations are an important part of the explanation of new abstract material in textbooks and act as a form of “scaffolding” to help bridge the gap between the learners’ prior knowledge (represented as B' in Figure 7.5) and the new concept (A in the Figure along with the relation between it and a specific concrete example represented by B). According to Resnick (1989) by providing a different representation of the textual material, writers can “bootstrap” learners’ constructions of novel concepts. “Objectifying theoretical constructs”, that is, making the abstract more concrete, can be done in texts by presenting the learner with some form of physical display. In that way the theoretical construct can be “seen”.
Figure 7.5. A bridging analogy or intermediate representation (B′) helps the analogiser to make the leap from a concept (A) to a concrete example (B).
There have been a number of experiments to test the effects of intermediate representations as an aid in problem solving. Such representations may involve some kind of visual representation (Beveridge & Parkins, 1987; Gick, 1985, 1988; Lewis, 1989), or an analogy whose purpose is to clarify a concept (Brown & Clement, 1989), or some other way of representing problems such as tables (Reed & Ettinger, 1987). Both Gick (1985) and Beveridge and Parkins (1987) examined the effects of visual analogues on problem solving and found that visual representations can act as effective retrieval cues. Beveridge and Parkins used both diagrams and coloured strips as cues, and the results suggested that the coloured strips of different intensities, representing summative effects analogous to the concepts in the problems presented, were the most effective retrieval cue. Presenting a problem along with a visual representation seems to facilitate recall of a solution. Similarly, Gick used a large arrow representing a large force and several smaller arrows arranged in a circle to represent the division of a large force and its convergence on a target (see Figure 5.1). The diagram was presented along with an explanation of the solution to the Fortress problem. When the diagram was reproduced in the target problem, it facilitated spontaneous transfer. The same was true for subjects who were presented with diagrams alone before solving the transfer problem. Robinson and Kiewra (1995) found that students learned more and could apply that learning more if they had been presented with text that included graphic organisers, compared with students given text outlines. Writers still have to be very careful that the textual representation of a new construct and the intermediate representation they provide are structurally equivalent. The inferences that readers can make in the known source domain should also be available in the target.
If this is not the case then learners will have difficulty transferring the induced structure to novel problems of the same type. The reasons why pictorial or diagrammatic representations are so effective are discussed by Larkin and Simon (1987). Texts present information in a linear sequence. Understanding this sentential representation incurs a great deal of computational cost in terms of search. The larger the data-structure contained within the sentential representation, the greater the search time. It is as if you had to search through a lot of “mental text” to retrieve relevant information or make a useful inference. In a diagrammatic representation the information is indexed by its 2-D location—diagrams can make relations perceptually explicit, which is not the case in sentential representations. According to Larkin and Simon, diagrams: • allow a large number of automatic perceptual inferences; • avoid the need to match symbolic labels (matching a variable in one part of a sentential representation to a related variable elsewhere);
• obviate the need to search for problem-solving inferences.

While graphical representations can help us understand concepts or systems, they also have a number of other functions. Levin (1988) classifies the functions of “pictures-in-text” into five categories:

1. decoration, where pictures are designed to make a text more attractive but are not related to the content;
2. representation, where pictures make the text more concrete, as in children’s books;
3. organisation, where pictures enhance the structure of a text;
4. interpretation, where pictures are supposed to make a text more comprehensible;
5. transformation, where pictures are presented to make a text more memorable.

Levin relates these functions to different prose-learning outcomes by appealing to the notion of transfer-appropriate processing (Morris, Bransford, & Franks, 1977). Learners have to take account of the goals of the learning context and adapt their learning strategies accordingly. In the context of using pictures in text, Levin argues that writers should use different pictorial representations depending on whether they want to encourage the learner to understand the material, remember the material, or apply the material. For example, in the studies by Beveridge and Parkins (1987) and Gick (1985) the function of the intermediate representation was principally to aid retrieval.

PROVIDING A SCHEMA IN TEXTS

Another method of providing the reader with help in understanding new concepts is by using an explanatory schema. Figure 7.1 showed that examples can represent a category of problems. It is possible to provide a general schema for solving a range of problems of the same type. The difficulty here is presenting the schema at the appropriate level of abstraction (see Chapter 9). If it is too abstract, then it might be hard to see how it applies in a specific example.
If it is too specific, then it might be hard to see how it can be transferred to another more distant variant of the problem type. Chen and Daehler (1989) examined the relation between the type of story representation (specific or abstract schema) and positive and negative analogical transfer in children. Where an abstract schema was provided, the children were able to transfer analogous solutions spontaneously even when the source and target problems shared few surface similarities. Indeed the abstract representation of the source analogue was a strong determinant of positive transfer. When the target problem involved a solution principle different from the source, negative transfer resulted. However, although schema training had a strong effect on positive transfer, another important aspect was the ability to determine when it should be applied. Some of the benefits of providing an explanatory schema have been listed by Smith and Goodman (1984). They apply equally well to the benefits of diagrams and other pictorial representations such as graphs and tables and can be related to the information-processing model of Larkin and Simon (1987). These are compared in Table 7.3. Learning to solve problems in an unfamiliar domain can be a difficult and taxing business. The writer can make the learner’s task a little easier in three main ways: TABLE 7.3
The benefits of an explanatory schema and of diagrams

Schemas: Smith and Goodman (1984). Diagrams: Larkin and Simon (1987).

1. Schemas: Schemas provide an explanatory framework or “scaffolding”. They improve understanding, as the preexisting connections between the framework slots can be mapped to the new domain directly.
   Diagrams: In Larkin and Simon’s terms the diagram and text should be “informationally equivalent” so that information in one representation is also inferable in the other.
2. Schemas: Schemas contain information that can be added to fill in gaps in knowledge and help form connections between gaps.
   Diagrams: In diagrams this includes the ability to generate perceptual inferences.
3. Schemas: Schema-based instructions reduce the time required to understand the relation between steps.
   Diagrams: In diagrams there is less need for search.
4. Schemas: Schemas boost memory for specific information.
   Diagrams: According to Larkin and Simon, diagrams allow the reader to focus on perceptual cues and so retrieve problem-relevant inference operators from memory.
5. Schemas: Schemas boost performance where they depend on understanding the relations between steps.
   Diagrams: Similarly, diagrams have computational benefits, as the information in them is better indexed and is supported by perceptual inferences.
6. Schemas: Schemas should lead to a hierarchical organisation of material which should, in turn, lead to “chunking” and hence improve recall (Eylon & Reif, 1984).
   Diagrams: The information in diagrams is perceptually grouped: related bits of information are adjacent to each other.
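The claim in the table that diagrams need less search because their information is better indexed can be illustrated with a small sketch. This is not Larkin and Simon's model. It only contrasts a linear scan of sentence-like facts with a lookup keyed by 2-D location. The facts, coordinates, and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical facts: (label, 2-D location, property). Not taken from the source.
FACTS: List[Tuple[str, Tuple[int, int], str]] = [
    ("weight_A", (0, 0), "hangs from rope_1"),
    ("rope_1", (0, 1), "runs over pulley_1"),
    ("pulley_1", (0, 2), "is fixed to the ceiling"),
    ("rope_2", (1, 1), "runs under pulley_2"),
]


def sentential_lookup(label: str) -> Tuple[Optional[str], int]:
    """Scan the 'sentences' in order and count how many were inspected."""
    steps = 0
    for name, _location, prop in FACTS:
        steps += 1
        if name == label:
            return prop, steps
    return None, steps


# Diagram-like index: everything at one location is stored together.
DIAGRAM: Dict[Tuple[int, int], List[Tuple[str, str]]] = {}
for name, location, prop in FACTS:
    DIAGRAM.setdefault(location, []).append((name, prop))


def diagram_lookup(location: Tuple[int, int]) -> List[Tuple[str, str]]:
    """Return whatever sits at a location with a single indexed access."""
    return DIAGRAM.get(location, [])
```

In the list version the cost of finding a fact grows with the amount of "mental text" that precedes it, whereas the indexed version goes straight to the location and finds adjacent information grouped there.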
1. By not making too many assumptions about what the learner understands. Unfortunately, what constitutes “too many” is an empirical question.
2. New concepts and information can be made easier to grasp by including intermediate representations such as analogies and illustrations that relate the new material to what the reader already knows, or that make the material visually clearer.
3. Presenting both concrete examples and an explanation at a moderately abstract level (a schema) might help the learner see how the examples might be adapted to different situations.

SUMMARY

1. Analogical problem solving involves reasoning from a familiar domain to solve problems in an unfamiliar one. Textbook problem solving, on the other hand, tends to involve mapping an unfamiliar example onto an even less familiar exercise problem.
2. Textbook writers have to make some assumptions about the readership. These include assumptions about:
• The readers’ prior knowledge.
• How much the readers are likely to remember from previous chapters.
• How much they understand from previous chapters.
• Their schema knowledge of how such texts are constructed.
• How much the readers can generalise from the examples and explanations given.
3. Examples in textbooks are the salient aspects of instruction. They are the parts of the text that readers pay most attention to and use when solving later problems. They show:
• How abstract principles can be made concrete.
• What operators to choose at any given point.
• What features can readily be generalised over.
4. Different forms of representation are necessary to understand word problems. These include knowledge of the lexicon, understanding of the text including local coherence (the textbase), the mental model of the situation described in the text including inferences derived from it (the situation model), and the relation between the latter and the solver’s knowledge of the domain (e.g., mathematical knowledge) that allows the generation of the problem model.
5. When solvers attempt to use an example as a source to solve a target problem they are likely to be successful if the two are close variants. If they are distant variants then the textual explanation needs to include an explanation of how to generalise over the different variants (i.e., a problem schema needs to be provided).
6. Diagrams and analogies provide a means of forming a bridge between a familiar situation and the novel unfamiliar situation. Although pictures, graphs, and illustrations can have a variety of functions in texts, they share the same pedagogical function as analogies and schemas in texts.
PART FOUR Learning and the development of expertise
Part Four: Introduction
Despite the fact that there appear to be only certain kinds of situations in which knowledge transfers from one context to another, we nevertheless manage to learn things and apply what we learn to new situations. With a great deal of practice one might eventually become what is called an “expert”. The next two chapters look at learning and the acquisition of expertise, and at what distinguishes experts from novices. Chapter 8 looks at the mechanisms that produce learning and expertise. Chapter 9 examines some of the dimensions along which experts appear to differ from novices.
CHAPTER EIGHT Expertise and how to acquire it
As a result of solving problems within a domain, we learn something about that domain. Learning manifests itself as a permanent change in a person’s behaviour as a result of experience, although such a change can be further modified, or even interfered with by further experience. There are two kinds of learning that are discussed here, both of which would normally result in expertise in a particular domain. These are an increase in knowledge about a domain—declarative knowledge—and an increase in skill related to the domain—procedural knowledge. The two are necessarily interconnected, as procedural knowledge often develops as a result of using one’s declarative knowledge in specific contexts. Declarative and procedural knowledge have already been discussed in Chapters 4 and 7. Here we look more closely at the interrelation between the two. Some kinds of general declarative knowledge come about as a result of the process of induction. With experience of a problem type we induce a schema for the problem type. This schema helps us recognise and categorise problem types and access the relevant solution procedure. With repeated practice a procedure can become automated. This automaticity, in turn, frees up resources to deal with any novel aspects of a particular problem or situation. The next two sections therefore deal with schema induction and how procedures become automated as we move from novice problem solving to expertise. INDUCTION Induction allows us to generalise from our experience of the world. We don’t have to encounter every single piece of coal in the world to learn that coal burns. A relatively few instances will do, or sometimes even one. The ability to reason inductively is extremely important for our survival. It allows us to reason from specific instances to reach a (probable) conclusion which in turn allows us to make predictions: “I’m cold. 
If I set fire to those black rocks I’ll get warm.” To put it another way, what appears to be true for a sample of a population or category is assumed to be true for the whole population or category. Induction does not have to be a very sophisticated form of reasoning or inferencing. Most organisms are innately predisposed to associate event A with event B. A rat can learn that pressing a lever means that food pellets will appear. Cats learn what kinds of things they can eat and what not to eat, what situations are dangerous, and that they can get fresh food at 5 o’clock in the morning by bursting into bedrooms and miaowing loudly. Along with other animals we are even predisposed to make generalisations from single instances. It only takes a single attempt to eat a particular poisonous mushroom to ensure we avoid all mushrooms that look the same thereafter. This is liberal induction (Robertson, 1999). In fact we may even over-generalise and avoid any kind of mushroom forever after. The main problem with induction, as far as achieving our goals is concerned, is that there is no guarantee that the induction is correct. Inducing things
too readily may be dangerous: “Induction should come with a government warning” as Johnson-Laird has put it (1988a, p. 234). However, the kind of induction that is more pertinent here is conservative induction (Medin & Ross, 1989). This kind of induction means that people are very careful about just how far they think it is safe to generalise. In terms of problem solving, conservative induction means that we rely heavily on the context and the surface details of problems when using an example to solve another problem of the same type. As a result the generalisations we gradually form contain a great deal of specific, and probably unnecessary, information. Furthermore, according to Bassok (1997), different individuals are likely to induce different structures from different problem contents. Spencer and Weisberg (1986) and Catrambone and Holyoak (1989) found evidence for redundant and irrelevant specific information in whatever schema subjects had derived from the examples they had seen. Chen, Yanowitz, and Daehler (1995) argued that children found it easier to use abstract principles when they were bound to a specific context. Bernardo (1994) also found that generalisations derived from problems included problem-specific information (see also Perkins & Salomon, 1989). Reeves and Weisberg (1993) argued that the details of problems are necessary to guide the transfer of a solution principle from one problem to another. The reason why induction is conservative in this way is because we tend to base our categorisations of objects, problems, or whatever on the features of previously encountered examples rather than on abstractions. SCHEMA INDUCTION The literature on induction tends to concentrate on the induction of categories (such as fruit, animals, furniture, etc.), but the findings are equally applicable to learning problem categories (Dellarosa-Cummins, 1992; Ross, 1996; VanLehn, 1986). Problem categories are generally characterised by what you have to do. 
More specifically, a problem type is usually characterised by not only the set of features the problems exhibit but also the relationships between those features. Those relationships, in turn, tend to be determined by the kinds of operators that should be applied to them. With experience of problem types we begin to build up a mental representation that includes both information about the features of such problems and information about the solution procedure (Marshall, 1995; VanLehn, 1989). Gick and Holyoak’s (1980, 1983) “divide and converge” schema is one example. Two views of the role of examples were introduced in Chapter 6. These were the “principle-cueing” view and the “example-analogy” view (Ross, 1987). In the principle-cueing view an example is used to access an already implicitly known principle or schema. Figure 6.3 showed the example problem as containing an “implicit schema” that could be extracted and applied to a new problem of the same type. The role of the source example is to provide this schema. When a new problem is encountered, the surface structure of the source problem helps access the schema and the schema information aids in instantiating the rest of the problem (Blessing & Ross, 1996). The pattern is: C → A → S → D. Features of the current problem (C) allow you to access a relevant source example (A) which brings to mind the solution schema (S) which, in turn, can be used in the current problem to generate a solution (D). After experience of further problems of the same type the solver develops a schema that becomes increasingly decontextualised. This view works if a principle or schema can be readily induced from only one or perhaps two examples. It may not be so easy to induce a schema from problems in an unfamiliar domain. Here induction is likely to be conservative and is more likely to fit into the example-analogy view, which treats a training example as including a kind of recipe that the learner has to follow.
Solving a target problem can therefore be achieved by imitating the sequences of actions that were carried out in the source. Any schema generated from
Figure 8.1. The example-analogy view of schema induction. Using an ill-understood example to solve an ill-understood problem leads to the partial formation of a problem schema.
solving problems using an example develops as a by-product of comparing a source and target problem. In Figure 8.1 the relation between A and B is partially generalised to C and D. In creating this partial generalisation a partial schema (the shaded S box in Figure 8.1) is created as a by-product. This is represented as a dotted arrow from the generalisation line from the source to the target problems. The earliest stages of problem solving may involve using an example that is poorly understood to solve a problem that is even more poorly understood. This is the situation described in Figure 8.1. The set of relations (the solution procedure) linking A and B is poorly represented (hence the dotted lines in Figure 8.1) and the application process essentially involves imitating the example procedure in the current problem (C). As a result of feedback about success or failure during this process a partial schema is formed (S in Figure 8.1). When faced with a further problem of the same type the solver may be able to access both the previous examples and the partially induced schema (Figure 8.2). Figure 8.3 shows the effect of repeated exposure to similar problems. With experience the schema becomes increasingly decontextualised. Eventually the solver can either access the schema directly (which means, in effect, that the solver knows how to solve the problem and does not need an example to work from), or a paradigmatic example that instantiates the schema. One of the benefits of schema induction being rather conservative is that a schema may be more easily instantiated if it includes some content-specific information. A schema that is not at too high a level of abstraction may allow faster mapping between source and target. That is, it may make it clearer what objects play the same roles in two similar problems (Bassok, 1990; Bassok & Holyoak, 1989). Generally, then, it is unwise to lose track of specific examples when learning a new domain. 
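The way a conservatively induced schema sheds surface detail with repeated exposure can be sketched in a few lines. This is a toy illustration and not a model from the literature cited here. It assumes that a problem can be described as a flat set of features and that the schema is simply whatever all examples seen so far have in common. The feature sets and function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional

Features = FrozenSet[str]


def induce_schema(schema: Optional[Features], example: Features) -> Features:
    """Keep only the features shared by the schema so far and the new example."""
    if schema is None:
        # With a single example every detail, relevant or not, is retained.
        return example
    return schema & example


def induce_from(examples: Iterable[Features]) -> Optional[Features]:
    """Fold a sequence of examples into one schema, or None if there are none."""
    schema: Optional[Features] = None
    for example in examples:
        schema = induce_schema(schema, example)
    return schema


# Hypothetical feature sets for three "divide and converge" style problems.
fortress = frozenset({"large force", "single target", "split force", "converge", "army", "roads"})
fire = frozenset({"large force", "single target", "split force", "converge", "hoses", "roads"})
radiation = frozenset({"large force", "single target", "split force", "converge", "rays", "tumour"})
```

After one example the "schema" is the whole example. After `fortress` and `fire` it still carries the irrelevant feature `"roads"`, and only a third, more varied example removes it, which mirrors the point that early generalisations contain specific and probably unnecessary information.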
Specific details can prove useful during learning. According to Ross (1996), by interacting with a problem type the solver comes to recategorise it in terms of its solution procedure rather than its surface features. Ross has argued that problem classification is no different from other kinds of conceptual classification. The use of a
Figure 8.2. The process of solving a problem using an example or analogy generates a schema for the problem type. When faced with a later problem of the same type, a solver will either access the schema or an earlier example (or both).
category leads to changes in the way the category is represented. Experience in problem solving leads to changes in the way the problems are classified; generally this means moving from a classification based on surface features to one based on the underlying solution method. The use of a category or experience in problem solving forces the solver to pay attention to the relations between the features of the problems rather than the features alone.

SCHEMA-BASED KNOWLEDGE

There are various kinds of knowledge that can be encapsulated in schemas, as schemas can represent structured knowledge from the specific to the abstract. Marshall (1995) has listed some of the types of knowledge captured by problem schemas. These include knowledge of typical features or configurations of features, abstractions and exemplars, planning knowledge and problem-solving algorithms. Information Box 8.1 gives these in more detail and also identifies five basic word-problem schemas that can be derived from a problem’s textbase.
Figure 8.3. Repeated exposure to examples means that the schema becomes increasingly decontextualised.

INFORMATION BOX 8.1 Problem schemas

Marshall states that there are four knowledge types associated with schemas:

1. Identification knowledge. This type of knowledge is made up of a configuration of several problem features. It is the aspect of a problem schema that allows pattern recognition. For example, in the river problems that Hinsley and co-workers’ (1977) subjects recognised, these features would include those aspects mentioned by expert subjects such as speeds of the boats and river, direction of travel, etc.
2. Elaboration knowledge. This is declarative knowledge of the main features of a problem. It includes both specific examples and more general abstractions. Both of these allow the creation of a mental model (a problem model) in a given situation. For example, there may be a specific paradigmatic river crossing example that allows people to access the relevant solution schema.
3. Planning knowledge. This is an aspect of problem-solution knowledge used to create plans, goals, and sub-goals. Someone could recognise a problem type (through identification knowledge and elaboration knowledge) but not have the planning knowledge to solve it, for example “This is a recursion problem isn’t it? But I have no idea what to do.”
4. Execution knowledge. This is the knowledge of the procedure for solving a problem, allowing someone to carry out the steps derived through the planning knowledge. The solver is here following a relevant algorithm, for example, or has developed a certain procedural skill related to this type of problem. This would include the kind of skill required to carry out a plan to cap a blown oil well, or mount a take-over bid, or whatever.

Marshall identifies five basic word problem schemas (based on the problems’ textbase):

1. Change: Stan had 35 stamps in his stamp collection. His uncle sent him 8 more for a birthday present. How many stamps are now in his collection? [Operator: add quantities mentioned]
2. Group: In Mr Harrison’s third-grade class there were 18 boys and 17 girls. How many children are in Mr Harrison’s class? [Operator: add quantities mentioned]
3. Compare: Bill walks a mile in 15 minutes. His brother Tom walks the same distance in 18 minutes. Which one is the faster walker? [Operator: subtract quantities mentioned]
4. Restate: At the pet store there are twice as many kittens as puppies in the store window. There are 8 kittens in the window. How many puppies are also in the window? [Operator: divide quantity]
5. Vary: Mary bought a package of gum that had 5 sticks of gum in it. How many sticks would she have if she bought 5 packages of gum? [Operator: multiply quantities mentioned]
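The link between recognising a schema and selecting its operator can be sketched as a lookup table. This is a minimal illustration and not Marshall's implementation. The dictionary, the function name, and the assumption that each problem supplies exactly two quantities in a fixed order are all hypothetical simplifications.

```python
import operator
from typing import Callable, Dict

# Operators as listed in the box; the table itself is an illustrative assumption.
SCHEMA_OPERATORS: Dict[str, Callable[[float, float], float]] = {
    "change": operator.add,       # add quantities mentioned
    "group": operator.add,        # add quantities mentioned
    "compare": operator.sub,      # subtract quantities mentioned
    "restate": operator.truediv,  # divide quantity
    "vary": operator.mul,         # multiply quantities mentioned
}


def solve(schema: str, first: float, second: float) -> float:
    """Apply the operator associated with the identified schema."""
    return SCHEMA_OPERATORS[schema](first, second)


stamps = solve("change", 35, 8)     # Stan's stamp collection
children = solve("group", 18, 17)   # Mr Harrison's class
gap = solve("compare", 18, 15)      # minutes by which Tom is slower than Bill
puppies = solve("restate", 8, 2)    # twice as many kittens as puppies
sticks = solve("vary", 5, 5)        # 5 packages of 5 sticks
```

Identifying the schema does most of the work in this sketch: once the problem type is known, the arithmetic is trivial. The Compare result still has to be interpreted (the walker with the smaller time is the faster one), which the table does not capture.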
Marshall includes “execution knowledge” as a specific type of schematic knowledge structure. Anderson (1993) prefers to separate the schema knowledge and the procedural knowledge required to solve the problem. Generally speaking, schemas serve to identify the problem type and indicate at a fairly general level how this type of problem should be solved. For a more detailed view of how a certain category of problems should be solved it is best to look at the specific procedures for solving them. In a sense this means that problems can be viewed at different grain sizes. Anderson has argued that, if we are interested in skill acquisition rather than problem categorisation, we would be better to look at the specific solution method used. It is to this aspect of problem-solving skill acquisition that we now turn.

AI MODELS OF PROBLEM SOLVING AND LEARNING

Production system architectures

An important way of studying problem solving and learning is to build models. Just as an architect may produce a two-dimensional plan or a three-dimensional model that shows how the different parts of a building are interconnected and serviced, so a psychologist can build a model that incorporates a theory of how we think. Such a model may be a diagram, a mathematical model, or a computer program (or all three). One can build a model to examine specific aspects of cognition, such as face recognition or sentence parsing, or one can build a general model or architecture in which a variety of cognitive processes can be modelled. Such a cognitive architecture describes the functional aspects of a thinking system. For example, a cognitive system has to have input modules that encode information coming in from the outside world. There also has to be some kind of storage system that holds information long enough to use it either in the short term or in the long term.
There also has to be some kind of executive control module that allows the system to make decisions about what information is relevant to its current needs and goals, and to make inferences based on stored knowledge. It also has to be able to solve problems and learn from experience. Several such general models of cognition exist. One class of them is based around production systems. A production is essentially an if-then rule. An example would be “if it rains then take an umbrella”. You might see the “if” part and the “then” part referred to in various ways, such as the left side of a rule and the right
side of a rule, or goal and sub-goal, or condition-action. The condition part of a rule specifies the circumstances (“it is raining”) under which the action part (“take an umbrella”) is triggered. If a set of circumstances matches the condition part of a rule then the action part is said to “fire”. A production memory can be likened to a procedural memory, which is our memory for how to do things, and can be contrasted with a declarative memory, which is our memory for facts and episodes. A production system is therefore a set of condition-action rules, and a production system architecture has some form of production memory with other parts bolted on, such as a working memory or a declarative memory, along with connections between them and the outside world. In Anderson’s (1983, 1993) ACT architecture, there are three memory systems: a production memory that is the system’s long-term procedural memory, a declarative memory that is the system’s memory for facts (the facts are not isolated but interrelated in a “tangled hierarchy”), and a working memory. Information entering working memory from the outside world or retrieved from declarative memory is matched with the condition parts of rules in production memory and the action part of the production is then executed. The executed part of the production rule is now in working memory and may in turn lead to the firing of other rules. For example, information entering the senses from the world (“it is raining”) may enter working memory. This information may match with the condition of a production rule (“if it is raining then take an umbrella”) in which case the action part of the rule fires (“take an umbrella”). This in turn enters working memory (“you have an umbrella”) and may in turn trigger a further production (“if you have an umbrella and it is raining then open the umbrella”), and so on. Declarative knowledge, however, can be expressed as productions. 
In the SOAR architecture (Laird et al., 1987; Laird & Rosenbloom, 1996) there is no separate declarative memory store. Both models of human cognitive architecture claim to model a wide variety of human thinking. Because of their explanatory power, both claim to be unified theories of cognition, with Allen Newell championing SOAR (Newell, 1990) and John R. Anderson claiming the prize for ACT-R (Anderson, 1993). Anderson, for example, has made the strong assertion that “cognitive skills are realised by production rules” (1993, p. 1). Another model of how we induce both categories and rules from experience is that of Holland and co-workers (1986). They have produced a general cognitive architecture that lays emphasis on induction as a basic learning mechanism. An important aspect of Holland and co-workers’ model is the emphasis placed on rules representing categorisations derived from experience. Rules that lead to successful attainment of a goal are strengthened and are more likely to fire in future. Information Box 8.2 presents some of the main features of the model.
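The match-and-fire cycle described above for the umbrella rules can be sketched as a tiny forward-chaining loop. This is a minimal illustration and not ACT or SOAR. It assumes that working memory is a set of strings and that a fired action simply adds one new string to it. There is no declarative memory, no conflict resolution, and no rule strength. All names are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Production(NamedTuple):
    conditions: FrozenSet[str]  # facts that must all be in working memory
    action: str                 # fact added to working memory when the rule fires


RULES: List[Production] = [
    Production(frozenset({"it is raining"}), "you have an umbrella"),
    Production(frozenset({"you have an umbrella", "it is raining"}), "the umbrella is open"),
]


def run(working_memory: Set[str], rules: List[Production]) -> List[str]:
    """Fire matching rules until none adds anything new; return actions in firing order."""
    fired: List[str] = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            matches = rule.conditions <= working_memory
            if matches and rule.action not in working_memory:
                working_memory.add(rule.action)  # the result enters working memory
                fired.append(rule.action)
                changed = True
    return fired
```

Calling `run({"it is raining"}, RULES)` fires the first rule, whose result then satisfies the second rule's condition, so one firing triggers the next as in the example in the text.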
INFORMATION BOX 8.2 Processes of Induction (PI) Holland et al. (1986) present a model of induction based on different types of rules. Synchronic rules represent the general features of an object or its category membership. Diachronic rules represent changes over time. There are two kinds of synchronic rule: categorical and associative (Holland et al., 1986, p. 42). Categorical rules include rules such as: If an object is a dog, then it is an animal. If an object is a large slender dog with very long white and gold hair, then it is a collie. If an object is a dog, then it can bark.
8. EXPERTISE AND HOW TO ACQUIRE IT
Note that these rules encompass both a hierarchy (dogs are examples of animals) and the properties of individual members (dogs belong to the set of animals that bark) in contradistinction to the conceptual hierarchy of Collins and Quillian (1969) where properties are attached to examples at each level of a hierarchy. Associative rules include: If an object is a dog then activate the “cat” concept If an object is a dog then activate the “bone” concept.
Associative rules therefore represent the effects of spreading activation or priming. Diachronic rules are also of two kinds: “predictor” and “effector”. Predictor rules permit expectations of what is likely to occur in the world. Examples of predictor rules are: If a person annoys a dog then the dog will growl. If a person whistles to a dog then the dog will come to the person.
Effector rules tell the system how to act in a given situation. They include rules such as: If a dog chases you then run away. If a dog approaches you with its tail wagging then pet it.
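The four rule types described above can be made concrete with a minimal sketch. The Python below is an illustration only, not Holland et al.'s implementation: the `Rule` class, the `kind` labels, and the `fire` helper are hypothetical names, and a condition is simplified to a set of facts that must all be present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    kind: str             # "categorical", "associative", "predictor" or "effector"
    condition: frozenset  # facts that must all be present for the rule to match
    action: str           # what the rule asserts, activates, predicts or does


# Rules taken from the examples in the text (hypothetical encoding).
RULES = [
    Rule("categorical", frozenset({"dog"}), "is an animal"),
    Rule("categorical", frozenset({"dog"}), "can bark"),
    Rule("associative", frozenset({"dog"}), "activate 'cat' concept"),
    Rule("associative", frozenset({"dog"}), "activate 'bone' concept"),
    Rule("predictor", frozenset({"dog", "person annoys dog"}), "dog will growl"),
    Rule("effector", frozenset({"dog", "dog chases you"}), "run away"),
]


def fire(facts, rules=RULES):
    """Return (kind, action) for every rule whose condition is satisfied."""
    facts = set(facts)
    return [(r.kind, r.action) for r in rules if r.condition <= facts]


matched = fire({"dog", "dog chases you"})
for kind, action in matched:
    print(f"{kind:12s} -> {action}")
```

In this sketch the synchronic rules (categorical and associative) match whenever a dog is present, whereas the diachronic rules (predictor and effector) also need a fact about what is happening.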
Our mental representation of the world tends to simplify the real world by categorising aspects of it. To make sense of the world and not overload the cognitive system, a categorisation function “maps sets of world states into a smaller number of model states”. The model of the world produced by such a “many-to-one” mapping is known as a homomorphism. Now sometimes generalisations (categorisations) can lead to errors. If you say “Boo” to a bird it will fly away. However, no amount of booing will cause a penguin to fly away despite the penguin being a bird. To cope with penguins another more specific rule is required. You end up with a hierarchy from superordinate (more general) categories to subordinate (more specific) categories and instances. Higher-level categories provide default
expectations (birds fly away when you say “Boo!”). An exception (penguins waddle swiftly away when you say “Boo!”) evokes a lower level of the hierarchy. A hierarchical model that includes a set of changes of state brought about by a set of transition functions (one for each level) is known as a quasi-homomorphism or q-morphism for short. Figure 8.4 represents a mental model based on a state of affairs in the world (a sequence of operators applied to a sequence of problem states). The top half represents the world including transitions between states, and the bottom half is a mental model of that world. A problem model is valid only if (1) given a model state S′i, and (2) given a model state S′g that corresponds to a goal state Sg in the environment, then (3) any sequence of actions (operator categories) in the model, {O′(1), O′(2), …, O′(n)}, which transforms S′i into S′g in the model, describes a sequence of effector actions that will attain the goal Sg in the environment. An ideal problem model thus is one that describes all those elements in the world necessary and sufficient for concrete realization of a
successful solution plan. The process of induction is directed by the goal of generating mental models that increasingly approximate this ideal. (Holland et al., 1986, p. 40)
Figure 8.4. A problem model as a homomorphism. T is a transition function, Si is the initial problem state, Sg is the goal state, O is a category of operators, and P is a categorisation function. The prime (T′, S′, O′) represents the mental representations of those operators, states and functions (adapted from Holland et al., 1986, p. 40).
In the homomorphism represented in Figure 8.4, P is a categorisation function that serves to map elements in the world to elements in the model. Analogical problem solving can be modelled in this system by assuming a “second order morphism”. A source model as in Figure 8.4 can be used to generate a model of the target.
Generalisation Similar conditions producing the same action can trigger condition-simplifying generalisation. Suppose there are learned rules such as “If X has wings and X is brown then X can fly” and “If X has wings and X is black then X can fly”. A simplifying generalisation would remove the “X is brown” and “X is black” conditions since they do not seem to be relevant to flying. People have beliefs about the degrees to which instances of objects that form categories tend to vary. That is, the generalisations that we form depend on the nature of the categories and our beliefs about the variability of features of instances of the category. Induction therefore ranges from liberal to conservative. The Processes of Induction (PI) system takes this variability into account in inducing instance-based generalisation.
Specialisation It is in the nature of our world that there are exceptions to rules. Once we have learned a rule that says that if an animal has wings then it can fly, and we encounter a penguin, our prediction is going to be wrong. PI can generate an exception rule (the q-morphism is redefined) that seems to cover the case of penguins. The unusual properties of penguins can be built into the condition part of the rule.
Expertise The acquisition of expertise can be modelled by assuming that the sequence of rules that led to a successful solution is strengthened by a “bucket brigade” algorithm. This algorithm allows strength increments to be
passed back along a sequence of linked rules. The final rule in the chain is the one that gets the most strength increments. As PI is a model designed to show how we learn from experience of instances, and as it includes algorithms for both generalisation and specialisation, it is a general model that explains the development of expertise.
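The generalisation and specialisation steps described in Information Box 8.2 can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions, not the PI system itself: rules are (conditions, action) pairs, `generalise` and `predict` are hypothetical helper names, and the "most specific matching rule wins" policy stands in for PI's default hierarchy. PI's sensitivity to the variability of category features is not modelled.

```python
def generalise(rule_a, rule_b):
    """Condition-simplifying generalisation: if two rules produce the same
    action, keep only the conditions they share."""
    cond_a, action_a = rule_a
    cond_b, action_b = rule_b
    if action_a != action_b:
        return None  # different actions: nothing to generalise over
    return (cond_a & cond_b, action_a)


brown = (frozenset({"has wings", "is brown"}), "can fly")
black = (frozenset({"has wings", "is black"}), "can fly")
default_rule = generalise(brown, black)  # colour conditions are dropped

# Specialisation: the unusual properties of the exception are built into
# the condition part of a new, more specific rule.
exception_rule = (frozenset({"has wings", "is a penguin"}), "cannot fly")


def predict(features, rules):
    """Use the matching rule with the most conditions, so that a specific
    exception overrides the general default."""
    matching = [rule for rule in rules if rule[0] <= features]
    if not matching:
        return None
    return max(matching, key=lambda rule: len(rule[0]))[1]


rules = [default_rule, exception_rule]
print(predict(frozenset({"has wings", "is grey"}), rules))
print(predict(frozenset({"has wings", "is a penguin"}), rules))
```

The first call falls back on the default expectation; the second is caught by the exception rule because it matches more conditions.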
SKILL LEARNING IN ACT-R Anderson has produced a series of models that attempt to approximate as closely as possible to a general theory of cognition. Anderson’s model, known as Adaptive Control of Thought (ACT), has evolved over the past two decades and has gone through a number of manifestations. The most recent is known as ACT-R where the “R” stands for Rational. Unfortunately, over the years there has been a kind of “genetic drift” in the meaning of the term “rational”. The earliest sense was “logically correct reasoning”. Another sense refers to the fact that organisms attempt to act in their own best interests. Chapter 2 referred to Newell and Simon’s idea of “intendedly rational behaviour” or “limited rationality”. Laird and Rosenbloom (1996, p. 2) refer to the principle of rationality ascribed to Newell that governs the behaviour of an intelligent “agent” whereby “the actions it intends are those that its knowledge indicates will achieve its goals”. Anderson’s term is based on the sense used by economists. In this sense “human behavior is optimal in achieving human goals” (Anderson, 1990, p. 28). Anderson’s “General Principle of Rationality” states that “The cognitive system operates at all times to optimize the adaptation of the behavior of the organism” (1990, p. 28). Anderson (1993, p. 47) later wrote: Certain aspects of cognition seemed to be designed to optimize the information processing of the system…optimization means maximizing the probability of achieving [one’s] goals while minimizing the cost of doing so or, to put it more precisely, maximizing the expected utility, where this is defined as expected gain minus expected cost. In a sense, where Anderson’s definition emphasises the evolutionary process that shapes our thinking, Laird and Rosenbloom’s emphasises the results of that process. Our cognitive system has evolved to allow us to adapt as best we can to the exigencies of the environment in which we find ourselves, and in line with our goals. 
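The definition of expected utility quoted above (expected gain minus expected cost) can be illustrated with a small calculation. The sketch below is an assumption-laden example rather than ACT-R's own code: expected gain is taken to be the probability of achieving the goal multiplied by the value of the goal, and the option names and numbers are invented for illustration.

```python
def expected_utility(p_success, goal_value, cost):
    """Expected gain (probability of success times goal value) minus cost."""
    return p_success * goal_value - cost


# Hypothetical options: (probability of achieving the goal, goal value, cost)
options = {
    "quick guess": (0.60, 10.0, 1.0),
    "careful search": (0.95, 10.0, 6.0),
}

utilities = {name: expected_utility(*values) for name, values in options.items()}
best = max(utilities, key=utilities.get)

for name, value in utilities.items():
    print(f"{name}: {value:.2f}")
print("chosen:", best)
```

With these invented figures the cheaper option is preferred even though it succeeds less often; raising the value of the goal or the probability gap would tip the choice the other way.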
This entails a balance between the costs involved in, say, a memory search, and the gains one might get from such a search (the usefulness of the retrieved memory). As a result of taking this view of rationality, the cognitive mechanisms built into ACT-R are based on Bayesian probabilities (see Information Box 8.3). For example, the world we inhabit has a certain structure. Features of objects in the world tend to co-vary. If we see an animal with wings and covered in feathers, there is a high likelihood that the animal can fly. It would be useful for a cognitive system to be constructed to make that kind of assumption relatively easily. The assumption is sometimes wrong, but only in a very small percentage of cases. The gains of having a system that can make fast inductions of this kind outweigh the costs of being wrong on the rare occasion. Assume that you are an experienced driver and you are waiting at a junction. A car is coming from the left signalling a right turn into your road. There is a very high probability that the car will, indeed, turn right. In this case, however, the costs of being wrong are rather high, so you might wait for other features to show themselves, such as the car slowing down, before you decide to pull out. The general structure of ACT-R is shown in Figure 8.5. There are two types of long-term memory in ACT-R: a declarative memory and a production (procedural) memory. Declarative knowledge (factual and episodic knowledge) can be verbalised, described, or reported. Procedural memory can be observed and is
expressed in a person’s performance. People can get better at using procedural skills but forget the declarative base. Anderson has consistently argued that all knowledge enters the system in a declarative form. Using that knowledge in context generates procedural knowledge which is represented as a set of productions in production memory. Bear in mind that there is an argument that says that some forms of conceptual knowledge might come about as a result of procedural knowledge, as explained in Chapter 7. For example, as a result of becoming reasonably proficient at solving the Tower of Hanoi puzzle one may notice that the smallest ring is moved on every second move; or, as a result of using a word processor, one might learn the concept of a “string” (Payne et al., 1990).
INFORMATION BOX 8.3 Bayes’ Theorem Almost all events or features in the world are based on probabilities rather than certainties. The movement of billiard balls on a table becomes rapidly unpredictable the more the cue ball strikes the other balls; uncertainties govern the movement and position of fundamental particles; if someone has long hair then that person is probably a woman; most (but not all) fruits are sweet, and so on. Fortunately, some events or covariations of features are more likely than others, otherwise the world would be even more unpredictable than it already is. Our beliefs reflect the fact that some things are more likely to occur than others. Aeschylus probably did not weigh up the benefits of going out in a boat against the potential costs of being killed by a falling tortoise. Bayes’ Theorem allows us to combine our prior beliefs, or the prior likelihood of something being the case, with changes in the environment. When new evidence comes to light, or if a particular behaviour proves to be useful in achieving our goals, then our beliefs (or our behaviour) can be updated based on this new evidence. The theorem can be expressed as:
$\mathrm{Odds}(A \mid B) = LR \times \mathrm{Odds}(A)$
where A refers to one’s beliefs, a theory, a genetic mutation being useful, or whatever; B refers to the observed evidence or some other type of event. Odds (A given B) could reflect the odds of an illness (A) given a symptom (B), for example. In the equation, Odds (A) refers to what is known as the “prior odds”, a measure of the plausibility of a belief or the likelihood of an event. For example, a suspicion that someone is suffering from malaria might be quite high if that person has just returned from Africa. The LR is the Likelihood Ratio and is given by the formula:
$LR = \dfrac{P(B \mid A)}{P(B \mid \neg A)}$
where P is the probability, B is an event (or evidence), and A is the aforementioned belief, theory, or other event. The Likelihood Ratio therefore takes into account both the probability of an event (B) happening in conjunction with another event (A) (a symptom accompanying an illness) and in the absence of A (the probability of a symptom being displayed for any other reason). Anderson has used Bayesian probabilities in his “rational analysis” of behaviour and cognition. For example, in the domain of memory, the probability that an item will be retrieved depends on how recently the item was used, the number of times it has been used in the past, and the likelihood it will be retrieved in a given context. In problem solving there is an increased likelihood that the inductive inferences produced by using an example problem with a similar goal will be relevant in subsequent similar situations.
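The odds form of Bayes’ Theorem in Information Box 8.3 can be checked with a short calculation. The sketch below assumes the standard definition of the Likelihood Ratio, the probability of the evidence B when A holds divided by its probability when A does not hold. The function names and all the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def posterior_odds(prior_odds, p_b_given_a, p_b_given_not_a):
    """Odds(A given B) = LR x Odds(A), with LR = P(B|A) / P(B|not A)."""
    likelihood_ratio = p_b_given_a / p_b_given_not_a
    return likelihood_ratio * prior_odds


def odds_to_probability(odds):
    """Convert odds in favour of A into a probability of A."""
    return odds / (1.0 + odds)


# Hypothetical figures: A = the traveller has the illness, B = a symptom.
prior_probability = 0.05
prior = prior_probability / (1.0 - prior_probability)  # prior odds

post = posterior_odds(prior, p_b_given_a=0.9, p_b_given_not_a=0.1)
post_probability = odds_to_probability(post)

print(f"prior odds      {prior:.4f}")
print(f"posterior odds  {post:.4f}")
print(f"posterior prob. {post_probability:.4f}")
```

Because the symptom is assumed to be nine times more likely with the illness than without it, the prior odds are multiplied by nine; a less diagnostic symptom would move the odds correspondingly less.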
Declarative memory in ACT-R The “chunk” is the basic unit of knowledge in declarative memory. Chunks are data structures, also known as Working Memory Elements (or “wimees”). Only a limited number of elements can be combined into a chunk, generally three. Examples would be USA, BSE, AIDS. Chunks have configural properties such that the different component elements have different roles. “1066” is a chunk because it is a meaningful unit of knowledge (assuming you know about the Norman Conquest), but the elements rearranged to form 6106 would no longer constitute a chunk. Chunks can be hierarchically organised. Broadbent (1975) found that 54% of people split items entering working memory into pairs, 29% split them into three items, 9% into four items, and 9% longer. French phone numbers, being divided into pairs of numbers, are therefore ideal. Generally speaking two or three items make a reasonably good chunk size. Chunks can also be used to represent schemas in declarative memory. Table 8.1 represents a schema for an addition problem.
TABLE 8.1 Schema representation of the problem
In the first part the problem is given
a name (“problem 1”). This is followed by a series of “slots”, the first one (the “isa” slot) being the problem type. The next row has a slot for columns which are listed in brackets. This part also shows a simple example of a hierarchical arrangement, as the columns themselves are represented as schematic chunks. If we take “column 1” as an example we can see that this represents configural information. The chunk type (represented by the “isa” slot) determines to an extent the types of associated slots that follow. The chunk size is also determined by the number of associated slots. Working memory in ACT-R Working memory is a convenient fiction. It is simply that part of declarative memory that is currently active and available. Information can enter working memory from the environment, through spreading activation in declarative memory (associative priming—see Chapter 3), or as a result of the firing of a production in production memory. For example, if your goal is to do three-column addition and you haven’t yet added the rightmost column then add the rightmost column. If your goal is to do three-column addition and you have added the rightmost column (now in working memory along with any carry) then add the middle column. If your goal is to do three-column addition and you have added the rightmost column and the middle column (both in working memory) then add the leftmost column, and so on. Production memory in ACT-R The production is the basic unit of knowledge in procedural memory. Examples of productions have already been encountered in Chapter 4 along with aspects of their role in the transfer of procedural skill. Declarative knowledge is the basis for procedural knowledge. ACT-R requires declarative structures to be active to support procedural learning in the early stages of learning a new skill. Productions in ACT-R are modular. That is, deleting a production rule will not cause the system to crash. 
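The three-column addition productions and the claim about modularity can be sketched as follows. This is a toy production system written for illustration, not ACT-R: working memory is a plain dictionary, each production is a hypothetical (name, condition, action) triple, and the arithmetic itself is left out so that only the goal-driven firing order is shown.

```python
GOAL = "three-column addition"


def make_productions():
    """Build the three productions described in the text as independent rules."""
    return [
        ("add-rightmost",
         lambda wm: wm["goal"] == GOAL and wm["done"] == [],
         lambda wm: wm["done"].append("right")),
        ("add-middle",
         lambda wm: wm["goal"] == GOAL and wm["done"] == ["right"],
         lambda wm: wm["done"].append("middle")),
        ("add-leftmost",
         lambda wm: wm["goal"] == GOAL and wm["done"] == ["right", "middle"],
         lambda wm: wm["done"].append("left")),
    ]


def run(productions, goal):
    """Fire the first matching production repeatedly until none matches."""
    wm = {"goal": goal, "done": []}  # working memory: current goal plus results
    fired = []
    while True:
        for name, condition, action in productions:
            if condition(wm):
                action(wm)
                fired.append(name)
                break
        else:
            return fired  # no production matched: stop without raising an error


full = make_productions()
without_middle = [p for p in full if p[0] != "add-middle"]

print(run(full, GOAL))
print(run(without_middle, GOAL))
print(run(full, "two-column subtraction"))
```

Each rule is self-contained, so a rule can be removed from the list without the remaining rules raising an error, and the same rules stay silent when the goal is a different one.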
There will, however, be an effect on the behaviour of the system. If you had a complete model of two-column subtraction and deleted one of the production rules that represented the subtraction problem, then you would generate an error in the
Figure 8.5. The architecture of ACT-R (Anderson, 1993).
subtraction. This is one way you can model the kinds of errors that children make when they learn subtraction for the first time (Young & O’Shea, 1981). Procedural memory in ACT-R can respond to identical stimulus conditions in entirely different ways depending on the goals of the system. Learning in ACT-R The development of cognitive skill involves somehow transforming declarative knowledge (such as the instructions for driving a car) into procedural knowledge (skilled driving). As a result of this process, knowledge is compiled. This is an analogy with computers where “high-level” languages have to be interpreted by the computer’s built-in machine code before the program can run. That is, the program has to be compiled, and the result of converting it to machine code means it is no longer directly accessible in its original form. Anderson’s views on how this comes about have evolved over the years. When we encounter a novel problem for the first time we hit an impasse—we don’t immediately know what to do. We might therefore attempt to recall a similar problem we have encountered in the past and try to use that to solve the current one. According to Anderson, this process involves interpretive problem solving. This means that our problem solving is based on a declarative account of a problem-solving episode. This would include, for example, using textbook examples or information a teacher might write on a board. Anderson argues that even if we have only instructions rather than a specific example to hand, then we interpret those instructions by means of an imagined example and attempt to solve the current problem by interpreting this example. As a result the information is compiled. “The only way a new production can be created in ACT-R is by compiling the analogy process. Following instructions creates an example from which a procedure can be learned by later analogy” (Anderson, 1993, p. 89).
Figure 8.6. The analogy mechanism in ACT-R.
In short, Anderson is arguing that learning a new production (and, by extension, all skill learning) occurs through analogical problem solving. In this sense the model is similar to Holland and co-workers’ (1986) PI model. Figure 8.6 shows the analogy mechanism in ACT-R in terms of the problem representation used throughout this book. In this model, the A and C terms represent goals (the problem statement including the problem’s goal). The B term is the goal state, and the line linking the A and B terms is the “response”: the procedure to be followed to achieve the goal state. As in other models of analogy, mapping (2 in Figure 8.6) involves finding correspondences between the example and the current problem. Before looking in a little more detail at ACT-R’s analogy mechanism try Activity 8.1.
ACTIVITY 8.1 Imagine you are given the problem of writing a Lisp function that will add 712 and 91. You have been given an example that shows you how to write a Lisp function that will multiply two numbers together:
Defun multiply-them (2 3)
* 2 3
You know that “defun” defines a function and that “multiply-them” is a name invented for that function and that * is the multiplication symbol. How would you write a function in Lisp that will add 712 and 91? (See p. 237 for answer.)
The mapping of the elements in the example onto elements in the current problem is shown in more detail in Figure 8.7. The structure in the example shows you that the problem involves a Lisp operation, that the operation involves multiplication, that there are two “arguments”, arg1 and arg2, and that there is a method for doing it shown by response 1. These elements can be mapped onto the elements of the current problem. The problem type is different; nevertheless both are arithmetic operations. They are similar because they are semantically related and because they belong to the same superordinate category type.
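The slot-by-slot mapping just described can be sketched in code. The example below is a loose illustration of solving by analogy, not ACT-R's actual mechanism: the slot names, the `map_slots` and `solve_by_analogy` helpers, and the token-substitution approach are all assumptions. To avoid pre-empting Activity 8.1, the target is a different, hypothetical problem (subtracting 4 from 9), and the response is kept in the book's informal notation rather than in runnable Lisp.

```python
# The worked example from the text, in the book's informal notation, as tokens.
example = {
    "name": "multiply-them",
    "symbol": "*",
    "arg1": 2,
    "arg2": 3,
    "response": ["Defun", "multiply-them", "(", "2", "3", ")", "*", "2", "3"],
}

# Hypothetical target problem (deliberately not the one set in Activity 8.1).
target = {"name": "subtract-them", "symbol": "-", "arg1": 9, "arg2": 4}


def map_slots(example, target):
    """Pair the filler of each slot in the example with the filler of the
    same slot in the target problem."""
    return {str(example[slot]): str(target[slot]) for slot in target}


def solve_by_analogy(example, target):
    """Copy the example's response, swapping in the mapped target elements."""
    mapping = map_slots(example, target)
    return [mapping.get(token, token) for token in example["response"]]


mapping = map_slots(example, target)
solution = solve_by_analogy(example, target)
print(mapping)
print(" ".join(solution))
```

Tokens with no counterpart in the mapping, such as “Defun” and the brackets, are carried over unchanged, which is the part of the example that generalises across both arithmetic operations.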
Figure 8.7. Mapping correspondences between the example and the current goal.
THE POWER LAW OF LEARNING It is a common experience (so common as to be unremarkable) that the more we practise something the easier it gets. Furthermore, you may have noticed that you can get to a point where it seems hard to improve any further. Top athletes, for example, have to put in a huge amount of practice just to stand still (if the metaphor doesn’t sound too bizarre). Indeed, there is a ubiquitous finding in the literature that learning something new produces a curve of a predictable shape. That is, there is a relationship between amount of practice and measures such as reaction times, recognition times, number of items recalled, number of items correct, and so on. Figure 8.8 shows the relationship between practice and measures such as reaction times or error rates. This relationship is known as the Power Law of Learning (Newell & Rosenbloom, 1981). It is called a Power Law because the equation involves practice raised to a power: $T = cP^{x}$ where T is time taken to perform a task, c is a constant, P is the amount of practice, and x represents the slope of the line in Figure 8.8. The curve in the figure shows that the benefits of practice get smaller and smaller, although we still learn something. Examples of the Power Law of Learning have turned up all over the place (Anderson, 1982; Anderson, Boyle, & Yost, 1985; Logan, 1988; Newell & Rosenbloom, 1981; Pirolli & Anderson, 1985). Anderson provides several examples and several rationales for how the Power Law operates using a Bayesian analysis and how the learning mechanisms of ACT-R spontaneously generate such learning (and forgetting) curves (see Anderson, 1990, 1993, 2000a, b for further details). CRITICISMS OF PRODUCTION-SYSTEM MODELS There have been a number of criticisms of production-system models of learning and expertise. 
For example, Anderson has argued that there is an asymmetry in production rules such that the conditions of rules will cause the actions to fire but the actions will not cause conditions to fire. The conditions and actions in a condition-action rule cannot swap places. Learning LISP for coding does not generalise to using LISP for code evaluation; for example, one can become very skilled at solving crossword puzzles, but that skill should not in theory translate into making up a crossword for other people to solve. The skills are different. We saw in Chapter 4 that transfer was likely only where productions overlapped between the old
Figure 8.8. The Power Law of Learning: (a) represents a hypothetical practice curve and (b) is the same curve represented as a logarithmic scale.
and new task (e.g., Singley & Anderson, 1989). Practising a skill creates use-specific or context-specific production rules. McKendree and Anderson (1987) and Anderson and Fincham (1994) have provided experimental evidence for this use-specificity in production rules (see Table 8.2). ACT-R successfully models this transfer or retrieval asymmetry. In Table 8.2, V1 is a variable that has the value (A B C). CAR is a “function call” in LISP that returns the first element of a list. The list in this case is (A B C) and the first element of that list is “A”. When LISP evaluates (CAR V1) the result is therefore “A”. The middle column in Table 8.2 represents a task where a LISP expression is evaluated. The third column represents a situation where someone has to generate a function call that will produce “A” from the list (A B C). McKendree and Anderson argue that the production rules involved in the evaluation task and in the generation task are different.
TABLE 8.2 Information presented in evaluation and generation tasks in McKendree and Anderson (1987)
Type of information    Evaluation    Generation
V1                     (A B C)       (A B C)
Function call          (CAR V1)      ?
Result                 ?             A
Adapted from Müller (1999).
P1: If the goal is to evaluate (CAR V1)
And A is the first element of V1
Then produce A.
P2: If the goal is to generate a function call
And ANSWER is the first element of V1
Then produce (CAR V1).
As the condition sides of these two productions do not match, the transfer between them would be limited, bearing in mind that the knowledge they represent is compiled. A similar finding was made by Anderson and Fincham (1994) who got participants to practise transformations of dates for various activities such as: “Hockey was played on Saturday at 3 o’clock. Now it is Monday at 1 o’clock.” Transforming one day and
time into another involves adding or subtracting 1 or 2 from Day 1 (e.g., Saturday +2) and Hour 1 (1–2). Practising on one transformation did not transfer to the reverse transformation. Müller (1999), however, challenged the idea of use-specificity of compiled knowledge. One effect of such use-specificity is that skilled performance should become rather inflexible; yet expertise, if it is of any use, means that knowledge can be used flexibly (this aspect of expertise is discussed in the next chapter). Müller also used LISP concepts such as LIST, INSERT, APPEND, DELETE, MEMBER, and LEFT. He also got his participants to learn either a generation task or an evaluation task using those concepts in both. His study was designed to distinguish between the results that would be obtained if the use of knowledge was context-bound and those that follow on from his own hypothesis of conceptual integration. According to this hypothesis concepts mediate between a problem’s givens and the requested answers. Concepts have a number of features that aggregate to form an integrated conceptual unit. The basic assumptions of the hypothesis (Müller, 1999, p. 194) are: (a) conceptual knowledge is internally represented by integrative units; (b) access to the internal representation of conceptual knowledge depends on the degree of match between presented information and conceptual features; (c) conceptual units serve to monitor the production of adequate answers to a problem; and (d) the integration of a particular feature into the internal representation depends on its salience during instruction, its relevance during practice, or both. Whereas ACT-R predicts that transfer between different uses of knowledge would decrease with practice, the hypothesis of conceptual integration predicts that transfer between tasks involving the same concepts would increase with practice. 
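The use-specificity claim that Müller set out to test can be illustrated by writing productions P1 and P2 as two separate functions whose conditions do not overlap. This is a hedged sketch rather than a model from any of the cited studies: the goal dictionaries and the function names `p1_evaluate` and `p2_generate` are hypothetical, and CAR is simulated by taking the first element of a Python list.

```python
V1 = ["A", "B", "C"]  # the list (A B C) from Table 8.2


def p1_evaluate(goal):
    """P1: if the goal is to evaluate (CAR V1) and A is the first element
    of V1, then produce A."""
    if goal.get("task") == "evaluate" and goal.get("call") == "(CAR V1)":
        return V1[0]
    return None  # condition side does not match


def p2_generate(goal):
    """P2: if the goal is to generate a function call and ANSWER is the
    first element of V1, then produce (CAR V1)."""
    if goal.get("task") == "generate" and goal.get("answer") == V1[0]:
        return "(CAR V1)"
    return None  # condition side does not match


evaluation_goal = {"task": "evaluate", "call": "(CAR V1)"}
generation_goal = {"task": "generate", "answer": "A"}

print(p1_evaluate(evaluation_goal), p1_evaluate(generation_goal))
print(p2_generate(generation_goal), p2_generate(evaluation_goal))
```

Each production responds only to its own kind of goal, so in this sketch practice that strengthened P1 would do nothing for the generation task. The conceptual-integration hypothesis predicts otherwise, because both tasks draw on the same concept of CAR.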
Müller found typical learning curves in his experiments, but did not find that this presumably compiled knowledge was use-specific. There was a relatively high transfer rate between evaluation and generation tasks. Thus the overlap and relevance of conceptual features were important in predicting transfer and allowed flexibility in skilled performance. Going back to our example of the crossword puzzler, it seems reasonable to suppose that a skilled puzzle solver can transfer his or her knowledge of how to solve crossword puzzle clues to the task of generating them. Müller’s hypothesis can account for this type of transfer whereas use-specificity of productions should make such transfer unlikely. Others have criticised production systems as representations of cognitive skill. As Copeland (1993, p. 101) puts it: When I turn out an omelette to perfection, is this because my brain has come to acquire an appropriate set of if-then rules, which I am unconsciously following, or is my mastery of whisk and pan grounded in some entirely different sort of mechanism? It is certainly true that my actions can be described by means of if-then sentences: if the mixture sticks then I flick the pan, and so on. But it doesn’t follow from this that my actions are produced by some device in my brain scanning through lists of if-then rules of this sort (whereas that is how some computer expert systems work). Copeland is arguing here that regarding knowledge as production rules may simply be a useful fiction. In a similar vein, Clancey (1997) argues that it is a mistake to equate knowledge with knowledge representation. The latter is an artefact. Architectures such as ACT-R and SOAR are descriptive and language-based, whereas human knowledge is built from activities: “Ways of interacting and viewing the world, ways of structuring time and space, pre-date human language and descriptive theories” (Clancey, 1997, p. 270). 
Furthermore he argues that too much is missed out of production-system architectures, such as the cultural and social context in which experts operate.
Johnson-Laird (1988a, b) has suggested that production-system architectures can explain a great deal of the evidence that has accrued about human cognition. However, their very generality makes them more like a programming language than a theory and hence difficult to refute experimentally. He also argues (Johnson-Laird, 1988b) that condition-action rules are bound to content (the “condition” part of the production) and so are poor at explaining human abstract reasoning. Furthermore, he has argued (1988b) that regarding expertise as compiled procedures suggests a rigidity of performance that experts do not exhibit (this aspect is discussed further in the next chapter). Understanding and conceptual change Some researchers have argued that production models of learning tend to ignore metacognitive skills and the social contexts of learning (see Glaser & Bassok, 1989, for a comparison of learning theories). Brown and Palincsar (1989), for example, state that students learn specific knowledge-acquisition strategies. Group settings encourage understanding and conceptual change and these are contrasted with “situations that encourage automatization, ritualization, or routinization of a skill, in which speed is emphasized at the expense of thought” (Brown & Palincsar, 1989, p. 395). Metacognitive strategies are also emphasised in the study of the role of self-explanations in learning (Chi et al., 1989, 1994). Micheline Chi and colleagues lay stress on the declarative (conceptual) knowledge that underpins understanding and problem solving. For example, Chi and Bassok (1989) argue that “the critical issues are how and to what extent students understand the declarative principles that govern the derivation of a solution” (p. 254). 
Cobb and Bowers (1999) also argue that the cognitive approach to learning adopted by Anderson (among others) may not be helpful in deciding what to do in a classroom whereas the situated learning approach is more useful in that context (see Chapter 4 for more on this debate). NEUROLOGICAL EVIDENCE FOR THE DEVELOPMENT OF PROCEDURAL KNOWLEDGE Neurological evidence for a distinction between declarative and procedural knowledge has existed for some time (see e.g., Squire, Knowlton, & Musen, 1993). More recently there has been some more detailed evidence of the mechanisms involved in chunking, the learning of condition-action bonds, and the development of procedural skill and the concomitant forgetting of the declarative knowledge that originally supported it. Graybiel (1998) has found that the striatum (basal ganglia) is involved in recoding information in the cortex, causing sequences of actions to become automated. This comes about by the “chunking” of motor and cognitive action sequences. This would appear to give a neurological correlate of the kinds of stimulus-response or condition-action learning that ACT-R incorporates. One of the results of this kind of recoding and chunking is that a sequence of actions that was once conscious and slow becomes automated and fast (Graybiel, 1998, p. 131): The lack of conscious awareness in S-R learning may also be an advantageous property for a chunking mechanism in that the action chunks are treated as units (not, for example, as response chains). We do not want ‘supervisory attention’ (Shallice, 1988)…or conscious manipulation to intervene inside the macro unit itself. Chunks take their advantage from being manipulable as entities, and the intervention of consciousness or attention might actually disrupt their smooth implementation.
Skill learning was also examined by Raichle (1998). His aim was to identify the brain systems involved when a verbal task is novel and effortful, and to compare these with those brain systems that were active when the task became routine and automated. Studies of the type that Raichle carried out involved imaging the brain using Positron Emission Tomography (PET) scans or functional Magnetic Resonance Imaging (fMRI). Five brain states of the participants were examined. The participants:
1. were alert, eyes closed, performing no task;
2. maintained visual fixation on a TV monitor containing only a fixation point;
3. maintained visual fixation on a TV monitor while common English nouns were presented just below the point of fixation;
4. read aloud names of objects (nouns) presented at the rate of 40 per minute;
5. spoke aloud an appropriate use or verb for each noun as it was presented.
Computers were used to subtract the digital image of one stage from another. For example, subtracting the “illuminated” parts of the brain in state 1 from state 2 shows what areas of the brain are involved in visually fixating a point, without it being confused with the areas of the brain active when resting. Practice on a verbal task led to reduction in voice onset latency; that is, the more often a noun was presented, the more quickly participants responded, usually with a stereotyped answer, giving the same verb each time the noun appeared. The learning task revealed dramatic changes over time in the brain regions that were active during the performance of the task. Large areas of brain were involved in early learning, but the number of areas active dropped dramatically with practice. Furthermore, previously inactive centres became active with practice. 
It would appear that early learning requires a lot of brain activity, particularly those regions involved in monitoring and planning behaviour such as the anterior cingulate cortex, the left prefrontal cortex, and the left temporal cortex. The wide distribution of areas of the brain active at the beginning seems to correspond to the supervisory attention system (consciousness, in effect). When these brain areas are no longer active and the task becomes automated, the task is often no longer accessible to consciousness. Whether one agrees or not that “cognitive skills are realised by production rules” depends on whether one regards a production-system account as an appropriate level of explanation. One way perhaps of deciding its appropriateness is by its explanatory power. On that criterion ACT-R is indeed powerful. Nevertheless, there are rival architectures such as SOAR and connectionism, so we are still some way from finding a unified theory of cognition.

SUMMARY

1. There are several explanatory models of how we learn from our experience of the world. Most acknowledge the supremacy of analogising as the prime mechanism for inducing schemas and for learning new skills.
2. Induction is a powerful and ubiquitous learning mechanism whereby we abstract out the commonalities of experience and use them thereafter as the basis of our deductions and predictions about the world.
3. There is generally a correlation between superficial features of the environment and its underlying structure. It is therefore useful to pay attention to and retain specific details of our experience.
4. Conservative induction means that our memory contains specific details of the examples or context in which the original learning took place, and sheds them only when it is clear what aspects can be generalised over.
5. Schema induction is often a piecemeal and conservative process whereby the early schema contains parts of the acquisition context. The schema becomes increasingly decontextualised over time.
6. Schemas contain different types of knowledge that allow us both to categorise the problems we encounter and to indicate how the problems might be solved.
7. Once a category of problems has been identified, we also need to access the relevant procedure for solving it. Several models exist that assume that much of human thinking is rule-based. Production-system architectures based on if-then or condition-action rules have been used to model the development of expertise from the acquisition of declarative information and its use by novices to the automated categorisations and skills of the expert.
8. Criticisms of production-system architectures as general models of learning and problem solving have centred on:
• the idea that cognition can best be modelled as if-then rules;
• lack of detail about how conceptual knowledge underpins solutions, and the development of productions;
• the variable influence of context, especially social context.
9. There is neurological evidence for the kinds of condition-action learning built into production systems, as well as for the automatisation and the relaxation of conscious control with practice.
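As a concrete illustration of the if-then or condition-action rules mentioned in the summary, the following is a minimal sketch of a production-system cycle. It is not ACT-R or any published architecture: the function `run_productions`, the two rules, and the physics-flavoured contents of working memory are hypothetical, invented only to show one rule categorising a problem and a second rule retrieving a procedure for that category.

```python
def run_productions(rules, working_memory, max_cycles=10):
    """Repeatedly fire the first rule whose conditions all hold in memory
    and whose action would add something new; stop when no rule applies."""
    memory = set(working_memory)
    fired = []
    for _ in range(max_cycles):
        for name, conditions, action in rules:
            if conditions <= memory and action not in memory:
                memory.add(action)   # the action deposits a new element in memory
                fired.append(name)
                break
        else:
            break  # no rule matched on this cycle
    return memory, fired


# Hypothetical rules: (name, set of conditions, action).
rules = [
    ("classify",
     {"goal: solve problem", "problem mentions inclined plane"},
     "category: forces problem"),
    ("select-procedure",
     {"category: forces problem"},
     "procedure: apply Newton's second law"),
]

memory, fired = run_productions(
    rules, {"goal: solve problem", "problem mentions inclined plane"})
print(fired)  # ['classify', 'select-procedure']
```

The second rule can fire only after the first has placed the category in working memory, which mirrors the ordering in the summary: categorise the problem first, then access the procedure for solving it.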
CHAPTER NINE Experts, novices and complex problem solving
The result of learning and continual practice in a particular domain can lead to what is called expertise. As a rule of thumb, an expert is someone who has had 10 years’ experience in that domain (Simon & Chase, 1973). There is a variety of ways in which expertise can be, and has been, studied. For example, one can look at the development of expertise over time (which we began to look at in the last chapter); one can compare directly experts with novices on a number of dimensions; one can look at individual examples of genius or of “ordinary” experts; or one can find out the kinds of things that experts do well. Nevertheless, whatever aspect one concentrates on, there is at least an implicit comparison between experts and novices.

WHAT DISTINGUISHES EXPERTS AND NOVICES

In studying simple puzzle problems psychologists assumed that what was learned there about the psychology of problem solving could be generalised to other “real-life” problem situations. However, most problems people face are ill-defined and usually make demands on at least some relevant prior experience and knowledge. The more knowledge a person has about a domain (however defined) the more that person is equipped to deal with complex problems in that domain. Thus one can expect a broad range of individual differences in problem-solving ability in a particular domain. One domain that has created a bridge between puzzle problems and an understanding of the development of expertise is chess. Early influential studies of the cognitive processes involved in expertise were carried out by DeGroot (1965, 1966). A Master chess player himself, DeGroot was interested in finding out how chess players of different ranks planned their moves. He showed five Grandmasters and five expert players a number of middle games in chess and asked them to choose the next move while “thinking aloud”.
He found that the Grandmasters made better moves than the experts (as judged by independent raters) and yet the former did not seem to consider more moves nor search any deeper than the experts. DeGroot argued that this was because the Grandmasters had a greater store of games they had studied and played and of different board positions. In other words the major difference between novices and experts seemed to be in the knowledge they possessed. This kind of research represented a move away from the studies of general problem-solving capacities and heuristics that were often built into AI models of problem solving. Although Newell and Simon (1972) concentrated on the demands a task put on a general information-processing system, they also pointed to the ways that an individual might get over processing limitations. There was therefore a shift from the 1970s onwards, from general models of cognitive processes to models that could incorporate specialised knowledge such as expert systems. Indeed research into artificial intelligence and human expertise tends to go hand in hand these days (see e.g., Hoffman, 1992). This shift in focus from domain-general to domain-specific expertise has not been universally welcomed (Stork, 1997).
As a result of research over the past few decades, a number of qualitative and quantitative differences in the performance of experts and novices have been revealed. Chi, Glaser, and Farr (1988, pp. xvii–xx) have listed seven “key characteristics of experts’ performance”:
1. Experts excel mainly in their own domain.
2. Experts perceive large meaningful patterns in their domain.
3. Experts are fast: they are faster than novices at performing the skills of their domain, and they quickly solve problems with little error.
4. Experts exhibit superior short-term and long-term memory performance.
5. Experts see and represent a problem in their domain at a deeper (more principled) level than novices; novices tend to represent a problem at a superficial level.
6. Experts spend a great deal of time analysing a problem qualitatively.
7. Experts have strong self-monitoring skills.
These characteristics have emerged from the variety of ways in which expertise has been examined. Expertise has been examined in terms of differences in predispositions, such as intelligence, personality, thinking styles, motivation, and so on. The assumption here is that there is more to the differences between experts and novices than can be accounted for by the knowledge they possess. We have already looked at the development of procedural skills in the last chapter. On the road to expertise people develop automatic procedures for dealing with routine situations. It has often been argued that there are identifiable stages on the way to skilled performance on complex tasks, and that there are measurable differences in the performance of people at different stages. Another general characteristic shown up by these seven key characteristics is the way experts represent a task. It is not just quantity of knowledge but the way the knowledge is structured that affects how an individual represents a problem.
Third, one can look at the way expertise develops over time from naïve (no knowledge of the domain), to novice (some little knowledge), to intermediate (a few years’ experience), to expert (some 10 years’ experience). Experts end up with compiled or automated procedures that help them perform better than novices. Fourth, experts may have developed reasoning or problem-solving strategies and heuristics to help them deal with novel or difficult problems in their domain of knowledge. These include variations in the way resources or time are allocated to planning or representing a problem in the first place. Differences in knowledge can also affect cognitive processes such as perception and the role played by working memory. Expertise can change how one perceives a situation visually (at least in some domains). These dimensions are summarised in Figure 9.1.

ARE EXPERTS SMARTER? ARE THERE DIFFERENCES IN ABILITIES?

Is knowledge the only or main factor that leads to expertise, or are there other factors (such as “ability”) to be taken into account? For over a century there have been many diverging explanations for exceptional performance in a particular domain. They have generally tended to take a stance somewhere along two dimensions: innate versus acquired ability, and domain-specific versus domain-general ability (Ericsson & Smith, 1991). Some explanations have included the role played by personality, motivation, and thinking styles. One specific ability that has long been assumed to play a part in exceptional performance is “intelligence” (the quotation marks are meant to represent the slipperiness of this concept). In other words one could ask the question: Do you need to be intelligent to be an expert? This question is essentially asking
Figure 9.1. Dimensions in which one can understand or study expertise.
if there is a domain-general ability that leads to expertise in a chosen field. Some studies have suggested that expert chess players also performed well in other fields such as chess journalism and languages (DeGroot, 1965; Elo, 1978). However, there seems to be remarkably little correlation between intelligence and other measures such as subsequent occupation, social status, money earned, and so on, despite a high correlation with success in school tests (Ceci, 1996; Sternberg, 1997a). Schneider, Körkel, and Weinert (1989) found that children who were highly knowledgeable about football, but who were low on measured IQ scores, could nevertheless outperform other children with high IQ scores in reading comprehension, inferencing, and memory tasks if those tasks were in the domain of football. Ceci and Liker (1986) compared two groups of racetrack goers, one of which was expert at predicting what the odds would be on a horse at post time. The expert group, unlike the non-expert group, used a complex set of seven interacting variables when computing the odds. Each bit of information (one variable) they were given about a real horse or a hypothetical one would change the way they considered the other bits of information (the other six variables). Despite the “cognitive complexity” of the task, the correlation between the experts’ success at the racetrack and their IQ scores was −.07, which is no correlation at all, in fact. Swanson, O’Connor, and Carter (1991) divided a group of schoolchildren into two sub-groups based on their verbal protocols while engaged in a number of problem-solving tasks. One sub-group was designated as having “gifted intelligence” because of the sophistication of the heuristics and strategies they employed.
The sub-groups were then compared with each other on measures of IQ, scholastic achievement, creativity, and attribution (what people ascribe success or failure to). No substantial differences were found between the two groups. Swanson et al. concluded that IQ, among other measures, is not directly related to intelligence defined in terms of expert/novice representations. If a person shows “gifted intelligence” on a task, this does not mean that that person will have a high IQ. Does becoming an expert in a domain help you in understanding or developing expertise in another? In a review of the literature, Frensch and Buchner (1999) have pointed out that there is little evidence for expertise in one domain “spreading” to another. Ericsson and Charness (1997) have also stated (although with specific reference to memory) that “Experts acquire skill in memory to meet specific demands of encoding and accessibility in specific activities in a given domain. For this reason their skill is unlikely to transfer from one domain to another” (Ericsson & Charness, 1997, p. 16, emphases added). On the other hand, there can be skills developed in one domain which can be transferred to another where the skills required overlap to some extent. Schunn and Anderson (1999) tested the distinction between domain-expertise and task-expertise. Experts in the domain of the cognitive psychology of memory (with a mean of 68 publications), social science experts (the task experts, with a mean of 58 publications), and psychology undergraduates were given the task of testing two theories of memory concerning the effects of massed versus distributed practice. As they designed the experiment they were asked to think aloud. All the experts mentioned the theories as they designed the experiment, whereas only a minority of students referred to them; nor did the students refer often to the theories when trying to interpret the results.
Domain experts designed relatively complicated experiments manipulating several variables, whereas task experts designed simple ones keeping the variables under control. The complexity of the experiments designed by the students was somewhere in between. Schunn and Anderson claim that, at least in this domain, there are shared general transferable skills that can be learned more or less independently of context. Nevertheless, we need some way of explaining why one person can be outstanding in a field whereas someone else with the same length of experience is not. Factors that have been used to explain exceptional performance are personality and thinking styles. People vary. Some are better at doing some things than others. Gardner (1983) has argued that there are multiple intelligences that can explain exceptional performance by individuals in different domains. In this view, exceptional performance or expertise comes about when an individual’s particular intelligence profile suits the demands of that particular domain. Furthermore, people differ in their thinking styles. While some prefer to look at the overall picture in a task or domain, others are happier examining the details. While some are very good at carrying out procedures, others prefer to think up these procedures in the first place (Sternberg, 1997b). Because people vary in their experience, predispositions, and thinking styles, it is possible to devise tests in which person A will perform well and person B will perform poorly and vice versa (Sternberg, 1998). A might do well in a test of gardening and poorly in an IQ test; B may perform well in the IQ test but poorly in the test of gardening. In this scenario measures of intelligence such as IQ tests are really measures of achievement. Sternberg has therefore argued that intelligence and developing expertise are essentially the same thing and that the intelligence literature should be a subset of the expertise literature. 
Nevertheless, knowledge differences are not the only measures of individual differences in expertise. There are also differences in people’s ability to gain and exploit knowledge in the first place. The wide variety of interacting variables involved in skilled performance gives rise to individual differences in performance. Sternberg and Frensch have pointed out the same thing, although their argument is based on Sternberg’s own model of intelligent performance. They state (1992, p. 193):
The reason that, of two people who play chess a great deal, one may become an expert and the other a duffer is that the first has been able to exploit knowledge in a highly efficacious way, whereas the latter has not. The greater knowledge base of the expert is at least as much the result as the cause of the chess player’s expertise, which derives from the expert’s ability to organize effectively the information he or she has encountered in many, many hours of play.

To sum up this section: although there are generally poor correlations between exceptional performance and measures of ability or personality, there is a range of factors (including innate ones) that can lead to expertise and exceptional performance. These factors also help to account for wide individual differences in performance on the same task despite the same amount of experience.

SKILL DEVELOPMENT

The last chapter dealt with the acquisition of skill in some detail. Following on from that, the recent literature has tended to emphasise the role of extended practice. For example, Ericsson, Krampe, and Tesch-Römer (1993) argued that expert performance was a function of the amount of deliberate practice in a particular skill such as violin playing. The next few sections look at research into the development of expertise and novice-expert differences that tends to ignore individual differences within each sub-group. Some researchers have listed various stages people go through as they move from novice to expert. For example, Dreyfus (1997) reviews a five-stage model from novice at stage 1, to advanced beginner at stage 2, competent at stage 3, proficient at stage 4, leading to expertise at stage 5. Glaser (1996) has referred to three general stages in what he has termed a “change of agency” in the development of expert performance. The stages are termed external support, transitional, and self-regulatory.
The first stage is one where the novice receives support from parents, teachers, coaches, and so on, who help structure the environment for the novice to enable him or her to learn. The second stage is a transitional stage where the “scaffolding” provided by these people is gradually withdrawn. The learner at this stage develops self-monitoring and self-regulatory skills, and identifies the criteria for high levels of performance. In the final self-regulatory stage the design of the learning environment is under the control of the learner. The learner might seek out the help of teachers or coaches or other sources of information when he or she identifies a gap or shortcoming in performance, but the learning and the regulation of performance is under the control of the learner. Another stage model is presented by Schumacher and Czerwinski (1992), this time the model refers to the development of memory representations of systems such as a computer-operating system. The three stages are pretheoretical, experiential and expert. The pretheoretical stage involves retrieval of specific instances from memory based on superficial features of the current situation and the instances stored in memory. This works assuming the superficial features are correlated with the underlying structural features of the system. In the experiential stage an understanding of the causal relations that underpin the system emerges. This stage sees the emergence of abstractions from experience of instances. In the expert stage the learner can make abstractions across various system representations. The knowledge gained is therefore transferable. An understanding of experts’ mental models of systems has been the basis of research into how one might teach novices the appropriate representations to understand complex systems. A conceptual model of a system such as a computer-operating system can form the basis of instructional materials. 
However, to make sure that such instructional materials are effective the conceptual models on which they are based should be as accurate as possible. Hence experts are required so that the appropriate models can be generated and used. Cardinale (1991), for example, found that training using such conceptual models of
systems made significant changes in the conceptual, strategic, and syntactic knowledge of low-ability novices. Similarly, in the domain of statistics, Hong and O’Neill (1992) found that presenting appropriate mental models to novices in the form of diagrams or other conceptual instruction before procedural or quantitative instruction produced fewer misconceptions than presenting a purely descriptive explanation. Schumacher and Czerwinski’s stage model reflects the acquisition of problem-solving schemas outlined in the previous chapter. Novice to expert shifts of this type have been modelled by Elio and Scharf (1990). They developed a computer model called EUREKA that begins by using novice strategies such as means-end analysis and builds up to the conceptual knowledge of the expert. EUREKA models the shift from surface feature understanding of problems, and develops schemas for problem types along with the associated solution procedures. As EUREKA solves problems, it abstracts out the commonalities between them, and the problem features that it identifies change over time to become more discriminating. In this way the model begins to identify features that reflect fundamental principles in physics. Other examples of computer models designed to learn from experience include Larkin, Reif, Carbonell, and Gugliotta (1986) and Keane (1990). Lamberts (1990) used a hybrid system that employed a connectionist system that learned to categorise and identify a problem based on the problem features, and a production system that employed the relevant solution strategy to solve it.

The intermediate effect

In some domains at least there is evidence of an intermediate effect as one of the stages that learners go through on the way to becoming experts.
Lesgold and colleagues (Lesgold et al., 1988; Lesgold, 1984) have found evidence for such an intermediate effect, where there is a dip in performance on certain kinds of tasks such that novices outperform the intermediates despite the greater experience of the latter. Lesgold (1984) found that third- and fourth-year hospital residents performed less well at diagnosing X-ray films than either first- and second-year residents or experts. The same phenomenon was found by Patel and Groen (1991) (see also Schmidt, Norman, & Boshuizen, 1990). Lesgold argues that this should not be surprising, as the same phenomenon can be found in the intermediate phase of language learning that children go through where they produce over-regularisations. For example, a child might start off by saying “I went…” but by absorbing the rules for the formation of English past tenses, the child goes through a stage of saying “I goed…” before learning to discriminate between regular and irregular past tenses. Patel and Groen (1991) have argued that the intermediates have a lot of knowledge but that it is not yet well organised. This lack of coherent organisation makes it hard for them to encode current information or retrieve relevant information from long-term memory. Experts have a hierarchically organised schematic knowledge that allows them to pick out what is relevant and ignore what is irrelevant (Patel & Ramoni, 1997). Novices don’t know what is relevant, and stick to the surface features of situations and base their representations on them. Intermediates, on the other hand, try to “process too much garbage”. As a result, the novices in Lesgold’s (1984) study rely on the surface features of the problem (which are usually diagnostic of the underlying features), the experts take the context into account, but the intermediates try to do so and fail.
Raufaste, Eyrolle, and Mariné (1999) have argued that the intermediate effect can be explained by assuming that some kinds of knowledge are only weakly accessible. Much of an expert’s knowledge is implicit, and experience adds this implicit knowledge to the structures originally acquired through the academic study of the domain. Furthermore Raufaste et al. distinguish between two types of experts: basic experts and “super experts”. Basic experts are typical of the population (at least, of radiologists in France) whereas “super experts” refers to the very small number of world-class experts. This distinction is similar to that between expert chess players and Grandmasters. There is, Raufaste et al. argue, a qualitative difference
Figure 9.2. Raufaste and co-workers’ (1999) reinterpretation of the intermediate effect.
between basic experts and super experts. If Lesgold had used basic experts in his studies then the U-shaped curve produced by the performance of novices, intermediates, and basic experts on atypical X-ray films would have become a straight line. Figure 9.2 compares Lesgold’s model of the intermediate effect and Raufaste and co-workers’ model. Although basic experts have typically had less practice than super experts, an appeal to weakly associated memory, and hence implicit or intuitive knowledge, is probably not enough to explain a qualitative gap between basic experts and the super experts. There still seems to be some magic involved in producing the gap. Other changes in the shift from novice to expert concern the nature of the information presented and what the novice or expert finds useful. Kalyuga, Chandler, and Sweller (1998) have shown that different types of presentation of material can reduce or increase the demands on working memory, or what they term “cognitive load” (Sweller, 1988). In the early stages of learning from textbooks, learners are often presented with text-based explanations of concepts along with diagrams or figures. Sweller and his colleagues have shown that the text and illustrations have to be well integrated in order to reduce cognitive load in the early stages of learning a domain. Kalyuga et al. have demonstrated that the same diagram that has been integrated with the text to reduce cognitive load for novices can increase cognitive load for experts, who do better with the diagram alone without the redundant text (see also Marcus, Cooper, & Sweller, 1996; Ward & Sweller, 1990). According to Glaser (1996), the final stage in the development of expertise involves a degree of self-regulatory skill. Chi, Glaser, and Rees (1982) found that physics experts were better than novices at assessing a problem’s difficulty. Experts also have knowledge about the relative difficulty of particular schemas.
KNOWLEDGE ORGANISATION

The fact that experts know more than novices will not come as a shock to many people. Nor will the fact that experts have more experience and skill than novices. In fact these features of expertise are defining features. Of more psychological interest is how that knowledge is organised.
Moderately abstract conceptual representations

Zeitz (1997) argues that experts’ knowledge is represented at an intermediate level of abstraction known as a moderately abstract conceptual representation (MACR). This level of representation is a compromise between abstractions, such as equations in physics or chemistry, and concrete specific problems. A representation at too abstract a level (such as equations and general principles) is not appropriate for use. For example, a set of equations by itself does not include conditions for their use. On the other hand a representation at too concrete a level is not readily transferable. An expert would have to have a representation of thousands of individual problems, which does not seem to be an economical way of representing problems. Some kind of intermediate representation appropriate to a given domain would seem to be needed. “The development of expertise amounts to becoming facile at processing information at the level of abstraction appropriate for use in a given domain” (Zeitz, 1997, p. 45). Zeitz suggests that there are a number of benefits of having a level of abstraction that is neither too abstract, and hence shorn of content, nor too concrete and detailed:
• Detailed representations become rapidly inaccessible due to retroactive interference. Abstractions are more stable (see Brainerd & Reyna, 1993).
• A MACR can be retrieved by a broader range of retrieval cues and, therefore, is more accessible.
• A MACR is easier to manipulate due to its schematic nature.
• Processing non-essential details in a concrete representation may produce no accuracy gains and may interfere with reasoning.
MACRs provide a link between low-level specific problems and the experts’ extensive domain knowledge. One could argue that device models and causal models are examples of MACRs. Such models help link the details of a system with high-level abstractions.
In science education, causal models link bottom-up details with high-level abstractions (equations) (White, 1993). Koedinger and Anderson (1990) found that experts in geometry plan at a higher level of abstraction than basic step-by-step procedures. As a result they are able to skip steps. Expert computer programmers have been found to have schemas or abstract plans for particular programming tasks. These are often script-like plans or stereotypical “chunks” of code. An “averaging plan” would be the knowledge that an average involves a sum divided by a count (Rist, 1989). Programmers plan a program at an abstract level by co-ordinating and sequencing chunks of code to achieve a desired task (Ehrlich & Soloway, 1984). Such plans are MACRs because they relate abstract knowledge to its use in stereotypical situations. Adelson (1981) found that novices categorised computer programming code in terms of the syntax, whereas experts used a more hierarchical organisation based on underlying principles of programming. Chan (1997) also found that experts’ knowledge of architecture was organised in a hierarchical and functional way. Hierarchically organised knowledge structures should allow experts to store MACRs as a kind of basic-level representation and to be flexible in the use they make of their knowledge.

Schema development and the effects of automatisation

Schemas are useful because they allow aspects of knowledge to be abstracted out of the context in which the knowledge was originally gained. That knowledge can therefore be transferred to another context. The benefits are that the abstracted knowledge can be applied to a wide variety of similar situations. However, there are drawbacks to schematisation (Feltovich, Spiro, & Coulson, 1997, p. 126):
…being abstractions, they are reductive of the reality from which they are abstracted, and they can be reductive of the reality to which they are applied… An undesirable result is that these mental structures can cause people to see the world as too orderly and repeatable, causing intrusion on a situation of expectations that are not actually there.

One would therefore expect that one of the effects of schema acquisition, and of knowledge compilation in general, would be a potential rigidity in expert performance. As we have seen (Chapter 3) rule learning can lead to functional fixedness or Einstellung. Sternberg and Frensch (Frensch & Sternberg, 1989; Sternberg & Frensch, 1992), for example, have argued that experts’ automaticity frees up resources to apply to novel problems. That is, their knowledge is compiled and no longer accessible to consciousness (Anderson, 1987; Carbonell, 1982; Glaser & Bassok, 1989). Automaticity can therefore lead to lack of control when a learned routine is inappropriate to the task in hand. When a routine procedure is required there is probably little difference between expert and novice performance (Spence & Brucks, 1997). Indeed there is an argument that the routinisation of behaviour can lead experts to make more mistakes than novices (or at least mistakes that novices are unlikely to make). In Reason’s Generic Error-Modelling System (GEMS) (Reason, 1990) there are three types of error: skill-based (SB) slips and lapses (that occur in automatised actions), rule-based (RB) mistakes, and knowledge-based (KB) mistakes. SB and RB errors are likely to be caused by following well-learned “strong but wrong” routines, and hence are more likely with increased skill levels. In SB errors “the guidance of action tends to be snatched by the most active motor schema in the ‘vicinity’ of the node at which an attentional check is omitted or mistimed” (Reason, 1990, p. 57).
RB errors are likewise triggered by mismatching environmental signs to well-known “troubleshooting” rules. KB mistakes are unpredictable as “they arise from a complex interaction between ‘bounded rationality’ and incomplete or inaccurate mental models… No matter how expert people are at coping with familiar problems, their performance will begin to approximate that of novices once their repertoire of rules has been exhausted by the demands of a novel situation” (1990, p. 58). There are therefore circumstances when novices may do better or at least no worse than experts. As novices and experts differ in the way they represent information, Adelson (1984) hypothesised that expert programmers would represent programs in a more abstract form than novices. Novices, she argued, would represent programs in a more concrete form than experts, who would represent them in terms of what the program does. An entailment of this theory is that novices and experts would differ in how they dealt with questions based on a concrete or an abstract representation of a program. In one experiment Adelson represented a program as a flowchart (an abstract representation) or described how it functioned (a concrete representation). She then asked questions about either what the program did (abstract) or how it did it (concrete). She found that she could induce experts to make more errors than novices if they were asked a concrete question when presented with an abstract representation. Likewise novices made more errors in the abstract condition. Flexibility in thinking There would appear to be something wrong with this conception of expertise where learned procedures, automatisation, compiled knowledge, schematisation—call it what you will—lead to degraded performance. Experts wouldn’t be experts unless they could solve problems flexibly. 
Hatano and Inagaki (1986) have distinguished between routine expertise, which refers to the schema-based knowledge experts use to solve standard familiar problems efficiently, and adaptive expertise that
permits experts to use their knowledge flexibly by allowing them to find ad hoc solutions to non-standard unfamiliar problems. According to Lesgold et al. (1988), expert radiologists flexibly change their representations when new problem features manifest themselves. For example, Feltovich et al. (1984) gave expert and novice clinicians clinical cases to diagnose that had a “garden path” structure. That is, the pattern of symptoms indicated a “classical” (but wrong) disease. It was the novices who misdiagnosed the disease rather than the experts with their supposed abstracted out schema for patterns of disease. Experts were more likely to reach a correct diagnosis after consultation with the patient. Feltovich, Spiro, and Coulson (1997) argue that the very fact that experts have a large well-organised and highly differentiated set of schemas means that they are more sensitive to things that don’t fit. When that happens the expert is more likely to engage in more extensive search than the novice. Table 9.1 shows a protocol from an expert examining an X-ray of a patient who had had a portion of a lung removed a decade earlier. As a result the slide seemed to show a chronic collapsed lung. An effect of the removed portion was that the internal organs had moved around. The protocol in the Table shows the expert testing the “collapsed lung” schema and finding that there are indications that there are other features which don’t quite fit in with that schema. He switches to a lobectomy schema which, in the final part of the protocol, he also checks. From their work, Lesgold et al. (1988, p. 317) have suggested that the behaviour of the expert radiologist conforms to the following general pattern: First, during the initial phase of building a mental representation, every schema that guides radiological diagnosis seems to have a set of prerequisites or tests that must be satisfied before it can control the viewing and diagnosis. 
Second, the expert works efficiently to reach the stage where an appropriate general schema is in control. Finally, each schema contains a set of processes that allows the viewer to reach a diagnosis and confirm it.
According to Voss, Greene, Post, and Penner (1983) experts’ knowledge is flexible because information can be interpreted in terms of the knowledge structures the experts have developed and new information can be assimilated into appropriate structures. Similarly, Chi, Glaser, and Rees (1982) have stated that experts have both more schemas and more specialised ones than novices, and this allows them to find a better fit to the task in hand. Experts’ extensive knowledge and categorising ability may lead to expert intuition. As intuition is by definition not accessible to consciousness it can best be modelled in a connectionist system rather than a rule-based one (Partridge, 1987).
TABLE 9.1 Protocol excerpts from an expert, showing early schema invocation, tuning, and flexibility
Something is wrong, and it’s chronic: “We may be dealing with a chronic process here…”
Trying to get a schema: “I’m trying to work out why the mediastinum and the heart is displaced into the right chest. There is not enough rotation to account for this. I don’t see a displacement of fissures [lung lobe boundaries].”
Experiments with collapse schema: “There may be a collapse of the right lower lobe but the diaphragm on the right side is well visualized and that’s a feature against it…”
Does some testing; schema doesn’t fit without a lot of tuning: “I come back to the right chest. The ribs are crowded together… The crowding of the ribcage can, on some occasions, be due to previous surgery. In fact,… The third and fourth ribs are narrow and irregular so he’s probably had previous surgery.”
Cracks the case: “He’s probably had one of his lobes resected. It wouldn’t be the middle lobe. It may be the upper lobe. It may not necessarily be a lobectomy. It could be a small segment of the lung with pleural thickening at the back.”
Checks to be sure: “I don’t see the right hilum…[this] may, in fact, be due to the postsurgery state I’m postulating… Loss of visualization of the right hilum is…seen with collapse…”
Source: Lesgold et al. (1988).
COGNITIVE PROCESSES In a well-known study of expert-novice differences in categorisation, Chi, Feltovich, and Glaser (1981) gave novice and expert physicists cards with the text and a diagram of a single elementary physics problem on each and asked them to categorise them (see Figure 9.3). They found that novices tended to categorise them in terms of their surface features, such as whether they involved pulleys or ramps. Experts classified them according to the deep structure, that is, according to the laws of Newtonian mechanics they involved. You may recall that earlier in this chapter I described the early influential studies of the cognitive processes involved in expertise carried out by DeGroot (1965, 1966) using expert and Grandmaster chess players. Five Grandmasters and five expert players were shown a number of middle games in chess and asked to choose the next move while “thinking aloud”. The Grandmasters made better moves than the experts (as judged by independent raters). However the Grandmasters did not seem to consider more moves or search any deeper than the experts. DeGroot argued that this was because the Grandmasters had a greater store of games they had studied and played, and of different board positions. In another study he briefly presented board positions to subjects including chess Masters and asked them to reconstruct them from memory. Masters could reconstruct the board positions correctly 91% of the time on average. Less expert players managed only 41% accuracy. DeGroot argued that the Masters were encoding larger configurations of pieces than the experts. When boards with random configurations of pieces were presented, the Masters did no better than the experts—Masters did not have better memories than experts. Chase and Simon (1973) hypothesised that experts and novice chess players differed in the size of the “chunks” they could encode at any one time. 
They asked chess players to reconstruct configurations of chess pieces on chess boards with the model still in view. They measured the number of glances players took and the number of pieces placed after each glance. The best player managed to encode 2.5 pieces per glance and used shorter glances than the weakest player who managed only 1.9 pieces per glance. The expert player was therefore encoding more information per glance than the weaker players. Similar findings have been noted in other domains. Waters, Underwood, and Findlay (1997) found that this same kind of perceptual chunking occurred in sight reading from musical scores. In one experiment they asked eight full-time music students, eight psychology students who had passed a music exam, and eight non-musicians to compare two visually presented musical sequences. The experienced musicians needed fewer and shorter glances to encode groups of notes. Similar results have been found in a wide variety of other domains from computer programming (McKeithen, Reitman, Rueter, & Hirtle, 1981) to figure skating (Deakin & Allard, 1991). Adelson (1981) found that expert programmers could recall more lines of code than novices and had larger chunk sizes for encoding information.
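The pieces-per-glance measure described above can be made concrete with a small calculation. The Python sketch below computes the mean number of pieces placed per glance and the mean glance duration from a list of glance records. The record layout, the function names, and all the numbers are hypothetical and chosen only for illustration; they are not data from Chase and Simon's study.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Glance:
    """One look at the model board (hypothetical record layout)."""
    duration_s: float    # how long the player looked at the model board
    pieces_placed: int   # pieces placed on the copy before the next glance


def pieces_per_glance(glances):
    """Mean number of pieces placed after each glance."""
    if not glances:
        raise ValueError("at least one glance is required")
    return mean(g.pieces_placed for g in glances)


def mean_glance_duration(glances):
    """Mean duration of a glance, in seconds."""
    if not glances:
        raise ValueError("at least one glance is required")
    return mean(g.duration_s for g in glances)


# Illustrative numbers only: a stronger and a weaker player copying a board.
stronger = [Glance(1.0, 3), Glance(0.8, 2), Glance(1.2, 3), Glance(1.0, 2)]
weaker = [Glance(1.5, 2), Glance(1.4, 2), Glance(1.6, 1), Glance(1.5, 2)]

for label, glances in (("stronger", stronger), ("weaker", weaker)):
    print(label, pieces_per_glance(glances), mean_glance_duration(glances))
```

On this reading, a larger chunk size shows up as more pieces per glance combined with shorter glances, which is the pattern reported for the best player.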
Figure 9.3. Examples of problem diagrams used in Chi and co-workers’ studies (adapted from Chi et al., 1981).
The role of perception in skilled performance
According to DeGroot, and Chase and Simon, perception is the key to chess skill (Cooke, 1992). However,
this may be putting the cart before the horse to some extent. The ability to recognise perceptual patterns and to categorise problems or situations appropriately is the result of expertise. You can’t teach people to categorise problems unless they already have the requisite knowledge of principles—the conceptual knowledge. You can’t chunk stuff perceptually without the experience and concepts with which to do so. Egan and Schwartz (1979) repeated the “traditional” expert-novice memory task for meaningful and meaningless displays. The domain this time was electronic circuit drawings. In one condition experts tried to recall drawings of randomly placed electronic circuit symbols in terms of functionally related units, and were faster than the novices on the task. Egan and Schwartz argued that there was more of a top-down process taking place than a perceptual chunking hypothesis could account for. It was not so much perceptual chunking that was taking place, but conceptual chunking. That is, higher-level concepts were being used to govern perceptual chunking of the display. Conceptual chunking can also be seen in a study by Cooke, Atlas, Lane, and Berger (1991, cited in Cooke, 1992). Meaningful board configurations were presented to chess players. A verbal description of the configurations either preceded or followed the visual presentation of the chess board. Where a verbal description preceded the visual presentation, the performance of the experts was enhanced. This suggests that higher-level (conceptual) information prepared them for the pattern-recognition (perceptual) task. Johnson and Mervis (1997) performed a categorisation study with experts on songbirds. They also found that conceptual knowledge interacted with perception. Experts’ categories can be less distinct than those of novices. Murphy and Wright (1984) asked experts and novices to list attributes of three childhood disorders. 
Experts listed more features for each disorder than novices and agreed with each other more, but there were fuzzier boundaries between the categories of disorder than those produced by novices. An explanation for the difference is that novices learn about prototypical cases during training, but experts have had experience of exceptions to the prototype and hence have developed a broader view of the category. Norman, Brooks, and Allen (1989) also argue that experts’ processing of information is “effortful” unlike the findings from more “visual” areas of expertise such as patterns in chess. Nevertheless, some forms of expertise rely on making very fast decisions based on perceptual cues. In many sports the ability to make accurate predictions from subtle cues improves a sportsperson’s chances of hitting a ball. In baseball, for example, an experienced player can predict the pitch within the first 80 milliseconds (Paull & Glencross, 1997). Expert categorisation of problems, and indeed of things such as dogs, fish, or trees, is often based on the goal of the categorisation rather than just a concern with simple taxonomy (Ross, 1996). Ross cites a number of studies where the experts’ judgements were based on goal-defined categories that come about as a result of repeated interaction with the category members. The experts’ classification of items in Chi and coworkers’ study is not the goal so much as a means to an end (solving the relevant problem). The development of expertise is therefore a question of shifting from a classification based on surface features to one based on the solution method, which can only come about through solving such problems in the first place. The role of memory in expert performance Chase and Simon’s original studies assumed a short-term working memory capacity of around seven chunks, as did most of the studies of expert memory in the 1970s and 1980s. 
More recently, however, a number of studies and papers have caused a re-assessment of those early findings. For example, Gobet and Simon (1996) found that expert chess players could recall the positions of more pieces than the original
theory predicted. In one experiment, a Master chess player was able to increase the number of chess boards he could reproduce to nine with 70% accuracy and around 160 pieces correctly positioned. Gobet and Simon suggest that experts in a particular domain can use long-term memory templates to encode and store information.
Ericsson and Kintsch (1995) provide an explanation for how experts and skilled performers can manage a ten-fold increase in performance on tests of short-term memory. They cite a number of studies in the memory and expertise literature that do not seem to fit well with the notion of a working memory limited to around seven items. Ericsson and Polson (1988) describe a well-known case of a waiter (JC) who could memorise long and complex dinner orders. He used a mnemonic strategy that allowed him to retrieve relevant information later. Furthermore his strategy could transfer to categories other than food orders, so the strategy was not domain-specific. Medical diagnosis also presents a problem for a limited-capacity short-term working memory. Many symptoms and facts have to be maintained in memory in a form that can be retrieved readily until a diagnosis is made.
Ericsson and Kintsch therefore propose a long-term working memory associated with skilled memory performance and expertise. Experts are able to store information in long-term memory rather than maintaining it in short-term memory. In order to be able to do this and to do it quickly, three criteria have to be met:
• The expert has to have an extensive knowledge of the relevant information needed.
• The activity the expert is engaged in must be highly familiar; otherwise it would not be possible to predict what information will be needed at retrieval.
• The information encoded has to be associated with an appropriate set of retrieval cues that together act as a retrieval structure.
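The third criterion can be pictured as an index from cues to stored items. The following Python sketch is a deliberately simple illustration of that idea and not Ericsson and Kintsch's model: the class name, its methods, and the dinner-order example are all hypothetical.

```python
from collections import defaultdict


class RetrievalStructure:
    """Toy cue-to-item index (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps each cue to the items that were encoded together with it.
        self._items_by_cue = defaultdict(list)

    def encode(self, item, cues):
        """Store an item under every cue that is present at encoding."""
        for cue in cues:
            self._items_by_cue[cue].append(item)

    def retrieve(self, *cues):
        """Return the items associated with all the given cues, in encoding order."""
        if not cues:
            return []
        candidates = self._items_by_cue.get(cues[0], [])
        return [
            item for item in candidates
            if all(item in self._items_by_cue.get(cue, []) for cue in cues[1:])
        ]


# Hypothetical dinner orders, each encoded with table, seat, and course cues.
orders = RetrievalStructure()
orders.encode("steak, medium rare", cues=("table 4", "seat 1", "entree"))
orders.encode("blue cheese dressing", cues=("table 4", "seat 1", "salad"))
orders.encode("salmon", cues=("table 4", "seat 2", "entree"))

print(orders.retrieve("table 4", "entree"))
print(orders.retrieve("seat 1"))
```

Reactivating a cue such as "seat 1" brings back everything that was stored with it, so nothing has to be held in short-term memory in the meantime. The sketch leaves out what the first two criteria supply: the knowledge and familiarity needed to choose useful cues at encoding.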
When the retrieval cues are activated later, the original conditions of encoding are partially reinstated which in turn leads to the retrieval of the relevant information from long-term memory. Ericsson (Ericsson & Charness, 1997; Ericsson & Hastie, 1994) has argued that comparing experts and novices can only take us so far. A more useful and valid task is to examine those aspects of a person’s superior performance that are reproducible. That is, expertise can be examined and levels of expertise differentiated by looking at a set of representative tasks that experts do well under standardised conditions. DeGroot’s study of middle games in chess and the next move from random positions provides an example. Ericsson and Charness (1997, p. 3) argue that superior performance is not best understood in terms of “incremental refinements of pre-existing capacities and processes” but that the mechanism that produces expertise is deliberate, individualised, intensive practice. This kind of individualised training, or “effortful adaptation” to the demands of a domain, allows experts to restructure their performance and acquire new methods and skills. It remains a sad truth that, no matter how much effort we put in, the level of performance that characterises the Nobel Prize winner or the “genius” will remain beyond the grasp of most of us. As Sternberg (1996b, pp. 352–353) put it: The truth is that deliberate practice is only part of the picture. No matter how hard most psychologists work, they will not attain the eminence of a Herbert Simon. Most physicists will not become Einstein. And most composers will wonder why they can never be Mozart. We will be doing future generations
no favor if we lead them to believe that, like John Watson, they can make children into whatever they want those children to be. The age of behaviorism has passed. Let us move beyond, not back to it. WRITING EXPERTISE: A CASE STUDY One form of complex problem solving that people engage in at school, college, and university is writing. Academic writing constitutes a very ill-defined problem, because students typically have an assignment topic (such as an essay question) and a word limit as the only explicit constraint. The initial state is a blank sheet of paper or blank computer screen, and the goal state is very vaguely defined. If the student is given information about what constitutes a B-grade or an A-grade this is usually given as criteria that in turn add up to more and more specific constraints: “for an A-grade there should be a clear and concise evaluation of the theory…”. The student probably finds it hard to tell if the constraints have been met by the final essay. Research into writing has shown that the area seems to throw up a number of paradoxes. As in any other domain there are beginners and novices and there are super experts. According to Bryson, Bereiter, Scardamalia, and Joram (1991), writing expertise is unlike other domains of expertise in that writing doesn’t seem to get any easier with experience. From novice to expert, writing involves cognitive overload. Writing experts who win prestigious prizes are not necessarily (or even usually) the ones who have had the most practice. Indeed, it is difficult to tell who should be regarded as the super experts—those who win most prizes or those who sell most books. Novice writers, including young schoolchildren, often appear to use a working forward strategy that is normally considered the method used by experts. Give a primary schoolchild a writing topic and he or she will start to write. 
Anyone who has invigilated exams at university will often see students turn over the exam paper and begin to write almost immediately. On the other hand, there are accounts of the pain that expert writers seem to go through from time to time that would mark them out as novices in other domains (Bryson et al., 1991). Bereiter and Scardamalia (1987) argue that the writing task doesn’t get any easier from novice to expert if one assumes that expert and novice writers are solving different problems. In particular, as the goal state is very ill-defined, different writers may have radically different goals. This would apply not only to novices and experts at academic writing but also to prize-winning and bestselling novelists. Another phenomenon one finds in academic writing is that experts and novices differ not only in terms of their knowledge and skill in writing but also in their knowledge of a particular domain. A student entering the first year at university may be given a 2000 word assignment such as “Compare and contrast the developmental theories of Piaget and Vygotsky”. The student may never (or very rarely) have had to structure an essay of this type before; not only that, but they may never have heard of Piaget or Vygotsky. Their main source of information is the course textbook and there are heavy penalties for plagiarism. What is the poor student supposed to do? Expert writers don’t seem to be any better off, as they shift their own goal-posts so that the writing task continues to involve a host of conflicting constraints and hence remains difficult. The writing process One attempt to characterise the writing process was produced by Hayes and Flower (Flower & Hayes, 1980, 1981; Hayes, 1989b; Hayes & Flower, 1980). They regarded writing as a complex problem-solving task, and tried to build a model of the writing process using mainly protocol analysis as their method for finding evidence (Figure 9.4). 
The model owes a lot to Newell and Simon’s (1972) model of human problem solving.
Figure 9.4. The Hayes-Flower model of composition (based on Hayes, 1989b; Hayes & Flower, 1980).
In the model the assignment topic is embedded in the task environment which “includes everything outside the writer’s skin that influences the performance of the task” (Hayes & Flower, 1980, p. 12). There are three main writing processes: planning, translating, and reviewing. Planning. This involves three sub-processes of generating, organising, and goal-setting. The generating process uses cues in the task environment including motivational and situational cues to retrieve relevant information from long-term memory. The organising process selects from information retrieved material that seems most useful and organises that material into a writing plan. Goal-setting refers to the fact that some of the information retrieved may not necessarily be content information but criteria relevant to the intended audience. Plans are like artists’ preliminary sketches. They give useful information about the general outline and content but can readily be modified. “Good plans are rich enough to work from and argue about but cheap enough to throw away” (Flower & Hayes, 1980, p. 43). Translating. The information retrieved from long-term memory is presumed to be in the form of propositions. These propositions need to be translated into written prose.
Reviewing. This process is subdivided into reading and editing. The process detects violations of written expression. It also involves making some kind of evaluation of the text to see whether it fits in with the writer’s goals.
The Hayes-Flower model is a model of competent writing. As such it tends to lay most emphasis on ideal writing processes. However, there is scope in the model for individual differences, particularly in terms of the goals of writers at different levels and in the constraints under which they operate. A novice writer may be operating under a set of constraints such as: time, availability of textbooks, lack of subject knowledge, avoidance of plagiarism, length, style, structure, and appropriate genre language (language appropriate to the type of text being written: a report, a formal essay, a thank-you letter to granny). Expert writers may be operating under a different set of constraints: originality, desire to be interesting, coherence, audience design, and so on. “Writing is like trying to work within government regulations from various agencies” (Flower & Hayes, 1980, p. 34). One of the effects of trying to juggle a large number of constraints is “cognitive overload”.
The major constraint imposed on writers is the rhetorical problem: “whatever writers choose to say must ultimately conform to the structures posed by their purpose in writing, their sense of audience and their projected selves or imagined roles” (Flower & Hayes, 1980, p. 40). The other constraints of domain knowledge and written language production are subsumed and directed by the rhetorical problem—how to achieve the writer’s goals in writing. In order to make the writing task manageable and avoid cognitive overload there are several strategies we can adopt:
Drop constraints: “I won’t bother taking the reader’s point of view”; “I won’t worry about punctuation or spelling”.
Partition the problem: “I’ll divide the task into sub-problems and deal with each one at a time”.
Satisficing: “I’ll go with the first solution that meets my criteria”.
Automatisation: “I have developed routines and skills so I no longer need to spend time over grammar or worrying about coherence between paragraphs”.
Planning: “Flexible high-level plans allow me to manage the writing task by dividing it into sub-problems and integrating them and so reduce the cognitive load”.
Being a model of competent writing, the model can cope with novice-expert differences by emphasising or de-emphasising one aspect or another of the model. For example, novices don’t seem to do a lot of planning. They revise spelling and punctuation mistakes but don’t tend to rearrange whole sections of text. In order to account for the fact that novice and expert writers seem to be doing different things and setting different goals, Bereiter and Scardamalia (1987) have chosen to create two different models: one that attempts to explain novice writing and one that covers expert writing.
Although novice writers’ think-aloud protocols reveal little planning, they often manage to produce coherent imaginative prose. The writing process involves the interaction of a large number of variables and goals. The cognitive load produced by this ought to (and frequently does) overload the novice writer, especially at college level, leading to writer’s block. Nevertheless, novice writers do manage to write essays, and despite their lack of expertise in both the content domain and in writing itself they do sometimes manage to get high grades or to produce a readable story. Bereiter and Scardamalia argue that novice writers manage this because they are employing a knowledge-telling strategy. This strategy reduces the problem faced by the writer to one of telling what one knows about the topic. The structure of the knowledge-telling model is shown in Figure 9.5. In the model, topic identifiers are used as cues to access knowledge using a process of spreading activation.
Another form of cue is provided by discourse (or genre-related) knowledge. This is knowledge in the form of schemas about the structure of certain types of written text. Such a schema might be: when writing an essay provide a statement
Figure 9.5. Structure of the knowledge-telling model (adapted from Bereiter and Scardamalia, 1987).
of your belief and back it up with evidence. Schemas such as this would act as retrieval cues that would enable the writer to access evidence from content knowledge to back up an assertion: “Evidence from Kellogg (1988) backs up my argument.” Generally speaking the information recalled in this process is written down more or less as it is retrieved.
According to Bereiter and Scardamalia, expert writers employ a different strategy known as a knowledge-transforming strategy. The expert writer is trying to solve two problems simultaneously: the problem of the writer’s knowledge and beliefs, and the problems of achieving the goals of the composition. The two necessarily interact. Figure 9.6 shows the structure of the knowledge-transforming model. There are two interacting problem spaces: the content problem space and the rhetorical problem space. Content knowledge has implications for how the text is constructed. A difficult concept might require a lot of “anchoring” or a wide range of concrete examples or grounding in an analogy and so on. Similarly a decision about how to solve a rhetorical problem has implications for how one uses content knowledge. “By responding to the content implications of rhetorical decisions and vice versa, the knowledge-transforming writer engages in a process that can, in principle, exponentiate the problems of composition without limit, therefore truly courting mental overload” (Bryson et al., 1991, p. 73).
Writing expertise therefore does not seem to fit neatly into Chi and co-workers’ (1988) list of features that distinguish experts from novices with which we began.
Experts excel mainly in their own domain: There are domain experts and writing experts and it is possible the twain may never meet. Britton et al. (1989) got five experts to rewrite instructional texts originally
Figure 9.6. Structure of the knowledge-transforming model (adapted from Bereiter and Scardamalia, 1987).
written by domain experts. Students remembered more of the texts rewritten by three of the writing experts than the originals.
Experts perceive large meaningful patterns in their domain: There are some domains where perceptual chunking is not relevant and writing expertise is probably one of them.
Experts are fast: They are faster than novices at performing the skills of their domain, and they quickly solve problems with little error.
Experts have superior short-term and long-term memory: How this might manifest itself in writing expertise is unclear.
Experts see and represent a problem in their domain at a deeper (more principled) level than novices; novices tend to represent a problem at a superficial level: This reflects the difference between novice and expert writers in various aspects of the writing process, notably planning and revision. Overall, knowledge telling is a relatively superficial strategy compared with knowledge transforming, which requires a “deeper” representation of how one’s goals can be achieved in composition.
Experts spend a great deal of time analysing a problem qualitatively: This is generally true of expert writers. It manifests itself in representing the writing problem and in planning.
Experts have strong self-monitoring skills: This is also generally true of expert writers. It manifests itself most obviously in revision.
SUMMARY

1. Theories of both expertise and exceptional performance have been around for centuries. The earliest concentrated on innate differences in personality, intelligence, and aptitudes. However, recent studies have shown little correlation between expertise and IQ. Nevertheless differences in thinking styles or intelligence profiles can lead to different people becoming expert in different domains.
2. Experts tend to be experts in one domain. Expertise does not seep into another domain unless the two domains share a set of skills. Thus there can be “content” knowledge specific to a domain and “task” knowledge that can be shared with closely related domains.
3. There are several stage models of expertise development. On the way to expertise there is evidence of an “intermediate effect” in which people with some knowledge of a domain perform worse than others with a little knowledge of the domain. Raufaste et al. (1999) argue that the intermediate effect is a side-effect of confusing “basic experts” and “super experts” in these studies.
4. Zeitz (1997) argues that experts’ knowledge is best understood as involving representations at an intermediate level of abstraction. Moderately abstract conceptual representations (MACRs) provide a level of representation between the concrete and specific and the abstract and content-free.
5. Automatisation ought to make expert performance “rigid” because routine well-practised procedures are no longer accessible to consciousness. Although automaticity can indeed lead to errors in certain circumstances, experts would not be experts if they could not use knowledge flexibly. The paradox can be overcome if one assumes routine expertise for dealing with typical problems and adaptive expertise for dealing with novel problems. Experts have schemas and strategies that cover exceptions as well as typical cases.
6. Consequences of expertise in many domains include:
   • Fast categorisation processes: experts categorise problems differently from novices.
   • Perceptual chunking: experts can “chunk” larger configurations of elements than novices.
   • Long-term working memory: experts have developed strategies within their domain of expertise for using long-term memory for tasks that novices would rely on limited-capacity short-term memory to deal with.
7. Ericsson and Charness (1994) have argued that it is more profitable to examine what it is in a domain that experts do well. Expertise involves effortful adaptation to the demands of a domain.
8. Writing expertise is unusual in that the writing task does not seem to get easier with practice. Writers at different stages of expertise change the goals of the writing task, effectively ensuring that the writing task remains difficult. That is, novice and expert writers faced with the same task are trying to solve different problems.
9. The Hayes-Flower model of writing assumes that writing is a form of complex problem solving involving multiple constraints. According to the model, writers engage in planning, translation, and revision in an attempt to solve the rhetorical problem (attempting to satisfy the writer’s communicative goals).
10. Bereiter and Scardamalia produced two models of the writing process corresponding to the behaviour of novice and expert writers. Novice writers employ a “think-say” method of composing called “knowledge telling” involving:
   • writing down all you know about a topic;
   • little planning or organisation;
   • little account of the intended reader;
   • revision of only surface details (spelling, punctuation, etc.).
11. Expert writers use a “knowledge-transforming” strategy involving:
   • an interaction between a content problem space and a rhetorical problem space;
   • information retrieved from memory is transformed in line with the writer’s communicative goals;
   • planning is elaborate;
   • revision can involve structural changes and wholesale changes of content.
CHAPTER TEN Conclusions
PROBLEMS, PROBLEMS

We began by looking at what constitutes a problem and ended with how you can become really good at solving them so that typical problems you might encounter every day aren’t problems any more. At first, not knowing how to get round constraints, or not being aware of what operators can apply, means that our understanding of a situation or task is impoverished—we have a problem. With experience we learn to operate inside constraints and we learn what operators apply and when to apply them—the problem goes away.

Constraints

When a goal is blocked you have a problem; when you know ways round the block or how to remove it, you have less of a problem. Often the blocks are caused by the constraints that are imposed by the problem itself (the Tower of Hanoi wouldn’t be a problem if it weren’t for the constraints; jetting off for a holiday in Mexico wouldn’t be a problem if it weren’t for the constraints of money, job, small infants to look after). The environment can impose constraints—a particular task might become difficult when there is a lot of loud noise or when the weather turns bad. Finally, there are constraints imposed by the solver. An inability to solve some problems may be due to unnecessary constraints that were not mentioned in the problem statement. Insight problems are one example, but the same occurs in everyday problems—I have a meeting to chair and there are diverging views about the topic; how do I get an agreement and avoid confrontation? Assuming that you have to choose one view over another may be an unnecessary constraint. There may be a third way. Relaxing constraints or switching to a different problem space may permit a new way of representing the problem.

Operators

In order to solve problems you need to know what to do. That is, you have to be able to retrieve the relevant operators from memory. It goes without saying that, if you don’t have them in the first place then you can’t retrieve them.
This is even more difficult when the domain is unfamiliar. Another barrier to successful problem solving is if you have the relevant operators but retrieve the wrong ones. You might add instead of subtract. In insight problems you might know what to do, but don’t know that you know. The more you know about a domain, the more you are likely to be able to retrieve the relevant operators when necessary and hence the more likely you are to solve the problem. Domain knowledge is more important than analogical reasoning ability (Novick & Holyoak, 1991). Creative solutions to problems often involve
finding a new metaphor or analogy that opens up a new set of potential operators. Schön (1993) gives the example of a product-development team working for a paintbrush manufacturer. They were at first unable to get synthetic bristles to work as well as natural bristles. Eventually someone came up with the idea that a paintbrush was actually a kind of pump. This metaphor gave rise to a whole new research endeavour. There was suddenly a new set of things they could do. Another way of ensuring that you apply the correct operators in a new domain is to use an example. Indeed to ensure that you use the correct operators it is best if you actually copy the example as much as possible. In a study by Robertson (2000) children were given an example where two cars left a location at different times and the second vehicle overtook the first. If the example gave a one-and-a-half-hour difference as 3/2, some of the children would change the two-hour difference in the exercise problem into 4/2 even though that was completely unnecessary. They converted it because that’s what happened in the example. The moral of the story is that when you are unsure what you are doing it’s best to keep your inductions conservative.

Goals

Problems vary in the nature of their goals. In some the answer is given and the solver has to find out how to get there. In others the goal may be only vaguely stated but you would recognise it when you see it. Thus in an algebra problem where the goal is to find the value of “x”, as soon as you end up with “x=something” you have got an answer. Similarly, if you are trying to find a catchy name for a new product, you will know if you’ve got one when you’ve got one. Of course there’s going to be some kind of test against explicit or implicit criteria that will tell whether the goal is adequately satisfied. Other goals are even vaguer still. You might have no idea what you are going to end up with until you have finished.
Writing is one example, and artistic creation is another.

PROBLEM REPRESENTATION

The way you go about trying to solve a problem or making a decision, or even just what to pay attention to, depends on the salience of elements in the environment. Salience is therefore one of the factors that influence the way we represent the world, including the problems we face. A housebuyer, an architect, and a burglar looking at a house are going to find different aspects salient. People are therefore going to generate different representations of problems and situations depending on their past experience. A result of this is that it is possible to manipulate the likelihood of a solution by manipulating the instructions (Hayes & Simon, 1974; Simon & Hayes, 1976). Spin doctors manipulate information in an attempt to get us to represent it in certain ways.

Features of the environment that stand out in some way are likely to be relevant or important. Although paying attention to what appear to be the salient features of the environment is an extremely useful heuristic most of the time, there are times when it can lead you up the garden path. Our perceptual systems have evolved over millions of years to allow fast recognition of objects and faces. Our perceptual systems are also prey to visual illusions for the same reason. The way we read sentences can lead to initial misunderstandings as in “the old man the boats”. Similarly, the kinds of trick questions you get in puzzle books rely on the fact that certain features stand out and influence how you respond. Activity Box 10.1 gives some typical, though frivolous, examples (see p. 237 for answers).

What people find salient influences their behaviour in more serious circumstances for good or for ill. Solving insight problems or generating a creative solution to a problem often involves making some hitherto
irrelevant feature salient. In Schön’s example of the new paintbrush, the spaces between the bristles suddenly became important rather than the bristles themselves (that’s where the paint gets pumped out of). Experience (learning) causes changes in the features that are salient. Variability in the way different features of things stand out for different people is due to the fact that people vary in their experience.

ACTIVITY 10.1

1. A farmer had 17 sheep. All but 9 died. How many did he have left?
2. If joke is spelt J O K E and folk is spelt F O L K, how is the white of an egg spelt?
3. There are 6 apples in a basket in a room and 6 girls in the room. Each girl takes an apple. One apple remains in the basket. How come?
4. Two Romanians get on a bus. One Romanian is the father of the other Romanian’s son. How come?

It manifests itself in expert-novice differences but also works at smaller timescales than the years expertise takes to develop. In Chi, Feltovich, and Glaser’s (1981) study of novice-expert differences, the features experts found salient were different from those novices found salient. Salient features of a situation can trigger a learned procedure (a lever on the right-hand side of a car’s steering wheel can be flicked down to signal a right turn). When circumstances change these features may no longer be relevant (the windscreen wiper switches on). In Duncker’s (1945) Candle Holder problem the role of the boxes as containers is salient in one version and not salient when they are presented empty. Of course, focusing on surface features is usually reliable, as they often reflect underlying structural features. If something you have never seen before has feathers, the chances are that it can fly. This is an example of a diachronic rule (Holland et al., 1986). On the other hand, basing decisions or other forms of behaviour on a human being’s skin colour or their nationality would be a very silly thing to do because these features tell you absolutely nothing about an individual (see Hinton, 2000).

Problem solving can’t begin until there is a mental representation of the problem for the solver to work on. A representation generated from the task environment will include text-based inferences, information retrieved from our vast semantic network in long-term memory, and the operators cued by information in the task environment. This form of representation is one of the two major problem-solving processes described by Newell and Simon. Representing a problem in terms of the kinds of things you can do and the kinds of problem states you might get to is known as understanding in their scheme. In knowledge-lean problem solving the initial understanding forms the basis for a search through the problem space.
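The idea of a representation made up of states, operators, and constraints, followed by a search through the resulting problem space, can be made concrete with the Tower of Hanoi mentioned earlier in the chapter. What follows is only an illustrative sketch in Python: the function names `legal_moves` and `solve` are invented here, and the exhaustive breadth-first search is an assumption chosen for simplicity, not a claim about how human solvers (who are more likely to rely on heuristics) actually search.

```python
from collections import deque


def legal_moves(state):
    """Yield (operator, next_state) pairs for a Tower of Hanoi state.

    A state is a tuple of three tuples, one per peg, listing disc sizes
    from bottom to top. The constraint that a larger disc may not sit on
    a smaller one limits which operators can be applied.
    """
    for src in range(3):
        if not state[src]:
            continue  # nothing to move from an empty peg
        disc = state[src][-1]
        for dst in range(3):
            if dst == src:
                continue
            if state[dst] and state[dst][-1] < disc:
                continue  # constraint: no larger disc on a smaller one
            pegs = [list(p) for p in state]
            pegs[src].pop()
            pegs[dst].append(disc)
            yield (src, dst), tuple(tuple(p) for p in pegs)


def solve(initial, goal):
    """Breadth-first search of the problem space; returns a list of moves."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for move, nxt in legal_moves(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [move]))
    return None  # goal not reachable from the initial state


initial = ((3, 2, 1), (), ())   # all three discs on the first peg
goal = ((), (), (3, 2, 1))      # all three discs on the third peg
moves = solve(initial, goal)
print(len(moves))  # 7 moves for three discs
```

Removing the constraint check in `legal_moves` makes the problem trivial, which mirrors the point that the Tower of Hanoi would not be a problem without its constraints.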
In novice problem solving an initial mental model is generated by translating the text of a problem. For experts the process is somewhat different: “expertise allows one to substitute recognition for search” (VanLehn, 1989, p. 529). For experts, features of the problem statement trigger appropriate schemas that in turn indicate the appropriate solution procedure, which VanLehn refers to as the “second half of the schema” (1989, p. 548).

TRANSFER

The probability that learning will be transferred from one context to another depends on the representation formed of both the target in short-term working memory and the source in long-term memory. The
representation one forms of a problem will contain aspects of the context in which it was learned. The role of context in recall is well known. If you misplace something, going back to the place where you had it last will help you remember what you did with it. In exams students often find that they can’t quite remember the bit of information they need, but they do remember that it is on the bottom right-hand corner of a left-hand page in the textbook. The position of the information required is irrelevant but it is still stored in the memory trace none the less.

Salient features of the problem along with features of the context determine whether a particular source is likely to be accessed or not in the first place. If a relevant source problem can be found then it needs to be adapted to the current situation. When this happens and a solution can be found we have an example of positive transfer. If, on relatively rare occasions, learning something impedes our learning of something new, then we have an example of negative transfer. Examples of negative transfer include Einstellung and functional fixedness. Together they mean that our patterns of activity, our habits, or the use we make of tools and materials blind us to alternative courses of action or functions. Again, there is a danger of overemphasising the negative aspects of well-learned procedures. They are extremely useful and 99% of the time they will allow us to achieve our goals.

The more context plays a role in transfer, the more the transfer will be specific. General transfer, on the other hand, involves transferring knowledge or skills from one context to another one. This kind of knowledge has to be decontextualised to some degree to allow transfer to happen. As Singley and Anderson have pointed out (Anderson & Singley, 1993; Singley & Anderson, 1985, 1989), specific transfer occurs when there is an overlap in the elements (production rules) in the source and target.
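The notion of overlapping elements can be pictured as the intersection of two sets of production rules. The sketch below is a deliberately crude illustration in Python and is not Singley and Anderson's model: the function `transfer_overlap`, the two skills, and the rule labels are all hypothetical, and real productions are condition-action pairs rather than short names.

```python
def transfer_overlap(source_rules, target_rules):
    """Return the shared rules and the fraction of target rules already known."""
    shared = source_rules & target_rules
    fraction = len(shared) / len(target_rules) if target_rules else 0.0
    return shared, fraction


# Hypothetical rule sets, written as short labels rather than real productions.
skill_a = {"locate-line", "delete-word", "insert-word", "save-file"}
skill_b = {"locate-line", "delete-word", "insert-word", "open-menu", "export-file"}

shared, fraction = transfer_overlap(skill_a, skill_b)
print(sorted(shared))  # ['delete-word', 'insert-word', 'locate-line']
print(fraction)        # 0.6, i.e. three of the five target rules are shared
```

On this simplified reading, the larger the shared fraction, the less there is left to learn in the target task; anything that depends on context or on how the rules are cued is outside what the sketch captures.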
General transfer also involves an overlap of skills. Learning to search psychology databases to find relevant articles for an experimental report should help the student search archaeology databases to write a report in that domain. Schunn and Anderson (1999) give an example of transfer by “task domain” experts. Novick has shown that there can be transfer of highly abstract representational methods (Novick, 1990; Novick & Hmelo, 1994) and Pennington and Rehder (1996) and Müller (1999) have emphasised the importance of conceptual transfer. Analogical transfer depends on there being some kind of similarity between source and target. They can have the same objects or surface features or they can share the same underlying structure. Gentner’s structure-mapping theory (Falkenhainer et al., 1989; Gentner, 1983, 1989; Gentner & Toupin, 1986) explains how the effects of the surface features of two situations can be overcome, allowing a hierarchical structure from one situation to be mapped on to a current situation.

LEARNING

Despite the fact that transfer of knowledge is often constrained by context (it is often “inert”) we still manage to learn. Indeed, analogies are often used as teaching devices (Mayer, 1993; Petrie & Oshlag, 1993). Using analogies that are either given or in the form of textbook examples leads eventually to the abstraction of the common features between them. One learns to recognise the problem type and to access the relevant solution method. The eventual representation is usually characterised as a problem schema. Extended practice over many years leads in turn to expertise in a field.

To be of any use in a variety of situations, problem schemas have to be general enough to apply to a range of situations and detailed or concrete enough to be used to solve a specific example. E = mc² doesn’t really tell you how to solve a given problem. Equations and general principles are often too abstract to help the learner.
Schema representations formed from experience have to be at a moderately abstract level. Having induced a problem schema from a number of examples, we should be able to apply that schema to new instances. There are various models of how we generalise from experience and Chapter 8 has dealt with only some of them. Although generalisation is a very important mechanism, specialisation is also important. We need to learn the exceptions to the rules as well as the rules themselves: i comes before e except after c and except after a few other letters that don’t conform to the rule. The development of expertise includes the learning of schemas that cover exceptions as well as the generality of cases. For this reason experts’ representations can be flexible and they can get over the effects of automaticity—at least to some extent.

Although this book has concentrated mostly on the general cognitive processes involved in problem solving and learning, there are many other variables that affect whether an individual successfully solves a problem, achieves his or her goal, learns a new domain, becomes well-versed in it, or manages to achieve a level that could be called exceptional performance. As the last chapter pointed out, there are many factors that affect how we solve problems and eventually develop expertise. Charness, Krampe, and Mayr (1996) have described a taxonomy of factors that are important in skill acquisition and the development of exceptional performance. External social factors (e.g., parental, cultural, financial support), internal factors (e.g., motivation, competitiveness) and external “informational” factors (the nature of the domain and the sources of information about it) all affect the amount and quality of the practice a person puts in. That, in turn, interacts with the cognitive system which includes “software” (knowledge base and problem-solving processes) and “hardware” (e.g., working-memory capacity, processing speed). For some domains different factors are likely to be emphasised more than others. For example, skilled chess performance may rely more on the cognitive system than on the external factors, although the latter are not negligible. Figure 10.1 includes certain areas that have not been covered in this book.

Figure 10.1. External and internal factors affecting problem solving and learning.
Individual performance on any task is influenced by a whole host of cultural, social, and contextual factors interacting with usually stable motivational, personality, and physical factors. These interact with inherent differences in knowledge that change over time, and differences in cognitive processes that remain relatively stable over time (within limits). A specific problem is embedded in some immediate context which may have certain demand characteristics. Solvers may ask themselves “Why am I being asked to do this experiment? Is this perhaps a memory task?” Alternatively the problem may be something like a car breaking down in the middle of nowhere during a thunderstorm, where physical factors such as the temperature may affect the nature of the problem solving that can take place.
The social setting can be very important. The very presence of other people can affect processing speed, for example (Zajonc, 1965, 1980). The cultural setting can affect how one regards a problem or even whether a situation is a problem at all. One culture may spend time over the problem of how many angels can dance on the end of a pin or whether women have souls. Another culture may not regard these as problems worth considering in the first place. Context, social setting, an individual’s nervous system, and personality factors can together influence performance. During an exam, performance on an essay question can be entirely different from performance on the same question while sitting at home by the fire.

This book, however, has concentrated on the interaction between a problem in its context and the cognitive system, and tried to show how human beings in general (and occasionally some other forms of information-processing system) attempt to solve problems. Other areas outlined in Figure 10.1 would need to be addressed if we want to fully understand how any given individual confronts a particular type of problem. They also need to be taken into account if we are to understand individual differences in how people, faced with a problem that they are at first unable to solve, become (or fail to become) world-class experts.
Answers to questions
CHAPTER 2
State-action spaces: finding your way in a building

The building is organised thus:

301–304    321–324
Stair      Lift
305–308    317–320
Stair      Lift
309–312    313–316
CHAPTER 3
The solution to the Nine Dots problem.

CHAPTER 4
Activity 4.3

(A) A simple hypothesis.
(B) A complex hypothesis. The sequence of three shapes changes such that circle-square-triangle becomes square-triangle-circle which becomes triangle-circle-square, and so on. At the same time the colour of the shapes changes: one white-one black, two white-two black, three white-three black.

CHAPTER 5
Activity 5.2

Analyse the metaphor: “The Falklands thing was a fight between two bald men over a comb”
a. Encode meanings of terms.
b. Infer missing term.
c. Infer relation. Bald men don’t need a comb. Nevertheless, they are fighting over it.
d. Apply relations to understand the relationship between the countries and the Falklands.
CHAPTER 8
Activity 8.1

    Defun multiply-them (2 3)
        * 2 3

would turn into

    Defun add-them (712 91)
        + 712 91

The name changes from “multiply-them” to “add-them” (although the function would still work even if you didn’t change its name). The numbers change from 2 to 712 and from 3 to 91. Finally the arithmetic operator changes from a * to a +.

CHAPTER 10
Activity 10.1

1. 9
2. WHITE
3. One of the girls takes the basket with the apple in it.
4. One’s his mother.
Glossary
(Italics refer to other items in the Glossary.)

Abstract knowledge: a type of declarative knowledge. Knowledge of general principles.
Abstractly defined problems: problems that have all the components of well-defined problems except that the goal is defined in abstract terms.
Adaptive expertise: kind of expertise that allows experts to use their knowledge flexibly by finding ad hoc solutions to non-standard unfamiliar problems.
Algorithm: problem-solving procedures that are generally guaranteed to work if specified in enough detail.
Analogical problem solving: using a familiar problem to solve an unfamiliar problem. The structural features of the analogue (familiar problem) are mapped onto the unfamiliar problem.
Analogical transfer: applying knowledge learned in one situation to a new situation (see analogical problem solving).
Arguments: in the case of predicate calculus, arguments are the objects that are governed by predicates as in the example “gave (Bill, Mary, flowers)” where “gave” is the predicate and the objects in brackets are the arguments.
Attribute mappings: mappings of attributes from one situation to another. For example, milk and water share the same attribute of being liquid.
Attributes: properties or features of a concept.
Automaticity: processing that has become automatic (and fast) and no longer requires attentional resources to execute.
Chunk: a data structure forming a basic unit of knowledge.
Close variant: a problem that is similar to another, in that it involves little adaptation if it is to be used to solve the other problem.
Cognitive architecture: a theory of the structure of the mind. An architecture describes the functions and modules necessary for a range of cognitive behaviours.
Compiled knowledge: knowledge that was originally declarative but has become automatised so that it is no longer readily accessible to conscious inspection.
Conceptual core: context-independent features of a concept.
Conceptual integration hypothesis: concepts are presumed to mediate between a problem’s givens and the requested answers. The hypothesis explains the fact that occasionally transfer of knowledge from one task to another increases with practice if they involve the same concepts, e.g., from solving crossword puzzles to generating them.
Condition-simplifying generalisation: a generalisation that is produced when features of two or more instances of a concept are deemed to be irrelevant. Such features can then be dropped from the representation of the concept. For example, the colour of a cat’s fur has little to do with whether or not the cat can miaow.
Connectionism: cognitive architecture that consists of neuron-like nodes connected by links that have different connection strengths. Learning or representation consists of the pattern of weights between nodes.
Conservative induction: people tend to be careful about the inductive generalisations they make and therefore these generalisations may contain a lot of specific detail that may not be relevant.
Constraints: aspects of a problem that serve to limit what actions are available or appropriate.
Context-dependent information: information about a concept that tends to be restricted to certain contexts.
Context-independent information: features of a category or concept that tend to be invariant across contexts.
Cover story: the “story” or situation described in the problem statement.
Declarative knowledge: “knowing that”—explicit episodic and general semantic knowledge that one has about the world. This knowledge is either right or wrong and is not use-dependent (see procedural knowledge).
Declarative memory: (see declarative knowledge).
Deduction: generalising from one or a few examples to a whole range of new examples.
Diachronic rules: rules that represent changes over time. Diachronic rules are of two kinds: Predictor rules permit expectations of what is likely to occur in the world, e.g., “If a person annoys a dog then the dog will growl”; Effector rules tell the system how to act in a given situation, e.g., “If a dog chases you then run away”.
Diagrammatic representation: representation of a task, concept, or situation in the form of a diagram.
Difference reduction: general-purpose problem-solving heuristics that involve estimating the distance between where you are now and the goal, and trying to find an operator that will reduce that distance.
Discourse knowledge: knowledge in the form of schemas about the structure of certain types of written text (see also genre).
Distant variant: a problem that is different in some aspects of its structure from another problem of the same general type. Using one to solve the other would require a lot of adapting.
Domain: a field of knowledge, skill, or endeavour.
Domain-general: knowledge, strategies, procedures that apply to any or a range of domains.
Domain-specific: knowledge, strategies, procedures that are specific to a single domain.
Einstellung: a type of mental set (see set effects) where a habitual problem method is selected when a more straightforward one is possible.
Elaborative inferences: inferences made on the basis of an understanding of a text. For example, a problem about one car overtaking another that left earlier from the same place may include the elaborative inference that both cars have travelled the same distance.
Episodic knowledge: a type of declarative knowledge. Knowledge of personally experienced events.
Example-analogy: a view of analogical problem solving whereby the details of a source example are used to help solve a current target problem.
Experiential stage: second stage in the development of expertise. An understanding of the causal relations that underpin a system of knowledge emerges along with the abstraction of instances from experience.
Expert stage: third stage in the development of expertise. The learner can make abstractions across various system representations. The knowledge gained is therefore transferable.
External memory: any object in the environment that can be used to store information and that takes the load off working memory or aids long-term memory. Examples would be an address book or diary.
Factual knowledge: a type of declarative knowledge. General knowledge of facts.
Full insight: the solution to a problem becomes immediately available (see partial insight).
Functional fixedness: a type of set effect where the typical or most recent function of an object blinds one to seeing an alternative function for the object.
Functional fixity: (see functional fixedness).
General transfer: transfer of domain-general skills (also known as “transferable skills”).
Genre: language appropriate to the type of text being written—a report, a formal essay, a thank-you letter belong to different genres.
Givens of a problem: the situation described in the problem statement.
Goal stack: the assumption that human behaviour is goal-directed. Goals are arranged hierarchically (stacked) so that if an overall goal involves attaining a sub-goal then that sub-goal goes to the top of the stack and is dealt with first.
Goal state: where you are when you have attained your problem-solving goal.
Heuristics: rules of thumb that help constrain a problem (and hence often speed up problem solving) but are not guaranteed to work.
Hill climbing: systematic search strategy that involves moving from a current state to one that seems closer to the goal. Used when the goal-sub-goal structure of a problem is not all that clear.
Homomorphic problems: problems that are similar in structure but not identical (for example there may be extra constraints—see isomorphic problems).
Homomorphism: a model of the world produced by a “many-to-one” mapping where a large number of states of affairs in the world are mapped onto a smaller number of states in a mental model of the world.
Identical elements theory: transfer can take place between two situations or tasks only when there are similar surface elements.
Ill-defined problem: a problem where the means of solving it are not immediately apparent, either because relevant operators or a clear description of the goal are not explicitly stated.
Impasse: point in problem solving where one gets stuck. Usually one has to take a “detour” by creating a sub-goal to remove the impasse before problem solving can continue.
Incubation: presumed unconscious process where an answer to a problem seems to pop into one’s head despite the fact that one is no longer consciously thinking about it.
Induction: the process of abstracting out general rules from specific examples.
Information processing: manipulating (processing) mental representations (information) about the world.
Information-processing system (IPS): a system that processes symbols and symbol structures that mentally represent the external world. Human beings have certain processing limitations.
Informationally encapsulated: systems, such as vision or hearing, that are cut off from other systems.
Initial state: the starting state of a problem (e.g., an essay question and a blank sheet of paper).
Insight: a phenomenon whereby a solution is arrived at without any apparent conscious working-out. The solution appears to pop into consciousness.
Instance-based generalisation: there is variability in the likelihood of generalising from one or more instances of a concept depending on one’s beliefs about the variability of the features of instances. The PI system (Holland et al., 1986) can allow generalisation from a single instance or from experience of many instances.
Instantiation: a concrete form or example (an instance) of a general concept or variable.
Intermediate effect: in some domains, a stage learners go through on the way to becoming experts where there is a dip in performance on certain kinds of tasks, such that novices outperform the intermediates despite the greater experience of the latter.
Isomorphic problems: problems that have an identical underlying solution structure.
Isomorphs: (see isomorphic problems).
Knowledge state: the particular stage you are in during problem solving, specifically the mental state you have reached at that point.
Knowledge-lean: problems that require very little knowledge to solve them.
Knowledge-rich: problems that require prior domain knowledge to solve them.
Knowledge-telling strategy: a writing strategy that reduces the problem faced by the writer to one of telling what one knows about the topic.
Knowledge-transforming strategy: an expert writing strategy in which the writer tries to deal with the problem of presenting his or her knowledge and beliefs, and the problem of achieving the goals of the composition.
Lateral thinking: solving ill-defined problems or insight problems by searching for an alternative problem space rather than searching within a problem space.
Liberal induction: sweeping generalisations based on very little evidence—often only one example.
Long-term memory (LTM): memory for events, general knowledge, procedures, strategies, etc., that are stored for longer than a few minutes. There are no known limits to the duration or capacity of information in LTM.
Means-ends analysis: general-purpose problem-solving heuristic that involves breaking a problem down into manageable sub-goals. Problem solving then proceeds by analysing the difference between the current problem state and the sub-goal and choosing a method that will reduce that difference. Mental lexicon: our mental store of words: their spellings, pronunciations, and meanings. Mental model: generally image-based representation of how something works or how a situation can be imagined. Mental representation: how knowledge about objects, problems, states of affairs, layouts, etc., is stored in the mind. Representations can be manipulated mentally in ways analogous to the ways states of affairs can be manipulated in the real world. Metacognition: knowing how cognitive processes such as memory work so that one can come to control them. Moderately abstract conceptual representation: a level of abstraction intermediate between abstractions (such as general principles or equations) and concrete examples. Negative transfer: learning to solve a problem in one situation prevents or impedes the solution of what looks like a similar problem in another situation (see positive transfer). Nodes: points at which the state-action diagram branches; neuron-like elements in a connectionist system. Operator: an action that can be taken to change the state of a problem, e.g., “move a ring”, “multiply”, “take the bus”. Operator restrictions: constraints imposed by a problem, often stating what you are not allowed to do. Partial insight: an insight into a problem that indicates how one might proceed in order to reach a solution (see full insight). Planning: one of the processes involved in writing. It includes the sub-processes of generating information from long-term memory using information in the task environment.
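As a rough illustration of the means-ends analysis and operator entries above, the sketch below reduces the difference between the current state and a goal by choosing an operator that achieves the goal and treating each unmet precondition as a sub-goal. Everything in it is an assumption made for illustration: the `Operator` structure, the `achieve` function, and the commuting facts built around the glossary's “take the bus” example. It is a simplified scheme, not a model described in the text.

```python
from typing import NamedTuple


class Operator(NamedTuple):
    name: str
    preconditions: tuple   # facts that must hold before the operator applies
    adds: frozenset        # facts the operator makes true
    deletes: frozenset     # facts the operator makes false


def achieve(state, goal, operators, pending=()):
    """Return (new_state, plan) that makes `goal` true, or None if stuck."""
    if goal in state:
        return state, []                 # no difference left to reduce
    if goal in pending:
        return None                      # already a pending sub-goal: avoid looping
    for op in operators:
        if goal not in op.adds:
            continue                     # operator does not reduce this difference
        current, plan = state, []
        for pre in op.preconditions:     # each unmet precondition becomes a sub-goal
            result = achieve(current, pre, operators, pending + (goal,))
            if result is None:
                break
            current, steps = result
            plan = plan + steps
        else:
            # All sub-goals were reached; check none was undone along the way.
            if all(p in current for p in op.preconditions):
                return (current - op.deletes) | op.adds, plan + [op.name]
    return None


# Hypothetical operators for getting to work.
OPERATORS = [
    Operator("take the bus", ("have fare", "at bus stop"),
             frozenset({"at work"}), frozenset({"have fare", "at bus stop"})),
    Operator("walk to bus stop", ("at home",),
             frozenset({"at bus stop"}), frozenset({"at home"})),
    Operator("find change", ("at home",),
             frozenset({"have fare"}), frozenset()),
]

final_state, plan = achieve(frozenset({"at home"}), "at work", OPERATORS)
print(plan)  # ['find change', 'walk to bus stop', 'take the bus']
```

In this sketch the order in which preconditions are listed matters: sub-goals are tackled left to right, and an ordering in which one sub-goal undoes another simply fails rather than being repaired.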
Positive transfer: learning to solve a problem in one situation increases the speed and/or accuracy with which you solve a similar problem in another situation (see negative transfer). Power Law of Learning: a law that seems to govern learning: the more we practise something the easier it gets, but the harder it gets to improve (improvement is a power function of practice). Predicate calculus: system of representing knowledge based on predicates and their arguments. Predicate: a relationship that exists between one or more objects can be represented as a predicate. “Gave” is the predicate in the example “gave (Bill, Mary, flowers)” (see also arguments). Pretheoretical stage: first stage in the development of expertise where specific instances are retrieved from memory based on superficial features of the current situation and the instances stored in memory. Primary understanding: occurs when students understand information at a surface level; usually at the same or a less abstract level than the information being presented. Principle of rationality: in various guises the principle refers to the fact that people choose actions that they believe will attain their goals. The choice of actions is dependent on knowledge and the capacity limits of the processing system. Principle-cueing: view of analogical problem solving whereby accessing a source problem cues the underlying principle which is used in turn to solve the target problem. Problem model: a type of situation model that includes knowledge of how the situation described in a text of a problem can be converted into a solution procedure. For example, it may include formal knowledge about the arithmetic structure derived from the textbase. Problem schema: problem structure abstracted from experience that allows us to identify new problems of a particular type and to access potentially useful solution strategies. 
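The Power Law of Learning entry above can be written out as a formula. The symbols below are assumptions chosen for illustration (the glossary itself gives no notation): T is the time taken to perform the task, N is the number of practice trials, a is the time on the first trial, and b > 0 is the learning rate.

```latex
% Power function of practice (assumed notation)
T = a\,N^{-b}
% Taking logarithms gives a straight line on log-log axes
\log T = \log a - b \log N
% Illustrative values a = 10 s, b = 0.5:
%   N = 1  gives T = 10 s
%   N = 4  gives T = 5 s
%   N = 16 gives T = 2.5 s
%   N = 64 gives T = 1.25 s
```

With these illustrative values, each halving of the time needs four times as much practice as before, which is the sense in which the task gets easier but improvement gets harder.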
Problem space: a mental representation of a problem built up from information present in a problem statement, the context, and past experience. The representation includes the states that can be reached through applying problem-solving operators.
Problem understanding: the process of constructing an initial mental representation of a problem based on information in the problem statement (about the goal, the initial state, restrictions, and operators to apply where given) and personal past experience. Problem variants: different problems of the same general type. Variants can be close (for instance, an exercise problem can be similar to an example problem) or distant (for instance, where an example problem needs to be adapted a lot to solve a current exercise problem). Procedural knowledge: “knowing how”, that is, knowledge of how to do something (riding a bike, swimming, baking a cake). Can be more or less useful in attaining one’s goals. Procedural memory: (see procedural knowledge). Production systems: models of thinking and problem solving where behaviour is regarded as sequential and rule-governed in nature. Rules (productions) are generally of the form “if…then” (condition-action rules). Productive thinking: generating a solution to a problem due to an understanding of the problem’s “deep structure”. Proportional analogies: analogies of the form A:B::C:D (A is to B as C is to D) where the relation between A and B is “proportional” to the relation between C and D. Propositions: units of knowledge that represent statements or assertions, usually in the form of a predicate with arguments. Propositional representation: representation that captures underlying meanings in terms of sets of propositions. Rationality: this can have two meanings: (a) logically correct thinking; (b) thinking that operates to advance our goals. Most often it is the second of these two meanings that is referred to in the text (see principle of rationality). Recursion: a solution procedure that involves applying the same algorithm within itself until some halting condition is reached. For example, to find out if someone has flu, check to see if they have kissed someone who has flu. 
To check whether that person in turn has flu, then check to see if he has kissed someone else who has flu; and so on. Reflective understanding: occurs when students recognise the deeper structural features of problems and can relate them to previous knowledge. Relational elements: a set of elements that constitute a relation between one concept and another. According to Bejar et al. (1991), “The wheel is part of the bike” involves the relational elements “inclusion” and “connection”. Reproductive thinking: using a previously learned solution method to solve a problem. Such a method is triggered by the problem type and not by a “deep” understanding of the problem (see productive thinking). Restructuring: Gestalt term for the necessity in some problems to find an appropriate problem representation when a current representation is not working. Reviewing: one of the processes involved in writing, subdivided into reading and editing. The process detects violations of written expression. It also involves making some kind of evaluation of the text to see whether it fits in with the writer’s goals. Rhetorical problem: the problem of writing prose in conformity with the writer’s goals and that is directed at the intended audience. Routine expertise: refers to the schema-based knowledge experts use to solve standard familiar problems efficiently (see also adaptive expertise). Rule induction: (see induction). Schema: a knowledge structure composed of bits of semantic knowledge and the relations between them (see problem schema). Search: engaging in problem solving involves a search through a problem space. The solver seeks a sequence of operators that will solve the problem. Search graph: (search tree) similar to a state-action diagram. Diagrammatic depiction of a problem space. Semantic associations: (see semantics).
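The flu example in the recursion entry above can be sketched in code. The names `has_flu`, `known_cases`, and `kissed`, and the people listed, are hypothetical; the entry does not say what the halting condition is, so this sketch assumes two: reaching someone already known to have flu, or running out of people to check.

```python
def has_flu(person, known_cases, kissed, seen=None):
    """Recursively check whether `person` could have caught flu.

    `known_cases` is a set of people known to have flu and `kissed` maps
    each person to the people they have kissed (both illustrative inputs).
    """
    if seen is None:
        seen = set()
    if person in known_cases:
        return True                 # halting condition: a confirmed case
    if person in seen:
        return False                # already checked: stop going round in circles
    seen.add(person)
    # The same procedure is applied to everyone this person has kissed.
    return any(has_flu(other, known_cases, kissed, seen)
               for other in kissed.get(person, ()))


KISSED = {"Ann": ["Bob"], "Bob": ["Cat"], "Cat": [], "Dan": ["Eve"], "Eve": ["Dan"]}

print(has_flu("Ann", {"Cat"}, KISSED))  # True: Ann kissed Bob, who kissed Cat
print(has_flu("Dan", {"Cat"}, KISSED))  # False: Dan and Eve only kissed each other
```

The `seen` set is a design choice of this sketch: without it, two people who have kissed each other would send the procedure round in a loop and the halting condition would never be reached.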
Semantic network: network of semantically related concepts (see also semantics; spreading activation). Semantic space: (see semantics). Semantically lean problems: problems where the solver has little prior knowledge to call upon. The problem has few semantic associations for the solver. Semantics: the study of meanings. The more knowledge one has of a word or concept (the more associations and relations a word has) the bigger is its semantic space. Sentential representation: representation of a task, concept, or situation that is generally linear in nature; that is, in the form of a textual description. Set effect: applying a learned rule or procedure for doing something when there is a simpler way of doing it. Short-term memory (STM): limited-capacity storage of information usually in focal attention. The information is unlikely to have undergone much processing or interpretation. Situated learning: learning theory that emphasises the importance of the learning context (especially the social context) on learning for transfer. Situation model: a mental model of the situation described in a text, derived from the textbase. Slot: element in a schema representation (including a propositional representation). Slots can be filled with specific values. Solution development: working towards a solution. Failed attempts at solving a problem sometimes help refine what the problem is and hence direct the solver to a more fruitful solution path. Solution procedure: a method, usually involving a sequence of actions, used to solve a problem. Source problem: (also known as base problem) earlier problem, usually in long-term memory, that can be used as an analogy for solving a current target problem. Specific transfer: transfer of domain-specific skills, usually from one problem to another that is very similar. Spreading activation: memory traces in a semantic network related to the current context are activated, some more strongly than others. 
Activating a memory trace leads to many related memories also being activated above some threshold level. State space: the space of all the problem states that can be reached and the paths between them (the problem space). A solver can go from one state to another along a path by applying an operator. The state space can be represented visually as a state-action diagram or search graph. State-action diagram: (see state space). State-action space: (see state space). Strong methods: known problem-solving methods that can reliably solve a problem type. Structural features: those features of a problem that are relevant to its solution, and can be used to categorise the problem type. Structurally blind: “reproductive” problem solving that does not take account of the problem’s underlying structure (see reproductive thinking). Structurally similar problems: problems with the same underlying solution structure (see also structural features). Sub-goaling: (see means-ends analysis). Subordinate category: certain categories can be organised hierarchically. “Spaniel” is a subordinate category of the superordinate “dog”. Superordinate category: certain categories can be organised hierarchically. “Spaniel” is a subordinate category of the superordinate “dog”. Surface features: those features of a problem that are irrelevant to the solution. Surface similarities: two problems may be similar because they are about the same sorts of things (e.g., computers) although the underlying structure of the problems may be different (see also surface features and structural features). Symbol structures: structures formed by collections of symbols. Symbol tokens: a symbol token stands for a symbol. A specific occurrence of a word, object, event, etc. (a symbol token) refers the information processor to the symbol itself.
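The state space entry above can be made concrete with a small sketch that searches a state space by applying operators. The three-ring, three-peg puzzle, the tuple encoding of states, and the choice of breadth-first search are all assumptions made for illustration; the glossary does not prescribe any particular search order.

```python
from collections import deque


def successors(state):
    """Yield (operator, next_state) for every legal 'move a ring' operator.

    A state is a tuple of pegs; each peg is a tuple of ring sizes listed
    from bottom to top. A ring may not be placed on a smaller ring.
    """
    for i, src in enumerate(state):
        if not src:
            continue
        ring = src[-1]
        for j, dst in enumerate(state):
            if i != j and (not dst or dst[-1] > ring):
                pegs = list(state)
                pegs[i] = src[:-1]
                pegs[j] = dst + (ring,)
                yield f"move ring {ring} from peg {i} to peg {j}", tuple(pegs)


def search(initial, goal):
    """Breadth-first search: return a shortest operator sequence, or None."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for operator, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [operator]))
    return None


INITIAL = ((3, 2, 1), (), ())
GOAL = ((), (), (3, 2, 1))
solution = search(INITIAL, GOAL)
print(len(solution))  # 7 operator applications
```

Each state is a node and each legal move is a path between two nodes, so the `visited` set together with the moves explored corresponds to the part of the state-action diagram that the search has uncovered.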
Symbols: a representation of something in the world such as words in sentences, objects in pictures, numbers, arithmetic operators, and so on. Synchronic rules: rules that represent general features of an object or its category membership. There are two kinds of synchronic rule: categorical and associative. Categorical rules are of the type: “If an object is a dog then it is an animal”. Associative rules are of the type: “If an object is a dog then activate the ‘bone’ concept”. Systematicity principle: according to Gentner’s theory of analogy, people prefer to map hierarchical systems of relations in which the higher-order relations constrain the lower-order ones. Systematicity governs how well an analogy works. Target problem: current problem that requires to be solved (currently in working memory). A source problem is sought (usually from long-term memory) to use as an analogy. Task environment: a problem in its context. The task environment may contain specific problem-related information or cues that may help retrieve problem-relevant information from memory. Terminal nodes: points at which the state-action diagram terminates (the solver can go no further). A terminal node can be either a goal state or a dead end. Textbase: an initial propositional representation of a text. From the textbase, we can (hopefully) generate a situation model—a mental model of the situation described in the text that includes elaborated inferences. Transfer-appropriate processing: in processing information, learners have to take account of the goals of the learning context and adapt their learning strategies accordingly. For example, if the goal is to do a rhyming task then it would be best to encode information in terms of the sound of the words. Transition function: a function that produces a change of state in the world or in a model of the world. Applying an operator to elements in a particular state produces a transition to another state. 
Translating: one of the processes involved in writing. Information retrieved from long-term memory is presumed to be in the form of propositions that need to be translated into written prose. Transparency: the degree to which objects in two problem situations appear to play the same roles or share the same features. High transparency occurs where there is a great deal of semantic similarity between the objects. A solver or analogiser can see relatively easily how one might map onto the other. Low transparency occurs when there is low semantic similarity between objects despite their occupying the same roles (“armies” and “rays”). Understanding: (see problem understanding). Unified theories of cognition: theories of cognition that attempt to explain as much of human cognitive behaviour as possible with the ultimate goal of having a single representational framework for, or model of, cognition. Unstable representations: this refers to the fact that our knowledge of what features of concepts are relevant can vary from one context to another. Verbal protocols: transcript obtained when a solver is asked to think aloud while solving a problem. Weak methods: general-purpose problem-solving methods used when we don’t know how to solve a problem. Well-defined problem: a problem where the solver has available information about the goal state, operators, and constraints. Working backwards: a problem-solving strategy that involves starting at the goal state and working backwards from it to see what needs to be done (i.e., what sub-goals need to be achieved). Working memory elements (wimees): a chunk; data structures forming basic units of knowledge in temporary store. Working memory (WM): short-term store where information is held while being processed. Sometimes used as an equivalent to short-term memory, although the emphasis is on a more dynamic memory system than a traditional “passive” view of short-term memory.
References
Adelson, B. (1981). Problem solving and the development of abstract categories in programming languages. Memory & Cognition, 9, 422–433. Adelson, B. (1984). When novices surpass experts: The difficulty of a task may increase with expertise. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 483–495. Ahlum-Heath, M.E., & DiVesta, F.J. (1986). The effects of conscious controlled verbalization of a cognitive strategy on transfer in problem solving. Memory and Cognition, 14, 281–285. Anderson, J.R. (1980). Cognitive psychology and its implications. New York: W.H.Freeman. Anderson, J.R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369–406. Anderson, J.R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press. Anderson, J.R. (1987). Skill acquisition: Compilation of weak method problem situations. Psychological Review, 94(2), 192–210. Anderson, J.R. (1990). The adaptive character of thought. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Anderson, J.R. (Ed.) (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Anderson, J.R. (2000a). Cognitive psychology and its implications (5th ed.). New York: W.H. Freeman. Anderson, J.R. (2000b). Learning and memory: An integrated approach. New York: Wiley. Anderson, J.R., Boyle, C.F., & Yost, G. (1985). The geometry tutor. Paper presented at the Proceedings of the International Joint Conference on Artificial Intelligence–85, Los Angeles. Anderson, J.R., Farrell, R., & Sauers, R. (1984). Learning to program in LISP. Cognitive Science, 8, 87–129. Anderson, J.R., & Fincham, J.M. (1994). Acquisition of procedural skills from examples. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1322–1340. Anderson, J.R., Reder, L.M., & Simon, H.A. (1996). Situated learning and education. Educational Researcher, 25(4), 5–11. Anderson, J.R., & Singley, M.K. (1993). The identical elements theory of transfer. In J.R. 
Anderson (Ed.), Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Anderson, J.R., & Thompson, R. (1989). Use of an analogy in a production system architecture. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning (pp. 267–297). London: Cambridge University Press. Ashcraft, M.H. (1994). Human memory and cognition (2nd ed.). New York: HarperCollins. Atkinson, R.C., & Shiffrin, R.M. (1968). Human memory: A proposed system and its control processes. In K.W.Spence & J.T.Spence (Eds.), The psychology of learning and motivation (Vol. 2). London: Academic Press. Ausubel, D.P. (1968). Educational psychology: A cognitive view. New York: Holt, Rinehart & Winston. Baddeley, A. (1997). Human memory: Theory and practice (rev. ed.). Hove, UK: Psychology Press. Baddeley, A.D. (1981). The concept of working memory: A view of its current state and probable future development. Cognition, 10, 17–23. Ballstaedt, S.-P., & Mandl, H. (1984). Elaborations: Assessment and analysis. In H.Mandl, N.L.Stein, & T.Trabasso (Eds.), Learning and comprehension of text (pp. 331–353). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Barsalou, L.W. (1989). Intraconcept similarity and its implications for interconcept similarity. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning. London: Cambridge University Press. Bartlett, F.C. (1932). Remembering: A study in experimental and social psychology. Cambridge: Cambridge University Press. Bassok, M. (1990). Transfer of domain-specific problem solving procedures. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(3), 522–533.
Bassok, M. (1997). Two types of reliance on correlations between content and structure in reasoning about word problems. In L.D.English (Ed.), Mathematical reasoning: Analogies, metaphors, and images. Studies in mathematical thinking and learning (pp. 221–246). Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Bassok, M., & Holyoak, K.J. (1989). Interdomain transfer between isomorphic topics in algebra and physics. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 153–166. Beck, I.L., & McKeown, M.G. (1989). Expository text for young readers: The issue of coherence. In L.B.Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 47–66). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Bejar, I.I., Chaffin, R., & Embretson, S. (1991). Cognitive and psychometric analysis of analogical problem solving. New York: Springer-Verlag. Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Bernardo, A.B.I. (1994). Problem specific information and the development of problem-type schemata. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 379–395. Berry, D.C. (1983). Metacognitive experience and transfer of logical reasoning. Quarterly Journal of Experimental Psychology, 35A, 39–49. Beveridge, M., & Parkins, E. (1987). Visual representation in analogical problem solving. Memory and Cognition, 15(3), 230–237. Black, M. (1993). More about metaphor. In A.Ortony (Ed.), Metaphor and thought (pp. 19–41). Cambridge: Cambridge University Press. Blanchette, I., & Dunbar, K. (2000). How analogies are generated: The roles of structural and superficial similarity. Memory and Cognition, 28(1), 108–124. Blessing, S.B., & Ross, B.H. (1996). Content effects in problem categorisation and problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3). Boden, M. (1994). Précis of “The creative mind: Myths and mechanisms”. 
Behavioral and Brain Sciences, 17, 519–570. Brainerd, C.J., & Reyna, V.F. (1993). Memory independence and memory interference in cognitive development. Psychological Review, 100, 42–67. Bransford, J.D., Arbitman-Smith, R., Stein, B.S., & Vye, N.J. (1985). Improving thinking and learning skills: An analysis of three approaches. In J.W.Segal, S.F.Chipman, & R.Glaser (Eds.), Thinking and learning skills (Vol. 1, pp. 133–206). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Bransford, J.D., Barclay, J.R., & Franks, J.J. (1972). Sentence memory: A constructive versus interpretative approach. Cognitive Psychology, 3, 193–209. Bransford, J.D., & Stein, B.S. (1993). The ideal problem solver (2nd ed.). New York: W.H. Freeman. Brewer, W.F., & Treyens, J.C. (1981). Role of schemata in memory for places. Cognitive Psychology, 13, 207–230. Britton, B.K., Van Dusen, L., Gulgoz, S., & Glynn, S.M. (1989). Instructional texts rewritten by five expert teams: Revisions and retention improvements. Journal of Educational Psychology, 81(2), 226–239. Broadbent, D.E. (1975). The magical number seven after fifteen years. In R.A.Kennedy & A. Wilkes (Eds.), Studies in long-term memory. New York: Wiley. Brown, A.L., & Palincsar, A.S. (1989). Guided cooperative learning and individual knowledge acquisition. In L.B.Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 393–452). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Brown, D.E., & Clement, J. (1989). Overcoming misconceptions via analogical reasoning: Abstract transfer versus explanatory model construction. Instructional Science, 18(4), 237–261. Brown, J.S., & Burton, R.R. (1978). Diagnostic models for procedural bugs in basic mathematical skills. Cognitive Science, 2, 155–192. Brown, J.S., & VanLehn, K. (1980). Repair theory: A generative theory of bugs in procedural skills. Cognitive Science, 4, 379–426.
Bryson, M., Bereiter, C., Scardamalia, M., & Joram, E. (1991). Going beyond the problem as given: Problem solving in expert and novice writers. In R.J.Sternberg & P.A.Frensch (Eds.), Complex problem solving (pp. 61–84). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Bugliosi, V. (1978). Till death us do part. New York: Bantam Books. Byrne, M.D., & Bovair, S. (1997). A working memory model of a common procedural error. Cognitive Science, 21(1), 31–61. Byrnes, J.P. (1992). The conceptual basis of procedural learning. Cognitive Development, 7, 235–257. Carbonell, J.G. (1982). Experiential learning in analogical problem solving. Paper presented at the AAAI–82: Proceedings of the National Conference on Artificial Intelligence, Los Altos, CA. Carbonell, J.G. (1983). Derivational analogy and its role in problem solving. Paper presented at the Third National Conference on Artificial Intelligence, Washington, DC. Cardinale, L.A. (1991). Conceptual models and user interaction with computers. Computers in Human Behavior, 7(3), 163–169. Carey, S., & Spelke, E. (1994). Domain-specific knowledge and conceptual change. In L.A. Hirschfeld & S.A.Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 169–200). Cambridge: Cambridge University Press. Carraher, T.N., Carraher, D., & Schliemann, A.D. (1985). Mathematics in the streets and in the schools. British Journal of Developmental Psychology, 3, 21–29. Carroll, J.M., Smith-Kerker, P.L., Ford, J.R., & Mazur-Rimetz, S.A. (1987–88). The minimal manual. Human-Computer Interaction, 3, 123–153. Catrambone, R., & Holyoak, K.J. (1989). Overcoming contextual limitations on problem-solving transfer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(6), 1147–1156. Ceci, S.J. (1996). On intelligence: A bioecological treatise on intellectual development. London: Harvard University Press. Ceci, S.J., & Liker, J.K. (1986). A day at the races: A study of IQ, expertise, and cognitive complexity. 
Journal of Experimental Psychology: General, 115, 255–266. Chaffin, R., & Herrmann, D.J. (1987). Relational element theory: A new account of the representation and processing of semantic relations. In D.Gorfein & R.Hoffman (Eds.), Memory and learning: The Ebbinghaus Centennial Conference. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Chaffin, R., & Herrmann, D.J. (1988). The nature of semantic relations: A comparison of two approaches. New York: Cambridge University Press. Chan, C.S. (1997). Mental image and internal representation. Journal of Architectural and Planning Research, 14(1), 52–77. Charness, N., Krampe, R., & Mayr, U. (1996). The role of practice and coaching in entrepreneurial skill domains: An international comparison of life-span chess skill acquisition. In K.A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports and games. Mahwah, NJ: Lawrence Erlbaum Associates Inc. Chase, W.G., & Simon, H.A. (1973). Perception in chess. Cognitive Psychology, 4, 55–81. Chen, Z., & Daehler, M.W. (1989). Positive and negative transfer in analogical problem solving by 6-year-old children. Cognitive Development, 4(4), 327–344. Chen, Z., Yanowitz, K.L., & Daehler, M.W. (1995). Constraints on accessing abstract source information: Instantiation of principles facilitates children’s analogical transfer. Journal of Educational Psychology, 87(3), 445–454. Chi, M.T., Bassok, M., Lewis, M.W., Reimann, P., & Glaser, R. (1989). Self-explanations: How students study and use examples in learning to solve problems. Cognitive Science, 13, 145–182. Chi, M.T.H., & Bassok, M. (1989). Learning from examples via self-explanations. In L.B. Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 251–282). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Chi, M.T.H., de Leeuw, N., Chiu, M.H., & LaVancher, C. (1994). Eliciting self-explanations improves understanding. Cognitive Science, 18, 439–478.
Chi, M.T.H., Feltovich, P.J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121–152. Chi, M.T.H., Glaser, R., & Farr, M.J. (Eds.) (1988). The nature of expertise. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Chi, M.T.H., Glaser, R., & Rees, E. (1982). Expertise in problem solving (Vol. 1). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Chronicle, E.P., Ormerod, T.C., & MacGregor, J.N. (in press). When insight just won’t come: The failure of visual cues in the nine-dot problem. Quarterly Journal of Experimental Psychology. Clancey, W.J. (1997). The conceptual nature of knowledge, situations, and activity. In P.J.Feltovich, K.M.Ford, & R.R.Hoffman (Eds.), Expertise in context (pp. 247–291). London: MIT Press. Cobb, P., & Bowers, J. (1999). Cognitive and situated learning perspectives in theory and practice. Educational Researcher, 28(2), 4–15. Cohen, G. (1996). Memory in the real world. Hove, UK: Psychology Press. Collins, A.M., & Loftus, E.F. (1975). A spreading activation theory of semantic processing. Psychological Review, 82, 407–428. Collins, A.M., & Quillian, M.R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behaviour, 8, 240–248. Conway, M., & Kahney, H. (1987). Transfer of learning in inference problems. In J.Hallam & C.Mellish (Eds.), Advances in artificial intelligence. Chichester, UK: John Wiley. Cooke, N. (1992). Modelling human expertise in expert systems. In R.R.Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 31–60). New York: Springer-Verlag. Cooke, N.J., Atlas, R.S., Lane, D.H., & Berger, R.C. (1991). The role of high-level knowledge in memory for chess positions (unpublished manuscript). Houston, TX: Rice University. Cooper, G., & Sweller, J. (1987). Effects of schema acquisition and rule automation on mathematical problem solving transfer. Journal of Educational Psychology, 79(4), 347–362. 
Copeland, B.J. (1993). Artificial intelligence: A philosophical introduction. Oxford: Blackwell. Curtis, R.V., & Reigeluth, C.M. (1984). The use of analogies in written text. Instructional Science, 75, 99–117. Darling, D. (1996). On creating something from nothing. New Scientist, 151(2047), 49. Dawkins, R. (1995). River out of Eden. London: Weidenfeld & Nicolson. Deakin, J.M., & Allard, F. (1991). Skilled memory in expert figure skating. Memory and Cognition, 19, 79–86. DeGroot, A.D. (1965). Thought and choice in chess. The Hague: Mouton. DeGroot, A.D. (1966). Perception and memory versus thought. In B.Kleinmuntz (Ed.), Problem solving. New York: Wiley. Dellarosa-Cummins, D. (1992). The role of analogical reasoning in the induction of problem categories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18(5), 1103–1124. Dennett, D.C. (1996). Darwin’s dangerous idea. London: Penguin. Donnelly, C.M., & McDaniel, M.A. (1993). Use of analogy in learning scientific concepts. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19(4), 975–987. Dreyfus, H.L. (1997). Intuitive, deliberative, and calculative models of expert performance. In C.E.Zsambok & G.Klein (Eds.), Naturalistic decision making. Expertise: Research and applications (pp. 17–28). Mahwah, NJ: Lawrence Erlbaum Associates Inc. Duncker, K. (1945). On problem solving. Psychological Monographs, 58 (Whole no. 270). Egan, D.E., & Schwartz, B.J. (1979). Chunking in recall of symbolic drawings. Memory and Cognition, 7, 149–158. Eisenstadt, M. (1988). PD622 Intensive Prolog. Milton Keynes, UK: Open University Press. Elio, R., & Scharf, P.B. (1990). Modeling novice-to-expert shifts in problem-solving strategy and knowledge organization. Cognitive Science, 14(4), 579–639. Elo, A. (1978). The rating of chessplayers, past and present. New York: Arco. Ericsson, K.A., & Charness, N. (1997). Cognitive and developmental factors in expert performance. 
In P.J.Feltovich, K.M.Ford, & R.R.Hoffman (Eds.), Expertise in context (pp. 3–41). Cambridge, MA: MIT Press.
Ericsson, K.A., & Hastie, R. (1994). Contemporary approaches to the study of thinking and problem solving. In R.J.Sternberg (Ed.), Thinking and problem solving (2nd ed.). New York: Academic Press. Ericsson, K.A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211–245. Ericsson, K.A., Krampe, R.T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406. Ericsson, K.A., & Polson, P.G. (1988). A cognitive analysis of exceptional memory for restaurant orders. In M.T.H.Chi, R.Glaser, & M.J.Farr (Eds.), The nature of expertise. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Ericsson, K.A., & Simon, H.A. (1993). Protocol analysis: Verbal reports as data (rev. ed.). Cambridge, MA: MIT Press. Ericsson, K.A., & Smith, J. (1991). Prospects and limits of the empirical study of expertise: An introduction. In K.A.Ericsson & J.Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 1–38). Cambridge: Cambridge University Press. Erlich, K., & Soloway, E. (1984). An empirical investigation of tacit plan knowledge in programming. In J.C.Thomas & M.L.Schneider (Eds.), Human factors in computing systems. Norwood, NJ: Ablex. Eylon, B.-S., & Reif, F. (1984). Effects of knowledge organization on task performance. Cognition and Instruction, 1(1), 5–44. Falkenhainer, B., Forbus, K.D., & Gentner, D. (1989). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1–63. Feltovich, P.J., Johnson, P.E., Moller, J.H., & Swanson, D.B. (1984). LCS: The role and development of medical knowledge in diagnostic expertise. In W.C.Clancey & E.H.Shortliffe (Eds.), Readings in medical artificial intelligence (pp. 275–319). Reading, MA: Addison-Wesley. Feltovich, P.J., Spiro, R.J., & Coulson, R.L. (1997). Issues in expert flexibility in contexts characterized by complexity and change. 
In P.J.Feltovich, K.M.Ford, & R.R.Hoffman (Eds.), Expertise in context (pp. 126–146). London: MIT Press. Ferguson-Hessler, M.G.M., & de Jong, T. (1990). Studying physics texts: Differences in study processes between good and poor performers. Cognition and Instruction, 7(1), 41–54. Flower, L.S., & Hayes, J.R. (1980). The dynamics of composing: Making plans and juggling constraints. In L.W.Gregg & E.R.Steinberg (Eds.), Cognitive processes in writing (pp. 31–50). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Flower, L., & Hayes, J.R. (1981). Plans that guide the composing process. In C.H.Frederiksen & J.F.Dominic (Eds.), Writing: The nature, development and teaching of written communication. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Fodor, J.A. (1985). Précis of “The modularity of mind”. Behavioral and Brain Sciences, 8(1), 1–42. Frensch, P.A., & Buchner, A. (1999). Domain-generality versus domain-specificity in cognition. In R.J.Sternberg (Ed.), The nature of cognition. London: MIT Press. Frensch, P.A., & Funke, J. (Eds.) (1995). Complex problem solving: The European perspective. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Frensch, P., & Sternberg, R.J. (1989). Expertise and intelligent thinking: When is it worse to know better? In R.J.Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 5, pp. 157–188). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic Books. Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170. Gentner, D. (1989). The mechanisms of analogical reasoning. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning. London: Cambridge University Press. Gentner, D., & Forbus, K.D. (1991a). MAC/FAC: A model of similarity-based access and mapping. Paper presented at the 13th Annual Conference of the Cognitive Science Society, Chicago, IL. Gentner, D., & Forbus, K.D. (1991b).
MAC/FAC: A model of similarity-based retrieval. Paper presented at the 13th Annual Conference of the Cognitive Science Society, Chicago, IL.
Gentner, D., & Gentner, D.R. (1983). Flowing waters or teeming crowds: Mental models of electricity. In D.Gentner & A.L.Stevens (Eds.), Mental models. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Gentner, D., Rattermann, M.J., & Forbus, K.D. (1993). The roles of similarity in transfer: Separating retrievability from inferential soundness. Cognitive Psychology, 25, 524–575. Gentner, D., & Schumacher, R.M. (1987). Use of structure mapping theory for complex systems. Paper presented at the IEEE International Conference on Systems, Man and Cybernetics, 14–17 October, 1986, Piscataway, NJ. Gentner, D., & Toupin, C. (1986). Systematicity and surface similarity in the development of analogy. Cognitive Science, 10, 277–300. Gick, M.L. (1985). The effect of a diagram retrieval cue on spontaneous analogical transfer. Canadian Journal of Psychology, 39(3), 460–466. Gick, M.L. (1988). Two functions of diagrams in problem solving by analogy. In H.Mandl & J.R.Levin (Eds.), Knowledge acquisition from text and pictures (Vol. 58, pp. 215–231). Amsterdam: North-Holland. Gick, M.L., & Holyoak, K.J. (1980). Analogical problem solving. Cognitive Psychology, 12, 306–356. Gick, M.L., & Holyoak, K.J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1–38. Gick, M.L., & McGarry, S.J. (1992). Learning from mistakes: Inducing analogous solution failures to a source problem produces later successes in analogical transfer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 623–639. Gigerenzer, G., & Todd, P.M. (Eds.) (1999). Simple heuristics that make us smart. Oxford: Oxford University Press. Gilhooly, K.J. (1996). Thinking: Directed, undirected and creative (3rd ed.). London: Academic Press. Giora, R. (1993). On the function of analogies in informative text. Discourse Processes, 16, 591–596. Glaser, R. (1984). The role of knowledge. American Psychologist, 39(2), 93–104. Glaser, R. (1996). 
Changing the agency for learning: Acquiring expert performance. In K.A.Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games. Mahwah, NJ: Lawrence Erlbaum Associates Inc. Glaser, R., & Bassok, M. (1989). Learning theory and the study of instruction. In M.R.Rosenzweig & L.W.Porter (Eds.), Annual review of psychology (Vol. 40, pp. 631–666). Palo Alto, CA: Annual Reviews Inc. Gobet, F., & Simon, H.A. (1996). Templates in chess memory: A mechanism for recalling several boards. Cognitive Psychology, 31(1), 1–40. Graybiel, A.M. (1998). The basal ganglia and chunking of action repertoires. Neurobiology of Learning and Memory, 70, 119–136. Greeno, J.G. (1974). Hobbits and Orcs: Acquisition of a sequential concept. Cognitive Psychology, 6, 270–292. Greeno, J.G. (1977). Process of understanding in problem solving. In N.J.Castellan, D.B.Pisoni, & G.R.Potts (Eds.), Cognitive theory (Vol. 2, pp. 43–83). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Greeno, J.G., Moore, J.L., & Smith, D.R. (1993). Transfer of situated learning. In D.K.Detterman & R.J.Sternberg (Eds.), Transfer on trial: Intelligence, cognition, and instruction (pp. 99–167). Norwood, NJ: Ablex. Halpern, D.F. (1996). Thought and knowledge: An introduction to critical thinking (3rd ed.). Hove, UK: Lawrence Erlbaum Associates Inc. Halpern, D.F., Hanson, C., & Riefer, D. (1990). Analogies as an aid to understanding and memory. Journal of Educational Psychology, 82(2), 298–305. Hasemer, T., & Domingue, J. (1989). Common Lisp programming for artificial intelligence. Wokingham, UK: Addison-Wesley. Hatano, G., & Inagaki, K. (1986). Two courses of expertise. In H.Stevenson, H.Azuma, & K.Hakuta (Eds.), Child development in Japan. San Francisco, CA: W.H.Freeman. Hawking, S.G. (1988). A brief history of time. London: Bantam Press. Hayes, J.R. (1989a). The complete problem solver (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Hayes, J.R.
(1989b). Writing research: The analysis of a very complex task. In D.Klahr & K. Kotovsky (Eds.), Complex information processing: The impact of Herbert A.Simon (pp. 209–268). Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Hayes, J.R., & Flower, L.S. (1980). Identifying the organisation of writing processes. In L.W.Gregg & E.R.Steinberg (Eds.), Cognitive processes in writing (pp. 3–30). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Hayes, J.R., & Simon, H.A. (1974). Understanding written problem instructions. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Heller, J.I. (1979). Cognitive processes in verbal analogy solutions. Pittsburgh, PA: University of Pittsburgh. Hiebert, J., & Lefèvre, P. (1986). Conceptual and procedural knowledge in mathematics: An introductory analysis. In J.Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 1–27). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Hinsley, D.A., Hayes, J.R., & Simon, H.A. (1977). From words to equations: Meaning and representation in algebra word problems. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Hinton, P. (2000). Stereotypes, cognition and culture. Hove, UK: Psychology Press. Hoffman, R.R. (Ed.) (1992). The psychology of expertise: Cognitive research and empirical AI. New York: Springer-Verlag. Hofstadter, D. (Ed.). (1997). Fluid concepts and creative analogies: Computer models of the fundamental mechanisms of thought. London: Allen Lane, The Penguin Press. Holland, J.H., Holyoak, K.J., Nisbett, R.E., & Thagard, P.R. (1986). Induction: Processes of inference, learning and discovery. Cambridge, MA: MIT Press. Holyoak, K.J. (1984). Mental models in problem solving. In J.R.Anderson & S.M.Kosslyn (Eds.), Tutorials in learning and memory: Essays in honour of Gordon Bower (pp. 193–218). San Francisco, CA: W.H.Freeman. Holyoak, K.J. (1985). The pragmatics of analogical transfer. In G.H.Bower (Ed.), The psychology of learning and motivation (Vol. 19). New York: Academic Press. Holyoak, K.J., & Koh, K. (1987). Surface and structural similarity in analogical transfer. Memory and Cognition, 15(4), 332–340. Holyoak, K.J., & Thagard, P. (1989a). Analogical mapping by constraint satisfaction.
Cognitive Science, 13(3), 295–355. Holyoak, K.J., & Thagard, P.R. (1989b). A computational model of analogical problem solving. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning. London: Cambridge University Press. Holyoak, K.J., & Thagard, P. (1995). Mental leaps. Cambridge, MA: MIT Press. Hong, E., & O’Neil, H.F. (1992). Instructional strategies to help learners build relevant mental models in inferential statistics. Journal of Educational Psychology, 84(2), 150–159. Hughes, P. (1996). Failing the comprehension test. New Scientist, 151(2046), 45. Hull, C.L. (1920). Quantitative aspects of the evolution of concepts. Psychological monographs, 28(123). Humphrey, N. (1984). Consciousness regained. Oxford: Oxford University Press. Humphrey, N. (1992). A history of the mind. London: Chatto & Windus. Hunt, E. (1994). Problem solving. In R.J.Sternberg (Ed.), Thinking and problem solving (2nd ed.). New York: Academic Press. Iding, M.K. (1993). Instructional analogies and elaborations in science text: Effects on recall and transfer performance. Reading Psychology: An International Quarterly, 14, 33–55. Iding, M.K. (1997). How analogies foster learning from science texts. Instructional Science, 25(4), 233–253. Issing, L.J., Hannemann, J., & Haack, J. (1989). Visualization by pictorial analogies in understanding expository text. In H.Mandl & J.R.Levin (Eds.), Knowledge acquisition from text and picture (pp. 195–214). Amsterdam: Elsevier. Johnson, K.E., & Mervis, C.B. (1997). Effects of varying levels of expertise on the basic level of categorization. Journal of Experimental Psychology: General, 126(3), 248–277. Johnson-Laird, P.N. (1988a). The computer and the mind: An introduction to cognitive science. London: Fontana Press. Johnson-Laird, P.N. (1988b). A taxonomy of thinking. In R.J.Sternberg & E.E.Smith (Eds.), The psychology of human thought. Cambridge: Cambridge University Press. Johnson-Laird, P.N. (1989). Analogy and the exercise of creativity. 
In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning. London: Cambridge University Press.
Johnson-Laird, P.N., Herrmann, D.J., & Chaffin, R. (1984). Only connections: A critique of semantic networks. Psychological Bulletin, 96, 292–315. Kahney, H. (1982). An in-depth study of the cognitive behaviour of novice programmers (HCRL Technical Report 5). Milton Keynes, UK: The Open University. Kahney, H. (1993). Problem solving: Current issues (2nd ed.). Milton Keynes, UK: Open University Press. Kalyuga, S., Chandler, P., & Sweller, J. (1998). Levels of expertise and instructional design. Human Factors, 40(1), 1–17. Kaplan, C.A., & Simon, H.A. (1990). In search of insight. Cognitive Psychology, 22(3), 374–419. Karmiloff-Smith, A. (1994). Précis of “Beyond modularity: A developmental perspective on cognitive science”. Behavioral and Brain Sciences, 17, 693–745. Keane, M.T.G. (1988). Analogical problem solving. Chichester: Ellis Horwood. Keane, M.T. (1989a). Modelling “insight” in practical construction problems. Irish Journal of Psychology, 11, 202–215. Keane, M.T.G. (1989b). Modelling problem solving in Gestalt “insight” problems (HCRL). Milton Keynes, UK: The Open University. Keane, M.T.G. (1990). Incremental analogizing: Theory and model. Vol. 1: Representation, reasoning, analogy and decision making. Chichester, UK: John Wiley. Keane, M.T. (1997). What makes an analogy difficult? The effects of order and causal structure on analogical mapping. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(4), 946–967. Kieras, D.E. (1985). Thematic processes in the comprehension of technical prose. In B.K.Britton & J.B.Black (Eds.), Understanding expository text: A theoretical and practical handbook for analyzing explanatory text (pp. 89–107). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Kieras, D.E., & Bovair, S. (1986). The acquisition of procedures from text: A production-system analysis of transfer of training. Journal of Memory and Language, 25, 507–524. Kintsch, W. (1986). Learning from text. Cognition and Instruction, 3(2), 87–108.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge: Cambridge University Press. Kintsch, W., & Van Dijk, T.A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363–394. Klahr, D., & Carver, S.M. (1988). Cognitive objectives in a LOGO debugging curriculum: Instruction, learning, and transfer. Cognitive Psychology, 20(3), 362–404. Koedinger, K.R., & Anderson, J.R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14(4), 511–550. Koestler, A. (1970). The act of creation (rev. Danube ed.). London: Pan Books. Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann. Lachman, J.L., Lachman, R., & Thronesberry, C. (1979). Metamemory through the adult lifespan. Developmental Psychology, 15, 543–551. Laird, J.E., Newell, A., & Rosenbloom, P.S. (1987). SOAR: An architecture for general intelligence. Artificial Intelligence, 33, 1–64. Laird, J.E., & Rosenbloom, P.S. (1996). The evolution of the SOAR cognitive architecture. In D.Steier & T.M.Mitchell (Eds.), Mind matters: A tribute to Allen Newell. Mahwah, NJ: Lawrence Erlbaum Associates Inc. Lamberts, K. (1990). A hybrid model of learning to solve physics problems. European Journal of Cognitive Psychology, 2(2), 151–170. Larkin, J.H. (Ed.) (1978). Problem solving in physics: Structure, process and learning. The Netherlands: Sijthoff & Noordhoff. Larkin, J.H. (1989). What kind of knowledge transfers? In L.B.Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 283–306). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Larkin, J.H., Reif, F., Carbonell, J.G., & Gugliotta, A. (1986). FERMI: A flexible expert reasoner with multi-domain inferencing (Tech. Rep.). Pittsburgh, PA: Department of Psychology, Carnegie-Mellon University.
Larkin, J.H., & Simon, H.A. (1987). Why a diagram is sometimes worth ten-thousand words. Cognitive Science, 11, 65–99. Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press. LeFèvre, J.A. (1987). Processing instructional texts and examples. Canadian Journal of Psychology, 41(3), 351–364. LeFèvre, J., & Dixon, P. (1986). Do written instructions need examples? Cognition and Instruction, 3, 1–30. Lesgold, A., Rubinson, H., Feltovich, P., Glaser, R., Klopfer, D., & Wang, Y. (1988). Expertise in a complex skill: Diagnosing X-ray pictures. In M.T.H.Chi, R.Glaser, & M.J.Farr (Eds.), The nature of expertise (pp. 311–342). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Lesgold, A.M. (1984). Acquiring expertise. In J.R.Anderson (Ed.), Tutorials in learning and memory (pp. 31–60). San Francisco, CA: W.H.Freeman. Levin, J.R. (1988). A transfer-appropriate processing perspective of pictures in prose. In H.Mandl & J.R.Levin (Eds.), Knowledge acquisition from text and pictures. Amsterdam: North-Holland. Levine, M. (1975). A cognitive theory of learning: Research on hypothesis testing. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Lewis, A.B. (1989). Training students to represent arithmetic word problems. Journal of Educational Psychology, 81 (4), 521–531. Logan, G.D. (1988). Toward an instance theory of automatization. Psychological Review, 95(4), 492–527. Luchins, A.S. (1942). Mechanization in problem solving: The effect of Einstellung. Psychological Monographs, 54(248). Luchins, A.S., & Luchins, E.H. (1950). New experimental attempts at preventing mechanization in problem solving. Journal of General Psychology, 42, 279–297. Luchins, A.S., & Luchins, E.H. (1959). Rigidity of behaviour. Eugene, OR: University of Oregon Press. Luger, G.F., & Bauer, M.A. (1978). Transfer effects in isomorphic problem situations. Acta Psychologica, 42, 121–131. Lung, C.-T., & Dominowski, R.L. (1985). 
Effects of strategy instructions and practice on nine-dot problem solving. Journal of Experimental Psychology: Learning, Memory and Cognition, 11, 804–811. MacGregor, J.N., Ormerod, T.C., & Chronicle, E.P. (in press). Information-processing and insight: A process model of performance on the nine-dot and related problems. Journal of Experimental Psychology: Learning, Memory, and Cognition. Maier, N.R.F. (1931). Reasoning in humans II: The solution of a problem and its appearance in consciousness. Journal of Comparative Psychology, 12, 181–194. Marcus, N., Cooper, M., & Sweller, J. (1996). Understanding instructions. Journal of Educational Psychology, 88(1), 49–63. Marshall, S.P. (1995). Schemas in problem solving. Cambridge: Cambridge University Press. Mayer, R.E. (1993). The instructive metaphor: Metaphoric aids to students’ understanding of science. In A.Ortony (Ed.), Metaphor and thought (2nd ed., pp. 561–578). Cambridge: Cambridge University Press. McKeithen, K.B., Reitman, J.S., Rueter, H.H., & Hirtle, S.C. (1981). Knowledge organization and skill differences in computer programmers. Cognitive Psychology, 13, 307–325. McKendree, J., & Anderson, J.R. (1987). Effect of practice on knowledge and use of Basic LISP. In J.M.Carroll (Ed.), Interfacing thought. Cambridge, MA: MIT Press. McNab, A. (1994). Bravo two zero. London: Corgi. Meadows, S. (1993). The child as thinker. London: Routledge. Medin, D., & Ortony, A. (1989). Comments on Part I: Psychological essentialism. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning. London: Cambridge University Press. Medin, D.L., & Ross, B.H. (1989). The specific character of abstract thought: Categorization, problem solving and induction (Vol. 5). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Metcalfe, J. (1986). Feeling of knowing in memory and problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12(2), 288–294. Metcalfe, J., & Wiebe, D. (1987). 
Intuition in insight and noninsight problems. Memory and Cognition, 15(3), 238–246.
Miller, G.A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97. Mithen, S. (1996). The prehistory of the mind. London: Thames & Hudson. Morris, C.D., Bransford, J.D., & Franks, J.J. (1977). Levels of processing versus transfer appropriate processing. Journal of Verbal Learning and Verbal Behavior, 16, 519–533. Müller, B. (1999). Use specificity of cognitive skills: Evidence for production rules? Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(1), 191–207. Murphy, G.L., & Wright, J.C. (1984). Changes in conceptual structure with expertise: Differences between real-world experts and novices. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 144–155. Nathan, M.J., Kintsch, W., & Young, E. (1992). A theory of algebra-word-problem comprehension and its implications for the design of learning environments. Cognition and Instruction, 9(4), 329–389. Newell, A. (1980). Physical symbol systems. Cognitive Science, 4, 135–183. Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press. Newell, A., & Rosenbloom, P.S. (1981). Mechanisms of skill acquisition and the law of practice. In J.R.Anderson (Ed.), Cognitive skills and their acquisition. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Newell, A., & Simon, H.A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall. Norman, G.R., Brooks, L.R., & Allen, S.W. (1989). Recall by expert medical practitioners and novices as a record of processing attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(6), 1166–1174. Novick, L.R. (1990). Representational transfer in problem solving. Psychological Science, 1(2), 128–132. Novick, L.R., & Hmelo, C.E. (1994). Transferring symbolic representations across nonisomorphic problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(6), 1296–1321.
Novick, L.R., & Holyoak, K.J. (1991). Mathematical problem solving by analogy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(3), 398–415. Ohlsson, S. (1992). Information processing explanations of insight and related phenomena. In M.T.Keane & K.J.Gilhooly (Eds.), Advances in the psychology of thinking (pp. 1–43). London: Harvester-Wheatsheaf. Ohlsson, S., & Rees, E. (1991). The function of conceptual understanding in the learning of arithmetic procedures. Cognition and Instruction, 8(2), 103–179. Ormerod, T.C., Chronicle, E.P., & MacGregor, J.N. (1997, August). Facilitation in variants of the nine-dot problem: Perceptual or cognitive mediation. Poster presented at the 18th Annual Conference of the Cognitive Science Society, Stanford, CA. Papert, S. (1980). Mindstorms: Children, computers and powerful ideas. New York: Harvester Press. Papert, S. (1993). The children’s machine. New York: Basic Books. Partridge, D. (1987, June). Is intuitive expertise rule based? Paper presented at the Third International Expert Systems Conference, Oxford. Patel, V.L., & Groen, G.J. (1991). The general and specific nature of medical expertise. In K.A.Ericsson & J.Smith (Eds.), Toward a general theory of expertise (pp. 93–125). Cambridge: Cambridge University Press. Patel, V.L., & Ramoni, M.F. (1997). Cognitive models of directional inference in expert medical reasoning. In P.J.Feltovich, K.M.Ford, & R.R.Hoffman (Eds.), Expertise in context (pp. 67–99). London: MIT Press. Paull, G., & Glencross, D. (1997). Expert perception and decision making in baseball. International Journal of Sport Psychology, 28(1), 35–56. Payne, S.J., Squibb, H.R., & Howes, A. (1990). The nature of device models: The yoked state space hypothesis and some experiments with text editors. Human-Computer Interaction, 5, 415–444. Pellegrino, J.W., & Glaser, R. (1982). Analyzing aptitudes for learning: Inductive reasoning. In R.Glaser (Ed.), Advances in instructional psychology (Vol. 2, pp.
245–269). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Pennington, N., & Rehder, B. (1996). Looking for transfer and interference. In D.L.Medin (Ed.), The psychology of learning and motivation (Vol. 33). New York: Academic Press. Perkins, D.N., & Salomon, G. (1989). Are cognitive skills context bound? Educational Researcher, 18, 16–25. Petrie, H.G., & Oshlag, R. (1993). Metaphor and learning. In A.Ortony (Ed.), Metaphor and thought (2nd ed., pp. 579–609). Cambridge: Cambridge University Press.
Pirolli, P. (1991). Effects of examples and their explanations in a lesson on recursion: A production system analysis. Cognition and Instruction, 8(3), 207–259. Pirolli, P.L., & Anderson, J.R. (1985). The role of learning from examples in the acquisition of recursive programming skills [Special Issue: Skill]. Canadian Journal of Psychology, 39(2), 240–272. Posner, M.I. (1973). Cognition: An introduction. Glenview, IL: Scott, Foresman. Postman, L., & Schwartz, M. (1964). Studies of learning to learn: Transfer as a function of method and practice and class of verbal materials. Journal of Verbal Learning and Verbal Behavior, 3, 37–49. Raichle, M.E. (1998). The neural correlates of consciousness: An analysis of cognitive skill learning. Philosophical Transactions of the Royal Society, London B: Biological Sciences, 353(1377), 1889–1901. Raufaste, E., Eyrolle, H., & Marine, C. (1999). Pertinence generation in radiological diagnosis: Spreading activation and the nature of expertise. Cognitive Science, 22(4), 517–546. Reason, J. (1990). Human error. Cambridge: Cambridge University Press. Reed, S.K., Dempster, A., & Ettinger, M. (1985). Usefulness of analogous solutions for solving algebra word problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(1), 106–125. Reed, S.K., Ernst, G.W., & Banerji, R. (1974). The role of analogy in transfer between similar problem states. Cognitive Psychology, 6, 436–450. Reed, S.K., & Ettinger, M. (1987). Usefulness of tables for solving word problems. Cognition and Instruction, 4(1), 43–58. Reeves, L.M., & Weisberg, R.W. (1993). On the concrete nature of human thinking: Content and context in analogical transfer [Special Issue: Thinking]. Educational Psychology, 13(3–4), 245–258. Reimann, P., & Schult, T.J. (1996). Turning examples into cases: Acquiring knowledge structures for analogical problem solving. Educational Psychologist, 31(2), 123–132. Resnick, L.B. (1989). Introduction.
In L.B.Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 1–24). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Rist, R.S. (1989). Schema creation in programming. Cognitive Science, 13, 389–414. Robertson, S.I. (1999). Types of thinking. London: Routledge. Robertson, S.I. (2000). Imitative problem solving: Why transfer of learning often fails to occur. Instructional Science, 28, 263–289. Robertson, S.I., & Kahney, H. (1996). The use of examples in expository texts: Outline of an interpretation theory for text analysis. Instructional Science, 24(2), 89–119. Robinson, D.H., & Kiewra, K.A. (1995). Visual argument: Graphic organisers are superior to outlines in improving learning from text. Journal of Educational Psychology, 87(3), 455–467. Ross, B.H. (1984). Remindings and their effects in learning a cognitive skill. Cognitive Psychology, 16, 371–416. Ross, B.H. (1987). This is like that: The use of earlier problems and the separation of similarity effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13(4), 629–639. Ross, B.H. (1989a). Distinguishing types of superficial similarities. Different effects on the access and use of earlier problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 510–520. Ross, B.H. (1989b). Remindings in learning and instruction. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning (pp. 438–469). London: Cambridge University Press. Ross, B.H. (1996). Category learning as problem solving. In D.L.Medin (Ed.), The psychology of learning and motivation (Vol. 35, pp. 165–192). New York: Academic Press. Rozin, P. (1976). The evolution of intelligence and access to the cognitive unconscious. In J.M. Sprague & A.N.Epstein (Eds.), Progress in psychobiology and physiological psychology. New York: Academic Press. Scheerer, M. (1963). Problem-solving. Scientific American, 208, 118–128. Schmidt, H.G., Norman, G.R., & Boshuizen, H.P. (1990). 
A cognitive perspective on medical expertise: Theory and implications. Academic Medicine, 65(10), 611–621. Schneider, W., Körkel, J., & Weinert, F.E. (1989). Expert knowledge and general abilities and text processing. In W.Schneider & F.E.Weinert (Eds.), Interaction among aptitudes, strategies, and knowledge in cognitive performance. New York: Springer-Verlag.
Schön, D.A. (1993). Generative metaphor: A perspective on problem-setting in social policy. In A.Ortony (Ed.), Metaphor and thought (pp. 137–163). Cambridge: Cambridge University Press. Schumacher, R.M., & Czerwinski, M.P. (1992). Mental models and the acquisition of expert knowledge. In R.R.Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI. New York: Springer-Verlag. Schumacher, R.M., & Gentner, D. (1988). Transfer of training as analogical mapping [Special Issue: Human-computer interaction and cognitive engineering]. IEEE Transactions on Systems, Man, and Cybernetics, 18(4), 592–600. Schunn, C.D., & Anderson, J.R. (1999). The generality/specificity of expertise in scientific reasoning. Cognitive Science, 23(3), 337–370. Schunn, C.D., & Dunbar, K. (1996). Priming, analogy and awareness in complex reasoning. Memory and Cognition, 24(3), 271–284. Shallice, T. (1988). Information-processing models of consciousness: Possibilities and problems. In A.J.Marcel & E.Bisiach (Eds.), Consciousness in contemporary science (pp. 305–333). Oxford: Oxford Science Publications. Silver, E.A. (1979). Student perceptions of relatedness among mathematical problem solvers. Journal for Research in Mathematics Education, 10, 195–210. Silver, E.A. (1986). Using conceptual and procedural knowledge: A focus on relationships. In J.Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 181–198). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Simon, H.A. (1966). Scientific discovery and the psychology of problem solving. In R.G.Colodny (Ed.), Mind and cosmos: Essays in contemporary science and philosophy (pp. 22–40). Pittsburgh, PA: University of Pittsburgh Press. Simon, H.A., & Chase, W.G. (1973). Skill in chess. American Scientist, 61, 394–403. Simon, H.A., & Hayes, J.R. (1976). The understanding process: Problem isomorphs. Cognitive Psychology, 8, 165–190. Simon, H.A., & Kaplan, C.A. (1989). Foundations of cognitive science.
In M.I.Posner (Ed.), Foundations of cognitive science (pp. 1–47). Cambridge, MA: MIT Press/Bradford Books. Simon, H.A. (1975). The functional equivalence of problem solving skills. Cognitive Psychology, 7, 269–288. Simons, P.R.J. (1984). Instructing with analogies. Journal of Educational Psychology, 76(3), 513–527. Singley, M.K., & Anderson, J.R. (1985). The transfer of text-editing skill. Journal of Man-Machine Studies, 22, 403–423. Singley, M.K., & Anderson, J.R. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University Press. Smith, E.E., & Goodman, L. (1984). Understanding written instructions: The role of an explanatory schema. Cognition and Instruction, 1(4), 359–396. Solso, R.L. (1995). Cognitive psychology (4th ed.). London: Allyn and Bacon. Spence, M.T., & Brucks, M. (1997). The moderating effects of problem characteristics on experts’ and novices’ judgments. Journal of Marketing Research, 34(2), 233–247. Spencer, R.M., & Weisberg, R.W. (1986). Context dependent effects on analogical transfer. Memory and Cognition, 14(5), 442–449. Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L.A.Hirschfeld & S.A.Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture. Cambridge: Cambridge University Press. Squire, L.R., Knowlton, B., & Musen, G. (1993). The structure and organisation of memory. Annual Review of Psychology, 44, 453–495. Sternberg, R.J. (1996a). Cognitive psychology. Orlando, FL: Harcourt Brace. Sternberg, R.J. (1996b). The costs of expertise. In K.A.Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 347–354). Mahwah, NJ: Lawrence Erlbaum Associates Inc. Sternberg, R.J. (1997a). Cognitive conceptions of expertise. In P.J.Feltovich, K.M.Ford, & R.R.Hoffman (Eds.), Expertise in context (pp. 149–162). London: MIT Press. Sternberg, R.J. (1997b). Thinking styles. Cambridge: Cambridge University Press.
Sternberg, R.J. (1998, June). Abilities as developing expertise. Paper presented at the International Conference on the Application of Psychology to the Quality of Learning and Teaching, Hong Kong. Sternberg, R.J., & Frensch, P.A. (1992). On being an expert: A cost-benefit analysis. In R.R.Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 191–203). New York: Springer-Verlag. Sternberg, R.J., & Nigro, G. (1983). Interaction and analogy in the comprehension and appreciation of metaphors. Quarterly Journal of Experimental Psychology, 35A, 17–38. Stork, D.G. (1997). Scientist on the set: An interview with Marvin Minsky. In D.G.Stork (Ed.), HAL’s legacy: 2001’s computer as dream and reality. Cambridge, MA: MIT Press. Swanson, H.L., O’Connor, J.E., & Carter, K.R. (1991). Problem-solving subgroups as a measure of intellectual giftedness. British Journal of Educational Psychology, 61(1), 55–72. Sweller, J. (1980). Transfer effects in a problem solving context. Quarterly Journal of Experimental Psychology, 32, 233–239. Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 257–285. Sweller, J., & Cooper, G.A. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2, 59–89. Sweller, J., & Gee, W. (1978). Einstellung, the sequence effect and hypothesis theory. Journal of Experimental Psychology: Human Learning and Memory, 4, 513–526. Tardieu, H., Ehrlich, M.-F., & Gyselinck, V. (1992). Levels of representation and domain-specific knowledge in comprehension of scientific texts [Special Issue: Discourse representation and text processing]. Language and Cognitive Processes, 7(3–4), 335–351. Thomas, J.C. (1974). An analysis of behavior in the Hobbits-Orcs problem. Cognitive Psychology, 6, 257–269. Thorndike, E.L. (1913). Educational psychology (Vol. 2). New York: Columbia University Press. Thorndike, E.L., & Woodworth, R.S. (1901).
The influence of improvement in one mental function upon the efficiency of other functions. Psychological Review, 8, 247–261. Thune, L.E. (1951). Warm-up effect as a function of level of practice in verbal learning. Journal of Experimental Psychology, 42, 250–256. Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327–352. Van Dijk, T.A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press. VanLehn, K. (1986). Arithmetic procedures are induced from examples. In J.Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics. Hillsdale, NJ: Lawrence Erlbaum Associates Inc. VanLehn, K. (1989). Problem solving and cognitive skill acquisition. In M.I.Posner (Ed.), Foundations of cognitive science. Cambridge, MA: MIT Press. VanLehn, K. (1990). Mind bugs: The origins of procedural misconceptions. Cambridge, MA: MIT Press. VanLehn, K., Jones, R.M., & Chi, M.T.H. (1992). A model of the self-explanation effect. The Journal of the Learning Sciences, 2, 1–59. Vosniadou, S. (1989). Analogical reasoning as a mechanism in knowledge acquisition: A developmental perspective. In S.Vosniadou & A.Ortony (Eds.), Similarity and analogical reasoning. London: Cambridge University Press. Voss, J.F., Greene, T.R., Post, T.A., & Penner, B.C. (1983). Problem solving skill in the social sciences (Vol. 17). New York: Academic Press. Ward, M., & Sweller, J. (1990). Structuring effective worked examples. Cognition and Instruction, 7 (1), 1–39. Waters, A.J., Underwood, G., & Findlay, J.M. (1997). Studying expertise in music reading: Use of a pattern-matching paradigm. Perception and Psychophysics, 59(4), 477–88. Weisberg, R.W. (1995). Prolegomena to theories of insight in problem solving: A taxonomy of problems. In R.J.Sternberg & J.E.Davidson (Eds.), The nature of insight. Cambridge, MA: MIT Press. Weisberg, R.W., & Alba, J.W. (1981). An examination of the alleged role of “fixation” in the solution of several “insight” problems. 
Journal of Experimental Psychology: General, 110, 169–192. Weisberg, R.W., & Alba, J.W. (1982). Problem solving is not like perception: More on Gestalt theory. Journal of Experimental Psychology: General, 111, 326–330. Wertheimer, M. (1945). Productive thinking. New York: Harper & Row.
Wheatley, G.H. (1984). Problem solving in school mathematics (MEPS Technical Report 84.01). West Lafayette, IN: Purdue University, School of Mathematics and Science Center. White, B.Y. (1993). Intermediate causal models: A missing link for successful science education? In R.Glaser (Ed.), Advances in instructional psychology (Vol. 4). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Winston, P.H., & Horn, B.K.P. (1989). LISP (3rd ed.). Reading, MA: Addison-Wesley. Yaniv, I., & Meyer, D.E. (1987). Activation and metacognition of inaccessible information: Potential bases for incubation effects in problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 187–205. Young, R.M., & O’Shea, T. (1981). Errors in children’s subtraction. Cognitive Science, 5, 153–177. Zajonc, R.B. (1965). Social facilitation. Science, 149, 269–274. Zajonc, R.B. (1980). Compresence. In P.B.Paulus (Ed.), Psychology of group influence (pp. 35–60). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Zeitz, C.M. (1997). Some concrete advantages of abstraction: How experts’ representations facilitate reasoning. In P.J.Feltovich, K.M.Ford, & R.R.Hoffman (Eds.), Expertise in context (pp. 43–65). Cambridge, MA: MIT Press.
Author index
Adelson, B. 211–12, 214, 247 Ahlum-Heath, M.E. 13, 247 Alba, J.W. 60, 68, 262 Allard, F. 214, 251 Allen, S.W. 216, 254, 256–7 Anderson, J.R. xiii, xiv, 5, 14, 46–7, 84, 91, 93–6, 103, 111, 123, 131, 153, 183–4, 187–9, 192, 194–6, 198, 205, 210–11, 231, 247, 254–8, 260 Arbitman-Smith, R. 249 Ashcraft, M.H. 49, 248 Atkinson, R.C. 26, 248 Atlas, R.S. 216, 251 Ausubel, D.P. 145, 248 Azuma, H. 253
Baddeley, A. 12, 248 Ballstaedt, S.-P. 13, 248 Banerji, R. 35, 40, 88–9, 133, 259 Barclay, J.R. 22, 249 Barsalou, L.W. 64, 246, 248 Bartlett, F.C. 146, 248 Bassok, M. 101, 178, 181, 197, 211, 248, 250, 253 Bauer, M.A. xii, 89, 256 Beck, I.L. 154, 248 Bejar, I.I. 120–1, 244, 248 Bereiter, C. 219, 222, 226, 242, 248–9 Berger, R.C. 216, 251 Berry, D.C. 13, 248 Beveridge, M. 167–9, 248 Bisiach, E. 260 Black, J.B. 255 Black, M. 105, 248 Blanchette, I. 137, 248 Blessing, S.B. 129, 179, 248 Boden, M. 132, 249 Boshuizen, H.P. 208, 259 Bovair, S. xii, xiii, 47, 96, 99, 249, 255
Bowers, J. 103, 198, 250 Boyle, C.F. 194, 247 Brainerd, C.J. 210, 249 Bransford, J.D. 22, 147, 151, 169, 245, 249, 257 Brewer, W.F. 28, 249 Britton, B.K. 154, 223, 249, 255 Broadbent, D.E. 190, 249 Brooks, L.R. 216, 257 Brown, A.L. 198, 249 Brown, D.E. 166–7, 249 Brown, J.S. 165, 249 Brucks, M. 211, 260 Bryson, M. 219, 223, 249 Buchner, A. 205, 252 Bugliosi, V. 147, 249 Burton, R.R. 165, 249 Byrne, M.D. 47, 249 Byrnes, J.P. 93, 165, 249
Carbonell, J.G. 131, 207, 211, 249, 256 Cardinale, L.A. 207, 249 Carey, S. 132, 249 Carraher, D. 103, 249 Carraher, T.N. 103, 249 Carroll, J.M. 156, 249, 257 Carter, K.R. 204, 261 Carver, S.M. 82, 255 Castellan, N.J. 253 Catrambone, R. 149, 178, 250 Ceci, S.J. 204, 250 Chaffin, R. 120, 248, 250, 255 Chan, C.S. 211, 250 Chandler, P. 209, 255 Charness, N. 205, 218, 225, 232, 250–1 Chase, W.G. 201, 214, 216–17, 250, 260 Chen, Z. 159, 169, 178, 250 Chipman, S.F. 249
Chiu, M.H. 101, 250 Chronicle, E.P. 10, 68, 239, 250, 257–8 Clancey, W.C. 252 Clancey, W.J. 197, 250 Clement, J. 166–7, 249 Cobb, P. 103, 198, 250 Cohen, G. 61, 250 Collins, A.M. 111, 185, 250–1 Colodny, R.G. 260 Conway, M. 137, 155, 251 Cooke, N. 216, 251 Cooper, G.A. 154, 251, 261 Cooper, M. 209, 257 Copeland, B.J. 197, 251 Coulson, R.L. 211–12, 252 Curtis, R.V. 80, 251 Czerwinski, M.P. 207, 240, 243, 260 Daehler, M.W. 159, 169, 178, 250 Darling, D. 149, 251 Davidson, J.E. 262 Dawkins, R. 79, 148, 251 de Jong, T. 153, 252 deLeeuw, N. 101, 250 Deakin, J.M. 214, 251 DeGroot, A.D. 201–2, 204, 214, 251 Dellarosa-Cummins, D. 179, 251 Dempster, A. 160, 259 Dennett, D.C. 38, 40, 251 Detterman, D.K. 253 DiVesta, F.J. 13, 247 Dixon, P. 156, 256 Domingue, J. 165, 253 Dominic, J.F. 252 Dominowski, R.L. 68, 256 Donnelly, C.M. xiii, 144–5, 251 Dreyfus, H.L. 206, 251 Duncker, K. xi, xii, 4–5, 55, 57–9, 63–4, 66–7, 109, 133, 164, 230, 251 Egan, D.E. 216, 251 Ehrlich, M.-F. 159, 261 Eisenstadt, M. 40, 251 Elio, R. 207, 251 Elo, A. 204, 251 Embretson, S. 120, 248 English, L.D. 54, 199, 208, 248 Epstein, A.N. 259
Ericsson, K.A. 12–13, 26–7, 204–6, 217–18, 225, 250–3, 258, 260 Erlich, K. 210, 252 Ernst, G.W. 35, 40, 88–9, 133, 259 Ettinger, M. 160, 167, 259 Eylon, B.-S. 170, 252 Eyrolle, H. 208, 259 Falkenhainer, B. 122–3, 231, 252 Farr, M.J. 202, 250–1, 256 Farrell, R. 153, 247 Feltovich, P.J. 211–12, 214, 230, 250–2, 256, 258, 261–2 Fincham, J.M. 195–6, 247 Findlay, J.M. 214, 262 Flower, L. xiii, 219–21, 252, 254 Fodor, J.A. 132, 252 Forbus, K.D. 115, 122, 131, 231, 252 Ford, J.R. 156, 249, 262 Ford, K.M. 250–2, 258, 261–2 Franks, J.J. 22, 169, 245, 249, 257 Fredriksen, C.H. 252 Frensch, P.A. 5, 205, 206, 211, 252, 261 Gardner, H. 132, 205, 252 Gee, W. 84, 261 Gelman, S.A. 249, 260 Gentner, D. xii, 115, 122–7, 129, 131, 142, 144, 148, 157, 231, 245, 252–3, 260 Gentner, D.R. 142, 252 Gick, M.L. xiii, 52, 57, 65–6, 107, 109, 123, 127, 133–4, 138–9, 167–9, 179, 253 Gigerenzer, G. 15, 36, 129, 253, 261 Gilhooly, K.J. 54, 253, 258 Giora, R. xiv, 145, 146, 253 Glaser, R. 22, 101, 121, 139, 197, 202, 206, 209, 211, 213–14, 230, 248–51, 253, 256, 258–9, 262 Glencross, D. 217, 258 Glynn, S.M. 154, 223, 249 Gobet, F. 217, 253 Goodman, L. 169–70, 260 Gorfein, D. 250 Graybell, A.M. 198, 253 Greene, T.R. 213, 261 Greeno, J.G. 40, 42, 103, 164, 253 Gregg, L.W. 252, 254 Groen, G.J. 208, 258 Gugliotta, A. 207, 256 Gulgoz, S. 154, 223, 249 Gyselinck, V. 159, 261
Haack, J. 144, 255 Hallam, J. 251 Halpern, D.F. 146–7, 253 Hannemann, J. 144, 255 Hanson, C. 146, 253 Hasemer, T. 165, 253 Hastie, R. 26, 218, 251 Hatano, G. 212, 239, 244, 253 Hatuka, K. 253 Hawking, S.G. xii, 148–9, 253 Hayes, J.R. xiii, 34, 70, 72–3, 75, 87–9, 129, 133, 219–21, 229, 252, 254, 260 Heller, J.I. 137, 254 Herrmann, D.J. 120, 250, 255 Hiebert, J. 153, 165, 254, 260, 261 Hinsley, D.A. 129, 182, 254 Hinton, P. 230, 254 Hirschfield, L.A. 249, 260 Hirtle, S.C. 214, 257 Hmelo, C.E. xii, 101, 231, 258 Hoffman, R.R. 202, 250–2, 254, 258, 260–2 Hofstadter, D. 131, 254 Holland, J.H. 138, 184–6, 192, 230, 241, 254, 256 Holyoak, K.J. xiii, 57, 65–6, 106–9, 114–15, 123, 125, 127, 131, 133–4, 138–9, 145, 149–50, 162–4, 178–9, 181, 184, 228, 230, 248, 250, 253–4, 258 Hong, E. 207, 254, 261 Howes, A. 100, 258 Hughes, P. 54, 254 Hull, C.L. 84, 254 Humphrey, N. 132, 254 Hunt, E. 35, 254 Iding, M.K. 146, 254 Inagaki, K. 212, 239, 244, 253 Issing, L.J. 144, 255 Johnson, K.E. 216, 255 Johnson, P.E. 252 Johnson-Laird, P.N. 120, 178, 197, 255 Jones, R.M. 153, 261 Joram, E. 219, 249 Kahney, H. 32, 102, 137, 154, 155, 251, 255, 259 Kalyuga, S. 209, 255 Kaplan, C.A. 12–13, 61–3, 74, 255, 260 Karmiloff-Smith, A. 132, 255 Kennedy, R.A. 249 Kieras, D.E. xii, xiii, 96, 99, 154, 255
Kintsch, W. 33, 158, 159, 217, 243, 251, 255, 257, 261 Klahr, D. 82, 254, 255 Klein, G. 251 Klopfer, D. 256 Knowlton, B. 198, 260 Koedinger, K.R. 210, 255 Koestler, A. 65, 255 Koh, K. 115, 163, 254 Kolodner, J. 131, 255 Körkel, J. 204, 259 Kotovsky, K. 254 Krampe, R.T. 206, 232, 250, 251 Lachman, J.L. 61, 256 Lachman, R. 61, 256 Laird, J.E. 36, 65, 93, 184, 187–8, 256 Lamberts, K. 207, 256 Lane, D.H. 216, 251, 254 Larkin, J.H. 82, 102–3, 139, 168–70, 207, 256 LaVancher, C. 101, 250 Lave, J. 103, 256 LeFèvre, J. 155, 256 Lefèvre, P. 153, 165, 254 Lesgold, A.M. xiv, 208–9, 212–13, 256 Levin, J.R. 84, 168–9, 253, 255–6 Levine, M. 84, 256 Lewis, A.B. 167, 256 Lewis, M.W. 101, 250 Liker, J.K. 250 Loftus, E.F. 111, 250 Logan, G.D. 194, 256 Luchins, A.S. xi, 55, 57, 84, 96, 256 Luchins, E.H. xi, 55, 57, 96, 256 Luger, G.F. xii, 89, 256 Lung, C.-T. 68, 256 MacGregor, J.N. xii, 10, 68–9, 75, 239, 250, 257–8 Mandl, H. 248, 253, 255–6 Marcel, A.J. 260 Marcus, N. 209, 257 Marinè, C. 259 Marshall, S.P. 179, 182–3, 257 Mayer, R.E. 231, 257 Mayr, U. 232, 250 Mazur-Rimetz, S.A. 156, 249 McDaniel, M.A. xiii, 144–6, 251 McGarry, S.J. 52, 253 McKeithen, K.B. 214, 257 McKendree, J. xiv, 195–6, 257
McKeown, M.G. 154, 248 McNab, A. 8, 257 Meadows, S. 148, 257 Medin, D.L. 128, 178, 240, 257–9 Mellish, C. 251 Mervis, C.B. 216, 255 Metcalfe, J. 60–1, 257 Meyer, D.E. 112, 262 Miller, G.A. 12, 257 Mitchell, T.M. 256 Mithen, S. xvi, 132, 257 Moller, J.H. 252 Moore, J.L. 103, 253 Morris, C.D. 169, 245, 257 Müller, B. iv, 196–7, 239, 257 Murphy, G.L. 216, 257 Musen, G. 198, 260 Nathan, M.J. 33, 158–9, 243, 257 Newell, A. xvi, 26, 28–9, 32–7, 44, 46, 60, 65, 93, 184, 187, 194, 202, 220, 230, 256–7 Nigro, G. 121, 261 Nisbett, R.E. 138, 184, 230, 254 Norman, G.R. 190, 208, 216, 257, 259 Novick, L.R. xii, 101, 145, 159, 162, 164, 228, 231, 257–8 Ohlsson, S. 59–60, 65–8, 75, 92–3, 112, 165, 258 Ormerod, T.C. 10, 68, 239, 250, 257–8 Ortony, A. 128, 247–8, 252, 254–5, 257–9, 261 Oshlag, R. 231, 258 Palincsar, A.S. 198, 249 Papert, S. 82, 258 Parkins, E. 167–9, 248 Partridge, D. 214, 258 Patel, V.L. 208, 258 Paull, G. 217, 258 Payne, S.J. 100, 190, 258 Pellegrino, J.W. 121, 258 Penner, B.C. 213, 261 Pennington, N. 96, 102, 231, 258 Perkins, D.N. 178, 258 Petrie, H.G. 231, 258 Pirolli, P. 155–6, 194, 258 Pisoni, D.B. 253 Porter, L.W. 253 Polson, P.G. 217, 251 Posner, M.I. 70, 258, 260–1 Post, T.A. 213, 261
Postman, L. 83, 258 Potts, G.R. 253 Quillian, M.R. 111, 185, 251 Raichle, M.E. 199, 258 Ramoni, M.F. 208, 258 Rattermann, M.J. 115, 252 Raufaste, E. xiii, 208–9, 225, 259 Reason, J. 211, 259 Reder, L.M. 103, 247 Reed, S.K. 35, 40, 88–9, 133, 155, 160, 163–4, 167, 259 Rees, E. 92, 139, 165, 209, 213, 250, 258 Reeves, L.M. 178, 259 Rehder, B. 96, 102, 231, 258 Reif, F. 170, 207, 252, 256 Reigeluth, C.M. 80, 251 Reimann, P. 101, 250, 259 Reitman, J.S. 214, 257 Resnick, L.B. 167, 259 Reyna, V.F. 210, 249 Riefer, D. 146, 253 Rist, R.S. 210, 259 Robertson, S.I. xi, xvi, 102, 154, 163–4, 178, 228, 259 Rosenbloom, P.S. 36, 65, 184, 187–8, 194, 256–7 Rosenzweig, M.R. 253 Ross, B.H. xiii, 115, 129, 138, 140–1, 155–7, 165, 178–9, 181, 217, 240, 248, 257, 259 Rozin, P. 132, 259 Rubinson, H. 256 Rueter, H.H. 214, 257 Salomon, G. 178, 258 Sauers, R. 153, 247 Scardamalia, M. 219, 222, 226, 242, 248, 249 Scharf, P.B. 207, 251 Scheerer, M. 68, 259 Schliemann, A.D. 103, 249 Schmidt, H.G. 208, 259 Schneider, M.L. 252 Schneider, W. 204, 252, 259 Schön, D.A. 228–9, 259 Schult, T.J. 157, 259 Schumacher, R.M. 125, 207, 240, 243, 253, 260 Schunn, C.D. 102, 150, 205, 231, 260 Schwartz, B.J. 216, 251 Schwartz, M. 83, 258 Segal, J.W. 249 Shallice, T. 198, 260
Shiffrin, R.M. 26, 248 Shortliffe, E.H. 252 Silver, E.A. 93, 165, 260 Simon, H.A. 12–13, 26–8, 32–7, 44–6, 60–3, 72–5, 87–9, 103, 112, 129, 133, 168–70, 187, 201–2, 214, 216–18, 220, 229–30, 247, 250, 252–7, 260 Singley, M.K. xiii, 84, 91, 94–6, 103, 195, 231, 247, 260 Smith, D.R. 103, 253 Smith, E.E. 169, 170, 255, 260 Smith, J. 204, 252, 258 Smith-Kerker, P.L. 156, 249 Soloway, E. 210, 252 Solso, R.L. 40, 260 Spelke, E. 132, 249 Spence, I.T. 248 Spence, K.W. 248 Spence, M.T. 211, 260 Spencer, R.M. 178, 260 Sperber, D. 132, 260 Spiro, R.J. 211–12, 252 Sprague, J.M. 259 Squibb, H.R. 100, 258 Squire, L.R. 198, 260 Steier, D. 256 Stein, B.S. 147, 151, 249 Stein, N.L. 248 Sternberg, R.J. 40, 121, 204–6, 211, 218, 249, 251–5, 260–2 Stevens, A.L. 252 Steinberg, E.R. 252, 254 Stork, D.G. 202, 261 Swanson, D.B. 252 Swanson, H.L. 204, 261 Sweller, J. 84–6, 96, 127, 154, 209, 251, 255, 257, 261–2 Tardieu, H. 159, 261 Tesch-Römer, C. 251 Thagard, P.R. 106, 114, 131, 138, 149, 184, 230, 254 Thomas, J.C. 40, 42–3, 252, 261 Thompson, R. 123, 247 Thorndike, E.L. 82–3, 91, 94, 96, 261 Thronesberry, C. 61, 256 Thune, L.E. 83, 261 Todd, P.M. 15, 36, 129, 253, 261 Toupin, C. 115, 122, 126, 231, 253 Trabasso, T. 248 Treyens, J.C. 28, 249 Tversky, A. 120, 261
Underwood, G. 214, 262 Van Dijk, T.A. 158, 159, 255, 261 Van Dusen, L. 154, 223, 249 VanLehn, K. 36, 65, 84, 94, 153, 156, 165, 179, 230, 249, 261 Vosniadou, S. 122, 127, 247–8, 252, 254–5, 257, 259, 261 Voss, J.F. 213, 261 Vye, N.J. 147, 249 Wang, Y. 256 Ward, M. 209, 262 Waters, A.J. 214, 262 Weinert, F.E. 204, 259 Weisberg, R.W. 60, 66, 68, 178, 259, 260, 262 Wenger, E. 103, 256 Wertheimer, M. 54, 101, 262 Wheatley, G.H. 5, 262 White, B.Y. 210, 262 Wiebe, D. 60–1, 257 Wilkes, A. 249 Winston, P.H. 153, 248, 262 Woodworth, R.S. 82, 261 Wright, J.C. 216, 257 Yaniv, I. 112, 262 Yanowitz, K.L. 178, 250 Yost, G. 194, 247 Young, E. 33, 158, 257 Young, R.M. 165, 192, 262 Zajonc, R.B. 233, 262 Zeitz, C.M. 210, 225, 242, 262 Zsambok, C.E. 251
Subject index
abstraction mapping 139 ACT-R 47, 96, 131, 165, 184, 187–8, 190–3, 195–9 algorithms 38–9, 120, 182, 187, 239, 243 analogical mapping 108, 114–15, 117, 126, 130, 135, 137, 162–3, 239 analogical problem solving xvi, 15–16, 33, 52, 79–80, 82, 101, 112, 115, 127, 131, 133, 136, 150, 152, 169, 192, 239–40, 243, 248–250, 253–4, 259–60 analogy xvi, 15, 16, 18, 31, 34, 52, 62, 74, 79–80, 88–9, 100, 105–6, 108–9, 115–28, 130–3, 137–42, 144–50, 152, 156, 166–7, 170–1, 192–3, 200, 223, 228, 231, 243–5, 247–9, 251–5, 258–61; metaphor 11, 34, 38, 61, 79, 105, 118, 120–1, 127, 131, 146, 148, 152, 194, 228, 236, 248, 257, 261 arguments (of propositional representation) 116, 122, 129, 137, 193, 239, 242–3 artificial intelligence xv, 14, 17, 131, 183, 202, 247, 249, 251–4, 256, 260, 261 automaticity 177, 198–200, 203, 211–12, 225, 232, 239, 256
base problem 122, 123, 244; see also source problem bridging analogy 166 candle holder problem 58, 63, 64, 230 chunk 190–1, 216, 225, 239, 246 cognitive architecture 14, 17, 183–4, 239–40, 256; see also ACT-R, connectionism cognitive load 209, 222 cognitive overload 219, 221 conceptual chunking 216 conceptual core 64, 239 conceptual fluency 93 conceptual integration hypothesis 196, 239 conceptual priming 102, 104 conceptual transfer see transfer
condition-simplifying generalisation 187, 240 connectionism 14, 199, 207, 214, 240, 242 constraint relaxation 68 constraints 38, 62, 68, 72, 74, 100, 115, 125, 130, 135, 219, 221, 226–8, 240–2, 246, 252 context: context-dependent information 64, 240; context-independent information 63–4, 239, 240; (effects on problem solving) xv, xvi, 6, 10, 14–15, 18, 28, 33–4, 43, 59, 64–6, 84, 91–2, 94, 103–4, 125, 127, 132, 137, 142, 149, 150, 153, 158, 169, 175, 177–8, 188–9, 196–8, 200, 205, 208, 211, 230–3, 240, 243–6, 252, 258–9, 261 control problem 157 cover story 25–6, 70, 87, 134, 240 criterion failure 69 declarative knowledge see knowledge declarative memory 184, 188, 190–1, 240; see also knowledge, ACT-R deduction xv, 36, 61, 80, 117, 240 developing expertise 205–6, 261 diachronic rules 240 difference reduction see heuristics discourse knowledge 222 domain (of knowledge) 7–9, 15–17, 22, 33, 82, 85, 100, 102–4, 120, 122–3, 127, 131–4, 140, 143, 145–6, 149, 151–4, 156, 159, 164, 166, 170–1, 177, 181, 189, 201–5, 207–10, 214, 216–19, 221–5, 228, 231–2, 240–2, 244, 250; ‘distance’ between domains 145–6; domain-expertise 205, 210, 219, 223; domain-general knowledge 7, 17, 82, 90, 100, 103–4, 202, 204, 240–1; domain-specific knowledge 7, 15, 17, 37, 39, 82, 90, 103–4, 202, 204, 217, 240, 244, 261;
effects of domain familiarity 7, 123, 143–6, 152, 159, 164, 168–70, 179, 201, 203, 228 effort after meaning 146 Einstellung 55, 58, 84, 86, 96, 211, 231, 240, 256, 261 elaboration 102, 182 elaboration knowledge 182 elaborative inferences see inferences EUREKA 207 example-analogy 138, 140, 179, 240 execution knowledge 183 expert systems 197, 202, 251, 258 expertise xvi, 10, 15, 17–18, 25, 100, 129, 159, 173, 175, 177, 182, 187, 195–7, 200–14, 216–19, 222–6, 230–3, 239, 240, 242–4, 247, 249–53, 255–62; adaptive expertise 212, 225, 239, 244; computer models 207; expert-novice differences 17, 139, 159, 175, 177, 201–4, 206–8, 211–14, 216, 218–19, 221–6, 230, 242, 247, 249–51, 257; routine expertise 212, 225, 244; stage models of expertise 206–7, 240 fast and frugal thinking see heuristics feeling-of-knowing 61 FERMI 256 fortress problem 107, 109, 115, 133, 135, 157 functional fixedness 55, 58, 63–4, 74–5, 84, 211, 231, 241 functional magnetic resonance imaging 199 Gauss 53–4 generalisation problem 157 Generic Error-Modelling System 211 genre 221, 240–1 Gestalt psychology xvi, 11, 15, 52–3, 59, 74, 84 goal stack 46–8, 66, 241 graphic organisers 168 Hayes-Flower model 221, 226 heuristics 34, 38–9, 44, 46–7, 51, 60, 66, 68–9, 74–5, 127, 129, 130, 202–4, 229, 240–42; ‘fast and frugal’ 36; difference reduction 38, 43, 69, 240; hill climbing 38; maximization heuristic 69; means-ends analysis 44–5; trial and error 37–8; working backwards 45, 246 hill climbing see heuristics
Himalayan tea ceremony problem 25 Hobbits and Orcs problem 40, 42–3, 253 homomorphism 185–6, 241 hypothesis testing theory 84, 86 identical elements theory 82–3, 91, 94, 96, 100, 102, 241, 247 identical productions theory 91, 96 identification knowledge 182 identities 108, 130 IGOR 9, 29 impasse 65–6, 68, 74–5, 192, 241 improvement 83 incubation 112, 241, 262 indeterminate correspondences 108 induction 17, 86, 87; schema induction 139, 145, 150, 177–9, 181, 200, 232; see also inferences inference rules 164 inferences 12, 15, 22, 31, 33, 36, 50, 116–17, 123, 130, 137, 142, 144–6, 150, 156, 158–9, 164, 168, 170, 183, 230, 240, 251, 258; derived from text 22, 36, 158, 171; elaborative 158–9, 164, 240, 245; inductive inferences (transfer) 189; perceptual inferences 168, 170 information processing xv, xvi, 12, 14, 25, 28, 35, 37, 46, 63, 74, 188, 202, 233, 241 informational encapsulation 132, 241 insight 11–12, 15, 17, 23, 33, 49, 53, 55–6, 59–61, 63–6, 68–70, 74–5, 112, 228–9, 241–2, 250, 255, 257–8, 262; full 68, 241–2; partial 68, 241–2 instance-based generalisation 187, 241 instructions 11, 34, 72–3, 75, 88, 91–2, 96, 137, 156, 170, 192, 207, 223, 229, 254–7, 260 intermediate effect 208–9, 225, 241 interpretation problem 157 jealous husbands problem 88–9 kind world hypothesis 129 knowledge: abstract 93, 210, 239; declarative 91–3, 96, 102, 153, 164–5, 177, 182, 192, 198, 239, 240–1; procedural 91–3, 96, 165, 177, 183, 188, 192, 198, 240, 243, 254, 260–1;
see also memory knowledge compilation 192, 211 knowledge state see state knowledge-telling strategy 222, 242 knowledge-transforming strategy 222, 242 learned helplessness 151 learning to learn 82, 90, 258 lexicon 158, 171, 242 Lisp 7, 8, 10, 82, 153, 193, 253 locally rational 69 Logo 82 mad bird problem 70, 72 mapping see analogical mapping maximization heuristic see heuristics means-ends analysis see heuristics memory: episodic memory 94, 188, 240; external memory 34; long term memory 12, 16, 33–4, 50, 61, 64, 66–7, 136, 150, 188, 202, 208, 218, 221, 224, 225, 230, 241–2, 244–5; long-term working memory 217; short term memory 12, 13, 46, 217–18, 244, 246; working memory 12, 51, 66, 99, 184, 190–1, 203, 209, 217, 225, 230, 232, 241, 245–6, 248–9, 251 mental escape 151 metacognition 60, 242, 262 metaphor see analogy Minsky, M. 261 missionaries and cannibals problem 40, 88–90 Moderately Abstract Conceptual Representation 210, 242 monster problems 73 mutilated chequer board problem 51–2, 62 nine dots problem 68, 235 nodes 32–4, 111, 160, 240, 242, 245 novice 5, 8, 16, 80, 124, 128–9, 140–1, 145, 150, 152–3, 156–9, 164, 175, 177, 200–3, 206–9, 211–12, 214, 216, 219, 221–2, 224–6, 230, 242, 255, 260; see also expertise operator (mental operator) 5, 9, 10, 27, 29, 31, 33, 35–6, 44–5, 50, 61, 66–9, 73, 75, 100, 108, 130, 157, 163–5, 170–1, 179, 186, 193, 227–8, 230, 237, 240–6 operator restrictions 9, 242 perceptual chunking 214, 216, 223
PI 165, 187, 192, 241 planning (writing processes) 220 planning knowledge 182 positron emission tomography 199 power law of learning 194, 242 predicate calculus 122, 239, 242 predicates 116, 124, 127, 239, 242 predictor rules 185 principle-cueing 138, 139–41, 179, 243 problem: definition 4, 5; problem representation 15, 18, 21–3, 25–8, 32, 35–7, 40, 46, 49–53, 61–4, 66–7, 69–70, 72–5, 100–11, 120, 123, 127, 129, 131, 135, 141–4, 149–50, 158–9, 164, 166–71, 179, 182, 185–6, 193, 196–7, 204, 207–8, 210–13, 224–5, 229–32, 240–6, 248, 250, 252, 254–5, 258, 260–2; problem structure 16, 53–6, 60, 74, 139, 171, 182, 231, 243–5; problem types: abstractly-defined 10, 68–9, 74–5, 239; complex xvi, 17, 120, 218, 220, 226; ill-defined 8–10, 15, 17, 74, 107, 201, 218–19, 241–2; knowledge-lean 8, 17, 25, 230, 242; knowledge-rich 8, 9, 17, 242; multistep 61; semantically lean 10, 244; semantically poor 10; well-defined 9–10, 15, 17, 25, 29, 31, 74, 87, 133, 239, 246; word problems 4–5, 16, 75, 80, 106, 108, 133, 137, 151–2, 154–5, 158–60, 162, 170–1, 182, 228, 243, 248, 254, 256, 259; see also insight, structural features, surface features problem givens 37, 66, 196, 239, 241 problem model 158–9, 164, 171, 182, 186, 243 problem similarity 16, 123, 126, 149, 253; homomorphs 88, 241; isomorphs 25, 52, 72–3, 87–9, 242, 260; relational similarity 115, 118; semantic similarity 111, 114–15, 126, 246; structural similarity 16, 81, 122, 137, 245 (see also structural features, problem structure); surface similarity 16, 82, 109, 137, 169, 241, 245 (see also surface features) problem solving methods 37, 246 problem space 32–7, 40, 43, 46–7, 60–3, 69, 74, 222, 226, 228, 230, 242–5 problem state:
see ‘state’ problem statement 6, 15, 22, 27, 33, 36–7, 67, 108, 135, 137–8, 153, 193, 227, 230, 240–1, 243 problem variants 62, 69, 72–3, 90, 109, 154, 163, 164, 171, 243, 258; close variants 109; distant variants 109, 155, 171 procedural fluency 93 procedural knowledge see knowledge procedural memory see memory procedural transfer see transfer production rules 90–1, 94–6, 100, 104, 184, 188, 192, 195–7, 199–200, 231, 243, 257 production systems 14, 183, 200, 243 productive thinking 54, 101, 243–4 Prolog 82, 151, 251 proportional analogies 116–17, 121, 137, 243 propositional representation 114, 158–9, 243–5 protocols 12–13, 17, 27, 59, 73, 102, 202, 204, 214, 220, 246 q-morphism 186, 187 radiation problem 65–6, 109, 115, 133, 135, 157 rationality 36, 187, 243 recursion 45–6, 79, 139, 151, 153, 156, 165, 182, 243, 258 re-encoding 74 relational elements 121, 244 representation: diagrammatic 45, 168, 240; intermediate 166–7, 170; sentential 168, 244; unstable representations 64, 246; see also problem representation representational transfer see transfer reproductive thinking 54, 58, 244–5 restrictions see operator restrictions retrieval asymmetry 195 reviewing (writing processes) 220, 244 rhetorical problem 221–3, 226, 244 river current problem 70 salience xvi, 15, 33, 37, 64–5, 66, 70, 72, 75, 85, 122, 127, 129, 157, 171, 196, 229–30 scaffolding 166, 170, 206 schema induction see induction schemas 100–1, 108, 134, 138–41, 145–46, 150, 154, 166, 169–71, 177–9, 181–3, 190, 200, 207, 209–13, 222, 225, 230, 232, 240, 244, 248–9, 251;
‘implicit’ solution schemas 138, 140, 179; definition 100; explanatory schemas 169–70, 260; representational schemas 101, 104; schematic chunks in WM 191; see also induction search 27, 31, 33–6, 38, 40, 44, 46–7, 60–4, 74, 118, 137, 168, 170, 188, 202, 212, 214, 230–1, 241, 244–5, 255 search graph 31, 244, 245 search space 33, 36, 38, 62 self-explanations 101, 198, 250 semantic 10, 22, 66, 75, 93, 100–11, 114–15, 118, 120–1, 125, 126, 130, 158, 230, 240, 244, 246, 250–1, 255 semantic network 66, 111, 230, 244, 255 sequence effect 84, 86, 261 set effects 55, 240–1, 244 Sierra 165 situated learning 103, 198, 244, 250 situation model 158–9, 164, 171, 243–5 SOAR 165, 184, 197, 199, 256 solution 5, 6, 8, 10, 11, 16–17, 23, 26, 33–4, 36–8, 40, 42, 46–7, 51, 53, 57–8, 60–70, 73–4, 79, 82–4, 90, 101–2, 107–10, 112, 115, 120, 126–7, 129, 133–9, 150, 153–4, 156–7, 160, 163–4, 166, 168–9, 177, 179, 181, 182–3, 186–7, 198, 207–8, 217, 221, 229–31, 235, 241–5, 257, 262; creative solution 69, 75, 229; solution criterion 66; solution plan 63–4, 186; solution principle 10, 169; solution procedure xi, 6, 16, 23, 57, 62, 70, 83, 101, 109, 136, 138–9, 154, 156–7, 160, 166, 177, 179, 181, 183, 207, 217, 230–1, 243–4; solution schema (see also schemas) 179, 182; solution structure 16, 108, 110, 127, 134–5, 242, 245; two meanings 6 solution development 57, 244 source domain: see domain 143, 168 source model 186 source problem 102, 106, 108–9, 117, 123–4, 126–30, 136–9, 143, 146, 149–50, 155, 163–4, 168–9, 171, 179, 181, 230–1, 240, 243–5, 250, 253 specialisation 187, 232 spreading activation 66, 75, 111, 185, 191, 222, 244, 250 state (problem state) 4–10, 15, 17, 22–3, 25–7, 29, 31–8, 40, 42–8, 58, 61, 66–7, 69, 73, 93, 96, 99, 108, 116, 125, 127, 135, 137–8, 153, 162–3, 184–6, 189, 191, 193, 196, 217–19, 227–8, 230, 232, 239–46, 248, 259;
goal state 5–6, 8–10, 26, 29, 32–3, 38, 43–5, 69, 186, 193, 218–19, 241, 245–6; initial state 5, 8, 9, 25–7, 29, 31–2, 34–6, 38, 42, 44–5, 218, 241, 243; intermediate problem state 27; knowledge state 27, 31, 35, 242 state space 31–2, 35–6, 43, 72, 245, 258; see also problem representation state-action diagram 31, 242, 245 structure-mapping theory 122 structure-preserving differences 108, 130 structure-violating differences 108, 130 sub-goaling 45, 245 super experts 208–9, 219, 225 structural features (of problem) 11, 54, 72, 80–1, 87, 101, 109, 127, 129–30, 134, 137, 142, 146, 150, 153, 200, 207, 230–1, 239, 245 surface features (of problem) 11, 80–1, 87, 102, 108–9, 122–4, 127–31, 134–5, 138, 144, 146, 157, 159, 181, 208, 214, 217, 230–1, 245 symbol 28–9, 216, 241, 245 symbol systems 28 symbol tokens 28, 245 symbol structures 28–9, 35, 241, 245 synchronic rules 245 systematicity principle 122, 125–7, 245 target problem 83, 106, 108–10, 112, 114–15, 117, 121–5, 127–30, 135–9, 143–5, 147, 150, 156, 163–4, 168–9, 171, 179, 181, 186, 230–1, 240, 243–5 task environment 28, 32, 36–7, 220, 230, 242, 245 task-expertise 205; see also expertise terminally stuck 65–6 textbase 158–9, 164, 171, 182, 243–5 textbook problem solving xvi, 3, 5, 8, 16, 23, 80, 83, 106, 131, 133, 151–60, 166, 171, 192, 209, 219, 221, 231 tip-of-the-tongue 60 Tower of Hanoi problem 25, 29, 31–4, 38, 45–7, 61, 72, 87–9, 108, 188, 227 transfer (of learning) xv, 14–15, 25, 81–3, 85–6, 88–9, 91, 94, 96, 125, 132–3, 144, 150, 152, 159–60, 162–3, 168, 175, 178, 197, 207, 211, 217, 230–1, 241, 244, 247–6, 258–61; analogical transfer 15, 82, 127, 169, 239, 250, 253–4, 259–60 (see also analogical problem solving); asymmetric 88–9; conceptual transfer 100, 231, 239;
general transfer 104, 205, 241; implicit transfer 104; negative transfer 83–4, 86, 94–6, 103–4, 169, 231, 242; positive transfer 83–4, 86, 94–6, 104, 169, 231, 242, 250; procedural transfer 91, 94, 96, 102, 192, 195–6, 241, 247, 255, 260; representational transfer 101; specific transfer 90, 100, 104, 169, 231, 244, 248; transferable skills 18, 91, 103, 205 transfer problem 87 transfer-appropriate processing 169, 245, 256 transition functions 186 translating (writing processes) 220, 230, 245 transparency 115, 126, 241, 242 trial and error see heuristics two-string problem 64, 67 UNDERSTAND 73 understanding 16, 22, 33, 36, 46–7, 54, 61, 70, 74, 101–2, 116, 118, 120, 125, 141–2, 144–6, 151–3, 158–9, 163–6, 169–71, 198, 201, 205, 207, 227, 230, 240, 243, 244, 246, 250, 253, 255, 257–8, 260; definition 27, 36; primary 153, 243; reflective 153, 243 unified theories of cognition 184, 246 unstable representations see representation use-specificity 195 verbal protocols see protocols water jars problem 57, 85 working backwards see heuristics working forwards 219 working memory elements 190, 246 world knowledge 135