Controlling Uncertainty
To ,
and Yousef
Controlling Uncertainty Decision Making and Learning in Complex Worlds
Magda Osman
A John Wiley & Sons, Ltd., Publication
This edition first published 2010
© 2010 Magda Osman

Wiley-Blackwell is an imprint of John Wiley & Sons, formed by the merger of Wiley's global Scientific, Technical, and Medical business with Blackwell Publishing.

Registered Office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Offices
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
9600 Garsington Road, Oxford, OX4 2DQ, UK
350 Main Street, Malden, MA 02148-5020, USA

For details of our global editorial offices, for customer services, and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.

The right of Magda Osman to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data
Osman, Magda.
Controlling uncertainty : decision making and learning in complex worlds / Magda Osman.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-4051-9946-9 (hardcover : alk. paper)
1. Uncertainty. 2. Decision making. 3. Cognitive psychology. 4. Control (Psychology) I. Title.
BF463.U5O86 2010
003′.5–dc22
2010011928

A catalogue record for this book is available from the British Library.

Set in 10.5 on 13.5 pt Palatino by Toppan Best-set Premedia Limited
Printed in Malaysia

1 2010
Contents
Preface: the master puppeteer
Acknowledgements
1 Introduction
2 Causation and agency
3 Control systems engineering
4 Cybernetics, artificial intelligence and machine learning
5 Human factors (HCI, ergonomics and cognitive engineering)
6 Social psychology, organizational psychology and management
7 Cognitive psychology
8 Neuroscience
9 Synthesis
10 Epilogue: the master puppeteer
References
Index
Preface: the master puppeteer
The ringmaster of a travelling circus told a promising young puppeteer of a mysterious string puppet that would present a challenge to anyone who tried to command it. The puppeteer was intrigued. ‘Though it could be mastered, it would take all the skill and determination of even the most masterful puppeteer.’ Then, in answer to the young puppeteer’s question, the ringmaster replied, ‘The reason why it is so difficult to work the puppet is that inside the puppet is a mechanism which follows its own rules and logic.’ This was enough to inspire the puppeteer. At last he had found his ambition. He would seek out and learn all there is to know of the puppet and how to rule it. But, before leaving, as a warning to the young apprentice, the ringmaster’s final words on the matter of the string puppet were these: Be very careful. You can’t always be sure if the puppet acts the way you want it to because you’ve pulled the right strings or because of what is inside it. Sometimes you’d be forgiven for thinking that it acts as if it has a mind of its own.
When eventually the young puppeteer found the mysterious string puppet, what he saw was a large glass box supported by an elaborately carved wooden frame. The puppeteer paced around inspecting the box. On one of the sides of the box, there was a panel with many levers of different shapes and sizes, and beside that was a chair. The puppet inside the box looked fairly ordinary suspended in space, its limbs heavy, holding the posture that all string puppets have before they’re brought to life. Though the puppeteer was told that this was a string puppet, having peered hard into the glass, he just couldn’t find the strings. The strings must be either very fine or transparent to me, the puppeteer thought to himself. The young puppeteer sat at the chair. He tried out various levers in a haphazard way at first, but not much seemed to happen. It was hard to tell which lever moved the head, and which the limbs. There were more moving parts to the puppet than there were levers. This brought moments of doubt into the puppeteer, making him wonder if he could ever hope to find the right means to master the puppet. However, when there was a purposeful order to operating the levers, the puppeteer could make the puppet’s arm jolt, and even bring about the tapping of a foot. Once in a while the words of the ringmaster surfaced in the mind of the puppeteer. As the ringmaster had warned, the puppet moved on its own, sometimes quite dramatically, but other times it would just be an almost imperceptible twitch. The puppeteer tried to ignore this, hoping that it would be enough to keep pulling and pushing at the same familiar levers. Though this did seem to work, it was only an illusory sense of command. The puppeteer knew that his rehearsed actions couldn’t alleviate the scepticism that the puppet’s behaviour brought about. ‘I wonder whether the mechanism in the puppet truly is erratic, or is it simply that I haven’t yet learnt all there is to know about it?’ Perseverance and will helped to keep the puppeteer’s resolve to struggle through the doubting times. The belief that he was the one who could ultimately rule the puppet and not the other way around gave him hope. The puppeteer became wise to the fact that the mechanism that made the puppet move of its own accord wouldn’t
always keep in time with the music that was played. On some occasions it seemed like the mechanism would go faster, and on other occasions it would seem to go much slower. There were even times when the limbs would move and dance in sequence, but not when anticipated. It did take a long time, but the young puppeteer became skilled in conducting the mysterious string puppet. The ringmaster had followed the progress of the puppeteer, and found an opportunity to watch a performance. The tales were true: the puppet danced elegantly before the audience. The ringmaster hoped to know what valuable lessons the puppeteer had gained in his time of apprenticeship. ‘All I know is this,’ said the puppeteer. ‘To make the puppet dance the way I wanted, I had to know how supple the limbs and head were. The suppler it seemed to be, the more elegantly I could make it dance. I had to know when the mechanism in the puppet was running fast or slow, and when it failed to behave as I’d expected from what I did. This helped me to decide the times when it simply seemed that it was indeed behaving of its own accord, from times that it was behaving as I wanted. For all this, I needed to keep in mind two very important details: that I was the one who could choose what levers to operate, and I was the one who could choose when to operate them. This is how I gathered what the puppet might do from one moment to the next, and this was what helped me to understand what more I needed to know to make it dance to different tunes.’ It was coming to know these things that helped to make the apprentice into the master puppeteer.
Acknowledgements
I begin my thanks to the Department of Cognitive, Perceptual and Brain Sciences, University College London, where I spent the long summer months of 2009 working on this book. I would also like to acknowledge the support of the Centre for Economic Learning and Social Evolution (ELSE), University College London; the Psychology Department at Queen Mary University of London; the Centre for Vision, Speech and Signal Processing at the University of Surrey; and the Engineering and Physical Sciences Research Council (EPSRC) grant EP/F069421/1. There are many people to thank for their various contributions in helping me prepare for this book, as well as just helping me prepare myself in the first place. First and foremost, the individual whom I would like to give thanks to is David Shanks. He has been a great mentor to me. His patient and methodical manner as well as his incisive style of thinking have been a great inspiration. Moreover, his continued support and endorsement have meant that I have been able to take advantage of one of the best intellectual environments in which to write this book. On this point, my thanks also go to Gabriella Vigliocco, Shelley Channon and John Draper, who have made the practicalities of working at UCL possible.
Of course, the intellectual environment I have been lucky to find myself in includes my friends. Dave Lagnado, Mark Johansen, Maarten Speekenbrink and Ramsey Rafaat have been most pivotal in introducing me to ideas that have been instrumental to the development of this book, as well as making suggestions that have helped structure the way the work has been presented. Their expertise and ideas have been invaluable. There are also many other lively discussions that I’ve had with a number of other good friends and colleagues that have helped shape the ideas that are meted out in this book. While you may not have realized, I hope this acknowledgement will now make this known. So, thanks to Ellen Seiss, Adam McNamara, Lorraine Attenbury, John Groeger, Shuichiro Taya, David Windridge, Bill Christmas and Josef Kittler. Also, thanks to Erica Yu, Andrea Smyth, Leonora Wilkinson, Marjan Jahanshahi, Nick Chater, Geoff Bird, Anna Cox and Paul Cairns. My thanks and appreciation also extend to those further afield: Axel Cleeremans, Stéphane Doyens, Wim De Neys, Bjoern Meder, Momme von Sydow, York Hagmayer, Franziska Bocklisch, Michael Waldmann, Ruth Stavy, Reuven Babai, Bob Hausmann and Brandon Weeks. I would also like to thank those friends who have played various other important roles. They’ve encouraged me throughout this process, or helped to motivate me to embark on such an adventure in the first place, and others were there at the right place and right time to distract me when I needed it. So, thanks to Charlie Mounter, Belen Lopez, Kulwinder Gill, Christiane Waleczko-Fernandes, Susan Brocklesby, Madeline McSorley, Piotr Ptasinski, Mandy Heath, Sara Spence, Robert Coleman, Rikki Wong, Simon Li, Darrell Collins and Tracy Ray, and to Billy Nelson who has also helped coach me through some hard times in preparing this. Also, to a special friend who is no longer with us, Rüdiger Flach. Along with my family members including little Layla, I reserve final thanks to two people who have gone through most of my trials and tribulations. They endured much of me, and this has not gone unnoticed. While I may have selfishly used their goodwill, and benefited from their exceptionally well-chosen, tailor-made
diversions (you know me far too well), I suspect that during this time I was less than entertaining in return. I hope that you can see this book as a return of favour. Thanks Gill Ward and Christopher Berry.
Chapter 1
Introduction
There are two key characters in the story of control, namely, the master puppeteer and the puppet. Assume now that the master puppeteer is us, and our goal is to control the behaviour of the puppet. The puppet can represent any system that requires our control. For instance, the puppet could represent a system that is biological, like keeping our physical fitness levels up so that we can run a marathon. It could be economic, like a stock market in which we are maximizing our profit by buying and selling shares. It could be organizational, like marshalling a troop of soldiers to protect a safety zone. It could even be ecological, for instance trying to sustain an endangered ecosystem like a coral reef. More typically, when we think of control systems, what comes to mind is something industrial like operating a nuclear power plant, or mechanical like driving a car, or safety critical like flying a plane. Clearly, then, there are many examples of control systems, some of which we are likely to experience on a regular basis in our daily lives. Control systems, therefore, are a rather broad subject and, as
will be made clear in this book, can encompass almost everything. So how are we able to do it; how are we able to exert control over these various types of systems? For those who have thought of this question, this book will be a guide to some answers.
Puppets, Puppets Everywhere …
The reason that control systems can be seen everywhere is that almost anything can be thought of as a system, and most things we do in our lives are driven by our need to control. But, while the systems may be incredibly pervasive, they do adhere to some basic characteristics. Control systems are complicated. This is because they have a number of elements that will vary all at once and from one point in time to another – like puppets. Puppets can have multiple components; their moving parts (e.g., arms, legs, feet, hands, fingers and head) are more often than not interconnected (e.g., fingers to hand, hand to arm etc.), and can vary all at once (e.g., performing a jump) as well as singularly in one point in time (e.g., raising a hand to wave). Thus, given these capacious characteristics of control systems, they are quite literally everywhere.
… It Takes All the Running You Can Do, to Keep in the Same Place1
In these systems we are often required to manage the events that occur in a way that leads to something predictable, and desirable. This can be incredibly difficult to achieve and takes many years of training (e.g., becoming a pilot of a passenger jet), because the system varies of its own accord, as well as because acting upon it makes it change in some way. Or, more often than not, it is a combination of us acting on it and it doing something itself that produces changes in events. To bring the analogy of the puppet to
1 Lewis Carroll, Through the looking glass (1871).
bear more obviously with control systems, in the story the puppet had its own internal mechanism that also made it move. Imagine how hard it is to control a malleable puppet and make it dance in time to a tune without an internal mechanism that can make it move on its own. Now imagine how much harder it is when it can move on its own and not always predictably. As hard as it seems, we are capable of achieving this. So we return to the question again: how are we able to exert control over such a complicated situation?
Vladimir: ‘Say Something!’ Estragon: ‘I’m Trying. … In the Meantime Nothing Happens’. Pozzo: ‘You Find It Tedious?’ Estragon: ‘Somewhat’.2
To answer the question ‘How?’, we need to find a better way of asking it. First of all, finding some way of describing how these different types of systems work is of great importance, particularly if they can, on a basic level, be thought of in a similar way. Second, to complement this, our ability to control what happens in these systems should reduce to some basic psychological learning and decision-making mechanisms. They should do this because we need psychological mechanisms in place that enable us to predict the behaviour of the system and coordinate our own behaviours to effect a specific change in it. Therefore, finding some way of describing our psychological processes, along with describing the control system itself, is crucial to having an understanding of control (i.e., the scientific pursuit) and being able to improve our ability to manipulate our environment (i.e., the applied pursuit). Given the extensiveness of both objectives, typically at the start of books like this there is a tendency to spell out at the beginning what things will not be included and what can’t be achieved. I am going to avoid this. The aim of this book is to be as inclusive as possible. If you’ve flicked through it already, you will have
2 Samuel Beckett, Waiting for Godot (1954/2009).
noticed that there are chapters spanning subject areas that include philosophy, engineering, cybernetics, human factors, social psychology, cognitive psychology and neuroscience. In order to get to the answer of ‘How?’, we need to consider the various contributions that each of these subjects has made. The issue of control invites attention from many disciplines that don’t always speak to each other. Putting them side by side in chapters in a book is also a way of showing how they in fact do relate. Moreover, they also provide the groundwork for my answer to the question of ‘How?’ which is presented at the end of this book. There are two important ideas that will help to carry you along this book: (1) all the themes introduced in this book are reducible to five basic concepts: control, prediction, cause–effect associations, uncertainty and agency; and (2) all of the issues that these basic concepts raise are ultimately, and will in this book be, directed towards addressing one question, which for the purposes of this book is THE question: how do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both? To understand the issue of control psychologically, and to understand the control system itself in all its various guises, we have to become familiar with these five core concepts and how they are tackled through the eyes of each of the aforementioned subjects. However, I am not alone; this endeavour has been embarked on by many,3 and throughout the different chapters of the book it will become apparent that there are various ways of understanding the psychological and objective characteristics of control systems. Therefore, I will take the opportunity here to qualify why this book is not a reinvention of the wheel, by stating what it hopes to do differently.
3 Cybernetics is the best-known example of an interdisciplinary movement designed to examine all issues related to control and self-regulation (see Chapter 4). More recently, machine-learning theorists have also attempted to draw work from engineering, biology, psychology and neuroscience to develop formal descriptions of behaviours associated with learning and controlling outcomes (e.g., Sutton & Barto, 1998).
The Aim of This Book
Role 1: catalogue
At its most humble, this book serves the purpose of being an inventory of sorts of what we currently know in a range of disciplines (e.g., engineering, artificial intelligence [AI], human factors, psychology and neuroscience) about control systems and control behaviour. Though not ever seriously taken up, an appeal of this kind was made in the late 1940s by Wiener, the self-proclaimed father of cybernetics – a discipline designed to study all matters related to self-organizing systems. Wiener (1948) hoped to bring together many disciplines to understand common problems concerning control. However, Wiener (1948) proposed that ‘the very speed of operations of modern digital machines stands in the way of our ability to perceive and think through the indicators of danger’ (p. 178). The effort in understanding all matters related to control came with a warning that technological advances may be such that artificial autonomous agents would be controlling our lives. That is, in the future the puppet would eventually rule the puppeteer, and not the other way around. Though the worry that control systems will reach a level of self-organization that may challenge our mastery of the world is perhaps unwarranted, surveying the most recent advances in theory and practice should give us a better understanding of what control systems can do, and our place with respect to them. As suggested, an overhaul of this kind has yet to be undertaken, and so this book is an opportunity to do just that. For instance, due to the increasing complexity of the systems under our control (e.g., systems that identify tumours in X-ray images, voice recognition, predicting stock market trends, creating game play in computer games and profiling offenders), there are in turn ever increasing demands placed on them to achieve optimal performance reliably. Even something as prosaic as the car now includes an increased level of automation. This is generically classified under the title of driver assist systems (DAS). DAS now include electric power-assisted
steering (EPAS), semi-automatic parking (SAP), adaptive cruise control (ACC), lane departure warning (LDW) and vehicle stability control (VSC). All of these things now influence the ride and handling of vehicles we drive. So we might ask ourselves, if we have handed over so much autonomy to the car, what control do we have? More to the point, disciplines such as control systems engineering present us with ever growing challenges because the control systems (e.g., car) that are part of our everyday interactions continue to increase in their capabilities and complexity. If complexity is increasing, then surely we need to know how we cope with it now, especially when things go wrong. Increasing complexity in our everyday lives doesn’t just come from controlling devices such as cars. There has been a charted increase in the complexity of the decision making involved in economic, management and organizational domains (Willmott & Nelson, 2003). We can spot this complexity because some of it filters down to our consumer choices. For instance, take shopping. We have to adapt to the growing complexity that we face in terms of the information we have to process (e.g., more available product information), the choices we are presented with (e.g., more products to choose from) and the changing goals that we are influenced by (i.e., desires, aspirations and expectations). At the heart of adapting to the increasing level of complexity in our lives is our ability to still exert control. So, given the new challenges and demands that are placed on us in our lives right now, this book may be considered a sort of stock take of relevant and current research in the study of all things control related.
Role 2: solving the problem of complexity
A broader aim of the book is to help clarify what we mean when we say an environment is complex, and what it is about control systems that invites researchers from different disciplines to refer to them as complex. The complexity issue is important for the reason that there needs to be a cohesive idea about what makes
control systems difficult to understand, and why we can fall into traps when we come to control them. For instance, the term complexity has a specific reference in design engineering and is a measure of the structure, intricateness or behaviour of a system that characterizes the relationship between its various components.4 Thus, the properties of a control system can be specified according to objective characteristics of complexity from an engineering perspective. Efforts in defining complexity have also been attempted from a psychological perspective. Studies of human behaviour in control systems have taken properties of a control system (e.g., transparency, dynamics, number of variables, number of connections and functional forms – linear, curvilinear, stochastic and feedback loops) and investigated how competent we are at controlling systems when these properties are manipulated. The steady amassing of data from experiments along these lines started from early work by Dörner (1975). But, unfortunately, despite the wealth of findings, there has been little headway in being able to say generally what contributes to making a system complex from a psychological point of view. This is a major problem. We simply can’t say what makes systems complex in general. All we can say is what might make a particular system complex. This is hugely limiting because we can’t generalize to different types of control systems the psychological factors associated with them. In other words, the analogous situation would be this: we might know that the puppet in your hand is going to be hard work to operate because every joint is movable. But we wouldn’t know why it is that, given a variety of other puppets, you would still find it hard to operate them all. Is it down to psychology? Is it down to the way the different puppets function? Or is it a combination of both? Attempts to answer these questions involve identifying possible measures of control systems complexity in order to predict the
4 As systems become more sophisticated in their functions, the interrelationship of the controlled variables in turn is used as a measure of the systems’ complexity.
success of psychological behaviours. Again, these include borrowing ideas from another discipline, in this case computer science, which have been used to measure how controllable the system is (e.g., whether it is in nondeterministic polynomial time [NP] or polynomial time [P]),5 the size of the system (e.g., its search space) and the number of interdependent processes that are contained within it. However, to date, none of these captures the true complexity of a control system, or for that matter accurately predicts what psychological behaviours we are likely to use. While it might be the case that from a computer science perspective a complicated control system can be described in some mathematical way, and from this, based on a long arduous formal process, some claims are made about how we ought to behave in order to control it, there is a problem. There is a lack of correspondence. Humans have elegant and simple means (e.g., heuristics)6 to reduce complexity that may be formally difficult to describe, but equally there are examples in which humans find it virtually impossible to tame a complex situation that may, from a computer science perspective, be mathematically simple to describe. Thus, there is a gap. Knowing what defines the system as complex is no guarantee for understanding the types of behaviours needed to learn about it in order to eventually control it. Therefore, the aims of the book are to present ways of clarifying the problems faced when attempting to define complexity, and to offer a solution to it. The solution is based on describing control systems as uncertain environments. What I propose is that identifying and measuring uncertainty of the system boil down to tracking over time when changes to events in the system occur that are judged to occur independently of our actions, while also tracing those changes that are rare but have a substantial impact on what happens overall, and
5 This relates to a general issue concerning the solvability and verification of a problem and the number of computational steps that are needed (i.e., polynomial time) to achieve both.
6 A method by which prior knowledge and experience are used to help quickly generate decisions and actions to solve a problem.
that also didn’t result from our actions. These descriptions of the system’s behaviour need to be integrated with people’s judgements about how confident they are that they can predict the changes that occur in a control system, and people’s judgements as to how confident they are that they can control the changes that occur in a control system. These are what I claim to be the bare essentials of understanding and developing a metric for examining the success of controlling uncertainty in control systems. However, you could argue that I’ve just performed a sleight of hand by saying complexity is uncertainty, in which case I haven’t solved the problem of complexity at all. This might be true, but let me qualify a few things. First of all, I’m not claiming that complexity and uncertainty are the same thing. What I am claiming is that translating the issue of complexity into defining what makes a control system uncertain paves the way for connecting objective descriptions of the control system with psychological descriptions of our behaviour. Thus, uncertainty as a concept is better than complexity as a bridge between the control system and us. Second, I’m not saying that one is reducible to the other. Instead, I am suggesting that, to answer the question of what makes a system complex, a new perspective ought to be taken. From a practical standpoint, attempts to solve the mystery of what makes a system complex have thus far been illuminating but ultimately unsuccessful in defining what makes a system difficult for us to control. I argue that it may be easier to decide on the objective properties of the control system that are uncertain than those that make it complex. This pragmatic point then, arguably, is a good starting point for presenting a framework that develops on the solution to the problem of complexity. Better still would be a framework that also tries to bridge objective uncertainty with psychological uncertainty. Here, then, is the final ambitious objective of this book.
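To make the proposal a little more concrete, here is a minimal sketch in Python of the kind of bookkeeping it implies. It is purely illustrative and not taken from the book: the class name, the threshold for a "substantial" change and the way the confidence ratings are averaged are assumptions introduced only for this example. The one point it is meant to capture is that objective uncertainty (how often, and how dramatically, the system changes independently of the operator's actions) and subjective uncertainty (the operator's confidence in predicting and controlling those changes) are kept as separate quantities that can then be compared.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UncertaintyTracker:
    """Toy bookkeeping for the two sides of the proposal:
    (a) objective uncertainty: how often the system changes on its own,
        and how often those autonomous changes are rare but large;
    (b) subjective uncertainty: the operator's own confidence that they
        can predict and control those changes (both rated 0..1).
    All names and thresholds here are illustrative assumptions."""
    large_change: float = 2.0  # assumed cut-off for a "substantial" change
    autonomous_changes: List[float] = field(default_factory=list)
    action_linked_changes: List[float] = field(default_factory=list)
    prediction_confidence: List[float] = field(default_factory=list)
    control_confidence: List[float] = field(default_factory=list)

    def observe(self, change: float, acted: bool,
                predict_conf: float, control_conf: float) -> None:
        """Record one time step: the size of the observed change, whether the
        operator judged it to follow from their own action, and their two
        confidence judgements for that step."""
        (self.action_linked_changes if acted else self.autonomous_changes).append(change)
        self.prediction_confidence.append(predict_conf)
        self.control_confidence.append(control_conf)

    def objective_uncertainty(self) -> float:
        """Share of changes judged independent of the operator's actions,
        weighted up when those autonomous changes are rare but large."""
        total = len(self.autonomous_changes) + len(self.action_linked_changes)
        if total == 0:
            return 0.0
        autonomous_share = len(self.autonomous_changes) / total
        rare_large = sum(1 for c in self.autonomous_changes if abs(c) >= self.large_change)
        return min(1.0, autonomous_share + rare_large / total)

    def subjective_uncertainty(self) -> float:
        """One minus the operator's average confidence in predicting and controlling."""
        ratings = self.prediction_confidence + self.control_confidence
        if not ratings:
            return 0.0
        return 1.0 - sum(ratings) / len(ratings)


if __name__ == "__main__":
    tracker = UncertaintyTracker()
    # A few invented observations: (change size, did I act?, predict conf, control conf)
    for change, acted, p, c in [(0.5, True, 0.8, 0.7), (3.0, False, 0.4, 0.3),
                                (0.2, False, 0.6, 0.5), (1.0, True, 0.9, 0.8)]:
        tracker.observe(change, acted, p, c)
    print("objective uncertainty:", round(tracker.objective_uncertainty(), 2))
    print("subjective uncertainty:", round(tracker.subjective_uncertainty(), 2))
```

Run as a script, the sketch prints one number for how unpredictable the system itself has been and one for how unsure the operator reports feeling, which is the pairing that the framework developed in the later chapters sets out to bridge.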
Role 3: providing answers to how we control uncertainty
In an effort to capitalize on the humble and broad objectives of this book, the final aim is to present some general principles along with
Magda Osman a framework that relates work focused on designing specific kinds of control systems (such as engineering and machine learning), work examining the interaction between humans and control systems (human factors, human–computer interaction [HCI] and ergonomics), and work investigating the underlying psychological mechanisms that support behaviours that enable control (social psychology, cognitive psychology and neuropsychology). By taking current stock and by tackling a number of critical issues that have dogged many fields, there are new insights to be gained. At the very least, in pursuing this objective a number of new questions can be raised and ways of answering them can now be formulated. This discussion is reserved for the final chapter, and can also be viewed as a sort of roundup of the main issues surrounding the puppet and the puppeteer. Moreover, all the details needed to understand the framework are crystallized by the puppeteer ’s final comment to the ringmaster. ‘All I know is this’, said the puppeteer. ‘To make the puppet dance the way I wanted, I had to know how supple the limbs and head were. The suppler it seemed to be, the more elegantly I could make it dance. I had to know when the mechanism in the puppet was running fast or slow, and when it failed to behave as I’d expected from what I did. This helped me to decide the times when it simply seemed that it was indeed behaving of its own accord, from times that it was behaving as I wanted. For all this, I needed to keep in mind two very important details: that I was the one who could choose what levers to operate, and I was the one who could choose when to operate them. This is how I gathered what the puppet might do from one moment to the next, and this was what helped me to understand what more I needed to know to make it dance to different tunes’.
Whether or not the final objective is successfully achieved is left to be judged by the reader. But one can read all the related and current work on all matters concerning control systems and control behaviours in Chapters 2–8 without relying on the proposed framework to do so.
The Structure of This Book To provide as comprehensive an account as possible of the work that relates to control systems and control systems behaviour, a broad range of disciplines are covered by this book. In short, if we go back to the story of control to understand the structure of the book, to begin we take the point of view of the ringmaster who sees everything: the puppet in its box, the strings, the panel with the levers and the puppeteer. From this point of view, we can consider all the important issues in the situation. Next we focus on the puppet and its internal mechanisms, including how it operates and what it can do. From there we consider the strings that link the puppet to the puppeteer. Finally, we need to take into account the puppeteer, and the internal mechanisms and processes that make up his behaviour. By following this order we can build up a picture of the whole story of control from puppet to puppeteer, and from this we can see how we get to the answer at the end. So, to begin, the fundamental issues that concern control are grounded in philosophy, namely, the way in which we construe causality and our sense of agency. Both of these are central to understanding how we interact with, and assert control over, our environment. The next step is to consider the control systems environment itself in terms of its more common reference to engineering. In particular, we will examine how engineers develop control systems, and what methods they use to formally describe them. This helps to illuminate the general aspects of the environment worth knowing about which apply to all other types of control systems. Closely related to engineering are the fields of cybernetics, AI and machine learning. To situate these fields of research in the context of the general issues they tackle, and the historical basis for them, the next chapter spends some time discussing cybernetics. Cybernetics, as mentioned before, was designed to provide a framework for understanding issues of control in all possible self-regulatory environments, and in so doing, it also focused on examining how humans interact with and relate to them. This has become a research matter 11
Magda Osman that has gained in importance, and human factors research specifically tackles this. Thus the chapter on this subject serves as the intersection between the first and second halves of this book. The second half of the book is orientated towards control behaviours from a psychological perspective. Across the next three chapters, the discussion focuses on general descriptions of our behaviour at a social level, and the underlying mechanisms that support that behaviour at a cognitive and neuropsychological level. The final chapter draws on all the work that has been presented throughout the book in order to end with some general principles of the control system and the psychological processes involved in it. This is the basis for presenting a succinct framework that takes into account the properties of the environment and the ways in which we as puppeteers master it. To help the reader, below is a short summary of each chapter, so that the general gist of each is conveyed and can more easily aid the reader in deciding what may be relevant to refer to.
Chapter 2: Causation and agency This chapter considers our sense of agency (i.e., the relationship between thoughts, beliefs, intentions, desires [reasons] and the events that occur in the external world [actions]) and causality, our experiences and understanding of relations between events. Much of the knowledge we develop while interacting with our environment, particularly one which is uncertain, is anchored by the actions we generate, and the sense of agency attached to them. This helps us to relate what we do with the effects that are produced. Thus, understanding the deeper issues related to these basic assumptions provides an important foundation for tackling the issues that will be discussed throughout the book.
Chapter 3: Control systems engineering Engineering work on control systems has a long tradition dating back to the nineteenth century. Control systems engineering is con12
Introduction cerned with how dynamic systems operate, and provides ways of formally describing them. Engineering not only aims to accurately describe the system, but also has another objective, which is to offer a greater level of automaticity in the systems that surround our everyday lives. In order to present the details of the control system from the viewpoint of engineering, the chapter introduces control theory, and discusses how it is applied in order to help develop systems that offer a high level of automation.
Chapter 4: Cybernetics, artificial intelligence and machine learning Cybernetics has been pivotal in suggesting that what underlies control systems in engineering applies to any systems we can think of in nature and society, whereas the way in which complex devices function and help to offer adaptive control has been the focus of study in engineering. Importantly, the issues faced by engineering are also encountered by machine learning and AI. So, the chapter aims to draw attention to the critical issues that are faced by current research in machine learning and AI, and, in so doing, to highlight the progress made thus far, and the limitations that are currently faced.
Chapter 5: Human factors (HCI, ergonomics and cognitive engineering) Some of the key concerns for human factors research is what happens when errors in control systems occur, and how things can be improved given the increasing exposure that humans have to automated control systems. What this chapter illuminates is that there are often differences in the kinds of assumptions that designers make about human capabilities, and the assumptions that human operators of control systems make about how the control system behaves. Misalignment of assumptions can be the basis for many of the problems that arise when humans interact with complex systems. 13
Magda Osman
Chapter 6: Social psychology, organizational psychology and management What this domain of research helps to draw attention to is that motivation, and the ways in which we pursue goals, can have a profound impact on our ability to successfully control our immediate environment. Moreover, agency takes centre stage in research and theory in the study of control in social psychology. This has also been central to understanding psychological behaviours in organizational and management contexts, which are also discussed. Thus the chapter covers the main research findings and current theoretical positions on control in social psychology with examples of how it applies in management and organizational contexts as examples of control systems.
Chapter 7: Cognitive psychology
Cognitive psychology has amassed over 40 years of research examining the underlying mechanisms associated with controlling complex systems. More specifically, work on control systems behaviour is concerned with how we acquire knowledge about the system, and how we apply it to reach different goals. The aim of this chapter is to introduce the types of psychological tasks that have been used to study control behaviours, along with the general findings and theories to account for them. In so doing, the chapter also discusses relevant work on perceptual-motor control, causal learning and reasoning, and predictive judgements in multiple cue probabilistic learning tasks.
Chapter 8: Neuroscience
The chapter approaches neuroscience in two ways: it suggests that neurological functioning of the brain can be thought of as a control system with respect to feedback, dynamics, uncertainty and control. It also considers the neuropsychological basis for behaviours associated with control. By looking at the underlying
mechanisms that support decision-making and learning processes, the chapter examines current work that has contributed to understanding uncertainty and the role of feedback from a neuroscientific perspective. To bring all of this to the fore, the latter half of the chapter introduces a new field of research: neuroeconomics. Neuroeconomics has been used as a way to draw together neuroscience, economics and psychology under the common vision of understanding how we learn about and make decisions in uncertain environments.
Chapter 9: Synthesis
The aim of this chapter is to propose that the control system is an uncertain environment, and therefore it is crucial to know whether the effects that occur in it are the result of our own actions or some aspect of the way the system behaves independently of us. This kind of uncertainty can reflect a genuine feature of the environment: that it is highly probabilistic in nature and hard to predict. But uncertainty is not only a feature of the environment; it can also be generated by the individual acting on the system. We can be uncertain about how accurately we can predict the system will behave, and unsure about our ability to effect a change in the system. The chapter lays out a number of general principles from which a framework is described that provides a general account of control.
Chapter 10: Epilogue
The aim here is to come full circle. The story is presented again, but this time I’ve slotted into various places the five core concepts discussed in this book along with the framework proposed in the final chapter. I hope that this will at least provide a final entertaining glimpse at ways of understanding how we control uncertainty.
Chapter 2
Causation and agency
The aim of this chapter is to introduce from philosophy some established debates which relate to the overall issues that are the concern of this book. In particular, the discussions here focus on our causal knowledge (causation), and the relationship between our intentions and actions (agency). To begin, though, it is worth considering some examples of the kinds of real-world problems that we face when interacting with a control system, and why it is that causation and agency matter. Again, each example below can be thought of as a puppet that needs mastering.
1. Biological. We are sensing the beginnings of a cold that prompts us to buy vitamin C tablets, aspirin and decongestant tablets. For the next two days we take a combination of vitamin tablets and aspirin. However, this doesn’t seem to be effective. Instead of reducing the symptoms, we begin to feel worse. We judge that vitamin C tablets and aspirin
aren’t having much effect, so we stop taking them, and start taking the decongestant tablets. The cold symptoms seem to stabilize, and after a week the cold has gone.
2. Organizational. As commander-in-chief, we have sent a small unit on an information-seeking exercise with orders to return after 12 hours. The remaining unit is stationed on the borders of a marked zone in which any behaviours indicative of intent to attack, including unauthorized landing of troops or material on friendly forces, are signs of enemy behaviour. Bombs are dropped 10 hours after the unit has been sent out. The bombing is just outside of the demarcation zone, which prompts our response to return fire.
3. Mechanical. We have just activated the cruise control in the car, and set it to start when we reach 50 mph. It’s been a long trip and we are heading back home along the motorway. However, because of upcoming roadblocks we change route and take the next motorway turnoff to avoid a longer journey home. On approaching a set of traffic lights, we try to slow down the car but it doesn’t seem to be slowing. We end up having to brake quickly to avoid a collision with the car in front.
4. Industrial. We have correctly assessed that the rise in pressure in the nuclear reactor is a failure of the ethylene oxide to react. The temperature indicator suggests that the temperature is too low for a reaction to start, so we decide that the best and most sensible course of action is to raise the temperature. The indicator shows that the temperature is rising as expected, but the pressure still hasn’t fallen. We continue to raise the temperature, which eventually exceeds the safe limits. As a result we fail to stop an explosion that seriously injures two people.
What do these examples have in common?
In general, what connects these examples is that from moment to moment states change, and we need to regulate the system (e.g., biological, organizational, mechanical or industrial) to either maintain or shift the state to a desired goal. In the first illustration the system happens to be a biological one that we are familiar with in which the desired goal is to get rid of the cold. As the states change (e.g., feeling worse, getting better), we adjust the things we do (e.g., the medication) to get us closer to our target (e.g., feeling well). The next illustration is a system of people, all of whom are working towards a main goal (keeping an identified zone safe) by adjusting the behaviour of the system (sending out a unit on a fact-finding mission, and protecting the safe zone) and responding to the change in circumstances (returning fire after the initial bombing) (Johnson, 2003). The third illustration is a mechanical system that should also be familiar. Here, the desired goals change as the circumstances in the environment change (e.g., roadblocks), and so we continue to adjust our driving behaviour in the face of changing circumstances whilst also having to operate the system to meet our changing goals (slow the car down at the traffic lights). The final illustration is a recorded event that took place in an industrial system (Kletz, 1994). Here, there is one main goal that needs to be maintained, and that is to keep the reactor in a stable safe state. To achieve this, the operator needs to continually monitor aspects of the system (e.g., temperature and pressure) so that every change in state of the system works towards the target goal. Clearly, no matter what variety of system it is, at the most fundamental level, these illustrations show that the requirements are the same. Some outcome needs to be achieved, and various steps are needed in order to reach it. Moreover, this has to be done against the backdrop of a huge degree of uncertainty about what makes things in the system change. So, we may ask ourselves, how is it possible to achieve the kinds of changes we want, if we aren’t quite sure what it is that is going on in the first place? It is this question, more precisely stated as ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’, which is going to be tackled
here from the perspective of philosophy. One way of considering this question with respect to the examples presented is to think of them from the puppeteer’s perspective. What knowledge did the puppeteer possess of the way the puppet acts at the time that he was manipulating it? What were the intentions and actions of the puppeteer? How were the events interpreted? If we consider the examples from the point of view of the individual’s intention, we could describe the examples in the following way: The biological and organizational examples indicate that the occurrences are entirely consistent with the individual’s understanding of the relationship between cause (medication/unauthorized attack) and effect (alleviate cold/retaliate), and with their intention (get rid of the cold/safeguarding the zone). In the mechanical and the industrial examples, the occurrences don’t correspond to the individual’s intentions, but that doesn’t necessarily challenge his or her understanding of cause-and-effect relations. So, from the point of view of the individual, when the outcome in the systems didn’t fit with what was expected, this was likely to result in a change in behaviour, but not necessarily a change in the understanding of the system, even though there may have been good reason to revise it. Now consider the examples from the point of view of the cause–effect relations between the puppeteer and the puppet. What may be implied as to the internal workings of the puppet may in fact not be the case. So, imagine that, in the biological example, we now know that colds will dissipate after a week of infection, and in actual fact most combinations of medication have virtually no impact on changing the rate of recovery. In the organizational example, typically there are notorious delays between signal input and output in communication systems in the field of battle. Often decisions by commanders are made quicker than the time that information can be relayed back to those in command, and this often results in friendly fire. So, in actual fact, for the first two examples there is no correspondence between the events in the real world and the intended actions. The medication didn’t have any effect in alleviating the cold, and the retaliation was in response to bombing the
individual’s own unit – they simply failed to receive information that the unit was itself under attack just outside the protected zone. The individuals in each example are operating with incorrect knowledge of the cause–effect relationships, and failing to realize that their actions did not produce the desired effects, even though that was the assumption that they were working from. Turning to the mechanical example, the car was set to cruise control while on the motorway, so while it seemed like the car was out of control when approaching the traffic lights, it was simply operating according to pre-set conditions that the driver forgot she had implemented. In the case of the industrial example, the operator is unaware that the reactor process failed to start as a direct result of the operator continuing to increase the temperature. Here again, these two examples involve individuals operating with the wrong causal understanding of the environment, but this time they are failing to attribute the events that occur to their own actions because at the time they didn’t accurately represent the cause–effect relations in the system. In sum, seemingly unpredictable events may suggest that we cannot exert control, despite the fact that the unpredictable events are actually a direct consequence of our actions. It might also be the case that we think we are exerting control when in fact our actions have absolutely no bearing on the outcome. This means that the uncertainty between our actions and the effects is created by a lack of correspondence between our causal knowledge and our agency. But take heart. Despite this, we still manage to interact reasonably well with control systems, and this is mostly because we follow some core assumptions. We typically assume that intentions are behind our actions. So, it may be fair to also assume that in general our intentions give reason to our actions, and cause them. We also tend to assume that there is a general correspondence between what we do and what happens. So, it may also be fair to say that we can assign a relationship between the things that we think (in terms of our beliefs, wants and desires) and the actions that follow which we observe in the external world (sense of agency). We also tend to assume that, if we have clear intentions, and they are the cause of our actions, and
events happen as a consequence of our actions, then what should happen in the world resulting from actions that we take, and the events that are reliably generated from them, is predictable. So, we should be able to reliably distinguish between what we do and the events we generate from what happens independently of us (causal knowledge). Given these assumptions, one approach would be to formally describe ways in which to support them, which philosophy endeavours to do. As alluded to, our assumptions may not always be well founded but they are necessary for our survival. The job of philosophy has been to examine these assumptions and illuminate the problems that come from them. For instance, there are debates concerning the correspondence between the causal relations in the world and those that we have of the world; our general understanding of causal relations and how we use our experience to form our understanding of the causal relations in the world; the nature of our actions as compared to bodily processes; the awareness we have concerning our current behaviour, the control we have – that is, the power to act or not act; the choice itself and how it is supported by our desires and reasons for acting a certain way; and our freedom to choose to act or not act (Thalberg, 1972). Though not all these issues will be exclusively discussed, this chapter will deal with them broadly, by grouping them into the following division: (1) our sense of agency – that is, the relationship between thoughts, beliefs, intentions, desires (reasons) and the events that occur in the external world (actions); and (2) causality – our experiences and understanding of relations between events. The final section of this chapter presents a round-up of the practical implications of the debates that are meted out in philosophy.
Agency
Reasons and whether they are causes of actions
What we intend (i.e., a thought of doing something) and what we actually do (having some effect on the world) seem to have an obvious relationship that need not be inspected further. But, if we
have reason to doubt this, then one way of inspecting this issue is to focus on answering the question of whether reasons do or don’t provide the basis for causal explanations of actions. The stage was set by Davidson’s (1963) seminal paper ‘Actions, reasons, and causes’, in which he argued that mental events cause physical events, a claim that seems in perfect keeping with what we generally tend to assume. However, there has been a long and detailed debate about whether we can say that one (intentions) has influence over the other (actions) (i.e., mental causation) – that is, our own sense of agency. While some (Dretske, 1993; Hornsby, 1980; Mele, 1992; Pacherie, 2008) are of the same view that reasons are causes of relevant actions, others (e.g., Dennett, 1987; Stoutland, 1976) are of the view that we shouldn’t bother with such issues because it is not a useful question to ask. To understand the debate, it is worth considering an example by Hornsby (1993) concerning something we should all be rather familiar with. Say we want to make some tea, and we go and boil a kettle to make it. The ‘want to make tea’ remains in our head, but the boiling of the water is very much outside of that (assuming we have actually boiled the kettle, and haven’t only imagined doing it). The ‘wanting to make tea’ is internal to us and therefore a personal view, but the boiling kettle is external to us and can be viewed objectively or impersonally. If the whole series of behaviours leading up to the tea stewing in the pot could be explained from an entirely objective point of view, then there isn’t any need in assigning agency (i.e., knowing the reasons for why the person initiated the actions). If it doesn’t add much explanatory power to simply focusing on what is going on outside in the shared world, then why complicate matters by trying to understand the internal world of the individual? This is Dennett’s (1971, 2006) view. Intentional stance systems theory makes this very point. Dennett distinguishes the physical level (e.g., laws of physics) from the design (e.g., biological or chemical) and intentional (e.g., attribution of beliefs or desires) levels of description of any system (e.g., human, animal or artefact). If we want to explain and understand how things operate and make predictions about the actions that we observe, then we should try
to remain as close as possible to the physical-level description. This is because there are two extra unsteady assumptions that the design-level and intentional-level descriptions make that the physical stance does not, namely, that there is purpose behind a system, and that it behaves in a rational way. But before psychology and neuroscience give up the ghost, Dennett proposes that we may still want to think about the intentional and design stance in order to make predictions about human behaviour. The point is not that considering systems in terms of intentions is unacceptable as a level of description; Dennett argues that it has its uses. It's just that it only tells us about how and why we make sense of the behaviours of complex systems by considering them as having agency, but this level of description does not say anything about the internal mechanisms that achieve the seemingly purposeful rational behaviours that we are able to predict. This would mean that, for the rest of the book, the descriptions of how people interact with complex systems should really focus on the observable behaviour of what people do when interacting with control systems, without explaining much of the mental activities. The reason for this is that it isn't necessary to take this into account to explain what is observably going on. Even if there is some attempt to refer to descriptions of people's personal reasons for their behaviour, that too can be subsumed into a general notion that even their personal reasons are causally determined as part of an order of nature. This represents the argument that the impersonal view not only is a rival to the personal view but also supersedes it, because the reasons people give for actions are not relevant to understanding their behaviour. Some psychological theories (stimulus–response theories) can do a good job of describing behaviour in this way without the need to posit any discussion of mental events like intentions which might go on to generate actions.1 In
1 Dennett in 'Skinner skinned' (1978) argues that Skinner's attempts to establish a level of description that circumvents mentalism through his psychological theory on learning did nothing of the sort, but instead fell into the same traps as all psychological theories do.
fact Thomas Nagel (1986) critically discusses this, but what he highlights is that with this view also comes the problem that the choices we make cannot be treated as free ones, because freedom then becomes fictional. For those comfortable with the implications of a purely impersonal position on freedom of choice, the discussion ends here. However, others consider this to be rather bleak, and so they would rather follow a position in which there is free agency. Dretske (1989) and others (e.g., Hornsby, 1993; Mele, 2009) are of this persuasion, and present a number of claims that follow from the impersonal view that present limitations in our understanding of our actions: (1) behavioural biology has to ignore the causal efficacy of reason into action in order to explain behaviour, and (2) psychology and neuroscience should only try to offer us correlations between internal attributes forming an intention (that is, beliefs, wants and desires) and behavioural outputs (arm movements, finger presses, changes in heart rate and speech acts). Anything beyond these pursuits is a conceptual leap too far. To understand these points, Dretske makes an elegant argument by referring to the 'design problem'. As a designer we want to have a system S (say, a new biological organism) that is selectively sensitive to the presence or absence of condition C (another biological organism running towards it), so that it performs an action A (running away) when C is fulfilled. The solution to this must be that S has an internal mechanism I (recognizing another biological organism) that indicates the condition, so that C causes A. But what Dretske (1989) is quick to point out is that C causes I, and I causes A. The intermediate step in which an internal process indicates the condition becomes the cause of an action. While the fact remains that the design problem can represent a machine or a simple organism, as well as humans, it is enough of a first step to acknowledge that I represents the world outside S and has causal power. This is an important point for us because in this book we are interested in purposeful behaviours; after all, control is the most obvious purposeful behaviour that we have. In the case of learning, a purposeful voluntary behaviour is rewarded in some way so that
under C, S learns to A. Imagine that, for this to happen, the indicator has not been inherited, nor has it been given the properties by an external agent that make it the cause of some type of action. The indicator acquires the relation to external conditions that then indicates C and in turn causes A. The relation is special in that it is meaningful, its content can be used to rationalize behaviour, and it has causal efficacy because it causes actions.2 To proceed to the next discussion, we have to carry with us the view that actions are purposeful because they are governed by thought, to the extent that we can't assign ourselves agency if the events are likely to occur regardless of whatever we think (e.g., a reflex action) (Dretske, 1999).
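Dretske's design problem has a simple enough structure that it can be rendered as a toy program. The sketch below is only an illustrative reading of the passage above, assuming nothing beyond it; the class, the threshold and the labels are invented here and are not Dretske's own formulation.

```python
# Toy rendering of the design problem: condition C is registered by an
# internal indicator I, and it is I -- not C directly -- that triggers action A.
# All names and the distance threshold are illustrative inventions.

class Organism:
    def __init__(self):
        self.indicator = False  # I: internal state standing for "predator present"

    def perceive(self, predator_distance):
        # C causes I: the external condition sets the internal indicator.
        self.indicator = predator_distance < 10.0
        return self.indicator

    def act(self):
        # I causes A: the action is driven by the indicator, whatever set it.
        return "run away" if self.indicator else "carry on grazing"


s = Organism()
s.perceive(predator_distance=5.0)   # C obtains, so I is switched on
print(s.act())                      # -> "run away" (A follows from I)

s.indicator = True                  # I set without C (e.g., a false alarm)
print(s.act())                      # A still follows, because A depends on I, not on C directly
```

The last two lines are the point of the exercise: the action follows the indicator even when the indicator has been set without the external condition, which is one way of seeing why I, rather than C, is doing the immediate causal work.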
Sense of agency, actions and what they are made up of

Another aspect that has been controversial in understanding agency is how to refer to actions. Returning to the tea example, I get up off my seat,3 I move my hand to the kettle, I flick the switch of the kettle, the kettle boils the water, I pour the water, and so on until the tea is brewing in the pot. These can all be discussed in such terms as an 'action' in a general way, that is, 'tea making' (Davidson, 1980), or they can be referred to as a distinct list of related actions of making tea (Goldman, 1970), or as components that can be hierarchically organized according to a general 'making tea' action (Pacherie, 2008; Searle, 1983). So what? Or, for that matter, why
2 The debate isn't left here. There are challenges to Dretske's view: for instance, Dennett's (1983) point is that computers trained to win at chess can be considered rational agents because they select a course of action that fulfils a goal. They show intention, and have meaningful internal representations of the external world as well as internally driven representations of the rules of chess, but clearly don't possess these properties.
3 I have avoided using examples here that also include mental actions (e.g., remembering the name of the person I'm making tea for) for the reason that the discussion focuses on effects in the external world, not the internal (for further discussion of mental actions, see Davidson, 1970; Mele, 1995).
would we want to worry about making these distinctions about what an action is? Intentionality is an essential concept for agency because the assignment of actions is intimately linked to intentions. In fact, Mele (1992) makes a specific case for what an intention is and how it differs from desires with respect to actions. Intentions are a way of initiating actions; even if they are put off for a while, they persist until they are executed in some way, whereas desires can persist without the need to initiate an action. So, the intention is the basis for the action, and the action can cause bodily movements such as using my finger to flick the switch of the kettle, causing the water inside it to boil. If they are all essentially the same underlying action, then bodily movements can be treated as effects in the same way as other non-bodily movements are treated as effects (e.g., Davidson, 1980). In this case, intentions to generate an action are not distinguished by the intention to move an arm and the intention to boil the kettle. More to the point, there would also be no distinction between our awareness of effects like bodily movements (raising a hand) and other non-bodily movement effects (boiling the kettle). Conversely, if they are treated as different, then we may want to assign some special status to bodily movements as actions, for instance by saying that they are closer to (i.e., proximal) the intention than other non-bodily movement actions. This would also imply that intentions can be distinguished according to different actions, as can awareness of them. Take for instance a noted difference between causal basicness and intentional basicness (Hornsby, 1980). The muscle movements needed to direct my finger towards the switch on the kettle are more basic than the kettle boiling, because one causes the other to occur. In addition, intending to lift my arm is an action that is intentionally basic (if I've done it a million times) because it too cannot be deconstructed into further intentions; I simply intend to lift my arm, and don't need to intend to do anything else in order to have the basic intention to lift my arm. So, more often than not, causal basicness and intentional basicness correspond. If I've performed
the same action repeatedly (say, tea making) and am highly expert at it, then from my perspective, flicking the switch on the kettle may be causally basic and also intentionally basic, because the muscle movements are initiated as a function of the intention-embedded plan of tea making. Or, in Dretske's terms, actions can be considered as purposeful in a way that mere behaviours are not, because behaviours are executions of pre-programmed sets of instructions that perform under specific conditions without the need for deliberate thought (Dretske, 1999). So, we might begin with actions in which intentions have a primary role, particularly when learning to achieve an unfamiliar goal (e.g., a novice tennis player serving a ball in the court). But through extensive practice we relax our level of intentionality (e.g., a professional tennis player serving an ace) and end up with behaviour. However, Pacherie highlights two points concerning the different types of basicness. First, causal basicness is a relative concept, in that a well-rehearsed body movement may be basic for an expert who is highly practised in it, but not for a novice. Second, intentional basicness is not always associated with causal basicness, as the following point illustrates. Contracting X muscle of the right arm is causally more basic than lifting the right arm, but contracting X muscle is not more intentionally basic than the intention of moving the arm by which X muscle is contracted. That is, we tend to have little or no awareness of how we get to move our arm through the contraction of X muscle, or of other details of how we can get to move our arm, apart from the basic intention to move our arm. We may at this stage ask again, 'Why is all this important?' The answer is that all this matters because of how we construe actions in terms of psychological and neurological perspectives, which have an important bearing on our understanding of how we control events in control systems. If we are of the opinion that bodily movements are what should be considered actions, and that intentions are conscious, then we can go one of two routes. If we are able to say that people are aware of their bodily movements, then there is little reason for thinking any of the discussion
thus far is at all relevant. Moreover, if we are sure that our subjective experiences of intentions and bodily movements correspond exactly, then agency need not be questioned, and discussion on this matter can draw to a close. But, if we begin to ask questions concerning how we know that the effects we produce in a control system are the result of our actions, then we need to examine closely the issue of our conscious intentions to act. But do we have much awareness of the actions we produce? Some would say we don't seem to have much awareness of actions generated that are intentionally basic. This has been, and still is, a serious issue for psychology and neuroscience, particularly because, as Pacherie (2008) highlights, psychologists and some philosophers (e.g., Davidson, 1980) have tended to identify actions as physical movement. For good reason, many other philosophers (Brand, 1984; Frankfurt, 1978; Hornsby, 1980; Searle, 1983) have heavily criticized this. In turn, while these philosophers believe that a sense of agency and awareness of an action are bounded, psychological examples suggest otherwise (see, e.g., Chapter 7 discussions of implicit learning). If, as psychologists might claim (e.g., Libet, 1985), the representational content of an intention should include lower-level components such as motor schemata and neuromuscular activity, then there is a problem, because some of the representational contents of our intentions are not accessible to consciousness. Dretske (1993) helps to put this problem in perspective by contrasting mental events as structuring causes with triggering causes of actions. If we return to the design problem, under condition C (an approaching mugger with a knife) the internal state I (e.g., representation of a knife-wielding mugger) causes action A (e.g., running). This can be construed as a triggering cause, since I has intrinsic properties (e.g., neurotransmitter and electrical charge) that cause A (i.e., the running movement). I also has extrinsic properties (e.g., causal, informational, functional and historical) which are structuring causes of A. This relates to why it is that S is displaying A under I – that is, what makes the representation of a knife-wielding mugger (i.e., its relational properties to the external world) a possible reason for S to run. Or, in Mele's (1992) terms,
intentions involve plans; these have higher-level representations but are supported by lower-level representations. Even if the preparation of bodily movements is organized earlier than the action itself, that in itself doesn't constitute a lack of agency, because agency is the operation of deciding (choosing) to cause an action,4 and this sets off the preparatory motor behaviours. Drawing this kind of distinction and how it can precisely relate to what we can refer to as agency are discussed in the next section.
Temporal properties of our sense of agency

A number of modern philosophers (Brand, 1984; Bratman, 1987; Mele, 1992, 1995; Pacherie, 2008; Searle, 1983) have drawn similar distinctions between intentions that cause immediate actions, and those that take place in the non-immediate future. Mele (2009) describes this distinction as proximal and distal intentions, in which the representations that are associated with each will also depend on the goal that needs to be met and its complexity (see Chapter 6, 'Social psychology'). If the wind suddenly picks up dust in the courtyard and heads in my direction, then I will shield my eyes by covering them with my hand. The action is immediate because the intention involves a goal that needs to be immediately fulfilled. If, however, I intend to finish sketching and then paint a scene from a masked ball, the intention is distal because the goal is a complex one to achieve and requires more than just a single basic arm movement. Pacherie's (2008) description of a current sense of agency, much like Mele's (1992) proximal intention, is concerned with the experiences that are close to the execution of a particular action. This is different from a long-term sense of agency, which is a projection of what one believes one has the capacity to achieve in the future by relating past actions to future goals. In Pacherie's view, the role of initiating actions and the role of guiding and monitoring actions are involved in both a current sense of agency and a longer-term
4 Some would go further by adding that what is important is that agency causes actions that have particular consequences (Mele, 2009; Quinn, 1989).
sense of agency, and not, as many philosophers propose, exclusive roles assigned to different types of intention. This is a particularly noteworthy point because it shows sensitivity to the dynamics of intentions and, for that matter, the dynamics of the external world that they cause changes in. To illustrate, it might be worth thinking about these details in terms of an actual control systems situation. When interacting with a control system like a flight management system in a plane, we tend to act according to the information available to us. We may become immersed because the control system is a busy one and requires us to constantly follow the change in flow of information (e.g., a pilot responding to the changes in information on the flight management system of a cockpit in a passenger plane). So, under these circumstances, we are not always engaged in a reflective experience while acting; we are responding to the current demands of the task, often executing familiar well-rehearsed behaviours, but still having a sense of agency to the extent that our intentions appear to match those of the events we cause. Under these conditions the actions are predictable because we are initiating a plan that includes well-rehearsed actions, from which we know what to expect will follow. We gain information back from the outcome of our actions, and this is used to gauge the success of our actions by comparing the outcomes against what we expected would happen. The comparison between the action and the goal need not be monitored to any great degree if one has confidence in being able to produce the actions for the purposes of meeting the goal (e.g., the weather conditions are stable and the flight route is a well-rehearsed one, so the plane will easily reach its destination). But our attention can change, and we might realize that we have adopted a strategy that isn't meeting the demands of our goals (e.g., the automated pilot system has relinquished control because the weather conditions are beyond the range in which it can operate, and manual control is needed to fly the plane out of danger). The discrepancy between the intentions and the events prompts a 'third-person' form of introspection of the actions – as if externally observing and evaluating one's actions (e.g., what exactly prompted a change in the behaviour of the auto-
mated system from its normal operation when the pilot activated it) – and even a 'first-person' view in which introspection occurs while preparing and performing an action (e.g., what is the immediate danger faced by the change in weather conditions, and what plans do I need to implement now?). It is important to note that, for all the different types of perspective (first, third, immersed), intentions are still pivotal to initiating and guiding actions. In sum, the point made by the example here is that we have two aspects of agency that are important. One is that the execution of actions and the intended effects are as planned, and so the correspondence is matched enough for the individual to assume a sense of agency. The second is that the intentions and the actions don't quite correspond, and so the situation may require a sustained sense of agency that will help to revise the actions to eventually meet the goals. Neither of these aspects of agency, nor the entire discussion, can be isolated from the concept of causation, which is the focus of the next half of this chapter.
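Before turning to causation, the kind of monitoring just described can be summarized as a comparator-style loop: predicted outcomes of an intended action are checked against observed outcomes, and only a sufficiently large discrepancy prompts explicit reflection. The sketch below is merely a schematic reading of the passage; the numeric threshold and the labels are invented for illustration and are not drawn from Pacherie or Mele.

```python
# Schematic comparator loop for a sense of agency: predicted outcomes of an
# intended action are compared with observed outcomes; small discrepancies
# preserve the immersed sense of agency, larger ones prompt explicit reflection.
# The tolerance value and the returned labels are illustrative only.

def sense_of_agency(predicted_outcome, observed_outcome, tolerance=0.1):
    discrepancy = abs(predicted_outcome - observed_outcome)
    if discrepancy <= tolerance:
        return "immersed control: outcome matches intention, keep executing the plan"
    return "discrepancy detected: step back, inspect the action and revise the plan"

# Routine flight segment: the automated system behaves as expected.
print(sense_of_agency(predicted_outcome=1.0, observed_outcome=0.95))

# Weather forces the automated system outside its operating range.
print(sense_of_agency(predicted_outcome=1.0, observed_outcome=0.4))
```

The two calls mirror the flight example: a small mismatch leaves the pilot immersed in the task, while a large one triggers the 'third-person' style of introspection described above.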
Causation

Causality or not

In all four of the illustrations presented at the start of this chapter, the various events described and the order in which the events occurred invited interpretations involving causes and effects. Something happened (we took some medicine), and it made something else happen (we eventually got better). However, as with agency, the very basic assumptions that we have about the events in the world and the possible underlying relationship that they have – some of which we take to be causal, and others we decide are not – are still profoundly problematic in philosophy and psychology. The debate in philosophy was for the most part initiated by Hume, though discussion on the matter of causation pre-dates Hume: for example, Aristotle, Descartes and Ockham were concerned with specifying the concept of causation (for discussion, see Lucas, 2006; Von Wright, 1973; White, 1990).
In a way, the pivotal debate in philosophical discussion on our sense of agency is inherited from the debate on causation, which is the following. If we see that one event regularly follows another, and there is some way of describing the regularity of those events without having to posit some intermediary concept (i.e., causality, or sense of agency) that seems incoherent, then why not build theory around just doing that? Just as there were arguments against positing intentionality to the external observable events in the world because it is a messy and difficult concept, this same point was made of the concept of causality nearly 300 years ago by Hume. And so, what follows from this single question is a coarsely carved-up division between either theories of causation that adopt the intuition that there are intrinsic relations between events (there is something that is causal between events) or theories that adopt the view that there is only regularity between events (causality is in actual fact illusory, and there is only succession in the events we observe in the world). The next main discussion will focus on issues concerning regularities and coincidences because these tend to offer us clues as to causal relations in the world (illusory or otherwise), but for the moment, before approaching this, what follows is a little more detail of the ideas associated with the two theoretical positions on causality. Hume's denial of necessary connections (i.e., causality) in nature suggests that there are events that appear to logically (i.e., necessarily, not probably) follow one from another, to which we assign a causal relation. But causality, on his view, is a concept that disguises what are essentially spatial contiguity and temporal succession. The problem as he saw it was that the concept of necessary connection involves an extra step beyond our senses and experience. It is used to relate events (philosophical relation), and it is also a relation the mind rests on to form our understanding of the world by drawing inferences (natural relation). To put it another way, when one event is always followed by another, we fill the gap between them by referring to causation, and go on to attribute, as a matter of fact, that the one is cause and the other effect; this necessary connection only exists in the mind, not
outside of it. Where Hume's denial of necessary connections is located is of importance, since it doesn't make sense to say that our ideas of the world aren't based on necessary connections; after all, that's what we rely on to reason causally. What he proposed is just that they cannot, as far as he was concerned, be found in the procession of external objects.5 So why is it that we have a mind that tends to go about inferring causes? This was a question that Hume was concerned with. The problem for Hume was that if the properties of relations between events in the external world don't involve necessary connections, then where does necessary connection come from? In his view, there was no point in looking for answers in philosophical relation, that is, the relations in the external world. What was left was natural relation (i.e., psychological experiences of causality). And what Hume proposed is essentially an early psychological theory of how we come to identify causes from coincidences through reasoning. The proposal didn't need to set out what conditions in the external world serve as the basis for inferring cause, but rather need only focus on what we perceive and then use to infer cause. The details that follow concern what we need to infer cause psychologically. (See also Psillos, 2002.) Imagine we observe a single sequence of events: it may have the characteristics of contiguity and succession, but for events to prompt a causal inference they must also have constant conjunction. That is, we must experience them in the first place, they must always happen together, and they must happen more than just once. But this still leaves a problem. Constant conjunction of relevant events is not necessary connection per se; we may rely on this characteristic to generate a causal inference, but two events that have constant conjunction are still only regularities perceived in the mind.
5 Though this is primarily what Hume is recognized as saying, others (Craig, 1987; Strawson, 1989) have reinterpreted his work and suggest that the denial of necessary connections is an over-statement, and in fact Hume accepted that there are genuine objective causes in nature, but that his scepticism concerned our understanding and knowledge of them.
Magda Osman Moreover, we can’t look to past experiences of cause (c) and effect (e) any more than we can rely on our reasoning to be rational for us to say that in the future c will be the cause of e, because we have to assume that (1) past experiences must resemble those of the future, which we cannot assume; and (2) the past, present and future conditions in which c and e occur must be uniformly the same, which we cannot assume. The resolution is an entirely psychological concept: constant conjunction, at least for the mind, which is custom (or habit). Our psychology is such that the constant conjunction of c and e means that when we see or think of c, our experience automatically leads us to believe that e will happen, and this furnishes us with the idea of necessary connections. It is this which is the basis of our causal inferences about the events we observe in the world. Elaborating on this further, Hume proposes that we have an idea of causation which involves a relation of events external to our mind, and this is comprised of three essential components (causal power, causal production and necessity). In addition, our idea is based on an impression, a ‘determination to carry our thoughts from one object to another ’ (Hume, 1739/1978, p. 165). The impression and the idea don’t always correspond, and it is here that custom, rightly or wrongly, projects us forward to causality.6 To take a step back before continuing, at this juncture, the question again may arise, ‘Why is this at all important?’ What we are currently examining here is the correspondence between the events in the world, and what we see and think that actually happens. What the discussion here raises is the possibility that there is good reason to have uncertain beliefs about the connection between the two. But we try to reduce uncertainty by trying to learn to predict and control it; more to the point, we can do this despite living in an uncertain world that is always changing. While the world doesn’t have necessary connections as Hume argues, we seem to operate 6
6 Sloman and Lagnado (2005) present a thorough and detailed examination of Hume and related work along with its implications for the study of human induction in psychology.
with an idea of it in our heads that just so happens to coincide happily with the succession of events in the external world. Kant also offered an answer for why this is. It is common to present Kant in opposition to Hume, but before going on to spell out Kant's alternative position on causality, it is worth highlighting where they both actually agree. What we can take from the discussion so far is that when we see two events occurring regularly, we can only really say that this is an invariant regularity if the events have some specific properties (spatial contiguity and temporal succession). These are important features of the connection, but we get onto sticky ground if we call this a necessary connection, because that is an extra leap, and one based on a trick of the mind. The other way of viewing the necessary connection, no less problematic and also the result of our mind's trickery, is to look to the earlier event in the sequence of two events and say that it has a special property, a type of force, or a power that necessitates the later event. Kant denied that based only on observing an effect we can draw conclusions about the nature of causes a priori,7 and he denied that we have any knowledge of causes as having powers. He also proposed that when two events are considered in terms of cause and effect as being necessarily connected, they follow this sequence as a result of necessary succession, and not necessary power. All of this amounts to views shared with Hume (Falkenstein, 1998; Langsam, 1994). The difference comes in Kant saying that there is something we can know a priori about causality, because as a general principle, for every event that happens there must be a cause of it that necessarily follows in accordance with a universal rule; knowing this is not a transgression of our minds.8
7 I will discuss this term in more detail later.
8 A diluted contrast between Kant's and Hume's positions on causality has played out in psychological theories of causal learning in which associative theories (Shanks & Dickinson, 1987) have focused on the way in which we learn to attribute causality through contingency and contiguity, whereas mechanistic theories (Cheng, 1997) suggest that we use these properties along with an a priori belief that there is a mechanism for which a cause has the propensity to generate its effect as a basis of attributing causality to a regular succession of events.
Magda Osman Hume’s concern was that there didn’t seem to be a secure basis for arguing that two events that follow each other in an invariant succession in the external world are causal, because only experience tells us that, and that is a product of the mind. For Kant, the point is not in describing the world outside of the way our mind structures it; the fact that we cannot say anything about this matter is because we are bound to it by the way in which we see it; we have to rely on synthetic a priori judgements.
Supporters of Hume

Bertrand Russell (1918) and Rudolf Carnap (1928) have since suggested that causation is a faulty concept and deserves only to be discussed in terms of the perceptual world, not science.

In fact, Kant viewed the very principles of science, mathematics and philosophy as synthetic a priori cognitions. The concept of causation is amongst 12 pure categories of knowledge,9 and is not derived from experience at all; it is a priori. What this entails is the following. What Kant takes a priori to refer to is not knowledge that is innate (see Kitcher [1980] for a detailed discussion of this); a priori is an item of knowledge that is not based on experience of things (in fact, it is entirely independent of them) but simply applies to them. For example, if I tell you that 'eating fish with yellow fins will cause you to feel nauseous', and you have never eaten a fish with yellow fins, then it's fair to say you haven't ever had any experience of it. Furthermore, say you go on to infer that eating yellow-finned fish will make you sick because you will feel nauseous. You've just made a causal inference based on knowledge you have received, not experienced, and you've formed some new causal knowledge
9 Kant saw that Hume's concerns with the concept of causality were actually a more general problem with knowledge, and that causality was one amongst 12 'categories of understanding' (i.e., unity, plurality, totality, reality, negation, limitation, substance-and-accident, cause-and-effect, reciprocity, possibility, necessity and contingency) from which our perceptions were organized.
through reason alone. But Kant would say that even this does not represent a priori causal knowledge. This is because the totality of your experience up until the point at which you made the inference then led you to make the inference that you did, and not any other. In contrast, Kant proposes that a posteriori knowledge is based on experience because it refers to particular things. For example, I know that I'm going to start feeling nauseous now, because I've just eaten a plate of fish with big yellow fins. I can know this because I can refer directly to my experience of eating, seeing the fish on the plate, examining the colour of the fins and so on. Also, I can say this because if I ate the same thing last week and the week before, and each time felt nauseous, then I can use this experience to warrant thinking that I've made a similar mistake now as then, in eating something that has caused me to feel nauseous. But, for me to say that in the PAST, NOW and in the FUTURE ALL fish with yellow fins WILL CAUSE people to feel nauseous, I'm having to commit to a causal claim that it necessarily follows that objects of certain kinds (fish with yellow fins) have a particular effect (make people sick). Kant argues that this is based on a priori knowledge. It is not based on experience. Having established, in a cursory way at least, what a priori refers to, we now need to consider what synthetic judgement is. This refers to a type of judgement (belief formation) made about the world that adds information rather than deconstructs or clarifies. In contrast, analytic judgements seek to clarify because they deconstruct information from what is already there through the laws of logic and by virtue of the terms they refer to. So, returning to the earlier statement that 'in the past, now and in the future ALL fish with yellow fins will cause people to feel nauseous,' we are applying a synthetic a priori judgement because we are relying on a combination of a priori knowledge to make it. We can assess the statement according to its truth or falsity by analytic judgements, but can only justify it by referring back to a priori knowledge. Hence our understanding of the world can use combinations of synthetic and analytic judgements that refer to either a priori or a posteriori knowledge, but traced back far enough they are all a
priori synthetic judgements, as in the case of our fish example which is essentially referring to the a priori synthetic judgement 'Every cause has an effect'. Through this, Kant makes a case that has since been referred to as a Copernican revolution in view of the fact that it made a radical suggestion about our position in relation to the world (Guyer, 2006). Whereas Copernicus advanced our understanding by demoting our status in the world as part and parcel of the ongoing motion of the planets and the sun rather than at the centre of it, Kant's revolutionary idea is to assert the reverse with reference to our knowledge and how it stands in relation to the world. Kant holds that all objects must conform to the conditions of our experience and how we structure it, rather than the conditions of our experience conforming to the independent properties of the objects. Take, for example, the case of regularities; what this amounts to saying is that we cannot say anything about regularities as they might be themselves in the world beyond the way in which we impose structure on them, because we impose the filter by which we receive them.
Problems with regularity, the difference between causes and coincidences, and making statistical generalization explanatory

One of the many problems with experiencing regularities in the world is that we can assign a causal relationship without the need to experience a combination of events repeatedly, but the problem is that not all regularities can be viewed as causal. For example, two events that regularly coincide can be thought of as a genuine statistical regularity (e.g., a positive relationship between income and education) or as a spurious statistical regularity (e.g., on occasions that I think of the word milkmaid, women in milkmaid outfits then walk by).10 How do I tell the difference between the two? More to
10 The use of this example is also to make reference to the earlier discussion on mental causation, since we wouldn't deny that our mental states affect the outside world; but this would, however, be an illustration in which we would infer very quickly that there cannot be a causal relationship between a mental activity and the events that occur in the external world.
the point, how can I distinguish between regularities (genuine or spurious) that are simply just that, from perhaps a single event that I assign a causal relation to (e.g., when I hit the squash ball with my racket for the first time and it bounced off it)? I need a method that allows me to distinguish a sensible inference that this is not a causal relation even when one could be invited (correlation), from a commonsense inference that there could not be a causal relation (coincidence), and from an inference that allows me to sensibly assume a causal relation must be there even if I only experience it once (causality). Again, to bring this all back to the context of trying to control an outcome in control systems, it's worth thinking again of the examples presented at the start of this chapter. Some of the events that were experienced could in fact be construed as coincidental; for instance, the treatment of the cold, although seemingly effective, may actually not have had any impact at all, because of the rate of recovery. Therefore, though the individual may have interpreted his actions as causal, they were merely coincidental. However, because of his general knowledge and assumptions he would take as fact that his intervention (i.e., taking medication) was causal (i.e., made him feel better). Similarly, in the case of the industrial example, the operator may have interpreted the failure of the power plant to initiate, despite the operator's efforts to increase the temperature, as correlational, when in fact the relationship was causal. Clearly, then, interpreting the events in a control system as causal or coincidental can have serious implications in the real world. There has been a dedicated research programme in psychology that has tried to address the way in which our cognition does this,11 but the focus here will remain strictly on the philosophical theories that have considered these questions in terms of general law-like claims that can be made about the world.
11 For a general discussion of the psychology of causal reasoning, see Sloman and Lagnado (2005).
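Since the difference between a genuine and a spurious regularity is easy to state but hard to feel, a small simulation can make the point concrete. The sketch below is purely illustrative: the two event streams are generated independently by construction, with base rates invented for the example, yet they still co-occur at a steady rate.

```python
# Two independent binary event streams still co-occur at roughly the product
# of their base rates, so co-occurrence on its own cannot separate a spurious
# regularity from a causal one. The probabilities here are arbitrary.
import random

random.seed(1)
n = 100_000
p_think_of_milkmaid = 0.2   # base rate of event 1 (illustrative)
p_milkmaid_walks_by = 0.05  # base rate of event 2 (illustrative)

coincidences = sum(
    (random.random() < p_think_of_milkmaid) and (random.random() < p_milkmaid_walks_by)
    for _ in range(n)
)

print(coincidences / n)                           # observed co-occurrence rate
print(p_think_of_milkmaid * p_milkmaid_walks_by)  # what pure independence predicts (0.01)
```

The observed co-occurrence rate hovers around the product of the two base rates, which is exactly what independence predicts; regularity of co-occurrence on its own therefore cannot settle whether a causal relation, a mere correlation, or a coincidence is in play.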
Coincidences and causes

To clarify some of the issues raised here concerning the problems with regular co-occurrences, it is worth establishing what coincidences are. Owens (1989) begins with the point that when we think of coincidences we need to assert that there is nothing that can bring them about, and that a cause cannot be construed as a coincidence if it ensures an effect. Beyond just seeming like platitudes, these claims are rather important. What follows from them is the point that two occurrences that occur by accident, each of which may have fixed prior causes, must be independent of each other to constitute coincidence. Coincidences, however, may be causes, but they in themselves cannot be caused; perhaps an example is necessary to understand this. Owens refers to the accidental meeting of two friends at a station.12 The cause of friend A going to the station was to meet his mother, and the cause of friend B going to the station was to catch a train to his holiday destination. The context is causal because there are factors that generated each independent event to occur (i.e., friend A and friend B being in the same place at the same time), but the accidental meeting is the conjunction of these independent causal processes (i.e., the constituent causal factors for each friend being there are different). We cannot infer that the meeting of the two friends has a cause from the causes that contributed to each of them being in the same place at the same time. This is because it is the different constituent causes that produced the accidental meeting. It was a coincidence that both friends each had the desire to go to
12 This example, in fact, is based on Jackson and Pettit's (1988) example in which they discuss non-causal and causal relations by referring to two billiard balls on a table that are stationary with exactly the same properties, each of which then independently has the same force applied to them, after which they accelerate at the same rate. They use this to argue that the equality of the forces applied causally explains the equality of the accelerations. Owens' (1989) response is that equality is not a causally efficacious property of the balls; it is the individual forces applied to the balls that causally explains their movement at the same rate.
the station. But this coincidence would help explain the joint presence of the friends if, for instance, both desires would have been necessary for friend A to go; then at least friend A's desires would be causally relevant to the meeting of both friends. The presence of either friend is logically necessary for the joint presence of the friends; if friend A hadn't turned up, then B wouldn't have seen friend A, and there wouldn't be a coincidence. But, importantly, each desire to go to the station is necessary for the meeting for entirely independent causal reasons. The ordering of the events in space and time for the coincidence to occur is contingent but not causal; one doesn't entail the other. Now if we apply these ideas to a concrete example from the ones presented at the start of this chapter, we can see how they bear out. In the biological example, the steps taken by the individual to alleviate the cold and the fact that the cold is eventually alleviated are treated by the individual as causally related. This is because the intentions, the actions and the outcomes are perfectly predicted by the individual. But an alternative interpretation would suggest that the individual took actions to get better, and the recovery rate of getting better coincides. It could be that there are in fact two independent causes that contribute to the events that occurred. How can we know if the events in the world and what we intend are a regular happy coincidence? This is a small-scale example, but it can easily be scaled up to many real-world problems in which the actions we take and the effects that occur are hard to interpret as causal, because they may simply be coincidental (see Chapter 5, 'Human factors'). Mackie's (1973), Lewis' (1973) and more recently Woodward's (1993) theories of counterfactual conditionals couch the idea of causation in essentially non-causal terms, and without having to refer to regularities, which, as has been alluded to already, is problematic. The reason for mentioning counterfactual theory at this juncture is that in the above discussion of coincidences, there is an example of a counterfactual statement: if friend A hadn't turned up, then B wouldn't have seen friend A. Counterfactual theory proposes an analysis of causation in terms of counterfactuals, which makes use of thinking about possible worlds, by stating that c
causes e if and only if, had c not happened, then e wouldn't have happened either. The very reason for the proposal of such a theory was as a reply to many anti-Humean critics who highlighted that regularities are not sufficient for causation (Ducasse's [1969] single-difference theory, and Salmon's [1984] mechanistic theory). Hausman and Woodward's (1999; see also Woodward, 2000) development of the theory with reference to intervention13 is premised on the following type of argument. Suppose that we want to examine whether there is a causal relationship between A (flicking the switch of a kettle) and B (the kettle turning on): we could do this by examining whether an intervention on A produces a corresponding change in B, by reasoning that 'if I had flicked the switch (A), then the kettle would be on (B)'. That A exerts a causal influence on B follows only if the change to B depends counterfactually on the intervention that changed A, under similar background conditions. Assume that (1) the change to A in the first place is based solely on the intervention we make (e.g., my flicking the switch is a result of me, and not the kettle reading my mind and flicking on); (2) any corresponding change to B, if it does happen, is entirely a result of a change to A (e.g., the kettle coming on happened only because I flicked the switch, and not because it is on a timer and was about to turn on just as I was going to flick the switch); and (3) the change to A through the intervention does not change the nature of the causal relationship between A and B, only that the change to B, if it does happen, results from the change to A (e.g., my flicking on the switch didn't change the relationship between the switch and the kettle in addition to the relationship that already exists in which the kettle will turn on because I intervened on the switch). There is a problem with this. What we decide to do as an intervention rests on various assumptions, so in and of itself there is no guarantee that the intervention will illuminate a
13 Intervention has a specific reference in Hausman and Woodward's (1999) theory, and is taken to mean an ideal manipulation which need not be achieved in practice.
genuine causal relation; it depends on what we know in advance of making the intervention in the first place to know what intervention to make. In other words, while we can use a combination of counterfactual reasoning and interventionist thinking to gain causal knowledge, we can only make sense of the new causal knowledge based on what we already knew about the interventions we made. This is because our interventions and the counterfactual reasoning we employ are based on other prior causal knowledge. In fact, the counterfactual theory in general has had many commentators who have criticized it on grounds of circularity (Armstrong, 1999) and inconsistency (Bogen, 2004), and also on the grounds that counterfactual dependence is not necessary for causation (Kim, 1973). One good reason for worrying about applying counterfactuals to render the effect as necessarily dependent on the cause is highlighted by Owens' example (Owens, 1989, 1992). The counterfactual 'If friend A hadn't turned up, then B wouldn't have seen friend A' may reveal a logical dependence between A and B, and a relation between A and B; it could also show that the action of B (seeing the friend) counterfactually depends on A (turning up at the station). But, we know from the description I gave about the events themselves that the relationship between A and B is a coincidence. While there is counterfactual dependence in causation (e.g., the nuclear bomb wouldn't have gone off if the button hadn't been pressed), there is more to causation than just counterfactual dependence.
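The interventionist test just described can be put in schematic form: fix the background, set A by intervention, and ask whether B changes. The toy model below is only an illustrative reading of that argument; the variables, the timer confound and the function names are inventions for this sketch and are not Hausman and Woodward's formalism.

```python
# Toy interventionist test on the switch/kettle example. We compare the state
# of B (kettle on) with and without an intervention that sets A (the switch),
# keeping the background conditions fixed both times. The timer is an
# illustrative confound invented for this sketch.

def kettle_on(switch_flicked, timer_active=False):
    # Structural assumption for this toy world: the kettle comes on if the
    # switch is flicked OR an independent timer turns it on.
    return switch_flicked or timer_active

def intervention_changes_b(background):
    # Set A to off and then to on by intervention, same background both times.
    b_without = kettle_on(switch_flicked=False, **background)
    b_with = kettle_on(switch_flicked=True, **background)
    return b_without != b_with

print(intervention_changes_b({"timer_active": False}))  # True: B counterfactually depends on A
print(intervention_changes_b({"timer_active": True}))   # False: the timer masks the dependence here
```

The second call mirrors assumption (2) above: when the timer would have turned the kettle on anyway, intervening on the switch tells us nothing, which is why the background conditions have to be held fixed and known in advance.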
Making causal explanations from statistical regularities

In addition to the counterfactual theory, another response to the argument that not all regularities in nature (e.g., night follows day) can be deemed causal is to distinguish between regularities that can constitute laws of nature and those that can't, by expressing them as axioms of a deductive system of our knowledge of the world. Subject to the conditions of the deductive system, those regularities that don't meet the conditions can be discarded as accidental rather than genuine regularities. Cartwright's (1980) proposal is that features of the type of explanation referring to
causation should not be attributed to general explanatory laws, because each serves different functions. Her proposal was in part a response to Harman's (1968) view on explanatory laws, which is also held by many. Harman proposed that one way to make a supporting case for the truth of an explanatory law is to say that the wider and broader the range of phenomena that a law can explain, the more likely it is to be true. Furthermore, it wouldn't make sense to conclude that the explanatory law is false, because it could only be coincidental that the wide and broad phenomena it explains didn't in reality follow the law. The law is seen as providing the best explanation – but with some extra constraints, such as that the law can only hold under certain conditions when other things are equal, and that no other law offers an equally satisfactory account. As Cartwright argues, the problem with these claims is that causal explanations have the capacity to change, and indeed they should, because new knowledge helps to increase the precision of the explanation; explanatory laws, however, should not be imbued with the same flexibility. Laws should not be thought of as true or false; they should be exceptionless.14 Put simply, Harman suggests that laws may have exceptions, whereas Cartwright highlights that, having the property of being axiomatic, laws cannot have exceptions. This is, for the most part, because Cartwright places greater emphasis on the functional importance of causal explanation, as opposed to explanatory laws. We can accept that a theory will give a particular explanation for a causal claim, and if we go along with the description then we can treat the claim as true, even against the backdrop of knowing that something may come along that will be a better explanation. My simplistic causal explanation of the relationship between me controlling the outcome of the kettle boiling and my simple
But we know that they aren’t. For example, Kepler ’s three laws of planetary motion include ‘the orbit of every planet is an ellipse with the sun as its focus’ which is not universally true: planets do follow this orbital description but approximately because, given that there is a gravitational pull exerted through the interaction of other planets, the orbit is then modified.
pressing of a button is enough to operate it, but if there is reason to doubt this explanation – for instance, the kettle doesn't work on an occasion after I've flicked the switch – then I may have to revise or even replace my simple explanation with a better one. This is different from making a law-like explanation, which would mean that I would state as a law, 'In all instances of me flicking the switch of a kettle, it will turn on'. Going back to coincidence, then, a causal explanation can be tested under careful experimental conditions. The effect that emerges is the result of a causal structure we aim to examine. The effect and its cause will be evaluated against the background of other knowledge. If our causal explanation is inaccurate, then our conclusions from the experiment are hollow, because the effect we observe in the experiment could be an artefact (coincidence). Alternatively, we may not be sure of our causal explanation, and through many varied experiments end up with the same effect that our causal explanation applies to, so we appeal against coincidence. This is because the observation from each of the experiments converges on the same effect, from which we can say that our causal claim is legitimate, because we are making an inference to the best cause, rather than just a best explanation. If laws can only be justified by turning to the argument of 'best explanation', then we might as well focus on causal explanations, because for them at least, we can examine their truth. That is, we are required to make a localist claim distinguishing causal factors from non-causal factors and factors that coincidentally accompany an effect, because we should be concerned with how specific effects are produced (Anscombe, 1981). What becomes examined, then, is the contribution that a particular cause makes in producing a particular effect, against other empirically testable facts about how other particular causes behave, and what then follows from them (Bogen, 2008). This view attempts a demystification of causality. To put it another way, we can't always be sure that our descriptions of what happens in the world are true like our causal claims about the world. This is especially the case when we accept that the thing we are describing in the world operates probabilistically, that
Magda Osman is, it doesn’t apply all the time. Theories in social sciences operate on this basis, by developing deterministic descriptions despite tacitly accepting that they are referring to phenomena which are probabilistic. There is a trade-off in which deterministic descriptions allow theory to make predictions about causal relations, but not that the phenomena being investigated have deterministic causal relations (Bogen, 2008). In practical everyday terms, we do this all the time. We compose deterministic descriptions ‘flicking the switch turns the kettle on’, but know that there are other background factors that mean it won’t always work (e.g., a power surge cuts out the electricity supply, the fuse in the plug blows, forgetting to put the plug in the socket or faulty wiring). Hempel’s (1965) pioneering theory developed the inductive statistical model which develops the concept of uncertainty, or more precisely statistical regularities. Take the following example, borrowed from Salmon (1989), in which there is a general causal claim that when a person takes vitamin C, her cold will be alleviated with a high probability after a week. Say that 9 times out of 10 whenever a person gets a cold, she will recover after a week. I then end up with a cold, and have taken vitamin C, and so also expect that my cold will disappear with a high probability after a week. Hempel proposed that this is an example of an inductive argument that forms the basis of an explanation which is governed by a statistical generalization. The premises are true (e.g., vitamin C and colds), and the conclusion (e.g., one-week recovery period) follows with a high probability. There are several problems with this position, the first of which focuses on high- and low-probability events. Hempel’s concern is that good inductive arguments are based on high-probability occurrences, with no interest at all in rare events, or more specifically low-probability events. However, rare events are not all simply spurious occurrences that inductive analysis should repel; they also need explaining, as they may have a causal basis to them (e.g., lethal violence in American schools or the Challenger space shuttle disaster; Harding, Fox, & Mehta, 2002), and so high probability is not a necessary condition for a statistical explanation (Jeffrey, 1969). 46
Moreover, high probability is also not sufficient for a good statistical explanation. Returning to the cold example, Salmon (1989) makes clear the most critical problem with the inductive statistical model. The general claim involves a highly probable outcome, and the individual case (my recovery from the cold) involves a highly probable outcome, which meets Hempel's requirements for a good inductive argument. But, if after a week colds will generally go, what has been described isn't a causal relationship, but highly probable events that are correlated. I could always take vitamin tablets and observe that my cold will go away after a week, but if it's going to go in a week anyway, my behaviour is effectively useless. But more importantly, I don't have a basis to make a causal claim, and this prompts further exploration for a secure basis, other than high probability, for making statistical generalizations explanatory. A final point to make concerning statistical regularities concerns the two ways in which probabilities are involved in the way we reason about causal events; Carnap15 (1950) clarifies these differences. The logical interpretation of probability comes from making the conclusion follow from the premises of the argument with high probability. The relationship between a premise (i.e., if so and so) and conclusion (i.e., then so and so) is objectively logical. As with the example above, the conclusion that there is a high probability of recovery after a week follows logically from the premises that I take vitamin C and I have a cold. The other involvement of probabilities in the inductive model is in how the actual statistical generalizations are described, that is, their relative frequencies. In saying that there is a high probability of recovering from a cold after a week, we are referring to a virtual scenario in which there are a series of individual cases of people with colds in which they have taken vitamin C and most of them have recovered. This turns a probability into
15
There is a third type of classification that Carnap (1945) makes along with the logical and relative frequency conceptions of probability, and that is ratio: ‘probability is defined as the ratio of the number of favorable cases to the number of all possible cases’ (p. 516).
Magda Osman something more restrictive (i.e., a frequency, which makes it objective but cannot then apply to single cases); we cannot refer to the probability of the rate of recovery from a cold given one person having taken vitamin C. Whichever of the two considerations of probabilities enters into the inductive process will matter for what we can say about statistical generalizations, as well as their use in explaining single and multiple occurrences. Here, too, the reader again may beg the question ‘How does this all apply to controlling outcomes in control systems?’ The points raised in this part of the discussion clearly apply on a large scale to general descriptions about how the world works, which are the concerns of science and philosophy. But also, they have a bearing on the control systems examples that were presented at the start of this chapter. We formulate what we know about what’s going on at the time in which we are making our decisions and acting on them, as well as using our general experience of how the world works to support our understanding. We need to formulate an assumption of how the world works in order to judge what might be a genuine causal relationship, but also what is a coincidence, and a judgement cannot be made about one without the other. We treat as true our assumptions about our plans of actions (e.g., taking a combination of medications, planning a response to an attack, breaking quickly or increasing the temperature of the reactor) and the effects they will generate (alleviating a cold, successfully defending a safety zone, avoiding a car crash or initiating the nuclear reactor). But, we also know that if the effects we try to produce don’t always occur, we need to examine why that is the case. Is it because (1) this is an occasion where something random happened, but doesn’t challenge our understanding of the world we are trying to control, because we accept that we live in an imperfect world; (2) this is an occasion that highlights the problem of adapting a rigid explanation of high-probability events; or (3) this is an occasion that highlights that the way in which the world operates is different from how we had assumed? What the latter part of this discussion was designed to highlight is that there are a number of possible ways in which we reason (e.g., counterfactually, inductively and 48
deductively) in order to identify whether we need to revise our knowledge, or not, and what we should base our revised understanding of the world on.
A Little Synthesis It may well be apparent why, in a broad sense, agency and causality have been discussed at length as the introductory ideas to a book on controlling uncertainty. But now we have to consider how the issues illuminate the question ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’ The systems that I will refer to (e.g., air traffic control, automated pilot systems, cars, people, ecosystems, subway systems, economy, organizations and nuclear power plants) have one important characteristic in common; they involve decision making which has to manage uncertainty. In terms of a pragmatic perspective, unless we maintain a sense of agency and can trust some basic causal knowledge we have of the world, we cannot learn about or control something that is uncertain. As was the experience in the story of control, the puppeteer ’s resolve was crucial to helping him learn to master the puppet, especially at times when the pursuit seems hopeless because the puppet seemed to be behaving of its own accord and not because of the actions of the puppeteer. Causality, can be thought of in terms of the concept itself – in other words, what it can be taken to mean, which was largely the focus of the chapter – but also causality can be thought of as the application of the idea to particular cases (Lucas, 2006). Adopting this kind of distinction can help to succinctly capture the problems highlighted in this chapter that concern making causal assertions, and the traps of necessary connection. Thus, for example, I can say that in general, given past and current instances in which there is bombing close to or in restricted safety zones, the threat to the zone is deliberate, not accidental. I make this assertion because this rests on accumulated evidence, albeit statistical, and this helps to form 49
Magda Osman the idea that an enemy bombing in a safety zone is a deliberate attack. However, this information can only really be useful in its application, and it enables me to make predictions that if there is bombing near or in a restricted safety zone in the future, it is because it is designed to threaten it, and that this is the result of enemy fire. The application of this knowledge to a particular future event and any response that I make on the basis of this are formed on the understanding that there is a necessary connection between cause and effect, and I will likely use counterfactual reasoning to support this as well. This is in spite of whether what we are doing is going beyond the evidence to say what will happen next. This really is the crux of what we are doing when we are applying causal knowledge to predict future events, but also when we are controlling a future outcome. We may do this in a crude fashion, and more likely make incorrect causal assumptions, but nevertheless, however crude this method of reasoning is, it is also incredibly robust in the face of a potentially crippling level of uncertainty about the complex systems we attempt to predict and control. We may not be able to truly detect the necessary connection between causes and effects or realize that ultimately there is no necessary connection, and in particular with respect to the actions we make and the effects they may produce, but nevertheless it is part of the cognitive mechanism we need that enables us to interact and manage uncertainty. Much of the causal knowledge we develop is anchored by the actions we generate, and the sense of agency attached to them. This is because we can initiate an action and the sense of agency entitles us to draw an inference that the effect that immediately happened was caused by us (Davidson, 1967; Lucas, 2006). Moreover, from this, our concept of cause is directed by time. We know that the cause precedes the effect, and that there is likely to be a separation between the two in time. In the context of controlling outcomes in complex systems, actions are directed towards a goal, because they are designed to achieve a particular effect; and because of this, we usually have formulated an explanation for the effects that occur, typically at a localist level. That is, the range of possible factors contributing to the effects produced is confined as much as possible 50
to the observed behaviours of the system (Hooker, Penfold, & Evans, 1992; Lucas, 2006). All of these factors help build up our causal knowledge, and at the same time make a seemingly intractable problem of controlling uncertainty into a practical problem. One of the most successful ways of considering how to deal with these issues from a practical perspective, while still remaining sensitive to the more general problems raised through philosophy, is to consider them from a control systems engineering perspective (Sloman, 1993, 1999), which is the subject of the next chapter.
Chapter 3
Control systems engineering
Before we can begin to consider the psychological aspects of interacting and controlling uncertainty, we need to set the scene by first considering the actual mechanisms that drive complex systems. To do this, we need to explore the nature of control systems from an engineering perspective. In this sense what this chapter aims to do is detail the workings of the puppet, the nuts and bolts and theory that help to develop it and what contributes to its unpredictable nature. What will become apparent in this chapter is that while there are properties of the puppet that are well defined by theories of engineering, there are properties that cannot be well defined, and this means that some aspect of the puppet will remain uncertain. Nevertheless, identifying the source of uncertainty in a control system should provide a more secure basis from which to find answers to the central question of this book, ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’ This is because if we know how the environment behaves, we can understand better how we react, interact and control it.
So, following this rationale, we turn to the realm of engineering. Engineering involves control systems of many kinds, from highly complex real-world systems (e.g., insulin delivery systems, blood pressure control during anaesthesia, aircraft, robotics, traffic control or nuclear power plants) to everyday devices (e.g., cars, CD players or digital audio players). The discipline treats all these systems in the same way by examining the relationships between inputs (i.e., the things that go into the system) and outputs (i.e., the stuff that comes out of the system). Offshoots of engineering (e.g., human factors, ergonomics, human-computer interaction) focus on the relationship between human and control systems, particularly in work on aviation (e.g., Degani & Heymann, 2002; Kirlik, 2007; Rothrock & Kirlik, 2003)1 and other automated systems such as nuclear power plants, military systems and hospitals (Rasmussen, 1985; Sheridan, 2002), which will be considered in more detail in Chapter 5. The aim here is to present the details of the control systems environment from the viewpoint of engineering – that is, to examine the system itself and the ways in which it is developed, which provides the grounding to examine our interaction with control systems. More specifically, this chapter discusses control theory in engineering,2 which is the formal way in which control systems are described. This chapter ends with a discussion of how work in control systems engineering can go some way to answering the main question that this book is concerned with.
What does Control Mean for Engineers? The concept of control has two distinct meanings in the domain of engineering and mathematics. One refers to the process of testing or checking that a physical or mathematical device behaves in a 1
For an excellent review of this literature, see Degani (2004). Control theory has been applied to a variety of dynamical systems including biological, economic, sociological and psychological ones, and so many of the themes that are discussed with reference to this theory in engineering will be revisited throughout the book. 2
specific manner. The other refers to the act of controlling, which involves implementing decisions that guarantee that the device behaves in a way that we expect (Andrei, 2006; Birmingham & Taylor, 1954). Both interpretations of control systems engineering work towards one main aim, which is to develop a physical system (e.g., an ecosystem or economy) or artificial system (e.g., a car or commercial aircraft) that operates in ways that reliably meet the goals intended for it. And, more importantly, it should be able to do this automatically. Why? Well, because if a system reaches a point where it is reliable and stable, and its output replicable, then it can be automatized, which constitutes optimization of the system.3 Interestingly, then, in engineering at the heart of control are the description and development of systems; these systems involve human operators, but the ultimate aim is that the system operates by itself so that humans don’t need to supervene. Control is given to the system, not to us, because we should be relieved of needing to control the system; it should be able to manage itself. Control means automation, but how do we achieve successful control? To achieve successful control, the objective of the system needs to be defined so that it can be predicted over a given time scale, assuming, that is, that all the outcomes of all the actions of the system are available for inspection. If the output in the system cannot be reliably predicted based on what we know of the inputs, then that can lead us to suspect that there are unknown or unpredictable inputs (disturbance, noise, etc.) (Doyle, Francis, & Tannenbaum, 1990). Thus successful control requires accurate prediction of events, but how can this be accurate? The answer is time. If designed well, choosing a short time scale will likely imply a predictable system; if the time scale used to predict the behaviour of the system increases, this presents a major problem because this in turn introduces uncertainties in the accuracy of prediction, particularly given the likelihood of unforeseen events. If uncertainty surpasses a given threshold, then no meaningful control can be implemented because the function of the system is unpredictable. However, taking into account potential long-term future contingencies may come with a cost. That is, the system is designed with long-term contingencies in mind that may not happen, while having to sacrifice design features that concern the types of events that will occur on a short time scale (Leigh, 1992). How does one balance all of these practical issues? We need a theory to set out the specific objective of the system – that is, its intended goals – to determine how to achieve this by setting out the necessary actions required of the system, and set out a method by which to choose actions, or leave scope to change the actions in the system so that the intended goal is reliably achieved.
3 Historically, the development of automatic control was concerned with replacement of the human worker by the automatic controller, for example the mechanization of various processes in the industrial age (e.g., the steam engine).
Control Theory Control theory applies to most everyday situations in which there is a goal-directed system. Its universality comes from being able to abstractly describe any situation (i.e., a system) that needs to be controlled. The theory follows the principle that once we can establish how to control a situation, then by extension, any and every particular situation has the capacity to be controlled. To achieve successful applicability, control theory needs to specify the following: 1.
The purpose or objective of the system. Typically, the purpose of any control system is assessed according to its performance over a given period of time. As highlighted above, there can be a misalignment between the need to meet the requirements of a long-term goal and meeting the requirements of the short term. These types of conflicts need to be resolved in order to describe and implement a successful control system. This is underpinned by the concept of feedback. This is a process in which the current state of the system, or its output, determines the way in which the control system will behave in the future. In other words, through feedback the behaviour of a past event will be relayed informationally through the system to bring about the same behaviour in the future.
2. Fluctuations in the system. Control systems often need to have room for flexibility in view of the fact that, to reach its objective, the system may not be able to achieve this without sensitivity to change. Therefore, a simple brute design means a system will follow a particular course of behaviours over and over again that cannot be modified. Allowing for fluctuations in the system not only offers an element of choice in the behaviours that the system has available, but is often the most efficient way of reaching its objective because it can discover the dynamics that will drive the system towards its desired state.
3. Optimization. The specific characteristics or functions of the system (parameters) are adjusted to the point at which the control system operates in a way that minimizes error and maximizes reliability in producing the desired goal. This requirement is met by a model (formal representation) capable of reliably predicting the effect of control actions on the system state.
In summary, then, a control system is essentially composed of interconnections of elements (components) that are organized into a structure that is goal directed, that is, it serves a functional purpose. The way in which the control system operates is through controlling variables that influence the state of a controlled object. The controlling variables are regarded as inputs to the controlled object, and its outputs as controlled variables (e.g., to regulate [control] the temperature [output] of an oven, the heat [input] needs to be adjusted). What control theory is concerned with are the structural properties of the system, particularly dynamic systems – systems whose behaviour over a time period is the focus of interest. It focuses on describing three broad aspects of the system: feedback loops in the system, fluctuations that occur in the system and system optimization. By using block designs – a graphical representation of the system with its components and its connectedness – a system can then be formally described through mathematics.
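To make the input–output–feedback vocabulary concrete, here is a minimal sketch in Python of the oven example mentioned above. The numbers and the simple proportional rule are my own illustrative assumptions, not a prescription from any control theory text.

```python
# A minimal sketch (illustrative numbers, my own) of a closed-loop regulator:
# the controller compares the desired output with the measured output and
# uses the error signal to choose the next input to the process.

def simulate_oven(setpoint=180.0, steps=60, gain=0.4):
    """Toy regulation of an oven: heat is the input, temperature the output."""
    temperature = 20.0                       # state of the controlled object
    for _ in range(steps):
        error = setpoint - temperature       # feedback: desired minus measured
        heat = gain * error                  # controller sets the controlling variable
        # crude process model: heating raises temperature; heat also leaks away
        temperature += 0.1 * heat - 0.02 * (temperature - 20.0)
    return temperature

print(round(simulate_oven(), 1))             # settles well below 180: a purely
                                             # proportional rule leaves a residual error
```

The residual error the sketch leaves behind is one reason practical controllers add further terms; the point here is only to show how the error signal closes the loop.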
Linearity and Dynamics To build as well as implement a control system, its performance capabilities and its stability have to be analysed, and this is based on transforming the properties of the system that vary (i.e., the variables) into a form that is amenable to mathematical treatment. To do this, the Laplace transformation is applied, but before explaining this, two central concepts of control systems need to be discussed: dynamics and linearity.
Dynamics What makes a system dynamic is that there is an intermediary internal mechanism that connects an input to an output. The state of the system won’t change instantaneously, that is, the output won’t respond to the input, because the processes that are required for control are inert, and the dynamics of the system are provided by its own internal energy store (an internal transient response mechanism – or actuator). For example, if I sit on the seat (input) of a swing (internal mechanism), it won’t move until I generate the energy to move it. If I intend to keep swinging at a steady pace for exactly a minute (output), I can’t control this efficiently and accurately unless I know how the energy is stored in the swing. More specifically, the dynamics are characterized by variables which are functions of time; these can be continuous-time variables (e.g., they change at a particular place) which correspond to ongoing change over time, or discrete-time variables which correspond to change over uniformly spaced time periods (Jacobs, 1993). Dynamic systems are described for the purposes of mathematical analysis by dynamic equations (these involve differential and difference equations – see Perko, 1991). These equations can be used to specify two types of relationships within a control system: (1) relationships between the rates of change of time-varying parameters of the system, and (2) relationships between parameters at different time points.
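As a hedged illustration of these two kinds of relationship, the linear forms below are a generic textbook choice rather than the book's own model: a differential equation ties a rate of change to the current values, while a difference equation ties values at successive time points.

```latex
% Generic linear dynamic equations: x is the state, u the input,
% a and b are fixed parameters of the system.
\[
\frac{dx(t)}{dt} = a\,x(t) + b\,u(t) \qquad \text{(continuous time: rates of change)}
\]
\[
x_{k+1} = a\,x_k + b\,u_k \qquad \text{(discrete time: values at successive time points)}
\]
```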
Linearity Pierre-Simon de Laplace’s (1749–1827) work has been crucial to engineering: his developments in mathematics enabled a formal analysis of control systems, which is now referred to as the frequency-domain approach. In the linear world, the relation between cause and effect is constant and the relation is independent of magnitude. What this refers to in terms of control systems is this: by describing a system as linear, one can take a system that responds to a particular stimulus, and scale it up by increasing magnitude, with proportionality preserved. Why is this important? To describe and analyse the function of a system, it is often easier to treat it as linear. What the Laplace transform does is give a functional description of an input or output to a system that simplifies analysis of the behaviour of the system. For instance, in physical systems the Laplace transform takes inputs and outputs as functions of time (time-domain) and transforms them into inputs and outputs as functions with respect to frequency rather than time (frequency-domain). This is important because differential equations often have to be used to describe control systems because the types that are usually examined are dynamic, and some of the functions are unknown. If they can be linearized, then the Laplace transform can be applied to reduce complexity in the mathematical calculations analysing the operation of the system. Thus, the central concept of the frequency-domain approach is the transfer function, and in the analysis of continuous-time dynamical systems, the use of Laplace transforms predominates for good reason. In general, control theory has a tendency to offer solutions to problems that arise in analysing the function of systems in terms of linear problems, even though most practical problems are nonlinear. While control theories that rely on solving linear problems often lead to precise quantitative results which make them widely applicable, there is a notable criticism. Application to specific real problems may amount to no more than provisional recommendations of the properties of the system that need to be altered, and of the particular way that this should be done. This is because these theories neglect the dynamical changes that transform inputs to outputs, which in turn demand a different method of description and analysis – this will be discussed later in this chapter in the ‘States and Uncertainty’ section. The point made here is that, for the sake of devising practical solutions to control problems, the easiest way of approaching them is to ignore non-linearities and treat the system as a linear one.
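For readers who want the formulas behind the frequency-domain approach, the Laplace transform and the transfer function it yields are standard definitions; the first-order example is a generic illustration added here, not one taken from the book.

```latex
% Laplace transform of a time-domain signal f(t):
\[
F(s) = \mathcal{L}\{f(t)\} = \int_{0}^{\infty} f(t)\, e^{-st}\, dt
\]
% For a linear system with input U(s) and output Y(s), the transfer function is
\[
G(s) = \frac{Y(s)}{U(s)}, \qquad \text{for example } G(s) = \frac{K}{\tau s + 1},
\]
% where K is the gain and tau the time constant of a simple first-order process.
```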
Representation System: Block Designs To analyse a control system, engineering uses a representational system referred to as block designs. This is founded on linear system theory, for the reasons that the relationships between the components represented in the system are assumed to be linear, and that there is a cause–effect relationship between these components. Thus, what is shown in a block design is the process to be controlled. The examples presented in Figures 3.1–3.3 are block designs that include a basic dynamic system (Figure 3.1), a simple openloop system (Figure 3.2) and the most common type of control system, which is the close-loop system (Figure 3.3).
Figure 3.1 Process to be controlled (block diagram: Input → Process → Output).
Figure 3.2 Open-loop control system without feedback. Source: From Dorf and Bishop (2008).
Figure 3.3 Close-loop feedback control system with feedback. Source: From Dorf and Bishop (2008).
All these examples represent the cause–effect relationship between an input and output, which are unidirectional. They represent the transfer function of the variables of interest in the control system – in other words, what goes in, what comes out, and the process that transforms one into the other. As discussed earlier, these systems are dynamic because there is an intermediary process between input and output, that is, the actuator. In each of the three examples, the input–output relationship is causal: it goes from the direction of input (i.e., a signal from the input) via an actuator to an output (i.e., an output signal variable). The inclusion of loops in a system is, in the main, for the reason that they offer a powerful means of control. The difference between open-loop and close-loop systems is the presence of feedback. As is 60
Control systems engineering shown in Figure 3.2, the open-loop system has a controller and an actuator, and the objective response (the output) is achieved without feedback. An open loop is an example of a direct system, for the reason that it operates without feedback and directly generates the output in response to an input signal. In such a system, it is likely to be automatic; that is, a device controller influences the behaviour of the process of a control system in a particular way, by issuing commands which are decided in advance by the designer. For example, a shop window display that has a rotating disk in which the mannequin is spinning around slowly involves an open-loop control system. In view of the block design represented in Figure 3.2, the desired speed of the rotating disc is set (input), the controller in this case is an amplifier, the actuator is a direct current (DC) motor, and the process is the actual rotating disc itself, with the output being the actual speed of the rotating disc (Dorf & Bishop, 2008). Closed loops differ from open-loop systems because the decision governing an action taken produces an effect which is then reported back by an information channel (via a sensor). Because of the presence of feedback, further decisions that are taken are looped back continuously. For this to be achieved, close-loop systems have to include an additional measure of the output in order to compare it with the desired output response. The measure of the output is called the feedback signal. And so, for a simple closed-loop feedback control system, maintaining control involves a comparison functioning continually between the different variables of the system, and using the difference as a means of maintaining accurate control. As is shown in Figure 3.3, the system uses a measurement of the output signal, and the comparison between that and the intended output is the error signal (reference or command), which in turn is used by the controller to adjust the actuator. Going back to the rotating disk example, the open-loop system is effective if there are unlikely to be any environmental changes. However, imagine that the shop is displaying an ice sculpture, and so over the course of the day the distribution of weight over the disc will change as the sculpture melts. The rotating disc can be converted 61
Magda Osman into a close-loop system, by including a tachometer (a sensor) which can measure the rotation speed of the disc. Viewed from the block design presented in Figure 3.3, the output of the controller causes the actuator to modulate the process in order to reduce the error. In other words, if the speed of the disc changes as the sculpture changes in shape, the signal from the tachometer can be used to further adjust the system, so that the error between actual output (rotation speed) and desired output (intended speed) remains small. This system in particular is an example of a negative feedback control system, because the actual output is subtracted from the input (desired output) and the difference is used as the input signal to the controller. Though loops offer many advantages in control systems, they are essentially error driven – because there are always likely to be changes between the intended and actual outcomes of the system, and this difference is error. Therefore, the success or performance measure of a system is its ability to reduce error. In addition, while the concept of control loops and feedback has been the foundation for control systems analysis, the introduction of control loops, and particularly those with feedback, also increases instability, and as a consequence tighter performance requirements are sought.
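The rotating-display example lends itself to a small numerical sketch. The Python toy below uses my own numbers and simplifications, not Dorf and Bishop's model: it contrasts the open-loop display, which drifts as the sculpture melts, with the close-loop version, where the tachometer reading is fed back to correct the error.

```python
# Toy comparison (my own illustrative numbers) of the open-loop and
# close-loop versions of the rotating shop display described above.

def rotate(disturbance, feedback=True, desired_speed=10.0, steps=50):
    speed = 0.0
    command = desired_speed                    # open loop: command fixed in advance
    for t in range(steps):
        if feedback:
            error = desired_speed - speed      # tachometer reading vs intended speed
            command += 0.5 * error             # controller adjusts the actuator
        # process: the disc follows the command, but the melting sculpture
        # changes the load and drags the actual speed away from it
        speed = 0.8 * command - disturbance(t)
    return speed

melting = lambda t: 0.05 * t                   # disturbance grows over the day
print(round(rotate(melting, feedback=False), 2))  # open loop: drifts well below 10
print(round(rotate(melting, feedback=True), 2))   # close loop: stays close to 10
```

The closed loop holds the speed near the desired value precisely because the difference between measured and intended output keeps being fed back into the controller.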
Feedback Loops Feedback is central to almost every practical control system. Some (Doyle, 1982; Jacobs, 1993) have argued that it is the presence of uncertainty that motivates the use of feedback in control systems, and it is for this reason that feedback can be regarded as one of the most crucial concepts in any control theory. Feedback control systems involve one or more loops (see Figures 3.1–3.3) in which the connection between input and output forms a circuit. Information flows through the loops, so that part of the information entering a component is information that previously left that com-
Control systems engineering ponent – that is, the system feedbacks back on itself. Feedback control is when an error detected by the system initiates corrective actions which are useful when the actions take immediate effect, but this cannot occur if an error is not detected in the first place. However, there are constraints to the usefulness of feedback control in systems, because often it isn’t possible to develop a system with a high enough bandwidth (a measure of ability of the system to reliably generate an input signal – and also the range of frequencies to which it will respond) from which accurate error detection can occur and correction through feedback is initiated. Instead, there are a number of alternative ways (e.g., preprogrammed control, feedforward control and prediction followed by control) of coping with the problems that feedback is usually designed to countermand. Preprogrammed control involves a sequence of controls that are calculated in advance and are implemented irrespective of signals that are received from the system while it is performing. That is, there is no corrective behaviour; the system is simply implemented and acts according to the set programs within its design. This type of system is implemented in situations in which the conditions in which the system operates are highly predictable. Feedforward control signals are important for correcting future actions of the system, but these are based on calculated error prior to the implementation of the system. So, unlike feedback control in which the error signal is used online to generate corrective actions, the calculations are made in advance. Similarly, prediction followed by control also involves advanced estimations of future contingencies and is based on making advanced predictions about these events either from extrapolations from algorithms, or based on previous histories which contribute to developing a low-bandwidth control system. Why is this important? Well, clearly there are cases where prediction needs to precede control. Without this, control systems will misalign the constantly changing needs and an increasing rate of response to them. Therefore, it is important to predict in
63
Magda Osman advance some aspects of the future in order to adjust the system to them quickly enough as the changes arise. Leigh (1992) gives an illustration of this by referring to maintaining a supply of electricity from a power station as the demands of the consumer change. Since there are obvious changes in the demand according to daily as well as seasonal changes in demand, this enables fairly stable predictions and so preprogrammed control can be implemented. This approach is far more useful than a feedback control system in which the process of bringing new generators online to match the changing demand would be slower than the actual fast rate of change of consumer electricity demand.
Cost of feedback Costs of feedback include the fact that there is an increase in the number of components which are in the control system. Once the number of components increases, so does the actual expense of the control system. In addition, the increase in the number of components in turn means increases in noise and introduces further inaccuracies into the system, again largely as a result of sensors. For example, sensors in turn need extra wiring, and this reduces the reliability of control systems though stochastic noise, cyclic errors and limited responsiveness. Finally, an important cost of feedback is instability.
Stability The issue of stability takes precedence when feedback is introduced into a control system. A useful analogy to employ when considering stability in a system is a spinning top. If the handle from the spinning top is removed, and it is turned over on its flat end, then its resting position is stable, that is, it won’t move easily unless the surface it is resting on is moved. If the handle is placed back on it, and it is left to rest on the table, it is in a neutral position, but is less 64
Control systems engineering stable because it can still roll around. If you then actually tried to balance it on its point without spinning it, its position would be unpredictable, and also unstable. Thus, a system is considered stable if after some change from an equilibrium state the system will reliably return to its equilibrium state.
Using stability to measure performance of the control system Stability is important because, along with measuring the performance of a feedback system according to the reduction in error, the upper bound of performance of such a system is also decided according to its stability. Measuring the stability of the system involves examining its behaviour over a given time period by introducing a perturbation (deviation from equilibrium) and then detecting what change occurs following it, and whether it decreases or increases.4 If the system is stable, then the response to a specific input signal will provide several measures of the performance. If the system is unstable, then any deviation from an equilibrium state will mean that it will deviate further, moving off with ever increasing deviation (linear system) or possibly moving towards a different equilibrium state (non-linear system). For example, as happens with loudspeakers, a poor adjustment to the sound may follow if a microphone is too close to the loudspeaker. What then tends to follow is a whistling and booming sound which after a while then dissipates. The sound coming from the loudspeaker is an amplified version of the sound picked up from the microphone, but also the sound coming from the speaker itself. Depending on the distance between microphone and loudspeaker, and as a result of a time delay, there is positive feedback produced by too much loudspeaker output feeding back into the microphone driving the loudspeaker. Hence, the reason for this is because this is an
4
For a linear system, the responses to initial perturbations of different magnitudes are identical except for a scaling factor.
65
Magda Osman example of a close-loop feedback system, and it destabilizes as a result of feedback.
Introducing instability into a control system However, there are also many systems that involve open loops which are unstable by design – in which active control is designated to the operator for reasons of flexibility. The instability is then handled by feedback, because judgements can be made as to how to adjust the system according to the various environmental conditions, which are likely to be transient. To compare, a commercial aircraft is much more stable than a fighter aircraft because of the different requirements of the system. Consequently, the former has less manoeuvrability than the latter, for good reason, but the instability of the latter also makes it flexible enough to adjust its behaviour as and when the operator needs it to. This flexibility would not be an advantage for a commercial aircraft that requires the motion of the aircraft to be smooth. Thus, the examples here show that given the different goals of the system, the design will introduce flexibility (in this case instability) within the system to accommodate fluctuations which either are needed for quick changes in behaviour or need to be ignored (Dorf & Bishop, 2008). Moreover, another related factor that introduces instability is time delays. Time-varying control systems are defined according to one or more of their parameters varying as a function of time (e.g., its properties, size, weight or different demands over time – such as a need to respond to changing goals). Moreover, for some control systems the chemical processes themselves are likely to be dynamic (e.g., Weeks et al., 2002); combined with a closed loop in which there are time delays (e.g., hydraulic processes, chemical systems and temperature processes), the stability of the system can be severely affected (Yang et al., 2007). A time delay refers to the interval between initiation of an event that can be specified in the system and the effect of that event occurring at another specified point in the system. For instance, a time delay can occur in a system that has a movement of a material that requires a finite time to pass from 66
Control systems engineering an input to an output. Therefore, as with other aspects of control systems, a formal description of the behaviour of the system is needed in terms of time and its stability in relation to time, particularly if it is unclear how the system will behave within a given time interval (Mayne et al., 2000; Yu & Goa, 2001). The Nyquist stability criterion is used to determine the effect of the time delay on the relative stability of the feedback system. Harry Nyquist’s (1889–1976) work was instrumental in developing solutions to reducing instability in feedback amplifiers, which had particular impact in the development of mass communication over long distances. The Nyquist stability criterion is a method of assessing the stability of a system; more specifically, it gives details concerning the frequency characteristics of the system, and time shifts in the system. For example, holding a lighted match (lowfrequency input) under a thin strip of aluminium for a few seconds is unlikely to make it change its behaviour or make it unstable, but holding an Oxy-Acetylene torch (high-frequency input) under it will make it melt and run from the flame and separate, behaving in an unstable manner. Thus, the input characteristics of a system may mean that the system itself remains stable under low frequencies, but behaves erratically under high frequencies. A frequency response analysis is the application of the Nyquist stability criterion in order to predict the effects that occur in a control system to reduce instability (Bode, 1945; Nyquist, 1932).
States and Uncertainty The state of the system refers to the combined properties of the components of the system; this is important because the state of the system at any given time point can be used to determine the future response of the system. What then happens in the future in terms of the state and output of the system will be determined by the input signals, and the values of the variables of that system combined with mathematical descriptions of the dynamics (dynamic equations). The concept behind this is also known as uncertainty, 67
Magda Osman because central to analysing the state of a system is an estimation of judged future states of the system, the important point being that they cannot be known for sure, hence why this is an uncertain factor. Optimal control theory has been developed in order to accurately make these kinds of estimates. The theory is primarily concerned with problems of state space, and analysing properties of controllability and observability in the system. There is, however, no definitive characterization of uncertainty. It is simply referred to as the lack of direct measurements of the state variables. These issues are central to the time-domain approach of control theory. As mentioned previously, one of the key limitations of the frequency-domain approach (i.e., linear theory) is that it fails to take into account dynamical changes in the system. In optimal control theory, the states of the system are analysed using methods of probability to provide a complete stochastic description of control under uncertainty. If there is uncertainty about current values of the state of a system at any one time, then conditional probability distribution5 serves as the basis for equations of optimal control. To understand this, it is first necessary to bring time back to the fore. Analysing and implementing a system on the basis of states require that despite the obvious limitations that no system involves states that can be completely measurable, the system is treated as such. The advantage of this is that even though there are things about the control system that are potentially unknowable, optimal control theory says that the states of a system are utilizable, and this can be done in accordance to a full-state feedback control law. Therefore, the system is designed with an observer, which has the function of estimating the states that are not directly sensed or available for measurement. Essentially, the role of the observer is to
5
In other words, given the value of some factor X, this can be used to predict the probability of Y happening.
68
Control systems engineering augment or in some cases replace sensors in a control system. The advantages of observers compared with sensors are that they are less expensive, are more reliable, and incur less error in the system, because they are algorithms that are implemented in combination with other knowledge of the control system to produce observed signals (Ellis, 2002). The observer is connected to the full-state feedback control law – which is also referred to as a compensator. The compensator is an added component to the circuit in this closed loop which further compensates for a deficiency in the performance of the system. Probability theory (e.g., Bayes theorem) can be used to design algorithms which accept available measurements and generate estimates of states which could be used by an optimal control law. If we refer back to the swing example, my legs swinging underneath the seat (power conversion) enable the seat to sway back and forth (plant) with me on it. My ears (sensors) are used for balance, and my judgement which is formed by my experience (observer) is used to adjust my leg swinging so that the swing continues to move in a regular controlled way. In this way, the observer combines a measured feedback signal from the sensor with knowledge of the control system components to adjust the behaviour of the plant. This is far more accurate than if the feedback signal from the sensor alone were to be used to adjust the behaviour of the plant. As has been made clear, the observer involves a store of knowledge – or built-in a priori information and equations governing the object whose states are to be estimated, and about the statistics of exogenous variables. When the governing equations are linear and all random factors are normally distributed, the estimation algorithm is implementable and is known as a Kalman filter. Rudolf Kalman (1930–present) developed the optimal filtering estimation theory, which overcame the problem of the limited frequency-domain approach by introducing the mathematical entity ‘state’ that mediates input and output. This provides an inherent notion of causality to the control system, because the concept of a
69
Magda Osman ‘state’ emphasizes that there is an internal structure to the control system, and that it refers to a directional relationship between input and output.6 Two important contributions based on this are Kalman state-space decomposition and the Kalman filter.7 The Kalman state-space decomposition is a method of separating the state (or combinations of states) of the system according to controllable and non-controllable states. For those states that are identified as controllable, the system can be treated as stabilizable. However, since some states cannot be known, the state is not available to be fed back – that is, it is inaccessible – and so under these conditions a state estimator is used to reconstruct the state from a measured output. Since many powerful feedback control strategies require the use of state feedback, a state estimator is needed to essentially predict what will happen next in the system based on prior information of its behaviour. The Kalman filter is a state estimator, and one of its many famous applications in a highly complex control system is the moon landings of the Apollo space shuttle. The filter ’s strength and wide applicability come from its use in time-varying processes – both discrete and continuous time. What the filter does is take a process that may vary in unknown ways across time, and estimate either some or all of its model parameters numerically in real time from measured process data from the control system. In essence, parameter estimation is based on a relabelling as state variables those parameters that are to be estimated – that is, the state which is an unknown variable. The problem with this is that by relabelling, this method introduces artificial non-linearities, so to cope with this, the non-linearities are linearized.
6
It is worth considering again that even though there is a formal description of a state of a control system, and there is a causal relationship between input and output, the notion of causality is still under-specified, and moreover, the states of the system are still essentially opaque to the observer, since they are estimated from prior histories of the system, and inferred from the observed output signals. 7 Outside of engineering, the Kalman filter has been used as a formal method of describing learning behaviour in humans (Dayan, Kakade, & Montague, 2000).
70
Control systems engineering
Control System Design Given the various details concerning the problems with control systems, in particular the way behaviour in the systems changes and how it can be formally described, the point of engineering is to build from this systems that can operate reliably to produce specific outcomes, which is control systems design. The approach to designing systems tends to follow four broad activities: outline the goals, model the environment, analyse and test the system. Determining the goal of the system involves outlining the desired behaviour that the designer wants the control system to display (e.g., an automatically rotating disc). This will also involve establishing the necessary performance specifications (e.g., how fast should the disc rotate, and for how long?). In broad terms, they include (1) good regulation against disturbances, (2) desirable responses to commands, (3) realistic actuators, (4) low sensitivities and (5) robustness. These will be constrained with respect to the components that are necessary to achieve the goals (e.g., controller, actuator, process and sensor),8 that is, how accurately the goal can be achieved given the components of the system. What is typically the case in design, and is apparent from the discussion of loops and feedback, is that however the system is developed, it will incorporate an element of behaviour that is outside of the designer ’s influence or intention. But there is still a need to develop a model of the system – that is, defining and describing the system using block design (e.g., developing a formal description of the automatic rotating disc from a block
8
It is worth revisiting Dretske’s design problem, and Dennett’s criticism of Dretske in this context, particularly, given the specifications of control systems design. One may ask, do these systems have agency? The question becomes even more relevant when asked with respect to adaptive control systems that are designed to ‘learn’. Moreover, feedback control systems are essentially informationprocessing systems in which feedback gives rise to behaviour that can appear to be intelligent (e.g., neural networks; see Leondes, 1998).
71
Magda Osman design like the one in Figure 3.3). This process requires that the system is describable and predictable.
Coping with Uncertainties in Control Systems Existing control theory is rich in discussions of the dynamics of systems, their stability, and the achievement of optimization. In the development of a resilient control system, its resilience and robustness are decided according to whether it has low sensitivities – that is, that it is stable over a range of parameter variations, that its performance continues to meet the specifications in the presence of a set of changes in the system parameters and that its robustness in light of effects that were not considered in the analysis design phase (e.g., disturbances, measurement noise and unmodelled dynamics) (Ioannou & Sun, 1996; Safonov & Tsao, 1997). However, there are still ongoing problems that control theory faces concerning uncertainties and, based on engineering problems, dynamics, and the flow of information across the system. Take, for instance, the issue of emergent properties. To explore this Mill’s (1884) early distinction between heteropathic and homeopathic effects has a strong bearing on current approaches to emergent properties. Heteropathic effects produced from the combined causes are equal to the sum of the individual effects produced from the causes in isolation. That is, what you get out of the control system is essentially within the bounds of what you put into it. This makes determining effects in the system easy to identify. Homeopathic effects (emergent properties) are the products of causes which are more than the sum of the individual effects produced from the causes in isolation. That is, the control system generates effects that exceed the bounds of what the individual components could produce. In one sense, emergent properties are unpredicted behaviours of the system. They may arise not only from interactions between components of the system, but also as a result of interactions
72
Control systems engineering between its application and the environment (Johnson, 2005). If the aim is to be able to describe a control system accurately, and get it to reliably produce the same effects time and time again, emergent properties are aspects of a control system that present systems engineers with a fundamental problem. The practical way in which they are dealt with depends on how they are referred to in the first place. A control system with these effects functions in a way that is not easy to characterize. Clearly if one accepts that a control system may behave in a way that produces unpredicted outcomes, and that it can be described as homeopathic, then this is a problem, because a control systems engineer will have to accept that the system may be unknowable. This is highly contentious because attempts to model control systems involve simulations of possible scenarios, not unknowable ones. Rather than attributing a control system with emergent properties – as labelled homeopathic effects – for pragmatic reasons it is easier to assume a case that the system is exhibiting behaviours that were not adequately formally described. Another sticky concern is system dynamics. In order to have a functioning control system there is a particular bias in the way that uncertainty is and this also applies to dynamics. To avoid many problems dynamics are characterized it by standard linear equation modelling (Doyle, 1982). However, other types of uncertainties, for instance the way information transmits across a system, still present significant problems to control theory. Coping with uncertainty usually requires that estimations are made about the future state of a system, and this entails devising prior assumptions using probability theory. The issue at hand with different sources of uncertainty is whether to treat them all in the same way either by labelling or translating them into different more manageable problems, or else simply denying their relevance. Yang et al. (2007) discuss dynamical systems with time delays. A great deal of research interests have been generated with respect to various time delay systems, which can be found in chemical systems, or for example when temperature and hydraulic processes
73
Magda Osman are involved. The problems that systems of this kind raise are that they generate a huge amount of instability, and can degrade the performance of the control system. Yang et al. (2007) highlight a further issue concerning control systems of this kind in which there have been attempts to solve these problems. Implicitly assumed is that the measurement of signals sent across the control systems contains the real signal, with some expected probability that the signal is mixed with external disturbances – hence noisy. However, in some situations in control systems this is not true. In fact, as a result of factors such as sensor temporal failure or network transmission delay or loss, there are often occasions in which no real signal is being detected; instead, what is being measured in the signal is just pure noise. Again, this appears to be an example in which control systems can operate effectively, and can be described relatively well, but as engineering techniques advance, so do the issues that they raise concerning where possible uncertainties arise and how they should formally be treated. While a robust control system can be designed well and operate effectively, and our general day-to-day experiences involve many interactions with such systems, they behave according to their performance specifications, but with some degree of uncertainty. This is because the state of the controlled object (e.g., passenger aircraft) may be affected by uncontrolled, external and more or less unpredictable exogenous inputs (e.g., variable weather conditions); desired values for the state of the controlled object are often exogenous and unpredictable (e.g., air traffic control systems tracking a manoeuvring target); and sensors carrying information about current states may also be noisy (Jacobs, 1993), and in some cases may carry only noise (Yang et al., 2007). Specifying the events that need to occur in a system along a given timeline has been a consistent theme running through this chapter. Either a specific time frame is allocated in which to examine the performance of a system, or changes are expected to occur across time which need to be accommodated within the system, or because all responses in the system are in fact transient, the response of a system is as a function of time. For example, urban traffic light 74
Control systems engineering systems (Bate, McDermind, & Nightingale, 2003; Huang & Su, 2009; Papageorgiou et al., 2003) use mathematical techniques (i.e., Petri nets; see Chretienne, 1989) which provide a graphical representation of the temporal properties (i.e., complex sequencing and synchronization) of events that occur in the system. So, for instance, the traffic lights could be set to change according to a specific sequence and fixed timing, or can be set to change in response to traffic flow. The state changes of the control system (in this case, the regulation of traffic lights in a specific urban area) and distributed components will happen often in the course of a specified time duration. Petri nets are able to characterize this behaviour by representing the way in which the events in the system are triggered, and by describing their conditional probabilities. This means that changes in state are represented by the new conditions that hold after an event has occurred. Though there has been an attempt to use this technique to specify exactly when events occur in the system, there are problems with it. For instance, in the traffic example, the aim would be to present a description that specifies a certain known event within a certain time. This then makes it easy to identify what event was expected to occur (i.e., the state transition of the light from amber to green) and when (i.e., always after X seconds after the initiation of the red light). However, there are conditions in which temporal uncertainties arise, and these can be because a certain event can occur but with uncertain timing – that is, the event is known to have taken place but it isn’t clear in the timeline exactly when it happened. The converse can also be true in which the timing can be specified, but it is unclear what event took place. In worst-case conditions, it isn’t clear whether an event took place and, if it did, when it took place (Johnson, 2003). When this happens, the control system isn’t always equipped to detect the misalignments of timings or inappropriate delays or the absence of an event. The role of error detection is left for the human operator, and their job is to intervene when things get out of hand. Clearly, then, event timings are an additional source of uncertainty that in a practical sense are dealt with by humans rather than the control system. 75
Magda Osman One reason for revisiting the issue of uncertainty here is that in later discussions in this book concerning how people interact with control systems (see Chapter 5, ‘Human factors’), uncertainty becomes a crucial factor, in particular with respect to the fault diagnosis in operation of a system. That is, when tracing back the occurrence of events in a control system, and reasoning from effects to cause, particularly when the effects produced in the system were unexpected, assigning ‘fault’ – that is, determining the root cause of the unpredicted outcome in the system – is important. This is the case not only for legal issues (i.e., did the plane crash due to pilot error, or because there was a fault with the plane – in which case the aircraft company may be liable?) but also for system design. If the control system doesn’t operate correctly, and is likely to fail again, then the reason for this needs to be established. To complicate matters further, as has just been discussed, a known event can be stated to occur within the system, but without a precise time allocation, or a known event can be considered as uncertain – because it may be difficult to determine if it occurred if you don’t know when it was supposed to occur. Ideally, in system design, it is important to have available indicators (e.g., flashing lights, flashing buttons, buzzers and alarms) of events having taken place and at the points in which they occur in order to trace the behaviour of the system and that of the operator (e.g., Degani, 2004; Johnson, 2003).
A Little Synthesis We are surrounded by the success of control systems engineering. Methods such as supervisory control9 (e.g., Morse, 1996, 1997; Ramadge & Wonham, 1987, 1989) relieve humans of the burden of 9
Supervisory control is a high-level artificial decision maker within the control system. The supervisor functions as an updating mechanism, and renews the controller (recall, this is a component of the control system that is designed to track the effects that are generated by the system) parameters. For instance, if there are
Control systems engineering dealing with complex and challenging problems. The increasing computational capability of human–machine interface techniques (e.g., Suchman, 1987, 2006), and computer database management (Ramakrishnan & Gehrke, 2003) has revolutionized the application of control theory and design. Digital control systems are used in applications ranging from industrial chemical processes to urban traffic controls. Control systems engineering has led to highly sophisticated formal information-gathering, -sampling and -processing techniques, which have also been used to deepen our understanding of the operations of human cognition10 (see Chapters 5 and 8). Additionally, as techniques like system dynamics methods used to formally describe the changes in control systems improve, they lead to a more accurate characterization of dynamic behaviour in not only engineering control systems but also other types of complex systems (biological, ecological, management and economic) (e.g., Coyle, 1996; Gee & Wang, 2004; Georgantzas & Katsamakas, 2008; Richardson & Pugh, 1981). But this book has a mission of sorts. So, while it is important to highlight the many virtues of engineering, we have a question to
unknown parameters that enter into the system (i.e., any one, or combination, of the uncertainties described in the latter section of this chapter), the controller needs some way of tracking and estimating the process behaviour continuously online over a given time period. In addition, if the control system is set to have low sensitivities while high performance is sought, the system is challenged to adapt itself regularly to continuous changes whose causes may not be known. As a result, the design of adaptive control algorithms involves a complex combination of techniques as well as trial and error. Whenever there is a new estimate of the process parameters, the supervisory control can proceed either by continuously tuning or by estimating events at discrete instants of time. A currently popular method in supervisory control is to switch between discrete and continuous adaptive methods to keep the system in line with its target goal (Feng & Lozano, 1999). 10 For example, Wolpert (Körding & Wolpert, 2006; Wolpert, 1997) has extended engineering optimal control theory (Bryson & Ho, 1975) and the Kalman filter to describe motor control behaviour in humans.
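To make the switching idea in this footnote concrete, the sketch below (Python, with an invented plant, invented controller gains and a deliberately crude estimator) shows a supervisor that estimates an unknown plant gain online and switches between pre-designed controllers accordingly; it is a caricature of switching supervisory control, not the estimation machinery of Feng & Lozano (1999).

import random
random.seed(0)

TRUE_GAIN = 2.3   # unknown to the supervisor; illustrative value only

def plant(state, u):
    return state + TRUE_GAIN * u + random.gauss(0.0, 0.05)   # small unmeasured disturbance

goal, state = 10.0, 0.0
gain_est = 1.0                       # supervisor's running estimate of the unknown gain
candidate_gains = [0.1, 0.25, 0.4]   # pre-designed controllers the supervisor can switch between

for t in range(60):
    error = goal - state
    # Supervisor: pick the candidate that, under the current gain estimate,
    # removes about half of the remaining error per step (a discrete switch).
    active = min(candidate_gains, key=lambda k: abs(k * gain_est - 0.5))
    u = active * error
    new_state = plant(state, u)
    if abs(u) > 0.05:                # only estimate when the input is informative
        observed = (new_state - state) / u
        gain_est = 0.5 * gain_est + 0.5 * observed
    state = new_state

print(round(state, 2), round(gain_est, 2))   # state near the goal of 10.0; gain_est roughly tracks 2.3

Real supervisory control replaces the crude gain estimate and the fixed candidate set with formal estimation and switching logic, but the division of labour between controller and supervisor is the same.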
Magda Osman answer. With regard to engineering, what we are interested in knowing is ‘What are the implications of understanding uncertainty in the context of the target question of this book?’ As mentioned at the start of this chapter, control system engineering has two approaches to the issue of control: a formal one and an applied one. The first point I want to draw attention to is this: we can formally describe how the puppet will behave, because we built the puppet, and despite some of the mysteries of the internal workings of the puppet, on the whole we have found sensible ways of making it do clever things, crucially without having to acknowledge that the mysteries are still there. That is, formal models describing the control system are incomplete. These are often in terms of changes in parameters (e.g., dynamics, time delays, changes in equilibrium, sensor noise and unpredicted disturbances to inputs). What this means is that there are aspects of control systems behaviour that are opaque to us all. They are uncertain. Most of existing theory on control engineering concentrates on symptoms of control systems problems, in the form of the behaviour of feedback systems, to the exclusion of root causes11 in the form of interactions between uncertainty, information transmission and performance of the system. We take it for granted that when interacting with control systems, the behaviour of the system is predictable and can be formally described. While control systems engineering is capable of characterizing a control system and its behaviour, it achieves this following an important pragmatic constraint (Richardson, 1999), and that is to find ways of formally parcelling out uncertainties – some of which have been discussed briefly in this chapter. The second point I will make is this. The puppeteer still has ultimate reign over the puppet, but that’s partly because the puppet isn’t quite at the stage of being able to cut the strings. Even though the ongoing goal of control systems engineering is to develop systems that offer flex11
By which is meant here the root causes of uncertainties, and not the representation of causes of which there are a variety of formal techniques (e.g., block designs, causal Bayes nets [e.g., Russell, Clark, & Mack, 2006] and system dynamics causal loops [Lakatos, Sapundzhiev, & Garside, 2007]).
Control systems engineering ibility and autonomy supplemented by supervisory control, control systems still rely on human supervision (Vicente, 2002; Vicente, Roth, & Murnaw, 2001). This can be with regard to scheduling changes in the system, adapting the system to the constraints of limited resources, monitoring information flow across the system, scheduling the maintenance of the system and coping with failures that may arise (Redmill & Rajan, 1997; Sheridan, 2002). Thus, even with the high level of automation that current control systems provide us, and the continued drive to improve the range of automated functions of control systems, while it is obvious that there have been impressive advancements (e.g., mechatronics and robotics), some (e.g., Summer, 2009) have argued that the underlying causes of failures in control systems and human error have not appreciably changed. The practical aspects of engineering have been pivotal to the development of a movement that has tried to apply control theory to understanding all possible types of systems (i.e., cybernetics), and this is the subject of the next chapter.
Chapter 4
Cybernetics, artificial intelligence and machine learning
Cybernetics, as its founder Wiener (1948) described it, refers to a union of problems of communication, control and statistical mechanics.1 All of these were already concerns of Wiener ’s work on system dynamics in engineering, before he went on to establish these ideas in Cybernetics: Or control and communication in the animal and the machine. Again, as with Chapter 2, this chapter considers the basic operations of the puppet, and how it functions, but in this case the puppet is not just a mechanical structure. As will be made clear in this chapter, the puppet can refer to any type of control system. This is because cybernetics was designed to encompass everything. It presented an argument for why the core problems that are integral to control systems engineering are actually the concern of every 1
Statistical mechanics places probability theory at the heart of explanations of physical concepts (e.g., particle and quantum physics) by using descriptions of classes of behaviours to apply to micro and macro levels of material systems. In early developments of statistical mechanics, this class of behaviours used probability theory to described geometric, dynamical and thermal properties of matter. Controlling Uncertainty: Decision Making and Learning in Complex Worlds, Magda Osman. © 2010 John Wiley & Sons Ltd.
Cybernetics, AI and machine learning type of control system, be it biological, ecological, chemical, economic, organizational, social, industrial or mechanical. This ambitious vision is reason alone for examining cybernetics. Another is that at its core, cybernetics – or systems theory later, as it was coined by Ashby (1956) – set out some general properties for describing complex systems (i.e., feedback, information, causal structures, input–output associations, mechanism, state change, regulation and autonomous behaviour). If we are to edge closer to answering the target question of this book, ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’, then we need to find a way of accurately describing the control system itself. In order to do this, we need to examine what available descriptions there are out there. Control systems engineering provides one type of description, particularly what factors make a control system uncertain; the aim of this chapter is to consider what cybernetics, artificial intelligence (AI) and machine learning have to say on the matter of control systems. Thus, the foremost aim of this chapter is to acquaint the reader with the basics of cybernetics, and to show that the ideas that have been formative in control systems engineering are virtually unbounded in their generalization to other types of systems we can think of in nature and society. Second, cybernetics as a scientific movement may have found that its goals exceeded its grasp, but it still has a bearing on how complex systems are described today, and offers important clues in understanding how we go about explaining the behaviour of systems. Another aim, then, is that this chapter introduces the basic ideas that were expounded by cybernetics to understand how they were later developed by its successors, AI and machine learning. This then advances us further from characterizing control systems in their most prototypical setting – namely, engineering – to having a set of general descriptions of the system. This will also help us to compare the objective descriptions of the system with the tacit assumptions that people have about the workings of systems when interacting with them, which is discussed in the next chapter (Chapter 5, ‘Human factors’). 81
Magda Osman This chapter begins by discussing the general concerns of the cybernetics approach, and the key problems that it attempted to solve (communication, control and statistical mechanics). From this, work on AI and machine learning is introduced. The rationale for this is to help chart the progression of the ideas that shaped the way control systems were described in the 1930s–1950s (i.e., especially uncertainty) and how they are thought of in current thinking in AI and machine learning. This should help us to see how AI and machine learning work has tried to tackle the critical question that this book tries to address.
Cybernetics: Problem of Communication, Control and Statistical Mechanics At the time when Norbert Wiener (1930) was solving communication engineering problems, he was already exploring the analogy between control systems in engineering and regulatory control complex feedback systems in broader domains (e.g., anthropology, biology, ecology, economics, chemistry, physics, physiology and psychology). Wiener ’s main interest was the instability of feedback mechanisms (particularly with respect to feedback loops).2 Wiener ’s early discussion on cybernetics (1948) acknowledges that ‘an extremely important factor in voluntary activity is … feedback’ (p. 6). Wiener claimed that contained in the notion of feedback is all necessary properties of any type of system. That is, information is recursive, and it must be, out of necessity, because it is the mecha-
2
This was a subject that was dominating scientific circles at the time, and though there was a late lag in its emergence in psychology, feedback, and the different effects it produced on the way we make decisions, came to the fore between the 1960s and 1980s, particularly in studies of complex decision making and in multiple cue probability learning tasks (see Chapter 7, ‘Cognitive psychology’). Moreover, block designs and characteristics were incorporated as part of formal descriptions of decision making (Einhorn & Hogarth, 1978).
82
Cybernetics, AI and machine learning nism that tracks change in behaviour which is ultimately designed to regulate a system. So, as a response to the seemingly obvious analogy that engineering control theory offered for understanding complex systems more broadly, Wiener ’s discipline of cybernetics was formed in the late 1940s. There is debate about this, since some historians (e.g., Mindell, 2002) have predated the cybernetics movement to Lewis Mumford’s (1934) seminal work ‘Technics and civilization’, and Harold Hazen’s (1934) work on servo-mechanisms.3,4 Nevertheless, the story of cybernetics is usually told as starting with Wiener (1948), and in actual fact, regardless of who spearheaded the cybernetics movement, the main claim made by all of them was that feedback is instrumental in technological as well as scientific advancements (e.g., as highlighted in Chapter 3, ‘Control systems engineering’). The following discussion examines feedback in connection to three major problems that were thought to be the concern of every control system: the problem of communication, the problem of control and the problem of statistical mechanics.
The Problem of Communication Wiener proposed that transmitting a signal5 in a system requires a sequence of regular measurements over time, or a time series. 3
Servo-mechanisms are a type of automated system that uses an iterative process in which the servo (a human operator and/or device) is used as an error detector and corrector, so that information feeds back into a system to adjust and improve future behaviours. A current example of this is the automatic cruise control in a car. 4 Again, as a further illustration of the formative influence of engineering on psychology, Herbert Simon’s theories on decision making and problem solving, which are recognized as instrumental to modern understanding of these behaviours in economics and psychology, borrowed from theories of servo-mechanisms as analogies of complex human processes (see Simon, 1952). 5 Wiener (see Rosenblueth, Wiener, & Bigelow, 1943) later proposed that transmission could be of any kind of signal sent electrically, mechanically or via the nervous system.
The goal was to make sure that the transmission is smooth and the signal is accurately sent and received. The problem Wiener highlighted is that whatever the system, there is noise attached to the signal, and so to solve the problem of noise, some process is needed to reduce it. Wiener proposed a predictive process that estimated what the signal would be without noise (by smoothing the noise) and compared that with the actual signal at regular measurement intervals. However, the problem was that predictions were based on past behaviour of the signal, which highlighted two issues. First, if the signal was particularly noisy, then there was less of a requirement for sensitivity in the measurement apparatus. This is because wild fluctuations in the noise would be too big to go undetected by a highly sensitive measurement system, and so would lead to poor predictive accuracy. Second, the more sensitive the measurement apparatus was, the better the smoothing of noise, but the greater the likelihood of hypersensitivity. That is, the predictions could become more and more accurate, but small departures from the predictions would create instability from which the system could not recover. To correct for noise in signals, Wiener developed his technique using time series statistical analyses. What Wiener took from this was that any form of communication can suffer from the same concerns (even communication between two people), and so this was not an isolated case in communication engineering. Messages are often contaminated by noise, and interpreting signals carrying information requires some method of statistical analysis, a view which in part was influenced by his contemporaries (Kolmogoroff, 1941; Shannon & Weaver, 1948). While they were concerned with how information was coded, Wiener was of course interested in how it could be used to make predictions over time.
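The sensitivity trade-off Wiener identified can be illustrated with a deliberately simple smoother. The sketch below, in Python, uses an exponential smoother with a responsiveness parameter of my own choosing as a stand-in for Wiener's actual time-series technique: a highly responsive setting follows every noisy fluctuation, while a sluggish setting smooths the noise but lags behind genuine change in the signal.

import random

def smooth_and_predict(observations, responsiveness):
    # One-step-ahead prediction by exponential smoothing.
    # responsiveness near 1.0 = highly sensitive apparatus (follows the noise);
    # responsiveness near 0.0 = heavy smoothing (slow to follow real change).
    estimate = observations[0]
    errors = []
    for obs in observations[1:]:
        errors.append(abs(obs - estimate))             # prediction error at this interval
        estimate += responsiveness * (obs - estimate)  # update the estimate toward the observation
    return sum(errors) / len(errors)

random.seed(1)
true_signal = [0.0] * 50 + [5.0] * 50                  # a signal that changes once, halfway through
noisy = [s + random.gauss(0.0, 1.0) for s in true_signal]

for responsiveness in (0.9, 0.3, 0.05):
    print(responsiveness, round(smooth_and_predict(noisy, responsiveness), 2))

Which setting wins depends on how large the noise is relative to how often the real signal changes, which was precisely Wiener's point about matching the statistics of the predictor to the statistics of the message.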
Problem of Control How we use information is the problem of control, particularly in error-corrective behaviours in systems. Having established that, in 84
Cybernetics, AI and machine learning general complex systems involve the manipulation of information; this borrows from the main tenet of control theory in engineering, which is that the goal of any system is to become self-regulating. This means that it can achieve the same desired outcome autonomously. The problem of control, then, is essentially finding solutions to two joint issues: how to manage future events, and how to manage feedback. Uncertainty of the behaviour of a system was identified as not knowing what its future behaviour will be, which is how it will continue to behave in the future based on the influences of its own internal dynamic properties and the influences of the environment in which it exists. The other problem concerns feedback, particularly negative feedback – which was the feedback of choice in self-regulatory systems. How it is incorporated in a control system presents challenges to the flow of information across the system. The problems are related, because feedback is treated as a mechanism that is designed to generate information and make predictions, but exerting control requires knowing in advance what will happen in order to predict future events – hence why these two problems of control are connected, though in actual fact one could argue that they reduce to the same problem, which is controlling uncertainty. To illustrate the problem of control, Wiener refers to two examples: (1) an aircraft following a flight plan, and (2) the movements of a hand holding a pencil tracing a line across a page. He argues that the degree to which we are conscious of the exact muscle movements involved in directing the pencil across the page is a matter of debate (see Chapter 2 for a detailed discussion of this). While we may not be entirely sure how we realize this in terms of biological and neurological processes, inherent to the control of our movements is a mechanism that regulates information by taking some measure of what we have done, what we have not yet done and what we then need to do to accomplish our task. The output of the system (i.e., the arm movement) is subtracted from the input (the desired goal of tracing the whole line across the page), and it is this difference that is used by the system as the new input (i.e., negative feedback). This process cycles until the difference between 85
Magda Osman goal and outcome is reduced until the goal is finally achieved. But this can only be done if the system can successfully predict what behaviours are necessary from each round of negative feedback in order to advance towards the goal. The flight example that Wiener uses is a little more thorough in its analysis, in part because of his engineering expertise. The negative feedback problem is that much more complex in this example for the reason that the information that the system is drawing from has different histories of prior behaviour (i.e., immediate past, longterm past of the flight of the aircraft and long-term past of the flight of other aircraft). In turn, the predictive process used to implement the appropriate behaviours to advance the aircraft along its flight plan has to predict multiple state changes that will change over time, as well as those steady states that need to be maintained. Usually aircraft have transfer functions that vary widely with different conditions (airspeed, altitude, fuel load, atmospheric conditions etc.). The different histories of prior behaviour are required by the system because it has to learn to adapt its behaviour based on a record of past behaviour (i.e., experience). But also, advance estimates of behaviour are needed to progress the aircraft across its flight path, which is why it isn’t easy to predict the state of the system based only on negative feedback to the system. Therefore, a combination of building on existing information through feedback (that can often be non-linear) and extrapolating new states of the system from that by using a linear operation to predict the future is used to control such complex systems. This is why in one way or another, uncertainty is the base root of the problem of control, as I’d argue is the case with the problem of communication. The solutions offered through Wiener ’s work are based on utilizing feedback and developing accurate predictive techniques. Similarly, Wiener proposes that natural self-regulatory mechanisms (i.e., the biological example) solve the problem of control in the same way – using a combination of negative feedback and extrapolation from that to predict. Feedback always works in opposition to what the system is presently doing, because it is always revealing the gap between what it 86
Cybernetics, AI and machine learning is doing and what it ought to be doing. Wiener proposed that automatized systems such as electronic and mechanical control systems use negative feedback along with a host of other different types (e.g., non-linear, mechanical and positive). But, there is a different class of feedbacks that are particular to voluntary control behaviours (i.e., like those typically found in animals and humans), of which negative feedback is a case in point. Biological systems, particularly humans, rely on this type of feedback because of the self-corrective mechanisms we need in order to control our complex functioning. Moreover, we have a well-developed anticipatory mechanism that we use to predict future states of our environment, and it enables us to plan far ahead of our current situation. Wiener hailed this particular human virtue. Anticipatory feedback demonstrates what we are able to do so efficiently, and that is not to judge the difference between, say, where the ball is in the air and where we are now, but where the ball will be and where we need to go. Machines couldn’t do this; we can project into the future because anticipation is what gives us the edge. Wiener was inspired by the types of feedback mechanisms that he observed in nature, and one of his proposals through cybernetics was to understand and utilize natural feedback mechanisms in machine design in order to overcome the problem of control. One of the most important of these is informative feedback.6 Again, Wiener looks to an example of our ability to control to demonstrate the flexibility that this type of feedback offers us – that is, driving under slippery or icy road conditions. As he notes, we are able to use knowledge of the road conditions and how the driving vehicle performs, and we can adjust the wheels quickly using our own kinesthetics to control the car and maintain relatively stable behaviour of the system. Thus, encapsulated in this example is the elegance of a biological control system that uses information online to make multiple adjustments in behaviour in order to control a mechanical control system. 6
This has in fact been incorporated into engineering design (see the Kalman filter in Chapter 3).
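The negative-feedback cycle described above, in which the output is subtracted from the desired goal and the difference is fed back as the new input, can be written down in a few lines. This is a generic proportional feedback sketch in Python with made-up numbers, not a model of any particular system Wiener analysed.

# Negative feedback: the difference between goal and output becomes the next input.
goal = 100.0          # e.g., the end of the line being traced
position = 0.0        # current output of the system
gain = 0.3            # how strongly the system responds to the remaining difference

step = 0
while abs(goal - position) > 0.5:     # cycle until the difference is negligible
    error = goal - position           # feedback: what has not yet been done
    position += gain * error          # corrective action proportional to the error
    step += 1

print(step, round(position, 1))       # converges on the goal after a handful of cycles

Anticipatory feedback, in Wiener's sense, would add a prediction of where the goal (or the ball) will be, rather than reacting only to the current difference.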
Problem of Statistical Mechanics Wiener proposed that systems deal with information,7 be it in recording, storing or transmitting it, which, in turn, serves as an index of the level of organization of the system. If the system is operating reliably, then the flow of information is systematic and the signal should be relatively clear. Informationally, then, Wiener argued that predicting the flow of information across a system in time reveals an asymmetry between past to present and present to future, suggesting that, given the way information proceeds, systems are directional in time. What does this mean? Control systems don’t conform to Newtonian mechanics. If you run a complex system forward in time and backwards in time, the system won’t always behave the same way. The reason for this is two fundamental features of control systems: system dynamics and feedback.8 Of the many requirements of a complex system, dealing with information involves a memory and a capacity to adapt (even a rotating disc; see Chapter 3). That is, a system receives information that doesn’t need to be used at the time it is received; instead, it can be stored, thus delaying its applicability until the requirements of the system change such that the information is necessary. To complement this, the way in which a system operates leaves scope for it to change and adjust, again for the reason that based on the information that it receives it can respond to new extraneous influences on the running of the system. Moreover, as with most aspects of control systems, what looms in the background is uncertainty. There is no method by which we can take regular measurements that will provide sufficient information about the system’s past behaviours to enable a complete description of its future behaviour. 7
Information itself is viewed in terms of choices, so a unit of information is a choice between two equally probable alternatives. 8 The account offered by statistical mechanics concerning the asymmetry in time of physical processes also plays an important role in philosophy, particularly with respect to causality, because understanding the alleged asymmetries of time is informative of causation, and vice versa.
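The definition in footnote 7, a unit of information as a choice between two equally probable alternatives, is the Shannon bit; a minimal calculation (assuming base-2 logarithms, as is standard) makes the point.

import math

def information_bits(probabilities):
    # Shannon entropy: the average number of binary choices needed to resolve the uncertainty.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(information_bits([0.5, 0.5]))    # two equally probable alternatives -> 1.0 bit
print(information_bits([0.9, 0.1]))    # a near-certain outcome carries less information (about 0.47 bits)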
Cybernetics, AI and machine learning In both these cases, information from the past informs the present actions taken in the system. Adjustments then will influence future behaviours. If systems operate to maintain a goal, and through feedback are using information to improve their efficiency in achieving and maintaining their goal, then the information of the systems is not constant over time.
Artificial Intelligence There are many issues that are raised by this domain, not least is whether machines can actually be ascribed intelligence, but also what we should take as evidence of intelligence in the first place9 (i.e., is machine learning intelligence, and can it stand up to the kind of learning that humans and animals display?). There are many dedicated research articles and books on this area (e.g., Bryant, 1991; Fayyad et al., 1996; Schaffer et al., 2007; Searle, 1980; Turing, 1950). The focus of the discussion here will be to consider an aspect of AI that is of critical concern to any adaptive system (artificial or 9
As Turing (1950) had considered in his thesis on machinery and intelligence, there are at least nine main objections to AI. Of those, there are three that have attracted the most attention. One of these refers to the fact that there are some properties of being human that cannot be replicated by machines. A version of this objection is Searle’s (1980) Chinese room argument, which challenges the possibility that a machine could demonstrate agency and understanding, or, for that matter, consciousness. Put simply, Searle proposed that the machine is analogous to a man (a non-Chinese-speaking individual) in a room who receives messages in Chinese and responds to them by following a set of instructions. He can translate and respond, but without understanding Chinese. The issue at hand is that semantics cannot simply emerge from syntax: the computer, like the man, is implementing rules, not comprehending them. Another objection is that we fail to take into account the consequences of machines possessing artificial intelligence, consequences which we cannot fully comprehend or find ways to control. This was a comment that Wiener (1948) had made as a warning that advances in technology will exceed the time in which we have to evaluate their consequences. The other main objection is a mathematically based one. Some (e.g., Kleene, 1952; Lucas, 1961; Putnam, 1960; Smart, 1961) have used Gödel’s (1931) Incompleteness Theorems to disprove the
Magda Osman otherwise), and that is ‘How do AI systems deal with multiple sources of information that can generate uncertainty, and how is uncertainty controlled?’ Examining these issues also presents us with insights into our own ability to manage multiple sources of information that can contribute to states of uncertainty. As Aitkenhead and McDonald (2006) suggest, the benefits of AI are far reaching in commercial sectors (e.g., weather prediction, stock market prediction, regulating subway systems, space exploration, factory quality control and flight navigation) and industrial sectors (e.g., car manufacture, chemical and material waste plants and nuclear power plants). Moreover, AI is also treated as a way of informing psychological theories of how humans and animals learn and manage uncertainty in their environment (e.g., Aitkenhead and McDonald, 2006; Weng et al., 2001).
What Does AI Aim to Do? The aims of AI can be seen as some of the most difficult and complex pursuits of science. The very basic idea is to develop an intelligent system that must possess sensory and motor capabilities in order to predict and control the environment, and have some way of processing the information it receives in order to learn to adapt to changes in its environment. The difficulty comes in finding a way of combining the goals (i.e., adaption, prediction and control) with the different information the system is operating with (i.e., sensory fact that there can be artificial intelligence (for discussion, see Bryant, 1991). Gödel proposed that for any powerful logical system with internal consistency, there are statements that within the system can be proved neither true nor false, and this makes the system incomplete. Therefore, statements like ‘This sentence is false’ (e.g., a paradox) have an indeterminate status within the logical system, but outside of that, given another formal logic they can be true. This then is extended to machines, which also rely on a logical system, and so they can only be consistent, but not complete, unlike humans who demonstrate inconsistency. Therefore, the analogy between mind and machine is not an appropriate one to make, and so machines cannot operate in the same way as human minds do.
Cybernetics, AI and machine learning data from different modalities). The capacity that machines exhibit in extracting and utilizing information has been the focus of machine learning (which is discussed later in this chapter). However, the way complex devices function and offer adaptive control has been the focus of study in engineering (see Chapter 3 on engineering), particularly work in mechatronics (e.g., Rao, 1996) and robotics (e.g., Agre & Horswill, 1997; Ilg & Berns, 1995). Moreover, many of the technical problems that are faced in developing control systems in engineering and early cybernetics (i.e., feedback, dynamics) are common to AI. Therefore, it is no surprise that underlying the issues in AI research are the same ones that underpin cybernetics, because both endeavors are concerned with devising optimal use of sensory information that is attuned to the dynamics of the environment for the purposes of automated adaptive control. Again, taking a reductivist approach as mentioned before, I’d argue that the issues raised here all come down to two things: uncertainty and how it is managed. All AI systems are informationprocessing systems, and so one of the most significant issues is ‘How does a system (machine/robot) cope with imperfect information, and how does it cope when the correctness of the information available is in doubt?’
Barriers to Perfect AI Systems Perfect information denotes that which contains a complete approximation of the environment. That is, the control system is the approximation of the environment (i.e., the context in which its effects are produced), and the control system is sound and complete if the information it contains of the environment is only true.10 This 10
In a way, this issue runs parallel to the relationship between epistemology (having a theory of the means by which knowledge can be discovered) and ontology (having a theory as to the world’s general structure). One can’t function without one or the other, because a theory of knowledge and a theory of knowledge representation work in concert – assuming that there are things that can be known, and there are kinds of knowledge to represent.
Magda Osman poses a foundational problem, since no system (artificial or otherwise) can meet this standard of perfection. So, to cope with this, systems operate pragmatically by ignoring imperfections (Motro & Smets, 1997). This pragmatic strategy should seem familiar. Discussed in the previous chapter were many formal ways of making intractable problems manageable, and similarly, AI adopts similar practices. The consequences of this are that when the system has to respond in light of these imperfections in information, its responses may exclude correct information, or include incorrect information. To avoid this, attempts to reduce uncertainty involve making the approximation of the system to the environment as close as possible (i.e., reducing the score of imperfections). As well as this, the aim is also to develop systems that can represent imprecise information (e.g., the sensor cannot precisely detect the temperature because of environmental distortions and so can only give a range from 40° to 90° Celsius) and uncertain information (e.g., the sensor cannot precisely detect the temperature because of environmental distortions and has to estimate the temperature that it is 75° Celsius with 0.85 probability) and assign quality to the information it represents (e.g., the sensor cannot precisely detect the temperature because of environmental distortions, and so estimates that it is accurately recording the temperature with 0.80 probability).11 Probability here represents imprecision and uncertainty,12 and there is often a tradeoff between the two. That is, the system can be more precise but less certain (e.g., it is 20 per cent likely that there will be a storm in the South Pacific), or more certain and less precise (there is 80 per cent chance of rain over the week).
11
This is by no means exhaustive; others (e.g., Smets, 1997) have specified in great detail categories of imprecise (error full, error free), inconsistent (conflicting, incoherent) and uncertain (objective, subjective) types of information. 12 For example, in a probability distribution a range (e.g., 40–90° Celsius) which represents imprecision of the sensor can also have a probability attached to it, which represents how likely the value of a temperature reading from the sensor is likely to be 60° Celsius, which would represent uncertainty.
Intelligent control systems incorporate formal descriptions or models (fuzzy set theory, possibility theory, relative frequency theory and Bayesian subjective probability theory) to cope with uncertainties. Given the particular goal that the intelligent control system operates under (e.g., if the temperature reaches 75° Celsius, then move the robotic hand away from the object), it will almost certainly have to act on information that is imprecise, uncertain or of poor quality. The problem that AI machines face is that they still need to utilize whatever information is received, by classifying and retrieving it, as well as by making inferences and decisions from it – and for this, machine-learning principles need to be understood.
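A toy version of the robotic-hand rule shows how such a system might act on an uncertain reading. This is a sketch in Python; the confidence values, the handling of an unreliable sensor and the decision threshold are invented for illustration and are not drawn from any of the cited formalisms.

def should_withdraw(reading_celsius, confidence, threshold=75.0, required_belief=0.6):
    # Decide whether to move the hand away, given a sensor estimate and the
    # probability that the estimate is accurate. Crudely: if the sensor is wrong,
    # treat the temperature as equally likely to be above or below the threshold.
    belief_too_hot = confidence if reading_celsius >= threshold else 0.0
    belief_too_hot += (1.0 - confidence) * 0.5
    return belief_too_hot >= required_belief

print(should_withdraw(78.0, confidence=0.85))   # True: probably too hot, so act
print(should_withdraw(78.0, confidence=0.10))   # False: the reading is too unreliable to act on
print(should_withdraw(60.0, confidence=0.85))   # False: probably safe

A fuzzy or Bayesian treatment would refine how the belief is computed, but the shape of the decision, acting once an uncertain quantity crosses a goal-relevant threshold, is the same.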
Early Ideas of Machine Learning and AI To cross the bridge between AI and machine learning, we need to step back and think about cybernetics again. The cybernetics description of systems follows a Humean perspective on causality (Bryant, 1991). There are no essential connections between system states which transform into one another; it is contiguity that describes the relationship between states, not causality. Moreover, as a student of Russell, it is no surprise that Wiener adopted this view. However, Wiener saw how causality could be translated as a construct into dynamics, because the seeming transformation of one event to another in time appears to be dependent on discrete shifts along a timeline, from a static state to a different state, and therefore embedded in these temporal contiguous events is a dynamical process – which invites the idea of causality. It is unclear from Wiener ’s work whether dynamics should be treated as some sort of reflection of what characterizes the necessary connection between sequences of events in a system. Perhaps instead, causality should just remain a term that only really has psychological potency, and that is needed to understand contiguous events as cause–effect relations. What is clear is that once the issue of causality was transformed into the
Magda Osman dynamics of a system, Wiener could discuss it in terms of feedback mechanisms. Again, the system he was referring to could be mechanical, biological or physical. The significance of this position is its bearing on the relationship drawn between mind and machine. Wiener (1948) was forward thinking in predicting the development of machine learning as a new scientific domain, and particularly in drawing the analogy of information storage in early computing machines (Turing, 1936) with changing thresholds of neuronal activity. He examined the relationship between mental state and mental process, which he saw as equally applicable to computational machines and brains. He proposed that a stimulus is sensed, and an internal mechanism uses this as information to then generate an appropriate response. There is temporal contiguity of the stimulus and union of patterns of internal behaviour. This is revealed by the patterns of actions that are generated in the very dynamic process that he was referring to, which to all intents and purposes appears as causal. To return to the topical issue of memory and learning, machines can be designed to operate by recording information (even if that information carries imprecision) from sensors which is stored and used at a later stage in the operations of the system, particularly when it can be used to return the system to its steady state. In addition, machines are designed to cycle through the same behaviour many times over, so often a permanent record of information isn’t needed. If it is merely repeating the same behaviour, it doesn’t need to operate with reference to each recorded item of information stored. For example, a rotating disc may need to adjust its rate of rotation in order to remain stable for a given time period, but need not retain the information received from its sensors which feed back into the system to keep it stable for future rotations of the disc. However, AI and machine learning are not only concerned with systems that successfully replicate behaviours, but also concerned with control systems that adapt, just like humans do. Humans don’t clear out information stored between computational runs; rather, the record of information is retained. This is because we do not 94
Cybernetics, AI and machine learning merely repeat the same successive behaviours. We tend to adapt our behaviours for the future. In part, we do this because we are self-regulatory systems and this just so happens to be a characteristic of such systems. We also do this because our environment is changing, thus requiring us to adjust our behaviours. Finally, we also do this because we set the goals we want to achieve, and we can change them as we go along. Wiener ’s cautionary position on the issue of adaptable behaviours was that computational machines have a memory, and once they have this they can use this to learn, and therefore adapt, and if they can do that, then they change their behaviour. At the time he voiced this concern, there were significant technological advances. His worry was that the computational speed of modern digital machines could be dangerous because the technological advances would occur faster than we would have time to properly consider and evaluate them. Clearly, the question he was raising is ‘How much control can we have over machines if they are capable of adapting?’ We revisit this issue in the next chapter.
Machine Learning While Wiener ’s work launched cybernetics, Ashby’s (1945, 1947, 1956) ideas in his systems theory set much of the groundwork for describing machine learning. Moreover, it is perhaps because of his training in biology and psychology that the connections between machine learning, neurology and genetics that he drew in this theory continue to influence current work on machine learning today. Since Ashby’s work in cybernetics, machine learning works towards developing computer systems that can use experience to improve performance autonomously (i.e., learn). If computers have the capability to learn, then systems (e.g., robots) can be designed to interact with a variety of different environments and learn to solve novel and challenging problems, much in the same way as animals and humans do (Mitchell et al., 1990). Thus, machine learning also faces the question ‘How does a system (machine) learn 95
Magda Osman about, and control online, an uncertain environment that may be changing as a consequence of its decisions and actions, or autonomously, or both?’ So, let us consider what Ashby’s contribution to machine learning is.
Ashby’s systems theory
Ashby’s main tenet is this: an autonomous control system’s state changes are deterministic. Given a complete set of values of the variables in the system in its initial state, combined with a description of the environment the system is interacting with, and the structural relationships between the variables in the system, future states can be predicted. That is, there are no emergent properties of the system, and all states can be accounted for given the correct formal description of them (for discussion of emergent properties, see Chapter 3 on engineering). Though this does not entail that all systems are entirely predictable, this position implies that there are operations inherent to any system for it to behave, and given these and the purpose of operation (i.e., the desired goals given the characteristics of the environment in which the goal is to be achieved), it is possible to provide a description, albeit a restricted one, of the operations and future behaviours of any complex system.
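Ashby's determinism claim is easy to state computationally: if the update rule, the initial values and the environmental inputs are fully specified, the trajectory is fixed. The sketch below (Python, with an arbitrary made-up update rule) simply demonstrates that two runs from the same description give identical futures; it illustrates the claim rather than any system Ashby himself analysed.

def next_state(state, env_input):
    # An arbitrary but fully specified structural relationship between the variables.
    temperature, activity = state
    temperature = 0.8 * temperature + 0.2 * env_input
    activity = 1.0 if temperature > 20.0 else 0.0
    return (temperature, activity)

def run(initial_state, env_inputs):
    state, trajectory = initial_state, []
    for e in env_inputs:
        state = next_state(state, e)
        trajectory.append(state)
    return trajectory

environment = [15, 18, 25, 30, 22, 19]
print(run((10.0, 0.0), environment) == run((10.0, 0.0), environment))  # True: no emergent surprises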
Learning: adaptation as feedback The themes discussed in Wiener ’s work are shared by Ashby, in particular feedback, exchange of information and the role that dynamics play in encapsulating the relationship between states of a system. They did, however, differ with respect to ideas on learning. For Ashby, in order for a system to learn, dynamics serve two purposes. One is that there are functional rules13 that change the 13
These are formal descriptions of the means by which the relationship between two variables is conditional, i.e., that C is conditional on A which is a value or state (for further details, see Ashby, 1947).
Cybernetics, AI and machine learning state of the system (i.e., the internal mechanics of the system), and the other is that the same functional rules change the behaviour of the system in such a way that positive changes are retained (i.e., the adaptive properties of the system). The second of these two forms of dynamics is an early version of reinforcement learning, which will be discussed in more detailed later in this chapter. Crucially, embedded in this idea is feedback, because without information feeding back into a system, it cannot learn about the effects on the environment resulting from state changes generated, and in turn cannot benefit from the positive changes that have occurred. The fact that a system can incrementally improve its behaviour through this adaptive process suggests intelligence, though this was not an attribute that Ashby ascribed to machines any more than he wished to ascribe consciousness to them. Instead, in his view, the state changes in a machine lead to changes in the patterns of behaviour of the system, which was as reductive a description as he could give without positing any unaccounted for behaviours such as intelligence and consciousness. What follows next are the specific ideas that Ashby had concerning learning: learning and thresholds, the Law of Requisite Variety, and learning and memory. Learning and thresholds Ashby’s idea of critical states is also central to adaptive learning systems, because it introduces the notion of thresholds, which is now the mainstay of most current theories of machine learning (Kotsiantis, Zahaakis, & Pintelas, 2006). The idea is this. Certain variables within the system are bounded to a set range of values, which in some cases can be highly restrictive (e.g., our homeostatic system means that for healthy functioning of our body, the temperature must remain at 37° Celsius, and a deviation from this value by just one-half degree requires serious adjustments of the system). A component of the system is needed to monitor (i.e., make internal observations of the state of the system at regular intervals) changes in value of these critical variables, and if they go beyond the threshold then this leads to a critical state – that is, it forces a change in the behaviour of the system, which is 97
Magda Osman needed to return the system back to a stable state. The critical variables that are required by the system to keep within a range can exceed their thresholds in response to positive as well as negative changes in the environment. The Law of Requisite Variety From these ideas, Ashby (1956) proposed the Law of Requisite Variety, which is still adopted in a diverse range of disciplines that fall under the cybernetics umbrella (e.g., informatics, human–computer interaction [HCI], systems design, communications systems, information systems, machine learning, management and sociotechnical systems) (Heylighen & Joslyn, 2001; Love & Cooper, 2008), and has been formally described within engineering systems design (Casti, 1985). Essentially, variety refers to whatever aspect of the system has the capacity to change. The properties of a system that can change include information, organizational structure, systems processes, activities, inputs, functions, control mechanisms and, of course, states. The fact that variety is a reflection of the system’s change over time is also an indication of the system’s dynamics and adaptive capacity. The success of the system in regulating itself is dependent on a control-theoretic version of the Second Law of Thermodynamics, that is, the Law of Requisite Variety: ‘the capacity of R as a regulator cannot exceed R’s capacity as a channel of communication’ (Conant & Ashby, 1970). What this means is that, in order for all of the variety (aspects of changes in a system) present in a system that can be communicated across it, the method of communication that connects all aspects of the system must itself have the capacity to transmit the full variety of the system. That is, the full range of changes that can occur in a control system need to be represented and communicated. Learning and memory One aspect of Ashby’s theory (1947, 1956) that is controversial is his position on memory14 (Bowker & Chou, 14
Ashby (1947) presented the case that, in actual fact, there are many aspects of a system’s characteristics that we would ascribe to the system which appear to be
Cybernetics, AI and machine learning 2009). It poses a challenge to the commonly held view that it is a store of information, and this gets retrieved given some function that recognizes that the information is required by the system. Rather, Ashby proposed that memory as a concept as viewed from an observer ’s standpoint of a system is a behaviour that connects information about past behaviours of the system with the present task situation; it marries the adaptive mechanism of the machine to the environment it interacts with. There are three important aspects that follow from this claim. One is that, as an observer examining the behaviour of a system, we cannot have complete access to the current state of a system (e.g., a laptop computer running a series of calculations); there are variables hidden from view (i.e., hidden variables). We use the present as a basis to chart the behaviour of the system for a given period, which becomes its history. The fact that the system repeats the same behaviour across this time period suggests that it has a memory of its behaviour, but there is no reason to attribute memory to the system. This process of observation, along with the attribution of memory, is a means of gathering information about the way a system advances from its initial state to the next state. In drawing the parallel between animals and artificial systems, the organization of the machine isn’t defended according to intelligent design, as with the case of biological systems, but rather according to the conditionality between the parts of the system and the regularity of its behaviour. This observation merely reveals the ignorance of the observer, since if the observer had complete knowledge of the system and the environment, then there is no need to attribute memory to a system – we can predict how the system will behave based on complete knowledge of it and the features of the environment it responds to. The third aspect of the argument is that memory is in fact information about the interaction between machine (man, necessary (including its organization, and the interconnectedness of its components) but which have little grounds for support in absolutist terms. Instead, the conditions of the environment in which the characteristics benefit the system need to be detailed in order to argue that they serve an adaptive function.
or any other complex system) and the environment that produces its state changes. In sum, Ashby’s systems theory proposed that in any given system that is autonomous, complex and adaptive, a central role is played by the notion of variety. That is, variety as considered with respect to adaptive systems (machine learning) is used to explain the way in which the system regulates itself (i.e., control). Additionally, the process of learning is Markovian (or memory-less); in other words, if, in the system, the probability of the change from present state (A) to future state (B) is constant, then it is rule obeying, and if the future state is conditional only on the present state, then the relationship between present and future states can be described as a conditional probability P(B|A). No history of past events is needed to determine the next state, only the present state of the system. Many of these principles are still applied to current research in machine learning.
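The memoryless property can be made concrete with a small transition table (a Python sketch with invented states and probabilities): the next state is sampled from a distribution conditioned only on the present state, so the rest of the history never enters the calculation.

import random
random.seed(2)

# P(next state | present state): each row is a conditional distribution.
transitions = {
    "stable":    {"stable": 0.90, "perturbed": 0.10},
    "perturbed": {"stable": 0.70, "perturbed": 0.30},
}

def step(present):
    r, cumulative = random.random(), 0.0
    for nxt, p in transitions[present].items():
        cumulative += p
        if r < cumulative:
            return nxt
    return nxt   # guard against floating-point rounding

state, history = "stable", []
for _ in range(20):
    state = step(state)        # only the present state is consulted, never the history
    history.append(state)
print(history.count("stable"), "of 20 steps spent in the stable state")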
Current Work in Machine Learning There are several outstanding problems that machine learning is currently concerned with, of which the two most prominent are as follows: how does a machine generalize from the information (data set) it receives about a certain environment to a new environment (Mitchell, 2006; Mitchell et al., 1990), and how can a machine cope with uncertainty or noise15 in information (Kotsiantis et al., 2006)? Other more practical concerns are problems such as scaling, that is, as the information that the machine draws from to learn increases in size, the speed with which it can process it decreases. Machine learning tried to find ways of matching the increase in processing demand with the information received in the system. Also, rule induction from information is based on static rather than dynamic 15
Noise here can refer to several different aspects of the data, including incomplete data, incorrect data and irrelevant data.
Cybernetics, AI and machine learning representation. This is a problem because this means that information is received as a series of stilled snapshots which are used to learn from, whereas in actual fact, in real-world situations the information flow is continuous and dynamic. Although these are ongoing issues, machine learning has made significant advances since the 1980s in which it first became popular. Its applications are broad, and the capabilities that have been demonstrated are vast (e.g., speech recognition, computer vision, robotic control and data mining). It has also helped psychology in understanding how humans are able to identify and recognize speech patterns, or orientate their attention across dynamic visual scenes. While machine learning has informed many disciplines (e.g., psychology, biology and neuroscience), it has often looked towards the domain of statistics for solutions to critical problems. One reason for this is that statistical methods provide canonical inferences from data, and machine learning can incorporate them to develop algorithms to efficiently encode, store, retrieve and combine the information for adaptive purposes (Goldberg & Holland, 1988; Mitchell, 2006). This raises the point that rather than mimicking the way humans and animals adapt to and control their environment, an altogether different way of conceptualizing information processing is needed for machines to do the same.
Models of Learning There are various models that describe how the machine behaves in terms of the way it uses information; in other words, given a set of labelled16 data {[xi, yi]}, where xi refers to the inputs and yi refers
Most learning models handle labelled data, which involves some level of preprocessing of the information before learning algorithms are applied to it. However, there are also algorithms that have been developed to work from unlabelled data and that they make appropriate assumptions about the information they receive in order to classify it (see Mitchell, 2006).
Magda Osman to the outputs, the machine learns the function ( f ) that transforms xi into yi, that is, yi = f (xi). For instance, this could refer to a pattern of sounds (xi) and the label refers to the word they may represent ( yi ), and there is a process of recognizing those sounds as a particular word ( f ). The algorithm that describes the function could be a genetic algorithm, Bayesian learning algorithm, logic-based algorithm, or neural network, where some (e.g., Bayesian learning algorithms [i.e., Q-learning], genetic algorithms and neural networks) attempt to use the analogy of biological mechanisms to inform the way they ought to operate.
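Before turning to the specific families of algorithms, the shared setup, learning a function f that maps labelled inputs xi to outputs yi, can be sketched with the simplest possible learner. The example below is a nearest-neighbour rule written from scratch in Python; the data are invented and the method is chosen for brevity, not because the book singles it out.

def learn(training_pairs):
    # Return a function f that maps a new input x to a predicted label y,
    # here by copying the label of the closest training input.
    def f(x):
        nearest_x, nearest_y = min(training_pairs,
                                   key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return f

# Labelled data {(xi, yi)}: e.g., a one-dimensional acoustic feature and the word it signals.
training = [(1.1, "yes"), (1.3, "yes"), (4.8, "no"), (5.2, "no")]
f = learn(training)
print(f(1.0), f(5.0))   # -> yes no : the learned mapping generalizes to new inputs

The algorithm families discussed below differ mainly in how they represent f and how they search for it.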
Logic-based algorithms Logic-based algorithms translate the information into decision trees. That is, a classifier sorts the information into examples that belong to either one category or another, and then a further set of classifiers divides up the categories by specifying more precise details about the assignment of information into more specialized features. The problem comes in finding a sensible way of dividing up the information according to appropriate categories, and this applies to every level of the decision tree in which the categories split off into further more refined categories. Decision trees represent the process of learning, because hypotheses need to be developed to construct the decision tree, as well as to use the information to make accurate predictions (Elomaa & Rousu, 1999). What makes the algorithms logic based is that they use logical operators (e.g., conjunction, disjunction, negation and conditionals) to make category assignments (e.g., if all instances at time t have feature x, then assign them to category A, or else assign them to category B). As new examples are encountered by the system, some sort of linear function is used to relate new instances to previously encountered ones (Gama & Brazdil, 1999; Quinlan, 1993). Alternatively, instead of using decision trees, the same classification system that organizes information hierarchically can be converted into a set of decision rules. Here, learning represents finding the most concise set of rules that can make reliable pre102
predictions (e.g., Cohen, 1995). This is usually done through a process of growing and pruning, which means first generating a wide set of rules, and then reducing the set to the most essential rules that can meet the intended goal (e.g., recognize a set of features of an image as a face, even when the features are moving). Fuzzy set theory (Yager, 1991; Zadeh, 1965) provides an alternative to probabilistic algorithms in which problematic information is classified into one category or another by a matter of degree between 0 and 1. To represent this in logical terms, fuzzy set theory introduces quantities (e.g., several or few), fuzzy probability (e.g., likely), fuzzy quantifiers (e.g., most), fuzzy predicates (e.g., tall) and linguistic hedges (e.g., very); it takes vague information and, based on the value assigned to it (in terms of logical properties), logical operations can then be performed. In this case, the main difference between fuzzy set and probabilistic algorithms is that in the former case an instance can be precisely defined (according to logical properties) but the set it should be assigned to is imprecise. Probability concerns taking a yet undefined instance and assigning it to a well-specified population.
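The degree-of-membership idea can be illustrated with a short sketch. The membership function and its thresholds below are invented for the example; the 'very' hedge as the square of a membership value is a standard fuzzy set convention.

```python
# Minimal sketch (illustrative thresholds): fuzzy membership as a matter of degree.
# The predicate 'tall' is modelled by a membership function returning a value in
# [0, 1]; the linguistic hedge 'very' is conventionally the square of that value.
def tall(height_cm: float) -> float:
    """Degree to which a height counts as 'tall' (0 = not at all, 1 = fully)."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30.0   # linear ramp between 160 cm and 190 cm

def very(membership: float) -> float:
    """Standard fuzzy hedge: 'very tall' is the square of 'tall'."""
    return membership ** 2

h = 178.0
print(tall(h))        # 0.6  -> somewhat tall
print(very(tall(h)))  # 0.36 -> less strongly 'very tall'
```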
Genetic algorithms

What makes genetic algorithms (GAs; Goldberg, 1989; Reeves & Rowe, 2003) appealing is that inherent to them is the ability to generalize from one set of instances to another. Moreover, they share in the vision of cybernetics, an attempt to unify psychology, economics, biology, mathematics, engineering and physics in its application. The algorithms are procedures that are probabilistic (rather than using deterministic logical operators) and are used to search through large spaces of information or states, in which a distributed set of samples from the space of all states is used to generate new examples. Moreover, rather than relying on binary classifications (e.g., 0 or 1), GAs can use strings of values (e.g., 0,1,1,0,1,1,0). The application of concepts such as genetics and evolution is based on the iterative process needed to produce new populations that re-represent the states (i.e., generations) as a
means of adapting and finding ways of reaching a goal more efficiently. GAs automatically bias the generation of future populations by using evolutionary methods (e.g., mutation, crossover and selection) to uncover specialized components of the current sample set, and use them as building blocks from which to construct structures that will exploit regularities in the environment (the problem space). The fit between the future-generated set and the previous set refers to the success of approaching the goal or finding the solution to the problem. For instance, provided with a set of features of an image, the algorithm needs to classify the features (in order to recognize that the image is a face) as well as generate the next population of features that will improve the classification rule (i.e., improve the fitness) needed to keep predicting accurately which features indicate that the image is a face. This becomes particularly important in situations in which the designer relaxes her involvement in the learning process and the machine learns to establish the rules of the environment, or develop strategies (i.e., exploit the environment) as to which decisions are most appropriate to make.
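The generate–evaluate–select loop described above can be sketched compactly. The bit-string encoding, fitness function, population size and rates are illustrative assumptions rather than details drawn from Goldberg (1989) or Reeves and Rowe (2003).

```python
# Minimal sketch (illustrative only): a genetic algorithm over bit strings.
# The 'environment' rewards strings that match a hidden target pattern;
# fitness, population size and rates are arbitrary choices for the example.
import random

TARGET = [1, 0, 1, 1, 0, 1, 1, 0]          # regularity the GA must discover

def fitness(individual):
    return sum(g == t for g, t in zip(individual, TARGET))

def crossover(a, b):
    point = random.randrange(1, len(a))     # single-point crossover
    return a[:point] + b[point:]

def mutate(individual, rate=0.05):
    return [1 - g if random.random() < rate else g for g in individual]

population = [[random.randint(0, 1) for _ in TARGET] for _ in range(20)]
for generation in range(50):
    # Selection: keep the fitter half as parents for the next generation.
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    population = parents + children

best = max(population, key=fitness)
print(best, fitness(best))
```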
Bayesian learning algorithms

The aim of any machine-learning algorithm is to gather information and develop ways of using it appropriately, and this dual role of probing and controlling, now referred to as exploring and exploiting, was first introduced by Feldbaum (1960a, 1960b). Feldbaum's dual-control theory uses a statistical approach (Bayesian) that gives a probability estimate of how instances or states belong to a set. This was favoured over using a decision tree method, as in the case of logic-based or genetic algorithms. Essentially, there is a trade-off between controlling the environment (i.e., using the current instances to achieve a desired goal) and probing the environment (i.e., conducting some form of hypothesis testing that will lead to new instances that could in turn improve control). Control is a stabilizing feature of learning, and probing is de-stabilizing, because
when gathering information, the environment is not managed for the purposes of control; this is why the trade-off exists: pursuing one mode comes at the cost of the other. The relevance of Bayesian statistical methods to dual-control theory is that an algorithm can be developed in which optimal adaptive control can be achieved under uncertainty about the transition dynamics of instances and states. Using a Bayesian learning algorithm involves devising a particular hypothesis (h); given an observed outcome (d = datum) – that is, the system has some data by which to evaluate the hypothesis – the hypothesis can be evaluated and revised. To achieve this, formal methods involve integrating base rate information (i.e., the prior probability of the hypothesis) and the likelihood ratio (the success of competing hypotheses in predicting the data) by using the Bayes rule. In the Bayes rule, the probability of the hypothesis tested (h) is derived by multiplying the likelihood ratio of the observed datum (d) by the prior probability favouring the claim the system is evaluating (for a more detailed description of Bayesian networks and Bayesian learning algorithms, see Jensen, 1996). Uncertainty is present as an unknown feature or structure in the environment, and how it is dealt with through Bayesian learning algorithms has important implications for the two complementary methods of learning (i.e., probing and control). If the machine is cautious – that is, it simply controls – then one disadvantage is that information gathering is limited. Because of this constraint, relying only on the information that is currently available will eventually decrease the system's efficiency, because it isn't actually learning much. The upper bound of what can be achieved is significantly reduced if the information gathered and the assumptions made about the environment are too conservative. This observation led Wieslander and Wittenmark (1971) to propose that having a machine make hypotheses (i.e., probe) in a way that reduces assumptions of uncertainty can have beneficial effects because of the potential information gain. However, this method still requires continuous modifications to be correctly factored into the decision rule because of uncertainties in the environment.
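A small numerical sketch of the prior-times-likelihood update may help; the competing hypotheses and probabilities are invented for illustration and are not drawn from Jensen (1996).

```python
# Minimal sketch (invented numbers): Bayesian updating of two competing
# hypotheses about the environment after observing a datum d.
# posterior(h) is proportional to prior(h) * likelihood(d | h).
priors = {"h1: system drifts": 0.5, "h2: system is stable": 0.5}
likelihoods = {"h1: system drifts": 0.8,     # P(observed datum | h1)
               "h2: system is stable": 0.2}  # P(observed datum | h2)

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
evidence = sum(unnormalized.values())
posteriors = {h: p / evidence for h, p in unnormalized.items()}

print(posteriors)   # {'h1: system drifts': 0.8, 'h2: system is stable': 0.2}
```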
Correctly factoring in such modifications is important for optimal control to be achieved, with or without active experimentation (i.e., probing). In response, some researchers (Acid & de Campos, 2003; Cowell, 2001) have recently developed Bayesian algorithms that search for conditional independencies of structures in the environment in order to adequately constrain the space of instances that are probed. One offshoot of Bayesian learning algorithms is Q-learning (Ishii, Yoshida, & Yokimoto, 2002; Sutton & Barto, 1998; Watkins, 1989), in which the dual-control methods of exploring and exploiting are reformed. In Q-learning, dual control acts on primitive reinforcement learning methods in which decisions made by the machine learner follow a utility function. This refers to a simple assignment of reward or punishment to decisions. Successful decisions are rewarded (reinforced), and unsuccessful decisions are punished (weakened). At each time step, the learning algorithm refers to the probabilistic reward it will receive based on its state and action, and the state change in the environment. As with dual-action control, there is a similar trade-off concerning exploration (i.e., information seeking) and exploitation (i.e., applying information for goal-directed purposes). Algorithms of this kind are particularly suited to dynamic environments because the success of learning in complex environments can be examined in two different ways. The algorithm may make no assumptions about the environment itself (i.e., it is model free), and so it predicts state changes by directly learning the value function; in other words, this is like trial-and-error learning. Alternatively, model-based versions follow the same basic premise, but instead try to capture the environmental dynamics, and the value function is used to examine the approximation of the model to the environment (for discussion, see Dayan & Niv, 2008). That is, prior assumptions are made
Of the many different machine-learning algorithms developed, Q-learning (or reinforcement learning) has been popularized as a method of modelling psychological as well as neurological systems that contribute to optimal action control (see Daw, Niv, & Dayan, 2005; Daw et al., 2006).
and tested by information gathering, so that decisions based on correct assumptions are reinforced and fed back to the model, revealing its success in capturing the structure of the environment.
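A minimal, model-free sketch of the reward-driven update and the explore–exploit trade-off runs as follows; the toy environment and parameter values are assumptions for illustration, not details from Watkins (1989) or Sutton and Barto (1998).

```python
# Minimal sketch (toy environment, illustrative parameters): tabular Q-learning.
# The learner is model free: it never models the environment's dynamics, it just
# updates the value of (state, action) pairs from experienced rewards, trading
# off exploration (random action) against exploitation (best-valued action).
import random

N_STATES, ACTIONS = 5, [0, 1]          # a small chain of states; move left/right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy dynamics: action 1 moves right, action 0 moves left; reward at the end."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

state = 0
for t in range(2000):
    if random.random() < EPSILON:                        # explore (probe)
        action = random.choice(ACTIONS)
    else:                                                # exploit (control)
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    # Q-learning update: nudge the estimate towards reward + discounted future value.
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = 0 if next_state == N_STATES - 1 else next_state   # restart at the goal

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```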
Neural Networks

Of all the different learning algorithms, the neural network bears the closest resemblance to biological systems. As its name suggests, the algorithm is analogous to the networks of neurons in a brain and is based on an artificial network of links between input nodes, a hidden layer of nodes and output nodes. Neural networks are also referred to as connectionist networks because of the pattern of connections between the different nodes. Typically, a network receives information through its input nodes and generates a result at the output nodes by processing the information passed from the inputs to the hidden layer (see Figure 4.1). As the information comes into the system, inputs will become activated, where an input corresponds to a feature of an image, and the hidden layer learns to recode (i.e.,
Figure 4.1 Example of a neural network (input nodes connect via a hidden layer to output nodes).
represent) them. Each input has a corresponding threshold value, and the weighted sum of the inputs subtracted from the threshold comprises its activation. The activation function most commonly used is the sigmoid activation function, which generates a number between 0 (for low input values) and 1 (for high input values). The values generated then pass to other neurons via connections in the network, which have a strength attached to them (i.e., they are 'weighted' – corresponding to neuronal synaptic efficacy). The hidden layer is so called because only what goes into and comes out of this layer can be observed. Despite this, the performance of the neural network is determined by the connections and weights at the hidden layer. The problems at the hidden level are that either too few nodes are estimated, and so the network is deficient in approximating or generalizing, or too many nodes are estimated (i.e., overfitting), which hampers the network's attempts to optimize its learning. A network with the usual direction of information flow is referred to as a feedforward neural network; however, to learn, the information from the current state to the next needs to be used to adjust the performance of the network. Neural networks learn by changing their weights by an amount proportional to the error between the output of the learning system and the desired output. This can be achieved through back-propagation, in which the output is compared with the desired goal and the error is used to adjust the weights of the output, and in turn the error is back-propagated to the hidden-unit level to adjust the weights of the hidden nodes (psychology: Rosenblatt, 1958; AI: Hagan & Menaj, 1994). The problem with feedforward neural networks is that they are slow to learn (Kotsiantis, Zahaakis, & Pintelas, 2006; Zhang, 2000), and so other learning algorithms such as genetic algorithms (Siddique & Tokhi, 2001) and Bayesian approaches (Linsker, 2008; Vivarelli & Williams, 2001) have been used to train up the weights within the network to improve its rate of learning. An alternative to this would be to use a feedback method. Recurrent neural networks (psychology: Elman, 1990 and Jordan,
Figure 4.2 Example of a recurrent neural network (inputs, hidden layer and outputs, plus a context layer that feeds back into the hidden layer).
1986; and AI: Williams & Zipser, 1989) (see Figure 4.2) cycle back the information through the network. At each time step, as new inputs enter the network, the previous values of the hidden layer are passed into the context layer, and these are then fed back into the hidden layer at the next time step; this enables the network to operate as if it had a memory. The advantage of this type of neural network is that it has dynamic properties that make it sensitive to time. It does not distinguish between different patterns of information that have temporal properties because an added learning rule explicitly directs it to do so. Instead, it preserves information and then learns to generalize across changes in patterns as a function of time. One of the main disadvantages is that recurrent networks include many more properties of the learning algorithm that need to be specified (i.e., many parameters) (e.g., the number of units in
the hidden and context layers, the appropriate structure and the learning rate); the success of any algorithm depends on using the fewest possible parameters. Recurrent networks are also slow, and rely on other learning algorithms to speed up the rate of learning. Moreover, as with the problems of feedback discussed in the previous chapter, the introduction of feedback can make the network unstable (Haykin, 1994).
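The mechanics described above – weighted sums, sigmoid activation and error-driven weight changes – can be sketched in a few lines. The network size, learning rate and toy task below are illustrative assumptions only.

```python
# Minimal sketch (illustrative sizes and rates): a feedforward network with one
# hidden layer, sigmoid activations and a back-propagation weight update.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy labelled data: two input features per example, one target output.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)      # logical OR as the target

W1, b1 = rng.normal(0, 1, (2, 3)), np.zeros((1, 3))  # input -> hidden (3 hidden nodes)
W2, b2 = rng.normal(0, 1, (3, 1)), np.zeros((1, 1))  # hidden -> output
lr = 0.5                                             # learning rate (assumed value)

for epoch in range(5000):
    hidden = sigmoid(X @ W1 + b1)                    # forward pass
    output = sigmoid(hidden @ W2 + b2)
    error = y - output                               # discrepancy from desired output
    # Backward pass: the output error adjusts the output weights and is then
    # propagated back to adjust the hidden-layer weights, in proportion to each
    # connection's contribution to the error.
    delta_out = error * output * (1 - output)
    delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)
    W2 += lr * hidden.T @ delta_out
    b2 += lr * delta_out.sum(axis=0, keepdims=True)
    W1 += lr * X.T @ delta_hidden
    b1 += lr * delta_hidden.sum(axis=0, keepdims=True)

print(np.round(output, 2))   # predictions should approach the targets [0, 1, 1, 1]
```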
A Little Synthesis

The continuing appeal of the cybernetics movement is that it integrates many different disciplines in order to tackle very basic common problems, such as finding a way of describing and even mimicking successful autonomous agents that can interact with an environment on the basis of achieving and sustaining a goal (Corning, 2007). Psychology continues to borrow from machine learning and engineering in order to find formal benchmarks, such as Kalman filters, Markov decision processes (MDPs) and particle filters, to examine human motor learning capabilities (e.g., Grush, 2004; Wolpert, 1997) and predictive accuracy in complex decision-making scenarios (e.g., Brown & Steyvers, 2009; Gureckis & Love, 2009). Machine learning in turn occasionally borrows from psychology and biology (e.g., Barto & Sutton, 1981; Shevchenko, Windridge, & Kittler, 2009), neuroscience (e.g., Yu & Dayan, 2005) and engineering
Reinforcement learning models (e.g., Q-learning) describe Markov decision processes; that is, they aim to describe the function that enables sequential decision making. This can be any situation which can be reformulated as a problem in which there is a discrete or continuous set of states, in each of which a decision maker chooses an action, the consequences of which are either a reward or a cost. The decision maker's aim is to make optimal choices at every stage in order to capture the policy function (i.e., decision rule) that maps states to actions, maximizing some long-term cumulative measure of rewards.
Particle filters are a machine-learning algorithm designed to solve problems concerning dynamic state estimation using Bayesian inference. The particles (samples)
(Watkins, 1989; Watkins & Dayan, 1992) in order to refine formal methods of reinforcement learning as well as increase the capabilities of artificial systems (e.g., robots). In addition, cybernetics has led to the development of systems that have many applied benefits. Machine learning continues to have a far-reaching impact on our daily lives, because the techniques that are used underpin many control systems that we interact with (e.g., identifying tumours in X-ray images, voice recognition, predicting stock market trends, creating game-play in computer games and offender profiling). Thus, control systems that operate with machine-learning algorithms in turn modify and adapt their behaviour to suit our purposes. This makes understanding machine-learning approaches a valuable pursuit for gaining insights into the processes of the systems we aim to control. If we return to our target question, 'How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?', then we need to consider what insights cybernetics, AI and machine learning have to offer with respect to descriptions of how control systems operate. What they tell us, much as control systems engineering has told us, is that the rules used to describe how a system operates must serve two opposing goals. One is to be robust, so that it can operate autonomously. The other is to be flexible, so that it can respond and adapt to new outcomes. Put another way, the puppet's internal mechanism may allow it to learn what the puppeteer wants it to do when a certain tune is played, and it may learn well enough that the puppeteer need do very little in manipulating
represent the posterior distribution of the state of a system given a series of measurements, and they are updated as new information about the state of the system comes in. As new information arrives, the particles are constantly revised and updated in order to predict the state of the system – the algorithm trades off drawing from a known sample that may not be fully sensitive to the dynamics of the changing states against the cost of developing a new sample that more accurately reflects sensitivity to the changing state (for more discussion, see Doucet, Andrieu, & Godsill, 2000).
the strings. But when the tune changes, all the learning that the puppet has gained might be lost unless it has worked out the rules that helped choreograph its movements with certain musical phrases in the first place. Implicit in much of the work that has been described in cybernetics, AI and machine learning is the effort to tackle the main question that is of importance in this book. The methods that have been described concern adaptive learning in the face of highly uncertain environments. However, one of the most interesting problems still facing AI and machine learning is describing not only how successful learning can take place in the face of one type of uncertain environment, but also how the information that is acquired can be used to develop decision policies in new uncertain environments – that is, how can learning to manage uncertainty in one environment transfer to another, unfamiliar environment (Mahadevan, 2009; Tsao, Xiao, & Soo, 2008)? All intelligent systems that learn to generate behaviours are designed to reach and maintain a goal. They build on past experience to develop a set of actions that can affect the environment with some view to generalization – that is, going from the known (experience) to the unknown (new states). The principal objective is that the intelligent system finds the optimal way of producing a set of behaviours that is robust to the uncertainty generated in its environment. The ongoing challenge in machine learning is to find an accurate description of the environment that it can abstract from. This is particularly difficult when there are large numbers of states and actions, and the system needs to generalize from a small sample and condense the problem space into finite sets of regions (i.e., the problem space is vast). The limitations thus far of machine-learning algorithms are that they often require a large amount of data from which an abstraction process can begin. They don't often generalize over actions. Goal setting can be problematic. Also, many machine-learning algorithms (e.g., MDPs) cannot model the decision process when the control decisions take place in continuous time. Also, to tackle the types of decision problems that humans face when controlling systems, machine-learning algorithms are computationally
more expensive than us, which is why the issue of control and the transfer of control behaviours are still difficult for them to tackle (e.g., Barto & Mahadevan, 2003; Bratko & Urbančič, 1997; Gahegan, 2003; Kochenderfer, 2005; Mitchell et al., 1990). Take, for example, the automotive industry. One of its aims is to improve car safety. The development of intelligent control systems includes assisted steering to hold the car in lane, semi-automatic parking computers to aid parallel parking, and traction control to manage under-steer and over-steer in critical situations (e.g., Leen & Heffernan, 2002; for discussion of general issues, see Hokayem & Abdallah, 2004). These systems rely on careful monitoring of the state of the car through a range of sensors that process information about the environment. However, processing of sensory information cannot reach complete accuracy – particularly given the kinds of imprecision in the information that machine-learning systems have to work with. Moreover, as was discussed in the previous chapter, the types of complex automated systems that have been designed still require human supervision. This still presents us with a problem: any automated system that is implemented based on specific information can still intermittently relinquish control, because the information it is working from (for the many reasons discussed in this chapter) may not satisfy the conditions under which the system should apply. Given this, if the information is not sufficient, or does not satisfy certain conditions, then a system like, say, those developed and used in the automotive industry will include a mechanism that hands back control to the driver. In many situations, drivers may not be fully aware of this, with the result that they may believe that the system is always operating under full autonomous control. Here, then, is an example of applications of machine learning in which there are direct consequences for human interaction when the control system struggles to encode and control uncertainty (e.g., Gollu & Varaiya, 1998; Sivashankar & Sun, 1999). Thus, while the cybernetics movement was designed to forge links between disciplines and tackle a number of critical control
issues, the mechanisms that are formalized essentially provide a framework for understanding the same issues that concern human interaction with a complex dynamic environment: coping with uncertainty, making accurate predictions about changes to the environment and making appropriate decisions (Melnik, 2009). The aim of the next chapter is to examine in more detail the nature of the interaction between humans and control systems of the general kinds referred to by cybernetics.
Chapter 5
Human factors (HCI, ergonomics and cognitive engineering)
Early engineering control systems, particularly those developed from the 1930s to the 1960s, were concerned with tackling problems such as the transmission of information across vast networks (e.g., telecommunications) and how to achieve dynamic optimal control (e.g., rocket propulsion). Up until then, control systems were developed with specific problems in mind, and though the human operator was the linchpin of the various elements of the system, in the design of the systems the operator was often treated as an adjunct. The focus in this book until now has been the internal workings of the puppet. But clearly what is also important isn't just the puppet, or the puppeteer for that matter, but both together, bound by the strings that connect them. This requires an integration of two perspectives: (1) from the view of the puppet (i.e., 'What kind of puppet is it?'), and (2) from the view of the capabilities of the puppeteer. This integrative process is the focus of this chapter. As Hazen noted at the time (1934, 1941), automated control systems involved placing the human operator in amongst an
established system and asking him or her to adapt to what was already there. While human operators were a necessary element of automatic feedback loops in control systems, Hazen was one of the first to recognize the need to design systems that suited human capabilities, rather than humans suiting the capabilities of the system. The shift from focusing only on developing a system that operates efficiently, reliably and autonomously to examining the way humans behave with systems, and integrating those facets with the main principles of control theory, reflects the motivation of human factors research (Mindell, 2002; Sheridan, 2002). In essence, Hazen's work was the first blush at explicitly acknowledging the importance of humans as dynamic interactive properties within control systems. Until now, the work discussed in this book has focused on ways of representing and designing control systems. This chapter is the first occasion on which the uncertainties of the environment are considered with respect to human behaviour and the kinds of adaptive control behaviours that emerge as a product of the interaction with control systems. Another aim of this chapter is to provide illustrations in which there are gaps between the assumptions and capabilities of the human operator and the behaviour of the control system. In most day-to-day cases, such gaps make little difference to controlling the actual outcome in a system – in fact, we behave with a rather impoverished understanding of the control systems we operate. However, in other cases such gaps can have dire consequences. The severity of such cases suggests that even with ever-increasing technological advances and sophisticated system design, the changes generated and observed in the environment present the same continued challenge. For this reason alone, it is important to consider the target question of this book – 'How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?' – from the point of view of the kinds of interactions with control systems that human operators face. Another reason is that human factors research examines possible causes for increases in our levels of uncertainty when operating control systems. These
possible causes can lead to errors, and so being aware of them is clearly of interest. However, before this discussion can begin, it is important to establish more specifically what the domain of human factors actually is.
Human Factors

There is no strict agreement as to the differences between human factors research and other related areas such as HCI, ergonomics and cognitive engineering (Vicente, 1999), and it is not the place of this chapter to attempt to settle them. Rather, the aim here is to consider, across all these different research domains, the main tenet they all share. In general, there appears to be little controversy in suggesting that their common goal is to accurately represent and describe the dynamic relationship between humans and complex systems. Human factors research also crosses over to other disciplines (e.g., engineering, psychology, AI and machine learning). Why might this be the case? Well, it has been argued that the formal models (e.g., fuzzy set theory, Bayes' theorem, optimal sampling, reinforcement learning, neural networks, Kalman filters and signal detection theory) used to describe the environment (i.e., the artificial system) can also be used to characterize the interaction between humans and control systems (Melnik, 2009). The choice of model applied to describe the capabilities of human learning and decision making depends on the purpose of the interaction. So, as with machine learning, no one model wins outright in formally capturing either what the environment is (i.e., the type of control system) or human behaviour (e.g., what kind of decision-making
In the remainder of this chapter, the most commonly referred to research domain will be human factors (which is the oldest and most established of all the research domains listed here), though when there are specific references to computing, design and safety-critical systems (e.g., railway signal systems, medical infusion pumps, autopilot systems or nuclear reactors) HCI, ergonomics and cognitive engineering will be mentioned specifically.
behaviour needs to be described), but rather different flavours of models suit different goals. It is important to highlight here that an ongoing issue in human factors research is whether there can in fact be a formal description of the interaction between humans and machines – this point will be discussed in some detail later.
The trouble with humans is …

The themes that are of principal concern to human factors research are also much the same as those discussed in AI, machine learning and control engineering (i.e., feedback, control, stability, dynamics and time delays). But in the context of human factors research, the frame of reference is how people cope with feedback, control, dynamics etc. – and the extra complication that comes with this. For instance, though automated systems (such as air traffic control, military operations, nuclear power plants, subways, railways, chemical process plants, cars and phones) are reliable, they are highly complicated. When events in these systems don't go according to plan, human operators are required to respond. In the most extreme cases, when disasters occur and the root cause is human, this is often because one of two erroneous prior assumptions has been adopted:
1. The system is sophisticated and automated; therefore, one ought to mistrust the operations of the system or device.
2. The system is sophisticated and automated, so one must put total and absolute faith in the operation of the system or device.
These two particular perspectives can have serious consequences for the way in which interactions with control systems proceed.
Some of the early concerns about automation raised by Wiener (1948), the forefather of cybernetics, and the objections concerning automation that Turing (1950), forefather of AI, was forced to respond to, clearly still pervade because of the attitude people have towards complex control systems, namely that they show some form of intelligence.
That means that the problems resulting from interacting with the system can be imagined as well as real, or else ignored, in which case any fault diagnosis needs first to establish the causal factors in assigning responsibility for the outcome (i.e., to either the human or the machine) (Johnson, 2003; Salvendy, 2006). At this juncture, it is worth being reminded that, since the matter of prior assumptions plays a huge role in the assignment of cause, many of the questions concerning agency and causality discussed in Chapter 2, which appear to be abstract, in fact have an important bearing here.
The trouble with designers is …

The complement of this, as highlighted by human factors research (e.g., Busby & Chung, 2003; Norman, 1992; Rasmussen, 1987; Reason, 1990a; Vicente & Rasmussen, 1992), is that in the development of complex systems, assumptions are made by control systems designers about the way human operators behave. An operator's perception of how the system behaves is based on the actions taken in it and the feedback mechanisms that indicate what is observable about system behaviour. The feedback available is determined by what designers assume should be indicated to the operator (e.g., light changes on devices, or sounds indicating errors or progress). However, there can be problems in the correspondence between events in the system and what is indicated to the operator. Take, for instance, the case in which the panel display doesn't indicate that the automatic pilot system has been overridden because of
For example, Johnson (2003) presents the details of the Valujet accident (Flight 592) that claimed the lives of all cabin crew and passengers, and could be traced back to human error. The Valujet DC-9 crashed after takeoff from Miami. The investigation by the National Transportation Safety Board (NTSB) found that SabreTech maintenance safety employees had incorrectly labelled as empty oxygen canisters that were not supposed to have been carried on board the flight. The oxygen canisters were actually full, which created the conditions for the fire and caused the crash.
extraneous weather conditions, and that control of the craft has actually been returned to the pilot of the passenger aircraft. In actual fact, something much like this has occurred, and has resulted in a number of reported aircraft crashes (Degani, 2004; Johnson, 2003; Sarter & Woods, 1995; Sarter, Woods, & Billings, 1997; Sheridan, 2002; Sherry & Polson, 1999). Thus, accident reports reveal that poor human performance can have as much to do with poor design as it does with lapses in attention, poor decision making, poor monitoring and poor planning.
What Constitutes Uncertainty in Human Factors Research?

This issue is particularly salient when determining causes of accidents (Johnson, 2003) and making recommendations for improving system design (Vicente, 1999). Thus, the following discussion focuses on features of uncertainty of control systems from the viewpoint of the operator (Perrow, 1984; Woods, 1988, 1996), though it will also become apparent that some factors (e.g., dynamics and automation) have already been considered in previous chapters (i.e., on engineering, cybernetics, AI and machine learning) with respect to uncertainty arising from the control systems.
Specific Human Factors Contributing to Uncertainty

Given the nature of the systems that people interact with and control, the number of relevant components of a system that the operator could use at any one time is vast, so such systems are often described as having large problem spaces. This can refer to the space of information that is shared across people and mediated by systems (e.g., the stock exchange), or between people and multiple control systems (e.g., air traffic control and the flight management system on board an aircraft), or between people and machines (e.g., driving a car). The
design problem here is to reduce the problem space so that, while all relevant information can be called upon, it is available in a way that doesn't swamp the operator. For instance, flight management systems on board aircraft can include information concerning flight routes, navigational advice, air traffic, fuel efficiency, engine status and automated pilot commands. This is by no means an exhaustive list, but it gives a flavour of the range of information that needs to be attended to. Therefore, systems designers and human factors researchers are concerned with tackling the problem of large problem spaces because, when the system contains detailed and multiple sources of information that need to be regularly updated and relayed to the operator, this information needs to be presented in a manner that enables the operator to quickly identify critical changes in the system. In addition, control systems are rarely ever isolated from social networks; in fact, they are often referred to in human factors as sociotechnical systems. Therefore, what can make an already uncertain system even more uncertain are social or management issues. These issues could potentially be insurmountable. To make them tractable, they are tackled from the point of view of communication; that is, how information (whatever the source) is transmitted across systems. For instance, the transmission of information across a corporation or organization can lead to significant distortions, much like trying to send a signal that may be distorted by noise and change over time, as in the telecommunications problem discussed by Wiener (1930). In fact, the mapping between the two had been the explicit focus of communication and information theories in cybernetics (Shannon, 1948; Shannon & Weaver, 1949; Wiener, 1948). Though these early examples were concerned with the transmission process of information and not its functional role, more recent developments in human factors research have distinguished between formal descriptions of the statistical and structural properties of information and descriptions of control information. The idea behind control information is that control systems are vast interconnected networks, in which information is acquired, disposed of and utilized, analogous to the movement of
matter/energy (see, for discussion, Corning, 2007). Thus, the transmission of information across control systems as part of a network is a sufficiently broad idea that it includes the behaviour of information in physical systems (e.g., temperature, velocity and viscosity), biological systems (e.g., neurochemical and electrical) as well as sociotechnical systems (e.g., culture). As well as the uncertainty attached to communicating across large networks, and the problems that come with that, the networks themselves are heterogeneous. That is, a combination of different types of expertise is required for the functioning of many control systems. For instance, the development of a new automated parking system for a family vehicle may involve engineers, mathematicians, physicists, accountants, advertisers and managers as well as psychologists and human factors researchers (Bucciarelli, 1994). Different expertise in turn entails different languages of description and different priorities, and this in turn can influence the way in which a system is managed and to what ends. Moreover, systems are more likely than not to be distributed; that is, the different communities of people involved in managing and controlling the system are going to be located across different countries. Therefore, making control systems manageable to users requires sensitivity in the interpretation of information, because users interpret the same information differently. In addition, control systems comprise many subsystems which are also interconnected; this is referred to as coupling. Therefore, the system design and manufacture will involve multiple steps, each with collaborative exchanges across wide social networks, over different time zones and different countries (Burns & Vicente, 1995; Norris, 1995). Given the mass of people involved, the multiple dynamic information exchanges, and the operation of control systems (i.e., the actual systems [social, technical and digital] that will help to produce a particular outcome – in this example, a new automated parking system), this can generate as well as expose problems that can have an impact at all levels (Spender, 1996). Because many systems are distributed and involve a high level of coupling, there is in turn a high demand to accurately predict what effects will result from the many and varied decisions that have been made. Reasoning about
a network of decisions with respect to a single goal is imperative in order to monitor the behaviour of the system. However, the ability to do this accurately decreases as it becomes more difficult to conditionalize on multiple decisions made in the system that will generate different outcomes, which are incompatible with the target goal (Dörner, 1989). This can also be further complicated when the effects of the decisions are not realized immediately, but instead are delayed. This is referred to as disturbances. The advantage of a human operator is that when goals have to shift or adapt, or the system reveals sudden irregularities, the human operator can develop online contingency plans to maintain the safe functioning of the system. Rare idiosyncratic events still require control, and although they cannot be predicted, the fact that they can occur means that a set of decisions and actions needs to be devised and implemented quickly, and an automated system cannot as yet do this as effectively as humans can. There is a heavy demand on the user to integrate vast amounts of information generated through interaction with control systems. To be successful, and to adapt to the changes that might occur in the system, the target goal must always be prevalent. Not all but most control systems, particularly those on a large scale (e.g., air traffic control, nuclear power plants, waste disposal plants, water purification systems, automated pilot systems and subway systems), are safety critical. That is, if the system malfunctions, the consequences for human life can be fatal. So, another aspect of sociotechnical systems is that they are hazardous. There are pressures attached to monitoring the system (and taking command when needed), managing the multiple sources of information, predicting the potential effects of the decisions made and considering delays in the outcomes of actions taken. In addition to these, another factor that increases the pressure on decision making is that an incorrect decision can cost lives, which means that solutions to problems, and attempts to adapt to different demands made by the system, must be within the bounds of safety. Explore and exploit strategies in combination, discussed in the previous chapter, provide an optimal way of increasing information about the environment. However, with increased costs in exploring (e.g., time, effort and
risk), relying only on exploitation strategies to reduce uncertainty as to how to control the system leads to impoverished knowledge or, as has been discussed previously, tends to slow the acquisition of the relevant knowledge needed to help control the system sensibly (Wieslander & Wittenmark, 1971).
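The cost of relying on exploitation alone can be shown with a toy sketch; the payoffs and exploration rate below are invented for the example and are not drawn from Wieslander and Wittenmark (1971).

```python
# Minimal sketch (toy numbers): why exploitation alone can entrench poor control.
# Two options with unknown average payoffs; a pure exploiter tends to lock onto
# whichever option looked best early on, while occasional probing keeps learning.
import random

TRUE_MEANS = [0.3, 0.7]          # unknown to the learner

def run(epsilon, trials=5000):
    estimates, counts, total = [0.0, 0.0], [0, 0], 0.0
    for _ in range(trials):
        if random.random() < epsilon:
            choice = random.randrange(2)                          # explore (probe)
        else:
            choice = max(range(2), key=lambda i: estimates[i])    # exploit
        reward = 1.0 if random.random() < TRUE_MEANS[choice] else 0.0
        counts[choice] += 1
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
        total += reward
    return total / trials

print("exploit only:", round(run(0.0), 2))
print("with probing:", round(run(0.1), 2))
```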
Typical Factors of Uncertainty Revisited

Most sociotechnical systems are automated, dynamic and uncertain, and these features are all connected to each other. As we have considered in previous chapters, the introduction of feedback loops enables a control system to operate autonomously and to adapt to change. A compensatory mechanism like prediction is needed to anticipate changes from one state to the next in order to offer reliable adaptive control. However, this flexibility comes at a cost, because feedback loops introduce instability, and where there is instability, there is also uncertainty. The automation of control systems (e.g., cars, planes, factory looms, subways and railway signalling) means trusting that the system will reliably and efficiently produce behaviours under the goals that we set. We can incrementally improve control systems and increase the demands of automation (i.e., increase the sophistication of the automated behaviours of control systems), as well as increase the combinations of functions within control systems (e.g., the range of functions on mobile phones has increased greatly since they were first made commercially available). But that means we must continually develop ways of learning, adapting to and controlling the new levels of automation that we demand. One reason that we continually extend the level of automation is that we can successfully adapt to the changes it generates.
As noted by Wiener (1948), and as has been a worry of many since, fears of automated systems taking over our lives should be displaced: if current advancements of technology are anything to go by, as automation increases in complexity, there will be more of a supervisory role for humans, not less.
For instance, much of the time, computer algorithms (machine-learning algorithms) will enable control systems to function automatically (e.g., automated trading in stocks, automatic vehicle systems, automated sailing systems, automated storage and retrieval robotic systems in warehouses, and automated medical analyses in medical laboratories). Typically, the operator's role is to monitor the way the system is behaving. Because the systems are robust, they can often operate with little intervention from the actual human operator. However, control systems are typically built to respond to preset conditions. Once the conditions go outside a prescribed range, the automated system will relinquish control, and this is where careful monitoring by the operator is needed. Under these conditions, it is the job of the operator or operators to organize the operations of the system. The problem with this is that these situations can be rare, and when they occur is unpredictable, so operators' expectations of these events occurring are low. Moreover, with respect to actual control, the operator's usual involvement in the system may actually be minimal. Therefore, sudden shifts from automated to manual control can be problematic, because the operator needs to establish why the system has given up control and come up with quick decisions that will mute adverse situations that may arise. Control systems are likely to unravel quickly if their operations are not maintained, which imposes a significant degree of pressure on operators on those rare occasions when the system fails (Green, Ashton, & Felstead, 2001; Hirschhorn, 1984; Kelly, 1989; Macduffie, 1995; Smith, 1997). Usually the dynamic properties of systems (i.e., the changes that occur in them) can be described formally from an engineering and machine-learning perspective. However, the way in which they are experienced by human operators, and the effects that change can have on the interaction between the system and the operator, can often be problematic. If, for instance, control systems have long time constants, which they typically do, the system is likely to
Note that the long time constant, or risetime, refers to the time required for a signal to change from a specified low value to a specified high value: the risetime
respond with some delay (minutes, hours) after an action is initiated by an operator. In the mind of the operator, the expectancy of a change following an action taken in the system has to remain there, even if there is a considerable delay between what they have done and the effect that will be produced. In addition, operators also need to predict other effects that will follow in the future as well as the outcome they are expecting. But what happens if there is a very long delay between actions taken in the system and the expected effects? The chances are that operators are likely to forget that they needed to expect a delayed change in the system from a decision they had made earlier. Failing to take into account a single delay between an action and the changing state of the system can lead to disaster if it continues to go unchecked. To illustrate this, Vicente (1999) gives the example of manoeuvring an oil tanker, in which the dynamics of the system (the tanker) are incredibly slow, not only because of its size (on average 200 m) but also because of the weight it is carrying (on average 30,000 tonnes dead weight). So there is a considerable lag between actions and effects. Failing to take into account the consequences of future outcomes when manoeuvring the tanker in and out of docks can result in it running aground, which means damage to the hull, loss of structural integrity and loss of load (e.g., the Exxon Valdez oil spill in 1989). Usually, to guard against this, a harbour pilot is used to oversee navigational control when entering and exiting ports. Similar issues apply in cases where the lag between action and outcome is relatively short, but highly frequent and spread across multiple decisions, in which case the decisions are likely to be interactive. Uncertainty itself is not separate from the automated aspects of a system or its dynamic features (Bisantz et al., 2000; Degani, Shafto, & Kirlik, 2006; Rothrock & Kirlik, 2003). We can easily accept the possibility that control systems will not function the way we want
characterizes the response to a time-varying input of a first-order, linear time-invariant (LTI) system.
them to all the time. That is, the system presents us with imperfect, unreliable, unsystematic or irregular data (Lipshitz & Strauss, 1997). It may falsely indicate its internal states as well as falsely indicate changes in states that result from actions taken by the operator. Or its sensors may fail, suggesting that there are no changes in states. Moreover, even if the information is accurate – that is, we can observe an outcome and it is correctly indicated as occurring – we may not always be able to observe the relevant causal factors that contributed to the change in outcome because they are too intricate (e.g., interactions between politicians and CEOs that end up affecting stock prices, neurochemical interactions following drug and chemotherapy interventions, and the production of waste from crude oil refineries). This means that the human operator's relationship to the system is through mediated interaction, and this is also a source of uncertainty. Reduction of uncertainty comes via design features of the system that regularly update the operator as to the functions of the system. This remote level of information, or mediated presentation as Vicente (1999) refers to it, can be problematic, because a high degree of reliance and trust is placed on such a method. Why might this be problematic? Automation and the dynamic features of the system tend to raise the level of uncertainty for an operator, and when changes occur it is easier for the operator to track them via a mediated system – but if there are only indirect indicators or sensors, then there is more scope for error and fewer opportunities to directly intervene to examine the source of uncertainty when there is reason to be uncertain. For example, if an individual has stopped to fill up her car with a full tank of petrol, and after 10 minutes of driving down the motorway the indicator on the dashboard shows that the car is running on empty, various assumptions need to be made to reduce uncertainty, not least because immediate action needs to be taken to respond to the situation. This example of uncertainty is based on conflicting information, because on the one hand the individual knows that the tank shouldn't be empty, but the indicator is showing that it is. Attempts to reduce uncertainty rely on using available
Magda Osman information to assess the events. So, the individual may possibly search through her knowledge of the functioning of the car in order to judge what may have caused a sudden loss of fuel. She may recall that in the previous week the car wouldn’t start and it has never had this problem before; therefore, diagnostic information is used to reduce uncertainty, and help to make judgements based on what evidence the system presents. In this example, there is less reason to doubt that the indicator (sensor) is wrong, but rather that the system itself is not functioning correctly (e.g., a faulty or leaky pipe). However, given the same set of events, but the added information that the sensor incorrectly indicated that the fuel tank was empty after just being filled, then the decision-making process would be different, and for good reason. Isolated instances of how a system operates can create rather than reduce uncertainty, and this places high demands on operators because they need to use appropriate information to assess underlying causes of atypical events in the system (Wickens & Dixon, 2007). Therefore, misjudging what information to use can create more uncertainty rather than reduce it.
A summary thus far …

It may be worth thinking about the general factors that contribute to uncertainty examined thus far. In their review, Lipshitz and Strauss (1997) discussed the factors that make control systems uncertain environments for us, the operators. They highlight two broad categories, based on properties of the environment (i.e., random fluctuations, probabilistic relationships between cause and effect, and non-linearity) or on perceived uncertainty (i.e., slips in attention, misinterpretation of information, biases in information processing, forgetting and stress). Any interaction with a control system will involve a combination of perceived and actual uncertainty. It is also important to stress that, even if the individual has complete information about the operations of the system, failing to accurately detect cause–effect relations or adopting poorly defined goals will make controlling the system seem impossible, as well as bias the
way information generated by the system is interpreted, which can also contribute to perceived uncertainty.
So, given what we can establish as factors contributing to uncertainty, can human–machine interaction be formally described?

After considering the ways in which the interaction between humans and machines contributes to uncertainty, we can now return to an important issue for human factors research. Sheridan (2000, 2002; Sheridan & Parasuraman, 2006) poses an important question for human factors research: if we can use formal models to provide us with an optimal allocation of human–machine functions, what should those allocations be? Ultimately, they will tell us what our role within control systems should be. Given what we know about human behaviour (learning, decision making and control) and what we know about machines (learning, decision making and control), now assume we could attempt to model the interaction. The question becomes more and more important as the level of automation within the systems that we interact with increases. The benefit of a formal analysis in answer to Sheridan's question would be that the advantages of human capabilities can be matched to those of the task demands, and likewise for the machine, and that by harmonizing them a control system can work at its optimal efficiency. In fact, Fitts (1951) proposed MABA–MABA ('Men are best at – Machines are best at'), a list of human and machine capabilities in air navigation and traffic control that attempted to address this very question. This is one of the earliest examples in which task analysis was used to break down the component parts of a given situation, and the skill sets of humans and machines. Since then, there has been much debate about the merits of prescriptive methods of analysis, and whether they should be used in human factors research as a means to evaluate systems design (Dekker & Woods, 2002). For one, the main criticism levelled at this kind of approach is that many assumptions need to be made to even begin to evaluate the various skill sets of humans and machines
that would be involved in any given problem (e.g., resizing a mobile phone while increasing its functionality). The assumptions made can often be arbitrary, and even if they are founded on empirical evidence it still doesn't ensure that the allocation of roles to machines and humans can be clearly defined. For instance, assigning the role of monitoring specific signals from a computer monitor to a human operator, which is a supervisory role, does not guard against further unforeseen roles that in turn require more analysis as to who or what to assign them to (Dekker & Woods, 2002; Hollnagel & Woods, 1983). These may be regarded as technical rather than fundamental issues, since problems of clarity can be surmounted through improved experimentation and data. In either case, such analyses should help to refine the specific questions about what goals need to be fulfilled, and the shared or individual tasks that need to be assigned based on the specific features of the situation itself (Vicente, 1999). Though, as has been highlighted, some issues can be tackled empirically, there are more deep-rooted concerns with the problem of formalizing interactions between humans and systems, for example: (1) it is not appropriate to use normative models to prescribe what operators should do in certain defined situations, because it simply can't be done (Sheridan, 2000); and (2) because the contributions the human operator makes to the control system cannot be reduced to formal descriptions, there cannot be a formal model of the interaction between human and machine, and therefore no normative prescription of operator behaviour can be made (Sheridan, 2002). However, Sheridan's arguments are not shared by all cognitive engineers. As a descendant of applied engineering, physics, statistics, electronics and biology, cognitive engineering has a clearly specified niche within the sociotechnical and human–technical systems design network of expertise (psychologists,
For example, the length of time humans can sustain the same level of attention when carrying out long sequences of actions before fatigue sets in (Banbury & Triesman, 2004), or what type of information on the visual display would be most attention grabbing (Egeth & Yantis, 1997).
engineers, physicists and mathematicians), and that is in evaluating and proposing work station design, the design of controls, manual tracking and the design of tools, as well as functional analysis of systems for human use (on which this current discussion is centred). Models of human interaction with control systems have been developed to simulate different human physical dimensions, as well as physical capacities,7 using Laplace transformations, block designs, information theory and transfer functions. So, from a cognitive engineering perspective, a block design is devised and the factors that contribute to the transfer function for each subsystem (e.g., human operator, technological devices and extraneous factors) are defined. There are other reductionist models like task network modelling (e.g., Wickens, 1984), which are block designs in which a goal is decomposed into a sequence of task components that are involved in a given human–systems interaction. The model represents the interaction as a closed loop which is formed between the various actions that need to be performed by a human and those that are generated by the system. For example, in selecting the camera function of a phone, the actions can be decomposed into the closed loop shown in Figure 5.1.
[Figure 5.1 is a flow chart of the task as a closed loop; its nodes are: pick up phone; select buttons to unlock the phone; look for camera icon on display; select the button to activate the camera; stop; with ‘mistake’ branches and a ‘find and press Back button’ step for recovering from errors.]
Figure 5.1 Task network model representing a human searching for a camera function on a mobile phone.
7 Human capabilities can be specified precisely in terms of laws (e.g., control laws, elementary co-ordination law and Fitts’ Law), or measurements that are used
To represent complex dynamic human–systems behaviour, in addition to decomposing the tasks, Micro Saint Sharp task network models are also used (e.g., Engh, Yow, & Walters, 1998; Lawless, Laughery, & Persensky, 1995). These models can become extremely elaborate and incorporate a wide range of interactive behaviours between humans and systems.8 The success of these types of models is that they intersect theory and practice. What these different types of formal model do is assume that human operators deal with information by following simple information-processing principles (Phillips, 2000). The assumptions can be more specifically stated as follows:
1. The human operator is physically realizable, meaning that characteristics of human behaviour must be reduced to real-world physical characterizations; otherwise, no formally tractable mathematical solutions can be offered.
2. The human operator model is linear and time invariant; while acknowledging that non-linear and time-varying properties exist, formal applications of linear control theory (see Chapter 3) mean that they must approximate linear time-invariant functions.
for formal modelling purposes (e.g., human force levels using hand grasps – power grasp male [400 newtons], female [228 newtons], key pinch male [109 newtons] and female [76 newtons]). For more details and discussion, see Fuchs and Jirsa (2008).
8 For example, these can include estimations of time distributions (e.g., Monte Carlo simulations are carried out to generate distributions of task performance times), mean time (e.g., defining the mean time taken to perform a component of the task), standard deviation (e.g., the standard deviation of the task performance time), release condition (e.g., determining when a task is executed to completion) and probability (e.g., estimations are made about decisions by assigning a probability to a human decision from a random draw of alternative decisions, each of which is weighted according to relevance, importance and salience; the weightings can be determined dynamically as the decisions are made in the sequence components of the task).
3. The approximation is to a SISO (single-input/single-output) system, for the reason that it is the simplest and most parsimonious model.
Within a closed loop in which a human has to, say, track a moving visual stimulus on a monitor (e.g., as with air traffic control systems), the description of the human operator is based on the above assumptions and physical laws (e.g., Fitts’ Law).9
For instance, a cognitive engineer will evaluate differences such as self-paced responses (choosing when to act) to events (e.g., changing lights on a monitor) compared with externally driven responses (having to react) to the same events. The value of doing this is to examine how quickly a human operator will react to a simple event in a control system given different motivational factors. This can be examined formally by quantifying the response time as a reaction plus a movement (e.g., action time in seconds [AT] = reaction time in seconds [RT] + movement time in seconds [MT]). This offers a first-order preliminary approximation of the human operator’s interactive behaviour with a system and of how he or she responds to changes in the states of the system. This also provides a powerful means of evaluating a system, especially since the interaction between human and system can be assessed according to (1) the stability of the system, (2) the system’s steady-state performance and (3) the transient response of the system (Phillips, 2000; Repperger, 2004).
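To make this kind of first-order description concrete, here is a minimal sketch in Python (the book itself gives no code). It combines the AT = RT + MT decomposition above with the Index of Movement Difficulty from Fitts’ Law (footnote 9); the movement-time coefficients a and b, and the example reaction time, reach distance and target width, are invented for illustration and would in practice be fitted to data from the operator and device in question.

import math

def index_of_difficulty(distance_a, width_w):
    # Fitts' index of difficulty: ID (bits) = log2(2A / W), as in footnote 9.
    return math.log2(2 * distance_a / width_w)

def predicted_action_time(rt_seconds, distance_a, width_w, a=0.05, b=0.12):
    # First-order approximation AT = RT + MT, where movement time MT is
    # assumed to grow linearly with ID; a and b are illustrative, unfitted values.
    mt = a + b * index_of_difficulty(distance_a, width_w)
    return rt_seconds + mt

# Example: a 0.25 s reaction time and a 200 mm reach to a 20 mm-wide button.
print(round(predicted_action_time(0.25, distance_a=200, width_w=20), 3))

A sketch like this is only a stand-in for the transfer-function models used in cognitive engineering, but it shows how a handful of measurable quantities can yield a testable prediction of operator response time.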
The advantages of formal descriptions of human–control system interaction
Phillips (2000) proposes that the formal models that describe the interaction recommend design changes that better fit the desired goals of the system. Though they may help inform theory, they are
9 Fitts’ Law, ‘ID (bits) = log2(2A/W)’, defines an Index of Movement Difficulty (ID) by analogy to information theory, where A = the distance of movement from start to target centre and W = the width of the target.
not designed for theoretical advancements, but are essentially there to facilitate practical solutions to problems of engineering design. This suggests that formal modelling of the interaction between human and system is slanted towards the system. Despite this kind of view, formal models of human behaviour and performance when interacting with complex real-world systems are used to ask a variety of important questions that human operators care about (e.g., what are the workload demands on the human as a function of system design and automation? How should tasks be allocated to optimize performance?). The worry that formal models still have some way to go before they truly capture the nature of the interaction is not shared by others (e.g., Jax et al., 2003). Jax et al. suggest that the gap between the current state of modelling and the goal to be achieved is diminishing.10 There is perhaps good reason for thinking this. Formal models can describe human behaviour in general terms, and by identifying the specific task components, they can do a reasonably good job of predicting the performance of the operator. For example, activity networks are functional models (Elmaghraby, 1977) used to indicate the arrangement of activities in a control system (e.g., sequentially or concurrently). They formally describe the transitions from one activity to another by arrows and rules (e.g., ‘If Condition, then Action’ or ‘If and only if Condition, then Action’). In the HCI domain, the critical path method–goals, operators, methods and selection rules (CPM-GOMS) technique uses specific biometric values (e.g., eye fixation duration, button press speed and scan path for tracking moving visual objects) to predict the nature of the interaction of the operator with a computer given a specific set of tasks (Card, Moran, & Newell, 1983; John, 1990). Order-of-processing
10 One reason that this claim is made is that general mathematical models that have been used to represent behaviours of the system and the operator can incorporate latent factors. Latent factors refer to processes and conditions which may lie dormant and only become evident when they combine with local triggering factors to generate outcomes (Reason, 1990b). The behaviours of control systems, as well as those of humans, serve as prime examples of latent factors.
(OP) diagrams (Fisher & Goldstein, 1983), which later developed into task network models, and Micro Saint Sharp task network models use decision trees to represent a path of actions/states based on a given task in order to predict performance. These and other models (e.g., signal detection theory: Green & Swets, 1966; information theory: Shannon & Weaver, 1949; associative networks: Anderson & Bower, 1973; and connectionist networks: Rumelhart & McClelland, 1986) have been applied to situations including mail sorting, vehicle collision warning systems, displays of web menu hierarchies, air traffic controller training and intensive care units (for discussion, see Fisher, Schweickert, & Drury, 2006). Another reason to take note is this: many of the limitations that have been used to criticize the efforts of formally modelling human–control system interactions (Dekker & Woods, 2002; Hollnagel & Woods, 1983; Sheridan, 2000, 2002) are well recognized by cognitive engineers. Formal models make various assumptions about the human operator, and while the descriptions are reductionist they offer practical benefits. These activities are aided by specific psychological constructs that can be used to measure mental capacity (e.g., short-term memory, speed in retrieval of long- and short-term memories, working memory load, divided attention and distributed attention, attention spans [uni-modal and cross-modal], and saccadic movements across static and dynamic visual stimuli). For task analysis this can be invaluable. So, let’s take a task in which the system demands that the operator focus on visual information from a computer and on auditory information (i.e., from another operator). The human operator may be required to respond to the other operator at the same time as responding to immediate changes in the information presented in a visual display. This may, under highly practised conditions, pose little problem for the operator, but may be highly problematic if the information presented visually conflicts with the information received from the other operator. Clearly, conducting a formal analysis of the demands that such scenarios place on mental capabilities has considerable applied advantages. Thus, it may be fair to say that formal analysis has its place in human factors research, and offers many benefits.
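To illustrate the style of simulation described in footnote 8, here is a minimal Monte Carlo sketch in Python of a toy task network loosely based on the mobile-phone example of Figure 5.1. It is not an implementation of Micro Saint Sharp or any other tool: the component names, time distributions and mistake probabilities are all invented, and a mistake is modelled crudely as sending the operator back one step.

import random

# Each task component: (name, mean seconds, sd seconds, probability of a mistake).
TASK = [
    ("unlock phone", 1.5, 0.4, 0.05),
    ("find camera icon", 2.0, 0.8, 0.10),
    ("activate camera", 0.8, 0.2, 0.02),
]

def simulate_run():
    # Walk the task network once, sampling a time for each component and
    # looping back one step whenever a simulated mistake occurs.
    total, step = 0.0, 0
    while step < len(TASK):
        name, mean, sd, p_mistake = TASK[step]
        total += max(0.1, random.gauss(mean, sd))
        if random.random() < p_mistake:
            step = max(0, step - 1)   # mistake: redo the previous component
        else:
            step += 1
    return total

times = sorted(simulate_run() for _ in range(10_000))
print("mean completion time:", round(sum(times) / len(times), 2), "s")
print("95th percentile:", round(times[int(0.95 * len(times))], 2), "s")

Even at this level of simplification, the output is the kind of distribution of task performance times that designers can compare across alternative configurations without testing every iteration on real operators.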
But, as is typically the case, we cannot draw a veil on this point just yet. What can be said of the formal models discussed thus far is that they do not describe human information-processing schemes per se; instead, they try to build on existing formal models of psychological processes including memory, learning, perception, visuomotor actions and decision making (Salvendy, 2006; Sheridan, 2002). This is treated as a limitation in much the same way as the fact that these same models are not able to truly describe the interaction between human and control systems – rather, they describe a type of implementation of behaviours. To draw the veil now, what I’d suggest is that the reason for these limitations is not a fundamental one, but a matter of progress. There appears to be a lag in connecting up the advances in psychology to engineering, as there is in connecting up advances in machine learning to human factors research. Nevertheless, formal methods simulate possible scenarios and predict performance for different configurations of composite tasks, and this can be used to inform design. They therefore have a critical role not only in engineering, but also in human factors research, because not all possible iterations can feasibly be examined, but they can be simulated through formal models. While they may be limited with respect to the assumptions they make, and at this stage cannot provide a complete formal analysis of the optimal allocation of human–machine functions, formal models contribute to system design and to assessments of human error. As promised, we now turn to the issue of error.
Error
Control systems engineering suggests that we cannot be fully aware of how systems work because systems have their own defined uncertainties (see Chapter 3) and are not faultless. Human factors research (Busby & Chung, 2003; Busby & Hibberd, 2002; Johnson, 2003; Sheridan, 2002) suggests that there are limitations to human capabilities and this can result in human errors. However, despite this, when developing systems, designers continue to make
assumptions about the operator’s capabilities, and when these are not met, the control system can fail. In addition, there are assumptions that the human operator tends to make of the system, and when they are not met the system can also fail. The differences between the expectations of the designer and those of the operator have been investigated (e.g., Busby & Chung, 2003; Smith & Geddes, 2003). The examples presented in Tables 5.1a and 5.1b are by no means an exhaustive list (for a detailed discussion, see Redmill & Rajan, 1997), but they are an indication of where gaps appear. The assumptions made from the operator’s perspective in general reflect ways in which the operator tries to minimize the amount of information on which decisions in the control system are based. Another contributing factor to the kinds of assumptions that operators make is that operators have typically accumulated extensive experience with the system and so feel highly familiar with it. This buffers operators from keeping the risk factors that often need to be considered active in mind, and leads them to behave as if the risk factors aren’t there because they are rare events. Although expertise is often the main reason for this, it is also the result of limitations in our cognitive processing. Because of the volume of information we need to attend to in order to make decisions, our processing capacity may end up stretched to the point that we fail to integrate all the information available to us, and fail to detect the many risk factors that could be indicated at any one time. There is also a sense in which our pragmatism prevents us from being alert to all the risk factors that come with interacting with a control system. That is, if we were aware of all potential areas of risk, our decisions would need to be constantly evaluated against them, and that would slow down our actions and the efficiency of our daily functioning with control systems (Redmill & Rajan, 1997). However, these issues concern the actual operator. Some (Dekker, 2005; Perrow, 1984; Reason, 1990a) propose that it is actually the types of assumptions made by designers, taken together with management and organizational factors, that constitute latent errors forming part of the pre-conditions for most human errors generated in control systems.
Table 5.1a Busby and Chung’s (2003) lists of unmet designer and operator assumptions categorized according to monitoring, operator knowledge, decision making, code of conduct and system behaviour
Text not available in the electronic edition
Table 5.1b Busby and Chung’s (2003) lists of unmet designer and operator assumptions categorized according to cues, system behaviour, designer and operator knowledge
Text not available in the electronic edition
Human error
How errors arise in humans has been the focus of much interest in human factors research (for detailed discussions on human error, see Norman, 1981; Rasmussen, 1987; Reason, 1990a; Salvendy,
2006). Human errors include situations in which there was an intention to act by performing a certain action, but the action may not have gone as planned and so resulted in an unwanted consequence (e.g., I accidentally pressed the Delete option instead of the Save button on my mobile phone), or the action went to plan but didn’t achieve the intended result (e.g., I pressed the Back button, but pressing this button actually initiated the mobile phone to save the text I’m drafting). As well as intentional actions, errors can result from situations where there was no prior intention to act (e.g., performing a routinized sequence of actions like pulling out your keys to lock a door that isn’t yours, which implies a slip, or lapse in attention). The broad taxonomy for classifying human error distinguishes among (1) omissions and commissions; (2) sensory, memory-based, decision-based and response-based errors; and (3) forced and random errors (Reason, 1990a). It is important not only to classify the types of errors that can be generated, but also to evaluate them. For instance, in safety-critical systems the evaluation of error takes one of two forms: one is broad and informal, and focuses on the latent errors and their influence on operators’ decision making (e.g., situation awareness); the other is formal and systematic, and is a specialized analysis of the causal factors that are involved in incidents and accidents (e.g., fault diagnosis). For the latter part of the discussion, errors will be considered with respect to task analysis (Boy, 1998; Parasuraman & Riley, 1997; Woods, 1996), situation awareness (Endsley, 1995; Smith & Hancock, 1995) and fault diagnosis (Duncan, 1987; Hammer, 1985). These are domains of research in which the above issues are considered with respect to incidents and accidents generated in control systems.
Task Analysis, Situation Awareness and Fault Diagnosis
Task analysis refers to the components of the task or goal with respect to space, time and function. The operator is considered
within the scheme of the task as well as the actual control system. This takes into account what the human needs to do in an automated system and what the system does anyway, where, at what time, and the actions needed to get to the goal of the system. Task analysis is the starting point in the design of systems, as well as the end point when establishing the cause of errors. This is because it outlines the desired goals of the system and the desired actions of the operator, which help to inform both starting and end points in the development of control systems. Therefore, as a method of evaluating error sources, task analysis can provide the first pass at detecting where the error may have occurred. Formal methods (activity networks: Elmaghraby, 1977; and signal detection theory: Green & Swets, 1966) have often been used to describe the composites of a task and to predict outcomes by simulating various combinations of the composites of a task. In this context, formal methods have been used to uncover potential biases in the way humans perceive information and their awareness of situations. Situation awareness concerns operators and their perception of the elements that make up the system in time and space (i.e., all aspects of the system and their own states and behaviours) and their expectations of themselves in a proximal future state (i.e., the next possible relevant situation that they will face). This is a general analysis that focuses on case studies where operators are examined according to what they know, how they’ve perceived their environment and how they think these influence their decision-making process. Here, then, is a case in which identifying errors that occur in the control system requires tracing the knowledge of the operator at the time the error is likely to have occurred. Finally, fault diagnosis is a technique that decomposes the multiple contingencies that are involved in, or contribute to, an event in the control system: for example, pressing the On button P(A) on the computer in the office, the computer starting up P(B), the day of the week P(C), the time of day P(D) and testing the power supply in the building P(E) (where P means probability). The individual probabilities of these separate elements are combined to form a fault
tree, which can be used to calculate the likelihood of various events occurring (e.g., the computer starting up after the On button is pressed) based on the contingent probabilities. This type of analysis has been applied to a variety of large-scale control systems (e.g., industrial systems: nuclear power plants; military: missile deployment; and aviation: automated pilot systems). Other fault diagnosis methods include the management oversight and risk tree (MORT) (Johnson, 1980), root cause analysis (RCA) (Ammerman, 1998; Dettmer, 1997) and the technique for human error rate prediction (THERP) (for a more detailed discussion, see Sharit, 2006). The aim of all these models is to quantify human error by attaching a probability to events; of course, whether assigning objective probabilities can really be achieved in the kinds of situations that are formally described is a matter of discussion. Nevertheless, the models perform probabilistic risk analyses, often using Boolean logic11 to depict the relationship between humans and the events that occur in the system. From this, the sequence of events is charted in order to determine the likely candidate causes of the incident/accident, or at least to establish the conditions under which the error occurred.
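As a hedged illustration of the probabilistic, Boolean character of fault tree analysis, the Python sketch below combines invented component probabilities through AND and OR gates to estimate the likelihood of a top-level failure (the computer not starting when the On button is pressed). The events, the backup battery and the numbers are hypothetical stand-ins for the P(A)–P(E) elements mentioned above, and the calculation assumes the events are independent.

# Basic-event probabilities (hypothetical values).
p_button_fault  = 0.001   # On button fails to register the press
p_mains_fault   = 0.005   # building power supply is down
p_battery_fault = 0.050   # backup battery is flat
p_disk_fault    = 0.004   # boot disk fault

def gate_or(*probs):
    # OR gate: the output event occurs if any independent input event occurs.
    p_none = 1.0
    for p in probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

def gate_and(*probs):
    # AND gate: the output event occurs only if all independent input events occur.
    result = 1.0
    for p in probs:
        result *= p
    return result

# Intermediate event: no power at all (mains AND backup battery both fail).
p_no_power = gate_and(p_mains_fault, p_battery_fault)
# Top event: the computer fails to start after the On button is pressed.
p_top = gate_or(p_button_fault, p_no_power, p_disk_fault)
print("P(computer fails to start) =", round(p_top, 4))

Methods such as MORT and THERP are of course far richer than this, but the sketch shows why the debate about assigning objective probabilities matters: every number fed into the tree has to come from somewhere.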
General issues concerning error and their remedies
While there are general characteristics of behaviour and circumstance that can be used to identify the presence of human errors, there is no agreement as to how to define human error. Some (e.g., Dekker, 2005) have called for the notion to be abandoned altogether. The most notable criticism of human error research is that attributing error to humans underestimates the influence of context, and in turn makes the assignment of cause and blame for unintentional actions problematic. For instance, take an example where an operator working in a confined work space reaches over to collect some tools but accidentally turns a switch on while performing the action. They have no knowledge of this because there is no indicator on
11 This is a system of logic which represents events in binary mode (i.e., ‘1’ or ‘0’, or ‘True’ or ‘False’).
the work station to show the switch is on. The switch has now activated an alarm that shuts off some operating machines, costing the organization time and money (Sharit, 2006). Who is to blame? Is it the context (i.e., the cramped room), or is it human error (i.e., the clumsy operator)? Thus, the assignment of cause to outcome failures in a system is important not only because such failures need to be avoided in the future, but also because of the insights they give us, revealing ways in which we cope well and badly with uncertainties that arise in control systems. Typically, suggestions as to how to remedy human error are commonsensical (e.g., extra training, more awareness of the contributing factors that lead to errors, and what to do when emergencies arise). Other suggestions are to bring the perspectives of designers and operators closer in line in order to overlap the two different frames of reference. Primarily this involves accepting that humans are not automata and therefore cannot be expected to perform repeated complex actions for sustained periods without errors (Degani, 2004; Sheridan, 2004). What is noteworthy is that the types of suggested remedies tend to reflect where the onus of responsibility for the errors should lie. But, because this is an ongoing issue, it is unlikely there will be a clear solution to the problem of making accurate causal attributions of error in control systems.
A Little Synthesis
Human factors research helps to draw attention to the role of the operator within control systems. The most obvious contributions that this perspective makes are in examining what contributes to error and recommending improvements to the design of control systems. The concerns about using formal methods to capture the interaction between human operators and control systems will remain, but while the issues still stand, formal models are being developed and used to provide insights into both endeavours. To highlight why this will continue to be an important issue for human factors research, and how this issue informs the target question of
Magda Osman this book, the final discussion returns to the problem of error. The reason for this is that errors are an important example of uncertainties created in a control system, and if we want to know how to control uncertainty, we need to know how to tackle errors. Put succinctly, the issue is this. I might notice that on occasion the foot of the puppet moves out of beat to the music. I need to know if the reason for this is because of something concerning the internal mechanism of the puppet, or because the puppeteer isn’t able to always keep to the beat of the music, because puppeteers in general don’t have the capacity to do this perfectly. It might be that the puppet forces the puppeteer ’s hand to move in a certain way, just as much as the puppeteer ’s hand forces the puppet to move. How can I address the issue? What should I do if I have no secure means of knowing how to judge the capacity of the puppeteer ’s skills in the first place, any more than I have secure means of judging if it’s a result of the internal mechanism of the puppet, or an interaction between the two? As we have seen, human factors research tends to look towards formal methods to help resolve this issue because they try to improve design and evaluate human–machine interactions. But, this is based on prescribing what should be done by human operators according to a particular standard (i.e., normative model) that human behaviour can be measured against. Specific types of cognitive activity, for example sensorimotor capabilities (e.g., Fitts’ Law), can in fact be used to inform models, and help to develop a normative model. But formal models can’t incorporate more general cognitive capabilities that humans possess. This is because currently there are no general models of cognition in psychology that are sufficiently able to capture human cognition. This poses a limit as to the normative model that can be used to characterize ideal human behaviour.12 Formal methods offer practical solutions and improvements in design of control systems. However, a tacit agreement 12
This is a critical issue in the study of human decision making and reasoning, and continues to fuel considerable research (for discussion, see McKenzie, 2003; Stanovich, 1999).
Human factors needs to be made to help decide where errors lie. When locating the source of error, and trying to stop its reoccurrence, we retrospectively compare the performance of the system and operator against an idealized standard. The debates within human factors research concern whether these normative standards are accurate and fair. Though perhaps it is not always sensible to conclude a chapter by raising more questions than one can answer, the questions do serve an illustrative point. So, imagine that an operator has developed an internal mental model (i.e., his own representation of the system) of the system he is interacting with. This internal model is based on as well as evaluated and corrected using the cues that are present in the control system itself (e.g., external devices such as display monitors). As is typically required, the operator must make a series of decisions which entail an action that is taken in the control system. The action in this case is made very quickly, and requires a simple combination of button presses, which the operator deliberately intends to make. However, the action is, to an extent, causally influenced by the control system, because an external cue indicates the state of the system and this in turn requires a particular response. Given this toy example of an interaction, a human factors researcher may ask the following questions: can the action of the operator be judged according to the cues presented by the control system? If so, should the control system be used as a basis for developing a normative model by which to compare human decisions and actions? What if, instead, the action is taken to be something that is chosen by the individual, and so one may ask, ‘Is the decision to act generated independently of the control system?’ Human factors research focuses on the interaction between humans and control systems, and so is characterized as bi-directional. This is why making causal attributions and assigning human agency to actions are complicated, because the interaction between human (H) and control system (CS) can often be described as ‘Action by H causes event in CS’ as well as ‘Event in CS causes action by H’. The operator is an element in an automatic feedback loop in the control 145
Magda Osman system. Any feedback system that can be viewed in this way can be viewed like a spring.13 Now imagine that in the simple interaction that has been described, the operator pressed the wrong button and triggered an unwanted series of events in the control system. Depending on where along the causal chain one wants to begin, the attribution of the fault can be made with the control system (or even the design engineer). Say, for argument’s sake, the operator was the cause of the fault. Assuming that the operator has a correct model of the control system, he may be able to correctly attribute the fault to himself. However, there are conditions in which he may incorrectly attribute it to the system. Conversely, as has been discussed previously, the operator may have an incorrect model of the control system and incorrectly attribute the fault to the system. But, it may be possible to incidentally attribute the fault to himself, which would still be a correct causal attribution. Though this is clearly an exaggerated example, the practical issue that human factors research deals with is how to prevent this incorrect button press from happening again. So establishing what information from the system is available to the operator, and what information is in the mind of the operator when he is making his actions, is of importance. But in addition to that, it is important to identify the cases in which there is misalignment between the model of the control system and the causal attribution, because in these cases the misalignment is the result of judgements concerning the operator ’s sense of agency. Judging one’s own sense of agency in reliably generating desirable outcomes in a control system is a theme that has received considerable attention in social cognitive research, and is one of the main subjects of the next chapter.
13 Springs begin to have velocity via a force effect; this effect causes a position change in the spring, and in turn the spring exerts an equal force in the opposite direction, and so on.
Chapter 6
Social psychology, organizational psychology and management
Until now, very little has been said about the goals that guide our behaviour when we learn about and control systems. The work that was discussed in the previous chapters focused on the mechanisms that enable a control system to work towards a goal. Recall that much of the work in engineering, AI and machine learning concerns how it is that control systems operate autonomously to achieve a particular assigned goal. But this leaves out a rather important element: our needs, desires and beliefs give purpose to our actions and help to bring about the actions that enable us to interact with control systems. As we proceed along the story of control, this is the first occasion in the book so far in which we consider the puppeteer in relation to the puppet and not the other way around. The focus of this chapter, then, is what motivates the puppeteer to do what he does. Without any goals to follow, there would be little reason to interact with the puppet, and in turn there would be little knowledge to be gained about the capabilities of the puppet
Magda Osman because there would be no purpose to attach to that knowledge. Therefore, what the puppeteer needs is will. From a social psychology perspective, our sense of agency is a goal-driven activity shaped by our wants, needs and desires (Bandura, 1989, 2001), and is rather a different interpretation of the phenomenon as compared to philosophy (see Chapter 2). In social psychology, agency and all associated components of it (e.g., goal setting, causality, motivation and self-efficacy – i.e., expectancy of successful goal-directed actions) play a central role in the empirical and theoretical efforts designed to understand control. So, in relation to addressing the target question ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’, social psychology presents an alternative perspective to approaching the question. What this domain of research draws attention to is that the matter of most interest is not the system, but the individual’s sense of agency and the goals pursued. It is for this reason that agency should take centre stage in the study of control behaviours. To reflect the special attention that goals and agency are thought to have, the organization of this chapter will follow along these lines. To start with, it will introduce the different factors that are thought to affect actions (e.g., motivational aspects, goal-setting processes, cognitive processes and of course agency). These different factors have each been associated with control, which the field of social psychology commonly refers to as self-regulation. Also, interleaved between work from social psychology will be references to studies from organizational psychology and management. As was discussed in the previous chapter, systems of people (e.g., companies, universities, police forces, army and navy) are all examples of control systems that are goal directed and require direction from an operator (e.g., a chief executive officer [CEO], vice chancellor, commissioner of police, general or admiral). The goals that the individuals choose, and their estimates of their success in generating goal-directed actions, will be the focus
Social and organizational psychology of this chapter, and the work discussed will be directed towards understanding how we regulate our learning and decision making in control systems. There has been a long history of research, and consequently there are a number of established theoretical positions. The chapter sets out the two main theoretical views on self-regulation: social-cognitive theory (Bandura & Locke, 2003) and perceptual control theory (Austin & Vancouver, 1996; Vancouver, 1996; Vancouver & Putka, 2000). They appear to be separated by one critical aspect, and that is whether or not we ought to use mechanical, biological and electrical systems (e.g., thermostat, cells and circuits) as analogies of human self-regulatory behaviour. Vancouver and Putka’s (2000) answer to this is in the affirmative. They propose that the way our internal dynamic processes help us to attain and maintain goals in the face of uncertainty map neatly onto early control theories from cybernetics (Ashby, 1947; Powers, 1989; Wiener, 1948). Bandura’s (1989, 2001) and others’ (Bandura & Locke, 2003; Locke, 1991) attitude to cybernetics is far less enthusiastic. They question the validity of this endeavour and propose an entirely different framework for understanding self-regulatory control behaviour in humans. Despite these differences, theories of self-regulation (Bandura and Locke’s social-cognitive theory, 2003; Locke and Latham’s goalsetting theory, 2002; and Vancouver and Putka’s control theory, 2000) agree on a number of basic processes (e.g., self-efficacy, monitoring and goal setting) that concern control behaviour. As will become apparent, these shared ideas hinge on recurring themes in this book which include uncertainty, dynamics and feedback.
Control: Components of Self-Regulatory Behaviours In a general sense, self-regulatory behaviours are the starting point to approaching any task in which (1) there is little performance history to guide the way in which it should be tackled or (2) the task demands change in such a way as to present new challenges
149
Magda Osman to the individual (Bandura, 2001; Karoly, 1993; Vancouver, 2000, 2005). Both represent conditions of uncertainty, and from the perspective of the social-psychological domain, estimations of personal achievement determine the way in which uncertainty is overcome. Self-regulatory behaviours are those that are involved in achieving desired outcomes and achieving desirable outcomes requires a method of estimating and evaluating success, both of which involve a feedback mechanism. Estimations of the success of self-initiated behaviours in a control system help to organize what actions should be taken. From this, the actual effects generated either are directly felt or else can be observed. In either case, the outcomes of actions are described as ‘fed back’ to the individual. This forms part of a complex dynamic interactive process between self-assessment of behaviours according to their impact on the world (i.e., agency), and self-assessment of the behaviour itself in relation to a desired goal. This type of feedback mechanism has been referred to as a negative1 for the reason that it alerts the individual as to what stage in the goalseeking process they are in. It then motivates the individual to try to reduce the error between desired and achieved outcomes. As a regulatory mechanism, some (e.g., Bandura & Locke, 2003) have gone as far as suggesting that negative feedback is not enough to explain our regulatory behaviours. Bandura proposes an additional feedback mechanism, feedforward, which works in concert with negative feedback. The feedforward mechanism incrementally increases the challenges set by the individual, so the reach is always slightly further than the grasp. One may already recognize the terms feedback and feedforward mechanisms and loops, and inclusion of these terms in understanding human control behaviour has encouraged some researchers to draw 1
There has been a long-standing debate concerning the exact definitions of negative feedback, and indeed whether it serves only as an error-reductive mechanism (Bandura & Locke, 2003; Locke & Latham, 1990), or in a broader sense is a mechanism that enables the attainment and maintenance of goals (Vancouver, 2000, 2005; Vancouver & Kendall, 2006).
150
Social and organizational psychology on cybernetics to help explain how we control the events around us. This primarily began with Powers (1973, 1978, 1989), but also more recently Appley (1991), Karoly (1993) and Vancouver (2005) have helped to make cybernetics fashionable, though not necessarily popular. Collectively, they claim that we have two interactive systems that serve different functions with respect to regulatory behaviours: (1) a regulatory system operates at a super-ordinate level in which there is a dynamic feedback relationship between human/organism actions and the environment, which has the function of controlling the outside world; and (2) a regulatory system operates at a subordinate level in which there is a dynamic feedback relationship between perceptions of self and outcomes of actions. This has the function of controlling our internal behaviours. The complex interplay between both types of regulatory system helps to explain the range of behaviours that we show when exerting control. Another group of self-regulatory theories that share similar ideas are expectancy value theories2 (e.g., Azjen, 1991; Maddux, 1995; Maddux, Norton, & Stoltenberg, 1986). At the heart of these theories are two concepts: self-expectancy and outcome reward expectancy. The first concerns a judgement of perceived behavioural control – that is, our capability in generating a set of behaviours that will succeed in producing a desired outcome. The second is the expectancy of reward – that is, the reinforcement value of the expected outcome. Expectancy value theories claim that behaviour
2
Expectancy value theory has its origins in reinforcement learning theory. One may recall that reinforcement learning (e.g., Dayan & Niv, 2008) describes the relationship between expectation of outcomes and their associated rewards with the choices and actions made in an uncertain environment. Theories that concern estimations of outcomes suggest that these are the driving force of behaviour in an uncertain changing environment, because our predictions about outcomes offer adaptive control. The success of this theory can be observed in its applications in machine-learning algorithms, control theory in engineering as well as human factors research. Moreover, its versatility will also be considered in relation to neuropsychological research, which is discussed later in this book.
151
Magda Osman is the product of both self-expectation and the value of the anticipated outcome. Expectancy value theories have more in common with economics than they do with social psychology, and so have not been developed to any great extent in social domain. This is also because the empirical findings on which expectancy value is based are reductive of our social behaviours. For instance, the theories have been supported by evidence from studies examining response– outcome expectancies (i.e., non-volitional responses, e.g., fear, sadness, elation and pain) (Kirsch, 1985). Moreover, another point of contention with more modern theories of self-regulation is that expectancy theories were too reductive in trying to develop their own notion of reinforcement (i.e., the extent to which an individual judges the reinforcement outcome to be conditionally dependent on her action)3 (Rotter, 1966). However, expectancy value theories have contributed to the current understanding of control because they drew attention to the idea of ‘perceived control’ (i.e., the relationship between perceived actions and desired outcomes) (Bandura, 1977, 2001; Thompson, 1991), better known as agency, and its association with judged probability of success, better known as estimations of success (i.e., McClelland, 1985). Both aspects of control have already been mentioned briefly in the discussion thus far, and will be revisited many times over, because they are the mainstay of many current theories of control. What we can take from expectancy value theories may be reductionist, but they have provided important insights into control behaviours, which have informed theories in cognitive psychology (see the next chapter). However, in the main, the social-psychological domain takes the view that in a complex control situation, examining people’s estimations of outcomes and their reward is not enough to explain or predict the 3
Though some have confused locus of control of reinforcement outcomes (Rotter, 1966) with perceived self-efficacy (Bandura, 1977), the former is a general assessment of outcomes resulting from self-initiated behaviours, whereas the latter refers to assessment of the ability to generate particular behaviours from self-initiated goal-directed actions.
152
Social and organizational psychology social behaviours that tend to be exhibited. This is because expectancy theories – and, for that matter, reinforcement theory – lack an account of the effects of particular types of motivational factors (i.e., volitional: social and personal; and non-volitional: emotional) that govern behaviour.
Motivational Factors Modern self-regulatory theories in social psychology would say that the outcome values are not separate from the complex motivational factors that influence regulatory behaviours. However, that is not to say that motivational factors refer to a single wellestablished concept. None of the theories discussed in this chapter have an agreed on approach to understanding motivational factors and their causal role in self-regulatory mechanisms. More to the point, for some theorists, the relationship between motivational factors and self-regulatory mechanisms is described as a causal chain beginning with estimations of self-efficacy affecting behaviour, which in turn affects outcome expectations, which in turn affect the outcome (Bandura, 1977). For others, the start of the causal chain is motivational factors (e.g., emotions, needs and values), and these help formulate the goals which affect the actions that are generated in achieving them (Locke, 2000). Alternatively, others (Powers, 1973, 1978, 1989; Vancouver & Putka, 2000) have proposed that there is a loop in which a stimulus serves as an input and is used to generate perceived judgements of behaviour, while the goals formulated guide the behaviours, and the effects of these behaviours are then fed back into the loop. The ongoing discussion as to which description most appropriately captures the relationship between perception of actions, motivational factors and actions is perhaps worth sidestepping. To simplify matters here, the following discussion is concerned with the causal relationship between motivation and action, and the causal relationship between cognitive factors and action, which are a little more clearly specified.
153
Magda Osman
Motivation and Action Locke’s (2000) work provides a systematic examination of motivational factors on social behaviour, and is a good place to start when trying to understand how they affect our control behaviours. He identified four types of motivation (needs, values, goals and emotion). Needs are the objective physiological and psychological demands of a human. Values are those things that a human identifies as personally relevant and beneficial. Goals are situationspecific things that might constitute the aim of the action. Emotion is the form in which an individual automatically experiences value judgements. Locke proposed that there was a causal chain starting with emotions, which set the conditions for an action to be pursued, followed by needs which help attach a value to the action; these in turn are followed by goal setting, from which actions are directed towards achieving a specific goal. Locke (2000) proposes that the different types of motivational factors affect actions of all kinds (social or otherwise) in the following ways: 1.
154
Actions can be affected by motivation during the selection process. In other words, motivation influences the information we choose, from which we then act. In this way, motivation also helps prioritize what actions need to be executed, and becomes an important arbitrator when we end up having competing goals. For instance, take the following simple example. Imagine that while driving home, we suddenly remember that we were supposed to pick up a birthday cake that we’d ordered. This means that we now have to plan a different route back to the shops to collect the cake, while still negotiating through the current traffic demands. The motivational aspects of this scenario help to assign the actions to meet proximal goals (getting out of current traffic demands, and driving to the cake shop in time before it closes) and then the target distal goal (getting the cake from the shop back home).
Social and organizational psychology 2.
3.
Motivation can also affect action according to the magnitude of effort that is put into achieving a goal. So, returning to the example, we check the clock on the dashboard of the car and it reads 4:40 p.m., so now there is a sense of urgency because we have to direct all our resources to making sure that we drive back in time to collect the cake before the shop closes at 5:00 p.m. Motivation affects the sustained effort in maintaining an action in pursuit of a goal. For example, if the birthday party is in a few days, then we have time to pick up the birthday cake on another day, but we may have a terrible memory, so we have to remain vigilant that we don’t forget to pick it up, so we then write reminders to ourselves that we must pick it up.
Even if there are concerns about how to determine which has the stronger influence over action, motivation or goals, goals are an important guiding force. Like reinforcement learning algorithms in machine learning (see Chapter 5) in animal psychology, human cognitive psychology and neuropsychology (Dayan & Niv, 2008). Put simply, goal-directed actions are essentially positive (reward seeking) or negative (danger avoiding). In social psychology, the context in which goals are examined is more elaborate, but they are still reducible in this way.
Goals
The influence of goals on behaviour has been of particular interest in social-cognitive research, especially in connection with the development of skilled behaviours in simple and complex environments. The main proponents of this research have been Locke and Latham (1990, 2002) and Austin and Vancouver (1996), and many of their theoretical claims concern how we set goals and how we manage our performance (e.g., effort, time and resources). Goals can be defined in terms of their simplicity or complexity, and they can be described according to whether they are specific or non-specific. To illustrate, if we are managers of a small company
Magda Osman and want to increase our profits, we may eventually decide that the best way to do this is to sell more of our products (e.g., office supplies). The task of doing this may be easy because all we need to do is increase production of all types of office supplies (staplers, pens, desk tidies, etc.), or it can be made difficult because we need to decide on exactly which items will sell better than others. The goal can be specific (specific goal, or SG) by defining exactly what the outcome should be (i.e., how much we want to increase our profits by) and when (i.e., for the next financial quarter), or it can be less specific (non-specific goal, or NSG) by simply deciding that we just want to increase profits without saying by how much, or exactly by when. There is much evidence to suggest that people evaluate their performance through the goals they set themselves (Bandura, 1989; Cervone, Jiwani, & Wood, 1991). In difficult and complex tasks (i.e., the task has multiple informational cues and makes dominant strategies difficult to discover), poorer performance is associated with lower judgements of expectancy in generating successful outcomes. Additionally, during situations in which people learn about the task by pursuing SGs, they are quick to change their strategies multiple times, rather than just focusing on developing a single successful strategy (e.g., Cervone, Jiwani, & Wood, 1991; Earley, Connolly, & Ekegren, 1989; Earley et al., 1990). This is because overall the task is harder to do (e.g., sell enough products and make a profit, while keeping costs of production down), the solution is more likely to be obscured, and in a highly uncertain situation of this kind, the conditions in which a solution apply keeps changing. In simple tasks (e.g., maintaining the same production levels as the last quarter) or tasks in which the conditions may be complex but there is less uncertainty (e.g., keeping sales up to the same level as the last quarter), people tend to already possess the relevant rudimentary knowledge needed to perform the task, and so SGs and difficult goals serve to enhance people’s existing ability (Huber, 1985; Locke & Latham, 1990). In Cervone, Jiwani, and Wood’s (1991) psychological experiment, people were presented with a complex decision-making task in 156
Social and organizational psychology which they imagined they were managers of a business organization. People had to assign employees to jobs and were given feedback about the success of their decisions based on the number of hours their assigned employees took to complete their weekly orders compared to a standard criterion. This was manipulated by controlling when and if employees received instructive feedback or social rewards. There were three experimental conditions. People in the NSG condition were told to ‘do their best’ by producing the weekly orders as efficiently as possible, those in the moderate SG condition were told that they should take no longer than 25 per cent of the standard criterion, and those in the difficult SG condition were told that they should complete the weekly order within the standard criterion. During the task people were asked to give estimates of how well they thought they were performing, as well as their confidence in meeting the criteria they were presented. In addition, they also gave information about the self-set goals they were working towards, and how dissatisfied they were in their performance. Cervone, Jiwani, and Wood (1991) showed that selfevaluative ratings and performance were higher in both SG conditions (moderate and difficult) compared to the NSG condition. They claimed that when people work towards well-defined goals, they react evaluatively to their accomplishments and use task feedback to assess their capabilities and future performance. Studies examining the mediating effects of different set goals on self-evaluative processes have also revealed that difficult SGs lead to increases in cognitive, physical and motivational effort, which in turn can lead to increases in performance compared to vague instructions like ‘Do your best’ (Huber, 1985; Locke et al., 1981; Mento, Steel, & Karren, 1987; Wood and Locke, 1987). In complex management decision-making tasks in which the control system (i.e., people) is a dynamic one, and there is no obvious dominant strategy for controlling the outcome, studies have shown that SGs impair performance compared to NSGs (Chesney & Locke, 1991; Earley, Connolly, & Ekegren, 1989). In management tasks of this type, the early introduction of difficult SGs can impair people’s ability in successfully controlling the outcome, but when introduced 157
later this can actually facilitate performance (Kanfer & Ackerman, 1989). Moreover, feedback about how successful the outcome is (i.e., outcome feedback) is sought out more often in SG training situations than in NSG situations (Huber, 1985). When goal specificity and outcome feedback are varied, the more difficult the task and the more specific the goal, the more that feedback on the success of the strategies applied to the situation (i.e., process feedback) benefits people and increases their performance compared to outcome feedback (Earley et al., 1990; Neubert, 1998). The same findings have been reported in studies examining the impact of task difficulty, feedback and goal setting in real management and organizational situations (e.g., Vigoda-Gadot & Angert, 2007). It is important to bear in mind that there is a delicate interplay between uncertainty and goal specificity, and this has been the basis for much research on the strategies that are developed to control the outcomes we want in a variety of situations. To understand this, we need to consider how our cognition influences our actions.
Cognition and action
In addition to motivational factors, there are cognitive factors that affect actions: evaluative thinking, decision making and self-efficacy. Evaluative thinking is what Locke (2000) describes as a strategy for considering the discrepancy between the desired goal and the current state we are in (e.g., an example of a negative feedback process). One might be forgiven for thinking this is a familiar claim, since it was developed by the expectancy theories discussed earlier. In social psychology, decision making is treated as a strategic process which helps to identify the actions that can reduce the discrepancy between desired and achieved outcomes. The evaluative nature of decision making comes from weighing up alternative courses of action and assessing them in terms of their appropriateness or efficiency in achieving the goal. Locke (2000) also describes self-efficacy as a cognitive factor because it is the conviction that an individual has in performing a certain skill to a certain level. Again, like decision making, self-efficacy involves an
evaluative component, because the individual performs a self-assessment of his or her capability in achieving the goal-directed behaviour. Drawing a distinction between decision making and judgements of self-efficacy is not strictly necessary, and is certainly not adhered to by all theorists. Bandura (1977), the main proponent of the phenomenon of self-efficacy, presents an alternative description of this construct. Rather than describing cognitive factors as influences on actions separate from motivational factors, Bandura suggests that motivation is the activation and persistence of behaviour, but is based in part on cognitive activities. By cognitive activities, he refers to information processing. So, motivation is informed by cognitive activities such as expectancies of outcomes. Recall that this was also an important component of expectancy value theories. In addition to developing expectancies of outcomes, our capacities for self-evaluation and goal setting are important cognitive processes that inform motivational factors. Representing future outcomes provides the incentive to carry out the action because, given the expectancy estimate and the reliability of making an accurate prediction, the individual will then either carry out the action or not.4 Similarly, self-rewarding goal-directed behaviours are conditional on achieving a certain standard of behaviour, and this has a motivating effect. This conditional self-inducement propels individuals towards maintaining their efforts until their performance matches their self-set standards. The complement to this is perceived negative discrepancies between performance and standards; these create dissatisfaction, which motivates us to correct our behaviour. Bandura (1977, 1989, 2001) proposes that all psychological processes are designed to increase and strengthen self-efficacy, and this
4 Like Bandura's work in 1977 and since (Bandura, 1989, 1991, 2001), current work in neuroeconomics, a field of research that combines psychology, economics and neuroscience to examine learning and decision-making behaviours in uncertain environments, is particularly concerned with the relationship between expectancy and reward and how it influences choice behaviour, though neuroeconomics does not examine self-efficacy (for more details, see Chapter 8).
is because it is the method by which we act on and control our environment. Therefore, self-efficacy is concerned with the estimation of how successfully one can execute a course of action designed to deal with a prospective situation, and for this reason it precedes outcome expectancies. People can estimate that an outcome will occur given a set of actions, but if estimations of self-efficacy are low because the individual doubts his or her ability to execute the actions in a particular fashion, then there will be little motivation to carry out the actions. Alternatively, if estimations of self-efficacy and of the outcome are misaligned, so that self-efficacy is high but the estimated likelihood of the outcome occurring is low, then the same effect on motivation, and in turn the same lack of action, will be observed. The main idea is that estimations of self-efficacy and outcomes will change, because internal judgements of one's capabilities will change5 and also because the environment in which the actions occur is uncertain and will change. Alternatively, Vancouver, More, and Yoder (2008) and Olson, Roese, and Zanna (1996) describe self-efficacy as a type of expectancy related to an individual's belief that he or she can execute the actions necessary to achieve a goal. Other related social domains such as organizational-psychological and management research have adopted a similar perspective on self-efficacy in a host of contexts. For instance, Gist and Mitchell (1992) refer to self-efficacy as an individual's estimate of his or her capacity to co-ordinate performance on a specific task, and Jones (1986) describes self-efficacy as influencing people's expectations about their abilities to perform successfully in new situations. In general, then, in different circles of social psychology research, self-efficacy is thought to be directly related to people's perceptions of their success in dealing with past situations and expectancies of future successful outcomes.
5 Bandura proposes that there are four different sources of self-efficacy that are likely to direct changes in perceived self-efficacy: performance accomplishments (e.g., self-instructed performance), vicarious experience (e.g., observing another performing actions with desirable rewards), verbal persuasion (e.g., self-instruction or suggestions) and emotional arousal (e.g., attribution or symbolic exposure).
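The interplay described above, in which action requires both a sufficiently high outcome expectancy and a sufficiently high estimate of self-efficacy, can be made concrete with a toy calculation. The Python sketch below is purely illustrative and is not a model proposed by Bandura or in this chapter: the multiplicative combination rule, the threshold and the numbers are assumptions introduced only to show how either estimate, when low, suppresses the motivation to act.

```python
# Toy illustration (not Bandura's model): motivation to act as a joint
# function of self-efficacy and outcome expectancy, both scaled to [0, 1].

def motivation_to_act(self_efficacy: float, outcome_expectancy: float,
                      threshold: float = 0.25) -> bool:
    """Assume a simple multiplicative combination: if either estimate is
    low, the product falls below the threshold and no action is taken."""
    return self_efficacy * outcome_expectancy >= threshold

# High expectancy but low self-efficacy: no action (the person doubts their ability).
print(motivation_to_act(self_efficacy=0.2, outcome_expectancy=0.9))  # False
# High self-efficacy but low outcome expectancy: also no action.
print(motivation_to_act(self_efficacy=0.9, outcome_expectancy=0.2))  # False
# Both reasonably high: the action is carried out.
print(motivation_to_act(self_efficacy=0.8, outcome_expectancy=0.7))  # True
```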
Regardless of the nuances surrounding self-efficacy, in the social, clinical, educational and organizational domains there is considerable evidence to suggest that estimations of self-efficacy are a reliable and superior predictor of behaviour compared with estimations of the outcomes of behaviour. Measures of self-efficacy have a substantial applied benefit if they can track success in performance above and beyond any other measure. For example, empirical studies have reported that judged self-efficacy predicts work-related performance (Gist, 1989; Gist, Schwoerer, & Rosen, 1989), coping with difficult career-related problems (Stumpf, Brief, & Hartman, 1987), self-management (Frayne & Latham, 1987), educational performance (Campbell & Hackett, 1986; Wood & Locke, 1987), adaptability to use new technology (Hill, Smith, & Mann, 1987), overcoming phobic behaviours (Williams, Kinney, & Falbo, 1989), experiences of pain (Litt, 1988; Manning & Wright, 1983), regulating motivation in sport (e.g., Moritz et al., 2000), affective processes (e.g., Bandura & Cervone, 1986; Elliott & Dweck, 1988; Spering, Wagener, & Funke, 2005), and decisional processes (e.g., DeShon & Alexander, 1996; Earley, Connolly, & Ekegren, 1989; Kanfer et al., 1994; Tversky & Kahneman, 1974). Moreover, badly calibrated self-efficacy has been shown to mediate problem solving so that it produces poor performance, irrespective of people's actual capabilities (Bandura & Wood, 1989; Bouffard-Bouchard, 1990; Hogarth et al., 1991; Wood & Bandura, 1989). Equally compelling is evidence that increasing people's belief in their self-efficacy guides attentional processes so that, in problem-solving tasks, people's accuracy in detecting and analysing solutions to problems can be radically improved (e.g., Bouffard-Bouchard, 1990; Jacobs, Prentice-Dunn, & Rogers, 1984). Clearly, self-efficacy is an important component of all social and related domains because it signals what behavioural change will occur, and when. It involves a dynamic mechanism that weights the relationship between perceived expectations of self-generated goal-directed behaviours and the demands of a changing control system (Sexton, Tuckman, & Crehan, 1992). Some would go so far as to say that self-efficacy is the basis of control behaviours (Ajzen, 1991), while others suggest that self-efficacy is our sense of agency (Bandura,
2001; Cervone, 2000; Gecas, 1989) or, as some call it, personal causation in the world6 (DeCharms, 1979; McClelland, 1975). To build up an understanding of how motivation, self-efficacy and goals relate to our ability to control events in uncertain environments such as control systems, we need to examine theories of self-regulation.
Theories of self-regulation

Theories of self-regulation really only consider two types of processes: those that advance goal-directed behaviours in such a way that the goals that were set are challenged and exceeded, and goal-directed behaviours that reduce the discrepancy between the goal we set and our current distance from it. These different functions of goal-directed behaviours can apply to a wide range of situations. Many of the empirical studies designed to test theories of self-regulation have used a variety of simulated management and applied control system situations. What now follows is a sketch of the main proposals that the theories make, along with the main claims that distinguish them.
Social cognitive theory

The theory is concerned with explaining the factors involved in goal-directed behaviours. This, of course, includes every behaviour that is designed to generate a specific effect. For this reason, self-regulation is at the heart of Bandura's (1989, 1991, 2000, 2001) social-cognitive theory because it is the driving force of virtually all our behaviours. It is the basis for setting the goals we aim to reach, and helps track and adjust our behaviours according to them. As
6 Gecas (1989) discusses self-efficacy in the specific context referred to in this chapter (which is as an assessment of effectiveness, competency and causal agency) but also examines the broader context in sociology and philosophy (Marx, 1844/1963; Mead, 1934), where it is understood as expectancy as well as perception of control.
discussed earlier, there are two self-regulatory mechanisms (i.e., reactive control and proactive discrepancy) proposed by Bandura (1989, 1991) that help to evaluate and advance our actions towards a goal (Bandura & Locke, 2003; Karoly, 1993). The first works through error detection and correction. People adjust their behaviour by monitoring how close or far away they are from the desired goal; this is reactive control because people are responding to the goal to be achieved. Alternatively, by incrementally setting more and more difficult challenges, goals are also met, and even exceeded, through a process of proactive discrepancy, in which actions are designed to meet continually rising goals. Bandura's theory proposes that we monitor ourselves, and this involves making online judgements about our behaviour and its relationship to the pursuit of goals. To master any environment or to develop any skill, monitoring behaviours are essential because we need an internal regulator that evaluates and tracks the actions and effects that are produced. Without a status check, we can't know how we are doing and what we need to do to improve. The main point that Bandura makes in his work is that we are not only subject to influences from the environment, that is, we do not merely see a stimulus (it rains) and simply respond to it (we pull our umbrella out). Our actions constitute more than this because we exercise control over what we do, make choices, and act in ways that are self-willed. So, one can view proactive discrepancy as the self-willed propeller behind our actions and reactive control as the external world prompting us to act. Interestingly, despite the many criticisms that Bandura and colleagues (Bandura & Locke, 2003; Locke, 1991) have levelled at perceptual control theory (Austin & Vancouver, 1996; Vancouver, 1996, 2005; Vancouver & Putka, 2000), the very same crucial point is made by Vancouver: what we should be most interested in is the ongoing changes in our behaviour and how they are modulated. So, rather than focusing on manipulating the environment and examining what effects that has on our behaviour, as is typically the favoured method in cognitive psychology, instead we
Magda Osman should examine the effects we produce on the environment and how that changes our behaviour.7 To have an idea of how we develop goal-directed behaviours designed to affect particular changes in the environment (broadly, any control-based system), we need to be less concerned with the stimulus mapping (i.e., what happens in the world and what we do), and more concerned with the response–stimulus mapping (i.e., what we do and the effect it has on the world). This is the concern of perceptual control theory.
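Before turning to that theory, Bandura's two self-regulatory mechanisms, reactive control and proactive discrepancy, can be sketched procedurally. The Python fragment below is a minimal sketch, not a formal statement of social-cognitive theory: the step size, the tolerance and the goal increment are assumptions chosen only to contrast reactive control (closing the gap to a fixed goal) with proactive discrepancy production (raising the goal once it is nearly met).

```python
# Minimal sketch contrasting the two self-regulatory mechanisms described above.
# The step size (0.5), tolerance (0.1) and goal increment (1.0) are illustrative assumptions.

def self_regulate(performance: float, goal: float, proactive: bool,
                  steps: int = 20) -> tuple[float, float]:
    for _ in range(steps):
        discrepancy = goal - performance
        # Reactive control: adjust behaviour to reduce the perceived discrepancy.
        performance += 0.5 * discrepancy
        # Proactive discrepancy production: once the goal is nearly met,
        # set a more demanding goal instead of stopping.
        if proactive and abs(goal - performance) < 0.1:
            goal += 1.0
    return performance, goal

print(self_regulate(performance=0.0, goal=5.0, proactive=False))  # converges on the fixed goal
print(self_regulate(performance=0.0, goal=5.0, proactive=True))   # performance keeps climbing as goals are raised
```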
Perceptual control theory

Perceptual control theory has been used to describe low-order functions like motor-control behaviours (e.g., how we move our hand to type a word), as well as high-order functions like those described in this chapter, which are involved in operating a complex system. Regardless of the scale of the control system in which control behaviours are being made, perceptual control theory proposes that goal setting is the essential first step in formulating goal-directed behaviours. The way in which goals are maintained is based on estimating the difference between the perceived current state of the environment (this could refer to our internal states, as well as external states in a given task we are engaging with) and the desired state (the goal). This estimation is designed to produce ways of counteracting any disturbances that may influence the variable that we are trying to control (e.g., a company's output, or a passenger jet's flight path). The actions that are then produced generate effects on the variable, and these effects are then fed back and used to update the perceived current state of the environment; this process iterates many times until the goal is reached. Otherwise, if the effort required to reach the goal becomes too high, then all goal-directed behaviours are terminated. Actions, then, are designed to reduce the discrepancy between the current state and the desired state, and so self-regulatory behaviours
7 This proposed direction in examining behaviour will be explored in Chapter 7, 'Cognitive psychology'.
are part of a feedback loop similar to many of the cybernetic models proposed by Wiener (1948) and Ashby (1947). The main difference between these early models and perceptual control theory is that, through Powers' (1973, 1978) and Vancouver's (Austin & Vancouver, 1996; Vancouver, 1996, 2005; Vancouver & Putka, 2000) extensions to psychological phenomena, the variable we aim to control is transformed into an internal signal (i.e., we need to represent the thing that we aim to control), so we aren't controlling the variable itself, but instead the perception of the variable (Yeo & Neal, 2006). This is why the theory is referred to as perceptual control. Nevertheless, the effects on the variable are detectable and measurable, and so we can still detect our effects on the environment from the actions that we generate. One of the main criticisms that Bandura and colleagues make concerning perceptual control theory is that its primary self-regulatory mechanism is error reduction, and so control behaviours are mediated by negative feedback, whereas in the case of social-cognitive theory, a feedforward mechanism is also used to incrementally advance towards goals in a positive way. This criticism has been acknowledged by Vancouver (2005); and although prediction is not explicitly described in Powers' (1973, 1978) original theory, prediction is part and parcel of any control theory because the aim is to control uncertainty. This comes close to what Bandura and colleagues construe as a feedforward mechanism. Predicting the possible outcomes that result from our actions is integral to the error reduction self-regulatory mechanism. We use predictions to reduce the uncertainty of events, by evaluating the value of those events in terms of the focal goal and other subsidiary goals. The confusions concerning what negative feedback is, and its relative merits in producing behaviours that actively extend the reach of the goal, as well as simply attaining it, have been discussed at length elsewhere (Richardson, 1991). For the purposes of this discussion, whether or not there is a genuine feedforward mechanism in perceptual control theory, or whether prediction is in fact equivalent to feedforward (in engineering circles it would be), prediction is a process that has not been specifically named until now, but is also
integral to most of the discussions thus far. Estimates of future outcomes from our behaviours, or evaluations of our actions in relation to a goal, all involve prediction – we reduce the uncertainty of the future because we want to determine a specific event in it. Bandura's (1977) early work, which set the stage for much of what later came to be known as his social-cognitive theory, was a reaction to simple reinforcement models. He introduced the idea that we rely on a self-serving mechanism (self-efficacy) to direct our actions. Vancouver (1996, 2000, 2005) takes the issue of goal setting and places it in the context of cybernetic theories of control to explain goal-directed behaviours. Though they may have different histories, and the latter theory more obviously connects to the theme of this book, the descriptions of human behaviour in social contexts have led to many practical applications beyond the sciences, into the realms of the management and organization domains, education, and clinical practices. To draw these issues neatly to a close here, it is worth acknowledging that both theories explore the components of goal-directed behaviours in terms of two very basic ideas: prediction (feedforward – the anticipation of events for goal advancement) and control (feedback – discrepancy reduction between the current state and desired goal). Both will have an important bearing on the answer to the title question of this book, and will be discussed again in the next chapter.
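Before moving on, the negative-feedback loop at the core of perceptual control theory (and of the cybernetic models it descends from) can be sketched in a few lines of code. The Python fragment below is a schematic illustration rather than an implementation of Powers' theory: the gain, the size of the disturbance and the effort cut-off are invented for the example, but the structure follows the description above: compare the perceived state with the desired state, act to counteract the discrepancy, feed the effect back as an updated perception, and abandon the goal if the effort required becomes too great.

```python
import random

# Schematic perceptual-control loop: the controller acts on its *perception* of the
# variable, counteracting disturbances it never observes directly. The gain, noise
# level, tolerance and effort limit are illustrative assumptions.

def control_loop(desired: float, perceived: float, gain: float = 0.6,
                 max_effort: float = 50.0, steps: int = 100) -> str:
    total_effort = 0.0
    for _ in range(steps):
        error = desired - perceived              # discrepancy with the desired (reference) state
        if abs(error) < 0.05:
            return "goal reached"
        action = gain * error                    # act to counteract the discrepancy
        total_effort += abs(action)
        if total_effort > max_effort:
            return "goal abandoned: effort too high"
        disturbance = random.gauss(0.0, 0.2)     # autonomous change in the environment
        perceived += action + disturbance        # the effect is fed back as an updated perception
    return "still regulating"

print(control_loop(desired=10.0, perceived=0.0))
```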
A little synthesis

Given the alternative perspective that social-psychological research has offered in terms of addressing the target question posed in this book, the synthesis of this chapter is more detailed. Also, at this juncture in the book, the following discussion aims to integrate many of the ideas raised here and across the previous chapters by deconstructing the target question, 'How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?' into two: (1) should we have good reason to tacitly assume that we have the ability to choose our actions independently of what the environment
may require us to react to? And (2) should we assume that we can affect the environment through our actions, and therefore change it in a way so that it behaves differently from what it would do without our intervention? The first deconstructed question essentially refers to the point about whether as the puppeteer we are in control of how we make the puppet dance, or if there are other underlying factors that are the real driving force (e.g., the music, the rules of dance, or the capacity of the puppet to move in a certain way). The question also concerns an issue that Bandura (1989, 2001) has discussed on many occasions in his work. He raises the point that perhaps our sense of agency is derived not from generating actions that are reactive to external influences (as might be the case with machines). Instead, agency is a special capacity only reserved for humans, because it has a different causal influence over our own thoughts and behaviours from the mere reactionary mechanisms that machines have. To clarify, in regard to the kinds of control behaviours that have been discussed in the chapters on engineering, human factors, cybernetics and particularly machine learning, one could argue that they are self-serving self-regulatory systems. Moreover, if we take the view that any goal-directed behaviour needs some method of assessment for judging the success in achieving a desired goal, and deciding what further actions need to be taken to generate that goal, then all machines (artificial and otherwise) require self-regulatory behaviours. In particular, given the fact that many machines (artificial and otherwise) demonstrate learning, problem solving, reasoning and decision making, this suggests that there is some sort of internal mechanism that drives the agent to behave in a certain way. Therefore, we can attribute agency to them in the same way we would to ourselves, correct? Bandura says that we could do this, but only if we accept a limited and fairly extreme view of agency (see Bandura, 1989, for discussion). His argument goes somewhat like this. The agent has internal representations of the outside world, and like us could have some sort of neurochemical-based internal system that represents them. Outside influences determine the necessary set of actions needed to act by triggering rules that are
internally represented, and that in turn initiate set plans of action. So far so good; the artificial agent is very much like us. But here, Bandura argues, if we want to call this process some sort of self-agency, then it is merely illusionary – that is, the illusionary by-product of a reactive system that initiates actions that are only triggered by external events. But, a machine-learning theorist might ask, 'What if the artificial agent learnt to respond to different stimuli not originally part of its design – won't agency be demonstrated?' Bandura's response is that the adapted behaviour is adaptive only because of an appropriate plan of action within the global range of actions that the artificial agent was designed to pursue (David, Newen, & Vogeley, 2008). The behaviour was not achieved through a self-willed decision, because the agent has no self-initiated causal efficacy over its actions. To illustrate Bandura's point with a simple control device, imagine that as you are reading this, your mobile phone rings. You exercise a series of goal-directed actions, which typically follow a well-rehearsed sequence of behaviours: you look for the phone, you pick it up, you look at the number or name displayed on the screen, you look for the button to press to answer, and then you either speak first or listen to the voice on the other end of the phone. You can do this without any need for self-reflection because the external world triggered your actions. In the same way that a motion detector will turn on the light in a room as soon as you enter it, a specific environmental condition (motion) initiates a response (turning on the light). We may feel a little uncomfortable about the view that a motion detector can be construed as having agency in the same way as a human can, because agency is more than just a reactive initiation of actions. This is Bandura's point. We have good reason for thinking that human agency is not reducible in the same way. Agency, at least in the case of humans and perhaps animals, involves control over our internal mental activities, and this executive control has causal properties which affect internal (thinking processes) as well as external (actions we choose that generate effects) events (Bandura, 1989, 2001). It is this point that Bandura argues sets us apart from
machines, because he proposes that it implies that there is a type of self-referential thinking (metacognition – we can think about our own thinking). The crucial point that many theorists (Bandura, 2001; Bunge, 1977; Eisenberg, 1995) in the social psychology domain make is that what matters about thinking, especially the kind that directs actions, is not the physical substrate (neurochemical and neural circuitry) on which it is founded, but rather its functional use in terms of its effects on mental life and the external world. Through Bandura's arguments we have established that we have the capacity to evaluate our actions and reflect on them, and this maintains our sense of agency because we believe that they can affect behaviour (see Chapter 2 for a discussion on agency). From this, Bandura (2000) argues that there are three different forms of agency – personal, proxy and collective. Personal agency includes intentionality,8 forethought, self-reflection and self-activeness, all of which hang on the central idea that we have the power to generate actions for specific purposes that we identify as significant to us. Proxy agency refers to social conditions in which there are many activities that involve practices with institutional norms of behaviour. The main idea behind proxy agency is that, unlike personal control over our actions, there are desired outcomes that cannot be achieved directly without the help of parents, partners, etc., who are mediators for reaching particular goals. The burden of control is shifted from the individual to a proxy in order to achieve socially desirable outcomes. Collective agency is a further acknowledgement that as social creatures our values, wants and desires are not only personally defined but also socially defined. Examples in which collective agency is found are social networks or groups in which responsibilities for actions are shared across the group. For example, take the collaborative efforts between systems (e.g., corporations, air traffic control and subway systems) and the many operators involved in
8 Intentionality and its relationship to consciousness and agency comprise a considerably large issue in psychology, too large to be discussed in any great detail in this chapter, and so for a more detailed discussion of the issues the reader should consult Dienes and Perner (1999) and Wegner (2004).
interacting with them. This often requires management strategies in which the locus of control (i.e., internal or external) is clearly specified (Kren, 1992), and protocols are designed to generate collaborative decision making in training to operate control systems (Shebilske et al., 1998; Shebilske et al., 1992). These are set in place to enable effective communication across networks of people that converge in their activities to reach and maintain specific desired goals in a large organization (e.g., Hoc, 1993; Inagaki, 2003; Moray, Inagaki, & Itoh, 2000). In answer to the question 'Should we tacitly assume that we have the ability to choose our actions independently of what the environment may require us to react to?', Bandura (1989, 2000, 2001) proposes that indeed we do act in purposeful ways, and that our actions are not merely the product of reacting to external events; they are self-governed as well as socially mediated. However, there are notable critiques of this view which suggest that even if we have an explicit belief that we choose our actions, in real-world situations we do not always find a close correspondence between agency and action (Wegner, 2004). In particular, many choices that we make are habitual. To refer to an often used example, car drivers report experiences of finding themselves taking a route towards their home when they had intended to go elsewhere. This suggests that control of behaviours operates at many different levels that do not require much self-evaluative thinking, and in fact agency and its relationship to actions may be misaligned; it may seem that our sense of agency follows rather than precedes actions. Illustrations of this include subliminal priming, in which we make responses that are seemingly directed by information outside of our conscious awareness (Dehaene et al., 2001), or cases in which we unconsciously make decisions to act, and only after some temporal delay do we feel the conscious experience of having selected our actions (e.g., Libet, 1985; Soon et al., 2008). There are many debates concerning whether studies showing misalignment between our agency and our actions are genuine examples in which agency is independent of our action, or whether the effects reported are merely an artefact of the particular experimental
methodology used. Nevertheless, if taken at face value, these studies actually say very little with respect to agency and action, for the reason that demonstrations of unconscious influences of information on actions, or of actions being initiated before we consciously think about them, typically come from situations in which the individual has virtually no vested interest in any of the choices available.9 If we don't care about the consequences of our actions, or if the choices seem irrelevant to us, then essentially we are making arbitrary responses. Agency matters because we are making goal-directed actions that have consequences for us and for the events that occur in the external world. Before considering the next question, it is worth ending the discussion on this one with a more satisfactory point. Whether we want to argue that agency doesn't have a functional causal influence on actions, from either a philosophical or a psychological perspective, or whether one takes the view of Bandura and his many sympathizers that agency is an essential part of human action and is instrumental in affecting it, whichever way we look at it, we still have the experience of agency. Agency and causality are bedfellows, and so one way of thinking about the relevance of agency in its various forms (e.g., personal, proxy and collective) to action is that it contributes to our understanding of causality (Bandura, 2001; Glymour, 2004; Murayama, 1994). We usefully gain knowledge about the world by acting on it. Moreover, if we believe we act in the world in a purposeful manner, particularly to achieve goals that we have decided are important and worth reaching, then we have to have some sense of agency. If we didn't, then we would not attribute our actions to
9 For example, Soon et al.'s (2008) study examined the stage at which we consciously decide to act by presenting participants with a stream of changing letters on a computer screen, and providing them with the option to spontaneously decide to press one of two buttons. Soon et al. reported that the outcome of a decision was encoded neurologically up to 10 seconds before it reached conscious awareness. Again, while this is a dramatic example of our seeming lack of agency, the context in which people were making decisions and acting involved options that were of little consequence or interest to the individual.
ourselves, and therefore they wouldn't be under our control. On that basis, then, any actions that we generated and any effects that we observed would remain ambiguous, because we could not establish (1) that the actions we generated were the result of something we intended, and (2) that there is a connection between the actions we generated and the events in the outside world that they precede spatially and temporally. This would be quite devastating, because our notion of control would collapse. Therefore, to generate purposeful actions that are designed to exert control over our environment, and to meaningfully reduce our uncertainty about how it operates, we need a sense of agency, and much of social psychology has demonstrated its influences on our individual and social behaviours. The first question concerning agency and action dealt with personal or social mastery over actions, or agent causality as Bandura (2001) refers to it, while the second question concerns event causality. Not only are we equipped with a sense of agency, but we also have an additional belief that our thoughts affect our actions, and in turn our actions affect the outcomes in the environment (i.e., any control system, including ourselves). The second question, 'Should we assume that we can affect the environment through our actions, and therefore change it in a way that it behaves differently from what it would do without our intervention?', concerns our perceptions of the events in the world. That is to say, we act in ways that produce consequences in the world, and they in turn have future consequences. This directly concerns whether we adhere to the belief that we can exert control over our environment and so produce specific changes that would otherwise not have occurred without our actions. Take a simple example like news reports that claim certain types of foods have health benefits. One recent trend was the introduction of 'super-foods' and 'super-drinks' which have special properties that can improve our health and reduce our risk of fatal diseases. Sales of blueberries and goji berries increased after campaigns were introduced that suggested these fruits contain antioxidants. We may not know what antioxidants really are, or for that matter how
they affect the complex internal workings of our biological system, but we know that they are good, because we have now been told this. Therefore, we follow the rationale that, if we need antioxidants because they are good for us, and something we can eat has lots of them, then we are obviously going to be healthier as a result of consuming something that has them than we would be otherwise. We make decisions like this every day of our lives, and implied in this is that we have choice over what we do (i.e., our sense of agency), and while we don't know the exact details of the causal relations in the world (event causality) we have enough clues to inform our actions, and our actions can effect changes in ways we want. What Bandura (2001) proposes is that behaviours such as those illustrated here reveal that our actions are not isolated from social structures (e.g., rules, social practices and sanctions) and that agent causality and event causality work in combination to direct our actions. This acknowledges that while we obviously need to learn something about the relationship between causes and effects in the world, this is different from developing agent causality. Agent causality refers to our personal experience that we can bring about changes in the world because we can cause them. This relies on using feedback that helps us to differentiate experiences that can be traced back to us through our own control of events from those events that we have no control over but which still affect us anyway.10 To shed light on this, Kelley's (Kelley, 1967; Kelley & Michela, 1980) attribution theory makes an important distinction between the antecedents of behaviour, in terms of our explanations of causes and effects, and the consequences of these explanations for our later behaviours. The work discussed by Kelley concerns causal attribution of social behaviours, but it extends more generally to all behaviours because in both social and non-social contexts, agency
10 Whether we can do this accurately is a moot point, especially given that evidence from studies examining the influence of self-efficacy on behaviour provides remarkable demonstrations in which agentic judgements can override actual competencies for producing various actions and controlling outcomes (see the section 'Cognition and action' earlier in this chapter).
plays an important role in most actions (Sutton & McClure, 2001). The main proposal is that people generate rationales for understanding the circumstances and motivations that cause behaviour (antecedents). That is, we use information from a variety of sources to judge the co-variation of events, especially those to which we can attribute intentionality (agency), and we do this by considering actions and their consequences, where the consequences are not common to alternative causes. Moreover, the rarer the consequences, the more obvious it is to us to attribute agency to the actions that generated them. In turn, the rationales developed to form attributions of actions and their effects (others' as well as our own) also have consequences, because they continue to influence perceptions of actions and their effects on our behaviour, feelings and expectancies of events. As Kelley and Michela (1980) highlight, whether we attribute an action and its consequences to the intentions of an actor, or simply to some aspect of the environment, can have dramatic effects on our subsequent behaviours. For example, if we imagine that we are an employee of a large company, and that in last month's pay cheque we noticed that we didn't get the bonus that our manager had promised us, or we were passed over for promotion, the explanation that we come up with for the causal basis of these events will in turn affect how we behave with our manager. We could question the failure to get a promotion from a personal point of view: is it my fault that I didn't get the promotion because my performance was bad over the last few months (personal agency)? Perhaps I'd have got the promotion if I had worked even harder than usual (locus of control)? We could alternatively think of other possibilities: maybe the manager has deliberately reneged on his promise to promote me because he doesn't like me (manager's agency). Maybe new company policies have been introduced and he can't make any promotions at the moment (situational/environmental factors). In real management scenarios, attributions (e.g., self-serving attributions or actor–observer attributions) are affected by different circumstances, and, as a case in point, this highlights that managers (e.g., Barker & Barr, 2002; Barker & Patterson, 1996; Barker,
Patterson, & Mueller, 2001; Huff & Schwenk, 1990; Wagner & Gooding, 1997) and chief executive officers (CEOs) (e.g., Haleblian & Rajagopalan, 2006) are not systematic in the interpretation of their actions and their consequences. For instance, Wagner and Gooding (1997) showed that when managers had to explain a drop in their company's profit margins, they attributed this to situational factors and downturns in markets, while increases in profit margins were attributed to their own successful management plans (i.e., self-serving attribution). Moreover, when asked to observe other managers' performance and explain its relationship to company profits, Wagner and Gooding (1997) found that managers tended to judge other managers harshly. When it came to positive outcomes in another company, managers would explain this as the result of situational opportunities that had arisen, and not as the consequence of the other managers' behaviours. However, when it came to declining profit margins, managers would hold other managers responsible by suggesting that this was the result of their poor decision making. It should, however, be noted that these types of inconsistent attributions have been reported in conditions that are often highly uncertain. Nevertheless, the consequences of these types of attributions include continued organizational failures, since overconfidence or underestimating the impact of managerial or CEO decisions on the company's outputs can sustain further poor decision making, and spiralling negative outcomes. As Kelley and Michela (1980) suggest, identifying the antecedents of causal attributions is important, but so too is the effort of examining the dynamic effects they have on later actions and their consequences. So, we return to the second question, 'Should we assume that we can affect the environment through our actions, and therefore change it in a way that it behaves differently from what it would do without our intervention?' While the theory and evidence suggest that we operate with a belief that our actions generate direct effects as a result of our intentions, they also distinguish this belief from what we ascribe to situational factors that we have no control over. However, knowing when actions and their effects are a result
of our intention, in comparison to knowing when they are the result of situational factors, may be perceived rather than real. This implies a serious issue concerning the main theme of this book, which is that in uncertain environments that we aim to control, our sense of agency (Bandura, 2001) and our causal attributions (Kelley, 1967; Kelley & Michela, 1980) are critical in determining what happens in that environment. But, we have a fluid understanding of the causal relationship between our actions and what effects they may produce in the environment. In fact, even if the environment may be changing only as a direct consequence of our actions, we aren't always willing to accept that we had control of the consequences. Why might this be a serious issue? To illustrate the gravity of the point, rather than referring to a control system, let's imagine the case in which we are in a well-defined one that is not uncertain (e.g., a maze – see Figure 6.1). The maze environment is clearly structured and it is static. Therefore, uncertainty about the environment and its operations has been completely removed. The only uncertainty
Figure 6.1 Labyrinthine maze.
is the fact that the hedge rows are 8 feet high, and so we can't see where we are, and we don't have a map of the maze. The only control element of the task is that you and I are to start at different ends of the maze and we have one hour to get to the middle, because we arranged to have tea there at noon. This is clearly not a case where we need to concern ourselves with the complexities of agency in terms of effecting changes in the environment, receiving feedback from the dynamic outcomes that may occur and so on; all of these factors have been removed with respect to the environment, and can now only apply with respect to ourselves. Huber (1985) conducted an experiment very much like this, in which a computer program simulated the experience of being in a maze, and people were asked to solve mazes of varying difficulty, with goals of varying specificity (i.e., 'Start from the middle and reach the exit in the fewest number of moves' versus 'Do your best'). Success in solving the mazes was not mediated by people's general problem-solving ability, but rather by the effort and commitment they brought to the task. The interesting point is that this came down to people's estimation of their probability of successfully solving the mazes. When told that the maze was difficult, as was the goal they were pursuing, people's performance dropped. Their effort on the task was reduced, and feedback from their ongoing performance during the task was viewed negatively, leading to dysfunctional problem-solving strategies. Similar findings have since been reported in a variety of settings: simulated organizational tasks (Maynard & Hakel, 1997), financial decision making (Endres, 2006) and group decision making (Crown & Rosse, 1995; Mesch, Farh, & Podsakoff, 1994). Now let's scale the environment up from a maze to a type of control system. This system now has multiple cause–effect relations, in a closed loop, with many dynamic properties which we have to interact with and control; we've just added layers of uncertainty that bring this hypothetical system in line with the kinds we are familiar with (e.g., cars, planes, companies, subway systems and nuclear power plants). As uncertainty increases, seeking ways of reducing it and managing it is just as much a factor of our perceptions of the
environment and our capabilities as it is of the actual characteristics of the environment and our actual capabilities. We don't always seem to be systematic in terms of our interpretation of the events that can be objectively ascribed to our own actions, any more than we are in our interpretation of the events that can be objectively ascribed to the system. This is because we maintain an asymmetric relationship between how we view our actions and their effects, and how we interpret our actions and our agency from the effects we observe. What social psychology tells us is that we don't infer cause–effect relations in the same way as we infer causes by examining the effects in the environment, because depending on the assessment of the effects, we are not always willing to accept that we had any involvement in generating them. Clearly, subtleties in the information that we use to embark on any goal-directed task can have a serious impact on our sense of agency and our actions. The way in which we interpret information from the environment and how we use it to learn and make decisions about what actions to take are reserved for the next chapter.
Chapter 7
Cognitive psychology
Cognitive psychology has an entire field of research (complex problem solving) dedicated to examining the kinds of psychological behaviours that help us to control uncertainty. The very aim of this research field is to empirically investigate goal-directed behaviours in situations that map as closely as possible onto those likely to be found in real-world control systems. Nevertheless, interest in control systems is much broader and encompasses many other aspects of human behaviour beyond control, as the list of psychological examples below indicates. Control systems have been used to study unconscious learning processes (Berry & Broadbent, 1984; Dienes & Fahey, 1998), skill acquisition (Sun, Slusarz, & Terry, 2005; Sun et al., 2007; Vicente & Wang, 1998), learning by observing (Osman, 2008a, 2008b, 2008c), dynamic decision making (Busemeyer, 2002; Lipshitz et al., 2001), group behaviour (e.g., Broadbent & Ashton, 1978), motor control behaviours (e.g., Osman et al., 2008; Witt et al., 2006), memory processes
(Broadbent, 1977; Gonzales, Lerch, & Lebiere, 2003; Sweller, 1988), planning decisions and actions (Dörner, 1989; Earley et al., 1990; Quesada, Kintsch, & Gonzales, 2005) and attentional processes (Burns & Vollmeyer, 2002; Gonzales, Lerch, & Lebiere, 2003; Kanfer et al., 1994; Lerch & Harter, 2001). The reason for all this attention is that studying control behaviours in their natural environments is appealing because such studies have ecological validity. Also, control systems are uncertain environments, and uncertainty is a concept that has relevance to virtually every aspect of cognition. We need to perceive it, attend to it, remember aspects of it, learn about it, reason from it, make decisions about it and of course try to control it. Returning to the story of control, this chapter examines work that focuses on the way the puppeteer perceives the actions of the puppet, and how this is used to learn what levers to push and pull in order to make the puppet move in a particular way. The target question of this book, 'How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?', was the impetus for early work in complex problem solving, which is why this question is just as relevant now as it was then, and it will be used to integrate the empirical work from cognitive psychological research presented in this chapter. Much of the empirical work directed towards understanding control behaviours can essentially be thought of as tackling two information-processing questions: how we acquire and process information (knowledge acquisition), and how we use information in ways that can effectively manage uncertainty (knowledge application). Of the many different research directions that have been taken, there appear to be two main theoretical positions which make contrasting claims about how we learn to control uncertain outcomes in complex systems. There are those who say that, given the characteristics of the environment, the uncertainty that is generated means that we can't always process everything that is going on all at once. Therefore, much of the hard work of our internal information-processing
system is done outside of our conscious thinking. That is, we can learn to control our environment because we have an intuitive sense of what we need to do in it, and the more experience we gain, the better able we are to automatically generalize our decisions and actions to new situations. Contrary to this, others hold the position that the very fact that we are interacting with uncertain environments means we need to think about what we are doing, and to monitor the changes that occur. This is because our behaviours have important consequences for the thing we are attempting to control, and these need to be tracked and adjusted continuously because of the continual changes and demands the control system is making of us. There are of course other possible ways of carving up the bulk of theory and evidence (e.g., levels of expertise: skill versus novice learning behaviours, predictive versus control-based learning, and action- versus observation-based learning). But, in the main, the distinctions I have drawn are popular ones. Of the many divisions that could be referred to, there is an important point to bear in mind. Whatever dichotomy is adopted, it really is only useful as a basis for identifying important types of behaviours;1 the reason for this cautionary point is that dichotomies can be just as damaging as they can be helpful in describing human behaviours. Overly attending to what differentiates psychological constructs means that the subtle connections between them are overlooked. Therefore, the
1 The problem of false dichotomies has dogged the field, and continues to do so, since Fitts and Posner (1967) first proposed the distinction between automatic versus controlled processing. Since then, many domains of cognitive psychology (e.g., memory, reasoning, decision making, attention, perception and learning) have built on this distinction, and from this comes a long-standing debate concerning whether or not it is valid to posit separate mechanisms that support a mode of processing that is unconscious, fast and intuitively based, and one that is conscious, slow and analytically based (for a more detailed discussion of this debate, see Osman, 2004; Osman & Stavy, 2006; Shanks & St John, 1994; Sternberg, 1989).
distinction I refer to serves as an organizational tool. It is used as a way of understanding the contrast between the different theoretical claims and will be used to summarize the wealth of findings from work investigating learning and decision-making processes in complex control environments.
What are the Experimental Paradigms That are Used in the Study of Control Behaviours?

Studies that examine control behaviours have developed tasks that are designed to simulate real-world examples of control systems. For this reason, the chapter begins by introducing the types of tasks that psychologists have developed, and the typical psychological behaviours that are associated with them. What follows from this is a discussion of the main theories that have been developed to account for the kinds of behaviours that have been explored in these tasks. To help put the types of behaviours examined in control systems in the context of more general learning and decision-making processes, the latter section of this chapter discusses psychological studies concerned with perceptual-motor control, causal learning and predictive judgements in probabilistic learning tasks. All of these behaviours share a common element with studies of control behaviours in control systems: the tasks involve uncertainty, and the psychological behaviours are directed towards reducing it. This provides the context for concluding this chapter by discussing how, in general, we learn about and decide on behaviours that are relevant to controlling uncertainty.
What types of control systems are used?

Because cognitive psychology is the only research paradigm thus far that directly examines decision-making and learning behaviours in control task environments, it is important to give a sense of the
kinds of simulated control systems that have been used. Below are five well-known experimental tasks.

Illustration 1: Ecological control system. People are presented with a simulated environment of an African landscape with different flora and fauna. Two tribes live on the land, the 'Tupis' and the 'Moros', and they sustain themselves by cattle and sheep farming. The ecosystem is harsh and often unstable, so the task involves assuming the role of a soil and crop science technical advisor. The goal is to make interventions that will improve the living conditions of the native population. This involves managing approximately 50 interconnected variables that have a complex feedback process, both negative and positive, and with time delays.

Illustration 2: Automated control system. Trained pilots are required to take part in one of three 1-hour flight scenarios (including takeoff and landing) in a flight simulator. In one scenario, on three separate occasions (once during climb, once when established on the descent path and once later during the descent), an anomaly in the flight deck was experienced. In another scenario, while the plane is climbing, the pilot receives a request from air traffic control to change altitude, but the system will not automatically change without a sequence of manual operations from the pilot. The third scenario involves a failure in a critical function of the system. Thus, pilots have to respond to delays between their commands and their actions in the system, as well as failures in the system, and miscued problems from the control panel.

Illustration 3: Management control system. People are told to adopt the role of a worker in a sugar factory, and their job is to control the rate of production of sugar to 9000 tons and maintain it for a given period of time. However, people are unaware of the formula that determines the production level of sugar, P = (2 × W − P1) + R, in which the relevant variables are the workforce (W), the current sugar output (P), the previous sugar output (P1, i.e., the output on the preceding trial) and a random variable (R). (A minimal simulation of this rule is sketched after these illustrations.)

Illustration 4: Industrial control system. (See Figure 7.1.) People are required to learn to operate a water purification system with the goal of eventually controlling it to reach specific criteria. There are three substances (i.e., salt, carbon and lime) that can be used to purify the
Figure 7.1 Example of industrial control system: a water tank quality control interface with slider inputs for salt, carbon and lime (0–100) and output readings for oxygenation (O2), chlorine concentration (CL) and temperature.
water supply. In order to measure the water quality there are three different indicators of purity (i.e., chlorine concentration, temperature and oxygenation). By changing the amounts of salt, carbon and lime, the aim is to control the water quality of the purification plant.

Illustration 5: Biological control system. People are told to adopt the role of personal fitness advisor to an athlete who is training for an upcoming race. They then learn that there are four indicators (symptoms) that can be used to gauge the athlete's general fitness during the training session (i.e., redness, sickness, thirst and tiredness), which can change as a result of differences in temperature, a circulatory problem,
problems with their metabolism or a false alarm. In addition, the fitness of the athlete can change rapidly or slowly over time. So, to control fitness levels the trainer can make three simple interventions (i.e., cool the athlete down, rest the athlete or hydrate the athlete). The aim of the task is to decide which intervention is appropriate and when to make it, given the various changing states of the athlete.
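As promised in Illustration 3, here is a minimal Python simulation of the sugar-production rule, P = (2 × W − P1) + R. The 9000-ton target comes from the task description; the starting output, the possible values of the random component R and the simulated operator's strategy (which simply inverts the rule, something real participants cannot do because the formula is hidden from them) are assumptions made only for the sake of the example.

```python
import random

# Sketch of the sugar-factory rule from Illustration 3: P_t = (2 * W_t - P_{t-1}) + R.
# The 9000-ton target is from the task description; the starting output, the values
# taken by R and the operator strategy are illustrative assumptions.

def sugar_factory(trials: int = 40, target: float = 9000.0) -> list[float]:
    previous_output = 6000.0                           # assumed starting output
    outputs = []
    for _ in range(trials):
        # Idealized operator: chooses the workforce that would hit the target exactly
        # if there were no noise, by inverting the rule (hidden from real participants).
        workforce = (target + previous_output) / 2
        noise = random.choice([-1000.0, 0.0, 1000.0])  # random component R
        output = (2 * workforce - previous_output) + noise
        outputs.append(output)
        previous_output = output
    return outputs

history = sugar_factory()
on_target = sum(abs(p - 9000.0) <= 1000.0 for p in history)
print(f"{on_target}/{len(history)} trials within 1000 tons of the 9000-ton target")
```

Even when the rule is known, the random component means the output can only be held within a band around the target; in the actual task, where the formula is hidden, operators must discover an effective workforce strategy from outcome feedback alone.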
What are the typical findings associated with these examples?

The first example is one of the earliest control tasks devised (Dörner, 1975). In fact, the first dedicated studies (Broadbent, 1977; Dörner, 1975; Toda, 1962) of control behaviour were designed to investigate how people responded in micro-simulations of real-world scenarios. In an attempt to make the tasks as close to real as possible, people were presented with numerous variables which they had to manipulate and which had many complex properties. From this, the idea was to examine how people cope with (1) ill-structured problems (2) in uncertain dynamic environments, (3) with shifting, ill-defined, or competing goals, (4) feedback loops, and (5) time pressures (Funke, 2001). Because of this, tasks like the kind Dörner (1975) developed were highly complicated and required extensive training over weeks. In his ecological system the operator interacted with variables that changed in real time, both directly as a result of the operator's actions and independently of them. Given that the aim of the task is to make multiple decisions in order to achieve a specific state of the system, the difficulty comes from having to manipulate numerous elements simultaneously, each of which is interconnected with the others and has positive, negative and delayed effects on multiple outcomes. Dörner (1989) suggested that the main reason why so many people in his task failed to master the control system lay in the kinds of assumptions they made. Unlike, say, flying a plane, which requires years of expertise, controlling Dörner's ecosystem did not need
specialized knowledge, so poor control performance was not a result of lacking appropriate experience. In fact, the key problems that led to poor mastery of the environment were that people simply did not appreciate that effective control depended on making multiple decisions at the same time. People tended to make poor estimations of the kinds of information they needed to attend to, because they assumed they already knew what information was relevant to base their decisions on. Dörner (1989) claimed that, more generally, the problems with the way systems are controlled can be avoided if people state their goals more precisely, and avoid relying on biased assumptions about how the system works. If they are more accurate, they can make better assessments of the outcomes they generate, and this can then help inform the decisions they need to make in the future. The second example comes from work by Sarter, Mumaw and Wickens (2007), and like the first, it involves controlling multiple variables as they change in real time. However, unlike all the other examples presented in this section, the second examines expertise in control using a genuine control system. Though designed to examine psychological behaviour, Sarter, Mumaw and Wickens' (2007) study also has implications for human factors issues concerning how to improve training schedules. The psychological focus of the study was to examine why pilots occasionally make inappropriate responses, and also to help rule out the possibility that pilots fail to attend to the right information when making a decision. Sarter, Mumaw and Wickens' (2007) findings suggested that pilots correctly attend to the information that signals changes occurring, even when those changes are the result of delays between their implemented decisions and the subsequent effects. However, pilots tended to base their decisions on incomplete knowledge of the changes that occurred. This implied that the way in which pilots monitored the flight management system in turn biased the way in which they used outcome feedback, and this was what contributed to poor pilot decision making. In contrast to the first two examples, the next two are actually static systems; that is, events in the control system will occur, but
Cognitive psychology only as a direct result of an intervention that is made by the operator, which is different from control systems that have on going changes in states. Nevertheless, both are examples of uncertain control systems because the different connections between the input-outputs are either non-linear or linear but noisy. The third example (Berry & Broadbent, 1984) requires that people make a single decision at a time, and from this the consequences of the outcome are fed back directly and used as a basis for making the next action. Since Berry and Broadbent’s (1984) seminal study, there have been nearly 350 citations of their work. It has been one of the most popular tasks used to examine control behaviours, and has been instrumental in maintaining the prevailing view that learning and decision making in control systems are underpinned by processes that are unavailable for conscious inspection (e.g., Berry & Broadbent, 1988; Dienes & Berry, 1997); that is, we are not always aware of our decisions and actions when attempting to control events in control systems. We are often unsure of why we actually choose the actions we choose, and we cannot actually describe accurately how the system works (e.g., Berry & Broadbent, 1987, 1988; Dienes & Fahey, 1995; Marescaux, Luc, & Karnas, 1989; Stanley et al., 1989). More startlingly, the implications of this work are that we can operate a complex system with little knowledge about its underlying causal structure and how we produce changes in it.2 Berry and Broadbent’s task has also been used to examine the effects of learning on later control behaviours by watching an operator making changes to the system. This method comes close to the style of training procedure used in applied situations such as in clinical (e.g., Giesler, Josephs, & Swann, 1996) and military domains (e.g., Hill, Gordon, & Kim, 2004) in which operators first learn by
2. Though this is a widely accepted view, there have been challenges to this which suggest that the dissociation between what we know (declarative knowledge) and what we do (procedural knowledge) is an artifact of the experimental methods used to measure this knowledge, and not an actual reflection of the underlying psychological mechanisms involved (Osman, 2008a, 2008b, 2008c; Sanderson, 1989).
Magda Osman observing, and then have the chance to experience the system firsthand. Studies contrasting observation-based learning with active-based learning show that active-based learners are better able to control the system (Berry, 1991; Lee, 1995). Other interesting findings to emerge from this particular control systems task are that with extensive practice people’s ability to control the system further improves, but does not in turn improve the accuracy of their knowledge of the system or their behaviours (Berry & Broadbent, 1988). This task still remains popular amongst researchers studying early stages of knowledge acquisition (e.g., Anderson, 1982; Eitam, Hassan, & Schul, 2008) as well as knowledge application in highly skilled individuals (Gardner, Chmiel, & Wall, 1996). While learning about a control system, the operator can be described as making multiple predictions about outcomes in the system based on particular changes made to the system. To this end, the fourth task was developed in mind of examining this type of learning, referred to as hypothesis testing. Hypothesis testing involves a series of systematic tests which are conducted based on first generating a hypothesis (or rule) that with additional information (feedback) gained from making a decision is evaluated. If the feedback is inconsistent with what was predicted, then the hypothesis may be rejected or revised. If the feedback is consistent with the predicted outcome, then the hypothesis may be retained, and depending on how confidently the hypothesis is held, it may be tested again to check its validity, or a different hypothesis will then be explored. These predictions form the basis of knowledge of the underlying properties of the system. Importantly, learning and decision-making processes rely on monitoring the feedback from testing out predictions, and it is from this process that knowledge about the system is updated or revised. Burns and Vollmeyer ’s (2002) study revealed that the operator ’s ability to control the system depended on the type of goal that he or she was trying to achieve. This connects with work discussed in social psychology which has examined the impact of pursuing different goals on knowledge gain and performance (see the Chapter 6 section entitled ‘Goals’). In
general, setting and keeping to a specific goal works in the short term as a way of controlling a system but not in the long term. This is because pursuing a specific goal involves focusing all one's efforts on achieving and maintaining that particular goal, and this constrains what we can know in general about controlling the system. So, when we need to adapt to a different goal, our detailed experience ends up becoming so specialized that it becomes inflexible, and we can't easily adjust what we know to other types of goals. Instead, pursuing an unconstrained goal (e.g., learn as well as you can how the system works) is a good starting point for exploring the control system, and this means that a wider range of hypotheses are generated and tested. This benefits control ability in the short and long term because the knowledge that is gained is sufficiently flexible to suit a variety of goals. The final task, developed by Kerstholt (1996), was designed to examine the effects on decision making of having to control a system under different types of time pressure. This, of course, is a general problem in control systems because changes in the system can occur over long periods and may be hard to detect, or can occur sharply and with little warning. Therefore, the study examined how people learn about the system under conditions in which there were high demands placed on decision making because the athlete's fitness could decline sharply, or when the demands were low because the fluctuations in fitness were relatively small over time. To add to this, the treatments administered to the athlete cost money (the cost was manipulated so that for some people the cost was high, and for others the cost was low), so not only was it important to keep the athlete healthy by making the right intervention, but making the wrong decision also came at a real financial cost. Moreover, though people could request information about the symptoms of the athlete in order to choose more wisely which treatment to administer, this also came at a financial cost. Again, this was manipulated so that there was a high-cost condition and a low-cost condition. Kerstholt (1996) reported that people tended to alternate between two types of strategies: judgement-orientated
Magda Osman strategies which involved seeking information and then basing a decision on that, and action-orientated strategies which involved making a decision and using the feedback from the outcomes they generated to judge what further actions needed to be taken. In situations in which costs were low and there was limited time pressure to make a decision, people favoured an action-orientated strategy, while under high-cost and high-pressure conditions people preferred the less risky but more costly judgement-orientated strategy. This was clearly a suboptimal strategy for the reason that further requests for information delayed making an action. In the highpressure condition the athlete’s decline in fitness was sudden and rapid, and so delaying an action in order to choose the most appropriate intervention was made too late for it to have any benefits on the outcome. This highlights a more serious and commonly reported problem with control systems which concerns people’s experiences of time (Brehmer, 1992; Diehl & Sterman, 1995; Kerstholt & Raaijmakers, 1997; Sterman, 1989). Failing to appreciate the temporal relationship between actions and outcomes and poorly planning for the future mean that information-seeking strategies can be less advantageous than action-oriented strategies. So, do these types of control systems have any common characteristics?
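To make the trade-off in this last example more concrete, the sketch below simulates a crude, Kerstholt-style scenario. All of the numbers, costs and decision rules are invented for illustration (they are not taken from Kerstholt's 1996 study); the point is only that a strategy which spends a time step gathering information is penalized far more heavily when the system changes quickly than when it changes slowly.

```python
import random

def run_episode(strategy, decline_rate, info_cost=5, treatment_cost=10, steps=20):
    """Toy, Kerstholt-style scenario: keep a (hypothetical) athlete's fitness up.
    'judgement' seeks information before acting; 'action' acts on outcomes directly."""
    random.seed(0)                            # same random decline for both strategies
    fitness, money_spent = 100.0, 0.0
    for _ in range(steps):
        fitness -= decline_rate + random.uniform(0, 2)
        if strategy == "judgement":
            money_spent += info_cost          # requesting symptom information costs money...
            fitness -= decline_rate           # ...and the extra step costs time
        if fitness < 80:                      # intervene once fitness looks low
            money_spent += treatment_cost
            fitness += 15
    return fitness, money_spent

for decline in (0.5, 6.0):                    # slow versus sudden decline in fitness
    for strategy in ("action", "judgement"):
        fitness, cost = run_episode(strategy, decline)
        print(f"decline={decline:3.1f}  strategy={strategy:9s}  "
              f"final fitness={fitness:6.1f}  money spent={cost:5.0f}")
```

Under the slow decline both strategies fare similarly, but under the sudden decline the information-seeking strategy pays twice: in money and in outcomes that have already deteriorated by the time the decision is made.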
Complexity
To understand the kind of behaviours that are involved in control systems, it is important to try to uncover if there is any common ground in the kinds of tasks that are used to study them. This exercise helps to establish if there is any mapping between the general properties of the environment and the repertoire of behaviours we have that are designed to respond to them. As a starting point, studies examining the psychological behaviours in control systems typically refer to these environments as highly complex ones. The behaviours exhibited in the examples described are thus a response to the different types of complexities of different types of control
scenario. Therefore, if we understand what makes them complex, we can then build a theory of why people behave in certain ways in such types of environments. But what defines complexity? There have been considerable efforts in describing the critical factors that constitute complexity in these contexts (Buchner & Funke, 1993; Campbell, 1988; Funke, 2001; Gatfield, 1999; Jonassen & Hung, 2008; Kerstholt & Raaijmakers, 1997; Quesada, Kintsch, & Gonzales, 2005). Often, these attempts refer to the objective characteristics of the system, for instance (a schematic parameterization of these characteristics is sketched after the list):

1. The transparency of the relationship between the variables we manipulate and the outcomes produced.
2. Time variance: is the system dynamic or static?
3. The number of items of information that need to be attended to and manipulated at any one time.
4. Intercorrelations (i.e., the number of input variables that are connected to each other as well as to the output variables).
5. The validity and reliability of the variables that are manipulated (i.e., the predictability of the system), and the functional form that can be used to describe the relationship between the input variables that are manipulated and the output variables (i.e., linear, curvilinear, stochastic, deterministic and power law).
6. The type of feedback that is generated; i.e., immediate, delayed, negative, positive and random.
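The six objective characteristics above can be made concrete by treating them as parameters of a simulated control task. The sketch below is only a schematic illustration: the class, parameter names and numerical settings are assumptions introduced for the example, not features of any of the tasks cited.

```python
import random
from collections import deque
from dataclasses import dataclass

@dataclass
class TaskConfig:
    n_inputs: int = 3        # items of information to manipulate (characteristic 3)
    dynamic: bool = True     # time variance (characteristic 2)
    noise: float = 0.5       # reliability/predictability of variables (characteristic 5)
    coupling: float = 0.3    # intercorrelation between inputs (characteristic 4)
    feedback_delay: int = 2  # delayed feedback (characteristic 6)

class SimulatedControlTask:
    def __init__(self, cfg: TaskConfig):
        self.cfg = cfg
        self.state = 0.0
        self._pending = deque([0.0] * cfg.feedback_delay)

    def step(self, inputs):
        cfg = self.cfg
        direct = sum(inputs)                                          # direct effects
        coupled = cfg.coupling * sum(a * b for a, b in zip(inputs, inputs[1:]))
        effect = direct + coupled + random.gauss(0, cfg.noise)        # noisy outcome
        if cfg.dynamic:
            self.state = 0.8 * self.state + effect   # dynamic: state drifts on its own
        else:
            self.state = effect                      # static: changes only when acted on
        self._pending.append(self.state)             # operator only sees feedback late
        return self._pending.popleft()

task = SimulatedControlTask(TaskConfig())
for t in range(5):
    observed = task.step([random.uniform(-1, 1) for _ in range(task.cfg.n_inputs)])
    print(f"time {t}: observed outcome = {observed:.2f}")
```

Transparency (characteristic 1) then corresponds to whether the operator is shown this structure, or must infer it from outcomes alone.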
The problem is that, as yet, while the control systems can be described according to these objective characteristics, there is no agreement as to what contributes to making them complex psychologically. For instance, linear relationships are easy to conceptualize because there are many naturally occurring examples of them (e.g., as height increases, so does weight). Non-linear relationships (see Chapter 3, ‘Engineering control systems’) are often thought to be harder to handle, especially if they are non-monotonic – that is, the changes that might occur are not smooth, but instead are abrupt. However, it isn’t necessarily the case that non-linear non-monotonic 191
Magda Osman relationships between variables are difficult to appreciate.3 So, from a psychological viewpoint, the apparent difficulty with properties that are thought to make a control system complex can be handled well enough provided the context in which they occur is understandable. This has led some to abandon the idea that we should look to the environment as the only basis for defining what makes control systems complex, and instead consider the psychological factors associated with coping with uncertainty (Osman, 2010), particularly since there are many examples in which complexity is evoked for reasons other than the task itself. For instance, factors such as self-doubt, anxiety and fear (Campbell, 1988), and judgements of self-efficacy (Bandura, 1989, 2001), motivation (Locke, 1991, 2000) and misperceptions of the task (Sterman, 1989, 1994, 2002), all have a hand in turning a task that may seem to be manageable into an unmanageable one. This implies that our subjective experiences of uncertainty and our beliefs in our ability to successfully reduce it contribute to making a control system difficult to control, and this can be independent of the objective characteristics of complexity (Osman, 2010).
So, do these types of control systems generate any common psychological behaviours?
Work in human factors, social psychology, management and organization psychology suggests that the way in which we attempt to
3. For instance, take an everyday example such as trying to alleviate a headache. We may take an aspirin to start with, and notice that after an hour it has only slightly dulled the effects of the headache. We take another two aspirin in order to speed up the process, which again has been effective but we still have the headache. We decide to take two more in quick succession to ensure it goes away. But we've gone past a critical point where the aspirin is effective and now it is becoming dangerous, and we suddenly feel dizzy and slightly nauseous from all the aspirins we have just consumed. The relationship between drug dosage and effective pain relief is both non-linear and non-monotonic, but this is an example which is easy to conceptualize.
Cognitive psychology reduce uncertainty comes from relying on biases that help organize what actions we ought to take in controlling a system.4 To complement this, as many of the examples discussed thus far suggest, both experts and novices are susceptible to biases. Biases may be relied upon particularly under highly pressurized conditions in which immediate actions need to be generated in response to the demands of the environment (e.g., Krueger, 1989; Lichacz, 2005). They may also be employed because there simply is not enough information to decide on, or to predict, the outcome in the environment (Degani, 2004; Klein, 1997; Orasanu & Connolly, 1993; Sauer et al., 2008).
Biases in the Structure of the Environment
The most common biases that people rely on tend to be about the structure of the control system. Typically, people perceive the inputs and outputs as following a one-to-one mapping (Schoppek, 2002), which means that if we change a dial on a control system and next to the dial we see a light that is flashing, we are most likely to assume that there is an underlying relationship between the dial and the light. Though cues that are proximal to each other can often be causally associated, they aren't always, and so this assumption can be erroneous. Moreover, as tends to be the case with biases, it is difficult to override. Another common bias is to assume that associations between our actions and the outcomes are salient; in other words, we assume the simplest, most plausible cause–effect relationship (Berry & Broadbent, 1988; Chmiel & Wall, 1994). In addition, our tendency to assume the simplest associations means that we are biased in perceiving the relationship between inputs and outputs as positive (Diehl & Sterman, 1995; Sterman, 1989),
4. Here, the term bias refers to a misperception as the result of assumptions founded on accumulated experience of control systems. That is, we perceive outcomes of our behaviour or the behaviour of the system in a way that does not actually match up with what is objectively happening, but our experience preferentially directs our attention towards believing we are correct.
linear (Brehmer, 1992; Strohschneider & Guss, 1999) and unidirectional (Dörner, 1989). That is, if we turn a dial on a control panel, then we expect that there will be an immediate change, say an increase in sound; that turning the dial back will switch the sound off; and that the more we turn the dial, the louder the sound will be. These simple assumptions may in fact stem from rudimentary scientific and mathematical intuitive rules (e.g., 'More X … More Y' or 'Same X … Same Y'5) that we develop as children. Often, salient features of a task invite people to infer these simple general6 rules, or the features themselves map onto previous knowledge that the individual has. More to the point, we use them because they save time. They help to generate a quick response without having to methodically work out the finer details.
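The gap between these assumed relations and what a system actually does can be shown with a toy comparison. In the sketch below, the functions and numbers are invented for illustration: an operator predicts outcomes from a positive, linear, one-to-one rule, while the (hypothetical) true system is interconnected and non-monotonic, so the operator's error grows as the dials are turned further.

```python
def operator_prediction(dial_a, dial_b):
    """Biased mental model: each dial acts alone, positively and linearly."""
    return 2.0 * dial_a + 2.0 * dial_b

def true_system(dial_a, dial_b):
    """Hypothetical true system: the dials interact, and the effect of dial_a
    reverses beyond a critical point (non-monotonic)."""
    return 4.0 * dial_a - 0.6 * dial_a ** 2 + 1.5 * dial_b - 0.8 * dial_a * dial_b

for a, b in [(1, 1), (3, 2), (6, 2), (8, 4)]:
    predicted, actual = operator_prediction(a, b), true_system(a, b)
    print(f"dials=({a},{b})  predicted={predicted:5.1f}  actual={actual:5.1f}  "
          f"error={predicted - actual:5.1f}")
```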
Biases in Interpreting Different Timings and Feedback
Another typical finding is that people are maladapted to delays in feedback and time pressure. People are biased in inferring that changes in the system occur only as a result of their actions (Kerstholt & Raaijmakers, 1997), and effects of their actions are assumed to be immediate. Moreover, long temporal delays between actions and effects tend to be forgotten or ignored (Bredereke & Lankenau, 2005; Brehmer, 1992; Degani, 2004; Diehl & Sterman, 1995; Kerstholt & Raaijmakers, 1997). Also, people are generally poor in co-ordinating decisions that will offset the immediate demands for an action that
5. For instance, X and Y can refer to quantities of weight, height, volume, width, density, size, area, time and distance (Osman & Stavy, 2006; Stavy & Tirosh, 1996, 2000).
6. Domain-general rules refer to the concatenation of experiences and knowledge gained from many different situations which may or may not have underlying associations (e.g., 'If we trip up we fall' is a domain-general rule that can be formed from experiences of tripping up and falling in the house, abroad, at work, etc.).
Cognitive psychology the control system is calling for, while at the same time planning for longer-term effects (Camp et al., 2001; Langley & Morecroft, 2004; Moxnes, 2000). One reason for these failings is that people’s knowledge of a system may degrade as they encounter combinations of long and short repeated delays in feedback from their actions (Diehl & Sterman, 1995; Gibson, 2007). Temporal delays between actions and effects are often interpreted as the result of an unpredictable system, and so people fail to accurately interpret the relationship between their actions and how they affect the system (Diehl & Sterman, 1995; Moxnes, 2000). People tend to be quick to develop expectations of likely outcomes from observed or self-generated events from relatively little experience of the conditional relationship between the two. In particular, close temporal proximity between self-initiated actions and outcomes in the world helps to bind these events to form causal representations (Lagnado & Sloman, 2004). In turn, purposive actions require some anticipation of likely consequences, and it is the choices that we make and the corresponding outcomes that they achieve which contribute to our sense of agency (Bandura, 2001; Pacherie, 2008). It is unclear from this research as to whether lacking a sense of agency leads to a misperception of feedback delays, or whether the uncertainty of the environment is enhanced through feedback delays, which in turn reduces people’s sense of agency and in turn their ability to control the outcomes in a control task.
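A very small simulation illustrates why delays are so disruptive. In the sketch below (illustrative numbers only, not a model from the cited studies), the operator believes each adjustment acts immediately, when in fact it takes three time steps to arrive; because the adjustments already 'in the pipeline' are ignored, the operator over-corrects and the outcome oscillates instead of settling on the target.

```python
from collections import deque

TARGET, DELAY, STEPS = 10.0, 3, 15
level = 4.0
pipeline = deque([0.0] * DELAY)    # adjustments 'in transit' to the system

for t in range(STEPS):
    # Operator assumes effects are immediate, so reacts to the full gap every
    # step, ignoring the corrections that are already on their way.
    adjustment = TARGET - level
    pipeline.append(adjustment)
    level += pipeline.popleft()    # only the adjustment made DELAY steps ago lands now
    print(f"t={t:2d}  level={level:6.2f}  adjustment made={adjustment:6.2f}")
```

Running this shows the level shooting well past the target and then swinging back, the characteristic signature of misperceived feedback delays.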
Preliminary summing up
What the work discussed thus far suggests is that trying to define the critical aspects of the task environment that make a control system complex is not necessarily a reliable guide to what makes it difficult to control. A different approach is to start from the psychological processes that are used to reduce uncertainty. Clearly we rely on many biased assumptions about how control systems work, and we use these as a basis to make decisions and plan actions. While we do find such systems difficult to control, and there are many demonstrations
in which the behaviours we develop are not optimal, we are still able to operate and control complex systems reasonably well. The next section focuses on the theories that have been developed to describe the general learning and decision-making processes that we use to control events in control systems.
Theoretical developments
As mentioned before, theories of control behaviours in complex systems generally fall into one of two categories. One class of theories posits that goal-directed learning and decision making advance without much need for deliberation (instance-based theories) because we acquire information incidentally, and with little understanding of the underlying causal mechanisms. In contrast, the other class of theories claims that deliberation is required (hypothesis-testing theories and naturalistic decision-making theories) in order to monitor and evaluate the actions and decisions that we make. A discussion of the main ideas behind each theory is now presented.
Exemplar and instance-learning accounts
One of the earliest theories of control, born out of pivotal work by Broadbent (Berry & Broadbent, 1984, 1987, 1988; Broadbent, Fitzgerald, & Broadbent, 1986) and others (Stanley et al., 1989), is instance-based theory. This refers to the idea that while learning to control a system, actions that are taken which lead to a desirable outcome get stored in a type of 'look-up table'. The look-up table contains snapshots, or 'instances', of our behaviour in the system; the instance represents the decision we made, the action we took and the outcome that occurred. Similarly, formal instance-based models of learning in control tasks (Dienes & Fahey, 1995, 1998) assume that each encounter with a control task creates an instance representation that is stored in memory, and that successful control comes about through a process of matching environmental cues with similar past instances that were experienced.
Cognitive psychology For these accounts, the knowledge that is acquired in control systems is highly specialized, so much so that the instances are bound closely to the particular type of control system in which they were generated (Berry & Broadbent, 1988). This in turn makes transfer of knowledge hard to achieve because the specificity of the instances does not enable them to generalize further than the context in which they were generated. Moreover, what is also absent from early learning encounters with the system is the abstraction of any rules or structural knowledge. Compared with the relatively effortless acquisition of instances through the active interaction with the control task, explicitly relating actions to their consequences by uncovering rules about how a control task behaves, in information-processing terms, is very costly (Berry & Broadbent, 1987; Lee, 1995; Lee & Vakoch, 1996; Stanley et al., 1989). Gonzales, Lerch and Lebiere’s (2003) recent advancement of the instance theory – the instance-based learning theory (IBLT) model – describes instances as having three properties. These are the environmental information, the decision taken at the time which generates the action and the utility of the outcome of that decision – that is, the value that the outcome has according to getting us closer to our desired goal. They retain the basic early claims of instance-based theory, which are that learning to control a system requires the accumulation of successful instances which are consigned to memory, and that a pattern-matching component in memory is needed for the eventual retrieval of instances as learning progresses. The difference between this model and earlier instance-based theories is that evaluative (rule-learning) processes are given a prominent role. Although not explicitly referred to, Gonzales et al.’s IBLT model includes an evaluative mechanism. They propose that under situations of high uncertainty, the learner will evaluate the best action to take by relying on heuristics (referred to here as contextual factors that guide attentional focus) to determine which cues are relevant to act upon in the system. As uncertainty decreases, the learner instead evaluates the utility of an action. So, rather than simple trial-and-error learning, people begin with basic (albeit biased) assumptions by relying on 197
Magda Osman heuristics, and with more interaction with the task replace this with instance-based learning. Similarly, Sun’s (Sun, Merrill, & Peterson, 2001; Sun et al., 2005, 2007) Connectionist Learning with Adaptive Rule Induction Online (CLARION) formal model also includes an evaluative component. The motivation for the model comes from work contrasting early stages and advanced stages of skill development in complex learning environments. Sun and his colleagues claim that early on, knowledge development moves in the direction from intuitive to rule-like. As we progress, successful skilled behaviour develops, and our rule-like knowledge becomes habitualized and, therefore, automatic. This means that our knowledge of the system we are trying to control is implemented quickly once a degree of experience with it is achieved. Here, expertise includes knowledge of the underlying structure and rules governing the control system. Their formal model describes this by proposing two distinct subsystems which roughly align with procedural learning (an action-centred subsystem, or ACS) and declarative learning (a non-action-centred subsystem, or NACS). Sun’s (Sun et al., 2005, 2007) two recent additions to the CLARION framework are a motivational subsystem (the MS) and a metacognitive subsystem (the MCS). The role of the former is to self-regulate behaviours from the ACS and NACS, whereas the latter functions to monitor and evaluate refining and improving the quality of rules and instances generated by the ACS and NACS. As with Gonzales, Lerch and Lebiere’s (2003) IBLT, Sun’s formal descriptions of control learning do more than simply intuit what actions to make in order to control a system; there is thinking and evaluation behind them, though theories of hypothesis-testing place even more emphasis on this.
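A minimal sketch of the shared core of these instance-based accounts is given below. It is not an implementation of Dienes and Fahey's model, IBLT or CLARION; it simply assumes that each encounter is stored as a (situation, action, outcome) record and that control decisions are made by retrieving the action from the most similar past instance that worked reasonably well.

```python
from dataclasses import dataclass

@dataclass
class Instance:
    situation: tuple   # what the system looked like
    action: float      # what the operator did
    outcome: float     # how close the result came to the goal (higher = better)

class InstanceMemory:
    def __init__(self):
        self.instances = []

    def store(self, situation, action, outcome):
        self.instances.append(Instance(situation, action, outcome))

    def retrieve_action(self, situation, default_action=0.0):
        """Pattern matching: reuse the action from the most similar past
        situation among the instances that led to a good outcome."""
        good = [i for i in self.instances if i.outcome > 0.5]
        if not good:
            return default_action   # nothing useful stored yet: act by trial and error
        def similarity(inst):
            return -sum((a - b) ** 2 for a, b in zip(inst.situation, situation))
        return max(good, key=similarity).action

memory = InstanceMemory()
memory.store(situation=(4.0, 1.0), action=2.5, outcome=0.9)
memory.store(situation=(9.0, 3.0), action=-1.0, outcome=0.8)
print(memory.retrieve_action((4.5, 1.2)))   # reuses the action from the closer instance
```

Everything beyond this retrieval step — the utilities, heuristics and monitoring subsystems — is what the more recent models described above add to the basic idea.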
Hypothesis-testing accounts
An alternative approach to instance-based theories is to examine the presence of skilled rule-learning and hypothesis-testing behaviour (Burns & Vollmeyer, 2002; Klahr & Dunbar, 1988; Simon & Lea,
Cognitive psychology 1974; Vollmeyer, Burns, & Holyoak, 1996). In their accounts, expertise is identified by the ability to call to mind appropriate schemas (structures of knowledge that allows the expert to recognize a problem state as belonging to a particular category of problem states that typically require a specialized operation) developed from past experiences that can be transferred across domains (e.g., Trumpower, Goldsmith, & Guynn, 2004). Sweller (1988) claimed that achieving expertise is dependent on the goal directedness or specificity of the goal the individual is engaged in. When goal directed, the individual is focused on achieving a particular outcome through means–end analysis (a method of reducing the distance between the current position in the problem and the end state) and is unable to develop a deep understanding of the task. This process of means-end analysis interferes with the uptake of relevant knowledge through hypothesis testing, because the individual is only concerned with serving the immediate demands of a specific goal (SG) (Sweller, 1988). Consequently, in a complex task people are able to reach a specific goal they are set, but they have poor knowledge of the general structure of that task. In order to promote schema-based knowledge, the goal directedness of a complex task needs to be removed and replaced with non-specific goal (NSG) instructions. NSG instructions are characterized as constraint free; they encourage the exploration with the view to encourage people to discover the relevant properties of the task. Rather than relating the effects of goal specificity to the different constraints on information processing, Burns and Vollmeyer (2002) develop dual-space hypothesis (Klahr & Dunbar, 1988; Simon & Lea, 1974) by describing the goal specificity effect in terms of the individual’s focus of attention. Burns and Vollmeyer claim that control systems can be deconstructed into spaces: the rule space, which determines the relevant relationship between inputs and outputs, and the instance space, which includes examples of the rule being applied. Under SG instructions, the instance space is relevant because it is integral to the goal, that is, the individual’s attention is focused primarily on achieving a particular instantiation of the
Magda Osman rule, and not on discovering the rule itself. Because searching through the control task is unconstrained, under NSG instructions both rule and instance spaces are relevant. Thus, in this case, attention is distributed across all relevant properties of the task because no one instantiation of the rule is more important than another. In turn, searching through the rule space encourages hypothesis testing, which leads to a richer understanding of the underlying structure of the problem (e.g., Burns & Vollmeyer, 2002; Geddes & Stevenson, 1997; Renkl, 1997; Trumpower, Goldsmith, & Guynn, 2004; Vollmeyer, Burns, & Holyoak, 1996). In sum, learning and deciding what to do in order to control events in a system require attending to the right informational space in the system, and adopting the right goal at the right time. Goal-specific pursuits are important in control, but only when the individual controlling the system is completely knowledgeable about the system.
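The contrast with instance retrieval, and the effect of goal specificity, can be sketched in the same spirit. In the toy example below (the system, candidate rules and thresholds are all assumptions made for illustration), a learner keeps only the rules that remain consistent with feedback; probing a narrow region around a specific goal leaves many rules standing, whereas unconstrained exploration prunes the rule space.

```python
import random

def system(x):
    """Hypothetical system to be learned (unknown to the learner)."""
    return 3 * x + random.gauss(0, 0.2)

# Candidate hypotheses about the input-output rule: y = k * x for k = 1..5.
candidate_rules = {f"y = {k}x": (lambda x, k=k: k * x) for k in range(1, 6)}

def surviving_rules(probes):
    """Hypothesis testing: discard any rule whose prediction strays from feedback."""
    surviving = dict(candidate_rules)
    for x in probes:
        observed = system(x)
        surviving = {name: rule for name, rule in surviving.items()
                     if abs(rule(x) - observed) < 1.0}
    return sorted(surviving)

# Specific goal: the learner stays close to the one input that meets the goal,
# so the probes barely discriminate between the candidate rules.
print("SG probes :", surviving_rules([0.3, 0.25, 0.35]))
# Non-specific goal: free exploration over a wide range rules most candidates out.
print("NSG probes:", surviving_rules([0.5, 2.0, 4.0, -3.0]))
```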
Naturalistic decision-making theories
Inherent to theories describing behaviour in naturalistic decision-making environments is how people differ according to their subjective experiences of uncertainty. More specifically, the judged uncertainty of an individual embarking on an unfamiliar control system situation should be different from that of an individual experienced in it (Lipshitz & Strauss, 1997). The theories describe the interactions between individuals and control systems in terms of a naturalistic dynamic decision-making process. Cohen, Freeman, and Wolf's (1996) recognition/metacognition (R/M) model and Lipshitz and Strauss' (1997) reduction, assumption-based reasoning, weighing pros and cons, forestalling and suppression (RAWFS) model describe how a specific set of skills is utilized by experienced decision makers in novel situations. Klein's (1997) recognition-primed decision (RPD) model focuses on the way people can use their experience to make rapid decisions under conditions of time pressure and uncertainty. In general, though, these accounts describe the process of decision making in applied real-world situations (e.g., commercial aviation pilots, fire commanders, design
Cognitive psychology engineers and offshore oil installation managers). In addition, Vicente’s (Chery, Vicente, & Farrell, 1999; Jamieson et al., 2007; Vicente & Wang, 1998) work examines human–computer interaction (HCI) on industrial-scale control systems (nuclear power plant, petrochemical production, radio communication and thermohydraulic processor). By taking a prescriptive approach, Vicente’s (2002) ecological interface design framework describes the informational requirements and the actions needed to meet the high-goal stakes in such environments. Like the RPD model, Vicente’s framework is one of the few that describes the pressured environment and the high costs associated with poor decision making in control systems. Consequently, these theories have been extended to accommodate research on control tasks in HCI, ergonomics and organizational psychology. In the initial stages of controlling a system, all three models propose that experts tend to make a match between the perceptual details of the task and their stored knowledge (i.e., pattern matching). The R/M model asserts that when people interact with control systems, the events (states) that are generated by the system cue recall of related knowledge, goals and plans. Thus the initial step in learning about a novel situation is recognitional pattern matching. Each prior experience has a pattern of environmental cues associated with it. Schemas of past actions that are cued have associated relevant information of previously encountered events that facilitates assumption-based reasoning which is used to generate relevant actions in the current novel situation. Although the representations need not be viewed as specific instances, as described by instance-based theories, the mechanism of learning thus far bears close resemblance with those of instance-based theories. In the RAWFS model, people begin generating actions, evaluating them and mentally simulating the outcomes, whilst also drawing from memory recognizable plans of behaviour suited to the task, achieved through pattern matching. They cycle through this process, monitoring the success of their understanding of the task and the action outcomes. The RPD model proposes that expert decision makers begin by identifying critical cues, so as 201
to attain a match between the current situation and previously experienced plans of action. In their model, there is very little time devoted to developing and evaluating alternative strategies. Instead, resources are directed towards understanding the situation, and once it is understood the course of action is relatively clear. This pattern of behaviour is often borne out, and there are many applied situations in which experts show that their knowledge is pattern indexed in relation to domain-specific tasks, consistent with RPD's claims. As interactions with the task progress, the R/M model proposes that metacognitive7 processes are recruited to help the decision maker evaluate her actions and refine past actions that have been cued by the current situation. This is achieved through critiquing, which involves identifying sources of uncertainty, and correcting them by seeking more information from memory or from the control system. Similarly, in the RAWFS model, metacognition is an integral part of the decision-making process. When the decision maker is making slow progress towards reducing uncertainty, this prompts her to fill in gaps in knowledge by devising plausible possibilities. These inferences are weakly held, and are abandoned when later found to conflict with new evidence or other strongly supported inferences. If at any stage a single good action cannot be identified, or the decision maker cannot differentiate between several sensible options, she resorts to suppression. This is a tactical and practical
7. In its simplest description, metacognition is a process of stepping back and thinking about what one is thinking. Some have proposed that it is involved in tracking goal-relevant behaviour, modulates motivation and triggers self-reflective judgements (Bandura & Locke, 2003). Others propose that it is used to estimate our current status within a problem; i.e., how far from or near an intended goal we are, and what we need to do to get there (Kornell, Son, & Terrace, 2007). Metacognition is also thought to have a supervisory role because it enables the resolution of the conflict that arises between competing goals, or between a competing choice of actions, by arbitrating between several options and evaluating which is best (Nelson, 1996).
Cognitive psychology decision that enables the decision maker to continue operating within the system, while ignoring undesirable information and developing strong rationales for the course of action she eventually decided to take. In the later stages of interacting with a control system, the RPD model proposes that the individual engages in diagnostic decision making. This captures an important characteristic distinguishing expert from novice decision makers. During situational assessment, experts are adept at realizing when they do not have sufficient information to adequately assess a situation. The situation is diagnosed using techniques such as feature matching and story building, each of which requires that more information be extracted from the control system. Experts are also adept at recognizing anomalies between current and past situations; this is done by mentally simulating the action and its consequences.
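A schematic rendering of the recognition-primed idea is sketched below. It is an illustration only: the cue names, stored schemas and the crude 'mental simulation' test are invented for the example and are not drawn from Klein's materials. The decision maker matches current cues to stored situation patterns and adopts the first retrieved course of action that survives a quick mental simulation, rather than exhaustively comparing alternatives.

```python
# Stored experience: typical cue patterns and the course of action they prime.
schemas = [
    {"cues": {"smoke": 1, "heat_rising": 1, "floor_soft": 0},
     "action": "ventilate and attack from the front"},
    {"cues": {"smoke": 1, "heat_rising": 1, "floor_soft": 1},
     "action": "withdraw: possible basement fire"},
]

def matches(schema_cues, observed):
    return sum(schema_cues.get(c, 0) == v for c, v in observed.items())

def mental_simulation_ok(action, observed):
    # Crude stand-in for imagining the consequences of the candidate action.
    return not (action.startswith("ventilate") and observed.get("floor_soft"))

def recognition_primed_decision(observed):
    # Consider options one at a time, in order of recognitional fit.
    for schema in sorted(schemas, key=lambda s: -matches(s["cues"], observed)):
        if mental_simulation_ok(schema["action"], observed):
            return schema["action"]   # the first workable option wins
    return "seek more information"    # fall back to diagnosis when nothing fits

print(recognition_primed_decision({"smoke": 1, "heat_rising": 1, "floor_soft": 1}))
```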
Summing up of theories
One way or another, despite the different underlying claims made by the theories presented here, there is some obvious common ground. The most common is that the theories focus on how we minimize the high degree of uncertainty associated with control systems. For example, pattern matching is an effective means of reducing uncertainty because it draws from prior experience, as well as from those experiences formed online, to decide which actions are needed in order to control the system. Prediction is a means of regulating one's knowledge on a frequent basis. Control systems are dynamic, and so we need to update and integrate information as it changes; in order to do this, the individual needs to predict what will happen next as a way of reducing uncertainty. Learning, at either an expert level (e.g., commercial aviation pilots, fire commanders, design engineers and offshore oil installation managers) or a novice level (e.g., simulated control systems in the laboratory), is a goal-directed pursuit, because the ultimate aim is to reliably control outcomes in a system. However, to do this,
our actions and the actions of the system need to be regulated, and this requires a monitoring process. Monitoring then constrains uncertainty by focusing cognitive resources towards tracking self-generated and environmentally determined changes in relation to a goal. The remainder of this chapter will now explore in a little more detail how decision making and learning in other research domains in cognitive psychology relate to how we reduce uncertainty in control systems.
Related research on decision making and learning to control uncertainty How it is that we can predict an outcome and perform actions from that has been the focus of the most influential theories in psychology. For example, in his treatise on behaviourism, Skinner (1953) commented, ‘Any condition or event which can be shown to have an effect upon behaviour must be taken into account. By discovering and analysing these causes we can predict behaviour; to the extent that we can manipulate them, we can control behaviour ’ (p. 1). Here, Skinner considers prediction and control at two levels. He lays bare the fundamentals of how general scientific inquiry should proceed when examining human and animal behaviour on the basis of prediction and control. He also recognizes that the same principles underlie basic human and animal behaviour – that is, animals and humans learn to predict effects from causes, and this is the first critical step to developing behaviours that enables us to master our environment. In one form or another, these components of behaviour have received considerable attention in studies of perceptual-motor control (i.e., using perceptual information to direct actions), causal learning and reasoning (i.e., predicting associations between combinations of events), and multiple cue probabilistic decision making (i.e., deciding on information from the environment that will be useful in predicting events). The aim here is to situate the work on control systems behaviour within the broader context of general learning and decision-making mechanisms. 204
Multiple cue probability learning
The closest cousin to work on control systems behaviour is the study of predictive learning in multiple cue probability learning (MCPL) tasks. MCPL tasks involve presenting people with cues (e.g., symptoms such as rash, fever or headaches) which are each probabilistically associated with an outcome (e.g., a disease such as flu or cold) that needs to be learned. When presented with various cue combinations (e.g., a rash or fever), people predict the outcome (e.g., flu or cold), and then receive some type of feedback on their prediction (e.g., the correct response is flu). The main difference between MCPL learning and control tasks is that because only observations of the cue patterns are used to predict the events, the goal is to reduce the discrepancy between the predicted (i.e., expected) and the actual outcome. In contrast, learning to control an outcome (i.e., SG learning) involves manipulating the cues, and the goal is to reduce the discrepancy between the actual outcome and the specified goal (see Figure 7.2 for an example of an MCPL task). MCPL tasks are inherently uncertain, and so early studies were largely concerned with the type of information acquired under different levels of uncertainty. This involved varying the following: the number of cue–outcome relations (Slovic, Rorer, & Hoffman, 1971), combining continuous cues with binary outcomes (Howell & Funaro, 1965; Vlek & van der Heijden, 1970), varying the number of irrelevant cues included (Castellan, 1973), varying the type of feedback presented (Björkman, 1971; Castellan, 1973; Holzworth & Doherty, 1976; Muchinsky & Dudycha, 1975), imposing time limits (Rothstein, 1986), manipulating cue variability (Lanzetta & Driscoll, 1966) and manipulating cue validity (Castellan & Edgell, 1973; Edgell, 1974). The general predictive behaviour revealed in these and more recent studies suggests that people do not exclusively focus on cues that reliably predict the outcome, but rather distribute their attention across relevant and irrelevant cues; though, as the number of irrelevant cues increases, predictive accuracy suffers. In other
Figure 7.2 Example of a multiple-cue learning task. The three hormones labelled A, B and C are the three cues, and the values of the cues are used to predict the outcome, which is indicated by the slider at the side of the small screen. Participants are required to move the slider to indicate the value of the outcome (i.e. they predict the outcome from the multiple-cue values).
words, when trying to predict events in the world (tomorrow’s weather), people pay attention to things that are useful to them (e.g., temperature and air pressure), but also people pay attention to things that are useless to them (e.g., population growth in Spain). Moreover, people are sensitive to the cue validities (i.e., how useful air pressure is in predicting the weather), and will look for further discriminatory information to secure their understanding of them. Under time restrictions and other methods of imposed constraints on cognitive resources, people’s prioritization of task information becomes apparent. In addition, as has been discussed in studies of 206
Cognitive psychology control, feedback is important here too. It is a key feature in MCPL tasks, and its effects on predictive behaviour have been an enduring issue in the history of this research area. Overall, what has been found is that by varying the probabilistic relationship between cue and outcome, as uncertainty increases, the facilitative effect of feedback on predictive performance decreases. Castellan’s (1977) review of findings from early studies suggests that, despite the difficulty in learning posed by uncertain environments, people do not simply memorize the cue patterns, but learn the relations between the outcome and the individual cues which constitute the patterns. In later work, this position has been revisited. It also bears some relation to the proposals made by hypothesis-testing accounts of control behaviours. Recent work on MCPL tasks suggests that people do indeed integrate different sources of probabilistic information (e.g., Lagnado et al., 2006; Price, 2009; Speekenbrink, Channon, & Shanks, 2008; Wilkinson et al., 2008). Furthermore, the findings suggest that learning complex probabilistic associations in MCPL tasks is a purposeful activity of hypothesis-testing behaviour. The general suggestion is that people go about learning the cue–outcome associations by formulating separate hypotheses about the cue patterns, and about the accuracy of their response given a particular cue pattern.8
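The kind of cue-weight learning studied in MCPL tasks can be sketched with a simple error-driven (delta-rule) learner. The cue validities, learning rate and trial structure below are illustrative assumptions, not taken from any of the studies cited; the point is only to show how weight can come to favour valid cues over irrelevant ones through outcome feedback.

```python
import random

random.seed(1)
TRUE_VALIDITY = {"fever": 0.8, "rash": 0.6, "headache": 0.0}  # headache is irrelevant
weights = {cue: 0.0 for cue in TRUE_VALIDITY}
learning_rate = 0.05

for trial in range(500):
    # Present a random cue pattern and generate the outcome probabilistically.
    cues = {cue: random.randint(0, 1) for cue in TRUE_VALIDITY}
    p_flu = min(1.0, sum(TRUE_VALIDITY[c] for c, on in cues.items() if on))
    outcome = 1.0 if random.random() < p_flu else 0.0

    # The learner predicts from its current weights, then updates on the error.
    prediction = sum(weights[c] for c, on in cues.items() if on)
    error = outcome - prediction
    for c, on in cues.items():
        if on:
            weights[c] += learning_rate * error

print({cue: round(w, 2) for cue, w in weights.items()})
# The weights on the valid cues grow; the weight on the irrelevant cue stays near zero.
```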
Perceptual motor learning
Recent evidence in which prediction is shown to be critical for control can be found in studies of simple motor learning in dynamic environments (Flanagan et al., 2003; Grafton et al., 2008). Typically, visuo-motor tasks involve tracking objects (e.g., tracking a coloured spot moving smoothly or jaggedly across a computer screen) using cursors, or else the task involves modulating grip
8. However, others have contested this, and much research has been directed towards showing that probabilistic information is acquired through incidental learning (see Knowlton, Mangels, & Squire, 1996; Poldrack et al., 2001), and is a similar mechanism to that proposed by instance-based theories of control.
Magda Osman control in response to visual cued signals. Studies of visuo-motor learning that have directly contrasted the effects of learning to predict (e.g., where will the spot appear next on the screen?) with learning to control the outcome of actions (e.g., move the spot so that it accurately tracks the moving target) in a probabilistic environment claim that prediction has the overall advantage in building up accurate representations of the environment (Flanagan et al., 2003; Grafton et al., 2008). These studies show that directing actions towards goals is the role of control (e.g., moving a cursor within a specified region), while predicting the outcome of control behaviours is the role of prediction (tracking the movement, the effects of the movement or both). The implications of these findings are that both prediction and control are essential to skilled motor behaviour in which people are required to follow a specific goal and make choices about the actions needed to pursue it. But, prediction serves as a compensatory mechanism that enables immediate adjustments of behaviour because there are delays in transfer of information from the environment to the sensory-motor system (Barnes, 2008; Gawthrop, Lakie, & Loram, 2008). If one were to chart the timeline from initiated action to execution, prediction is acquired earlier than control. One reason for this is that the motor system needs a feedforward system (prediction) that forecasts outcomes, and a complementary feedback-system that builds on the information to attune behaviour accordingly (control), which is important when learning to achieve new goals. Additionally, the claim has also been made that whilst the environment we interact with is dynamic and often demands an immediate response, there is a delay in information transfer from environment to the sensory-motor system; therefore, predicting outcomes in behaviour is a compensatory mechanism that enables us to adjust our behaviours in light of processing delay (Barnes, 2008; Gawthrop, Lakie, & Loram, 2008).9 9
9. A similar proposal has been made by Leigh (1992) with respect to control systems engineering (see Chapter 3).
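The division of labour described here — a feedforward component that predicts the consequences of a command before sensory feedback arrives, and a feedback component that corrects on the basis of what actually happened — can be sketched as follows. This is a deliberately simplified illustration with assumed dynamics, delays and gains, not a model of the sensorimotor system from the cited studies.

```python
from collections import deque

def plant(position, command):
    """Hypothetical effector: the cursor/limb responds only partially to each command."""
    return position + 0.5 * command

class ForwardModelController:
    """Feedforward control via an internal forward model, plus delayed
    sensory feedback used to correct that model's running prediction."""
    def __init__(self, sensory_delay=3, gain=0.8, correction=0.5):
        self.predicted = 0.0
        self.past_predictions = deque([0.0] * sensory_delay)
        self.gain = gain
        self.correction = correction

    def step(self, target, delayed_observation):
        # Feedback component: compare what was predicted some steps ago with
        # what the (delayed) senses now report actually happened back then.
        old_prediction = self.past_predictions.popleft()
        self.predicted += self.correction * (delayed_observation - old_prediction)
        # Feedforward component: act on the current prediction, not on stale input.
        command = self.gain * (target - self.predicted)
        self.predicted = plant(self.predicted, command)   # simulate the consequence
        self.past_predictions.append(self.predicted)
        return command

controller = ForwardModelController()
true_position = 0.0
sensory_line = deque([0.0] * 3)          # observations reach the controller 3 steps late
for t in range(8):
    command = controller.step(target=1.0, delayed_observation=sensory_line.popleft())
    true_position = plant(true_position, command)
    sensory_line.append(true_position)
    print(f"t={t}  command={command:5.2f}  true position={true_position:5.2f}")
# With an accurate forward model the delayed feedback barely needs to correct anything;
# the correction term matters when the model or the world is noisy or miscalibrated.
```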
Causal learning and causal reasoning
A broad and crude distinction can be drawn in the study of how we acquire causal knowledge by considering psychological studies of causality in the following way. Typically, people are required, on the basis of some evidence, to estimate the association between two events (X and Y), and to predict from this the likelihood of event Y given event X. Examining causal learning behaviours involves presenting people with the evidence over a series of trials, so that they experience the events X and Y (causal learning). Examining causal reasoning behaviours involves the presentation of events X and Y and inferring the type of causal structure that connects them (causal reasoning). People may be able to make causal inferences from accumulating evidence of events in the environment, or from the presentation of a single combination of events. But causal learning and causal reasoning are not orthogonal to each other. Our experiences of uncertainty in the context of acquiring causal knowledge have been explored by a number of theorists (e.g., Jungermann & Thüring, 1993; Kerstholt, 1996; Krynski & Tenenbaum, 2007; Pearl, 2000; Tenenbaum, Griffiths, & Kemp, 2006). Much of this work considers how we make causal judgements using a Bayesian belief-updating process. A person will evaluate a claim that an action taken in a complex dynamic control (CDC) task will have a particular outcome (h = hypothesis), given that he has observed an actual outcome (d = datum) (i.e., he has some data by which to evaluate his hypothesis), and subject to a general assumption he has about how the CDC task operates. To combine these in such a way that enables him to evaluate his hypothesis, a person will integrate base rates (i.e., the prior probability of the hypothesis) and the likelihood ratio (the success of competing hypotheses in predicting the data) according to Bayes' rule. In Bayes' rule, the probability of the hypothesis tested (h) given the observed datum (d) is obtained by weighting the prior probability favouring that hypothesis by how well it, relative to its competitors, predicts the datum (the likelihood ratio) (for
Magda Osman a more detailed description, see Tenenbaum, Griffiths, & Kemp, 2006). There is considerable evidence supporting the claim that people often make causal judgements that approximate this form of reasoning (e.g., Krynski & Tenenbaum, 2007; Steyvers et al., 2003), but it has not yet been used to describe how people formulate causal models in environments such as complex dynamic control systems. One key issue that connects this work up to prediction and control is the contrast between learning by directly intervening and learning indirectly through observation.10 This contrast has become an important issue in investigations of causal structure learning (Cooper & Herskovits, 1992; Steyvers et al., 2003; Tenenbaum & Griffiths, 2001) because it has been informative of how closely our psychological behaviours adhere to optimal formal descriptions of them. There are formal models (Pearl, 2000; Spirtes, Glymour, & Schienes, 1993) that capture the probabilistic dependencies present in a set of data, and their relation to the causal structures that could have generated the data. These models provide a strong theoretical basis for arguing that intervention is a crucial component in the acquisition of causal structures, by which is meant cause–effect relationships that include multiple causes, multiple effects or causal chains (e.g., coffee drinking may be the cause of insomnia, which in turn causes headaches, which in turn causes stress). For instance, there is much interest in how people infer what the causal links are from a network involving multiple cause–effect relationships (e.g., is insomnia the cause of stress and depression? Or is depression a common effect of stress and insomnia?). Studies that examine causal structure learning often require participants to observe trials in which the events of a causal structure are presented, and they must then infer which causal structure they observed. This is in contrast with conditions in which participants actively learn about the causal structure, by manipulating a candidate cause and observing the
10. One can draw a parallel between observation and prediction, and again with intervention and control (Hagmayer et al., in press).
Cognitive psychology events that follow. The evidence suggests that people learn more from the information gained when making interventions than when observing trials in which information about different causal structures is presented (Gopnik et al., 2004; Lagnado & Sloman, 2004; Sloman & Lagnado, 2005; Steyvers et al., 2003). However, the benefit of interventions isn’t always quite so clear-cut because not all interventions are atomic; they can also be uncertain11 (e.g., Meder et al., in press). While in general interventions are helpful in uncovering causal structures, they can be obstructive, and most recent work suggests that observation followed by intervention leads to the most accurate causal representations of the structure of the environment (e.g., Hagmayer et al., in press; Meder, Hagmayer, & Waldman, 2008). Why is this important to control? Controlling outcomes in control systems involves directly interacting with the system, which can be treated as an intervention in the realm of causal structure learning. Therefore, it is important to understand what the implications are for our representation of the control system in terms of its causal structure based on our direct actions within it (interventions) and our indirect actions outside of it (prediction/observation). If there are differences in our representations based on how we behave with respect to the control system, then it will have a direct bearing on the way we go about trying to control it. The work here suggests that specific types of interventions can gain us valuable knowledge of the causal relations between our actions and the events that occur. But the informativeness of our interventions is
11. Atomic interventions are ones in which acting on the environment will clearly reveal some aspect of the causal structure (e.g., when trying to determine the relationship between a light switch and two light bulbs in the room, by pressing the light switch and revealing that only one of the bulbs lights up we have achieved an atomic intervention). An uncertain or imperfect intervention is one in which it may not fix the state of a variable quite so clearly (e.g., taking hay fever tablets to alleviate the runny nose and sore throat may reduce them, but other factors like mould also contribute to the symptoms, which the intervention is not affecting, and so in this case the intervention is imperfect).
Magda Osman also dependent on the causal structure of the environment, and crucially they are not isolated from our predictions. We cannot learn the cause–effect relations based on interventions if we haven’t carefully chosen the interventions to start with, and that comes from making accurate predictions. In general, much of the work on causal learning and causal reasoning suggests that discovering the underlying association between events in the environment often involves recruiting prior knowledge and experience (which comes with its own problems, i.e., biases) to develop a sense of what events in the environment may usefully go together. In addition, hypothesis testing helps to develop a means of uncovering the underlying association between events through planned interventions (i.e., what aspect of the environment should be manipulated to reveal the underlying causal relationship?). This is, of course, not too far off from the kinds of behaviours that have been described in studies of control systems behaviour, and recent work has indeed shown that causal reasoning is part and parcel of learning to control an uncertain environment (e.g., Hagmayer et al., 2009).
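The Bayesian belief-updating process described above, and the value of well-chosen interventions, can be illustrated with a small numerical sketch. The candidate hypotheses, priors and likelihoods below are invented for the example (they are not taken from the cited studies); the point is simply that each observed outcome of an intervention reweights the hypotheses by prior × likelihood, following Bayes' rule.

```python
# Two hypothetical causal hypotheses about a control task:
#   h1: "moving lever A raises the output"  (predicts a rise with probability .8)
#   h2: "lever A is causally irrelevant"    (a rise happens anyway with probability .3)
priors = {"h1: lever A raises output": 0.5, "h2: lever A is irrelevant": 0.5}
likelihood_of_rise = {"h1: lever A raises output": 0.8, "h2: lever A is irrelevant": 0.3}

def bayes_update(beliefs, outcome_was_rise):
    """Posterior ∝ prior × P(datum | hypothesis), then normalize."""
    unnormalized = {}
    for h, prior in beliefs.items():
        likelihood = likelihood_of_rise[h] if outcome_was_rise else 1 - likelihood_of_rise[h]
        unnormalized[h] = prior * likelihood
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

beliefs = dict(priors)
for trial, rise in enumerate([True, True, False, True], start=1):  # outcomes of 4 interventions
    beliefs = bayes_update(beliefs, rise)
    print(f"after intervention {trial}: " +
          ", ".join(f"P({h.split(':')[0]}) = {p:.2f}" for h, p in beliefs.items()))
```

Each outcome shifts belief towards the hypothesis that predicts it better; an intervention whose outcome both hypotheses predict equally well would leave the beliefs unchanged, which is why the choice of intervention matters.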
A little synthesis
The role of cognitive psychology is to tell us what information-processing systems contribute to our ability to control outcomes in control systems of any kind. Of all the chapters so far, this should be the one that offers the clearest answer to the target question 'How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?' However, the answers that cognitive psychology provides are incomplete, because until now the various domains that are relevant to understanding control behaviours have remained separate from each other. Studies of multiple cue learning don't draw from work on control systems behaviour, work on control systems behaviour does not draw from work on causal learning, and so on. This is a rather bleak state of affairs
since all of these domains have an important contribution to make in answering the target question. Thus, this short synthesis will examine the family resemblances in order to present the general behaviours that cognitive psychology proposes contribute to controlling complex systems. To put the ideas in the context of the story of control, what we need to bear in mind as a puppeteer is that we must always interact with the puppet in a goal-directed way, both when first learning to operate it and when becoming expert in making it dance. If we structure the way in which we seek information (i.e., learning which levers make the head move), then we can formulate predictions, and from those predictions we can contain our knowledge in a coherent way. Similarly, when we have acquired enough knowledge to make the puppet dance, we still know that there are aspects of its internal mechanism that will make it behave unpredictably. We can only know when it has behaved unpredictably if we have a secure understanding of our actions (i.e., the order in which we pressed the levers and when), and a good understanding of the causal relationship (in which the dynamic and probabilistic properties are taken into account) between actions and events.
Prediction and control
Much of the work from complex problem solving suggests that the development of skill in its early stages, as well as when expertise emerges, depends on the mediation between exploration – in order to refine control behaviours – and exploitation,12 which reinforces our knowledge of how to behave in the system through repeated practice of those behaviours. The prevalence of these activities in studies of control system behaviour, as well as in MCPL, motor-learning and causal-learning tasks, seems to be the result of the
12
See the ‘Models of Learning’ section in Chapter 4 for more details on the distinction between exploration and exploitation.
That is, we reduce uncertainty by (1) extracting, evaluating and estimating (i.e., monitoring/prediction) relevant task information, on which to base our actions; and (2) implementing actions (i.e., control) without the need to analyse them at any great length before we make them, but with the desire to track the effects that they produce. Moreover, the practical demands of dynamic, unstable and uncertain environments are such that we need to adapt to changes quickly. Thus, psychological behaviours that incorporate both (1) and (2) ensure that decisions and actions are sufficiently flexible in order to respond to the shifting demands of the situation, which requires monitoring of our behaviours (Osman, 2010).
Feedback and feedforward mechanisms
The psychological processes that have been commonly identified by all the accounts discussed here concern representing and controlling uncertainty. There are processes that help orientate attention to relevant properties of the uncertain environment (e.g., pattern matching, biases and recruiting prior knowledge), and there are those processes that help to plan decisions and actions to interact with the uncertain environment (e.g., hypothesis testing, tracking and evaluating self-initiated outcomes as well as the outcomes that may be generated by the environment). Condensing these processes further, the picture that emerges is that across the different theories and empirical findings presented here, the underlying mechanisms that support these behaviours reduce to a type of feedforward mechanism that serves as a compensatory mechanism and enables people to forecast the outcome of their actions, or the behaviour of the environment. In concert with this is a feedback mechanism, which is used to strengthen and update knowledge of the task, as well as knowledge of the status between desired and achieved outcomes in the environment. Feedback and feedforward mechanisms not only usefully summarize the crux of the underlying mechanisms that support decision making and learning behaviours in
control systems, they are also at the core of neurological systems that operate in our brains. Moreover, feedback and feedforward mechanisms also describe the relationship between neurological systems and the information that they operate on; both are discussed in the next chapter.
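Before turning to the brain, a toy illustration may help fix the idea of combining the two mechanisms. The sketch below is purely illustrative (the function names, gains and numbers are hypothetical, not a model proposed in this book): a feedforward component forecasts the action needed to reach a desired outcome from an internal model, and a feedback component corrects for the discrepancy between desired and achieved outcomes.

```python
# Purely illustrative sketch: feedforward forecasting plus feedback correction.
# All names, gains and numbers are hypothetical.

def feedforward(target, model_gain):
    """Forecast the action needed to reach the target, using an internal
    model of how strongly actions move the outcome."""
    return target / model_gain

true_gain = 2.0     # how the environment actually responds to actions
model_gain = 1.5    # the (imperfect) internal model of that response
target = 10.0
correction = 0.0    # accumulated feedback adjustment

for trial in range(8):
    action = feedforward(target, model_gain) + correction   # feedforward + feedback
    achieved = true_gain * action                            # outcome produced
    correction += 0.2 * (target - achieved)                  # update from the error
    print(trial, round(action, 2), round(achieved, 2))
```

Because the internal model is wrong, feedforward alone overshoots; the accumulated feedback correction gradually brings the achieved outcome to the desired one.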
Chapter 8
Neuroscience
In the second half of this book, the focus has been on the puppeteer. The chapters have discussed the importance of the goals and desires that motivate the puppeteer to operate the puppet, and the knowledge that the puppeteer has gained through his interactions with the puppet. In this chapter we will examine the internal workings of the puppeteer’s brain. Just like the puppet, the brain is a control system and it operates according to its own rules and logic. So, one approach to answering the target question of this book, ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’, is to investigate our brain mechanisms. Brain mechanisms have properties like other control systems we have examined, such as feedback loops, dynamics, uncertainty and control. Another approach is to try answering the question by looking to studies of decision making and learning from a neuroscience perspective. Now that we have considered the cognitive and social-psychological processes that enable us to control uncertainty,
we are better equipped to understand how these processes are supported by brain functions. Before examining the different ways that neuroscience can broach the question of this book, a little context needs to be supplied. The types of empirical methods developed in neuroscience that examine decision-making and learning processes focus on low-level behaviours (e.g., motor movements, eye movements and action plans). The laboratory tasks do not have the same advantage as those used in cognitive psychology; they are simple tasks (e.g., tracking a moving object on a screen with a cursor, or selecting which of two buttons to press in response to a gambling task),1 not ecologically valid ones. However, while the tasks used in neuropsychological studies may be simple, and the behaviour under test is seemingly restricted to choices between two or three button presses, there are important insights to be gained. The details of the experimental structure of the task in conjunction with the areas of activation can inform us as to how we process uncertainty at a neurophysiological level, which can have profound implications for our general understanding of our behaviour. To capitalize on this, a relatively new field of research has attempted to co-ordinate psychology, economics and neuroscience (Camerer, Loewenstein, & Prelec, 2005; Glimcher, 2003; Rustichini, 2005). As a new research endeavour, neuroeconomics investigates the neural correlates of our choice behaviours and compares them with economic models of choice under uncertainty. Although not specifically tackling issues of control, it does have a bearing on the themes of this book, because it is concerned with the way in which we make decisions under uncertainty, more specifically how we predict and determine outcomes, and how we interpret their effects through feedback (e.g., gains and losses, or reward and punishment).
1
Studies involving functional magnetic resonance imaging (fMRI) scanners used to image the brain impose restrictions on the types of experiments that can be conducted (for a discussion of practical and philosophical matters raised by imaging techniques and methodologies, see Uttal, 2001). There is a long-standing debate as to what fMRI can offer as a technique for examining psychological processes above and beyond simply using behavioural studies in conjunction with formal models (for critical views, see Coltheart, 2006; and for supportive arguments, see Henson, 2005, 2006).
Given the two different ways in which the target question can be tackled, this chapter is divided into two sections, the first of which focuses on a few key cortical regions and their general operations as illustrations of neurobiological control systems. The particular cortical regions chosen for discussion are also commonly referred to in neuropsychological studies of decision making and learning under uncertainty. In this way, the regulatory operations of the cortical regions will be described as examples of control systems in view of how they are involved in the cognitive processes considered in the first section of this chapter. In the second section, the aim is to cover in as comprehensive a way as possible the work that has been conducted on neuropsychological functioning in humans and animals with respect to how we represent and deal with uncertainties. From this, the discussion then turns to the study of neuroeconomics and the general issues it raises with respect to how we process uncertainties and how we should interpret the neurological and psychological findings according to formal descriptions of behaviour.
The Neurobiology of Control Behaviours
As a warning, it is inevitable that when describing specific brain regions and their functioning, the descriptions will refer to associated cognitive processes. This is a problem not only because keeping separate the descriptions of brain function from cognitive function is hard, but also because it actually reflects a fundamental question that neuroscience tries to address, namely, is the order of mental activities isomorphic to the organization of the neural order of cortical regions (Fuster, 2003)? This question will most probably continue to await an answer for a very long time. For the moment we must acknowledge it as an issue, and move on. So, to begin, the general structural properties of cortical networks need some elaboration
before any discussion of specific cortical regions can start. This very brief introduction is designed to provide some basic concepts for understanding how cortical networks tend to operate. Cortical networks are regions in the cortical (outer) layer of the brain. Each network represents collections of neurons that are layered; this arrangement is characterized by connectionist neural networks (see Chapter 4). What this means is that cortical networks have input layers (these receive information), output layers (these relay information – to other cortical networks) and processing layers sandwiched in between (where the information received from the inputs gets processed and sent to the output layer). As with artificial control systems, cortical networks appear to possess emergent properties, and as with artificial control systems, these properties pose a problem to those examining them. The behaviour of a cortical network is such that tracing the inputs it receives and the outputs it generates doesn’t map onto mental representations in a way that the sum is equal to its parts. This has led some to suggest that cortical networks are non-linear, and this explains why they seem to have emergent properties (for discussion, see Getting, 1989). Cortical networks vary in the processing characteristics of their connections. The connections can show convergence (synchronous), divergence (asynchronous) or recurrence. Overall, the connective features of cortical networks are styled in such a way that they allow serial processing (i.e., processing small amounts of information from single sources) and parallel processing (i.e., processing large amounts of information from multiple sources). Given these rudimentary details, we can proceed to the prefrontal cortex.
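Before doing so, the layered input–processing–output arrangement just described can be made concrete with a minimal sketch. This is illustrative only – the weights and layer sizes are arbitrary, and real cortical networks are vastly more complicated – but it shows information passing from an input layer, through a processing layer, to an output layer.

```python
# Minimal illustrative sketch of a layered (connectionist-style) network.
# Sizes and weights are arbitrary assumptions, not a model of real cortex.
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(4, 6))    # input layer -> processing layer
W_out = rng.normal(size=(6, 3))   # processing layer -> output layer

def forward(sensory_input):
    hidden = np.tanh(sensory_input @ W_in)   # processing layer activity
    return hidden @ W_out                    # output relayed to other networks

print(forward(np.array([0.2, -0.5, 0.1, 0.9])))
```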
Prefrontal Cortex
The prefrontal cortex (PFC) is the pinnacle of the hierarchy of cortices in the frontal lobe (see Figure 8.1). The PFC represents the area of the brain that offers high-level cognitive functions, and is situated at the anterior end (front end) of the brain and has reached its greatest elaboration and relative size in the primate (Fuster, 2003).
Figure 8.1 Prefrontal cortex (schematic showing its connections with the septum, hypothalamus, nucleus accumbens, amygdala, ventral tegmental area, sensory inputs and motor outputs).
The reason for this is that the PFC is responsible for goal-directed activities, and given that most high-level functions are goal-directed behaviours, they tend to be found in the PFC. Moreover, because of the nature of the information processing involved in goal-directed activities (i.e., taking perceptual representations as inputs and generating actions as outputs), the PFC involves a perception–action cycle. When we are required to coordinate our decisions and actions with respect to a goal, the PFC is often seen to be activated. Perception is integrated into action via cortico-cortical connectivity mediating feedforward and feedback components of the cycle, with the highest executive structures implicated in the cycle in the PFC. When the goal and the behaviours to be executed are familiar, then the perceptual–motor integration occurs at circulatory loops that include, for example, the spinal cord, basal ganglia and hypothalamus. But when more complex and less routinized goals are sought, and learning to co-ordinate and plan new behaviours is required, then the PFC works in close collaboration with the basal ganglia (to
be discussed next). To understand this type of integration, Miller and Buschman (2006) proposed that the PFC is associated with high-level cognition that is part of the perception–action cycle: it takes outputs from the basal ganglia, which help to train the PFC networks to build up the high-level representations that underlie abstract thought. Goal-directed behaviours, both simple and complex, must at some stage have been learnt. To achieve this, a system has to learn what outcomes are possible, what actions might be successful in achieving them and what their costs might be. The PFC is well placed to achieve these requirements because it helps to formulate expectancies, and it is multimodal, which means that its neurons are responsive to a wide range of information from multiple senses. These attributes become important because when a new situation is faced, multiple sensory information needs to be processed. In addition, expectancies are needed to orientate attention to help select the relevant information to respond to. There is evidence to show that neurons activate in anticipation of a forthcoming expected event or outcome (Watanabe, 1996), which further suggests that the PFC learns rules that guide volitional goal-directed behaviour in new situations. After learning goal-directed behaviours, the PFC can work just like an autonomous artificial control system. That is, given a subset of inputs, it can complete or recall an entire pattern of goal-directed behaviours, and run them off like an automatized program. To initiate as well as maintain goal-directed behaviours, the PFC will receive signals from the posterior cortex (see Figure 8.1). In the case of internally initiated goal-directed behaviours, inputs from the posterior cortex will trigger activation of perceptual cortical networks by sensory signals, whereas in the case of externally driven goal-directed behaviours, a sensory stimulus will gain access to the perceptual network and activate it by association. If associations between perceptions and specific actions have already formed established connections with the executive network of the frontal region of the brain, goal-directed behaviours will also result from activations through posterior–anterior connections. Activation of
this kind will initiate an action, the sensory feedback will be processed in the posterior cortex, and this may in turn provide new inputs to the frontal cortex triggering a perception–action cycle. The general consensus appears to be that the PFC helps to coordinate behaviours. For instance, Miller and Cohen (2001) claim that the cardinal function of the PFC is in the acquisition and maintenance of patterns of activity that are goal directed. Fuster (2003) claims that the PFC’s primary function is the temporal organization of behaviour. This is because time separates sensory signals involved in goal-directed behaviour. Sensory feedback from actions taken and outcomes of behaviour are both delayed, and the PFC integrates these different sources of information. The PFC clearly has many functions, and has been associated with a wide range of activities. Specific regions associated with the PFC, such as the orbital PFC and dorsolateral PFC, are thought to be associated with goal maintenance and executive control (Glimcher & Rustichini, 2004), as well as reward expectation-related processes, whereas the ventromedial PFC is thought to be associated with evaluating feedback (Glimcher & Rustichini, 2004) and predictive learning (Schultz, 2006; Watanabe, 1996).
Basal Ganglia: Striatum (Caudate, Putamen, Ventral Striatum)
Essentially, the basal ganglia (BG) comprise a collection of subcortical nuclei (see Figure 8.2). Cortical inputs arrive via the striatum (which includes the caudate and the putamen) and are processed via the globus pallidus (GP), the subthalamic nucleus (STN) and the substantia nigra (SN). As with other control systems, a feedback system takes inputs and feeds them back into the cortex via the thalamus. Though the BG is a cortical hub, it does maintain some separation from other channels for the reason that outputs via the thalamus to cortical regions are the same cortical regions that gave rise to the initial inputs to the BG.
Figure 8.2 Basal ganglia: striatum (caudate, putamen and ventral striatum) (schematic showing the caudate nucleus, putamen, globus pallidus, subthalamic nucleus, substantia nigra, thalamus, deep cerebellar nuclei, cerebellum, pons and spinal cord).
Thus it has its own internal feedback system; it loops in on itself. Also, like the PFC, the basal ganglia are another example of a major site of cortical convergence, for the reason that most of the cortex projects directly onto the striatum through divergent connections. The striatum itself can be divided into the striosomes, which receive inputs from the entire cerebral cortex, and the matrix, which receives inputs primarily from the limbic and hippocampal systems and the prefrontal cortex (see Figure 8.2). Projections from the
striatum are distributed along two parallel pathways: the direct and indirect pathways, both of which project directly onto the thalamus. The direct pathway leads from the striatum into the globus pallidus internal (GPi) and the substantia nigra pars reticulata (SNpr). The indirect pathway involves striatal projections to the globus pallidus external (GPe), which in turn projects to the subthalamic nucleus (STN), which projects onto the GPi/SNpr. The significance of these two pathways is that they complement each other, existing in an equilibrium that enables the release of desired patterns of neural activity while inhibiting unintended ones. For example, Miller and Buschman (2006) claim that pathways of these kinds are needed to separate responding to relevant events that can be predicted from responding to irrelevant events such as coincidences. Therefore, flexibility (or plasticity, as it is referred to commonly) needs to be guided by information about which associations are predictive of desirable outcomes. One possible way in which guidance can occur is through reinforcement signals provided by dopaminergic neurons in the midbrain. The activity of neurons in this area is associated with reward–prediction error signals, and these activate and release dopamine (DA) in the subcortex and cortex. Over time, and through the course of learning, neurons originally responding to reward signals will eventually respond to the expected event that produces a reward, rather than the reward alone. The signals from these neurons eventually result in a net strengthening of connections between neurons in the PFC and BG, and a strengthening of representation of a network of reward-predicting associations. Given the various roles of PFC and BG, there have been claims that we possess two types of learning: fast learning, which is quickly learning to avoid something (e.g., moving your hand away from a flame) and which tends to be associated with the BG; and slow learning, which is accumulating experiences over time about a particular task (e.g., gaining knowledge about a field of research) and which tends to be associated with the PFC. But even if there is a suggestion that these two types of learning may serve completely different underlying functions, the evidence from neuropsychological studies suggests that there is a strong relationship between the PFC and BG.
The PFC monitors fast-learnt goal-directed behaviours from the BG in order to correct and improve their application over time.
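The reward–prediction error idea mentioned above is often formalized with a temporal-difference rule. The sketch below is illustrative only – the states, learning rate and discount factor are hypothetical, and this is not a claim about any particular study's model: the error is large when a reward is surprising, and as learning proceeds the predictive cue acquires value while the error at reward delivery shrinks.

```python
# Illustrative temporal-difference (TD) sketch of a reward-prediction error.
# States, rewards and parameters are hypothetical.

values = {"cue": 0.0, "reward_state": 0.0}
alpha, gamma = 0.2, 0.9    # learning rate and discount factor (arbitrary)

for trial in range(30):
    # Transition: cue -> reward_state (no reward is delivered at the cue itself).
    delta_cue = 0.0 + gamma * values["reward_state"] - values["cue"]
    values["cue"] += alpha * delta_cue

    # A reward of 1 is delivered in the reward state (treated as terminal here).
    delta_reward = 1.0 - values["reward_state"]
    values["reward_state"] += alpha * delta_reward

    if trial % 10 == 0:
        print(trial, round(delta_cue, 3), round(delta_reward, 3))

print(values)   # the cue comes to predict the reward; the error at delivery shrinks
```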
Anterior Cingulate Cortex
In general, the anterior cingulate cortex (ACC) is a large brain region that does not seem to have a uniform contribution to brain function because of its broad association with sensory, motor, cognitive and affective modules in the brain. But to simplify matters, it can be thought of in terms of having a separate cognitive and affective component. The cognitive division of the ACC involves connections with the lateral PFC, parietal cortex and pre-motor cortex and supplementary motor areas (SMA). The cingulate motor area (or caudal cingulate zone) of the ACC, which is its most posterior part, is associated with the motor system and projects to the spinal cord as well as to the striatum. As part of the motor system the ACC has a role in high-level motor control, and detecting when strategic control of behaviours is required, as well as in actual strategic control (MacDonald et al., 2000). In contrast, the most inferior regions of the ACC (see Figure 8.3) are thought to play a role in the processing of emotion. The affective division or emotional section of the ACC involves areas which project to the autonomic brainstem nuclei, and is connected to the amygdala, hypothalamus, hippocampus, nucleus accumbens and anterior insula. This aspect of the ACC modulates autonomic activity (e.g., blood pressure and heart rate) and emotional responses and is thought to be quite separate from the cognitive division of the ACC. There is controversy as to the specific associations that the ACC has with psychological functions such as tracking the outcomes of decisions, processing emotion, as well as how conflicts between representations, decisions and actions are represented and dealt with. In fact, even with respect to conflicts in representations, decisions and actions, it is still unclear whether ACC activation is associated exclusively with conflict at the level of response selection (e.g., which action to choose) or at the level of stimulus selection (e.g., which of the competing sources of information should I be focused on) (for discussions, see Bush, Luu, & Posner, 2000; Devinsky, Morrell, & Vogt, 1995).
Figure 8.3 Anterior cingulate cortex (schematic showing the cingulate gyrus, thalamus and spinal cord).
In line with the idea that the ACC is involved in conflict monitoring, the ACC is also thought to be involved in error detection. Segalowitz and Dywan (2009) discuss the importance of error detection (error-related negativity, or ERN) signals which are thought to originate from the ACC. These are claimed to be sensitive to the detection of action slips, prediction errors, errors of choice, errors in expectancy and a host of other errors. It may be that because detection of errors and conflict monitoring are necessary for all manner of human activities, the ACC is needed to support both cognitive and affective psychological activities.
Summary: Cortical Networks as Control Systems
One thing to take away from the discussion thus far is that cortical networks do indeed seem to operate very much like the control systems described earlier in this book. Namely, there are multiple feedback and feedforward loops that carry information across wide networks, not only to maintain the operations of the system but also to increase efficiency of the actual operations of the system for adaptive control.2 However, this view is not shared by all. As with social-psychological theorists, similar unsympathetic reactions (e.g., Marcus, 2008; Morsella, 2009) have been made in response to drawing an analogy between neurobiological functions and control systems. Their main argument goes something like this. We cannot reverse engineer from the structure of cortical networks to their function because even the types of cognitive mechanisms that are supposed to support them lead to suboptimal, inefficient and sometimes inconsistent actions. More to the point, if we cannot do so, how can we possibly hope to use the workings of the brain to understand cognitive mechanisms? However, there are elegant demonstrations of how we can estimate the consumption of energy of populations of neurons and use this as a metric for predicting the processing costs of a given psychological task (e.g., Lennie, 2003). There are also examples in which electrophysiological recordings of neuronal firing (i.e., electroencephalography, or EEG) have been used to identify task difficulty and predict the accuracy of decisions made in psychological tasks (e.g., Philiastides, Ratcliff, & Sajda, 2006; Ratcliff, Philiastides, & Sajda, 2009). So, in response to doubters, there are empirical ways to support the analogy between the brain and control systems (e.g., examining the efficiency and processing costs of the system), which in turn informs us about cognitive mechanisms.
2
Though the focus of this chapter is on decision making and learning, other neuropsychological studies that complement the findings from work in this field discuss prediction and control in visuo-motor learning (Luu et al., 2008; Rodriguez, 2009). For more examples, see the discussion on perceptual-motor studies in Chapter 7.
What will become obvious in the discussion that follows is that many advances in neuropsychological research directed at understanding core cognitive processes such as learning and decision making use machine-learning algorithms as a way of formally describing functioning at this level. Adopting such a framework tacitly implies that the analogy between brain functioning and control systems theory is a useful basis for understanding our cognition.
The Neuropsychology of Control
Having examined the way in which our neurobiology functions, in this section of the chapter we consider how cortical networks connect up to the psychological processes that are involved in goal-directed activities. One way of integrating the various research efforts designed to understand the way we learn to control outcomes in the face of uncertainty is to treat control behaviours as being composed of the following: (1) the sensitivity we show to the current situation we are in, (2) the evaluation of our actions in terms of the rewards and punishments they incur in relation to a desired goal, and (3) the evaluation of future actions based on the outcome that we have achieved in trying to reach a desired goal (Doya, 2008; Schultz, 2006; Yu, 2007).3 By decomposing the elements of decision making and learning in this way, the discussion that follows aims to show how each of the component parts of behaviour can then be investigated in connection to an uncertain environment. Neuropsychological work studying goal-directed activities has specifically
3
It should be observed at this stage that this bears a very close resemblance to the components examined in psychological studies of learning and decision making, as well as in machine-learning algorithms and in control systems engineering approaches to developing systems. The obvious reason for this is that there should be overlap, because psychology, machine learning and engineering seek to describe and implement methods that aim to do the same thing as neuropsychological studies of control, and that is to examine actions that achieve goals in the face of uncertainty.
characterized the environment by deconstructing it into three components, each of which is based on reward and punishment: (1) different degrees to which rewards and punishments occur (magnitude), (2) different times at which rewards and punishments occur (dynamic), and (3) different probabilities with which rewards and punishments occur (stochastic). The discussion categorizes the relevant research according to the three different components that make up uncertainty.
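As a purely illustrative sketch of this decomposition (the class, field names and numbers below are hypothetical and not drawn from any study described here), the three components can be thought of as the parameters of a reward schedule an experimenter might program:

```python
# Hypothetical illustration: the three components of uncertainty as the
# parameters of a reward schedule.
from dataclasses import dataclass
import random

@dataclass
class RewardSchedule:
    magnitude: float      # how large the reward or punishment is
    delay: int            # how many trials before it is delivered (dynamic)
    probability: float    # how likely it is to occur at all (stochastic)

    def sample(self):
        """Return (delay, amount) for one simulated trial."""
        amount = self.magnitude if random.random() < self.probability else 0.0
        return self.delay, amount

schedule = RewardSchedule(magnitude=25.0, delay=2, probability=0.75)
print([schedule.sample() for _ in range(5)])
```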
Uncertainty: rewards and punishments occur in different degrees (magnitude)
Sailer et al.’s (2007) functional magnetic resonance imaging (fMRI) study examined the role of feedback in learning and decision making. They presented participants with a simple two-choice task in which on a screen they would be shown two rectangles with a value (i.e., 5 and 25) presented inside each one. The values denoted the number of points that they could win or lose. The goal of the task was to accumulate as many points as possible. After choosing a rectangle, participants then received feedback on their choice. They found out one of three things about their choice: it could be (1) a loss instead of a gain, (2) the greater of two losses or (3) the lesser of two gains. By learning the distribution of correct and wrong answers, participants could acquire the underlying rule, which was a repeating sequence of six specific outcome choices: 5, 5, 25, 25, 5, 25. In Sailer et al.’s experiment, participants were categorized according to whether they were (1) ideal learners who explored the task and then exploited it, (2) slow learners who remained in an extensive exploratory state or did not learn at all, or (3) fast learners who committed little time for exploration because they managed to successfully exploit from the start.4
4
Exploration and exploitation in the context of Sailer et al.’s (2007) study correspond with those of Bayesian learning algorithms discussed in Chapter 4. Exploitation (control) involves controlling the environment by making choices that will maximize your wins (i.e., devising actions that will lead you to achieve a desired goal). Exploration involves probing the environment by making choices that could incur losses in the short term, but will increase your knowledge of the environment so that you can maximize your wins at a later stage (i.e., devise hypotheses that will lead to new knowledge, which will in turn enable successful exploitation).
The brain-imaging data revealed that overall there was higher brain activation in the orbitofrontal cortex, caudate nucleus and frontopolar areas when participants experienced gains than losses. This finding was taken to suggest that while the same brain regions were associated with both types of outcomes (gains and losses), the outcomes themselves were processed differently. Moreover, over the course of learning, activation in the OFC and putamen following losses increased, while it decreased following gains, a finding consistent with EEG studies (e.g., Cohen, Elger, & Ranganath, 2007) and fMRI studies (e.g., Cohen, Elger, & Weber, 2008). The different types of feedback that people received also led to different brain activity. Unambiguous feedback (i.e., correct-gain, wrong-loss) corresponded with greater activity in the pre-SMA, frontopolar area (frontal region of the cerebral cortex) and DLPFC, while for ambiguous feedback (i.e., correct-loss, wrong-gain) the ACC, middle temporal areas, caudate, insula and OFC showed greater activation. They also found that there were differences between ideal and slow learners, in which slow learners showed greater parietal activation. Sailer et al. (2007) and others (e.g., O’Doherty, 2004) proposed that the OFC and putamen are associated with encoding rewards and the impact of rewards on our actions; this is because of prediction error signals located in these areas which have a bearing on learning strategies (i.e., exploration, exploitation). In general, then, with respect to magnitudes, this study showed that during exploration of the task environment, expectancies are generated which are used to minimize uncertainty, because participants try to predict the outcome of their choice. Thus, people tend to use their expectancies to evaluate the achieved gain and loss
associated with their actions. Work of this kind shows that prediction error signals indicate whether an event is better or worse than expected (Montague, Hyman, & Cohen, 2004; Sailer et al., 2007; Schultz, Dayan, & Montague, 1997). In addition, the findings from these studies show that at a cortical level, activation is associated with different types of learning strategies (e.g., greater activation in the OFC and putamen during exploration) and that feedback has a different impact depending on when it is experienced in an uncertain environment.
Uncertainty: rewards and punishments occur at different times (dynamic), and rewards and punishments occur with different probabilities (stochastic)
Behrens et al. (2007) devised a study to examine the effect on learning rate5 when varying the probability of the reward. Participants were presented with two choices (a green or blue rectangle) on each trial during learning, and all they were required to do was choose between the two. In addition, they were told that the chance of the correct response being associated with green or blue depended on the recent history of outcomes. The learning phase was divided into two parts, the first of which was a stable environment in which the probability of a correct response associated with a blue outcome was 75 per cent, and the second of which was an unstable environment in which the probabilities switched between 80 per cent for the blue outcome and 80 per cent for the green outcome after a short period of trials. The aim of the experiment was to examine the extent to which participants showed optimal Bayesian learning. That is, could they accurately estimate the probability of the reward associated with each colour, and accurately judge the expected value as calculated according to Reward Probability × Reward Size?
5
This refers to the frequency over time with which new information is used to update old information (for further discussion, see Dayan, Kakade, & Montague, 2000).
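To make the notion of a learning rate concrete, here is a minimal sketch (an illustrative simplification with arbitrary numbers, not Behrens et al.'s actual Bayesian model): a delta-rule tracker updates an estimate of the reward probability on every trial, and a higher learning rate tracks an unstable environment more closely, whereas a lower one suits a stable environment.

```python
# Illustrative delta-rule tracker of reward probability; numbers are arbitrary.
import random

def track(outcomes, alpha):
    estimate, history = 0.5, []
    for o in outcomes:
        estimate += alpha * (o - estimate)   # new information updates old
        history.append(estimate)
    return history

random.seed(1)
stable   = [1 if random.random() < 0.75 else 0 for _ in range(120)]
unstable = [1 if random.random() < (0.8 if (t // 30) % 2 == 0 else 0.2) else 0
            for t in range(120)]

for alpha in (0.05, 0.4):   # low versus high learning rate
    print(alpha, round(track(stable, alpha)[-1], 2),
          round(track(unstable, alpha)[-1], 2))
```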
Given the design of the study, the reward probability varied as reflected by the stability or instability (dynamics) of the environment. For successful learning to occur, participants would be expected to show sensitivity to how stable (i.e., dynamic) the environment was, and track the statistical property of the environment. Behrens et al. (2007) presented people with either the stable environment first and then the unstable environment, or the reversed ordering. On each trial they separated out the decision stage, which was the point at which people made their choice, and the monitoring stage, which was the point at which they evaluated the outcome of their decision. They found that participants’ choices were close to those of an optimal Bayesian learner, in that they tended to be more reactive to the reward schedules in unstable compared to stable environments. This was taken as providing strong evidence that people are able to track changes in the environment and update their experiences successfully so that they can make choices in order to continually exploit it. Behrens et al. (2007) isolated brain regions that are typically associated with monitoring and integrating outcomes of actions – specifically the ACC, which previous studies have shown to be activated with respect to these cognitive activities (e.g., Walton, Devlin, & Rushworth, 2004) as well as processing error following action (e.g., Marco-Pallarés, Müller, & Münte, 2007; Ullsperger & von Cramon, 2003). As discussed earlier, the ACC is part of a distributed neural system that is associated with decision making (error detection, anticipation, updating decision outcomes and monitoring) and emotion (motivation, emotional response modulation and social behaviours) (for review, see Devinsky, Morrell, & Vogt, 1995). In the context of decision-making situations, some have claimed that when people await the consequences of a decision they have made, the conflicts that arise between what they expected and what had occurred increase emotional arousal. Thus, conflicts between expected and desired outcomes, as well as emotional arousal, lead to ACC activity (Botvinick, Cohen, & Carter, 2004; Critchley,
Mathias, & Dolan, 2001). Behrens et al. (2007) found that the ACC’s response to the dynamics of the environment directly affected learning – the greater the activity in ACC, the higher the learning rate. From this, they proposed that the sensitivity people showed to the dynamics of the environment was actually what led to increases in emotional arousal. In addition, Behrens et al. (2007) suggested that the fluctuations in ACC activity found during the stages in which participants were updating their knowledge of the environment (i.e., the monitoring stage of trials) were closely associated with their learning rate. Consistent with others (Cohen, 2006; Schultz, 2006; Schultz, Dayan, & Montague, 1997, 2008), Behrens et al. (2007) concluded that gains and losses are represented differently in the brain,6 and that cortical activation reflects the differences in the environment according to the probabilities of losses and gains as well as changes in the probabilities of losses and gains over time. Rather than examining the effects of losses and gains, Jocham et al. (2009) used Behrens et al.’s (2007) task to examine the effect of positive and negative feedback on learning in stable and unstable environments. They focused on the associated activation in brain regions such as the dorsal anterior cingulate cortex (dACC) and rostral cingulate zone (RCZ).7 The task was similar but for the fact that the outcome was now in the form of either positive feedback (a smiley face) or negative feedback (a sad face). Learning rates were measured according to the probability of the feedback from actions, and changes in probability of the feedback of actions in either stable or unstable environmental conditions. Jocham et al. used a Q-learning algorithm8 to determine optimal learning behaviour to compare against actual choices made.
6
There is some controversy as to whether gains and losses are in fact encoded by the same brain system, as suggested by, for example, Boorman et al. (2009) and Sailer et al. (2007), or whether in fact different brain systems encode gains and losses differently (e.g., O’Doherty et al., 2001; Seymour et al., 2007).
7
The reason for selecting these regions as relevant is because they have been previously associated with the integration of action–outcome associations acquired over extensive learning.
They found that there was more switching behaviour and lower rates of learning in the unstable environment compared with the stable environment. Jocham et al. (2009) interpreted these findings as evidence that the accumulation of information following feedback was suited to the stability of the environment, since the more it fluctuated, the more extensive were the learning trials from which information needed to be integrated, consistent with their optimal learning model. They extended Behrens et al.’s (2007) findings by suggesting that in the course of integrating actions and their outcomes, the RCZ adapts by enhancing or decreasing its response to negative feedback through reinforcement learning. In the unstable environment the activity of the RCZ is less pronounced in response to single events, as compared with when the environment is stable. This is because a more extensive range of events (learning trials) is needed because the reliability of feedback is reduced. Finally, another follow-up study conducted by Boorman et al. (2009) using Behrens et al.’s (2007) task was designed to examine the impact of increasing uncertainty on decision making. While participants were still directed to consider the history of their choices in order to track the probability of a reward associated with their choice, to create maximum uncertainty they were informed that the magnitude of the reward was random from trial to trial. As in Behrens et al.’s (2007) study, the reward attached to the two options was probabilistic and dynamic – it could be either stable (i.e., choice A would lead to a reward 75 per cent of the time) or unstable (the probability of a reward for choice A would switch from high to low after blocks of trials). Boorman et al. (2009) also found that the choices people made were consistent with those of an optimal Bayesian learner, indicating that they based their choices on the outcome probabilities and not on the magnitude of the reward – which of course was randomly assigned from trial to trial.
8
Q-learning is a learning algorithm developed from Bayesian learning models, and so shares similar assumptions about learning behaviour (see Chapter 4 for discussion).
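As a rough illustration of the kind of value update referred to in the footnote above, here is a minimal, hypothetical sketch for a two-choice task with positive and negative feedback. It is not Jocham et al.'s actual model; the learning rate, choice rule and feedback probabilities are arbitrary assumptions.

```python
# Hypothetical sketch of a simple action-value (Q-learning style) update
# for a two-choice task with probabilistic feedback.
import random

q_values = {"blue": 0.0, "green": 0.0}
alpha = 0.3                      # learning rate (arbitrary)

def update(action, feedback):
    """feedback: 1 for positive (smiley face), 0 for negative (sad face)."""
    q_values[action] += alpha * (feedback - q_values[action])

random.seed(0)
for trial in range(100):
    if random.random() > 0.1:
        action = max(q_values, key=q_values.get)      # exploit the best estimate
    else:
        action = random.choice(["blue", "green"])     # occasional exploration
    # "blue" yields positive feedback 75% of the time, "green" only 25%.
    feedback = 1 if (action == "blue") == (random.random() < 0.75) else 0
    update(action, feedback)

print(q_values)   # "blue" should end up with the higher value
```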
The change in design also enabled them to examine whether brain regions encoded long-term information, accumulated across learning, that would be associated with switches in choice behaviour. They focused on activity in the frontopolar cortex (FPC), a region found to be active when there is voluntary switching from one complex behaviour to another (e.g., Koechlin & Summerfield, 2007). Boorman et al. (2009) found that brain activity was not associated with the particular type of option selected, but instead FPC activity was associated with the relative un-chosen action probability on stay and switch trials, and activity in this region predicted switching behaviour. That is, brain activity tracked not only those options that were selected, but also the probabilities of unselected options. In addition, they examined activity in the ventromedial prefrontal cortex (vmPFC), an area that has been associated with the processing of expected values of chosen actions (e.g., Daw et al., 2006). Similarly, activity in the vmPFC was associated with the relative chosen probability, with a positive correlation between activity and chosen action value, and a negative correlation between activity and the un-chosen action value. In general, in situations with dynamic and stochastic reward structures there is sufficient flexibility in the types of decision-making strategies that people learn to apply in order to adapt to these changes. Moreover, there are many examples in which learning is revealed to be optimal in the face of unstable environments. Finally, at a neuropsychological level, the evidence suggests that people represent rewards in the same way as they represent feedback, and in both cases, the probability of reward associated with an action taken is encoded, as well as that of an action not taken.
Uncertainty: Animal Studies
Imaging studies are carried out on animals as well as humans, and one reason for this is to examine the types of basic functions that are shared in a variety of species. For instance, Watanabe’s (2007)
primate study revealed learning and decision-making capabilities in uncertain environments similar to those found in humans. During this type of cognitive processing in primates, the bulk of the brain activity involves the LPFC and the ACC, both of which are associated with goal-directed behaviours in humans. As with human studies, the types of tasks designed to examine learning and decision-making processes involve simple choice tasks in which a property of uncertainty in the environment is manipulated. In a typical task examining expectancy (e.g., Watanabe et al., 2002, 2005), the monkey is presented with a screen with three horizontal windows, three buttons directly underneath, and a lever. The monkey is trained to first depress the lever for a specified period of time, which then raises an opaque screen for one second. This in turn reveals behind a transparent screen the outcome for that particular trial; this is a tray containing either reward (different types of food/different types of liquid) or no reward (no food or liquid). After this a white light appears in one of the three windows, which serves as a go signal, and as soon as this happens the monkey is required to release the lever and press the button corresponding to the window that is lit. If correctly pressed, this then raises the screen to reveal the outcome. However, throughout the task, the monkey is required to press the button corresponding to the white light regardless of whether the monkey would be receiving a reward or not. Often the presentation of reward and no-reward trials alternates in a pseudo-random manner (e.g., a 3 : 2 ratio). What has been found is that the type of reward (i.e., the type of food and drink provided) presented to monkeys influences their responses, in particular their speed of response; responses are faster when rewards are expected and when they are preferred rewards. As with other studies described here, fMRI work shows that activation in the LPFC reflects the absence as well as the presence of the reward, but in addition, activation will vary according to the type of reward that is present or absent (e.g., Watanabe et al., 2002, 2005). In addition to this, other studies (e.g., Matsumoto, Suzuki, & Tanaka, 2003) have also shown that activation in the ACC as well as LPFC reflects reward expectancies.
Overall, as with humans, the type of reward had motivational value for monkeys, and at a neurological level even in its absence it is still processed. In addition, the neurophysiology of reward-based learning has been studied with other animals, most commonly rats (for a review, see Schultz, 2006). For example, Winter, Dieckmann, and Schwabe (2009) presented rats with two levers each associated with a reward. Lever A was inefficient in delivering a reward (after 25 lever presses, the animal would receive a pellet of food), and lever B was efficient in delivering a reward (after nine lever presses, the animal would receive a pellet of food). Thus the reward was the same, but the associated effort in acquiring it differed. Moreover, to introduce uncertainty, Winter, Dieckmann and Schwabe (2009) set the conditions in such a way that if the animal received seven pellets in a row – that is, it only operated the efficient lever, B – the efficiency of B and A switched, so that lever B delivered a reward after 25 lever presses and lever A now rewarded the animal after nine lever presses. The success in learning was measured by the number of times the animal switched levers over the course of a specified period of learning. More to the point, switching was not triggered by any external detectable signals; therefore, sensible switching behaviour reflected the animals’ internal cost–benefit analysis of the reward provided by the environment and the effort needed to acquire it. Sticking to the same lever throughout would suggest that the animal was insensitive to the change in effort-reward of the lever, or didn’t care that it would periodically have to exert greater effort in order to receive the same reward. Brain regions of interest that were isolated included the medial prefrontal cortex (mPFC) and orbitofrontal cortex (OFC), and dopaminergic receptors in the prefrontal cortex. The reason for this is that they have all been implicated (for both humans and animals) in coding uncertainty. Also, these regions have been associated with reward-based learning – particularly when the reward is unpredictable – and are activated when flexibility in adaptive control of behaviours is demanded by the dynamics of the environment
(e.g., Gallistel, Mark, King, & Latham, 2001;9 Robbins, 2000; Schultz, 2006; Schultz et al., 2008; Tobler, O’Doherty, Dolan, & Schultz, 2007; Tremblay & Schultz, 2000). Winter et al. (2009) found that overall, blocking dopaminergic receptors in the mPFC impacted on the success of behaviour leading to switching behaviour to the efficient lever, but not on sustaining lever pressing at the efficient side. This suggested that changes in reward values were detected for both levers, but that basing future actions on the detection of reward outcomes can be affected by blocking dopaminergic receptors, and in normal functioning it’s most likely that the mPFC and OFC act together in modulating this aspect of behaviour. Beyond the apparent commonalities in processing rewards, others have examined the broader associations between humans and non-humans in terms of learning and decision making. Shea, Krug, and Tobler (2008) examine this point in terms of the similarities and differences between habit learning (which they propose humans and non-humans are capable of) and goal-directed learning (which they propose only humans are capable of). They argue that in the case of habit learning, an action is selected from available actions on the basis of the stored values for actions previously made. The result is that this does not involve evaluating the outcome of those actions. Because different types of outcomes need to be represented independently of the reward in goal-directed learning, the outcomes of actions do need to be evaluated. Thus, one way in which habit learning and goal-directed learning differ is according to the processing of rewards. Shea et al. (2008) propose that another potential difference between these forms of learning is that goal-directed learning is more likely to involve conceptual representations of the environment (i.e., syntactic-based representations which refer to the properties of the environment, structure, causal relations etc.) than habit learning.
9
Gallistel et al. (2001) developed the experimental paradigm later used by Winter et al. (2009). Gallistel et al. (2001) used a Bayesian learning algorithm to compare animal learning behaviour with that of an ideal learner, and found that animals’ switching behaviour was sensitive to the differences in reward, and indeed approximated optimal learning.
Even when there have been demonstrations of goal-directed learning in animals, Shea et al. (2008) raise doubts as to whether the choices that animals make are informed by conceptual representations of the environment.10 However, at the same time they also raise the question as to the extent to which human goal-directed learning necessarily involves conceptual representations. The broader issue identified here is that decision making is goal directed, and is supported by goal-directed learning. As the complexity of the environment increases (e.g., increasing the delay between action and reward), some sort of causal structure of the environment needs to be represented, for the reason that we run off inferences from internal models that can connect proximal actions with delayed outcomes via a chain of intermediate actions. These have to be structured by some internal representation of the outside world. This means greater flexibility in the representations that are used to generate actions in uncertain decision-making environments, but also suggests that adaptive behaviours need this type of flexible representation in order to generalize behaviours to new environments – in other words, conceptual representations enable transfer of decision–action behaviours. The extent to which transfer of behaviours occurs in similar as well as different situations has not been an area of much investigation. As Shea, Krug and Tobler (2008) argue, this could offer ways of understanding decision making and learning in uncertain environments in terms of the type of representational framework (conceptual and non-conceptual) that is used, and when it is applied. They propose that one way of targeting this issue is to examine the neural activity in the inferior frontal gyrus during goal-directed learning in training and transfer tasks, because it is an area that has been associated with changes in conceptual representations.
10
Though controversial, recent evidence from Blaisdell et al.’s (2006) study of rats demonstrates that they can indeed infer the causal structure of their environment, suggesting that they can build syntactic representations of the environment that map onto the causal structure, from which actions are selected that reflect sensitivity to it.
Summary So Far
In sum, though animal studies investigating decision making and learning are conducted in simple task environments, the details of the processes involved have an important bearing on our understanding of how we control outcomes in control systems, and how they are realized at a neuropsychological level. Reductionist approaches of the kind described here suggest that adaptive behaviours in the light of uncertainty require some estimates of the type of outcome that will follow (i.e., the rewards associated with choices, decisions and actions). Moreover, the approach assumes that actions are goal directed to the extent that they are designed to achieve, at minimum, a reward and avoid a loss, and to predict when and how often rewards and losses will occur. Finally, this approach assumes that we can use formal methods (e.g., Bayesian algorithms, linear learning models using Laplace transformations and neural networks) to judge the success of learning and decision making in uncertain situations. The findings from the studies presented, and particularly those that will be discussed in the next sections, suggest that humans and non-humans can indeed be compared against formal methods of estimating optimal behaviour and, more to the point, behaviour seems to be optimal according to normative economic standards.
Neuroeconomics
The culmination of recent work in psychology, neuroscience (i.e., broadly, learning and decision making) and economics (i.e., broadly, risk aversion, time preferences and altruism) has been the development of the field of neuroeconomics. Gambling tasks are the task of choice for neuroeconomists, and so many of the studies described hereafter are gambling based. Gambling tasks can be thought of as micro-environments in which adaptive behaviours can be investigated under conditions of uncertainty, particularly because the
environment is a changing one. More to the point, they are ideal conditions for examining the relationship between prediction and control (i.e., the choices and actions that are made) by looking at the strategies people develop and the stage at which people decide that their information gathering is sufficient for them to maximize their rewards. In turn, the candidate actions during exploration and exploitation can be evaluated against predicted values; these can be defined in terms of the amount of reward they are expected to eventually bring about. For this reason, the domain of gambling has also been popular in studies of decision making (e.g., Boles & Messick, 1995; Nygren et al., 1996; Steyvers, Lee, & Wagenmakers, 2009) and economics (e.g., Banks, Olson, & Porter, 1997; Brezzi & Lai, 2002). So, many of the studies in this field of research are concerned with examining humans’ and non-humans’ (1) sensitivity to the current situation, (2) evaluations of actions just taken, and (3) evaluations of future actions. That is, studies by neuroeconomists are concerned with the component behaviours of control in terms of the strategies adopted in a highly uncertain environment. The main difference is that these components have been specifically integrated with the components that are thought to make up uncertainty. The focus, then, for neuroeconomists is to determine (1) the subjective utility that people use to inform their decisions, (2) the effects on learning and decision making when real rewards are presented, and (3) what behaviours are sought in order to reduce uncertainty (i.e., exploring versus exploiting). As with the previous section, the following section will be divided up according to the way uncertainty in the environment is characterized, but in this case it will be from a neuroeconomics standpoint: risky rewards, and delayed rewards.
Uncertainty: risky rewards
As discussed, gambling tasks have been developed to examine the neurological foundations of choice behaviour. For example, Daw et al. (2006) presented participants with a screen in which
there were four one-armed bandits (equivalent to a slot machine in which operating a lever reveals a payoff), and on each trial they could choose to operate one of them. The four one-armed bandits had different associated payoffs (this was converted into real money) – and within each machine the payoff fluctuated around a given value. So, from one trial to the next the mean payoff would change randomly and independently of what payoff had been set previously.11 Structuring the environment in this way meant that people had to spend enough time gathering information about each one-armed bandit in order to learn about the associated payoff of each one. However, people were operating under dual goals, because while they needed to explore the environment, this would come at a cost, and so they also needed to find ways to offset this by exploiting it. In other words, the main goal was to accrue money (exploit), but this couldn’t be achieved without gathering information (explore), and this came at a price. This type of paradigm has been used by many to examine motivation, risky decision-making behaviour and addictive behaviours (e.g., Clark et al., 2009; Cohen, 2006; Reddish et al., 2007; Tobler et al., 2007). To examine the type of behaviour in their task, Daw et al. (2006) compared people’s behaviour with formal learning algorithms such as the Kalman filter (e.g., Dayan, Kakade, & Montague, 2000; Dayan & Niv, 2008; also see Chapter 3). They found that people’s learning behaviour approximated exploration guided by expected value. This meant that the decision to choose a different one-armed bandit from the one currently being exploited was determined probabilistically on the basis of actions relative to expected values. By characterizing people’s behaviour in this way, Daw et al. then examined if there were corresponding differences in brain activity associated with exploratory and exploitative decision-making strategies.12
11
Cohen, McClure and Yu (2007) claim that though the environment that Daw et al. (2006) used was a dynamic one, the changes that occurred were continuous and relatively slow over time, whereas dramatic fluctuations as observed in the real world would call for a different type of exploratory behaviour from the one reported by Daw et al. (2006).
if there were corresponding differences in brain activity associated with exploratory and exploitative decision-making strategies.12 Daw et al. (2006) reported that, while there were no specific regions of activation associated with exploitation, the right anterior frontopolar cortex and the anterior intraparietal sulcus were active during exploration – both these areas have often been associated with goal-directed behaviours and reward expectancy. In addition, consistent with other choice tasks, they reported increased levels of activation in the ventral and dorsal striatum following predictive errors, while activation in the medial orbitofrontal cortex was correlated with the magnitude of the received payoff. There have been suggestions that findings of this kind imply two different aspects of learning behaviour. One tracks the success of behaviour that is motivated by assumptions about the environment from prior experiences (i.e., model-based learning), which is thought to be supported by the PFC; while the other tracks behaviour that is not guided by any specific assumption about the environment and is influenced only by immediate outcomes of actions taken (i.e., model-free learning), thought to be supported by striatal projections. Both learning strategies can appear in explorative and exploitative behaviour (Daw, Niv, & Dayan, 2005). For example, one can explore by accumulating information for a long period of trials at one of the one-armed bandits, and then systematically moving on to the other three (model based), or randomly shift between the four one-armed bandits as soon as a big loss occurs (model free). Likewise, when exploiting, people may stick with one of the one-armed bandits because of their overall experience that it will yield a high payoff with small losses along the way (model based), or
12 Dam and Körding’s (2009) study examined motor learning with respect to exploration and exploitation learning strategies in which, rather than making discrete decisions, participants faced the trade-off between exploration and exploitation in a continuous search decision task with monetary rewards attached to successful motor movement. They found that, consistent with Daw et al. (2006), a reinforcement learning algorithm accurately described learning behaviour.
neglect the history of the payoffs they have received, and randomly shift as soon as a big loss is experienced (model free). How do we switch between exploration and exploitation? Moreover, how is this switch realized neurologically? Cohen, McClure, and Yu (2007) propose that factors that influence the switch between these modes of learning include familiarity with the environment, and the dynamics of the environment (i.e., how stable or predictable it is). In addition, the cost of reducing uncertainty through exploration versus the relative value of exploiting the environment has also been shown to bear strongly on switching behaviour. Yu and Dayan (2005) propose that switching behaviour can be determined at the neurochemical level. They propose that two neuromodulators, acetylcholine (Ach) and norepinephrine (NE), signal expected and unexpected sources of uncertainty which interact, and enable optimal adaptive behaviour in dynamic environments. This type of distinction is particularly informative, because in general, we seem to have two tacit assumptions of uncertainty. That is, people tend to accept that there will be a level of uncertainty attached to the outcomes of their behaviour on some occasions, because there is unreliability attached to certain outcomes (e.g., even though my aim for the bin was accurate, my throw of the banana skin still missed it), and so violation of predicted outcomes is less of a surprise. However, there are outcomes that occur that are beyond the scope of our expectations, that is, something completely unpredicted happens (e.g., the computer that I’m working on suddenly blows up) which would be classed as unexpected uncertainty. Cohen, McClure and Yu (2007) and Yu and Dayan (2005) claim that while tracking the outcomes of choices that have some variable change but are expected, Ach signals are sent and encourage exploitative behaviours. However, when big sudden losses are experienced, NE signals prompt exploratory behaviours. However, what still remains unclear is how this account explains switches between exploiting and exploring learning strategies on the basis of levels of NE and Ach. In other words, these neuromodulator systems signal uncertainty, but how they influence the selection of particular actions remains an empirical question.
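The kind of model comparison just described can be made concrete with a short simulation. The sketch below is a minimal illustration under invented parameters, not Daw et al.’s (2006) actual analysis: the mean payoff of each of four one-armed bandits drifts from trial to trial, a Kalman filter tracks each bandit’s expected payoff and the uncertainty of that estimate, and choices are drawn from a softmax over the expected values, so that switching away from the currently exploited bandit is probabilistic rather than deterministic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 'restless' four-armed bandit: each arm's mean payoff drifts
# from trial to trial, so estimates go stale unless the arm is re-sampled.
n_arms, n_trials = 4, 300
drift_sd, payoff_sd = 2.0, 4.0
true_means = rng.uniform(20.0, 80.0, n_arms)

# Kalman-filter beliefs about each arm: an expected payoff and its variance.
est_mean = np.full(n_arms, 50.0)
est_var = np.full(n_arms, 100.0)

def softmax(values, temperature=5.0):
    """Choice probabilities from expected values (higher value, more likely)."""
    z = (values - values.max()) / temperature
    p = np.exp(z)
    return p / p.sum()

total = 0.0
for t in range(n_trials):
    # Exploration guided by expected value: the switch away from the arm
    # currently being exploited is probabilistic, not deterministic.
    choice = rng.choice(n_arms, p=softmax(est_mean))
    payoff = rng.normal(true_means[choice], payoff_sd)
    total += payoff

    # Kalman update for the chosen arm: the prediction error is weighted by
    # a gain that is larger when the belief about that arm is more uncertain.
    gain = est_var[choice] / (est_var[choice] + payoff_sd ** 2)
    est_mean[choice] += gain * (payoff - est_mean[choice])
    est_var[choice] *= 1.0 - gain

    # Between trials every arm's uncertainty grows, because the true payoffs
    # take another random-walk step whether or not the arm was sampled.
    est_var += drift_sd ** 2
    true_means += rng.normal(0.0, drift_sd, n_arms)

print(f"accrued payoff over {n_trials} trials: {total:.1f}")
```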
Uncertainty: delayed rewards
Kable and Glimcher (2007) and other neuroeconomists (McClure et al., 2004; Padoa-Schioppa & Assad, 2006) have examined decision-making behaviour and brain activity under conditions in which monetary gains were presented after varying delays (i.e., intertemporal choice). That is, they were concerned with what happens when people are faced with an option of taking an early but small reward now compared with waiting for a larger reward in the future. Offsetting the differences between the magnitude of rewards offered and the differences in time between the presentations of rewards (early versus late) has also generated considerable research interest in psychology and economics. Again, just as exploratory and exploitative learning strategies have a bearing on the way we try to attenuate uncertainty, the same rationale applies with respect to delayed rewards. The reason for this is that the cost incurred currently (i.e., not receiving the small reward, or gathering information by exploring the environment) is offset by a high future reward (i.e., receiving a big reward, or using the information gathered to exploit the environment). Kable and Glimcher (2007) presented participants with a series of trials in which on each trial they had to choose between an immediate or delayed reward. The immediate reward was always the same, $20, and the delayed reward was either $20.25 or $110, and they had a short period to consider which option to choose. The individual choice that people made was accurately predicted by economic theory, and this was based on the objective values and subjective preferences. They also measured whether brain activity reflected subjective value of delayed monetary reward, and reported that the different preferences that individuals revealed in their choice behaviour matched brain activity in the ventral striatum, medial prefrontal cortex and posterior cingulate cortex. The analysis was based on comparing activity given the constant presentation of the immediate reward and the switch to choices of delayed reward. Interestingly, their findings revealed that as the objective amount of reward increased, so did brain activity in these regions.
They also reported that increased activity matched conditions in which people had just received the reward, as well as when they expected a particular reward.
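The intertemporal choices described here are standardly analysed by assigning each delayed reward a discounted subjective value and assuming people choose the option with the higher value. The snippet below is a minimal sketch assuming a hyperbolic discount function, V = A/(1 + kD), which is a common choice in this literature; the amounts, the delay and the discount rates k are illustrative and are not Kable and Glimcher’s estimates.

```python
def subjective_value(amount, delay_days, k):
    """Hyperbolic discounting: value falls off with delay at rate k."""
    return amount / (1.0 + k * delay_days)

def choose(immediate=20.0, delayed=110.0, delay_days=60, k=0.01):
    """Pick the option with the higher subjective value."""
    v_later = subjective_value(delayed, delay_days, k)
    return ("delayed", v_later) if v_later > immediate else ("immediate", immediate)

# A shallow discounter (small k) waits for the $110; a steep one takes $20 now.
for k in (0.005, 0.05, 0.5):
    option, value = choose(k=k)
    print(f"k = {k}: choose the {option} reward (subjective value {value:.2f})")
```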
Animal Studies
Along with human studies, animal studies have also been used to explore core adaptive behaviours in the context of neuroeconomics. In particular, the focus has been on examining the effects of variable rewards, and the extent to which non-humans are sensitive to the dynamic properties of the environment and their responses to it (for discussion, see Schultz, 2006). A number of studies on monkeys (e.g., Platt & Glimcher, 1999), rats (e.g., Kheramin et al., 2004) and pigeons (Madden et al., 2005) have revealed that they share similar behaviours and brain activity as those reported in human studies (for a review of this literature, see Madden, Ewan, & Lagorio, 2007). Platt and Glimcher’s (1999) study is a prototypical example of an animal study in which decision-making behaviour (i.e., magnitude of reward, probability of reward, expected value of reward, and uncertainty – as defined by risk and ambiguity of the environmental conditions) was compared with standard economic theory. The aim of Platt and Glimcher’s (1999) study was to examine if influences on behaviour at the psychological level also apply at the neurobiological level, and how they compare with behaviour as prescribed by economic theory. Some (e.g., Schultz, 2006; Schultz et al., 2008) propose that at the psychological level, animals, like humans, encode and use the probability and the magnitude of the reinforcement to influence their actions; whereas at a neurobiological level, the proposal is that the nervous system identifies and transforms sensory and motor signals, but no additional information concerning the environmental contingencies (i.e., probability and magnitude) is used. To examine this, Platt and Glimcher used an eye movement technique. A fixed stimulus was presented centrally on screen which the monkey focused on. After a short interval, two stimuli (i.e., a target and a distractor) were presented above and below the fixation point. The fixation point changed colour, which indicated which direction the monkeys should move their eyes towards, that is, either the stimulus above or below the fixation point.13 Movement to the target was rewarded, but the magnitude of the rewards was varied (expected gain), as was the probability that each of the two stimuli would be cued (outcome probability). This procedure made it possible to identify the activity of classes of neurons in the lateral intraparietal area (LIP) during different stages of the trial, that is, encoding information, choice and action, expectancy and reward. They found that, contrary to neurobiological models, the behavioural response (eye movement) and the neuronal activity (increased firing of neurons in LIP, and in general prefrontal cortical activity) were consistent with the amount of reward the monkeys expected to gain as a result of their chosen response. This study has since been used to support general claims that for any type of choice behaviour, sensory signals combine with estimates of the expected gain from an action that is taken, as well as with the probability that the gain will follow the action. That is, the basic decision-making mechanism implemented neurologically that enables monkeys to process complex environmental contingencies is the same as that used to support decision making in humans.
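In tasks of this kind, the decision variable linked to LIP activity is typically an expected gain (reward magnitude weighted by outcome probability), expressed relative to the alternatives on offer. The sketch below shows that computation with invented numbers; it illustrates the general claim rather than reproducing Platt and Glimcher’s analysis.

```python
def relative_expected_gain(magnitudes, probabilities):
    """Expected gain of each target (magnitude x probability), normalized
    across the available targets."""
    gains = [m * p for m, p in zip(magnitudes, probabilities)]
    total = sum(gains)
    return [g / total for g in gains]

# Two targets with invented values: the upper target pays more juice but is
# cued less often than the lower one.
magnitudes = [0.3, 0.1]      # reward magnitude for upper vs lower target
probabilities = [0.4, 0.6]   # probability that each target is cued

print(relative_expected_gain(magnitudes, probabilities))
# -> [0.667, 0.333]; on the account sketched above, firing in LIP should be
# stronger for the target with the larger relative expected gain.
```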
Summing Up
As a new field, neuroeconomics faces some problems, particularly in terms of how the environment should be described and whether
13 In one experiment, the distractor was relevant because it also cued the monkey to make eye movements towards the target; in the second experiment, the distractor was completely irrelevant. The reasons for manipulating the relevancy of the distractor were to examine if the brain region of interest (i.e., the lateral intraparietal area, or LIP) was associated with sensory attentional processes, or instead with choice behaviour and motor movement – which is what Platt and Glimcher reported.
Magda Osman our behaviours correspond to rational models or not. This is not a new issue, or an unfamiliar one, and it has been raised a number of times in this book. Problem 1: One of the most pressing problems faced by neuroeconomics has been how to integrate the different levels of explanation into one theoretical or formal description of behaviour. As Glimcher and Rustichini (2004) highlight, economists describe global not individual choice behaviour by using a single formal method (expected utility theory, game theory). Whereas psychologists consider how subjective and objective estimates of values of action-outcomes differ and describe the behavioural differences, neurobiologists focus on the simplest neural circuitry that can explain the simplest measurable component of behaviour. They all refer to the same phenomena, but the level and language of description are very different. However, this rather crucial issue seems not to be reconciled, and presents gaps between prescriptive and descriptive approaches to behaviour examined under the auspices of neuroeconomics. Problem 2: Another problem, discussed by Camerer, Loewenstein, and Prelec (2005), is that the different levels (i.e., neuropsychology, psychology and economics) may present challenges for each other. For instance, psychological studies demonstrate that people may develop exploratory behaviour in one environment, but more exploitative behaviour in another – in other words, the particular environment determines certain types of behaviours. This challenges economic models that classify choice behaviour in such a way as to assume it is consistent over place and time. But, the fact is that neuroeconomics studies often examine the conditions that generate this difference, and the underlying mechanisms that support it.14 Whether or not such a significant challenge to economic 14
Vromen (2007) presents a critical view of neuroeconomics and its ambitious endeavours by claiming that (1) neuroeconomics is an offshoot of bioeconomics (Tullock, 1971) – which has maintained the success of standard economics in describing patterns of behaviour throughout the animal kingdom, and has not
Neuroscience theory warrants changes to it, or whether in fact this presents a more profound challenge to the appropriateness of formal descriptions of rational behaviour, still remains to be considered. Problem 3: Finally, on a related issue, there are two very different theoretical positions within neuroeconomics. Glimcher and Rustichini (Glimcher, 2003; Glimcher & Rustichini, 2004; Rustichini, 2005; see also Platt & Glimcher, 1999) focus on behaviour found to be consistent with economic theory, and therefore rational, and Camerer, Loewenstein and Prelec (2004, 2005) focus on behaviour that deviates from that predicted by economic theory, and is therefore irrational. The framework that Camerer, Loewenstein and Prelec (2004, 2005) use as a method of integrating the different areas of neuroeconomics is the very framework that divides up processes into autonomic or controlled (see the previous chapter) – a distinction that essentially refers to unconscious and affective processes in one category, and conscious cognitive processes in the other category. Camerer, Loewenstein and Prelec (2004, 2005) take as rational all conscious cognitive behaviours, but overlook the fact that unconscious and affective behaviours could also be rational. In contrast, Glimcher and Rustichini take rational behaviour as any behaviour that is consistent with that predicted by standard economic theory. This of course presents a serious problem with respect to what behaviours are classified as normative, as well as what normative model should be used to compare them against (see Problems 1 and 2). However, what does seem to obviously connect these two different theoretical positions is how they treat behaviours that may be construed as ‘irrational’. They both propose that the deviation from the standard normative model
needed to amend standard economic theory; and, more crucially, (2) in neuroeconomics, economic theory is used as a benchmark for determining the goals of behaviour, but what is not assumed is whether humans and animals will attain the goals efficiently. This implies that humans’ and animals’ actions and choices are not in fact rational, which presents a major challenge to economic theory, which takes as its axiom that humans’ and animals’ learning and decision-making operates rationally to achieve their goals.
Magda Osman suggests that there are features of the environment which trigger neurological mechanisms that in turn give rise to simple adapted behaviours. These behaviours would be rational under the conditions they evolved in, but irrational given the complexities of the actual task environment.15 Thus, this raises an issue that also has an important bearing on the issues central to this chapter, in that when determining optimal learning behaviour for complex uncertain environments, a formal description of behaviour is needed. This assumes that we should judge behaviour according to some form of rationality. However, what may be construed as rational could differ according to the level of focus, that is, neurological, psychological and economic. If there is no agreement at the different levels as to what should be used as a normative model of behaviour, then different rational models need to be applied. However, this too raises questions as to whether it is in fact appropriate to apply different normative models at different levels of explanation of behaviour.
A Little Synthesis
In returning to the target question of this book, ‘How do we learn about, and control online, an uncertain environment that may
15 This taps into a similar and long-standing debate in the study of reasoning and judgement and decision making concerning the fit between rational models of behaviour and actual behaviour (see Stanovich, 1999). Drawing dichotomies between rational and irrational, between adaptive and non-adaptive, and between conscious and unconscious behaviour generates tensions which cannot be easily reconciled within a single theoretical framework (e.g., Gigerenzer & Todd, 1999; Kahneman & Frederick, 2002; Stanovich & West, 2000) without appearing inconsistent and incoherent. Moreover, serious empirical issues concern the failure, by such theories, to clearly establish the properties of a task environment that will elicit rational, adaptive and conscious, or irrational, adaptive and unconscious, behaviours, and so the evidence winds up supporting circular claims.
Neuroscience be changing as a consequence of our actions, or autonomously, or both?’, neuropsychological studies have provided important glimpses into the way we process our environment. What neuroscience would say is that the puppeteer is able to learn online about the behaviour of the puppet. This is because the cortical networks in the brain of the puppeteer enable him to track all the irregular actions of the puppet when he didn’t choose to do anything to the levers, along with the occasions in which nothing followed when he chose to manipulate the levers. Without knowing it, the puppeteer has the capabilities to track the changes and irregularities of the puppet’s internal mechanism that make it behave in unpredictable as well as predictable ways. Much of the evidence reported from studies of neuropsychology shows that neural activity reflects sensitivity to the various properties that make up uncertainty with respect to the consequences of our actions. However, as yet, one outstanding aspect of this research that still remains to be answered is how this sensitivity translates to processing the consequences of action that affect a change in the environment. The evidence thus far provides early indications that a similar type of adaptive mechanism is in place that tracks uncertainties in the environment as well as those resulting from our decisions and actions. In addition to this, neurological mechanisms have been shown to be sensitive to the relative advantage in switching from one chosen course of action to another in stable and unstable situations. This clearly suggests that at a neurological level, we have the capacity to track the effects of our actions as well as prepare the actions needed to successfully and reliably achieve our goals in the face of an uncertain environment. The concluding discussion in this section will draw on two important points that suggest that neuropsychological studies are close to providing a big picture concerning how we control uncertainty, but neglect two issues of relevance to control. First, one aspect of our neuropsychological functioning that awaits further experimentation is our ability to process expected outcomes, for which there are many different associated cortical networks. While
Magda Osman we use expected outcomes and expected values to co-ordinate our behaviour in uncertain situations, what has not yet been examined is the neurological and psychological relationship between generating expected outcomes and generating expectancies of the success of expected outcomes. This latter activity has been of significant interest to social psychologists examining control behaviours. In particular, work by Bandura and Vancouver clearly shows that judgements of the success of expected outcomes are influential in the actions we generate to control uncertainty. Moreover, the work from the social psychology domain suggests that judgements of self-efficacy work in tandem with judgements of expected outcome in order to establish how the environment is structured, as well as how to develop plans of actions that can control it. Nevertheless, the work discussed in this chapter clearly highlights the importance of judgements of expected outcomes in coping with uncertainty, since we have so many neural circuits that help us achieve this judgement. The second issue that still needs to be tackled by neuropsychology is to use tasks that are more akin to situations that people face when controlling systems. Complex control tasks of the kind discussed in the previous chapters have yet to be studied using fMRI techniques. However, tasks that mimic gambling scenarios are sufficiently uncertain and difficult that they involve many of the behaviours typical of controlling outcomes in complex dynamic systems. In fact, gambling behaviour is an example where there is an obvious desired goal (e.g., maximizing one’s profits), the outcomes of choice behaviour have important consequences for the individual (e.g., wins and losses), and the environment is an uncertain one (e.g., there are probabilities attached to winning, as well as to the magnitude of the payoff). The individual is required to learn the properties of the environment in order to predict it and control it to sustain regular rewards while avoiding losses. What neuroeconomics studies tell us is that we are good at learning the risky factors that are generated by the gambling environment, and so this can be used as a basis for understanding how we learn about and manage our choices in risky situations. The only limitation is that 252
Neuroscience there is a crucial difference between control systems and gambling environments, and this may in turn effect what parallels we can draw between the two. The decisions and actions that are made in the control systems can change the outcome in the environment, whereas in the latter, no matter what action is taken, the outcome will occur independently of the action.16
16 Clearly in gambling situations control is illusionary (see Langer, 1975; and, more recently, Martinez, Bonnefon, & Hoskens, 2009), and so these situations are an interesting counterpoint to control systems environments.
Chapter 9
Synthesis
When we think about what it takes to understand how we control our world, the single biggest problem that psychology, neuroscience, engineering, cybernetics and human factors research faces is finding a coherent way of describing the exact properties of the world and correspondingly describing the psychological processes involved in controlling it. To make any headway, the first and most obvious thing to do is to find an appropriate question to ask which helps prioritize the things of interest. Being a pragmatist, I framed the question from a perspective that I know best, and that is from the point of view of cognitive psychology: how do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both? What I hope I have achieved is to show that even if the question has its origins in cognitive psychology, it is a question that equally applies to all the disciplines that have been discussed, and it begs an answer which would be relevant to all of them.
Synthesis The purpose of this chapter is to present a framework that can answer the target question. The answer is built from not only the clues that psychology has offered, but also the insights that neuroscience, engineering, cybernetics and human factors have offered. While it is obvious that many have presented their own versions of a framework (e.g., Hooker, Penfold, & Evans, 1992; Sloman, 2008; Sutton & Barto, 1998; Watkins & Dayan, 1992), and there are entire dedicated fields of research designed to do the same job (e.g., early cybernetics and its current form, machine learning, as well as control systems engineering and human factors), the main difference here is that this is the first attempt to bridge all of the disciplines in a way that promotes unity, which until now was still distinctly lacking. So, this chapter will revisit all the main themes that were discussed in this book which should be familiar to you now. By drawing from the different disciplines, it will present some basic principles, and in turn offer a solution to a problem that has hampered progress in the study of control, which is the mystery of complexity. As discussed in Chapter 7, attempts to understand how we control complex systems have stumbled because they’ve tried to answer the question ‘What makes a system complex?’ The book is entitled Controlling Uncertainty and not Controlling Complexity for good reason. One of the main arguments made in Chapter 7 is that complexity is better thought of by another name, and that is uncertainty.1 Therefore, the framework indirectly tackles complexity by turning the question of what makes a system complex into a more manageable question: what makes a system uncertain?
Building a Framework
What we’ve come to know in the study of control in its various guises is that there are a multitude of possible models (representations) or descriptions that can be used to refer to the system (e.g., 1
For a detailed discussion of this, see Osman (2010).
Magda Osman describing the behaviour of the system from the point of view of engineering, cybernetics and machine learning). There are also a number of possible ways of characterizing the behaviour that humans exhibit when trying to control it (e.g., describing behaviour from the point of view of human factors, social and cognitive psychology and neuroscience). The answer to the question of how we control a complex control system has to come from finding a shared model of the control system, and a shared basic description of our behaviour within it.2 The point is this. At a scientific level the models used to characterize the control system vary because every disciple has its own language of description to refer to what a complex system is. We as humans vary in the models we have of the control system and how we come to think of them. But, in spite of this, all the models must reduce to some critical properties, because the way we behave in all the various systems that there are tends to be roughly the same. This is my starting point, and this is my motivation for finding a framework that can help to unify the various disciplines interested in understanding control systems behaviour. Having surveyed the many different approaches to control systems research, and having considered the evidence from cognitive psychology, what I plan to spell out in the framework is what those critical properties are. This will be done by presenting a few general principles of the control system itself, and a few general principles of our psychological behaviour when interacting with control systems. The framework and the general principles that I will outline integrate two concepts that have been repeatedly referred to in this book: agency and uncertainty. The ultimate goal, but one that cannot be achieved here, is to use the framework to develop a formal description of our environment and our behaviours. This 2
Given the different domains that have been described in this book, each have their own formal model of dynamic control environments, and their own particular language and terminology used to refer to them, though in essence each discipline is referring to the same underlying concept; this was a point strongly made by Wiener (1948) and Ashby (1947).
will eventually equip us with a succinct method of predicting behaviour in a variety of environments in which we exert control, and help to make recommendations for what would need to be improved in order to help us control our environment better. We could be on the road to achieving this, but not yet. As mentioned before, there have been many attempts to find a formal way of describing control systems and our behaviours. So before I present the framework itself, the following discussion examines why it is that we cannot use what is currently available as formal descriptions (i.e., formal modelling) as an analogue of what the control system is and what we actually do in it.
1. Formal models come with their own limitations, and, of the range of many models available, each tends to be suited to specific types of uncertain environments. That is, there are specific environments for which the model of the system’s dynamics can adequately predict the response of the operator. In order to provide a general description of behaviours in all uncertain control systems, formal models need to be sufficiently flexible to apply to any type of control system.
A number of recent reviews (e.g., Barto & Mahadevan, 2003; Chen, Jakeman, and Norton, 2008; Wörgötter & Porr, 2005) have examined the current status of a variety of formal techniques (e.g., case-based reasoning, rule-based systems, artificial neural networks, genetic algorithms, fuzzy logic models, cellular automata, multiagent systems, swarm intelligence, reinforcement learning and hybrid systems). What is apparent in the reviews is that there is a wide range of formal models each of which are well suited to a particular type of control system. For instance, artificial neural networks have wide applications because they can be used in dataintensive problems in which the solution is not easy to uncover, is not clearly specified, and or is difficult to express. As earlier discussions throughout this book suggest, these types of model have been used to solve problems like pattern classification, clustering, prediction and optimization. They also extend beyond simulating cognitive and neurological behaviours to a wide range of contexts 257
Magda Osman that include sewage odour classification (Onkal-Engin, Demir, & Engin, 2005); ecological status of streams (Vellido et al., 2007); land classification from satellite imagery (Santiago Barros & Rodrigues, 1994); predicting weather, air quality and water quality from time series data (Agirre-Basurko, Ibarra-Berastegi, & Madariaga, 2006; Kim & Barros, 2001; Zhang & Stanley, 1997); and process control of engine speed (Jain, Mao, & Mohiuddin, 1996). But they are limited because of their black-box status. In other words, when there is no fit between the problems generated by the control system and the model’s solution to them, it is difficult to track what process contributed to the lack of convergence between problem and solution. As another case in point, take reinforcement learning (RL) algorithms, which have also enjoyed wide applicability in neuroscience and psychology (Barto, 1994; Dayan & Daw, 2008), as well as in robotics and game playing (Kaelbling, Littman, & Moore, 1996; Konidaris & Barto, 2006). The success in their application often comes from model-based policy search approaches to RL. What this actually refers to are decisions or actions (policies) that are generated by using a model (or ‘simulator ’) of the Markov decision process.3 The concern with RLs comes when the actual RL algorithms are applied to highly uncertain control systems, termed dimensional continuous-state situations. Examples include the control of industrial manufacturing processes; management systems; the operation of surface water and groundwater reservoirs; the management of fisheries, crops, insect pests and forests; and the management of portfolios in markets with uncertain interest rates. Often, developing models of these highly uncertain control systems is problematic because they are mostly only successful in simulation, but not for real-world problems. That is, they don’t match up to reality. In contrast, model-free RL algorithms (i.e., 3
This is a mathematical method for modelling decision making when the outcome may be the result of random factors as well as the result of actions made by an agent (Malikopoulos, Papalmbros, & Assanis, 2009).
Synthesis Q-learning models) are based on finding the optimal policy for a given environment, rather than operating with one to start with. This is equivalent to picking up what rules you need along the way rather than being given a set of rules to begin with and simply recognizing the conditions in which to implement them. However, when model-free RL algorithms are applied to real-life situations, they also face problems because they tend to require an unrealizable number of real-life trials from which to accurately describe the processes (Abbeel, Quigley, & Ng, 2006; Kaelbling, Littman, & Moore, 1996; Malikopoulos, Papalmbros, & Assanis, 2009). Moreover, while in some cases the models may well be good at finding policies over time while interacting with a given environment, they do not simultaneously learn the dynamics of the system (i.e., they do not solve the system identification problem). This is clearly one thing we can do very well, and without needing an obscene number of experiences to achieve it. To get around some of these issues, hybrid models have been developed that combine different formal techniques. However, when hybrids are created (e.g., artificial neural networks combined with swarm intelligence models4 [e.g., Chau, 2006] or artificial neural networks combined with the Kalman filter [Linsker, 2008]), they are able to make important extensions beyond the capabilities of the individual models. However, new issues arise. Hybrid models can inherit the problems of the models that are combined, or else new problems can arise as a result of the combination (e.g., communication between models can be problematic because representations from one model need to be translated for the other model to use). The main point to take away here is that formal methods are extremely successful in their application to a variety of control systems, but they are still constrained. Each has its own problems to overcome, and as yet there is no single model or combination that can accurately describe a range of behaviours in a range of 4
Swarm intelligence models are based on the behaviour of colonies (e.g., migration, flocking, herding and shoaling) of insects (e.g., ants and bees) and fish.
highly uncertain control systems (high-dimensional continuous-state situations) or multi-agent systems such as those control systems discussed throughout this book.
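To make the contrast with model-based approaches concrete, the sketch below runs tabular Q-learning (the model-free case just described) on a toy two-state Markov decision process. The environment, rewards and parameters are invented for illustration; real control systems of the kind listed above have state spaces far too large for a lookup table of this sort, which is exactly the scaling problem raised in the text.

```python
import random

random.seed(1)

# Toy two-state environment (invented): action 1 in state 0 usually leads to
# state 1, and action 0 in state 1 collects a reward and resets to state 0.
def step(state, action):
    if state == 0:
        if action == 1 and random.random() < 0.8:
            return 1, 0.0
        return 0, 0.0
    if action == 0:
        return 0, 1.0
    return 1, 0.0

# Model-free Q-learning: action values are learned from sampled transitions
# alone; the algorithm never sees the transition rules coded inside `step`.
Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
alpha, gamma, epsilon = 0.1, 0.9, 0.1

state = 0
for _ in range(5000):
    if random.random() < epsilon:                       # occasional exploration
        action = random.choice((0, 1))
    else:                                               # otherwise exploit
        action = max((0, 1), key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in (0, 1))
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

print({k: round(v, 2) for k, v in Q.items()})
# The learned values come to favour action 1 in state 0 and action 0 in state 1.
```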
2. Formal models are rational descriptions of behaviour. They serve as a benchmark, but we still aren’t sure how to deal with the gap between the optimal behaviour prescribed by formal methods and what we actually do in control systems.
In a way, this point picks up from various arguments made in Chapter 5 and Chapter 8, in which questions are raised as to the appropriateness of using prescriptive methods to examine human behaviour in uncertain environments. Formal models are rational descriptions of an environment (control systems) or a process (e.g., problem solving, decision making and learning). The critical problems which have been raised by many (e.g., Sheridan, 2000, 2002; Sheridan & Parasuraman, 2006; Vromen, 2007) are succinctly captured by McKenzie’s (2003) comment:

Rational models serve as useful guides for understanding behavior, especially when they are combined with considerations of the environment. When a rational model fails to describe behavior, a different rational model, not different behavior, might be called for.
The rationality debate is an ongoing one and has an obvious bearing here. In an attempt to specify behaviour through models that provide a standard or yardstick against which to compare our behaviour, there comes a problem. Our behaviour deviates from formal descriptions. We can precisely say what we ought to do, and we can precisely say the extent to which we do what is prescribed, but the fact that we don’t always do what the formal model specifies is a problem. This requires us to then either (1) describe the behaviour as irrational, or (2) propose that the methods of assessing behaviour in the laboratory (e.g., our capability in operating a simulated sugar plant factory) are not a good map to reality, or (3) posit that the model is inappropriate for the context in which it was
applied. McKenzie’s comment was directed at the explosion of Bayesian formal methods (e.g., Griffiths & Tenenbaum, 2005; Tenenbaum, Griffiths, & Kemp, 2006) which have been used to describe, amongst many things, information gathering and updating, and the inferences and decisions we make from this process (for critical reviews, see McKenzie & Mikkelsen, 2006; Nelson, 2005). Though McKenzie’s (2003) critical point was specifically addressing Bayesian models, the argument can also be extended to all formal models which have been used to describe psychological behaviour. Formal methods are prescriptive, so deviations from them in turn imply that our behaviour is irrational. By implication this would also suggest our behaviour in the real world needs to change. Why is this at all important in the context of control? It is an important scientific issue, but it also has serious consequences in the real world. For human factors research, this issue is faced head on when trying to identify the root cause of a fault occurring in a system. That is, it is absolutely critical to find the right model to describe the environment, because without it, equipped with only an understanding of psychological behaviour in those environments, one cannot accurately assign the causes of errors (e.g., Castanon et al., 2005). In sum, while the usefulness of formal models is indisputable, and theory has benefited from a tightening in the specificity of claims as a result of being formally tested, they are still founded on ideas that are problematic. McKenzie makes the case that sometimes we need to recognize that the formal description may not be appropriate as a means of evaluating our behaviours, rather than concluding that we are behaving in a way that is suboptimal. However, there does still need to be some explanation as to why we deviate from prescriptive models, as well as a need to propose alternative models that best capture our behaviour.
3. Formal models ignore the important issue of agency, which is integral to understanding our control behaviours in control systems.
Magda Osman Sloman (1993, 1994) has proposed that formal models neglect the importance of causal power.5 Even if a model does include a description of a mechanism that is capable of estimating causal power to affect changes in an environment,6 this is a case of the formal model having pre-set conditions that, when passed, trigger judgements of causal power. But this isn’t equivalent to a description of an agent actually having causal power.7 Even formal models that rely upon a motivational system (i.e., RL algorithms) do not have an inbuilt system that generates rewards to begin with (Konidaris & Barto, 2006). The RL algorithms need to be told what is valuable and rewarding and what the goal ought to be. Moreover, a formal model to estimate causal power and simulate an agent implementing it is not equivalent to us organizing our actions to do what we want when we want. Why might this be a problem? As work in social psychology has highlighted, motivational principles are key to understanding agency, and as is argued here, formal models are just not that well equipped to deal with them as yet. Formal models are reactive, that is, conditions from the outside trigger an internal system which then internally represents the condition. From this, pre-designed sets of rules that are assigned to the model by a designer (i.e., a computer programmer) are used to determine which action suits the condition. The problem here is that, if the conditions are not described beforehand, even within the scope of a learning algorithm, sensible responses from the model are not guaranteed. Compared to a learning algorithm, we are reasonably good at deciding on the fly how to suddenly respond to something unexpected that happens in a control system.
5 Causal power here can be taken to refer to agency.
6 For example, an analysis that includes a principled way of specifying task requirements, resources that are required to perform the task, what prior knowledge can be brought to bear, how much effort is needed to perform the task, and what can actually be achieved and what new goals need to be set.
7 See the section in Chapter 2 entitled ‘Sense of agency, actions and what they are made up of’.
Reactive formal models learn by acting in a situation and then updating what they know from the consequences. Other formal models are more contemplative, and before taking an action various plans of alternative actions are evaluated, and then after this a carefully selected action is made. What this requires is some type of monitoring process which keeps track of what was done and what happened from it (Sloman, 1999). Again, for control behaviours, monitoring processes are essential because they are used to track the actions and decisions made and the success of the actions taken. The problem here is that a formal model simultaneously needs to generate actions while also playing out various scenarios that may or may not then be acted on. This level of implementation in a model appears to still be hard to achieve, particularly because the type of monitoring system needed has to self-monitor as well as monitor changes in the environment. Again, the problem is whether formal descriptions can match up to what we can do. We can self-monitor, though not often enough, and to our detriment sometimes (Osman, 2008); this is an important way of allowing us to estimate our capabilities in order to plan what we need to do. Studies in social and cognitive psychology suggest that our internal reflective process is crucial in directing our behaviours, and it is not static. Rather, what seems to be the case is that given the goals that we pursue and the various changes that occur in our environment, our sense of agency can change quite radically. Moreover, it can change even if the environment that we are acting on remains relatively stable. Our estimates of our effect in the world can be motivating and result in control over incredibly complex systems, but then again our estimates of our effect on the world can also be crippling if we believe our actions make a minimal impact on our environment, or at worst are believed to be irrelevant. Our sense of agency is not always conditional on what happens in our environment; it can be internally driven by the self. In sum, the limitation that I have tried to highlight here is that the sense of agency that propels our actions or halts them cannot as yet be incorporated into current formal modelling.
A Way Forward
With respect to answering the target question of this book, ‘How do we learn about, and control online, an uncertain environment that may be changing as a consequence of our actions, or autonomously, or both?’, the principles that I will outline and the framework that draws them together are based on integrating two concepts that are fundamental to control systems, and to our behaviour in control systems: agency and uncertainty.
2.
AGENCY: Of the work discussed throughout this book, there are two factors that contribute to our sense of agency: (1) subjective estimates of confidence in predicting outcomes of events in a control system (predictability of the environment) (E1), and (2) subjective estimates of expectancy that an action can be executed that will achieve a specific outcome in a control system (predictability of control) (E2). UNCERTAINTY: Of the work discussed throughout this book, there are two different types of uncertainty that affect our agency with respect to the control system: (1) those fluctuations in the system that will occur that are not accurately predicted by the individual, but that are infrequent fluctuations that contribute to the system perceived as having low uncertainty (U1); and (2) those fluctuations in the system that will occur that are not accurately predicted by the individual, and are highly erratic fluctuations that contribute to the system perceived as having high uncertainty (U2).
Uncertainty and agency are essential components to controlling our environment. They refer to how we perceive the control system according to how uncertain it is, as well as our general ability to effect a change in it. Both these attributes in combination have often been overlooked in formal descriptions. The integration of these components is an acknowledgement of the critical findings that research from all the disciplines covered in this book has revealed. 264
General Principles
The control system
1. Goal oriented: All control systems are goal directed; they consist of a target goal and various subgoals, each of which contributes to the target goal.
2. Probabilistic and dynamic: The two most critical features of a control system are the probabilistic and dynamic relationship between causes initiated by an individual or resulting from actions independent of those made by the individual (i.e., other environmental factors, or internal triggers in the system) and their effects on the state of the control system. The cause–effect relations are probabilistic: the effect may not always occur in the presence of the cause, and/or in the absence of the cause the effect may still be present. The cause–effect relations are dynamic: the internal mechanisms that are the functional relationship between causes and effects in control systems are what make the relationship between causes and effects dynamic.
3. Stability: The greater the flexibility and range of outcomes generated by the control system, the greater its instability, and the greater the demands it places on exerting control on the system.
4. Rate of change: The objective characteristics of the environment most important for the purposes of psychological factors related to control are (1) the rate of changes in the outcome of the system over time that is non-contingent on external interventions (i.e., the changes that occur when we don’t act on the system), in combination with (2) the rate of absent changes in the outcome of the system over time that should follow external interventions (i.e., occasions in which nothing changes when in fact we expect a change in the system from our actions). From the perspective of the individual, this can translate into two types of judgement uncertainty: (1) those fluctuations8 in the system that will occur that are not
8 If the rate of change for both (1) and (2) is low, then these are conditions of the control system that the individual will experience as U1.
accurately predicted by the individual, but that are infrequent fluctuations that contribute to the system perceived as having low uncertainty (U1); and (2) those fluctuations9 in the system that will occur that are not accurately predicted by the individual, and are highly erratic fluctuations that contribute to the system perceived as having high uncertainty (U2).
Our behaviour in control systems
1. Goal oriented: Our interactions with a control system are entirely dependent on specifying a goal which can be determined by the control system, and so our behaviours contribute to reaching that goal. Also, the goals we pursue can be self-determined, which in turn may or may not coincide with the goal that the system is operating by. Thus, our behaviours are goal directed; they involve the generation and application of actions designed for a future event in a control system with respect to a goal.
2. Monitoring the control system: Monitoring involves tracking discrepancies between a target outcome (a desirable outcome) and the achieved outcome which is based on expectancies formed from judging cause–effect relations (task monitoring). Task-monitoring behaviours reduce uncertainty by generating testable predictions about the behaviour of the environment to update the individual’s knowledge of the environment.
3. Monitoring our own behaviour: Monitoring involves tracking discrepancies between events based on expectancies formed from judging the success of goal-directed actions (self-monitoring). Self-monitoring behaviours reduce uncertainty because the expectancy generated from estimating the success of producing a specific outcome from an action generates feedback which in turn is used to evaluate the individual’s decisions and actions.
9 If the rate of change for either (1) or (2) is high, or for both, then these are conditions of the control system that the individual will experience as U2.
4. Agency: We are able to manage uncertainty in control systems with dynamic and probabilistic properties because we make two estimates: (1) subjective estimates of confidence in predicting outcomes of events in a control system (predictability of the environment) (E1), and (2) subjective estimates of expectancy that an action can be executed that will achieve a specific outcome in a control system (predictability of control) (E2).
Framework: the integration of agency and uncertainty in control systems
1. Feedback psychological system of control: The control system is goal-directed, and to reduce uncertainty about how the system operates we must interact with the system in a goal-directed manner. By devising a goal, we help to motivate our actions in a specific way, which forms the basis of our control behaviours. In turn, we generate an action that produces an outcome. From this, we will also generate a discrepancy between the goal (desirable outcome) and the actual outcome (achieved outcome). The discrepancy between goal and actual outcome is evaluated by monitoring the control system and our own behaviour.
2. Feedforward psychological system of control: To motivate our actions to continue to reduce the discrepancy between goal and achieved outcome, and maintain control over the outcome, the two methods of evaluation of that discrepancy form the basis of two estimates of agency (Estimate 1: predictability of the environment; Estimate 2: predictability of our control over it). The more encounters we have with the control system, the more examples we have of the discrepancy between E1 and the outcome generated by the system, and between E2 and the generated outcome. Combined, these generate an overall judgement of the control system’s uncertainty. People differ with respect to their threshold of uncertainty. If the overall judgement of uncertainty exceeds the threshold, then the control system is judged to be U2. If the overall judgement of uncertainty falls below the threshold, then the control system is judged as U1.
3. U1 will not lead to regular exploitation of the control system (i.e. a systematic application of the same pattern of interactions with the system) after every change to an outcome in the system. U2 will lead to regular exploration of the control system (i.e. a change in the pattern of interactions with the system designed to seek out additional knowledge about the relationship between self-generated actions and outcomes in the system) after every change to an outcome in the system.
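The framework is stated verbally, but its core decision rule can be caricatured in a few lines of code. The sketch below is an illustrative rendering rather than a formal model proposed here: discrepancies involving E1 (predicting the system’s outcomes) and E2 (predicting the effects of one’s own actions) are pooled into an overall judgement of uncertainty, compared against an individual threshold, and used to classify the system as U1 or U2. The error values, the pooling rule and the threshold are all assumptions made for the illustration.

```python
def judge_uncertainty(e1_errors, e2_errors, threshold=0.5):
    """Pool discrepancies for E1 (predicting the system's outcomes) and E2
    (predicting the effects of one's own actions) into a U1/U2 judgement.

    e1_errors, e2_errors: recent absolute prediction errors, scaled to [0, 1].
    threshold: the individual's uncertainty threshold (it varies across people).
    """
    overall = 0.5 * (sum(e1_errors) / len(e1_errors)
                     + sum(e2_errors) / len(e2_errors))
    if overall > threshold:
        return overall, "U2", "explore after every change to an outcome"
    return overall, "U1", "no systematic re-exploitation after every change"

# Infrequent, small surprises are experienced as U1; erratic ones as U2.
print(judge_uncertainty([0.1, 0.0, 0.2], [0.1, 0.1, 0.0]))
print(judge_uncertainty([0.9, 0.2, 0.8], [0.7, 0.9, 0.6]))
```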
General Claims
Learning to control
To accurately learn about a control system and judge the impact of one’s behaviour on it, the most effective way to do so is to experience it as U2. However, to develop a sense of agency and generate behaviours that will help to build up a representation of the control system, even if it means that we fall under the illusion of control, then the most effective way to do so is to experience it as U1. To examine the accuracy of people’s estimates of uncertainty, their perceived U1 or U2 can be compared with the objective characteristics of the environment according to the two types of rates of change. Over time, judgements of uncertainty will converge with the objective characteristics of the control system, because our learning and decision-making mechanisms are driven by them, and are sensitive to them.
Prediction and control
Prediction is the development of an expectancy of an outcome. Learning via prediction involves a process that refines the decisions that will determine the expected value function associated with an outcome. Control is the management of an expected outcome. Learning via control involves refining the decisions that will help utilize the value functions associated with an outcome to reduce the
discrepancy between the goal and the outcome. The functional mechanism that supports prediction and control learning should be the same, because they are both reducible to the same underlying process – the development of an expectancy of an outcome. However, the magnitude of the value function attached to the outcome is greater in control learning than predictive learning because control involves behaviours that can have direct consequences on the outcome, whereas prediction has indirect consequences.
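One way to see why prediction and control learning might rest on the same functional mechanism is that both can be written around a single error-driven update; they differ in what the expectancy is attached to and in how it is then used. The sketch below is an illustrative rendering of that claim under invented values, not a model drawn from the studies discussed in this book.

```python
def update_expectancy(expectancy, outcome, learning_rate=0.2):
    """Shared error-driven core: shift the expectancy towards the outcome."""
    return expectancy + learning_rate * (outcome - expectancy)

# Prediction learning: an expectancy of the outcome itself is refined.
predicted = 0.0
for outcome in (1.0, 1.0, 0.0, 1.0):
    predicted = update_expectancy(predicted, outcome)

# Control learning: the same update, but attached to candidate actions and
# then used to pick the action that most reduces the goal-outcome discrepancy.
goal = 1.0
action_values = {"lever_A": 0.0, "lever_B": 0.0}
typical_outcome = {"lever_A": 0.9, "lever_B": 0.2}   # invented action outcomes
for _ in range(10):
    for action, outcome in typical_outcome.items():
        action_values[action] = update_expectancy(action_values[action], outcome)

best = min(action_values, key=lambda a: abs(goal - action_values[a]))
print(round(predicted, 2), {a: round(v, 2) for a, v in action_values.items()}, best)
```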
Illusion of control
Highly frequent changes that are non-contingent on human action can make the system appear controllable – which can be illusionary. This is because with more changes occurring in the system, it is more likely that there will be effects that coincide with actions frequently generated by humans, thus giving us a stronger impression that our actions determine the changes that occur. In the long term, improvements to learning and decision making in control systems are dependent on the control system being experienced in its most unstable state. This is because the greater the range of states of the system that is experienced, the more depth of knowledge is acquired, and the less susceptibility to the illusion of control.
Chapter 10
Epilogue: the master puppeteer
The ringmaster of a travelling circus told a promising young puppeteer of a mysterious string puppet that would present a challenge to anyone who tried to control it. The puppeteer was intrigued. ‘Though it could be mastered, it would take all the skill and determination of even the most expert puppeteer.’ Then, in answer to the young puppeteer ’s question, the ringmaster replied, ‘The reason why it is so difficult to work the puppet is that inside the puppet is a mechanism which follows its own rules and logic.’ This was enough to inspire the puppeteer. At last he had found his goal. He would seek out and learn all there is to know of the puppet and how to control it. But, before leaving, as a warning to the young apprentice, the ringmaster ’s final words on the matter of the string puppet were these: Be very careful. The puppet can make you feel uncertain; you can’t always know if it acts the way you want it to because your actions have caused it to act that way or because of its probabilistic and
dynamic properties. Sometimes you’d be forgiven for thinking that it is unpredictable because it acts completely independently of your actions.
When eventually the young puppeteer found the mysterious string puppet, what he saw was a large glass box supported by an elaborately carved wooden frame. The puppeteer paced around, inspecting the box. On one of the sides of the box was a panel with many levers of different shapes and sizes, and beside that was a chair. The puppet inside the box looked fairly ordinary suspended in space, its limbs heavy, holding the posture that all string puppets have before they’re brought to life. Though the puppeteer was told that there were cause–effect relations between them and the puppet, having peered hard into the glass, he just couldn’t find them. The cause–effect relations must be either very fine or transparent to me, the puppeteer thought to himself. The young puppeteer sat at the chair. He tried out various levers in an exploratory way at first, but not much seemed to happen. It was hard to tell which lever moved the head, and which the limbs. There were more moving parts to the puppet than there were levers. This brought moments of uncertainty into the puppeteer, making him wonder if he could ever hope to find the right means to control the puppet. However, when there was a goal-directed order to operating the levers, the puppeteer could be exploitative and make the puppet’s arm jolt, and even bring about the tapping of a foot. Once in a while the words of the ringmaster surfaced in the mind of the puppeteer. As the ringmaster had warned, the puppet moved on its own; there were infrequent erratic fluctuations, but also infrequent minor fluctuations. The puppeteer tried to ignore this, hoping that it would be enough to keep pulling and pushing at the same familiar levers. Though this did seem to work, it was an illusion of control. The puppeteer knew that his rehearsed actions couldn’t alleviate the scepticism that the puppet’s behaviour brought about. ‘I wonder whether the mechanism operates truly independently of my actions, or is it simply that I haven’t yet explored all there is to know about it?’ 271
Confidence in predicting outcomes and confidence in controlling outcomes helped to maintain the puppeteer’s sense of agency in the face of uncertainty. His self-efficacy helped him to believe that he could ultimately control the puppet, and not the other way around. The puppeteer learnt to predict the way the mechanism operated. The puppet moved of its own accord and wouldn’t always keep in time with the music that was played. On some occasions there would be a high rate of change, and on others a low rate of change. There were even times when the limbs would move and dance in sequence, but not when predicted. It did take a long time, but the young puppeteer became skilled in conducting the mysterious string puppet. The ringmaster had followed the progress of the puppeteer, and found an opportunity to watch a performance. The tales were true: the puppet danced elegantly before the audience. The ringmaster hoped to learn what valuable lessons the puppeteer had gained during his apprenticeship. ‘All I know is this,’ said the puppeteer. ‘To make the puppet dance the way I wanted, I had to know how stable it was. The less stable it seemed to be, the more flexibility and range there were in its movements. I had to know the rate of change, and which changes were non-contingent on my actions. This helped me to develop my sense of agency according to my estimate of success in predicting the puppet’s behaviour, and my estimate of success in producing an action that would bring about a specific action by the puppet. That, in turn, helped me to distinguish the times when it really was behaving of its own accord from the times when it was behaving as I wanted. For all this, I needed to keep in mind two very important details: that I was the one who controlled which actions I generated, and when I generated them. This is how I predicted what the puppet might do from one moment to the next, and this was what helped me to understand what more I needed to know to transfer my experience to controlling other puppets.’ It was coming to know these things that helped to make the apprentice into the master puppeteer.
References
Chapter 1 Beckett, S. (2009). Waiting for Godot. New York: Continuum. (Originally published in 1954) Carroll, L. (1871). Through the looking glass. London: Macmillan. Dörner, D. (1975). Wie Menschen eine Welt verbessern wollten und sie dabei zerstörten [How people wanted to improve a world and in the process destroyed it]. Bild der Wissenschaft, 12 (Populärwissenschaftlicher Aufsatz), 48–53. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press. Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. New York: Wiley. Willmott, M., & Nelson, W. (2003). Complicated lives: Sophisticated consumers, intricate lifestyles, simple solutions. Chichester: Wiley.
Chapter 2 Anscombe, G. E. M. (1981). Causality and determination. In J. Barnes (Ed.), The collected philosophical papers of G. E. M. Anscombe: vol. 2. Metaphysics Controlling Uncertainty: Decision Making and Learning in Complex Worlds, Magda Osman. © 2010 John Wiley & Sons Ltd.
273
References and the philosophy of mind (pp. 133–147). Minneapolis: University of Minnesota Press. Armstrong, D. M. (1999). The open door: Counterfactual versus singularist theories of causation. In H. Sankey (Ed.), Causation and laws of nature (pp. 175–195). Dordrecht: Kluwer. Bogen, J. (2004). Analyzing causality: The opposite of counterfactual is factual. International Studies in the Philosophy of Science, 18, 3–26. Bogen, J. (2008). Causally productive activities. Studies in History and Philosophy of Science, 39, 112–123. Brand, M. (1984). Intending and acting. Cambridge, MA: MIT Press. Bratman, M. E. (1987). Intention, plans, and practical reason. Cambridge, MA: Cambridge University Press. Carnap, R. (1928). The logical structure of the world. Berkeley: University of California Press. Carnap, R. (1945). The two concepts of probability: The problem of probability. Philosophy and Phenomenological Research, 5, 513–532. Carnap, R. (1950). Logical foundations of probability. Chicago: University of Chicago Press. Cartwright, N. (1980). The reality of causes in a world of instrumental laws. Proceedings of the Bienneial Meeting of the Philosophy of Science Association, 2, 38–45. Cheng, P. W. (1997). From covariation to causation: A causal power theory. Psychological Review, 104, 367–405. Craig, E. (1987). The mind of God and the works of man. Oxford: Oxford University Press. Davidson, D. (1963). Actions, reasons, and clauses. Journal of Philosophy, 60, 685–700. Davidson, D. (1967). Causal relations. Journal of Philosophy, 64, 691–703. Davidson, D. (1970). Events as particulars. Nous, 4, 25–32. Davidson, D. (1980). Essays on actions and events. Oxford: Clarendon Press. Dennett, D. (1971). Intentional systems. Journal of Philosophy, 68, 87–106. Dennett, D. (1978). Skinner skinned. In D. Dennett, Brainstorms. Cambridge, MA: MIT Press. Dennett, D. (1983). Intentional systems in cognitive ethology: The panglossian paradigm defended. Behavioural and Brain Sciences, 6, 343–390. Dennett, D. (1987). The intentional stance. Cambridge, MA: MIT Press. Dennett, D. (2006). The hoax of intelligent design, and how it was perpetrated. In J. Brockman (Ed.), Intelligent thought: Science versus the intelligent design movement (pp. 33–49). New York: Vintage.
274
References Dretske, F. (1989). Reasons and causes. Philosophical Perspectives, 3, 1–15. Dretske, F. (1993). Mental events as structuring causes of behaviour. In A. Mele & J. Heil (Eds.), Mental causation (pp. 121–136). Oxford: Oxford University Press. Dretske, F. (1999). Machines, plants, and animals: The origins of agency. Erkenntnis, 51, 19–31. Ducasse, C. J. (1969). Causation and the types of necessity. London: Dover. Falkenstein, L. (1998). Hume’s answer to Kant. Nous, 32, 331–360. Frankfurt, H. G. (1978). The problem of action. American Philosophical Quarterly, 15, 157–162. Goldman, A. (1970). A theory of human action. Englewood Cliffs, NJ: Prentice-Hall. Guyer, P. (2006). Kant. New York: Routledge. Harding, D., Fox, C., & Mehta, J. (2002). Studying rare events through qualitative case studies: Lessons from a study of rampage school shootings. Sociological Methods Research, 31, 174–217. Harman, E. (1968). Enumerative induction and best explanation. Journal of Philosophy, 65, 523–529. Hausman, D. M., & Woodard, J. (1999). Independence, invariance and the causal Markov condition. British Journal of the Philosophy of Science, 50, 521–583. Hempel, C. H. (1965). Aspects of scientific explanation. New York: Free Press. Hornsby, J. (1980). Actions. London: Routledge and Kegan Paul. Hornsby, J. (1993). Agency and causal explanation. In A. Mele & J. Heil (Eds.), Mental causation (pp. 160–188). Oxford: Oxford University Press. Hooker, C. A., Penfold, H. B., & Evans, R. J. (1992). Control, connectionism and cognition: Towards a new regulatory paradigm. British Journal for the Philosophy of Science, 43, 517–536. Hume, D. (1978). A treatise of human nature (ed. L. A. Selby-Bigge & P. H. Nidditch). Oxford: Clarendon Press. (Originally published in 1739) Jackson, F., & Pettit, P. (1988). Functionalism and broad content. Mind, 47, 381–400. Jeffrey, R. (1969). Statistical explanation vs statistical inference. In N. Rescher (Ed.), Essays in Honor of Carl G. Hempel (pp. 104–113). Dordrecht: Reidel. Johnson, C. (2003). Failure in safety-critical systems: A handbook of accident and incident reporting. Glasgow: University of Glasgow Press. Kim, J. (1973). Causes and counterfactuals. Journal of Philosophy, 70, 429– 441. Kitcher, P. (1980). A priori knowledge. Philosophical Review, 89, 3–23.
275
References Kletz, T. (1994). What went wrong? Case histories of process plant disasters (3rd ed.). Houston, TX: Gulf Publishing. Langsam, H. (1994). Kant, Hume, and our ordinary concept of causation. Philosophy and Phenomenological Research, 54, 625–647. Lewis, D. (1973). Counterfactuals. Cambridge, MA: Harvard University Press. Libet, B. (1985). Unconscious cerebral initiative and the role of conscious will in voluntary action. Behavioural and Brain Sciences, 8, 529–566. Lucas, J. (2006). Reason and reality. Retrieved March 28, 2010, from http:// users.ox.ac.uk/∼jrlucas/. Mackie, J. L. (1973). Truth, probability, and paradox. Oxford: Oxford University Press. Mele, A. R. (1992). Springs of action. Oxford: Oxford University Press. Mele, A. (1995). Autonomous agents. New York: Oxford University Press. Mele, A. (2009). Moral responsibilities and agents’ histories. Philosophical Studies, 142, 161–181. Nagel, T. (1986). The view from nowhere. Oxford: Clarendon Press. Owens, A. (1989). Causes and coincidences. Proceedings of the Aristotelian Society, 90, 49–64. Owens, A. (1992). Causes and coincidences. Cambridge: Cambridge University Press. Pacherie, E. (2008). The phenomenology of action: A conceptual framework. Cognition, 107, 179–217. Psillos, S. (2002). Causation and explanation. Chesham: Acumen. Quinn, W. (1989). Actions, intentions, and consequences: The doctrine of double effect. Philosophy and Public Affairs, 18, 334–351. Russell, B. (1918). On the notion of cause. In B. Russell, Mysticism and logic. London: Allen & Unwin. Salmon, W. (1984). Scientific explanation and the causal structure of the world. Princeton, NJ: Princeton University Press. Salmon, W. (1989). Four decades of scientific explanation. Oxford: Oxford University Press. Searle, J. (1983). Intentionality. Cambridge: Cambridge University Press. Shanks, D. R., & Dickinson, A. (1987). Associative accounts of causality judgment. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 21, pp. 229–261). San Diego, CA: Academic Press. Sloman, A. (1993). The mind as a control system. In C. Hookway & D. Peterson (Eds.), Philosophy and the cognitive sciences (pp. 69–110). Cambridge: Cambridge University Press.
276
References Sloman, A. (1999). What sort of architecture is required for a human-like agent? In M. Wooldridge & A. Rao (Eds.), Foundations of rational agency (pp. 35–52). Dordrecht: Kluwer Academic. Sloman, S. A., & Lagnado, D. A. (2005). The problem of induction. In K. Holyoak & R. Morrison (Eds.), The Cambridge handbook of thinking and reasoning (pp. 95–116). Cambridge: Cambridge University Press. Stoutland, F. (1976). The causation of behaviour. In Essays on Wittgenstein in honor of G. H. von Wright. Acta Philosophica Fennica, 28, 286–325. Strawson, G. (1989). The secret connexion. Oxford: Clarendon Press. Thalberg, I. (1972). Enigmas of agency. London: Allen & Unwin. Von Wright, G. H. (1973). On the logic of the causal relations. In E. Sosa & M. Tooley (Eds.), Causation (pp. 205–124). Oxford: Oxford University Press. White, C. (1990). Ideas about causation in philosophy and psychology. Psychological Bulletin, 108, 3–18. Woodward, J. (1993). Capacities and invariance. In J. Earman, A. Janis, G. J. Massey, & N. Rescher (Eds.), Philosophical problems of the internal and external worlds. Pittsburgh, PA: University of Pittsburgh Press. Woodward, J. (2000). Explanation and invariance in the special science. British Journal for the Philosophy of Science, 51, 197–254.
Chapter 3 Andrei, N. (2006). Modern control theory: A historical perspective. Studies in Informatics and Control, 10, 51–62. Bate, I., McDermind, J., & Nightingale, P. (2003). Establishing timing requirements for control loops in real-time systems. Microprocessors and Microsystems, 27, 159–169. Birmingham, H. P., & Taylor, F. V. (1954). A design philosophy for manmachine control systems. Proceedings of IRE, 42, 1748–1758. Bode, H. (1945). Network analysis and feedback amplifier design. New York: Van Nostrand. Bryson, A. E., & Ho, Y. C. (1975). Applied optimal control. New York: Wiley. Chretienne, P. (1989). Timed Petri nets: A solution to the minimum-timereachability problem between two states of a timed-event graph. Journal of Systems and Software, 6, 95–101. Coyle, G. (1996). System dynamics modelling: A practical approach. London: Chapman & Hall.
277
References Dayan, P., Kakade, S., & Montague, P. R. (2000). Learning and selective attention. Nature Neuroscience, 3, 1218–1223. Degani, A. (2004). Taming HAL: Designing interfaces beyond 2001. New York: Palgrave Macmillan. Degani, A., & Heymann, M. (2002). Formal verification of human-automation interaction. Human Factors, 44, 28–43. Dorf, R. C., & Bishop, R. H. (2008). Modern control systems. Upper Saddle River, NJ: Pearson Prentice Hall. Doyle, J. (1982). Analysis of feedback systems with structured uncertainties. IEEE Proceeding, 129, 242–250. Doyle, J., Francis, B., & Tannenbaum, A. (1990). Feedback control theory. New York: Macmillan. Ellis, G. (2002). Observers in control systems: A practical guide. New York: Academic Press. Feng, G., & Lozanao, R. (1999). Adaptive control systems. Oxford: Newness. Gee, S. S., & Wang, C. (2004). Adaptive neural control of uncertain MIMO nonlinear systems. IEEE Transactions on Neural Networks, 15, 674–692. Georgantzas, N. C., & Katsamakas, E. G. (2008). Information systems research with system dynamics. System Dynamics Review, 24, 247–264. Huang, Y. S., & Su, P. J. (2009). Modelling and analysis of traffic light control systems. IET Control Theory and Application, 3, 340–350. Ioannou, P. A., & Sun, J. (1996). Robust adaptive control. Englewood Cliffs, NJ: Prentice-Hall. Jacobs, O. L. R. (1993). Introduction to control theory (2nd ed.). Oxford: Oxford Science. Johnson, C. (2003). Failure in safety-critical systems: A handbook of accident and incident reporting. Glasgow: University of Glasgow Press. Johnson, C. (2005). What are emergent properties and how do they affect the engineering of complex systems? Reliability Engineering and System Safety, 91, 1475–1481. Kirlik, A. (2007). Conceptual and technical issues in extending computational cognitive modelling to aviation. In J. Jacko (Ed.), Human-computer interaction: Part I. HCII (pp. 872–881). Berlin: Springer-Verlag. Körding, K., & Wolpert, D. (2006). Bayesian decision theory in sensorimotor control. Trends in Cognitive Science, 7, 319–326. Lakatos, B. G., Sapundzhiev, T. J., & Garside, J. (2007). Stability and dynamics of isothermal CMSMPR crystallizers. Chemical Engineering Science, 62, 4348–4364.
278
References Leigh, J. R. (1992). Control theory: A guided tour. London: Peter Peregrinus. Leondes, C. T. (1998). Control and dynamic systems: Vol. 7. Neural network systems techniques and applications. New York: Academic Press. Mayne, D. Q., Rawlings, J. B., Rao, C. V., & Scokaert, P. O. M. (2000). Constrained model predictive control: Stability and optimality. Automatica, 36, 789–814. Mill, J. S. (1884). A system of logic (Bk. 3, Chap. 6). London: Longman. Morse, A. S. (1996). Supervisory control of families of linear set-point controllers – part 1: Exact matching. IEEE Transactions in Automated Control, 41, 1413–1431. Morse, A. S. (1997). Supervisory control of families of linear set-point controllers – part 2: Robustness. IEEE Transactions in Automated Control, 42, 1500–1515. Nyquist, H. (1932). Regeneration theory. Bell System Tech Journal, 11, 126–147. Papageorgiou, M., Diakaki, C., Dinopoulou, V., & Kotsialos, A. (2003). Review of road traffic control strategies. Proceedings in IEEE, 2043–2067. Perko, L. (1991). Differential equations and dynamical systems. New York: Springer-Verlag. Ramadge, P. J., & Wonham, W. M. (1987). Supervisory control of a class of discrete event processes. SIAM J. Control Optimization, 25, 206–230. Ramadge, P. J., & Wonham, W. M. (1989). The control of discrete-event systems. Proceedings in IEEE, 77, 81–98. Ramakrishnan, R., & Gehrke, J. (2003). Database management systems (3rd ed.). New York: McGraw-Hill. Rasmussen, J. (1985). Human information processing and human machine interaction. Amsterdam: North Holland. Redmill, F., & Rajan, J. (1997). Human factors in safety-critical system. Oxford: Butterworth Heinemann. Richardson, G. P. (1999). Reflections for the future of system dynamics. Journal of the Operational Research Society, 50, 440–449. Richardson, G. P., & Pugh, A. L., III. (1981). Introduction to system dynamics modeling with DYNAMO. Portland, OR: Productivity Press. Rothrock, L., & Kirlik, A. (2003). Inferring rule-based strategies in dynamic judgment tasks: Towards a noncompensatory formulation of the lens model. IEEE Transactions on Systems, Man and Cybernetics – Part A: Systems and Humans, 33, 58–72. Russell, A., Clark, M., & Mack, G. (2006). Automated population of dynamic Bayes nets for pre-conflict analysis and forecasting. IEEE Aerospace Conference Proceedings, 1–9, 3422–3432.
279
References Safonov, M. G., & Tsao, T. C. (1997). The unfalsified control concept and learning. IEEE Transactions on Automatic Control, 42, 843–847. Sheridan, T. B. (2002). Human and automation: Systems design and research issues. New York: Wiley. Suchman, L. (1987). Plans and situated actions. Cambridge: Cambridge University Press. Suchman, L. (2006). Human-machine reconfigurations: Plans and situated actions. Cambridge: Cambridge University Press. Summer, A. (2009). Safety management is a virtue. Process Safety Progress, 28, 210–213. Vicente, K. J. (2002). Ecological interface design: Process and challenges. Human Factors, 44, 62–78. Vicente, K. J., Roth, E. M., & Mumaw, R. J. (2001). How do operators monitor a complex, dynamic work domain? The impact of control room technology. Journal of Human-Computer Studies, 54, 831–856. Weeks, B. L., Ruddle, C. M., Zaug, J. M., & Cook, D. J. (2002). Monitoring high-temperature solid-solid phase transitions of HMX with atomic force microscopy. Ultramicroscopy, 93, 19–23. Wolpert, D. (1997). Computational approaches to motor control. Trends in Cognitive Science, 1, 209–216. Yang, F, Wang, Z., Ho, D, W. C., & Gani, M. (2007). Robust H control with missing measurements and time delays. IEEE Transactions on Automatic Control, 52, 1666–1672. Yu, L., & Goa, F. (2001). Optimal guaranteed cost control of discrete-time uncertain systems with both state and input delays. Journal of the Franklin Institute, 338, 101–110.
Chapter 4 Acid, S., & de Campos, L, M. (2003). Searching for Bayesian network structures in the space of restricted acyclic partically directed graphs. Journal of Artificial Intelligence Research, 18, 445–490. Agre, P., & Horswill, I. (1997). Lifeworld analysis. Journal of Artificial Intelligence Research, 6, 111–145. Aitkenhead, M. J., & McDonald, A. J. S. (2006). The state of play in machine/ environment interaction. Machine Learning, 25, 247–276. Ashby, W. (1945). Effects of control on stability. Nature, 155, 242–243.
280
References Ashby, W. (1947). Principles of the self-organizing dynamic system. Journal of General Psychology, 37, 125–128. Ashby, W. (1956). Introduction to Cybernetics. New York: Wiley. Barto, A. G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems-Theory and Applications, 13, 41–77. Barto, A. G., & Sutton, R. S. (1981). Landmark learning: An illustration of associative search. Biological Cybernetics, 42, 1–8. Bowker, G., & Chou, R-Y. (2009). Ashby’s notion of memory and the ontology of technical evolution. International Journal of General Systems, 2, 129–137. Bratko, I., & Urbancˇicˇ, T. (1997). Transfer of control skill by machine learning. Engineering Application of Artificial Intelligence, 10, 63–71. Brown, S. D., & Steyvers, M. (2009). Detecting and predicting changes. Cognitive Psychology, 58, 49–67. Bryant, J. (1991). Systems theory and scientific philosophy. New York: University Press of America. Casti, J. L. (1985). Canonical models and the Law of Requisite Variety. Journal of Optimization Theory and Applications, 46, 455–459. Cohen, W. (1995). Fast effective rule induction. In Proceedings of ICML-95 (pp. 115–123). San Francisco: Morgan Kaufmann. Conant, R. C., & Ashby, W. (1970). Every good regulator of a system must be a model of that system. International Journal of Systems Science, 1, 89–97. Corning, P. A. (2007). Control information theory: The ‘missing link’ in the science of cybernetics. Systems Research and Behavioral Science, 24, 297–311. Cowell, R. G. (2001). Conditions under which conditional independence and scoring methods lead to identical selection of Bayesian network models. In Proceedings of 17th International Conference on Uncertainty in Artificial Intelligence (pp. 319–328). San Francisco: Morgan Kaufmann. Daw, N. D., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioural control. Nature Neuroscience, 8, 1704–1711. Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441, 876–879. Dayan, P., & Niv, Y. (2008). Reinforcement learning: The good, the bad and the ugly. Current Opinion in Neurobiology, 18, 185–196.
281
References Doucet, A., Andrieu, C., & Godsill, S. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197–208 Einhorn, H. J., & Hogarth, R. M. (1978). Confidence in judgment: Persistence of the illusion of validity. Psychological Review, 85, 395–416. Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211 Elomaa, T., & Rousu, J. (1999). General and efficient multisplitting of numerical attributes. Machine Learning, 36, 201–244. Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., & Uthurusamy, R. (Eds.). (1996). Advances in knowledge discovery and data mining. Menlo Park, CA: AAAI Press/MIT Press. Feldbaum, A. A. (1960a). Dual control theory. Automation Remote Control, 21, 1240–1249. Feldbaum, A. A. (1960b). Dual control theory. Automation Remote Control, 21, 1453–1464. Gahegan, M. (2003). Is inductive machine learning just another wild goose (or might it lay the golden egg)? International Journal of Geographical Information Science, 17, 69–92. Gama, J., & Brazdil, P. (1999). Linear tree. Intelligent Data Analysis, 3, 1–22. Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I. Monatshefte für Mathematik und Physik, 38, 173–198. Goldberg, D. E. (1989). Genetic algorithm in search, optimization, and machine learning. Reading, MA: Addison-Wesley. Goldberg, D. E., & Holland, J. H. (1988). Genetic algorithms and machine learning. Machine Learning, 3, 95–99. Gollu, A., & Varaiya, P. (1998). SmartAHS: A simulation framework for automated vehicles and highway systems. Mathematical and Computer Modelling, 27, 103–128. Grush, R. (2004). The emulation theory of representation: Motor control, imagery, and perception. Behavioral and Brain Sciences, 27, 377–396. Gureckis, T. M., & Love, B. C. (2009). Learning in noise: Dynamic decisionmaking in a variable environment. Journal of Mathematical Psychology, 53, 180–195. Hagan, M. T., & Menaj, M. B. (1994). Training feedforward networks with the M algorithm. IEEE: Transactions of Neural Networks, 5, 989–993. Haykin, S. (1994). Neural networks: A comprehensive foundation. New York: Macmillan.
282
References Hazen, H. L. (1934). Theory of servo-mechanisms. Journal of Franklin Institute, 218, 279–331. Heylighen, F., & Joslyn, C. (2001). The law of requisite variety. Principia Cybernetica Web. Retrieved March 22, 2010, from http://pespmc1.vub. ac.be/REQVAR.HTML. Hokayem, P., & Abdallah, C. T. (2004, June 30–July 2). Inherent issues in networked control systems: A survey. In Proceedings of the American Control Conference (pp. 4897–4902). Boston. Ilg, W., & Berns, K. (1995). A learning architecture based on reinforcement learning for adaptive control of the walking machine LAURON. Robotic Automated Systems, 15, 321–334. Ishii, S., Yoshida, W., & Yokimoto, K. (2002). Control of exploitation – exploration meta parameter in reinforcement learning. Neural Networks, 15, 665–687. Jensen, F. (1996). An introduction to Bayesian networks. New York: Springer. Jordan, M. I. (1986). Serial order: A parallel distributed processing approach (Tech. Rep. No. 8604). San Diego: University of California, Institute for Cognitive Science. Kleene, S. C. (1952). Introduction to metamathematics. New York: Van Nostrand. Kochenderfer, M. J. (2005). Adaptive modeling and planning for reactive agents. In M. M. Veloso & S. Kambhampati (Eds.), Proceedings of the Twentieth National Conference on Artificial Intelligence and the Seventeenth Annual Conference on Innovative Applications of Artificial Intelligence (pp. 1648–1649). Menlo Park, CA: AAAI Press. Kolmogoroff, A. N. (1941). Interpolation und Extrapolation von stationären zufälligen Folgen. zv. Akad. Nauk SSSR Ser. Mat, 5, 3–14. Kotsiantis, S. B., Zahaakis, I. D., & Pintelas, P. E. (2006). Machine learning: A review of classification and combing techniques. Artificial Intelligence Review, 26, 159–190. Leen, G., & Heffernan, D. (2002). Expanding automotive electronic systems. IEEE, 35, 88–93. Linsker, R. (2008). Neural network learning of optimal Kalman prediction and control. Neural Networks, 21, 1328–1343. Love, T., & Cooper, T. (2008). Complex built-environment design: For extensions to Ashby. Kybernetes, 36, 1422–1435. Lucas, J. R. (1961). Mind, machines and Gödel. Philosophy, 36, 112–127. Mahadevan, S. (2009). Learning representation and control in Markov decision processes: New frontiers. Foundations and Trends in Machine Learning, 1, 403–565.
283
References Melnik, R. V. N. (2009). Coupling control and human factors in mathematical models of complex systems. Engineering Applications of Artificial Intelligence, 22, 351–362. Mindell, D. (2002). Between human and machine: Feedback, control and computing. Baltimore: John Hopkins University Press. Mitchell, T. (2006). The discipline of machine learning (Carnegie Mellon University White Paper). Mountain View, CA: Carnegie Mellon. Mitchell, T., Buchanan, B., DeJong, G., Dietterich, T., Rosenbloom, P., & Waibel, A. (1990). Machine learning. Annual Review of Computer Science, 4, 417–433. Motro, A., & Smets, P. (1997). Uncertainty management information systems: From needs to solutions. Dordrecht: Kluwer Academic. Mumford, L. (1934). Technics and civilization. New York: Harcourt Brace Jovanovich. Putnam, H. (1960). Minds and machines. In S. Hook (Ed.), Dimensions of mind: A symposium. New York: New York University Press. Quinlan, J. R. (1993). C4.5. programs for machine learning. San Francisco: Morgan Kaufman. Rao, S. S. (1996). Engineering optimization: Theory and practice. Chichester: Wiley. Reeves, C. R., & Rowe, J. E. (2003). Genetic algorithms – principles and perspectives: A guide to GA theory. Dordrecht: Kluwer Academic. Rosenblatt, F. (1958). The Perceptron: A probabilistic model for information storage and organization in the brain, Cornell Aeronautical Laboratory. Psychological Review, 65, 386–408. Rosenblueth, A., Wiener, N., & Bigelow, J. (1943). Behavior, purpose, and teleology. Philosophy of Science, 10, 18–24. Schaffer, J., Burch, N., Björnsson, Y., Kishimoto, A., Müller, M., Lake, R., Lu, P., & Sutphen, S. (2007). Checkers is solved. Science, 317, 1518–1522. Searle, J. (1980). Minds, brains and programs. Behavioral and Brain Sciences, 3, 417–457. Shannon, C. E., & Weaver, W. (1948). The mathematical theory of communication. Chicago: University of Illinois Press. Shevchenko, M., Windridge, D., & Kittler, J. (2009). A linear-complexity reparameterization strategy for the hierarchical bootstrapping of capabilities within perception-action architectures. Image and Vision Computing, 27, 1702–1717.
284
References Siddique, M. N. H., & Tokhi, M. O. (2001). Training neural networks: Backpropagation vs. genetic algorithms. IEEE International Joint Conference of Neural Networks, 4, 2673–2678. Simon, H. A. (1952). On the application of servomechanism theory in the study of production control. Econometrica, 20, 247–268. Sivashankar, N., & Sun, J. (1999). Development of model-based computeraided engine control systems. International Journal of Vehicle Design, 21, 325–343. Smart, J. J. C. (1961). Gödel’s theorem, Church’s theorem and Machinism. Synthese, 13, 105. Smets, P. (1997). Imperfect information: Imprecision and uncertainty. In A. Motro & P. Smets (Eds.), Uncertainty management information systems: From needs to solutions (pp. 225–254). Dordrecht: Kluwer Academic. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press. Tsao, Y., Xiao, K., & Soo, V. (2008, May). Graph Laplacian based transfer learning in reinforcement learning. In AAMAS ’08: Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (pp. 1349–1352), Estoril, Portugal. Turing, A. M. (1936). On computable numbers with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42, 230–265. Turing, A. (1950). Computing machinery and intelligence. Mind, 236, 433–460. Vivarelli, F., & Williams, C. (2001). Comparing Bayesian neural network algorithms for classifying segmented outdoor images. Neural Networks, 14, 427–437. Watkins, C. J. C. (1989). Learning from delayed rewards. Ph.D. dissertation, Cambridge University. Watkins, C. J. C., & Dayan, P. (1992). Q-Learning. Machine Learning, 8, 279–292. Weng, J., McClelland, J., Pentland, A., Sporns, O., Stockman, I., Sur, M., & Thelen, E. (2001). Autonomous mental development by robots and animals. Science, 291, 599–600. Wiener, N. (1930). Extrapolation, interpolation and smoothing of stationary time series with engineering applications. Cambridge, MA: MIT Press. Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. New York: Wiley.
285
References Wieslander, J., & Wittenmark, B. (1971). An approach to adaptive control using real time identification. Automatica, 7, 211–217. Williams, R. J., & Zipser, D. (1989). A learning algorithm for continually running fully recurrent neural networks. Neural Computation, 1, 270– 280. Wolpert, D. (1997). Computational approaches to motor control. Trends in Cognitive Science, 1, 209–216. Yager, R. (1991). Connectives and quantifiers in fuzzy sets. International Journal of Fuzzy Sets and Systems, 40, 39–76. Yu, A., & Dayan, P. (2005). Uncertainty, neuromodualtion and attention. Neuron, 46, 681–692. Zadeh, L. (1965). Fuzzy sets. Information and Control, 8, 338–353. Zhang, G. (2000). Neural network for classification: A survey. IEEE: Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 30, 451–462.
Chapter 5 Ammerman, M. (1998). The root cause analysis handbook. New York: Quality Resources. Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington, DC: Winston. Banbury, S. P., & Triesman, S. (2004). Cognitive approaches in situation awareness. Brookfield, VT: Ashgate. Bisantz, A. M., Kirlik, A., Gay, P., Phipps, D., Walker, N., & Fisk, A. D. (2000). Modeling and analysis of a dynamic judgment task using a lens model approach. IEEE Transactions on Systems, Man and Cybernetics – Part A: Systems and Humans, 30, 605–616. Boy, G. A. (1998). Cognitive function analysis. Stamford, CT: Ablex. Bucciarelli, I. L. (1994). Designing engineers. Cambridge, MA: MIT Press. Burns, C. M., & Vicente, K. J. (1995). A framework for understanding interdisciplinary interactions in design. In Proceedings of DIS ’95: Symposium on Designing Interactive Systems (pp. 97–103). New York: Association for Computing Machinery. Busby, J. S., & Chung, P. W. H. (2003). In what ways are designers’ and operators’ reasonable-world assumptions not reasonable assumptions? Transactions of Institution of Chemical Engineers, 81, 114–120.
286
References Busby, J. S., & Hibberd, R. E. (2002). Mutual misconceptions between designers and operators of hazardous systems. Research in Engineering Design, 13, 132–138. Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of humancomputer interaction. Mahwah, NJ: Erlbaum. Corning, P. A. (2007). Control information theory: The ‘missing link’ in the science of cybernetics. Systems Research and Behavioral Sciences, 24, 297–311. Degani, A. (2004). Taming HAL: Designing interfaces beyond 2001. New York: Palgrave Macmillan. Degani, J. A. S., Shafto, M., & Kirlik, A. (2006). What makes vicarious functioning work? Exploring the geometry of human-technology interaction. In A. Kirlik (Ed.), Adaptive perspectives on human-technology interaction (pp. 179–196). New York: Oxford University Press. Dekker, S. W. A. (2005). Ten questions about human error: A new view of human factors and system safety. Mahwah, NJ: Erlbaum. Dekker, S. W. A., & Woods, D. D. (2002). MABA-MABA or abracadabra? Progress on human–automation co-ordination. Cognition, Technology & Work, 4, 240–244. Dettmer, H. W. (1997). Goldratt’s theory of constraints: A systems approach to continuous improvement. Milwaukee, WI: ASQC Quality Press. Dörner, D. (1989). The logic of failure. New York: Henry Holt. Duncan, K. D. (1987). Fault diagnosis training for advanced continuous process installations. In J. Rasmussen, K. Duncan, & J. Leplat (Eds.), New technology and human error (pp. 209–221). New York: Wiley. Elmaghraby, S. E. (1977). Activity networks: Project planning and control by network models. New York: Wiley. Endsley, M. R. (1995). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65–84. Egeth, H. E., & Yantis, S. (1997). Visual attention: Control, representation, and time course. Annual Review of Psychology, 48, 269–297. Engh, T., Yow, A., & Walters, B. (1998). An evaluation of discrete event simulation for use in operator and crew performance evaluation (report to the Nuclear Regulatory Commission). Washington, DC: Nuclear Regulatory Commission. Fisher, D. L., & Goldstein, W. M. (1983). Stochastic PERT networks as models of cognition: Derivation of the mean variance, and variance, and distribution of reaction time using order-of-processing (OP) diagrams. Journal of Mathematical Psychology, 27, 121–155.
287
References Fisher, D. L., Schweickert, R., & Drury, C. G. (2006). Mathematical models in engineering psychology: Optimizing performance. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd ed., pp. 997–1024). Hoboken, NJ: Wiley. Fitts, P. M. (1951). Human engineering for an effective air navigation and traffic control system (Ohio State University Foundation report). Columbus: Ohio State University Foundation. Fuchs, A., & Jirsa, V. K. (2008). Coordination: Neural, behavioural and social dynamics. Berlin: Springer-Verlag. Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley. Green, F., Ashton, D., & Felstead, A. (2001). Estimating the determinants of supply of computing, problem-solving, communication, social and teamworking skills. Oxford Economic Papers, 3, 406–433. Hammer, W. (1985). Occupational safety management and engineering (3rd ed.). Englewood Cliffs, NJ: Prentice Hall. Hazen, H. L. (1934). Theory of servo-mechanisms. Journal of Franklin Institute, 218, 279–331. Hazen, H. L. (1941). The human being is a fundamental link in automatic control systems, 13 May 1941 (OSRD7 GP, memorandum between Harold Hazen and Warren Weaver, office files of Warren Weaver, Box 2). Washington, DC: Office of Scientific Research and Development. Hirschhorn, L. (1984). Beyond mechanization: Work and technology in a postindustrial age. Cambridge, MA: MIT Press. Hollnagel, E., & Woods, D. D. (1983). Cognitive systems engineering: New wine in new bottles. International Journal of Man–Machine Studies, 18, 583–600. Jax, S. A., Rosenbaum, D. A., Vaughan, J., & Meulenbroek, R. G. J. (2003). Computational motor control and human factors: Modeling movements in real and possible environment. Human Factors, 45, 5–27. John, B. E. (1990). Extensions of GOMS analyses to expert performance requiring perception of dynamic visual and auditory information. In Proceedings of CHI ’90 Conference on Human Factors in Computing Systems (pp. 107–115). New York: Association for Computing Machinery. Johnson, C. (2003). Failure in safety-critical systems: A handbook of accident and incident reporting. Glasgow: University of Glasgow Press. Johnson, W. G. (1980). MORT safety assurance systems. New York: Decker.
288
References Kelly, M. R. (1989). Alternative forms of work organization under programmable automation: The transformation of work? Skill, flexibility and the labour process. London: Unwin Hyman. Lawless, M. L., Laughery, K. R., & Persensky, J. J. (1995, August). Micro saint to predict performance in a nuclear power plant room: A test of validity and feasibility (NUREG/CR-6159). Washington, DC: Nuclear Regulatory Commission. Lipshitz, R., & Strauss, O. (1997). Coping with uncertainty: A naturalistic decision-making analysis. Organizational Behavior and Human Decision Processes, 69, 149–163. Macduffie, J. P. (1995). Human resource bundles and manufacturing performance: Organization logic and flexible production systems in the world auto industry. Industrial and Labor Relations Review, 48, 197–221. McKenzie, C. R. M. (2003). Rational models as theories – not standards – of behavior. Trends in Cognitive Science, 7, 403–406. Melnik, R. V. N. (2009). Coupling control and human factors in mathematical models of complex systems. Engineering Applications of Artificial Intelligence, 22, 351–362. Mindell, D. (2002). Between human and machine: Feedback, control and computing. Baltimore: John Hopkins University Press. Norman, D. A. (1981). Categorization of action slips. Psychological Review, 88, 1–15. Norman, D. A. (1992). Design principles for cognitive artifacts. Research and Engineering Design, 4, 43–50. Norris, G. (1995). Boeing’s seventh wonder. IEEE Spectrum, 32, 20–23. Parasuraman, R., & Riley, V. A. (1997). Humans and automation: Use, misuse, disuse, and abuse. Human Factors, 39, 230–253. Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books. Phillips, C. A. (2000). Human factors engineering. New York: Wiley. Rasmussen, J. (1987). The definition of human error and a taxonomy for technical system design. In J. Rasmussen, K. Duncan, & J. Leplat (Eds.), New technology and human error (pp. 23–30). Chichester: Wiley. Reason, J. (1990a). Human error. Cambridge: Cambridge University Press. Reason, J. (1990b). The contribution of latent human failures to the breakdown of complex systems. Philosophical Transactions of the Royal Society of London, 327, 475–484.
289
References Redmill, F., & Rajan, J. (1997). Human factors in safety-critical system. Oxford: Butterworth-Heinemann. Repperger, D. W. (2004). Adaptive displays and controllers using alternative feedback. CyberPsychology and Behavior, 7, 645–652. Rothrock, L., & Kirlik, A. (2003). Inferring rule-based strategies in dynamic judgment tasks: Towards a noncompensatory formulation of the lens model. IEEE Transactions on Systems, Man and Cybernetics – Part A: Systems and Humans, 33, 58–72. Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing: Vol. 1. Exploration in the microstructure of cognition. Cambridge, MA: MIT Press. Salvendy, G. (2006). Handbook of human factors and ergonomics (3rd ed.). Hoboken, NJ: Wiley. Sarter, N., & Woods, D. D. (1995). How in the world did we ever get into that mode? Mode error and awareness in supervisory control. Human Factors, 37, 5–19. Sarter, N., Woods, D., & Billings, C. E. (1997). Automation surprises. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (2nd ed., pp. 1926–1943). New York: Wiley. Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423, 623–656. Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. Urbana: University of Illinois Press. Sharit, J. (2006). Human error. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd ed., pp. 708–760). Hoboken, NJ: Wiley. Sheridan, T. B. (2000). Function allocation: Algorithm, alchemy or apostasy? International Journal of Human-Computer Studies, 52, 203– 216. Sheridan, T. B. (2002). Humans and automation: Systems design and research issues. Hoboken, NJ: Wiley. Sheridan, T. B. (2004). Driver distraction from a control theory perspective. Human Factors, 46, 587–599. Sheridan, T. B., & Parasuraman, R. (2006). Human-automation interaction. In R. Nickerson (Ed.), Proceedings of the Human Factors and Ergonomics Society 46th annual meeting. Santa Monica, CA: Human Factors and Ergonomics Society. Sherry, L., & Polson, P. G. (1999). Shared models of flight management system vertical guidance. International Journal of Aviation Psychology, 9, 139–153.
290
References Smith, K., & Hancock, P. A. (1995). Situation awareness is adaptive, externally direct consciousness. Human Factors, 37, 137–148. Smith, P. J., & Geddes, N. D. (2003). A cognitive systems engineering approach to the design of decision support systems. In J. A. Jacko & A. Sears (Eds.), The human-computer interaction handbook: Fundamentals, evolving technologies, and emerging applications. Mahwah, NJ: Erlbaum. Smith, V. (1997). New forms of work organization. Annual Review of Sociology, 23, 315–339. Spender, J. C. (1996). Making knowledge the basis of a dynamic theory of the firm. Strategic Management Journal, 17, 45–62. Stanovich, K. E. (1999). Who is rational? Studies of individual differences in reasoning. Mahwah, NJ: Erlbaum. Turing, A. (1950). Computing machinery and intelligence. Mind, 236, 433–460. Vicente, K. J. (1999). Cognitive work analysis: Towards safe, productive, and healthy computer-based work. Mahwah, NJ: Erlbaum. Vicente, K. J., and Rasmussen, J. (1992). Ecological interface design: theoretical foundations. IEEE Transactions on Systems, Man and Cybernetics – Part A: Systems and Humans, 22, 589–606. Wickens, C. (1984). Engineering psychology and human performance. Columbus, OH: Merrill. Wickens, C., & Dixon, S. R. (2007). The benefits of imperfect diagnostic automation: A synthesis of the literature. Theoretical Issues in Ergonomics Science, 8, 202–212. Wiener, N. (1930). Extrapolation, interpolation and smoothing of stationary time series with engineering applications. Cambridge, MA: MIT Press. Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. New York: Wiley. Wieslander, J., & Wittenmark, B. (1971). An approach to adaptive control using real time identification. Automatica, 7, 211–217. Woods, D. D. (1988). Coping with complexity: The psychology of human behavior in complex systems. In L. P. Goodstein, H. B. Anderson, & S. E. Olsen (Eds.), Task, errors, and mental models: A festschrift to celebrate the 60th of Professor Jens Rasmussen (pp. 169–188). San Diego, CA: Academic Press. Woods, D. D. (1996). Decomposing automation: Apparent simplicity, real complexity. In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and applications. London: Erlbaum.
291
References
Chapter 6 Appley, M. H. (1991). Motivation, equilibration, and stress. In R. A. Dienstbier (Ed.), Perspectives on motivation (pp. 1–67). Lincoln: University of Nebraska Press. Ashby, W. (1947). Principles of the self-organizing dynamic system. Journal of General Psychology, 37, 125–128. Austin, J., & Vancouver, J. (1996). Goal constructs in psychology: Structure, process, and content. Psychological Bulletin, 120, 338–375. Azjen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211. Bandura, A. (1977). Self-efficacy: Towards a unifying theory of behavioral change. Psychological Review, 84, 191–215. Bandura, A. (1989). Human agency in social cognitive theory. American Psychologist, 44, 1175–1184. Bandura, A. (1991). Social cognitive theory of self-regulation. Organizational Behavior and Human Decision Processes, 50, 248–287. Bandura, A. (2000). Exercise of human agency through collective efficacy. Current Directions in Psychological Science, 9, 75–78. Bandura, A. (2001). Social cognitive theory: An agentic perspective. Annual Review of Psychology, 52, 1–26. Bandura, A., & Cervone, D. (1986). Differential engagement of self-active influences in cognitive motivation. Organizational Behavior and Human Decision Processes, 38, 92–113. Bandura, A., & Locke, E. A. (2003). Negative self-efficacy and goal effects revisited. Journal of Applied Psychology, 88, 87–99. Bandura, A., & Wood, R. E. (1989). Effect of perceived controllability and performance standards on self-regulation of complex decision making. Journal of Personality and Social Psychology, 56, 805–814. Barker, V., & Barr, P. (2002). Linking top manager attributions to strategic reorientation in declining firms attempting turnarounds. Journal of Business Research, 55, 963–997. Barker, V., & Patterson, P. (1996). Top management team tenure and top manager causal attributions at declining firms attempting turnarounds. Group and Organization Management, 21, 304–336. Barker, V., Patterson, P., & Mueller, G. (2001). Organizational causes and strategic consequences of the extent of top management team replace-
292
References ment during turnaround attempts. Journal of Management Studies, 38, 235–269. Bouffard-Bouchard, T. (1990). Influence of self-efficacy on performance in a cognitive task. Journal of Social Psychology, 130, 353–363. Bunge, M. (1977). Emergence and the mind. Neuroscience, 2, 501–509. Campbell, N. K., & Hackett, G. (1986). The effects of mathematics task performance on math self-efficacy and task interest. Journal of Vocational Behavior, 28, 149–162. Cervone, D. (2000). Thinking about self-efficacy. Behavior Modification, 24, 30–56. Cervone, D., Jiwani, N., & Wood, R. (1991). Goal setting on the differential influence of self-regulatory processes on complex decision-making performance. Journal of Personality and Social Psychology, 61, 257–266. Chesney, A., & Locke, E. (1991). An examination of the relationship among goal difficulty, business strategies, and performance on a complex management simulation task. Academy of Management Journal, 34, 400–424. Crown, D. F., & Rosse, J. G. (1995). Yours, mine and ours: Facilitating group productivity through the integration of individual and group goals. Organizational Behavior and Human Decision Processes, 64, 138–150. David, N., Newen, A., & Vogeley, K. (2008). The ‘sense of agency’ and its underlying cognitive and neural mechanisms. Consciousness and Cognition, 17, 523–534. Dayan, P., & Niv, Y. (2008). Reinforcement learning: The good, the bad and the ugly. Current Opinion in Neurobiology, 18, 185–196. DeCharms, R. (1979). Personal causation and perceived control. In L. C. Perlmuter & R. A. Monty (Eds.), Choice and perceived control. Hillsdale, NJ: Erlbaum. Dehaene, S., Naccache, L., Cohen, L., Bihan, D., Mangin, J-F., Poline, J-P., & Rivière, D. (2001). Cerebral mechanisms of word masking and unconscious repetition priming. Nature Neuroscience, 4, 752–758. DeShon, R. P., & Alexander, R. A. (1996). Goal setting effects on implicit and explicit learning of complex tasks. Organizational Behavior and Human Decision Processes, 65, 18–36. Dienes, Z., & Perner, J. (1999). A theory of implicit and explicit knowledge. Behavioral and Brain Sciences, 22, 735–808. Earley, P. C., Connolly, T., & Ekegren, G. (1989). Goals, strategy development and task performance: Some limits on the efficacy of goal setting. Journal of Applied Psychology, 74, 24–33.
293
References Earley, P. C., Northcraft, G. B., Lee, C., & Lituchy, T. R. (1990). Impact of process and outcome feedback on the relation of goal setting to task performance. Academy of Management Journal, 33, 87–105. Eisenberg, L. (1995). The social construction of the human brain. American Journal of Psychiatry, 152, 1563–1575. Elliott, E. S., & Dweck, C. S. (1988). Goals: An approach to motivation and achievement. Journal of Personality and Social Psychology, 54, 5–12. Endres, M. L. (2006). The effectiveness of assigned goals in complex financial decision making and the importance of gender. Theory and Decision, 61, 129–157. Frayne, C. A., & Latham, G. P. (1987). Application of social learning theory to employee self-management of attendance. Journal of Applied Psychology, 72, 387–392. Gecas, V. (1989). The social psychology of self-efficacy. Annual Review of Sociology, 15, 291–316. Gist, M. E. (1989). The influence of training method on self-efficacy and idea generation among managers. Personal Psychology, 42, 787–805. Gist, M. E., & Mitchell, T. R. (1992). Self-efficacy: A theoretical analysis of its determinants and malleability. Academy of Management Review, 17, 183–211. Gist, M. E., Schwoerer, C., & Rosen, B. (1989). Effects of alternative training methods on self-efficacy and performance in computer software training. Journal of Applied Psychology, 74, 884–891. Glymour, C. (2004). We believe in freedom of the will so that we can learn. Behavioral and Brain Sciences, 27, 661–662. Haleblian, J., & Rajagopalan, N. (2006). A cognitive model of CEO dismissal: Understanding the influence of board perceptions, attributions and efficacy beliefs. Journal of Management Studies, 43, 1009–1026. Hill, T., Smith, N. D., & Mann, M. F. (1987). Role of efficacy expectations in predicting the decision to use advanced technologies. Journal of Applied Psychology, 72, 307–314. Hoc, J. M. (1993). Some dimensions of a cognitive typology of processcontrol situations. Ergonomics, 36, 1445–1455. Hogarth, R. M., Gibbs, B. J., McKenzie, C. R. M., & Marquis, M. A. (1991). Learning from feedback: Exactingness and incentives. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 734–752. Huber, V. (1985). Effects of task difficulty, goal setting, and strategy on performance of a heuristic task. Journal of Applied Psychology, 70, 492– 504.
294
References Huff, A., & Schwenk, C. R. (1990). Bias and sense-making in good times and bad. In A. S. Huff (Ed.), Mapping strategic thought (pp. 89–108). Chichester: Wiley. Inagaki, T. (2003). Automation and the cost of authority. International Journal of Industrial Ergonomics, 31, 169–174. Jacobs, B., Prentice-Dunn, S., & Rogers, R. W. (1984). Understanding persistence: An interface of control theory and self-efficacy theory. Basic and Applied Social Psychology, 5, 333–347. Jones, G. R. (1986). Socialization tactics, self-efficacy, and newcomers’ adjustment to organizations. Academy of Management Journal, 29, 262– 279. Kanfer, R., & Ackerman, P. L. (1989). Motivation and cognitive abilities: An integrative/aptitude-treatment interaction approach to skill acquisition [Monograph]. Journal of Applied Psychology, 74, 657–690. Kanfer, R., Ackerman, P. L., Murtha, T. C., Dugdale, B., & Nelson, L. (1994). Goals setting, conditions of practice, and task performance: A resource allocation perspective. Journal of Applied Psychology, 79, 826–835. Karoly, P. (1993). Mechanisms of self-regulation: A systems view. Annual Review of Psychology, 44, 23–52. Kelley, H. H. (1967). Attribution theory in social psychology. Nebraska Symposium on Motivation, 15, 192–238. Kelley, H. H., & Michela, J. L. (1980). Attribution theory and research. Annual Review of Psychology, 31, 457–501. Kirsch, I. (1985). Response expectancy as a determinant of experience and behavior. American Psychologist, 40, 1189–1202. Kren, L. (1992). The moderating effects of locus of control on performance incentives and participation. Human Relations, 45, 991–1012. Libet, B. (1985). Unconscious cerebral initiative and the role of conscious will in voluntary action. Behavioral and Brain Sciences, 8, 529–566. Litt, M. D. (1988). Self-efficacy and perceived control: Cognitive mediators of pain tolerance. Journal of Personality and Social Psychology, 54, 149–160. Locke, E. A. (1991). Goal theory vs. control theory: Contrasting approaches to understanding work motivation. Motivation and Emotion, 15, 9–28. Locke, E. A. (2000). Motivation, cognition, and action: An analysis of studies of task goals and knowledge. Applied Psychology: An International Review, 49, 408–429. Locke, E. A., & Latham, G. P. (1990). A theory of goal setting and task performance. Englewood Cliffs, NJ: Prentice Hall.
295
References Locke, E. A., & Latham, G. P. (2002). Building a practically useful theory of goal setting and task motivation. American Psychologist, 57, 705– 717. Locke, E. A., Shaw, K., Saari, L., & Latham, G. (1981). Goal setting and task performance: 1968–1980. Psychological Bulletin, 90, 125–152. Maddux, J. E. (1995). Self-efficacy theory: An introduction. In J. E. Maddux (Ed.), Self-efficacy, adaptation, and adjustment: Theory, research, and application (pp. 3–34). New York: Plenum. Maddux, J. E., Norton, L., & Stoltenberg, C. D. (1986). Self-efficacy expectancy, outcome expectancy, and outcome value: Relative influences on behavioral intentions. Journal of Personality and Social Psychology, 51, 783–789. Manning, M. M., & Wright, T. L. (1983). Self efficacy expectancies, outcome expectancies, and the perspective of pain control in childbirth. Journal of Personality and Social Psychology, 45, 421–431. Marx, K. (1963). Early writings (Ed. & Trans. T. B. Bottomore). New York: McGraw-Hill. (Originally published in 1844) Maynard, D., & Hakel, M. (1997). Effects of objective and subjective task complexity on performance. Human Performance, 10, 303–330. McClelland, D. C. (1975). Power: The inner experience. New York: Irvington. McClelland, D. C. (1985). How motives, skills, and values determine what people do. American Psychologist, 40, 812–825. Mead, G. H. (1934). Mind, self, and society. Chicago: University of Chicago Press. Mento, A. J., Steel, R. P., & Karren, R. J. (1987). A meta-analytic study of the effects of goal setting on task performance: 1966–1984. Organizational Behavior and Human Decision Processes, 39, 52–83. Mesch, D., Farh, J., & Podsakoff, P. (1994). The effects of feedback sign on group goal-setting, strategies, and performance. Group and Organizational Management, 19, 309–333. Moray, N., Inagaki, T., & Itoh, M. (2000). Adaptive automation, trust, and self-confidence in fault management in time-critical tasks. Journal of Experimental Psychology: Applied, 6, 45–58. Moritz, S. E., Feltz, D. L., Fahrbach, K. R., & Mach, D. E. (2000). The relation of self-efficacy measures to sport performance: A meta-analytic review. Research Quarterly for Exercise and Sport, 71, 280–294. Murayama, I. (1994). Role of agency in causal understanding of natural phenomena. Human Development, 37, 198–206.
296
References Neubert, M. (1998). The value of feedback and goal setting over goal setting along and potential moderators of this effect: A meta-analysis. Human Performance, 11, 321–335. Olson, J. M., Roese, N. J., & Zanna, M. P. (1996). Expectancies. In E. T. Higgins & A. W. Kurglanski (Eds.), Social psychology: Handbook of basic principles (pp. 211–238). New York: Guilford. Powers, W. T. (1973). Behavior: The control of perception. Chicago: Aldine. Powers, W. T. (1978). Quantitative analysis of purposive systems: Some spadework at the foundations of scientific psychology. Psychological Review, 85, 417–435. Powers, W. T. (1989). Living control systems. Gravel Switch, KY: Control Systems Group. Richardson, G. P. (1991). Feedback thought: In social science and systems theory. Philadelphia: University of Pennsylvania Press. Rotter, J. B. (1966). Generalized expectancies for internal versus external control of reinforcement. Psychological Monographs, 80, 1–28. Sexton, T. L., Tuckman, B. W., & Crehan, K. (1992). An investigation of the patterns of self-efficacy, outcome expectation, outcome value, and performance across trials. Cognitive Therapy and Research, 16, 329–348. Shebilske, W. L., Jordan, J. A., Goettl, B. P., & Paulus, L. E. (1998). Observation versus hands-on practice of complex skills in dyadic, triadic and tetradic training-teams. Human Factors, 40, 525–540. Shebilske, W. L., Regian, J. W., Arthur, W., Jr., & Jordan, J. A. (1992). A dyadic protocol for training complex skill. Human Factors, 34, 369–374. Soon, C. S., Brass, M., Heinze, J-C., & Hayes, J-D. (2008). Unconscious determinants of free decisions in the human brain. Nature Neuroscience, 11, 543–545. Spering, M., Wagener, D., & Funke, J. (2005). The role of emotions in complex problem solving. Cognition and Emotion, 19, 1252–1261. Stumpf, S. A., Brief, A. P., & Hartman, K. (1987). Self-efficacy expectations and coping with career-related events. Journal of Vocational Behavior, 31, 91–108. Sutton, R., & McClure, J. (2001). Covariation influences on goal-based explanations: An integrative model. Journal of Personality and Social Psychology, 80, 222–236. Thompson, S. C. (1991). Intervening to enhance perceptions of control. In C. R. Snyder & D. R. Forsyth (Eds.), Handbook of social and clinical psychology (pp. 607–623). New York: Pergamon.
References Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131. Vancouver, J. B. (1996). Living systems theory as a paradigm for organizational behaviour: Understanding humans. Behavioral Sciences, 41, 165–204. Vancouver, J. B. (2000). Self-regulation in organizational settings: A tale of two paradigms. In M. Boekaerts, P. Pintrich, & M. Zeidner (Eds.), Handbook of self-regulation. New York: Academic Press. Vancouver, J. B. (2005). The depth of history and explanation as benefit and bane for psychological control theories. Journal of Applied Psychology, 90, 38–52. Vancouver, J. B., & Kendall, L. (2006). When self-efficacy negatively relates to motivation and performance in a learning context. Journal of Applied Psychology, 91, 1146–1153. Vancouver, J. B., More, K. M., & Yoder, R. J. (2008). Self-efficacy and resource allocation: support for a nonmonotonic discontinuous model. Journal of Applied Psychology, 93, 35–47. Vancouver, J. B., & Putka, D. J. (2000). Analyzing goal-striving processes and a test of the generalizability of perceptual control theory. Organizational Behavior and Human Decision Processes, 82, 334–362. Vigoda-Gadot, E., & Angert, L. (2007). Goal setting theory, job feedback, and OCB: Lessons from a longitudinal study. Basic and Applied Social Psychology, 29, 119–128. Wagner, J. A., & Gooding, R. Z. (1997). Equivocal information and attribution: An investigation of patterns of managerial sensemaking. Strategic Management Journal, 18, 275–286. Wegner, D. M. (2004). Précis of The illusion of conscious will. Behavioral and Brain Sciences, 27, 649–659. Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. New York: Wiley. Williams, S. L., Kinney, P. J., & Falbo, J. (1989). Generalization of therapeutic changes in agoraphobia: The role of perceived self-efficacy. Journal of Consulting and Clinical Psychology, 57, 436–442. Wood, R. E., & Bandura, A. (1989). Impact of conceptions of ability on selfregulatory mechanisms and complex decision making. Journal of Personality and Social Psychology, 56, 407–415. Wood, R. E., & Locke, E. A. (1987). The relation of self-efficacy and grade goals to academic performance. Educational and Psychological Measurement, 47, 1013–1024.
Yeo, G. B., & Neal, A. (2006). An examination of the dynamic relationship between self-efficacy and performance across levels of analysis and levels of specificity. Journal of Applied Psychology, 91, 1088–1101.
Chapter 7 Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369–406. Bandura, A. (1989). Human agency in social cognitive theory. American Psychologist, 44, 1175–1184. Bandura, A. (2001). Social cognitive theory: An agentic perspective. Annual Review of Psychology, 52, 1–26. Bandura, A., & Locke, E. A. (2003). Negative self-efficacy and goal effects revisited. Journal of Applied Psychology, 88, 87–99. Barnes, G. R. (2008). Cognitive processes involved in smooth pursuit eye movements. Brain and Cognition, 68, 309–326. Berry, D. (1991). The role of action in implicit learning. Quarterly Journal of Experimental Psychology, 43, 881–906. Berry, D., & Broadbent, D. E. (1984). On the relationship between task performance and associated verbalizable knowledge. Quarterly Journal of Experimental Psychology, 36, 209–231. Berry, D., & Broadbent, D. E. (1987). The combination of implicit and explicit knowledge in task control. Psychological Research, 49, 7–15. Berry, D. C., & Broadbent, D. E. (1988). Interactive tasks and the implicit– explicit distinction. British Journal of Psychology, 79, 251–272. Björkman, M. (1971). Policy formation as a function of feedback in a nonmetric CPL-task. Umeå Psychological Reports, 49. Bredereke, J., & Lankenau, A. (2005). Safety-relevant mode confusions modelling and reducing them. Reliability Engineering and System Safety, 88, 229–245. Brehmer, B. (1992). Dynamic decision making: Human control of complex systems. Acta Psychologica, 81, 211–241. Broadbent, D. E. (1977). Levels, hierarchies and the locus of control. Quarterly Journal of Experimental Psychology, 32, 109–118. Broadbent, D., & Ashton, B. (1978). Human control of a simulated economic system. Ergonomics, 78, 1035–1043.
References Broadbent, D., Fitzgerald, P., & Broadbent, M. H. P. (1986). Implicit and explicit knowledge in the control of complex systems. British Journal of Psychology, 77, 33–50. Buchner, A., & Funke, J. (1993). Finite-state automata: Dynamic task environments in problem-solving research. Quarterly Journal of Experimental Psychology, 46, 83–118. Burns, B. D., & Vollmeyer, R. (2002). Goal specificity effects on hypothesis testing in problem solving. Quarterly Journal of Experimental Psychology, 55, 241–261. Busemeyer, J. R. (2002). Dynamic decision making. In N. J. Smelser & P. B. Bates (Eds.), International encyclopedia of the social and behavioral sciences: Methodology, mathematics and computer science (pp. 3903–3908). Oxford: Elsevier. Camp, G., Paas, F., Rikers, R., & Van Merriënboer, J. (2001). Dynamic problem selection in air traffic control training: A comparison between performance, mental effort and mental efficiency. Computers in Human Behavior, 17, 575–595. Campbell, D. (1988). Task complexity: A review and analysis. Academy of Management Review, 13, 40–52. Castellan, N. J., Jr. (1973). Multiple-cue probability learning with irrelevant cues. Organizational Behavior and Human Performance, 9, 16–29. Castellan, N. J., Jr. (1977). Decision making with multiple probabilistic cues. In N. J. Castellan, B. B. Pisoni, & G. R. Potts (Eds.), Cognitive theory, vol. 2. Hillsdale, NJ: Erlbaum. Castellan, N. J., Jr., & Edgell, S. E. (1973). An hypothesis generation model for judgment in non-metric multiple-cue probability learning. Journal of Mathematical Psychology, 10, 204–222. Chery, S., Vicente, K. J., & Farrell, P. (1999). Perceptual control theory and ecological interface design: Lessons learned from the CDU. In Proceedings of the Human Factors and Ergonomics Society 43rd annual meeting (pp. 389–393). Santa Monica, CA: Human Factors and Ergonomics Society. Chmiel, N., & Wall, T. (1994). Fault prevention, job design, and the adaptive control of advance manufacturing technology. Applied Psychology: An International Review, 43, 455–473. Cohen, M. S., Freeman, J. T., & Wolf, S. (1996). Meta-cognition in time stressed decision making: Recognizing, critiquing and correcting. Human Factors, 38, 206–219. Cooper, G., & Herskovits, E. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9, 309–347.
References Degani, A. (2004). Taming HAL: Designing interfaces beyond 2001. New York: Palgrave Macmillan. Diehl, E., & Sterman, J. D. (1995). Effects of feedback complexity on dynamic decision making. Organizational Behavior and Human Decision Processes, 62, 198–215. Dienes, Z., & Berry, D. (1997). Implicit learning: Below the subjective threshold. Psychonomic Bulletin and Review, 4, 3–23. Dienes, Z., & Fahey, R. (1995). Role of specific instances in controlling a dynamic system. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 848–862. Dienes, Z., & Fahey, R. (1998). The role of implicit memory in controlling a dynamic system. Quarterly Journal of Experimental Psychology, 51, 593–614. Dörner, D. (1975). Wie Menschen eine Welt verbessern wollten und sie dabei zerstörten [How people wanted to improve a world and in the process destroyed it]. Bild der Wissenschaft, 12 (Populärwissenschaftlicher Aufsatz), 48–53. Dörner, D. (1989). The logic of failure. New York: Henry Holt. Earley, P. C., Northcraft, G. B., Lee, C., & Lituchy, T. R. (1990). Impact of process and outcome feedback on the relation of goal setting to task performance. Academy of Management Journal, 33, 87–105. Edgell, S. E. (1974). Configural information processing in decision making. Indiana Mathematical Psychology Program, Report no. 74-4. Bloomington: Department of Psychology, Indiana University. Eitam, B., Hassan, R., & Schul, Y. (2008). Nonconscious goal pursuit in novel environments: The case of implicit learning. Psychological Science, 19, 261–267. Fitts, P. M., & Posner, M. I. (1967). Human performance. Belmont, CA: Brooks/Cole. Flanagan, J. R., Vetter, P., Johansson, R. S., & Wolpert, D. M. (2003). Prediction precedes control in motor learning. Current Biology, 13, 146–150. Funke, J. (2001). Dynamic systems as tools for analyzing human judgment. Thinking and Reasoning, 7, 69–89. Gardner, P. H., Chmiel, N., & Wall, T. (1996). Implicit knowledge and fault diagnosis in the control of advanced manufacturing technology. Behaviour & Information Technology, 15, 205–212. Gatfield, D. (1999). Can cognitive science improve the training of industrial process operators? Journal of Safety Research, 30, 133–142. Gawthrop, P., Lakie, M., & Loram, I. (2008). Predictive feedback control and Fitts’ law. Biological Cybernetics, 98, 229–238.
References Geddes, B. W., & Stevenson, R. J. (1997). Explicit learning of a dynamic system with a non-salient pattern. Quarterly Journal of Experimental Psychology, 50A, 742–765. Gibson, F. P. (2007). Learning and transfer in dynamic decision environments. Computational and Mathematical Organizational Theory, 13, 39–61. Giesler, B. R., Josephs, R. A., & Swann, W. B. (1996). Self-verification in clinical depression: The desire for negative evaluation. Journal of Abnormal Psychology, 105, 358–368. Gonzales, C., Lerch, F. J., & Lebiere, C. (2003). Instance-based learning in dynamic decision making. Cognitive Science, 27, 591–635. Gopnik, A., Glymour, C., Sobel, D. M., Schulz, L. E., Kushnir, T., & Danks, D. (2004). A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review, 111, 3–32. Grafton, S., Schmitt, P., Van Horn, J., & Diedrichsen, J. (2008). Neural substrates of visuomotor learning based on improved feedback control and prediction. NeuroImage, 39, 1383–1395. Hagmayer, Y., Meder, B., Osman, M., Mangold, S., & Lagnado, D. (In press). Spontaneous causal learning while controlling a dynamic system. Open Psychology Journal. Hill, R. J., Gordon, A., & Kim, J. (2004). Learning the lessons of leadership experience: Tools for interactive case method analysis. In Proceedings of the Twenty-Fourth Army Science Conference, Orlando, FL. Holzworth, R. J., & Doherty, M. E. (1976). Feedback effects in a metric multiple-cue probability learning task. Bulletin of the Psychonomic Society, 8, 1–3. Hommel, B. (1998). Perceiving one’s own action – and what it leads to. In J. S. Jordan (Ed.), Systems theories and a priori aspects of perception. (pp 145–178). Amsterdam: Elsevier Science. Howell, W. C., & Funaro, J. F. (1965). Prediction on the basis of conditional probabilities. Journal of Experimental Psychology, 69, 92–99. Jamieson, G. A., Miller, C. M., Ho, W. H., & Vicente, K. J. (2007). Integrating task- and work domain-based work analysis in ecological interface design: A process control case study. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 6, 887–905. Jonassen, D., & Hung, W. (2008). All problems are not equal: Implications for problem-based learning. Interdisciplinary Journal of Problem-Based Learning, 2, 6–28.
Jungermann, H., & Thüring, M. (1993). Causal knowledge and the expression of uncertainty. In G. Strube & K. F. Wender (Eds.), The cognitive psychology of knowledge (pp. 53–73). Amsterdam: Elsevier Science.
Kanfer, R., Ackerman, P. L., Murtha, T. C., Dugdale, B., & Nelson, L. (1994). Goal setting, conditions of practice, and task performance: A resource allocation perspective. Journal of Applied Psychology, 79, 826–835.
Kerstholt, J. H. (1996). The effect of information cost on strategy selection in dynamic tasks. Acta Psychologica, 94, 273–290.
Kerstholt, J. H., & Raaijmakers, J. G. W. (1997). Decision making in dynamic task environments. In R. Ranyard, R. W. Crozier, & O. Svenson (Eds.), Decision making: Cognitive models and explanations (pp. 205–217). New York: Routledge.
Klahr, D., & Dunbar, K. (1988). Dual space search during scientific reasoning. Cognitive Science, 12, 1–55.
Klein, G. (1997). Developing expertise in decision making. Thinking and Reasoning, 3, 337–352.
Knowlton, B. J., Mangels, J. A., & Squire, L. R. (1996). A neostriatal habit learning system in humans. Science, 273, 1399–1402.
Kornell, N., Son, L. K., & Terrace, H. S. (2007). Transfer of metacognitive skills and hint seeking in monkeys. Psychological Science, 18, 64–71.
Krueger, G. P. (1989). Sustained work, fatigue, sleep loss and performance: A review of the issues. Work & Stress, 3, 129–141.
Krynski, T., & Tenenbaum, J. (2007). The role of causality in judgment under uncertainty. Journal of Experimental Psychology: General, 136, 430–450.
Lagnado, D., Newell, B. R., Kahan, S., & Shanks, D. R. (2006). Insight and strategy in multiple cue learning. Journal of Experimental Psychology: General, 135, 162–183.
Lagnado, D., & Sloman, S. A. (2004). The advantage of timely intervention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 856–876.
Langley, P. A., & Morecroft, J. D. (2004). Decision aiding: Performance and learning in a simulation of oil industry dynamics. European Journal of Operational Research, 155, 715–732.
Lanzetta, J. T., & Driscoll, J. M. (1966). Preference for information about an uncertain but unavoidable outcome. Journal of Personality and Social Psychology, 3, 96–102.
Lee, Y. (1995). Effects of learning contexts on implicit and explicit learning. Memory and Cognition, 23, 723–734.
Lee, Y., & Vakoch, D. (1996). Transfer and retention of implicit and explicit learning. British Journal of Psychology, 87, 637–651.
Leigh, J. R. (1992). Control theory: A guided tour. London: Peter Peregrinus.
Lerch, F. J., & Harter, D. E. (2001). Cognitive support for real-time dynamic decision making. Information Systems Research, 12, 63–82.
Lichacz, F. (2005). Examining the effects of combined stressors on dynamic task performance. International Journal of Aviation Psychology, 15, 45–66.
Lipshitz, R., Klein, G., Orasanu, J., & Salas, E. (2001). Taking stock of naturalistic decision making. Journal of Behavioral Decision Making, 14, 331–352.
Lipshitz, R., & Strauss, O. (1997). Coping with uncertainty: A naturalistic decision-making analysis. Organizational Behavior and Human Decision Processes, 69, 149–163.
Locke, E. A. (1991). Goal theory vs. control theory: Contrasting approaches to understanding work motivation. Motivation and Emotion, 15, 9–28.
Locke, E. A. (2000). Motivation, cognition, and action: An analysis of studies of task goals and knowledge. Applied Psychology: An International Review, 49, 408–429.
Marescaux, P-J., Luc, F., & Karnas, G. (1989). Modes d’apprentissage sélectif et nonsélectif et connaissances acquises au contrôle d’un processus: Évaluation d’un modèle simulé [Selective and nonselective learning modes and acquiring knowledge of process control: Evaluation of a simulation model]. Cahiers de Psychologie Cognitive, 9, 239–264.
Meder, B., Gersteberg, T., Hagmayer, Y., & Waldmann, M. R. (In press). Observing and intervening: Rational and heuristic models of causal decision making. Open Psychology Journal.
Meder, B., Hagmayer, Y., & Waldmann, M. R. (2008). Inferring interventional predictions from observational learning data. Psychonomic Bulletin & Review, 15, 75–80.
Moxnes, E. (2000). Not only the tragedy of the commons: Misperceptions of feedback and policies for sustainable development. System Dynamics Review, 16, 325–348.
Muchinsky, P. M., & Dudycha, A. L. (1975). Human inference behavior in abstract and meaningful environments. Organizational Behavior and Human Performance, 13, 377–391.
Nelson, T. O. (1996). Consciousness and metacognition. American Psychologist, 51, 102–116.
Orasanu, J., & Connolly, T. (1993). The reinvention of decision making. In G. Klein, J. Orasanu, & R. Calderwood (Eds.), Decision making in action: Models and methods (pp. 3–20). Norwood, NJ: Ablex.
Osman, M. (2004). An evaluation of dual process theories of reasoning. Psychonomic Bulletin & Review, 11, 998–1010.
Osman, M. (2008a). Evidence for positive transfer and negative transfer/anti-learning of problem solving skills. Journal of Experimental Psychology: General, 137, 97–115.
Osman, M. (2008b). Observation can be as effective as action in problem solving. Cognitive Science, 32, 162–183.
Osman, M. (2008c). Seeing is as good as doing. Journal of Problem Solving, 2(1), art. 3.
Osman, M. (2010). Controlling uncertainty: A review of human behavior in complex dynamic environments. Psychological Bulletin, 136, 65–86.
Osman, M., & Stavy, R. (2006). Intuitive rules: From formative to developed reasoning. Psychonomic Bulletin & Review, 13, 935–953.
Osman, M., Wilkinson, L., Beigi, M., Parvez, C., & Jahanshahi, M. (2008). The striatum and learning to control a complex system? Neuropsychologia, 46, 2355–2363.
Pacherie, E. (2008). The phenomenology of action: A conceptual framework. Cognition, 107, 179–217.
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge: Cambridge University Press.
Poldrack, R., Clark, J., Paré-Blagoev, E. J., Shohamy, D., Creso Moyano, J., Myers, C., & Gluck, M. (2001). Interactive memory systems in the human brain. Nature, 414, 546–550.
Price, A. (2009). Distinguishing the contributions of implicit and explicit processes to performance on the weather prediction task. Memory and Cognition, 37, 210–222.
Quesada, J., Kintsch, W., & Gonzales, E. (2005). Complex problem solving: A field in search of a definition? Theoretical Issues in Ergonomics Science, 6(1), 5–33.
Randel, J. M., Pugh, L., & Reed, S. K. (1996). Differences in expert and novice situation awareness in naturalistic decision making. International Journal of Human Computer Studies, 45, 579–597.
Renkl, A. (1997). Learning from worked-out examples: A study on individual differences. Cognitive Science, 21, 1–29.
Rothstein, H. G. (1986). The effects of time pressure on judgment in multiple cue probability learning. Organizational Behavior and Human Decision Processes, 37, 83–92.
Sanderson, P. M. (1989). Verbalizable knowledge and skilled task performance: Association, dissociation, and mental models. Journal of Experimental Psychology: Learning, Memory, & Cognition, 15, 729–747.
Sarter, N. B., Mumaw, R. J., & Wickens, C. D. (2007). Pilots’ monitoring strategies and performance on automated flight decks: An empirical study combining behavioral and eye-tracking data. Human Factors, 49, 347–357.
Sauer, J., Burkolter, D., Kluge, A., Ritzmann, S., & Schüler, K. (2008). The effects of heuristic rule training on operator performance in a simulated process control environment. Ergonomics, 7, 953–967.
Schoppek, W. (2002). Examples, rules and strategies in the control of dynamic systems. Cognitive Science Quarterly, 2, 63–92.
Shanks, D. R., & St John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral & Brain Sciences, 17, 367–447.
Simon, H. A., & Lea, G. (1974). Problem solving and rule induction: A unified view. In L. W. Gregg (Ed.), Knowledge and cognition (pp. 105–127). Hillsdale, NJ: Lawrence Erlbaum Associates.
Skinner, B. F. (1953). Science and human behavior. New York: Free Press.
Sloman, S. A., & Lagnado, D. A. (2005). Do we ‘do’? Cognitive Science, 29, 5–39.
Slovic, P., Rorer, L. G., & Hoffman, P. J. (1971). Analyzing use of diagnostic signs. Investigative Radiology, 6, 18–26.
Speekenbrink, M., Channon, S., & Shanks, D. R. (2008). Learning strategies in amnesia. Neuroscience & Biobehavioral Reviews, 32, 292–310.
Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. New York: Springer-Verlag.
Stanley, W. B., Mathews, R. C., Buss, R. R., & Kotler-Cope, S. (1989). Insight without awareness: On the interaction of verbalization, instruction, and practice in a simulated process control task. Quarterly Journal of Experimental Psychology, 41, 553–577.
Stavy, R., & Tirosh, D. (1996). Intuitive rules in science and mathematics: The case of ‘more of A more of B’. International Journal of Science Education, 18, 653–667.
Stavy, R., & Tirosh, D. (2000). How students (mis-)understand science and mathematics: Intuitive rules. New York: Teachers College Press.
References Sterman, J. D. (1989). Misperceptions of feedback in dynamic decision making. Organizational Behavior & Human Decision Processes, 43, 301–335. Sterman, J. D. (1994). Learning in and about complex systems. System Dynamics Review, 10, 291–330. Sterman, J. D. (2002). All models are wrong: Reflections on becoming a systems scientist. System Dynamics Review, 18, 501–531. Sternberg, R. J. (1989). Domain-generality versus domain-specificity: The life and impending death of a false dichotomy. Merrill-Palmer Quarterly, 35, 115–130. Steyvers, M., Tenenbaum, J. B., Wagenmakers, E. J., & Blum, B. (2003). Inferring causal networks from observations and interventions. Cognitive Science, 27, 453–489. Strohschneider, S., & Guss, D. (1999). The fate of the MOROS: A crosscultural exploration of strategies in complex and dynamic decision making. International Journal of Psychology, 34, 235–252. Sun, R., Merrill, E., & Peterson, T. (2001). From implicit skills to explicit knowledge: A bottom-up model of skill learning. Cognitive Science, 25, 203–244. Sun, R., Slusarz, P., & Terry, C. (2005). The interaction of the explicit and the implicit in skill learning: A dual-process approach. Psychological Review, 112, 159–192. Sun, R., Zhang, X., Slusarz, P., & Mathews, R. (2007). The interaction of implicit learning, explicit hypothesis testing learning and implicit-toexplicit knowledge extraction. Neural Networks, 20, 34–47. Sweller, J. (1988). Cognitive load during problem solving: Effects of learning. Cognitive Science, 12, 257–285. Tenenbaum, J. B., & Griffiths, T. L. (2001). Generalization, similarity, and Bayesian inference. Behavioral and Brain Sciences, 24, 629–641. Tenenbaum, J. B., Griffiths, T. L., & Kemp, C. (2006). Theory-based Bayesian models for inductive learning and reasoning. Trends in Cognitive Sciences, 10(7), 309–318. Toda, M. (1962). The design of the fungus eater: A model of human behavior in an unsophisticated environment. Behavioral Science, 7, 164–183. Trumpower, D. L., Goldsmith, T. E., & Guynn, M. (2004). Goal specificity and knowledge acquisition in statistics problem solving: Evidence for attentional focus. Memory & Cognition, 32, 1379–1388. Vicente, K. J. (2002). Ecological interface design: Process and challenges. Human Factors, 44, 62–78.
Vicente, K. J., & Wang, J. H. (1998). An ecological theory of expertise effects in memory recall. Psychological Review, 105, 33–57.
Vlek, C. A. J., & van der Heijden, L. H. C. (1970). Aspects of suboptimality in a multidimensional probabilistic information processing task. Acta Psychologica, 34, 300–310.
Vollmeyer, R., Burns, B. D., & Holyoak, K. J. (1996). The impact of goal specificity and systematicity of strategies on the acquisition of problem structure. Cognitive Science, 20, 75–100.
Wilkinson, L., Lagnado, D. A., Quallo, M., & Jahanshahi, M. (2008). The effect of feedback on non-motor probabilistic classification learning in Parkinson’s disease. Neuropsychologia, 46, 2683–2695.
Witt, K., Daniels, C., Daniel, V., Schmitt-Eliassen, J., Volkmann, J., & Deuschl, G. (2006). Patients with Parkinson’s disease learn to control complex systems: An indication for intact implicit cognitive skill learning. Neuropsychologia, 44, 2445–2451.
Chapter 8 Banks, J., Olson, M., & Porter, D. (1997). An experimental analysis of the bandit problem. Economic Theory, 10, 55–77. Behrens, T. E., Woolrich, M. W., Walton, M. E., & Rushworth, M. F. (2007). Learning the value of information in an uncertain world. Nature Neuroscience, 10, 1214–1221. Blaisdell, A. P., Sawa, K., Leising, K., & Waldmann, M. (2006). Causal reasoning in rats. Science, 311, 1020–1022. Boles, T. L., & Messick, D. M. (1995). A reverse outcome bias: The influence of multiple reference points on the evaluation of outcomes and decisions. Organizational Behavior and Human Decision Processes, 61, 262–275. Boorman, E. D., Behrens, T. E., Woolrich, M. W., & Rushworth, M. F. (2009). How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron, 62, 733–743. Botvinick, M. M., Cohen, J. D., & Carter, C. S. (2004). Conflict monitoring and anterior cingulate cortex: An update. Trends in Cognitive Science, 8, 529–546. Brezzi, M., & Lai, T. L. (2002). Optimal learning and experimentation in bandit problems. Journal of Economic Dynamics & Control, 27, 87–108.
References Bush, G., Luu, P., & Posner, M. I. (2000). Cognitive and emotional influences in anterior cingulate cortex. Trends in Cognitive Science, 4, 215–222. Camerer, C., Loewenstein, G., & Prelec, D. (2004). Neuroeconomics: Why economics needs brains. Scandinavian Journal of Economics, 106, 555–579. Camerer, C., Loewenstein, G., & Prelec, D. (2005). Neuroeconomics: How neuroscience can inform economics. Journal of Economic Literature, 43, 9–64. Clark, L., Lawrence, A., Astley-Jones, F., & Gray, N. (2009). Gambling nearmisses enhance motivation to gamble and recruit win-related brain circuitry. Neuron, 61, 481–490. Cohen, M. (2006). Individual differences and the neural representations of reward expectancy and reward error prediction. Scan, 2, 20–30. Cohen, M., Elger, C. E., & Ranganath, C. (2007). Reward expectation modulates feedback-related negativity and EEG spectra. NeuroImage, 35, 968–978. Cohen, M., Elger, C., & Weber, B. (2008). Amygdala tractography predicts functional connectivity and learning during feedback-guided decision making. NeuroImage, 39, 1396–1407. Cohen, M., McClure, S. M., & Yu, A. (2007). Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philosophical Transactions of the Royal Society, 362, 933–942. Coltheart, M. (2006). What has functional neuroimaging told us about the mind (so far)? Cortex, 42, 323–331. Critchley, H. D., Mathias, C. J., & Dolan, R. J. (2001). Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron, 29, 537–545. Dam, G., & Körding, K. (2009). Exploration and exploitation during sequential search. Cognitive Science, 33, 530–541. Daw, N. D., Niv, L., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8, 1704–1711. Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441, 876–879. Dayan, P., Kakade, S., & Montague, P. R. (2000). Learning and selective attention. Nature Neuroscience, 3, 1218–1223. Dayan, P., & Niv, Y. (2008). Reinforcement learning: The good, the bad and the ugly. Current Opinion in Neurobiology, 18, 185–196.
References Devinsky, O., Morrell, M. J., & Vogt, B. A. (1995). Contributions of anterior cingulate cortex to behavior. Brain, 118, 279–303. Doya, K. (2008). Modulators of decision making. Nature Neuroscience, 11, 410–416. Fuster, J. M. (2003). Cortex and mind. Oxford: Oxford University Press. Gallistel, C. R., Mark, T. A., King, A. P., & Latham, P. E. (2001). The rat approximates an ideal detector of changes in rates of reward: Implications for the law of effect. Journal of Experimental Psychology: Animal Behavioral Processes, 27, 354–372. Getting, P. (1989). Emergent principles governing the operation of neural networks. Annual Review of Neuroscience, 12, 185–204. Gigerenzer, G., & Todd, P. (1999). Simple heuristics that make us smart. New York: Oxford University Press. Glimcher, P. (2003). Decisions, uncertainty, and the brain: The science of neuroeconomics. Cambridge, MA: MIT Press. Glimcher, P., & Rustichini, A. (2004). Neuroeconomics: The consilience of brain and decision. Science, 306, 447–452. Henson, R. (2005). What can functional neuroimaging tell the experimental psychologist? Quarterly Journal of Experimental Psychology, 58A, 193–233. Henson, R. (2006). What has (neuro)psychology told us about the mind (so far)? Cortex, 42, 387–392. Jocham, G., Neumann, J., Klein, T., Danielmeier, C., & Ullsperger, M. (2009). Adaptive coding of action values in the human rostral cingulate zone. Journal of Neuroscience, 29, 7489–7496. Kable, J., & Glimcher, P. (2007). The neural correlates of subjective value during intertemporal choice. Nature Neuroscience, 10, 1625–1633. Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgement. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgement (pp. 49–81). New York: Cambridge University Press. Kheramin, S., Body, S., Ho, M. Y., Velazquez-Martinez, D. N., Bradshaw, C. M., Szabadi, E., Deakin, J. F. W., & Anderson, I. M. (2004). Effects of orbital prefrontal cortex dopamine depletion on inter-temporal choice: A quantitative analysis. Psychopharmacology, 175, 206–214. Koechlin, E., & Summerfield, C. (2007). An information theoretical approach to prefrontal executive function. Trends in Cognitive Science, 11, 229–235. Langer, E. J. (1975). The illusion of control. Journal of Personality and Social Psychology, 31, 311–328.
References Lennie, P. (2003). The cost of cortical computation. Current Biology, 13, 493–497. Luu, P., Shane, M., Pratt, N., & Tucker, D. (2008). Corticolimbic mechanisms in the control of trial and error learning. Brain Research, 1247, 100–113. MacDonald, A. W., Cohen, J. D., Stenger, A., & Carter, C. S. (2000). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science, 288, 1835–1838. Madden, G. J., Dake, J. M., Mauel, E. C., & Rowe, R. R. (2005). Labor supply and consumption of food in a closed economy under a range of fixedand random-ratio schedules: Tests of unit price. Journal of the Experimental Analysis of Behavior, 83, 99–118. Madden, G. J., Ewan, E. E., & Lagorio, C. H. (2007). Toward an animal model of gambling: Delay discounting and the allure of unpredictable outcomes. Journal of Gambling Studies, 23, 63–83. Marco-Pallarés, J., Müller, A. V., & Münte, T. (2007). Learning by doing: An fMRI study of feedback-related brain activations. NeuroReport, 14, 1423–1426. Marcus, G. (2008). Kluge: The haphazard construction of the mind. Boston: Houghton-Mifflin. Martinez, F., Bonnefon, J-F., & Hoskens, J. (2009). Active involvement, not illusory control, increases risk taking in a gambling game. Quarterly Journal of Experimental Psychology, 62, 1063–1071. Matsumoto, K., Suzuki, W., & Tanaka, K. (2003). Neuronal correlates of goal-based motor selection in the prefrontal cortex. Science, 301, 229–232. McClure, S. M., Laibson, D. I., Loewenstein, G., & Cohen, J. D. (2004). Separate neural systems value immediate and delayed monetary rewards. Science, 306, 503–507. Miller, E. K., & Buschman, T. J. (2006). Bootstrapping your brain: How interactions between the frontal cortex and basal ganglia may produce organized actions and lofty thoughts. In J. L. Martinez & R. P. Kesner (Eds.), Neurobiology of learning and memory (2nd ed.). Burlington, MA: Academic Press. Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortical function. Annual Review of Neuroscience, 24, 167–202. Montague, P. R., Hyman, S. E., & Cohen, J. D. (2004). Computational roles for dopamine in behavioural control. Nature, 431, 760–767.
Morsella, E. (2009). The mechanisms of human action: Introduction and background. In E. Morsella, J. A. Bargh, & P. M. Gollwitzer (Eds.), Oxford handbook of human action. Oxford: Oxford University Press.
Nygren, T. E., Isen, A. M., Taylor, P. J., & Dulin, J. (1996). The influence of positive affect on the decision rule in risk situations: Focus on outcome (and especially avoidance of loss) rather than probability. Organizational Behavior and Human Decision Processes, 66, 59–72.
O’Doherty, J. (2004). Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776.
O’Doherty, J., Kringelbach, M. L., Rolls, E. T., Hornak, J., & Andrews, C. (2001). Abstract reward and punishment representations in the human orbitofrontal cortex. Nature Neuroscience, 4, 95–102.
Padoa-Schioppa, C., & Assad, J. (2006). Neurons in the orbitofrontal cortex encode economic value. Nature, 441, 223–226.
Philiastides, M. G., Ratcliff, R., & Sajda, P. (2006). Neural representations of task difficulty and decision making during perceptual categorization: A timing diagram. Journal of Neuroscience, 26, 8965–8975.
Platt, M. L., & Glimcher, P. (1999). Neural correlates of decision variables in the parietal cortex. Nature, 400, 233–238.
Ratcliff, R., Philiastides, M. G., & Sajda, P. (2009). Quality of evidence for perceptual decision making is indexed by trial-to-trial variability of the EEG. Proceedings of the National Academy of Sciences of the United States of America, 106, 6539–6544.
Redish, A. D., Jensen, S., Johnson, A., & Kurth-Nelson, Z. (2007). Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling. Psychological Review, 114, 784–805.
Robbins, T. W. (2000). Chemical neuromodulation of frontal-executive functions in humans and other animals. Experimental Brain Research, 133, 130–138.
Rodriguez, P. F. (2009). Stimulus-outcome learnability differentially activates anterior cingulate and hippocampus at feedback. Learning and Memory, 16, 324–331.
Rustichini, A. (2005). Neuroeconomics: Past and future. Games and Economic Behavior, 52, 201–212.
Sailer, U., Robinson, S., Fischmeister, F., Moser, E., Kryspin-Exner, I., & Bauer, H. (2007). Imaging the changing role of feedback during learning in decision making. NeuroImage, 37, 1474–1486.
References Schultz, W. (2006). Behavioral theories and the neurophysiology of reward. Annual Review of Psychology, 57, 87–115. Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275, 1593–1599. Schultz, W., Preuschoff, K., Camerer, C., Hsu, M., Fiorillo, C., Tobler, P., & Bossaerts, P. (2008). Explicit neural signals reflect reward uncertainty. Philosophical Transactions of the Royal Society B, 363, 3801– 3811. Segalowitz, S. J., & Dywan, J. (2009). Individual differences and developmental change in the ERN response: Implications for models of ACC function. Psychological Research, 73, 857–870. Seymour, B., Daw, N., Dayan, P., Singer, T., & Dolan, R. (2007). Differential encoding of losses and gains in the human striatum. Journal of Neuroscience, 27, 4826–4831. Shea, N., Krug, K., & Tobler, P. (2008). Conceptual representations in goaldirected decision making. Cognitive, Affective, & Behavioral Neuroscience, 8, 418–428. Stanovich, K. E. (1999). Who is rational? Studies of individual differences in reasoning. Mahwah, NJ: Erlbaum. Stanovich, K. E., & West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate. Behavioral and Brain Sciences, 23, 645–726. Steyvers, M., Lee, M., & Wagenmakers, E-J. (2009). A Bayesian analysis of human decision making on bandit problems. Journal of Mathematical Psychology, 53, 168–179. Tobler, P., O’Doherty, J. P., Dolan, R., & Schultz, W. (2007). Reward value coding distinct from risk attitude-related uncertainty coding in human reward system. Journal of Neurophysiology, 97, 1621–1632. Tremblay, L., & Schultz, W. (2000). Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. Journal of Neurophysiology, 83, 1864–1876. Tullock, G. (1971). Biological externalities. Journal of Theoretical Biology, 33, 379–392. Ullsperger, M., & von Cramon, D. Y. (2003). Error monitoring using external feedback: Specific roles of the habenular complex, the reward system and the cingulate motor area revealed by functional magnetic resonance imaging. Journal of Neuroscience, 23, 4308–4314. Uttal, W. R. (2001). The new phrenology: The limits of localizing cognitive processes in the brain. Cambridge, MA: MIT Press.
References Vromen, J. (2007). Neuroeconomics as a natural extension of bioeconomics: The shifting scope of standard economic theory. Journal of Bioeconomics, 9, 145–167. Walton, M. E., Devlin, J. T., & Rushworth, M. F. (2004). Interactions between decision making and performance monitoring within prefrontal cortex. Nature Neuroscience, 3, 502–508. Watanabe, M. (1996). Reward expectancy in primate prefrontal neurons. Nature, 382, 629–632. Watanabe, M. (2007). Role of anticipated reward in cognitive behavioral control. Current Opinion in Neurobiology, 17, 213–279. Watanabe, M., Hikosaka, K., Sakagami, M., & Shirakawa, S. (2002). Coding and monitoring of motivational context in the primate prefrontal cortex. Journal of Neuroscience, 22, 2391–2400. Watanabe, M., Hikosaka, K., Sakagami, M., & Shirakawa, S. (2005). Functional significance of delay-period activity of primate prefrontal neurons in relation to spatial working memory and reward/omissionof reward expectancy. Experimental Brain Research, 166, 263–276. Winter, S., Dieckmann, M., & Schwabe, K. (2009). Dopamine in the prefrontal cortex regulates rats’ behavioral flexibility to changing reward value. Behavioral Brain Research, 198, 206–213. Yu, A. (2007). Adaptive behavior: Humans act as Bayesian learners. Current Biology, 17, 977–980. Yu, A., & Dayan, P. (2005). Uncertainty, neuromodulation and attention. Neuron, 46, 681–692.
Chapter 9
Abbeel, P., Quigley, M., & Ng, A. Y. (2006). Using inaccurate models in reinforcement learning. Paper presented at the International Conference on Machine Learning (ICML), Pittsburgh, PA.
Agirre-Basurko, E., Ibarra-Berastegi, G., & Madariaga, I. (2006). Regression and multilayer perceptron-based models to forecast hourly O3 and NO2 levels in the Bilbao area. Environmental Modelling Software, 21, 430–446.
Ashby, W. (1947). Principles of the self-organizing dynamic system. Journal of General Psychology, 37, 125–128.
Barto, A. G. (1994). Reinforcement learning control. Current Opinion in Neurobiology, 4, 888–893.
References Barto, A. G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems-Theory and Applications, 13, 41–77. Castanon, L. E. G., Oritz, R. J. C., Morales-Menendez, R., & Ramirez, R. (2005). A fault detection approach based on machine learning models. MICAI 2005: Advances in Artificial Intelligence, 3789, 583–592. Chau, K. W. (2006). Particle swarm optimization training algorithm for ANNs in stage prediction of Shing Mun River. Journal of Hydrology, 329, 363–367. Chen, S., Jakeman, A., & Norton, J. (2008). Artificial intelligence techniques: An introduction to their use for modeling environmental systems. Mathematics and Computers in Simulation, 78, 379–400. Dayan, P., & Daw, N. (2008). Connections between computational and neurobiological perspectives on decision making. Cognitive, Affective, & Behavioural Neuroscience, 8, 429–453. Griffiths, T., & Tenenbaum, K. B. (2005). Structure and strength in causal induction. Cognitive Psychology, 51, 334–384. Hooker, C. A., Penfold, H. B., & Evans, R. J. (1992). Control, connectionism and cognition: Towards a new regulatory paradigm. British Journal for the Philosophy of Science, 43, 517–536. Jain, A. K., Mao, J., & Mohiuddin, K. (1996). Artificial neural networks: A tutorial. IEEE Computation, 29, 31–44. Kaelbling, L., Littman, M., & Moore, A. (1996). Reinforcement learning: A survey. Journal in Artificial Intelligence Research, 4, 237–285. Kim, G., & Barros, A. P. (2001). Quantitative flood forecasting using multisensor data and neural networks. Journal of Hydrology, 246, 5–62. Konidaris, G., & Barto, A. G. (2006). An adaptive robot motivational system. Proceedings of the 9th International Conference on Simulation of Adaptive Behaviour, Italian National Research Council, Rome, Italy, from Animals to Animats 9, 4095, 346–356. Linsker, R. (2008). Neural network learning of optimal Kalman prediction and control. Neural Networks, 21, 1328–1343. Malikopoulos, A., Papalmbros, P., & Assanis, D. (2009). A real-time computational learning model for sequential decision-making problems under uncertainty. Journal of Dynamic Systems, Measurement, and Control, 131, 041010. McKenzie, C. R. M. (2003). Rational models as theories – not standards – of behaviour. Trends in Cognitive Science, 7, 403–406.
References McKenzie, C. R. M., & Mikkelsen, L. A. (2006). A Bayesian view of covariation assessment. Cognitive Psychology, 54, 33–61. Nelson, J. D. (2005). Finding useful questions: On Bayesian diagnosticity, probability, impact, and information gain. Psychological Review, 112, 979–999. Onkal-Engin, G., Demir, I., & Engin, S. N. (2005). Determination of the relationship between sewage odour and BOD by neural networks. Environmental Modelling Software, 20, 843–850. Osman, M. (2008). Evidence for positive transfer and negative transfer/antilearning of problem solving skills. Journal of Experimental Psychology: General, 137, 97–115. Osman, M. (2010). Controlling uncertainty: A review of human behavior in complex dynamic environments. Psychological Bulletin, 136, 65–86. Santiago Barros, M. S., & Rodrigues, V. (1994). Nonlinear aspects of data integration for land-cover classification in a neural network environment. Advances in Space Research, 14, 265–268. Sheridan, T. B. (2000). Function allocation: Algorithm, alchemy or apostasy? International Journal of Human-Computer Studies, 52, 203–216. Sheridan, T. B. (2002). Humans and automation: Systems design and research issues. New York: Wiley. Sheridan, T. B., & Parasuraman, R. (2006). Human-automation interaction. In R. Nickerson (Ed.), Proceedings of the Human Factors and Ergonomics Society 46th annual meeting. Santa Monica, CA: Human Factors and Ergonomics Society. Sloman, A. (1993). The mind as a control system. In C. Hookway & D. Peterson (Eds.), Philosophy and the cognitive sciences (pp. 69–110). Cambridge: Cambridge University Press. Sloman, A. (1994). Semantics in an intelligent control system. Philosophical Transactions of the Royal Society: Physical Sciences and Engineering, 349, 43–58. Sloman, A. (1999). What sort of architecture is required for a human-like agent? In M. Wooldridge & A. Rao (Eds.), Foundations of rational agency (pp. 35–52). Dordrecht: Kluwer Academic. Sloman, A. (2008). The well-designed young mathematician. AIJ, 172, 2015– 2034. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press. Tenenbaum, J. B., Griffiths, T. L., & Kemp, C. (2006). Theory-based Bayesian models for inductive learning and reasoning. Trends in Cognitive Sciences, 10, 309–318.
Vellido, A., Marti, E., Comas, J., Rodriguez-Roda, I., & Sabater, F. (2007). Exploring the ecological status of human altered streams through generative topographic mapping. Environmental Modelling Software, 22, 1053–1065.
Vromen, J. (2007). Neuroeconomics as a natural extension of bioeconomics: The shifting scope of standard economic theory. Journal of Bioeconomics, 9, 145–167.
Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8, 279–292.
Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. New York: Wiley.
Wörgötter, F., & Porr, B. (2005). Temporal sequence learning, prediction, and control: A review of different models and their relation to biological mechanisms. Neural Computation, 17, 245–319.
Zhang, Q., & Stanley, S. J. (1997). Forecasting raw-water quality parameters for the North Saskatchewan River by neural network modeling. Water Research, 31, 2340–2350.
Index
Note: n indicates information in a note. Abdallah, C. T. 113 ACC (anterior cingulate cortex) 225–6, 230, 232–3, 236, 237 acetylcholine 244 Acid, S. 106 Ackerman, P. L. 161, 180 ACS (action-centred subsystem) 198 action time 133 action-based learning 181, 188 action-orientated strategy 190 actuators 57, 60–2, 71 adaptable behaviours 95 addictive behaviours 242 affective processes 161 agency 11, 14, 20, 32, 35, 49, 119, 145, 150, 168, 256 actions and 12, 16, 21–31, 50, 170–4, 178, 264 318
argument that challenges the possibility that could demonstrate 89n causal 162 centre stage in study of control behaviours 148 choices and 195 collective 169 complexities of 177 internally driven by the self 263 judgements concerning 146, 162 motivational principles key to understanding 262 personal 169 proxy 169 temporal properties of our sense of 29–31 uncertainty and 267–8, 272 Agre, P. 91
Index AI (artificial intelligence) 5, 89–90, 108, 109 barriers to perfect systems 91–2 machine learning and 11, 13, 81–2, 93–5, 111–12, 117–18, 120, 147 air traffic control 49, 53, 74, 118, 120, 123, 132, 164, 169, 183 human and machine capabilities in air navigation and 129 models applied to controller training 134 aircraft 53, 76 commercial 54, 66 fighter 66 flight management systems 121 flight plan 85, 86 transfer functions 86 Aitkenhead, M. J. 90 Alexander, R. A. 161 algorithms 63, 68, 113, 242, 262 adaptive control 77n Bayesian learning 102, 104–7, 229n, 234n, 238n genetic 102, 103–4, 105, 109, 257 logic-based 102–3, 105 machine-learning 104, 111, 112, 113, 124, 151n, 228 probabilistic 103, 104, 106 reinforcement learning 106n, 155, 243n, 257–8, 262 altruism 240 Ammerman, M. 141 analytic judgements 37–8 Anderson, J. R. 134, 188 Andrews, C. 233n Andrieu, C. 111n Angert, L. 158
animals 101, 155, 204, 235–9, 240, 246 neuropsychological functioning in 218 anticipatory mechanism 87 anti-Humean critics 42 Appley, M. H. 151 approximation 91–2, 107, 132, 133 Aristotle 32, 40 artificial systems 54, 111, 117 parallel between animals and 99 see also AI (artificial intelligence) Ashby, W. 81, 82, 95–100, 149, 164, 256n Ashton, B. 179 Assad, J. 245 assisted steering 5–6, 113 associative networks 134 associative theories 36n Astley-Jones, F. 242 atomic interventions 211 attention spans 135 attentional processes 180 attribution theory 173 auditory information 135 Austin, J. 149, 155, 163, 165 automatic pilot systems 30–1, 49, 76, 123, 141, 183, 186–7 control returned to pilot 119–20 automation 5, 13, 54, 79, 87, 115–16, 127, 129 increase in complexity 124n Azjen, I. 151, 161 Back, J. 139 Bandura, A. 148, 149, 150, 152, 153, 156, 159, 160n, 161, 162–3, 165, 166, 167–9, 170, 171, 172, 173, 176, 192, 195, 202n, 252 319
Index Banks, J. 241 Barker, V. 174 Barr, P. 174 Barto, A. G. 4n, 106, 110, 113, 255, 257, 258, 262 basal ganglia 220, 221, 222–5 basicness 26–7 Bayes rule 105, 209 Bayes theorem 69, 117 subjective probability 93 Bayesian belief-updating process 209 Bayesian formal methods 261 Bayesian inference 110n Bayesian learning algorithms 102, 104–7, 229n, 238n, 240 optimal 231, 232, 234 Bayesian networks 105, 108 causal 78n Beckett, Samuel 3n behaviourism 204 Behrens, T. E. 231, 232–3, 234 Beigi, M. 179 Berns, K. 91 Berry, D. 179, 187, 188, 193, 196, 197 biases 73, 104, 128, 186, 193–8, 212, 214 Bigelow, J. 83n bioeconomics 248n biological systems 1, 16–17, 22, 81, 82, 87, 99, 122, 173, 189 learning algorithm that bears closest resemblance to 107 biology machine learning occasionally borrows from 110 biometric values 134 320
Blaisdell, A. P. 239n Blandford, A. 139 block designs 57, 59–62, 71–2, 82n, 131 Blum, B. 210 Bogen, J. 43, 45, 46 Boles, T. L. 241 Bonnefon, J.-F. 253n Boolean logic 142 Boorman, E. D. 233n, 234–5 Botvinick, M. M. 232 Bouffard-Bouchard, T. 161 Bower, G. H. 135 Boy, G. A. 140 brain functions see cortical regions; neural networks; also under further headings prefixed ‘neuro-’ Brand, M. 28, 29 Brass, M. 170 Bratko, I. 113 Bratman, M. E. 29 Brezzi, M. 241 Brief, A. P. 161 Broadbent, D. E. 179, 187, 188, 193, 196, 197 Brown, S. D. 110 Bryant, J. 89, 93 Burns, B. D. 180, 188, 198, 199, 200 Burns, C. M. 122 Busby, J. S. 119, 136–8 Buschman, T. J. 221, 224 Busemeyer, J. R. 179 Bush, G. 226 Buss, R. R. 187 Camerer, C. 217, 248, 249 Campbell, N. K. 161
Index Campos, L. M. de 106 career-related problems 161 Carnap, Rudolf 36, 47 Carroll, Lewis 2n cars 53, 54, 118, 124, 177 assisted steering 113 kinesthetics to control 87 Carter, C. S. 225, 232 Cartwright, N. 44 Castanon, L. E. G. 261 Castellan, N. J. 205, 207 category assignments 102 causal attributions 143, 145, 146, 176 identifying the antecedents of 175 causal learning 35n, 182, 212, 213 and reasoning 14, 204, 209–12 causality/causation 11, 12, 16, 21, 31–6, 39, 41–2, 44, 49–50, 69, 119, 148 agent 172, 173 asymmetries of time informative of 88n counterfactual dependence in 43 demystification of 45 event 172, 173 Humean perspective on 93 mental 22, 39n notion still under-specified 70n personal 162 translated as a construct into dynamics 93 understanding of 171 cause–effect relationships 4, 33, 35, 36n, 50, 59, 193, 210 dynamic 265 failing to accurately detect 128
multiple 177–8, 210–11 operating with incorrect knowledge of 20 probabilistic 128, 265 puppeteer and puppet 19, 271 CDC (complex dynamic control) 209–10 CEOs (chief executive officers) 127, 148, 175 Cervone, D. 156–7, 161, 162 Chau, K. W. 259 chemical systems 22, 81, 90 dynamic 66 industrial processes 77, 118 time delay systems found in 73 Chen, S. 257, 269n Cheng, P. W. 35n Chery, S. 201 Chinese room argument 89n Chmiel, N. 188, 193 Chretienne, P. 75 Chung, P. W. H. 119, 136–8 circularity 43 CLARION (Connectionist Learning with Adaptive Rule Induction Online) 198 Clark, L. 242 Clark, M. 78n classifications 102, 103, 140, 258 binary 103 clinical domains 188 cognition 77, 180 high-level 221 useful basis for understanding 228 see also metacognition cognitive engineering 117, 130, 132–3, 135 cognitive factors 158–62 321
Index cognitive processes 148, 216–17, 218, 236 conscious 249 core 228 important 159 limitations of 137 cognitive psychology 4, 10, 12, 14, 152, 155, 179–215, 216–17, 224, 254, 256, 263 favoured method in 163 Cohen, J. D. 222, 225, 231, 245 Cohen, M. 230, 232, 233, 242, 244 Cohen, M. S. 200 Cohen, W. 103 coincidences and causes 32, 38, 39, 40–3, 44, 45, 48 cold symptoms 16–17, 18, 19, 46, 47–8 Coltheart, M. 217n communication problem 83–4, 86 complex systems 77, 85, 88, 100, 119, 187, 263 behaviours of 23, 96 controlling 14, 86, 196, 213, 255 description of operations 96 dynamic relationship between humans and 117 general properties for describing 81 learning to control uncertain outcomes in 180 problems that arise when humans interact with 13 uncertainty about 50 understanding 83, 256 complexity 5, 29, 53, 58, 190–3, 239 agency 176 automation increases in 124n 322
goals can be defined in terms of 155 mystery of 255 solving the problem of 6–9 computational machines 94, 95 computer database management 77 computer vision 101 conditionals 102, 123, 159 counterfactual 41 connectionist networks 135 see also neural networks Connolly, T. 156 constant conjunction 34 continuous-time variables 57 control information 121 control systems engineering 6, 12–13, 51, 52–79, 83, 111, 136, 208n, 228n, 255 core problems integral to 80–1 control theory 13, 55–9, 72, 73, 79, 83, 85, 151n application revolutionized 77 crucial concepts in 62 humans and main principles of 116 linear 132 optimal 68, 77n perceptual 149, 163, 164–6 statistical approach 104 Vancouver and Putka’s 149 control-based learning 181 controllability 68, 70, 269 convergence 170, 219 Cook, D. J. 66 Copernicus 38 Corning, P. A. 110, 122 cortical regions 218–33, 235–8, 243 counterfactuals 41–2, 43, 48–9, 50
Index Cowell, R. G. 106 Coyle, G. 77 CPM-GOMS technique 134 Craig, E. 33n Cramon, D. Y. von 232 critical variables 98 Crown, D. F. 177 cruise control 6, 17, 20, 83n cues 206, 207 critical 201 environmental 197, 201 irrelevant 205 proximal 193 see also MCPL (multiple cue probabilistic learning) tasks cybernetics 4n, 11–12, 13, 80–114, 120, 165, 166, 167, 254, 256 attitudes to 149 communication and information theory in 121 early 91, 255 father of 5, 118n made fashionable 151 dACC (dorsal anterior cingulated cortex) 233 Dake, J. M. 246 Dam, G. 243n Danielmeier, C. 233 DAS (driver assist systems) 5–6 data mining 101 David, P. 265n Davidson, D. 22, 25n, 26, 28, 50 Daw, N. D. 106n, 233n, 235, 241, 242–3, 258 Dayan, P. 70n, 106n, 106, 111, 151n, 155, 231n, 233n, 235, 241, 242–3, 244, 258 decision rules 102, 105, 110n
decision trees 102, 104, 135 decisional processes 161 declarative knowledge 187n declarative learning 198 Degani, A. 53, 76, 120, 126, 143, 193, 194 Dekker, S. W. A. 129, 130, 135, 137, 142 Dennett, D. 22–3, 71n Descartes, René 31 DeShon, R. P. 161 design-level description 23 design problem 24, 28, 71n desired goals 18, 85–6, 96, 104, 108, 133, 141, 163, 164 achieving 167, 230n discrepancy between current state and 158, 166 getting closer to 197 obvious 252 reaching and maintaining 170 reliability in producing 56 rewards and punishments incurred in relation to 228 desires 6, 12, 20, 21, 22, 24, 147, 148, 169 causally relevant 41 intention and how it differs from 26 deterministic descriptions 46 Dettmer, H. W. 142 Devinsky, O. 226, 232 Devlin, J. T. 232 dichotomies 181–2, 250n Dickinson, A. 35n Dieckmann, M. 237 Dienes, Z. 169n, 179, 187, 196 digital control systems 77–8 discrete-time variables 57 323
Index disturbances 54, 72, 74, 123 good regulation against 71 unpredicted 78 ways of counteracting 164 divergence 219, 223 Dixon, S. R. 128 DLPFC (dorsolateral prefrontal cortex) 222, 230 Dolan, R. 106n, 233n, 235, 241, 242–3 dopamine 224 Dörner, D. 7, 123, 180, 185, 194 Doucet, A. 111n Doyle, J. 62, 73 Dretske, F. 22, 24, 25, 27, 28–9, 71n Drury, C. G. 135 dual-control theory 105, 106 dual-space hypothesis 199 Ducasse, C. J. 42 Dugdale, B. 161, 180 Dulin, J. 241 Duncan, K. D. 140 Dweck, C. S. 161 dynamic decision-making 179 dynamic equations 57–8, 67 dynamic systems 7, 14, 30, 56, 60, 73, 74, 97, 126 causal loops 78n complex 35, 52, 114, 131, 150, 252 linearity and 57–9 optimal 115 see also CDC (complex dynamic control) Dywan, J. 226 Earley, P. C. 156, 157, 158, 161, 180 ecological systems 1, 77, 81, 82, 183, 185–6 324
economics 82, 83n, 103, 152, 240, 241, 245, 248 see also neuroeconomics economy 49, 54 ecosystems 1, 49, 54, 183, 186 educational performance 161 EEG (electroencephalography) 227, 230 efficacy causal 168 synaptic 108 see also self-efficacy Eitam, B. 188 Ekegren, G. 156 electricity supply 46, 64 electronic control systems 87 Elger, C. E. 230 Elliott, E. S. 161 Elmaghraby, S. E. 134, 141 Elman, J. L. 108 emergent properties 72–3, 96, 219 Endres, M. L. 177 Endsley, M. R. 140 Engh, T. 132 engineering 7, 91, 111, 147, 255 advances in 136 early control systems 115 formal methods have a critical role in 136 psychology continues to borrow from 110 see also cognitive engineering; control systems engineering EPAS (electric power-assisted steering) 5–6 equilibrium state 65, 78, 224 ergonomics 10, 117, 201 ERN (error-related negativity) 226
Index error 68, 108, 116, 150, 230 ability to reduce 62 cyclic 64 forced 140 general issues and remedies 142–3 identifying 141 latent 140 minimized 56 predictive 243 random 140 reward-prediction signals 224 scope for 127 sounds indicating 119 see also human error error correction 84, 163 error detection 63, 75, 83n, 163, 226, 232 estimations 69, 70, 77n, 132n, 153, 159–60, 165, 177 Evans, R. J. 255 Ewan, E. E. 246 exemplar theories 196 expectancy value theories 151–3, 159 expected utility theory 248 experimental paradigms 182–90 explanatory laws 44–5 exploration and exploitation 213n, 229n, 230, 231, 241, 243, 245 switching between 244 extrinsic properties 28 Exxon Valdez oil spill (1989) 126 eye-movement technique 246–7 Fahey, R. 179, 187, 196 Fahrbach, K. R. 161
Falbo, J. 161 false dichotomies 181n falsity 37 Farh, J. 177 Farrell, P. 201 fatigue 130n fault diagnosis 141–2 Fayyad, U. M. 89 feedback 14, 15, 30, 55, 64–71, 79, 81, 88, 91, 94, 109, 110, 118, 119, 149, 166, 173, 176, 177, 190, 191, 217, 229, 231, 235, 267 adaptation as 96–100 ambiguous 230 anticipatory 87 biases in interpreting different timings and 194–215 complex 183 consistent/inconsistent with predicted outcome 188 delays in 194–5 dynamic 151 evaluated 188, 222 facilitative effect on predictive performance 207 instructive 157 instrumental in technological advancements 83 negative 85, 86, 87, 150, 158, 164, 165, 233, 234 ongoing performance 177 outcome 158, 187 positive 233 process 158 sensory 221–2 unambiguous 230 updating and integrating 268 varying the type of 205 325
Index feedback loops 7, 56, 60–4, 82, 185, 216, 222 automatic 116, 145 instability introduced by 124 multiple 227 negative 164 feedback signals 61, 69 feedforward control 63, 108, 110, 150, 165, 214–15, 220, 227 Feldbaum, A. A. 104 Feltz, D. L. 161 filtering estimation theory see Kalman financial decision-making 177 Fisher, D. L. 135 Fitts, P. M. 129, 181n Fitts’ Law 131n, 133, 144 Fitzgerald, P. 196 Flanagan, J. R. 207, 208 flight management systems 30, 120–1 fMRI (functional magnetic resonance imaging) 217n, 229, 230, 236, 252 force effect 146n formal descriptions 133 formal models 78, 117, 129–36, 143, 144, 210, 217n, 256n, 260–4 limitations and range 257 see also CLARION (Connectionist Learning with Adaptive Rule Induction Online) FPC (frontopolar cortex) 235 Francis, B. 54 Frankfurt, H. G. 28 Frayne, C. A. 161 Frederick, S. 250n Fredin, E. S. 265n Freeman, J. T. 200 326
frequency-domain approach 58, 68, 69 frequency-response analysis 67 Fuchs, A. 132n functional rules 96–7 Funke, J. 161 Fuster, J. M. 218, 219, 222 fuzzy logic models 257 fuzzy set theory 93, 103, 117 Gahegan, M. 113 Gallistel, C. R. 238 gambling environments 240–2, 243 crucial difference between control systems and 253 game theory 248 Gani, M. 66, 75 Gardner, P. H. 188 GAs (genetic algorithms) 102, 103–4, 108, 257 Gecas, V. 162 Geddes, B. W. 200 Geddes, N. D. 137 Gee, S. S. 77 genetics 95, 103 see also GAs (genetic algorithms) Georgantzas, N. C. 77 Gerstenberg, T. 211 Gibbs, B. J. 161 Giesler, B. R. 187 Gigerenzer, G. 250n Gist, M. E. 160, 161 Glimcher, P. 245–6, 246–7, 248, 249 Goa, F. 67 goal-directed actions/activities 55, 106, 148, 171, 178, 271 brain activity associated with 219, 220, 229, 236, 238, 239, 243
Index information processing involved in 220 outcome of 266 self-initiated 152n, 161, 211 goal-directed behaviours 167, 179, 196, 199, 203, 213, 265 capability in achieving 159 different functions of 162 fast-learnt 224–5 goal-setting and 164, 166 self-rewarding 159 time separates sensory signals involved in 222 well-rehearsed sequence of 168 goals 155–8 attainment and maintenance of 149, 150n behaviours contribute to reaching 266 changing, need to respond to 66 complex 29 difficult 177 directing actions towards 207 distal 154 essential rules that can meet 103 long-term 55 opposing 111 poorly defined 128 proximal 154 relating past actions to future 29 unconstrained 189 varying specificity 177 see also CPM-GOMS; desired goals; NSGs (non-specific goals); SGs (specific goals) goal-setting 148, 154, 158, 159, 162 goal-directed behaviours and 164, 166
Locke and Latham’s theory 149 problematic 112 short-term 189 Gödel’s Incompleteness Theorems 89–90nn Godsill, S. 111n Goldsmith, T. E. 199 Goldstein, W. M. 135 Gollu, A. 113 Gomez, E. 180 Gonzales, C. 180, 197, 198 Gooding, R. Z. 175 Gordon, A. 187 Gray, N. 242 Green, D. M. 135, 140 Griffiths, T. L. 209, 261 group behaviour 179 decision-making 177 growing and pruning process 103 Grush, R. 110 Gundlack, M. J. 265n Gureckis, T. M. 110 Guttmann, H. E. 142 Guynn, M. 199 Hackett, G. 161 Hagmayer, Y. 210n, 211, 212 Hakel, M. 177 Haleblian, J. 175 Hammer, W. 140 Hancock, P. A. 140 Harman, E. 44 Harter, D. E. 180 Hartman, K. 161 Hassan, R. 188 Hausman, D. M. 42 Hayes, A. F. 265n Hayes, J.-D. 170 Hazen, Harold 83, 115–16 327
HCI (human–computer interaction) 10, 13, 53, 98, 115–46, 201 recent advances in 54 Heffernan, D. 113 Heinze, J.-C. 170 Hempel, C. G. 46, 47 Henson, R. 217n heteropathic effects 72 heuristics 8, 197, 198 Heymann, M. 53 Hibberd, R. E. 136 Hikosaka, K. 236 Hill, R. J. 187 Hill, T. 161 Ho, D. W. C. 66, 75 Hoc, J. M. 170 Hogarth, R. M. 161 Hokayem, P. 113 Hollnagel, E. 130, 135 homeopathic effects 73 Hooker, C. A. 255 Hornak, J. 233n Hornsby, J. 22, 24, 26, 28 Horswill, I. 91 Hoskens, J. 253n hospitals 53, 126 Huff, A. 175 human error 79, 119n broad taxonomy for classifying 139–40 formal models contribute to assessments of 136 no agreement as to how to define 142 see also THERP (technique for human error rate prediction)
human factors 4, 5, 10, 53, 167, 186, 192, 255, 256 see also HCI (human–computer interaction); human factors research human factors research 12, 119, 128–9, 145, 146, 151n, 254, 261 connecting up advances in machine-learning and 136 differences between other related areas and 117 errors in 136, 137 formal analysis has its place in 135 important issue for 143 key/principle concerns for 13, 118 motivation of 116 ongoing issue in 118 recent developments in 121 uncertainty in 120–4 human-machine interactions 118, 129, 133 computational capability of interface techniques 77 formal methods try to improve design and evaluate 144 optimal allocation of functions 129, 136 see also HCI (human–computer interaction) Hume, David 32, 33, 34, 35–6, 42, 93 Hyman, S. E. 231 hypothesis testing 104, 196, 198–200, 209, 212, 214 purposeful activity of 207 systematic tests 188
Index IBLT (instance-based learning theory) 196–8, 201, 207n Ilg, W. 91 imperfect information 91 Inagaki, T. 170 incidental learning 207n inconsistency 43, 90n, 92n, 175, 188, 227, 250n induction 34n, 46, 47, 48–9 see also rule induction industrial systems 1, 17, 18, 81, 142, 183–4, 201 inferences 32, 45, 50, 93, 239, 261 Bayesian 110n canonical 101 causal 33, 34, 37, 209 weakly held 202 information flows 30, 62–3, 72 challenges to 85 continuous and dynamic 101 monitoring 79 systematic 88 usual direction of 108 information theory 121, 133, 135 information-gathering 99, 104, 107, 241, 242, 245, 261 highly sophisticated 77 limited 105 too conservative 105 information-processing 71n, 136, 159, 180, 220 complex, achieving expertise in 199 conceptualizing 101 input 63, 65, 74, 98, 109, 221, 222, 223 artificial network of links between nodes 107
high- and low-frequency 67 time-varying 126n unpredictable 54, 78 input–output relationship 53, 60–2, 67, 81, 85, 102, 107, 187, 191, 219 causal 60 connections either nonlinear or linear 187 delays 19 dynamical changes 59 errors 108 functional description 58 intermediary internal mechanism 57 one-to-one mapping 193 perceptual representations and generating actions 220 instability 62, 64, 66–7, 74, 82, 84, 124, 232, 265 instance space 199–200 intelligent design 99 intensive care units 135 intentionality 26, 27, 32, 169, 174 intentions and actions 12, 16, 19–31, 140, 175 intercorrelations 191 introspection 31 Isen, A. M. 241 Ishii, S. 106 iterative process 103, 136, 164 Itoh, M. 170 Jackson, F. 40n Jacobs, B. 161 Jacobs, O. L. R. 57, 62, 74 Jahanshahi, M. 179 Jakeman, A. 257
Index Jax, S. A. 134 Jensen, F. 105 Jensen, S. 242 Jirsa, V. K. 132n Jiwani, N. 156 Jocham, G. 233–4 Johansson, R. S. 207 Johnson, A. 242 Johnson, C. 18, 73, 75–6, 119, 120, 136 Johnson, W. G. 142 Jones, G. R. 160 Jordan, M. I. 108 Josephs, R. A. 187 judgement-orientated strategy 190 Jungermann, H. 209 Kable, J. 245–6 Kahan, S. 207 Kahneman, D. 161, 250n Kakade, S. 70n, 231n, 242 Kalman filters 69–70, 77n, 87n, 110, 117, 242, 259 Kanfer, R. 161, 180 Kant, Immanuel 35–8 Karnas, G. 187 Karoly, P. 150, 151, 163 Katsamakas, E. G. 77 Kelley, H. H. 173–4, 175, 176 Kemp, C. 209, 261 Kendall, L. 150n Kepler, Johann 44n Kerstholt, J. H. 189, 190, 191, 209 Kheramin, S. 246 Kim, J. 187 King, A. P. 238 Kinney, P. J. 161 Kintsch, W. 180 Kirlik, A. 53, 126 330
Kirth-Nelson, Z. 242 Kitcher, P. 36 Kittler, J. 110 Kleene, S. C. 89n Klein, G. 179, 200 Klein, T. 233 knowledge a posteriori 37 a priori 36, 37, 38 causal 37, 49, 50, 51, 209 pure categories of 36 knowledge acquisition 180, 188 Knowlton, B. J. 207n Kochenderfer, M. J. 113 Koechlin, E. 235 Kolmogoroff, A. N. 84 Konidaris, G. 258, 262 Körding, K. 77n, 243n Kotler-Cope, S. 187 Kringelbach, M. L. 233n Krueger, G. P. 193 Krug, K. 238 Krynski, T. 209, 210 Lagnado, D. A. 34n, 39n, 195, 207, 210n, 211 Lagorio, C. H. 246 Lai, T. L. 241 Laibson, D. I. 245 Langer, E. J. 253n Laplace transformations 57, 131, 240 latent factors 134n Latham, G. P. 149, 150n, 155, 156, 157, 161, 238 Laughery, K. R. 132 Law of Requisite Variety 98 Lawless, M. L. 132 Lawrence, A. 242 LDW (lane departure warning) 6
Index learning behaviour 181–2, 242, 243 animal 237 causal 209 experts versus novices 181 formal method of describing 70n optimal 234, 250 see also action-based learning; control-based learning; IBLT (instance-based learning theory); reward-based learning; RL (reinforcement learning) Lebiere, C. 180 Lee, C. 156 Lee, M. 241 Leen, G. 113 Leigh, J. R. 55, 64, 208n Leising, K. 239n Lennie, P. 227 Leondes, C. T. 71n Lerch, F. J. 180 Lewis, D. 41 Libet, B. 28, 170 Lichacz, F. 193 likelihood ratio 105, 209 linear equations 73 differential 58, 66 linearity and dynamics 57–9 linguistic hedges 103 LIP (lateral intraparietal area) 247 Lipshitz, R. 127, 128, 179, 200 Litt, M. D. 161 Lituchy, T. R. 156 Locke, E. A. 149, 150, 153, 154, 155, 156, 157, 158, 161, 163, 192, 202n Loewenstein, G. 217, 245, 248 logical operators 102–3 long-term memory 135
loudspeakers 65 Love, B. C. 110 LPFC (lateral prefrontal cortex) 236 LTI (linear time-invariant) systems 126n, 132 Luc, F. 187 Lucas, J. 31, 49, 50, 51, 89n Luu, P. 226, 227n MABA–MABA capabilities 129 MacDonald, A. W. 225 Mach, D. E. 161 machine learning 4n, 10, 89, 91, 113, 125, 151n, 167, 168, 228, 255, 256 AI and 11, 13, 81, 82, 93–4, 111, 112, 117, 118, 120, 147 algorithms are computationally expensive 112 current work in 100 far-reaching impact on daily lives 111 models of 101, 105–6 ongoing challenge in 112 Mack, G. 78n Mackie, J. L. 41 Madden, G. J. 246 Maddux, J. E. 151 Mahadevan, S. 112, 113, 257 management 6, 14, 77, 98, 126, 137, 148, 160, 166, 183–4, 192, 258, 268 complex decision-making tasks 157 locus of control clearly specified 170 see also MORT (management oversight and risk tree); self-management 331
Index Mangels, J. A. 207n Mangold, S. 210n, 211 Mann, M. F. 161 Manning, M. M. 161 manual control 30 Marco-Pallarís, J. 232 Marcus, G. 227 Marescaux, P.-J. 187 Mark, T. A. 238 Marquis, M. A. 161 Martinez, F. 253n Marx, Karl 162n Mathews, R. 179 Mathews, R. C. 187 Matsumoto, K. 236–7 Mauel, E. C. 246 Maynard, D. 177 mazes 177 McClelland, D. C. 152, 162 McClelland, J. L. 135 McClure, J. 174 McClure, S. M. 244, 245 McDonald, A. J. S. 90 McKenzie, C. R. M. 144n, 161, 260–1 MCPL (multiple cue probabilistic learning) tasks 82n, 204, 205–7, 209, 213 predictive judgements in 14, 182 MCS (metacognitive subsystem) 198 MDPs (Markov decision processes) 100, 112, 258 Mead, G. H. 162n means-end analysis 199 mechanical systems 1, 17, 18, 81, 87 mechanistic theories 35n, 42 mechatronics 79, 91 Meder, B. 210n, 211
Mele, A. R. 22, 24, 25n, 26, 29 memory 88, 94, 109, 179, 181n, 201, 202 computational machines have 95 decision and response 140 instance representation stored in 196–7 learning and 97, 99–100 speed in retrieval of 135 mental states 38n, 94 mentalism 23n Merrill, E. 198 Mesch, D. 177 Messick, D. M. 241 metacognition 169, 198, 200 integral part of decision-making process 202 Meulenbroek, R. G. J. 133–4 Miami 119n Michela, J. L. 173, 174, 175, 176 Micro Saint Sharp models 132, 135 microphones 65–6 Mikkelsen, L. A. 261 military systems 53, 118, 141, 188 Mill, J. S. 72 Miller, E. K. 220–1, 222, 224 Mindell, D. 83, 116 misperception 192, 193n, 195 Mitchell, T. 96, 101, 102n, 113 Mitchell, T. R. 160 mobile phones 124, 129, 131, 139, 168 model-based learning 243 monitoring 18, 30, 79, 98, 113, 122, 124, 126, 149, 163, 266–7 assigning the role of 129–30 feedback 188 pressures of 123
Index Montague, P. R. 70n, 231n, 242 Monte Carlo simulations 131–2nn Morales-Menendez, R. 261 Moray, N. 170 More, K. M. 160 Moritz, S. E. 161 Morrell, M. J. 226 Morse, A. S. 77 Morsella, E. 227 MORT (management oversight and risk tree) 141 motion aircraft 66 light detector 168 planetary 38, 44n motivational factors 133, 148, 153–4, 158, 159, 198 motor control behaviour 77n, 179 behaviour high-level 225 see also perceptual-motor control motor learning 207–8, 227n, 243n Motro, A. 92 movement time 133 mPFC (motor prefrontal cortex) 237, 238 Mueller, G. 175 Müller, A. V. 232 multiple control systems 120 Mumaw, R. J. 186–7 Mumford, Lewis 83 Münte, T. 232 Murtha, T. C. 161, 180 muscle movements 26, 27, 85 NACS (non-action-centred subsystem) 198 Nagel, Thomas 24
naturalistic decision-making theories 200–3 necessary connections 34, 35, 46, 49, 50, 93 Hume’s denial of 33 Nelson, L. 161, 180 Neumann, J. 233 neural networks 71n, 102, 107–9, 219, 240 artificial 257, 259 neurobiology 218–19, 246 neurochemicals 122, 127, 167, 169, 244 neuroeconomics 15, 159n, 217, 218, 240–6, 247–8, 249, 252 neurology 27, 95 neuronal activity 107, 224, 247 changing thresholds of 94 neuropsychology 10, 12, 14, 151n, 155, 217, 218, 224, 227n, 228, 229, 235, 240, 248, 251, 252 neuroscience 4, 5, 14–15, 23, 24, 101, 110, 216–53, 254, 255, 256 research that combines psychology, economics and 159n, 217 RL algorithms in 258 serious issue for 28 Newell, B. R. 207 Newtonian mechanics 88 Niv, Y. 106, 106n, 6, 151n, 155, 242, 243 noise 54, 74, 100, 186 measurement of 72 sensor 78 signal 84, 121 smoothing of 84 stochastic 64
Index nonlinear systems 59, 65, 70, 128, 133 non-volitional responses 152, 153 norepinephrine 244 Norman, D. A. 119, 139 Northcraft, G. B. 156 Norton, J. 257 Norton, L. 151 NP (non-polynomial) time 8 NSGs (non-specific goals) 156, 157–8 nuclear power plants 1, 49, 53, 90, 118, 123, 142, 177, 201 Nygren, T. E. 241 Nyquist stability criterion 67 observable behaviour 23, 32, 68–9, 119 observation-based learning 179, 181, 188 Ockham (William of) 31 O’Doherty, J. P. 106n, 230, 233n, 235, 241, 242–3 OFC (orbitofrontal cortex) 230, 231, 237, 238, 243 Olson, J. M. 160 Olson, M. 241 online control 111, 116, 123, 254 OP (order-of-processing) diagrams 135 optimal control theory 68, 69, 77n, 106, 115 optimal sampling 117 optimization 54, 56, 57, 72, 134, 257 Orasanu, J. 179 organizational psychology 14, 148, 160, 201 organizational systems 1, 17, 81 simulated tasks 177 334
Oritz, R. J. C. 261 Osman, M. 179, 181n, 187n, 192, 194n, 210n, 211, 214, 255n, 263 outcome variables 191 Owens, A. 40–1, 43 Pacherie, E. 22, 25, 27, 28, 29–30 Padoa-Schioppa, C. 245 pain 161 parameters 56, 72, 78, 109, 139 constant but uncertain 74 estimated 70, 76n time-varying 58, 66 unknown 76n Parasuraman, R. 129, 140, 260 particle filters 110 Parvez, C. 179 Patterson, P. 174 payoffs 242, 243–4, 252 Pearl, J. 209, 210 Penfold, H. B. 255 perception 36n, 119, 135, 141, 151, 153, 160, 162n, 165, 172, 177, 181n integrated into action 220, 221, 222 see also misperception perception–action cycle 221 perceptual-motor control 14, 182, 204 perceptual-motor learning 207–8 perfect information 91–2 Perner, J. 169n Perrow, C. 120, 137 Persensky, J. J. 132 Peterson, T. 198 Petri nets 75 Pettit, P. 40n
PFC (prefrontal cortex) 219–25, 230, 235–8, 243 Philiastides, M. G. 227 Phillips, C. A. 132, 133 philosophy 11, 16, 19, 21, 28, 29, 30, 48, 51, 148, 171 see also Aristotle; Carnap, Rudolf; Descartes, René; Hume, David; Kant, Immanuel; Marx, Karl; Ockham (William of); Russell, Bertrand phobic behaviours 161 physical-level description 23 physics 80n, 82, 103, 130 laws of 22 physiology 82 Piatetsky-Shapiro, G. 89 planetary motion 44n planning decisions/actions 180 Platt, M. L. 246–7, 249 Podsakoff, P. 177 Poldrack, R. 207n polynomial time 8 Porr, B. 257 Porter, D. 241 Posner, M. I. 181n, 226 possibility theory 93 posterior cortex 221, 222 Powers, W. T. 149, 151, 153, 165 Pratt, N. 227n predictability 191, 264, 267 see also unpredictability prediction 4, 22–3, 46, 50, 63, 90, 208, 272 accurate 102, 211 and control 213–14, 268–9 multiple 188 operator performance 134
parallel between observation and 210n reward 224 subjective estimates of confidence in 267 uncertainties in accuracy of 54 weather 258 see also THERP (technique for human error rate prediction) prediction error signals 230 predictive learning 181, 222 Prelec, D. 217, 248 Prentice-Dunn, S. 161 preprogrammed control 63 prescriptive methods 129, 201, 248, 260, 261 probabilistic phenomena 46, 128 algorithms 103, 104, 106 risk analyses 142 see also MCPL (multiple cue probabilistic learning) tasks probability 48, 69, 73, 80n, 92, 132n, 152, 229, 233 conditional 68, 75, 99 estimation of 177 expected 74 fuzzy 103 logical interpretation of 47 objective 142 outcome 234, 247 prior 105, 209 reward 231–2, 234, 235, 246, 252 subjective 93 problem spaces 104, 112 large 120–1 problem-solving 6–9, 83n, 161, 167 dysfunctional strategies 177 procedural knowledge 187n 335
Index proximal intention 29 Psillos, S. 33, 45 psychology 7–8, 23, 24, 27, 32, 34, 40, 82, 108, 109, 245 advances in 136 culmination of recent work in 240 economics, neuroscience and 15, 159n, 217 formative influence of engineering on 83n machine learning and 101, 110, 111 serious issue for 28 specific constructs used to measure mental capacity 135 see also cognitive psychology; neuropsychology; organizational psychology; social psychology Pugh, A. L. 77 punishment 106, 218, 228, 229–35 Putka, D. J. 149, 153, 165 Putnam, H. 89–90nn Q-learning 102, 106, 110n, 234, 240, 259 Quesada, J. 180, 191 Raaijmakers, J. G. W. 190, 191 Rajagopalan, N. 175 Rajan, J. 79, 137 Ramadge, P. J. 76 Ramirez, R. 261 random fluctuations 128 random variables 183, 191 Ranganath, C. 230 Rao, S. S. 91 336
Rasmussen, J 53, 119, 139 Ratcliff, R. 227 rate of change 64, 265–6, 268, 272 rational models 248, 250, 260 RAWFS model 200, 201–2 RCA (root cause analysis) 142 RCZ (rostral cingulate zone) 233, 234 reaction time 133 reason 37 Reason, J. 119, 134n, 137, 140 Reddish, D. 242 Redmill, F. 79, 137 regularities 32, 33, 38, 42, 100, 123 invariant 35 structures that will exploit 104 see also statistical regularities reinforcement theory 153, 166 see also RL (reinforcement learning) relative frequency 47, 93 reliability 191 representations 28, 59–62, 79n, 144, 208, 224, 255, 259, 268 causal 195, 211–12 conceptual 238–9 conflicts between 225 flexible 239 formal 56 graphical 57, 75 higher-level 29, 221 instance 195–6, 196–7, 201 internal 25n, 167, 239, 262 lower-level 29 perceptual 220 static 100 syntactic-based 238–9 theory of knowledge 91n response–outcome expectancies 152
Index reward 110n, 155, 157, 236, 240, 252, 262 delayed 239, 241, 245–6 desirable 160n dynamic 234, 235 expectancy and 151, 152, 159n, 222, 241, 246, 247 future 245 immediate 245 magnitude of 229–31, 234, 245, 246, 247 neurons responding to signals 224 probabilistic 106, 234, 246 punishment and 106, 218, 228, 229–35 purposeful voluntary behaviour 24 risky 241–4 stochastic structures 235 variable 246, 247 reward-based learning 237–8 Richardson, G. P. 77, 78, 165 Riley, V. A. 140 risk aversion 240 risk factors 137 RL (reinforcement learning) 106n, 110n, 111, 117, 155, 234, 243n, 246, 258–9, 262 expectancy value theory has its origins in 151n RM (recognition/metacognition) model 200, 201, 202 robotics 53, 79, 91, 93, 96, 101, 125, 258 robustness 50, 71, 72, 74, 111–12, 125 rocket propulsion 115 Rodriguez, P. F. 227n
Roese, N. J. 160 Rogers, R. W. 161 Rolls, E. T. 233n Rosen, B. 161 Rosenbaum, D. A. 133–4 Rosenblueth, A. 83n Rosse, J. G. 177 rotating discs 61–2, 71, 88, 94 Rothrock, L. 53, 126 Rotter, J. B. 152 Rowe, R. R. 246 RPD (recognition-primed decision) model 200–1, 202, 203 Ruddle, C. M. 66 rule induction 100 see also CLARION (Connectionist Learning with Adaptive Rule Induction Online) rule space 199, 200 Rumelhart, D. E. 135 Rushworth, M. F. 231, 232, 234 Russell, A. 78n Russell, Bertrand 36, 93 Rustichini, A. 217, 248, 249 Saari, L. 157 SabreTech maintenance safety 119n safety-critical systems 1, 117n, 123, 140 Sailer, U. 229, 230, 231, 233 Sajda, P. 227 Sakagami, M. 236 Salas, E. 179 Salmon, W. 42, 46, 47 Salvendy, G. 119, 136, 139 SAP (semi-automatic parking) 6 Sarter, N. B. 186–7 Sawa, K. 239n 337
Index scepticism 33n, 271 Schaffer, J. 89 Schul, Y. 188 Schultz, W. 237, 246 Schwabe, K. 237 Schweickert, R. 135 Schwenk, C. R. 175 Schwoerer, C. 161 Searle, J. 25, 28, 29, 89n, 89 Second Law of Thermodynamics 98–9 Segalowitz, S. J. 226 self-corrective mechanisms 87 self-efficacy 148, 150, 158, 161–2, 166, 252, 272 estimations of 153, 159–60 influence on behaviour 173n perceived 152n, 161 self-evaluation 157, 159, 168, 170 self-management 161 self-regulation 4n, 11, 85, 148, 149–53, 162, 163, 165, 167, 198 sensors 61, 62, 64, 69, 71, 74, 94 imprecision of 92–3, 94, 113 incorrectly indicating 128 noise of 78 temporal failure of 74 sensory-motor system 208 sequencing and synchronisation 75 servo-mechanisms 83 Sexton, T. L. 161 Seymour, B. 106n, 233n, 235, 241, 242–3 SGs (specific goals) 154, 156, 189, 199, 205, 208 Shafto, M. 126 Shane, M. 227n Shanks, D. R. 35n, 181n, 207 338
Shannon, C. E. 84, 121, 135 Sharit, J. 142, 143 Shaw, K. 157 Shea, N. 238, 239 Sheridan, T. B. 53, 79, 116, 120, 129, 130, 135, 136, 143, 260 Shevchenko, M. 110 Shirakawa, S. 236 short-term memory 135 signal detection theory 117, 135, 141 Simon, Herbert 83n simplicity 155 Singer, T. 233n single-difference theory 42 SISO (single-input/single-output) system 133 situation assessment 203 situation awareness 140–1 situational factors 175, 176 Sivashankar, N. 113 skill acquisition 179 Skinner, B. F. 23n, 204 Sloman, A. 51, 255, 262, 263 Sloman, S. A. 34n, 39n, 51, 195, 211 Slusarz, P. 179 Smart, J. J. C. 89–90nn SMAs (supplementary motor areas) 225 Smets, P. 92 Smith, K. 140 Smith, N. D. 161 Smith, P. J. 137 Smyth, P. 89 social networks 169 see also sociotechnical systems social psychology 4, 10, 14, 147–78, 188, 192, 216–17, 252, 256, 262, 263
Index social-cognitive theory 149, 155, 162–4, 166 sociotechnical systems 98, 121, 122, 123, 130 automated, dynamic and uncertain 124 Soon, C. S. 170 speech recognition 101 Spering, M. 161 sport 161 Squire, L. R. 207n St John, M. F. 181n stability 6, 64–7, 72, 118, 133, 232, 234, 265 standard deviation 132n Stanley, W. B. 187, 196, 197 Stanovich, K. E. 144n, 250n states and uncertainty 67–70 statistical generalization 38, 46, 47, 48 statistical mechanics 80, 82, 83, 88–9 statistical regularities making causal explanations from 43–9 spurious 38 statistics 69, 101, 130 Stavy, R. 181n, 194n Stenger, A. 225 Sternberg, R. J. 181n Stevenson, R. J. 200 Steyvers, M. 110, 210, 211, 241 stimulus–response theories 23, 58, 94, 133, 135, 153, 163, 164, 168, 221, 225, 246–7 variability 205 Stoltenberg, C. D. 151 Stoutland, F. 22 Strauss, O. 127, 128, 200
Strawson, G. 33n Stumpf, S. A. 161 subjective preferences 245 subway systems 49, 90, 118, 123, 124, 169, 178 Suchman, L. 77 Summer, A. 79 Summerfield, C. 235 Sun, J. 113 Sun, R. 179, 198 supervisory control 76, 79, 130 Sutton, R. S. 4n, 106, 111, 174, 255 Suzuki, W. 236–7 Swain, A. D. 141 Swann, W. B. 187 swarm intelligence 257, 259 Sweller, J. 180, 199 Swets, J. A. 135, 141 synthetic judgements 36, 37, 38 systems theory 81, 95, 100, 228 Tanaka, K. 236–7 Tannenbaum, A. 54 task analysis 129, 135, 140 task network models 132, 135 Taylor, P. J. 241 telecommunications 115, 121 temperature 17, 18, 20, 39, 48, 56, 66, 74, 93 body 97 temporal properties 29–31, 75, 109–10 Tenenbaum, J. B. 209, 210, 261 Terry, C. 179 Thatcher, J. B. 265n THERP (technique for human error rate prediction) 142 Thüring, M. 209 339
time delays 65–6, 78, 118, 183 dynamical systems with 73 stability criterion used to determine the effect of 67 time preferences 240 time-varying control systems 58, 66, 70, 125n, 132, 191 Tirosh, D. 194n Tobler, P. 238, 242 Todd, P. 250n tracking objects 207 traction control 113 traffic lights/control 17, 18, 20, 75, 76, 77 triggering 28–9, 134n, 167, 222, 237, 250, 262 Trumpower, D. L. 199, 200 Tucker, D. 227n Turing, Alan 89n, 90, 94, 118n Tversky, A. 161 Ullsperger, M. 232, 233 unconscious learning processes 179 unpredictability 20, 52, 54, 55, 64, 72, 73, 74, 76, 78 complete 244 knowledge and understanding of 213 temporal delays interpreted as the result of 195 Urbančič, T. 113 Uthurusamy, R. 90 utility functions 106 Uttal, W. R. 217n vague information/instructions 103, 157 validity 191 cue 205
value functions 106, 268 expected 268 ValuJet accident (1996) 119n Vancouver, J. B. 149, 150, 151, 153, 155, 160, 163, 165, 166, 252 Varaiya, P. 113 Vaughan, J. 133–4 vehicles 5–6 collision warning systems 135 see also cars velocity 146n ventromedial PFC 222 Vetter, P. 207 Vicente, K. J. 79, 117, 119, 120, 122, 126, 127, 130, 179, 201 Vigoda-Gadot, E. 158 visual stimuli 132 static and dynamic 135 visuo-motor tasks/learning 207, 227n vmPFC (ventromedial prefrontal cortex) 235 Vogt, B. A. 226 Vollmeyer, R. 180, 188, 198, 199, 200 voluntary control behaviours 87 Von Wright, G. H. 31 Vromen, J. 248n, 260
Index Wang, Z. 66, 75 Watanabe, M. 221, 222, 236 Watkins, C. J. C. 106, 111, 255 Weaver, W. 84, 121, 135 Weber, B. 230 Weeks, B. L. 66 Wegner, D. M. 169n, 170 Weng, J. 90 West, R. F. 250n White, C. 31 Wickens, C. 128, 131 Wickens, C. D. 186–7 Wiener, Norbert 5, 80, 82–9, 93–4, 95, 96, 118n, 121, 124n, 149, 165, 256n Wieslander, J. 105, 124 Wilkinson, L. 179 Williams, R. J. 109 Williams, S. L. 161 Windridge, D. 110 Winter, S. 237, 238 Witt, K. 179 Wittenmark, B. 105, 124 Wolf, S. 200 Wolpert, D. 77n, 110, 207
Wonham, W. M. 76 Wood, R. E. 156, 161 Woods, D. D. 129, 130, 135, 140 Woodward, J. 41, 42 Woolrich, M. W. 231, 234 Wörgötter, F. 257 work-related performance 161 working memory load 135 Wright, T. L. 161 Yang, F. 66, 74 Yoder, R. J. 160 Yokimoto, K. 106 Yoshida, W. 106 Yow, A. 132 Yu, A. 111, 228, 244 Yu, L. 67 Zanna, M. P. 160 Zaug, J. M. 66 Zhang, X. 179 Zimmer, C. 265n Zipser, D. 109 Index compiled by Frank Pert