COGNITIVE ECONOMICS: NEW TRENDS

CONTRIBUTIONS TO ECONOMIC ANALYSIS 280

Honorary Editors: D.W. JORGENSON, J. TINBERGEN†
Editors: B. BALTAGI, E. SADKA, D. WILDASIN

Amsterdam · Boston · Heidelberg · London · New York · Oxford · Paris · San Diego · San Francisco · Singapore · Sydney · Tokyo
COGNITIVE ECONOMICS: NEW TRENDS

Edited by
Richard Topol, CREA, Ecole Polytechnique, Paris, France
Bernard Walliser, Paris Sciences Economiques, Paris, France
Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK

First edition 2007

Copyright © 2007 Elsevier B.V. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://elsevier.com/locate/permissions and selecting "Obtaining permission to use Elsevier material".

Notice: No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.

ISBN-13: 978-0-444-52242-9
ISBN-10: 0-444-52242-5
ISSN: 0573-8555

For information on all Elsevier publications visit our website at books.elsevier.com

Printed and bound in The Netherlands
Introduction to the Series

This series consists of a number of hitherto unpublished studies, which are introduced by the editors in the belief that they represent fresh contributions to economic science. The term 'economic analysis' as used in the title of the series has been adopted because it covers both the activities of the theoretical economist and the research worker.

Although the analytical methods used by the various contributors are not the same, they are nevertheless conditioned by the common origin of their studies, namely theoretical problems encountered in practical research. Since, for this reason, business cycle research and national accounting, research work on behalf of economic policy, and problems of planning are the main sources of the subjects dealt with, they necessarily determine the manner of approach adopted by the authors. Their methods tend to be 'practical' in the sense of not being too far remote from application to actual economic conditions. In addition, they are quantitative.

It is the hope of the editors that the publication of these studies will help to stimulate the exchange of scientific information and to reinforce international cooperation in the field of economics.

The Editors
Contents

LIST OF CONTRIBUTORS

INTRODUCTION
Bernard Walliser and Richard Topol

PART I: DECISION AND BELIEFS

CHAPTER 1. DECOMPOSITION PATTERNS IN PROBLEM SOLVING
Massimo Egidi
1. Introduction
2. Invariant decomposition patterns in puzzle solving
   2.1. Decomposition patterns
   2.2. Some general properties of puzzle decomposition – triangularity
   2.3. Applying the backward branching procedure in the search for shortest paths
3. Conjectures and biases
   3.1. Fitness
   3.2. Weakly suboptimal strategies and their local stability
   3.3. Landscape
4. An application: MiniRubik
   4.1. Optimal solutions
   4.2. Comparing different decomposition patterns
5. Preliminary experiments
6. Concluding remarks
References
Appendix

CHAPTER 2. IMPOSSIBLE STATES AT WORK: LOGICAL OMNISCIENCE AND RATIONAL CHOICE
Mikaël Cozic
1. Introduction
2. Logical omniscience in epistemic logic
   2.1. Epistemic logic
   2.2. Three solutions to logical omniscience
3. The probabilistic case
   3.1. Probabilistic counterpart of logical omniscience
   3.2. Nonstandard implicit probabilistic structures
   3.3. Special topics: deductive information and additivity
   3.4. Nonstandard explicit probabilistic structures
4. Insights into the decision-theoretic case
   4.1. Choice models without logical omniscience
   4.2. Open questions
5. Conclusion
Acknowledgments
References
Appendix

CHAPTER 3. REFERENCE-DEPENDENT PREFERENCES: AN AXIOMATIC APPROACH TO THE UNDERLYING COGNITIVE PROCESS
Raphaël Giraud
1. Introduction and overview of the results
2. The model
   2.1. Primitives of the model
   2.2. Axioms
3. A representation theorem for RDP
   3.1. The theorem
   3.2. Interpretation of the representation theorem
   3.3. Strength of the status quo bias and desirability of options
4. Conclusion
5. Proofs
Acknowledgment
References

CHAPTER 4. A COGNITIVE APPROACH TO CONTEXT EFFECTS ON INDIVIDUAL DECISION MAKING UNDER RISK
Boicho Kokinov and Daniela Raeva
1. Introduction
2. Approaches to risk understanding
   2.1. Sociology
   2.2. Economics
   2.3. Psychology
   2.4. Cognitive science
   2.5. Cognitive economics
3. A DUAL-based approach towards decision-making under risk
   3.1. Prediction
4. Experimental study
5. Conclusions
Acknowledgments
References

PART II: GAMES AND EVOLUTION

CHAPTER 5. ON BOUNDEDLY RATIONAL RULES FOR PLAYING NORMAL FORM GAMES
Fabrizio Germano
1. Introduction
2. Preliminary notions
3. Rules for playing normal form games
4. Equilibrium analysis of rules
5. Evolutionary analysis of rules
   5.1. Iterated strict dominance
   5.2. Nash equilibria
6. Conclusion
Acknowledgements
References

CHAPTER 6. CONTAGION AND DOMINATING SETS
Jacques Durieu, Hans Haller and Philippe Solal
1. Introduction
2. Preliminaries
3. Contagion in regular graphs
4. Fixed points and two-period limit cycles
5. Applications
6. Conclusions and ramifications
Acknowledgements
References

CHAPTER 7. SELECTIVE INTERACTION WITH REINFORCING PREFERENCE
Akira Namatame
1. Introduction
2. The inverse problem
3. Classification of social interactions
4. Rational decisions of conformists and nonconformists
5. Heterogeneity in preferences
6. Selective interaction of heterogeneous agents
7. Selective interaction with reinforcing preferences
8. Conclusion
References

PART III: ECONOMIC APPLICATIONS

CHAPTER 8. CHOICE UNDER SOCIAL INFLUENCE: EFFECTS OF LEARNING BEHAVIOURS ON THE COLLECTIVE DYNAMICS
Viktoriya Semeshenko, Mirta B. Gordon, Jean-Pierre Nadal and Denis Phan
1. Introduction
2. The model
3. Learning binary choices from experience
4. Population characteristics and equilibrium properties
5. General simulation settings
   5.1. Dynamics
   5.2. Parameters and initial states
   5.3. Stationary states
6. Simulation results
   6.1. Myopic fictitious play
   6.2. Time-averaged reinforcement learning
   6.3. Weighted belief learning
   6.4. Time-decay fictitious play
7. Conclusion
Acknowledgements
References

CHAPTER 9. COGNITION, TYPES OF "TACIT KNOWLEDGE" AND TECHNOLOGY TRANSFER
Andrea Pozzali and Riccardo Viale
1. Introduction
2. Different types of tacit knowledge
3. Modalities of acquisition, codification and transfer of different types of tacit knowledge
4. Conclusions: tacit knowledge as an explicative factor in the configuration of systems for technological transfer
References

CHAPTER 10. OVERCONFIDENCE, TRADING AND ENTREPRENEURSHIP: COGNITIVE AND CULTURAL PROCESSES IN RISK-TAKING
Denis Hilton
1. What is overconfidence? Two different paradigms in psychology
2. Overconfidence in trading: miscalibration vs. positive illusions?
   2.1. Positive illusions vs. realism: happier but poorer?
3. Entrepreneurship: factors predicting career choice, risk-taking and success
   3.1. How can psychology explain entrepreneurial behaviour?
   3.2. Culture, risk-taking and entrepreneurship
4. Summary and conclusions
Acknowledgement
References

EPILOGUE
Alan Kirman

SUBJECT INDEX
List of Contributors

Mikaël Cozic
Département de Philosophie & Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris, France

Jacques Durieu
CREUSET, Université de Saint-Etienne, Saint-Etienne, France

Massimo Egidi
Facoltà di Economia, Luiss University, Rome, Italy

Fabrizio Germano
Departament d'Economia i Empresa, Universitat Pompeu Fabra, Barcelona, Spain

Raphaël Giraud
CRESE, Université de Franche-Comté, Besançon, France

Mirta B. Gordon
Laboratoire Leibniz, IMAG, Grenoble, France

Hans Haller
Department of Economics, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Denis Hilton
DSVP, Université de Toulouse-Le Mirail, Toulouse, France

Alan Kirman
GREQAM, Marseille, France

Boicho Kokinov
Central and East European Center for Cognitive Science, Department of Cognitive Science and Psychology, New Bulgarian University, Sofia, Bulgaria

Jean-Pierre Nadal
Laboratoire de Physique Statistique, Ecole Normale Supérieure, Paris, France

Akira Namatame
Department of Computer Science, National Defense Academy, Yokosuka, Japan

Denis Phan
CREM, Université de Rennes, Rennes, France

Andrea Pozzali
Dipartimento di Sociologia e Ricerca Sociale, Università degli Studi di Milano-Bicocca, Milan, Italy

Daniela Raeva
Central and East European Center for Cognitive Science, Department of Cognitive Science and Psychology, New Bulgarian University, Sofia, Bulgaria

Viktoriya Semeshenko
Laboratoire Leibniz, IMAG, Grenoble, France

Philippe Solal
CREUSET, Université de Saint-Etienne, Saint-Etienne, France

Richard Topol
CREA, Ecole Polytechnique, Paris, France

Riccardo Viale
Laboratorio di Scienze Cognitive e della Complessità, Fondazione Rosselli, Turin, Italy

Bernard Walliser
Paris Sciences Economiques, Paris, France
Introduction*

Bernard Walliser and Richard Topol
The expression 'cognitive economics' associates two loaded terms. 'Economics' refers to the coordination of exchanges of goods among several agents and, more generally, to the global effects of multilateral relations between heterogeneous actors. Its main concern is the link between two organizational levels: the micro-level of individual behaviors and the macro-level of global effects. These effects are considered in a broad sense, including not only collective effects resulting from a combination of individual ones, but also social effects which genuinely emerge from individual ones. 'Cognitive' expresses that an agent is able to gather pieces of information and to treat them in a symbolic way by various modes of reasoning. Its main concern is the link between two levels of existence: the physical sphere of material objects and the psychical sphere of mental states. These phenomena are also taken in a broad sense, which includes not only reasoning about beliefs, but all mental processes oriented toward decision-making.

Accordingly, cognitive economics deals with the links between the cognitive and social aspects of individual behaviors. Hence, it relates a 'theory of action', which studies how actors behave together and achieve some social goals, to a 'theory of mind', which studies how individuals perceive their environment and reason about acting on it. In fact, the relationships between cognitive and social aspects of agents' behavior work in both directions. In one direction, cognitive economics concerns the cognitive factors of social phenomena, operating directly on the actors (individual reasoning) or on their relations (symbolic links). In the other direction, it studies the social influence on cognition, exerted on individual mental states (social conditioning) or on interpersonal relations (altruistic concerns).
* This book is based on the first European Conference on 'Cognitive Economics' (ECCE1), held in the castle of Gif-sur-Yvette (September 2004). The editors thank the Department of Humanities and Social Sciences of CNRS for making the organization of the conference possible (through a network on Cognitive Economics), the Scientific Committee of the conference for selecting the papers, and the association ADRES (theoretical economics) and the Institute of Complexity for financial help. They also especially thank D. Hilton, who gave special help to the conference, and P. Bourgine for his many suggestions at each step of the process.
CONTRIBUTIONS TO ECONOMIC ANALYSIS, VOLUME 280
ISSN: 0573-8555  DOI: 10.1016/S0573-8555(06)80001-5
© 2007 ELSEVIER B.V. ALL RIGHTS RESERVED
Cognitive economics is related to another trend called 'social cognition', but whereas the latter insists essentially on the cognition brought by social structures, the former considers in addition the internal deliberation of actors.

1. Ontology of cognitive economics

Cognitive economics adopts an ontology which stems from economics (and especially game theory), but is completed by notions coming from cognitive science. Three basic entities are considered: agents, institutions (institutional environment) and nature (material environment). Agents interact with one another under the influence of nature and with the intermediation of institutions. Moreover, some prior relations, permanent rather than transitory, may govern these interactions. Interactions are material when they deal with exchanges of goods, or symbolic when they deal with exchanges of information, the most usual relations involving both dimensions simultaneously. Agents follow a deliberation process in which their intended actions are determined by mentally combining their beliefs and preferences. In particular, they display a strategic behavior, since each agent acts with reference to the others. Nature and institutions behave in a more autonomous and mechanical way, even if their behavior is affected by some random factors.

According to this ontology, a 'sophisticated methodological individualism' is generally adopted. An influence loop relates the micro-level and the macro-level in both directions. In a bottom-up approach, agents' actions, states of nature and institutional signals produce some collective phenomena (homogeneous behaviors, original structures or new entities, especially institutions) considered as emergent. In a top-down approach, collective phenomena act on the mental states of the agents or more directly on agents' behavior. However, the two opposite effects act at different time scales: if global phenomena emerge in the long term from individual behaviors and relations, these phenomena act on the agents in the short term. Moreover, the two opposite effects are of a material as well as symbolic nature: if agents are more or less aware of the social regularities they are creating, institutions influence the agents by physical constraints as well as by persuasion.

Individual agents are endowed with mental states considered at different levels. They consist of basic entities, such as beliefs or desires. They incorporate rules which apply to states, such as belief revision rules or reasoning modes. They even incorporate meta-principles, either meta-entities (meta-representations assessing the basic representations) or meta-rules (rules stating in what context to shift from one behavior or learning rule to another). All these elements are subjective, since they vary from one individual to another, even if the dispersion decreases with the level considered. They evolve through time due to past experience, institutional signals or simply age, even if their evolution is slower the higher the level considered. They allow the individuals to make mental simulations
of the external world; hence they are intentional in the philosophical sense. It is generally stated that the ontology of the agent is the same as that of the modeler. In particular, the agent reasons in an 'ontological triangle' formed of himself, the others and nature.

An open question concerns the advantage of introducing, besides the social, behavioral and mental levels, an infra-individual level: the neural level. The question consists in assessing whether the social and the neural levels are fully linked by the individual level or whether there are shortcuts between them. Traditionally, in economics, social effects are explained by individual actions induced by agents' mental states. Likewise, in neurophysiology, mental states are explained by the neural activity of the brain. The question becomes whether some neural activities act directly on social phenomena without going through mental states and individual behaviors. It is probable that some neural activities act on human behavior without passing through mental states. But it is rather obvious that neural activity does not act directly on the social level without transiting through individual behavior. Hence, the neural level will be considered only in studying some behavioral modes.

2. Four topics of cognitive economics

The first topic deals specifically with the influence exerted by external stimuli on an individual's mental state. First, an agent receives from his informational neighborhood various physical signals that he has to treat. These signals concern direct observations, reports by other agents or news brought in by the media. For instance, a speculator directly observes the price of assets, indirectly observes what others buy or sell, and learns statistics about general activity. In order to become real information, the signal has to be duly interpreted, by comparing it with already accepted beliefs and by using appropriate codes. For instance, when a firm observes a decrease in its sales, it may attribute it to a change in consumers' tastes or to an increase in competition. Finally, the interpreted signal fits into categories, which are pregiven to the agent or originally constructed by him and regularly revised. For example, a consumer observes states of his material environment, checks other agents' transactions and feels some utility from his past consumptions.

Second, an agent's preferences are conditioned in several ways by his social or institutional environment. Preferences are influenced by past experience, since an agent adapts to the observed performances of his actions, directly felt or expressed by other agents. For example, a consumer learns progressively about the quality of experienced goods and revises his aspiration levels accordingly. Preferences are influenced by social norms (a kind of institution) through incentives and sanctions, of a material or psychological form. In particular, a producer is constrained by laws which forbid him to trade some commodities, or by norms which dissuade him from treating his employees badly. Finally, even if autonomous, preferences can only act upon objects already categorized by beliefs. For example,
the assessment of a good requires comparing it with similar goods, while the application of a norm to some context requires a clear taxonomy of that context.

Third, an agent's beliefs are changed in parallel by assimilating factual information acting as a message. Whatever their structure (syntactic/semantic, set-theoretic/probabilistic), beliefs are revised according to some 'revision context' (revising, updating, focusing). For instance, when feeling a shift in quality for some good, a consumer has to decide whether he is just judging this quality more accurately or whether the quality has really changed. In order to process his information, the agent uses other modes of reasoning as well, even if many of them can be formally expressed as belief revision. For instance, a producer uses abductive reasoning when analyzing the nature of demand, nonmonotonic reasoning when expecting the desirability of a good, and conditional reasoning when forecasting his opponent's behavior. Finally, even if isolated, beliefs are shaped by preferences, since only phenomena relevant to the agent are really modeled by him. In particular, agents frequently have segmented 'belief areas', which correspond to specific activities and which are revised separately.

A second topic of cognitive economics deals with the internal deliberation process of each agent. First, the agent transforms his beliefs into an adapted decision-making framework, stressing the importance of uncertainty. Uncertainty is related to nature, to the other agents or to the agent himself. For instance, a negotiator is more or less aware of the behavior of nature, of others' mental states and even of his own mental states. Independently, uncertainty concerns past events (factual uncertainty), permanent regularities (structural uncertainty) or future events (strategic uncertainty). For instance, a manager is more or less aware of others' past actions, mental states and intended actions, all of them leading evidently to crossed beliefs (beliefs about others' beliefs). Concretely, an agent reduces uncertainty by transforming factual information into structural information, then structural information into strategic information. For instance, an agent relies on others' observed actions in order to reveal their preferences (abductive reasoning), or relies on others' mental states in order to simulate their behavior in original circumstances (conditional reasoning).

Second, the agent combines his beliefs and preferences, together with his opportunities, in order to make a choice. Most important, the computational limitations faced by an agent when deliberating entail a bounded rationality. For instance, a firm is unable to solve combinatorial problems which imply a high level of complexity. Besides, the agent faces framing effects, since the way the choice is expressed may affect the result. For instance, an investor may consider a reference level separating gains and losses in order to judge the return of an investment. Finally, the agent uses mental states (or directly a choice rule) depending on the choice context. For instance, a consumer may use different means to choose a pen or a car, holidays or sports, not to speak of a wife.

Third, in a dynamic context, agents implement specific learning rules based on bounded rationality. In belief-based learning, the main driving force is belief revision about nature's state or the others' behavior. For instance, a producer may expect his opponent's future action from his past actions and choose an action accordingly. In reinforcement learning, the main driving force is the reinforcement of actions which gave the best performances in the past. For instance, a consumer will adopt the brand of a good with which he was the most satisfied in the past. Here again, the learning process is adapted to the context and is organized in embedded levels. For instance, a firm uses some routine for producing a given good, but may shift to another one if it is locked into weak efficiency.
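These two families of learning rules can be contrasted in a few lines of code. The following sketch is only illustrative: the 2×2 coordination payoffs, the action labels and the propensity-based choice rule (in the spirit of Roth–Erev reinforcement learning) are assumptions made here, not taken from the text.

```python
import random
from collections import Counter

ACTIONS = ["A", "B"]

def payoff(own, other):
    # Hypothetical pure coordination payoffs: 1 if actions match, else 0.
    return 1.0 if own == other else 0.0

class BeliefBasedLearner:
    """Revises a belief (empirical frequency of the other's actions), then best-replies."""
    def __init__(self):
        self.counts = Counter()

    def choose(self):
        if not self.counts:
            return random.choice(ACTIONS)
        forecast = self.counts.most_common(1)[0][0]             # predicted opponent action
        return max(ACTIONS, key=lambda a: payoff(a, forecast))  # best reply to forecast

    def update(self, other_action, own_payoff):
        self.counts[other_action] += 1                          # belief revision step

class ReinforcementLearner:
    """Reinforces its own past actions in proportion to the payoffs they earned."""
    def __init__(self):
        self.propensity = {a: 1.0 for a in ACTIONS}
        self.last = ACTIONS[0]

    def choose(self):
        total = sum(self.propensity.values())
        r, acc = random.uniform(0, total), 0.0
        for a in ACTIONS:
            acc += self.propensity[a]
            if r <= acc:
                self.last = a
                break
        return self.last

    def update(self, other_action, own_payoff):
        self.propensity[self.last] += own_payoff                # reinforcement step

p, q = BeliefBasedLearner(), ReinforcementLearner()
for _ in range(200):
    a, b = p.choose(), q.choose()
    p.update(b, payoff(a, b))
    q.update(a, payoff(b, a))
print(p.counts, q.propensity)   # both rules typically settle on one common action
```

Run for a few hundred rounds, both rules typically settle on a common action; the chapters by Germano and by Semeshenko et al. study far richer versions of these same two driving forces.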
A third topic of cognitive economics concerns the combination of actions an agent may implement in parallel or sequentially. First, when choosing an operational action, the agent can proceed beforehand to an informational action giving him deeper insights into the former. When buying information from external offices, the agent faces the problem of determining the value of information, which may be positive or negative, at least in games. For instance, a consumer may visit several stores before buying a product in order to compare their prices. When gathering information during an operational process, the agent faces an 'exploration–exploitation' dilemma, since he may shift his spontaneous short-term action in order to gather information useful in the long term. For instance, an investor may implement a more or less flexible investment in order to take advantage of the information he will get from its partial implementation. An agent may even implement a deliberative action by asking for a choice study from a specialized entity. For instance, a firm frequently uses external consultants in order to make its beliefs and preferences more precise or to prepare its deliberation.

Second, the other way around, an agent diffuses information directly or through his actions, in a strategic way and according to his interests. The agent diffuses the information he possesses when he assumes that it will be profitable for him. For instance, if a producer has low costs, it may be in his interest to announce this in order to dissuade potential opponents from entering the market. Conversely, the agent keeps the information for himself, or communicates it in a vague form, when he thinks that it can be used against him. For instance, when discovering a new good or a new technology, a research center will hide it until filing a patent. Finally, the agent may even diffuse false information in order to get a better result, as in bluff or reputation effects. For instance, even if soft, a firm may act in such a way that it is considered as tough by potential opponents.

Third, information appears as a strategic item which diffuses or not from one source to all agents. In conditions where the interests of agents are independent or complementary, information possessed by an agent may spread out completely. For instance, in some town district, information about the quality of a restaurant can diffuse very rapidly. In conditions where the interests of the agents are opposed, information may stay completely hidden. For instance, in usual auctions, information about the reservation price of buyers is never revealed. Many situations are in-between, giving rise to partial diffusion, especially through permanent networks linking the agents together. For instance, in
an ordinary market, the information about the quality of a good is more or less reflected in its price, even if other factors act on prices.

A fourth topic of cognitive economics concerns the coordination of agents' actions through various channels. First, the agents may be coordinated by beliefs which give them a common base for reasoning. Some beliefs act as institutions in order to create common references among agents concerning their environment. For instance, agents on a goods market have a common taxonomy of goods, while agents on a job market have a common taxonomy of qualifications. Moreover, some equilibria can be justified by the precoordination of agents through some common beliefs (on the situation's structure and on the agents' rationality). For instance, producers and consumers on the pork market may compute the equilibrium price by pure reasoning. Likewise, the selection of an equilibrium state is achieved by some common conventions culturally imposed. For instance, drivers on a road may agree on the convention of driving on the right or of priority to the right.

Second, the agents may be coordinated by the work of time in repeated interactions in a stationary environment. Besides individual devices like threats and promises, some institutions like 'trust' facilitate coordination. For instance, since deliveries of goods and payments are not always synchronous, a minimal form of trust is necessary to allow exchanges. Moreover, some equilibria are justified as asymptotic states of joint learning processes of agents. For instance, players in a duopoly may arrive at an equilibrium by simultaneously learning the demand and adjusting to one another. Finally, an equilibrium state is selected by initial conditions, but more efficiently by specific random factors. For instance, in a competition between firms for a new technology, the socially best one may be selected by random influences.

Third, the agents contribute to the genesis of institutions, which constrain their actions and make them more predictable. An institution is generally considered as a device able to respond collectively to factual, structural or strategic uncertainty. For instance, if statistical institutes compensate for factual uncertainty, insurance companies help with strategic uncertainty. An institution emerges, rather than by a voluntary process, by an 'eductive process' or an 'evolutionary process'. For instance, if a financial market may be created by a collective decision, money emerged rather spontaneously by an evolutionary process. Once emerged, an institution is naturalized (recognized and autonomized by the agents) and furthermore legitimized (accepted and eventually legalized by a public authority). For instance, if a queue forms spontaneously for obtaining a public good, it is progressively recognized by the agents and even imposed.
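The duopoly example just given can be made concrete with a minimal best-reply sketch, a hypothetical illustration in which the linear demand and cost parameters are invented and the learning of the demand is collapsed into a known profit function for brevity:

```python
# Linear Cournot duopoly: price = a - b*(q1 + q2), constant marginal cost c.
a, b, c = 100.0, 1.0, 10.0         # hypothetical demand and cost parameters

def best_reply(q_other):
    # Quantity maximizing profit (a - b*(q + q_other))*q - c*q against the rival.
    return max(0.0, (a - c - b * q_other) / (2 * b))

q1, q2 = 5.0, 60.0                 # arbitrary initial quantities
for period in range(25):           # each firm adjusts to the other's last move
    q1, q2 = best_reply(q2), best_reply(q1)

print(round(q1, 2), round(q2, 2))  # both approach (a - c)/(3b) = 30.0, the Nash equilibrium
```

Starting from arbitrary quantities, the alternation of myopic best replies converges to the Cournot–Nash outcome, an equilibrium justified as the asymptotic state of a joint adjustment process rather than computed by pure reasoning.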
3. Methodology of cognitive economics

Coming now to a methodological point of view, cognitive economics is clearly in-between economics and cognitive science.
From economics, cognitive economics adopts the construction of rigorous models, based on first principles and highly idealized. Likewise, in order to test these models, it resorts heavily to the statistical methods of econometrics. From cognitive science, cognitive economics adopts the computer simulation of models expressed as multi-agent systems. Likewise, in order to validate these models, it resorts essentially to the results of laboratory experiments. Hence, cognitive economics uses formal languages and even adds the tools of epistemic logic to the traditional mathematical tools. In parallel, cognitive economics gives more importance to experimental economics and especially to behavioral economics.

The models have first to propose formal explanations of some economic phenomena. Of course, any formal explanation is preceded by a pre-modeling phase where the concepts are qualitatively constructed and the mechanisms stated in literary form. What is modeled is either the link between micro and macro levels, or the link between mental states and actions, more seldom a combination of them. Explaining means that some underlying mechanisms are made explicit, and not only a stable relation postulated. Of course, what is being looked for are 'minimal' explanations, i.e. the simplest assumptions that can yield the phenomenon under consideration. From that point of view, cognitive economics should be reticent toward analogies coming from physics (especially statistical physics) or from biology (especially evolutionary biology). An economic problem has first to be expressed in its specific form, and an analogy helps only when a similar formal structure has already been studied and solved in other contexts.

The models have further to be validated against the empirical base. At the individual level, observations concern the actions and even, in some conditions, the mental states. The relation between mental states and individual behavior is studied in experimental economics and especially in 'behavioral economics'. At the social level, some regularities or some entities like institutions are directly manifest. The relation between individual behavior and social entities may in part be studied by experimental economics, but is rather tested on historical data. At the neural level, the neural activity related to some reasoning tasks is more and more observed. The emerging 'neuroeconomics' focuses on the neural activity associated with individual choices in a decision-making or game context. But the link between that activity and the mental states or operations stays very crude. It will probably be necessary to find the relation between very basic mental beliefs (correlations, mental maps) or operations (deduction, abduction) before considering complex reasoning modes.

The models may finally be pragmatically exploited for forecasting and choice. Cognitive economics, more than classical economics, is obviously oriented toward a positive rather than a prescriptive approach. It aims at being grounded on more realistic assumptions, especially concerning individual behavior. The individual is not only intentional as before, but becomes subjective, interpretative and reflexive. However, he always constitutes an elementary brick which at best approximates the actual agent in some contexts. Nevertheless, forecasts may become more precise, at least at a collective level where the social phenomena
result from averaging individual behaviors and constraining them by permanent relations. Even more, prescriptions may become more reliable, still at a collective level, at least since they will be based on better forecasts. But the necessary definition of collective choice criteria is not within the scope of cognitive economics and has not really improved.

4. Limits of cognitive economics

Even if slowly emerging, cognitive economics does not yet appear as a well-defined and integrated discipline as concerns its theory and methodology. On the one hand, cognitive economics is still weak on some classical problems which are obviously in its scope. As far as mental states are concerned, it says nothing about the role played by the agents' language, even if many reasoning and communication processes cannot be considered as neutral toward this language. As far as individual behavior is concerned, it is very imprecise about what is really meant by an agent's interpretation of some outside events, even if it insists on that central notion. As far as social effects are concerned, it deals with institutions as a hotchpotch concept which covers all entities differing from agents or objects, and gives no precise assumptions about their possible structure and mode of influence.

It may well be that cognitive economics has not succeeded in clarifying some elements that it has emphasized. As far as mental states are concerned, it says little about how agents' emotions like envy, fear or empathy (even if they are badly defined) may be relevant in relation to forecasting or decision-making. As far as individual behavior is concerned, many static bounded-rationality behavior rules or dynamic learning rules are not profoundly related to what one knows about the limited abilities of the human mind. As far as social effects are concerned, many economic phenomena such as financial speculation are still not clearly related to behaviors and relations incorporating cognitive as well as symbolic aspects.

On the other hand, no unified 'style' of modeling has emerged as a characteristic feature. Theoretical models of decision-making or reasoning are grounded on an abstract axiomatic basis, and they establish some coherence results which are loosely interpreted by the modeler. Empirical models of job markets or organizations just articulate some stylized facts about some concrete domain and infer some consequences, without stating what accuracy can be attributed to them. Simulation is realized in a blind way, for a large variety of contexts made feasible by computers, without even making precise which are the more realistic past ones and the more relevant future ones.

It seems that the integration of empirical work and theoretical work is far from being systematic. Economists generally work in a projective way, by testing prior theories against available historical data or constructed experimental data, but are not always ready to revise the aspects of their models that fail. They are less prone to work in an inductive way, by internalizing in their models some
results obtained by economic enquiries or by psychological 'theory-free' experiments. They show superb ignorance of a whole set of data, gathered by connected disciplines, concerning for instance the cognitive development of children, the medical pathologies of decision-makers or the organization of cognitive-oriented groups.

5. Structure of the book

Following this introduction, the book is organized in ten chapters and an epilogue (by A. Kirman). They cover a wide spectrum of economic problems but focus only on some topics of the research program previously sketched. Four articles deal with decision theory, and all consider some aspects of the deliberation process followed by a decision-maker. Three articles deal with game theory, and are all concerned with the learning process of several players in a repeated game. Three articles deal with economic applications, and are concerned with entrepreneurial behavior, consumer interactions and knowledge economics. The articles also implement various methods: four papers are devoted to analytical work, two to simulations, three to empirical work and one to conceptual clarification.

The article by Egidi considers an actor who solves a combinatorial problem by decomposing this problem into subproblems and optimizing each of them. The global solution he achieves is generally suboptimal, in relation to the pattern of decomposition that he adopted. The conditions under which a decomposition process gives rise to a suboptimal solution are theoretically explored. The suboptimality generally results from the process of categorization that governs the creation of a decomposition pattern. Moreover, the persistence of a biased behavior is explained by resorting to the notion of weakly suboptimal strategy and related properties. An empirical application to a simplified version of Rubik's cube is developed, showing that individuals are locked into weakly suboptimal strategies.

The paper by Giraud studies axiomatically the deliberation process of a decision-maker grounding his preferences on some reference point. After defining a set of preferences on couples of consequences and reference points, several axioms are stated. A representation theorem indicates how the decision-maker introduces the reference point in a generalized utility index. An interpretation of the cognitive process followed by the decision-maker is that the reference point has two influences: it modifies the selection of implicit choice criteria, and it modifies the combination rule of these criteria. This last notion of desirability of an action can further be related to the strength of the status quo bias. This article shows clearly how some cognitive processes associated with decision-making and empirically sustained can be formalized in a traditional axiomatics.

The paper by Cozic attempts to ground and to formalize the bounded rationality of a decision-maker on a lack of logical omniscience. Logical omniscience is defined as the fact that an agent has perfect capacities for inference and is able to
deduce all consequences of what he already knows. In epistemic logic, a failure of logical omniscience can be formalized in several ways, but best by considering nonstandard (or impossible) worlds in some Kripke structure. By means of probabilistic logics, this solution can be extended from set-theoretic beliefs to probabilistic beliefs. It can then be applied to the consequences of the available actions, assumed to be impossible for the decision-maker to compute precisely. This article shows how basic tools from epistemic logic can be internalized in decision theory in order to study the cognitive basis of an actor's rationality.

The article by Kokinov and Raeva presents a tool able to incorporate some context effects in individual decision-making under risk. It first recalls the different ways decision under risk has been considered in sociology, economics, psychology as well as cognitive science, with special stress on 'framing effects'. The authors' aim is then to design an integrated computational model, which articulates the decision process with other cognitive operations, and which is able to explain or predict the context-sensitivity of decision-making. A first step is achieved with an already existing cognitive architecture, called DUAL, which can be adapted to decision-making. It is able to explain a card experiment where the choice of a decision-maker is influenced by a nonrelevant contextual item, the picture on the back of the card.

The article by Germano analyzes the learning behavior of players successively and randomly playing different games, endowed with rules (algorithms) which apply to all these games. Given a set of normal form games and a set of adapted rules, it associates with them an average normal form game to which the same rules can be applied, hence for which associated equilibria can be computed. It then compares a global reinforcement learning process applied to the set of games with a similar learning process applied to the average game. It shows especially that the rules which are dominated in the average game disappear in the global learning process, and that if the global process converges, the limit point is a Nash equilibrium of the average game. This article illustrates the power of a theoretical analysis which predicts how a player progressively selects a rule able to be applied to a variety of games, such a process being empirically tested in some contexts.

The article by Durieu et al. examines, in a repeated game, the possibility of a step-by-step contagion of some action according to the structure of the graph relating the players. Players are situated on a regular graph which is characterized by the existence of a dominating set, defined through the density of edges issuing from each node. Each player has a threshold behavior, meaning that he adopts one of two actions when a given proportion of his neighbors adopted that action in the past. The article proves analytically necessary and/or sufficient conditions for an action to spread from an initial subpopulation to the whole population in one or more steps, in relation to the existence of dominating sets. Similar conditions are given for convergence not toward a homogeneous fixed point, but toward a heterogeneous fixed point or a global two-period limit cycle. This article illustrates how analytical work makes it possible to state under what structural conditions some item shared by a subpopulation of agents is able to diffuse.
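A minimal simulation may help fix ideas about this threshold contagion mechanism. The ring graph, seed set and threshold value below are illustrative assumptions, not the analytical conditions derived in the chapter:

```python
# Threshold contagion on a ring where each agent is linked to its K nearest
# neighbours on each side (a regular graph). An agent adopts the action once
# at least `threshold` of its neighbours adopted it in the past.
N, K, threshold = 20, 2, 0.5
neighbours = {i: [(i + d) % N for d in range(-K, K + 1) if d != 0]
              for i in range(N)}
adopted = {i: i in {0, 1, 2} for i in range(N)}   # initial adopting subpopulation

for period in range(N):
    share = {i: sum(adopted[j] for j in neighbours[i]) / (2 * K)
             for i in range(N)}
    step = {i: adopted[i] or share[i] >= threshold for i in range(N)}
    if step == adopted:          # fixed point reached: contagion has stopped
        break
    adopted = step

print(f"{sum(adopted.values())} of {N} agents adopt")   # here: full contagion
```

With these numbers the action spreads from the three seeds to the whole ring; raising the threshold above one half instead freezes the process at a heterogeneous fixed point of the kind the chapter characterizes.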
The article by Namatame examines, in a repeated game, what structures may emerge from very simple behaviors. Players are situated on a torus and interact either globally or only with their direct neighbors. Each player uses a threshold behavior, meaning again that he adopts one of two actions when a given proportion of his neighbors adopted that action. The heterogeneous population is formed of two types of players, conformists and nonconformists. Each subpopulation is characterized by a distribution of thresholds, and the players are able or not to observe the others' type. The resulting structures are assessed by the players' average utility and utility dispersion. In a first simulation, the players are able to change their location, leading to a spatial structure which is segregated or pooling. In a second simulation, players adapt their preferences to their past performances, leading to a distribution of thresholds which is concentrated or dispersed. This article illustrates how simulation is able to treat many interactive situations giving rise to very different emerging structures.

The article by Semeshenko et al. presents a learning process of consumers facing a monopolist and subject to interpersonal influences. More precisely, at each period, a consumer may buy or not buy a homogeneous good at some fixed price, but his utility depends not only on an idiosyncratic willingness to pay (regularly distributed), but also on externalities due to others' consumption. The learning process of each consumer is an EWA process, considered in fact under four classical modalities, and it is implemented by the different players in a parallel or sequential way. Depending on several parameters (price, intensity of interactions, learning features), the simulated learning process converges or not toward a Nash equilibrium state (sometimes multiple) in an observed time of convergence. This article well illustrates how some tools developed by statistical physics can be used to study the asymptotic states of coupled individual behaviors.

The article by Hilton overviews the links between overconfidence and entrepreneurship, by articulating a variety of empirical results with theoretical models. It first distinguishes two notions of overconfidence: miscalibration (overestimation of the accuracy of one's own judgment) and positive illusion (overestimation of one's own capacities). It further analyzes the link between overconfidence – opposed to realism – and trading, and more precisely entrepreneurship, as concerns its effect on risk taking and further on final performance. Of course, such a relation depends on many contextual factors, especially cultural traits, for instance whether the entrepreneur belongs to an individualistic or a more integrated society. This article illustrates how psychologists and economists may work together, by associating a projective and an inductive view, as concerns the cognitive aspects of individual behaviors as well as their collective consequences.

The article by Pozzali and Viale clarifies the recently introduced notion of tacit knowledge and examines its main properties. It distinguishes between three forms of tacit knowledge: knowledge as competence (physical skills which help to perform expert activities), as background (behavior rules which precondition social activities), and as cognition (cognitive rules which guide reasoning and
action). These forms differ not only by their degree of codifiability, but also by their means of acquisition and transmission (respectively imitation and apprenticeship, socialization, and implicit learning). Finally, the volume of each form of tacit knowledge in modern economies is more or less decreasing under the influence of the new technologies of information and knowledge. This article is a good example of the way epistemology is able to propose precise taxonomies (based on knowledge coming from the cognitive sciences) which are further used by economists in different fields.
Part I: Decision and Beliefs
CHAPTER 1
Decomposition Patterns in Problem Solving

Massimo Egidi

Abstract

The origin of biases in problem solving is analysed in the context of games and puzzles. To discover a game strategy, individuals decompose the problem according to their "intuitions", i.e. their ways of categorizing and conceptualizing the game's properties. It is shown that, in line with Bellman's principle, players who decompose a problem into subproblems and optimize each of them do not normally achieve an optimal solution of the original problem. The global solution to a problem may therefore be suboptimal, in relation to the pattern of decomposition that has been adopted. Biases are therefore interpreted as suboptimal behaviour originating in the decomposition pattern that individuals adopt, and ultimately in their categorization of the problem. After introducing a simplified version of Rubik's cube, two decomposition patterns are identified, one of which maintains optimality while the other is weakly suboptimal. It is shown, with preliminary experiments, that individuals who have discovered the weakly suboptimal solution behave in a way fully coherent with their strategy, and therefore that errors can be explained as rational behaviour within an incomplete representation.
Keywords: bounded rationality, categorization, cognitive bias, problem solving

JEL classifications: C70, C91, D83
CONTRIBUTIONS TO ECONOMIC ANALYSIS, VOLUME 280
ISSN: 0573-8555  DOI: 10.1016/S0573-8555(06)80002-7
© 2007 ELSEVIER B.V. ALL RIGHTS RESERVED
1. Introduction

One of the most widely known heuristics for solving a problem suggests decomposing it into parts (subproblems) which can be solved separately. The decomposition process is repeatedly applied to each subproblem until elementary subproblems, easy to solve, are identified. Applications to games and puzzles are very common: the Tower of Hanoi and Rubik's cube are typical contexts in which decomposition is usefully applied and makes it easier to discover a solution strategy. During decomposition, players identify progressively simpler subgames, until they reach elementary subgames that are easy to solve. Given the simplicity of these elementary subgames, players are supposed to be able to discover the optimal strategy for each of them. This is apparently the key to reducing the complexity of the search and obtaining optimal global solutions.

However, as we will see, when a problem is decomposed into parts, the pattern of decomposition that players – consciously or not – adopt introduces hidden suboptimalities. This happens as a consequence of a general property of decomposition: when we decompose a problem, even though we discover and apply the optimal solution to each subproblem, we generally get a suboptimal solution to the global problem. This property, related to Bellman's principle, will be illustrated in the next section, where we will discuss the conditions under which the optimization of all subproblems leads to the optimization of the global problem. These conditions are quite restrictive and generally do not emerge in the course of the "natural" human process of problem solving; humans in fact discover a solution through a process of symbolic manipulation of the categories with which they frame and represent the original problem. Categorization and classification help to simplify the representation of the problem, lead to the identification of subproblems, and therefore allow individuals to achieve an "easily solvable" representation of the problem. The "natural" process of problem solving therefore leads to a simple solution, not necessarily to an optimal one.

Puzzles provide a context in which the features of problem solving emerge clearly. In puzzles the optimal solution can be defined as the shortest path from the starting configurations to the goal. Given the enormous number of game configurations that would have to be analysed to get an optimal solution, players classify the states of the puzzle into a relatively low number of classes in order to obtain a simple representation of the solution: large sets of game configurations are aggregated into classes, and this categorization leads to the creation of a decomposition pattern. Categories are related to the process of abstraction and classification that players carry out while playing the game; categorization may be driven by the "salience" of some symbolic features of the configurations of the game. In Rubik's cube, for example, the disposition of the colours of the tiles along edges, corners and faces may be salient elements for categorizing classes of configurations. Players decompose the problem into steps, each of which is characterized by an initial and
a final configuration. In 2×2×2 Rubik, for example, a well-known strategy for beginners suggests three steps:

– The first step allows the player to put the bottom layer of the cube in the right position.
– The second step will get the top-layer pieces into the correct spot, but not necessarily in the correct orientation.
– In the third step the player will be untwisting the final pieces in the top layer.¹

At every step a subproblem is posed; the starting and ending positions that define every subproblem are sets of configurations that we will call "building blocks". For example, at the end of step two in the example above, the building block is the set of configurations where "the top-layer pieces are in the correct spot but not necessarily in the correct orientation". Every subproblem is therefore characterized by a pair of building blocks, respectively representing the class of starting configurations and the class of ending configurations. Each building block is described by means of the basic categories (here corners, edges, layers, faces and colours).

The suboptimality of a decomposition pattern is related to the distance of the building blocks from the goal: players following a decomposition pattern believe, at every step, that they are getting progressively nearer to the final configuration. But frequently the distance that players conjecture either is wrong or does not hold for all elements of the building blocks. For example, in Rubik's cube players may believe that a cube with three faces of the same colour is nearer to the goal than a cube with two faces of the same colour, which is not true for all dispositions of the elements of the cube. Consequently, the procedure they adopt may be suboptimal at least for some configurations. Owing to this distortion, players do not always approach the goal through the shortest path; on the contrary, at least for some configurations they achieve the goal following a tortuous path that in some steps moves farther from the goal.

Hence, biases in puzzle solving originate in the process of categorization and symbolic manipulation that gives rise to a decomposition pattern during the search for a solution to the game. Players adopt categories that lead them to a decomposition pattern in which every subproblem is easy to solve, and eventually optimally solved: but, according to Bellman's principle, even if all solutions to all subproblems are optimal, this does not guarantee that the global solution will be optimal. I will call a decomposition pattern "invariant" if, by optimizing separately the subproblems into which the problem is decomposed, we get an optimal solution.

It is intuitive that there are different categorizations and decomposition patterns for any given problem, some of which are invariant and keep
¹ http://puzzlesolver.com/puzzle.php?id=25
the features of the original problem, while the vast majority do not fully respect these features and generate decision biases. With the help of an example, we will illustrate the formal conditions under which a decomposition pattern applied to a puzzle maintains (or not) its original metric, and will explain the reasons why distortions of the metric give rise to suboptimal strategies. After illustrating some features of decompositions and the related categorizations, we will apply them to a game which is a simplified version of Rubik's cube; we will then analyse some experimental data, showing that a large part of the deviations from rational behaviour are the result of the adoption of "weakly suboptimal" strategies.

The notion of "weak suboptimality" is relevant for explaining the stability of suboptimal strategies, i.e. the fact that players persist in using patterns and strategies that are suboptimal. "Weak suboptimality" of a strategy means that the strategy prescribes paths to the goal that are optimal for some of the starting configurations but not for all of them. Therefore, when players solve the puzzle, for some of the starting configurations they will be confirmed in the optimality of their decisions and will not easily perceive that they have made a non-invariant, inefficient decomposition of the problem. Players may therefore have serious difficulty in perceiving the suboptimality of their strategy and in revising the related decomposition pattern.

The difficulty of discovering the suboptimality of a strategy is reinforced by a general phenomenon called "mechanization of thought". This tendency was demonstrated by Luchins (1942) and Luchins and Luchins (1950), who conducted experiments with subjects exposed to problems that had different solutions with different levels of efficiency. The authors show that subjects, having identified the optimal solution of a task in a given context, may "automatically" and systematically reuse that solution, applying it also to contexts where it proves to be suboptimal. Experiments with the game Target the Two (Cohen and Bacdayan, 1994; Egidi and Narduzzo, 1997) confirm that a similar process also characterizes teams' behaviour, in an even more evident and persistent manner: groups of subjects who must jointly solve a shared problem may remain even more stably "trapped" in suboptimal solutions than single individuals. In fact, while the difficulties encountered by a single subject in solving a problem in a new way depend on the possibility of discovering a new solution and are influenced by cognitive limitations on individual learning, this is even more difficult in a group: teams in fact need to discover and adopt an alternative solution jointly.

The explanation we will provide of the lock-in process into systematically biased decisions is therefore based on two different features: on the one hand, the difficulty of perceiving weak suboptimalities; on the other, the effect of the so-called "mechanization of thought" on reasoning. The "mechanization of thought" is related to the mental effort required to find a new solution, i.e. one based on the discovery of a new representation of the problem. In the experiment we will describe later, to discover a new
decomposition pattern, players have to re-codify the game and modify the categories they use. But re-codification requires abandoning the original classification and building up new classifications and new categories that generally describe the game properties with greater accuracy but less simplicity. Therefore, in many cases re-thinking a problem in order to reduce the biases leads to a "refinement" of the decomposition pattern and of the related categories: this implies that there may be a trade-off between the simplicity of representation of a strategy and its efficiency.
2. Invariant decomposition patterns in puzzle solving
In searching for a solution by decomposition, players may discover different strategies that differ slightly in terms of efficiency. This is the case of the most popular strategies circulating on the web for playing Rubik's cube: a large number of alternative strategies are proposed, none of which is evidently the best; the existence of a variety of strategies, not clearly distinguishable from the viewpoint of efficiency, is connected with the properties of decomposition patterns. In this section we briefly sketch some of these properties, so as to make possible a comparison between different patterns in terms of efficiency (degree of suboptimality) and simplicity of representation. In particular, we will describe, without formal demonstrations, some basic properties of invariant decomposition patterns, related to the Bellman principle of optimality.
Some preliminary definitions. In puzzles, a specific problem is defined by a pair of states, i.e. the starting and final states of the game, and a solution consists of a strategy which, starting from the initial state, allows the player to reach the final state.2 I refer to a specific problem because its reference elements, the starting state and the target state, are single states. In general, however, a problem is defined with reference to classes of states: the class P of starting states and the class G of goal states. A solution is then a strategy which, starting from each starting state p∈P, allows the player to reach some final state g∈G. To represent a puzzle formally, the following must be defined:
1. A set of states X = {x1, x2, x3, …, xn} of the game and two subsets, P and G, which are the sets of starting and goal (final) states, respectively. If the problem is specific, P and G each contain a single element.
2
The strategy can be represented by a list of conditions–actions, where each state of the game (condition) is associated with an action (the adequate move); the path from the initial state to the goal is generated as follows: by comparing the current state of the problem with the condition–action list, the move to be applied is identified and a successor state is obtained; by repeating this process, the final state is reached through a "path" composed of a sequence of connected states.
2. A set of moves M = {m1, m2, …, mm}.
3. A set of rules indicating to which states a move applies and the configuration of the puzzle after the move is applied. These rules can be written as a state-transition matrix A, whose element aij represents the move mk that must be applied to state xi to reach state xj; xj is called the successor of xi and, vice versa, xi is called the predecessor of xj; aij = 0 means that no move connects state i with state j. The transition array A can be used to represent the puzzle as a directed graph whose nodes represent the states of the game and whose directed edges represent the moves connecting two nodes. For the most common puzzles these graphs are strongly connected, i.e. every goal state is reachable from any starting state, and we will therefore assume strong connection as the normal property of a puzzle graph.3
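To make this formal setting concrete, here is a minimal Python sketch (our illustration, not part of the original text) of a transition matrix for a hypothetical three-state puzzle, together with the adjacency matrix derived from it and the successor-list view of the same directed graph; all names are ours.

from itertools import count

# Transition matrix A for a hypothetical 3-state puzzle:
# A[i][j] holds the move connecting state i to state j, or None (a_ij = 0).
A = [
    [None, "m1", None],
    ["m2", None, "m1"],
    [None, "m2", None],
]

# The adjacency matrix C is obtained by setting c_ij = 1 whenever a_ij
# holds some move m_k (cf. footnote 3).
C = [[1 if a is not None else 0 for a in row] for row in A]

# The same information viewed as a directed graph: for each state,
# the list of (move, successor) pairs.
graph = {
    i: [(A[i][j], j) for j in range(len(A)) if A[i][j] is not None]
    for i in range(len(A))
}

print(C)      # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(graph)  # {0: [('m1', 1)], 1: [('m2', 0), ('m1', 2)], 2: [('m2', 1)]}

Any of these three representations can serve as input to the search procedures discussed below.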
2.1. Decomposition patterns
As stated in the literature (Nilsson, 1986), a problem can be decomposed into a tree of and/or subproblems.4 If Ch and Rh are respectively the sets of starting and ending states of the hth subproblem, the subproblem is defined as the pair {Ch, Rh}; following the earlier description of the decomposition process, we will call Ch and Rh the (starting and ending) building blocks of the decomposition pattern. Decomposition is a recursive process in which subproblems, subproblems of subproblems, and so on are identified until problems of "minimal" size, i.e. ones that can be solved in just one move, are finally reached. These terminal subproblems are the finest "grid" on which a problem can be defined and solved: if Ch and Rh are the starting and final building blocks of a terminal subproblem, it is solved in just one move mw: Ch → Rh.
3 From the transition array A we can easily obtain the adjacency matrix C of the graph. C = {cij} is defined as follows: cij = 1 if the edge (xi, xj) exists in the graph; cij = 0 if it does not. Thus C is obtained from the transition matrix A by simply setting cij = 1 if aij = mk for some move mk, and cij = 0 otherwise.
4 In an "and" decomposition, to solve a problem one must solve all the subproblems it is composed of, while in an "or" decomposition one need solve just one of them. The identification of subproblems is achieved through processes of abstraction (each "and" structure is based on an abstraction) and specification (given a problem, any "or" subset is a specification of its special conditions).
The list of all terminal subproblems (each provided with the relevant move) can be considered as a program,5 in the form of a list of condition–action rules Ch → mw, which solves the problem.
2.2. Some general properties of puzzle decomposition – triangularity
The properties of a decomposition pattern are strictly related to the features of the shortest paths between the starting states and the goals of every subproblem. The shortest path is based on the concept of distance, defined as follows: the distance between two nodes in a graph is the number of edges in the shortest path connecting them (equivalently, the length of that shortest path). In puzzles and games the goal may be composed of a set of nodes, and the same may hold for the starting states; it is therefore worth defining the "shortest path", or "distance", between sets of nodes. Suppose that G is the set of target nodes and s a node of the graph not belonging to G. Consider the set M of shortest paths between s and the nodes of G. We define the "shortest path between s and G" as the shortest element of M, and the distance between s and G as its length: the distance between a node and a set G is thus the shortest distance between that node and the nodes of G. This definition generalizes easily to pairs of sets: the shortest path between two sets A and G is the shortest path between their two closest elements.
Distances so defined exhibit some unusual properties, notably non-reversibility. For directed connected graphs, if the transition matrix of a puzzle is not symmetric, the shortest path from node xh to node xk does not coincide with the "inverse" shortest path from xk to xh: the pair (xh, xk) may have a different distance than the pair (xk, xh). The distance between two elements is therefore not symmetric:6
D(xh, xk) ≠ D(xk, xh)
5
Each strategy can be considered as a program, since it is made of a condition–action sequence taken from a transition matrix. As such, a program is equivalent to an automaton, not to a Turing machine.
6 In the special case of undirected connected graphs, for every pair of nodes (xh, xk) we get D(xh, xk) = D(xk, xh). In fact, the distance between node xh and node xk is given by the length of the shortest path connecting them; let xh, xs, xt, …, xv, xk be the nodes along this path. Consecutive nodes are directly connected, so the corresponding coefficients of the adjacency matrix are nonzero: ahs = ast = … = avk = 1. Since the graph is undirected, the adjacency matrix is symmetric, i.e. aij = aji for all i, j. It follows that the inverse path from xk to xh exists and has the same length as the shortest path from xh to xk, because by symmetry the coefficients akv, …, ats, ash are also nonzero.
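The distances just defined are easy to compute by breadth-first search. The following Python sketch (ours; the graph is assumed to be given as a dictionary of successor lists, and the function names are ours) computes the distance D(s, G) between a node and a set, and the distance between two sets as the minimum over the elements of the first; on a directed graph the result is, as just noted, generally not symmetric.

from collections import deque

def distance_to_set(graph, start, goal_set):
    # Length of the shortest directed path from `start` to any node of
    # `goal_set` (the distance D(s, G) of the text); None if unreachable.
    goal_set = set(goal_set)
    if start in goal_set:
        return 0
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        for succ in graph.get(node, []):
            if succ in goal_set:
                return d + 1
            if succ not in seen:
                seen.add(succ)
                frontier.append((succ, d + 1))
    return None

def distance_between_sets(graph, set_a, set_g):
    # D(A, G): the shortest distance between the two closest elements.
    ds = [distance_to_set(graph, a, set_g) for a in set_a]
    ds = [d for d in ds if d is not None]
    return min(ds) if ds else None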
The most interesting property of distances among graph nodes is the triangular inequality:
D(xh, xk) ≤ D(xh, x) + D(x, xk)
Only under special conditions is the distance between two nodes equal to the sum of the distances through an intermediate node. The following elementary example illustrates the two cases. The distance between node S and node G in the graph is 4, i.e. four steps (there are two shortest paths between S and G, {S,D,H,M,G} and {S,D,H,N,G}). Consider two different intermediate points, H and E, in Figures 1 and 2, respectively. From Figure 1 we have D(S,H) + D(H,G) = D(S,G), while from Figure 2 we get D(S,E) + D(E,G) > D(S,G): {S,D,H,M,G} is a shortest path from S to G, while {S,C,E,L,N,G} is not. What is important is that in both cases the intermediate paths are optimal: the minimal path from S to E is SCE and the minimal path from E to G is ELNG, but the minimal path from S to G is not the union of the two; on the contrary, the minimal path from S to H is SDH, the minimal path from H to G is HMG, and the minimal path from S to G is the union of the two. It is therefore the choice of the intermediate node that governs the optimality of the global path. The example shows that the triangular inequality is strict for every intermediate node that does not belong to an optimal global path; it is trivial to show that the equality D(A,B) = D(A,C) + D(C,B) holds if and only if the intermediate point C belongs to a minimal path between A and B.
Figure 1. Triangular equality. [Graph with nodes S, B, C, D, E, F, H, L, M, N, G; the intermediate node H lies on a shortest path from S to G.]

Figure 2. Triangular inequality. [The same graph; the intermediate node E lies off every shortest path from S to G.]
Of course, in a real game context, where the optimal path to be discovered is very long, a player who selects an intermediate node to divide a long path into two shorter ones cannot know in advance whether the intermediate point is "right", i.e. whether the triangular inequality reduces to an equality for the point he has selected. The selection of an intermediate element may therefore reduce the computational load, but it does not guarantee global optimality. The properties of distance decomposition illustrated so far can easily be generalized: the triangular inequality extends to distances among sets in a graph; in the next section we will see that the conditions under which it reduces to an equality are similar to those discovered in the simple case above.
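The two cases can also be checked mechanically. The following Python sketch (ours) rebuilds the example graph from the paths quoted above ({S,D,H,M,G}, {S,D,H,N,G} and {S,C,E,L,N,G}; the nodes B and F shown in the figures are omitted, since their edges are not recoverable from the text) and verifies the equality for H and the strict inequality for E.

from collections import deque

edges = [("S","D"), ("D","H"), ("H","M"), ("M","G"), ("H","N"), ("N","G"),
         ("S","C"), ("C","E"), ("E","L"), ("L","N")]

graph = {}
for u, v in edges:                     # the example graph is undirected
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)

def dist(a, b):
    # breadth-first distance between two nodes
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == b:
            return d
        for succ in graph[node]:
            if succ not in seen:
                seen.add(succ)
                frontier.append((succ, d + 1))

print(dist("S", "G"))                    # 4
print(dist("S", "H") + dist("H", "G"))   # 4: H lies on a shortest path
print(dist("S", "E") + dist("E", "G"))   # 5: E does not, strict inequality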
2.3. Applying the backward branching procedure in the search for shortest paths
We now generalize the elementary example of the previous section. Assume for a moment that the goal of the puzzle is composed of only one state. To discover the shortest path to the final goal we may implement a "backward branching" procedure as follows: starting from the final configuration of the puzzle, we label all "previous" configurations, i.e. those leading to the final goal in one move, then those leading to it in two moves, and so on, thereby assigning to each configuration its distance to the goal. The states of the game are thus classified and ordered by their distance to the final goal; an ordering of
configurations is established in relation to the distance to the objective, and we can obtain the shortest path from each point to the goal simply by reading the distance labels and moving in the direction of decreasing distance. The same algorithm can be extended to the case where the goal is a set of states. Call G the set of target nodes. S1 is the set of nodes adjacent and directed to at least one node of G; these are labelled at distance 1 (see footnote 3). We then build S2, the set of nodes directed to at least one node of S1, excluding nodes belonging to G; these are labelled at distance 2. Similarly, S3 is the set of nodes directed to at least one node of S2, excluding those belonging to S1 ∪ G; proceeding in this way, after N iterations (N ≤ the number of nodes of the graph) all elements of the graph have been reached and labelled. The sets S1, S2, S3, …, SN identified by this procedure will be called "layers" at distances 1, 2, 3, … from G: all nodes in layer k are at distance k from the final goal G. From this we may easily conclude that each decomposition of a problem into two subproblems having a layer Sk as separating set is invariant; moreover, the following equality holds:
D(S, G) = D(S, Sk) + D(Sk, G)
where D denotes the distance, i.e. the length of the shortest path, between the corresponding sets. We may therefore conclude that layers are precisely the sets of nodes for which the triangularity of distances, D(C, G) ≤ D(C, C′) + D(C′, G), becomes an equality.
Now suppose that we are searching for the shortest paths from the starting states x∈C to the goal G. A decomposition of a problem C→G into two subproblems C→R and R→G is invariant if and only if for all x∈C the equality of distances holds:
D(x, G) = D(x, R) + D(R, G)
As we have seen, the equality holds for any set R that is a layer.7
The top 10 strategies for playing Rubik's cube available on the web are not invariant, and prescribe paths to the goal that are, at least in some conditions, suboptimal; by the previous observations, this happens because the building blocks that players use to compose their strategies are not usually layers. In fact players discover "naturally" (and sometimes implicitly) a decomposition pattern by
1. Categorizing the problem, i.e. assuming basic "objects" that describe sets of configurations; in Rubik's cube, "edges", "corners", "faces", "layers" and
7
The conditions for invariant decomposition are related to the Bellman principle (Bellman, 1957).
"colours" are the fundamental (natural) categories with which the building blocks are composed.
2. Identifying the subproblems, i.e. the starting states and the goal states of each subproblem (the building blocks of the decomposition pattern).
3. Discovering a path (if possible the shortest path) that connects each pair of building blocks (subproblems).
For example, in order to solve the 2×2×2 Rubik's cube, a solution for expert players suggested on the web is the following:
1. "Pick the blue–yellow–orange corner.
2. Connect one of the three edges (blue–orange in the example) with the corner, so that the colours match, somewhere on the cube. Don't try to put them in their correct position yet.
3. Put another edge (blue–yellow) in its correct position.
4. Now join the first two pieces with the third by putting them in their correct positions.
5. Finally, put the third edge (orange–yellow) in position. You have to remove the three other pieces for this. This is rarely the most efficient way, but it always works." (http://lar5.com/cube/index.html)
The point is that the building blocks, which are spontaneously created through the process of representing the game, usually are not layers, because a decomposition through layers does not allow players to represent game and strategy in a compact way; indeed, by definition a layer cannot, in general, be connected to its successor or its predecessor by a unique move. Players instead try to build a compact strategy in which one move is applied to a building block to reach the successor block, recursively, until the goal is achieved. This means that players "aggregate" large sets of configurations into building blocks and thereby drastically simplify the representation of their strategy. We therefore have to check whether there exist building blocks, different from layers, that lead to invariant decomposition patterns, or to weakly non-invariant decomposition patterns. Without pretending to develop a formal analysis here, we will simply consider some conditions for this to happen, and will provide some examples in Section 4. It is fairly intuitive to discover these conditions by referring to the properties of minimal paths. Call "ancestors" the elements that have no predecessors when the backward branching process is applied repeatedly from the goal. Under conditions of "reachability", the graph of a puzzle may be partitioned into three sets: G, the set of goal elements; P, the set of ancestors; and all the
rest, i.e. the "intermediate" elements: every intermediate element has at least one ancestor, and for every ancestor there is at least one minimal path to the goal. Now build a building block BBh as follows: let k be the number of distinct minimal paths from the ancestors to the goal; select one node from each of the k shortest paths and call BBh the set of these nodes. We can consider BBh a "barrier to the goal", because each shortest path starting from a node more distant from the goal than BBh crosses an element of BBh. It is easy to recognize that BBh gives rise to an invariant decomposition. In fact, we can apply the backward branching procedure to BBh, considered as an intermediate goal, and obtain all the predecessors of BBh; call BBh+1 the set of these predecessors; applying the backward branching procedure again and again, the process ends when all ancestors have been reached. Vice versa, we can apply the forward branching procedure until the goal is reached. In this way we re-create all minimal paths from the ancestors to the goal. Finally, note that if we select all nodes of BBh at the same distance from the goal, BBh is obviously a layer.
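The backward branching procedure itself is a short algorithm. The following Python sketch (ours) takes a graph given as a dictionary of successor lists and a goal set G, and returns the layers S1, S2, … defined above; every node in the kth returned set is at distance k from G.

def layers(graph, goal):
    # Backward branching: label every node by its distance to the goal set.
    # `graph` maps each node to the list of its successors.
    preds = {}
    for x, succs in graph.items():     # collect predecessors of every node
        for y in succs:
            preds.setdefault(y, set()).add(x)

    labelled = set(goal)
    current = set(goal)                # S0 = G
    result = []
    while True:
        # S_{k+1}: predecessors of S_k that are not yet labelled
        nxt = set()
        for y in current:
            nxt |= preds.get(y, set()) - labelled
        if not nxt:
            return result              # every reachable node is labelled
        result.append(nxt)
        labelled |= nxt
        current = nxt

For example, layers(graph, {g}) returns the list [S1, S2, …]; the ancestors are the nodes of the last layer that have no unlabelled predecessors.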
3. Conjectures and biases
Of course, in real contexts players may build up a decomposition pattern in a very chaotic way, sometimes disregarding the requisites of logical coherence typical of the creation of and/or trees. We do not pretend to model the decomposition process, i.e. the process by which a decomposition pattern is created, a matter that is so far experimentally unexplored. We simply note that the requirements for a decomposition pattern may be relaxed: whatever the features of the decomposition process, if it is developed until the terminal subgames have been reached, the states of each pair {Ck, Rk} are adjacent, because they are connected by a single move. In identifying the relation Ck → Rk, the player implicitly assumes that the move takes him one step closer to the goal, and therefore that all the states of the set Rk are one step closer to the objective than the states of the set Ck. This means that players performing the decomposition conjecture an order among the pairs of sets (Ck, Rk). If the player's conjecture is correct, i.e. if in every subproblem (Ck, Rk), Rk is closer to the goal than Ck, then the strategy yielded by the decomposition is optimal and the decomposition is invariant. If the conjecture is not correct, errors occur which are often difficult to detect. In fact, categorization enables the identification of terminal problems {Ck, Rk} which are solved by applying the same move mw to all the states of the elementary component Ck to achieve the partial objective Rk. However, there is often no single move which, applied to every state of Ck, yields a set of adjacent states all one step closer to G. On the contrary, for some states xi∈Ck the move mw takes the player one step away from the objective, while for other states xj∈Ck it takes him one step closer. In this case the player has
made a partly inaccurate conjecture, in which the error is generated by the categorization.8
3.1. Fitness
The sum of the distances between every starting state and the goal is a benchmark against which to measure the efficiency, or fitness, of a strategy and to compare the "deviation" of different decomposition patterns from optimality.9 We can compare different decomposition patterns by associating to each of them, as a measure of fitness, the length of the paths from every starting state xi to the goal, defined as follows: given the list of instructions of the strategy derived from a given decomposition pattern, we can calculate the length L(xi, G) of the path between every starting configuration xi and the goal G induced by the building blocks of the pattern. L(xi, G) is the sum of the lengths of the paths between every pair of subproblems that compose the problem. If a problem is fully decomposed, the paths of the terminal subproblems have length 1, and therefore L(xi, G) is simply the number of terminal subproblems involved in the decomposition. The sum of lengths LTOT = Σi L(xi, G) is called the "global path". Note that
L(xi, G) ≥ D(xi, G)
where D(xi, G) is the distance between xi and G, i.e. the number of steps of the shortest path or, equivalently, the distance label assigned by the backward branching procedure. LTOT varies with the decomposition pattern and, within the same pattern, with the moves associated to every building block. For all invariant decomposition patterns LTOT takes the same, minimal, value: LTOT = Σi L(xi, G) = Σi D(xi, G). For non-invariant decomposition patterns we have L(xi, G) ≥ D(xi, G) with
Σi L(xi, G) > Σi D(xi, G)
8
Moreover, having identified the terminal subproblems, and having assumed that the solution consists in applying the same move to all the components of the subproblem, even if the player optimizes each single subproblem (under the constraint just stated) he will not be able to achieve the global optimum. In fact, if xi∈Ck and xj∈Ck, and the player has identified the solution Ck → mw, and if the move mw applied to xi leads to a closer successor whereas applied to xj it leads to a node farther away, the player cannot improve his strategy: even if he attempts to apply moves other than mw to xj∈Ck, his solution cannot achieve the optimum, because there is no single move that makes this possible. The player must therefore modify the composition of the building block Ck.
9 Fitness is a term originally used in evolutionary biology to denote the reproductive success of an organism in its environment, and hence to evaluate its potential capacity to survive in biological competition; in our context we use the term simply as a measure of the efficiency of a strategy.
Hence Σi L(xi, G) can naturally be assumed as a measure of the fitness of a strategy (and of the associated decomposition pattern).
3.2. Weakly suboptimal strategies and their local stability
Non-invariant decomposition patterns give rise to suboptimal strategies for the reason just seen: Σi L(xi, G) > Σi D(xi, G). But some of them display a more subtle property: they are optimal on a sub-domain of the configurations of the game. This means that some of the trajectories prescribed by the pattern are suboptimal while others are optimal. More precisely, once a strategy is constructed according to a decomposition pattern, if the move associated with a given building block Ci is mh, and if xr and xs are two configurations of Ci, it may happen that for xr∈Ci we have L(xr, G) > D(xr, G) while for xs∈Ci we have L(xs, G) = D(xs, G). This happens because the same move mh, applied to two different states of Ci, can take the player nearer to or farther from the goal. In a non-invariant decomposition pattern there can thus be, for every building block, numerous configurations for which the moves of the strategy are optimal. If the number of configurations for which the moves are suboptimal is low, they become extremely difficult to discover, and a player can persist in using a strategy without perceiving its suboptimal characteristics.
3.3. Landscape
We have assumed the "total path" LTOT = Σi L(xi, G) as a measure of the fitness of a strategy. Every strategy is based on a decomposition pattern, and the fitness therefore varies with the decomposition pattern discovered by the player. For a given decomposition pattern, a player can try to improve the fitness of his strategy by modifying the moves, within the constraints imposed by the building blocks. It is possible to demonstrate that, if there are no restrictions on the moves to be selected within the transition matrix, the fitness function admits absolute minima (all equal) corresponding to the sum of the shortest paths. Moreover, the minimal value is reachable by modifying the moves by trial and error, despite a positive epistatic degree (Egidi, 2000, 2002). A hypothetical player who builds his strategy in extended form can therefore improve it by modifying, by trial and error, the actions matching every condition of the program, until the optimal value of LTOT is reached. This property seems to contrast with the behaviours observed experimentally, which indicate that individuals persist in using suboptimal strategies and do not modify them incrementally, as would be possible. The notion of building block allows us to explain this apparent contrast: the strategies adopted by players are not in extended form; they are compact, based on building blocks, and demand much less mental effort. However, the building blocks limit the possibilities of modifying the
moves of a strategy, because they constitute a constraint: all configurations of a given block must receive the same move, so if the move applied to a configuration xi∈Ci is modified, the moves of all the other configurations of Ci must be modified in the same way. In the search for a minimal value of LTOT, the imposed constraints therefore only allow a local minimum to be reached. What makes such a relative minimum stable is consequently the difficulty players encounter in modifying the set of selected blocks, i.e. the decomposition pattern, and adopting a different one. This difficulty stems from the fact that building blocks are represented abstractly and synthetically: they are mental categories, which are notoriously difficult to modify. A strategy generated from a non-invariant decomposition pattern therefore admits a stable local optimum in the sense that, for a given decomposition pattern, once a relative minimum has been identified, modifying the actions matching the building blocks makes the programme less efficient, i.e. increases the number of moves necessary to reach the goal. The solution based on the non-invariant decomposition pattern, though suboptimal, therefore cannot be improved: it is "locally" optimal and stable. In order to improve the strategy it is necessary to modify the building blocks, i.e. to introduce a new decomposition pattern.
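The fitness measure of Section 3.1 can also be made operational. The sketch below (ours; we assume the strategy is given in extended form as a condition–action dictionary, `successor` as a function applying a move to a state, and that the strategy always reaches the goal) computes L(x, G) for each starting state and the global path LTOT, which can then be compared with the sum of the optimal distances Σi D(xi, G).

from collections import deque

def optimal_distances(graph, goal):
    # D(x, G) for every x, by breadth-first search backwards from G.
    # `graph` maps each state to a list of (move, successor) pairs.
    preds = {}
    for x, succs in graph.items():
        for move, y in succs:
            preds.setdefault(y, []).append(x)
    dist = {g: 0 for g in goal}
    frontier = deque(goal)
    while frontier:
        y = frontier.popleft()
        for x in preds.get(y, []):
            if x not in dist:
                dist[x] = dist[y] + 1
                frontier.append(x)
    return dist

def path_length(strategy, successor, start, goal, limit=1000):
    # L(x, G): number of moves the strategy takes from `start` to the goal.
    state, steps = start, 0
    while state not in goal:
        state = successor(state, strategy[state])
        steps += 1
        if steps > limit:
            return None        # the strategy cycles without reaching G
    return steps

def fitness(strategy, graph, successor, goal):
    # L_TOT = sum of L(x, G) over the starting states; it equals the sum
    # of D(x, G) exactly when the decomposition pattern is invariant.
    return sum(path_length(strategy, successor, x, goal)
               for x in graph if x not in goal)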
4. An application: MiniRubik
MiniRubik is a simplified version of Rubik's cube. In the experiments we will illustrate, the player sits in front of a computer screen on which two squares composed of four tiles of different colours are displayed (Figure 3). On the bottom right of the screen, the arrangement of tiles to be reached as a goal is displayed. In the centre, the tiles are arranged differently, and the player can modify the arrangement, exchanging tiles horizontally or vertically, until the goal configuration is achieved. (For the sake of simplicity, when necessary, we will use the letters A, B, C, D instead of colours.)
Figure 3. The MiniRubik puzzle. [The current state of the game, D B / C A, and, bottom right, the goal arrangement, A B / D C.]
Figure 4. The basic MiniRubik moves. [From the state D B / C A: Up exchanges the two top tiles (giving B D / C A); Down exchanges the two bottom tiles (D B / A C); Right exchanges the right column (D A / C B); Left exchanges the left column (C B / D A).]
Figure 5. A string representation of the puzzle. [Positions 1, 2, 3, 4 run clockwise from the upper-left corner; the square A B / D C thus corresponds to the string ABCD.]
As indicated in Figure 4, tiles can be moved and exchanged in the horizontal or vertical direction. The player must reach the final configuration. Players gain points according to the number of moves they make: they are granted an initial sum which diminishes at every move, so the lower the number of moves made to solve the problem, the greater the residual amount remaining at the end of the game. To simplify the discussion that follows, a configuration will be indicated as a sequence of four letters (or colours) instead of a square of four letters (or colours), applying the following rule: start from the upper-left corner of the square and list the elements moving clockwise. The positions of A, B, C and D in Figure 5 are positions 1, 2, 3 and 4, respectively. With this representation, configurations can be written as strings of four letters (or colours), for example CBDA, or A##C, which denotes the set of configurations where A is in the first position (upper-left corner) and C in the last (lower-left corner). The transition matrix of MiniRubik is illustrated in Figure 6 and the corresponding graph in Figure 7.
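Before turning to those figures, note that in the string representation each move is simply an exchange of two positions. A minimal Python sketch (ours; the function names are ours), matching the examples of Figure 4:

# MiniRubik states as four-letter strings, listed clockwise from the
# upper-left corner: positions 1, 2, 3, 4 = upper-left, upper-right,
# lower-right, lower-left (cf. Figure 5).
def swap(s, i, j):
    t = list(s)
    t[i], t[j] = t[j], t[i]
    return "".join(t)

MOVES = {
    "U": lambda s: swap(s, 0, 1),  # Up: exchange the two top tiles
    "D": lambda s: swap(s, 2, 3),  # Down: exchange the two bottom tiles
    "L": lambda s: swap(s, 0, 3),  # Left: exchange the left column
    "R": lambda s: swap(s, 1, 2),  # Right: exchange the right column
}

print(MOVES["U"]("DBAC"))  # BDAC, as in Figure 4
print(MOVES["R"]("DBAC"))  # DABC, as in Figure 4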
Figure 6. The transition matrix of the puzzle. [A 24×24 matrix over the states 1–24; each nonzero entry is the move U, D, L or R connecting state i to state j.]
Figure 7. The graph of MiniRubik. [The 24 configurations, numbered 1–24 (1 = ABCD, 2 = ABDC, …, 24 = DCBA), with an edge for each move connecting two configurations.]
4.1. Optimal solutions
Figure 8 shows the game's optimal solutions, obtained with the backward branching procedure discussed earlier. Figure 9 shows the transition matrix of the game reordered by distance to the goal, and Figure 10 the corresponding graph of shortest paths to the goal, in which the layers are clearly identified by vertical position (distance). The example discussed so far shows that when a problem is decomposed into subproblems with layers as separators, the metric of the decomposition is preserved.
Figure 8. Layers. [Distance from G: 0: G = {1}; 1: S1 = {2, 3, 7, 22}; 2: S2 = {4, 5, 8, 9, 12, 13, 16, 20, 21, 24}; 3: S3 = {15, 19, 18, 11, 14, 10, 23, 6}; 4: S4 = {17}.]

Figure 9. MiniRubik's optimal layers: the transition matrix reordered by distance to the goal. [States ordered 1, 2, 3, 7, 22, 4, 5, 8, 9, 12, 13, 16, 20, 21, 24, 6, 10, 11, 14, 15, 18, 19, 23, 17, with distance labels from 0 to 4.]

Figure 10. The graph of shortest paths to the goal. [The 24 states arranged by distance to the goal, so that the layers appear at the same vertical position.]
If, for example, we select the set S2 = {4, 5, 8, 9, 12, 13, 16, 20, 21, 24} of nodes at distance 2 from the goal as an intermediate set, i.e. we decompose the problem S→G into the two subproblems S→S2 and S2→G (with S = S4 ∪ S3), we see that the pattern is invariant; indeed, we can immediately verify that the length of the optimal path from every node s∈S to G is exactly the sum of the length of the optimal path from s to a node of S2 and the length of the optimal path from S2 to G. This decomposition pattern has therefore preserved the metric of the game.
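The layers of Figure 8 can be reproduced mechanically by running the backward branching procedure over all 24 configurations. In the sketch below (ours), the states are numbered in lexicographic order of the permutations, an assumption consistent with Figures 7 and 8 (1 = ABCD, 2 = ABDC, …, 24 = DCBA); since every move is its own inverse, searching forward from the goal yields the same labels as backward branching.

from collections import deque
from itertools import permutations

def swap(s, i, j):
    t = list(s); t[i], t[j] = t[j], t[i]
    return "".join(t)

# each move exchanges two string positions (cf. Figures 4 and 5)
MOVES = {"U": (0, 1), "D": (2, 3), "L": (0, 3), "R": (1, 2)}

states = ["".join(p) for p in permutations("ABCD")]
number = {s: i + 1 for i, s in enumerate(states)}

dist = {"ABCD": 0}
frontier = deque(["ABCD"])
while frontier:
    s = frontier.popleft()
    for i, j in MOVES.values():
        t = swap(s, i, j)
        if t not in dist:
            dist[t] = dist[s] + 1
            frontier.append(t)

layers = {}
for s, d in dist.items():
    layers.setdefault(d, []).append(number[s])
for d in sorted(layers):
    print(d, sorted(layers[d]))
# 0 [1]
# 1 [2, 3, 7, 22]
# 2 [4, 5, 8, 9, 12, 13, 16, 20, 21, 24]
# 3 [6, 10, 11, 14, 15, 18, 19, 23]
# 4 [17]

The output coincides exactly with the layers reported in Figure 8.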
4.2. Comparing different decomposition patterns
Obviously players do not calculate optimal distances in the way we have just seen. They try to represent the problem in a simplified way, concentrating on some properties of the game and identifying the elementary subproblems. There are many possible decomposition patterns, which can be compared in terms of "simplicity" of representation and in terms of weak suboptimality (fitness).
Among the possible patterns, we will consider and compare two: the first is weakly suboptimal and very easy to discover; the second is optimal but more difficult to discover and conceptualize.
4.2.1. "First A": a weakly suboptimal strategy
A strategy frequently implemented by players in some preliminary experiments is based on a simple, sequential decomposition of the problem: players put the pieces into the right position one at a time, until the final position is achieved. Consider an example, and suppose that the starting and final configurations are the strings BDAC and ABCD, respectively, as indicated in Figure 11. The sequential strategy for moving from this particular starting configuration to the final configuration suggested by players is composed of the following instructions:
Sequential strategy "First A":
Step 1. Move A from its starting position, anticlockwise, to the final position (position 1).
Step 2. If B is not yet in its final position, move it to position 2, leaving A in place.
Step 3. If C and D are not yet in their required final positions, exchange them.
We can immediately appreciate that this sequence of instructions can be applied to any other starting configuration by modifying only the first instruction in the most "natural" way, i.e. moving A from its starting position to position 1 with the least number of moves. This strategy, in which the player concentrates on the position of one tile at a time (first A, then B, and finally C and D), is based on a decomposition into three subproblems, each implicitly based on an adequate categorization: the categories A###, #A##, ##A#, ###A, A#B#, …, which may be mentally ordered to decide how to approach the goal. These categories constitute the building blocks of the "First A" decomposition pattern.
Figure 11. An example showing starting and final configurations. [Starting configuration B D / C A (string BDAC); final configuration A B / D C (string ABCD); positions 1 2 / 4 3.]
Figure 12. The aggregated form of the "First A" strategy. [The building blocks B4 = A### (states 1–6, the sub-goal), B3 = #A## (7, 8, 13, 14, 19, 20), B2 = ##A# (9, 11, 15, 17, 21, 23) and B1 = ###A (10, 12, 16, 18, 22, 24), with the moves Up, Right, Down and Left connecting them.]
Figure 12 illustrates the first part of Step 1 of the strategy: the subproblem consists in moving A from its starting position to position 1.10 The building blocks on which the representation is based therefore include the categories B1 = ###A, B2 = ##A#, B3 = #A## and B4 = A###. When the player plans to move A from position 3 to position 2, and finally to position 1, he implicitly orders the three building blocks B1, B2, B3 in terms of distance to the goal G = ABCD. In fact, the player presupposes that a configuration belonging to B3 = #A## is nearer to the goal than a configuration belonging to B2 = ##A# or B1 = ###A. Therefore, denoting by D the distance between a block and the goal G, we will have
10 In the present version of the problem, the player decides to move A anticlockwise if A is in position 3, but this solution displays similar properties even if the player moves the tile clockwise.
D(A###, G) < D(#A##, G) < D(##A#, G)
D(A###, G) < D(###A, G) < D(##A#, G)
This system of relative distances can be spelled out as follows:
B1 = ###A = (10, 12, 16, 18, 22, 24); presumed distance = 1
B2 = ##A# = (9, 11, 15, 17, 21, 23); presumed distance = 2
B3 = #A## = (7, 8, 13, 14, 19, 20); presumed distance = 1
B4 = A### = (1, 2, 3, 4, 5, 6); sub-goal.
The player thus classifies the building blocks according to an order that defines their relative distances to the goal. This clarifies why errors appear: if this system of distances, defined at an abstract level of representation, is isomorphic to the minimal distances obtained by applying the backward branching procedure to the extended description, then the decomposition into blocks Bi is optimal; if it is not, the decomposition introduces hidden errors. In order to verify this property, the graph of the game has been drawn in extended form (Figure 13), associating to every node the blocks defined in Figure 12, according to the presumed distance to the goal. The results are quite interesting: the elements of building block B4 = A### = (1, 2, 3, 4, 5, 6) do not form a barrier along the shortest paths to the goal G = {1}, and therefore introduce a systematic distortion into the system of distances to the goal. At the same time, if we consider the building block B4 = {1, 2, 3, 4, 5, 6} itself as a goal, the other building blocks B1, B2, B3 are layers relative to B4: if the system is constructed by means of the backward branching procedure from B4, the nodes of B1 ∪ B3 are at distance 1 and those of B2 at distance 2. The strategy "First A" correctly identifies the optimal moves that connect building blocks B1, B2 and B3 to B4. But if we compare the shortest paths to {1} with those to B4 = {1, 2, 3, 4, 5, 6}, we discover that they do not coincide in two cases, namely for nodes 12 and 20, which are in fact nearer to 6 than to 1: d(12, 1) > d(12, 6) and d(20, 1) > d(20, 6). The same applies to node 20 with respect to block B3. These suboptimalities are generated because B4 is not a barrier to the final goal G, and this distorts the whole space of relative distances. The distortion is clearly illustrated in Figure 13, which represents the graph of the distances to the building blocks Bi. Comparing Figure 13 with Figure 10, the distortions can be seen very clearly, in the form of paths along which, for part of the route, one moves away from the goal while believing to be getting closer to it. Consequently, the shortest paths in the representation based on blocks Bi coincide with the true shortest paths only for part of the configurations; and when
Figure 13. The "First A" strategy. [The graph of the game with every node labelled by its building block B1–B4.]
the two paths do not coincide, as happens for nodes 12, 20 and 21, the "First A" representation provides suboptimal solutions.11 This explains the source of the errors: they are generated by the "imperfect" categorization that players use to simplify the representation of the game and create the building blocks; the categorization, being founded on salient elements (the position and possibly the colours of the tiles to be moved at each step), is generally simple though imperfect: it groups "inhomogeneous"
11
For example: starting from configuration 20, DACB, the only optimal move is Left, whereas the strategy prescribes Up, giving the path 20-6-5-2-1. Let us suppose that DCBA and ABCD are the starting and final configurations, respectively. The optimal sequence – which can be calculated from the matrix of optimal links – is DCBA-right-DACB-left-BACD-up-ABCD, while the rules of our procedure provide the sequence DCBA-right-DACB-left-BACD-up-ADCB-down-ADBC-right-ABDC-down-ABCD, which obviously includes more moves.
Figure 14. The transition matrix of the "First A" strategy. [The 24 configurations grouped into the index sets Z0–Z6, with the prescribed moves and the distances to the goal; the extended configurations are listed on the right. The three shaded configurations, n. 12, 20 and 21, are those for which the prescribed move is not optimal.]
blocks, since the system of distances among building blocks is not isomorphic to the system of optimal distances.
Simplifying by aggregation. To cross-check these observations, we can compare the optimal strategy and the "First A" strategy expressed in the form of the matrix of shortest paths: three "discrepancies" are immediately apparent (Figures 13 and 14), i.e. three specific configurations for which the "First A" strategy does not prescribe an optimal move: configurations n. 12, 20 and 21, shaded in Figure 14. From configuration n. 20, the prescribed move U leads to configuration n. 6, which is one step farther from the goal (G = 1); this move therefore takes the player away from the goal. The same holds for the two other critical configurations. In the right part of Figure 14 the configurations of the game are listed in extended format. It is evident that the index sets Zk make it possible to classify the configurations into six categories: ###A = Z5, ##A# = Z6, #A## = Z4, A#B# = Z2, A##B = Z3, AB## = Z0 ∪ Z1. This allows the "First A" strategy to be represented in a remarkably simpler way than the optimal strategy, despite some distortions of the metric.
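The three discrepancies can be recovered computationally. The sketch below (ours) encodes step 1 of "First A" as described above (Left from position 4, Right from position 3, Up from position 2, bringing A anticlockwise towards position 1) and flags every configuration for which the prescribed move increases the true distance to the goal; the result is exactly the set {12, 20, 21}.

from collections import deque
from itertools import permutations

def swap(s, i, j):
    t = list(s); t[i], t[j] = t[j], t[i]
    return "".join(t)

MOVES = {"U": (0, 1), "D": (2, 3), "L": (0, 3), "R": (1, 2)}

states = ["".join(p) for p in permutations("ABCD")]
number = {s: i + 1 for i, s in enumerate(states)}

# true distances to the goal ABCD, by breadth-first search
dist = {"ABCD": 0}
frontier = deque(["ABCD"])
while frontier:
    s = frontier.popleft()
    for i, j in MOVES.values():
        t = swap(s, i, j)
        if t not in dist:
            dist[t] = dist[s] + 1
            frontier.append(t)

# Step 1 of "First A": the move prescribed for each position of A
# (string indices are 0-based: index 3 = position 4, etc.).
first_a = {3: "L", 2: "R", 1: "U"}

biased = []
for s in states:
    pos = s.index("A")
    if pos == 0:
        continue                  # A is already in position 1
    t = swap(s, *MOVES[first_a[pos]])
    if dist[t] > dist[s]:         # the prescribed move leads away from G
        biased.append(number[s])
print(sorted(biased))             # [12, 20, 21]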
Figure 14 illustrates the aggregation based on these six categories, i.e. consistent with the symbolic content of the configurations: states characterized by the same optimal action are aggregated according to their common, salient features defined by the categories. The criterion used by the player in "First A" to build the categories is to aggregate all configurations in which tile A (then A and B, and finally A, B and C) occupies the same position. This criterion does not cluster configurations at the same distance from the goal, and therefore the signal of the distance to the goal that the player obtains by observing the position of A is partially wrong. The error is however limited to a few configurations, and is therefore very difficult to find. Symbolic manipulation of the states of the game is thus not a safe criterion for identifying sets equidistant from the goal, and as such it is a "natural" source of distortions.12 As we have seen, the "First A" procedure proves optimal on a sub-domain of the configurations of the game. When adopting this procedure, players therefore have the advantage of a simple, abstract and complete representation, but at the price of inefficiencies, since the number of moves needed to reach the goal is (slightly) greater than the optimal value. The "First A" strategy is thus described by a compact though weakly suboptimal list of instructions.
Stability. This strategy has a remarkable advantage: comparing the different alternatives for moving first A, then B, and finally C into the final position is immediate and not very mind-consuming for the players. It is therefore quite natural to order the categories and find the optimal moves that connect them. Hence the instructions of the strategy determine a locally stable optimum. The strategy is locally stable in the sense that it cannot be improved by simply modifying the actions matching the elementary building blocks: if the instructions are modified within the constraints imposed by the structure of the building blocks, the programme becomes less efficient, i.e. the number of moves necessary to reach the goal increases. The solution based on the "First A" decomposition pattern, though suboptimal, therefore cannot be improved in this way; it is locally optimal and stable. To improve the strategy it will therefore be necessary to modify the building blocks.
4.2.2. An invariant decomposition pattern: the "First row" strategy
A different representation of the problem, which some players discovered during the various runs, is based on a more careful analysis of the properties of
12
For example, the category ###A is composed of the states (10, 12, 16, 18, 22, 24); by applying the move Left to any configuration of this category, A is moved to position 1, and the player expects to obtain a configuration nearer to the goal (###A-left-A###). This is true for all the configurations except one: n. 12 (BDCA).
the game. Instead of moving the tiles sequentially, as in the "First A" strategy, some players took into consideration the interdependence among the positions of the tiles, observing that moving one tile means moving, at the same time, a second tile associated with the action. The suggested strategy consists in taking into account the effect that an action on tile A has on tile B. The strategy therefore consists in moving A and B to their final positions, i.e. into the first row, into position AB##, regardless of the positions of C and D, whereas the "First A" strategy suggests moving A regardless of B, C and D. This requires more comparisons than "First A", and hence more effort in terms of memory and calculation; however, as we will see, the strategy designed in this way is optimal.
"First row" strategy. Take A and B and place them in the first row, in position AB##, comparing the number of moves needed to reach the goal whenever there are possible alternatives. This comparison is generally easy. For example, consider configuration 10, B##A: it is easy to compare the two possible solutions. Solution 1 consists in "moving A anticlockwise", and requires three moves; solution 2 consists in "moving B to its final position and then moving A", which requires two moves. The sequence of actions illustrated in Figure 15 is derived in this way.
Figure 15. The aggregated form of the "First row" strategy. [The states grouped into the pairs (1, 2), (3, 5), (7, 8), (13, 19), (14, 20), (4, 6), (15, 21), (16, 22), (9, 11), (10, 12), (17, 23), (18, 24), with the moves Up, Down, Left, Right connecting each pair-class towards AB##.]
Here the categories are AB##, A#B#, #AB#, ##AB, BA##, …, respectively; here, too, the problem consists in establishing an order in terms of proximity to the goal. The "First row" strategy, which simplifies the extended representation of the game, maintains the metric: it identifies building blocks that, while not fully fulfilling the conditions defining layers, nevertheless generate an optimal strategy. These blocks are in fact barriers along the shortest paths. Interestingly, this decomposition pattern compacts the graph of the game so as to halve the number of states, which is equivalent to decomposing the graph into two perfectly identical ones. In this way we have identified building blocks that allow an invariant aggregation of the strategy (see Figure 16).
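One way to verify the halving mechanically (a sketch of ours, not a construction from the original text): exchanging the two tiles C and D maps the move graph onto itself and pairs the states exactly as in the classes of Figure 15 (e.g. {1, 2} = {ABCD, ABDC}).

from itertools import permutations

def swap(s, i, j):
    t = list(s); t[i], t[j] = t[j], t[i]
    return "".join(t)

MOVES = {"U": (0, 1), "D": (2, 3), "L": (0, 3), "R": (1, 2)}

relabel = str.maketrans("CD", "DC")   # exchange the tiles C and D

edges = {(s, m, swap(s, i, j))
         for s in ("".join(p) for p in permutations("ABCD"))
         for m, (i, j) in MOVES.items()}

mirrored = {(s.translate(relabel), m, t.translate(relabel))
            for s, m, t in edges}
print(edges == mirrored)   # True: the relabelling is a graph automorphism

The relabelling works because every move exchanges fixed positions, independently of the tile labels; the quotient graph over the twelve pair-classes is the compacted graph of Figure 16.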
Figure 16. The graph of the "First row" strategy. [The compacted graph on the twelve pair-classes of states, with the moves U, D, L, R connecting them.]
The reader may appreciate that, besides the "First row" strategy, for which the graph was designed, the "second row" and "first column" strategies are invariant too, and allow some "savings" in the representation.
5. Preliminary experiments
An experiment on MiniRubik with 20 subjects showed that a large number of players discovered the "First A" decomposition pattern, thereby making systematic errors in some specific configurations of the game. The errors were in line with the predictions of the "First A" strategy. Other players discovered the "First row" strategy, which allowed them to avoid errors while still "saving" in terms of memorization and calculation. In Figure 17 the global errors (i.e. discrepancies from the optimal strategy) are compared with the biases, i.e. the deviations from the optimal strategy that are consistent with the "First A" strategy. It is clear that most "errors" are actions perfectly consistent with the "First A" strategy: the players discovered the simplified representation of the game and used it rationally; this is an interesting example of bounded rationality. In fact, the players behave in a completely rational
Figure 17. Errors and "rational biases" in the experiment. [Counts of errors (Errs) and of "First A"-consistent biases, plotted over the experiment.]
way within the "First A" decomposition pattern, which hiddenly makes them deviate from optimality.
6. Concluding remarks
The decomposition of a problem allows players to find optimal strategies for the elementary subproblems but, as we have seen, decomposition patterns are usually non-invariant, and therefore the final result is not an optimal strategy. The processes of abstraction and categorization that players use to identify a decomposition pattern in a puzzle are usually guided by some salient features of the game. These features may not correctly reflect the metric of the problem, i.e. the distances of the subproblems from the goal. The biases in decision making therefore originate in the nature of the decomposition process. The errors are created by the representation, and can only be corrected if the representation is appropriately revised and modified. Moreover, some decomposition patterns give rise to optimal paths on sub-domains of the problem, and it is therefore extremely difficult for an individual to notice the errors and correct them. This helps to explain the stability of weakly non-invariant representations: correcting hidden errors would be extremely expensive in terms of calculation and memorization, and in correcting errors players are therefore normally guided by the exceptions that accidentally emerge. Any attempt to discover the errors actively would demand a complete and detailed description of the configurations of the game, thus nullifying the "parsimony effect" obtained through categorization, and nullifying the attempt to express the strategy in a simple manner.
We have noted that the key element in the representation of a strategy is the building blocks (normally represented as categories); given a puzzle, there are different representations of it, each defined by a different structure of building blocks. The wider the extension of a building block (the number of configurations involved), the lower the number of building blocks needed to represent the game and identify the winning strategy. If players, in order to obtain a clear and simple representation of the game, try to increase the extension of the building blocks, they easily introduce hidden errors into the categorization. Generally the categories are created by players during the runs of the game, on the basis of their direct experience. If they try to extend rules they have experienced as optimal in a specific game context to a larger domain, they may inadvertently include domains where those rules are suboptimal. Consequently, errors in the mental representation of a problem can be the natural effect of categorization and of the identification of building blocks beyond their "right" domain. This can help to explain the "mechanization of thought" shown by the Luchins and Luchins (1950) experiments. The discussion conducted so far emphasizes some important aspects of the bounded rationality approach, because it shows that the construction of a
strategy is based on categorizations that simplify the representation of the problem and, at the same time, generate biases and suboptimalities.13
References
Bellman, R. (1957), Dynamic Programming, Princeton, NJ: Princeton University Press (reprinted (2003) by Dover, Mineola, NY).
Cohen, M.D. and P. Bacdayan (1994), "Organizational routines are stored as procedural memory: evidence from a laboratory study", Organization Science, Vol. 5(4), pp. 554–568.
Egidi, M. and A. Narduzzo (1997), "The emergence of path-dependent behaviors in cooperative contexts", International Journal of Industrial Organization, Vol. 15(6), pp. 677–709.
Egidi, M. (2000), "Bias cognitivi nelle organizzazioni", Sistemi Intelligenti, Vol. 2(XII), Bologna: Il Mulino, pp. 237–270.
Egidi, M. (2002), "Biases in organizational behavior", in: M. Augier and J.G. March, editors, The Economics of Choice, Change and Organization: Essays in Memory of Richard M. Cyert, Cheltenham, UK: Edward Elgar.
Gigerenzer, G. and R. Selten (eds.) (2002), Bounded Rationality: The Adaptive Toolbox, Cambridge, MA: MIT Press.
Luchins, A.S. (1942), "Mechanization in problem-solving", Psychological Monographs, Vol. 54, pp. 1–95.
Luchins, A.S. and E.H. Luchins (1950), "New experimental attempts in preventing mechanization in problem-solving", The Journal of General Psychology, Vol. 42, pp. 279–291.
Nilsson, N.J. (1986), Problem-Solving Methods in Artificial Intelligence, New York: McGraw-Hill.
Appendix
We show that any decomposition pattern whose building blocks are layers is invariant. To obtain a clear description of this property, it is convenient to re-describe the backward branching procedure in terms of the matrix representation of the graph. We arrive at the matrix representation by an iterative method: first we show how to build the matrix representation of a layer Sk+1 given Sk, for every k; finally we apply the method starting from S0 = G.
13
This feature exhibits some similarities with Gigerenzer and Selten's (2002) suggestion that "fast and frugal" heuristics allow the discovery of non-optimal rules that perform almost as well as the optimal ones.
Figure A. The transition array reordered by layers. [A block structure over the index sets Z0–Z4: the diagonal blocks A00, A11, A22, A33, A44 and the sub-diagonal blocks A10, A21, A32, A43 linking each layer to the next.]
Assume that by applying the backward branching procedure iteratively we have built the layers S1, S2, S3, …, Sk. Our goal is to build Sk+1. Remember that S1 is the set of predecessors of G, S2 the set of predecessors of S1, and so on. Call Zh the set of indexes of the nodes (configurations) belonging to layer Sh, and zh the number of elements of Zh. For continuity of the representation, call Z0 the set of indexes of the nodes of G. For example, if Sk = {xr, xs, xt, xu} then Zk = {r, s, t, u} and zk = 4.
The following properties hold: Zh ∩ Zk = Ø for all h ≠ k (because two nodes belonging to different layers cannot have the same distance to the goal), and Z0 ∪ Z1 ∪ Z2 ∪ Z3 ∪ … ∪ Zs = {1, 2, 3, …, n}, where s is the number of layers. To identify the predecessors of a given node xh∈Sk, we first find the columns of A whose indexes belong to Zk. Suppose that j∈Zk, and check the coefficients aij along column j, for i = 1, …, n. If aij ≠ 0 then the link between xi and xj exists: node xi is adjacent (and directed) to node xj, and is therefore a predecessor of xj. By repeating the same procedure for all elements j of Zk, we collect all indexes i for which aij ≠ 0 with j∈Zk. It is easy to recognize that the set of all such indexes i is Zk+1, because Zk+1 is by definition the set of the xi that are predecessors of some xj∈Zk. This procedure can be applied first to the elements of the goal G, to obtain the first layer S1 and its indexes Z1, and then iteratively to all other layers. Finally, call Ak+1,k the submatrix of A composed of the coefficients aij with i∈Zk+1 and j∈Zk: Ak+1,k is composed of the coefficients that link the layer Sk to its predecessor Sk+1. So far we have identified both the sequence of layers Z0, Z1, Z2, …, Zs and the sequence of submatrices A10, A21, …, As,s-1 that link the adjacent layers. Using
this procedure we can now build the matrix of optimal links, B. This matrix is composed of the coefficients of A that represent links among layers, and therefore those that give rise to the shortest paths from every initial state x to the goal G; it is represented in Figure A. Hence the matrix of optimal links of the original game is composed of the two sub-matrices of optimal links of the two subproblems. The decomposition pattern is invariant, in the sense that the matrices of the two subproblems can be independently optimized to obtain a global optimum.
CHAPTER 2
Impossible States at Work: Logical Omniscience and Rational Choice
Mikaël Cozic
Abstract
Logical omniscience is a never-ending problem in epistemic logic, the main model of full beliefs. It is seldom noticed that probabilistic models of partial beliefs face the same problem. In so far as choice models are built on such doxastic models, they necessarily inherit the problem as well. Following some philosophical (Hacking, 1967) and decision-theoretic (Lipman, 1999) contributions, we advocate the use of nonstandard or impossible states to tackle this issue. First, we extend nonstandard structures to the probabilistic case: an axiom system is devised and proved to be complete with respect to nonstandard probabilistic structures. Second, we show how to substitute weakened doxastic models for the idealized ones in choice models, and discuss the questions raised by this "unidealization".
Keywords: bounded rationality, epistemic logic, logical omniscience, probabilistic logic
JEL Classifications: D80, D81, D82

1. Introduction

Let us imagine an agent that could solve any stochastic decision process, whatever the number of periods, states and alternatives may be; that could find a Nash equilibrium in any finite game, whatever the number of players and strategies
may be; more generally, that would have perfect mathematical knowledge and, still more generally, that would know all the logical consequences of his or her beliefs. By definition, this agent would be described as logically omniscient. For sure, logical omniscience is a highly unrealistic hypothesis from the psychological point of view. Yet this is the cognitive situation of agents in the main current doxastic models, i.e. models of beliefs. The issue was raised long ago in epistemic logic (Hintikka, 1975; see the recent survey in Fagin et al., 1995), which is the classical model of full beliefs. In particular, it has been recognized that logical omniscience is one of the cognitive idealizations that is hardest to eliminate, because it is an immediate consequence of the core principle of the modeling: the representation of beliefs by a space of possible states. What is the relevance for rational choice theory? A standard decision model has three fundamental building blocks:
1. a model of beliefs, or doxastic model,
2. a model of desires, or axiological model, and
3. a criterion of choice, which, given beliefs and desires, selects the "appropriate" actions.
In choice under uncertainty, the classical model assumes that the doxastic model is a probability distribution on a state space, the axiological model a utility function on a set of consequences, and the criterion the maximization of expected utility. In this case, the doxastic model is a model of partial beliefs. But there are choice models which are built on a model of full beliefs: this is the case of models like maximax or minimax (Luce and Raiffa, 1985, Chapter 13), where one assumes that the agent takes into account the subset of possible states that is compatible with his or her beliefs. The point is that, in both cases, the choice model inherits the cognitive idealizations of the doxastic model. Consequently, the choice model is cognitively at least as unrealistic as the doxastic model upon which it is based. Indeed, a choice model is strictly more unrealistic than its doxastic model, since it additionally assumes the axiological model and the implementation of the choice criterion. Hence, one of the main sources of cognitive idealization in choice models is the logical omniscience of their doxastic model; weakening logical omniscience in a decision-theoretic context is therefore one of the main ways to build more realistic choice models, i.e. to achieve bounded rationality. Surprisingly, whereas there has been extensive work on logical omniscience in epistemic logic, there have been very few attempts to investigate the extension of the putative solutions to the probabilistic representation of beliefs (probabilistic case) and to models of decision making (decision-theoretic case).1
1 One important exception is Lipman (1999).
The aim of this paper is to make some progress in filling this gap. Our method is the following one: given that a huge number of (putative) solutions to logical omniscience have been proposed in epistemic logic, we will not start from scratch, but will consider extensions of the main current solutions. Our main claim is that the solution that we will call the "nonstandard structures" constitutes the best candidate for this extension. The remainder of the paper proceeds as follows. In Section 2 the problem of logical omniscience and its most popular solutions are briefly recalled. Then it shall be argued that, among these solutions, nonstandard structures are the best basis for an extension to the probabilistic and decision-theoretic cases. Section 3 is devoted to the probabilistic case and states our main result: an axiomatization for nonstandard explicit probabilistic structures. In Section 4, we discuss the extension to the decision-theoretic case.

2. Logical omniscience in epistemic logic

2.1. Epistemic logic

Problems and propositions related to logical omniscience are best expressed in a logical framework, usually called "epistemic logic" (see Fagin et al. (1995) for an extensive technical survey and Stalnaker (1991), reprinted in Stalnaker (1998), for an illuminating philosophical discussion), which is nothing but a particular interpretation of modal logic. Here is a brief review of the classical model: Kripke structures. First, we have to define the language of propositional epistemic logic. The only difference with the language of propositional logic is that this language contains a doxastic operator B: $B\varphi$ is intended to mean "the agent believes that $\varphi$".

Definition 1. The set of formulas of an epistemic propositional language $\mathcal{L}_B(At)$ based on a set At of propositional variables, $\mathrm{Form}(\mathcal{L}_B(At))$, is defined by:2
$$\varphi ::= p \mid \lnot\varphi \mid \varphi \lor \psi \mid \varphi \land \psi \mid B\varphi$$

The interpretation of the formulas is given by the famous Kripke structures.

Definition 2. Let $\mathcal{L}_B(At)$ be an epistemic propositional language; a Kripke structure for $\mathcal{L}_B(At)$ is a 3-tuple $M = (S, \pi, R)$ where
(i) S is a state space,
(ii) $\pi: At \times S \to \{0, 1\}$ is a valuation, and
(iii) $R \subseteq S \times S$ is an accessibility relation.
2 This formulation (the so-called Backus–Naur Form) means, for instance, that the propositional variables are formulas, that if $\psi$ is a formula then $\lnot\psi$ is one too, and so on.
Intuitively, the accessibility relation associates to every state the states that the agent considers possible given his or her beliefs. $\pi$ associates to every atomic formula, in every state, a truth value; it is extended in a canonical way to every formula by the satisfaction relation.

Definition 3. The satisfaction relation, labelled $\vDash$, extends $\pi$ to every formula of the language according to the following conditions:
(i) $M, s \vDash p$ iff $\pi(p, s) = 1$,
(ii) $M, s \vDash \varphi \land \psi$ iff $M, s \vDash \varphi$ and $M, s \vDash \psi$,
(iii) $M, s \vDash \varphi \lor \psi$ iff $M, s \vDash \varphi$ or $M, s \vDash \psi$,
(iv) $M, s \vDash \lnot\varphi$ iff $M, s \nvDash \varphi$, and
(v) $M, s \vDash B\varphi$ iff $\forall s'$ s.t. $sRs'$, $M, s' \vDash \varphi$.
The specific doxastic condition contains what might be called the possible-state analysis of belief. It means that an agent believes that $\varphi$ if, in all the states that (according to him or her) could be the actual state, $\varphi$ is true: to believe something is to exclude that it could be false. Conversely, an agent does not believe $\varphi$ if, in some of the states that could be the actual state, $\varphi$ is false: not to believe is to consider that it could be false. This principle will be significant in the discussions below.

Example 1. $S = \{s_1, s_2, s_3, s_4\}$; p ("it's sunny") is true in $s_1$ and $s_2$, q ("it's windy") in $s_1$ and $s_4$. Suppose that $s_1$ is the actual state and that in this state the agent believes that p is true but does not know whether q is true. Figure 1 represents this situation, omitting the accessibility relation in the non-actual states.

Definition 4. Let M be a Kripke structure; in M, the set of states where $\varphi$ is true, or the proposition expressed by $\varphi$, or the informational content of $\varphi$, is noted $[\![\varphi]\!]_M = \{s : M, s \vDash \varphi\}$.

Figure 1. Kripke structures
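Definition 3 is directly executable. The following is a minimal sketch in Python of a model checker for Kripke structures; the encoding of formulas as nested tuples is ours and hypothetical. It reproduces Example 1, where, consistently with the example, the accessibility relation at the actual state $s_1$ is taken to reach $s_1$ and $s_2$.

```python
# A tiny model checker for the epistemic language of Definitions 1-3.
def holds(M, s, phi):
    S, pi, R = M                     # state space, valuation, accessibility
    if isinstance(phi, str):         # atomic formula
        return pi[(phi, s)] == 1
    op = phi[0]
    if op == "not":
        return not holds(M, s, phi[1])
    if op == "and":
        return holds(M, s, phi[1]) and holds(M, s, phi[2])
    if op == "or":
        return holds(M, s, phi[1]) or holds(M, s, phi[2])
    if op == "B":                    # clause (v): true at all accessible states
        return all(holds(M, t, phi[1]) for t in S if (s, t) in R)
    raise ValueError(op)

# Example 1: p true in s1, s2; q true in s1, s4; from s1 the agent
# considers s1 and s2 possible.
S = ["s1", "s2", "s3", "s4"]
pi = {(a, s): 0 for a in "pq" for s in S}
pi.update({("p", "s1"): 1, ("p", "s2"): 1, ("q", "s1"): 1, ("q", "s4"): 1})
R = {("s1", "s1"), ("s1", "s2")}
M = (S, pi, R)
assert holds(M, "s1", ("B", "p")) and not holds(M, "s1", ("B", "q"))
```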
To formulate logical omniscience, we lastly need to define the following semantical relations between formulas.

Definition 5. $\varphi$ M-implies $\psi$ if $[\![\varphi]\!]_M \subseteq [\![\psi]\!]_M$; $\varphi$ and $\psi$ are M-equivalent if $[\![\varphi]\!]_M = [\![\psi]\!]_M$.

There are several forms of logical omniscience (see Fagin et al., 1995); the next proposition shows that two of them, deductive monotony and intensionality, hold in Kripke structures:

Proposition 1. Let M be a Kripke structure and $\varphi, \psi \in \mathcal{L}_B(At)$.
(i) Deductive monotony: if $\varphi$ M-implies $\psi$, then $B\varphi$ M-implies $B\psi$, and
(ii) Intensionality: if $\varphi$ and $\psi$ are M-equivalent, then $B\varphi$ and $B\psi$ are M-equivalent.

Both properties are obvious theorems in the axiom system K, which is sound and complete for Kripke structures:
System K
(PROP) Instances of propositional tautologies
(MP) From $\varphi$ and $\varphi \to \psi$ infer $\psi$
(K) $B\varphi \land B(\varphi \to \psi) \to B\psi$
(RN) From $\varphi$ infer $B\varphi$
2.2. Three solutions to logical omniscience

A huge number of solutions have been proposed to weaken logical omniscience, and arguably no consensus has been reached (see Fagin et al., 1995).3 We identify three main solutions to logical omniscience, which are our three candidates for an extension to the probabilistic or decision-theoretic case. There is probably some arbitrariness in this selection, but they are among the most used, natural and powerful existing solutions.

2.2.1. Neighborhood structures

The "neighborhood structures", sometimes called "Montague–Scott structures", are our first candidate. The basic idea is to make explicit the propositions that the agent believes; the neighborhood system of an agent at a given state is precisely the set of propositions that the agent believes.
3 We have contributed to this industry by defending the use of substructural logics in Cozic (2006); this setting is not tractable enough for the aim of the current paper.
Definition 6. A neighborhood structure is a 3-tuple $M = (S, \pi, V)$ where
(i) S is a state space,
(ii) $\pi: At \times S \to \{0, 1\}$ is a valuation, and
(iii) $V: S \to \wp(\wp(S))$, called the agent's neighborhood system, associates to every state a set of propositions.

The conditions on the satisfaction relation are the same, except for the doxastic operator:
$$M, s \vDash B\varphi \text{ iff } [\![\varphi]\!]_M \in V(s)$$

It is easy to check that deductive monotony is invalidated by neighborhood structures, as shown by the following example.

Example 2. Let us consider the first example and replace the accessibility relation by a neighborhood system; $V(s_1)$ contains $\{s_1, s_2\}$ but not $\{s_1, s_2, s_4\}$. Then, in $s_1$, $Bp$ is true but not $B(p \lor q)$. This is represented in Figure 2.4

As expected, one can regain deductive monotony by closing the neighborhood systems under supersets. Nonetheless, the axiomatization presented below5 makes clear that the power of neighborhood systems is limited: intensionality cannot be weakened.

System E (Chellas, 1980)
(PROP) Instances of propositional tautologies
(MP) From $\varphi$ and $\varphi \to \psi$ infer $\psi$
(RE) From $\varphi \leftrightarrow \psi$ infer $B\varphi \leftrightarrow B\psi$

2.2.2. Awareness structures

The second solution, due to R. Fagin and J. Halpern (1988),6 is the "awareness structures". The basic idea is to put a syntactical filter on the agent's beliefs. The term "awareness" suggests that this can be interpreted as reflecting the agent's awareness state, but other interpretations are conceivable as well.

Definition 7. An awareness structure is a 4-tuple $(S, \pi, R, A)$ where
(i) S is a state space,
(ii) $\pi: At \times S \to \{0, 1\}$ is a valuation,
(iii) $R \subseteq S \times S$ is an accessibility relation, and
(iv) $A: S \to \wp(\mathrm{Form}(\mathcal{L}_B(At)))$ is a function which maps every state to a set of formulas (the "awareness set").
4 This recurring example is not chosen for its cognitive realism, but because it makes the comparison of the different solutions easy.
5 The system E is sound and complete with respect to neighborhood structures.
6 What we call "awareness structures" is called in the original paper the "logic of general awareness".
Figure 2. Neighborhood structures ($V(s_1) = \{\{s_1, s_2\}\}$)

Figure 3. Awareness structures ($A(s_1) = \{p\}$)
The new condition on the satisfaction relation is the following:
$$M, s \vDash B\varphi \text{ iff } \forall s' \text{ s.t. } sRs', s' \in [\![\varphi]\!]_M, \text{ and } \varphi \in A(s)$$

This new doxastic condition makes it possible to weaken any form of logical omniscience; in particular, our example shows how to model an agent who violates deductive monotony.

Example 3. Let us consider our example and stipulate that $A(s_1) = \{p\}$. Then it is still the case that $Bp$ is true in $s_1$, but not $B(p \lor q)$. This is represented in Figure 3.

If one keeps the basic language $\mathcal{L}_B(At)$, one obtains as axiom system a minimal epistemic logic which eliminates any form of logical omniscience:
Minimal Epistemic Logic (FHMV, 1995)
(PROP) Instances of propositional tautologies
(MP) From $\varphi$ and $\varphi \to \psi$ infer $\psi$

2.2.3. Nonstandard structures

We now switch to our last solution: the nonstandard structures, which are sometimes called "Kripke structures with impossible states". Contrary to the two preceding solutions, neither the accessibility relation nor the doxastic condition is modified. What is revised is the underlying state space or, more precisely, the nature of the satisfaction relation in certain states of the state space.

Definition 8. A nonstandard structure is a 5-tuple $M = (S, S', \pi, R, \vDash)$ where
(i) S is a space of standard states,
(ii) $S'$ is a space of nonstandard states,
(iii) $R \subseteq (S \cup S') \times (S \cup S')$ is an accessibility relation,
(iv) $\pi: \mathrm{Form}(\mathcal{L}_B(At)) \times S \to \{0, 1\}$ is a valuation on S, and
(v) $\vDash$ is a satisfaction relation which is standard on S (recursively defined as usual) but arbitrary on $S'$.
In nonstandard structures, there are no a priori constraints on the satisfaction relation in nonstandard states. For instance, in a nonstandard state $s'$, both $\varphi$ and $\lnot\varphi$ can be false. For every formula $\varphi$, one might therefore distinguish its objective informational content $[\![\varphi]\!]_M = \{s \in S : M, s \vDash \varphi\}$ from its subjective informational content $[\![\varphi]\!]^*_M = \{s \in S^* = S \cup S' : M, s \vDash \varphi\}$. In spite of appearances, this generalization of Kripke structures is arguably natural as soon as one accepts the possible-state analysis of beliefs. Recall that, according to this analysis, to believe that $\varphi$ is to exclude that $\varphi$ could be false, and not to believe that $\psi$ is not to exclude that $\psi$ could be false.
In consequence, according to the possible-state analysis, to believe that $\varphi$ but not to believe one of its logical consequences $\psi$ is to consider as possible at least one state where $\varphi$ is true but $\psi$ false. By definition, a state of this kind is logically nonstandard. Nonstandard structures are the most straightforward way to keep the possible-state analysis of beliefs.7

Example 4. Let us consider our example but add a nonstandard state, $S' = \{s_5\}$; we stipulate that $M, s_5 \vDash p$ but that $M, s_5 \nvDash (p \lor q)$. Then in $s_1$, $Bp$ is true but not $B(p \lor q)$. This is represented in Figure 4.
7 For a more extensive defense of the solution, see Hintikka (1975) or, more recently, Barwise (1997).
Figure 4. Nonstandard structures (the nonstandard state $s_5$ is added to the states of Figure 1)
3. The probabilistic case

Mainstream decision theory is based on doxastic models of partial beliefs, not of full beliefs. Hence, weakenings of logical omniscience in the framework of doxastic logic do not directly give a way of weakening logical omniscience that is appropriate for decision theorists. The aim of this section is to study the probabilistic extension of doxastic models without logical omniscience.

3.1. Probabilistic counterpart of logical omniscience

First, we have to define the probabilistic counterparts of logical omniscience. In the usual (non-logical) framework, if P is a probability distribution on S,8 then the following property is the counterpart of logical omniscience: if $E \subseteq E'$, then $P(E) \le P(E')$. But to be closer to the preceding section, it is better to work with an elementary9 logical version of the usual probabilistic model:

Definition 9. Let $\mathcal{L}(At)$ be a propositional language; a probabilistic structure10 for $\mathcal{L}(At)$ is a 3-tuple $M = (S, \pi, P)$ where
(i) S is a state space,
(ii) $\pi$ is a valuation, and
(iii) P is a probability distribution on S.
8 In the paper, to avoid complications that are unnecessary for our purpose, we suppose that S is finite and that P is defined on $\wp(S)$.
9 "Elementary" because there is no doxastic operator in the object-language.
10 See Fagin and Halpern (1991). For a recent reference on the logical formalization of probabilistic reasoning, see Halpern (2003).
We will say that an agent believes to degree r a formula $\varphi \in \mathrm{Form}(\mathcal{L}(At))$, symbolized by $CP(\varphi) = r$, if $P([\![\varphi]\!]_M) = r$.11 We can state the precise probabilistic counterparts of logical omniscience:

Proposition 2. The following holds in probabilistic structures:
(i) deductive monotony: if $\varphi$ M-implies $\psi$, then $CP(\varphi) \le CP(\psi)$, and
(ii) intensionality: if $\varphi$ and $\psi$ are M-equivalent, then $CP(\varphi) = CP(\psi)$.

One can check that these are indeed the counterparts of logical omniscience by looking at the limit case of certainty, i.e. of maximal degree of belief: (i) if an agent is certain that $\varphi$ and if $\varphi$ M-implies $\psi$, then the agent is certain that $\psi$ as well; (ii) if $\varphi$ and $\psi$ are M-equivalent, then an agent is certain that $\varphi$ iff he or she is certain that $\psi$.

Which of the three solutions should we choose for this extension? (a) First, we should eliminate neighborhood structures because their power is limited: intensionality is too strong an idealization. This is especially sensitive in a decision context where, under the label of "framing effects", it has long been recognized that logically equivalent formulations of a decision problem can lead to different behaviors. (b) Second, the extension of awareness structures seems intrinsically tricky. Suppose that an agent believes $\varphi$ to degree $r_\varphi$ and $\psi$ to degree $r_\psi$, with $\varphi$ M-implying $\psi$ and $r_\varphi > r_\psi$. This is a failure of deductive monotony. Now, in an analogous situation, the way awareness structures proceed in epistemic logic is by "dropping" the formula $\psi$. Let us apply this method to the probabilistic case: we would say that an agent believes $\varphi$ to degree r if $P([\![\varphi]\!]_M) = r$ and he or she is aware of $\varphi$. But then no one could model a situation like the preceding one: either the agent is aware of $\psi$, and in this case necessarily he or she believes $\psi$ to a degree $r_\psi \ge r_\varphi$; or he or she is not aware of $\psi$, and in this case he or she has no degree of belief toward $\psi$. This is not a knock-down argument, but it implies that if one wants to extend awareness structures, one has to make them substantially more sophisticated. (c) Lastly, the extension of awareness structures is problematic in our perspective, i.e. a perspective of decision-theoretic application. To see why, let us notice that a choice criterion like expected utility might be seen as a function whose first argument is a doxastic model and whose second argument is an axiological model. If we were to extend the awareness structures, the first argument of an expected utility criterion would no longer be a simple probability distribution. Consequently, we would have to revise our choice criterion. For sure, nothing precludes such a move, but simplicity recommends another tactic.
11 Note that $CP(\varphi) = r$ is in the meta-language, not in the object-language.
We are therefore left with nonstandard structures. Nonstandard structures do not suffer from the above-mentioned troubles: they are as powerful as one can wish, the extension is intrinsically simple, and they should make it possible to keep the usual choice criteria when embedded in a choice model. This is our motivation, but now we have to turn to positive arguments.12

3.2. Nonstandard implicit probabilistic structures

To give the basic insights and show the fruitfulness of the proposition, we will continue to work in the elementary setting where no doxastic operators are in the object-language.

Definition 10. Let $\mathcal{L}(At)$ be a propositional language; a nonstandard implicit probabilistic structure for $\mathcal{L}(At)$ is a 5-tuple $M = (S, S', \pi, \vDash, P)$ where
(i) S is a standard state space,
(ii) $S'$ is a nonstandard state space,
(iii) $\pi: \mathrm{Form}(\mathcal{L}(At)) \times S \to \{0, 1\}$ is a valuation on S,
(iv) $\vDash$ is a satisfaction relation which is standard on S but arbitrary on $S'$, and
(v) P is a probability distribution on $S^* = S \cup S'$.
As in the set-theoretic case, one can distinguish the objective informational content of a formula, i.e. the standard states where this formula is true, and the subjective informational content of a formula, i.e. the states where this formula is true. To obtain the expected benefit, the nonstandard probabilistic structures should characterize the agent's doxastic state on the basis of subjective informational content: an agent believes a formula $\varphi$ to degree r, $CP(\varphi) = r$, if $P([\![\varphi]\!]^*_M) = r$. It is easy to check that, in this case, logical omniscience can be utterly controlled.

Example 5. Let us take the same state space as in the preceding examples. Suppose that the agent has the following partial beliefs: $CP(p) > CP(p \lor q)$. This can be modeled in the following way: $S' = \{s_5\}$, $s_5 \in [\![p]\!]^*_M$ but $s_5 \notin [\![p \lor q]\!]^*_M$; $P(s_1) = P(s_2) = P(s_3) = P(s_4) = 1/8$ and $P(s_5) = 1/2$. This is represented in Figure 5.
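The numbers in Example 5 are easy to verify. Here is a minimal sketch in Python; the state names and subjective contents come from the running example, while the dictionary encoding and the helper CP are ours.

```python
from fractions import Fraction as F

# States: s1..s4 standard, s5 nonstandard (as in Example 5).
P = {"s1": F(1, 8), "s2": F(1, 8), "s3": F(1, 8), "s4": F(1, 8), "s5": F(1, 2)}

# Subjective informational contents [[.]]*: p holds in s1, s2 and, by
# stipulation, in the nonstandard state s5; p v q holds in s1, s2, s4 only.
content = {
    "p":     {"s1", "s2", "s5"},
    "p v q": {"s1", "s2", "s4"},
}

def CP(phi):
    """Degree of belief: CP(phi) = P([[phi]]*)."""
    return sum(P[s] for s in content[phi])

assert CP("p") == F(3, 4) and CP("p v q") == F(3, 8)  # CP(p) > CP(p v q)
```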
3.3. Special topics: deductive information and additivity

This extension of nonstandard structures is admittedly straightforward and simple. It immediately gives the means to weaken logical idealizations. Furthermore, it opens perspectives specific to the probabilistic case; two of them will be briefly mentioned.
12 A similar idea was defended long ago by I. Hacking, who talks about "personal possibility", by contrast with "logical possibility". We will not develop the point here, but this contribution can be seen as a formalization of Hacking's insights (Hacking, 1967).
Figure 5. Probabilistic nonstandard structures ($\pi(p \lor q, s_5) = 0$; $P(s_1) = \cdots = P(s_4) = 1/8$, $P(s_5) = 1/2$)
Deductive information and learning. First, one can model the fact that an agent acquires not only empirical information but also deductive information; in nonstandard structures, this corresponds to the fact that the agent eliminates nonstandard states. Let us come back to our generic situation. Suppose that our agent learns that $\varphi$ implies $\psi$. This means that he or she learns that the states where $\varphi$ is true but $\psi$ false are impossible. This is equivalent to saying that he or she learns the event
$$I = S^* \setminus ([\![\varphi]\!]^*_M \setminus [\![\psi]\!]^*_M)$$
To be satisfying, such a notion of deductive information must respect a requirement of compatibility between revising and logical monotony: if the agent learns that $\varphi$ implies $\psi$ and revises his or her beliefs upon this fact, his or her new probability distribution should conform to logical monotony with respect to $\varphi$ and $\psi$. One can check that this is the case with the main revising rule, i.e. conditionalization.

Proposition 3. If I is learned by conditionalization, then deductive monotony is regained, i.e. $CP_I(\varphi) \le CP_I(\psi)$.

Example 6. This can be checked in the preceding example: $I = S = \{s_1, s_2, s_3, s_4\}$. By conditionalization, $CP_I(p) = 1/2$ whereas $CP_I(p \lor q) = 3/4$.

Additivity. A second topic is additivity. From a logical point of view, one can define additivity as follows:

Definition 11. M is (logically) additive if, when $\varphi$ and $\psi$ are logically incompatible, $CP(\varphi) + CP(\psi) = CP(\varphi \lor \psi)$.
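Before developing additivity, one can verify Example 6 numerically by continuing the sketch given after Example 5 (same hypothetical P and content sets): conditionalization on the learned event I eliminates the nonstandard state $s_5$ and restores deductive monotony.

```python
def CP_given(phi, I):
    """Revision by conditionalization on the learned event I."""
    return sum(P[s] for s in content[phi] & I) / sum(P[s] for s in I)

I = {"s1", "s2", "s3", "s4"}            # the nonstandard state is eliminated
assert CP_given("p", I) == F(1, 2)
assert CP_given("p v q", I) == F(3, 4)  # deductive monotony regained
```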
Additivity is of course the core of the probabilistic representation of beliefs, and alternative representations of beliefs often depart from probability on this point. For example, in the Dempster–Shafer theory (Shafer, 1976), the so-called belief function is superadditive (in our notation, $CP(\varphi \lor \psi) \ge CP(\varphi) + CP(\psi)$) whereas its dual, the plausibility function, is subadditive ($CP(\varphi \lor \psi) \le CP(\varphi) + CP(\psi)$). A noteworthy aspect of probabilistic nonstandard structures is that the freedom of the connectives' behavior in nonstandard states gives us a very flexible framework with respect to additivity: simple conditions on the connectives imply general properties concerning additivity.

Definition 12. Let $M = (S, S', \pi, \vDash, P)$ be a nonstandard probabilistic structure; M is $\lor$-standard if for all formulas $\varphi, \psi$: $[\![\varphi \lor \psi]\!]^*_M = [\![\varphi]\!]^*_M \cup [\![\psi]\!]^*_M$.

This means that the disjunction behaves in the usual way in nonstandard states; a trivial consequence is that the structure M is (logically) subadditive.

Proposition 4. If M is $\lor$-standard, then it is (logically) subadditive.

To be a little more general, one can consider the (logical) inclusion–exclusion rule:
$$CP(\varphi \lor \psi) = CP(\varphi) + CP(\psi) - CP(\varphi \land \psi)$$
One can define (logical) submodularity (respectively, supermodularity or convexity) as:
$$CP(\varphi \lor \psi) \le CP(\varphi) + CP(\psi) - CP(\varphi \land \psi)$$
(respectively, $CP(\varphi \lor \psi) \ge CP(\varphi) + CP(\psi) - CP(\varphi \land \psi)$). It is clear that to control submodularity, we have to control the conjunction's behavior.

Definition 13. Let $M = (S, S', \pi, \vDash, P)$ be a probabilistic nonstandard structure;
(i) M is negatively $\land$-standard if for all formulas $\varphi, \psi$: when $M, s \nvDash \varphi$ or $M, s \nvDash \psi$, then $M, s \nvDash \varphi \land \psi$.
(ii) M is positively $\land$-standard if for all formulas $\varphi, \psi$: when $M, s \vDash \varphi$ and $M, s \vDash \psi$, then $M, s \vDash \varphi \land \psi$.
Proposition 5. Suppose that M is $\lor$-standard:
– if M is negatively $\land$-standard, then submodularity holds;
– if M is positively $\land$-standard, then supermodularity holds.
Proof: see the Appendix.
3.4. Nonstandard explicit probabilistic structures

Implicit probabilistic structures are not very expressive; to have a true analogon of epistemic logic, we have to start from an object-language that contains (partial) doxastic operators. Following Aumann (1999) and Heifetz and Mongin (2001), we consider the operator $L_a$.13,14 The intuitive meaning of $L_a\varphi$ is: the agent believes at least to degree a that $\varphi$. Note that we add the usual symbols $\top$ and $\bot$: $\top$ is what the agent recognizes as necessarily true and $\bot$ is what he or she recognizes as necessarily false.

Definition 14. The set of formulas of an explicit probabilistic language $\mathcal{L}_L(At)$ based on a set At of propositional variables, $\mathrm{Form}(\mathcal{L}_L(At))$, is defined by:
$$\varphi ::= p \mid \bot \mid \top \mid \lnot\varphi \mid \varphi \lor \psi \mid L_a\varphi$$
where $p \in At$ and $a \in [0, 1] \cap \mathbb{Q}$.

The corresponding structures are an obvious extension of implicit nonstandard structures.

Definition 15. A nonstandard explicit probabilistic structure for $\mathcal{L}_L(At)$ is a 5-tuple $M = (S, S', \pi, \vDash, P)$ where
(i) $\vDash$ is a satisfaction relation s.t.
(a) $\vDash$ is standard on S for all propositional connectives,
(b) $\forall s \in S$, $M, s \vDash L_a\varphi$ iff $P(s)([\![\varphi]\!]^*_M) \ge a$, and
(c) $\forall s \in S \cup S'$, $M, s \vDash \top$ and $M, s \nvDash \bot$;
(ii) $P: S^* \to \Delta(S^*)$ assigns to every state a probability distribution on the state space.

In Aumann (1999), R. Aumann did not succeed in axiomatizing (standard) explicit probabilistic structures, but Heifetz and Mongin (2001) have recently devised an axiom system that is (weakly) complete for these structures. In comparison with epistemic logic, one of the problems is that the adaptation of the usual proof method, i.e. the method of canonical models, is not trivial. More precisely, in the case of epistemic logic, it is easy to define a canonical accessibility relation on the canonical state space. This is not the case in the probabilistic framework, where strong axioms are needed to guarantee it. Fortunately, the nonstandard structures permit huge simplifications, and one can devise an axiom system that essentially mimics the Minimal Epistemic Logic described above.
13 Economists are leading contributors to the study of explicit probabilistic structures because they correspond to the so-called type spaces used in games of incomplete information, in the same way that Kripke structures (with R an equivalence relation) correspond to partitional models of knowledge. See Aumann and Heifetz (2002).
14 Note that another language is used by J. Halpern in Fagin and Halpern (1991) and Halpern (2003).
Minimal Probabilistic Logic
(PROP) Instances of propositional tautologies
(MP) From $\varphi$ and $\varphi \to \psi$ infer $\psi$
(A1) $L_0\varphi$
(A2) $L_a\top$
(A2+) $\lnot L_a\bot$ $(a > 0)$
(A7) $L_a\varphi \to L_b\varphi$ $(b < a)$

The axioms' notation follows Heifetz and Mongin (2001) to facilitate comparison. Axioms (A2) and (A2+) reflect our semantics for $\top$ and $\bot$: the agent believes to maximal degree what he or she considers as necessarily true and does not believe to any degree what he or she considers as necessarily false. (A1) and (A7) reflect principles specific to the probabilistic case. Note that both bear on a single embedded formula $\varphi$: there is no doxastic reflection of a logical relation. They express something like a minimal metric of partial beliefs. If $\vDash_{NSEPS} \varphi$ means that $\varphi$ is true in every nonstandard explicit probabilistic structure and $\vdash_{MPL} \varphi$ that $\varphi$ is provable in the Minimal Probabilistic Logic, then we are ready to state our main result:

Theorem 1. (Soundness and Completeness of MPL) $\vDash_{NSEPS} \varphi$ iff $\vdash_{MPL} \varphi$.
Proof: see the Appendix.

4. Insights into the decision-theoretic case

We would like to end this paper by showing how to build choice models without logical omniscience, and what challenges are raised by such a project.

4.1. Choice models without logical omniscience

The basic method to build a choice model without logical omniscience is to substitute one of our nonstandard structures for the original doxastic model in the target choice model. We will now show how this could be done. One might generically see models of choice under uncertainty as based on
– a state space S,
– a set A of actions,
– a consequence function $C: S \times A \to \mathcal{C}$, where $\mathcal{C}$ is a set of consequences,
– a utility function $u: \mathcal{C} \to \mathbb{R}$, and
– a criterion of choice.
To complete the choice model, one adds a distribution P on S for models of choice under probabilistic uncertainty, and a set $K \subseteq S$ of states compatible with the agent's beliefs under set-theoretic uncertainty. To rigorously extend nonstandard structures to choice models, one should translate the above-described notions into a logical setting. But to give some insights, we will, on the contrary, import nonstandard structures into the syntax-free framework of conventional decision theory. Let us have a look at the following, admittedly particular, target situation: an agent knows abstractly the consequence function C, but, because of limited computational capacities, he or she is not able, at the moment of choice, to perfectly infer from the consequence function the consequence of each action at each possible state. One can think about a classic two-state example of an insurance application.15 The consequence function is
$$C(s_1, x) = w - px, \quad C(s_2, x) = y + x,$$
where x, the choice variable, is the amount of money spent on insurance, $s_1$ the state without disaster, w the wealth in $s_1$, $s_2$ the state with a disaster and y the subsequent wealth, and p the rate of exchange. In this case, an agent who is not logically omniscient with respect to the consequence function would be one who does not know the value of C for some arguments. A simple way to model this target situation would be the following one. Let us consider extended states w, which are composed of a (primitive) state s and a local consequence function $C_w: A \to \mathcal{C}$: $w = (s, C_w)$. The set of extended states is intended to represent the beliefs of the agent, including his or her logically imperfect beliefs. An extended state is standard if its local consequence function conforms to the (true) consequence function: $C_w(a) = C(s, a)$; if not, it is nonstandard. For instance, a logically imperfect agent might not know the consequence of action a in state s, thinking that this consequence might be $c_i$ (let us say, the true one) or $c_j$. This situation would be modeled by building (at least) two extended states: $w_i = (s, C_{w_i})$ where $C_{w_i}(a) = c_i$, and $w_j = (s, C_{w_j})$ where $C_{w_j}(a) = c_j$. A perfect logician would not have considered a state like $w_j$ possible. On this basis, one can build choice models without the assumption of logical omniscience:
15 From Lippman and McCall (1981).
– in the case of choice under set-theoretic uncertainty, if one takes the maximin criterion, for a belief set $K \subseteq W$, the solution is
$$Sol(A, S, W, \mathcal{C}, C, E, u, K) = \arg\max_{a \in A} \min_{w \in K} u(C_w(a));$$
and
– in the case of choice under probabilistic uncertainty, if one takes the maximization of expected utility criterion, for a probability distribution P on W, the solution is
$$Sol(A, S, W, \mathcal{C}, C, E, u, P) = \arg\max_{a \in A} \sum_{w \in W} P(w)\, u(C_w(a)).$$
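As an illustration, here is a minimal numeric sketch in Python of both criteria on the insurance example. All the numbers, the concave utility and the nonstandard extended state (whose local consequence function misstates the insurance payout) are invented for the purpose of illustration.

```python
# Insurance example: s1 = no disaster, s2 = disaster; x = amount of insurance.
w0, y0, price = 100.0, 20.0, 0.25        # hypothetical wealth levels and rate
A = [0.0, 20.0, 40.0]                    # a few candidate insurance levels

def true_C(s, x):                        # the (true) consequence function
    return w0 - price * x if s == "s1" else y0 + x

def u(c):                                # some concave utility function
    return c ** 0.5

# Extended states w = (primitive state, local consequence function). The last
# one is nonstandard: its local function disagrees with the true one.
W = [
    ("s1", lambda x: true_C("s1", x)),
    ("s2", lambda x: true_C("s2", x)),
    ("s2", lambda x: y0 + 0.5 * x),      # wrongly believed payout
]

# Maximin over a belief set K of extended states:
K = W
sol_maximin = max(A, key=lambda x: min(u(Cw(x)) for _, Cw in K))

# Expected utility for a probability distribution P over extended states:
P = [0.5, 0.25, 0.25]
sol_eu = max(A, key=lambda x: sum(pw * u(Cw(x)) for pw, (_, Cw) in zip(P, W)))
print(sol_maximin, sol_eu)
```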
4.2. Open questions

From the decision theorist's point of view, the substitution we have just described is only a first step. Two fundamental questions remain. (a) First, there is the question of the axiomatization of the new choice models, a question closely linked with the behavioral implications of choice models without logical omniscience. In a recent paper, B. Lipman (Lipman, 1999) has remarkably tackled this issue, advocating a very similar approach. But the choice model he uses is quite specific (conditional expected utility), and one would like to compare choice models based on nonstandard structures with the Savagean benchmark. More precisely, one would like to obtain a representation theorem à la Savage: define conditions on a preference relation $\succsim$ such that there exist (1) a space of extended states W, (2) a probability distribution P on W and (3) a utility function u such that the preference relation can be rationalized by the expected utility defined over the preceding notions. (b) Second, the nonstandard choice models weaken only the cognitive assumptions of the (underlying) doxastic model. But there remain cognitive assumptions concerning the utility function and the choice criterion. In the approach we have just described, we still assume that the agent is able to assign a precise utility to each consequence $c \in \mathcal{C}$ and to calculate the solution of his or her choice criterion. Therefore, from the point of view of the bounded rationality program, our proposal is strongly incomplete.

5. Conclusion

This paper has advocated the use of nonstandard or impossible states as a general framework to "unidealize" belief and choice models. This admittedly does not permit a complete treatment of the idealizations underlying conventional choice models, but can be seen as a first step toward a fine-grained modeling of bounded rationality.

Acknowledgments

I would like to thank D. Andler, A. Barberousse, A. Barton, D. Bonnay, I. Drouet, P. Egré, B. Kooi, Ph. Mongin, C. Paternotte, B. Walliser and three anonymous referees for their helpful comments. I am also indebted to the seminar participants of "Formal Philosophy" and "Philosophy of Probability"
(both at IHPST, CNRS-Paris I-ENS Ulm), and to colloquium participants of ECCE1 (Gif-sur-Yvette, September 2004), the First Congress of the Society of Philosophy of Sciences (Paris, January 2005) and the First Paris–Amsterdam Meeting of Young Researchers (Amsterdam, June 2005).
References

Aumann, R. (1999), "Interactive knowledge", International Journal of Game Theory, Vol. 28, pp. 263–300.
Aumann, R. and A. Heifetz (2002), "Incomplete information", pp. 1665–1686 in: R. Aumann and S. Hart, editors, Handbook of Game Theory, Vol. 3, Amsterdam: Elsevier.
Barwise, J. (1997), "Information and impossibilities", Notre Dame Journal of Formal Logic, Vol. 38(4), pp. 488–515.
Blackburn, P., M. de Rijke and Y. Venema (2001), Modal Logic, Cambridge: Cambridge University Press.
Chellas, B. (1980), Modal Logic, Cambridge: Cambridge University Press.
Cozic, M. (2006), "Epistemic models, logical monotony and substructural logics", in: J. van Benthem, editor, The Age of Alternative Logics, Kluwer Academic Publishers.
Fagin, R. and J. Halpern (1988), "Belief, awareness, and limited reasoning", Artificial Intelligence, Vol. 34, pp. 39–76.
Fagin, R. and J. Halpern (1991), "Uncertainty, belief and probability", Computational Intelligence, Vol. 7, pp. 160–173.
Fagin, R., J. Halpern, Y. Moses and M. Vardi (1995), Reasoning about Knowledge, Cambridge, MA: MIT Press.
Hacking, I. (1967), "Slightly more realistic personal probability", Philosophy of Science, Vol. 34, pp. 311–325.
Halpern, J. (2003), Reasoning about Uncertainty, Cambridge, MA: MIT Press.
Heifetz, A. and P. Mongin (2001), "Probability logic for type spaces", Games and Economic Behavior, Vol. 35, pp. 34–53.
Hintikka, J. (1975), "Impossible worlds vindicated", Journal of Philosophical Logic, Vol. 4, pp. 475–484.
Lipman, B. (1999), "Decision theory without logical omniscience: toward an axiomatic framework for bounded rationality", The Review of Economic Studies, Vol. 66(2), pp. 339–361.
Lippman, S. and J. McCall (1981), "The economics of uncertainty: selected topics and probabilistic methods", pp. 210–284 in: K. Arrow and M. Intriligator, editors, Handbook of Mathematical Economics, Vol. I, Elsevier.
Luce, R. and H. Raiffa (1985), Games and Decisions. Introduction and Critical Survey, 2nd edn, New York: Dover.
Shafer, G. (1976), A Mathematical Theory of Evidence, Princeton: Princeton University Press.
Stalnaker, R. (1991), "The problem of logical omniscience, I", Synthese, Vol. 89, pp. 425–440.
Stalnaker, R. (1998), Context and Content, Oxford: Oxford University Press.

Appendix

Proof of Proposition 5

The proof deals only with the case of submodularity; the other is symmetric. If $[\![\varphi]\!]^*$ and $[\![\psi]\!]^*$ are disjoint, then by hypothesis $[\![\varphi \land \psi]\!]^* = \emptyset$. Therefore $CP(\varphi \lor \psi) = CP(\varphi) + CP(\psi) - CP(\varphi \land \psi)$. It follows from the definition that if $M, s \vDash \varphi \land \psi$, then $M, s \vDash \varphi$ and $M, s \vDash \psi$ (the converse does not hold). In other words, (1) $[\![\varphi \land \psi]\!]^* \subseteq [\![\varphi]\!]^* \cap [\![\psi]\!]^*$. This implies that (2) $P([\![\varphi \land \psi]\!]^*) \le P([\![\varphi]\!]^* \cap [\![\psi]\!]^*)$. Since M is $\lor$-standard, $P([\![\varphi \lor \psi]\!]^*) = P([\![\varphi]\!]^*) + P([\![\psi]\!]^*) - P([\![\varphi]\!]^* \cap [\![\psi]\!]^*)$. By (2), it follows from this that $P([\![\varphi \lor \psi]\!]^*) \le P([\![\varphi]\!]^*) + P([\![\psi]\!]^*) - P([\![\varphi \land \psi]\!]^*)$. $\square$

Proof of Theorem 1

($\Rightarrow$) Soundness is easily checked and is left to the reader.
($\Leftarrow$) We have to show that $\vDash_{NSEPS} \varphi$ implies $\vdash_{MPL} \varphi$. First, let us notice that the Minimal Probabilistic Logic (MPL) is a "modal logic" (see Blackburn et al., 2001, p. 191): a set of formulas (1) that contains every propositional tautology and (2) that is closed under modus ponens and uniform substitution. One can then apply the famous Lindenbaum Lemma.

Definition 16.
(i) A formula $\varphi$ is deducible from a set of formulas $\Gamma$, symbolized $\Gamma \vdash \varphi$, if there exist some formulas $\psi_1, \ldots, \psi_n$ in $\Gamma$ s.t. $\vdash (\psi_1 \land \cdots \land \psi_n) \to \varphi$.
(ii) A set of formulas $\Gamma$ is consistent if it is false that $\Gamma \vdash \bot$.
(iii) A set of formulas $\Gamma$ is maximally consistent if (1) it is consistent and (2) it is not strictly included in a consistent set of formulas.

Lemma 1. (Lindenbaum Lemma) If $\Gamma$ is a consistent set of formulas, then there exists an extension $\Gamma^+$ of $\Gamma$ that is maximally consistent.
Proof: see for instance Blackburn et al. (2001, p. 199).

Definition 17. Let $\varphi \in \mathcal{L}_L$; the language associated with $\varphi$, $\mathcal{L}_\varphi$, is the smallest sublanguage that
(i) contains $\varphi$, $\bot$ and $\top$,
(ii) is closed under sub-formulas, and
(iii) is closed under the symbol $\sim$ defined as follows: $\sim\chi := \psi$ if $\chi = \lnot\psi$, and $\sim\chi := \lnot\chi$ otherwise.16

In the language $\mathcal{L}_\varphi$, one can define the analogon of the maximally consistent sets.

Definition 18. An atom is a set of formulas in $\mathcal{L}_\varphi$ which is maximally consistent. $At(\varphi)$ is the set of atoms.

Lemma 2. For every atom $\Gamma$, (i) there exists a unique extension of $\Gamma$ in $\mathcal{L}_L$, symbolized $\Gamma^+$, that is maximally consistent, and (ii) $\Gamma = \Gamma^+ \cap \mathcal{L}_\varphi$.

Proof. (i) An application of the Lindenbaum Lemma. (ii) Implied by the fact that $\Gamma$ is maximally consistent: suppose that there exists a formula $\psi$ of $\mathcal{L}_\varphi$ in $\Gamma^+$ but not in $\Gamma$; then $\Gamma^+$ would be inconsistent, which is excluded by hypothesis. $\square$

Starting from atoms, one may define the analogon of canonical structures, i.e. structures where (standard) states are sets of maximally consistent formulas. In the same way, we will take as canonical standard state space the atoms of the language $\mathcal{L}_\varphi$. The hard part is the definition of the probability distributions. The aim is to make true in $s_\Gamma$ every formula $L_a\chi$ in the atom $\Gamma$ associated with the state $s_\Gamma$. To do that, it is necessary that $P(s_\Gamma, \chi) \ge a$; this is guaranteed if one takes for $P(s_\Gamma, \chi)$ the number $b^*$ s.t. $b^* = \max\{b : L_b\chi \in \Gamma\}$. This can easily be done with nonstandard states. It will be the case if (1) the support of $P(s_\Gamma, \cdot)$ is included in the set of nonstandard states, (2) $P(s_\Gamma, \cdot)$ is equiprobable and (3) there is a proportion $b^*$ of states that make $\chi$ true. Suppose that $I(\Gamma)$ is the sequence of formulas in $\Gamma$ that are prefixed by a doxastic operator $L_a$; for the i-th such formula $\chi_i$, one can rewrite $b^*(\chi_i)$ as $p_i/q_i$. Define $q(\Gamma) = \prod_{i \ge 1} q_i$; $q(\Gamma)$ will also denote the set of $q(\Gamma)$ nonstandard states on which $P(s_\Gamma, \cdot)$ will be supported. If the i-th formula is $\chi$, it suffices to stipulate that $\chi$ is true in the first $p_i \prod_{j \ne i} q_j$ states. One may check that the proportion of states in which $\chi$ is true is $p_i/q_i$.
16 See Blackburn et al. (2001, p. 242).
Definition 19. The $\varphi$-canonical structure is the structure $M_\varphi = (S_\varphi, S'_\varphi, \pi_\varphi, \vDash_\varphi, P_\varphi)$ where
(i) $S_\varphi = \{s_\Gamma : \Gamma \in At(\varphi)\}$,
(ii) $S'_\varphi = \bigcup_{\Gamma \in At(\varphi)} q(\Gamma)$,
(iii) for every standard state $s_\Gamma$, $\pi_\varphi(p, s_\Gamma) = 1$ iff $p \in \Gamma$,
(iv) for every nonstandard state s, $M_\varphi, s \vDash_\varphi \psi$ iff $s \in q(\Gamma)$, $\psi$ is the i-th formula prefixed by a doxastic operator in $\Gamma$, and s is among the first $p_i \prod_{j \ne i} q_j$ states of $q(\Gamma)$, and
(v) $P_\varphi(s_\Gamma, \cdot)$ is the equiprobable distribution on $q(\Gamma)$.

As expected, the $\varphi$-canonical structure satisfies the Truth Lemma.

Lemma 3. (Truth Lemma) For every atom $\Gamma$, $M_\varphi, s_\Gamma \vDash \psi$ iff $\psi \in \Gamma$.

Proof. The proof proceeds by induction on the length of the formula.
(a) $\psi = p$: follows directly from the definition of $\pi_\varphi$.
(b) $\psi = \psi_1 \lor \psi_2$: by definition, $M_\varphi, s_\Gamma \vDash \psi$ iff $M_\varphi, s_\Gamma \vDash \psi_1$ or $M_\varphi, s_\Gamma \vDash \psi_2$. Case (b) will be checked if one shows that $\psi_1 \lor \psi_2 \in \Gamma$ iff $\psi_1 \in \Gamma$ or $\psi_2 \in \Gamma$. Let us consider the extension $\Gamma^+$ of $\Gamma$; one knows that $\psi_1 \lor \psi_2 \in \Gamma^+$ iff $\psi_1 \in \Gamma^+$ or $\psi_2 \in \Gamma^+$. But $\Gamma = \Gamma^+ \cap \mathcal{L}_\varphi$ and $\psi_1, \psi_2 \in \mathcal{L}_\varphi$. It follows that $s_\Gamma \vDash \psi_1 \lor \psi_2$ iff $\psi_1 \lor \psi_2 \in \Gamma$.
(c) $\psi = \lnot\chi$: $M_\varphi, s_\Gamma \vDash \lnot\chi$ iff $M_\varphi, s_\Gamma \nvDash \chi$ iff (by induction hypothesis) $\chi \notin \Gamma$. It suffices to show that $\chi \notin \Gamma$ iff $\lnot\chi \in \Gamma$. (i) Let us suppose that $\chi \notin \Gamma$; $\chi$ is in $\mathcal{L}_\varphi$, hence, given the properties of maximally consistent sets, $\lnot\chi \in \Gamma^+$. And since $\Gamma = \Gamma^+ \cap \mathcal{L}_\varphi$, $\lnot\chi \in \Gamma$. (ii) Let us suppose that $\lnot\chi \in \Gamma$; $\Gamma$ is consistent, therefore $\chi \notin \Gamma$.
(d) $\psi = L_a\chi$: by definition, $s_\Gamma \vDash L_a\chi$ iff $P(s_\Gamma, \chi) \ge a$. (i) Let us suppose that $P(s_\Gamma, \chi) \ge a$; then $a \le b^*$, where $b^* = \max\{b : L_b\chi \in \Gamma\}$, since by definition of the canonical distribution $P(s_\Gamma, \chi) = b^*$. Now, let us consider the extension $\Gamma^+$: clearly, $L_{b^*}\chi \in \Gamma^+$. In virtue of axiom (A7) and of the closure under modus ponens of maximally consistent sets, $L_a\chi \in \Gamma^+$. Given that by hypothesis $L_a\chi \in \mathcal{L}_\varphi$, this implies that $L_a\chi \in \Gamma$. (ii) Let us suppose that $L_a\chi \in \Gamma$; then $a \le b^*$, hence $P(s_\Gamma, \chi) \ge a$. $\square$

To prove completeness, we need a last lemma.

Lemma 4. Let $At(\varphi)$ be the set of atoms of $\mathcal{L}_\varphi$; then $At(\varphi) = \{\Delta \cap \mathcal{L}_\varphi : \Delta \text{ is maximally consistent}\}$.

Proof. $At(\varphi) \subseteq \{\Delta \cap \mathcal{L}_\varphi : \Delta \text{ is maximally consistent}\}$ follows from a preceding lemma. Let $\Gamma^+$ be a maximally consistent set and $\Gamma = \Gamma^+ \cap \mathcal{L}_\varphi$. We need to show that $\Gamma$ is maximally consistent in $\mathcal{L}_\varphi$. First, $\Gamma$ is consistent; otherwise, $\Gamma^+$ would not be. Then we need to show that $\Gamma$ is maximal, i.e. that for every formula $\psi \in \mathcal{L}_\varphi$, if $\Gamma \cup \{\psi\}$ is consistent, then $\psi \in \Gamma$. Let $\psi$ be such a formula. Let us recall that $\Gamma^+$ is maximally consistent. Either $\psi \in \Gamma^+$, and then $\psi \in \Gamma$; or $\lnot\psi \in \Gamma^+$ (elementary property of maximally consistent sets) and, if $\psi = \lnot\chi$, $\chi \in \Gamma^+$ as well. Hence, by
definition of $\mathcal{L}_\varphi$, $\chi$ or $\lnot\psi$ belongs to $\Gamma$. But this is not compatible with the initial hypothesis according to which $\Gamma \cup \{\psi\}$ is consistent. $\square$

We can now finish the proof. Let $\varphi$ be an MPL-consistent formula. Then there exists a maximally MPL-consistent set $\Gamma^+$ which contains $\varphi$. Let $\Gamma = \Gamma^+ \cap \mathcal{L}_\varphi$. $\varphi$ is in $\Gamma$; therefore, by the Truth Lemma, $\varphi$ is true in the state $s_\Gamma$ of the $\varphi$-canonical structure. Hence $\varphi$ is satisfiable. $\square$
CHAPTER 3
Reference-Dependent Preferences: An Axiomatic Approach to the Underlying Cognitive Process

Raphaël Giraud

Abstract

This paper studies axiomatically the cognitive process underlying reference-dependent preferences (RDP). We show that the effect of a shifting reference point is two-sided: first, it modifies the criteria deemed relevant for a given choice, and second, it modifies the desirability of an option, i.e. how these criteria combine in order to yield a decision. We also relate this notion of desirability to the strength of the status quo bias, which allows for a reference-independent definition of the desirability of an option.
Keywords: Partial preference orders, rationality, reference-dependent preferences, reference point shifting, status quo, implicit criteria

JEL Classifications: A12, D00, D81

1. Introduction and overview of the results

It is now a well-documented fact that people's preferences depend on their reference point. This has been repeatedly shown in experiments (Samuelson and Zeckhauser, 1988; Kahneman et al., 1990; Camerer, 1995). Samuelson and Zeckhauser (1988) have specifically identified what they have called the status
quo bias (SQB), i.e. the tendency to prefer sticking to some given situation only because it is the current situation. Compared to the considerable empirical literature on the phenomenon of reference-dependence, there are relatively few theoretical works that study it axiomatically.1 This may be due to the fact that reference-dependence has a flavor of irrationality that is at odds with the normative nature of most axiomatic approaches. Besides the seminal paper by Tversky and Kahneman (1991), the recent work on the subject includes Sugden (2003), Munro and Sugden (2002), Masatlioglu and Ok (2003), Sagi (2003), and Schmidt (2003). All these papers, except Masatlioglu and Ok (2005), start, like Tversky and Kahneman (1991), with a family of preference relations $\{\succsim_r\}_{r \in X}$ defined over a set X (consisting of commodity bundles for Tversky and Kahneman, 1991 and Munro and Sugden, 2002; Savage acts for Sugden, 2003; and lotteries for Sagi, 2003 and Schmidt, 2003), and derive axiomatically a representation of these preference relations. Prospect theory (Kahneman and Tversky, 1979) is often considered to be a reference-dependent preference functional, but dependence on the endowment is usually not modeled directly, remaining at the level of interpretation. Schmidt (2003) has addressed this problem and has studied axiomatically reference-dependence in the context of cumulative prospect theory (CPT), explicitly starting from a family of CPT preferences depending on the endowment and providing axiomatic characterizations (in terms of tradeoff consistency) of properties of CPT like loss aversion. Masatlioglu and Ok (2005) take a more foundational approach by considering choice functions, and axiomatically characterizing the existence of an SQB in this context. Finally, Kőszegi and Rabin (2006) start, in the spirit of behavioral economics, with a "modified" utility function that incorporates reference-dependence and loss aversion by way of assumptions on utility functions, but without explicit axiomatic foundations. In light of these contributions, it seems that the most natural way of representing the fact that an agent's observable preferences are reference-dependent is to assume that his or her behavior can be represented by some family of preference relations $\{\succsim_r\}_{r \in R}$ defined over a subset $X_r$ of some set X. This shall be called a family of reference-dependent preferences (RDP) on X. An element r of the index set R will be said to be a reference point. The general interpretation of such a reference point is that it indexes the context, in the broad sense, in which the decision is made. However, for reasons that will be explained later on, this family of RDP will not be the primitive of our model, but will be a derived concept. We will start instead from a preference relation $\succsim$ defined on some subset Y of $X \times R$.
1 We do not consider here the literature on sign dependence (the fact that behavior is different in the domain of gains and in the domain of losses), which does not concern reference-dependence per se, but in a sense uses it to explain other phenomena: in this literature, reference-dependence is assumed and a reference point is chosen once and for all.
We propose in this chapter to study from an axiomatic point of view the underlying cognitive process that generates reference-dependence in an abstract setting, i.e. we aim at understanding the effect of a change in the reference point. By a cognitive process, we mean the way the external world is analyzed by the decision maker and how he or she processes the information describing the world. We will see that this plays an important role in the characterization that we propose. Our answer is contained in the preference functional that we axiomatize. We shall not detail it in the introduction, but only discuss the insights it provides on the cognitive process underlying RDP. It shows that a given reference point r determines both a list of relevant criteria, which can be further interpreted as the set of states of the world that may matter for the decision to be made given that circumstances r obtain, and, for each pair of objects x and y, the set of decisive criteria, i.e. the criteria such that if x is better than y for all of them, then $x \succsim_r y$. An intuition of the mechanism that underlies this functional can be grasped by examining the case where $R \subseteq X$ and $y = r$, i.e. the initial endowment. In this case, the reference point determines both the set of relevant criteria and the number of criteria among these for which a given alternative x must beat the endowment for the decision-maker to forego it. The SQB associated to a reference point r will be stronger the larger the number of decisive criteria necessary to forego it. Consider the following example: you wish to move from your accommodation A. Suppose that, in general, the relevant criteria for the choice of an accommodation are: location in town, proximity of stores, elevator, ancient or modern style, view, number of rooms and price. Suppose accommodation A does not have an elevator, and that this has got you used to climbing the stairs. We can thus assume that this criterion is of no importance to you, so that it can be deleted from the set of criteria. Notice that this is possible only because you are endowed with an accommodation without an elevator. This might no longer be true if the initial endowment were different. Suppose now that on the whole you feel comfortable in your present accommodation, and that you wish to move only for some change in your life. In this case, you will probably be more demanding on the conditions that must be met by the potential alternatives. If, on the contrary, you are poorly satisfied with your accommodation, for reasons other than the absence of an elevator, for instance noise on the street you live on, you will probably be far less demanding with respect to the features of the possible alternatives. Notice that in the previous example, the way the decision maker feels about his or her endowment matters for the way he or she chooses in a reference-dependent way. This aspect will be further studied in this chapter, where we will show how it can be linked to some of the mathematical objects that appear in the representation theorem. This will be the second main result of the chapter: if the decision maker is more satisfied with one endowment r than with another
endowment $r'$, he or she will exhibit a stronger SQB in favor of r than in favor of $r'$, i.e. he or she will in general be more reluctant to forego r than to forego $r'$. The model that we propose has the advantage of allowing for a more flexible analysis of reference-dependent decision-making than Sagi's (2003) model, which is formally the model closest to ours. Indeed, this model corresponds, in Sagi's specification, roughly speaking, to a case of uniformly strong status quo bias.2 In our model, on the other hand, the SQB may vary with the status quo considered. We provide in the chapter a numerical index of the strength of the SQB in this model. The main axiom that will be used to derive the representation leading to these insights will be that whenever an object x is preferred to an object y in all possible circumstances, then this absolute, reference-independent preference will not be affected by a common modification of these objects (in a way to be made precise in the main text). We therefore base our theory entirely on what happens for preferences that are not reference-dependent. The reason we do this is that the reference-independent part of the preference relation makes it possible to identify the aspects of the reference point that may possibly matter in general. Indeed, if the modeler knows some of the aspects by which a reference situation can be described, and if one knows that the preference of x over y is reference-independent if and only if x is better than y for each of these aspects, then the modeler knows that a conflict on one of them is necessary to explain reference-dependence: if there is reference-dependence, then it must have been introduced by one of these aspects. Therefore, the study of the reference-independent part of preferences and its characterization provides all the information needed to study their reference-dependent part. Our axiomatization shows how to build the reference-dependent part starting from the reference-independent one. The structure of the paper is as follows. Section 2 introduces the basic primitives and assumptions of the model. Section 3 introduces the main axiom and the representation theorem, and discusses its implications. Section 4 concludes. The proofs appear in Section 5.

2. The model

2.1. Primitives of the model

Let R and X be two sets. The set R is the set of reference points; a reference point in our theory indexes a general notion of context or circumstances under which the decision is to be made. Now this context can be described in different ways. The simplest case is when $R \subseteq X$ and each r stands for the reference consumption, be it the initial endowment or the consumption to which the
2 However, this implication does not seem to be formally provable in our specification, as there seems to exist a discontinuity in behavior as our model "converges" to Sagi's.
individual is used or the one to which he or she aspires. Other cases include all types of circumstances that may define the context of decision: time, state of nature, state of mind or a subset of the set of alternatives. As can be seen from the latter case, it is quite natural to consider that the reference point determines the set of alternatives. For instance, in an exchange economy, the initial endowment determines the set of feasible commodity bundles; similarly, the set of feasible commodity bundles does not necessarily remain the same throughout time. The decision-maker is assumed to have RDP over the set X, i.e. his or her preferences depend on the circumstances under which the decision is made. A natural way to model this fact, which is used in the literature, is to introduce a family $\mathcal{P} = \{\succsim_r\}_{r \in R}$ of binary relations defined over (some subset $X_r$ of) X. This family defines, for each reference point r, the associated preference relation (and possibly the subset of X on which it is defined) and will be called a preference profile. The preference profile $\mathcal{P}$ may be seen as modeling the decision maker's observable behavior, in the sense that it specifies what he or she actually does given a reference point. Here we adopt a different approach, motivated by concerns about the rationality of agents exhibiting reference-dependence that we shall make clear in a moment. Our approach will be to consider as a primitive a binary relation $\succsim$ defined over some subset Y of $X \times R$. Now, why do we depart from the usual practice? As we just said, this is related to concerns about the rationality of RDP. Indeed, if one sticks to a strictly behaviorist position, one is compelled to judge the rationality of a behavior only on the basis of what is observable of this behavior. One is therefore tempted to argue that RDP are irrational, based on a kind of money pump argument. This money pump argument assumes that $R \subseteq X$ and that $r \in R$ is the endowment of the decision maker. Then, if preferences exhibit reference-dependence, it is possible (though not necessarily always the case) that there exists a sequence $(x_1, x_2, \ldots, x_n)$ such that $x_1 \succ_{x_2} x_2$, $x_2 \succ_{x_3} x_3$, ..., $x_n \succ_{x_1} x_1$, so that, starting from $x_1$ and offering him $x_n$, then $x_{n-1}$, etc., a clever crook could induce the decision maker to give him a positive amount of money while ending up holding $x_1$ again (see the sketch below). Of course, this argument requires introducing a dynamical aspect to the model, which is not explicitly modeled. However, it arguably raises some concern about the ability of agents exhibiting reference-dependence to survive in a world with non-RDP agents. Sagi (2003) and Munro and Sugden (2002) have based their axiomatic approach to RDP on axioms ruling out this kind of behavior. But can this weakness be attributed to irrationality per se? This depends on the definition of rationality one has in mind. If rationality is defined by the ability to survive in the market, then of course RDP correspond to irrational behavior. However, this is not the only way to define rationality, and certainly not the classical one, although it is favored by many economists. A more classical definition of rationality is the one embodied by the so-called rationality principle, which states that an agent is rational if he or she behaves so as to implement in an adequate way some preexistent ends or preferences. In this line of thought, an agent exhibiting reference-dependence could not be said to be irrational if he or she could be shown to act in accordance with some predefined preference relation that he or she applies under different circumstances.
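To make the money pump argument above concrete, here is a small numeric sketch (a hypothetical three-good cycle with an invented fee; the set below encodes the cyclic reference-dependent preferences $x_3 \succ_{x_1} x_1$, $x_2 \succ_{x_3} x_3$, $x_1 \succ_{x_2} x_2$):

```python
# Trade (offer, reference) is accepted iff the offer is strictly preferred to
# the current holding when that holding is the reference point.
accepts = {("x3", "x1"), ("x2", "x3"), ("x1", "x2")}

holding, paid, fee = "x1", 0.0, 1.0
for offer in ["x3", "x2", "x1"]:         # the crook's sequence x_n, ..., x_1
    if (offer, holding) in accepts:
        holding, paid = offer, paid + fee
print(holding, paid)                     # 'x1' 3.0: back to x1, money gone
```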
But can this weakness be attributed to irrationality per se? This depends on the definition of rationality one has in mind. If rationality is defined by the ability to survive in the market, then of course RDP correspond to irrational behavior. However, this is not the only way to define rationality, and certainly not the classical one, although it is favored by many economists. A more classical definition of rationality is the one embodied by the so-called rationality principle, which states that an agent is rational if he or she behaves so as to implement in an adequate way some preexistent ends or preferences. In this line of thought, an agent exhibiting reference dependence could not be said to be irrational if he or she could be shown to act in accordance with some predefined preference relation that he or she applies under different circumstances. In other words, given some preference profile $\{\succsim_r\}_{r \in R}$, if there exists a binary relation $\succsim$ defined on some subset Y of $X \times R$ such that, for each r in R and x, x' in X such that (x, r) and (x', r) are in Y,
$$x \succsim_r x' \Leftrightarrow (x, r) \succsim (x', r),$$
then the preference profile is rational, because it can be seen as implementing some predefined preference relation. Of course, this definition leaves in some sort of black box the question of how adequate this implementation is; RDP are an adequate way of implementing predefined preferences because they are completely determined once a reference point is selected, given the preference ordering on Y. More substantive rationality constraints could and should therefore be imposed on this predefined preference relation, not on the preference profile, in order to give more content to the assumption that the decision maker is rational, and presumably these constraints will have observable implications for the preference profile. Inquiring into the relationship between rationality properties of the preference profile and the meta-ordering was one of the lines of research developed in Giraud (2004a), but we shall not develop this problem any further here. Instead, we shall take it for granted that issues concerning rationality should be addressed by imposing restrictions on the meta-relation and not on the preference profile. This view is embodied in the first assumption.
Assumption 1. (rationality) There exists a binary relation $\succsim$ defined over a subset Y of $X \times R$.

Now this relation $\succsim$, bearing on objects of the form (x, r), may seem difficult to interpret, and even to accept, given its partially unobservable character and because it seems to demand a lot from the decision maker's ability to express preferences. A general line of interpretation is to consider this relation as representing ex ante preferences, that is, the plans, contingent on the realization of the state represented by the reference point, formed by the decision-maker under a veil of ignorance as to the circumstances – the reference point – with which he or she will have to cope.3 We may therefore interpret the relation $\succsim$ as representing what the decision-maker intends to do when a reference point is selected, before knowing which reference point has been selected. Therefore, it may be regarded as modeling his or her hedonic or contemplative preferences, which express his or her desires or norms.
3 We thank Jacob Sagi for suggesting this "Rawlsian" interpretation.
We shall hereafter consider that these desires or norms are the true primitives of our theory. We shall thus assume that he or she knows how to rank contingent objects, i.e. objects of the form (x, r), where r indexes a context and x an object of choice. The general interpretation of the statement "the pair (x, r) is preferred to the pair (x', r')" is that it is the answer to the question: "Would you rather receive object x when you are in state r or object x' when you are in state r'?" This rather abstract question can take a very concrete form: "Would you rather receive $100 while being rich or $1000 while being poor?" Specifically, in the case where the reference point is the initial endowment, it would amount to answering, for instance, the following question: "Would you rather play a gamble that yields −100 and 1000 with probability 1/2 when you are endowed with 10, or a gamble that yields −10 and 100 with probability 1/2 while you are endowed with 100?" or "Would you rather eat fish after having eaten eggs or meat after having eaten a soup?" More generally: "Would you rather be in state x when you were in state r or in state x' when you were in state r'?" Other possible translations include the statements "Would you rather receive an apple today or two tomorrow?", "Would you rather go to the movies when you are happy or go to the theater when you are not?" or "Would you rather watch a drama (x) on television (r) or a blockbuster (x') at the movies (r')?"

The relation $\succsim$ defines an associated observable preference profile $\mathcal{P}(\succsim) := \{(\succsim_r, X_r)\}_{r \in R}$ defined by:
$$X_r := \{x \in X \mid (x, r) \in Y\} \quad \text{and} \quad x \succsim_r y \Leftrightarrow x, y \in X_r \text{ and } (x, r) \succsim (y, r).$$
The reason why we choose to define $\succsim$ only on some subset of $X \times R$, so that $X_r$ is itself a subset of X, is that the set of feasible alternatives may depend on the reference point, for instance, when r denotes the initial endowment of the agent and therefore determines his or her budget constraint given a system of prices or, in a more direct fashion, when r is itself some subset of X, as is the case in the modeling of context-dependent preferences, setting $X_r = r$. We need the following not very restrictive non-triviality assumption:

Assumption 2. (non-triviality) For all $x \in X$, the set $R_x := \{r \in R \mid x \in X_r\}$ has at least two elements and, if $R \subseteq X$, then, for all $r \in R$, $r \in X_r$.

Given $y \in Y$, we shall denote by $x_y$ and $r_y$ the projections of y onto X and R, respectively.
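For illustration, here is a small hypothetical Python sketch (the names and the data, borrowed from the menu example above, are ours) that recovers the observable profile $\{(\succsim_r, X_r)\}$ from a finite meta-relation on $Y \subseteq X \times R$, following the two displayed definitions:

```python
# Minimal sketch: recover the observable preference profile from a
# meta-relation on Y ⊆ X × R, following X_r := {x | (x, r) ∈ Y} and
# x ≽_r y  iff  x, y ∈ X_r and (x, r) ≽ (y, r).
Y = {("fish", "eggs"), ("meat", "eggs"), ("fish", "soup")}
meta = {(("fish", "eggs"), ("meat", "eggs"))}  # pairs (x, r) ≽ (x', r')

def feasible(r):
    return {x for (x, s) in Y if s == r}

def weakly_prefers(x, y, r):
    return (x in feasible(r) and y in feasible(r)
            and ((x, r), (y, r)) in meta)

print(feasible("eggs"))                        # {'fish', 'meat'} (set order may vary)
print(weakly_prefers("fish", "meat", "eggs"))  # True
```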
2.2. Axioms

2.2.1. Basic axioms

In line with the approach chosen here, we shall directly state the axioms in terms of the meta-ordering $\succsim$. The first axiom is a classical ordering axiom.

Axiom 1. (weak order) (i) The relation $\succsim$ is a weak order and (ii) the relation $\succsim$ is non-trivial: there exist $y_1, y_2 \in Y$, $y_1 \succ y_2$.

It is shown in Giraud (2004a) that part (i) of this axiom is equivalent to the statement that all observable preferences of the decision maker, elements of his or her preference profile $\{\succsim_r\}_{r \in R}$, are weak orders. It is thus not very restrictive with respect to usual requirements in decision theory. However, here the completeness assumption may seem stronger than usual because $X \times R$ may seem a conceptually larger set than X. But, on the one hand, the relation $\succsim$ is defined over a subset of $X \times R$, weakening the cognitive demands of this axiom, and, on the other hand, requiring completeness of $\succsim$ is tantamount to requiring it of each relation $\succsim_r$. Thus there is no extra complexity cost involved in this assumption. The next axiom we shall introduce will be a continuity requirement, so that we need some topological structure, gathered in the next assumption:

Assumption 3. (topological assumption) (i) X and R are convex subsets of the normed vector spaces $E_X$ and $E_R$ and (ii) Y is a convex compact subset of $E := E_X \times E_R$ endowed with the product topology.

One may think of X as a set of lotteries over a prize space Z, or of Savage acts with values in $\mathbb{R}^n$, for instance a set of commodity bundles. As to R, it may be a convex subset of X, or a real interval denoting continuous time or levels of wealth. This assumption, which will always be implicitly made from now on, allows us to posit the following axiom:

Axiom 2. (continuity) For all $y \in Y$, the sets $\{y' \in Y \mid y' \succsim y\}$ and $\{y' \in Y \mid y \succsim y'\}$ are closed in Y.

This axiom is classical. It implies in particular the continuity of each weak order $\succsim_r$. Because Y is a compact, hence separable, metric space, it is well known that the previous axioms imply that there exists a continuous (hence bounded, as Y is assumed to be compact) function representing the relation $\succsim$ over Y:
Lemma 1. (utility function; Debreu, 1964) The relation $\succsim$ satisfies weak order and continuity iff there exists a continuous function $u: Y \to \mathbb{R}$ such that:
$$\forall y, y' \in Y,\quad y \succsim y' \Leftrightarrow u(y) \geq u(y'). \qquad (1)$$
A similar result holds for each relation $\succsim_r \in \mathcal{P}(\succsim)$ (the proof is omitted):

Lemma 2. (utility functions for the preference profile) If $\succsim$ satisfies weak order and continuity, then for each $r \in R$, there exists a continuous function $u_r: X_r \to \mathbb{R}$ such that:
$$\forall x, y \in X_r,\quad x \succsim_r y \Leftrightarrow u_r(x) \geq u_r(y). \qquad (2)$$
2.2.2. The absolute preference relation

Generally speaking, the notion of RDP implies that observable preferences depend on the reference point, in the sense that there exist $r, r' \in R$ and $x, y \in X_r \cap X_{r'}$ such that $x \succ_r y$ and $y \succ_{r'} x$. Now, if one considers that, from a normative point of view, preferences should not be reference-dependent (based on the money pump argument sketched in the previous section, for instance), it is interesting to wonder to what extent preferences diverge from this ideal situation. This leads us to consider the part of preferences that is reference-independent, i.e. the part that describes the ex ante plans that are not affected by the selection of the reference point. In order to introduce the formal definition of this object, some piece of notation is needed. For all $y \in Y$, let $[y] := \{y' \in Y \mid x_{y'} = x_y\}$. We then give the following definition:

Definition 1. (absolute preference) Let $y, y' \in Y$. Say that y is absolutely preferred to y' – denoted $y \succsim^a y'$ – if:
$$\begin{cases} \tilde{y} \succsim \tilde{y}',\ \forall \tilde{y} \in [y],\ \tilde{y}' \in [y'] & \text{if } y \neq y', \\ y = y' & \text{otherwise.} \end{cases} \qquad (3)$$

Denote by $\succ^a$ and $\sim^a$ the asymmetric and symmetric parts of this relation and by $\|^a$ its non-comparability relation, which is partial when preferences are truly reference-dependent.

As explained in Giraud (2003) and Giraud (2004b, Chapter 1), whenever $\succsim$ is a preorder, the relation $\succsim^a$ is also a preorder and it is the maximal restriction of
$\succsim$ satisfying the reference-independence property defined as follows:

Reference independence. $\forall (y, y') \in Y^2$, $y \neq y'$, $\forall (r, r') \in R_{x_y} \times R_{x_{y'}}$,
$$y \succ^a y' \Leftrightarrow (x_y, r) \succ^a (x_{y'}, r') \quad \text{and} \quad y \sim^a y' \Leftrightarrow (x_y, r) \sim^a (x_{y'}, r'). \qquad (4)$$
Any relation satisfying this property will be said to be R-independent.

Remark 1. The definition of $\succsim^a$ implies that, for all $x, y \in X$ such that $x \neq y$, if there exist $r_1, r_2 \in R_x \cap R_y$ such that $(x, r_1) \succ^a (y, r_2)$, then for all $r \in R_x \cap R_y$, $x \succsim_r y$ (but not necessarily that, for at least one r, $x \succ_r y$). It is therefore a stronger ordering than the unanimity ordering.
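A hypothetical sketch of Definition 1 on finite data (the function name and the encoding are ours) makes the unanimity reading explicit:

```python
# Sketch of Definition 1 on finite data: y is absolutely preferred to y'
# iff y == y' or every copy (x_y, r) of y is weakly preferred, under the
# meta-relation, to every copy (x_{y'}, r') of y'.
def absolutely_preferred(y, y_prime, Y, meta):
    if y == y_prime:
        return True
    copies_y = [(x, r) for (x, r) in Y if x == y[0]]
    copies_yp = [(x, r) for (x, r) in Y if x == y_prime[0]]
    return all((a, b) in meta for a in copies_y for b in copies_yp)
```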
Remark 2. In the definition of absolute preference, we imposed the condition $y \neq y'$. The reason for this restriction is the following. Suppose we do not impose it and that the rest of the definition of the absolute preference relation remains unchanged. Call $R^a$ this new definition of the absolute preference relation. Then $R^a$ is not necessarily reflexive. Indeed, if $(x, r) R^a (x, r)$ for some $(x, r) \in Y$, then $(x, r') \sim (x, r'')$ for all $r', r'' \in R_x$. However, in principle nothing prevents the existence of some x, r, r' such that $(x, r) \succ (x, r')$, except when $R^a$ is reflexive. In the latter case, $(x, r') \sim (x, r'')$ for all $x \in X$ and $r', r'' \in R_x$, and this implies that there is no reference-dependence! The condition $y \neq y'$ is therefore needed for the relation $\succsim^a$ to be a preorder (hence a reflexive relation) without contradicting the fact that there is reference-dependence.

As the behavior modeled by the absolute preference relation enjoys a strong rationality property, since it represents the part of preferences that complies with the normative ideal of reference-independence, it seems legitimate to consider it a strong rationality kernel of preferences. This allows us to require that this relation satisfy other normative properties, in particular the von Neumann–Morgenstern independence axiom. This axiom is well known to be descriptively inaccurate. However, as noted by Tversky and Kahneman (2000), it seems to be considered a reasonable rule of thumb by decision makers when it is clear that it is applicable, i.e. when people see clearly the common part of two lotteries, since the axiom calls for the cancellation of common features of objects. Evidence for this can be gathered from the editing process prior to decision making (Kahneman and Tversky, 1979), where people try to transform the decision problem they face into a simpler one, for instance by eliminating common parts of different objects. Therefore, the independence axiom seems to be viewed as normatively compelling by agents, and they use it as long as the problem they face seems to transparently call for its use, but do not use it when this is not the case, for instance with reduced lotteries.
In the context of RDP, the only transformation that may alter preferences is the context in which decisions are made. We assume here that context affects the apparent applicability of the independence axiom, so that, for decisions that are not reference-dependent, the independence axiom is always transparently seen to be applicable. This is the motivation for our next axiom, which imposes independence on the absolute preference relation. The normative appeal of this axiom may, however, depend on the domain of objects on which preferences are defined. It may hold for a convex set of lotteries but be extremely restrictive for a convex set of commodity bundles. Here we will apply it to a general convex set, but it must be borne in mind that its validity has to be checked for a given specification of this convex set.

Axiom 3. (absolute preference independence, API) $\forall x, y, z \in X$, $\forall \alpha \in [0,1]$: $\forall (r, r') \in R_x \times R_y$, $(x, r) \succsim (y, r')$, if and only if $\forall (r, r') \in R_{\alpha x + (1-\alpha)z} \times R_{\alpha y + (1-\alpha)z}$, $(\alpha x + (1-\alpha)z, r) \succsim (\alpha y + (1-\alpha)z, r')$.

Notice that axiom (API) bears on the options to evaluate, i.e. elements of X, and not on the reference point, which is immaterial when we consider the absolute preference. The immediate effect of this axiom is shown in the next lemma.

Lemma 3. The relation $\succsim$ satisfies (API) iff the relation $\succsim^a$ satisfies the independence axiom: for all $y, y', y'' \in Y$, for all $\lambda \in [0,1]$,
$$y \succsim^a y' \Leftrightarrow \lambda y + (1-\lambda)y'' \succsim^a \lambda y' + (1-\lambda)y''. \qquad (5)$$

When the preference relation $\succsim$ is reference-independent, $\succsim^a$ and $\succsim$ coincide. Therefore, in this case, if the absolute preference relation satisfies Axiom 3, $\succsim$ satisfies independence. This means that in our model, if preferences do not satisfy the independence axiom, it must be because they are not reference-independent. In decision theory, the independence axiom is considered to be a property that preferences ought to satisfy if everything conformed to some ideal of rationality. Its failure is often associated with aspects of decision under risk (when the perception of probabilities by the decision maker leads to a distortion of them) or under uncertainty (when the decision maker perceives ambiguity in the decision situation he or she faces). Our model reinforces the normative status of the independence axiom, by stipulating that it is a necessary condition for other normative properties to be satisfied, namely those implied by reference-independence: context-independence, no endowment effect, patience, etc. The independence axiom becomes the minimal axiom of rationality (beyond those implied by the maximization of utility). However, preferences can satisfy the independence axiom without being reference-independent: in this case $\succsim^a \subsetneq \succsim$ and $\succsim$ satisfies the independence axiom.
The continuity of a preorder is an important condition for obtaining a representation of it. In the case of the (generally incomplete) relation $\succsim^a$, it is necessary, in the abstract setting we are working in, to postulate a strong continuity condition. To state it, define the domination cone of $\succsim^a$, denoted $C^a$, by
$$C^a := \{\lambda(y - y') \mid \lambda \geq 0,\ y \succsim^a y'\}. \qquad (6)$$
The geometry of the domination cone can be used to study the structure of the incompleteness of the preorder. For instance, when it is a half-space, the preorder is complete; when it is a subspace, the preorder is the union of incomparable indifference classes (see Shapley and Baucells, 1998 for details). Let $\overline{C^a}$ denote the closure of $C^a$ in E. The continuity assumption is the following:

Axiom 4. (absolute preference continuity, APC) For all $y, y' \in Y$, $y - y' \in \overline{C^a} \Rightarrow y - y' \in C^a$.

The relationship between this axiom, introduced by Evren (2005) building on the work by Dubra et al. (2004), and continuity of $\succsim^a$ may not be transparent. To illustrate it, consider the following continuity axiom for this relation:

Axiom 5. (strong sequential absolute preference continuity, SSAPC) For all sequences $\{y_n\}_{n \in \mathbb{N}}$, $\{y'_n\}_{n \in \mathbb{N}}$ of elements of Y converging to $y, y' \in Y$, respectively, and such that $y_n \succsim^a y'_n$ for all $n \in \mathbb{N}$, it is the case that $y \succsim^a y'$.

The following result linking both axioms holds:

Proposition 1. If $\succsim$ satisfies (API), then (APC) $\Rightarrow$ (SSAPC).

Dubra et al. (2004) have shown conversely that, when Y is the set of Borel probability measures over a compact metric space Z, then (SSAPC) implies that axiom (APC) is satisfied (in fact, that $C^a$ is closed). Let us now give an example showing that our model can indeed encompass some instances of this particular case.

Example 1. Let Z be a finite set, of cardinality $n \in \mathbb{N}^*$. Assume that preferences depend on the probability that some prize $z_0 \in Z$ obtains (for instance, the probability of having a lethal disease). Denoting $r \in [0,1]$ this probability, the set R can be identified with [0,1]. Given $r \in [0,1]$, define
$$X_r = \left\{x \in [0,1]^{n-1} \,\Big|\, \sum_{i=1}^{n-1} x_i = 1 - r\right\}. \qquad (7)$$
$X_r$ can be seen as the set of lotteries conditional on the fact that $z_0$ does not obtain (divide by $1-r$). Then $Y = \{(x, r) \mid x \in X_r\}$ is isomorphic to $\Delta(Z)$, the set of lotteries on Z. In this case, $C^a$ will always be closed when the axioms of the model are satisfied.
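A minimal numerical sketch of Example 1 (the numbers are ours, with n = 3):

```python
# Example 1 as code (n = 3 prizes besides z0, illustrative numbers):
# X_r = {x in [0,1]^{n-1} : sum(x) = 1 - r}; dividing by (1 - r) gives
# the lottery over the remaining prizes conditional on z0 not obtaining.
r = 0.2                      # probability that the prize z0 obtains
x = (0.5, 0.3)               # a point of X_r: coordinates sum to 1 - r
assert abs(sum(x) - (1 - r)) < 1e-12
conditional = tuple(xi / (1 - r) for xi in x)
print(conditional)           # (0.625, 0.375): a lottery on Z \ {z0}
```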
It may be noticed that allowing each $\succsim_r$ to be defined only on a subset of X is key to making it possible for the model to encompass this example.

3. A representation theorem for RDP

3.1. The theorem

We shall now provide a representation theorem for RDP shedding light on the cognitive process underlying them. Before stating it, let us give some definitions and notations.

Notation 1. Given a convex subset C of a topological vector space, denote by $\mathcal{A}(C)$ the set of continuous affine functionals $f: C \to \mathbb{R}$, i.e. such that for all $c, c' \in C$, $\alpha \in [0,1]$, $f(\alpha c + (1-\alpha)c') = \alpha f(c) + (1-\alpha)f(c')$. We denote by $E'$ the set of continuous linear functionals over E.

Remark 3. To each $f \in \mathcal{A}(Y)$ corresponds, for all r, an element of $\mathcal{A}(X_r)$. Indeed, let $f \in \mathcal{A}(Y)$. Then, for all $r \in R$, $x, y \in X_r$, $\lambda \in [0,1]$,
$$f(\lambda x + (1-\lambda)y, r) = f(\lambda x + (1-\lambda)y, \lambda r + (1-\lambda)r) = f(\lambda(x, r) + (1-\lambda)(y, r)) = \lambda f(x, r) + (1-\lambda)f(y, r).$$
Similarly, for all x, the function $f(x, \cdot)$ belongs to $\mathcal{A}(R_x)$.

We then have the following representation theorem:

Theorem 1. (representation theorem for RDP) The relation $\succsim$ satisfies weak order, continuity, (API) and (APC) iff there exist

- a weak*-closed convex set $V \subseteq \mathcal{A}(Y)$, with at least one non-constant functional and such that for all $y \in Y$,
$$-\infty < \inf_{v \in V} v(y) \leq \sup_{v \in V} v(y) < +\infty, \qquad (8)$$
with strict inequality whenever $\succsim^a$ is different from $\succsim$;
- a function $\alpha: Y \to [0,1]$ satisfying
$$y \sim^a y' \Rightarrow \alpha(y) = \alpha(y'), \qquad (9)$$

such that:
(i) $\forall y, y' \in Y$,
$$y \succsim^a y' \Leftrightarrow \forall v \in V,\ v(y) \geq v(y'); \qquad (10)$$
(ii) $\forall y \in Y$,
$$u(y) = \alpha(y) \inf_{v \in V} v(y) + (1 - \alpha(y)) \sup_{v \in V} v(y); \qquad (11)$$
(iii) $\forall x, y \in X$, $x \neq y \Rightarrow \forall r, r' \in R_x \cap R_y$,
$$(\forall v \in V,\ v(x, r) \geq v(y, r)) \Leftrightarrow (\forall v \in V,\ v(x, r') \geq v(y, r')); \qquad (12)$$
(iv) if $R \subseteq X$, there exist $v \in V$, $x, y \in R$ such that $v(x, x) \neq v(y, y)$.

Moreover, if Y has nonempty interior and is finite-dimensional, then $\alpha$ is continuous on the interior of Y.

This theorem is proved in Section 5. The proof is based on arguments given by Shapley and Baucells (1998), Dubra et al. (2004), and Evren (2005), but we repeat their arguments for the sake of completeness.
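The two-part representation of Equation (11) is easy to compute once V is replaced by a finite stand-in. The following hypothetical Python sketch (the criteria and the salience value are invented for illustration) returns u(y) as the α-weighted mix of the worst and best criterion values:

```python
# Sketch of the representation u(y) = α(y)·inf_V v(y) + (1-α(y))·sup_V v(y)
# with a finite, hypothetical set of criteria standing in for V.
def u(y, V, alpha):
    values = [v(y) for v in V]            # each v ∈ V is one implicit criterion
    return alpha(y) * min(values) + (1 - alpha(y)) * max(values)

V = [lambda y: y[0] + y[1],               # criterion 1
     lambda y: 2 * y[0] - y[1]]           # criterion 2
alpha = lambda y: 0.7                     # high α: severe assessment
print(u((1.0, 2.0), V, alpha))            # 0.7*0.0 + 0.3*3.0 ≈ 0.9
```

A high α(y) drags u(y) toward the worst criterion value, which is the "severity" reading developed in Section 3.2 below.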
Remark 4. Condition (iv) implies in particular that it is not the case that $v(r, r) = 0$ for all $r \in R$, $v \in V$. In particular, it is impossible to have $v(x, y) = c(x) - c(y)$ for some function $c \in \mathcal{A}(X)$. Nevertheless, the form $v(x, y) = \varphi(x) - \psi(y)$, with $\varphi \neq \psi$, is possible.

The following corollary provides a similar representation for the family $\{u_r\}_{r \in R}$ representing $\mathcal{P}(\succsim)$:

Corollary 1. Under the conditions of the representation theorem, for all $r \in R$, there exist:

- a convex set $V_r \subseteq \mathcal{A}(X_r)$ and
- a function $\alpha_r: X_r \to [0,1]$

such that:
(i) $\forall r \in R$, $\forall x, y \in X_r$,
$$(x, r) \succsim^a (y, r) \Leftrightarrow \forall v_r \in V_r,\ v_r(x) \geq v_r(y); \qquad (13)$$
(ii) $\forall (x, r) \in Y$,
$$u_r(x) = \alpha_r(x) \inf_{v_r \in V_r} v_r(x) + (1 - \alpha_r(x)) \sup_{v_r \in V_r} v_r(x); \qquad (14)$$
(iii) $\forall x, y \in X$, $x \neq y \Rightarrow \forall r, r' \in R_x \cap R_y$,
$$(\forall v_r \in V_r,\ v_r(x) \geq v_r(y)) \Leftrightarrow (\forall v_{r'} \in V_{r'},\ v_{r'}(x) \geq v_{r'}(y)). \qquad (15)$$
Proof. It suffices to apply the preceding theorem and to set $\alpha_r(x) = \alpha(x, r)$ and $v_r(x) = v(x, r)$. $\square$

We shall dedicate the following section to the precise interpretation of the representation theorem and its corollary. Before that, let us make two important points.

Remark 5. When Y is finite-dimensional, the theorem holds without assumption (APC). In this case, one can rely on theorems by Shapley and Baucells (1998) and, as in finite-dimensional settings every linear map is continuous, the techniques used to prove the theorem go through.
Remark 6. (reference-dependence under risk) An interesting special case of our setting is the case where X and R are lottery spaces. Let Z be a compact metric space, C(Z) be the set of continuous functions from Z to $\mathbb{R}$, endowed with the sup norm, $\mathcal{P}(Z)$ the set of Borel probability measures over Z and ca(Z) the vector space spanned by $\mathcal{P}(Z)$, endowed with the weak convergence topology.4 There exists a duality between C(Z) and ca(Z) defined by:
$$\langle w, \mu \rangle = \int_Z w \, d\mu =: E_\mu(w).$$
The following corollary of the representation theorem holds:
Corollary 2. When $R = \mathcal{P}(Z_R)$ and $X = \mathcal{P}(Z_X)$, with $Z_X$ or $Z_R$ finite, under the conditions of the theorem, for all $r \in \mathcal{P}(Z_R)$, the preferences are represented by the function defined for all $p \in \mathcal{P}(Z_X)$ by:
$$u_r(p) = \alpha_r(p) \inf_{w \in W} E_{p \otimes r}(w) + (1 - \alpha_r(p)) \sup_{w \in W} E_{p \otimes r}(w), \qquad (16)$$
where $W \subseteq C(Z_X \times Z_R)$ and $p \otimes r$ denotes the product measure of p and r.

See Section 5 for the proof.
4 According to Ouvrard (2000), this topology is normable. Indeed, as Z is compact, C(Z) is separable. Let therefore $\{f_n\}_{n \in \mathbb{N}}$ be a dense sequence in C(Z). We then let, for all $\mu \in ca(Z)$:
$$\|\mu\| := \sum_{n=1}^{+\infty} \frac{1}{2^n \|f_n\|} \left| \int_Z f_n \, d\mu \right|.$$
Sagi (2003) has axiomatized, in the case where $Z_X = Z_R = Z$, a functional that may be rewritten:
$$u_r(p) = \inf_{c \in C} E_{p \otimes r(z, z')}\big(c(z) - c(z')\big),$$
where $C \subseteq C(Z)$. This functional may be considered, in spirit, as a special case of the above functional. Nevertheless, because of Remark 4, the function w may not be written as the difference of two copies of the same function. However, one can consider functions of the following kind: $w(z, z') = f(g(z) - h(z'))$, with $f: \mathbb{R} \to \mathbb{R}$ and $g, h \in C(Z)$, $g \neq h$.
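For finite prize sets, the functional (16) reduces to double sums over the product measure. Here is a hypothetical numerical sketch (sets, lotteries and criteria are ours; note that the first criterion has the admissible form φ(z) − ψ(z′) of Remark 4):

```python
import itertools

# Sketch of Equation (16) with finite prize sets: E_{p⊗r}(w) is a double
# sum over the product measure; W is a small hypothetical set of criteria.
def u_r(p, r, W, alpha):
    exps = [sum(p[z] * r[zp] * w(z, zp)
                for z, zp in itertools.product(p, r)) for w in W]
    return alpha * min(exps) + (1 - alpha) * max(exps)

p = {"low": 0.5, "high": 0.5}              # lottery on Z_X
ref = {"poor": 0.3, "rich": 0.7}           # reference lottery on Z_R
W = [lambda z, zp: (z == "high") - (zp == "rich"),  # φ(z) − ψ(z'), φ ≠ ψ
     lambda z, zp: 2.0 * (z == "high")]
print(u_r(p, ref, W, alpha=0.5))           # 0.5*(-0.2) + 0.5*1.0 ≈ 0.4
```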
3.2. Interpretation of the representation theorem

It is of course well known that it is illicit to interpret a representation theorem as giving direct access to the cognitive process that actually takes place in the mind of the decision maker. However, it may suggest a hypothesis, in the form of an "as if" statement: to the extent that the only thing we know about the decision maker is that he or she behaves in accordance with the axioms, the psychological assumptions about his or her behavior that we can make based on the representation are the ones that best fit the data, since more empirical information would be needed to disprove them. This being said, we shall indulge in some psychological interpretation of the representation theorem and its corollary.

They suggest that, in the decision process leading to RDP, the reference point affects preferences in two ways. First, given $r \in R$, a set $V_r$ is selected, which may be interpreted, similarly to Sagi (2003) or, in a different context, Masatlioglu and Ok (2005), as a set of implicit criteria with respect to which an element x is compared to an element y. For each $v_r \in V_r$, one can indeed define a weak order over X: $x \succsim_{v_r} y \Leftrightarrow v_r(x) \geq v_r(y)$. Hence, to each $r \in R$ corresponds a family of preference relations $\{\succsim_{v_r}\}_{v_r \in V_r}$, which is the family of implicit criteria or attributes. The first effect of the reference point is thus to modify the way the decision maker envisions the world, by determining which aspects of it he or she will consider to be relevant. Notice that, as shown by the theorem and the proof of the corollary, this set of criteria can be seen somehow as the projection onto a given reference point of a more primitive set of criteria, determined by the set V, which itself is determined by the absolute preference relation.

This leads to another way of interpreting the set V. To each $x \in X$, one can associate the function $f_x: V \times R_x \to \mathbb{R}$ defined by $f_x(v, r) = v(x, r)$. This defines a Savagian act associated to action x, where uncertainty is described by states of the world belonging to V and circumstances belonging to R. Now, if $(x, r) \succsim^a (y, r)$, this means that $f_x \geq f_y$: x is better than y in each state of the world and in all circumstances. Uncertainty regarding which element in V or R obtains does not matter. On the contrary, if (x, r) is not absolutely preferred to (y, r), then there exist v, v' such that $f_x(v, r) > f_y(v, r)$ and $f_y(v', r) > f_x(v', r)$. Therefore, once a state in R is selected, uncertainty about V matters. So the set V represents the set of contingencies that may matter for the decision maker even when he or she knows the circumstances described by R. It
is some kind of residual uncertainty. The choice of one state in R rules out some states in V that are identical given this state, so that some contingencies are deemed impossible in these circumstances.

Second, suppose $\alpha_r(x)$ is close to 1 and $\alpha_r(y)$ close to 0. Then, for $x \succ_r y$ to hold, it is necessary that $v_r(x) > v_r(y)$ for almost all $v_r \in V_r$, not only within criteria but also across criteria, whereas for $y \succ_r x$ to hold, it suffices that $v_r(y) > v_r(x)$ for a small number of criteria. Hence, it is more likely that $y \succ_r x$. More generally, if $\alpha_r(x) > \alpha_r(y)$, it is more likely that $y \succ_r x$, as shown by the following proposition:

Proposition 2. For all $r \in R$, for all $x, y \in X_r$ such that $\alpha_r(x) \neq 1$ and $\alpha_r(y) \neq 1$, there exists $m_r(x, y) \in [0,1]$ such that:
$$\alpha_r(x) < m_r(x, y) \Rightarrow x \succ_r y \quad \text{and} \quad \alpha_r(y) < m_r(x, y) \Rightarrow y \succ_r x.$$

This proposition shows that if $\alpha_r(x) > \alpha_r(y)$, then $y \succ_r x$ is more likely than $x \succ_r y$. Therefore, if the reference point belongs to X, $y \succ_r r$ is more likely than $x \succ_r r$. In this case, the coefficient $\alpha_r(x)$ thus measures, roughly speaking, the repulsiveness of x with respect to r: the higher it is, the more reluctant the decision maker is to trade r for x. It can be seen, moreover, that the exact value of $\alpha_r(x)$ is not informative by itself: this parameter is of an ordinal nature, as shown by the proposition. It measures, in the general case, the repulsiveness of x relative to y, from the point of view of r. This can be intuitively understood by noticing that, in fact, through the function $\alpha_r$, the decision maker is going to select one criterion among the relevant ones to assess the value of an option. One may think of $\alpha_r$ as a salience function that puts forward one attribute of the object considered. The higher $\alpha_r(x)$ is, the lower the utility value of x for the selected criterion relative to the other criteria, hence the less desirable x is. As this function depends on the reference point, a different reference point will lead to a different salient attribute. Thus the second effect of the reference point is to modify the desirability of a determinate object.

For concreteness, consider the following story. Professor Cosine is currently tenured at University A. Suppose that in general the quality of a job at a university can be summarized by four attributes: salary, international prestige, quality of the students and location. Suppose Prof. Cosine is forced to abandon his job for some reason and that he has two offers, one at University B, the other at University C. We shall distinguish two situations: A1 and A2. Table 1 describes how each job fares with respect to each criterion. Consider first situation A1: Prof. Cosine's current employment is very prestigious and located in a fairly nice place, but has a very low wage and students of very poor quality. Therefore, on these last two attributes, things can only improve. For this reason, these attributes are irrelevant, so we can just delete the corresponding lines. When we do that, job B appears clearly superior to job C (there is dominance), so we have $B \succ_{A1} C$. By the same reasoning, we have $C \succ_{A2} B$.
Table 1. Prof. Cosine's choices

              A1      A2      B       C
Salary        *       ****    ***     ***
Prestige      ****    *       ***     ***
Location      ***     *       ****    *
Quality       *       ***     *       ****
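The screening mechanism of the story can be sketched in a few lines of hypothetical Python (the scoring encodes Table 1's star counts; the deletion rule — drop the attributes on which the reference job scores at its minimum — is our reading of the text):

```python
# Sketch of Prof. Cosine's screening: attributes on which the reference
# job scores at its minimum "can only improve" and are deleted; the
# remaining attributes are then checked for dominance.
jobs = {"A1": {"salary": 1, "prestige": 4, "location": 3, "quality": 1},
        "A2": {"salary": 4, "prestige": 1, "location": 1, "quality": 3},
        "B":  {"salary": 3, "prestige": 3, "location": 4, "quality": 1},
        "C":  {"salary": 3, "prestige": 3, "location": 1, "quality": 4}}

def preferred(ref, a, b):
    worst = min(jobs[ref].values())
    kept = [k for k, v in jobs[ref].items() if v > worst]  # relevant criteria
    if all(jobs[a][k] >= jobs[b][k] for k in kept):
        return a                     # a dominates b on the kept criteria
    if all(jobs[b][k] >= jobs[a][k] for k in kept):
        return b
    return None                      # no dominance: the mechanism is silent

print(preferred("A1", "B", "C"))  # B  (prestige and location kept)
print(preferred("A2", "B", "C"))  # C  (salary and quality kept)
```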
Thus we see with this simple example how the mechanism described in the representation theorem may lead to preference reversal.

3.3. Strength of the status quo bias and desirability of options

The previous discussion interprets the meaning of $\alpha_r(x)$ from the point of view of x. But, in the case where $R \subseteq X$, it can also be shown to reveal something about the desirability of r, perhaps in a more precise way. Suppose an option x is more desirable than another option y; then one will be more reluctant to forego x when x is the endowment than to forego y when y is the endowment. Thus, the fact that the SQB is stronger in favor of x than in favor of y will be a symptom of the fact that x is more desirable than y. Formally, this corresponds to the following definition:

Definition 2. The status quo bias is greater for r than for r' (denoted $r \succsim^{SQ} r'$) if:
$$\forall x \in X,\quad x \succ_r r \Rightarrow x \succ_{r'} r'.$$

This definition clearly formalizes the intuition we gave: if r is more desirable than r' and if I am willing to forego r for x, then a fortiori I will be willing to forego r' for x and, if the preference of r over r' is strict, there will exist an option x for which I am willing to forego r' but not r. Hence, the desirability of an option is related to the strength of the SQB. We will now show that the function $\alpha_r$ can be used to assess the strength of the SQB.

Theorem 2. Assume that, for all $x, r \in X$, $\alpha_r(x) \neq 1$. Then, there exists a pseudo-distance5 function d on X with values in [0,1] such that:
$$[\forall x \in X,\ \alpha_r(x) - \alpha_{r'}(x) \geq d(r, r')] \Rightarrow r \succsim^{SQ} r'.$$

To understand the meaning of this theorem, suppose to begin with that $d(r, r') = 0$. Fix $x \in X$. Suppose for the moment that $\alpha_r(x)$ is close to 1 and $\alpha_{r'}(x) = 0$. Then $(x, r) \succ (x, r')$ requires $v(x, r) > v(x, r')$ for almost all $v \in V$, whereas $(x, r') \succ (x, r)$ requires only that, for at least one $v \in V$, $v(x, r') > v(x, r)$.
5 A function $d: X \times X \to \mathbb{R}_+$ is a pseudo-distance function if $d(x, x) = 0$, if it is symmetric and if it satisfies the triangle inequality. It thus differs from the usual notion of distance in that $d(x, y) = 0$ does not necessarily imply $x = y$.
Therefore, it is more likely that $(x, r') \succ (x, r)$. Suppose now, more generally, that $\alpha_r(x) > \alpha_{r'}(x)$. Then, by an argument similar to Proposition 2, it is more likely that $(x, r') \succ (x, r)$. Now suppose this is true for all $x \in X$. Then it is more likely that, for a given x, the decision maker thinks moving from r' to x is a greater improvement than moving from r to x, i.e. r is a more satisfactory status quo than r'. Say in this case that r exchange-dominates r'. Now we see that if $d(r, r') = 0$, this implies $r \succsim^{SQ} r'$. Suppose now that $d(r, r') > 0$ and is very close to 1. Then, for $\alpha_r(x)$ to be greater than $\alpha_{r'}(x) + d(r, r')$, $\alpha_r(x)$ must be very close to 1 and $\alpha_{r'}(x)$ very close to 0, which brings us back to the first situation, where the probability that r exchange-dominates r' was great. Therefore, the function d indicates how likely the event "r exchange-dominates r'" must be to yield $r \succsim^{SQ} r'$, i.e. in a sense it indicates how often the decision maker must deem foregoing r' to be preferable to foregoing r for us to be able to conclude that r is indeed more desirable than r'. If d(r, r') is small, this event need not be very likely; if d(r, r') is big, it must be very likely. As d is a pseudo-distance function, one can say that it measures similarity: if r and r' are very close or very similar, then it is possible to conclude with less information about their relationship. If r and r' are very similar, for instance commodity bundles containing the same amount of one prespecified good, one could infer which one is preferable by observing only a few changes. If they are very different, like two arbitrary commodity bundles, one will need more observations to infer which one is actually preferred.

Remark 7. The pseudo-distance function d is a concept derived from the preference relation and depends on the choice of V. It is thus not an intrinsic concept. The notion of similarity is thus a subjective notion and depends on this choice of V. It has the dimension of a utility function. However, the theorem does not depend on the choice of V, as it relies on comparative reasoning. Therefore, the criterion of comparability is always applicable.

Remark 8. Intuitively, in the limit case where $\alpha_r(x) = 1$ for all $x \in X$ and all $r \in X$, i.e., in a case very similar to the functional axiomatized by Sagi (2003), we would have $d(r, r') = 0 = \alpha_r(x) - \alpha_{r'}(x)$. If the theorem were still valid, one would have $r \sim^{SQ} r'$ for all r, r'. However, this would imply, based on Sagi's axioms, that $r \sim_{r''} r'$ for all r, r', r'', which obviously does not hold. Therefore, this intuition cannot be correct and the theorem cannot be extended to cover this case.

4. Conclusion

In this chapter, we have shown that, under some rationality axioms imposed on a preference relation representing the plans of a decision maker prior to the selection, by nature or by history, of some circumstances as the reference point for decision making, it is possible to characterize the effect of this reference point on decisions. Specifically, we have shown that the selection of a
reference point amounts to the determination of the relevant criteria to be considered in the specific choice situation, and of the severity with which alternative options will be assessed with respect to these criteria. We have moreover shown that this severity is linked to the desirability of a given option, in the sense that if, for any option, the decision maker is more severe for one status quo than for another, then the former is revealed to be more desirable than the latter.

5. Proofs

Lemma 3. $\Rightarrow$: Let $(y, y') \in Y^2$ be such that $y \succsim^a y'$ and $y'' \in Y$, $\alpha \in [0,1]$. If $y = y'$, we are done. If this is not the case, then this is true iff for all $r \in R_{x_y}$, for all $r' \in R_{x_{y'}}$, $(x_y, r) \succsim (x_{y'}, r')$. By (API), this is equivalent to $(\alpha x_y + (1-\alpha)x_{y''}, t) \succsim (\alpha x_{y'} + (1-\alpha)x_{y''}, t')$ for all $(t, t') \in R_{\alpha x_y + (1-\alpha)x_{y''}} \times R_{\alpha x_{y'} + (1-\alpha)x_{y''}}$, which is true iff $(\alpha x_y + (1-\alpha)x_{y''}, t) \succsim^a (\alpha x_{y'} + (1-\alpha)x_{y''}, t')$ for all such (t, t'). This is true in particular for $t = \alpha r_y + (1-\alpha)r_{y''}$ and $t' = \alpha r_{y'} + (1-\alpha)r_{y''}$, which belong to $R_{\alpha x_y + (1-\alpha)x_{y''}}$ and $R_{\alpha x_{y'} + (1-\alpha)x_{y''}}$, as $(\alpha x_y + (1-\alpha)x_{y''}, \alpha r_y + (1-\alpha)r_{y''}) = \alpha y + (1-\alpha)y'' \in Y$ and $(\alpha x_{y'} + (1-\alpha)x_{y''}, \alpha r_{y'} + (1-\alpha)r_{y''}) = \alpha y' + (1-\alpha)y'' \in Y$. Hence this is equivalent to $\alpha y + (1-\alpha)y'' \succsim^a \alpha y' + (1-\alpha)y''$.

$\Leftarrow$: Suppose $\succsim^a$ satisfies the independence axiom. Let $(x, y) \in X^2$ be such that for all $(r, r') \in R_x \times R_y$, $(x, r) \succsim (y, r')$. Then, as $R_x$ and $R_y$ have at least two elements, there exist $r \in R_x$, $r' \in R_y$, $r \neq r'$, such that $(x, r) \succsim^a (y, r')$. By the independence axiom for $\succsim^a$, this is true iff, for all $z \in X$, $s \in R_z$, $\alpha \in [0,1]$, $\alpha(x, r) + (1-\alpha)(z, s) \succsim^a \alpha(y, r') + (1-\alpha)(z, s)$, i.e. $(\alpha x + (1-\alpha)z, \alpha r + (1-\alpha)s) \succsim^a (\alpha y + (1-\alpha)z, \alpha r' + (1-\alpha)s)$, i.e., as $r \neq r'$, $(\alpha x + (1-\alpha)z, t) \succsim (\alpha y + (1-\alpha)z, t')$ for all $(t, t') \in R_{\alpha x + (1-\alpha)z} \times R_{\alpha y + (1-\alpha)z}$. $\square$
Proposition 1. Take two sequences $\{y_n\}_{n \in \mathbb{N}}$, $\{y'_n\}_{n \in \mathbb{N}}$ of elements of Y converging to $y, y' \in Y$, respectively, and such that $y_n \succsim^a y'_n$ for all $n \in \mathbb{N}$. Then $y_n - y'_n \in C^a$ for all n by definition of $C^a$. Therefore, $y - y' \in \overline{C^a}$ by continuity of addition in a topological vector space. But then (APC) implies that $y - y' \in C^a$, hence by (API) (see Lemma 4 below), $y \succsim^a y'$. $\square$

Representation Theorem

Sufficiency. We first prove point (i). In order to do this, we first prove a lemma.

Lemma 4. The relation $\succsim$ satisfies weak order, (API) and (APC) iff $C^a$ is a convex cone containing 0 that represents $\succsim^a$, i.e. for all $y, y' \in Y$,
$$y \succsim^a y' \Leftrightarrow y - y' \in C^a. \qquad (17)$$
Proof. Necessity being easy, we show only sufficiency. $C^a$ contains 0 because $\succsim^a$ is reflexive. Let us show it is convex. Let $\lambda(p - q)$ and $\lambda'(p' - q')$ be in $C^a$, i.e. $p \succsim^a q$ and $p' \succsim^a q'$. Let $\alpha \in [0,1]$. Then
$$\alpha\lambda(p - q) + (1-\alpha)\lambda'(p' - q') = \alpha\lambda p + (1-\alpha)\lambda' p' - \big(\alpha\lambda q + (1-\alpha)\lambda' q'\big).$$
Let $\beta := \frac{\alpha\lambda}{\alpha\lambda + (1-\alpha)\lambda'}$. Then, by independence, $p \succsim^a q$ and $p' \succsim^a q'$ entail that
$$\beta p + (1-\beta)p' \succsim^a \beta q + (1-\beta)p' \succsim^a \beta q + (1-\beta)q'.$$
Therefore, $\beta p + (1-\beta)p' - (\beta q + (1-\beta)q') \in C^a$. But $C^a$ is a cone, therefore
$$\alpha\lambda(p - q) + (1-\alpha)\lambda'(p' - q') = (\alpha\lambda + (1-\alpha)\lambda')\big(\beta p + (1-\beta)p' - (\beta q + (1-\beta)q')\big) \in C^a.$$
We now proceed to show that $y \succsim^a y'$ iff $y - y' \in C^a$. The only thing to prove is necessity. So assume $y - y' \in \overline{C^a}$. Then, by (APC), $y - y' \in C^a$. But this implies that there exist $\lambda \geq 0$ and $p, q \in Y$ such that $p \succsim^a q$ and $y - y' = \lambda(p - q)$, i.e., setting $\alpha = 1/(1+\lambda)$, $\alpha y + (1-\alpha)q = \alpha y' + (1-\alpha)p$. Two applications of (API) conclude the proof. $\square$

Let then W be the polar cone of $C^a$, i.e.:
$$W := \{w \in E' \mid w(z) \geq 0,\ \forall z \in C^a\}.$$
Let $W^*$ be the polar of W, i.e.
$$W^* := \{z \in E \mid w(z) \geq 0,\ \forall w \in W\}.$$
Clearly, $C^a \subseteq W^*$. If we show the converse inclusion, Lemma 4 will allow us to conclude that W represents $\succsim^a$, i.e.:
$$y \succsim^a y' \Leftrightarrow w(y) \geq w(y'),\ \forall w \in W.$$
Suppose there exists $z_0 \in W^*$ such that $z_0 \notin C^a$. $C^a$ being convex by Lemma 4 and closed by (APC), by a classical separation theorem, and because E is locally convex, there exist $T \in E'$ and $a \in \mathbb{R}$ such that
$$\forall z \in C^a,\quad T(z) \geq a > T(z_0).$$
But $C^a$ is a cone, hence, for all $n \in \mathbb{N}^*$, for all $z \in C^a$, $nz \in C^a$, hence $nT(z) = T(nz) \geq a$, which entails $T(z) \geq a/n$ for all $n \in \mathbb{N}^*$, hence $T(z) \geq 0$ for all $z \in C^a$, hence $T \in W$. But, on the other hand, $0 \in C^a$, hence $T(z_0) < a \leq T(0) = 0$, i.e. $z_0 \notin W^*$. There is therefore a contradiction.

Suppose now $\succsim^a \neq \succsim$. Then, there exist $y_1, y_2 \in Y$ such that $y_1 \|^a y_2$. Let
$$W_1 := \{w \in W \mid w(y_1) > w(y_2)\}$$
and
$$W_2 := \{w \in W \mid w(y_1) < w(y_2)\}.$$
As $y_1 \|^a y_2$, $W_1$ and $W_2$ are both nonempty. Normalize now all functions in $W_1$ so that they take their values in [0,1] and all functions in $W_2$ so that they take their values in [−2, −1] (this is possible because these functions are continuous over a compact set). Let now $V := \overline{co}\{W_1 \cup W_2\}$, where the closure is for the topology of pointwise convergence. By construction, $V \subseteq \mathcal{A}(Y)$ is convex, closed,
$$-\infty < \inf_{v \in V} v(y) \leq -1 < 0 \leq \sup_{v \in V} v(y) < +\infty,$$
and represents $\succsim^a$. V contains at least one nonconstant function as $\succsim$ is nontrivial. This proves point (i).

Let now $y \in Y$. Normalize u (continuous over a compact set) so that it takes its values in [−1, 0]. By construction, there exist $v_1 \in W_1 \subseteq V$ and $v_2 \in W_2 \subseteq V$ such that $v_1(y) \geq 0$ and $v_2(y) \leq -1$. As a consequence, $u(y) \in [v_2(y), v_1(y)] \subseteq V(y) := \{v(y) \mid v \in V\}$, which is a nondegenerate interval because V is convex and $\inf_{v \in V} v(y) < \sup_{v \in V} v(y)$. Let
$$\alpha(y) := \frac{\sup_{v \in V} v(y) - u(y)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)}.$$
This proves point (ii). Point (iii) is an immediate consequence of the definition of $\succsim^a$. As for point (iv), suppose $R \subseteq X$. If, for all $v \in V$ and all $x, y \in R$, $v(x, x) = v(y, y)$, then $(x, x) \sim^a (y, y)$, hence, if $x \neq y$, $(x, r) \sim^a (y, r')$ for all $(r, r') \in R_x \times R_y$, hence $y \sim y'$ for all $y, y' \in Y$, contradicting the nontriviality of $\succsim$.

Suppose now that Y has nonempty interior and is finite-dimensional. Let F be the vector subspace generated by Y. Then for each $v \in V$, $v - v(0)$ can be extended to a linear map $\tilde{v}$ over F, which is therefore continuous. Therefore $\hat{v} = \tilde{v} + v(0)$ is also continuous and
extends v. Moreover, the functions $y \mapsto \sup_{v \in V} \hat{v}(y)$ and $y \mapsto \inf_{v \in V} \hat{v}(y)$ are respectively convex and concave, hence continuous on the interior of Y. This entails that $\alpha$ is continuous, as u is.

Necessity. Necessity of weak order, continuity and (API) is easily shown. We only show necessity of (APC). Without loss of generality, we can assume that $0 \in Y$ as, otherwise, we can translate the problem. We can also take all functions in V to be linear (as it suffices to subtract $v(0)$ from each $v \in V$, yielding a new set of functions that also represents $\succsim^a$). Let therefore $y, y' \in Y$ be such that $y - y' \in \overline{C^a}$. Then, there exist sequences $(\lambda_n)$, $(y_n)$, $(y'_n)$ such that $\lambda_n(y_n - y'_n) \to y - y'$, $\lambda_n \geq 0$ and $y_n \succsim^a y'_n$. Therefore, $v(y_n) \geq v(y'_n)$ for all $v \in V$, $n \in \mathbb{N}$. Therefore, $v(\lambda_n(y_n - y'_n)) \geq 0$ for all $n \in \mathbb{N}$ because v is linear. But this entails that $v(y - y') \geq 0$, as v is continuous, so that $y \succsim^a y'$. But this implies that $y - y' \in C^a$. $\square$

Corollary 2. We prove the corollary for the case where $Z_X$ is finite. The case where it is $Z_R$ is similar. Let $f \in \mathcal{A}(\mathcal{P}(Z))$. The Riesz representation theorem stipulates that there exists $w \in C(Z)$ such that for all $p \in \mathcal{P}(Z)$, $f(p) = \langle w, p \rangle = E_p(w)$. Now let $v \in \mathcal{A}(Y)$. Fix $r \in \mathcal{P}(Z_R)$. By Remark 3, the map $p \mapsto v(p, r)$ belongs to $\mathcal{A}(\mathcal{P}(Z_X))$. As a consequence, there exists $W_r \in C(Z_X)$ such that $v(p, r) = E_p(W_r)$. Fix now $z_0 \in Z_X$. Let us show that the map $\varphi_0: r \mapsto W_r(z_0)$ is affine and continuous. First,
$$E_p(W_{\lambda r + (1-\lambda)r'}) = v(p, \lambda r + (1-\lambda)r') = \lambda v(p, r) + (1-\lambda)v(p, r') = \lambda E_p(W_r) + (1-\lambda)E_p(W_{r'}) = E_p(\lambda W_r + (1-\lambda)W_{r'}).$$
This being true for all p,
$$\varphi_0(\lambda r + (1-\lambda)r') = W_{\lambda r + (1-\lambda)r'}(z_0) = E_{\delta_{z_0}}(W_{\lambda r + (1-\lambda)r'}) = E_{\delta_{z_0}}(\lambda W_r + (1-\lambda)W_{r'}) = \lambda\varphi_0(r) + (1-\lambda)\varphi_0(r').$$
Second, if $r_n \to r$, then, as v is continuous, for all $p \in X$, $v(p, r_n) \to v(p, r)$. Hence $E_p(W_{r_n}) \to E_p(W_r)$ for all p, hence in particular for $p = \delta_{z_0}$, which entails $W_{r_n}(z_0) \to W_r(z_0)$. $\varphi_0$ is therefore continuous. Now, again by Riesz' theorem, there exists $w_{z_0} \in C(Z_R)$ such that
$$W_r(z_0) = E_r(w_{z_0}).$$
Define $w \in C(Z_X \times Z_R)$ by $w(z, z') = w_z(z')$. Then $v(p, r) = E_p(E_r(w))$. We now wish to apply Fubini's theorem. For this, we must show that w is measurable with respect to the product $\sigma$-algebra. Let A be an open set of $\mathbb{R}$. Then,
$$w^{-1}(A) = \{(z, z') \in Z_X \times Z_R \mid w(z, z') \in A\} = \bigcup_{z \in Z_X} \{z\} \times \{z' \in Z_R \mid w_z(z') \in A\} = \bigcup_{z \in Z_X} \{z\} \times w_z^{-1}(A).$$
Now, $w_z$ is continuous, hence $w_z^{-1}(A)$ is open, hence measurable. $\{z\}$ is closed, hence measurable. Therefore, $\{z\} \times w_z^{-1}(A)$ is measurable. As, moreover, $Z_X$ is finite, $\bigcup_{z \in Z_X} \{z\} \times w_z^{-1}(A) = w^{-1}(A)$ is measurable: w is measurable. We can now apply Fubini's theorem, which gives $v(p, r) = E_{p \otimes r}(w)$. It suffices now to inject this result into the functional of the representation theorem. $\square$
Proposition 2. We first prove the following lemma, of independent interest:

Lemma 5. There exists a function $b: Y \times Y \to [0,1]$ such that, for all $(y, y') \in Y \times Y$, $b(y, y') \in [\alpha(y), 1]$ and, if $\alpha(y) \neq 1$:
$$y \succ y' \Leftrightarrow b(y, y') > \alpha(y).$$
Proof. The proof consists first, given $(y, y') \in Y \times Y$, in showing the existence of a convex subset $\mathcal{V}_{y,y'}$ of V, and of a set function $\mu_y$ defined over the power set of V, such that $y \succ y' \Leftrightarrow \mu_y(\mathcal{V}_{y,y'}) > 0$. We then show that there exists a number $b(y, y')$ such that $\mu_y(\mathcal{V}_{y,y'}) = b(y, y') - \alpha(y)$. We proceed in several steps, and to help understand the proof we highly recommend that the reader draw diagrams representing the interval $V(y) = \{v(y) \mid v \in V\}$ and the respective positions of $u(y)$ and $u(y')$ in this interval.

STEP 1: Let $\mathcal{V}$ be a convex subset of V. Then, for all $y \in Y$, $\mathcal{V}(y) = \{v(y) \mid v \in \mathcal{V}\}$ is a bounded real interval. Indeed, let $a, b \in \mathcal{V}(y)$. Then, there exist $v, w \in \mathcal{V}$ such that $a = v(y)$ and $b = w(y)$. Let $\gamma \in \,]a, b[$. There exists $\lambda \in \,]0, 1[$ such that $\gamma = \lambda a + (1-\lambda)b = \lambda v(y) + (1-\lambda)w(y) = (\lambda v + (1-\lambda)w)(y)$. Now, $\lambda v + (1-\lambda)w \in \mathcal{V}$, hence $\gamma \in \mathcal{V}(y)$: $\mathcal{V}(y)$ is an interval. This interval is bounded by Theorem 1. This is in particular true of $V(y)$.
STEP 2: Let us define a family of functions over $2^V$ by setting, for all $y \in Y$, for all $\mathcal{V} \subseteq V$ such that $\mathcal{V}(y)$ is Lebesgue-measurable,
$$\mu_y(\mathcal{V}) = \frac{\ell(\mathcal{V}(y))}{\ell(V(y))},$$
where $\ell$ is the Lebesgue measure over $\mathbb{R}$. If $\mathcal{V}$ is convex, $\mathcal{V}(y)$ is an interval, so that:
$$\mu_y(\mathcal{V}) = \frac{\sup_{v \in \mathcal{V}} v(y) - \inf_{v \in \mathcal{V}} v(y)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)}.$$
$\mu_y$ is a way of measuring $\mathcal{V}$, but it is not a measure in the strict sense, nor a capacity, as its domain need not be a $\sigma$-algebra over V.

STEP 3: For all $(y, y') \in Y \times Y$, let
$$\mathcal{V}_y := \{v \in V \mid v(y) < u(y)\} \quad \text{and} \quad \mathcal{W}_{y,y'} := \{v \in V \mid u(y') \leq v(y)\}.$$
Clearly, $\mathcal{V}_y$ and $\mathcal{W}_{y,y'}$ are convex, so that $\mathcal{V}_y(y)$ and $\mathcal{W}_{y,y'}(y)$ are intervals, and so is their intersection. By this fact, $u(y)$ always belongs to the closure of $\mathcal{V}_y(y)$ by construction, and to the closure of $\mathcal{W}_{y,y'}(y)$ whenever $u(y') \leq u(y)$ (as $u(y) \in V(y)$ and, in this case, $\mathcal{W}_{y,y'}(y) = [\max\{u(y'), \inf V(y)\}, \sup V(y)]$). We therefore let $\overline{\mathcal{V}}_y := \mathcal{V}_y \cup \{u\}$ and $\overline{\mathcal{W}}_{y,y'} := \{v \in V \cup \{u\} \mid u(y') \leq v(y)\}$. Let us show the following fact:
$$y \succ y' \Leftrightarrow \mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) > 0. \qquad (*)$$
Suppose, indeed, that $y \succ y'$. Then $u(y) > u(y')$. Hence the interval $]u(y'), u(y)[$ is nonempty. But, as $\alpha(y) \neq 1$ (assumption of the lemma), $u(y) \neq \inf V(y)$. Hence the interval $]\inf V(y), u(y)[$ is nonempty. As a consequence, the intersection of these two nested intervals, $]\max\{u(y'), \inf V(y)\}, u(y)[$, is nonempty. But this set is exactly $\mathcal{W}_{y,y'}(y) \cap \mathcal{V}_y(y)$. Hence, there exists $\bar{v} \in \mathcal{W}_{y,y'} \cap \mathcal{V}_y$ such that $\bar{v}(y) \neq u(y)$, hence $\bar{v} \neq u$. This entails the existence of $u' \in \overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}$ such that $u' \neq u$. But, because $u(y) > u(y')$, $u \in \overline{\mathcal{W}}_{y,y'}$ and, by definition, $u \in \overline{\mathcal{V}}_y$, therefore $u \in \overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}$. Thus, the interval $\overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y)$ contains the open interval $]u'(y), u(y)[$, with $u'(y) \neq u(y)$. But $\mathcal{V}_y(y) \cap \mathcal{W}_{y,y'}(y)$ also contains this open interval, as $u(y)$ is the only point that can belong to $\overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y)$ but not to $\mathcal{V}_y(y) \cap \mathcal{W}_{y,y'}(y)$. Therefore, $\overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y)$ is an interval with nonempty interior, and this entails $\ell(\overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y)) > 0$, so that $\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) > 0$.
Conversely, suppose $\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) > 0$. Then, there exists $v \in \overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}$. By construction, this entails $u(y') \leq v(y) < u(y)$, hence $u(y) > u(y')$, i.e., $y \succ y'$.

STEP 4: Now define, for all $(y, y') \in Y \times Y$,
$$b(y, y') = \begin{cases} 1 & \text{if } u(y') \leq \inf V(y), \\[4pt] \dfrac{u(y') - \sup V(y)}{\inf V(y) - \sup V(y)} & \text{if } \inf V(y) < u(y') < u(y), \\[4pt] \alpha(y) & \text{if } u(y) \leq u(y'). \end{cases}$$
Let us show that $\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) = b(y, y') - \alpha(y)$.

If $u(y') \leq \inf V(y)$, then $\overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y) = \overline{\mathcal{V}}_y(y)$. Hence, because $u(y)$ belongs to the closure of $\mathcal{V}_y(y)$, it is the upper bound of this set:
$$\sup_{v \in \overline{\mathcal{V}}_y} v(y) = u(y) = \alpha(y)\inf_{v \in V} v(y) + (1 - \alpha(y))\sup_{v \in V} v(y).$$
Moreover, by construction,
$$\inf_{v \in \overline{\mathcal{V}}_y} v(y) = \inf_{v \in V} v(y).$$
Therefore,
$$\mu_y(\overline{\mathcal{V}}_y) = \frac{\alpha(y)\inf_{v \in V} v(y) + (1 - \alpha(y))\sup_{v \in V} v(y) - \inf_{v \in V} v(y)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)} = \frac{(1 - \alpha(y))\big(\sup_{v \in V} v(y) - \inf_{v \in V} v(y)\big)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)} = 1 - \alpha(y).$$
But, by definition of $b(y, y')$ in this case, i.e. $b(y, y') = 1$, we have: $\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) = b(y, y') - \alpha(y)$.

Now, if $\inf V(y) < u(y') < u(y)$, then, as $u(y)$ belongs to the closure of $\mathcal{V}_y(y)$, $\sup \overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y) = u(y)$ and, as $u(y')$ belongs to the closure of $\mathcal{W}_{y,y'}(y)$,
$$\inf \overline{\mathcal{V}}_y(y) \cap \overline{\mathcal{W}}_{y,y'}(y) = u(y') = b(y, y')\inf_{v \in V} v(y) + (1 - b(y, y'))\sup_{v \in V} v(y).$$
This entails
$$\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) = \frac{\alpha(y)\inf_{v \in V} v(y) + (1 - \alpha(y))\sup_{v \in V} v(y) - b(y, y')\inf_{v \in V} v(y) - (1 - b(y, y'))\sup_{v \in V} v(y)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)}$$
$$= \frac{(\alpha(y) - b(y, y'))\inf_{v \in V} v(y) + (b(y, y') - \alpha(y))\sup_{v \in V} v(y)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)} = \frac{(b(y, y') - \alpha(y))\big(\sup_{v \in V} v(y) - \inf_{v \in V} v(y)\big)}{\sup_{v \in V} v(y) - \inf_{v \in V} v(y)} = b(y, y') - \alpha(y).$$

Finally, if $u(y) \leq u(y')$, then $\mathcal{V}_y(y) \cap \mathcal{W}_{y,y'}(y) = \emptyset$, hence $\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) = 0$. But, by definition of $b(y, y')$ in this case, i.e. $b(y, y') = \alpha(y)$, this entails $\mu_y(\overline{\mathcal{V}}_y \cap \overline{\mathcal{W}}_{y,y'}) = b(y, y') - \alpha(y)$.

Now, if $y = (x, r)$ and $y' = (y, r)$, it suffices to let $m_r(x, y) = \min\{b(y, y'), b(y', y)\}$ and to apply the lemma. $\square$
Theorem 2. By Lemma 5, setting $b(x, r) := b((x, r), (r, r))$, it is easy to see that:
$$(\forall x \in X,\ b(x, r) - \alpha_r(x) \leq b(x, r') - \alpha_{r'}(x)) \Rightarrow r \succsim^{SQ} r'.$$
Now this is equivalent to:
$$(\forall x \in X,\ \alpha_r(x) - \alpha_{r'}(x) \geq b(x, r) - b(x, r')) \Rightarrow r \succsim^{SQ} r',$$
and entails
$$(\forall x \in X,\ \alpha_r(x) - \alpha_{r'}(x) \geq |b(x, r) - b(x, r')|) \Rightarrow r \succsim^{SQ} r'.$$
Now, by construction, $b(x, r) \in [\alpha_r(x), 1]$. Hence, $b(x, r) - b(x, r') \in [\alpha_r(x) - 1, 1 - \alpha_{r'}(x)] \subseteq [-1, 1]$ and $|b(x, r) - b(x, r')| \in [0, 1]$. Hence,
$$\sup_{x \in X} |b(x, r) - b(x, r')| < +\infty.$$
Set
$$d(r, r') = \sup_{x \in X} |b(x, r) - b(x, r')|.$$
Then, we have:
$$(\forall x \in X,\ \alpha_r(x) - \alpha_{r'}(x) \geq d(r, r')) \Rightarrow r \succsim^{SQ} r'.$$
d is symmetric and it is easy to show, as for the distance of uniform convergence, that d satisfies the triangle inequality. Therefore, it is a pseudo-distance. $\square$
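On finite data, the test provided by Theorem 2 is directly computable. Here is a hypothetical Python sketch (the toy values of α and b are ours):

```python
# Sketch of Theorem 2's test on finite data: d(r, r') is the sup of
# |b(x, r) - b(x, r')| over options x, and the status quo bias is
# certified for r over r' when α_r(x) - α_{r'}(x) ≥ d(r, r') for all x.
def sqb_certified(X, alpha, b, r, rp):
    d = max(abs(b(x, r) - b(x, rp)) for x in X)
    return all(alpha(x, r) - alpha(x, rp) >= d for x in X)

X = ["x1", "x2"]
alpha = lambda x, r: {"r": 0.9, "rp": 0.4}[r]   # toy salience values
b = lambda x, r: {"r": 0.95, "rp": 0.9}[r]      # toy thresholds
print(sqb_certified(X, alpha, b, "r", "rp"))    # True: 0.5 ≥ 0.05
```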
Acknowledgment

I would like to thank Michèle Cohen for her invaluable support during the writing process of this paper and Christophe Bertault for helpful mathematical discussions. I also thank Eric Danan, Lionel Page, Jacob Sagi, Jean-Marc Tallon, Jean-Christophe Vergnaud and audiences at EUREQuA workshops and the RUD 2004 Conference. Funding from the latter and financial support from the French Ministry of Research (Action concertée incitative) are gratefully acknowledged. All errors remain mine.

References

Camerer, C. (1995), "Individual decision making", pp. 587–703 in: J. Kagel and A. Roth, editors, Handbook of Experimental Economics, Princeton: Princeton University Press.

Debreu, G. (1964), "Continuity properties of paretian utility", International Economic Review, Vol. 5, pp. 285–293.

Dubra, J., F. Maccheroni and E. Ok (2004), "Expected utility theory without the completeness axiom", Journal of Economic Theory, Vol. 115(1), pp. 118–133.

Evren, Ö. (2005), "Expected multi-utility theorems with topological continuity axioms", Technical report, Bilkent University.

Giraud, R. (2003), "A representation theorem for procedure-dependent preferences: States of mind and the salience effect", Université Paris I: Cahiers de la MSE Bla03042.

Giraud, R. (2004a), "Reference-dependent preferences: Rationality, mechanism and welfare implications", Université Paris I: Cahiers de la MSE V04087.

Giraud, R. (2004b), Une théorie de la décision pour les préférences imparfaites, Ph.D. thesis, Université Paris I.

Kahneman, D., J.L. Knetsch and R. Thaler (1990), "Experimental tests of the endowment effect and the Coase theorem", Journal of Political Economy, Vol. 98, pp. 1325–1348.

Kahneman, D. and A. Tversky (1979), "Prospect theory: an analysis of decision under risk", Econometrica, Vol. 47, pp. 263–291.

Koszegi, B. and M. Rabin (2006), "A model of reference-dependent preferences", Quarterly Journal of Economics, forthcoming.
Masatlioglu, Y. and E. Ok (2005), "Rational choice with a status quo bias", Journal of Economic Theory, Vol. 121(1), pp. 1–29.

Munro, A. and R. Sugden (2002), "A theory of reference-dependent preferences", Journal of Economic Behavior and Organization, Vol. 50, pp. 407–428.

Ouvrard, J.-Y. (2000), Probabilités, Vol. 2, Paris: Cassini.

Sagi, J.S. (2003), "Anchored preference relations", Mimeo; a shorter version not containing the material relevant to the present paper is forthcoming in Journal of Economic Theory.

Samuelson, W. and R. Zeckhauser (1988), "Status quo bias in decision making", Journal of Risk and Uncertainty, Vol. 1, pp. 7–59.

Schmidt, U. (2003), "Reference dependence in cumulative prospect theory", Journal of Mathematical Psychology, Vol. 47, pp. 122–131.

Shapley, L.S. and M. Baucells (1998), "Multiperson utility", Working Paper 779, UCLA.

Sugden, R. (2003), "Reference-dependent subjective expected utility", Journal of Economic Theory, Vol. 111, pp. 172–191.

Tversky, A. and D. Kahneman (1991), "Loss aversion in riskless choice: a reference-dependent model", Quarterly Journal of Economics, Vol. 106(4), pp. 1039–1061.

Tversky, A. and D. Kahneman (2000), "Rational choice and the framing of decisions", Chapter 12 in: D. Kahneman and A. Tversky, editors, Choices, Values and Frames, Cambridge: Cambridge University Press.
CHAPTER 4
A Cognitive Approach to Context Effects on Individual Decision Making Under Risk

Boicho Kokinov and Daniela Raeva

Abstract

This chapter compares and contrasts various approaches to understanding human decision making under risk, and tries to formulate requirements for a cognitive economics theory of risky decision making. A first attempt is then made to put forward such a theory by proposing a cognitive model, JUDGEMAP, based on the general cognitive architecture DUAL. This allows the model to be integrated with other cognitive processes such as perception, analogical reasoning, spreading-activation memory retrieval, etc. The fact that all processes in DUAL are based on local computations and parallel processing allows for modelling the interplay between various cognitive processes during decision making; in particular, the model predicts that the unconscious and automatic process of spreading activation will influence the conscious process of argument building and comparison. This prediction is tested and confirmed by a psychological experiment demonstrating that seemingly remote and irrelevant aspects of the environment can change the decision we make.
Keywords: choice under risk, computational models, context effects

JEL classifications: C63, C91
1. Introduction

Both individual and societal prosperity are closely related to human willingness to take risks. Risky behaviour is related to better exploration of the environment and its opportunities, and thus to learning and acquiring new knowledge, better skills, diverse practice, and richer experience. Risky behaviour is also related to obtaining higher rewards and profit (simply because there is less competition for these new unexplored resources), and is at the heart of entrepreneurship. Having more individuals in a society who are willing to take risks and who are exploring new spaces, starting new types of business, and trying out new technologies results in faster economic development and growth. However, risk is also associated with failure, with loss, with possible punishment, and that makes many individuals risk averse. At the societal level, unwillingness to take risks is associated with conservatism and with tradition – they ensure that some rules will be kept constant, that the world will be more predictable, and that the coherence of the society will be preserved.

The concept of "risk" is central to many disciplines nowadays: sociology, economics, psychology, and cognitive science. We will briefly review some theoretical approaches towards "risk" in these disciplines and then try to formulate what would be characteristic of a desired cognitive economics approach towards risk. An attempt will be made towards formulating one possible such approach. We will then present the cognitive architecture DUAL, which might be a basis for building a model of individual decision-making under risk, and its predictions, and finally an experimental study that tests these predictions.
2. Approaches to risk understanding

2.1. Sociology

There are various schools of thought in sociology that view risk from different perspectives. One can see risk as something objective, resulting from the use of complex technology and from the potential dangers of the physical and social environment. In this case the emphasis will be on risk analysis. Alternatively, one can see risk as a social construction – people worry about those aspects of the environment that happen to be culturally central, that are focal in the discussions of the society or group. Thus, for example, the objective risk of humans distorting the environment has always been high; however, this has only recently come to be perceived as a risk, after it became the cultural focus of discussions. This second approach concentrates on the study of risk perception and its relation to the socio-cultural level. Another example would be the perceived danger of immigrants, of followers of a different religion, of a different ethnic group, etc. This is the fear of the other as traditionally considered in the socio-cultural tradition.
Another influential line of research stems from the work of Beck (1992, 1999), who argues that we are entering a new era – the era of the risk society. This era can be characterized as a society using complicated technologies that necessarily have both good and bad sides, and as a society where traditionally established social institutions, such as marriage and family, are falling apart. Thus the overall order and predictability are decreasing, uncertainty is increasing, and we have to adapt to this new situation by learning to manage the risks in our own lives without relying on social institutions and social norms, but by finding our own solutions. Thus Beck argues that risk management becomes an individual responsibility, accompanied by less trust in and reliance on experts, governments, and authorities. Furedi (1997, 2003) argues that a new culture of fear, or a therapy culture, is developing, in which individuals are left with more freedom from official intervention. This line of research, with its focus on individual risk perception and individual risk response, is closer to the psychological view of risk.

2.2. Economics

Frank H. Knight (1921) is recognized as the first economist to suggest a clear definition of risk within an economic context. In his eminent book Risk, Uncertainty and Profit, he made an essential distinction between risk and uncertainty: when the decision-maker is unaware of the probability distribution of the possible outcomes, or of the consequences of his or her possible actions, then he or she is faced with uncertainty; when the decision-maker faces a situation where he or she knows in advance the probabilities of all possible outcomes and has to choose among them, then this is called decision-making under risk. In real life the decision-maker much more often has to deal with uncertainty, but there are situations in which the risk is well defined, as in lotteries and gambles, or in frequently repeated situations for which statistics are known to the decision-maker. Even though uncertainty is the ecologically more valid situation, researchers in economics and psychology much more often study risk, since it is a simpler and more controllable situation. This paper is also about risk, although some of the ideas could easily be extended to uncertainty.

Based on the assumption that probabilities are objective and cannot be influenced by the agent, John von Neumann and Oskar Morgenstern (1944) developed expected utility theory, which is the single most influential theory of decision-making under risk. In this theory the decision-maker is supposed to know the probability of occurrence of each state of the world and the related consequences of each possible choice. The expected utility is calculated as a weighted sum of the utility (the monetary benefit) of each possible outcome, multiplied by the probability of that outcome occurring. And the decision-maker is considered a maximizer of this expected utility. For example, if someone has a choice between two lotteries: lottery A ensures a 100% chance to win EURO 100, and lottery B provides a 50% chance to win EURO 300
102
Boicho Kokinov and Daniela Raeva
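The computation in this example can be written out as a minimal sketch in Python, assuming – as the example implicitly does – a linear utility function u(x) = x:

    # Expected utility of a lottery given as (probability, outcome) pairs,
    # assuming linear utility u(x) = x, as in the example above.
    def expected_utility(lottery):
        return sum(p * x for p, x in lottery)

    lottery_A = [(1.0, 100)]            # 100% chance of EURO 100
    lottery_B = [(0.5, 300), (0.5, 0)]  # 50% chance of EURO 300, 50% of nothing

    print(expected_utility(lottery_A))  # 100.0
    print(expected_utility(lottery_B))  # 150.0 -> a maximizer picks lottery B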
In consequence, neoclassical economic theory assumes an ideal economic agent – a firm or a household – that is fully informed, able to compute the outcomes with perfect accuracy, and maximizes utility. Such decisions are called rational, which is why this theory is also referred to as Rational Choice theory.

Everyday situations are usually unique and unrepeatable, and therefore the probabilities cannot be established objectively. That is why Savage (1954) suggested replacing the objective probabilities with subjective estimates of these probabilities. Thus his model favours the alternative with the highest subjective expected utility. The probability judgement is based on the subject's beliefs about frequencies or about the set of logical possibilities, so different individuals could associate different subjective probabilities with each outcome. Once the subjective probabilities are formed, they play the same role as objective ones.

Various economists have pointed out shortcomings and paradoxes of these normative theories (Allais, 1953; Camerer, 1992; Lichtenstein and Slovic, 1971; Rabin and Thaler, 2001; Shafer, 1986); however, no better theory has been suggested in the field of economics. The strength of the theory lies in its simplicity and elegance, which have made it widespread and a common basis for neoclassical economic theories. Alternatives have been proposed within cognitive science, and cognitive economics is trying to rebuild economic theory using some of them.

2.3. Psychology

Classical psychology has also provided various treatments of risk, which have proved not to be very useful for modern economic theories. The psychoanalytic tradition considers risk-taking behaviour a violation of rationality and interprets it as an expression of an unconscious death wish ("Thanatos") or of repressed feelings of masculine inadequacy. It thus treats risk takers (such as alpinists, financial brokers, etc.) as illogical or even pathological. However, there is no empirical support for that claim; on the contrary, it has been demonstrated that ocean sailors and people who take financial risks in their work often have higher self-esteem and tend to be more successful in their jobs, so risky behaviour cannot be considered self-defeating.

Evolutionary psychology has suggested that risk-taking behaviour evolved for its survival value: "early humans" (Australopithecus) lived in a very hostile and hazardous environment, and had they "played safe" and never changed their habitat, they would soon have died of starvation. Humans with risk-seeking behaviour typically dominate in the group and are considered more sexually attractive, which is why "risk-taking behaviour" has been selected and transmitted through the generations.
Finally, personality psychology considers risk seeking and risk aversion as personality traits. This means that they are underlying characteristics of the individual that are relatively stable over time and explain regularities in people's behaviour. Psychologists have thus developed various test batteries to measure a given person's internal inclination to take risks. Such results can be used to describe individual differences and to explain why some people systematically violate the principles of expected utility theory, but they can hardly be used for building adequate new economic theories.
2.4. Cognitive science

There are various treatments of risk-seeking and risk-averse choices in cognitive science, but cognitive science has also not yet provided an adequate and satisfactory theory that could easily be used by economists. The advantage of cognitive science, however, is that it is interested in the mechanisms underlying judgement and choice. Knowing the mechanisms could potentially be useful for predicting what kind of behaviour might be expected in a given situation, and that would be directly useful for building a more flexible economic theory with greater predictive power. Another advantage is the combination of various methodologies used in exploring these mechanisms (computational modelling, psychological experimentation, brain imaging, neuropsychological data, etc.).

Simon (1955, 1964) argued that an actual understanding of decision-making would require examining how various cognitive processes (e.g., perception, learning, reasoning) influence human decision behaviour. He suggested that decision-makers have limited cognitive resources (such as memory and attention) and limited computational capabilities (such as reasoning mechanisms) with which to face the complexity of the environment. Human beings, he suggested, are therefore rational but have bounded rationality: were there no resource limitations, the outcome would correspond to rational choice theory. As a result, Simon (1978) argued that the process of reaching a conclusion or decision might be considered rational (since it uses the given resources in the best possible way) even though its outcome is not rational.

Tversky and Kahneman (1974) and Kahneman, Slovic and Tversky (1982) suggested that people use specific heuristics (such as representativeness, availability, etc.) for judging the probabilities of events, which is a specific proposal for a bounded-rationality procedure. They demonstrate that although these heuristics yield good estimates in many cases, there are situations in which they produce strong biases that lead to "irrational" decisions. From an evolutionary perspective it might be expected that such situations are relatively rare and that in the vast majority of cases the heuristics produce reasonable results. In this way humans gain the advantage of reaching an adequate decision very fast and efficiently in the majority of cases, and therefore the probability of survival is higher.
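A classic concrete instance of a bounded-rationality procedure is Simon's satisficing rule: accept the first alternative that meets an aspiration level rather than exhaustively maximizing. A minimal sketch (the aspiration value here is an arbitrary illustrative assumption):

    # Satisficing: stop at the first option that is "good enough",
    # instead of evaluating all options to find the maximum.
    def satisfice(options, utility, aspiration=0.7):
        for o in options:                 # options examined in the order met
            if utility(o) >= aspiration:
                return o
        return max(options, key=utility)  # fall back to maximizing

    options = [0.4, 0.75, 0.9]
    print(satisfice(options, utility=lambda x: x))  # 0.75, found before the maximum 0.9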
Among the various attempts to create an alternative to utility theory following the bounded-rationality prescription, only prospect theory has achieved recognition in economics. Kahneman and Tversky (1979) proposed this descriptive theory, which could account for almost all of the data on decision under risk available at that time. Prospect theory has probably done more to bring psychology into the heart of economic analysis than any other approach. It aims to modify expected utility theory as little as possible in order to account for the observed violations and to explain why and how our choices deviate from the normative model. Unlike expected utility theory, which deals with how decisions under uncertainty should be made (a prescriptive approach), prospect theory deals with how decisions are actually made (a descriptive approach).

Prospect theory postulates that we make decisions by multiplying something like a subjective probability by something like a utility. For instance, if the consequence of x is more probable than that of y, we may weight the utility of option x more heavily than that of y; in particular, outcomes that are certain are overweighted relative to outcomes that are merely probable (the certainty effect). Thus prospect theory predicts that stated probabilities are not taken at their face value in evaluating a choice, as they would be in a mathematical expectation. Furthermore, according to prospect theory outcomes are evaluated relative to some reference point that is relatively easy to manipulate – easily affected by irrelevant factors, so that similar problems may lead to different decisions depending on how the decision situation is described, or on how we describe the choice to ourselves (the framing effect). Despite the fact that prospect theory has a solid mathematical basis, making it comfortable for economists to work with, it has not been applied much beyond behavioural economics. In addition, some new paradoxes have shown where prospect theory contradicts itself or makes false predictions. Its status as a new general theory of decision-making that could replace expected utility theory is therefore disputed. Nonetheless, prospect theory has provided important insights into choice behaviour.
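To make the "something like a probability times something like a utility" idea concrete, here is a sketch using the functional forms and parameter estimates from Tversky and Kahneman's later (1992) cumulative version of the theory; the original 1979 formulation differs in details, so this is purely an illustration:

    # Prospect-theory-style valuation of a simple gain-only lottery.
    # Parameters alpha = 0.88 and gamma = 0.61 are Tversky & Kahneman's
    # (1992) estimates, used here only for illustration.
    def value(x, alpha=0.88):
        return x ** alpha                 # concave value function for gains

    def weight(p, gamma=0.61):
        # inverse-S weighting: small p overweighted, large p underweighted
        return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

    def prospect_value(lottery):
        return sum(weight(p) * value(x) for p, x in lottery)

    print(weight(0.05))                  # ~0.13: a 5% chance is overweighted
    print(weight(0.95))                  # ~0.79: a 95% chance is underweighted
    print(prospect_value([(0.5, 300)]))  # ~64: weighted "utility" of the gamble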
Tversky and Kahneman (1981) described the so-called framing effect, which prompted a supplementary line of investigation into violations of description invariance. The framing effect means that the choice we make depends on how the situation is perceived, or "framed". Many experimental results have demonstrated that decision-makers respond differently to logically equivalent lotteries described in different terms (e.g., in terms of gains or of losses). Framing effects have been widely investigated: over the past decades, studies of framing effects in the area of judgement and choice have expanded into domains such as psycholinguistics, cognition, perception, social psychology, health psychology, clinical psychology, educational psychology, and business. Levin, Schneider and Gaeth (1998) introduced a typology of framing effects: they start with the "classical" framing effect, which they labelled "risky choice" framing, and introduce two other types, "attribute framing" and "goal framing". The typology is based on differences in the information encoding of positive and negative features in the decision task. Despite the widespread interest and the amount of accumulated evidence, there is a lack of understanding of the basic cognitive mechanisms that underlie framing effects.

Shafir et al. (1993) proposed a theory that may explain why a change in the description of two gambles results in a change in the evaluation of the objectively known probabilities. The theory is called reason-based choice. Rational choice assumes that options are ordered according to their value in a context-independent way, so that the more attractive options are selected and the less attractive ones rejected. In contrast, reason-based choice argues that we raise different arguments in different contexts, i.e. each context provokes a specific type of argument, which might not be raised in a different context. As a result, when we face a decision under risk in different contexts, we tend to arrive at different numbers of arguments "in favour of" and "against" each option, and therefore different choices are made. In contrast to the more abstract prospect theory, reason-based choice suggests specific mechanisms of how choices are made. In addition, this theory explains a wide range of experimental data that previously looked strange and "irrational" (Simonson, Nowlis and Simonson, 1993; Tversky and Shafir, 1992; Tversky and Simonson, 1993).

Finally, some cognitive scientists have introduced mechanisms based on emotions, which could also explain why people are not always rational. The main reasoning is that the emotional system is very rapid and can produce a result (an emotional state and a behavioural response) even before cognitive processing can take place. Zajonc (1980) was the first to demonstrate this effect: he showed that the interpretation of a Chinese character was affected by the subliminal presentation of an image of a smiling face vs. a neutral geometric figure. His interpretation is that the emotional processing of the smiling face was faster than the cognitive information processing and changed its result. Following this line of research, Epstein (1994) and Slovic et al. (2004) suggested that there are two "thinking modes", or two systems, called analytic (rational) and experiential (emotional), that together form the decision in a "dance of affect and reason". Each of these systems has its advantages: the analytic system is supposed to provide more precise results, but only when we have a lot of information about the domain and a lot of time to process it, while when there is little information or high time pressure, the experiential system reacts fast, even to peripheral aspects of the problem, and thus ensures that a decision is made in the required time. This use of emotions is called the "affect heuristic" and can be considered one of the available heuristics. Loewenstein et al. (2001) have also argued that risky decisions are influenced by, and perhaps even made by, feelings. Damasio (1994) provides neurological evidence supporting the claim that emotions are used in decision-making, showing that people with specific brain damage affecting their capacity for emotional response may deteriorate in their decision performance. He introduced the so-called somatic markers, which are positive or negative and are directly attached to images. Thus the presence of an image can directly activate the positive or negative marker associated with it and thereby influence the decision taken.
The main problem with all these theories of decision-making proposed within cognitive science is that they are not yet detailed enough to be strongly testable. There is generally a lack of computational models that could specify how exactly decisions are made, explain the existing data, and – more importantly – predict new data.

2.5. Cognitive economics

Many of the theories discussed above can be brought under the heading of cognitive economics; however, we will leave this section empty, since we are not satisfied with any account proposed so far. Instead, we list the requirements that such a theory would have to meet. First of all, a theory of decision-making should be detailed enough in terms of mechanisms and representations and should be implemented in a computational model. This allows predictions to be made and tested by comparing human data against simulation data. Second, the model of decision-making should be closely integrated with models of other cognitive processes, since, as is evident from the review above, deciding is not an isolated process: it influences, and is influenced by, perception, judgement, categorization, memory, reasoning, emotions, etc. That is why the best approach, in our view, is to integrate a decision model into a general cognitive architecture. Third, the model should explain the context-sensitivity of human decision-making and even predict new types of context effects (Kokinov, 1995, 1997, 1999). Fourth, the outcomes of the model should be easily usable in building a new economic theory of decision-making that is not based on the rational choice assumption; rather, "rational choice" or "maximizing expected utility" would be one possible outcome of the process among many possible behaviours. Moreover, the theory should predict under what circumstances "rational-choice" behaviour will be produced.

3. A DUAL-based approach towards decision-making under risk

In this section, we present an outline of a possible theory of decision-making under risk. This is just a first step, and we are far from having a properly developed theory: we still lack a fully implemented model and simulation data. Still, some predictions already follow from it. Since we would like to integrate decision-making with other cognitive processes, we base our theory of decision-making on a general cognitive architecture – DUAL (Kokinov, 1994a,b; Petrov and Kokinov, 1999). DUAL consists of a great number of micro-agents (simple computational units) that collectively produce the emergent behaviour of the whole system. Each micro-agent is hybrid.
It has a symbolic part and a connectionist part. The symbolic part represents a simple piece of knowledge: it has a local memory that holds symbolic structures representing a simple statement or a piece of procedural knowledge. The connectionist part represents the relevance of that knowledge to the current context by the level of activation computed by this node. The connectionist units spread activation over all existing links between the micro-agents. The symbolic units exchange messages over specific semantic links (superclass, subclass, instance-of, c-coreference, etc.) between them. All communication, and therefore all computation, is strictly local. Global processes (such as reasoning, problem solving, judgement, and decision-making) emerge out of these local communications; there is no centralized device to coordinate the whole process. Moreover, each symbolic processor runs at its own speed, and this speed varies dynamically: it is proportional to the activation level computed by the paired connectionist processor. This mechanism ensures that whatever is considered relevant to the context, and is therefore highly activated, runs faster and thus has priority. This makes all computations in the architecture context-sensitive.

The set of all micro-agents forms the so-called long-term memory (LTM) of the system. This is what the DUAL-based system "knows": concepts, general facts, episodes from past experience, as well as skills, including motor programs, mental operations, etc. However, the vast majority of these micro-agents will be in "sleeping mode", i.e. they will not be active and therefore will not take part in the computation. This means that even though they may contain important information that might be useful for solving the task at hand, this information is not available at that moment. The subset of micro-agents active at a particular moment is called working memory (WM). Only they participate in the computations, and they participate at various speeds depending on their current activation levels. In different contexts, different sets of agents will be present in WM and will run at different speeds; they can therefore produce different outcomes at the emerging global level, i.e. the cognitive system will arrive at different solutions or decisions.

Several models have been developed on the basis of the DUAL architecture: AMBR – a model of analogy making and memory (Kokinov, 1994c; Kokinov and Petrov, 2001); PEAN – a model of perception (Nestor and Kokinov, 2004); and JUDGEMAP – a model of judgement and choice (Kokinov, Hristova, and Petkov, 2004; Hristova, Petkov, and Kokinov, 2005; Petkov and Kokinov, 2006). They all use the same knowledge structures and the same mechanisms, which are provided by DUAL. Thus JUDGEMAP's basic mechanisms are spreading activation, mapping, and constraint satisfaction; all these mechanisms were initially introduced for modelling the process of analogy making and are reused in JUDGEMAP for modelling the processes of scaling and choice (Fig. 1).
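A schematic rendering of the activation mechanism just described – a deliberately simplified sketch, not DUAL's actual implementation – might look as follows:

    # Toy sketch of DUAL's core idea: activation spreads over links between
    # micro-agents, and working memory is the currently active subset of LTM;
    # the speed of a micro-agent's symbolic processing tracks its activation.
    class MicroAgent:
        def __init__(self, name):
            self.name, self.activation = name, 0.0
            self.links = {}                       # neighbour agent -> link weight

    def spread(agents, decay=0.5):
        incoming = {a: 0.0 for a in agents}
        for a in agents:
            for neighbour, w in a.links.items():
                incoming[neighbour] += a.activation * w
        for a in agents:
            a.activation = min(1.0, decay * a.activation + incoming[a])

    def working_memory(agents, threshold=0.1):
        return [a for a in agents if a.activation >= threshold]

    baby, safety = MicroAgent("baby"), MicroAgent("safety")
    baby.links[safety] = 0.8
    baby.activation = 1.0                         # e.g., primed by perception
    spread([baby, safety])
    print([a.name for a in working_memory([baby, safety])])  # ['baby', 'safety']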
Figure 1. AMBR and JUDGEMAP models using the same mechanisms (spreading activation, constraint satisfaction, mapping) from the cognitive architecture DUAL. [Diagram; labels: spreading activation, constraint satisfaction, marker passing.]
The choice could be modelled by JUDGEMAP in the following way. There is a coalition of agents, each of which represents a possible alternative, and an agent representing the positive choice. Each of the alternatives tries to form hypotheses (represented by other micro-agents) for a possible pairing with the choice-agent. All of these hypotheses compete, and one of them finally wins. The hypotheses may have justifications which support them (similarly to reason-based choice theory), and they receive additional activation from these justification agents. Finally, the constraint satisfaction mechanism decides which hypothesis wins; this depends on the supporting structure of justifications, but also on various other factors, such as the activation level of the alternatives themselves (if some of them are being perceived at the moment, they receive additional activation through perception), the activation level of the concepts related to the justifications (e.g., the concept of size would activate a justification agent claiming that one option is bigger than another), etc. Thus, if we compare this model to reason-based choice theory, we can say that all the power of reason-based choice is included in JUDGEMAP, but in addition there is an unconscious process of spreading activation that adds to the explicit mechanisms of comparing options and building arguments in favour of one or another of them. Making a choice in JUDGEMAP is thus an interplay of conscious (explicit) and unconscious (implicit) processes. Compared with the "two thinking modes" theory, JUDGEMAP acknowledges the complementary role of two or more processes in decision-making, but the rapid unconscious process is not necessarily an emotion-based mechanism; it could be an associative information process. Of course, emotions do play a role in making choices, but the way they could be integrated into the model is yet to be established.
They could influence the direct self-activation of a specific (positively charged) alternative, or they could act as a general parameter changing the pattern of spreading activation. These alternatives have yet to be explored.

Figure 2. The constraint satisfaction network built by the model to provide a choice between two options (Alternative 1 and Alternative 2). Alternative 1 is safer than Alternative 2, and the corresponding comparison relation becomes a justification for Hypothesis 1 (Alternative 1 is chosen); Alternative 2 is more profitable than Alternative 1, and the corresponding comparison becomes a justification for Hypothesis 2 (Alternative 2 is chosen). Other concepts and instances linked to the comparison relations are also shown; they emit activation towards the hypotheses when activated. [Network nodes: "baby" – concept "safety" – Alt. 1 – Safer(1,2) – Hypothesis 1 – choice – Hypothesis 2 – More profit(2,1) – Alt. 2 – concept "profit" – "James Bond".]
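The network of Figure 2 can be rendered as a toy constraint-satisfaction computation; the sketch below is our simplification, and all link weights and priming values are made up for illustration:

    # Sketch of the Figure 2 network: concepts feed activation to justification
    # relations, which feed the competing hypotheses; the more active hypothesis
    # wins. All weights below are illustrative, not the model's parameters.
    links = {
        "baby": {"safety": 0.8},
        "james_bond": {"profit": 0.8},
        "safety": {"safer(1,2)": 0.7},
        "profit": {"more_profit(2,1)": 0.7},
        "safer(1,2)": {"hypothesis_1": 0.9},
        "more_profit(2,1)": {"hypothesis_2": 0.9},
    }

    def settle(priming, steps=10, decay=0.5):
        act = dict(priming)
        for _ in range(steps):
            new = {node: decay * a for node, a in act.items()}
            for src, outs in links.items():
                for tgt, w in outs.items():
                    new[tgt] = new.get(tgt, 0.0) + act.get(src, 0.0) * w
            act = new
        return act

    # both justifications start equally active; only James Bond is primed
    act = settle({"james_bond": 1.0, "safer(1,2)": 0.5, "more_profit(2,1)": 0.5})
    print(max(["hypothesis_1", "hypothesis_2"], key=lambda h: act.get(h, 0.0)))
    # -> hypothesis_2: the irrelevant prime tips the balance toward the risky option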
Figure 2 presents an example of making a choice between two options, one of which is more risky than the other but offers a higher profit. This is a case of a strong conflict. Each of the two options has formed a hypothesis for a possible pairing with the choice response. Each of these hypotheses has been formed on the basis of some justification agent (one option is safer than the other, and the second is more profitable than the first). There could also be second-order relations representing which difference is bigger than the other; they are not depicted in the figure since in this particular case the differences are equal and therefore do not play a significant role. The concept of "safety" is connected to the risk-comparison justification, while the concept of "profit" is connected to the profit-comparison justification; when activated, they send activation to the corresponding justification nodes. Various other concepts, or specific instances and episodes, could be connected to the concepts of "safety" and "profit". Let us assume that the concept of "baby" is strongly associated with "safety", while the image of "James Bond" is strongly associated with success and profit. In that way we arrive at a specific prediction of the model.

3.1. Prediction

If the external environment provides a clue that is strongly associated with the concept of a "baby", then the concept "safety" will be strongly activated; this will result in greater activation of the safety comparison, and the cognitive system will prefer the safer Alternative 1. On the contrary, if the environment happens to provide a clue for James Bond, then the concepts of "success" and "profit" will be activated; as a result, the cognitive system will highly activate the profit-comparison justification and will finally prefer Alternative 2. In this example we can see the interplay between the conscious processes of making comparisons and building justifications and the unconscious process of spreading activation, which might be triggered by any environmental stimulus; even if this stimulus is not part of the task, it is sufficient that it be presented in the visual field of the human decision-maker for it to be perceived. Thus the specific prediction of the model to be tested is that presenting a supposedly irrelevant picture of James Bond will make human decision-makers tend to prefer the more risky option, while presenting a supposedly irrelevant picture of a baby will result in a preference for the safer option.

4. Experimental study

The experimental study described here tests the above prediction of the model. It is a continuation of previous work on context effects on problem solving (Kokinov and Yoveva, 1996; Kokinov et al., 1997), where AMBR predicted distant context effects (DICE). A similar methodology is used here in a decision-making context. We present a risky-choice task to the participants in the study and at the same time present them with either a picture of James Bond or a picture of a baby. The picture is not part of the task or the instruction and thus has to be considered completely irrelevant by the participants. That is why we call such context effects (if we obtain them) DICE.
In the experiment people played gambles. Participants were invited to play a game on a PC in a soundproof booth. On the computer screen they were presented with a stack of cards containing an Ace. The card stack consists of 10 cards, which are randomly distributed over two rows (Fig. 3); the particular position of the Ace is randomly chosen on each trial. The participants had to guess in which row the Ace is. If they guess correctly, they win a certain number of points, which is accumulated in a general score and later paid out to the participants. The numbers of points were selected so as to make the expected values of the two rows equal. Thus, in the example provided in Fig. 3, the expected value is 24 points for each of the rows (4/10 × 60 for the first row and 6/10 × 40 for the second row). According to the rational theory, we should therefore obtain a random 50:50 choice between the two options. On each trial participants faced a different configuration, varying from 1 card in the first row and 9 cards in the second row to 9 cards in the first row and 1 card in the second row (excluding 5 cards in each row); every trial displays a different, non-repeating configuration.

Figure 3. An example display of a trial from the gamble. [Screenshot of the two rows of cards.]

Three conditions were used: a control condition (the cards have a neutral back), a "risk-seeking" condition (the cards have a picture of James Bond as their back), and a "risk-aversion" condition (the cards have a sleeping baby as their back). The dependent variable was the percentage of risky responses during the game. People have seen many card stacks and know that the back of the cards may vary, but that the particular picture on the back does not play any role in any card game. Thus participants were supposed to ignore (at the conscious level) the picture on the back of the card (making this an example of a good distant context). At the same time, the picture was supposed to be perceived, and possibly to have an impact on the choice, at an unconscious level.
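The equal-expected-value design can be checked directly; a sketch with the payoff numbers of the example in Fig. 3:

    # Both rows have the same expected value: P(Ace in row) * points if correct.
    # Example from the text: 4 cards paying 60 points vs. 6 cards paying 40.
    def row_ev(cards_in_row, total_cards, points):
        return cards_in_row / total_cards * points

    print(row_ev(4, 10, 60))  # 24.0 for the 4-card (riskier) row
    print(row_ev(6, 10, 40))  # 24.0 for the 6-card (safer) row
    # Equal EVs: a rational-choice agent should be indifferent (50:50 choices).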
The results of the experiment are presented in Table 1. They show a small but significant difference between the performance of the three groups.

Table 1. Mean % of risk responses of the subjects in each group

    Group                     Mean % Risk Choices
    Baby condition                    41
    Control                           48
    James Bond condition              55
Participants in the neutral condition (without a meaningful picture on the back of the cards) chose the more risky option in 48% of the trials. Participants in the James Bond condition picked the risky option in 55% of the trials. Finally, participants in the "Baby" condition opted for the risky alternative in only 41% of the trials. These results confirm the prediction of the model. Moreover, interviews with the participants revealed that they had not consciously used the James Bond or baby pictures in their decision-making strategy; they were thus surprised to learn that their performance had been influenced by these pictures.
5. Conclusions

The performed experiment expanded the known territory of context influences: not only elements of the task, but also distant and seemingly irrelevant elements of the environment can produce a contextual effect. This study also has potential practical relevance: it shows that even an incidental picture on the wall of a store may change the choices of the customers.

This new phenomenon also raises important theoretical challenges to economic theories based on the traditional rational theory of decision-making under risk, by showing that even though the choice task is the same in all conditions, people react differently depending on factors that are supposed to have no relation to the optimal choice. It also challenges reason-based choice theory, since these effects can appear without the subjects' awareness and without explicit arguments being produced (arguments like "since we have James Bond on the back of the card we should risk more" seem odd to the participants).

The effects were predicted by the DUAL-based model of choice. These predictions are based on the parallel local processing postulated by DUAL, which allows for an interplay between various cognitive processes, some of which may be conscious and others unconscious. In this example, the process of unintended perception of the backs of the cards (although irrelevant to the task) leads to unconscious and uncontrolled activation of various concepts in LTM; if these happen to be connected with some of the criteria used (the comparison relations established), they will activate the corresponding hypotheses and thus influence
the result of the constraint satisfaction process, which determines the winning hypothesis and the preferred alternative.

Some may be inclined to interpret the results of the experiment within a different theory; one may try, for instance, to interpret them using the affect heuristic theory. The problem with this interpretation is that both pictures (of James Bond and of a baby) produce positive emotional states, yet they produce opposite responses (facilitation and inhibition of risky choices, respectively). Thus we need a different type of explanation.

The presented model is by no means considered the only possible mechanism of decision-making. Moreover, Kokinov (2003, 2005) has argued that decisions can be made by analogy with a specific old case (or episode). In this case, too, we can observe the interplay between the deliberate process of mapping and establishing correspondences between the two cases and the unconscious process of retrieving the old case, which cannot be consciously controlled but can be influenced by accidental pictures or objects in the environment.

This was our first attempt to build a cognitive model of risky decision-making satisfying the criteria set out earlier. It is based on a general cognitive architecture and is thus strongly integrated with other cognitive processes; it explains context effects on decision tasks; and it has made new predictions that were tested and confirmed. The usefulness of the model in building new economic theories is yet to be established.
Acknowledgements

This research has been supported by the ANALOGY research project funded by NEST within the 6th Framework Programme of the EU.
References

Allais, M. (1953), "Le comportement de l'homme rationnel devant le risque: Critique des postulats et axiomes de l'école américaine", Econometrica, Vol. 21, pp. 503–546.
Beck, U. (1992), Risk Society: Towards a New Modernity, London: Sage.
Beck, U. (1999), World Risk Society, Cambridge: Polity Press.
Camerer, C. (1992), "Recent tests of generalized utility theories", in: W. Edwards, editor, Utility Theories: Measurement and Applications, Norwell, MA: Kluwer Publishing, pp. 207–251.
Damasio, A.R. (1994), Descartes' Error: Emotion, Reason, and the Human Brain, New York: Grosset/Putnam.
Epstein, S. (1994), "Integration of the cognitive and the psychodynamic unconscious", American Psychologist, Vol. 49, pp. 709–724.
Furedi, F. (1997), Culture of Fear: Risk-taking and the Morality of Low Expectation, London: Continuum.
Furedi, F. (2003), Therapy Culture: Cultivating Vulnerability in an Uncertain Age, Oxford, UK: Routledge.
Kahneman, D. and A. Tversky (1979), "Prospect theory: An analysis of decision under risk", Econometrica, Vol. 47, pp. 263–291.
Kahneman, D., P. Slovic and A. Tversky (1982), Judgment under Uncertainty: Heuristics and Biases, New York: Cambridge University Press.
Knight, F. (1921), Risk, Uncertainty and Profit, Boston: Houghton Mifflin.
Kokinov, B. (1994a), "The DUAL cognitive architecture: A hybrid multi-agent approach", pp. 203–207 in: A. Cohn, editor, Proceedings of the European Conference on Artificial Intelligence (ECAI'94), London: Wiley.
Kokinov, B. (1994b), "The context-sensitive cognitive architecture DUAL", in: Proceedings of the 16th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Erlbaum.
Kokinov, B. (1994c), "A hybrid model of reasoning by analogy", in: K. Holyoak and J. Barnden, editors, Advances in Connectionist and Mental Computation Theory, Vol. 2: Analogical Connections, Norwood, NJ: Ablex.
Kokinov, B. (1995), "A dynamic approach to context modeling", in: IJCAI'95 Workshop on Modeling Context in Knowledge Representation and Reasoning, IBP, LAFORIA 95/11.
Kokinov, B. (1997), "A dynamic theory of implicit context", in: Proceedings of the 2nd European Conference on Cognitive Science, Manchester, UK, April 9–11.
Kokinov, B. (1999), "Dynamics and automaticity of context: A cognitive modelling approach", in: P. Bouquet, L. Serafini, P. Brezillon, M. Benerecetti and F. Castellani, editors, Modeling and Using Context, Lecture Notes in Artificial Intelligence, Vol. 1688, Berlin: Springer.
Kokinov, B. (2003), "Analogy in decision-making and social interaction: Emergent rationality", Behavioral and Brain Sciences, Vol. 26(2), pp. 167–168.
Kokinov, B. (2005), "Can a single episode or a single story change our willingness to risk? The role of analogies in decision-making", in: B. Kokinov, editor, Advances in Cognitive Economics, Sofia: NBU Press.
Kokinov, B., K. Hadjiilieva and M. Yoveva (1997), "Explicit vs. implicit hint: Which one is more useful?", in: B. Kokinov, editor, Perspectives on Cognitive Science, Vol. 3, Sofia: NBU Press.
Kokinov, B., P. Hristova and G. Petkov (2004), "Does irrelevant information play a role in judgment?", in: Proceedings of the 26th Annual Conference of the Cognitive Science Society, pp. 720–725, Hillsdale, NJ: Erlbaum.
Kokinov, B. and A. Petrov (2001), "Integration of memory and reasoning in analogy-making: The AMBR model", in: D. Gentner, K. Holyoak and B. Kokinov, editors, The Analogical Mind: Perspectives from Cognitive Science, Cambridge, MA: MIT Press.
Kokinov, B. and M. Yoveva (1996), "Context effects on problem solving", in: Proceedings of the 18th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Erlbaum.
Levin, I., S. Schneider and G. Gaeth (1998), "All frames are not created equal: A typology and critical analysis of framing effects", Organizational Behavior and Human Decision Processes, Vol. 76(2), pp. 149–188.
Lichtenstein, S. and P. Slovic (1971), "Reversals of preference between bids and choices in gambling decisions", Journal of Experimental Psychology, Vol. 89, pp. 46–55.
Loewenstein, G., E. Weber, C. Hsee and N. Welch (2001), "Risk as feelings", Psychological Bulletin, Vol. 127(2), pp. 267–286.
Nestor, A. and B. Kokinov (2004), "Towards active vision in the DUAL cognitive architecture", International Journal on Information Theories and Applications, Vol. 11(1), pp. 9–15.
Petkov, G. and B. Kokinov (2006), "JUDGEMAP – integration of analogy-making, judgment, and choice", in: Proceedings of the 28th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Erlbaum.
Petrov, A. and B. Kokinov (1999), "Processing symbols at variable speed in DUAL: Connectionist activation as power supply", pp. 846–851 in: T. Dean, editor, Proceedings of the 16th International Joint Conference on Artificial Intelligence, San Francisco, CA: Morgan Kaufmann.
Rabin, M. and R. Thaler (2001), "Anomalies: Risk aversion", The Journal of Economic Perspectives, Vol. 15(1), pp. 219–232.
Savage, L. (1954), The Foundations of Statistics, New York: Wiley.
Shafer, G. (1986), "Savage revisited", Statistical Science, Vol. 1, pp. 488–492.
Shafir, E., I. Simonson and A. Tversky (1993), "Reason-based choice", Cognition, Vol. 49, pp. 11–36.
Simon, H. (1955), "A behavioral model of rational choice", Quarterly Journal of Economics, Vol. 69, pp. 99–118.
Simon, H.A. (1964), "Rationality", in: J. Gould and W.L. Kolb, editors, A Dictionary of the Social Sciences, pp. 573–574, Glencoe, IL: The Free Press.
Simon, H.A. (1978), "Rationality as a process and as a product of thought", American Economic Review, Vol. 68(2), pp. 1–16.
Simonson, I., S. Nowlis and Y. Simonson (1993), "The effect of irrelevant preference arguments on consumer choice", Journal of Consumer Psychology, Vol. 16.
Slovic, P., M. Finucane, E. Peters and D. MacGregor (2004), "Risk as analysis and risk as feelings", Risk Analysis, Vol. 24(2).
Tversky, A. and D. Kahneman (1974), "Judgment under uncertainty: Heuristics and biases", Science, Vol. 185, pp. 1124–1131.
Tversky, A. and D. Kahneman (1981), "The framing of decisions and the psychology of choice", Science, Vol. 211, pp. 453–458.
Tversky, A. and E. Shafir (1992), "Choice under conflict: The dynamics of deferred decision", Psychological Science, Vol. 3, pp. 358–361.
Tversky, A. and I. Simonson (1993), "Context-dependent preferences", Management Science, Vol. 39(10), pp. 1179–1189.
Von Neumann, J. and O. Morgenstern (1944), Theory of Games and Economic Behavior (1953 edition), Princeton, NJ: Princeton University Press.
Zajonc, R.B. (1980), "Feeling and thinking: Preferences need no inferences", American Psychologist, Vol. 35, pp. 151–175.
Part II: Games and Evolution
CHAPTER 5
On Boundedly Rational Rules for Playing Normal Form Games

Fabrizio Germano*

Abstract

We consider a framework that allows us to analyze the strategic interaction of players who play different games according to a fixed set of rules. Rules are viewed as algorithms prescribing strategies for any of the different games that may arise. We also consider evolutionary implications of the framework and briefly relate them to recent experiments on rules and games.
Keywords: bounded rationality, evolutionary dynamics, learning, normal form game, stochastic dynamics

JEL classifications: C72, C73, D81, D83.

* Corresponding author.
1 See for example Camerer (2003) and Gigerenzer and Selten (2001) and the literature mentioned there.

1. Introduction

Motivated by an extensive experimental literature in both game theory and decision theory1 suggesting that decision makers often rely on simple criteria or rules of thumb when making decisions in both strategic and nonstrategic environments, we consider a framework that is a natural and straightforward extension of the standard framework studying interactions between players playing fixed normal form games. Rather than focussing on one fixed game with
fixed strategies and using its corresponding set of equilibria to describe the strategic interaction between the players, we consider a class or set of games G from which games are randomly and independently drawn according to a probability distribution μ. Agents play the drawn games according to rules, which we take to be algorithms prescribing a strategy (possibly mixed) for any game from G that may appear. An interesting feature of the framework is that rules are no longer just strategies of a given game, but algorithms that apply to the entire class G and may in fact have deeper cognitive interpretations, as well as applicability beyond the class G. For example, a rule could be the prescription to always play "maximin", that is, the strategy that maximizes the player's minimum or guaranteed payoff (randomizing if ties occur); or it could prescribe the Nash equilibrium strategy that maximizes joint payoffs; or simply the strategy that best responds to uniform priors over the other players' profiles. Basically, our goal is to study interactions between players that play according to rules recommending strategies for each of the individual games from the class G, and to study which of these rules may emerge from the (possibly repeated) interaction of the players. For this purpose, to make the analysis concrete, we fix a list of rules Ri for each player i and a class of games G endowed with a probability distribution μ. We then derive what we call an average game. This is a game (or meta-game) where players play rules against rules (strategies thus become rules for playing normal form games), and where payoffs are the μ-expected payoffs of playing the games from G according to the different profiles of rules from the set of rule profiles R = ×i∈I Ri. The average game turns out to play a central role in our analysis of rules. It should be stressed that it is of course not an average of the games in the class G, but rather is determined by the rules and their expected performance over the set of games in G. We then apply standard game-theoretic analysis to the average game, providing a simple framework for studying the interaction of players playing different games according to fixed rules. Essentially, the analysis suggests that one should expect players to play rule profiles that form equilibria of the average game. Clearly, having the "correct" set of rules and set of games with corresponding distribution μ is crucial for obtaining an analysis that can be related to experiments. Unfortunately, at this stage we cannot make any claims in this direction. On the other hand, our results are stated in terms of abstract sets of rules and games. After studying average games, we introduce a basic evolutionary process which we apply to study the evolution of rules, and we show that some basic folk results of evolutionary game theory (see for example Hofbauer and Sigmund, 2003) carry over to this setting with rules. In particular, we show that rules which are strictly dominated in the average game are played with probability zero almost surely in the limit; similarly for iteratively strictly dominated rules. Further, we show that, if the evolutionary dynamics converges, then the limit rule profile must be a Nash equilibrium of the average game. These results have counterparts, within the standard framework with fixed games, among others, in
Nachbar (1990), Friedman (1991), Samuelson and Zhang (1992), Cabrales and Sobel (1992), and Ritzberger and Weibull (1995). They can be seen as giving some further foundation for applying equilibrium analysis to the average game.

Given the game-theoretic emphasis, our focus throughout is on strategic interactions. Besides the literature mentioned in Camerer (2003), particularly closely related are the experimental papers of Stahl and Wilson (1995), Stahl (1999, 2000), Rankin et al. (2000), Costa-Gomes et al. (2001), Stahl and Van Huyck (2002), Van Huyck and Battalio (2002), Selten et al. (2003), and Costa-Gomes and Crawford (2004), who test specifically for the rules agents may use, learn to use, or also learn to develop when playing many different games. Identifying and understanding the rules that underlie subjects' decisions is an important step toward understanding their strategic behavior in more complex and general environments. Many of the examples throughout the paper are based on some of these experiments.

There are also a number of theoretical papers concerned with players playing or learning to play rules in strategic environments. Some of these include Li Calzi (1995), who considers agents that play potentially different games and studies what he calls fictitious play by cases, which specifically models how agents may draw on experience from playing similar games in the past via a fictitious play algorithm; he shows almost sure convergence of his process for 2×2 games. Samuelson (2001) considers agents playing different games with rules (or models representing the environment – which in his case are automata with varying states) that are optimal subject to complexity costs; he also studies the evolution of the automata and how they play in equilibrium. Sgroi and Zizzo (2001, 2003) study the process of neural networks having to play randomly drawn games after having been trained to play Nash equilibrium in environments with unique Nash equilibria; they then compare their networks' behavior after the training period is completed with behavior observed in the experimental literature and find potential similarities. Also closely related is Heller (2004), who studies the evolution of simple rules vs. rules that allow agents to learn their environment; her paper derives conditions on the costs associated with learners' rules guaranteeing that, within changing environments, learners survive in the long run. While our framework allows us in principle to address some of the questions arising in these papers, it is much less specific about the concrete learning and cognitive processes involved.

The paper is organized as follows. Sections 2 and 3 contain basic notation and the framework. Sections 4 and 5 contain examples, the game-theoretic and the evolutionary analysis, as well as the main results. Finally, Section 6 concludes.

2. Preliminary notions

Let I = {1, …, n} denote the set of players, Si player i's space of pure strategies, S = ×i∈I Si the space of pure strategy profiles, and let Σi denote the set of
probability measures on Si; Σ = ×i∈I Σi is the space of mixed strategy profiles. Let also S−i = ×j≠i Sj and Σ−i = ×j≠i Σj, and set Ki = #Si for the number of i's strategies, K = ∑i∈I Ki, and k = ∏i∈I Ki for the number of possible outcomes. For any set X ⊆ ℝ^m, let int(X) denote its interior. In what follows, we consider finite normal form games, that is, where n and each Ki are finite, and we fix both the set of players and the set of strategy profiles, so that we can identify a game with a point in Euclidean space, g ∈ ℝ^kn. We denote by gi ∈ ℝ^k the payoff array of player i and, by slight abuse of notation, also the payoff function of player i at game g. Finally, N(g) denotes the set of Nash equilibria of g.

3. Rules for playing normal form games

We are interested in rules for playing games within given subspaces of normal form games G ⊆ ℝ^kn. We view rules as algorithms that, for any game g ∈ G, prescribe a strategy of player i for that game. Formally,

Definition 1. A rule for player i is a map ri : G → Σi, for i ∈ I.

As with strategies for individual games, we let Ri denote a finite list of rules, let R = ×i∈I Ri denote the space of rule profiles, and ΔRi the set of probability measures on Ri; ΔR = ×i∈I ΔRi denotes the space of mixed rule profiles, with generic element ρ. While the space of all possible rules is very large, for concreteness we will focus here on a specific subset of rules, mainly taken from the experimental literature. Below is a list of such rules; it contains most of the ones that have been studied in the experimental literature in game theory.

    N1    Naive
    O1    Optimist
    P1    Pessimist
    A1    Altruist
    NR1   Naive reciprocal
    D1    Naive after one round dominance
    N2    Best response to N1
    O2    Best response to O1
    P2    Best response to P1
    A2    Best response to A1
    NR2   Best response to NR1
    D2    Naive after two rounds dominance
    N3    Best response to N2
    N4    Best response to N3
    RDN   Risk dominant Nash equilibrium
    PDN   Payoff dominant Nash equilibrium
The first four rules, "Naive" (N1), "Optimistic" (O1), "Pessimistic" (P1), and "Altruistic" (A1), recommend, for each game g ∈ G, respectively:

    N1: the strategy that best replies to beliefs assigning equal probability to the opponent's actions;
    O1: the "maximax" strategy that maximizes the maximum of own payoffs;
    P1: the "maximin" strategy that maximizes the minimum of own payoffs;
    A1: the strategy that maximizes joint payoffs.

The fifth rule, "naive reciprocal" (NR1), tries to capture how a player can take the opponent's payoffs into account, by weighing them with the player's own normalized payoffs: the higher the player's own payoffs, the higher the weight placed on the opponent's payoffs. Hence, the higher both the own and the opponent's payoffs for a given strategy, the more likely it is that the strategy is selected by this rule. This is a particularly reduced and questionable way of modeling reciprocity, which we stress by referring to it as "naive" reciprocity. The rules "naive after one round dominance" (D1) and "naive after two rounds dominance" (D2) recommend, respectively, D1: N1 after one round of deletion of strictly dominated strategies, and D2: N1 after two rounds of deletion of strictly dominated strategies; and the rules "best reply to naive" (N2), "best reply to optimistic" (O2), "best reply to pessimistic" (P2), "best reply to altruistic" (A2), and "best reply to naive reciprocal" (NR2) recommend playing a best response to the belief that the opponent plays, respectively, N1, O1, P1, A1, or NR1. Finally, "risk dominant Nash equilibrium" (RDN) and "payoff dominant Nash equilibrium" (PDN) recommend, respectively, the risk dominant Nash equilibrium strategy (generically unique and well defined in 2×2 games) and the Nash equilibrium strategy that maximizes joint payoffs. The next example makes some of these definitions more explicit.

Example 1. If G is the space of two-player 2×2 games of the form

    s1:   a1, b1    a2, b2
    s2:   a3, b3    a4, b4

(with ai the payoffs of player 1 and bi those of player 2),
where all payoffs are taken from the interval [0, 1], then one can write the above list of rules explicitly as algorithms that specify a strategy for player 1 for any game in G (the algorithms for player 2 are similar). For example, the first five rules can be written as
    N1    If a1 + a2 > a3 + a4 then s1 else s2
    O1    If max[a1, a2] > max[a3, a4] then s1 else s2
    P1    If min[a1, a2] > min[a3, a4] then s1 else s2
    A1    If max[a1 + b1, a2 + b2] > max[a3 + b3, a4 + b4] then s1 else s2
    NR1   If max[a1 + (2a1 − 1)b1, a2 + (2a2 − 1)b2] > max[a3 + (2a3 − 1)b3, a4 + (2a4 − 1)b4] then s1 else s2
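These algorithms translate directly into code; the sketch below is our transcription of the table above into Python, breaking ties in favour of s1 (as noted below, the way ties are resolved does not matter for the results):

    # Player 1's rules for a 2x2 game; a and b map indices 1..4 to the payoffs
    # of player 1 and player 2, respectively, as in the matrix of Example 1.
    def N1(a, b):   # best reply to uniform beliefs
        return "s1" if a[1] + a[2] > a[3] + a[4] else "s2"

    def O1(a, b):   # maximax
        return "s1" if max(a[1], a[2]) > max(a[3], a[4]) else "s2"

    def P1(a, b):   # maximin
        return "s1" if min(a[1], a[2]) > min(a[3], a[4]) else "s2"

    def A1(a, b):   # maximize joint payoffs
        return "s1" if max(a[1] + b[1], a[2] + b[2]) > max(a[3] + b[3], a[4] + b[4]) else "s2"

    def NR1(a, b):  # naive reciprocal: opponent payoffs weighed by own payoffs
        top = max(a[1] + (2 * a[1] - 1) * b[1], a[2] + (2 * a[2] - 1) * b[2])
        bot = max(a[3] + (2 * a[3] - 1) * b[3], a[4] + (2 * a[4] - 1) * b[4])
        return "s1" if top > bot else "s2"

    a = {1: 0.9, 2: 0.2, 3: 0.5, 4: 0.5}   # player 1's payoffs
    b = {1: 0.2, 2: 0.8, 3: 0.6, 4: 0.4}   # player 2's payoffs
    print(N1(a, b), P1(a, b))               # s1 s2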
where ties could also be resolved by randomizing between the strategies in question. We do not note this explicitly, however, since, as we will see later, in all our results ties occur with probability zero, so that the way ties are resolved does not matter.

An important objective of much of the experimental literature is to understand the level of (strategic) sophistication of subjects in different environments. Many of the earlier experiments were concerned with dominance solvable games such as the guessing games or beauty contest games; see Camerer (2003), especially Chapter 5 and the literature cited there. More recently, Stahl and Wilson (1995), Stahl (1999, 2000), Costa-Gomes et al. (2001), Selten et al. (2003), and Camerer et al. (2004), among others, have looked at other, broader classes of games. Specifically, Costa-Gomes et al. (2001) have subjects play a series of 2×2, 2×3, and 2×4 games and indicate that less than 30% of the subjects appear to use nonstrategic rules; between 65 and 90% appear to use "naive" (N1) or "best reply to naive" (N2), and up to 20% "naive after one round dominance" (D1). Stahl and Wilson (1995) and Stahl (1999, 2000) essentially also find many subjects playing N1 and N2, and, to a much lesser extent, also Nash equilibrium; Costa-Gomes and Crawford (2004) obtain similar results for a set of two-player guessing games. Selten et al. (2003) have subjects actually write programs to play 3×3 games and largely obtain Nash equilibrium play, especially for games with pure equilibria.

As a way to capture the potential cognitive bounds associated with different rules, one can consider complexity measures of the given rules. To fix ideas, we take as a proxy for the complexity of a rule the number of occurrences of the payoff symbols a1, a2, …, in its algorithm, counting repetitions. So N1, O1, and P1 all have complexity 4, while A1 has complexity 8 and NR1 has complexity 12. This measure will not play an important role here, but we will briefly come back to it in the next section.
4. Equilibrium analysis of rules

Given a subset G ⊆ ℝ^kn and a probability measure μ on G, we can assess the performance of given rules on games in G by computing the expected payoffs from playing the individual games that are drawn according to the probability measure μ. Throughout the paper we assume the set G to be compact. This leads to the following notion of an average game.
Definition 2. Let G ⊆ ℝ^kn be a compact set of games, let μ be a probability measure on G, and let R denote a finite space of rules. The average game g∞ is defined by

    g∞i(r) = ∫G gi(r(g)) dμ(g),   for r ∈ R, i ∈ I.
The notion of the average game is to be understood not as the average of the games in G, but as a description of how the rule profiles in R perform on average over all games in G. An equilibrium analysis of rules, given an environment (G, μ, R), involves studying the average game and its equilibria, where now the players' strategies are rules. As we will see in the next sections, important properties of the learning behavior of rules, given such an environment (G, μ, R), are derived from the associated average game g∞.

Example 2. Take again G to be the space of 2×2 games with payoffs in the interval [0, 1], and take μ such that payoffs are drawn according to independent uniform distributions on [0, 1]. All rules mentioned above are well defined here. For the rules N1, P1, A1, and N2 we have the following average game.2
    (rows: player 1's rule; columns: player 2's rule; each cell lists the payoffs of players 1 and 2)

            N1            P1            A1            N2
    N1   .617, .617    .617, .600    .707, .597    .617, .667
    P1   .600, .617    .600, .600    .661, .596    .600, .639
    A1   .597, .707    .596, .661    .712, .712    .588, .696
    N2   .667, .617    .639, .600    .696, .588    .625, .625
The rule profiles A1 and N2 constitute pure Nash equilibria of the average game; there is also a mixed equilibrium involving the rules A1 and N2. The "maximin" rule P1 is strictly dominated by N1 and by N2, and the rule N1 is strictly dominated by a mixture of A1 and N2. N2 does relatively better against the nonstrategic rules N1, P1, and A1 than against N2 itself. Notice also that the average game need not have any resemblance to any of the individual games in G.
2 The payoffs in this and the following examples are obtained through random simulations approximating the true expected values. The number of runs was chosen sufficiently large (over 10^6) so that the standard deviation of the individual payoffs is below 10^−4.
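The simulation behind these numbers can be sketched as follows; this is our reconstruction under the stated assumptions (i.i.d. uniform payoffs, rules as in Example 1), shown here for player 1's payoff from the rule pair (N1, P1) only:

    # Monte Carlo estimate of one entry of the average game for 2x2 games
    # with i.i.d. uniform payoffs on [0, 1].
    import random

    def N1(own):   # own = (a1, a2, a3, a4): best reply to uniform beliefs
        return 0 if own[0] + own[1] > own[2] + own[3] else 1

    def P1(own):   # maximin
        return 0 if min(own[0], own[1]) > min(own[2], own[3]) else 1

    def average_payoff(rule1, rule2, runs=100_000):
        total = 0.0
        for _ in range(runs):
            a = [random.random() for _ in range(4)]   # player 1's payoffs
            b = [random.random() for _ in range(4)]   # player 2's payoffs
            row = rule1(a)
            # player 2 chooses a column; from her point of view her payoffs,
            # arranged with her own strategies as rows, are (b1, b3, b2, b4)
            col = rule2([b[0], b[2], b[1], b[3]])
            total += a[2 * row + col]                 # player 1's realized payoff
        return total / runs

    print(average_payoff(N1, P1))   # ~ .617, matching the N1-vs-P1 entry above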
Which rule profiles constitute Nash equilibria clearly depends on the underlying environment, as the following example illustrates.

Example 3. Adding the rules "best reply to altruistic" (A2), "naive after one round dominance" (D1), and "risk dominant Nash" (RDN) to the environment of Example 2 leads to the following average game:
    (rows: player 1's rule; columns: player 2's rule; each cell lists the payoffs of players 1 and 2)

             N2            A1            A2            D1            RDN
    N2    .625, .625    .696, .588    .648, .618    .667, .641    .636, .667
    A1    .588, .696    .712, .712    .610, .751    .595, .708    .593, .702
    A2    .618, .648    .751, .610    .649, .649    .625, .659    .622, .656
    D1    .641, .667    .708, .595    .659, .625    .641, .641    .644, .651
    RDN   .667, .636    .702, .593    .656, .622    .651, .644    .651, .651
where only the rule RDN survives iterated deletion of strictly dominated strategies.3 Thus, unfortunately, to be able to make a connection with the experimental results, one would need more information about the overall environment.

3 A1 is dominated by A2; A2 and D1 are then dominated by RDN, and finally N2 is dominated by RDN. The rules N1 and P1 were omitted since they are already dominated by N2.
N1 .617 .600 .600 .597 .609 .642 .667 .639 .639 .626 .637 .625 .652 .642 4
O1 .617 .600 .600 .592 .607 .642 .639 .667 .611 .625 .651 .625 .639 .642 4
P1 .617 .600 .600 .596 .608 .642 .639 .611 .667 .616 .611 .625 .639 .627 4
A1 .707 .710 .661 .712 .722 .708 .696 .696 .686 .751 .708 .689 .702 .711 8
NR1 .636 .626 .613 .621 .634 .656 .656 .670 .631 .657 .686 .642 .656 .660 12
D1 .617 .600 .600 .595 .607 .641 .667 .639 .639 .625 .637 .625 .651 .642 16
N2 .617 .600 .600 .588 .602 .641 .625 .625 .625 .618 .623 .666 .667 .633 8
O2 .617 .600 .600 .595 .606 .641 .625 .625 .625 .614 .620 .641 .635 .637 8
P2 .617 .600 .600 .584 .603 .641 .625 .625 .625 .618 .623 .642 .635 .625 8
A2 .634 .624 .613 .610 .629 .659 .648 .664 .629 .649 .659 .656 .656 .667 12
NR2 .623 .609 .605 .603 .613 .647 .632 .638 .626 .625 .633 .650 .643 .647 16
N3 .617 .600 .600 .594 .604 .642 .625 .625 .625 .613 .617 .625 .632 .623 12
RDN .617 .600 .600 .593 .605 .644 .636 .629 .629 .622 .626 .642 .651 .635 84
PDN .623 .614 .602 .615 .621 .647 .639 .644 .629 .643 .643 .648 .646 .657 76
A1 is dominated by A2; A2 and D1 are then dominated by RDN, and N2 finally is dominated by RDN. The rules N1 and P1 were omitted since they are dominated already by N2.
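The elimination order described in footnote 3 can be checked mechanically. The following sketch is our own illustration; it only tests domination by pure rules, which suffices here. The payoffs u[x][y], the average payoff to rule x against rule y, are transcribed from the Example 3 table.

```python
# u[x][y]: average payoff to a player using rule x against an opponent using y
u = {
    "N2":  {"N2": .625, "A1": .696, "A2": .648, "D1": .667, "RDN": .636},
    "A1":  {"N2": .588, "A1": .712, "A2": .610, "D1": .595, "RDN": .593},
    "A2":  {"N2": .618, "A1": .751, "A2": .649, "D1": .625, "RDN": .622},
    "D1":  {"N2": .641, "A1": .708, "A2": .659, "D1": .641, "RDN": .644},
    "RDN": {"N2": .667, "A1": .702, "A2": .656, "D1": .651, "RDN": .651},
}

rules = set(u)
changed = True
while changed:
    changed = False
    for x in sorted(rules):
        # delete x if some surviving pure rule z beats it against every survivor
        if any(all(u[z][y] > u[x][y] for y in rules) for z in rules - {x}):
            rules.discard(x)
            changed = True
            break
print(rules)   # {'RDN'} survives, as stated in footnote 3
```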
More generally, still within the 2×2 setting, one can derive average games for any set of rules of the two players from the list of rules given in Section 3. In particular, we obtain the above payoff matrix for player 1 (by symmetry, player 2's payoffs are the same). This can be used to derive average games for corresponding subsets of rules. Also reported are the complexities of the given rules. As one might expect, altruistic or reciprocal rules tend to do relatively better against similarly altruistic or reciprocal rules than against more strategic ones. (Notice that the reciprocal rule NR1 nonetheless strictly dominates the altruistic rule A1.) Sophisticated strategic rules tend to do relatively well both against nonstrategic and strategic rules, though they tend to do even better against altruistic or reciprocal rules. While it is true that some of the more complex rules strictly dominate some of the less complex ones (e.g., D1 dominates N1, O1, and P1), it might be worth observing that the payoffs deriving from Nash equilibrium behavior in games with more complex rules can be lower than in games with less complex ones. To see this, consider the Nash equilibrium payoff of the profile A1-A1 in the game of Example 2, which is 0.712. This is strictly higher than the payoff of the profile PDN-PDN, which is only 0.657. In other words, the adoption of more complex rules by the overall population need not lead to higher payoffs in equilibrium, suggesting that as agents get more sophisticated overall payoffs may actually decline.

5. Evolutionary analysis of rules

Next, we consider evolution of rules from the given set R and model agents learning or updating the probability with which they play different rules depending on the payoffs obtained on the randomly selected games. More specifically, time is discrete, t = 1, 2, …, and games are drawn randomly from G each period according to the distribution μ. Players play the randomly drawn games according to the rules in R and choose the rules depending on their performance on the drawn games. We thus obtain an evolutionary process for rules that is a stochastic discrete time process. For simplicity, we take as updating rule the aggregate log-monotonic dynamics studied for example in Cabrales and Sobel (1992). Given the average game, the aggregate log-monotonic dynamics is well defined and applies also to rules. As we will see later, it will be useful in evaluating limiting properties of the stochastic dynamics (to be introduced just below) on the underlying environment (G, μ, R).

Definition 3. (Discrete) Aggregate log-monotonic dynamics on g∞:

r^i_{k,t+1} = [ e^{α^i(r_t)(g^i_∞(r^i_k, r^{-i}_t) − g^i_∞(r_t))} / Σ_{j=1}^{K_i} r^i_{j,t} e^{α^i(r_t)(g^i_∞(r^i_j, r^{-i}_t) − g^i_∞(r_t))} ] · r^i_{k,t},    (1)

where α^i : R → ℝ₊ is a continuous function, bounded away from zero.
We refer to the dynamics defined in Equation (1) simply as the average dynamics. Notice that it is a deterministic dynamics, and the weights with which rules are played are updated according to the relative performances of the rules at the current rule profile r_t. It is also referred to as an exponentially weighted average rule and is closely related to the logistic learning rule (e.g., Camerer and Ho, 1999). Though we are not committed to the particular form of the dynamics, we use it in the proofs. We view Camerer and Ho (1999) and Hopkins (2002) as providing indirect, empirical and theoretical, support for employing such a dynamics. Next, we consider a process of stochastic updating of rules that occurs over games that are drawn randomly from G according to the probability measure μ. In this context, starting from an initial distribution of rules r_0 ∈ R within the population of players, we consider an evolutionary process that is an application or extension of the above aggregate log-monotonic selection dynamics to this stochastic context. As before, the probabilities with which rules are played are updated according to the relative performances of the rules on the randomly drawn games.

Definition 4. (Discrete) Stochastic aggregate log-monotonic dynamics on (G, μ, R):

r^i_{k,t+1} = [ e^{α^i(r_t)(g^i_t(r^i_k(g_t), r^{-i}_t) − g^i_t(r_t))} / Σ_{j=1}^{K_i} r^i_{j,t} e^{α^i(r_t)(g^i_t(r^i_j(g_t), r^{-i}_t) − g^i_t(r_t))} ] · r^i_{k,t},    (2)

where α^i : R → ℝ₊ is a positive continuous function, bounded away from zero. We refer to the dynamics defined in Equation (2) as the stochastic dynamics. Notice that unlike the (deterministic) average dynamics, where the relative performance of a given rule is evaluated with the fixed average game g∞, here the relative performance at t is evaluated with the randomly drawn game g_t.
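For concreteness, one step of the update (2) for a single player can be sketched as follows (a minimal illustration with a constant α^i ≡ alpha; the variable names are ours):

```python
import numpy as np

def log_monotonic_step(r, payoffs, alpha=1.0):
    # One step of Equation (2) for player i on the game drawn at t:
    # r[k] is r^i_{k,t}, payoffs[k] is the realized payoff of rule k on g_t,
    # and r @ payoffs is the expected payoff g_t^i(r_t).
    w = np.exp(alpha * (payoffs - r @ payoffs))
    return r * w / (r @ w)
```

Feeding in the fixed average-game payoffs instead of the realized ones gives one step of the deterministic dynamics (1).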
5.1. Iterated strict dominance

Our first result shows that rules that are strictly dominated in the average game tend to disappear under the stochastic dynamics. This is a stochastic counterpart to Nachbar (1990), Friedman (1991), Samuelson and Zhang (1992), and, in particular, Cabrales and Sobel (1992) and Cabrales (2000). Notice that a rule may be strictly dominated in the average game although it never recommends a dominated strategy in any of the randomly drawn games; such a rule would tend to disappear almost surely in the long run.4
4 All proofs are contained in Germano (2006).
Proposition 1. Let g∞ be the average game for the environment (G, μ, R). If r^i_k ∈ R^i is strictly dominated in g∞ for some i ∈ I, and {r_t} follows some stochastic aggregate log-monotonic dynamics with r_0 ∈ int(R), then r^i_{k,t} → 0 almost surely. Further, if r^{i′}_{k′} ∈ R^{i′} is iteratively strictly dominated in g∞ for some i′ ∈ I, then r^{i′}_{k′,t} → 0 almost surely.
In Example 2 this proposition implies that N1 and P1 would be played with probability zero in the limit, while in Example 3 it implies that all the rules except for RDN would be played with probability zero in the limit.
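This can be illustrated numerically by iterating the average dynamics (1) on the average game of Example 2 (a minimal sketch with α^i ≡ 1, payoff entries transcribed from the table above; the symmetric start r_0 = (1/4, …, 1/4) keeps both players at the same mixture): the weights on the dominated rules P1 and N1 shrink toward zero, in line with the proposition.

```python
import numpy as np

rules = ["N1", "P1", "A1", "N2"]
# g[x, y]: average payoff to a player using rule x against rule y (Example 2)
g = np.array([[.617, .617, .707, .617],
              [.600, .600, .661, .600],
              [.597, .596, .712, .588],
              [.667, .639, .696, .625]])

r = np.full(4, 0.25)            # uniform interior start
for _ in range(2000):
    payoffs = g @ r             # payoff of each rule against the mixture r
    w = np.exp(payoffs - r @ payoffs)
    r = r * w / (r @ w)         # Equation (1) with alpha = 1
print(dict(zip(rules, r.round(4))))   # N1 and P1 should be near zero
```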
5.2. Nash equilibria

We next consider the issue of convergence of the stochastic dynamics to Nash equilibria. Notice that our log-monotonic dynamics is uncoupled in the sense of Hart and Mas-Colell (2003), that is, the weight put by player i on rule k at time t depends only on the profile r^i_{t−1} at t − 1 and on player i's payoffs in the current game g_t. In particular, it does not directly depend on the payoffs of any of the other players. Hart and Mas-Colell show that such a dynamics, if deterministic, cannot in general (and generically in the space of games) guarantee convergence to Nash equilibrium. Thus, except for specific games, like two-player potential or zero-sum games, one should not expect the log-monotonic dynamics to converge to Nash equilibrium of the average game. However, analogous to Nachbar (1990), Friedman (1991), and Samuelson and Zhang (1992), we show that if the process does converge, then the limit point must be a Nash equilibrium of the average game.

Proposition 2. Let g∞ be the average game for the environment (G, μ, R) and let {r_t} follow some stochastic aggregate log-monotonic dynamics with r_0 ∈ int(R). Then, if r_t → r almost surely, then r ∈ N(g∞).
This proposition gives some further foundation for the equilibrium analysis involving Nash equilibrium rule profiles of the average game. The next result shows that if the initial rule distribution r_0 is sufficiently close to a stable pure strategy Nash equilibrium of the average game, then, in the limit, the stochastic dynamics will converge to that equilibrium profile almost surely. We formalize this by using the following definition based on Arnold (1974).

Definition 5. Let r ∈ R be a zero of the dynamics {r_t}; then we say r is stochastically stable if for every neighborhood V of r and every ε > 0, there exists a neighborhood U of r, U ⊆ V, of strictly positive measure, such that P[r_t ∈ V for all t > 0] ≥ 1 − ε whenever r_0 ∈ U; r is stochastically unstable if it is not stochastically stable.
We say r is stochastically asymptotically stable if it is stochastically stable and

lim_{r_0 → r} P[ lim_{t→∞} r_t(r_0) = r ] = 1.
We can now state the following.

Proposition 3. Let g∞ be the average game for the environment (G, μ, R) and let {r_t} follow some stochastic aggregate log-monotonic dynamics. If r ∈ R is a regular pure Nash equilibrium of g∞ that is asymptotically stable under the corresponding average dynamics, then r is stochastically asymptotically stable under {r_t}.

Applied to Examples 2 and 3, this proposition implies that if players are putting sufficiently high probability on rules constituting one of the pure strategy equilibrium rule profiles, then they will converge to playing it with probability one. One positive feature of this framework is that convergence of the process {r_t} to a pure strategy rule profile can be interpreted as the players learning to play the actual rules (or algorithms) corresponding to the limiting rule profile. In particular, if the limiting rule corresponds to say playing the payoff dominant Nash equilibrium strategy, then this means all players eventually learn (i.e., converge to) the actual algorithm that prescribes playing the payoff dominant Nash equilibrium strategy for every game drawn from G. Moreover, in some cases, rules may be applicable even to games outside of G. The next example is adapted from the experiments of Rankin et al. (2000) and Stahl and Van Huyck (2002), who study the evolution of subjects' behavior within a class of stag hunt games.

Example 4. Take G to be the space of all 2×2 games of one of the two forms

   a+e, a+e    e, b+e             b+e, b+e    b+e, e
   b+e, e      b+e, b+e           e, b+e      a+e, a+e
where a = 1, b ∈ [0, 1], e ∈ [0, 1/8], where b and e are drawn uniformly and independently from their respective ranges, and where the left or right payoff matrix is drawn with probability one-half. Consider the two rules, "payoff dominant Nash" (PDN) and "risk dominant Nash" (RDN). The average game is

            PDN              RDN
PDN    1.063, 1.063     0.562, 0.937
RDN    0.937, 0.562     0.937, 0.937
The rule profiles PDN and RDN are both asymptotically stable under the average dynamics. Hence Proposition 3 says that if initial propensities to play say PDN are sufficiently high, then the process will converge with high probability to the entire population playing PDN. The probability increases the higher the initial propensity of playing PDN is, and, given an initial propensity, whether the stochastic dynamics converges to PDN or RDN depends on the actual sequence of games drawn. Rankin et al. (2000) obtain that between 80 and 98% of their subjects play PDN after about 75 rounds, starting from initial propensities that seem to be only about 45-75%. The following example shows that Proposition 3 does not hold for strictly mixed equilibria that are asymptotically stable in the average game.

Example 5. Consider the following average game obtained from the environment (G, μ, R), where G is the space of all 3×3 games with payoffs in [0, 1] and μ again uniform.
[Average game over the rules D1, N3 and N2; individual entries: 0.660, 0.650, 0.660, 0.660, 0.724, 0.750, 0.660, 0.640]
This game has a unique mixed equilibrium, which is asymptotically stable under the average dynamics if for example α¹ = α² = 1. However, it can be checked that the stochastic dynamics, even if it starts at the mixed equilibrium, leaves any sufficiently small neighborhood with probability one. Since strict Nash equilibria are always in pure strategies and asymptotically stable under the log-monotonic selection dynamics, it follows, analogous to Nachbar (1990) and Ritzberger and Weibull (1995), that if r ∈ R constitutes a strict Nash equilibrium of the average game and {r_t} follows some stochastic aggregate log-monotonic dynamics, then the profile r is stochastically asymptotically stable under {r_t}.

6. Conclusion

The present analysis can be extended in many ways. Some obvious ones consist in dropping some of the stationarity assumptions built into the model. For instance, one could consider rules that are contingent on past behavior (as for example in Stahl, 1999, 2000; Stahl and Van Huyck, 2002); one may also allow the distribution μ to change over time. On the other hand, one could generalize the class of dynamics and test whether folk results shown for the present aggregate log-monotonic dynamics carry over to further classes, for example, to more sophisticated learning or heuristic dynamics like fictitious play type dynamics (see Fudenberg and Levine, 1998) or regret-based dynamics (see Hart and Mas-Colell, 2001). In order to extend the results (essentially, the close link between the stochastic and the average dynamics) one needs to check that the
dynamics depends in a sufficiently linear fashion on the history of play, such that the law of large numbers can be applied. However, it seems that some of the main challenges lie in characterizing "good" rules that ideally apply to a wide range of games and environments, and in linking them to actual cognitive (or genetic) behavior. We view this paper as a first step toward such a broader and deeper analysis. Relatedly, one could attempt to obtain more information about the relevant environments that are faced by subjects in different contexts and thus relate some of the theoretical implications to actual experiments. Another aspect that has not been touched on here is the modeling of the process of learning, conceiving, or developing rules without previous knowledge of the set of rules. The experiments of Selten et al. (2003), where students had to develop algorithms for playing randomly drawn 3×3 games, as well as some of the experiments mentioned in the paper focusing on learning of rules, for example, Rankin et al. (2000) and Stahl and Van Huyck (2002), are useful for this. The neural network approach of Zizzo and Sgroi (2001, 2003), while conceptually quite different from the one of the present paper, may also help understand possible cognitive processes underlying the decision processes of subjects having to play in different environments, especially for intuition-based decision making; Zizzo and Sgroi (2001) also provide evidence that neural networks may play in similar ways to some of the subjects, for example in the Stahl (1999, 2000) and Costa-Gomes et al. (2001, 2004) experiments.

Acknowledgements

I thank Vince Crawford, Thibault Gajdos, Ehud Lehrer, Rosemarie Nagel, Burkhard Schipper, Joel Sobel, and participants at ECCE1, Gif-sur-Yvette, September 2004, for comments and insightful conversations. Financial support from the Spanish Ministry of Science and Technology, Grant SEJ2004-03619 and in the form of a Ramón y Cajal fellowship, as well as from Fundación BBVA Grant "Aprender a jugar" is gratefully acknowledged.

References

Arnold, L. (1974), Stochastic Differential Equations: Theory and Applications, Malabar, Florida: Krieger Publishing.
Cabrales, A. (2000), "Stochastic replicator dynamics", International Economic Review, Vol. 41, pp. 451-481.
Cabrales, A. and J. Sobel (1992), "On the limit points of discrete selection dynamics", Journal of Economic Theory, Vol. 57, pp. 407-419.
Camerer, C.F. (2003), Behavioral Game Theory: Experiments in Strategic Interactions, Princeton, New Jersey: Princeton University Press.
Camerer, C.F. and T.H. Ho (1999), "Experience-weighted attraction learning in normal-form games", Econometrica, Vol. 67, pp. 827-874.
Camerer, C.F., T.H. Ho and J.K. Chong (2004), "A cognitive hierarchy model of games", Quarterly Journal of Economics, Vol. 119, pp. 861-898.
Costa-Gomes, M., V. Crawford and B. Broseta (2001), "Cognition and behavior in normal-form games: an experimental study", Econometrica, Vol. 69, pp. 1193-1235.
Costa-Gomes, M. and V. Crawford (2004), "Cognition and behavior in two-person guessing games: an experimental study", Mimeo, University of York and University of California, San Diego.
Friedman, D. (1991), "Evolutionary games in economics", Econometrica, Vol. 59, pp. 637-666.
Fudenberg, D. and D. Levine (1998), The Theory of Learning in Games, Cambridge, MA: MIT Press.
Germano, F. (2006), "Stochastic evolution of rules for playing normal form games", Theory and Decision, forthcoming.
Gigerenzer, G. and R. Selten (2001), Bounded Rationality: The Adaptive Toolbox, Cambridge, MA: MIT Press.
Hart, S. and A. Mas-Colell (2001), "A general class of adaptive strategies", Journal of Economic Theory, Vol. 98, pp. 26-54.
Hart, S. and A. Mas-Colell (2003), "Uncoupled dynamics do not lead to Nash equilibrium", American Economic Review, Vol. 93, pp. 1830-1836.
Heller, D. (2004), "An evolutionary approach to learning in a changing environment", Journal of Economic Theory, Vol. 114, pp. 31-55.
Hofbauer, J. and K. Sigmund (2003), "Evolutionary game dynamics", Bulletin of the American Mathematical Society, Vol. 40, pp. 479-519.
Hopkins, E. (2002), "Two competing models of how people learn in games", Econometrica, Vol. 70, pp. 2141-2166.
Li Calzi, M. (1995), "Fictitious play by cases", Games and Economic Behavior, Vol. 11, pp. 64-89.
Nachbar, J.H. (1990), "'Evolutionary' selection dynamics in games: convergence and limit properties", International Journal of Game Theory, Vol. 19, pp. 59-90.
Rankin, F.W., J.B. Van Huyck and R.C. Battalio (2000), "Strategic similarity and emergent conventions: evidence from similar stag hunt games", Games and Economic Behavior, Vol. 32, pp. 315-337.
Ritzberger, K. and J.W. Weibull (1995), "Evolutionary selection in normal form games", Econometrica, Vol. 63, pp. 1371-1399.
Samuelson, L. (2001), "Analogies, adaptation, and anomalies", Journal of Economic Theory, Vol. 97, pp. 320-366.
Samuelson, L. and J. Zhang (1992), "Evolutionary stability in asymmetric games", Journal of Economic Theory, Vol. 57, pp. 363-391.
Selten, R., K. Abbink, J. Buchta and A. Sadrieh (2003), "How to play 3×3 games: a strategy method experiment", Games and Economic Behavior, Vol. 45, pp. 19-37.
Sgroi, D. and D.J. Zizzo (2003), "Strategy learning in 3×3 games by neural networks", Mimeo, University of Cambridge and Oxford University.
Stahl, D.O. (1999), "Evidence based rules and learning in symmetric normal-form games", International Journal of Game Theory, Vol. 28, pp. 111-130.
Stahl, D.O. (2000), "Rule learning in symmetric normal-form games: theory and evidence", Games and Economic Behavior, Vol. 32, pp. 105-138.
Stahl, D.O. and J.B. Van Huyck (2002), "Learning conditional behavior in similar stag hunt games", Mimeo, University of Texas, Austin, and Texas A&M University.
Stahl, D.O. and P.W. Wilson (1995), "On players' models of other players: theory and experimental evidence", Games and Economic Behavior, Vol. 10, pp. 218-254.
Van Huyck, J.B. and R.C. Battalio (2002), "Prudence, justice, benevolence, and sex: evidence from similar bargaining games", Journal of Economic Theory, Vol. 104, pp. 227-246.
Zizzo, D.J. and D. Sgroi (2001), "Bounded-rational behavior by neural networks in normal form games", Mimeo, University of Cambridge and Oxford University.
CHAPTER 6
Contagion and Dominating Sets

Jacques Durieu, Hans Haller and Philippe Solal

Abstract

Each agent of a finite population interacts strategically with each of his neighbours on a graph. All agents have the same pair of available actions. In every period, each agent chooses a particular action if at least a proportion p of his neighbours has chosen this action in the previous period. Contagion is said to occur if one action spreads from a particular group of agents to the entire population. We develop techniques for analysing contagion in connected regular graphs. These techniques are based on the concept of dominating set from graph theory. We also characterize the class of regular graphs where different agents may choose different actions forever and the class of regular graphs where two-period limit cycles may occur. Finally, we apply our results to the case of tori.
Keywords: contagion, dominating set, normal form game, automata network

JEL classification: C72, C73, D83

1. Introduction

Many important interactions in an economic, social, political or computational context are local in the sense that they are based on bilateral relationships. A local interaction structure formally describes who interacts with whom. It specifies for each agent a set of neighbours, the set of agents with whom the agent interacts. The agent chooses from a finite set of available actions. The agent's choice depends on
or, more concisely, responds to his neighbours’ choices. The theory of discrete dynamical systems has been utilized in recent years to describe and analyse contagion phenomena in a society with a local interaction structure. Contagion is said to occur if one action can spread by a contact effect from a particular group of agents to the entire population. In this paper, we develop techniques for analysing contagion in local interaction structures on a finite population of agents. All agents have the same pair of available actions. Hence each makes binary choices. We study the evolution of the profile of choices in discrete time. In every period, each agent adapts his action to his neighbours’ choices in the previous period according to a boundedly rational behaviour rule. To be precise, each agent follows a threshold rule: an agent chooses a particular action if at least a proportion p of his neighbours has chosen this action in the previous period. In a game-theoretic context, this refers to a situation in which each agent plays a symmetric two-person coordination game with each of his neighbours on the interaction structure. In every period, each agent chooses an action by following a myopic rule. Such a behaviour has been extensively investigated in the literature on evolutionary game theory, see among others Blume (1995, 1997), Ellison (1993), Young (1998), Baron et al. (2002). However, a common assumption in these models is that agents may deviate from their myopic rule and choose an action at random. Here, in order to understand how the local interaction structure influences the contagion process, we exclude such deviations and concentrate on the threshold rule described above. Individual behaviour in our model can be described by a threshold finite automaton, so that the local interaction structure gives rise to a threshold automata network. A classical result establishes that in the case of a finite population, the sequence of states generated by the threshold automata network converges either to a fixed point or to a two-period limit cycle (Goles and Olivos, 1980). In the present paper, we are primarily interested in describing this sequence in a detailed way, not merely its limiting behaviour. In case there is contagion, our aim is to describe the step-by-step contagion process which evolves before the sequence of states reaches a fixed point (i.e., a Nash equilibrium in a game-theoretic specification), where each agent chooses the same action. In other words, we want to address the question how contagion occurs in addition to the question whether it occurs. Our approach is distinct from that of Morris (2000) and Lee and Valentinyi (2000), who analyse contagion as an asymptotic behaviour of the system. Morris (2000) studies contagion in an infinite population. Taking into account a large class of local interaction structures, Morris characterizes the critical contagion threshold which ensures that contagion occurs (from a finite group of agents) for any threshold p less than the critical one. Among other results, Morris establishes that the critical contagion threshold is at most 1/2 in all local interaction structures. By contrast, our focus on finite local interaction structures allows us to take into consideration contagion thresholds strictly greater than 1/2. Lee and Valentinyi (2000) consider a similar context: a finite but sufficiently large population located on a torus. They analyse contagion in the case of random
initial conditions. It is assumed that initially each agent has a positive probability of choosing each action. Following Morris, Lee and Valentinyi identify the critical contagion threshold that ensures that contagion occurs in the long run. In particular, they show that the critical contagion threshold is exactly 1/2. Instead of considering random initial configurations, we derive properties of the initial configurations which are compatible with contagion. Another kind of model dealing with contagion on a finite local interaction structure is found in Flocchini et al. (2001, 2004) and Peleg (1998, 2002). In the context of distributed systems, these authors study the impact that a set of initial faulty elements can have. They consider a majority rule: if the majority of its neighbours is faulty, then an element becomes faulty. They investigate the question of the size of the set of faulty elements which leads the entire system to a faulty behaviour. Flocchini et al. (2001, 2004) take into account two classes of communication topologies of the system: chordal rings and tori (toroidal mesh, torus cordalis and torus serpentinus). They give bounds on the size of the sets of initial faulty elements needed to lead the system to a faulty behaviour. Finally, Peleg (1998, 2002) examines various models of majority rule and formulates some general results for the class of connected graphs. In particular, he derives a tight bound on the minimum number of initial faulty elements. We depart from these works in several ways. First of all, our purpose is to describe the entire contagion process, not only the initial configurations which allow contagion. Thus we do not focus on the determination of the minimal size of the set of initial vertices allowing contagion. Second, our approach covers a broader class of local interaction structures than Flocchini et al. (2001, 2004), even if we restrict our applications to the case of tori. Third, our analysis is not restricted to the case of a majority rule. Fourth, Flocchini et al. (2001, 2004) often assume irreversible dynamics: when the contagion of a particular action (say, action 0) is studied, it is assumed that once an agent chooses action 0 this choice is permanent, independently of the actions chosen in the neighbourhood. On the contrary, we focus on reversible dynamics, where the choice of an action by an agent always depends on the actions played in his neighbourhood. Our general framework is particularly suited for analysing contagion in regular finite local interaction structures. This framework is based on the concept of dominating set developed in graph theory (Haynes et al., 1998). The concept of dominating set proves useful because every step of contagion can be analysed in terms of domination. We obtain a complete description of the step-by-step contagion mechanism and we identify which qualitative features of finite local interaction structures are associated with contagion. This allows us to identify as well some qualitative features of finite local interaction structures which may prevent contagion. We establish the following results.
1. We describe contagion, which in our setting occurs in a finite number of steps, if at all.
2. Using the above description we deduce simple necessary conditions on the initial group of agents from which contagion may occur. These conditions are
both quantitative and qualitative in nature.1 They help discern contagion which occurs from an initial group of agents partitioned into several neighbourhoods.
3. We examine contagion exhibiting the property of optimality.
4. We identify the class of local interaction structures where different agents may choose different actions forever (i.e., heterogeneous fixed points).
5. We identify the class of local interaction structures where two-period limit cycles may occur.
6. Considering a torus, we study how contagion becomes difficult when p increases. Moreover, we identify conditions on p under which heterogeneous fixed points and two-period limit cycles may occur.
The set-up of this paper is as follows. Section 2 deals with preliminaries regarding contagion, threshold automata networks and domination in graphs. In Section 3 we describe the step-by-step contagion mechanism. We also derive necessary conditions for contagion and consider optimal contagion. In Section 4 we provide a complete characterization of the class of local interaction structures where different agents may choose different actions forever and of the class of local interaction structures where two-period limit cycles may occur. Section 5 contains applications of our results to the case of (two-dimensional) tori. The last section concludes with some remarks on assumptions that are made in this paper.

2. Preliminaries

We describe connected local interaction structures using terminology and notation related to graph theory. Let V = {0, 1, …, n−1} ⊂ ℕ be a finite population of agents of size n and let G(V, E) be an undirected graph, where V or V(G) denotes the set of vertices (nodes) of G and E or E(G) denotes the set of edges of G. The adjacency matrix A(G) is the symmetric matrix [a_ij] whose rows and columns are indexed by the vertices of G, a_ij ∈ {0, 1} being the number of edges of G joining vertices i and j. We assume that a_ii = 0 for each i ∈ V(G), so that no agent is linked with himself. Each i ∈ V(G) has a set of neighbours denoted by N_i = {j ∈ V(G) : a_ij = 1}. When j ∈ V(G) is such that j ∈ N_i, we say that i and j are adjacent or neighbours. Let d_i(G) be the degree of i in G, i.e. d_i(G) = |N_i|. We define a connected local interaction structure as a connected graph G(V, E) where d_i(G) < n − 1 for at least one i ∈ V(G). Each i ∈ V(G) has a set of available actions S_i. We restrict ourselves to the case of binary choices, i.e. S_i = {0, 1} for all i ∈ V(G). S = S_0 × ⋯ × S_{n−1} = {0, 1}^n is the set of states, with generic elements s = (s_0, s_1, …, s_i, …, s_{n−1}). Let t = 0, 1, 2, … denote successive time periods. At each t, every i ∈ V(G) updates his action according to a threshold rule. A threshold rule is defined by a real
1 Flocchini et al. (2001) also present certain qualitative properties for a class of chordal rings.
number p ∈ (0, 1) and a threshold ⌈p d_i(G)⌉ for each i ∈ V(G), such that action 0 is chosen by i at period t if at least ⌈p d_i(G)⌉ of his neighbours played 0 at period t−1, where ⌈a⌉ denotes the smallest integer greater than or equal to a ∈ ℝ. Symmetrically, action 1 is chosen by agent i ∈ V(G) if at least ⌈(1−p) d_i(G)⌉ of his neighbours played 1 at period t−1. In case of a tie between the two actions, i.e. when ⌈p d_i(G)⌉ + ⌈(1−p) d_i(G)⌉ = d_i(G), we assume without loss of generality that i chooses action 0. Note that this family of threshold rules includes a large class of rules of thumb such as myopic optimization, imitation or majority rule in the context of spatial strategic interaction. For instance, assume that each agent is matched with his neighbours to play a symmetric two-person coordination game characterized by the payoff matrix

            0           1
   0     (a, a)      (b, c)
   1     (c, b)      (d, d)
with a, b, c, d ∈ ℝ, a > c and d > b. This game has two strict Nash equilibria, (0, 0) and (1, 1). Assume that each agent uses a myopic best response rule. Action 0 is a best response for some agent exactly if he assigns probability at least

p = (d − b) / (a − c + d − b)

to the other agent choosing action 0. This rule is similar to that in Morris (2000) and Berninghaus and Schwalbe (1996). In the case of p = 1/2, the threshold rule coincides with the majority rule analyzed by Flocchini et al. (2001, 2004) and Peleg (1998, 2002). This kind of individual behaviour can also be described by means of a finite threshold automaton, such that the connected local interaction structure G(V, E) gives rise to a threshold automata network. The local transition function of the automaton i ∈ V(G) is given by:

s_i = 1( Σ_{j∈V(G)} a_ij s_j − (1−p) d_i(G) ),

where 1(x) = 1 if x > 0 and 0 otherwise. For t = 0, 1, 2, …, each automaton i ∈ V(G) computes an action, i.e.

s_i^{t+1} = 1( Σ_{j∈V(G)} a_ij s_j^t − (1−p) d_i(G) ),

s^t ∈ S. The tuple N = (G(V, E), S, p, 1) defines a threshold automata network. The main focus of inquiry will be the sequence of states generated by iteration of the threshold automata, given particular assumptions for the initial state s^0.
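As a minimal illustration (the function and variable names are ours), one synchronous update of such a network can be coded directly from the transition function:

```python
import numpy as np

def step(A, s, p):
    # One synchronous update of the threshold automata network
    # N = (G(V,E), S, p, 1): agent i plays 1 next period exactly when
    # the number of his neighbours currently playing 1 exceeds
    # (1 - p) * d_i(G); by the tie-breaking rule, ties go to action 0.
    degrees = A.sum(axis=1)
    ones_around = A @ s            # number of neighbours playing 1
    return (ones_around > (1 - p) * degrees).astype(int)
```

Iterating step from an initial state s^0 generates the sequence of states studied in the remainder of the paper.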
Since S is finite, the sequence of states has only two possible modes of asymptotic behaviour. Indeed, after a certain number of periods, the sequence must reach a state which has been encountered before. That is, we have s^{t+τ} = s^t for some τ ∈ ℕ. In such a case the sequence starts repeating itself and therefore runs into a limit cycle. When the limit cycle has period 1, i.e. τ = 1, we shall say that s is a fixed point. If τ > 1, then the sequence runs into a limit cycle of period τ. Without loss of generality, we will study how action 0 can spread contagiously on a connected local interaction structure G(V, E) from some initial subset V0(G) of V(G). We are going to introduce some additional useful notions related to graph theory.2 Let D be a subset of V(G). D is a dominating set in G(V, E) if for every vertex i ∈ V(G) − D there exists (at least) one vertex j ∈ D such that j is adjacent to i, i.e. |N_i ∩ D| ≥ 1 for all i ∈ V(G) − D. We are going to use several special properties concerning a dominating set. First, we introduce a restriction directly on the dominating set. A set D is a total dominating set in G(V, E) if for every vertex i ∈ V(G) there exists (at least) one vertex j ∈ D such that i is adjacent to j, i.e. |N_i ∩ D| ≥ 1 for all i ∈ V(G). Second, we impose a restriction on the kind of domination. Let l ∈ ℕ. A set D is an l-dominating set in G(V, E) if for every vertex i ∈ V(G) − D, |N_i ∩ D| ≥ l. Third, it is possible to combine both previous properties: we want every vertex to be multiply dominated. A set D is an l-tuple dominating set in G(V, E) if each vertex i ∈ V(G) is dominated by at least l vertices in D, i.e. |N_i ∩ D| ≥ l for all i ∈ V(G). Recall that a graph G′(V′, E′) is a subgraph of G(V, E) if V′(G′) ⊆ V(G) and E′(G′) ⊆ E(G). We say that G′(V′, E′) is the induced subgraph on V′ if it contains all edges of E(G) which have both of their vertices in V′. In order to see (intuitively) why the concept of dominating set proves useful, consider a population of 10 agents located at the vertices of a Petersen graph depicted in the following figure:

[Figure: the Petersen graph, with vertices labelled 0, 1, …, 9]
At every period, each agent chooses between action 0 and action 1. Assume that each agent chooses action 0 if at least two of his neighbours play 0. Consider an initial state where seven agents choose action 0. Contagion of action 0 from
2 For a comprehensive introduction to the topic of domination in graphs, cf. Haynes et al. (1998).
this initial set of agents is not guaranteed. The distinction between an initial state from which contagion occurs and an initial state from which contagion does not occur may be expressed in terms of dominating sets. First, suppose that the set of agents initially choosing action 0 is D1 = {0, 2, 3, 5, 6, 7, 9}. At the following period, the set of agents choosing action 0 is D2 = {1, 2, 4, 5, 7, 8, 9}. And, at the next period, the set of agents choosing action 0 is D1 again. Observe that, at every period, the set of agents choosing action 0 is a 2-dominating set in the Petersen graph, i.e. each agent who does not belong to the 2-dominating set has at least two neighbours playing 0. The sequence of dominating sets alternates between D1 and D2, which means that contagion does not occur. Alternatively, suppose that the set of agents initially choosing action 0 is D′1 = {2, 3, 5, 6, 7, 8, 9}. At the following period, the set of agents choosing action 0 is D′2 = {1, 2, 3, 4, 5, 6, 7, 8, 9}. Next, each agent chooses action 0 forever. The set D′1 is a 2-dominating set in the induced subgraph on the subset of vertices V1 = {1, 2, …, 8, 9}. The set D′2 is a 2-tuple dominating set (rather than just a 2-dominating set) in the Petersen graph, i.e. each agent has at least two neighbours playing 0. This example suggests that, from an initial state, certain domination properties on subgraphs of the underlying graph are needed to obtain contagion after a finite number of steps.

3. Contagion in regular graphs

We restrict our analysis to the case of regular graphs. That is, we consider the class of connected local interaction structures G(V, E) for which there exists r ∈ ℕ such that d_i(G) = r for all i ∈ V(G). This implies that for a regular graph, there exists T ∈ ℕ with ⌈p d_i(G)⌉ = T for all i ∈ V(G). Our main aim is to describe contagion by using the concept of dominating set. We provide a description of contagion on G(V, E) according to the number of steps necessary to obtain a contagion. To clarify the relationships between contagion and the notions taken from domination in graphs, it is convenient to distinguish between a contagion which occurs in one step and a contagion which occurs in m > 1 steps, where m ∈ ℕ.

Proposition 1. Let N = (G(V, E), S, p, 1) be a threshold automata network. Action 0 spreads in one step from some initial subset of agents V0(G) to the entire population if and only if V0(G) is a T-tuple dominating set D1 in G(V, E).
Proof. (⇒) Consider a subset V0(G) of V(G) which is a T-tuple dominating set D1 in G(V, E). Trivially, such a subset always exists in a regular graph of degree r. Consider a state s^0 where each i of V0(G) chooses action 0. By definition of a T-tuple dominating set, |N_i ∩ D1| ≥ T for all i ∈ V(G). Consequently, each i ∈ V(G) chooses s_i^1 = 0.
(⇐) Consider a state s^1 where s_i^1 = 0 for all i ∈ V(G). This implies that each i ∈ V(G) has at least T neighbours choosing action 0 at period 0. It follows that there exists a subset of agents choosing action 0 at period 0 which constitutes a T-tuple dominating set D1 in G(V, E). Then in particular, V0(G) is a T-tuple dominating set D1 in G(V, E). □

Now, we can go further by considering a more complex contagion which occurs in m > 1 steps.

Proposition 2. Let N = (G(V, E), S, p, 1) be a threshold automata network. Action 0 spreads in m > 1 steps from some initial subset of agents V0(G) to the entire population if and only if there exists a sequence of m subgraphs G1(V1, E1), …, Gm(Vm, Em) of G(V, E) such that:
1. for h = 1, …, m−1, there exists a T-dominating set Dh in Gh(Vh, Eh), and D1 = V0(G),
2. Gm(Vm, Em) = G(V, E) and there exists a T-tuple dominating set Dm ⊆ V(G) in G(V, E), and
3. for h = 1, …, m−1, Dh+1 ⊆ Vh(Gh) ⊆ Dh ∪ Ah, where Ah = {i ∉ Dh : |N_i ∩ Dh| ≥ T}.

Proof. (⇒) Consider a state s^0 such that each i ∈ V0(G) chooses action 0. By condition 1, V0(G) is a T-dominating set D1 in G1(V1, E1). It follows that every i ∈ V1(G1) such that |N_i ∩ D1| ≥ T chooses action 0 at period 1. If m = 2, then by conditions 2 and 3, a subset of this set of agents constitutes a T-tuple dominating set D2 in G2(V2, E2) = G(V, E). And contagion occurs at period 2. If m > 2, then by conditions 1 and 3, a subset of the set of agents i ∈ V1(G1) such that |N_i ∩ D1| ≥ T is a T-dominating set D2 in G2(V2, E2). Continuing in this fashion, a subset of the set of agents i ∈ V_{h−1}(G_{h−1}) such that |N_i ∩ D_{h−1}| ≥ T is a T-dominating set Dh in Gh(Vh, Eh), for every h = 3, …, m−1, by conditions 1 and 3. And, each agent of Dh chooses action 0 at period h−1. Finally, a subset of the set of agents i ∈ V_{m−1}(G_{m−1}) such that |N_i ∩ D_{m−1}| ≥ T is a T-tuple dominating set Dm in Gm(Vm, Em) = G(V, E), by conditions 2 and 3. And, each agent of Dm chooses action 0 at period m−1. We conclude that s_i^m = 0 for all i ∈ V(G).

(⇐) We proceed by induction. Assume that contagion occurs at period m. This implies that each i ∈ V(G) has at least T neighbours choosing action 0 at period m−1. Thus, there must exist a set Dm ⊆ V(G) of agents choosing action 0 at period m−1 that satisfies |N_i ∩ Dm| ≥ T for all i ∈ V(G). Thus, Dm is a T-tuple dominating set in G(V, E). At the previous period m−2, each i of Dm must have at least T neighbours choosing action 0, i.e. there must exist a set of agents D_{m−1} choosing action 0 that satisfies |N_i ∩ D_{m−1}| ≥ T for all i ∈ Dm. This means that D_{m−1} is a T-dominating set in a subgraph G_{m−1}(V_{m−1}, E_{m−1}) which contains Dm. If
m = 2, then D_{m−1} = D1. For m > 2, inductively, assume that conditions 1 and 3 are satisfied at period m − h, h ≤ m − 1. Then, there exists a set of agents choosing action 0 which is a T-dominating set D_{m−h+1} in G_{m−h+1}(V_{m−h+1}, E_{m−h+1}). It follows that each i ∈ D_{m−h+1} has at least T neighbours choosing action 0 at period m−h−1. Thus, there must exist a set of agents D_{m−h} choosing action 0 at period m−h−1 that satisfies |N_i ∩ D_{m−h}| ≥ T for all i ∈ D_{m−h+1}. It follows that D_{m−h} is a T-dominating set in G_{m−h}(V_{m−h}, E_{m−h}) which contains D_{m−h+1}. This completes the proof. □

From the above description, we can derive some necessary conditions on the initial configuration for contagion. Let Z be the minimal number of neighbours choosing action 1 which guarantees that agent i chooses action 1, given the tie-breaking rule introduced in Section 2.

Proposition 3. Let N = (G(V, E), S, p, 1) be a threshold automata network. Action 0 spreads from some initial subset of agents V0(G) to the entire population only if:
1. |V0(G)| ≥ T;
2. |V1(G1) ∩ (D1 ∪ A1)| ≥ T; and
3. V̄1 = V(G) − V1(G1) does not contain a Z-tuple dominating set D′1 in the induced subgraph Ḡ1(V̄1, Ē1).
144
Jacques Durieu, Hans Haller and Philippe Solal
on the initial state of contagion. However, similar conditions are necessary at every step of contagion. Finally, we examine the specific class of optimal contagions. The property of optimality of contagion refers to the question of how many agents choosing action 0 are just sufficient to obtain a contagion. A contagion is optimal if the number of agents initially choosing action 0 is minimal. Using Propositions 2 and 3, we derive the following result.

Corollary 1. An optimal contagion occurs in m steps from an initial subset V0(G) if there exists a sequence of m subgraphs G1(V1, E1), …, Gm(Vm, Em) that satisfies conditions 1, 2 and 3 of Proposition 2 and if |V0(G)| = T.

Furthermore, it is possible to formulate a necessary condition for the optimality of contagion.

Proposition 4. Let N = (G(V, E), S, p, 1) be a threshold automata network. A contagion which occurs from an initial subset V0(G) is optimal only if there does not exist a subset D′1 such that:
1. D′1 is a T-dominating set in a subgraph G′1(V′1, E′1),
2. V0(G) ⊆ V′1 ⊆ D′1 ∪ A′1, and
3. |D′1| < |V0(G)|.
Proof. We proceed by contradiction. Assume that there exists a subset D′1 which satisfies the above conditions 1, 2, and 3. Then |N_i ∩ D′1| ≥ T for every agent i ∈ V0(G), by conditions 1 and 2. If each agent i ∈ D′1 chooses action 0 at period 0, then each agent i ∈ V0(G) chooses action 0 at period 1. By hypothesis, this implies that contagion occurs. But, since |D′1| < |V0(G)|, the contagion which occurs from V0(G) is not optimal. □

Observe that the above necessary condition can be extended to the existence of any subset of size smaller than |V0(G)| which implies V0(G) (in the sense of condition 2) through a sequence of T-dominating sets.

4. Fixed points and two-period limit cycles

To go a step further in the study of contagion in regular graphs, we are going to examine the alternative modes of asymptotic behaviour of the sequence of states generated by iteration of the threshold automata. Using the concepts of domination in graphs, we provide a characterization of these alternative modes. In this way, we are able to identify some properties of local interaction structures which may prevent contagion. To this end, it is useful to introduce the following general result.

Theorem 1. (Goles and Olivos, 1980) Let N = (G(V, E), S, p, 1) be a threshold automata network and consider an arbitrary initial state s^0 ∈ S. The sequence of
states generated by iteration of the threshold automata converges either to a fixed point or to a two-period limit cycle.

Using the above theorem, it is possible to distinguish between four possible modes of asymptotic behaviour. First, a fixed point is homogeneous if every agent chooses the same action. Second, a fixed point is heterogeneous if there is a subset of agents choosing action 0 whereas other agents choose action 1. Third, a limit cycle is a global two-period limit cycle if all agents are involved in the cycle and change their action at every period. Fourth, other limit cycles will be called local two-period limit cycles, i.e. only a subset of agents is involved in the cycle whereas the other agents in the population choose one action forever. Trivially, a contagion occurs only if the sequence of states reaches a fixed point belonging to the first class of behaviour. In the sequel, we are going to give properties of regular local interaction structures such that a heterogeneous fixed point, a global or a local two-period limit cycle may occur. Then, it is easy to derive properties guaranteeing that contagion occurs. The following result identifies a class of regular local interaction structures which admit heterogeneous fixed points.
Proposition 5. Let N = (G(V, E), S, p, 1) be a threshold automata network. N admits a heterogeneous fixed point if and only if there is a partition of V(G) into two sets V1 and V̄1 such that:
1. V1 contains a T-tuple dominating set D1 in the induced subgraph on V1, G1(V1, E1), and
2. V̄1 contains a Z-tuple dominating set D′1 in the induced subgraph on V̄1, Ḡ1(V̄1, Ē1).
Proof. (⇒) Consider a state such that each agent i ∈ V1 chooses action 0 and each agent j ∈ V̄1 chooses action 1. By condition 1 (condition 2), each agent i ∈ V1 (j ∈ V̄1) satisfies |N_i ∩ D1| ≥ T (|N_j ∩ D′1| ≥ Z). Hence, such a state is a fixed point which protects diversity of actions.

(⇐) Consider a heterogeneous fixed point. In such a state, each agent i choosing action 0 must have T neighbours choosing action 0 and each agent j choosing action 1 must have Z neighbours choosing action 1. That is, the set of agents choosing action 0, V1, must contain a subset of agents, say D1, such that |N_i ∩ D1| ≥ T for each agent i choosing action 0. And, the set of agents choosing action 1, V̄1, must contain a subset of agents, say D′1, such that |N_j ∩ D′1| ≥ Z for each agent j choosing action 1. This means that V1 must contain a T-tuple dominating set in the induced subgraph on V1, G1(V1, E1), and that V̄1 must contain a Z-tuple dominating set in the induced subgraph on V̄1, Ḡ1(V̄1, Ē1).
Thus, there exists a partition of V(G) such that conditions 1 and 2 are satisfied. □

Proceeding in a similar way, in the following proposition we characterize a class of regular local interaction structures which admit global two-period limit cycles. Let W̄ = max{T, Z}. That is, W̄ is the minimal number of i's neighbours choosing an action which guarantees that agent i chooses this action, given the tie-breaking rule introduced in Section 2. And, let W = min{T, Z}.
Proposition 6. Let N = (G(V, E), S, p, 1) be a threshold automata network. N admits a global two-period limit cycle if and only if there is a partition of V(G) into two sets V1 and V̄1 such that:
1. V1 is a W̄-dominating set D1 in G(V, E), and
2. V̄1 is a W̄-dominating set D′1 in G(V, E).
Proof. (⇒) Assume that there exists a partition of V(G) into two sets V1 and V̄1 such that conditions 1 and 2 are satisfied. Observe that, since T + Z > r, we have |N_i ∩ D1| < W for every agent i ∈ V1 and |N_i ∩ D′1| < W for every agent i ∈ V̄1. Consider a state s^0 such that each agent i ∈ V1 chooses action 0 and each agent i ∈ V̄1 chooses action 1. By conditions 1 and 2, |N_i ∩ D′1| ≥ W̄ for every agent i ∈ V1 and |N_i ∩ D1| ≥ W̄ for every agent i ∈ V̄1. Then, at period 1, each agent changes his action. A similar argument shows that the same conclusion holds at period 2. Hence, s^2 = s^0, and the sequence of states runs into a two-period limit cycle.

(⇐) Assume that the sequence of states runs into a global two-period limit cycle. Let V1 be the set of agents choosing action 0 and V̄1 be the set of agents choosing action 1 at period 0. By definition of a global two-period limit cycle, each agent changes his action at every period. This implies that, for every agent i ∈ V1, we must have |N_i ∩ V̄1| ≥ Z at period 0 and |N_i ∩ V̄1| ≥ T at period 1. Thus, V̄1 must be a W̄-dominating set D′1 in G(V, E). Similarly, for every agent i ∈ V̄1 we must have |N_i ∩ V1| ≥ T at period 0 and |N_i ∩ V1| ≥ Z at period 1. Thus, V1 must be a W̄-dominating set D1 in G(V, E). □

By using Propositions 5 and 6, it is possible to derive a characterization for the class of local interaction structures which admit local two-period limit cycles (see Durieu et al., 2004). Indeed, some of these local two-period limit cycles can be viewed as a mixed configuration between a two-period limit cycle and a fixed point which is heterogeneous. And, it is also possible to obtain a characterization for the class of local interaction structures where, apart from the subset of agents involved in the two-period limit cycle, there exists a homogeneous fixed point.
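These modes are easy to observe in code on the Petersen-graph example of Section 2. The sketch below is our own construction; it assumes the standard labelling of the Petersen graph (outer cycle 0-4, inner vertices 5-9, spokes joining i and i+5), which reproduces the sets used in that example. Started from D1 the state alternates forever between D1 and D2, a two-period limit cycle, whereas started from D′1 it reaches the homogeneous fixed point.

```python
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),   # outer cycle
         (0, 5), (1, 6), (2, 7), (3, 8), (4, 9),   # spokes
         (5, 7), (7, 9), (9, 6), (6, 8), (8, 5)]   # inner 5-cycle (pentagram)

nbrs = {v: set() for v in range(10)}
for a, b in edges:
    nbrs[a].add(b)
    nbrs[b].add(a)

def step(zeros, T=2):
    # a vertex plays 0 next period iff at least T neighbours play 0 now
    return {v for v in nbrs if len(nbrs[v] & zeros) >= T}

for start in [{0, 2, 3, 5, 6, 7, 9},     # D1: alternates with D2 forever
              {2, 3, 5, 6, 7, 8, 9}]:    # D'1: contagion of action 0
    state = start
    for t in range(4):
        state = step(state)
        print(t + 1, sorted(state))
    print()
```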
5. Applications

In this section, we focus our attention on a specific class of regular local interaction structures: (two-dimensional) tori. Our aim is to explore the usefulness of our framework when a particular interaction structure is considered. First, let us recall the definition of a torus. A ring of order n > 2, denoted by Cn, is a connected graph G(V, E) with d_i(G) = 2 for all i ∈ V(G).3 A torus of a × b = n vertices, denoted by Tn, is the regular graph G(V, E) defined by the cartesian product of two rings Ca and Cb. That is, d_i(G) = 4 for each i ∈ V. In the sequel, we assume that a, b ≥ 3 in order to have four distinct neighbours for each i ∈ V. For example, the torus of 12 vertices obtained by the cartesian product of C4 and C3 is represented as follows:

   00  01  02  03
   10  11  12  13
   20  21  22  23

   C4 ⊗ C3 = T12
Contagion on a torus. In order to illustrate the result of Proposition 2, we commence with an example of contagion on a torus T12. Let p ≤ 1/4, so that T = 1. Assume that |V0(G)| = 1; without loss of generality assume that V0(G) = {11}. Observe that this vertex constitutes a dominating set D1 in the induced subgraph G1(V1, E1) on V1 = {01, 10, 11, 12, 21}. Then the set of vertices {01, 10, 12, 21} is a dominating set D2 in the induced subgraph G2(V2, E2) on V2 = {00, 01, 02, 10, 11, 12, 13, 20, 21, 22}. The subset of vertices {00, 01, 02, 11, 13, 20, 21, 22} is a dominating set D3 in the induced subgraph G3(V3, E3) on V3 = V(G), i.e. G3(V3, E3) = G(V, E). Finally, the subset of vertices {00, 01, 02, 03, 10, 11, 12, 20, 21, 22, 23} is a total dominating set D4 in T12. In other words, contagion occurs in four steps from the set V0(G).
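This four-step process can be replayed directly; a minimal sketch (the naming is ours), with vertices written as (row, column) pairs on the C4 ⊗ C3 torus and threshold T = ⌈4p⌉ = 1 for p ≤ 1/4:

```python
ROWS, COLS = 3, 4                      # T12 = C4 (x) C3, vertices labelled "rc"

def neighbours(r, c):
    return [((r - 1) % ROWS, c), ((r + 1) % ROWS, c),
            (r, (c - 1) % COLS), (r, (c + 1) % COLS)]

def step(zeros, T=1):
    # reversible threshold update: a vertex plays 0 next period
    # iff at least T of its four neighbours play 0 now
    return {(r, c)
            for r in range(ROWS) for c in range(COLS)
            if sum(n in zeros for n in neighbours(r, c)) >= T}

zeros = {(1, 1)}                       # V0(G) = {11}
for t in range(1, 5):
    zeros = step(zeros)
    print(t, sorted(f"{r}{c}" for r, c in zeros))
```

The printed sets should match D2, D3, D4 and, after the fourth step, the whole vertex set.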
3 C2 is the complete graph of order 2.
For an illustration see the following figure, where each vertex belonging to a dominating set is a black vertex and each dominated vertex is in a box.

[Figure: Contagion on a T12]
Moreover, by Corollary 1, observe that the above contagion is optimal. ♦

The following example demonstrates how one can determine in complex situations that contagion fails, by showing that not all conditions of Proposition 2 are satisfied. Consider a torus T64 defined by the cartesian product of two rings C8. Let p = 1/2, so that T = 2. Consider the sets D4 = {01, 03, 05, 07, 10, 12, 14, 16, 21, 23, 25, 27, 30, 32, 34, 36, 41, 43, 45, 47, 50, 52, 54, 56, 61, 63, 65, 67, 70, 72, 74, 76} and D5 = {00, 02, 04, 06, 11, 13, 15, 17, 20, 22, 24, 26, 31, 33, 35, 37, 40, 42, 44, 46, 51, 53, 55, 57, 60, 62, 64, 66, 71, 73, 75, 77}. D4 and D5 are 2-dominating sets in G(V, E). It is easy to see that the sequence of 2-dominating sets in G(V, E) alternates between D4 and D5 whenever one of the two sets is reached from some V0(G). In other words, condition 2 of Proposition 2 is never satisfied and contagion does not occur from V0(G). For instance, assume that V0(G) = {00, 11, 22, 33, 44, 55, 66, 77}. This set is a 2-dominating set D1 in the induced subgraph G1(V1, E1) on V1 = {00, 01, 07, 10, 11, 12, 21, 22, 23, 32, 33, 34, 43, 44, 45, 54, 55, 56, 65, 66, 67, 70, 76, 77}. Then, the set {01, 07, 10, 12, 21, 23, 32, 34, 43, 45, 54, 56, 65, 67, 70, 76} is a 2-dominating set D2 in the induced subgraph G2(V2, E2) on V2 = {00, 01, 02, 06, 07, 10, 11, 12, 13, 17, 20, 21, 22, 23, 24, 31, 32, 33, 34, 35, 42, 43, 44, 45, 46, 53,
54, 55, 56, 57, 60, 64, 65, 66, 67, 70, 71, 75, 76, 77}. The set {00, 02, 06, 11, 13, 17, 20, 22, 24, 31, 33, 35, 42, 44, 46, 53, 55, 57, 60, 64, 66, 71, 75, 77} is a 2-dominating set D3 in the induced subgraph G3(V3, E3) on V3 = V(G) − {04, 15, 26, 37, 40, 51, 62, 73}. Finally, D4 is a 2-dominating set in the subgraph G4(V4, E4) induced on V4 = V(G), i.e. G4(V4, E4) = G(V, E). Hence, contagion does not occur from V0(G). ♦

Finally, the conditions in Proposition 3 allow us to explain how contagion becomes difficult when p increases and to make some comparisons with the results obtained by Flocchini et al. (2004) and Morris (2000). Consider a torus of a × b = n vertices. If p ≤ 1/4, then T = 1 and Z = 4. By condition 1 in Proposition 3, V0 is composed of at least one vertex. And, condition 3 of Proposition 3 is always satisfied, since there does not exist a subset of V̄1(Ḡ1) which is a 4-tuple dominating set in the induced subgraph Ḡ1(V̄1, Ē1). If 1/4 < p ≤ 1/2, then T = 2 and Z = 3. By condition 1 in Proposition 2 and condition 1 in Proposition 3, V0 is composed of at least two vertices i, j such that N_i ∩ N_j ≠ ∅. Moreover, any subset of vertices composed of at least two adjacent rows (or columns) is (or contains) a 3-tuple dominating set in the induced subgraph on these adjacent rows (or columns). This implies that any V0(G) which is composed of a′ ≤ a − 2 adjacent rows and/or b′ ≤ b − 2 adjacent columns does not satisfy condition 3 of Proposition 3. And the same result holds for every V″(G) ⊆ V0(G). Note that Flocchini et al. (2004) obtain a similar result when p = 1/2 (Lemma 1, p. 201). If 1/2 < p ≤ 3/4, then T = 3 and Z = 2. By condition 1 in Proposition 2 and condition 1 in Proposition 3, V0 is composed of at least three vertices i, j and k such that N_i ∩ N_j ∩ N_k ≠ ∅. Moreover, any subset of vertices composed of at least one row (or column) is a 2-tuple dominating set in the induced subgraph on this row. This implies that any V0(G) which is composed of a′ ≤ a − 1 rows and/or b′ ≤ b − 1 columns does not satisfy condition 3 of Proposition 3. And the same result holds for every V″(G) ⊆ V0(G). Thus, contagion occurs only if V0(G) is such that each row and each column contains at least one agent choosing action 0.4 If 3/4 < p, then T = 4 and Z = 1. Since the only subset of V(G) which is a 4-tuple dominating set in G(V, E) is V(G) itself, then, by condition 2 of Proposition 2, contagion cannot occur. ♦
4. In such a case, contagion in the sense of Morris, i.e. an action spreading from a finite group of agents to an infinite population, cannot occur (see Morris, 2000, Proposition 3, p. 69). Indeed, on an infinite local interaction structure, condition 3 of our Proposition 3 implies that V0(G) is infinite.
And any two adjacent rows (or columns) of agents choosing action 1 form a 3-tuple dominating set in the induced subgraph on these rows (or columns) of agents. Thus, any configuration where a′ adjacent rows of agents choose action 1, with 2 ≤ a′ < a (or b′ adjacent columns of agents choose action 1, with 2 ≤ b′ < b), and the other agents choose action 0, is a heterogeneous fixed point. Finally, if 1/2 < p ≤ 3/4, then T = 3 and Z = 2. The above description of the configurations which are heterogeneous fixed points (i.e. for 1/4 < p ≤ 1/2) applies here, provided we swap the set of agents choosing action 0 with the set of agents choosing action 1. □
Global two-period limit cycles on a torus. Proposition 6 allows us to identify conditions under which global two-period limit cycles may occur on a torus. Consider a torus Tn defined by the cartesian product of two rings Ca and Cb with a and b even. Consider a configuration where any two adjacent agents choose different actions. The set of agents choosing action 0, V1, is a 4-dominating set in G(V, E), and the set of agents choosing action 1, V̄1, is a 4-dominating set in G(V, E). Thus, this configuration belongs to a global two-period limit cycle for all p ∈ (0, 1). On the contrary, if a and/or b is odd, then there do not exist two distinct 4-dominating sets in G(V, E). Indeed, since a and/or b is odd, there necessarily exist at least two adjacent vertices belonging to the same subset, say V1. Thus, the complementary subset V̄1 cannot be a 4-dominating set in G(V, E). Therefore, by Proposition 6, a global two-period limit cycle cannot occur if p ≤ 1/4 or p > 3/4. If 1/4 < p ≤ 3/4, then W = 3, and global two-period limit cycles may occur if a or b is odd, but not both. Otherwise (i.e. a and b both odd), there do not exist two distinct 3-dominating sets in G(V, E). Indeed, there necessarily exists at least one vertex with two of its neighbours belonging to the same subset, say V1. Thus, the complementary subset V̄1 cannot be a 3-dominating set in G(V, E). Therefore, by Proposition 6, a global two-period limit cycle cannot occur. □
6. Conclusions and ramifications
We conclude with some remarks on the assumptions made in the paper. First, we have restricted our attention to the case of regular local interaction structures. Extending the description of the step-by-step contagion mechanism to the case of non-regular graphs would require a modest modification of the model, by introducing bounds on the multiple domination. Since pd_i(G) may differ from one agent to another, each agent of a subset V′ ⊆ V(G) chooses action 0 if there exists a (max_{i∈V′} pd_i(G))-dominating set in a subgraph including V′. By adding this modification to Proposition 2, it is possible to formulate a sufficient condition for contagion. However, it is no longer possible to obtain a complete characterization of contagion. Second, we have assumed that Si = {0, 1} for all agents i ∈ V(G). In the case of a finite set of actions, Proposition 2 still holds for an action such that each agent chooses this action if at least a proportion p of his neighbours has chosen this action at the previous period.
In particular, if agents use a myopic best-response rule in the context of spatial strategic interaction, then Proposition 2 provides a characterization of contagion for an action which is p-dominant in the sense of Morris et al. (1995). On the other hand, the results obtained in Section 4 no longer hold. Third, we have assumed the same behavioural rule for all agents. Suppose the population consists of agents who follow myopic best-response rules, but some agents' payoffs from interaction with each neighbour are given by a symmetric two-person coordination game with a > c and d > b, whereas the payoffs of other agents are given by an anti-coordination game with a < c and d < b. The first type of behaviour might be called conformist and the second type non-conformist. Such a mixed population does not admit contagion. Moreover, there can be modes of asymptotic behaviour different from fixed points and 2-cycles. Finally, notice that reversibility is a plausible assumption in many cultural, economic and social interactions. The propagation of contagious diseases or of technological defects like computer viruses is more likely to exhibit some irreversibility, which favours contagion, but also other phenomena neglected here which may slow down or prevent contagion: death (dysfunction) and immunity against infection.

Acknowledgements

We thank Richard Baron and Sylvain Béal for helpful comments. Support from the French Ministry for Youth, Research and Education through project SCSHS-2004-04 is gratefully acknowledged.

References

Baron, R., J. Durieu, H. Haller and P. Solal (2002), "Control costs and potential functions for spatial games", International Journal of Game Theory, Vol. 31, pp. 541–561.
Berninghaus, S.K. and U. Schwalbe (1996), "Conventions, local interaction, and automata networks", Journal of Evolutionary Economics, Vol. 6, pp. 297–312.
Blume, L. (1995), "The statistical mechanics of best-response strategy revision", Games and Economic Behavior, Vol. 11, pp. 111–145.
Blume, L. (1997), "Population games", in: B. Arthur, S. Durlauf and D. Lane, editors, The Economy as an Evolving Complex System II, Massachusetts: Addison-Wesley.
Durieu, J., H. Haller and P. Solal (2004), "A characterization of contagion", Mimeo, CREUSET.
Ellison, G. (1993), "Learning, local interaction, and coordination", Econometrica, Vol. 61, pp. 1047–1071.
Flocchini, P., F. Geurts and N. Santoro (2001), "Optimal irreversible dynamos in chordal rings", Discrete Applied Mathematics, Vol. 113, pp. 23–42.
Flocchini, P., E. Lodi, F. Luccio, L. Pagli and N. Santoro (2004), "Dynamic monopolies in tori", Discrete Applied Mathematics, Vol. 137, pp. 197–212.
Goles, E. and J. Olivos (1980), "Periodic behaviour of generalized threshold functions", Discrete Mathematics, Vol. 30, pp. 187–189.
Haynes, T.W., S.T. Hedetniemi and P.J. Slater (1998), Fundamentals of Domination in Graphs, New York: Marcel Dekker.
Lee, I.H. and A. Valentinyi (2000), "Noisy contagion without mutation", Review of Economic Studies, Vol. 67, pp. 47–56.
Morris, S. (2000), "Contagion", Review of Economic Studies, Vol. 67, pp. 57–78.
Morris, S., R. Rob and S. Shin (1995), "p-dominance and belief potential", Econometrica, Vol. 63, pp. 145–157.
Peleg, D. (1998), "Size bounds for dynamic monopolies", Discrete Applied Mathematics, Vol. 86, pp. 263–273.
Peleg, D. (2002), "Local majorities, coalitions and monopolies in graphs: a review", Theoretical Computer Science, Vol. 282, pp. 231–257.
Young, P. (1998), Individual Strategy and Social Structure: An Evolutionary Theory of Institutions, Princeton: Princeton University Press.
CHAPTER 7
Selective Interaction with Reinforcing Preference

Akira Namatame

Abstract

The performance of a collective system depends crucially on the way its agents interact as well as on how they adapt to each other. There are two closely related issues concerning a collective of interacting agents: the forward and the inverse problem. The forward problem is to investigate what complex emergent behavior a collective of interacting agents produces. The inverse problem is to investigate how the private utility functions of agents can be modified so that their self-interested behavior collectively gives rise to a desired outcome. This paper examines the effects of a combined model of partner choice and preference reinforcement in order to solve the inverse problem. Agents choose their partners and decide on a mode of behavior when interacting with the selected partners. Two types of meta-agents are considered: conformists, who behave based on the logic of the majority, and nonconformists, who behave based on the logic of the minority. It is shown that a collection of conformists and nonconformists with initially identical preferences evolves into a collection of heterogeneous agents with diverse preferences, and that these agents achieve an outcome that is both efficient and equitable.
Keywords: cooperation, heterogeneous agents, preference reinforcement, selective interactions
JEL classifications: A12, C79, D83
1. Introduction

The following question is often addressed: how does a collection of interacting agents generate global macroscopic order as a whole? An interesting aspect of interacting agents is emergent properties. Natural evolution has created a multitude of systems in which the actions of interacting agents give rise to coordinated global information processing. Insect colonies, cellular assemblies, the retina, and the immune system have often been cited as examples of systems with emergent properties. Emergent properties are surprising because it can be hard to anticipate the full consequences of even simple forms of interaction. Emergence also refers to the appearance of global information-processing capabilities that are not explicitly represented in a system's elementary components or in their interconnections. However, there is no presumption that a population of interacting agents in an imperfect world leads to a collectively satisfactory result. How well agents do in adapting to their environment is not the same thing as how satisfactory the environment is that they collectively create. Even if all agents understand that the outcome is inefficient, each of them, acting independently, is powerless to bring about an efficient collective outcome. In order to steer collective behavior towards a desirable outcome, we need to consider two different levels: the microscopic level, where the decisions of the individual agents occur, and the macroscopic level, where collective behavior can be observed. We also have to specify how agents interact, respond, adapt, or learn from each other. However, understanding the link between these two levels remains a challenge (Huberman and Glance, 1993; Helbing, 1995). In examining collective effects, we shall draw heavily on individual decisions. It might be argued that understanding how individuals make decisions is sufficient to understand collective behavior. However, many researchers have pointed out that equilibrium theory does not resolve the question of how people behave in a particular interdependent decision situation. It is often argued that "it is hard to see what can advance the discussion short of assembling a collection of people, putting them in the situation of interest, and observing what they do" (Bala and Goyal, 1998, 2000; Banerjee, 1999). People have preferences, pursue goals, and behave in a way that we might call "purposive". We metaphorically ascribe motives to behavior because something behaves as if it were oriented toward a goal. But these purposes or objectives often relate directly to those of others, and behaviors are therefore also constrained by other people who are pursuing their own purposes. Schelling (1978) characterized such behavior as contingent behavior – behavior that depends on what others are doing. What makes contingent behavior interesting and difficult is that the entire aggregate outcome is what has to be evaluated, not merely how each person does within the constraints of her own environment. Although individual decisions are important to understand, they are not sufficient to describe how a collection of agents arrives at specific decisions. We attempt to probe the issue more deeply by specifying how agents interact with
each other. The greatest promise lies in the analysis of situations where many agents behave in ways contingent on one another; such situations are central to the theoretical analysis linking the micro and macro levels of collective decision. The overall collective performance depends crucially on the type of interaction as well as on the heterogeneity of agent preferences. An externality occurs if an agent cares about others' decisions and their decisions also affect her own decision. An agent's outcome, whichever way he makes his choice, also depends on the number of agents who choose one way or the other. An interesting problem is then under what circumstances a collection of agents will realize particular stable situations, and whether these satisfy the conditions of efficiency (Iwanaga and Namatame, 2001, 2002). There are two closely related issues concerning a collective of interacting agents: the forward and the inverse problem (Tumer and Wolpert, 2004). An agent behaves based not only on her preference but also on others' actions. It is also important to consider with whom an agent interacts and how each agent decides her action depending on others' actions. Agents are selfish in the sense that they only do what they want to do and what they think serves their own best interests, their motivations. They necessarily have different sets of goals, motivations, and cognitive states by virtue of their different histories, the different resources they use, the different settings they participate in, and so on.
2. The inverse problem

Agents myopically adapt their behavior to others' behaviors based on their idiosyncratic rules. In the simplest form of our model, agents are born with fixed preferences. They are also assumed to be rational in the sense that they select their strategy in order to maximize their endogenous preferences, as shown in Figure 1(a). However, individuals' behavior should be understood in a social setting. In order to understand their behavior, we must observe them within social and cultural environments. The inverse problem then consists in investigating how their endogenous preferences can be modified through past interactions and strategy choices, as shown in Figure 1(b).

Figure 1. The forward and inverse problems: (a) preference determines strategy choice; (b) strategy choice reinforces preference.
It would be desirable to shape learning by properly redesigning the payment schemes. One possible approach is to consider a mediator who provides the agents with algorithms to use and suggests payments to be made. This correlation device is not a designer who can enforce behaviors or payments, and it does not possess any private knowledge or aim to optimize private payoffs. Game theory is typically concerned with the learning of a strategy (Fudenberg and Levine, 1998). Agents repeatedly play a game, each time observing their rewards, which reflect their prefixed preferences. Typically, these studies have the goal of showing that some simple update rule leads the agents to eventually adopt some Nash equilibrium. However, the learning algorithms themselves are not required to satisfy any rationality requirement; it is what they converge to, if adopted by all agents, that should be in equilibrium. Shoham and Powers (2003) characterized the so-called "AI agenda" as asking what the best learning strategy is for a given agent against a fixed class of other agents in the game. It thus retains the design stance of AI, asking how to design an optimal agent for a given environment. There is another agenda that is more prescriptive: it asks how agents should learn in the context of other learners. The recent literature on multi-agent learning also deals extensively with connections between distributed systems and the design of economic mechanisms. One central issue in that literature is the selection of a solution criterion: what can we assume about the agents' rational behavior? Much of the literature deals with the search for mechanisms in which the agents have dominant strategies that lead to desired behavior (maximizing social efficiency). However, it can be shown that such mechanisms rarely exist. One alternative is to consider mechanisms where there exists an ex post equilibrium: a strategy profile of the agents in which it is irrational to deviate from each agent's learning algorithm, assuming the other agents stick to their strategies, and regardless of the state of the system (Durlauf and Young, 2001; Young, 1998).

3. Classification of social interactions

Social interactions pose many coordination problems to individuals. There are many situations where interacting agents can benefit from coordinating their actions. Examples where coordination is important include trade alliances, the choice of compatible technologies, or conventions such as the choice of a software or language. We can classify interactions with externalities into two types. Coordination usually implies that increased effort by some agents leads the remaining agents to follow them, which gives rise to multiplier or bandwagon effects. These are also characterized as situations where interacting agents can benefit if they take the same action. We call this type social interaction with positive externalities. In this case, agents behave based on the logic of the majority, since agents receive payoffs if they select the same strategy as the majority does. These symmetric social interactions are modeled as coordination games in which an agent receives a payoff if he selects the same strategy as the others.
On the other hand, in the route-selection problem for instance, an agent receives a payoff if he selects the strategy opposite to that of the majority. For example, in the context of traffic networks, agents have to determine their routes independently. In telecommunication networks, they have to decide what fraction of their traffic to send on each link of the network. We call this type of interaction social interaction with negative externalities. In this case, agents behave based on the logic of the minority, since agents receive a payoff if they select the opposite strategy to the majority (Beckmann et al., 1956). The El Farol bar problem and its variants, minority games, provide a clean and simple example of this type of social interaction (Challet and Zhang, 1997). The market entry game is also a stylized representation of a common problem based on the logic of the minority: a number of agents have to choose independently whether or not to undertake some activity, such as entering a market, going to a bar, or driving on a road, the utility of which decreases in the number of participants (Ochs, 1998; Duffy and Hopkins, 2002). The choice of market entry games for studying coordination is quite natural. When there are too many potential entrants wishing to exploit a new market opportunity, a problem arises regarding how entry may be coordinated. Without coordination, too many firms may decide to enter the market and consequently sustain losses. Conversely, fully aware of the consequences of excessive entry, firms may be reluctant to enter and exploit the market in the first place. These examples of social interaction typically admit a large number of Nash equilibria. Given this multiplicity of equilibrium outcomes, an obvious question is which type of equilibrium agents are likely to coordinate upon. Coordination failure is attributed to certain features of the payoff functions.

4. Rational decisions of conformists and nonconformists

In most cases, a decision can be thought of as having a positive and a negative side – deciding to do a thing or not to do it. We model the problems of individual and collective decisions as follows: each agent has two alternatives, and the costs and benefits of each alternative depend on how many other agents choose the same alternative. As a specific example, we consider the situation in which each agent in a population of N agents faces a binary decision problem with the following two choices:

S1: votes for,
S2: votes against.
One's decision to vote for a particular alternative may depend heavily on how many others decide to do so, partly because of social influence, and partly because one does not want to waste her own vote. In this example, the payoff matrix is given in Table 1.
Table 1. The payoff matrix of a conformist

                         The others
  Agent i        S1 (share p(t))   S2 (share 1 − p(t))
  S1             ai + bi           ai
  S2             0                 bi
When ai > 0, agent i personally prefers to vote for, and when ai < 0, she prefers to vote against. The absolute value of ai represents the strength of her preference. The parameter bi (> 0) represents the level of consistency with the others' choices. Agents myopically change their actions based on their own rules, obtained as a function of their idiosyncratic utility and of the actions of their neighbors. The assignment of heterogeneous agents also becomes important, and we investigate the relation between the collective behavior and this assignment.

(1) Rational decision of a conformist. Suppose the ratio of agents choosing S1 is p(t). We define the threshold of conformist i as:

θi = (bi − ai)/((ai + bi) + (bi − ai)) = (1 − ai/bi)/2    (1)

Then the rational decision rule of a conformist is:

(i) p(t) ≥ θi: votes for (S1); (ii) p(t) < θi: votes against (S2).    (2)
(2) Rational decision of a nonconformist. A nonconformist with the payoff matrix of Table 2 prefers the choice opposite to that of the majority. The threshold of nonconformist i is defined as:

θi = (1 + ai/bi)/2    (3)

Then the rational decision rule of a nonconformist is:

(i) p(t) < θi: votes for (S1); (ii) p(t) ≥ θi: votes against (S2).    (4)
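As a quick illustration of rules (1)–(4), the thresholds and decisions can be computed directly. The helper functions below are a minimal sketch of ours, not code from the chapter; they assume b > 0, as in the parameterizations of Tables 1 and 2.

```python
def conformist_threshold(a, b):
    """Eq. (1): a conformist votes for S1 iff p(t) >= theta (assumes b > 0)."""
    return (1 - a / b) / 2

def nonconformist_threshold(a, b):
    """Eq. (3): a nonconformist votes for S1 iff p(t) < theta (assumes b > 0)."""
    return (1 + a / b) / 2

def choose(p, a, b, conformist=True):
    """Rational decision rules (2) and (4); returns 'S1' or 'S2'."""
    if conformist:
        return 'S1' if p >= conformist_threshold(a, b) else 'S2'
    return 'S1' if p < nonconformist_threshold(a, b) else 'S2'

# a conformist with a weak preference for S2 (a = -0.2) still follows
# a clear majority: theta = 0.6 and p = 0.7, so she votes for S1
print(choose(p=0.7, a=-0.2, b=1.0))
```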
The strategic decisions of a conformist and a nonconformist can also be modeled with the single payoff matrix in Table 3. If bi > 0, agent i is a conformist, and if bi < 0, she is a nonconformist. From the rational decision rules (2) and (4), agents with heterogeneous payoffs can be classified into the following four types.
Table 2. The payoff matrix of a nonconformist

                         The others
  Agent i        S1 (share p(t))   S2 (share 1 − p(t))
  S1             ai                ai + bi
  S2             bi                0
Table 3. The payoff matrix of a conformist (bi > 0) or nonconformist (bi < 0)

                         The other agents
  Agent i        S1 (share p(t))   S2 (share 1 − p(t))
  S1             ai + bi           ai
  S2             0                 bi
(1) Type 1: hardcore S1. If agent i is a conformist with payoff parameters 0 ≤ bi ≤ ai, or a nonconformist with −ai ≤ bi ≤ 0, she always chooses S1 regardless of the others' decisions. In these cases, the strategy S1 is a dominant strategy, and this type of agent is a hardcore S1 chooser.
(2) Type 2: hardcore S2. If agent i's payoff parameters satisfy 0 < bi ≤ −ai when she is a conformist, or ai ≤ bi < 0 when she is a nonconformist, she always chooses S2 regardless of the others' decisions. In this case, the strategy S2 is a dominant strategy, and this type of agent is regarded as a hardcore S2 chooser.
(3) Type 3: conformist. If agent i is a conformist (bi > 0) and her payoff parameters satisfy |ai| < bi, her rational decision depends on the proportion of agents who make the same decision. Since she prefers what the majority does, this type of agent is regarded as a conformist.
(4) Type 4: nonconformist. If agent i is a nonconformist (bi < 0) and her payoff parameters satisfy |ai| < −bi, her rational decision depends on the proportion of agents who make the opposite decision. Since she prefers what the majority does not do, this type of agent is regarded as a nonconformist.
In Figure 2, the x-axis represents the parameter ai, each agent's preference level over the two choices (purposive behavior), and the y-axis represents the parameter bi, the level of consistency with the others' decisions (contingent behavior).
Figure 2. Classification of agent types from their idiosyncratic payoffs (ai, bi).
An infinite number of possible decision rules of heterogeneous agents can thus be classified into the above four types.

5. Heterogeneity in preferences

In human societies, an essential element is that individuals differ from each other. This diversity comes into play in many instances of collective behavior. In this section, we depart from the assumption of homogeneity with respect to the payoff parameters. In particular, we consider continuum games with an infinite number of agents that are heterogeneous with respect to their preferences. We assume that a conformist has the payoff matrix in Table 4, and a nonconformist that in Table 5. We remark that the payoff matrices in Tables 1 and 2 are strategically equivalent to those in Tables 4 and 5. The heterogeneity among agents can then be described by a single payoff parameter θ, ensuring an enormous simplification of the present analysis. In our model, agents have an idiosyncratic payoff parameter θ, and the diversity of a collection of heterogeneous agents is characterized by the distribution function of θ. We consider a collection of N agents, and denote the number of agents with the same parameter value θ by n(θ). We define the density of θ by f(θ), which is obtained by dividing n(θ) by the total number of agents N, i.e.

f(θ) = n(θ)/N    (5)
As specific examples, we consider the density functions shown in Figure 3. In Figure 3 (Case 1), all agents have the same payoff parameter value θ = 0.5; in Figure 3 (Case 2), half of the population has the payoff parameter value θ = 0 and the rest have θ = 1.
Table 4. The payoff matrix of a conformist, θ = (1 − a/b)/2 (0 ≤ θ ≤ 1)

                         Choice of other agents
  Choice of agent i      S1        S2
  S1                     1 − θ     0
  S2                     0         θ
Table 5. The payoff matrix of a nonconformist, θ = (1 + a/b)/2 (0 ≤ θ ≤ 1)

                         Choice of other agents
  Choice of agent i      S1        S2
  S1                     0         θ
  S2                     1 − θ     0
In examining collective effects, we shall draw heavily on individual adaptive decisions. Within the scope of our model, we treat the case in which agents make deliberate decisions by applying rational procedures, which also guide their reasoning. In order to describe the adaptation process at the individual level, we consider two fundamental models: global interaction and local interaction. It is important to consider with whom an agent interacts and how she decides her action depending on others' actions. Agents may adapt based on aggregate information representing the current status of the whole system (global information). In this case, each agent chooses an optimal decision based on aggregate information about how all other agents behaved in the past. In many situations, however, agents cannot be assumed to correctly guess or anticipate other agents' actions, or they do not know how to calculate best replies. With local adaptation, each agent is modeled as adapting to local information (Kirman, 1997; Tennenholtz, 2002). As a specific model, we consider the lattice structure shown in Figure 4: we arrange a collection of heterogeneous agents on a torus of 50 × 50 (2500) agents, where the four corners and the edges of the area are connected to the opposite sides. The consequences of their actions also affect agents with whom they are not directly linked. The hypothesis of local adaptation also reflects the limited ability of agents to receive, decide, and act upon information they receive in the course of interaction.
Figure 3. Some density functions of the payoff parameter f(θ). Case 1: distribution with one peak; Case 2: distribution with two peaks; Case 3: normal distribution; Case 4: uniform distribution; Case 5: polarized distribution.
Figure 4. Adaptation to (a) global information and (b) local information.
Global adaptation rule of a conformist. Agents adopt actions that optimize their expected payoff given what they expect the others to do. In this model, agents choose best replies to the empirical frequency distribution of the previous actions of the others. The main point is that an agent's decision depends on what he knows about the others. We obtain the adaptive rule of each agent as her best response. We denote the proportion of agents having chosen S1 at time t by p(t). The optimal adaptive rule of an agent is obtained as a function of the aggregate information p(t) and her idiosyncratic payoff parameter θ as follows:

(i) if p(t) ≥ θ, choose S1; (ii) if p(t) < θ, choose S2.    (6)
Local adaptation rule of a conformist. We denote the proportion of the neighbors of an agent having chosen S1 at time t by pi(t). The optimal adaptive rule with local adaptation is obtained as follows:

(i) if pi(t) ≥ θ, choose S1; (ii) if pi(t) < θ, choose S2.    (7)

We evaluate the collective result of interacting heterogeneous agents with two criteria: efficiency and equity. Efficiency is evaluated by the average utility, which stands for the measure of desirability at the macro level. Equity is evaluated by the utility distribution, which stands for the measure of desirability at the micro level. The Gini ratio is often used to measure the dispersion of the utility distribution of a society. The pairs of average payoff U and equity E for all heterogeneous populations of conformists, under both global and local adaptation, are shown in Figure 5. The effect of local adaptation compared with global adaptation has been discussed by many researchers (Kirman, 1997; Tennenholtz, 2002). In these simulation results, both efficiency and equity are low when heterogeneous conformists adapt locally; the merit of local adaptation is thus limited for a collective of conformists.

Global adaptation rule of a nonconformist. The global adaptive rule of a nonconformist is obtained as follows:

(i) if p(t) ≤ θ, choose S1; (ii) if p(t) > θ, choose S2.    (8)
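A minimal sketch of the global-adaptation dynamics and of the U/E evaluation may help fix ideas. It is ours, not the chapter's simulation code: equity is taken here as E = 1 − Gini, which is one natural reading of the text, and the Case 4 (uniform) density is used for illustration.

```python
import random

def gini(xs):
    """Gini ratio of a payoff distribution (0 = perfect equality)."""
    xs = sorted(xs)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * cum / (n * total) - (n + 1) / n

N, T = 2500, 200
thetas = [random.random() for _ in range(N)]          # Case 4: uniform f(theta)
choices = [random.random() < 0.5 for _ in range(N)]   # p(0) ~ 0.5; True = S1

for t in range(T):                   # global adaptation, rule (6)
    p = sum(choices) / N
    choices = [p >= th for th in thetas]

p = sum(choices) / N
# Table 4 payoffs against a randomly drawn partner: S1 earns (1 - theta)
# with probability p, S2 earns theta with probability 1 - p
payoffs = [(1 - th) * p if c else th * (1 - p)
           for th, c in zip(thetas, choices)]
U = sum(payoffs) / N                 # efficiency: average utility
E = 1 - gini(payoffs)                # equity, taken here as 1 - Gini
print(f"p* = {p:.2f}  U = {U:.3f}  E = {E:.3f}")
```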
Figure 5. Efficiency U and equity E for conformists (initial proportion p(0) = 0.5): (a) global adaptation and (b) local adaptation.

Figure 6. Efficiency U and equity E for nonconformists (initial proportion p(0) = 0.5): (a) global adaptation and (b) local adaptation.
Local adaptation rule of a nonconformist. The local adaptive rule of a nonconformist is obtained as follows:

(i) if pi(t) ≤ θ, choose S1; (ii) if pi(t) > θ, choose S2.    (9)
We evaluate a collective of nonconformists with the same two criteria, efficiency and equity. The pairs of average payoff U and equity E for all heterogeneous populations of nonconformists, under both global and local adaptation, are shown in Figure 6. As shown in Figure 6, both efficiency and equity
are low when conformists adapt locally; the merit of local adaptation is, on the other hand, enhanced for a collective of nonconformists.

6. Selective interaction of heterogeneous agents

We formalize our idea by modeling a population of heterogeneous agents in which agents are repeatedly matched over a long time period to play the coordination game of Table 4. There are many parameters to be considered, such as the payoff matrix, the population structure, the population configuration, the number of agents, and so on. Among these parameters, we examine the effects of heterogeneity in the payoff parameter θ and of the configuration of agent locations. The interaction methodology also plays an important role in the outcome. We consider two fundamental models: random (or global) interaction and local interaction. Agents myopically evolve their actions based on their own rules, obtained as a function of their idiosyncratic utility and of the actions of their neighbors. The assignment of heterogeneous agents also becomes important, and we investigate the relation between the collective behavior and this assignment. Our interest is to investigate whether systems of many locally connected processors with no central control can produce efficient collective performance. We evaluate the collective behavior by the criteria of both efficiency and equity. The crucial concept for describing the heterogeneity of agents is their preference, characterized by the value of the payoff parameter θ. Heterogeneity of preferences makes possible a different type of interaction: selective interaction. This is possible because each agent has a different payoff parameter. We classify heterogeneous agents into the following two types:

Type 1: the payoff parameter value θ is less than 0.5 (such an agent prefers S1 to S2).
Type 2: the payoff parameter value θ is greater than 0.5 (such an agent prefers S2 to S1).

We also classify interaction types into the following three types, as shown in Figure 7:

(1) Random assignment: each agent has a chance to interact with neighbors of any type.
(2) Sorted assignment: each agent interacts with neighbors of the same type.
(3) Mixed assignment: each agent interacts with neighbors of the opposite type.

We now describe the selective interaction process of the agents. An agent of type 1 (θ ≤ 0.5) chooses S1, since such an agent prefers S1 to S2. On the other hand, an agent of type 2 (θ > 0.5) chooses S2, since such an agent prefers S2 to S1. In Figure 8, we represent the selection process of an agent. Each agent interacts
Figure 7. Assignment of heterogeneous agents: (a) random, (b) sorted, and (c) mixed.

Figure 8. The process of selective interaction: an agent selects a strategy, interacts with her neighbours, and changes partners if the average payoff per neighbour falls below 0.5.
with neighbors by choosing her preferred strategy. If she receives an average payoff per neighbor of more than 0.5, she remains in the same location; otherwise she moves to another location and interacts with different neighbors. At the beginning, we place the two types of heterogeneous agents randomly on the lattice of 50 × 50 agents. Conformists and nonconformists of type 1 and type 2 are randomly assigned, as shown in Figures 9(a) and 10(a). Each agent interacts with her current 8 neighbors. If an agent receives an average payoff per neighbor of less than 0.5, then he is defined as an "unhappy agent". We repeatedly choose at random two unhappy agents on the lattice and swap their locations, so that all unhappy agents can change their current neighbors. After a few hundred rounds, a collection of heterogeneous conformists self-organizes both locations and actions so as to realize homogeneous interaction, as shown in Figures 9(b) (conformists) and 10(b) (nonconformists). We evaluate collectives of both conformists and nonconformists with the two criteria, efficiency and equity. The pairs of average payoff U and equity E under selective interaction are shown in Figure 11.
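The relocation process lends itself to a short sketch. The code below is our rendering, under two stated assumptions: every agent plays her preferred strategy (S1 if θ ≤ 0.5, S2 otherwise), and unhappy agents are re-paired by random swaps, as described above.

```python
import random

SIZE = 50                                   # the 50 x 50 torus of Figure 4
grid = [[random.random() for _ in range(SIZE)] for _ in range(SIZE)]
# theta <= 0.5: type 1, plays her preferred S1; theta > 0.5: type 2, plays S2

def neighbours(i, j):
    return [((i + di) % SIZE, (j + dj) % SIZE)
            for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

def avg_payoff(i, j):
    """Average Table 4 payoff per neighbour, assuming everyone plays her
    preferred strategy: coordinating on S1 pays 1 - theta, coordinating
    on S2 pays theta, and miscoordination pays 0."""
    th = grid[i][j]
    type1 = th <= 0.5
    gain = 0.0
    for a, b in neighbours(i, j):
        if (grid[a][b] <= 0.5) == type1:    # same type -> coordination
            gain += (1 - th) if type1 else th
    return gain / 8

for round_ in range(300):
    unhappy = [(i, j) for i in range(SIZE) for j in range(SIZE)
               if avg_payoff(i, j) < 0.5]    # the 'unhappy agents'
    random.shuffle(unhappy)
    for (i1, j1), (i2, j2) in zip(unhappy[::2], unhappy[1::2]):
        grid[i1][j1], grid[i2][j2] = grid[i2][j2], grid[i1][j1]  # swap homes
```

Running this produces the sorting into homogeneous neighbourhoods illustrated in Figures 9(b) and 10(b).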
Figure 9. Locations of conformists after selective interaction: (a) random assignment and (b) structured assignment.

Figure 10. Locations of nonconformists after selective interaction: (a) random assignment and (b) structured assignment.
Compared with the results shown in Figure 5, both efficiency and equity are improved when agents interact with fixed, self-selected neighbors. A collective of heterogeneous agents can therefore achieve the highest efficiency and equity under selective interaction.

7. Selective interaction with reinforcing preferences

Game theory is typically based upon the assumption of rational choice (Fudenberg and Levine, 1998). The real advantage of the rational-choice assumption is that it often allows deduction. The main alternative to the assumption of rational choice is the adaptive approach with reinforcement learning.
Figure 11. Efficiency U and equity E under structured assignment: (a) conformists under sorted assignment and (b) nonconformists under mixed assignment.
Table 6. Reinforcement learning of the payoff matrix: (a) conformist and (b) nonconformist

  (a) Conformist
                         Choice of other agents
  Choice of agent        S1             S2
  S1                     1 − θ + Δθ     0
  S2                     0              θ − Δθ

  (b) Nonconformist
                         Choice of other agents
  Choice of agent        S1             S2
  S1                     0              θ − Δθ
  S2                     1 − θ + Δθ     0
With reinforcement learning, agents tend to adopt actions that yielded a high payoff, and to avoid actions that yielded a low payoff. Agents will try either of the binary choices, and repeat the strategy that led to high payoffs in past experience. The propensity to try a strategy is increased according to the associated payoff. Payoff describes choice behavior, but it is one's own past payoffs that matter, not the payoffs of the others. The basic premise is that the probability of taking an action in the present increases with the payoff that resulted from taking that action in the past. In this section, we consider a collection of homogeneous agents who initially have the same payoff parameter θ = 0.5 in Table 4; the initial threshold density is that of Case 1 in Figure 3. Each agent repeats T (= 10) interactions with her neighbors. If agents receive an average payoff per neighbor of more than 0.5 by choosing S1, they increase 1 − θ by Δθ = 0.01, and hence decrease θ by Δθ = 0.01, as shown in Table 6. Similarly, agents who receive an average payoff of more than 0.5 by choosing S2 increase θ by Δθ = 0.01. In Figure 12, we represent the process of collective reinforcement learning.

(1) A collective of conformists

We show the transition of the density function of a collective of conformists in Figure 13.
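A compact prototype of this update, using global rather than local information, is sketched below. It is ours, not the authors' code: the tie-breaking convention "gain ≥ 0.5 counts as success" is our assumption (at θ = 0.5 the expected gain starts at exactly 0.5), and this mean-field variant reproduces the polarized outcomes of Figures 13(a) and (c) but not the mixed outcome of Figure 13(b), which requires local interaction.

```python
N, T, dtheta = 2500, 600, 0.01
p = 0.25                  # initial proportion choosing S1, as in Figure 13(a)
thetas = [0.5] * N        # an initially homogeneous collective of conformists

for t in range(T):
    choices = [p >= th for th in thetas]           # conformist rule (6)
    p = sum(choices) / N
    for i, chose_s1 in enumerate(choices):
        # expected payoff per partner under Table 4 at adoption rate p
        gain = (1 - thetas[i]) * p if chose_s1 else thetas[i] * (1 - p)
        if gain >= 0.5:   # the 'successful' strategy is reinforced (Table 6)
            if chose_s1:
                thetas[i] = max(0.0, thetas[i] - dtheta)  # raise 1 - theta
            else:
                thetas[i] = min(1.0, thetas[i] + dtheta)  # raise theta

print(f"final p = {p:.2f}, theta range = [{min(thetas)}, {max(thetas)}]")
```

With p = 0.25 every θ drifts to 1 (all agents end up preferring S2), and with p = 0.75 every θ drifts to 0, matching the polarized cases described next.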
Figure 12. The reinforcement process of an agent's preference (count: the number of successful gains). An agent selects a strategy, interacts with her neighbours, and after K successful gains reinforces her preference and resets the counter.
In Figure 13(a), all conformists have the same value θ = 0.5 at the beginning, and the initial proportion of agents who choose S1 is set to p(0) = 0.25. The collection of identical conformists gradually self-reinforces its payoff parameters (preferences), and after T = 600 rounds all agents have the same value θ = 1. In Figure 13(c), the initial proportion is set to p(0) = 0.75. Identical conformists with the same payoff parameter θ = 0.5 self-reinforce their payoff parameters, and after T = 600 rounds all agents have the same parameter θ = 0. In Figure 13(b), the initial proportion is set to p(0) = 0.5, and after a learning process of T = 600 periods, half of the agents have the payoff parameter θ = 0 and the rest have θ = 1. In all three cases, a collection of initially identical agents collectively reinforces its preferences and evolves into heterogeneous agents, so that the most efficient and equitable collective action can be self-organized. However, the result of collective learning depends heavily on the initial ratio of agents choosing either strategy.

(2) A collective of nonconformists

We show the transition of the density function of a collective of nonconformists in Figure 14. Figure 14(a) gives the results when all nonconformists have the same value θ = 0.5 at the beginning, and the initial proportion of agents choosing S1 is set to p(0) = 0.5. The collection of identical nonconformists gradually self-reinforces its payoff parameters (preferences), and after T = 600 periods the agents split into two groups: agents with payoff parameter θ = 0 and agents with θ = 1. Figure 14(b) shows the results when the initial proportion of agents choosing S1 is set to either p(0) = 0.25 or p(0) = 0.75.
Figure 13. Distribution of θ under reinforcement learning (conformists): (a) p(0) = 0.25, (b) p(0) = 0.5, and (c) p(0) = 0.75.

Figure 14. Distribution of θ under reinforcement learning (nonconformists): (a) p(0) = 0.5, (b) p(0) = 0.3 or p(0) = 0.7.
In that case, they remain homogeneous agents with the same payoff parameter θ = 0.5 and fail to reinforce their preferences. The success of reinforcing nonconformists' preferences to induce desired collective behavior thus depends heavily on the initial condition.
(3) A mixed collective of conformists and nonconformists

We have observed that the success of collective reinforcement learning of conformists and of nonconformists depends heavily on the initial proportions choosing either strategy. We now consider a mixed case in which conformists, who behave according to the majority rule, and nonconformists, who behave according to the minority rule, coexist in the same population. We assume that the ratios of conformists and nonconformists are the same. Figure 15 shows the results when the initial proportion of agents choosing S1 is set to p(0) = 0.25. A collection of identical agents with the same initial payoff parameter θ = 0.5 gradually self-reinforces its preferences, and after T = 1000 periods, 75% of the agents have reinforced the payoff parameter up to θ = 1, and the remaining 25% down to θ = 0. The ratio of agents who choose S1 is 75%: all conformists (50% of the total population) choose S1 and half of the nonconformists (25% of the total population) choose S1. The ratio of agents who choose S2 is 25%, and they are all nonconformists. Figure 16 shows the results when the initial proportion of agents choosing S1 is set to p(0) = 0.75. In this case, a collection of identical agents gradually self-reinforces its payoff parameters, and after T = 1000 periods, 25% of the agents have θ = 1, and the remaining 75% have θ = 0. The ratio of agents choosing S1 is 25%. In this case, all conformists (50% of the total population) choose S2 and half of the nonconformists (25% of the total population) choose S2. The ratio of agents choosing S1 is 25%, and they are all nonconformists. Figure 17 shows the results when the initial proportion is set to p(0) = 0.5. In this case, half of the agents reinforce to θ = 1 and the rest to θ = 0. The ratio of agents who choose S1 is 50%: half of the conformists (25% of the total population) choose S1 and half of the nonconformists (25% of the total population) also choose S1. The same holds for agents who choose S2.
Figure 15. Distribution of θ under reinforcement learning (conformists and nonconformists with initial ratio p(0) = 0.75).
Figure 16. Distribution of θ under reinforcement learning (conformists and nonconformists with initial ratio p(0) = 0.5).

Figure 17. Distribution of θ under reinforcement learning (conformists and nonconformists with initial ratio p(0) = 0.25).
In all three cases, a collection of initially identical agents collectively reinforces its preferences and evolves into heterogeneous agents, so that the most efficient and equitable collective action can be self-organized. The success of collective reinforcement learning therefore depends on the coexistence of conformists and nonconformists, who behave according to opposite rules.

8. Conclusion

There is no presumption that a collective of interacting agents leads to satisfactory performance. Agents normally react to others' decisions, and the resulting volatile collective action is often far from efficient. In this paper, we investigated how the preferences of agents can be reinforced so that their self-interested behavior collectively gives rise to a desired outcome. There are many parameters to be considered, such as the payoff matrix, the population structure, the population configuration, the number of agents, and so on. Among these parameters,
we examined the heterogeneity of payoffs and the configuration of locating heterogeneous agents. Agents myopically evolve their actions based on their own rules, obtained as a function of their idiosyncratic utility and of the actions of their neighbors. The assignment of heterogeneous agents also becomes important. We examined the interaction between partner choice and individual preferences: agents choose their partners and also decide on a mode of behavior in interactions with these partners. We also investigated how interacting homogeneous agents evolve into heterogeneous agents with diverse preferences by reinforcing their preferences, so that they can realize the most efficient and equitable collective action. We showed that the most crucial factors for considerably improving the performance of a system of interacting agents are the endogenous selection of partners and the reinforcement of preferences at the individual level.
References

Bala, V. and S. Goyal (1998), "Learning from neighbors", Review of Economic Studies, Vol. 65, pp. 595–621.
Bala, V. and S. Goyal (2000), "A non-cooperative model of network formation", Econometrica, Vol. 68, pp. 1181–1229.
Banerjee, S. (1999), "Efficiency and stability in economic networks", Mimeo, Boston University.
Beckmann, M., C.B. McGuire and C.B. Winsten (1956), Studies in the Economics of Transportation, New Haven: Yale University Press.
Challet, D. and C. Zhang (1997), "Emergence of cooperation and organization in an evolutionary game", Physica A, Vol. 246, pp. 514–518.
Duffy, J. and E. Hopkins (2002), "Learning, information and sorting in market entry games", Mimeo, University of Pittsburgh.
Durlauf, S.N. and H.P. Young (2001), Social Dynamics, Washington, DC: Brookings Institution Press.
Fudenberg, D. and D. Levine (1998), The Theory of Learning in Games, Cambridge, MA: The MIT Press.
Helbing, D. (1995), Quantitative Sociodynamics, Dordrecht: Kluwer Academic.
Huberman, B. and N. Glance (1993), "Diversity and collective action", in: Interdisciplinary Approaches to Nonlinear Systems, Springer.
Iwanaga, S. and A. Namatame (2001), "Asymmetric coordination of heterogeneous agents", IEICE Transactions on Information and Systems, Vol. E84-D, pp. 937–944.
Iwanaga, S. and A. Namatame (2002), "The complexity of collective decision", Journal of Nonlinear Dynamics and Control, Vol. 6(2), pp. 137–158.
Kirman, A. (1997), "The economy as an interactive system", pp. 461–490 in: W. Arthur, S. Durlauf and D. Lane, editors, The Economy as an Evolving Complex System II, Perseus Books.
Ochs, J. (1998), "Coordination in market entry games", pp. 143–172 in: D. Budescu, I. Erev and R. Zwick, editors, Games and Human Behavior.
Schelling, T. (1978), Micromotives and Macrobehavior, New York: Norton.
Shoham, Y. and R. Powers (2003), "Multi-agent reinforcement learning: a critical survey", Memo.
Tennenholtz, M. (2002), "Efficient learning equilibrium", in: Proceedings of NIPS.
Tumer, K. and D. Wolpert (2004), Collectives and the Design of Complex Systems, Springer.
Young, H.P. (1998), Individual Strategy and Social Structures, Princeton: Princeton University Press.
Part III: Economic Applications
CHAPTER 8
Choice under Social Influence: Effects of Learning Behaviours on the Collective Dynamics

Viktoriya Semeshenko, Mirta B. Gordon, Jean-Pierre Nadal and Denis Phan

Abstract

We consider a simple model in which a population of individuals with idiosyncratic willingnesses to pay must repeatedly choose whether or not to buy one unit of a single homogeneous good at a given price. The utilities of buyers present positive externalities due to social interactions among customers. If the latter are strong enough, the system has multiple Nash equilibria, revealing coordination problems. We assume that individuals learn to make their decisions repeatedly. We study the performance along the learning path, as well as at the equilibria reached by the customers, for different learning schemes based on past earned and/or forgone payoffs. Results are presented as a function of the price, for weak and strong social interactions. Pure reinforcement learning is shown to hinder convergence to the Nash equilibrium, even when it is unique. For strong social interactions, coordination on the optimal equilibrium through learning is reached only with some of the learning schemes, under restrictive conditions. The outcomes of the learning rules are shown to depend crucially on the values of their parameters, and are sensitive to the agents' initial beliefs.

Keywords: equilibrium coordination, positive consumer externalities, reinforcement learning, fictitious play learning
JEL classifications: D42, D62
1. Introduction

In this paper, we consider the simplest model in which individuals with social interactions have to make a binary choice. From the early work of Schelling (1971, 1973, 1978), who analysed patterns of social segregation in urban areas, to recent applications like the analysis of the large variance in criminality across cities by Glaeser et al. (1996), social interaction models have attracted increasing interest in the economics literature (see Blume, 1993; Glaeser and Scheinkman, 2003; Brock and Durlauf, 2001; and, for recent overviews, Manski, 2000 and Ball, 2003). More recently, the model has been generalized by some of us (Gordon et al., 2005, 2006; Nadal et al., 2006) to a simple market situation: a population of heterogeneous individuals – having different and time-independent willingnesses to pay – subject to local positive externalities have to choose whether or not to buy one unit of a single good at a given price, for example posted by a monopolist. Most published works on models with social interactions study the properties of their Nash equilibria. In principle, Nash equilibria may be reached by the system if agents are perfectly rational and have complete knowledge of the strategies and payoffs of all the agents. However, when more than one equilibrium exists, the one effectively reached by the system cannot be predicted a priori, even in small systems. Furthermore, experimental results in psychology and economics (see Camerer, 2003 for a recent review) show that individuals are unlikely to base their decisions on rational deductions alone. It has been argued that whenever the expected utilities depend on the strategies of other individuals, deviations from rational behaviour may arise, in particular due to a lack of information about the rationality of the others. In such situations of strategic uncertainty, individuals rely on a priori beliefs. Outcomes of coordination games, reported for example by VanHuyck et al. (1990), reflect such apparently non-rational behaviours. This is even more conspicuous in large social systems, because making decisions that depend on the decisions of many other agents is a difficult cognitive task. When individuals have to make decisions repeatedly, they may modify their beliefs by learning, based on more or less fragmentary information about past behaviours or experiences (see Young, 1998a; Camerer and Ho, 1999, among others). Most models of learning proposed in the literature assume that each agent makes his decisions based on values he attributes to each possible strategy. Following Camerer (2003), these values are hereafter called attractions. Attractions may depend on summary statistics, like those considered by Crawford (1995), on earned and/or forgone utilities, or on any other quantity observable by the agent or reflecting his beliefs (Young, 1998b). Before making a decision for the first time, different individuals may have different initial attraction values, which are subsequently revised using a learning rule. The learning literature has proposed many plausible learning schemes (see, e.g., Fudenberg and Levine, 1998; Sutton and Barto, 1998). They are all based on an error-correction dynamics using empirical data, and differ mainly in the
amount of information assumed to be available to the learners, and in the confidence the learner puts in its quality. Myopic (Cournot) best reply, for example, requires knowledge of the outcomes of the preceding decisions. Reinforcement learning only uses the returns of the strategies actually played, so that the attractions of strategies not yet tried are not updated. Fictitious play learning schemes (Brown, 1951; Robinson, 1951) weight the information differently depending on whether the strategy was actually played or not. Furthermore, even within one class of learning schemes there are variations, depending on which quantities are used to determine the attractions and on the numerical values of the rule's parameters. When used to fit experimental data, such variations make it possible to produce different explanations, as recently shown by the reinforcement learning rules considered by Erev and Roth (1998) and Sarin and Vahid (1999). Although there are many studies of the properties of different learning algorithms, a better understanding of their comparative behaviour is clearly lacking. In this paper we are interested in the equilibria reached by a population of heterogeneous learning agents with social interactions. We compare the convergence properties of the system for several learning rules, which use the available information differently, and we explore the effect of different initial conditions. As in the models studied by Gordon et al. (2005) and Nadal et al. (2006), customers have idiosyncratic willingnesses to pay (IWP) and exert positive mutual influences. One interesting characteristic of models with social interactions is that when the interactions are strong enough, there are two Nash equilibria (Nadal et al., 2006): one with a small fraction of buyers, and another with a large fraction of buyers, the latter being the Pareto-optimal equilibrium. The nature of this equilibrium is similar to that of the dominant Nash equilibrium in coordination games. It is worth stressing that these multiple equilibria arise in the collective models as a consequence of the externality, for any distribution of the idiosyncratic component over the population, i.e. without having to assume a bimodal distribution of the customers' IWPs (a bimodal distribution of the IWPs gives rise to coordination problems even without externalities). In principle, customers should learn to select the best individual response in a game where each one plays against all the others, and they thus have to learn from each other. Here, we restrict ourselves to the analysis of the learning problem under the simplifying hypothesis that all the customers use the same learning rule, although they start with different initial values of the attractions. We compare the system's behaviour at different prices, under different initial conditions, and with different learning rules. We assume that the price is exogenous. The paper is organized as follows: the customers model is presented in Section 2. Section 3 describes the framework of the learning schemes used by the agents in our simulations. The statistical characteristics and equilibrium properties of the simulated population are described in Section 4. The general settings and the simulation results are described in Sections 5 and 6. Section 7 concludes the paper and presents some open questions.
2. The model

We consider a population of N agents (i = 1, 2, …, N) with the following characteristics:

Strategies: each individual i has to make a binary choice, which we denote ωi = 1 (to buy, to adopt a fashion, etc., depending on the situation addressed by the model) or ωi = 0 (not to buy, not to adopt, etc.).

Heterogeneity: each individual has his own (idiosyncratic) willingness to pay or to adopt, Hi; the larger Hi, the higher the willingness to choose ωi = 1 rather than ωi = 0.¹ We assume that Hi is distributed among the agents according to a probability distribution function (pdf) of average H and variance σH. A uniform distribution was considered by Gordon et al. (2005), whereas Nadal et al. (2003) and Phan et al. (2003) present results for a logistic pdf. In the present paper we consider a triangular distribution (see Section 4).

Externalities: we assume that the willingness to pay of each individual i increases beyond his IWP (Hi) if other customers also decide to buy. In general, individuals may be susceptible to the decisions of a specific subset of agents, which constitutes their so-called neighbourhood. Here, we assume a global neighbourhood: the social component is proportional to the choice ωk (k ≠ i) of every other agent. Moreover, we assume that the corresponding weights Jik are equal and positive (Jik = J > 0 for all k ≠ i; Jii = 0). Positive weights correspond to strategic complementarities, as discussed by Durlauf (1997). The actual payoff of agent i if his strategy is ωi = 1 is Ui({ω−i}) = Hi + (J/N) Σk ωk − P, where P is the posted price; otherwise it vanishes. If Ui({ω−i}) is positive (negative), the optimal choice is ωi = 1 (ωi = 0).

Learning: we assume that the individuals do not know the value of the payoff expected upon choosing the strategy ωi = 1, but estimate the attraction Ai of playing it instead of ωi = 0, based on their past experiences. The actual payoff corresponding to deciding ωi = 1 at time t is²

Ui(t) = Hi + J ηi(t) − P    (1)
which depends on the actual fraction of other customers that buy at time t: X 1 Zi ðtÞ ¼ ok ðtÞ ð2Þ N 1 kðkaiÞ Notice that buying because the attraction is positive runs the risk of having a negative payoff. Conversely, if the choice is oi ¼ 0, the individual has a vanishing payoff but may miss a positive one.
¹ Without loss of generality, H_i represents the preference difference between strategies ω_i = 1 and ω_i = 0.
² Because our model encompasses market and non-market situations, in this paper we use "utility" and "payoff" interchangeably for the quantity U_i, as well as "to buy" and "to adopt" for strategy ω = 1.
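For concreteness, the payoff structure of Equations (1)–(2) can be sketched in a few lines of Python. The paper itself publishes no code, so this is only an illustration under our own naming conventions (`H`, `omega`, etc.):

```python
import numpy as np

def buying_payoffs(H, omega, J, P):
    """Payoff U_i(t) of strategy omega_i = 1 for every agent, Eqs. (1)-(2).

    H     : array of idiosyncratic willingnesses to pay H_i
    omega : array of current binary choices omega_i in {0, 1}
    J     : (global, positive) social influence weight
    P     : posted price
    """
    N = len(omega)
    # eta_i: fraction of the *other* agents who buy, Equation (2)
    eta = (omega.sum() - omega) / (N - 1)
    # U_i = H_i + J * eta_i - P, Equation (1); the payoff of omega_i = 0 is 0
    return H + J * eta - P
```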
There are many models of how attractions may be determined by the agents. In this paper we make the assumption that the individuals do not know precisely the values of the parameters H_i and J, and that they may not even know the structure of their utilities. Their attractions are estimations of the expected payoffs that rely on the values of U_i(t) grasped at each period, after making decisions.

It has already been pointed out by Föllmer (1974), Schelling (1978), Blume (1995) and Durlauf (1997), among authors in economics and the social sciences, that the population model considered here, with idiosyncrasy and additive externalities (but without the learning ability), belongs to a family of models of magnetic systems in physics. The most important concept issued from such models is the existence of collective states. These are states of the system characterized by strong correlations among the individual states (the agents' decisions), which only arise if there are mutual influences between the elementary components, be they physical or social (see also Durlauf, 1997; Phan, 2004; Phan et al., 2004). In this paper we are interested in the steady state reached by the system when individuals make their decisions with limited information, represented by the attractions learned from their past experiences. Although we formulate the model in economic terms, it applies to any situation where social interactions influence the preferences of the individuals.³

If the number of agents N is large enough, we can determine analytically the Nash equilibria of the system, as described by Nadal et al. (2003). These equilibria, which correspond to the optimal decisions, are realized with probability 1 if the individuals are rational and have full knowledge of all the parameters of the model. The theoretical results are obtained in the limit of large populations, which allows us to approximate η_i(t) in (2) by

η_i(t) ≈ η(t) ≡ (1/N) Σ_{k=1}^{N} ω_k(t)    (3)
With this approximation, the social component of the willingness to pay is proportional to the rate of adoption of the good. The properties of the system may be visualized in a so-called phase diagram. In the plane with axes J and P − H, it represents the regions where different regimes of Nash equilibria exist. If J is sufficiently large, there is a range of values of P − H where two different equilibria coexist. Although the exact boundaries of this region depend on the details of the pdf, its very existence is a general property of the model, due to the social interactions. These equilibria are of the same nature as those of a coordination
³ In non-market situations like the ones considered, for example, in (Schelling, 1978; Granovetter, 1978; Durlauf, 1999; Glaeser and Scheinkman, 2002), the value of H − P should be interpreted as the population average willingness to adopt the state ω = 1 rather than ω = 0. Then, P is the cost of adoption, assumed to be the same for all the individuals.
game, but here the payoffs of all the players are different, because the population is heterogeneous. In Section 5 we present the phase diagram corresponding to the triangular distribution used in the simulations. We do not detail its calculation here; it follows the same lines as in Gordon et al. (2005) and Nadal et al. (2006).

3. Learning binary choices from experience

We are interested in the equilibria reached when the customers are rational but have incomplete knowledge: they make their decisions repeatedly at successive periods t, based on information obtained from past actions. The system evolves through an entangled dynamics that combines decision making and learning: agent i chooses a strategy ω_i(t) using a decision rule (which may be deterministic or probabilistic) that depends on the attractions A_i(t). Then, the actually earned payoff – and possibly the payoffs corresponding to non-played strategies – are used to update, or learn, the attractions for buying according to some particular learning rule.

Following Camerer (2003), we consider his experience-weighted attraction (EWA) scheme, which allows one to represent in a single expression many families of learning rules proposed in the literature. We found it convenient to use a parameterization slightly different from Camerer's (2003). Specifically, given the actual payoff U_i^ω(t) of a strategy ω, each agent uses the following adaptive rule to update the corresponding attraction:

A_i^ω(t + τ) = [1 − μ(t)] A_i^ω(t) + μ(t) D_i^ω(t) U_i^ω(t)    (4)

D_i^ω(t) = δ + (1 − δ) I[ω_i(t), ω]    (5)

μ(t + τ) = (1 − κ) μ(t)/(μ(t) + φ) + κ(1 − φ)    (6)
where I(x, y) is the indicator function (I(x, x) = 1, I(x, y) = 0 for y ≠ x) and τ the elementary time step. The factor D_i^ω(t) allows one to update the attractions of played and non-played strategies differently. μ(t) is usually called the learning rate in statistical learning theory. It weights the relative importance of recent payoffs with respect to past estimations.⁴ The values of the learning parameters μ(0) > 0, κ ∈
⁴ For comparison with the expressions in Camerer (2003, p. 305), which use parameters K, Φ and N(t), with N(t) = 1 + ΦN(t − 1), define μ(t) ≡ 1/N(t). Then, if we identify κ = K and Φ = φ, the updating rule (6) gives exactly Camerer's EWA expressions for κ = 1. Since by construction the coefficients of the two terms in (4) add up to 1, when κ = 0 our updating (6) is slightly different from that of Camerer's EWA. All our learning rules have weighted attractions, and do not include learning through cumulated attractions. The latter, however, can be shown to be equivalent to weighted attractions through a basic rescaling of the quantities of interest. A detailed discussion of this difference is left for a forthcoming paper.
{0, 1}, φ ≥ 0 and 0 ≤ δ ≤ 1 in (4), (5) and (6) correspond to different assumptions about the rationality and cognitive capacities of the customers. They allow one to derive different learning algorithms.⁵

The above equations may be simplified within the binary-choice framework of our model. First, since ω ∈ {0, 1}, we may consider only the attractions for buying. We may also write D_i(t) = δ + (1 − δ)ω_i(t) in (5). Thus, in our case of binary choices, after the strategy ω_i(t) is played, the attraction for buying in the next period is estimated as follows:

A_i(t + τ) = [1 − μ(t)] A_i(t) + μ(t) [δ + (1 − δ)ω_i(t)] U_i(t),    (7)

μ(t + τ) = (1 − κ) μ(t)/(μ(t) + φ) + κ(1 − φ),    (8)
where U_i(t), given by Equation (1), is the actual payoff of strategy ω = 1 at period t. A list of the possible learning rules represented by the above expressions is given below.

Decisions are taken based on the learned attractions. We assume that each agent chooses strategy ω = 1 at time t using a probability law that depends on his attraction for buying at time t, P(ω(t) = 1 | A_i(t)). The results in the present paper have been obtained using the myopic best response (9):

ω_i(t + τ) = Θ(A_i(t)),    (9)
where Θ(x) is the Heaviside function (Θ(x) = 1 if x ≥ 0; Θ(x) = 0 otherwise). Thus, the decision depends only on the sign of the attraction, and not on its magnitude: the individuals buy whatever the value of their attractions, provided they are positive. It is worth stressing that this response is optimal with respect to the attractions. These are learned estimations of the payoffs, which may be erroneous, thus leading the individuals to make bad decisions. In the special case where the attractions are equal to the payoffs at the preceding period, this decision rule coincides with what is usually called Cournot best reply in the literature.

The numerical values of the parameters in (7) and (8) allow us to generate different learning rules. Although we consider the special case of binary decisions, where the agents only need to estimate the attractions for buying, the discussion that follows is very general and easily transposable to situations with more possible strategies.

The factor μ(t) in (7) sets the memory decay rate of past attractions, modelling the so-called recency effect of cognitive psychology. This decay may arise because of limited memory capacity or because the agent believes that older information may not be as relevant as newer information. It is parameterized by the
⁵ We assume that, given the learning algorithm, the parameters μ(0), κ, φ and δ are the same for all the individuals in the population.
values of κ ∈ {0, 1} and φ, which control the time dependence of the learning rate. For φ = 0 and any κ ∈ {0, 1}, μ(t) = 1 and (7) gives

A_i(t + τ) = [δ + (1 − δ)ω_i(t)] U_i(t)    (10)
This is myopic learning, since the attraction at each time step t only relies on the outcome of the preceding iteration, without keeping any trace of the previous steps. If δ = 1, the attraction for buying at each time step is the utility of buying at the previous time step, whether the agent bought or not. Combining this updating scheme with the deterministic best-response decision rule (9) gives the standard Cournot best reply behaviour. It is well known from the physics literature that, if the interactions between agents are symmetric and the attractions are the true utilities at the previous time step, as in (10) with δ = 1, then iterating (9) with parallel or sequential dynamics⁶ leads the system to a Nash fixed point. With a probabilistic decision rule, using a logit distribution of parameter β:

P_β(ω(t + τ) = 1) = 1 − P_β(ω(t + τ) = 0) = 1/(1 + exp(−βA_i(t)))    (11)
the system converges to a stationary state, similar but not identical to the so-called quantal response equilibrium in economics (McKelvey and Palfrey, 1995), in which the decisions fluctuate close to the optimal ones⁷ with a variance inversely proportional to β. This rule reduces to the myopic best response (9) in the limit β → ∞.

If κ = 0 and φ > 0 in Equation (8), the learning rate decreases through time. Moreover, (8) with κ = 0 gives explicitly

μ(t + τ) = μ(0)(1 − φ) / [μ(0)(1 − φ^t) + φ^t(1 − φ)]    (12)
In the limit t → ∞, if φ < 1, μ(t) converges asymptotically to 1 − φ, the same time-independent learning rate as when κ = 1. This convergence is faster the smaller the value of φ, and for φ = 0 the value μ = 1 is reached after only one time step. In that case, only the last utility determines the attraction, as in myopic learning. When φ > 1, the learning rate μ(t) decreases asymptotically like φ^{−t}. As the factor multiplying U_i(t) in (7) vanishes with t, learning becomes less and less effective with time. The incidence of the time dependence of the learning rate on the learned values has been studied in the machine learning literature (Saad, 1998). Too fast a decrease of μ(t) may stop the learning process prematurely, whereas excessively large values of μ(t) may induce chaotic behaviour. Small values of φ (<1) are preferable for successful learning, at the price of long learning times.
⁶ Provided that the sequential dynamics is ergodic.
⁷ Optimal with respect to the attractions, which may be different from the actual payoffs.
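The binary-choice updating rules (7)–(9) are compact enough to be written out directly. The following is a minimal sketch, with names and the function split chosen by us:

```python
import numpy as np

def ewa_update(A, mu, omega, U, kappa, phi, delta):
    """One learning step for the attractions for buying, Eqs. (7)-(8).

    A, omega, U : arrays of attractions A_i(t), decisions omega_i(t),
                  and payoffs U_i(t) of strategy omega = 1
    mu          : learning rate mu(t), a scalar with mu(0) > 0
    """
    D = delta + (1.0 - delta) * omega                # played vs non-played weight
    A_new = (1.0 - mu) * A + mu * D * U              # Equation (7)
    mu_new = (1.0 - kappa) * mu / (mu + phi) + kappa * (1.0 - phi)  # Equation (8)
    return A_new, mu_new

def myopic_best_response(A):
    """Equation (9): buy (omega = 1) iff the attraction is non-negative."""
    return (A >= 0).astype(int)
```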
If κ = 1 in (8), the learning rate is constant: μ(t + τ) = 1 − φ, meaning that the agents keep memory of their past utility, but learning remains an active process in the long run. If φ = 1 there is no learning at all. Values φ > 1 when κ = 1 are not allowed, since they would imply unlearning. The value of φ (0 ≤ φ < 1) controls the decay rate of past updates: the larger φ, the stronger the memory effects. Thus, if φ < 1 we expect quite similar learning behaviours for both κ = 0 and κ = 1.

If φ > 1, which is not forbidden when κ = 0, learning slows down over time because the learning rate decays to zero like φ^{−t}. For t ≫ (log φ)^{−1} the learning ability vanishes. This phenomenon is known in cognitive psychology as the primacy effect: only the first experiences are memorized. In learning theory, this is an undesired phenomenon known as early stopping, and usually reflects poorly adapted learning parameters.

If the true fraction of buyers does not change very abruptly from one time step to the next – this should be true at least close to convergence – the learning rule (7) converges to values that depend on whether the steady state of the agent is ω_i = 0 or ω_i = 1, due to the decision-dependent weighting factor δ + (1 − δ)ω_i(t). The value of δ allows one to update the attraction of the played strategy differently from that of non-played ones. If δ = 1, all the strategies are equally updated, a learning scheme known as fictitious play. If δ = 0 we obtain the usual reinforcement learning: the attraction for buying is updated only after buying, i.e. only if it is positive. Otherwise it remains negative and its absolute value decays by a factor 1 − μ(t) at each period. Thus, non-buyers will persist in their initial strategy, without knowing whether it is the correct one. Reinforcement learning is a plausible learning rule if the only way to obtain information about the attractions is through playing the corresponding strategies. When 0 < δ < 1 we have weighted belief learning: the utility of the strategy actually played at time t has a greater influence on the update of the corresponding attraction than the potential utility of non-played strategies has on their own attractions.⁸ As a consequence, non-buyers will systematically underestimate the absolute value of the attraction for buying. However, whenever δ > 0, the sign of the correcting term in (7) is correct.

Notice that our parameterization restricts the family of learning rules to cases where the typical scale of the attractions remains constant (hence the factors 1 − μ and μ in (7), which add up to 1). A rule allowing for a global increase of the attractions, with eventual convergence to cumulated payoffs, would be of interest in the case of probabilistic decision rules like (11), or of any trembling-hand perturbation of the deterministic choice rule. In that case, the decision rule would get closer and closer to a deterministic best-response rule as time increases. However, this effect can be separated from the learning rule by including
⁸ We do not consider δ > 1, which corresponds to learners that overestimate the attractions of non-played strategies. Such values might model regret about the played strategy.
a time dependence on the variance of the trembling noise or on the logit parameter β, such that β(t + 1) > β(t). In the present paper we consider the deterministic myopic best response, where only the sign of the attractions is relevant. To summarize, we have the following correspondences:

Myopic learning
  Fictitious play: κ ∈ {0, 1}, φ = 0, δ = 1
  Reinforcement learning: κ ∈ {0, 1}, φ = 0, δ = 0
Time-averaged learning
  Fictitious play: κ ∈ {0, 1}, 0 < φ < 1, δ = 1
  Reinforcement learning: κ ∈ {0, 1}, 0 < φ < 1, δ = 0
Time-decay learning
  Fictitious play: κ = 0, φ > 1, δ = 1
  Reinforcement learning: κ = 0, φ > 1, δ = 0
Weighted belief learning: κ = 0, 0 < φ < 1, 0 < δ < 1
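For reference, the same correspondences can be encoded as a small lookup table. Where the text leaves κ free we pick κ = 1; the numerical φ and δ values are the ones used in the simulations of Section 6 where given, otherwise merely illustrative:

```python
# (kappa, phi, delta) triples for the learning rules listed above
LEARNING_RULES = {
    "myopic fictitious play":        dict(kappa=1, phi=0.0, delta=1.0),
    "myopic reinforcement":          dict(kappa=1, phi=0.0, delta=0.0),
    "time-averaged fictitious play": dict(kappa=1, phi=0.5, delta=1.0),  # 0 < phi < 1
    "time-averaged reinforcement":   dict(kappa=1, phi=0.5, delta=0.0),  # 0 < phi < 1
    "time-decay fictitious play":    dict(kappa=0, phi=1.5, delta=1.0),  # phi > 1
    "time-decay reinforcement":      dict(kappa=0, phi=1.5, delta=0.0),  # phi > 1
    "weighted belief learning":      dict(kappa=0, phi=0.5, delta=0.5),  # 0 < phi, delta < 1
}
```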
We have merged under the designation of time-averaged learning both the constant (κ = 1) and the variable (κ = 0) learning-rate rules, for which we expect similar behaviours.

Before presenting our simulation results in Sections 5 and 6, we describe the Nash equilibria of the system. Then we simulate systems of N learning agents that start with some initial values of their attractions, decide whether to buy or not, and update the attractions based on the payoffs of the period. We compare the performances of different learning rules and different decision dynamics in terms of learning times and collective states at convergence.

4. Population characteristics and equilibrium properties

We consider the simplest system, in which the individuals willing to pay high prices are fewer than those willing to pay low ones. This is a population where the distribution of the individuals' IWPs H_i is triangular around its mean value H, such that the fraction of individuals with a given H_i decreases linearly with H_i. In terms of the normalized parameters h ≡ H/σ_H and x_i = (H_i − H)/σ_H, where σ_H is the standard deviation of the IWP distribution, the random variables x_i have zero mean and unit variance, with probability density function given by

f(x) = 2(2b − x)/(9b²) for −b ≤ x ≤ 2b, and f(x) = 0 otherwise,    (13)

where b = √2. Since individuals with high IWP are less numerous than those with low IWP, fluctuations in our results are more conspicuous for high values of the IWP. The triangular distribution presents the advantage of allowing an
analytical determination of the system's equilibrium properties. These are calculated following the same lines as in Gordon et al. (2005). The fraction of buyers is given by the solutions of the self-consistent equation η = ∫_{p−h−jη}^{2b} f(x) dx.

Figure 1 shows the customers' phase diagram. It exhibits the characteristics of the system's equilibrium states for different values of the normalized parameters p − h ≡ (P − H)/σ_H and j ≡ J/σ_H. On the line j = 0, if the price p is larger than the maximal IWP in the population, namely h + 2b, then nobody buys and η = 0 (dark-grey region). On the contrary, when p falls below the smallest IWP in the population (p < h − b), all the consumers buy the product (η = 1). For intermediate prices, η(j = 0) decreases with the price:

η(j = 0) = (h + 2b − p)² / (9b²) for −b ≤ p − h ≤ 2b    (14)
and saturates at η = 0 for p − h > 2b and at η = 1 for p − h < −b. This saturation is a consequence of the pdf (13) having finite support.

Figure 1. Customers' phase diagram for the triangular distribution of the IWP. The axes represent the normalized variables p − h ≡ (P − H)/σ_H and j ≡ J/σ_H, where σ_H is the standard deviation of the IWP distribution. In the light-grey region, the fraction of buyers is smaller than 1. It is strictly 0 in the dark-grey region, as a consequence of the distribution having a finite support. For j > j_B, the domain where all the customers are buyers (η = 1) is seen to overlap with that where the fraction of buyers is small, within a large range of prices bounded by p − h = j − b and p − h = 2b − j_B²/j. Numerical values: b = √2, j_B ≡ 3b/2. For j = 4 (one of the values of j used in the simulations), the region where two solutions coexist is bounded by p1 − h = 1.7034 and p2 − h = 2.5858.

Otherwise the variation of η
would be smooth over the whole range of prices. The fraction of buyers is smaller than 1 in the light-grey region and is strictly 1 to the left of the saturation line (hashed region).

If j > 0, individuals whose IWP is smaller than the price may have a positive payoff due to the social influence term, so that the fraction of buyers at a given price increases with j. This effect is apparent in the phase diagram through the positive slope of the saturation line (lower bound of the hashed region) at which η = 1: the larger j, the higher the price at which the solution η = 1 exists. If j > j_B ≡ 3b/2 there is a range of prices for which two different solutions coexist. One corresponds to a large fraction of buyers (with the present distribution (13) there is saturation, η = 1), the other to a fraction of buyers bounded by a finite upper value η_sup < 1, represented by the dashed line (upper bound of the grey region). Notice that for p − h and j large enough, the two coexisting solutions are η = 0 and η = 1, due here again to the boundedness of the support of the IWP distribution. These two possible outcomes for η, represented by the superposition of the grey (dark and light) low-η regions with the hashed high-η one, reflect a coordination problem brought up by the externalities. This coordination problem is somewhat less canonical than usual coordination games like the stag-hunt game,⁹ because, due to the players' heterogeneity, different individuals earn different payoffs within each Nash equilibrium. However, the nature of the two equilibria is the same as in usual coordination games, i.e. an efficient Pareto-optimal one where coordination is achieved, and an inefficient one with low adoption rates.

5. General simulation settings

In the next section we compare results obtained through the combination of deterministic best-response decisions with different learning schemes (myopic fictitious play, weighted belief learning, time-averaged reinforcement learning and time-decay fictitious play), under parallel and sequential dynamics. We analyse the behaviour of the fraction of buyers, the fraction of customers that buy only because of the social influence (but would not buy otherwise), and the learning times for each learning scheme. Convergence properties are studied for each learning rule, to understand which equilibria agents converge to, if any, in different regions of the customers' phase diagram.

5.1. Dynamics

There are different possible ways of simulating the population decision dynamics, the most popular being parallel and sequential updating. In parallel
⁹ This is a class of games used to represent many economic scenarios that require agents to coordinate their strategies in order to maximize their utilities.
dynamics, at each time step t all the N agents choose their strategies and afterwards update their attractions according to the corresponding utilities. At the other extreme, we may consider completely asynchronous agents, each one making his decision and subsequent attraction updates at a different time, in an arbitrary order. This dynamics is usually implemented by choosing at random, with a uniform distribution, the agent that is active at each time step, and is called stochastic sequential dynamics. In order to compare with parallel dynamics, we make N asynchronous updates in each period t. Owing to the random choice of the agents to be updated, it may happen that some agents are not updated at all within a period, while others are updated more than once. More precisely, since the probability of selecting a given agent for updating in one period is 1/N, the probability that an agent is not selected at all in the period is ≈ e⁻¹, i.e. larger than 30%. A consequence of this high probability is a strong slowdown of the temporal evolution of sequential dynamics with respect to parallel dynamics.

Note that different time scales for the decision and the learning processes could be chosen. For example, individuals could memorize experiences over several decision-making periods before updating the EWA. Such dynamics are worth investigating.
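The two updating schemes can be sketched as follows. This is a minimal illustration with our own function names; in particular, advancing the learning rate once per period under sequential dynamics is our assumption, not a detail stated in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_period(A, mu, H, J, P, kappa, phi, delta):
    """One period: all N agents decide, then all attractions are updated."""
    N = len(A)
    omega = (A >= 0).astype(int)                         # Equation (9)
    U = H + J * (omega.sum() - omega) / (N - 1) - P      # Equation (1)
    A = (1 - mu) * A + mu * (delta + (1 - delta) * omega) * U   # Equation (7)
    mu = (1 - kappa) * mu / (mu + phi) + kappa * (1 - phi)      # Equation (8)
    return A, mu, omega

def sequential_period(A, mu, omega, H, J, P, kappa, phi, delta):
    """One period: N single-agent updates in uniform random order; an agent
    is missed within a period with probability ~ exp(-1)."""
    N = len(A)
    for i in rng.integers(0, N, size=N):                 # active agent at each micro-step
        omega[i] = 1 if A[i] >= 0 else 0                 # Equation (9)
        U_i = H[i] + J * (omega.sum() - omega[i]) / (N - 1) - P
        A[i] = (1 - mu) * A[i] + mu * (delta + (1 - delta) * omega[i]) * U_i
    mu = (1 - kappa) * mu / (mu + phi) + kappa * (1 - phi)
    return A, mu, omega
```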
5.2. Parameters and initial states

We are interested in the collective learning dynamics corresponding to different parameter values of the EWA learning scheme (7). Given the characteristics of the phase diagram, we focused on the learning behaviour for two different values of the social influence weights, namely j ≡ J/σ_H = 1, for which there is a single equilibrium state for all the values of p − h, and j = 4, for which two equilibria may coexist. In the latter case we denote by p1 and p2 the prices at the boundaries of the region where states with large and small η coexist (see Figure 1).

We used different initial conditions for both the states ω_i(0) and the attractions A_i(0). Namely, we considered two homogeneous initial conditions (called cold starts in the physics literature) and a random initial state. The two cold-start initializations are the optimistic one, in which all the customers begin by buying the good, ω_i(0) = 1 ∀i, and the pessimistic one, with ω_i(0) = 0 ∀i. In the random initial state (known as a hot start) the initial strategy of each agent is set to 0 or 1 with equal probabilities. In all cases, the initial attractions are drawn at random from a uniform distribution within an interval [−m, m], but consistently with the initial states. That is, if ω_i(0) = 1, then A_i(0) ∈ [0, m]; if ω_i(0) = 0, A_i(0) ∈ [−m, 0]. We present results for m = 1.

All our simulations correspond to systems with N = 1000 agents. Results are averages over 100 systems, corresponding to 100 different realizations of the random IWP.
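A sketch of the initialization just described. Note that numpy's triangular sampler with the mode placed at the left edge reproduces exactly the decreasing pdf (13); everything else (names, the seed) is our choice:

```python
import numpy as np

rng = np.random.default_rng(0)
N, b = 1000, np.sqrt(2.0)

# Normalized IWPs x_i from pdf (13): support [-b, 2b], mode at -b,
# density 2(2b - x)/(9 b^2), zero mean and unit variance for b = sqrt(2).
x = rng.triangular(left=-b, mode=-b, right=2 * b, size=N)

def initial_state(kind, m=1.0):
    """Cold starts ('optimistic', 'pessimistic') or hot start ('random'),
    with attractions drawn consistently with the initial strategies."""
    if kind == "optimistic":
        omega0 = np.ones(N, dtype=int)
    elif kind == "pessimistic":
        omega0 = np.zeros(N, dtype=int)
    else:
        omega0 = rng.integers(0, 2, size=N)        # hot (random) start
    # A_i(0) uniform in [0, m] for buyers and in [-m, 0] for non-buyers
    A0 = rng.uniform(0.0, m, size=N) * np.where(omega0 == 1, 1.0, -1.0)
    return omega0, A0
```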
5.3. Stationary states

At convergence, provided that μ(t) does not vanish prematurely, the estimated attractions (7) may be calculated by replacing A_i(t), ω_i(t) and U_i(t) by their asymptotic values A_i(∞), ω_i(∞) and U_i(∞). We obtain

A_i(∞) = [δ + (1 − δ)ω_i(∞)] U_i(∞)    (15)
Thus, only if ω_i(∞) = 1 do the estimated values A_i(∞) converge to the actual utilities U_i(∞). This arises for the whole population only when δ = 1. If 0 < δ < 1, individuals that do not buy underestimate the absolute value of the attraction for buying, but do estimate its sign correctly. If δ = 0 (classical reinforcement learning), only the buyers' attractions converge to the actual utilities; individuals whose initial attractions are negative never update their values and may thus miss positive utilities. As a consequence, the overall system will converge to a state with a fraction of buyers smaller than the value predicted by the phase diagram.

In the case of time-decay learning (κ = 0, φ > 1), convergence may arise just because μ(t) decreases too fast, stopping the learning process untimely. Then, nothing can be predicted about the values of the attractions, which become frozen because the updating coefficient vanishes, and not because the utilities become stationary. In the simulation results of the next section we present dramatic consequences of early stopping on the equilibria reached by the system.

6. Simulation results

In this section we describe the results obtained for qualitatively different learning schemes. Most of the results at convergence turn out to be the same with both kinds of dynamics (parallel and sequential). In the following we present results for both dynamics only when they differ.

6.1. Myopic fictitious play

Inserting the parameters of this learning rule (κ = 1, φ = 0, δ = 1) in (7) and (8) gives

A_i(t + τ) = U_i(t)    (16)

where U_i(t) is the actual payoff of strategy ω_i = 1, given by Equation (1). Agents are myopic in the sense that at each time step t they update their attractions for the next time step considering only the utility of the current period. As stated in the previous section, a system of myopic-learning agents will reach the same equilibrium as a system of rational agents with complete information. They only need enough time to converge.

The learning time is defined as the number of time steps t needed until the attractions of all the agents reach their stationary states (with asynchronous dynamics there are N sequential updates in one time step; with parallel dynamics all the N agents are updated simultaneously in one time step). Figure 2 presents
Figure 2. Myopic fictitious play (κ = 1, φ = 0, δ = 1). Learning time (averaged over 100 systems; error bars are smaller than the symbols) versus p − h for j = 1 and 4. Remark the different time scales (above: parallel dynamics; below: sequential dynamics).
our results with both kinds of dynamics, as a function of the difference between the price p and the mean IWP h of the population, for different initializations. Each point in the figure represents the average over 100 simulated systems (each one corresponding to one realization of the random values h_i, 1 ≤ i ≤ N, N = 1000). With parallel dynamics the system reaches the equilibrium faster than with sequential dynamics: parallelization speeds up the learning process because attractions are updated based on the decisions of all the agents at the previous time step. In sequential dynamics the information used by the N agents successively selected for updating is a mixture of the states of individuals already updated and of those who still have their previous time-step states.

In different regions of the phase diagram (Figure 1) agents spend different amounts of time learning. Consider the case j = 1, when p − h < j − b (hashed region in Figure 1). With the optimistic initialization and parallel dynamics all the agents buy from the start, η(0) = 1, so that at the next time step all the attractions have the values of the utilities at equilibrium: convergence is reached whatever the initial values of
the A_i. With the pessimistic initialization (ω_i = 0 for all i), η(0) = 0 at the start. Since the utilities of individuals with h_i > p are positive independently of the fraction of buyers, at the second time step their attractions become positive, inducing them to buy. Thus η(1) > 0, so that at the next time step more agents will buy. This entails in turn a new increase of η. Progressively, at the pace of the successive updates, more and more individuals decide to buy thanks to their updated attractions, which increase linearly with η, leading the system to its unique Nash equilibrium, η = 1.

As may be seen in the left graphs of Figure 2, learning times for pessimistic and random initializations increase with p − h in the region where the Nash equilibrium is η = 1, i.e. up to p − h = j − b. For slightly larger prices, the learning times present a cusp, which is most conspicuous with pessimistic and random initializations. This is an indication that when very few individuals should not buy, finding the system's equilibrium through individual learning is a difficult task, which becomes easier for increasing prices. There is an interesting crossing of the curves for optimistic and pessimistic initializations, and a dramatic drop of the learning time for the random initialization at p − h ≈ 0.3284. At this value, η ≈ 0.5 at equilibrium, which is close to the initial fraction of buyers when the initialization is random. This explains the corresponding short learning times: after the first time step the payoffs correspond to those with approximately one half of the population buying, which are close to the equilibrium attraction values. More generally, the optimistic (pessimistic) initialization is closer to the correct guess when the equilibrium corresponds to η > 1/2 (η < 1/2). Beyond p − h = 2b, the Nash equilibrium is η = 0, and the situation is inverted with respect to the η = 1 state: the pessimistic initialization presents the fastest convergence.

If the social influence is large, j = 4, outside the coexistence region p1 < p < p2 the learning task is similar to that for j = 1, because there is a single equilibrium. Learning times follow the same trends as for j = 1, although they are longer. The position of the cusp in the coexistence region depends on the initialization. With cold starts, the cusps in learning times fall exactly at the boundaries of the coexistence region, where the fraction of buyers presents a discontinuity. In the coexistence region the variances are larger, reflecting the fact (see below) that the systems converge to either one or the other solution, with quite different learning times. Sequential and parallel updating show similar behaviours (upper and lower graphs in Figure 2), but the learning times with the former are dramatically higher than with the latter, for the same reasons as for j = 1. Optimistic and pessimistic initializations give reliable bounds for the learning times corresponding to random initializations. Exceptions occur when, by chance, the initialization results in a value of η(0) already close to the equilibrium.

Consider now the fraction of buyers at equilibrium. Since it turns out to be the same for both dynamics, we only present results for parallel updating in Figure 3. For j = 1, for each price all the simulated systems converge to the same value of η: the equilibria are unique, consistently with the phase diagram. For
Figure 3. Myopic fictitious play (κ = 1, φ = 0, δ = 1). η at equilibrium versus p − h for j = 1 and j = 4. When not visible, error bars are smaller than the symbols' sizes. The full lines are the analytical predictions for the rational (Nash) equilibrium. Parallel and sequential dynamics give the same results.
j = 4, as shown in the phase diagram, there exists a range of p − h where the system presents two possible stable solutions, one with a large fraction of buyers, another with a small one. This coexistence of solutions reveals the customers' coordination problem. To receive the highest payoffs agents must spontaneously coordinate. Otherwise, the system gets stuck in the low-η inefficient equilibrium, in which every customer has a smaller payoff. Within our model, for this value of j > j_B the social influence is strong enough that, when coordination is successful, the overall population buys, even if the price is larger than the willingness to pay of most individuals. Dynamically, if at a time t there are enough agents with positive attractions (that decide to buy), then individuals with negative attractions (that did not buy at time t) will have positive utilities thanks to the term jη_i(t) in (16), making the population evolve to the state η = 1. Clearly, the possibility of such an evolution depends on the initial conditions. If this avalanche process leading to the large-η solution does not take place, the system gets trapped in the inefficient low-η equilibrium. This happens with the pessimistic initialization. The optimistic initialization allows the system to reach the optimal Nash equilibrium because coordination is set from the start. The fact that the averages corresponding to the random initialization in Figure 3 (right-hand side) lie between the two theoretical solutions simply reflects the fact that the simulated systems converge either to one or to the other state, depending strongly on the details of the initial conditions. In fact the distribution of the results is bimodal; this is why the averages in the coexistence region present larger variances than elsewhere. This very same phenomenon has been invoked by Glaeser et al. (1996) to explain the large variances in criminality rates across cities.

Customers with IWPs that satisfy h + x_i < p may buy if the social influence term is large enough to verify h + x_i − p + jη > 0. They are a fraction of the individuals
Figure 4. Myopic fictitious play (κ = 1, φ = 0, δ = 1). Fraction of customers that buy at equilibrium thanks to the social influence. If j = 0, these agents would choose not to buy.
among the N agents, represented in Figure 4. Clearly, the social influence is larger at the boundary of the η = 1 phase, where the price is the highest with still 100% of buyers. When j = 4, in the coexistence region, these customers are the potential losers if coordination is unsuccessful. At the highest-price edge of the coexistence region, when coordination is successful, 100% of the population are socially influenced customers. Since learning by agents takes time, the monopolist may or may not take advantage of this fact, even at a fixed price. An indication of the characteristics of a learning rule and of the role of the initial conditions from the seller's point of view may be gathered by studying, in the case of a monopoly market, the monopolist's cumulated benefit or payoff during the learning process. However, we do not consider here the monopolist's side, which will be analysed in a forthcoming paper.

At the end of the learning process the attractions for buying of the N agents converge towards the fixed points given by lim_{t→∞} A_i(t) = h_i − p + jη, where η is the fraction of buyers at convergence. That is, they take the values of the equilibrium payoffs. These attractions are plotted in Figure 5 against the individuals' IWP values x_i, for a single system at three different prices, corresponding to equilibria with η = 1, 0 < η < 1 and η = 0 (upper, intermediate and lower curves, respectively). As expected, the attractions are proportional to η (although individuals do not estimate η to determine the attractions), because in myopic learning these are the actual equilibrium payoffs. Notice the lower density of points at large x_i, which reflects the fact that there are fewer individuals with large IWP than with small ones, due to the considered triangular distribution. Both kinds of updating dynamics, i.e. parallel and sequential, give similar results. When j = 4, optimistic and random initializations give consistently different attraction values at convergence, depending on whether coordination has been achieved or not (see Figure 5).
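As an end-to-end illustration of this subsection, the sketch below runs myopic fictitious play under parallel dynamics with an optimistic start, in normalized variables (x_i, j, p − h). N = 1000 and j = 4 follow the text; the price, seed and convergence guard are our choices:

```python
import numpy as np

rng = np.random.default_rng(0)
N, b = 1000, np.sqrt(2.0)
j, p_minus_h = 4.0, 2.0          # a price inside the coexistence region for j = 4

x = rng.triangular(-b, -b, 2 * b, size=N)   # normalized IWPs, pdf (13)
omega = np.ones(N, dtype=int)               # optimistic cold start

for t in range(10_000):                     # parallel dynamics
    eta = (omega.sum() - omega) / (N - 1)   # Equation (2)
    A = x - p_minus_h + j * eta             # Equation (16): A_i(t+1) = U_i(t)
    omega_new = (A >= 0).astype(int)        # Equation (9)
    if np.array_equal(omega_new, omega):    # attractions are stationary
        break
    omega = omega_new

# With an optimistic start, coordination is set from the beginning and the
# population reaches the Pareto-optimal solution (here eta = 1).
print(f"learning time = {t}, eta = {omega.mean():.4f}")
```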
Figure 5. Myopic fictitious play (κ = 1, φ = 0, δ = 1). Attractions at convergence of the learning process at different prices, as a function of x_i = h_i − h. The prices for j = 1 correspond to three different regions of the phase diagram. The fractions of buyers are η = 1 for p − h = −1.1, η = 0.4195 for p − h = 0.5 and η = 0 for p − h = 3.1. For j = 4, when p − h = 2.0 the attractions converge to two different fixed points (with equilibrium fractions of buyers η = 0.0666 and η = 1), depending on the initial condition, whereas η = 1 for p − h = 1.5 and η = 0 for p − h = 3.0.
6.2. Time-averaged reinforcement learning

Within this learning scheme, κ = 1, δ = 0 and 0 < φ < 1, so that the attractions for buying are updated according to

A_i(t + 1) = φ A_i(t) + (1 − φ) U_i(t) if ω_i(t) = 1;  A_i(t + 1) = φ A_i(t) if ω_i(t) = 0    (17)

That is, non-buyers depreciate the attraction for buying by a factor φ. Buyers, on the contrary, use their payoffs for updating the attraction, which is a weighted average of the previous value and the actual payoff. In our simulations we used φ = 0.5.

In contrast with the learning rules presented above, with reinforcement-type learning rules the outcome of the system's behaviour strongly depends on the initial states, even when the equilibrium state is unique. Individuals in state ω_i = 0 do not use information about the payoffs they would have obtained upon buying. In other words, whenever an agent is led to make the decision ω_i = 0 during the learning process, the absolute value of his attraction for buying will begin to decrease with time, but will keep the (negative) sign that induced him not to buy the first time. Thus, the agent will persist in state ω = 0, independently of the environment's evolution. At the end of the learning process the attraction values converge towards fixed points that depend on the strategies at convergence. If the latter are ω_i = 1, the fixed points are the actual payoffs, A_i = U_i; but if ω_i = 0, then at convergence A_i = 0. This is illustrated by the distribution of the individuals' attractions as a function of their IWPs (Figure 6), and explains why the values of η at convergence are
Figure 6. Time-averaged reinforcement learning (κ = 1, δ = 0, φ = 0.5). Attractions at convergence of the learning process at three different prices, as a function of x_i ≡ h_i − h. See the caption of Figure 5 for the fractions of buyers at the different prices.
Figure 7. Time-averaged reinforcement learning (κ = 1, δ = 0, φ = 0.5). η at equilibrium versus p − h for j = 1 and 4. When not visible, error bars are smaller than the symbols' sizes. The full lines are the analytical predictions for the rational (Nash) equilibrium. Parallel and sequential dynamics give the same results.
systematically smaller than (or equal to) those at the Nash equilibria, as may be seen in Figure 7. The only exception is the optimistic initialization, because in that case all the individuals buy from the start. This allows them to estimate the corresponding attraction and eventually realize that the utility is negative. Since the updating is gradual, negative utilities do not necessarily produce an immediate change of sign in the learners' attractions. Therefore, learners continue to buy (thus exploring a losing strategy) for several periods before they switch to strategy ω = 0, explaining why with this initialization the behaviour of the system is similar to that with myopic learning. All the other quantities present the initialization-dependent behaviour that may be expected from Figure 7, but show strong dependencies on the value of φ.
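A one-line sketch of the update (17) makes the lock-in mechanism explicit (the naming is ours):

```python
import numpy as np

def reinforcement_update(A, omega, U, phi=0.5):
    """Equation (17): buyers average the actual payoff into the attraction;
    non-buyers only shrink their (negative) attraction by the factor phi,
    so its sign - and hence their decision - never changes."""
    return np.where(omega == 1, phi * A + (1 - phi) * U, phi * A)
```

Since a negative attraction decays towards 0 without ever changing sign, an agent who once decided ω_i = 0 never re-enters the market, which is why the simulated η stays at or below the Nash value.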
6.3. Weighted belief learning

Weighted belief learning is an intermediate scheme between fictitious play and reinforcement learning. The learning parameters of the simulations are κ = 0, δ = 0.5 and φ = 0.5. With this learning rule, after making his decision, each agent updates his attraction for buying according to

A_i(t + 1) = [1 − μ(t)] A_i(t) + μ(t) U_i(t) if ω_i(t) = 1;  A_i(t + 1) = [1 − μ(t)] A_i(t) + μ(t) δ U_i(t) if ω_i(t) = 0    (18)

with μ(t) given by (12). Old attractions are reduced at each step by a factor 1 − μ(t), which increases with time: the results of past experiences are gradually forgotten, meaning that agents value current experiences more than earlier ones. The agent updates the attraction for buying using the actually yielded utility, underweighting it by a factor δ < 1 if his actual strategy was not to buy.

This type of learning rule gives results similar to those of myopic fictitious play for equilibrium quantities like the fraction of buyers and the social influence. However, there is a quantitative difference in the learning time (see Figure 8) and, consequently, in cumulated quantities. The learning time is sensitive to the decay factor φ: the more strongly agents remember their past values, the slower they learn. Since agents weight the received and the hypothetical payoffs differently, the learning time is also affected by the weighting factor δ. When δ → 1, earned and forgone payoffs tend to be used on an almost equal footing. Agents use information more efficiently, closer to a fictitious-play setting, and consequently learning times are shortened.

At the end of the learning process the attractions converge towards the fixed points A_i = [δ + (1 − δ)ω_i] U_i. So, if ω_i = 1 they converge to A_i = U_i, but to A_i = δ U_i if ω_i = 0 (see the two different slopes in Figure 9). In any case, the agent always selects the best action, because the sign of the attraction is correct. However, we expect this misestimation of the utility of buying by non-buyers to have consequences for trembling-hand decisions, where the magnitudes, and not only the sign, of the attractions matter.

6.4. Time-decay fictitious play

The updating rule for time-decay fictitious play is

A_i(t + 1) = [φ/(φ + μ(t))] A_i(t) + [μ(t)/(μ(t) + φ)] U_i    (19)
with φ > 1, meaning that the learning rate decays to zero like φ^{−t}, independently of the initial value μ(0). Thus, for t ≫ (log φ)^{−1} the learning ability declines dramatically. Learning is stopped not because the utilities have converged to equilibrium values, but due to the time cut-off in the learning rate. The frozen values of the attractions at the end of the learning process depend crucially on the initial values A_i(0), and do not reflect generic properties of the system. The dramatic consequences of this early stopping can be seen in Figure 10 for the
Figure 8. Weighted belief learning (κ = 0, φ = 0.5, δ = 0.5). Learning time (averaged over 100 systems; error bars are smaller than the symbols) versus p − h for j = 1 and j = 4 (above: parallel dynamics; below: sequential dynamics).
Figure 9. Weighted belief learning (κ = 0, φ = 0.5, δ = 0.5). Attractions at convergence of the learning process at three different prices, as a function of x_i ≡ h_i − h. See the caption of Figure 5 for the fractions of buyers at the different prices.
Figure 10. Time-decay fictitious play (κ = 0, φ = 1.5, δ = 1). Attractions at convergence of the learning process for m = 0.1, 1 and 5, respectively, for the same prices as for the preceding learning rules, as a function of x_i ≡ h_i − h, with sequential dynamics.
sequential dynamics and in Figure 11 for parallel dynamics, where the frozen attraction values are plotted against the individual IWPs for the same prices as for the other learning rules presented previously, and for three different ranges m of initial values.
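The early-stopping mechanism can be seen directly by iterating Equation (8) with κ = 0 and φ = 1.5 (the value used here): after a handful of periods the learning rate is negligible and the attractions freeze. A minimal demonstration:

```python
phi, mu = 1.5, 1.0              # kappa = 0, phi > 1 (time-decay learning)
for t in range(10):
    mu = mu / (mu + phi)        # Equation (8) with kappa = 0
    print(t + 1, f"{mu:.3e}")   # decays roughly geometrically, like phi**(-t)
```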
Figure 11. Time-decay fictitious play (κ = 0, φ = 1.5, δ = 1). Attractions at convergence of the learning process for m = 0.1, 1 and 5, respectively, for the same prices as for the preceding learning rules, as a function of x_i ≡ h_i − h, with parallel dynamics.
At each value of m, the final attractions are much more spread out for sequential than for parallel updating. This reflects the fact that in each period of the sequential dynamics about 30% of the population update neither their choices nor their attractions. They keep their values of the preceding period, thus adding noise to the state of the system.
7. Conclusion

Learning through past experience is a plausible mechanism for social agents when they have to make decisions whose values are a function of the decisions of others. Depending on the amount of available information, the learning schemes may differ. The aim of this paper was to explore to what extent different learning rules may lead to different collective results in a simple market model where individuals have to decide whether or not to buy a given good at a posted price. We considered learning schemes where individuals must make their decisions repeatedly, and base them on attraction values they associate with the possible strategies. Starting with some initial guesses, learning amounts to modifying the attractions iteratively, after each decision, using information about the corresponding payoffs. Assuming that the population is homogeneous with respect to learning, i.e. that all the individuals use the same learning rule, we analysed myopic fictitious play, time-averaged reinforcement learning, weighted belief learning and time-decay fictitious play.

Depending on the strength of the social interactions, the learning times and the equilibria reached by the system may vary with the learning rule. With rules that use more information, like the different variants of fictitious play, the systems are more likely to reach the Nash equilibria than with reinforcement learning, because the latter only uses information concerning the actually played strategies. In fact, with reinforcement learning the system converges to non-optimal collective states, with fractions of buyers that depend on the initial states and that are systematically lower than the values at the corresponding Nash equilibria. Learning times are very different and strongly dependent on the parameters of the rules. When the system presents two possible Nash equilibria, learning times are dramatically larger than when the equilibrium is unique.

Among the open questions left by the present investigation, we are considering learning schemes where the individuals are assumed to know the additive structure of their utility function. They then need to learn an estimate of the fraction of buyers and use Equation (1) to calculate the attractions. Preliminary results show that the outcomes are qualitatively different from those presented here. We also restricted ourselves to deterministic decision rules: individuals choose their strategy according to the value of the attraction. Results of simulations with probabilistic (trembling-hand) decision rules, as well as analytical studies of these dynamics, will be published elsewhere.
Acknowledgements The authors acknowledge support from the ACI Complex Systems in Social and Human Sciences (CNRS and French Ministry of Education), project ELICCIR.
References

Ball, P. (2003), "The physical modelling of human social systems", Complexus, Vol. 1, pp. 190–206.
Blume, L.E. (1993), "The statistical mechanics of strategic interaction", Games and Economic Behavior, Vol. 5, pp. 387–424.
Blume, L.E. (1995), "The statistical mechanics of best-response strategy revisions", Games and Economic Behavior, Vol. 11, pp. 111–145.
Brock, W.A. and S.N. Durlauf (2001), "Discrete choice with social interactions", Review of Economic Studies, Vol. 68, pp. 235–260.
Brown, G.W. (1951), "Iterative solution of games by fictitious play", in: Activity Analysis of Production and Allocation, New York: Wiley.
Camerer, C.F. (2003), Behavioral Game Theory, Princeton, NJ: Princeton University Press.
Camerer, C.F. and T. Ho (1999), "Experience-weighted attraction learning in normal form games", Econometrica, Vol. 67, pp. 827–874.
Crawford, V.P. (1995), "Adaptive dynamics in coordination games", Econometrica, Vol. 63, pp. 103–143.
Durlauf, S.N. (1997), "Statistical mechanics approaches to socioeconomic behavior", pp. 81–104 in: B. Arthur, S.N. Durlauf and D. Lane, editors, The Economy as an Evolving Complex System II, Santa Fe Institute Studies in the Sciences of Complexity, Vol. XVII, Addison-Wesley.
Durlauf, S.N. (1999), "How can statistical mechanics contribute to social science?", Proceedings of the National Academy of Sciences, Vol. 96, pp. 10582–10584.
Erev, I. and A.E. Roth (1998), "Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria", The American Economic Review, Vol. 88(4), pp. 848–881.
Föllmer, H. (1974), "Random economies with many interacting agents", Journal of Mathematical Economics, Vol. 1(1), pp. 51–62.
Fudenberg, D. and D.K. Levine (1998), The Theory of Learning in Games, Cambridge, MA: MIT Press.
Glaeser, E. and J.A. Scheinkman (2003), "Non-market interactions", pp. 339–369 in: M. Dewatripont, L.P. Hansen and S. Turnovsky, editors, Advances in Economics and Econometrics: Theory and Applications, Eighth World Congress, Cambridge University Press.
Glaeser, E.L., B. Sacerdote and J.A. Scheinkman (1996), "Crime and social interactions", Quarterly Journal of Economics, Vol. CXI, pp. 507–548.
Gordon, M.B., J.-P. Nadal, D. Phan and J. Vannimenus (2005), "Seller's dilemma due to social interactions between customers", Physica A, Vol. 356(2–4), pp. 628–640.
Gordon, M.B., J.-P. Nadal, D. Phan and V. Semeshenko (2006), "Discrete choices under social influence: generic properties", in: 1st International Conference on Economic Sciences with Heterogeneous Interacting Agents
(WEHIA06), 15–17 June 2006, Bologna, Italy, http://www.dse.unibo.it/wehia/paper/parallel%20session_3/session_3.1/phan_3.1.pdf.
Granovetter, M. (1978), "Threshold models of collective behavior", American Journal of Sociology, Vol. 83(6), pp. 1360–1380.
Manski, C.F. (2000), "Economic analysis of social interactions", The Journal of Economic Perspectives, Vol. 14(3), pp. 115–136.
McKelvey, R.D. and T.R. Palfrey (1995), "Quantal response equilibria for normal form games", Games and Economic Behavior, Vol. 10, pp. 6–38.
Nadal, J.-P., D. Phan, M.B. Gordon and J. Vannimenus (2006), "Multiple equilibria in a monopoly market with heterogeneous agents and externalities", Quantitative Finance, Vol. 5(6), pp. 557–568. Presented at the 8th Annual Workshop on Economics with Heterogeneous Interacting Agents (WEHIA 2003).
Phan, D. (2004), "From agent-based computational economics towards cognitive economics", pp. 371–398 in: P. Bourgine and J.-P. Nadal, editors, Cognitive Economics, Springer.
Phan, D., M.B. Gordon and J.-P. Nadal (2004), "Social interactions in economic theory: an insight from statistical mechanics", pp. 335–358 in: P. Bourgine and J.-P. Nadal, editors, Cognitive Economics, Springer.
Phan, D., S. Pajot and J.-P. Nadal (2003), "The monopolist's market with discrete choices and network externality revisited: small-worlds, phase transition and avalanches in an ACE framework", in: Ninth Annual Meeting of the Society for Computational Economics, University of Washington, Seattle, USA, July 11–13, http://econpapers.repec.org/paper/scescecf3/150.htm.
Robinson, J. (1951), "An iterative method of solving a game", Annals of Mathematics, Vol. 54, pp. 296–301.
Saad, D. (1998), On-Line Learning in Neural Networks, Cambridge University Press.
Sarin, R. and F. Vahid (1999), "Predicting how people play games: a simple dynamic model of choice", Working Paper.
Schelling, T.S. (1971), "Dynamic models of segregation", Journal of Mathematical Sociology, Vol. 1, pp. 143–186.
Schelling, T.S. (1978), "Hockey helmets, concealed weapons, and daylight saving: a study of binary choices with externalities", The Journal of Conflict Resolution, Vol. XVII(3).
Schelling, T.S. (1978), Micromotives and Macrobehavior, New York: W.W. Norton and Co.
Sutton, R.S. and A.G. Barto (1998), Reinforcement Learning: An Introduction, Cambridge, MA: MIT Press.
Van Huyck, J., R. Battalio and R. Beil (1990), "Tacit coordination games, strategic uncertainty and coordination failure", American Economic Review, Vol. 80, pp. 234–248.
Young, H.P. (1998a), Individual Strategy and Social Structure: An Evolutionary Theory of Institutions, Princeton University Press.
Young, H.P. (1998b), "Individual learning and social rationality", European Economic Review, Vol. 42, pp. 651–663.
CHAPTER 9
Cognition, Types of "Tacit Knowledge" and Technology Transfer

Andrea Pozzali and Riccardo Viale

Abstract

The concept of "tacit knowledge" is widely used in the economics literature to refer to knowledge that cannot be codified and cannot be treated as "information". As the usage of the term grows, so does the number of critics who consider much of the debate around "tacit knowledge" to be lacking in both analytical precision and empirical content. If we want the concept of "tacit knowledge" to be really useful as a tool for developing better explanations of economic and social phenomena, a clear foundation for it is needed. From a purely epistemological point of view, tacit knowledge can in fact take many different forms. The distinction of knowledge into three broad categories (competence, acquaintance and correct information), which is commonly used in epistemology, holds for tacit knowledge too. Different types of tacit knowledge can be learned and transferred in different ways and can therefore play different roles in technology transfer processes.

Keywords: codification, skills, tacit knowledge, technology transfer

JEL classifications: O30, O31

1. Introduction

The concept of "tacit knowledge", introduced into the modern epistemological literature thanks to the seminal work of the scientist and philosopher of science
Michael Polanyi (1958, 1967), has experienced over the years an ever-widening application in a growing number of disparate disciplines, ranging from psychology to mathematics, from econometrics to religious thought, and from aesthetics to evolutionary economics. If we limit our considerations to literature of a strictly economic type, we find, for example, "tacit knowledge" used as an explanatory concept in studies on organisational learning (Nelson and Winter, 1982; Fransman, 1994; Cohen et al., 1996; Grant, 1996; Marengo et al., 2000), on knowledge management (Nonaka and Takeuchi, 1995; Baumard, 1999; von Krogh et al., 2000), on the role of technology in economic development (Metcalfe, 1988; Kogut and Zander, 1992; Senker, 1995; Nightingale, 1998, 2000; Balconi, 2002; Koskinen and Vanharanta, 2002), and on technology transfer and innovation models (Faulkner and Senker with Velho, 1995; Howells, 1996; Gorman, 2002).

As could be expected, with the expansion of the use of the term, critical voices have also multiplied, based particularly on two aspects. On the one hand, the concept of tacit knowledge is said to have been used indiscriminately in too heterogeneous a series of contexts, without concern for reaching any conceptual clarification of the meaning to be attributed to the concept itself. As a result, the term "tacit knowledge" has become less precise and more vague: it can be used in many different instances, but as a matter of fact it lacks any effective explanatory value: "… the terminology and meaning of 'tacitness' in the economics literature, having drifted far from its original epistemological and psychological moorings, has become unproductively amorphous; indeed, (…) it now obscures more than it clarifies" (Cowan et al., 2000, p. 213).
On the other hand, the ever greater capillary diffusion of information and telecommunication technologies (ICTs) should increase the capacity for codification of information and therefore strongly limit the domain of "tacit knowledge". According to this viewpoint, in principle all knowledge can be codified to some degree: it is only the different cost/benefit structures associated with the codification operation that determine whether a given piece of knowledge remains tacit and unexpressed. Recourse to new-generation information technologies, by altering the cost/benefit structures associated with codification, would therefore make possible the codification of an ever greater amount of knowledge, leading to a notable drop in the empirical relevance of "tacit knowledge": "… falling costs of information transmittal, deriving in large part from computer and telecommunications advances, have lately been encouraging a general social presumption favouring more circulation of timely information and a reduced degree of tacitness" (Dasgupta and David, 1994, p. 502; on the same subject, see also Foray and Cowan, 1997; Foray and Steinmueller, 2003).
It seems evident that these two criticisms, though connected, should actually be interpreted separately: the first appears, in fact, to be a conceptual criticism, which concentrates on the vague and ambiguous character of the concept of “tacit knowledge” and on its consequently limited explanatory value; whereas the second is a more empirical criticism, which asserts that the concept of “tacit knowledge”,
however it is to be understood, is destined to have less and less relevance in the future, because of the increased social capacities of knowledge codification. The contributions that have tried to reply to these criticisms, defending the validity of the concept of “tacit knowledge” and arguing that its empirical relevance is not destined to diminish even in the contemporary information society (Johnson et al., 2002; Nightingale, 2003), have in some respects neglected to take this distinction into consideration, dealing conjointly with theoretical and empirical aspects.1 It is our opinion that a better argument can be developed in favour of the empirical and theoretical relevance of the concept of tacit knowledge if the two aspects of the problem are faced separately, and this is the objective that we have set ourselves in the present paper.

1 The paper of Johnson et al. (2002) concentrates above all on a theoretical criticism of the use made by Cowan et al. (2000) of the concept of “codification”, arguing for complementarity (and not substitutability) between tacit knowledge and codified knowledge. Nightingale's paper (2003) is more articulated: it considers a wider range of bibliographical references and accompanies its theoretical arguments with a considerable number of empirical supports, taken mostly from recent cognitive literature on procedural memory and implicit learning mechanisms. In his case too, however, the theoretical and the empirical aspects are woven together and not kept separate.

The work is structured in the following way: Section 2 is dedicated to a conceptual clarification that aims at pointing out how the term “tacit knowledge” can in fact refer to forms of knowledge that are very different from one another. We must therefore not speak of “tacit knowledge” tout court, but rather of different forms of tacit knowledge, each distinguished by specific characteristics. Only after differentiating the various forms of tacit knowledge will it be possible to develop a critical review of the thesis that the diffusion of information technologies, by increasing the possibility of codification, is destined to limit the future field of empirical applicability of the concept of tacit knowledge: this critical review is presented in Section 3. Finally, in the fourth and last part of the work we analyse, with reference to the reflections conducted in the previous sections, what role tacit knowledge can effectively play in technology transfer processes.

2. Different types of tacit knowledge

The need to develop more detailed taxonomies of the characteristics that can be attributed to tacit knowledge has been unanimously recognised in the literature, both by those scholars who continue to acknowledge the importance of the role of tacit knowledge (Johnson et al., 2002) and by those who hold more critical positions (Cowan et al., 2000). One distinction that has long contributed to orienting the debate, and that goes back to the work of Ryle (1949/1984), is the one between know how and know that, or in almost equivalent terms between procedural and declarative knowledge (Anderson, 1983). This distinction is relevant here because tacit knowledge has long been in a certain way confined to the domain of know how, as a component of skills and physical abilities. More recent contributions have tried to arrive at more refined classifications, as in the case of Gorman (2002), who identifies four categories: information (know what), skills (know how), judgment (know when) and wisdom (know why). Johnson et al. (2002) likewise identify four categories, substituting know who for know when:

Know what – indicates knowledge regarding “facts”, assimilable to so-called “information”. This type of knowledge is easily codified and communicated, thanks also to its decomposability into many elementary components or “raw data”;

Know why – refers to knowledge related to principles and general laws present in nature, in society and in the human mind;

Know how – indicates skills, understood, however, not in the limited sense of merely physical abilities, but in a general sense as “the capacity to do something”, which can also present theoretical and abstract elements: “Even finding the solution to complex mathematical problems is based on intuition and on skills related to pattern recognition that are rooted in experience-based learning rather than on the carrying out of a series of distinct logical operations (Ziman, 1979, pp. 101–102)” (Johnson et al., 2002, p. 250).
Know who – encompasses all the knowledge related to “who knows what”, that is, the capacity to identify within the whole available knowledge base the most appropriate expertise for solving given problems: “The general trend towards a more composite knowledge base, with new products typically combining many technologies, each rooted in several different scientific disciplines, makes access to many different sources of knowledge more essential” (Johnson et al., 2002, p. 251).
All these classifications share a common method, which consists in identifying a series of types of knowledge and subsequently indicating to what extent the single types can be considered more or less codifiable. In this sense, the classic distinction between know how and know that represented a sort of alternative formulation (or, if one prefers, a specification) of the tacit knowledge/explicit knowledge dichotomy. In fact, know how ended up being identified as the only field where it was possible to track down forms of tacit knowledge, while know that was considered almost totally explicit. The subsequent classifications by Gorman and by Johnson, Lorenz and Lundvall represent a notable advance in the debate, as they both recognise that forms of tacit knowledge, far from being confined exclusively to the context of know how, can also be traced in other types of knowledge. None of these classifications, however, has tried to analyse the possibility that “tacit knowledge”, far from representing
a concept that defines a perfectly homogeneous series of phenomena, can take on internal distinctions, or, put more simply, that different types of tacit knowledge can exist.2 Distinguishing these different types is important for two reasons: first, because this classification is a prerequisite for conducting more detailed empirical analyses, and second, because different types of tacit knowledge can be learned, and consequently transmitted, through different mechanisms. To carry out such an analysis it is useful to refer directly to the classic tripartition of forms of knowledge in use in the epistemological literature (Lehrer, 1990; Dancy and Sosa, 1992), which distinguishes competential knowledge (ability), direct knowledge (knowledge as familiarity) and propositional knowledge (justified true belief, or knowledge as “correct information”). In a similar way, tacit knowledge can be classified into the following three categories:

Tacit knowledge as competence: this class includes all the forms of physical abilities and skills that refer to the capacity of a subject to perform certain activities without being able to describe the knowledge used to do the task. This type of tacit knowledge can have an automatic and unreflective character (for example, knowing how to breathe) or it can be the fruit of a conscious learning or training process (for example, knowing how to play the piano). This kind of tacit knowledge operates in particular in physical abilities such as swimming or riding a bicycle: in all these skilful performances, the activity is carried out by following a set of rules that are not explicitly known by the person following them. In other words, a person is usually able to ride a bicycle or to swim even if he does not know how he is able to do it. The same holds for more complicated and less common abilities that are at the base of the development of craftsmanship (for example, the ability to make a violin) and of technological innovations (such as nuclear weapons, cf. MacKenzie and Spinardi, 1995, or aircraft, cf. Vincenti, 1990). In all these cases the actual practice, that is, the ability to carry out the given activity, cannot be described correctly in all its details; even when a description can be formulated, it is always incomplete and is not enough to allow for knowledge transfer:3 “Rules of art can be useful, but they do not determine the practice of an art; they are maxims, which can serve as a guide to an art only if they can be integrated into the practical knowledge of the art. They cannot replace this knowledge” (Polanyi, 1958, p. 50).
2 Gorman's classification admits the possibility that tacit knowledge may be present as a constitutive element of a series of different types of knowledge, such as, for example, heuristics, mental patterns, physical abilities, moral imagination and so on, but it does not specify concretely the modalities by which this can take place.
3 Incidentally, this explains why, even today, with all the modern technology at our disposal, we are still unable to recreate or emulate Stradivari's mastery in violin making!
Polanyi defines this type of manual ability as the capacity to physically carry out a predefined series of actions in order to complete a complex activity. The classic example, used also to introduce an important distinction between subsidiary awareness and focal awareness, starts from the observation of an apparently simple operation, like hitting a nail with a hammer: “When we use a hammer to drive in a nail, we attend to both the nail and hammer, but in a different way. We watch the effect of our strokes on the nail and try to wield the hammer so as to hit the nail most effectively. When we bring down the hammer we do not feel that its handle has struck our palm but that its head has struck the nail. Yet in a sense we are certainly alert to the feelings in our palm and the fingers that hold the hammer. They guide us in handling it effectively, and the degree of attention that we give to the nail is given to the same extent but in a different way to these feelings. The difference may be stated by saying that the latter are not, like the nail, objects of our attention, but instruments of it. They are not watched in themselves; we watch something else while keeping intensely aware of them. I have a subsidiary awareness of the feeling in the palm of my hand which is merged into my focal awareness of my driving in the nail” (Polanyi, 1958, p. 55).
The two forms of awareness are mutually exclusive. Shifting our focal awareness from the general nature of a determined action to the single details that the action is composed of produces in us a sort of “self-consciousness” that can act as an impediment, making it impossible for us to go on doing the action we have undertaken. This is what happens, for example, to a pianist when he shifts his focal awareness from the piece he is playing to the details of the movements of his hands: it is likely that at this point he will become confused to the point that he has to interrupt his performance. What is destroyed, in these cases, is the sense of context: “Here again the particulars of a skill appear to be unspecifiable, but this time not in the sense of our being ignorant of them. For in this case we can ascertain the details of our performance quite well, and its unspecifiability consists in the fact that the performance is paralysed if we focus our attention on these details. We may describe such a performance as logically unspecifiable, for we can show that in a sense the specification of the particulars would logically contradict what is implied in the performance or context in question” (Polanyi, 1958, p. 56).
In the performance of complex tasks, therefore, we have a focal awareness only of some central details of the different operations being performed, while the rest of the details are left to subsidiary awareness. It is precisely the interaction between the different forms of awareness that enables us to perform our various activities, which by their nature could not be performed in a fully “self-conscious” manner. Tacit knowledge in the form of competence underlies the already mentioned concepts of “know how” (Ryle, 1949/1984) and of procedural knowledge (Anderson, 1983). This knowledge type also has an important role in the development of scientific and technological innovations, as has been pointed out in numerous works in the sociology of science and in the history of technology (Cambrosio and Keating, 1988; Vincenti, 1990; Collins, 1992, 2001; Jordan and Lynch, 1992; MacKenzie and Spinardi, 1995; Pinch et al., 1996). In the economic
field, the work of Nelson and Winter (1982) is the classic reference for the analysis of the importance of tacit skills in evolutionary economics and in the organisational capabilities approach to the theory of the firm.

Tacit knowledge as background knowledge (or as familiarity): in this class we find all those forms of interiorised regulations, codes of conduct, values and widespread knowledge that a given subject knows thanks to his direct experience. This knowledge cannot be articulated or formalised because of its extremely dispersed nature, which makes it difficult for conscious awareness to access it. This type of tacit knowledge has more than one affinity with the notion of background, introduced by Searle to find a solution to the problem of retrieving a stable foundation for the process of interpretation of rules and representations, or, in more precise terms, to prevent this process from turning into an infinite regress (Searle, 1992, 1995). Background is defined as the set of biological and cultural capacities, assumptions, presuppositions and pre-theoretic convictions that are the preconditions of any form of theoretical knowledge. Even if background is a very complex structure, which has been the object of many reinterpretations and redefinitions, even by Searle himself, it is possible, in any case, to find some significant overlap between it and the concept of “knowledge as familiarity”, especially if we consider those components of the “background” whose acquisition is mediated by processes of socialisation and acculturation (and therefore, in the final analysis, by experience understood in a broad sense). On the other hand, this type of tacit knowledge also has many points of contact with the “pre-theoretical” knowledge on which the analyses of sociologists of knowledge like Berger and Luckmann concentrate: “Theoretical thought, ‘ideas,’ Weltanschauungen are not that important in society. Although every society contains these phenomena, they are only part of the sum of what passes for ‘knowledge.’ Only a very limited group of people in any society engages in theorizing, in the business of ‘ideas,’ and the construction of Weltanschauungen. But everyone in society participates in its ‘knowledge’ in one way or another. Put differently, only a few are concerned with the theoretical interpretation of the world, but everybody lives in a world of some sort” (Berger and Luckmann, 1966, p. 15).
Every modern society is characterised by a huge amount of this kind of tacit knowledge, dispersed among the individual members of the society and transmitted from one generation to another through an endless and continuous process of socialisation. It appears evident that the analysis of Berger and Luckmann is in many respects less persuasive than Searle's, especially where they speak of “objective structures of the social world” and define pre-theoretic knowledge as the pure and simple “total sum of what everyone knows”. If the analysis of the role of “background” tacit knowledge is to gain an effective explanatory role and not remain a purely descriptive concept, we should always try to bring our focus down to the individual level, analysing how the cognitive capacities of the single individual filter and recombine the set of pre-existing social knowledge. The work of Searle on the “construction of social reality”
offers some interesting methodological cues in this context, showing how the “objective structures of the social world” of which Berger and Luckmann speak can in fact be analysed and described as the fruit of thought and language processes that take place in individual minds. It seems reasonable to assume that tacit background knowledge, given its dispersed and informal character, seldom plays a direct causal role in shaping decisions and behaviours at the individual level. Nevertheless, it can act as a reference point and as an inevitable filter between the individual and the social level. If we want to find, in the economic literature, a sort of counterpart to this type of knowledge, we may look, for example, at the concept of “social capital” (Woolcock, 1998). More generally, we can also think that tacit background knowledge may be one of the constitutive elements of all those forms of knowledge that are embedded in a specific social, political and/or geographical context (Granovetter, 1985; Saxenian, 1994; Lawson and Lorenz, 1999) and that are used in many cases as explanatory variables in the analysis of differences in competitive performance at the local level.

Tacit knowledge as implicitly held cognitive rules: following the epistemological classification we have proposed as a reference point, we now come to the problem of finding a kind of tacit knowledge that can be considered an analogue of “knowledge as justified true belief” or as “correct information”. From a certain point of view, this can be considered an impossible task: how can we conceive, in fact, of an individual possessing a “tacit propositional knowledge”? How can we ascertain that the knowledge a subject has can be considered a “justified true belief”, if this knowledge is tacit, that is, if the subject is not able to express and formulate it? How can a person hold “tacit beliefs”? These are just some of the questions that immediately arise when one begins to envision a type of tacit knowledge that is not merely a physical ability or a social, background knowledge. As a matter of fact, the possibility of tacit knowledge also having a cognitive dimension was for many years substantially ruled out in epistemology and in the cognitive sciences. The only way of considering tacit knowledge was limited to admitting that it could have a role in skill-like abilities. Other forms of tacit knowledge seemed to represent nothing more than a logical absurdity. In the last few years this kind of veto on a form of “tacit cognition” has begun to waver, thanks in particular to the empirical and theoretical evidence coming from cognitive psychology and from the neurosciences. The first and perhaps the most significant example of a form of tacit knowledge that cannot be considered either a physical-type skill or a form of “social capital” is linguistic knowledge (Chomsky, 1986, pp. 263–273). This form of knowledge does not represent, in a strict sense, a form of skill, but must be considered an actual cognitive system, defined in terms of mental states and structures that can neither be articulated in words nor described in a complete formal language. The completely tacit nature of this linguistic knowledge is such
that a language, in fact, cannot be “taught”, but must be more properly “learned” by subjects: “Language is not really taught, for the most part. Rather, it is learned, by mere exposure to the data. No one has been taught the principle of structure-dependence of rules (…), or language-specific properties of such rules (…). Nor is there any reason to suppose that people are taught the meaning of words. (…) The study of how a system is learned cannot be identified with the study of how it is taught; nor can we assume that what is learned has been taught. To consider an analogy that is perhaps not too remote, consider what happens when I turn on the ignition in my automobile. A change of state takes place. (…) A careful study of the interaction between me and the car that led to the attainment of this new state would not be very illuminating. Similarly, certain interactions between me and my child result in his learning (hence knowing) English” (Chomsky, 1976, p. 161).
Moreover, not only the acquisition but also the use of linguistic knowledge seems to involve not a reference to the formalised rules of language, but rather an automatic and mostly unconscious recourse to the acquired abilities: “the knowledge of grammatical structures (…) is not present in a conscious way in most of the cases where we use the language effectively and perfectly correctly” (Damasio, 1999, p. 357).4 Other examples of cognitive forms of tacit knowledge, neither skill-like nor background-like, come from the substantial number of studies on implicit learning processes (Reber, 1993; Cleeremans, 1995; Cleeremans et al., 1998), in particular those relating to experiments on artificial grammar and probabilistic sequence learning.5 The typical artificial grammar learning experiment consists in giving subjects a series of alphanumeric strings, some generated from a hidden grammatical structure, others completely random. After completing this phase, subjects are given other alphanumeric strings and are asked to distinguish between the grammatical and the non-grammatical ones. The results show that subjects are able to perform this recognition task successfully, though they are unable to explain in an articulated form the logical path that led them to these results, nor can they describe the characteristics of the hidden grammatical structure. Even more interesting experiments, with a similar structure, are those related to the control of complex systems, in which a subject is asked to maximise an unknown function by selecting the values to be attributed to specified variables (Broadbent et al., 1986). On the whole, it is possible to say that research on implicit learning shows how subjects are able to make use of the hidden structural characteristics that make up the essence of a given phenomenon, though they are not able to attain complete and explicit knowledge of these same characteristics.
4 Even if in certain cases it is possible to admit that, in the case of language, we can arrive at the formulation of an explicit rule, the fact remains that the total formalisation and codification of linguistic knowledge has not yet been achieved, in spite of the considerable research efforts expended over the years.
5 To remain in the field of the neurosciences, further empirical evidence supporting the role of tacit knowledge in individual cognitive processes comes also from research on implicit memory and perception phenomena (cf. Atkinson et al., 2000; Raichle, 1998; Zeman, 2001).
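To make the paradigm concrete, the sketch below (in Python) simulates an artificial grammar task of the kind just described. It is a minimal illustration under invented assumptions: the finite-state transition table, the letter set and the numbers of study and test strings are hypothetical, not taken from Reber (1993) or any specific experiment. What the sketch isolates is the contrast between the explicit membership test available to the experimenter (the is_grammatical function) and the subjects, who classify new strings above chance without being able to state any such rule.

import random

# A small finite-state grammar in the spirit of artificial grammar learning
# experiments: states map to (letter, next-state) arcs, and a string counts
# as "grammatical" if some walk from START to FINAL produces it. The table
# itself is a hypothetical example.
GRAMMAR = {
    0: [("T", 1), ("P", 2)],
    1: [("S", 1), ("X", 3)],
    2: [("T", 2), ("V", 4)],
    3: [("X", 2), ("S", 5)],
    4: [("P", 3), ("V", 5)],
}
START, FINAL = 0, 5
LETTERS = "TPSXV"

def generate_grammatical(rng):
    # Study-phase material: a random walk through the grammar.
    state, letters = START, []
    while state != FINAL:
        letter, state = rng.choice(GRAMMAR[state])
        letters.append(letter)
    return "".join(letters)

def generate_random(rng, length):
    # Foil strings with no underlying structure (a few may be
    # grammatical by sheer chance).
    return "".join(rng.choice(LETTERS) for _ in range(length))

def is_grammatical(string):
    # The explicit rule the experimenter knows but the subject cannot state:
    # nondeterministic search over all state paths consistent with the string.
    states = {START}
    for letter in string:
        states = {nxt for s in states
                  for (l, nxt) in GRAMMAR.get(s, []) if l == letter}
        if not states:
            return False
    return FINAL in states

rng = random.Random(42)
study_items = [generate_grammatical(rng) for _ in range(20)]   # exposure phase
test_items = [generate_grammatical(rng) for _ in range(5)] + \
             [generate_random(rng, 6) for _ in range(5)]       # test phase
for item in test_items:
    print(f"{item:12s} grammatical: {is_grammatical(item)}")

In the actual experiments, subjects see only material like study_items; at test they discriminate new strings above chance while remaining unable to articulate anything resembling is_grammatical, which is precisely what qualifies the acquired knowledge as tacit.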
The knowledge that enables the subjects of implicit learning experiments to obtain these results can be considered, together with linguistic knowledge, as a type of tacit knowledge that is neither a purely physical “skill” nor a form of “familiarity” or “background” knowledge. Obviously, we cannot in any case consider it a type of “justified true belief”, or a “propositional knowledge”, for the reasons already explained. How, then, can we try to define it? We propose to define this kind of tacit knowledge as implicitly held cognitive rules that can guide the actions and decisions of a subject while at the same time remaining confined to the tacit domain. Since admitting that a cognitive rule can be implicitly held may be a highly controversial point, a clarification is needed here. The problem seems to lie in the fact that the representational theory of mind, which can be considered the mainstream in cognitive science, in a certain way requires that, in order to be causally efficacious, representations have to be tokened in a conscious way. The evidence coming from implicit learning research, but also from recent studies on phenomena of implicit memory and subliminal perception, should make us consider more deeply the possibility that not all knowledge needs to be tokened in order to play a causal role, as Cleeremans and Jiménez clearly state: “We suggest to eliminate the ‘knowledge box’ as a requirement for the definition of knowledge, and to assume that representations can simultaneously constitute knowledge and be causally efficacious without ever being tokened in any way. For instance, observing that ‘butter’ has been perceived in a subliminal perception experiment because it exerts detectable effects on performance does not imply that the property of ‘butter’ has been somehow unconsciously represented in the subject's knowledge box (…). It simply means that the relevant neural pathways were activated sufficiently to bias further processing in the relevant direction when the stem completion or lexical decision task is actually performed. The knowledge embedded in such pathways is knowledge that is simultaneously causally efficacious and fully implicit” (Cleeremans and Jiménez, 1999, p. 771).
The type of tacit knowledge subjects seem able to develop in implicit learning experiments is knowledge that cannot be expressed and that at the same time surely has a direct causal impact on subjects' decisions and performances. We can consider it a kind of tacit analogue of other well-known cognitive mechanisms such as pragmatic schemes, heuristics, mental models and so on. Since it is knowledge able to influence the decisions made by the subject, it is a real cognitive rule that is held in an implicit way. For this reason we propose to categorise it as implicitly held cognitive rules. Even if empirical research on this type of tacit knowledge is still largely lacking, we suspect that it may be an important element in the development of the heuristics, rules of thumb and case-based expertise that are commonly used in decision-making processes (Gigerenzer, 2000). In the economic literature, we might find this type of tacit knowledge as one of the components of “expert knowledge” and of “organisational routines” (Nonaka
and Takeuchi, 1995; Cohen et al., 1996). We believe the clarification of these elements to be one of the main future topics for the advancement of tacit knowledge research in both cognitive science and economics.

3. Modalities of acquisition, codification and transfer of different types of tacit knowledge

The distinction between different types of tacit knowledge is a useful heuristic instrument for developing deeper and more accurate empirical analyses. Compared to alternative distinctions, like for example the one by Collins (2001),6 the one we are proposing has the advantage of dividing tacit knowledge into three distinct forms, each of which can be easily detected empirically and characterised on the basis of its specific mechanisms of acquisition, codification and transfer. As for the mechanisms by which the different forms of tacit knowledge can be acquired and transmitted, we can indicate the following points (which could be aspects worth further empirical analysis):

– tacit knowledge as competence (skills, know-how) can be learned and transmitted fundamentally through processes of imitation and apprenticeship based on face-to-face interaction and on-the-job learning by doing/learning by using (Nelson and Winter, 1982; Anderson, 1987; for a description of the neurological processes that seem to be involved in the acquisition of skill-like abilities and other similar physical competences, see Passingham, 1997; Petersen et al., 1998);

– tacit knowledge as background knowledge is acquired, as we have seen, mainly through processes of socialisation (to which we can in some cases add mechanisms of implicit learning); the same mechanisms are at the base of the
6 Collins distinguishes five types of tacit knowledge: concealed knowledge, mismatched salience, ostensive knowledge, unrecognised knowledge and uncognized/uncognizable knowledge. Concealed knowledge encompasses all those tricks of the trade, rules of thumb and practical stratagems that are part of scientists' experience and that are normally not included in scientific publications and papers. Mismatched salience has to do with the fact that the development of new scientific knowledge usually involves an indefinite number of potentially important variables. Not all the possible variables have the same relevance, and different scientists can attribute different importance to the same things. As this differential attribution is quite often made in a semi-automatic manner, a scientist can have some difficulty in explaining it to other people. Ostensive knowledge is knowledge that cannot be transmitted with words or formulas, but only by direct pointing or demonstrating (as in the case of the interpretation of radiographs and other images). Unrecognised knowledge refers to the possibility that a scientist can sometimes perform aspects of an experiment in a certain way without realising their importance. Uncognized/uncognizable knowledge refers to all those activities that are carried out in an automatic and unconscious way. Of these, concealed knowledge is a type of knowledge that has a tacit character only on the basis of motivational factors tied to the specific interests of the subject possessing the knowledge, while uncognized/uncognizable knowledge is difficult to detect empirically. Among the remaining three categories, the one that takes on the most important role is ostensive knowledge, which is substantially a further specification of the concept of skill-like knowledge.
circulation and transmission of this type of tacit knowledge within a given social, economic and institutional context; and

– tacit knowledge as implicitly held cognitive rules is acquired through processes of implicit learning like the ones recalled above (Berry, 1987; Berry and Broadbent, 1988; Berry and Dienes, 1991; Reber, 1993; Dienes and Berry, 1997). The mechanisms that allow the transmission of this type of knowledge have not yet been analysed in a thorough manner. One of the first objectives of current research on tacit knowledge should be precisely the study of this particular field of analysis.

The aspect related to the codification mechanisms of the different forms of tacit knowledge deserves deeper study, as it is tied to the debate on the influence of the current revolution in ICTs on the extension of the empirical domain of tacit knowledge. As we recalled above, this debate centres on the possibility that developments in information and communication technologies could significantly extend the realm of explicit knowledge and confine tacit knowledge to an increasingly marginal role (Dasgupta and David, 1994; Foray and Cowan, 1997; Foray and Steinmueller, 2003). As correctly pointed out by Johnson, Lorenz and Lundvall, this type of reasoning cannot be conducted in the abstract, but must take into account the fact that different forms of knowledge can have different degrees of codifiability. So, while the use of databases, semantic search engines, neural networks and other similar mechanisms of information representation and archiving can be effective when applied to know what, the problems are greater when we want to try to codify and transmit forms of know how or of know who: “Know-how is the kind of knowledge where information technology faces the biggest problems in transforming tacit or non-explicit knowledge into an explicit, codified format. The outstanding expert – cook, violinist, manager – may write a book explaining how to do things, but what is done by the amateur on the basis of that explanation is, of course, less perfect than what the expert would produce. (…) Know-who refers to a combination of information and social relationships. Telephone books that list professions and databases that list producers of certain goods and services are in the public domain and can, in principle, be accessed by anyone. In the economic sphere, however, it is increasingly important to obtain quite specialized competencies and to locate the most reliable experts, hence the enormous importance of good personal relationships with key persons one can trust. Electronic networks cannot substitute for these social and personal relationships” (Johnson et al., 2002, pp. 252–253).
The fundamental problem with the analysis of Johnson, Lorenz and Lundvall is that their classification of knowledge into know what, know how, know who and know why, lacking a solid cognitive and epistemological base, is not particularly effective. In the first place, it is evident that the category of “know who” has little relation to the problem of tacit knowledge and is more tied to the social modalities of the organisation and transmission of knowledge itself. In the second place, within a single category we can find different forms of tacit knowledge, as is especially evident in the case of know how, which certainly encompasses skills (tacit knowledge as competence) but also includes rules of thumb,
heuristics and case-based knowledge, which should more properly be considered implicitly held cognitive rules. More precise and detailed indications on the different forms of codifiability and transmission of tacit knowledge come from the work of Margherita Balconi, who analyses in great detail how processes of codification have taken place in recent years in different industrial sectors (steel, semiconductors, mechanical engineering). Balconi correctly points out that different forms of complementarity/substitutability can exist between tacit knowledge and ICTs: while some types of tacit knowledge can be substituted by ICTs, others have to be considered complementary to ICTs: “Tacit skills which have been substituted by codified know how and have become obsolete in most modern manufacturing processes, are those relying on the perceptions of sensory organs or manual ability. (…) Either their tacit knowledge has been codified and the execution of their activity assigned to a machine/instrument, or a technological innovation has changed the production process and made their specific knowledge obsolete. (…) Tacit skills which complement codified and automated manufacturing processes are those heuristics and interpretative skills which serve to decode and assign meaning to information-bearing messages (structured data inputs, codified know-how) and to create novelties” (Balconi, 2002, pp. 361–362).
The tacit knowledge that can be easily codified is made up, according to Balconi, of craftsmanship, and it is acquired and transmitted through on-the-job learning and apprenticeship. In the classification we have proposed, this type of knowledge is definable as tacit knowledge as competence. A study we conducted on innovation in the biotechnology sector enabled us to collect empirical evidence showing that, even in high-tech sectors, this type of knowledge plays an important role in innovation processes but is also relatively easy to transfer to other subjects (Viale and Pozzali, 2003). There are two principal methods by which this transmission can take place:

– the embodiment of the subject's tacit knowledge inside an automatic device that mimics the subject's performance step by step; and

– the construction of algorithms that use the calculating power of an electronic processor to carry out, wherever possible, computationally highly complex processes which manage to achieve the same results that subjects achieve by using physical and perceptive abilities impossible to implement in a technological device.

What is even more interesting in Balconi's work, however, regards the second aspect, the one related to tacit knowledge that must be considered complementary to ICTs, and not substitutable or codifiable. This type of knowledge is made up of heuristics of judgment, specific problem-solving abilities and individual intuitive capacities that have a specifically cognitive character, at the base of which we can trace a precise correlation of a neurological type: “These categories draw upon the way the human brain functions, on the basis of pattern matching (Ginsberg, 1998). Humans have a clear advantage over computers in those situations that need to be addressed through a method of pattern matching instead of computing” (Balconi, 2002, p. 362).
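The computing/pattern-matching contrast can be sketched in a few lines of code. Everything in the sketch below is invented for illustration: a hypothetical quality-control task, made-up sensor readings and a made-up threshold. One routine applies an explicit, codified rule; the other judges a new case only by its resemblance to previously experienced cases, a rough computational stand-in for the case-based judgment Balconi describes.

import math

# Hypothetical quality-control experience: (temperature, vibration) readings
# paired with an expert's verdict. The values are invented.
experience = [
    ((180.0, 0.20), "ok"), ((182.0, 0.30), "ok"), ((181.0, 0.25), "ok"),
    ((185.0, 0.90), "faulty"), ((178.0, 1.10), "faulty"), ((184.0, 1.00), "faulty"),
]

def by_rule(case):
    # "Computing": an explicit rule that can be written down and handed over.
    temperature, vibration = case
    return "faulty" if vibration > 0.6 else "ok"

def by_pattern(case, k=3):
    # "Pattern matching": classify the new case by its similarity to past
    # cases; no articulated rule is consulted, only stored experience plus
    # a distance measure (majority vote among the k nearest cases).
    nearest = sorted(experience, key=lambda ex: math.dist(ex[0], case))[:k]
    verdicts = [verdict for _, verdict in nearest]
    return max(set(verdicts), key=verdicts.count)

new_case = (183.0, 0.85)
print("rule says:", by_rule(new_case), "| pattern says:", by_pattern(new_case))

Of course, once written, the matcher is itself codified; the sketch is meant only to locate where the two styles of judgment differ, not to claim that human pattern matching reduces to nearest-neighbour search.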
218
Andrea Pozzali and Riccardo Viale
The suggestions that come from the empirical research conducted by Balconi can be placed within that line of reflection on the problems of tacit knowledge that sees in the sphere of pattern matching and signalling activities (that is, the activation of a given behavioural or cognitive response upon repeated exposure to a series of external stimuli characterised by structural regularity: cf. Dewey and Bentley, 1949) an extremely promising field of research. Within this field it is possible, in fact, to find examples of those cognitive forms of tacit knowledge (that is, tacit knowledge as implicitly held cognitive rules) that are acquired and transmitted through processes of implicit learning like the ones recalled above. This type of tacit knowledge is not easily codified or transmitted, and ICTs, in this sense, are not of great help: “The crucial role of problem solving points to the fact that codification of technological knowledge does not absolutely imply that human tacit competences – meaning knowledge and abilities that are inherently embodied in individuals – have ceased to be important” (Balconi, 2002, p. 359).
It is precisely this type of tacit knowledge that represents a kind of “cognitive bottleneck”, which the economic literature and studies on technological innovation and technology transfer processes will inevitably have to consider.

4. Conclusions: tacit knowledge as an explanatory factor in the configuration of systems for technology transfer

In the preceding sections we have considered the problem of tacit knowledge, limiting our analysis to its purely cognitive and epistemological aspects. Entering into the debate on the “codification” of tacit knowledge (Foray and Cowan, 1997; Cowan et al., 2000; Johnson et al., 2002; Foray and Steinmueller, 2003; Nightingale, 2003), we have tried to propose a classification of the different forms of tacit knowledge that could have solid epistemological foundations. Such a classification should enable us to replace terms such as know how, know what, know why and know who with a tripartition into tacit knowledge as competence, as background knowledge and as implicitly held cognitive rules, useful also for purposes of empirical analysis.7 The three types of tacit knowledge we have identified offer, as we have seen, different degrees of
7 It may also be that the category of “tacit knowledge as background knowledge” is of little interest from a cognitive point of view, since it has more to do with the social mechanisms of knowledge accumulation and transmission than with the individual's specific cognitive endowment. Even the actual empirical role of this type of tacit knowledge may be quite difficult to detect and ascertain. For these reasons, from a more applied and empirical point of view, the analysis may be limited to identifying two broad dimensions of tacit knowledge: tacit knowledge as competence and tacit knowledge as implicitly held cognitive rules (or, in other words, tacit competence and tacit cognition). In this paper, however, we have decided to stick to a tripartition of tacit knowledge for the sake of completeness, since we think that, even if it may be somewhat fuzzy and cumbersome, the concept of tacit knowledge as background knowledge cannot be completely left out of the analysis.
codifiability and transmissibility: while tacit knowledge as competence can to some extent be codified and transmitted, greater doubts remain regarding the possibility of codifying forms of tacit knowledge as background knowledge and (above all) as implicitly held cognitive rules.

Further indications on the possible role of tacit knowledge in innovation processes can also come from the analysis of current evolutionary tendencies within the systems for scientific and technological knowledge production and diffusion. As is well known, even if, from a general point of view, the development of Western economic systems and the scientific and technological progress that is one of its main drivers are tied to the availability of a stock of useful and tested knowledge (Kuznets, 1965), in the course of the three industrial revolutions that have marked modern history the role of “useful knowledge” has gradually been changing (Mokyr, 2002). If at the beginning of the First Industrial Revolution practical knowledge and know how were the principal sources of useful knowledge, while the purely “scientific” contribution was limited and consisted mainly of empirical generalisations of an accidental character, the role of scientific knowledge grew in the Second and (especially) the Third Industrial Revolution. Nowadays, with the “institutionalization of innovation” (Mowery and Rosenberg, 1998), the “scientific density” of innovations, that is, the degree of dependence of the innovative process on the availability of new, specialised and excellent scientific knowledge, has reached unprecedented levels. Moreover, scenarios of future development, such as those described in the technology foresight report of the National Science Foundation (Roco and Bainbridge, 2002), lead us to conclude that in the future not only will innovations be more and more science-based, but they will also be the fruit of interdisciplinary contributions coming from different scientific disciplines. The acronym NBIC (Nano-Bio-Info-Cogno) summarises this process of convergence of different scientific and technological knowledge within a common innovation process. These changes within the system of “useful knowledge” are to be considered conjointly with the evolution in the procedures and modalities of codification and transmission of knowledge made possible by recent developments in ICTs. Both of these forces seem, in fact, to be pushing in the direction of the development of new innovation models in which the role of universities as sources of scientific and technological knowledge is destined to grow even more than before (Etzkowitz et al., 1998; Godin and Gingras, 2000). Innovation will plausibly be less and less the fruit of the work of a single industrial laboratory and will become more and more dependent on the outcome of processes of technology transfer between universities and businesses. From an empirical point of view, confirmation of this can be seen in recent developments in typically science-based sectors such as, for example, biotechnology (Senker, 1998; Orsenigo, 2001) and parallel computing (Faulkner et al., 1995).
Within these new innovation processes the role of tacit knowledge does not seem destined to diminish, especially to the degree in which this knowledge is made up not only of trivial forms of competential knowledge but also takes on specific cognitive aspects that are particularly difficult to codify in explicit forms and to transmit at a distance. For this reason, mechanisms able to permit the transmission of tacit knowledge, such as face-to-face interactions and transfer “by head”, have had and will continue to have great importance in the development of efficient technology transfer processes. The innovation systems that are able to shorten the knowledge transmission chain, thereby circumventing the cul-de-sac of cognitive mediation, will be able to compete in the market; the others will plausibly find themselves operating in conditions of low competitiveness and will produce modest innovative dynamics. Once again, empirical confirmation of this can be found in the development trajectories of strongly science-based sectors such as biotechnology, which have followed totally different paths in America and in Europe (Orsenigo, 2001). While the American system is distinguished by a strong proximity between the industrial system and the world of research, with universities in the front line in internalising and taking on many functions typical of the business world, in Europe universities have been much more reluctant to take on a similar propulsive role. On the other hand, there has been a multiplication of various types of specialised institutions which in theory should have had the task of favouring technology transfer but which in many cases, almost paradoxically, have ended up separating the academic world more and more from the world of business. The problem here seems to lie precisely in the fact that the longer the chain of knowledge transmission between the research system and the business world, the more difficult it is to achieve an efficient transfer of all the different components of scientific and technological knowledge, in both its explicit and its tacit dimensions. A historical parallel can be found in the First and Second Industrial Revolutions: countries such as Great Britain and Germany, which allowed direct interaction between “savants” and “fabricants”, showed stronger innovation output and industrial development.
References

Anderson, J.R. (1983), The Architecture of Cognition, Cambridge: Harvard University Press.
Anderson, J.R. (1987), “Skill acquisition: compilation of weak-method problem solutions”, Psychological Review, Vol. 94, pp. 192–210.
Atkinson, A.P., M.S.C. Thomas and A. Cleeremans (2000), “Consciousness: mapping the theoretical landscape”, Trends in Cognitive Science, Vol. 4, pp. 372–382.
Balconi, M. (2002), “Tacitness, codification of technological knowledge and the organization of industry”, Research Policy, Vol. 31, pp. 357–379.
Baumard, P. (1999), Tacit Knowledge in Organizations, London: SAGE.
Berger, P.L. and T. Luckmann (1966), The Social Construction of Reality: A Treatise in the Sociology of Knowledge, New York: Doubleday.
Berry, D.C. (1987), “The problem of implicit knowledge”, Expert Systems: The International Journal of Knowledge Engineering, Vol. 4, pp. 144–151.
Berry, D.C. and D.E. Broadbent (1988), “Interactive tasks and the implicit-explicit distinction”, British Journal of Psychology, Vol. 79, pp. 251–272.
Berry, D.C. and Z. Dienes (1991), “The relationship between implicit memory and implicit learning”, British Journal of Psychology, Vol. 82, pp. 359–373.
Broadbent, D.E., P. Fitzgerald and M.H. Broadbent (1986), “Implicit and explicit knowledge in the control of complex systems”, British Journal of Psychology, Vol. 77, pp. 33–50.
Cambrosio, A. and P. Keating (1988), “Going monoclonal: art, science, and magic in the day-to-day use of hybridoma technology”, Social Problems, Vol. 35, pp. 244–260.
Chomsky, N. (1976), Reflections on Language, Glasgow: Fontana.
Chomsky, N. (1986), Knowledge of Language, New York: Praeger.
Cleeremans, A. (1995), “Implicit learning in the presence of multiple cues”, Proceedings of the 17th Annual Conference of the Cognitive Science Society, Hillsdale: Erlbaum.
Cleeremans, A., A. Destrebecqz and M. Boyer (1998), “Implicit learning: news from the front”, Trends in Cognitive Science, Vol. 2, pp. 406–416.
Cleeremans, A. and L. Jiménez (1999), “Fishing with the wrong nets: how the implicit slips through the representational theory of mind”, Behavioral and Brain Sciences, Vol. 22, p. 771.
Cohen, M.D., R. Burkhart, G. Dosi, M. Egidi, L. Marengo, M. Warglien and S. Winter (1996), “Routines and other recurrent action patterns of organizations: contemporary research issues”, Industrial and Corporate Change, Vol. 5, pp. 653–698.
Collins, H.M. (1992), Changing Order: Replication and Induction in Scientific Practice, Chicago: University of Chicago Press.
Collins, H.M. (2001), “Tacit knowledge, trust, and the Q of sapphire”, Social Studies of Science, Vol. 31, pp. 71–85.
Cowan, R., P.A. David and D. Foray (2000), “The explicit economics of knowledge codification and tacitness”, Industrial and Corporate Change, Vol. 9, pp. 211–253.
Damasio, A.R. (1999), The Feeling of What Happens: Body and Emotion in the Making of Consciousness, London: William Heinemann.
Dancy, J. and E. Sosa (eds.) (1992), A Companion to Epistemology, Oxford: Basil Blackwell.
Dasgupta, P. and P.A. David (1994), “Towards a new economics of science”, Research Policy, Vol. 23, pp. 487–521.
Dewey, J. and A.F. Bentley (1949), Knowing and the Known, Boston: Beacon Press.
Dienes, Z. and D.C. Berry (1997), “Implicit learning: below the subjective threshold”, Psychonomic Bulletin & Review, Vol. 4, pp. 3–23.
Etzkowitz, H., A. Webster and P. Healey (eds.) (1998), Capitalizing Knowledge: New Intersections of Industry and Academia, New York: SUNY Press.
Faulkner, W., J. Senker with L. Velho (1995), Knowledge Frontiers: Public Sector Research and Industrial Innovation in Biotechnology, Engineering Ceramics and Parallel Computing, Oxford: Clarendon.
Foray, D. and R. Cowan (1997), “The economics of codification and the diffusion of knowledge”, Industrial and Corporate Change, Vol. 6, pp. 595–622.
Foray, D. and W.E. Steinmueller (2003), “The economics of knowledge reproduction by inscription”, Industrial and Corporate Change, Vol. 12, pp. 299–319.
Fransman, M. (1994), “Information, knowledge, vision and theories of the firm”, in: G. Dosi, D.J. Teece and J. Chytry, editors, Technology, Organization, and Competitiveness, Oxford: Oxford University Press.
Gigerenzer, G. (2000), Adaptive Thinking: Rationality in the Real World, Oxford: Oxford University Press.
Ginsberg, M.L. (1998), “Computers, games and the real world”, Scientific American Presents, Vol. 9(4), pp. 84–89.
Godin, B. and Y. Gingras (2000), “The place of universities in the system of knowledge production”, Research Policy, Vol. 29, pp. 273–278.
Gorman, M.E. (2002), “Types of knowledge and their roles in technology transfer”, Journal of Technology Transfer, Vol. 27, pp. 219–231.
Granovetter, M. (1985), “Economic action and social structure: the problem of embeddedness”, American Journal of Sociology, Vol. 91, pp. 481–510.
Grant, R. (1996), “Toward a knowledge-based theory of the firm”, Strategic Management Journal, Vol. 17, pp. 109–122.
Howells, J. (1996), “Tacit knowledge, innovation and technology transfer”, Technology Analysis & Strategic Management, Vol. 8, pp. 91–106.
Johnson, B., E. Lorenz and B.-Å. Lundvall (2002), “Why all this fuss about codified and tacit knowledge?”, Industrial and Corporate Change, Vol. 11, pp. 245–262.
Jordan, K. and M. Lynch (1992), “The sociology of a genetic engineering technique: ritual and rationality in the performance of the Plasmid Prep”, pp. 77–114 in: A. Clark and J. Fujimura, editors, The Right Tools for the Job: At Work in 20th Century Life Sciences, Princeton: Princeton University Press.
Kogut, B. and U. Zander (1992), “Knowledge of the firm, combinative capabilities and the replication of technology”, Organization Science, Vol. 3, pp. 383–397.
Koskinen, K.U. and H. Vanharanta (2002), “The role of tacit knowledge in innovation processes of small technology companies”, International Journal of Production Economics, Vol. 80, pp. 57–64.
Kuznets, S. (1965), Economic Growth and Structure, New York: Norton.
Lawson, C. and E. Lorenz (1999), “Collective learning, tacit knowledge and regional innovative capacity”, Regional Studies, Vol. 33, pp. 305–317.
Lehrer, K. (1990), Theory of Knowledge, London: Routledge.
MacKenzie, D. and G. Spinardi (1995), “Tacit knowledge, weapons design and the uninvention of nuclear weapons”, American Journal of Sociology, Vol. 101, pp. 44–99.
Marengo, L., G. Dosi, P. Legrenzi and C. Pasquali (2000), “The structure of problem-solving knowledge and the structure of organizations”, Industrial and Corporate Change, Vol. 9, pp. 757–788.
Metcalfe, J.S. (1988), “The diffusion of innovation: an interpretative survey”, in: G. Dosi, C. Freeman, R. Nelson, G. Silverberg and L. Soete, editors, Technical Change and Economic Theory, London: Pinter.
Mokyr, J. (2002), The Gifts of Athena: Historical Origins of the Knowledge Economy, Princeton: Princeton University Press.
Mowery, D.C. and N. Rosenberg (1998), Paths of Innovation: Technological Change in Twentieth-Century America, Cambridge: Cambridge University Press.
Nelson, R.R. and S. Winter (1982), An Evolutionary Theory of Economic Change, Cambridge: Harvard University Press.
Nightingale, P. (1998), “A cognitive model of innovation”, Research Policy, Vol. 27, pp. 689–709.
Nightingale, P. (2000), “Economies of scale in experimentation: knowledge and technology in pharmaceutical R&D”, Industrial and Corporate Change, Vol. 9, pp. 315–359.
Nightingale, P. (2003), “If Nelson and Winter are only half right about tacit knowledge, which half? A Searlean critique of codification”, Industrial and Corporate Change, Vol. 12, pp. 149–183.
Nonaka, I. and H. Takeuchi (1995), The Knowledge-Creating Company, New York: Oxford University Press.
Passingham, R. (1997), “Functional organisation of the motor system”, in: R.S.J. Frackowiak, K.J. Friston, C.D. Frith, R.J. Dolan and J.C. Mazziotta, editors, Human Brain Function, San Diego: Academic Press.
Petersen, S.E., H. van Mier, J.A. Fiez and M.E. Raichle (1998), “The effects of practice on the functional anatomy of task performance”, Proceedings of the National Academy of Sciences USA, Vol. 95, pp. 853–860.
Pinch, T., H.M. Collins and L. Carbone (1996), “Inside knowledge: second order measures of skill”, Sociological Review, Vol. 44(2), pp. 163–186.
Polanyi, M. (1958), Personal Knowledge: Towards a Post-critical Philosophy, London: Routledge & Kegan Paul.
Polanyi, M. (1967), The Tacit Dimension, New York: Doubleday.
Raichle, M.E. (1998), “The neural correlates of consciousness: an analysis of cognitive skill learning” [review], Philosophical Transactions: Biological Sciences, Vol. 353, pp. 1889–1901.
Reber, A.S. (1993), Implicit Learning and Tacit Knowledge: An Essay on the Cognitive Unconscious, Oxford: Oxford University Press.
Roco, M.C. and W.S. Bainbridge (eds.) (2002), Converging Technologies for Improving Human Performance: Nanotechnology, Biotechnology, Information Technology and Cognitive Science, National Science Foundation Report, Arlington.
Ryle, G. (1949/1984), The Concept of Mind, Chicago: University of Chicago Press.
Saxenian, A.L. (1994), Regional Advantage: Culture and Competition in Silicon Valley and Route 128, Cambridge: Harvard University Press.
Searle, J. (1992), The Rediscovery of the Mind, Cambridge: MIT Press.
Searle, J. (1995), The Construction of Social Reality, New York: Free Press.
Senker, J. (1995), “Tacit knowledge and models of innovation”, Industrial and Corporate Change, Vol. 4, pp. 425–447.
Viale, R. and A. Pozzali (2003), “Al di qua della razionalità: la conoscenza tacita”, Sistemi Intelligenti, Vol. XV(2), pp. 325–346.
Vincenti, W.G. (1990), What Engineers Know and How They Know It, Baltimore: Johns Hopkins University Press.
von Krogh, G., I. Nonaka and T. Nishiguchi (eds.) (2000), Knowledge Creation: A Source of Value, London: Macmillan Press.
Woolcock, M. (1998), “Social capital and economic development: toward a theoretical synthesis and policy framework”, Theory and Society, Vol. 27(2), pp. 151–208.
Zeman, A. (2001), “Consciousness” [invited review], Brain, Vol. 124, pp. 1263–1289.
Ziman, J. (1979), Reliable Knowledge, Cambridge: Cambridge University Press.
CHAPTER 10
Overconfidence, Trading and Entrepreneurship: Cognitive and Cultural Processes in Risk-taking

Denis Hilton

Abstract

Miscalibration of judgement can be viewed as distinct from other positive illusions identified by Taylor and Brown (1988). Accordingly, miscalibration needs to be distinguished from other positive illusions in models of how stable tendencies in judgemental biases might affect behaviour (e.g. Odean, 1998). It is certainly possible that miscalibration will have different effects on behaviour to those caused by positive illusions. In like vein, Biais et al.'s (2005) experimental results suggest that realism (in the form of better calibrated judgement) can produce more positive outcomes in competitive market situations, where perspicacity and accuracy in judgement may count for more than motivation and persistence. The finding that miscalibration leads to poor performance does indeed suggest that it pays to have accurate beliefs in a competitive market.
Keywords: judgemental bias, miscalibration, positive illusions

JEL classifications: A12, C90, D82

Rather than assume that people are fully rational, economists have become increasingly interested in how people actually process information and make choices. For example, they have shown great interest in psychological research on overconfidence, arguing that this causes traders to overestimate the chances
of success of their investments (Daniel et al., 1998; Odean, 1998) and entrepreneurs to enter into markets where there will be an unfavourable level of competition (Camerer and Lovallo, 1999), thus leading to premature company death (Busenitz and Barney, 1997; Landier and Thesmar, 2004). While the above studies suggest that overconfidence has negative effects on wealth, other economists (e.g. Bénabou and Tirole, 2003) have argued that overconfidence will lead to positive outcomes, on the basis that positive illusions (such as optimism, the illusion of control and the belief that one is better than average) appear to be correlated with favourable health outcomes (Taylor and Brown, 1988). Below, I attempt to unpack the notion of overconfidence. For example, although Odean (1998) distinguished between two types of overconfidence in his theoretical model – which I will refer to as miscalibration (overestimating the precision of one's judgement) and positive self-illusions (believing oneself to be better than average, possessing unrealistic optimism and illusions of control) – it is only recently that researchers have shown that these kinds of overconfidence are in fact empirically distinguishable (Régner et al., 2004; see also Glaser et al., 2005). I then explore the implications of this finding for evaluating whether and how overconfidence in its various forms influences trading and entrepreneurship. Finally, I discuss how culture may influence perceptions of risk and entrepreneurial behaviour. My aim is to illustrate how empirical studies of cognitive and cultural processes can inform economic questions about risk-taking in financial markets and entrepreneurship.

1. What is overconfidence? Two different paradigms in psychology

In his theoretical paper, Odean (1998) reviewed psychological evidence for overconfidence. The first line of evidence came from cognitive psychological research on miscalibration in judgement (see Lichtenstein et al., 1982, for a review). In this line of research, participants give judgements of subjective confidence (expressed either as point estimates or using confidence intervals), which are then compared to the correct responses. Biais et al. (2005) used a questionnaire test of miscalibration in which participants were required to answer general knowledge questions by setting 90% confidence intervals on their responses (Alpert and Raiffa, 1982). The usual finding with these procedures is that participants are overconfident in their responses (especially when giving confidence intervals – see Klayman et al., 1999). Miscalibration is remarkably robust, and is still observed even when incentives for accuracy and frequency formats are used (Cesarini et al., 2003). The second line of research reviewed by Odean (1998) concerned what Taylor and Brown (1988) described as "positive illusions", namely the belief that the self is better than average, unrealistic optimism (the belief that the self is less vulnerable to misfortune than others; Weinstein, 1980) and the illusion of control (Langer, 1975). Are judgemental miscalibration and positive illusions related? Judgemental miscalibration could reasonably be considered to be a form of unrealistic
optimism about the quality of one's judgement, and thus there is a prima facie case for classifying it as a kind of positive illusion. Nevertheless, Odean (1998) includes parameters in his theoretical model that distinguish overconfidence in the precision of one's information (akin to miscalibration) from the belief that one has better information than others (a form of positive illusion, namely the better-than-average effect). In fact, recent studies suggest that while judgemental miscalibration may be considered a stable personality characteristic (Jonsson and Allwood, 2003), it is unrelated to positive illusions. Régner et al. (2004) showed that miscalibration did not correlate with measures of optimism, unrealistic optimism, the better-than-average effect, or two measures of perceived control (self-efficacy and locus of control) that were used as proxies for the illusion of control. On the other hand, all the measures of positive illusions were significantly intercorrelated, suggesting that they share an underlying component. Given that all these tasks imply self-evaluations, Régner et al. suggest that this may be regarded as a "self positive illusion" factor, in line with Taylor and Brown's (1988) original classification. Consistent with this interpretation, these positive illusions correlated negatively with a societal risk scale, indicating that the more (for example) one thinks oneself less vulnerable to risk than others, the more one considers society to be at risk from accidents and natural disasters. Finally, using a miscalibration task to evaluate accuracy of judgement on questions about finance in populations of business students and bank professionals, Glaser et al. (2005) likewise find that overconfidence in judgement is not correlated with a measure of the better-than-average effect. The empirical studies therefore suggest that overconfidence in judgement (as measured by miscalibration techniques) and positive illusions are unrelated forms of overconfidence. Rather than there being one form of overconfidence that can influence economic outcomes, there are at least two. This raises new empirical questions. For example, with respect to entrepreneurship, optimism might well predict decisions to enter a market, whereas accuracy of calibration might predict success in that market. We return to this question later, after having examined research on overconfidence in trading.
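The miscalibration measures referred to above are straightforward to compute. The following sketch is illustrative only (the function and variable names, and the example data, are mine, not those of Biais et al. or Glaser et al.): it scores a set of 90% confidence-interval answers by their hit rate, in the spirit of the Alpert and Raiffa (1982) task.

    # Illustrative scoring of a 90% confidence-interval calibration task.
    # All names and data here are hypothetical; this is a sketch, not the authors' code.

    def miscalibration_score(intervals, truths, target=0.90):
        """Return (hit_rate, miscalibration) for a list of (low, high)
        confidence intervals and the corresponding true answers.

        A perfectly calibrated respondent's 90% intervals contain the
        truth about 90% of the time; a positive miscalibration score
        means the intervals were too narrow, i.e. overconfidence."""
        hits = sum(low <= t <= high for (low, high), t in zip(intervals, truths))
        hit_rate = hits / len(truths)
        return hit_rate, target - hit_rate

    # Example: ten general-knowledge answers with intervals set too narrowly.
    intervals = [(1900, 1910), (300, 400), (50, 60), (10, 12), (5, 8),
                 (1000, 1200), (70, 80), (2, 3), (400, 450), (30, 35)]
    truths = [1912, 350, 75, 11, 9, 1100, 85, 2.5, 430, 40]

    hit_rate, overconfidence = miscalibration_score(intervals, truths)
    print(f"hit rate = {hit_rate:.0%}, overconfidence = {overconfidence:+.0%}")
    # A hit rate near 50% against a 90% target is the typical empirical finding.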
2. Overconfidence in trading: miscalibration vs. positive illusions?

As noted above, several economists have argued on theoretical grounds that overconfidence leads investors to lose money on financial markets (Daniel et al., 1998; Odean, 1998). The main empirical support for this argument comes from a behavioural "demonstration" of overconfidence using a database on private investors obtained from a large bank. Barber and Odean (2001) found that men traded more than women, and incurred greater losses because their profits failed to offset their transaction costs. Using gender as a proxy for overconfidence, they argued that this result demonstrates that men are more
overconfident than women, and that this overconfidence leads to lower trading profits. Using an experimental financial market with asymmetric information, Biais et al. (2005) replicated Barber and Odean's finding that men trade more than women. However, they found no correlation between gender and overconfidence in judgement on a confidence-interval task. Clearly, any difference in male and female performance cannot be explained by group-level differences in miscalibration of judgements, one of the accepted definitions of overconfidence. Nevertheless, within-group analyses of Biais et al.'s experimental results show that while miscalibration does not significantly affect performance in women, it does lead to worse performance in men. This effect is significant and robust across samples. Biais et al. argue that miscalibrated people tend to overestimate the precision of their information, suffer from winner's-curse risk and are consequently loss-making. In line with their analysis, they demonstrate that overconfident individuals are particularly likely to lose money when market signals are ambiguous. To summarise: while both studies show that gender affects trading behaviour, and overconfidence in judgement leads to lower profits in an experimental financial market, there is no independent evidence that gender produces lower profits in men due to their greater overconfidence. It is of course possible that gender causes lower profits in men in Barber and Odean's (2001) data through a different kind of overconfidence from miscalibration in judgement. In fact, Odean's (1998) theoretical model contains two parameters for overconfidence: one representing overconfidence in the precision of one's knowledge (i.e. underestimating conditional uncertainty); and another representing a belief that one has better information than other agents. Miscalibration of judgement appears to correspond to the first kind of overconfidence, whereas "positive illusions" (Taylor and Brown, 1988) that the self is superior to others, has high control over outcomes, etc. appear to correspond to the second kind. In fact, Régner et al. (2004) show that the tendency to make miscalibrated judgements is not correlated with the tendency to entertain positive illusions such as the better-than-average effect, unrealistic optimism and high perceptions of personal control. Distinguishing kinds of overconfidence allows new research questions to be generated. For example, does positive self-evaluation lead people to trade more often (thus explaining why men are more active in financial markets than women)? And does accuracy in judgement lead people to make better, more perspicacious and profitable trades? Careful operationalisation of these constructs will allow such empirical hypotheses to be tested in future research.

2.1. Positive illusions vs. realism: happier but poorer?

Work on experimental economics can also clarify a major debate in psychology and economics. Both psychologists (e.g. Taylor and Brown, 1988) and
economists (e.g. Bénabou and Tirole, 2003) have argued that certain kinds of cognitive style are associated with health and economic outcomes. In particular, "positive illusions" such as inflated self-esteem, optimism and the illusion of control may lead individuals to attain better outcomes, for example by motivating them to work harder and persist when the going gets tough. Such positive illusions may indeed act as self-fulfilling prophecies: for example, Aspinwall and Taylor (1992) report longitudinal field data showing that students who have high self-esteem, are optimistic, and have high beliefs about and desire for control are more likely to report being well adjusted to college three months later. Those with high self-esteem, high desire for control and an internal locus of control are more likely to show greater motivation, and to obtain high grade-point averages 21 months later, controlling for intelligence (SAT scores). In similar vein, Murray and Holmes (1997) show that relationship optimism and perceptions of control predict relationship survival a year later, even controlling for measures of initial relationship quality (satisfaction, trust, etc.). However, Försterling (2002) has recently argued that realism in causal attributions can lead to more favourable outcomes. In support of this argument, Försterling and Morgenstern (2002) show that giving participants accurate, rather than self-enhancing, feedback on task performance in a learning phase enables individuals to specialise in tasks that they are good at in the subsequent test phase, and thus improve their chances of gaining a financial reward for the best performance. One of the beneficial effects of realism is that accurate feedback may help individuals to identify and invest in areas of competitive advantage, whereas positive illusions may lead individuals to overestimate their chances of success and thus lose their investments. Thus Gibson and Sanbonmatsu (2004) show that in a domain such as gambling, where increased effort cannot make a difference, optimism (as scored by the LOT scale) can lead to persistence after failure and thus greater monetary losses. In an experimental market setting, Camerer and Lovallo (1999) found that being led to overestimate one's chances of success relative to others leads to excess market entry and financial losses. Their experimental results seem to be mirrored by the field data of Landier and Thesmar (2004), who find that entrepreneurs who are particularly likely to overestimate their chances of success relative to others in their business category have businesses which grow less and die sooner. Positive illusions – including self-serving attributions and the illusion of control – may be adaptive in "responsive" environments, where attributional style can "make winners" by encouraging an individual to further effort, and by bringing others to associate with and support her. However, in "unresponsive" environments such as competitive markets, it seems likely that realistic attributions will facilitate "picking winners" – projects which will yield success if an individual invests effort in them. This analysis suggests how realism and accuracy in judgement may count for more than motivation and persistence. For example, by recognising that one does not have the ability to perform well in
a certain sector, a person can invest their energies more profitably elsewhere. Future work will therefore do well to specify when, why and how a positive illusion or attributional style will enhance or impede economic performance and/or personal well-being. The clearer specification of the relationship between covariation expectancies and attributions reviewed above suggests that the relationships between covariation information, attributions and outcomes can be subjected to finer tests in both the mental health and economic achievement domains (Försterling, 2002). In addition, Bénabou and Tirole's (2003) model implies that while optimism may lead to positive outcomes in some cases (e.g. through encouraging effort), it may also have negative effects on performance if it leads people to take unwarranted risks in domains where the returns will be low (e.g. due to high competition, markets with declining returns, etc.). However, more work needs to be done to operationalise the concepts posited by the model so that it can be subjected to empirical test.

3. Entrepreneurship: factors predicting career choice, risk-taking and success

I turn now to the various ways in which entrepreneurship can be explained, and how culture and cognition can be introduced to help explain some surprising results about culturally determined risk attitudes. The classic economic approach (Smith, 1776/1991) would explain economic development in terms of predisposing geographical factors such as closeness to rivers and the sea, the predominant mode of shipping goods in the 18th century. Technological advances and easy transport between centres of production and markets should create favourable opportunities for entrepreneurs, which they will then exploit. Later thinkers such as the sociologist Weber and the social psychologist McClelland (1961) would point to culturally transmitted belief and value systems, such as the Protestant work ethic and achievement motivation, as explanations for why some societies (e.g. England and Holland in the 17th and 18th centuries) seemed to develop faster than others. However, a problem for this position has been the growth of successful capitalism in the highly collectivist societies of East Asia in the last 20–30 years, a question I return to below.

3.1. How can psychology explain entrepreneurial behaviour?

There are several possible kinds of psychological explanation for entrepreneurial wealth creation. A first question is what makes some people invest time, prestige and money in a project whereas others, who perceive the project to have the same chances of success, do not. This is essentially the question: what profile distinguishes people who decide to become entrepreneurs from others? For example, Lazear (2004) has shown that Stanford business students who were "jacks-of-all-trades" (i.e. who took a wide range of courses at business school) were more likely to become entrepreneurs than students who specialised in one area. It could well be that these people have a
"taste" for entrepreneurship (e.g. being one's own boss, an inclination for novelty and variety). A second question is what causes entrepreneurial risk-taking. In this regard, it is often suggested that entrepreneurs have a different risk attitude from the average person. But what might this mean? Intuitively, when confronted with the same objective chances of success for a project, an entrepreneur may be more likely to accept the risk than another person if (a) he estimates his chances of success to be higher than others in the same category; and (b) he values success more and failure less than another person. These suggest different psychological processes, and there is evidence that both play a role. Encouraging positive self-evaluations (the perception among players that they are better than average) encourages entry into an experimental market (Camerer and Lovallo, 1999), while there is also evidence that entrepreneurs focus more on upside than downside risks (Palich and Bagby, 1995), making the potential gains more salient and the potential losses less so. Both tendencies could explain why entrepreneurs are more likely to take risks than other people. A third question is what causes entrepreneurial success. Different forms of overconfidence can predict entrepreneurial performance: there is evidence that positive self-evaluations (the belief that one's company is better than others in its category) predict lower profits and earlier death (Landier and Thesmar, 2004), and overconfidence in judgement (in the form of miscalibration) predicted lower profits in a sample of entrepreneurs (Bonnefon et al., 2005). Note that there need be nothing specifically "entrepreneurial" about these traits; they may well predict success in other professions where people have to take calculated risks on the basis of accurate appreciations of uncertainty, such as medicine and mountain climbing. In addition, traits such as intelligence may be as likely to predict success in entrepreneurship as they are in many other ways of life. Entrepreneurship research needs to distinguish these questions carefully. For example, it is not enough simply to demonstrate that entrepreneurs overestimate their chances of success (relative to the likely probabilities of failure; most new businesses fail within a few years) to explain why they become entrepreneurs in the first place (Busenitz and Barney, 1997). This is because the tendency to overestimate one's competence and chances of success is in any case very widespread in Western cultures (Taylor and Brown, 1988), and is likely to characterise non-entrepreneurs as much as entrepreneurs. That entrepreneurs in a US sample consider themselves to be "better than average" and are unrealistically optimistic cannot serve to distinguish entrepreneurs from others, as pretty much all Westerners are this way. Thus the mere existence of positive self-illusions cannot answer our first question about why some people become entrepreneurs and others do not. However, Landier and Thesmar's data can answer our third question by showing that entrepreneurs with a particularly high propensity to have positive self-illusions are those who are most likely to fail.
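The two routes to entrepreneurial risk-taking distinguished in (a) and (b) above can be summarised in a minimal decision rule (the notation is mine, introduced only for illustration). Writing $\hat{p}$ for the subjective probability of success and $u(S)$, $u(F)$ and $u(O)$ for the values attached to success, failure and the outside option, entry occurs when

$$\hat{p}\,u(S) + (1 - \hat{p})\,u(F) \;>\; u(O).$$

Route (a) corresponds to an inflated $\hat{p}$ relative to the objective base rate; route (b) to a payoff evaluation in which $u(S)$ is raised or $u(F)$ is brought closer to $u(O)$ – as in the "cushion" hypothesis discussed in the next section, where mutual insurance genuinely raises the value of the failure outcome.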
3.2. Culture, risk-taking and entrepreneurship

I would like to conclude with some speculations about the influence of culture on risk attitude and entrepreneurship. Recent research has shown that Americans (like Europeans) think of themselves as better than average on competence traits such as intelligence and driving cars, whereas Japanese tend to think of themselves as better than average on social traits such as loyalty and honesty (Sedikides et al., 2003). In other words, it may be that people in individualistic societies tend to believe that they are better on traits that allow them to compete successfully with others, whereas people in collectivistic societies tend to believe themselves to be better on traits that make them good members of their social groups. On the assumption that individualistic Westerners have higher confidence in their abilities, this would predict that they would be more prepared to take risks than collectivistic Easterners because they believe they have higher chances of success. However, collectivism could also encourage entrepreneurship, on the basis that Easterners believe that the costs of failure will be lower than Westerners do. If people in collectivist societies know that their peers pride themselves on their loyalty and dependability, this may encourage them to take risks because they will be bailed out in the case of failure. The observation that East Asians have a different sense of self can thus help explain a result that might otherwise seem paradoxical. It turns out that Chinese participants showed significantly more risk-seeking behaviour than Americans on a number of simulated gambles, while both sets of nationals predicted exactly the opposite (Weber and Hsee, 1999). Weber and Hsee propose a "cushion" hypothesis whereby people in collectivist societies assume that members of their group will help them out if things go wrong. This kind of group insurance leads to self-assurance: they become willing to take risks because – objectively – the downside costs of failure are lower in collectivist than in individualist societies. In this way, an economist could argue that collectivist societies form informal institutions that modify the objective risks run, by providing mutual insurance. My interpretation of Weber and Hsee's results is somewhat speculative. But I hope that I have shown that only an approach combining a model of social values and organisation (individualism and collectivism) with careful theorising about and operationalisation of notions of the self and risk attitude will be able to do justice to the complex pattern of results concerning economic preferences and behaviour that are being observed. It is possible, for example, that high achievement motivation will predict entrepreneurial behaviour in individualist societies, whereas other processes may be more important in collectivist societies.

4. Summary and conclusions

Research suggests that miscalibration of judgement can be viewed as distinct from other positive illusions identified by Taylor and Brown (1988).
Accordingly, miscalibration needs to be distinguished from other positive illusions in models of how stable tendencies in judgemental biases might affect behaviour (e.g. Odean, 1998). It is certainly possible that miscalibration will have different effects on behaviour to those caused by positive illusions. In like vein, Biais et al.'s (2005) experimental results suggest that realism (in the form of better calibrated judgement) can produce more positive outcomes in competitive market situations, where perspicacity and accuracy in judgement may count for more than motivation and persistence. The finding that miscalibration leads to poor performance does indeed suggest that it pays to have accurate beliefs in a competitive market. Future work will therefore do well to specify when, why and how a cognitive bias or positive illusion will enhance or impede economic performance and/or personal well-being, and where possible to provide experimental tests of the causal paths implied by these hypotheses. Twenty-first-century economics is emerging from a period in which the 18th-century model of man as a rational thinker was combined with deductive procedures based on mathematical modelling to explore the consequences of the rational self-interest model. Although economics is becoming more "cognitive" and "behavioural", it is doing so without always adopting the inductive techniques of hypothesis formulation and testing that psychologists use. Sometimes, economics retains a distinctly behaviourist flavour through failing to operationalise and test hypotheses about how cognitive processes can influence economic outcomes. Finally, culture provides another challenge for behaviourist economics, as culturally imposed meaning systems will transform how "objective realities" are perceived and responded to at the societal level (e.g. Bénabou and Tirole, 2006; Leung and Bond, 2004).

Acknowledgement

I thank Bruno Biais, Jean-François Bonnefon, Laure Cabantous, David Molian, Sébastien Pouget, Isabelle Régner, and Stéphane Vautier for their collaboration in the research described in this chapter and for many useful discussions, and Jean Tirole for comments on an earlier draft.

References

Alpert, M. and H. Raiffa (1982), "A progress report on the training of probability assessors", pp. 294–305 in: D. Kahneman, P. Slovic and A. Tversky, editors, Judgment under Uncertainty: Heuristics and Biases, Cambridge: Cambridge University Press.

Aspinwall, L.G. and S.E. Taylor (1992), "Modeling cognitive adaptation: a longitudinal investigation of the impact of individual differences and coping on college adjustment and performance", Journal of Personality and Social Psychology, Vol. 63, pp. 989–1003.
Barber, B.M. and T. Odean (2001), "Boys will be boys: gender, overconfidence, and common stock investment", Quarterly Journal of Economics, Vol. 116, pp. 261–292.

Bénabou, R. and J. Tirole (2003), "Self-knowledge and self-regulation: an economic approach", in: I. Brocas and J.D. Carrillo, editors, The Psychology of Economic Decisions, Volume 1: Rationality and Well-being, Oxford: Oxford University Press.

Bénabou, R. and J. Tirole (2006), "Belief in a just world and redistributive politics", Quarterly Journal of Economics, Vol. 121, pp. 699–746.

Biais, B., D. Hilton, K. Mazurier and S. Pouget (2005), "Judgmental overconfidence, self-monitoring and trading performance in an experimental financial market", Review of Economic Studies, Vol. 72, pp. 297–312.

Bonnefon, J.F., D.J. Hilton and D. Molian (2005), "A portrait of the unsuccessful entrepreneur as a miscalibrated thinker: profit-decreasing ventures are run by overconfident owners", unpublished paper, University of Toulouse-II.

Busenitz, L.W. and J.B. Barney (1997), "Differences between entrepreneurs and managers in large organizations: biases and heuristics in strategic decision-making", Journal of Business Venturing, Vol. 12, pp. 9–30.

Camerer, C.F. and D. Lovallo (1999), "Overconfidence and excess entry: an experimental approach", American Economic Review, Vol. 89, pp. 306–318.

Cesarini, D., O. Sandewall and M. Johannesson (2003), "Confidence interval estimation tasks and the economics of overconfidence", SSE/EFI Working Paper Series in Economics and Finance No. 535.

Daniel, K., D. Hirshleifer and A. Subrahmanyam (1998), "Investor psychology and security market under- and overreactions", Journal of Finance, Vol. 53, pp. 1839–1885.

Försterling, F. (2002), "Does scientific thinking lead to success and sanity? An integration of attribution and attributional models", European Review of Social Psychology, Vol. 13, pp. 217–258.

Försterling, F. and M. Morgenstern (2002), "Accuracy of self-assessment and task performance: does it pay to know the truth?", Journal of Educational Psychology, Vol. 94, pp. 576–585.

Gibson, B. and D.M. Sanbonmatsu (2004), "Optimism, pessimism and gambling: the downside of optimism", Personality and Social Psychology Bulletin, Vol. 30, pp. 149–160.

Glaser, M., T. Langer and M. Weber (2005), "Overconfidence of professionals and laymen: individual differences within and between tasks?".

Jonsson, A.-C. and C.M. Allwood (2003), "Stability and variability in the realism of confidence judgments over time, content domain and gender", Personality and Individual Differences, Vol. 34, pp. 559–574.

Klayman, J., J.B. Soll, C. Gonzales-Vallejo and S. Barlas (1999), "Overconfidence: it depends on how, what and whom you ask", Organizational Behavior and Human Decision Processes, Vol. 79, pp. 216–247.
Landier, A. and D. Thesmar (2004), "Financial contracting with optimistic entrepreneurs: theory and evidence", Working Paper, Graduate School of Business, University of Chicago.

Langer, E. (1975), "The illusion of control", Journal of Personality and Social Psychology, Vol. 32, pp. 311–328.

Lazear, E.P. (2004), "Balanced skills and entrepreneurship", American Economic Review, Vol. 94, pp. 208–211.

Leung, K. and M.H. Bond (2004), "Social axioms: a model for social beliefs in multi-cultural perspective", Vol. 36, pp. 119–197 in: M.P. Zanna, editor, Advances in Experimental Social Psychology, San Diego, CA: Elsevier Academic Press.

Lichtenstein, S., B. Fischhoff and L.D. Phillips (1982), "Calibration of probabilities: the state of the art to 1980", pp. 306–334 in: D. Kahneman, P. Slovic and A. Tversky, editors, Judgment under Uncertainty: Heuristics and Biases, Cambridge: Cambridge University Press.

McClelland, D.C. (1961), The Achieving Society, New York: Van Nostrand.

Murray, S.L. and J.G. Holmes (1997), "A leap of faith? Positive illusions in romantic relationships", Personality and Social Psychology Bulletin, Vol. 23, pp. 586–604.

Odean, T. (1998), "Volume, volatility, price and profit when all traders are above average", Journal of Finance, Vol. 53, pp. 1887–1934.

Palich, L.E. and D.R. Bagby (1995), "Using cognitive theory to explain entrepreneurial risk-taking: challenging conventional wisdom", Journal of Business Venturing, Vol. 10, pp. 425–438.

Régner, I., D.J. Hilton, L. Cabantous and S. Vautier (2004), "Judgmental overconfidence: one positive illusion or many?", Working Paper, University of Toulouse.

Sedikides, C., L. Gaertner and Y. Toguchi (2003), "Pancultural self-enhancement", Journal of Personality and Social Psychology, Vol. 84, pp. 60–79.

Smith, A. (1776/1991), An Inquiry into the Nature and Causes of the Wealth of Nations, New York: Prometheus Books (original work published 1776).

Taylor, S. and J. Brown (1988), "Illusion and well-being: a social psychological perspective on mental health", Psychological Bulletin, Vol. 103, pp. 193–210.

Weber, E.U. and C.K. Hsee (1999), "Cross-national differences in risk preference and lay predictions", Journal of Behavioral Decision Making, Vol. 12, pp. 165–179.

Weinstein, N. (1980), "Unrealistic optimism about future life events", Journal of Personality and Social Psychology, Vol. 39, pp. 806–820.
Epilogue

Alan Kirman
The main thrust of cognitive economics has been to analyse individual decision making, and this is in the tradition of "methodological individualism". Yet the current interest in this theme has different sources. A first and clear origin comes from those who realised that there was something fundamentally unsatisfactory about the way in which economists have come to formalise preferences. The members of this category fall into two groups. First, there were those, in general economists, who realised that the preference-based model was not adequate as a basis for an equilibrium model of the economy. I will return to how economics came to this point a little later. Second, there were those, working in logic and philosophy, who were interested in the logical structure of preferences. The aim of the former was to decide how to build models of the economy on some more solid foundation, while the aim of the latter was to provide a structure for preferences which could somehow remove the difficulties with the standard assumptions. Such an exercise consists in providing a logically coherent structure which can handle, in a consistent way, the problems of choice under uncertainty and over time. It may be that, to find such a structure, we have to abandon our current ideas of what characterises coherent preferences. Observed violations of the standard assumptions might then be incorporated into the framework. The sort of position taken by Cozic in this book is that of redesigning the axiomatic structure to provide a coherent model of "bounded rationality". This moves quite far from the usual hypotheses but remains an approach favoured by a number of theoretical analysts. There have of course been those who stand somewhere in the middle and who would like to re-establish something close to the standard framework which would be able to handle the difficulties that have arisen. Indeed, the literature on discounting and future preferences has tried to modify the standard framework in a minimalist way in order to resolve some observed choice paradoxes. Another and rather different origin of the interest in cognitive economics lies in psychology, and the emphasis here has been on examining the
behaviour of subjects and proposing explanations of this based on various mental mechanisms rather different from simple preference maximisation. A further step in this direction has been made by the development of neuroeconomics, which looks at the parts of the brain involved in decision making and tries to infer patterns that may lead to apparently inconsistent behaviour. The origins of the interest in cognitive economics are varied, as are the motivations of those who come to it from them. I will examine some aspects of each of them in turn, and this may provide some ideas as to how they will evolve. Yet, before embarking on an examination of how we might hope to analyse preferences in the future, it seems natural to the economist to start with the origins of the dissatisfaction with the preference theory which is at the very heart of modern economic theory.

1. The limitations of standard preference theory

Recall that the basic aim of economists in the Arrow–Debreu tradition, which claims its origins in the work of Walras and Pareto, is to start with a basic notion for individuals, that of preferences, then to derive demand, and then to proceed to show that there is an equilibrium of the economy or market in question. What preferences are – or rather what they had boiled down to after nearly a century of debate – was clearly stated by Uzawa, who says simply: "Preference relations, in terms of which the rationality of human behaviour is postulated, are precisely defined as irreflexive, transitive, monotone, convex and continuous relations over the set of all conceivable commodity bundles. A demand function associates with prices and incomes those commodity bundles that the consumer chooses subject to budgetary restraints". Uzawa (1960)
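Uzawa's list can be stated formally (a sketch of the standard textbook definitions, not Uzawa's own notation). For a strict preference relation $\succ$ on a commodity space $X \subseteq \mathbb{R}^n_+$:

$$\begin{aligned}
&\text{irreflexivity:} && \neg(x \succ x) \ \text{for all } x \in X;\\
&\text{transitivity:} && x \succ y \ \text{and} \ y \succ z \implies x \succ z;\\
&\text{monotonicity:} && x \geq y,\ x \neq y \implies x \succ y;\\
&\text{convexity:} && \{y \in X : y \succ x\} \ \text{is convex for each } x;\\
&\text{continuity:} && \{y : y \succ x\} \ \text{and} \ \{y : x \succ y\} \ \text{are open in } X.
\end{aligned}$$

The demand function then picks out the best affordable bundles at prices $p$ and income $w$: $f(p, w) = \{x \in X : p \cdot x \leq w \text{ and there is no } y \text{ with } p \cdot y \leq w \text{ and } y \succ x\}$.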
A first and obvious difficulty comes from the non-observability of such preferences and the absence of any way of testing the standard hypotheses. Yet, even though they were abstract in nature, a great deal of effort and mathematical skill was spent on weakening the conditions assumed for preference relations. It was argued that weakening the assumptions in this way would make them more acceptable. Not much was said about the more basic and deeper underlying assumption that individuals are indeed characterised by the corresponding orders. Taking this as given, the goal was to weaken the basic axioms as much as possible. Today, it is recognised that none of the standard axioms, even in their weakened form, is derived from observation of choice behaviour; rather, they are the result of pure introspection by economists. Yet this recognition has nothing to do with recent developments. Hildenbrand (1994) cites quotations from economists such as Robbins, Koopmans and Hicks, all of whom were well aware of this. Worse, late in his life, Pareto came to the conclusion that individuals make quite arbitrary choices and spend the rest of their time justifying them! Thus, scepticism about the bases of utility theory is longstanding among highly reputable economists.
Furthermore, the methodological grounds for using the principle of Occam's razor are far from being generally accepted. Pareto himself was not happy with the insistence on weakening assumptions to the maximum. He was still, for example, convinced that the idea of measurable utility had some merit. As John Chipman (1971) pointed out, "Thus, altering slightly a suggestive metaphor of Georgescu-Roegen, just because the equilibrium of a table is determined by three of its legs, we are not required by any scientific principle to assume that actual tables have only three legs, especially if direct observation suggests that they may have four".
Indeed there are many in psychology and philosophy who simply do not find these hypotheses, even in their most weakened form, satisfactory as a basis for analysing human behaviour, and who therefore look for alternatives. The objection that psychologists and philosophers – but not only they – have to the conditions imposed on preferences is that these offer no possibility of empirical rejection. However, hope was pinned on the notion of revealed preference, which was, at least in principle, based on observable choices. Here again, hope was short-lived. Uzawa's (1960) basic and important contribution was to show that if a demand function satisfies the weak axiom of revealed preference (WARP), then there exists a preference relation from which it is derived. Furthermore, demand functions derived from preferences satisfy WARP. Both Ville and Houthakker had shown that a necessary and sufficient condition for a demand function to be derived from preferences was that the strong axiom of revealed preference (SARP) be satisfied. Uzawa completed the picture by showing that, under certain regularity conditions, WARP implies SARP. This was rather disappointing, since it was originally thought that revealed preference theory was somehow more satisfying than utility-based theory, being based, as I have said, at least in principle, on observable behaviour. Thus an avenue out of the abstract world of preference relations satisfying certain axioms derived from introspection was closed off. One might have thought that revealed preference would have lost its appeal after this result, but this is not the case. This is because the idea of observable choice is, in theoretical terms, an illusion, but the hypothesis is more easily formulated and more convincing than listing the standard axioms on preferences. Why is it an illusion? Although demand choices are in principle observable, in fact, in the Arrow–Debreu framework, an individual is never faced twice with the same situation. Choices are made once and for all at the beginning of time. Since all commodities are dated, seeing an individual choose bundle x when she could have chosen y, then subsequently choose y when she could have chosen z, and then later choose z when she could have chosen x, is not a violation of SARP, since the x which figures at the end of the sequence is not the same as that in the first choice. The bundle of commodities in question is available at a later time and is therefore no longer the same. Furthermore, if any consumption took place between the choices, the preferences which determine
the later choices may well have been affected. The only way out of this would be to pose the hypothetical question repeatedly to an individual: which of these two bundles would you choose? At this point the advantage of the "observability" of the choices has been lost. Yet revealed preference still plays a prominent role in theory and in the teaching of economics (see Mas-Colell et al., 1995). As I have said, this is because of the simplicity with which it can be explained, and this is reinforced by its widespread use in experimental economics, where a lot of work in cognitive economics has been done. In the experimental laboratory the practice of posing successive hypothetical choice questions and then checking to see whether these choices are coherent is widespread. The objections I have mentioned are overruled on the grounds that the choices are indeed hypothetical but are made over a short period of time. Yet the fact remains that the foundations of revealed preference theory cannot, tautologically, tell us any more about choices than standard preference theory. They are both equally abstract and unverifiable. However, a much more severe and essentially internal difficulty with standard preferences has been created by the destructive results of Sonnenschein (1972), Mantel (1974) and Debreu (1974) (SMD). Assume for a moment that one accepts the abstract conditions imposed upon preferences. What do economists want? They want to show that if people satisfied these assumptions there would be an equilibrium set of prices which would clear all markets. The existence of such an equilibrium can be proved under very limited restrictions. However, to argue that one is actually modelling the economy, one has to be able to show that the equilibrium would be achieved. The system has to be stable, in the sense that starting from arbitrary prices it moves to an equilibrium. This, after all, is what Adam Smith was alluding to with his "invisible hand" and what Walras had in mind with his "tâtonnement" process. It is here that the SMD results reveal their full force. They show that the abstract conditions we impose on preferences can guarantee neither the stability nor the uniqueness of equilibrium. Therefore, if the Arrow–Debreu model is claimed to be a picture of the real world, it is certainly not a satisfactory one. Without any idea as to whether an equilibrium will be attained, the concept remains just an abstraction. Without uniqueness, macroeconomists cannot study what will happen to the equilibrium state if we change some parameter of the economy; that is, they cannot do "comparative statics". Yet this is what much of macroeconomics is concerned with. It is worth noting that the SMD results show the weakness of the model but not where that weakness comes from. Nevertheless, the damage was done and many theorists realised just how unsatisfactory the basic model was. What is particularly interesting about that episode is that it was scholars of the highest reputation in mathematical economics who understood the nature of the problem and who brought the edifice down. Indeed, to this day, many economists, usually not pure theorists, continue to use the model as if the SMD results were just a formalistic objection. As a parenthesis, it is just worth remarking that
something that plays a key role in cognitive economics – information – was an important ingredient of the failure to solve the stability problem. The basic market model has been shown to use remarkably little information when functioning at equilibrium. But as Saari and Simon (1978) have shown, if there were a mechanism that would take an Arrow–Debreu economy to an equilibrium, that mechanism would require an infinite amount of information. Thus, the stability problem was basically unsolvable in the context of the general equilibrium model. Starting from individuals with standard preferences and adding them up allows one to show that there is an equilibrium, but does not permit one to say how it could be attained. There have been a number of economists, such as Grandmont, and Hildenbrand before he took the more radical approach of abandoning individual rationality, who have argued that there was a possible way to build a model on the standard assumptions which would have a unique and stable equilibrium. Instead of trying to modify the assumptions on individuals, we might want to make assumptions on the distribution of their characteristics: even if the individuals are assumed to operate in isolation, the fact that they are heterogeneous may be important. Indeed these authors argued that, by making use of the dispersion of characteristics, one could generate the structure necessary to obtain uniqueness and stability of equilibria without modifying the underlying structure of the model. In doing so they were following on from ideas already advanced by Cournot. To repeat, what they wished to show is that if the economy consists of a large number of sufficiently heterogeneous agents, properties like uniqueness and stability of equilibrium may be restored (see Grandmont, 1987, 1992; and Hildenbrand, 1983, 1994). Thus structure may be introduced into aggregate behaviour by the presence of enough differences between the characteristics of the agents; that is, heterogeneity can be added to assumptions strictly restricted to those on individuals. This would be a step in a different direction from that taken by cognitive economics, but one which may clearly be appropriate if we are concerned with macroeconomic questions. This is an interesting approach, since it takes us away from a meticulous examination of individual characteristics and makes us look at the characteristics of the population of economic agents. However, the hope that introducing heterogeneity could play a key role in justifying the general equilibrium model has, at least until now, proved a vain one. The reasons for this are not very important here, and good accounts are given by Billette de Villemeur (1998) and Hildenbrand and John (2003). There is something important which may escape our attention here. Cognitive economics has spent a great deal of time and effort investigating the individual but, although assumptions about the latter have underpinned standard economic analysis, the discussion of heterogeneity may suggest that aggregate behaviour should be analysed differently. Hildenbrand (1994) has suggested a radical departure from traditional theory and proposed that we start with individual demands without deriving them from preferences, and that we then see
if we could find some condition on the dispersal of consumption choices that would give us back uniqueness and stability. What he showed was that if, with increasing income, the consumption choices of individuals become more dispersed in a very precise sense, then aggregate demand will satisfy the aggregate "Law of Demand". He works with demand functions, which associate a demand f(p) with a price vector p; aggregate demand F(p) is just the sum of the individual demands. The condition which says that aggregate demand F(p) satisfies the "Law of Demand" is that, for any two price vectors p and q:

$$(p - q) \cdot F(p) \;\leq\; (p - q) \cdot F(q). \tag{1}$$
This property guarantees the uniqueness and stability of equilibrium in a simple economy. It is worth noting that what Hildenbrand does is far from the traditional approach. He starts with a condition on observable empirical choices, which he does not assume are derived from any utility maximisation, and then deduces the "Law of Demand". Thus the method is completely rigorous but does not depend on assumptions about the source of the demand behaviour. What he requires is that the choices of individuals should be sufficiently heterogeneous and that this heterogeneity should increase with income. This pragmatic approach, which starts with Hildenbrand (1989), Härdle et al. (1991) and Hildenbrand and Kneip (1993, 2005), does not encounter the difficulties seen in the earlier discussion, and it is surprising that it has not had a larger echo. It is noteworthy that this approach is almost orthogonal to that adopted in cognitive economics, since Hildenbrand does not provide any explanation as to why individuals should behave in the way he suggests. He makes no apologies for this and suggests that just such a purely pragmatic approach is appropriate. The conclusion here is that one could move into a situation where two different modi operandi cohabit in economics: one based on the behaviour of individuals analysed in detail, and the other a more pragmatic and statistical approach intended to explain aggregate phenomena through aggregate data. Presumably, cognitive economics would focus on the first sort of question whilst other approaches would govern the second. The difficulty is that such a division is not accepted in economic theory: the individual should, according to current practice, remain the basis for all economic analysis.
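Condition (1) is directly checkable on data or on a candidate demand system. A minimal numerical sketch follows; the demand function in it is invented purely to illustrate the check and is not Hildenbrand's.

    import numpy as np

    # Illustrative check of the "Law of Demand" condition (1):
    # (p - q) . F(p) <= (p - q) . F(q) for all price vectors p, q,
    # equivalently (p - q) . (F(p) - F(q)) <= 0 (monotone aggregate demand).
    # The demand function below is a toy example, not Hildenbrand's.

    def F(p):
        """Toy aggregate demand for two goods: each good is demanded in
        inverse proportion to its own price (fixed spending per good)."""
        spending = np.array([40.0, 60.0])   # expenditure allocated to each good
        return spending / p

    rng = np.random.default_rng(0)
    violations = 0
    for _ in range(10_000):
        p, q = rng.uniform(0.5, 5.0, size=(2, 2))   # two random price vectors
        if (p - q) @ (F(p) - F(q)) > 1e-12:          # monotonicity violated?
            violations += 1
    print(f"violations of the Law of Demand: {violations} in 10,000 draws")

For this toy demand system the inner product works out to a sum of negative squared terms, so the check reports no violations; replacing F with an estimated aggregate demand makes the same test an empirical one.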
2. Time and its role

However, the problems I have just discussed are not those which provoked the arrival of psychologists on the economic scene. They were interested in the individual and the way in which he makes his decisions, but they were troubled by the sort of behaviour that was predicted and the fact that it seemed to be in contradiction to much of what had been observed in psychological experiments. They were much less concerned about the individual as a basis for an equilibrium model. Yet, as I have said, this is the origin of many of the difficulties that
economists have with preferences. Consider the question of inter-temporal choice. It is clear that time has no real role in the Arrow–Debreu model, other than to date commodities. Any notion of an evolution of the system is largely absent. Just dating a finite number of commodities is not a serious acknowledgement of the temporal character of experience, and the extension by Bewley (1972) to an infinity of commodities does not solve the problem of how people deal with the future. Although we can then give formal proofs of the existence of an equilibrium, we do not say anything about how people make their inter-temporal choices. The basic approach in economics, derived from the Arrow–Debreu approach, has been to argue that people's preferences over all commodities, including those available in the future, amount to having preferences over "consumption streams". This implies that an agent knows today what he will prefer in the future. Yet, to assume that preferences are given from the outset, whatever the outset is, is clearly unrealistic. There has been growing recognition within economic theory of recent decades, and within cognitive economics in particular, that the economic agent is not an unchanging individual whose characteristics are captured by a well-defined set of preferences that do not evolve over time. This has had less impact on economics in general than one might have imagined. Perspicacious theorists have acknowledged that having preferences that are separable and fixed once and for all time, and then maximising future utility, involves a thought experiment that represents a choice between alternative selves tied to time and circumstances (Mirrlees, 1982, p. 65). This requires, as James Mirrlees observes, that the agent's "[…] preference regarding what he will be doing at one particular time in one particular set of circumstances be independent of what he may be planning for all other times and circumstances" (ibid., p. 66).
Mirrlees thus asserts that "[e]verything that has to do with life as a connected whole – such as habit, memory, preparation for future action, anticipation, achievement and failure – seems to have been ignored" (ibid., p. 66).
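The separability Mirrlees criticises is usually written in the discounted-utility form that Samuelson proposed (discussed further below):

$$U(c_0, c_1, \ldots, c_T) \;=\; \sum_{t=0}^{T} \delta^{t}\, u(c_t), \qquad 0 < \delta < 1,$$

with a single, time-invariant period utility function $u$, so that what the agent prefers to consume at date $t$ is independent of what she consumes, or plans to consume, at every other date – precisely the independence Mirrlees objects to.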
But if we are to take this to heart, we should consider preferences that do, in fact, evolve over time, and this would represent a serious move away from standard practice. Guinness once ran a rather successful advertising campaign based on the slogan "I don't like Guinness. That's why I have never tried it". This slogan, which always provokes an amused reaction from non-economists, should not seem absurd to an economist. Given that agents have well-defined preferences over the whole goods space, this sort of statement is perfectly consistent. In basic economic theory, preferences are not formed by experience; they are given from the outset, therefore one knows whether one likes Guinness or not. The agent's characteristics are, in this respect, fixed and immutable and – I
will come back to this in particular – independent of those around him. This is also an idea with early origins. Thomas Hobbes ([1651] 1949) said: "Let us return to the state of nature and consider men as if sprung out of the earth, and suddenly, like mushrooms, come to full maturity without any kind of engagement to each other".
It is surely unreasonable to exclude the possibility that preferences might depend on experience, and even on other changing features of the environment. Yet to do so poses a real problem for inter-temporal optimisation. Samuelson (1937) proposed that one should consider a one-period utility function and then maximise the sum or integral of the discounted utilities over time. This added more, and not very plausible, structure to the preferences of individuals, but allowed the economist to remain within the basic framework with which he is familiar. Samuelson himself was surprised that this rather elementary idea encountered so much success. However, as a description of how people actually behave, this approach has run into major difficulties. For many philosophers this vision of the human is simply incoherent. Furthermore, experiments have shown that people seem to violate the conclusions of this discounting model, and, recently, much attention has been given to the fact that individuals seem to place undue weight on the near future. A well-known paper on this problem is that by Hausman (1979), who looked at purchases of air-conditioning units. The more expensive units were more energy efficient. One could then conclude that those people who chose to pay less today had a higher discount rate than those who preferred to pay more and then have lower energy bills in the future. From this emerged the idea that, if individuals do indeed discount, their rate of discount should be essentially the same across goods. Some natural experiments were carried out by Gately (1980), who showed that revealed discount rates for different products were not the same for a given individual and, indeed, could vary from 5% to 300%! As Prelec and Loewenstein (1997) point out, the same people who smoke may well also save for retirement. Frederick et al. (2002) give a comprehensive and elegant account of such problems. An important response to this sort of question has been to introduce so-called "hyperbolic discounting". This replaces the usual exponential discounting by a different function, which gives much less weight to distant events. There are two significant problems with this approach. First, the utility of an individual is still assumed to be the same at each period, but less weight is attached to later utility; thus the idea that utility will itself change over time is not incorporated. Second, decisions taken in this way are "time inconsistent": the choice made today by someone maximising this sort of preference will not be consistent with what he would like to do at some later date. Yet, it is clear that taking into account how our preferences will change in the future is a very complicated task. To put the problem in perspective, think of the Arrow–Debreu world. If one wants to preserve that framework, one has to say what it is that individuals wish to optimise when they make their choices. If their future preferences are subject to complicated changes, which are, in part, endogenous, the problem becomes
overwhelming. This explains why, within the standard model, we avoid this by assuming that preferences are the same at each point in time.

3. Multiple selves

If, however, we are to take on board the contributions of other fields such as philosophy, then we should take the idea of "multiple selves" suggested by Mirrlees seriously. Philosophers have long been concerned by the problem of the identity of an individual and, contrary to the standard economic view of an unchanging person, have insisted on the fact that the individual changes over time. This has been a preoccupation since the early Greeks. As Plato (1997) remarked (Symposium, 207D–208B), "A man is said to be the same person from childhood until he is advanced in years: yet though he is called the same, he does not at any time possess the same properties; he is continually becoming a new person not only in his body but in his soul; besides we find none of his manners or habits, his opinions, desires, pleasures, pains or fears, ever abiding the same in his particular self, some things grow in him while others perish".
Much later, Hume (2000 [1739/40]) states categorically that when he looks into his most intimate self he detects "[…] nothing but a bundle or collection of different perceptions, which succeed each other with an inconceivable rapidity, and are in perpetual flux and movement" [1.4.6.3].
There is nothing that remains the same over time. Even though each of us believes herself to be the same and identical person over time, our supposed identity is built up through continuous change. We are, as Hume observed, like a theatre in which there is a continuing sequence of plays and scenes. We are then faced with the opposite problem from the economist's: does the individual have anything that allows us to consider him as one entity over time? If not, how can we talk about the same individual at different points in time and, even more difficult, how can we evaluate his overall welfare? A considerable literature has grown up in which this notion is discussed in the economic context and alternative views are proposed. This literature has as its basis a dissatisfaction with the static view of individual characteristics, and a growing recognition that the other social sciences, and the cognitive sciences in particular, have a great deal to tell us about how economic agents are identified. Who an agent is matters, since it will have a direct influence on the choices that the agent can and will make. That identity now matters in economics is shown by a number of economists' recent interest in introducing the notion of identity explicitly within economics (see, e.g., Akerlof and Kranton, 2000, 2002; Davis, 1995, 2003, 2004; Sen, 1999, 2001). Whereas most economists are concerned with social identity, Davis, for example, takes an explicitly philosophical approach and analyses the atomistic conception of the economic agent in what he calls orthodox economics. He claims that it fails to represent any identity according to the philosophical criteria of individuation or
synchronic identity (being able to distinguish between different individuals), and reidentification or diachronic identity (being able to follow one and the same individual through time and change). This same distinction is, incidentally, the basis for the discussion of inter-temporal choice by Frederick (2003). Many solutions have been proposed to this problem, and a well-known suggestion of Parfit and others is to think of the individual as a succession of selves. For example, Pierre Livet (2004) proposes four different identity functions based on different criteria (organic, mnemonic, personality, social), which change over time. However, he assumes that not all of them change at once: while one identity function is changing, the others remain unchanged (though they may well be the source of the change). This allows for a "chain" of changing and changed identity functions that reidentifies the selfsame individual throughout time (a toy rendering is sketched at the end of this section). This comes very close to the position of Parfit (1971, 1984), who argues that people discount the future heavily because the future self is different from the present self and is linked to it only through a chain of intermediaries. Thus, my failure to take account of my fate far in the future may stem from my feeling little for the person who will be myself at that time. Note that Livet's approach differs in a subtle way from that of many neuroscientists when they analyse different selves. In Livet's view there is a struggle between the different identities; this may lead to apparently paradoxical choices, and the incompatibility between the identities will lead to modifications in them. The neuroscientists, however, frequently suggest that one of several potentially active mechanisms, perhaps governed by one region of the brain, "takes over" and temporarily controls the individual's choices and actions. For the neuroscientist there will be competition between different mechanisms which are present for evolutionary reasons, and at any point in time certain mechanisms will prevail. Thus the neuroscientific approach suggests switching between conflicting mechanisms, whereas the philosophical approach suggests a continual evolution of the self or selves over time. A possible conclusion from the philosophical point of view is that individuals do not discount enough: they apparently pay too much attention to the future self. This argument seems testable. Neuroscientists argue that different regions of the brain are triggered by questions concerning the self and by those concerning others. If one accepts Parfit's position, one should see less and less activation of the "self" mechanism as one considers decisions concerning oneself further and further into the future. The rate of this decline would be a measure of the distance between oneself and anonymous others. I will come back to the other contributions that the neurosciences, and what has now come to be called neuroeconomics, can make to our science; for now it is simply worth remarking that the vision of both the philosophers and the biologists is very different from that of economists such as Fudenberg and Levine (2005), who introduce multiple selves but specify their relation and consider that they play a sophisticated game with each other. The neuroscience approach is much more a modular vision in which first one module and then another is switched on.
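Livet's chain can be given a toy formal rendering. The encoding below is my own illustrative assumption, not taken from Livet (2004): a "self" is a tuple of four identity components, and a sequence of selves is chained if each transition changes at most one component, so that consecutive selves always agree on the remaining criteria.

```python
# Toy rendering of a chain of selves: at each step at most one of the
# four identity functions (organic, mnemonic, personality, social) changes.

def chained(selves):
    """True if consecutive selves differ in at most one component."""
    return all(sum(a != b for a, b in zip(s, t)) <= 1
               for s, t in zip(selves, selves[1:]))

selves = [
    ("o0", "m0", "p0", "s0"),
    ("o1", "m0", "p0", "s0"),  # organic identity changes first
    ("o1", "m1", "p0", "s0"),  # then memory
    ("o1", "m1", "p1", "s0"),  # then personality
    ("o1", "m1", "p1", "s1"),  # then social position
]

# The chain reidentifies one individual even though the first and the
# last self no longer share a single component:
print(chained(selves))            # True
print(selves[0], selves[-1])
```

The first and last selves share no component, yet each link overlaps with its neighbours: Parfit's "chain of intermediaries" in miniature.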
4. Neuroeconomics

The neurosciences have recently made rather rapid inroads into economics, and it is interesting to see what the new science of neuroeconomics has contributed. What it has done materially is to bring modern technology for analysing and pinpointing activity in the brain to bear on problems of economic decision making. For a long period there was some scepticism among economists as to how useful this would be in explaining, for example, choice behaviour. Economists tended to argue that neuroscience was just making a map of the brain, and that this was not likely to be very pertinent for explaining human economic behaviour. The activity of those economists interested in the mechanisms of the brain was regarded as a somewhat marginal part of economics, if part it was. Yet, as neuroeconomics has grown in strength and certainly in attention, it has begun to attract much more hostility. This is, incidentally, a clear indicator of its strength. In a recent paper entitled "The Case for Mindless Economics", Gul and Pesendorfer (2005) argue that the analysis of the connection between certain regions of the brain and certain types of decision is irrelevant for economists. They claim that the interest that neuroeconomists have in people's happiness or emotions is not part of the economic discipline. They refer disparagingly to "therapeutic concerns", yet they are not clear as to where happiness begins and welfare stops, since all considerations of efficiency are based on welfare considerations. They lean heavily on the revealed preference theory that I have discussed, and argue that preferences are simply what is revealed by choices. However, as I have said, the only way to prevent this from being purely tautological is to impose some structure on choices, and this structure is what is described as making them "rational". One could then try to see whether the assumptions needed for this are violated; such an approach would describe any such violations as "irrational" (the sketch at the end of this section shows the kind of consistency test involved). Thus their basic argument is that economics has solid foundations in the classic theory of preferences and that this is enough to categorise what is rational or not. They are not prepared to contemplate the idea that no such structure may exist, and that choices may be made in very different ways which cannot easily be reduced to one structure, no matter how sophisticated.
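For readers who want to see the kind of "structure on choices" at issue, here is a minimal sketch of a consistency test in the spirit of the weak axiom of revealed preference. The data and the encoding are my own illustrative assumptions; Gul and Pesendorfer do not, of course, propose this particular code.

```python
# A toy consistency check in the spirit of the weak axiom of revealed
# preference (WARP): if x is ever chosen when y is available, then y must
# never be chosen from a menu that also contains x.

def warp_violations(observations):
    """observations: list of (chosen, menu) pairs, with menu a set.
    Returns the violating pairs (each reported in both orders)."""
    violations = []
    for x, menu_x in observations:
        for y, menu_y in observations:
            if x != y and y in menu_x and x in menu_y:
                violations.append((x, y))
    return violations

observed = [
    ("a", {"a", "b"}),       # a revealed preferred to b
    ("b", {"a", "b", "c"}),  # b chosen with a available: inconsistency
]
print(warp_violations(observed))  # [('a', 'b'), ('b', 'a')]
```

Behaviour that passes such a test can be rationalised by a preference ordering; the question raised in the text is whether real choices, driven by shifting mechanisms, need satisfy any test of this kind.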
5. Emotions or bounded rationality

An alternative view is that one is constantly switching between motivations: some decisions are taken on a short-term emotional basis while others are more calculated. If this is the case, we cannot hope to reconcile all decisions in one rational structure. But this is where the neuroscientists come into the picture, for they attribute a great deal of importance to the influence of emotions on decisions, as Cohen (2005) explains. He argues that emotions do affect our decisions and that "They do so in just about every walk of our lives, whether we are aware or unaware of it and whether we acknowledge it or not. In particular […] emotions may explain inconsistencies in human behavior and forms of behavior that some have deemed irrational, though such behavior may seem more sensible after a discussion of the functions that emotions serve – or may have once served in our evolutionary past".
Thus his basic argument is that if decisions are made on the basis of older "affective" mechanisms, they may not be consistent with those made by the part of the brain associated with reasoning and calculation. This would help to explain what have frequently been described in economics as "paradoxes", and which played an important role in the development of behavioural economics. So Cohen argues that evolutionarily old brain mechanisms, associated with the expression of emotions, may be important in understanding decision making. Although they do not admit this into their formal analysis, few economists are astonished by the idea that a very "unfair" proposal in the ultimatum game elicits reactions from those brain areas associated with, for example, disgust. They would, however, argue that this is somehow obvious and that we did not need a great deal of equipment to come to this conclusion. An alternative suggested by many economists is that, rather than a complex interaction between emotion and reason, what is actually going on is that people simply cannot handle the optimisation problems with which they are faced. They may not have enough information, they may have limited time to devote to the problem, and they may simply not be clever enough. Rather than emotional, they are what has come to be called "boundedly rational". Behavioural economists have been particularly interested in examining and defining this notion (see, for example, Loewenstein, 1996; O'Donoghue and Rabin, 1999; Kahneman, 2003). Yet there is an intrinsic puzzle in this idea. If "bounded rationality" means adding constraints to an existing optimisation problem, it makes that problem more, and not less, complex. Thus, if we consider that individuals are using their reasoning to solve this sort of problem, we are suggesting that they are solving more difficult tasks. Economists should perhaps not be so ready to reject the intrusion of emotions into their field. After all, Hume argued that we are governed by our passions and that to be rational is to respond as efficiently as possible to their needs. A much deeper question raised by Cohen (2005) is why we have emotions at all, and why they may seem to interfere with our rational behaviour. The evolutionary biologists' response to such a question is likely to be that emotions do have a purpose, even though in some circumstances that purpose may have been pre-empted by more reasoned behaviour. The hope is that advances in neuroscience will be able to shed light on the relationship between emotions and rational behaviour.
While it may be useful to model the thinking associated with rational, calculating optimisation, the neuroscientists suggest that the capacity to do this relies heavily on the functioning of a particular set of brain structures, including the prefrontal cortex, which has evolved recently. What is suggested is that "homo economicus" is a partial and biased representation of a decision-making individual. But even the ancient philosophers were aware that it is not always reason that governs our actions. This may be simply because of limitations in our capacity to analyse various aspects, even of the same problem, simultaneously. Our processor simply does not have the capacity to do this. But then we must rely on more primitive mechanisms to take over and, almost by definition, these are less likely to adapt to a given situation. In this case, the reasoning function may be needed to correct rapid but unsatisfactory responses. Although this sounds like an appropriate division of labour, what is not evident to the outsider is how some acquired but sophisticated activities, like riding a bicycle, come, once learned, to be delegated to less calculating areas of the brain. Nevertheless, the underlying argument is that there are tasks that are done promptly and at low cost without the intervention of the reasoning process, and this of course also includes those actions that are taken for emotional reasons.

6. Rapid versus reasoned response

There is a tradition in psychology which separates automatic processes from ones governed by calculation. Kahneman was involved in studying this division, and a number of other authorities have examined it (see, for example, Cohen et al., 1990 and Kahneman and Treisman, 1984). The same type of distinction has been made in the economics-related literature between what have been called System 1 mechanisms, involving rapid automatic responses, and System 2 mechanisms, involving reasoning, which may monitor the System 1 responses (see Kahneman, 2003; and Camerer et al., 2005). However, we are now in a position actually to examine the types of brain activity that are associated with different types of reasoning in different situations. Those situations in which the prefrontal cortex is active are those where a great deal of reasoning and conscious calculation is done. In situations where a more immediate and "instinctive" reaction is appropriate, what is often referred to as the "limbic system" comes into play. How does this work? Neuroscientists have known for some time that several subcortical structures, particularly those in the brainstem that release dopamine and those in the striatum that are influenced by the release of dopamine, respond directly to events which themselves engender direct rewards or harm, or to their anticipation (see, for example, Schultz et al., 1997; Knutson et al., 2001). It is these structures, together with their cortical counterparts, often referred to collectively as the limbic system of the brain, which are intimately related to emotional processing (see Dalgleish, 2004). If we were able to unravel the working and interrelationship of these two systems, we would be well placed, it is argued, to explain the apparently contradictory behaviour of individuals with respect to both time and risk.
This sort of approach is alluded to in Kokinov's chapter in this book. In terms of risk aversion, he claims that different mechanisms can be activated by "cues", and that which mechanism a given cue activates will have a direct impact on choices. While the impact of these cues may not be recognised consciously, they have a direct effect on the decisions experimental subjects make in the face of risky situations. Economists such as Gul and Pesendorfer (2005) are sceptical about this, believing that it must be possible to find one overarching structure within which behaviour could be considered rational. They try to reduce the notion of "cue" to that of an externality or a complementary good. Yet the chapter by Kokinov shows clearly how individuals are influenced in their decisions by cues which are apparently irrelevant from the standard economic point of view. If we are to accept that there is this constant interplay between reasoning on the one hand and emotion and reaction on the other, this would imply a radical rethinking of how agents make their economic decisions. The "holy grail" for neuroeconomics, as expressed by Cohen, an author of several important contributions to the field, is to find explanations which one could not have obtained from observing behaviour alone.

7. Problem solving

A question which is related to those just mentioned, but which is less important to the neuroscientists, is how people solve problems when they are in reasoning mode. Suppose, for the moment, that we simply accept the division between reasoned and instinctive actions, and that an individual is in the reasoning phase. It is of interest to know not only that he is making intensive use of the prefrontal cortex but also what sort of reasoning he is employing; to understand what people do, it is a good idea to understand how they arrived at their decisions. How do people solve problems? In economic theory this is the basic activity of an economic agent, although he is not normally assumed to perform it consciously: he has preferences, he knows the constraints he faces, and he therefore has simply to identify the best element of the choice set according to his preferences. In many situations, however, the problem is posed more consciously. The agent must decide "what is the best way to do something?". His preferences are over the consequences, and perhaps also over the time taken to solve the problem. Such a simple question raises a number of complex issues, some of which are addressed by Egidi in this volume. The best way to solve the problem may be the quickest way to transform the original state into the desired state using a number of permitted operations. But "quickest" may be interpreted as using the minimum energy, or as passing through the smallest number of states in the transition. Thus, if one mapped the states onto a graph, the best method would involve the shortest path from the initial state to the final state (the sketch at the end of this section makes this concrete). There is a fundamental problem here. Are we looking for a method that will always work for a large class of problems? If so, it may well be the case that it will not be efficient in many situations. For some specific cases a particular, and often obvious, way of obtaining the solution exists; the method in question, however, may not work well in general. Unless individuals are faced repeatedly with the same task, they seek algorithms of general applicability, and it is these that they learn to use.
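As a concrete rendering of the shortest-path framing, consider the following sketch. The toy task and the operators are my own assumptions; breadth-first search simply finds a transformation using the fewest permitted operations.

```python
# Problem solving as shortest path in a state graph: breadth-first search
# finds a sequence of permitted operations that passes through the fewest
# states. The toy task below is purely illustrative.
from collections import deque

def shortest_path(initial, goal, operators):
    """Return a shortest list of states from initial to goal, or None."""
    frontier = deque([[initial]])
    visited = {initial}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for op in operators:
            nxt = op(state)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # goal unreachable with these operators

# Transform the number 2 into 11 using the operations "+1" and "*3":
ops = [lambda s: s + 1, lambda s: s * 3]
print(shortest_path(2, 11, ops))  # [2, 3, 9, 10, 11]
```

The fundamental problem noted above is visible even here: breadth-first search works for a large class of such puzzles, but for any particular puzzle a specialised heuristic may reach the goal with far less search.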
8. Interaction and its influence on economic behaviour

In economics we focus on individuals and how they make their decisions, and this is justified because it is typically assumed that individuals act independently of each other and that the only interaction between them is through the price system or "market". Yet in reality individuals interact with each other, and this interaction, which takes many forms, has an important influence on choices. The clear problem is that it is much more difficult to integrate choices which are influenced by others into the analysis of economic activity. Game theory does just this, but it poses fundamental difficulties for the calculating capacity that we have to attribute to individuals: we have literally to endow them with "unbounded rationality". The well-known common knowledge problem shows how solving for the Nash equilibrium of even a simple game is essentially impossible for an individual. It is becoming increasingly apparent that we have to take interaction into account, but we need to find some tractable way to analyse it. The discussion of the role of interaction and organisation is very much present in other disciplines. There is, for example, a considerable debate in the neurosciences as to whether it is necessary to revert to the study of the behaviour of neurons in order to explain behaviour. The situation is complicated, however, by the fact that the practice of analysing aggregate behaviour without considering its micro-foundations is now, in economics, almost universally considered "unscientific". If we accept the idea that we have to descend to the level of the individual to explain aggregate economic behaviour, the question then becomes why we should stop at the level of observed behaviour. Should we not go down to more fundamental levels to examine the explanation for that behaviour? At no level can we avoid the problem that interaction with others, whether with other economic individuals or between different decision-making mechanisms within the same individual, affects the overall result. The difficulties with aggregation, and the impact of interaction on the relation between individual and aggregate phenomena, are clearly shown by the following quote from two neurobiologists: "Major advances in science often consist in discovering how macroscale phenomena reduce to their microscale constituents. These latter are often counterintuitive conceptually, invisible observationally, and troublesome experimentally …
Knowledge of the molecular and cellular levels is essential, but on its own it is not enough, rich and thorough though it be. Complex effects, such as
representing visual motion, are the outcome of the dynamics of neural networks. This means that while network properties are dependent on the properties of the neurons in the network, they are nevertheless not identical to cellular properties, nor to simple combinations of cellular properties. Interaction of neurons in networks is required for complex effects, but it is dynamical, not a simple wind-up doll affair" (Churchland and Sejnowski, 1995). The important message for the economist here is that the aggregate is not related in a simple way to the behaviour of the individuals that make it up. In economics we are faced with a further problem, that of intention: what happens if interaction is itself chosen consciously? It is then not only the organisation and the interaction that matter but also the very nature of the individual and his conscious choices concerning himself and the community he interacts with. I have already mentioned the first attempt to include the economic agent's place in society in an analysis of his choices, that associated with Akerlof and Kranton (2000, 2002). They argued that an agent's utility depends on where in the social system he is situated; preferences are thus conditioned by social identity. While this argument is absolutely fundamental in sociology, it was relatively new in economics, particularly in the context of a formal model. For Akerlof and Kranton the social network was given, and the influence on preferences therefore fixed. Yet introducing the idea of situation, even in a fixed network, becomes more complicated if the influence of others is not simply a fixed component of the utility function. For example, even if the network is fixed, the agent may have some choice as to his place within it. The analysis can be pursued by examining the way an individual's choices are influenced by his neighbours' view of him. A number of economists have reflected on the fact that an agent knows that his acts and choices signal certain of his attitudes and commitments to the members of his communities. The agent is thus interested in maintaining or producing a certain image of himself and will carefully choose his group participation and the acts that constitute his identity (Bernheim, 1994; Akerlof and Kranton, 2000, 2002). The idea of caring about one's image has also been taken up in order to develop a more psychological concept of the economic agent. In this view, the economic agent may want to maintain a certain self-image and will thus have to choose carefully those actions, and that information, which help him to do so (Bénabou and Tirole, 2002, 2003). In particular, the economic agent is concerned to maintain a certain level of self-confidence, which is important for successfully carrying out certain actions at a later moment in time. In other words, he will not want to try too hard to succeed, for fear of creating unwarranted expectations among others as to his ability. Alternatively, the individual wants to construct his own self-image and will thus signal his type to himself through his actions (Bodner and Prelec, 2003). These expansions of the model of the economic agent, to include notions such as social identity or self-image, are important. The road opened up by this sort of analysis seems likely to lead further than the more standard approach.
However, the addition of social and psychological aspects alone does not provide a complete account of the functioning of an individual. Indeed, these models to some extent replicate some of the difficulties that I have alluded to in the philosophical discussion of personal identity in relation to invariance and change. In many models, while the economic agent has a view of himself as evolving over time, his interaction with his environment is not represented in a dynamic way. His environment, or at least the components associated with his social situation, remains fixed, even if chosen at the outset. Thus the static nature of the model is retained. But in this case, I would argue, the influences of a fixed social environment on the agent are not enough to account for personal identity and hence for his choices. In particular, the economic agent will not only be influenced by his social context; he will also, to some extent, influence that context. Thus one has to take the feedback between the two into account. Equally with his self-image: the economic agent does not have a given self-image, but his self-image will be modified by his social context and by the actions he chooses in order to realise it. Indeed, the very structure of society and of the groups which make it up will be modified by the changing evaluations and choices of all the individuals. We now have a very different view of the nature of the individual, and of the society he belongs to, from that of the usual model. To what extent are these problems within the field of enquiry of the economist? The answer is simply that the introduction of these sorts of consideration concerns choices in a very direct way. Choices reflect the constraints the individual faces and what he considers desirable, and the two are of course intimately linked. The individual with a certain identity, which reflects both his preferences and his various social and budgetary constraints, will make certain choices at a given point in time. However, if his identity changes then so, in general, will his choices. Now think again of the Arrow–Debreu world. If one wants to preserve that framework, one has to say what it is that an individual wishes to optimise in making his choices. If his future preferences are subject to complicated changes, which are, in part, endogenous, the problem is overwhelming. Within the standard model, as we know, we avoid this by assuming that preferences, wherever they come from, are unchanging. If we accept the idea suggested here that identity and preferences are constantly evolving, we move to a world far removed from the conventional model. Indeed, if we agree that, facing a future identity which will reflect little of his current tastes, an individual is confronted with a task of momentous proportions in formulating his choices, then either we have to find an alternative to the standard preference arguments, which is largely what behavioural economics is about, or we have to find some way to reconcile these difficulties within the standard rationality framework. We have to be able to explain how individuals could optimise in the face of their unfolding future selves and environment.
For those who feel that I am dismissing the standard model too easily, and that it can be modified appropriately to take care of all these problems, let me mention a possible way out. An appealing argument frequently put forward to defend rational preferences and optimisation, even in the face of complicated inter-temporal choices, is to say that individuals do not in fact optimise: they simply learn from experience how to make the best choices. Thus economic individuals make their choices according to simple rules which they have developed from their experience. Individuals are not optimisers; they are adaptive and only behave "as if" they optimise. In this case, what "works" or does not work is presumably judged on the basis of short-term considerations.

9. Learning and adaptation

This idea is admirably summarised by Lucas (1986), a pillar of the rational school, when he says, "In general we view, or model, an individual as a collection of decision rules (rules that dictate the action to be taken in given situations) and a set of preferences used to evaluate the outcomes arising from particular situation–action combinations. These decision rules are continuously under review and revision: new decisions are tried and tested against experience, and rules that produce desirable outcomes supplant those that do not. I use the term 'adaptive' to refer to this trial-and-error process through which our modes of behaviour are determined".
Notice that there is no mention of the preferences themselves changing: the identity of the individual is unchanging, and it is simply his limited cognitive capacity that prevents him from optimising immediately. However, it is the second part of the reasoning that poses even more problems, for Lucas then goes on to argue that we can safely ignore the dynamics of this process because, "Technically, I think of economics as studying decision rules that are steady states of some adaptive process, decision rules that are found to work over a range of situations and hence are no longer revised appreciably as more experience accumulates".
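Lucas's "trial-and-error process" can be given a minimal computational form. The sketch below is an assumption-laden illustration of my own, not anything proposed by Lucas: two candidate decision rules with noisy but stationary payoffs are reviewed by simple reinforcement, and the better rule eventually ceases to be revised appreciably.

```python
# A minimal rendering of "decision rules ... tried and tested against
# experience": epsilon-greedy reinforcement over two rules with noisy,
# stationary payoffs. Payoff values and parameters are illustrative.
import random

def adapt(mean_payoffs, steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    value = [0.0, 0.0]  # running average payoff of each rule
    count = [0, 0]
    for _ in range(steps):
        if rng.random() < epsilon:
            rule = rng.randrange(2)                  # occasionally experiment
        else:
            rule = 0 if value[0] >= value[1] else 1  # use what has worked
        reward = rng.gauss(mean_payoffs[rule], 1.0)
        count[rule] += 1
        value[rule] += (reward - value[rule]) / count[rule]  # update estimate
    return value, count

# In a stationary environment the better rule is found and then used
# almost exclusively -- a "steady state of some adaptive process":
values, counts = adapt(mean_payoffs=[1.0, 2.0])
print(values, counts)
```

The sketch's environment is stationary by construction, which is precisely what Lucas's argument requires.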
What is the underlying argument here? It is that the evolution of the economic environment is very much slower than the speed at which agents adjust to that evolution; thus the two processes can safely be separated. This is reminiscent of the sort of reasoning used in biological discussions of evolution. However, the position is open to objections when used in the context of economics. First, even if we were prepared to accept that the collective result of adaptive individual interaction converges, then, if we are to argue that we can stick to standard optimisation and equilibrium arguments, it has to be proved that the result will be some standard equilibrium, whether Walrasian or Nash. In that case, we would have a justification for focusing on the equilibrium and would no longer have to worry about how individuals reach it. But what is the environment here? As the brief discussion of interaction shows, for the economic agent it is essentially provided by the other agents. But
these agents are learning as well, and it may well be that the changes in the economic environment are, at least in part, caused by the modifications in the behaviour of all the individuals as they adapt. In this case the environment will evolve together, or "co-evolve", with individual behaviour. There is no particular reason to believe that such a process will converge; indeed, the whole system may be in continual evolution. Thus the configuration of actions being taken by the players may evolve in a complicated way as the result of their collective impact on the payoffs earned by the others. Examples of how this may occur are explicit in models of agents learning in networks, such as those studied by Durieu et al. (2006) in this volume. There are, of course, cases where the system will converge to an equilibrium of the underlying static game or economy. Indeed, as Marimon (1995) points out, simple reinforcement learning rules have shown themselves to be remarkably effective in reaching sophisticated equilibria in game theory. But it is also true that in many examples individuals may learn their way to the wrong result and be perfectly happy with the outcome (see Kirman, 1983). Even in those economic situations where the myopic use of limited rules by individuals can lead to a collectively satisfactory solution there may be difficulties, since, unless the individuals involved learn extremely rapidly, other considerations enter into play. Furthermore, if the agents themselves are changing, so are their evaluations of the payoffs that they face. In the population games which have been widely used in evolutionary game theory, the payoffs are unchanging and the same for all players. This denies any differences between the utilities or preferences of the individuals involved, and all the arguments concerning the way in which individuals make decisions and evolve over time are necessarily excluded from consideration. Thus it is probably the case that learning will not be at all easy to analyse except for a fixed population of identical agents in a totally static environment. This is a world so far removed from the sort of situation which both behavioural and neuroeconomists are trying to study that it seems hardly worthwhile to try to reconcile the two.

10. Conclusion

Economics is at a point where there is a more direct conflict between behavioural approaches, together with the analysis of their neurological origins, and more standard economic theory. The two approaches have made somewhat uneasy bedfellows in the past, but it now seems that the intrusion of psychology and of the neurosciences into economics is regarded with more hostility. Yet this more heated debate reveals an underlying difficulty that has always been present. If we are, as economists usually claim, concerned with the micro-foundations of aggregate behaviour, then we must be prepared to examine the value of those foundations and the hypotheses on which they are based. This is the challenge offered by cognitive economics. If, on the other hand, we are not interested in individual choices as such, and even less in the neurological mechanisms that generate those
choices, then why do we insist on basing our models on optimising individuals? Why not take individual choices as given, test assumptions about those choices, or preferably about their distribution, and work out which testable hypotheses on them would provide some coherent structure? The route still chosen by many economists, of maintaining the standard assumptions on preferences in the face of overwhelming experimental and empirical evidence that people behave otherwise, is neither theoretically justified nor practically very useful. If we are genuinely interested in individual choices and behaviour, and in how these fit together in the social context, the route offered by cognitive economics, understood in a broad sense, seems more promising and more realistic.
References

Akerlof, G.A. and R.E. Kranton (2000), "Economics and identity", Quarterly Journal of Economics, Vol. 115(3), pp. 715–753.
Akerlof, G.A. and R.E. Kranton (2002), "Identity and schooling: some lessons for the economics of education", Journal of Economic Literature, Vol. 40, pp. 1167–1201.
Bénabou, R. and J. Tirole (2002), "Self-confidence and personal motivation", Quarterly Journal of Economics, Vol. 117, pp. 871–915.
Bénabou, R. and J. Tirole (2003), "Intrinsic and extrinsic motivation", Review of Economic Studies, Vol. 70, pp. 489–520.
Bernheim, B.D. (1994), "A theory of conformity", Journal of Political Economy, Vol. 102, pp. 841–877.
Bewley, T. (1972), "Existence of equilibria in economies with infinitely many commodities", Journal of Economic Theory, Vol. 4, pp. 514–540.
Billette de Villemeur, E. (1998), "Heterogeneity and stability: variations on Scarf's processes", Ph.D. thesis, European University Institute, Florence.
Bodner, R. and D. Prelec (2003), "Self-signaling and diagnostic utility in everyday decision making", pp. 105–123 in: I. Brocas and J.D. Carrillo, editors, The Psychology of Economic Decisions, Vol. 1: Rationality and Well-Being, Oxford: Oxford University Press.
Camerer, C., G. Loewenstein and D. Prelec (2005), "Neuroeconomics: how neuroscience can inform economics", Journal of Economic Literature, Vol. 43, pp. 9–64.
Chipman, J.S. (1971), Introduction to Part II, in: J.S. Chipman, L. Hurwicz, M.K. Richter and H.F. Sonnenschein, editors, Preferences, Utility and Demand, New York: Harcourt Brace Jovanovich.
Churchland, P. and T. Sejnowski (1995), The Computational Brain, Cambridge, MA: MIT Press.
Cohen, J.D. (2005), "The vulcanization of the human brain: a neural perspective on interactions between cognition and emotion", Journal of Economic Perspectives, Vol. 19, pp. 3–24.
Cohen, J.D., K. Dunbar and J.L. McClelland (1990), "On the control of automatic processes: a parallel distributed processing account of the Stroop effect", Psychological Review, Vol. 97, pp. 332–361.
Dalgleish, T. (2004), "The emotional brain", Nature Reviews Neuroscience, Vol. 5, pp. 583–589.
Davis, J.B. (1995), "Personal identity and standard economic theory", Journal of Economic Methodology, Vol. 2(1), pp. 35–52.
Davis, J.B. (2003), The Theory of the Individual in Economics: Identity and Values, London: Routledge.
Davis, J.B. (2004), "Complex economic systems: using collective intentionality analysis to explain individual identity in networks", Journal of Economic Methodology, Vol. 2(1), pp. 35–52.
Debreu, G. (1974), "Excess demand functions", Journal of Mathematical Economics, Vol. 1, pp. 15–23.
Frederick, S. (2003), "Time preference and personal identity", in: G. Loewenstein, D. Read and R. Baumeister, editors, Time and Decision: Psychological Perspectives on Intertemporal Choice, New York: Russell Sage.
Frederick, S., G. Loewenstein and T. O'Donoghue (2002), "Time discounting and time preference: a critical review", Journal of Economic Literature, Vol. 40, pp. 351–401.
Fudenberg, D. and D. Levine (2005), "A dual self model of impulse control", Mimeo, Harvard Institute for Economic Research.
Gately, D. (1980), "Individual discount rates and the purchase and utilization of energy-using durables: comment", Bell Journal of Economics, Vol. 11, pp. 373–374.
Grandmont, J.-M. (1987), "Distributions of preferences and the 'Law of Demand'", Econometrica, Vol. 55(1), pp. 155–161.
Grandmont, J.-M. (1992), "Transformations of the commodity space, behavioural heterogeneity, and the aggregation problem", Journal of Economic Theory, Vol. 57, pp. 1–35.
Gul, F. and W. Pesendorfer (2005), "The case for mindless economics", Working Paper, Princeton University.
Härdle, W., W. Hildenbrand and M. Jerison (1991), "Empirical evidence on the law of demand", Econometrica, Vol. 59, pp. 1525–1549.
Hausman, J.A. (1979), "Individual discount rates and the purchase and utilization of energy-using durables", Bell Journal of Economics, Vol. 10, pp. 33–54.
Hildenbrand, K. and R. John (2003), "On parametrization in modelling behavioral heterogeneity", Working Paper 03-27, Institute of Economics, University of Copenhagen.
Hildenbrand, W. (1983), "On the law of demand", Econometrica, Vol. 51, pp. 997–1019.
Hildenbrand, W. (1989), "Facts and ideas in microeconomic theory", European Economic Review, Vol. 33, pp. 251–276.
Hildenbrand, W. (1994), Market Demand: Theory and Empirical Evidence, Princeton: Princeton University Press.
Hildenbrand, W. and A. Kneip (1993), "Family expenditure data, heteroscedasticity and the 'Law of Demand'", Ricerche Economiche, Vol. 47, pp. 137–165.
Hildenbrand, W. and A. Kneip (2005), "On behavioral heterogeneity", Economic Theory, Vol. 25, pp. 155–169.
Hobbes, T. (1949 [1651]), De Cive, or The Citizen, New York: Appleton-Century-Crofts.
Hume, D. (2000 [1739/40]), A Treatise of Human Nature, D.F. Norton and M.J. Norton, editors, Oxford: Oxford University Press.
Kahneman, D. (2003), "A perspective on judgment and choice: mapping bounded rationality", American Psychologist, Vol. 58(9), pp. 697–720.
Kahneman, D. and A. Treisman (1984), "Changing views of attention and automaticity", pp. 29–61 in: R. Parasuraman, D.R. Davies and J. Beatty, editors, Varieties of Attention, New York: Academic Press.
Kirman, A. (1983), "Mistaken beliefs and resultant equilibria", in: R. Frydman and E. Phelps, editors, Individual Forecasting and Aggregate Outcomes, New York: Cambridge University Press.
Knutson, B., G.W. Fong, C.M. Adams, J.L. Varner and D. Hommer (2001), "Dissociation of reward anticipation and outcome with event-related fMRI", NeuroReport, Vol. 12(17), pp. 3683–3687.
Livet, P. (2004), "La pluralité cohérente des notions de l'identité personnelle", Revue de Philosophie Economique, Vol. 9, pp. 29–58.
Loewenstein, G. (1996), "Out of control: visceral influences on behavior", Organizational Behavior and Human Decision Processes, Vol. 65(3), pp. 272–292.
Lucas, R.E. (1986), "Adaptive behaviour and economic theory", Journal of Business, Vol. 59, pp. S401–S426.
Mantel, R. (1974), "On the characterisation of aggregate excess demand", Journal of Economic Theory, Vol. 7, pp. 348–353.
Marimon, R. (1995), "Learning from learning in economics", Mimeo, European University Institute, Florence.
Mas-Colell, A., M.D. Whinston and J.R. Green (1995), Microeconomic Theory, Oxford: Oxford University Press.
Mirrlees, J. (1982), "The economic uses of utilitarianism", pp. 63–84 in: A. Sen and B. Williams, editors, Utilitarianism and Beyond, Cambridge: Cambridge University Press.
O'Donoghue, T. and M. Rabin (1999), "Doing it now or later", American Economic Review, Vol. 89(1), pp. 103–124.
Parfit, D. (1971), "Personal identity", Philosophical Review, Vol. 80(1), pp. 3–27.
Parfit, D. (1984), Reasons and Persons, Oxford: Oxford University Press.
Plato (1997), Symposium, trans. A. Nehamas and P. Woodruff, in: J.M. Cooper, editor, Plato: Complete Works, Indianapolis: Hackett.
Prelec, D. and G. Loewenstein (1997), "Beyond time discounting", Marketing Letters, Vol. 8, pp. 97–108.
Saari, D. and C.P. Simon (1978), "Effective price mechanisms", Econometrica, Vol. 46, pp. 1097–1125.
Samuelson, P. (1937), "A note on measurement of utility", Review of Economic Studies, Vol. 4, pp. 155–161.
Schultz, W., P. Dayan and P.R. Montague (1997), "A neural substrate of prediction and reward", Science, Vol. 275(5306), pp. 1593–1599.
Sen, A. (1999), Reason before Identity: The Romanes Lecture, Oxford: Oxford University Press.
Sen, A. (2001), "Other people", Proceedings of the British Academy, Vol. 111, pp. 319–335.
Sonnenschein, H. (1972), "Market excess demand functions", Econometrica, Vol. 40, pp. 549–556.
Uzawa, H. (1960), "Preferences and rational choice in the theory of consumption", in: Proceedings of the First Stanford Symposium on Mathematical Methods in the Social Sciences, Stanford: Stanford University Press.
Subject Index

adaptation 161–164, 185, 254–255 aggregation 38, 39, 41, 251 ambiguity 79 analogy 7, 107, 113 artificial intelligence (AI) 156 automaton 21, 136, 139 belief 1–6, 47, 48, 50, 54–59, 61, 62, 102, 123, 177, 178, 185, 186, 188, 197, 198, 209, 212, 214, 225–228, 230, 233 bias 9, 15, 70, 72, 86, 225, 233 categorization 9, 15–18, 26, 27, 34, 37, 43, 106 causal 212, 214, 229, 233 certainty 6, 104 coalition 107 codification 19, 205–207, 213, 215–218, 210 cognition (cognitive) cognitive architecture 10, 99, 100, 106, 108 cognitive economics 1–9, 99, 100, 102, 106, 237, 238, 240–242 cognitive science 2, 6, 7, 10, 100, 103–106, 212, 214, 215, 245 social cognition 2 communication 8, 107, 137 competition 3, 6, 100, 226, 230, 246 complexity (complex) 4, 7, 16, 76, 100, 103, 121, 124, 127, 142, 148, 153, 210, 211, 213, 217, 248, 250, 252 computation (computational) 4, 10, 23, 62, 99, 102, 103, 106, 107, 135, 217 conditional 4, 63, 80, 228 conditionalization 58
conjecture 26–29 connectionist 107 consistency 70, 158, 159 contagion 10, 135–151 context 2, 4, 5, 7, 10, 15, 16, 18, 23, 31, 48, 56, 70, 72, 73, 75, 79, 99, 101, 105–107, 110–112, 128, 135–137, 139, 151, 156, 208, 210, 212, 216, 241, 245, 252–254 continuity 45, 76, 77, 80, 81, 88, 91 convention 6, 156 convexity 59 cooperation 153 coordination 1, 6, 136, 139, 151, 16, 157, 165, 177–179, 181, 188, 193, 194 decision/decision-making binary decision 157, 183 decision under uncertainty 79, 104 demand function 239 dominance 85, 123, 124, 126, 128 dynamics (dynamic, dynamical) collective dynamics 177–201 evolutionary dynamics 119, 120 sequential dynamics 184, 188, 189, 191, 193, 198–200 stochastic dynamics 119, 127–129, 131 eductive process 6 efficiency (efficient) 5, 18, 19, 27, 153–156, 163–168, 188, 220, 247 emergence 154 environment 1–3, 6, 99, 100, 102, 103, 110, 112, 121, 125–127, 129–131, 154, 156, 244, 253–255 epistemic 7, 10, 47–49, 53, 54, 56, 60 epistemology 12, 205, 212
equilibrium Nash equilibrium 10, 11, 47, 120, 121, 123, 124, 127, 129–131, 136, 156, 179, 192, 193, 251 Walrasian equilibrium 254 evolution (evolutionary, evolutionist) 2, 6, 7, 102, 103, 119, 120, 121, 127, 128, 130, 36, 154, 189, 193, 195, 211, 219, 243, 246, 254, 255 experiment (experimental) 7–10, 15, 18, 29, 34, 42, 69, 99, 100, 103–105, 110–112, 119–122, 124, 126, 130, 132, 178, 179, 213, 214, 225, 228, 229, 231, 240, 242–244, 250 experimental economics 7, 228, 240 exploration-exploitation 5 externality 155, 156, 179, 250 feedback 229, 253 fictitious play 121, 131, 177, 179, 185, 186, 188, 190–195, 197, 199, 200 framing effect 4, 10, 56, 104, 105 game coordination game 136, 139, 151, 156, 165, 178, 179, 188 normal form game 10, 119–132, 135 game theory 2, 9, 119, 122, 156, 167, 251, 25 evolutionary game theory 120, 136, 255 heterogeneity 155, 160–165, 180, 188, 241, 242 imitation 12, 139, 215 independence 78, 79, 88, 89 induction (inductive) 8, 11, 67, 142, 233 inference 9 information (informational) 1–6, 50, 54, 57, 58, 71, 72, 84, 87, 105, 107, 108, 154, 161–163, 178, 179, 181, 183, 190, 191, 195, 197, 205–209, 212, 216, 217, 225, 227, 228, 230, 241, 248, 252 asymmetric information 228 incomplete information 60 information value 5 innovation 206, 209, 210, 217–220
institution (institutional) 2, 3, 6–8, 101, 216, 219, 220, 232 introspection 238, 239 invariance 253 knowledge 9, 11, 12, 48, 100, 107, 156, 178, 179, 181, 182, 205–220, 226, 228, 251 common knowledge 251 tacit knowledge 205–220 language 7, 8, 49, 50, 53, 55, 57, 60, 65, 66, 156, 212, 213 learning reinforcement learning 5, 10, 167, 168, 170–172, 177, 179, 185, 186, 189, 190, 195–196, 255 weighted belief learning 185, 186, 188, 197, 198 logic (logical) 47–49, 53–56, 60, 61, 65, 153, 156, 157 epistemic logic 7, 10, 47, 49–51, 54, 56, 60 logical omniscience 9, 10, 47–63 probabilistic logic 10, 47, 61, 65 market 5, 6, 73, 157, 178, 194, 220, 225, 227–229, 231, 238, 241, 251 meaning 233 memory 40, 99, 103, 106, 107, 183, 185, 214 message 4, 107, 217 methodological individualism 2, 237 multi-agent model 7 nature 2–5, 73, 87, 208 network 5, 121, 135, 136, 138, 139, 141–146, 157, 252, 255 neural network 121, 132, 216, 252 neuroscience 212, 213, 246–248, 251, 255 norm 4, 83 normative view 70, 77, 79 optimality 15, 18, 19, 22, 23, 27, 43, 138, 144 Pareto optimality 179, 188 organization 1, 8
path-dependency phase 7 phase diagram 181, 182, 187–193 preference 63, 69–80, 84, 86, 87, 110, 153, 155, 156, 159, 165, 169, 180, 237–240, 247 price 3, 5, 6, 11, 39, 71, 177–179, 184, 187, 188, 191, 193, 194, 242, 251 problem solving 15–44, 107, 110, 217, 250–251 prospect 70, 104, 105 rationality (rational) 4, 6, 8–10, 15, 18, 42, 43, 47, 48, 63, 69, 70, 73, 74, 78, 79, 87, 102–106, 111, 119, 136, 155–159, 161, 167, 178, 181–183, 190, 225, 233, 237, 238, 241, 247–251, 253, 254 reasoning 1, 2, 4, 6–8, 11, 18, 87, 99, 103, 106, 107, 161, 216, 248–250, 254 regulation 211 representation 2, 9, 15, 16, 18, 19, 25, 30, 33, 35–37, 39, 41–43, 48, 59, 63, 71, 72, 80–84, 86, 89, 92, 106, 157, 211, 214, 216, 249 revising 4, 58
risk 10, 11, 79, 83, 99–106, 110–112, 123, 126, 130, 180, 225–228, 230–232, 250 routine 5, 214 selection (selective) 128, 131, 157 similarity 87 simulation 7–9, 106, 163, 179, 182, 186, 188, 190–200 stability 18, 28, 39, 43, 240–242 state space 48, 49, 52, 54, 55, 57, 60, 61, 66 statics (static) 8, 240, 253, 255 statistical physics 7, 11 structural 4, 6, 10, 213, 218 symmetry 21, 127 technology 5, 6, 100, 205–207, 209, 211, 219, 247 threshold 10, 11, 136–139, 141–146, 158, 168 trust 6, 101, 229 uncertainty 4, 6, 48, 61–63, 79, 84, 85, 101, 104, 178, 228, 231 updating 4, 127, 128, 184, 185, 188–190, 192, 194, 196, 197, 200