Microeconomic Theory
Microeconomic Theory: A Concise Course
James Bergin
Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York: Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto. With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam.

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

Published in the United States by Oxford University Press Inc., New York

© James Bergin, 2005

The moral rights of the author have been asserted. Database right Oxford University Press (maker).

First published 2005

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer.

British Library Cataloguing in Publication Data: Data available
Library of Congress Cataloging in Publication Data: Data available

ISBN 0–19–928029–0 (Hbk.) 978–0–19–928029–2

1 3 5 7 9 10 8 6 4 2
Preface This book covers a standard range of topics that appear in graduate courses in microeconomic theory. The intent in writing has been to be brief and clear, and yet achieve some depth of detail, summarizing key ideas in each topic succinctly. With one or two exceptions, it is possible to read any chapter without reference to any other—so that chapters can be read independently. In terms of length, the time devoted to each topic is comparable to what one might find in an actual teaching setting. The material is not intended as a textbook since there are no assignments or teaching materials. However, it is hoped that it will be a useful companion reader; much of the subject matter appears in any graduate microeconomics program. The topics are presented on the presumption that readers will already have some familiarity with the issues considered, and are looking for further discussion, or a “second point of view”. The subject matter implicitly reflects some degree of personal taste and judgment on the current direction of the subject. For example, there is substantial emphasis on information economics and the statistical techniques used in that literature; auctions are covered in some detail; a chapter on large games introduces anonymous games and illustrates the methodology with a macroeconomic application. And so on. The notes draw from a wide range of sources, but in general the primary sources are the references at the end of each chapter—although the link to the references is stronger in some chapters than others. In some cases, the material depends almost exclusively on only a few references (as is the case in chapter 8); in other cases, the discussion is broad ranging and may well contain material unique to these notes. Thanks are due to my family for their support during the writing of this book. At Oxford University Press, I thank Andrew Schuller for help in developing the project, and Carol Bestley and Jennifer Wilkinson for guiding the material through the stages of production. The remainder of the preface briefly describes the content of each chapter.
Contents

A Brief Outline of the Chapters

1. Decision Theory
1.1 Introduction
1.2 Preferences and Optimal Choices
1.3 Decisionmaking under Risk
1.3.1 von Neumann-Morgenstern preferences
1.3.2 Other preference specifications
1.4 The State Preference Model
1.5 Decisionmaking under Uncertainty
1.5.1 Objections to the theory
1.5.2 Other preference specifications
Bibliography

2. Preferences, Risk, and Stochastic Dominance
2.1 Introduction
2.2 von Neumann-Morgenstern Preferences and Risk
2.2.1 Risk aversion: some relations
2.2.2 Risk aversion and behavior: asset choice
2.3 Risk Aversion and the State Preference Model
2.4 Stochastic Dominance
2.4.1 Stochastic dominance and distribution functions
2.4.2 Stochastic dominance and preferences
2.5 Equivalence of Dominance Criteria
2.5.1 Equal means: mean preserving spreads
2.5.2 Higher order stochastic dominance
2.5.3 Stochastic dominance and risk aversion
2.5.4 Likelihood ratios and hazard rates
2.5.5 Dominance in terms of semideviations
2.5.6 Conditional stochastic dominance and monotone likelihood ratios
Bibliography

3. Strategic Form Games
3.1 Introduction
3.2 Strategies
3.3 Solutions
3.3.1 Maxmin choices
3.3.2 Dominant strategies
3.3.3 Rationalizability
3.3.4 Evolutionary stable strategies
3.4 Nash Equilibrium
3.5 Correlated Equilibrium
Bibliography

4. Nash Equilibrium—Existence and Refinements
4.1 Introduction
4.2 Nash Equilibrium
4.3 Existence of Equilibrium
4.3.1 Fixed points
4.3.2 Equilibrium
4.4 Perfect Equilibrium
4.5 Proper Equilibrium
4.6 Persistent Equilibrium
Bibliography

5. Mechanism Design
5.1 Introduction
5.2 Mechanisms
5.3 Complete and Incomplete Information Environments
5.4 Implementation: Complete Information
5.4.1 Direct mechanisms
5.5 Dominant Strategy Implementation
5.5.1 The revelation principle: dominant strategies
5.5.2 Strategy-proofness
5.5.3 The Gibbard-Satterthwaite theorem
5.5.4 Preference domain restrictions
Bibliography

6. Implementation: Complete and Incomplete Information
6.1 Introduction
6.2 Complete Information Environments
6.3 Strategic Form Mechanisms (Complete Information)
6.3.1 The environment
6.3.2 Nash implementation
6.3.3 Undominated Nash implementation
6.3.4 Virtual implementation
6.4 Extensive Form Mechanisms (Complete Information)
6.5 Incomplete Information Environments
6.5.1 The framework
6.5.2 Incentive compatibility and participation
6.5.3 Ex ante, interim, and ex post criteria
6.5.4 Strategic form mechanisms (incomplete information)
6.5.5 Nash implementation
6.6 Other Mechanisms
Bibliography

7. Auctions I: Independent Values
7.1 Introduction
7.2 Auction Procedures
7.2.1 First price auctions
7.2.2 Second price auctions
7.2.3 All-pay auctions
7.2.4 Fixed price auctions (take it or leave it pricing)
7.2.5 The Dutch and the English auctions
7.3 Revenue Equivalence
7.4 Reduced Form Auctions
7.4.1 Incentive compatibility
7.4.2 Revenue
7.5 The Optimal Auction
7.5.1 Canonical Pricing
7.6 Risk Aversion
7.7 Efficiency and Optimality
Bibliography

8. Auctions II: Dependent Values
8.1 The Framework
8.1.1 Affiliated (MTP2) random variables
8.2 Auction Procedures
8.2.1 First price auctions
8.2.2 First price auctions: an example
8.2.3 Second price auctions
8.2.4 English auctions
8.2.5 Revenue comparisons
8.3 Price and Information Linkages
8.4 The Winner's Curse
8.5 Optimality: Surplus Extraction
8.6 Farkas' Lemma
Bibliography

9. Extensive Form Games
9.1 Introduction
9.2 Description of an Extensive Form Game
9.2.1 Choices
9.2.2 Information
9.3 Strategies
9.3.1 Strategies: informal description
9.3.2 Strategies: detailed description
9.3.3 Perfect recall
9.3.4 Strategic equivalence with perfect recall
Bibliography

10. Equilibrium in Extensive Form Games
10.1 Introduction
10.2 Extensive and Strategic Form Equilibria
10.2.1 Subgames and subgame perfection
10.3 Perfect Equilibrium
10.4 Sequential Equilibrium
10.5 Perfect Bayesian Equilibrium
10.6 Proper and Sequential Equilibrium
10.7 The Chain Store Paradox
10.7.1 The complete information model
10.7.2 The incomplete information model
Bibliography

11. Repeated Games
11.1 Introduction
11.2 The Framework
11.2.1 Evaluation of payoff flows
11.2.2 Strategies and equilibrium
11.2.3 Mixed strategies
11.3 The Impact of Repetition
11.4 Characterization of Equilibrium Payoffs
11.4.1 Maximal punishments and minmax payoffs
11.4.2 Convexity, feasibility, and observability
11.5 Infinitely Repeated Games with Averaging
11.6 Infinitely Repeated Games with Discounting
11.6.1 The dimensionality condition
11.7 Finitely Repeated Games
11.8 Finite Repetition and Discounting
11.8.1 No gain from one-shot deviation
11.8.2 History independent punishments
11.9 Repeated Games of Incomplete Information
11.9.1 Strategic information revelation
11.9.2 Equilibrium
Bibliography

12. Information
12.1 Introduction
12.2 The Framework
12.3 Information and Decisions
12.4 Utility Maximization and the Value of Information
12.4.1 Finer information
12.4.2 Garbling
12.5 Monotonic Decisions
12.6 Likelihood Ratios, MTP2, and Supermodularity
12.6.1 Monotone likelihood ratios: observations
12.6.2 Monotone total positivity of order two
12.6.3 Supermodularity and monotonicity
12.7 The Multiperson Environment
12.7.1 Rational expectations
12.7.2 Nonexistence of equilibrium
12.7.3 Rational expectations and no speculation
12.8 Equilibrium in n-player Bayesian games
12.9 Multiagent Models: Information Structures
Bibliography

13. The Principal-Agent Problem
13.1 Introduction
13.2 Details
13.3 The Full Information Case
13.3.1 Risk aversion and risk allocation
13.3.2 Efficiency with a risk neutral principal
13.4 The Incomplete Information Case
13.4.1 The first-order approach
13.4.2 Validity of the first-order approach: sufficiency conditions
13.4.3 Comments on the sufficiency conditions
13.4.4 Inefficiency and the likelihood ratio
Bibliography

14. Signaling
14.1 Introduction
14.2 Signaling Games
14.2.1 Nash equilibrium
14.2.2 Sequential equilibrium
14.2.3 Intuitive equilibrium
14.3 Examples
Bibliography

15. Screening
15.1 Introduction
15.2 Screening Models
15.2.1 The insurance market model
15.2.2 The labor market model
Bibliography

16. Common Knowledge
16.1 Introduction
16.2 Information Structures
16.3 Common Knowledge
16.4 Posterior Announcements
16.5 Public Announcements
16.6 Common Knowledge of an Aggregate Statistic
16.7 Common Knowledge and Equilibrium
16.8 No-Trade Theorems
Bibliography

17. Bargaining
17.1 Introduction
17.2 Axiomatic Bargaining
17.3 Axiomatic Bargaining Solutions
17.3.1 Egalitarian and utilitarian solutions
17.3.2 The Nash bargaining solution
17.3.3 The Kalai-Smorodinsky (K-S) bargaining solution
17.4 Noncooperative Bargaining
17.5 Alternating Offers and Nash Bargaining
17.6 Bargaining with Many Individuals
Bibliography

18. Cooperative Outcomes
18.1 Introduction
18.2 Framework
18.3 The Core
18.3.1 Balancedness
18.4 Nontransferable Utility
18.4.1 Derivation of the coalition function
18.5 von Neumann-Morgenstern Solutions and Stability
18.5.1 Stability
18.6 The Shapley Value
Bibliography

19. Anonymous Games
19.1 Introduction
19.2 Formulation of Anonymous Games
19.2.1 Equilibrium
19.2.2 An example
19.2.3 Pure strategy equilibrium
19.3 Strategies as Functions
19.4 Dynamic Anonymous Games
19.5 Social Planner Formulations
19.5.1 Surplus maximization: an example
19.6 No Aggregate Uncertainty
Bibliography

20. Evolution and Learning
20.1 Introduction
20.2 Fictitious Play
20.3 Replicator Dynamics
20.4 Stochastic Stability
20.4.1 Motivation
20.4.2 Invariant distributions: overview
20.4.3 Best response dynamics: an example
20.4.4 Imitative dynamics: an example
20.5 Regret Minimization
20.5.1 Approachable sets of payoffs
20.5.2 The model
20.6 Calibration
20.7 Bayesian Learning
20.8 Approachability
Bibliography

Index
A Brief Outline of the Chapters Chapter 1 considers decision theory. Starting with preference orderings, sufficient conditions are given for the existence of optimal choices and for the representation of preference orderings by utility functions. Following this, decision-making under risk is discussed. The starting point for the discussion is the von Neumann–Morgenstern model. This sets the stage for objections to the von Neumann–Morgenstern theory and consideration of possible alternatives that preserve some simplicity of structure. These separate into two groups—betweenness models in which indifference curves are linear but not parallel; and rank order models which permit nonlinear indifference curves. Finally, decisionmaking under uncertainty is discussed. This parallels the risk discussion, with the theory of Savage replacing that of von Neumann and Morgenstern. Objections to the theory of Savage are described and proposed alternatives are discussed. Chapter 2 focuses on preferences and risk, beginning with a discussion of risk aversion in the context of von Neumann–Morgenstern preferences. Risk aversion and asset choice are considered briefly. It is shown how assumptions on risk aversion provide implications for portfolio choice. The state preference model is described, and assumptions on absolute and relative risk aversion are used to determine the shape of indifference curves in the state space. There are many measures of riskiness of a random return. Measuring riskiness in terms of dominance criteria is discussed at length. Definitions are given purely in terms of distributions, and these are connected to preference-based definitions. First-, second-, and higher order stochastic dominance are explained along with mean preserving spreads, conditional stochastic dominance, monotone likelihood ratio dominance, and hazard rate dominance. The relations between these notions are examined. In addition, a semideviation model is presented which has some virtue as a risk criterion when decisions are based on risk–return pair comparisons. In Chapter 3 the basic features of a game in strategic form are described. A variety of approaches to the selection of equilibrium outcomes are considered. These provide alternative perspectives on how a player might reasonably approach a strategic decision problem. One part discusses Nash equilibrium, but avoids going into the knowledge considerations that lie behind its (modern) logical foundations.
The section includes a short discussion of dynamic stability of Nash equilibrium in terms of a simple tatonnement model. Chapter 4 considers the existence of Nash equilibrium and describes some of the major equilibrium refinements (perfection, properness, and persistence). The chapter also lists a few fixed point theorems that arise in traditional proofs of existence of equilibrium. The relation between properness and sequential equilibrium is explained in Chapter 10, establishing an important connection between extensive and strategic form equilibria. Mechanism design is introduced in Chapter 5. The key classifications of complete and incomplete information are given and the revelation principle is described. Direct mechanisms and dominant strategy implementation are discussed. The Gibbard–Satterthwaite theorem and some positive results are given for single-peaked and quasilinear preferences. Chapter 6 discusses complete and incomplete information implementation in both strategic and extensive form games. A variety of solution concepts are used—Nash, undominated Nash, virtual implementation, and subgame perfection. The key ideas on monotonicity and Bayesian monotonicity are explained and they highlight the essential role of preference reversals in designing implementing mechanisms. This also frames the discussion for mechanisms based on other solution concepts or game forms. In Chapter 7 auctions with independent values are described. Detailed calculations of equilibrium strategies are given for the standard auctions (first price, second price, and so on). The fundamental revenue equivalence theorem is illustrated by computing expected revenue for five different types of auction that all share the key features sufficient for revenue equivalence (the object is assigned to the buyer with the highest valuation, and the lowest valuation type has an expected payment of 0). Then, reduced form auctions are discussed. These are the key to a full study of the structure of incentives in this environment. One important observation comes directly from incentive compatibility: apart from a common constant, the assignment rule fully determines the expected payment of every type of a bidder. A simple envelope theorem argument is used to give this result. The implications for revenue are immediate—maximizing revenue (the optimal auction) revolves around the optimal assignment rule. This is used to characterize the optimal auction. Finally, the chapter concludes with a section on risk aversion. The revenue equivalence link is broken; greater risk aversion produces more competitive bidding. This captures the intuition that greater risk aversion leads to greater loss from not winning the object, and hence more aggressive bidding. Chapter 8 considers auctions where valuations are not drawn independently. Equilibrium bidding behavior in the first price, second price, and English auctions is characterized. Revenue comparisons are given showing that the revenue equivalence theorem fails for the standard auctions when values are correlated. The expected revenue is at least as large in the second price auction as in the first price auction. The linkage principle (price is positively “linked” to information) is described. Finally, full surplus extraction is discussed at length. This is the analog of the optimal auction in the independent valuations environment.
Chapter 9 introduces extensive form games. Information structures—perfect, imperfect, and incomplete information—are explained. Pure, behavioral, and mixed strategies are defined. Finally, perfect recall and the equivalence of mixed and behavioral strategies (in terms of end point distributions) are considered. Next, equilibrium in extensive form games is considered in Chapter 10. This chapter covers Nash equilibrium, perfect equilibrium, sequential equilibrium, and perfect Bayesian equilibrium. The classic chain store paradox example is discussed as an illustration. Repeated games are considered in Chapter 11. Apart from definitions and the familiar characterization results, the discussion explains issues surrounding randomization, observability, feasibility, and convexity. Games with payoff averaging and discounting are discussed along with finitely repeated games. The “no-gain-from-one-shot-deviation” property of games with continuous payoffs is discussed and the proof sketched. Finally, games of incomplete information are introduced. Information models are considered in Chapter 12. The chapter begins with a discussion of utility maximization when the decisionmaker has some information. The main focus of the first part is to set the framework for a discussion and proof of Blackwell's theorem on garbling and the value of information. Information and monotonic decisions are discussed: under what conditions do higher signal values lead to a higher optimal level for a choice variable? This issue is examined using simple arguments which then lead to a discussion of stochastic dominance of the distribution on states conditional on the signal, and subsequently to supermodularity of the utility function. This then leads to the introduction of monotone total positivity of densities—which is seen to correspond to the monotone likelihood condition. Some useful results relating to monotone total positivity are given. The material sets the stage for a brief review of supermodularity and submodularity and how the concepts relate to optimization. Multiperson environments are then introduced. The first observation made is that more information is not necessarily valuable. Rational expectations concepts and potential nonexistence of equilibrium are explained. Here, a short proof is given of the fact that in a rational expectations equilibrium, no speculative gain is possible. Next, equilibrium in an abstract game of incomplete information is discussed. Finally, the discussion sets the stage for three classical information models by describing their distinct features; the information structures for principal–agent, screening, and signaling models are laid out. In Chapter 13 the principal–agent problem is considered. The principal's problem with full information is taken as the benchmark. Full insurance occurs with a risk neutral principal and risk averse agent. Then efficient risk allocation with a risk averse principal and risk averse agent is considered, again with full information. Turning to the incomplete information case, unobservable effort raises the key incentive problem. Optimizing subject to first-order conditions—the first-order approach—is considered at length, and sufficient conditions for validity of the first-order approach are given. The key conditions relate to the distribution
function of output, conditional on effort. One of these conditions is the monotone likelihood ratio condition. Some distributions satisfying the sufficiency conditions are given. Finally, the monotone likelihood ratio is used to interpret output as a signal of effort, and the level of inefficiency is related to the informativeness of the likelihood ratio. In Chapter 14 the signaling model is considered. The model is used to highlight the difference between various equilibrium refinements. Chapter 15 considers the traditional screening model and covers the basic features of screening models, including pooling and separating equilibria. In Chapter 16 common knowledge is discussed. The chapter describes information structures and sets up a framework for the discussion of common knowledge. The definition of common knowledge is given. Convergence of beliefs under iterative announcement is discussed and the implications of common knowledge of an aggregate statistic are described. In a game theoretic framework, it is shown how lack of common knowledge can lead to cooperative equilibrium in a finitely repeated prisoners' dilemma game. Finally, a no-trade theorem is given. Chapter 17 deals with bargaining. The chapter begins with the axiomatic bargaining framework. Derivation of a bargaining set from an underlying environment is illustrated by example. Four axiomatic solutions are discussed: Egalitarian, Utilitarian, Nash, and Kalai–Smorodinsky. Proofs for the Nash and Kalai–Smorodinsky characterizations are given. Noncooperative bargaining is considered with emphasis on the alternating offers model and its recursive structure. The connection between Nash bargaining and the alternating offers model is described. This is the basic noncooperative foundations story: as the time between offers goes to 0, the alternating offers equilibrium division converges to the generalized Nash bargaining solution. Finally, difficulties that arise in the multiperson case are discussed. In Chapter 18 cooperative games are considered. The chapter discusses some of the key ideas in cooperative game theory. The core is introduced and the key idea behind nonemptiness (balancedness) is explained through the dual program. Since the coalitional function is commonly introduced without reference to underlying preferences and choice sets, it is discussed here through the notions of “alpha” and “beta” effectivity. Following this, von Neumann–Morgenstern solutions are discussed, and von Neumann–Morgenstern stability is defined. The chapter concludes with a description of the Shapley value. Large games—games with a continuum of players—are described in Chapter 19. Both one-shot and dynamic games are considered. The focus is on anonymous games where only the distribution of players and actions affects any given player. A section discusses the social planner formulation of equilibrium. No aggregate uncertainty is briefly described. Chapter 20 studies evolution and learning. The chapter begins with a discussion of fictitious play and replicator dynamics—two early models of dynamic adjustment or learning. Following this, some detailed discussion of stochastic stability is given, including the computation of invariant distributions and minimum cost
trees. An example illustrates how these computations connect directly to the relative sizes of the basins of attraction of absorbing states. A second computation illustrates how the minimum cost tree approach can be used to identify stochastically stable states. Blackwell approachability is used to define strategies that minimize regret across all actions. This is then connected to correlated equilibrium. Calibrated forecasts are defined and a connection to correlated equilibria is also noted. The chapter provides a brief discussion of Bayesian learning and the key role of the martingale convergence theorem. Finally, Blackwell approachability is discussed.
1 Decision Theory

1.1 Introduction

Decision theory is concerned with making optimal choices—selecting the most preferred choice from a set of alternatives. This task arises in a variety of different environments leading to different frameworks, issues, and analysis. In deterministic decision problems, a choice leads unambiguously to an outcome or consequence. When risk is present, the outcome or consequence of a decision is unknown, but is determined according to known probabilities (such as a fair coin toss). In contrast, with uncertainty, the probabilities over outcomes are unknown. These different contexts lead to distinct types of decision problem, which are outlined below. However, regardless of the environment, some fundamental results hold. Given a choice set and a preference ordering over the set of choices, one can show under mild conditions that there is an optimal or best choice in the choice set. Furthermore, preferences can, in general, be represented by a utility function—a useful result since the utility function representation is convenient for a variety of reasons, such as providing a convenient tool for marginal analysis. The basic decision problem and these results are described in Section 1.2, in terms of an abstract underlying choice set, so that many different environments fall within the framework. Section 1.3 focuses on decisionmaking under risk. When the preference ordering is on lotteries (probability distributions over a set of outcomes), the utility function is defined on the space or set of lotteries. But, without some additional behavioral assumptions, such a utility function has little structure (and so has limited usefulness). One such assumption is the independence axiom, introduced by von Neumann and Morgenstern and described in Section 1.3.1. When the preference ordering on lotteries satisfies the independence axiom, the utility function on lotteries
is linear, so that indifference curves on the space of lotteries are linear and parallel. This is a dramatic simplification that has proved remarkably useful in applications. Nevertheless, the independence axiom has been the subject of extensive debate and criticism. A typical example from this debate is given (in Section 1.3.1). Criticisms of the independence axiom have led to the development of alternative theories. These fall into two main categories: theories which preserve linearity of indifference curves but allow the indifference curves to be nonparallel; and theories with nonlinear indifference curves which preserve some additive structure on the utility function. The first category of models is the group satisfying the “betweenness” property; the second is in the category called rank dependent utility. These models are discussed in Section 1.3.2. In the context of risk, a decisionmaker's utility depends on both outcomes and the probabilities of those outcomes. When the outcomes are taken as fixed, then preferences are given by a function defined on lotteries and the utilities of different lotteries may be compared. This is the case with the von Neumann–Morgenstern model where different lotteries are considered; but the underlying outcomes are fixed. State preference theory reverses this: probabilities on states are fixed, but the payoffs in states are variable. So when the probabilities on outcomes are fixed, preferences depend on outcomes with different outcome profiles generating different utility levels. In Section 1.4 the state preference model is defined. This model can be used, for example, to study the purchase of insurance where the risk probabilities are exogenous but the choice of insurance levels can vary. In the theory of decisionmaking, the term risk relates to random outcomes with known probabilities; the term uncertainty refers to situations where the probabilities are unknown. Section 1.5 considers decision theory with uncertainty. The foundational work on decisionmaking under uncertainty appears in the theory of Savage. There, states, acts, and consequences are the primitives of the environment along with a preference ordering on acts, where an act is a function associating consequences to states. The objective of the theory of Savage is to provide a workable model of decisionmaking under uncertainty—as von Neumann–Morgenstern theory does in the case of risk. The key assumption utilized in this theory, “the sure thing principle,” is described in this section. With that axiom and others, acts may be ranked in a manner that is similar to von Neumann–Morgenstern theory. A common objection to the theory of Savage is presented in Section 1.5.1—the best known example is the Ellsberg paradox. When faced with uncertainty, people make cautious choices and this turns out to be inconsistent with the Savage model. One resolution to this is presented in Section 1.5.2.
1.2 Preferences and Optimal Choices

Decision theory is concerned with rationality in choice—making the best decision. Making choices assumes a set of possible choices, X, and a way of ranking choices—an ordering, ≽, on X. An ordering ≽ on X is a binary relation on X. Write x ≽ y to mean that x is (weakly) preferred to y, and write x ≻ y to denote strict preference, where x ≽ y, but not y ≽ x. An optimal choice is an x* ∈ X such that x* ≽ x, ∀ x ∈ X. Whether an optimal choice exists or not depends on both X and ≽. To see this, consider the problem of choosing the largest number less than 1 but not smaller than 0; there is no such number since for any number x < 1, x < (x + 1)/2 < 1. Or, let X = [0,1] with x ≻ 1, ∀ x ≠ 1; and x ≻ y if 1 > x > y. Again there is no optimal choice. In the first case, the preference ordering is continuous but X equals [0,1) and has no top ranked element (largest element). In the second case, the preference ordering is discontinuous and again there is no optimal choice. To ensure the existence of optimal choices some assumptions on the choice space, X, and the ordering ≽ are essential. The conventional assumptions on the ordering ≽ are:

1. Reflexive: ∀ x ∈ X, x ≽ x.
2. Complete: ∀ x, y ∈ X, x ≽ y or y ≽ x.
3. Transitive: ∀ x, y, z ∈ X, x ≽ y, y ≽ z ⇒ x ≽ z.
4. Continuous: ∀ y ∈ X, {x ∈ X | x ≽ y} and {x ∈ X | y ≽ x} are closed.
Reflexive means an item is ranked as highly as itself, complete means that any two items can be ranked or compared, and transitivity means that rankings have no cycles. For example, if preferences were not complete, the individual would be unable to compare and rank alternatives—in which case it would be impossible to determine how an individual might select an alternative. These requirements are natural, although some experimental evidence suggests that transitivity is not always satisfied in practice. The condition of continuity is a technical condition (“closed” depends on the topology on X). For example, if X ⊆ Rn, then {x ∈ X | x ≽ y} is closed if, whenever xn ∈ X and xn ≽ y for each n, xn → x̄ implies that x̄ ≽ y. Optimal decisions are defined next.

Definition 1.1 The choice x̂ ∈ B is optimal in B if x̂ ≽ x, ∀ x ∈ B.
Provided the preference ordering satisfies conditions (1)–(4), and the choice set is compact, optimal choices exist.

Theorem 1.1 Let B ⊆ X be compact and let ≽ be a preference ordering satisfying (1)–(4). Then, ∃ x̂ ∈ B such that x̂ ≽ x, ∀ x ∈ B.
Proof Let Bx = {y | y ∈ B, y ≽ x} = {y ∈ X | y ≽ x} ∩ B. Let ℱ = {Bx | x ∈ B}. Thus, ℱ is a family of closed subsets of B. Let Bx1, …, Bxn be a finite collection of sets in ℱ. Then, since ≽ is a preference ordering, ∃ x* ∈ {x1, …, xn} such that x* ≽ xi, i = 1, …, n. Therefore, x* ∈ Bxi, i = 1, …, n. So, x* ∈ ∩i Bxi and therefore ∩i Bxi ≠ ∅. So, every finite intersection of sets in ℱ is nonempty. From the finite intersection property, ∩x∈B Bx ≠ ∅. Therefore, ∃ x̂ ∈ ∩x∈B Bx, so that for each x ∈ B, x̂ ∈ Bx, i.e. x̂ ≽ x. Thus, x̂ ≽ x, ∀ x ∈ B, and x̂ is an optimal choice in B.1

Therefore, under relatively mild conditions, optimal choices exist. However, it is common and often easier to work with utility functions rather than preference orderings. The following discussion considers the scope for doing so.

Definition 1.2 A preference ordering ≽ on X is representable by the (utility) function u: X → R, if

x ≽ y ⇔ u(x) ≥ u(y).
Under what conditions can preferences, ≽, be represented by a utility function? The next result asserts that a preference ordering can be represented by a utility function, under mild conditions.

Theorem 1.2 (Eilenberg 1941; Debreu 1954) Let ≽ be a preference ordering on X satisfying (1)–(4), and suppose that X is a topological space2 such that
1. X has a countable base of open sets, or
2. X is connected and separable.
Then ∃ a continuous function u: X → R which represents ≽:

x ≽ y ⇔ u(x) ≥ u(y).
Thus, under mild conditions, optimal choices exist (Theorem 1.1) and preferences may be represented by a utility function (Theorem 1.2). Also, while the conditions in these theorems are sufficient, they are not necessary; optimal choices may exist, for example, even if the ordering, ≽, is not transitive or continuous. Finally, with regard to the choice set X, Theorems 1.1 and 1.2 impose few restrictions. The following examples illustrate.

• The consumer theory model: X ⊂ Rn and u: X → R.
1. Finite intersection property: If ℱ is a family of closed subsets of a compact set B such that every finite intersection of sets in ℱ is nonempty, then the intersection over all sets in ℱ is nonempty.
2. A topological space consists of a set X and a collection of sets T, called open sets, with the property that ∅, X ∈ T and such that T is closed under finite intersections and arbitrary unions. A collection of sets ℬ, ℬ ⊆ T, is called a base for T if and only if given T ∈ T, ∃ {Bα}, Bα ∈ ℬ, ∀ α, and T = ∪α Bα. If a topological space has a countable base, it is called second countable. A metric space is second countable if and only if it is separable. A topological space, X, is connected if there do not exist nonempty open sets U, V with U ∩ V = ∅ and U ∪ V = X.
• A sequential decision problem. (The decision tree figure is omitted: at location a the choices are T or B, followed by choices between U or D and then H or L at later locations, leading to outcomes z1, …, z4.) So, X = {(T,U,H), (T,U,L), (T,D,H), (T,D,L), (B,U,H), (B,U,L), (B,D,H), (B,D,L)}. Each choice leads to an outcome zi, z: X → Z = {z1, …, z4} and u: Z → R. When X is written this way there is some redundancy, since the choice T at location a eliminates the problem of making a decision at location c.
• Let Y be a compact subset of Rn and let X = Δ(Y) be the set of distributions on Y. Let u: X → R.
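For a finite choice set, the existence argument can be made concrete: with a complete, transitive ordering, a single pass through the set yields a maximal element. The following Python sketch is not from the text; the lexicographic preference is a hypothetical example, used only to have something concrete to rank.

```python
# Sketch: finding an optimal choice in a finite choice set B, given a
# complete, transitive preference encoded as weakly_prefers(x, y), which
# returns True when x is weakly preferred to y.

def optimal_choice(B, weakly_prefers):
    """Return an element of B weakly preferred to every element of B."""
    best = B[0]
    for x in B[1:]:
        if weakly_prefers(x, best):  # x beats the current candidate
            best = x
    # with completeness and transitivity, best is weakly preferred to all of B
    return best

# Hypothetical example: bundles (x1, x2) ranked lexicographically.
def lex_prefers(a, b):
    return a[0] > b[0] or (a[0] == b[0] and a[1] >= b[1])

B = [(1, 3), (2, 0), (2, 5), (0, 9)]
print(optimal_choice(B, lex_prefers))  # -> (2, 5)
```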
1.3 Decisionmaking under Risk

Decisionmaking under risk arises when the outcome of a choice is unknown, but the probabilities of the alternative possibilities are known. For example, choice α might be a lottery with a 50% chance of winning $75 and a 50% chance of winning $130, while choice β might guarantee $100 for sure. In this context, a choice selects a distribution on outcomes. Let Y be the set of possible outcomes and X = Δ(Y) the set of distributions on Y.3 From the earlier discussion, under mild conditions a preference ordering, ≽, on X = Δ(Y) may be represented by a utility function V: if p, q ∈ X, p ≽ q ⇔ V(p) ≥ V(q).4 However, the conditions guaranteeing the existence of V impose little structure on the form of the function V. The von Neumann–Morgenstern formulation addresses this matter.
3. When using distributions over sets, it is useful to know properties of the set of such distributions. Consider a finite set of points Y = {y1, y2, …, yn}. The set of probability distributions on Y is the (n − 1)-dimensional simplex: X = Δ(Y) = {(p1, …, pn) | pi ≥ 0, ∑ pi = 1}. Thus, Δ(Y) is convex and compact. If Y = [0,1], and again X = Δ(Y), the set of distributions on Y, this set is convex and compact (although that requires discussion of the topology). More generally, the same is true if Y is a compact metric space and X = Δ(Y).

4. For example, if Y is a complete separable metric space, then X = Δ(Y) is separable and, since it is connected, there is a utility function V, V: X → R, representing ≽.
1.3.1 von Neumann–Morgenstern preferences

Given a preference ordering, ≽, on lotteries over outcomes Δ(Y), under general conditions a utility can be assigned to each lottery p: V(p). However, without additional restrictions on ≽, the function V(p) is arbitrary. The axioms on ≽ in von Neumann–Morgenstern theory result in linear preferences. The key axiom is called the “independence” axiom and implies that the function V is linear in p, so that V(p) = ∑ uipi or, more generally, V(p) = ∫Y u(y)μ(dy). In what follows, take X = Δ(Y) where Y is finite or a subset of the real line.

Theorem 1.3 Let ≽ be a preference ordering on X satisfying:
1. ≽ is complete,
2. ≽ satisfies transitivity,
3. ≽ satisfies mixture continuity: ∀ λ, μ, ν ∈ X, the sets {θ ∈ [0,1] | θλ + (1 − θ)μ ≽ ν} and {θ ∈ [0,1] | ν ≽ θλ + (1 − θ)μ} are closed,
4. ≽ satisfies independence: ∀ λ, μ, ν ∈ X and ∀ θ ∈ (0,1], λ ≽ μ ⇔ θλ + (1 − θ)ν ≽ θμ + (1 − θ)ν.

Then there is a utility function V representing ≽, such that V has the form V(μ) = ∫Y u(y)μ(dy), where u: Y → R is uniquely determined up to an affine transformation: u may be replaced by any û where û = a + bu, and b > 0. The function u is called the von Neumann–Morgenstern utility function.5

Although u is uniquely determined up to an affine transformation, if V represents ≽, then so does any strict monotone increasing transformation: if f is any strictly increasing function (f(x) > f(y) if x > y), then V*(·) = f(V(·)) represents ≽ because V(p) ≥ V(q) if and only if V*(p) = f(V(p)) ≥ f(V(q)) = V*(q). In the case where Y is finite, V(p) = ∑ u(yi)p(yi), where p(yi) is the probability of drawing yi with the distribution p. Write u = (u(y1), …, u(yn)) so that V(p) = u · p. Thus, an indifference curve is a hyperplane in the (n − 1)-dimensional simplex; in two dimensions, indifference curves are parallel straight lines. With three possible outcomes, V(p) = ∑ u(yi)pi, where pi is the probability of yi. Suppose that utilities are ordered: u(y1) < u(y2) < u(y3), and write ui = u(yi). Thus, V(p) = p1u1 + p2u2 + p3u3 and since 1 = p1 + p2 + p3, V(p) = p1u1 + (1 − p1 − p3)u2 + p3u3 or V(p) = u2 − (u2 − u1)p1 + p3(u3 − u2). Consider an indifference curve I(V̄) = {p | V(p) = V̄}, so p ∈ I(V̄) implies V̄ = u2 − (u2 − u1)p1 + p3(u3 − u2), or

p3 = (V̄ − u2)/(u3 − u2) + [(u2 − u1)/(u3 − u2)]p1,

a straight line in (p1, p3) with slope (u2 − u1)/(u3 − u2), independent of V̄.
5. There are many proofs for this result—see, for example, Fishburn (1982). See Miyake (1990) for discussion of the case where Y is a separable metric space.
To see why the independence axiom implies parallel indifference curves, take λ and μ to be on the same indifference curve, λ ∼ μ, so that λ ≽ μ and μ ≽ λ. From independence (with μ and λ replacing ν), θλ + (1 − θ)μ ∼ μ for θ ∈ [0,1], so that all points on the line connecting λ and μ are on the same indifference curve. Next, take ν different from λ and μ, and define λ(θ) = θλ + (1 − θ)ν and μ(θ) = θμ + (1 − θ)ν. Because λ ≽ μ, λ(θ) ≽ μ(θ), and because μ ≽ λ, μ(θ) ≽ λ(θ), using the independence axiom. Therefore, λ(θ) and μ(θ) are on the same indifference curve, and so also must be all points on the line connecting them. But this line is parallel to the line through the points λ and μ—the indifference curves are parallel. This can be seen directly from the fact that μ(θ) − λ(θ) = θ[μ − λ], so the slope is independent of θ: the vector μ(θ) − λ(θ) is just (μ − λ) scaled by θ.
Objections to von Neumann–Morgenstern preferences

One of the main objections to this theory is called the Allais paradox, according to which a behavior is commonly observed that is inconsistent with von Neumann–Morgenstern preferences. The following example illustrates. Suppose there are three outcomes Y = {y1, y2, y3} = {0, 1, 10}, denoting 0, 1 million, and 10 million dollars. Consider three distributions over these outcomes: λ = (0, 0.9, 0.1), μ = (0.2, 0.6, 0.2) and ν = (1, 0, 0). It is common in experiments for people to prefer the distribution λ to μ because it guarantees at least 1 million dollars, whereas with μ there is a 20% chance of getting 0: thus λ ≻ μ. Taking θ = 0.1, define λ(θ) = θλ + (1 − θ)ν and μ(θ) = θμ + (1 − θ)ν, so λ(θ) = (0.90, 0.09, 0.01), μ(θ) = (0.92, 0.06, 0.02). In this case, many people prefer μ(θ) to λ(θ) since both offer roughly the same chance of 0, but μ(θ) has twice as good a chance of drawing 10 million dollars. Such behavior violates the independence axiom. Since the indifference curve through λ lies above μ, it must be steeper than ½, and since the indifference curve
through λ(θ) lies below μ(θ), it must be less steep than ½: no single family of parallel linear indifference curves is consistent with both choices. (The figure illustrating the two indifference curves in the simplex is omitted.)
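The inconsistency can also be checked mechanically. Since μ(θ) − λ(θ) = θ(μ − λ), the two expected-utility differences are proportional, so no utility vector ranks λ above μ and μ(θ) above λ(θ) simultaneously. A short numerical sketch (the grid search over u2, with the normalization u1 = 0 and u3 = 1, is purely illustrative):

```python
# Sketch: no expected-utility ranking is consistent with the Allais choices.
import numpy as np

lam = np.array([0.0, 0.9, 0.1])
mu = np.array([0.2, 0.6, 0.2])
nu = np.array([1.0, 0.0, 0.0])
theta = 0.1
lam_t = theta * lam + (1 - theta) * nu   # (0.90, 0.09, 0.01)
mu_t = theta * mu + (1 - theta) * nu     # (0.92, 0.06, 0.02)

for u2 in np.linspace(0.001, 0.999, 999):
    u = np.array([0.0, u2, 1.0])         # normalized utilities u1 < u2 < u3
    if u @ lam > u @ mu and u @ mu_t > u @ lam_t:
        print("rationalizing u2 found:", u2)
        break
else:
    # always reached, since u @ (mu_t - lam_t) = -theta * u @ (lam - mu)
    print("no u2 rationalizes both choices")
```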
1.3.2 Other preference specifications

The Allais paradox suggests relaxing the assumption of linear preferences. The discussion here briefly considers two forms of relaxation. In one case indifference curves are linear but not parallel; in the other case nonlinear indifference curves are possible.
Weighted utility

One alternative specification of preferences is weighted utility:

V(p) = [∑i w(xi)u(xi)pi] / [∑i w(xi)pi].

If p and q are on the same indifference curve, V(p) = V(q) = V̄, so that

∑i w(xi)u(xi)pi = V̄ ∑i w(xi)pi and ∑i w(xi)u(xi)qi = V̄ ∑i w(xi)qi,

then so is p(θ) = θp + (1 − θ)q:

V(p(θ)) = [θ ∑i w(xi)u(xi)pi + (1 − θ) ∑i w(xi)u(xi)qi] / [θ ∑i w(xi)pi + (1 − θ) ∑i w(xi)qi],

or

V(p(θ)) = V̄ [θ ∑i w(xi)pi + (1 − θ) ∑i w(xi)qi] / [θ ∑i w(xi)pi + (1 − θ) ∑i w(xi)qi].

So,

V(p(θ)) = V̄.

Therefore the weighted utility formulation preserves linear indifference curves, but the indifference curves need not be parallel. From the axiomatic perspective, independence was a key condition leading to linear and parallel indifference curves (and von Neumann–Morgenstern preferences). Like von Neumann–Morgenstern expected utility, weighted utility has an axiomatic basis. The key axiom is weak independence (or weak substitution). A preference ordering satisfies weak independence if:

λ ∼ μ implies that for each θ ∈ (0,1) there is a β ∈ (0,1) such that θμ + (1 − θ)ν ∼ βλ + (1 − β)ν, ∀ ν.
Given μ, λ, and taking ν = μ, weak independence implies that there is some point γ = βλ + (1 − β)μ on the line connecting μ and λ which is indifferent to either μ or λ. Similarly, between γ and μ there is a point δ which is indifferent to μ. And so on. Weak independence implies linear indifference curves. One notable feature of weighted utility is that all the indifference curves intersect at the same point (outside the simplex, to satisfy transitivity). To see this consider the three-state case where an indifference curve (at utility level V) is given by (with wi = w(xi) and ui = u(xi) for notational convenience):

V = [w1u1p1 + w2u2p2 + w3u3p3] / [w1p1 + w2p2 + w3p3],

and with p2 = 1 − p1 − p3,

V [p1(w1 − w2) + p3(w3 − w2) + w2] = p1(w1u1 − w2u2) + p3(w3u3 − w2u2) + w2u2.

A second indifference curve (with utility level W) is given by:

W [p1(w1 − w2) + p3(w3 − w2) + w2] = p1(w1u1 − w2u2) + p3(w3u3 − w2u2) + w2u2.

Write d12 = (w1 − w2), d32 = (w3 − w2), r12 = (w1u1 − w2u2), r32 = (w3u3 − w2u2). So the “W” and “V” indifference curves are defined by:

V [p1d12 + p3d32 + w2] = p1r12 + p3r32 + w2u2,
W [p1d12 + p3d32 + w2] = p1r12 + p3r32 + w2u2.
Denote the solution to this pair of equations by (p*1, p*3). With W ≠ V, subtracting the second equation from the first gives (V − W)[p*1d12 + p*3d32 + w2] = 0, so [p*1d12 + p*3d32 + w2] = 0, which in turn implies p*1r12 + p*3r32 + w2u2 = 0. So the solution to the pair of equations is the same, regardless of the values of W and V. From the equations p1d12 + p3d32 + w2 = 0 and p1r12 + p3r32 + w2u2 = 0, the solution for the three-state case is:

p*1 = w2(u2d32 − r32)/(d12r32 − d32r12), p*3 = w2(r12 − u2d12)/(d12r32 − d32r12).
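The common intersection point is easy to verify numerically: for any two utility levels the linear equations return the same (p1, p3), matching the closed form above. In the sketch below the weights w and utilities u are hypothetical illustration values (for them the intersection lies outside the simplex, consistent with the transitivity remark above).

```python
# Sketch: all weighted-utility indifference curves pass through one point.
import numpy as np

w = np.array([1.0, 2.0, 0.5])   # hypothetical weights w(x_i)
u = np.array([1.0, 2.0, 4.0])   # hypothetical utilities u(x_i)
d12, d32 = w[0] - w[1], w[2] - w[1]
r12, r32 = w[0]*u[0] - w[1]*u[1], w[2]*u[2] - w[1]*u[1]

def intersection(V, W):
    # V[p1 d12 + p3 d32 + w2] = p1 r12 + p3 r32 + w2 u2, and likewise for W
    A = np.array([[V*d12 - r12, V*d32 - r32],
                  [W*d12 - r12, W*d32 - r32]])
    b = np.array([w[1]*u[1] - V*w[1], w[1]*u[1] - W*w[1]])
    return np.linalg.solve(A, b)

print(intersection(1.5, 3.0))    # [0.8 0.8]
print(intersection(0.7, 5.2))    # same point, independent of V and W
D = d12*r32 - d32*r12            # closed form from the text
print(w[1]*(u[1]*d32 - r32)/D, w[1]*(r12 - u[1]*d12)/D)
```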
Weighted utility is based on weak independence. Yet a weaker formulation of independence is very weak substitution, which is defined:

λ ∼ μ implies that for each θ ∈ (0,1) and each ν there is a β ∈ (0,1), possibly depending on ν, such that θμ + (1 − θ)ν ∼ βλ + (1 − β)ν.

Comparing these: with weak substitution, the value of β is fixed for all ν (and depends on λ, μ, θ); whereas with very weak substitution, the value of β can vary with ν. Very weak substitution leads to the following representation of preferences:

V(p) = [∑i w(xi, V(p))u(xi)pi] / [∑i w(xi, V(p))pi],

so that the weighting function can now depend on the level of utility. Using the same reasoning as earlier, it follows that this representation implies linear indifference curves. Observe, for example, that in the definition of very weak substitution, if one takes ν = μ, then the condition implies that μ ∼ βλ + (1 − β)μ. So, given any pair of points μ and λ on the same indifference curve, there is a third point on the line between them which is also indifferent to either. However, there is no requirement that indifference curves be parallel.
Betweenness preferences

Another axiom which leads to a similar representation is betweenness. This requires that a distribution between two others has a preference ranking between the rankings of those other distributions. The ordering ≽ satisfies betweenness if:

λ ≽ μ implies λ ≽ θλ + (1 − θ)μ ≽ μ, ∀ θ ∈ [0,1].

Directly, this implies linear indifference curves and is the basis for the representation of utility as:

V(p) = ∑x u(x, V(p))p(x).
Implicit linear utility

Let p ∈ Δ(X), where X is a subset of the real numbers. The utility associated with V(p) may be identified with a certainty equivalent. Let δm be the distribution placing probability 1 on m. Define m(p) implicitly to be the value of m satisfying V(δm(p)) ≡ V(p). The certainty equivalent m(p) is ordinally equivalent to V(p). Begin with a function τ defined on R × R, τ(x,y), with τ increasing in x and τ(x,x) = 0, ∀ x. Consider the problem of finding a function m(p) to satisfy:

∑i p(xi)τ(xi, m(p)) = 0.

This implicitly assigns a utility (the certainty equivalent) to each distribution, and the specification is called implicit linear utility. Implicit linear utility generalizes the previous models as follows. In the case of expected utility, let τ(x,y) = [u(x) − u(y)], so that

∑i p(xi)[u(xi) − u(m(p))] = 0, ∀ p ∈ Δ,

implying that

u(m(p)) = ∑i u(xi)p(xi).
If τ is modified so that τ(xi, y) = [w(xi)u(xi) − w(xi)u(y)], then

∑i p(xi)w(xi)[u(xi) − u(m(p))] = 0, ∀ p ∈ Δ,

gives

u(m(p)) = [∑i w(xi)u(xi)p(xi)] / [∑i w(xi)p(xi)],

which is a weighted utility. Similarly, with the betweenness formulation, V(p) = ∑ u(x, V(p))p(x), let τ(xi, y) = [u(xi, V(δy)) − V(δy)], so that ∑i p(xi)τ(xi, m(p)) = 0 gives:

∑i p(xi)u(xi, V(δm(p))) = V(δm(p)) = V(p).

One key feature of the implicit linear model is linearity of indifference curves: if p and q satisfy the equation with the same solution m̄ = m(p) = m(q):

∑i p(xi)τ(xi, m̄) = 0 = ∑i q(xi)τ(xi, m̄),

then

∑i [θp(xi) + (1 − θ)q(xi)]τ(xi, m̄) = 0,

so that

m(θp + (1 − θ)q) = m̄.
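Computationally, m(p) is a one-dimensional root of the defining equation: since τ is increasing in its first argument and τ(x, x) = 0, the sum ∑i p(xi)τ(xi, m) changes sign between the smallest and largest outcomes. A minimal sketch, assuming SciPy is available and using the expected-utility case τ(x, y) = u(x) − u(y) with the hypothetical choice u(x) = √x:

```python
# Sketch: solving sum_i p_i * tau(x_i, m) = 0 for the certainty equivalent m(p).
import math
from scipy.optimize import brentq

xs = [0.0, 50.0, 100.0]   # outcomes (hypothetical)
ps = [0.2, 0.5, 0.3]      # probabilities (hypothetical)

u = lambda x: math.sqrt(x)
tau = lambda x, y: u(x) - u(y)   # expected-utility case

def certainty_equivalent(xs, ps):
    g = lambda m: sum(p * tau(x, m) for x, p in zip(xs, ps))
    # g(min xs) >= 0 >= g(max xs), so a root lies in [min xs, max xs]
    return brentq(g, min(xs), max(xs))

m = certainty_equivalent(xs, ps)
print(m)                                            # about 42.71
print(u(m), sum(p * u(x) for x, p in zip(xs, ps)))  # u(m) equals E[u]
```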
Models satisfying betweenness preserve linearity of indifference curves but indifference curves need not be parallel. Rank dependent utility takes a very different approach leading to nonlinear indifference curves.
Rank dependent utility

Returning to a general formulation of preferences on risk, make the dependence of utility on x explicit, and write V(p) = H(x,p) where x = (x1, …, xn) and p = (p1, …, pn). This is the utility function obtained from a general preference ordering on lotteries. A natural simplification of H is to impose additivity in xi, so that H has the form H(x,p) = ∑i hi(xi, p). Going one step further, suppose that hi(xi, p) = u(xi)ϕi(p) with ϕi ≥ 0 and ∑i ϕi(p) = 1. In this case a lottery independent utility, u(xi), is attached to each outcome xi. Going further along these lines requires additional structure on the ϕi functions. One specification derives the ϕi from an increasing function g, taking the xi's to be ordered by rank, u(xi) ≥ u(xi−1). In this case the utility of the lottery is:

V(p) = ∑i u(xi)[g(∑j≤i pj) − g(∑j≤i−1 pj)],

with the convention that ∑j≤0 pj = 0, and with g increasing and g([0,1]) = [0,1], such that g(0) = 0 and g(1) = 1. This formulation is called rank dependent utility, and reduces to expected utility in the special case where g(z) ≡ z. Unlike implicit linear utility, with rank dependent utility the indifference curves need not be linear. This specification also has an axiomatic basis.
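A direct computation of the rank dependent value only requires sorting the outcomes by utility and accumulating the transformed distribution. The sketch below uses the hypothetical choices u(x) = x and g(z) = z²; replacing g by the identity reproduces expected utility, as noted above.

```python
# Sketch: V(p) = sum_i u(x_i) [g(F_i) - g(F_{i-1})], F_i = p_1 + ... + p_i,
# with outcomes indexed so that u(x_1) <= ... <= u(x_n).

def rdu(xs, ps, u, g):
    pairs = sorted(zip(xs, ps), key=lambda xp: u(xp[0]))  # order by rank
    V, F_prev = 0.0, 0.0
    for x, p in pairs:
        F = F_prev + p                    # cumulative probability F_i
        V += u(x) * (g(F) - g(F_prev))    # decision weight g(F_i) - g(F_{i-1})
        F_prev = F
    return V

u = lambda x: float(x)     # hypothetical utility
g = lambda z: z ** 2       # hypothetical probability transform

xs, ps = [0, 50, 100], [0.2, 0.5, 0.3]
print(rdu(xs, ps, u, g))             # 73.5
print(rdu(xs, ps, u, lambda z: z))   # 55.0, the expected value
```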
To understand the motivation for the formulation, with the outcomes xi ordered (x1 ≺ x2 ≺ …), define the rank of xj as ∑i≤j pi. Thus, the rank of xn is 1. Suppose now that in V(p) = ∑i ϕi(p)u(xi) the weight ϕi can depend only on pi and the rank of xi. Consider first the case where n = 2. Because x2 is top ranked (with rank equal to 1), ϕ2(p) = ϕ*(p2), with ϕ*(0) = 0 and ϕ*(1) = 1. Since the weights add to 1, ϕ1(p) = 1 − ϕ*(p2). Thus, V(p1, p2) = u(x1)(1 − ϕ*(p2)) + u(x2)ϕ*(p2). Let g(x) = 1 − ϕ*(1 − x), so that

V(p1, p2) = u(x1)g(p1) + u(x2)[g(p1 + p2) − g(p1)]

(g(1) = g(p1 + p2) = 1, g(0) = 0). Next, consider the case n > 2. Since the rank of xn is 1, ϕn(p) = βn(pn). Now, take two distributions on {x1, …, xn}. One is {p1, …, pn}; the other attaches probability pi to xi for i ≤ k < n and the probability ∑j>k pj to some arbitrary xr, r > k. The weights always add up to one and, since r is top ranked under the second distribution, for some function βr,

∑i≤k ϕi(p) + βr(∑j>k pj) = 1,

or

∑i≤k ϕi(p) = 1 − βr(1 − ∑j≤k pj).

But k and r are arbitrary, so βr must be independent of r, βr(·) = β(·), and ∑i≤k ϕi(p) = 1 − β(1 − ∑j≤k pj). Thus,

ϕk(p) = β(1 − ∑j≤k−1 pj) − β(1 − ∑j≤k pj).

Let β(1 − x) = 1 − g(x), so that

ϕk(p) = g(∑j≤k pj) − g(∑j≤k−1 pj),

and then V(p) = ∑k u(xk)[g(∑j≤k pj) − g(∑j≤k−1 pj)], and V(p) is as given above.
1.4 The State Preference Model

Given p = (p1, …, pn) and x = (x1, …, xn) the expression ∑ piu(xi) associates a utility to the pair (p,x). One can view this function as a function of either p = (p1, …, pn) or x = (x1, …, xn), or both (x,p). In von Neumann–Morgenstern theory, x is taken as fixed and behavior studied as p varies. Expected utility is defined on the simplex: V: Δ(Y) → R. If this view is reversed and p held fixed, then utility
is a function of x, U: X → R. So, U(x) = ∑i piu(xi) is a function of outcomes in different states: xi in state i. Since U(x) is additively separable in the xi's, concavity of u implies concavity of U(x), producing conventional indifference curves in x-space. In monetary terms, x1 is the reward in state 1 and x2 the reward in state 2. The 45° line in (x1, x2)-space corresponds to state independent rewards. (The figure showing the indifference map and the 45° line is omitted.)
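As a quick numerical illustration: the slope of an indifference curve of U(x1, x2) = p1u(x1) + p2u(x2) is −p1u′(x1)/[p2u′(x2)], which reduces to −p1/p2 on the 45° line. A sketch with the hypothetical choice u(x) = ln x:

```python
# Sketch: marginal rate of substitution in the state preference model.
p1, p2 = 0.3, 0.7               # fixed state probabilities
u_prime = lambda x: 1.0 / x     # u(x) = ln(x), so u'(x) = 1/x (hypothetical)

def mrs(x1, x2):
    # |slope| of the indifference curve of p1*u(x1) + p2*u(x2)
    return (p1 * u_prime(x1)) / (p2 * u_prime(x2))

print(mrs(5.0, 5.0))   # on the 45-degree line: p1/p2, about 0.4286
print(mrs(2.0, 8.0))   # off the line, the curvature of u matters
```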
1.5 Decisionmaking under Uncertainty

When different outcomes are possible and the true outcome unknown or not yet determined, the term risk is associated with circumstances where the probabilities of alternatives are known, and the term uncertainty assigned to the case where probabilities over alternatives are unknown. In the following discussion it is assumed that the decisionmaker can identify the possible states of the world and how actions relate to outcomes or consequences, but no objective probabilities are assigned to states. Let S be a set of states, C be a set of consequences and let ℱ be a collection of events (loosely, subsets of S). For example, S = {s1, s2} = {rain, no rain} and C = {c1, c2} = {wet, dry}, in which case there are two states—it rains or it does not; and two consequences, you are wet or dry. An act is a function from states to consequences: f: S → C. Let A be the set of acts. For example, A = {f, g} = {carry umbrella, don't carry umbrella}. The decision problem is to choose an act, not knowing the state. If there were a utility function defined on consequences, u: C → R, and a distribution on states {p(s)}s∈S, then a utility could be attached to each act: U(f) = ∑s∈S u(f(s))p(s). In that case, a relation on acts is implicitly defined: for f, g ∈ A, f ≽ g if and only if U(f) ≥ U(g). Also, the relation on A defined this way is unchanged if it is defined with the utility function u + α, α ∈ R. Conversely, starting with an ordering on acts, ≽, what restrictions on ≽ lead to preferences being representable in this way? What assumptions on ≽ imply the
existence of u: C → R and p ∈ Δ(S), such that for all f, g ∈ A, f ≽ g if and only if ∑s u(f(s))p(s) ≥ ∑s u(g(s))p(s)? The Savage axioms6 developed in The Foundations of Statistics provide an answer to this question: if ≽ satisfies those axioms, then there exists a probability measure p ∈ Δ(S) and u: C → R such that:

f ≽ g ⇔ ∑s∈S u(f(s))p(s) ≥ ∑s∈S u(g(s))p(s).

Since each act f generates a distribution on C: μf(B) = p({s ∈ S | f(s) ∈ B}), this expectation may also be written ∫C u(c)μf(dc). In deriving this result, one of the key axioms developed by Savage is the sure thing principle, which plays a role analogous to the independence axiom of von Neumann–Morgenstern.

Sure thing principle: When comparing two acts, only states on which they differ matter. Let B ⊆ S, and let f = g on Bc, so that f and g agree on Bc. Then, f ≽ g if and only if f′ ≽ g′, ∀ f′, g′ ∈ A such that f′(s) = f(s), s ∈ B, g′(s) = g(s), s ∈ B, g′(s) = f′(s) on Bc. So, if f = g on Bc, in ranking f and g, it does not matter what they equal on Bc as long as they agree on Bc. The following example illustrates.
(Table omitted: on Bc, f and g both take the values (a, e), while f′ and g′ both take the values (b, h); on B, f′ = f and g′ = g.)

In changing from f to f′ and g to g′, the values remain unchanged on B. Initially, they agreed on Bc and continue to agree after the change to f′ and g′ (from (a, e) to (b, h) on Bc). If the sure thing principle is satisfied, then f ≽ g implies f′ ≽ g′.7

Given a set of states, S, and a set of consequences, C, the theory of Savage provides a set of axioms, including the “sure thing principle”, which lead to a representation of preferences in terms of expected utility. An act is a function f, f: S → C, so the set of acts is CS. A preference ordering, ≽, on acts is given. Let f, g, f′, g′ be four acts. The sure thing principle says that if, on a set of states Q ⊆ S, f = g and f′ = g′, then, if f = f′ and g = g′ on Qc, f ≽ g if and only if f′ ≽ g′. (Put differently, let f and g be any two acts that agree on some states Q. Then two new acts f′ and g′ which agree on Q and equal f and g, respectively, on Qc must be ranked the same way as f, g: f ≽ g ⇔ f′ ≽ g′.)
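Subjective expected-utility rankings satisfy the sure thing principle by construction: states where the acts agree contribute identical terms to both sums. A small check in Python (the states, prior, utilities, and acts are all hypothetical):

```python
# Sketch: expected-utility rankings obey the sure thing principle.
p = {"s1": 0.5, "s2": 0.3, "s3": 0.2}         # subjective prior (hypothetical)
u = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 5.0}  # consequence utilities

def U(act):
    return sum(p[s] * u[act[s]] for s in p)

# f and g agree on s3 (consequence "a"); f2 and g2 agree on s3 ("d")
# and match f and g respectively on s1, s2.
f  = {"s1": "c", "s2": "b", "s3": "a"}
g  = {"s1": "b", "s2": "c", "s3": "a"}
f2 = {"s1": "c", "s2": "b", "s3": "d"}
g2 = {"s1": "b", "s2": "c", "s3": "d"}

print(U(f) >= U(g), U(f2) >= U(g2))  # same ranking: the s3 term cancels
```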
6. The development of the axioms is lengthy and is not discussed here.
7. It may be worth observing that the sure thing principle in a sense imposes separability on preferences: consequences in states B do not affect the value of consequences in states Bc. This suggests preferences of the form U(f) = ∑s w(f(s), s), or ∑s w*(f(s), s)p(s), where w*(f(s), s) and p(s) are chosen to satisfy w*(f(s), s)p(s) = w(f(s), s).
In the formulation of Savage, acts map from states to outcomes. An alternative is to formulate an act as a mapping from states to lotteries over outcomes. That formulation simplifies the analysis when there are (some) objectively given probabilities. This is discussed in Anscombe and Aumann (1963).
1.5.1 Objections to the theory

One of the common objections to the theory of Savage is the Ellsberg paradox—illustrated in the following examples. Both examples highlight dislike for not knowing the odds faced—that of itself confers disutility.

Example 1. Let X and Y be two boxes, each containing 100 balls. The balls may be white or black. In box X, there are 49 white and 51 black balls. In box Y, there are α white and β black balls. The value of the integer α is unknown, but it is known that α + β = 100:

        White balls   Black balls
Box X   49            51
Box Y   α             β
Consider two experiments. In the first experiment (I) an individual must choose a box and then a ball is selected from that box; if the ball is black the person wins $1000 and $0 if the ball is white. In the second experiment (II) the winning color is reversed; again the person selects a box and wins $1000 if the ball drawn is white and zero otherwise. Summarizing:

I → choose Z ∈ {X, Y}; if the ball drawn is black ⇒ $1000.
II → choose Z ∈ {X, Y}; if the ball drawn is white ⇒ $1000.
According to the theory there is a utility function u and probability distribution over the unknown states. In this case, identify a state as the number of (say) white balls in the box: let si denote the state where there are i white balls in the box and let p(si) be the probability that there are exactly i white balls. It is said to be common for most individuals to select box X in both experiments. However, this is inconsistent with the theory. To see this, note that in the case of the Z = X choice, the payoff is calculated in experiment I as: 0.49 · u(0) + 0.51 · u(1000). Let μ be the probability that white is drawn from box Y (μ = ∑i (i/100)p(si)), so the expected payoff to Z = Y is u(0)μ + u(1000)(1 − μ). Since the choice X was preferred to the choice Y it must be that 0.51 > 1 − μ, or μ > 0.49. In the second experiment, X was also chosen, giving an expected payoff of 0.51 · u(0) + 0.49 · u(1000), whereas the choice Y would have yielded u(0)(1 − μ) + u(1000)μ. Thus, μ < 0.49. So, the two experiments yield behavior inconsistent with the theory.
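The two inequalities are jointly unsatisfiable, which a one-line search over candidate values of μ confirms (the grid over hundredths is purely illustrative):

```python
# Sketch: choosing box X in both experiments requires mu > 0.49 (experiment I)
# and mu < 0.49 (experiment II); no subjective probability satisfies both.
consistent = [m / 100 for m in range(101)
              if 0.51 > 1 - m / 100 and 0.49 > m / 100]
print(consistent)  # [] -- empty: no mu rationalizes both choices
```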
Example 2. An urn contains 60 balls: 20 red balls and 40 balls that are green or blue. However, the specific number of blue and green balls is unknown. In one experiment, the individual announces a color, either red or green. A ball is drawn from the urn and if it matches the announced color, the individual receives $100, and nothing otherwise. Announcing red gives the individual a 20/60 chance of winning $100, whereas announcing green gives an x/60 chance of winning $100, where x/60 is the subjective probability of green and y/60 is the subjective probability of blue determined by the Savage theory (x + y = 40). In this experiment, most people choose red, so it must be that 20/60 > x/60. In a second experiment the individual must select a color pair: either red and blue (r − b) or green and blue (g − b). A ball is then drawn. If the person chose r − b and either a red or blue ball was drawn, they receive $100 and $0 otherwise. If the person chose g − b and the ball drawn was either green or blue, the person receives $100 and 0 otherwise. In this experiment, most people choose g − b, which has a (known) probability of 40/60, whereas the r − b choice has probability (20 + y)/60. Thus 40/60 > (20 + y)/60, or 20/60 > y/60. So the first experiment implies x < 20 and the second that y < 20. These are inconsistent with x + y = 40. Note that in both cases, the individual selects the choice which has a known probability. This is taken as indicative of a dislike for the "ambiguity" associated with the unknown probabilities of green and blue.
1.5.2 Other preference specifications
In the previous discussion, uniqueness of the distribution {p(s)}s∈S on states, in conjunction with the linearity of preferences in the distribution over states, led to the paradox. Since the distribution over states is not objectively given, this raises the prospect of permitting doubt in the individual's assessment of the probabilities on states. Rather than fix p ∈ Δ(S), let Δ* ⊂ Δ(S) be a set of possible distributions. Recall that an act maps from states to consequences; here consider a mild generalization to allow mappings to distributions over consequences. In the previous example, in state si there are i white balls and 100 − i black balls, so in state si there is probability ξi = i/100 of drawing a white ball from box Y. Consider situation I, and let f be the act of choosing box Y and g the act of choosing box X. Thus f(si) = (ξi, 1 − ξi) = (i/100, (100 − i)/100), the probabilities attached to the consequences 0 and 1000 respectively (f: S → Δ(C) = Δ({0, 1000})). With act f the associated payoff is

    ∑si∈S p(si)[ξi·u(0) + (1 − ξi)·u(1000)].
And this is u(0)μ + u(1000)(1 − μ), where μ = ∑i p(si)ξi. With act g, the mapping is degenerate: g(si) = (49/100, 51/100), ∀si ∈ S, giving expected payoff

    0.49·u(0) + 0.51·u(1000).

Normalize utility, u(0) = 0 and u(1000) = 1, so the payoff to act f is (1 − μ) and the payoff to act g is 0.51. Because act g was chosen, 0.51 > 1 − μ, implying μ > 0.49.
In situation II, let h be the act of choosing Y (a different mapping from states to distributions on consequences than f) and let r be the act of choosing X. The expected payoff from r is 0.49 and the expected payoff from act h is μ. Since act r was chosen, 0.49 > μ, which is inconsistent with the previous calculation. In the present context, given a set of distributions Δ* ⊂ Δ(S), define the payoff to an act f as

    V(f) = min p∈Δ* ∑s∈S u(f(s))p(s).
When f: S → Δ(C), u(f(s)) = ∑c∈C u(c)fc(s), where fc(s) is the probability of consequence c at state s. This form of preference exhibits pessimism: for any act, the most unfavorable distribution in Δ* is used to evaluate the expected payoff. To illustrate the impact of this in the current context, let

    Δ* = {p ∈ Δ(S) | p(s40) + p(s60) = 1}.

So, Δ* is the set of distributions with support on the states s40 and s60. Then, in situation I, with p ∈ Δ*, act f yields a payoff of

    p(s40)[0.40·u(0) + 0.60·u(1000)] + p(s60)[0.60·u(0) + 0.40·u(1000)].

With the normalization u(0) = 0, u(1000) = 1, this is:

    0.60·p(s40) + 0.40·p(s60) = 0.40 + 0.20·p(s40).

This is minimized by setting p(s40) = 0, to yield a value of 0.40, less than 0.51, so g is chosen. In the second experiment the payoff from h is minimized at p(s40) = 1, giving 0.40, less than the expected payoff of 0.49 from r, so r is chosen. This specification has an axiomatic basis. Let X be the set of outcomes and Y the set of lotteries on X with finite support. Let L be a convex subset of {f: S → Y | f is measurable} containing the constant functions, denoted Lc, and the finite step functions, Lo. Let ≽ be a preference ordering on L. A preference ordering ≽ induces an ordering ≥ on Y according to y ≥ y′ if and only if fy ≽ fy′, where fy(s) = y, ∀s, and fy′(s) = y′, ∀s. Suppose that ≽ is complete and transitive; satisfies certainty independence (f, g ∈ L, h ∈ Lc, α ∈ (0,1), then f ≻ g ⇔ αf + (1 − α)h ≻ αg + (1 − α)h);
continuity (f ≻ g, g ≻ h, then ∃α, β ∈ (0,1), αf + (1 − α)h ≻ g and g ≻ βf + (1 − β)h); monotonicity (f, g ∈ L, f(s) ≥ g(s) on S, then f ≽ g); and uncertainty aversion (∀f, g ∈ L, α ∈ (0,1), f ∼ g, then αf + (1 − α)g ≽ g). The main result is that under these conditions there is an affine function u: Y → R and a closed, convex, nonempty set of finitely additive probability measures Δ* such that

    f ≽ g ⇔ min p∈Δ* ∫u(f(s))dp ≥ min p∈Δ* ∫u(g(s))dp.
The function u is unique up to a positive affine transformation, and the set Δ* is unique provided not all f, g in L are equivalent. (See Gilboa and Schmeidler (1989).)
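A minimal numerical sketch of the maxmin evaluation in the two-box example above (Python; the two-point set Δ* and the normalization u(0) = 0, u(1000) = 1 follow the text, while the grid over p(s40) is an assumption):

    import numpy as np

    p40 = np.linspace(0, 1, 101)          # p(s40); p(s60) = 1 - p(s40)

    # Experiment I (win on black): box Y pays 0.60 in s40 and 0.40 in s60.
    V_f = np.min(0.60 * p40 + 0.40 * (1 - p40))   # maxmin value of Y: 0.40
    V_g = 0.51                                    # box X, known odds
    # Experiment II (win on white): box Y pays 0.40 in s40 and 0.60 in s60.
    V_h = np.min(0.40 * p40 + 0.60 * (1 - p40))   # maxmin value of Y: 0.40
    V_r = 0.49

    print(V_f, V_g, V_h, V_r)   # 0.4 0.51 0.4 0.49: box X is chosen both times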
Bibliography
Anscombe, F. J. and Aumann, R. J. (1963). "A Definition of Subjective Probability," Annals of Mathematical Statistics, 34, 199–205.
Chew, S. H. (1979). "Alpha-nu Choice Theory; a Generalization of Expected Utility," Working Paper 669, University of British Columbia, Faculty of Commerce and Business Administration.
Chew, S. H. (1989). "Axiomatic Utility Theories with the Betweenness Property," Annals of Operations Research, 19, 273–298.
Chew, S. H. and Epstein, L. G. (1989). "A Unifying Approach to Axiomatic Non-Expected Utility Theories," Journal of Economic Theory, 49, 207–240.
Debreu, G. (1954). "Representation of a Preference Ordering by a Numerical Function," in R. M. Thrall, C. H. Coombs, and R. C. Davis (eds.), Decision Processes. New York: John Wiley.
Dekel, E. (1986). "An Axiomatic Characterization of Preferences under Uncertainty: Weakening the Independence Axiom," Journal of Economic Theory, 40(2), 304–318.
Eilenberg, S. (1941). "Ordered Topological Spaces," American Journal of Mathematics, 63, 39–45.
Fishburn, P. C. (1982). The Foundations of Expected Utility. Dordrecht: D. Reidel.
Fishburn, P. C. (1983). "Transitive Measurable Utility," Journal of Economic Theory, 31, 293–317.
Gilboa, I. and Schmeidler, D. (1989). "Maxmin Expected Utility with Non-unique Prior," Journal of Mathematical Economics, 18, 141–153.
Gilboa, I. and Schmeidler, D. (1995). "Case-based Decision Theory," Quarterly Journal of Economics, 110, 605–639.
Gul, F. (1991). "A Theory of Disappointment Aversion," Econometrica, 59(3), 667–686.
Mas-Colell, A., Whinston, M. D., and Green, J. (1995). Microeconomic Theory. Oxford: Oxford University Press.
Miyake, M. (1990). "Continuous Representation of von Neumann–Morgenstern Preferences," Journal of Mathematical Economics, 19, 323–340.
Rubinstein, A. (1988). "Similarity and Decision Making Under Risk," Journal of Economic Theory, 46, 145–153.
Savage, L. J. (1972). The Foundations of Statistics. New York: Dover Publications.
2 Preferences, Risk, and Stochastic Dominance
2.1 Introduction
How do individuals compare or rank risky alternatives? With von Neumann–Morgenstern preferences, the answer is straightforward. If X and Y are random variables with densities f and g, then X is preferred to Y if ∫u(x)f(x)dx > ∫u(y)g(y)dy. This provides a complete transitive ordering on random returns. Given such preferences for an individual, one can study how an individual allocates risk, makes portfolio choices, or reacts to changes in risk. Alternatively, one can focus on finding rankings of random returns across individuals. For example, under what circumstances might X be considered more risky than Y "in general"? This question suggests attempting to find ways of ranking random returns consistently across a family of preferences (not necessarily von Neumann–Morgenstern), such that all preferences in the family would give the same ranking for random returns. For example, if X and Y have the same mean but Y has a larger variance, one might argue that Y is more risky than X and that X should be preferred to Y, regardless of preferences. There is little prospect of defining a complete ordering on risky returns common across individuals: even with von Neumann–Morgenstern preferences, different utility functions may rank risks differently. For example, for another von Neumann–Morgenstern utility function v, it could be that ∫v(x)f(x)dx < ∫v(y)g(y)dy, so there is no agreed ranking of X and Y as preferences vary from one von Neumann–Morgenstern utility function to another. In fact, for X to always be ranked as high as Y requires ∫u(x)f(x)dx ≥ ∫u(x)g(x)dx for all functions u, and this is equivalent to ∫u(x)[f(x) − g(x)]dx ≥ 0 for all u. Since u and −u are both admitted, the only way this can hold is if f(x) = g(x) for all x, so that X and Y have the same distribution. Thus, it is natural to search for a (partial) ordering on risky returns such that one return would be considered more or less
risky than the other for a restricted class of preferences. One common procedure for establishing such rankings is through stochastic dominance comparisons. This chapter pursues these themes in turn. First, individual attitudes toward risk and decisionmaking are considered for a fixed preference, in Sections 2.2 and 2.3. Then stochastic dominance criteria are considered in Section 2.4, where classes of individuals would agree on the ranking of alternative random returns. Considering individual attitude toward risk, risk aversion measures the extent to which an individual dislikes risk. In Section 2.2 relative and absolute risk aversion are defined in terms of the von Neumann–Morgenstern utility: in terms of the function u in the expression V(x,p) = ∑i u(xi)pi. With these concepts, one can compare attitudes to risk: for example, the degree of risk aversion of individuals with some given wealth level but different preferences. In Section 2.2.1 some equivalences between risk aversion and other indicators of riskiness, such as risk premia, are considered. Risk aversion generally varies with the level of wealth. This section also discusses, in terms of risk aversion, how variations in wealth affect portfolio decisions. Next, from a different perspective, in the function V(x,p) one can view p as given and x as the variable; this defines the state preference model. One can then consider the impact of varying outcomes at different states while the probability of each state is given: for example, in considering the purchase of different levels of insurance when the probability of an accident is fixed. The connection between risk aversion and the shape of indifference curves in the state preference model is discussed in Section 2.3. Stochastic dominance criteria are introduced in Section 2.4. This section explores the criteria for ranking random returns in terms of properties of distribution functions, and in terms of expected utility. First-, second-, and higher-order stochastic dominance criteria are defined. The connection between direct rankings of distributions and ranking of distributions through preferences is explored. In the case where two random returns have equal mean, second-order stochastic dominance is equivalent to the mean preserving spread criterion: if two random returns have equal means and one second-order stochastically dominates the other, then the dominated return is a mean preserving spread of the dominating one. Stochastic dominance and risk aversion are discussed in Section 2.5.3: in particular, preferences with decreasing absolute risk aversion lead to a partial characterization of third-order stochastic dominance. In Section 2.5.4 the relationship between stochastic dominance, mean preserving spreads, likelihood ratios, and hazard rate criteria for ranking distributions is developed. Definitions of risk dominance are given in terms of mean semideviations in Section 2.5.5. Mean semideviations provide a measure of risk similar to the variance of a random return, but also have a direct connection to stochastic dominance. Finally, in Section 2.5.6 conditional stochastic dominance is defined.
2.2 von Neumann–Morgenstern Preferences and Risk
Given random returns Z, Z′, which one should an individual prefer? Is one more "risky" than the other? These are questions about ranking risky alternatives and comparing their riskiness. For example, if X is the random return yielding 20 or 30, each with 50% probability, and Y is the random return yielding 40 or 60, each with 50% probability, these are easily ranked, since the worst possible outcome from Y exceeds the best possible outcome from X. If a third random variable Z yields returns 0 or 100, each with 50% probability, matters are less clear. Comparing Y and Z, the difference in expected payoffs is ½[u(40) − u(0)] − ½[u(100) − u(60)], so that Y is (weakly) preferred to Z provided [u(40) − u(0)] ≥ [u(100) − u(60)], a condition that is satisfied for any concave function u. In the context of decisionmaking, the choices an individual makes (portfolio selection, insurance decisions, and so on) affect the distribution of returns the individual receives. For example, suppose the random return, Z(a), depends on a vector of choice variables, a ∈ A, where A is the feasible choice set. A portfolio with ai shares of asset i, where a unit of asset i has random return Xi, yields a random return of Z(a) = ∑i aiXi. For given a, Z(a) is a random variable with distribution Fa. If Xi has mean μi and variance σi², then Z(a) has mean ∑i aiμi; and if the Xi's are independent, Z(a) has variance ∑i ai²σi². Let V(Z(a)) be the utility attached to the random return Z(a). When preferences are represented by a von Neumann–Morgenstern utility, u, V(Z(a)) = E{u(Z(a))} = ∫u(ξ)dFa(ξ). An optimal choice for a solves max a∈A ∫u(ξ)dFa(ξ) and depends critically on attitude to risk and return. In what follows, a variety of criteria for ranking risk and return are considered. For von Neumann–Morgenstern utility u, an individual is said to be risk averse if u″ < 0 (taking as given that "more is better": u′ > 0). Two standard measures of risk aversion are absolute and relative risk aversion. At income level y, absolute risk aversion is measured as ra(y) = −u″(y)/u′(y) and relative risk aversion is measured as rr(y) = −u″(y)y/u′(y). These measures of risk aversion are local measures, evaluated at a given level of income y, and focus on the curvature of the utility function. Assumptions on these risk aversion measures may be used to provide insight into portfolio decisions. For example, constant absolute risk aversion implies a utility function of the form u(y) = a − be^(−αy), with absolute risk aversion equal to α. With constant relative risk aversion, u(y) = a + by^α with relative risk aversion parameter 1 − α, or u(y) = a·ln(by) with relative risk aversion parameter 1.
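A small numerical check (Python) that the quoted functional forms have constant risk aversion; the finite-difference step and parameter values are illustrative assumptions:

    import numpy as np

    def risk_aversion(u, y, h=1e-4):
        u1 = (u(y + h) - u(y - h)) / (2 * h)           # u'(y)
        u2 = (u(y + h) - 2 * u(y) + u(y - h)) / h**2   # u''(y)
        ra = -u2 / u1                                  # absolute risk aversion
        return ra, y * ra                              # (r_a, r_r)

    alpha = 0.5
    cara = lambda y: -np.exp(-alpha * y)   # u = a - b e^{-alpha y}, a=0, b=1
    crra = lambda y: y ** alpha            # u = a + b y^alpha,      a=0, b=1

    for y in (1.0, 2.0, 5.0):
        print(round(risk_aversion(cara, y)[0], 4),   # ~0.5 = alpha at every y
              round(risk_aversion(crra, y)[1], 4))   # ~0.5 = 1 - alpha at every y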
2.2.1 Risk aversion: some relations
The measures of risk aversion are closely related to other measures of risk. Let Z be a random variable with mean μ and variance σ² and let the decisionmaker
have wealth x. Four standard notions of riskiness are:
• Risk premium, π, implicitly defined from: u(x + μ − π) = E{u(x + Z)}.
• Certainty equivalent, πa, implicitly defined from: u(x + πa) = E{u(x + Z)}.
• Bid price, πb, implicitly defined from: u(x) = E{u(x + Z − πb)}.
• Insurance premium, πI, implicitly defined from: u(x − πI) = E{u(x + Z)}.
From the definitions, πa = μ − π = −πI, so that risk premium, certainty equivalent, and insurance premium measure the same thing: willingness to trade return for elimination of risk. The relative and absolute measures of risk aversion involve the derivatives of the utility function. For example, the relative risk aversion function is −x·d ln u′(x)/dx, the elasticity of marginal utility with respect to wealth. When the risk measured by the random return Z is small (in the sense that the distribution of Z has support close to its mean), local approximations of the utility function pin down the value of risk associated with Z. The following calculations (with E{Z} = μ and E{(Z − μ)²} = σ²) illustrate the relationship between the risk premium and absolute risk aversion. Assuming a Taylor series expansion is valid:

    E{u(x + Z)} ≈ E{u(x + μ) + u′(x + μ)(Z − μ) + ½u″(x + μ)(Z − μ)²}.

Since E{Z − μ} = 0, u(x + μ − π) = E{u(x + Z)} ≈ u(x + μ) + ½u″(x + μ)σ², where σ² = E{(Z − μ)²}. Assuming π is small, then u(x + μ − π) ≈ u(x + μ) − u′(x + μ)π, so that

    −u′(x + μ)π ≈ ½u″(x + μ)σ²,

or

    π ≈ ½[−u″(x + μ)/u′(x + μ)]σ² = ½ra(x + μ)σ².
Similar calculations give the odds required for an individual to take a (small) risk. If Z is a random variable taking on value h > 0 with probability p and −h with probability (1 − p), the larger the value of p the larger the mean of Z. The following calculation gives the value of p at which the individual is indifferent between accepting and rejecting Z (the gamble has zero value to the individual).
Thus,

    p·u(x + h) + (1 − p)·u(x − h) = u(x).

Using u(x ± h) ≈ u(x) ± u′(x)h + ½u″(x)h², this gives 0 ≈ (2p − 1)u′(x)h + ½u″(x)h², or 2p − 1 ≈ −½(u″(x)/u′(x))h, or p ≈ ½ + ¼[−(u″(x)/u′(x))]h = ½ + ¼ra(x)h. Thus, the extent to which the odds must be better than "50/50" is measured by absolute risk aversion.
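A sketch comparing these approximations with exact values for a CARA utility (Python; the choice of utility, wealth level, and gamble sizes are assumptions for illustration):

    import numpy as np
    from scipy.optimize import brentq

    alpha, x = 2.0, 1.0
    u = lambda w: -np.exp(-alpha * w)      # CARA: r_a(x) = alpha everywhere

    for h in (0.01, 0.05, 0.1):            # Z = +h or -h, each with prob 1/2
        Eu = 0.5 * u(x + h) + 0.5 * u(x - h)
        pi_exact = brentq(lambda p: u(x - p) - Eu, -1.0, 1.0)  # mu = 0 here
        pi_approx = 0.5 * alpha * h**2                         # (1/2) r_a sigma^2
        p_exact = brentq(lambda p: p * u(x + h) + (1 - p) * u(x - h) - u(x), 0, 1)
        p_approx = 0.5 + 0.25 * alpha * h                      # 1/2 + (1/4) r_a h
        print(h, round(pi_exact, 5), round(pi_approx, 5),
              round(p_exact, 5), round(p_approx, 5))

For small h the exact and approximate values agree closely; the gap grows with the size of the gamble, as expected from a local approximation.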
2.2.2 Risk aversion and behavior: asset choice
How does the degree of risk aversion affect the holding of risky assets? The following discussion considers the impact of monotonicity of the risk aversion function on the optimal amount of a risky asset as wealth varies. Say there are two assets: money and a risky asset. In a two-period model, the return on money is 0: its value next period is the same as its current value. The risky asset yields a random return of ξ, where E{ξ} > 0. Initially, an individual has a stock of money, m, and no risky asset. Then the individual selects an optimal amount of the risky asset: let θ be the proportion of money spent on the risky asset, and let Z = θm be the amount of risky asset purchased. After making the investment decision, the individual has (1 − θ)m of money and Z = θm of the risky asset. So, total income next period is Y = (1 − θ)m + θm(1 + ξ) = m + θmξ. With a von Neumann–Morgenstern utility function, u, the choice of θ is determined as the solution to:

    max θ E{u(m + θmξ)}.

Differentiating this with respect to θ (and assuming that the problem is sufficiently regular to pass the derivative under the integral) gives the first-order condition:

    E{u′(m + θmξ)·mξ} = 0.
Since m is a constant, E{u′(m + θmξ)ξ} = 0 is the first-order condition. One can use the first-order condition to show how portfolio choices are affected as risk aversion varies with income. Theorem 2.1 (1) dθ/dm ≤ 0 if relative risk aversion is increasing. (2) dZ/dm ≤ 0 if absolute risk aversion is increasing. Proof If the solution is θ(m), then substituting this into the first-order condition gives an identity:

    E{u′(m + θ(m)mξ)ξ} = 0.

Differentiating with respect to m, letting Y = m + θ(m)mξ:

    E{u″(Y)[1 + θ′(m)mξ + θ(m)ξ]ξ} = 0.
This rearranges to:

    θ′(m) = −E{u″(Y)ξ[1 + θξ]} / E{u″(Y)mξ²}.

Since u″ < 0 and mξ² > 0, the denominator has negative sign; the sign of θ′(m) is the same as the sign of the numerator, E{u″(Y)ξ[1 + θξ]}. Note that m(1 + θξ) = Y, so the numerator equals (1/m)E{ξYu″(Y)}. If relative risk aversion is increasing (i.e. rr(Y) = −Yu″(Y)/u′(Y) is increasing in Y), then for ξ ≥ 0 (so that Y = m(1 + θξ) ≥ m, given θ ≥ 0):

    rr(Y) = rr(m + θmξ) ≥ rr(m).

In this case, multiplying both sides by −u′(Y) gives Yu″(Y) ≤ −rr(m)u′(Y), and with ξ ≥ 0,

    ξYu″(Y) ≤ −rr(m)u′(Y)ξ.

Similarly, rr(Y) = rr(m + θmξ) ≤ rr(m) when ξ ≤ 0. In this case, multiplying both sides by −u′(Y) gives Yu″(Y) ≥ −rr(m)u′(Y), and multiplying by ξ ≤ 0 reverses the inequality:

    ξYu″(Y) ≤ −rr(m)u′(Y)ξ.

Thus, ∀ξ, ξYu″(Y) ≤ −rr(m)u′(Y)ξ. Taking expectations, E{ξYu″(Y)} ≤ −rr(m)E{u′(Y)ξ} = 0, where the last equality follows from the first-order condition. So the numerator is nonpositive and the sign of θ′(m) is negative: θ′(m) ≤ 0.
For the case of absolute risk aversion, again turning to the first-order condition with Z(m) = θ(m)m:

    E{u′(m + Z(m)ξ)ξ} = 0.

Differentiating with respect to m:

    E{u″(Y)[1 + Z′(m)ξ]ξ} = 0, so Z′(m) = −E{u″(Y)ξ}/E{u″(Y)ξ²}.

Recall that absolute risk aversion is defined ra(Y) = −u″(Y)/u′(Y). Since the denominator E{u″(Y)ξ²} is negative, the sign of Z′(m) is the sign of E{u″(Y)ξ}. Increasing absolute risk aversion gives, for ξ ≥ 0 (Y ≥ m),

    ra(Y) ≥ ra(m),

or −u″(Y) ≥ ra(m)u′(Y), or u″(Y)ξ ≤ −ra(m)u′(Y)ξ. Similarly, for ξ ≤ 0 (Y ≤ m),

    ra(Y) ≤ ra(m),

or −u″(Y) ≤ ra(m)u′(Y), or u″(Y)ξ ≤ −ra(m)u′(Y)ξ (multiplying by ξ ≤ 0). Thus,

    E{u″(Y)ξ} ≤ −ra(m)E{u′(Y)ξ} = 0,

so dZ(m)/dm ≤ 0.
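A numerical sketch of the boundary cases of the theorem (Python; the two-point return distribution and utility parameters are assumptions): with constant relative risk aversion, θ(m) is constant in m, and with constant absolute risk aversion the dollar holding Z(m) = θ(m)m is constant.

    import numpy as np
    from scipy.optimize import brentq

    xis = np.array([0.5, -0.25])           # equally likely returns, E{xi} > 0

    def theta_star(uprime, m):
        foc = lambda t: np.mean(uprime(m + t * m * xis) * xis)
        return brentq(foc, 1e-9, 3.99)     # keeps income m(1 + t*xi) positive

    crra_up = lambda y: y ** (-0.5)        # u'(y) for u = 2 sqrt(y), r_r = 1/2
    cara_up = lambda y: np.exp(-2.0 * y)   # u'(y) for u = -(1/2)e^{-2y}, r_a = 2

    for m in (1.0, 2.0, 4.0):
        print(m, round(theta_star(crra_up, m), 4),     # constant theta
              round(theta_star(cara_up, m) * m, 4))    # constant Z = theta*m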
2.3 Risk Aversion and the State Preference Model
Consider an environment where there is a given set of states, {s1, …, sn}, with corresponding probabilities {p1, …, pn}. Let xi be the return in state i. Then, with von Neumann–Morgenstern utility, the expected utility associated with return x = (x1, …, xn) is V(x) = ∑i piu(xi). Here, in the state preference model, the focus is not on the distribution p or alternative distributions, but on the possible outcomes in different states. The following discussion focuses on the two-state case. Let V(x1, x2) = p1u(x1) + p2u(x2) = p1u(x1) + (1 − p1)u(x2). If V is quasiconcave in x then u is quasiconcave, but possibly not concave. (So, convexity of the indifference curves of V, in (x1, x2) space, does not imply concavity of u.) If u is concave, then so is V. The marginal rate of substitution along an indifference curve of V is given by:

    dx2/dx1 = −p1u′(x1)/[p2u′(x2)].
The shape of the function V can be related to absolute risk aversion as follows.
Theorem 2.2 If absolute risk aversion is increasing, then the slope of the indifference curves of V changes moving out along a ray from the origin: the indifference curve becomes steeper if the ray has an angle greater than 45°, and flatter if the ray has an angle less than 45°.
Proof To prove this, consider ra(x) = −u″(x)/u′(x) = −d ln u′(x)/dx, and define ϕ(ξ) = ∫₀^ξ ra(t)dt = −[ln u′(ξ) − ln u′(0)], so that e^(−ϕ(ξ)) = e^(ln u′(ξ))·e^(−ln u′(0)) = ku′(ξ), with k = 1/u′(0). With this, the marginal rate of substitution is:

    MRS(x1, x2) = −p1u′(x1)/[p2u′(x2)] = −(p1/p2)e^(ϕ(x2) − ϕ(x1)).

Now, put x2 = μx1, where μ is greater than 1 for the case above the 45° line, and consider dMRS(x1, μx1)/dx1, a derivative describing how the marginal rate of substitution varies along the ray x2 = μx1:

    d/dx1 MRS(x1, μx1) = −(p1/p2)e^(ϕ(μx1) − ϕ(x1))·[μra(μx1) − ra(x1)] = MRS(x1, μx1)·[μra(μx1) − ra(x1)].
If ra(x) is increasing in x, then μ > 1 ⇒ ra(μx1) ≥ ra(x1), so that μra(μx1) − ra(x1) > 0; and μ < 1 ⇒ ra(μx1) ≤ ra(x1), so that μra(μx1) − ra(x1) < 0. Since MRS(x1, μx1) < 0, above the 45° line, moving out along a ray, the indifference curve has a "more negative" slope (is steeper), and below the 45° line the indifference curve has a less negative slope (is flatter).
Summarizing, above the 45° line the indifference curve becomes steeper along the ray x2 = μx1, μ > 1. Below the 45° line the indifference curve becomes flatter, moving out along a ray x2 = μx1, μ < 1. A similar result holds with relative risk aversion.
Theorem 2.3 If relative risk aversion is increasing, then along a ray above the 45° line the indifference curve of V becomes steeper; along a ray below the 45° line the slope becomes flatter. If relative risk aversion is constant, the slope of the indifference curve along a ray is constant.
Proof With relative risk aversion rr(x) = −xu″(x)/u′(x) = −x(d ln u′(x)/dx), or rr(x)/x = −d ln u′(x)/dx. Define ϕ*(ξ) = ∫ rr(t)/t dt (from some base point x₀ > 0), so ϕ*′(ξ) = rr(ξ)/ξ and e^(−ϕ*(ξ)) = e^(ln u′(ξ))·e^(−ln u′(x₀)) = ku′(ξ). The marginal rate of substitution is:

    MRS(x1, x2) = −(p1/p2)e^(ϕ*(x2) − ϕ*(x1)).

So, along the ray x2 = μx1,

    d/dx1 MRS(x1, μx1) = MRS(x1, μx1)·[μϕ*′(μx1) − ϕ*′(x1)] = MRS(x1, μx1)·[rr(μx1) − rr(x1)]/x1.
If relative risk aversion is increasing, then, since MRS(x1, μx1) is negative, when μ > 1, [rr(μx1) − rr(x1)] > 0 and the slope of the indifference curve becomes steeper (more negative); the slope becomes flatter when μ < 1. In the case where relative risk aversion is constant, [rr(μx1) − rr(x1)] = 0, and the indifference curves have the same slope along a ray from the origin.
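A quick numerical check of the constant relative risk aversion case (Python; utility and parameter values are assumed for illustration): the slope −p1u′(x1)/[p2u′(x2)] is constant along each ray x2 = μx1.

    p1 = p2 = 0.5
    uprime = lambda x: x ** (-0.5)          # CRRA with r_r = 1/2

    def mrs(x1, mu):
        return -p1 * uprime(x1) / (p2 * uprime(mu * x1))

    for mu in (0.5, 2.0):
        print([round(mrs(x1, mu), 4) for x1 in (1.0, 2.0, 5.0)])
    # each row is constant (equal to -sqrt(mu)), as Theorem 2.3 predicts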
2.4 Stochastic Dominance
How should different random variables be compared in terms of risk and return? Given two random variables X and Y, under what circumstances should Y be considered more risky than X, for any decisionmaker? Stochastic dominance criteria provide one answer in terms of the corresponding distribution functions: for example, if the distribution of X has lower probability on low values than the distribution of Y, then X is preferred. This informal idea leads to many different notions of dominance (first and higher order), introduced in Section 2.4.1. Although the definitions of stochastic dominance can be given without reference to preferences, there is a very close connection, or equivalence, which is developed in Section 2.4.2.
2.4.1 Stochastic dominance and distribution functions
Given a distribution F of a random variable X, let F1 ≡ F. Where necessary, to associate the distribution explicitly with the random variable, write FX. Iteratively, define functions Fk according to:

    Fk+1(z) = ∫ from −∞ to z of Fk(ξ)dξ.

Say that random variable X k-order stochastically dominates random variable Y if FXk(z) ≤ FYk(z), ∀z. Write X ≽kY if X k-order stochastically dominates Y. If X ≽kY then

    FXk+1(z) = ∫ from −∞ to z of FXk(ξ)dξ ≤ ∫ from −∞ to z of FYk(ξ)dξ = FYk+1(z), ∀z,

so that X ≽k+1Y. Thus, if a random variable X dominates a random variable Y at some order k, it dominates at all higher orders, k+j, j ≥ 1. Write X ≻kY for the strict order: X ≽kY but not Y ≽kX. From the definition, the orderings ≽k and ≻k are transitive orderings, the ordering ≽k is reflexive (X ≽kX), and the ordering ≻k is asymmetric (X ≻kY implies that Y ≻kX does not hold (¬(Y ≻kX))). Also, note that d/dz{Fk+1(z)} = Fk(z).
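A sketch of the iterates on a grid (Python; the distributions, grid, and left-Riemann integration are assumed for illustration): first-order dominance of X ~ U[0,2] over Y ~ U[0,1] is inherited by the second-order iterates.

    import numpy as np

    z = np.linspace(0, 2, 2001)
    dz = z[1] - z[0]
    F = np.clip(z / 2, 0, 1)     # distribution of X ~ U[0,2]
    G = np.clip(z, 0, 1)         # distribution of Y ~ U[0,1]

    def iterate(Fk):             # F^{k+1}(z) = integral of F^k from -inf to z
        return np.cumsum(Fk) * dz

    F2, G2 = iterate(F), iterate(G)
    print(np.all(F <= G), np.all(F2 <= G2))   # True True: X >=1 Y implies X >=2 Y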
2.4.2 Stochastic dominance and preferences
An alternative approach to ordering random variables is via integration on a class of (utility) functions. Consider classes of real-valued functions mapping from the real line. Let

    ℱ1 = {u | u′ ≥ 0} and ℱ2 = {u | u′ ≥ 0, u″ ≤ 0}.

Define orderings on the space of random variables according to X ≥1Y if ∫u(ξ)dFX(ξ) ≥ ∫u(ξ)dFY(ξ) for all u ∈ ℱ1; and X ≥2Y if ∫u(ξ)dFX(ξ) ≥ ∫u(ξ)dFY(ξ) for all u ∈ ℱ2. These notions of dominance are equivalent to ≽1 and ≽2, as the following discussion shows.
2.5 Equivalence of Dominance Criteria
The following discussion shows that definitions of stochastic dominance are equivalent whether given directly in terms of distributions or alternatively through preferences. For ease of notation, let F = FX and G = FY. The discussion below makes use of the following calculation (integration by parts):

    ∫u(ξ)dF(ξ) − ∫u(ξ)dG(ξ) = u(ξ)[F(ξ) − G(ξ)] evaluated from −∞ to ∞, minus ∫u′(ξ)[F(ξ) − G(ξ)]dξ.

Provided limξ→±∞ u(ξ)[F(ξ) − G(ξ)] = 0 (assumed in what follows),

    ∫u dF − ∫u dG = −∫u′(ξ)[F(ξ) − G(ξ)]dξ.

Theorem 2.4 The orderings ≥1 and ≥2 are equivalent to ≽1 and ≽2 respectively.
Proof Comparison of the orderings ≥1 and ≽1.
1. [X ≥1Y ⇒ X ≽1Y]: If X ≥1Y, then for all u ∈ ℱ1, ∫u dF − ∫u dG ≥ 0, so that for all increasing u, ∫u′(ξ)[F(ξ) − G(ξ)]dξ ≤ 0. If there is an interval on which F(ξ) > G(ξ), then taking u to be increasing on that interval and constant elsewhere yields ∫u′(ξ)[F(ξ) − G(ξ)]dξ > 0, so that ∫u dF − ∫u dG < 0, a contradiction. So, F(ξ) ≤ G(ξ), ∀ξ, and X ≽1Y.
2. [X ≽1Y ⇒ X ≥1Y]: Suppose that X ≽1Y, so that F(ξ) ≤ G(ξ) for all ξ. Then, for any increasing u, ∫u′(ξ)[F(ξ) − G(ξ)]dξ ≤ 0, so that ∫u dF − ∫u dG ≥ 0, and X ≥1Y.
Before considering ≽2 and ≥2, some calculations will be useful. Consider the expression

    ∫u′(ξ)[F(ξ) − G(ξ)]dξ,

noting that d/dξ{F2(ξ) − G2(ξ)} = F(ξ) − G(ξ). Integrating by parts again,

    ∫u dF − ∫u dG = −∫u′(ξ)[F(ξ) − G(ξ)]dξ = −u′(ξ)[F2(ξ) − G2(ξ)] evaluated from −∞ to ∞, plus ∫u″(ξ)[F2(ξ) − G2(ξ)]dξ.

Note that F2(−∞) = G2(−∞) = 0, since the area under the distribution function at −∞ is 0.
Comparison of the orderings ≥2 and ≽2.
1. [X ≽2Y ⇒ X ≥2Y]: Suppose that X ≽2Y, so that F2(x) − G2(x) ≤ 0, ∀x. Then, with u′ ≥ 0, the boundary term −limξ→∞ u′(ξ)[F2(ξ) − G2(ξ)] is nonnegative. And, since F2(x) − G2(x) ≤ 0, ∀x, if u″ ≤ 0, then ∫[F2(ξ) − G2(ξ)]u″(ξ)dξ ≥ 0. Thus, X ≽2Y implies that ∫u dF − ∫u dG ≥ 0 for all u ∈ ℱ2, or X ≥2Y.
2. [X ≥2Y ⇒ X ≽2Y]: Let X ≥2Y, so that ∫u dF − ∫u dG ≥ 0 for all u concave and increasing. Suppose that for some interval, F2(x) − G2(x) > 0. Let x* be the left endpoint of such an interval, so that F2(x) − G2(x) > 0 on (x*, x* + ε) for some ε > 0. Let u be linear on (−∞, x*], strictly concave on (x*, x* + ε), and constant thereafter, with u′(x* + ε) = 0 (so that u′(ξ) = 0 and u″(ξ) = 0 for ξ ≥ x* + ε, while u″(ξ) = 0 for ξ ≤ x*). In this case,

    −u′(ξ)[F2(ξ) − G2(ξ)] evaluated from −∞ to ∞ is 0,

and,

    ∫u dF − ∫u dG = ∫ over (x*, x* + ε) of u″(ξ)[F2(ξ) − G2(ξ)]dξ < 0,

contradicting X ≥2Y.
2.5.1 Equal means: mean preserving spreads
One case of particular interest is that where E{X} = E{Y}, or μF = μG. On the class of such distributions, second-order stochastic dominance is characterized by concavity (alone) of the integrating functions. Recall that, provided limξ→±∞ u(ξ)[F(ξ) − G(ξ)] = 0,

    ∫u dF − ∫u dG = −∫u′(ξ)[F(ξ) − G(ξ)]dξ.

In the case where u(x) = x, this gives

    μF − μG = −∫[F(ξ) − G(ξ)]dξ.

Consider the case where the supports of the distributions are both some interval [a,b]. Recalling that d/dξ{F2(ξ) − G2(ξ)} = F(ξ) − G(ξ),

    μF − μG = −[F2(b) − G2(b)],

so equality of means is equivalent to F2(b) = G2(b); in that case the boundary term −u′(b)[F2(b) − G2(b)] vanishes, and

    ∫u dF − ∫u dG = ∫ from a to b of u″(ξ)[F2(ξ) − G2(ξ)]dξ.

When the means are equal (μF = μG), second-order stochastic dominance, [F2(ξ) − G2(ξ)] ≤ 0, ∀ξ, implies that ∫u dF − ∫u dG ≥ 0 for all concave u (since u″ ≤ 0 and F2 − G2 ≤ 0). Conversely, if ∫u dF − ∫u dG ≥ 0 for all concave u, then this holds for all increasing concave u, so that F second-order stochastically dominates G. Finally, this relates to the notion of a mean preserving spread as follows. Let F and G be the distribution functions of X and Y. The distribution G is said to be a (weak) mean preserving spread of F (X ≽MPSY) if they have the same mean and F2(z) − G2(z) ≤ 0, ∀z. Thus, X ≽MPSY is equivalent to common mean and second-order stochastic dominance.
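A numerical sketch (Python; the pair of distributions is an assumed example): G, uniform on [0,1], is a mean preserving spread of F, a point mass at ½; equality of means shows up as F2(1) = G2(1), and F2 ≤ G2 throughout.

    import numpy as np

    z = np.linspace(0, 1, 10001)
    dz = z[1] - z[0]
    F = (z > 0.5).astype(float)      # point mass at 1/2
    G = z.copy()                     # U[0,1]

    F2 = np.cumsum(F) * dz           # second iterates on the grid
    G2 = np.cumsum(G) * dz
    print(round(F2[-1], 3), round(G2[-1], 3))   # both 0.5: equal means
    print(np.all(F2 <= G2))                     # True: F second-order dominates G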
2.5.2 Higher order stochastic dominance
For calculations involving higher order stochastic dominance, integral iterates of the distribution function can be related to higher order derivatives of the von Neumann–Morgenstern utility function. Consider the expression

    ∫u″(ξ)[F2(ξ) − G2(ξ)]dξ,

and observe that F3(ξ) − G3(ξ) satisfies:

    d/dξ{F3(ξ) − G3(ξ)} = F2(ξ) − G2(ξ).

Thus, integrating by parts on [a,b],

    ∫u″(ξ)[F2(ξ) − G2(ξ)]dξ = u″(b)[F3(b) − G3(b)] − ∫u‴(ξ)[F3(ξ) − G3(ξ)]dξ.

More generally, writing u(l) for the lth order derivative of u,

    ∫u dF − ∫u dG = ∑ for l = 1 to k−1 of (−1)^l·u(l)(b)[Fl+1(b) − Gl+1(b)], plus (−1)^k ∫u(k)(ξ)[Fk(ξ) − Gk(ξ)]dξ.

Let ℱk = {u | (−1)^j·u(j) ≤ 0, j = 1, …, k}. This expression connects ∫u dF − ∫u dG to the terms Fk − Gk and the derivatives u(l), l = 1, …, k. For example, for any k ≥ 1, if ∫u dF ≥ ∫u dG, ∀u ∈ ℱk, then μF ≥ μG (since u(x) = x is in ℱk). In particular, if F first- or second-order stochastically dominates G, then μF ≥ μG.
2.5.3 Stochastic dominance and risk aversion
Recall that the absolute risk aversion measure was given by ra(y) = −(u″(y)/u′(y)). A plausible assumption is that aversion to risk declines as income
increases: ra′(y) ≤ 0. Observe that

    ra′(y) = −[u‴(y)u′(y) − (u″(y))²]/(u′(y))².

For this to be negative requires that u‴(y)u′(y) − (u″(y))² ≥ 0, and hence u‴(y) > 0, since u′(y) > 0 and (u″(y))² > 0. Thus, decreasing absolute risk aversion implies that the third derivative of the utility function is positive: u‴(y) > 0. (More specifically, u‴(y) ≥ u′(y)ra(y)².) Considering those functions u with u′ ≥ 0 and u″ ≤ 0, the subset that has the decreasing absolute risk aversion property is contained in the collection of functions that have a nonnegative third derivative. In the case where the distributions have support [a,b] and μF = μG (so that F2(b) = G2(b)),

    ∫u dF − ∫u dG = u″(b)[F3(b) − G3(b)] − ∫u‴(ξ)[F3(ξ) − G3(ξ)]dξ.

Thus, F ≥3G if μF = μG and F3(z) − G3(z) ≤ 0, ∀z (F ≽3G): with u″ ≤ 0 and u‴ ≥ 0, both terms on the right are then nonnegative. (More generally, F ≥3G if and only if μF ≥ μG and F ≽3G.) So increasing, risk averse utility functions (u′ ≥ 0, u″ ≤ 0) with decreasing absolute risk aversion rank distributions with common mean in accordance with third-degree stochastic dominance.
2.5.4 Likelihood ratios and hazard rates
Apart from stochastic dominance criteria, likelihood ratios and hazard rates are also used to rank distributions. Let X and Y have common support with distributions F and G and densities f and g. (More generally, if S(X) and S(Y) are the supports of F and G, respectively, the likelihood ratio is said to be increasing if f/g is increasing on S(X) ∪ S(Y).)
Definition 2.1 Say that X dominates Y in likelihood ratio order (X ≽LRY) if f(z)/g(z) is increasing in z. The hazard function hF associated with F is defined: hF(z) = f(z)/(1 − F(z)). Write X ≽HRY if hF(z) ≤ hG(z), ∀z.
These are related as follows:
Theorem 2.5 X ≽LRY implies X ≽HRY, which implies X ≽1Y.
Proof [X ≽LRY ⇒ X ≽HRY]: Take x ≤ y, so that f(y)/g(y) ≥ f(x)/g(x). Then f(y) ≥ (f(x)/g(x))g(y). Observe that

    1 − F(y) = ∫ from y to ∞ of f(ξ)dξ ≥ (f(x)/g(x)) ∫ from y to ∞ of g(ξ)dξ = (f(x)/g(x))(1 − G(y)).

Letting y = x,

    1 − F(x) ≥ (f(x)/g(x))(1 − G(x)), so hF(x) = f(x)/(1 − F(x)) ≤ g(x)/(1 − G(x)) = hG(x).

Thus X ≽LRY implies X ≽HRY.
[X ≽HRY ⇒ X ≽1Y]: Observe that the hazard function satisfies

    hF(ξ) = f(ξ)/(1 − F(ξ)) = −(d/dξ)ln(1 − F(ξ)).

Integrating,

    ∫ from −∞ to x of hF(ξ)dξ = −ln(1 − F(x)) and ∫ from −∞ to x of hG(ξ)dξ = −ln(1 − G(x)).

Since hG(ξ) ≥ hF(ξ), ∀ξ, −ln(1 − F(x)) ≤ −ln(1 − G(x)), or ln(1 − F(x)) ≥ ln(1 − G(x)). Thus,

    1 − F(x) ≥ 1 − G(x),

so that F(x) − G(x) ≤ 0, ∀x, and so X ≽1Y.
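A sketch checking the chain of implications for an assumed example (Python): for exponentials with rates λF < λG, the likelihood ratio f/g is increasing, the hazards satisfy hF ≤ hG, and F ≤ G pointwise, illustrating Theorem 2.5.

    import numpy as np

    lam_F, lam_G = 0.5, 1.0
    z = np.linspace(0.01, 10, 1000)

    lr = (lam_F / lam_G) * np.exp((lam_G - lam_F) * z)   # f(z)/g(z)
    F = 1 - np.exp(-lam_F * z)
    G = 1 - np.exp(-lam_G * z)
    hF = lam_F * np.ones_like(z)    # exponential hazard is constant: f/(1-F)
    hG = lam_G * np.ones_like(z)

    print(np.all(np.diff(lr) > 0), np.all(hF <= hG), np.all(F <= G))   # True True True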
2.5.5 Dominance in terms of semideviations
In some cases, it may be argued that stochastic dominance does not adequately capture the risk–return tradeoff. Let X be a random variable uniformly distributed on [0,2] and let Y be a random variable uniformly distributed on [0,1]: F(z) = ½z on [0,2] and G(z) = z on [0,1]. Thus, X first-order stochastically dominates Y. Also, Y has a mean of ½ and a variance of 1/12, whereas X has a mean of 1 and a variance of ⅓. Thus, for portfolios based on X and Y, each is on the mean–variance frontier (the higher mean comes with the higher variance), although X first-order stochastically dominates Y. This suggests, as a possibility, using a measure of risk other than the variance to relate stochastic dominance to risk. The following discussion introduces the notion of semideviations. For k = 1, 2, …, the kth central semideviation is defined:

    (E{max{0, μF − X}^k})^(1/k).

In the specific case where k = 1, this is called the absolute semideviation:

    δ̄F = E{max{0, μF − X}}.

Note that E{X − μF} = 0, so that E{max{0, X − μF}} = E{max{0, μF − X}}. Therefore, the absolute semideviation may be written:

    δ̄F = E{max{0, X − μF}} = ½E{|X − μF|}.

It turns out that if X second-order stochastically dominates Y, X ≽2Y, then

    μF ≥ μG and μF − δ̄F ≥ μG − δ̄G.

Thus, the mean is higher and so is the gap between return and risk. How do semideviations relate to stochastic dominance? The following calculations connect integral iterates of the distribution function (which are used to define stochastic dominance relations) to mean semideviations. For k ≥ 1,

    Fk+1(x) = (1/k!) ∫ from −∞ to x of (x − ξ)^k dF(ξ).

This can be seen by induction. If k = 0 (with 0! = 1 by convention), the expression gives

    F1(x) = ∫ from −∞ to x of dF(ξ) = F(x),

which is correct. If true for k − 1, so that Fk(η) = (1/(k−1)!) ∫ from −∞ to η of (η − ξ)^(k−1) dF(ξ), then

    Fk+1(x) = ∫ from −∞ to x of Fk(η)dη = (1/(k−1)!) ∫ from −∞ to x ∫ from −∞ to η of (η − ξ)^(k−1) dF(ξ)dη.

Consider the double integral. This is the integral of the function f(η, ξ) = (η − ξ)^(k−1) on the region below the 45° line in (η, ξ) space, the set {(η, ξ) | ξ ≤ η ≤ x}. From Fubini's theorem, the order of integration may be reversed:

    ∫ from −∞ to x ∫ from −∞ to η of (η − ξ)^(k−1) dF(ξ)dη = ∫ from −∞ to x [∫ from ξ to x of (η − ξ)^(k−1) dη] dF(ξ).

This can be seen directly. In the first form, for η < x the inner integral runs along the (vertical) line through η from −∞ to η, and the outer integral sums over these lines; this computes the integral of the function on the region with respect to dF(ξ) ⊗ dη. Alternatively, for ξ < x, integrate from ξ up to x along the η-axis, ∫ from ξ to x of (η − ξ)^(k−1) dη, and sum these (horizontal) lines with a second integral with respect to dF(ξ). The integral is the same as before. Observe

    ∫ from ξ to x of (η − ξ)^(k−1) dη = (x − ξ)^k / k,

so that

    Fk+1(x) = (1/(k−1)!)·(1/k) ∫ from −∞ to x of (x − ξ)^k dF(ξ) = (1/k!) ∫ from −∞ to x of (x − ξ)^k dF(ξ),

and the result is verified by induction. Let

    ‖max{0, x − X}‖k = (E{max{0, x − X}^k})^(1/k),

and note that

    E{max{0, x − X}^k} = ∫ from −∞ to x of (x − ξ)^k dF(ξ) = k!·Fk+1(x),

so that Fk+1(x) = (1/k!)(‖max{0, x − X}‖k)^k. In particular, F2(μF) = ‖max{0, μF − X}‖1 = δ̄F, the absolute semideviation.
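A simulation sketch (Python; the uniform example and sample size are assumptions) checking that the absolute semideviation coincides with F2 evaluated at the mean: for X ~ U[0,1], both equal ⅛.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 200_000)             # X ~ U[0,1], mu = 1/2
    mu = x.mean()
    semidev = np.maximum(0.0, mu - x).mean()   # absolute semideviation

    F2_at_mu = 0.5 ** 2 / 2                    # F2(1/2) = integral of z from 0 to 1/2
    print(round(semidev, 4), F2_at_mu)         # both approximately 0.125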
The mean–variance criterion for the risk–return tradeoff permits inconsistencies with second-order stochastic dominance: a random variable with mean–variance on the efficient frontier may be second-order stochastically dominated by another random variable. If risk is instead measured by the first-order (absolute) semideviation, so that risk–return is evaluated in terms of mean–semideviation, this is not the case. Comparing two random returns, X and Y, with common mean, if X second-order stochastically dominates Y, then the absolute semideviation measure of risk associated with X is lower than the absolute semideviation measure of risk associated with Y.
2.5.6 Conditional stochastic dominance and monotone likelihood ratios
The definitions of stochastic dominance can be extended to conditional dominance. Let F be a cumulative distribution and write F(· | x) for the distribution conditional on the random variable being at least as large as x: F(z | x) = [F(z) − F(x)]/[1 − F(x)] for z ≥ x. Let G be a cumulative distribution with conditional distribution G(· | x). Say that F conditionally stochastically dominates G (first order) if F(· | x) stochastically dominates G(· | x), ∀x. The density of F(z | x) is f(z | x) = f(z)/(1 − F(x)) and similarly the density of G(z | x) is g(z | x) = g(z)/(1 − G(x)), both with support on [x, ∞). Thus, f(z | x)/g(z | x) = [f(z)/g(z)]·[(1 − G(x))/(1 − F(x))]. Consequently, the likelihood ratio f(z | x)/g(z | x) is increasing in z if f(z)/g(z) is increasing in z, and so a sufficient condition for conditional stochastic dominance is a monotone likelihood ratio f(z)/g(z).
Bibliography
Fishburn, P. C. (1982). The Foundations of Expected Utility. Dordrecht: D. Reidel.
Gale, Douglas (1981). Lecture notes, mimeo, LSE.
Machina, M. (1982). "'Expected Utility' Analysis Without the Independence Axiom," Econometrica, 50, 277–323.
Machina, M. (1987). "Choice Under Uncertainty: Problems Solved and Unsolved," Journal of Economic Perspectives, 1(1), 121–154.
Ogryczak, W. and Ruszczyński, A. (1997). "On Stochastic Dominance and Mean-Semideviation Models," IIASA Report 97–043.
Pratt, J. (1964). "Risk Aversion in the Small and in the Large," Econometrica, 32, 122–136.
Rothschild, M. and Stiglitz, J. E. (1970). "Increasing Risk: A Definition," Journal of Economic Theory, 2, 225–243.
3 Strategic Form Games
3.1 Introduction
The standard framework for modeling interaction between individuals is the strategic form game. A strategic form game—defined in Section 3.2—consists of a choice set for each player and a payoff function associating payoffs to choices. Players make choices simultaneously, or in ignorance of others' choices, and receive payoffs. How should such choices be determined, and what choices might one predict will be chosen? In a single person decision problem, given a choice set X and a preference ordering or utility function, u, defined on X, it is reasonable to predict that the individual will maximize u: maxx∈X u(x). When there is more than one decisionmaker (say two) and the utility of one, u(x,y), depends on the actions of the other (y ∈ Y), this raises a fundamental conundrum in predicting behavior: the optimizing choice for x depends on the (a priori unknown) value of y, so that choosing x "optimally" requires some view of what value will be chosen for y. This leads to reasoning about the motivation of other players, and can quickly lead to complex models of rationality layered on top of the basic physical description of the problem. The discussion below does not pursue these issues, but instead describes a variety of models that reflect very different points of view on how individuals might behave, including a dynamic stability view of Nash equilibrium—the most widely used equilibrium criterion. In Section 3.3 a variety of notions of equilibrium are described. Section 3.3.1 discusses a minmax approach, where individuals take a conservative position, minimizing "downside" outcomes: a player selects the action that makes the worst outcome that can occur as good as possible. Next, dominant strategies are introduced in Section 3.3.2. A strategy is a dominant strategy for a player if, when compared with any other strategy, it is always as good and sometimes better in terms of payoffs. When a strategic form game has a dominant strategy
for each player, this provides a robust prediction of behavior, especially when the dominance is strict. Section 3.3.3 discusses rationalizability, which emphasizes higher order reasoning of players (when players think through how other players may be reasoning). Section 3.3.4 presents evolutionary stable strategies, a biologically based model of behavior: when a strategy used by the entire population is immune to invasion by an alternative strategy, such a strategy is called an evolutionary stable strategy. In Section 3.4, Nash equilibrium is introduced along with a brief discussion of its dynamic stability. Nash equilibrium is by far the most widely used notion of equilibrium behavior and is defined mathematically by the stability requirement of no gain from unilateral deviation. Each player's choice is optimal given the choices of other players. However, no gain from deviation presupposes some anticipation of the choice of others. Addressing this issue requires identifying the extent to which players have knowledge of the equilibrium that is being played. Following this, correlated equilibrium is described in Section 3.5, where the convexity of the set of correlated equilibrium payoffs and the connection with Nash equilibria are discussed.
3.2 Strategies
A strategic form game is given by G = {(Ai, ui)}, i = 1, …, n, where ui: A → R is the utility function of player i, Ai is the pure strategy space of i, and A = ×iAi. The set of mixed strategies of i is the set of probability distributions on Ai, Xi = Δ(Ai). If each Ai is finite then the game in mixed strategies is defined:

    G* = {(Xi, Ui)}, with Ui(x) = ∑a∈A ui(a)·∏j xj(aj).

In the continuous action space case, this becomes Ui(x) = ∫A ui(a)dx1(a1)⋯dxn(an). To play a mixed strategy, xi, player i uses the distribution xi to select a point ai ∈ Ai, which is then "played," all at the same time. One reason a player might use a mixed strategy is for concealment. For example, in the game of matching pennies,

                L          R
      T       1, −1     −1, 1
      B      −1, 1       1, −1

for either player, a pure strategy guarantees no more than −1; but the mixed strategy (z, 1 − z) = (½, ½) guarantees a payoff of 0. The fundamental assumption in the strategic form game is that a player when making a choice does not know the choices of others—which can be interpreted
as simultaneous decisionmaking. Since the outcome of the game depends on the choices of all players, it is necessary for each player to form some opinion on what others might do. Pursued vigorously, this leads to reasoning about how others reason and possibly to the psychology governing individual behavior. A Nash equilibrium is defined mathematically by the property of no gain from unilateral deviation. So â = (âi, â−i) is a Nash equilibrium in pure strategies if and only if for each player i, ui(âi, â−i) ≥ ui(ai, â−i) for all ai ∈ Ai; implicitly, this requires some insight on i's part regarding the choices â−i.
3.3 Solutions
A noncooperative solution to a game predicts certain strategy profiles as outcomes of the game. Defining or interpreting a solution revolves around how players reason and behave, or are believed to reason and behave. This inevitably leads to the need for a player to attempt to understand and predict how others will behave. In this way, beliefs about "opponents" enter the discussion of equilibrium behavior and can lead to complex models of reasoning and associated behavior. Here, however, this aspect of the study of behavior is minimized and the focus is placed on a few well-known solutions without much discussion of possible rational or logical underpinnings.
3.3.1 Maxmin choices
Given the strategic situation described by G = {(Ai, ui)}, one possibility is that a player might take a "defensive" position, choosing an action that makes the worst outcome as good as possible; no other action guarantees a higher minimum payoff. Formally, this is the "maxmin" strategy: x̄i is a maxmin strategy if

    min x−i ui(x̄i, x−i) = max xi min x−i ui(xi, x−i).

This maxmin behavior is conservative behavior, and has the advantage that it requires little reflection on the psychology or reasoning of other players. In some circumstances this may well be plausible as a model of behavior. Consider, for example, a game of the following form (payoffs chosen to be consistent with the discussion below):

                L             R
      T      100, 50     −100, 49
      B       49, 50       49, 49

Player 1 (the row player) guarantees a payoff of 49 with the choice B, whereas with choice T, −100 is a possible payoff. For player 2, there is only a small difference between the best and the worst possible payoffs. Player 2 has only a small incentive
to choose L over R, so it may be prudent for 1 to choose B. If, however, the payoffs to (T,R) were (−100,−100) (instead of (−100,49)), then the choice of R is potentially a very bad one for 2, in the event that 1 plays T. If, furthermore, the payoffs to (B,R) were also changed to (49,−100), then the choice of R guarantees 2 a payoff of −100, which is much worse than either outcome from choosing L. In such cases the logic for player 1 choosing B to guard against a choice of R by player 2 seems much weaker (although the best response functions are unchanged). The best response mapping of player i associates to each profile of other players' choices x−i ∈ X−i the utility maximizing choices for i. This is defined:

    Bi(x−i) = argmax xi∈Xi ui(xi, x−i).
3.3.2 Dominant strategies
Considering dominance comparisons offers another approach to the selection of strategies. Given a strategic form game G = {(Ai, ui)}, the corresponding mixed extension is G* = {(Xi, Ui)}. Strategy xi weakly dominates x̂i if (1) Ui(xi, x−i) ≥ Ui(x̂i, x−i), ∀x−i ∈ X−i, and (2) Ui(xi, x−i) > Ui(x̂i, x−i) for some x−i ∈ X−i. Strategy x̂i is weakly dominated if there is some strategy that weakly dominates x̂i. Strategy xi strictly dominates x̂i if Ui(xi, x−i) > Ui(x̂i, x−i), ∀x−i ∈ X−i. Strategy x̂i is strictly dominated if there is some strategy that strictly dominates it. Similar definitions apply to domination in terms of pure strategies. In general there is some strategy that is not weakly dominated. To see this, for each j, let ϕj put positive probability on each open set in Sj, and let ϕ−i = ×j≠i ϕj, the distribution determined on S−i by {ϕj}j≠i (where Sj is Aj or Xj). Define:

    φ(si) = ∫ ui(si, s−i)dϕ−i(s−i).

Let s̄i maximize φ(si). Then s̄i is not weakly dominated. To see this, suppose otherwise, so that there is some ŝi which weakly dominates s̄i: ui(ŝi, s−i) ≥ ui(s̄i, s−i), ∀s−i, with strict inequality for some s̃−i; if ui is continuous the strict inequality holds on an open neighborhood (of s̃−i). Since ϕ−i puts positive probability on that neighborhood, φ(ŝi) > φ(s̄i), a contradiction.
Definition A strategy profile (x1, …, xn) is a dominant strategy equilibrium if for each i, xi is a dominant strategy.
Most games do not have dominant strategy equilibria. Still, one may also consider iterative procedures for the elimination of dominated strategies that
sometimes produce tight predictions. Consider the following game (player 1 chooses the row; the payoffs to C are chosen to be consistent with the discussion):

                L         R
      T       4, 1      2, 2
      C       3, 3      3, 0
      B       2, 1      5, 2

In this game, although pure strategies as a basis for elimination of strategies have no effect, mixed strategies do eliminate strategies via domination: ½T + ½B = ½((4,1), (2,2)) + ½((2,1), (5,2)) = ((3,1), (3½,2)). The combination of strategies T and B weakly dominates C. If player 2 believed that 1 would not play C, then R is a better choice than L. If 2 were to play R, it is best for 1 to play B. Such considerations suggest an iterative approach. Let G⁰ = {(Si⁰, ui)}, i = 1, …, n, be a game, where for each i, Si⁰ = Xi. Define Si¹ = {si ∈ Si⁰ | si is not weakly dominated in G⁰}. Define G¹ = {(Si¹, ui)}. Inductively, define Sit = {si ∈ Sit−1 | si is not weakly dominated in Gt−1} and Gt = {(Sit, ui)}. Let S* = ∩t ×iSit. Say that G is dominance solvable if x*, y* ∈ S* implies u(x*) = u(y*). In this procedure, all weakly dominated strategies are eliminated at each stage. This is necessary to avoid ambiguity. Consider:

                L         R
      T       1, 1      0, 0
      M       1, 1      2, 1
      B       0, 0      2, 1

If T is removed in the first round, and then L, two strategy profiles remain: (M,R) and (B,R), both with payoff (2,1). If B is removed in the first round, and then R, both (T,L) and (M,L) remain, with payoff (1,1). Thus, if not all weakly dominated strategies are removed at each stage, the final set of strategies obtained may depend on the order of elimination. When only strictly dominated strategies are considered for elimination, even if not all strictly dominated strategies are eliminated at each stage, iterative elimination of strictly dominated strategies in any order produces the same final set of strategies.
3.3.3 Rationalizability
When players reason about their own and others' behavior, the impact on strategy selection can be dramatic. The following discussion introduces rationalizability and shows that in the linear duopoly game, iterative reasoning about how players will behave leads to a unique prediction. Let the set of mixed strategies of i be Xi. Let Xi⁰ = Xi. Proceeding inductively, define

    Xit = {xi ∈ Xit−1 | xi is a best response to some x−i ∈ ×j≠i Xjt−1}.

Define Xi∞ = ∩t Xit, and the rationalizable strategies for i as the elements of Xi∞.
As an example, in the duopoly model with demand P(Q) = a − bQ and constant marginal cost c, the best response functions are qi(qj) = ((a − c) − bqj)/2b. The symmetric equilibrium is given by q1 = q2 = (1/3)(a − c)/b. Regardless of the value of qi, the best response for j is in the range [0, (a − c)/2b]; and vice versa. Here, in terms of mixed strategies, the set of best responses to some strategies of the other player are those distributions with support in the interval Q1 = [0, (a − c)/2b]. At the next round of iteration, if player i never chooses a quantity above (a − c)/2b, player j will never have a best response below (a − c)/4b: if i's choices are in the interval [0, (a − c)/2b], the best responses of j are in the interval Q2 = [(a − c)/4b, (a − c)/2b]. Similarly, if i chooses from the range [(a − c)/4b, (a − c)/2b], the best response of j lies in the range Q3 = [(a − c)/4b, (3/8)(a − c)/b], and so on. This procedure converges: Q∞ = {(1/3)(a − c)/b}, the point where the reaction functions cross.
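A sketch of this interval iteration for the Cournot example (Python; the parameter values are illustrative):

    a, b, c = 10.0, 1.0, 1.0                 # demand P = a - bQ, marginal cost c
    br = lambda q: max(0.0, (a - c - b * q) / (2 * b))

    lo, hi = 0.0, (a - c) / b                # initial interval of quantities
    for _ in range(30):
        lo, hi = br(hi), br(lo)              # br is decreasing: endpoints swap
    print(round(lo, 6), round(hi, 6), (a - c) / (3 * b))   # 3.0 3.0 3.0

The successive intervals reproduce Q1 = [0, 4.5], Q2 = [2.25, 4.5], Q3 = [2.25, 3.375], …, shrinking to the equilibrium quantity (a − c)/(3b) = 3.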
3.3.4 Evolutionary stable strategies
Evolutionary stable strategies provide a criterion for strategy selection based on stability of the strategy in a large population. A given strategy is stable according to this criterion if an "invading" strategy performs less well in the modified population than the status quo or incumbent strategy, since this provides the incentive to choose the status quo; or, viewed in evolutionary terms, leads to faster replication of the status quo strategy. Fix a two-player symmetric game: (A, A′) (so that u1(i,j) = u2(j,i)), where A is an m × m matrix. Let Δ be the m − 1 dimensional simplex. Given x, y ∈ Δ, let u(x, y) = x·Ay = ∑i,j xi·aij·yj.
Definition 3.2 A strategy x* ∈ Δ is an evolutionary stable strategy (ESS) if for every y ≠ x*, ∃εy ∈ (0,1) such that for all ε ∈ (0, εy):

    u(x*, (1 − ε)x* + εy) > u(y, (1 − ε)x* + εy).

So, the invading strategy, y, performs less well than the status quo, x*, against the "invaded" population represented by the strategy (1 − ε)x* + εy. Observe that the inequality is strict. The inequality can be expressed differently. Note that

    u(z, (1 − ε)x* + εy) = (1 − ε)u(z, x*) + εu(z, y),

so that the inequality may be written:

    (1 − ε)u(x*, x*) + εu(x*, y) > (1 − ε)u(y, x*) + εu(y, y).

If u(x*, x*) > u(y, x*), the condition is automatically satisfied for ε sufficiently small; it cannot be satisfied if u(x*, x*) < u(y, x*). If u(x*, x*) = u(y, x*), the condition requires u(x*, y) > u(y, y). Thus, ESS may be defined equivalently: x* is an evolutionary stable strategy if for any y ≠ x*: 1. u(x*, x*) > u(y, x*), or 2. (a) u(x*, x*) = u(y, x*), and (b) u(x*, y) > u(y, y). No weakly dominated strategy is an ESS. To see this, suppose y weakly dominates x*, so that u(y, z) ≥ u(x*, z), ∀z ∈ Δ. Then, at z = x*, u(x*, x*) ≤ u(y, x*), and with z = y, u(x*, y) ≤ u(y, y)—so neither (1) nor (2) can hold. Second, every ESS, x*, is a Nash equilibrium, since for any y ≠ x*, either (1) or the first part of (2) is satisfied.
3.4 Nash Equilibrium
A strategic form game is given in pure strategies by {(Ai, ui)}, i = 1, …, n, or in mixed strategies by {(Xi, Ui)}, i = 1, …, n. Write Si to denote either Ai or Xi, with representative element si. A Nash equilibrium is a strategy ŝ ∈ S such that for all i,

    ui(ŝi, ŝ−i) ≥ ui(si, ŝ−i), ∀si ∈ Si.

In words, no player can gain from a unilateral deviation. Formally, each player must make his or her choice without knowledge of the other players' choices—as if players move simultaneously. However, for player i to justify the choice of ŝi, player i must somehow expect or anticipate that player j, j ≠ i, will choose ŝj. It is the profile ŝ−i that makes ŝi a good choice. So, attempting to develop a behavioral model around the no gain from unilateral deviation inequality inevitably leads to some consideration of how players reason when making
choices; and in particular about how they reason other players will act. Alternatively, it is possible to develop dynamic stories whereby behavior converges to the use of strategies that satisfy the Nash equilibrium condition. The following discussion illustrates that approach.
Convergence to Nash equilibrium
Let G be a strategic form game with strategy space Ai and payoff function ui for player i. Take Ai to be some interval in R. Define bi(a−i) = argmax ai∈Ai ui(ai, a−i), and suppose that this maximizer is unique and bi a twice differentiable function of a−i, so that the problem is "well-behaved" in that bi varies smoothly with a−i. For each i, the first-order condition

    Diui(bi(a−i), a−i) = 0

holds for all a−i interior, where Diui is the partial derivative of ui with respect to ai. In a dynamic context, at time t + 1 each player chooses a best response to the previous period's choices: ait+1 = bi(a−it). Let a* be a Nash equilibrium, so that ai* = bi(a*−i). If a−it is close to a*−i, then

    ait+1 − ai* = bi(a−it) − bi(a*−i) ≈ ∑j≠i [∂bi(a*−i)/∂aj](ajt − aj*).

From

    Diui(bi(a−i), a−i) = 0, ∀a−i,

differentiating this identity with respect to aj (j ≠ i) gives

    Diiui·(∂bi/∂aj) + Dijui = 0,

so that

    ∂bi/∂aj = −Dijui/Diiui.

Write δt = at − a* and let ρij = ∂bi(a*−i)/∂aj, so that, to first order, the dynamic system of equations becomes δt+1 = Rδt, where R is the matrix with zeros on the diagonal and ρij in position (i,j). For stability, the eigenvalues of R must all be less than 1 in absolute value, in which case Rt → 0.
When n = 2, the matrix is

    R = [ 0            ∂b1/∂a2
          ∂b2/∂a1      0       ].

This has two roots, λ = ±[(∂b1/∂a2)·(∂b2/∂a1)]^(1/2), so stability requires |(∂b1/∂a2)·(∂b2/∂a1)| < 1. For example, consider the duopoly model with demand given by P(a1, a2) = α − β(a1 + a2) and constant marginal cost c. Thus firm 1 maximizes π1(a1, a2) = a1(α − β(a1 + a2)) − ca1 to give a best response function b1(a2) = (α − c − βa2)/2β = (α − c)/2β − ½a2. Then, ∂b1/∂a2 = −½ (and likewise ∂b2/∂a1 = −½), so that (∂b1/∂a2)·(∂b2/∂a1) = ¼ and the best-response dynamics are stable.
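A sketch of the dynamics for this example (Python; the starting point and parameter values are illustrative), confirming convergence since (∂b1/∂a2)·(∂b2/∂a1) = ¼ < 1:

    alpha, beta, c = 10.0, 1.0, 1.0
    br = lambda q: (alpha - c - beta * q) / (2 * beta)

    a1, a2 = 0.0, 4.0                 # arbitrary starting quantities
    for _ in range(40):
        a1, a2 = br(a2), br(a1)       # simultaneous best-response adjustment
    print(round(a1, 6), round(a2, 6), (alpha - c) / (3 * beta))   # all 3.0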
3.5 Correlated Equilibrium
Consider two people who wish to meet but cannot communicate to arrange the meeting. There are two possible meeting locations, the beach (B) or the library (L). Unilaterally, each has the choice set {B, L}. Described as a game, suppose that the payoffs are given by G5 (where person 1 chooses the row):

    G5:         B         L
      B       2, 1      0, 0
      L       0, 0      1, 2

In terms of the Nash equilibrium, there are two pure strategy equilibria ((B,B) and (L,L)), and one mixed equilibrium, x = (x1, x2) = (⅔, ⅓), y = (y1, y2) = (⅓, ⅔). Suppose now that each player conditions his or her choice on the weather. Suppose also that the probability of sunny weather (S) is 50% and the probability of overcast or wet weather (O) is 50% (P(S) = ½ and P(O) = ½). So, there are two states: Ω = {S, O}, and now let a strategy be contingent on the state: qi: Ω → {B, L}. The strategy for i of going to the beach when sunny and the library when overcast is: qi(S) = B, qi(O) = L. If both use this strategy, then when it is sunny both go to the beach and when overcast both go to the library. Neither has an incentive to alter his or her strategy (qi gives at least as high a payoff as any alternative τi: Ω → {B, L}), and the expected payoffs are ½(2,1) + ½(1,2) = (1½, 1½). This strategy pair (q1, q2), in conjunction with the state space Ω and the distribution on Ω, P(S) = P(O) = ½, describes a correlated equilibrium, defined below. Apart from whatever interpretation might be attached to the state space, there is nothing fundamental in the choice of state space. Suppose that the state space
is the outcome of the toss of two fair coins, Ω = {(H,H), (H,T), (T,H), (T,T)} (so π(i,j) = ¼, i,j ∈ {H, T}), and both parties are "told" whether the coins match (the event M = {(H,H), (T,T)}), or do not match (the event NM = {(H,T), (T,H)}). If each adopts the strategy of choosing B when there is a match and L when there is no match, again they both either go to the library or both go to the beach. (Although they might end up in the library on a sunny day, being together is what matters for payoffs.) Again, this is a correlated equilibrium. Finally, retain the same state space, but alter the probability on states so that π(H,H) = 2/9, π(H,T) = 4/9, π(T,H) = 1/9, and π(T,T) = 2/9. Now, let player 1 observe the outcome of the first coin only and let player 2 observe the outcome of the second coin only. Thus, player 1 learns whether the true state is in P11 = {(H,H), (H,T)} or in P12 = {(T,H), (T,T)} but no more. Similarly, let P21 = {(H,H), (T,H)} and P22 = {(H,T), (T,T)}, so that 2 only observes the outcome of the second coin toss. Let q1(ω) = B if ω ∈ P11 and let q1(ω) = L if ω ∈ P12; and let q2(ω) = B if ω ∈ P21 and let q2(ω) = L if ω ∈ P22. Again, these strategies define equilibrium strategies in the sense that there is no alternative qi′ yielding i = 1,2 a higher expected payoff. Here, for example, if player 1 observes P11, the conditional probability that player 2 will choose B is ⅓, so the expected payoff from choosing B is 2·⅓ = ⅔ and the expected payoff from choosing L is 1·⅔ = ⅔. So, choosing B if P11 is observed is an optimal decision. Likewise, choosing L if P12 is observed is optimal. Similar calculations apply for player 2, so the strategies are a correlated equilibrium. Note that the joint distribution over actions is the same as that in the mixed Nash equilibrium.
Correlated equilibrium: definition and properties
Let (Ω, π) be a probability space and let Pi be a partition of Ω, i = 1, …, n. Let Qi = {qi: Ω → Ai | qi is Pi measurable} (the set of functions from Ω to Ai which are constant on elements of Pi). The partition Pi may be written as Pi = {Pik}, k = 1, …, ki, where ki is the number of elements of partition i; or, denoting by Pi(ω) the element of the partition containing ω, Pi = {Pi(ω)}ω∈Ω. Thus, the function qi must be constant on each Pik, and so one may write qi(Pik) or qi(Pi(ω)). If ω′, ω″ ∈ Pik, then qi(ω′) = qi(ω″). The interpretation is that if state ω is drawn, i learns that the state drawn is in Pi(ω) and selects an action knowing this information.
Definition 3.3 The collection {(Ω, π), {Pi}, {qi}} is a correlated equilibrium if ∀i, ∀τi ∈ Qi,

    E{ui(q−i(ω), qi(ω))} ≥ E{ui(q−i(ω), τi(ω))},

where for each i, qi is constant on each member of Pi.
Given the information received through ω (given Pi), player i is maximizing expected utility. This can be written in terms of information partitions. Expanding,

    ∑ω∈Ω π(ω)ui(q−i(ω), qi(ω)) ≥ ∑ω∈Ω π(ω)ui(q−i(ω), τi(ω)),

or

    ∑k π(Pik)E{ui(q−i(ω), qi(Pik)) | Pik} ≥ ∑k π(Pik)E{ui(q−i(ω), τi(Pik)) | Pik},

or, equivalently, for each i and each Pik with π(Pik) > 0,

    E{ui(q−i(ω), qi(Pik)) | Pik} ≥ E{ui(q−i(ω), ai) | Pik}, ∀ai ∈ Ai,

since E{ui(q−i(ω), qi(ω)) | Pi(ω)} is constant for all ω′ ∈ Pi(ω), because Pi(ω′) = Pi(ω), ω, ω′ ∈ Pik. To illustrate, consider the game G6, which has two pure strategy equilibria, (a11, a21) and (a12, a22). (Here, aik is pure strategy k of person i.)

    G6:          a21       a22
      a11      1, 0      0, 0
      a12      0, 0      0, 1
Let Ω = {H, T}, π(ω) = ½, ω ∈ Ω, P1 = {P11, P12} = {{H}, {T}}, P2 = {P21, P22} = {{H}, {T}}, and set qi(Pik) = aik, i = 1,2, k = 1,2. A fair coin is tossed and each player observes the outcome. With Pi(ω) having positive probability, the correlated equilibrium condition requires that

    E{ui(q−i(ω), qi(ω)) | Pi(ω)} ≥ E{ui(q−i(ω), ai) | Pi(ω)}, ∀ai ∈ Ai.
Suppose ω = H. Then π(P21 | P11) = 1. Write qi(ω) = qi(Pik) if ω ∈ Pik. Since q2(P21) = a21 and q2(P22) = a22, E {u1(q2(ω), q1(ω)) | P1(ω)} = u1(a11, a21) = 1. The
only alternative choice for 1 is a12, but setting τ1(P11) = a12 gives E{u1(q2(ω), τ1(ω)) | P1(ω)} = u1(a12, a21) = 0, so 1 is worse off. Similarly, if ω = T, π(P22 | P12) = 1 and q2(P22) = a22; the expected payoff to player 1 is 0 playing q1(ω) = q1(P12) and no deviation can improve on this. Observe that this correlated equilibrium gives the players an expected payoff of (½,½) = ½(1,0) + ½(0,1). Replacing the fair coin with one where π(H) = p ∈ [0,1], the same strategies continue to form a correlated equilibrium, and the associated expected payoff vector is (p, 1−p) = p(1,0) + (1 − p)(0,1). Thus, all payoffs on the line connecting (1,0) and (0,1) are correlated equilibrium payoffs, although only the endpoints are Nash equilibrium payoffs. This formulation of a correlated equilibrium lends itself to a broad range of interpretations (e.g. sunspot equilibria), but from a computational point of view there is a more natural formulation, where the state space is identified with the space of pure strategies. This is discussed next. Given a game {(Ai, ui)}, i = 1, …, n, there is a natural or canonical way to define a correlated equilibrium, using the strategy spaces as the state space according to Ω = ×iAi. Let π be a distribution on A. For the information structure, if a = (a1, …, an) ∈ A is drawn, player i is informed of ai. This defines an information partition of Ω for i, with Pi(a) = {a′ ∈ A | a′i = ai}. If a = (a1, …, an) is drawn, individual i observes ai, and given the distribution π, the conditional distribution on A−i is given by π(a−i | ai), and so i will choose a′i to maximize:

    ∑a−i∈A−i ui(a−i, a′i)π(a−i | ai).

Definition 3.4 If the draw of π, ai, is viewed as the "recommended" strategy and if this is the optimal choice for i (so that for each ai, ∑a−i ui(a−i, a′i)π(a−i | ai) is maximized by a′i = ai), then π is called a canonical correlated equilibrium.
In game G6, Ω = {(a11, a21), (a12, a21), (a11, a22), (a12, a22)}, and let π(a11, a21) = π(a12, a22) = ½, π(a12, a21) = π(a11, a22) = 0. This defines the canonical equilibrium.
Theorem 3.1 Let {(Ω, π), {Pi}, {qi}} be a correlated equilibrium. Then there is a canonical correlated equilibrium π* yielding the same distribution on actions and the same expected payoff to each player.
Proof (Sketch) If qi involves the same choice on two different elements of Pi, the inequalities still hold with a coarser partition. Let P*i, the partition generated by qi (with P*i(ω) the union of those Pi(ω′) on which qi(ω′) = qi(ω)), define a new partition for i, and let Q*i = {τi: Ω → Ai | τi is P*i measurable}. As before,

    E{ui(q−i(ω), qi(ω))} ≥ E{ui(q−i(ω), τi(ω))}, ∀τi ∈ Q*i.
In this case, because qi is different on each element of P*i, knowledge of the value of qi(ω) is the same as knowing the partition member P*i(ω). So, every τi ∈ Q*i can be written τi(ω) = ρi(qi(ω)), for some ρi: Ai → Ai. Thus,

E{ui(q−i(ω), qi(ω))} ≥ E{ui(q−i(ω), ρi(qi(ω)))}, ∀ρi: Ai → Ai.
Given q: Ω → A, a distribution π* is defined on A: π*(a) = π({ω ∈ Ω | q(ω) = a}). The interpretation is that a randomization device draws ω. Then i is told "qi(ω)," but not the value of ω. Player i may choose qi(ω), or having received the information qi(ω) may choose some alternative, ρi(qi(ω)). So, the previous expression may be written:

∑a ∈ A π*(a) ui(a−i, ai) ≥ ∑a ∈ A π*(a) ui(a−i, ρi(ai)), ∀ρi: Ai → Ai.
Writing π*(a−i | ai) for the conditional distribution of a−i given ai,

∑ai π*(ai) ∑a−i π*(a−i | ai) ui(a−i, ai) ≥ ∑ai π*(ai) ∑a−i π*(a−i | ai) ui(a−i, ρi(ai)).
So, for each ai with positive probability, it must be that

∑a−i π*(a−i | ai) ui(a−i, ai) ≥ ∑a−i π*(a−i | ai) ui(a−i, ρi(ai)),

or

∑a−i π*(a−i | ai) ui(a−i, ai) ≥ ∑a−i π*(a−i | ai) ui(a−i, a′i), ∀a′i ∈ Ai.
So, every correlated equilibrium associated with an abstract space (Ω, π) is strategically equivalent to one associated with a second space (A, π*). This completes the proof.

In this (canonical) formulation, a correlated equilibrium is a distribution π* on A such that for each i and each ai ∈ Ai,

∑a−i π*(a−i | ai) ui(a−i, ai) ≥ ∑a−i π*(a−i | ai) ui(a−i, a′i), ∀a′i ∈ Ai.
The canonical equilibrium may be summarized as follows. A point a in A is drawn according to the distribution π*. Player i is informed of the ith component of a, namely ai, with the expectation that i will choose this action. Given π*, player i can calculate the conditional distribution over A−i and the conditional expected payoff from each choice a′i. The inequality asserts that if i is told ai, this is in fact a best choice for i. One useful application of the canonical formulation is to prove:

Theorem 3.2. The set of Nash equilibrium payoffs is a subset of the set of correlated equilibrium payoffs.

Proof If x = (x1, …, xn) is a Nash equilibrium, for a ∈ A define π*(a) = ∏j xj(aj), and let qi(ai) = ai. This is a correlated equilibrium. To see this, suppose to the contrary that for some i, some ai with π*(ai) > 0, and some a′i,

∑a−i π*(a−i | ai) ui(a−i, ai) < ∑a−i π*(a−i | ai) ui(a−i, a′i),

or, since π*(a−i | ai) = ∏j≠i xj(aj),

∑a−i ∏j≠i xj(aj) ui(a−i, ai) < ∑a−i ∏j≠i xj(aj) ui(a−i, a′i).

Let x̃i be the strategy obtained from xi by moving the probability on ai to a′i; then ui(x̃i, x−i) > ui(xi, x−i), and this contradicts the fact that x is a Nash equilibrium. More directly, observe that π*(a−i | ai) = ∏j≠i xj(aj) is the product of the Nash strategies of the players other than i, and the distribution implied by π* and qi is the Nash strategy for i: a best response to x−i.

Finally, the set of correlated equilibrium payoffs has a simple structure.

Theorem 3.3. The set of correlated equilibrium payoffs is a convex set.

Proof To see this, note that the correlated equilibrium condition,

∑a−i π*(a−i | ai) ui(a−i, ai) ≥ ∑a−i π*(a−i | ai) ui(a−i, a′i),
may be written (multiplying both sides by π*(ai)): ∀ai, ∀a′i,

∑a−i π*(ai, a−i) ui(a−i, ai) ≥ ∑a−i π*(ai, a−i) ui(a−i, a′i).

Let πα and πβ be correlated equilibria and consider πθ = θπα + (1 − θ)πβ, θ ∈ [0, 1]. Since

∑a−i πα(ai, a−i) ui(a−i, ai) ≥ ∑a−i πα(ai, a−i) ui(a−i, a′i)

and

∑a−i πβ(ai, a−i) ui(a−i, ai) ≥ ∑a−i πβ(ai, a−i) ui(a−i, a′i),

these imply that

θ ∑a−i πα(ai, a−i) ui(a−i, ai) + (1 − θ) ∑a−i πβ(ai, a−i) ui(a−i, ai) ≥ θ ∑a−i πα(ai, a−i) ui(a−i, a′i) + (1 − θ) ∑a−i πβ(ai, a−i) ui(a−i, a′i),

or

∑a−i πθ(ai, a−i) ui(a−i, ai) ≥ ∑a−i πθ(ai, a−i) ui(a−i, a′i),

so that πθ is a correlated equilibrium. Since expected payoffs are linear in the distribution, the payoff vector under πθ is the corresponding convex combination of the payoffs under πα and πβ, so the set of correlated equilibrium payoffs is convex.
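Because the canonical conditions are linear in π*, the set of correlated equilibria is a polytope, and particular correlated equilibria can be computed by linear programming. The following sketch illustrates this; the payoff matrices are an assumption consistent with the payoffs quoted above for G6 ((1, 0) and (0, 1) on the diagonal, (0, 0) off it), since the matrix itself does not appear in the text.

```python
# A minimal sketch: computing a correlated equilibrium by linear programming.
import numpy as np
from scipy.optimize import linprog

u1 = np.array([[1.0, 0.0], [0.0, 0.0]])  # player 1's payoffs (rows: a11, a12)
u2 = np.array([[0.0, 0.0], [0.0, 1.0]])  # player 2's payoffs (cols: a21, a22)

# Variable: pi over the four pure profiles, flattened row-major.
# Canonical CE condition: each recommended action is a best reply to the
# conditional distribution it induces; these conditions are linear in pi.
A_ub, b_ub = [], []
for a in range(2):               # recommendation to player 1
    for ap in range(2):          # possible deviation
        if ap == a:
            continue
        row = np.zeros(4)
        for b in range(2):
            row[2 * a + b] = u1[ap, b] - u1[a, b]   # gain from deviating
        A_ub.append(row); b_ub.append(0.0)
for b in range(2):               # recommendation to player 2
    for bp in range(2):
        if bp == b:
            continue
        row = np.zeros(4)
        for a in range(2):
            row[2 * a + b] = u2[a, bp] - u2[a, b]
        A_ub.append(row); b_ub.append(0.0)

c = -(u1 + u2).flatten()         # maximize total expected payoff
res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=np.ones((1, 4)), b_eq=[1.0], bounds=[(0, 1)] * 4)
print(res.x.reshape(2, 2))       # a CE distribution; mass lies on the diagonal
```

Maximizing other linear objectives over the same constraint set traces out other points of the (convex) set of correlated equilibrium payoffs.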
Bibliography

Bernheim, B. D. (1984). "Rationalizable Strategic Behavior," Econometrica, 52, 1007–1028.
Moulin, H. (1979). "Dominance-Solvable Voting Schemes," Econometrica, 47, 1337–1351.
Pearce, D. G. (1984). "Rationalizable Strategic Behavior and the Problem of Perfection," Econometrica, 52, 1029–1050.
Reny, P. (1999). "On the Existence of Pure and Mixed Strategy Nash Equilibria in Discontinuous Games," Econometrica, 67, 1029–1056.
Simon, L. K. and Zame, W. (1990). "Discontinuous Games and Endogenous Sharing Rules," Econometrica, 58, 861–872.
Tan, T. C. and Werlang, S. R. (1988). "The Bayesian Foundations of Solution Concepts of Games," Journal of Economic Theory, 45, 370–391.
Vives, X. (1990). "Nash Equilibrium and Strategic Complementarities," Journal of Mathematical Economics, 19(3), 305–321.
4 Nash Equilibrium—Existence and Refinements

4.1 Introduction

In economics the most widely used notion of equilibrium behavior is Nash equilibrium. Because of its popularity, the question of existence of Nash equilibrium is important: what conditions on strategy spaces and payoff functions are sufficient to guarantee the existence of Nash equilibrium? This question is taken up here, along with some consideration of alternative solutions that are based on refining Nash equilibrium. Section 4.2 begins with the definition of Nash equilibrium in pure and mixed strategies. Standard equilibrium existence results rely on fixed point theorems which require (among other things) convexity of the best response correspondence. This property follows in the pure strategy case if utility functions are concave (or quasiconcave) in own action, and follows in the mixed strategy case from the linearity of payoffs in each player's mixed strategy. Section 4.3 discusses equilibrium and the role of fixed point theorems. Here, a few of the main fixed point theorems are reviewed and in Section 4.3.2 some basic applications to equilibrium existence theorems are given. In some cases, it is argued that the set of Nash equilibria is too large—in the sense that some equilibria are implausible and should be excluded for a number of reasons. This leads to formulations that impose additional conditions beyond the Nash equilibrium requirement. Perfect equilibrium is described in Section 4.4, proper equilibrium in Section 4.5, and persistent equilibrium in Section 4.6. These notions refine the set of Nash equilibria in different ways. Perfection and properness are based on models of mistakes whereby a player has positive probability of playing all pure strategies. Fully mixed equilibria where all pure strategies are played with strictly positive probability satisfy both the perfection and properness criteria. Most refinement models have no impact on fully mixed equilibria, but
one notion that is different in this respect is persistent equilibrium. Persistence tests for “stability” of a strategy, mixed or pure, and therefore can eliminate fully mixed strategy equilibria.
4.2 Nash Equilibrium

An n-player game is defined by the set of players, N = {1,…, n}, an action space Ai for each player i, and a payoff function for each player: ui: A → R, where A = ×j Aj. A strategy profile a* = (a*1,…,a*n) is a Nash equilibrium if for each i, ui(a*i, a*−i) ≥ ui(ãi, a*−i), ∀ãi ∈ Ai. A Nash equilibrium a* is a strict Nash equilibrium if for each i, ui(a*i, a*−i) > ui(ãi, a*−i), ∀ãi ≠ a*i. In games with a finite number of pure strategies, the mixed extension of the game is defined as follows. Let Xi = Δ(Ai) be the set of probability distributions on Ai. Let Ai have ki elements, Ai = {ai1, …, aiki}. Given that xi ∈ Xi, define preferences on X = ×j Xj as:

ui(x1, …, xn) = ∑a ∈ A [∏j xj(aj)] ui(a).

Let

Vi(ais | x−i) = ∑a−i [∏j≠i xj(aj)] ui(ais, a−i).

So, Vi(ais | x−i) is the expected payoff to i taking action ais when other players play x−i = {xj}j≠i. Thus, ui(x1, …, xn) = ∑s xi(ais) Vi(ais | x−i). This arrangement highlights the fact that ui(x1, …, xn) is linear in xi.
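As a quick numerical illustration of Vi(ais | x−i) and the linearity of ui in xi, consider the following sketch (the 2×2 payoff matrix is an arbitrary example, not one from the text).

```python
# Expected payoffs in the mixed extension of a two-player game.
import numpy as np

U1 = np.array([[3.0, 0.0],      # u1(a11, a21), u1(a11, a22)
               [1.0, 2.0]])     # u1(a12, a21), u1(a12, a22)
x2 = np.array([0.25, 0.75])     # a mixed strategy for player 2

V1 = U1 @ x2                    # V1(a1s | x2) for each pure action of player 1
x1 = np.array([0.5, 0.5])
u1 = x1 @ V1                    # u1(x1, x2) = sum_s x1(a1s) V1(a1s | x2)
print(V1, u1)                   # [0.75 1.75] 1.25; u1 is linear in x1
```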
In games with an infinite number of strategies, a mixed strategy is a probability measure, μi, on the set of pure strategies. In this case, the expected payoff to i is:

ui(μ1, …, μn) = ∫A ui(a1, …, an) dμ1(a1) ⋯ dμn(an).
Therefore, a strategy profile (μ*1, …, μ*n) is a Nash equilibrium if for each i, μ*i maximizes ui given μ*−i. Again, ui is linear in μi.
4.3 Existence of Equilibrium

Consider a utility function ui: ×j Aj → R. At strategy profile a ∈ A, the best response correspondence of i is defined:

bi(a−i) = {ai ∈ Ai | ui(ai, a−i) ≥ ui(a′i, a−i), ∀a′i ∈ Ai}.
This is a correspondence because in general there may be more than one choice for i that maximizes the payoff of i, given the choices of other players. If ai ∈ bi(a−i), there is no alternative choice for i giving a strictly higher payoff to i than ai, given that other players' choices are a−i. The correspondence associates to each a−i ∈ A−i a set of points bi(a−i) in Ai. Thus, bi: A−i ↠ Ai; or, with some redundancy, bi: A ↠ Ai, where bi does not depend on the ith coordinate, ai. (The notation ↠ indicates a set-valued mapping.) The product of the correspondences defines a correspondence b = ×i bi, with b: A ↠ A. Observe that a* is a Nash equilibrium if and only if a* ∈ b(a*). So, a strategy profile is a Nash equilibrium if and only if it is a fixed point of the best response correspondence. Thus the question of existence of Nash equilibrium leads naturally to consideration of fixed points. One central concept in the application of fixed point theorems is a property called upper-hemicontinuity of the best response correspondence. The following brief discussion notes that continuity of the payoff functions leads directly to upper-hemicontinuity of the best response correspondence. Suppose that a sequence {ak} in A converges to a, and let aki ∈ bi(ak−i). Then, for each k, ui(aki, ak−i) ≥ ui(ai, ak−i), ∀ai ∈ Ai. Suppose that {aki} converges to âi. Then, if ui is continuous, ui(âi, a−i) ≥ ui(ai, a−i), ∀ai ∈ Ai. Put differently, if (1) ak → a, (2) aki ∈ bi(ak−i), and (3) aki → âi, then âi ∈ bi(a−i). This property is called upper-hemicontinuity of bi, and is discussed in the next section. It is worth noting that these calculations apply also with mixed strategies. Furthermore, with mixed strategies the strategy set is convex and a player's payoff function is linear in own (mixed) strategy, so that in particular the set of best responses is convex.
4.3.1 Fixed points

A correspondence ϕ from a set U to a set V associates to each u ∈ U a subset ϕ(u) of V, ϕ(u) ⊆ V. Throughout, take V compact. The graph of ϕ is defined: Gϕ = {(u, v) | v ∈ ϕ(u)}.

Definition 4.1. Upper- and lower-hemicontinuity.
1. The correspondence ϕ is upper-hemicontinuous if Gϕ is closed: if (uk, vk) ∈ Gϕ and (uk, vk) → (u, v), then (u, v) ∈ Gϕ. Alternatively, if uk → u, vk → v, and vk ∈ ϕ(uk), then v ∈ ϕ(u).
2. The correspondence ϕ is lower-hemicontinuous if at any u: uk → u and v ∈ ϕ(u) imply that ∃ a sequence {vk} with vk → v and vk ∈ ϕ(uk).
3. A correspondence is continuous if it is both upper- and lower-hemicontinuous.

The correspondence ϕ is convex-valued at uo if ϕ(uo) is convex, and convex-valued if convex-valued at each uo ∈ U. The following fixed point theorems are used routinely.

Brouwer's fixed point theorem. Let f: C → C, where f is a continuous function and C ⊂ Rn a compact and convex subset of Rn. Then f has a fixed point: ∃ x*, f(x*) = x*.

Kakutani's fixed point theorem. Let ϕ: C ↠ C, where ϕ is an upper-hemicontinuous, convex-valued correspondence and C ⊂ Rn a compact and convex subset of Rn. Then ϕ has a fixed point: ∃ x*, x* ∈ ϕ(x*).

Debreu's social existence theorem. Let ui: X → R be a continuous function, with X = ×j Xj and Xi a convex compact subset of Rn for each i, and let ϕi: ×j≠i Xj ↠ Xi be a nonempty-valued continuous correspondence. Let

μi(x) = arg max {ui(x′i, x−i) | x′i ∈ ϕi(x−i)} and μ(x) = ×i μi(x);

then μ is upper-hemicontinuous, and if μ is convex-valued, it has a fixed point: x* ∈ μ(x*).
Glicksberg–Fan fixed point theorem. Let C be a compact convex subset of a locally convex linear topological Hausdorff space and let ϕ be a convex-valued correspondence with closed graph and ϕ: C ↠ C. Then ϕ has a fixed point.⁸

Contraction mapping fixed point. Let (X, d) be a complete metric space and f: X → X a contraction. Then f has a (unique) fixed point. Recall that (X, d) is complete if every Cauchy sequence converges, where {xl} is a Cauchy sequence if, given ε > 0, ∃ N such that d(xl, xk) < ε for l, k ≥ N. A function f is a contraction if ∃ β ∈ [0, 1) such that ∀x, y ∈ X, d(f(x), f(y)) ≤ β d(x, y).
⁸ A space X is a linear topological space if it is a linear (vector) space with a topology such that multiplication by a scalar and addition are continuous operations in the topology. A topological space is a Hausdorff space if for any pair of distinct points there are disjoint neighborhoods containing those points. A linear topological space X is locally convex if each point x has a neighborhood base of convex sets: there is a collection of open convex sets C such that any neighborhood of x contains a member of C.
Knaster–Tarski fixed point theorem. Let X be a complete lattice with partial order ≽, such that every chain has a supremum. Let f: X → X be monotone and such that ∃ a ∈ X with f(a) ≽ a. Then the set of fixed points of f is nonempty and there is a maximal fixed point. (Aside: An order ≽ on X is a partial order if it is reflexive (x ≽ x, ∀x ∈ X), transitive (x ≽ y, y ≽ z imply x ≽ z), and antisymmetric (x ≽ y, y ≽ x imply x = y). A function is monotone if a ≽ b implies f(a) ≽ f(b). A fixed point, x*, of f is a maximal fixed point if for any fixed point y (y = f(y)), x* ≽ y. A set C ⊂ X is a chain if x, y ∈ C, x ≠ y implies that x ≽ y or y ≽ x. A supremum of a chain C ⊂ X is a point c* ∈ X such that c* ≽ c, ∀c ∈ C and ∄ ĉ ≠ c* with c* ≽ ĉ and ĉ ≽ c, ∀c ∈ C.)
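Of these results, the contraction mapping theorem is also constructive: iterating f from any starting point converges to the fixed point. A minimal sketch (the function cos used here is just an illustrative contraction on an interval around its fixed point):

```python
# Fixed point of a contraction by iteration.
import math

def iterate_to_fixed_point(f, x0, tol=1e-12, max_iter=10_000):
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx - x) < tol:     # successive iterates have (nearly) converged
            return fx
        x = fx
    return x

# cos is a contraction near its fixed point (|d/dx cos x| = |sin x| < 1 there),
# so iteration converges to the unique solution of cos x = x.
print(iterate_to_fixed_point(math.cos, 0.7))   # ~0.7390851332
```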
4.3.2 Equilibrium

The following discussion illustrates the application of some of these fixed point theorems—in particular the Kakutani and Glicksberg–Fan theorems—in establishing the existence of equilibrium in games with finite and infinite numbers of pure strategies.
Finite pure strategies

In general, a finite game need not have an equilibrium in pure strategies. For example, consider the game G1, described next.
When both make the same choice player 1 gets a payoff of 1 and player 2 a payoff of −1; when they make different choices the payoffs are reversed. For this game there is no equilibrium in pure strategies. However, there is an equilibrium in mixed strategies, where each player plays (½, ½). (More generally, any game with a finite number of pure strategies always has a mixed strategy equilibrium.) Recall that for i, the best response correspondence bi: X ↠ Xi is defined by

bi(x−i) = arg max xi ∈ Xi ui(xi, x−i),

where ui(xi, x−i) = ∑s xi(ais) Vi(ais | x−i) is linear in xi and continuous in (xi, x−i). Therefore, the best response mapping b = ×i bi, with b: X ↠ X, is upper-hemicontinuous and convex-valued. So Kakutani's fixed point theorem applies. Therefore, in the finite pure strategy game there is an equilibrium in mixed strategies.
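For two-player zero-sum games such as G1, the mixed equilibrium can also be computed directly by linear programming: player 1 chooses a mixed strategy maximizing the payoff it guarantees against any column. A small sketch:

```python
# Solving the zero-sum game G1 (payoffs as described in the text) by LP.
import numpy as np
from scipy.optimize import linprog

U1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])    # player 1's payoffs; player 2's are -U1

# Variables (x1, x2, v): minimize -v subject to sum_s x_s u1(s,t) >= v for
# every column t, and x a probability vector.
c = np.array([0.0, 0.0, -1.0])
A_ub = np.hstack([-U1.T, np.ones((2, 1))])   # v - sum_s x_s u1(s,t) <= 0
b_ub = np.zeros(2)
A_eq = np.array([[1.0, 1.0, 0.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, 1), (0, 1), (None, None)])
print(res.x)    # [0.5 0.5 0.0]: the equilibrium strategy and the game value
```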
Infinite pure strategies

On a convex set, a function f(z) is quasiconcave if f(θz + (1 − θ)z′) ≥ min{f(z), f(z′)}, ∀θ ∈ [0, 1]. If Ai is a convex set and ui(ai, a−i) is quasiconcave in ai, for any a−i, then at any a−i the set of maximizers of ui is a convex set. So, if ui(ai, a−i) is quasiconcave in ai and continuous on A, then provided Ai satisfies the conditions for the Glicksberg–Fan theorem (convex and compact), there is an equilibrium in pure strategies. If ui(a) is continuous, but possibly not quasiconcave in ai, then the best response mapping may not be convex-valued. However, if Ai is a compact metric space, so also is the set of probability measures on Ai. In this case, again the Glicksberg–Fan theorem applies with preferences defined on the space of measures: ui(μ1, …, μn). And so there is an equilibrium in mixed strategies.
4.4 Perfect Equilibrium

The basic idea in perfect equilibrium is that people make mistakes—with small probability unintended choices are made—and awareness of this affects how people make choices. Adding mistakes to the model leads to a perturbed game where every possible choice of a player has positive probability. Equilibrium strategies selected as the probability of error becomes small are identified as perfect equilibria. Consider the game G2, analyzed next.
This game has two pure strategy Nash equilibria: (x1, x2) = (1, 0) = (y1, y2) and (x1, x2) = (0, 1) = (y1, y2). To model mistakes, suppose players are restricted to play each pure strategy with at least some small probability: for player 1, (x1, x2) ≥ (ε1, ε2), and for player 2, (y1, y2) ≥ (η1, η2). With the strategies restricted this way, the payoff to player 1 is x1y1 ≥ x1η1 > 0, so that an optimal choice for player 1 is to set x1 as large as possible: x1 = 1 − ε2. Likewise, for player 2, the optimal strategy is y1 = 1 − η2. This is the only equilibrium of the perturbed game. As the perturbations (ε, η) → 0, strategies converge: (x1, x2) → (1, 0) and (y1, y2) → (1, 0). These limits identify a perfect equilibrium. Note that the implausible outcome (a2, b2) is eliminated. The following discussion formalizes the concept. For finite games with #Ai = ki < ∞, let Xi = Δ(Ai) be the ki − 1 dimensional simplex, the set of mixed strategies of player i. Let ɛi = (ɛi1, …, ɛiki) with ɛij ≥ 0 and ∑j ɛij < 1. Say that ɛi is strictly positive if ɛi ≫ 0 (if ɛij > 0, ∀j = 1, …, ki). Write Xi(ɛi) = {xi ∈ Xi | xij ≥ ɛij}. Observe that Xi(ɛi) is compact and convex so that this modified game G(ɛ) has an equilibrium, x*, in mixed strategies, where x*i ∈ Xi(ɛi).
Say that ɛ = (ɛ1, …, ɛn) is strictly positive if each ɛi is strictly positive. This leads to the definition of a perfect equilibrium.

Definition 4.2. An equilibrium x* ∈ X is perfect if there exists a sequence {ɛl}, where each ɛl is strictly positive and ɛl → 0, such that G(ɛl) has an equilibrium xl with xl → x*.
Perfect equilibria exist since G(ɛl) has an equilibrium for each l and the sequence {xl} is in a compact set X. Note that x is a perfect equilibrium if and only if ∃ xl → x with xl ≫ 0, ∀l, such that for l sufficiently large xi is a best response to xl−i. The intuition for this is simple: if xij > 0, then in a perfect-equilibrium test sequence with restrictions ɛl → 0, the corresponding strategy satisfies xlij → xij > 0, so that xlij > ɛlij when l is large. For probabilities strictly positive in the limit, eventually on the sequence the strict positivity constraint is not binding. Finally, if a pure strategy aij is weakly dominated, then in any perfect equilibrium, xij = 0. Weak domination of aij implies there is some aik such that ui(aij, a−i) ≤ ui(aik, a−i) for all a−i, with strict inequality for some a−i. With xl fully mixed, Vi(aij | xl−i) < Vi(aik | xl−i), so that the constraint on aij binds, xlij = ɛlij, and xlij → 0 as l → ∞.
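A small computational sketch of the perturbed-game construction for G2 follows. The payoff matrix is not reproduced in the text; the code assumes the payoffs implied by the computation above, u1 = u2 = 1 at (a1, b1) and 0 elsewhere. Constrained best-reply iteration converges in this game, although in general best-reply dynamics need not converge.

```python
# Perfect equilibrium via perturbed games: as eps -> 0, equilibria of G(eps)
# converge to ((1,0),(1,0)), eliminating the equilibrium in weakly dominated
# strategies.  The payoff matrix is an assumption (see lead-in).
import numpy as np

U = np.array([[1.0, 0.0],
              [0.0, 0.0]])      # common payoffs for both players

def constrained_best_reply(V, eps):
    # Minimum weight eps on every pure strategy; remaining mass on the
    # (here unique) pure best response.  V = payoffs to the pure actions.
    x = np.full(len(V), eps)
    x[np.argmax(V)] = 1 - (len(V) - 1) * eps
    return x

def perturbed_equilibrium(eps, iters=50):
    x = np.full(2, 0.5)
    y = np.full(2, 0.5)
    for _ in range(iters):                      # best-reply iteration
        x = constrained_best_reply(U @ y, eps)  # player 1 against y
        y = constrained_best_reply(U.T @ x, eps)
    return x, y

for eps in [0.1, 0.01, 0.001]:
    print(eps, *perturbed_equilibrium(eps))     # -> ((1-eps, eps), (1-eps, eps))
```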
4.5 Proper Equilibrium

Proper equilibrium refines the perfect equilibrium criterion by restricting the way in which players may make small errors: players are less likely to make more serious errors which cause larger reductions to payoffs. Thus, while both perfection and properness are based on perturbations of strategies, proper equilibria are derived from a perturbed game satisfying more restrictions, so that the set of possible limits is smaller. Therefore, the set of proper equilibria is a subset of the set of perfect equilibria.

Definition 4.3 An ɛ-proper equilibrium is a strategy tuple x = (x1, …, xn) ≫ 0 such that for all i, if Vi(aij | x−i) < Vi(aik | x−i) then xij ≤ ɛ xik. A strategy profile x is a proper equilibrium if ∃ ɛl, ɛl ∈ (0, 1), with ɛl → 0 and xl → x, where xl is an ɛl-proper equilibrium.

Existence of equilibrium is guaranteed.

Theorem 4.1. Every finite game has a proper equilibrium.

Proof (Sketch) To prove existence of an ɛ-proper equilibrium, consider vectors of the form (ɛσ(1), …, ɛσ(ki)), using every permutation σ of the exponents (1, 2, …, ki). So, for ki = 3, the candidates are: (ɛ, ɛ², ɛ³), (ɛ, ɛ³, ɛ²), (ɛ², ɛ, ɛ³), (ɛ², ɛ³, ɛ), (ɛ³, ɛ, ɛ²), (ɛ³, ɛ², ɛ). In general, there are ki! such lists. For each such list ℓ, put Xi(ℓ) = {x ∈ Xi | x ≥ ℓ}, and let Xi(ɛ) be the convex hull of the union of the sets Xi(ℓ). So, Xi(ɛ) allows probabilities in a strategy to be ordered in terms of orders of magnitude. The game with strategy spaces Xi(ɛ) has an equilibrium and each equilibrium is ɛ-proper. For any such
equilibrium, letting ɛ → 0, the corresponding limiting strategy profile is a proper equilibrium. The following example illustrates the idea. Consider the games G3 and G4, where G4 augments G3 by adding a strictly dominated third pure strategy for each player (the payoff matrices are omitted here; the payoffs used below describe them).
In G3, the strategy pair (x1, x2) = (1, 0) and (y1, y2) = (1, 0) is the unique perfect equilibrium (x the strategy of the row player). When the game is augmented by the addition of strictly dominated strategies for both players, this gives game G4. In G4, (x1, x2, x3) = (1, 0, 0) and (y1, y2, y3) = (1, 0, 0) is a perfect equilibrium strategy pair, but so also is x* = (x1, x2, x3) = (0, 1, 0) and y* = (y1, y2, y3) = (0, 1, 0). This corresponds to the equilibrium (a2, b2), which is payoff dominated by (a1, b1). To see that (x*, y*) is a perfect equilibrium, let ɛk → 0 with ɛk > 0, and consider the perturbed game where both players must play each pure strategy with probability at least ɛk. When the column player uses the strategy y* adjusted this way, the expected payoff to each of the row's three choices is (−8ɛk, −7ɛk, −7 − 2ɛk), so that a2 is the best choice and is played with maximum probability (1 − 2ɛk), with strategies a1 and a3 played with minimum probability (ɛk). These converge to (0, 1, 0) as ɛk → 0. In contrast, the proper equilibrium criterion eliminates this strategy pair. Observe that if y = (y1, y2, y3) ≫ 0 (fully mixed), then for player 1, V1(a3 | y) < V1(a2 | y), since strategy a2 weakly dominates strategy a3. Thus, in an ɛ-proper equilibrium x3 ≤ ɛx2. From the perspective of player 2, given a strategy x for 1, the payoff to each of the three pure strategies is given by: (x1 − 9x3, −7x3, −9x1 − 7x2 − 7x3). Since x3 ≤ ɛx2 and ɛ is small, 9x3 < 7x2, so that x1 − 9x3 > −9x1 − 7x2 − 7x3. Thus, the first strategy gives a higher payoff than the third: V2(b1 | x) > V2(b3 | x), so, from the ɛ-proper criterion, y3 ≤ ɛy1. Turning again to player 1, the three pure strategies have payoffs: (y1 − 9y3, −7y3, −9y1 − 7y2 − 7y3). Consider y1 − 9y3: recalling that y3 ≤ ɛy1, y1 − 9y3 ≥ y1 − 9ɛy1 = y1(1 − 9ɛ). This is positive for small ɛ. Since −7y3 < 0, it must be that x2 ≤ ɛx1. From the earlier calculation, x3 ≤ ɛx2, so that x3 ≤ ɛ²x1 and x2 ≤ ɛx1. Therefore, as ɛ → 0, x1 → 1. The same reasoning applies to player 2, so the unique proper equilibrium is (x1, x2, x3) = (1, 0, 0) = (y1, y2, y3). In this example, the addition of dominated strategies enlarged the set of perfect equilibria, but not the set of proper equilibria. However, there is no direct connection between dominated strategies and proper equilibrium. In the following game, G5, there are three players. Player 1 chooses the row, player 2 the column, and player 3 the matrix. In a restricted version of this game, player 3 has
only one pure strategy—play 1. In that restricted game, there is a unique proper equilibrium where player 3 plays 1 and players 1 and 2 both choose 1. In the unrestricted game there are two proper equilibria: letting xi be the strategy of i, one sequence of ɛ-proper profiles converges to the first proper equilibrium, while a different sequence converges to the second. (The payoff matrices and explicit sequences are not reproduced here.)
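Returning to G4, the ɛ-proper conditions of Definition 4.3 are easy to check mechanically. The sketch below does so; the payoff matrix is reconstructed from the payoff expressions quoted in the text ((y1 − 9y3, −7y3, −9y1 − 7y2 − 7y3) for player 1, with the game symmetric), so treat the entries as an assumption.

```python
# Checking the eps-proper inequalities x_ij <= eps * x_ik whenever
# V_i(a_ij | x_-i) < V_i(a_ik | x_-i), for the game G4.
import numpy as np

U1 = np.array([[ 1.0,  0.0, -9.0],
               [ 0.0,  0.0, -7.0],
               [-9.0, -7.0, -7.0]])   # reconstructed from the text
U2 = U1.T                             # symmetric game

def is_eps_proper(x, y, eps, tol=1e-12):
    V1, V2 = U1 @ y, x @ U2           # payoffs to pure actions of players 1, 2
    for V, z in ((V1, x), (V2, y)):
        for j in range(3):
            for k in range(3):
                if V[j] < V[k] and z[j] > eps * z[k] + tol:
                    return False
    return True

eps = 0.01
x = np.array([1.0, eps, eps**2]); x /= x.sum()   # errors ordered by magnitude
print(is_eps_proper(x, x, eps))   # True: supports the proper equilibrium (a1, b1)

z = np.array([eps, 1.0, eps]); z /= z.sum()      # concentrated on (a2, b2)
print(is_eps_proper(z, z, eps))   # False: a3 would need to be an order of
                                  # magnitude less likely than a1
```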
4.6 Persistent Equilibrium

One feature of refinements such as perfectness or properness is that fully mixed equilibrium strategies satisfy the criteria—since the ɛ perturbations impose no restriction. In contrast, the notion of persistence can eliminate mixed strategy equilibria. Recall that given a game in pure strategies, (N, {Ai}, {ui}), the associated game in mixed strategies has payoffs defined on X = ×i Xi: for player i, ui(x1, …, xn). Define a retract to be a set Θ = ×i Θi with ∅ ≠ Θi ⊆ Xi and Θi closed and convex. Define an ɛ-neighborhood of Θ:

Nɛ(Θ) = {x ∈ X | ∃z ∈ Θ with ‖x − z‖ < ɛ}.
Call a retract Θ absorbing if and only if ∃ ɛ > 0 such that ∀x ∈ Nɛ(Θ), ∃z ∈ Θ such that, for each i, zi is a best response to x−i. In words, a retract is absorbing if, when a strategy profile is sufficiently close to Θ, there are best responses in Θ for each player. An absorbing retract, Θ, is minimal if there does not exist an absorbing retract Θ′ with Θ′ ⊆ Θ and Θ′ ≠ Θ. A persistent retract is a minimal absorbing retract.

Definition 4.4. A persistent equilibrium is an equilibrium in a minimal absorbing retract.
To illustrate, consider a 2×2 coordination game in which each player receives a positive payoff, say 1, when both choose the same action, and 0 otherwise (the payoff matrix itself is omitted here). There are three equilibria: (a) (x1, x2) = (y1, y2) = (1, 0); (b) (x′1, x′2) = (y′1, y′2) = (0, 1); and (c) the mixed equilibrium ((½, ½), (½, ½)).
Note that Θa = {((1, 0), (1, 0))} is a retract which is absorbing and minimal because it contains only one point. Since ((1, 0), (1, 0)) is an equilibrium, it is a persistent equilibrium. The same reasoning applies to Θb = {((0, 1), (0, 1))}. Finally, consider the mixed strategy equilibrium. If (y1, y2) is a strategy for 2 with y1 > ½, the unique best response for 1 is (x1, x2) = (1, 0); and if y1 < ½, the unique best response for 1 is (x1, x2) = (0,1). So, if Θc is an absorbing retract that contains ((½,½), (½,½)), then for Θc to be absorbing it must also contain ((0, 1), (0, 1)) and ((1, 0), (1, 0)). But then Θa ⊆ Θc and Θa ≠ Θc so that Θc is not a minimal absorbing retract. There is no minimal absorbing retract containing ((½,½), (½,½)), so the mixed strategy equilibrium is not persistent.
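The failure of persistence at the mixed equilibrium can be seen computationally: arbitrarily close to ((½, ½), (½, ½)), best replies are pure and jump away. A minimal sketch, assuming the coordination payoffs described above (1 for matching, 0 otherwise):

```python
# Near the mixed equilibrium of the coordination game, best replies are pure,
# so no small retract around ((1/2,1/2),(1/2,1/2)) can absorb them.
import numpy as np

U = np.eye(2)                  # assumed coordination payoffs: 1 on the diagonal

def pure_best_reply(y):
    V = U @ y                  # payoffs to the two pure actions against y
    br = np.zeros(2)
    br[np.argmax(V)] = 1.0
    return br

for delta in [0.1, 0.01, 0.001]:
    y = np.array([0.5 + delta, 0.5 - delta])     # close to (1/2, 1/2)
    print(delta, pure_best_reply(y))             # always (1, 0): far from (1/2, 1/2)
```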
Bibliography

Aliprantis, C. D. and Border, K. (1999). Infinite Dimensional Analysis: A Hitchhiker's Guide, 2nd edn. Berlin: Springer-Verlag.
Kalai, E. and Samet, D. (1984). "Persistent Equilibria in Strategic Games," International Journal of Game Theory, 13, 129–144.
Myerson, R. B. (1978). "Refinement of the Nash Equilibrium Concept," International Journal of Game Theory, 7, 73–80.
Selten, R. (1975). "Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games," International Journal of Game Theory, 4, 25–55.
Van Damme, E. (1991). Stability and Perfection of Nash Equilibria. Berlin: Springer-Verlag.
Zhou, L. (1994). "The Set of Nash Equilibria of a Supermodular Game is a Complete Lattice," Games and Economic Behavior, 7, 295–300.
5 Mechanism Design

5.1 Introduction

To motivate the topic of mechanism design, consider the following voting model. Suppose there are two candidates, a and b, for some position. Let N = {1, …, n} be the set of voters and suppose for simplicity that n is odd. With majority voting, each voter votes for a candidate, so that each voter i makes a choice ci from the set Mi = {a, b}. The candidate with the most votes wins. This defines a selection rule: g(c1, …, cn) = a if #{i | ci = a} ≥ (n + 1)/2, and g(c1, …, cn) = b if #{i | ci = b} ≥ (n + 1)/2. This collection defines a mechanism, characterized by a message space for each individual, Mi, and a rule, g, associating outcomes (a candidate in {a, b}) to messages. The task in mechanism design is to formulate an environment governing interaction between individuals (the Mi's and g) in a manner such that strategic behavior produces the desired outcome; and this must be achieved without knowledge of the characteristics of the individuals. In the example, whether i turns out to prefer a or b does not affect the available message set, Mi, for i. This is similar to the design of a constitution: there is a fixed rule for converting individual preferences into a decision for the population, and the decision varies with the characteristics of the population. However, the formulators of a constitution cannot anticipate what the preferences or characteristics of the members of a society will be at any point in time, so the rule must produce a satisfactory outcome as characteristics vary. Such a fixed rule is a mechanism. An implementing mechanism is one where equilibrium behavior within the mechanism produces the socially desired outcome (such as majority rule) at each state. In Section 5.2, the formal description and definition of a mechanism is given. There are a variety of environments in which mechanisms can be developed: Section 5.3 describes the complete and incomplete information environments. In one case, individuals know all parameters of the environment; in the other they have partial information, with the lack of information captured
by a distribution over the environment parameters. Since mechanism design centers on schemes that exploit individuals' information, this difference in information possessed by individuals is critical to the way in which such schemes are designed, so that variations in information in the incomplete information environment feed through to determine the appropriate outcome. The study of complete and incomplete information mechanisms is taken up in greater detail in Chapter 6. Subsequent sections here focus on complete information environments. In Section 5.4 implementing mechanisms are defined, and after that the special case of direct mechanisms is introduced in Section 5.4.1. In a direct mechanism, the individual's message space is the information possessed by the individual. Determining outcomes from a mechanism assumes an equilibrium concept. A robust notion of implementation utilizes dominant strategy equilibrium, discussed in Section 5.5. The revelation principle for dominant strategies is discussed in Section 5.5.1. The basic point of the revelation principle is that a message space no larger than the information space is adequate to signal all information possessed. However, direct mechanisms have shortcomings that are discussed here and relate to "full" and "weak" implementation. In this literature one of the early theorems is the Gibbard–Satterthwaite theorem, summarized in Section 5.5.3. This is a negative result—the only social choice functions implementable in dominant strategies are dictatorial. The Gibbard–Satterthwaite theorem relies on unrestricted domains of preferences: when the domain of preferences is restricted, positive results are possible. Section 5.5.4 considers two restrictions—one where preferences are single-peaked, and the other where preferences are quasilinear.
5.2 Mechanisms

The framework is the following. A group of n individuals and a set of outcomes, C, are given. Individual i is characterized by a utility function ui: C × Θ → R, where Θ = ×i Θi and Θi is the characteristic space of i. When ui depends only on θi, then θi fully defines i's preferences. A social choice rule is a correspondence from Θ to C: f: Θ ↠ C. Call f a social choice function if it is single valued. In this framework, at characteristics profile θ = (θ1, …, θn), f(θ) represents the socially desirable outcome. From the planning perspective, difficulties arise because the characteristics vector θ is unknown to the planner and f will generally vary with θ. Ideally, at any profile of characteristics, θ, as individuals interact the outcome of that interaction should be f(θ). Given f, the objective in mechanism design is to set up a scheme of interaction (a mechanism) such that f(θ) is the outcome that occurs when the characteristics profile is θ. A mechanism ℳ is a collection of message spaces, {Mi}i=1,…,n, and a rule g associating outcomes in C to message profiles. So, ℳ = ({Mi}, g), with g: M → C and M = ×i Mi, where Mi is
assigned to individual i as an action space. Given a representative element, mi ∈ Mi, for each i, a payoff can be assigned to each message profile m = (m1, …, mn) and characteristics profile θ = (θ1, …, θn): for individual i, the payoff is ui(g(m), θ). Observe that the mechanism, ℳ, is independent of the characteristics vector θ, reflecting the planner's lack of information (and hence the need for a mechanism). From the perspective of the participants this is not the case—since an individual at least knows his or her own characteristic. But to contemplate strategic choices, individuals either need to know the characteristics of others or have a distribution over the possible characteristics of others. The former is called the complete information case and the latter the incomplete information case. These are discussed in the next section.
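As a concrete instance, the majority-voting model of the introduction is itself a mechanism ({Mi}, g). A minimal sketch:

```python
# The majority mechanism from Section 5.1: message spaces M_i = {'a', 'b'}
# and the outcome rule g (n odd, so there are no ties).
def g(messages):
    votes_a = sum(1 for m in messages if m == 'a')
    return 'a' if votes_a >= (len(messages) + 1) // 2 else 'b'

print(g(['a', 'b', 'a']))   # 'a': the candidate with a majority of messages
```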
5.3 Complete and Incomplete Information Environments

If at any profile θ each individual knows the characteristics of others (at each θ, the characteristics profile θ is common knowledge), the environment is called a complete information environment. In that case, the specification of a mechanism implicitly defines a game where Mi is the strategy space of i and the payoff function is ui(g(·), θ): M → R. For each θ ∈ Θ, player i can consider the strategic choice of an element mi ∈ Mi. (Although ui is written on the assumption that an individual's preferences depend on the characteristics of others, a common assumption is that each individual's preferences are independent: ui(c, θ1, …, θn) = ui(c, θi), ∀c ∈ C. In the literature this property is sometimes called "private values.") In effect, the mechanism ℳ defines a game for each θ ∈ Θ: given θ, the payoff functions ui(g(·), θ) and the action spaces {Mi} define a strategic game Gℳ(θ). Suppose that at profile θ strategic considerations lead i to choose mi(θ), and that g(m1(θ), …, mn(θ)) = f(θ). Then, at θ, strategic considerations lead to the desired outcome. When this happens at every θ ∈ Θ, the mechanism is said to implement f. This leaves undefined the behavioral model implicit in "strategic considerations," but one has in mind equilibrium behavior in terms of some notion of equilibrium: for each θ and any equilibrium message profile m*(θ) in the game Gℳ(θ), the corresponding outcome g(m*(θ)) coincides with f(θ).

For the incomplete information environment, the situation is modified in that at each profile θ, individuals may not know the characteristics of other individuals, but instead attach a conditional probability to the characteristics of others, given their own characteristic. For individual i, let pi(θ−i | θi) be the conditional distribution. As written, this allows for the possibility that there may not be a common distribution, p, on Θ such that for each i, pi(θ−i | θi) = p(θ−i | θi)—there
may not be a common prior over characteristics. Here, the focus is on the case of common priors. Given the prior, p, at outcome c the ex ante expected utility of i is ∑θ ∈ Θ ui(c, θ)p(θ). At characteristics profile θ, the interim expected utility is ∑θ−i ui(c, (θi, θ−i)) p(θ−i | θi), and the ex post utility is ui(c, θ). Unlike the complete information case, the individual can only condition on "own characteristic," θi. The ex ante computation is based on payoff evaluation prior to knowing one's own characteristic; the interim computation is made at the point where the individual knows his or her own characteristic but does not know the characteristics of others. The ex post computation depends on the individual knowing all the agents' characteristics. As in the complete information case, the timing of events is such that each individual learns his or her own characteristic before potentially taking some action (otherwise messages cannot possibly depend on information). With a message space given, individual i observing θi selects a message mi ∈ Mi, so the message taken depends on the characteristic: mi(θi). In this case, with each individual i selecting a message according to his or her characteristic, the (interim) expected payoff to i with characteristic θi is

∑θ−i p(θ−i | θi) ui(g(mi(θi), m−i(θ−i)), (θi, θ−i)),

and the ex ante expected payoff is

∑θ ∈ Θ p(θ) ui(g(m1(θ1), …, mn(θn)), θ).
Again, as in the complete information case, if “strategic considerations” lead i to choose mi(θi) at characteristic θi, and if f(θ) = g(m1(θ1), …, mn(θn)), ∀θ ∈ Θ, then the mechanism implements f.
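The ex ante/interim distinction is easy to make concrete. The following sketch computes both for a toy two-player setting; the prior, outcome rule, and utilities are illustrative assumptions, not taken from the text.

```python
# Ex ante vs interim expected payoffs under a common prior (toy example).
from itertools import product

types = ['L', 'H']
p = {th: 0.25 for th in product(types, types)}   # uniform common prior on Theta

def m(theta_i):                 # a reporting strategy: report truthfully
    return theta_i

def g(m1, m2):                  # an outcome rule on message profiles
    return 1 if (m1, m2) == ('H', 'H') else 0

def u1(c, theta):               # player 1's utility (private values in theta[0])
    return c * (2 if theta[0] == 'H' else 1)

# Interim payoff for player 1 of type 'H': condition on own characteristic.
pH = sum(p[('H', t2)] for t2 in types)
interim = sum(p[('H', t2)] / pH * u1(g(m('H'), m(t2)), ('H', t2)) for t2 in types)

# Ex ante payoff: average over player 1's own type as well.
ex_ante = sum(p[th] * u1(g(m(th[0]), m(th[1])), th) for th in p)
print(interim, ex_ante)         # 1.0 and 0.5
```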
5.4 Implementation: Complete Information

Let ℳ = ({Mi}, g) be a mechanism. This defines a family of games, {Gℳ(θ)}θ ∈ Θ. Any message profile m ∈ M produces an outcome g(m). The choice of mi is a strategic consideration for individual i and is assumed to be chosen according to some notion of equilibrium behavior; how individuals come to behave this way is not considered further here—the form of equilibrium behavior is taken as given. Fix a notion of equilibrium (two of the most familiar criteria are dominant strategy equilibrium and Nash equilibrium). For any game Gℳ(θ), let Eℳ(θ) ⊂ M be the set of equilibrium strategies. The set of equilibrium outcomes at θ is then g(Eℳ(θ)) = {c ∈ C | c = g(m), m ∈ Eℳ(θ)}. The mechanism ℳ implements the social choice
function f if f(θ) = g(Eℳ(θ)), ∀θ ∈ Θ. Because this is an equality, it is often referred to as full implementation—since the equilibria coincide exactly with the social choice function. A weaker notion of implementation is that the social choice rule be attainable as an equilibrium: f(θ) ⊆ g(Eℳ(θ)), ∀ θ ∈ Θ. In this case the mechanism has an equilibrium that achieves f(θ) at θ, but may also have other equilibria with outcomes not equal to f(θ). This is often referred to as weak implementation. The fact that implementation is defined in terms of equilibrium directly raises the issue of what individuals are assumed to know when making decisions. In the complete information case, common knowledge of the game implies that each individual at any profile, θ = (θ1, …, θn), knows the full profile, even in the case where preferences satisfy private values. With private values, θi fully describes the characteristics of i, even though the conceptual framework of implementation requires that i know θ−i. Otherwise, the individual would not know the payoffs of others ({uj(c, θj)}j≠ i) at any given outcome c and so would not know the game being played. (In the context of incomplete information models, the issue does not arise since common knowledge of the game does not imply that individual i knows θ−i). In principle, the message space, Mi, may be arbitrarily complex. However, with Θ = × Θi, where Θi is the characteristic space of i, it is natural to contemplate mechanisms where each individual can signal their specific characteristics. A direct mechanism is one where the individual's message space, Mi, is the type space Θi.
5.4.1 Direct mechanisms

A mechanism has the form ℳ = ({Mi}, g) with g: M → C. In a direct mechanism, where Mi = Θi, "truth-telling" or "honest reporting" of characteristics by i corresponds to i choosing θi where i's characteristic is θi. This is most natural in the case of private values where ui(·, θ) = ui(·, θi). Direct mechanisms are in one sense the "smallest" mechanisms that one could consider. For implementation, the range of g, g(M), must in general be as large as the range of f, f(Θ). If distinct characteristics of i are to lead to distinct outcomes, then it must be that #Mi ≥ #Θi (so that a distinct mi may be associated to each θi). Apart from this, Θi as a message space is natural because it contains all the information relating to the individual. If a social choice function cannot be implemented by any mechanism, then it cannot be implemented by a direct mechanism either. However, if a social choice rule is implementable by some mechanism, then it is natural to ask if it can be implemented by a direct mechanism. The "revelation principle" addresses this issue, the main content of which is to assert that any equilibrium of an arbitrary mechanism can be identified with an equilibrium of a direct mechanism yielding the same outcome at each profile θ.
5.5 Dominant Strategy Implementation

A social choice rule, f, is implementable in dominant strategies if there is a mechanism implementing f in dominant strategies. That is:

Definition 5.1. f is implementable in dominant strategies if there is a mechanism, ℳ, such that for each i and θ, ∃ m* = (m*1, …, m*n) such that:
1. ui(g(m*i, m−i), θi) ≥ ui(g(mi, m−i), θi), ∀mi ∈ Mi, ∀m−i ∈ M−i.
2. f(θ) = g(m̂), for all dominant strategy equilibria, m̂, at profile θ.

In this case the mechanism, ℳ, is said to implement f in dominant strategies. Condition 1 gives the dominant strategy requirement; condition 2 requires that f(θ) = g(Eℳ(θ)), ∀θ ∈ Θ, where Eℳ(θ) is the set of dominant strategy equilibrium messages at profile θ.
5.5.1 The revelation principle: dominant strategies

According to the revelation principle, a message space equal to the characteristic space is adequate for implementation, in the following sense. Suppose that ℳ implements f, with dominant strategy equilibrium m*(θ) = (m*1(θ1), …, m*n(θn)) at each θ. Define a new (direct) mechanism, ℳ* = ({Θi}, g*), so that the message space of i is Θi and where g*(θ) = g(m*1(θ1), …, m*n(θn)), ∀θ ∈ Θ. From condition 1,

ui(g(m*i(θi), m−i), θi) ≥ ui(g(mi, m−i), θi), ∀mi ∈ Mi, ∀m−i ∈ M−i,

so that in particular at any θi, since m*j(Θj) ⊆ Mj for all j,

ui(g*(θi, θ̂−i), θi) = ui(g(m*i(θi), m*−i(θ̂−i)), θi) ≥ ui(g(m*i(θ̂i), m*−i(θ̂−i)), θi) = ui(g*(θ̂i, θ̂−i), θi), ∀θ̂i, ∀θ̂−i.

Therefore, in the mechanism ℳ, if reporting m*i(θi) at characteristic θi is a dominant strategy, then in the mechanism ℳ*, reporting θi is a dominant strategy at characteristic θi. This is called the revelation principle for dominant strategies. The principle suggests that a message space equal to Θi for i is adequate. However, there is an important caveat. Because g(Eℳ(θ)) ⊆ g(Eℳ*(θ)), ∀θ, f(θ) ⊆ g(Eℳ*(θ)), ∀θ. But at some θ the inclusion may be strict: whereas f(θ) = g(Eℳ(θ)), ∀θ ∈ Θ, it may be that f(θ) ≠ g(Eℳ*(θ)) for some θ ∈ Θ, although necessarily f(θ) ⊆ g(Eℳ*(θ)), ∀θ. Put differently, the reduction to a direct mechanism may enlarge the set of equilibrium outcomes. From the planner perspective, while the mechanism can achieve the desired outcomes, it may also produce undesired outcomes. The following example illustrates this point.
The revelation principle: full and weak implementation

Given an indirect mechanism that implements a social choice rule, the previous discussion shows that there is a direct mechanism which also yields those outcomes in equilibrium. But there may be additional equilibria in the direct mechanism which fail to match the social choice rule: the direct mechanism may have "too many" equilibria. This is discussed in the following example. Suppose there are two individuals with Θ1 = {θ′1, θ″1}, Θ2 = {θ′2, θ″2}, and C = {a, b, c, d}. Suppose that the preferences of player 1 are as follows: both types rank a and b (equally) at the top; type θ′1 strictly prefers c to d, and type θ″1 strictly prefers d to c. Let the preferences of player 2 be arbitrary except that c and d are ranked lower than either a or b by either of player 2's types. Finally, let the social choice function be f(θ) = a if θ1 = θ′1 and f(θ) = b if θ1 = θ″1, so that f is independent of θ2. In a direct mechanism that implements f, player 2's announcement does not affect the outcome. But, since a and b are equally ranked by both types of 1, announcing either is equally good for either type. So, in particular, truth-telling is a dominant strategy. But so also is the opposite: it is also a dominant strategy for 1 to announce θ″1 when 1's type is θ′1 and to announce θ′1 when 1's type is θ″1. In the direct mechanism this latter strategy produces the outcome b whenever 1's type is θ′1 and the outcome a whenever 1's type is θ″1—exactly the reverse of that required by the social choice function. However, enlarging the message space of player 2 resolves the problem. Let M1 = Θ1 and M2 = {m′2, m″2}, and associate to each message pair an outcome as follows (one assignment consistent with the subsequent discussion):

          m′2    m″2
  θ′1      a      c
  θ″1      b      d

Since type θ′1 strictly prefers c to d, type θ″1 strictly prefers d to c, and player 2, regardless of type, ranks c and d below a or b, any dominant strategy equilibrium produces the desired outcome.
5.5.2 Strategy-proofness

One of the key concepts in mechanism design is strategy-proofness, which is based on dominant strategy behavior. Recall that a direct mechanism is dominant strategy incentive compatible if ∀i, ∀θ = (θi, θ−i), and ∀θ̂i ∈ Θi, ui(g(θi, θ−i), θi) ≥ ui(g(θ̂i, θ−i), θi).

Definition 5.2. A social choice function f is strategy-proof if it is dominant strategy incentive compatible in the direct mechanism ℳ = ({Θi}, f), so that ∀i, ∀θ = (θi, θ−i), and ∀θ̂i ∈ Θi,

ui(f(θi, θ−i), θi) ≥ ui(f(θ̂i, θ−i), θi).
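For finite type and outcome spaces, Definition 5.2 can be checked by brute force. A minimal sketch (the dictatorial example at the end is illustrative):

```python
# Brute-force test of strategy-proofness for a social choice function f on
# finite type spaces: no type of any agent may gain by misreporting.
from itertools import product

def is_strategy_proof(f, u, type_spaces):
    n = len(type_spaces)
    for theta in product(*type_spaces):
        for i in range(n):
            for dev in type_spaces[i]:
                misreport = theta[:i] + (dev,) + theta[i + 1:]
                if u[i](f(misreport), theta[i]) > u[i](f(theta), theta[i]):
                    return False          # i gains by announcing dev at theta
    return True

# Example: a dictatorial rule (agent 0's top-ranked outcome) is strategy-proof.
C = ['a', 'b']
u = [lambda c, t: 1.0 if c == t else 0.0] * 2   # a type is the preferred outcome
f = lambda theta: theta[0]
print(is_strategy_proof(f, u, [C, C]))          # True
```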
As the example illustrated, dominant strategy incentive compatibility in the direct mechanism does not imply that the direct mechanism will guarantee the correct outcome at each preference profile. However, if a social choice rule is implementable in dominant strategies, then it must be that the social choice rule is dominant strategy incentive compatible in the direct mechanism. Strategy-proofness is a necessary condition for implementability in dominant strategies, but for general families of preferences strategy-proofness turns out to be a demanding condition—as the following discussion explains.
5.5.3 The Gibbard–Satterthwaite theorem

For implementation in dominant strategies, the key negative result is the Gibbard–Satterthwaite theorem: on unrestricted preference domains, few social choice functions are implementable in dominant strategies. Specifically, only dictatorial social choice functions are implementable in dominant strategies. A social choice rule f is dictatorial if there is some i such that f(θ) = arg maxc ∈ f(Θ) ui(c, θi), ∀θ. Finally, a social choice function f is unanimous if when c is top ranked by every individual, c is chosen by the social choice rule.

Theorem 5.1. (Gibbard–Satterthwaite) Suppose that the number of outcomes is finite and that the preference domain consists of all possible strict orderings (any possible strict ranking of the points in C corresponds to some preference of any player i). Suppose also that f is unanimous and f(Θ) contains at least three outcomes. Then f is strategy-proof if and only if it is dictatorial.

Proof Recent proofs of this theorem are given by Benoît (2000), Reny (2001), and Sen (2001). The proof here follows Benoît. Consider a social choice function f and suppose that f is unanimous and strategy-proof. A preference ordering for i is denoted θi and may be written as a vector of elements of the choice set in descending order according to preference: elements appearing earlier in the vector are more highly ranked. Write θ = (θ1, …, θn) for a particular profile of preferences. The proof is given through a series of observations and lemmata.

Preliminary observations:

F1: Suppose that f(θ1, …, θn) = a. If x is raised in i's ranking to give θ′i, then f(θ−i, θ′i) ∈ {a, x}. For, if not, then f(θ−i, θ′i) = c ∉ {a, x}, and c's ranking relative to a is unchanged from θi to θ′i. If i prefers c to a (at θi and hence at θ′i), then at θ, announcing θ′i would secure c; while if i prefers a to c, then at (θ−i, θ′i), announcing θi would secure a. Either way f fails strategy-proofness, so this is not the case.

F1a: Consider an arbitrary strict profile θ where b is ranked last by everyone. Then f(θ) ≠ b. To see why, suppose to the contrary that f(θ) = b. Strategy-proofness requires that i cannot change the outcome: for any θ′i, f(θ′i, θ−i) = b. Let a be top ranked under preference θ′i. Repeating for j ≠ i, with a top ranked by θ′j, f(θ′i, θ′j, θ−ij) = b. Continuing this way, f(θ′) = b, where for all l, a is top ranked by θ′l. This violates unanimity.
Lemma 5.1. Pick an arbitrary outcome b. There is an r ∈ {1, …, n} such that: F2a: b is chosen when individuals 1 up to r rank b first; F2b: b is not selected when individuals r up to n rank b last.

Starting with a profile θ0 such that b is bottom ranked by all, move b from bottom to top in the ranking of 1, leaving other relative rankings of 1 unchanged. This causes a switch to b or else the outcome is unchanged (by F1). Repeat for persons 2, 3, and so on until the switch of individual r causes b to be selected. By construction, f(θ0) = f(θ1) ≠ b and f(θ2) = b, where θ1 is the profile at which b is top ranked for individuals 1, …, r − 1 and bottom ranked for r, …, n, and θ2 is the profile at which b is top ranked for 1, …, r and bottom ranked for r + 1, …, n.
Consider θ2, where f(θ2) = b. Let R = {1,…, r} and Rc = {r + 1,…, n}. Since f is strategy-proof, no preference announcement by any individual j ∈ Rc = {r + 1, …, n} can change the outcome at profile θ2: otherwise, for some j ∈ Rc, there is θ̂j with f(θ̂j, θ2−j) = ĉ ≠ b; since b is bottom ranked by j at θ2j, ĉ is preferred to b, and so at θ2, announcing θ̂j produces an outcome preferred to b, violating strategy-proofness. Thus, for any j ∈ Rc and any θ̂j, f(θ̂j, θ2−j) = b. Repeating, with any l ∈ Rc, l ≠ j, using the same reasoning, strategy-proofness requires that for any θ̂l, f(θ̂j, θ̂l, θ2−jl) = b. And so on, so that f(θ̂Rc, θ2R) = b for any announcements θ̂Rc by the members of Rc. Also, at θ2, for i ∈ R = {1, …, r}, consider a change to i's ranking that leaves b top ranked. Since i can get b by announcing i's initial ranking, strategy-proofness implies that after the change in i's preference the value of f is unchanged. Repeating this argument for each i ∈ {1, …, r}, any changes to preferences in the group {1, …, r} that leave b top ranked for each individual leave the value of the social choice function unchanged at b. These observations imply F2a: for any profile θ, b = f(θ) is chosen if the first r individuals rank b first.

Next, in profile θ1, for i ∈ {1, …, r − 1}, regardless of the announcement of i, b is not chosen—otherwise at profile θ1, some i (who ranks b first) could announce a preference ordering that gets b, contradicting strategy-proofness and f(θ1) ≠ b. So, given i ∈ {1, …, r − 1}, for any θ̂i, f(θ̂i, θ1−i) ≠ b. Next, consider any j ≠ i, j ∈ {1, …, r − 1}. By the same reasoning, there is no θ̂j with f(θ̂i, θ̂j, θ1−ij) = b: if there were such a θ̂j, then f fails strategy-proofness at (θ̂i, θ1−i). Continuing this way,

f(θ̂1, …, θ̂r−1, θ1r, …, θ1n) ≠ b

for any announcements θ̂1, …, θ̂r−1.
Fix an arbitrary l ∈ {r, …, n} and, starting at a profile of the form (θ̂1, …, θ̂r−1, θ1r, …, θ1n), take some preference θ̂l such that b is bottom ranked by l at θ̂l. If, when l announces θ̂l, the outcome were b, then l, whose preference θ̂l ranks b last, could instead announce θ1l and obtain the original outcome, which l prefers to b. Since b is bottom ranked under preference θ̂l, strategy-proofness therefore requires that the outcome is not b. Proceeding through all members of {r, …, n} in this way, replacing each θ1l by an arbitrary preference with b ranked last, b can never be the outcome determined by f—otherwise the individual making the switch would announce the earlier preference. This implies F2b: for any profile θ, b is not chosen, b ≠ f(θ), if each individual in {r, …, n} ranks b last.

Lemma 5.2. Consider any profile, θ3, where k is top ranked for r, and where b is at the bottom of every individual's ranking. Then f(θ3) = k.

From θ3, construct θ4 in two steps. Raise k to the top in everyone's ranking. At this profile, unanimity implies that k is selected by f. Next raise b to the top in the rankings of each member of {1, …, r − 1}.
By F1, either b or k is chosen at θ4, and F2b implies it is not b, hence k is chosen: f(θ4) = k. Next, raise b to the second position for r to give θ5.
Outcome k is still chosen, f(θ5) = k: by F1 the outcome at θ5 is b or k, and were it b, then r, whose preference θ*r ranks k above b, could report the preference held at θ4 and obtain k, contrary to strategy-proofness.
Suppose that, at profile θ3, the outcome is g = f(θ3) ≠ k. Construct profile θ6 from θ3 by raising b to the top of the preference of 1, then 2, and so on up to person r − 1. By F2b, b is not chosen and then by F1, g continues to be chosen. Raise b in r's ranking to the second position. This gives θ6. At profile θ6 either g is still chosen or by F1, b is chosen. If g is still chosen, then b will be chosen if it is raised to the top of r's ranking, by F2a. However, at this current ranking of r, b is above g, so that r with preference profile θ*r can get b by announcing a profile with b top ranked—violating strategy-proofness. Hence, at profile θ6, b must be chosen.
Now, starting with θ6, raise k to the second position for each i ∈ {1, …, r − 1} and to the first position for each i ∈ {r + 1, …, n}. As k rises for each individual in {1, …, r − 1} it remains below b, and since b can be had with the original report, strategy-proofness requires that b still be selected by f as k rises to second position for each i ∈ {1, …, r − 1}. For i ∈ {r + 1, …, n}, taking r + 1 first, as k rises in r + 1's ranking, b must still be chosen, since if a flip to k occurs, the profile announced could have been announced initially to obtain k, contradicting strategy-proofness. The same argument applies to the subsequent individuals in {r + 1, …, n}. Thus, the assumption g = f(θ3) ≠ k implies that f(θ7) = b, where θ7 is the resulting profile.
But θ7 is in the category of profiles represented by θ5, so from the earlier discussion it must be that f(θ7) = k. Hence f(θ3) = k.

Lemma 5.3. Let θ be an arbitrary profile such that r ranks k ≠ b highest. Then f(θ) = k.

Take the profile θ and lower b to the lowest rank for all individuals to give profile θ′. From Lemma 5.2, f(θ′) = k. Now raise b in each individual's ranking to its original position. By F1, either b or k is chosen: f(θ) ∈ {b, k}.
Next, consider a profile θ8, with k ≠ c and b ≠ c, in which b is top ranked for i ∈ {1, …, r}, a is top ranked for i ∈ {r + 1, …, n}, and c is bottom ranked by everyone. Because θ8 has b ranked top for i ∈ {1, …, r}, by F2a (r is pivotal for b), f(θ8) = b.
As in the reasoning in Lemma 5.1, move c from the bottom to the top of the ranking for person 1, 2, and so on in turn until the pivotal voter for c is reached. Say this is voter m. This gives profile θ9, with f(θ9) = c. Thus, m is the pivotal voter for c. The next argument shows that m ≤ r. Suppose to the contrary that m > r. Then θ8m (in profile θ8) has a top ranked (and c is bottom ranked by all individuals in profile θ8). Recalling the previous discussion (Lemmas 5.1 and 5.2) with c replacing b, m is the pivotal voter for c, so that by Lemma 5.2, for any profile θ⋆ where k is top ranked by m and where c is bottom ranked by everyone, f(θ⋆) = k. With a replacing k, this implies that in profile θ8, f(θ8) = a. However, this contradicts the fact that f(θ8) = b. Hence m ≤ r. A symmetric argument (starting first with c and then b) implies that r ≤ m. Therefore m = r, so that voter r is pivotal with respect to both b and c. So, reasoning as before, f(θ) ∈ {k, c}. Thus, f(θ) ∈ {k, b} and f(θ) ∈ {k, c}, so that f(θ) = k. Finally, if k = b (in the statement of Lemma 5.3), use a third alternative a: by the argument above, r is pivotal for both a and c, so that k = b is chosen. This negative result suggests two directions of investigation. The first is to consider restricted classes of preferences, and the second to consider alternative notions of equilibrium. Restrictions on preferences are discussed next. In Chapter 6 implementation with alternative equilibrium notions is considered.
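The content of the theorem can be illustrated computationally: any unanimous, non-dictatorial rule on three or more outcomes must admit a profitable misreport somewhere. The sketch below finds one for the Borda rule (with alphabetical tie-breaking, an illustrative choice).

```python
# Exhibiting manipulability of the Borda rule, as Gibbard-Satterthwaite implies.
from itertools import permutations, product

C = ('a', 'b', 'c')

def borda(profile):
    score = {x: 0 for x in C}
    for ranking in profile:
        for pos, x in enumerate(ranking):
            score[x] += len(C) - 1 - pos
    return min(C, key=lambda x: (-score[x], x))   # break ties alphabetically

def find_manipulation():
    orders = list(permutations(C))
    for profile in product(orders, repeat=2):     # all two-voter profiles
        honest = borda(profile)
        for i in (0, 1):
            for lie in orders:
                rep = list(profile)
                rep[i] = lie
                new = borda(rep)
                if profile[i].index(new) < profile[i].index(honest):
                    return profile, i, lie, new   # i strictly prefers new outcome
    return None

print(find_manipulation())
# prints (true profile, manipulating voter, misreport, improved outcome)
```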
5.5.4 Preference domain restrictions

Two well-known preference domain restrictions are single-peakedness and quasilinearity. With single-peaked preferences each type of each individual has a top ranked outcome, with utility falling as one moves away from the top ranked point. Quasilinear preferences admit transfers.
Single-peaked preferences

For the following discussion, let C be a subset of a one-dimensional set such as R with ordering ≥.
Definition 5.3. Preferences are single-peaked if for each i and each θi, there is a pi(θi) such that pi(θi) ≥ c > c′ or c′ > c ≥ pi(θi) implies ui(c, θi) > ui(c′, θi).

Thus, c is preferred to c′ because c′ is "farther away" from pi(θi) than c. The leading example of a social choice function that is strategy-proof under single-peaked preferences is the median rule: f(θ) = median{p1(θ1), …, pn(θn)} (with some tie-breaking procedure adopted in the event of ties). To see why this is strategy-proof, consider the situation from the point of view of some individual i, fixing a profile of announcements of other individuals. If i at θi announces truthfully, this identifies i's top ranked choice, pi(θi); suppose this is below the median of top ranked choices, f(θ). Altering the announcement either identifies a lower top ranked point for i, which does not affect the median, or it identifies a higher top ranked point than pi(θi): if the new point remains below or at the median, the median is unchanged, while if it moves above, the median can only rise. But raising the median moves the outcome away from i's top ranked point at θi and thus leads to lower utility. A symmetric argument applies when pi(θi) is above the median.
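A brute-force verification of this argument for a small discrete case (peaks on {0, …, 4}, three voters, utility −|c − peak| as an illustrative single-peaked form):

```python
# Checking that the median rule is strategy-proof with single-peaked utilities.
import statistics
from itertools import product

points = range(5)                       # outcomes on a line
def u(c, peak): return -abs(c - peak)   # a single-peaked utility

def median_rule(peaks): return statistics.median(peaks)

ok = True
for peaks in product(points, repeat=3):           # all profiles of peaks
    honest = median_rule(peaks)
    for i in range(3):
        for lie in points:                        # every possible misreport
            rep = list(peaks)
            rep[i] = lie
            if u(median_rule(rep), peaks[i]) > u(honest, peaks[i]):
                ok = False
print(ok)    # True: no voter ever gains by misreporting a peak
```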
Quasilinear preferences

The utility to i in the quasilinear framework is ui(x, ti, θi) = vi(x, θi) + ti, where x denotes some outcome or project choice and ti a transfer or payment to i. Consider an environment where C = X × T, where T is a subset of Rn with points in T representing transfers to the n individuals. For example, if T = {t = (t1, …, tn) | ∑i ti ≤ 0}, aggregate transfers are nonpositive. With quasilinear preferences, the social choice function has the form f(θ) = (x(θ), t(θ)), where f: Θ → C. Say that x*: Θ → X is efficient if for each θ:

x*(θ) ∈ arg max x ∈ X ∑i vi(x, θi).
Efficiency restricts the class of social choice functions; and it turns out that this restriction in the quasilinear environment identifies a class of social choice functions that admit a nice characterization (Groves 1973).

Theorem 5.2. Suppose that x* is efficient. Then if ti(θ) = δi(θ−i) + ∑j≠i vj(x*(θ), θj), the social choice function f(θ) = (x*(θ), t1(θ), …, tn(θ)) is strategy-proof (incentive compatible in dominant strategies).
Proof To see this, suppose otherwise, so that at some θ = (θi, θ−i), for some i there is a θ̂i with:

vi(x*(θ̂i, θ−i), θi) + ti(θ̂i, θ−i) > vi(x*(θi, θ−i), θi) + ti(θi, θ−i).

Substituting for ti(θ̂i, θ−i) and ti(θi, θ−i), the δi(θ−i)'s cancel and

vi(x*(θ̂i, θ−i), θi) + ∑j≠i vj(x*(θ̂i, θ−i), θj) > vi(x*(θi, θ−i), θi) + ∑j≠i vj(x*(θi, θ−i), θj).

Putting x̂ = x*(θ̂i, θ−i), this says that ∑j vj(x̂, θj) > ∑j vj(x*(θ), θj), contradicting efficiency of x* at θ.
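A common member of the Groves class takes δi(θ−i) = −maxx ∑j≠i vj(x, θj), the Clarke (pivot) mechanism. A minimal sketch with illustrative reported valuations over a finite project set:

```python
# A Groves mechanism with the Clarke (pivot) choice of delta_i.  Reported
# valuations v_i(x, theta_i) are dictionaries over a finite set X; the
# numbers are illustrative only.
X = ['small', 'large']
v = [{'small': 3.0, 'large': 1.0},
     {'small': 0.0, 'large': 4.0},
     {'small': 1.0, 'large': 2.0}]

def efficient(vals):
    return max(X, key=lambda x: sum(vi[x] for vi in vals))

x_star = efficient(v)                       # maximizes total reported value
t = []
for i in range(len(v)):
    others = v[:i] + v[i + 1:]
    delta_i = -sum(vj[efficient(others)] for vj in others)   # pivot term
    t.append(delta_i + sum(vj[x_star] for vj in others))
print(x_star, t, sum(t))   # 'large', [0.0, -1.0, 0.0], -1.0
```

Note that the transfers sum to −1 here: as the discussion below points out, Groves transfers generally fail budget balance.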
There is a converse to this result (Green and Laffont 1979), giving conditions under which efficiency and strategy-proofness together imply that the transfers must have the form given above.

Theorem 5.3. Let V be the set of all functions from X to R. Suppose that {vi(·, θi) | θi ∈ Θi} = V and suppose that x* is efficient. If f(θ) = (x*(θ), t1(θ), …, tn(θ)) = (x*(θ), t(θ)) is strategy-proof, then for each i, ti has the form ti(θ) = δi(θ−i) + ∑j≠i vj(x*(θi, θ−i), θj).

Proof To see this, given (x*(θ), t(θ)), define δi(θ) = ti(θ) − ∑j≠i vj(x*(θi, θ−i), θj). If δi(θ) is shown to be independent of θi, the result follows. Suppose that for some θi, θ̂i, δi(θi, θ−i) ≠ δi(θ̂i, θ−i). To be specific, say δi(θi, θ−i) = δi(θ̂i, θ−i) + ε for some ε > 0.⁹ Dominant strategy incentive compatibility (announcing θ̂i is the best choice for type θ̂i) implies that:

(5.1) vi(x*(θ̂i, θ−i), θ̂i) + ti(θ̂i, θ−i) ≥ vi(x*(θi, θ−i), θ̂i) + ti(θi, θ−i),

so that necessarily

vi(x*(θ̂i, θ−i), θ̂i) + ∑j≠i vj(x*(θ̂i, θ−i), θj) ≥ vi(x*(θi, θ−i), θ̂i) + ∑j≠i vj(x*(θi, θ−i), θj) + ε.

⁹ Alternatively, δi(θ̂i, θ−i) = δi(θi, θ−i) + ε, and an identical argument applies.

Since for any function g in V there is a θ′i such that vi(·, θ′i) = g(·), one can find a θ′i such that, writing x̂ = x*(θ̂i, θ−i),

(5.2) vi(x̂, θ′i) = −∑j≠i vj(x̂, θj) + ε/2,

and

(5.3) vi(x, θ′i) = −∑j≠i vj(x, θj), for x ≠ x̂.

So, x̂ is the unique maximizer of vi(x, θ′i) + ∑j≠i vj(x, θj), and since x* is efficient, x*(θ′i, θ−i) = x̂ = x*(θ̂i, θ−i). Strategy-proofness then implies that δi(θ′i, θ−i) = δi(θ̂i, θ−i), because announcing θ̂i rather than θ′i (or conversely) leaves the project choice unchanged and alters i's payoff only through δi, so that δi(θ′i, θ−i) ≥ δi(θ̂i, θ−i) and the opposite inequality both hold. Truthful announcement by type θ′i then yields the payoff:

(5.4) vi(x̂, θ′i) + ∑j≠i vj(x̂, θj) + δi(θ′i, θ−i) = ε/2 + δi(θ̂i, θ−i),

using equation (5.2) above. At the profile (θ′i, θ−i), announcing θi instead yields the payoff vi(x̄, θ′i) + ∑j≠i vj(x̄, θj) + δi(θi, θ−i), where x̄ = x*(θi, θ−i). From equation (5.3), vi(x̄, θ′i) + ∑j≠i vj(x̄, θj) ≥ 0, so it follows that

(5.5) vi(x̄, θ′i) + ∑j≠i vj(x̄, θj) + δi(θi, θ−i) ≥ δi(θi, θ−i) = δi(θ̂i, θ−i) + ε.

Recalling that ε > ε/2, comparison of (5.4) and (5.5) shows that type θ′i is strictly better off announcing θi—contradicting the assumption of strategy-proofness. Thus, it must be that δi(θi, θ−i) = δi(θ̂i, θ−i) for all θi, θ̂i.
However, while strategy-proofness is a strong property, there are other desirable properties that the mechanism may not enjoy. In particular, participation may not be voluntary, and the project may not be self-financing. Assuming that individuals get a utility of 0 in the absence of the project, it may be that at some values of θ, for some i, vi(x*(θ), θi) + ti(θ) < 0, in which case the individual is better off withdrawing support from the project. Also, the construction leaves open the possibility that the transfers may not break even: at some θ it may be that ∑i ti(θ) ≠ 0. In that case, at the given θ, there must be a transfer to or from the group, so that efficiency is conditional on this transfer. This raises the question of what is possible when a budget balance condition is imposed, so that ∑i ti(θ) = 0, ∀θ. Without further restrictions, the result is negative.

Theorem 5.4. Let V be the set of all functions from X to R. Suppose that {vi(·, θi) | θi ∈ Θi} = V. There is no strategy-proof social choice function f(θ) = (x*(θ), t1(θ), …, tn(θ)) = (x*(θ), t(θ)) with x* efficient and t = (t1, …, tn) satisfying the budget balance condition for all θ.
Bibliography

Benoît, J.-P. (2000). "The Gibbard–Satterthwaite Theorem: A Simple Proof," Economics Letters, 69, 319–322.
Clarke, E. H. (1971). "Multipart Pricing of Public Goods," Public Choice, 11, 17–33.
Gibbard, A. (1973). "Manipulation of Voting Schemes: A General Result," Econometrica, 41, 587–601.
Green, J. and Laffont, J.-J. (1979). Incentives in Public Decision Making. Amsterdam: North-Holland.
Groves, T. (1973). "Incentives in Teams," Econometrica, 41, 617–631.
Jackson, M. O. (2003). "Mechanism Theory," in The Encyclopedia of Life Support Systems. Oxford: EOLSS Publishers.
Mas-Colell, A., Whinston, M., and Green, J. (1995). Microeconomic Theory. Oxford: Oxford University Press.
Reny, P. (2001). "Arrow's Theorem and the Gibbard–Satterthwaite Theorem: A Unified Approach," Economics Letters, 70, 99–105.
Satterthwaite, M. (1975). "Strategy-proofness and Arrow's Conditions: Existence and Correspondence Theorems for Voting Procedures and Social Welfare Functions," Journal of Economic Theory, 10, 187–217.
Sen, A. (2001). "Another Direct Proof of the Gibbard–Satterthwaite Theorem," Economics Letters, 70, 381–385.
6 Implementation: Complete and Incomplete Information

6.1 Introduction

When individuals participate in a market, trades and a market price are determined; when people vote to elect representatives, a governing party is selected. Decisions in a framework or environment of rules and procedures translate into outcomes. Often, the framework is designed with specific objectives in mind. Rules of voting are commonly structured to achieve a majority or representative outcome; complex rules govern the trading of financial instruments such as mutual funds, with a view to promoting equity, efficiency, and other desiderata. At an abstract level, one can view these rules or procedures as game forms. Mechanism design concerns developing game forms whose equilibria coincide with some objective, as specified by a social choice function that associates outcomes with states of the world. When such a game exists with the property that at each state of the world the equilibrium outcomes agree with the social choice function, the social choice function is said to be implementable. In what follows, these issues are discussed: the development of game forms or mechanisms whose equilibrium outcomes coincide with the value of some social choice rule at each state. Two distinct types of environment are commonly considered in the literature: the complete and incomplete information environments. Because these differ fundamentally with regard to their information structures they are considered separately. What individuals know (and hence can reveal or act on) is fundamental to each of the two environments. In the complete information case, individuals know the environment completely; with incomplete information they are assumed to have a distribution over parameters describing the environment.
The first part of the discussion considers the complete information case; the incomplete information environment is considered subsequently. Given a framework of interaction between individuals, the outcome depends on what people actually do, and predicting this depends on utilizing some notion of equilibrium behavior. The discussion here does not attempt to consider how individuals behave, but considers the consequences for mechanism design of different models of behavior (different notions of equilibrium). Although Nash equilibrium is a natural candidate as the equilibrium notion, there are a number of alternatives (such as undominated Nash equilibrium) which lead to different results. These are discussed separately. Section 6.2 begins with a brief overview of the complete information model. Section 6.3 and its subsections focus on implementation with Nash and other equilibrium notions in the strategic form game. For strategic form mechanisms, there are many solution concepts that may be employed. Dominant strategy equilibrium is one candidate, but it is demanding in the sense that "few" social choice rules are implementable (the Gibbard–Satterthwaite theorem). This has led to consideration of alternative solution concepts to implement a social choice function. Nash equilibrium is considered in Section 6.3.2. The key requirement under Nash equilibrium is a monotonicity condition on the social choice function. Following the discussion of Nash implementation, implementation in undominated Nash equilibrium is considered in Section 6.3.3. The primary impact of moving to undominated Nash equilibrium as the solution concept is that monotonicity of the social choice function is no longer necessary. Using virtual implementation as the solution concept is yet another alternative. With virtual implementation (discussed in Section 6.3.4), the problem is cast in terms of social choice functions mapping from states to lotteries over outcomes. In a special case of this formulation, preferences are linear so that the monotonicity condition is satisfied directly. As with undominated Nash equilibrium, virtual implementation yields strong results: "nearly all" social choice rules are virtually implementable. All these approaches involve working with strategic form games. An alternative approach uses extensive form games. This is discussed in Section 6.4. With subgame perfection as the solution concept, it turns out that very permissive results are obtained: any social choice rule satisfying mild restrictions is implementable in subgame perfect equilibrium. Section 6.5 examines implementation in incomplete information environments. The structure of the incomplete information environment is reviewed in Section 6.5.1. The key requirements of incentive compatibility and participation are described in Section 6.5.2. These are necessary conditions that any implementable social choice rule must satisfy. In Section 6.5.3, the evaluation of a social choice function according to ex ante, interim, and ex post criteria is discussed. Section 6.5.4 introduces strategic form mechanisms and Section 6.5.5 considers Nash implementation. The key requirement here is
Bayesian monotonicity, the analog of monotonicity in the complete information environment. Before turning to details, it is worth remarking on the method of presentation. In general the proofs of the results presented here are lengthy. At the same time, there is usually a clear underlying insight that is the key. The aim and approach here is to highlight the key ideas and illustrate them by example, without providing proofs.
6.2 Complete Information Environments

In the complete information environment, at any state, each player knows all the parameters of the environment—they are common knowledge. Within this framework, strategic form mechanisms are introduced first. The central role of preference reversal and monotonicity in the implementation of a social choice function is described. The scope for implementation of a social choice function depends on the form of game (strategic or extensive) and the choice of equilibrium concept. There are many possible equilibrium criteria—Nash, undominated Nash, subgame perfect, and so on. These are discussed in turn.
6.3 Strategic Form Mechanisms (Complete Information)

6.3.1 The environment

A set of n individuals and a set of outcomes C are given. The set of possible preferences of i is Ui. For ui ∈ Ui, ui: C → R. Alternatively, let the set of possible preferences of i be parametrized by Θi, so that individual i with characteristic θi has preference ui(·, θi): C → R. In this formulation the set of preferences of i is identified with Θi and Ui = {ui(·, θi) | θi ∈ Θi}. More generally, one may have ui(·, θ1, …, θn) = ui(·, θ), so that each individual's preferences depend on the entire profile of θs, in which case Ui = {ui(·, θ) | θ ∈ Θ}, with Θ = ×Θi. A social choice function may be defined on U = ×Ui, or on the parameter space, Θ. Taking the second formulation, a social choice function is a function from Θ to C: f: Θ → C. A mechanism, ℳ = ({Mi}, g), is a collection of message spaces, {Mi}, and a function g: M → C, with M = ×Mi. Given a mechanism, ℳ, and a preference profile θ, there is an associated strategic form game, Gℳ(θ), where player i's strategy space is Mi and i's payoff function is ui(g(m), θ). This game is parametrized by θ, and all individuals are assumed to know θ—even when ui(·, θ) depends only on θi.
6.3.2 Nash implementation

A point c ∈ C is an equilibrium outcome of the game Gℳ(θ) if m* is an equilibrium message profile and c = g(m*). Write NEℳ(θ) for the set of Nash equilibrium strategies of Gℳ(θ).
Definition 6.1. The social choice function f is Nash implementable if there is a mechanism, ℳ, such that f(θ) = g(NEℳ(θ)), ∀θ ∈ Θ.

One may view this condition as implicitly consisting of two conditions: (1) at each θ there is a Nash equilibrium, m(θ) ∈ NEℳ(θ), giving the desired outcome (f(θ) = g(m(θ))); and (2) every equilibrium at θ determines the same outcome, f(θ) (f(θ) = g(NEℳ(θ))). When condition 1 is satisfied the mechanism weakly implements the social choice function; when conditions 1 and 2 are both satisfied the mechanism fully implements the social choice function. Often, in Nash equilibrium weak implementation is not very demanding. For example, suppose that for each individual there is some fixed worst outcome common across all types—for individual i there is some wi ∈ C with ui(wi, θ) ≤ ui(c, θ), ∀c ∈ C and ∀θ ∈ Θ. Consider the following mechanism to implement f. Let Mi = Θ, ∀i, fix some θ̄, an arbitrary point in Θ, and define g according to:

g(m) = f(θ′) if mj = θ′ for all j; g(m) = wi if all individuals except i announce a common θ′; g(m) = f(θ̄) otherwise.
With this game form, at θ, mi = θ, ∀i, is a Nash equilibrium (a unilateral deviation produces the deviator's worst outcome). So the weak implementation requirement is satisfied. However, at any preference profile θ, picking any θ′ ∈ Θ, the message profile mi = θ′, ∀i, is a Nash equilibrium, since no individual has a unilaterally improving deviation. At each θ, every outcome in the range of f is a Nash equilibrium outcome: f(Θ) ⊆ g(NEℳ(θ)), ∀θ ∈ Θ. In so far as the aim is to achieve the outcome f(θ) at profile θ, the mechanism ℳ is useless—although it weakly implements f. So, it is important to find mechanisms that are tight—in the sense that f(θ) = g(NEℳ(θ)), ∀θ—and this concern leads to considering the scope for elimination of "bad" equilibria (where, say, the outcome f(θ′) is obtained at preference profile θ). The central idea in this matter is monotonicity or the closely related idea of preference reversal.
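A quick enumeration makes the multiplicity concrete. The sketch below builds a hypothetical three-agent, two-state instance of the game form reconstructed above (the states, outcomes, and utility numbers are invented, with w a common worst outcome) and lists the pure Nash equilibrium outcomes at each true state; every outcome in the range of f appears.

```python
import itertools

# Hypothetical example: three agents, two states, outcomes {a, b, w}.
states = ["s1", "s2"]
f = {"s1": "a", "s2": "b"}
theta_bar = "s1"                                       # arbitrary default
u = {("a", "s1"): 2, ("b", "s1"): 1, ("w", "s1"): 0,   # w: common worst outcome
     ("a", "s2"): 1, ("b", "s2"): 2, ("w", "s2"): 0}

def g(m):
    for s in states:
        if all(mi == s for mi in m):
            return f[s]        # unanimous announcement of s
        if sum(mi != s for mi in m) == 1:
            return "w"         # punish a sole deviator from s
    return f[theta_bar]

def is_nash(m, true_state):
    return all(u[(g(m), true_state)] >= u[(g(m[:i] + (d,) + m[i+1:]), true_state)]
               for i in range(3) for d in states)

for true_state in states:
    eqs = [m for m in itertools.product(states, repeat=3) if is_nash(m, true_state)]
    print(true_state, sorted({g(m) for m in eqs}))   # ['a', 'b'] at each state
```

At each true state both a and b arise as equilibrium outcomes, so the mechanism weakly, but not fully, implements f.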
Monotonicity

For full implementation in Nash equilibrium, the key necessary condition is monotonicity.

Definition 6.2. The social choice function f is monotonic if a ∈ f(θ) and a ∉ f(θ′) imply that there is some i and b ∈ C such that ui(a, θ) ≥ ui(b, θ) and ui(a, θ′) < ui(b, θ′).
In moving from θ to θ′ there is a preference reversal for i in the ranking of a and b. Succinctly, f satisfies monotonicity if ∀θ, θ′ ∈ Θ:

[a ∈ f(θ) and a ∉ f(θ′)] ⇒ ∃i, ∃b ∈ C with ui(a, θ) ≥ ui(b, θ) and ui(a, θ′) < ui(b, θ′).

The concept can be expressed in terms of lower contour sets. For c ∈ C let Li(θ, c) = {c′ ∈ C | ui(c′, θ) ≤ ui(c, θ)}. Li(θ, c) is the set of points in C that are ranked no higher than c by i at profile θ. In the definition of monotonicity above the requirement called for was the existence of some point b ranked no higher than a by i at θ, but ranked above a by i at θ′: b ∈ Li(θ, a) but b ∉ Li(θ′, a). Thus, f is monotonic if:

[a ∈ f(θ) and Li(θ, a) ⊆ Li(θ′, a), ∀i] ⇒ a ∈ f(θ′).
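On finite domains, the lower contour set formulation makes monotonicity mechanical to check. The sketch below is illustrative only; the two-state, two-agent example and its utility numbers are invented.

```python
# Check Maskin monotonicity of a social choice function on a finite domain.
# States, outcomes, and utilities below are hypothetical.

C = ["a", "b", "c"]
states = ["t1", "t2"]
f = {"t1": "a", "t2": "b"}
u = [  # u[i][state][outcome]
    {"t1": {"a": 2, "b": 1, "c": 0}, "t2": {"a": 1, "b": 2, "c": 0}},
    {"t1": {"a": 1, "b": 2, "c": 0}, "t2": {"a": 1, "b": 2, "c": 0}},
]

def lower_contour(i, state, a):
    """L_i(state, a): outcomes ranked no higher than a by i at state."""
    return {c for c in C if u[i][state][c] <= u[i][state][a]}

def is_monotonic(f):
    for t in states:
        for t2 in states:
            a = f[t]
            if f[t2] == a:
                continue
            # need some i and b with b in L_i(t, a) but b not in L_i(t2, a)
            reversal = any(lower_contour(i, t, a) - lower_contour(i, t2, a)
                           for i in range(len(u)))
            if not reversal:
                return False
    return True

print("monotonic:", is_monotonic(f))   # True for this example
```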
This can be depicted graphically in the case where C ⊂ R2.
If f(θ) = a then any equilibrium at θ, say m*, that produces outcome a at profile θ must be such that no player can deviate and improve their payoff. For i this means that all deviations generate outcomes such as b in the shaded region of the figure—the lower contour set for i through a at preference profile θ. When the state is θ′, if each player makes the same choice, m*, then a deviation by i again produces points only in the shaded area. But in the monotonic case a point such as b raises i's utility in state θ′, and so such a point gives i in state θ′ the incentive to deviate. Thus, it becomes possible to ensure that a is not an equilibrium outcome at θ′. However, in the nonmonotonic case at strategy profile m* in state θ′ there is no deviation raising i's utility. If this is the case for every i, then a is an equilibrium outcome at state θ′, contrary to the requirement that a ≠ f(θ′). The following paragraph gives a more formal argument showing that implementability in Nash equilibrium implies monotonicity. More formally, for ℳ to implement f, if a = f(θ) then a ∈ g(NEℳ(θ)), so ∃m* with a = g(m*) and ui(a, θ) = ui(g(m*), θ) ≥ ui(g(m′i, m*−i), θ), ∀m′i ∈ Mi. Therefore, for all i,

g(Mi, m*−i) = {g(m′i, m*−i) | m′i ∈ Mi} ⊆ Li(θ, a).

When f fails monotonicity at a, a ≠ f(θ′) but for all i, Li(θ, a) ⊆ Li(θ′, a), so that

g(Mi, m*−i) ⊆ Li(θ′, a).

Therefore, for all i, ui(a, θ′) ≥ ui(g(m′i, m*−i), θ′), ∀m′i ∈ Mi. Thus, m* ∈ NEℳ(θ′) and a ∈ g(NEℳ(θ′)) but a ≠ f(θ′), so f(θ′) ≠ g(NEℳ(θ′)). In sum, monotonicity is a necessary condition for Nash implementation. If a social choice function is
implementable in Nash equilibrium it is monotonic. For a large class of environments, monotonicity is sufficient: monotonicity and no-veto power are sufficient conditions for implementation when there are at least three individuals. No-veto power is discussed next.
No-veto power

No-veto power is defined as follows.

Definition 6.3. A social choice function f: Θ → C satisfies no-veto power if for each θ ∈ Θ, #{i | ui(a, θ) ≥ ui(c, θ), ∀c ∈ C} ≥ n − 1 implies that a = f(θ).

In words, if a is top ranked by all, or all but one, individuals at state θ, then it is the value of the social choice function at θ. This is a mild condition: if, for example, there is competition between individuals at each state so that different individuals have different top ranked outcomes, then the condition is vacuously satisfied, since the condition #{i | ui(a, θ) ≥ ui(c, θ), ∀c ∈ C} ≥ n − 1 is satisfied at no θ ∈ Θ. With these concepts in place the key result is (Maskin 1977, 1999):

Theorem 6.1. Suppose that (1) n ≥ 3, (2) f is monotonic, and (3) f satisfies no-veto power. Then f is implementable in Nash equilibrium.

The next section focuses on pinning down necessary and sufficient conditions for implementation in Nash equilibrium. The discussion this far has focused on social choice functions. Similar results go through when considering social choice correspondences (f(θ) may be set valued), with appropriate modification of the definitions. For example f(θ) = g(NEℳ(θ)) is then a set valued equality. The next section allows for social choice correspondences.
Necessary and sufficient conditions for Nash implementation

While monotonicity is a necessary condition, no-veto power is not. And, while no-veto power is a mild restriction, for completeness it is of interest to identify necessary and sufficient conditions for Nash implementation. The issue is discussed next, and motivates the necessary and sufficient condition given below. For B ⊆ C, let

Ti(B, θ) = {a ∈ B | ui(a, θ) ≥ ui(b, θ), ∀b ∈ B}.

Ti(B, θ) is the set of points in B that are top ranked by i at state θ. In this notation no-veto power is: ∀θ ∈ Θ, if ∃i with a ∈ ∩j≠i Tj(C, θ), then a = f(θ). Consider a mechanism that implements f. For each θ and a ∈ f(θ), pick ma(θ) ∈ NEℳ(θ) so that g(ma(θ)) = a.¹⁰ Define

Qi(a, θ) = {g(mi, ma−i(θ)) | mi ∈ Mi},

¹⁰ The discussion here considers social choice correspondences, in line with the characterization of Moore and Repullo (1990).
the set of alternatives that i can "reach." In this notation, the monotonicity property is captured by:

(1) If at θ′, a ∈ Ti(Qi(a, θ), θ′), ∀i, then a ∈ f(θ′).

If ma(θ) is chosen at θ′ and, for each i, a ∈ Ti(Qi(a, θ), θ′), so a is the best choice for i given ma−i(θ), then a will be an equilibrium outcome at θ′. So, it must be that a ∈ f(θ′) for f to be implementable. The following two conditions relax the no-veto requirement:

(2) Given θ′ and i, let c ∈ Qi(a, θ), so c = g(mi, ma−i(θ)) for some mi ∈ Mi. If c ∈ Ti(Qi(a, θ), θ′) and c ∈ Tj(g(M), θ′), j ≠ i, then c ∈ f(θ′).

(3) Given θ′, if c ∈ Ti(g(M), θ′), ∀i, then c ∈ f(θ′). Equivalently, ∩i Ti(g(M), θ′) ⊆ f(θ′).

Condition (2) says that if at the equilibrium ma(θ), i can deviate to get outcome c, and at θ′, c is top ranked for i among alternatives that i could get playing against ma−i(θ), and in addition c is top ranked globally for each j ≠ i over the range of the outcome function, g(M), then c ∈ f(θ′). This is clearly a necessary condition since, under the circumstances, the outcome c is unavoidable as an equilibrium at θ′. Condition (3) says that if at θ′, c is top ranked by everyone over the set g(M), then c ∈ f(θ′). Since c is top ranked on the range of the mechanism, any m with g(m) = c is a Nash equilibrium, so it must be that c ∈ f(θ′). All three conditions are clearly necessary; it turns out that they are also sufficient. Without reference to a fixed implementing ma(θ), define the condition μ as follows.

Condition μ: There is a set B ⊆ C and, for all i, ∀θ, ∀a ∈ f(θ), a set Ci(a, θ) ⊆ B with a ∈ Ti(Ci(a, θ), θ), such that ∀θ′ ∈ Θ:

1. Given θ′, a ∈ Ti(Ci(a, θ), θ′), ∀i, implies a ∈ f(θ′).
2. If c ∈ Ti(Ci(a, θ), θ′) and c ∈ Tj(B, θ′), j ≠ i, then c ∈ f(θ′).
3. c ∈ ∩i Ti(B, θ′) implies c ∈ f(θ′).
Condition μ is necessary and sufficient for Nash implementation (Moore and Repullo 1990):

Theorem 6.2. Suppose that n ≥ 3. The social choice correspondence f is Nash implementable if and only if it satisfies condition μ.

The case n = 2 requires independent consideration and is discussed in detail in Moore and Repullo (1990). Although monotonicity appears central, it turns out that a "minor" change in the equilibrium criterion (to undominated Nash equilibrium) alters the nature of the characterization of implementable social choice functions dramatically; in particular, the monotonicity condition is no longer necessary. This is discussed next.
6.3.3 Undominated Nash implementation

A Nash equilibrium is undominated if no player is playing a weakly dominated strategy.

Definition 6.4. Given θ, m* is an undominated Nash equilibrium at θ if (1) for each i, ui(m*, θ) ≥ ui(mi, m*−i, θ), ∀mi ∈ Mi, and (2) ∄ m′i such that ui(m′i, m−i, θ) ≥ ui(m*i, m−i, θ), ∀m−i, with strict inequality for some m−i.

The first condition is the Nash condition; the second is the undominated requirement. Write UNEℳ(θ) for the set of undominated Nash equilibrium strategies of Gℳ(θ). The social choice function f is implementable in undominated Nash equilibrium if there is a mechanism ℳ such that f(θ) = g(UNEℳ(θ)), ∀θ ∈ Θ. Although this might appear to be a minor restriction on Nash equilibrium, the effect from the implementation perspective is dramatic. To see this, consider the following example where Θ1 = {θ̂1, θ̃1} and Θ2 = {θ2}: player 2 has just one type. Let C = {a, b, x, y}, and let f(θ̂) = a and f(θ̃) = b, where θ̂ = (θ̂1, θ2) and θ̃ = (θ̃1, θ2). Indifference curves for individual 1 are as depicted in the figure: u1(a, θ) = u1(b, θ) at both states, while the ranking of x and y reverses across states; and ∀θ ∈ Θ, u2(a, θ) = u2(b, θ) > max{u2(x, θ), u2(y, θ)}. So, f is not monotonic (because L1(θ̂, a) ⊆ L1(θ̃, a), and player 2 has just one type).
In the mechanism ℳ, individual 1 chooses from M1 = {T, B} and individual 2 chooses from M2 = {L, R}, with outcomes g(T, L) = a, g(B, L) = b, g(T, R) = x, and g(B, R) = y. Whatever the value of θ, picking L is a strictly dominant strategy for individual 2. For individual 1, at state θ̂, T is as good as B if individual 2 chooses L, and strictly better than B if individual 2 chooses R. So, at θ̂ the unique undominated Nash equilibrium is (T, L), with outcome a. At state θ̃, B is as good as T if individual 2 chooses L, and strictly better than T if individual 2 chooses R. So, at θ̃ the unique undominated Nash equilibrium is (B, L), with outcome b. Thus, ℳ implements f in undominated Nash equilibrium. The following condition is necessary for implementation of a social choice correspondence in undominated Nash equilibrium. In essence, a preference reversal holds, but not necessarily at the equilibrium outcome—as the example above illustrates.
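The reasoning in the example can be verified directly. In the sketch below the cardinal payoff numbers are hypothetical, chosen only to match the ordinal structure just described (player 1 indifferent between a and b in both states, with the ranking of x and y reversing; player 2 ranking a and b above x and y).

```python
import itertools

# Game form from the example: g(T,L)=a, g(B,L)=b, g(T,R)=x, g(B,R)=y.
g = {("T", "L"): "a", ("B", "L"): "b", ("T", "R"): "x", ("B", "R"): "y"}
M1, M2 = ["T", "B"], ["L", "R"]

# Hypothetical utilities consistent with the ordinal description.
u1 = {"hat":   {"a": 2, "b": 2, "x": 1, "y": 0},   # theta-hat: x preferred to y
      "tilde": {"a": 2, "b": 2, "x": 0, "y": 1}}   # theta-tilde: y preferred to x
u2 = {"a": 2, "b": 2, "x": 0, "y": 0}              # player 2: one type, L dominant

def undominated_nash(state):
    eqs = []
    for m1, m2 in itertools.product(M1, M2):
        nash = (u1[state][g[(m1, m2)]] >= max(u1[state][g[(d, m2)]] for d in M1) and
                u2[g[(m1, m2)]] >= max(u2[g[(m1, d)]] for d in M2))
        # weak dominance checks for each player's equilibrium message
        dom1 = any(all(u1[state][g[(d, c)]] >= u1[state][g[(m1, c)]] for c in M2) and
                   any(u1[state][g[(d, c)]] > u1[state][g[(m1, c)]] for c in M2)
                   for d in M1 if d != m1)
        dom2 = any(all(u2[g[(c, d)]] >= u2[g[(c, m2)]] for c in M1) and
                   any(u2[g[(c, d)]] > u2[g[(c, m2)]] for c in M1)
                   for d in M2 if d != m2)
        if nash and not dom1 and not dom2:
            eqs.append(((m1, m2), g[(m1, m2)]))
    return eqs

for state in ["hat", "tilde"]:
    print(state, undominated_nash(state))   # hat -> (T,L)=a; tilde -> (B,L)=b
```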
Condition Q: f: Θ → C satisfies condition Q if, given θ, θ′ with x ∈ f(θ) and x ∉ f(θ′), either 1 or 2 holds, where:

1. ∃i, ∃a, b ∈ C with ui(a, θ) > ui(b, θ), ui(a, θ′) ≤ ui(b, θ′), and ∃c, d ∈ C with ui(c, θ′) ≤ ui(d, θ′).
2. ∃i, a, b ∈ C with ui(a, θ) ≥ ui(b, θ) and ui(a, θ′) < ui(b, θ′).

This is a remarkably weak condition. For example, in (2) there is no requirement relating preferences at f(θ) to the feasibility of implementation of f. (The reason is probably clear from the example above, where the existence of points x and y that have no connection to f provides a preference reversal that is sufficient for implementation in undominated strategies.) For the general case, in conjunction with no-veto power, condition Q is sufficient for implementation in undominated Nash equilibrium.

Theorem 6.3. Suppose that n ≥ 3, f satisfies condition Q and no-veto power. Then f is implementable in undominated Nash equilibrium.

See Palfrey and Srivastava (1991) for further discussion.
6.3.4 Virtual implementation

Virtual implementation takes an alternative approach to implementation. Rather than the implementation of the social choice function f, consider the implementation of a social choice function h that is close to f. For this to make sense requires that there is some notion of distance between social choice functions; and for it to be appropriate, it ought to be the case that when h and f are close in terms of distance, they are close in terms of preferences of the individuals (give approximately the same utility). Both of these matters are resolved simultaneously by formulating preferences on distributions over outcomes and defining social choice rules as functions from states into distributions over outcomes. Thus, let Δ(C) be the set of distributions on outcomes and define a social choice function to be a function from Θ to Δ(C): f: Θ → Δ(C). With C finite, Δ(C) is a finite dimensional simplex. With the Euclidean metric, points x, y in Δ(C) are within ε of each other if ‖x − y‖ ≤ ε. Two social choice functions, f and h, are ε close if ‖f(θ) − h(θ)‖ ≤ ε, ∀θ ∈ Θ. Define ρ(f, h) = supθ∈Θ ‖f(θ) − h(θ)‖.

Definition 6.5. A social choice function, f, is virtually implementable if ∀ε > 0, ∃ a social choice function h, with ρ(f, h) ≤ ε, where h is implementable.

A preference ordering for i at state θ is given by an ordering Ri(θ) on Δ(C), or a utility function Ui(x, θ), Ui(·, θ): Δ(C) → R. Writing A for the (finite) set of outcomes, it is assumed that given θ ∈ Θ there are individuals i, j and a, a′ ∈ A such that Ui(δa, θ) > Ui(δa′, θ) and Uj(δa, θ) < Uj(δa′, θ), so that at any θ ∈ Θ not all individuals' preferences on A are identical.
These preferences implicitly define a ranking on A, via distributions with support on just one point in A. If δa is the distribution placing probability 1 on a ∈ A, then let ui(a, θ) = Ui(δa, θ). Given θ and i, and assuming no ties in the ranking of the lotteries {δa}a∈A, let the (K) elements of A be ordered such that ui(a1, θ) > ui(a2, θ) > … > ui(aK, θ), so that ak is ranked higher than ak+1 by i at θ, k = 1, …, K − 1.

Definition 6.6. A preference, Ui(·, θ), on Δ(A) is monotone if x ≠ y and ∑_{k=1}^{r} x_{ak} ≥ ∑_{k=1}^{r} y_{ak}, r = 1, …, K, imply that Ui(x, θ) > Ui(y, θ).

The impact of monotonicity is to ensure that appropriate "preference reversals" occur. To see this, suppose there are just two elements in A: A = {a1, a2}. If Ui(δa1, θ) > Ui(δa2, θ) and Ui(δa1, θ′) < Ui(δa2, θ′), then for x, y ∈ Δ(A) with xa1 > ya1, monotonicity gives Ui(x, θ) > Ui(y, θ) and Ui(x, θ′) < Ui(y, θ′).
von Neumann–Morgenstern preferences

Note that although the definition of preferences on lotteries is not assumed to be von Neumann–Morgenstern, that is a natural interpretation, in which case Ui(x, θ) = ∑a ui(a, θ) xa. Some discussion with von Neumann–Morgenstern preferences will clarify the role of lotteries. Let fc(θ) denote the probability of c at state θ with the social choice function f(θ). Then with f, the expected utility of i in state θ is ∑c ui(c, θ) fc(θ). If f and h are ε close then:

|∑c ui(c, θ) fc(θ) − ∑c ui(c, θ) hc(θ)| ≤ ∑c |ui(c, θ)| |fc(θ) − hc(θ)|.

Let maxc |ui(c, θ)| ≤ K, ∀θ, so that the difference in expected utility is at most K ∑c |fc(θ) − hc(θ)| ≤ K √(#C) ‖f(θ) − h(θ)‖ ≤ K √(#C) ε. Thus, when f and h are close, so are the expected utilities.
With von Neumann–Morgenstern preferences, indifference curves are linear and (Maskin) monotonicity is satisfied automatically on the interior of Δ(C). The formulation has an important implication as far as monotonicity is concerned. Consider i and ui(·, θ) and write ui(θ) for the vector {ui(c, θ)}c∈C. Then ui(θ)·f(θ) = ∑c ui(c, θ) fc(θ) and the indifference curve through f(θ) is ℐi(f, θ) = {ξ ∈ Δ(C) | ui(θ)·ξ = ui(θ)·f(θ)}—a hyperplane intersected with Δ(C). For preferences at θ and θ′ to be different it must be that ℐi(f, θ) ≠ ℐi(f, θ′), since preferences are linear and are fully defined by any one hyperplane or half-space. Take a social choice function mapping to the interior of Δ(C). Since θ and θ′ represent different preferences, ℐi(f, θ) ≠ ℐi(f, θ′), so in any neighborhood of f(θ) there are lotteries ranked below f(θ) at θ but above f(θ) at θ′: the required preference reversal always exists. Thus, monotonicity is satisfied for every social choice function mapping to the interior of Δ(C).
When individuals have von Neumann–Morgenstern preferences, indifference curves are either everywhere equal or agree nowhere. The case of #C = 3 is depicted in the figure. With the assumption that preferences are monotonic, Abreu and Sen (1991) show:

Theorem 6.4. Suppose that n ≥ 3. Any social choice function is virtually implementable in Nash equilibrium.

A similar result applies if the equilibrium criterion is strengthened to iteratively undominated Nash equilibrium. Say that a Nash equilibrium is iteratively undominated if iterative elimination of strictly dominated strategies leads to elimination of all strategies except the Nash equilibrium. This is a robust criterion—in particular the Nash equilibrium is strict; and since elimination is in terms of strictly dominated strategies, the outcome is independent of the order of elimination. For implementation in iteratively undominated strategies the following mild assumption is made: ∀i, θ ∈ Θ, ∃ā(i, θ), a̲(i, θ) ∈ C with ui(ā(i, θ), θ) > ui(a̲(i, θ), θ) and, ∀j ≠ i, uj(ā(i, θ), θ) > uj(a̲(i, θ), θ). With this assumption, Abreu and Matsushima (1992) show:

Theorem 6.5. Any social choice function is virtually implementable in iteratively undominated strategies.
6.4 Extensive Form Mechanisms (Complete Information) In extensive form games, implementation in the complete information case has focused on subgame perfect equilibria. The key role of the extensive form mechanism is to break the connection between the value of the social choice function and
the point where the preference reversal occurs. In this respect, the way in which preference reversals occur is similar to that in undominated Nash equilibrium.
Subgame perfect implementation Consider an exchange economy with just two individuals, where preferences are either both Cobb–Douglas or both Leontief. The following figure identifies preferences and a corresponding social choice rule where monotonicity fails:
In the figure, each individual's preferences are either Cobb–Douglas (C) or Leontief (L). The set of possible states is {θC, θL} where θC = (C, C) and θL = (L, L): both have Cobb–Douglas preferences, or both have Leontief preferences. The social choice function is f(θC) = c and f(θL) = l. For an outcome a, the lower contour set of i at state θ is Li(a, θ). From the figure, Li(c, θC) ⊆ Li(c, θL). Since c = f(θC), monotonicity would require that c = f(θL). So, the social choice rule fails monotonicity. While the social choice function fails monotonicity, it turns out to be implementable in an extensive form game—as follows.
Here, player 1 chooses L or C; if L, the game terminates with outcome l; otherwise the game moves to the next stage, where player 2 must move. In this event, if 2 chooses L, the outcome is c; if 2 chooses C, the game moves to the final stage, where 1 must pick T or B. The choice T leads to outcome x and the
choice B leads to outcome y. There are two possible states, θC and θL. The following figure indicates the (unique) equilibrium in each case. The equilibrium choices are indicated by arrows, with one corresponding to the state θC and the other to θL.
At profile θC, if the final decision node is reached, individual 1 with Cobb–Douglas preferences prefers x to y and chooses T at that node. When individual 2 chooses at the second stage, L yields outcome c while C leads to outcome x. Since 2 with Cobb–Douglas preferences prefers c to x, 2 chooses L. Therefore, for 1 at the initial node, C leads to outcome c whereas L yields outcome l, and since c is preferred to l by 1 with Cobb–Douglas preferences, C is chosen. Thus, at state θC the unique subgame perfect equilibrium leads to outcome c. When the state θL describes preferences, the unique subgame perfect equilibrium outcome is l. Thus, the game form Γ implements the social choice rule f(θC) = c, f(θL) = l, despite the fact that monotonicity is not satisfied. In this example, the preference flip that occurs at the end of the game tree in moving from preference θC to θL (x is preferred under Cobb–Douglas; y is preferred under Leontief) has a chain effect all the way back to the start of the game. This chain effect is the key feature exploited in extensive form implementation and gives the following necessary condition. Given a social choice correspondence f:

Theorem 6.6. Suppose that f is implementable in subgame perfect equilibrium. Then given θ, θ′ ∈ Θ, for each a ∈ f(θ), a ∉ f(θ′), ∃ {a0 = a, a1, …, ar = x, ar+1 = y} ⊆ C such that

1. ∀k ∈ {0, 1, …, r − 1}, ∃j(k) with uj(k)(ak, θ) ≥ uj(k)(ak+1, θ); and
2. ∃j(r) with uj(r)(x, θ) ≥ uj(r)(y, θ) and uj(r)(x, θ′) < uj(r)(y, θ′).

In this chain, individual j(k) prefers ak to ak+1 at θ: uj(k)(ak, θ) ≥ uj(k)(ak+1, θ). So, in an extensive form game, faced with the choice, j(k) will choose ak over ak+1, and this will be a part of the subgame perfect equilibrium strategy profile supporting a specific outcome at θ. When the state changes, there is a preference reversal at the end of the chain, uj(r)(x, θ) ≥ uj(r)(y, θ) and uj(r)(x, θ′) < uj(r)(y, θ′), which can be used to eliminate the strategies that were subgame
perfect at θ, but now not subgame perfect at the new profile, θ′. Subgame perfect implementation is considered in Abreu and Sen (1990) and Moore and Repullo (1988).
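The backward induction in the example is short enough to trace mechanically. In the sketch below the cardinal utilities are hypothetical, chosen to respect the ordinal rankings used in the argument (Cobb–Douglas: x over y, c over x, and c over l; Leontief: y over x, and l over both c and y).

```python
# Backward induction on the three-stage game of the Cobb-Douglas/Leontief example.
# Stage 1: player 1 chooses L (outcome l) or C (continue).
# Stage 2: player 2 chooses L (outcome c) or C (continue).
# Stage 3: player 1 chooses T (outcome x) or B (outcome y).
# Cardinal utilities are hypothetical, consistent with the ordinal rankings in the text.

profiles = {
    # state: (player 1 utilities, player 2 utilities)
    "theta_C": ({"c": 3, "x": 2, "l": 1, "y": 0}, {"c": 3, "l": 2, "x": 1, "y": 0}),
    "theta_L": ({"l": 3, "y": 2, "c": 1, "x": 0}, {"y": 2, "c": 1, "l": 3, "x": 0}),
}

def spe_outcome(u1, u2):
    stage3 = "x" if u1["x"] > u1["y"] else "y"          # player 1 at the last node
    stage2 = "c" if u2["c"] > u2[stage3] else stage3    # player 2: stop (c) or continue
    stage1 = "l" if u1["l"] > u1[stage2] else stage2    # player 1: stop (l) or continue
    return stage1

for state, (u1, u2) in profiles.items():
    print(state, "->", spe_outcome(u1, u2))   # theta_C -> c, theta_L -> l
```

The flip in the stage 3 comparison (x versus y) propagates back through stages 2 and 1, delivering different equilibrium outcomes at the two states even though the outcomes c and l themselves are ranked the same way relative to each other by nobody whose choice changes.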
6.5 Incomplete Information Environments

6.5.1 The framework

An environment is a collection (N, {Si}i∈N, μ, C, {ui}) where N = {1, …, n} is the set of individuals, Si the set of types of individual i, μ a prior distribution on S = ×i∈N Si, and ui: C × S → R the utility function of individual i. A social choice function, x, is a function from S to C, x: S → C. Let the set of social choice functions be X. Assume that S is finite. Given x, the ex ante expected payoff to i is ∑s∈S ui(x(s), s) μ(s); the interim expected utility to i of type si is ∑s−i ui(x(si, s−i), (si, s−i)) μ(s−i | si); and the ex post utility at type profile s is ui(x(s), s). Ex ante corresponds to a payoff evaluation of x prior to knowledge of one's own type; interim expected utility corresponds to evaluating the payoff conditional on knowing one's type; and ex post utility is the evaluation of utility when all types are known.
6.5.2 Incentive compatibility and participation

Incentive compatibility plays a central role in mechanism design. Specifically, it is a necessary condition for a social choice rule to be implementable. This is discussed next. As before, a social choice function, x: S → C, represents the socially desirable outcome at each state.

Definition 6.7. The social choice function x satisfies (interim) incentive compatibility if ∀i, ∀si, s′i ∈ Si,

∑s−i ui(x(si, s−i), (si, s−i)) μ(s−i | si) ≥ ∑s−i ui(x(s′i, s−i), (si, s−i)) μ(s−i | si).

A mechanism ℳ = ({Mi}i∈N, g) consists of a message space, Mi, for each i, and a function g: M → C (with M = ×Mi). Given a mechanism ℳ, a game is implicitly defined. Let σi: Si → Mi and σ = {σi}i∈N. The strategy σ* is a Nash equilibrium if ∀i, ∀si, and ∀mi ∈ Mi,

∑s−i ui(g(σ*i(si), σ*−i(s−i)), (si, s−i)) μ(s−i | si) ≥ ∑s−i ui(g(mi, σ*−i(s−i)), (si, s−i)) μ(s−i | si),

where σ*−i(s−i) = {σ*j(sj)}j≠i.

Definition 6.8. The mechanism ℳ implements x if for every Nash equilibrium σ*, x(s) = g(σ*(s)), ∀s ∈ S.

If a social choice function is implementable, it is incentive compatible.
Theorem 6.7. If x is implementable in Nash equilibrium then x is incentive compatible.

Proof Let σ* be a Nash equilibrium of a mechanism ℳ = ({Mi}, g) that implements x. So, ∀i, ∀si, and ∀mi ∈ Mi,

∑s−i ui(g(σ*i(si), σ*−i(s−i)), s) μ(s−i | si) ≥ ∑s−i ui(g(mi, σ*−i(s−i)), s) μ(s−i | si).

Define a new mechanism ℳd, where Mdi = Si and gd(s) = g(σ*(s)) = x(s), ∀s ∈ S. This is a direct mechanism since each individual's message space is that individual's type space. In the direct mechanism, a strategy for i is a function τi: Si → Si; let the "truth-telling" strategy be τ*i(si) = si, ∀si ∈ Si. Then, ∀si ∈ Si, truthful reporting yields

∑s−i ui(gd(τ*i(si), τ*−i(s−i)), s) μ(s−i | si) = ∑s−i ui(g(σ*i(si), σ*−i(s−i)), s) μ(s−i | si).

Consider τi with τi(si) = s′i ≠ si for some si, and suppose that σ*i(s′i) = mi. Then

∑s−i ui(gd(s′i, s−i), s) μ(s−i | si) = ∑s−i ui(g(mi, σ*−i(s−i)), s) μ(s−i | si) ≤ ∑s−i ui(g(σ*i(si), σ*−i(s−i)), s) μ(s−i | si),

where the equality follows from the definition of gd, and the inequality because σ* is a Nash equilibrium. Since gd(si, s−i) = g(σ*i(si), σ*−i(s−i)) = x(si, s−i), this gives

∑s−i ui(x(si, s−i), s) μ(s−i | si) ≥ ∑s−i ui(x(s′i, s−i), s) μ(s−i | si),

so that truth-telling is optimal in the direct mechanism: x is incentive compatible.
Thus, incentive compatibility is a fundamental condition, necessary for a social choice rule to be implementable in Nash equilibrium or any more demanding solution concept. In some environments it is natural to introduce a participation requirement. It may be that an individual, i, has an outside option and may not participate in the mechanism unless it guarantees a minimum level of utility, say ūi (or ūi(si) if a state-dependent reservation level of utility is appropriate). With a participation constraint an additional condition appears:

∑s−i ui(x(si, s−i), (si, s−i)) μ(s−i | si) ≥ ūi, for all si.
6.5.3 Ex ante, interim, and ex post criteria

In the evaluation of outcomes, individuals can assess the situation from three temporal perspectives: prior to knowledge of their own signal; after knowledge of their own signal but prior to knowledge of others' signals; or with full information, when all signals are revealed. Given x, the ex ante expected payoff to i is ∑s∈S ui(x(s), s) μ(s); the interim expected utility to i of type si is ∑s−i ui(x(si, s−i), (si, s−i)) μ(s−i | si); and the ex post utility at type profile s is ui(x(s), s). These criteria amount to evaluating an individual's payoff with varying degrees of information. If for each s and s′i, ui(x(s), s) ≥ ui(x(s′i, s−i), s), then ∑s−i ui(x(s), s) μ(s−i | si) ≥ ∑s−i ui(x(s′i, s−i), s) μ(s−i | si), so that ex post incentive compatibility implies interim incentive compatibility. Similarly, interim incentive compatibility implies that for any deception τi: Si → Si, ∑s∈S ui(x(s), s) μ(s) ≥ ∑s∈S ui(x(τi(si), s−i), s) μ(s), or ex ante incentive compatibility. In the consideration of a social choice rule, matters such as incentive compatibility, participation, and efficiency depend on the information possessed by individuals.
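These implications are easy to check on a finite type space. The following sketch builds a hypothetical two-agent environment with binary types and a uniform prior and tests ex post and interim incentive compatibility for one agent; as the inequalities above indicate, whenever the ex post test passes, the interim test does too.

```python
import itertools

# Hypothetical finite environment: two agents, binary types, uniform prior.
S = [("lo", "hi"), ("lo", "hi")]                  # S[i]: type set of agent i
mu = {s: 0.25 for s in itertools.product(*S)}     # uniform prior on S

def x(s):                                         # a social choice function S -> C
    return "A" if s[0] == "hi" else "B"

def u0(c, s):                                     # agent 0's utility (hypothetical)
    return {"A": 2, "B": 1}[c] if s[0] == "hi" else {"A": 0, "B": 1}[c]

def interim_utility(si, report):
    # E[u0(x(report, s1), s) | s0 = si] under the prior
    weights = [(s1, mu[(si, s1)]) for s1 in S[1]]
    total = sum(w for _, w in weights)
    return sum(w * u0(x((report, s1)), (si, s1)) for s1, w in weights) / total

expost_ic = all(u0(x((s0, s1)), (s0, s1)) >= u0(x((r, s1)), (s0, s1))
                for s0 in S[0] for s1 in S[1] for r in S[0])
interim_ic = all(interim_utility(s0, s0) >= interim_utility(s0, r)
                 for s0 in S[0] for r in S[0])
print("ex post IC:", expost_ic, "| interim IC:", interim_ic)
```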
6.5.4 Strategic form mechanisms (incomplete information) Implementation in strategic form games may be in terms of Nash equilibrium, undominated Nash equilibrium, and so on. As in the complete information case, Nash equilibrium provides an important benchmark.
6.5.5 Nash implementation For (Nash) implementation in complete information environments, the key necessary condition is monotonicity. In the incomplete information case, there is an analogous condition called Bayesian monotonicity. Conceptually this addresses the same issue—the occurrence of preference reversals when moving from one
state to another. The following discussion derives and motivates the Bayesian monotonicity condition, which is defined below. Fix a mechanism ℳ. If the mechanism implements the social choice rule x, then there is an equilibrium such that at each s, the outcome coincides with x(s); and any strategy profile generating an outcome different from x(s) at some s cannot be an equilibrium. Let σ* be an equilibrium strategy with x(s) = g(σ*(s)), ∀s ∈ S. Because σ* is an equilibrium strategy, ∀i, ∀si, and ∀mi ∈ Mi:

(6.1) ∑s−i ui(g(σ*i(si), σ*−i(s−i)), s) μ(s−i | si) ≥ ∑s−i ui(g(mi, σ*−i(s−i)), s) μ(s−i | si).

For each i let αi: Si → Si, let α = (α1, …, αn), and consider the "deception" σα defined by σαi(si) = σ*i(αi(si)). Suppose that g(σα(s)) = x(α(s)) ≠ x(s) for some s ∈ S. Then σα cannot be an equilibrium: for some i and si, σαi(si) is not a best response to σα−i, so for some m̂i ∈ Mi:

(6.2) ∑s−i ui(g(m̂i, σα−i(s−i)), s) μ(s−i | si) > ∑s−i ui(g(σαi(si), σα−i(s−i)), s) μ(s−i | si).

Given any social choice function h: S → C, let hα(s) = h(α(s)), ∀s. With this notation, and taking m̂i to be the improving deviation above, define y: S → C by y(s′i, s−i) = g(m̂i, σ*−i(s−i)), ∀s−i and ∀s′i, so that y is independent of the ith coordinate: y varies with s−i but not with si. Note in particular that, because y is independent of the entry in the ith coordinate, yα(s) = y(αi(si), α−i(s−i)) = g(m̂i, σ*−i(α−i(s−i))) = g(m̂i, σα−i(s−i)). Substituting the deviation m̂i in inequality (6.1) above, for each ti ∈ Si:

(6.1′) ∑s−i ui(x(ti, s−i), (ti, s−i)) μ(s−i | ti) ≥ ∑s−i ui(y(ti, s−i), (ti, s−i)) μ(s−i | ti).

Next, observe that in inequality (6.2) above, g(σαi(si), σα−i(s−i)) = x(α(s)) = xα(s) and g(m̂i, σα−i(s−i)) = yα(s). So, inequality (6.2) may be written:

(6.2′) ∑s−i ui(yα(si, s−i), (si, s−i)) μ(s−i | si) > ∑s−i ui(xα(si, s−i), (si, s−i)) μ(s−i | si).

Given two social choice functions, v, w, let Ri(si) be the preference ordering defined by:

v Ri(si) w ⟺ ∑s−i ui(v(si, s−i), (si, s−i)) μ(s−i | si) ≥ ∑s−i ui(w(si, s−i), (si, s−i)) μ(s−i | si).

And write v Pi(si) w if v Ri(si) w holds, but not w Ri(si) v. Thus, (6.1′) reads x Ri(ti) y, ∀ti, and (6.2′) reads yα Pi(si) xα.
Definition 6.9. A social choice rule x satisfies Bayesian monotonicity if, given α: S → S with xα ≠ x, ∃i, si, and y: S → C, with y independent of the ith coordinate, such that
1. x Ri(ti) y, for all ti ∈ Si.
2. yα Pi(si) xα.

The previous discussion shows that Bayesian monotonicity is a necessary condition for implementation in Nash equilibrium. See Postlewaite and Schmeidler (1986) and Jackson (1991) for further discussion. For a large class of environments Bayesian monotonicity is sufficient. Let B be a subset of S and let χB denote its indicator function: χB(s) = 1 if s ∈ B, and χB(s) = 0 otherwise.
Given social choice functions x, z, let xBz = χB·x + (1 − χB)·z.

Definition 6.10. An environment is economic if given z ∈ X and s ∈ S, ∃i, j, i ≠ j, and x, y ∈ X, with both x and y constant social choice functions, such that x{s}z Pi(si) z and y{s}z Pj(sj) z.

In economic environments there is competition at every state: for example, if at each state every pair of individuals strictly disagree in their top ranked outcomes, the environment is economic.

Theorem 6.8. Let n ≥ 3 and suppose the environment is economic. Then a social choice function x is implementable (in Nash equilibrium) if and only if x satisfies incentive compatibility and Bayesian monotonicity.
6.6 Other Mechanisms The discussion above considers strategic form mechanisms and Nash equilibrium. One can also consider strategic form mechanisms with undominated Nash, virtual implementation, and other solution concepts. Likewise, extensive form mechanisms may be developed to implement social choice rules.
Bibliography
Abreu, D. and Matsushima, H. (1992). "Virtual Implementation in Iteratively Undominated Strategies: Complete Information," Econometrica, 60(5), 993–1008.
Abreu, D. and Matsushima, H. (1994). "Exact Implementation," Journal of Economic Theory, 64, 1–19.
Abreu, D. and Sen, A. (1990). "Subgame Perfect Implementation: A Necessary and Almost Sufficient Condition," Journal of Economic Theory, 50, 285–299.
Abreu, D. and Sen, A. (1991). "Virtual Implementation in Nash Equilibrium," Econometrica, 59, 997–1021.
Glazer, J. and Rosenthal, R. W. (1992). "A Note on Abreu–Matsushima Mechanisms," Econometrica, 60, 1435–1438.
Jackson, M. (1991). "Bayesian Implementation," Econometrica, 59(2), 461–477.
Maskin, E. (1977). "Nash Equilibrium and Welfare Optimality," Mimeo, MIT.
Maskin, E. (1999). "Nash Equilibrium and Welfare Optimality," Review of Economic Studies, 66, 23–38.
Matsushima, H. (1988). "A New Approach to the Implementation Problem," Journal of Economic Theory, 45, 128–144.
Moore, J. and Repullo, R. (1988). "Subgame Perfect Implementation," Econometrica, 56, 1191–1220.
Moore, J. and Repullo, R. (1990). "Nash Implementation: A Full Characterization," Econometrica, 58, 1083–1099.
Palfrey, T. and Srivastava, S. (1991). "Nash Implementation Using Undominated Strategies," Econometrica, 59, 479–501.
Repullo, R. (1987). "A Simple Proof of Maskin's Theorem on Nash Implementation," Social Choice and Welfare, 4, 39–41.
Saijo, T. (1988). "Strategy Space Reduction in Maskin's Theorem: Sufficient Conditions for Nash Implementation," Econometrica, 56, 693–700.
7 Auctions I: Independent Values

7.1 Introduction

An auction is a procedure for selling a good or goods. There are many common auction procedures: ascending bid, descending bid, sealed bid first price auction, and so on. In an English auction, bidders continue bidding until only one bidder remains, and the highest bid made at that point is the selling price. In a Dutch auction, the price is lowered (from a high level) until some bidder accepts that price, and that determines the selling price. In a sealed bid auction, buyers submit bids: in a first price auction the highest bidder gets the object at a price equal to that highest bid; in a second price auction the highest bidder gets the object at a price equal to the second highest bid. This chapter discusses the independent values case, where bidders' valuations are independently drawn. In auctions involving the sale of a good, asset, or entitlement that will be used to generate revenue, one expects that different bidders will have different estimates of the revenue generating potential of the asset, but that these estimates will be correlated, and hence also the valuations of different bidders. Such environments are excluded by the independent values assumption. Instead, consider the auction of a house in a neighborhood with many similar or identical houses. Here, one expects potential buyers' valuations to be drawn independently from some population of buyers who just happen to be in the market for a house at that point in time. In the pool of bidders, the fact that one bidder has a relatively high valuation does not mean that it is more or less likely that others' valuations will be high. In such circumstances, the independent values model is appropriate. The focus of this chapter is on two issues: equilibrium behavior and revenue raised in different types of auction. From the seller's perspective, the value of the good (or goods) to the potential buyers is unknown, so there is a difficulty in trying to achieve a good price with
high probability. One might expect that different auction schemes would perform differently in terms of selling price since the rules governing them are so different. Furthermore, different auction procedures lead to different bidding behavior by potential buyers, so that in principle it is difficult to compare such auctions in terms of revenue generated. Nevertheless, there are key features common to all auctions. Any auction procedure leads to some form of strategic behavior by bidders, and this in turn generates win probabilities for each bidder, and an expected payment. In any auction one can identify win probabilities and an expected selling price (given bidding strategies for the bidders) and this does provide a basis for revenue comparison. Under appropriate circumstances these expected payments and win probabilities are sufficient to study behavior. For example, consider a sealed bid second price auction, where bidding one's true valuation is a weakly dominant strategy. For any buyer, one can compute the conditional distribution of the selling price, given that buyer's bid is the winning bid. Alternatively, one might just consider the expected selling price—again conditional on that buyer's bid being the winning bid. If the buyer is risk neutral, the expected selling price is sufficient to determine expected utility, and knowing the price distribution is irrelevant. A bidder is risk neutral if expected utility equals the bidder's valuation times the win probability, less the expected payment (the win probability times the amount paid conditional on winning). Similarly, a risk neutral seller is concerned only with the expected selling price. With risk neutrality, agents care only about the probability of winning and the expected payment, so with risk neutrality these variables become a benchmark for comparing different auctions. If equilibrium bidding behavior leads to the same expected payment and win probability for any bidder, the auctions are in a sense equivalent. In particular, the expected revenue would be equal in such auctions. The Revenue Equivalence Theorem is this observation: auctions with the same win probabilities and expected payments are revenue equivalent. Auctions defined in terms of win probabilities (the probability that a bidder gets the object) and expected payments are called reduced form auctions. With risk neutrality it turns out that incentive compatibility establishes a tight connection between the win probability and the bidder's expected payment: the win probability fully determines the bidder's expected payment. This is a remarkably useful property since, in particular, comparison of a class of auctions reduces to comparison of the associated win probabilities. Furthermore, it connects the seller's expected revenue to the win probabilities, so questions concerning optimality and efficiency can be discussed in terms of these win probabilities. From this the equivalence of different auctions in terms of expected revenue can be shown ("revenue equivalence"), and optimal auctions characterized. Auctions with independent values refer to the case where buyers' valuations are independently drawn, so that, for example, it is not the case that a high valuation for one buyer makes it more or less likely (in the conditional probability sense) that the valuations of other buyers are high. The assumption of independence is
central to the results on revenue equivalence. It is the independence of valuations and risk neutrality that lead to the key connection of expected payments to win probabilities in reduced form auctions. An auction is optimal if it maximizes the seller's expected revenue. An auction is efficient if the outcome is efficient in the Pareto sense. In general, there is a conflict between efficiency and optimality due to incentive requirements. Encouraging a bidder to bid aggressively is important to a revenue maximizing seller, but this requires that the bidder has some incentive to do so. In the present context, one natural incentive is the risk that the bidder may lose the object even when the object is valued by the bidder. This risk encourages aggressive bidding but also creates some states where no sale takes place even though efficiency would require a sale. Section 7.2 introduces the standard auctions such as the English and the Dutch auctions, and equilibrium bidding strategies are characterized. As discussed above, a fundamental result of auction theory asserts that in the independent values case, and with bidders that are risk neutral, a large class of auctions generates the same expected revenue for the seller. In Section 7.3 the familiar auctions are all shown to have this property. With risk neutrality the key strategic features defining an auction procedure are the win probabilities and the expected payment. In particular, auctions with the same win probabilities and expected payments must generate the same expected revenue for the seller. An auction defined directly in these terms is called a reduced form auction. Reduced form auctions are introduced in Section 7.4. Viewed as mechanisms, incentive compatibility and participation (willingness to participate) are the key requirements on win probability and expected payment pairs. In Section 7.4.1, such mechanisms are characterized. This characterization of incentive compatibility is used in Section 7.4.2 to simplify the computation of the expected revenue from a reduced form auction. Section 7.5 characterizes the optimal auction (revenue maximizing auction). Finally, in Section 7.6, the case where either seller or buyers are risk averse is discussed. When the seller is risk averse and the buyers risk neutral, while the expected revenue from both the first and second price auctions is the same, the expected utility for the seller is higher in the first price auction. When buyers are risk averse, and focusing on the first price auction, the more risk averse the bidders, the higher the bid at any valuation: in the first price auction, greater risk aversion leads to more aggressive bidding.
7.2 Auction Procedures

There are many possible schemes for selling an object—in principle, there is an infinite number of auction procedures. Some of the most common are the first and second price auctions, and the English and the Dutch auctions. Each of these schemes follows different rules; they are discussed next. In
what follows, there are n bidders and the set of bidders is denoted N = {1,…, n}. The valuation of bidder i is a random variable, Vi. Valuations are independently and identically distributed.
7.2.1 First price auctions

In a first price auction, bidders submit sealed bids. The highest bid wins, and the winning bidder pays that bid. So, the expected utility associated with a bid is the net value of the bid times the probability of winning at that bid. Suppose that all individuals have valuations drawn from the interval [v̲, v̄] according to a common distribution, F = Fi, for all i, with density f. Consider a symmetric bidding strategy: let b(vj) be the bid of j with valuation vj, with b increasing (b(v′) > b(v), v′ > v). Suppose that all players j ≠ i adopt this strategy. The expected utility to i of type vi from bidding β is:

[vi − β]·Prob(b(Vj) < β, ∀j ≠ i).

The range of b is given by [b̲, b̄], where b̲ = b(v̲) and b̄ = b(v̄). If i bids b̲, there is 0 probability of winning the object; if i bids b̄, i gets the object for sure. So an optimal bid for i at valuation vi can be found in the range [b̲, b̄]: an optimal bid can be chosen by selecting α ∈ [v̲, v̄] to determine a bid in [b̲, b̄] according to b(α). Since the others bid according to b, the bid b(α) wins with probability F(α)^{n−1}, and the problem for i of type vi reduces to:

max_α [vi − b(α)] F(α)^{n−1}.

Maximizing,

(vi − b(α))(n−1)F(α)^{n−2}f(α) − b′(α)F(α)^{n−1} = 0.

Rearranging,

b′(α)F(α)^{n−1} + b(α)(n−1)F(α)^{n−2}f(α) = vi(n−1)F(α)^{n−2}f(α).

If b(·) is a symmetric equilibrium, it must be that α = vi is optimal when i's valuation is vi:

b′(vi)F(vi)^{n−1} + b(vi)(n−1)F(vi)^{n−2}f(vi) = vi(n−1)F(vi)^{n−2}f(vi),

or

(d/dvi)[b(vi)F(vi)^{n−1}] = vi (d/dvi)[F(vi)^{n−1}].

So, integrating from v̲ to vi, with F(v̲) = 0,

b(vi)F(vi)^{n−1} = ∫_{v̲}^{vi} x dF(x)^{n−1},

or

b(vi) = (1/F(vi)^{n−1}) ∫_{v̲}^{vi} x dF(x)^{n−1}.

(Note that b(vi) < vi for vi > v̲: bidders shade their bids below their valuations.) Recall that F^{n−1} is the distribution of the maximum of n−1 independent random variables, each with distribution F. Thus, b(vi) = E{maxj∈N∖{i} Vj | maxj∈N∖{i} Vj < vi}; bidder i with valuation vi computes the expected value of the highest of the n−1 other random valuations, conditional on that being lower than vi. This defines i's bid function.
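For example, with valuations uniform on [0, 1], F(v) = v and the formula gives b(v) = ((n − 1)/n)v. A small simulation sketch confirms that this matches the conditional expectation interpretation of the bid.

```python
import random

# First price auction, n bidders, valuations uniform on [0, 1]:
# b(v) = E[max of n-1 draws | max < v] = ((n - 1) / n) * v.
random.seed(0)
n, v = 4, 0.8

draws = []
while len(draws) < 100_000:
    m = max(random.random() for _ in range(n - 1))
    if m < v:                      # condition on others' maximum being below v
        draws.append(m)

print("simulated:", sum(draws) / len(draws))   # approx 0.6
print("closed form:", (n - 1) / n * v)         # 0.6
```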
7.2.2 Second price auctions

In a second price auction, the highest bidder gets the object and pays a price equal to the value of the second highest bid. There are many Nash equilibria. Let [v̲, v̄] be the support of the valuation distribution. Suppose that an arbitrary bidder, i, bids v̄ and all other bidders bid 0. That is an equilibrium: i gets the object and pays nothing. For another bidder to win, they must outbid i, resulting in a selling price of v̄—at least as large as the value to anyone. However, bidding one's true valuation is a weakly dominant strategy. When others bid lower, the bidder wins, and a higher bid would leave the price and winning bidder unchanged, while a lower bid would only affect the outcome if the bid fell below the nearest competitor's, in which case the bidder loses the object at a price the bidder would have been willing to pay. The situation is similar when others bid higher. In this equilibrium, the object is sold at the valuation of the second highest bidder (the value of the second-order statistic in the valuation profile). In subsequent discussions, the equilibrium of the second price auction considered will be the dominant strategy equilibrium.
7.2.3 All-pay auctions

In an all-pay auction, the highest bidder gets the object, but all bidders pay their bid. Thus, the expected payoff to bidder i with bid β is vi·Prob(win) − β. Using the same formulation as in the first price auction, with ba a candidate bid function for the other players, the problem for i is:

max_α vi F(α)^{n−1} − ba(α).

The first-order condition is vi (F(α)^{n−1})′ − b′a(α) = 0, and with this an identity at vi: vi (F(vi)^{n−1})′ = b′a(vi), ∀vi. Thus,

ba(vi) = ba(v̲) + ∫_{v̲}^{vi} x dF(x)^{n−1}.

At valuation v̲ there is zero probability that the bidder will win, so to avoid negative expected utility at v̲, it must be that ba(v̲) = 0. Thus,

ba(vi) = ∫_{v̲}^{vi} x dF(x)^{n−1} = b(vi)F(vi)^{n−1}.
7.2.4 Fixed price auctions (take it or leave it pricing)

In the fixed price auction, the object is offered for sale at a fixed price v*. In this case the probability of a sale is the probability that the largest valuation is above v*. The bidding rule is not important—for example, the object can be randomly assigned to one of the bidders bidding at or above v* (in which case truthful bidding is a dominant strategy), or bidders bid and the highest bidder gets the object (in which case all bidders with valuations above v* will bid v*). The expected payoff to the seller is v*(1 − F(v*)^n). Maximizing this with respect to v* gives the necessary condition:

(1 − F(v*)^n) − v* n F(v*)^{n−1} f(v*) = 0,

so that v* satisfies:

v* = (1 − F(v*)^n) / (n F(v*)^{n−1} f(v*)).

So, the item is offered for sale at price v*. If more than one buyer offers v*, some rule for selecting a buyer must be applied. This process does not necessarily give the object to the buyer with the highest valuation. When n = 1, this reduces to

v* = (1 − F(v*)) / f(v*),

and the seller simply offers the item to the buyer at a fixed price v*.
7.2.5 The Dutch and the English auctions

In the Dutch or descending bid auction, the price is reduced until some bidder accepts the object at that price. This is strategically equivalent to the first price auction. In the English auction, the price is raised by competition from bidders until only one bidder is left, and the object is sold for the final bid. This competition will drive the price to the valuation of the bidder with the second highest valuation—when that bidder drops out, the price is bid up no more. So the outcome is the same as that in the weakly dominant equilibrium of the second price auction.
7.3 Revenue Equivalence

In the discussion of reduced form auctions below (Section 7.4), it is shown that the expected payment of bidder i depends entirely on the expected payment of i at i's lowest possible valuation and the function yi giving the probability that i wins at each valuation profile. Any auction procedure whose reduced form determines the same expected payment at the lowest valuation and the same function yi therefore determines the same expected payment from the buyer. In each of the five auctions discussed above (first price, second price, English, Dutch, all-pay), the expected payment at the lowest valuation is zero, since at the lowest valuation (and with Fi = F, ∀i) there is zero probability of getting the object and nothing is paid; and yi(v) = 1 if vi > maxj≠i vj, while yi(v) = 0 if vi < maxj≠i vj. So, although the procedures are very different, it turns out that because they all generate the same reduced form auction, from a revenue point of view they are equivalent. Nevertheless, it is of some interest to confirm the details directly for the specific auctions discussed earlier. In the first price auction, the equilibrium bid function is

b(vi) = (1/F(vi)^{n−1}) ∫_{v̲}^{vi} x dF(x)^{n−1}.
The seller's expected revenue is equal to the expected value of the highest bid:

R1 = E[b(V(1))] = ∫_{v̲}^{v̄} b(v) dF(v)^n,

where V(1) = maxj Vj has distribution F(v)^n. In the second price auction (in the dominant strategy equilibrium), expected revenue is equal to the expected value of the second-order statistic, which has cumulative distribution function F2(v) = nF(v)^{n−1}(1 − F(v)) + F(v)^n. So,

R2 = ∫_{v̲}^{v̄} v dF2(v).

Substituting b(v)F(v)^{n−1} = ∫_{v̲}^{v} x dF(x)^{n−1} into the expression for R1 and integrating by parts shows that R1 = ∫_{v̲}^{v̄} v dF2(v). Therefore, since the expected value of the second-order statistic is ∫_{v̲}^{v̄} v dF2(v), it follows that R1 = R2: the first and second price auctions yield the same expected revenue.
Next, compare the first price and all-pay auctions, and write ba for the all-pay auction bid and b for the first price auction bid as before.
So,

ba(v) = b(v)F(v)^{n−1},

and the expected payment of a bidder in the all-pay auction is E[ba(V)] = ∫_{v̲}^{v̄} b(v)F(v)^{n−1} f(v) dv. The expected value of the bid of the highest bidder is the expected revenue of the seller in the first price auction. This is

E[b(V(1))] = ∫_{v̲}^{v̄} b(v) dF(v)^n = n ∫_{v̲}^{v̄} b(v)F(v)^{n−1} f(v) dv = n E[ba(V)].

Thus, the expected revenue in the first price auction is equal to the expected revenue in the all-pay auction (n times the expected payment of each bidder). Since the first price and the Dutch auctions are strategically equivalent, and the English and the second price auctions determine a price equal to the second-order statistic, all five auctions generate the same expected revenue.
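A simulation sketch illustrates the equivalence. With valuations uniform on [0, 1], the expected revenue in all of these auctions is E[V(2)] = (n − 1)/(n + 1).

```python
import random

# Monte Carlo revenue comparison with uniform valuations on [0, 1].
random.seed(1)
n, T = 3, 200_000
b  = lambda v: (n - 1) / n * v            # first price equilibrium bid
ba = lambda v: (n - 1) / n * v ** n       # all-pay bid: b(v) * F(v)^(n-1)

r1 = r2 = rap = 0.0
for _ in range(T):
    vals = sorted(random.random() for _ in range(n))
    r1  += b(vals[-1])                    # winner pays own bid
    r2  += vals[-2]                       # winner pays second highest valuation
    rap += sum(ba(v) for v in vals)       # everyone pays their bid
print(r1 / T, r2 / T, rap / T, "theory:", (n - 1) / (n + 1))
```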
7.4 Reduced Form Auctions

Any selling procedure determines two key features: the win probability of a bidder at a valuation profile, and the expected payment of the bidder at that profile. From a strategic perspective the win probability and payment at every profile fully describe the strategic scenario, and auctions defined in these terms are called reduced form auctions. For general preferences, the utility of i with valuation vi and paying si is uri(vi, si) if the item is received and uni(vi, si) if no item is received (allowing for the possibility that an unsuccessful buyer might have to pay). If at profile v = (vi, v−i), yi(v) is the probability of receiving the object and si(v) the payment made at that profile, then the expected utility of i at that profile, v, is:

yi(v) uri(vi, si(v)) + (1 − yi(v)) uni(vi, si(v)).

When ui is separable, uri(vi, si) = gri(vi) + hri(si) and uni(vi, si) = gni(vi) + hni(si). Suppose that gni(vi) = 0, so there is no satisfaction when the object is not received. If, in addition, the dissatisfaction from payment is the same whether the object is received or not, then hri = hni = hi. With these two assumptions, and writing gi = gri, the expected utility at v becomes:

gi(vi) yi(v) + hi(si(v)).

Taking expectations conditional on vi, and with ȳi(vi) = E[yi(vi, V−i)], the (interim) expected utility is:

gi(vi) ȳi(vi) + E[hi(si(vi, V−i))].

While the function gi(vi) may be viewed as a rescaling of utility, the function hi reflects the bidder's attitude to risk. In the earlier discussion risk neutrality was assumed, so that hi(si) = −si. Now assume that gi(vi) = vi and hi(si) = −si, so the interim expected utility is

vi ȳi(vi) − s̄i(vi),

where s̄i(vi) = E[si(vi, V−i)]. For example, the first price auction had the form [vi − b(vi)]F(vi)^{n−1}, or vi F(vi)^{n−1} − b(vi)F(vi)^{n−1}. Here F(vi)^{n−1} is the probability of winning when the valuation is vi and corresponds to ȳi(vi), while b(vi)F(vi)^{n−1} is the expected payment for a bidder with valuation vi, corresponding to s̄i(vi). Summarizing, the win probability and the expected payment are the key features of any auction, and auctions described in these terms are called reduced form auctions. A reduced form auction is given by a "2n-tuple" of functions {yi, si}, i = 1, …, n, where yi(v) is the probability that i gets an object at type profile v and si(v) is the transfer from i at that profile. The associated (expected) utility of i (at profile v) is ui(vi, v−i) = vi yi(v) − si(v). According to the revelation principle, if a mechanism is incentive compatible, it can be replaced by a direct mechanism (where in the present context the "type" of an individual is their valuation) and the original equilibrium outcome continues to be an equilibrium outcome in the direct mechanism. Hence, one can focus on incentive compatible direct mechanisms (e.g., in the search for an optimal selling procedure to maximize the seller's expected revenue). With a direct mechanism, the payoff to i of type vi at report v̂i is E[vi yi(v̂i, V−i) − si(v̂i, V−i)], and with ȳi(v̂i) = E[yi(v̂i, V−i)] and s̄i(v̂i) = E[si(v̂i, V−i)], this may be written as:

ui(v̂i | vi) = vi ȳi(v̂i) − s̄i(v̂i),
giving the expected payoff, conditional on own type, vi, and report v̂i. Let u*i(vi) = max_{v̂i} [vi ȳi(v̂i) − s̄i(v̂i)]. Since vi ȳi(v̂i) − s̄i(v̂i) is linear (affine) in vi, u*i(vi) is convex. This implies the important property that ȳi(vi) is weakly increasing in vi—the key incentive compatibility requirement. Before discussing this in greater detail, the convexity of u*i is confirmed. To see that u*i is convex, consider a slightly more general specification:

u*(vi) = max_α [a(vi)γ(α) + δ(α)],

where a(vi) is a convex function, γ(α) ≥ 0, and γ and δ are otherwise arbitrary functions. For each fixed α, the function vi ↦ a(vi)γ(α) + δ(α) is convex, and the pointwise maximum of a family of convex functions is convex, so u* is convex. When a(vi) = vi, a is affine and hence convex, and with γ(α) = ȳi(α) ≥ 0 and δ(α) = −s̄i(α), u*i(vi) = max_α [vi ȳi(α) − s̄i(α)] is convex.
7.4.1 Incentive compatibility For reduced form auctions, incentive compatibility has a special structure: the interim expected payment of a bidder (as a function of valuation) is fully determined (up to a constant) by the win probability. That is, for any i, given , the function is determined up to a constant when {y,s} is incentive compatible. This useful result is considered next and simplifies the study of reduced form auctions. If {y,s} is an incentive compatible reduced form auction, the maximum,
From the envelope theorem,
, is attained at
and
, ∀vi, and convexity of u*i implies that:
Thus, incentive compatibility of the reduced form mechanism {y,s} implies that , ∀vi:
is weakly increasing in vi. Since
112 And because
MICROECONOMIC THEORY
, ∀vi, the above equation may be written as
So, the transfer is fully determined by
Individual rationality requires that u*i(vi) ≥ 0, ∀vi. Since incentive compatibility implies that requirement for individual rationality, given incentive compatibility, is u*i(vi) ≥ 0. Thus, Theorem 7.1.A reduced form auction,
is increasing, the only
is incentive compatible and individually rational if and only if for each i,
and:
1. 2. An important point to observe here is that fully determines the shape of , the only flexibility being at the level, , and this is chosen by consideration of the participation constraint. In particular, this implies that the seller's expected revenue is determined by and the participation constraints.
7.4.2 Revenue In view of the previous discussion, the reduced form auction generates expected revenue from i, given vi, of unconditional expected revenue of
The following computations rearrange Let
to a more useful form.
. The expected payment of i to the seller is:
and
CHAPTER 7: AUCTIONS I: INDEPENDENT VALUES
Now, letting for g and dg:
(Using
,
,
the last term is
113
, and integrating this term by parts, substituting
). Thus:
Summarizing, Theorem 7.2.In a reduced form auction (y,s), the expected payment of bidder i is
From the theorem, the key parameter in determining the seller's ability to extract revenue from the buyer is vi − ((1 − Fi(vi))/fi(vi)). Therefore, this is central in the determination of the optimal auction.
7.5 The Optimal Auction For notational convenience let Ji(vi) = vi − ((1 − Fi(vi))/fi(vi)), so that the expected payment of bidder i is and the expected revenue, R(s,y),
114
MICROECONOMIC THEORY
of the seller is:
or, expanding
:
A reduced form auction is optimal if it maximizes this expression. Because the participation constraint implies that ki(vi) ≤ 0, it is optimal to set this to 0 for each i (by choice of ). If there are l objects for sale, then . Recall that incentive compatibility requires that is weakly increasing in vi—a constraint that must be imposed in the optimization problem. If, for example, Ji(vi) were decreasing then an unconstrained solution might require for small values of vi and equal to zero for large values of vi—violating the incentive constraint. If Ji is nondecreasing, the issue does not arise. Focusing on this case and with one good for sale, it is optimal to set yi(v) = 0 whenever Ji(vi) ≤ 0 and to set yi*(v) for that i* defined by , ∀j and provided . This defines an optimal (revenue maximizing) auction. A cutoff for each bidder is set (v*i defined by Ji(v*i) = 0 for i), and the object is awarded to the bidder with the highest “J” function at the announced valuation profile. If there are multiple units, the analysis is essentially unchanged: the bidders with the highest valuations get an object, and objects are sold up to the point where supply runs out or a valuation with J value below 0 is reached. When Ji(·) is not monotone, the discussion above does not apply. In the extreme, ignoring the incentive constraint, the maximizing yi may require selling to i on a region [a, b], but not above b or below a. To see this and gain some insight into the issue, suppose there is just one buyer, i, and consider the following two figures:
In both cases, the smooth curve represents Ji(·) with Ji(a) = Ji(b) = Ji(c) = 0. Consider case A, where . In the absence of the incentive constraint, the seller would set on and zero
CHAPTER 7: AUCTIONS I: INDEPENDENT VALUES
115
elsewhere. But the incentive constraint requires that if , then , v′i ≥ vi (with just one buyer, ). If the seller sets a reservation price of a, , vi ≥ a, then the expected revenue is . But the expected revenue on the section [b,c] is negative and outweighs the positive expected revenue on the section [a,b]—setting a reservation bid of c gives a higher expected payoff. If the function J*i is defined
The dashed line (at intercept Jc) is chosen so that . Because (case A), this line lies below the 0 line. The J*i function captures the unprofitability of selling to i valuations above a. If for case A, the problem: is replaced by , the solution to the second program is , vi < c and , vi ≥ c, the correct solution is obtained. Replacing Ji by J*i produces a program where the unconstrained choice of satisfies the incentive constraint—because J*i is monotone. In case B similar reasoning applies. Now, however, it is better to set the reservation threshold at a, and this occurs again when the program is solved. With many buyers, similar reasoning applies. For a fixed realization of v−i, let ϕi(v−i) be the payoff from selling to buyers other than i. Conditional on v−i the revenue from selling to a buyer other than i is ϕi(v−i).
From the function Ji the modified function J*i is defined:
so that:
or vi′ is chosen so that:
116
MICROECONOMIC THEORY
If v−i generates (in the figure), then setting yi(vi, v−i) = 1 if and only if vi ≥ a generates an expected payoff conditional on v−i that is less than the expected payoff from setting yi(vi, v−i) = 1 if and only if vi ≥ c; and selling to the competing highest bidder otherwise. This is the same rule as would be determined with the seller using J*i as the revenue function for i. If vi generates , then the same reasons now lead to setting yi(vi, v−i) = 1 if and only if vi ≥ a′ where a′ is the smallest value of vi for which (and lies to the left of ). Again, this is the same rule as would be adopted with the seller using the J*i function. Optimal auctions are discussed by Myerson (1981) and Riley and Samuelson (1981). The “monotonization” of Ji to J*i is developed in Myerson. Similar constructions appear in Mussa and Rosen (1978). Additional discussion of these and related issues is given in Maskin and Riley (1984) and Rochet and Choné (1995).
7.5.1 Canonical pricing Recall that the expected payment of i is . This expression may be used to derive a price (the canonical price) for bidder i, if i wins the object. The following calculations derive this price. Given the assignment rule, yi, define ϕi(v−i)= inf {vi| yi(vi,v−i) = 1} and expand the expression for the expected payment (recall f(v−i) = ×j≠ if(vj)):
where is the indicator function of the event {vi≥ ϕi(v−i)}. Define Si = {v | yi(v) = 1}, the set of valuation profiles at which individual i obtains
117
CHAPTER 7: AUCTIONS I: INDEPENDENT VALUES
the good. Thus, ϕi(t)= inf {vi| (vi,v−i)∈ Si}. Let pi(vi, v−i) = ϕi(v−i)yi(vi,v−i) i.e.,
For example, suppose there are two bidders with valuations distributed according to F1(v1) = v1 and J1(v1) = 2v1 − 1 and . Note that
. Then
and
,
. So, . Thus, if v ∈ S1, 1 gets the object and pays: p1(v) = ½ if . In fact, this allocation mechanism is strategy-proof (Bergin and Zhou (2001), (2004)).
and
if
7.6 Risk Aversion Risk aversion can arise on the side of the seller, buyers, or both. As discussed earlier, the English and second price auctions lead to the same selling price—the statistic; while the first price and Dutch auctions determine a more complex strategy, but are also strategically equivalent (to each other). In both cases, the basis for equivalence is the strategic structure, and not the form of preferences. So, these equivalences continue to hold with risk aversion. The following discussion makes two observations. When the seller is risk averse and the buyers risk neutral, the Dutch auction is preferred to the English auction by the seller. Second, as bidders become more risk averse, in the first price auction they make uniformly higher bids at each valuation. These points are discussed in turn. For the first assertion, let b(Vi) be the equilibrium bid function in the first price auction. The selling price in the Dutch auction is PF(V1, …, Vn) = maxib(Vi) = b(maxiVi) = b(V(1)). For the English auction PE(V1, …, Vn) = V(2), the statistic. With risk neutral buyers both have expected value equal to the expected value of the second-order statistic (E{PD} = E{PE}). Conditional on
118
MICROECONOMIC THEORY
V(1) = v, E {PD | V(1) = v} = b(v)= E{maxj ∈ N\{i}Vj | maxj ∈ N\ {i}Vj
Taking expectation over v according to the distribution of the first-order statistic,
See Wolfstetter (1996) for additional discussion. With risk averse buyers, the payoff to a bidder with valuation vi paying bid β is u(vi − β), and take ui normalized with u(vi) = 0. As in the first price auction the bidder's problem can be written:
This yields the first-order condition:
Rearranging:
Consider two utility functions, u1 and u2, with corresponding absolute risk aversion measures, r1 and r2, (r1(x) = − u1″(x)/u1′(x) and r2(x) = −u2″(x)/u2′(x)). Let b1 and b2 be the corresponding equilibrium bid functions. Suppose that utility function 2 exhibits greater risk aversion than utility function 1: r2(x) > r1(x), ∀x. The following discussion shows that b2(v) > b1(v), ∀ v > v. To show this it is sufficient to establish that (1) b2′(v) > b1′(v) and (2) if b2(v) = b1(v), then b2′(v) > b1′(v).
In the figure b′2(v) > b′1(v) (as required by (1)), but the curves cut at where implying that which violates (2). So, (1) and (2) imply that b2(·) initially moves above b1(·) and stays above.
,
Suppose that b2 crosses b1 as in the figure. Letting qi(x) = ui(x)/ui′(x), i = 1,2, and φ(x) = q2(x) − q1(x), then at ,
If φ(x) > 0, ∀ x, then in particular at ,
, contradicting the assumption that b2 crosses b1.
So, consider the function φ. Noting that qi′(x) = 1 + qi(x)ri(x), φ′(x) = q2(x)r2(x) − q1(x)r1(x). Because qi(v) = 0, φ′(0) = 0.
CHAPTER 7: AUCTIONS I: INDEPENDENT VALUES
119
Furthermore, φ″(x) = q2′(x)r2(x) + q2(x)r2′(x) − q1′(x)r1(x) − q1(x)r1′(x). Since at v, qi(v) = 0, φ″(v) = q2′(v)r2(v) − q1′(v)r1(v) = r2(v) − r1(v) > 0. So, φ is convex locally at v: for sufficiently small x, φ(x) > 0. To see that φ is always strictly positive, note that if at some x* > 0, φ(x*) = 0, then take x* to be the first such point and note that q2(x*) = q1(x*) = q* so that φ′(x*) = q* [r2(x*) − r1(x*)] > 0. Therefore, on a neighborhood of x*, φ′(x) > 0, while φ(x) > 0 and x < x* φ(x*) = 0: φ is strictly positive and upward sloping to the left of x*, contradicting φ(x*) = 0. See Riley and Samuelson (1981) for additional discussion of risk aversion.
An example: seller's utility with risk aversion Suppose that F(v) = v, so that the valuations are uniform and independently distributed on [0, 1]. Then the equilibrium bid in the first price auction is: b(v) = ((n − 1)/n)v. The highest bid is ((n − 1)/n) Y1 where Y1 is the first order statistic (the largest of the n valuations). If the seller has utility function u(·), the expected utility is , where Fi is the distribution function of the ith order statistic, i = 1, 2, …, n. Similarly, in the second price auction, the expected utility is , where Fi is the distribution function of the ith order statistic, i = 1, 2, …, n. For the example, let . Recall that F1(y) = F(y)n = yn and F2(y) = nF(y)n−1(1−F(y))+ F(y)n so dF1(y) = nyn−1dy and dF2(y) = n(1−F(y)) n−1 d{F(y) } = n(1 −y)(n− 1)yn−2dy. The expected utility of the seller in the first price auction is, R1:
The expected utility of the seller in the second price auction is, R2:
Therefore:
Thus, R1 > R2.
120
MICROECONOMIC THEORY
7.7 Efciency and Optimality An auction is efficient if at each valuation profile, the object goes to the person who values it most. In the present context, the object for sale has no value to the seller (other than the revenue it can generate), so efficiency requires that the object always be sold, and to the person who values it most. An auction is optimal if it maximizes expected revenue for the seller. From the earlier discussion, all the familiar auctions (first price, second price, English, Dutch and all-pay) are efficient, since in each auction the object is sold to the highest bidder. However, none of these are optimal since the optimal auction imposes a reservation price pr (defined by pr = (1 − F(pr))/f(pr)). So, on the region Vl = {(v1, …, vn | vi ≤ vi < pr, ∀i}, no sale takes place.
Bibliography Bergin, J. and L. Zhou (2001), “Optimal Monopolistic Selling under Uncertainty: Does Price Discrimination Matter?” Mimeo. Bergin, J. and L. Zhou (2004), “ Monotonic Assignment Rules and Common Pricing,” forthcoming, Mathematics of Operations Research. Maskin, E. and Riley, J. (1989). “Optimal Multi-Unit Auctions,” in F. Hahn (ed.), The Economics of Missing Markets, Information and Games, Clarendon Press, Oxford. Maskin, E. and Riley, J. (1984). “Optimal Auctions with Risk Averse Buyers,” Econometrica, 52, 1473–1518. Mussa, M. and Rosen, S. (1978). “Monopoly and Product Quality,” Journal of Economic Theory, 18, 301–317. Myerson, R. (1981). “Optimal Auction Design,” Mathematics of Operations Research, 6, 58–73. Riley, J. G. and Samuelson, W. F. (1981). “Optimal Auctions,” American Economic Review, 71, 381–392. Rochet, J.-C. and Choné, P. (1995). “Ironing, Sweeping and Multidimensional Screening,” Econometrica, 66, 783–826. Wolfstetter, E. (1996). “Auctions: An Introduction,” Journal of Economic Surveys, 10, 367–420.
8 Auctions II: Dependent Values Consider an auction for oil drilling rights on a tract of land. The value of the oil reserves is the same whoever wins the auction, but differing private information will generally lead individuals to value the tract differently. In such circumstances, it is natural to assume that individual valuations are not independent. This chapter considers auctions with dependent valuations: each bidder attaches a value to the object which may be correlated with the valuations of other bidders. The basic structure is laid out in Section 8.1, including the cases of common values and private values. To make progress in the study of bidding behavior some statistical assumptions on the joint distribution of signals and characteristics—the variables defining information and preferences—are necessary. The key assumption is called affiliation (or monotone total positivity of order 2). This is discussed in Section 8.1.1. Following this, in Section 8.2, the traditional auctions are discussed. The equilibrium bidding function for the first price auction is given in Section 8.2.1; second price auctions are considered in Section 8.2.3, and the English auctions in Section 8.2.4. The different forms of auction are compared in revenue terms in Section 8.2.5. Section 8.4 describes the winner's curse: the winning bidder is, on average, likely to have overestimated the value of an object in the common values model. Finally, in Section 8.5 optimal auctions are discussed, where necessary and sufficient conditions are given for full extraction of the surplus by the seller.
8.1 The Framework The general framework for the modeling of auctions where valuations are correl-ated is developed in Milgrom and Weber (1981). The outline here follows that model. In this framework, a bidder observes private information, Xi, which is
122
MICROECONOMIC THEORY
correlated with the information of other individuals and other variables affecting the value of the object to the bidder. Let X = (X1, X2, …, Xn) be individual-specific signals on the value of the object: bidder i receives signal Xi. Let S = (S1, …, Sm) be other random variables which, in addition to X, also affects the value of the object to each bidder. The value of the object to bidder i is Vi = ui(S, X), where the function ui is assumed to have the form u(S, Xi, {Xj}j ≠ i); u depends only on the list {Xj}j ≠ i and not the order—so each bidders' valuation is a symmetric function of others' information. If i knows that Xi = xi, the value of the object conditional on that information is E {V | Xi = xi}. Two well-known special cases are the private values model (m = 0, ui(S, X) = Xi) and the common values model (m = 1, ui(S, X) = S1). Assume that u ≥ 0, and that u is nondecreasing in all its arguments. The random variables (X,S) are assumed to have density f (s,x) satisfying: (1) f(s,x) is symmetric in the x variables: if x′ is a permutation of x, f (s,x) = f (s,x′); and (2) (S1, …, Sm, X1, …, Xm) are affiliated random variables. Affiliation, also known as monotone total positivity of order 2 (MTP2), plays a central role and is discussed in the next section. Affiliation is a strong form of positive association between random variables. When one variable is “high,” other variables are more likely to be high. So, for example, because u is nondecreasing in all its arguments, affiliation will imply that E {Vi | xi} is increasing in xi.
8.1.1 Afliated (MTP ) random variables 2
Let f(a,b) be a density on A × B where A and B are totally ordered. The function f is said to be totally positive of order 2 (TP2) if f(x1,y1)f(x2,y2) ≥ f(x1,y2)f(x2,y1) for all x1 < x2, y1 < y2. Let z, z′ be vectors in Rl. The component-wise maximum of the pair is denoted z ∨ z′ and the component-wise minimum is denoted z ∧ z′. These are defined:
The random variables, z = (z1, …, zl) with density f are said to be MTP2 if
The term affiliated is also used to denote this property. Loosely, if f satisfies the affiliation condition then when some variables are high, other variables are more likely to be high than low. In order to check if f is MTP2, it is sufficient to check that f is TP2 for every pair of variables, holding the remaining variables fixed. One of the key properties of
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
123
affiliation or MTP2 is: If Z1, Z2, …, Zk are affiliated and H(Z1, Z2, …, Zk) is a nondecreasing function, then
is nondecreasing in all its arguments. In the current context let Y1, …, Yn−1 be the order statistics of X2, …, Xn:
Then if f is affiliated and symmetric in X2, …, Xn, then (S1, …, Sn, X1, Y1, …, Yn−1) are affiliated random variables.
8.2 Auction Procedures This section considers three standard auctions—first price, second price, and the English auctions. In each case the equilibrium bidding strategy is derived. Because agents are symmetric, one may focus on any individual to derive optimal (symmetric) strategies, so consider individual 1. Before discussing the general case, it is worth considering a simple generalization of the independent private values model that allows for correlation between valuations. In terms of the model introduced above, assume that there are no random variables Si, and take ui(S,X) = Xi. Thus, Vi = Xi, and let F be the joint distribution of valuations (V1, …, Vn). In a first price auction, given bidding strategies, {bj(·)} and attaching bids to signals, (bj(vj)), the expected payoff to bidder 1 at valuation v1 with bid b is:
where F(v2, …, vn | v1) = prob(V2 ≤ v2, …, Vn ≤ vn | V1=v1). The best response is determined by:
or
Let F have density f so that
124
MICROECONOMIC THEORY
Then,
Where fij is the joint density of (vj,vi). Now, suppose that the distribution is symmetric: for any permutation, π, of the nvector v, F(v) = F(π(v)). Similarly, the densities fi and fij are independent of i and j and may be written f(v) and f(v,v′). In a symmetric equilibrium, bj = b* for all j, and the optimizing choice for 1 is b = b*(v1). To simplify, note that , so
Or,
Thus, the equilibrium bidding strategy satisfies the differential equation:
When the ViS are independent then F(v1, …, v1 | v1) = F(v1)n−1 and F(v1, …, v1 | v1, v1) = F(v1)n−2, (with slight abuse of notation, using F also for the cumulative distribution of each Vi.) Similarly, f(v1, v1) = f(v1)f(v1). So, with independence,
the formula derived earlier for the independent identically distributed valuations case.
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
125
In the case of the second price auction, the usual reasoning applies, so that an equilibrium bidding strategy is to bid one's true valuation. Next, the general case is considered. Recall that
and let
This is the expected payoff to 1 given that own signal is x and the highest signal among others is y. From affiliation, v is increasing in x and y. Also, let be the conditional distribution function of Y1, given that X1 = x.
8.2.1 First price auctions The next result characterizes the equilibrium bidding strategy. Theorem 8.1.The symmetric equilibrium bidding strategy is
Proof. To find the equilibrium bidding strategy, conjecture a symmetric equilibrium (b*, …, b*), where b*(xi) is the bid of agent i observing Xi = xi. The expected payoff to bidder 1 with signal x bidding b is:
where the second equality follows from the rule for iterated expectation: given three random variables X, Y, Z, and a function w,
Now, consider E{(V1 −b) · χ{b*(Y1) < b} | X1 = x, Y1 = y}. Using the rule:
126
MICROECONOMIC THEORY
Substituting this into the expression for π(x,b):
Differentiating with respect to b and setting to 0 gives the first-order condition11:
Noting that
, and rearranging the first-order condition:
By assumption, b* is a symmetric equilibrium, so at x the bid b must be optimal: b = b*(x) or x = b*−1(b). Therefore, a necessary condition for b* to be an equilibrium is:
The solution to this differential equation is:
(Remark: To check this note that
so that
11
Recall Leibnitz's rule: , *(g (b )) so that 1 = b *′(g (b ))g ′(b ) and g ′(b ) = 1/b *′(g (b )) = 1/b *′(b *−1 (b )).
. And note that if g (b ) ≡ b *
−1
(b ), then b ≡ b
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
127
Note also that
Thus,
and so the expression is confirmed.) To consider sufficiency, recall that
The term {…} is 0 at z = x, from the first-order necessary condition. From affiliation f(z | x)/f(z | x′) is increasing in x if x ≥ x′ : (f(z | x)/f(z | x′)) ≥ (f(z′ | x)/f(z′ | x′)), z ≥ z′, or (f(z′ | x)/f(z | x)) ≤ (f(z′ | x′)/f(z | x′)), z ≥ z′. So,:
and so is decreasing in x. Also, v(x,z) is increasing in x and z, so that whenever x < z, the term {…} is negative; and positive when x > z. Thus, the solution to the first-order condition is the unique (maximizing) value. Remark: It is useful to observe that for given x, L(ξ | x) is a distribution function. Writing the inequality above as , for x ≥ x′; taking z = x, x′ = x implies . Therefore, . Since , the log is −∞ and so
128
MICROECONOMIC THEORY
, which implies . Recalling that , L(x | x) = 1, and L(x | x) = 0, so that L(ξ | x) is a distribution on [x, x]. Two special cases of interest are the extreme cases of private and common values.
First price auctions: the private and common values cases In the case of private values and independent signals, v(x,x) =x and
So,
. Since
:
and thus, . Substituting into the definition of b*(x) gives . This agrees with the expression derived earlier.
In the case of common values, v(x,y) = E{S1 | X1 = x, Y1 = y}. With indepen-dent variables, dL is as in the previous calculations, so that b*(x) is , and noting that independence implies , gives b*(x) = μ.
8.2.2 First price auctions: an example Suppose there are two bidders. In this case the distribution of the valuation of the second highest bidder is just the distribution of the second variable. So, throughout, rather than write , it is sufficient to write f(y | x). Let f(x,y) = k(1 + axy) be a density on [0, 1]2, where k is a normalizing constant. Thus,
This is an MTP2 density. If x1 < x2 and y1 < y2, then k2(1 + ax1y1)(1+ax2y2) ≥ k2(1 + ax1y2)(1 + ax2y1), or a(x2 − x1)(y2 − y1) ≥ 0. Remark More generally, if f(x1, …, xn) is MTP2, then so is , where ai(·) is monotonically S increasing and the bi are all monotonically increasing or monotonically decreasing. For example, consider
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
129
bxy + ax2y2 and suppose x1 < x2 and y1 < y2, with a ≥ 0 and b ≥ 0. This function satisfies MTP2. To check this directly observe that the MTP2 requirement is:
This inequality requires:
Since the first and last terms on either side are equal, the condition becomes:
or
or
Returning to the density, | x) = f(y,x)/f(x) = (1+axy)/(1 + ax/2) and
Thus,
In particular:
Therefore:
. Similarly, the conditional density is f(y
130
MICROECONOMIC THEORY
The equilibrium bid function is given by:
Independent valuations With private values (Vi = Xi), the case where valuations are drawn independently corresponds to a = 0. To make that connection, note that as a → 0, the density f(x,y) = k(a)(1 + axy) (the constant k varies with a) converges to 1, so that x and y are independent uniformly distributed on [0, 1]. For that case (two bidders with valuations uniformly distributed on [0, 1]), the equilibrium bidding strategy in a first price auction is b(x) = ½x. From the expression above, observe that
so that lima → 0b(x) = ½x. Thus, in the limit as the density approaches the uniform distribution, the bid function approaches the usual independent types equilibrium bid function.
Common values To cast this as common values example, define a density on S × X × Y = [0, 1]3. Extending the previous example, let f(s,y,x) = k(1 + asxy), so that f(x,y) = k(1+(a/2)xy) and f(s | x,y) = (1+asxy)/(1+(a/2)xy). This density satisfies MTP2, since it is TP2 pairwise. Also, let u(s,x,y) = s. Then , and so v(ξ, ξ) = (1 + (2/3)aξ2)/(2+aξ2). For example, if 0 ≤ a ≤ 1 then , so this function is relatively
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
131
flat. The equilibrium bid function is:
In the special case where a → 0, the density converges to a density of independent random variables. Observe that
So, lima → 0b*(x) = ½, which is E {S} in the present example.
8.2.3 Second price auctions This section derives the equilibrium bidding strategy in the second price auction. Theorem 8.2.The equilibrium bidding strategy in the second price auction is:
Proof. Bidder i submits a bid given i's information Xi: bi(Xi). Let W = maxj ≠ ibj(Xj). Bidder i wins if i's bid exceeds W. In that case, the payoff to i is Vi − W. Otherwise, the payoff to i is 0. Focus on player 1. The bid selection is:
when 1 observes X1 = x. The following discussion shows that b*(x) = v(x,x) is a symmetric equilibrium strategy. To see this, note that since v(x,x) is increasing in x, so is b, so that
132
MICROECONOMIC THEORY
W = b(Y1). The bid b gives the expected payoff
Since [v(x, ξ) − v(ξ, ξ)] is positive if and only if x ≥ ξ, the integral is maximized by setting b*−1(b) = x or b = b*(x).
Second price auctions: the private and common values cases In the case of private values, v(x,x) = x and this reduces to bidding one's valuation (whether signals are correlated or not). With common values, as before b*(x) = v(x, x) = E{S1 | X1 = x, Y1 = x}. In the specific common values example discussed earlier, f(s,x,y) = k(1 + asxy), and so b*(x) = v(x,x) = (1 + (2/3)ax2)/(2+ax2).
,
8.2.4 English auctions The English auction is an extensive form game: at any point in time a bidder observes a string of bids and may bid or wait. The following discussion defines the equilibrium bidding strategy. Let p1 ≤ p2 ≤ … ≤ pk denote a history where k bidders have quit—at the corresponding prices. At this point, i's strategy (assuming i has not quit) is bi(x | p1, p2, …,pk), the price at which i will quit if still an active bidder and given the k quits (p1, p2, …, pk). Take bi(x | p1, p2, …, pk) ≥ pk. Define a strategy b* = (b*0, b*1,…, b*n − 2) iteratively
This is the expected value of V1 given X1 = x and all other valuations also equal x.
The strategy b* is a symmetric equilibrium. From affiliation it can be shown that for all k, bk(x | p1, …, bk) is increasing in x. If bidders 2, …, n use b* and 1 wins the
133
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
auction, the price paid is
because when j is dropped out, this was observed as pj and bj(Yj | p1, …, pj−1) = bj(yj | p1, …, pj−1), where yj is the realization of Yj. From this, player 1 infers that Yj = yj, j ≥ 2. Now, player 1's conditional value, given X1 = x1 and knowing (Y1, …, Yn−1) = (y1, …, yn−1) is
So the conditional expected payoff is greater than 0 if and only if x ≥ y1. However, b* is such that bidder 1 will win if and only if X1 > Y1 (bidder drops out when valuation is reached). Thus b* is a best reply for 1.
8.2.5 Revenue comparisons The revenue equivalence of these auctions in the private valuation independent types case does not carry over to this environment. With affiliation, a higher value of one's own valuation implies a conditionally higher distribution on other agent's valuations. The main result is that in the affiliated values model the expected revenue in the English auction weakly exceeds that of the second price auction, which in turn weakly exceeds that of the first price auction. Theorem 8.3.The expected selling price in the second price auction is at least as large as in the first price auction. Proof. Recall that in the first price auction, the equilibrium bid is the equilibrium bid is b*(x) = v(x,x).
and in the second price auction
Let . This is the expected value to bidder 1 with valuation z, bidding as if it were x. Let Wm(x,z) be the conditional expected payment by bidder 1 in auction m ∈ {1, 2} (the first and second price auctions) when: • • •
others play the equilibrium strategy; bidder 1's information is z and bids as if it were x; he wins.
In the first price auction,
the bid made is the price paid if player 1 wins. In the second price auction, the bid is v(x,x) when the value is x. If 1 wins, then 1 pays the bid of the second highest
134
MICROECONOMIC THEORY
bidder, v(Y1, Y1). The event that 1 wins with bid v(x,x) is the event {Y1 < x}, so
The next argument shows that W1(z,z) ≤ W2(z,z), for all z, which will give the result. The expected payoff from the strategy of bidding x at true valuation z is
Given the strategy is a equilibrium bidding strategy, this must be maximized at x = z.
(Where
.)
In either auction, b*(x) = v(x, x) so Wm(x, x) = v(x, x). Because W1(x,z) = b*(x) and does not depend on z, From the affiliation property (increasing in integral ranges), . Thus, in particular, for any z,
.
.
The following argument confirms that W1(z,z) ≤ W2(z,z). Suppose to the contrary that W1(z,z) > W2(z,z) for some z. The equilibrium bidding condition is:
so that W1(z,z) > W2(z,z) implies
. Therefore, if W1(z,z) > W2(z,z),
Recalling that W2(x, x) = W1(x, x) and noting from the above argument that if W1(z,z) is ever as large as W2(z,z) then W2(z,z) has a steeper slope than W1(z,z). This implies that W2(z,z) can never fall below W1(z,z): W2(z,z) ≥ W1(z,z), ∀z. Since the expected price for the first price auction is E{E {W1(X1, X1) | X1 > Y1}} the expected price for the second price auction is E{E{W2(X1, X1) | X1 > Y1}}, the result follows. Remark: An alternative proof follows from stochastic dominance considerations. Recall from earlier calculations that for any z and x ≥ x′, . Taking z ≤ x and x′ = z gives . So .
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
135
Comparing the integrals,
or
Thus, with ξ < x,
So, F*(ξ | x) first-order stochastically dominates L(ξ | x). Therefore,
An example Continuing with the earlier example, recall that the density is f(x,y) = k(1+axy), f(y | x) = (1+axy)/(1+(a/2) x) with F(y | x) = y(1+(a/2)xy)/(1+(a/2) x) and with private values, v(y,y) = y. From the definition:
At x=z, W2(z,z) = (z/3)(3+2az2)/(2+ az2). Recall that
. Thus,
136
MICROECONOMIC THEORY
The sign of W2(z,z) − W1(z,z) is the same as the sign of the numerator. Let az2 = η, so the numerator is or
. This is nonnegative if and only if , or (4 + η)2 2(2+ η) ≥ 16(2+η)2 or (4+η)2 ≥ 8(2+η) or 16+8η +η2 ≥ 16 + 8 η. This follows since η2 ≥ 0, so that W2(z,z) − W1(z,z) ≥ 0. Remark: In the proof of the theorem,
. Thus,
, and in the private values case, this is
Differentiating with respect to x:
Setting this to 0 gives x = z as a root (and the only positive root). This illustrates the construction in the proof.
8.3 Price and Information Linkages Consider now an augmented model where the seller also draws a signal X0 which is affiliated with the other random variables in the model: (X,S). The seller may choose or not choose to reveal this signal. In the latter case, the bidding strategies are as described above. If the seller chooses to reveal the realization of X0, then bidding strategies depend on this also. The following discussion identifies a general principle called the “linkage principle”: when information related to the valuation of the item by other parties is revealed, the price (value of the winning bid) is linked to that information. In this environment, where valuations rise with all signals, the consequence is to raise the expected selling price.
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
137
The rst price auction In the first price auction, the release of the seller's information leads to bids that depend on this information: if i's private signal is x and x0 is revealed, then the bid has the form bi(x, x0), and a symmetric equilibrium bidding strategy may be written b*(x, x0). From the perspective of (say) bidder 1, the expected payoff from bidding b is
where w(y, x, x0) = E{V1 | Y1 = y, X1 = x, X0 = x0}. So,
With x0 fixed, write auction:
for b*(y, x0), and repeating the discussion for the equilibrium bid function in the first price
From the discussion at the end of Section 8.2.1, is increasing in x0 (using affiliation), and w(x, x,x0) is increasing in x0, again from affiliation. The boundary condition is w(x, x, x0) b*(x, x0) = 0. For x0′ > x0, if b*(·, x0′) < b*(·, x0), so that [w(x, x, x0′) − b*(x, x0′)] > [w(x, x, x0) − b*(x, x0)], then since
. So, comparing the functions b*(·, x0′) and b*(x, x0), as functions of x, if b*(·,x0′) falls below b*(x, x0), then its slope is greater. This implies that b*(x,x0′) ≥ b*(x, x0) for each x. Thus, higher values of the signal x0 raise the equilibrium bid. How does the expected price compare with that from the first price auction without any information of the sellers signal? The following argument asserts that revealing the information raises the expected selling price. Theorem 8.4.In the first price auction, release of information by the seller (truthful announcement of the value ofX0) raises the expected selling price. Proof. Let W*(x,z) = E{b*(x, X0) | Y1 < x, X1 = z}, the expected payment of bidder 1 conditional on having an estimate z, bidding as if it were x, and winning with this bid. From affiliation, W*2(x,z) ≥ 0 and since b*(x, x0) = w(x, x, x0) (= E{V1 | X1 = x, Y1 = x, X0 = x0}),
Given the private signal x, the strategy b*(z, X0) is optimal when z is chosen equal to x. Prior to learning X0, and after learning x, the function b*(x, ·) is the
138
MICROECONOMIC THEORY
optimal rule: once the value of x0 is revealed, the optimal bid is b*(x, x0). So, as in the previous discussion, with , the fu nction is m a x im iz ed when . With W* replacing W2 in the discussion of Theorem 8.3, W*(z,z) ≥ W1(z,z), z ≥ x. Since the expected price in the first price auction without any information provided by the seller is E {W1(X1,X1) | Y1 < X1} and the expected price with information release is E {W*(X1, X1) | Y1 < X1}, information release raises the expected selling price. This completes the proof.
The second price auction A similar result applies for the second price auction. Recall that in the second price auction the equilibrium bid function (without any seller information) was v(x,x) where v(x,y) = E {V1 | X = x, Y1 = y}. Similarly, with the release of the seller's information, x0, conditional on this the equilibrium bid function is: v(x,x, x0) where v(x,y, x0) = E {V1 | X=x, Y1 = y, X0 = x0}. Theorem 8.5.In the second price auction, release of information by the seller (truthful announcement of the value ofX0) raises the expected selling price. Proof. Let RN = E{v(Y1, Y1) | X > Y1} and RI = E{v(Y1, Y1, X0) | X > Y1}. These are the expected selling prices when the seller's information is not revealed (RN), and when the seller's information is revealed (RI). The following discussion shows that RI ≥ RN. Recall that v(x,y) = E{V1 | X1 = x, Y1 = y} and in the second price auction, the equilibrium symmetric bidding function is b*(x) = v(x,x)—the expected value of V1 given bidder 1's value estimate is x and equals the highest valuation. Since
and with x ≥ y
(E{w(y, y, X0) | X1 = y, Y1 = y} ≤ E{w(y, y, X0) | X1 = x, Y1 = y} follows from affiliation.) Therefore, with v(Y1, Y1) ≤ E{w(Y1, Y1, X0) | X1, Y1}, and X1 > Y1
So, the expected revenue in the second price auction when X0 is revealed exceeds the expected revenue when X0 is not revealed.
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
139
8.4 The Winner's Curse Consider a sealed bid auction. Let b = (b1, …, bn) be an equilibrium—so i's bid is b(Xi). The expected value to player 1 (say), given X1 = x is
The expected payoff to player 1, given that 1 wins is:
Affiliation implies that this expression is increasing in b (affiliation property 6), so that:
Thus, the agents' valuation given own information and having won the object is lower than the valuation given own information.
8.5 Optimality: Surplus Extraction From the seller's perspective, the purpose of an auction is to raise revenue. So, it is natural to ask what is the maximal revenue that can be raised in a given environment? This issue is discussed next. Under very mild conditions on correlation of valuations across individuals, it is possible for a seller to extract all the surplus: there is a mechanism (selling scheme) such that at each valuation profile, the object is sold to the person with the highest willingness to pay for that price. Let π(s) be the probability of state s and wi(si) the value of the object to individual i, type si. If i, type si gets the object with probability y for an expected payment of xi, the expected utility is wi(si)yi − xi. Let yi(s) be the probability that i gets the object at type profile s and let xi(s) be payment at that profile. Also, let
And
Individual rationality (IR) is the requirement that ∀i, ∀si, that ∀ i, ∀ si, ,
. Incentive compatibility (IC) is the requirement
140 ∀ti ∈ Si. In the auction
MICROECONOMIC THEORY
, the expected revenue or surplus to the seller is
. Alternatively,
If an auction is IR and extracts all the surplus, then for each i and si, . Otherwise, there is some si with , so that IR continues to hold for and generates greater revenue for the seller. The key problem is in finding an incentive compatible way to do this. For notational convenience, write and let be an enumeration of the points in S−i. Define Γi:
Let be a vector in Rki. If Γi has rank ki, then there exists gi, This turns out to be sufficient for full extraction of the surplus. For any allocation.
, such that gi Γi = hi. , all surplus can be extracted at that
Proposition 8.1.If for each i, Γihas rankkithere is an auction that extracts all the surplus. Proof. Given any auction , if the auction does not extract all the surplus, let , which is positive for some i, si. Define gi as in the previous discussion and let xi′(s) = xi(s) + gi(s−i). The incentives for si are unchanged since gi(s−i) is independent of si: if {yi, xi} is incentive compatible, so is {yi, xi′}, since gi is independent of si. With xi′ the expected payment of i, type si is
So, i, type si's utility is reduced by hi(si)—the net benefit from participating in the auction; and for all si ∈ Si, This condition is also necessary, considering arbitrary valuation functions, wi(·).
.
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
A second result provides a different perspective. Define
and
141
:
consists of the vectors , and is the rth column of Γi. ( is the matrix obtained from Γ by deleting the jth column.) The next proposition makes use of Farkas' lemma which states that if Ax = b has a nonnegative solution x*, then there is no solution to y′A ≤ 0 and y′ b > 0. And, if Ax = b has no nonnegative solution x*, then there is a solution to y′A ≤ 0 and y′b > 0. Farkas' lemma is of general interest, and is discussed further in Section 8.6. Proposition 8.2.Suppose that ∄ i,
such that:
has a nonnegative solution Then, the seller can fully extract the surplus. Proof Under the hypotheses of the proposition, assume that there is no i,j pair with lemma, for each , there is a vector
such that
Pick a strictly positive number,
and
. Or, for each
, and let
,
and ρij ≥ 0. From Farkas'
142
MICROECONOMIC THEORY
where δij is chosen so that
For any
. Since δij > 0,
, r ≠ j. That is:
, let
So,
Thus, for each i,
for all
. To check incentive compatibility, consider
.
Since if is sufficiently large. Thus incentive compatibility is satisfied given the allocation rule. This scheme fully extracts the surplus at that rule.
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
The condition [∄i, (1988).
,
143
with ρij ≥ 0] is also necessary for full surplus extraction—see Crémer and McLean
One point worth noting is that the calculations are based on interim expected payoffs. The scheme described above satisfies interim incentive compatibility and individual rationality. However, ex post, this may not be the case. The following discussion follows Fudenberg and Tirole (1991). Consider a two-buyer model and focus on individual 1. Write Vl for . Write yrk for , so the first subscript refers to player 1's type. Similarly, xrk. Thus, y21 is the probability that player 1 gets the object at the message and x21 the expected payment at that profile. Also, write π(k | r) for the probability that 2 is type given 1 is type . Thus, the incentive compatibility constraints are
and
These rearrange to:
and
The IR conditions are that each agent has nonnegative expected utility. With full surplus extraction, all types of all individuals have zero expected surplus (otherwise someone is willing to pay more). Thus,
or
So,
144
MICROECONOMIC THEORY
and
Substituting these into the first incentive compatibility equation,
or
or
or
In the present context, the matrix, Γ1 from the earlier discussion is:
The determinant of Γ1 is
So, the expression above gives:
Thus, for example, if |Γ1| > 0 and |Γ1| → 0, then x21 → −∞ to satisfy the inequality. To illustrate, suppose , so that and . Suppose also that extraction requires that y11 = 1, y12 = 0, y21 = 1, y22 = 1. Only when 2's type is
. Let and 1's type
. Full surplus does the object
CHAPTER 8: AUCTIONS II: DEPENDENT VALUES
145
go to individual 2 (y12 = 0). Substituting in Ωi:
So,
or
If π(1 | 1) = π(2 | 2) = ½ + ε and π(2 | 1) = π(1 | 2) = ½ − ε, | Γ1 | = 2 ε, and the condition is:
So, as ε → 0, x12 → −∞. The next section discusses a result used above.
8.6 Farkas' Lemma Lemma 8.1.Either (1) {Ax = b, x ≥ 0}, or (2) {A′y ≤ 0 andy′b > 0} has a solution, but not both. Proof Suppose that (1) has no solution: b ∉ K ≡ {Ax | x ≥ 0}. Let p ∈ {Ax | x ≥ 0} solve minz ∈ K ‖b − z‖. Since K is convex, p is unique. Let w ≥ 0 satisfy p = Aw. The hyperplane perpendicular to b − p separates b and K: ∀ z ∈ K, (b − p) · (z − p) ≤ 0. Equivalently:
Put y = b − p, so y · (Ax − Aw) ≤ 0 for all x ≥ 0. Let x = w + ei, ei is a vector with 1 in the ith position and zero
146
MICROECONOMIC THEORY
elsewhere. Then A(x − w) = Aei = ai the ith column of A. Thus, y · ai ≤ 0, ∀ i, or y′A ≤ 0. Taking x = 0 in y · (Ax − Aw) ≤ 0 and recalling Aw = p gives −y · p ≤ 0. Substituting b − y for p, −y · (b − y) ≤ 0 or −y · b + y · y ≤ 0 or y · y ≤ y · b and y · y > 0 since y ≠ 0, so y · b > 0. Thus, if (1) does not hold, (2) must hold. Finally, (1) and (2) both cannot be satisfied: if ∃ x ≥ 0, Ax = b and if y′A ≥ 0 then y′b = y′Ax ≥ 0. This completes the proof. Farkas' lemma has a simple geometric interpretation. In the figure, b is in the cone generated by the vectors a1 and a2: b = α1a1 + α2a2 with αi ≥ 0. The shaded region is defined as the set of y's satisfying a1 · y ≤ 0 and a2 · y ≤ 0 or A′y ≤ 0, where A is the matrix with columns a1 and a2 (A = {a1, a2}) and A′ the transpose of
. As the figure illustrates, the region {y |A′ y ≤ 0} is a subset of the region {y | b′ y ≤ 0}. Thus, Ax = b has a nonnegative solution and A′y ≤ 0 implies b′y ≤ 0.
Bibliography Cremér, J. and Mclean, R. (1988). “Full Extraction of the Surplus in Bayesian and Dominant Strategy Auctions,” Econometrica, 53, 345–361. Fudenberg, D. and Tirole, J. (1991). Game Theory. MIT Press, Cambridge, Massachusetts. Krishna, V. O. (2002). Auction Theory, Academic Press, San Diego, USA. Milgrom, P. and Weber, R. (1982). “A Theory of Auctions and Competitive Bidding,” Econometrica, 50, 1089–1122.
9 Extensive Form Games 9.1 Introduction Informally, an extensive form game represents the structure of interaction between players, identifying order of moves, dependency of one's choices on the previous actions of others, and so on. The formal definition of an extensive form game is lengthy (see Selten 1975, Kreps and Wilson 1982a or Ritzberger 2001). Here a somewhat informal and brief description of the structure of an extensive form is given. Pure, behavioral and mixed strategies are discussed and the relations between them are explored. In terms of strategic possibilities, mixed strategies are the most general, but are strategically equivalent to behavioral strategies in games with perfect recall. These matters are discussed in detail. The basic components of an extensive form game are introduced in Section 9.2. In Section 9.3 strategies in extensive form games are described. Pure strategies assign a deterministic choice at each decision point, behavioral strategies allow local randomization over choices at each decision point; and mixed strategies involve randomization over pure strategies. Sections 9.3, 9.3.1, and 9.3.2 develop this categorization of strategies. Both behavioral and mixed strategies are distributions on pure strategies, but mixed strategies permit correlation in strategy choices at different locations in a game, whereas behavioral strategies do not. When a game has perfect recall, described in Section 9.3.3, there is no strategic advantage to using mixed strategies over behavioral strategies. This is discussed in Section 9.3.4.
9.2 Description of an Extensive Form Game An n-player extensive form game tree consists of a set of nodes or vertices with edges (ordered pairs of nodes) connecting or joining nodes. If nodes ea and eb are
148
MICROECONOMIC THEORY
connected by an edge, this is written (ea, eb). Here eb is called a successor of ea, and ea is the predecessor of eb. While a node can have multiple successors, each node has one predecessor—with the exception of one node called the root node where the game starts, and which has no predecessor. The collection of all nodes is denoted V, and the collection of all edges denoted E. There is a partition of nodes into n + 1 player sets and a set of terminal nodes. The set of nodes allocated to i is denoted Pi, P0 is the set of nodes allocated to plays of nature (such as exogenous randomization at certain nodes), and T the set of terminal nodes. Each player set, Pi, is partitioned into a family of information sets , with . Each element of the partition represents the information of a player i at a point in the game, where mi denotes the number of information sets of player i. The information set containing the root node has no other nodes: it is a set containing just one node. Write e for a nonterminal node, and z for a terminal node. For example, in Γ1, player 1's information set, , contains e5 and e6. This means that at player 1 recalls having chosen the uppermost choice (branch) at , but learned nothing about player 3's choice at . Also, note that at each information set a player has the same number of choices, reflecting the fact that one cannot determine the node in an information set from the number of choices available.
Each terminal node, z, is an outcome of the game, with payoff u(z) = (u1(z), …, un(z)). No payoff is assigned to nature. A path in the game is a collection of edges, S, with the property that the root node appears in just one edge as the initial node; only one edge has a terminal node, which appears as the successor in that edge, and every other node in the collection S appears twice—once as a predecessor node in an edge and once as a successor node in an edge. For example, S = {(e1,e2), (e2,e5), (e5,z1)} is a path. A path can also be defined in terms of an ordered collection of nodes, so S = (e1, e2, e5, z1) is a path. A path corresponds to a “play” of the game, describing what happens from start to finish.
CHAPTER 9: EXTENSIVE FORM GAMES
149
9.2.1 Choices Attached to each information set of a player is a set of possible choices: . Each choice at an information set defines an edge from each node in that information set to the successor node. In Γ2, the choice of by player 2 assigns an edge from each of the nodes in and (e3,e6). There are two nodes in , e2 and e3; at e2 the choice leads to e4 and at e3 it leads to e6.
9.2.2 Information The information structure in a game describes what players know when they make choices. An extensive form game is a game of perfect information if every information set contains just one point. Otherwise, the game is a game of imperfect information. A game of incomplete information is a game of imperfect information with a special information structure: each player has characteristics unknown to other players and an exogenously given distribution on all characteristics is given.
9.3 Strategies A strategy for a player is a plan giving a choice for the player at every decision location (information set) of the player. This definition is encompassing in that a choice is specified in every contingency, even those circumstances that cannot arise, given prior choice. In Γ3, choosing a at and c at is a strategy for 1; as choosing b at and c at is a strategy. However, in the latter case choice c is irrelevant in that the choice at has no effect on the outcome.
150
MICROECONOMIC THEORY
At first sight this involves redundancy, but it does allow consideration of “what if ” questions—such as the consequence of “local” changes (e.g. a switch from b to a at ).
9.3.1 Strategies: informal description A pure strategy for a player is a rule assigning a choice at each informa-tion set of the player. In Γ2, player 1 has three possible choices at . Since this is the only information set of 1, player 1 has three pure strategies. Player 2 has two possible choices at each of the information sets , , and , for a total of eight (2 × 2 × 2) possible pairings of choices and hence eight pure strategies. Thus, is particular strategy. This strategy has player 2 choose if player 1 chooses either or , and in that case, regardless of what player 3 chooses, information set of player 2 cannot be reached—so 2's choice there is irrelevant. In this way, there is some “redundancy” in the specification of a strategy, but this redundancy plays a role in answering “what if ” questions. Writing down all the pure strategies for each player with associated payoffs defines the strategic form game associated with the given extensive form. The game Γ4 has strategic form G4 where player 1 is the row player, and player 2 the column player.
A behavioral strategy is a rule assigning a distribution to choices at each informa-tion set. In Γ4 a behavior strategy for player 1 is a distribution, , on {a, b, c} and a behavior strategy for player 2 is a pair of distributions, and , on {d, e} and {f, g}, respectively. Any pure strategy can be identified with a behavior strategy. For example, the pure strategy a of player 1 corresponds to
151
CHAPTER 9: EXTENSIVE FORM GAMES
and the pure strategy dg of player 2 corresponds to
.
A mixed strategy is a distribution on the set of pure strategies. Since any behavior strategy determines a distribution on the set of pure strategies, every behavior strategy is a special case of a mixed strategy. However, the set of mixed strategies is generally a larger class of distributions. To see this, consider Γ5.
In this one-player game, Γ5, player 1 moves twice—there is only one player. There are four pure strategies: {ac, ad, bc, bd}. The mixed strategy q = (½, 0, 0, ½) cannot be generated by any behavior strategy, since there is no pair of distributions (x,1 − x) on {a,b} and (y, 1 − y) on {c, d} that replicate the mixed strategy q. (There is no x, y pair with xy = ½ = (1 − x)(1 − y) and x(1 − y) = (1 − x)y = 0.) In Γ5, player 1 when choosing between c and d does not know whether the game is at the node reached by choice a, or at the node reached by choice b. The player cannot recall whether his or her initial choice was a or b. The game Γ5 is a game without perfect recall (defined in Section 9.3.3). The next game, Γ6, is also a game without perfect recall, and shows how this issue has strategic implications.
In this game, Γ6, player 1 has two information sets, and . Let (x, 1 − x) and (y, 1 − y) be behavior strategies for 1, and let (v, 1 − v) be a behavior strategy for 2. The expected payoff to player 1 is xyv + (1 − x)(1 − y)(1 − v). Since for all x, y, in{xy, (1 − x)(1 − y)} ≤ ¼, for any strategy of 1 there is a strategy for 2 that gives 1 a payoff no greater than ¼. However, if 1 uses the mixed strategy (½, 0, 0, ½), this guarantees that 1 gets an expected payoff of ½. This example illustrates that in games without perfect recall, mixed and behavioral strategies are not strategically equivalent.
152
MICROECONOMIC THEORY
9.3.2 Strategies: detailed description A pure strategy for i, σi assigns a choice to each information set: , where . Let be the number of possible choices at information set , so that the number of pure strategies is . Denote the set of pure strategies for i by Σi. The set of mixed strategies is Xi = Δ(Σi), the set of distributions on Σi. Thus, in the mixed strategy xi, xi(σi) is the probability that pure strategy σi is played. A pure strategy profile, σ = (σ1, …, σn) determines a path in a game (or a collection of paths when there is randomization at information sets of the “0” player, nature). Recall that a path, p in the game is a sequence of edges or nodes, where e0 is the root node, et+1 is a successor node of et, and er = z is a terminal node. In the absence of choices by nature, at the root node or start of the game, the owner of the information set containing that node makes a choice. This choice leads to another information set, where the associated owner makes a choice, and so on. Thus, a unique path is associated with a pure strategy profile, say p(σ). In Γ2, if , , and , the associated path is p(σ) = (e1, e2, e5, e11, z8). With randomization by nature there may be multiple paths compatible with a given pure strategy. Write p(σ) to denote the collection of paths that have positive probability when strategy σ is used. In the absence of moves by nature, p(σ) is always a single path.
In Γ7 nature moves first (at information set )—a is chosen with probability ¼, b is chosen with probability ½, and c is chosen with probability ¼. If and , then with this strategy there are three paths which can occur (have positive probability) and p(σ) = {(e1, e2, z1), (e1, e3, z4), (e1, e4, z6)}. A behavior strategy assigns to each information set a distribution over choices at that information set. Write bi for a behavior strategy of i, so , where is a probability distribution on the set of choices at . Write to denote the probability of choice . A behavior strategy defines a distribution on pure strategies—a mixed strategy. The probability of pure strategy . This defines directly a distribution on Σi, so that calculating a mixed strategy from a behavior strategy is straight-forward. However, the set of distributions over pure strategies generated by mixed strategies
CHAPTER 9: EXTENSIVE FORM GAMES
153
is, in general, strictly larger than the set of distributions generated by behavior strategies (as game Γ5 shows). Going in the opposite direction (from mixed to behavioral strategies) is less clear, but there is a natural definition. In the case of Γ5, the natural approach is to calculate the conditional distribution of the choice c (and d), given that the information set is reached (an event with probability 1 in Γ5). If x = (x1, x2, x3, x4) is the mixed strategy on the set of pure strategies, {ac, ad, bc, bd}, let α = x1 + x2 and β = x1 + x3. This defines a behavior strategy where a has probability α and c has probability β. If x1, x4 > 0, then both α and β are strictly between 0 and 1. In this case, whatever the mixed strategy, the derived behavior strategy assigns positive probability to every pure strategy. The procedure for obtaining a behavior strategy from a mixed strategy is developed more formally next. A path reaches information set , if for some , . So, for example, in Γ2 the path p = {e1, e2, e5, e11, z7} reaches , since . That path also reaches , but does not reach , for example. For player i, define the set of pure strategies of i that reach information set , , to consist of those strategies which when matched with some strategy of the other players, has positive probability of determining a path that reaches information set . Formally,
With this notation, given a mixed strategy xi for i, a behavior strategy bi may be defined: for a choice c at information set I_i^k,

b_i^k(c) = [∑ of xi(σi) over σi ∈ Σi(I_i^k) with σ_i^k = c] / [∑ of xi(σi) over σi ∈ Σi(I_i^k)],

whenever the denominator is positive (at information sets that xi reaches with probability 0, b_i^k may be chosen arbitrarily).
So, given a behavior strategy bi, there is a procedure for determining a corresponding mixed strategy, and conversely, given a mixed strategy xi, there is a procedure for determining a corresponding behavior strategy. In the case of Γ6, every pure strategy reaches the second information set, so the mixed strategy x = (x1, x2, x3, x4) = (½, 0, 0, ½) generates the behavior strategy bx = {(ρ, 1 − ρ), (τ, 1 − τ)} = {(½, ½), (½, ½)}, where ρ and τ are the probabilities of a and e, respectively, at the two information sets. The mixed strategy determined by bx is (¼, ¼, ¼, ¼).
9.3.3 Perfect recall
From these calculations it is clear that there are fewer strategic possibilities with behavior strategies than with mixed strategies. However, behavior strategies are easier to use, allowing comparison of alternative decisions locally at each information set. Under what conditions are behavior strategies strategically equivalent to
mixed strategies? The key condition is called "perfect recall," and can be motivated by considering Γ6. In Γ6, at player 1's second information set the choice made initially is forgotten—at the second information set player 1 cannot tell whether a or b was chosen initially, even though player 1 made that choice. When these choices can be recalled, Γ6 is modified and the information structure is represented by Γ8—a game that satisfies perfect recall.
In Γ8, player 1 has three information sets, with choices {a, b} at I_1^1, {e, f} at I_1^2, and {g, h} at I_1^3. Player 2 has one information set, with choices there of c or d. Behavior strategy probabilities are written on the branches. The mixed strategy putting probability ½ on aeh and probability ½ on beh guarantees a payoff of ½ to player 1. Let b = {(ρ, 1 − ρ), (τ, 1 − τ), (γ, 1 − γ)} be the distributions assigned at the information sets I_1^1, I_1^2, and I_1^3 by the behavior strategy. Then b = {(½, ½), (1, 0), (0, 1)} assigns probability ½ to aeh and probability ½ to beh, and so provides the same distribution on pure strategies as the mixed strategy did. These calculations correspond with the formula given earlier for deriving a behavior strategy from a mixed strategy. For example, at information set I_1^2, the set of pure strategies which reach I_1^2 are those strategies which involve the choice a at I_1^1. So, the probability of reaching I_1^2 under the mixed strategy is ½ (the probability of aeh), and the probability of reaching I_1^2 and choosing e is also ½. Thus, τ = [½]/[½] = 1. A similar calculation applies for γ. For ρ, every strategy reaches I_1^1, so the probability of reaching I_1^1 is 1, and there is probability ½ that a will be chosen, so ρ = ½. So, in this example, and it turns out in general, when a player recalls prior choices, behavior strategies are strategically as good as mixed strategies. Note, however, that even with perfect recall, the set of feasible distributions over pure strategies is strictly smaller with behavior strategies than with mixed strategies.
In Γ8 the behavior strategy b = {(ρ, 1 − ρ), (τ, 1 − τ), (γ, 1 − γ)} implies, in G8, the mixed strategy x on the pure strategies (aeg, aeh, afg, afh, beg, beh, bfg, bfh):

x = (ρτγ, ρτ(1 − γ), ρ(1 − τ)γ, ρ(1 − τ)(1 − γ), (1 − ρ)τγ, (1 − ρ)τ(1 − γ), (1 − ρ)(1 − τ)γ, (1 − ρ)(1 − τ)(1 − γ)).

Regardless of the behavior strategy, any derived mixed strategy satisfies x1x6 = x2x5 = ρ(1 − ρ)γ(1 − γ)τ². So, for example, a mixed strategy x with min{x1, x6} > 0 and min{x2, x5} = 0 generates a distribution on pure strategies which cannot be replicated by a behavior strategy. Notice, however, that the initial choice of a makes the choice of g or h irrelevant; and the choice of b initially makes the choice of e or f irrelevant. In the first four strategies we can ignore (strategically) the distinction between g and h, and in the second four the distinction between e and f can be ignored. Write [ae] to denote the pair of strategies aeg and aeh, with similar notation for other choices. Strategically, it does not matter which choice in [ae] is taken—only the total probability matters. Take a given x and compress it to the strategically relevant classes:

w = (w1, w2, w3, w4) = (x1 + x2, x3 + x4, x5 + x7, x6 + x8), corresponding to [ae], [af], [bg], [bh].

Observe that w1 = x1 + x2 = ρτγ + ρτ(1 − γ) = ρτ, and w2 = x3 + x4 = ρ(1 − τ). So w1 + w2 = ρ. Similarly w3 + w4 = (1 − ρ). In addition w3 = (1 − ρ)γ. So, given an arbitrary (w1, w2, w3, w4), put ρ = w1 + w2, τ = w1/(w1 + w2) and γ = w3/(w3 + w4). Regardless of the choice of player 2, if x = (x1, …, x8) is a mixed strategy for player 1, then the behavior strategy b = {(ρ, 1 − ρ), (τ, 1 − τ), (γ, 1 − γ)} generates exactly the same distribution over endpoints of the game as does x, where ρ = x1 + x2 + x3 + x4, τ = (x1 + x2)/(x1 + x2 + x3 + x4), and γ = (x5 + x7)/(x5 + x6 + x7 + x8). These formulas can be verified directly on the strategic form G8.
9.3.4 Strategic equivalence with perfect recall
The following discussion formalizes the notion of perfect recall. The key result is that in games with perfect recall there is no strategic advantage to using mixed strategies over behavior strategies: with perfect recall, any distribution over endpoints of a game generated by a mixed strategy can be replicated by a behavior strategy. Recall that any nonterminal node e has one or more successors. In the game, the successor reached is determined by the choice of the player who owns the information set containing e. Write s(e, c) to denote the node in the game reached by choice c at e. Suppose that a player makes a choice c at some information set I. The player does not know which node in the information set is actually reached. If the node is e, the choice c will move the game to node s(e, c); if the node is e′, the choice c will move the game to node s(e′, c), and so on. So, if the player makes the choice c, then the player will know that the node reached by the choice must be in the set ∪_{e ∈ I} s(e, c). If the player makes the choice c at information set I, in subsequent choices the player should remember this information: the player should know that the only paths possible are those through ∪_{e ∈ I} s(e, c). Formally, let T denote the nodes in the game tree, and define T(e) as the set of nodes that follow e—a node e′ is in the tree T(e) if there exists a sequence of nodes in T, e1, …, er, such that e = e1, e′ = er and, for j = 1, …, r − 1, (ej, ej+1) is an edge in T.

Definition 9.1. Player i has perfect recall if for any nodes e, e′ ∈ Pi (nodes at information sets owned by i) with e ∈ I and e′ ∈ I′, e′ ∈ T(e) implies that ∃ a unique choice c at I such that I′ ⊆ ∪_{ê ∈ I} T(s(ê, c)).
In the following theorem, an arbitrary mixed strategy for a player is replaced by its derived behavior strategy. The theorem asserts that in games with perfect recall the derived strategy generates the same distribution over outcomes and payoffs as the original mixed strategy, for any mixed strategies of the other players.

Theorem 9.1. Let Γ be an extensive form game where player i has perfect recall. Fix a mixed strategy for i, xi. Let b*i be the behavior strategy derived from xi, and let x*i be the mixed strategy derived from b*i. Then, for any strategies x−i of the other players, the profiles (xi, x−i), (b*i, x−i), and (x*i, x−i) induce the same distribution over terminal nodes.

So, the behavior strategy b*i, which is identical to the mixed strategy x*i in terms of the distribution over payoffs and terminal nodes, yields the same expected payoff as the original mixed strategy xi. See Kuhn (1953) and Aumann (1964).
Bibliography
Aumann, R. J. (1964). "Mixed and Behavior Strategies in Infinite Extensive Form Games," in Melvin Dresher, Lloyd Shapley, and Albert W. Tucker (eds.), Advances in Game Theory. Princeton, NJ: Princeton University Press.
Kohlberg, E. and Mertens, J.-F. (1986). "On the Strategic Stability of Equilibria," Econometrica, 54, 1003–1038.
Kreps, D. and Wilson, R. (1982a). "Sequential Equilibria," Econometrica, 50, 863–894.
Kreps, D. and Wilson, R. (1982b). "Reputation and Imperfect Information," Journal of Economic Theory, 27, 253–279.
Kuhn, H. (1953). "Extensive Games and the Problem of Information," in H. Kuhn and H. Tucker (eds.), Contributions to the Theory of Games, Vol. 2. Princeton, NJ: Princeton University Press, pp. 193–216.
Myerson, R. B. (1978). "Refinement of the Nash Equilibrium Concept," International Journal of Game Theory, 7, 73–80.
Selten, R. (1975). "Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games," International Journal of Game Theory, 4, 25–55.
10 Equilibrium in Extensive Form Games

10.1 Introduction
This chapter describes the main equilibrium notions in extensive form games. Because any extensive form game can be converted to a strategic form game by defining the corresponding pure strategies, existence of equilibrium may be discussed in terms of the strategic form game associated with the extensive form. However, this transformation conceals the information structure of the game along with features such as the ordering of moves and so on. Nash equilibria in the strategic form do not involve consideration of behavior throughout the game tree that may support or justify choices made on the (equilibrium) path determined by the Nash strategies. So, one cost of moving to the strategic form is the loss of the structural details and information in the extensive form game that could provide clues in the identification of equilibria. This suggests that it might be necessary to consider equilibrium defined in terms of the extensive form game. With this in mind, three such equilibrium notions, perfect, sequential, and perfect Bayesian equilibria, are discussed. Despite these remarks on the loss of detail in going from the extensive to the strategic form of a game, appropriate solutions in the strategic form can generate equilibria that have all the features of sequential equilibrium: proper equilibria in the strategic form generate sequential equilibria in the extensive form. So, from this perspective, the distinction between extensive and strategic forms in considering equilibria is blurred. Still, whatever the merit of this point, from a computational perspective finding equilibria directly in the extensive form is natural. One example below, the "chain store paradox," illustrates this point. In that example, a multiperiod game is considered and a sequential equilibrium is identified—although the analysis in the strategic form would be impractical.
Section 10.2 provides an overview of the issues involved in defining equilibrium; here, Nash equilibrium and equilibria computed by backward induction are described. For games without perfect information, the need to analyze decision-making at "larger" information sets (having more than one node) motivates other solution concepts. In Sections 10.3, 10.4, and 10.5, perfect, sequential, and perfect Bayesian equilibria are described. The connection between equilibria in the strategic form and equilibria in the extensive form is considered in Section 10.6: a proper equilibrium in the strategic form game derived from an extensive form game may be identified with a sequential equilibrium of the extensive form game. Finally, in Section 10.7 the chain store paradox is discussed. This example illustrates the application of sequential equilibrium and provides insight into the role of beliefs in determining behavior.
10.2 Extensive and Strategic Form Equilibria
Given an extensive form game Γ, for each player i there is a set of pure strategies, Σi, and a payoff function ui: Σ → R, Σ = ×Σi. So, one can directly define Nash equilibrium in the strategic form of the game. Consider the game Γ1 with strategic form G1, where player 2 chooses the row.

In the extensive form Γ1, the behavior strategy for 1 is a distribution (α, 1 − α) on {T, B}, and for player 2 the behavior strategy is given by a pair: a distribution (β, 1 − β) on {H, L} and a distribution (γ, 1 − γ) on {U, D}. One obvious equilibrium is α = 1, β = 1, and γ ∈ [0, 1]. From the strategic form, the strategies (HU,T), (HD,T), (LU,B), and (LD,B) are all pure strategy Nash equilibria. However, for player 2 to play L at the {H, L} information set seems odd, since it is strictly worse than playing H. It occurs as a possibility because if 1 plays B, the choice of 2 has no impact on payoffs. The strategy (LD,B) in the strategic form corresponds to the behavior strategy α = 0, β = 0, and γ = 0 in the extensive form—where the β component involves suboptimal behavior given that the {H, L} information set is reached.
In this game, a natural way to determine how players will behave is to work backward from the end of the game. If the {H, L} information set is reached, then player 2 has to move and the (unique) best choice is H. At the {U, D} information set, both U and D are equally good, since both yield a payoff of 0. Thus, at player 1's information set, the (unique) best choice is T—since player 2 will then choose H and both will receive a payoff of 1, while choosing B guarantees a payoff of 0. Determining behavior in this way is called backward induction. (And an equilibrium found this way is called a subgame perfect equilibrium.) Such equilibria are Nash equilibria, but not all Nash equilibria satisfy this criterion. While Γ1 has two pure strategy backward induction equilibria, (HU,T) and (HD,T), there are four pure strategy Nash equilibria.
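As an illustration of the procedure, here is a minimal backward-induction sketch (mine, not the book's); the tree encoding and the sample payoffs are hypothetical, chosen only to mimic the structure of Γ1.

```python
# Backward induction on a finite perfect-information tree. A tree is either a
# terminal payoff vector (a list) or a pair (player, {choice: subtree}); the
# function returns the payoff vector reached and the optimal plan at each node.

def backward_induction(tree, plan=None, path=()):
    if plan is None:
        plan = {}
    if isinstance(tree, tuple):
        player, moves = tree
        # Solve each subtree, then let the mover pick the payoff-maximizing choice.
        solved = {c: backward_induction(sub, plan, path + (c,))[0]
                  for c, sub in moves.items()}
        best = max(solved, key=lambda c: solved[c][player])
        plan[path] = best
        return solved[best], plan
    return tree, plan   # terminal node: a payoff vector (u_1, ..., u_n)

# A hypothetical two-player tree in the spirit of Gamma_1: player 0 moves
# first; player 1 replies only after "T".
game = (0, {"T": (1, {"H": [1, 1], "L": [0, 0]}),
            "B": [0, 0]})
value, plan = backward_induction(game)
print(value)   # [1, 1]
print(plan)    # {('T',): 'H', (): 'T'}
```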
10.2.1 Subgames and subgame perfection
Given an extensive form game Γ, subgames are extensive form games derived from Γ. If e is a nonterminal node in Γ, one can identify the set of nodes that are reachable from e: a node e′ is reachable from e if there is a collection of vertices in Γ, (e1, …, er), such that ej+1 is a successor of ej, j = 1, …, r − 1, e1 = e and er = e′. Let T(e) consist of e and the set of nodes reachable from e. Recall that an information set, I, of a player is a collection of nodes. Suppose that for every information set I in the game Γ, if I ∩ T(e) ≠ ∅, then I ⊆ T(e). Then the set of nodes T(e) defines a game which is called a subgame of Γ, denoted Γ(e). In words, if an information set, I, contains nodes that are in the subtree T(e), then all nodes in that information set are contained in the subtree. In the game Γ2, there are two subgames, Γ(e1) and Γ(e2). Consider the node e4, and observe that T(e4) = {e4, z5, z6}. Note that e4 lies in an information set, I, but so also does e3, and e3 ∉ T(e4); so I ∩ T(e4) ≠ ∅, but I ⊄ T(e4). Thus the set of nodes T(e4) does not define a subgame.
A subgame of an extensive form game is itself an extensive form game. Therefore it is possible to consider equilibria of any subgame. Note that strategies in an extensive form game assign actions at every information set and so implicitly assign or induce strategies on every subgame.
Definition 10.1. A Nash equilibrium is a subgame perfect equilibrium if the strategies induced on any subgame define a Nash equilibrium on that subgame.

This criterion addresses the objection to the equilibria (LU,B) and (LD,B) raised above, since Nash equilibrium on the subgame reached in Γ1 when player 1 chooses T requires that player 2 choose H. However, the same objection arises in games with no strict subgames, so that the criterion of subgame perfection does not fully address the issue of suboptimal behavior at some locations in the game tree. The following game, Γ3, illustrates the problem when a game has no subgames.
In this game there are two Nash equilibria: (C,H) and (T,L). Both are also subgame perfect—since the game has no subgames (apart from Γ3 itself). However, the second equilibrium involves player 2 playing L when called on to play; and at player 2's information set, L gives a payoff of 0 whereas H gives a payoff of 1. From the perspective of player 2 at that information set, the only uncertainty relates to whether the game is at ea or eb. But whichever is the case, choosing H is strictly better. Suppose that player 2 assigns probabilities (πa, πb) to these nodes. Then the expected payoff from H is πa · 1 + πb · 1 = 1, and the expected payoff from L is πa · 0 + πb · 0 = 0. Whatever the "beliefs" (πa, πb), H is strictly better. In the case where 1 plays T (with probability 1), there is 0 probability that the information set is reached. So, if in this example equilibrium required that player 2's choice at the information set be justified by some belief, then in no such equilibrium can 2 choose L. Perfect equilibrium, perfect Bayesian equilibrium, and sequential equilibrium are three extensive form equilibrium concepts which require optimizing behavior at every information set.
10.3 Perfect Equilibrium
Consider an extensive form game Γ. Recall that the behavior strategy of player i at information set I_i^k, with set of choices C_i^k, is b_i^k ∈ Δ(C_i^k), the set of probability distributions on those choices. And, let bi = (b_i^1, …, b_i^{m_i}) be a behavior strategy for i in the game—recall mi is the number of information sets of i. With this behavior strategy, the probability of choice c ∈ C_i^k is b_i^k(c), and ∑_{c ∈ C_i^k} b_i^k(c) = 1.
For a fixed profile of behavior strategies, not all information sets are necessarily reached with positive probability. In Γ3, if player 1 plays T with probability 1, then, for example, player 2's information set is reached with probability 0. Therefore, whether player 2 plays optimally at this information set or not has no impact on player 2's expected payoff. If, however, player 1 plays C or D with positive probability, then player 2's behavior at the information set does affect 2's payoff, forcing player 2 to optimize at that information set. Perfect equilibrium requires optimizing behavior at every information set. The way this is achieved is by defining a perturbed game where every information set is reached with positive probability, finding the equilibria of the perturbed game, and then considering limiting behavior (limiting strategies) as the perturbation goes to zero. The following discussion describes the procedure.

For each information set I_i^k, let ε_i^k: C_i^k → (0, 1) with ∑_{c ∈ C_i^k} ε_i^k(c) < 1, and write εi = (ε_i^1, …, ε_i^{m_i}), with εi ≫ 0 denoting the condition that ε_i^k(c) > 0 for all k and for all c ∈ C_i^k. The requirement b_i^k ≥ ε_i^k requires that at information set I_i^k, the behavior strategy plays choice c with probability of at least ε_i^k(c), for each c ∈ C_i^k. Define

Bi(εi) = {bi : b_i^k(c) ≥ ε_i^k(c), ∀k, ∀c ∈ C_i^k}.

So a behavior strategy of i in the set Bi(εi) plays every choice of i at every information set with positive probability. Write ε = (ε1, …, εn) to denote the collection of restrictions across all players, with ε ≫ 0 meaning εi ≫ 0 for all i. This defines a perturbed game, Γ(ε), from the original game Γ, where the (behavior) strategy space of player i is Bi(εi), i = 1, …, n. In this notation, Bi = Bi(0) is the strategy set of i in the unperturbed game, and Γ(0) is the unperturbed game. Let EQ(Γ(ε)) denote the set of equilibria of this perturbed game.

Definition 10.2. A strategy profile b* = (b*1, …, b*n) (b*i ∈ Bi for all i) is a perfect equilibrium if there is a sequence of strictly positive perturbations {εl} converging to 0 such that the corresponding sequence of games has a sequence of equilibria converging to b*: bl ∈ EQ(Γ(εl)), ∀l, and lim_{l→∞} bl = b*.

Although perfection in the extensive form is based on perturbations similar to those in the strategic form, the implications are very different. In particular:
• a perfect equilibrium in the extensive form may not be perfect in the strategic form (see Γ4).
• a perfect equilibrium in the strategic form may not be perfect in the extensive form (see Γ5).
• perfection in the extensive form does not eliminate weakly dominated strategies (see Γ4).
These observations are illustrated by the following examples.
In the game Γ4, player 1 moves U or D; if U, then 1 moves again; if D, then 2 moves. Two extensive form perfect equilibria are (UT;H), where (α, β) = (1, 1) and γ = 1, and (DT;H), where (α, β) = (0, 1) and γ = 1. In the strategic form game there is a unique perfect equilibrium, (UT;H); in particular, (DT;H) is not a perfect equilibrium in the strategic form. Furthermore, (DT;H), although a perfect equilibrium in the extensive form, is weakly dominated in the strategic form.
In Γ5 player 1 moves U or D; if U, then 1 moves again, following which 2 chooses H or L. There is a unique subgame perfect equilibrium: when player 1 moves at the second information set there is a strictly dominant choice, T. On this subgame there is a unique Nash equilibrium, yielding payoff outcome (4, 1). However, in the strategic form the strategy profile (DT,L) is a perfect equilibrium—achieved in the limit by (ε, ε, 1 − 3ε, ε) and (ε, 1 − ε). As an aside, from the point of view of proving existence of equilibrium, one can consider Bi(εi) to be the set of strategies of i, or one can consider the set of strategies to be the set of probability distributions on Bi(εi). The former is more natural, but the expected utility function is not linear in behavior strategies: in the unperturbed game with strategy spaces Bi(0), the set of best responses in behavior strategies is not convex, as the game Γ6 illustrates.
In this one-player game, Γ6, a behavior strategy has the form b = [(α, 1 − α), (β, 1 − β)]. Here b* = ((1, 0), (1, 0)) is a best response, as are two other behavior strategies shown in the figure; all three give a payoff of 1. However, b′ = ((½, ½), (½, ½)), a convex combination of two of these best responses, gives a payoff of ¾ and so is not a best response. However, in the perturbed game with strategy spaces Bi(εi), the set of best responses is convex.
Computational issues
The concept of perfect equilibrium was introduced by Selten (1975). By perturbing actions, players are led to optimizing behavior at each information set where a decision must be made. One difficulty with perfection lies on the computation side: since equilibrium must be determined for a sequence of perturbed models approaching the original game, the computation of equilibrium can be difficult. This in part motivates the concept of sequential equilibrium, which also requires optimality at every information set, but only as a limit requirement along a sequence of perturbed strategies. In the limit, as with perfection, optimizing behavior is required at every point where a player could possibly have to make a decision (i.e. at every information set). Sequential equilibrium was introduced by Kreps and Wilson (1982a). Perfect Bayesian equilibrium is motivated by similar considerations of computational simplicity, again requiring optimal behavior at every information set.
10.4 Sequential Equilibrium
Given a behavior strategy profile b = (b1, …, bn), suppose that for each i, b_i^k(c) > 0 for every information set I_i^k and every choice c. Then Bayes' rule can be applied to determine the posterior probability of any node in the game tree. For each e ∈ V, probb(e) > 0, where V is the set of nodes in the game and probb(e) is the probability of reaching node e when the behavior strategy profile is b. Given an information set I_i^k, the probability of reaching the information set is equal to the sum of the probabilities of reaching its nodes: probb(I_i^k) = ∑_{e ∈ I_i^k} probb(e). From these two probabilities one can compute the probability of a given node in the information set, given that the information set is reached: for e ∈ I_i^k, probb(e | I_i^k) = probb(e)/probb(I_i^k). Label such a conditional distribution π_i^k, where dependence on b is implicit. Thus, π_i^k(e) ≥ 0 and ∑_{e ∈ I_i^k} π_i^k(e) = 1, for all i and k. The information set I_i^k is a collection of nodes in the game, so that Δ(I_i^k) is the set of distributions on these nodes. Thus, π_i^k ∈ Δ(I_i^k). For player i, write πi = (π_i^1, …, π_i^{m_i}), so that πi ∈ Πi = ×_k Δ(I_i^k). Put π = (π1, …, πn). Call such a profile of distributions, π, a system of beliefs. A sequential equilibrium is defined by two components—a behavior strategy profile, b = (b1, …, bn), and a belief system, π = (π1, …, πn). Recall that the set of behavior strategies of i is Bi, and let B = ×i Bi and Π = ×i Πi.

Definition 10.3. Consistency and sequential rationality.
• Call a pair (b, π) consistent if ∃ a sequence (bn, πn) ∈ int(B × Π), where πn is derived from bn according to Bayes' rule, and where (bn, πn) → (b, π).
• Call a pair (b, π) sequentially rational if ∀i and ∀k = 1, …, mi, the behavior prescribed by bi maximizes i's expected payoff conditional on reaching I_i^k, computed using the beliefs π_i^k and the strategies of the other players.

Here, int(X) denotes the interior of the set X. Consistency and sequential rationality are clarified further in the following comments.
Consistency
When a behavior strategy profile bn is fully mixed (in the interior of B), every node is reached with positive probability, so that Bayes' rule uniquely determines πn. As bn → b, (bn, πn) or some subsequence will also converge. Note, however, that a different sequence b̂n, with b̂n → b, may generate a belief sequence with a limit different from that of the sequence πn. So, it may be that (bn, πn) → (b, π) and (b̂n, π̂n) → (b, π*) with π ≠ π*. For example, in Γ7, the distribution (1 − ε − ε², ε, ε²) on the three choices {T, C, D} generates the distribution on the two nodes of the information set reached by C and D given by (ε/(ε + ε²), ε²/(ε + ε²)), with limit (1, 0) when ε → 0; whereas the distribution (1 − ε − ε², ε², ε) generates (0, 1) as the limiting distribution on those nodes as ε → 0, although both sequences of strategies have the same limit, (1, 0, 0). Thus, in general there will be many belief systems that are consistent with a given behavior strategy. However, the consistency condition does impose nontrivial restrictions on the beliefs, depending on the structure of the game; a numerical check of the two sequences just described is sketched below, and the next example illustrates the restrictions.
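```python
# A quick numerical check (my own sketch; the choice labels are assumptions)
# of the two perturbation sequences: both converge to (1, 0, 0) on {T, C, D},
# but they induce different limiting beliefs on the information set's nodes.

def posterior(p_C, p_D):
    """Bayes posterior on the two nodes, reached via choices C and D."""
    total = p_C + p_D
    return p_C / total, p_D / total

for eps in [0.1, 0.01, 0.001]:
    print(posterior(eps, eps ** 2), posterior(eps ** 2, eps))
# First sequence:  (eps/(eps + eps**2), ...) -> (1, 0) as eps -> 0.
# Second sequence: the roles of C and D are swapped,  -> (0, 1).
```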
In Γ7, let bn and b̂n be sequences of strictly positive behavior strategies. Since these are strictly positive, Bayes' rule can be used to determine the posterior distributions on the points in the two information sets; let (πc, πd) and (πf, πg) denote the corresponding distributions. Under Bayes' rule, each posterior is determined by the relative probabilities of the choices leading to the nodes of the information set, and the same relative probabilities determine the posteriors at both sets. So, the consistent posterior distributions at the two information sets are the same: (πc, πd) = (πf, πg).
Sequential rationality
As defined, the belief system allows for the computation of expected payoffs conditional on reaching any information set, including those that may have zero probability under b. The presence of a distribution over the nodes of an information set, π_i^k, for every information set, makes the computation possible even when the information set has 0 probability under b. To compute the expectation, take any node e ∈ I_i^k. From the behavior strategy b, the distribution over terminal nodes of paths through e can be computed—and hence the expected payoff to i given e, Eb{payoff to i | e}. The overall expectation may then be computed using the given distribution on the nodes of I_i^k:

E_{b,π}{payoff to i | I_i^k} = ∑_{e ∈ I_i^k} π_i^k(e) · Eb{payoff to i | e}.

With these concepts in place, sequential equilibrium may be defined.

Definition 10.4. The pair (b, π) is called a sequential equilibrium if (b, π) is consistent and sequentially rational.
The following example illustrates the computation of a sequential equilibrium.
In this game player 1 chooses U, C, or D; player 2 then chooses T or B; and finally 1 moves again, recalling 1's initial choice but not observing the choice of 2. One feature of this game is that sequentially consistent posterior distributions at player 1's two later information sets satisfy (πc, πd) = (πe, πf) = (π, 1 − π). The parameter ε satisfies 0 < ε < ½. At the first of these sets, the expected payoffs from h and l are, respectively, π(1 + ε) and (1 − π)(1 + 2ε); at the second, the expected payoffs from t and b are, respectively, (1 − π)(1 + ε) and π(1 + 2ε). First, observe that there is no sequential equilibrium where player 2 plays a pure strategy:
• If 2 chooses T with probability 1, the pure strategy of 1, (C,h,b), is a best response, and every best response includes the choice of C and b, leading to the payoff (1 + 2ε, 0). In that case, 2 receives 0 and a deviation to B raises 2's payoff to ε.
• If 2 chooses B with probability 1, any best response of 1 includes U and l, giving the payoff (1 + 2ε, 0) and 0 to 2. A deviation to T raises 2's payoff from 0 to ε.

In the next step, an equilibrium strategy is conjectured for 2, and it is shown to be part of a sequential equilibrium. Suppose that 2 plays the strategy (y, 1 − y) = (½, ½).
• Then at the {h, l} information set, the expected payoff to player 1 from h is ½(1 + ε) and the expected payoff from l is ½(1 + 2ε), so player 1 will play l there. At the {t, b} information set, the expected payoff from t is ½(1 + ε) and the expected payoff from b is ½(1 + 2ε), so 1 will play b there.
• If 1 chooses l and b at these information sets, then:
– conditional on being at a, the payoff to 2 from choice T is ε, and from choice B it is 0;
– conditional on being at b, the payoff to 2 from choice T is 0, and from choice B it is ε.
• Thus, the expected payoff for 2 from choice T is πa · ε + πb · 0 = πa · ε and the payoff from choice B is πa · 0 + πb · ε = πb · ε. If πa = πb = ½, y = ½ is optimal.
• If (y, 1 − y) = (½, ½), then the expected payoff to 1 from choice U is ½(1 + 2ε) and the expected payoff from choice C is also ½(1 + 2ε). Since ε < ½, ½(1 + 2ε) < 1, so that the choice D (which yields 1) gives a higher payoff.
Consider strategy sequences converging to these limits: for player 1's initial move, (εn, εn, 1 − 2εn), and, at the two later information sets, strictly positive sequences with the weights in order of the strategy choices from top to bottom. With εn → 0, these converge to (0, 0, 1), (1, 0), (0, 1), respectively, and the corresponding beliefs are (πa, πb) = (πc, πd) = (πe, πf) = (½, ½). This defines the sequential equilibrium.

Implicitly, the criterion of perfection has the consistency and sequential rationality properties. Given the equilibrium sequence bl ∈ EQ(Γ(εl)) with bl → b*, since bl is strictly positive, a unique system of beliefs πl is determined by bl, and play is necessarily optimal relative to this belief system at every information set. The sequence (bl, πl) (or a subsequence) converges to (b*, π*), and this pair is consistent and sequentially rational. However, the criterion of perfection is stricter, since it requires that the test sequence for the perturbed game is an equilibrium of the perturbed game; sequential equilibrium requires that the equilibrium conditions be satisfied only in the limit. Thus, every perfect equilibrium is a sequential equilibrium, which in turn is a Nash equilibrium. The following example shows that perfection is more demanding than the sequential requirement.
In this game player 1 moves U, C, or D; then player 2 chooses H or L. There are two subgame perfect equilibria, (D,L) and (U,H). Considering perfection: in the perturbed game, when player 2 moves at the information set, H gives a strictly higher payoff than L, so the limiting strategy must put probability 1 on H. Therefore, there is a unique perfect equilibrium, yielding payoff outcome (4, 1). However, (2, 2) is also a sequential equilibrium payoff—supported by beliefs obtained from the corresponding strategy sequence or net, (ε, 1 − ε) and (ε², ε, 1 − ε − ε²). In the game Γ3, (T,L) is a Nash equilibrium, but there is no belief system that will support L as a sequentially rational choice, so (T,L) are not strategies in a sequential equilibrium, illustrating how the sequential equilibrium criterion can eliminate certain Nash equilibria.
10.5 Perfect Bayesian Equilibrium
Recall that in the definition of perfect equilibrium, the sequence of strategies consists of equilibria of the perturbed games, where Bayes' rule is applicable at every information set. In sequential equilibrium, the sequence of strategies approaching the candidate equilibrium has a corresponding system of beliefs, and the limiting strategies must be optimal relative to some limit of this system of beliefs. Thus, the beliefs are connected to the strategies directly. In perfect Bayesian equilibrium, beliefs are assigned at each information set, but in a less restrictive manner than in sequential equilibrium. In the specific case of signaling games, these conditions amount to applying Bayes' rule wherever possible (on histories that have positive probability) and using arbitrary beliefs elsewhere. For more general games a number of conditions are imposed, including application of Bayes' rule wherever possible in the extensive form (see Fudenberg and Tirole 1991). In view of the construction of beliefs, the set of perfect equilibria is a subset of the set of sequential equilibria, which in turn is a subset of the set of perfect Bayesian equilibria.
10.6 Proper and Sequential Equilibrium
With the formulation of beliefs and optimization relative to those beliefs, sequential equilibrium captures aspects of the extensive form structure that one might conjecture are beyond the scope of analysis using the strategic form. This issue has been debated at length. One reason for thinking that there is more in the strategic form than one might suspect at first sight is the following result, which asserts that, given an extensive form game, a proper equilibrium of the strategic form induces a sequential equilibrium in the extensive form. (See Kohlberg and Mertens 1986; van Damme 1987.)

Theorem 10.1. A proper equilibrium of a normal form game determines a sequential equilibrium in any tree with that strategic form.

Proof. Let x = lim_{ε→0} xε, where xε is an ε-proper equilibrium. For player i, xε attaches probability x_i^ε(σi) to pure strategy σi. This determines, at each information set I_i^k of i, a behavior strategy b_i^{ε,k} (with b_i^{ε,k}(c) the probability of choice c) and a distribution π_i^{ε,k} on the nodes of I_i^k. Choose a subsequence so that these converge:

b_i^{ε,k} → b_i^k and π_i^{ε,k} → π_i^k, ∀i, ∀k = 1, …, mi.
Suppose that the limits do not determine a sequential equilibrium. Then at some information set of some player i, the limiting behavior is not optimal given the beliefs π and the strategies of the others. Choose such an information set I_i^k with the property that there is no subsequent information set on a path through I_i^k where this occurs. (So I_i^k has no successor information set where i can improve.) Therefore, there are choices ĉ and c̃ at I_i^k with b_i^k(c̃) > 0 and

E_{b,π}{payoff to i | I_i^k, ĉ} > E_{b,π}{payoff to i | I_i^k, c̃},

and since (bε, πε) → (b, π), for ε close enough to 0,

E_{bε,πε}{payoff to i | I_i^k, ĉ} > E_{bε,πε}{payoff to i | I_i^k, c̃}.

Given any two pure strategies of i which reach I_i^k and are identical except at I_i^k, where one chooses ĉ and the other c̃, the one choosing ĉ yields a strictly higher payoff against xε. So, ε-properness implies that the probability attached to the strategy choosing c̃ is at most ε times the probability attached to the paired strategy choosing ĉ.

Let Σi(I_i^k) be the set of pure strategies of i which reach I_i^k. Partition Σi(I_i^k) = Σ̂ ∪ Σ̃ ∪ Σ′, where Σ̂ is the set of strategies in Σi(I_i^k) choosing ĉ, Σ̃ is the set of strategies in Σi(I_i^k) choosing c̃, and Σ′ is the set of remaining strategies in Σi(I_i^k). Write probε for the probability of an event under bε. Each strategy in Σ̃ is dominated by the corresponding strategy in Σ̂ (identical except for the choice at I_i^k), so, with K the number of pure strategies of i,

probε(c̃ chosen at I_i^k) = ∑_{σi ∈ Σ̃} x_i^ε(σi) ≤ ε · K · max_{σi ∈ Σ̂} x_i^ε(σi) ≤ ε · K · probε(ĉ chosen at I_i^k).

Thus, as ε → 0, the probability of c̃ conditional on reaching I_i^k goes to 0, contradicting the assumption that b_i^k(c̃) > 0.
10.7 The Chain Store Paradox
In a market for a homogeneous good, theory suggests that an incumbent should accommodate entry rather than attempt to fight it. The logic is simple: once entry has occurred, fighting hurts not only the entrant but also the incumbent. So, given that entry has occurred, it is optimal for the incumbent to adjust to this new reality. However, in practice this is not observed—giving rise to what is called the chain store paradox. The following discussion addresses this subject in the context of an environment with incomplete information, where the potential entrant does not have full knowledge of the incumbent's strength. The interesting feature is that this creates scope for the incumbent to develop a reputation by acting aggressively in the face of initial entry; and in a multiperiod setting the gains to reputation lie in deterring further entry when multiple potential entrants exist: aggressive behavior builds a reputation which deters potential competition. In this way, the paradox is resolved. The example is a leading illustration of sequential equilibrium.
10.7.1 The complete information model
In the complete information case, there is full information about each party. The entrant can choose to enter (E) or stay out (O). If entry occurs, the monopolist can choose to fight entry (F) or acquiesce (A). Payoffs are labeled (entrant, monopolist). The game Γ10 depicts the extensive form of the game.
With 0 < a, b < 1 there are two Nash equilibria of the strategic form game, (O,F) and (E,A). In the extensive form (E,A) is the unique subgame perfect equilibrium. Any finite repetition produces the same outcome: entry occurs in each period and the monopolist acquiesces. That describes the complete information case.
10.7.2 The incomplete information model
In a variant of this model the information is incomplete—the payoff to the monopolist reflects a strong (S) or weak (W) monopolist. Depending on which of these situations prevails, the respective games are:

In the incomplete information game, the potential entrant does not know if the firm is strong or weak. In what follows, assume that 0 < a, b < 1 and b > p. The
strong monopolist prefers to fight, given entry. The model illustrates reputation building in equilibrium when the future is sufficiently important (3 periods in this example).
The one-period game
The following extensive form game describes the environment in a one-period model. The potential entrant does not know if the firm is strong or weak. The firm's characteristic is drawn (strong with probability p), and without knowing this the potential entrant decides to enter (E) or stay out (O). The upper branch, with probability p, represents the "draw" of the strong monopolist.
If the entrant stays out, the entrant's payoff is 0. If the entrant enters, the monopolist will fight if strong and acquiesce if weak. So, the payoff to entry for the entrant is (1 − p)b + p(b − 1) = b − p. Therefore, entry occurs only if b − p > 0. And in this case there is a unique subgame perfect equilibrium where the entrant enters (E), the strong monopolist fights (F), and the weak monopolist acquiesces (A). This can be seen from the strategic form of the game, where AF, for example, means the strong firm acquiesces and the weak firm fights:
Here, for example, the entry corresponding to (E, AA) is determined as p · (b, −1) + (1 − p) · (b, 0) = (b, −p). Assuming b − p > 0, (E,FA) and (O,FF) are equilibria, with the first equilibrium corresponding to the subgame perfect equilibrium. So, in the extensive form one-shot game, entry occurs if and only if b > p.
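The following sketch (mine, not the book's) tabulates the expected payoffs of this strategic form for sample parameter values; the monopolist's per-type payoffs (0 for the preferred reply, −1 for the other) are an assumption consistent with the (E, AA) entry computed above.

```python
# Expected payoffs in the one-shot entry game when the entrant enters and the
# monopolist's type-contingent strategy is a pair (reply if strong, reply if
# weak). Entrant gets b - 1 against F and b against A; the strong type gets 0
# from F and -1 from A, the weak type gets 0 from A and -1 from F.

def payoffs(p, b, strategy):
    rs, rw = strategy                      # replies of the strong and weak type
    ent = lambda r: b - 1 if r == "F" else b
    mon_strong = 0 if rs == "F" else -1
    mon_weak = 0 if rw == "A" else -1
    entrant = p * ent(rs) + (1 - p) * ent(rw)
    monopolist = p * mon_strong + (1 - p) * mon_weak
    return entrant, monopolist

p, b = 0.3, 0.6                            # b > p: entry is profitable
for s in ["FF", "FA", "AF", "AA"]:
    print(s, payoffs(p, b, s))
# (E, FA) gives the entrant b - p > 0 and the monopolist 0; against FF the
# entrant gets b - 1 < 0, which is why (O, FF) is also an equilibrium.
```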
The entrant's expected payoff as a function of p is¹²

vE(p) = (b − p) · χ[0,b)(p).

Similarly, the monopolist's expected payoff as a function of p is

v1(p) = a · χ[b,1](p).

When the firm is believed to be strong with sufficiently large probability (p > b), the entrant stays out of the market and the monopolist (strong or weak) gets a payoff of a. Otherwise, entry occurs. In the one-shot game, since b > p, entry occurs; the weak monopolist always acquiesces and the strong monopolist always fights.
The two-period game
Next, consider the two-period model. Here, envisage the incumbent operating in two locations, with a potential entrant in each location, where the potential entrant in the second location waits to see the outcome (entry or no entry) in the first location before deciding whether to enter. Let v denote the expected profit of the monopolist given that entry occurs in the first period. Write qW(F) to denote the probability that the weak monopolist fights, and qW(A) = 1 − qW(F) the probability that the weak monopolist acquiesces; the corresponding probabilities for the strong monopolist are qS(F) and qS(A). So,

prob(F) = p·qS(F) + (1 − p)·qW(F) and prob(A) = p·qS(A) + (1 − p)·qW(A),

where prob(A) and prob(F) are the probabilities that A and F are chosen; and, by Bayes' rule, the posterior probabilities following the choices A and F are

p(A) = p·qS(A)/prob(A) and p(F) = p·qS(F)/prob(F).

What is optimal for the incumbent firm (monopolist)? Suppose that qS(F) and qW(F) are such that both p(A) and p(F) are less than b. Then v1(p(A)) = v1(p(F)) = 0, and fighting yields no second-period benefit. This is dominated by the choice qS(F) = 1, qW(F) = 0, which makes prob(F) = p, prob(A) = 1 − p, p(A) = 0 and p(F) = 1. Then the expected payoff to the monopolist is p·v1(1) = p·a·χ[b,1](1) = p·a. Is there a strategy that does better for the incumbent firm when faced with entry? Note that prob(A)·p(A) + prob(F)·p(F) = p < b, so that it cannot be that both
¹² The function χQ is the characteristic function of Q: χQ(p) = 1 if p ∈ Q, and otherwise χQ(p) = 0.
p(A) and p(F) are as large as b. The only remaining possibility to consider is that b lies between p(A) and p(F). In this case, since choosing F is always preferable for the strong firm facing entry (qS(F) = 1), an optimal choice of qW(F) with p(F) > b will require that p(F) ≥ b. Thus,

p(F) = p/[p + (1 − p)·qW(F)] ≥ b.

Rearranging,

qW(F) ≤ p(1 − b)/[b(1 − p)] = 1/ξ, where ξ = b(1 − p)/[p(1 − b)].

Since b > p, ξ > 1. Then, with qS(F) = 1, the weak incumbent's gain from fighting is the continuation value, at most v1(p(F)) ≤ a, against a cost of 1 in the current period. Since a < 1, the incumbent's payoff is maximized by setting qW(F) = 0, and so acquiescing for sure is optimal for the weak incumbent.
So, in the two-period model the entrant enters if and only if b > p. The payoff to the monopolist in the two-period model, as a function of p, is

v2(p) = 2a·χ[b,1](p) + p·a·χ[0,b)(p).
Thus, in the second period, in equilibrium, entry occurs, the strong monopolist fights, and the weak monopolist acquiesces. If entry did occur in the second period (the equilibrium case), since the strong incumbent fights and the weak one acquiesces, then in the next and final period, conditional on fighting in period 2, the posterior will place probability 1 on the firm being strong and no entry will occur; conditional on acquiescence in period 2, the posterior will place probability 1 on the firm being weak and entry will occur. Finally, if entry did not occur in the second period (an out-of-equilibrium event), the game goes on to the first and final period, where the distribution over types is unchanged, and entry occurs in this period also. (Here periods are indexed by the number of periods remaining, so the two-period game begins in period 2 and ends in period 1.)
The three-period game
For the three-period model, conditional on entry, the expected payoff to the monopolist is

prob(F)·[current payoff from F + v2(p(F))] + prob(A)·[current payoff from A + v2(p(A))].

Again, the relevant case to consider is that where p(F) ≥ b > p(A), and in this case v2(p(F)) = 2a and v2(p(A)) = 0.
• If 2a < 1, this expression is maximized by putting qW(F) = 0 and qS(F) = 1, as in the two-period game.
• If 2a > 1, it is optimal to make qW(F) as large as possible subject to the condition p(F) ≥ b > p(A). So, if 2a > 1, put qS(F) = 1 and qW(F) = p(1 − b)/[b(1 − p)].
Take a > ½, so that 2a > 1. Then qS(F) = 1 and qW(F) = p(1 − b)/[b(1 − p)] (which is required for p(F) ≥ b) is an optimal choice, in which case p(F) = b exactly. Note that the probability that the monopolist fights is

prob(F) = p + (1 − p)·p(1 − b)/[b(1 − p)] = p + p(1 − b)/b = p/b > p, since b < 1.
The expected payoff to an entrant in period 3 is (p/b)(b − 1) + (1 − (p/b))·b = b − p/b = (b² − p)/b. So, in period 3, entry does not occur if b² < p. In contrast, in the two-period game, entry does not occur if b < p. Also, in contrast to the two-period game, where the weak monopolist does not fight, with three periods the weak monopolist fights with probability p(1 − b)/[b(1 − p)], and the total probability of the entrant facing a fight is p/b > p. Summarizing, the game is played as follows, with the parameters satisfying b² < p < b < 1 and 2a > 1. In period 3 the entrant does not enter. If entry does occur in period 3 (out of equilibrium), the strong incumbent fights for sure, and the weak firm fights with probability p(1 − b)/[b(1 − p)]. Then the posterior going into the next period (with two periods then remaining) reaches b and deters further entry. In period 2, given no entry in period 3 (the equilibrium case), entry occurs according to the description of the two-period game—since the distribution on types is still p. If entry occurred in period 3 (out of equilibrium) and the firm acquiesced, the firm is identified as a weak firm and entry occurs in both remaining periods or markets. However, if entry in period 3 occurred and the incumbent fought, then the posterior in period 2, p(F), satisfies p(F) ≥ b and no further entry occurs according to the induced equilibrium. So, in the latter case, the weak firm gets the benefit of a reputation for being likely to be strong (p(F) ≥ b) if it fights in period 3 when confronted with entry. This model is developed in detail in Kreps and Wilson (1982b).
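The Bayes' rule computations behind this equilibrium are easy to verify numerically; the following sketch (my own) checks that the weak type's fight probability makes the posterior after a fight exactly b, that the total fight probability is p/b, and that the entrant's period-3 payoff is (b² − p)/b.

```python
# Reputation arithmetic for the three-period chain store game.

def reputation(p, b):
    q = p * (1 - b) / (b * (1 - p))         # weak type's fight probability
    prob_fight = p * 1 + (1 - p) * q        # total probability entry is fought
    posterior_after_fight = p / prob_fight  # Bayes' rule: strong given fight
    entrant_payoff = prob_fight * (b - 1) + (1 - prob_fight) * b
    return q, prob_fight, posterior_after_fight, entrant_payoff

p, b = 0.5, 0.6                             # satisfies b**2 < p < b < 1
q, pf, post, v = reputation(p, b)
print(q, pf, post, v)                       # posterior equals b = 0.6
print(abs(pf - p / b) < 1e-12,              # fight probability is p/b
      abs(v - (b * b - p) / b) < 1e-12)     # entrant payoff is (b^2 - p)/b
```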
Bibliography
Fudenberg, D. and Tirole, J. (1991). "Perfect Bayesian Equilibrium and Sequential Equilibrium," Journal of Economic Theory, 53, 236–260.
Kohlberg, E. and Mertens, J.-F. (1986). "On the Strategic Stability of Equilibria," Econometrica, 54, 1003–1038.
Kreps, D. and Wilson, R. (1982a). "Sequential Equilibria," Econometrica, 50, 863–894.
Kreps, D. and Wilson, R. (1982b). "Reputation and Imperfect Information," Journal of Economic Theory, 27, 253–279.
Mailath, G. J., Samuelson, L. and Swinkels, J. M. (1993). "Extensive Form Reasoning in Normal Form Games," Econometrica, 61(2), 273–302.
Myerson, R. B. (1978). "Refinement of the Nash Equilibrium Concept," International Journal of Game Theory, 7, 73–80.
Selten, R. (1975). "Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games," International Journal of Game Theory, 4, 25–55.
van Damme, Eric (1987). Stability and Perfection of Nash Equilibria. Berlin: Springer Verlag.
11 Repeated Games

11.1 Introduction
In a repeated game a fixed strategic game G is played at regular intervals over time. At the beginning of each period, players observe the past history of choices before making a decision; payoffs in the repeated game are the accumulated flow of payoffs from each period. Thus, repeated games are extensive form games with a special structure. This structure makes it possible to relate equilibrium in the repeated game to the payoff structure of the stage game G, and much of the analysis of repeated games involves connecting equilibrium payoffs of the repeated game to the stage game payoffs. The framework of a repeated game is described in Section 11.2. Sections 11.2.1 and 11.2.2 formulate payoff functions and strategies for the repeated game, and define Nash and subgame perfect equilibrium. In Section 11.2.3, mixed strategies are discussed. Strictly speaking, a mixed strategy is a distribution on pure strategies; here, the terminology is used to denote randomization at each choice node of a player, and, because the games satisfy perfect recall, behavior and mixed strategies are strategically equivalent. One of the primary insights from the literature on repeated games is that repetition in a competitive environment can lead to cooperation. Section 11.3 illustrates how repetition generates collusive equilibria through trigger strategies. Key issues in the characterization of equilibrium payoffs are discussed in Section 11.4. The characterization depends on a variety of notions (minmax payoffs, identification of the feasible payoff set, observability of histories), which are discussed in Sections 11.4.1 and 11.4.2. In the literature, the two most common ways of evaluating payoff flows are averaging with equal weighting of time periods (limit of means) and discounting. These alternatives are discussed in Sections 11.5 and 11.6. In the discounted case, relating equilibrium payoffs to the individually rational set of payoffs leads to
a key condition called the “dimensionality condition.” This is discussed in Section 11.6.1. Finitely repeated games are described in Section 11.7. In finitely repeated games and games with discounted payoffs, equilibrium characterization utilizes two key ideas, “no gain from one-shot deviation,” discussed in Section 11.8.1, and history independent punishments considered in Section 11.8.2. Finally, games of incomplete information are introduced in Section 11.9.
11.2 The Framework
A strategic form game consists of n players, a strategy set for each player, Ai, and a payoff function, ui: A → R, with A = ×Ai. Denote the game G = {(Ai, ui)}_{i=1}^n. When the game is played for T periods, this defines a T-period repeated game. When T < ∞, the game is finitely repeated and each player chooses an action at each time t ∈ {1, 2, …, T}; write T = ∞ to denote the infinitely repeated game where each player chooses an action at each time t ∈ {1, 2, …}.
11.2.1 Evaluation of payoff flows
Let a = {at}_{t=1}^T be a sequence of actions over time, with at ∈ A. The corresponding payoff flow for i is {ui(at)}_{t=1}^T. Two of the most common criteria for evaluating payoff flows are averaging and discounting. When T is finite, the average payoff is ūT = (1/T)·∑_{t=1}^T ui(at), and the discounted payoff is ∑_{t=1}^T δ^{t−1}·ui(at), or [(1 − δ)/(1 − δ^T)]·∑_{t=1}^T δ^{t−1}·ui(at) if the weights add up to 1. In the infinitely repeated case, the discounted formula is unchanged: (1 − δ)·∑_{t=1}^∞ δ^{t−1}·ui(at). For averaging, it is natural to consider the limit of ūT as T becomes large. However, {ūT} may not converge.¹³ Taking the "lim inf" of the average¹⁴ represents a pessimistic assessment of the payoff flow, while the "lim sup" represents an optimistic view. Other criteria are possible, but for any sequence {ūT}, lim inf ūT ≤ lim sup ūT, although these may well be different—as noted above. The natural requirement for equilibrium is that the average of the payoffs converges, and any deviation produces a payoff flow for which the lim sup of the average is no larger than that obtained by the candidate
¹³ For example, let T0 = 0, T1 = 1, T2 = 2T1, Tk = k(T1 + T2 + ··· + Tk−1), and Sk = T1 + T2 + ··· + Tk. Suppose ui(a1) = 0, ui(a2) = ui(a3) = 1, and, for Sk−1 < t ≤ Sk, ui(at) = 0 if k is odd and ui(at) = 1 if k is even. Then, for T = Sk, ūT ≤ 1/k if k is odd and ūT ≥ (k − 1)/k if k is even. For any T, there is a T′ > T with ūT′ arbitrarily close to 0 and a T″ > T with ūT″ arbitrarily close to 1. So, lim inf ūT = 0 and lim sup ūT = 1.
¹⁴ The lim inf of ūT is defined lim inf_{T→∞} ūT = lim_{T→∞} [inf_{T′ ≥ T} ūT′], while lim sup_{T→∞} ūT = lim_{T→∞} [sup_{T′ ≥ T} ūT′].
equilibrium strategy. This criterion for evaluating payoff flows is called the “limit of means” payoff criterion.
11.2.2 Strategies and equilibrium
When players make choices at time t, they are assumed to have observed the past history of choices. Let H1 = ∅, the empty set, and for t > 1, let Ht = A^{t−1}. The set of all histories is H = ∪_{t≥1} Ht. So, ht ∈ Ht has the form ht = (a1, a2, …, a^{t−1}), where aτ ∈ A and aτi is the choice of player i in period τ. A strategy for i is a collection of functions σi = {σit}_{t=1}^T, where σit: Ht → Ai. A path in the game is a sequence of choices for all players, from beginning to end; the set of paths in the game is A^T. A strategy profile (a strategy for each player) is given by σ = (σ1, …, σn). Each strategy profile generates a path in the game: ā1 = σ1(h1), ā2 = σ2(ā1), ā3 = σ3(ā1, ā2), and so on, defining a path h(σ) = (ā1, ā2, …). Given any history ht ∈ Ht and strategy σ, a path is determined according to āt = σt(ht), ā^{t+1} = σ^{t+1}(ht, āt), and so on. Write h(σ, ht) = (ht, ht(σ, ht)) to denote the path determined in this way, so that ht(σ, ht) denotes the component determined from period t on, given the history ht and the strategy σ. For notational symmetry, write h1(σ, h1) to denote the path determined by the strategies from the start of the game (i.e. h(σ)), although h1, the history prior to the start of the game, is empty. A strategy profile σ determines a path in the game with corresponding reward—the payoff, Vi(h(σ)), associated with the strategy, where Vi(h) denotes i's evaluation of the payoff flow along path h. In this framework, Nash equilibrium and subgame perfect equilibrium are defined as follows.

Definition 11.1. Nash and subgame perfect equilibria are defined:
1. A strategy profile σ is a Nash equilibrium if ∀i and every strategy σ̃i of i, Vi(h(σ)) ≥ Vi(h(σ̃i, σ−i)).
2. A strategy profile σ is a subgame perfect equilibrium if ∀t, ∀ht, ∀i, and every strategy σ̃i of i, Vi(h(σ, ht)) ≥ Vi(h((σ̃i, σ−i), ht)).
A subgame perfect equilibrium is a Nash equilibrium, but a Nash equilibrium of the repeated game may not be subgame perfect. For each ht, subgame perfection requires that strategies form a Nash equilibrium given that history or subgame.
11.2.3 Mixed strategies
Given a strategic form game G = {(Ai, ui)}, where Ai is the pure strategy space, write Gm = {(Xi, ui)} for the mixed extension of the game—Xi is the mixed strategy space for player i, the set of probability distributions on Ai. Let X = ×i Xi and x = (x1, …, xn) ∈ X
. For x ∈ X, the payoff to i is ui(x) = ∑_{a ∈ A} x(a)·ui(a), where x(a) = ∏j xj(aj) (in the case where A is finite). In this case, the definition of a strategy in the extensive form game is modified to σit: Ht → Xi. With this definition, player i observes the past history of choices made, but not the way players randomized; the (common) presumption is that the actual choices made are observed, but the selection process is not. In this case a mixed strategy profile σ determines a distribution, μσ, over outcome paths in the game. The payoff to player i is then Eσ{Vi(h)}. Nash equilibrium requires Eσ{Vi(h)} ≥ E_{(σ̃i, σ−i)}{Vi(h)} for every σ̃i, and subgame perfection requires the analogous inequality conditional on every history: Eσ{Vi(h) | ht} ≥ E_{(σ̃i, σ−i)}{Vi(h) | ht}, ∀i, ∀ht. If instead the randomization procedure is observed, then mixed strategies are said to be observable. In this case a history at time t is a vector (x1, x2, …, x^{t−1}), where xτ ∈ X, τ = 1, …, t − 1; then Ht = X^{t−1}, and a strategy is σi = {σit}, where σit: Ht → Xi. Equilibrium and subgame perfect equilibrium in the extensive form game are defined in the same way as for pure strategy equilibrium.
11.3 The Impact of Repetition
Repeating a one-shot game raises the possibility of sustaining outcomes that cannot be achieved as equilibria of a one-shot game. The following example, the Prisoner's dilemma, illustrates. The Prisoner's dilemma game is given by the following matrix:
This game has a unique Nash equilibrium where each player chooses D. But, in the infinitely repeated version of this game, as long as the future is not too heavily discounted, there are many subgame perfect equilibria. Suppose that payoffs are discounted over time by the players, with player i having discount rate δi. Also, suppose that δi > ⅓ and T = ∞. For player i define the trigger strategy: σit(ht) = C if ht = ((C, C), (C, C), …, (C, C)), and σit(ht) = D if ht ≠ ((C, C), (C, C), …, (C, C)). In words, start playing C, and continue to do so at time t if C has been chosen by both players at each time in the past. The payoff to player i if each player uses this strategy is ∑_{t=1}^∞ δi^{t−1}·ui(C, C). Suppose now that one player, i, follows this strategy, but the other, j, defects at time t̂, playing D. The most profitable such deviation yields uj(D, C) at time t̂ and uj(D, D) in every period thereafter, and compares with the payoff ∑_{t=t̂}^∞ δj^{t−1}·uj(C, C) in the absence of deviation. So, the net gain from deviation is

δj^{t̂−1}·[(uj(D, C) − uj(C, C)) − (δj/(1 − δj))·(uj(C, C) − uj(D, D))].

With the payoffs in the matrix, the one-period gain from defection is half the per-period loss from the breakdown of cooperation; since δj > 1/3, this is negative, so player j does not gain from deviation.
11.4 Characterization of Equilibrium Payoffs
In repeated games the set of equilibrium payoffs is closely related to the maximal punishment that can be inflicted on a player. This is called the minmax payoff; it defines a lower bound on the payoff a player will receive in any equilibrium and so identifies a region in which the set of equilibrium payoffs must lie. A payoff vector is called individually rational if each player's payoff lies above their minmax payoff.
11.4.1 Maximal punishments and minmax payoffs
Determining maximal punishment is important because it measures the scope for sustaining "cooperative" behavior under threat of punishment. The greater the punishment available, the larger the set of sustainable outcomes. So, identifying the extent to which a player's payoff can be reduced or held down is central to identifying the possible equilibrium outcomes. In a one-shot game, there are two standard measures of maximal punishment—the minmax and maxmin measures. If a player, i, chooses action ai and then the remaining players choose a−i, a payoff to i is determined. If players other than i make the choice a−i to minimize i's payoff, they solve min_{a−i} ui(ai, a−i). If others act that way, the best that i can do is to choose ai to maximize this: max_{ai} min_{a−i} ui(ai, a−i). This is called the "maxmin" payoff, in view of the order of optimization. Alternatively, suppose that given the choice a−i, i chooses ai to maximize ui(ai, a−i). And, suppose that other players choose a−i to minimize ui(ai, a−i), taking into consideration that i will respond: min_{a−i} max_{ai} ui(ai, a−i). This is called the "minmax" payoff for player i, and again the term follows from the order of optimization. To compare the two, fix any a−i and let a*i solve the maxmin problem. Then

max_{ai} ui(ai, a−i) ≥ ui(a*i, a−i) ≥ min_{a′−i} ui(a*i, a′−i) = max_{ai} min_{a′−i} ui(ai, a′−i),

and since this holds for every a−i, minimizing the left side over a−i gives

min_{a−i} max_{ai} ui(ai, a−i) ≥ max_{ai} min_{a−i} ui(ai, a−i).

So the minmax payoff is at least as large as the maxmin payoff. This inequality may be strict—consider G1:
For player 1, the minmax payoff is v1 = min_b max_a u1(a, b) = 1, whereas the maxmin payoff is max_a min_b u1(a, b) = 0.
The minmax and maxmin payoffs can be (and usually are) defined using mixed strategies. For a two-player game, if x and y are distributions over the strategies of 1 and 2, respectively, the minmax for 1 is then min_y max_x u1(x, y), where u1(x, y) is the expected payoff to 1. Using the same reasoning as above, min_y max_x u1(x, y) ≥ max_x min_y u1(x, y). In fact, with mixed strategies, they coincide: there is a pair (x*, y*) such that ∀x, y, u1(x*, y) ≥ u1(x*, y*) ≥ u1(x, y*). This follows from the existence of Nash equilibrium in the two-player game where player 1's payoff function is u1(x, y) and player 2's payoff function is u2(x, y) = −u1(x, y). If (x*, y*) is an equilibrium of this game, then u1(x*, y*) ≥ u1(x, y*) for all x; and u2(x*, y*) ≥ u2(x*, y) for all y. The latter is equivalent to −u1(x*, y*) ≥ −u1(x*, y) for all y, or u1(x*, y*) ≤ u1(x*, y) for all y. Combining these conditions,

u1(x*, y) ≥ u1(x*, y*) ≥ u1(x, y*), ∀x, y.

The first inequality implies that max_x min_y u1(x, y) ≥ u1(x*, y*), and the second inequality implies that u1(x*, y*) ≥ min_y max_x u1(x, y). So,

max_x min_y u1(x, y) ≥ u1(x*, y*) ≥ min_y max_x u1(x, y),

and since min_y max_x u1(x, y) ≥ max_x min_y u1(x, y) follows for the same reasons as in the pure strategy calculation, min_y max_x u1(x, y) = max_x min_y u1(x, y). So, for two players and mixed strategies, the minmax and maxmin coincide. In the game G1, if player 2 plays the mixed strategy (y, 1 − y), then the expected payoff to strategy a1 of player 1 is y and the expected payoff to strategy a2 is (1 − y). Optimizing against (y, 1 − y) produces a payoff of max{y, 1 − y}, and this is minimized by setting y = ½. The minmax payoff to player 1 is ½, which is strictly less than that computed using pure strategies. When there are three or more players the definition is similar. Let X−i = ×_{j≠i} Xj and define the minmax payoff for i as vi = min_{x−i ∈ X−i} max_{ai ∈ Ai} ui(ai, x−i). Thus, in punishing i, other players randomize independently on their pure strategies. An alternative formulation would allow correlation of punishment: let X*−i = Δ(A−i), the set of probability distributions on A−i, and define v*i = min_{x−i ∈ X*−i} max_{ai ∈ Ai} ui(ai, x−i). Since X−i ⊆ X*−i, v*i ≤ vi: correlated punishments can only be more severe.
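For two-player games the mixed minmax is easy to compute numerically; the following grid-search sketch (mine) reproduces the value ½ for the payoff structure of G1.

```python
# Mixed-strategy minmax for a 2 x 2 game by grid search over the punisher's
# mixture. For G1 (u1(a1,b1) = u1(a2,b2) = 1, zero otherwise) the value is 1/2,
# strictly below the pure-strategy minmax of 1.

def mixed_minmax(U, grid=100001):
    """min over y of max over rows of player 1's expected payoff; U is 2x2."""
    best = float("inf")
    for k in range(grid):
        y = k / (grid - 1)                       # probability of column b1
        row1 = y * U[0][0] + (1 - y) * U[0][1]   # expected payoff of a1
        row2 = y * U[1][0] + (1 - y) * U[1][1]   # expected payoff of a2
        best = min(best, max(row1, row2))
    return best

G1 = [[1, 0], [0, 1]]
print(mixed_minmax(G1))   # 0.5, attained at y = 1/2
```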
11.4.2 Convexity, feasibility, and observability
Apart from the scope for punishment, a number of other issues arise explicitly in relating payoffs to strategies. For example, if players' discount rates differ, then the present values of the individual payoff flows may not be in the convex hull of the feasible set: the payoff might not be achievable by any distribution over actions in the one-shot game. Another issue arises with randomization. Many equilibria are sustained by the threat of retribution for deviation from agreements. But this depends on the detection of deviations, which in turn requires observability of choices. When players randomize and only the selection rather than the randomization is observed, then a deviation may be undetectable. These matters are discussed below.
Observability
Where the game is defined in terms of pure strategies, at each stage the choices made in previous periods are observed. So, in particular, deviations from specified strategies are detected immediately. But when players' strategies involve randomization, observation of the choices made does not allow a player to determine whether a specific strategy was used or not: any choice in the support of the mixed strategy could have been selected, and such a choice is consistent with the strategy—in the sense of having positive probability under the strategy. Observation of the choice does not imply the player has deviated. As a consequence, the inability to detect deviations may limit the scope for enforcing agreements. The assumption of observability of mixed strategies addresses this issue.
Convexity With pure strategies, players can achieve any payoff vector in the set Fp = {v| ∃a ∈A, v = u(a)}, and with mixed strategies, players can achieve any payoff vector in the set Fm = {v | ∃x ∈ X, v = u(x)}. But in a repeated game there are additional possibilities. Consider the following game:
For both players in G2 the minmax payoff is 0. There are two pure strategy equilibria and no mixed strategy equilibria. In the one-period game (1, 0), (0, 1) and (0, 0) are the only possible payoffs with pure strategies—the set Fp. With mixed strategies (x, 1−x) and (y, 1−y) the expected payoffs to players 1 and 2 in the stage game are xy and (1−x)(1−y). If the payoff to player 2 is γ, then the largest possible payoff to 1 is , so the highest possible symmetric payoff is (¼, ¼). Therefore, the set of payoffs with pure or mixed strategies is generally not convex.
186
MICROECONOMIC THEORY
However, in a two-period game, with payoff averaging, strategy choice (a1, b1) in period 1 and (a2, b2) produces an average payoff of (½, ½) = ½(1, 0) + ½(0, 1). This is strictly above any payoff possible with mixed strategies in the one-stage game. More generally, any point u in the convex hull of Fp has the form where a(k) ∈ A and {λk} are nonnegative weights summing to 1. If T is large then one can find such that Tk/T ≈ λk, k = 1, …, K. In that case the payoff obtained by playing a(k) for Tk periods yields an average payoff approximately equal to . As T increases one can choose each Tk such that Tk/T → λk. In that case the corresponding limit of means payoff is u exactly. With discounting, the issue is complicated further. Assuming a common discount rate, δ, any efficient outcome gives player 1 a payoff of the form , where αt ∈ {0,1} and gives player 2 a payoff where βt = 1−αt. Thus the players either play (a1, b1) or (a2, b2) in each period. So, the payoff to player 1 is . Since either α1 =1 or β1 = 1, one player gets at least (1−δ). If δ < ½, there is no way to alternate on (a1, b1) and (a2, b2) to give each player a payoff of ½. Even when δ is large, not all points in the convex hull of {(1,0), (0,1), (0,0)} can be written exactly in the form , with ξt ∈ {(1,0), (0,1), (0,0)}. However, when δ is large any payoff in the convex hull of Fp can be approximated through a sequence of pure strategy choices. One direct way to address the convexity issue is to assume that a public correlation device is available—at each period t, having observed the available information (a point in At−1 or Xt−1, all players see the realization of an appropriate random variable, ξt and choices, pure or mixed, can be conditioned on this.
Feasibility When discount rates differ across players, other issues arise. For example, consider game G2 and suppose that in the first period players play (a1, b1) and in subsequent periods (a2,b2). The payoff to player 1 is . and the payoff to player 2: . So, the payoff to 1 is (1−δ1) and the payoff to 2 is δ2. If δ1 ≈ 0 and δ2 ≈ 1 then each has expected payoff of 1—the payoff vector is approximately (1,1), outside the convex hull of the set of payoffs achievable in the stage game (conv(Fp)). In that case the relative sizes of δ1 and δ2 become crucial and the description of the one-stage game is not enough to characterize the equilibrium set. Note that this observation remains valid even if both discount rates are arbitrarily close to 1. If strategy (a1, b1) is played for T periods, and (a2,b2) thereafter the payoff to 1 is . and the payoff to 2 is Given δ1, for any small ε > 0 pick T(δ1) so that . Next pick δ2(T(δ1)) so that . Again the expected payoffs of both players are close to 1. Most of the discussion here concerns the case of common discount rate—where
CHAPTER 11: REPEATED GAMES
187
these issues do not arise, and what is feasible must lie in the convex lull of payoffs achievable in the stage game.
11.5 Innitely Repeated Games with Averaging Let Si denote either the set of pure strategies or mixed strategies of player i, with and . Recall that a strategy profile s ∈ S is an individually rational outcome if ∀i, . Let γ(i) satisfy . In the following theorem, assume that the strategies in question are pure strategies, or that mixed strategies are observable. Theorem 11.1.With the limit of means criterion, the payoff of the set of Nash equilibrium are the individually rational feasible outcomes. Proof. Let s* be an individually rational outcome and define the following strategy for i. Play si* at t if ht = (s*, …, s*). If ht = (s1, …, st−1) ≠ (s*, …, s*), let τ* = min {t | st ≠ s*} and let , the set of defectors defecting for the first time. If # Di =1, so there is one such player, say i, then let j play γj(i), j=1,…, n. If # Di > 1, let j play sj*, ∀ j. This defines a Nash equilibrium. Theorem 11.2. With the limit of means criterion, the payoffs of the set of subgame perfect equilibria are the individually rational feasible outcomes. Proof. Let s* be an individually rational outcome. Define a strategy σi for i as follows. Put rule associating players to histories as follows (with P(h1) = ∅).
. Define a punishment
So, i is identified for punishment in the current period in two cases: (1) if i was identified for punishment in the previous period, others adhered to the punishment, and i's average payoff was “high” relative to ui(s*); (2) no player was not identified for punishment last period but i defected in the current period and has an average payoff “high” relative to ui(s*). Define
This strategy is a subgame perfect equilibrium.
188
MICROECONOMIC THEORY
In the equilibria above, a fixed strategy is played in equilibrium or subgame perfect equilibrium. This gives points in Fm that are individually rational. The strategies can be played for an appropriate fraction of time to convexify.
11.6 Innitely Repeated Games with Discounting Let be the set of subgame perfect equilibrium payoffs. Let Σi be the set of strategies of i. With Si and vi defined in Section 11.5,
and Σ = ×i Σi. Denote the set of Nash equilibria by Σl and the set of subgame perfect equilibrium by Σs. Then
If S is compact and ui continuous then V is compact and Σs is closed. In the case of discounting, the set V* is difficult (impossible) to characterize. In the study of the structure of the set of equilibrium payoffs with discounting, the primary focus has been on the case of low discounting. In the discounted case, the path s= (s1, s2, … ) generates the payoff flow to i of ; and the payoff vector (to all players) is , where . In the following discussion a common discount factor, δ, is assumed, with δ ≥ δ, for some δ < 1. If idiosyncratic discount rates are given (δi for player i), then the requirement becomes δi ≥ δ, ∀ i, for some δ < 1. Since this involves consideration of the repeated game payoff as δ → 1, payoffs are normalized so that the payoff weights sum to 1: . For the following result, let F = Conv({ũ|∃ s∈ S, ũ=u(s)}) and define:
So, V* is the set of feasible and strictly individually rational payoffs. Theorem 11.3. Suppose that mixed strategies are observable and there is a publicly observed random variable. 1. If u ∈ V*, ∃ δ ∈ (0,1) such that the discounted game has a Nash equilibrium with payoffu. 2. In a two-player game, if (u1, u2) ∈ V*, ∃ δ ∈ (0,1) such that the discounted game has a subgame perfect equilibrium with payoffu. 3. In an n-player game suppose that dim(V*) =n. Then ifu ∈ V*, ∃ δ ∈ (0,1) such that the discounted game has a subgame perfect equilibrium with payoffu.
CHAPTER 11: REPEATED GAMES
189
Proof. (Sketch) For any u ∈ V*, while there may be no mixed strategy x with u(x) =u, ∃ {x1, x2, …, xK} and , where and ρk ≥ 0, ∀k with such that u =∑ku(xk) ρk. The randomization ρ is public correlation device and is used in the following way.15 If k is drawn according to the distribution ρ, player i chooses strategy (i.e. with probability ρk), producing payoff ∑ ρku(xk) =u. With a mild abuse of notation, this can be written u(x), with the understanding that x is the function described. 1. Let x* satisfy u(x*) = u and for each i let . Pick δ so that (1−δ)ui(bi(x*), x−i*)+ δvi < ui for each i where vi is the minmax payoff for i. Define a strategy for i: play xi* in each period until a defection occurs. Following a defection, punish the defector indefinitely with the minmax payoff. If more than one player defects at the same time, continue with x*. 2. Take (u1, u2) ∈ V*, so that ui > vi, i=1,2. Let mi be the minmax strategy used by i to minmax j. Let and choose and δ so that condition (1) where vi* is defined in condition (2) . Condition (1) says that it is better to get ui in present value terms than for one period and vi* thereafter; condition (2) says that getting ui(m1, m2) for periods and ui thereafter is better than the minmax payoff to i. Observe that vi = maxx minyui(x, y) = maxxui(x,mj) ≥ ui(mi,mj), so that ui > vi ≥ ui(mi, mj). To see that both conditions (1) and (2) can be satisfied, note that condition (2) holds if . Substitute condition (2) into (1) to get: . Noting that is increasing in δ and decreasing in , as δ is increased, raising keeps approximately constant (and is approximately constant and less than ui, since ui(mi, mj) < ui and vi* is a weighted average of ui and ui(mi, mj)). Thus choose , raise δ and (to keep approximately constant above k) until (1) is satisfied. Players start playing x, and continue to do so as long as no defection occurs. If more than two defect at the same time the defections are ignored. If there is a sole defector, switch to the strategy pair (m1,m2) for periods and then return to x. According to (1), this eliminates any possible gain from the defection. In the punishment phase where the players play (m1, m2), a defection from mi gives a payoff of vi that period and thereafter vi* (since the deviation initiates a punishment cycle). Since vi* > vi, it is not profitable to
15
The randomization can be implemented as follows. Let ξ be a random variable uniformly distributed on the unit interval. Given any finite collection on nonnegative numbers, , adding to 1, the interval can be partitioned: , so the probability of Pk is ρk . Define a i mixed strategy (with public correlation) for i : xi : [0, 1] → Xi such that if ξ ∈ Pk . With this strategy, if ξ is drawn i chooses x (ξ); ξ ∈ Pk with probability ρk and so i plays with probability ρk . The expected payoff from this strategy is ∫u (x (ξ)) d ξ = ∑ρku (xk ) =u .
190
MICROECONOMIC THEORY
deviate in the punishment phase. For (3) see Fudenberg and Maskin (1986) for the proof. The cases where mixed strategies are not observable and where no public randomizing device is available are discussed in Fudenberg and Maskin (1986) and Fudenberg, Levine and Maskin (1994). The key assumption of full dimensionality ensures that in punishment phases it is possible to simultaneously reward one player while punishing another. This is discussed below.
11.6.1 The dimensionality condition The dimensionality condition requires that the feasible set be of full dimension. The following example (where the dimensionality condition fails) is a game where there is no subgame perfect equilibrium with payoffs close to the individually rational payoffs.
For each player this game has minmax payoff (in pure or mixed strategies) of 0. For any δ <1, there is no subgame perfect equilibrium with payoff lower than ¼. To see this, let v* = inf {v | v ∈ V*}, the lower bound on subgame perfect equilibrium payoffs. Let (x1,1−x1), (x2,1−x2), and (x3,1−x3) be mixed strategies for players 1, 2, and 3 in the onestage game. At least one of the following must hold: (a) x1, x2 ≤ ½, (b) x1, x2 ≥ ½, or (c) in {x1, x2} ≤ ½ ≤ max{x1, x2}. In case (a), (1−x2)(1−x2) ≥ ¼; in case (b) x1x2 ≥ ¼. In case (c) if x3 ≤ ½ then there is some player, i, in addition to 3 with xi ≤ ½, or if x3 ≥ ½, there is some player, i, in addition to 3 with xi ≥ ½. So, always there is a pair j,k with either xjxk ≥ ¼ or (1−xj)(1− xk) ≥ ¼. For i ≠ j,k a best response produces a payoff of at least ¼. Thus, in any subgame perfect equilibrium there is some player who can deviate initially to achieve a payoff of at least (1− δ) ¼ + δ v*. So, v* must satisfy v* ≥ (1− δ) ¼ + δ v* or (1−δ) v* ≥ (1− δ) ¼ or v* ≥ ¼. Thus, the minmax payoff for each player is 0, but the lowest subgame perfect equilibrium is no smaller than ¼, regardless of δ. In this case, the subgame perfect equilibrium payoffs cannot be identified with the feasible individually rational payoff set, regardless of the discount rate.
CHAPTER 11: REPEATED GAMES
191
11.7 Finitely Repeated Games In a finitely repeated game where the stage game has a unique Nash equilibrium, that equilibrium must be played in the last period—so that the outcome in the last period is independent of the history. This leads to the same being true one period before the end, and so on. With a unique Nash equilibrium of the stage game there is a unique subgame perfect equilibrium of the finitely repeated game—play the stage game Nash equilibrium repeatedly. When there is more than one Nash equilibrium in the stage game, any sequence of Nash equilibria of the stage game forms an equilibrium of the repeated game. Any such equilibrium produces a payoff which is a weighted average of payoffs from Nash equilibria. More generally, actions in periods prior to the last period need not be Nash equilibria of the stage game. Consider the following game:
There are two (pure) Nash equilibria (a1, b1) and (a2, b2) with payoffs 1+ε and 1, respectively to each player. (Also, there is one mixed strategy Nash equilibria. Since a mixture of a1 and a2 strictly dominates a3, a3 has zero probability in any equilibrium. Similarly for b3. The mixed strategy equilibrium is: (x1,x2,x3) = (y1, y2, y3) = (1/{(2+ε), {(1+ε)/{(2+ε), 0). The latter yields an expected payoff of (1+ε)/(2+ε) which may be written ½ + ε/2(2+ε).) In the two-period version of this game, any strategy where players play in each period a Nash equilibrium of the stage game gives a subgame perfect equilibrium. However, there are other subgame perfect equilibria. Consider:
The strategies specify that in period 1 both should play their third strategy; in period 2, the final period, both should play their first strategy if the third one was chosen by both initially and otherwise both should play the second strategy. A deviation by either in period 1 produces a gain of at most ε/4 followed subsequently by a loss of ε. So, there is no incentive to deviate in period 1. The period 2 strategies are a Nash equilibrium of the stage game. The average payoff in this equilibrium is ½(0 + 1+ε) = (1+ε)/2. Thus, the average payoff is much lower than either Nash equilibrium payoff of the stage game.
192
MICROECONOMIC THEORY
This suggests that in the finitely repeated case, there is scope for strong subgame perfect equilibrium “punishment” and consequently scope for sustaining a large set of subgame perfect equilibrium payoffs. In a T period game, let wi(T) be the lowest subgame perfect equilibrium payoff for player i. Because an equilibrium in a T period game and an equilibrium in an S period game can be joined to form an equilibrium in a T+S period game, wi(T+S) ≤ wi(T) + wi(S), so wi is subadditive. While wi(T)/T is not monotone decreasing in T it does converge (from subadditivity). So, as T becomes large, it is possible to consider long run worst equilibrium payoffs. Let NE1 denote the set of Nash equilibrium strategies of the one-shot game. Theorem 11.4.Suppose that: 1. for eachj, ∃ γj ∈ NE1anduj(γj) > wj(1), 2. Dim F = n, 3. u is feasible and individually rational payoff. Then given ε > 0, ∃ T0such that ∀ T ≥ T0there is a subgame perfect equilibrium with outcome path (a1, a2, …, aT) and
11.8 Finite Repetition and Discounting In finitely repeated games or games with discounting, the payoff function in the repeated game varies continuously with the strategies. These two types of repeated game have a few useful properties: subgame perfect equilibria are characterized by the “no gain from one-shot deviation,” and optimal punishments can be chosen independent of history (at least for pure strategy equilibria). These points are discussed next.
11.8.1 No gain from one-shot deviation Given the history ht, the path determined in the game under strategy σ is h(σ, ht) = (ht, ht(σ, ht)), where recall, ht(σ, ht), is the continuation path induced by the history ht and the strategy σ. The continuation payoff for i from period t is Vi(ht(σ, ht)), and , where ht = (a1, …, at−1). Suppose at period t player i contemplates a deviation from σi. At this point a history ht = (a1, …, at−1) is in place and say σi specifies . Under σ, the history in the next period will be ht+1 = (a1, …, at−1, at) = (ht, at), where . For i the payoff to strategy σi viewed from this point in time is:
CHAPTER 11: REPEATED GAMES
193
If σ is subgame perfect, then in particular, a single deviation by i at some point cannot be payoff improving. At t, if , consider a deviation by i to . In this case the payoff becomes:
If σ is a subgame perfect equilibrium there is no gain from this one-shot deviation:
This is implied by the definition of subgame perfection. However, the converse is also true: no gain from one-shot deviation at every history, ht, implies subgame perfection. To see this suppose a strategy profile σ satisfies the no gain from one-shot deviation property, but is not a subgame perfect equilibrium. In that case, for some i at some history ht, , so that . So, some gain from the deviation must occur in a finite length of time—before some T′: on the time interval {t, T′} following history ht, the payoff from using is greater than that from using σi. Starting from period T′ and working back there must be a first period, T′ − s where at history hT′−s} the deviating strategy produces a higher payoff. This violates the no gain from one-shot deviation property.
11.8.2 History independent punishments Let ej =(0, …, 1, …, 0) with 1 in the jth position. Let . Thus, v(j) has the property that if , then . Corresponding to v(j) there is a subgame perfect equilibrium γ(j) ∈ Σs. So, γ(j) is the worst possible subgame perfect equilibrium in payoff terms for player j. Viewing only perfect equilibria as credible, γ(j) is the maximal credible punishment that can be inflicted on j. The main point of the following discussion is that in sustaining any perfect equilibrium the collection is sufficient to deter deviation from the equilibrium path. Given a subgame perfect equilibrium, σ, consider any history, ht, and suppose that at ht the strategy profile specifies σt(ht) = at, generating the history (ht, at). Evaluating the payoff from this point, for player i: ui(at) + δiVi(ht+1(σ, (ht,at))). A deviation by i to ãi produces the payoff , so for equilibrium it must be that
Since , following history , if the strategy profile induced by σ on this subgame is replaced by the strategy profile γ(i) the payoff from the deviation for i is v(i) so that i continues to have no incentive to
194
MICROECONOMIC THEORY
deviate. Thus, off the equilibrium path determined by a strategy σ, all deviations can be followed by one of the strategies , depending on the deviator. With multiple deviations at a point, an arbitrary strategy may be chosen.
11.9 Repeated Games of Incomplete Information In a (finite) game of incomplete information, player i of n players has a finite number of possible types, ti ∈ Ti, and action space Ai. Each player chooses an action ai and at type profile t = (t1, …, tn) the payoff to player i is ui(a1, …, an; t1, …, tn). Given a probability distribution p on T = × Ti, the interim expected payoff of player i is . In contrast to games of complete information, repeated games of incomplete information are greatly complicated by the fact that learning about private information (the types) takes place, so that for example, the basis for calculating expected payoffs changes as the game progresses. In what follows, the discussion focuses on a special case: the zerosum game with incomplete information on one side. In this case, the payoffs are given by a collection of matrices , where each Ak has I rows and J columns. The matrix is the payoff matrix when player 1 is type k; if player 1 chooses i and player 2 chooses j and player 1 is type k, then 1 receives and 2 receives (because the game is a zero-sum game). Let xk be a strategy for player 1 type k and let y be a strategy for player 2 and put x = (x1, …, xK). In the one-stage game, the value is given by: v(p) = maxx miny ∑kpkxkAky = miny ∑kpk, where p=(p1,… pK) with pk the probability of type k. Note that V(p) is a concave function. To see this, let pθ = θ pa + (1−θ)pb for two type distributions pa and pb with θ ∈ [0,1]. Then
From the perspective of information value, one can view drawing from the distribution pa with probability θ and from pb with probability (1−θ) as more informative than drawing from the distribution pθ = θ pa + (1−θ) pb since pa and pb are more “extreme.” With better information, player 2 is better able to hold down the payoff to 1.
CHAPTER 11: REPEATED GAMES
195
11.9.1 Strategic information revelation In the game, player 1 makes use of information whenever xk ≠ xk′ (depending on whether the game is Ak or Ak′, the strategy varies). In a one-shot game, using information can generally raise the payoff of a player. For example, let
Game A1 is drawn with probability p and A2 with probability (1−p). Player 1 learns the game drawn, player 2 does not. The strategic form of the incomplete information game is
Here, player 1 has four pure strategies. For example, BT denotes the strategy of playing B if the game is A1 and T if the game is A2. Player 2 has two pure strategies, L and R. The optimal strategy for player 1 is TB and an optimal strategy of 2 is to play L if p ≤ ½ and R if p > ½. This gives a value V(p) = min{p,(1−p)}. (As a matter of notation, p will denote the vector of probabilities on types or games. However, in the two-player type case, the notation p and (1−p) is more convenient for the probabilities of the two types (rather than p1 and p2)). Note however that the posterior distribution is fully informative if player 1 plays TB: prob (k=1 | T) =1 and prob (k=1 | B) =0. If 1 uses the strategy TB, then after observing 1's choice, 2 knows the true game. In a repeated game, if 1 used the strategy TB in period 1, player 2 will play R in period 2 and subsequent periods if T was chosen in period 1, and L otherwise. Thus, player 1's expected payoff would be in {p,(1−p)} in period 1 and 0 thereafter. If payoffs are “averaged” over an infinite number of periods, the payoff averages out to 0. In a multiperiod model the strategy TB is “too” informative. To see this more formally, given x =(x1, …, xk), the posterior distribution conditional on i is:
Note that the expected value of p(i) equals the prior: for all k, or , where p(i)=(p(i)1,…,p(i)K) and p=(p1,…, pK). If player 1 uses strategy x in period 1, next period's posterior is p(i) with probability . In
196
MICROECONOMIC THEORY
the example above, when played for two periods, the first period gives payoff in{p, (1−p)}, where p is the probability of type 1. In period 2, if T was chosen, the posterior on type 1 is p(T)=1 and p(T) = 0 if B was chosen. Thus, in the second period, and , since and . The expected payoff summed over the two periods is:
since v(1) = v(0) = 0. So, getting the (high) payoff in {p, (1−p)} in period 1 is costly, since it involves fully revealing player 1's information, which is then exploited in period 2 by player 2, producing a bad payoff for player 1. More generally, if strategy x is used by player 1 in period 1 of a two-period game, the expected payoff in period 2 is and given the concavity of v, , so that gives the expected payoff loss in period 2 from using information in period 1. Consider the other extreme where player 1 ignores the private information entirely and plays subject to the condition that x1 = x2 = x′. In this restricted setting the minmax is found from:
In the present example,
so that the optimal strategy for player 1 is ((1−p), p) and the optimal strategy for 2 is ((1−p), p), giving the expected payoff u(p) = p(1−p). Not using private information lowers player 1's payoff in the stage game: u(p) ≤ v(p), but this strategy can be played each period in a repeated game to guarantee a payoff of u(p) every period. In the two-period case, this gives 2p(1−p) and 2p(1−p) > in {p, 1−p}, ∀p ∉ {0, 1/2, 1}.
11.9.2 Equilibrium From the previous discussion, player 1 can achieve a payoff of u(p) in the infinitely repeated game with averaging. However, player 1 can sometimes do better than u(p) in the infinitely repeated game. Let cav u be the smallest concave function above u: cav u is a concave function with cav u(p) ≥ u(p), ∀p and there is no concave function h ≠ cav u with cav u(p) ≥ h(p) ≥ u(p). Theorem 11.5. The infinitely repeated zero-sum game has a value, v∞, withv∞(p) = cav u(p).
CHAPTER 11: REPEATED GAMES
197
Proof. (sketch.) Let be a finite collection of distributions on K, p(r) ∈ Δ(K) for each r, such that ∑r αrp(r) = p with αr ≥ 0, ∑r αr =1 (i.e. ∑r αrpk(r) = pk, ∀k ∈ K). Let x(r) be an optimal type independent strategy when the prior is p(r). Thus, x(r) guarantees u(p(r)). Consider an auxiliary lottery for 1; if 1 is type k, choose r and strategy x(r) with probability αr (pk(r))/pk. Given that r was chosen, the posterior on types is
(For example, if player 1 has choice set I, choose m so that the number of Im is as large as the number of elements of and select R elements from Im, labeling them c1,…, cR. Then, let player k play in this initial phase, putting probability αr (pk(r)/pk) on cr.) Conditional on observing r, the posterior is p(r). Player 2 can do at least as well observing the outcome of the lottery as not. If player 2 observes the outcome and plays optimally in the game with posterior p(r) in each period (and player 1 is playing x(r)), the overall expected payoff to 1 is ∑αr u(p(r)). In this way, player 1 can achieve an expected payoff at p equal to the “concavified” value of u at p. This is illustrated in the figure where there are two possible types: k ∈ {1,2}, and the prior distribution is p. After the “signaling” phase, the posterior is p(r) with probability αr. Playing independent of type, player 1 can guarantee u(p(r)) if the posterior is p(r), and so can achieve an expected payoff of α1u(p(1)) + α2u(p(2)). In this way, player 1 can achieve a payoff of cav u(p)
It turns out that player 1 can do no better than this, because player 2 has a strategy holding down player 1's payoff to this level. Given p, let α define a supporting hyperplane to cav u(p): cav u(p) = α · p and α · p′ ≥ u(p′), ∀ p′. Let S = {ξ ∈ RK |ξk ≤ αk, k = 1, …, K}. The vector payoff in the game if i is chosen by 1 and j chosen by 2 is average vector payoff up to T is
. If (it, jt) are chosen at time t, the
198
MICROECONOMIC THEORY
. Let R2(y) = conv ({∑jaijyj | i ∈ I}). If player 2 has a strategy that forces ū = (ū1, …, ūK) into the set S, then p· ū ≤ α · p = cav u(p), guaranteeing an expected payoff no greater than cav u(p). According to a theorem of Blackwell, player 2 has a strategy which forces the vector payoff into the set S if: given any ū ∈ RK, ū ∉ S, there is a strategy y(ū) for player 2 such that if ξ is the point in S closest to ū, the hyperplane through ξ, perpendicular to ū − ξ, separates ū and R2(y(ū)).
Observe that ū − ξ ≥ 0, so that
is a distribution on {1, …, K}. Let y* be an optimal strategy in the game with payoff matrix
If ūk − ξk > 0, then ξk = αk (see the figure), or equivalently,
Note that
Therefore, for all v ε R2(y*), strategy at ū).
. Thus, for all z,
implies ξk = αk. Thus,
. Therefore,
. Thus,
and the hyperplane through ξ separates ū and R2(y*) (and y(ū) = y* is the approaching
Bibliography Abreu, D., Dutta, P. K., and Smith, L. (1994). “The Folk Theorem for Repeated Games: A Neu Condition,” Econometrica, 64, 939–948.
CHAPTER 11: REPEATED GAMES
199
Aumann, R. and Maschler, M. (1995). “Repeated Games with Incomplete Information”, MIT Press, Cambridge, MA. Aumann, R. and Shapley, L. (1976). “Long Term Competition—A Game Theoretic Analysis”, in Essays in Game Theory: In honor of Michael Maschler, ed. Nimrod Megiddo, pub. Springer Verlag, Berlin. Benoit, J. P. and Krishna, V. (1985). “Finitely Repeated Games,” Econometrica, 53(4), 905–922. Blackwell, D. (1956). “An Analogue of the Minimax Theorem for Vector Payoffs,” Pacific Journal of Mathematics, 6, 1–8. Fudenberg, D. and Maskin, E. (1986). “The Folk Theorem in Repeated Games with Discounting or Complete Information,” Econometrica, 54, 533–554. Fudenberg, D., Levine, D., and Maskin, E. (1994). “The Folk Theorem with Imperfect Public Information,” Econometrica, 62, 997–1040. Rubinstein, A. (1977). “Equilibrium in Supergames,” Research Memorandum No 25, Center for Research in Mathematical Economics and Game Theory, The Hebrew University of Jerusalem. Rubinstein, A. (1979). “Equilibrium in Supergames with the Overtaking Criterion,” Journal of Economic Theory, 21, 1–9. Sorin, S. (1980). “An Introduction to Two-person Zero-sum Repeated Games with Incomplete Information.” Technical Report 312, Stanford. Wen, Q. (1994). “The Folk Theorem for Repeated Games with Complete Information,” Econometrica, 949–954.
This page intentionally left blank
12 Information 12.1 Introduction The study of information in economics leads naturally to such questions as: when is one source of information more valuable than another?; and how do changes in the information structure affect equilibrium outcomes? Questions such as these are developed and discussed in the following sections. Throughout the entire discussion, it is assumed that preferences have the von Neumann–Morgenstern} form, so that the basis for valuing different information structures or comparing different actions is in terms of impact on expected utility. The information framework is introduced in Section 12.2 and leads to some preliminary observations on the utility of different information structures or signals in Section 12.3. A natural question concerns the value of information: would an individual, regardless of preferences, find the information in one signal more valuable than the information in another? This question is considered in Section 12.4, beginning with a special case where one information structure is finer than another. The finer information structure is confirmed to be more valuable, regardless of the preferences of an individual (Section 12.4.1). This leads to the study of the value of information more generally—where the notion of garbling appears. A signal, Y, is a garbling of another signal X, if the distribution of Y conditional on the underlying state can be determined from the conditional distribution of X given that state, via a Markov transition matrix. In Section 12.4.2 the garbling theorem is discussed: signal X is more valuable (gives a higher payoff regardless of the payoff function) than signal Y if and only if Y is a garbling of X. Thus, there is a tight connection between the value of a signal and its “informativeness.” Section 12.5 considers the impact of information signals on choices, and monotonicity of choices in signals. The ideas here are connected (in Section 12.6.2) to the monotone likelihood ratio property and monotone total positivity. Section 12.6 provides some general observations on likelihood ratios and Section 12.6.3 briefly reviews the closely related concept of supermodularity.
202
MICROECONOMIC THEORY
When the focuss shifts to environments with more than one decisionmaker, strategic considerations arise. One interesting consequence of this is that more information is not necessarily better. Section 12.7 considers a multiperson environment—an insurance market model—where improving information destroys the role for insurance, and consequently makes everyone worse off. Thus, as far as the value of information is concerned, additional information may be bad because the change in information structure may eliminate good (high payoff) equilibria. Subsequent sections focus on various aspects of equilibrium in the multi-person environment. In Section 12.7.1, rational expectations model is developed and rational expectations equilibrium is defined. One interesting feature of this model is the possibility that equilibrium may not exist—an issue discussed in Section 12.7.2. Section 12.7.3 considers speculation in a rational expectations environment. The main result here is the impossibility of speculative gain in a rational expectations equilibrium: if someone expects to gain, then they must expect someone else to lose. Willingness to trade by one party implies information unfavorable to the position of the other party to the trade, and no rational trader will take a position with an expected loss. Section 12.8 considers the existence of Nash equilibrium in an n-person game of incomplete information. The key condition for existence of equilibrium is an informational continuity requirement which is sufficient for the existence of Nash equilibrium. Finally, Section 12.9 describes three fundamental information models: the principal–agent model; the screening model, and the signaling model. Here, starting with a common parameter structure, the different categories of model are defined in terms of information possessed by participants when making choices, the order in which participants move, and so on. The primary purpose of this discussion is to clarify distinctions in the information structures that separate these models.
12.2 The Framework In what follows the discussion is concerned with situations where an individual has partial information about some unknown variable or state, Θ ∈ Θ, and must make a decision given this information. There are a number of ways to formulate information. In one, a signal, s, associates an observation, x, to each state θ: s(θ) = x; the individual observes x and infers that the state is in s−1(x) where s−1(x) = {θ∈Θ|s(θ) = x}. Given a prior distribution π(θ) on the set of states Θ, this determines a posterior, for B ⊆ Θ, π(B|x) = π(B ∩ s−1(x))/π(s−1(x)). Alternatively, a joint distribution may be given directly on the set of states and signals: π(θ, x). With payoffs dependent only on θ (θ is the payoff relevant component of the state ω = (θ, x)), one can again consider the conditional distribution, π(θ | x). Given
CHAPTER 12: INFORMATION
203
a von Neumann–Morgenstern utility dependent on an action a and the state θ, the expected utility from action a is ∑θu(a, θ) π(θ | x), conditional on x.
12.3 Information and Decisions The environment for a decisionmaker is a set of feasible actions, A, and preferences given by a utility function u(a, θ), where θ ∈ Θ is an unknown state. Given a distribution φ on Θ, expected utility is ∑ u(a, θ) φ(θ), and in the absence of any information an optimal decision involves solving maxa ∑ u(a, θ) φ(θ). With fixed state space, and considering preferences and the action space as variable, a decision problem is characterized by a pair D = (A,u). Call the pair D = (A,u) an admissible decision problem if maxa ∑ u(a, θ)φ(θ) has a solution for all distributions φ on Θ. Let D denote the set of admissible decision problems. In the present context, information will be represented by a signal x drawn from some set X, and a joint distribution π on X × Θ. The decisionmaker observes x and maximizes expected utility conditional on x: maxa ∑θu(a, θ) π(θ | x). Thus, the ex ante expected utility from such a decision is ∑x maxa ∑θu(a, θ) π(θ | x) π(x), where {π(x)}x∈X is the marginal distribution of π(θ, x) on X: π(·) = ∑θπ(θ,·). For given utility function u, the expected utility from an optimal decision is:
Comparing two signals, X and Y, if
the expected utility from observing signal X is as large as the expected utility from observing signal Y.
12.4 Utility Maximization and the Value of Information In the following sections, the “quality” or “accuracy” of information is discussed in terms of information partitions. This provides a context for the discussion of garbling, a well-known criterion for ranking informativeness of information signals.
204
MICROECONOMIC THEORY
12.4.1 Finer information Given two information partitions ℱY and ℱX, partition ℱX is finer than partition ℱY if every member of ℱY can be written as the union of members of ℱX. For example, let Θ = [0,1], with ℱY = {[0, ½), [½, 1]} and ℱX = {[0, ¼), [¼, ½), [½, ¾), [¾, 0]}. Given the state space Θ, a signal is function from Θ to R. Partitions may be generated by a signal: if Y: Θ → R is Y(θ) = 0, with θ ∈ [0, ½), and Y(θ) = 1, θ ∈ [½, 1], then knowledge of the value of Y is equivalent to learning which member of ℱY contains θ. Finer partitions provide better information: when a partition is finer, it is associated with higher expected payoff, as the following discussion shows. Let X be a signal (with finite) range RX. Then, the induced partition of Θ is individual observing X (and optimizing) is
. The expected payoff to an
where probX(r) = prob{θ ∈ Θ | X(θ) = r}. Let Xr = X−1(r) and for a second random variable Y, let Yk = Y−1(k). If the partition is finer than , then X is more informative than Y. For any Yk, there is a collection such that . Let Ik = {r1, …, rj} be the corresponding index set: Xr. Noting that:
the signal generating a finer partition on Θ generates a higher expected payoff. These observations are illustrated by the following example which serves to motivate the notion of garbling. Let X and Y be two random variables on the state space Θ = {θ1, θ2, θ3}, with X(θi) = i and Y(θ1) = Y(θ2) = 3 and Y(θ3) = 1 and where prob(θi) = ⅓. Thus, the range of X and Y is S = {1, 2, 3}. Let p(x, y) denote the joint distribution of X and Y.
CHAPTER 12: INFORMATION
205
Observe that ℱX = {{1}, {2}, {3}} is finer than ℱY = {{1, 2}, {3}}, so that X is more informative than Y.
These functions satisfy, for example:
Or, considering all such computations,
This latter equality asserts that Y is a “garbling” of X. This turns out to be the key condition in ranking the informativeness or value of two signals.
12.4.2 Garbling Consider two random variables X and Y, each having a finite number of possible values: and (with some abuse of notation write X
206
MICROECONOMIC THEORY
and Y for the possible values also). Let π(x,θ) and π(y,θ) be the associated joint distributions on X × Θ. The observation of x or y carries information about θ—in particular the posterior assessment of the likelihood of different values of θ changes according to π(θ | x) or π(θ | y). For each, the value at decision problem (A,u) is ∑x maxa ∑θu(a, θ) π(θ | x) π(x) and ∑y maxa ∑θu(a, θ) π(θ | y) π(y), respectively. Let the conditional distribution of Y given θ be π(y | θ), and the conditional distribution of X given θ, π(x | θ). Thus with prior π, the joint distribution on Y × Θ is π(y, θ) = π(y | θ) π(θ), and the marginal distribution on Y is π(y) = ∑θ π(y | θ) π(θ). The posterior on Θ given y is π(θ | y) = π(y | θ) π(θ)/∑θ π(y | θ) π(θ). Similar definitions apply with x. Let πx(θ) = (π(x | θ))x∈X be the distribution on X given θ, and πy(θ) = (π(y | θ))y ∈ Y, the distribution of Y given θ. Definition 12.1.Say that X is sufficient for Y or that Y is a garbling of X if there exists a Markov matrix P, P = {p(y | x)}x∈X, y∈Y, ∑y(py | x) = 1, ∀ x, such that for all θ,
In this case write X ≽SY. So, Y is a garbling of X if the conditional distribution of Y given θ can be determined from the conditional distribution of X given θ using a Markov matrix, P, defined on X × Y. There is a slight ambiguity of terminology here, regarding the term “sufficient.” In familiar statistical terms, given a joint distribution, π, on Θ × X × Y, π(θ,y,x) may be written π(θ, y,x)= π(y | x, θ) π(x | θ) π(θ). Thus π(y,θ) = [∑x π(y | x, θ) π(x | θ)]π(θ) and π(y | θ) = ∑x π(y | x, θ) π(x | θ). The random variable X is said to be sufficient for Y if π(y | x, θ) is independent of θ so that π(y | x, θ) = p(y | x), and this implies the garbling condition (so that sufficiency is a strictly stronger property than garbling).
Garbling and mean preserving spreads Garbling and mean preserving spreads are closely related and are discussed next. Later, the proof of Blackwell's theorem on garbling and the value of information utilizes this connection. If Y is a garbling of X, then the posterior distribution on Θ conditional on X is a mean preserving spread of the posterior distribution on Θ conditional on Y. Intuitively, the distributions conditional on X realizations are more “extreme” than the distributions conditional on Y and hence more informative. In the present context, take the garbling condition and multiply by π(θ), so that for all (y,θ):
CHAPTER 12: INFORMATION
207
Here, for example, π(θ | Y) is the random variable taking values in equals is the probability of the set of yi such that . That is,
. The probability that π(θ | Y)
Thus
So,
Write for the random vector (π(θ | Y))θ∈Θ. For any realization of Y, say yi, expectation conditional on π(θ | Y) on both sides,
is a distribution on Θ. Taking
where the last expectation follows because π(θ | Y) generates a coarser field than Y.16 Thus,
Writing,
since E{π(θ | X) | π(θ | Y)} = {π(θ | Y)}, π(θ | X) is a mean preserving spread of π(θ | Y). (The vector ξ is a mean preserving spread of η if there is third random vector ɛ such that ξ has the same distribution as η + ɛ and E{ɛ | η} = 0.) Thus the notion of garbling provides a notion of informativeness of observations on a random variable.
16
Given random variables X and Y and functions g and f defined on the ranges of X and Y ,
208
MICROECONOMIC THEORY
Blackwell's theorem Before relating information value to informativeness, some notation is necessary. Given a distribution φ on Θ, (φ(θ) is the probability of θ), let r(φ) = maxa ∑θu(a,θ)φ(θ), the maximum expected utility when the distribution on states is φ. Observe that r is convex in φ since:
If D = (A,u) is a decision problem, recall
Definition 12.2Say that X is more valuable than Y if for all admissible decision problems D ∈ D:
Write X ≽VY if X is more valuable than Y. Blackwell's theorem asserts that X ≽SY if and only if X ≽VY. Theorem 12.1.X ≽SY if and only if X ≽VY. Proof. (a) [X ≽ S Y] ⇒ [X ≽ V Y]: Let that where of . Thus, and since
, and recall that so is a convex function of . Let ρX be the distribution is a mean preserving spread of
Thus, X ≽SY ⇒ X ≽VY. Next, suppose that X ≽SY fails. To see that this implies that X ≽VY fails, let Γ be the convex set:
If Y is a garbling of X, then by definition {π(y | θ)}y∈Y,θ∈Θ ∈ Γ. Since Y is not a garbling of X, {π(y | θ)}y∈Y,θ∈Θ ∉ Γ, there is a separating hyperplane: for
CHAPTER 12: INFORMATION
209
some {αyθ}y∈Y,θ ∈ Θ, for all P = {p(y/x)}y∈Y, x∈x:
In particular,
Letting αyθ = φyθ · π(θ) (assuming π(θ) > 0, so that there is such a φyθ),
Since π(θ | y) π(y) = π(y, θ)
Let A = {y1,…,yny} and u(a,θ) = φaθ and take a(y) = y. Thus,
Thus, if X is not sufficient for Y, there is a decision problem D* such that R(Y; D*) > R(X; D*) so ¬ [X ≽VY].
12.5 Monotonic Decisions When a decisionmaker receives a signal x that provides partial information about the state θ, then maximizing expected utility produces a choice a that depends on the observed information: a(x). The following discussion considers how a(x) varies with x. Because the discussion calculates a′(x), the signal space is assumed continuous. Consider again the problem:
where p(θ | x) is the posterior distribution on θ given x. Say that p has a monotone likelihood ratio if p(θ | x′)/p(θ | x) is increasing in θ when x′ is greater than x. Since ∫θp(θ | z) = 1 for all z, for some θ*, p(θ | x′) < p(θ | x) for θ < θ* and p(θ | x′) > p(θ | x) for θ > θ*.
210
MICROECONOMIC THEORY
The cumulative distribution is . From the graph Fx′(θ) ≤ Fx(θ), ∀ θ, so that Fx′ first order stochastically dominates Fx. (Note that this also holds if [p(θ | x′) − p(θ | x)] is increasing in θ—when the likelihood difference is increasing in θ.) With sufficient smoothness, the first-order condition for an optimum at x is
Thus, if a(x) is the solution, ∫θua(a(x),θ)p(θ | x) dθ = 0, ∀ x. If a* is the maximizer at x, the second-order condition is
Summarizing, if a(x) is the solution at x then for each x:
Differentiating the identity (in x), ∫θua(a(x),θ) p(θ | x) dθ = 0, gives
And so,
From the second-order condition, the denominator is negative, so the sign of a′(x) is the same as the sign of the numerator. Consider the numerator: If the cumulative distribution, of p(θ | x′), Fx′(θ), first-order stochastically dominates that of p(θ | x), Fx(θ), and ua(a,θ) is increasing in θ for each
CHAPTER 12: INFORMATION
211
a (ua θ > 0), then
This implies that ∫θua(a(x),θ) (∂p(θ | x)/∂x) dθ ≥ 0, so that a′(x)≥ 0. So, if uaθ > 0, and p(θ | x′) first-order stochastically dominates p(θ | x) then a′(x)≥ 0. Conversely, if ua θ < 0 and p(θ | x′) first-order stochastically dominates p(θ | x), then a′(x)≤ 0. Note that p(θ | x′), first-order stochastically dominates p(θ | x) if p(θ | x′)/p(θ | x) is increasing in θ when x′ > x, or if p(θ | x′) −; p(θ | x) is increasing in θ when x′ > x: either a monotone likelihood ratio or a monotone likelihood difference implies first-order stochastic dominance. This discussion leads to the following result. Theorem 12.2.Consider the problem: ∫x {maxa ∫θu(a,θ) p(θ | x)dθ} π(dx) with solution a(x). If: 1. u is concave in a, 2. p(θ | x′), first order stochastically dominates p(θ | x) for all x,x′ with x′ > x and 3. uaθ > 0, ∀ a, θ then a(x) is increasing in x. If 3 is replaced by 3′: uaθ < 0, ∀ a,θ then a(x) is decreasing in x. For a slightly different perspective, define f(a, x) = ∫θu(a,θ) p(θ | x) dθ, so that a(x) solves maxaf(a,x): fa(a(x), x) = 0, ∀ x. Thus, faa(a(x),x)a′(x) + fax(a(x),x) = 0 and a′(x) = −fax(a(x),x)/faa(a(x),x), so that with u (or f) concave in a, a′(x) > 0 when fax(a,x) > 0. Here, fax(a,x) = ∫θua(a(x),θ) (∂p(θ | x)/∂x) dθ. The condition fax(a,x) > 0 is called the supermodularity condition and is discussed below in Section 12.6.3. In the context of the previous discussion, with this terminology, if u(a,θ) is concave in a and supermodular (uaθ > 0) and if p(θ | x′) (Fx′(θ)) first order stochastically dominates p(θ | x) (Fx(θ)), then a′(x) ≥ 0.
12.6 Likelihood Ratios, MTP , and Supermodularity 2
This section reviews some characterizations of the monotone likelihood ratio property. In addition, a brief discussion of the property of monotone likelihood difference is given. The monotone likelihood ratio property is a special case of monotone total positivity of order 2 (MTP2), which applies to densities of n random variables. Finally, supermodularity is discussed. This is a monotone difference property; the discussion is in the context of an optimization problem.
212
MICROECONOMIC THEORY
12.6.1 Monotone likelihood ratios: observations Recall that the monotone likelihood ratio condition is equivalent to
or, multiplying both sides by p(x′)/p(x),
Rearranging: p(θ′, x′) p(θ, x) ≥ p(θ′, x) p(θ, x′), x′ ≥ x, θ′ ≥ θ, so that relatively more of the density weight is on the main diagonal.
Writing z ∨ z′ for the coordinate-wise maximum and z ∧ z′ for the coordinate-wise minimum,
Alternatively, letting p(θ | x′) = p2(θ) and p(θ | x) = p1(θ), the monotone likelihood ratio expression may be written as:
When θ′ ≤ θ, the condition is vacuous and θ′ ≥ θ gives the monotone likelihood ratio property. The monotone likelihood ratio condition is sometimes expressed by the condition that px(θ | x)/p(θ | x) is increasing in θ—a formulation that is commonly used in the study of the principal–agent problem. (Here, px(θ | x) is ∂p(θ | x)/ ∂x.) To see why, note:
Thus
CHAPTER 12: INFORMATION
213
and letting x′ ↓ x gives for θ′ ≥ θ
Conversely,
If px(θ | z)/p(θ | z) is increasing in θ for each z, then px(θ′ | z)/p(θ′ | z) ≥ px(θ | z)/p(θ | z), ∀ z and so,
An alternative differential implication may also be given. If p(θ, x′)/p(θ, x) is increasing in θ, then provided the derivatives exist, differentiating this ratio with respect to θ and requiring monotonicity gives,
Finally, yet another perspective can be given if one considers the difference in the density at x and x′ (x′ > x): p(θ | x′) − p(θ | x). If the likelihood difference is increasing, then for θ′ > θ,
Note that despite the similarity, monotonicity in the likelihood ratio and the likelihood difference are not equivalent. To see this observe that, if p(θ | x′) and p(θ | x) were parallel on some region above , then the likelihood difference would be constant and the likelihood ratio would be decreasing. If the curves were downward sloping, then a constant likelihood difference would be associated with an increasing likelihood ratio. Viewing p as a function of two variables, the latter condition (increasing differences) is called supermodularity. Because ∫p(θ | z) dθ = 1 for all z, if this difference is increasing, then it is nonpositive to the left of some θ* and nonnegative to the right of θ*. If then d/dθ [Fx′(θ) − Fx(θ)] = p(θ | x′) − p(θ | x), so Fx′(θ) increases more slowly than Fx(θ) up to some θ* and faster thereafter: Fx′ first order stochastically dominates Fx. So, from the perspective of first-order stochastic dominance, monotone likelihood ratio or monotone likelihood difference yields the same implications. For an optimization perspective, the monotone difference criterion has useful application, which is discussed in section 12.6.3. The next section reviews some of the properties of monotone total positivity.
214
MICROECONOMIC THEORY
12.6.2 Monotone total positivity of order two The monotone likelihood ratio property is a special case of monotone total positivity. The following definitions and results identify a few key properties of monotone totally positive functions (of order 2). A function f defined on Z1 × Z2 × … × Zn where each Zi is totally ordered, satisfying
is said to be monotone totally positive of order 2 (MTP2). A random vector X = (X1, …, Xn) is MTP2 if its density is MTP2. Given two densities, f1 and f2 on Rn, if
write
. When
, then for all increasing ϕ: Rn → R,
If f(x;λ) is MTP2, let λ2 ≥ λ1 (coordinate-wise) and put fi(x) = f(x; λi). Then given for reference: •
If
•
More generally, if
•
If
• •
. The following results are useful and
and K(x, y) is MTP2, then
,
and
, and
, then
are totally ordered spaces, f MTP2 on Y × X, g MTP2 on X × Z then
is MTP2 on Y × Z, where σ = × σi and σi is a sigma-finite measure. One implication this is given next: If f(x, y) is MTP2 on X × Y, then the marginal density, f(x) = ∫Yf(x, y)dy is MTP2 on X. If f and g are MTP2 functions on X, then fg is an MTP2 function on X. If X = (X1, …, Xn) is a random vector with an MTP2 density f with respect to some product measure on Rn. Then, for any increasing function ϕ: Rk → R, 1≤ k ≤ n,
is increasing in (xk+1, …, xn).
215
CHAPTER 12: INFORMATION
12.6.3 Supermodularity and monotonicity Let f(x, y) be a function of two variables and consider the maximization of f with respect to y. This gives a solution y(x), the solution to fy(x, y) = 0: fy(x,y(x)) = 0, ∀x. Differentiating this identity and rearranging: y′(x) = −(fyx(x, y(x))/fyy(x, y(x))) = [−1/fyy(x, y(x))] fyx(x, y(x)). Since y(x) is the maximizer, degeneracies aside, fyy(x, y(x)) < 0 so the sign of y′(x) is the same as the sign of fyx(x, y(x)) (= ∂2f(x, y)/∂x∂y |y = y(x)). The sign of the derivative fxy has an interpretation in terms of supermodularity and submodularity. This is discussed below. Let y* solve maxyf(x1, y): f(x1, y*) ≥ f(x1, y), ∀ y. Suppose that x2 > x1. Under what conditions will the solution to maxyf(x2,y) be greater than y*? A sufficient condition is that the maximum of f(x2,y) over y occurs at value of y greater than or equal to y*:(12.1)
From the fact that y* maximizes f(x1, y), it must be that (12.1) will certainly be satisfied if (12.2) is satisfied:(12.2)
. Since the latter expression is nonnegative,
And condition (12.2) will be satisfied if condition (12.3) is satisfied:(12.3)
Equation (12.3) is an “increasing differences” condition: the difference f(x, y2) − f(x, y1) is increasing in x for y2 > y1. Alternatively, write (12.3) as:(12.4)
A function satisfying (12.3) is called a supermodular function. The increasing differences property has a differential formulation. From (12.3),
Letting y2 ↓ y1,
so that (repeating the calculation with x2 ↓ x1),
216
MICROECONOMIC THEORY
Conversely, suppose that (∂2f(x, y)/∂x ∂y) ≥ 0, then
Noting
implies that [f(x2, y2) − f(x1, y2)] − [f(x2, y1) − f(x1, y1)] ≥ 0, so, with sufficient differentiability, ∂2f(x, y)/∂x ∂y ≥ 0 implies (12.3).
Submodularity If the inequality (12.4) runs the other way the function f is called a submodular function:(12.5)
This may be written as a decreasing differences condition:(12.6)
Suppose that y1 solves maxyf(x1, y), so that f(x1, y1) ≥ f(x1, y), ∀y. Rearranging the inequality,(12.7)
Since 0 ≤ f(x1, y1) − f(x1, y2), ∀y2, in particular 0 ≤ f(x2, y1) − f(x2, y2), ∀y2 with y1 ≤ y2, or f(x2, y1) ≥ f(x2, y2), ∀y2 ≥ y1. Thus, as x increases the maximizing value of y declines.
Supermodularity and MTP
2
Supermodularity is closely related to MTP2. Write g(x, y) = ef(x, y), so that f = ln g. Then (12.3) may be written as(12.8)
Using earlier notation, given vectors z = (a, b) and z′ = (c,d) let z ∨ z′ be the component-wise maximum and z∧ z′ be the component-wise minimum. Let the domain of f (and g) be a lattice Q. Then (12.4) becomes:(12.9)
As written in (12.9), this is the MTP2 condition.
CHAPTER 12: INFORMATION
217
12.7 The Multiperson Environment In a one-person environment, “better” information makes the decisionmaker better off. Matters are dramatically different in the multiperson environment. The following (insurance-based) example illustrates how improved information may lower welfare. Suppose there are two individuals, 1 and 2, and there are two states of nature, A and B, with each state equally likely. Individual 1 has an asset which in state A is worth $1 and $0 in state B; individual 2 has an asset which in state A is worth $0 and $1 in state B. In either state, the total value of the two assets is 1. Each individual has von Neumann–Morgenstern utility function, . In the absence of trade the expected payoff to either individual is . If they trade, suppose that they exchange half of each asset. In this case, following trade, each individual owns half of each asset and the expected utility is . So, trade makes both individuals better off. In the framework of extensive form games, Γ1 depicts one information structure—where they offer to trade simultaneously. Trade only takes place when both individuals agree to trade (choose T). When player 1 plays T, and player 2 plays T, both have an expected payoff of , and a deviation by either player (to N) gives each player an expected payoff of at most ½. Thus, the strategy (T,T) is an equilibrium.
Now, consider an environment where individual 1 is better informed and learns the true state prior to trade—as depicted in game Γ2. If individual 2 chooses T with positive probability, then individual 1 has a unique optimal choice in each state: in state B, choose N and in state A choose T. Thus, the corresponding posterior distribution on (e1, e2, e3, e4) is . For 2, the expected payoff from T is: and the payoff from N is ½ 1 + ½ 0. So, 2 will not play T. Thus, in any equilibrium the expected payoff to each player is ½ 1 + ½ 0 which is less than the expected payoff from the (T,T) equilibrium in Γ1. This illustrates how there are no equilibrium beliefs which can sustain trade. The idea appears again after the discussion of rational expectations equilibrium. The next section describes the rational expectations model.
218
MICROECONOMIC THEORY
12.7.1 Rational expectations Individuals with partial information trade and optimize relative to available information. This and subsequent sections highlight two points: informational asymmetry may lead to nonexistence of equilibrium; and, in a rational expectations model there is no scope for speculative trade: rational individuals do not speculate. Consider an exchange economy with incomplete information. An underlying state space, Ω is given with fixed measure, μ. Individual i has endowment ei ∈ Rn. A price function p: Ω → Δl, the l − 1 dimensional simplex. 1. 2. 3. 4. 5.
ω ∈ Ω is drawn. Individual i observes a signal and price , where and Individual i chooses a trade, xi(ω), measurable with respect to i's information. The state ω is observed and trades implemented. Individual i receives utility ui(xi(ω), ω).
.
The condition that xi(ω) be measurable with respect to i's information amounts to requiring that xi depend on ω only through the value of (si, p). If (si(ω), p(ω)) = (si(ω′), p(ω′)), then xi(ω) = xi(ω′). Write σ(si) for the sigma field generated by the signal, σ(p) for the sigma field generated by p and σ(si) ∨ σ(p) for the sigma field generated by both random variables. For i, the expected utility at state ω is
For example, let Ω = [0,1], and

si(ω) = 1 if ω ∈ [0,⅓), si(ω) = 2 if ω ∈ [⅓,⅔), si(ω) = 3 if ω ∈ [⅔,1];
p(ω) = 1 if ω ∈ [0,½), p(ω) = 2 if ω ∈ [½,1].
In this case, knowledge of the signal value identifies which of the three intervals, [0,⅓), [⅓,⅔), [⅔,1], contains ω. Similarly, knowledge of the value of p identifies which of the intervals [0,½), [½,1] contains ω. Thus, for example, if si = 2 and p = 1, it can be inferred that ω ∈ [⅓,⅔) ∩ [0,½) = [⅓,½). So, σ(si) consists of the three sets {[0,⅓), [⅓,⅔), [⅔,1]} and all sets formed from the union or intersection of these sets (such as [0,⅔)). Similarly, σ(p) consists of the sets {[0,½), [½,1]} and all sets formed by union or intersection. Finally, σ(si) ∨ σ(p) consists of {[0,⅓), [⅓,½), [½,⅔), [⅔,1]} and all sets formed from the union or intersection of these sets. A function, xi: Ω → Rl, is measurable with respect to σ(si) ∨ σ(p) if it is constant on each of these sets: thus, if ω, ω′ ∈ [⅓,½), then xi(ω) = xi(ω′).

As an example, let x = (x1, x2). Let the prices of the goods be normalized (p1 + p2 = constant) and write p for the price of good 1. To illustrate the computation of E{u | σ(si) ∨ σ(p)}, let si and p be as in the example above. Suppose that ω ∈ [⅓,½)—the individual observes si = 2, p = 1, and infers that the true state lies in [⅓,½). Suppose that xi1(ω) = aω and xi2(ω) = bω. Then, for ω ∈ [⅓,½), the conditional expectation averages u(xi(ω′), ω′) over ω′ ∈ [⅓,½).
Repeating this for all signal–price pairs gives E{u | σ(si) ∨ σ(p)} as a function on Ω that is constant on each cell of the join σ(si) ∨ σ(p).
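The join of the two information partitions, and conditional expectation with respect to it, can be computed mechanically. The following sketch does this on a grid; the payoff function u here is a hypothetical function of the state alone, standing in for u(xi(ω), ω).

```python
# sigma(s_i) ∨ sigma(p): the common refinement of the signal and price
# partitions of [0,1].  Conditioning on the join averages over the cell
# containing the realized state.
import numpy as np

def signal(w):                      # thirds of [0,1], values 1, 2, 3
    return 1 if w < 1/3 else (2 if w < 2/3 else 3)

def price(w):                       # halves of [0,1], values 1, 2
    return 1 if w < 1/2 else 2

grid = np.linspace(0, 1, 100_001)
sig = np.array([signal(w) for w in grid])
prc = np.array([price(w) for w in grid])

def cond_exp(u, w):                 # E{u | sigma(s_i) ∨ sigma(p)}(w)
    cell = (sig == signal(w)) & (prc == price(w))
    return u(grid[cell]).mean()

u = lambda w: w ** 2                # hypothetical payoff of the state
print(cond_exp(u, 0.4))             # averages over the cell [1/3, 1/2)
```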
In the case where x = {xi} is chosen optimally, this defines the demand function. The demand function for i satisfies:

xi(ω) ∈ argmax {E{ui(x, ·) | σ(si) ∨ σ(p)}(ω) | p(ω) · x ≤ p(ω) · ei}.
With this notation in place:

Definition 12.3. A rational expectations equilibrium is a function p: Ω → Δl and an allocation {xi} such that:
• xi is σ(si) ∨ σ(p) measurable,
• p(ω) · xi(ω) ≤ p(ω) · ei for μ-almost all ω,
• E{u(xi(·), ·) | σ(si) ∨ σ(p)}(ω) ≥ E{u(xi′(·), ·) | σ(si) ∨ σ(p)}(ω), for all xi′ measurable with respect to σ(si) ∨ σ(p) and satisfying the budget constraint, for μ-almost all ω,
• ∑i xi(ω) = ∑i ei for μ-almost all ω (market clearing).
An application

Consider a futures contract with spot price q at delivery time: q: Ω → R. Let p be the (present) price, which may also depend on the state of nature, p(ω). Let xi be the trading position of i so that if yi is pre-trade income, income after closing the position is xi(q − p) + yi. Given the private signal si, i maximizes:

E{ui(xi(q(·) − p(·)) + yi) | σ(si) ∨ σ(p)}(ω).
This yields a function xi(si,p) or xi(ω) since si and p are functions of ω.
12.7.2 Nonexistence of equilibrium

In a rational expectations equilibrium, information transmission occurs—traders extract information from the market price. The following example is one where, if the market price is uninformative, the market will not clear. But when everyone is informed (through variation of the market price with the state), the only market clearing price is uninformative or constant over states. Hence, there is no equilibrium. Suppose there are two states, ωh and ωt (heads and tails), with prob(ωh) = ½ = prob(ωt). Say there are two individuals, i = 1, 2, and let

u1(x, y, ωh) = ⅓ ln x + ⅔ ln y,  u1(x, y, ωt) = ⅔ ln x + ⅓ ln y,
u2(x, y, ωh) = ⅔ ln x + ⅓ ln y,  u2(x, y, ωt) = ⅓ ln x + ⅔ ln y.

Expected utility across states is:

½ ui(x, y, ωh) + ½ ui(x, y, ωt) = ½ ln x + ½ ln y, i = 1, 2.
Each individual has endowment ei = (1,1); prices are normalized to the unit simplex, so a price vector is a pair (p, 1 − p) with p ∈ [0,1]. Thus, for all p, the value of the endowment is p · 1 + (1 − p) · 1 = 1. Maximizing expected utility subject to the budget constraint, maxx,y α ln x + β ln y + λ(1 − px − (1 − p)y), gives x = (α/(α + β))(1/p) and y = (β/(α + β))(1/(1 − p)). Suppose now that individual 1 has full information, observing the state prior to trade; but individual 2 has no information on ω. Then individual 1's demands for good x are

x1(ωh) = ⅓(1/p(ωh)),  x1(ωt) = ⅔(1/p(ωt)).
For individual 2, expected utility is ½ ln x + ½ ln y, so that x2(p) = ½(1/p) and y2(p) = ½(1/(1 − p)). Market clearing in each state requires:

⅓(1/p(ωh)) + ½(1/p(ωh)) = 2  and  ⅔(1/p(ωt)) + ½(1/p(ωt)) = 2.
The first equation gives ⅚ = 2p or p(ωh) = 5/12, and the second equation gives 7/6 = 2p or p(ωt) = 7/12. Market clearing requires p(ωh) = 5/12 and p(ωt) = 7/12, but in this case the market price reveals the state. If the state is known, player 2's utility
maximizing demands are:

x2(ωh) = ⅔(1/p(ωh)),  x2(ωt) = ⅓(1/p(ωt)).
In that case, market clearing requires:

⅓(1/p(ωh)) + ⅔(1/p(ωh)) = 2  and  ⅔(1/p(ωt)) + ⅓(1/p(ωt)) = 2.
The first equation implies that p(ωh) = ½, and the second equation implies that p(ωt) = ½, so p(ωh) = p(ωt) = ½. The price reveals no information, so that individual 2's trades cannot be contingent on the state. There is no rational expectations equilibrium. The difficulty here is that maxx,y {E u2 | p} is discontinuous in p: when p is state revealing, the maximum is uniformly larger than when p is nonrevealing.
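The contradiction can be replayed numerically with the utilities above: an uninformative price cannot clear the market, while full information forces a price that carries no information.

```python
# Nonexistence example: Cobb-Douglas demand for good x is a/p, where a is
# the weight on ln x (weights sum to one); endowments are (1,1).
def clearing_price(a1, a2):
    # x1 + x2 = 2  =>  (a1 + a2)/p = 2  =>  p = (a1 + a2)/2
    return (a1 + a2) / 2

# Agent 1 informed, agent 2 uninformed (average weight 1/2):
print(clearing_price(1/3, 1/2))   # heads: p = 5/12
print(clearing_price(2/3, 1/2))   # tails: p = 7/12 -> price reveals state

# Both informed (as when the price reveals the state):
print(clearing_price(1/3, 2/3))   # heads: p = 1/2
print(clearing_price(2/3, 1/3))   # tails: p = 1/2 -> price reveals nothing
```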
12.7.3 Rational expectations and no speculation

The previous discussion illustrated that equilibrium may not exist in the rational expectations environment. The following discussion concerns the impossibility of speculative trading in a rational expectations equilibrium. Consider an environment where utility is ui: R → R, concave on R. The state space is Ω = E × S, where E is the set of payoff relevant events—those events that affect the return directly—and S = S1 × ⋯ × Sn is the signal space. Individual i observes signal si. Let E = Rk be a space of random returns (spot prices) on k assets. The current price (vector) is p ∈ E and the spot price vector is e ∈ E. If i buys xi claims at price p, when the spot price is realized the gain in value is gi = xi · (e − p). This is a speculative environment. Total claims must net out to zero, ∑i xi = 0, so ∑i gi = 0: the sum of gains must equal zero as a market clearing condition. For an individual i, wealth on realization of the spot price is yi + xi · (e − p), where yi is the pre-trade income of i.

Definition 12.4. A rational expectations equilibrium is a function φ: S → E and a set of trades xi(p, si, φ−1(p)) such that
• xi(p, si, φ−1(p)) maximizes E{ui(yi + xi · (e − p)) | p, si, φ−1(p)}, and
• ∑i xi(p, si, φ−1(p)) = 0 (market clearing).
The interpretation is that each individual observes a private signal (si). Acting on these signals, behavior determines a price, p = φ(s1, …, sn), which agents trade at and which reveals additional information—directly from knowledge of p, and indirectly from what can be inferred (in equilibrium) about s = (s1, …, sn), namely that s ∈ φ−1(p). For individual i to trade at any information profile {p, si, φ−1(p)}, the individual's expected utility conditional on {p, si, φ−1(p)} must be at least as large as ui(yi). Concavity of the utility function implies that
E{gi | p, si, φ−1(p)} ≥ 0 is necessary for i to trade. Since E{gi | p, si, φ−1(p)} ≥ 0, taking expectation conditional on (p, φ−1(p)),

E{gi | p, φ−1(p)} = E{E{gi | p, si, φ−1(p)} | p, φ−1(p)} ≥ 0,

and since ∑i gi = 0,

∑i E{gi | p, φ−1(p)} = 0.

So,

E{gi | p, φ−1(p)} = 0, ∀ i.

So, if μi(· | φ−1(p)) is the conditional distribution on Si given φ−1(p), for almost all si in the support of μi(· | φ−1(p)),

E{gi | p, si, φ−1(p)} = 0.

Hence, E{E{gi | p, si, φ−1(p)} | p, φ−1(p)} = 0, ∀ i. Assuming ui is strictly concave, if gi is non-degenerate, then willingness to trade (expected utility at least ui(yi)) implies E{gi | p, si, φ−1(p)} > 0, which is inconsistent with 0 = E{gi | p, si, φ−1(p)}. Thus, in a rational expectations equilibrium, gi = 0 for all i.
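The last step is just Jensen's inequality; a two-point example makes it concrete (the utility and numbers are illustrative).

```python
# With strictly concave u and a non-degenerate fair gamble g (E g = 0),
# E u(y + g) < u(y): a rational trader rejects pure speculation.
import numpy as np

u = np.log
y = 10.0
g = np.array([-2.0, 2.0])              # zero-mean, non-degenerate gain
prob = np.array([0.5, 0.5])
assert abs(prob @ g) < 1e-12
assert prob @ u(y + g) < u(y)          # strict Jensen inequality
print(prob @ u(y + g), u(y))
```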
12.8 Equilibrium in n-player Bayesian games

In what follows, a general Bayesian model of incomplete information is described. The framework encompasses a large class of economic models (such as auctions). A general equilibrium existence theorem is given. There is a set of n players, with player i having action space Ai and type space Ti. Player i observes ti ∈ Ti and takes an action in Ai. Assume that Ai is a compact metric space and Ti is a complete separable metric space. Let A = ×i Ai and T = ×i Ti. Player i's utility function is ui: A × T → R. The function ui(a,t) is assumed to be a Carathéodory function (continuous in a for fixed t and measurable in t for fixed a), and bounded by an η-integrable function g(t).

Definition 12.5. A measure η on T is called the information structure and gives the (prior) distribution on types. Information is said to be continuous if η is absolutely continuous with respect to the product of its marginals, η1, …, ηn, so that for some density f and any Borel set B, η(B) = ∫B f dη1 ⋯ dηn.

A distributional strategy for i is a joint distribution, μi, on Ai × Ti. The conditional distribution on Ai given ti ∈ Ti is μi(dai | ti). For consistency with the information
structure, the marginal of μi on Ti must coincide with the marginal, ηi, of η on Ti. Given the strategy profile μ = (μ1, …, μn), the expected payoff to player i is:

Ui(μ1, …, μn) = ∫ ui(a, t) μ1(da1 | t1) ⋯ μn(dan | tn) η(dt).

With continuous information,

Ui(μ1, …, μn) = ∫ ui(a, t) f(t) μ1(da1, dt1) ⋯ μn(dan, dtn).
The main theorem concerning existence is the following:

Theorem 12.3. If information is continuous, the game has an equilibrium in distributional strategies.

Proof. In this framework, the strategy space of i is Si = {μi ∈ P(Ai × Ti) | margTi μi = ηi}, where P(Ai × Ti) is the set of probability measures on Ai × Ti and margTi μi is the marginal distribution of μi on Ti. The set Si is compact (a closed subset of the compact space P(Ai × Ti) with the weak* topology—the "weak-star" topology). Also, the payoff function Ui(μ1, …, μn) is linear in μi, so that the set of best responses is convex. If Ui is continuous, then the best response correspondence is upper hemicontinuous and the Glicksberg–Fan fixed point theorem applies. To check continuity, consider a sequence μm → μ. The discussion below shows that Ui(μm) → Ui(μ). In the context of incomplete information games, say that payoffs satisfy equicontinuity if for each player i and for all ɛ > 0, there is Tɛ ⊆ T with η(Tɛ) ≥ 1 − ɛ such that ℱ = {ui(·, t) | t ∈ Tɛ} is equicontinuous. The following argument shows that payoffs satisfy equicontinuity. View the function ui(a,t) as a measurable function from T to C(A), the set of continuous functions on A. A measure, λi, on C(A) is induced by this mapping: given a measurable subset, B, of C(A), λi(B) = η({t | ui(·, t) ∈ B}). Since λi is a measure on a complete separable metric space, the measure is tight, so given ɛ > 0, there is a compact subset, B, of C(A) such that λi(B) ≥ 1 − ɛ. From the Arzelà–Ascoli theorem, compactness of B ⊂ C(A) implies that the set of functions B is equicontinuous. (A collection of functions, ℱ, defined on a metric space X is called equicontinuous if for each ζ > 0, there is a δ > 0 such that d(x,x′) < δ implies |f(x) − f(x′)| < ζ for all x, x′ ∈ X and all f ∈ ℱ.) Let Tɛ = {t | ui(·, t) ∈ B}, so that η(Tɛ) ≥ 1 − ɛ. From the previous paragraph, equicontinuity follows, letting ℱ = B. Given Tɛ, there is a continuous function on A × T, vɛ, with vɛ = ui on A × Tɛ and sup(a,t) vɛ(a,t) ≤ sup(a,t) ui(a,t). From Lusin's theorem there is a sequence of
continuous functions {fn} with ∫ |fn − f| dη → 0. Considering payoffs, write

Vɛ,n(μ) = ∫ vɛ(a, t) fn(t) μ1(da1, dt1) ⋯ μn(dan, dtn),

which is continuous in μ in the weak* topology, since vɛ · fn is bounded and continuous. Thus, given any γ > 0, ∃ continuous functions vɛ and fn such that ∀ μ:

|Ui(μ) − Vɛ,n(μ)| ≤ γ.

So,

|Ui(μm) − Ui(μ)| ≤ |Ui(μm) − Vɛ,n(μm)| + |Vɛ,n(μm) − Vɛ,n(μ)| + |Vɛ,n(μ) − Ui(μ)|,

or

|Ui(μm) − Ui(μ)| ≤ 2γ + |Vɛ,n(μm) − Vɛ,n(μ)|.

Taking limits,

limm |Ui(μm) − Ui(μ)| ≤ 2γ.
Since this holds for each γ > 0, Ui(μ1, …, μn) is continuous. Consequently, the best response correspondence satisfies the conditions of the Glicksberg–Fan fixed point theorem.
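In a finite setting a distributional strategy reduces to a map from types to (mixed) actions, and equilibrium can be found by direct search. The sketch below brute-forces pure-strategy Bayesian equilibria of a small two-player game; the payoff function is illustrative, not from the text.

```python
# Brute-force search for pure-strategy Bayesian equilibria: two players,
# two types each (independent, equally likely), two actions.
import itertools

A, T = (0, 1), (0, 1)
prior = 0.25                                  # each type profile

def u(i, a1, a2, t1, t2):
    # coordination payoff plus a type-dependent bonus for action 1
    match = 1.0 if a1 == a2 else 0.0
    own_type, own_act = (t1, a1) if i == 0 else (t2, a2)
    return match + 0.5 * own_type * own_act

def payoff(i, s1, s2):                        # s_j maps type -> action
    return sum(prior * u(i, s1[t1], s2[t2], t1, t2)
               for t1 in T for t2 in T)

S = list(itertools.product(A, repeat=len(T)))
for s1, s2 in itertools.product(S, repeat=2):
    if payoff(0, s1, s2) >= max(payoff(0, d, s2) for d in S) and \
       payoff(1, s1, s2) >= max(payoff(1, s1, d) for d in S):
        print("equilibrium:", s1, s2)         # prints all pure equilibria
```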
12.9 Multiagent Models: Information Structures

This concluding section outlines the framework for a variety of information models presented in subsequent chapters. The principal–agent, screening, and signaling models may be described in terms of a labor market environment where there are two players—the firm and the worker. The firm pays a wage (w), the worker provides effort or education (e), and
there is some shock (θ) or possibly incomplete information in the system. Output y depends on the effort or education e and the shock θ: y(e,θ). The firm has profit function π(y,w,θ) and the worker has utility function u(w,e,θ).
The principal–agent model
1. The firm offers a wage as a function of output: w(y).
2. Effort or education, e, is chosen by the worker, knowing the wage function, w(·).
3. The shock or information variable θ ∈ Θ is drawn with distribution p.
4. Output y(e,θ) is determined.
5. The firm receives profit π(y(e,θ), e, θ) and the worker gets utility u(y(e,θ), e, θ).
The screening model
1. The firm offers a wage as a function of effort or education: w(e).
2. The shock or information variable θ ∈ Θ is drawn with distribution p.
3. The worker sees the value of θ drawn and chooses effort or education e, knowing θ and the wage function w(·).
4. Output y(e,θ) is determined.
5. The firm receives profit π(y(e,θ), e, θ) and the worker gets utility u(y(e,θ), e, θ).
The signaling model
1. The shock or information variable θ ∈ Θ is drawn with distribution p.
2. The worker sees the value of θ drawn and chooses effort or education e.
3. The firm offers a wage w having observed effort or education e.
4. Output y(e,θ) is determined.
5. The firm receives profit π(y(e,θ), e, θ) and the worker gets utility u(y(e,θ), e, θ).
Bibliography

Crémer, J. (1982). "A Simple Proof of Blackwell's Theorem," Journal of Economic Theory, 27, 439–443.
Hart, O. (1975). "On the Optimality of Equilibrium when the Market Structure is Incomplete," Journal of Economic Theory, 11, 418–443.
Karlin, S. and Rinott, Y. (1980). "Classes of Orderings of Measures and Related Correlation Inequalities. I. Multivariate Totally Positive Distributions," Journal of Multivariate Analysis, 10, 467–498.
Kreps, D. (1977). "A Note on 'Fulfilled Expectations' Equilibria," Journal of Economic Theory, 14, 32–43.
Milgrom, P. R. and Weber, R. (1985). "Distributional Strategies for Games with Incomplete Information," Mathematics of Operations Research, 10(4), 619–632.
Tirole, J. (1982). "On the Possibility of Speculation under Rational Expectations," Econometrica, 50, 1163–1182.
Zauner, K. G. (2002). "The Existence of Equilibrium in Games with Randomly Perturbed Payoffs and Applications to Experimental Economics," Mathematical Social Sciences, 44, 115–120.
13 The Principal–Agent Problem

13.1 Introduction

The principal–agent problem is the problem of providing an optimal incentive scheme to an employee (agent) when the agent's effort is not directly observable, but some statistic that is positively correlated with effort is observed. This situation arises, for example, when a manager cannot monitor the effort of an employee. Instead, some variable, such as output, is observed; although this may depend on many factors, such as weather or good luck, that are independent of the individual's effort level, it provides partial information on effort. A compensation scheme must be based on observables, but this raises the problem of striking the right balance on incentives. Because the principal observes only the output, the principal does not have full information concerning all the variables affecting it. Since the output varies not only as a result of the agent's effort, but also because of variation in other factors, as long as reward depends on output, variation in reward or wages will occur as the output varies—even if the agent's effort level is constant. Thus, the agent is forced to bear risk as a result of the principal's need to encourage the agent to work. A reward scheme that is too highly correlated with output may deter a risk averse agent; a reward scheme with too low a correlation may not give the agent enough incentive. How should a reward scheme be constructed to maximize the principal's expected revenue? As a benchmark, one can consider the full information case where the principal observes all variables affecting the output. In this case (full information), efficient reward schemes can be devised. Solving the principal–agent problem (without full information) is a complex task. The primary approach is called "the first-order approach." This involves maximizing the principal's expected revenue subject to the agent's first-order
condition (for optimal choice of effort given the reward schedule) being satisfied. There are two difficulties with the first-order approach. The first difficulty is unavoidable: the principal selects a reward function, associating a wage to each level of output, and the agent makes choices on the basis of this reward scheme—so the optimization is complex. The second difficulty is more fundamental. The first-order condition for the agent's problem may not identify the maximizing choice for the agent unless the agent's optimization problem is concave in effort, and that in turn depends on the reward function set by the principal. Determining the validity of the approach depends on unraveling these issues. The problem is described in Section 13.2. As a basis of comparison, one can consider the optimal reward scheme for the principal when the agent's actions are observable—the "full information" case. This case is considered in Section 13.3. Sections 13.3.1 and 13.3.2 consider the efficient allocation of risk in the full information case. Turning to incomplete information, the optimization problem is complex since there is a double optimization—the agent responds to reward schedules set by the principal. The problem is outlined in Section 13.4. The well-known approach to this problem, the first-order approach, is discussed in Sections 13.4.1 and 13.4.2. The sufficiency conditions described there are discussed further in Sections 13.4.3 and 13.4.4.
13.2 Details

There are two parties, the principal and the agent. The agent is employed by the principal to provide effort e in return for a wage w. Output, y, depends on e and the state of nature θ: y(e, θ). The state of nature is drawn randomly, according to a distribution p(θ). The principal's utility is π(y, w, θ) and the worker's utility is u(w, e, θ). In setting the wage the principal can observe only the output, so the wage function has the form w(y). In particular, effort is unobservable by the principal—knowledge of the value of y does not reveal the value of e: a medium value of y could arise because e is large and θ small, or because e is small and θ large (e.g. if y(e, θ) = eθ). This creates an informational difficulty from the perspective of efficiency. In general, suitable variations in w (as y varies) can help distribute risk between the principal and the agent, but since the principal must also motivate the agent to work, the wage schedule must also incorporate incentives to encourage effort from the agent. So, the lack of full information (the lack of observability of both e and θ) in determining the agent's reward leads to inefficiency. For example, in
the case where the principal is risk neutral and the agent risk averse, it is efficient for the principal to bear all the risk; but without full information the principal, to raise net revenue, places some of the risk on the agent to encourage effort. In what follows the full information case is discussed first, and the optimal distribution of risk characterized. This provides a benchmark for the case of unobservable effort—the incomplete information case. The discussion of the incomplete information case highlights the impact of the incentive issue on the efficient allocation of risk. Because the optimization problem is complex, the discussion follows the common first-order approach. The validity of the approach is discussed and the assumptions behind it are examined.
13.3 The Full Information Case

In the full information case the principal observes all relevant variables. Provided y(e, θ) is monotone in both arguments, knowledge of any pair (y, e), (y, θ), or (e, θ) determines the value of the remaining variable, and a number of informational interpretations are possible. So, in the full information case, the principal observes (e, θ) and the principal's problem is to maximize ∫θ π(y(e, θ), w(e, θ), θ) dp(θ) subject to the requirement that the agent will accept work from the principal—the offer must be as good as the agent's outside alternatives: ∫θ u(w(e, θ), e, θ) dp(θ) ≥ ū. Suppose the principal contracts a given level of effort e from the agent, and then determines the optimal wage w(e, ·) as a function of θ. Although the optimization involves selection of a function, standard Lagrange multiplier techniques for optimization continue to apply (see Clarke 1990).17 The Lagrangian for this problem is:

ℒ(w, e) = ∫θ π(y(e, θ), w(e, θ), θ) dp(θ) + λ [ū − ∫θ u(w(e, θ), e, θ) dp(θ)].
17 The computations may be motivated as follows. Let x* maximize ∫ u(x(θ), θ) dF(θ). Then ∫ u(x*(θ), θ) dF(θ) ≥ ∫ u(x*(θ) + εh(θ), θ) dF(θ) for any h. With ∫ u(x*(θ) + εh(θ), θ) dF(θ) ≈ ∫ u(x*(θ), θ) dF(θ) + ε ∫ ux(x*(θ), θ) h(θ) dF(θ), it follows that for any h, 0 ≥ ε ∫ ux(x*(θ), θ) h(θ) dF(θ). Since h is arbitrary and ε may have either sign, this implies ux(x*(θ), θ) = 0 for almost all θ.
Consider a perturbation of the function w: w(e, θ) + εh(e, θ), and note that

(ℒ(w + εh, e) − ℒ(w, e))/ε → ∫θ [πw(y(e, θ), w(e, θ), θ) − λ uw(w(e, θ), e, θ)] h(e, θ) dp(θ) as ε → 0.

So, limε→0 (ℒ(w + εh, e) − ℒ(w, e))/ε = 0 implies that ∫θ [πw(y(e, θ), w(e, θ), θ) − λ uw(w(e, θ), e, θ)] h(e, θ) dp(θ) = 0. Since this must be true for all functions h, for each e,

πw(y(e, θ), w(e, θ), θ) − λ uw(w(e, θ), e, θ) = 0, ∀ θ,

or for a set of θ of probability 1. Differentiating ℒ with respect to λ imposes the participation constraint: ū − ∫θ u(w(e, θ), e, θ) dp(θ) = 0. Under suitable conditions, this determines λ(e) and the function w(e, ·) given e. Denote this as w*(e, θ). Rearranging,

λ = πw(y(e, θ), w*(e, θ), θ)/uw(w*(e, θ), e, θ), ∀ θ.
Because λ is independent of θ, if the principal is risk neutral so that πw is constant, then uw(w(e, θ), e, θ) must be constant for all θ. So, for example, if uw is decreasing in w and uwθ = 0, then w is constant or independent of θ. Therefore, in the case of a risk neutral principal (and risk averse agent with uww < 0 and u(w, e, θ) = u(w, e)), optimality in the full information case requires that all risk is borne by the principal. If the principal is risk neutral but the agent's utility depends directly on θ (u(w, e, θ) with uwθ ≠ 0) and if the solution is smooth, then optimality requires some variability in w as θ varies, since differentiating the constancy of uw in θ gives:

uww wθ + uwθ = 0, so that wθ = −uwθ/uww ≠ 0.
Finally, if the agent is risk neutral and the principal risk averse, then w*(e, θ) must vary with θ to keep πw(y(e, θ), w*(e, θ), θ) constant.
Note that for any e, the solution w*(e, θ) satisfies ∫θ u(w*(e, θ), e, θ) dp(θ) = ū. For if this is not satisfied, a small reduction in w(e, θ) will not violate the participation constraint, but will raise the expected payoff of the principal. Next, consider the optimal choice of e. With w*(e, θ) chosen optimally for each e, full optimality requires that e be chosen optimally. Write ℒ*(e) for the optimized function (by appropriate choice of w(e, ·) for given e). Finding an optimal e amounts to maximizing ℒ*(e):

dℒ*(e)/de = ∫θ [πy(∂y/∂e) + πw(∂w*/∂e) − λ (uw(∂w*/∂e) + ue)] dp(θ).

Since πw(y(e, θ), w*(e, θ), θ) − λ uw(w*(e, θ), e, θ) = 0, ∀ θ, this reduces to:

dℒ*(e)/de = ∫θ [πy(y(e, θ), w*(e, θ), θ)(∂y/∂e) − λ ue(w*(e, θ), e, θ)] dp(θ).

Setting this to 0 gives the optimal choice for e. So, the conditions for optimality are:
1. πw(y(e, θ), w(e, θ), θ) − λ uw(w(e, θ), e, θ) = 0, ∀ θ ∈ Θ,
2. ∫θ [πy(y(e, θ), w(e, θ), θ)(∂y/∂e) − λ ue(w(e, θ), e, θ)] dp(θ) = 0,
3. ū − ∫θ u(w(e, θ), e, θ) dp(θ) = 0.
These equations can be used to discuss the allocation of risk in the full information case. In what follows, the cases where the principal is risk averse and risk neutral are discussed in turn.
13.3.1 Risk aversion and risk allocation

Since the first equation, πw(y(e, θ), w(e, θ), θ) − λ uw(w(e, θ), e, θ) = 0, ∀ θ ∈ Θ, is an identity, differentiating with respect to θ gives:

πwy yθ + πww wθ + πwθ − λ (uww wθ + uwθ) = 0.
Recall that πw(y(e, θ), w*(e, θ), θ) − λ uw(w*(e, θ), e, θ) = 0, ∀ θ ∈ Θ, so that λ = πw(y(e, θ), w*(e, θ), θ)/uw(w*(e, θ), e, θ), ∀ θ. Because the solution for λ does not depend on θ, this ratio is constant over all θ. Substituting for λ in the expression above:

πwy yθ + πww wθ + πwθ − (πw/uw)(uww wθ + uwθ) = 0.

Assume that π has the form π(y, w, θ) = π(y − w) and u(w, e, θ) = u(w, e) so that πθ = uθ = 0, πy = −πw = π′, πww = π″, and πwy = −π″. Then the expression reduces to

−π″ (yθ − wθ) + (π′/uw) uww wθ = 0,

or, dividing by π′ and multiplying by −1,

(π″/π′)(yθ − wθ) − (uww/uw) wθ = 0.

Let rπ = −(π″/π′) and ru = −(uww/uw), the absolute risk aversion measures associated with π and u, so the expression may be written −rπ · (yθ − wθ) + ru wθ = 0. Rearranging,

wθ = (rπ/(rπ + ru)) yθ.
In the case where the principal is risk neutral, so rπ = 0, and the agent is risk averse, ru > 0, w is independent of θ: wθ = 0. All the variation in y is absorbed by the principal and the agent receives a constant wage: w = w(e). In the (opposite) case where rπ > 0 and ru = 0, wθ = yθ, so that all the variation in y is absorbed by the agent: w(e, θ) = y(e, θ) − c, where c is a constant. (See Rees 1985a,b for further development.)
13.3.2 Efficiency with a risk neutral principal

Recall from optimality condition (2) that ∫θ [πy(∂y/∂e) − λ ue] dp(θ) = 0. In the case of a risk neutral principal, the function π(y − w) has a constant first derivative: πy = −πw = π′ is constant. Write ye for (∂y/∂e). Recall from optimality condition (1) that λ = (πw/uw) = −(π′/uw). Substituting these into (2), ∫θ [π′ ye + (π′/uw) ue] dp(θ) = π′ · ∫θ [ye + (ue/uw)] dp(θ) = 0. So, ∫θ [ye + (ue/uw)] dp(θ) = 0. Recalling that (ue/uw) is independent of θ (because the optimal choice of w(e, θ) is), this expression becomes: ∫θ ye dp(θ) = ∫θ (∂y(e, θ)/∂e) dp(θ) = −(ue/uw). Assuming sufficient regularity to differentiate through the integral, ∂/∂e {∫θ y(e, θ) dp(θ)} = −ue/uw. Or, writing ȳ(e) = ∫θ y(e, θ) dp(θ), ȳ′(e) = −ue/uw.
In the full information case, the optimal scheme had a wage function that puts the agent on their reservation level of utility. The marginal expected increase in output due to an increase in effort equals the increment in the wage required for the continued participation of the agent: ȳ′(e) = −ue/uw. For example, if u(w, e) = √w − e and ū = 0, y(e, θ) = eθ, and the principal is risk neutral, π(y − w) = k · [y − w], then w is chosen as a deterministic function of e and no risk is borne by the agent. It is optimal for the principal to keep the agent at the participation level of utility: u(w, e) = ū = 0, so √w − e = 0, or w = e2. With θ uniformly distributed on [0, 1], ȳ(e) = e/2, so that the principal maximizes ½e − e2 to give e = ¼.
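A quick numerical confirmation of this example (taking k = 1):

```python
# Full-information benchmark: u(w, e) = sqrt(w) - e, u_bar = 0 gives
# w = e^2; y(e, θ) = eθ with θ ~ U[0,1], so expected profit is e/2 - e^2.
import numpy as np

e = np.linspace(0, 1, 100_001)
profit = 0.5 * e - e ** 2
print(e[np.argmax(profit)])        # ≈ 0.25, matching e = 1/4
```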
13.4 The Incomplete Information Case

In the incomplete information case the principal observes only the level of output y. So, the wage scheme can depend only on the output level: w(y). Given w(y), the agent solves maxe ∫θ u(w(y(e, θ)), e) dp(θ). Thus, effort is a function of the reward scheme: e(w(·)). Taking the profit function to have the form π(y − w), the principal wishes to maximize ∫θ π(y(e, θ) − w(y(e, θ))) dp(θ), bearing in mind that the choice of the wage function w(·) determines the effort level supplied by the agent. This complicates the optimization problem significantly. Let F(y | e) denote the distribution of output induced by effort e, with density f(y | e).
Thus, given e the expected payoff to the agent is ∫y u(w(y), e) dF and to the principal, ∫y π(y − w(y)) dF, with dF = f(y | e) dy. If the class of wage functions is restricted, the problem may be solvable directly. The following discussion illustrates this point by restricting the wage schemes to be linear in output. For example, let e ∈ [0, 1] and suppose that F(y | e) = ey2 + (1 − e)y, 0 ≤ y ≤ 1, so that f(y | e) = 2ey + (1 − e). Take the preferences of the agent to be u(w, e) = √w − e2, and suppose that the class of wage schemes is linear: w(y) = βy. From the agent's perspective, with β fixed, the expected payoff is then

a(e) = ∫01 √(βy)(2ey + (1 − e)) dy − e2.
This integrates to:

a(e) = √β ((2/15)e + ⅔) − e2.

Thus, a′(e) = (2/15)√β − 2e, and setting this to 0 gives e(β) = √β/15. Since a″(e) = −2 < 0, this is the unique optimal choice for the agent. Thus, if the principal sets the wage schedule as w(y) = βy, the agent will choose effort level e(β) = √β/15. If the principal sets the wage schedule at βy, then the agent responds with choice e(β), generating the distribution f(y | e(β)) on y with expected payoff for the principal: π*(β) = ∫y π(y − βy) f(y | e(β)) dy. The problem for the principal then is: maxβ π*(β). Consider the risk neutral case with π(y − w) = y − w. Given e, ∫y (y − βy)(2ey + (1 − e)) dy = (1 − β) ∫y y(2ey + (1 − e)) dy = (1 − β)(⅙e + ½). Since, given β, the agent chooses e(β) = √β/15,
π*(β) = (1 − β)(√β/90 + ½).

So,

π*′(β) = 1/(180√β) − √β/60 − ½.

Setting π*′(β) = 0 and solving for β gives the optimal coefficient in the linear class: the first-order condition reduces to 1 − 90√β − 3β = 0, and the principal's profit at the optimum β is π*(β) evaluated at the positive root of this equation.
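The restricted problem can also be solved by direct search. The following sketch uses the agent's preferences assumed above, u(w, e) = √w − e2, and recovers the same first-order condition.

```python
# Grid search over linear contracts w(y) = βy with e(β) = sqrt(β)/15
# and principal's payoff π*(β) = (1 - β)(e(β)/6 + 1/2).
import numpy as np

beta = np.linspace(1e-9, 1.0, 2_000_001)
e_of_beta = np.sqrt(beta) / 15
pi_star = (1 - beta) * (e_of_beta / 6 + 0.5)
b = beta[np.argmax(pi_star)]
print(b, pi_star.max())
print(1 - 90 * np.sqrt(b) - 3 * b)   # first-order condition, ≈ 0
```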
Note that since F(y | e) = ey2 + (1 − e)y, Fe(y | e) = y2 − y = −y(1 − y) ≤ 0 and Fee(y | e) = 0. Also, since the density is Fy(y | e) = f(y | e) = 2ey + (1 − e), fe(y | e) = 2y − 1 and fe(y | e)/f(y | e) = (2y − 1)/(2ey + (1 − e)), so that (d/dy)(fe(y | e)/f(y | e)) = 2/(2ey + 1 − e)2 > 0. The condition Fee = 0 implies that F satisfies Fee(y | e) ≥ 0, so that F(y | e) is convex in e. The condition that fe(y | e)/f(y | e) is increasing in y is the monotone likelihood ratio property. These conditions are discussed further below. The solution in this example is in the class of linear wage functions. However, the optimal wage function may not be linear—here only the optimal linear wage function is determined. Finding the optimal wage function raises significant difficulties because of the circularity from the wage function to effort and back to the wage function. The first-order approach is a way of circumventing these difficulties.
13.4.1 The first-order approach

The first-order approach amounts to maximizing the principal's objective subject to the first-order condition of the agent being satisfied and subject to participation by the agent. Given the wage function w(·), the agent's payoff function is a(e) = ∫y u(w(y), e) f(y | e) dy. Differentiating with respect to e,

a′(e) = ∫y u(w(y), e) fe(y | e) dy + ∫y ue(w(y), e) f(y | e) dy,
where fe(y | e) = ∂f(y | e)/∂e. At an optimum for the agent, a′(e) = 0 and this is sufficient for an optimum if a is concave. The participation constraint is ∫yu(w(y), e)f(y | e) dy ≥ ū. Thus, the constrained problem in the first-order approach is to maximize expected profit subject to participation and a′(e) = 0.
First fix e and maximize over w. At an optimum, consider variations of the form w(y) + εh(y). With λ the multiplier on the participation constraint and μ the multiplier on the constraint a′(e) = 0, this generates the condition:

∫y {π′(y − w(y)) f(y | e) − λ uw(w(y), e) f(y | e) − μ [uw(w(y), e) fe(y | e) + uew(w(y), e) f(y | e)]} h(y) dy = 0, for all h.

Rearranging,

π′(y − w(y)) f(y | e) = λ uw(w(y), e) f(y | e) + μ [uw(w(y), e) fe(y | e) + uew(w(y), e) f(y | e)], for each y,

or

π′(y − w(y))/uw(w(y), e) = λ + μ [fe(y | e)/f(y | e) + uew(w(y), e)/uw(w(y), e)].
Thus, for example, if the principal is risk neutral so that π′ is constant, then as long as fe(y | e)/f(y | e) varies with y, uew(w(y), e) or uw(w(y), e) must vary—so that w(y) varies with y. To go further, suppose that π(x) = x so that π′ = 1, and u(w, e) = v(w) − c(e) so that uwe = 0 and uw(w(y), e) = v′(w(y)). The expression then becomes:

1/v′(w(y)) = λ + μ [fe(y | e)/f(y | e)],

or

v′(w(y)) = 1/(λ + μ [fe(y | e)/f(y | e)]).
The key point to note is that w(y) varies with y as long as fe/f varies with y and μ ≠ 0—in contrast to the efficient full information case where all the risk is absorbed by the principal. Note that μ is the multiplier on the agent's incentive constraint. So, the inefficiency is seen to arise directly from the incentive requirement—the agent must bear some risk to have the incentive to work, although this gives an inefficient allocation of risk. Because the participation constraint, ū − ∫y u(w(y), e) f(y | e) dy ≤ 0, enters the Lagrangian as +λ [∫y u(w(y), e) f(y | e) dy − ū], the Lagrange multiplier is nonnegative: λ ≥ 0. The multiplier on the first-order constraint, μ, is a multiplier on an equality constraint, so the sign is not determined by the Kuhn–Tucker conditions. However, in the present case it can be determined that μ ≥ 0. To see this is the case, observe that with v′ ≠ 0, the condition above implies that w(y) varies with y. In fact, differentiating the identity 1/v′(w(y)) = λ + μ [fe(y | e)/f(y | e)] with respect to y:

−(v″(w(y))/v′(w(y))2) w′(y) = μ (d/dy)(fe(y | e)/f(y | e)).
Therefore, with v′ > 0, v″ < 0, and fe(y | e)/f(y | e) increasing in y (monotone likelihood ratio, see Section 13.4.3), if μ < 0, then w(y) is decreasing in y. The increasing likelihood ratio condition implies that higher values of e raise the distribution on y in the stochastic dominance sense (see below for the result that F(y | e′) first-order stochastically dominates F(y | e) if e′ ≥ e), so if μ < 0 the agent will choose e = 0, a corner solution. As long as the corner solution fails the first-order conditions, μ > 0. For example, if ∫ v(αy) f(y | e) dy − c(e) > 0 for some 0 < α < 1, there is a wage schedule (w(y) = αy) which induces the agent to work and gives positive expected profit to the principal. Then μ > 0 and w(y) is increasing in y (w′(y) > 0). (These issues are discussed further in Rogerson 1985 and Jewitt 1988.) Grossman and Hart (1983) address the principal–agent problem without recourse to the first-order approach. If risk neutrality of the principal is replaced by risk aversion, so that π(y − w(y)) satisfies π′ > 0 and π″ < 0, one cannot prove from the program above that the incentive multiplier is positive. However, a relaxation of the program turns out to be adequate: an appropriately modified incentive constraint leads to a positive multiplier (on the incentive constraint) and the sufficiency conditions given next apply to this case also (where both principal and agent are risk averse). This issue is discussed in Rogerson (1985).
13.4.2 Validity of the first-order approach: sufficiency conditions

Consider the problem: max(x,y) f(x, y) subject to y ∈ argmaxy′ g(x, y′). This problem is analogous to the principal–agent problem, where the principal solves an optimization problem subject to the choice variables themselves being connected by a second optimization problem. The first-order approach involves solving
max ℒ = f(x, y) + λ [0 − gy(x, y)] on the presumption that, given x, gy(x, y) = 0 will locate the value of y maximizing g(x, y). This is valid when g(x, y) is concave in y, and that in turn may depend on the choice of x. To see what can go wrong, consider the case where f(x, y) = y − x and g(x, y) = (y − 2)3 − 3xy, so that gy(x, y) = 3(y − 2)2 − 3x and gyy(x, y) = 6(y − 2). For fixed x > 0, the function g has a local maximum at y(x) = 2 − √x and a local minimum at ȳ(x) = 2 + √x. Setting gy(x, y) = 0 implies that (y − 2)2 = x, or y = 2 ± √x.

With f(x, y) = y − x and gy(x, y) = 0 (i.e. x − (y − 2)2 = 0), the Lagrangian is ℒ = y − x + λ(x − (y − 2)2). This gives ℒx = −1 + λ = 0, ℒy = 1 − 2λ(y − 2) = 0, and ℒλ = x − (y − 2)2 = 0. Solving gives x* = ¼ and y* = 2½ (corresponding to ȳ(x*)). The unique solution to the first-order conditions selects the value of y that provides a local minimum of g(x*, y).
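A direct check of this counterexample confirms that the Lagrangian solution picks out the wrong critical point:

```python
# g(x, y) = (y - 2)^3 - 3xy: at x* = 1/4 the stationary points are
# y = 1.5 and y = 2.5; the first-order-approach Lagrangian selects
# y* = 2.5, which is the local minimum.
g = lambda x, y: (y - 2) ** 3 - 3 * x * y

x_star = 0.25
for y0 in (1.5, 2.5):
    mid = g(x_star, y0)
    nearby = max(g(x_star, y0 - 0.01), g(x_star, y0 + 0.01))
    print(y0, "local max" if mid >= nearby else "local min")
```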
The validity of the first-order approach depends on the first-order condition identifying a maximum for the agent. A sufficient condition for this is that a(e) be concave. Recall that, with the assumption that u(w, e) = v(w) − c(e),

a(e) = ∫y v(w(y)) f(y | e) dy − c(e).

Fix the support of F(y | e) as [y0, y1], independent of e. In this case, integrating by parts,

a(e) = v(w(y1)) − ∫y v′(w(y)) w′(y) F(y | e) dy − c(e).

So,

a′(e) = −∫y v′(w(y)) w′(y) Fe(y | e) dy − c′(e),

and

a″(e) = −∫y v′(w(y)) w′(y) Fee(y | e) dy − c″(e).

With convexity of c, c″ > 0. The function v satisfies v′ > 0, by assumption. From earlier calculations, w′(y) > 0 if fe(y | e)/f(y | e) is increasing in y. So, provided Fee > 0, a″(e) < 0. Summing up, the first-order approach is valid if:
• Preferences have the form u(w, e) = v(w) − c(e), with v concave and c convex,
• π is concave,
• the distribution of y given e is F(y | e), with density f, and the support of F(· | e) is independent of e,
• for any y, F(y | e) is a convex function of e, and
• the ratio fe(y | e)/f(y | e) is increasing in y.
Preferences have the form u(w, e) = v(w) − c(e), v concave and c convex, and π is concave. The distribution of y given e is F(y | e) and the support of F(· | e) is independent of e, for any y, f(y | e) is a convex function of e, and the ratio Fe(y | e)/f(y | e) is increasing in y, where f is the density of F.
With fe(y | e)/f(y | e) increasing in y, the optimal wage function must be increasing in y. And, if w(y) is increasing in y, Fee > 0 implies that a″(e) < 0, so the agent's objective is concave. (See Fudenberg and Tirole 1988 for some discussion.)
13.4.3 Comments on the sufficiency conditions

Apart from the support requirement, the two sufficiency conditions are the monotone likelihood ratio condition on fe/f, and the convexity of F(y | e) in e. These are discussed in turn.

Monotone likelihood ratio and stochastic dominance

The assumption that fe(y | e)/f(y | e) is increasing in y is called the monotone likelihood ratio property and is equivalent to the assumption that the likelihood ratio, f(y | e′)/f(y | e), is increasing in y for e′ > e. To see this, observe that for e′ > e,

(1/(e′ − e)) [f(y | e′)/f(y | e) − 1]

is increasing in y exactly when f(y | e′)/f(y | e) is. Letting e′ ↓ e gives that (1/f(y | e))(∂f(y | e)/∂e), or fe(y | e)/f(y | e), is increasing in y; and the density f is said to satisfy the monotone likelihood ratio condition if

fe(y | e)/f(y | e) = (∂/∂e) ln f(y | e)

is an increasing function of y. (The term "likelihood" is used to refer to the log of the density.)
The monotone likelihood ratio implies an ordering on the corresponding hazard functions, and this in turn implies an ordering (first-order stochastic dominance) on the corresponding cumulative distribution functions. Recall the hazard function, h(y | e) = f(y | e)/(1 − F(y | e)), and take e′ ≥ e. If y ≥ x: f(y | e′)/f(y | e) ≥ f(x | e′)/f(x | e) and f(y | e′) ≥ [f(x | e′)/f(x | e)] f(y | e), so for y ≥ x,

∫y≥x f(y | e′) dy ≥ [f(x | e′)/f(x | e)] ∫y≥x f(y | e) dy,

or 1 − F(x | e′) ≥ [f(x | e′)/f(x | e)](1 − F(x | e)). So, letting x = y and rearranging, f(y | e′)/(1 − F(y | e′)) ≤ f(y | e)/(1 − F(y | e)). Thus, h(y | e′) ≤ h(y | e). The figure below illustrates.

In the figure, observe that h(y | e′) = f(y | e′)/(C + D) and h(y | e) = f(y | e)/(B + D). The densities integrate to 1 and A + B = C, so that B < C. Thus, h(y | e′) ≤ h(y | e). If y is above the point of intersection then, because the ratio f(z | e′)/f(z | e) is increasing in z, 1 − F(y | e′) ≥ [f(y | e′)/f(y | e)](1 − F(y | e)), so that again h(y | e′) ≤ h(y | e). Note that the hazard function satisfies h(y | e) = −(d/dy) ln(1 − F(y | e)). Integrating, 1 − F(y | e) = exp(−∫y0y h(z | e) dz). Thus, h(· | e′) ≤ h(· | e) implies 1 − F(y | e′) ≥ 1 − F(y | e), so that F(y | e′) ≤ F(y | e). In the present context, higher effort levels generate better distributions over output—in the sense of first-order stochastic dominance.
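Each step of the chain can be verified numerically for the density used earlier in the chapter, f(y | e) = 2ey + (1 − e):

```python
# MLRP => hazard ordering => first-order stochastic dominance, checked
# on a grid for F(y|e) = ey^2 + (1-e)y on [0, 1].
import numpy as np

y = np.linspace(0.001, 0.999, 999)
f = lambda y, e: 2 * e * y + (1 - e)
F = lambda y, e: e * y ** 2 + (1 - e) * y
h = lambda y, e: f(y, e) / (1 - F(y, e))             # hazard function

e_lo, e_hi = 0.3, 0.7
assert np.all(np.diff(f(y, e_hi) / f(y, e_lo)) > 0)  # MLRP
assert np.all(h(y, e_hi) <= h(y, e_lo) + 1e-9)       # hazard ordering
assert np.all(F(y, e_hi) <= F(y, e_lo) + 1e-9)       # FOSD
print("MLRP, hazard ordering, and dominance all hold on the grid")
```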
Monotone likelihood ratio and inference

In terms of inference, the increasing likelihood ratio has an additional interpretation—loosely, higher observed values of y mean it is more likely that a high value of e was chosen. Suppose that f(y | e′)/f(y | e) is increasing in y whenever e′ ≥ e. Then

f(y | e′)/f(y | e) ≥ f(x | e′)/f(x | e), y ≥ x, e′ ≥ e.
Multiplying the ratios by q(e′)/q(e) where q is a density on e gives f(y, e′)/f(y, e) ≥ f(x, e′)/f(x, e), y ≥ x, e′ ≥ e, or f(y, e′)/f(x, e′) ≥ f(y, e)/f(x, e). Multiplying both sides of this expression by (1/f(y))/(1/f(x)) gives f(e′ | y)/f(e′ | x) ≥ f(e | y)/f(e | x), y ≥ x and e′ ≥ e. From this perspective, this monotone likelihood ratio implies that F(e | y) stochastically dominates F(e | x) where F( · | y) is the cumulative distribution of e given y. Thus, higher output means that it is more likely that there was greater effort.
Convexity of the conditional output distribution

Convexity of the conditional distribution appears to have little intuitive interpretation, and many distributions do not satisfy the condition. The following discussion identifies some suitable distributions. Let effort lie in the interval [0, ē] and output in the interval [0, 1]. One distribution that satisfies the convexity and monotone likelihood ratio conditions is F(y | e) = ey2 + (1 − e)y, considered earlier. Two classes of distributions that satisfy both conditions are:
1. F(y, e) = y + β(y)γ(e), where β(y) > 0, limy↓0 β(y) = 0 = limy↑1 β(y) with |β′(y)| ≤ 1, ∀ y; and where γ(e) is a decreasing convex function with |γ(e)| ≤ 1.
2. F(y, e) = δ(y) exp(β(y)γ(e)), where β is a negative increasing convex function with limy→1 β(y) = 0; γ(e) is strictly positive increasing and concave; and δ(y) is strictly positive increasing and concave with limy↓0 δ(y) = 0 and limy→1 δ(y) = 1.
See LiCalzi and Spaeter (2003) for further discussion.
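An instance of class (1) can be checked directly; the particular β and γ below are illustrative choices satisfying the stated restrictions.

```python
# F(y, e) = y + β(y)γ(e) with β(y) = y(1 - y) and γ(e) = (1 - e)^2 / 2:
# density f = 1 + β'(y)γ(e), cross-partial f_e = β'(y)γ'(e), and
# F_ee = β(y)γ''(e) with γ''(e) = 1.
import numpy as np

beta = lambda y: y * (1 - y)
f = lambda y, e: 1 + (1 - 2 * y) * 0.5 * (1 - e) ** 2
fe = lambda y, e: (1 - 2 * y) * (-(1 - e))
Fee = lambda y, e: beta(y) * 1.0

y = np.linspace(0.01, 0.99, 99)
for e in (0.2, 0.5, 0.8):
    assert np.all(np.diff(fe(y, e) / f(y, e)) > 0)   # monotone ratio in y
    assert np.all(Fee(y, e) >= 0)                    # convexity in e
print("both sufficiency conditions hold for this instance")
```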
13.4.4 Inefficiency and the likelihood ratio

Recall that the likelihood ratio f(y | e′)/f(y | e) is related to fe(y | e)/f(y | e). Let

ϕ(f(y | e′)/f(y | e), e′, e) = [f(y | e′)/f(y | e) − 1]/(e′ − e),

and, letting e′ = e + Δ with Δ small, ϕ(f(y | e + Δ)/f(y | e), e + Δ, e) ≈ fe(y | e)/f(y | e). If fe(y | e)/f(y | e) rises quickly in y, so does f(y | e + Δ)/f(y | e), revealing that e + Δ was the more likely effort choice when y is "large". So, when fe(y | e)/f(y | e) rises quickly with y, the statistic is informative. Define the distribution of the ratio rfe(y) = fe(y | e)/f(y | e):

Ψfe(z) = prob{y | rfe(y) ≤ z}.
When rfe is steep, an increase in z to z′ adds the small interval [y, y′], where z = rfe(y) and z′ = rfe(y′), so the increase in Ψfe(z) is small. Alternatively,

Ψfe(z′) − Ψfe(z) = prob{y | z < rfe(y) ≤ z′}

is small. So, an "informative" ratio, rfe, tends to generate a flat Ψfe distribution. Heuristically, if one distribution is obtained from another by a slope-reducing rotation, it generates a mean preserving spread of the original distribution if the mean is unchanged. Here relative flatness corresponds to a relatively informative rfe. The scope for the principal to exploit the informational value of the rfe statistic is discussed by Kim (1995).
Bibliography

Clarke, F. H. (1990). Optimization and Nonsmooth Analysis. SIAM Classics in Applied Mathematics, SIAM, Philadelphia, PA.
Fudenberg, D. and Tirole, J. (1988). Industrial Organization. MIT Press, Cambridge, MA.
Grossman, S. and Hart, O. (1983). "An Analysis of the Principal–Agent Problem," Econometrica, 51, 7–45.
Jewitt, I. (1988). "Justifying the First-Order Approach to Principal–Agent Problems," Econometrica, 56, 1177–1190.
Kim, S. K. (1995). "Efficiency of an Information System in an Agency Model," Econometrica, 63, 89–102.
LiCalzi, M. and Spaeter, S. (2003). "Distributions for the First-Order Approach to Principal–Agent Problems," Economic Theory, 21, 167–173.
Milgrom, P. (1981). "Good News and Bad News: Representation Theorems and Applications," Bell Journal of Economics, 12, 380–391.
Mirrlees, J. (1999). "The Theory of Moral Hazard and Unobservable Behaviour: Part I," Review of Economic Studies, 66, 3–21.
Rees, R. (1985a). "The Theory of Principal and Agent: Part 1," Bulletin of Economic Research, 37(1), 3–26.
Rees, R. (1985b). "The Theory of Principal and Agent: Part 2," Bulletin of Economic Research, 37(2), 75–95.
Rogerson, W. (1985). "The First-Order Approach to Principal–Agent Problems," Econometrica, 53, 1357–1368.
14 Signaling

14.1 Introduction

Signaling in the economics literature refers to the choice of actions by an informed individual that can reveal information and may affect the decisions of those moving later. In this framework, actions convey information, so that "signaling" of information takes place. In some cases the first mover may have an incentive to manipulate information through the choice of action. Such considerations are explored here. In what follows, signaling games are formalized and a distinction is drawn between different forms of equilibria—specifically Nash, sequential, and intuitive equilibria. These are discussed largely in the context of an extensive form game and the classic labor market game. In a signaling game an informed player moves first and the action is observed by a second player who moves next—the framework of a signaling game is described in Section 14.2. Sections 14.2.1, 14.2.2, and 14.2.3 describe Nash, sequential, and intuitive equilibria. Finally, two examples are discussed in Section 14.3.
14.2 Signaling Games

The structure of a signaling game is described by the following sequence of moves:
1. A state θ ∈ Θ is drawn according to a distribution p, p(θ) > 0, ∀ θ.
2. Player 1 observes θ and chooses an action a ∈ A.
3. Player 2 observes the choice of player 1, a, and chooses an action b ∈ B.
4. Player 1 receives u1(a, b, θ) and player 2 receives u2(a, b, θ).
A key feature of this environment is that while the choice of the first mover affects the follower directly through payoffs, that choice also affects the follower's assessment of the likelihood of different states, and this can have a significant effect on the follower's optimizing decision. Therefore, posterior beliefs turn out to play an important role in describing equilibrium. In what follows, various equilibria are discussed in turn, as the equilibrium criteria become stricter, going from Nash equilibrium to sequential equilibrium, to intuitive equilibrium.
14.2.1 Nash equilibrium

Viewed as a game, the pure strategy spaces are Λ = {α: Θ → A} and ϒ = {β: A → B}. Similarly, the mixed strategies are Δ1 = {α: Θ → Δ(A)} and Δ2 = {β: A → Δ(B)}. Given θ, αa(θ) is the probability that player 1 chooses a; and βb(a) is the probability that player 2 chooses b given 1 has chosen a. Conditional on θ, the expected payoffs are:

V1(α, β, θ) = ∑a ∑b αa(θ) βb(a) u1(a, b, θ),

and

V2(α, β, θ) = ∑a ∑b αa(θ) βb(a) u2(a, b, θ).

The unconditional expected payoffs are:

Vi(α, β) = ∑θ p(θ) Vi(α, β, θ), i = 1, 2.
Definition 14.1. The pair (α*, β*) is a Nash equilibrium if:
1. V1(α*, β*) ≥ V1(α, β*), ∀ α ∈ Δ1, and
2. V2(α*, β*) ≥ V2(α*, β), ∀ β ∈ Δ2.
Note in particular that if ∑θ p(θ) αâ(θ) = 0, so that â has 0 probability under α, then the choice of β at â, β(â), has no impact on V2.
14.2.2 Sequential equilibrium

Given the strategy α, from the observation a ∈ A, player 2 can calculate the posterior distribution on Θ given a, provided a has positive probability. This is

pα(θ | a) = αa(θ) p(θ)/pα(a),

where pα(a) = ∑θ∈Θ αa(θ) p(θ).
From these expressions one may compute best responses for each player. For player 1 the best response is obtained from:

maxα ∑θ p(θ) ∑a αa(θ) ∑b β*b(a) u1(a, b, θ),

and for player 2

maxβ ∑a pα(a) ∑b βb(a) ∑θ u2(a, b, θ) pα(θ | a).

Recall that pα(θ | a) = αa(θ) p(θ)/pα(a), or αa(θ) p(θ) = pα(θ | a) pα(a). The problem for 2 can be expressed as

maxβ(a) ∑b βb(a) ∑θ u2(a, b, θ) pα(θ | a), for each a with pα(a) > 0.
Since pα(a) = ∑θ∈Θ αa(θ) p(θ) depends on α, for some a's it may be that pα(a) = 0. If pα(a) = 0, then αa(θ) p(θ)/pα(a) is not defined. However, one can proceed to define {p(θ | a)}θ∈Θ as any function satisfying αa(θ) p(θ) = p(θ | a) pα(a), p(θ | a) ≥ 0, ∀ θ, and ∑θ p(θ | a) = 1. A system of beliefs is a collection {p(θ | a)θ∈Θ}a∈A such that:
1. ∑θ p(θ | a) = 1, ∀ a, and p(θ | a) ≥ 0, ∀ (θ, a).
2. If pα(a) > 0, p(θ | a) = p(θ) αa(θ)/pα(a), ∀ θ ∈ Θ (Bayes' rule applies where possible).
Sequential equilibrium is defined in the usual way:

Definition 14.2. The pair (α*, β*) and associated consistent belief system {p(θ | a)θ∈Θ}a∈A is a sequential equilibrium if:
1. V1(α*, β*) ≥ V1(α, β*), ∀ α ∈ Δ1,
2. β*(a) ∈ argmaxβ(a) ∑b βb(a) ∑θ u2(a, b, θ) p(θ | a), ∀ a ∈ A.
This contrasts with Nash equilibrium where β(a) may be chosen arbitrarily if pα(a) = 0. Because of the structure of the information sets in the extensive form, here sequential consistency imposes no restrictions on beliefs off the equilibrium path.
If there are k points in A and s states in Θ, then the strategy α may be represented by the k × s matrix, Tα, with (a, θ) entry αa(θ):

Tα = [αa(θ)], a ∈ A, θ ∈ Θ.

Let A0(α) denote those actions in A that have 0 probability under α—they correspond to the rows of zeros in Tα. Let {αn} be a sequence of completely mixed strategies with αn ≫ 0, αn → α. For n sufficiently large, one can take a small amount of mass from actions whose probability is large relative to that of A0(α) and distribute it on A0(α) so as to generate, in the limit, any arbitrary posterior distribution at those actions. More specifically, fix any distribution {q(θ | â)}θ∈Θ, where pα(â) = 0. Then αn can be chosen so that:

pαn(θ | â) → q(θ | â), ∀ θ ∈ Θ.
14.2.3 Intuitive equilibrium

In what follows use Mi to denote the action space of i, with a view to considering actions as "messages"—to emphasize the notion of signaling. Thus, relating to former notation, A = M1 and B = M2. In perfect or sequential equilibrium, the restrictions on beliefs, if any, are purely mathematical and motivated by the need to extend Bayes' rule to events that have zero probability in the specific context of extensive form games. Further restrictions based on intuitively plausible criteria raise questions about how people reason and what motivates their choices. The pair (α*, β*) is an intuitive equilibrium with associated sequentially consistent belief system if, whenever pα(m1) = 0, the associated belief system satisfies intuitively reasonable conditions. What are the plausible requirements on the belief system {p(θ | a)θ∈Θ}a∈A in an equilibrium? One candidate approach is the following. Fix a sequential equilibrium (α*, β*), recalling that this includes both strategies and a belief system. The messages sent with positive probability are M1+ = {m1 ∈ M1 | pα*(m1) > 0}. If M1+ = M1, then all messages in M1 have positive probability and Bayes' rule applies everywhere in the extensive form of the game. In this case the beliefs are fully determined, there is no scope for further restriction of beliefs, and the equilibrium is intuitive by default. When M1+ ≠ M1, the set M1\M1+ is the set of messages not sent under the sequential equilibrium strategy α*. Recall that player 1's expected payoff at θ is V1(α*, β*, θ), and for each m1 ∈ M1\M1+ form the set:

S(m1; α*, β*) = {θ ∈ Θ | V1(α*, β*, θ) > maxm2∈M2 u1(m1, m2, θ)}.

Thus, S(m1; α*, β*) is the set of θ's for which the message choice m1 gives player 1 lower utility than the equilibrium, regardless of the (subsequent) choice of 2.
For player 2, for any q ∈ Δ(Θ) let

B2(q) = argmaxm2∈M2 ∑θ u2(m1, m2, θ) q(θ).

This is the set of best responses of 2 to message m1 when beliefs are q. Then, for Q ⊆ Δ(Θ), the set

B2(Q) = ∪q∈Q B2(q)

is the set of responses by 2 to message m1 which are optimal for some belief, q ∈ Q. Recall that if θ ∈ S(m1; α*, β*), there is no action of player 2 that could give player 1 as high a payoff as received in the status quo sequential equilibrium. Therefore, if the (out of equilibrium) message m1 is sent, player 2 should attach 0 probability to θ being in S(m1; α*, β*). That being so, 2, when responding, should do so on the basis of beliefs with support in Θ\S(m1; α*, β*). If the (out of equilibrium) message m1 is observed, the posterior distribution, q, should have support in Θ\S(m1; α*, β*): that is, q ∈ Δ(Θ\S(m1; α*, β*)). This is the intuitive restriction on beliefs in the sequential equilibrium with strategies (α*, β*), and is used to define an intuitive equilibrium next. Let (α*, β*) be a sequential equilibrium with associated beliefs restricted according to the intuitive criterion. If 2's best response following message m1 is consistent with the intuitive condition, then the lowest payoff possible for 1 at θ is:

minm2∈B2(Δ(Θ\S(m1;α*,β*))) u1(m1, m2, θ).
Suppose that ∃ θ ∈ Θ\S(m1; α*, β*) with

V1(α*, β*, θ) < minm2∈B2(Δ(Θ\S(m1;α*,β*))) u1(m1, m2, θ).

In this case, the payoff to 1 at θ from sending message m1 is higher than at the equilibrium, regardless of the choice of 2, as long as that choice is in B2(Δ(Θ\S(m1; α*, β*))) (i.e. satisfies the intuitive restriction). In this case, given that 2 best responds consistent with intuitive beliefs, 1 will have the incentive at θ to (deviate and) choose m1. This violates the presumption that the strategies and belief system constitute a sensible equilibrium. Say that an equilibrium is intuitive if no such θ exists.
More formally:

Definition 14.3. A sequential equilibrium (α*, β*) is intuitive if, for each unsent message m1 ∈ M1\M1+, the support of p(· | m1) is in Θ\S(m1; α*, β*) and the set

{θ ∈ Θ\S(m1; α*, β*) | V1(α*, β*, θ) < minm2∈B2(Δ(Θ\S(m1;α*,β*))) u1(m1, m2, θ)}

is empty.
These issues are discussed at length in Cho and Kreps (1987). The examples here are also given there; the labor market example was developed by Spence (1973).
14.3 Examples

This section considers two examples that illustrate sequential and intuitive equilibria—a matrix game example and a labor market example. These are discussed in turn.
The "Beer–Quiche" example

In the following game the first mover is either strong or weak (with probability p, 1 − p, respectively). The individual can choose B or Q, after which the follower can choose either D or N. Payoffs are as follows: the first mover receives 1 if the breakfast choice matches type (B for strong, Q for weak), plus 2 if the follower chooses N; the follower receives 1 from choosing D against the weak type, 1 from choosing N against the strong type, and 0 otherwise.

This translates to an incomplete information game as described next.
In this game the “follower” has two information sets, IB and IQ. If at IB, the posterior that the first mover is strong is qB, then the expected payoff to D is qB · 0 + (1 − qB)· 1 = (1 − qB), and the expected payoff to N is qB · 1 + (1 − qB)· 0 = qB.
At IQ, with posterior qQ (that the first mover is strong), the expected payoff to D is qQ · 0 + (1 − qQ) · 1 = (1 − qQ), and the expected payoff to N is qQ · 1 + (1 − qQ) · 0 = qQ. So, at both information sets the follower should choose D when the (conditional) probability that the leader is strong is less than ½: at IB the follower should play D if qB < ½ and N if qB > ½ (at qB = ½ the expected payoff is the same from both choices); similarly, at IQ the follower should play D if qQ < ½ and N if qQ > ½, and again at qQ = ½ the expected payoff is the same from both choices. For the remainder of the discussion, assume that p > ½. There are two types of sequential equilibria.

Equilibrium I. In the first equilibrium, player 1 chooses B regardless of type. Player 2 chooses N at IB (where the posterior is qB = p) and chooses D at IQ (where the posterior satisfies qQ < ½). (Since IQ has 0 probability, the posterior may be chosen arbitrarily.)

Equilibrium II. In the second type of equilibrium, player 1 chooses Q regardless of type. Player 2 chooses N at IQ (where the posterior is qQ = p) and chooses D at IB (where the posterior satisfies qB < ½). (Now, since IB has 0 probability, the posterior may be chosen arbitrarily.)
However, only one of these satisfies the intuitive criterion. In equilibrium I, the strong type of player 1 is getting 3, the highest payoff that a player could possibly get. The weak type is getting 2, and the highest possible payoff is 3. In this equilibrium, the only player that could possibly gain from switching from B to Q is the weak player. However, the follower already attaches high probability to a player choosing Q being weak. And this is reinforced by the fact that the only player who could gain from deviation to Q is the weak player. In equilibrium II, the weak player is getting 3, the highest possible, so cannot gain from deviation regardless of what 2 does. The strong player is getting 2, and a deviation to B does not help since the observation of B leads to the follower
choosing D, based on the posterior that the leader is weak with high probability (qB < ½). But this is at odds with the fact that the only player who could gain by switching from Q to B is the strong player. More formally, to relate this to the detailed discussion, let equilibrium I be γ* = (α*, β*). In this equilibrium message B is sent, and message m1 = Q is not sent. Observe that

V1(γ*, θ = strong) = 3 > 2 = maxm2 u1(Q, m2, strong), while V1(γ*, θ = weak) = 2 < 3 = maxm2 u1(Q, m2, weak)

(choosing m2 = N in both cases). Consequently, S(m1 = Q, γ*) = {θ = strong}, and Θ\S(m1 = Q, γ*) = {θ = weak}. So, q ∈ Δ(Θ\S(m1 = Q, γ*)) puts probability 1 on θ = weak. Finally, observe that with m1 = Q and θ = weak,

V1(γ*, θ = weak) = 2 > 1 = minm2∈B2(Δ({weak})) u1(Q, m2, weak),

because when the state is θ = weak and Q is chosen, the best response for 2 is D, which gives the payoff profile (1,1). Thus, the condition in Definition 14.3 is satisfied.

For equilibrium II, denoted γ** = (α**, β**), B is the unsent message. So,

S(m1 = B, γ**) = {θ = weak},

and Θ\S(m1 = B, γ**) = {θ = strong}. However, with m1 = B and θ = strong:

V1(γ**, θ = strong) = 2 < 3 = minm2∈B2(Δ({strong})) u1(B, m2, strong),

because when the state is θ = strong and B is chosen, the best response for 2 is N, which gives the payoff profile (3,1). Therefore, the intuitive criterion condition fails.
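The mechanics of the test can be replayed in a few lines; the payoff numbers follow the description of the game above.

```python
# Intuitive criterion check for the Beer-Quiche game.
RESP = ("D", "N")

def u1(m, r, t):                       # sender: breakfast fit + no duel
    fit = 1 if (t, m) in (("strong", "B"), ("weak", "Q")) else 0
    return fit + (2 if r == "N" else 0)

def intuitive(unsent, eq_payoff):
    # S: types for which the unsent message is equilibrium-dominated
    S = [t for t in eq_payoff
         if eq_payoff[t] > max(u1(unsent, r, t) for r in RESP)]
    survivors = [t for t in eq_payoff if t not in S]
    # receiver's best response to beliefs concentrated on the survivors
    br = "N" if survivors == ["strong"] else "D"
    return all(eq_payoff[t] >= u1(unsent, br, t) for t in survivors)

print(intuitive("Q", {"strong": 3, "weak": 2}))   # equilibrium I: True
print(intuitive("B", {"strong": 2, "weak": 3}))   # equilibrium II: False
```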
Labor market signaling

Say there are two types of workers, 1 and 2, with probabilities prob(t = 1) = p, prob(t = 2) = 1 − p. When employed a worker receives a wage w and expends effort or acquires education e. A worker of type i has utility ui(w, e) = w − kie2, with k1 > k2. Thus, expending effort incurs greater loss of utility for worker 1 than for 2. Along an indifference curve of i, {(w, e) | ui(w, e) = ūi}, the marginal rate of substitution is dw/de = 2kie. On the hiring side, a firm's profit is just output less the wage: π(y, w) = y − w. Worker productivities differ: a worker of type i with training level e produces output γi · e—so if the firm hires a worker of type i and pays the wage w, then the firm's profit is γi · e − w. Assume that γ2 > γ1.
So, the worker chooses e ∈ [0, ∞), and the firm chooses w given e. From the firm's perspective the problem is to choose a function w(·) to solve, for each e:

maxw(e) p(t = 1 | e)[γ1e − w(e)] + p(t = 2 | e)[γ2e − w(e)].

Given the level of education e, the firm forms an assessment (p(t = 1 | e), p(t = 2 | e)) of the probability that the worker is a type 1 or type 2 worker. Assume that the market for workers is competitive so that in equilibrium the workers must be offered their expected productivity: w(e) = p(t = 1 | e) · γ1e + p(t = 2 | e) · γ2e. In this environment, equilibrium is determined by the wage function, w(·), and the belief function p(t = 1 | e)—write this as p(e) for short.

Nash equilibria. This environment has many Nash equilibria—pooling, separating, and hybrid. In a pooling equilibrium, all types take the same action. Hence, types cannot be identified—they are pooled. In a separating equilibrium different types take different actions. Hence types are separated by their behavior. Finally, in a hybrid equilibrium subgroups of each type may take a common action, while the remainder from each group take separate actions. (The discussion below clarifies.) To find a pooling equilibrium, pick some e* > 0 and let w* = w(e*) = pγ1e* + (1 − p)γ2e*, so that the wage at e* is equal to the expected productivity of a worker, or the average productivity of workers in the population. For this to attract both types of workers, it must be that w* − ki(e*)2 ≥ 0 for i = 1, 2. (Since k1 > k2, this is just w* − k1(e*)2 ≥ 0.) In addition, the wage function must be such that this is the best choice for each type of worker: for each i, w(e*) − ki(e*)2 ≥ w(e) − kie2 for all e. (Here, in the game theoretic context, the firm has specified a strategy (wage payment) at each node in the game tree; so a choice e ≠ e* is an "off the equilibrium path" choice.) Consider the wage function w(e) = 0 for e < e*, w(e) = w* for e ≥ e*. In this case effort less than e* leads to a wage of 0 and effort greater than e* produces no extra benefit. So, from the individual's point of view, acquiring skill level e* is the best choice. From the firm's point of view, it is paying a wage equal to the average productivity and so earns 0 profit. Changing the wage schedule does not improve the firm's profitability—with a lower wage at e* the firm will attract no workers; with a wage higher than w(e*) the firm will make a loss as workers of both types
move to the firm. At levels of education different from e*, changing the wage has no effect—since no workers offer that education level. To find a separating equilibrium, pick (w1, e1) and (w2, e2) satisfying participation and incentive compatibility. Participation requires w1 − k1e12 ≥ 0 and w2 − k2e22 ≥ 0. Incentive compatibility requires that i prefers (wi, ei) to (wj, ej): wi − kiei2 ≥ wj − kiej2, i ≠ j. The indifference curves through these points may be represented wi(e), i = 1, 2. Pick a wage function ŵ(e) with wi = ŵ(ei), i = 1, 2, and ŵ(e) ≤ min{w1(e), w2(e)}, ∀ e. One can also construct hybrid equilibria. Pick (w1, e1) with w1 = γ1e1, and pick (w̄, ē) so that w̄ − k1ē2 = w1 − k1e12 (type 1 is indifferent between (w1, e1) and (w̄, ē)). Similarly, pick (w2, e2) so that w2 = γ2e2 and w̄ − k2ē2 = w2 − k2e22. Finally, let w(·) be a wage function that passes through (w1, e1), (w̄, ē), and (w2, e2) (see the figure below). Let individuals of type 1 randomize on the choices of education level: with probability x1 choose e1 and with probability (1 − x1) choose ē. Similarly, let individuals of type 2 randomize, making choice e2 with probability x2 and choice ē with probability (1 − x2). So, education level ē is chosen by type 1 workers with probability p(1 − x1) and by type 2 workers with probability (1 − p)(1 − x2). Pick x1, x2 so that

w̄ = [p(1 − x1)γ1 + (1 − p)(1 − x2)γ2] ē/[p(1 − x1) + (1 − p)(1 − x2)].

In this case, the wage at education level ē generates 0 profit.
These equilibria are Nash equilibria.
Sequential equilibria. Sequential equilibrium refines the set of equilibria somewhat. Since the productivity of a worker with training e lies in the interval [γ1e, γ2e], if wages are competitive the wage function will satisfy γ1e ≤ w(e) ≤ γ2e. This implies that whatever (w, e) combination type 1 gets, it must lie in the region on or above the indifference curve of 1 that is tangent to the line w = γ1e. So, if the equilibrium is separating, the equilibrium pair for 1, (w, e), is the tangency point, defined by 2k1e = γ1, or e = γ1/(2k1). For incentive compatibility the (w, e) pair for 2 must lie to the right of the intersection of 1's indifference curve and the line w = γ2e.

Intuitive equilibria. The intuitive criterion dramatically reduces the set of equilibria. Observe that in any equilibrium, levels of e to the right of the point where 1's indifference curve cuts the line w = γ2e correspond to deviations ("messages") that 1 cannot gain by sending, regardless of what the firm does (since the highest possible wage w at any e satisfies w ≤ γ2e). Hence the intuitive beliefs following any such deviation should attach probability 1 to the worker being of type 2. In particular, there are no pooling equilibria.
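For concreteness, the least-cost separating outcome can be computed for illustrative parameter values; the numbers k1 = 1, k2 = ½, γ1 = 1, γ2 = 2 are hypothetical.

```python
# Separating outcome of the labor-market signaling game.
import numpy as np

k1, k2, g1, g2 = 1.0, 0.5, 1.0, 2.0
e1 = g1 / (2 * k1)                  # type 1 at the tangency with w = γ1 e
u1_eq = g1 * e1 - k1 * e1 ** 2      # type 1's equilibrium utility

e = np.linspace(0.0, 4.0, 400_001)
ic1 = g2 * e - k1 * e ** 2 <= u1_eq + 1e-9            # 1 won't imitate 2
ic2 = g2 * e - k2 * e ** 2 >= (g1 * e1 - k2 * e1 ** 2) - 1e-9  # 2 separates
e2 = e[ic1 & ic2].min()             # cheapest education that separates
print(e1, e2)                       # e2 ≈ 1 + sqrt(3)/2 ≈ 1.866
```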
Bibliography Cho, I. and Kreps, D. (1987). “Signaling Games and Stable Equilibria,” Quarterly Journal of Economics, 102, 179–221. Spence, M. (1973). “Job Market Signaling,” Quarterly Journal of Economics, 87, 355–374.
15 Screening 15.1 Introduction In an incomplete information environment, screening occurs when an informed agent selects from a set of choices in a way that reveals (possibly partially) information about the state. This can arise when the uninformed party designs the choice set so that the informed agent's choice varies with the information—so that “screening” of informed agents occurs. Examples where these issues appear are insurance and labor markets. In the personal insurance market, different individuals have different degrees of “riskiness,” and an individual will typically be better informed about health or other risk-related factors than the insurance provider. Similarly, in the labor market one expects that job candidates know their own strengths and weaknesses better than a potential employer. At the hiring stage, every worker wishes to represent himself or herself as a “good” worker (say, in terms of productivity), although some are and some are not. This matters to the firm, since it affects profitability. Such informational asymmetries create an incentive for the uninformed party to become better informed. Therefore, the firm is led to design schemes that result in workers, through their behavior, identifying their productivity. This is the “screening” that takes place in the model.
15.2 Screening Models In what follows, two well-known screening models are discussed—the insurance market model where there are different risks in the population; and the labor market model where there are workers of different abilities present in the population.
15.2.1 The insurance market model Consider an insurance market for some risk. An individual has income or wealth W in the absence of an accident; and wealth W − d if the accident occurs. Let p be the probability of an accident. With preferences represented by the state preference model, the expected utility of the person is

(1 − p)u(W) + pu(W − d).
Suppose that insurance is available, and that the market for insurance is competitive so that insurance is offered at zero profit. Let α1 be the premium and α1 + α2 the payout in the event of an accident, so in the event of an accident the net payout is α2. Now, if the individual purchases insurance then wealth in the absence of an accident is CNA = W − α1 and CA = W − d + α2 if an accident occurs, writing CA for consumption in the event of an accident, and CNA for consumption if there is no accident. So, an insurance contract (α1, α2) gives consumption:

(CNA, CA) = (W − α1, W − d + α2).
The individual's associated utility is:

V(α) = (1 − p)u(W − α1) + pu(W − d + α2).
For a representative firm, expected profit from contract α = (α1, α2) is:

π(α) = (1 − p)α1 − pα2.
With competitive insurance markets, any insurance contract α, offered in equilibrium, will earn zero expected profit. In that case α1 and α2 are related according to α2 = ((1 − p)/p)α1: a premium payment of α1 purchases ((1 − p)/p)α1 of insurance. So, in a competitive market, if α1 is spent on insurance, CA = W − d + ((1 − p)/p)α1 and CNA = W − α1, so that CA = W − d + ((1 − p)/p)(W − CNA). The set of state-dependent consumption profiles associated with zero-profit contracts is:

Π0 = {(CNA, CA) | CA = W − d + ((1 − p)/p)(W − CNA)}.
So, the state-dependent consumption profiles associated with zero-profit contracts are given by this line, defined by Π0, with slope −((1 − p)/p). For the individual, the utility function V(CA, CNA, p) = (1 − p)u(CNA) + pu(CA) has slope (along an indifference curve, with CNA on the horizontal axis)

dCA/dCNA = −((1 − p)u′(CNA))/(pu′(CA)),

which equals −((1 − p)/p) when CA = CNA: indifference curves are tangent to the zero-profit line at full insurance.
Indifference curves and zero-profit consumption profiles are plotted in the figure.
Thus, (CA, CNA) pairs to the left of Π0 are associated with positive profit insurance contracts: a contract such as α̂ below the zero-profit line generates positive profit. In this model with complete information, equilibrium is straightforward to describe. Equilibrium. With perfect competition in the provision of insurance, in equilibrium only zero-profit contracts will survive. If a firm offers a contract with positive profit an entrant can take that firm's customers with a contract that still makes positive profit by providing more favorable terms. Along the 0-profit line, only the contract α* (yielding CA = CNA) will survive. For example, the zero-profit contract α̃ is less preferable to consumers than some contract α̂ which generates positive profit. Again an entrant can take customers from a firm offering α̂ and make positive profit (by offering a contract, α′, such that V(α′) > V(α̂) and π(α′) > 0). The only contract immune to such challenges is the contract α*. So, in this environment, in equilibrium the only contract offered is α*.
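A small numerical sketch of this complete-information benchmark: along the zero-profit line, expected utility is maximized at full insurance, CA = CNA. The wealth, damage, accident probability, and log utility below are illustrative assumptions.

```python
# Full insurance is optimal on the zero-profit line: maximize expected
# utility over consumption profiles satisfying
# C_A = W - d + ((1-p)/p) * (W - C_NA), with u(c) = log(c) (an assumption).
import math

W, d, p = 10.0, 6.0, 0.25

def C_A(c_na):
    return W - d + ((1 - p) / p) * (W - c_na)

def V(c_na):
    return (1 - p) * math.log(c_na) + p * math.log(C_A(c_na))

grid = [4 + j / 1000 for j in range(6001)]     # C_NA in [4, 10]
best = max(grid, key=V)
print("C_NA =", round(best, 3), " C_A =", round(C_A(best), 3))
# Both equal W - p*d = 8.5: premium alpha1 = p*d = 1.5, net payout
# alpha2 = (1-p)*d = 4.5, i.e. the contract alpha*.
```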
Different risks Suppose now that there are two types of individuals in the population, low and high risk. Let the proportion of high risk people be λ. Among the high risk population, let pH be the probability of an accident; and among the low risk population, let pL be the probability of an accident (pL < pH). For the high risk people, utility is given by:

VH(CA, CNA) = (1 − pH)u(CNA) + pHu(CA),

and for the low risk individuals,

VL(CA, CNA) = (1 − pL)u(CNA) + pLu(CA).
So, ignoring signs, the respective indifference curves have slopes (1 − pH)/pH and (1 − pL)/pL, respectively. And because pH > pL, (1 − pH)/pH < (1 − pL)/pL. The zero-profit lines in state-dependent consumption space are ΠH and ΠL. In the overall population, the average probability of an accident is p̄ = λpH + (1 − λ)pL, and the corresponding zero-profit line is given by Π̄. Since pL < p̄ < pH, ΠL is steeper than Π̄, which is steeper than ΠH. These relations are shown in the figure. At any α′ = (α1′, α2′), the associated consumption is C(α′) = (CA(α′), CNA(α′)) = (W − d + α2′, W − α1′).
The remainder of this section considers equilibrium in this environment. The main conclusions are that (1) there is no pooling equilibrium and (2) a separating equilibrium may or may not exist, depending on the relative size of the high risk group in the population. Equilibrium. With pooling behavior, the same contract is offered by all firms and bought by all consumers. The main observation here is that there is no pooling equilibrium. To see this suppose otherwise. From the same reasoning as before, any pooling contract that generates nonnegative profit must be on the line Π̄, otherwise an entrant could offer a profitable contract preferred by all consumers. If a contract α is on the average zero-profit line Π̄, a new contract α′ providing consumption in the lens formed between the indifference curves UL and UH through C(α) is preferred only by the low risk individuals. But this contract, when sold only to low risk individuals, produces a positive profit. Hence, the contract α cannot be an equilibrium: there is no pooling equilibrium. So, if there is an equilibrium, it must be a separating equilibrium where different contracts are offered to different types.
If there is a separating equilibrium, observe first that the high risk individuals will be fully insured.
A contract anywhere on ΠH will generate zero profit if only high risk individuals purchase; and will generate positive profit if any of the low risk individuals purchase. Let αH be the contract giving consumption at the tangency of UH and ΠH. Given a contract α′ ≠ αH with C(α′) on the line ΠH, if any individual purchases this contract there is a (strictly) profitable alternative preferable for the high risk individuals (because the high risk indifference curve must cut ΠH at C(α′)). Such an α′ cannot be offered in equilibrium. So, in a separating equilibrium the contract αH is offered to the high risk individuals. The low risk individuals will be offered a zero-profit contract along the line ΠL. Incentive compatibility requires that the contract not be demanded by high risk individuals, and this uniquely determines the low risk contract, αL, with corresponding state-dependent consumption C(αL). When the proportion of high risk individuals is relatively large (case A), the mean zero-profit line Π̄ lies below the indifference curve of a low risk individual through C(αL). In this case, the contract pair (αH, αL) is an equilibrium pair. There is no contract that can be offered that would make positive profit. However, when the proportion of high risk individuals is relatively small (case B), there is an alternative contract, γ, which generates consumption profile C(γ) that is profitable and strictly preferred by both high and low risk individuals to their respective contracts αH and αL. Hence, in this case the pair αH and αL are not equilibrium contracts. But the pooling contract, γ, cannot be an equilibrium contract either (from earlier arguments), so there is no equilibrium in this case.
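The separating pair can be computed numerically. The sketch below fixes αH at full insurance on the high-risk fair-odds line and finds, by bisection, the largest low-risk premium that leaves the high-risk type unwilling to mimic; the parameter values and log utility are illustrative assumptions.

```python
# Rothschild-Stiglitz style separating pair (alpha_H, alpha_L) with
# u(c) = log(c); all parameter values are illustrative assumptions.
import math

W, d = 10.0, 6.0
pH, pL = 0.5, 0.2

def consumption(alpha1, p):
    """(C_NA, C_A) for premium alpha1 on the type-p fair-odds line."""
    return W - alpha1, W - d + ((1 - p) / p) * alpha1

def V(p, c_na, c_a):
    return (1 - p) * math.log(c_na) + p * math.log(c_a)

cH = W - pH * d                      # full insurance for H at fair odds
vH_own = V(pH, cH, cH)

def excess(alpha1):                  # > 0 when H strictly prefers mimicking L
    c_na, c_a = consumption(alpha1, pL)
    return V(pH, c_na, c_a) - vH_own

lo, hi = 0.0, pL * d                 # hi = full insurance for the low risk type
for _ in range(60):                  # bisection for the binding IC constraint
    mid = 0.5 * (lo + hi)
    if excess(mid) > 0:
        hi = mid
    else:
        lo = mid

c_na, c_a = consumption(lo, pL)
print("alpha_H gives consumption", (cH, cH))
print("alpha_L premium ~", round(lo, 4), "consumption ~",
      (round(c_na, 3), round(c_a, 3)))
```

With these numbers the low-risk contract carries a premium of roughly 0.257: partial insurance, pinned down by the high-risk type's indifference.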
15.2.2 The labor market model The basic model of screening in a labor market has the following structure: 1. The firm offers a wage as a function of effort or education: w(e). 2. The shock or information variable θ ∈ Θ is drawn with distribution p.
3. The worker sees the value of θ drawn and chooses effort or education e, knowing θ and the wage function w(·). 4. Output y(e, θ) is determined. 5. The firm receives profit π(y(e, θ), e, θ) and the worker gets utility u(y(e, θ), e, θ). This model is developed next. Suppose there are two types of workers, 1 and 2, drawn with probabilities prob(t = 1) = p and prob(t = 2) = 1 − p, respectively. When employed, a worker receives a wage w in return for effort or acquired education e. A worker of type i has utility ui(w, e) = w − kie², with k1 > k2. Thus, expending effort incurs greater loss of utility for person 1 than 2. On the firm's side, profit is just output less the wage: π(y, w) = y − w. Worker productivities differ: a worker of type i with training level e produces output γi·e; so if the firm hires a worker of type i and pays the wage w, then the firm's profit is γi·e − w. Assume that γ2 > γ1. In the screening context, consider matters in terms of (w, y) space—since the firm is designing contracts to maximize profit, y − w. It is convenient to reparametrize the model. If a type i worker expends effort e, then output is y = γie, or alternatively output y implies an effort level e = y/γi and the corresponding utility is w − ki(y/γi)². Let θi = ki/γi², so ui(w, y) = w − θiy². Since k1 > k2 and γ1 < γ2, θ1 > θ2.
The figure shows the indifference curves of individuals θ1 and θ2. The marginal rate of substitution of θi at (w, y) is dw/dy = 2θiy, so that at any point (w, y) with y > 0, type θ1's indifference curve is steeper than type θ2's. Three iso-profit lines are given, for constant negative, break-even, and positive profit. If the firm's objective is to maximize expected profit, then it can offer a wage w contingent on an output level y; or it may offer many such pairs: {(wr, yr)}. Call a pair cr = (wr, yr) a contract. The profit associated with cr is π(cr) = yr − wr. It is sufficient to consider at most two contracts when there are just two types. If three contracts are offered and accepted, say {ca, cb, cd}, then at least one type is indifferent between two contracts, say θ1 is indifferent between ca and cb and type
θ2 accepts cd, and so must rank cd at least as high as ca and cb. Discarding the least profitable of ca and cb leaves the firm's profit unchanged, or raises it. So, with at most two contracts offered, write ci = (wi, yi) for the contract offered to θi, allowing for the possibility that c1 = c2. This suggests the possibility that it is optimal for the firm to offer just one contract. When all types choose the same contract, call the contract a pooling contract. The following figures show why this can never be optimal. In each case, c* is a candidate pooling contract. In the first case, the contract ĉ generates a higher profit and would be preferred by both types. In the second case, the contract c̃ is preferred by both types and again makes higher profit. Finally, in the third case, ĉ is preferred by type θ1 and c̃ is preferred by type θ2. And both yield higher profit for the firm.
In view of the fact that there is no optimal pooling contract, it must be optimal to provide two (separating) contracts. In designing such a pair of contracts there are two considerations: the types must accept employment from the firm (participation), and each type must accept the contract intended for that type (incentive compatibility). If type i can achieve utility ūi employed elsewhere, then any contract, c, acceptable to i must satisfy ui(c) ≥ ūi. The second issue, incentive compatibility, is motivated by the following situation.
If contracts c1 and c2 are offered, a difficulty arises because c1 is preferred by both types. If the firm intends that type 1 workers will choose c1 and type 2's will choose c2, then the firm anticipates an expected profit (per worker) of pπ(c1) + (1 − p)π(c2). However, both types of workers will choose c1, so the workers “pool”. And this is suboptimal since, starting from any pooling contract, there is an incentive compatible profit improving contract. Therefore, in the selection of contracts, incentives must be right for types to select the appropriate contract. In this context u1(c1) ≥ u1(c2) and u2(c2) ≥ u2(c1). In the figure, while u1(c1) ≥ u1(c2), u2(c1) > u2(c2), so incentive compatibility is satisfied for the first type, but not for the second. So, the problem is a constrained optimization problem. Let f(x1, …, xn) be an objective function and gi(x1, …, xn) ≤ 0, i = 1, …, m, be m constraints. Suppose that x is a local maximum and the program satisfies a regularity condition,18 then x satisfies: (1) x is feasible (gi(x) ≤ 0, ∀i); (2) there are numbers λj ≥ 0, j = 1, …, m, such that ∂f/∂xi − ∑j λj∂gj/∂xi = 0, ∀i; and (3) ∑jλjgj(x) = 0. So, subject to the constraint qualification, these are necessary conditions for a local (and hence global) maximum. In what follows, necessary conditions for a profit maximizing firm are identified. In the present context the two incentive constraints are: (1) g1(c1, c2) = u1(c2) − u1(c1) ≤ 0 and (2) g2(c1, c2) = u2(c1) − u2(c2) ≤ 0; and the participation constraints are (3) g3(c1, c2) = ū1 − u1(c1) ≤ 0 and (4) g4(c1, c2) = ū2 − u2(c2) ≤ 0. Observe now that it cannot be the case that both incentive constraints are binding at a solution. If one constraint is binding, ui(ci) − ui(cj) = 0, both contracts are on i's indifference curve. For incentive compatibility to hold it must be that uj(cj) ≥ uj(ci). But since the indifference curves have different slopes, ui(ci) = ui(cj) implies that uj(cj) > uj(ci): the indifference curve of j is above ci. Thus, the incentive compatibility constraint for j is not binding. Another possibility is that neither incentive compatibility constraint binds. This is considered next.
18 The constraint qualification: the Jacobian of (g1, …, gm) has full rank at x.
The indifference curves are drawn at the individually rational level. If contracts c1 and c2 are offered, type i strictly prefers contract ci. The contracts are incentive compatible. And since for both types the participation constraints are binding, this pair of contracts is optimal. This corresponds to the first-best outcome where the firm can solve the programs: max ℒ = (yi − wi) − λ(ūi − ui(yi, wi)), i = 1, 2. The only remaining possibility is that just one constraint binds. So, consider the case where the incentive constraint of type 2 binds. This corresponds to the case considered above where the contract c1 is preferred to c2 by type 2. See the figure below. In this case the program is:

max p(y1 − w1) + (1 − p)(y2 − w2) over (w1, y1), (w2, y2),
subject to u2(w2, y2) ≥ u2(w1, y1) (multiplier μ), u1(w1, y1) ≥ ū1 (multiplier λ1), and u2(w2, y2) ≥ ū2 (multiplier λ2).
This gives the Lagrangian:

ℒ = p(y1 − w1) + (1 − p)(y2 − w2) + μ[(w2 − θ2y2²) − (w1 − θ2y1²)] + λ1[w1 − θ1y1² − ū1] + λ2[w2 − θ2y2² − ū2].
And leads to the first-order conditions:
(1) ∂ℒ/∂y1 = p + 2θ2μy1 − 2λ1θ1y1 = 0,
(2) ∂ℒ/∂w1 = −p − μ + λ1 = 0,
(3) ∂ℒ/∂y2 = (1 − p) − 2θ2μy2 − 2θ2λ2y2 = 0,
(4) ∂ℒ/∂w2 = −(1 − p) + μ + λ2 = 0,
where equality follows from the assumption of an interior solution (y1, w1, y2, w2) ≫ 0. And, from the formulation of the program, all multipliers are
nonnegative. From (3), (1 − p) = 2(μ + λ2)θ2y2, and from (4), (1 − p) = μ + λ2. Therefore 1 = 2θ2y2, so the indifference curve of a type 2 has slope 1 and is tangent to an iso-profit line. Also, adding (2) and (4), 1 = λ1 + λ2. From (1), p = −2θ2μy1 + 2λ1θ1y1, and from (2), p = −μ + λ1. So, substituting λ1 = p + μ into (1),

2θ1y1 = 1 − (2μ(θ1 − θ2)y1)/p ≤ 1,

where the inequality is strict if μ > 0 (if the incentive compatibility constraint is binding). So, at 1's contract, c1 = (w1, y1), the slope of the indifference curve is less than the slope of the iso-profit line through that contract. From (2), λ1 = p + μ ≥ p, so that λ1 > 0 and the participation constraint for type 1 is binding. Since 1 = λ1 + λ2, the second participation constraint may (λ2 > 0) or may not (λ2 = 0) be binding.
The optimal contract pair is (c1, c2). Incentive compatibility prevents the firm from offering c* to type 1 workers—since that contract would also be chosen by type 2 workers. At the optimal contract pair, type 1's participation constraint is binding (λ1 > 0). (If it were not binding, “sliding” c1 down—lowering w1—would raise profit on type 1 workers without altering profit on type 2 workers, and reduce type 1's utility at the same time.) From these calculations the impact of varying p may also be considered. If p increases, the type 1 workers become a more significant fraction of the population. Therefore, from the firm's point of view the unit profit on each such type becomes more important. In the limit, at p = 1 the optimal policy is to offer the contract c*, which gives maximum profit on type 1's. So, as p increases, the optimal contract for type 1's moves along type 1's reservation indifference curve toward c*. At the same time, if a contract is offered to type 1's, the optimal contract for type 2's is defined as the point on type 2's indifference curve through c1 where the slope is equal to 1—at point c′.
At the initial value of p the optimal contract pair is (c1, c2). As p increases and there are relatively more type 1's, the optimal contract pair moves to (ĉ1, c′), with type 1's buying ĉ1 and type 2's buying c′, where ĉ1 lies further along type 1's reservation indifference curve toward c*. The classic paper on screening is Rothschild and Stiglitz (1976). The area is broad and discussed at greater length in Stiglitz (1984).
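The necessary conditions can be checked by direct computation. The sketch below maximizes expected profit over menus in which type 1's participation constraint and type 2's incentive constraint bind (the structure derived above); the parameter values and zero reservation utilities are illustrative assumptions.

```python
# Profit-maximizing separating menu in the two-type screening model,
# u_i(w, y) = w - theta_i * y^2 with theta1 > theta2, firm profit y - w.
# Parameter values and reservation utilities are illustrative assumptions.

theta1, theta2 = 1.0, 0.5
u1_bar = u2_bar = 0.0
p = 0.4                       # fraction of type 1 workers

y2 = 1 / (2 * theta2)         # from FOCs (3)-(4): 2*theta2*y2 = 1

def profit(y1):
    w1 = u1_bar + theta1 * y1**2                  # type 1 participation binds
    w2 = w1 - theta2 * y1**2 + theta2 * y2**2     # type 2 incentive binds
    return p * (y1 - w1) + (1 - p) * (y2 - w2)

y1 = max([j / 10000 for j in range(10001)], key=profit)   # y1 in [0, 1]
w1 = u1_bar + theta1 * y1**2
w2 = w1 - theta2 * y1**2 + theta2 * y2**2
assert w2 - theta2 * y2**2 >= u2_bar                # participation for type 2
assert w1 - theta1 * y1**2 >= w2 - theta1 * y2**2   # incentive for type 1
print("c1 =", (round(w1, 4), round(y1, 4)), "c2 =", (round(w2, 4), round(y2, 4)))
# Analytic optimum: y1 = p / (2*(p*theta1 + (1-p)*(theta1 - theta2))) ~ 0.2857;
# as p -> 1 it approaches the first-best 1/(2*theta1) = 0.5, i.e. c1 -> c*.
```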
Bibliography Rothschild, M. and Stiglitz, J. (1976). “Equilibrium in Competitive Insurance Markets: An Essay on the Economics of Imperfect Information,” Quarterly Journal of Economics, 90, 629–650. Stiglitz, J. (1982). “Lecture Notes,” Mimeo, Princeton: Princeton University. Stiglitz, J. (1984). “Information, Screening, and Welfare,” in M. Boyer and R. E. Kihlstrom (eds.), Bayesian Models in Economic Theory, Elsevier, Amsterdam.
16 Common Knowledge 16.1 Introduction In philosophy, the branch concerned with the theory of knowledge is called epistemology, and deals with such questions as: what is knowledge? and how is knowledge derived? In economics and game theory, knowledge is typically represented by “states of nature,” where each state of nature is a complete description of the relevant features of the environment. Full knowledge in this context means knowing the state of nature; and partial knowledge means knowing that the (true) state of nature lies in some subset of the set of states. Knowledge of the environment, along with knowledge of what others know and how they act, plays an important role in the study of behavior. The term “common knowledge” is traditionally used to assert that some fact is commonly known to be true. In this informal sense, common knowledge refers to knowledge that is public. When an event is common knowledge, it provides an objective benchmark relative to which individuals can make decisions, knowing that others have the same benchmark—when a choice is made one can reasonably depend on specific information guiding others' decisions. So, it is natural to consider what it means to say an event is common knowledge and how common knowledge of an event might affect behavior. This section and Section 16.2 give a brief description of the formulation of knowledge in terms of information partitions. Common knowledge is defined in Section 16.3 and the classic result on common knowledge of posteriors implying their equality is discussed. Section 16.4 discusses iterative announcements and the convergence to common posteriors. The logic behind this type of result is developed in Section 16.5, where iterative processes or actions taken over time lead to transfer of information in subtle ways—even the absence of action by an individual conveys information about what that individual knows (or does not know). Section 16.6 discusses common knowledge of aggregate statistics. In this case, rather than observe pieces of information about what individuals
know, only an aggregate statistic may be available. For example, in a market with incomplete information, the price at which an asset trades provides a statistic summarizing private information on the value of the asset. Section 16.7 discusses common knowledge in the context of games. The key point here is that lack of common knowledge of the parameters of a game can dramatically affect the equilibrium outcome. Finally, in Section 16.8 a no-trade theorem is given. Intuitively, if the expected value of an asset is 0, but conditional on private information, individual 1 assigns positive value and individual 2 assigns negative value, then individual 1 can infer that if 2 is willing to sell the asset, 2's information must be unfavorable. Ultimately, this prevents trade from taking place. Information is typically represented by an information function or a partition19 of the state space. An information function for a person is a function that assigns a subset P(ω) of Ω to each ω ∈ Ω. The interpretation is that, at state ω, the person knows that the true state is in the set P(ω), but no more. States in P(ω) are states considered possible by the individual when the true state is ω. Two common requirements on the function P(·) are: P1. ω ∈ P(ω) for each ω ∈ Ω. P2. If ω′ ∈ P(ω) then P(ω) = P(ω′). The first condition says that you cannot know something that is false, while the second says that if one state is in the possibility set of the other, the possibility sets at both states are the same. The logic for this condition is that if it were the case that P(ω) ≠ P(ω′), then at state ω′ the person would know, when told “the true state is in P(ω′),” that the true state cannot be ω, since ω would have led to the message “the true state is in P(ω).” P2 assumes that the function P(·) is known to the individual: you know what you know, and you know what you would have known in other states of the world. A partition of Ω is a division of Ω into disjoint sets; each element of the partition identifies a group of states. If P(·) satisfies P1 and P2, then there is a partition P of Ω such that for each ω, P(ω) is an element of the partition. So, partitions naturally represent an individual's information.
16.2 Information Structures Let the information of individual i be represented by a partition Pi of Ω. Write Pi(ω) to denote the element of the partition containing ω. The meet of a collection of partitions P1, …, Pn, denoted ℳ, is the finest common coarsening of those partitions: the finest partition such that each element of each Pi is contained in an element of ℳ. The join of a collection of partitions P1, …, Pn,
19 A partition of a set Ω is a collection {Pα} such that ∪αPα = Ω and, for α ≠ α′, Pα ∩ Pα′ = ∅.
denoted J, is the coarsest common refinement of those partitions: J(ω) = ∩i Pi(ω). The meet denotes knowledge that is common to all (that everyone knows); the join denotes the pooled knowledge of the individuals. These concepts are depicted in the figures for the case where there are two individuals and their knowledge is represented by different partitions of the disk.
Representing information by partitions is most straightforward when the state space is finite or countable, and this will be assumed in what follows. (With a continuum of states, technical considerations arise.) Individual i's partition is a collection Pi = {Pil}l; write Pil for a representative element of Pi.
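For finite partitions the meet is easy to compute: starting from one agent's cell, repeatedly union in any cell of either partition that intersects the current set. A minimal sketch (the state names a, b, c, d stand in for the α, β, γ, δ of the example in the next section):

```python
# Compute the cell of the meet (finest common coarsening) containing w,
# by repeatedly absorbing any cell of either partition that meets the set.

def cell(partition, w):
    return next(B for B in partition if w in B)

def meet_cell(P1, P2, w):
    M = set(cell(P1, w))
    while True:
        grown = set(M)
        for B in P1 + P2:            # absorb any cell intersecting the set
            if grown & set(B):
                grown |= set(B)
        if grown == M:
            return frozenset(M)
        M = grown

P1 = [{"a", "b"}, {"c", "d"}]        # partition of individual 1
P2 = [{"a", "b", "c"}, {"d"}]        # partition of individual 2
print(meet_cell(P1, P2, "a"))        # -> the whole space {a, b, c, d}
```

That the meet cell is the whole space means no nontrivial event is common knowledge at a, which is consistent with the posterior example discussed below.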
16.3 Common Knowledge In the figures, at state ω individual 1 knows that the state is in P11, and since P11 ⊆ ℳ(ω), the individual knows the state is in ℳ, or knows the event ℳ(ω). From this 1 can infer that 2 either knows that the true state is in P21 or knows that the state is in P22. So, 1 knows that 2 knows the state is in P21 ∪ P22 ⊆ ℳ(ω). When ω′ ∈ ℳ(ω), ω′ ∈ P11 ∪ P12. Thus, 1 knows that 2 knows that 1's partition is either P11 or P12, and since both of these are in ℳ(ω), 1 knows that 2 knows that 1 knows ℳ(ω). Going another round in the cycle: 1 knows that 2 knows that 1 knows that 2 knows ℳ(ω). Arbitrary iterations of this statement may be made—so in this case the set ℳ(ω) is said to be “common knowledge at ω.” In fact, similar reasoning would lead one to say that E is common knowledge at ω if ℳ(ω) ⊆ E, because
such cycles of reasoning do not take one outside the set E. This leads to a natural definition of common knowledge. Definition 16.1. Given ω ∈ Ω, an event E is said to be common knowledge at ω if ℳ(ω) ⊆ E. A number of applications of this formulation of common knowledge relate to functions defined on the state space: x: Ω → R. One function of particular interest is the posterior distribution. For this discussion, fix a prior distribution p on Ω. Given an event A, let qi(ω) = p(A | Pi(ω)) = p(A ∩ Pi(ω))/p(Pi(ω)). This defines a function qi: Ω → R. Theorem 16.1. If it is common knowledge at ω that q1 = q1* and q2 = q2*, then q1* = q2*.
To say that q1 = q1* and q2 = q2* is common knowledge at ω is to say that the event E = {ω′ | q1(ω′) = q1* and q2(ω′) = q2*} is common knowledge at ω: ℳ(ω) ⊆ E. Let I = {l | P1l ⊂ ℳ(ω)} and J = {l | P2l ⊂ ℳ(ω)}, so that ∪l∈I P1l = ∪l∈J P2l = ℳ(ω), from the definition of ℳ. Since ℳ(ω) ⊆ E, p(A | P1l) = q1* for l ∈ I, and p(A | P2l) = q2* for l ∈ J. So, p(A ∩ P1l) = q1*p(P1l), l ∈ I, and therefore ∑l∈I p(A ∩ P1l) = q1*∑l∈I p(P1l), or p(A ∩ ℳ(ω)) = q1*p(ℳ(ω)), or q1* = p(A | ℳ(ω)). The same computation applies to person 2, so that q2* = p(A | ℳ(ω)), and therefore q1* = q2*. Common knowledge of the posterior values is different from (just) knowing the posterior values. To see this, let Ω = {α, β, γ, δ}, P1 = {{α, β}, {γ, δ}}, and P2 = {{α, β, γ}, {δ}}. Suppose that ω = α, so that P1(ω) = {α, β}, P2(ω) = {α, β, γ}, and A = {α, δ}. Take p(ω) = ¼, ∀ω.
The information partitions of individuals 1 and 2 are P1 and P2. When the state is ω = α, individual 1 has more precise information than 2. Then, with ω = α, q1(α) = p(A | {α, β}) = ½ and q2(α) = p(A | {α, β, γ}) = ⅓. (q1(ω) = ½, ∀ω; q2(ω) = ⅓, ω ∈ {α, β, γ}, and q2(ω) = 1, ω ∈ {δ}.)
At ω = α, 1 knows that ω is in the set {α, β} and so is better informed than 2 who knows that ω ∈ {α, β, γ}. From 1's information, 1 knows the value of 2's posterior. For 2, who knows {α, β, γ}, learning q1 = ½ is consistent with 1 having observed {α, β} or {γ, δ}. So, learning that q1 = ½ does not help 2 determine whether 1 observed {α, β} or {γ, δ}. Knowledge of the values of the posteriors does not imply they are equal.
16.4 Posterior Announcements While knowledge of posteriors does not imply equality, if posteriors are announced back and forth, this does lead to agreement. In the example, the initial announcements are q1(1)(α) = ½ and q2(1)(α) = ⅓. Person 1 does not learn anything from 2's announcement because at state α, 1 already knows the state is in {α, β} and that 2 will therefore observe {α, β, γ}—2's posterior will be constant on this set—{ω | q2(1)(ω) = ⅓} = {α, β, γ}. This tells 1 that ω ∉ {δ}, something 1 already knows. Individual 1's posterior function for the second round is now q1(2), with q1(2)(ω) = ½ for ω ∈ {α, β}, q1(2)(γ) = 0, and q1(2)(δ) = 1. Observe that

since q1(1)(ω) = ½, ∀ω, learning the value of q1(1) tells individual 2 nothing about ω, so q2(2)(ω) = q2(1)(ω), ∀ω. The function q1(2) is different from q1(1) on ω ∈ {γ, δ}. But, since the true state is ω = α, 1 again calculates a posterior of ½. So, the announcement profile is (½, ⅓) at the second round. Next period, q1(3) = q1(2) (since q2(2)(ω) = q2(1)(ω), ∀ω). However, learning the value of q1(2) does tell 2 something new. If the state were ω = γ, person 1, knowing that the true state was not δ, would attach probability 1 to the state being γ and would announce a posterior of 0 (the conditional probability of A at γ for 1 in round 2). The fact that 1 announces a posterior value of ½ tells 2 that ω ≠ γ. Therefore, 2 learns that the true state is in {α, β}. Hence, at ω = α, q2(3)(α) = p(A | {α, β}) = ½, and the posteriors agree. This observation is not specific to the example. In general the iterative announcement of posteriors back and forth leads to convergence of the announcements to a common value. To see this, let q1(1)(ω) = p(A | P1(ω)) and q2(1)(ω) = p(A | P2(ω)) denote the first-round announcements at the true state ω. Also, let P1s* be the member of P1 containing ω and P2s* be the member of P2 containing ω. Let I1 = {l | p(A | P1l) = q1(1)(ω)} and let J1 = {j | p(A | P2j) = q2(1)(ω)}. From 2's perspective, 1's announcement implies that the state is in ∪l∈I1 P1l; and from 1's perspective, 2's announcement implies that the state is in ∪j∈J1 P2j. Next period's posteriors are:

q1(2)(ω) = p(A | P1s* ∩ (∪j∈J1 P2j)), q2(2)(ω) = p(A | P2s* ∩ (∪l∈I1 P1l)).
These announcements further refine the index sets I1, J1 to I2, J2.
Proceeding iteratively, the process converges (the index sets can only shrink finitely often): J* = JT+1 = JT, I* = IT+1 = IT, with corresponding posteriors q1* and q2*. Let F = (∪k∈I* P1k) ∩ (∪j∈J* P2j), the set of states consistent with the limit announcements. Thus, for each k ∈ I* and j ∈ J*,

q1* = p(A | P1k ∩ F) and q2* = p(A | P2j ∩ F).

Taking the first of these, p(A ∩ P1k ∩ F) = q1*p(P1k ∩ F). Summing over k ∈ I*:

∑k∈I* p(A ∩ P1k ∩ F) = q1* ∑k∈I* p(P1k ∩ F).

Noting that P1k and P1k′ are disjoint when k ≠ k′, this gives p(A ∩ F) = q1*p(F), or q1* = p(A | F). Repeating these calculations, the same expression holds for q2*, so that q1* = q2*.
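The announcement dynamics in the example can be simulated directly. The sketch below (with a, b, c, d standing for α, β, γ, δ) tracks each agent's information at every state, refining it each round by the other agent's announced posterior, as described above.

```python
# Iterated posterior announcements in the 4-state example:
# uniform prior, A = {a, d}, P1 = {{a,b},{c,d}}, P2 = {{a,b,c},{d}}.
from fractions import Fraction

states = ["a", "b", "c", "d"]
prior = {w: Fraction(1, 4) for w in states}
A = {"a", "d"}
P = {1: [{"a", "b"}, {"c", "d"}], 2: [{"a", "b", "c"}, {"d"}]}

def post(info):
    return sum(prior[w] for w in info if w in A) / sum(prior[w] for w in info)

# K[i][w]: what i knows at state w; initially i's partition cell at w.
K = {i: {w: next(B for B in P[i] if w in B) for w in states} for i in (1, 2)}

for rnd in range(1, 4):
    q = {i: {w: post(K[i][w]) for w in states} for i in (1, 2)}
    print(f"round {rnd}: q1(a) = {q[1]['a']}, q2(a) = {q[2]['a']}")
    # each agent keeps only the states at which the other agent's
    # posterior matches what was actually announced
    K = {i: {w: {v for v in K[i][w] if q[j][v] == q[j][w]} for w in states}
         for i, j in ((1, 2), (2, 1))}
```

Running this prints (½, ⅓) in rounds 1 and 2, and (½, ½) in round 3, reproducing the convergence described in the text.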
16.5 Public Announcements A public announcement is common knowledge: everyone hears the announcement; everyone knows that others have heard the announcement, and so on. When reasoning takes time, a public announcement can lead to learning long after the announcement is made. This is illustrated in the following example. Suppose there is an audition for people to join a theater company, and suppose that the company hires any individual who scores an A; those who do not score an A are not hired. All candidates are present at the audition, with each performing in turn. The person performing cannot see the reaction of the judges or the grade they assign—but the other candidates can. Thus, if candidate i gets a grade A, only i among the candidates does not know this. Individuals are not informed of their grade by the judging committee; but instructions in the audition hall say that anyone with an A grade can sign up for training. Consider the case where there are two interviewees and both score an A grade. In this scenario, person 1 knows that 2 scored an A, and 2 knows that 1 scored an A. But neither knows their own score. In particular, both know that at least
one person scored an A. But, 1 does not know if 2 knows this (and likewise for 2). Neither can sign up for training, because neither knows whether they scored an A, and nothing can be inferred from the behavior of the other. Now, suppose there is a public announcement that at least one person scored an A. From the information structure, 1 knows that 2 knows that at least one person scored an A. Individual 1 can now consider the behavior of 2 under two scenarios. Suppose 1 did not have an A grade. In that case, 2 could infer from the public announcement that 2 must have an A grade, and sign up immediately. If there is a delay and 2 does not sign up immediately, it must be that 1 has an A grade (explaining why the public announcement does not allow 2 to conclude that 2's grade is an A). In sum, 2's delay in signing up informs 1 that 1's grade must be A. After some delay, 1 signs up. Symmetrically, 2 signs up after an equivalent delay (of “one period”). With 3 candidates, there are two periods of delay. In the state where each person got an A grade, each person knows that the other two got an A grade, but does not know their own grade. So, no candidate can sign up with the information available. Now, again suppose there is a public announcement saying that at least one person scored an A. Consider the situation from 1's point of view. There are two possibilities to be considered—individual 1 either did or did not score an A grade. Let 1 reason as follows: 1. Suppose that 1 did not score an A grade. Then 2 and 3 can see this and the situation is the two-person case. So, after a period of delay both will move to sign up. 2. On the other hand, if 1 had an A, 2 and 3 see an A for 1, so 2 and 3 cannot move after the initial round of delay: what each knows is consistent with their own grade not being an A. A second delay occurs, and the observation of this second period of delay tells 1 that 1's grade must be A—it is the only thing that can explain the second delay by the others. Again, with others reasoning in the same way, all sign up after two periods of delay. With more participants, the reasoning process takes longer: each additional participant adds an extra period of delay. But the ultimate outcome is the same; when all have an A grade and it is common knowledge that at least one has, then after a sequence of delays all sign up.
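This delay reasoning can be simulated. In the sketch below, each of n candidates (all with A grades, matching the story) observes the others' grades; a world is eliminated once somebody would have signed up in it, and the loop counts periods until signing occurs.

```python
# Simulation of the audition example: candidates see others' grades but
# not their own; the public announcement is that at least one A was given.
from itertools import product

n = 3
worlds = list(product([0, 1], repeat=n))       # 1 = candidate has an A
possible = [w for w in worlds if any(w)]       # public announcement: >= one A

true_world = (1,) * n                          # everyone scored an A
period = 0
while True:
    period += 1

    def deducible(i, w):
        # still-possible worlds that agree with what i sees (others' grades)
        match = [v for v in possible
                 if all(v[k] == w[k] for k in range(n) if k != i)]
        return all(v[i] == 1 for v in match)   # i's grade is A in all of them

    movers = [i for i in range(n) if deducible(i, true_world)]
    print(f"period {period}: sign up -> {movers if movers else 'nobody'}")
    if movers:
        break
    # nobody moved: eliminate worlds where somebody would have signed up
    possible = [w for w in possible
                if not any(deducible(i, w) for i in range(n))]
```

With n = 3 the output shows two periods of “nobody” before all three sign up in period 3, matching the two periods of delay described above.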
16.6 Common Knowledge of an Aggregate Statistic Recall from earlier that common knowledge at ω of the event {ω′ | q1(ω′) = q1* and q2(ω′) = q2*} implies that q1* = q2*. More generally, with n individuals, common knowledge of the event {ω′ | qi(ω′) = qi*, i = 1, …, n} implies that qi* = q* for some common value q*, ∀i.
It turns out that a much stronger result holds when only a summary statistic is common knowledge. Specifically, take an aggregating function f(q1(ω), …, qn(ω)) that is strictly monotonic and additively separable. The following discussion shows that common knowledge of the value of such a function at ω implies that all the qi's are constant and equal.

Let fi: R → R, i = 1, …, n, be strictly increasing functions and let f(x1, …, xn) = ∑i fi(xi). Let x: Ω → R, and define xi(ω) = E{x | Pi(ω)}. The following discussion considers how common knowledge of the value of ∑i fi(xi) (common knowledge at ω of the event {ω′ | ∑i fi(xi(ω′)) = k}) implies that the xi's are equal, and hence generalizes earlier results.

Write Ii = {l | Pil ⊂ ℳ(ω)}, so that for each i, ∪l∈Ii Pil = ℳ(ω). Ii identifies the collection of partition members of i in ℳ(ω). If ω′ ∈ Pil, then Pi(ω′) = Pil and xi(ω′) = E{x | Pil}. From the definition of xi(ω), the function is constant on each Pil; so for ω′ ∈ Pil write xi(Pil) for xi(ω′). Thus, writing x̄ = E{x | ℳ(ω)},

∑l∈Ii p(Pil)xi(Pil) = p(ℳ(ω))x̄, for each i,

so x̄ is a weighted average of the values {xi(Pil)}l∈Ii. In the case where x = χA, the indicator function of a set A, E{x | Pi(ω)} = qi(ω) = p(A | Pi(ω)).

Theorem 16.2. If for some constant k the event Q = {ω′ | ∑i fi(xi(ω′)) = k} is common knowledge at ω, then ∀i, xi(Pil) = x̄ = E{x | ℳ(ω)} for all l ∈ Ii: all the xi are constant and equal on ℳ(ω).

Since Q is common knowledge at ω, ℳ(ω) ⊆ Q, so that ∑i fi(xi(ω′)) = k, ∀ω′ ∈ ℳ(ω). Summing over ω′ ∈ ℳ(ω) with weights p(ω′), and recalling that xi is constant on each member of {Pil}l∈Ii,

∑i ∑l∈Ii p(Pil)fi(xi(Pil)) = k·p(ℳ(ω)).

For ω′ ∈ Pil, fi(xi(ω′)) = fi(xi(Pil)) and E{x | Pil} = xi(Pil), so that E{fi(xi)·x | Pil} = fi(xi(Pil))xi(Pil). Therefore, summing fi(xi(ω′))x(ω′) over ℳ(ω) and using ∑i fi(xi) = k on ℳ(ω),(16.1)

∑i ∑l∈Ii p(Pil)fi(xi(Pil))xi(Pil) = ∑ω′∈ℳ(ω) p(ω′)·k·x(ω′) = k·p(ℳ(ω))x̄.

Note that:

∑i ∑l∈Ii p(Pil)fi(xi(Pil))x̄ = k·p(ℳ(ω))x̄, and ∑i ∑l∈Ii p(Pil)fi(x̄)xi(Pil) = p(ℳ(ω))x̄∑i fi(x̄) = ∑i ∑l∈Ii p(Pil)fi(x̄)x̄.

Therefore, expanding the product below and substituting these expressions and expression (16.1) above:

∑i ∑l∈Ii p(Pil)[fi(xi(Pil)) − fi(x̄)][xi(Pil) − x̄] = 0.

The product [fi(xi(Pil)) − fi(x̄)][xi(Pil) − x̄] is always nonnegative (each fi is increasing), so for the sum to be 0, each term must be zero; and since each fi is strictly increasing, a zero product forces xi(Pil) = x̄, ∀l ∈ Ii, i = 1, …, n.
16.7 Common Knowledge and Equilibrium In finitely repeated games there are significant end-period effects that work backward through time. One example is Prisoner's Dilemma, where in the last period the players play a Nash equilibrium; with the last period determined, the same holds true one period before the last, and so on back in time to the start of the game.
Thus, in any finite repetition of the Prisoner's Dilemma, equilibrium behavior involves defection at each period. (In the stage game used below, mutual cooperation gives each player 3, mutual defection gives each 1, and a unilateral defector receives 4 while the cooperating opponent receives 0.) However, this conclusion depends on the game being common knowledge. The following discussion shows how relaxing the common knowledge assumption on one dimension (the length of the game) alters the conclusion dramatically. Specifically, for an appropriate information structure, cooperation throughout most of the game is a Nash equilibrium. Suppose that instead of a fixed number of periods, the length of play of the game, T, is drawn according to the distribution p(T) = (1 − δ)δT−1, with 0 < δ < 1.
Then, the game is played repeatedly for T periods. Suppose that the information of the two players is represented as follows: player 1 observes an element of the partition {1, 2}, {3, 4}, {5, 6}, … of the possible lengths, while player 2 observes an element of {1}, {2, 3}, {4, 5}, {6, 7}, ….
Thus if the game is a 5-period game (T = 5 is drawn), player 1 is told that the game is 5 or 6 periods long, while player 2 is told that the game is 4 or 5 periods long. When period 4 is reached, 2 is unsure whether there will be a fifth stage. Suppose that T = 10 is drawn. Then, when period 10 is reached, 1 knows this is the last period, knows that 2 knows the true length is in {10, 11}, and knows that 2 knows that 1 knows the true length is in {9, 10} or {11, 12}. Since 10 has prior probability (1 − δ)δ9 and 11 has prior probability (1 − δ)δ10, the relative probabilities are 1:δ; so given 2 is at information set {10, 11}, 2 attaches probability 1/(1 + δ) to 10 and δ/(1 + δ) to 11. Define the “grim-trigger” strategy for each player as follows. At time t = 1 play C, and at t > 1 play C if the other player has played C at every period in the past and there is positive probability of the game having another stage; otherwise, play D. The following discussion shows that this strategy played by both players is an equilibrium. Suppose that period 10 arrives, and neither has defected. According to the strategy, player 1 will play D. What will 2 do? From 2's perspective, there is probability 1/(1 + δ) that 1 observed {9, 10} and probability δ/(1 + δ) that 1 observed {11, 12}. Thus, there is probability 1/(1 + δ) that the game terminates this period and that 1 will play D; and there is probability δ/(1 + δ) that 1 will play C. Suppose 2 chooses D. Player 2 anticipates that with probability 1/(1 + δ) the game terminates in this period (period 10), in which case 2 gets a payoff of 1; and anticipates that with probability δ/(1 + δ) the game goes to the next period (period 11). In the latter case, 2 believes that 1 would have chosen C as the strategy specifies, so that 2 would earn a payoff of 4 in period 10 playing D, and in period 11 both would play D, giving 2 a payoff of 1—for a total of 5. In sum, 2's expected payoff from D is (1/(1 + δ))·1 + (δ/(1 + δ))·[4 + 1] = (1/(1 + δ))[1 + 5δ]. Suppose that 2 chooses C. Again player 2 anticipates that with probability 1/(1 + δ) the game will end—giving 2 a payoff of 0; and with probability δ/(1 + δ) the game is an 11-period game. In this case, 1, following the strategy, would play C in period 10 and C again in period 11, while 2 chooses C in period 10 and D in period 11, knowing from 2's information set that it is the last period. The total payoff in this case is 3 + 4 = 7. So, the expected payoff to 2 from choosing C in period 10 is (1/(1 + δ))·0 + (δ/(1 + δ))·[3 + 4] = (1/(1 + δ))[7δ]. Comparing the expected payoffs from D and C, C is preferred if 7δ > 1 + 5δ, or 2δ > 1 (δ > ½). Thus, if the length of the game is not common knowledge,
even though the game has a finite length, the cooperative outcome can arise as a Nash equilibrium. In this example only the entire set of times {1, 2, …} is common knowledge.
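The period-10 comparison can be checked numerically; the sketch below reproduces player 2's expected payoff computation under the stage payoffs assumed above.

```python
# Player 2's expected payoffs at period 10 in the repeated Prisoner's
# Dilemma with random length (stage payoffs: CC = 3, DD = 1, D vs C = 4/0).
# Player 2 believes T = 10 with prob 1/(1+delta), T = 11 with delta/(1+delta).

def payoffs_in_period_10(delta):
    p10 = 1 / (1 + delta)          # game ends this period; 1 plays D
    p11 = delta / (1 + delta)      # one more period; 1 plays C now
    v_defect = p10 * 1 + p11 * (4 + 1)     # D now, then both D in period 11
    v_cooperate = p10 * 0 + p11 * (3 + 4)  # C now, then D in the known last period
    return v_cooperate, v_defect

for delta in (0.4, 0.5, 0.6, 0.9):
    c, d = payoffs_in_period_10(delta)
    print(f"delta={delta}: C -> {c:.3f}, D -> {d:.3f}, cooperate? {c > d}")
# Cooperation is preferred exactly when 7*delta > 1 + 5*delta, i.e. delta > 1/2.
```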
16.8 No-Trade Theorems In a gambling situation where risk averse players have a common prior over states there must be winners and losers in terms of expected payoffs. But, when there is private information one might expect that there may still be scope for trade because the effect is similar to giving different agents different (subjective) distributions. However, with common knowledge this is not the case—in the sense that it is not possible for it to be common knowledge that everyone can gain. This idea is articulated in the context of an exchange model with risk. Let ℰ be an exchange economy. Uncertainty is represented by Ω = Θ × X where Θ is the set of payoff relevant variables and X is the set of payoff irrelevant variables. Utility depends on consumption (points in Rn) and the payoff relevant state. If i consumes c ∈ Rn at state θ, i's payoff is ui(θ, c). Individual i's endowment is ei: Θ → Rn. Prior distributions over Ω, pi, are given for each i. The main assumption on the prior distributions is that pi(x | θ) = pj(x | θ). A trader is risk averse if for each θ, ui(θ, c) is concave in c. A trade is a vector of trades: t = (t1, …, tn), where ti: Ω → Rn. A trade is called θ contingent, or a θ trade, if ti: Θ → Rn. A trade is feasible if ei(θ) + ti(θ, x) ≥ 0, ∀i, θ, x and ∑iti(θ, x) ≤ 0.

Theorem 16.3. Suppose that:
(1) All traders are weakly risk averse.
(2) e = (e1, …, en), the initial allocation, is Pareto optimal relative to θ trades.
(3) Prior beliefs satisfy pi(x | θ) = pj(x | θ).
(4) Each trader, i, observes information conveyed by a private partition, Pi.
(5) It is common knowledge at ω* that t is a feasible θ trade and each individual weakly prefers t to the 0-trade.

Then:
(a) Every agent is indifferent between t and the 0-trade.
(b) If all agents are strictly risk averse, t is the 0-trade.

Proof. From (5), ∀ω ∈ ℳ(ω*), Ei{ui(θ, ei + ti) | Pi(ω)} ≥ Ei{ui(θ, ei) | Pi(ω)}. If (a) is false, this inequality is strict for some i. Let ti* = ti·χℳ(ω*), ∀i (where χE is the characteristic or indicator function of E). Since t is feasible, so also is t*, because ti* is equal to ti or 0. However, t* is generally not a θ trade (as it depends
on both θ and x through ℳ(ω*)). Note that t* involves trade on the region where it is common knowledge that trade is feasible and mutually acceptable. Write ℳ(ω*)c for the complement of ℳ(ω*) in Ω and observe that

Ei{ui(θ, ei + ti*)} = Ei{ui(θ, ei + ti)·χℳ(ω*)} + Ei{ui(θ, ei)·χℳ(ω*)c}
 = Ei{Ei{ui(θ, ei + ti) | Pi}·χℳ(ω*)} + Ei{ui(θ, ei)·χℳ(ω*)c}
 ≥ Ei{Ei{ui(θ, ei) | Pi}·χℳ(ω*)} + Ei{ui(θ, ei)·χℳ(ω*)c}
 = Ei{ui(θ, ei)}.

The equality between lines 1 and 2 follows because ℳ is coarser than Pi (χℳ(ω*) is Pi-measurable, so iterated expectations apply), and the inequality follows from (5): Ei{ui(θ, ei + ti) | Pi} ≥ Ei{ui(θ, ei) | Pi}, ∀ω ∈ ℳ(ω*). Thus, if t is a feasible mutually acceptable θ-trade then t* satisfies, ∀i, Ei{ui(θ, ei + ti*)} ≥ Ei{ui(θ, ei)}, with strict inequality for some i if Ei{ui(θ, ei + ti) | Pi} > Ei{ui(θ, ei) | Pi}. Now,

Ei{ui(θ, ei + ti*)} = Ei{Ei{ui(θ, ei + ti*) | θ}} ≤ Ei{ui(θ, ei + Ei{ti* | θ})},

where the inequality follows from Jensen's inequality (ui(θ, ·) is concave). Let ti** = Ei{ti* | θ} = E{ti* | θ}, from the assumption pi(x | θ) = pj(x | θ). Because t* is feasible so is t** = E{t* | θ} (ei(θ) + ti**(θ) ≥ 0 and ∑iti**(θ) ≤ 0, taking conditional expectations of the feasibility conditions), and t** is a θ trade. Therefore,

Ei{ui(θ, ei + ti**)} ≥ Ei{ui(θ, ei)}, ∀i,

with strict inequality for any i where Ei{ui(θ, ei + ti) | Pi(ω)} ≥ Ei{ui(θ, ei) | Pi(ω)} is strict. This contradicts the hypothesis that e is optimal relative to θ trades, proving (a). So, under assumptions 1–5, if τ is a θ-trade that is common knowledge feasible and weakly preferred by all, then

Ei{ui(θ, ei + τi)} = Ei{ui(θ, ei)}, ∀i.

If each ui is strictly concave and τ ≠ 0, then for λ ∈ (0, 1),

Ei{ui(θ, ei + λτi)} > (1 − λ)Ei{ui(θ, ei)} + λEi{ui(θ, ei + τi)} = Ei{ui(θ, ei)},

so the trade λτ is a θ-trade which is Pareto improving. Thus, τ = 0, proving (b).
Bibliography Aumann, R. (1976). “Agreeing to Disagree,” Annals of Statistics, 4, 1236–1239. Geanakoplos, J. D. and Polemarchakis, H. M. (1982). “We Can't Disagree Forever,” Journal of Economic Theory, 28, 192–200. Hart, S. and Tauman, Y. (2004). “Market Crashes Without External Shocks,” Journal of Business, 77, 1–8. McKelvey, R. and Page, T. (1986). “Common Knowledge, Consensus and Aggregate Information,” Econometrica, 54 (1), 109–127. Milgrom, P. (1981). “An Axiomatic Characterization of Common Knowledge,” Econometrica, 49, 219–222. Milgrom, P. and Stokey, N. (1982). “Information, Trade and Common Knowledge,” Journal of Economic Theory, 26, 177–227. Nielsen, L. T., Brandenburger, A., Geanakoplos, J., McKelvey, R., and Page, T. (1990). “Common Knowledge of an Aggregate of Expectations,” Econometrica, 58 (5), 1235–1239. Rubinstein, A. (1989). “The Electronic Mail Game: Strategic Behavior under Almost Common Knowledge,” American Economic Review, 79, 385–391.
17 Bargaining 17.1 Introduction Bargaining concerns the allocation of surplus between individuals. One natural assumption is that bargaining will lead to an efficient outcome so that when negotiation concludes, whatever agreement is reached, there is no alternative that could make the players better off. While efficiency means that the outcome is on the Pareto frontier of the set of possible payoffs, this leaves open a large number of possibilities since the Pareto frontier usually contains many points. Axiomatic bargaining imposes axioms that restrict the possible outcomes and typically provides tight predictions. Noncooperative bargaining approaches the problem of surplus division from a strategic perspective where issues such as first mover advantage and patience in negotiation come into play. In what follows the basic issues in the bargaining problem are laid out. Section 17.2 gives the context for axiomatic bargaining and introduces the bargaining set. Here, it is shown how a bargaining set can be derived from an environment described by actions and payoff functions. Section 17.3 describes four axiomatic bargaining solutions. The egalitarian and utilitarian bargaining solutions are discussed in Section 17.3.1. Section 17.3.2 develops the Nash bargaining solution and the Kalai–Smorodinsky (K–S) solution is considered in Section 17.3.3. Noncooperative bargaining is presented in Section 17.4 with focus on the alternating offers model. The main point is that in the two-player model with a recursive structure in the game, there is a unique subgame perfect equilibrium. Furthermore, there is a close connection between the Nash bargaining and the alternating offers model: as the length of time between offers becomes small, the equilibrium payoffs approach the generalized Nash bargaining solution. This connection is explored in Section 17.5 following the discussion of noncooperative bargaining. Finally, in Section 17.6 the n-person noncooperative bargaining problem is considered. The main point there is that many of the desirable features of the two-person model
do not carry over. In general there is a multiplicity of equilibria and only strong assumptions (such as stationarity) restore uniqueness.
17.2 Axiomatic Bargaining A bargaining problem is defined by a pair (X, d), X ⊂ R², d ∈ R², where X is the set of feasible payoffs and d is a “disagreement point.” The set X is assumed to satisfy convexity, comprehensiveness (free disposal), closure, and boundedness from above. The family of such problems is denoted ℬ:

ℬ = {(X, d) | X ⊂ R² convex, closed, comprehensive, and bounded above; d ∈ R²}.
Convexity is a technical requirement, often justified by allowing randomization over outcomes, and comprehensiveness (or free disposal) requires that if x is feasible, so is any y ≤ x. Given a bargaining problem, it is possible to discuss bargaining alternatives in terms of the payoff set, X. This is done below, but before doing so, some possible models underlying X are described. In the following example, X is derived from a “physical situation” described by the matrix giving payoffs to various choice combinations (but not interpreted as a strategic form game).
Here u(si, tj) = (u1(si, tj), u2(si, tj)) gives the payoffs to 1 and 2, respectively, from the choice (si, tj). Note that the minmax for both players is 1, so let d = (1, 1) (the maxmin point is (1, 1)). Let Y = {y | ∃(si, tj), y = u(si, tj)}, and put Y* = conv Y. Finally X = {x | ∃ξ ∈ Y*, x ≤ ξ}. The next example is a “pie division” example. Let S = {(s1, s2) | si ≥ 0, s1 + s2 ≤ 1}. Utility for i is ui: [0, 1] → R, ui′ > 0. F = {u = (u1, u2) | ∃s ∈ S, u(s) = u}, u(s) = (u1(s1), u2(s2)). When s1 + s2 = 1, it is convenient to write s for the share of 1, so that s(u1), defined by u1(s(u1)) = u1, gives the implied share of 1, if 1's utility is u1. Thus, u1′(s(u1))s′(u1) = 1 or s′(u1) = 1/u1′(s(u1)) > 0, and
s″(u1) = −(u1″(s(u1))[s′(u1)]²)/u1′(s(u1)). So, s″(u1) > 0 if u1″ < 0, and s″(u1) < 0 if u1″ > 0. Put ϕ(u) = u2(1 − s(u)), the utility that 2 gets when 1 gets utility u. Then, ϕ′(u) = −u2′(1 − s(u))s′(u) < 0, and ϕ″(u) = u2″(1 − s(u))[s′(u)]² − u2′(1 − s(u))s″(u). If ui′ > 0 and ui″ < 0, then ϕ″ < 0; if ui′ > 0 and ui″ > 0, then ϕ″ > 0. These situations are depicted in the following figures.
If ui″ > 0, the set {(u1, u2) | ∃(s1, s2) ∈ S, ui ≤ ui(si), i = 1, 2} is not convex. In this case, convexity can be obtained by randomization. In a bargaining problem, (X, d), how is a point selected in X to determine the allocation or sharing of welfare between the two individuals? A minimum requirement is efficiency—that it is not possible to make at least one person better off without worsening anyone's welfare. Imposing this condition gives the bargaining set. The bargaining set is:

B(X, d) = {x ∈ X | there is no z ∈ X with z ≥ x and z ≠ x},
and gives the entire Pareto frontier. At the other extreme one may consider procedures selecting a unique outcome. Any such procedure is called a solution. Definition 17.1. A solution to the bargaining problem is a function f, where f: ℬ → R², with f(X, d) ∈ X. Any such solution involves selection, and that in turn requires selection criteria. In the bargaining context, such criteria take the form of axioms, leading to axiomatic bargaining.
17.3 Axiomatic Bargaining Solutions Two of the simplest bargaining solutions are the egalitarian and the utilitarian solutions. These are discussed next, and then Nash bargaining is considered.
17.3.1 Egalitarian and utilitarian solutions One obvious procedure for selecting a welfare allocation is to give each person the same welfare. This is called the egalitarian solution. A second procedure is to maximize the sum of welfares—the utilitarian solution.
The egalitarian solution is defined:

fE(X, d) is the maximal point of X with equal gains, f1E(X, d) − d1 = f2E(X, d) − d2.

Similarly the utilitarian solution is:

fU(X, d) ∈ argmax{x1 + x2 | x ∈ X}.
Whereas the egalitarian solution is unique, the utilitarian solution may not be—in which case the procedure takes a selection from “equivalent” outcomes. These solutions have drawbacks. For example, the egalitarian solution may require a very large reduction in the welfare of one person for a small gain in the welfare of the other; the utilitarian solution may require a very unequal distribution of welfare. How should such solutions be evaluated? The primary procedure for proposing and evaluating solutions is the axiomatic approach—a solution is justified in terms of the principles that define it. The best known bargaining solution is the Nash bargaining solution.
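As a small illustration, the sketch below computes both solutions on a polygonal payoff set with d = (0, 0); the frontier used is an illustrative assumption.

```python
# Egalitarian and utilitarian selections on a polygonal Pareto frontier
# (an illustrative assumption), with disagreement point d = (0, 0):
# frontier y = 10 - x/3 for x in [0, 6], y = 20 - 2x for x in [6, 10].

def frontier(x):
    return 10 - x / 3 if x <= 6 else 20 - 2 * x

xs = [j / 1000 for j in range(10001)]                 # x in [0, 10]
egal = min(xs, key=lambda x: abs(frontier(x) - x))    # equal gains from d = 0
util = max(xs, key=lambda x: x + frontier(x))         # maximal sum of payoffs
print("egalitarian ~", (round(egal, 3), round(frontier(egal), 3)))  # ~ (6.67, 6.67)
print("utilitarian ~", (round(util, 3), round(frontier(util), 3)))  # (6.0, 8.0)
```

Note how the two criteria pick different points on the same frontier: equal division versus the kink where the sum of payoffs is largest.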
17.3.2 The Nash bargaining solution The Nash bargaining solution is characterized by four axioms. 1. Pareto efficiency (PE): f(X, d) ≥ d, and ξ ≥ f(X, d) with ξ ≠ f(X, d) ⇒ ξ ∉ X. (An efficient point is selected.) 2. Independence of irrelevant alternatives (IIA): Let d ∈ Y ⊆ X. f(X, d) ∈ Y ⇒ f(X, d) = f(Y, d). (If y ∈ Y was chosen when the choice set was X, then in a smaller choice set containing y, y will still be chosen.) 3. Invariance under increasing affine transformations (INV): Let L be an affine transformation on R², so that L has the form: L(x1, x2) = (a1x1 + b1, a2x2 + b2), ai > 0. Then L(f(X, d)) = f(LX, Ld). (In words, affine rescaling of the choice set and default point leads to the same affine rescaling of the choice.) 4. Symmetry (S): Let d1 = d2 and (y, x) ∈ X ⇒ (x, y) ∈ X. Then f1(X, d) = f2(X, d). (Identical people are treated the same way.) None of these is contentious, except possibly the invariance axiom. Together, these axioms pin down a unique solution—the Nash bargaining solution.
Theorem 17.1. Suppose ∃x ≫ d. If f: ℬ → R² satisfies PE, IIA, INV, and S, then f is the unique solution to:

max (x1 − d1)(x2 − d2) subject to x ∈ X, x ≥ d;

that is, f(X, d) = argmax{(x1 − d1)(x2 − d2) | x ∈ X, x ≥ d}.
Proof. (Sketch) Define φ(x1, x2; d) = (x1 − d1)(x2 − d2) and let (x1*, x2*) solve max{φ(x; d) | x ∈ X, x ≥ d}. The function φ has strictly convex level surfaces, and since X is convex, there is a unique solution: if x̂ ≠ x* and φ(x̂; d) ≥ φ(x*; d), then ½(x̂ + x*) ∈ X and φ(½(x̂ + x*); d) > φ(x*; d), contradicting the optimality of x*. In the symmetric case, the solution to max{φ(x; d) | x ∈ X, x ≥ d} is symmetric. So, f satisfies S. Since PE is satisfied, it remains to check INV and IIA. IIA is clear from the definition of f. To confirm INV, with αi > 0, let xi′ = αixi + βi, i = 1, 2, be an increasing affine transformation on R², x′ = L(x), and put di′ = αidi + βi, so x′ = (x1′, x2′) = L(x) and d′ = (d1′, d2′) = L(d), and observe that since xi′ − di′ = αi(xi − di),

φ(L(x); L(d)) = α1α2φ(x; d).

It is necessary to show that if x* maximizes φ(x; d) on X, then L(x*) maximizes φ(x; L(d)) on L(X). Suppose this is not the case, so that there is a z ∈ L(X) with φ(z; L(d)) > φ(L(x*); L(d)). Since z ∈ L(X), there is an x̂ ∈ X with z = L(x̂). Then the inequality may be written φ(L(x̂); L(d)) > φ(L(x*); L(d)). From the calculations above, this implies α1α2φ(x̂; d) > α1α2φ(x*; d), or φ(x̂; d) > φ(x*; d), contradicting the hypothesis that x* is optimal on X. Thus, f satisfies INV. Could there be another function f* satisfying the axioms? It remains to show that f is unique. At x* the level surface of φ(x) (for ease of notation write φ(x) instead of φ(x; d), since d is constant throughout the discussion) is tangent to the set X, and the normal vector at x* is φx(x*) = (x2* − d2, x1* − d1). The vector φx(x*) is perpendicular to the tangent line, T, separating X and {x | φ(x) ≥ φ(x*)}: x* is the unique point in X ∩ {x | φ(x) ≥ φ(x*)}. Here T = {x | φx(x*)·x = φx(x*)·x*}, the tangent line to the level surface of φ at x*.
If x′ ∈ T, then φx(x*) (x′ − x*) = 0 and if x′ is on or below the line T, φx(x*) (x′− x*) ≤ 0. Let Δ be the triangle with corners d, û2 and û1. By construction, f(Δ,d) = x*.
There is (see below) a linear transformation, L, which moves (0,û2) to (0,2), (û1,0) to (2,0), (x1*, x2*) to (1, 1), and d to (0, 0). Thus, L(Δ) = Δ*, where Δ* is the triangle with vertices, (0,0), (2,0), and (0,2).
The transformation L, L(x1, x2) = ((x1 − d1)/(x1* − d1), (x2 − d2)/(x2* − d2)), moves the triangle with corners d, û1, û2 to the triangle with corners (0, 0), (2, 0), and (0, 2). For the bargaining set defined by Δ*, S and PE imply that (1, 1) is the unique solution: (1, 1) = f*(Δ*, (0, 0)). Then, invariance implies that x* = L−1(1, 1) = L−1(f*(Δ*, (0, 0))) = f*(L−1Δ*, L−1(0, 0)) = f*(Δ, d), and IIA implies x* = f*(X, d). Thus, if f* satisfies the axioms, f*(X, d) = f(X, d) = x*. This completes the argument. The following discussion clarifies the definition of L used above. Note from the definition of φ, (∂φ/∂x1|x*, ∂φ/∂x2|x*) = ((x2* − d2), (x1* − d1)). So, the region under the tangent line T is defined by (x2* − d2)x1 + (x1* − d1)x2 ≤ (x2* − d2)x1* + (x1* − d1)x2*, or:
x1/(x1* − d1) + x2/(x2* − d2) ≤ x1*/(x1* − d1) + x2*/(x2* − d2),

or, subtracting d1/(x1* − d1) + d2/(x2* − d2) from both sides,

(x1 − d1)/(x1* − d1) + (x2 − d2)/(x2* − d2) ≤ 2.
Consider the transformation L: L(x1, x2) = ((x1 − d1)/(x1* − d1), (x2 − d2)/(x2* − d2)). Under this transformation, the tangent line T moves to the line {(x1′, x2′) | x1′ + x2′ = 2}. Also, x* moves to (1, 1) and d moves to (0, 0). By symmetry, the bargaining solution must be (1, 1). By invariance, L−1(1, 1) is the solution to the original problem. Hence the solution defined by maximizing φ is the unique solution satisfying the axioms. If the symmetry axiom is dropped, the generalized Nash bargaining solution is obtained.
Theorem 17.2. Suppose ∃x ≫ d. If f: ℬ → R² satisfies PE, IIA, and INV, then for some α, β ≥ 0, f is the unique solution to

max (x1 − d1)^α (x2 − d2)^β subject to x ∈ X, x ≥ d;

that is, f(X, d) = argmax{(x1 − d1)^α (x2 − d2)^β | x ∈ X, x ≥ d}.
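A numerical sketch of the (symmetric) Nash solution for the pie-division setting: with u1(s) = √s and u2(s) = s (illustrative assumptions) and d = (0, 0), the Nash product is maximized over shares.

```python
# Nash bargaining over pie shares: maximize (u1(s) - d1) * (u2(1-s) - d2)
# with u1(s) = s**0.5, u2(s) = s, d = (0, 0) -- illustrative assumptions.

def nash_product(s):
    return (s ** 0.5) * (1 - s)

grid = [j / 100000 for j in range(100001)]   # shares s in [0, 1]
s_star = max(grid, key=nash_product)
print("share of 1 ~", round(s_star, 4))      # analytic solution: s = 1/3
# Check: d/ds [s^(1/2)(1-s)] = 0  =>  (1-s)/(2s) = 1  =>  s = 1/3.
```

The more risk averse (concave) player receives the smaller share, which is the standard comparative static of the Nash solution.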
17.3.3 The Kalai–Smorodinsky (K–S) bargaining solution The K–S solution is a second axiomatic bargaining solution. Like the Nash bargaining axiomatization, the K–S axioms include PE, S, and INV. However, IIA is replaced with an axiom called individual monotonicity (IM). These four axioms yield a unique solution. Fix (X,d) and let m1 = m1(X,d) = max{x1 ∣ (x1, d2) ∈ X}, and m2(X,d) = max{x2 ∣ (d1, x2) ∈ X}. So, mi(X, d) is the most that i could get, given that j must get at least the default level dj. Define Ldm(X,d) = {z ∈ R2 ∣ ∃ θ ∈ [0,1], z =θ d +(1 − θ) m}, and let eff(X) = {x ∈ X ∣ ∄ z > x, z ∈ X}. The K–S solution is defined: f(X,d) = Ldm(X,d) ∩eff(X). The following example illustrates. The matrix associates payoffs to choices. The figure on the right gives the bargaining possibilities.
Here, m1 = 20/3 and m2 = 15/4. The line connecting (0, 4) and (4, 3) is y = 4 − ¼x, and the line connecting (4, 3) and (8, 0) is y = 6 − ¾x. So, the solution is at the intersection of y = 6 − ¾x and the line connecting d = (1, 1) and m = (20/3, 15/4). Solving these two equations gives (x, y) = (373/84, 299/112) ≈ (4.44, 2.67). Like the Nash bargaining solution, the K–S solution has an axiomatic basis. The key difference is that IIA is replaced by the condition of IM. Individual monotonicity (IM): Let (X, d) and (Y, d) be two bargaining problems with the same default point, d. Suppose that: 1. m1(X, d) = m1(Y, d), and 2. b2(X, z1) = max{z2 | (z1, z2) ∈ X} ≥ b2(Y, z1), ∀z1 ∈ [d1, m1]. Then f2(X, d) ≥ f2(Y, d). A symmetric condition applies for index 2: m2(X, d) = m2(Y, d), and b1(X, z2) = max{z1 | (z1, z2) ∈ X} ≥ b1(Y, z2), ∀z2 ∈ [d2, m2], implies f1(X, d) ≥ f1(Y, d). So,
for example, monotonicity says that for any payoff to 1, if the highest possible payoff to 2 in (X, d) is at least as large as the highest possible payoff in (Y, d), then the bargaining solution gives 2 at least as much in (X, d) as it does in (Y, d). Alternatively, one can write this condition as:

m1(X, d) = m1(Y, d) and b2(X, z1) ≥ b2(Y, z1), ∀z1 ∈ [d1, m1] ⇒ f2(X, d) ≥ f2(Y, d).

With this terminology: Theorem 17.3. If (X, d) is regular and f(X, d) satisfies PE, INV, S, and IM, then f(X, d) is the K–S solution. Proof. (Sketch) For simplicity, assume that d = 0 (translating X by −d). Observe directly that the K–S solution satisfies PE, S, and IM. Invariance is also satisfied: if L is an increasing affine transformation and x* is the K–S solution, then L(x*) is on the line segment connecting L(0) and L(m1, m2), and L(x*) is on the Pareto frontier of L(X). To prove that the K–S solution is the only solution, let x* be the K–S solution of (X, 0). Consider the transformation L satisfying L(0) = 0 and L(m1(X), m2(X)) = (1, 1). Since X is convex the image L(X) is convex, with Pareto frontier given by the curve through (0, 1), x̂, and (1, 0) and lying above the line segments connecting (0, 1) to x̂ and x̂ to (1, 0). This transformation moves x* to the 45° line at x̂ = L(x*).
The set defined by the corners (0, 1), x̂, and (1, 0) is symmetric, so that x̂ is the only solution satisfying PE and S on this set. Applying IM twice (once for each individual) gives that x̂ is the only solution satisfying the axioms on the set L(X), and applying L−1, x* = L−1(x̂) is the only solution satisfying all four axioms on (X, d).
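The K–S computation in the numerical example above can be verified exactly with rational arithmetic; the sketch below assumes the frontier and default point given there.

```python
# Verify the Kalai-Smorodinsky example: d = (1, 1), frontier
# y = 4 - x/4 on [0, 4] and y = 6 - 3x/4 on [4, 8].
from fractions import Fraction as F

d = (F(1), F(1))
m1 = F(20, 3)                 # largest x with y >= 1: solve 6 - 3x/4 = 1
m2 = F(15, 4)                 # largest y with x >= 1: 4 - 1/4

# line through d and (m1, m2): y = 1 + slope * (x - 1)
slope = (m2 - d[1]) / (m1 - d[0])
# intersect with the frontier piece y = 6 - 3x/4:
x = (F(5) + slope) / (slope + F(3, 4))
y = 6 - F(3, 4) * x
print("K-S solution:", x, y)  # 373/84 and 299/112
```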
The Nash and K–S solutions Both solutions satisfy PE, S, and INV. In addition, as observed, the Nash bargaining solution satisfies IIA and the K–S solution satisfies IM. The following examples show how these conflict: the Nash solution fails IM and the K–S solution fails IIA. The first example shows that the Nash bargaining solution does not satisfy IM.
The choice set expands from conv{(0,0), (0,10), (6½,7½), (10,0)} to conv{(0,0), (0,10), (9½,7), (10,0)}, and the Nash bargaining solution moves from (6½,7½) to (9½,7). Thus, although the choice set is larger, individual 2 gets a lower payoff (7 instead of 7½). The next example shows that the K–S solution does not satisfy IIA (independence of irrelevant alternatives).
In this example, removing the heavily shaded section moves the solution from • to ⊙. Since • and ⊙ are feasible before and after the change, IIA is violated. Finally, as a remark, these solutions have generalizations to the n-person case.
17.4 Noncooperative Bargaining
In Nash bargaining on the division of a pie, the division (s, 1 − s) gives 1 utility u = u1(s). Alternatively, achieving utility u requires that 1 receive s = u1⁻¹(u). In that case, 2 receives 1 − s for utility ϕ(u) = u2(1 − u1⁻¹(u)). Assume that ui is strictly increasing and normalized so that ui(0) = 0 and ui(1) = 1. Then, ϕ(u) is strictly decreasing in u, with ϕ(0) = 1 and ϕ(1) = 0. In the alternating offers model, the pattern of play is described as follows. Individual 1 proposes a pie division (s, 1 − s). This is either accepted or rejected by 2. Acceptance concludes with the corresponding division; rejection moves the process to period 2 where person 2 is now the proposer. Offers cycle back and forth until an acceptance occurs. Technically, this is equivalent to proposing in terms of utility allocations (û1, û2), where û1 = u1(s) and û2 = u2(1 − s). And, since a utility proposal for 1 determines the utility for 2 (û2 = ϕ(û1)), it is sufficient to
focus on the utility obtained by 1. Remarkably, the alternating offers model leads to a unique equilibrium in payoff space, defined by the equation: ϕ(u) = δ2ϕ(δ1u), where δi is the discount factor of person i. The following discussion characterizes the equilibrium. Let ū be the highest payoff that individual 1 gets in any subgame perfect equilibrium. In any subgame perfect equilibrium, player 1 will always accept an offer of ū, so 2 will never offer more. If period 2 is reached, the present value in period 2 of the best equilibrium payoff possible for player 1 in period 3 is δ1ū. So, if period 2 is reached, 1 will accept an offer yielding utility δ1ū: an offer of u1⁻¹(δ1ū). Therefore, 2 gets a share of at least 1 − u1⁻¹(δ1ū), and so if period 2 is reached, 2 will get a utility of at least ϕ(δ1ū). Consequently, in period 1, player 1 must offer a utility of at least δ2ϕ(δ1ū) to player 2, or else 2 will not accept the proposal. So, the best 1 can get in utility terms, ū, is bounded by the requirement that 2 obtain at least this amount: ϕ(ū) ≥ δ2ϕ(δ1ū). Conversely, let u be the lowest payoff that 1 gets in any subgame perfect equilibrium. Repeating the arguments above, in period 2 person 1 will accept no offer below δ1u, so in that period, 2 will get utility no higher than ϕ(δ1u), and so in period 1 person 1 will offer (utility to 2) no more than δ2ϕ(δ1u): ϕ(u) ≤ δ2ϕ(δ1u). Combining these and the condition u ≤ ū yields three conditions: (1) u ≤ ū, (2) ϕ(u) ≤ δ2ϕ(δ1u), and (3) ϕ(ū) ≥ δ2ϕ(δ1ū). The functions ϕ(·) and δ2ϕ(δ1·) are depicted below. Note from condition (2) that the smallest possible value for u is u*, u ≥ u*; and from condition (3) the largest possible value for ū is u*, ū ≤ u*. Thus, u* ≥ ū ≥ u ≥ u*, so that ū = u.
In the figure, for u < u*, ϕ(u) > δ2ϕ(δ1u), and for u > u*, ϕ(u) < δ2ϕ(δ1u). The figure is drawn on the assumption that ui′ > 0 and ui″ < 0. Those conditions are sufficient to guarantee a unique solution, as depicted. From the solution u* satisfying ϕ(u*) = δ2ϕ(δ1u*), the actual division is determined: the share of individual 1 is s* = u1⁻¹(u*). (If ui″ is not everywhere nonpositive, then the curves ϕ(·) and δ2ϕ(δ1·) could have multiple intersections.)
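To make the fixed point concrete, the sketch below (Python) solves ϕ(u*) = δ2ϕ(δ1u*) by bisection; the utility functions u1(s) = s^0.5 and u2(s) = s^0.7 are assumptions chosen to satisfy the normalization and concavity conditions above:

def phi(u):
    # phi(u) = u2(1 - u1^{-1}(u)), with u1(s) = s**0.5 and u2(s) = s**0.7.
    return (1 - u**2) ** 0.7

delta1, delta2 = 0.9, 0.95

def g(u):
    return phi(u) - delta2 * phi(delta1 * u)

lo, hi = 0.0, 1.0      # g(0) = 1 - delta2 > 0 and g(1) = -delta2 * phi(delta1) < 0
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid

u_star = (lo + hi) / 2
print(u_star, u_star**2)   # 1's equilibrium utility and the implied share u1^{-1}(u*)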
The basic idea in the alternating offers model can be extended to more complex bargaining situations with two people, as long as the recursive structure is preserved. To illustrate, consider the following model: 1. In period 1, person 1 is drawn with probability p to make an offer, and person 2 is drawn with probability (1 − p). 2. If 1 is drawn, 1 proposes a pie division. If the proposal is accepted by 2, the process ends; rejection by 2 introduces a delay of k periods, after which the same probabilities are used to draw a person to propose. 3. If 2 is drawn initially, 2 makes a pie division proposal. Acceptance by 1 concludes the bargaining; rejection leads to the next period where an individual is drawn again according to the same probabilities to make an offer. For simplicity, assume that ui(s) = s, i = 1, 2. The figure depicts the structure of the game.
The equilibrium is found using reasoning similar to that above. Let s̄ be the best expected payoff to 1 in any subgame perfect equilibrium. If 1 is drawn and 2 rejects, then k periods later the most 1 will get is s̄ and so the least that 2 will get is 1 − s̄. So, the initial offer will be rejected by 2 if it does not give 2 at least δ2^k(1 − s̄). So if 1 is the proposer, the most that can be obtained is 1 − δ2^k(1 − s̄). If 2 is drawn initially to propose, then 1 will accept for sure any offer of δ1s̄ or more, since rejection takes the process one period on, where the highest possible expected payoff for 1 is s̄. Thus 2 will never offer 1 more than δ1s̄. Therefore, s̄ ≤ p[1 − δ2^k(1 − s̄)] + (1 − p)δ1s̄. Conversely, let s be the worst subgame perfect equilibrium payoff for 1. Similar reasoning gives that s ≥ p[1 − δ2^k(1 − s)] + (1 − p)δ1s. Combining these implies that there is a unique subgame perfect equilibrium expected payoff, s*, for player 1:
s* = p(1 − δ2^k)/(1 − pδ2^k − (1 − p)δ1).
One can study the behavior of this division as the time between offers goes to zero. Let δi = e^(−ρiΔ), so the discount rate for i is ρi per unit of time. So,
s*(Δ) = p(1 − e^(−ρ2kΔ))/(1 − pe^(−ρ2kΔ) − (1 − p)e^(−ρ1Δ)) → pρ2k/(pρ2k + (1 − p)ρ1), as Δ → 0.
Thus, for example, with arbitrarily small time intervals, for fixed p, if k → ∞ the impact on the pie division is the same as p → 1, with k fixed. Large k and large p strengthen the bargaining position of player 1.
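A short computation illustrates this limit; the sketch below (Python) evaluates the expression for s* derived above at δi = e^(−ρiΔ) for shrinking Δ (the parameter values are arbitrary):

import math

p, k = 0.6, 3
rho1, rho2 = 0.1, 0.2

def s_star(Delta):
    # s* = p(1 - d2^k) / (1 - p d2^k - (1 - p) d1), with d_i = exp(-rho_i * Delta).
    d1, d2 = math.exp(-rho1 * Delta), math.exp(-rho2 * Delta)
    return p * (1 - d2**k) / (1 - p * d2**k - (1 - p) * d1)

for Delta in (1.0, 0.1, 0.01, 0.001):
    print(Delta, s_star(Delta))

# Closed-form limit as Delta -> 0: here p*rho2*k/(p*rho2*k + (1 - p)*rho1) = 0.9.
print(p * rho2 * k / (p * rho2 * k + (1 - p) * rho1))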
17.5 Alternating Offers and Nash Bargaining
The following discussion connects the Nash bargaining solution with the outcome determined by the alternating offers model. The main observation is that as the length of time between offers becomes small, the payoff allocation in the alternating offers model approaches that of the Nash bargaining solution. From the equation ϕ(u*) = δ2ϕ(δ1u*), consider the case where δ1 and δ2 are close to 1. By the mean value theorem,
ϕ(δ1u*) = ϕ(u*) + ϕ′(δ̃u*)(δ1 − 1)u*,
where δ̃ is between δ1 and 1. Writing δ2 = 1 + (δ2 − 1),
ϕ(u*) = δ2ϕ(δ1u*) = [1 + (δ2 − 1)][ϕ(u*) + ϕ′(δ̃u*)(δ1 − 1)u*].
Rearranging,
0 = (δ2 − 1)ϕ(u*) + (δ1 − 1)u*ϕ′(δ̃u*)
+ (δ2 − 1)(δ1 − 1)u*ϕ′(δ̃u*).
The terms on the second line are of small order of magnitude (relative to (δ1 − 1) + (δ2 − 1)). So,
(δ2 − 1)ϕ(u*) + (δ1 − 1)u*ϕ′(δ̃u*) = O((δ1 − 1)(δ2 − 1)),
where O(x) means order of magnitude no larger than x. Thus,
ϕ(u*)/(u*ϕ′(δ̃u*)) ≈ −(δ1 − 1)/(δ2 − 1).
To parametrize, let
δi = e^(−ρiΔ), so that (δ1 − 1)/(δ2 − 1) → ρ1/ρ2 as Δ → 0;
so, as Δ → 0, limΔ→0 u*(δ1, δ2) = u*, defined by:
ϕ(u*)/(u*ϕ′(u*)) = −ρ1/ρ2.
In these terms, the generalized Nash bargaining solution is given by:
max0≤u≤1 u^(1/ρ1) ϕ(u)^(1/ρ2).
Taking the log of the objective, (1/ρ1) ln u + (1/ρ2) ln ϕ(u), the first-order condition is 0 = (1/ρ1)(1/u*) + (1/ρ2)(1/ϕ(u*))ϕ′(u*). This rearranges to:
ϕ(u*)/(u*ϕ′(u*)) = −ρ1/ρ2,
the same equation as that characterizing the limit of the alternating offers solution. So, as the time between offer and counteroffer goes to 0, the alternating offers bargaining solution converges to the Nash bargaining solution. In the case where ui is convex, the alternating offers bargaining model yields a unique solution, but the feasible set without randomization is not convex, so the connection with Nash bargaining is blurred.
17.6 Bargaining with Many Individuals
With three or more individuals, there are an infinite number of subgame perfect equilibrium allocations in the alternating offers bargaining problem. This is clear in the case where individuals propose in turn, but all responders to a proposal announce acceptance or rejection simultaneously. Furthermore, multiplicity of (subgame perfect) equilibria also holds when responders move in sequence. However, imposing stationarity in suitable circumstances restores uniqueness. A strategy is stationary if a player makes the same proposal when called on to propose, and uses the same acceptance or rejection rule when at the same position in the order of responders. In the two-person bargaining problem, if (a1, a2) and (b1, b2) are the utility allocations arising when 1 and 2 respectively are first to propose, then when 1 proposes, 2 must obtain a2 = δ2b2 to induce 2 to accept. So a2 = δ2b2 or, equivalently, ϕ(a1) = δ2ϕ(b1). Similarly, when 2 is first to propose, b1 = δ1a1, so that ϕ(a1) = δ2ϕ(δ1a1). These equations determine a1 = u* and b1 = δ1u* and full division of the pie,
with 1 achieving utility u* and 2 achieving utility ϕ(u*). In the linear case:
u* = (1 − δ2)/(1 − δ1δ2).
A similar computation can be applied in the n-person case. Consider the case where n = 3, with proposals rotating in the order 1, 2, 3. Using the same reasoning as above, let x, y, and z denote the utility allocations when 1, 2, and 3 respectively is the first proposer. Stationarity requires that each responder obtain the discounted value of the payoff received when the next proposer moves: when 1 proposes, x2 = δ2y2 and x3 = δ3y3; when 2 proposes, y1 = δ1z1 and y3 = δ3z3; when 3 proposes, z1 = δ1x1 and z2 = δ2x2. In the linear case, ui(x) = x, the proposer obtains the remainder of the pie, and substituting the responder conditions gives three equations in (x1, y2, z3):
x1 + δ2y2 + δ3²z3 = 1, δ1²x1 + y2 + δ3z3 = 1, δ1x1 + δ2²y2 + z3 = 1.
So, these equations determine (x1, y2, z3) and hence the full division. Thus, when the players share a common discount factor δ, the pie division is
(x1, x2, x3) = (1/(1 + δ + δ²))(1, δ, δ²),
with the first proposer obtaining the largest share.
In the case where δ3 = 0, player 3 gets a share of 0, but still has impact on the division between 1 and 2 because of delay. As δ3 → 0, the division approaches:
(x1, x2, x3) = ((1 − δ2)/(1 − δ1²δ2), δ2(1 − δ1²)/(1 − δ1²δ2), 0),
so, as δ3 → 0, the division resembles the case where player 3 is not involved in the bargaining but the delay of one period between an offer from 2 and counteroffer from 1 shows up in the pie allocation: player 1's share is the two-person share (1 − δ2)/(1 − δ1δ2) with δ1 replaced by δ1².
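The stationarity conditions form a small linear system that is easy to solve numerically. The sketch below (Python) solves the three equations above for (x1, y2, z3), reports the division when 1 is the first proposer, and checks the δ3 → 0 limit:

import numpy as np

def division(d1, d2, d3):
    # x1 + d2*y2 + d3^2*z3 = 1; d1^2*x1 + y2 + d3*z3 = 1; d1*x1 + d2^2*y2 + z3 = 1.
    M = np.array([[1.0,   d2,    d3**2],
                  [d1**2, 1.0,   d3],
                  [d1,    d2**2, 1.0]])
    x1, y2, z3 = np.linalg.solve(M, np.ones(3))
    return x1, d2 * y2, d3**2 * z3            # shares when 1 proposes first

print(division(0.9, 0.9, 0.9))                # common delta: (1, d, d^2)/(1 + d + d^2)
print(division(0.9, 0.8, 1e-9))               # near ((1 - d2)/(1 - d1^2 d2), ..., 0)
print((1 - 0.8) / (1 - 0.9**2 * 0.8))         # 0.5682...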
Bibliography
Kalai, E. (1985). “Solutions to the Bargaining Problem,” in Leonid Hurwicz, David Schmeidler, and Hugo Sonnenschein (eds.), Social Goals and Social Organization. Cambridge University Press, Cambridge, UK.
Kalai, E. and Smorodinsky, M. (1975). “Other Solutions to Nash's Bargaining Problem,” Econometrica, 43, 513–518.
Nash, J. (1953). “Two-Person Cooperative Games,” Econometrica, 21, 128–140.
Rubinstein, A. (1982). “Perfect Equilibrium in a Bargaining Model,” Econometrica, 50, 97–109.
Stahl, I. (1972). Bargaining Theory. Stockholm: EFI.
Sutton, J. (1986). “Non-Cooperative Bargaining Theory: An Introduction,” Review of Economic Studies, 53, 709–724.
18 Cooperative Outcomes
18.1 Introduction
Rather than focus on rules of interaction, cooperative game theory emphasizes the scope for gain or mutual benefit. One of the main features of cooperative game theory is the possibility for groups to form and reach (or break) agreements, and this ability of groups (coalitions) to form plays a central role in determining outcomes. Identifying outcomes that might plausibly arise in such settings leads to a variety of solution concepts for cooperative games. Here, a few of the major cooperative models are described—specifically, the core, von Neumann–Morgenstern stable sets, and the Shapley value. The notion of a core allocation (no subgroup can unilaterally improve their allocation) is a key idea and is discussed first, along with necessary and sufficient conditions for nonemptiness of the set of core allocations. It is common to define the coalition function without reference to underlying actions or preferences, but still it is important to show how it may be derived from underlying primitives. That connection is taken up in a discussion of nontransferable utility (NTU). Then, von Neumann–Morgenstern stability is introduced. This concept identifies stable outcomes in terms of internal and external stability and is illustrated with a discussion of core allocations. Finally, the Shapley value describes a point-valued solution concept. The framework for the cooperative game is given in Section 18.2. Section 18.3 discusses the core. The focus there is on circumstances under which the core of a game is nonempty. The key necessary and sufficient condition is balancedness. The core is nonempty if and only if the game is balanced, and this holds if and only if a certain linear program has a solution. So, nonemptiness of the core can be phrased in terms of the program, and the extreme points of the feasible region can be identified as the balancing vectors in the definition of balancedness. This provides the motivation for the notion of balancedness. Section 18.4 discusses NTU-games, and Section 18.4.1 considers the derivation of such games. The primary focus of this section is on motivation of the value
function in terms of an underlying environment. This highlights the environment and the role of effectivity in defining a coalitional function. Section 18.5 considers the von Neumann–Morgenstern solution and the concept of stability of outcomes for a coalition function. In Section 18.5.1 the stability concept is observed to have more general application. Section 18.6 concludes with a discussion of the Shapley value.
18.2 Framework
A set of n players, N = {1, …, n}, is given. Let P be the set of subsets of N. A characteristic function is a function from P to ℛ, v: P → ℛ. The function v measures coalition power: given S ⊆ N, v(S) is the payoff that coalition S can achieve. This defines a game with transferable utility (a TU-game) and the game (N, v) is called a game in coalition form. An allocation or imputation is a vector x ∈ ℛn, where xi is a payoff for player i with xi ≥ v(i), ∀i and ∑ixi ≤ v(N).20 These conditions are individual rationality (an individual gets at least the amount the individual can guarantee unilaterally) and feasibility (the group as a whole exhausts no more than is achievable from group action). Denote the set of allocations by I(v).
18.3 The Core
An allocation x is a core allocation if ∑i∈S xi ≥ v(S), ∀S ⊆ N. In words, x is a core allocation if there is no coalition which can unilaterally achieve a higher payoff. The set of core allocations of v is denoted C(v). Depending on v, there may be no core allocations: C(v) = ∅. For example, in a three-player majority game, N = {1, 2, 3}, v({i}) = 0, ∀i, v({i, j}) = 1, and v({1, 2, 3}) = 1. So, if x = (x1, x2, x3) is a candidate core allocation, then xi ≥ 0 and x1 + x2 + x3 = 1. But xi + xj ≥ v({i, j}) = 1, so it must be that x1 + x2 ≥ 1, x1 + x3 ≥ 1, and x2 + x3 ≥ 1. Adding these gives 2(x1 + x2 + x3) ≥ 3, or x1 + x2 + x3 ≥ 3/2 > 1. Hence, the core is empty.
18.3.1 Balancedness Under what conditions does the cooperative game have a nonempty core? The fundamental characterizing condition is balancedness which is necessary and sufficient for the core to be nonempty.
20 Sometimes this is defined by xi ≥ v(i), ∀i and ∑ixi = v(N).
Definition 18.1. A collection C = {Sα} is balanced if there are positive numbers yα such that for each i,
∑{α ∣ i∈Sα} yα = 1.
A balanced collection C is called minimal if no strict subset is balanced. The weight vector {yα} is called the balancing vector for C and yα a balancing coefficient. The main characterization of games with nonempty cores is the following. Theorem 18.1. A necessary and sufficient condition for the game v to have a nonempty core is that for every minimal balanced collection C = {Sα} with balancing vector {yα}, ∑α yα v(Sα) ≤ v(N). Motivation for the balancedness condition may be based on linear programming observations. Observe that the core of a game v is nonempty if and only if minx ∑ixi subject to ∑i∈S xi ≥ v(S) for all S ⊆ N has a solution x* = (x1*, …, xn*) with ∑i xi* ≤ v(N). Let {Sα}α∈A be an enumeration of the nonempty subsets of N. Consider the program:
max{yα ≥ 0} ∑α∈A yα v(Sα) subject to ∑{α ∣ i∈Sα} yα = 1, for each i ∈ N.
The dual of this program is:
minx ∑i∈N xi subject to ∑i∈Sα xi ≥ v(Sα), for each α ∈ A.
One of the fundamental theorems of linear programming is that if either (1) the primal has an optimal solution or (2) there are feasible solutions to both primal and dual, then there are optimal solutions to both primal and dual and the objectives have the same value. The primal program here has a solution and hence so does the dual—and the objective values are equal. So, the core is nonempty if and only if this primal program has a solution y*α with ∑αy*αv(Sα) ≤ v(N).21
21 Aside on duality: for a linear programming problem, the correspondence between primal and dual constraints and variables takes the following form. Observation: If a primal constraint is an equality, the sign of the corresponding dual variable is unconstrained (∑j aij rj = ci ⇒ ti unconstrained). If the sign of a primal variable is restricted ((1) ri ≥ 0 or (2) ri ≤ 0), and the dual is a minimization problem, the corresponding constraint in the dual is (1) ∑j aij tj ≥ bi or (2) ∑j aij tj ≤ bi. When the dual is a maximization problem these are instead (1) ∑j aij tj ≤ bi or (2) ∑j aij tj ≥ bi.
For example, to illustrate the program, consider the majority game above. Writing vS for v(S) to simplify notation, the primal objective is:
max y1v1 + y2v2 + y3v3 + y12v12 + y13v13 + y23v23 + y123v123.
Since v({i}) = 0, v({i, j}) = 1 for any i, j pair, and v({1, 2, 3}) = 1, the primal program is: max y12 + y13 + y23 + y123 subject to
(18.1) y1 + y12 + y13 + y123 = 1
(18.2) y2 + y12 + y23 + y123 = 1
(18.3) y3 + y13 + y23 + y123 = 1,
with all yα ≥ 0.
The solution is y12 = y13 = y23 = ½, and all other variables equal to 0. The objective value is 3/2—the same as obtained in the dual by direct computation. Since the solution can be found at the extreme points of the feasible region, a sufficient condition for ∑αyαv(Sα) ≤ v(N) on the feasible region is that this holds at the extreme points of the feasible region. These extreme points coincide with the balancing vectors of the minimal balanced collections. Summarizing these observations: first, if in the primal program the objective value is no larger than v(N) at the extreme points of the feasible region, then it is no larger than v(N) at any point in the feasible region. Second, the extreme points of the feasible region are the balancing vectors of the minimal balanced collections. The following definitions will be useful later.
Definition 18.2
• A coalitional function v is said to be monotonic if v(S) ≤ v(T), for S ⊆ T. Enlarging a coalition makes it stronger.
• A coalitional function v is said to be superadditive if v(S) + v(T) ≤ v(S ∪ T), for S, T ⊆ N and S ∩ T = ∅. The interpretation is that two groups, S and T, can do better by pooling resources.
• A coalitional function v is said to be supermodular if v(S) + v(T) ≤ v(S ∪ T) + v(S ∩ T), for all S, T ⊆ N. If the coalitional function is supermodular, the game is said to be a convex game.
Note that supermodularity implies superadditivity. It turns out that supermodularity is sufficient (but not necessary) for nonemptiness of the core, but superadditivity is not sufficient—the majority game is superadditive but has an empty core.
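The linear programming characterization is straightforward to check computationally. The sketch below (Python, using scipy) applies the minimization program above to the three-player majority game: it minimizes ∑ixi subject to the coalition constraints and compares the minimum with v(N):

from itertools import combinations
from scipy.optimize import linprog

n = 3
# Majority game: worth 1 for coalitions of size two or more, 0 for singletons.
v = {frozenset(S): (1.0 if len(S) >= 2 else 0.0)
     for k in range(1, n + 1) for S in combinations(range(n), k)}

coalitions = list(v)
A_ub = [[-1.0 if i in S else 0.0 for i in range(n)] for S in coalitions]
b_ub = [-v[S] for S in coalitions]            # encodes sum_{i in S} x_i >= v(S)

res = linprog(c=[1.0] * n, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * n)
print(res.fun)                                 # 1.5
print("core nonempty:", res.fun <= v[frozenset(range(n))] + 1e-9)   # False

The minimum, 3/2, exceeds v(N) = 1, confirming that the core of the majority game is empty.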
18.4 Nontransferable Utility
The discussion above begins with a function v and from there proceeds to model outcomes. Where does the function v come from? This question is addressed in the framework of NTU-games, which are discussed next. An NTU-game is a pair (N, V) where V: P → ∪S⊆N ℛS (V(S) ⊆ ℛS; V(S) is the set of payoffs that members of S can guarantee for themselves) with: (1) V(S) a closed convex nonempty subset of ℛS; (2) {x ∈ V(S) ∣ xi ≥ vi, ∀i ∈ S} bounded in ℛS, where vi = max{yi ∣ yi ∈ V({i})} < ∞. An imputation or allocation is a point x ∈ V(N). The core of V is the set of allocations, x, such that if y ∈ ℛS, yi > xi, ∀i ∈ S, then y ∉ V(S). Write C(N, V) to denote the set of core allocations. The following discussion shows how such games are derived from standard strategic form games. As with TU-games, nonemptiness of the core can be formulated in terms of a balancedness condition (Scarf 1967).
18.4.1 Derivation of the coalition function
The following discussion shows how an NTU coalitional function may be derived from an n-person environment where each individual has an action space and preference ordering defined on action profiles. Let G = (N, {Ci}i, {ui}i) be a strategic form game, where Ci is the choice set of i and ui the payoff function. Write CS = ×i∈S Ci and C = ×i∈N Ci. Let Δ(CS) be the set of probability distributions on CS. Given σS ∈ Δ(CS) and σN\S ∈ Δ(CN\S), let ui(σS, σN\S) denote the expected payoff to i under the product distribution (σS, σN\S).
Say that x ∈ ℛS is assurable in G for S if and only if ∃ σS ∈ Δ(CS) such that for all σN\S ∈ Δ(CN\S), ui(σS, σN\S) ≥ xi, ∀i ∈ S. The α-coalition function is defined:
Vα(S) = {x ∈ ℛS ∣ x is assurable in G for S}.
Say that x ∈ ℛS is unpreventable in G for S if and only if for each σN\S ∈ Δ(CN\S), ∃ σS ∈ Δ(CS) such that ui(σS, σN\S) ≥ xi, ∀i ∈ S. The β-coalition function is defined:
Vβ(S) = {x ∈ ℛS ∣ x is unpreventable in G for S}.
Given x ∈ ℛN and coalition S, write (xS, xN\S) to identify the payoffs to coalition and noncoalition members. Definition 18.3. An NTU-game, V, is superadditive if given S, T ⊂ N with S ∩ T = ∅, xS ∈ V(S) and xT ∈ V(T) imply (xS, xT) ∈ V(S ∪ T). It turns out that both of these coalitional functions are superadditive.
Theorem 18.2. Both Vα and Vβ are superadditive. These are considered in turn. Vα is superadditive. Let xS ∈ Vα(S), xT ∈ Vα(T) with S ∩ T = ∅. Then (a) ∃ σS* ∈ Δ(CS) such that for all σN\S ∈ Δ(CN\S), ui(σS*, σN\S) ≥ xi, i ∈ S, and (b) ∃ σT* ∈ Δ(CT) such that for all σN\T ∈ Δ(CN\T), ui(σT*, σN\T) ≥ xi, i ∈ T. In (a), consider those σN\S restricted to σT* on T, so σN\S = (σT*, σN\{S∪T}); and in (b) consider those σN\T restricted to σS* on S, so that σN\T = (σS*, σN\{S∪T}). Then, for any σN\{S∪T}, ui(σS*, σT*, σN\{S∪T}) ≥ xi for i ∈ S, and ui(σS*, σT*, σN\{S∪T}) ≥ xi for i ∈ T. Thus, (xS, xT) is assurable for S ∪ T: (xS, xT) ∈ Vα(S ∪ T). Vβ is superadditive. Let xS ∈ Vβ(S), xT ∈ Vβ(T) with S ∩ T = ∅. Fix σN\{S∪T} and define correspondences ϕS and ϕT as follows:
ϕS(σT) = {σS ∈ Δ(CS) ∣ ui(σS, σT, σN\{S∪T}) ≥ xi, ∀i ∈ S}, ϕT(σS) = {σT ∈ Δ(CT) ∣ ui(σS, σT, σN\{S∪T}) ≥ xi, ∀i ∈ T}.
These correspondences are convex valued (since expected utility is linear in the mixture). If the ui functions are upper-semicontinuous, then ϕS and ϕT are upper-hemicontinuous correspondences, and so ϕS∪T = ϕS × ϕT is a convex valued upper-hemicontinuous correspondence and so has a fixed point (σS*, σT*). Thus, given σN\{S∪T}, this gives at least xS to S and at least xT to T.
18.5 von Neumann–Morgenstern Solutions and Stability
Consider an allocation x = (x1, …, xn). Recall that in the discussion of the core an allocation was “defeated” if some coalition, S, could improve their payoff: ∃ y, a payoff vector such that yi > xi, i ∈ S and ∑i∈S yi ≤ v(S). In this case, write y ≻S x. One can order allocations along these lines: write y ≻ x if ∃ S with y ≻S x. With this notation, x is a core allocation if and only if there is no payoff vector y with y ≻ x. Because, if x is a core allocation and y ≻ x, then for some coalition S, v(S) ≥ ∑i∈S yi > ∑i∈S xi, contradicting the assumption that x is in the core. Conversely, if x is not in the core there is S with v(S) > ∑i∈S xi. For i ∈ S, let yi = xi + ((v(S) − ∑i∈S xi)/#S) > xi; and for i ∉ S, let yi = 0. Then y ≻ x. (Although y may not be an imputation.) Here #S or |S| denotes the number of elements of S. Theorem 18.3. If v is superadditive (∀S, T, S ∩ T = ∅, v(S ∪ T) ≥ v(S) + v(T)), an imputation x is in the core if and only if it is not dominated by any other imputation. Proof. If x is an imputation not dominated by any imputation y, then from the definition of the core, x is a core allocation. If x is an imputation not in the core
then for some S, ∑i∈S xi < v(S). Define y:
yi = xi + (v(S) − ∑j∈S xj)/|S|, i ∈ S; yi = v(i) + (v(N) − v(S) − ∑j∉S v(j))/(n − |S|), i ∉ S.
So, ∑i yi = v(N). For i ∈ S, yi > xi ≥ v(i); and superadditivity implies that v(N) − [v(S) + ∑i∉S v(i)] ≥ 0, so that for i ∉ S, yi ≥ v(i). In sum, if x is in the core it cannot be dominated by another imputation; if it is not in the core it must be dominated by another imputation. This suggests a definition. Definition 18.4. A set of imputations, Q, is a (von Neumann–Morgenstern) solution if: 1. x, y ∈ Q implies x does not dominate y, and 2. if x ∉ Q, then there is a y ∈ Q such that y dominates x. Write I(v) for the set of imputations in the game v and D(Q) for the set of imputations dominated by some point of Q: D(Q) = {x ∈ I(v) ∣ ∃ y ∈ Q, y ≻ x}. The first condition implies that any x ∈ Q is in the set of points not dominated by any point in Q: Q ⊆ [I(v) \ D(Q)]. The second condition implies that if x is an imputation not dominated by a point in Q, it must be in Q: [I(v) \ D(Q)] ⊆ Q. So, [I(v) \ D(Q)] ⊆ Q ⊆ [I(v) \ D(Q)], or
Q = I(v) \ D(Q).
As with the core, a stable set may not exist. When a game has a stable set, it may not be the only one: there may be multiple stable sets. In the case where the game is convex (supermodular), the core is the unique stable set. And in this case, the Shapley value (discussed below) is in the core.
18.5.1 Stability The definition of a solution set Q involves comparison of members within Q (internal stability) and ranking of points not in Q with some point in Q (external stability). It is possible to apply this idea in a much broader setting—beyond orderings on payoffs. For example, let Xi be the set of strategies of player i, X = ×iXi and ui: X → R the utility of i. This defines a strategic form game, G. Say that x ≻iy if x−i = y−i and ui(x) > ui(y). Let x ≻ y if and only if there is some i with x ≻iy. Let NE be the set of Nash equilibria of G. Observe that (1) if x,y ∈ NE then x does not dominate y and (2) if y ∉ NE, there is some x with x ≻ y. Thus, NE is both internally and externally stable.
18.6 The Shapley Value
Given the set N of n players, consider a partition into two sets S and N \ S. Suppose that i ∈ S, and |S| = k, where |T| means the number of elements of the set T. Given this, consider all those permutations or arrangements of the elements of the set S, such that i is the last element. There are (k − 1)! such arrangements. For example, if S = {1, 2, 3} and i = 2, then the possible arrangements with 2 listed last are (1, 3, 2) and (3, 1, 2), which is (3 − 1)! = 2! = 2. Viewed as arrival times, there are (k − 1)! ways in which the other members of S could have arrived before i. Similarly, there are (n − k)! orderings, or ways in which the remaining n − k players can be arranged. Thus, the total number of arrangements with the k − 1 members of S \ {i} appearing before i and the n − k members of N \ S appearing after i is (k − 1)!(n − k)!. (In the case where S = {i} this gives 0!(n − 1)! = (n − 1)!, and when S = N, (n − 1)!0! = (n − 1)!.) In total, there are n! orderings of players' arrival times, so that (k − 1)!(n − k)!/n! is the fraction of all arrival orders where the players in S \ {i} arrive before i and the players in N \ [S ∪ {i}] arrive after i. Thus, the probability of i arriving after the group S \ {i} has arrived and before any members of N \ S arrive is (|S| − 1)!(|N| − |S|)!/n!, when all orders of arrival are equally likely. The arrival of i to the group S \ {i} increases the value of the group by [v(S) − v(S \ {i})]. The Shapley value of v is ϕ(v) = (ϕ1(v), …, ϕn(v)), where ϕi(v) gives the average over all coalitions of i's contribution to each coalition:
ϕi(v) = ∑{S⊆N ∣ i∈S} ((|S| − 1)!(n − |S|)!/n!)[v(S) − v(S \ {i})].
Alternatively, this may be written22
ϕi(v) = ∑{T⊆N ∣ i∉T} ((|T|)!(n − |T| − 1)!/n!)[v(T ∪ {i}) − v(T)].
Finally, a useful formulation can be given directly in terms of permutations. Let π = (π1, …, πn) be a permutation of the set of players: if player i occupies position k, then πk = i. Let Π be the entire set of permutations on N, so |Π| = n!. Given any permutation π, let S(π; < i) be the set of players in the list strictly prior to i. If πk = i, then: S(π; < i) = {π1, …, πk−1}.
22 Given i, any set S with i ∈ S uniquely identifies T = S \ {i}, with |S| = |T| + 1, and so (|S| − 1)!(n − |S|)!/n! = |T|!(n − |T| − 1)!/n!.
Then
ϕi(v) = (1/n!) ∑π∈Π [v(S(π; < i) ∪ {i}) − v(S(π; < i))].
From these expressions a few conclusions can be drawn. If v is superadditive the Shapley value is an imputation, since:
v(S(π; < i) ∪ {i}) − v(S(π; < i)) ≥ v({i}) for each π, so that ϕi(v) ≥ v({i}).
The Shapley value gives an efficient payoff: ∑i∈N ϕi(v) = v(N). To see this observe that
∑i∈N ϕi(v) = (1/n!) ∑π∈Π ∑i∈N [v(S(π; < i) ∪ {i}) − v(S(π; < i))].
Note that if i = πk, then S(π; < i) = {π1, …, πk−1} and S(π; < i) ∪ {i} = {π1, …, πk−1, πk}. So, for each π the inner sum telescopes:
∑i∈N [v(S(π; < i) ∪ {i}) − v(S(π; < i))] = v(N) − v(∅) = v(N), and therefore ∑i∈N ϕi(v) = v(N).
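The permutation formula translates directly into code. The sketch below (Python) averages marginal contributions over all arrival orders and applies the computation to the three-player majority game:

from itertools import combinations, permutations
from math import factorial

def shapley(n, v):
    # v maps frozensets of {0, ..., n-1} to worth, with v(empty set) = 0.
    phi = [0.0] * n
    for order in permutations(range(n)):
        S = frozenset()
        for i in order:
            phi[i] += v[S | {i}] - v[S]        # marginal contribution of i
            S = S | {i}
    return [p / factorial(n) for p in phi]

n = 3
v = {frozenset(S): (1.0 if len(S) >= 2 else 0.0)
     for k in range(n + 1) for S in combinations(range(n), k)}
print(shapley(n, v))                           # [1/3, 1/3, 1/3]: efficient and symmetric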
In fact, efficiency in conjunction with other axioms provides an axiomatic basis for the Shapley value. Let V be the set of coalitional games: V = {v ∣ v: 2N → R, with v(∅) = 0}. Theorem 18.4. Suppose that ϕ satisfies, for all v ∈ V and each i:
1. Symmetry: ϕi(v) = ϕj(v) if for all S ⊂ N not containing i or j, v(S ∪ {i}) = v(S ∪ {j}).
2. Dummy player: ϕi(v) = v({i}) if for all S ⊆ N, i ∉ S, v(S ∪ {i}) − v(S) = v({i}).
3. Efficiency: ∑i∈N ϕi(v) = v(N).
4. Additivity: ϕi(v + w) = ϕi(v) + ϕi(w).
Then ϕ is the Shapley value.
Proof. First, the Shapley value satisfies the four axioms. For T ⊆ N let μ(T) = ((|T|)!(|N| − |T| − 1)!)/n!, so that ϕi(v) = ∑{T⊆N ∣ i∉T} μ(T)[v(T ∪ {i}) − v(T)]. If for all T, i ∉ T, v(T ∪ {i}) − v(T) = v({i}), then
ϕi(v) = ∑{T⊆N ∣ i∉T} μ(T) v({i}) = v({i}), since ∑{T ∣ i∉T} μ(T) = 1,
and the dummy player axiom is satisfied. Additivity follows directly from the definition, and efficiency has been confirmed above. To see that symmetry holds,
if i, j ∉ T implies v(T ∪ {i}) = v(T ∪ {j}), then directly:
ϕi(v) = ∑{T ∣ i,j∉T} μ(T)[v(T ∪ {i}) − v(T)] + ∑{T ∣ i∉T, j∈T} μ(T)[v(T ∪ {i}) − v(T)],
and the first sum is unchanged when the roles of i and j are exchanged, by hypothesis. If j ∈ T and i ∉ T, let T* = (T \ {j}) so that
v(T ∪ {i}) − v(T) = v(T* ∪ {i} ∪ {j}) − v(T* ∪ {j}),
where i, j ∉ T* (and so v(T* ∪ {j}) = v(T* ∪ {i}), and μ(T) = μ(T* ∪ {i}) since |T| = |T* ∪ {i}|). Thus,
v(T ∪ {i}) − v(T) = v((T* ∪ {i}) ∪ {j}) − v(T* ∪ {i}).
And so
∑{T ∣ i∉T, j∈T} μ(T)[v(T ∪ {i}) − v(T)] = ∑{T′ ∣ j∉T′, i∈T′} μ(T′)[v(T′ ∪ {j}) − v(T′)],
where T′ = T* ∪ {i} ranges over the coalitions containing i but not j. Combining, ϕi(v) = ϕj(v).
Therefore, ϕ satisfies the four axioms. The proof that ϕ is the only function on V satisfying these axioms proceeds as follows. Viewing V as a subset of ℛ^(2^n − 1), a restricted class of coalition functions, V* = {vS}S⊂N, S≠∅, is shown to span the space of all functions V: if v ∈ V, there are numbers {ρS}S⊆N, S≠∅ such that v = ∑ ρS vS, vS ∈ V*. Given vS (or δvS), any value function satisfying the four axioms is uniquely determined: if ϕ and φ satisfy the four axioms on V*, then ϕ(δvS) = δϕ(vS) = δφ(vS) = φ(δvS). Consequently, using additivity, ϕ(v) = ∑S ρS ϕ(vS) = ∑S ρS φ(vS) = φ(v), so that ϕ and φ are equal on V. These details are given next. For any S ⊂ N, S ≠ ∅, define the coalition function vS:
vS(T) = 1 if S ⊆ T, and vS(T) = 0 otherwise.
Observe that for any S, vS is a 2^n − 1 dimensional vector. The collection {vS}S≠∅ is linearly independent. To see this, let there be nj coalitions of size j,
and let (S1j, …, Snjj) be an enumeration of these coalitions, arranged as a column vector, with the corresponding vS's arranged in the same order. Form a matrix bordered by the S's and vS's, placing a 1 in any position where the corresponding T and vS satisfy S ⊆ T, and 0 otherwise. With coalitions ordered by size, this gives a lower triangular block matrix whose diagonal blocks are identity matrices of dimension nj (for |S| = |T| = j, S ⊆ T if and only if S = T). Observe that it has full rank, so that the collection of vectors {vS} is linearly independent. To illustrate, consider N = {1, 2, 3}, with coalitions ordered {1}, {2}, {3}, {12}, {13}, {23}, {123}:

        v{1} v{2} v{3} v{12} v{13} v{23} v{123}
{1}      1    0    0    0     0     0     0
{2}      0    1    0    0     0     0     0
{3}      0    0    1    0     0     0     0
{12}     1    1    0    1     0     0     0
{13}     1    0    1    0     1     0     0
{23}     0    1    1    0     0     1     0
{123}    1    1    1    1     1     1     1

For example, the function v{12} is given by the fourth column. Consequently, given a coalitional function v there are numbers ρS, for S ⊆ N, S ≠ ∅, such that:
v = ∑S ρS vS.
Any value ϕ, using additivity, satisfies ϕ(v) = ∑ ρS ϕ(vS). Therefore, if the value is uniquely determined by the axioms on the collection of coalitional functions {vS}, it is uniquely determined on the entire family of coalitional functions. The remainder of the discussion confirms uniqueness on each vS. Consider the game vS and suppose that i ∈ N \ S. If S ⊆ T, then S ⊆ T ∪ {i}, so vS(T ∪ {i}) − vS(T) = 1 − 1 = 0 = vS({i}). And if S ⊄ T, then since i ∉ S, S ⊄ T ∪ {i}, so vS(T ∪ {i}) − vS(T) = 0 = vS({i}). Consequently, if i ∉ S, ϕi(vS) = vS({i}) = 0, by the dummy player axiom.
Next, consider i ∈ S. From the symmetry axiom, if for all T such that i, j ∉ T, vS(T ∪ {i}) = vS(T ∪ {j}), then ϕi(vS) = ϕj(vS). If j ∈ S, then vS(T ∪ {i}) = vS(T ∪ {j}) = 0 for all T with i, j ∉ T. So, i, j ∈ S implies that ϕi(vS) = ϕj(vS). Efficiency implies that ∑i∈N ϕi(vS) = ∑i∈S ϕi(vS) = vS(N) = 1 and symmetry implies that |S| ϕi(vS) = 1, so
ϕi(vS) = 1/|S|, for i ∈ S.
The same calculations (with δ · vS) show that ϕi(δ · vS) = δ · ϕi(vS). Thus, if ϕ and φ are two value functions satisfying the axioms, then linearity implies
ϕ(v) = ∑S ρS ϕ(vS) = ∑S ρS φ(vS) = φ(v),
since for each S, ϕ(vS) = φ(vS). Different axiomatizations are possible (e.g. Young 1985 replaces the additivity axiom with a monotonicity axiom).
Bibliography
Aumann, R. J. (1975). “Lectures on Game Theory,” Mimeo, Stanford University.
Aumann, R. J. and Shapley, L. S. (1974). Values of Non-Atomic Games. Princeton University Press, Princeton, NJ.
Driessen, T. (1988). Cooperative Games, Solutions and Applications. Kluwer Academic Publishers.
Greenberg, J. (1990). Theory of Social Situations. Cambridge University Press, Cambridge, UK.
Lucas, W. F. (1968). “A Game with no Solution,” Bulletin of the American Mathematical Society, 74, 237–239.
Nash, S. G. and Sofer, A. (1996). Linear and Nonlinear Programming. McGraw-Hill.
Scarf, H. (1967). “The Core of an n-Person Game,” Econometrica, 35, 50–69.
Shapley, L. (1953). “A Value for n-Person Games,” in H. W. Kuhn and A. W. Tucker (eds.), Contributions to the Theory of Games II. Princeton University Press, Princeton, NJ.
von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behavior. Princeton University Press.
Young, P. (1985). “Monotonic Solutions of Cooperative Games,” International Journal of Game Theory, 14, 65–72.
19 Anonymous Games
19.1 Introduction
An anonymous game is one with many players, where each player is negligible, viewed as a fraction of the total set of players. In this environment, any individual player has no impact on any aggregate variable in the game. One advantage of this formulation is that the decisionmaker need not consider the ramifications of its own action on the state of the system and hence the behavior of others. This simplifies the analysis of individual decisions, particularly in dynamic models where the state of the system at each point in time is some aggregate of individual actions and parameters. In Section 19.2 large anonymous games are described. Because there are technical subtleties, some care is required in specifying the model. In the formulation here, a strategy is a joint distribution on actions and players. So, the conditional distribution on actions given a player can be identified as a strategy for that player, and under standard assumptions regarding the environment, it is shown that equilibria exist. An alternative formulation identifies strategies as functions from players to actions. This is discussed in Section 19.3. Anonymous games are easily formulated in a dynamic context, and in that context equilibria and stationary equilibria can be discussed—this is done in Section 19.4. For a special class of games, equilibrium can be identified with the optimization of a function representing economic surplus. Section 19.5 presents an example of a single market illustrating this “social planner” formulation. Finally, Section 19.6 introduces the property of no-aggregate uncertainty which is routinely used in the literature.
19.2 Formulation of Anonymous Games
An anonymous game is characterized by an infinite number of players Λ (a continuum of players), a distribution μ on Λ, an action space, A, common to all players, and a payoff function, u, defined on actions, characteristics, and strategies. There are two ways to formulate a strategy—as a function from individuals to actions (x: Λ → A), or as a joint distribution on players and actions (τ, a measure on A × Λ). Both of these are discussed below, but primary emphasis is on the second formulation, where the formulation is called an anonymous game. In an anonymous game, given a joint distribution τ on the player set Λ and action space A, the payoff to player α taking action a is u(a, α, τ). When the mass of players is fixed and given by some measure μ on Λ, the distribution τ is required to have a marginal distribution on Λ equal to μ. Such a measure, τ, is an equilibrium if for almost all α ∈ Λ, ∫A u(a, α, τ) τ(da ∣ α) ≥ maxa u(a, α, τ). (An alternative formulation defines the parameter space to be the set of continuous utility functions on (a, τ), in which case ∫A u(a, τ) τ(da ∣ u) ≥ maxa u(a, τ), for almost all u. These formulations will be used interchangeably.) Fix a measure μ, the aggregate distribution on the set of players, Λ.23 This defines the population. For a given set Q, let ℳ(Q) denote the set of measures on Q, and P(Q) denote the set of probability measures on Q. Take μ to be a probability measure so that the mass of players is normalized to 1. In this context: Definition 19.1. A distributional strategy, τ, is a probability measure on A × Λ with marginal distribution μ on Λ. Conditional on α, τ(· ∣ α) defines α's “mixed” strategy on A. Take the utility of player α to depend on own action a and the action distribution: u(a, α, τ), and assume that u is jointly continuous in (a, α, τ). (Take A to be a compact metric space and let the topology on ℳ(A × Λ) be the weak* topology: τk → τ if and only if ∫ f(a, α) τk(da × dα) → ∫ f(a, α) τ(da × dα) for all bounded continuous f.) Write τΛ to denote the marginal of τ on Λ. The following section discusses equilibrium in this environment.
19.2.1 Equilibrium
Denote an anonymous game G = (Λ, A, u, μ). Equilibrium in an anonymous game is straightforward to define: Definition 19.2. A strategy τ* is an equilibrium if τ* has marginal distribution μ on Λ and
∫ u(a, α, τ*) τ*(da × dα) ≥ ∫ u(a, α, τ*) τ̂(da × dα),
for all τ̂ with marginal μ on Λ.
23 Although unnecessary for most of the discussion, assume that Λ is compact.
This definition permits randomization: for some or even all α, it may be that τ*(· ∣ α) is a non-degenerate distribution. A conventional definition of a pure strategy is a function associating an action a to each player α, h: Λ → A. So, if there is some measurable function h such that τ*({(a, α) ∣ a = h(α)}) = 1, then τ*({h(α)} ∣ α) = 1 for almost all α, and h is a pure strategy equilibrium. For this environment, equilibrium (in mixed strategies) exists. Theorem 19.1. If A is compact and u continuous on Λ × A × ℳ(Λ × A), then the game G = (Λ, A, u, μ) has an equilibrium. Proof. Since the set of players is fixed (by μ), the set of distributional strategies in the game is given by the set S* = {τ ∈ P(A × Λ) ∣ τΛ = μ}, a compact metric space since A is a compact metric space. The best response mapping is straightforward to define. For τ ∈ S* define:
B(τ) = {τ′ ∈ S* ∣ ∫ u(a, α, τ) dτ′ ≥ ∫ u(a, α, τ) dτ″, ∀ τ″ ∈ S*}.
The correspondence B is convex valued and upper-hemicontinuous. Convexity follows from the definition of B. To see that B is upper-hemicontinuous, let τk → τ and τ̂k ∈ B(τk) with τ̂k → τ̂. Continuity of u on the compact metric space A × Λ × S* implies uniform continuity, so given ε > 0, for k large, |u(a, α, τk) − u(a, α, τ)| ≤ ε, ∀a ∈ A, α ∈ Λ, and hence |∫ u(a, α, τk) dσ − ∫ u(a, α, τ) dσ| ≤ ε for every σ ∈ S*. Since τ̂k ∈ B(τk), ∫ u(a, α, τk) dτ̂k ≥ ∫ u(a, α, τk) dτ′, ∀τ′ ∈ S*. Given ε > 0, for k large, ∫ u(a, α, τ) dτ̂k ≥ ∫ u(a, α, τk) dτ̂k − ε, and ∫ u(a, α, τk) dτ′ ≥ ∫ u(a, α, τ) dτ′ − ε, so that ∫ u(a, α, τ) dτ̂k ≥ ∫ u(a, α, τ) dτ′ − 2ε, ∀τ′ ∈ S*. Taking limits, ∫ u(a, α, τ) dτ̂ ≥ ∫ u(a, α, τ) dτ′ − 2ε. Since ε is arbitrary, τ̂ ∈ B(τ). Hence B is upper-hemicontinuous and, from Glicksberg's theorem, has a fixed point, τ* ∈ B(τ*). This completes the proof.
A slight variation in the proof of existence of equilibrium is the following. Let Q be a separable metric space and τ a measure on Q. Then, there is a closed subset, R, such that τ(R) = 1. If R′ is any other closed set with τ(R′) = 1, then R ⊆ R′. R is called the support of the measure τ, written supp τ. The support can be viewed as a correspondence from the set of measures on Q to Q, supp: P(Q) ↦ Q. One important property of the support correspondence is that it is lower-hemicontinuous: if q ∈ supp τ and τn → τ, then ∃ qn ∈ supp τn with qn → q. In the present context, let Qτ = {(a, α) ∣ u(a, α, τ) ≥ maxa′ u(a′, α, τ)} and define
φ(τ) = {τ′ ∈ S* ∣ supp τ′ ⊆ Qτ}.
This correspondence is upper-hemicontinuous. To see this, let τn → τ, τn′ ∈ φ(τn), τn′ → τ′. Since τn′ ∈ φ(τn), supp τn′ ⊆ Qτn. Because the support correspondence is lower-hemicontinuous, if (a, α) ∈ supp τ′ then ∃ (an, αn) → (a, α) with (an, αn) ∈ supp τn′, so u(an, αn, τn) ≥ maxa′ u(a′, αn, τn), and then
(an, αn) → (a, α) implies (a, α) ∈ Qτ. This implies that supp τ′ ⊆ Qτ. Hence φ is upper-hemicontinuous and so has a fixed point τ*, which is an equilibrium.
19.2.2 An example
To illustrate, consider a market supplied by a continuum of firms with index α ∈ [0, 1]. Firms are distributed according to μ. Let τ be a distribution on quantities and characteristics with marginal μ on characteristics. The conditional distribution τ(· ∣ α) is the supply decision of firm α. Total output is Q(τ) = ∫ q τ(dq × dα). Let the demand function be pd(Q), so that market clearing requires a price p(τ) = pd(Q(τ)). Since Q(τ) is continuous in τ, provided pd is a continuous strictly decreasing function, the market clearing price varies continuously with τ. If the market price is p(τ), firm α maximizes p(τ)q − c(q, α), where c(q, α) is the cost function of firm α. Maximizing this gives q(α, p(τ)), and aggregate output is ∫α q(α, p(τ)) μ(dα). Then p* is an equilibrium price if p* = pd(∫α q(α, p*) μ(dα)). (In distributional strategy terms, let τ′(· ∣ α) be the distribution conditional on α that puts probability 1 on q(α, p(τ)), and let τ′ be the joint distribution on (q, α) determined by these conditionals and μ. (If the optimizer q(α, p(τ)) is not unique, any distribution, τ′(· ∣ α), on optimal choices is a best response.) This defines the best response mapping B(τ).)
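For a concrete version of this example, the sketch below (Python) assumes a cost function c(q, α) = q2/(2α), firms uniform on [0, 1], and inverse demand pd(Q) = 1 − Q, none of which are specified above, and finds the equilibrium price by bisection:

import numpy as np

alphas = np.linspace(1e-6, 1.0, 10001)        # grid of firm characteristics

def aggregate_supply(p):
    q = alphas * p                            # q(alpha, p) solves p = c_q(q, alpha) = q/alpha
    return float(np.mean(q))                  # approximates the integral over [0, 1]

def excess(p):
    return (1.0 - aggregate_supply(p)) - p    # pd(Q(p)) - p

lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if excess(mid) > 0:
        lo = mid
    else:
        hi = mid
print((lo + hi) / 2)                          # approximately 2/3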
19.2.3 Pure strategy equilibrium
A Nash equilibrium, τ, is symmetric if ∃ h: Λ → A measurable such that τ(Graph(h)) = 1. Thus, τ({h(α)} ∣ α) = 1, a.e. μ (or τ({(h(α), α) ∣ α ∈ Λ}) = 1) and h(α) can be identified as the pure strategy choice of player α. When there is a finite number of pure strategies, A is a finite set. Let the utility function depend on the action, a, characteristic, α, and distribution over actions, x, implied by τ: u(α, a, x). The following result gives conditions under which a pure strategy equilibrium exists. Theorem 19.2. If μ is atomless and A finite, then there is a symmetric equilibrium. Proof. To see this, let A = {a1, …, an} and let ei = (0, …, 0, 1, 0, …, 0)—1 in the ith position and 0's elsewhere. Define a correspondence Φ: supp μ × Δ(A) ↦ ℛn according to
Φ(α, x) = {ei ∣ u(α, ai, x) ≥ u(α, aj, x), ∀j}.
The correspondence Φ is nonempty and upper-hemicontinuous. Let
Φ*(x) = ∫Λ Φ(α, x) μ(dα).
The correspondence Φ* is nonempty, upper-hemicontinuous, and convex-valued (convexity follows from Lyapunov's theorem, since μ is atomless) and so has a fixed point x* ∈ Φ*(x*). So, there is a measurable selection g with g(α) ∈ Φ(α, x*), a.e. μ, and x* = ∫ g dμ. Since each g(α) is a vertex of the simplex, g defines a measurable function h: Λ → A with g(α) = eh(α). So, u(α, h(α), x*) ≥ u(α, a, x*), ∀a ∈ A, for μ-almost all α. This completes the proof.
19.3 Strategies as Functions
Rather than define strategies as distributional strategies, an alternative is to formulate strategies directly by analogy with the finite player case—associating a pure or mixed strategy to each player. Let the pure strategy set A be finite, #A = n, and let P = Δ(A), the set of probability distributions on A. A strategy is a measurable function from Λ to P, x = (x1, …, xn): Λ → P, where xi: Λ → [0, 1]. View xi as an element of L1(Λ, λ) and x = (xi) as an element of L1(Λ × {1, …, n}, λ). Take the strategy space, S, as the set of functions in L1(Λ × {1, …, n}, λ) mapping to P with the L1 weak topology. This is a compact convex subset of a locally convex linear topological space. Utility is given by a function u: Λ × S → Rn, where ui(t, x) gives the utility of player t when the strategy is x and t takes action i. Thus, the expected payoff to t is h(t, x) = ∑i xi(t) ui(t, x). Assume that u is continuous on S and, for fixed x ∈ S, u(·, x) is a measurable function. A fixed measure μ on Λ is given. Call the strategy x an equilibrium if for all p ∈ P, h(t, x) ≥ ∑i pi ui(t, x) for almost all t. With the formulation as given: Theorem 19.3. A non-atomic game has an equilibrium, x*. If u(t, x) depends only on ∫ x dμ, there is a pure strategy equilibrium, x* (for almost all t, xi*(t) ∈ {0, 1}).
19.4 Dynamic Anonymous Games
In a dynamic setting the characteristics of a player can change from period to period. For example, a firm's technology may evolve over time or its capital stock may change from period to period, but these are fixed in any given period. A player's payoff depends on its choice a ∈ A, its characteristic α ∈ Λ, and the aggregate distribution on actions and characteristics: ut(a, α, τt), where τt ∈ ℳ(A × Λ) and the t-subscript allows for time varying preferences. At time t, the available choices for player α are in the set At(α, τt), a continuous compact valued correspondence. The following discussion formulates a dynamic game and confirms that it has an equilibrium. Assume that ut is continuous in its arguments and uniformly bounded: |ut| ≤ K < ∞. To model the evolution of characteristics, α, let Pt(· ∣ α, a, τt) be a transition kernel: a player with characteristic α in period t taking action a at aggregate distribution τt draws a characteristic next period according to the distribution
Pt(· ∣ α, a, τt). While the individual faces uncertainty, at the aggregate level next period's aggregate distribution on characteristics is given by μt+1(·) = ∫ Pt(· ∣ α, a, τt) τt(da × dα). (There is no aggregate uncertainty.) Given τ = (τ1, τ2, …), there is (from the contraction mapping theorem) a collection of bounded continuous functions v = (v1, v2, …, vt, …) satisfying
vt(α, τ) = maxa∈At(α, τt) {ut(a, α, τt) + βt ∫ vt+1(α′, τ) Pt(dα′ ∣ α, a, τt)}.
(Let v′ = Tv be defined with T = {Tt} and Tt as given above. Then
|(Ttv)(α, τ) − (Ttw)(α, τ)| ≤ βt maxa |∫ [vt+1(α′, τ) − wt+1(α′, τ)] Pt(dα′ ∣ α, a, τt)|,
so that ‖(Tv)t − (Tw)t‖ ≤ βt ‖vt+1 − wt+1‖. (Observe that |maxa f(a) − maxa g(a)| ≤ maxa |f(a) − g(a)|.))
The function vt(α, τ) gives the present value of the payoff flow to player α when the sequence of aggregate distributions is given by τ = (τ1, τ2, …). When player α optimizes at time t, this generates a choice a = ht(α), say; and given the current distribution μt on characteristics, this generates a joint distribution, τt, on actions and characteristics, A × Λ, according to τt(X × Y) = μt({α ∈ Y ∣ ht(α) ∈ X}) for X ⊆ A and Y ⊆ Λ. And, at time t, given τt, next period's distribution over characteristics is determined via the transition kernel. In period t + 1 the distribution on characteristics is given by ∫A×Λ Pt(· ∣ α, a, τt) τt(da × dα), and this forces an intertemporal consistency on the distribution of characteristics over time. Definition 19.3. A collection τ = (τ1, τ2, …) is an equilibrium if: 1. τ1Λ = μ1, and τt+1Λ(·) = ∫ Pt(· ∣ α, a, τt) τt(da × dα); 2. for each t and μt-almost all α, τt(· ∣ α) assigns probability 1 to the set of solutions of maxa∈At(α, τt) {ut(a, α, τt) + βt ∫ vt+1(α′, τ) Pt(dα′ ∣ α, a, τt)}.
With this definition in place: Theorem 19.4. The dynamic anonymous game has an equilibrium in distributional strategies. Proof. For fixed τ, let C(τ) be the set of τ̂'s satisfying (1), and let B(τ) be the set of τ̂'s such that τ̂t(Btτ) = 1 for each t, where
Btτ = {(a, α) ∣ ut(a, α, τt) + βt ∫ vt+1(α′, τ) Pt(dα′ ∣ α, a, τt) ≥ maxa′∈At(α, τt) [ut(a′, α, τt) + βt ∫ vt+1(α′, τ) Pt(dα′ ∣ α, a′, τt)]}.
Note that both C and B are closed convex-valued upper-hemicontinuous correspondences with C(τ) ∩ B(τ) ≠ ∅. To see the latter property, note that Btτ is closed and hence measurable in A × Λ. The best response correspondence from Λ to A has graph Btτ and, since this is measurable, the best response correspondence has a measurable selection, ht: Λ → A, with (ht(α), α) ∈ Btτ for each t. Pick such a selection and define a measure τ̂ = (τ̂1, τ̂2, …) by τ̂t(X × Y) = μ̂t({α ∈ Y ∣ ht(α) ∈ X}), where μ̂1 = μ1 and μ̂t+1(·) = ∫ Pt(· ∣ α, a, τt) τt(da × dα). Then τ̂ is in C(τ) ∩ B(τ), and so C(τ) ∩ B(τ) ≠ ∅. A fixed point τ ∈ C(τ) ∩ B(τ) is an equilibrium, and existence follows as before. This completes the proof.
In a stationary equilibrium the same strategy is played each period (and the same distribution over characteristics persists from period to period). Such equilibria are attractive since they are easier to analyze and in appropriate circumstances may be identified with long-run behavior in a model where the dynamics have “settled down.”
Stationary equilibrium
Suppose that ut, Pt, βt, and At are all independent of t. In such circumstances it makes sense to consider the possibility that the same strategy might be used from period to period. Definition 19.4. A stationary equilibrium of the dynamic anonymous game is an equilibrium τ = (τ1, τ2, …) with τt = τt′, ∀t, t′. Starting from an arbitrary “initial condition” with given distribution on characteristics μ1, there can be no expectation that there is some stationary equilibrium with τt = τ and τ with marginal distribution μ1 on Λ. This is the case, for example, if there is no τ such that μ1 = ∫ P(· ∣ α, a, τ) τ(da × dα). Over time, the distribution over characteristics varies; in a stationary equilibrium this distribution is at a point such that the strategy is constant from period to period.
Theorem 19.5. Suppose that ut, Pt, βt, and At are all independent of t. Then the dynamic anonymous game has a stationary equilibrium. Proof. Restrict τ to be stationary, so that it can be identified with τ ∈ P(A × Λ). As in the previous discussion, given τ ∈ P(A × Λ), define v:
v(α, τ) = maxa∈A(α, τ) {u(a, α, τ) + β ∫ v(α′, τ) P(dα′ ∣ α, a, τ)}.
Define C(τ) = {τ′ ∣ τ′Λ(·) = ∫ P(· ∣ α, a, τ) τ(da × dα)}, and define B(τ) as before, using v. As before, C ∩ B is a nonempty convex valued upper-hemicontinuous correspondence and so has a fixed point.
19.5 Social Planner Formulations
In the models discussed above, the strategies are distributions over actions and characteristics. For some classes of games (in particular, partial equilibrium market models) the study of equilibrium can be simplified through the “social planner formulation.” In this, the sum of consumer and producer surplus is maximized, and it turns out that this yields the same outcome as occurs in a standard equilibrium. In competitive markets, equilibrium occurs where the supply and demand curves intersect. That also is the price–quantity pair that maximizes the sum of consumer and producer surplus. Thus, total surplus maximization can be identified with competitive equilibrium. This observation motivates the derivation of competitive equilibrium in a market from the maximization of surplus. The following simple model illustrates this.
19.5.1 Surplus maximization: an example
This example studies optimal entry–exit decisions of firms competing in a market where there is scope for acquiring improved technology over time, and shows how surplus maximization yields competitive equilibrium. A firm with technology α has production function f(l, α) = (αl)1/2, where l is the firm's choice of labor input. At market price p, profit maximization requires solving: maxl [pf(l, α) − wl]. For f(l, α) = (αl)1/2, the first-order condition
is pα1/2(½)l−1/2 − w = 0. So, l*(α, p) = α(p/2w)2. The corresponding supply is q(α, p) = (αl*)1/2 = αp/2w; and profit is π(p, α) = pq(α, p) − wl* = αp2/4w.
Consider the following two-period model with exit. Let μ be uniform on [0, 1], the initial distribution of firms. A firm with technology α can remain in the market and retain the technology α, or exit and re-enter with a new technology drawn from the uniform distribution. Thus, there is potentially an incentive for inefficient firms to exit: a firm with α small on exit has a 50% chance of drawing a technology between ½ and 1. In this framework, a firm operating in the market in any period chooses l optimally and in addition compares the benefit from remaining in the market with that from exit, which leads to potentially more efficient operation for the firm next period. Because firm efficiency is increasing in α, the decision to exit will be characterized by an exit threshold α. Thus, firms with technology below α exit; those with technology above α remain in the market. Let μα be the truncated distribution of those firms remaining in the market. The output of firms remaining in the market, at price p, is Q = ∫α1 (α′p/2w) dα′ = (p/4w)(1 − α2). Or, in terms of inverse supply, Ps(Q, α) = Ps(Q, μα) = (4w/(1 − α2))Q. In period 1, equilibrium, given exit level α, is characterized by Pd(Q) = Ps(Q, α), with solution Q(α).
In period 1, the surplus is defined as the area between the supply and demand curves:
S1(α) = ∫0Q(α) [Pd(Q) − Ps(Q, α)] dQ.
In a two-period model, social surplus is maximized by the optimal choice of the exit threshold. This affects surplus in both periods. Consider first the impact on
period 1 surplus of varying the threshold.24
dS1(α)/dα = −∫0Q(α) (∂Ps(Q, α)/∂α) dQ = −αP(α)2/4w = −π(P(α), α)
(using (∂Ps(Q, α)/∂α) = (2α/(1 − α2)2)4wQ, P(α) = (4w/(1 − α2))Q(α), Q(α)2 = P(α)2((1 − α2)2/(4w)2), and π(p, α) = αp2/4w.) Next, consider both periods. The present value of surplus over the two periods is (in what follows, superscripts differentiate time periods):
S(α) = ∫0Q1(α) [Pd(Q) − Ps1(Q, α)] dQ + ∫0Q2(α) [Pd(Q) − Ps2(Q, α)] dQ,
where Q1(α) and Q2(α) are the market clearing quantities in each period and α is the exit threshold. (In period 2 there is no exit.) From the social planner perspective, this is maximized by choice of α. Maximizing with respect to the exit threshold:
dS(α)/dα = dS1(α)/dα + dS2(α)/dα = −π(P1(α), α) + dS2(α)/dα.
24 From Leibnitz's rule: (d/dα) ∫0Q(α) f(Q, α) dQ = f(Q(α), α)Q′(α) + ∫0Q(α) (∂f(Q, α)/∂α) dQ; at the market clearing quantity, Pd(Q(α)) − Ps(Q(α), α) = 0, so the boundary term vanishes.
Concerning the second term, in period 2 output at price p consists of output from continuing firms, (p/4w)(1 − α2), and output from those firms that exited and re-entered with new technology, α ∫01 (α′p/2w) dα′ = αp/4w.
Thus, the inverse supply function is Ps2(Q, α) = (4w/(1 + α − α2))Q. So,
∂Ps2(Q, α)/∂α = −((1 − 2α)/(1 + α − α2)2)4wQ,
and
dS2(α)/dα = −∫0Q(α) (∂Ps2(Q, α)/∂α) dQ = ((1 − 2α)/(1 + α − α2)2)4wQ(α)2/2 = ((1 − 2α)/2)(P2(α)2/4w),
where the second equality follows from Q(α) = ((1 + α − α2)/4w)P2(α). Now, given p, the profit to a firm with technology α is (α/4w)p2. The expected profit to a firm drawing its technology from the uniform distribution is (½)(p2/4w). Summarizing,
dS(α)/dα = (½)(P2(α)2/4w) − [π(P1(α), α) + π(P2(α), α)].
Thus, the derivative of the surplus function gives the difference between the expected profit from exit and the profit from staying in the market.
In a competitive equilibrium this must be zero, since the marginal firm must be indifferent between exiting and staying in the market. For firms with technology above the α value with dS(α)/dα = 0, the profit from staying exceeds that from exit; and exit is optimal for those with technology below the threshold. So, maximizing S(α) gives the competitive equilibrium.
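This can be checked numerically. The sketch below (Python) assumes inverse demand Pd(Q) = 2 − Q and w = 1/4 (so that π(p, α) = αp2), maximizes the two-period surplus over the exit threshold by grid search, and verifies that at the maximizer the marginal firm is approximately indifferent between staying and exiting:

import numpy as np

w = 0.25

def prices(a):
    # Market clearing with Pd(Q) = 2 - Q and linear inverse supply:
    # slope 4w/(1 - a^2) in period 1 and 4w/(1 + a - a^2) in period 2.
    p1 = 2.0 / (1.0 + (1.0 - a**2) / (4 * w))
    p2 = 2.0 / (1.0 + (1.0 + a - a**2) / (4 * w))
    return p1, p2

def surplus(a):
    p1, p2 = prices(a)
    total = 0.0
    for p, slope in ((p1, 4 * w / (1 - a**2)), (p2, 4 * w / (1 + a - a**2))):
        Q = 2.0 - p                            # demand at the clearing price
        total += 2.0 * Q - Q**2 / 2 - slope * Q**2 / 2   # area between the curves
    return total

grid = np.linspace(0.01, 0.99, 9801)
a_star = grid[np.argmax([surplus(a) for a in grid])]
p1, p2 = prices(a_star)
stay = a_star * (p1**2 + p2**2)                # profit from staying (here 4w = 1)
exit_ = 0.5 * p2**2                            # expected profit from exit and re-entry
print(a_star, stay, exit_)                     # stay and exit approximately equal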
Additional discussion
The logic behind this can be seen by considering the impact of raising the period 1 exit threshold on period 1 surplus. From the perspective of an individual firm, at any price p, profit maximization gives: π(p, α) = maxq [pq − c(q, α)], so p − c′(q, α) = 0, and solving for q gives q(p, α). At price p*, profit is:
π(p*, α) = ∫0q(p*, α) [p* − c′(q, α)] dq,
the area between the price line and the marginal cost curve. This is depicted in figure A. The impact of raising the exit threshold (so there are fewer operating firms in period 1) is depicted in figure B. The change in total surplus for period 1 is given (to first-order approximation) by the shaded area in figure B when α increases to α′. At p, individual supply is q(p, α) and total supply is ∫α1 q(p, a) da. The change in surplus as α increases to α′ is, to first order, the surplus lost from the exiting firms:
ΔS ≈ −∫αα′ π(p, a) da.
Thus,
dS/dα = −π(p, α).
This corresponds to the calculation for dS/dα given earlier with the Cobb–Douglas technology.
19.6 No Aggregate Uncertainty
In the discussion of dynamic models in Section 19.4 the evolution of the aggregate distribution proceeded according to μt+1(·) = ∫ Pt(· ∣ α, a, τt) dτt, so that the distribution from period to period evolves deterministically; while at the individual level, player α faces the draw of a new identity parameter α′ in the next period according to Pt(· ∣ α, a, τt). The deterministic evolution of the aggregate distribution is called “no-aggregate-uncertainty.” With this property, the characteristics distribution evolves as a sequence of distributions (μ1, μ2, …). So, for example, one can consider the existence and properties of stationary equilibria—equilibria where, in particular, the characteristic distribution is unchanged from period to period: μt = μt+1. To understand this better in terms of underlying processes, suppose that technology evolution depends only on the current characteristic. Fix an underlying probability space (Ω, ℬ, p) and let next period's technology evolve according to the process ξ(α, ω). In this case, the distribution on technologies, given current technology α, is μα(B) = p({ω ∈ Ω ∣ ξ(α, ω) ∈ B}), B ⊆ Λ, and conditional on ω ∈ Ω, μω(B) = μ({α ∈ Λ ∣ ξ(α, ω) ∈ B}), where μ is the current distribution on characteristics. In this formulation, no aggregate uncertainty holds when the process ξ and underlying probability and characteristic spaces are such that the measure μω(·) is independent of ω.
Bibliography
Bergin, J. and Bernhardt, D. (1995). “Anonymous Sequential Games: Existence and Characterization of Equilibria,” Economic Theory, 45, 431–465.
Hopenhayn, H. (1992). “Entry, Exit, and Firm Dynamics in Long Run Equilibrium,” Econometrica, 60, 1127–1150.
Jovanovic, B. and Rosenthal, R. W. (1988). “Anonymous Sequential Games,” Journal of Mathematical Economics, 17, 77–88.
Mas-Colell, A. (1984). “On a Theorem of Schmeidler,” Journal of Mathematical Economics, 13, 201–206.
Schmeidler, D. (1973). “Equilibrium Points of Nonatomic Games,” Journal of Statistical Physics, 7 (4), 295–300.
20 Evolution and Learning 20.1 Introduction This chapter studies dynamic models of behavior and learning. Throughout the discussion, “learning” essentially means the convergence to a pattern of behavior under repeated application of some given principle for making choices. Although the definition might seem restrictive, learning in this form covers a large class of different forms of behavior. One of the main attractions of this kind of approach is that it makes low demands in terms of modeling “rationality” of players, or the need to forecast the actions of others, and so on. In practice this view of behavior may well be as realistic as that developed with more rational models of behavior. Classical models of long-run dynamics include fictitious play, considered in Section 20.2 and replicator dynamics, considered in Section 20.3. Both of these provide formulations of the dynamic choice of action, one based on best response reactions, the other based on growth of strategies that perform well. Stochastic stability, discussed in Section 20.4, describes a procedure for selecting long-run equilibria of dynamic systems according to robustness to perturbations of behavior. This is motivated by an example in Section 20.4.1 and the computation of such stable points is discussed in Section 20.4.2. The notion of stochastic stability is not tied to any particular model of behavior. The dynamics associated with two forms of behavior (best response and imitative behavior) are discussed in Sections 20.4.3 and 20.4.4. Section 20.8 describes approachability, a criterion identifying what a player can achieve when the payoff function is vector-valued. The concept turns out to have wide application. In the context of this chapter it appears in Section 20.5, where a model of regret minimization is presented and the long-run dynamics are associated with correlated equilibrium. Section 20.6 discusses long-run dynamics when players make calibrated forecasts and optimize relative to those forecasts. Finally, Section 20.7 provides a short discussion of Bayesian learning.
20.2 Fictitious Play
In fictitious play, a player takes the average of historical choices of the other player as an estimate of that player's choice next period. Players then best respond to those historical averages and this determines the dynamic evolution of choices. Let S1 and S2 be the finite strategy spaces in a two-player game, with #S1 = m1 and #S2 = m2. For t ≥ 1, let ht = {(i1, j1), …, (it, jt)} be a sequence of choices in S1 × S2, giving the history of actions chosen in the game up to and including period t. Given ht, let n1t(i) be the number of times i ∈ S1 is chosen in the period from 1 to t by player 1. Similarly, n2t(j) denotes the number of times j ∈ S2 was chosen by player 2. So, ∑i∈S1 n1t(i) = ∑j∈S2 n2t(j) = t. Because there is no history at time t = 1, fix an initial condition as follows. For i ∈ S1, let κ1(i) ≥ 0 be an initial weight, with K1 = ∑i κ1(i) > 0. Similarly, for j ∈ S2, let κ2(j) ≥ 0, with K2 = ∑j κ2(j) > 0. With this initial condition, take the averages to be:
x̄t(i) = (κ1(i) + n1t(i))/(K1 + t), ȳt(j) = (κ2(j) + n2t(j))/(K2 + t).
Noting that ∑i n1t(i) = t,
∑i x̄t(i) = (K1 + t)/(K1 + t) = 1.
For these definitions, x̄t and ȳt are distributions which are approximately equal to the empirical frequencies of the choices made by the players.
Let xt ∈ Δ(S1) and yt ∈ Δ(S2) be strategies for 1 and 2, respectively. Given y ∈ Δ(S2), the set of best responses to y is denoted BR(y), with a similar notation for x.

Definition 20.1. Given x̄1 and ȳ1, a pair of rules, x: Δ(S2) → Δ(S1) and y: Δ(S1) → Δ(S2), is called fictitious play if xt+1 = x(ȳt) ∈ BR(ȳt) and yt+1 = y(x̄t) ∈ BR(x̄t), t ≥ 1, where x̄t and ȳt are defined recursively from (x̄1, ȳ1) and (x, y).
Fictitious play has been studied extensively. The following is a brief list of some of the main results. If s = (s1, s2) is a strict Nash equilibrium, then it is absorbing for the fictitious play process: if at some time t the players choose s, they will do so in every period thereafter. If s = (s1, s2) is a pure strategy steady state of a fictitious play, then s = (s1, s2) is a Nash equilibrium. And, if the empirical distributions x̄t and ȳt converge to x* and y*, then (x*, y*) is a Nash equilibrium. Under fictitious play, the empirical distributions converge for: (1) two-person zero-sum games; (2) 2 × 2 games with generic payoffs; (3) games solvable by iterative strict dominance; (4) games with strategic complements (under additional conditions).
In fictitious play, players take others to be myopic—in the sense of assuming that others will play an action next period with probability equal to the historical frequency with which the action was chosen. With this model of the behavior of others, each player then best responds against that frequency distribution over opponents' actions. Thus, each player optimizes on the assumption that others are myopic.
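To make the recursion concrete, the following is a minimal sketch of fictitious play in Python, assuming a two-player game given by payoff matrices U1 and U2, uniform initial weights, and tie-breaking by argmax; these are illustrative choices, not part of the text.

import numpy as np

def fictitious_play(U1, U2, T=5000):
    """Discrete fictitious play: each period, each player best responds
    to the opponent's weighted empirical distribution of past play."""
    m1, m2 = U1.shape
    c1, c2 = np.ones(m1), np.ones(m2)   # initial weights c0 > 0 (uniform here)
    for _ in range(T):
        x_bar, y_bar = c1 / c1.sum(), c2 / c2.sum()
        i = int(np.argmax(U1 @ y_bar))  # player 1 best responds to y_bar
        j = int(np.argmax(x_bar @ U2))  # player 2 best responds to x_bar
        c1[i] += 1.0
        c2[j] += 1.0
    return c1 / c1.sum(), c2 / c2.sum()

# Matching pennies (zero-sum): empirical distributions converge to (1/2, 1/2).
U1 = np.array([[1., -1.], [-1., 1.]])
print(fictitious_play(U1, -U1))

For the matching pennies game used here (a zero-sum game, case (1) above), the empirical distributions converge to the mixed equilibrium (½, ½), even though period-by-period play keeps cycling.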
20.3 Replicator Dynamics

In replicator dynamics, behavior is not explicitly modeled. A dynamic system governs the evolution of actions, with the key feature that actions which are relatively more successful grow faster in the population. Fix a symmetric game, (A, A′), where A is an n × n matrix. Let Δ be the n − 1 dimensional simplex. Given x, y ∈ Δ, let u(x, y) = x · Ay. Write ei to denote the n-vector with 1 in the ith position and zeros elsewhere. Thus, u(ei, y) = ∑j aij yj, the y-weighted sum of entries in the ith row of A. The replicator dynamic is defined as

ẋi = xi[u(ei, x) − u(x, x)], i = 1, …, n.

Thus, the growth rate of the population share using strategy i equals the difference between the strategy's current payoff and the current average payoff. Replicator dynamics can be derived from behavioral models where agents imitate more successful strategies (social learning), from models where agents experience behavior reinforcement (positive experience with a strategy (a good payoff) makes the choice of that strategy more likely subsequently), and from biological models where more successful genes grow faster than the general population. Observe that since

ẋi/xi = u(ei, x) − u(x, x),

worse than average performers have declining share. Also, because

(d/dt)(xi/xj) = (xi/xj)[u(ei, x) − u(ej, x)],

the ratio of any two shares varies according to their relative performance. If a pure strategy i is strictly dominated, then xi(x0, t) → 0, for any x0 ∈ int(Δ). However, weakly dominated strategies are not necessarily eliminated under replicator dynamics. A stationary point of the replicator dynamic is any point x for which xi[u(ei, x) − u(x, x)] = 0, ∀ i. So, any pure strategy is a stationary point. The set of Nash equilibria is a subset of the set of stationary points.
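As a quick illustration, the following sketch integrates the replicator dynamic with a forward Euler scheme; the payoff matrix (the coordination game used again in Section 20.4.3), the step size, and the initial conditions are assumptions made for the example.

import numpy as np

def replicator(A, x0, dt=0.01, steps=50_000):
    """Euler discretization of the replicator dynamic
    x_i' = x_i [u(e_i, x) - u(x, x)]."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        f = A @ x            # u(e_i, x): payoff to each pure strategy
        avg = x @ f          # u(x, x): population average payoff
        x += dt * x * (f - avg)
        x /= x.sum()         # guard against numerical drift off the simplex
    return x

# Coordination game: A earns 2 against A, B earns 1 against B.
A = np.array([[2., 0.], [0., 1.]])
print(replicator(A, [0.5, 0.5]))   # converges to the pure strategy (1, 0)
print(replicator(A, [0.2, 0.8]))   # from x1 < 1/3, converges to (0, 1)

Starting from a share of the first strategy above ⅓ that strategy takes over, and below ⅓ the second does: the limit reached depends on the basin in which the dynamic starts.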
20.4 Stochastic Stability

Dynamical systems governing the movement of a state over time often have a multiplicity of long-run equilibria—where the state of the system does not change from period to period. In the current context, particular patterns of behavior may be unchanging from period to period. However, when such steady states are subjected to repeated perturbation, some may be more resilient than others, in the sense that reversion to the initial unperturbed condition is more likely for some states. Stochastic stability provides a criterion for identifying such states: those states that are most difficult to leave, in an appropriate statistical sense, are the ones most likely to persist over time. The next section begins with some motivation for the stochastic stability criterion and a discussion of a few techniques for the computation of stochastically stable distributions. Following that, two applications are considered where the underlying dynamics are derived respectively from best response behavior and from imitative behavior.
20.4.1 Motivation

Many dynamical systems can be represented by a rule describing the way in which the system moves from the current state to a future state. If states are represented by a finite set, S, a law of motion on the state space is determined by transition probabilities from state to state, pij, giving the probability of moving to state j next period, given the current state is i. Such probabilities and an initial distribution, π1, on states determine the evolution of the system. In particular, this determines the distribution over states at each time period. If the current distribution over states is πt = (πt1, …, πts), then next period's distribution is πt+1, where πt+1,j = ∑i πti pij. Under certain circumstances, πt converges to some distribution π*, independent of the initial distribution on states. In that case, π* is a natural measure of the long-run distribution. Convergence of this sort depends on the transition probabilities satisfying certain conditions which will be discussed below. However, in the context of the models arising in the applications discussed later, those conditions are typically not satisfied. Nevertheless there is a natural criterion for identifying a particular distribution as the long-run stable distribution—based on the notion of stochastic stability. In the evolution and learning context, the criterion can be justified in terms of models of experimentation or error in carrying out decisions. The following discussion motivates the basic idea. Given a transition matrix P, an invariant or stationary distribution of P is a vector π such that πP = π. In general an arbitrary transition matrix may have many invariant distributions—an extreme example is where P is the identity matrix,
so that every distribution is invariant. Nevertheless, it may still be possible to identify a specific distribution as the better predictor of long-run outcomes, taking stochastic stability considerations into account. Consider the following transition matrix on the state space S = {1, 2, 3}:

      ( 1  0  0 )
P  =  ( 0  0  1 )
      ( 0  0  1 )

In this example, suppose there is a natural notion of distance between states, for example, dij = |i − j|. Suppose also that the matrix represents the error-free formulation of a system with errors, where if errors occur they are most likely to move the system to an adjacent state. For example, in state 3 an error is more likely to move the system to state 2 than to state 1. Similarly, in state 1 errors are more likely to move the system to state 2 than to state 3. Note that there are two absorbing states, 1 and 3 (p11 = 1 = p33), so even with the addition of errors, any distribution over states will put most weight on these states and little weight on state 2. Note, however, that errors at state 3 most likely move the system to state 2, where, even with errors, the system will with large probability return to state 3, because p23 = 1. When the same error level at state 1 moves the system to state 2, then with large probability the system moves to state 3. In such circumstances, state 3 seems to be the state that is most likely in the long run. Stochastic stability captures this idea. In the evolutionary context, the basic procedure is to take the initial transition matrix P, perturb it to a matrix P(ε) such that there is a unique invariant distribution πε of P(ε) (πεP(ε) = πε), and identify π* = limε → 0 πε as the stochastically stable distribution (a suitable model of experimentation or errors leads to perturbations for which the limit exists). Computation of πε and π* is tedious in principle. However, there are a number of methods for computing these distributions, and in the context of the present discussion, not all terms in P(ε) are equally important in determining πε: only terms of dominant order of magnitude are relevant. This fact allows simplification of the calculations in a variety of circumstances using minimum cost trees—one of the techniques described next.
20.4.2 Invariant distributions: overview

Given a transition matrix P = {pij}, π is an invariant distribution if πP = π. The transition matrix P is irreducible if there is positive probability of reaching any state from any other: given i, j, ∃ i1, …, in with i1 = i, in = j and p(ik, ik+1) > 0 for k = 1, …, n − 1. If the transition matrix is irreducible, there is a unique invariant distribution. In general, the transition matrix P may or may not be singular. For example, the identity matrix is a non-singular transition matrix with an infinite number of invariant distributions, while a transition matrix with two rows equal is singular but will have a unique invariant distribution if, for example, all entries are strictly positive.
While irreducibility guarantees that P has a unique invariant distribution, this does not imply that starting from any initial distribution the system will converge to the unique invariant distribution. If the initial distribution is π1, then at period 2 the new distribution is π2 = π1P; and at period t, πt = πt−1P. Given π1, this defines a sequence of distributions {πt}. Since πt+1 = πtP = πt−1PP = πt−1P² = ⋯ = π1P^t, the behavior of πt depends on π1 and P (or P^t). Consider

      ( 0  1 )
P  =  ( 1  0 )
If n is odd P^n = P, and if n is even P^n = I. If π1 = (1, 0), then π2 = (0, 1), π3 = (1, 0), and so on: for t odd πt = (1, 0) and for t even πt = (0, 1). So the process does not converge: starting from an arbitrary initial distribution, irreducibility is insufficient to guarantee convergence (although the time averages do converge: (1/T)∑Tt=1 πt → π*, where π* is the invariant distribution of P). The key extra condition is aperiodicity. For state i, let pij(n) be the element in the ith row and jth column of P^n. The period of state i is denoted d(i) and defined:

d(i) = g.c.d.{n ≥ 1 | pii(n) > 0}

(where "g.c.d." is the greatest common divisor). Two states i, j communicate if ∃ n′, n″ such that pij(n′) > 0 and pji(n″) > 0—it is possible to reach one state from the other. Communicating states form an equivalence class and all states in the equivalence class have the same period. In the case where P is irreducible, this implies that all states have a common period. In the example above, p11(n) > 0 only for n even, so that g.c.d.{n ≥ 1 | p11(n) > 0} = 2, and P is periodic with period 2. When P is irreducible with period 1, P is called aperiodic, and the process is called ergodic. Furthermore, with ergodicity, P^n converges to a matrix P*, with the property that each row of P* is the same, and equal to the unique invariant distribution of P. Thus, P* can be written in terms of the invariant distribution π* = (π*1, …, π*s):

       ( π*1  ⋯  π*s )
P*  =  (  ⋮        ⋮  )
       ( π*1  ⋯  π*s )
For an irreducible aperiodic Markov matrix, convergence of P^n is fast: there are constants c > 0 and ρ ∈ (0, 1) such that |pij(n) − π*j| ≤ cρ^n for all i, j.
Finally, this can be related to the time it takes to reach or return to a state. Let {Xt} be a stationary Markov process with state space S and such that
p(Xt+1 = j | Xt = i) = pij, so the process has transition matrix P. Let X∞ = (X0, X1, …) and define Ti(X∞) according to:

Ti(X∞) = min{t ≥ 1 | Xt = i}.
Since Ti is a function of the random vector X∞, it is a random variable—a “stopping time”—and gives the first visit to state i (which depends on the value of X∞). The expected return time to state i is E {Ti | X0 = i}. Note that when this is small, return to state i occurs “quickly,” so that state i is likely to be observed often. This intuition connects directly to the invariant distribution: if π* is the invariant distribution of an ergodic (irreducible and aperiodic) process, then π*i = 1/(E {Ti | X0 = i}). The following discussion describes three procedures for finding the invariant distribution of an irreducible transition matrix. One is based on matrix inversion, one on a cofactor computation, and these are both seen to generate the same formula as the minimum cost tree procedure, described later.
Computation of invariant distributions For irreducible transition matrices there are a number of direct ways to compute the invariant distribution. The following discussion describes some matrix procedures and an equivalent graph theoretic procedure which is particularly useful when the orders of magnitude of transition probabilities between states vary.
Computation by matrix inversion

Let ι be an n × 1 vector (a vector with n rows and 1 column), with 1 in each position—a column vector of 1's. Given a vector x, write x′ for its transpose, so in particular, ι′ is a row vector of 1's. Observe that π is an invariant distribution for P if and only if π(I − P + ιι′) = ι′, where I is the identity matrix. If P is irreducible then (I − P + ιι′) is invertible[26] and π may be computed directly: π = ι′(I − P + ιι′)⁻¹. In the example above,

                 ( 2  0 )
I − P + ιι′  =   ( 0  2 )

so

π = ι′(I − P + ιι′)⁻¹ = (½, ½).
[26] If not, then let π be the invariant distribution; by assumption ∃ ξ ≠ π with ξ(I − P + ιι′) = ι′. Then since π ≫ 0, for θ > 0 small, π(θ) = (1 − θ)π + θξ ≫ 0 and π(θ)ι = 1, so that π(θ) is also an invariant distribution—a contradiction.
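The inversion formula is immediate to check numerically; a minimal sketch (numpy is an assumed tool, and the example matrix is the periodic matrix above):

import numpy as np

def invariant(P):
    """Invariant distribution of an irreducible transition matrix P,
    via pi = iota' (I - P + iota iota')^{-1}."""
    n = P.shape[0]
    ones = np.ones(n)
    return ones @ np.linalg.inv(np.eye(n) - P + np.outer(ones, ones))

# The 2-state periodic example: invariant distribution (1/2, 1/2).
P = np.array([[0., 1.], [1., 0.]])
print(invariant(P))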
Computation by cofactors

One can also compute the invariant distribution of an irreducible matrix P using the cofactors of I − P, a matrix which is necessarily singular since (I − P)ι = ι − ι = 0. Let Q = I − P, and let cii be the cofactor of Q in the ith position on the diagonal. Then if π is the invariant distribution of P, πi = cii/∑j cjj. For example, with three states,

              ( 1 − p11    −p12      −p13   )
Q = I − P  =  ( −p21      1 − p22    −p23   )
              ( −p31       −p32     1 − p33 )

So

c11 = (1 − p22)(1 − p33) − p23p32.

Because 1 − pii = ∑j ≠ i pij, 1 − p22 = p21 + p23, and so on. Therefore,

c11 = (p21 + p23)(p31 + p32) − p23p32 = p21p31 + p21p32 + p23p31.

Then,

π1 = c11/(c11 + c22 + c33),

and π2, π3 are computed similarly from c22 and c33.
An alternative way to compute the invariant distribution is based on the Markov chain tree theorem. This is discussed next.
Computation by minimum cost trees

Let S be the set of states and pick z ∈ S. A z-tree is a collection of ordered pairs (of states) h such that:
1. j ≠ z implies ∃ a unique k ≠ j with (j, k) ∈ h.
2. ∀ j ∈ S \ {z}, ∃ j1, …, jr such that (jl, jl+1) ∈ h, j1 = j and jr = z.
The invariant distribution can be determined directly from the trees h. Let Hz be the collection of z-trees. Define

fz = ∑h ∈ Hz ∏(i,j) ∈ h pij.

The Markov chain tree theorem asserts that the invariant distribution of P is proportional to f = (fz)z ∈ S. In the example above there are three 1-trees. The three trees in H1 are:

{(2,1), (3,1)},   {(2,1), (3,2)},   {(2,3), (3,1)},

with corresponding products of the transition probabilities p21p31, p21p32, and p23p31.
So, for example, f1 = p21p31 + p21p32 + p23p31. This is c11 from the earlier cofactor calculation. The invariant distribution π satisfies:

πz = fz / ∑z′ ∈ S fz′.
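For small state spaces the tree formula can be verified by brute-force enumeration. The following is a sketch (the example matrix is an arbitrary irreducible one, chosen only for illustration):

import numpy as np
from itertools import product

def tree_weights(P):
    """f_z = sum over z-trees of the product of edge probabilities p_ij."""
    n = P.shape[0]
    f = np.zeros(n)
    for z in range(n):
        others = [j for j in range(n) if j != z]
        # a candidate tree assigns each j != z a unique successor k != j
        for succ in product(*[[k for k in range(n) if k != j] for j in others]):
            g = dict(zip(others, succ))
            ok = True
            for j in others:          # check every j != z reaches z
                seen, cur = set(), j
                while cur != z:
                    if cur in seen:
                        ok = False
                        break
                    seen.add(cur)
                    cur = g[cur]
                if not ok:
                    break
            if ok:
                f[z] += np.prod([P[j, g[j]] for j in others])
    return f

P = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]])
f = tree_weights(P)
print(f / f.sum())   # matches the invariant distribution of P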
In the model of Section 20.4.1, while the unperturbed system is not irreducible, the model with mutations produces an irreducible matrix. To illustrate the computation of the invariant distribution, consider the matrix P perturbed according to the binomial scheme to P(ε), with C(ε) the cofactor matrix of I − P(ε):

         ( (1 − ε)²   2ε(1 − ε)   ε²       )
P(ε)  =  ( ε²         2ε(1 − ε)   (1 − ε)² )
         ( ε²         2ε(1 − ε)   (1 − ε)² )

with diagonal cofactors

c11(ε) = ε²,   c22(ε) = 4ε² − 4ε³,   c33(ε) = 2ε − 5ε² + 4ε³.

Adding the terms on the diagonal of C(ε): ε² + (4ε² − 4ε³) + (2ε − 5ε² + 4ε³) = 2ε. Thus, the invariant distribution is:

π(ε) = (ε²/2ε, (4ε² − 4ε³)/2ε, (2ε − 5ε² + 4ε³)/2ε) = (O(ε), O(ε), O(1)),   (20.1)

where O(ε) means that the term goes to 0 at the same rate as ε (and O(1) is a constant plus small order terms). Letting ε → 0, π(ε) → (0, 0, 1). Observe that the limiting distribution is determined by the terms of largest order of magnitude—only the third term, π3(ε), is of order of magnitude 1.
Using the formula for fz:

f1(ε) ≈ ε²,   f2(ε) ≈ 4ε²,   f3(ε) ≈ 2ε.

While fi(ε) = cii(ε) (the ith diagonal entry of C(ε)), the approximation is obtained by taking those terms of leading order of magnitude in ε. From this it is clear that f3(ε) dominates as ε → 0. Taking the leading terms simplifies the calculation—since there is no need to obtain the exact invariant distribution for a given ε.
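Numerically, the same limit can be seen by computing πε for decreasing ε (a sketch reusing the inversion formula above; the perturbation follows the binomial scheme just described):

import numpy as np

def P_eps(e):
    """Binomial perturbation of P = [[1,0,0],[0,0,1],[0,0,1]] (Section 20.4.1)."""
    row1 = [(1 - e)**2, 2*e*(1 - e), e**2]
    row23 = [e**2, 2*e*(1 - e), (1 - e)**2]
    return np.array([row1, row23, row23])

def invariant(P):
    n = P.shape[0]
    ones = np.ones(n)
    return ones @ np.linalg.inv(np.eye(n) - P + np.outer(ones, ones))

for e in [0.1, 0.01, 0.001]:
    print(e, invariant(P_eps(e)))   # tends to (0, 0, 1) as e -> 0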
20.4.3 Best response dynamics: an example

The following coordination game illustrates the computation of stochastically stable distributions under best response dynamics. Suppose there are nine players playing the game (the payoffs shown are one specification consistent with what follows; only the location of the mixed equilibrium matters):

          A        B
  A     2, 2     0, 0
  B     0, 0     1, 1

In this game there are two pure strategy equilibria and one mixed strategy equilibrium: (⅓, ⅔). For a distribution (x, 1 − x) on {A, B}, A is the unique best response if x > ⅓, and B the unique best response if x < ⅓. Define a state space S = {0, 1, …, n}, where the state denotes the number of players playing action A. Given z ∈ S, the number of players playing A, let

w(i) = (z − ai)/(n − 1), where ai = 1 if i is currently playing A and ai = 0 otherwise.

Thus, (w(i), 1 − w(i)) is the probability distribution on the choices (A, B) that player i faces in the population. Define the best response for i:

choose A if w(i) > ⅓;  choose B if w(i) < ⅓.

In any given state, each player makes a choice, A or B, according to the best response function, and this moves the system to a new state where each player is an A or B player, according to their choice. In the example here with n = 9, the dynamics evolve as follows. In state 0, all play action B. Each player faces the distribution (0, 1) on {A, B} (there are eight other players), so the best response is to choose B—next period, all nine
players choose B. The system stays in state 0. Therefore p00 = 1 and p0j = 0, j = 1, …, 9. This defines the first row of the transition matrix. In state 1, the "A" player faces the distribution (0, 1) on {A, B}; all eight "B" players face the distribution (⅛, ⅞) on {A, B}. For either, the best choice is B, so that next period all choose B. Thus p10 = 1, and p1j = 0, j ≠ 0. In state 2, the "A" players face the distribution (⅛, ⅞) on {A, B}; the "B" players face the distribution (¼, ¾) on {A, B}. For both A and B players, the best response is B. Thus, p2j = p1j for all j. In state 3, there are 3 A players and 6 B players. The A players face the distribution (¼, ¾), and the B players face the distribution (⅜, ⅝). So, the A players choose B and the B players choose A. This leads to 6 A players and 3 B players: p36 = 1, p3j = 0, j ≠ 6. For i > 3, similar calculations show that pij = 0, j < 9, and pi9 = 1. In summary, the unperturbed transition matrix has pi0 = 1 for i = 0, 1, 2; p36 = 1; and pi9 = 1 for i = 4, …, 9. There are two absorbing states: s = 0 and s = 9. The basins of attraction of these two states are {0, 1, 2} and {3, 4, …, 9}, respectively.
Experimentation

Given the state is state 0 (no player chooses A), every player plans to play B next period. As the transition matrix indicates, p00 = 1. Suppose that each player's choice is subject to mutation: with probability (1 − ε) no mutation occurs and the player's choice is unchanged, and with probability ε there is a mutation switching the player's choice. Assume that mutations are independent across players. In this case, with probability (1 − ε) for each player, no mutation occurs, and so the choice of B for every player has probability (1 − ε)^9. Next, there are 9 ways in which one player mutates and 8 do not. Each of these has probability (1 − ε)^8 ε, so there is total probability of 9(1 − ε)^8 ε of moving to state 1, where exactly 1 player chooses A and 8 choose B. Similarly, there are 36 ways of choosing two players, so the total probability of two mutations is 36(1 − ε)^7 ε². And so on. In this notation, the probability of moving from state 0 to state 0 is (1 − ε)^9; and the probability of moving from state 0 to state j is C(9, j)(1 − ε)^(9−j) ε^j, where C(9, j) denotes the binomial coefficient. In state 9, the calculations go
in the opposite direction: the probability of moving from state 9 to state 9 is (1 − ε)^9, and the probability of moving from state 9 to state j is C(9, 9 − j)(1 − ε)^j ε^(9−j). The main point to note is the order of magnitude of the mutation: mutation to a neighboring state has probability of order ε.
The only state that is computationally tedious to work with is state 3, where a deterministic transition moves the system to state 6. From state 3, there are many ways that, for example, state 5 can be reached. One possibility is that exactly one player whose best response is A mutates to B; this has order of magnitude in probability of ε. Another possibility is that two players whose best response is A mutate to B while one player whose best response is B mutates to A; this has order of magnitude in probability ε³. Such possibilities are of smaller order of magnitude relative to the case where just one switch occurs. Moving up or down one state from the unperturbed transition has probability of order ε under mutation; moving up or down two states has probability of order ε²; and so on. From these observations, it is clear that it requires 3 mutations to move out of the basin of attraction of state 0 and 7 mutations to move out of the basin of attraction of state 9.
Minimum cost trees

In this model with mutations, each transition probability pij(ε) (the ijth element of P(ε)) has an order of magnitude: pij(ε) = O(ε^γij), where γij is the smallest exponent of ε in pij(ε) and where 1 is read as ε⁰. (In the 3-state example, p11(ε) = (1 − ε)² = O(ε⁰), p12(ε) = 2(1 − ε)ε = O(ε¹), and p13(ε) = ε² = O(ε²), so that γ11 = 0, γ12 = 1, γ13 = 2.) Thus, ∏(i,j) ∈ h pij(ε) = O(ε^∑(i,j) ∈ h γij). The smaller is ∑(i,j) ∈ h γij, the larger is ∏(i,j) ∈ h pij(ε). Since fz(ε) = ∑h ∈ Hz ∏(i,j) ∈ h pij(ε), the order of magnitude of the largest term in the sum determines the order of magnitude of fz(ε), and is determined by that h ∈ Hz for which ∑(i,j) ∈ h γij is smallest. If there is more than one such h, that may affect the value of the limiting invariant distribution, but not the support of the distribution. Let vz = minh ∈ Hz ∑(i,j) ∈ h γij and v* = minz ∈ S vz. Then limε → 0 πz(ε) = 0 if vz > v*, where πz(ε) = fz(ε)/∑z′ fz′(ε). Again, considering the 3-state example discussed earlier: γ12 = 1, γ13 = 2, γ21 = 2, γ23 = 0, γ31 = 2, γ32 = 1, and

v1 = min{γ21 + γ31, γ21 + γ32, γ23 + γ31} = min{4, 3, 2} = 2.

Similar calculations give v2 = 2 and v3 = 1. So v* = v3 = 1, and limε → 0 π1(ε) = limε → 0 π2(ε) = 0, as was calculated before.
The n-person case is essentially the same.

Theorem 20.1. In the coordination game with n players, the unique stochastically stable state is the one where all players play strategy A.

Proof. (Sketch) In the coordination example with n players, essentially the same reasoning as above applies. For some threshold k* ≈ ⅓n, when the number of A players falls below k* all players switch to action B, and when it rises above k* all players switch to action A. It therefore takes roughly ⅓n mutations to leave the basin of attraction of state 0, and roughly ⅔n mutations to leave the basin of attraction of state n. The relation of the minimum cost trees to the transition matrix follows from counting the mutations required along each tree edge: in this case, v0 ≈ ⅔n and vn ≈ ⅓n when n is large, so any minimum cost tree has root n. Loosely speaking, it is twice as unlikely to move from state n to state 0 as it is to move from state 0 to state n.
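A quick Monte Carlo check of this selection result (a sketch; the mutation rate, horizon, and seed are arbitrary choices for the example):

import numpy as np

def step(z, n=9, eps=0.01, rng=None):
    """One period: best responses given state z (number of A players), then mutations."""
    a_players = np.arange(n) < z                  # which players currently play A
    w = (z - a_players.astype(float)) / (n - 1)   # share of the *others* playing A
    choose_A = w > 1.0 / 3.0                      # A is the best response iff w > 1/3
    flips = rng.random(n) < eps                   # independent mutations
    return int(np.sum(choose_A ^ flips))

rng = np.random.default_rng(1)
z, counts = 0, np.zeros(10)
for _ in range(200_000):
    z = step(z, rng=rng)
    counts[z] += 1
print(counts / counts.sum())   # occupancy concentrates on state 9 for small eps

Even though the chain starts at state 0, for small ε the long-run occupancy puts almost all mass on state 9, as Theorem 20.1 predicts.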
Other 2 × 2 games

Instead of the coordination game, consider an arbitrary symmetric 2 × 2 game:

          A        B
  A     a, a     b, c
  B     c, b     d, d

Suppose there are two pure strategy equilibria, (A, A) and (B, B) (so a > c and d > b), and one mixed strategy equilibrium (xA, 1 − xA). If xA < ½, then the stochastically stable outcome has all players playing A; if xA > ½, then the stochastically stable outcome has all players playing B. In games with two pure strategies, the state space is identified with the number of players choosing A, and so has n + 1 elements; the state space has a relatively simple structure. The next section discusses an oligopoly model, where the natural state space is the set of possible profiles of actions. In addition, the underlying dynamics are based on imitative behavior.
20.4.4 Imitative dynamics: an example

Consider an oligopolistic environment with n firms. The demand function is p(∑j qj) and the cost function of an individual firm is c(q). Demand is downward sloping and cost increasing. Each firm can choose output from a grid Γ = {0, δ, 2δ, …, kδ}. A state of the system is an output profile ω = (q1, …, qn) ∈ S = Γⁿ. At time t, denote a state of the system by ωt = (q1t, …, qnt). At each period in time, with probability ρ > 0 firms can adjust their output. The adjustment process is assumed to follow an imitative rule—firms imitate the most successful choices in the current period in determining their choice next period (provided they can adjust). This is given by:

B(ωt−1) = {q ∈ Γ | q = qjt−1 for some j with πj(ωt−1) ≥ πl(ωt−1), ∀ l},

the set of output choices made by the most profitable firms at ωt−1. Define a revision rule for i as α: S → Δ(Γ), where, for instance, α(ω) is uniform on B(ω):

α(ω)(q) = 1/#B(ω) if q ∈ B(ω);  α(ω)(q) = 0 otherwise.

So, a revision rule for i puts probability 1 on elements of the set B(ωt−1). Adding an inertia parameter ρ, one may construct a transition rule for the system:

qit = qit−1 with probability 1 − ρ;  qit drawn from α(ωt−1) with probability ρ.

So, for example, if qit−1 ∉ B(ωt−1), then i wishes to change action. With probability (1 − ρ), this is not possible and i must continue with the choice qit−1. With probability ρ, i can change action, and in this case does so optimally—according to the distribution α(ωt−1). If qit−1 ∈ B(ωt−1), then with probability (1 − ρ), i must continue with that choice; with probability ρ, i can choose any q, and given the option to choose, i chooses qit−1 with probability α(ωt−1)(qit−1), so the total probability of choosing qit−1 is (1 − ρ) + ρα(ωt−1)(qit−1), and so on.
Aggregating over firms, this defines a transition matrix on states. For any q ∈ Γ write ωq = (q, q, …, q) and let A = {ωq | q ∈ Γ}, so that A is the set of states where every firm makes the same choice. Because choices are drawn from B(ω), if ω ∈ A, then the dynamic system has every firm make the same choice at every date: each ω ∈ A is an absorbing state. When each firm chose q last period, there is only q to imitate. To consider the issue of stochastic stability, add mutations to the model; once a firm makes its choice, allow for the possibility that a small random shock may alter its decision. After adjustment of a firm's choice occurs, to qit, say, let qit be perturbed: with probability 1 − ε the firm's choice is unperturbed, and with probability γε each other output level is reached, where γ = 1/(#Γ − 1). This defines a perturbed transition matrix, P(ε). The Walrasian output level plays a central role in the discussion. The profit of firm i is πi(q1, …, qn) = p(∑j qj)qi − c(qi), and the Walrasian output, qw, is defined by:

p(nqw)qw − c(qw) ≥ p(nqw)q − c(q), ∀ q.

Note that for q ≠ qw and 1 ≤ k ≤ n − 1, writing Qk = kqw + (n − k)q,

p(Qk)qw − c(qw) > p(Qk)q − c(q).  (*)

To see this, observe that the left side of (*) minus the right side is

p(Qk)(qw − q) − [c(qw) − c(q)]:

if q < qw, then the first term is positive and the second negative; if q > qw, the first term is negative and the second positive, so the sign is not immediate. Rearranging the definition of Walrasian output (subtracting p(nqw)q and c(qw) from both sides):

p(nqw)(qw − q) ≥ c(qw) − c(q).

If q < qw, then Qk < nqw and, since demand is downward sloping, p(Qk) > p(nqw), so p(Qk)(qw − q) > p(nqw)(qw − q) ≥ c(qw) − c(q). If q > qw, then Qk > nqw and p(Qk) < p(nqw); multiplying by (qw − q) < 0 reverses the inequality, so again p(Qk)(qw − q) > p(nqw)(qw − q) ≥ c(qw) − c(q). In either case (*) holds. Given (*), consider the cases where k = 1 and k = n − 1. In particular,

(i) p(qw + (n − 1)q)qw − c(qw) > p(qw + (n − 1)q)q − c(q),
(ii) p((n − 1)qw + q)q − c(q) < p((n − 1)qw + q)qw − c(qw).
From (i), if one firm's action switches from q to qw with others playing q, the action qw yields higher profit. From (ii), if one firm's action switches from qw to q with others playing qw, the action q yields lower profit. These key features of qw are used in the following theorem.

Theorem 20.2. The unique stochastically stable state is the Walrasian state, ωw = (qw, …, qw).
Proof. (Sketch) As far as the dynamics go, the central idea can most easily be seen in the constant marginal cost case: c(q) = cq, with Walrasian output defined by p(nqw) = c. Let Q = ∑i qi, so that the profit of i is [p(Q) − c]qi and πi − πj = [p(Q) − c](qi − qj). As long as p(Q) ≥ c, the higher output firm has the larger profit. Thus, imitation will lead to the higher output level being adopted: an output increase by one firm will be followed by matching increases by others when possible. However, an output reduction will not be matched. Thus, below the Walrasian output, at say q′, if the state is ωq′ and as a result of mutation i's output is increased to q′ + δ, i will continue with this choice since it yields higher profit, and others will adopt it: an upward mutation by one firm leads to matching by others. At the Walrasian output, neither an upward nor a downward mutation will be imitated: an upward mutation pushes price below c, and the firm with the highest output makes the greatest loss. A drop from the Walrasian level by some firm pushes price above c, and that firm, with the lowest output, makes profit lower than that of other firms. In sum, at least two mutations are required to leave the Walrasian state, and just one mutation to leave any other absorbing state.
For each q, let {Aq} denote a collection of transient states with positive probability of reaching ωq in the unperturbed system. Only states of the form s = ωq for some q are absorbing, so these are the only ones where positive cost is incurred (in terms of mutations) to move to another state. Consider a minimum cost tree rooted at a non-Walrasian state ωq. From every absorbing state other than ωw, at least one mutation is required to leave the state; and at least two mutations are required to leave the state ωw. Take any such tree, and interchange the root ωq with ωw. This produces a tree with lower cost. Because this can always be done and produces a tree with lower cost, any minimum cost tree has ωw as root. Thus, the unique stochastically stable distribution puts probability 1 on the Walrasian state.
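The logic can be checked by simulation. The sketch below assumes linear demand p(Q) = a − bQ and constant marginal cost c (so qw = (a − c)/(bn)); these functional forms, the grid, and all parameter values are illustrative choices, and mutations are simplified to a uniform redraw over the grid.

import numpy as np

def simulate(n=4, delta=1.0, k=10, rho=0.5, eps=0.01, T=100_000, seed=2):
    """Imitate-the-best Cournot dynamic: p(Q) = a - b*Q, c(q) = c*q."""
    a, b, c = 20.0, 0.5, 4.0
    grid = delta * np.arange(k + 1)
    rng = np.random.default_rng(seed)
    q = rng.choice(grid, n)
    visits = np.zeros(k + 1)
    for _ in range(T):
        profit = (a - b * q.sum() - c) * q            # [p(Q) - c] q_i
        best = q[profit == profit.max()]              # outputs of most profitable firms
        adjust = rng.random(n) < rho                  # inertia: adjust with prob rho
        q = np.where(adjust, rng.choice(best, n), q)  # imitate a best output
        mutate = rng.random(n) < eps                  # mutation: redraw from the grid
        q = np.where(mutate, rng.choice(grid, n), q)
        visits += np.bincount((q / delta).astype(int), minlength=k + 1)
    return grid[np.argmax(visits)]

# Walrasian output solves p(n q) = c: q_w = (a - c)/(b n) = 8 with these parameters.
print(simulate())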
20.5 Regret Minimization

Consider a player who at any point in time can take one of a finite number of actions. Over time, a history of payoffs associates a payoff with each choice. A player can assess what the payoff would have been if, at each time in the past where j was chosen, the player had chosen k instead. This provides a measure of the regret for not having chosen k instead of j. For any pair j, k with j ≠ k such a measure of regret can be calculated. If there are mi pure strategies, then there are mi(mi − 1) regret variables: for each pure strategy, mi − 1 regrets based on the mi − 1 other pure strategies. A player who simultaneously wishes to minimize all these regrets must reduce the value of a vector of mi(mi − 1) variables. It is as if the player has a vector of payoffs. The theory of approachability deals with games that have vector payoffs. The following provides a brief discussion; the topic is discussed in greater detail in Section 20.8.
20.5.1 Approachable sets of payoffs

Consider a game with finite action spaces A and B for players 1 and 2, respectively. Let v(a, b) ∈ Rm be the vector payoff to actions (a, b) ∈ A × B. When the game is played over time, this generates a sequence of actions (at, bt) and corresponding payoffs v(at, bt), with average at time n equal to v̄n = (1/n)∑nt=1 v(at, bt). Given a convex set Q ⊆ Rm, does player 1 have a strategy to force the average payoff, v̄n, into the set Q? If so, the set Q is said to be approachable. The approachability criterion provides the answer to the question: the set Q is approachable if for each λ ∈ Rm there is some q ∈ Δ(A) such that

λ · ∑a qa v(a, b) ≤ wQ(λ), ∀ b ∈ B,

where wQ(λ) = supx ∈ Q λ · x. Approachability is discussed at length in Section 20.8. The following discussion provides some insight into the condition. Let λ ∈ Rm and let c maximize λ · x on Q: wQ(λ) = λ · c ≥ λ · x, ∀ x ∈ Q. The hyperplane HQ(λ) = {ξ ∈ Rm | λ · ξ = wQ(λ)} is perpendicular to the vector λ and tangent to Q. According to the criterion, if Q is approachable, then given λ, there is a q such that ∑a qa v(a, b) lies below HQ(λ) (on the same side of HQ(λ) as Q), for all b ∈ B.
If λ · ∑a qa v(a, b) ≤ wQ(λ) for all b ∈ B, then for any b′, ∑a qa v(a, b′) lies on the Q side of HQ(λ). If the vector average payoff at time t is v̄t ∉ Q, 1 plays q, and b′ is chosen by 2, then the average payoff at time t + 1 is a point closer to Q than v̄t. Over time, the average payoff is forced closer and closer to Q. In the subsequent discussion Q = Rm− = {x ∈ Rm | xi ≤ 0, i = 1, …, m}, the negative orthant of Rm. Then, for any vector λ ∈ Rm, wQ(λ) = ∞ if ∃ i, λi < 0. In this case the approachability condition is trivially satisfied. For λ ≥ 0, λ ≠ 0, wQ(λ) = 0. Thus for Q = {x ∈ Rm | xi ≤ 0} it is sufficient to check that ∀ λ ∈ Rm, λ ≥ 0, there is some q ∈ Δ(A) with

λ · ∑a qa v(a, b) ≤ 0, ∀ b ∈ B.
20.5.2 The model

Let G = (S1, …, Sn, u1, …, un) be an n-person game with #Si = mi. Consider the repeated version of G. A history of length t is denoted ht with ht = (s1, s2, …, st), sl ∈ S = S1 × ⋯ × Sn. At time t, identify those times when player i chose action j. At each of those times, replace choice j with choice k in ht and compute the payoff that i would have achieved with this replacement. If the choice of the other players at time l ≤ t is s−il, define

Dti(j, k) = ∑l ≤ t: sil = j [ui(k, s−il) − ui(j, s−il)].

Thus, Dti(j, k) measures the total regret from not having chosen k instead of j. Define

D̄ti(j, k) = (1/t)Dti(j, k),

the corresponding average regret. Collecting these terms over the pairs (j, k), j ≠ k, gives a vector in Rm* (m* = mi(mi − 1)). These regrets can be generated from a vector-valued one-period payoff: for si = j, define ri(j, s−i) ∈ Rm* as the vector whose jth block (of length mi − 1) is

(ui(k, s−i) − ui(j, s−i))k ≠ j,

and whose hth block, h ≠ j, is [0, …, 0], where [0, …, 0] is an mi − 1 vector of zeros. This defines the vector payoff function ri(si, s−i), ri: Si × S−i → Rm*. The average regret is given by

D̄ti = (1/t)∑tl=1 ri(sil, s−il).
The following discussion shows that there is a strategy for i such that d(D̄ti, Rm*−) → 0, where d(x, Rm*−) is the minimum Euclidean distance from x to the set Rm*−, the negative orthant of Rm*.

Theorem 20.3. There is a strategy for i such that d(D̄ti, Rm*−) → 0.
Proof. In view of the earlier discussion, it is sufficient to show that given λ ≥ 0, λ ∈ Rm*, ∃ q ∈ Δ(Si) such that

λ · ∑si ∈ Si qsi ri(si, s−i) ≤ 0, ∀ s−i.

Using the fact that the jth block of ri(si, s−i) is zero unless si = j, this is equivalent to:

∑j,k: j ≠ k λjk qj [ui(k, s−i) − ui(j, s−i)] ≤ 0, ∀ s−i.
Let Λ = {λjk} be a matrix of nonnegative numbers with 0's on the diagonal. Let β > maxj ∑k λjk and define Λ* according to:

λ*jk = λjk/β, j ≠ k;   λ*jj = 1 − ∑k ≠ j λjk/β.

The matrix Λ* is a Markov matrix with invariant distribution q: qΛ* = q, or ∑j qj λ*jk = qk, ∀ k. The distribution q gives the approaching strategy (to Rm*−), as is shown next. For j ≠ k, λjk = βλ*jk. Consider ∑j,k: j ≠ k λjk qj [ui(k, s−i) − ui(j, s−i)]. Since [ui(k, s−i) − ui(j, s−i)] = 0 when k = j, the sum is unchanged if the diagonal terms are included. Combining these observations,

∑j,k: j ≠ k λjk qj[ui(k, s−i) − ui(j, s−i)] = β ∑j,k λ*jk qj[ui(k, s−i) − ui(j, s−i)]
  = β[∑k (∑j qj λ*jk) ui(k, s−i) − ∑j qj (∑k λ*jk) ui(j, s−i)]
  = β[∑k qk ui(k, s−i) − ∑j qj ui(j, s−i)] = 0,

using qΛ* = q and ∑k λ*jk = 1. Consequently Rm*− is approachable: player i has a strategy pushing the vector of regrets to 0.
These ideas also lead to a connection to correlated equilibrium. The empirical distribution on S generated by the path h = (s1, s2, …) is:

zt(s) = (1/t)#{l ≤ t | sl = s}.

Put D̄ti+(j, k) = max{D̄ti(j, k), 0}. Let γ be a positive number sufficiently small that

γ ∑k ≠ j D̄ti+(j, k) < 1, ∀ j, t

(such γ exists since payoffs are bounded), and let Ti(ht) be the mi × mi Markov matrix defined, for each j, by

Ti(ht)jk = γD̄ti+(j, k), k ≠ j;   Ti(ht)jj = 1 − γ ∑k ≠ j D̄ti+(j, k).

The matrix Ti(ht) is a Markov matrix. Let qt+1i be an invariant distribution of Ti(ht). The following theorem is stated without proof (see Hart and MasColell 2000).

Theorem 20.4. Suppose that at each time t + 1 player i plays the strategy qt+1i. The empirical distribution zt converges almost surely to the set of correlated equilibrium distributions of the game.
Here, convergence means that with probability 1, for any neighborhood of the set of correlated equilibria, the trajectory {zt} enters that neighborhood and stays there. Finally, a necessary and sufficient condition for the empirical distribution to converge to the set of correlated equilibria is that the regrets all converge to 0.
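A compact simulation of this kind of adaptive behavior (a sketch in the spirit of Hart and MasColell 2000: here each player switches away from the previous action with probabilities proportional to positive average regrets, rather than computing the invariant distribution; the game, the constant mu, and the horizon are illustrative choices):

import numpy as np

def regret_matching(U1, U2, T=100_000, seed=3):
    """Conditional regret matching for a two-player game;
    returns the empirical joint distribution z_T on S1 x S2."""
    rng = np.random.default_rng(seed)
    m1, m2 = U1.shape
    D1, D2 = np.zeros((m1, m1)), np.zeros((m2, m2))   # cumulative regrets D(j, k)
    mu = 2.0 * max(np.abs(U1).max(), np.abs(U2).max()) * max(m1, m2)
    s1, s2 = rng.integers(m1), rng.integers(m2)
    z = np.zeros((m1, m2))
    for t in range(1, T + 1):
        z[s1, s2] += 1.0
        D1[s1, :] += U1[:, s2] - U1[s1, s2]           # regret from k instead of s1
        D2[s2, :] += U2[s1, :] - U2[s1, s2]
        p1 = np.maximum(D1[s1, :], 0.0) / (t * mu)    # switch probabilities
        p1[s1] = 0.0; p1[s1] = 1.0 - p1.sum()         # stay with the remainder
        p2 = np.maximum(D2[s2, :], 0.0) / (t * mu)
        p2[s2] = 0.0; p2[s2] = 1.0 - p2.sum()
        s1, s2 = rng.choice(m1, p=p1), rng.choice(m2, p=p2)
    return z / T

# Game of chicken (illustrative): the regrets vanish and z_T approaches the
# set of correlated equilibrium distributions.
U1 = np.array([[6., 2.], [7., 0.]])
print(regret_matching(U1, U1.T))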
20.6 Calibration

Let (S1, S2, u1, u2) be a finite game. Played over time, the play generates histories of choices. Suppose that player 1 is forecasting the choice of 2 each period (on the basis of the existing history). Let ft ∈ Δ(S2) be the forecast at time t. Thus, ft = (ft1, …, ftm2), where #S2 = m2 and ftj is the forecast probability that 2 will choose j in period t. Fix q ∈ Δ(S2), and let χq(t) be the indicator function of the event {ft = q}: χq(t) = 1 if and only if ft = q. Thus,

Nq(T) = ∑Tt=1 χq(t)

is the number of times between t = 1 and t = T that player 1 forecasts q. Let χj(t) equal 1 if 2 chooses j at period t, and 0 otherwise. Then

Njq(T) = ∑Tt=1 χq(t)χj(t)

is the number of times j was chosen in those periods when the forecast was q. So, ∑j Njq(T) = Nq(T), since for each t, ∑j χj(t) = 1. Let

ρ(q, j, T) = Njq(T)/Nq(T) (when Nq(T) > 0).

So, ρ(q, j, T) is the proportion of the times j was chosen when q was forecast. From the earlier calculations, ∑j ρ(q, j, T) = 1.

Definition 20.2. The forecast f is said to be calibrated with respect to the sequence of plays made by player 2 if:

limT → ∞ ∑q [Nq(T)/T] |ρ(q, j, T) − qj| = 0, for each j ∈ S2,

where the sum is over the (finitely many) distinct forecasts used up to T.
Let F be a joint distribution on S1 × S2. Say that F is a limit point of calibrated forecasts if ∃ deterministic best response functions (R1, R2) and calibrated forecasting rules such that, if each player plays a best reply to the forecast according to Ri, then the limiting joint distribution is F. For almost all games, the set of distributions to which calibrated rules can converge is equal to the set of correlated equilibria (see Foster and Vohra 1997).
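The calibration score in Definition 20.2 is straightforward to compute for any forecast sequence; the following sketch checks a constant forecaster against i.i.d. play (the forecaster, the true distribution, and the horizon are all assumptions for the example):

import numpy as np
from collections import defaultdict

def calibration_score(forecasts, outcomes, m):
    """Score = sum over distinct forecasts q of (N_q/T) * ||rho(q,.,T) - q||_1."""
    T = len(outcomes)
    count, hits = defaultdict(int), defaultdict(lambda: np.zeros(m))
    for q, j in zip(forecasts, outcomes):
        count[q] += 1
        hits[q][j] += 1
    score = 0.0
    for q in count:
        rho = hits[q] / count[q]   # empirical frequency of actions given forecast q
        score += (count[q] / T) * np.abs(rho - np.array(q)).sum()
    return score

rng = np.random.default_rng(4)
p = np.array([0.3, 0.7])
outcomes = rng.choice(2, size=5000, p=p)
# a constant forecaster announcing the true distribution is (nearly) calibrated:
forecasts = [tuple(p)] * len(outcomes)
print(calibration_score(forecasts, outcomes, m=2))   # close to 0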
20.7 Bayesian Learning

Consider an infinitely repeated game based on some finite stage game, (S1, …, Sn, u1, …, un). Let fi be a behavior strategy for i in the infinitely repeated game, and let λi be beliefs about the strategies of the opponents. The behavior of player i is determined by fi and λi, where i's beliefs about i's own strategy put probability 1 on fi (players know their own strategies). To simplify notation, focus on just two players, i and j. Let Ht = S^(t−1) and H∞ = S∞, the sets of finite histories at time t and the set of infinite histories. In an infinitely repeated game a strategy for i is a function fi: ∪t ≥ 1 Ht → Δ(Si). Let Pt(h) = {h′ ∈ H∞ | h′t = ht} and Pt = {Pt(h) | h ∈ H∞}. Let λi be a distribution (with finite support) on the set of strategies of j: λi has support {fj1, …, fjki}, and suppose that λi(fjk) > 0 for each k; similarly λj has support {fi1, …, fikj}. The distribution λi represents i's beliefs regarding the strategy choice of j. The following discussion shows that in the long run, strategies are fully revealed, or else the strategies in the support of the limiting belief distribution of a player involve playing the same way. Suppose that (f1*, f2*) ∈ F1 × F2 is an equilibrium of the (incomplete information) game with beliefs (λ1, λ2). The posterior distributions
evolve according to Bayes' rule: given history ht and j's period-t action sjt,

λit+1(fjk) = λit(fjk) fjk(ht)(sjt) / ∑k′ λit(fjk′) fjk′(ht)(sjt),

so,

λit+1(fjk)/λit(fjk) = fjk(ht)(sjt)/f̄j(ht)(sjt),

or

λit+1(fjk) f̄j(ht)(sjt) = λit(fjk) fjk(ht)(sjt),

where f̄j(ht) = ∑k λit(fjk) fjk(ht) is the belief-averaged (predicted) play of j. Because the sequence of posterior distributions {λit(fjk)}t converges (from the martingale convergence theorem), for each k = 1, …, ki, the ratio λit+1(fjk)/λit(fjk) → 1 along histories where the limit posterior is positive. Therefore,

fjk(ht)(sjt)/f̄j(ht)(sjt) → 1,

so that for t large, fjk(ht) and f̄j(ht) attach approximately the same probability to the actions played, for each k with limt λit(fjk) > 0. If limt λit(fjk) = 1, the strategy is fully revealed asymptotically, so that f̄j(ht) ≈ fjk(ht) for large t. If limt λit(fjk) < 1, then the strategy is not fully revealed asymptotically. In this case all types in the support of the limiting distribution (along h) must play the same way:

fjk(ht) ≈ fjk′(ht) for large t, if limt λit(fjk) > 0 and limt λit(fjk′) > 0.
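A minimal numerical sketch of this updating, with the opponent's strategies simplified to i.i.d. mixtures over two actions (the support and prior are assumptions for the example):

import numpy as np

rng = np.random.default_rng(5)
# support of player i's beliefs: three i.i.d. mixed strategies for j (illustrative)
support = np.array([[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]])
lam = np.ones(3) / 3                      # prior lambda_i over the support
true_k = 2                                # j actually plays support[2]
for t in range(500):
    a = rng.choice(2, p=support[true_k])  # j's observed action this period
    lam = lam * support[:, a]             # Bayes' rule: posterior proportional
    lam /= lam.sum()                      # to prior times action likelihood
print(lam)                                # concentrates on the true strategy

Here the three candidate strategies predict differently, so the posterior identifies the truth; if two rows of the support were identical along the realized play, the posterior could not separate them, which is exactly the "play the same way" case above.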
20.8 Approachability

Typically, strategic models of choice concern an agent making decisions to affect some variable (such as payoff). The theory of (Blackwell) approachability concerns strategic decisions where a vector of variables is influenced by the decisions of agents, and controlling the values of each member of the vector is of interest. For example, suppose there are a finite number of states, with r = (r1, …, rk) the return in each state. Given a specification of the environment, one may ask if there is a strategy for the agent that can guarantee that returns in each state are nonnegative. Can the agent guarantee that ri ≥ 0, ∀ i? Is there a strategy for the agent that forces the vector payoff r = (r1, …, rk) into the set Rk+ = {x ∈ Rk | xi ≥ 0, i = 1, …, k}? In the case where such a strategy exists, the set Rk+ is said to be approachable. The theory of approachability concerns identifying circumstances where sets (such as Rk+) are approachable. This is a useful and elegant theory: historically, it has played a central role in defining optimal strategies in repeated games of incomplete information. In the present context it is a useful tool in the study of regret minimization and can also be used to establish the existence of calibrated forecasts. The following discussion describes the theory.
There are two players—in fact the number of players is not important, since the issue concerns what a player can achieve, playing against one or many opponents where correlated behavior is allowed—with action spaces given by finite sets A and B. To each action pair (a, b) ∈ A × B, associate a vector v(a, b) ∈ Rm; write H = Rm for the payoff space. The distance between two points ξ, η in Rm is d(ξ, η) = (ξ − η, ξ − η)^½, where (ξ − η, ξ − η) = ∑i (ξi − ηi)². Let V = {v | ∃ (a, b) ∈ A × B, v = v(a, b)}. In a repeated game, each player has a strategy which at each time period associates an action to each history. The strategies for players 1 and 2 are, respectively, σ and τ. Thus, given the history ht = (a1, b1, a2, b2, …, at−1, bt−1), agent 1 chooses σ(ht) ∈ Δ(A), where Δ(A) is the set of probability distributions on A. Similarly, τ(ht) ∈ Δ(B). A strategy pair (σ, τ) determines a joint distribution, P(σ, τ), on sequences (a1, b1, a2, b2, …, at, bt, …) ∈ (A × B)∞, with corresponding expectation operator E(σ, τ). When no confusion can arise write P and E, respectively. The associated sequence of vector payoffs is (v(a1, b1), v(a2, b2), …, v(at, bt), …) = (v1, v2, …, vt, …). Thus, vt is a random variable. Write vⁿ = (v1, v2, …, vn), and v̄n = (1/n)∑nt=1 vt, the average of the first n elements of the sequence of vector payoffs. Given a set Q ⊂ Rm and a point x ∈ Rm, the distance from x to Q is defined: d(Q, x) = infz ∈ Q d(z, x). Given μ ∈ Δ(A), let R1(μ) = co{∑a ∈ A μa v(a, b) | b ∈ B}, and given ν ∈ Δ(B), let R2(ν) = co{∑b ∈ B νb v(a, b) | a ∈ A}. With this notation, the key concepts are:
1. A set Q ⊂ H is approachable for player 1 if ∃ σ* such that ∀ ε > 0, ∃ N, such that:

P(σ*, τ)({d(Q, v̄n) ≥ ε, for some n ≥ N}) ≤ ε, ∀ τ.

2. A set Q is excludable by player 2 with τ* if ∃ η > 0 such that ∀ ε > 0 there is N with:

P(σ, τ*)({d(Q, v̄n) ≥ η, ∀ n ≥ N}) ≥ 1 − ε, ∀ σ.

Observe that Q is approachable if and only if the closure of Q is approachable, and Q is excludable if and only if the closure of Q is excludable. If Q is excludable, then Qc, the complement of Q, is approachable (by player 2).

Theorem 20.5. Let Q be a closed set in Rm.
1. If for all x ∉ Q, ∃ μx ∈ Δ(A) such that, if y ∈ argminz ∈ Q d(z, x) (y a closest point in Q to x), the hyperplane perpendicular to x − y through y separates x from R1(μx), then Q is approachable. The approaching strategy, σ*, is given by: (a) at stage 1 play anything; (b) at stage n + 1 play anything if v̄n ∈ Q, play μv̄n otherwise.
Next, assume that both A and B are compact and v: A × B → H is continuous.
2. Let Q be a closed convex subset of H. Then Q is approachable if and only if Q ∩ R2(ν) ≠ ∅, ∀ ν ∈ Δ(B). If Q is not approachable, then it is excludable with τ, where τ(ht) = ν* and where ν* satisfies Q ∩ R2(ν*) = ∅.
Proof. 1. Let δn = d(Q, v̄n). If v̄n ∈ Q, then,

δn+1 ≤ d(v̄n+1, v̄n).

Thus note the identities:

v̄n+1 = [n/(n+1)]v̄n + [1/(n+1)]vn+1 = v̄n + [1/(n+1)](vn+1 − v̄n),

since n/(n+1) = 1 − 1/(n+1), so that

δn+1 ≤ [1/(n+1)] d(vn+1, v̄n) ≤ c/(n+1),

where c = supz ∈ V infs ∈ Q d(s, z) + dV and dV is the diameter of V. Therefore, E{δn+1} ≤ c/(n+1).
If v̄n ∉ Q, let y be a closest point in Q to v̄n. Let F be the hyperplane perpendicular to v̄n − y through y. Since v̄n ∉ Q, the strategy σ* specifies σ*(hn+1) = μv̄n, so that the expected period n + 1 payoff, E{vn+1 | v̄n}, is in R1(μv̄n), on the Q side of F. Since this holds regardless of the strategy of 2, whatever the value of vn+1, the average v̄n+1 is, in expectation, closer to y (and hence to Q) than v̄n. If in period n the current average payoff is v̄n and the n + 1 period payoff is vn+1, then the average payoff over n + 1 periods is given by v̄n+1 = [n/(n+1)]v̄n + [1/(n+1)]vn+1. The following calculations work through the details. Since y ∈ Q,

δn+1 ≤ d(v̄n+1, y).
And since

v̄n+1 − y = [n/(n+1)](v̄n − y) + [1/(n+1)](vn+1 − y),

d(v̄n+1, y)² = [n/(n+1)]² d(v̄n, y)² + [2n/(n+1)²](v̄n − y, vn+1 − y) + [1/(n+1)]² d(vn+1, y)²,

so that

δn+1² ≤ [n/(n+1)]² δn² + [2n/(n+1)²](v̄n − y, vn+1 − y) + [1/(n+1)]² (vn+1 − y, vn+1 − y).

Since vn+1 ∈ {v(a, b) | (a, b) ∈ A × B} and y ∈ Q, (vn+1 − y, vn+1 − y) ≤ c². Next observe that since E{vn+1 | v̄n} ∈ R1(μv̄n) lies below the hyperplane through y perpendicular to v̄n − y, and since d(v̄n, y) = δn,

(E{vn+1 | v̄n} − y, v̄n − y) ≤ 0.

Conditioning on v̄n,

E{δn+1² | v̄n} ≤ [n/(n+1)]² δn² + [1/(n+1)]² c²,

and so

(n+1)² E{δn+1²} ≤ n² E{δn²} + c².

Repeated iteration gives

(n+1)² E{δn+1²} ≤ n² E{δn²} + c² ≤ (n−1)² E{δn−1²} + 2c² ≤ ⋯ ≤ (n+1)c²,
where the last step uses E{δ1²} ≤ c². Thus E{δn+1²} ≤ c²/(n+1). Since the square is a convex function, (E{δn+1})² ≤ E{δn+1²} ≤ c²/(n+1), and so E{δn+1} ≤ c/(n+1)^½ → 0.
2. For the second part, assume that Q is convex and for each ν, R2(ν) ∩ Q ≠ ∅. Take x ∉ Q and y ∈ argminz ∈ Q d(z, x). Let z* ∈ R2(ν) ∩ Q and, noting that because Q is convex it lies in the half-space {ξ | (x − y, ξ) ≤ (x − y, y)}, observe that

(x − y, z*) ≤ (x − y, y).

Thus, minz ∈ R2(ν)(x − y, z) ≤ (x − y, y), for every ν ∈ Δ(B). Next, let m(a, b) = (x − y, v(a, b)) and consider the zero-sum game with payoffs given by m(a, b), in which player 1 minimizes and player 2 maximizes. Denote the value by val(m) and observe that

val(m) = maxν ∈ Δ(B) minμ ∈ Δ(A) (x − y, ∑a,b μaνb v(a, b)) ≤ maxν minz ∈ R2(ν)(x − y, z).

Thus,

val(m) ≤ (x − y, y).

From the minmax theorem, ∃ μ ∈ Δ(A) such that (x − y, ∑a ∈ A v(a, b)μa) ≤ val(m), ∀ b ∈ B. Thus if z ∈ R1(μ), then (x − y, z) ≤ val(m). So, ∀ z ∈ R1(μ), (x − y, z) ≤ val(m) ≤ (x − y, y). Thus the hyperplane {ξ | (x − y, ξ) = (x − y, y)} separates x from R1(μ), and so Q is approachable for player 1 (by part 1). This establishes the first part of 2. Finally, if there is some ν* such that Q ∩ R2(ν*) = ∅, then the strategy τ with τ(ht) = ν* has the property that v̄n approaches R2(ν*), independent of the strategy of player 1. Since R2(ν*) and Q are both closed and R2(ν*) is compact, the distance between them is positive: d(Q, R2(ν*)) = 2η > 0. Hence, with probability approaching 1, d(Q, v̄n) ≥ η, n ≥ N*, so that Q is excludable.
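The approaching strategy in part 1 is easy to implement numerically when Q is the negative orthant, since the closest point and the separating direction have closed forms. The following sketch (with an illustrative vector payoff array, a randomly acting opponent, and scipy's linear programming routine used to solve the zero-sum game m(a, b)) tracks the distance of v̄n from Q; all of these implementation details are assumptions for the example.

import numpy as np
from scipy.optimize import linprog

def approach_orthant(V, T=5000, seed=6):
    """Blackwell approaching strategy for Q = negative orthant.
    V[a, b] is the vector payoff v(a, b), shape (nA, nB, m)."""
    rng = np.random.default_rng(seed)
    nA, nB, m = V.shape
    vbar = np.zeros(m)
    for n in range(1, T + 1):
        x = vbar
        y = np.minimum(x, 0.0)            # closest point of Q to x
        lam = x - y                       # normal direction, = max(x, 0)
        if lam.any():
            M = V @ lam                   # m(a, b) = (x - y) . v(a, b)
            # choose mu to minimize max_b sum_a mu_a M[a, b]  (small LP)
            c = np.zeros(nA + 1); c[-1] = 1.0
            A_ub = np.hstack([M.T, -np.ones((nB, 1))])
            A_eq = np.ones((1, nA + 1)); A_eq[0, -1] = 0.0
            bounds = [(0, None)] * nA + [(None, None)]
            res = linprog(c, A_ub=A_ub, b_ub=np.zeros(nB),
                          A_eq=A_eq, b_eq=[1.0], bounds=bounds)
            mu = np.maximum(res.x[:nA], 0.0)
        else:
            mu = np.ones(nA)              # vbar already in Q: play anything
        a = rng.choice(nA, p=mu / mu.sum())
        b = rng.integers(nB)              # illustrative opponent: uniform play
        vbar += (V[a, b] - vbar) / n
    return np.linalg.norm(np.maximum(vbar, 0.0))   # distance to the orthant

# two actions each, 2-d vector payoffs (illustrative):
V = np.array([[[-1., 1.], [1., -1.]],
              [[1., -1.], [-1., 1.]]])
print(approach_orthant(V))   # small: the average payoff approaches Q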
Discounting

It is worth noting that if payoffs are discounted, similar results hold. A brief sketch follows. Let βj = 1/(1 + γ + γ² + ⋯ + γ^(j−1)), where γ ∈ (0, 1) is the discount rate, and set

v̄n = βn ∑nt=1 γ^(t−1) vt.

Also,

v̄n+1 = (βn+1/βn)v̄n + βn+1γⁿ vn+1.

Suppose at stage 1 that v̄1 = v1 ∉ Q. Use the approaching strategy as defined above. Let y ∈ argminz ∈ Q d(z, v1). Then E{v2 | v1} lies on the Q side of the hyperplane through y perpendicular to v1 − y. Thus, (E{v2 | v1} − y, v1 − y) ≤ 0. Proceeding, with δn = d(Q, v̄n), conditioning and expanding as in the undiscounted case,

E{δn+1²} ≤ (βn+1/βn)² E{δn²} + βn+1² γ^(2n) c².

Since E{δ1²} ≤ c² and β1 = 1, dividing by βn+1² and telescoping the inequality, proceed in this way to find:

E{δn+1²} ≤ βn+1² c² (1 + γ² + γ⁴ + ⋯ + γ^(2n)).

Now, βn → (1 − γ), and 1 + γ² + γ⁴ + ⋯ → 1/(1 − γ²), so the expression converges to:

c²(1 − γ)²/(1 − γ²) = c²(1 − γ)/(1 + γ).

Thus, lim supn E{δn²} ≤ c²(1 − γ)/(1 + γ), and this bound goes to 0 as γ → 1. So, if the approachability criterion is satisfied, an approachable set is approachable for discount factors close to 1.
Bibliography

Blackwell, D. (1956). "An Analog of the Minimax Theorem for Vector Payoffs," Pacific Journal of Mathematics, 6, 1–8.
Ellison, G. (1993). "Learning, Local Interaction, and Coordination," Econometrica, 61, 1047–1071.
Foster, D. and Vohra, R. V. (1997). "Calibrated Learning and Correlated Equilibrium," Games and Economic Behavior, 21, 40–55.
Foster, D. and Vohra, R. V. (1998). "Asymptotic Calibration," Biometrika, 85, 379–390.
Fudenberg, D. and Levine, D. (1998). The Theory of Learning in Games. Cambridge, MA: MIT Press.
Hart, S. and MasColell, A. (2000). "A Simple Adaptive Procedure Leading to Correlated Equilibrium," Econometrica, 68, 1127–1150.
Hofbauer, J. and Sigmund, K. (1998). Evolutionary Games and Population Dynamics. Cambridge: Cambridge University Press.
Kandori, M., Mailath, G., and Rob, R. (1993). "Learning, Mutation, and Long-Run Equilibria in Games," Econometrica, 61, 29–56.
Samuelson, L. (1997). Evolutionary Games and Equilibrium Selection. Cambridge, MA: MIT Press.
Vega-Redondo, F. (1996). Evolution, Games, and Economic Behavior. Oxford: Oxford University Press.
Vega-Redondo, F. (1997). "The Evolution of Walrasian Behavior," Econometrica, 65, 375–384.
Weibull, J. (1995). Evolutionary Game Theory. Cambridge, MA: MIT Press.
Young, P. (1993). "The Evolution of Conventions," Econometrica, 61, 57–84.