Contents

Acknowledgements  vi
Notes on the Contributors  vii
1  An Introduction to Game Theory for Linguists
   Anton Benz, Gerhard Jäger and Robert van Rooij  1
2  Saying and Meaning, Cheap Talk and Credibility
   Robert Stalnaker  83
3  Pragmatics and Games of Partial Information
   Prashant Parikh  101
4  Game Theory and Communication
   Nicholas Allott  123
5  Different Faces of Risky Speech
   Robert van Rooij and Merlijn Sevenster  152
6  Pragmatic Reasoning, Defaults and Discourse Structure
   Nicholas Asher and Madison Williams  175
7  Utility and Relevance of Answers
   Anton Benz  195
8  Game-Theoretic Grounding
   Kris de Jaegher  220
9  A Game Theoretic Approach to the Pragmatics of Debate
   Jacob Glazer and Ariel Rubinstein  248
10 On the Evolutionary Dynamics of Meaning-Word Associations
   Tom Lenaerts and Bart de Vylder  263
Author Index  285
Subject Index  288
Acknowledgements

The roots of this book lie in a small but very successful workshop on Games and Decisions in Pragmatics held at the Centre for General Linguistics (ZAS) in Berlin in October 2003. This event was organized by Anton Benz as part of the ZAS project on Bidirectional Optimality Theory, which was then directed by Gerhard Jäger and Manfred Krifka. We have to thank the members of this project who helped with reviewing and organising this workshop: Jason Mattausch, Yukiko Morimoto, and Tue Trinh. Thanks go also to Klaus Robering, who helped with the reviewing, and to the director of the ZAS, Manfred Krifka, for the encouragement and general support we received. Jason was also very helpful in the final stage of the book project when he corrected the English of the introductory chapter. A crucial role in making this book project happen was played by Richard Breheny. After the first Call for Papers, we received e-mails from Richard encouraging us to use the conference contributions as a basis for a book on this newly developing field of pragmatics. He suggested contacting Palgrave Macmillan and the editors of the series “Palgrave Studies in Pragmatics, Language and Cognition”. He receives our special thanks. We would also like to thank the series editors Robyn Carston and Noël Burton-Roberts for their encouragement for and confidence in our project, and Jill Lake and Melanie Blair for their advice. Last but not least, special thanks go to Mirco Hilbert, who did a terrific job in typesetting the book and preparing the index.

Anton Benz, Gerhard Jäger and Robert van Rooij
Notes on the Contributors

Nicholas Allott is a Ph.D. student at University College London, working on relevance and rationality in communication.

Nicholas Asher is a Professor of Philosophy at the University of Texas at Austin. His interests include natural language semantics, pragmatics, and philosophy of language. He is author of Reference to Abstract Objects in Discourse and, together with Alex Lascarides, of Logics of Conversation.

Anton Benz is Assistant Professor of Humanistic Information Science at the University of Southern Denmark, Kolding. His main research interests lie in the field of formal pragmatics and dialogue modelling. He received his Ph.D. in philosophy, logic, and theory of science at Ludwig Maximilians University Munich.

Jacob Glazer is a Professor of Economics at the Faculty of Management, Tel Aviv University and the Department of Economics, Boston University. He received his Ph.D. from Northwestern University in 1986. His research areas include economic theory and health economics.

Kris de Jaegher is a postdoctoral Research Fellow of the Fund for Scientific Research, Flanders (Belgium), stationed at the Vrije Universiteit Brussel (Belgium). He teaches economics at the Roosevelt Academy in Middelburg (Netherlands). He received his Ph.D. at the Department of Economics of the Vrije Universiteit Brussel in 2002. His research interests lie in the fields of microeconomics and game theory. His research in the field of game theory includes applications in economics, biological signalling theory, and the philosophy of language.

Gerhard Jäger is Professor of Linguistics at the University of Bielefeld in Germany. He received his Ph.D. from Humboldt University in Berlin. He has published on a variety of topics relating to semantics, pragmatics and the mathematics of language. His current focus of interest is the application of evolutionary models to linguistics.

Tom Lenaerts works as a postdoctoral Researcher at IRIDIA, l'Institut de Recherches Interdisciplinaires et de Développements en Intelligence Artificielle, at the Université Libre de Bruxelles and guest professor at the computer science department situated at the Vrije Universiteit Brussel. He received his Ph.D. from the latter institution in May 2003. His research interests include game dynamics, evolutionary computation, cultural evolution, complex networks and the biochemical origin of life.

Prashant Parikh is a Senior Research Scholar at the Institute for Research in Cognitive Science at the University of Pennsylvania. He pioneered the application of game theory to semantics/pragmatics in the mid-nineteen-eighties and his interests include all aspects of the connections between game theory and language.

Robert van Rooij is a senior staff member of the Institute for Logic, Language and Computation (ILLC), stationed at the Department of Philosophy of the University of Amsterdam. He received his Ph.D. at the Institut für Maschinelle Sprachverarbeitung at the University of Stuttgart in 1997. His working areas include formal semantics and pragmatics, philosophy of language, and the evolution of language.

Ariel Rubinstein is a Professor of Economics at Tel Aviv University and New York University. He received his Ph.D. at the Hebrew University in 1979. His major working areas are economic theory, game theory and models of bounded rationality.

Merlijn Sevenster is currently employed as a Ph.D. student at the Institute for Logic, Language and Information in Amsterdam, working on a project on imperfect information games. His approach combines a linguistic, computational, and logical viewpoint.

Robert Stalnaker is the Laurance S. Rockefeller Professor of Philosophy at the Massachusetts Institute of Technology. He is the author of two recent collections of papers, Context and Content and Ways a World Might Be, both published by Oxford University Press. His areas of interest include metaphysics, the philosophy of language, and the foundations of game theory.

Bart de Vylder is a researcher at the Artificial Intelligence Laboratory at the Free University of Brussels. His research comprises robotics, computer vision and the modelling of language evolution.

Madison Williams is a graduate student who is finishing his Ph.D. at the University of Texas at Austin. He is writing his thesis on rational inference and decision making.
1 An Introduction to Game Theory for Linguists

Anton Benz, Gerhard Jäger and Robert van Rooij
1 Classical game theory
In a very general sense we can say that we play a game together with other people whenever we have to decide between several actions such that the decision depends on the choice of actions by others and on our preferences over the ultimate results. Obvious examples are card games, chess, or soccer. If I am to play a card to a trick, then it depends on the cards played by my playing partners whether or not I win the trick. Whether my move in chess leads to a win usually depends on the subsequent moves of my opponent. Whether I should pass the ball to this or that team member depends not least on my expectations about whether or not he will pass it on to a player in an even more favourable position. Whether or not my utterance is successful depends on how it is taken up by its addressee and the overall purpose of the current conversation. This provides the basis for applications of game theory in pragmatics. Game theory has a prescriptive and a descriptive aspect. It can tell us how we should behave in a game in order to produce optimal results, or it can be seen as a theory that describes how agents actually behave in a game. In this book, the latter interpretation of game theory is of interest. The authors of this volume will explore game theory as a framework for describing the use of language.

1.1 Decisions
At the heart of every game theoretic problem there lies a decision problem: one or more players have to choose between several actions. Their choice is governed by their preferences over expected outcomes. If someone is offered a cherry and a strawberry but can only take one of them, then if he prefers the strawberry over the cherry, he will take the strawberry. This is not a prescription. It is an explication of the semantics of the word preference. If I can choose between actions a1 and a2, and prefer the outcome s1 of a1 over s2 of a2, then it is the very meaning of the word preference that I
choose action a1. In general, one can distinguish between decision making under certainty, risk and uncertainty. A decision is made under certainty if the decision maker knows for each action which outcome it will lead to. The cherry and strawberry example is such a case. A decision is made under risk if each action leads to a set of possible outcomes, where each outcome occurs with a certain probability. The decision maker knows these probabilities, or behaves as if he knew them. A decision is made under uncertainty if no probabilities for the outcomes are known to the decision maker, and where not even reasonable assumptions can be made about such probabilities. We consider here only decision making under certainty or risk, as does the majority of literature on decision theory.

Decision under risk

Before we enter into game theory proper we want to say more about decision under risk. Decision theory found interesting applications in pragmatics, and its ideas and concepts are fundamental for game theory. The decision maker may be uncertain about the outcomes of his actions because he has only limited information about the true state of the world. If Adam has to decide in the morning whether or not to take an umbrella with him, this depends on whether or not he believes that it will rain that day. He will not know this but will have some expectations about it. These expectations can be represented by probabilities, and Adam's information state by a probability space. We identify a proposition A with a set of possible worlds. In probability theory such sets are called events; but we will stick here to the more familiar terminology from possible worlds semantics. If a person is convinced that A is true, then we assign probability 1 to it, and 0 if he thinks that it cannot be true. If there are two propositions A and B that cannot be true at the same time, e.g. that the sky is sunny and that the sky is cloudy, then the probability of A or B is just the sum of the probability of A and the probability of B. The latter property is generalised in the following definition to arbitrary countable sequences of pairwise incompatible propositions. Let Ω be a countable set that collects all possible states of the world. P is a probability distribution over Ω if P maps all subsets of Ω to the interval [0, 1] such that:

1 P(Ω) = 1;
2 P(Σ_{j∈J} A_j) = Σ_{j∈J} P(A_j) for each family (A_j)_{j∈J} of countably many pairwise disjoint sets.

The sum Σ_{j∈J} A_j here denotes the (disjoint) union of the sets A_j.
We call (Ω, P) a (countable) probability space. The restriction to countable Ω's simplifies the mathematics a lot. It follows e.g. that there is a subset S ⊆ Ω such that P({v}) > 0 for each v ∈ S and P(A) = Σ_{v∈A∩S} P({v}) for all A ⊆ Ω, i.e. it follows that P is a count measure. For P({v}) we write simply P(v). If (Ω, P) describes the information state of a decision maker, what does his new information state look like if he learns a fact E? Adam may look out of the window and see that the sky is cloudy, or he may consult a barometer and see that it is rising. E would collect all worlds where the sky is cloudy, or, in the second scenario, where the barometer rises. If neither fact contradicts what Adam previously believed, then his probabilities for both sets must be greater than zero. Whatever proposition E represents, how does learning E affect (Ω, P)? In probability theory this is modelled by conditional probabilities. In learning theory, these are known as Bayesian updates. Let H be any proposition, e.g. the proposition that it will rain, i.e. H collects all possible worlds in Ω where it rains at some time of the day. The probability of H given E, written P(H|E), is defined by:

P(H|E) := P(H ∩ E)/P(E), for P(E) ≠ 0.    (1.1)
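To make the Bayesian update in (1.1) concrete, here is a minimal sketch in Python. It is not part of the original text: the worlds and numbers are invented so that P(E) = 1/2 and P(E ∩ H) = 1/3, matching the cloudy-sky example discussed next. A probability space is a dictionary from worlds to probabilities, a proposition is a set of worlds, and learning E is conditionalization.

    # A hypothetical illustration of a countable probability space and the update in (1.1).

    def prob(P, A):
        """P(A) for a proposition A, represented as a set of worlds."""
        return sum(P[v] for v in A)

    def update(P, E):
        """The posterior P(. | E); only defined when P(E) > 0."""
        pE = prob(P, E)
        assert pE > 0, "cannot learn a proposition that contradicts prior beliefs"
        return {v: (p / pE if v in E else 0.0) for v, p in P.items()}

    # Four worlds: cloudy or sunny sky, crossed with rain or no rain.
    P = {"cloudy-rain": 1/3, "cloudy-dry": 1/6, "sunny-rain": 1/12, "sunny-dry": 5/12}
    E = {"cloudy-rain", "cloudy-dry"}   # the sky is cloudy
    H = {"cloudy-rain", "sunny-rain"}   # it will rain today

    print(prob(P, H))              # prior probability of rain: 0.4166...
    print(prob(update(P, E), H))   # posterior after learning E: 2/3, as given by (1.1)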
In particular, P(v|A) = P(v)/P(A) for v ∈ A ≠ ∅. For example, before Adam looked out of the window he may have assigned to the proposition (E ∩ H) that it is cloudy and that it rains a probability of 1/3, and to the proposition (E) that it is cloudy a probability of 1/2. Then (1.1) tells us that, after observing that the sky is cloudy, Adam assigns probability 1/3 : 1/2 = 2/3 to the proposition that it will rain. Bayesian updates are widely used as a model for learning. P is often said to represent the prior beliefs, and P⁺ defined by P⁺(A) = P(A|E) the posterior beliefs. As an illustration we want to show how this learning model can be applied in Gricean pragmatics for explicating the notion of relevance. We discuss two approaches. The first one measures relevance in terms of the amount of information carried by an utterance and is due to Arthur Merin (Merin 1999b). The second approach introduces a measure that is based on expected utilities and is used by Prashant Parikh (Parikh 1992, Parikh 2001), Rohit Parikh (Parikh 1994) and Robert van Rooij (van Rooij 2003b). The fact that the barometer is rising (E) provides evidence that the weather is becoming sunny. We can see the situation as a competition between two hypotheses: (H) the weather will be sunny, and (H̄) the weather will be rainy. For simplicity we may assume that H and H̄ are mutually exclusive and cover all possibilities. E, the rising of the barometer, does not necessarily imply H, but our expectations that the weather will be sunny are much higher after learning E than before. Let P represent the given
expectations before learning E, i.e. P is a probability distribution over possible states of the world. Let P⁺ represent the expectations obtained from epistemic context (Ω, P) when E, and nothing but E, is learned. Modeling learning by conditional probabilities as above, we find that P⁺(H) = P(H|E), where we have to assume that P(E) ≠ 0, i.e. we can only learn something that doesn't contradict our previous beliefs. Our next goal is to introduce a measure for the relevance of E for answering the question whether H or H̄ is true. Measures of relevance have been extensively studied in statistical decision theory (Pratt et al. 1995). There exist many different explications of the notion of relevance which are not equivalent with each other. We choose here Good's notion of relevance (Good 1950). It was first used by Arthur Merin (Merin 1999b), one of the pioneers of game theoretic pragmatics, in order to get a precise formulation of Grice's Maxim of Relevance.¹ If we know P(H|E), then we can calculate the reverse, the probability of E given H, P(E|H), by Bayes' rule:

P(E|H) = P(H|E) × P(E)/P(H).    (1.2)
With this rule we get:

P⁺(H) = P(H|E) = P(H) × (P(E|H)/P(E)).    (1.3)
H̄ denotes the complement of H. Learning E influences our beliefs about H̄ in the same way as it influences our beliefs about H: P⁺(H̄) = P(H̄|E). This leads us to:

P⁺(H)/P⁺(H̄) = P(H|E)/P(H̄|E) = P(H)/P(H̄) × P(E|H)/P(E|H̄).    (1.4)
Probabilities are non-negative by definition. In addition we assume that all probabilities in this equation are positive, i.e. strictly greater than 0. This allows us to apply a mathematical trick and take the logarithm of both sides of this equation. As the logarithm is strictly monotone, it follows that (1.4) is true exactly iff

log(P⁺(H)/P⁺(H̄)) = log(P(H)/P(H̄)) + log(P(E|H)/P(E|H̄)).    (1.5)

We used here the fact that log(x × y) = log x + log y. Furthermore we know that log x = 0 iff x = 1. This means that we can use the term rH(E) := log(P(E|H)/P(E|H̄)) as a measure for the ability of E to make us believe H. If it is positive, E favors H; if it is negative, then E favors H̄. In a competitive situation where a speaker wants to convince his addressee of
some proposition H it is reasonable to call a fact E more relevant the more evidence it provides for H. Merin calls rH(E) also the argumentative force of E.² Whether or not this is a good measure of relevance in general depends on the overall character of communication. Merin sees the aim to convince our communication partner of something as the primary purpose of conversation. If Adam has an interview for a job he wants to get, then his goal is to convince the interviewer that he is the right person for it (H). Whatever he says is more relevant the more it favors H and disfavors the opposite proposition. We could see this situation as a battle between two agents, H and H̄, where assertions E are the possible moves, and where log(P(E|H)/P(E|H̄)) measures the win for H and the loss for H̄. Using the terminology that we will introduce in subsection 1.2.1, we can say that this is a zero-sum game between H and H̄. We want to elaborate a little more on this example. The basis for Merin's proposal lies in the assumption that the main purpose of communication is to provide evidence that helps one decide whether a proposition H or its opposite is true. Hence, it works fine as long as we concentrate on yes-no questions or situations where one person tries to convince an addressee of the truth of some hypothesis. In general decision problems, the decision maker has to decide between different actions. Hence, the preferences over outcomes of these actions become important. It is not surprising that we find examples where a measure of relevance based on pure information becomes inadequate. Imagine that Ω consists of four worlds {v1, . . . , v4} of equal probability and that the decision maker has to decide between two actions a1 and a2. Suppose that she prefers a1 in v1 and v2 and a2 in v3 and v4, but that the value she assigns to a1 in v1 is very large compared to the other cases. If the decision maker learns E = {v2, v3}, then, using Merin's measure, this turns out to be irrelevant for deciding whether it is true that it is better to perform a1 (i.e. H = {v1, v2}) or a2 (i.e. H̄ = {v3, v4}), because log(P(E|H)/P(E|H̄)) = 0. But, intuitively, it is relevant for the decision maker if she learns that the most favoured situation v1 cannot be the case. Let us return to the job interview example, and turn from Adam the interviewee to the interviewer. Let's call her Eve. From Eve's perspective the situation can be seen as a decision problem. She has to decide between two actions, employ Adam (a1) or not employ Adam (a2). Depending on the abilities of Adam these actions will be differently successful. The abilities are part of the various possible worlds in Ω. We can represent the success of the actions as seen by Eve by her preferences over their outcomes. We assume here that we can represent these preferences by a (von Neumann-Morgenstern) utility measure, or payoff function U that maps pairs of worlds
and actions to real numbers. How does U have to be interpreted? If v is a world in Ω, then an equation like U(v, a1) < U(v, a2) says that the decision maker prefers the outcome of action a2 in v over the outcome of a1 in v. U(v, a1) and U(v, a2) are real numbers, hence their difference and sum are defined. In utility theory, it is generally assumed that utility measures are unique up to linear rescaling, i.e. if U(v, a) = r × U′(v, a) + t for some real numbers r > 0 and t and all v, a, then U and U′ represent the same preferences. If Eve values employing an experienced specialist twice as much as employing a trained and able novice, and she values employing an able novice as positively as she values employing an inexperienced university graduate negatively, then this can be modeled by assigning value 2 in the first case, value 1 in the second and value −1 in the third. But it could equally well be modeled by assigning 5 in the first, 3 in the second and −1 in the third case. Putting these parts together we find that we can represent Eve's decision problem by a structure ((Ω, P), A, U) where:

1 (Ω, P) is a probability space representing Eve's information about the world;
2 A is a set of actions;
3 U : Ω × A −→ R is a utility measure.

In decision theory it is further assumed that decision makers optimize expected utilities. Let a ∈ A be an action. The expected utility of a is defined by:

EU(a) = Σ_{v∈Ω} P(v) × U(v, a).    (1.6)
Optimizing expected utilities means that a decision maker will choose an action a only if EU(a) = max_{b∈A} EU(b). Let's assume that Eve assigns a probability of p = 3/4 to the proposition that Adam is an inexperienced novice, but gives a probability of 1 − p = 1/4 to the proposition that he has some training. We further assume that she assigns value −1 to employing him in the first case, and value 1 to employing him in the second case. Furthermore, we assume that she values the state where she employs no candidate with 0. Then her expected utilities for employing and not employing him are EU(a1) = 3/4 × (−1) + 1/4 × 1 = −1/2 and EU(a2) = 0 respectively. Hence she should not employ Adam. This may represent the situation before the interview starts. Now Adam tells Eve that he did an internship in a company X specialized in a similar field. This will change Eve's expectations about Adam's experience, and thereby her expected utilities for employing or not employing him. Using
the ideas presented before, we can calculate the expected utility of an action a after learning A by:

EU(a|A) = Σ_{v∈Ω} P(v|A) × U(v, a),    (1.7)
where P(v|A) denotes again the conditional probability of v given A. If Eve thinks that the probability that Adam is experienced increases to 3/4 if he did an internship (A), then the expected utility of employing him now rises to EU(a1|A) = 1/2. Hence, Adam was convincing and will be employed. A number of people (P. Parikh, R. Parikh, R. van Rooij) proposed measuring the relevance of a proposition A in terms of how it influences a decision problem that underlies the current communication. Several possible ways to measure this influence have been proposed. One heuristic is to say that information A is relevant if and only if it makes a decision maker choose a different action from before, and it is more relevant the more it increases the expected utility. This is captured by the following measure of the utility value of A:

UV(A) = max_{a∈A} EU(a|A) − EU(a*|A).    (1.8)
a* denotes here the action the decision maker had chosen before learning A — in our example this would have been a2, not employing Adam. The expected utility value can only be positive in this case. If Eve already had a preference to employ Adam, then this measure would tell us that there is no relevant information that Adam could bring forward. So, another heuristic says that information is more relevant the more it increases expectations. This is captured by the following measure:

UV′(A) = max_{a∈A} EU(a|A) − max_{a∈A} EU(a).    (1.9)
If we take the absolute value of the right-hand side, then the measure says that information is the more relevant the more it changes expectations. This would capture cases where Adam could only say things that diminish Eve's hopes:

UV″(A) = |max_{a∈A} EU(a|A) − max_{a∈A} EU(a)|.    (1.10)
Obviously, Adam should convince Eve that he is experienced. Following Merin we could say that arguments are more relevant for Adam if they favor this hypothesis and disfavor the opposite. If Adam uses the utility-based measure of relevance, then he should choose arguments that make Eve believe that the expected utility after employing him is higher than that after not employing him. Given our scenario, this is equivalent with choosing arguments that favor the thesis that he is experienced. Hence, we see that for special cases both measures of relevance may coincide.
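To see the numbers of the hiring example at a glance, here is a minimal sketch in Python (our own illustration, not part of the text; the dictionaries and helper names are invented) that computes the expected utilities of (1.6)/(1.7) and the utility values (1.8)-(1.10):

    # Hypothetical encoding of the hiring example: two worlds, two actions.
    worlds  = ["experienced", "inexperienced"]
    actions = ["employ", "dont_employ"]
    U = {("experienced", "employ"): 1, ("inexperienced", "employ"): -1,
         ("experienced", "dont_employ"): 0, ("inexperienced", "dont_employ"): 0}

    prior     = {"experienced": 1/4, "inexperienced": 3/4}
    posterior = {"experienced": 3/4, "inexperienced": 1/4}   # beliefs after learning A (the internship)

    def EU(P, a):
        """Expected utility of action a under beliefs P, as in (1.6) and (1.7)."""
        return sum(P[v] * U[(v, a)] for v in worlds)

    a_star = max(actions, key=lambda a: EU(prior, a))                                   # dont_employ, EU = 0
    UV  = max(EU(posterior, a) for a in actions) - EU(posterior, a_star)                # (1.8)
    UV1 = max(EU(posterior, a) for a in actions) - max(EU(prior, a) for a in actions)   # (1.9)
    UV2 = abs(UV1)                                                                      # (1.10)
    print(EU(prior, "employ"), EU(posterior, "employ"), UV, UV1, UV2)   # -0.5 0.5 0.5 0.5 0.5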
We want to conclude this section about decision theory with a classical example of Grice's. In this example there is no obvious hypothesis for which the provider of information could argue. Nevertheless, we can explain the relevance of his statement by a criterion based on the maximization of expected utilities. A and B are planning their summer holidays in France. A has an open map in front of him. They would like to visit C, an old friend of B. So A asks B: Where does C live? B answers: Somewhere in the south of France. We are not concerned here with the question of how the implicature 'B does not know where C lives' arises but with the question why B's answer is relevant. In Merin's model, there must be an hypothesis H that B argues for. But it is not immediately clear what this hypothesis H should be. We can model the situation as a decision problem where Ω contains a world for each sentence C lives in x, where x ranges over cities in France and where each of these worlds is equally possible. A contains all actions ax of going to x, and U measures the respective utilities with U(v, a) = 1 if a leads to success in v and U(v, a) = 0 if not. Let E be the set of all worlds where C lives in the south of France. Calculating the expected utilities EU(ax|E) and EU(ax) for an arbitrary city x in the south of France would show that E increases the expected payoff of performing ax. Hence, if B has no more specific information about where C lives, then a criterion that measures relevance according to whether or not it increases expected utilities would predict that E is the most relevant answer that B could give.
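A small sketch of this answer-relevance computation (ours, not the authors'; the number of cities and the size of the south are invented purely for illustration):

    # Hypothetical model of the "south of France" example: worlds are cities,
    # actions are trips, and the partial answer E raises the best expected utility.
    cities = [f"city{i}" for i in range(100)]    # 100 equally likely cities
    south  = set(cities[:30])                    # suppose 30 of them lie in the south

    P = {c: 1 / len(cities) for c in cities}

    def EU(P, x):
        """Expected utility of travelling to x: utility 1 iff C lives in x, so EU = P(x)."""
        return P.get(x, 0.0)

    def update(P, E):
        pE = sum(P[c] for c in E)
        return {c: P[c] / pE for c in E}

    x = "city0"                       # some city in the south
    print(EU(P, x))                   # 0.01 before the answer
    print(EU(update(P, south), x))    # ~0.033 after learning E: the answer raises expected utility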
1.2 Games
What distinguishes game theory from decision theory proper is the fact that decisions have to be made with respect to the decisions of other players. We start this section with some fundamental classifications of games and introduce the normal form. We look then at one example in more detail, the prisoners' dilemma. In section 1.2.3 we present the most fundamental solution concepts of game theory, especially the concept of a Nash equilibrium. Finally, we introduce the extensive form. The latter is more suitable for sequential games, a type of game in terms of which communication is studied a lot.

1.2.1 Strategic games and the normal form
There exist several important classifications of games which are widely referred to in the game theoretic literature. We provide here a short overview. A first elementary distinction is that between static and dynamic games. In a static game, every player performs only one action, and all
actions are performed simultaneously. In a dynamic game there is at least one possibility of performing several actions in sequence. Furthermore, one distinguishes between cooperative and non-cooperative games. In a cooperative game, players are free to make binding agreements in preplay communications. In particular, this means that players can form coalitions. In non-cooperative games no binding agreements are possible and each player plays for himself. In our discussion of the prisoners' dilemma we will see how the ability to make binding agreements can dramatically change the character and solutions of a game. But, except for this one illustrative example, we will be concerned with non-cooperative games only. There are two standard representations for games: the normal form and the extensive form. Our introduction will follow this distinction. The major part concentrates on static games in normal form, which we will introduce in this section. We introduce the extensive form together with dynamic games in section 1.2.4. Games are played by players. Hence, in a description of a game we must find a set of players, i.e. the people who choose actions and have preferences over outcomes. This implies that actions and preferences must be represented too in our game models. Let N = {1, . . . , n} denote the set of players. Then we assume that for each player there is a set Ai that collects all actions, or moves, that can be chosen by him. We call Ai player i's action set. An action combination, or action profile, is an n-tuple (a1, . . . , an) of actions where each ai ∈ Ai. The assumption is that they are performed simultaneously. In general, we distinguish strategies from actions. This becomes important when we consider dynamic or sequential games. Strategies tell players what to do in each situation in a game given their background knowledge and are modeled by functions from sequences of previous events (histories) into action sets. In a static game, i.e. a game where every player makes only one move, these two notions coincide. We will use the expressions strategy sets and strategy combinations, or strategy profiles, in this context too, although strategies are only actions. Players and action sets define what is feasible in static games. The preferences of players are defined over action or strategy profiles. We can represent them either by a binary relation ⪯i, i = 1, . . . , n, between profiles, or by payoff functions ui mapping profiles to real numbers. If (s′1, . . . , s′n) ≺i (s1, . . . , sn) or ui(s′1, . . . , s′n) < ui(s1, . . . , sn), then player i prefers strategy profile (s1, . . . , sn) being played over strategy profile (s′1, . . . , s′n) being played. We can collect the individual ui's together in payoff profiles (u1, . . . , un) and define the payoff function U of a game as a function that maps all action or strategy profiles to payoff profiles. A static game can be represented by a payoff matrix. In the case of two-player games with two possible actions for each player, it has the form given in Table 1.1.
Table 1.1: Payoff-matrix of a two-player game

              b1                           b2
  a1   (u1(a1, b1) ; u2(a1, b1))    (u1(a1, b2) ; u2(a1, b2))
  a2   (u1(a2, b1) ; u2(a2, b1))    (u1(a2, b2) ; u2(a2, b2))
One player is called row player; he chooses between actions a1 and a2. The other player is called column player; he chooses between actions b1 and b2. We identify the row player with player 1, and the column player with player 2. The action set A1 of player 1 is then {a1, a2}, and that for player 2 is A2 = {b1, b2}. ui(ak, bl) is the payoff for player i for action profile (ak, bl). It is assumed that two payoff functions U and U′ are equivalent, i.e. represent the same preferences, if there is an r > 0 and a t such that for all i = 1, . . . , n and a ∈ A: r·ui(a) + t = u′i(a). Hence, in the class of games we introduced here the players' payoffs depend only on the actions chosen, and not on the state of the environment. In the next section we discuss an example. Putting things together, we define a strategic game as a structure (N, (Ai)i∈N, U) such that:

1 N = {1, . . . , n} is the (finite) set of players 1, . . . , n;
2 Ai is a non-empty set of actions for each player i ∈ N; A = A1 × · · · × An is the set of all action profiles;
3 U : A −→ Rⁿ is a payoff function which maps each action profile (a1, . . . , an) ∈ A to an n-tuple of real numbers (u1, . . . , un), i.e. (u1, . . . , un) is the payoff profile of players 1, . . . , n for action profile (a1, . . . , an).

The following notation is very common in connection with profiles: if s = (s1, . . . , sn) is a given strategy profile, action profile, or payoff profile etc., then s−i denotes the profile (s1, . . . , si−1, si+1, . . . , sn), 1 ≤ i ≤ n; i.e. s−i is the profile of length n − 1 that we get if we eliminate player i's strategy, action, payoff etc. (s′i, s−i) then denotes the profile where we have replaced si in the original profile s by s′i. We can classify strategic games according to how much the payoff functions of the players resemble each other. One extreme are zero-sum games,
or strictly competitive games; the other extreme are games of pure coordination. In a zero-sum game the payoffs of the players sum up to zero for each strategy profile. This means that if one player wins a certain amount, then the other players lose it. These games are strictly competitive, and if they are played by two persons we could justly call them opponents. A game of pure coordination is exactly the opposite extreme, where the payoffs of all players are identical for each action profile. If one player wins something then the other player wins the same amount, and if one player loses then the other one loses too. We really could call them partners. Zero-sum games and games of pure coordination are two ends on a scale ranging from pure conflict to its opposite. In between are cases where interests partially overlap and partially conflict. In the last section we saw an example of a zero-sum game. In Merin's approach to pragmatics, the aim to convince one's conversational partner of some hypothesis H is the basic dialogue situation. This was modeled by a zero-sum game where the players are the hypotheses H and H̄, the complement of H, the moves are propositions E, and the payoffs are defined by the relevance of E for the respective hypotheses. If E favors H, then it disfavors H̄ by the same amount, and vice versa. Games of pure coordination are fundamental if we look, following David Lewis, at language as a convention.
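The two extremes can be checked mechanically once a game is written down as payoff tables; a small sketch (ours, with an invented coordination game) follows:

    # Hypothetical representation of a two-player strategic game as payoff dictionaries,
    # with checks for the zero-sum and pure-coordination extremes.
    from itertools import product

    A1, A2 = ["a1", "a2"], ["b1", "b2"]
    u1 = {("a1", "b1"): 1, ("a1", "b2"): 0, ("a2", "b1"): 0, ("a2", "b2"): 1}
    u2 = dict(u1)   # identical payoffs for both players: a game of pure coordination

    profiles = list(product(A1, A2))
    is_zero_sum = all(u1[p] + u2[p] == 0 for p in profiles)
    is_pure_coordination = all(u1[p] == u2[p] for p in profiles)
    print(is_zero_sum, is_pure_coordination)   # False True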
1.2.2 The prisoners' dilemma and strict domination
That a decision is not made under certainty does not necessarily imply that we have to calculate expectations expressed in terms of probabilities. Often we can do without them. Suppose there are two supermarkets, both sell exactly the same goods, and it takes the same effort to go shopping at one as to the other. I like to buy vanilla ice cream, but if there is no vanilla ice cream, I want strawberry ice cream, or ice cream with orange flavor. And I want to buy it as cheaply as possible. I know that one supermarket A sells everything at a lower price than the other supermarket B. Hence, whatever the state of the world, whatever sorts of ice cream they sell, I can never be better off if I go to supermarket B. To be better off means here to have a preference for the outcome resulting from one action, shopping at A, over the outcome resulting from the other action, shopping at B. Now, assume that I know in addition that at least one of my favorite sorts of ice cream is in store. So what to do? If I have to decide between actions a1 and a2 , and whatever the state of the world, I strictly prefer the outcome of a1 over that of a2 , then I will choose action a1 . We say that an action a1 strictly dominates an action a2 if in all possible courses of events the results from performing a1 are strictly preferred over that of a2 . It then amounts to a tautology to say that an agent will never choose the strictly dominated action. This criterion
12
Game Theory and Pragmatics
may tell us what a decision maker will do although he does not know the results of his actions with certainty, and although his expectations about these results are unknown. This is an example of a pure decision problem, i.e. a problem where the outcome of the choice of action solely depends on the state of the world and not on the decisions of other players. It is straightforward to generalize the last principle of strict domination to proper game situations: if I have to decide between actions a1 and a2, and, whatever actions the other players choose, the outcomes resulting from a1 are always strictly preferred over the outcomes of a2, then I will choose action a1. Again, this is meant as a tautology. As a consequence, if we study a decision problem in a game and ask which action will be chosen by the players, then we can eliminate all strictly dominated actions without losing any of the reasonable candidates. The prisoners' dilemma is one of the most discussed problems in game theory. It is a standard example illustrating the principle of elimination of strictly dominated actions. One version of the story runs as follows: the police arrest two gangsters for a crime they committed together, but lack sufficient evidence. Only if they confess can the police convict them for this crime. Hence, the police separate them, so that they can't communicate, and offer each of them a bargain: if he confesses, and the other one doesn't, then he will be released but his companion will be sentenced to the maximum penalty. If both confess, then they still will be imprisoned but only for a considerably reduced time. If neither of them confesses, then the police can convict them only for a minor tax fraud. This will be done for sure and they both will receive a minor penalty. The exact numbers are irrelevant but they help to make examples more intuitive. So, let's say that the maximal penalty is 10 years, the reduced penalty, in the case where both confess, is 8 years, the tax fraud is punished with 2 years, and if they are released they are imprisoned for 0 years. The police inform both criminals that they offer this bargain to each of them, and that they are both informed about this. Graphically we can represent the decision situation of the two prisoners as in Table 1.2.

Table 1.2: The prisoners' dilemma
            c               d
  c    (−2 ; −2)      (−10 ; 0)
  d    (0 ; −10)      (−8 ; −8)
Each of the two players has to choose between cooperating and not cooperating with his companion. We denote these actions by c (cooperate) and d (defect). If a prisoner defects, then he confesses; if he cooperates, then he keeps silent. One prisoner chooses between columns, i.e. he is the column player, the other between rows, i.e. he is the row player. This payoff matrix tells us e.g. that the column player will be sentenced to 0 years if he defects while the row player cooperates, and that the row player will then be sentenced to 10 years. It is easy to see that for both players action d strictly dominates action c. Whatever the other player chooses, he will always prefer the outcome where he himself had performed d. Hence, after elimination of strictly dominated actions, only the pair (d, d) remains as a possible choice, and hence both will confess and be sentenced to 8 years. The prisoners' dilemma is an instructive example not least because it easily gives rise to confusion. It seems to lead to a paradox: if both players strictly follow their preferences, then they are led to perform actions with results that are much disliked by both. But to say that somebody follows his preferences is no more than a tautology, so the players cannot do anything else but strictly follow them. The question may arise whether the principle of eliminating strictly dominated actions isn't too simple-minded. It is necessary to make clear what the game theoretic model describes and what it doesn't describe. As mentioned, the model has to be understood as a descriptive model, not as a prescriptive one. Hence, it does not advise us to follow only our narrowly defined short-term advantages and disregard all needs and feelings of our companions. It just says that if the preferences are as stated in the model, then a rational player will act in this and that way. It makes a crucial difference whether the prisoners' dilemma is played only once or whether it is played again and again. In the repeated prisoners' dilemma there is a chance that we meet the same person several times, and non-cooperative behavior can be punished in future encounters. And, indeed, it can be shown that there are many more strategies that rational players can choose when we consider the infinitely repeated prisoners' dilemma. The model assumes that the preferences of the players are just as stated by the payoff matrix. This means that the only thing the prisoners are interested in is how long they will be imprisoned. They are not interested in the fates of each other. Again this is not a prescription but a description of a certain type of situation. If we consider a scenario where the prisoners feel affection for each other, then this has to be represented in the payoff matrix. In an extreme case where one criminal cares as much for the other one as for
himself, his payoffs may be just the negative sum of both sentences. In this case the model would predict that the compassionate prisoner cooperates, and this behavior is as rational as the defecting behavior in the first scenario. The corresponding payoff matrix is given in Table 1.3.

Table 1.3: The prisoners' dilemma with a compassionate row player
            c                d
  c    (−4 ; −2)       (−10 ; 0)
  d    (−10 ; −10)     (−16 ; −8)
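Strict dominance can be checked mechanically; here is a small sketch (ours, not from the text) that compares the row player's situation in Table 1.2 and Table 1.3:

    # Hypothetical dominance check for the row player in the two payoff matrices above.
    def strictly_dominates(u_row, a, b, column_actions):
        """True iff row action a yields a strictly better payoff than b against every column action."""
        return all(u_row[(a, col)] > u_row[(b, col)] for col in column_actions)

    cols = ["c", "d"]
    selfish       = {("c", "c"): -2, ("c", "d"): -10, ("d", "c"): 0,   ("d", "d"): -8}    # Table 1.2
    compassionate = {("c", "c"): -4, ("c", "d"): -10, ("d", "c"): -10, ("d", "d"): -16}   # Table 1.3

    print(strictly_dominates(selfish, "d", "c", cols))        # True: defection dominates cooperation
    print(strictly_dominates(compassionate, "c", "d", cols))  # True: now cooperation dominates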
But still, there may remain a gnawing doubt. Let's assume that we play the prisoners' dilemma only once, and let's assume that the preferences are exactly the same as those stated in its payoff matrix: isn't it simply better to cooperate? Even if we were convinced that the only rational choice is to defect, doesn't this example show that it is sometimes better to be irrational? And doesn't it thereby challenge a central game theoretic assumption about rational behavior? First, it has to be noted that the prisoners' dilemma does not show that it is better to deviate from defection unilaterally. As the example with the compassionate row player shows, this would simply mean going to jail for 10 years. But, of course, it would be better for both to cooperate simultaneously. This simply follows from the payoff matrix; but this does not imply that the principle of elimination of dominated actions is irrational. The source of the doubt lies in the observation that if they were bound by an agreement or by moral obligations, then they would be better off following it even if this course of action contradicts their own preferences. The point is that in the setting we considered for the one-shot prisoners' dilemma there is no room for agreements or moral obligations. If we add them, then we get a different game, and for this game we can indeed find that it is rational to make binding contracts because they lead to better payoffs, even if the payoffs are defined exactly as in the original situation. Let's assume that the two criminals have a possibility to agree never to betray each other if they get imprisoned. Here is a straightforward model for this situation: we add two actions, a and −a. In the beginning both players have to decide whether they play a or −a. If both play a, then they make an agreement that they both play c, cooperate, in the prisoners' dilemma. If one of them doesn't play a, then no agreement is formed and hence nothing changes and both can cooperate or defect afterwards. An agreement is
binding; i.e. it is impossible to do anything that is not in accordance with it. Hence, it has the effect of reducing the set of possible actions open to the players. Call the two players A and B. We depict this game as in Figure 1.1.
[Figure 1.1: game tree for the prisoners' dilemma with a prior agreement stage. In s0, A chooses a or −a; B then chooses a or −a without knowing A's choice (s1 and s2 lie in one information set). Only after (a, a), in s10, is the binding agreement formed, and both players must cooperate, with payoffs (−2, −2); in s11, s20 and s21 the ordinary prisoners' dilemma follows, with payoffs (−2, −2), (−10, 0), (0, −10) and (−8, −8) for (c, c), (c, d), (d, c) and (d, d) respectively.]

Figure 1.1
In this representation we depict the players’ moves one after the other. First A chooses between actions a and −a in situation s0 , which leads to s1 or s2 respectively. Then B makes his choice from a and −a. The oval around s1 and s2 means that B cannot distinguish between the two situations, i.e. he does not know whether A played a or −a. Hence, although the actions of A and B are ordered sequentially, the graph covers also the case where both decide simultaneously. We will introduce this form of representation, the extensive form, in section 1.2.4. After their initial choice, both have to play the prisoners’ dilemma. As in s11 , s20 and s21 no agreement is reached, both play the ordinary prisoners’ dilemma as considered before. We depicted the associated game tree only once after s20 . It is identical for all three situations. In situation s10 they reached an agreement. So their possibilities in the prisoner’s situation are limited to cooperation. This means that they have only one choice to act. In the end we find the payoffs for each course of events. The first value is the payoff for player A and the second for player B. Which actions will rational players choose? As the situations in s11 , s20 and s21 are those of the prisoners’ dilemma, it follows that both will play defect. Hence, their payoffs for all these situations will be (−8, −8). If they are in s10 their payoff will be (−2, −2). For their choice between a and −a in the initial situation this means that the game tree can be simplified as depicted in Figure 1.2.
[Figure 1.2: simplified game tree for the initial choice between a and −a. The profile (a, a) leads to payoffs (−2, −2); the profiles (a, −a), (−a, a) and (−a, −a) all lead to (−8, −8).]

Figure 1.2
If A has played a, then B has a preference to play a too; if A played −a, then B has no preferences for one over the other. The situation for A is symmetrical. We can represent this game by a payoff matrix in Table 1.4.

Table 1.4

             a              −a
  a     (−2 ; −2)      (−8 ; −8)
  −a    (−8 ; −8)      (−8 ; −8)
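Neither action strictly dominates the other here, but a is never worse and sometimes strictly better than −a; this anticipates the principle of weak dominance stated just below. A small sketch (ours):

    # Hypothetical weak-dominance check for Table 1.4 (row player's payoffs; the game is symmetric).
    def weakly_dominates(u, a, b, opponent_actions):
        """a weakly dominates b: never worse, and strictly better against at least one opponent action."""
        never_worse      = all(u[(a, o)] >= u[(b, o)] for o in opponent_actions)
        sometimes_better = any(u[(a, o)] >  u[(b, o)] for o in opponent_actions)
        return never_worse and sometimes_better

    u_row = {("a", "a"): -2, ("a", "-a"): -8, ("-a", "a"): -8, ("-a", "-a"): -8}
    print(weakly_dominates(u_row, "a", "-a", ["a", "-a"]))   # True
    print(weakly_dominates(u_row, "-a", "a", ["a", "-a"]))   # False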
The principle of elimination of strictly dominated actions does not help us here because a is not preferred by the players for every choice that the other player can make. This example motivates a more general principle, that of the elimination of weakly dominated actions: if a player has a choice between actions a1 , . . . , an such that (1) there is a possibility where a1 leads to a preferred outcome, and (2) there is no possibility where its outcome is dispreferred, then the player will choose a1 . This principle is already more than a pure explication of the meaning of prefer. It presupposes some deliberation on the side of the player. If we apply this principle in the situation represented by the last payoff matrix, we see that (a, a) is the only possible choice for the two agents. What does this example show us? We saw that it would be better for both of the prisoners if they could manage to cooperate in the prisoners’ dilemma, and this threw some doubt on the principle of elimination of strongly dominated actions. As we see now, if there is a device that forces
them to cooperate, then it is rational for them to use it. But to restrict one's own freedom and force oneself to make a certain choice is something different from behaving irrationally. To choose to cooperate in the original prisoners' dilemma means to make an irrational choice, which is in that case tantamount to preferring the dispreferred. The making of binding agreements is part of the actions the players can perform. If we want to study its effects on finding optimal behavior, then it has to be represented in the game structure. Doubts about the principle of elimination of strongly dominated actions arise when we put the possibility of deliberately restricting our freedom into the criterion of rationality. This shows that we have to be careful about what we model by what in a game. We can distinguish three aspects of a decision problem that have to be represented in a model. An agent has to consider: (1) what is feasible; (2) what is desirable; (3) what he knows. Hence, we have to represent in each decision situation what a player can do, what his preferences are and what he knows. In our game tree we can read off this information as follows: in situation s20 there are two possible actions player B can choose, and they lead to situations s200 and s201 respectively. This defines the feasible. His preferences are defined over final outcomes of courses of events. They are given by his payoffs in the final situations. He knows that he is in situation s20. If he couldn't distinguish between this situation and another, then this could be indicated by an oval containing both situations. This is called his information set. But this represents only his special knowledge about the specific game situation. In addition to this, it is usual to assume that players know the overall game structure, i.e. they know which moves are possible in which situations by each player, know their consequences, know each other's preferences, and each other's knowledge about each other's moves. More precisely, the latter properties have to be common or mutual knowledge. A proposition ϕ is mutually known if every player knows ϕ, if every player knows that every player knows ϕ, if every player knows that every player knows that every player knows ϕ, etc. Hence, this imposes a very strong assumption about the players' ability to reason about each other. This is a characteristic of classical Game Theory. In a later section we will see an interpretation of Game Theory that imposes much weaker constraints on the rationality of players: evolutionary game theory. The components introduced so far, the possible moves, the desires, and the players' knowledge, provide a description of a game but they still do not provide an answer to the question what an agent will or should do. What is needed is a criterion for selecting an action. In the prisoners' dilemma we saw one such criterion, the principle of elimination of strongly dominated strategies. If an agent is rational, we said, then he cannot choose an action
that will always lead to a dispreferred state of affairs. We introduced it as an explication of the meaning of prefer. But in general, this principle will not suffice to give us an answer for every game where we expect an answer. We saw already an example where we needed a weaker version of the principle. In the game theoretic literature we find a large number of criteria for what is rational to choose. These criteria may depend on the type of game played, the information available to the players, on assumptions on their ability to reason about each other, on the amount of common knowledge. Very often, these criteria are formulated in terms of equilibrium concepts.
1.2.3 Strategic games and the Nash equilibrium
Strategic Games without Uncertainty

What strategy will a rational player choose in a given strategic game? We saw one way to answer this question: a player may just eliminate all strictly dominated actions, and hope to find thereby a single possible move that remains, and hence will be chosen by him. We formulate strict domination for strategies:

Definition 1 (Strict Domination) A strategy si of player i strictly dominates a strategy s′i iff for all profiles s it holds that (s′i, s−i) ≺i (si, s−i).

For weak domination we have to replace ≺i by ⪯i and to assume that there is at least one strategy combination by the opponents such that si does better than s′i.

Definition 2 (Weak Domination) A strategy si of player i weakly dominates a strategy s′i iff (1) for all profiles s it holds that (s′i, s−i) ⪯i (si, s−i), and (2) there is a profile s such that (s′i, s−i) ≺i (si, s−i).

Whereas we can see the principle of strict domination as a mere explication of the meaning of prefer, the principle of weak domination involves some reasoning on the side of the player. Only if there is a chance that the other players choose the strategies s−i where si is preferred over s′i is there a reason to play si. In the previous examples we applied the principles of elimination of dominated strategies only once for each player. But in many games we have to apply them several times to arrive at a unique solution. For instance, consider the game in Table 1.5. Intuitively, the combination (r1, c1) should turn out as the solution. This is a game with two players. We call them again row player and column player. Each player has the choice between three actions; row player between {r1, r2, r3} and column player between {c1, c2, c3}. Neither c1, c2 nor c3 is dominated, hence our criterion does not tell us what column player will choose. For the row player neither r1 nor r2 is dominated; r1 is better if column player chooses c1 or c2, and r2 is better if he chooses c3.
Table 1.5

            c1           c2           c3
  r1    (5 ; 5)      (4 ; 4)      (0 ; 0)
  r2    (1 ; 1)      (3 ; 3)      (2 ; 2)
  r3    (0 ; 0)      (0 ; 0)      (1 ; 1)
But we can see that r2 strictly dominates r3. This means that row player will never choose this action. Now, if we assume that the game structure is common knowledge, i.e. the payoffs are mutually known, and if column player knows about row player that he eliminates strictly dominated actions, then column player can infer too that only r1 and r2 are possible moves by row player. If we assume that this reasoning is mutually known, then we can eliminate the third row of the payoff matrix. In the reduced matrix c3 is strictly dominated by c2, and for the same reasons we can eliminate it from the payoff matrix. In the remaining 2 × 2 matrix, r1 strictly dominates r2. Hence there remains only one choice for row player. This means that the problem of what to choose is solved for him. But if only r1 is a possible move for row player, then c1 strictly dominates c2, and therefore the problem is solved for column player too. It turns out that (r1, c1) is their unique choice. Apart from the fact that we have to apply the principle of elimination of dominated strategies iteratively, there is another important point that is confirmed by this example: that row player and column player arrive at (r1, c1) presupposes that they know about each other that they apply this principle and that they are able to work out quite intricate inferences about each other's behavior. Such assumptions about the agents' ability to reason about each other play a major role in the justification of all criteria of rational behavior. Unfortunately, iterated elimination of dominated strategies can't solve the question how rational players will act for all types of static games. The example in Table 1.6 is known as the Battle of the Sexes.

Table 1.6: Battle of the sexes
            b            c
  b    (4 ; 2)      (1 ; 1)
  c    (0 ; 0)      (2 ; 4)
There are several stories told for this game. One of them runs as follows: row player, let's call him Adam, wants to go to a boxing event this evening, and column player, let's call her Eve, to a concert. Both want to go to their events together. Eve would rather go to the boxing event with Adam than go to her concert alone, although she doesn't like boxing very much. The same holds for Adam if we reverse the roles of boxing and the concert. We see that for Adam neither going to the boxing event b dominates going to the concert c, nor the other way round. The same holds for Eve. Hence, the principle of elimination of dominated strategies does not lead to a solution to the question what Adam and Eve will do. Intuitively, Adam and Eve should agree on (b, b) or (c, c) if they want to maximize their payoffs. They should avoid (b, c) and (c, b). What could a justification look like? One way of reasoning proceeds as follows: if Eve thinks that Adam will go to the boxing event, then she has a preference to be there too. However, if Adam knows that Eve goes to the boxing event, then Adam wants to be there too. The same holds for the pair (c, c). If we look at (b, c), then we find that in this case Adam would prefer to play c, as this increases his payoff; or, if Eve knows that Adam plays b, then she would prefer to switch to b. A strategy profile s = (s1, . . . , sn) is called a Nash equilibrium if none of the players i has an interest in playing a strategy different from si given that the others play s−i.

Definition 3 (Nash Equilibrium) A strategy profile s is a (weak) Nash equilibrium iff for none of the players i there exists a strategy s′i such that s ≺i (s′i, s−i), or, equivalently, iff for all of i's strategies s′i it holds that (s′i, s−i) ⪯i s.

A Nash equilibrium is strict if we can replace the ⪯i by ≺i for s′i ≠ si in the second characterization. In this case every player has a preference to play si if the others play s−i. There is another characterization, in terms of best responses. A move si of player i is a best response to a strategy profile s−i, written si ∈ BRi(s−i), iff

ui(si, s−i) = max_{s′i ∈ Si} ui(s′i, s−i).    (1.11)
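Definition 3 and the best-response characterization in (1.11) lend themselves to a brute-force check; here is a sketch (ours, not the authors') that finds the pure Nash equilibria of the Battle of the Sexes:

    # Hypothetical brute-force search for pure Nash equilibria in a two-player game.
    from itertools import product

    def is_best_response_row(u1, a, b, A1):
        return u1[(a, b)] == max(u1[(x, b)] for x in A1)

    def is_best_response_col(u2, a, b, A2):
        return u2[(a, b)] == max(u2[(a, y)] for y in A2)

    def pure_nash(u1, u2, A1, A2):
        return [(a, b) for a, b in product(A1, A2)
                if is_best_response_row(u1, a, b, A1) and is_best_response_col(u2, a, b, A2)]

    # Battle of the Sexes (Table 1.6): row player Adam, column player Eve.
    A1 = A2 = ["b", "c"]
    u1 = {("b", "b"): 4, ("b", "c"): 1, ("c", "b"): 0, ("c", "c"): 2}
    u2 = {("b", "b"): 2, ("b", "c"): 1, ("c", "b"): 0, ("c", "c"): 4}
    print(pure_nash(u1, u2, A1, A2))   # [('b', 'b'), ('c', 'c')]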
A strategy profile s is a Nash equilibrium iff for all i = 1, ..., n, si is a best response to s−i, i.e. iff si ∈ BRi(s−i). It is strict if, in addition, BRi(s−i) is a singleton set for all i.

Mixed Nash Equilibria

We saw that the Battle of the Sexes has two Nash equilibria, (b, b) and (c, c). Once Adam and Eve have managed to coordinate on one of them, they have no reason to play anything else.
But if it is the first time they play this game, or if they have to decide what to play anew every time, then the existence of two equilibria does not give them an answer to the question of what to do. Eve may reason as follows: if I go to the boxing event, then the best thing I can get out of it is 2 pleasure points, but if things go wrong then I get nothing. But if I go to the concert, then the worst thing that can happen is that I get only 1 pleasure point, and if I am lucky I get 4 points. So, if I play safe, then I should avoid the action that carries with it the potentially worst outcome. Hence, I should go to the concert. If Adam reasons in the same way, then he will go to the boxing event, and hence Adam and Eve will always be at their preferred event but never meet, which limits them to a payoff of 1 each. The strategy that Adam and Eve play here is known as the minimax strategy. When von Neumann and Morgenstern wrote their seminal work Theory of Games and Economic Behavior (1944), they did not use Nash equilibria as their basic solution concept but the minimax strategy. Obviously, this strategy is reasonable. But we will see that Adam and Eve can do better. What happens if Adam and Eve mix their strategies, i.e. if they don't choose a pure strategy like always playing c or b but play each strategy with a certain probability? Let us assume that in the situation of Table 1.6 Adam goes to the concert with probability 1/2 and to the boxing event with probability 1/2, and Eve does the same. Then the probability of each of the four possible combinations of actions is 1/2 × 1/2 = 1/4. The situation is symmetrical for both players, hence they can both calculate their expected payoff as follows:

1/2 × 1/2 × 2 + 1/2 × 1/2 × 1 + 1/2 × 1/2 × 0 + 1/2 × 1/2 × 4 = 1 3/4

So we see that a simple strategy like flipping a coin before deciding where to go can improve the overall outcome significantly. What is a Nash equilibrium for games with mixed strategies? If Eve believes that the chances of Adam going to the concert and of Adam going to the boxing match are equally high, then she can calculate her expected payoff, or expected utility, of playing c as follows:

EU(c) = 1/2 × 1 + 1/2 × 4 = 2 1/2

whereas her expected utility of playing b is

EU(b) = 1/2 × 2 + 1/2 × 0 = 1.

In general she will find that always playing c is the most advantageous choice for her. A Nash equilibrium is a sequence of choices by each player such that none of the players has an interest in playing something different given the choices of the others. Hence, playing b and c each with probability 1/2 cannot be a Nash equilibrium. But we will see that there exists one, and,
moreover, that there exists one for every finite (two-person) game. Before we introduce this result, let us first state what a strategic game with mixed strategies is. Let ∆(Ai) be the set of probability distributions over Ai, i.e. the set of functions P that assign a probability P(a) to each action a ∈ Ai such that Σ_{a∈Ai} P(a) = 1 and 0 ≤ P(a) ≤ 1. Each P in ∆(Ai) corresponds to a mixed strategy of agent i. A mixed strategy profile then is a sequence (P1, ..., Pn) for the set of players N = {1, ..., n}. A pure strategy corresponds to a mixed strategy Pi where Pi(a) = 1 for one action a ∈ Ai and Pi(b) = 0 for all other actions. In our example of the Battle of the Sexes the players' action sets are {b, c}, i.e. the actions of going to a boxing event and going to a concert. If Adam is player 1 and Eve player 2, then their strategy of playing b and c with equal probability corresponds to the strategy profile (P1, P2) where Pi(b) = Pi(c) = 1/2, i ∈ {1, 2}. We can calculate the expected utility of player i given a mixed strategy profile P = (P1, ..., Pn) and payoff profile (u1, ..., un) by:

EUi(P) = Σ_{a ∈ A1 × ... × An} P1(a1) × ... × Pn(an) × ui(a).    (1.12)
It is assumed that rational players try to maximize their expected utilities, i.e. a player i strictly prefers action a over action b exactly if the expected utility of a is higher than the expected utility of b. For mixed strategy profiles P = (P1, ..., Pn), we use the same notation P−i as for (pure) strategy profiles to denote the profile (P1, ..., Pi−1, Pi+1, ..., Pn) where we leave out the strategy Pi. (P′i, P−i) again denotes the profile where we have replaced Pi by P′i.

Definition 4 (Mixed Nash Equilibrium) A (weak) mixed Nash equilibrium is a mixed strategy profile (P1, ..., Pn) such that for all i = 1, ..., n and P′i ∈ ∆(Ai) it holds that EUi(P′i, P−i) ≤ EUi(P). A mixed Nash equilibrium is strict if we can replace ≤ by < in the last condition.

A standard result states that every finite strategic two-person game has a mixed Nash equilibrium. In the case of our example of the Battle of the Sexes we find that the pair (P1, P2) with P1(b) = 4/5 and P1(c) = 1/5 for Adam and P2(b) = 1/5 and P2(c) = 4/5 for Eve is a mixed Nash equilibrium. If Adam plays P1, then Eve can play whatever she wants: she can never get a higher payoff than she gets for playing P2. In fact, her payoff is the same for all her possible strategies. The analogous statement holds for Adam if it is known that Eve plays P2. That we find a mixed Nash equilibrium for every finite strategic game is of some theoretical interest because there are many games that don't have a pure Nash equilibrium.
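To make equation (1.12) and the equilibrium just claimed concrete, here is a minimal sketch (in Python, which is our choice of language and not the chapter's; the data layout and function names are ours). It computes the expected utilities for the Battle of the Sexes of Table 1.6 and checks that neither player gains by a pure deviation from the mixed profile with P1(b) = 4/5 and P2(b) = 1/5.

```python
# A minimal sketch, assuming Python: expected utilities for mixed strategies
# in the Battle of the Sexes (Table 1.6), following equation (1.12).

from itertools import product

# Payoffs u[i][(a_row, a_col)] for Adam (row player, i=0) and Eve (column player, i=1).
u = [
    {("b", "b"): 4, ("b", "c"): 1, ("c", "b"): 0, ("c", "c"): 2},  # Adam
    {("b", "b"): 2, ("b", "c"): 1, ("c", "b"): 0, ("c", "c"): 4},  # Eve
]

def expected_utility(i, P1, P2):
    """Equation (1.12) for two players: sum over all action profiles."""
    return sum(P1[a1] * P2[a2] * u[i][(a1, a2)]
               for a1, a2 in product("bc", repeat=2))

# The mixed equilibrium claimed in the text: Adam plays b with 4/5, Eve with 1/5.
P_adam = {"b": 4/5, "c": 1/5}
P_eve = {"b": 1/5, "c": 4/5}

print(expected_utility(0, P_adam, P_eve))   # Adam's expected payoff: 1.6
print(expected_utility(1, P_adam, P_eve))   # Eve's expected payoff: 1.6

# Eve is indifferent between her pure strategies given P_adam (both give 1.6),
# so she cannot profit from deviating; the analogous check works for Adam.
for a in "bc":
    print(a, expected_utility(1, P_adam, {a: 1.0, ("c" if a == "b" else "b"): 0.0}))
```

Incidentally, both players' expected payoff in this mixed equilibrium is 1.6, which lies below the 1 3/4 they obtain from the coin-flipping profile discussed above; that profile, however, is not an equilibrium.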
Table 1.7

           a            b
  a    (1 ; −1)     (−1 ; 1)
  b    (−1 ; 1)     (1 ; −1)
Consider the game from Table 1.7. This game cannot have a pure Nash equilibrium: whatever a player chooses in order to maximize his payoff given the other player's choice will induce the other player to make a different choice. But it has a unique mixed Nash equilibrium in which each player plays each move with probability 1/2. There are many refinements of the notion of Nash equilibrium. We introduce here Pareto optimality. We saw that every finite strategic game has a mixed Nash equilibrium. But besides this it may have many more equilibria, hence the criterion of being a Nash equilibrium does not normally suffice for selecting a unique solution for a game. Nevertheless, in many cases we can argue that some of these equilibria are better, or more reasonable, than others. In the example from Table 1.8 we find two Nash equilibria, (a, a) and (b, b). This means that both are possible solutions to this game, but both players have an interest in agreeing on (a, a).
Table 1.8

          a           b
  a    (3 ; 2)     (0 ; 1)
  b    (1 ; 0)     (1 ; 1)
A Nash equilibrium like (a, a) is called strongly Pareto optimal, or strongly Pareto efficient. More precisely, a Nash equilibrium s = (s1, ..., sn) is strongly Pareto optimal iff there is no other Nash equilibrium s′ = (s′1, ..., s′n) such that for all i = 1, ..., n: ui(s) < ui(s′). That is, a Nash equilibrium is strongly Pareto optimal iff there is no other equilibrium where every player is better off. For example, if players can negotiate in advance, then it is reasonable to assume that they will agree on a strongly Pareto optimal Nash equilibrium. There is also a weaker notion of Pareto optimality: a Nash equilibrium s is just Pareto optimal iff there is no other Nash equilibrium s′ = (s′1, ..., s′n) such that for all i = 1, ..., n: ui(s) ≤ ui(s′) and for some i: ui(s) < ui(s′).
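Both notions just defined, Nash equilibrium via best responses and strong Pareto optimality among equilibria, can be checked mechanically for a small matrix game. The following minimal sketch (Python, our own illustration rather than anything from the text; names such as is_nash are ours) applies them to the game of Table 1.8.

```python
# A minimal sketch, assuming Python: pure Nash equilibria of Table 1.8
# via best responses, and a strong Pareto comparison among the equilibria.

actions = ["a", "b"]
# payoffs[(row, col)] = (u_row, u_col), as in Table 1.8
payoffs = {("a", "a"): (3, 2), ("a", "b"): (0, 1),
           ("b", "a"): (1, 0), ("b", "b"): (1, 1)}

def is_nash(r, c):
    u_r, u_c = payoffs[(r, c)]
    row_ok = all(payoffs[(r2, c)][0] <= u_r for r2 in actions)   # r is a best response to c
    col_ok = all(payoffs[(r, c2)][1] <= u_c for c2 in actions)   # c is a best response to r
    return row_ok and col_ok

equilibria = [(r, c) for r in actions for c in actions if is_nash(r, c)]
print(equilibria)   # [('a', 'a'), ('b', 'b')]

def strongly_pareto_optimal(eq):
    # no other equilibrium makes *every* player strictly better off
    return not any(all(payoffs[other][i] > payoffs[eq][i] for i in (0, 1))
                   for other in equilibria if other != eq)

for eq in equilibria:
    print(eq, strongly_pareto_optimal(eq))
# (a, a) is strongly Pareto optimal; (b, b) is not, since (a, a) is better for both.
```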
1.2.4 Games in extensive form
We already saw graphical representations of games in extensive form in Figures 1.1 and 1.2. This form is useful for the representation of dynamic games, i.e. games in which whole sequences of moves by different players may occur, as for example in chess. The normal form goes together with the matrix representation, and the extensive form with the representation by a game tree. What is a tree? A tree consists of several nodes, also called vertices, and edges. In a game tree an edge is identified with a move by one of the players, and the nodes are game situations where one of the players has to choose a move. A tree starts with a distinguished root node, the start situation. If two nodes n1, n2 are connected by an edge n1 → n2, then n1 is a predecessor of n2, and n2 a successor of n1. Every node has exactly one predecessor, except the root node, which has no predecessor. Every node may have several successors. We call a sequence of nodes n1 → n2 → ... → nk a path from n1 to nk. If nk has no successor, then we call this sequence a branch and nk an end node. In general, a tree may also contain infinite branches, i.e. there may exist an infinite sequence n1 → n2 → n3 → ... with no end node. For the following definition of extensive form games we assume that there are no infinite branches. In a tree, every node except the root node itself is connected to the root by a path, and no node is connected to itself by a path (no cycles). A tree for an extensive game may look as in Figure 1.3. The numbers i = 1, 2 attached to the nodes mean that it is player i's turn to choose a move.
Figure 1.3 [game tree diagram; each node is labelled with the player (1 or 2) whose turn it is to move]
In a game like chess all players have perfect information, i.e. they know the start situation, each move that either they or their opponents have made, and each other's preferences over outcomes.
In Figure 1.1 we saw an example where in some situations players do not know in which situation they are. For every player i we can add to every node n in the tree the set of situations that are indistinguishable for i. This set is called i's information set. A typical example where information sets have more than one element is a game where one player starts with a secret move. The second player knows the start situation, which moves player 1 may have chosen, and the possible outcomes. If the move is really secret, then player 2 cannot know in which node of the tree he is when he chooses. Think of players 1 and 2 playing a game where 1 hides a Euro in one hand and 2 has to guess whether it is in 1's left or right hand. If 2 makes the right guess he wins the Euro, otherwise 1 keeps it. Figure 1.4 shows a tree for this game. The fact that player 2 cannot distinguish between the situations n, where 1 holds the Euro in his left hand, and n′, where he holds it in his right hand, is indicated by the oval around n and n′.
Figure 1.4 [game tree of the Euro-guessing game: player 1 chooses l or r, player 2 then guesses l or r without knowing 1's choice, so the nodes n and n′ form one information set; the payoffs are (0, 1) when 2 guesses correctly and (1, 0) otherwise]
What is, in general, a game in extensive form? Let a tree be given in the sense defined above. There must be a set of players N = {1, ..., n} and a set of moves A = {a1, ..., am}. Sometimes it is assumed that nature is an additional player; it is then denoted by 0. In order to represent a game we need the following information about the tree:

1. To each node of the tree we have to assign exactly one player from the set N ∪ {0}. If player i is assigned to a node, then this means that it is player i's turn to make a move in that situation.

2. Each edge has to be labelled by a move from A. If an edge n1 → n2 is labelled with action a and node n1 is assigned to player i, then this means that playing a by i in situation n1 leads to n2.
3. If a node n is assigned to player i, then we have to assign to this node in addition an information set. This is a subset of the nodes that are assigned to i and always includes n itself. It represents the information available to i in situation n. If n′ is an element of the information set assigned to n, then the same information set has to be assigned to n′. The idea is: if n and n′ are elements of the same information set, then player i cannot distinguish between the two situations, i.e. he does not know whether he is in n or in n′.

4. There is a set of outcomes. To each end node we have to assign exactly one outcome. It is the final state resulting from playing the branch starting from the root node and leading to the end node.

5. For each player i in N there exists a payoff function ui that assigns a real value to each of the outcomes.

In addition it is assumed that nature, player 0, chooses its moves with certain probabilities:

6. For each node assigned to 0 there is a probability distribution P over its possible moves at this node.
Figure 1.5 [game tree with nature as player 0: nature chooses s1 with probability ρ and s2 with probability ρ′; player 1 then sends the message ϕ, and player 2 chooses g1 or g2; the payoffs are (1, 1) if 2 goes to the place where 1 is, and (0, 0) otherwise]
Figure 1.5 shows an example of a game in extensive form with nature as a player. Assume that 1 and 2 want to meet at 5 pm. 1 calls 2 and leaves a message (ϕ) for her: "I have an appointment with the doctor. You can pick me up there." Now assume that 1 regularly has appointments with two different doctors. Where should 2 pick him up? Of course, if 2 knows that the probability ρ of 1 being at s1 is higher than the probability ρ′ of his being at s2, then she had better go to s1 (g1) than to s2 (g2).
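The six ingredients listed above can be written down directly as a small data structure. Below is a minimal sketch (Python; the representation and the names used are our own and not the chapter's) of the Euro-guessing game of Figure 1.4, in which player 2's two decision nodes share one information set.

```python
# A minimal sketch, assuming Python: the Euro-guessing game of Figure 1.4
# encoded with players at nodes, labelled edges, information sets and payoffs.

game = {
    "root": {"player": 1, "moves": {"l": "n", "r": "n_prime"}},
    # Player 2 cannot tell n from n_prime: both belong to the same information set.
    "n":       {"player": 2, "info_set": "guess", "moves": {"l": "z1", "r": "z2"}},
    "n_prime": {"player": 2, "info_set": "guess", "moves": {"l": "z3", "r": "z4"}},
    # End nodes with payoffs (u1, u2): player 2 wins the Euro on a correct guess.
    "z1": {"payoff": (0, 1)},   # hidden left, guessed left
    "z2": {"payoff": (1, 0)},   # hidden left, guessed right
    "z3": {"payoff": (1, 0)},   # hidden right, guessed left
    "z4": {"payoff": (0, 1)},   # hidden right, guessed right
}

def play(move1, strategy2):
    """Follow one branch; player 2's strategy may only depend on her information set."""
    node = game[game["root"]["moves"][move1]]
    leaf = game[node["moves"][strategy2(node["info_set"])]]
    return leaf["payoff"]

# Player 2 must choose the same guess at n and n_prime, e.g. always guess 'l':
print(play("l", lambda info: "l"))   # (0, 1)
print(play("r", lambda info: "l"))   # (1, 0)
```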
In this introduction we have concentrated on classical game theory. It rests on very strong assumptions about the players' rationality and computational abilities. For a Nash equilibrium to be a really clear-cut solution for a game, it is presupposed that each other's strategies are commonly known. In general, it is not only each other's strategies but also each other's reasoning that must be commonly known. Hence, there have been efforts to develop criteria for players with bounded rationality. An extreme reinterpretation is provided by evolutionary game theory, where in fact no reasoning is assumed on the side of the players but a form of (evolutionary) learning. This will be the topic of the third section of this introduction. But before we come to that, we will first describe how the tools of standard decision and game theory can be used to study some pragmatic aspects of communication.
2 Communication in games

2.1 Cheap talk games

2.1.1 A sequential formulation
In his classic work on conventions, Lewis (1969) proposed to study communication by means of so-called signaling games, games that are of immediate relevance for linguistics. In the meantime, extensions of Lewisean signaling games have become very important in economics and (theoretical) biology. In this section we will only consider the simple kind of signaling games that Lewis introduced, games that are now known as cheap talk games. Cheap talk games are signaling games where the messages are not directly payoff relevant. A signaling game with payoff-irrelevant messages is a sequential game with two players involved: player 1, the speaker, and player 2, the hearer. Both players are in a particular state, an element of some set T. Player 1 can observe the true state, but player 2 cannot. The latter has, however, beliefs about what the true state is, and it is common knowledge between the players that this belief is represented by a probability function PH over T. Player 1 observes the true state t and chooses a message m from some set M. After player 2 observes m (but not t), she chooses some action a from a set A, which ends the game. In this sequential game, the hearer has to choose a (pure) strategy that says what she should do in response to each message, thus a function from M to A, i.e., an element of [M → A]. Although strictly speaking not necessary, we can also represent the speaker's strategy as a function, one in [T → M]. In simple communication games, we call these functions the hearer and speaker strategy, respectively, i.e., H and S. The utilities of both players are given by uS(t, a) and uH(t, a).
In cheap talk games, the messages are not directly payoff relevant: the utility functions do not mention the messages being used. Thus, the only effect that a message can have in these games is through its information content: by changing the hearer's belief about the situation the speaker (and hearer) is in. If a message can change the hearer's beliefs about the actual situation, it might also change her optimal action, and thus indirectly affect both players' payoffs. Let us look at a very simple situation where T = {t1, t2}, M = {m1, m2}, and where f is a function that assigns to each state the unique action in A that is the desired action for both agents. Let us assume the following utility functions (corresponding closely with Lewis's intentions): uS(ti, Hk(mj)) = 1 = uH(ti, Hk(mj)) if Hk(mj) = f(ti), and 0 otherwise. Let us assume that f(t1) = a1 and f(t2) = a2, which means that a1 is the best action to perform in t1, and a2 in t2. The speaker and hearer strategies will be defined as follows:

S1 = {⟨t1, m1⟩, ⟨t2, m2⟩}    S2 = {⟨t1, m2⟩, ⟨t2, m1⟩}
S3 = {⟨t1, m1⟩, ⟨t2, m1⟩}    S4 = {⟨t1, m2⟩, ⟨t2, m2⟩}

H1 = {⟨m1, a1⟩, ⟨m2, a2⟩}    H2 = {⟨m1, a2⟩, ⟨m2, a1⟩}
H3 = {⟨m1, a1⟩, ⟨m2, a1⟩}    H4 = {⟨m1, a2⟩, ⟨m2, a2⟩}
In the following description, we represent not the actual payoffs in the payoff table, but rather what the agents who are in a particular situation expect they will receive. Thus we will give a table for each situation with the payoffs for each agent e ∈ {S, H} determined by

EUe(t, Si, Hj) = Σ_{t′∈T} µe(t′ | Si(t)) × Ue(t′, Hj(Si(t′)))
where µe(t′|Si(t)) is defined by means of conditionalization in terms of strategy Si and the agent's prior probability function Pe as follows: µe(t′|Si(t)) = Pe(t′|Si⁻¹(Si(t))), where Si⁻¹(m) is the set of states in which a speaker who uses strategy Si uses m. Because the speaker knows in which situation he is, for him this is the same as the actual payoff: EUS(t, Si, Hj) = uS(t, Hj(Si(t))). For the hearer, however, it is not the same, because if the speaker uses strategy S3 or S4 she still doesn't know what the actual state is. Let us assume for concreteness that PH(t1) = 1/3, and thus that PH(t2) = 2/3. This then gives rise to Table 1.9, in which the equilibria of the games played in the two situations are marked with an asterisk.
Table 1.9: Cheap talk game: asymmetric information

t1:
          H1            H2            H3            H4
  S1   (1 ; 1)*      (0 ; 0)       (1 ; 1)*      (0 ; 0)
  S2   (0 ; 0)       (1 ; 1)*      (1 ; 1)*      (0 ; 0)
  S3   (1 ; 1/3)     (0 ; 2/3)     (1 ; 1/3)     (0 ; 2/3)*
  S4   (0 ; 2/3)     (1 ; 1/3)     (1 ; 1/3)     (0 ; 2/3)*

t2:
          H1            H2            H3            H4
  S1   (1 ; 1)*      (0 ; 0)       (0 ; 0)       (1 ; 1)*
  S2   (0 ; 0)       (1 ; 1)*      (0 ; 0)       (1 ; 1)*
  S3   (0 ; 1/3)     (1 ; 2/3)*    (0 ; 1/3)     (1 ; 2/3)*
  S4   (1 ; 2/3)*    (0 ; 1/3)     (0 ; 1/3)     (1 ; 2/3)*

(* marks the Nash equilibria of the game played in that situation)
We say that strategy combination ⟨S, H⟩ is a Nash equilibrium of the whole game iff ⟨S, H⟩ is a Nash equilibrium in both situations (see Definition 3). Thus, we see that we have four equilibria: ⟨S1, H1⟩, ⟨S2, H2⟩, ⟨S3, H4⟩ and ⟨S4, H4⟩. These equilibria can be pictured as in Figure 1.6.

2.1.2 A strategic reformulation and a warning
Above we have analyzed the game as a sequential one where the hearer didn’t know in which situation she was. Technically, we have analyzed the game as a sequential game with asymmetric information: the speaker knows more than the hearer. However, it is also possible to analyze it as a standard strategic game in which this asymmetry of information no longer plays a role.3 If we analyze the game in this way, however, we have to change two things: (i) we don’t give separate payoff tables anymore for the different situations; and, as a consequence, (ii) we don’t look at (expected) utilities anymore of strategy combinations in a particular state, but have to
Figure 1.6 [the four equilibria depicted as situation–message–action mappings: Equilibrium 1: t1 → m1 → a1 and t2 → m2 → a2; Equilibrium 2: t1 → m2 → a1 and t2 → m1 → a2; Equilibrium 3: both situations → m1, both messages → a2; Equilibrium 4: both situations → m2, both messages → a2]
look at expectations with respect to a common prior probability function ρ. Before we do this for signaling games, let us first show how a simpler game of asymmetric information can be turned into a strategic one with symmetric information. Suppose that the Row player knows which situation he is in, but the Column player does not. The reason might be that Column doesn't know what the preferences of Row are. Column thinks that the preferences are as in t1 or as in t2. Notice that in both situations the preferences of Column are the same, which represents the fact that Row knows what the preferences of Column are.
Table 1.10: Simple game of asymmetric information

t1:
          c1           c2
  r1   (2 ; 1)      (3 ; 0)
  r2   (0 ; −1)     (2 ; 0)

t2:
          c1           c2
  r1   (2 ; 1)      (3 ; 0)
  r2   (3 ; −1)     (5 ; 0)
Notice that in t1, r1 is the dominant action for Row to perform, while in t2 it is r2. Column has to make her choice depending on what she thinks is best. She knows that Row will play r1 in t1 and r2 in t2 (assuming that Row is rational), but she doesn't know what the actual situation is. However, she has some beliefs about this. Let us assume that
PCol(t1) is the probability with which Column thinks that t1 is the actual situation (and thus that PCol(t2) = 1 − PCol(t1)). Then Column will try to maximize her expected utility. Thus, she will choose c1 in case EUCol(c1) > EUCol(c2), and c2 otherwise (in case EUCol(c1) = EUCol(c2) she doesn't care). Notice that EUCol(c1) > EUCol(c2) if and only if [(PCol(t1) × 1) + (PCol(t2) × (−1))] > [(PCol(t1) × 0) + (PCol(t2) × 0)], i.e. if and only if PCol(t1) > PCol(t2). Thus, using the Nash equilibrium solution concept we predict that Row will play r1 in situation t1 and r2 in situation t2, and that Column will play c1 if PCol(t1) ≥ 1/2 and c2 if PCol(t1) ≤ 1/2. We can also characterize the Nash equilibria of the game as follows: ⟨(r1, r2), c1⟩ iff PCol(t1) ≥ 1/2 and ⟨(r1, r2), c2⟩ iff PCol(t1) ≤ 1/2, where ⟨(r1, r2), c1⟩, for instance, means that Row will play r1 in situation t1 and r2 in situation t2. This game of asymmetric information can also be studied as a standard strategic game of symmetric, or complete, information, if we assume that the participants of the game can represent their beliefs in terms of a common prior probability function, ρ, and that Row has obtained his complete information via updating this prior probability function with some extra information by means of standard conditionalization (Harsanyi 1967–1968). In our case we have to assume that the common prior probability function ρ is just the same as Column's probability function P, and that Row has received some information X (in state t1, for instance) such that after conditionalizing ρ on this information, the new probability function assigns the value 1 to t1, i.e. ρ(t1|X) = 1. If we analyze our example with respect to a common prior probability function, the Row player also has to pick his strategy before he finds out in which situation the game is being played, and we also have to define his payoff function with respect to the prior probability function ρ = PCol. But this means that the actions, or strategies, of the Row player are now functions that tell him what to do in each situation. In the game described above Row now has to choose between the following four strategies:

r11 = r1 in t1 and r1 in t2        r12 = r1 in t1 and r2 in t2
r21 = r2 in t1 and r1 in t2        r22 = r2 in t1 and r2 in t2

This gives rise to the strategic game from Table 1.11 with respect to the common prior probability function ρ that assigns the value 2/3 to t1. Now we see that ⟨r12, c1⟩ is the Nash equilibrium. In the game of asymmetric information, this equilibrium corresponds to the equilibrium ⟨(r1, r2), c1⟩, which is exactly the one we found above, given that PCol(t1) = 2/3 ≥ 1/2. So, we have seen that our game of asymmetric information gives rise to the same equilibrium as its reformulation as a strategic game with symmetric information.
Table 1.11: Strategic reformulation of simple game

            c1                c2
  r11    (2 ; 1)          (3 ; 0)
  r12    (7/3 ; 1/3)      (11/3 ; 0)
  r21    (2/3 ; −1/3)     (7/3 ; 0)
  r22    (1 ; −1)         (3 ; 0)

(payoffs computed with ρ(t1) = 2/3 and ρ(t2) = 1/3)
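The ex-ante payoffs of Table 1.11 follow from Table 1.10 and the prior by a short computation. Here is a minimal sketch (Python, our own illustration; the helper name ex_ante is ours) that reproduces the entries and shows why ⟨r12, c1⟩ is the equilibrium of the reformulated game.

```python
# A minimal sketch, assuming Python: the ex-ante payoffs of Table 1.11,
# computed from Table 1.10 with the common prior rho(t1) = 2/3.

rho = {"t1": 2/3, "t2": 1/3}

# Table 1.10: payoffs[state][(row_action, col_action)] = (u_row, u_col)
payoffs = {
    "t1": {("r1", "c1"): (2, 1), ("r1", "c2"): (3, 0),
           ("r2", "c1"): (0, -1), ("r2", "c2"): (2, 0)},
    "t2": {("r1", "c1"): (2, 1), ("r1", "c2"): (3, 0),
           ("r2", "c1"): (3, -1), ("r2", "c2"): (5, 0)},
}

# Row's strategies are functions from states to actions, e.g. r12 = r1 in t1, r2 in t2.
row_strategies = {"r11": {"t1": "r1", "t2": "r1"}, "r12": {"t1": "r1", "t2": "r2"},
                  "r21": {"t1": "r2", "t2": "r1"}, "r22": {"t1": "r2", "t2": "r2"}}

def ex_ante(strategy, col_action):
    u_row = sum(rho[t] * payoffs[t][(strategy[t], col_action)][0] for t in rho)
    u_col = sum(rho[t] * payoffs[t][(strategy[t], col_action)][1] for t in rho)
    return round(u_row, 2), round(u_col, 2)

for name, strat in row_strategies.items():
    print(name, ex_ante(strat, "c1"), ex_ante(strat, "c2"))
# r12 against c1 yields (2.33, 0.33): r12 is Row's best reply to c1, and c1 is
# Column's best reply to r12, so <r12, c1> is the Nash equilibrium of Table 1.11.
```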
This property, in fact, is a general one, and not limited to this simple example, but also extends to sequential games. In section 2.1.1 we have introduced signaling games as (simple) sequential games where the speaker has some information that the hearer does not have. Exactly as in the simple example just described, however, we can turn this game into a strategic one with symmetric information if we assume a common prior probability function ρ such that ρ = PH. The payoffs are now determined as follows: ∀a ∈ {S, H}: EUa(Si, Hj) = Σ_{t∈T} ρ(t) × ua(t, Hj(Si(t))). As a result, the game analyzed as one of symmetric information receives the following completely symmetric strategic payoff table:
Table 1.12: Signaling game: strategic reformulation

            H1             H2             H3             H4
  S1    (1 ; 1)        (0 ; 0)        (1/3 ; 1/3)    (2/3 ; 2/3)
  S2    (0 ; 0)        (1 ; 1)        (1/3 ; 1/3)    (2/3 ; 2/3)
  S3    (1/3 ; 1/3)    (2/3 ; 2/3)    (1/3 ; 1/3)    (2/3 ; 2/3)
  S4    (2/3 ; 2/3)    (1/3 ; 1/3)    (1/3 ; 1/3)    (2/3 ; 2/3)
Strategy combination ⟨S, H⟩ is a Nash equilibrium (cf. Definition 3) in this kind of game, given probability function ρ, iff
(i) ∀S′: EUS(S, H) ≥ EUS(S′, H) and (ii) ∀H′: EUH(S, H) ≥ EUH(S, H′). We see that this game has four equilibria, ⟨S1, H1⟩, ⟨S2, H2⟩, ⟨S3, H4⟩ and ⟨S4, H4⟩, which again exactly correspond to the earlier equilibria in the game of asymmetric information. We mentioned already that Lewis (1969) introduced cheap talk games to explain the use and stability of linguistic conventions. More recently, these, or similar, types of games have been used by, among others, Prashant Parikh (Parikh 1991, Parikh 2001), de Jaegher (de Jaegher 2003) and van Rooij (van Rooij 2004) to study some more concrete pragmatic phenomena of language use.4

Until now we have assumed that all Nash equilibria of a cheap talk game are the kind of equilibria we are looking for. In doing so, however, we have passed over a lot of work in economics discussing refinements of equilibria. In order to look at games from a more fine-grained point of view, we have to look at the game as a sequential one again. Remember that when we analyzed cheap talk games from a sequential point of view, we said that ⟨Si, Hj⟩ is a Nash equilibrium with respect to probability functions Pa if for each t, Sk and Hm:

EUS(t, Si, Hj) ≥ EUS(t, Sk, Hj) and EUH(t, Si, Hj) ≥ EUH(t, Si, Hm)

For agents a, we defined EUa(t, Si, Hj) as Σ_{t′∈T} µa(t′|Si(t)) × Ua(t′, Hj(Si(t′))), where µa(t′|Si(t)) was defined in terms of the agents' prior probability functions Pa and standard conditionalization as follows: µa(t′|Si(t)) = Pa(t′|Si⁻¹(Si(t))). On this analysis, the equilibrium is the same as the one used in the strategic game. However, assuming a conditional probability function µ to be defined in this way gives rise, in many games, to counterintuitive equilibria. In the context of cheap talk games, all of them have to do with how the hearer would react to a message that is not sent in the equilibrium play of the game, i.e., not sent when the speaker uses the strategy Si that is part of the equilibrium. Notice that if m is such a message, µH(t|m) can take any value in [0, 1] according to the above definition of the equilibrium, because the conditional probability function is not defined then. To give a very simple (though, admittedly, somewhat unnatural) illustration of the kind of problems that might arise, suppose we have a signaling game with T = {t1, t2} and M = {m, ε} as in Table 1.13. Let us also assume that m is a message with a fully underspecified
meaning, but that, for some reason, the speaker could use ε only in situation t2, although the hearer still might react in any possible way. Then we have the following strategies:
Table 1.13: Game where standard conditionalization is not good enough

         S1    S2                   H1    H2    H3    H4
  t1     m     m               m    a1    a1    a2    a2
  t2     ε     m               ε    a1    a2    a1    a2
Now let us again assume for all a ∈ {S, H} that Ua(t, H(S(t))) = 1 if H(S(t)) = f(t), and 0 otherwise, with f defined as in section 2.1.1. Now we see that if PH(t2) > 1/2, as before, we have the following equilibria: ⟨S1, H2⟩, ⟨S2, H3⟩, and ⟨S2, H4⟩. The first one is independent of the probability function, while the latter two hold if we assume that µH(t1|ε) ≥ 1/2 and µH(t2|ε) ≥ 1/2, respectively. The equilibrium ⟨S2, H4⟩ seems natural, because ε is interpreted by the receiver as it might be used by the sender. Equilibrium ⟨S2, H3⟩, however, is completely unnatural, mainly because if PH(t2) > 1/2, it seems quite unnatural to assume that µH(t1|ε) ≥ 1/2 is the case. To account for much more complicated examples, Kreps and Wilson (1982) propose that instead of defining µH(t|m) in terms of a speaker strategy S and prior probability function PH, the conditionalization requirement is only a condition on what µH(t|m) should be in case m is used according to the speaker's strategy. This leaves room for an extra constraint on what µH(t|m) should be in case m is not used by the speaker's strategy. The extra condition Kreps and Wilson propose is their consistency condition for beliefs at information sets that are not reached in equilibrium. It says, roughly speaking, that for each situation ti and message mi that is not sent according to the speaker's strategy, the posterior probability of ti at the information state that results after mi would be sent, i.e. µH(ti|mi), should (in simple cases) be as close as possible to the prior probability of ti. In our case this means that, although ε is not sent if the speaker uses strategy S2, we should still give µH(t1|ε) and µH(t2|ε) particular values, namely µH(t1|ε) = PH(t1) < 1/2 and µH(t2|ε) = PH(t2) > 1/2. But if the beliefs are like this, the Nash equilibrium ⟨S2, H3⟩ ceases to be an equilibrium once this consistency requirement is taken into account, and we have explained why it is unnatural.
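The argument against ⟨S2, H3⟩ can be checked in a few lines. The following minimal sketch (Python, our own illustration) computes the hearer's expected utility of each action after the unused message ε when her off-path belief is simply the prior, as the consistency condition requires in this simple case.

```python
# A minimal sketch, assuming Python: why <S2, H3> is unnatural. Under S2 the
# message epsilon is never sent; if the belief after epsilon must nevertheless
# equal the prior, then answering epsilon with a1 is not optimal for the hearer.

PH = {"t1": 1/3, "t2": 2/3}          # prior, with PH(t2) > 1/2
f = {"t1": "a1", "t2": "a2"}

def u(t, a):
    return 1 if a == f[t] else 0

# Hearer's expected utility of each action after epsilon, with beliefs = prior:
for a in ("a1", "a2"):
    print(a, sum(PH[t] * u(t, a) for t in PH))
# a1 -> 1/3, a2 -> 2/3: the hearer would play a2 after epsilon, so H3 (which
# answers epsilon with a1) is no longer a best reply, and <S2, H3> drops out.
```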
In this introduction we won’t bother anymore with the above mentioned refinements, so we might as well think of the game from a strategic point of view. 2.2
The quantity of information transmission
Notice that in the first two equilibria of the cheap talk game described in section 2.1.1 there is a 1-1-1 correspondence between situations, messages and actions, and it is natural to say that if speaker and hearer coordinate on the first equilibrium, the speaker uses message mi to indicate that he is in situation ti and wants the hearer to perform action ai. As a natural special case we can think of the actions as ones that interpret the messages. In these cases we can identify the set of actions, A (of the hearer), with the set of states, T.5 In that case it is most natural to think of the set of states as the set of meanings that the speaker wants to express, and to say that in the first two equilibria there exists a 1-1 correspondence between meanings and messages. One of the central insights of Lewis's (1969) work on conventions was that the meaning of a message can be defined in terms of the game-theoretical notion of an equilibrium in a signaling game: messages that don't have a pre-existing meaning acquire such a meaning through the equilibrium play of the game. Lewis (1969) proposes (at least when ignoring context-dependence) that all and only the equilibria where there exists such a 1-1 correspondence, which he calls signaling systems, are appropriate candidates for being a conventional solution for communicating information in a signaling game. In contrast to the first two equilibria, no information transmission is going on in equilibria 3 and 4. Still, they count as equilibria: in both of them the hearer is justified in ignoring the message being sent and always plays the action which has the highest expected utility, i.e. action a2, if the speaker sends the same message in every situation; and the speaker has no incentive to send different messages in different states if the hearer ignores the message and always performs the same action. Thus, we see that in different equilibria of a cheap talk communication game, different amounts of information can be transmitted. But for cheap talk to allow for informative communication at all, a speaker must have different preferences over the hearer's actions when he is in different states. Likewise, the hearer must prefer different actions depending on what the actual situation is (talk is useless if the hearer's preferences over actions are independent of what the actual situation is). Finally, the hearer's preferences over actions must not be completely opposed to those of the speaker. These three conditions are obviously guaranteed if we assume, as in the example above, that there is perfect alignment of preferences between speaker and hearer, and that for both speaker and hearer there is a one-to-one relation between states and
optimal actions (in those states). In general, however, we don't want such an idealistic assumption. This gives rise to the question of how informative cheap talk can be. That is, how precisely can and will the speaker reveal the true situation if talk is cheap? We will address this issue in a moment, but first we would like to say something about a refinement of the Nash equilibrium concept in sequential games that sometimes helps us to eliminate some counterintuitive equilibria. As mentioned above, the main question asked in cheap talk games is how much information can be transmitted, given the preferences of the different agents. In an important article, Crawford and Sobel (1982) show that the amount of credible communication in these games depends on how far the preferences of the participants are aligned. To illustrate, assume that the state, message and action spaces are continuous and lie in the interval from zero to one. Thus, T = [0, 1]; the message space is the type space (M = T), i.e., M = [0, 1]; and the action space is also the interval [0, 1]. Now, following Gibson (1992), we can construct as a special case of their model the following quadratic utility functions for speaker and hearer, such that there is a single parameter, b > 0, that measures how closely the preferences of the two players are aligned:
UH(t, a) = −(a − t)²
US(t, a) = −[a − (t + b)]²
Now, when the actual situation is t, the hearer's optimal action is a = t, but the speaker's optimal action is a = t + b. Thus, in different situations the speaker has different preferences over the hearer's actions (in 'higher' situations speakers prefer higher actions), and the interests of the players are more closely aligned the closer b comes to 0. Crawford and Sobel (1982) show that in such games all equilibria are partition equilibria; i.e., the set of situations T can be partitioned into a finite number of intervals such that senders in states belonging to the same interval send a common message and receive the same action. Moreover, they show that the amount of information revealed in equilibrium increases as the preferences of the speaker and the hearer become more aligned. That is, as parameter b approaches 0, there exist equilibria in which the speaker tells ever more precisely which situation he is in, and thus more communication is possible. However, when parameter b has the value 1, the preferences of speaker and hearer come apart maximally: the speaker then most prefers action a = 1 and most disprefers action a = 0 whatever the situation he is in, whereas a hearer who learns that the situation is t = 0 most prefers action a = 0 and most dislikes action a = 1. As a result, no true information exchange will take place if b = 1, i.e., if the preferences are completely opposed.
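For readers who want to see a partition equilibrium concretely, here is a minimal sketch (Python, our own illustration). It assumes, in addition to what the text states, a uniform prior on T = [0, 1]; under that assumption a two-interval equilibrium with cutpoint x induces the hearer actions x/2 and (x + 1)/2, and the boundary type x must be indifferent between them. The numerical search below recovers the standard cutpoint x = 1/2 − 2b for this special case.

```python
# A minimal sketch, assuming Python and a uniform prior on [0, 1]:
# the cutpoint of a two-interval Crawford-Sobel partition equilibrium
# for the quadratic speaker utility U_S(t, a) = -(a - (t + b))^2.

def speaker_loss(t, a, b):
    return -((a - (t + b)) ** 2)

def two_interval_cutpoint(b, grid=10000):
    # find x where the boundary type x is indifferent between the two actions
    best_x, best_gap = None, float("inf")
    for i in range(1, grid):
        x = i / grid
        a1, a2 = x / 2, (x + 1) / 2          # hearer's best replies to [0, x] and [x, 1]
        gap = abs(speaker_loss(x, a1, b) - speaker_loss(x, a2, b))
        if gap < best_gap:
            best_x, best_gap = x, gap
    return best_x

for b in (0.05, 0.1, 0.2):
    print(b, round(two_interval_cutpoint(b), 3))
# As b grows the low interval shrinks (x = 1/2 - 2b); for b >= 1/4 only the
# uninformative ("babbling") equilibrium of this two-interval kind remains.
```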
To establish the fact proved by Crawford and Sobel, no mention was made of any externally given meaning associated with the messages. What happens if we assume that these messages in fact do have an externally given meaning, taken to be sets of situations? Thus, what happens when we adopt an externally given interpretation function [·] that assigns to every m ∈ M a subset of T? The interesting question is now not whether the game has equilibria in which we can associate meanings with the messages, but rather whether there exist equilibria where the messages are sent in a credible way. That is, are there equilibria where a speaker sends a message with meaning {ti} if and only if he is in state ti? As it turns out, the old question concerning informative equilibria in signaling games without pre-existing meaning and the new one concerning credible equilibria in signaling games with messages with pre-existing meaning are closely related. Consider a two-situation, two-action game with the following utility table. ('tH' and 'tL' are mnemonic for 'high type' and 'low type', which mirror the preferences of the receiver.)
Table 1.14: Two-situation, two-action game

          aH          aL
  tH   (1 ; 1)     (0 ; 0)
  tL   (1 ; 0)     (0 ; 1)
In this game, the informed sender prefers the column player to choose aH irrespective of the situation he is in, while the column player wants to play aH if and only if the sender is in situation tH. Now assume that the expected utility for the hearer of performing aH is higher than that of aL (because P(tH) > P(tL)). In that case, speakers in both situations have an incentive to send the message that conventionally expresses {tH}. But this means that in this game a speaker in tL has an incentive to lie, and thus that the hearer cannot take the message to be a credible indication that the speaker is in situation tH, even if the speaker actually was in that situation. Farrell (1988, 1993) and Rabin (1990) discussed conditions under which messages with a pre-existing meaning can be used to credibly transmit information. They show that this is possible by requiring that the hearer believes what the speaker says if it is in the latter's interest to speak the truth. The paper of Stalnaker in this volume explores the connection between this work on credible information transmission and Gricean pragmatics. We have indicated above that the assumption that messages have an externally given pre-existing meaning doesn't have much effect on the
equilibria of cheap talk games, or on Crawford and Sobel's (1982) result on the amount of possible communication in such games. This holds at least if no requirements are made on what should be believed by the hearer, and if no constraints are imposed on what kinds of meanings can be expressed in particular situations, for instance if no requirements like ∀t ∈ T: t ∈ [S(t)], saying that speakers have to tell the truth, are put upon speakers' strategies S.

2.3 Verifiable communication with a sceptical audience
What happens if we do put extra constraints upon what can and what cannot be said? As it turns out, this opens up many new possibilities of credible communication. In fact, Lipman and Seppi (1995) (summarized in Lipman 2003) have shown that with such extra constraints, interesting forms of reliable information transmission can be predicted in games where you expect it the least: in debates between agents with opposing preferences. Before we look at debates, however, let us first consider cheap talk games when we assume that the signals used come with a pre-existing meaning and, moreover, that speakers always tell the truth. This still doesn't guarantee that language cannot be used to mislead one's audience. Consider again the two-situation two-action game described above, but now assume in addition that we demand that the speaker speaks the truth: ti ∈ [S(ti)]. The rational message for an individual in the 'high' situation to send is still one that conventionally expresses {tH}, but an individual in the 'low' situation now has an incentive to send a message with meaning {tH, tL}. If the hearer is naive she will choose aH after hearing the signal that expresses {tH, tL}, because aH has the highest expected utility. A more sceptical hearer, however, will argue that a speaker who sends a message with meaning {tH, tL} must be one who is in a 'low' situation, because otherwise the speaker could, and thus should (in his own best interest), have sent a message with meaning {tH}. Thus, this sceptical hearer will reason that the speaker was in fact in a low-type situation and interprets the message as {tL}. Indeed, this game has an equilibrium where the speaker and hearer act as described above. In general, suppose that the speaker has the following preference relation over a set of 10 situations: t1 < t2 < ... < t10 (meaning that t1 is the worst situation) and sends a message m with pre-existing meaning [m]. A sceptical hearer would then assign to m the following pragmatic interpretation S(m), based on the speaker's preference relation '<', on the assumption that the speaker knows which situation he is in:
S(m) = {t ∈ [m] | ¬∃t′ ∈ [m]: t′ < t}
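The rule S(m) is easy to state as code. Here is a minimal sketch (Python, our own illustration; the function name sceptical_interpretation is ours) over the ten states t1 < ... < t10.

```python
# A minimal sketch, assuming Python: the sceptical interpretation rule S(m),
# which keeps only the states in [m] that have no strictly worse state in [m].

def sceptical_interpretation(meaning, worse_than):
    return {t for t in meaning if not any(worse_than(tp, t) for tp in meaning)}

states = [f"t{i}" for i in range(1, 11)]
rank = {t: i for i, t in enumerate(states)}          # t1 is the worst situation
worse_than = lambda tp, t: rank[tp] < rank[t]

print(sceptical_interpretation({"t4", "t7", "t9"}, worse_than))   # {'t4'}
print(sceptical_interpretation({"t10"}, worse_than))              # {'t10'}
# A vague message is read as its worst (for the speaker) compatible state.
```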
This pragmatic interpretation rule is based on the assumption that the speaker gives as much information as he can that is useful to him, and that
the hearer anticipates this speaker’s maxim (to be only unspecific with respect to more desirable states) by being sceptical when the speaker gives a message with a relatively uninformative meaning.6 Now consider debates in which the preferences of the participants are mutually opposed. Suppose that debaters 1 and 2 are two such players who both know the true state. Now, however, there is also a third person, the observer, who doesn’t. Both debaters present evidence to the observer, who then chooses an action a ∈ A which affects the payoffs of all of them. We assume that the observer’s optimal action depends on the state, but that the preferences of the debaters do not. In fact, we assume that the preferences of debaters 1 and 2 are strictly opposed: in particular, if debater 1 prefers state ti above all others, ti is also the state that debater 2 dislikes most. By assuming that the utility functions of all three participants are of type Uj (t, a), we again assume that the message being used is not directly payoff relevant, just as in cheap talk games. We assume that each debater can send a message. Let us assume that S denotes the strategy of debater 1, and R the strategy of debater 2. In contrast to the cheap talk games discussed above, we now crucially assume that the messages have an externally given meaning given by interpretation function [·]. Let us first assume that while debater 1 can make very precise statements, i.e., that a particular state t holds, debater 2 can only make very uninformative statements saying that a particular state is not the case. Let us assume for concreteness that T = {t1 , ..., t10 }. Then the ‘meaning’ of S(ti ), [S(ti )] can consist of one state, {tj }, while the meaning of R(ti ), [R(ti )], always consists of 9 states. Thus, debater 1 can be much more informative about the true state, and is thus in the advantage. But debater 2 has an advantage over debater 1 as well: in contrast to what is known about debater 1, it is commonly known of debater 2 that she is reliable and will only make true statements. Thus, for all ti ∈ T : ti ∈ [R(ti )], while it might be that ti 6∈ [S(ti )]. Suppose now that the observer may ask two statements of the players. The question is, how much information can the observer acquire? One is tempted to think that the messages cannot really give a lot of information: debater 1 has no incentive to tell the truth, so acquiring two messages from him is completely uninformative. Debater 2 will provide true information, but the informative value of her messages is very low: after two messages from her the observer still doesn’t know which of the remaining 8 states is the true one. Surprisingly enough, however, Lipman and Seppi (1995) show that the observer can organize the debate such that after two rounds of communication, he knows for certain which state actually obtains. The trick is the following: the observer first promises, or warns, debater 1 that in case he
finds out that the latter has not given a truthful message, he will punish debater 1 by choosing the action that is worst for him. This is possible because it is common knowledge what the agents prefer. For concreteness, assume that debater 1 has the following preferences: t10 > t9 > ... > t1. The observer then first asks debater 1 which state holds, and afterwards asks debater 2 to make a statement. Suppose that the first debater makes a very informative statement of the form 'State ti is the true state'. Obviously, debater 2 will refute this claim if it is false, for in that case the observer will as a result choose the state most unfavorable to debater 1, and thus most favorable to debater 2, i.e. t1. Thus, if he is precise, debater 1 has an incentive to tell the true state, and the observer will thus learn exactly which state is the true one. Suppose that the true state is the one most undesirable for debater 1, t1. So, or so it seems, he has every reason to be vague. Assume that debater 1 makes a vague statement with meaning {ti, ..., tn}. But being vague now doesn't help: if the true state is ruled out by this vague meaning, debater 2 will claim that (even) the least preferred state in it is not true; and if debater 2 doesn't refute debater 1's claim in this way, the observer will choose the most unfavorable state for debater 1 compatible with the true message with meaning {ti, ..., tn}. In general, if debater 1's message m has meaning [m], and if m is not refuted, then the observer will 'pragmatically' interpret m as {t ∈ [m] | ¬∃t′ ∈ [m]: t′ < t}, where 't′ < t' means that debater 1 (strictly) prefers t to t′. Notice that this is exactly the pragmatic interpretation rule S(m) described above. From a signaling game perspective, this just means that the game has a completely separating equilibrium: whatever the true state is, it is never in the interest of debater 1 not to say that this is indeed the true state. The example discussed above is but a simple, special case of the circumstances characterized by Lipman and Seppi (1995) in which observers can 'force' debaters to provide precise and adequate information, even though they have mutually conflicting preferences. The discussion in this section shows that truthful information transmission is possible in situations in which the preferences of the conversational participants are mutually opposed. This seems to be in direct conflict with the conclusion reached in section 2.2, where it was stated that credible information transmission is impossible in such circumstances. However, this conflict is not real: on the one hand, a central assumption in cheap talk games is that talk is really cheap: one can say what one wants because the messages are not verifiable. The possibility of credible information transmission in debates, on the other hand, crucially depends on the assumption that the claims of speakers are verifiable to at least some extent, in other words, that they are falsifiable (by the second debater), and that outside observers can
punish the making of misleading statements. In fact, by taking the possibility of falsification and punishment into account as well, we predict truthful communication also in debates, because the preferences of the agents, which seemed to be opposed, are still very much aligned at a 'deeper' level. This subsection also shows that if a hearer knows the preferences of the speaker and takes him to be well-informed, there exists a natural 'pragmatic' way to interpret the speaker's message, which already has a pre-existing 'semantic' meaning, based on the assumption that speakers are rational and only unspecific, or vague, with respect to situations that are more desirable for them.

2.4 Debates and pragmatics
It is well established that a speaker in a typical conversational situation communicates more by the use of a sentence than just its conventional truth-conditional meaning. Truth-conditional meaning is enriched with what is conversationally implicated by the use of a sentence. In pragmatics – the study of language use – it is standard to assume that this way of enriching conventional meaning is possible because we assume speakers to conform to Grice's (1967) cooperative principle, the principle that assumes speakers to be rational cooperative language users. This view on language use suggests that the paradigmatic discourse situation is one of cooperative information exchange. Merin (1999b) has recently argued that this view is false, and hypothesized that discourse situations are paradigmatically ones of explicit or tacit debate. He bases this hypothesis on the work of Ducrot (1973) and Anscombre and Ducrot (1983), where it is strongly suggested that some phenomena troublesome for Gricean pragmatics can be analyzed more successfully when we assume language users to have an argumentative orientation. In the sequel we will sketch some of Ducrot's arguments for such an alternative view on communication, and we will describe Merin's analysis of some implicatures which are taken to be troublesome for a cooperative view on language use.

Adversary connectives

The connective but is standardly assumed to have the same truth-conditional meaning as and. Obviously, however, they are used differently. This difference is accounted for within pragmatics. It is normally claimed that 'A and B' and 'A but B' give rise to different conventional implicatures, or appropriateness conditions. On the basis of sentences like (1) it is normally (e.g. Frege 1918) assumed that sentences of the form 'A but B' are appropriate if B is unexpected given A.

(1) John is tall but no good at baseball.
This, however, cannot be enough: it cannot explain why the following sentence is odd:

(2) John walks but today I won the jackpot.

Neither can it explain why the following sentence is fine, even though expensive restaurants are normally good:

(3) This restaurant is expensive, but good.

Ducrot (1973), Anscombre and Ducrot (1983) and Merin (1999a,b) argue that sentences of the form 'A but B' are always used argumentatively, where A and B are arguments for complementary conclusions: they are contrastive in a rhetorical sense. For instance, the first and second conjunct of (3) argue in favor of not going and of going to the restaurant, respectively. Not only sentences with but, but also other constructions can be used to express a relation of rhetorical contrast (cf. Horn 1991). These include complex sentences with while, even if, or may in the first clause (i.e. concession), and/or still, at least, or nonetheless either in place of or in addition to the but of the second clause (i.e. affirmation):

(4) a. While she won by a {small, *large} margin, she did win.
    b. Even if I have only three friends, at least I have three.
    c. He may be a professor, he is still an idiot.

Anscombre and Ducrot (1983) argue that rhetorical contrast is not all there is to the appropriateness of sentences like (3) or (4a)-(4c). The second conjunct should also be an argument in favor of the conclusion H that the speaker wants to argue for. And, if possible, it should be a stronger argument for H than the first conjunct is for the complementary conclusion. In terms of Merin's notion of relevance discussed in section 1.1 this means that the conjunction 'A but B' is appropriate only if rH(A) < 0, rH(B) > 0 and rH(A ∧ B) > 0. In this way it can also be explained why (3), for instance, can naturally be followed by H = 'You should go to that restaurant', while this is not a good continuation of

(5) This restaurant is good, but expensive,

which is most naturally followed by H = 'You should not go to that restaurant'. There is another problem for a standard Gricean approach to connectives like but that can be solved by taking an argumentative perspective (cf. Horn 1991). It seems a natural rule of cooperative conversation not to use a conjunctive sentence where the second conjunct is entailed by the first. Thus, (6) is inappropriate:
(6) *She won by a small margin, and win she did.

However, even though but is standardly assumed to have the same truth-conditional meaning as and, if we substitute and in (6) by but, the sentence becomes perfectly acceptable:

(7) She won by a small margin, but win she did.

If – as assumed by standard Gricean pragmatics – only truth-conditional meaning is taken as input for pragmatic reasoning, it is not easy to see how this contrast can be accounted for. By adopting Ducrot's hypothesis that, in contrast to (6), the conjuncts in (7) have to be rhetorically opposed, the distinction between the two examples can be explained easily: if a speaker is engaged in a debate with somebody who argued that Mrs. X has a relative lack of popular mandate, she can use (7), but not (6). Merin's (1999a) formalization also allows him to explain in a formally rigorous manner why (7) 'She won by a small margin, but win she did' can be appropriate, although the second conjunct is entailed by the first. The possibility of explaining sentences like (7) depends on the fact that even if B is entailed by A, A |= B, it is still very well possible that there are H and probability functions in terms of which r·(·) is defined such that rH(A) < rH(B). Thus, the notion of relevance used by Merin does not increase with respect to the entailment relation. It appears, then, that an argumentative view on language use can account for certain linguistic facts for which a non-argumentative view seems problematic.

Scalar reasoning

Anscombre and Ducrot (1983) and Merin (1999b) argue that to account for so-called 'scalar implicatures' an argumentative view is required as well. Scalar implicatures are normally claimed to be based on Grice's maxim of quantity: the requirement to give as much information as is required for the current purposes of the exchange. On its standard implementation, this gives rise to the principle that everything 'higher' on a scale than what is said is false, where the ordering on the scales is defined in terms of informativity. Standardly, scales are taken to be of the form ⟨P(k), ..., P(m)⟩, where P is a simple predicate (e.g. Mary has x children) and, for each P(i) higher on the scale than P(j), the former must be more informative than the latter. From the assertion that P(j) is true we then conclude by scalar implicature that P(i) is false. For instance, if Mary says that she has two children, we (by default) conclude that she doesn't have three children, because otherwise she could and should have said so (if the number of children is under discussion). Other examples are scales
like ⟨A ∧ B, A ∨ B⟩: from the claim that John or Mary will come, we are normally allowed to conclude that they will not both come. Unfortunately, as observed by Fauconnier (1975), Hirschberg (1985) and others, we see inferences from what is not said to what is false that are very similar to the ones above, but where what is concluded to be false is not more informative than, and does not entail, what is actually said. For instance, if Mary answers at her job interview the question whether she speaks French by saying that her husband does, we conclude that she doesn't speak French herself, although this is not semantically entailed by Mary's answer. Such scalar inferences are, according to Anscombre and Ducrot (1983), best accounted for in terms of an argumentative view on language: Mary wants to have the job, and for that it would be more useful that she herself speaks French than that her husband does. The ordering between propositions should not be defined in terms of informativity, or entailment, but rather in terms of argumentative force. Thus, from Mary's claim that her husband speaks French we conclude that the proposition which has a higher argumentative value, i.e., that Mary speaks French herself, is false. It would be obvious how to account for this in terms of the relevance function used by Merin: assume that H is the proposition 'Mary gets the job'. Perhaps surprisingly, this natural reasoning schema is not adopted in Merin (1999b). In fact, he doesn't want to account for conversational implicatures in terms of the standard principle that everything is false that the speaker didn't say, but could have said. Instead, he proposes to derive scalar implicatures from the assumption that conversation is always a game in which the preferences of the agents are diametrically opposed. From this view on communication, it follows that assertions and concessions have an 'at least' and an 'at most' interpretation, respectively:

    if a proponent, Pro, makes a claim, Pro won't object to the respondent, Con, conceding more, i.e. a windfall to Pro, but will mind getting less. Con, in turn, won't mind giving away less than conceded, but will mind giving away more. Put simply: claims are such as to engender intuitions glossable 'at least'; concessions, dually, 'at most'. (Merin 1999b, p. 191)

This intuition is formalized in terms of Merin's definition of relevance cones, defined with respect to contexts represented as ⟨P, H⟩ (I minimally changed Merin's (1999b) actual definition 8 on page 197.)

Definition 5 The upward (relevance) cone ≥S φ of an element φ of a subset S ⊆ F of propositions in context ⟨P, H⟩ is the union of propositions in S that are at least as relevant to H with respect to P as φ is. The downward (relevance) cone ≤S φ of φ in context ⟨P, H⟩ is, dually, the union of S-propositions at most as relevant to H with respect to P as φ is.
On the basis of his view of communication as a (bargaining) game with opposing preferences, Merin hypothesizes that while the upward cone of a proposition represents Pro's claim, the downward cone represents Con's default expected compatible counterclaim (i.e., concession). Net meaning, then, is proposed to be the intersection of Pro's claim and Con's counterclaim: ≥_S φ ∩ ≤_S φ, the intersection of what is asserted with what is conversationally implicated. Now consider the particularized scalar implicature that arises when Mary, at her job interview, answers the question whether she speaks French by saying that her husband does. As suggested above, the goal proposition, H, now is that Mary gets the job. Naturally, the proposition A = [Mary speaks French] has a higher relevance than the proposition B = [Mary's husband speaks French]. The net meaning of Mary's actual answer is claimed to be ≥_S B ∩ ≤_S B. This gives rise to an appropriate result if we rule out that B ∈ S. This could be done if we assume that S itself partitions the state space (in fact, this is what Merin normally assumes). Presumably, this partition is induced by a question like Who speaks French? On this assumption it indeed follows that the elements of the partition compatible with A = [Mary speaks French] are not compatible with the downward cone of B, and thus are ruled out correctly.

As we already indicated above, Merin's analysis of conversational implicatures – which assumes that conversation is a bargaining game – is not the only one possible, and perhaps not the most natural one either. In section 2.2 we saw that it makes a lot of sense to assume that (truthful) speakers say as much as they can about situations that are desirable for them. If the speaker is taken to be well-informed and knowledgeable about the relevant facts, we can conclude that what speakers do not say about desirable situations is, in fact, not true. We formulated a 'pragmatic' interpretation rule for sceptical hearers who, following this reasoning, have to 'decode' the message and hypothesize what kind of situation the speaker is in. Now consider again the example that seemed similar to scalar reasoning but could not be treated that way in standard Gricean analyses: the case where Mary, at her job interview, answers the question whether she speaks French by saying that her husband does. Intuitively, this gives rise to the scalar implicature that the 'better' answer, that Mary herself speaks French, is false. As already suggested above, this example cannot be treated as a scalar implicature in the standard implementation of Gricean reasoning because the proposition that Mary speaks French is not more informative than, or does not entail, the proposition that her husband does. But notice that if we assume the scale to be the preference order (between states) of the speaker, we can account for this example in terms of
the pragmatic interpretation rule mentioned earlier. All we have to assume for this analysis to work is that the state where speaker Mary speaks French herself is preferred to one where she does not. Thus, we can account for the particularized conversational implicature that Mary doesn't speak French in terms of the pragmatic interpretation rule described in section 2.2. The pragmatic interpretation rule that we used above is not only relevant for cases where speaker and hearer have opposing preferences (to at least some degree), but is also perfectly applicable in ideal Gricean circumstances where the preferences of the agents are well-aligned.7 Thus, even if neither the Gricean cooperative view on language use nor the alternative argumentative view has universal applicability, this doesn't mean that conversational implicatures cannot still be accounted for by means of a general rule of interpretation.
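Before moving on, the formal claim made above – that r_H(A) < r_H(B) is possible even though A entails B – can be checked with a tiny numerical model. The sketch below is our own illustration: it assumes, purely for concreteness, the simple relevance measure r_H(X) = P(H|X) − P(H) as a stand-in for Merin's relevance function, together with made-up world probabilities for the election example in (6)/(7).

```python
# Worlds are pairs (election outcome, does she have a popular mandate), with made-up probabilities.
P = {('big win', True): 0.35, ('big win', False): 0.05,
     ('small win', True): 0.15, ('small win', False): 0.15,
     ('loss', True): 0.02, ('loss', False): 0.28}

def prob(prop):
    return sum(p for w, p in P.items() if prop(w))

def relevance(prop, hyp):
    """Assumed relevance measure (for illustration only): r_H(A) = P(H | A) - P(H)."""
    return prob(lambda w: prop(w) and hyp(w)) / prob(prop) - prob(hyp)

H = lambda w: w[1]                                  # 'she has a popular mandate'
A = lambda w: w[0] == 'small win'                   # 'she won by a small margin'
B = lambda w: w[0] in ('big win', 'small win')      # 'she won'

# A entails B, yet A is negatively relevant to H while B is positively relevant:
print(round(relevance(A, H), 3), round(relevance(B, H), 3))   # -0.02  0.194
```

On these (invented) numbers, the first conjunct of (7) argues against the popular-mandate hypothesis while the entailed second conjunct argues for it – exactly the kind of rhetorical opposition that licenses but.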
3 Evolutionary game theory

3.1 The evolutionary interpretation of game theory
The classical interpretation of game theory makes very strong idealizations about the rationality of the players. First, it is assumed that every player is logically omniscient. The players are assumed to know all logical theorems and all logical consequences of their non-logical beliefs. Second, they are assumed to always act in their enlightened self-interest (in the sense of utility maximization). Last but not least, for a concept like "Nash equilibrium" to make sense in classical GT, it has to be common knowledge between the players (a) what the utility matrix is like, and (b) that all players are perfectly rational. Each player has to rely on the rationality of the others without doubt, has to rely on the other players relying on his own rationality, etc. These assumptions are somewhat unrealistic, and variants of classical Game Theory that try to model the behavior of real people in a less idealized way are therefore of high relevance. A substantial part of current game-theoretical research is devoted to bounded rationality: versions of GT where the above-mentioned rationality assumptions are weakened. The most radical version of this program is evolutionary game theory (EGT). It builds on the fundamental intuition that in games that are played very often (as for instance dialogues), strategies that lead to a high payoff at a point in time are more likely to be played in subsequent games than less successful strategies. No further assumptions are made about the rationality of the agents. Perhaps surprisingly, the solution concepts of classical GT are not invalidated by this interpretation but only slightly refined and modified. Originally EGT was developed by theoretical biologists, especially John Maynard Smith (cf. Maynard Smith 1982), as a
formalization of the neo-Darwinian concept of evolution via natural selection. It builds on the insight that many interactions between living beings can be considered to be games in the sense of game theory (GT) – every participant has something to win or to lose in the interaction, and the payoff of each participant can depend on the actions of all other participants. In the context of evolutionary biology, the payoff is an increase in fitness, where fitness is basically the expected number of offspring. According to the neo-Darwinian view on evolution, the units of natural selection are not primarily organisms but heritable traits of organisms. If the behavior of organisms, i.e., interactors, in a game-like situation is genetically determined, the strategies can be identified with gene configurations. The evolutionary interpretation of GT is not confined to the biological context though. It is applicable to cultural evolution as well. In this domain, the transmission of strategies is achieved via imitation and learning rather than via DNA copying. If this is applied to language, EGT can thus be used as a tool to model conventionalization formally, and this is of immediate relevance to the interface between pragmatics and grammar in the narrow sense.
3.2 Stability and dynamics

3.2.1 Evolutionary stability
Evolution is frequently conceptualized as a gradual progress towards more complexity and improved adaptation. Everybody has seen pictures displaying a linear ascent leading from algae via plants, fish, dinosaurs, horses, and apes to Neanderthals and finally humans. Evolutionary biologists do not tire of pointing out that this picture is quite misleading. Darwinian evolution means a trajectory towards increased adaptation to the environment, proceeding in small steps. If a local maximum is reached, evolution is basically static. Change may occur if random variations (due to mutations) accumulate so that a population leaves its local optimum and ascends to another local optimum. The fitness landscape itself may change as well – if the environment changes, the former optima may cease to be optimal. Most of the time biological evolution is macroscopically static though. Explaining stability is thus as important a goal for evolutionary theory as explaining change. In the EGT setting, we are dealing with large populations of potential players. Each player is programmed for a certain strategy, and the members of the population play against each other very often under totally random pairings. The payoffs of each encounter are accumulated as fitness, and the average number of offspring per individual is proportional to its accumulated fitness, while the birth rate and death rate are constant. Parents pass
on their strategy to their offspring basically unchanged. Replication is to be thought of as asexual, i.e., each individual has exactly one parent. If a certain strategy yields on average a payoff that is higher than the population average, its replication rate will be higher than average and its proportion within the overall population increases, while strategies with a less-than-average expected payoff decrease in frequency. A strategy mix is stable under replication if the relative proportions of the different strategies within the population do not change under replication. Occasionally replication is unfaithful though, and an offspring is programmed for a different strategy than its parent. If the mutant has a higher expected payoff (in games against members of the incumbent population) than the average of the incumbent population itself, the mutation will eventually spread and possibly drive the incumbent strategies to extinction. For this to happen, the initial number of mutants may be arbitrarily small.8 Conversely, if the mutant does worse than the average incumbent, it will be wiped out and the incumbent strategy mix prevails. A strategy mix is evolutionarily stable if it is resistant against the invasion of small proportions of mutant strategies. In other words, an evolutionarily stable strategy mix has an invasion barrier. If the number of mutant strategies is lower than this barrier, the incumbent strategy mix prevails, while invasions of higher numbers of mutants might still be successful. In the metaphor used here, every player is programmed for a certain strategy, but a population can be mixed and comprise several strategies. Alternatively, we may assume that all individuals are identically programmed, but this program is non-deterministic and plays different strategies according to some probability distribution (which corresponds to the relative frequencies of the pure strategies in the first conceptualization). Following the terminology from section 1, we call such non-deterministic strategies mixed strategies. For the purposes of the evolutionary dynamics of populations, the two models are equivalent. It is standard practice in EGT to talk of an evolutionarily stable strategy, where a strategy can be mixed, instead of an evolutionarily stable strategy mix. We will follow this terminology henceforth. The notion of an evolutionarily stable strategy can be generalized to sets of strategies. A set of strategies A is stationary if a population where all individuals play a strategy from A will never leave A unless mutations occur. A set of strategies is evolutionarily stable if it is resistant against small proportions of non-A mutants. Especially interesting are minimal evolutionarily stable sets, i.e., evolutionarily stable sets which have no evolutionarily stable proper subsets. If the level of mutation is sufficiently small, each population will approach such a minimal evolutionarily stable set.
Maynard Smith (1982) gives a static characterization of evolutionarily stable strategies (ESS), abstracting away from the precise trajectories9 of a population. It turns out that the notion of an ESS is strongly related to the rationalistic notion of a Nash equilibrium (NE) that was introduced earlier, and to its stronger version, the strict Nash equilibrium (SNE). At the present point, we will focus on symmetric games where both players have the same strategies at their disposal, and we only consider profiles where both players play the same strategy. (The distinction between symmetric and asymmetric games will be discussed more thoroughly in the next subsection.) With these adjustments, the definitions from section 1 can be rewritten as

• s is a Nash Equilibrium iff u(s, s) ≥ u(t, s) for all strategies t.
• s is a Strict Nash Equilibrium iff u(s, s) > u(t, s) for all strategies t with t ≠ s.

Are NEs always evolutionarily stable? Consider the well-known zero-sum game Rock-Paper-Scissors (RPS). The two players each have to choose between the three strategies R (rock), P (paper), and S (scissors). The rules are that R wins over S, S wins over P, and P wins over R. If both players play the same strategy, the result is a tie. A corresponding utility matrix would be as in Table 1.15. This game has exactly one NE. It is the mixed strategy s∗ where one plays each pure strategy with a probability of 1/3. If my opponent plays s∗, my expected utility is 0, no matter what kind of strategy I play, because the probabilities of winning, losing, and a tie are equal. So every strategy is a best response to s∗. On the other hand, if the probabilities of the strategies of my opponent are unequal, then my best response is always to play one of the pure strategies that win against the most probable of his actions. No strategy wins against itself; thus no other strategy can be a best response to itself. s∗ is the unique NE.

Table 1.15: Utility matrix for Rock-Paper-Scissors

          R     P     S
    R     0    -1     1
    P     1     0    -1
    S    -1     1     0
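The claim that every strategy earns the same payoff against s∗ is easy to check numerically. The following sketch (our own; the names are made up) computes the expected utility of each pure strategy against an arbitrary mixed opponent strategy: against (1/3, 1/3, 1/3) all three values are 0, so every strategy is a best response, whereas against any unequal mixture the best response is a pure strategy.

```python
# Expected utilities in Rock-Paper-Scissors (own pure strategy vs. mixed opponent).
U = [[0, -1, 1],    # R vs. R, P, S
     [1, 0, -1],    # P
     [-1, 1, 0]]    # S

def expected_utilities(opponent_mix):
    """Expected payoff of each pure strategy against a mixed opponent strategy."""
    return [sum(U[i][j] * opponent_mix[j] for j in range(3)) for i in range(3)]

print(expected_utilities([1/3, 1/3, 1/3]))   # [0.0, 0.0, 0.0]: every strategy is a best response
print(expected_utilities([0.4, 0.3, 0.3]))   # paper does best against a rock-heavy opponent
```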
Is it evolutionarily stable? Suppose a population consists of equal parts of R, P, and S players, and they play against each other in random pairings. Then the players of each strategy have the same average utility, 0. If the number of offspring of each individual is positively correlated with its accumulated utility, there will be equally many individuals of each strategy in
the next generation again, and the same in the second generation ad infinitum. s∗ is a steady state. However, Maynard Smith's notion of evolutionary stability is stronger. An ESS should not only be stationary, but it should also be robust against mutations. Now suppose in a population as described above, some small proportion of the offspring of P-players are mutants and become S-players. Then the proportion of P-players in the next generation is slightly less than 1/3, and the share of S-players exceeds 1/3. So we have:

p(S) > p(R) > p(P)

This means that R-players will have an average utility that is slightly higher than 0 (because they win more against S and lose less against P). Likewise, S-players are at a disadvantage because they win less than 1/3 of the time (against P) but lose 1/3 of the time (against R). So one generation later, the configuration is:

p(R) > p(P) > p(S)

By an analogous argument, the next generation will have the configuration:

p(P) > p(S) > p(R)

etc. After the mutation, the population has entered a circular trajectory, and will never approach the stationary state s∗ again unless further mutations occur. So not every NE is an ESS. The converse does hold though. Suppose a strategy s were not a NE. Then there would be a strategy t with u(t, s) > u(s, s). This means that a t-mutant in a homogeneous s-population would achieve a higher average utility than the incumbents and thus spread. This may lead to the eventual extinction of s, a mixed equilibrium or a circular trajectory, but the pure s-population is never restored. Hence s is not an ESS. By contraposition we conclude that each ESS is a NE. Can we identify ESSs with strict Nash equilibria (SNEs)? Not quite. Imagine a population of pigeons that comes in two variants. A-pigeons have a perfect sense of orientation and can always find their way. B-pigeons have no sense of orientation at all. Suppose that pigeons always fly in pairs. There is no big disadvantage of being a B if your partner is of type A because he can lead the way. Likewise, it is of no disadvantage to have a B-partner if you are an A because you can lead the way yourself. (Let us assume for simplicity that leading the way has neither costs nor benefits.) However, a pair of B-individuals has a big disadvantage because it cannot
find its way. Sometimes these pairs get lost and starve before they can reproduce. This corresponds to the utility matrix in Table 1.16.

Table 1.16: Utility matrix of the pigeon orientation game

          A     B
    A     1     1
    B     1     0

A is a NE, but not an SNE, because u(B, A) = u(A, A). Now imagine that a homogeneous A-population is invaded by a small group of B-mutants. In a predominantly A-population, these invaders fare as well as the incumbents. However, there is a certain probability that a mutant goes on a journey with another B-mutant. Then both are in danger. Hence, sooner or later B-mutants will approach extinction because they cannot interact very well with their peers. More formally, suppose the proportions of A and B in the population are 1 − ε and ε respectively. Then the average utility of A is 1, while the average utility of B is only 1 − ε. Hence the A-subpopulation will grow faster than the B-subpopulation, and the share of B-individuals converges towards 0. Another way to look at this scenario is this: B-invaders cannot spread in a homogeneous A-population, but A-invaders can successfully invade a B-population because u(A, B) > u(B, B). Hence A is immune against B-mutants, even though A is only a non-strict Nash equilibrium. If a strategy is immune against any kind of mutants in this sense, it is evolutionarily stable. The necessary and sufficient conditions for evolutionary stability are (according to Maynard Smith 1982):

Definition 6 (Evolutionarily Stable Strategy) s is an Evolutionarily Stable Strategy iff
1 u(s, s) ≥ u(t, s) for all t, and
2 if u(s, s) = u(t, s) for some t ≠ s, then u(s, t) > u(t, t).

The first clause requires an ESS to be a NE. The second clause says that if a t-mutation can survive in an s-population, s must be able to successfully invade any t-population for s to be evolutionarily stable. From the definition it follows immediately that each SNE is an ESS. So we have the inclusion relation

Strict Nash Equilibria ⊂ Evolutionarily Stable Strategies ⊂ Nash Equilibria
Both inclusions are strict. The strategy A in the pigeon orientation game is evolutionarily stable without being a strict Nash equilibrium, and in Rock-Paper-Scissors, the mixed strategy to play each pure strategy with probability 1/3 is a Nash equilibrium without being evolutionarily stable.
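Definition 6 can be checked mechanically, at least against a finite list of candidate mutants. The following sketch is our own illustration (function names invented): applied to the pigeon matrix it confirms that A passes both clauses, and applied to Rock-Paper-Scissors it confirms that the uniform mixture fails the second clause.

```python
# A partial check of Definition 6 for a symmetric game.
# Strategies are probability vectors; u(x, y) is the expected payoff of x against y.

def u(x, y, U):
    return sum(x[i] * U[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def looks_like_ess(s, mutants, U, eps=1e-9):
    """Test clauses 1 and 2 of Definition 6 against a finite list of mutant strategies.
    Passing is necessary for being an ESS, but not sufficient for mutants outside the list."""
    for t in mutants:
        if t == s:
            continue
        if u(t, s, U) > u(s, s, U) + eps:                # clause 1 violated: t invades
            return False
        if abs(u(t, s, U) - u(s, s, U)) < eps and not u(s, t, U) > u(t, t, U) + eps:
            return False                                 # clause 2 violated
    return True

def pure(i, n):
    return [1.0 if j == i else 0.0 for j in range(n)]

PIGEON = [[1, 1], [1, 0]]
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

print(looks_like_ess(pure(0, 2), [pure(1, 2)], PIGEON))                        # True: A is an ESS
print(looks_like_ess([1/3, 1/3, 1/3], [pure(i, 3) for i in range(3)], RPS))    # False: s* is not
```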
3.2.2 The replicator dynamics
The considerations that lead to the notion of an ESS are fairly general. They rest on three crucial assumptions:

1 Populations are (practically) infinite.
2 Each pair of individuals is equally likely to interact.
3 The expected number of offspring of an individual (i.e., its fitness in the Darwinian sense) is monotonically related to its average utility.

The assumption of infinity is crucial for two reasons. First, individuals usually do not interact with themselves under most interpretations of EGT. Thus, in a finite population, the probability of interacting with a player using the same strategy as oneself would be less than the share of this strategy in the overall population. If the population is infinite, this discrepancy disappears. Second, in a finite population the average utility of players of a given strategy converges towards its expected value, but it need not be identical to it. This introduces a stochastic component. While this kind of stochastic EGT is a lively sub-branch of EGT (see below), the standard interpretation of EGT assumes deterministic evolution. In an infinite population, the average utility coincides with the expected utility. As mentioned before, the evolutionary interpretation of GT interprets utilities as fitness. The notion of an ESS makes the weaker assumption that there is just a positive correlation between utility and fitness – a higher utility translates into more expected offspring, but this relation need not be linear. This is important for applications of EGT to cultural evolution, where replication proceeds via learning and imitation, and utilities correspond to social impact. There might be independent measures for utility that influence fitness without being identical to it. Nevertheless it is often helpful to look at a particular population dynamics to sharpen one's intuition about the evolutionary behavior of a game. Also, in games such as Rock-Paper-Scissors, a lot of interesting things can be said about their evolution even though they have no stable states at all. Therefore we will discuss one particular evolutionary game dynamics in some detail. The easiest way to relate utility and fitness in a monotonic way is of course just to identify them. So let us assume that the average utility of
an individual equals its expected number of offspring. Let us say that there are n strategies s_1, . . . , s_n. The number of individuals playing strategy s_i is written as N_i. The relative frequency of strategy s_i, i.e., N_i/N, is written as x_i for short. (Note that x is a probability distribution, i.e. Σ_j x_j = 1.) We abbreviate the expected utility of strategy s_i, Σ_{j=1..n} x_j · u(i, j), as ũ_i, and the population average of the expected utility, Σ_{i=1..n} x_i · ũ_i, as ũ. If the population size N goes towards infinity, the development of the relative abundance of the different strategies within the population converges towards a deterministic dynamics that can be described by the following differential equation:

dx_i/dt = x_i (ũ_i − ũ)

This equation is called the replicator dynamics. It was first introduced in Taylor and Jonker (1978). It is worth a closer examination. It says that the reproductive success of strategy s_i depends on two factors. First, there is the relative abundance of s_i itself, x_i. The more individuals in the current population are of type s_i, the more likely it is that there will be offspring of this type. The interesting part is the second factor, the differential utility. If ũ_i = ũ, this means that strategy s_i does exactly as well as the population average. In this case the two terms cancel each other out, and dx_i/dt = 0. This means that s_i's share of the total population remains constant. If, however, ũ_i > ũ, s_i does better than average, and it increases its share. Likewise, a strategy s_i with a less-than-average performance, i.e., ũ_i < ũ, loses ground.

Intuitively, evolutionary stability means a state is (a) stationary and (b) immune against the invasion of small numbers of mutations. This can directly be translated into dynamic notions. To require that a state is stationary amounts to saying that the relative frequencies of the different strategies within the population do not change over time. In other words, the vector x is stationary iff for all i:

dx_i/dt = 0

This is the case if either x_i = 0 or ũ_i = ũ for all i. Robustness against small amounts of mutation means that there is an environment of x such that all trajectories leading through this environment actually converge towards x. In the jargon of dynamic systems, x is then asymptotically stable or an attractor. It can be shown that a (possibly mixed) strategy is an ESS if and only if it is asymptotically stable under the replicator dynamics.

The replicator dynamics enables us to display the evolutionary behavior of a game graphically. This has a considerable heuristic value. There
are basically two techniques for this. First, it is possible to depict time series in a Cartesian coordinate system. The time is mapped to the x-axis, while the y-axis corresponds to the relative frequency of some strategy. For some sample of initial conditions, the development of the relative frequencies over time is plotted as a function of the time variable. In a two-strategy game like the pigeon orientation scenario discussed above, this is sufficient to exhaustively display the dynamics of the system because the relative frequencies of the two strategies always sum up to 1. The left-hand graphic in Figure 1.7 gives a few sample time series for the pigeon game. Here the y-axis corresponds to the relative frequency of the A-population. It is plainly obvious that the state where 100% of the population are of type A is in fact an attractor.
Figure 1.7: Replicator dynamics of the pigeon orientation game (left) and the Rock-Paper-Scissors game (right)
Another option to graphically display the replicator dynamics of some game is to suppress the time dimension and instead plot possible orbits of the system. Here both axes correspond to relative frequencies of some strategies. So each state of the population corresponds to some point in the coordinate system. If there are at most two independent variables to consider – as in a symmetric three-strategy game like RPS – there is actually a 1-1 map between points and states. Under the replicator dynamics, populations evolve continuously. This corresponds to contiguous paths in the graph. The right-hand graphic in Figure 1.7 shows some orbits of RPS. We plotted the frequencies of the "rock" strategy and the "scissors" strategy against the y-axis and the x-axis respectively. The sum of their frequencies never exceeds 1. This is why the whole action happens in the lower left corner of the square. The relative frequency of "paper" is uniquely determined by the two other strategies and is thus not an independent variable.
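The qualitative behavior shown in Figure 1.7 can be reproduced with a few lines of code. The sketch below (a simple Euler discretization of the replicator equation; all names are our own) iterates dx_i/dt = x_i(ũ_i − ũ) for the pigeon game and for Rock-Paper-Scissors: the former tends towards the all-A state, the latter keeps circling the uniform mixture instead of converging to it.

```python
# Euler integration of the replicator dynamics dx_i/dt = x_i * (u_i - u_bar).
def step(x, U, dt=0.01):
    u = [sum(U[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]  # expected utilities
    u_bar = sum(x[i] * u[i] for i in range(len(x)))                          # population average
    return [x[i] + dt * x[i] * (u[i] - u_bar) for i in range(len(x))]

def run(x, U, steps=20000):
    for _ in range(steps):
        x = step(x, U)
    return x

PIGEON = [[1, 1], [1, 0]]                      # strategies A, B
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]     # strategies R, P, S

print(run([0.5, 0.5], PIGEON))      # tends towards [1.0, 0.0]: all-A is an attractor
print(run([0.4, 0.3, 0.3], RPS))    # stays away from [1/3, 1/3, 1/3], orbiting around it
```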
The circular nature of this dynamics, which we informally uncovered above, is clearly discernible. One can also easily see with the naked eye that this game has no attractor, i.e., no ESS.
3.2.3 Asymmetric games
So far we have considered symmetric games in this section. Formally, a game is symmetric iff the two players have the same set of strategies to choose from, and the utility does not depend on the position of the players. If u_1 is the utility matrix of the row player, and u_2 that of the column player, then the game is symmetric iff both matrices are square (have the same number of rows and columns), and

u_1(i, j) = u_2(j, i)

There are various scenarios where these assumptions are inappropriate. In many types of interaction, the participants assume certain roles. In contests over a territory, it makes a difference who is the incumbent and who the intruder. In economic interaction, buyer and seller have different options at their disposal. Likewise, in linguistic interaction you are either the speaker or the hearer. The last example illustrates that it is possible for the same individual to assume either role on different occasions. If this is not possible, we are effectively dealing with two disjoint populations, like predators and prey or females and males in biology, haves and have-nots in economics, and adults and infants in language acquisition (in the latter case infants later become adults, but these stages can be considered different games). The dynamic behavior of asymmetric games differs markedly from that of symmetric ones. The ultimate reason for this is that in a symmetric game, an individual can, as it were, play against itself (or against a clone of itself), while this is impossible in asymmetric games. The well-studied game "Hawks and Doves" may serve to illustrate this point. Imagine a population where the members have frequent disputes over some essential resource (food, territory, mates, whatever). There are two strategies to deal with a conflict. The aggressive type (the "hawks") will never give in. If two hawks come in conflict, they fight it out until one of them dies. The other one gets the resource. The doves, on the contrary, embark upon a lengthy ritualized dispute until one of them is tired of it and gives in. If a hawk and a dove meet, the dove gives in right away and the hawk gets the resource without any effort. There are no other strategies. A possible utility matrix for this game is given in Table 1.17. Getting the disputed resource without effort has a survival value of 7. Only a hawk meeting a dove is as lucky. Not getting the resource at all without a fight enables the loser to look out for a replacement. This is the fate of a dove meeting a hawk. Let's say this has a utility of 2.
Table 1.17: Hawks and Doves

          H     D
    H     1     7
    D     2     3
Dying in a fight over the resource leads to an expected number of 0 offspring, and a serious fight is also costly for the survivor. Let us say the average utility of a hawk meeting a hawk is 1. A dove meeting another dove will get the contested resource on one out of two occasions on average, but the lengthy ritualistic contest comes with a modest cost too, so the utility of a dove meeting a dove could be 3. It is important to notice that the best response to a hawk is being a dove and vice versa. So neither of the two pure strategies is an NE. However, we also consider mixed strategies where either the population is mixed, or each individual plays either strategy with a certain probability. Under these circumstances, the game has an ESS. If the probability of behaving like a hawk is 80% and of being a dove 20%, both strategies achieve an expected utility of 2.2. As the reader may convince herself, this mixed strategy does in fact fulfill the conditions for an ESS. The replicator dynamics is given in Figure 1.8. Here the y-axis represents the proportion of hawks in a population. If the proportion of hawks exceeds the critical 80%, doves have an advantage and will spread, and vice versa.
Figure 1.8: Symmetric Hawk-and-Dove game
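The critical 80:20 mixture can be derived directly from Table 1.17: at an evolutionarily stable mixture, hawks and doves must earn the same expected utility, since otherwise the better-performing type would spread. The following sketch (our own; names invented) solves this indifference condition for the hawk proportion p and recovers the figures quoted above.

```python
# Indifference condition for the mixed ESS of Hawks and Doves (Table 1.17).
U = {('H', 'H'): 1, ('H', 'D'): 7, ('D', 'H'): 2, ('D', 'D'): 3}

def payoff(own, p_hawk):
    """Expected utility of playing `own` against a population with hawk share p_hawk."""
    return p_hawk * U[(own, 'H')] + (1 - p_hawk) * U[(own, 'D')]

# Solve payoff('H', p) = payoff('D', p):  p*1 + (1-p)*7 = p*2 + (1-p)*3  =>  p = 4/5.
p = (U[('H', 'D')] - U[('D', 'D')]) / ((U[('H', 'D')] - U[('D', 'D')]) + (U[('D', 'H')] - U[('H', 'H')]))
print(p, round(payoff('H', p), 2), round(payoff('D', p), 2))   # 0.8  2.2  2.2
```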
This changes dramatically if the same game is construed as an asymmetric game. Imagine the same situation as before, but now we are dealing with two closely related but different species. The two species are reproductively isolated, but they compete for the same ecological niche. Both species come in the hawkish and the
dovish variant. Contests only take place between individuals from different species. Now suppose the first species, call it A, consists almost exclusively of the hawkish type. Under symmetric conditions, this would mean that the hawks mostly encounter another hawk, doves are better off on average, and therefore evolution works in favor of the doves. Things are different in the asymmetric situation. If A consists mainly of hawks, this supports the doves in the other species, B. So the proportion of doves in B will increase. This in turn reinforces the dominance of hawks in A. Likewise, a dominantly dovish A-population helps the hawks in B. The tendency always works in favor of a purely hawkish population in the one species and a purely dovish population in the other one. Figure 1.9 graphically displays this situation. Here we use a third technique for visualizing a dynamics, a direction field. Each point in the plane corresponds to one state of the system. Here the x-coordinate gives the proportion of hawks in A, and the y-coordinate the proportion of hawks in B. Each arrow indicates in which direction the system is moving if it is in the state corresponding to the origin of the arrow. The length of the arrow indicates the velocity of the change. If you always follow the direction of the arrows, you get an orbit. Direction fields are especially useful for displaying systems with two independent variables, like the two-population game considered here. The system has two attractor states, the upper left and the lower right corner. They correspond to a purely hawkish population in one species and 100% doves in the other. If both populations have the critical 8:2 ratio of hawks:doves that was stable in the symmetric scenario, the system is also stationary. But this is not an attractor state because all points in the environment of this point are pulled away from it rather than being attracted to it. It is possible to capture the stability properties of asymmetric games in a way that is similar to the symmetric case. Actually the situation is even easier in the asymmetric case. Recall that the definition of a symmetric ESS was complicated by the consideration that mutants may encounter other mutants. In a two-population game, this is impossible. In a one-population role game, this might happen. However, minimal mutations only affect strategies in one of the two roles. If somebody minimally changes his grammatical preferences as a speaker, say, his interpretive preferences need not be affected by this.10 So while a mutant might interact with its clone, it will never occur that a mutant strategy interacts with itself, because, by definition, the two strategy sets are distinct. So, the second clause of the definition of ESS doesn't matter. To deal with asymmetric games, we have to use the more elaborate conceptual framework from section 1 again.
Figure 1.9: Asymmetric Hawk-and-Dove game
Since the strategy sets of the two players (roles, populations) are distinct, the utility matrices are distinct as well. In a game between m and n strategies, the utility function of the first player is defined by an m × n matrix, call it u_A, and that of the second player by an n × m matrix u_B. An asymmetric Nash equilibrium is now a profile – a pair – of strategies, one for each population/role, such that each component is the best response to the other component. Likewise, a SNE is a pair of strategies where each one is the unique best response to the other. Now if the second clause in the definition of a symmetric ESS plays no role here, does this mean that only the first clause matters? In other words, are all and only the NEs evolutionarily stable in the asymmetric case? Not quite. Suppose the same situation as before, but now species A comes in three variants instead of two. The first two are both aggressive, and they both get the same, hawkish utility. Also, individuals from B get the same utility from interacting with either of the two types of hawks in A. The third A-strategy is still the doves. Now suppose that A consists exclusively of hawks of the first type, and B only of doves. Then the system is in a NE, since both hawk strategies are best responses to the doves in B, and for a B-player, being a dove is the best response to either hawk strategy. If this A-population is invaded by a mutant of the second hawkish type, the mutants are exactly as fit as the incumbents. They will neither spread nor be extinguished. (Biologists call this phenomenon drift – change that has no impact on survival fitness and is driven by pure chance.) In this scenario, the system is in a (non-strict) NE, but it is not evolutionarily stable. A strict NE is always evolutionarily stable though, and it can be shown (Selten 1980) that:
In asymmetric games, a configuration is an ESS iff it is an SNE.

It is a noteworthy fact about asymmetric games that ESSs are always pure in the sense that both populations play one particular strategy with 100% probability. This does not imply though that asymmetric games always settle in a pure state. Not every asymmetric game has an ESS. The asymmetric version of Rock-Paper-Scissors, for instance, shows the same kind of cyclic dynamics as the symmetric variant. As in the symmetric case, this characterization of evolutionary stability is completely general and holds for all utility-monotonic dynamics. Again, the simplest instance of such a dynamics is the replicator dynamics. Here a state is characterized by two probability vectors, x and y. They represent the probabilities of the different strategies in the two populations or roles. The differential equation describing the replicator dynamics applies to multi-population games as well. The only difference is that the expected utility of a player from one population is calculated by averaging over the strategies of the other population.
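As an illustration of the two-population version of the dynamics (the code and all names are our own sketch), the snippet below iterates the replicator equation for the asymmetric Hawk-and-Dove game of Table 1.17, computing each population's expected utilities by averaging over the current mixture of the other population. Starting near the symmetric 80:20 state, the system drifts to one of the two pure corner states, in line with Figure 1.9.

```python
# Two-population replicator dynamics for the asymmetric Hawk-and-Dove game.
# Both roles use the payoff matrix of Table 1.17; rows/columns are (H, D).
U = [[1, 7], [2, 3]]

def step(x, y, dt=0.01):
    # Expected utility of each strategy in population A against the mixture y of B, and vice versa.
    ux = [sum(U[i][j] * y[j] for j in range(2)) for i in range(2)]
    uy = [sum(U[i][j] * x[j] for j in range(2)) for i in range(2)]
    xbar = sum(x[i] * ux[i] for i in range(2))
    ybar = sum(y[i] * uy[i] for i in range(2))
    x = [x[i] + dt * x[i] * (ux[i] - xbar) for i in range(2)]
    y = [y[i] + dt * y[i] * (uy[i] - ybar) for i in range(2)]
    return x, y

x, y = [0.81, 0.19], [0.79, 0.21]       # slightly perturbed away from the 80:20 point
for _ in range(50000):
    x, y = step(x, y)
print(x, y)   # tends towards all hawks in one population and all doves in the other
```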
3.3 EGT and language
Language is first and foremost a means for communication. As a side effect of communication, linguistic knowledge is transmitted between the communicators. This is most obvious in language acquisition, but learning never stops, and adult speakers of the same language exert a certain influence on each other's linguistic habits as well. This makes natural language an interactive and self-replicative system. Hence EGT is a promising analytical tool for the study of linguistic phenomena. Let us start this subsection with a few general remarks. To give an EGT formalization – or an evolutionary conceptualization in general – of a particular empirical phenomenon, various issues have to be addressed in advance. What is replication in the domain in question? What are the units of replication? Is replication faithful, and if so, which features are constant under replication? What factors influence reproductive success (= fitness)? What kind of variation exists, and how does it interact with replication? There are various aspects of natural language that are subject to replication, variation and selection, on various timescales that range from minutes (a single discourse) to millennia (language-related aspects of biological evolution). We will focus on cultural (as opposed to biological) evolution on short timescales, but we will briefly discuss the more general picture. The most obvious mode of linguistic self-replication is first language acquisition. Before this can take effect, the biological preconditions for language acquisition and use have to be given, ranging from the physiology of
the ear and the vocal tract to the necessary cognitive abilities. The biological language faculty is replicated in biological reproduction. It seems obvious that the ability to communicate increases survival chances and social standing and thus promotes biological fitness – but only at first glance. Sharing information usually benefits the receiver more than the sender because information arguably increases fitness. Sharing information with others increases the fitness of the others and thus reduces one's own differential fitness. Standard EGT predicts this kind of altruistic behavior to be evolutionarily unstable. Here is a crude formalization in terms of an asymmetric game between sender and receiver. The sender has a choice between sharing information ("T" for "talkative") or keeping information to himself ("S" for "silent"). The (potential) receiver has the options of paying attention and trying to decode the messages of the sender ("A" for "attention") or of ignoring ("I") the sender. Let us say that sharing information does have a certain benefit for the sender because it may serve to manipulate the receiver. On the other hand, sending a signal comes with an effort and may draw the attention of predators. For the sake of the argument, we assume that the costs and benefits are roughly equally distributed given that the receiver pays attention. If the receiver ignores the message, it is disadvantageous for the sender to be talkative. For the receiver, it pays to pay attention if the sender actually sends. Then the listener benefits most. If the sender is silent, it is of disadvantage for the listener to pay attention because attention is a precious resource that could have been spent in a more useful way otherwise. Sample utilities that mirror these assumed preferences are given in Table 1.18.

Table 1.18: The utility of communication

             A          I
    T     (1 ; 2)    (0 ; 1)
    S     (1 ; 0)    (1 ; 1)
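The rule of thumb stated just below – a cell is a strict Nash equilibrium iff its first entry is the unique maximum of its column and its second entry the unique maximum of its row – is easy to automate. The following sketch (our own; function names invented) applies it to Table 1.18 and returns (S, I) as the only strict equilibrium.

```python
# Find the strict Nash equilibria of a bimatrix game given as a dict of cells
# (row strategy, column strategy) -> (row payoff, column payoff).
GAME = {('T', 'A'): (1, 2), ('T', 'I'): (0, 1),
        ('S', 'A'): (1, 0), ('S', 'I'): (1, 1)}

def strict_nash_equilibria(game):
    rows = {r for r, _ in game}
    cols = {c for _, c in game}
    result = []
    for r in rows:
        for c in cols:
            u_row, u_col = game[(r, c)]
            # Both players must strictly prefer their choice to every alternative.
            row_best = all(game[(r2, c)][0] < u_row for r2 in rows if r2 != r)
            col_best = all(game[(r, c2)][1] < u_col for c2 in cols if c2 != c)
            if row_best and col_best:
                result.append((r, c))
    return result

print(strict_nash_equilibria(GAME))   # [('S', 'I')]
```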
The game has exactly one ESS, namely the combination of "S" and "I". (As the careful reader has probably already figured out for herself, a cell is an ESS, i.e., a strict Nash equilibrium, if its first number is the unique maximum in its column and the second one the unique maximum in its row.) This result might seem surprising. The receiver would actually be better off if the two parties settled at (T, A). This would be of no disadvantage for the sender. Since the sender does not compete with the receiver for resources (we are talking about an asymmetric game), he could actually afford to be generous and grant the receiver the possible gain. Here the predictions of
standard EGT seem to be at odds with the empirical observations.11 The evolutionary basis for communication, and for cooperation in general, is an active area of research in EGT, and various possible routes have been proposed. First, the formalization that we gave here may be just wrong, and communication is in fact beneficial for both parties. While this is certainly true for humans living in human societies, this still raises the question of how these societies could have evolved in the first place. A more interesting approach goes under the name of the handicap principle. The name was coined by Zahavi (1975) to describe certain patterns of seemingly self-destructive communication in the animal kingdom. A good example is what he calls the "stotting" behavior of gazelles:

We start with a scene of a gazelle resting or grazing in the desert. It is nearly invisible; the color of its coat blends well with the desert landscape. One would expect the gazelle to freeze or crouch and do its utmost to avoid being seen. But no: it rises, barks, and thumps the ground with its forefeet, all the while watching the wolf. [. . . ] Why does the gazelle reveal itself to a predator that might not otherwise spot it? Why does it waste time and energy jumping up and down (stotting) instead of running away as fast as it can? The gazelle is signaling to the predator that it has seen it; by "wasting" time and jumping high in the air rather than bounding away, it demonstrates in a reliable way that it is able to outrun the wolf. The wolf, upon learning that it has lost its chance to surprise its prey, and that this gazelle is in tip-top physical shape, may decide to move on to another area; or it may decide to look for more promising prey. (from Zahavi and Zahavi 1997, xiii-xiv)

Actually, the best response of the predator is to call the bluff occasionally, often enough to deter cheaters, but not too often. Under these conditions, the self-inflicted handicap of the (fast) gazelle is in fact evolutionarily stable. The crucial insight here is that truthful communication can be evolutionarily stable if lying is more costly than communicating the truth. A slow gazelle could try to use stotting as well to discourage a lion from hunting it, but this would be risky if the lion occasionally calls the bluff. The expected costs of such a strategy are thus higher than the costs of running away immediately. In communication among humans, there are various ways in which lying might be more costly than telling (or communicating) the truth. To take an example from economics, driving a Rolls Royce communicates "I am rich" because for a poor man, the costs of buying and maintaining such an expensive car outweigh its benefits, while a rich man can afford them. Here producing the signal as such is costly. In linguistic communication,
lying comes with the social risk of being found out, so in many cases telling the truth is more beneficial than lying. The idea of the handicap principle as an evolutionary basis for communication has inspired a plethora of research in biology and economics. Van Rooij (2003) uses it to give a game-theoretic explanation of politeness as a pragmatic phenomenon. A third hypothesis rejects the assumption of standard EGT that all individuals interact with equal probability. When I increase the fitness of my kin, I thereby increase the chances for replication of my own gene pool, even if it should be to my own disadvantage. Recall that utility in EGT does not mean the reproductive success of an individual but of a strategy, and strategies correspond to heritable traits in biology. A heritable trait for altruism might thus have a high expected utility provided its carrier preferentially interacts with other carriers of this trait. Biologists call this model kin selection. There are various modifications of EGT that give up the assumption of random pairing. Space does not permit us to go into any detail here. However, refinements of EGT where a player is more likely to interact with other individuals of its own type often predict cooperative or even altruistic behavior to be evolutionarily stable even if it is not an ESS according to Maynard Smith's criteria. Natural languages are not passed on via biological but via cultural transmission. First language acquisition is thus a qualitatively different mode of replication. Most applications of evolutionary thinking in linguistics focus on the ensuing acquisition-driven dynamics. It is an important aspect in understanding language change on a historical timescale of decades and centuries. It is important to notice that there is a qualitative difference between Darwinian evolution and the dynamics that results from iterated learning (in the sense of iterated first language acquisition). In Darwinian evolution, replication is almost always faithful. Variation is the result of occasional unfaithful replication, a rare and essentially random event. Theories that attempt to understand language change via iterated language acquisition stress the fact, though, that here replication can be unfaithful in a systematic way. The work of Martin Nowak and his co-workers (see for instance Nowak et al. 2002) is a good representative of this approach. They assume that an infant that grows up in a community of speakers of some language L1 might acquire another language L2 with a certain probability. This means that those languages will spread in a population that (a) are likely targets of acquisition for children that are exposed to other languages, and (b) are likely to be acquired faithfully themselves. This approach thus conceptualizes language change as a Markov process12 rather than evolution
through natural selection. Markov processes and natural selection of course do not exclude each other. Nowak's differential equation describing the language acquisition dynamics actually consists of a basically game-theoretical natural selection component (pertaining to the functionality of language) and a (learning-oriented) Markov component. Language is also replicated on a much shorter time scale, just via being used. The difference between acquisition-based and usage-based replication can be illustrated by looking at the development of the vocabulary of some language. There are various ways in which a new word can enter a language – morphological compounding, borrowing from other languages, lexicalization of names, coinage of acronyms, and what have you. Once a word is part of a language, it is gradually adapted to this language, i.e., it acquires a regular morphological paradigm, its pronunciation is nativized, etc. The process of establishing a new word is predominantly driven by mature (adult or adolescent) language users, not by infants. Somebody introduces the new word, and people start imitating it. Whether the new coinage catches on depends on whether there is a need for this word, whether it fills a social function (like distinguishing one's own social group from other groups), whether the persons who already use it have a high social prestige, etc. Since the work of Labov (see for instance Labov 1972), functionally oriented linguists have repeatedly pointed out that grammatical change actually follows a similar pattern. The main agents of language change, they argue, are mature language users rather than children. Not just the vocabulary is plastic and changes via language use, but all kinds of linguistic variables like syntactic constructions, phones, morphological devices, interpretational preferences, etc. Imitation plays a crucial part here, and imitation is of course a kind of replication. Unlike in biological replication, the usage of a certain word or construction can usually not be traced back to a unique model or pair of models that spawn the token in question. Rather, every previous usage of this linguistic item that the user took notice of shares a certain fraction of "parenthood". Recall though that the basic units of evolution in EGT are not individuals but strategies, and evolution is about the relative frequency of strategies. If there is a causal relation between the abundance of a certain linguistic variant at a given point in time and its abundance at a later point, we can consider this a kind of faithful replication. Also, replication is almost but not absolutely faithful. This leads to a certain degree of variation. Competing variants of a linguistic item differ in their likelihood of being imitated – this corresponds to fitness and thus to natural selection. The usage-based dynamics of language use has all the aspects that are required for modeling in terms of EGT. In the linguistic examples that we will discuss further on, we will assume the latter notion of linguistic evolution.
3.4 Pragmatics and EGT
In this subsection we will go through a couple of examples that demonstrate how EGT can be used to explain high-level linguistic notions like pragmatic preferences or functional pressure. For more detailed accounts of linguistic phenomena using EGT, the reader is referred to Jäger (2004) and van Rooij (2004).
3.4.1 Partial blocking
If there are two comparable expressions in a language such that the first is strictly more specific than the second, there is a tendency to reserve the more general expression for situations where the more specific one is not applicable. A standard example is the opposition between "many" and "all". If I say that many students came to the guest lecture, it is usually understood that not all students came. There is a straightforward rationalistic explanation for this in terms of conversational maxims: the speaker should be as specific as possible. If the speaker uses "many", the hearer can conclude that the usage of "all" would have been inappropriate. This conclusion is strictly speaking not valid though – it is also possible that the speaker just does not know whether all students came or whether a few were missing. A similar pattern can be found in conventionalized form in the organization of the lexicon. If a regular morphological derivation and a simplex word compete, the complex word is usually reserved for cases where the simplex is not applicable. For instance, the compositional meaning of the English noun "cutter" is just someone or something that cuts. A knife is an instrument for cutting, but still you cannot call a knife a "cutter". The latter word is reserved for non-prototypical cutting instruments. Let us consider the latter example more closely. We assume that the literal meaning of "cutter" is a concept CUTTER' and the literal meaning of "knife" a concept KNIFE' such that every knife is a cutter but not vice versa, i.e.,

KNIFE' ⊂ CUTTER'

There are two basic strategies for using these two words, the semantic (S) and the pragmatic (P) strategy. Both come in two versions, a hearer strategy and a speaker strategy. A speaker using S will use "cutter" to refer to unspecified cutting instruments, and "knife" to refer to knives. To refer to a cutting instrument that is not a knife, this strategy either uses the explicit "cutter but not a knife", or, short but imprecise, also "cutter". A hearer using S will interpret every expression literally, i.e., "knife" means KNIFE', "cutter" means CUTTER', and "cutter but not a knife" means CUTTER' − KNIFE'. A speaker using P will reserve the word "cutter" for the concept CUTTER' − KNIFE'. To express the general concept CUTTER', this strategy has to resort
to a more complex expression like "cutter or knife". Conversely, a hearer using P will interpret "cutter" as CUTTER' − KNIFE', "knife" as KNIFE', and "cutter or knife" as CUTTER'. So we are dealing with an asymmetric 2 × 2 game. What is the utility function? In EGT, utilities are interpreted as the expected number of offspring. In our linguistic interpretation this means that utilities express the likelihood of a strategy being imitated. It is a difficult question to tease apart the factors that determine the utility of a linguistic item in this sense, and ultimately it has to be answered by psycholinguistic and sociolinguistic research. Since we have not undertaken this research so far, we will make up a utility function, using plausibility arguments. We start with the hearer perspective. The main objective of the hearer in communication, let us assume, is to gain as much truthful information as possible. The utility of a proposition for the hearer is thus inversely proportional to its probability, provided the proposition is true. For the sake of simplicity, we only consider contexts where the nouns in question occur in upward entailing contexts. Therefore CUTTER' has a lower information value than KNIFE' or CUTTER' − KNIFE'. It also seems fair to assume that non-prototypical cutters are more rarely talked about than knives; thus the information value of KNIFE' is lower than that of CUTTER' − KNIFE'. For concreteness, we make up some numbers. If i is the function that assigns a concept its information value, let us say that

i(KNIFE') = 30
i(CUTTER' − KNIFE') = 40
i(CUTTER') = 20
The speaker wants to communicate information. Assuming only honest intentions, the information value that the hearer gains should also be part of the speaker's utility function. Furthermore, the speaker wants to minimize his effort. So as a second component of the speaker's utility function, we assume some complexity measure over expressions. A morphologically complex word like "cutter" is arguably more complex than a simple one like "knife", and syntactically complex phrases like "cutter or knife" or "cutter but not knife" are even more complex. The following stipulated values for the cost function take these considerations into account:

cost("knife") = 1
cost("cutter") = 2
cost("cutter or knife") = 40
cost("cutter but not knife") = 45
These costs and benefits are to be weighted – everything depends on how often each of the candidate concepts is actually used. The most prototypical concept of the three is certainly KNIFE', while the unspecific CUTTER' is arguably rare. Let us say that, conditioned on all utterance situations in question, the probabilities that a speaker tries to communicate the respective concepts are

p(KNIFE') = 0.7
p(CUTTER' − KNIFE') = 0.2
p(CUTTER') = 0.1
The utility of the speaker is then the difference between the average information value that he manages to communicate and the average costs that he incurs. The utility of the hearer is just the average value of the correct information that is received. The precise values of these utilities finally depend on how often a speaker of the S-strategy actually uses the complex "cutter but not knife", and how often he uses the shorter "cutter". Let us assume for the sake of concreteness that he uses the short form 60% of the time. After some elementary calculations,13 this leads us to the following utility matrix. The speaker is assumed to be the row player and the hearer the column player. Both players receive the highest possible utility if both play P. This means perfect communication with minimal effort. All other combinations involve some kind of communication failure because the hearer occasionally interprets the speaker's use of "cutter" either too strongly or too weakly.

Table 1.19: Knife vs. cutter

                  S                    P
    S      (23.86 ; 28.60)      (24.26 ; 29.00)
    P      (21.90 ; 27.00)      (25.90 ; 31.00)
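The "elementary calculations" behind Table 1.19 can be reproduced mechanically. The sketch below is our own reconstruction (all names are invented): for each combination of speaker and hearer strategy it enumerates what the speaker says for each intended concept, what the hearer understands, and the resulting information value and cost. Rerunning it with the information value of CUTTER' set to 25 yields the figures of Table 1.20 below.

```python
# Reconstruction of the knife/cutter utilities (speaker = row, hearer = column).
INFO = {'KNIFE': 30, 'CUTTER-KNIFE': 40, 'CUTTER': 20}   # information values
COST = {'knife': 1, 'cutter': 2, 'cutter or knife': 40, 'cutter but not knife': 45}
PROB = {'KNIFE': 0.7, 'CUTTER-KNIFE': 0.2, 'CUTTER': 0.1}

# What each speaker strategy says: a list of (form, probability) per intended concept.
SPEAKER = {
    'S': {'KNIFE': [('knife', 1.0)],
          'CUTTER-KNIFE': [('cutter but not knife', 0.4), ('cutter', 0.6)],
          'CUTTER': [('cutter', 1.0)]},
    'P': {'KNIFE': [('knife', 1.0)],
          'CUTTER-KNIFE': [('cutter', 1.0)],
          'CUTTER': [('cutter or knife', 1.0)]},
}
# How each hearer strategy interprets each form.
HEARER = {
    'S': {'knife': 'KNIFE', 'cutter': 'CUTTER',
          'cutter but not knife': 'CUTTER-KNIFE', 'cutter or knife': 'CUTTER'},
    'P': {'knife': 'KNIFE', 'cutter': 'CUTTER-KNIFE',
          'cutter but not knife': 'CUTTER-KNIFE', 'cutter or knife': 'CUTTER'},
}

def is_true(understood, intended):
    # The weak reading CUTTER' is true of every cutting instrument; otherwise the
    # interpretation must match the intended concept.
    return understood == intended or understood == 'CUTTER'

def utilities(sp, hr):
    u_speaker = u_hearer = 0.0
    for concept, p in PROB.items():
        for form, q in SPEAKER[sp][concept]:
            understood = HEARER[hr][form]
            value = INFO[understood] if is_true(understood, concept) else 0
            u_hearer += p * q * value
            u_speaker += p * q * (value - COST[form])
    return round(u_speaker, 2), round(u_hearer, 2)

for sp in ('S', 'P'):
    print([utilities(sp, hr) for hr in ('S', 'P')])
# [(23.86, 28.6), (24.26, 29.0)]
# [(21.9, 27.0), (25.9, 31.0)]
```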
If both players start out with the semantic strategy, mutant hearers that use the pragmatic strategy will spread because they get the more specific interpretation CUTTER' − KNIFE' right in all cases where the speaker prefers minimizing effort over being explicit. The mutants will get all cases wrong where the speaker meant CUTTER' by using "cutter", but the advantage is greater. If the hearers employ the pragmatic strategy, speakers using the pragmatic strategy will start to spread now because they will have a higher
chance to get their message across. The combination P/P is the only strict Nash equilibrium in the game and thus the only ESS. Figure 1.10 gives the direction field of the corresponding replicator dynamics. The x-axis gives the proportion of the hearers that are P-players, and the y-axis corresponds to the speaker dimension.
Figure 1.10: Partial blocking: replicator dynamics
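The direction field itself cannot be reproduced here, but its qualitative behaviour is easy to check numerically. The following sketch (our own code; step size, horizon and starting point are illustrative) iterates a discrete-time version of the two-population replicator dynamics on the payoffs of Table 1.19; even when the population starts out predominantly semantic, the pragmatic strategies eventually take over.

```python
# A rough numerical check of the replicator dynamics behind Figure 1.10.
# x is the share of pragmatic speakers, y the share of pragmatic hearers.

def step(x, y, dt=0.01):
    # expected payoffs of the pure strategies against the current population mix
    us_S = (1 - y) * 23.86 + y * 24.26      # semantic speaker
    us_P = (1 - y) * 21.90 + y * 25.90      # pragmatic speaker
    uh_S = (1 - x) * 28.60 + x * 27.00      # semantic hearer
    uh_P = (1 - x) * 29.00 + x * 31.00      # pragmatic hearer
    us_avg = (1 - x) * us_S + x * us_P
    uh_avg = (1 - y) * uh_S + y * uh_P
    x_new = x + dt * x * (us_P - us_avg)    # replicator update, speaker population
    y_new = y + dt * y * (uh_P - uh_avg)    # replicator update, hearer population
    return x_new, y_new

x, y = 0.05, 0.05                           # start close to the all-semantic corner
for _ in range(20000):
    x, y = step(x, y)
print(round(x, 3), round(y, 3))             # both shares approach 1: P/P takes over
```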
The structural properties of this game are very sensitive to the particular parameter values. For instance, if the informational value of the concept CUTTER′ were 25 instead of 20, the resulting utility matrix would come out as in Table 1.20. Here both S/S and P/P come out as evolutionarily stable.

Table 1.20: Knife vs. cutter, different parameter values

          S                  P
S    (24.96 ; 29.70)   (24.26 ; 29.00)
P    (23.40 ; 28.50)   (26.40 ; 31.50)
This result is not entirely unwelcome – there are plenty of examples where a specific term does not block a general term. If I refer to a certain dog as “this dog”, I do not implicate that it is of no discernible breed like “German shepherd” or “Airedale terrier”. The more general concept of a dog is useful enough to prevent blocking by more specific terms.
3.4.2 Horn strategies
Real synonymy is rare in natural language – some people even doubt that it exists. Even if two expressions should have identical meanings according to the rules of compositional meaning interpretation, their actual interpretation is usually subtly differentiated. Larry Horn (see for instance Horn 1993) calls this phenomenon the division of pragmatic labor. This differentiation is not just random. Rather, the tendency is that the simpler of the two competing expressions is assigned to the prototypical instances of the common meaning, while the more complex expression is reserved for less prototypical situations. The following examples (taken from op. cit.) serve to illustrate this.

(8) a. John went to church/jail. (prototypical interpretation)
    b. John went to the church/jail. (literal interpretation)

(9) a. I need a new driller/cooker.
    b. I need a new drill/cook.

Example (8a) only has the non-literal meaning where John attended a religious service or was convicted and sent to a prison, respectively. The more complex (b) sentence literally means that he approaches the church (jail) as a pedestrian. De-verbal nouns formed by the suffix -er can either be agentive or refer to instruments. So compositionally, a driller could be a person who drills or an instrument for drilling, and likewise for cooker. However, drill is lexicalized as a drilling instrument, and thus driller can only have the agentive meaning. For cooker it is the other way round: a cook is a person who cooks, and thus a cooker can only be an instrument. Arguably the concept of a person who cooks is a more natural concept than an instrument for cooking in our culture, and for drills and drillers it is the other way round. So in either case, the simpler form is restricted to the more prototypical meaning.

One might ask what “prototypical” exactly means here. The meaning of “going to church” for instance is actually more complex than the meaning of “going to the church” because the former invokes a lot of cultural background knowledge. It seems to make sense to us though to simply identify prototypicality with frequency. Those meanings that are most often communicated in ordinary conversations are most prototypical. We are not aware of any quantitative studies on this subject, but simple Google searches show that for the mentioned examples, this seems to be a good hypothesis. The phrase “went to church” got 88,000 hits, against
13,500 for “went to the church”. “I will marry you” occurs 5,980 times; “I am going to marry you” only 442 times. “A cook” has about 712,000 occurrences while “a cooker” has only about 25,000. (This crude method is not applicable to “drill” vs. “driller” because the former also has an additional meaning as in “military drill” which pops up very often.) While queries at a search engine do not replace serious quantitative investigations, we take it to be a promising hypothesis that in case of a pragmatic competition, the less complex form tends to be restricted to the more frequent meaning and the more complex one to the less frequent interpretation.

It is straightforward to formalize this setting in a game. The players are speaker and hearer. There are two meanings that can be communicated, m0 and m1, and they have two forms at their disposal, f0 and f1. Each total function from meanings to forms is a speaker strategy, while hearer strategies are mappings from forms to meanings. There are four strategies for each player, as shown in Table 1.21.

Table 1.21: Strategies in the Horn game

Speaker                         Hearer
S1: m0 ↦ f0, m1 ↦ f1           H1: f0 ↦ m0, f1 ↦ m1
S2: m0 ↦ f1, m1 ↦ f0           H2: f0 ↦ m1, f1 ↦ m0
S3: m0 ↦ f0, m1 ↦ f0           H3: f0 ↦ m0, f1 ↦ m0
S4: m0 ↦ f1, m1 ↦ f1           H4: f0 ↦ m1, f1 ↦ m1

It is decided by nature which meaning the speaker has to communicate. The probability that nature chooses m0 is higher than the probability of m1. Furthermore, form f0 is less complex than form f1. So far this is not different from the signaling games from section 2. However, we assume here that talk is not cheap. (For simplicity’s sake, we identify both types and actions with meanings here.) The speaker has an interest in minimizing the complexity of the expression involved. One might argue that the hearer also has an interest in minimizing complexity. However, the hearer is confronted with a given form and has to make sense of it. He or she has no way to influence the complexity of that form or the associated meaning. Therefore there is no real point in making complexity part of the hearer’s utility function.

To keep things simple, let us make up some concrete numbers. Let us say that the probability of m0 is 75% and the probability of m1 25%. The costs of f0 and f1 are 0.1 and 0.2 respectively. The unit is the reward for successful communication – so we assume that it is 10 times as important for the speaker to get the message across as to avoid the difference in costs between f1 and f0. We exclude strategies where the speaker does not say anything at all, so the minimum cost of 0.1 unit is unavoidable. The utility of the hearer for a given pair of a hearer strategy and a speaker strategy is the average number of times that the meaning comes across correctly given the strategies and nature’s probability distribution. Formally this means that

$u_h(H, S) = \sum_m p_m \times \delta_m(S, H)$

where the $\delta$-function is defined as

$\delta_m(S, H) = \begin{cases} 1 & \text{if } H(S(m)) = m \\ 0 & \text{else} \end{cases}$

The speaker shares the interest in communicating successfully, but he also wants to avoid costs. So his utility function comes out as

$u_s(S, H) = \sum_m p_m \times (\delta_m(S, H) - \mathrm{cost}(S(m)))$
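Before looking at the resulting numbers, here is a small sketch (our own code, not part of the original text) that computes the utility matrix from these definitions and the stipulated parameter values, and also checks which strategy pairs are strict Nash equilibria; it reproduces Table 1.22 below and the two equilibria (S1, H1) and (S2, H2) discussed there.

```python
# Derive the Horn-game utility matrix from p(m0)=.75, p(m1)=.25,
# cost(f0)=.1, cost(f1)=.2, and locate the strict Nash equilibria.

p = {"m0": 0.75, "m1": 0.25}
cost = {"f0": 0.1, "f1": 0.2}

S = {"S1": {"m0": "f0", "m1": "f1"}, "S2": {"m0": "f1", "m1": "f0"},
     "S3": {"m0": "f0", "m1": "f0"}, "S4": {"m0": "f1", "m1": "f1"}}
H = {"H1": {"f0": "m0", "f1": "m1"}, "H2": {"f0": "m1", "f1": "m0"},
     "H3": {"f0": "m0", "f1": "m0"}, "H4": {"f0": "m1", "f1": "m1"}}

def u(s, h):
    us = uh = 0.0
    for m, pm in p.items():
        delta = 1.0 if H[h][S[s][m]] == m else 0.0   # did the meaning come across?
        uh += pm * delta
        us += pm * (delta - cost[S[s][m]])
    return round(us, 3), round(uh, 2)

matrix = {(s, h): u(s, h) for s in S for h in H}
for s in S:
    print(s, [matrix[s, h] for h in H])

def strict_nash(s, h):
    best_s = all(matrix[s, h][0] > matrix[s2, h][0] for s2 in S if s2 != s)
    best_h = all(matrix[s, h][1] > matrix[s, h2][1] for h2 in H if h2 != h)
    return best_s and best_h

print([sh for sh in matrix if strict_nash(*sh)])   # [('S1', 'H1'), ('S2', 'H2')]
```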
With the chosen numbers, this gives us the utility matrix in Table 1.22.

Table 1.22: Utility matrix of the Horn game

          H1              H2              H3              H4
S1    (.875 ; 1.0)    (−.125 ; 0.0)   (.625 ; .75)    (.125 ; .25)
S2    (−.175 ; 0.0)   (.825 ; 1.0)    (.575 ; .75)    (.075 ; .25)
S3    (.65 ; .75)     (.15 ; .25)     (.65 ; .75)     (.15 ; .25)
S4    (.05 ; .25)     (.55 ; .75)     (.55 ; .75)     (.05 ; .25)

The first question that might come to mind is what negative utilities are supposed to mean in EGT. Utilities are the expected number of offspring – what is negative offspring? Recall though that if applied to cultural language evolution, the replicating individuals are utterances, and the
mode of replication is imitation. Here the utilities represent the difference in the absolute abundance of a certain strategy at a given point in time and at a later point. A negative utility thus simply means that the number of utterances generated by a certain strategy is absolutely declining. Also, neither the replicator dynamics nor the locations of ESSs or Nash equilibria change if a constant amount is added to all utilities within a matrix. It is thus always possible to transform any given matrix into an equivalent one with only non-negative entries. We are dealing with an asymmetric game. Here all and only the strict Nash equilibria are evolutionarily stable. There are two such stable states in the game at hand: (S1 , H1 ) and (S2 , H2 ). As the reader may verify, these are the two strategy configurations where both players use a 1-1 function, the hearer function is the inverse of the speaker function, and where thus communication always succeeds. EGT thus predicts the emergence of signaling conventions in the Lewisian sense. It does not predict though that the “Horn strategy” (S1 , H1 ) is in any way superior to the “anti-Horn strategy” (S2 , H2 ) where the complex form is used for the frequent meaning. There are various reasons why the former strategy is somehow “dominant”. First, it is Pareto optimal (recall the discussion of Pareto optimality on page 23). This means that for both players, the utility that they get if both play Horn is at least as high as in the other ESS where they both play anti-Horn. For the speaker Horn is absolutely preferable. Horn also risk-dominates anti-Horn. This means that if both players play Horn, either one would have to lose a lot by deviating unilaterally to anti-Horn, and this “risk” is at least as high as the inverse risk, i.e., the loss in utility from unilaterally deviating from the anti-Horn equilibrium. For the speaker, this domination is strict. However, these considerations are based on a rationalistic conception of GT , and they are not directly applicable to EGT . There are two arguments
for the domination of the Horn strategy that follow directly from the replicator dynamics.

• A population where all eight strategies are equally likely will converge towards a Horn strategy. Figure 1.11 gives the time series for all eight strategies if they all start at 25% probability. Note that the hearers first pass a stage where strategy H3 is dominant. This is the strategy where the hearer always “guesses” the more frequent meaning – a good strategy as long as the speaker is unpredictable. Only after the speaker starts to clearly differentiate between the two meanings does H1 begin to flourish.
Figure 1.11: Time series of the Horn game (relative frequencies of the speaker strategies S1–S4 and the hearer strategies H1–H4)
• While both Horn and anti-Horn are attractors under the replicator dynamics, the former has a much larger basin of attraction than the latter. We are not aware of a simple way of analytically calculating the ratio of the sizes of the two basins, but a numerical approximation revealed that the basin of attraction of the Horn strategy is about 20 times as large as the basin of attraction of the anti-Horn strategy. (A sketch of such a numerical estimate is given at the end of this subsection.)

The asymmetry between the two ESSs becomes even more apparent when the idealization of an infinite population is lifted. In the next section we will briefly explore the consequences of this.
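The following sketch (our own code; sample size, step size and the convergence thresholds are illustrative) shows how such an estimate can be obtained: random initial populations are drawn, the deterministic replicator dynamics is iterated, and the runs ending up near each of the two strict equilibria are counted.

```python
# Rough basin-of-attraction estimate for the Horn game of Table 1.22.
import random

US = [[.875, -.125, .625, .125],      # speaker payoffs u_s(Si, Hj)
      [-.175, .825, .575, .075],
      [.65, .15, .65, .15],
      [.05, .55, .55, .05]]
UH = [[1.0, 0.0, .75, .25],           # hearer payoffs u_h(Si, Hj)
      [0.0, 1.0, .75, .25],
      [.75, .25, .75, .25],
      [.25, .75, .75, .25]]
UHT = [list(col) for col in zip(*UH)]  # rows indexed by hearer strategy

def update(pop, payoff, opponent, dt=0.05):
    fit = [sum(a * o for a, o in zip(row, opponent)) for row in payoff]
    avg = sum(x * f for x, f in zip(pop, fit))
    pop = [max(x + dt * x * (f - avg), 0.0) for x, f in zip(pop, fit)]
    total = sum(pop)
    return [x / total for x in pop]

def random_simplex():
    w = [random.random() for _ in range(4)]
    return [x / sum(w) for x in w]

horn = anti = 0
for _ in range(200):                   # 200 random starting populations
    x, y = random_simplex(), random_simplex()
    for _ in range(3000):
        x, y = update(x, US, y), update(y, UHT, x)
    if x[0] > .9 and y[0] > .9:
        horn += 1
    elif x[1] > .9 and y[1] > .9:
        anti += 1
print(horn, anti)                      # Horn should be reached far more often
```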
3.5 All equilibria are stable, but some equilibria are more stable than others: Stochastic EGT
Let us now have a closer look at the modeling of mutations in EGT. Evolutionary stability means that a state is stationary and resistant against small amounts of mutations. This means that the replicator dynamics is tacitly assumed to be combined with a small stream of mutation from each strategy to each other strategy. The level of mutation is assumed to be constant. An
evolutionarily stable state is a state that is an attractor in the combined dynamics and remains an attractor as the level of mutation converges towards zero. The assumption that the level of mutation is constant and deterministic, though, is actually an artifact of the assumption that populations are infinite and time is continuous in standard EGT. Real populations are finite, and both games and mutations are discrete events in time. So a more fine-grained modeling should assume finite populations and discrete time.

Now suppose that for each individual in a population, the probability to mutate towards the strategy s within one time unit is p, where p may be very small but still positive. If the population consists of n individuals, the chance that all individuals end up playing s at a given point in time is at least $p^n$, which may be extremely small but is still positive. By the same kind of reasoning, it follows that there is a positive probability for a finite population to jump from each state to each other state due to mutation (provided each strategy can be the target of mutation of each other strategy). More generally, in a finite population the stream of mutations is not constant but noisy and non-deterministic. Hence there are strictly speaking no evolutionarily stable strategies, because every invasion barrier will eventually be overcome, no matter how low the average mutation probability or how high the barrier.14

If an asymmetric game has exactly two SNEs, A and B, in a finite population with mutations there is a positive probability pAB that the system moves from A to B due to noisy mutation, and a probability pBA for the reverse direction. If pAB > pBA, the former change will on average occur more often than the latter, and in the long run the population will spend more time in state B than in state A. Put differently, if such a system is observed at some arbitrary time, the probability that it is in state B is higher than that it is in A. The exact value of this probability converges towards pAB/(pAB + pBA) as time grows to infinity.

If the level of mutation gets smaller, both pAB and pBA get smaller, but at a different pace. pBA approaches 0 much faster than pAB. Thus pAB/(pAB + pBA) (and thus the probability of the system being in state B) converges to 1 as the mutation rate converges to 0. So while there is always a positive probability that the system is in state A, this probability can become arbitrarily small. A state is called stochastically stable if its probability converges to a value > 0 as the mutation rate approaches 0. In the described scenario, B would be the only stochastically stable state, while both A and B are evolutionarily stable. The notion of stochastic stability is a strengthening of the concept of evolutionary stability; every stochastically stable state is also evolutionarily stable,15 but not the other way round.
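To make this concrete, the following sketch (our own code; the population size, mutation probability and step size are illustrative choices, not the parameter values of the experiment reported below and in note 16) implements such a noisy finite-population dynamics, a discrete-time replicator step plus binomially distributed mutations, for the Horn game of Table 1.22.

```python
# Minimal noisy finite-population dynamics for the Horn game.
import numpy as np

US = np.array([[.875, -.125, .625, .125], [-.175, .825, .575, .075],
               [.65, .15, .65, .15], [.05, .55, .55, .05]])
UH = np.array([[1.0, 0.0, .75, .25], [0.0, 1.0, .75, .25],
               [.75, .25, .75, .25], [.25, .75, .75, .25]])

N, MU, DT = 100, 0.002, 0.05          # population size, mutation prob., step size
rng = np.random.default_rng(0)

def step(pop, payoff, opponent):
    """One replicator step plus random mutation for one population."""
    fit = payoff @ opponent                        # expected payoff per strategy
    pop = pop + DT * pop * (fit - pop @ fit)       # replicator part
    counts = np.maximum((pop * N).astype(int), 0)
    flips = rng.binomial(counts[:, None], MU, size=(4, 4))   # mutants i -> j
    np.fill_diagonal(flips, 0)
    pop = pop + (flips.sum(axis=0) - flips.sum(axis=1)) / N
    pop = np.clip(pop, 0.0, None)
    return pop / pop.sum()

x = np.full(4, 0.25)                  # speaker population, uniform start
y = np.full(4, 0.25)                  # hearer population
near_horn = 0
T = 100_000
for _ in range(T):
    x, y = step(x, US, y), step(y, UH.T, x)
    near_horn += (x[0] > 0.75) and (y[0] > 0.75)
print(near_horn / T)                  # fraction of time spent near the Horn state
```

With these particular parameters the population typically just stays near the equilibrium that the deterministic flow reaches from the uniform starting point; observing occasional jumps between the Horn and the anti-Horn state, as in the simulation discussed next, requires a suitable choice of population size and mutation rate.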
We can apply these considerations to the equilibrium selection problem in the Horn game from the last subsection. Figure 1.12 shows the results of a simulation, using a stochastic dynamics in the described way.16 The left-hand figure shows the proportion of the Horn strategy S1 and the figure on the right the anti-Horn strategy S2. The other two speaker strategies remain close to zero. The development for the hearer strategies is pretty much synchronized.

Figure 1.12: Simulation of the stochastic dynamics of the Horn game (left: proportion of the Horn strategy; right: proportion of the anti-Horn strategy)

During the simulation, the system spent 67% of the time in
a state with a predominant Horn strategy and only 26% with predominant anti-Horn (the remaining time is spent in transitions between the two). This seems to indicate strongly that the Horn strategy is in fact the more probable one, which in turn indicates that it is the only stochastically stable state.

The literature contains some general results about how to find the stochastically stable states of a system analytically, but they are all confined to 2 × 2 games. This renders them practically useless for linguistic applications because here, even in very abstract models like the Horn game, we deal with more than two strategies per player. For larger games, analytical solutions can only be found by studying the properties in question on a case by case basis. It would take us too far to discuss possible solution concepts here in detail (see for instance Young 1998 or Ellison 2000). We will just sketch such an analytical approach for the Horn game, which turns out to be comparatively well-behaved.

To check which of the two ESSs of the Horn game are stochastically stable, we have to compare the height of their invasion barriers. How many speakers must deviate from the Horn strategy such that even the smallest hearer mutation causes the system to leave the basin of attraction of this strategy and to move towards the anti-Horn strategy? And how many hearer-mutations would have this effect? The same questions have to be answered for the anti-Horn strategy, and the results to be compared.
Consider speaker deviations from the Horn strategy. It will only lead to an incentive for the hearer to deviate as well if H1 is not the optimal response to the speaker strategy anymore. This will happen if at least 50% of all speakers deviate toward S2 , 66.7% deviate towards S4 , or some combination of such deviations. It is easy to see that the minimal amount of deviation having the effect in question is 50% deviating towards S2 .17 As for hearer deviation, it would take more than 52.5% mutants towards H2 to create an incentive for the speaker to deviate towards S2 , and even about 54% of deviation towards H4 to have the same effect. So the invasion barrier along the hearer dimension is 52.5%. Now suppose the system is in the anti-Horn equilibrium. As far as hearer utilities are concerned, Horn and anti-Horn are completely symmetrical, and thus the invasion barrier for speaker mutants is again 50%. However, if more than 47.5% of all hearers deviate towards H1 , the speaker has an incentive to deviate towards S1 . In sum, the invasion barriers of the Horn and of the anti-Horn strategy are 50% and 47.5% respectively. Therefore a “catastrophic” mutation from the latter to the former, though unlikely, is more likely than the reverse transition. This makes the Horn strategy the only stochastically stable state. In this particular example, only two strategies for each player played a role in determining the stochastically stable state. The Horn game thus behaves effectively as a 2 × 2 game. In such games stochastic stability actually coincides with the rationalistic notion of “risk dominance” that was briefly discussed above. In the general case, it is possible though that a larger game has two ESSs, but there is a possible mutation from one equilibrium towards a third state (for instance a non-strict Nash equilibrium) that lies within the basin of attraction of the other ESS. The stochastic analysis of larger games has to be done on a case-by-case basis to take such complex structures into account. In standard EGT, as well as in the version of stochastic EGT discussed here, the utility of an individual at each point in time is assumed to be exactly the average utility this individual would get if it played against a perfectly representative sample of the population. Vega-Redondo (1996) discusses another variant of stochastic EGT where this idealization is also given up. In this model, each individual plays a finite number of tournaments in each time step, and the gained utility – and thus the abundance of offspring – becomes a stochastic notion as well. He shows that this model sometimes leads to a different notion of stochastic stability than the one discussed here. A detailed discussion of this model would lead beyond the scope of this introduction though.
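The threshold computations above are easy to reproduce mechanically. The following sketch (our own code) restricts attention to the two strategies per player that matter for the comparison of Horn and anti-Horn and computes, for each equilibrium, the share of mutants in one population at which the other population first gains an incentive to switch; the resulting numbers are the 50%, 52.5% and 47.5% used above. (Thresholds for deviations towards the pooling strategies can be computed in the same way.)

```python
# Pairwise invasion barriers between Horn (S1,H1) and anti-Horn (S2,H2).

US = {("S1", "H1"): .875, ("S1", "H2"): -.125,
      ("S2", "H1"): -.175, ("S2", "H2"): .825}
UH = {("S1", "H1"): 1.0, ("S1", "H2"): 0.0,
      ("S2", "H1"): 0.0, ("S2", "H2"): 1.0}

def speaker_threshold(h_old, h_new, s_old, s_new):
    """Share of hearers playing h_new at which s_new becomes better than s_old."""
    a = US[s_old, h_old] - US[s_new, h_old]
    b = US[s_new, h_new] - US[s_old, h_new]
    return a / (a + b)

def hearer_threshold(s_old, s_new, h_old, h_new):
    """Share of speakers playing s_new at which h_new becomes better than h_old."""
    a = UH[s_old, h_old] - UH[s_old, h_new]
    b = UH[s_new, h_new] - UH[s_new, h_old]
    return a / (a + b)

# Starting from Horn (S1, H1):
print(round(hearer_threshold("S1", "S2", "H1", "H2"), 3))   # 0.5
print(round(speaker_threshold("H1", "H2", "S1", "S2"), 3))  # 0.525
# Starting from anti-Horn (S2, H2):
print(round(hearer_threshold("S2", "S1", "H2", "H1"), 3))   # 0.5
print(round(speaker_threshold("H2", "H1", "S2", "S1"), 3))  # 0.475
```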
4 Overview
With this book we hope to attract the attention of researchers and students of linguistics and of the philosophy of language that are interested in pragmatics. We hope to convince those readers of the potential and the relevance of game theory for linguistic pragmatics, and for the understanding of language in general. Likewise, we hope to convince working game theorists from other fields that natural language is an exciting area of application of their theory. Even though the roots of game theoretic pragmatics go back to the late sixties, it is still an emerging discipline. This makes the field diverse, and at the same time exciting and innovative. There is no agreement yet on a set of established ideas, concepts, and research questions, and in a sense, this is what makes the field so attractive for researchers from different backgrounds. In this volume, we hope to give a snapshot of the current state of this budding discipline. Lewis (1969) introduced signaling games for the study of linguistic conventions. His main aim was in line with Paul Grice’s project to base the (elusive) notion of ‘meaning’ on beliefs, desires, and intentions of the agents of a conversation. As suggested in section 2 of this Introduction, signaling games have been studied extensively by economists to investigate, among others, under which circumstances a message credibly conveys information about the world. This research does not have a big impact yet on linguistics. In the first contribution to this book, Robert Stalnaker seeks to close that gap, by showing the analogy between Grice’s philosophical analysis of meaning and the more recent game theoretical analysis of credible information exchange. In the second chapter, Prashant Parikh introduces his games of partial information and argues that they extend signaling games. He shows how some pragmatic phenomena can be accounted for within his framework, and points out that game theory might be the appropriate tool to account for probabilistic communication. In the latter part of his paper, Parikh argues that the utterance situation s is not only important to contribute the game model required to calculate the semantic meaning of an utterance, but also to determine which solution concept is appropriate to use. He suggests that this can be accounted for in terms of (a sequence of) higher order games. The main aim of this book is to show that game theory might shed new light on the study of language, mainly because it suggests that a very formal analysis of language use is within reach that takes a broader conception of language use than is standard in pragmatic analyses. However, by making use of game theoretical analyses, one also takes over its assumptions.
Nicholas Allott’s chapter contains a critical discussion of game theoretical analyses of communication. Because Prashant Parikh’s analysis is the oldest and arguably best worked-out analysis of this sort, he naturally concentrates his discussion on this. Allott argues that any analysis that makes use of standard game theory is based on some unmotivatedly strong assumptions, and suggests that some of these assumptions might be weakened by making use of some principles of Sperber and Wilson’s (1986) Theory of Relevance. Perhaps the main problem of game theoretical analysis of communication is the fact that such analyses typically predict that communication games have multiple equilibria, and that it is not a priori clear which one of those the conversational partners should, or will, coordinate on. A natural suggestion – also made by Prashant Parikh – is that of the various equilibria, agents typically converge to the Pareto optimal one, the equilibrium that gives to both participants the highest payoff. Natural as this proposal might seem, Sally (2003) has pointed out that in many game theoretical situations this is not the outcome we actually observe in case the preferences of the agents are not fully aligned. In those cases, avoidance of risk plays an important role as well. Following Sally’s observations, Robert van Rooij and Merlijn Sevenster discuss the importance of risk for the use of expressions with an intended non-literal interpretation, or with an underspecified meaning.

The chapter by Nicholas Asher and Madison Williams investigates the rational basis for the computation of pragmatic interpretation from semantic content. They argue that an analysis of pragmatic inference in terms of Lewisian coordination games is insufficient because that model lacks a principled account of equilibrium selection. To overcome this problem, they develop a dynamic version of Bacharach’s (1993) Variable Frame Theory, which in turn builds on Schelling’s (1960) notion of focal points. The compositional interpretation of an utterance, together with the mutual world knowledge, defines a starting point in a game dynamics, which in turn converges on the pragmatic interpretation of the utterance. This approach is motivated and illustrated with several default inference patterns from Asher and Lascarides’ (2003) Segmented Discourse Representation Theory.

Anton Benz’s chapter explains the possibility of partial and mention-some answers in the context of two-person games. Starting out with Groenendijk and Stokhof’s (1984) semantic approach he argues that their occurrence can be explained if we assume that they are embedded into contextually given decision problems. This builds on work by Merin (1999b) and especially van Rooij (2003b). He shows that intuitive judgments about the appropriateness of partial and mention-some answers are in accordance with the assumption that interlocutors are Bayesian utility maximizers. In the
second part of his chapter, he proves that explanations that are based on purely decision-theoretically defined measures of relevance cannot avoid picking out misleading answers.

The chapter by Kris de Jaegher shows that the grounding strategies of interlocutors can be characterized as evolutionarily stable equilibria in variants of the so-called electronic mail game (Rubinstein 1989). In conversation, it is not only necessary to achieve common knowledge about the meaning of utterances but also about the fact that some information has been communicated. The strategies employed by the interlocutors to achieve this goal are called their grounding strategies. Kris de Jaegher shows that separating equilibria in an electronic mail game have a natural interpretation as grounding strategies. In particular, he shows that Traum’s (1994) grounding acts are among the evolutionarily stable equilibria.

The chapter by Jacob Glazer and Ariel Rubinstein studies the rules of pragmatics in the context of a debate between two parties aiming to persuade a listener to adopt one of two opposing positions. The listener’s optimal conclusion depends on the state of the world initially known only to the two parties. The parties argue sequentially. Arguing entails providing some hard evidence. A persuasion rule determines the conclusion that the listener will draw from the arguments made. A state of the world and a persuasion rule determine a zero-sum game played by the two debaters. The outcome of the game is the conclusion drawn by the listener, which might be right or wrong. The chapter imposes a constraint on the amount of information that the listener can absorb and characterizes the persuasion rules that minimize the probability that the listener reaches the wrong conclusion. It is demonstrated that this optimization problem is affected by the language in which the persuasion rules are defined.

The last chapter in this volume, by Tom Lenaerts and Bart de Vylder, is of a somewhat different nature than the others. It concentrates not so much on the effects of our beliefs and preferences on what is communicated in an actual conversation, but rather on how a conventional language can emerge in which expressions have a meaning shared among a group of autonomous agents. It is the only contribution in this volume that makes use of the tools of evolutionary game theory. This chapter discusses the effect of a particular model of language learning on the evolution of a conventional communication system. We feel that this chapter is especially suited to this volume, because – and this in contrast to almost all other analyses of the evolution of language that give great weight to language learning – language learning in this model is not supposed to be passive, and only used by children, but rather active, where the learner’s language use also plays an important role.
Notes

1. The standard statistical relevance of a proposition E for a hypothesis H is defined by R(H, E) = P(H/E) − P(H). The standard statistical relevance and Good’s relevance are identical with respect to all properties that we use in this introduction; in particular, $R(H, E) = -R(\overline{H}, E)$.
2. We can look at $\log(P^+(H)/P^+(\overline{H}))$ as a (possibly negative) measure for our inclination to favor H over $\overline{H}$; hence $r_H(E)$ tells us how the strength of this inclination is updated. This is an advantage of $r_H(E)$ over the standard statistical notion of relevance P(H/E) − P(H).
3. See also Parikh’s contribution to this volume.
4. Parikh (Parikh 1991, Parikh 2001) studies what he calls Games of Partial Information and claims in his contribution to this volume that they are more general than the signaling games as studied in economics and biology.
5. Or, more generally, as a set of subsets of T.
6. If hearers use such an interpretation rule, speakers have no reason anymore to be vague. But, of course, vagueness can still have positive pay-off when one’s audience is unsure about your preferences.
7. See van Rooij and Schulz (2004) for more discussion.
8. In the standard model of EGT, populations are – simplifyingly – thought of as infinite and continuous, so there are no minimal units.
9. A trajectory is the path of development of an evolving entity.
10. One might argue that the strategies of a language user in these two roles are not independent. If this correlation is deemed to be important, the whole scenario has to be formalized as a symmetric game.
11. A similar point can be made with regard to the prisoners’ dilemma, where the unique NE, general defection, is also the unique ESS, both in the symmetric and in the asymmetric conception.
12. A Markov process is a stochastic process where the system is always in one of finitely many states, and where the probability of the possible future behaviors of the system only depends on its current state, not on its history.
13. The utility of the hearer is the average information value of the interpretation that the hearer extracts from the speaker’s signal. If the information is incorrect, this information value is 0. This also holds if the interpretation that the hearer assigns to the signal is more specific than what the speaker intended. For instance, if the speaker wants to communicate the content CUTTER′ but the listener interprets the signal as CUTTER′ − KNIFE′, the utility is 0. If the speaker uses a mixed strategy, the utility is the weighted average of the pure strategies involved. So if S is a probability distribution over speaker strategies and H is a hearer strategy, the general formula is

$u_h(S, H) = \sum_m p(m) \cdot \sum_{s \in S} \chi_{m \subseteq H(s(m))} \cdot i(H(s(m)))$

Here $\chi_\phi$ is the characteristic function of the proposition $\phi$. The speaker utility is calculated similarly, except that the costs of the signal are taken into account as negative utility.

$u_s(S, H) = \sum_m p(m) \cdot \sum_{s \in S} (\chi_{m \subseteq H(s(m))} \cdot i(H(s(m))) - \mathrm{cost}(s(m)))$
14. This idea was first developed in Kandori et al. (1993) and Young (1993). Fairly accessible introductions to the theory of stochastic evolution are given in Vega-Redondo (1996) and Young (1998).
15. Provided the population is sufficiently large, that is. Very small populations may display a weird dynamic behavior, but we skip over this side aspect here.
16. The system of difference equations used in the experiment is

$\frac{\Delta x_i}{\Delta t} = x_i\left((Ay)_i - \langle x \times Ay\rangle\right) + \sum_j \frac{Z_{ji} - Z_{ij}}{n}$

$\frac{\Delta y_i}{\Delta t} = y_i\left((Bx)_i - \langle y \times Bx\rangle\right) + \sum_j \frac{Z_{ji} - Z_{ij}}{n}$

where x, y are the vectors of the relative frequencies of the speaker strategies and hearer strategies, and A and B are the payoff matrices of speakers and hearers respectively. For each pair of strategies i and j belonging to the same player, $Z_{ij}$ gives the number of individuals that mutate from i to j. $Z_{ij}$ is a random variable which is distributed according to the binomial distribution $b(p_{ij}, \lfloor x_i n\rfloor)$ (or $b(p_{ij}, \lfloor y_i n\rfloor)$ respectively), where $p_{ij}$ is the probability that an arbitrary individual of type i mutates to type j within one time unit, and n is the size of the population. We assumed that both populations have the same size.
17. Generally, if $(s_i, h_j)$ form a SNE, the hearer has an incentive to deviate from it as soon as the speaker chooses a mixed strategy x such that for some $k \neq j$, $\sum_{i'} x_{i'} u_h(s_{i'}, h_k) > \sum_{i'} x_{i'} u_h(s_{i'}, h_j)$. The minimal amount of mutants needed to drive the hearer out of the equilibrium would be the minimal value of $1 - x_i$ for any mixed strategy x with this property. (The same applies ceteris paribus to mutations on the hearer side.)
References Anscombre, J. C. and O. Ducrot (1983). L’Argumentation dans la Langue. Mardaga, Brussels. Asher, N. and A. Lascarides (2003). Logics of Conversation. Cambridge University Press, Cambridge (UK). Bacharach, M. (1993). Variable universe games. In K. Binmore, A. Kirman, and P. Tani, eds., Frontiers of Game Theory. MIT Press, Cambridge, MA. Crawford, V. and J. Sobel (1982). Strategic information transmission. Econometrica, 50, 1431–1451. Ducrot, O. (1973). La preuve et le dire. Mame, Paris. Ellison, G. (2000). Basins of attraction, long run equilibria, and the speed of step-bystep evolution. Review of Economic Studies, 67(1), 17–45. Farrell, J. (1988). Communication, coordination and Nash equilibrium. Economic Letters, 27, 209–214. Farrell, J. (1993). Meaning and credibility in cheap-talk games. Games and Economic Behavior, 5, 514–531. Fauconnier, G. (1975). Pragmatic scales and logical structure. Linguistic Inquiry, 6, 353–375. Frege, G. (1918). Der Gedanke: eine logische Untersuchung. Beitrage zur Philosophie des deutschen Idealismus, 1, 58–77.
Gibson, R. (1992). A Primer in Game Theory. Harvester Wheatsheaf, Hertfordshire. Good, I. (1950). Probability and the Weighing of Evidence. Griffin, London. Grice, H. P. (1967). Logic and conversation. In William James Lectures. Harvard University. Reprinted in Studies in the Way of Words, 1989, Harvard University Press, Cambridge, MA. Groenendijk, J. and M. Stokhof (1984). Studies on the Semantics of Questions and the Pragmatics of Answers. Ph.D. thesis, University of Amsterdam. Harsanyi, J. C. (1967–1968). Games with incomplete information played by ’bayesian’ players. Management Science, I to III(14), 159–182, 320–334, 486–502. Hirschberg, J. (1985). A Theory of Scalar Implicatures. Ph.D. thesis, University of Pennsylvania. Horn, L. (1991). Given as new: When redundant affirmation isn’t. Journal of Pragmatics, 15, 313–336. Horn, L. (1993). Economy and redundancy in a dualistic model of natural language. In S. Shore and M. Vilkuna, eds., 1993 Yearbook of the Linguistic Association of Finland, pp. 33–72. SKY. de Jaegher, K. (2003). A game-theoretical rationale for vagueness. Linguistics and Philosophy, 26, 637–659. J¨ager, G. (2004). Evolutionary Game Theory and typology: a case study. Manuscript, University of Potsdam and Stanford University. Kandori, M., G. Mailath, and R. Rob (1993). Learning, mutation, and long-run equilibria in games. Econometrica, 61, 29–56. Kreps, D. and R. Wilson (1982). Sequential equilibrium. Econometrica, 50, 863–894. Labov, W. (1972). Sociolinguistic Patterns. University of Pennsylvania Press, Philadelphia. Lewis, D. (1969). Convention. Harvard University Press, Cambridge, MA. Lipman, B. (2003). Language and economics. In M. Basili, N. Dimitri, and I.Gilboa, eds., Cognitive Processes and Rationality in Economics. Routledge, London. Lipman, B. and D. Seppi (1995). Robust inference in communication games with partial provability. Journal of Economic Theory, 66, 370–405. Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge University Press, Cambridge (UK). Merin, A. (1999a). Die relevance der relevance: Fallstudie zur formalen semantik der englischen konjuktion but. Habilitationschrift, University Stuttgart. Merin, A. (1999b). Information, relevance, and social decision making: Some principles and results of decision-theoretic semantics. In L. Moss, J. Ginzburg, and M. de Rijke, eds., Logic, Language, and Information, volume 2, pp. 179–221. CSLI Publications, Stanford. von Neumann, J. and O. Morgenstern (1944). The Theory of Games and Economic Behavior. Princeton University Press, Princeton. Nowak, M. A., N. L. Komarova, and P. Niyogi (2002). Computational and evolutionary aspects of language. Nature, 417, 611–617. Parikh, P. (1991). Communication and strategic inference. Linguistics and Philosophy, 14, 473–513. Parikh, P. (1992). A game-theoretical account of implicature. In Y. Vardi, ed., Theoretical Aspects of Rationality and Knowledge. TARK IV, Monterey, California. Parikh, P. (2001). The Use of Language. CSLI Publications, Stanford. Parikh, R. (1994). Vagueness and utility: The semantics of common nouns. Linguistics and Philosophy, 17, 521–535.
Pratt, J., H. Raiffa, and R. Schlaifer (1995). Introduction to Statistical Decision Theory. The MIT Press, Cambridge, MA. Rabin, M. (1990). Communication between rational agents. Journal of Economic Theory, 51, 144–170. van Rooij, R. (2003a). Being polite is a handicap: Towards a game theoretical analysis of polite linguistic behavior. In M. Tennenholtz, ed., Proceedings of the 9th conference on Theoretical Aspects of Rationality and Knowledge. ACM Press, New York. van Rooij, R. (2003b). Questioning to resolve decision problems. Linguistics and Philosophy, 26, 727–763. van Rooij, R. (2004). Signalling games select Horn strategies. Linguistics and Philosophy, 27, 493–527. van Rooij, R. and K. Schulz (2004). Exhaustive interpretation of complex sentences. Journal of Logic, Language and Information, (13), 491–519. Rubinstein, A. (1989). The electronic mail game: strategic behavior under ‘almost common knowledge,’. American Economic Review, 79, 385–391. Sally, D. (2003). Risky speech: behavioral game theory and pragmatics. Journal of Pragmatics, 35, 1223–1245. Schelling, T. (1960). The Strategy of Conflict. Harvard University Press. Selten, R. (1980). A note on evolutionarily stable strategies in asymmetric animal conflicts. Journal of Theoretical Biology, 84, 93–101. Sperber, D. and D. Wilson (1986). Relevance. Communication and Cognition. Basil Blackwell, Oxford. Taylor, P. and L. Jonker (1978). Evolutionarily stable strategies and game dynamics. Mathematical Biosciences, 40, 145–156. Traum, D. R. (1994). A Computational Theory of Grounding in Natural Language Conversation. Ph.D. thesis, University of Rochester. Vega-Redondo, F. (1996). Evolution, Games, and Economic Behaviour. Oxford University Press, Oxford. Young, H. P. (1993). The evolution of conventions. Econometrica, 61, 57–84. Young, H. P. (1998). Individual Strategy and Social Structure. An Evolutionary Theory of Institutions. Princeton University Press, Princeton. Zahavi, A. (1975). Mate selection — a selection for a handicap. Journal of Theoretical Biology, 53, 205–213. Zahavi, A. and A. Zahavi (1997). The Handicap Principle. A missing piece of Darwin’s puzzle. Oxford University Press, Oxford.
2 Saying and Meaning, Cheap Talk and Credibility

Robert Stalnaker
In May 2003, the US Treasury Secretary, John Snow, in response to a question, made some remarks that caused the dollar to drop precipitously in value. The Wall Street Journal sharply criticized him for ‘playing with fire,’ and characterized his remarks as ‘dumping on his own currency’, ‘bashing the dollar,’ and ‘talking the dollar down’. What he in fact said was this: ‘When the dollar is at a lower level it helps exports, and I think exports are getting stronger as a result.’ This was an uncontroversial factual claim that everyone, whatever his or her views about what US government currency policy is or should be, would agree with. Why did it have such an impact? ‘What he has done,’ Alan Blinder said, ‘is stated one of those obvious truths that the secretary of the Treasury isn’t supposed to say. When the secretary of the Treasury says something like that, it gets imbued with deep meaning whether he wants it to or not.’ Some thought the secretary knew what he was doing, and intended his remarks to have the effect that they had. (‘I think he chose his words carefully,’ said one currency strategist.) Others thought that he was ‘straying from the script,’ and ‘isn’t yet fluent in the delicate language of dollar policy.’ The Secretary, and other officials, followed up his initial remarks by reiterating that the government’s policy was to support a strong dollar, but some still saw his initial remark as a signal that ‘the Bush administration secretly welcomes the dollar’s decline.’1 The explicit policy statements apparently lacked credibility. Perhaps their meaning was not as deep as the meaning of the ‘secret’ signal that currency traders and other observers had read into the initial remarks.

Meaning, whether deep or on the surface, is an elusive and complicated matter. It is difficult to be clear about exactly what is going on in even the most direct, literal acts of communication. My aim in this chapter is to bring out some of the problems by looking at two very different projects that each try to say something about what it is to mean things. I will first take a look back at Paul Grice’s analysis of meaning, and the wider project of which this analysis was the cornerstone. Then I will discuss some attempts by
game theorists to give an account of acts of signaling, particularly of acts that are, in a sense to be defined, acts of pure communication. I think that these two projects, which use some similar ideas in very different ways, and face some related problems, throw light on each other. My discussion will be preliminary, speculative, and indirect, looking only at highly artificial and idealized situations, and reaching only tentative conclusions even about them, so I should begin with the kind of qualifications and expressions of reservation for which Grice was famous. Echoing Grice, ‘What follows is [only] a sketch of a direction.’ ‘We should recognize that at the start we shall be moving fairly large conceptual slabs around a somewhat crudely fashioned board.’ But I think that if we can get clear about how some basic concepts work in a very simple setting, this will help us to understand the kind of strategic reasoning that is involved in more complex and interesting communicative situations. My plan is this: I will begin by sketching the Gricean project, as I understand it – its motivation and some of its central ideas. Then I will look at the game theoretic project – a project that focuses on the role of what is called ‘cheap talk,’ and on the idea of credibility. I will point to some of the parallels in the two projects, make some suggestions, which are influenced by Grice’s project, about the way the idea of credibility might be characterized, and look at the consequences of these suggestions for some simple games. I will conclude by considering how these ideas might be generalized to slightly more complex and realistic situations, and how they might help to clarify some of the patterns of reasoning involved in real communication. We will see, in the end, if we can get any insight into the question why the secretary of the Treasury didn’t just say what he meant, and why others did not take him to be doing so. The Gricean project was begun in a philosophical environment (Oxford ‘ordinary language’ philosophy of the 1950’s) that now seems very distant and alien. Grice was very much a part of this philosophical movement, but was also reacting to it. The ordinary language philosophers shared with the philosophers in the logical empiricist tradition the idea that philosophical problems were essentially problems about language, but they were reacting to that tradition’s emphasis on the artificial languages of logic, and the method of clarification by translation into such formal languages, a procedure that abstracted away from the speaker and context, and from the way natural languages were actually used. The ordinary language philosophers emphasized that speech is a kind of action. To use terminology not current at the time, their focus was on pragmatics rather than just semantics. Meaning was to be understood in terms of the way that a speech act is intended to affect the situation in which it is performed.
Grice’s distinctive project2 was to provide a philosophical analysis of speaker meaning – to give necessary and sufficient conditions for a claim of the form ‘S [a speaker] means that P by u [an utterance].’ Speaker meaning (which Grice called ‘nonnatural meaning,’ or ‘meaning-nn’ to contrast it with a sense of ‘meaning’ as a natural sign) was to be the basic semantic concept in terms of which the meanings of statements, sentences and words were to be explained. As with any project of reductive analysis, to clarify the aim, one needs to say what is being analyzed in terms of what – what is problematic, and what are the resources of analysis. Grice’s answer to this question is distinctive, and gave his project a character quite different from most philosophical projects of explaining meaning. His project was to explain semantic concepts in terms of the beliefs and intentions of the agents who mean things. In contrast, most philosophers, both before and after Grice, who are trying to say what it is for something to mean some particular thing, are addressing the problem of intentionality – the problem of how words (and thoughts) manage to connect with the world – how they can be about something, have propositional content, be true or false. Quine on radical translation, and Davidson on radical interpretation, Michael Dummett on theory of meaning, causal theories of reference, Jerry Fodor on the semantics for the language of thought – all of these projects are attempts to explain both mental and linguistic intentionality in non-intentional terms. The standard strategy for the explanation of intentionality was to begin with language, and then to explain the intentionality of belief, desire and intention as somehow derivative from the intentionality of language. Thinking is ‘saying in your heart,’ (perhaps in the language of thought); the mental act of judgment is the ‘interiorization’ of the act of assertion; believing is being disposed to affirm or assert, where the content of what one is disposed to say is to be explained in terms of the way the language as a whole is used by the speaker’s community. Intentionality arises out of the constitutive rules (to use John Searle’s term) of an institutional practice of speech.3 An important part of the motivation for Grice’s project was to reverse the direction of explanation: to return to the idea, more natural from a naive point of view, that speech is to be explained in terms of thought. A speech act is an action that like any rational action should be explained in terms of the purposes for which it is performed, and the agent’s beliefs about its consequences. Speech may be an institutionalized social practice, but it is a practice with a function that is intelligible independently of the practice, and we can get clearer about how the practice works by getting clear about what that function is. Grice’s idea was that speech is an institution whose function is to provide resources to mean things, and that what it is to mean things needs to be explained independently of the institution whose aim it is to provide the means to do it.
So the project is to explain the distinctive character of communicative action, taking for granted the normal resources for the explanation of rational action – beliefs, desires, values and ends, intentions. Step one is the simple idea that a communicative act is an attempt to get someone to believe something, but not every attempt to get someone to believe something is an act of meaning something. I might, for example, try to get the police to believe that the butler did it by putting the murder weapon in the butler’s pantry, and to do so would not be an act of meaning anything. The problem is to say what must be true about the way that one intends to induce a belief in order for an act done with that intention to be an act of meaning something. Step two is to add that the intention to induce a belief must be manifest or transparent (excluding the evidence-planting cases, which can work only if they are not recognized for what they are). It does seem to be a central feature of meaning that it is open – an act is a communicative one only if the intention to communicate is mutually recognized. (Communication can, of course, be devious and deceptive, but a speaker cannot attempt to deceive her interlocutor about what she intends him to understand her to be meaning.) Still, transparency is not enough. Grice used the example of Herod presenting the head of John the Baptist on a charger to Salome to illustrate that more was needed for meaning. Herod’s intention to induce the belief that John the Baptist had been beheaded was manifest, but this was not an act of meaning that he had been beheaded. What needed to be added, Grice argued, was that the recognition of the intention must play an essential role in accomplishing the primary intention to induce the belief. In an act of pure communication, the recognition of the intention is what does the work of inducing the belief. So this was Grice’s basic analysis: We may say that ‘A meantNN something by x’ is roughly equivalent to ‘A uttered x’ with the intention of inducing a belief by means of the recognition of this intention.4 The analysis was later refined and complicated in response to a barrage of counterexamples. Refinement of the Gricean analysis became one of those cottage industries that periodically take hold of the philosophical literature, with evermore complex counterexamples offered, and evermore complex clauses and qualifications added to the analysis in response. We will pass over the details. Our interest is not in vindicating the project of reductive analysis by getting the necessary and sufficient conditions for meaning exactly right, but in what the general ideas of the components of such an analysis might show about the way speakers and addressees reason about communicative acts. In particular, if an act of meaning something is an act of roughly this kind, then we can ask the following two questions about any act of uttering u5 in order to mean that P :
1 Why should uttering u be a way for S to get H to recognize her intention to get him to believe that P ? 2 Why should getting him to recognize her intention to get him to believe that P be a way of getting him to believe that P ? Question (1) will be answered in different ways in different situations. It could be that u is a natural sign (an act of smiling, frowning, or pointing, for example) that naturally tends to induce a belief, or to make prominent a thought, and as a result has come to be used. Or accidental associations may be noticed, and come to be mutually recognized, and reinforced over time. Obviously, the dominant way of meaning things is by saying them, which is to say by the use of an elaborate conventional system, codified and taught, that associates, in a systematic way, a range of sound patterns with a range of propositions, and Grice thought that this way of meaning things was in some sense central. But it was crucial for his project that meaning be intelligible independently of such institutionalized practices so that one can understand the practice in terms of the function – to mean things – that it is designed to serve, and so that one can better explain why people say what they say, and how sometimes they are able to exploit the rules of a linguistic practice in order to mean things different from what they are saying, or from what the conventional rules imply that they are saying. So while this first question must have an answer, in each particular case, in order for it to be possible for S to mean that P by uttering u, the question need not be answered in any particular way in order for the act to count as an act of meaning. The second question – why should getting H to believe that S intends him to recognize her intention to get him to believe that P be a means of getting him to believe that P ? – will have a satisfactory answer only if the pattern of priorities and beliefs is (or at least is believed to be) such as to give H reason to think that S would want him to believe that P only if P were true. (This is one of the things, as we will see, that the game theoretic apparatus can help to sharpen.) If the kind of intention that Grice uses to analyze speaker meaning is really essential to genuine communication, then it will be essential to the possibility of communication that there be a certain pattern of common interest between the participating parties. It will follow from the analysis of meaning that something like Grice’s cooperative principle, a principle that plays a central role in his theory of conversational implicature, is essential to the very idea of communication.6
1 Cheap talk signaling games7
As many people have noticed,8 Gricean ideas naturally suggest a game theoretic treatment. The patterns of iterated knowledge and belief that are
characteristic of game-theoretic reasoning are prominent in Grice’s discussions of speaker meaning, and the patterns of strategic reasoning that Grice discussed in the derivation of conversational implicatures are patterns that game theory is designed to clarify. (Grice’s general pattern: one may communicate by saying something that gets the addressee to reason in the following way: what must be true in order that it be rational for S to have said that? If the answer is, it must be true that P , and if it is transparent that the speaker intended the addressee to reason in this way, then whatever the literal meaning of what one said, this will count, on a Gricean analysis, as a case of meaning that P .) Grice never, to my knowledge, discussed the potential connection between his work and game theory, and some of the developments within game theory that are most relevant to Grice’s work (in particular, the explicit modeling of common knowledge and belief, and more generally of the epistemic foundations of game-theoretic reasoning9 ) occurred after his work on meaning and implicature. But game theory provides both some sharp tools for formulating some of Grice’s ideas, and some simple idealized models of examples to which those ideas might be applied. And I think Gricean ideas will throw some light on the problems game theorists face when they try to model communicative situations. I will make some remarks about the general game theoretic setting, and then describe the simple communication games that I will be concerned with. Following this, I will state the problem about meaning that arises in this context, and sketch and refine, informally, a response to the problem. A game is a sequence of decision problems, usually involving two or more agents, where the outcome depends on the way the actions of different agents interact. To define a game, one specifies the alternative actions available at each point in the playing of the game – which player gets to move, what information that player has about the prior moves of other players, and what the consequences of his or her move are for the subsequent course of the game. Sometimes there are chance moves in the game, in addition to moves by rational players. In such cases, a probability is specified for each chance move, and it is assumed that these probabilities are mutually known, and determine the prior beliefs of all the players about those moves. The definition of a game also specifies each player’s motivating values (utilities) for each of the alternative ultimate outcomes of the game. The definition of a game does not specify the beliefs and degrees of belief of the players about the actions of other players. Instead, it is normally assumed that the players will act rationally, and that it is common knowledge that they will act rationally. It is also assumed that the structure of the game is common knowledge among the players. In the early developments of game theory, there was no formal representation of the idea of common knowledge; it was just a part of the informal commentary used to motivate the notion of Nash equilibrium, and
various refinements of it, which were taken to be implicit analyses of an idea of game-theoretic rationality. While it was assumed that rationality required maximizing expected utility when probabilities were given and known, it was not assumed that a player had probabilistic degrees of beliefs about the rational actions of other players, except when it was known or assumed that the other player had chosen a mixed strategy – a strategy that allowed chance to determine his or her choice. In the contrasting Bayesian, or epistemic approach to game theory, developed later, the ideas of common knowledge and common belief were made formally explicit, and it was assumed that rationality was identified, in all cases, with maximizing expected utility. It was assumed that players have degrees of belief about the behavior of other rational agents, as well as about chance moves. The assumption of common knowledge, or common belief, that all players act rationally may determine those beliefs in some cases, but in other cases, the structure of the game, and the assumption that it is common knowledge that players will make rational choices, given their beliefs, will be compatible with different models for the game, where a model for a game provides a full specification of the beliefs and degrees of belief of each of the players about the behavior and beliefs of the others, as well as a specification of what move each player makes, and is disposed to make, at each choice point in the game. Given a model theory for a game, one can give a mathematically precise definition of a solution concept in epistemic terms by specifying a class of models that meet some intuitively plausible epistemic constraints. A strategy or strategy profile satisfies the solution concept if it is realized in some model in the class. So, to take the most basic solution concept, one may define the rationalizable strategies of any given game as the set of strategies each of which is realized in some model for the game that satisfies the condition that there is common belief among the players that all players choose rationally. One can then prove that this set of strategies coincides with the set determined by other definitions of rationalizability – for example, rationalizability defined as the set of strategies that survive the iterated elimination of strictly dominated strategies. The games I will be concerned with in this chapter will all be simple sender-receiver games that are designed to model acts whose sole purpose is the communication of information. In these games, one player (the sender, S) has some information (determined by an initial chance move in the game) that is unavailable to the other player (the receiver, R). The chance move (which may model any fact about the state of the world that is determined independently of the choices of the players) determines the sender’s type, which is simply a label for the state that the information puts the sender in. Only R can act, but the information about S’s type that he lacks will normally be relevant in one way or another both to the choice that R would want to make, and to the choice that S would want him to make. All S can
do to influence the outcome is to send a signal to R. R can then make his choice depend, in any way he chooses, on which of the alternative signals S sends. In the general case, the signal that S chooses to send might or might not affect the options available to R, and the payoffs to the players, but in a cheap talk game, they do not. A cheap talk signal is, by definition, one that has no effect on the subsequent course of the game, except to give R the option of making his choice depend on which signal is sent. That is, the moves available to R, and the consequences of those moves for both S and R are independent of the signal that is sent. Normally, in game theory, a move in a game is characterized simply by the subsequent options and payoffs that the move makes available – by the subgame that results from the move. In the case of cheap talk, it is true by definition that each of the cheap talk moves available to the sender has exactly the same effect; the subgame that results from one signal is exactly the same as the subgame that results from any other. But the signal will have a point only if it conveys some information, information that is different from the information conveyed by alternative signals. If the theory is to provide any guidance, or any explanation for the choices of players in such games, something must be added to the description of the game that distinguishes the messages in a way that is relevant to the information that they might convey. Let me illustrate the problem with the following minimal signaling game, where there is no conflict of interest, and communication should be as easy and unproblematic as it gets: Example 1
          a1           a2
t1      10, 10        0, 0
t2       0, 0        10, 10
S is of either type t1 or t2 , determined by chance, with equal probability.10 The columns represent R’s two alternative actions, and the numbers in the cells of the matrix are the payoffs to S and R. Let us suppose that S may send either of two messages, m1 and m2 , and that she must send one or the other. So S has four alternative strategies: send m1 unconditionally, send m1 if she is of type t1 and m2 if of type t2 , send m2 if of type t1 and m1 if of type t2 , or send m2 unconditionally. R also has four alternative strategies for how to respond to the message: he may choose either action
unconditionally, or he may make his choice depend on the message in either of the two possible ways. It is clear that if information is to be conveyed, S must choose one of the two conditional strategies, and if the information is to be exploited, R must choose one of his conditional strategies, but nothing about the basic structure of the game favors one of the conditional strategies over the other, for either player. What we need to build in is something about the meaning or content of the messages, and to say how the fact that the messages have the meanings or contents that they have determines or constrains the effect of sending the messages.11 The resources available to the game theorist for solving this problem are similar to those available to Grice in his reductive project, which was to explain meaning in terms of a pattern of beliefs and intentions. The game theorist characterizes games and models for games in terms of the beliefs and motivating values of the agents, which in turn determine their intentions and actions, so he or she has the same resources. And there are more specific parallels between Grice’s project and the problem of representing meaning in signaling games: it was an important component of Grice’s analysis that an action counts as a case of meaning only if the recognition of the intention to induce a belief played an essential part in inducing the belief. The contrast was with the presentation of evidence that is intended to induce a belief by a means independent of facts about the utterer’s intentions. The same contrast is implicit in the idea of cheap talk, which contrasts with costly signaling, where something about the sender’s beliefs and priorities is demonstrated by an action that has consequences that are independent of the information sent, and that can be seen, on independent grounds, to be irrational unless the proposition the sender intends to communicate is true. (For example, one shows one’s wealth by acts of conspicuous consumption that would be prohibitive for one who is not wealthy.) Grice’s analysis suggested that the explanation for an act of meaning divides into two stages, corresponding to the two questions distinguished above that may be asked about why an utterance u was able to convey the information that P , and it is useful to divide the problem of explaining the meaning of messages in a signaling game in the same way. First, somehow, an action that has no external effect on the situation, and no intrinsic connection with any information (it does not present independent evidence) is able to convey a particular intention of the speaker to induce a belief. Second, the conveying of this intention to induce a certain belief is supposed to succeed in inducing the belief. The central way of explaining the first stage – of answering the first question – was in terms of a conventional device – a language – whose function is to mean things. The central way to mean something is to say it. The language provides a mutually recognized systematic correlation between actions that are easy (and cheap) to perform and certain items of information – propositions. So let us suppose that, in
our signaling games, S has such a device available to her. In specifying the game, we will specify the conventional meaning of the alternative signals that are available to S. The focus is then on the second stage of the explanation – on the question, under what conditions can such a device be used successfully to mean things: to convey information simply in virtue of the recognition of the sender’s manifest desire to send it. This is the question of credibility, which is the central concept in the discussion of cheap talk games. In general, an epistemic model for a game will contain a state space, or a set of alternative possible worlds that represent the alternative ways that the game might be played, and the alternative belief states that the players might be in. A proposition (or in the terminology of the statistician and decision theorist, an event) is represented by a subset of the state space, or a set of possible worlds. So to specify what the available messages say we associate with each message a proposition or event. In the general case, a message might express any proposition, but in our simple games, we will restrict possible messages to information about S’s type. One might have a restricted list of available messages, or one might assume that a rich language is available in which anything may be said about S’s type (for example, if there are four types, there will be 15 consistent propositions, and so 15 distinguishable messages that are available to be sent. One of them – the tautological proposition – is a message that is equivalent to sending no message at all; four others are determinate propositions that say that S is one particular type; the others convey partial information – e.g. that S is either of type 2 or type 3, or that S is not of type 2.) The idea of credibility is simple enough, and it is easy to see, intuitively, that in our simple minimal example, once we have endowed our messages with meaning, credible communication will be unproblematic. But as we will see, there are some complications in spelling the definition out in detail. I will characterize the simple idea by giving a rough and unrefined definition of credibility, together with an assumption about the effect of sending a credible message, that we can impose as a constraint on the game models we are interested in. The unrefined definition and assumption suffice so long as we don’t look beyond the simple and unproblematic cases, and I will illustrate how they work with our minimal example. I will then use some more complex examples to show that the account of credibility needs to be refined and qualified, and also to point to some of the complexities of reasoning about communication, and to the possibility that the meaning of a message might diverge from what the message literally says. First a definition of a preliminary concept, to be used in the definition of credibility: A message is prima facie rational (pf rational) for player S, of type t if and only if S prefers that R believe the content of the message.
Second, the definition of credibility in terms of pf rationality: A message is credible if and only if it is pf rational for some types, and only for types for which it is true. Third, the constraint: It is common belief that the content of any credible message that is sent is believed (by R). This constraint is to be added to the usual constraints that are used to give an epistemic definition of rationalizability: that the structure of the game is common belief, and that it is common belief that both players are rational (that they make choices that maximize their expected utility).12 In our simple coordination game Example 1, assume that the message m1 has the content ‘S is of type t1 ’ and that message m2 has the content ‘S is of type t2 ’. Obviously, m1 is pf rational for t1 , but not for t2 , and m2 is pf rational for t2 but not for t1 , so both messages are credible. Therefore, by the constraint, it is common belief that R will believe either message, if it is sent, and since it is also common belief that R is rational, it follows that it is true and common belief that R will play the strategy, ‘a1 if m1 , a2 if m2 ’. S’s best response to this strategy is to send m1 if she is of type t1 and m2 if she is of type t2 , so our assumptions imply that this is what she will do. Communication, in this simple game, will take place, and will be successful, in any model satisfying our constraints. But when we move beyond the simple cases, we see that our definitions are not so clear as one might hope, and the required refinements will bring out the holistic and interdependent character of credibility, and will also point to some of the subtleties of strategic reasoning about communication. I will start with a question about how the definition of pf rationality is to be understood: the definition says that for a message to be pf rational, S must prefer that R believe the content of the message, but prefer that to what? It is neither necessary nor sufficient, to capture the intended idea, that S should prefer that R believe the message rather than to remain in his prior belief state, since remaining in the prior belief state may not be a feasible option. Consider the game in Example 2. If S is type t2 , then her first choice is that R get no information at all – to remain in the prior belief state – because that would motivate him to choose a1 . But that is not an available option, since it is clear that the message ‘S is t1 ’ is a credible message that S would be rationally required to send if and only if she were of type t1 . So R will infer that S is not t1 if he does not get that message. So sending no message at all would induce the belief that S is either t2 or t3 , which (if R didn’t know which of the two it was) would result in action a3 , which is a worst outcome for t2 . But if t2 is able to reveal her type, R will instead choose a4 , which S (if she is of type t2 ) would prefer to a3 . So the message, ‘S is t2 ’, should be pf rational for t2 , since she prefers that R believe that message to the feasible
Example 2 (payoff matrix, with S's types t1, t2, t3 as rows and R's actions a1–a4 as columns; each cell gives the payoffs to S and R)
alternatives to believing it. Since this message is pf rational only for t2 , it is credible. Our definitions should ensure that S will reveal her actual type if she is t1 or t2 , and that R will believe her and respond appropriately. The expected effect on R of the feasible alternatives to a given message m may depend on whether those alternative messages are credible, which in turn may depend on whether the alternatives to those messages (including m itself) are credible. There is a circularity here, but it is not a vicious circularity, since it is not assumed that the players’ beliefs can be generated from the definition of the game, and the constraints on credibility. What the circularity implies is that sometimes a message will be credible in one model of a given game, but not in other models of the same game. Example 4, discussed below, will illustrate the phenomenon. Example 2 showed that sending no message may reveal information, whether the sender wants to reveal it or not. It is also true that sending a credible message may reveal more information than is contained in the explicit content of the message. We have said that a message is credible if it is sometimes pf rational,13 and also pf rational only when true; it is not required that it be pf rational, in all cases, when it is true. It might happen that a partial message is pf rational for some types for which it is true, and only for types for which it is true, but not for all types for which it is true. In such a case, if the message is sent, R will believe the message, but will also come to believe more. So, for example, if the disjunctive message, ‘S is either t1 or t2 ’ is credible, and rational for t1 to send, but not rational for t2 to send, then R would believe the message, if it were sent, but would also come to believe something stronger – that S is of type t1 . We need to take account of this possibility in the definition of pf rationality. What S must prefer, for a message to be pf rational, is that R believe the message in the way he would believe it if the message were received to all feasible alternatives. Example 3 illustrates this kind of situation. Here we assume that there are just two available messages: ‘S is t1 ’ or ‘S is not t1 ’. The second message is pf rational for t3 , and not for t1 or t2 . So it is credible, but will not
Example 3 (payoff matrix, with S's types t1, t2, t3 as rows and R's actions a1–a3 as columns; each cell gives the payoffs to S and R)
be sent by t2 . The first message is not credible, since if S is of type t2 , the message would be false, but she might have a motive to send it, and will definitely have a motive to send it if it is required that one of the two messages be sent. Here we have a case where the meaning of the messages (in Grice’s sense) diverges from what the messages literally say, and (like Grice’s phenomenon of conversational implicature) the divergence is explained in terms of what the messages literally say. Even though the first message literally means that S is t1 , it will manifestly express S’s intention to induce the belief that she is either t1 or t2 , and will succeed in doing this. It will not credibly communicate its literal content, and so is not strictly speaking credible, but it will credibly convey something weaker. And since it will be mutually recognized that the second message will be sent only by t3 , it will induce the stronger belief that it is manifestly intended to induce, that S is t3 . We noted above that credibility is a feature of a model of a game, since it depends on the pattern of S’s beliefs; sometimes a message is determined to be credible, or to be not credible, by the structure of the game, together with the general assumptions that define the relevant class of models. But with some games, a message might be credible in some of models that conform to the constraints, and not in others. Furthermore, it might happen, with such games, that in some models, R is mistaken or ignorant about whether a message is credible. Credibility, as we have defined it, is a property determined entirely by S’s beliefs and utilities, and while the utilities are assumed to be common knowledge, players’ beliefs are not. If R is mistaken or uncertain about what S believes, he may be mistaken or uncertain about whether her messages are credible. But it is not plausible to assume that credible messages are believed by R in cases where R does not realize that they are credible, so our constraint should not say that the content of a message that is sent and is actually credible is believed by R, but rather that the content of a message that is sent, and that is believed (by R) to be credible is believed by R. This will not make any difference in the cases where credibility is de-
termined by the structure of the game, but will matter for some potentially ambiguous cases. Example 4 is an illustration of a situation in which ignorance or error about credibility may arise.14
Example 4 (payoff matrix, with S's types t1, t2 as rows and R's actions a1–a3 as columns; each cell gives the payoffs to S and R)
If ‘S is t1’ is credible, and if S believes that R believes that it is credible, then S will definitely send this message, if she is of type t1. But then it will be true, and believed by R to be true, that the alternative message, ‘S is t2’ is not pf rational for t1, and this implies that it will also be credible (and believed by R to be credible). Under these assumptions, each message will be sent and believed if and only if it is true; communication will succeed. But the first message might not be credible, since if there is a significant chance that R will believe the first message, but not the second, then S will prefer to send the first message, and to have it believed, even if she is of type t2. In this case, neither message will be credible. Or it might happen that even though the messages are in fact credible, neither message is believed by R to be credible. The credibility of the messages is determined by the pattern of S’s beliefs, and the perceived credibility of the messages is determined by R’s beliefs about the pattern of S’s beliefs; in the case of this game, both are constrained, but not determined, by the structure of the game and the rationality and credibility constraints on the models. S always knows whether a message is credible, since she always knows her own beliefs and utilities, but in cases where R may be mistaken or uncertain about whether a message is credible, S may be unsure whether a credible message will in fact be believed, since she may be unsure whether R realizes that the message is credible. So she may be unsure what effect a given message would have, if sent, and her beliefs about this will affect the actual credibility of this message and of others. To take account of S’s potential uncertainty about the effect of her messages, we need, in the definition of the pf rationality of a message, to compare S’s expected value of the hypothesis that the message is sent, and believed, with the expected value of sending alternative messages. Here is our final15 definition:
A message m for S of type t is prima facie rational if and only if the expected value, for S, of sending message m, and having it believed, is at least as great as the expected value of sending any alternative message. Credibility is defined as before, and the credibility constraint should be as follows: It is common belief that the content of any message that is sent and that is believed by R to be credible is believed by R. We can then define the class of game models that satisfy this constraint (in the actual world of the model), along with the constraint that there is common belief among the players that both players choose rationally, and the sets of strategies for the players that are played in some model in the class defined.16 The simple sender-receiver games are intended to isolate pure communicative acts: to separate them from the complexities of more general strategic contexts. But ultimately, our interest is in the way that communication works in a wider setting, and with the way communicative acts interact with each other and with other kinds of rational decisions. I think this account of credibility can be generalized in a number of ways. First, the definitions apply straightforwardly to cases where the private information available to the sender concerns, not exogenous information determined by chance or nature, but information about other choices that the sender will make, before or after the message is sent. Any game, for example, might be preceded by a cheap talk move in which one player has the opportunity to announce her strategy for the rest of the game. Second, we can consider sequences of communicative moves by different players. There are simple sender-receiver games in which credible communication is not assured, but in which it would be assured if R had the opportunity to send a message to S prior to S’s message to R. (Informing her, credibly, that she has the beliefs that are required for credible communication.) Third, in games with more than two players, there may be broadcast messages that must go to many players at once, so that the credibility of the message depends on the effect it will have on players with different interests and different powers. Fourth, in a more general setting, there may be cases where it is uncertain whether or not a sender has certain information; in such cases, credibility requires not just confidence that the sender wants the receiver to know the truth, but also confidence that she knows the truth about the content of the message she is sending. In the simple theory, we make no assumptions about the effect of messages that are manifestly not credible, but such messages may have consequences that a more general theory should consider. They do give rise to the question, on the part of the receiver, ‘why did she say that, given it is obvious to both of us that it is not credible?’ We considered one very contrived
case (Example 3) where a literally incredible message managed to convey a meaning. One may hope that future developments in a more general setting will help to explain the role that the content of what is said may play even when it diverges from what is meant. I am going to conclude with an example that, while it is still a simple sender-receiver game, does gesture toward the kind of phenomenon that is illustrated by our opening story, and at some of the strategic complexities that might arise in a wider context.
Example 5 (payoff matrix, with S's types t1, t2 as rows and R's actions a1–a5 as columns; each cell gives the payoffs to S and R)
Let’s assume that S is actually of type t1. Is there any message that she might like to send? Ideally, S would like to get R to choose a3, yielding a payoff of 5 rather than 0, which is what she would get if she did nothing to change R’s prior 50/50 beliefs. If she could somehow get R to have a degree of belief of about 2/3, rather than 1/2, in the hypothesis that she is of type t1, then he would make this choice. But what might S say to accomplish this? She might try revealing some, but not all, of the evidence that she is of type t1, or she might say something that could be taken to be evidence for this, but that might mean something else. She might say something that R already knows to be true, but that might give some support, but only a little, to the conjecture that S said it because she is of type t1. But given the disastrous consequences for S of R fully believing that she is of type t1 (in which case he would choose a1, giving S a payoff of -5), and given that it is common knowledge that S knows whether she is of type t1 or type t2, S would be ‘playing with fire’ if she made such an attempt, since it might get ‘imbued with deep meaning, whether she wants it to or not.’ In a game this simple, with the knowledge and motives of the participants assumed to be transparent, there is nothing that S can do. She had best remain silent and accept her payoff of zero in order to avoid something worse. But in a richer setting where there is perhaps some doubt about what she knows, or about exactly what her motivations are, or about what her messages say and mean, she might try to achieve more, at least if she is ‘skilled in the delicate language of dollar policy.’ She will never be able
to succeed by meaning what she wants to convey transparently and openly in the way that Grice’s analysis of meaning was trying to capture, but if she succeeds at all by sending a cheap talk message, then she will do so by exploiting communicative devices that are to be understood in terms of their intended role in this kind of communicative practice.
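The definitions above can also be checked mechanically. The following minimal sketch (Python; the encoding of types, messages, and payoffs is illustrative and is not the formal development deferred to later work) applies the pf-rationality and credibility definitions to Example 1, treating each message as believed when sent and comparing, for each type, the value of having it believed with the value of having the alternative message believed.

```python
# A minimal check of pf rationality and credibility for Example 1.
# Payoffs (S, R) and the uniform prior are taken from the text.
payoffs = {
    ("t1", "a1"): (10, 10), ("t1", "a2"): (0, 0),
    ("t2", "a1"): (0, 0),   ("t2", "a2"): (10, 10),
}
types, actions = ["t1", "t2"], ["a1", "a2"]
prior = {"t1": 0.5, "t2": 0.5}
# The two available messages, identified with their contents (sets of types).
messages = {"m1": {"t1"}, "m2": {"t2"}}

def best_response(belief):
    """R's expected-utility-maximizing action, given a probability distribution over types."""
    return max(actions, key=lambda a: sum(belief[t] * payoffs[(t, a)][1] for t in types))

def believe(content):
    """The belief R ends up with if he believes the content of a message."""
    mass = sum(prior[t] for t in content)
    return {t: (prior[t] / mass if t in content else 0.0) for t in types}

def value_if_believed(t, m):
    """S's payoff, as type t, if message m is sent and believed."""
    return payoffs[(t, best_response(believe(messages[m])))][0]

def pf_rational(t, m):
    """Having m believed is at least as good, for type t, as having any alternative believed."""
    return all(value_if_believed(t, m) >= value_if_believed(t, m2) for m2 in messages)

def credible(m):
    """Credible: pf rational for some types, and only for types for which the message is true."""
    pf_types = [t for t in types if pf_rational(t, m)]
    return bool(pf_types) and all(t in messages[m] for t in pf_types)

print({m: credible(m) for m in messages})   # {'m1': True, 'm2': True}
```

Both messages come out credible, so by the credibility constraint R believes whichever message is sent, and the separating play described in the text follows.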
Notes
1. Quotations are from the Wall Street Journal, May 13, 2003, and from a Wall Street Journal editorial.
2. Grice’s lectures and papers on meaning and conversation are collected in Grice (1989). See in particular, ch. 14, ‘Meaning’, originally published in 1957, ch. 5, ‘Utterer’s meaning and intentions’ (1969), and the retrospective epilogue.
3. See Stalnaker (1984), chs 1 and 2 on the problem of intentionality and the contrast between linguistic and pragmatic strategies for explaining intentionality.
4. (Grice, 1989, 219)
5. Grice makes clear that he is using the term ‘utterance’ in an artificially broad way as a label for any act that is a candidate for an act of meaning something.
6. This is Grice’s cooperative principle: ‘Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purposes or direction of the talk exchange in which you are engaged.’ (Grice, 1989, 26). About this principle, he says ‘I would like to be able to show . . . that anyone who cares about the goals that are central to conversation/communication . . . must be expected to have an interest, given suitable circumstances, in participating in talk exchanges that will be profitable only on the assumption that they are conducted in general accordance with the Cooperative Principle and the maxims.’ (Grice, 1989, 30)
7. I am indebted to work by Joseph Farrell and Matthew Rabin on cheap talk and credibility, which got me to appreciate both the complexity of the problems, and some of the constructive ideas that may provide solutions. See their papers in the list of references below. The account that is here informally sketched will not (when it is formally spelled out) be exactly the same, in the set of strategies it identifies, as Rabin’s notion of credible message rationalizability, but it will be close, and my account was strongly influenced by the examples he used to motivate and raise problems for his own account. I hope in a later more technical paper to focus on some of the examples on which the two accounts differ.
8. David Lewis was the first, to my knowledge, to connect Gricean ideas to game theory. His analysis of convention, developed in Lewis (1969) drew on work of Thomas Schelling, and discussed the kind of signaling games discussed below. More recent work includes Parikh (2001) and van Rooij (2003), two examples of a growing literature.
9. See Battigalli and Bonanno (1999) for an excellent survey of the literature in the epistemic approach to game theory. My way of developing an epistemic model theory for games is discussed in Stalnaker (1997). Grice’s work had an indirect influence on some of these developments, since it influenced David Lewis’s analysis of common knowledge in Lewis (1969), which in turn influenced the development of the epistemic approach to game theory.
10. In all the examples I will discuss, I assume that the chance move that assigns types to S gives equal probability to all types.
11. The problem was first posed as a problem about equilibria in cheap talk games. On the one hand, it was shown that the addition of a cheap talk move makes possible new equilibrium solutions. But on the other hand, the standard theory provides no basis for favoring one over other symmetrical alternative equilibria. And it was shown that there will always be what was called a ‘babbling equilibrium’ in which the sender chooses her signal at random, and the receiver ignores the signal, choosing his response at random. The seminal paper is Crawford and Sobel (1982).
12. Or, one can add the credibility assumption to any refinement of rationalizability, or to some epistemic conditions that characterize the class of Nash equilibrium strategies.
13. This clause was added just so that messages that S never would want to be believed do not count as vacuously credible. This does not really matter, since such messages will not be sent by rational players, but it does not seem natural to assume that if, contrary to fact, they were sent, they would be believed.
14. This example is used by Matthew Rabin (Rabin 1990) to illustrate the interdependence of the credibility of different messages.
15. By ‘final’ I mean final to be offered in this chapter. Further refinement will probably be required.
16. In a future more formal paper, I will spell the theory out more precisely, and explore some of the consequences of this account of credible communication.
References
Battigalli, P. and G. Bonanno (1999). Recent results on belief, knowledge and the epistemic foundations of game theory. Research in Economics, 53.
Crawford, V. and J. Sobel (1982). Strategic information transmission. Econometrica, 50, 1431–51.
Grice, P. (1989). Studies in the Way of Words. Harvard University Press, Cambridge, MA.
Lewis, D. (1969). Convention. Harvard University Press, Cambridge, MA.
Parikh, P. (2001). The Uses of Language. CSLI Publications, Stanford, CA.
Rabin, M. (1990). Communication between rational agents. Journal of Economic Theory, 51, 144–70.
van Rooij, R. (2003). Quality and quantity of information exchange. Journal of Logic, Language and Information, 12, 423–51.
Stalnaker, R. (1984). Inquiry. MIT Press, Cambridge, MA.
Stalnaker, R. (1997). On the evaluation of solution concepts. In M. O. L. Bacharach, L.-A. Gérard-Varet, P. Mongin, and H. S. Shin, eds., Epistemic Logic and the Theory of Games and Decisions, pp. 345–64. Kluwer Academic Publishers.
3 Pragmatics and Games of Partial Information
Prashant Parikh
1 Introduction
I introduced games of partial information initially in Parikh (1987), later in Parikh (1990, October 1991, 1992, April 2000), and most recently in Parikh (2001), where I considered their use in modeling communication, speaker meaning, and information flow. In this chapter, I look at a medley of topics, some in more detail and others in an exploratory vein. I first clarify some aspects of the basic model in the context of literal content that appear not to have been fully understood; I then discuss the differences between signaling games and partial information games; next I consider the strategic form and Bayesian form representations of these games; I then answer the question about the validity of the interpretive heuristic mentioned in section 2; and finally, I broach the completely new topics of probabilistic communication and situated game theory.
2 The basic model
Assume that A, dressed in business attire with briefcase in hand, on a street crowded with pedestrians rushing to work in midtown Manhattan at 8.30 a.m. on a Tuesday, meets a good friend B who asks him where he is headed. A responds with the sentence below: I’m going to the bank. (ϕ) Several things need to be resolved in this utterance for A to communicate the right proposition to B. One of them is the intended sense of ‘bank.’ B might notice A’s suit and his briefcase, note the time of day, realizing that this is roughly when many people in Manhattan go to work, and note the location in midtown Manhattan, a center for many firms including banks. All of this, both external facts about the situation as well as her general beliefs about Manhattan practices, might lead her to think that it was highly
probable that he was headed to a financial bank rather than the bank of a river.1 Since she is a good friend, she might be able to glean even more information from the utterance. She may already know that A works as a senior vice-president at Chase Bank and that he is usually at work by 8 a.m. This knowledge might not only raise her probability that he is going to a financial bank (Chase Bank, in fact) and not a river bank, but she could conceivably also conclude that he is probably conveying that he is late for work and can’t stop to talk, especially if his utterance is accompanied by some appropriate gesture or facial expression. Of course, if he had been dressed more casually and had a fishing rod in his hand and if the Tuesday had in fact been a public holiday, B might have been compelled to consider the alternative option that he was headed for a river bank. It may also happen that the data don’t all point overwhelmingly in favor of one interpretation – it is enough that she form an estimate of the relevant subjective probabilities based on the situation. For B to interpret A’s utterance of ϕ in the circumstances described above, these probabilistic judgments may in fact suffice. B would just choose the option with the highest likelihood. But how can we justify this intuitive and commonsense reasoning? What underlying theory might guarantee that this informal procedure results in the right interpretation much of the time? What are the limits of this method? Of course, the answer we give will apply to all utterances, not just this one.2 I have argued in detail in the publications cited above that the gametheoretic model shown below (Figure 3.1) captures all the relevant facts of the situation and provides a solution to the questions posed in the previous paragraph. I will keep the explanation/‘derivation’ of this model brief here.3 First, the symbol ϕ in the figure stands for the sentence4 above. p stands for the proposition that A is going to the financial bank and p0 for the other option that he is going to the river bank. (Alternatively, we could restrict p and p0 to just ‘financial bank’ and ‘river bank’ with the understanding that ϕ refers just to ‘bank.’ This would imply that the full utterance and its content could enter into a larger analysis and that we are focusing on just one small part of it here.) µ stands for an alternative sentence that the speaker might have uttered but chose not to. This could be, for example, ‘I’m going to the financial bank’ or, in keeping with the parenthetical remarks above, it could be just ‘financial bank.’ Similarly, µ0 could be just ‘river bank.’ It is important to note that, while µ and µ0 are unambiguous with respect to the ambiguity we are considering, there is no reason to require that such unambiguous expressions
Figure 3.1: (Local) game of partial information g(ϕ). In the initial situation s (probability ρ) A can utter ϕ or µ, and in s0 (probability ρ0) ϕ or µ0; after an utterance of ϕ, B cannot distinguish the resulting situations t and t0 and chooses an interpretation, p or p0. The payoffs (A, B) are (10, 13) when ϕ is interpreted correctly, (−10, −15) when it is misinterpreted, and (7, 10) for the unambiguous µ and µ0.
can always be found. We could well have chosen alternative utterances that are themselves ambiguous. This would just lead to a more complex extensive form (i.e. game tree) which is amenable to the same analysis. The symbol s stands for the first situation described above in midtown Manhattan where A’s intention is to convey p, and s0 is the counterfactual situation where A intends to convey p0 . t, t0 , e, e0 are just situations resulting from these initial situations after the relevant utterances. t and t0 are in the same information set because there is no way for B to distinguish between them initially. It is worth bearing in mind that these situations can contain a great deal of information relevant to the interpretation of an utterance, including hard-to-capture intangible things like B’s close friendship with A. The last remaining alphabetical symbols ρ and ρ0 are just the probabilities that we discussed above. They could be 0.9 and 0.1 respectively, since we said the first option was highly likely. We could just as well choose other numbers like 0.8 and 0.2 respectively, but as the difference between them decreases (or reverses), the equilibria of the game will also change.5 This is as it should be, because when the situation is different enough to warrant quite different probabilities (if A had a fishing rod in his hand, for example), then we would expect different predictions of our model as well. Note that these probabilities clearly come from the entire embedding situation rather than from any fixed aspect of the utterance.6 In Lemon et al. (July 2002), we developed a Bayesian net model of how these probabilities are
dynamically generated in the context of anaphora resolution. This model is an approximate version of the game-theoretic model considered here and one can imagine how an extended network of this type might be used to implement such interpretive solutions. That brings us to the last labels on the figure, the payoffs. In general, payoffs are a complex resultant of positive and negative factors that may or may not be additive. Without sacrificing any generality, we will, to simplify things, assume that the payoffs are additive and are made up of positive benefits and negative costs. These payoffs, made up of benefits and costs, also come from the embedding situation s. It should be noted that the quantity of information in a proposition is only one possible component of the payoffs. Payoffs can depend on a very wide range of factors based on the situation the agents are in and their various characteristics – their beliefs and desires, their hopes and fears, and their concerns and inclinations. Perhaps it matters a great deal to B where A is headed, so we might assign a benefit of 20 units (or utils in utility terms) to her of getting the right information. It may matter less to A, so he may derive a smaller benefit of 15 units of conveying the right information.7 Similarly, the respective costs to A and B of uttering and interpreting ϕ may be 5 units and 7 units respectively. And the costs of uttering and interpreting µ and µ0 may be 8 units for each for A and 10 units for each for B. Again, the numbers are quite arbitrary and the costs of µ and µ0 need not be identical – it is just a simplification. Finally, the benefit of incorrect information is, we can suppose from our story, −5 units and −8 units for A and B. Adding up these benefits and costs would give us the payoffs listed in the diagram (15 − 5 = 10, 20 − 7 = 13; 15 − 8 = 7, 20 − 10 = 10; −5 − 5 = −10, −8 − 7 = −15). As I have observed in Parikh (2001, April 2000,O), it is crucial to note that, in this game and the ones to follow, we will have recourse to the ‘right’ and the ‘wrong’ information inferred by B in each situation she finds herself in. How is this correctness and incorrectness determined? The relevant inference to p or p0 is matched against A’s intention to convey p or p0 in s and s0 respectively. In other words, the payoffs implicitly include inferring A’s possible intentions in each situation but go beyond these to also include their situationally given preferences for the information and the corresponding costs of acquiring this information. I argue later in another paper Parikh (2005a) that this Gricean picture of inferring intentions may be only partially correct. There is a certain degree of correlation between the various payoffs, and this game happens to be a pure coordination game. But there is no reason to assume that this is always the case. In ordinary communicative exchanges of the kind we are looking at, the payoffs of players are not likely to be
opposed; in many litigious and other conflictual situations – and these are unfortunately not hard to find – they could well represent contradictory interests. Of course, even when the overall interests are opposed, there may still be a common interest in communicating. In any case, I have chosen to keep things simple, but in general, it should be borne in mind that all benefits and costs and payoffs could vary greatly between agents based on the situation they are in and their own varying characteristics.8 If we solve this game by most of the standard solution concepts, we would find that p is the solution, which means that ‘bank’ gets disambiguated in the right way. I deliberately avoid specifying any particular solution concept here because my interest is in the underlying game model rather than any particular way of solving it.
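For readers who want the bookkeeping behind the payoffs spelled out, here is a small sketch of the benefit-and-cost arithmetic that generates the numbers in Figure 3.1 (the function and variable names are mine; the benefit and cost figures are the ones assumed in the text, with B's disvalue for incorrect information taken as −8, as in the payoff arithmetic above).

```python
# Payoff = benefit minus cost, as in the running 'bank' example.
benefit_right = {"A": 15, "B": 20}   # conveying / getting the right information
benefit_wrong = {"A": -5, "B": -8}   # conveying / getting the wrong information
cost = {"phi": {"A": 5, "B": 7},     # uttering / interpreting the ambiguous sentence
        "mu":  {"A": 8, "B": 10}}    # uttering / interpreting an unambiguous alternative

def payoff(message, correct):
    benefit = benefit_right if correct else benefit_wrong
    return (benefit["A"] - cost[message]["A"], benefit["B"] - cost[message]["B"])

print(payoff("phi", True))    # (10, 13): phi correctly interpreted
print(payoff("mu", True))     # (7, 10):  the unambiguous mu or mu0
print(payoff("phi", False))   # (-10, -15): phi misinterpreted
```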
3 Signaling games and partial information games
I have called the type of game above (local) games of partial information since I first developed them in 1985 (see Parikh 1987). Games of partial information are in many ways similar to but slightly more general than games of incomplete information, the latter having been first systematically approached by Harsanyi (1967). The particular subclass of partial information games we are concerned with – those that apply to communication – are in fact similar to but again slightly more general than the subclass of incomplete information games known in economics as signaling games. The two types of game models of partial and incomplete information are formally, that is, in terms of predictions, identical. A signaling game starts with a move of Nature that reveals some private information to one player. This private information is called the type of the player and is known only to that player and not to the other player. The other player knows only the range of possible values of this type, that is, the other possible moves of Nature, but does not know which particular value was instantiated. Then the same extensive form, consisting of some actions (signals) by the first player followed by some actions (responses) by the second player, is appended to each of these types. Since the same extensive forms are attached, there are often many nodes between which the second player cannot distinguish and these are collected into appropriate information sets. The first signaling games were invented by David Lewis (1969) who represented them in normal or strategic form. They were later studied by Michael Spence (1973), Crawford and Sobel (1982), and David Kreps (1985, 1986) who represented them in extensive form, which is now the standard form in economics for signaling games.
As should be apparent, the extensive form representation I have used throughout for games of partial information follows the pattern of Kreps’s representation, and its formal definition can be found in Parikh (1987, 1990, 2001), where it uses Kreps and Wilson (1982) as a starting point, but develops the more general model in terms of the more basic entities called ’situations’. Both types of representations, the extensive form and the strategic form, are useful as they make different aspects of the situation being modeled more or less visible, and we will consider the strategic form in the next section. So what are the precise differences between signaling games and the relevant subclass of partial information games? 1 In incomplete information signaling games, the same extensive form is always attached to each type (see for example Myerson 1995, Osborne and Rubinstein 1994, Watson 2002) as I said above. In partial information games, this is not necessarily true: in the game in Figure 3.1, µ and µ0 are different so that the speaker’s type may be directly revealed through an action. Of course, it is possible that the same extensive form is attached to each type in some contexts. Another way to make the same point is that in a signaling game there is a single set of ‘messages’ M from which the ‘Sender’ (i.e. the speaker) chooses a message in every state or situation s/he could be in. In a partial information type of signaling game, these message sets are in general different (but in general overlap) for each state or situation, that is, the message set is a function of the state or situation or type, or M = M (s), and M (s) is not necessarily equal to M (s0 ) for different s and s0 . So, in the game in Figure 3.1, the message set for situation s was {ϕ, µ} and the message set for situation s0 was {ϕ, µ0 }. These have nonempty intersection, but are not identical, as would be required by signaling games. Of course, they could be identical in some contexts. Similar considerations apply, mutatis mutandis, to the sets of interpretive actions. The interested reader is referred to Definition 11 on page 149 of the Appendix in Parikh (2001) for a definition of these sets in slightly more restricted circumstances but which captures this essential difference. See also Parikh (1990) and Parikh (1987) for much earlier occurrences of this definition. I should point out that in Kreps and Sobel (1994), published in 1994 – after the latter two publications of mine – Kreps and Sobel make the parenthetical remark that the message set can depend on the signaling player’s type, but their entire article uses a single message set for all types. This is the only other reference I have seen to this wider notion – as
I mentioned, in most standard articles, books, and textbooks, including the ones cited two paragraphs above, the more restrictive assumption is used. Also, in Kreps’s (1985, 1986) own earlier articles, he had used the more restrictive notion. Since my work Parikh (1987) containing this wider notion first issued in 1987 (and since Kreps was familiar with my work), I think it safe to conclude that it was the first source of this wider notion.9 Game theorists working in economics may prefer to continue using the term ‘signaling game’ even for the wider notion, but it would be nice to see researchers in the field of language use the term ‘game of partial information’ or perhaps ‘signaling game of partial information’ in their work as this is the relevant notion for models of language. Thus, partial information signaling games include incomplete information signaling games as a proper subclass and consequently are slightly more general than incomplete information signaling games.10 2 Another difference is that, in partial information games involving communication, the interpretive act – what Austin (1975) called an act of understanding – is always made explicit. In signaling games, there is usually an action the ‘Receiver’ takes, and the interpretive act remains implicit and part of the solution process. Of course, since the action the Receiver can take is completely general, it can always be defined to be an interpretive action in purely formal terms. But it is in natural language communicative situations (e.g. with ambiguity) that the need for an explicit representation of this interpretive act becomes fully visible. Of course, if a further action is required after the interpretive act – like the acceptance or rejection of the message, or the carrying out of a request or command, or an answer in response to a question – then that act is appended to the interpretive act (see Parikh 2001, chapter 8). 3 A third difference, which I have not mentioned in this chapter, is that B constructs the game only after A utters ϕ (and similarly the game becomes common knowledge – when common knowledge matters – only after A’s action). In signaling games, the entire game is common knowledge (or at least known) to both players before the start of the game. 4 A fourth difference is that this so-called local game is embedded in a larger global game that I have not mentioned here. This fact and the previous one are not usually important when analyzing solutions of games, but they are relevant in some contexts (e.g. when we are building artificial game-theoretic agents that communicate).
5 A final difference is that the initial probabilities need not be common knowledge and identical for both players in games of partial information, something that is almost always assumed for games of incomplete information. This is a large and complex issue going back to Harsanyi (1967). I will just skip any discussion of it here but see Myerson (1995) for details. This difference – the lack of common knowledge of the initial probabilities and their lack of identity for the players – may be viewed as an aspect of the bounded rationality of the players and is another aspect of the partiality of information in such games. One application of this feature is to miscommunication, as discussed in chapter 9 of Parikh (2001). In any case, the upshot of this difference is that it also makes games of partial information more general than games of incomplete information. To summarize the above, we can say that games of partial information generalize games of incomplete information (and their corresponding subclass of signaling games) but are formally and predictively identical to them when they are analyzed in the same way. In some sense, the underlying game model is more important than the subsequent analysis through solution concepts, as there are usually several solution concepts that can be applied to a game model, most of which turn out to be equivalent for the simpler classes of games we are looking at. In his paper (van Rooij 2004b), van Rooij has the unfortunate title ‘Signaling games select Horn strategies’ – the game model, whether it is a game of partial information or a signaling game (of incomplete information), can never select anything as such; it is only the analysis of a game model based on a solution concept that results in some strategy being selected. He contrasts games of partial information with signaling games implying that, in a certain context, the latter give the expected result whereas the former don’t, but this is impossible because the two models are predictively identical when analyzed identically. In addition, because both types of models have relatively simple structures, it will turn out that most solution concepts will yield identical predictions. This means that even if the two types of games are analyzed differently, they are still likely to yield the same outcomes, though of course this would be an incorrect comparative procedure. The particular difference in prediction in van Rooij’s paper cited above arose from two faulty sources. One is whether one uses mixed strategies or behavioral strategies to do the analysis; however, it is a well known result – I believe going back to Kuhn (1953) – that the two modes of analysis are predictively identical. So this cannot be a source of difference, even though van Rooij wrongly believed that he was using a different solution
concept. As regards the real source, van Rooij has clarified in an email that he inadvertently used certain incorrect numbers which resulted in the different predictions.11 So there is no reason to prefer incomplete information signaling games over partial information signaling games, and since the latter are more general and can cover greater varieties of communicative behavior, they ought to be the preferred choice.12
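To make the first of these differences concrete, here is a tiny illustration (the dictionary encoding is mine) of a message correspondence M(s) that varies with the situation, in contrast to the single message set of a standard incomplete information signaling game.

```python
# State-dependent message sets, M = M(s), as in the game of Figure 3.1.
M = {"s":  {"phi", "mu"},    # in s, A can utter the ambiguous phi or the unambiguous mu
     "s0": {"phi", "mu0"}}   # in s0, A can utter phi or the unambiguous mu0

print(M["s"] == M["s0"])     # False: the message sets differ across situations
print(M["s"] & M["s0"])      # {'phi'}: but they overlap, so the ambiguous phi is shared
# A standard signaling game would attach one fixed set, e.g. {"phi", "mu", "mu0"}, to every type.
```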
4 The strategic form and the Bayesian form
Before I discuss the probability heuristic, I show how the extensive form game can be represented in two other ways – the strategic form and the Bayesian form.13 The normal or strategic form (Table 3.1) is derived from the extensive form by averaging the payoffs based on the initial probabilities in the usual way.14 Table 3.1:
The Game in Strategic Form

           p             p0
ϕϕ      8, 10.2      -8, -12.2
µϕ      5.3, 8.5      7.3, 11.3
ϕµ0     9.7, 12.7    -8.3, -12.5
µµ0     7, 10         7, 10
The private information that A has (that is, the knowledge of what the intended meaning is) is not explicitly represented here but it is indirectly represented. First, it is represented in the extensive form by adding an initial node for ‘Nature’ and giving Nature two possible moves in line with the two possible initial situations s and s0 according to the probabilities ρ (0.9) and ρ0 (0.1). Since A’s information sets are {s} and {s0 } rather than {s, s0 }, his private information is adequately captured.15
In the normal form, A’s private information is indirectly represented by the number (and labels) of strategies: A has four strategies, not two, even though there are just two actions each at s and s0. This implies that A’s information sets are the singletons {s} and {s0} rather than {s, s0}, which is exactly his private information. In addition, the strategies are labeled in terms of the actions at both of these nodes – for example, ϕϕ, the action ϕ at both nodes, or ϕµ0, the action ϕ at s and µ0 at s0 etc. This again implies the private information. It is useful for many purposes to have both representations available for consideration.16 It is easy to see the solution ⟨ϕµ0, p⟩17 here by inspection of the matrices. I have suggested in Parikh (2001), and in earlier publications, that one additional way to bring in A’s private information in the normal form is by looking at the actual payoff A would get from the game based on a particular strategy profile rather than the expected payoff, but that this would come in only when A has to compare alternative utterances in the global game – something we have not discussed here – and not in determining his optimal strategy choice in our local game. In the Bayesian form representation below (Table 3.2), we explicitly retain A’s private information and its determination of payoffs by having two separate sets of matrices for the two initial situations. A knows which set of matrices is involved, B does not. As they stand, the normal form above and the Bayesian form below are equivalent because A’s private information is indirectly represented in one form and directly in the other, and both are equivalent of course to the extensive form representation.
Table 3.2:
The game in Bayesian form
s / 0.9:
           p            p0
ϕ       10, 13      -10, -15
µ        7, 10        7, 10

s0 / 0.1:
           p            p0
ϕ      -10, -15      10, 13
µ0       7, 10        7, 10
Most game theorists use the standard normal or strategic form representation in Table 3.1 to analyze such games.18 See again Myerson (1995) and especially Watson (2002).
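As a cross-check on the solution ⟨ϕµ0, p⟩, the following sketch (Python; strategy labels transliterated to ASCII) enumerates the pure-strategy Nash equilibria of Table 3.1 and keeps the one that Pareto-dominates the other, in line with the Pareto-Nash solution concept referred to in the next section.

```python
# Pure-strategy Nash equilibria of Table 3.1, and the Pareto-dominant one among them.
table = {  # (A's strategy, B's strategy) -> (A's payoff, B's payoff)
    ("phi phi", "p"): (8, 10.2),    ("phi phi", "p0"): (-8, -12.2),
    ("mu phi",  "p"): (5.3, 8.5),   ("mu phi",  "p0"): (7.3, 11.3),
    ("phi mu0", "p"): (9.7, 12.7),  ("phi mu0", "p0"): (-8.3, -12.5),
    ("mu mu0",  "p"): (7, 10),      ("mu mu0",  "p0"): (7, 10),
}
A_strats = ["phi phi", "mu phi", "phi mu0", "mu mu0"]
B_strats = ["p", "p0"]

def is_nash(a, b):
    ua, ub = table[(a, b)]
    return all(table[(a2, b)][0] <= ua for a2 in A_strats) and \
           all(table[(a, b2)][1] <= ub for b2 in B_strats)

nash = [(a, b) for a in A_strats for b in B_strats if is_nash(a, b)]
pareto_nash = [e for e in nash
               if not any(f != e and table[f][0] >= table[e][0] and table[f][1] >= table[e][1]
                          for f in nash)]
print(nash)         # [('mu phi', 'p0'), ('phi mu0', 'p')]
print(pareto_nash)  # [('phi mu0', 'p')]: A plays phi in s and mu0 in s0, B interprets phi as p
```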
5 The heuristic
We have seen so far that the suggested situational rule or heuristic of choosing the interpretation with the highest likelihood matches the solution to the game. Under what circumstances will this heuristic work? I showed in Parikh (1990) that if ρ, ρ0 are the shared probabilities of A’s intention to convey p, p0 respectively, and b, b0 are the respective marginal benefits of not conveying p, p0 explicitly19 then it can be shown that p is communicated with certainty if and only if ρb > ρ0 b0 . (By marginal benefit is meant the difference between the payoff to ϕ and the payoff to µ in s and likewise for other cases.) That is, p is communicated by an utterance of ϕ iff its expected marginal benefit is greater. When the marginal benefits are equal we get the special case that p is the content just in case ρ > ρ0 . In our game, the payoffs have been chosen so that the marginal benefits are not only the same for each player, but they are also equal for p and p0 , so the result stated above applies. The statement would have to be generalized slightly to include cases where they differ for the two agents. This allows A to convey p to B with just the probabilities entering the determination under a very wide range of situations. If the structure of payoffs (i.e. benefits and costs) does not satisfy the condition above, a more complex heuristic may be required. Note that I have implicitly assumed that the solution to the game is given by the Pareto-Nash equilibrium20 rather than by some other solution concept (e.g. an evolutionary equilibrium). Different solution concepts might result in different corresponding heuristics or different conditions for the same heuristic, though because the games involved are relatively simple, they may also yield equivalent results. Thus one way to view the game-theoretic model is as a method to justify the use of simple heuristics of a situational kind in the determination of content. The game itself need not enter the communicative process at all.
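As a quick check of this condition on the game of Figure 3.1: the marginal benefit of the ambiguous ϕ over the explicit alternative is 10 − 7 = 3 in both situations and for both players, so the condition clearly holds under the probabilities 0.9 and 0.1.

```python
# The heuristic's condition, with the Figure 3.1 numbers.
rho, rho0 = 0.9, 0.1
b  = 10 - 7   # marginal benefit of phi over mu in s
b0 = 10 - 7   # marginal benefit of phi over mu0 in s0
print(rho * b > rho0 * b0)   # True: p is communicated, matching the solution of the game
```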
6 The additional information and probabilistic communication
I said above that in the situation we have set up, there is likely to be more communicated than just the information that A is going to some financial institution (that he has some relation to). Since A and B are good friends, B could also infer that A is probably headed for his place of work at Chase Bank and that he is probably conveying that he can’t stop to talk since he is late.
One standard way to view such additional information is as implicatures; my own preference is to see the first piece of information that A is headed for Chase Bank as part of the literal content since that is how the definite description ‘the bank’ would be resolved after the noun ‘bank’ had been disambiguated. The second item that he cannot stop to talk as he is late is best viewed as an implicature (since the expectation would be that he would stop to talk under normal circumstances). This would be especially true if the utterance were accompanied by an appropriate gesture or facial expression. The resolution of the noun phrase can be handled in the same way as in Figure 3.1 – in fact, both processes of disambiguation and resolution can be combined in a single game if we wish. I have shown earlier Parikh (2001, 1992) that the implicature can also be naturally accommodated by models similar to the ones showed above. I want to emphasize in this section that communication can be probabilistic rather than certain, which would be captured by so-called mixed strategy equilibria rather than the pure strategy equilibria we have so far considered.21 Consider Figure 3.2.
[Extensive-form game tree omitted: in s (probability ρ), uttering ϕ yields +10, +8 if interpreted as p and −1, −1 if interpreted as p′, while µ yields +7, +9; in s′ (probability ρ′), ϕ yields −1, −1 if interpreted as p and +7, +6 if interpreted as p′, while µ′ yields +2, +1.]
Figure 3.2: The game for the additional information g(ϕ)
Here ϕ is still the same sentence as in Figure 3.1, but depending on the further information we're looking at (i.e. either 'Chase Bank' or 'I'm sorry, I can't stop to talk'), µ and µ′ will be different (either 'Chase Bank' and 'some bank but not Chase Bank' or 'I'm sorry, I can't stop to talk' and silence). The propositions p and p′ will also vary accordingly. I will choose probabilities
and payoffs for the first item of information (i.e. Chase Bank) rather than the second to make the point. Let us say ρ = 0.8 and ρ′ = 0.2. I do not try to justify these numbers – one would have to concoct some suitable expanded story. (The numbers would presumably be different for the implicature.) Note that our model could have included three possibilities instead of just two – we could have had the options Chase Bank, some bank but not Chase Bank, and some bank including Chase Bank. This obviously makes the game more complex so we restricted our attention to the first two options. The normal form is given in Table 3.3.

Table 3.3: The strategic form for the Chase Bank game

            p            p′
  ϕϕ        7.8, 6.2     0.6, 0.4
  µϕ        5.4, 7       7, 8.4
  ϕµ′       8.4, 6.6     −0.4, −0.6
  µµ′       6, 7.4       6, 7.4
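As a check on these numbers, the following sketch recomputes the strategic form as expected payoffs over s and s′, assuming the branch payoffs as I read them off Figure 3.2 (+10, +8 and −1, −1 for ϕ interpreted as p or p′ in s, +7, +9 for µ; −1, −1 and +7, +6 for ϕ in s′, +2, +1 for µ′); the labels phi, mu, pp and mup are simply mnemonic stand-ins for ϕ, µ, p′ and µ′.

# Sketch: derive the strategic form of Table 3.3 from the extensive form,
# assuming the branch payoffs (speaker, hearer) as read off Figure 3.2.
rho, rho_p = 0.8, 0.2

payoff_s  = {("phi", "p"): (10, 8), ("phi", "pp"): (-1, -1), ("mu", None): (7, 9)}
payoff_sp = {("phi", "p"): (-1, -1), ("phi", "pp"): (7, 6), ("mup", None): (2, 1)}

def cell(a_in_s, a_in_sp, b):
    """Expected payoffs when A plays (a_in_s, a_in_sp) and B reads phi as b."""
    u_s  = payoff_s[(a_in_s,  b if a_in_s  == "phi" else None)]
    u_sp = payoff_sp[(a_in_sp, b if a_in_sp == "phi" else None)]
    return tuple(round(rho * x + rho_p * y, 1) for x, y in zip(u_s, u_sp))

for strat in [("phi", "phi"), ("mu", "phi"), ("phi", "mup"), ("mu", "mup")]:
    print(strat, cell(*strat, "p"), cell(*strat, "pp"))
# Reproduces Table 3.3 row by row, e.g. ('mu', 'phi'): (5.4, 7.0) and (7.0, 8.4).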
The key thing to notice about the numbers is that there are now two Pareto-Nash equilibria rather than a unique one because neither of the two Nash equilibria dominates the other. Of course, we could do the same thing with other solution concepts as well. The point I am making is not about a particular solution concept but rather about mixed strategy solutions whatever solution concept we use. Because this solution concept and perhaps other solution concepts we might use do not yield a unique outcome, the players of the game might reasonably look to mixed strategies for a solution. This is a standard motivation for considering mixed strategies. There are an infinite number of mixed equilibria but one compelling one is where A mixes just his second and third strategies and B mixes her two strategies.22 This yields a probability of roughly 0.71 for p.23 This indicates that the relevant information that A is going to his office at Chase Bank is interpreted with a probability of 0.71.
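The calculation behind the 0.71 figure (spelled out in note 23) can be reproduced directly from Table 3.3; the following minimal sketch solves the two indifference conditions.

# Sketch of the calculation in note 23: the mixed equilibrium in which A mixes
# mu-phi and phi-mu' and B mixes p and p', using the Table 3.3 payoffs.
# A's payoffs against p and p':   mu-phi: 5.4, 7     phi-mu': 8.4, -0.4
# B's payoffs under the two rows: mu-phi: 7, 8.4     phi-mu': 6.6, -0.6

# alpha = probability B assigns to p; A must be indifferent:
#   5.4*alpha + 7*(1 - alpha) = 8.4*alpha - 0.4*(1 - alpha)
alpha = (7 + 0.4) / (8.4 + 0.4 - 5.4 + 7)

# beta = probability A assigns to mu-phi; B must be indifferent:
#   7*beta + 6.6*(1 - beta) = 8.4*beta - 0.6*(1 - beta)
beta = (6.6 + 0.6) / (8.4 + 0.6 - 7 + 6.6)

print(round(alpha, 2), round(beta, 2))   # 0.71 0.84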
There are many possible interpretations of mixed strategies, one or more of which may be relevant to a particular context. Osborne and Rubinstein (1994) discuss many options and a more thorough analysis would have to go through each of these and see whether it was a possible interpretation or not in our type of game. For our purposes, however, I would like to suggest that the simplest option – that the players explicitly randomize – may be quite acceptable. This does not mean that more involved interpretations may not also be suitable, but that for my purposes here, the direct option will have to suffice. Explicit randomizing would mean here that A chooses his two strategies µϕ and ϕµ′ with probabilities β and (1 − β) (i.e. since the factual situation is s, he says µ with probability β and ϕ with (1 − β)) and this would further mean that he is initially undecided about the optimal locution to utter in s and chooses one or the other with the given probabilities. Similarly, B is initially undecided too between p and p′, and in her case, she may never need to make a decision. She may just keep all the information about p, p′ and their respective probabilities α and (1 − α) as part of her interpretation of the utterance. The principal insight here is that communication and information flow can be probabilistic, something that does not seem to have been noted before. I have also touched upon this in Parikh (2001, chapter 7) in the context of implicature. As should be clear now, probabilistic communication is important in the determination of both literal content and implicature. Once we take note of this fact, we begin to see it everywhere, even in the simplest information flows. This makes the need for probabilistic approaches to interpretation like game and decision theory even more apparent. There has been a strong tendency in the literature of the last century to assume that the literal contents of utterances are determinate and clearly given in communication and Grice (1989) reserved the notion of indeterminacy just for implicatures. I discuss this new observation in greater detail in Parikh (2005a). In closing this section, I remind the reader that while I have looked at an example of disambiguation, the same method applies to all sorts of problems of interpretation and communication. They are formally the same.
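For what it is worth, a toy simulation (my own illustration, not from the text) of explicit randomization at the equilibrium mixes of note 23 brings out the probabilistic character of the resulting communication: even when s is factual and ϕ is actually uttered, it is resolved to p only about 71 per cent of the time.

import random
random.seed(0)

alpha, beta, n = 0.71, 0.84, 100_000   # equilibrium mixes from note 23
utter_phi_in_s = 0
understood_as_p = 0
for _ in range(n):
    says_phi = random.random() >= beta          # in s, A says phi with prob 1 - beta
    if says_phi:
        utter_phi_in_s += 1
        if random.random() < alpha:             # B resolves phi to p with prob alpha
            understood_as_p += 1
print(utter_phi_in_s / n, understood_as_p / max(utter_phi_in_s, 1))
# roughly 0.16 and 0.71: phi is uttered in s about 16% of the time, and when
# it is, it conveys p only about 71% of the time.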
7
Situated game theory: analyzing games of partial information
In earlier sections, I tried as much as possible to avoid saying what type of solution concept I was using. This wasn’t entirely possible and I had to allude especially to Pareto-Nash equilibria and mixed strategy Nash equilibria occasionally. The primary reason for avoiding discussion of solution
concepts so far is that while they are important of course to completing the analysis, it is the game models themselves and their various representations that are more fundamental, and I tried to show in all of the above discussion how they can model a very wide range of phenomena.24 A second reason is that there are many different approaches to solving essentially the same game model, some giving similar predictions and others that are not equivalent. Once one has the basic setup one can explore multiple approaches to analyzing it without committing oneself to one or another approach. This is an ongoing task. Thus, one needs to make a clear distinction between the situation being modeled, the game model one uses to model that situation, and finally, the analysis of the model with some solution concept. Indeed, once one makes this kind of separation, a very different view of game theory emerges, a view that does not seem to have been explored even in the game-theoretic literature, at least as far as I know. I have hinted at this in chapter 4 of Parikh (2001) and elsewhere. To introduce this view of what I call situated game theory, let us consider an analogy. It has taken a while to give full recognition to the context of utterance as an integral factor in the determination of content and even today the dominant position gives it a second-class status in semantics. It is only the sentence together with a context that has a content. The proliferation of analysis strategies for games suggests that games may well be like sentences in this regard. Just as sentences can be used in a variety of ways in different situations to make different statements, so games can be solved in a variety of ways in different situations to give different solutions. The same game in an economic context may best be analyzed in one way but in a political or communicative context in other ways. That is, it is only the game together with a context that has a solution. Now games already incorporate a number of contextual factors internally through the payoffs for example, so what remains of relevance in the context after this has been done? Indeed, games are themselves models of situations. This is a moot point because if something does remain, we could always try to enrich the model and incorporate it. This would be similar to the familiar move in the case of language of enriching the sentence’s logical form. While I believe this is a bad strategy for the analysis of content, it may be acceptable in the case of games: we can sometimes enrich the game model and sometimes leave things in an external context. Perhaps there are higher level things about the concerns of the players that are best left out of the game model and used externally to determine how the game should be analyzed, whether statically or evolutionarily or behaviorally, or given one of these choices, what solution concept should be employed, and further, which of multiple possible equilibria should be selected. Perhaps there are external things about the architectural configurations of the agents in-
volved: they may have all kinds of computational limitations, for example, that may vary from situation to situation and agent to agent. In the previous section, we faced two Pareto-Nash equilibria and an infinity of mixed strategy equilibria. I gave an informal argument why a particular mixed solution was compelling, but this relied on 'external' assumptions about the players' preferences that had not already been captured in the game. Perhaps one way to view this argument then, in light of the foregoing analogy, is that this external information is located in the situated game, that is, in the context of the game. Different assumptions and different corresponding analyses would model other contexts and the behavior of the players would then be different. Incidentally, I should point out that we have now identified two fundamentally different kinds of contributions the utterance situation s makes to the semantics of an utterance: (1) in the determination of the game model, and (2) in the determination of its solution. Both uses help to fix the content. We know that we have used a game to model the first use of s. Since there may be additional facts in s that have been left out of the game model (e.g. a preference for separating equilibria), why not use a second, higher-order game to model these additional facts? This would be adopting the second strategy referred to above, of leaving certain facts external to the initial game model in the context s, and then using a second game to model these facts. This idea suggests that given a situation s, we can model it as a sequence of games ⟨g0, g1, g2, . . .⟩, g0 being our initial model g(ϕ), and each subsequent game being at a higher level than the one before it. So g1 might encode information about a preference the players may have for a certain type of solution concept in g0, g2 in g1, and so on. The information in a real life situation may never be fully exhausted, but one can imagine situations where the sequence 'converges' so that in some gn for finite n or possibly in the limit as n → ∞, the solution is obvious. Then this would presumably cascade back all the way to g0. Admittedly, this suggestion is a bit speculative, but I see no immediate problem in carrying it out. Certainly, there may often be additional information in s that has not been exhausted by g0 and this may be relevant for the solution of g0. Needless to say, if this suggestion turns out to be valid, it would have interesting consequences for the use of game theory in wider contexts too, whether they are economic, political, or biological. Of course, we may also be forced to face the fact that if contextual information is not always fully exhaustible through the sequence of games above, then there may well remain a non-formal residue in our attempts to solve games satisfactorily.
Armed with this new perspective of a situated game theory, we now consider some approaches to the analysis of games in general and games of partial information in particular. These should all be seen as being accommodated within this larger view. This is of course a rather large topic and I will keep my remarks quite brief. There are, it seems, four broad approaches to games and their solution today: 1 Rationality-based approach: This is the classical approach that has been dominant until fairly recently and this is the approach I have relied on primarily in my earlier work. 2 Evolutionary approach: This is an approach introduced initially in evolutionary biology in the 1970s, mainly by John Maynard-Smith, and it is now fast becoming a favored approach amongst many in a variety of fields. One great advantage of this approach is that it allows one to make relatively minimal assumptions about the architecture of agents so that one can relax the stringent demands of rationality. One great disadvantage is that one is then forced to assume that for a long time until equilibrium is reached, agents behave in sub-optimal ways, and this may be unsuitable for certain applications. 3 Stochastic dynamical system approach (adaptive learning): This is a variant of the evolutionary approach that allows stochastic variation and has also begun to become popular. This also has the same advantages and disadvantages as the previous approach though it can be mathematically more complex. 4 Behavioral/experimental approach: This approach has also seen an explosive growth in recent times, especially in economics, where insights from the other social sciences and psychology are introduced to consider possible deviations from rationality that allow one to make more realistic assumptions about the architecture of agents. We used this approach briefly in Fehling and Parikh (November 1991), where we described a bounded rationality and satisficing method, but have not had a chance to explore more recent developments of the behavioral approach. Each approach allows many different solution concepts within it, so there are many options to explore. The possibility of learning is also very important and it can be incorporated within all of the above approaches, so that it forms an added component in each approach. All these approaches apply to the same basic model that we considered in earlier sections, only the analyses of this model differ. In a later paper, I study some of these options in more detail, assessing their pros and cons. For our purposes here, we first note that, in line with my earlier remarks, it is the embedding situation that ought to determine
which approach is best, and also of course what type of application we are considering, whether we are trying to derive a particular interpretation on a particular occasion or derive a rule of some sort or some conventional meaning. But even if we are deriving a particular interpretation, there is no reason why we cannot employ a deterministic or stochastic evolutionary approach, especially if we incorporate learning (e.g. it is possible that agents learn how to recognize games as belonging to a particular class so that there is no need to play and solve the game each time; once the solution of a class of games is learned on some basis, possibly evolutionary, then the agents can simply use a rule to apply to each game belonging to that class). Alternatively, there is no reason to suppose that one of the static approaches (the first and often the fourth) cannot be applied to the derivation of rules (e.g. it is possible that agents learn a rule like the Horn rules in one shot; this also avoids the awkwardness of assuming that for a long time when agents are evolving a rule, they are behaving in ways that get things wrong, as the evolutionary approach forces us to assume). So there is no clear and obvious mapping from type of application to type of approach and, indeed, in line with our situated perspective, it is possible that the ideal approach, if indeed there is just one, will vary from situation to situation even for a given application. One of the key attractions at any rate of the latter three approaches is that they promise greater (psychological) realism by making less onerous demands on the processing powers and other capacities of agents. I find myself broadly sympathetic to this line of reasoning but again the problem is muddied by the fact that the games we have considered, even if considered under the first classical rationality-based approach, need not be actually solved by speakers and hearers, because all kinds of simple heuristics may be available for actual use. As I have urged before, these games and in particular these solution strategies may also be viewed just as defining the character of the constraints that obtain in communication and information flow (see Parikh 2001). Of course, it would be nice if such solution strategies could be employed directly in modeling actual communication and I cited Lemon et al. (July 2002) above as one instance of how such applications might be envisaged through the medium of Bayesian nets. I think this may also prove to be a very promising line of inquiry not just for the first approach but in fact for all four approaches. I conclude by reiterating that the stage (or base) game in all four approaches remains identical, namely, a (signaling) game of partial information.
Notes 1. It should be pointed out that there are at least three primary senses of bank – financial bank, river bank, and rowing bank – and several – at least eighteen in one dictionary – refinements of these primary senses. We will consider just the first two primary senses to keep the game simple, but there is nothing to prevent us from considering even a continuum of senses, though our solutions will work for just the finite case with full generality. 2. I discuss this heuristic in some detail in this chapter because it is sometimes not understood that I have not presented partial information games as psychologically plausible models of communication in the past, but rather as models that capture abstractly the logic of communication. I have presented the heuristic as one psychologically plausible model that can be justified by these games. Whether the games in their current form are themselves psycholinguistically realistic is moot. However, something like a Bayesian net model approximation to such games (see Lemon et al. July 2002) may not be unrealistic. 3. Readers are referred especially to the book Parikh (2001) cited above. 4. Actually, it stands for the action of uttering the sentence, but we will ignore this detail here. 5. I mention equilibria here because it is relevant to this paragraph on probabilities even though I have yet to finish describing the full game. 6. In a conference and later in an email, Richard Breheny described the following example to me: Johnny is a boy who is with his friend Billy. He calls home to ask if he can go swimming with Billy. His father answers the phone. It is okay with the father if Johnny goes along but he wants to check with his wife, so he calls out to Johnny’s mother: ‘Johnny wants to go swimming with Billy.’ Johnny’s mother answers: ‘He has a cold.’ Note that even if the father instead calls out, ‘Billy wants to take Johnny swimming’ or some other such sentence with an altered subject, the intuition is that the mother’s reply would still be understood to be referring to Johnny. This point is important to the example since the best theory of the effect of previous discourse on pronominal salience (Centering Theory, Grosz et al. 1995) would say that being the subject of the preceding sentence makes it the center and therefore the more likely to be pronominalized. (Incidentally, in Lemon et al. (July 2002) cited above, Oliver Lemon, Stanley Peters, and I proposed a probabilistic approach to anaphora resolution that approximates my game-theoretic approach. The result there is that no principle – recency or subject placement or something else – will outrank others in a context-independent way. One can always generate examples to defeat any fixed ranking.) It should be clear now that this example can be handled quite naturally by either assuming appropriate probabilities or appropriate payoffs or both, all based on the situation described above. For example, it is more likely that Johnny’s mother is referring to Johnny since her closer relationship is with Johnny and she would, in the described situation, be more concerned about Johnny’s health. Alternatively, this concern could be translated into greater payoffs for Johnny rather than Billy. Or finally, both effects might be combined in this situation. In the sentence with the altered word order, the prediction would still be the same, even though Johnny is no longer the subject of the sentence. 
The other point to the example is that it is not difficult to think up some story whereby, if 'he' refers to Billy, the mother could still be giving reasons why she would not allow Johnny to go (e.g. one is very likely to catch colds from someone you go swimming with who has a cold etc.). In this new situation where Billy turns out to be the referent (owing to a recasting of the game based on the new story), we would presumably have a further game that would capture the implicature that Johnny should not go swimming with Billy because Billy has a cold. This would follow the argument developed in Parikh (2001) in the chapter on implicature. 7. Note that no real interpersonal comparisons of utility are involved, this is just a manner of speaking about the situation. Any affine transformation of the numbers for each agent preserves the agent's underlying ordering. 8. Incidentally, Robert van Rooij (2004a) makes the observation that under certain highly restrictive assumptions about payoffs (i.e. when they are more or less the same for speaker and addressee) the game-theoretic analysis gives the same predictions as the optimal coding model. Of course, this is not true in the general case. Also, it must be remembered that showing that the two models might make some of the same predictions in a small number of special cases ignores the question of what the two models help to explain. Partial predictive equivalence falls far short of explanatory equivalence. Different underlying processes may be involved and these too have consequences for our understanding of the phenomena of information flow and linguistic behavior. 9. Indeed, if the wider notion were taken as characteristic of signaling games themselves, as Kreps and Sobel (1994) suggest, then they would not be special cases of games of incomplete information, whose definition was clearly restrictive in requiring the same extensive forms to be attached to each type (see Harsanyi 1967 and the textbooks cited above). 10. Just as signaling games of partial information are more general than signaling games of incomplete information, so games of partial information are correspondingly more general than games of incomplete information because they do not require the same extensive forms to be attached to each type. This more general form is also important in the derivation of some rules like the predictive rule for scalar implicature derived informally in chapter 7 of Parikh (2001) and in more detail in Parikh (2005b). 11. My thanks to him for pointing this out. 12. Robert van Rooij (2004b,a) has also stated that games of partial information apply just to deriving interpretations of particular utterances and not to the derivation of rules, despite clear evidence to the contrary – in Parikh (2001), I give informal derivations of the Gricean maxims and of a predictive rule for scalar implicatures. The extension to other rules is straightforward. I discuss this issue in more detail in a later paper (see Parikh 2005b). 13. There is also a fourth representation called the Selten game or the type-agent representation, but I do not discuss it here. The interested reader can see Myerson (1995) and Osborne and Rubinstein (1994). 14. See again the two references Myerson (1995) and Watson (2002). 15. There is a slight awkwardness in introducing this formal device of a chance move by Nature when the private information is something fundamental to and inseparable from the relevant agent (e.g. his or her gender), but in the kinds of contexts of communication we are interested in, there should not be a problem in making this assumption. Besides, even in the case of something fundamental like a player's gender in situations where it matters strategically and where it has the role of private information, we might still without too much awkwardness assume the same sort of device of a move by Nature determining this private information if we also assume that there is a long time delay between Nature's move and subsequent moves by the actual players, since there is nothing in the representation that restricts the time duration of a game.
16. It is important to note that the strategic form representation is in fact an alternative representation of the underlying game and not just part of the analysis of the extensive form representation. Different representations make different aspects of the situation being modeled visible, just as with the simplest types of games. 17. Once again, I deliberately refrain from specifying any particular solution concept here as most applicable solution concepts are likely to be equivalent. 18. There is a further subtlety relating to A’s private information. As represented in the extensive form, he learns it only after Nature’s move, so when solving the game he has to think about what he would do in s0 as well. As represented in the strategic form, he never strictly speaking learns which of s or s0 become factual, but he has a plan for both contingencies just as in the extensive form. In the Bayesian form, the private information is explicitly represented, but he still needs to consider the second matrix which is not factual because he needs to take B’s reasoning into account. 19. This condition depends on assuming unambiguous alternative sentences, but it can be generalized without requiring this restriction. 20. This is just the unique Pareto-efficient Nash equilibrium of the two Nash equilibria. 21. While mixed strategy equilibria are routinely computed in signaling games, I believe this is the first such insight in the context of natural language communication. 22. In situations like this where we might have an infinite number of possible equilibria, game theory does not offer any simple and convincing way to make a unique choice. The reasoning always remains somewhat informal and even ad hoc. The best explanation I can think of why the mixed equilibrium I am proposing is ‘compelling’ is that it mixes precisely the two strategies of A which yield Pareto-Nash equilibria in pure strategies, which additionally are separating equilibria – the two strategies involve different actions in s and s0 – and this may be independently preferred. In a more complete analysis than I am offering here, one might consider other mixed equilibria explicitly as well. In any case, I say more about this in the last section. 23. The calculation is as follows: If α is the unknown equilibrium probability of p that B uses to mix her two strategies, then it must satisfy 5.4α+7(1−α) = 8.4α− 0.4(1 − α), which yields α ≈ 0.71. Similarly, if β is the unknown equilibrium probability of µϕ that A uses to mix his two relevant strategies, then it must satisfy 7β + 6.6(1 − β) = 8.4β − 0.6(1 − β), which yields β ≈ 0.84. 24. I also alluded briefly to how these models can be used to derive rules of all kinds, something that space did not permit me to consider more amply here. I do this in a later paper (see Parikh 2005b).
References Austin, J. L. (1975). How To Do Things With Words. Harvard University Press, Cambridge, second edition. Ed. J. O. Urmson and Marina Sbisa. Crawford, V. P. and J. Sobel (1982). Strategic information transmission. Econometrica, 50, 1431–1451. Fehling, M. and P. Parikh (November 1991). Bounded rationality in social interaction. In Knowledge and Action at Social and Organizational Levels - AAAI Fall Symposium. Grice, H. P. (1989). Logic and conversation. In Studies in the Way of Words, pp. 1–143. Harvard University Press, Cambridge, MA.
Grosz, B., A. Joshi, and S. Weinstein (1995). Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21(2), 203–225. Harsanyi, J. C. (1967). Games with incomplete information played by Bayesian players. Management Science, 14, 159–182, 320–334, 486–502. Horn, L. R. and G. Ward (2004). The Handbook of Pragmatics. Blackwell Publishing, Oxford. Kreps, D. (1985). Signalling games and stable equilibria. Mimeo, Stanford University. Kreps, D. (1986). Out of equilibrium beliefs and out of equilibrium behaviour. Mimeo, Stanford University. Kreps, D. and J. Sobel (1994). Signalling. In R. Aumann and S. Hart, eds., Handbook of Game Theory with Economic Applications, volume 2. Elsevier. Kreps, D. and R. Wilson (1982). Sequential equilibrium. Econometrica, 50, 863–894. Kuhn, H. W. (1953). Extensive games and the problem of information. In H. W. Kuhn and A. W. Tucker, eds., Contributions to the Theory of Games, volume 2, pp. 193–216. Princeton University Press, Princeton. Lemon, O., P. Parikh, and S. Peters (July 2002). Probabilistic dialogue modeling. In 3rd SIGdial Workshop on Discourse and Dialogue, pp. 125–128. Lewis, D. (1969). Convention. Harvard University Press, Cambridge, MA. Myerson, R. (1995). Game Theory: Analysis of Conflict. Harvard University Press, Cambridge, MA. Osborne, M. J. and A. Rubinstein (1994). A Course in Game Theory. The MIT Press, Cambridge, MA. Parikh, P. (1987). Language and Strategic Inference. Ph.D. thesis, Stanford University. Unpublished. Parikh, P. (1990). Situations, games, and ambiguity. In R. Cooper, K. Mukai, and J. Perry, eds., Situation Theory and Its Applications, I. CSLI Publications, Stanford. Parikh, P. (1992). A game-theoretic account of implicature. In Y. Moses, ed., Theoretical Aspects of Reasoning about Knowledge. Morgan Kaufmann Publishers, Inc., California. Parikh, P. (2001). The Use of Language. CSLI Publications, Stanford University. Parikh, P. (2005a). Radical semantics: A new theory of meaning. To review. Parikh, P. (2005b). Situations, rules, and conventional meaning: Some uses of games of partial information. Forthcoming. Parikh, P. (April 2000). Communication, meaning, and interpretation. Linguistics and Philosophy, 23, 185–212. Parikh, P. (October 1991). Communication and strategic inference. Linguistics and Philosophy, 14, 473–514. van Rooij, R. (2004a). Conversational implicatures and communication theory. In J. van Kuppevelt and R. W. Smith, eds., Current and New Directions in Discourse and Dialogue, pp. 283–303. Kluwer, Dordrecht. van Rooij, R. (2004b). Signaling games select Horn strategies. Linguistics and Philosophy, 27, 493–527. Spence, M. (1973). Job market signaling. Quarterly Journal of Economics, 87, 355–374. Watson, J. (2002). Strategy: An Introduction to Game Theory. W. W. Norton and Company, New York.
4 Game Theory and Communication Nicholas Allott
1
Introduction
This chapter looks at recent attempts to shed light on communication using game theory. It is divided into three parts. First, motivations for a game-theoretic approach to communication are briefly investigated. In the second part, one of the most fully developed game-theoretic accounts of communication is examined: Prashant Parikh's post-Gricean utterance-by-utterance account (Parikh 1990, 1991, 2001). Doubts are raised about some of the aspects of Parikh's treatment and suggestions are made for refinements of cost factors to improve predictive power. A more fundamental problem is that the model drops a Gricean constraint on inference in communication. I argue that this leaves it without an account of the content of implicatures. Some comparisons are made with relevance theory (Sperber and Wilson, 1986/95), a non-game-theoretic utterance-by-utterance account of communication, which retains a form of the Gricean constraint. In a final section, I look at the broader prospects for using standard (that is, non-evolutionary) game theory to capture the link between rationality and communication. The idea of using game theory to find constraints on language is briefly sketched, and I mention some general doubts about the formalisation of rationality in standard game theory.

1.1 Why employ game theory in an account of communication?
Game theory is concerned with strategic interactions – situations where two or more players (or agents) have to make decisions and the way things turn out for each player may depend on the other player or players' choices. Superficially, at least, human communication looks like this sort of situation1, since a speaker makes an utterance and a hearer tries2 to interpret it. We say that the hearer tries to interpret the utterance because a great deal can be meant by the speaker which is not encoded in the linguistic form (the lexical items and the syntactic structure) of the phrase or sentence uttered. The hearer must, at least, choose a meaning for ambiguous expressions, assign reference to indexical elements such as pronouns, decide on
reference class and scope for quantifiers, work out for some lexical items how loosely or narrowly they are being used and recover intended implicit meaning ('implicatures'). How successful the speaker and hearer each are seems to depend on choices made by the other: if the interpretation the speaker intended is (close enough to) the one the hearer works out, then communication is successful, otherwise there is miscommunication. This apparent degree of shared interest has led to the suggestion that communication be modelled as a coordination game, that is, a game where the players' payoffs are aligned (e.g. Lewis 1969; Parikh 1991).

1.2 Game theory, communication and rationality
A central attraction of game-theoretic models of communication is that they might help to elucidate the link between rationality and human communication (if there is one). Grice was convinced that principles governing communication would fall out from general assumptions about rationality and the cooperative nature of communication (perhaps when taken together with other assumptions about human beings or the communicative situation). I am . . . enough of a rationalist to want to find a basis that underlies these facts [about people behaving as the conversational principle and maxims say] . . . I would like to think of the standard type of conversational practice not merely as something that all or most do IN FACT follow but as something that it is REASONABLE for us to follow, that we SHOULD NOT abandon (Grice 1975, p. 48, his emphases). Standard game theory has certain assumptions about common knowledge and rationality (CKR) built in, and an account of communication in terms of standard game theory would inherit these assumptions. This approach promises, therefore, to make the link between rationality and communication clearer and more precise. On the other hand, a game-theoretic model of communication would also inherit any empirical and theoretical disadvantages of the particular formalisation of rationality adopted by standard game theory. There are at least two ways of making the link between game theory and communication (van Rooij 2004, p. 494): either rationality considerations are taken to apply to languages or directly to the communicative situation. Lewis’s work and recent work by van Rooij (2003, 2004) is in the former tradition; Parikh’s model (Parikh 1991, 2000, 2001) takes the second approach,3 looking at how communication is achieved between a speaker and a hearer
with the language as a given. In section 2, I examine Parikh’s model, arguably the most developed attempt at either of the approaches. I will compare this model with the account of communication given by relevance theory, with the aim of assessing how well Parikh’s account compares with modern cognitive pragmatics in its explanation of retrieval of explicit and implicit meaning. 1.3
Cooperative Games
Situations where the interests of the players are lined up are modelled as cooperative or coordination games.
Table 4.1: Meeting game

                                           player 2
                             Trafalgar Square    UCL front quad    ···
  player 1  Trafalgar Square      20, 20            −10, −10       ···
            UCL front quad       −20, −10            10, 20        ···
            ...                     ...                 ...        ...
Consider Table 4.1. Here both players want to meet, so the outcomes where they make the same choice have high payoffs for both. The outcomes where they do not meet have negative payoffs to reflect the effort involved in making the journey. Also, player 1 has an aversion to University College London (UCL), so for him the payoffs at UCL are lower than elsewhere. If this were not the case, this would be a purely cooperative game, or, equivalently, a pure coordination game. As we will see, Parikh models communication as either cooperative or nearly so. In the simple game where Trafalgar Square (t) and UCL front quad (u) are the only choices, ⟨t, t⟩ and ⟨u, u⟩ are both 'Nash equilibria': situations where neither player will be better off if he changes his choice unilaterally. In coordination games, there are typically multiple Nash equilibria 'along the diagonal' and game-theoretic accounts of communication have had to focus on criteria for a unique solution. Games can also be played sequentially: first one player takes a turn, then the other, with the second player knowing what the first player has chosen. In this case, the games can be represented as trees. Figure 4.1 below is a tree for a meeting game with two players and two choices each. Here player 1 gets to move first. How should he choose his move? He looks at what player 2 will do in each sub-tree, that is, in each situation that player 2 could be in. Here it is simple: we assume that player 2 chooses so as to maximise his payoff. It follows that if player 1 chooses t then player 2 will
[Game tree omitted: player 1 chooses Trafalgar Square or UCL front quad, then player 2, knowing player 1's choice, does the same; the payoffs are +20, +20 if both choose Trafalgar Square, −10, −10 and −20, −10 for the two non-meeting outcomes, and +10, +20 if both choose UCL front quad.]
Figure 4.1: Sequential meeting game with two choices and two players
choose t; if player 1 chooses u then player 2 will choose u (maximising his pay-off to +20 versus −10 in each case). The other outcomes can be ruled out. So player 1's choice is between ⟨t, t⟩ (the top branch), and ⟨u, u⟩ (the bottom branch). He prefers the pay-off of +20 for ⟨t, t⟩ to the +10 he would get for ⟨u, u⟩ so he will choose t and player 2 will then choose t. The games Parikh uses are also sequential games: a speaker chooses an utterance, and a hearer, knowing what has been uttered but not the intended interpretation, chooses an interpretation.
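The backward-induction reasoning just described can be written out mechanically; the following is a minimal sketch of my own, using the payoffs of Figure 4.1.

# Backward induction on the sequential meeting game of Figure 4.1.
# Payoffs are (player 1, player 2); 't' = Trafalgar Square, 'u' = UCL front quad.
payoffs = {
    ("t", "t"): (20, 20), ("t", "u"): (-10, -10),
    ("u", "t"): (-20, -10), ("u", "u"): (10, 20),
}

def best_reply_of_player2(move1):
    # Player 2 moves second and knows player 1's choice.
    return max(["t", "u"], key=lambda move2: payoffs[(move1, move2)][1])

def backward_induction():
    # Player 1 anticipates player 2's best reply in each subtree.
    move1 = max(["t", "u"],
                key=lambda m: payoffs[(m, best_reply_of_player2(m))][0])
    return move1, best_reply_of_player2(move1)

print(backward_induction())   # ('t', 't'), with payoffs (20, 20)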
2
Parikh’s model
In this section of the chapter I introduce Parikh’s game-theoretic model of communication, giving a critical commentary. Consider a situation where a speaker makes an utterance and a hearer tries to interpret it. Parikh starts by examining cases where the sentence uttered has two possible meanings. This could be due to lexical or structural ambiguity, to the need to assign reference or to a purely pragmatic availability of two readings for the sentence.4 Parikh gives an example in which the speaker utters (1): (1) ϕ: Every ten minutes a man gets mugged in New York. According to Parikh, this has the two possible meanings: (2) p: Every ten minutes someone or other gets mugged in New York.5
(3) p’: Some particular man gets mugged every ten minutes in New York. As Parikh acknowledges, there are other parts of the meaning of (1) which the hearer must resolve: ‘New York’ might mean the state or the city, and ‘ten minutes’ is clearly being used somewhat loosely. The aim is to show first how the model of communication resolves one uncertainty at a time, then allow for its extension to cover realistic cases where several aspects of the meaning must be fixed simultaneously.
[Game tree omitted: from initial situations s (probability ρ) and s' (probability ρ'), the speaker utters ϕ, reaching t or t'; the hearer, unable to distinguish t from t', chooses p or p', with payoff +10 for the intended interpretation and −10 otherwise.]
Figure 4.2: Part of local game for utterance of ϕ (Parikh 2001, p. 30)
Consider Figure 4.2. There are two initial situations, s and s'. s is the situation in which the speaker intends to convey p; s' is the situation in which she intends to convey p'. In either case, she may utter ϕ. After the utterance there are also two situations, t and t', where t is the situation where the speaker means to convey p and has uttered ϕ and t' is the situation where she means to convey p' and has uttered ϕ. The speaker, knowing what she wants to convey, knows which situation she is in. The hearer does not. His uncertainty is represented by the box around t and t'. He assesses the probability of s as ρ and the probability of s' as ρ'. He then chooses an interpretation – either p or p'. Note that his preferred choice depends on information he does not have. If he is in t he prefers to play p; if he is in t', he prefers p'. This is reflected in the payoffs. For the moment, Parikh assumes that successful communication is worth +10 to each player and unsuccessful communication is worth −10. This is based on two assumptions, both of which he later relaxes: first, that all information has the same value; secondly that the information is worth the same to both players.
[Game tree omitted: as in Figure 4.2, but with the unambiguous alternatives µ (leading to e, interpreted as p, payoff +7) available in s and µ' (leading to e', interpreted as p', payoff +7) available in s'.]
Figure 4.3: Local game with alternative utterances (Parikh, 2001, p. 31)
According to Parikh, successful communication depends on consideration of alternative utterances. In Figure 4.3 the alternative utterances µ and µ' have been included. µ unambiguously means p and µ' unambiguously means p'. Therefore if the speaker utters µ, the hearer knows he is in situation e and will choose interpretation p. Similarly an utterance of µ' leads the hearer to choose p'. The payoffs are worked out by assigning +10 as before for successful communication, and −3 for the extra effort involved in the production and comprehension of these longer and linguistically more complex utterances. Parikh does not give details of the way the linguistic complexity translates into effort. Later we will see that he allows the cost of constructing or processing a mental representation to come in also as a negative factor in payoffs. Figure 4.3, then, is the hearer's model of the interaction, which he is able to construct when ϕ is uttered. The speaker can also construct this model, since it is based on shared knowledge. If there is a unique solution to the
game, therefore, it will be known to both players. Successful communication using ϕ will be possible. I return to the solution of the local game in section 2.1. If the speaker knows the payoff of the local game she knows the payoff from uttering ϕ. She compares this with uttering µ and other alternatives as shown in the ‘global game’, Figure 4.4.
[Diagram omitted: from s, the speaker compares the values of the local games for the alternative utterances, ν[g(ϕ)] = +10 and ν[g(µ)] = +7, alongside other alternative utterances.]
Figure 4.4: The speaker’s choice of utterances (after Parikh, 2001, p. 32)
As shown, ϕ has a higher payoff than µ, so the speaker should choose it as long as it also has a higher payoff than other alternative utterances.

2.1 Solutions for the game
Returning to the problem of solutions for the local game, consider the different possible strategies for both players. A strategy is a specification of the choices a player will make at all decision nodes. Here, the non-trivial choices are for the speaker at s and at s' and for the hearer at t or t' (the hearer's choice is constrained to be the same at t and t', since he cannot know which of them he is at). This gives eight different strategies (see Table 4.2), two of which are Nash equilibria, that is, solutions where neither player can do better by changing his or her choice unilaterally. The first Nash equilibrium (N1) is the intuitively correct solution: if the speaker wants to mean p – the more likely meaning – then she says ϕ, the shorter but ambiguous utterance; if she wants to mean p' – the less likely meaning – then she says µ', the longer but unambiguous utterance; and the
Table 4.2

  What A will   What A will   What B will           Nash            If not Nash equilibrium,
  say in s      say in s'     choose in (t, t')     equilibrium?    why not?
  ϕ             ϕ             p                     no              A should defect to µ' in s'
  ϕ             µ'            p                     yes             –
  ϕ             ϕ             p'                    no              A should defect to µ in s
  ϕ             µ'            p'                    no              A should defect to µ in s; B should defect to p
  µ             ϕ             p                     no              A should defect to µ' in s'; B should defect to p'
  µ             µ'            p                     no              A should defect to ϕ in s
  µ             ϕ             p'                    yes             –
  µ             µ'            p'                    no              A should defect to ϕ in s'
hearer correctly interprets the ambiguous utterance ϕ as having the more likely meaning, p. The second Nash equilibrium (N2) is a kind of inverse of the intuitively correct solution: if A wants to mean p – the more likely meaning – then she says µ, the longer but unambiguous utterance; if she wants to mean p' – the less likely meaning – then she says ϕ, the shorter but ambiguous utterance; and B correctly interprets the ambiguous utterance ϕ as having the less likely meaning, p'. It is clearly good that the other strategies are ruled out, since they describe even stranger arrangements than the second Nash equilibrium. However, it is also necessary for Parikh to rule out the equilibrium N2 (although there is no miscommunication in N2), leaving only the intuitively correct solution, N1. Note that this problem of multiple Nash equilibria comes with the decision to model communication as a coordination game. As noted above, coordination games generally have multiple Nash equilibria. This is exemplified by Figure 4.2: the cells on the diagonal will be preferable for both players to the off-diagonal outcomes that can be reached by unilateral change of choice. Any game-theoretic approach to communication which models the communicative situation as a coordination game must resolve this problem. Parikh's proposal here is to bring in another solution concept, Pareto-dominance, which is defined as follows: some solution A Pareto-dominates a solution B iff solution A has a better payoff for at least one player than solution B and the payoff of A is not less than the payoff of B for any player. In general, Pareto-dominance can pick out different solutions from Nash equilibrium.6 To avoid this, Parikh applies it as a secondary criterion to
choose between Nash equilibria.7 In this case, this works out as choosing the Nash equilibrium with the highest expected payoff. Using expected utility as the measure of the worth of an outcome, as is usual in game theory, the expected payoff for an outcome = payoff × probability of outcome. (Thus, for example, a rational agent should prefer a 50% chance of £100 to £49 for sure, neglecting risk avoidance.) Thus, setting ρ = 0.9 and ρ' = 0.1, we have the following payoffs for the Nash equilibria:

payoff of 'correct' solution, N1 = 10 × ρ + 7 × ρ' = 9.7
payoff of 'incorrect' solution, N2 = 7 × ρ + 10 × ρ' = 7.3

The solution which is intuitively correct Pareto-dominates the other solution, which can therefore be eliminated. Generally, N1 Pareto-dominates N2 in this game iff

u(ϕ in s, p) · ρ + u(µ' in s', p') · ρ' > u(µ in s, p) · ρ + u(ϕ in s', p') · ρ'

where u(strategy) is the payoff for that strategy. This is a criterion for successful communication in this game, therefore.
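The whole of this small analysis – enumerating the eight strategy profiles, checking which are Nash equilibria, and comparing their expected payoffs – can be reproduced in a few lines. This is a sketch of my own, using the payoffs of Figure 4.3 (taken here as common to both players) and ρ = 0.9.

# Enumerate strategy profiles of the local game in Figure 4.3, check which are
# Nash equilibria, and compare their expected payoffs (rho = 0.9, rho' = 0.1).
# Payoffs: phi understood -> +10, misunderstood -> -10, unambiguous mu/mu' -> +7.
from itertools import product

rho, rho_p = 0.9, 0.1

def outcome(say, situation, interp):
    """Payoff for one situation given A's utterance and B's reading of phi."""
    if say != "phi":
        return 7                      # mu in s, mu' in s': always understood
    intended = "p" if situation == "s" else "pp"
    return 10 if interp == intended else -10

def expected(a_s, a_sp, b):
    return rho * outcome(a_s, "s", b) + rho_p * outcome(a_sp, "sp", b)

def is_nash(a_s, a_sp, b):
    u = expected(a_s, a_sp, b)
    if any(expected(x, a_sp, b) > u for x in ("phi", "mu")):    # A deviates in s
        return False
    if any(expected(a_s, x, b) > u for x in ("phi", "mup")):    # A deviates in s'
        return False
    if any(expected(a_s, a_sp, x) > u for x in ("p", "pp")):    # B deviates
        return False
    return True

for a_s, a_sp, b in product(("phi", "mu"), ("phi", "mup"), ("p", "pp")):
    if is_nash(a_s, a_sp, b):
        print(a_s, a_sp, b, round(expected(a_s, a_sp, b), 1))
# -> phi mup p 9.7   and   mu phi pp 7.3; the first Pareto-dominates the second.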
2.2 Where do the probabilities come from?
The subjective probabilities assigned by the hearer to the situations s and s' are crucial in determining whether successful communication occurs in the model. A number of questions might be asked about these probabilities, particularly where they come from, that is, how do the hearer and the speaker arrive at them? (Note that the speaker must assign at least similar probabilities to the ones assessed by the hearer for communication to be successful.) Parikh's answer for the example given8 is that the probability, ρ, of the speaker's being in situation s, that is, wanting to convey p, is related to the probability that p is true, although with a complication:

Since it is common knowledge that p is more likely than p', we can take it as common knowledge that A [the speaker] probably intends to convey p rather than p' . . . In general, there is a difference between these two probabilistic situations, and it is only the second, involving A's intention, that matters. In the absence of any special information, one situation does inform the other . . . (Parikh 2000, p. 197, my emphasis)

Three important questions are raised by this formulation. First, what is 'special' information and how do we know when it comes into play? Secondly, what can be said about the general case, given that this formulation
is only intended to apply to the example given? Thirdly, if the relative probabilities of the propositions, p and p' sometimes tell us whether ρ > ρ', as Parikh claims they do in the example, why could Parikh not generally use (some transformation of) the probabilities of p and p' for ρ and ρ'? The answer to the first question is given in a parallel quotation from an earlier paper: 'Note that in general, there is a big difference between the likelihood of a proposition's being true and the likelihood of an agent intending to communicate that proposition. It is the absence of further relevant information that justifies identifying the two possibilities.' (Parikh 1991, p. 482, my emphasis.) No characterization of relevance is given, so it appears that an intuitive notion of relevance is playing a crucial role here. Secondly, Parikh says about the general case only that the probabilities will 'usually be subjective because objective information will not be available. In general, the initial probabilities are a result of the prior beliefs and goals of speaker and addressee, based either on prior discourse or actions, or just the relevant background beliefs and goals of each agent' (2001, p. 28). This is too vague to be useful in making predictions, which is problematic given that sometimes a small change in the probabilities could lead to a radical change in the predicted interpretation. The problem is not that the numbers involved are not specified by the theory, and have to be put in by hand, but rather that the list of factors involved in determining the probabilities is qualitatively vague. Given these difficulties, one might ask why Parikh needs to go beyond supposing that the correct probabilities come from the probabilities of the propositions. That is, does he really need to suppose that there will be 'special' cases and that the general case is even more complicated? In the next section I draw on work by Wilson and Matsui to show why it would not be generally correct to use the probabilities of the propositions as the probabilities of the situations s and s'. I also propose an alternative, which is not (yet) quantitatively precise in every case but is qualitatively specific about the factors involved.

2.3 Truth, relevance and disambiguation
Wilson and Matsui (1998) examine accounts of pragmatic processes such as disambiguation which use rules such as 'The correct interpretation is the one which is most likely to be true.' Accounts of this type are called truth-based. These accounts often make incorrect predictions: in general the correct interpretation need not be the one most likely to be true, as examples such as (4) show: (4) John wrote a letter. a. John wrote a letter of the alphabet.
b. John wrote a letter of correspondence. (Wilson and Matsui 1998, p. 24) The lexical item ‘letter’ is ambiguous, so that (4) could mean either (4a) or (4b). The disambiguation naturally arrived at (at least in non-biasing contexts) is (4b), but this cannot be more likely to be true than (4a) since (4a) is entailed by (4b): anyone writing a letter of correspondence must be writing letters of the alphabet.9 It is examples of this kind which rule out just using the probabilities of p and p’ for ϕ and ϕ0 in Parikh’s model. Would it be a good move to say that normally a truth-based approach is followed but that it needs to be modified in certain cases? Symmetrical examples like (5) and (6) suggest that it is not. (5) Mary is frightened of dogs. (ambiguous between male dogs and dogs in general) (6) Mary is frightened of cats. (ambiguous between cats in general (lions, domestic cats, tigers etc.) and domestic cats) (Wilson, lectures at UCL. See also Sperber and Wilson 1986/95, p. 168) In (5) the intuitively correct reading has the more general sense of the term, dog in general. Parikh could deal with examples like this in the same way as example (1). In (6), on the other hand, the intuitively correct reading has the more specific sense of the ambiguous term, domestic cat, and this reading is surely less likely to be true, given that big cats are more frightening than tabbies. In this case, therefore, Parikh would presumably say that relevant information somehow takes precedence over what is known about the probabilities of the propositions so that the probability that the speaker wants to convey that Mary is frightened of domestic cats is higher than the probability that she means that Mary is frightened of cats in general. There is a better solution available, however. In both cases, the intuitively correct meaning is the one with the more accessible of the two readings of the terms. Accessibility is a psycholinguistic concept. Empirical work in psycholinguistics aims to determine what makes a lexical item or a sense of a lexical item more accessible on one occasion or another. Some current theories propose that the frequency of use of the sense and the recency of its use are the key factors in accessibility of different senses in disambiguation. In a ‘neutral’ context there are no recent uses to consider, so accessibility for a reader encountering (5) and (6) as they are presented here would depend only on the frequency of the senses. Relying only on this cue for disambiguation would get both examples right: the more common meanings are domestic cat and dog in general.10
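By way of illustration only – the numbers below are invented, not drawn from the psycholinguistic literature – an accessibility-based heuristic of this kind might be sketched as follows, scoring senses by frequency plus a recency boost.

# Toy sketch of an accessibility-based disambiguation heuristic: score each
# sense by (invented) frequency plus a boost for recent use, and pick the highest.
FREQUENCY = {
    ("dog", "dog in general"): 0.9, ("dog", "male dog"): 0.1,
    ("cat", "domestic cat"): 0.8, ("cat", "cat in general"): 0.2,
}

def most_accessible_sense(word, recently_used=(), recency_boost=0.7):
    senses = [s for (w, s) in FREQUENCY if w == word]
    return max(senses, key=lambda s: FREQUENCY[(word, s)]
               + (recency_boost if s in recently_used else 0.0))

print(most_accessible_sense("dog"))                       # dog in general
print(most_accessible_sense("cat"))                       # domestic cat
print(most_accessible_sense("cat", {"cat in general"}))   # recency can override frequency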
These kinds of examples are not particularly hard to find. Parikh’s own examples raise the same questions. Recall that the explanation given of why we arrive at reading (2) for an utterance of (1) is that it is common knowledge that the alternative reading – in which it is a particular man who gets mugged – is less likely to be true. According to Parikh, the ambiguity in (7) is ineliminable (in some contexts only, I assume – Parikh has ‘in a general context’ (2001, p.41)): (7) A comet appears every ten years.
(Parikh 2001, p. 41)
The claim is that here ρ = ρ' = 0.5, so the expected payoffs for N1 and N2 are equal and the model predicts that the utterance should be unresolvably ambiguous. Note that in this case, Parikh has set the probabilities ρ = ρ' by hand, even though it is clear that the more general meaning ('Some comet or other appears . . . ') is entailed by and therefore at least as likely to be true as the other ('A particular comet appears . . . '), that is, the probability of p ≥ the probability of p'. In example (8), where the more specific reading is the intuitively correct one (in 'neutral' contexts), Parikh would have to appeal to special information to set ρ ≤ ρ'. Here, as previously, the probability of p ≥ the probability of p', that is, the same entailment between the propositions expressed by the general and the specific reading holds and the specific reading cannot be more likely to be true than the general one: (8) A comet appears every 76 years. Perhaps a better move for a supporter of Parikh's model would be to say that the probabilities ρ and ρ' generally reflect the psycholinguistic activation and accessibility data (rather than sometimes relating them to the probabilities of the propositions the speaker wants to convey and sometimes to 'the prior beliefs of speaker and addressee' (op. cit.) in general). This would allow the model to make correct predictions at least in cases like (4), (5) and (6) without appeal to special information. Another way of arriving at the same result would be to do away with the prior probabilities and allow activation to be reflected in the payoffs as cost or effort, where lower activation of a sense would translate into greater cost. This would amount to adopting part (a) of the relevance theory definition of effort (see section 2.6 on page 139).

2.4 Implicatures
Parikh (2001, pp 80f.) considers a case where there is an utterance ϕ, which could have an implicature but need not. The meaning conveyed without the implicature is l; the meaning conveyed with the implicature is p. For example:
(9) ϕ: It’s 4pm. l: The time now is 4pm. p: The time now is 4pm. + It’s time to go to the talk. As before, it is assumed that information is valuable and misinformation has a negative value (of −2 in this case). It is also assumed that ϕ is uttered in a context where the speaker knows that the hearer wants to go to a talk at 4pm. In this case p is worth more to the hearer than l, because the extra information would help him in a decision he has to make. In other cases of communication involving implicatures, p may be worth more than l for other reasons (I comment on this in section 2.6 below). Here the values of p and l are set to 9 and 4 respectively for both speaker and hearer. On the effort side, it is assumed that there are processing costs for the hearer and for the speaker: they are greater for the speaker because she has to produce the utterance. Parikh also assumes that arriving at p is more costly than arriving at l only, since ‘p has to be contextually inferred from l . . . [with] additional processing involved.’ (Parikh 2001, p. 81) Then we have the game in Figure 4.5. As in the game for explicit meaning, alternative unambiguous utterances are considered, here µ, which unambiguously means p, and silence, which is taken to convey no information, perhaps problematically, since silence is often communicative, in fact, and might have been so here. Provisionally accepting all of these assumptions and taking the probabilities of s and s’ to be negligibly different, we have as a unique Pareto-Nash equilibrium the strategy in which the speaker utters ϕ if she wants to convey p and silence if she wants to convey an ‘empty interpretation’ and in either case is correctly understood by the hearer. How adequate is this kind of account of implicatures? In the next section, I consider some questions which any account of implicatures should answer. 2.5
Questions about implicatures
A crucial question for an account of implicatures is what the search space for implicatures is. In Parikh’s model no constraints are placed on p except that it and l are both meanings of ϕ within the language shared by the speaker and the hearer. This applies to implicit meaning as well as explicit meaning. Formally, this is expressed as a meaning function that maps from utterances onto (multiple) meanings, both explicit and implicit. If current arguments are correct that free contextual enrichment can affect the proposition expressed (Carston 2002, p. 40), then it is a mistake to
Figure 4.5: Local game for implicatures (Parikh 2001, p. 82)
say that all the possible explicit meanings, p, p’, p” . . . are given by the language. Leaving this debate aside, though, for implicatures it seems much less plausible that a language would specify all possible meanings in all possible contexts. One reason that this approach might seem right for disambiguation and some other aspects of explicit meaning is that there really is a small set of possibilities for the disambiguation that are contextindependent, so that ‘bank’ for example, either means a financial institution or the side of a river. Thus it is not unreasonable to try to model disambiguation as a choice between options specified by the language, where context affects the payoff and probabilities of the options, but does not affect which options exist. But this reasoning does not carry over to implicit meaning. For implicatures, the set of possible meanings would itself depend heavily on context, as we see in Parikh’s example, where we are told that the speaker and hearer both know that there is a talk at 4 o’clock and that both are interested in attending. This information should constrain the possible implicatures. Parikh’s model does not explain how.
In adopting this formulation, Parikh abandons the Gricean idea that implicatures should follow logically from the combination of explicit content and some assumptions. That proposal is desirable because it vastly reduces the search space for implicatures, given that there are many propositions which do not follow from the proposition expressed plus a limited set of assumptions. The way that Grice’s insight is developed in relevance theory makes this particularly clear: implicated conclusions are logically warranted by explicit meaning together with implicated contextual assumptions. A relevance theory treatment of example (9) claims that the implicature ‘It’s time to go to the talk’ is only available as part of an interpretation if it follows logically from the explicit meaning plus contextual assumptions such as (10) which must be manifest in the context: (10) Getting to the talk will take at least five minutes; It is better to arrive no more than five minutes past the hour for a talk; etc. In contrast, in Parikh’s model, context places no restriction on the meanings that are possible for an utterance, rather, it is only used to guide the selection between possible meanings. Perhaps the reason for this is that in Parikh’s model only possible meanings of the utterance are represented and not other propositions which are involved in comprehension. In particular, he does not mention contextual assumptions, so he cannot show how explicit meaning logically constrains implicatures. To restate the point: getting disambiguation right can depend on getting the right context. With implicit meaning this is even more clearly important, because there is an open-ended number of candidates to consider for implicatures. That makes it an even more attractive possibility to have a way of getting to the right interpretation, that starts from the context, in finding candidates to evaluate. These considerations are linked to the next question I want to consider: How is the context for interpretation selected? I agree with Sperber and Wilson (1986/95, pp. 137–42) that context selection is a very serious problem for pragmatics, if not the problem. Along with many other models of communication, Parikh’s seems to me to be making use of intuitions of relevance at this point, at least in the examples given. In the example Parikh chooses p (−1) as a candidate for an implicature. Intuitively this is a good candidate in this example: it would make sense in the context which is assumed. These intuitive assumptions seem to be taking the place of an account of the way that the hearer decides in what context he should process the utterance to make the intended sense of it. Without wishing to go too deeply
into this question here, I want to stress that this is a non-trivial problem. For example, a speaker, Sherlock Holmes, may utter a sentence where reference assignment or disambiguation is needed, such as ‘He went to the bank,’ and his hearer, Dr. Watson, may be simply unable to work out which person and what kind of bank are involved because although he and Holmes are in the same physical environment, perhaps even attending to the same aspects of it, and have heard the same prior discourse, still he and Holmes are in different contexts. This is because Watson has not made the same inferences as Holmes and thus starts with different information and – importantly – because he is simply unable to make those inferences and therefore cannot get himself into the same context as Holmes.11 (In relevance theoretic terminology, different propositions are manifest to him.) A good account of communication need not be a general theory of contexts (whatever that might be), but it must have something to say about the way in which a hearer, presented with an utterance, searches for a context which the speaker took to be accessible to him and in which the utterance can be interpreted so as to make the kind of sense the speaker intended. A Parikhian account of communication might try to answer this need by drawing on work on presupposition and accommodation (e.g. Stalnaker 1973, Lewis 1979)12 . Note that the contextual assumptions involved do not need to be things that the hearer believed or had even thought about before the utterance, as example (11) shows: (11) Mary and Peter are looking at a scene which has many features, including what she knows to be a distant church. Mary: I’ve been inside that church. (Sperber and Wilson 1986/95, p. 43) Mary ‘does not stop to ask herself whether [Peter] has noticed the building . . . All she needs is reasonable confidence that he will be able to identify the building as a church when required to . . . it might only be on the strength of her utterance that it becomes manifest to him that the building is a church.’ (ibid., pp. 43 f.) Sperber and Wilson develop an account, including the notion of mutual manifestness, which explains examples like these. Accounts of communication within game theory also need to deal with context selection. A further problem for modelling any open-ended inference process is proposing a stopping rule that works. My final question in this section, then, is how is the search for implicatures stopped? Parikh gives an example of the way his model stops further implicatures from being generated. I would like to suggest that to adopt his answer is effectively to adopt a principle of maximal relevance.
The example considered (Parikh 2001, pp. 84f) models a situation where there are three possible interpretations to be considered if the speaker utters ϕ: l, p and q. l and p are as before; q includes p plus some extra information. The result is a more complicated game with a solution consistent with the previous one – the speaker, wishing to convey p, utters ϕ and the hearer interprets it as p – as long as the value of the extra information, l, is less than the effort that would have to be expended to process it. This follows because choosing p will be the equilibrium strategy for the hearer (after ϕ is uttered) just when the payoff for choosing q rather than p is lower because the cost of the new information outweighs its value. If q is p plus something informative but irrelevant, as in Parikh’s example, ‘let’s take the route by the lake’, then as Parikh puts it: ‘Suppose there is some other proposition, q, . . . that is more informative than p. It will certainly have positive value but it is easy to see that it cannot have more value than p [in the context discussed]. . . . Moreover it is reasonable to assume that the greater the information, the more costly it is to process.’ (2001, p. 84). Effectively, then, the stopping criterion that emerges from this model is: generate implicatures until the effort involved in doing so is less than the value of the information they contain. This is superficially close to the communicative principle of relevance, discussed below. If Parikh’s model needs this principle and relevance theory can provide an account of communication with a similar (but different) principle and little other machinery then Parikh’s account seems to suffer from relative lack of economy. Note that Parikh’s stopping rule differs significantly, however, from the relevancetheoretic communication procedure in that it seems to look for maximal rather than optimal relevance. If so, the two different principles would make some different predictions. I discuss this in the next section, which gives details of the relevance theory comprehension procedure and compares relevance theory and Parikh’s model at several points. 2.6
Further comparisons with relevance theory
Relevance theory is a theory of cognition (Sperber and Wilson 1986/95; Wilson and Sperber 2002). It claims that human cognition tends to be geared to the maximization of relevance. (This is the cognitive principle of relevance (1986/95, pp. 260f).) Relevance is defined in terms of cognitive effects and processing effort (2002, p. 252):

(12) Relevance (1986/95, p. 125; Wilson and Matsui 1998, p. 16)
a. The greater the cognitive effects, the greater the relevance;
b. The smaller the effort needed to achieve those effects, the greater the relevance.
Cognitive effects occur when new information interacts with existing contextual assumptions in one of three ways:

(13) Cognitive effects (Wilson and Matsui 1998, p. 16)
a. Strengthening an existing assumption;
b. Contradicting and eliminating an existing assumption; or
c. Combining with an existing assumption to yield contextual implications.

(14) Processing effort is affected by (Wilson and Matsui 1998, p. 16)
a. the form in which the information is presented;
b. the accessibility of the context.

In the special case of ostensive inferential communication, the speaker, by making an utterance, is making an offer of information. This raises the expectation that the utterance will be optimally relevant:

(15) Optimal relevance (Wilson and Sperber 2002, p. 256)
An utterance is optimally relevant to an addressee iff:
a. it is relevant enough to be worth the addressee's processing effort;
b. it is the most relevant one compatible with the speaker's abilities and preferences.

This expectation is spelled out in the Communicative Principle of Relevance (2002, p. 256):

(16) Every utterance communicates a presumption of its own optimal relevance.

This in turn implies the relevance-theoretic comprehension procedure (2002, p. 259):

(17) a. Consider interpretations in their order of accessibility (i.e. follow a path of least effort);
b. Stop when the expected level of relevance is achieved.

Compare this with Parikh's account of communication. Parikh also factors in effects and effort, but he is less specific about what may contribute to them.
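The procedure in (17) amounts to a simple stopping rule, and it can be sketched as a loop. In the Python sketch below everything numerical is an invented placeholder: the candidate interpretations, their accessibility ordering, the effect and effort scores, and the expected level of relevance are assumptions made only to illustrate the control flow of (17), not anything relevance theory itself supplies.

# Sketch of the relevance-theoretic comprehension procedure in (17).
# All names and numbers are illustrative assumptions: 'interpretations' is
# assumed to be ordered by accessibility, and relevance is crudely scored
# as cognitive effects minus processing effort, in the spirit of (12).

def relevance(effects, effort):
    """Toy relevance score: more effects and less effort mean more relevance."""
    return effects - effort

def comprehend(interpretations, expected_relevance):
    """Follow a path of least effort (17a) and stop at the first interpretation
    whose relevance meets the expected level (17b); give up if none does."""
    for interp, effects, effort in interpretations:   # already ordered by accessibility
        if relevance(effects, effort) >= expected_relevance:
            return interp                             # first satisfactory interpretation wins
    return None                                       # effort budget exhausted, no interpretation

# Hypothetical scores for example (9), 'It's 4pm':
candidates = [
    ("l: the time now is 4pm",                        4, 1),   # most accessible
    ("p: it's 4pm + it's time to go to the talk",     9, 3),
    ("q: p + some further, irrelevant information",  10, 9),
]
print(comprehend(candidates, expected_relevance=5))
# -> the p-interpretation: l falls short of the expected level,
#    and the search stops before q is ever considered.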
2.6.1 Effects
Effects enter the calculation as the ‘value of information’ which contributes positively to payoffs. As mentioned, an initial, provisional assumption is that all information is equally valuable; in more complex examples Parikh shows how information can be assigned a value by considering its worth in a game modelling a decision that (the speaker knows) the hearer may make. As, I would claim, with Lewis’s work, (e.g. 1969, ch. 4)13 this risks blurring the distinction between the illocutionary and perlocutionary aspects of meaning. Parikh, perhaps anticipating that there is an issue here, separates implicatures into two types: type I, where an utterance has a direct effect on the hearer’s behaviour, and type II, where an utterance affects only the hearer’s thoughts, initially at least. In the second case Parikh says, ‘this type of implicatures can be modeled in more or less the same way except that we need to consider preferences for information directly, rather than via direct action.’ (2001, p. 86) Parikh has not specified how the value of the information in type II cases is arrived at. In general, I claim, Parikh’s model does not specify in the kind of detail that relevance theory does – in (13) – the ways in which information can be valuable to the hearer. Someone wanting to make more explicit predictions with the model would have to be more explicit about the ways that information is valuable. Before moving on to Parikh’s proposals about effort, I want to dwell on his division of implicatures into two types, which I think is undesirable for two reasons. First, it seems that explicit meaning as well as implicatures sometimes leads more directly to action than at other times. Should there also be two categories of explicit meaning? Secondly, the first type of implicature seems redundant. Presumably all utterances, including ones that lead fairly immediately to decisions and actions, have their effects on the hearer by affecting his thoughts. (Or at least, in cases where this is not true we would not want to call what has happened ‘communication’.) So all implicatures will belong to Parikh’s second type. 2.6.2
Effort
I have already commented that Parikh allows as effort factors only linguistic complexity (with the metric unspecified) and, later, the cost of representing or processing an implicature mentally. I have argued that in order to account for examples such as (4) to (9), the model needs to incorporate at least (14a) from relevance theory, so that effort reflects accessibility factors connected with linguistic items in the utterance14 . There is another issue connected with effort which presents central problems for a game-theoretic account of communication. In game theory the payoffs do not include the cost of constructing the representation of the
game, understanding it and finding a solution or solutions. Parikh’s model sets up all of the possible meanings of the utterance in parallel and therefore shares a problem with truth-based approaches to pragmatics. As Wilson and Matsui (1998) point out, in order to find which interpretation is most likely to be true, all possible interpretations must be considered. This makes these approaches psychologically implausible. Parikh’s model has this problem doubled or quadrupled. First, both the speaker and the hearer must consider all possible interpretations; secondly, for every interpretation an unambiguous alternative utterance must be found.15 Considering all possible interpretations is not a trivial matter even for explicit meaning. Parikh’s example (1) has at least eight different possible readings since there are at least three degrees of freedom, given that the meaning of ‘New York’ and the precision or otherwise of ‘ten minutes’ as well as the scope of the quantifiers are underdetermined by the linguistic form. For implicatures there does not seem to be any principled reason why there should be a determinate number at all. At the least there must be a huge finite number of interpretations that any given utterance could have. In fact this kind of indeterminacy arguably gets into the explicit meaning as well, as for example in (1) where resolution of the meaning of ‘ten minutes’ is more a matter of finding a degree of precision on a continuum than of choosing among a limited number of possibilities. Contrast Parikh’s model with the relevance theoretic comprehension procedure which: integrates effort and effect in the following way. It claims that the hearer is entitled to expect at least enough cognitive effects to make the utterance worth his attention, that the processing effort is the effort needed to achieve these effects, and that the hearer is entitled to accept the first interpretation that satisfies his expectation of relevance. (Wilson and Matsui 1998, p. 18) So the hearer works through interpretations in order of (decreasing) accessibility until one of them has cognitive effects worth the processing effort – in which case this will be the interpretation understood (or until the pragmatic faculty exceeds the amount of effort it can spend on this occasion and gives up – in which case no interpretation will be arrived at). In practice this means that generally not very many interpretations will need to be considered; often, as in (5) and (6), the most accessible interpretation will be the right one. Thus while any interpretation might be considered, combinatorial explosion is avoided. The relevance theoretic comprehension procedure is simple and seems computationally tractable: it is a fast and frugal heuristic (Wilson and Sperber 2002/2004, p. 276, citing Gigerenzer et al. 1999).
Parikh can claim that his model does not need to answer the charge of psychological implausibility since it may be that the model does not describe cognitive structures or processes. He says, ‘It seems better to view the game as a model of a class of constraints that capture the underlying logic of communication . . . . The game . . . describes a valid inference without saying anything about how agents arrive at the correct interpretation of an utterance’ (Parikh 2001, p. 25). I think this misses something important. Certainly, comprehension might be carried out by a heuristic which arrives at the same interpretations as the model (at least often enough). But if the model correctly describes the logic of the situation then it implies that the mental processes involved in communication must be sufficiently sophisticated to grasp the situation correctly in some way. So the more complicated the model of the situation, the more mysterious the success of the heuristic and the more difficult it would seem to give an account of the workings of that heuristic.16 In other words, the model does have implications for ‘how agents arrive at the correct interpretation’ in that it specifies at least the nature and complexity of the problem that they have to solve. Arguably, an account of communication which shows that communication is complicated without making suggestions about how people manage to understand each other is lacking in a crucial respect.17 2.6.3
A stopping rule for implicatures
Recall that in section 2.5 on page 135 I argued that Parikh’s method of stopping in implicature derivations effectively amounted to something very like a principle of relevance, that is, something like: keep generating implicatures as long as the value of the information in an implicature is greater than the effort costs associated with it. This seems to be a principle of maximal relevance, since it makes search continue while there is any more value to be obtained that is worth the effort used in obtaining it. In contrast, the principle of communicative relevance entails that search will only continue until an optimally relevant interpretation is reached. The difference is that relevance theory claims that hearers look for an interpretation such that the utterance is ‘the most relevant one compatible with the speaker’s abilities and preferences.’ (my emphasis) A theory which claims that hearers look for maximal relevance, ignoring these provisos, predicts certain implicatures that apparently do not arise. Carston discusses a number of cases like this in a paper on so-called ‘scalar implicatures’ (Carston 1998) including example (18) (her example (71)), taken from Green (1995, pp. 96-97): (18) B: Are some of your friends Buddhist? A: Yes, some of them are.
Theories which claim hearers look for maximal relevance, including Parikh’s model apparently, predict here that A will be taken to implicate that not all of her friends are Buddhist, since in the context that Green sketches, it is evident that there is a more relevant response that A could have given, concerning whether all or most of her friends are Buddhist; this would have more contextual effects for the hearer (B) and would cost him negligible further processing effort. Since A has chosen not to utter this, doesn’t it follow that she must be communicating that only some (that is, not all or most) of her friends are Buddhist? (Carston 1998, p. 33) On the other hand, relevance theory correctly predicts that this implicature will not arise if it is manifest that the speaker was not willing to make a stronger statement: Green’s context makes it plain that while the speaker has the ability to make the stronger statement, she prefers not to (she is afraid of being considered a Buddhist-groupie) and the hearer is aware of this. Hence the relevance principle correctly predicts that the speaker is not implicating that not all of her friends are Buddhist and that the hearer recovers no such assumption as part of what is communicated. (Carston 1998, p. 33)18 However, Parikh’s model may not be in as much trouble with this kind of example as other frameworks. Many neo-Gricean and post-Gricean accounts of communication assume as a foundational principle that communication is cooperative. As a consequence they are simply unable to give an account of utterances where the speaker will not cooperate. Parikh’s model does not assume a Cooperative Principle, so, at least in principle, it could make correct predictions in these cases, given assumptions about the way the speaker’s preferences affect payoffs. As far as I can see, a defender of Parikh’s model would have to write in a proviso like the second half of the second clause of optimal relevance, so that the value of extra implicatures would be zero, no matter how useful the information might be to the speaker, if the speaker manifestly was not willing or able to communicate them. 2.6.4
The asymmetry between speaker and hearer
According to relevance theory, when hearers try to find the interpretation of an utterance intended by the speaker there does not have to be any cooperation between the speaker and hearer, except that the speaker wants to
be understood and the hearer to understand (Sperber and Wilson 1986/95, p. 268). The speaker knows that the hearer is built so as to take the first interpretation that is optimally relevant as the correct one, so she has to make sure that her utterance will lead the hearer to entertain this interpretation before any other which would be relevant enough to stop the search. Parikh’s model has a similar asymmetry between the speaker and the hearer: both must consider the local game which determines the interpretation of an utterance but only the speaker needs to consider the global game, choosing the utterance which has the highest payoff given an intended interpretation. There is a difference, however. According to relevance theory the interpretation must be optimally relevant for the hearer but not, in general, for the speaker. The constraint from the speaker’s point of view is to produce an utterance which is optimally relevant to the hearer, compatible with the speaker’s abilities and preferences. In contrast, in Parikh’s model the solution must be optimal for both speaker and hearer for successful communication. There seems to be something problematic about this, since the reasons why the solution will be optimal will generally be different for the speaker and the hearer. Which interpretation will be optimal for the hearer depends on the worth to him of the information he can derive from it. For the speaker the optimal solution is simply the one in which the hearer arrives at the interpretation the speaker intended (or something close enough to it). So the payoffs for the speaker and the hearer will not generally be the same. Parikh’s model allows for this in principle, but there still seems to be a worry here, since the model predicts that miscommunication will occur if the payoffs come apart too far. A defender of the model would need to show that this does not generally happen in ordinary cases where, as we have seen, the interests of the speaker and hearer are different. This is one aspect of a more general worry about the model, since Parikh allows that a number of factors – the effort and effect factors in the payoffs, the probabilities and even the set of meanings for an utterance – can be different for the speaker and hearer. Any of these might come apart, perhaps leading to miscommunication. To consider just the effort factor, this will generally be very different for the speaker and the hearer even though the narrowly psycholinguistic costs of an utterance are often assumed to be the same. Other costs may well be different since the tasks involved are different. The speaker knows (roughly) what she wants to mean, and has to work out what utterance will direct the hearer to this interpretation; the hearer knows what has been uttered and has to work out what was meant, including implicatures – which may require considerable inference.
3 Rationality and game theory
3.1 The other type of model
Do the problems with Parikh’s model carry over to other game-theoretic approaches to communication? In section 1.2 I noted that van Rooij (2004) applies game theory to communication in a different way from Parikh, taking rationality and economy considerations to apply to language rather than utterances in context. This approach faces the familiar problem of multiple equilibria. Van Rooij (2004) looks at a situation where there are two different possible meanings, one more salient than the other, or ‘unmarked’ in Horn’s terminology (Horn 1984) and two utterance-types, one less linguistically complex than the other (also ‘unmarked’). Van Rooij, like Parikh, finds two Nash equilibria. In his model these represent possible linguistic conventions. N1, the intuitively correct solution, is the convention corresponding to Horn’s ‘division of pragmatic labour’: unmarked utterances carry unmarked meaning; marked utterances carry marked meaning. The other Nash equilibrium, N2, is the ‘anti-Horn’ case: unmarked utterances carry marked meaning; marked utterances carry unmarked meaning. Van Rooij rejects Pareto-dominance as a secondary criterion for equilibria, so both N1 and N2 are solutions. He suggests that it may be possible to eliminate N2 by showing that only N1 is evolutionarily stable (p. 515).19 Thus the existence of the two equilibria could be reconciled with the non-existence of anti-Horn communities. Perhaps this approach will work. It is worth noting, though, that the Horn generalisation, to the degree that it is true, falls out naturally from the communicative principle of relevance: if a speaker puts a hearer to more trouble than she might have (for example by using a ‘marked’ utterance), then the hearer is entitled to more cognitive effects (in other words, a ‘marked’ interpretation). I think that there are also more general problems with the approach which applies game theory to language rather than to particular utterances. It seems to conflate language with communication, as for example when van Rooij writes that: ‘Speakers obey Horn’s rule because they use a conventional language that, perhaps due to evolutionary forces, is designed to minimize the average effort of speakers and hearers.’ This is in direct opposition to the Chomskyan view that languages are not tools for communication. Famous evidence is the existence of grammatical sentences that are unusable for communication. Further, a great deal of work in psychology suggests that pragmatic abilities are associated with theory of mind, a separate ability from language competence. (See Happ´e 1993 and other references in Wilson and Sperber 2000, fn. 40, p. 275)
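Van Rooij's observation that both N1 and N2 come out as Nash equilibria is easy to reproduce in a toy version of the game. In the Python sketch below the numbers are purely illustrative (a frequent meaning with prior 0.8, a rare one with 0.2, form costs of 0.1 and 0.3), and giving speaker and hearer the same payoff of success minus cost is a simplifying assumption of mine rather than van Rooij's own utility function.

from itertools import product

# Toy Horn game (numbers invented for illustration): two meanings with unequal
# prior probability and two forms with unequal cost; both players are paid for
# successful transmission, minus the cost of the form that is used.
meanings = {"frequent": 0.8, "rare": 0.2}     # prior probabilities (assumed)
forms    = {"cheap": 0.1, "costly": 0.3}      # form costs (assumed)

def payoff(speaker, hearer):
    """Common payoff of a speaker strategy (meaning -> form) and a hearer
    strategy (form -> meaning): expected success minus expected cost."""
    total = 0.0
    for meaning, prob in meanings.items():
        form = speaker[meaning]
        success = 1.0 if hearer[form] == meaning else 0.0
        total += prob * (success - forms[form])
    return total

speaker_strategies = [dict(zip(meanings, combo)) for combo in product(forms, repeat=2)]
hearer_strategies  = [dict(zip(forms, combo)) for combo in product(meanings, repeat=2)]

def is_nash(speaker, hearer):
    """No unilateral deviation by either player improves the common payoff."""
    value = payoff(speaker, hearer)
    return (all(payoff(s, hearer) <= value for s in speaker_strategies) and
            all(payoff(speaker, h) <= value for h in hearer_strategies))

horn      = ({"frequent": "cheap", "rare": "costly"},
             {"cheap": "frequent", "costly": "rare"})
anti_horn = ({"frequent": "costly", "rare": "cheap"},
             {"cheap": "rare", "costly": "frequent"})

for name, (s, h) in [("Horn", horn), ("anti-Horn", anti_horn)]:
    print(name, is_nash(s, h), round(payoff(s, h), 2))
# -> both profiles are Nash equilibria, but on these numbers Horn yields 0.86
#    and anti-Horn only 0.74; hence some further criterion (for instance
#    evolutionary stability) is needed to single out the Horn convention.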
3.2 Doubts about standard game theory's formalisation of rationality
In section 1.2, I mentioned that empirical and theoretical questions have been raised about the way that standard (as opposed to evolutionary) game theory formalizes rationality. Colman (2003, p. 149) provides a summary. Briefly, in some games,20 actual players consistently do better than theoretical agents who maximise utility as standard game theory suggests they rationally should. Examination of the strategy required of a player in the same game ‘raises a suspicion that the CKR [Common Knowledge and Rationality] assumptions may be incoherent’ (ibid.). A (standard) game-theoretic account of human communication would inherit these problems. Until there is greater understanding of these issues, perhaps within psychological game theory (ibid.) it will be unclear whether a standard game-theoretic account of communication successfully links communication with reasonable assumptions about rationality, which, after all, was the main aim of applying standard game theory to the communicative situation. One assumption made by standard game theory, that the players have common knowledge of the game and of the rationality of the other player, may prove particularly problematic for game-theoretic accounts of communication. Sperber and Wilson discuss reasons why common knowledge of context is not a reasonable assumption for a pragmatic theory (1986/95, pp. 15-21). There need not be a direct clash with the position in game theory (pace Sally 2003, p. 1234), but it may be that game-theoretic accounts of communication also need to build in common knowledge of (some) features of the context. Further, some of Sperber and Wilson’s arguments may be effective against common knowledge of the game and of rationality assumptions, in which case any treatment of communication employing standard game theory would be rendered doubtful. Unlike Sally, I see no reason for a presumption in favour of CKR and game theory in this area, certainly not before there is a fairly successful game-theoretic account of communication – whose success might then be taken as corroboration of its assumptions.
4 Conclusions
In this chapter, I have tried to explain and explore Parikh’s interesting gametheoretic model of communication in order to illuminate the possibilities that game theory offers for understanding communication. I have raised empirical and theoretical doubts about this model and, briefly, about van Rooij’s alternative account, and mentioned some questions about the foundations of standard game theory. The criticisms of Parikh’s model I have presented fall into six categories. I have argued that in some cases Parikh’s game-theoretic approach seems to
get the data wrong. Secondly, it relies heavily on prior probabilities, which are numbers that the theory does not generate or constrain. Thirdly, I have tried to show that it lacks predictive power, and that some of the predictions it does generate rely on notions of relevance or appropriateness that have not been fully spelled out, thus assuming part of what a theory of communication should predict. A fourth line of criticism is that Parikh’s account leaves it unclear how communication is achieved, given the huge complexity of the communicative situation in this theory. Most importantly perhaps, I have argued that Parikh’s model abandons an essential Gricean insight, that explicit meaning plus assumptions logically warrant implicatures, leaving implicature-content unconstrained.
Notes 1. See for example Parikh (1991, p. 473) and Sally (2003, p. 1223) for statements of this intuition. 2. Use of the word ‘tries’ is not meant to imply that the processes involved need be conscious or available to introspection, either during processing or subsequently. The same caveat, standard in cognitive science, applies to other verbs I have used in describing the hearer’s task, including ‘choose’, ‘decide’, ‘work out’ and others. 3. Although Parikh thinks that both approaches are applicable. In his talk at GDP 2003 he gave an example of a game-tree for the Horn case considered as a rule at the level of language. 4. Parikh regards all of these cases as cases of ambiguity, not restricting the term, as is more usual in linguistics, to cases where two or more sets of linguistic items or linguistic structures correspond to the same phonetic form. I will use the term ‘ambiguous’ only in its narrow sense, since disambiguation marks a theoretically important category which is in contrast with reference assignment and pragmatic enrichment at least. Note that it makes no difference to Parikh’s account whether or not (1) is structurally ambiguous, corresponding to two representations which differ in quantifier scope, for example, given that it has the two readings given in (2) and (3). 5. Although Parikh presents this sentence as unambiguous, it has the same two readings as (1). It might seem problematic for Parikh that it is often difficult to find unambiguous alternatives, since in his examples successful communication depends on finding unambiguous utterances, but he has shown (at GDP 2003) that ambiguous alternatives will also do. 6. Although not in pure coordination games (thanks to a reviewer for pointing this out). 7. Van Rooij has made some game-theory-internal criticisms of this solution concept (van Rooij 2004, p. 506), saying that use of Pareto dominance is normally motivated by pre-talk communication: ‘cheap talk’ (communication with no costs) before the game. Naturally, pre-talk communication is not an option for Parikh on pain of an infinite regress, since communication is what is to be explained. (See Parikh 2001, p. 36 fn. 2.) It is true that Pareto-dominant Nash equi-
libria are not necessarily the ones that are chosen: in a coordination game with many Nash equilibria, all of which have equal payoffs except for one which is Pareto-dominated by all of the others, that one is a focal point and is likely to be chosen, despite its lower payoff. An example is meeting at UCL in the game in Figure 4.2. Another problem for Pareto-Nash equilibrium as a solution concept might arise from recent research (mentioned by Sally 2003, pp. 1229 f.) showing that players often prefer ‘risk-dominant’ (which are actually risk-avoiding) strategies to Pareto-dominant strategies. 8. My thanks to Prashant Parikh (p.c.) for pointing out to me that the formulation is specific to the example. Of the general case he writes, ‘My view is that all the features of the situation have to be taken into account in determining the initial probabilities - these may have very little or nothing at all to do with the likelihood of the proposition itself.’ (p.c.) 9. Abstracting away from non-alphabetic writing systems, of course. 10. As Wilson (p.c.) points out, ‘Sperber and Wilson’s notion of relevance sheds some light on why those particular senses of ‘letter’, ‘dog’ and ‘cat’ are the most accessible ones. The fact that someone wrote a letter in the ‘letter of the alphabet’ sense would very rarely be relevant enough (achieve enough effects, at low enough effort) to be worth mentioning, so ‘wrote a letter’ will rarely be used in that sense, and this explains its infrequency of use. Similarly, narrowing ‘dog’ to ‘male dog’ would rarely make enough difference to relevance to be worthwhile. By contrast, ‘cat’ in the general sense covers such very different sub-cases that it may make a huge difference to relevance to know which sub-case is involved. So ‘accessibility’ isn’t a magic potion but something that is (a) empirically testable and (b) often theoretically predictable.’ 11. See The Blue Carbuncle, for example: ‘I can see nothing,’ said I, handing [a hat] back to my friend. ‘On the contrary, Watson, you can see everything. You fail, however, to reason from what you see. You are too timid in drawing your inferences.’ (Conan Doyle 1897/1957, p. 125) 12. Thanks to a reviewer for pointing this out. 13. Although the distinction is arguably clearer in ch. 5. (Thanks to a reviewer for pointing this out to me.) 14. I do not have enough space here to show that (14b) is also necessary. 15. I do not want to deny that consideration of alternative utterances often plays a role in hearers’ recovery of meaning (and speakers’ choices of wording). It seems that it would be good for a theory only to take alternative utterances into account where they are easily accessible and to use at most the fact that they are not easily accessible in other cases. This is how relevance theory deals with these cases (e.g. Sperber and Wilson 1986/95, pp. 200-1). 16. This argument is a distant relative of an argument used recently in minimalist syntax. (For example by Tanya Reinhart at ESPP 2003, commenting on Chomsky 2001.) We cannot ignore effort considerations in our account of syntactic competence, according to the argument, because the parser’s performance must match the competence and, on the face of it, the more complicated the representations the competence generates, the more difficult is the job of the parser.
17. One response to this criticism might be that only pragmatics needs to say anything about how communication is achieved. Parikh explicitly denies that his account is a pragmatic theory: ‘I see this whole book as a part of semantics, not pragmatics, because rational agency is part of a properly situated semantics.’ (2001, p. 141) But both pragmatic theories and Parikh’s model are in the business of explaining communication. If a particular pragmatic theory can say what is communicated and how the hearer recovers the intended meaning and Parikh’s model only has an answer to the first question, then ceteris paribus the pragmatic theory is to be preferred. 18. Van Rooij (p.c.) has suggested that this example should be accounted for in terms of the information requirement imposed by the word ‘some’ in the question, following work by Groenendijk and Stokhof (1984), Schulz and van Rooij (to appear) and others. One problem for this type of solution is that a small change in context could lead to the speaker meaning ‘Some but not all of my friends . . . ’ using the same form of words in response to the same question, so it seems that the wording of the question is not the crucial factor here. 19. Similarly, Asher et al (1999) attempt to derive a Gricean maxim of truthfulness using evolutionary game theory. Of course, evolutionary approaches also have the advantage that they do not assume CKR. 20. See Colman’s (2003) discussion of ‘Centipede’.
References Asher, N., I. Sher, and M. Williams (1999). Game-theoretical foundations for gricean constraints. In Proceedings of the Thirteenth Amsterdam Colloquium. ILLC, Amsterdam. Carston, R. (1998). Information, relevance and scalar implicature. In Carston and Uchida, eds., Relevance Theory: Applications and Implications, pp. 179–236. John Benjamins, Amsterdam. Carston, R. (2002). Thoughts and Utterances: The Pragmatics of Explicit Communication. Blackwell, Oxford. Chomsky, N. (2001). Beyond explanatory adequacy. In MIT Occasional Papers in Linguistics, volume 20. MITWPL, Cambridge, MA. Colman, A. M. (2003). Cooperation, psychological game theory, and limitations of rationality in social interaction. In The Behavioral and Brain Sciences, volume 26, pp. 139–15. Doyle, A. C. (1892/1957). The blue carbuncle. In J. Murray, ed., The Adventures of Sherlock Holmes. London. References are to the 1957 edition. Gigerenzer, G., P. Todd, and the ABC Research Group (1999). Simple Heuristics that Make Us Smart. Oxford University Press, Oxford. Green, M. (1995). Quantity, volubility, and some varieties of discourse. Linguistics and Philosophy, 18, 83–112. Grice, H. P. (1975). Logic and conversation. In P. Cole and J. Morgan, eds., Syntax and Semantics of Speech Acts, pp. 41–58. Academic Press, New York. Reprinted in Grice, H. P. (1989) Studies in the Way of Words. Harvard University Press, Cambridge, MA, 22-40. References here are by page numbers in Grice (1975). Groenendijk, J. and M. Stokhof (1984). Studies on the Semantics of Questions and the Pragmatics of Answers. Ph.D. thesis, University of Amsterdam.
Happ´e, F. (1993). Communicative competence and theory of mind in autism: A test of relevance theory. Cognition, 48.2, 101–19. Horn, L. (1984). Towards a new taxonomy of pragmatic inference: Q-based and rbased implicature. In D. Schiffrin, ed., Meaning, Form, and Use in Context: Linguistic Applications, volume GURT84, pp. 11–42. Georgetown University Press, Washington. Lewis, D. (1969). Convention. Harvard University Press, Cambridge, MA. Lewis, D. (1979). Scorekeeping in a language game. In R. B¨auerle et al., eds., Semantics from Different Points of View, pp. 172–87. Springer, Berlin. Also in Journal of Philosophical Logic, 8:339-59. Parikh, P. (1991). Communication and strategic inference. Linguistics and Philosophy, 14, 473–513. Parikh, P. (2000). Communication, meaning and interpretation. Linguistics and Philosophy, 23, 185–212. Parikh, P. (2001). The Use of Language. CSLI Publications, Stanford, California. van Rooij, R. (2003). Being polite is a handicap: Towards a game-theoretical analysis of polite linguistic behaviour. In Proceedings of TARK 9. van Rooij, R. (2004). Signaling games select Horn strategies. Linguistics and Philosophy, 27, 493–527. Sally, D. (2003). Risky speech: behavioral game theory and pragmatics. Journal of Pragmatics, 35, 1223–1245. Schulz, K. and R. van Rooij (to appear). Pragmatic meaning and non-monotonic reasoning: The case of exhaustive interpretation. to appear in Linguistics and Philosophy. Sperber, D. and D. Wilson (1986/95). Relevance: Communication and Cognition. Blackwell, Oxford, 2nd edition. Sperber, D. and D. Wilson (1997). The mapping between the mental and the public lexicon. In UCL Working Papers in Linguistics, volume 9, pp. 107–25. Reprinted in Carruthers, P. and Boucher, J., eds., (1998) Language and Thought: Interdisciplinary Themes. Cambridge University Press, Cambridge, pp. 184-200. Stalnaker, R. (1973). Presuppositions. Journal of Philosophical Logic, 2, 447–57. Wilson, D. and T. Matsui (1998). Recent approaches to bridging: Truth, coherence, relevance. In UCL Working Papers in Linguistics, volume 10, pp. 173–20. Wilson, D. and D. Sperber (2004). Relevance theory. In G. Ward and L. Horn, eds., Handbook of Pragmatics, pp. 607–32. Blackwell, Oxford. References here are to the longer version, Wilson, D. and Sperber, D. (2002) Relevance theory. UCL Working Papers in Linguistics, volume 14, pp. 249-287, also available online at http://www.phon.ucl.ac.uk/home/deirdre/.
5 Different Faces of Risky Speech
Robert van Rooij and Merlijn Sevenster
1 Communication as coordination problem
Suppose two individuals agreed to meet each other tonight at 10.00 o’clock in Amsterdam, but forgot to agree on a place (and don’t have the chance anymore to make an agreement). The two are now facing a coordination problem: only if they make a ‘correlated’ decision, will they land up at the same place and meet each other as desired. Schelling (1960) distinguishes two ways to solve such coordination problems: convention and salience. A coordination problem is solved by convention if the participants were engaged in similar coordination problems before, and have formed the habit, or convention, of solving these problems in a particular way. Both participants see the overwhelming similarity between the previous coordination problems and the current one, and, either out of habit, or because they expect that the other participant will behave similarly as before, they behave similarly as they did in these previous encounters. A coordination problem is solved by salience if the participants do not expect that the problem can, or will, be solved by habit or convention, but have reason to assume that the other participant will behave in a certain way, because one kind of behavior is most ‘obvious’.1 Lewis (1969) had the insight that we can think of successful communication as the solution to the coordination problem of how to transfer information. The problem involves both the speaker and the hearer: the speaker S has to decide which signal to send to transfer the intended information, and the hearer H has to interpret the signal in the way intended by the speaker in order for the communicative act to be successful. As is the case in all coordination problems, expectations are crucial: the speaker’s decision which signal to send will be based on how she expects the hearer H will interpret the signals she chooses to send, and the hearer’s decision will be based on what he expects the speaker could have meant.2 So how can the participants in a conversation have correct expectations about the communicative behavior of their partners? 152
Just as in the coordination games studied by Schelling, the two ways of solving the problem are either by convention, or by salience. For both ways, expectations are crucial. The communication problem is solved by convention if the speaker ‘encodes’ her communicative intention by using a signal which has been used many times before (or at least is composed out of signals used many times before) and which has received the interpretation the speaker now wants to communicate. The speaker uses the symbol on the expectation that the hearer will interpret it in the same way as before, while the hearer interprets it on the expectation that the speaker intended to communicate the same as on previous occasions when she used the signal. Of course, linguistic conventions are much more complicated than this picture suggests, but, essentially, this is the idea. The communication problem is solved by salience provided the conventional meaning (if any) of the signal used by the speaker underspecifies its actual intended interpretation, or in case the speaker wants to implicitly convey (by conversational implicature) something on top, or instead of, what is conventionally communicated by the use of the sentence. For such cases, expectations are even more important: speaker and hearer have to agree on what would be the most obvious interpretation of the signal in this context. The traditional emphasis of linguistics has been on conventional or rulegoverned communicative behavior: syntax and (lexical and compositional) semantics. However, for pragmatics, the theory of language use, it is the concept of salience that is of crucial importance. To a large extent, the notion of salience is a psychological notion that largely has ‘escaped’ game theoretical analysis.3 It crucially involves expectations, and (at least traditional) game theory has nearly nothing to say about how these expectations are formed. However, we can abstract away from the particular expectations that participants of a conversation have, and use game theoretical reasoning to make predictions concerning their expected behavior in certain kinds of situations. That is what we will do in this chapter.
2 Games, expectations, and communication
2.1 Expectations and equilibrium selection
Coordination problems can obviously be thought of in a game theoretical way.4 Suppose Row and Column have to make their respective decisions independently of one another. Row has to decide between performing R1 or R2 and Column has C1 and C2 as his alternative actions. In the simplest coordination games, both Row and Column are equally happy when they coordinate on either ⟨R1, C1⟩ or on ⟨R2, C2⟩. Such a game can be
described in terms of the payoff-table in Table 5.1.

Table 5.1: Game 1

          C1        C2
R1      1, 1      0, 0
R2      0, 0      1, 1

The action pairs which they should co-ordinate on are both Nash-equilibria of the game, but their problem is which one they should coordinate on. Given that they have to decide independently of one another, their chosen action will depend on their expectations about what the other will do. In case the payoffs are equal, as in Game 1, Row, for instance, will choose R1 just in case she expects, or takes it to be more likely, that Column will play C1. For Game 1, the choice of how to perform depends only on the players' expectations about what the other will do. But this is just because here both equilibria have the same payoff (for both players). In general, different equilibria can give rise to different payoffs, and both players will choose by maximizing their expected utilities. These expected utilities involve both payoffs and the probabilities that a player assigns to the different actions that the other player will perform. Suppose that the probability function PR represents Row's expectations about what Column will do, i.e., PR(Ci) represents the probability with which Row thinks Column will perform action Ci. The expected utility for Row of playing R1, EUR(R1), will then be PR(C1) × UR(R1, C1) + PR(C2) × UR(R1, C2). It is easy to see that the expected utility of R1 is higher than the expected utility of R2, EUR(R1) > EUR(R2), just in case Row thinks it is more likely that Column will play C1 than C2, i.e. when PR(C1) > PR(C2). Obviously, something similar holds for Column. Things are a little bit more complicated when the payoffs of the different equilibria are not the same. Consider, for instance, the following coordination problems:
Table 5.2: Game 2

          C1        C2
R1      2, 2      0, 0
R2      0, 0      1, 1

Table 5.3: Game 3

          C1        C2
R1      8, 8      0, 0
R2      0, 0      1, 1
Again, in these games, both ⟨R1, C1⟩ and ⟨R2, C2⟩ are equilibria. However, now both would in principle prefer the former equilibrium to the latter.
But this doesn't give them an automatic incentive to perform their part of equilibrium ⟨R1, C1⟩: what one should do in order to maximize payoff depends also on one's expectations as to what the other will do. On the coordination problem of Game 2, Row should do her part of the coordination equilibrium ⟨R1, C1⟩ only if she thinks (for whatever reason) that the probability that Column will choose C1 is at least 1/3, because only in that case Row's expected utility of playing R1, EUR(R1), tops her expected utility of playing R2. Similarly for the coordination problem of Game 3. Now Row should choose R2 instead of R1 if she takes it to be at least eight times as probable that Column-chooser will choose C2 than that he will choose C1. Although expectations always play a role when several equilibria are possible, the contrast between Games 1, 2, and 3 shows that this role increases if the expected utilities of the different equilibria become more alike. Games 1, 2, and 3 each have two equilibria (in pure strategies), because the expectations that players have of other players' behavior were not supposed to play a role. We saw that these expectations are in fact crucial to predict what will be played, and we will show now that they even influence the equilibria of the game. Consider Game 3 again, and assume that before deliberation Row expects, for some reason, with a probability of 0.7 that Column plays C2 and Column expects with the same probability that Row plays R2. Suppose, moreover, that these probabilities are common knowledge. Then, by taking these expectations into account as well, this gives rise to a new situation, described in Table 5.4, where the payoffs are now the expected utilities:
Table 5.4

           C1           C2
R1     2.4, 2.4     2.4, 0.7
R2     0.7, 2.4     0.7, 0.7
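The arithmetic behind these numbers is easy to verify mechanically. The Python sketch below is only a check on that arithmetic: it transcribes Row's payoffs from Tables 5.2 and 5.3 and derives the 1/3 indifference point of Game 2 and Row's entries in Table 5.4 from the stated prior expectations.

# Row's payoffs, indexed as payoffs[row_action][col_action], transcribed from the tables.
game2_row = {"R1": {"C1": 2, "C2": 0}, "R2": {"C1": 0, "C2": 1}}
game3_row = {"R1": {"C1": 8, "C2": 0}, "R2": {"C1": 0, "C2": 1}}

def expected_utility(payoffs, own_action, prob_C1):
    """Row's expected utility of own_action when Column plays C1 with probability prob_C1."""
    return (prob_C1 * payoffs[own_action]["C1"]
            + (1 - prob_C1) * payoffs[own_action]["C2"])

# Game 2: R1 is worth playing iff P(C1) >= 1/3.
p = 1 / 3
print(expected_utility(game2_row, "R1", p), expected_utility(game2_row, "R2", p))
# -> 0.666...  0.666...  (exactly the indifference point)

# Game 3 with the prior expectations of the text: P(C2) = 0.7, so P(C1) = 0.3.
for action in ("R1", "R2"):
    print(action, expected_utility(game3_row, action, prob_C1=0.3))
# -> R1 2.4, R2 0.7: Row's side of Table 5.4. Column's side is symmetric,
#    since Column assigns probability 0.7 to Row playing R2.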
It is clear that in this situation we end up with play ⟨R1, C1⟩, which is intuitively the correct equilibrium of Game 3. This discussion indicates how important expected utility theory is for game theory. Even if we start out with the strong assumption that the players have common knowledge of each other's initial expectations about one another's strategy choices,5 these expectations need not be their final expectations about these strategy choices. The reason is that these prior expectations don't take into account the reasoning of Column (Row) given that he (she) knows Row's (Column's) prior expectations, i.e., the process of deliberation.6 It is the final expectations (with probability 1, if we disregard mixed strategies) that count to determine
what is the equilibrium that is being played. And this is what we saw in this situation: the initial expectations were overruled in the deliberation process that takes expected utility, and not just prior expectations, into account. So, our discussion highlights the relevance of individualistic Bayesian rationality. The games we looked at so far had two features in common that singled them out as pure coordination games. First, the preference relations between action pairs were the same for both participants. Second, it was important for both Row and Column to coordinate: the payoff of the coordinating action pair was higher for both agents than the payoff of a non-coordinating action pair. In this paper we are interested in games where we give up one or both of these assumptions. First we will discuss (communication) games with two equilibria in which only one of these equilibria is a strict one: to receive the (lower) payoff of the other equilibrium, only one of the players (the speaker) has to perform his part of the equilibrium play. After that, we will discuss (communication) games where both participants of the conversation have an incentive to coordinate on an equilibrium (that is, the game has two strict equilibria), but where the payoffs vary on non-equilibrium action pairs. As we will see, giving up looking only at pure coordination games introduces extra considerations concerning risk.
2.2 Risky versus safe play
In pure coordination games as discussed in the previous section, risk already plays an important role. If one does not know for sure which equilibrium strategy the other participant will play, it is possible that by maximizing one’s expected utility one actually ends up empty handed in a noncoordinating play of the game. Because all non-equilibrium outcomes have the same zero-payoff for both participants, one strategy might be called more risky than another just in case its expected utility is lower. In the games we are going to discuss in this section, however, some strategies can be called more risky than others because of differences in the payoffs of non-equilibrium-plays of the game. In the previous section we assumed that any non-equilibrium play of the game gives both participants a payoff of 0. In an appealing article, Sally (2003) observes, however, that in games that model communication (or many other types of situations), this assumption is wrong: some nonequilibria can be worse than others for one or both participants. If a speaker deliberates whether she should encode the information she wants to communicate in a funny, indirect way or not, for instance, Sally notes that she has to take into account that unsuccessful communication resulting from her being (or trying to be) funny is probably worse than unsuccessful com-
munication without her being indirect. Consequently, Sally (2003) calls, for instance, ironical indirect speech risky.7 We think this is a very useful way of looking at communicative behavior. In contrast to Sally (2003), however, we will distinguish different types of games where the notion of 'risk' is involved. In doing so, we claim that Sally's notion of 'risky speech' is perhaps more widely applicable than suggested by Sally's own discussion. Let us first discuss a game like the following:
Table 5.5: Game 5

             C1            C2
R1       1 + ε, 1      1 − ε′, 0
R2         1, 1           1, 1
Although this type of game has two equilibria, ⟨R1, C1⟩ and ⟨R2, C2⟩, it is not really one of coordination. The reason is that now the equilibrium ⟨R2, C2⟩ is not a strict one: it doesn't matter what Column plays if Row plays R2. On the other hand, Row would benefit from the combination ⟨R1, C1⟩ (if ε, ε′ > 0). This is not only a strict equilibrium, but it is payoff-dominant as well. We call an equilibrium 'payoff-dominant' if and only if there is no other equilibrium in the game that yields a strictly higher payoff for at least one player. In case Row has no idea whether Column will play strategy C1 or C2, we assume that Row takes both strategies to be equally likely.8 In that case, the expected utility of playing R1 is higher/equal/lower than the expected utility of playing R2 if and only if ε > / = / < ε′. For this reason, we will say that Row is risk-loving iff ε > ε′, he is risk-neutral iff ε = ε′, and he is risk-averse iff ε < ε′. Assuming by default that ε > ε′, we will denote strategy R1 by Risky and strategy R2 by Safe. The other kind of game we are interested in is one in which both players can choose between risky and safe strategies. This type of game also has two equilibria, but now it is important for both players to coordinate. This game differs from Games 2 and 3 discussed in section 2.1 in that both equilibria have something distinctive in their favor. Consider Rousseau's (1755) famous Reindeer hunt game as described by Lewis (1969) and extensively discussed by Skyrms (2004); a simple two-player symmetric game with two strict equilibria: both hunting Reindeer, ⟨R, R⟩, or both hunting Squirrel, ⟨S, S⟩. Note that we slightly changed the story of Rousseau's game, but the contention of the game is intact.9 The first equilibrium gives the highest payoff to both, i.e., is payoff-dominant (or Pareto optimal), because it gives to both a utility of, let us say, 6, while
the second equilibrium yields only one of 4. However, assume that if one hunts Reindeer but the other Squirrel, the payoff is (4,0) in ‘favor’ of the Squirrel-hunter. In that case, the payoff-dominated equilibrium where both are hunting Squirrel still has something in its favor: if one player is equally likely to play either strategy, the expected utility of hunting Squirrel for the other is optimal. Table 5.6:
Game 6: Reindeer hunt

           R         S                           Risky              Safe
    R    6, 6      0, 4            Risky     1 + ε, 1 + ε        −ε′, 0
    S    4, 0      4, 4            Safe      0, −ε′              1, 1
The more abstract right-hand example also has two (strict) Nash equilibria (if both ε and ε′ are higher than 0): both playing Risky, or both playing Safe. It is obvious that the equilibrium ⟨Risky, Risky⟩ is payoff-dominant. Following Harsanyi and Selten (1988), we will say that a Nash equilibrium ⟨a*, b*⟩ is risk-dominant iff for all Nash equilibria ⟨a, b⟩ of the game,

    (URow(a*, b*) − URow(a, b*)) × (UCol(a*, b*) − UCol(a*, b)) ≥ (URow(a, b) − URow(a*, b)) × (UCol(a, b) − UCol(a, b*)).

In the above example ⟨Safe, Safe⟩ is risk-dominant exactly if ε′ ≥ ε. For this reason, we will call a player risk-loving iff ε > ε′, risk-neutral iff ε = ε′, and risk-averse iff ε < ε′. In contrast to the concept of payoff-dominance, the concept of risk-dominance is based on individual rationality. Think of the numerical version of the Reindeer hunt game, and suppose it is common knowledge that each player's prior expectation, before deliberation, that the other will play R is 0.5. In that case the game gives rise to the following 'expected utility' table: Table 5.7:
Expected utilities: reindeer hunt

           R         S
    R    3, 3      3, 4
    S    4, 3      4, 4
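The following sketch (ours, not from the chapter) makes both computations concrete: it checks the Harsanyi and Selten product condition for the numerical Reindeer hunt and recomputes the expected utilities of Table 5.7 under fifty-fifty expectations:

    def risk_dominates(game, eq1, eq2):
        """game maps (row_action, col_action) to (u_row, u_col); returns True if
        the Nash equilibrium eq1 = (a*, b*) risk-dominates eq2 = (a, b)."""
        (a_star, b_star), (a, b) = eq1, eq2
        left = ((game[(a_star, b_star)][0] - game[(a, b_star)][0]) *
                (game[(a_star, b_star)][1] - game[(a_star, b)][1]))
        right = ((game[(a, b)][0] - game[(a_star, b)][0]) *
                 (game[(a, b)][1] - game[(a, b_star)][1]))
        return left >= right

    # The numerical Reindeer hunt of Table 5.6.
    hunt = {('R', 'R'): (6, 6), ('R', 'S'): (0, 4),
            ('S', 'R'): (4, 0), ('S', 'S'): (4, 4)}
    print(risk_dominates(hunt, ('S', 'S'), ('R', 'R')))   # True: <S,S> risk-dominates <R,R>

    # Expected utilities when the other player is expected to play R and S
    # with probability 0.5 each (this reproduces Table 5.7).
    for action in ('R', 'S'):
        eu = 0.5 * hunt[(action, 'R')][0] + 0.5 * hunt[(action, 'S')][0]
        print(action, eu)   # R -> 3.0, S -> 4.0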
Obviously, in this situation the Nash equilibrium hS, Si will be played in which both are hunting Squirrel. The preference of each player for playing her part of the risk-dominant equilibrium in the Reindeer hunt game is closely related to the preference
for playing Safe (or R2) in Game 5 (if ε > ε′). Suppose that a player doesn't know what the other player will do. In that case the player should choose the strategy that has the highest expected utility. Suppose that, in the absence of indications to the contrary, a player takes both actions of the other player to be equally likely. One can show that in that case the action which has the highest expected utility in the Reindeer hunt game (or any other symmetric 2 × 2 game) is the strategy which is risk-dominant. And this will be the case for the Safe/Risky strategy if and only if the player is risk-averse/risk-loving. In the rest of this chapter we will suggest that some decisions speakers and hearers have to make when they have to coordinate their communicative behavior by salience can be modeled by the decisions that have to be made by the players in the games discussed in this subsection. First, we will discuss an example that we suggest can be modeled analogously to (an incomplete information variant of) Game 5: implicit communication where the conventional meaning of the expression underspecifies what the speaker actually wants to communicate. Then we will discuss some examples that can best be modeled as 'impure' coordination games with rankable equilibria and varying off-diagonal payoffs, such as the Reindeer hunt. Following Sally (2003), we will suggest that this game models the communicative decisions involved in cases in which the meaning intended to be communicated can 'overturn' the conventional meaning. Parikh's (2001) game-theoretical analysis of miscommunication in terms of 'metagames' is closely related. Finally we will discuss an example that is somewhere in between Games 5 and 6. Before we come to these modelings, however, we will first briefly introduce 'signaling games'. These signaling games will be extended in the following sections.
3 Games of communication
3.1 Standard signaling games
Lewis (1969) defined the notion of a signaling game in order to explain the conventionalization of meaning of language without assuming any preexisting relation between messages and meanings. A signaling game is a cooperative game amongst two players: a sender S and a receiver R, whose shared goal it is to let R perform an action that is appropriate with respect to the state S and R are in. This state is only observed by S though, and S communicates it by means of a meaningless message. Think of a couple of agents, the one looking out for hungry predators and the other searching for food on the ground. Both players are in the same state – there is a predator approaching or there is not –, but only S
knows which state they are in. S signals the state by means of a message that has no pre-defined meaning – say, 'buh' or 'bah'. In turn, R hears the message and is free to perform any action of her liking, but every state has a most appropriate action. For example, if there is a hungry predator approaching, R should flee; otherwise R should keep on searching. The best thing for S to do is to say 'buh' if there is a hunting predator and 'bah' otherwise, and for R to flee when he hears 'buh' and not to in case he hears 'bah'. (It is of course equally good to do the same, but with 'buh' and 'bah' interchanged.) If S and R play the game in such an optimal way, the meanings of the messages 'buh' and 'bah' are created in the play of the game. As we will see below, playing in a way that makes the messages meaningful amounts to coordinating on a Pareto-optimal Nash equilibrium. For future reference, we formally define signaling games. Let T be the set of states, M the set of messages and A the set of actions, such that |T| = |A| ≤ |M|. Let f : T → A be the bijective function that assigns the appropriate action f(t) ∈ A to every type t ∈ T. Then S (R) plays the signaling game following strategy s (r), that is, a function from T to M (from M to A). In cheap talk signaling games, successful communication of state t (thus in case f(t) = r(s(t))) is rewarded with 1, whereas unsuccessful communication (thus in case f(t) ≠ r(s(t))) is rewarded with 0, independent of the state t and the message s(t):

    uS(f, t, s, r) = uR(f, t, s, r) = 1, if f(t) = r(s(t));  0, if f(t) ≠ r(s(t)).    (5.1)

We assume that Nature picks the state according to some probability distribution P over T.10 The utility function for S and R is the expected utility relative to the probability distribution P over T:
    US(s, r) = UR(s, r) = Σ_{t∈T} P(t) × uS(f, t, s, r).    (5.2)
Finally, we define a cheap talk signaling game G as a tuple h{S , R}, P, {S, R}, {uS , uR }i, where P is a probability distribution over T , S is the set of strategies s : T → M for player S ; R is the set of strategies r : M → A for player R; and {uS , uR } contains both players’ utility functions. G is called ‘cheap talk’ because uS and uR , simultaneously defined in (5.1) are defined this way. As a small example, consider the signaling game with only two states t1 , t2 , two messages m1 , m2 and two actions a1 , a2 , where f (ti ) = ai for i ∈ {1, 2}. Obviously, both players have four (pure) strategies each. Furthermore, let x = P (t1 ) > P (t2 ) = y. Then, we have the payoff matrix below.
Table 5.8
            t1      t2
    s1      m1      m1
    s2      m1      m2
    s3      m2      m1
    s4      m2      m2

            m1      m2
    r1      a1      a1
    r2      a1      a2
    r3      a2      a1
    r4      a2      a2

            r1        r2        r3        r4
    s1     x, x      x, x      y, y      y, y
    s2     x, x      1, 1      0, 0      y, y
    s3     x, x      0, 0      1, 1      y, y
    s4     x, x      y, y      x, x      y, y
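To see where the entries of Table 5.8 come from, here is a minimal sketch (ours; the concrete values chosen for x and y are only illustrative) that recomputes the payoff matrix from the strategy tables and lists the pure Nash equilibria mentioned in the next paragraph:

    from itertools import product

    x, y = 0.6, 0.4                       # x = P(t1) > P(t2) = y, illustrative values
    P = {'t1': x, 't2': y}
    f = {'t1': 'a1', 't2': 'a2'}          # the appropriate action for each state

    senders = {'s1': {'t1': 'm1', 't2': 'm1'}, 's2': {'t1': 'm1', 't2': 'm2'},
               's3': {'t1': 'm2', 't2': 'm1'}, 's4': {'t1': 'm2', 't2': 'm2'}}
    receivers = {'r1': {'m1': 'a1', 'm2': 'a1'}, 'r2': {'m1': 'a1', 'm2': 'a2'},
                 'r3': {'m1': 'a2', 'm2': 'a1'}, 'r4': {'m1': 'a2', 'm2': 'a2'}}

    def EU(s, r):
        # expected utility of the strategy pair, as in (5.1) and (5.2)
        return sum(P[t] for t in P if receivers[r][senders[s][t]] == f[t])

    matrix = {(s, r): EU(s, r) for s, r in product(senders, receivers)}
    nash = [(s, r) for s, r in matrix
            if matrix[(s, r)] >= max(matrix[(s2, r)] for s2 in senders)
            and matrix[(s, r)] >= max(matrix[(s, r2)] for r2 in receivers)]
    print(sorted(nash))   # [('s1','r1'), ('s2','r2'), ('s3','r3'), ('s4','r1')]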
The resulting signaling game has four Nash equilibria: ⟨s1, r1⟩, ⟨s2, r2⟩, ⟨s3, r3⟩ and ⟨s4, r1⟩. As the reader can check, only in ⟨s2, r2⟩ and ⟨s3, r3⟩ does communication take place: these are precisely the payoff-dominant equilibria. Lewis calls such equilibria 'signaling systems'. Technically, ⟨s, r⟩ is a signaling system iff f(t) = r(s(t)), for every t ∈ T. A necessary condition for ⟨s, r⟩ to be a signaling system is that both s and r are injective or one-to-one functions.

3.2 Super conventional signaling games
Signaling games are used by Lewis (1969) to explain literal, or conventional, meaning in terms of the game-theoretic notion of 'stability'. This doesn't mean, however, that we cannot motivate the use of unconventional message-meaning combinations by making use of game-theoretical equilibrium as well. In this paper we study the risk of using non-conventional, non-explicit, or non-literal speech as a means of communication. In order to do so we introduce in this section signaling games where there already exists a convention of explicit literal meaning. We will call the resulting signaling games Super Conventional, and sometimes write 'SC signaling games'. The intuition underlying SC signaling games is that S and R play a signaling game enjoying common knowledge of the fact that some strategy pair ⟨s, r⟩ is the conventional signaling system. We denote the conventional sender and receiver strategy by means of cs and cr, respectively. It is this pair of strategies that models the literal or explicit meanings. While playing an SC signaling game, S and R have agreed on the conventional meaning of the messages in M′ = {m ∈ M | there exists a t ∈ T such that cs(t) = m}, which is the set that contains the messages that convey the to-be-communicated types. Since cs is an injective function, |M′| = |T|. We assume that only the messages in M′ can have a non-literal meaning.
In non-explicit or non-literal utterances, the sentence leaves the actual interpretation underspecified, or, if taken literally, means something different than what was intended by the speaker. In formal terms, although S is of type t she uses a message m 6= cs(t), and S wants and expects R not to perform cr (m), if that exists at all, but f (t). We will model the extra gain, in case of successful non-conventional communication by a parameter ε ≥ 0, whereas we punish the player (possibly both) who deviated from its conventional strategy with the parameter-value ε0 ≥ 0, in case of unsuccessful communication. This brings us to the main definition of this section. A Super Conventional signaling game Ghcs,cr i is a standard signaling game G equipped with a convention: h{S , R}, P, {S, R}, {uS , uR }, hcs, cr ii, where uS and uR are the players’ utility function, defined below. P is the probability distribution over the set of types. For the utility functions of speaker and hearer in the signaling games discussed in the previous subsection it was taken to be irrelevant which message was being used; it only mattered whether communication was successful or not. In our superconventional signaling games we will assume that at least for the speaker, but perhaps also for the hearer, it is important which message is used for communication. In particular, it is taken to be advantageous for the speaker (and perhaps for the hearer) to communicate successfully with a non-explicit message, or with a message that should receive a non-literal interpretation. In the following two sections we are going to discuss examples where the speaker’s utility function should not be defined as in (5.1) but rather as follows
    uS^cs(f, t, s, r) =   1 + ε,   if f(t) = r(s(t)) and s(t) ≠ cs(t)
                          1,       if f(t) = r(s(t)) and s(t) = cs(t)
                          0,       if f(t) ≠ r(s(t)) and s(t) = cs(t)
                          −ε′,     if f(t) ≠ r(s(t)) and s(t) ≠ cs(t)        (5.3)
Intuitively, uScs hard-wires that the speaker is moderately rewarded or punished if she sticks to the conventional sender strategy (i.e. s(t) = cs(t)). Of course, the utility-function still shows that successful communication (i.e. f (t) = r (s(t))) is better rewarded than unsuccessful communication (i.e. f (t) 6= r (s(t))). As for the hearer’s utility function, we will discuss two special cases: in section 4, where we discuss the risk of non-explicit communication, we will take the hearer’s utility function to be the same as in (5.1):
    uR(f, t, s, r) =   1,   if f(t) = r(s(t))
                       0,   if f(t) ≠ r(s(t))                                (5.4)
This means that only the speaker has to decide whether to play risky or not. The hearer just has to assign the correct meaning to the given message. In section 5, however, we will discuss linguistic phenomena where also the hearer can play either risky or safe. In that section we will assume that the hearer’s utility function is given by the following function:
    uR^cr(f, t, s, r) =   1 + ε,   if f(t) = r(s(t)) and r(s(t)) ≠ cr(s(t))
                          1,       if f(t) = r(s(t)) and r(s(t)) = cr(s(t))
                          0,       if f(t) ≠ r(s(t)) and r(s(t)) = cr(s(t))
                          −ε′,     if f(t) ≠ r(s(t)) and r(s(t)) ≠ cr(s(t))   (5.5)
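Read as code, the utility functions (5.3)–(5.5) look as follows (a sketch of ours; f, s, r, cs and cr are represented as dictionaries mapping states and messages as in the text):

    def u_S_sc(f, t, s, r, cs, eps, eps_prime):
        """Speaker utility in a Super Conventional signaling game, as in (5.3)."""
        success = (f[t] == r[s[t]])
        conventional = (s[t] == cs[t])
        if success:
            return 1 + eps if not conventional else 1
        return 0 if conventional else -eps_prime

    def u_R_cheap(f, t, s, r):
        """Hearer utility as in (5.4): 1 for success, 0 otherwise."""
        return 1 if f[t] == r[s[t]] else 0

    def u_R_sc(f, t, s, r, cr, eps, eps_prime):
        """Hearer utility as in (5.5), mirroring (5.3) on the hearer's side."""
        success = (f[t] == r[s[t]])
        conventional = (r[s[t]] == cr[s[t]])
        if success:
            return 1 + eps if not conventional else 1
        return 0 if conventional else -eps_prime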
In section 6 we will discuss an example where the players' utility functions are even more involved than in (5.4) and (5.5).
4 Risk of implicit communication
Let us assume a very simple signaling game, where we have two kinds of meanings, t1 and t2 , and expressions, m1 and m2 that conventionally denote t1 and t2 , respectively, in a context-independent way. Let us now assume that, in addition, we have an expression mu that is lighter than either of m1 and m2 and that has an underspecified meaning: it can mean both t1 or t2 . Formally, let us stipulate that using mu instead of m1 or m2 yields a bonus of ε > 0. It is easy to see (e.g. Parikh 2001, van Rooij 2004) that if it is common knowledge that the hearer, R, takes t1 to be more probable (salient) than t2 , P(t1 ) > P(t2 ), and m1 is more costly than mu , C(m1 ) > C(mu ), the ‘coding’ strategy that uses mu to denote t1 (and m2 to denote t2 ) is the most efficient, i.e., payoff-dominant, ‘coding’ strategy to denote t1 and t2 . In particular, it is more efficient than the coding strategy that uses m1 to denote t1 . However, when the relative probabilities of t1 and t2 are not shared between speaker S and hearer R and the latter has to guess (by tossing a coin) which meaning the speaker takes to be more salient, or probable, using a light message with an underspecified meaning is not going to have a positive payoff.11 In the simple case above where the message with the underspecified meaning can have only two specific denotations, the benefit of communicating with a light expression must be very high in order to overcome the risk of miscommunication. We are going to discuss a case like that of Game 5 repeated below.
Table 5.9: Game 5

                 C1              C2
    Risky     1 + ε, 1        1 − ε′, 0
    Safe      1, 1            1, 1
For the case at hand, thus where t1 is the case, we assume that the Safe strategy is to send the correct explicit message in the relevant state, while the Risky strategy is to use the light message with the underspecified meaning. C1 and C2 are the strategies that always interpret the explicit messages in the expected way, and interpret mu as t1 and as t2 , respectively. Thus, (i) successful communication by context-independent expressions m1 and m2 – i.e., playing the Safe strategy – is 1 for both agents; (ii) unsuccessful communication has a payoff of 0, i.e., we assume that ε0 = 1; and (iii) the benefit of successful communication with the light underspecified expression mu instead of the conventional explicit expression m1 is ε, which is a higher value than 0.12 The hearer interprets the speaker’s message in the only appropriate way in case the message has a context-independent completely specified meaning. What the hearer does if he receives the underspecified message mu depends on his beliefs: he interprets mu as t1 if he takes t1 to be more likely, PR (t1 ) > PR (t2 ), and he interprets mu as t2 if he takes t2 to be more likely, PR (t1 ) < PR (t2 ). Thus, the hearer has a choice between two strategies C1 and C2 that reflect that any conventional message mc is attached to its conventional meaning tc and that Ci attaches ti to mu , for i ∈ {1, 2}. The speaker’s payoffs of these two strategies in the different situations are given by following tables: Table 5.10
(situation t1)

                   C1        C2
    Implicit     1 + ε       0
    Explicit     1           1

Table 5.11 (situation t2)

                   C1        C2
    Implicit     0         1 + ε
    Explicit     1           1
The speaker doesn’t know how the hearer will interpret the underspecified message mu because she does not know whether the hearer will take t1 or t2 to be more likely. We have seen above already that if the speaker takes C1 and C2 to be equally likely, i.e., if PS (C1 ) = PS (C2 ), the benefit of using the underspecified message has to be at least 1, ε ≥ 1. But what if PS (C1 ) 6= PS (C2 )? Let us assume that the speaker believes with
probability n that PR(t1) > PR(t2) (and thus with probability 1 − n that PR(t1) ≤ PR(t2)). It is easy to see that the speaker takes implicit communication to be worthwhile in situation t1 if and only if n × (1 + ε) > 1. That is, for the expected utility of being implicit to be higher than the expected utility of being explicit, it has to be the case that ε > (1 − n)/n. The equality ε = (1 − n)/n can be plotted as in Figure 5.1.
Figure 5.1: A depiction of the states in which the expected utility of being implicit equals the expected utility of being explicit. That is, ε = (1 − n)/n.
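A quick numerical check of this threshold (our own arithmetic, matching the values discussed in the next paragraph):

    # Required benefit eps = (1 - n) / n for implicit communication to break even.
    for n in (0.25, 1/3, 0.5, 0.75):
        print(round(n, 2), (1 - n) / n)   # 0.25 -> 3.0, 0.33 -> 2.0, 0.5 -> 1.0, 0.75 -> 0.33...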
Obviously, if n is very close to 0 the use of mu will be a bad choice, but also for other choices of n, it probably won't pay to be implicit: if n is 1/3 or 1/4, for instance, the value of ε has to be 2, or 3, respectively, which seems to be much too high. Being explicit is a safe strategy. It is optimal under the maximin strategy and the minimax strategy. Things are more complicated when expected utility is at issue, for now it also depends on the relative weight of n and ε. But the main, and perhaps obvious, conclusion of these considerations, whether expected utility plays a role or not, is always that it is safer to be explicit if you don't know (for sure) what your conversational participant takes to be the most salient situation of T.13
5 Risk-dominance versus payoff-dominance
In this section we study the use of non-literal speech as a means of communication. Following standard practice in philosophy of language, we distinguish between what a sentence means and what a speaker means by uttering this sentence. If the sentence meaning gives rise to a fully specified interpretation, the two will coincide in standard situations, i.e., when the speaker uses the sentence in the conventional way. In specific circumstances, however, it can be that even if the sentence has a fully specified meaning, the speaker means something quite different with her use of the sentence than the literal meaning of the sentence, or something in addition to it. This is the case when the sentence has, besides a literal meaning, also a non-literal interpretation. What we have in mind is, for instance, the use of indirect speech (acts), irony (such as over- and understatement), and metaphor. In the hearer's process of attaching a non-literal interpretation to the utterance of a sentence, the hearer first has to recognize the defectiveness of the utterance's literal meaning. Following some suggestions of Sally (2003), we will argue that this type of speech can be successfully modeled as what we called a Reindeer hunt game, where both successful communication by the non-literal and the literal use of one's language are equilibrium plays of the game, but where the former is payoff-dominant, whereas the latter is risk-dominant.
Sally (2003) argues that "people play the language game in a way that is consistent with their play in all games." Sally does so by fixing rules of thumb14 that describe people's behavior while playing coordination games in a lab setting. For instance, Sally considers the rules:
• (A) In a game with one outcome risk-dominant and another "modestly" payoff-dominant, the former is more likely to be chosen.
• (B) As sympathy between the players increases, a payoff-dominant, risk-dominated equilibrium is more likely to be realized.15
Concerning the status of these rules Sally says that "these empirical findings are clearly not hard and fast rules of coordination game play, but rather tendencies manifest in normal play." And this is exactly the way we will treat them below. To make these findings relevant for pragmatics, Sally introduces a "more complete coordination game of communication" that does not take states/meanings as primary objects but rather speech acts, as proposed by Austin (1962) and Searle (1969). Accordingly, the payoff functions range over pairs of speech acts. Sally does not define them formally, however.
In contrast, we have strictly defined our Super Conventional signaling games in section 3. Recall that Super Conventional signaling games were defined (with Lewis) in terms of states and actions, and furthermore, that rewarded payoff not only depends on the success of communication but also on whether or not the conventional strategies were respected. This conventional meaning was supposed to be a commonly known parameter of the Super Conventional game. As such, our model is not only more precisely defined than Sally’s but also requires less complicated notions. In particular, the only notion required besides the ones presupposed by Lewis is the notion of conventional meaning, which itself can be considered the result of a Lewisean game. We will see that this limitation does not limit Sally’s claim on language use resembling game playing. Sally applies his game-theoretical rules of thumb to his signaling game to make predictions as to how people use language. He argues that the payoffdominant equilibria are the signaling systems that communicate non-literally, whereas the risk-dominant equilibria communicate according to the convention. Then, for instance, rule (A) would predict that people speak literally by default; and rule (B) would predict that as sympathy between the players increases, people are more likely to communicate non-literally. In Sevenster (2004) it is proved that • If hs, r i is a signaling system in Ghcs,cr i , then hs, r i is a Nash equilibrium in Ghcs,cr i . This result corresponds to Lewis’s result and also establishes that in this model, signaling systems are the first-class citizens in the sense of being Nash equilibria. Furthermore, characterizations of the payoff-dominant and risk-dominant equilibria are given: • hs, r i is a payoff-dominant Nash equilibrium in Ghcs,cr i iff hs, r i is a signaling system and for every t ∈ T it is the case that s(t) 6= cs(t). • If ε0 > ε, then hs, r i risk-dominates all other signaling systems in Ghcs,cr i iff s = cs and r = cr . Taken strictly these formal characterizations do not teach us anything. In line with Sally, however, we can make a sketchy account as to how they should be interpreted. In particular, these characterizations enable us to apply rules (A) and (B) to our Super Conventional signaling games, and thereby make the same predictions as Sally. To get a better understanding of these results let us consider the case of indirect requests and spell out the implications for the parameters ε, and ε0 .16
As to indirect requests, think of a room containing a hearer having control over the open window and a speaker who is cold. The speaker wants the hearer to close the window and has two ways to communicate this. Either he uses the conventional message, such as “Could you close the window” or he makes an indirect request, such as “It’s cold in here”. The hearer on the other side has also two options: to interpret the message figuratively or literally. This simple game has two equilibria: Table 5.12:
Game 7

                                      Figurative           Literal
    It's cold in here              1 + ε, 1 + ε          −ε′, 0
    Could you close the window?    0, −ε′                1, 1
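Applying the risk-dominance condition given earlier to Game 7 confirms this (a worked check of ours): ⟨Literal, Literal⟩ risk-dominates ⟨Figurative, Figurative⟩ exactly when ε′ ≥ ε, since

    (1 − (−ε′)) × (1 − (−ε′)) ≥ ((1 + ε) − 0) × ((1 + ε) − 0)   ⟺   (1 + ε′)² ≥ (1 + ε)²   ⟺   ε′ ≥ ε.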
That the correctly communicated "It's cold in here" is more rewarding for both (1 + ε vs. 1) can be explained in terms of politeness: the speaker did not have to command the hearer and the hearer is not commanded. That ε′ > ε means for the speaker that the benefit of being indirect is lower than the cost of being misunderstood. In case of misunderstanding by the use of a short message, the speaker would have to make a direct request in order to accomplish her goal. Misunderstanding for the speaker is less bad if she is being literal, because she does not have to take the blame – I said so! On the other hand, misunderstanding for the hearer is less bad if he interprets literally – why didn't you say so? Communicating literally is thus safe: the sentence meaning provides a face-saving excuse in the event of miscoordination.
6 How to interpret answers exhaustively
Until now we have discussed situations where it was possible to decompose the utility function by answering two questions which were taken to be independent: (i) was the intended content successfully communicated? and (ii) did the agent use the conventional safe strategy or not? In the final substantial section of this paper we discuss a somewhat more complicated case: we will give up the assumption that successful communication is a yes-or-no matter. In particular, we are going to discuss an example where it is intuitively the case that if the speaker adopts a risky strategy and the hearer a safe strategy there will be some useful transfer of communication, although this information transmission is not perfect. Thus, we will make a distinction in utilities when the speaker is adopting a risky strategy between the case where the hearer adopts an incorrect risky strategy, and a safe strategy.
In the previous sections we assumed that the speaker could choose to play risky or safe, and that the hearer will just interpret the underspecified message either correctly or incorrectly. Now we are going to look at a situation where also the hearer can interpret an underspecified message either in a risky way or in a safe way, and where the safe interpretation of an underspecified message is better than the incorrect risky interpretation, but worse than a correct risky interpretation. In abstract, these kind of situations give rise to a game like the following: Table 5.13:
Game 8

    t1            C1           C2             C3
    Risky      1 + ε, 1       0, 0      1 − ε′, 1 − ε′
    Safe       1, 1           1, 1      1, 1
where ε′ < 1. In the matrix of Game 8, C1 denotes the risky strategy that is correct in this situation; C2 the risky strategy that is incorrect in this situation; while C3 stands for the safe strategy. Before we analyze this kind of situation, let us first convince ourselves of its existence by looking at the interpretation of answers.17 Consider the following dialogue:
(1) a. Bob: Who passed the examination?
    b. Ann: John and Mary.
What can Bob conclude from Ann's answer, besides the fact that John and Mary passed the examination? It seems only natural to assume that Ann mentioned all individuals she knows passed the examination. By making this assumption, Bob concludes that it is not the case that Ann knows that Sue, for instance, passed the examination. This seems a very reasonable inference to make.18 In many circumstances, however, Bob concludes something more from Ann's answer than just (i) the semantic meaning of the answer, and (ii) that it is not the case that Ann knows that Sue, for instance, passed the examination: Bob also concludes that Sue did not pass the examination. This extra inference comes about via Bob's extra assumption that Ann knows exactly who in fact passed the examination, i.e., the assumption that Ann is competent on the extension of the question-predicate.19 From the fact that Ann did not say that Sue passed the examination; the assumption that she mentioned all individuals she knows passed the examination; and the extra assumption that she knows who passed, Bob concludes that Ann knows that Sue did not pass. Due to the fact that knowledge entails truth, Bob concludes that Sue did, in fact, not pass.
We see that, due to an assumption of competence, Bob can strengthen what he can infer from Ann's answer by taking Ann to obey the principle to mention all individuals she knows to satisfy the question-predicate. There is, however, also another way to strengthen this inference: it can be that Bob assumes that Ann is not competent on the extension of the question-predicate. In that case, Bob can strengthen his inference that it is not the case that Ann knows that Sue passed the examination to the inference that it is not the case that Ann knows whether Sue passed the examination. Notice that this is indeed a strengthening, because the lack of knowledge that Sue passed is compatible with the knowledge that Sue did not pass, but this is not the case for the lack of knowledge whether Sue passed. The above discussion shows that even if Bob assumes by Gricean reasoning that Ann mentioned all the individuals of which she knows that they passed the examination, this still leaves open three interpretations: one where Bob cannot infer any more than this; one where Bob can conclude that Sue did not pass; and one where Bob concludes that Ann doesn't know whether Sue passed. The latter two inferences are due to assumptions of competence and incompetence, respectively. Of course, Bob's (pragmatic) interpretation of Ann's answer by making these assumptions is risky: his assumptions could turn out to be false, and, consequently, his additional inferences as well. But for Ann to give an answer like (1b) without explicitly mentioning what more, if at all, she knows about the extension of the question-predicate is risky as well: Bob might adopt the wrong assumption concerning Ann's competence about the question-predicate and might interpret the answer in a different way than was intended. If Ann wants to be sure that Bob will understand the answer correctly, she has to play it safe and be very explicit about her knowledge. We can think of the dialogue as a game between Ann and Bob where both either play risky or safe. Notice that this kind of game is not really a coordination game, because it seems natural to assume that in case Ann plays it safe and is completely explicit about her knowledge, Bob will always interpret the answer in the correct way. That is why this game, too, can most naturally be thought of as a game with alignment of preferences. Assume that the dialogue takes place in a situation where Ann is in fact competent on the extension of the question-predicate. Then it gives rise to the payoff table on the left-hand side below. On the right-hand side, we represent the expected values on the assumption that n denotes the probability that Ann is competent. Suppose first that Bob, the column player, always plays risky. In that case we are back to our discussion in section 4, and it pays for Ann to play risky as well iff ε > (1 − n)/n, where n denotes the percentage by which Bob makes the correct prediction of competence. Thus, in case Bob is known to
Table 5.14: Game 9

             Comp     Incomp    Unknown                     Risky            Safe
    Risky    1 + ε      0       1 − ε′         Risky    n × (1 + ε)        1 − ε′
    Safe       1        1         1            Safe          1               1
normally only ask questions of somebody who he thinks is competent (a not unreasonable procedure), the benefit of the short, but risky, answer for the answerer Ann doesn't have to be very large to still be worthwhile. Now assume that Ann cannot assume that Bob always plays risky. Instead, Bob interprets answers only in 50 percent of the cases by making the assumption that the speaker was competent or incompetent. Denote the percentage by which Bob makes the correct assumption of competence, given that Ann is either competent or incompetent, to be n. In that case, it pays for speaker Ann to be risky if and only if n(1 + ε) ≥ 1 + ε′.20 If we now assume that (for some reason) the benefit of successful implicit communication is twice as high as when the hearer interprets things in a safe way, ε = 2ε′, the equality n(1 + ε) = 1 + ε′ can be plotted as in Figure 5.2.
Figure 5.2: A depiction of the function n = (1 + ε′)/(1 + 2ε′). It reflects the states in which the expected utility of playing risky, n(1 + ε), equals the expected utility of playing safe, 1 + ε′, where ε = 2ε′.
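The condition behind this curve can be checked directly (a minimal sketch of ours; the derivation follows note 20):

    # Ann's expected utility of playing risky when Bob plays risky or safe with
    # probability 1/2 each: (1/2) * n * (1 + eps) + (1/2) * (1 - eps_prime).
    # This exceeds the safe payoff 1 exactly when n * (1 + eps) >= 1 + eps_prime.
    def break_even_n(eps_prime):
        eps = 2 * eps_prime              # the assumption eps = 2 * eps_prime from the text
        return (1 + eps_prime) / (1 + eps)

    for ep in (0.25, 0.5, 1.0):
        print(ep, round(break_even_n(ep), 3))   # 0.833, 0.75, 0.667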
It should not be surprising that, indeed, for Ann to play risky if she is not sure whether Bob interprets in a risky way or not, she has to be even surer that Bob makes the correct assumption of competence than if Bob is known to play risky.
7 Conclusion
The starting point of our paper is the insight that initial expectations that players have of each other’s choice of action are important to solve a game with several equilibria. This important role of expectations – though crucial for Lewis’s (1969) analysis of convention – only recently was given an interesting twist in game-theoretical analyses of conversation in the work of Sally (2003). He discusses how the notion of ‘risk’ might be important in conversational situations between speakers and hearers. In this paper we try to go beyond this work (i) by clarifying the connection of Sally’s work with Lewisian signaling games, and (ii) by looking at some additional ways in which speech can be risky.
Notes 1. Of course, convention or precedence gives rise to salience, and can be thought of as a special case. With Clark (1996) and others, however, we will here assume the intuitive distinction between solving a game by convention and by salience. 2. Of course, these (first-order) expectations about what the other will do are based on the (higher-order) expectations of both participants of the conversation of the other’s expectation about one’s own behavior, and so on. In this paper we won’t go through the way the expectations about the other’s behavior are formed, but just stick to the first-order expectations. 3. But see Asher and Williams (2005) for an interesting exception. 4. Though it was the insight of Schelling (1960) that the analysis of how to solve such problems is more complicated, and thus more interesting, than previously assumed. 5. Harsanyi and Selten (1988) discuss a technique in which what they call an ‘objective prior’ can be defined solely based on the structure of the game. But it is disputable whether players really use this technique to determine these prior expectations about what the other(s) will do. For instance, it is unclear why precedence and other notions are taken to be irrelevant. 6. For some analyses of deliberation in games, see Harsanyi and Selten (1988) tracing procedure, and Skyrms (1990) for various methods of rational deliberation. 7. Parikh (2001) compares direct and indirect speech as well in his analysis of miscommunication. If speaker and hearer make contrasting assumptions about the style used by the other conversational participant, miscommunication follows, because they modeled the conversational situation as different games which have different outcomes. Parikh proposes that in the case that style is involved, interlocutors first (should) play a metagame concerning the style (or the use of language), and only then one of interpretation. One might think of Sally (2003) and section 5 of this paper as an analysis of such a metagame. 8. We realize that this is an unnatural assumption for Game 5, given that C1 weakly dominates C2 . It is discussed here only for illustrative purposes. 9. The game is normally called a ‘Stag hunt’, and hunting Stag is normally contrasted with hunting Rabbit. 10. We assume that P (t) > 0, for every t ∈ T .
11. In case PR is not commonly known, the situation cannot really be described as a (signaling) game. Indeed, the standard equilibrium reasoning is not appropriate anymore. 12. Notice that S prefers to play Risky, if she takes the strategies C1 and C2 of R to be equally likely, iff ε > 1 (given that ε′ = 1). 13. We have analyzed this situation with respect to a particular situation. We could also have analyzed the more general situation, where the strategies implicit and explicit stand for the strategies to use, for all situations ti, the messages mu and mi respectively, and where the actions of the hearer still depend crucially on the expectations. It is easy to see, however, that this would not result in a different analysis. 14. Sally calls them "Wittgensteinean signposts". 15. This rule is the result of empirical research. We think that this rule also has a theoretical counterpart. As we saw above, different expectations about the opponent yield different Nash equilibria. By modeling 'sympathy' as having expectations that the other player behaves such as to maximize his actual utility (leaving out considerations of expected utility for the moment), we can theoretically enforce risk-dominated equilibria. That is, the more sympathetic the players are towards each other, the more they will be tempted to play riskily. We believe that this might have interesting consequences for the analysis of language change, but will not speculate about this here. 16. The same story can be told for ironical statements, i.e. understatements like "I wasn't overimpressed by her speech." 17. This is just one of many examples that behave in this way. We believe, however, that it is an example that is easier to explain than many of its alternatives. 18. For a general formalization of this kind of reasoning, see van Rooij and Schulz (2004). 19. Again, see van Rooij and Schulz (2004) for one way to make this kind of reasoning precise. 20. This can be shown by the following calculation: [n(1 + ε)]/2 + [(1/2)(1 − ε′)] ≥ 1 iff n(1 + ε) ≥ 1 + ε′.
References

Asher, N. and M. Williams (2005). Pragmatic reasoning, defaults and discourse structure. In this volume.
Austin, J. L. (1962). How To Do Things With Words. Oxford University Press, Oxford.
Clark, H. H. (1996). Using Language. Cambridge University Press, Cambridge, UK.
Harsanyi, J. C. and R. Selten (1988). A General Theory of Equilibrium Selection in Games. MIT Press, Cambridge, MA.
Lewis, D. (1969). Convention: A Philosophical Study. Harvard University Press, Cambridge, MA.
Parikh, P. (2001). The Use of Language. CSLI Publications, Stanford.
van Rooij, R. (2004). Evolution of conventional meaning and conversational principles. Synthese (Knowledge, Rationality & Action), 139, 331–66.
van Rooij, R. and K. Schulz (2004). Exhaustive interpretation of complex sentences. Journal of Logic, Language, and Information, 13, 491–519.
Rousseau, J. J. (1755). Discours sur l'origine et les fondements de l'inégalité parmi les hommes. Marc Michel Rey, Amsterdam.
Sally, D. (2003). Risky speech: behavioral game theory and pragmatics. Journal of Pragmatics, 35, 1223–45.
Schelling, T. (1960). The Strategy of Conflict. Harvard University Press, Cambridge, MA.
Searle, J. R. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press, Cambridge, UK.
Sevenster, M. (2004). Signaling games and non-literal meaning. In P. Égré et al., eds., Proceedings of the ESSLLI 2004 Student Session.
Skyrms, B. (1990). The Dynamics of Rational Deliberation. Harvard University Press, Cambridge, MA.
Skyrms, B. (2004). The Stag Hunt and the Evolution of Social Structure. Cambridge University Press, Cambridge, MA.
6 Pragmatic Reasoning, Defaults and Discourse Structure
Nicholas Asher and Madison Williams
1 Introduction
In this chapter we investigate the rational basis of pragmatic reasoning. For specificity, we’ll use a particular theory of discourse interpretation that combines an account of rhetorical structure with dynamic semantics. In particular we argue for a rich, linguistic notion of discourse context which we compute by means of simple defeasible rules in a nonmonotonic propositional logic. These defaults are linguistic in nature but have access to nonlinguistic domains of world knowledge that are encoded in a more complex nonmonotonic logic. Our approach takes as basic the idea that compositional and lexical semantics produce an underspecified logical form and that there’s a level of linguistic pragmatic reasoning that resolves underspecifications where possible, thus producing a more complete logical form for interpretation. We argue here that this assumption is reasonable from the standpoint of a general theory of rationality and we investigate particular justifications for the various pragmatic principles that the theory adopts, using a variety of game theoretic techniques.
2 The problem
The problem we want to concentrate on has to do with the very nature of pragmatics. Many pragmatic phenomena—e.g., the preferred resolution of anaphoric expressions or implicatures—depend on many factors that can't easily be analyzed in a modular fashion. For example, if we consider a standard implicature like some implicates not all, it is well known that various contextual factors defeat this inference—e.g., the rhetorical function of the existential clause, background knowledge or other explicitly given information in the discourse—and this is typical of many pragmatic phenomena. This contrasts with phenomena analyzed by compositional and (some varieties of) lexical semantics; for instance, the meaning of a quantifier like most is something that can be stated independently of the context and of the meanings of its arguments.
The analysis of pragmatic content and reasoning must use laws that are true proximally and for the most part, which we think are suitably formalized in some form of nonmonotonic logic—viz. as default rules in default logic (e.g., Reiter 1980) or as conditionals that only defeasibly support modus ponens in a modal, nonmonotonic logic (e.g., Asher and Morreau 1991). Drawing interesting consequences from such axioms in nonmonotonic logic makes use of a notion of nonmonotonic consequence, according to which, roughly, conclusions follow from premises in all the "normal cases." However that notion is ultimately to be analyzed (and it's immaterial for our purposes here how it is spelled out), nonmonotonic consequence makes the justification of individual axioms of a pragmatic theory not an easy matter. For many axioms in a nonmonotonic theory typically interact together to define what are the relevant "normal cases" for any defeasible generalization in the theory. This is especially true in a system like that of Asher and Morreau (1991) in which the consequence relation is sensitive to the logical specificity of the antecedents of conditional defaults when these defaults conflict. By this we mean that if we have a theory T in which we have two axioms (read A > C as 'if A then normally C'):
• A > C
• (A ∧ B) > D
and {C ∧ D} ∪ T is classically inconsistent, then:
• T, A ∧ B |≈ D
In such a system a normal case for the first axiom must be one in which we have A ∧ ¬B. That is, the second axiom tells us in fact what the normal cases are for the first axiom. We can imagine a theory with a set of conditional axioms with ever more specific antecedents, thus making the definition of what a normal case is for any one of the axioms quite complex and dependent on the way the theory analyzes other contextual parameters besides those actually stated in the default. So the empirical adequacy of one axiom may not be calculable in isolation from any of the others. Such pragmatic rules have at best a very indirect connection with the data they are supposed to secure. Were such default theories merely instrumental and the individual generalizations therein played no explanatory role except to generate certain implications, in our case components of discourse interpretations, then it might not matter which defaults we chose as long as the whole theory behaved according to expectations. But many sources of information contribute to pragmatic reasoning—including general world knowledge. So if
all of pragmatic reasoning is really just instrumental, then we don’t have any means of isolating principles of linguistic pragmatics from the big Quinean interconnected set of beliefs about the world. Some researchers have adopted this view for pragmatics (Hobbs et al. 1993), but we think that picture is wrong and that there is a separate and interesting theory of linguistic pragmatics. So we think there ought to be a rational basis for the defaults in that theory that speakers assume. Without such a basis, it would also be difficult to understand how a linguistic community would come to adopt the pragmatic principles they do, as they’re not explicitly taught or even part of our conscious awareness when we interpret discourse. With some pragmatic reasoning principles, we can use standard techniques in game theory to show that these principles constitute a game theoretic optimal equilibrium, conditional upon certain assumptions (Asher et al. 2002). But in general we’ll argue here that traditional game theoretic techniques can’t help us; the problem of converging on a common set of defaults for pragmatic reasoning translates into a game theoretic problem of incomplete information, for which standard techniques don’t work. While some people like Skyrms (1996) or van Rooij (2004) have used standard game theoretic techniques to justify general signalling conventions or particular pragmatic principles, we don’t think that such justifications can be given in general for many rules of pragmatic reasoning. Almost all such justifications are dependent upon certain descriptions of the initial problem. Rather than a cause for despair, though, we think this descriptive sensitivity can serve to our advantage. In particular this suggests that our pragmatic defaults are quite finely attuned linguistically; further, investigations into focal points by, e.g., Bacharach and Bernasconi (1997) can help us understand in a more precise way coordination in the absence of optimization. We’ll proceed by first laying out a background for a particular theory of the pragmatic/semantic interface, SDRT, and then go into details about the nature of the pragmatic rules in that theory. Though our discussion will be specific to SDRT we think many of the points we make about pragmatic reasoning should carry over to other frameworks. We’ll then introduce the problem of justifying pragmatic rules game theoretically in a traditional way. Finally, we’ll introduce focal point theory and examine how this helps us justify particular default rules of SDRT.
3 Segmented Discourse Representation Theory background
The general problem we posed in the previous section is very abstract and needs to be made more concrete. Just what sort of pragmatic reasoning do we have in mind? This section provides an answer.
Over a decade of research has made it quite plausible that a theory of discourse interpretation that combines an account of rhetorical structure with dynamic semantics has linguistically and philosophically interesting contributions to make concerning various contributions discourse structure makes to discourse content. Particular areas of application for one such theory, SDRT, have included the spatiotemporal structure of texts, pronominal anaphora, VP ellipsis, presupposition, lexical disambiguation, plural quantification (e.g., Asher and Wang 2003), modal subordination, speech acts and implicatures (for a discussion of many of these issues and a comprehensive bibliography see Asher and Lascarides 2003). By integrating dynamic semantics’ attention to semantic detail and the syntax/semantics interface with a broader AI vision of what discourse interpretation should try to accomplish, SDRT has contributed to the analysis of various semantic and pragmatic phenomena. Fundamental to this approach is a view of the pragmatics/semantics interface that other “pragmaticists” also adopt: pragmatics supplements semantic content in discourse interpretation. SDRT implements this idea by having lexical and compositional semantics provide an underspecified logical form (ULF) within which certain variables require pragmatic information to fill them in so as to produce a fully specified logical form. The way to understand pragmatics from this viewpoint is this: ULFs describe sets of completely specified logical forms; pragmatic reasoning then determines a set of preferred fully specified logical forms, and the interpretation of these preferred logical forms yields the linguistic content a discourse conveys to a competent interpreter.1 Another element on which SDRT and many other approaches to the pragmatic/semantic interface agree is that the pragmatic supplementation of semantic meaning exploits a notion of context. There are many different notions of context: Kaplanian contexts, Stalnakerian contexts, dynamic semantic contexts. Kaplan’s notion of context is perhaps the simplest in that it only fixes values for deictic and indexical expressions. It can be understood as a fixed, background assignment to a particular set of variables or discourse referents. More complex are dynamic notions of context, according to which the context changes as discourse proceeds. Stalnakerian contexts and dynamic semantic contexts are examples of contexts of this kind. Dynamic semantic contexts are typically formalized as assignment functions to discourse referents that various linguistic devices can reset or extend; each new sentence in a discourse then defines a relation between an input and an output assignment. The interpretation of logical operators and quantifiers yield constraints as to which values of assignments pass on and which remain within a “local” environment, and this in turn provides semantic constraints on what values anaphoric expressions can pick up.
While dynamic semantic contexts have proven to have many uses, by themselves they cannot do justice to the sort of phenomena that have concerned SDRT. For example, purely compositional and lexical semantic constraints simply don’t suffice in many languages to fix the temporal structure of discourse (Lascarides and Asher 1993, Asher and Bras 1994, Wu 2003). In addition, dynamic semantic contexts are not equipped to account for the resolution of ambiguities of rhetorical connection. On the other hand, SDRT has such resolution made a central concept in the determination of many aspects of linguistic content. SDRT uses a much more complex representation of context. Each clause yields a labelled ULF that must attach to an available attachment point in the discourse context. A discourse context is represented as a graph consisting of labels of previously processed clauses linked by different rhetorical relations. The choice of rhetorical relation determines which of the labels in the graph are available attachment points for new information. Once an attachment point is chosen, we have to compute the relation by which it attaches, if we can. The update of the contextually given graph with the new information yields a new graph. These graphs, known as SDRSs, translate naturally into formulae that have dynamic semantic interpretations, but the main point of SDRSs is that they constrain and help us calculate rhetorical connections and those in turn help determine many aspects of the preferred interpretation of discourses. In effect, SDRT refines the clause linking transition ’;’ (dynamic conjunction) of dynamic predicate logic into a variety of different transitions, each of which is a different discourse relation. Examples of these relations are: Explanation, Elaboration, Narration, Commentary, Contrast, Parallel, Question-AnswerPair (QAP). Different types of rules combine to yield inferences concerning rhetorical connections. Sometimes lexical semantics determines the matter, e.g. when a discourse particle (Knott 1995) is present.2 Consider the following minimal pair: (1) a. John fell. And then Mary pushed him. b. John fell. Mary pushed him. While we prefer to interpret the second clause in the absence of any discourse connector as providing an explanation for why John fell, the presence of the phrase and then forces us to connect the two clauses in a way that we have a narrative sequence of two events. This difference in rhetorical connection also affects the temporal interpretation of the two clauses. SDRT takes the position that this reasoning is a matter of linguistic mastery; it’s part and parcel of constructing a logical form for a discourse. The
way SDRT is set up, this reasoning has the task of filling in metalinguistic variables in logical form. The question is, what are the axioms or postulates that such reasoning makes use of? SDRT uses several distinct types of rules or axioms for computing discourse structure. First there are defeasible rules for computing discourse relations that look like (2), where α, β and λ are metavariables over SDRSlabels: (2) (?(α, β, λ) ∧ Info(α, β, λ)) > R(α, β, λ) In words, if β is to be attached to α with a rhetorical relation as yet unspecified (hence marked with a ’?’) and the result is labelled λ, and information Info(α, β, λ) about α, β and λ, that represents (the appropriate) information from sources like the lexicon, domain knowledge and cognitive states holds, then normally, the rhetorical connection is R. These rules allow us to reason defeasibly about inferring a particular discourse relation given a suitable attachment point. These are the main rules that we will concentrate on here.3 The rules that SDRT has developed enjoy some empirical support as they capture intended rhetorical relations in many examples in a precise way. But our question here is whether they have any deeper and more particular justification. Asher and Lascarides (2003) show that some of the instances of (2) derive from other defeasible rules linking a speaker’s intentions and beliefs with his utterances, that form a simplified cognitive model of the speaker used to infer Gricean implicatures (Asher 1999, Asher and Lascarides 2003) and other pragmatic information. And at least some of these principles of cognitive models in turn find a basis in a broader, game theoretic view of rationality (Asher et al. 2002). For our other rules, our argument is more involved. First, we argue that the general character of the system of defaults for computing rhetorical relations is preferable on rational grounds. Second, we investigate the grounds for adopting particular defaults. A striking feature of our rules for rhetorical relation and attachment site computation is their simplicity. Their simplicity plays a big factor in their optimality. They are used solely to fill in underspecified elements of the ULF, and finally are expressed within a quantifier free language whose nonmonotonic consequence relation is decidable (see Asher and Lascarides 2003 for details). Further, the logic for computing discourse structure has only restricted access to the information within other information sources and their associated logics; for example, it has access only to descriptions of formulae in the logic of information content, but not to what those formulae entail (in the dynamic logic where logical forms for discourse are interpreted). The same goes for other information sources like world knowledge or the lexicon. This also means that our axioms will be simpler than what one might expect if one was looking for a formalization of general world knowledge.
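To give the flavour of how such specificity-sensitive defaults interact, here is a toy sketch (ours, not SDRT's actual machinery): defaults of the shape in (2) are keyed to cue information about the clauses, and the default with the logically most specific applicable antecedent decides the relation, as in the treatment of example (1) above:

    def infer_relation(info):
        """info is a set of cues about the two clauses to be attached."""
        defaults = [                                    # (antecedent cues, inferred relation)
            ({'past_tense', 'and_then'}, 'Narration'),  # more specific: overt temporal connective
            ({'past_tense'}, 'Explanation'),            # less specific default
        ]
        applicable = [d for d in defaults if d[0] <= info]
        # the default with the most specific (here: largest) applicable antecedent wins
        return max(applicable, key=lambda d: len(d[0]))[1] if applicable else 'unspecified'

    print(infer_relation({'past_tense'}))               # Explanation  (cf. (1b))
    print(infer_relation({'past_tense', 'and_then'}))   # Narration    (cf. (1a))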
The simple and independent character of the rules for specifying logical forms is important for a variety of reasons. We think that reasoning about logical form, and about what a particular discourse conveys, should be a relatively simple matter, as opposed to evaluating what was conveyed, which may be very complex indeed. We’re after what every competent interpreter of the language can glean as the basic message of a text, a message that can then be evaluated, integrated into the interpreter’s beliefs, into a detailed cognitive model of the speaker, or even various forms of literary interpretation. We think this must be the case if linguistic signalling is to have an optimal value. In fact formal theories of signalling in game theory presuppose that signals should be transparent in the sense that the receiver must eventually be able to recover the message she is supposed to recover (the one that is intended or that is evolutionarily selected for). All known biological signalling systems other than language are designed this way, and language as a product of biological evolution should be no different. Another way to think about the simplicity requirement is decision theoretically. Suppose we have two axiomatic systems, one of which, A1 uses a logic of worst case complexity that is higher than the other A2 (as in the case of weighted first order abduction vs. a propositional nonmonotonic logic). And suppose further that compositional and lexical semantics determine, as SDRT claims they do, a ULF with underspecified elements to be resolved by either A1 or A2 . Then at least as far as resolving these underspecified elements is concerned, if we can reliably estimate the probability that using A1 will have higher costs than using A2 given contextual information C—i.e. assign a value to P (c(A1 ) > c(A2 )|C) such that P (c(A1 ) > c(A2 )|C) > .5, then any payoff function that is monotonic with respect to costs will lead us to choose A2 as a more optimal than A1 . Moreover, we think this assumption about payoffs is completely reasonable when informativeness is not an issue. And informativeness is not an issue here since we have assumed that both A1 and A2 are performing the same task. To handle accuracy, we can mimic our earlier results in Asher et al. (2002) by setting the accuracy required to that of the procedure A2 . Of course A1 may have other uses such as providing results of general commonsense reasoning. But there is no need to use a powerful tool for a simple task, and rationality dictates a simple logic for a simple task, provided the probability of the accuracy of the procedure is high enough (though what it will be will depend on the exact details of the payoff function). The simplicity argument takes us still further in a way that Mill foresaw in his defense of rule utilitarianism over act utilitarianism. One might think of using decision theory itself to infer the rhetorical connections and attachment points, as well as resolving other underspecified elements in
logical form by assigning probabilities as well as assigning a payoff to each possible resolution. We don't think there's anything wrong with this, but once again this seems more complicated than it need be. Our defaults for calculating rhetorical connections are simple rules of thumb that abstract away from the calculation of optimal choice. They sometimes go wrong in that they lead us to choose a rhetorical connection that isn't the one the speaker has in mind, as in:

(3) a. John went to jail.
    b. He embezzled the company funds.
    b'. because he embezzled the company funds.

(4) a. John went to jail.
    b. He embezzled the company funds.
    c. But that's not why he went to jail.
    d. Rather, he was convicted of tax fraud.

As we see from (3), it is appropriate to draw the inference that (3b) explains (3a). This is not the correct discourse relation for (4b), as we find out from (4c). But the fact that this relation is inferred and then revised explains the appropriateness of but in (4).

So far we've argued that our simple rules have a firmer basis than more complex but equivalent rules. But we're a long way from justifying the particular default rules we've adopted. We've given general arguments that establish the relative superiority of a system of simple, reasonably accurate defaults. But we can imagine lots of equally simple systems of defaults, perhaps some simpler, perhaps some more accurate. The idea is to coordinate on a particular set of defaults in a game with indeterminately many default theories (each understood as a strategy). The basic problem with game theoretic accounts of coordination and linguistic convention is that standard game theory simply tells us that in a coordination game, any coordination is a good one: all the points on the diagonal of a game in which the common alternative strategies define the rows and columns are Nash equilibria, and standard game theory tells us nothing about which of these points are to be selected. While standard game theoretic techniques tell us why it's important to coordinate, they do not tell us which coordination point to select. The experimental evidence, however, and work since the pioneering books of Schelling (1960) and Lewis
(1969) have shown that coordination is far from random in many coordination games. Agents coordinate on what Schelling calls "focal points," equilibria that are particularly salient. The data are striking and suggest that we need to supplement standard game theoretic techniques if we are to explain how rational agents behave in such situations.

This has a special importance when it comes to the linguistic conventions that we adopt. Standard game theory has nothing to say about why we adopt the conventions we do. That conventions are arbitrary might apply to the basic semantic rules. For instance, it's hard to see any rational basis for choosing the word 'red' to refer to the property of being red. And the wealth of different words for the same concepts across the world's languages suggests that arbitrariness for basic linguistic meaning is right. However, things are otherwise with pragmatic defaults. For one thing, the rhetorical relations of Narration, Elaboration, Explanation, Background, Result, Commentary, Parallel and Contrast are relations that appear to be relevant to discourse interpretation in many languages, as studies in RST and SDRT have shown. No doubt we could refine these discourse relations into a more finely individuated set, but they do appear to be universal features of all discourses. Further, the defaults that enable us to infer these relations also appear to work in many languages besides English, though there is of course some variation because of the interpretive work that other elements in a language perform. For instance, in English the defaults for Explanation work when the past tenses of the main verbs in both constituents are the same, whatever the past tense, whereas this is not true for French. In French, Explanation cannot link two clauses in the passé simple, though it can if the tense in both main verbs is the passé composé. This uniformity deserves an explanation.

So what we want to do is to consider why a particular default or set of defaults for a particular relation, say Explanation, might become a convention, something that speakers of a language will coordinate upon. To do this we need to take a detour into a theory that enables us to pick between equilibria. This theory is known as Dynamic Variable Frame theory.
4 Dynamic Variable Frame Theory
Bacharach's 1993 Variable Frame Theory formalizes the notion of salience by reorganizing games, forming equivalence classes of the actions available in the original game. The equivalence classes induce a partition, and the choice set is broadened to selecting a partition cell. This reorganization often favors one cell over others. These preferences represent focal points. Intuitively, the idea is that the actions can be described in various ways. These descriptions induce a partition, and players choose an action based on salience under the description.
In this paper we will present a dynamic extension of Variable Frame Theory. This is important because we wish to capture the notion that players’ beliefs are often sufficient to decide which partition will be selected. This is to say that salience, common expectations about the play of the other players, is enough to favor one cell over another, even if there is no payoff advantage to that particular outcome. Dynamic Variable Frame Theory (DVFT) considers the effects of repeated play on focal points. In particular, this allows a player to learn what the other players notice. Learning what is noticed allows players to update their beliefs about salient features and to play accordingly. In each repetition, players transform the initial game into a VFT game. This allows players to transform the game differently in each round according to their often changing beliefs about what the other player considers salient. Consequently, we give each game iteration three stages: an initial stage in which player types are fixed by nature, an analysis stage in which players update their beliefs about other players in light of new information and recast the game according to their perception of its salient features, and a playing stage in which players play this updated game. The first move by nature assigns each player a set of descriptions of the choices, called concepts, that are noticed initially. In the first analysis stage players use the concepts given by nature to anticipate the other player’s frame using the availability of the concept with some (possibly empty) noticer bias. A noticer bias weights initial beliefs in such a way that player 1 believes that player 2 is likely to notice quickly what player 1 has noticed quickly. The analysis stage amounts to players transforming the coordination game into a variable frame game. Also players can form an expectation of the results of the playing phase based on beliefs about the frames being used, using the expected utility formula. In the first playing stage players use a decision rule to select an option in the variable frame game, then they receive a payoff. These three stages amount to a variable frame game that is repeated to allow players to learn what concepts are employed by the other. This allows them to coordinate on a heuristic that can be used in all future encounters between the two. Here is a simple example of a variable frame game with two players, 1 and 2, where VFT produces interesting results. The players are presented with four objects, one square and the other three triangular. There are 4 actions: choose object 1 (the square), choose object 2, choose object 3, and choose object 4 (objects 2, 3, and 4 being the triangles). Players get a payoff of 1 each if they choose the same object, 0 otherwise. This yields the following game:
Table 6.1:

         1      2      3      4
    1   1,1    0,0    0,0    0,0
    2   0,0    1,1    0,0    0,0
    3   0,0    0,0    1,1    0,0
    4   0,0    0,0    0,0    1,1
The equilibria are coordinations along the diagonal (and certain mixtures thereof) where no pure strategy is preferable to any other (in this discussion we are only interested in pure equilibria). However, if we use VFT with a frame that takes shape into account, the players now have an option set of randomizing over all shapes s (the trivial partition), randomizing over the triangles, t, and picking the square, q. This yields the game:
Table 6.2:

          s            t            q
    s   0.25,0.25    0.25,0.25    0.25,0.25
    t   0.25,0.25    0.33,0.33    0,0
    q   0.25,0.25    0,0          1,1
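The frame-level payoffs in Table 6.2 can be recovered mechanically from the base game in Table 6.1. The following sketch is our own illustration, not part of the original text; it treats the choice of a frame cell as uniform randomization over the objects in that cell, using the cell labels s, t and q from the text.

```python
# Derive the payoffs of Table 6.2 from the base coordination game of Table 6.1.
from itertools import product

objects = [1, 2, 3, 4]                                     # object 1 is the square, 2-4 are triangles
cells = {"s": [1, 2, 3, 4], "t": [2, 3, 4], "q": [1]}      # shape frame: all objects, triangles, square

def payoff(cell_row, cell_col):
    """Expected payoff when the row player randomizes uniformly over cell_row
    and the column player over cell_col, with payoff 1 for matching objects."""
    row, col = cells[cell_row], cells[cell_col]
    hits = sum(1 for i, j in product(row, col) if i == j)
    return hits / (len(row) * len(col))

for r in cells:
    print(r, {c: round(payoff(r, c), 2) for c in cells})
# s {'s': 0.25, 't': 0.25, 'q': 0.25}
# t {'s': 0.25, 't': 0.33, 'q': 0.0}
# q {'s': 0.25, 't': 0.0, 'q': 1.0}
```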
This game has the Pareto-optimal coordination point of both players selecting the square. On the other hand, randomizing over all the shapes is the solution recommended by standard game theory, since it can't distinguish between any of the coordination points in the original game. This is the least practical solution for dealing with these situations, and does not describe either the way actual humans behave or the way they think they ought to behave in these situations.

These stages are then repeated. The nth repetition starts with a move of nature where nature can alter the concept set of any player. This allows for the possibility that a player might first notice a concept that applies to the coordination game in later stages. The availability of a concept determines the probability that it will be newly assigned to a player in a given repetition. No player's set of concepts can lose elements in a move of nature (players will not forget that concepts apply), although often concepts will be assigned smaller and smaller probability weights.
In the repeated action phases, players adjust their beliefs about the concepts available to the other players. If the payoff for the previous round meets expectations, their hypotheses about the concepts employed by other players will be confirmed. If not, the subjective probability that the other player has this frame will drop. Probabilities will be updated in a typical Bayesian hypothesis testing way. Also any new concepts that were added in this round must be incorporated. Any beliefs about new strategies will be updated according to the other player’s past play. If what is known is inconsistent with this new frame, the new frame will start out judged less likely than otherwise. If a new concept is consistent with past play, it will be given greater weight. This means that if a coordination has already been established, adding a new concept will only matter if it refines the coordination in use. If no coordination has been reached, players can avail themselves of the new concepts. The information from this analysis is then used to play an updated game. A dynamic refinement of VFT allows us to offer a solution in situations with competing concepts that are equally plausible. Consider the following example from Janssen (2001). There are five objects differentiated only by their color. Two are red, one is blue, one is green, and one is yellow. Players get a utility of one if they coordinate on an object, nothing otherwise. As before, in the standard game theoretic account, there are five equally good pure strategy equilibria, with nothing to distinguish them. However, if we use a frame that distinguishes colors we find the following game:
Table 6.3:

          c          r          b          y          g
    c   1/5,1/5    1/5,1/5    1/5,1/5    1/5,1/5    1/5,1/5
    r   1/5,1/5    1/2,1/2    0,0        0,0        0,0
    b   1/5,1/5    0,0        1,1        0,0        0,0
    y   1/5,1/5    0,0        0,0        1,1        0,0
    g   1/5,1/5    0,0        0,0        0,0        1,1
Notice that in this case there are three equilibria ((g,g), (b,b), and (y,y)) that are indistinguishable. Janssen's solution is to offer a refinement that lumps together all "payoff symmetric" solutions. Since the three solutions are indistinguishable, they are treated as equally likely to be used. This gives the result that (r,r), which offers a 50-50 shot at coordination, is preferable. Note that this is equivalent to using the subframe descriptions red and non-red.
Bacharach and Bernasconi (1997) expected this result, but it is not in line with the empirical results they found. While a strategy coordinating on red makes sense in a one-shot game, we think it misses the fundamental role focal points play, which is to aid the discovery of conventions that are useful over the long term, using information available from experience. Our argument that focal points are essentially a dynamic tool comes mainly from the realization that the discovery of useful descriptions is not performed on the spot, but is rather the result of background world knowledge. Humans have a considerable amount of experience in interacting with others. Focal points provide a means of translating this experience into a useful tool for facilitating future interactions. Interactions of the sort described in the games played here add to this background knowledge, increasing its future usefulness. An attempt to model this phenomenon in a static setting is terribly misguided. Any participants in such a game would bring a great deal of experience and knowledge about what other players are likely to notice and what they are likely to do.

The problem, then, with the sort of theory described by Bacharach and Bernasconi, and by Janssen, is their expectation that play will conform to the best play for the short term (picking red). However, in this case short-term and long-term viability of the solution diverge. As the repetitions rise above 5 in this example, the expected utility of picking green, blue or yellow would be better than coordinating on red from the beginning. This is because staying with the red-picking equilibrium would never be better than coordinating half the time. While it would take longer to find a coordination on another color, once it was found, players would be able to coordinate all the time by keeping this color. Furthermore, with positive results confirming that the other player is picking red, one would not predict that she could profitably deviate from this coordination. Thus, once established, we would expect the red coordination to continue, and so, in the long term, a red-picking strategy would be worse.

Another problem discovered in empirical settings is that, although experimental efforts explicitly attempt to correct for this, human subjects have a large number of possible frames. Even when deliberately attempting to limit frames to particular attributes in presenting the game to players, Bacharach and Bernasconi found a part played by "oddities", nuisance concepts that entered the self-reported frames of the participants. The use of focal points in normal interaction would, of course, leave this possibility wide open. There would be too many possible frames to consider for focal points to be a useful tool if they were simply a static concept, to be discarded when play finished rather than remembered for use in future cooperation.
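To put numbers to the long-run argument above, here is a rough calculation with our own figures and search rule (not Bacharach and Bernasconi's): always picking red coordinates with probability 1/2 per round, since there are two red objects, while randomizing over the three uniquely coloured objects until the players first coordinate, and staying with that colour thereafter, eventually coordinates every round. The exact crossover point depends on the search rule assumed, but any strategy that eventually locks onto a unique colour overtakes permanent red-picking.

```python
def red_forever(n):
    # Both players keep picking a red object: coordination probability 1/2 every round.
    return 0.5 * n

def search_then_stay(n, p_hit=1/3):
    # Both players pick uniformly among the three uniquely coloured objects until they
    # first coordinate, then stay; P(coordinated by round t) = 1 - (1 - p_hit)**t.
    return sum(1 - (1 - p_hit) ** t for t in range(1, n + 1))

for n in (1, 2, 5, 10, 20):
    print(n, red_forever(n), round(search_then_stay(n), 2))
# 1 0.5 0.33 / 2 1.0 0.89 / 5 2.5 3.26 / 10 5.0 8.03 / 20 10.0 18.0
```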
DVFT is useful, then, as a means of justifying particular axioms in a nonmonotonic theory. By considering the interpretations and background information that are likely to be used, we can predict how players are apt to describe the problems to themselves in such a way as to privilege particular solutions and heuristics for interpreting discourse.
5 DVFT applied to axioms for discourse relations
There are three sources of information in spoken discourse: what is said directly, what is logically entailed by the content of the speech, and what is given by pragmatic implication. For the purposes of this discussion, we will assume that information conveyed by body language and intonation is part of the shared contextual information, and so is contained in the background world knowledge. We also assume that information from pragmatic implication comes as a defeasible entailment of what is said directly or what is logically entailed and additional background knowledge. The goal, then, is to specify what information is available for use in pragmatic implicature and how that information is to be used. The argument for using a given set of heuristics has two parts: first, that a certain set of background information is available to the speaker and hearer, and second that, given this information, the heuristic is the most efficient way for the speaker to communicate the desired information content to the hearer. If a heuristic is used by both the speaker and the hearer, it can be expected to convey the most information for a given utterance. We argue elsewhere (Williams 2003) that the set of background information that should be used is determined by salience, and contains only shared world knowledge that is available in a particular context.

We use DVFT to show that a particular type of heuristic may be expected to arise. These may not be the most efficient heuristics. For example, take a case where everyone interacting believes, correctly, that they are all psychic (can read each other's minds). The speaker and hearer both assume that all information available to one is available to the other, so the speaker assumes that the information she wishes to convey (including that she wishes to convey that particular information at that particular time) is known to the hearer, and she would say nothing. Similarly, the hearer would assume that he already knows what information the speaker wishes to convey at the time and expect to hear nothing. This is maximally efficient: all the information is conveyed with no cost. For our purposes, this is clearly flawed as a model of discourse, and one would not expect a class of "say nothing because the hearer already knows" heuristics to arise.
In this paper, we are more interested in showing that, given that speakers and hearers only assume shared general knowledge derived from our basic experience of the world and basic lexical information, certain heuristics, those in SDRT, can be expected to arise. The idea is that the > axioms in SDRT for inferring discourse relations capitalize on certain aspects of the description of a dialogue. Take for instance the default for Explanation in SDRT. Top below stands for the relation that tells us that a particular label outscopes all the other labels related to another label, while Aspect(α, β) says that the aspect of α and β may be either perfective or imperfective in the sense of introducing an event or state discourse referent into the representation.4

• Explanation: (?(α, β, λ) ∧ Top(σ, α) ∧ cause_D(σ, β, α) ∧ Aspect(α, β)) > Explanation(α, β, λ).

Of special interest here is the predicate cause_D, which sums up a variety of factors that lead to the possibility of a causal link. The idea is that a choice of main verb and arguments for that verb sometimes gets associated with a causal connection: for instance, if someone falls and that person is pushed, there is an association between these that comes from world knowledge: these events can instantiate a typical causal pattern, x pushes y and as a result y falls. Note that we wouldn't want to say that this typical connection is implicated simply by the use of such verbs; but these causal patterns can lead to causal links when the constituents are linked and other features obtain. These causal patterns are, for one thing, quite sensitive to the form of the linguistic description. For instance, if the main verb falls within the scope of a negation, something that the ULFs of SDRT will reflect, then cause_D won't be inferred. And that's what we want in virtue of (5a), though not (5b) (whose analysis involves a topic for alternations and is something we can't go into here).

(5) a. John didn't fall. Mary pushed him.
    b. John fell or slipped. Mary pushed him.

It is this association that primes the interpretation of the second clause as a speech act of Explanation. Asher and Lascarides (2003) argue that a careful examination of the syntax/semantics interface and the argument structure for lexical entries sometimes permits us to derive these connections from MDC, if the lexical entries contain underspecified elements that the second clause can resolve. But there are certainly many cases where we infer causal links where the evidence for complex lexical entries is not at hand.
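As a toy illustration only, and not the SDRT implementation, the Explanation default can be thought of as consulting a small world-knowledge table of causal patterns and checking that the relevant clause is not negated before defeasibly proposing Explanation. All names below are our own hypothetical stand-ins.

```python
# Toy sketch of the Explanation default; cause_D is approximated by a lookup table
# of causal patterns drawn from world knowledge (effect predicate, cause predicate).
CAUSAL_PATTERNS = {("fall", "push"), ("go_to_jail", "embezzle")}

def cause_d(alpha, beta):
    """Crude stand-in for cause_D(σ, β, α): the main predicates instantiate a known
    causal pattern and neither clause is under negation (cf. example (5a))."""
    if alpha.get("negated") or beta.get("negated"):
        return False
    return (alpha["pred"], beta["pred"]) in CAUSAL_PATTERNS

def infer_relation(alpha, beta, defeated=False):
    """Defeasibly propose Explanation when the antecedent holds; 'defeated' stands in
    for later conflicting information, as in (4c), which overrides the default."""
    if cause_d(alpha, beta) and not defeated:
        return "Explanation"
    return "?"   # left underspecified; other defaults may then apply

print(infer_relation({"pred": "fall"}, {"pred": "push"}))                    # Explanation
print(infer_relation({"pred": "fall", "negated": True}, {"pred": "push"}))   # ?  (as in (5a))
print(infer_relation({"pred": "go_to_jail"}, {"pred": "embezzle"}, True))    # ?  (revised, as in (4))
```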
DVFT furnishes us with an explanation for why such > axioms should be adopted, even in the absence of further evidence from the lexicon and the syntax/semantics interface. Consider the "John fell. Mary pushed him." example. Many interpretations satisfy this possibility. If John slipped on a banana peel and two weeks later Mary pushed him, this would satisfy the semantic criteria. But the Explanation interpretation is salient because it provides more information: the pushing caused the falling; the two events happened at the same place and time; Mary was present during the falling; the pushing preceded the falling; etc. Furthermore, this interpretation accords with our experience of the world: pushings cause fallings; a falling is unusual enough that a hearer informed of one would likely want to know why it happened. Hence the explanatory interpretation is salient because it is efficient (in terms of total information conveyed) and the speaker has reason to think that the hearer will anticipate it. This interpretation is driven by the lexical meanings of the words involved (push might have a resultative lexical structure and appeal to an underspecified effect as in Asher and Lascarides 2003) and by the non-linguistic knowledge that pushings can cause fallings. Notice that this interpretation is much less salient with "John fell. Mary tickled him." (though we can get that interpretation too).

Within the context of DVFT, if payoff is a function of information conveyed and the cost of speaking, solutions that convey more information with less spoken will be preferred. So in this case, any expression that takes full advantage of the background world knowledge and lexical structure (without repeating anything the hearer should already be expected to know) will have the same payoff. The question becomes which of these efficient heuristics we should expect to arise.

It is important that we limit interpretations by available background information. In the "John fell. Mary pushed him." case there are interpretations even more informative than the Explanation one. For example, consider one where John fell because a large boulder hit him and 10 minutes later Mary walked by and pushed him to see if he was alive. But such an interpretation isn't accessible in the context in which this discourse was just introduced. Why isn't such an interpretation accessible in a standard context? Because it requires special assumptions about the context to derive that interpretation from the discourse, and there's no reason on the part of the speaker to think that the hearer could access the special assumptions that drive those other interpretations. While he can presume the interpreter to be a competent speaker and to know general commonsense facts about the world, he cannot rationally assume that the interpreter would have access to nonlinguistic information of a special nature that might create a yet more informative interpretation.
This is what DVFT is meant to solve. The idea is that background world knowledge determines a starting point in the basin of attraction (the solution to which a dynamic game converges) of a particular interpretation, so we will not expect to converge on the crushing-by-a-boulder interpretation because it requires outside information. The speaker does not expect the hearer to know this, and hence to expect to need to use this information, and the hearer, even if she knows it, does not expect the speaker to require her to know and use this information (unless it is background information: they both know it and expect the other to know they know it, etc.).

Let's suppose that the context satisfies the antecedent of the default for Explanation. There are many other rules that are alternative possibilities. For instance, we could have an alternative rule to Narration.

• Weird Narration: (?(α, β, λ) ∧ Top(σ, α) ∧ cause_D(σ, β, α) ∧ Aspect(α, β)) > Narration(α, β, λ).

A natural partition of our different defaults for a situation in which the antecedent of Explanation holds would be the following: there are those rules with that antecedent that have a discourse relation whose semantics coheres with the causal connection associated by speakers with the information summarized by cause_D; there are those rules with the antecedent of the Explanation rule but with a discourse relation whose semantics doesn't have anything to do with the causal link (like Weird Narration); and then there are all the other rules without the causal information in the antecedent. DVFT should prompt us to select Explanation since it's the only rule in the cell of the partition with the "coherent" rules; the second cell contains many rules like Weird Narration, and the other cell contains indefinitely many rules that use irrelevant or "nuisance" information in the antecedent (we suppose in this case that there are no cue words or discourse particles present). There's something special about Explanation in that it tracks the associated, stereotypical causal information. The idea that falling is a potential result of pushing in effect primes for the interpretation of the second speech act as an Explanation. Different discourse situations where we need to attach new information to a given context will group naturally into those that verify cause_D predicates and those that verify antecedents of the other defaults, like those for Narration, Elaboration, and Result. As agents continue to play discourse games, the probabilities of the rules on which coordination succeeds will continue to increase, while those on which coordination fails or which are not selected will get lower probabilities. DVFT allows us to keep the successful frames for future coordination. Replicating the analysis of DVFT for
Explanation in these other contexts will yield justifications for these other defaults. A one-shot coordination game should suffice to coordinate on any individual default, if both players share the same partition that we gave above. What if speaker and interpreter don't share the same partition or frame? One possibility, since we're in a signaling game (where one player gives a signal and players must coordinate on its significance), is that the signal itself conveys the relevant aspects of the speaker's frame. Then coordination can proceed as before. If that is not the case, then as we saw earlier with the simple shape game, iterating the coordination game will produce a coordination on whatever is the largest cell that overlaps some element in each player's frame. This will amount to having some sort of an approximation to our full default, one, say, in which only some features relevant to cause_D are noticed. But in any case, we think that the aspects to which our defaults are sensitive are elements that speakers will readily notice, and so we expect a much better than worst-case convergence here to the full defaults.

Similarly, we can give an account of the Narration heuristic from Asher and Lascarides (2003).

• Narration: (?(α, β, λ) ∧ Top(σ, α) ∧ occasion_D(σ, β, α)) > Narration(α, β, λ).

This heuristic, like the one for Explanation, contains a predicate occasion_D, which sums up a variety of factors that lead to the possibility of one event's enabling another. Our DVFT explanation would serve to pick out this Narration rule from weird competitors and nuisance associations.5

It remains to be seen whether this strategy of justification fits all the other defaults that are used in SDRT's module of pragmatic reasoning. The axiom for Background, for instance, exploits aspectual information that doesn't appear to be directly connected to the semantics for the Background relation. Other axioms exploit intonational information, and that too requires more study. Nevertheless, DVFT is a very sophisticated tool for analysis, and the prospects for a justification of the individual defaults in SDRT appear good. In future work, we hope to extend this approach to handle not only other linguistic defaults but defaults in other areas.
Notes

1. We mean to separate out linguistic content from speaker content. The former is what every competent speaker can glean from a discourse, while the latter depends not only on linguistic content but on nonlinguistic background beliefs of the interpreter's concerning the speaker or other elements of the context.
2. Other examples of English discourse particles are too, but, because, as a result, then, etc.
3. To choose among attachment sites, SDRT employs a monotonic rule that picks from all possible updates of a given SDRS with new information those that maximize the strength of rhetorical relations globally and minimize underspecifications; this rule, Maximize Discourse Coherence (MDC), maximizes the informativeness of a given message by minimizing underspecifications and maximizing the strength of the rhetorical connections between constituents globally. This is an optimal strategy for interpretation if our utility function in fact maximizes for informativeness and coherence. If the interpreter assumes that the speaker is a decision-theoretic maximizer, then that's all we need to ground a principle like MDC. No application of game theory appears to be needed.
4. We follow here Kamp and Reyle's (1993) treatment of verbal aspect.
5. This argument does not allow us to conclude, however, that this "local" rule is to be preferred to a "global" one that deals with several constituents at once, say one that links the clauses in the following discourse by Narration all at once: (i) Mary pushed John. John fell. He rolled down the hill. He fell in the river. We suspect that the complexity of these global rules will be higher because there are many more alternatives to attend to, but we leave the details to future research.
References

Asher, N. (1999). Discourse structure and the logic of conversation. Current Research in the Semantics-Pragmatics Interface, 1, 1–28.
Asher, N. and M. Bras (1994). The temporal structure of French texts within the framework of a formal semantic theory of discourse structure. In M. Aurnague, A. Borillo, M. Borillo, and M. Bras, eds., Semantics of Time, Space, Movement and Spatiotemporal Reasoning, pp. 203–218. Working Papers of the 4th International Workshop.
Asher, N. and A. Lascarides (2003). Logics of Conversation. Cambridge University Press, Cambridge, UK.
Asher, N. and M. Morreau (1991). Commonsense entailment: A modal, nonmonotonic theory of reasoning. In Proceedings of IJCAI 91. Morgan Kaufman Press.
Asher, N., I. Sher, and M. Williams (2002). Game theoretic foundations for Gricean constraints. In Proceedings of the 2001 Amsterdam Colloquium on Formal Semantics. University of Amsterdam, Amsterdam.
Asher, N. and L. Wang (2003). Ambiguity and anaphora with plurals in discourse. In SALT XIII. Seattle, Washington.
Bacharach, M. (1993). Variable universe games. In K. Binmore, A. Kirman, and P. Tani, eds., Frontiers of Game Theory. MIT Press, Cambridge, MA.
Bacharach, M. and M. Bernasconi (1997). The variable frame theory of focal points: An experimental study. Games and Economic Behavior, 19(1), 1–45.
Hobbs, J. R., M. E. Stickel, D. E. Appelt, and P. Martin (1993). Interpretation as abduction. Artificial Intelligence, 63, 69–142.
Janssen, M. (2001). Rationalizing focal points. Theory and Decision, 50, 119–148.
Kamp, H. and U. Reyle (1993). From Discourse to Logic. Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Kluwer, Dordrecht.
Knott, A. (1995). A Data-Driven Methodology for Motivating a Set of Coherence Relations. Ph.D. thesis, University of Edinburgh.
Lascarides, A. and N. Asher (1993). Temporal interpretation, discourse relations, and commonsense entailment. Linguistics and Philosophy, 16, 437–493.
Lewis, D. (1969). Convention. Harvard University Press, Cambridge, MA.
Reiter, R. (1980). A logic for default reasoning. Artificial Intelligence, 13, 81–132.
van Rooij, R. (2004). Signaling games select Horn strategies. Linguistics and Philosophy, 27, 493–527.
Schelling, T. (1960). The Strategy of Conflict. Harvard University Press, Cambridge, MA.
Skyrms, B. (1996). Evolution of the Social Contract. Cambridge University Press, Cambridge, UK.
Williams, M. (2003). Focal point based equilibrium selection and the Horn strategies. Manuscript available from the author.
Wu, J. (2003). An Analysis of Aspectual Markers in Mandarin. Ph.D. thesis, University of Texas at Austin.
7 Utility and Relevance of Answers

Anton Benz
1 Introduction
How to answer a question? If the inquirer asks it in order to make a decision about something, then a wide range of reactions can be appropriate. If asked 'Which of the applicants is qualified for the job?', reactions may range from 'Only Müller and Schmidt' and 'At least Müller', through 'Müller has working experience in this field' and 'Schmidt needs extra training', to 'The younger ones show more enthusiasm', or even 'The job needs an expert in PCF Theory'.

This paper divides into two parts. The goal of the first part is to derive a measure of utility for answers from a game theoretic model of communication. We apply this measure to account for a number of judgements about the appropriateness of partial and mention-some answers. Under the assumption that interlocutors are Bayesian utility optimisers, we see questioning and answering as a two-person game with complete coordination of preferences. Our approach builds on work by A. Merin and R. v. Rooij on decision theoretically formulated measures of relevance.1 In the second part we study the relation between their approaches and our game theoretic model of answering. We are aiming for principled characterisations, and are especially interested in clarifying when and why we have to model this type of communication as a two-person game.

There are a number of judgements about the appropriateness of answers that seem to be due to their utility in a specific pragmatic context. In our examples, we write 'I' for the inquirer, and 'E' for the answering expert. We use the following question (1a) as our main example:

(1) Somewhere in the streets of Amsterdam...
    (a) I: Where can I buy an Italian newspaper?
    (b) E: At the station and at the Palace but nowhere else. (SE)
    (c) E: At the station. (A) / At the Palace. (B)
The answers (b) and (c) are equally useful with respect to conveyed information and the inquirer's goals. The answer in (b) is called strongly exhaustive; it tells us for every location whether or not we can buy an Italian newspaper there. The answers in (c) are called mention-some answers. In general, a mention-some answer like A is not inferior to an answer like A ∧ ¬B:

(d) E: There are Italian newspapers at the station but none at the Palace.

If E knows only that ¬A, then ¬A is an optimal answer:

(e) E: There are no Italian newspapers at the station.

We call answers of this type partial answers.

In Section 2 we work out our model. We provide explicit definitions of the underlying structures because we need them for later comparison with work by Merin and v. Rooij. We will pay attention to how the model incorporates the Gricean maxims. Specifically: (1) the model incorporates the Cooperation Principle, as our games are games of perfect coordination, i.e. the answering expert E is always cooperative; (2) throughout we will assume that the Maxim of Quality cannot be violated, i.e. E can only answer with a proposition that she thinks to be true; (3) we ignore the Maxim of Manner, i.e. if two propositions turn out to be equally useful, then we treat them as equally good answers even if one of them needs a much more complex sentence to be expressed. The main difference shows up with respect to the Maxims of Quantity and Relevance. We replace them by the assumption that interlocutors are Bayesian utility maximisers.

The answers in (1b)-(1e) form only the core of the phenomena that have to be explained. We study a number of examples in Section 3. An especially interesting group results from situations where the answering expert has to take into account the possibility of misleading expectations. An answer like 'Müller worked as a student in a software company' may produce incorrect expectations if he only did subordinate jobs at the reception. In general, this type of example is roughly characterised by (1) the existence of an answer C that favours but doesn't decide a certain hypothesis, (2) another answer C′ which disfavours the same hypothesis, and the fact that E knows C and C′, or, at least, believes C′ to be very probable. We call answers like C non-trivial partial answers. We will show that they pose a principal problem for approaches that determine optimal answers according to a decision theoretically formulated measure of relevance. We prove that no such theory can be empirically adequate. We present decision theoretic explications of the Maxim of Relevance in Section 4. The results about their relation to our game theoretic definition of best answers are presented in Section 5. From Theorem 5.3 it will follow
that no decision theoretically formulated criterion for selecting maximally relevant answers can avoid selecting misleading answers.
2 Optimal answers

2.1 Background
There has been a controversial debate about whether or not strongly exhaustive answers have a prominent status among the set of all possible answers. Groenendijk and Stokhof (1984) are the most prominent defenders of the view that they constitute the basic answer, whereas other types of answers have to be accounted for pragmatically. For a constituent question like 'Who came to the party?' the strongly exhaustive answer has to tell us for each person whether he or she came to the party or not. This is of some importance for the interpretation of embedded interrogatives: if Peter knows who came to the party, then Peter knows whether John came to the party, or whether Jeff came to the party, or whether Jane came to the party, etc. The set of all possible answers is then the set of all strongly exhaustive answers.2 On the other hand, there are examples like 'Peter knows where to buy an Italian newspaper' which does not seem to imply that Peter knows whether he can buy an Italian newspaper at X, where X ranges over all kiosks in Amsterdam. The same difference shows up for the respective unembedded, or direct, questions. This leads to a position that sees questions as ambiguous or underspecified.3

It is not our aim to solve this controversy here. We just indicate how we would like to position our approach in its context. Hence we state our background assumptions and our main motives for adopting them. But, in a technical sense, our game theoretic analysis of questioning and answering does not depend on these assumptions.4 Following Groenendijk and Stokhof (1984) we identify the set of answers to a question ?x.φ(x) with the set of all strongly exhaustive answers. If we take this approach, then we have to find a pragmatic explanation for the possibility of mention-some answers. Our main motivation for adopting their view comes from the observation that only questions that are subordinated to special goals of the inquirer allow for mention-some answers. If a question is asked only for gathering information, i.e. in a pragmatically neutral context, then a strongly exhaustive answer is expected:

(2) (a) Which animals have a good sense of hearing?
    (b) Where do coral reefs grow?
    (c) When do bacteria form endospores?
In situations where asking a question is subordinated to further ends we find a wide range of other reactions:

(3) Somewhere in the streets of Berlin...
    I: I want to take the next train to Potsdam. Where can I buy a ticket?
    (a) E: Lists all places where one can and cannot buy a ticket.
    (b) E: At the main station. / At this shop over there.
    (c) E: Come with me! (Takes him to the nearest ticket shop)
    (d) E: (Hands him a ticket)
    (e) E: There are no ticket inspectors on the trains today.

The response in (3a) is the strongly exhaustive answer; those in (3b) are mention-some answers. The response in (3c) contributes to a goal (get to a ticket shop (G2)) immediately super-ordinated to the goal of getting to know a shop that sells tickets (G1). The response in (3d) contributes to a goal (getting a ticket (G3)) which is again super-ordinated to the plan of buying a ticket. The response in (3e) contributes to a project (G4) that is again super-ordinated to getting a ticket. We wouldn't call the responses in (3c) to (3e) answers. A more appropriate name is probably reaction.

Due to our assumption that strongly exhaustive answers are basic, we assume that a question ?x.φ(x) itself introduces the immediate goal of providing the strongly exhaustive answer (G0). Writing the sub-ordination relation as < we find in Example (3) that this immediate goal is embedded in a hierarchy of goals G0 < G1 < G2 < G3 < G4. The following mechanism explains the possibility of responses as in (3): super-ordinated goals can override the immediate goal of providing a strongly exhaustive answer. Mention-some answers contribute to a goal that is super-ordinated to the basic goal G0. The information conveyed by them is optimal with respect to this super-ordinated goal. It is the aim of this paper to precisely characterise this optimality in game theoretic terms. This provides the general idea of how to derive the possibility of mention-some answers.

In a game theoretic model we can represent a goal by a utility function. A natural way to do this is by setting u(v, a) := 1 if we reach the goal in situation v after execution of action a, and u(v, a) = 0 if we don't reach it. If in Example (1) a is the act of going to the station and v a world where there are Italian newspapers at the station, then act a leads to success, and hence u(v, a) = 1. Utility measures can represent more fine-grained preferences over the outcomes of actions: e.g. if the inquirer wants to buy an Italian newspaper but prefers to buy it at the Palace because it is closer to his place,
then this can be represented by assigning higher values to buying Italian newspapers at the Palace than to buying them at the station.

Finally, we should emphasise that we consider only direct questions, i.e. no syntactically embedded questions. If we know what the optimal answer to a question Q? is, then we do not necessarily know how to interpret it if it occurs as a syntactically embedded question.

(4) (a) Peter knows where to buy an Italian newspaper.
    (b) Peter knows where best to buy an Italian newspaper.
These two sentences are not equivalent. If the set of optimal answers were identical with the meaning of the embedded sentence, then they should be equivalent. Hence, by determining optimal answers, we make no claim about embedded questions.

2.2 The utility of answers
As mentioned before, our background assumptions do not immediately affect the following analysis of partial and mention-some answers. The key idea is to see questioning and answering as embedded in a decision problem. It is due to Robert van Rooij, who explored it in quite a number of papers.5 I see it as one of the most interesting contributions of game and decision theory to pragmatics up to now. To motivate this move, we look at some examples. In (1), the inquirer has to decide where to go in order to buy an Italian newspaper. Other examples that allow a mention-some answer are:

(5) (a) (In a job centre) I am a computer expert. Where can I apply for a job?
    (b) I like to go skiing in the Alps. What places can you recommend?
    (c) (In a job interview) What are your qualifications?
In (5a), the inquirer has to decide where to apply for a job. In (5b), he has to decide where to go skiing; in (5c), whether or not to employ a candidate. In each case there is a finite set of actions {a1, ..., an} and the inquirer asks for information that helps him to make an optimal choice among them. In (1) this set may be represented as {go-to(x) | x a newspaper kiosk in Amsterdam}; in (5a) as {send-application-to(x) | x a group of regional software companies}; in (5b) as {travel-to(x) | x a valley in the Alps}; in (5c) as {employ, not-employ}. The decision depends on the preferences of the decision maker over the outcomes of these actions, and on his/her information about the state of the world. We assume that there is a fixed set Ω that collects all possible states. If the decision maker does not have complete
information, then he has to rely on his expectations about the world. We can represent them by the probabilities he/she assigns to the different possible states. In order to keep things simple, we assume that there are only countably many states of the world, i.e. that Ω is countable. In this case, a probability distribution is just a real-valued function P : Ω → R such that (1) P(v) ≥ 0 for all v ∈ Ω and (2) the sum of all P(v) equals 1. For sets A ⊆ Ω it is usual to set P(A) = Σ_{v∈A} P(v). Hence P(Ω) = 1. We collect these elements in a structure:

Definition 2.1 A decision problem is a triple ⟨(Ω, P), A, u⟩ such that (Ω, P) is a countable probability space, A a finite, non-empty set and u : Ω × A → R a function. A is called the action set, and its elements actions. u is called a payoff or utility function.

A decision problem represents the inquirer's problem. By asking his question he makes it common knowledge. Again for simplicity, we assume here that his beliefs, represented by the probability space (Ω, P), are mutually known, too. In order to indicate that a probability distribution represents the inquirer's beliefs we write PI. How does the situation change if we include the answering expert in our model? The only parameter that we add to our formal structure is a probability distribution PE that represents her expectations about the world:

Definition 2.2 A support problem is a five-tuple ⟨Ω, PE, PI, A, u⟩ such that (Ω, PE) and (Ω, PI) are countable probability spaces, and ⟨(Ω, PI), A, u⟩ is a decision problem. We call a support problem well-behaved if (1) for all A ⊆ Ω : PI(A) = 1 ⇒ PE(A) = 1 and (2) for x = I, E and all a ∈ A : Σ_{v∈Ω} Px(v) × u(v, a) < ∞.

The first condition for well-behavedness is included in order to make sure that E's answers cannot contradict I's beliefs,6 the second is there in order to keep the mathematics simple. A support problem represents just the fixed static parameters of the answering situation. We assume that I's decision does not depend on what he believes that E believes. Hence his epistemic state (Ω, P) represents just his expectations about the actual world. E's task is to provide information that is optimally suited to support I in his decision problem. Hence, E herself faces a decision problem, where her actions are the possible answers. The utilities of the answers depend on the way in which they influence I's final choice. We look at the dependencies in more detail. We find two successive decision problems:
    Expert E answers               I decides for an action             Evaluation
           •        ──── A ────→          •        ──── a ────→             •
           ↑                              ↑                                 ↑
    expectations of E              expectations of I               utility function
        (Ω, PE)                         (Ω, PI)                        u(v, a)
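For readers who find it helpful, here is a minimal rendering of Definitions 2.1 and 2.2 as a data structure. This is our own sketch, not part of the original text; it assumes a finite Ω, so the finiteness condition (2) of well-behavedness holds automatically.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List

@dataclass
class SupportProblem:
    omega: List[Hashable]                              # the set of possible states
    p_e: Dict[Hashable, float]                         # the expert's distribution PE, summing to 1
    p_i: Dict[Hashable, float]                         # the inquirer's distribution PI, summing to 1
    actions: List[Hashable]                            # the action set A
    utility: Callable[[Hashable, Hashable], float]     # the payoff function u(v, a)

    def well_behaved(self) -> bool:
        # Condition (1): PI(A) = 1 implies PE(A) = 1 for all A.  With a finite omega this
        # is equivalent to: every state that E considers possible, I considers possible too.
        return all(self.p_i.get(v, 0.0) > 0.0
                   for v in self.omega if self.p_e.get(v, 0.0) > 0.0)
```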
We assume that the answering expert E is fully cooperative and wants to maximise I's final success. Hence, E's payoff is identical with I's. E has to choose her answer in such a way that it optimally contributes towards I's decision. Due to our assumption that I's information is mutually known, E is able to calculate how I will decide. Hence, we represent the decision process as a sequential two-person game with complete coordination of preferences. We find a solution, i.e. optimal answers and choices of actions, by calculating backward from the final outcomes.

The following model will be worked out using standard techniques of game and decision theory. We concentrate on ideal dialogue. By this we mean that all participants have only true beliefs and adhere to the Gricean maxims, as far as they are necessary. The Cooperation Principle, for example, is represented by the fact that we consider only games of pure coordination. We will introduce other maxims together with our analysis.
2.3 Calculating backward expected utilities
First we have to consider the final decision problem of I. The probability PI in our support situation is intended to represent his beliefs before E has given her answer. Hence, we have to say how an answer will change these beliefs. In probability theory the effect of learning a proposition A is modelled by conditional probabilities. The related learning model is known as Bayesian learning. Let H be any proposition, e.g. the proposition that there are Italian newspapers at the station, or that software companies x, y, z offer jobs for computer experts. H collects all possible worlds in Ω where these sentences are true. Let C be some other proposition, e.g. the answer given by our expert. Then the probability of H given C, written P(H|C), is defined by:

P(H|C) := P(H ∩ C)/P(C).    (2.1)

This is only well-defined if P(C) ≠ 0. Let's consider an example. I thinks that the probability that the station has Italian newspapers and that the owner of the newspaper store there is Italian is 1/50, i.e. PI(H ∩ C) = 1/50. He thinks that the probability that in general the owner is Italian is PI(C) = 2/50.
It follows that in every second world where the owner is Italian, he must also sell Italian newspapers. If now I learns from E's answer that the owner is indeed Italian, then he should believe that there are Italian newspapers at the station with probability 1/2: with (2.1) we calculate P(H|C) = (1/50) : (2/50) = 1/2.

I's decision situation. How does I choose his action? It is standard to assume that rational agents try to maximise their expected utilities: let ⟨(Ω, P), A, u⟩ be any decision problem. Then the expected utility of an action a is defined by:

EU(a) = Σ_{v∈Ω} P(v) × u(v, a).    (2.2)
As the effect of learning a proposition A with P(A) > 0 is modelled by conditional probabilities, we get the expected utility after learning A by:

EU(a, A) = Σ_{v∈Ω} P(v|A) × u(v, a).    (2.3)
As we assumed that the decision maker I tries to maximise expected utilities by his choice, it follows that he will only choose actions that belong to {a ∈ A | EU_(Ω,PI)(a, A) is maximal}. In addition we assume that I always has a preference for one action over the others, or that there is a mutually known rule that tells I which action to choose if this set has more than one element. In this case we can write aA for this unique element. In short, we assume that the function A ↦ aA, for PI(A) > 0, is known to E.

E's decision situation. According to our assumption, E's payoff function is identical with I's payoff function u, i.e. questioning and answering is a game of complete coordination (Principle of Cooperation). In order to maximise her own payoff, E has to choose an answer such that it induces I to take an action that maximises their common payoff. We use again (2.2) for calculating the expected utility of an answer A ⊆ Ω. With aA defined as in the previous paragraph we get:

EUE(A) := Σ_{v∈Ω} PE(v) × u(v, aA).    (2.4)
We add here a further Gricean maxim, the Maxim of Quality. We call an answer A admissible if PE(A) = 1. The Maxim of Quality is represented by the assumption that the expert E only gives admissible answers. This means that she believes them to be true. For a given support problem S =
⟨Ω, PE, PI, A, u⟩ we set:

AdmS := {A ⊆ Ω | PE(A) = 1}    (2.5)
Hence, the set of optimal answers for S is given by:

OpS = {A ∈ AdmS | ∀B ∈ AdmS EUE(B) ≤ EUE(A)}.    (2.6)
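As an illustration of the backward computation in (2.1)-(2.6), here is a small sketch of our own. It assumes a finite Ω and the tie-breaking convention mentioned above (ties resolved by list order), and all concrete numbers are invented for the example.

```python
# Backward computation of optimal answers: I's choice a_A, E's expected utility, and Op_S.
def eu_i(p_i, u, a, answer):                      # (2.3): I's expected utility after learning `answer`
    pa = sum(p_i[v] for v in answer)
    return sum((p_i[v] / pa) * u(v, a) for v in answer)

def best_action(p_i, u, actions, answer):         # a_A: I's choice after learning `answer`
    return max(actions, key=lambda a: eu_i(p_i, u, a, answer))

def eu_e(p_e, p_i, u, actions, answer):           # (2.4): E's expected utility of giving `answer`
    a = best_action(p_i, u, actions, answer)
    return sum(p_e[v] * u(v, a) for v in p_e)

def optimal_answers(p_e, p_i, u, actions, answers):          # (2.5) and (2.6)
    adm = [A for A in answers if abs(sum(p_e.get(v, 0.0) for v in A) - 1.0) < 1e-9]
    best = max(eu_e(p_e, p_i, u, actions, A) for A in adm)
    return [A for A in adm if abs(eu_e(p_e, p_i, u, actions, A) - best) < 1e-9]

# Example (1) with invented numbers: worlds record whether A (station) and B (Palace) hold.
omega = ["AB", "Ab", "aB", "ab"]
p_i = {"AB": 0.2, "Ab": 0.2, "aB": 0.4, "ab": 0.2}            # I's prior
p_e = {"Ab": 1.0}                                             # E knows: station yes, Palace no
def u(v, act):
    return 1.0 if (act == "station" and v[0] == "A") or (act == "palace" and v[1] == "B") else 0.0
answers = [frozenset({"AB", "Ab"}),     # the mention-some answer A ("at the station")
           frozenset({"Ab"}),           # the strongly exhaustive answer
           frozenset(omega)]            # saying nothing
print(optimal_answers(p_e, p_i, u, ["station", "palace"], answers))
# Both the mention-some answer and the exhaustive answer come out optimal;
# saying nothing does not, since I would then go to the Palace.
```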
3 Examples
We consider only well-behaved support problems ⟨Ω, PE, PI, A, u⟩, i.e. for all A ⊆ Ω : PI(A) = 1 ⇒ PE(A) = 1. As mentioned before, the condition PI(A) = 1 ⇒ PE(A) = 1 entails that E's answers cannot contradict I's beliefs. More precisely, we find:

Fact 3.1 Let (Ω, PE, PI, A, u) be a given support problem, then the condition ∀A ⊆ Ω (PI(A) = 1 ⇒ PE(A) = 1) entails for all A, B ⊆ Ω:
1. PE(A) = 1 ⇒ PI(A) > 0,
2. PI(A|B) = 1 & PE(B) = 1 ⇒ PE(A) = 1.

We start with our main example (1), i.e. with the question: Where can I buy an Italian newspaper? We first describe the general scenario before justifying the different types of answers. We denote by a, b the actions of going to the station and going to the Palace. There may be other actions too. Let A ⊆ Ω be the set of worlds where there are Italian newspapers at the station, and B ⊆ Ω where they are at the Palace. We represent the payoffs as follows: for every possible action c in the action set the utility value is either 1 (success) or 0 (failure); in particular, we assume that u(v, a) = 1 iff v ∈ A, else u(v, a) = 0; u(v, b) = 1 iff v ∈ B, else u(v, b) = 0.

Mention-some answers. We have to show that the mention-some answers are as good as the strongly exhaustive answer (SE). Technically this means that we have to show that if A, B and SE are admissible answers, then EUE(A) = EUE(B) = EUE(SE). We start with answer A: if E knows that A, then A is an optimal answer. If learning A induces I to choose action a, i.e. if aA = a, then the proof is very simple:

EUE(A) = Σ_{v∈Ω} PE(v) × u(v, aA) = Σ_{v∈A} PE(v) × u(v, a) = 1.
Clearly, no other answer could yield a higher payoff. If we want to prove the claim in full generality, i.e. for all cases, however complicated they may be, as long as our previously formulated restrictions hold, then we need some more calculation. We first note the following fact: let's assume that I, after learning A, chooses an act c different from a, i.e. aA = c ≠ a. Then let C denote the set where action c is successful, i.e. C = {v ∈ Ω | u(v, c) = 1}. Then either (i) PE(C) = 1 or (ii) PE(C) < 1. In the first case (i) it follows again that EUE(A) = 1, and our claim is proven. Case (ii) leads to a contradiction by Fact 3.1: if I chooses c, then EUI(c|A) = max_{c′∈A} EUI(c′|A) = EUI(a|A) = 1; hence PI(C|A) = 1, and therefore PE(C) = 1, in contradiction to (ii). It follows that only (i) is possible.

In the same way it follows that B is optimal if E knows that B. The same result follows for any stronger answer, including the strongly exhaustive answer SE, A ∧ B or A ∧ ¬B. This shows that their expected utilities are all equal as long as they are admissible answers. We have no condition that represents the Maxim of Manner, hence all these answers are equally good and E can freely choose between them.

Partial answers. We now turn to Example (1e). Let Ā and B̄ denote the complements of A and B. We assume here in addition that I can only choose between a and b, i.e. between going to the station and going to the Palace. We show: if E knows only ¬A, hence PE(Ā) = 1, then ¬A is an optimal answer. We first assume that learning ¬A leads I to choose action b, i.e. if I learns that there are no Italian newspapers at the station, then he will go to the Palace:

EUE(Ā) = Σ_{v∈Ω} PE(v) × u(v, aĀ) = Σ_{v∈Ā} PE(v) × u(v, b) = PE(Ā ∩ B) = PE(B).

Let C be any proposition. If PE(C) = 1, then either aC = a or aC = b; hence either EUE(C) = 0 or EUE(C) = PE(B). Here enters: PE(C) = 1 ⇒ PE(C ∩ A) = 0. Hence, no other answer than ¬A can be better. It is important that I can only choose between actions a and b. The result holds even if C = B. What happens if I doesn't choose b but a? This means that 0 = EUI(a|Ā) = max{EUI(a|Ā), EUI(b|Ā)}, hence EUI(b|Ā) = 0. This entails that PI(B ∩ Ā) = 0 = PE(B ∩ Ā). Hence, E believes that there are neither Italian newspapers at the station nor at the Palace, hence no answer can raise expected utilities above 0. This concludes the discussion of Example (1e).
Non-trivial partial answers. Non-trivial partial answers will play a significant role when we discuss relevance-based approaches. We consider the same general setting as before; in particular, the utility functions are defined as before, and I can only choose between a and b. Let's consider the following example:

(6) There is a strike in Amsterdam and therefore the supply of foreign newspapers is a problem. The probability that there are Italian newspapers at the station is slightly higher than the probability that there are Italian newspapers at the Palace, and it might be that there are no Italian newspapers at all. All this is common knowledge between I and E. Now E learns that (N) the Palace has been supplied with foreign newspapers. In general, it is known that the probability that Italian newspapers are available at a shop increases significantly if the shop has been supplied with foreign newspapers.

We model the epistemic states described in (6) by the following condition:

PI(A) > PI(B) and Px(B ∩ N) > Px(A ∩ N) for x = I, E.    (3.7)
As before, PE describes E's beliefs when choosing her answer, i.e. after learning N, and PI describes I's beliefs before learning E's answer. Is N an optimal answer? Let's first calculate I's reaction:

EUI(a, N) = Σ_{v∈N} PI(v|N) × u(v, a) = PI(A ∩ N);

and

EUI(b, N) = Σ_{v∈N} PI(v|N) × u(v, b) = PI(B ∩ N).

Hence, he will choose b, i.e. aN = b. With (2.4) we get for E:

EUE(N) = Σ_{v∈N} PE(v) × u(v, b) = PE(N ∩ B) = PE(B) > PE(A).
It is easy to see that for any answer C either EUE(C) = PE(A) or EUE(C) = PE(B). Hence, N is an optimal answer. Now we change the scenario slightly:

(7) We assume the same scenario as in (6), but E learns this time that (M) the Palace has been supplied with British newspapers. Because the British delivery service is rarely affected by strikes and is not related to the newspaper delivery services of other countries, this provides no evidence whether or not the Palace has been supplied with Italian newspapers.
The fact that M provides no evidence whether or not there are Italian newspapers at the station (A) or the Palace (B) means that PE(A) = PE(M ∩ A) > PE(M ∩ B) = PE(B). I's epistemic state has not changed from (6), hence we assume again that PI(A) > PI(B) and PI(A ∩ N) < PI(B ∩ N). What are the optimal answers? If E says nothing, i.e. if she answers Ω, then the expected payoff is EUE(Ω) = PE(A). Is there a better answer than saying nothing? Let C be such that PE(C) = 1. Then either I will go to the station, i.e. choose a, or go to the Palace, i.e. choose b. Hence, EUE(C) = PE(A) or EUE(C) = PE(B) < PE(A). This shows that E cannot provide any information that does better than Ω. Is N still an optimal answer? We find that answering with N leads I to go to the Palace, and therefore EUE(N) = PE(B) < EUE(Ω). Hence, N cannot be a best answer. It would be misleading if E replied that the Palace has been supplied with foreign newspapers. Now, let's finally consider:

(8) We assume the same scenario as in (6), where E learns that (N) the Palace got supplied with foreign newspapers, but her intuition tells her that, if there have been Italian newspapers among them, then they will be sold out before I can get there. Of course, this is only a conjecture of hers.

We assume here that her expectations are so strong that still PE(A) > PE(B). As in (6), we assume that PI(A) > PI(B) and PI(A ∩ N) < PI(B ∩ N). We find again that answering N, i.e. that the Palace got foreign newspapers, will induce I to do b. Again it follows that E thinks that this is misleading. Hence, again N is not an optimal answer. In our model, the place of Grice's Maxim of Relevance was taken over by the assumption that interlocutors are Bayesian utility maximisers, i.e. by the assumption that rational agents choose actions that maximise their expected payoffs. This principle is somewhat alien to the linguistic pragmatic tradition; hence we may ask: Isn't it possible to replace it again by the more familiar Gricean maxim? In order to answer this question we need a halfway precise formulation of this principle. We discuss in the next section some explications in decision theoretic terms. We prove in Section 5 that no possible decision theoretic explication of the Maxim of Relevance can explain examples (6)-(8).
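The contrast between (6) and (7) can be reproduced in a small numerical sketch. The probabilities below are invented; they are only required to satisfy condition (3.7), and the encoding of E's beliefs in (7) simply keeps the prior marginals for A and B.

    # A = papers at the station, B = papers at the Palace,
    # N = the Palace has been supplied with foreign newspapers.
    P_I = {'A': 0.44, 'B': 0.31, 'A|N': 0.35, 'B|N': 0.70}   # I's prior beliefs

    def reaction(beliefs_I):
        """I's choice given his beliefs about A (go to station) vs B (go to Palace)."""
        pA, pB = beliefs_I
        return 'a' if pA >= pB else 'b'

    def EU_E(answer, P_E):
        """E's expected utility: the probability that I ends up where papers really are."""
        act = reaction((P_I['A|N'], P_I['B|N']) if answer == 'N' else (P_I['A'], P_I['B']))
        return P_E['A'] if act == 'a' else P_E['B']

    P_E_6 = {'A': 0.35, 'B': 0.70}   # (6): E has learned N, so she conditions on it
    P_E_7 = {'A': 0.44, 'B': 0.31}   # (7): E has learned M, uninformative about A and B

    print(EU_E('N', P_E_6), EU_E('silence', P_E_6))   # 0.70 vs 0.35: N is optimal in (6)
    print(EU_E('N', P_E_7), EU_E('silence', P_E_7))   # 0.31 vs 0.44: N misleads in (7)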
4  Relevance
The Gricean Maxim of Relevance is, of course, a natural candidate for explaining our judgements about the appropriateness of various partial and
mention-some answers. Hence, game and decision theoretic explications of this maxim are of immediate interest for our investigation. Roughly, relevance measures the (psychological) impact of an assertion on the addressee's beliefs. In decision theory there is only one decision maker. He may be uncertain about the state of the world, but there are no other players whose moves or beliefs he has to take into account.7 In a proper game theoretic problem, the payoffs of one player's moves depend on the moves of the other players, and vice versa. Hence, the more general question behind the discussion of the next two sections is whether or not it is essential that we model questioning and answering as a two-person game. We divide our discussion of explications of relevance into two sections. The first one addresses proposals that measure relevance in terms of the amount of information carried by an utterance, the second one proposals that, in addition, take expected utilities into account. The first one concentrates mainly on the approach by Arthur Merin (1999), the second one on work by Robert van Rooij.8

4.1  Information based measures of relevance
If the stock market is rising (D), it indicates that the economy prospers, and therefore the unemployment rate will probably fall. We can see the situation as a competition between two hypotheses: (H) the unemployment rate falls, and (H̄) the unemployment rate doesn't fall. H and H̄ are mutually exclusive and cover all possibilities. D, the rising of the stock market, does not necessarily imply H, but our expectations that the unemployment rate will fall are somewhat higher after learning D than before. We can say that the fact D is positively relevant for our belief that H. We can model the change of degree of belief by conditional probabilities as indicated previously, and, based on them, it is possible to derive measures of relevance. We discuss here one proposal by Merin in more detail. Merin (1999) defines relevance as a relation between a probability function P representing expectations in some given epistemic context i = (Ω, P) and two propositions: a proposition H, the hypothesis, and a proposition D, the evidence. This leads to the following definition:9

Definition 4.1 (Relevance, Merin)  The relevance r^i_H(D) of proposition D to proposition H in an epistemic context i represented by a conditional probability function P^i(·|·) is given by r^i_H(D) := log(P^i(D|H)/P^i(D|H̄)).

Merin applies this measure to communication situations. In its new domain we can see log(P^i(D|H)/P^i(D|H̄)) as the (possibly negative) argumentative force of D to make the addressee believe that H. The details of the definition are not of concern here. r^i_H(D) can be positive or negative according to
whether D influences the addressee to believe or disbelieve H. In the same way as it favours H it disfavours H̄, i.e. r^i_H̄(D) = −r^i_H(D). In a situation where the speaker wants to convince the hearer that H, an assertion D is the more effective, or relevant, the bigger r^i_H(D). If r^i_H(D) = 0, then D neither favours nor disfavours either of the two hypotheses, and it is reasonable to call D irrelevant. What is of concern for us now is only the fact that r^i_H takes as parameters the elements of a tuple ⟨Ω, P, H, D⟩. Merin takes an argumentative attitude towards communication, i.e. he sees the aim of convincing the conversational partner of some hypothesis H as the basic goal of communication. For this purpose it is reasonable to choose the proposition that has the greatest impact on the addressee's beliefs, i.e. the most relevant proposition. This means that we can define by means of r a decision function R that selects for each context ⟨Ω, P, H⟩ a proposition D with maximal argumentative force. Let us consider how to apply Merin's measure for the relevance of assertions to questioning and answering situations. If the inquirer I asks whether φ, then we can set H := {v ∈ Ω | v ⊨ φ} and H̄ := {v ∈ Ω | v ⊭ φ}. Assume we are in a job interview; I wants to know whether E is qualified for the job (H) or not (H̄). Hence, he asks her about her qualifications. E has to find the strongest argument that is not already known to I for convincing him of H. Maybe she didn't mention in her resume that (D) she regularly worked as a student in a similar company, which indicates that she knows the business. If this is indeed her strongest argument, then she should use it. But the selected answer may be highly misleading, even if it is truthful. E.g. it may be that E didn't mention this information in her resume because she worked only at the telephone switchboard. If we decide to use a decision theoretic model for the conversational phenomenon we investigate, then this implies that one person plays the role that nature plays in an experiment, i.e. the decision function R does not depend on his preferences and expectations. In both cases, in scientific experimentation and in communication, relevance is then defined only from the receiver's perspective. If the measure of relevance is based only on information, it does not even take his utilities into account. Hence it shows its limits in situations like the following: Let's call our applicant Eve. We assume that Ω consists of four worlds {v1, . . . , v4} of equal probability. In world v1 Eve is a highly qualified and experienced applicant who can start the job right away. In v2 she is qualified but needs some more training. In v3 and v4 she is not qualified, with only minor differences between the two states. I, who read her resume, asks a colleague, E, whether she knows more about this applicant's abilities. Now assume that E knows that D = {v2, v3}. Is D
relevant? If the decision maker learns D, then, using Merin's measure, it turns out that r_H(D) = log(P(D|H)/P(D|H̄)) = 0. Hence, D is irrelevant. But, intuitively, it is relevant for the decision maker to learn that the most favoured situation v1 cannot be the case.
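The calculation can be spelled out in a few lines; the equal world probabilities and the sets H and D are those of the example just given, and the code is only a sketch of Definition 4.1.

    from math import log

    # Four equally likely worlds; H = {v1, v2} ("Eve is qualified"), D = {v2, v3}.
    P = {f'v{i}': 0.25 for i in range(1, 5)}
    H = {'v1', 'v2'}
    D = {'v2', 'v3'}

    def cond(evidence, hypothesis):
        return sum(P[v] for v in evidence & hypothesis) / sum(P[v] for v in hypothesis)

    not_H = set(P) - H
    relevance = log(cond(D, H) / cond(D, not_H))
    print(relevance)   # 0.0 -- D is "irrelevant" in Merin's sense,
                       # although it rules out the best world v1.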
4.2  Utility based measures of relevance
The last example shows that, in general, we have to consider the interlocutor's preferences. Van Rooij's idea was to look at the communicative situation as a problem of decision theory and thereby to derive a criterion for the relevance of questions and answers. Let's consider an example. An oil company has to decide where to build a new oil production platform. Given the current information, it would invest the money and build the platform at a place off the shores of Alaska. An alternative would be to build it off the coast of Brazil. So the ultimate decision problem is to decide whether to take action a and build a platform off the shores of Alaska, or take action b and build it off the shores of Brazil. After getting the results of the exploration drilling, the company has to decide whether to go ahead and follow its old plans and build the production platform in the north, or to redesign them and build it off Brazil. One heuristic says that information can only be relevant if it induces the company to choose an action that promises a higher payoff than the action it would have chosen before getting this information. This heuristic leads to the following definitions of relevance: A proposition A is relevant if learning A induces the inquirer to change his decision about which action a to take, and it is the more relevant the more it increases the inquirer's expectations. Let a∗ denote the action whose expected payoff is maximal relative to the information available before drilling, represented by P. Then the utility value10 of proposition A is defined as:

UV(A) = max_{a∈A} EU(a, A) − EU(a∗, A).    (4.8)
A is relevant for the decision problem if UV(A) > 0. In our example, UV(A) can only be positive if the newly learned information can induce the company to build the oil platform off the shores of Brazil (action b), and not off the shores of Alaska (a = a∗). The utility value UV is defined from the investigator's perspective. Metaphorically speaking, we can call an experiment a question to nature, and a result an answer from it. The answering person, nature, is not providing information with a view to the investigator's decision problem. There is only one real person involved in this decision model, namely the inquirer himself. Nature shows oil, or doesn't show oil, according to whether there is oil where the exploration drilling takes place or not. It does not deliberate
and show it in order to help the investigator, or because it thinks that this is relevant. The model does not predict that nature will only give relevant answers, and it does not even say that this would be desirable! E.g. assume that there is indeed a very large oil field in the area near Alaska where the company wanted to build the platform given its old information, and a very small oil field in the Brazilian area. If the exploration drilling confirms that the original decision was right, then this is, according to our criterion, irrelevant. Only if by some bad luck the drilling in the Brazilian area gives rise to the hope that there is more oil than in Alaska do we get relevant information. So even from the company's, i.e. the receiver's, perspective, relevant information is not the same as desirable information. In (2004b) van Rooij used (4.8) as a measure for the relevance of answers.11 We concentrate on this early proposal because, as we think, it shows quite clearly the principled limitations of a relevance based approach. Information is evaluated only from the inquirer's perspective, and, as our example shows, this value is not identical with its desirability. Although a measure like (4.8) is defined only from one person's perspective, we can apply it to the communication situation. We have to ask: whose probability is P? There are three possibilities:

1. It is the inquirer's subjective probability.
2. It is the expert's subjective probability.
3. It is the subjective probability that E assigns to I.

Alternatives 1 and 2 are unsatisfactory. If 1, then measures like (4.8) cannot be applied by E, unless 1 and 3 coincide. If we assume that (a) the expert can only give answers that she believes to be true, then 2 implies that any answer A will do, because then EU(a, A) = EU(a) for all a. In order to turn the model into a model for a two-person game we have to choose interpretation 3.12 In this case (4.8) advises the answering expert only to choose answers that can make I change his decision. The same problem that we found in the example with the oil company and its exploration drilling, we find with respect to questioning and answering:

(9) Assume that it is common knowledge between I and E that there are Italian newspapers at the station with probability 2/3, and at the Palace with probability 1/3. Now, E learned privately that they are in stock at both places. What should E answer if she is asked (1): Where can I buy an Italian newspaper?

According to the initial epistemic state, I decided to go to the station. Let's consider three possible answers: (A) There are Italian newspapers at the
station; (B) There are Italian newspapers at the Palace; and (A ∧ B). Intuitively, all three are equally good. Some calculation shows that B is the only relevant answer according to (4.8). What (4.8) shows us is that B has the largest practical impact, but this is not the same as maximising joint payoff. The generalisation we are after is to show that the same problem shows up with any measure of relevance. We provide a formal proof in the next section. As a further example we look at the following definition of utility value, also proposed by van Rooij13 as an explication for relevance:

UV(A) = max_{a∈A} EU(a, A) − max_{a∈A} EU(a).    (4.9)
(4.9) gives the advice: 'Increase the hopes of the inquirer as much as you can!' This fixes the problem with Example (9), but it is easy to see that we run into a similar problem with negative information. Assume that in the scenario of Example (9) E learns that there are no Italian newspapers at the station (¬A); in this case (4.9) implies that ¬A is not relevant because it does not increase the inquirer's expectations. This seems to be quite unintuitive. But the problem can easily be fixed again by taking the absolute value |·| of the right side of (4.9). But even here we find an example that shows a difference between the so-defined relevance and desirability. An answer that increases, or changes, the hopes of the inquirer as much as possible is not necessarily a good answer. We consider again Example (7). Some calculation shows that, according to (4.9), E should answer that the Palace has been supplied with foreign newspapers. The same holds for the improved version of (4.9) with the absolute difference. As the probability that there are Italian newspapers at the Palace, given that the Palace has been supplied with foreign newspapers, is much higher than the assumed probability for there being Italian newspapers at the station, this answer should lead the inquirer to go to the Palace. But this is the wrong choice, as the probability is higher that there are Italian newspapers at the station. A good answer should maximise the inquirer's chances of real success, and not maximally increase or change his expectations about success. Van Rooij was aware that relevant answers might be misleading. In Section 4.2 of van Rooij (2003a) he discusses two reasons: the answering person (1) lacks important information or (2) has a reason to withhold information, e.g. due to opposing interests. The situations described in our examples differ in both respects: in (7), (8) and (9) the interests completely coincide, and the expert has all the information necessary to decide that the answers picked out by criteria (4.8) and (4.9) are not optimal.
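The calculations for Example (9) can be reproduced with the following sketch. The prior treats the two shops as stochastically independent; this independence assumption is added only to obtain a concrete joint distribution and is not part of the example.

    # Worlds are (station, palace) pairs; P(A) = 2/3, P(B) = 1/3, independent.
    P = {(1, 1): 2/9, (1, 0): 4/9, (0, 1): 1/9, (0, 0): 2/9}
    u = lambda v, act: v[0] if act == 'a' else v[1]
    acts = ('a', 'b')

    def EU(act, C=None):
        C = C or list(P)
        pc = sum(P[v] for v in C)
        return sum(P[v] * u(v, act) for v in C) / pc

    a_star = max(acts, key=EU)                       # I's plan before any answer: 'a'

    def UV_48(C):   # utility value in the sense of (4.8)
        return max(EU(act, C) for act in acts) - EU(a_star, C)

    def UV_49(C):   # utility value in the sense of (4.9)
        return max(EU(act, C) for act in acts) - max(EU(act) for act in acts)

    A     = [(1, 1), (1, 0)]
    B     = [(1, 1), (0, 1)]
    AandB = [(1, 1)]
    for C, name in ((A, 'A'), (B, 'B'), (AandB, 'A and B')):
        print(name, round(UV_48(C), 3), round(UV_49(C), 3))
    # (4.8): only B gets a positive value; (4.9): all three answers come out alike.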
4.3  Relevance based decision functions
There are a number of reasonable decision theoretic explications of the notion of relevance. If we call information relevant, then what this means depends on the special circumstances of the situation, including the purposes of the interlocutors. In his recent work,14 van Rooij discusses not one specific measure of relevance but compares whole groups of interesting explications and their merits and demerits in special applications. Hence, the measures that we discussed so far are only examples of types of measures. Let S be the set of all support problems over a given set Ω, and for S ∈ S let DS denote its associated decision problem. AdmS denotes, as defined in (2.5), the set of admissible answers of S. Let D := {⟨DS, AdmS⟩ | S ∈ S}. For the purpose of our paper, we can divide relevance measures into two groups: (i) measures that depend only on a decision problem ⟨(Ω, P), A, u⟩ and pick out a relevant answer that depends on the admissible sets, i.e. measures that define a decision function R : D −→ P(Ω) such that every R(DS, AdmS) is optimally relevant; (ii) measures that depend in addition on a given hypothesis H, such that they define a decision function R : D × P(Ω) −→ P(Ω) such that R(DS, AdmS, H) is of optimal argumentative force with respect to H. The second group corresponds roughly to the argumentative view of communication defended by Merin. Hence, we call the first group of decision functions non-argumentative, and the second group argumentative decision functions. The distinction cross-classifies with our distinction between information and utility based measures, but, as a contingent matter of fact, the information based measure discussed previously belongs to the second class, and the utility based measures to the first class. Our aim in the next section is, first, to show that no non-argumentative measure of relevance can always pick out an optimal answer as defined by (2.6). Following our previous analysis, this implies that no non-argumentative measure can be empirically adequate. For every measure there will be a well-behaved support problem where the most relevant answer is not optimal. Second, we will show how our construction in Section 2.3 can be used to define an adequate argumentative decision function in the sense of (ii) above.
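The difference between (i) and (ii) can also be stated as a difference in function types. The following lines are only a schematic rendering; the type names are mine and merely stand in for the objects defined above.

    from typing import Callable, FrozenSet, Tuple, TypeVar

    World = TypeVar('World')
    Proposition = FrozenSet[World]       # an answer is a set of worlds
    DecisionProblem = Tuple              # stands in for ((Omega, P), A, u)
    Admissible = FrozenSet[Proposition]  # the set Adm_S of admissible answers

    # (i) non-argumentative: sees only I's decision problem and the admissible answers
    NonArgumentative = Callable[[DecisionProblem, Admissible], Proposition]

    # (ii) argumentative: additionally receives a hypothesis H to argue for
    Argumentative = Callable[[DecisionProblem, Admissible, Proposition], Proposition]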
5  Relevance and best answers
Given a support problem S = ⟨Ω, PE, PI, A, u⟩, we call any answer in OpS as defined by (2.6) a best answer. The construction in Section 2.3 defines functions BA : S −→ P(Ω) that pick out optimal answers for each support problem S, i.e. with BA(S) ∈ OpS for all S ∈ S. It is the purpose of this
section to study the relation between such best-answer decision functions and functions that choose optimally relevant answers. This section does not need more mathematical skills than the previous ones, but its presentation is necessarily more formal and compact. We made a number of assumptions when we constructed optimal answers in Section 2.3. We start with a summary. We considered support problems S = ⟨Ω, PE, PI, A, u⟩ such that:

1. S is well-behaved;
2. the answering person can only choose answers from AdmS = {A ⊆ Ω | PE(A) = 1} (Maxim of Quality);
3. there is a commonly known function a : AdmS −→ A, A ↦ aA, that chooses for each admissible answer A an action aA that is optimal from I's perspective; aA is the action that I will perform after learning A.15

The last assumption was necessary in order to guarantee that E can calculate the effects of an answer in cases where there are several optimal choices for I. For our purposes we need a precise definition of misleading answer. An answer is misleading if it induces the inquirer to perform an action which E believes not to be optimal:

Definition 5.1 (Misleading Answer)  Let ⟨Ω, PE, PI, A, u⟩ be a given support problem and a : AdmS −→ A, A ↦ aA, as above; then an answer A ⊆ Ω is misleading iff EUE(aA) ≠ max_{a∈A} EUE(a).

Partial answers are answers like 'There are no Italian newspapers at the station' in (1e). This answer still rules out one of the actions, namely going to the station. Roughly, we call a partial answer non-trivial if there are at least two actions a, b such that it rules out neither of them. There are still possible worlds where a is best, and others where b is best. This holds for both interlocutors. The answers in examples (6)-(8) are of this type.

Definition 5.2  Let S = ⟨Ω, PE, PI, A, u⟩ be a given support problem; then S has a non-trivial partial answer C iff there exist actions a, b ∈ A and a : AdmS −→ A, A ↦ aA, as above, such that for

A := {v ∈ Ω | u(v, a) > u(v, b)}
B := {v ∈ Ω | u(v, b) > u(v, a)}

it holds that (1) Px(C), Px(A|C), Px(B|C) > 0 for x = I, E, and (2) a = aC.
Suppose we have a support problem and we have successfully defined a measure of relevance that picks out an answer C that happens to be an optimal answer too. As a decision theoretic model takes into account only the preferences and beliefs of one player, we can redefine the beliefs of the other player without changing the value of the measure of relevance. If C is partial and non-trivial, we can do so in such a way that C thereby becomes a misleading answer.

Theorem 5.3  For every well-behaved support problem S = ⟨Ω, PE, PI, A, u⟩ with a non-trivial partial admissible answer C, there exists a probability distribution P′E on Ω such that for the support problem S′ = ⟨Ω, P′E, PI, A, u⟩: (1) the admissible answers are the same as for S, and (2) C is a misleading admissible answer.

Proof: Let ⟨Ω, PE, PI, A, u⟩ and C be given as in the theorem. Then, by Definition 5.2, there exist a, b ∈ A such that for A := {v ∈ Ω | u(v, a) > u(v, b)} and B := {v ∈ Ω | u(v, b) > u(v, a)} it is the case that Px(A|C), Px(B|C) > 0 for x = I, E, and EUI(a|C) = max_{c∈A} EUI(c|C). As an abbreviation we use B̄ := {v ∈ Ω | u(v, a) ≥ u(v, b)} = Ω \ B. We write:

P″E(v) := PE(v|B̄)   for v ∈ B̄;
P″E(v) := PE(v|B)    for v ∈ B.

Then we set:

Na := Σ_{v∈A} P″E(v) · (u(v, a) − u(v, b)) > 0
Nb := Σ_{v∈B} P″E(v) · (u(v, b) − u(v, a)) > 0.
Clearly, Na, Nb > 0. Now we can define the probability distribution P′E:

P′E(v) := P″E(v) · Nb/(2(Na + Nb))           for v ∈ B̄;
P′E(v) := P″E(v) · (1 − Nb/(2(Na + Nb)))     for v ∈ B.
P′E is obviously a probability distribution over Ω. Let N = Nb/(2(Na + Nb)). We first show (1), i.e. that the admissible answers are the same. This follows from elementary calculations; we show only P′E(D) = 1 ⇒ PE(D) = 1. Let α := PE(D|B̄) and β := PE(D|B). If α < 1 or β < 1, then P′E(D) = αN + β(1 − N) < N + (1 − N) = 1. Hence α = β = 1, and it follows that
PE(D) = 1. Next we show (2). By assumption we know that EUI(a|C) = max_{c∈A} EUI(c|C). Hence, in order to show that C is misleading we have to prove that EU_{P′E}(a) < EU_{P′E}(b). It is

EU_{P′E}(c) = Σ_{v∈B̄} P′E(v) · u(v, c) + Σ_{v∈B} P′E(v) · u(v, c)   for all c ∈ A.

We find that:

EU_{P′E}(b) − EU_{P′E}(a)
  = Σ_{v∈A} P″E(v) · N · (u(v, b) − u(v, a)) + Σ_{v∈B} P″E(v) · (1 − N) · (u(v, b) − u(v, a))
  = N · (−Na) + (1 − N) · Nb
  = −Na·Nb/(2(Na + Nb)) + Nb − Nb²/(2(Na + Nb))
  = Nb − (Nb/2) · (Na + Nb)/(Na + Nb)
  = (1/2) · Nb > 0.
This shows that from E's perspective b would be the preferable action. Hence, C is misleading and (2) is proven.

From this theorem it follows immediately:

Corollary 5.4  Let S be the set of all support problems over Ω. For S ∈ S let DS denote its associated decision problem. Let D := {⟨DS, AdmS⟩ | S ∈ S}, where AdmS is the set of admissible answers of S. Then there exists no function R : D −→ P(Ω) such that for all S ∈ S: R(DS, AdmS) ∈ OpS.

In (6)-(8) we saw examples where an empirically adequate criterion for optimal answers must pick out a non-trivial partial answer. Hence, if a measure of relevance is empirically adequate, then there are examples where it must choose non-trivial partial answers. But then there must also be an example where it chooses a misleading answer. This shows that no decision theoretically defined non-argumentative measure of relevance can be adequate for all support problems. What about argumentative measures such as that proposed by Merin? The following proposition shows that there are argumentative decision functions that always select optimal answers if we can presuppose a function that provides for each support problem S a suitable hypothesis HS for which E has to argue.

Proposition 1  Let S and D be defined as in Cor. 5.4, and assume that the previous conditions for constructing best answer functions are fulfilled. Then there exists a
function H : S −→ P(Ω), S ↦ HS, and a function R such that for all S ∈ S, R(DS, AdmS, HS) ∈ OpS.

Proof: Let S = ⟨Ω, PE, PI, A, u⟩ be a given support problem. Let BA : S −→ P(Ω) be such that for all S ∈ S, BA(S) ∈ OpS, i.e. BA is a best-answer decision function. Then we simply set HS := BA(S) and R(DS, AdmS, HS) = HS. Clearly, this decision function has the desired properties.

For us the main importance of this proposition lies in the fact that it makes absolutely clear the relation between our game theoretic model of questioning and answering and explanations based on decision theory. The latter need an externally given hypothesis as a goal for which an interlocutor could argue. In our model, this hypothesis is provided theory-internally. But this remains the 'only' difference between the two approaches. Hence, the proposition provides for us a bridge to applications of pure decision theory. In the previous sections we have emphasised the differences and, of course, the weaknesses and shortcomings. This obscures somewhat the sheer usefulness of this apparatus. In all cases where we don't need to bother about the argumentative goal, a decision theoretic criterion of relevance may be completely adequate. From Corollary 5.4 we know that a non-argumentative decision function cannot guarantee that we always select optimal answers. Proposition 1 seems to show that the argumentative conception of communication provides the proper basis for an explication of relevance. Whether it really warrants this conclusion we have to investigate another time. We leave it at this point. The same holds for the question concerning the wider significance of our results for Relevance Theory. Sperber and Wilson (1986) are well known for their claim that the Gricean maxims can be reduced to the Maxim of Relevance. If our arguments that applied to decision theoretic explications carry over to Relevance theoretic explications, then the consequences are indeed far reaching. They would amount to a proof that no such reductionist approach can be empirically adequate. But whether or not this conclusion is warranted must again be left for future research.
6  Conclusion
We set out in this chapter with the aim of deriving a measure of the utility of answers from a game theoretic model of communication that accounts for a number of judgements about the appropriateness of partial and mention-some answers. In general, we looked at communication as a sequential two-person game of complete coordination. We presented a sketch of how to explain the existence of mention-some answers even if one assumes that the
basic answer to a question is the strongly exhaustive answer. We argued that mention-some answers contribute to goals of the inquirer that are superordinate to the immediate goal of getting an answer to their question. Relative to these superordinate goals they provide optimal information. The set of best answers can be calculated by backward induction. We applied our model to a number of examples. A sub-group of partial answers, which we called non-trivial partial answers, turned out to be especially interesting; they make it necessary for the answering person to take into account the possibility of misleading information. We showed in the second part of the chapter that our model improves here over previous explanations based on decision theoretically formulated relevance measures. The main goal of the second part was to provide a principled characterisation of the relation between our game theoretic model and approaches that use a decision theoretically defined measure of relevance for finding optimal answers. They define relevance from the perspective of the receiver of information. Choosing maximally relevant answers then means trying to maximise his expectations about responses. As is to be expected, this runs the risk of providing misleading information. We found that no decision function based on maximal relevance can be successful in avoiding this risk. This brings us to the final question: what conclusions does our analysis allow about the status of the Gricean maxims? Our model incorporated the Cooperation Principle and the Maxim of Quality; these principles were supplemented by the assumption (Utility) that interlocutors are Bayesian utility maximisers, i.e. choose actions that maximise expected utilities. It is clear that we needed in addition the Maxim of Manner in order to rule out overly complex answers. As we provided here only a case study, we can formulate only tentative conjectures about the other maxims:

Conjecture 1: The Maxim of Relevance is not among the basic axioms of pragmatics.

Conjecture 2: The first sub-maxim of Quantity (Say as much as you can) is superfluous, as a consequence of (Utility).

Earlier in the chapter we wrote that the more general question behind our discussion of explications of relevance is whether or not it is essential to model communication as a two-person game. I hope we could show that it is.
Notes

1. I.e. Merin (1999) and quite a number of papers by R. v. Rooij listed in the bibliography.
2. If Ω is a set of possible worlds with the same domain D, and [[φ]]v denotes the extension of predicate φ in v, then a strongly exhaustive answer to question ?x.φ(x) is a proposition of the form [v]φ := {w ∈ Ω | [[φ]]w = [[φ]]v}; i.e. it collects all worlds where predicate φ has the same extension. The set of all possible answers is then given by [[?x.φ(x)]]GS := {[v]φ | v ∈ Ω}. This poses a problem for mention-some answers as they are not elements of [[?x.φ(x)]]GS, hence not answers at all.
3. For a short survey of positions regarding mention-some interpretations see Groenendijk and Stokhof (1997, Sec. 6.2.3).
4. Hence, the reader who disagrees with me on the relation between mention-some and mention-all answers will still get the full value out of the game theoretic model.
5. E.g. van Rooij (2004b, 2003a,b,c). Why do we ask questions? Because we want to have some information. But why this particular kind of information? Because only information of this particular kind is helpful to resolve the decision problem that the agent faces (van Rooij 2003b, p. 727).
6. See Fact 3.1 below on p. 203.
7. Sometimes nature is considered to be a second player.
8. We concentrate on his earlier work (2004b; 2003a) and (2003b; 2003c).
9. Merin (1999), Definition 4.
10. This type of situation has been thoroughly studied in statistical decision theory. Compare e.g. Raiffa and Schlaifer (1961, Sec. 4.5) and Pratt et al. (1995).
11. In later work v. Rooij tested other measures, e.g. (4.9) in (2003a), (2003b), and compares quite a number of possible definitions in (2004a). Prashant Parikh (1992) seems to be the first one who introduced (4.8) as a measure of linguistic relevance. Rohit Parikh (1994) used it for measuring the usefulness of communicated (vague) information. It should be noted here that, in order to derive the semantics for embedded interrogatives, v. Rooij introduces an operator that combines the effects of the principles of relevance and quantity, see e.g. (2003b, Sec. 5.2). This was worked out further under the name of exhaustification, e.g. (2004a). The operator is based on an order of relevance that is introduced as a special case of the order based on UV(A) in the sense of (4.9). A due discussion of the relation between v. Rooij's exhaustification operator and the results of this paper must wait for another occasion.
12. Of course, that is the intended interpretation.
13. See also van Rooij (2003a, Sec. 3.1) and van Rooij (2003b, Sec. 3.3).
14. See van Rooij (2004a).
15. The last assumption was introduced on p. 202. For the definitions of well-behavedness and admissibility of answers see Definition 2.2 and Equation (2.5).
References

Groenendijk, J. and M. Stokhof (1984). Studies in the Semantics of Questions and the Pragmatics of Answers. Ph.D. thesis, University of Amsterdam.
Groenendijk, J. and M. Stokhof (1997). Questions. In J. v. Benthem and A. ter Meulen, eds., Handbook of Logic and Language, pp. 1055–124. Amsterdam.
Merin, A. (1999). Information, relevance, and social decisionmaking: Some principles and results of decision-theoretic semantics. In L. Moss, J. Ginzburg, and M. de Rijke, eds., Logic, Language, and Information, volume 2. Stanford, CA.
Parikh, P. (1992). A game-theoretic account of implicature. In M. Kaufmann, ed., Proceedings of the 4th Conference on Theoretical Aspects of Reasoning about Knowledge. Monterey, CA.
Parikh, R. (1994). Vagueness and utility: The semantics of common nouns. Linguistics and Philosophy, 17.
Pratt, J., H. Raiffa, and R. Schlaifer (1995). Introduction to Statistical Decision Theory. The MIT Press, Cambridge, MA.
Raiffa, H. and R. Schlaifer (1961). Applied Statistical Decision Theory. Harvard.
van Rooij, R. (2003a). Quality and quantity of information exchange. Journal of Logic, Language, and Computation, 12, 423–51.
van Rooij, R. (2003b). Questioning to resolve decision problems. Linguistics and Philosophy, 26, 727–763.
van Rooij, R. (2003c). Questions and relevance. In Questions and Answers: Theoretical and Applied Perspectives (Proceedings of 2nd CoLogNET-ElsNET Symposium), pp. 96–107.
van Rooij, R. (2004a). Relevance of complex sentences. To appear in: Proceedings of LOFT 04.
van Rooij, R. (2004b). Utility of mention-some questions. Research on Language and Computation, 2, 401–416.
Sperber, D. and D. Wilson (1986). Relevance: Communication and Cognition. Blackwell, Oxford.
8  Game-Theoretic Grounding

Kris de Jaegher
1  Introduction
Common knowledge plays a role in communication in two ways. First, as stressed by Lewis (1969), conversants need to have some degree of common knowledge (also referred to as common ground) about the meaning of signals. Second, as noted by Clark and Schaefer (1989), conversants will also try to achieve some degree of common knowledge about the fact that is being communicated, in a process that is known as grounding. Signals may get lost, or may be misunderstood, and conversants will seek confirmation that the communicated fact was understood. Clark and Schaefer (1989) argue that the form taken by the grounding process depends on conversants' grounding criteria, namely on the degree of common ground that the conversants require for their current purposes. My argument in this paper is that game theory is a useful instrument to analyse the effect of conversants' grounding criteria on the form that the grounding process takes. I illustrate this point by introducing several variants of the so-called electronic mail game (Rubinstein 1989), and looking for evolutionarily stable separating equilibria in each of these variants. I modify Rubinstein's (1989) electronic mail game in several ways. First, whereas Rubinstein (1989) assumes that the communication process is automatic and involuntary, following Binmore and Samuelson (2001) I assume that conversants can freely decide whether or not to send signals. Second, Rubinstein (1989) and Binmore and Samuelson (2001) assume that, once the initial signal has been sent, each consecutive signal can only be sent if the immediately preceding signal was received. Signals therefore take the form of unambiguous proofs of receipt. In my model, by contrast, there are no restrictions on when signals can be sent, and signals do not take the form of proofs of receipt. It follows that, if a signal takes on a particular meaning in equilibrium, this is because the signaller finds it a best response only to send the signal in particular circumstances. A by-product of assuming that signals do not take the form of proofs of receipt is that a signal can
take on the meaning both of an acknowledgement of receipt and of a notice of non-receipt. Third, in the standard electronic mail game, it is assumed that each player hates to act by himself, but does not mind the other player acting by himself. Instead, I allow each player to hate acting by himself, or not to mind acting by himself, and to hate the other player acting by himself, or not to mind the other player acting by himself, and this in every possible combination. The chapter is organized as follows. Section 2 describes the game, and limits the analysis to potential separating equilibria of this game that possess certain interesting features. Among these potential separating equilibria, only some turn out actually to be equilibria, and Section 3 links these separating equilibria to Traum's (1994) grounding acts. Section 4 distinguishes several variants of the game depending on which separating equilibria exist. Section 5 lists some potential separating equilibria which have all the interesting features of the separating equilibria that I limit my analysis to, but which turn out not to be evolutionarily stable. Section 6 interprets the results, and suggests directions for future research. Section 7, finally, is a technical section formally deriving the results of this chapter.
2  The model

2.1  The game
Table 8.1: Variants of the electronic mail game (rows: Rowena; columns: Colin; payoffs listed as (Rowena, Colin))

                  Gg (probability 1 − δ)              Gb (probability δ)
                  Go            Do not go             Go            Do not go
Go                (1, 1)        (aR, bC)              (−M, −M)      (−M, bC)
Do not go         (bR, aC)      (0, 0)                (bR, −M)      (0, 0)
Table 8.1 represents a game between two players who may or may not go to the cinema together, namely Rowena and Colin, by means of two payoff matrices. The payoff matrix on the left represents the game Gg that is played when there is a good movie on (which occurs with probability (1 − δ)), the payoff matrix on the right the game Gb that is played when there are
only bad movies on (which occurs with probability δ). Only Rowena knows whether or not there is a good movie on. We first look at the payoffs that are common to both payoff matrices. If neither player goes to the cinema, then each player obtains a payoff of zero, whether or not there is a good movie on. If player i does not go to the cinema, and player j does go to the cinema, then independently of the state of the world player i obtains a payoff of bi for i = R, C, where subscript R (respectively C) denotes that the payoff is Rowena's (respectively Colin's). We next look at the payoffs that differ according to whether or not there is a good movie on. If there is a good movie on and the players go to the cinema together, they both obtain payoff 1. If there is a good movie on and player i goes to the cinema by himself, then player i obtains payoff ai for i = R, C. If there are only bad movies on, and player i goes to the cinema, then he or she obtains payoff −M, whether or not player j also shows up at the cinema. Different versions of the game are now obtained according to the values taken on by the payoffs ai and bi for i = R, C. For simplicity, I assume that the values ai and bi either take on a value of zero, or a value of −M. The game obtained for ai = −M and bi = 0 for i = R, C is the standard electronic mail game (in Morris and Shin's 1997 version). It will suffice for our purposes to make the following assumptions about the relative values of the payoffs, due to Morris and Shin (1997).

M > 1    (8.1)
Assumption (8.1) says that the cost of an outcome that a player dislikes is larger than the benefit from a successful outcome (= both players go and see a good movie).

(1 − δ)bC + δ · 0 > (1 − δ) · 1 + δ(−M)    (8.2)
Assumption (8.2) says that Colin, when not receiving any information from Rowena, strictly prefers not to go to the cinema, and this even in case Rowena goes to the cinema when there is a good movie on. Note that, given that (8.2) must be met for bi = −M, it must be the case that δ > 0.5, meaning that the more likely event is that there are no good movies on. Finally I assume that

(1 − ε) · 1 + εai > (1 − ε)bi + ε · 0    (8.3)
where ε is the probability that a message gets lost. Assumption (8.3) says that a player i who knows that there is a good movie on, who has sent a message, and who knows that player j will go to the cinema when receiving
this message, also prefers to go to the cinema. I limit myself to levels of M such that (8.3) is met in all variants of the game, meaning that I assume it to be always the case that (1 − ε) · 1 + ε(−M) > 0. This assumption follows the literature (Morris and Shin 1997). I assume that players can communicate in the following way. At discrete points of time labelled 1, 2, . . . each player in turn gets the opportunity of sending a single message. This communication process follows Rubinstein (1989), with two exceptions. First, contrary to Rubinstein (1989), and following Binmore and Samuelson (2001), I assume that communication is not automatic, but voluntary, in that each player can freely decide whether or not to send a message. Second, contrary to the literature, I assume that messages sent after the first message do not automatically take the form of proofs of receipt.1 The assumption that each message sent by one player arrives at the other player only with probability (1 − ε) follows Rubinstein (1989). Following Binmore and Samuelson (2001), I add the assumptions that sending a message comes at a cost d, where this cost is incurred each time a new message is sent; and that paying attention comes at a cost c, where the cost of paying attention is incurred before the message exchange, and is a cost incurred for the capacity of receiving messages rather than a cost incurred when actually receiving a message. The concept of equilibrium used in this paper is that of evolutionary stability. As, in asymmetric games, Nash equilibria in mixed strategies cannot be evolutionarily stable (Selten 1980), I do not consider such equilibria. To check whether an equilibrium is evolutionarily stable, I first check whether it is a strict Nash equilibrium, as, in asymmetric games, any such equilibrium is an evolutionarily stable strategy (ESS) (see Benz et al., this volume). Second, if an equilibrium is weak, I check whether one player's weak response has any influence on the other player's best response. If it does not, the weak equilibrium is still evolutionarily stable in that it is part of an evolutionarily stable set (ES set; Thomas 1985).2

2.2  Features of potential equilibria studied
While, as is the case for any signalling game, the game described in Section 2.1 possesses a pooling equilibrium, I limit myself to studying potential separating equilibria, as these are the only cases where grounding takes place.3 In the class of potential separating equilibria, I limit myself to equilibria where Rowena at time 1 sends a message only when there is a good movie on.4 In such potential equilibria, Rowena sends a message at points of time labelled with an odd number, and Colin at points of time labelled with an even number. Also, I limit myself to separating equilibria where there is no
redundancy in the signals, meaning that no more than one message is sent with exactly the same informational content.5,6 Each equilibrium message sent after time 1 must then contain new information. But the only new thing that a player can inform the other player about is whether or not he or she has received an immediately preceding message. It follows that, in the potential separating equilibria that I study, each message sent after time 1 must take the form either of an acknowledgement of receipt (aor) of the immediately preceding message, or of a notice of non-receipt (nonr) of the immediately preceding message. Finally, to keep the analysis tractable, I limit myself to potential equilibria where the total number of messages sent is never larger than three. As my messages do not automatically take the form of proofs of receipt, in an equilibrium where a message takes on the meaning of an aor or a nonr, the sender of an aor (respectively a nonr) is also able to send this message when not receiving (respectively when receiving) an immediately preceding message. I call such a message a false aor (respectively a false nonr). In order to show that a particular equilibrium exists, I must therefore show that, depending on the relevant case, a player prefers not to send a false aor or a false nonr. When a sender sends a false aor at time t, then the receiver will detect this aor to be false if the receiver did not send the message at time (t − 1) that the false aor pretends to be acknowledging.7 The sender's willingness to send a false aor will therefore depend on the receiver's response when detecting the aor to be false. Yet, in equilibrium, the receiver will never actually detect a false aor. Any response by the receiver to a detected false aor is therefore a weak best response. It follows that the appropriate manner to check the evolutionary stability of separating equilibria involving any aors is to check whether they are part of ES sets. When a sender sends a false nonr, the receiver will never directly observe the nonr to be false. Simply, it is always possible that the sender of the nonr did not receive the immediately preceding message.8 It follows that an equilibrium involving only nonrs does not hinge on responses to out-of-equilibrium events. The appropriate manner to check evolutionary stability here is to check whether a potential equilibrium is a strict Nash equilibrium.
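Before turning to the equilibria themselves, the payoff assumptions of Section 2.1 can be checked mechanically. The numbers below are mine and are only meant to satisfy (8.1)-(8.3) in the standard variant ai = −M, bi = 0; nothing in the analysis depends on them.

    # A quick numerical check of assumptions (8.1)-(8.3) for one variant of the game.
    M, delta, eps = 5.0, 0.6, 0.1          # disliked-outcome cost, P(bad movies), message-loss prob.
    a = {'R': -M, 'C': -M}                 # payoff for going alone to a good movie
    b = {'R': 0.0, 'C': 0.0}               # payoff when the other player goes alone

    assert M > 1                                                              # (8.1)
    assert (1 - delta) * b['C'] + delta * 0 > (1 - delta) * 1 + delta * (-M)  # (8.2)
    for i in ('R', 'C'):
        assert (1 - eps) * 1 + eps * a[i] > (1 - eps) * b[i] + eps * 0        # (8.3)
    print("assumptions (8.1)-(8.3) hold for these numbers")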
3  Stable separating equilibria described in terms of Traum's grounding acts
separating equilibria to be investigated. Yet the fact that an aor or a nonr is sent at a certain point of time does not suffice to fully describe a potential separating equilibrium. This is because there are many ways in which the players can condition their decisions to go or not go to the cinema, depending on the receipt or non-receipt of an aor or a nonr, and depending on the information that they possess about whether or not there is a good movie on. Still, as formally shown in Section 7, only five of the considered potential separating equilibria turn out to be evolutionarily stable. Moreover, it turns out that all these equilibria can be described using the grounding acts that Traum (1994) argues any grounding process consists of. Initiate means that the grounding process is started. Acknowledge means claiming understanding of previous material by the other agent. Repair means correcting a misunderstanding. Request for repair means signalling a lack of understanding. Cancel means stopping the grounding process, leaving the content of the initiator's message ungrounded. I now show which combinations of grounding acts the obtained evolutionarily stable separating equilibria consist of, and exemplify each obtained separating equilibrium by means of a fictitious natural language conversation between Rowena and Colin, where I put undetected messages between brackets.

3.1  Initiate

Rowena: There's a good movie at the cinema tonight. Let's go and see it.

Rowena sends a message when there is a good movie on, and goes to the cinema. Colin goes to the cinema only when he receives a message.

3.2  Initiate, acknowledge

Rowena: There's a good movie at the cinema tonight. Let's go and see it.
Colin: OK!

Rowena goes to the cinema only when she receives an aor of her message. Colin only goes to the cinema when receiving a message. Also, when receiving a message, Colin sends an aor.

3.3  Initiate, cancel

Rowena: (There's a good movie at the cinema tonight. Let's go and see it.)
Colin: I didn't hear from you, and so I'm not going to the cinema tonight.

Rowena only goes to the cinema when she does not receive a nonr of her message. Colin only goes to the cinema when receiving a message. Also, when not receiving a message, Colin sends a nonr.
3.4  Initiate, acknowledge, cancel

Rowena: There's a good movie at the cinema tonight. Let's go and see it.
Colin: (OK!)
Rowena: I didn't hear you confirming, and so I'm staying at home.

Rowena goes to the cinema only when she receives an aor of her message. When she does not receive an aor of her message, she sends a nonr. Colin only goes to the cinema when receiving a message and when not receiving a nonr. Also, when receiving a message, Colin sends an aor.

3.5  Initiate, request for repair, repair

Rowena: (There's a good movie at the cinema tonight. Let's go and see it.)
Colin: I didn't hear from you. Are you sure we're not going to the cinema tonight?
Rowena: We are, we are!

Rowena goes to the cinema both if she does not receive a nonr of her message, and if she does. In the latter case, she sends a new message. Colin goes to the movie both when he receives only a first message, and when he receives only a second message. In the latter case, Colin first sent a nonr of the first message.
4  Taxonomy of games according to existing stable separating equilibria
I start by noting that two separating equilibria, namely 3.1 and 3.5, can be part of an ESS independently of whether or not a player hates to go to the cinema by him- or herself, or hates the other player to go to the cinema by him- or herself. This is clear for 3.1, where only a single message is sent. It is also clear for 3.5 (initiate, request for repair, repair), when realizing that 3.5 can be interpreted as an iteration of 3.1: with his request for repair, Colin gives Rowena a second chance to invite him to the cinema. Yet, the separating equilibrium 3.5 requires some additional assumptions about the level of d (the cost of sending a message), in that d should be neither too small nor too large. If d is relatively small, then Rowena could prefer to send a repair message even when she did not receive a request for repair (= false aor). Perhaps Colin’s request for repair got lost, and Rowena can then increase the probability that Colin also goes to the cinema by always sending him a repair. At the same time, if d is relatively large, then Rowena could prefer not to send a first message at all, and could always send a repair. In this case, Rowena knows that Colin sent a request for repair, whether or not she actually receives this request. From Colin’s perspective, d should
also not be too large, as otherwise Colin will find it too costly to send a request for repair when not having received a first message, given the fact that it is then likely that there are no good movies on, and that Rowena therefore is likely not to have invited him to the cinema in the first place. Thus, separating equilibrium 3.5 can only be evolutionarily stable if d takes on an intermediate value. It should be noted as well that, if 3.5 is evolutionarily stable, it will be Pareto superior (better for both players) to 3.1. Simply, when Rowena's initial message does not arrive, 3.5 still makes it possible that Rowena and Colin go and see a good movie together. We now check under which variants of the game in Table 8.1 additional separating equilibria to 3.1 and 3.5 can be evolutionarily stable. The results are summarized in Table 8.2.

4.1  Rowena does not mind going to a good movie by herself (aR = 0): no additional grounding processes
In this class of games, when there is a good movie on, Rowena does not mind being stood up. No additional evolutionarily stable separating equilibria to 3.1 and 3.5 now exist. This is because separating equilibria where Colin sends an acknowledge message (3.2, 3.4) or a cancel message (3.3) at time 2 cannot be evolutionarily stable in this class of games. Imagine that Colin sends an acknowledge (respectively a cancel) message at time 2, and that Rowena pays attention at time 2. Given that Rowena does not mind going to see a good movie by herself, Rowena has nothing to lose by going to the cinema even when she does not receive an acknowledge message from Colin (respectively, she has nothing to lose by going to the cinema even when she receives a cancel message from Colin). Given this fact, she will not pay attention at time 2 in the first place.

4.2  Rowena hates to go to the cinema by herself, even when the movie is good (aR = −M)

4.2.1  Players do not mind the other player going to the cinema by him- or herself (bR = 0, bC = 0): additional grounding process 3.2
In this class of games, players do not care about standing each other up, but Rowena hates being stood up. The separating equilibria where players send cancel messages (3.3, 3.4) cannot be evolutionarily stable because players have no incentive to inform other players that they are not going to the cinema. 3.2, however, can be evolutionarily stable. Let Rowena have decided to pay attention at time 2. When she does not receive an acknowledge message, she does not go to the cinema for two reasons. First, she hates to be stood up, and second, it is not the case that she hates standing up Colin to
Table 8.2: Equilibrium grounding processes in different variants of the game in Table 8.1.

                  aR = 0       aR = −M, bR = 0        aR = −M, bR = −M
bC = 0            3.1, 3.5     3.1, 3.2, 3.5          3.1, 3.5 (+3.4 if aC = −M)
bC = −M           3.1, 3.5     3.1, 3.2, 3.3, 3.5     3.1, 3.3, 3.5 (+3.4 if aC = −M)
such an extent that she goes to the cinema even when not having received a confirmation from him. As a result, Rowena has an incentive to pay attention to acknowledge messages. It follows that, in this class of games, both the separating equilibrium ending in a repair message by Rowena (3.5) and the one ending in an acknowledge message by Colin (3.2) can be evolutionarily stable. It is easy to see now that if 3.5 is not evolutionarily stable, then Rowena prefers 3.2, but Colin prefers 3.1. This is because 3.2 allows Rowena to avoid the risk of going to the cinema by herself. Colin, however, does not mind letting Rowena go to the cinema by herself, and the fact that he needs to confirm Rowena's message does not benefit him. However, if 3.5 is evolutionarily stable, then for sufficiently small d and ε, 3.5 is Pareto superior to both 3.1 and 3.2. This is obvious for Colin. Rowena may also gain compared to 3.2 because she is able to induce coordinated action even when her first message got lost.
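A rough numerical comparison illustrates why, in this class of games, Rowena and Colin rank 3.1 and 3.2 differently. The numbers are invented (with aC set to 0 for concreteness), and I simply charge the attention cost c to whoever must be able to receive a message under the process in question; this bookkeeping is a simplification of the model above.

    # Invented numbers for the variant of Section 4.2.1 (aR = -M, bR = bC = 0, aC = 0).
    M, delta, eps, d, c = 5.0, 0.6, 0.1, 0.05, 0.02
    good = 1 - delta                          # probability that a good movie is on

    # 3.1 (initiate): Rowena sends and goes; Colin listens, and goes iff he receives.
    R_31 = good * ((1 - eps) * 1 + eps * (-M) - d)      # Rowena risks going alone
    C_31 = good * (1 - eps) * 1 - c

    # 3.2 (initiate, acknowledge): Colin confirms (cost d); Rowena listens (cost c)
    # and goes only on a confirmation; Colin still goes whenever he hears Rowena.
    R_32 = good * ((1 - eps) ** 2 * 1 - d) - c          # no risk of going alone
    C_32 = good * (1 - eps) * ((1 - eps) * 1 + eps * 0 - d) - c

    print(round(R_31, 3), round(R_32, 3))   # 0.14 vs 0.284: Rowena prefers 3.2
    print(round(C_31, 3), round(C_32, 3))   # 0.34 vs 0.286: Colin prefers 3.1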
4.2.2  Rowena does not mind Colin going to the cinema by himself; Colin hates it when Rowena goes to the cinema by herself (bR = 0, bC = −M): additional grounding processes 3.2, 3.3
In this class of games, Rowena hates to be stood up and Colin hates to stand up Rowena, but Rowena does not care whether or not she stands up Colin. The only difference with the class of games in Section 4.2.1 is that the separating equilibrium ending in Colin sending a cancel message (3.3) can now be evolutionarily stable, as Colin now hates standing up Rowena, and has an incentive to inform Rowena that he is not going to the cinema; moreover, Rowena hates to be stood up, and therefore has an incentive to pay attention to any cancel messages from Colin. However, if 3.5 is evolutionarily stable, then so is 3.3, and 3.5 is then Pareto superior to 3.3. If 3.3 and 3.5 are not evolutionarily stable, and if Colin hates going to the cinema by himself (aC = −M), it is again the case that Rowena prefers 3.2, while Colin prefers 3.1. In 3.2, while Rowena gains from avoiding the risk of going to the cinema by herself, Colin loses by incurring the loss of going
to the cinema by himself. When aC = 0, however, for sufficiently small d, 3.2 is Pareto superior, as it allows Colin to avoid the risk that Rowena goes to the cinema by herself – while Colin does not mind going to a good movie by himself. If 3.3 and/or 3.5 are part of an ESS, each is Pareto superior to 3.1 and 3.2 for sufficiently small d.
4.2.3 Rowena hates it when Colin goes to the cinema by himself; Colin does not mind Rowena going to the cinema by herself, Colin does not mind going to the cinema by himself (bR = −M, bC = 0): additional grounding process 3.4 if aC = −M
In this class of games, Rowena hates to be stood up even when there is a good movie on, and hates to stand up Colin. Colin does not mind standing up Rowena. The separating equilibrium ending in an acknowledge message by Colin (3.2) cannot be evolutionarily stable because Rowena hates standing up Colin to such an extent that she goes to the cinema even when she does not receive an acknowledge message (assuming that she has decided to pay attention at time 2). It follows that Rowena does not pay attention at time 2 in the first place. The separating equilibrium ending in a cancel message by Colin (3.3) cannot be evolutionarily stable because Colin does not mind standing up Rowena, and therefore has no incentive to inform Rowena that he is not going to the cinema. Though Rowena has an incentive to inform Colin that she is not going to the cinema (bR = −M), when Colin does not mind going to a good movie by himself (aC = 0), the separating equilibrium ending in a cancel message by Rowena (3.4) cannot be evolutionarily stable. This is because Colin then does not mind going to a good movie by himself, and therefore has no incentive to pay attention at time 3. When Colin instead does mind going to the cinema by himself (aC = −M), 3.4 can be evolutionarily stable. When 3.4 cannot be evolutionarily stable (aC = 0), then 3.1 and 3.5 are the only remaining possible equilibrium grounding processes. We have already seen that 3.5 is Pareto superior if it exists. When 3.4 can be evolutionarily stable (aC = −M), then for sufficiently small ε, Colin prefers both 3.5 and 3.1 to 3.4, as 3.4 does not always allow him to go to the cinema after receiving the first message, whereas 3.1 and 3.5 do. Rowena, however, prefers 3.4 to 3.1 (otherwise Rowena would go to the cinema when not receiving a confirmation). It follows that, if 3.5 is not evolutionarily stable, then Colin prefers 3.1 over 3.4, while Rowena prefers 3.4 over 3.1.
4.2.4 Rowena and Colin both hate it when the other player goes to the cinema by him- or herself (bR = −M, bC = −M): additional grounding process 3.3; additional grounding process 3.4 if aC = −M
In this class of games, Rowena and Colin hate to stand each other up. Moreover, Rowena hates to be stood up even when there is a good movie on. The separating equilibrium ending in an acknowledge message by Colin (3.2) cannot be evolutionarily stable for the same reason as given in Section 4.2.3. The separating equilibrium ending in a cancel message by Colin (3.3) can be evolutionarily stable for the same reason as given in Section 4.2.2. The separating equilibrium ending in a cancel message by Rowena (3.4) can again be evolutionarily stable if Colin hates to go to the cinema by himself (aC = −M, see Section 4.2.3). When 3.4 cannot be evolutionarily stable (aC = 0), then 3.5 and/or 3.3 are again Pareto superior to 3.1, if they exist. When 3.4 can be evolutionarily stable (aC = −M), and 3.3 and 3.5 are not evolutionarily stable, 3.4 is Pareto superior to 3.1. In this case, each player hates cases where one player goes to the cinema and the other does not. By means of 3.4, players can eliminate the possibility that Rowena goes to the cinema by herself (at the cost of a small probability that Colin goes to the movie by himself). If 3.3 and/or 3.5 are evolutionarily stable, then for sufficiently small d, they are Pareto superior to 3.4.
5 Potential separating equilibria that are never evolutionarily stable
At least as interesting as the list of evolutionarily stable separating equilibria is the list of potential separating equilibria that turn out not to be evolutionarily stable in any variant of the game. In Section 7, some potential separating equilibria with z = 3 are shown not to be evolutionarily stable. The argument extends to any message exchange where the last three messages take on a certain form, and I therefore state the results in more general terms here.
5.1 Grounding process ending in ‘invitation to go + aor + aor’
Let i be the player who can send the last aor (time z). Such an equilibrium exists if (among others) it is a best response for player i not to send a message at time z when not having received a message at time (z − 1) (i.e., not to send a false aor). When player j did not send a message at time (z − 1) (because player j did not receive a message at time (z − 2)), if j still receives a message at time z, he or she will detect i's aor to be false. It now suffices that player j punishes a detected false aor at time z by not going, and then it will indeed be a
best response for player i not to send a false aor at time z. This is a weak best response for j, as in equilibrium, j will never actually detect a false aor. However, there is no particular reason why j would punish a detected false aor. An alternative weak best response for j is to go when detecting a false aor. But this again makes it a best response for i to send a false aor, thus destabilizing the equilibrium. It follows that this type of separating equilibrium, while it may be a weak Nash equilibrium, is not part of an ES set. This result contrasts with Binmore and Samuelson (2001), who find a whole range of evolutionarily stable separating equilibria involving only aors. The reason that I obtain a different result is because my messages are not proofs of receipt, whereas in Binmore and Samuelson they are. The problem of false aors does not pose itself in the model of these authors.9
5.2 Grounding process ending in ‘invitation to go + aor + nonr (= repeat invitation)’
In such a message exchange, at time (z − 2), player i invites player j to go to the cinema; at time (z − 1), player j acknowledges receipt of player i’s invitation; finally, at time z, player i notifies player j if she did not receive a message at time (z − 1). The problem with this type of message exchange is that player j does not have any incentive to send an aor, as player i goes to the cinema whether or not player j sends an aor. As shown above, the message exchange can still take this form if the nonr instead cancels the grounding process.
5.3 Grounding process ending in ‘invitation to go + nonr + aor (= confirm not going)’
In such a message exchange, at time (z − 2), player i invites player j to go to the cinema; at time (z − 1), player j notifies non-receipt of player i's invitation; finally, at time z, player i acknowledges receipt of player j's notice of non-receipt. Player j goes when not receiving an aor at time z, and does not go otherwise. The problem with such a message exchange is that player i does not have any incentive to send an aor at time z after having received a nonr at time (z − 1), as i can still induce coordinated action by not sending an aor. As shown above, the message exchange can still take this form if the aor instead repairs the grounding process.
5.4 Grounding processes ending in the sequence ‘invitation to go + nonr + nonr’
In such a message exchange, at time (z − 2), player i invites player j to go to the cinema; at time (z − 1), player j notifies player i when player j did not receive a message at time (z − 2); finally, at time z, player i notifies player j if she did not receive a message at time (z − 1). Player j goes to the cinema when receiving both messages from player i, and also goes when not receiving the first message from player i but still receiving the second message. The problem with this message exchange is that player i has an incentive to send a false nonr at time z when receiving a nonr at time (z − 1), in order to still achieve coordinated action. It should be noted here that, because of the manner in which noise has been modelled, false nonrs can never be detected, and therefore cannot be punished.
6 Interpretation and directions for future research
This chapter can be seen as an initial exploration into a game-theoretic treatment of grounding, as it involves the simplest game that one could possibly present to examine a grounding process. Still, I hope that this suffices to convince the reader of the relevance of game theory for a better understanding of grounding. The building blocks of the different equilibrium grounding processes that I obtain can be interpreted in terms of Traum's (1994) grounding acts. Moreover, the fact that I find that different grounding processes exist for different variants of the game can be given a straightforward verbal intuition, in that the players' grounding criteria (Clark and Schaefer 1989) differ in these different variants of the game. As a subject for future research, the basic game in Table 8.1 may be modified in several ways. First, the noise in my game takes on the form that no message may be received even though one was sent. One could also carry out the analysis for the case where noise takes the form that a message may be perceived even though none was sent. Second, the model only involves errors of detection, and not errors of discrimination between several messages. A model with errors of discrimination can be obtained by assuming that each player sends a different message for each possible state, or by assuming that the informed player observes more than two states of nature (so that more than one message needs to be sent). Third, both players may have private information. For instance, it may not be certain that Colin feels like going to the cinema, even when there is a good movie on. The grounding process could then start with Colin asking Rowena whether there is a good movie on, or with Rowena asking Colin whether he feels like going to see a good movie. These extensions may lead to rationales for grounding processes unaccounted for in my present basic analysis.
7 Technical section
This section consecutively lists all possible combinations of initial messages, aors, and nonrs, with 1 ≤ z ≤ 3. Rowena is denoted by R, and Colin by C. The state where there is a good movie on is denoted by g, and the state where there are no good movies on by b. For each of these potential separating equilibria, I first show that, if such a separating equilibrium exists, then players must condition their cinema-going decisions in a particular way depending on their knowledge about the state that occurs, and depending on whether or not they receive messages ("Form"). I then check whether such a potential equilibrium is a strict Nash equilibrium, which suffices to show that it is evolutionarily stable. If the potential equilibrium instead is not a strict Nash equilibrium, I check whether it is still part of an ES set. For potential equilibria that are neither strict Nash equilibria nor part of an ES set, a proof of non-existence is given ("Proof of non-existence"). For potential equilibria that are strict Nash equilibria, or are part of ES sets, existence is shown in the following way. First I show that, given that C follows the strategy described in the potential equilibrium, it is a best response for R in state g ("R's response in state g") and R in state b ("R's response in state b") to follow the potential equilibrium. I then show that, given that R follows the strategy described in the potential equilibrium, it is a best response for C to follow the potential equilibrium ("C's response"). Finally, I conclude by listing the additional conditions to assumptions (8.1) to (8.3) that are necessary for the existence of such an equilibrium ("Additional conditions"). I formally calculate the players' action decisions (going or not going), denoted as A-equations, and their signalling decisions, denoted as S-equations. The attention-paying decisions are not formally calculated. If attention-paying costs are sufficiently small, then a player i who has decided to pay attention to all the messages that player j can send in equilibrium, and who acts differently depending on whether or not he receives these messages, will also prefer to pay attention to these messages. The fact that player i acts differently depending on whether or not he receives messages shows that he values the information contained in them; if he values the information contained in the messages, for sufficiently small attention-paying costs, he pays attention to these messages. At the same time, the fact that attention costs are positive assures that no player prefers to send any messages after time z. As no player is paying attention to such messages, no player sends them; as no player sends them, no player pays attention to them.
7.1 z = 1 (= grounding process 3.1)
Form. In this potential equilibrium, R sends a message in state g. C must act differently depending on whether or not he receives the message, otherwise R will not send any message. It follows that C should go when receiving a message, and should not go when not receiving a message. R’s response in state g. Player R, after having sent a message in state g, strictly prefers to go given that assumption (8.1) is met for i = R. Next, let player R not send a message in state g. Then she at least weakly prefers not to go given that it is met for i = R that (A1)
0 ≥ ai
Player R therefore strictly prefers to send a message in state g if it is met for i = R that (S1)
(1 − ε) · 1 + εai − d > 0
R’s response in state b. Player R, after not having sent a message in state b, strictly prefers not to go as (A2)
0 > −M
If player R does send a message in state b, she strictly prefers not to go given that it is met for i = R that (A3)
(1 − ε)bi + ε · 0 > −M
Player R therefore strictly prefers not to send a message in state b given that it is met for i = R that (S2)
0 > (1 − ε)bi + ε · 0 − d
C’s response. Player C, when receiving a message, strictly prefers to go given that it is met for j = C that (A4)
1 > bj
When not receiving a message, player C strictly prefers not to go if it is met that
(A5) (1 − δ)εbC + δ · 0 > (1 − δ)ε · 1 + δ(−M)
It can be checked that (A5) is met given assumption (8.2). Additional conditions. The only condition for existence we need in addition to assumptions (8.1) to (8.3) is therefore (S1), which is met for sufficiently small ε, and sufficiently small d.
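The conditions for grounding process 3.1 are simple enough to evaluate numerically. The following minimal sketch in Python plugs illustrative parameter values into (A1)–(A5), (S1) and (S2); the specific values of ε, δ, d and M are my own assumptions, not values given in the chapter, which only requires ε and d to be sufficiently small.

```python
# Illustrative check of the conditions for grounding process 3.1 (Section 7.1).
# Parameter values are assumptions chosen only for illustration.
eps   = 0.05   # probability that a sent message is lost
delta = 0.7    # probability of state b (no good movie on)
d     = 0.01   # cost of sending a message
M     = 10.0   # size of the large loss -M
aR = bR = bC = 0.0   # one admissible payoff variant

checks = {
    "(A1) 0 >= aR":  0 >= aR,
    "(S1)":          (1 - eps) * 1 + eps * aR - d > 0,
    "(A2) 0 > -M":   0 > -M,
    "(A3)":          (1 - eps) * bR + eps * 0 > -M,
    "(S2)":          0 > (1 - eps) * bR + eps * 0 - d,
    "(A4) 1 > bC":   1 > bC,
    "(A5)":          (1 - delta) * eps * bC + delta * 0
                     > (1 - delta) * eps * 1 + delta * (-M),
}
for label, holds in checks.items():
    print(label, holds)   # all conditions hold for these assumed values
```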
7.2 z = 2
7.2.1 Message + aor: aR = −M, bR = 0 (= grounding process 3.2)
Form. In this potential equilibrium, R sends a message only in state g. By the same argument as for 7.1, C must go when receiving a message, and not go otherwise. It follows that R must go when receiving an aor, and not go otherwise. R’s response in state g. When having sent a message in state g, and received an aor, R strictly prefers to go given that (A4) is met for j = R. When having sent a message in state g, and not received an aor, R strictly prefers not to go if it is met for i = R that (A6)
(1 − ε)εbi + ε · 0 > (1 − ε)ε · 1 + εai
It is immediately clear that (A6) cannot be met for ai = 0 for i = R. Moreover, it can be checked that (A6) is incompatible with the assumption (1 − ε) + ε(−M) > 0 for ai = bi = −M for i = R.10 It follows that (A6) is only possible if ai = −M, bi = 0 for i = R; (A6) is indeed met for these payoff levels given assumption (8.1). When not having sent a message in state g (and necessarily not received an aor), R strictly prefers not to go, given that (A1) is strictly met for ai = −M. It follows that R strictly prefers to send a message in state g if it is met for i = R that
(1 − ε) [(1 − ε) + εbi ] + ε · 0 − d > 0
R’s response in state b. In state b, if R has decided not to send a message, in equilibrium she will not get a message from C; R then strictly prefers not to go, as going always yields her payoff −M , and not going always yields her payoff zero (as bR = 0). The event of detecting a false aor after not having sent a message is an out-of-equilibrium event, and therefore any response by R is a weak best response. If R instead has decided to send a message in state b, as not going always yields her a higher payoff than going in state b, she will decide not to go. It follows that R strictly prefers not to send a message in state b.
C’s response. When having received a message and after having sent an aor, C strictly prefers to go given assumption (8.1). When having received a message but not sent an aor, C at least weakly prefers given that (A1) is met for i = C. He strictly prefers to send a message when receiving a message if S1 is met for i = C. When not having received a message, and after not having sent an aor, C strictly prefers not to go given that (A7)
(1 − δ)ε · 0 + δ · 0 > (1 − δ)εaC + δ(−M )
When not having received a message, and after still having sent an aor, under the assumption that R does not go when detecting a false aor, C strictly prefers not to go if it is met that
(A8) (1 − δ)ε[(1 − ε)bC + ε · 0] + δ · 0 > (1 − δ)ε[(1 − ε) + εaC] + δ(−M)
It can be checked that (A8) is met given assumption (8.2). In general, (A8) is met whatever R's response to a detected false aor. It follows that C strictly prefers not to send a message when not receiving a message, given that
(S4) [(1 − δ)ε/(δ + (1 − δ)ε)] · 0 + [δ/(δ + (1 − δ)ε)] · 0 > [(1 − δ)ε/(δ + (1 − δ)ε)] · [(1 − ε)bC + ε · 0] + [δ/(δ + (1 − δ)ε)] · 0 − d
Additional conditions. The only condition for existence that we need in addition to assumptions (8.1) to (8.3) is therefore again (S1), which is met for sufficiently small ε and sufficiently small d. Given that (A8) is met whatever R's response to a detected false aor, this equilibrium is part of an ES set.
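As a small illustration of why (A6) pins down the payoff variant of this subsection, the sketch below evaluates (A6) for the three configurations discussed above. The values of ε and M are assumptions chosen so that (1 − ε) + ε(−M) > 0, i.e. so that assumption (8.3) is satisfied.

```python
# Condition (A6) from Section 7.2.1 under three payoff configurations.
# eps and M are assumed illustrative values.
eps, M = 0.05, 10.0

def A6(a_i, b_i):
    # (A6): (1-eps)*eps*b_i + eps*0 > (1-eps)*eps*1 + eps*a_i
    return (1 - eps) * eps * b_i + eps * 0 > (1 - eps) * eps * 1 + eps * a_i

print("(8.3):", (1 - eps) + eps * (-M) > 0)   # True for these values
print("a_R = 0,  b_R = 0 :", A6(0.0, 0.0))    # False: (A6) cannot hold
print("a_R = -M, b_R = -M:", A6(-M, -M))      # False whenever (8.3) holds
print("a_R = -M, b_R = 0 :", A6(-M, 0.0))     # True: the case of process 3.2
```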
7.2.2 Message + nonr: aR = −M, bC = −M (= grounding process 3.3)
Form. In this potential equilibrium, R sends a message only in state g. By the same argument as for 7.1, C must go when receiving a message, and not go otherwise. It follows that R must not go when receiving a nonr, and go otherwise. R’s response in state g. When having sent a message in state g and received a nonr, or when not having sent a message in state g, R at least weakly prefers not to go given that (A1) is met for i = R. However, if aR = 0, R is equally well off if she goes whether or not she receives a nonr. It follows that for any positive costs of paying attention to messages, R will not pay
attention to the nonr if aR = 0. Therefore, this type of equilibrium can only exist if aR = −M . When having sent a message in state g, and not received a nonr, R strictly prefers to go if (A9)
(1 − ε) · 1 + ε²ai > (1 − ε)bi + ε · 0
Clearly (A9) is met given that assumption (8.3) must be met. It follows that R strictly prefers to send a message in state g if it is met for i = R that (S5)
(1 − ε) · 1 + ε [(1 − ε) · 0 + εai ] − d > 0
R’s response in state b. In state b, if R did not send a message, R knows that C sent a nonr, whether or not she receives one. She strictly prefers not to go given that (A2) is met. If R did send a message in state b and receives a nonr, she strictly prefers not to go for the same reason. If R did send a message in state b and does not receive a nonr, she strictly prefers not to go given that (A10)
(1 − ε)bi + ε² · 0 > (1 − ε)(−M) + ε²(−M)
It follows that R strictly prefers not to send a message in state b given that (S2) is met for i = R. C's response. If C does not receive a message and does not send a nonr, he strictly prefers not to go given that (A5) is met. If C does send a nonr when not receiving a message, then C strictly prefers not to go if
(A11) (1 − δ)ε[(1 − ε) · 0 + εbC] + δ · 0 > (1 − δ)ε[(1 − ε)aC + ε · 1] + δ(−M)
It is easy to see that (A11) is met given assumption (8.2). It follows that C strictly prefers to send a nonr when not receiving a message if
(S6) [(1 − δ)ε/(δ + (1 − δ)ε)] · [(1 − ε) · 0 + εbC] + [δ/(δ + (1 − δ)ε)] · 0 − d > [(1 − δ)ε/(δ + (1 − δ)ε)] · bC + [δ/(δ + (1 − δ)ε)] · 0
This is only possible if bC = −M . If C receives a message and does not send a nonr, C strictly prefers to go given that (A4) is met for j = C. If C receives a message and does send a nonr, C strictly prefers not to go if it is met for j = C that
(A12) (1 − ε) · 0 + εbj > (1 − ε)aj + ε · 1
and prefers to go otherwise. Each side of (A12), with d subtracted from it, is smaller than 1. It follows that C strictly prefers not to send a nonr when receiving a message, whatever C's choice of action when he has sent a nonr anyway. Additional conditions. The only condition for existence that we need in addition to assumptions (8.1) to (8.3) is therefore (S6). (S6) is met if ε and (1 − δ) are not too small and if d is not too large. As all responses are strict best responses, this equilibrium is an ESS.
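Condition (S6) lends itself to the same kind of numerical illustration. In the sketch below, all parameter values are assumptions chosen for illustration (with bC = −M as the text requires): (S6) holds for the baseline values but fails when ε or (1 − δ) is made very small, or when d is made large.

```python
# Condition (S6) from Section 7.2.2, evaluated for bC = -M.
# All parameter values are illustrative assumptions.
M = 10.0
bC = -M

def S6(eps, delta, d):
    p_g = (1 - delta) * eps / (delta + (1 - delta) * eps)  # P(state g | no message received)
    p_b = delta / (delta + (1 - delta) * eps)              # P(state b | no message received)
    send    = p_g * ((1 - eps) * 0 + eps * bC) + p_b * 0 - d
    no_send = p_g * bC + p_b * 0
    return send > no_send

print(S6(0.05, 0.7, 0.01))    # True: baseline values
print(S6(0.001, 0.7, 0.01))   # False: eps too small
print(S6(0.05, 0.999, 0.01))  # False: (1 - delta) too small
print(S6(0.05, 0.7, 2.0))     # False: d too large
```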
7.3 z = 3
7.3.1 Message + aor + aor: not part of an ES set
Form. In this potential equilibrium, R sends a message in state g, C sends an aor when receiving the message, and R in turn sends an aor of the aor. C must act differently depending on whether or not he receives a message at time 3, otherwise R does not have any incentive to send an aor at time 3. As C can only receive an aor in state g, C should go when receiving an aor, and should not go otherwise. As R should not send an aor at time 3 when not receiving a message at time 2, and as C does not go when not receiving a message at time 3, R should not go when not receiving a message at time 2. Finally, as C should not send an aor when not receiving a message at time 1, he will also not receive a message at time 3, and will not go. It follows that, in a potential equilibrium of this form, each player only goes when receiving all messages that he or she can possibly receive in equilibrium. Proof of non-existence. A potential equilibrium of this form can only exist if (S1) is met for i = R. This is because R in state g, after having sent a message, received an aor, and sent an aor, should strictly prefer sending a message and going to not sending a message and not going. Moreover, for a potential equilibrium of this form to exist, when R in state g sends a message at time 1 and does not receive an aor at time 2, R is supposed not to send an aor at time 3, and is supposed not to go. Alternatively, R may send a false aor, pretending to have received C's message at time 2, and still go. R should strictly prefer not to choose this alternative. This is the case if
(S7) [(1 − ε)ε/(ε + (1 − ε)ε)] · 0 + [ε/(ε + (1 − ε)ε)] · 0 > [(1 − ε)ε/(ε + (1 − ε)ε)] · [(1 − ε) · 1 + εai] + [ε/(ε + (1 − ε)ε)] · [(1 − ε)x + εai] − d
where the value taken by x depends on C’s response when detecting R’s aor to be false.
When R sends a false aor at time 3, C will only detect R's aor to be false if C did not receive a message at time 1 (and therefore did not send an aor at time 2). If C still goes when detecting a false aor, then payoff x in (S7) equals 1, and (S7) cannot be met, because it is incompatible with (S1). It can be shown now that an equilibrium can still exist if C does not go when detecting an aor to be false. Also, such a response by C to a detected false aor is necessarily a weak best response, as false aors will never be detected in equilibrium. However, by the fact that C's weak best response matters, an equilibrium of this form cannot be part of an ES set.
7.3.2 Message + aor + nonr: aR = bR = bC = −M (= grounding process 3.4)
Form. In this potential equilibrium, R sends a message in state g, C sends a message when receiving this message, and R in at least one of the states sends a message when not receiving C’s aor. R should act differently (i.e., go when receiving an aor and not go otherwise, or vice versa) depending on whether or not she receives an aor. Otherwise, C does not have any incentive to send an aor at time 2. C also does not have any incentive to send an aor at time 2 if R does not go when receiving an aor. It follows that R should go when receiving an aor, and should not go otherwise. As R does not go when not receiving an aor, and as C should not send an aor when not receiving a message, C should not go when not receiving a message. It follows that R does not have any incentive to send a nonr in state b. R in state b always at least weakly prefers not to go. The only reason R would send a nonr in state b is to prevent C from going. But R in state b has already assured that C will not go by not sending a message in state b. It follows that R should only send a nonr in state g (and after not having received an aor). As R does not go when not receiving an aor, C should not go when receiving a nonr. As R only has an incentive to send a nonr when C acts differently depending on whether or not he receives a nonr, C should therefore go when having received a message, sent an aor, and not received a nonr. R’s response in state g. When having sent a message in state g, not received an aor, and after having sent a nonr, R strictly prefers not to go if it is met for i = R that (A13) (1 − ε)ε [(1 − ε) · 0 + εbi ] + ε · 0 > (1 − ε)ε [(1 − ε)ai + ε · 1] + εai
This is not possible if ai = aR = 0. However, it can be checked that (A13) is always met for ai = bi = −M.
When having sent a message in state g, not received an aor, and after not having sent a nonr, R strictly prefers not to go if (A6) is met for i = R, and prefers to go otherwise. As it must be the case that aR = −M, (A6) is met if bi = bR = 0, and (A6) is not met if bi = bR = −M. For bi = bR = 0, it is met that
(S8) [(1 − ε)ε/(ε + (1 − ε)ε)] · [(1 − ε) · 0 + εbi] + [ε/(ε + (1 − ε)ε)] · 0 − d < [(1 − ε)ε/(ε + (1 − ε)ε)] · bi + [ε/(ε + (1 − ε)ε)] · 0
and R strictly prefers not to send a nonr when not having received an aor at time 2. It follows that this equilibrium cannot exist for bR = 0. For bi = bR = −M and ai = aR = −M, R strictly prefers to send a nonr at time 3 if
(S9) [(1 − ε)ε/(ε + (1 − ε)ε)] · [(1 − ε) · 0 + εbi] + [ε/(ε + (1 − ε)ε)] · 0 − d > [(1 − ε)ε/(ε + (1 − ε)ε)] · 1 + [ε/(ε + (1 − ε)ε)] · ai
Thus, in case bR = −M, this equilibrium can still exist. When having sent a message in state g, received an aor, and after having sent a nonr, R strictly prefers not to go if (A12) is met for i = R, and prefers to go otherwise. When having sent a message in state g, received an aor, and after not having sent a nonr, R strictly prefers to go given that (A4) is met. Each side of (A12), with d subtracted from it, is smaller than 1, so that R strictly prefers not to send a nonr when receiving a message. If R in state g does not send a message at time 1, then C does not go. Therefore, if R does not send a message at time 1, R should not go given that (A1) is strictly met for ai = aR = −M; also, as C will not go anyway, R strictly prefers not to send a nonr. R in state g strictly prefers to send a message at time 1 if it is met for i = R that
(S10) (1 − ε){(1 − ε) · 1 + ε[(1 − ε) · 0 + εbi − d]} + ε[(1 − ε) · 0 + ε · 0 − d] − d > 0
R's response in state b. In state b, if R does not send a message, and does not receive an aor, then R knows that C will not go, and strictly prefers not to go by (A2). R therefore does not have any incentive to send a nonr in this case. In state b, by not sending a message at time 1, and not going, R assures herself a payoff of 0. Sending a costly message at time 1 instead will lead R to at least sometimes obtain payoff bR = −M. It follows that R does
not send a message at time 1. Given that not sending a message at time 1 suffices to induce C not to go, R in state b also does not have any incentive to send a nonr at time 3. In equilibrium, R in state b will never detect a false aor by C; any response to a detected false aor is therefore a best response. C's response. When having received a message at time 1, sent an aor, and received a nonr, C at least weakly prefers not to go given that (A1) is met for i = C. When having received a message at time 1, sent an aor, and not received a nonr, C strictly prefers to go given that (A9) is met for i = C. For aC = 0, C therefore at least weakly prefers to go, whether or not he receives a nonr. It follows that in this case, C is better off when not incurring the cost of paying attention to a nonr at time 3. Therefore, this potential equilibrium can only exist when aC = −M. When having received a message at time 1, and not sent an aor, then C knows that R sent a nonr, whether or not C receives a nonr. C again at least weakly prefers not to go given (A1). It follows that C strictly prefers to send an aor when receiving a message at time 1, given that (S5) is met. When not having received a message at time 1, and sent a false aor, C's decisions depend on R's response to a detected false aor (state b). The best that could happen to C, when he has sent a false aor, is that R in state b at time 3 sends the same message that she uses in state g as a nonr. In this manner, R could still inform C that state b occurs, so that C runs less risk when sending a false aor. Let C then send a false aor, and not receive a nonr. C now strictly prefers not to go given that
(A14) δ[ε · 0 + (1 − ε)εx] + (1 − δ)ε[ε²(−M) + (1 − ε)bC] > δ[ε + (1 − ε)ε](−M) + (1 − δ)ε[ε²(−M) + (1 − ε) · 1]
Given assumption (8.2), (A14) can be checked to be met even when x = bC = −M (C dislikes R going by herself, and R goes when detecting a false aor). Given that C does not prefer to go even in this best possible case, where R responds to a detected false aor in the way most favourable to C, it follows that C will never prefer to go after having sent a false aor. Given that C can simply obtain payoff zero by not sending a false aor (and then not acting), C will therefore never prefer to send a false aor, whatever R's response to a false aor. It follows that this potential weak equilibrium is still part of an ES set when it exists. Additional conditions. The conditions for existence that we need in addition to assumptions (8.1) to (8.3) are therefore (S9) and (S10). From (S9), it follows that ε should not be too small. Evidently, d should be sufficiently small. As C prefers not to send a false aor whatever R's response when detecting a false aor, this equilibrium is part of an ES set.
7.3.3 Message + nonr + aor (= grounding process 3.5)
Form. In this potential equilibrium, R sends a message in state g, C sends a nonr when not receiving this message, and R in at least one of the states sends a message when receiving C's nonr. Given that R sends a message in state g, C should go when receiving a message, and R should go when not receiving a nonr. In state b, R knows that C sent a nonr at time 2, whether or not R receives a nonr. It follows that, if R sends a message at time 3 in state b, R will send a message whether or not she receives a nonr. But then, the message sent at time 3 is not an aor. Alternatively, R could always send a message at time 3 in state b, and in state g only send a message at time 3 when having received a nonr at time 2. The message at time 3 then at least sometimes acknowledges receipt of a message at time 2. However, when C receives an aor in this case, the probability that state b occurs equals δ/[δ + (1 − δ)ε(1 − ε)] > δ. By assumption (8.2), it follows that C strictly prefers not to go. Given that C should act differently depending on whether or not he receives an aor, C should then go when not receiving an aor. But R in state g does not have any incentive to send an aor when receiving a nonr. The only remaining possibility therefore is that R only sends an aor in state g. Consider then the case where R only sends an aor in state g. Again, C should act differently depending on whether or not he receives this aor. Consider the case where C does not go when receiving an aor. Then in state g, when receiving a nonr, R does not have any incentive to send an aor. It follows that C (after having sent a nonr at time 2) should go when receiving an aor at time 3, and should not go otherwise. R's response in state g. Player R, after having sent a message in state g, received a nonr, and sent an aor, strictly prefers to go given that assumption (8.3) is met for i = R. In the same situation, when not having sent an aor, R at least weakly prefers not to go, given that (A1) is met for i = R. It follows that R, after having sent a message in state g and received a nonr, sends an aor if (S1) is met for i = R. These action and signalling decisions are identical for a player R who has decided not to send a message in state g, and this independently of whether or not she has received a nonr. This is because in the candidate equilibrium, when player R does not send a message at time 1, she knows with certainty that player C sends a nonr. Player R, after having sent a message in state g, not received a nonr, and not sent an aor, strictly prefers to go given that (A9) is met for i = R. After having sent a message in state g, not received a nonr, and having sent a false aor, R strictly prefers to go if it is met for i = R that
(A15) (1 − ε)[(1 − ε)x + ε · 1] + ε²[(1 − ε) · 1 + εai] > (1 − ε)[(1 − ε)y + εbi] + ε²[(1 − ε)bi + ε · 0]
where x and y depend on the response of C when detecting a false aor. When not getting a nonr at time 2, R strictly prefers not to send a false aor at time 3 if it is met for i = R that
(S11) [(1 − ε)/((1 − ε) + ε²)] · 1 + [ε²/((1 − ε) + ε²)] · ai > [(1 − ε)/((1 − ε) + ε²)] · [(1 − ε)x + ε · 1] + [ε²/((1 − ε) + ε²)] · [(1 − ε) · 1 + εai] − d
This can be met even if x = 1, meaning that C does not punish a detected false aor, so that R prefers to go after having sent a false aor. If R strictly prefers not to send a false aor even when C reacts in the best possible way to a detected false aor, then R strictly prefers not to send a false aor whatever C's response. It follows that the potential equilibrium can still be part of an ES set in this case. R in state g strictly prefers to send a message at time 1 (followed by an aor when a nonr is received) to not sending a message at time 1 (and immediately sending a message at time 3, independently of whether or not she receives a nonr at time 2) if it is met for i = R that
(S12) (1 − ε) · 1 + ε{(1 − ε)[(1 − ε) · 1 + εai − d] + εai} − d > (1 − ε) · 1 + εai − d
It is easy to check that (S12) is met whenever (S1) is met. R's response in state b. In state b, if R did not send a message, R knows that C sent a nonr, whether or not she receives one. If R did not send a message at time 1, did or did not receive a nonr, and did not send an aor at time 3, R strictly prefers not to go given (A2); in the same circumstances, if R did send an aor at time 3, she strictly prefers not to go given that (A3) is met for i = R. As R, after not having sent a message in state b at time 1, strictly prefers not to go whether or not she sends a message at time 3, she strictly prefers not to send a message at time 3. If R in state b sends a message at time 1, receives a nonr at time 2, and does not send a message at time 3, R strictly prefers not to go given (A2); in the same circumstances, if R did send an aor at time 3, she strictly prefers not to go given that (A3) is met for i = R. As R, after having sent a message in state b at time 1 and after having received a nonr at time 2, strictly prefers not to go whether or not she sends a message at time 3, she strictly prefers not to send a message at time 3.
If R in state b sends a message at time 1, does not receive a nonr at time 2, and does not send a message at time 3, R strictly prefers not to go by (A10); in the same circumstances, if R does send a message at time 3, R strictly prefers not to go if it is met for i = R that
(A16) (1 − ε)[(1 − ε)x + εbi] + ε²[(1 − ε)bi + ε · 0] > (1 − ε)(−M) + ε²(−M)
where x depends on C's response when detecting R's message at time 3 to be a false aor. (A16) is clearly met whatever this response. It follows that R, after having sent a message in state b, and not received a nonr, strictly prefers not to go whether or not she sends a message at time 3. Thus, R will strictly prefer not to send a message at time 3. Finally, as R strictly prefers not to go whether or not she has sent a message at time 1, R in state b strictly prefers not to send a message at time 1. C's response. Whether or not C receives a message at time 1, when he sends a nonr and receives an aor, C strictly prefers to go given that (A4) is met for j = C. When C does not receive a message at time 1, sends a nonr, and does not receive an aor, C strictly prefers not to go if it is met for j = C that
(A17) (1 − δ)[ε² + (1 − ε)ε²]bj + δ · 0 > (1 − δ)[ε² + (1 − ε)ε²] · 1 + δ(−M)
Clearly, this is met given assumption (8.2). When not receiving a message at time 1, and when not having sent a nonr, C strictly prefers not to go given that (A5) is met. It follows that, when not receiving a message at time 1, C strictly prefers to send a nonr if it is met for j = C that
(S13) [δ/(δ + (1 − δ)ε)] · 0 + [(1 − δ)ε/(δ + (1 − δ)ε)] · {(1 − ε)[(1 − ε) · 1 + εbC] + εbC} − d > [δ/(δ + (1 − δ)ε)] · 0 + [(1 − δ)ε/(δ + (1 − δ)ε)] · bC
Finally, as C never detects a false aor in the potential equilibrium, any response to a detected false aor is a weak best response. Additional conditions. The conditions for existence we need in addition to assumptions (8.1) to (8.3) are therefore (S1) and (S13). Moreover, for the weak equilibrium to be part of an ES set, (S11) should be met for x = 1. It can be checked that (S1) and (S11) are compatible if ε²(1 − ε)(1 − ai)/[(1 − ε) + ε²] < (1 − ε) + εai.
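Under the reconstruction of (S11) above (with x = 1), compatibility with (S1) amounts to a lower and an upper bound on d. The short sketch below illustrates this numerically; the parameter values are assumptions chosen for illustration and satisfy (1 − ε) + ε(−M) > 0, i.e. assumption (8.3).

```python
# Compatibility of (S1) and (S11) at x = 1 (Section 7.3.3): (S11) gives a
# lower bound on d and (S1) an upper bound. Values are illustrative assumptions.
eps, M = 0.1, 5.0
a_i = -M

lower = eps**2 * (1 - eps) * (1 - a_i) / ((1 - eps) + eps**2)  # from (S11) with x = 1
upper = (1 - eps) * 1 + eps * a_i                              # from (S1)
print(round(lower, 4), round(upper, 4), lower < upper)         # compatible iff lower < upper
```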
7.3.4 Message + nonr + nonr: does not exist
Form. In this potential equilibrium, R sends a message in state g, C sends a nonr when not receiving this message, and R in at least one of the states sends a nonr when not receiving C's nonr. In state b, R knows that C sent a nonr at time 2, whether or not R receives a nonr. R's decision whether or not to send a nonr at time 3 will thus not depend on whether or not she receives a nonr at time 2. It follows that R should not send a nonr exclusively in state b, as the message sent at time 3 is not then a nonr. Alternatively, let R always send a message at time 3 in state b, and in state g only send a message at time 3 when not receiving a message at time 2. The message at time 3 then at least sometimes notifies non-receipt of a message at time 2. However, when C receives a nonr in this case, the probability that state b occurs equals δ/[δ + (1 − δ)ε²] > δ. By assumption (8.2), it follows that C strictly prefers not to go. Given that C should act differently depending on whether or not he receives a nonr at time 3, C should then go when not receiving a nonr at time 3. But then in state g, when R does not receive a message at time 2, she does not have any incentive to send a nonr. The only remaining possibility therefore is that R only sends a nonr in state g. Proof of non-existence. Consider then the case where R only sends a nonr in state g. Again, C should act differently depending on whether or not he receives this nonr. Consider the case where C (after having sent a nonr) does not go when receiving a nonr, and goes otherwise. Then in state g, when not receiving a nonr at time 2, R does not have any incentive to send a nonr. The remaining possibility is that C (after having sent a nonr) goes when receiving a nonr, and does not go otherwise. Then in state g, when receiving a nonr at time 2, R will still send a nonr at time 3, in order to still induce coordinated action. It follows that this type of equilibrium does not exist.
Notes
1. In the literature on the electronic mail game, the players only communicate by means of so-called special-purpose acknowledgements, e.g. by echoing, or paraphrasing the other player's last message (Takeoka and Shimojima 2002), thus providing unambiguous proof that the preceding message was received. In my analysis, players instead communicate by means of general-purpose acknowledgements, such as 'OK!' or 'Uh huh.' (ibid.).
2. In an ES set, while drift may cause a player to choose an alternative weak best response, the other players' strategies continue to be best responses under this alternative weak best response. The set of weak equilibria involving all possible weak best responses can thus be seen as an evolutionarily stable set.
3. Admittedly, I show the evolutionary stability of certain separating equilibria, but I do not show how these could evolve from an initial pooling equilibrium.
4. Moreover, it is easily seen that the pooling equilibrium in the game in Table 8.1 is risk-dominant. Still, secret handshakes can lead to the evolution of risk-dominated, Pareto superior equilibria (Robson 1990).
5. There is a rationale for such a limitation in terms of Horn's rule (see van Rooij 2004). By assumption (8.2), the most likely event is that there are no good movies on. As sending a message is costly, it is efficient only to send a message in the more unlikely event. Parikh (2000) argues that the very fact that equilibria that meet Horn's rule are efficient will cause them to be played, in that they are focal points. Van Rooij (2004) provides an evolutionary argument in favor of Horn's rule.
6. One way in which there could be redundancy in the signals has already been eliminated by assumption, as I have assumed that each player can only send a single message at each point of time. Still, potential separating equilibria involving redundancy can now still exist, e.g. if Rowena in state g sends a message at several odd-numbered times, while Colin remains silent. Chwe (1995) argues that redundancy and reconfirmation are the two ways in which players can deal with unreliable communication. Chwe considers redundancy in the context of the electronic mail game. Clearly, for sufficiently small d, redundancy is efficient. On the other hand, for sufficiently large d, redundancy will not be efficient, or even not be possible in equilibrium. Consider for instance the separating equilibrium described below in 3.5. It is possible to find levels of ε and d such that there is no equilibrium where Rowena unilaterally sends a message at times 1 and 3 when there is a good movie on. However, Rowena may still want to repeat her invitation if Colin tells her that he did not receive her first invitation, as is the case in 3.5. Finally, note that even if redundancy were efficient, this would not change my results much. Simply, in a modified model where each player could send several messages at each point of time, there could be redundancy in each of the messages sent in the grounding processes listed in Section 3. But this does not change the character of these grounding processes.
7. Put otherwise, if I say 'OK' even though you did not say anything to me, then you know that I am cheating (under the assumption that I can never mistakenly hear you saying something when in fact you did not say anything).
8. Put otherwise, if I do not say 'OK' even though you told me something, then you do not know that I am cheating, as it is possible that I simply did not hear that you were saying something.
9. Consider separating equilibria consisting only of aors, and denote by z the total number of messages sent in equilibrium. Elsewhere (de Jaegher 2004), I have shown that the equilibrium with z = 1 is an ESS, that equilibria with z = 2 are part of an ES set, and that equilibria with z > 2 are not part of ES sets. Also, I have shown that equilibria with z = 3 do not meet the forward induction criterion, and that equilibria with z > 3 are not sequential equilibria. Finally, I have shown that the equilibria with z ≤ 2 are accessible through replicator dynamics from equilibria with z > 2.
10. As pointed out by a referee, there are levels of M such that for ai = bi = −M, (A6) is compatible with (8.3). My reason for assuming that (8.3) must be met in all variants of the game (including the case found in the literature on the electronic mail game where ai = −M and bi = 0) is that this makes it easier to label the several variants of the game in Table 8.2. However, it should be stressed that the taxonomy in Table 8.2 is maintained even if one does not assume that (8.3) must be met in all variants of the game. The only effect is that the conditions under which certain grounding processes are possible become more difficult to state.
References
Benz, A., G. Jäger, and R. van Rooij (2005). An introduction to game theory for linguists. This volume.
Binmore, K. and L. Samuelson (2001). Coordinated action in the electronic mail game. Games and Economic Behavior, pp. 6–30.
Chwe, M. S.-Y. (1995). Strategic reliability of communication networks. Working paper, University of Chicago, Department of Economics.
Clark, H. H. and E. F. Schaefer (1989). Contributing to discourse. Cognitive Science, 13, 259–94.
Halpern, J. Y. and Y. Moses (1990). Knowledge and common knowledge in a distributed environment. Journal of the ACM, 37, 549–87.
de Jaegher, K. (2004). Efficient communication in the electronic mail game. Working paper, Vrije Universiteit Brussel, Department of Economics.
Lewis, D. (1969). Convention. Harvard University Press, Cambridge, MA.
Morris, S. and H. S. Shin (1997). Approximate common knowledge and coordination: recent lessons from game theory. Journal of Logic, Language and Information, 6, 171–90.
Parikh, P. (2000). Communication, meaning and interpretation. Linguistics and Philosophy, 23, 185–212.
Robson, A. (1990). Efficiency in evolutionary games: Darwin, Nash and the secret handshake. Journal of Theoretical Biology, pp. 379–96.
van Rooij, R. (2004). Signalling games select Horn strategies. Linguistics and Philosophy, 27, 493–527.
Rubinstein, A. (1989). The electronic mail game: strategic behavior under almost common knowledge. American Economic Review, 79, 385–91.
Selten, R. (1980). A note on evolutionarily stable strategies in asymmetric animal contests. Journal of Theoretical Biology, 84, 93–101.
Takeoka, A. and A. Shimojima (2002). Grounding styles of aged dyads: an exploratory study. In Proceedings of the 3rd SIGdial Workshop on Discourse and Dialogue, pp. 188–95. University of Pennsylvania, Philadelphia.
Thomas, B. (1985). On evolutionarily stable sets. Journal of Mathematical Biology, 22, 105–115.
Traum, D. R. (1994). A Computational Theory of Grounding in Natural Language Conversation. Ph.D. thesis, University of Rochester, New York.
9 A Game Theoretic Approach to the Pragmatics of Debate: An Expository Note
Jacob Glazer and Ariel Rubinstein
1 The Logic of Debates
In this paper, the term ‘debate’ refers to a situation in which two parties disagree over some issue and each of them tries to persuade a third party, the listener, to adopt his position by raising arguments in his favor. We are interested in the logic behind the relative strength of the arguments and counterarguments; we therefore limit our discussion to debates in which one side is asked to argue first, with the other party having the right to respond before the listener makes up his mind. Casual observation tells us that when arguing in a debate, people often mean more than the literal content of their statements and that the meaning we attach to arguments in debates is dictated by some implicit rules particular to debates. Grice initiated the analysis of the logic by which we interpret utterances in a conversation. His fundamental assumption is that the maxims which guide people in interpreting an utterance in a conversation are derived from the ‘cooperative principle’. The cooperative principle is probably valid for interpreting utterances in a conversation but it makes no sense in the context of a debate. In a debate, the listener does not know whether he and a debater who raises an argument have the same or opposing interests. The listener must consider the possibility that the debater’s intention is to manipulate him. Consider, for example, the following scenario presented to a group of students at Tel Aviv University: Alice, Michael and you are good friends. You and Michael tend to disagree over most things and you always find yourself trying to persuade Alice that your opinion is the right one. Imagine that the three of you attend the performance of a new band. You and Michael disagree as to whether it is a good band.
Michael says: ‘This is a good band. I am especially impressed by the guy standing on the very left’. Now is the time for you to make a quick reply. The following two arguments come to mind but you can only choose one of them:
A: ‘Yes, but the guy standing right next to him was terrible.’
B: ‘Yes, but the one standing fourth from the left was terrible.’
Which of the two arguments would you choose in order to persuade Alice?
About 66% of the subjects thought that A is more persuasive (the results were not sensitive to the order in which the two counterarguments appeared). This example appears to confirm that the power of a counterargument goes beyond its literal content. Raising a counterargument that is ‘distant’, in some natural sense, from the original argument is interpreted as an admission that the debater could not raise a ‘closer’ counterargument to support his view. Thus, for example, countering an argument about the musician standing on the very left by an argument about the musician standing fourth from the left will likely be interpreted by Alice as an admission that the musicians standing between these two are good, thus providing additional evidence in favor of Michael. We observed similar phenomena in two other experiments discussed in Glazer and Rubinstein (2001). Having to respond to an argument regarding Bangkok, an argument regarding Manila was considered more persuasive than an argument regarding Brussels. Having to counter an argument regarding Monday, an argument regarding Tuesday was considered more persuasive than an argument regarding Thursday. In other words, a statement made in a debate will have a different meaning than it would have had in a conversation. Though the fact that Manila is closer to Bangkok than Brussels is irrelevant to the substance of the debate, it is relevant to the evaluation of the evidence regarding Manila and Brussels as potential counterarguments to evidence about Bangkok. Our purpose is not to provide a general theory of the pragmatics of debating but to provide a possible rationale for the phenomenon that the strength of a counterargument depends on the argument it counters. Specifically, we study the listener’s optimization problem when his aim is to adopt persuasion rules that minimize the probability that he will reach the wrong conclusion from the arguments. We study this optimization under a constraint on the amount of information the listener can process. We show that the phenomenon in which the ‘strength’ of the evidence which is brought
as a counterargument depends on the initial argument is not necessarily a rhetorical fallacy but may be consistent with optimal persuasion rules.
2 Our approach
Our approach to investigating the logic of debating has several basic components:
1 We view a debate as a mechanism by which a decision maker (the listener) extracts information from two parties (the debaters). The right decision, from the point of view of the listener, depends on the realization of several bits of information. The relevant information is fully known to both debaters but not to the listener. The debaters have opposing interests regarding the decision to be made. During the debate the debaters raise arguments to support their respective positions in the form of providing hard evidence about the realization of some of the bits of information. On the basis of these arguments, the listener reaches a conclusion.
2 The listener bases his decision on what he infers from the arguments made in the course of the debate. The listener might make inferences beyond what follows directly from the information contained in the hard evidence presented by the parties. The pragmatics of debating is the set of rules which guide the listener in interpreting the arguments made by the debaters beyond their literal content.
3 The interpretation of the utterances and evidence is something embedded in the mind of the listener and not a matter of decision.
4 The two debaters view the debate as a two player game. The listener is not a player in the game. The actions in the game are the arguments the debaters can raise. The outcome of the game reflects the debaters’ common understanding of how the listener will interpret their arguments in reaching a conclusion.
5 The pragmatics of debating are viewed as if chosen by a fictitious designer. The designer is aware of the fact that the rules he chooses will determine the game that will be played by the debaters and therefore that the listener’s conclusion will depend indirectly on the rules the designer chooses. The designer aims to maximize the probability that the listener will reach the same conclusion he would have, had he known all the information known to the debaters.
6 The designer is constrained by physical limitations such as the difficulty in processing information, the length of the debate and the cost of bringing hard evidence.
In sum, we apply an economic approach to the pragmatics of debating. Our study is a part of a research program associating phenomena in language with a rational choice of a mechanism, given physical constraints (see also Rubinstein (1996) discussing a different issue within linguistics).
3 The model
Following economic tradition, we demonstrate our approach using a simple model that is not meant to be realistic.
3.1 The basic scenario
In our model there are three participants, debater 1, debater 2 and a listener. In the final stage of the debate, the listener chooses between two actions, O1 and O2. The ‘correct’ action, from the listener’s point of view, depends on the state of the world, hereafter referred to as a ‘state’. A state is characterized by the realization of five aspects, numbered 1, . . . , 5. Each aspect i receives one of two values, 1 or 2, with the interpretation that if its value is 1, aspect i supports the action O1 and if its value is 2 it supports O2. The profile of realizations of all five aspects is a state. For example, (1, 1, 2, 2, 1) is the state in which aspects 1, 2 and 5 support O1 and aspects 3 and 4 support O2. Each aspect receives the value 1 or 2 with probability 0.5. The realizations of the five aspects are assumed to be independent. The listener does not know the state. Had he known the state he would have chosen the action supported by a majority of aspects (e.g., O1 at state (1, 1, 2, 2, 1)). The debaters, on the other hand, have full information about the state. The listener, therefore, needs to elicit information from the debaters. However, the debaters do not necessarily wish to give him the missing information since debater 1 prefers the action O1 and debater 2 prefers O2 independently of the state. A debate is a process (mechanism) by which the listener tries to elicit information from the two debaters about the true state.
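As a minimal sketch of this setup, the following Python fragment enumerates the 32 states, applies the listener's majority criterion, and lists which aspects support each action in the example state (1, 1, 2, 2, 1). The helper names are my own, not the authors'.

```python
# The state space of Section 3.1: five binary aspects, each supporting
# O1 (value 1) or O2 (value 2); the listener's correct action is the one
# supported by a majority of aspects.
from itertools import product

states = list(product((1, 2), repeat=5))            # 32 equally likely states

def correct_action(state):
    return "O1" if state.count(1) >= 3 else "O2"    # majority of the five aspects

state = (1, 1, 2, 2, 1)
print(len(states))                                  # 32
print(correct_action(state))                        # O1, as in the text's example
print([i + 1 for i, v in enumerate(state) if v == 1])  # aspects supporting O1: [1, 2, 5]
print([i + 1 for i, v in enumerate(state) if v == 2])  # aspects supporting O2: [3, 4]
```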
3.2 Debate
A debate procedure specifies the order in which the debaters speak and the sort of arguments they are allowed to raise. We restrict our attention to sequential debates in which the two parties raise arguments one after the other. Thus, debater 1 moves first and debater 2 moves second after listening to debater 1’s arguments. We do not allow debater 1 to counter the
counterargument of debater 2 and we do not allow simultaneous argumentation (see Glazer and Rubinstein (2001) for a discussion of other debate procedures). We assume that a debater can only raise arguments regarding aspects that support his position. Denote by arg(i) the argument: ‘the realization of aspect i supports my position’. For example, in state (1, 1, 2, 2, 1), debater 1 can raise one of the three arguments arg(1), arg(2), and arg(5) while debater 2 can counterargue with arg(3) and arg(4).
A few assumptions are implicit in this debate model: Debaters cannot make any move other than raising arguments; they cannot shout, curse or whatever. More importantly, a debater cannot lie. Thus, for example, a debater cannot claim that the value of an aspect is 1 unless that is in fact the case. The motivation of this assumption can either be that debaters simply hate to lie (or are afraid to get caught lying) or that making an argument that aspect 4 has the value 1 requires more than just making the utterance ‘aspect 4 has the value 1’ and requires providing hard evidence that proves this beyond any doubt. Third, a debater cannot raise arguments that support the outcome preferred by the other debater.
We now come to another key feature of our approach which concerns the complexity of the debate. We assume that each debater is allowed to raise at most one argument, i.e., to present the realization of at most one aspect. This assumption is especially realistic for debates in which an argument requires more than simply making a short statement. In a court case, for example, parties actually have to bring witnesses. In a scientific debate, debaters may have to provide an explanation as to why their findings support a particular hypothesis. Thus, arguments in a real life debate may require scarce resources such as time or the listener’s attention. Of course, if there were no restrictions on the complexity of the debate, the listener could simply ask one of the debaters to present three arguments, ruling in his favor if he is able to fulfill this task. This would allow the listener to elicit enough information so that he would always make the correct decision. However, we take the view that debate rules are influenced by the existence of constraints on the complexity of the debate. This approach is in line with our understanding that Grice’s logic of conversations is dictated, among other things, by the fact that there are limits on the ability of the conversers to communicate and process information.
3.3 Persuasion rules and the debate game
The focus of our analysis is the concept of persuasion rules. A persuasion rule determines whether the action O1 or O2 is taken, given the events that took place in the debate. It reflects the way that the listener interprets debater
1’s argument and debater 2’s counterargument. We assume that if a debater does not raise any argument in his turn, he loses the debate. Formally, denote by Λ the set of feasible arguments. A persuasion rule F is a function which assigns to every element in Λ a subset (possibly empty) of arguments in Λ. The meaning of λ2 ∈ F(λ1) is that if debater 1 makes the argument λ1, then debater 2’s counterargument λ2 persuades the listener to take the action O2. A persuasion rule induces, for every state, a two-stage extensive game with perfect information: In the first stage, debater 1 chooses one of the arguments that support his claim (if such an argument exists; otherwise the game degenerates to the outcome O2). In the second stage debater 2 chooses an argument from those that support his claim (if such an argument exists; otherwise the game degenerates to debater 1’s choice of an argument followed by the outcome O1). Following any possible pair of arguments, the action to be taken (O1 or O2) is determined by the persuasion rule. With respect to preferences, debater 1 always prefers O1 and debater 2 always prefers O2. Note that the listener is not modeled as a player. At this stage we assume that the listener simply follows his ‘instincts’ about what to infer from the players’ moves. The rules of pragmatics are embedded in those instincts. Example: To clarify our construction, consider the following persuasion rule: the listener is persuaded by debater 2 if and only if (i) debater 1 is silent, or (ii) debater 1 presents arg(5), or (iii) debater 1 presents arg(i) (i = 1, 2, 3, 4) and debater 2 counterargues with arg(i + 1). At the state (1, 2, 1, 2, 2), for example, this persuasion rule induces the game in Figure 9.1.
Figure 9.1: The two-stage game induced by this persuasion rule at the state (1, 2, 1, 2, 2).
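The induced game can be made concrete with a small sketch, again our own illustration rather than the authors' formalism: induced_outcome solves the two-stage game by letting debater 1 anticipate debater 2's best reply, and example_rule encodes the persuasion rule of the example (clause (ii) is encoded, as an assumption of ours, by letting any available counterargument defeat arg(5)).

def induced_outcome(state, F):
    """Backward-induction outcome of the two-stage game induced by the persuasion
    rule F at the given state; F maps debater 1's argument (an aspect index) to
    the set of counterarguments that win for debater 2."""
    args1 = [i for i, v in enumerate(state, start=1) if v == 1]   # aspects supporting O1
    args2 = [i for i, v in enumerate(state, start=1) if v == 2]   # aspects supporting O2
    if not args1:                          # debater 1 stays silent and loses
        return "O2"
    for a in args1:                        # debater 1 anticipates debater 2's reply
        if not any(c in F(a) for c in args2):
            return "O1"                    # no winning counterargument is available
    return "O2"

def example_rule(a):
    # clause (iii): arg(i) is defeated by arg(i+1); clause (ii): arg(5) is always defeated
    return {1, 2, 3, 4, 5} if a == 5 else {a + 1}

print(induced_outcome((1, 2, 1, 2, 2), example_rule))    # O2 at the state of Figure 9.1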
3.4 The designer’s problem
For any state s, and for every persuasion rule, the induced game between the two debaters is a zero-sum game Γ(s) with perfect information. Adopting the standard game theoretic view that players play rationally and that player 1 anticipates player 2’s move, we can use the concept of Nash equilibrium. Such games have a unique Nash equilibrium outcome and thus we can talk about ‘the outcome of the game Γ(s)’. If the listener’s correct action at state s is not the outcome of the game Γ(s), we say that the persuasion rule induces a mistake at state s. To clarify our terminology, consider the persuasion rule described in the above example. Debater 1 can persuade the listener to take the action O1 only in a state with two consecutive aspects i and i + 1 supporting him. In such states debater 1 is able to raise the argument arg(i) without debater 2 being able to rebuff him with arg(i + 1). Therefore, in any of the 4 states (1, 1, 2, 2, 2), (2, 1, 1, 2, 2), (2, 2, 1, 1, 2), (2, 2, 2, 1, 1) debater 1 will win the debate although he should have lost, whereas in the state (1, 2, 1, 2, 1) he will lose the debate although he should have won. In any other state the outcome of the induced game is the correct one. Thus, the number of mistakes is 5 and the probability of a mistake induced by this persuasion rule is 5/32. We finally come to the designer’s problem. The designer seeks a persuasion rule that minimizes the probability of a mistake. Given our assumptions this is equivalent to finding a persuasion rule with the smallest number of states in which the outcome of the game induced by the rule is the wrong one. Note that this problem reflects the assumption that all mistakes are of equal significance. In addition, it makes no differentiation between the weight put on ruling O2 in the state (1, 1, 1, 1, 1), where all aspects support the decision O1, and the weight put on ruling O2 in the state (1, 1, 1, 2, 2), where the realization of the aspects only marginally supports O1. Intuitively we often feel that the former mistake is ‘bigger’ than the latter.
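The 5/32 computation can be reproduced by brute force; the sketch below reuses the hypothetical helpers STATES, correct_action and induced_outcome introduced above.

def mistakes(F):
    """States at which the persuasion rule F induces the wrong outcome."""
    return [s for s in STATES if induced_outcome(s, F) != correct_action(s)]

wrong = mistakes(example_rule)
print(len(wrong), len(wrong) / 32)    # 5 mistakes, i.e. a mistake probability of 5/32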
4 Analysis
We will now investigate optimal persuasion rules, namely those that minimize the number of mistakes. Claim 1: The minimal number of mistakes induced by a persuasion rule is three. Proof: Consider the persuasion rule in Table 9.1 below. This persuasion rule induces three mistakes. Two mistakes are in favor of debater 1: in the state (1, 1, 2, 2, 2), if debater 1 raises arg(1) debater 2 does not have a winning counterargument, and in the state (2, 2, 1, 1, 2), debater 2 does not have a counterargument to arg(3).
Table 9.1
If debater 1 argues for . . .    . . . debater 2 wins if and only if he counterargues with
1                                2
2                                3 or 5
3                                4
4                                2 or 5
5                                1 or 4
There is also one mistake in favor of debater 2, in the state (1, 2, 1, 2, 1), since whatever debater 1 argues, debater 2 has an appropriate counterargument. We shall now show that any persuasion rule induces at least three mistakes. Fix a persuasion rule. Following any move by debater 1, there is a set of counterarguments available to debater 2 that will persuade the listener to select O2. Debater 1 can win the debate by raising arg(i) only if all those aspects that, according to the persuasion rule, debater 2 could possibly counterargue with and win, are in favor of debater 1. Thus, a persuasion rule is characterized by a set E of at most five sets of aspects such that debater 1 can win the debate in state s if and only if the set of aspects that support him in state s contains a set in E. If any of the sets in E is a singleton {i}, raising arg(i) would be sufficient to persuade the listener to decide in favor of debater 1, inducing at least 5 mistakes (in the states where the aspects supporting debater 1 are only {i} or {i, j} for some j ≠ i). Denote by S3 the set of 10 states with exactly 3 aspects supporting debater 1 (such as (1, 1, 1, 2, 2)). Any set in E that consists of two aspects only induces one mistake in the state where only these two aspects support debater 1. If there is only one set in E that contains exactly two aspects, then there are at most 7 states in S3 in which debater 1 can win the induced game (in three states in S3 these two aspects support 1, and there are at most four further states in which the set of aspects supporting 1 contains another element of E), and thus there are at least three mistakes. Suppose that E contains precisely two sets of two aspects. There are at most 6 sets in S3 that contain one of these two sets of aspects. Thus, there must be at least one element of S3 for which the set of aspects supporting 1 does not contain any set in E and, in that state, debater 1 cannot make his case; together with the two mistakes induced by the two-aspect sets themselves, the number of mistakes must be at least 3. (If E contains no two-aspect set at all, debater 1 can win in at most five of the ten states in S3, and if E contains three or more two-aspect sets, each of them already induces a mistake.) Comment: The above persuasion rule is not the only optimal persuasion rule. See Glazer and Rubinstein (2001) for another persuasion rule which is not isomorphic to this one.
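For readers who want to check the count, the rule of Table 9.1 can be encoded with the same hypothetical helpers as above and verified by exhaustive enumeration:

TABLE_9_1 = {1: {2}, 2: {3, 5}, 3: {4}, 4: {2, 5}, 5: {1, 4}}

wrong = mistakes(lambda a: TABLE_9_1[a])
print(len(wrong))       # 3
print(sorted(wrong))    # the three states named in the proof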
At this point, it is of interest to discuss what we call the Debate Consistency (DC) Principle: for any pair of arguments x and y, either y is a persuasive counterargument to x or x is a persuasive counterargument to y but not both. This principle is violated by the optimal persuasion rule described in the proof of Claim 1. If debater 1 argues arg(1) and debater 2 counterargues arg(3), debater 1 wins and if debater 1 argues arg(3) and debater 2 counterargues arg(1), debater 1 also wins. This violation is not a coincidence: Claim 2: Any optimal persuasion rule violates the DC Principle. Proof: By the proof of Claim 1, the set E associated with an optimal persuasion rule does not contain any set of size greater than 3 and contains no more than three sets of size 3. Thus, the number of two-element sets, {x, y}, which are subsets of a set in E, cannot exceed 8 and, hence, there must be two aspects, x and y, such that neither arg(x) counterargues arg(y) nor arg(y) counterargues arg(x). Thus, the logic of debate mechanisms may be quite subtle and the relative strength of arguments may depend on the order in which they are raised even in the absence of informational dependencies between these arguments.
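A brute-force check of the DC Principle for the Table 9.1 rule is equally short; this is our own illustrative snippet, not part of the original analysis.

from itertools import combinations

def dc_violations(rule):
    """Pairs {x, y} that violate DC: neither argument counters the other, or both do."""
    bad = []
    for x, y in combinations(range(1, 6), 2):
        forward, backward = y in rule[x], x in rule[y]
        if forward == backward:          # 'neither' or 'both' violates the principle
            bad.append((x, y))
    return bad

print(dc_violations(TABLE_9_1))    # includes (1, 3), the pair discussed in the text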
5 Ex-post optimality
An important feature of our discussion so far has been that the listener’s optimization is carried out ex-ante, that is he chooses the persuasion rule before the debate starts. The listener is committed to follow the persuasion rule. Given an argument and a counterargument, the persuasion rule ‘dictates’ to the listener which action he should take without leaving him the liberty to deviate from the action he is supposed to take according to the persuasion rule. This kind of commitment induces the debaters to argue in a way that minimizes the probability that the listener makes the wrong decision. It is possible, however, that a persuasion rule is ex-ante optimal but is not ex-post optimal: given a course of events in the debate, the listener might make inferences which would lead him not to follow the persuasion rule he has originally announced. If the listener figures out the debaters’ strategies in the induced game at every state, it may happen that a certain combination of an argument and a counterargument is more likely to take place in states where the correct outcome is O1 but the persuasion rule assigns the outcome O2 for that combination. It could happen that even though a certain persuasion rule was ex-ante optimal, the listener would find the action that he is supposed to take according to the persuasion rule ex post (i.e., after the
debaters have spoken and after he has updated his beliefs given his knowledge of the debaters’ strategies in the different states) suboptimal and, hence, he would want to deviate from the persuasion rule. In this section we examine the question whether the (ex-ante) optimal persuasion rule described in the proof of Claim 1 above is also optimal ex post. Think about the debate as a four-stage (Bayesian) game. In the first stage, nature chooses the state and the debaters, though not the listener, are informed of nature’s choice. In the second and third stages the debaters sequentially make arguments (within the above constraints) and in the fourth stage the new player, namely the listener, chooses an outcome. The listener prefers the correct outcome over the incorrect one. In other words, the listener’s strategy is equivalent to the choice of a persuasion rule and his objective in the game is to maximize the probability of making the correct decision. Does this game have a sequential equilibrium in which the listener follows the optimal persuasion rule? One can show that the persuasion rule specified in Claim 1 is indeed a part of the following sequential equilibrium:

• Debater 1’s strategy is to raise the first argument, if such an argument exists, for which debater 2 does not have a persuasive counterargument. Otherwise, debater 1 chooses the first argument in his favor.

• Debater 2’s strategy is to respond with the first successful counterargument, whenever such an argument exists. Otherwise, he raises the first argument in his favor.

• The listener chooses the outcome according to the persuasion rule described in the proof of Claim 1.

The full proof that these three strategies are part of a sequential equilibrium involves dealing with a large number of cases. We will make do with demonstrating the main idea. Assume that debater 1 raises arg(1) and debater 2 responds with arg(3). Given the above strategies, the play of the game is consistent with the four states (1, 1, 2, x, y), where x = 1, 2 and y = 1, 2. Given that in three of these states the majority of the aspects support debater 1, the listener’s decision O1 is ex post optimal. If debater 2 responds to arg(1) with either arg(4) or arg(5), given the above strategies, the listener should conclude that aspects 2 and 3 are in favor of debater 1 and, therefore, it is ex post optimal for the listener to choose O1.
If debater 2 responds with arg(2), the listener must conclude that, in addition to aspect 2, at least one aspect in {3, 4} and one aspect in {4, 5} are in favor of debater 2. There are five states that are consistent with the above conclusion. In only one of these, (1, 2, 1, 2, 1), debater 1 should win. Thus, the probability that the correct outcome is O1 is 0.2 and the listener’s plan to choose O2 is also ex post optimal. Note that the four-stage debate game has some other sequential equilibria as well, one of which is particularly natural: Debater 1 raises arg(i), where i is the first aspect whose realization supports debater 1. Debater 2 responds with arg(j), where j is the next aspect after i whose realization supports debater 2, if such an argument exists; otherwise, he responds with the first argument which is in his favor. The listener’s strategy is guided by the following logic: in equilibrium, debater 1 is supposed to raise the first argument in his favor. If he raises arg(i), then the listener believes that aspects 1, 2, ..., i − 1 are in favor of debater 2. If debater 2 raises arg(j), the listener believes that the realizations of aspects i + 1, ..., j − 1 are in favor of debater 1. The listener chooses O1 if the number of aspects that he believes to support debater 1 is at least the same as the number of those that he believes support debater 2. This equilibrium induces seven mistakes. For example, in the state (1, 1, 2, 2, 2) debater 1 will start with arg(1) and debater 2 will respond with arg(3), inducing the listener to correctly believe that aspect 2 supports debater 1 and to wrongly choose O1.
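The posterior calculations above can be reproduced mechanically. The sketch below (our own, built on the hypothetical helpers and TABLE_9_1 from earlier) encodes the two equilibrium strategies used with the rule of Claim 1 and computes the listener's posterior belief that O1 is correct after a given argument and counterargument.

def debater1_move(state, rule):
    args1 = [i for i, v in enumerate(state, start=1) if v == 1]
    args2 = {i for i, v in enumerate(state, start=1) if v == 2}
    safe = [a for a in args1 if not (rule[a] & args2)]       # arguments with no winning reply
    return safe[0] if safe else (args1[0] if args1 else None)

def debater2_move(state, rule, a1):
    args2 = [i for i, v in enumerate(state, start=1) if v == 2]
    winning = [a for a in args2 if a in rule[a1]]
    return winning[0] if winning else (args2[0] if args2 else None)

def posterior_O1(a1, a2, rule):
    consistent = [s for s in STATES
                  if debater1_move(s, rule) == a1 and debater2_move(s, rule, a1) == a2]
    return sum(correct_action(s) == "O1" for s in consistent) / len(consistent)

print(posterior_O1(1, 3, TABLE_9_1))   # 0.75: three of the four consistent states favor O1
print(posterior_O1(1, 2, TABLE_9_1))   # 0.2: only (1, 2, 1, 2, 1) favors O1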
6 Language
Even though the persuasion rule described in the proof of Claim 1 is optimal, it does not have an intuitive interpretation and cannot be described using natural language. We expect real-life debate rules to be easy to understand and, in particular, that they can be described in terms available within their context. In this section we wish to demonstrate, using a number of examples, how the vocabulary available to the fictitious designer affects the optimal persuasion rules. In all the examples, the group of aspects is a set of five individuals I = {1, 2, 3, 4, 5}.

1 A partition with two cells

Assume that the set I divides naturally into two sets M (males) and F (females). Assume that a debater cannot state an argument of the type arg(i), but can only make one of the following two arguments: arg(M) = ‘the facts regarding a male support my position’ and arg(F) = ‘the facts regarding a female support my position’. Recall that a persuasion rule
assigns to each of the arguments a subset of arguments each of which persuades the listener to take the action O2. Thus, a persuasion rule is a function which assigns to every element in {arg(M), arg(F)} one of the four sets of arguments ∅, {arg(M)}, {arg(F)} or {arg(M), arg(F)}. There are exactly 16 persuasion rules. Consider, for example, the case M = {1, 2, 3} and F = {4, 5}. It is easy to see that within this vocabulary the minimal number of mistakes is 7, which is attained by the following persuasion rule: Debater 2 wins the debate if and only if he counters arg(M) with arg(M) and arg(F) with arg(F). This persuasion rule induces one mistake in favor of debater 1 in the state (2, 2, 2, 1, 1), in which debater 2 does not have a counterargument to arg(F). In the 6 states in which two aspects in M and one in F support debater 1, debater 2 is always able to counterargue successfully although he should lose. Note that the persuasion rule which requires debater 2 to counter arg(M) with arg(F) and arg(F) with arg(M) also yields 7 mistakes (in the six states in which three aspects, two of which are males, support debater 1, and in the state (1, 1, 1, 2, 2)). Now consider the case M = {1} and F = {2, 3, 4, 5}. One optimal persuasion rule is such that debater 1 wins the debate if and only if he argues arg(F) and debater 2 does not counterargue with arg(M). Debater 2 will win, even though he should lose, in the 5 states where at least 3 aspects in F support debater 1 and aspect 1 supports debater 2. Debater 2 will lose the debate, even though he should win, in the four states in which debater 1 is supported by aspect 1 in addition to exactly one of the aspects in F. Another optimal persuasion rule is the one according to which debater 1 wins if and only if he argues arg(M).

2 One single and two couples

Individual 3 is single, whereas ‘1 and 2’ and ‘4 and 5’ are couples. Assume that the debaters cannot refer to any particular couple. Once debater 1 refers to a particular married individual, debater 2 can refer to ‘his partner’ or to ‘an individual from the other couple’. In other words, the persuasion rule cannot distinguish between 1 and 2 and between 4 and 5, or between the couple {1, 2} and the couple {4, 5}. Thus, the set of arguments for debater 1 includes arg(single) = ‘the facts regarding the single individual support my position’ and arg(married) = ‘the facts regarding a married individual support my position’. The only possible counterargument to arg(single) is arg(married). There are three possible counterarguments to arg(married): arg(single), arg
(married to the individual debater 1 referred to) and arg(married but not to the individual to whom debater 1 referred to). One can show that any persuasion rule induces at least 6 mistakes. One optimal persuasion rule is the following: if debater 1 makes the argument arg(single) and debater 2 replies with arg(married), debater 1 loses the debate. If debater 1 makes the argument arg(married), the only successful counterargument is arg(married to the individual debater 1 referred to). Given this rule, debater 1 will win the debate if and only if there is a couple {i, j} such that both i and j support debater 1’s position. Thus, debater 1 will win the debate, although he should lose, in the two states (1, 1, 2, 2, 2) and (2, 2, 2, 1, 1), and will lose the debate, when he should win, in the 4 states where the single individual and two married individuals from distinct couples support his case. Another optimal persuasion rule is to require debater 1 to argue arg(married) and to require debater 2 to counterargue with arg(married but not to the individual to whom debater 1 referred to). In this case all 6 mistakes will be in favor of debater 2.

3 Neighborhoods

Assume that the debaters have in mind a notion of neighborhood but cannot refer to any particular individual. Debater 1 can only make the single argument ‘here is an individual who supports my case’ and debater 2 has two possible counterarguments: arg(neighbor) = ‘the facts about a neighbor of the person debater 1 referred to support O2’ and arg(non-neighbor) = ‘the facts about a non-neighbor of the person debater 1 referred to support O2’. Thus, there are four possible persuasion rules. If the five aspects are located on a line in the order 1, 2, 3, 4, 5, the optimal persuasion rule dictates that in order to win, debater 2 has to counterargue with arg(neighbor). This persuasion rule induces 5 mistakes. Two mistakes are in the states (1, 1, 2, 2, 2) and (2, 2, 2, 1, 1), where debater 1 can win the debate by referring to individuals 1 or 5 respectively although he should lose. In the three states (1, 2, 1, 2, 1), (1, 2, 1, 1, 2) and (2, 1, 1, 2, 1), debater 1 cannot win the debate even though he should, since arg(neighbor) is always available to debater 2.

4 An ordering

Assume that the debaters have in mind an ordering of the aspects. Debater 1 can only make the argument ‘here is an individual who supports my case’ and debater 2 has only two possible counterarguments:
arg(superior)=‘the facts about an individual who is superior to the individual debater 1 referred to, support O2 ’ and arg(inferior)=‘the facts about an individual who is inferior to the person debater 1 referred to, support O2 ’. Thus, there are four possible persuasion rules. The minimal number of mistakes in this example is 10 and is obtained by the persuasion rule which requires debater 2 to counterargue with arg(superior).
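The effect of a restricted vocabulary can also be explored by brute force. The sketch below (ours, reusing the hypothetical STATES and correct_action helpers) covers the first example: with only arg(M) and arg(F) available and M = {1, 2, 3}, F = {4, 5}, it enumerates all sixteen persuasion rules and reports the smallest mistake count.

MALES, FEMALES = {1, 2, 3}, {4, 5}
GROUPS = {"M": MALES, "F": FEMALES}

def outcome_partition(state, rule):
    sup1 = {i for i, v in enumerate(state, start=1) if v == 1}
    sup2 = {i for i, v in enumerate(state, start=1) if v == 2}
    can1 = [g for g in ("M", "F") if GROUPS[g] & sup1]     # arguments open to debater 1
    can2 = {g for g in ("M", "F") if GROUPS[g] & sup2}     # counterarguments open to debater 2
    if not can1:
        return "O2"
    for g in can1:                 # debater 1 looks for an argument debater 2 cannot defeat
        if not (rule[g] & can2):
            return "O1"
    return "O2"

SUBSETS = [set(), {"M"}, {"F"}, {"M", "F"}]
counts = [sum(outcome_partition(s, {"M": rm, "F": rf}) != correct_action(s) for s in STATES)
          for rm in SUBSETS for rf in SUBSETS]
print(min(counts))    # the chapter reports a minimum of 7 mistakes for this case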
7 Related literature
This chapter is based on Glazer and Rubinstein (2001) which presented a game theoretic approach to the logic of debates. Several papers have studied games of persuasion in which one agent tries to persuade another to take a certain action or to accept his position. In particular, see Milgrom and Roberts (1986), Fishman and Hagerty (1990), Shin (1994), Lipman and Seppi (1995) and Glazer and Rubinstein (2004). Several other papers studied cheap talk debates in which two parties attempt to influence the action taken by a third party. In particular, see Austen-Smith (1993), Spector (2000) and Krishna and Morgan (2001). Those readers who are familiar with the implementation literature may wonder about the relation between that approach and ours. In both cases the designer determines a game form and the state determines the particular game played in that state. However, in the standard implementation literature, the state is a profile of preference relations and the game form played in each state is fixed. In our framework, the preference relations are fixed and the game form varies with the state.
References

Austen-Smith, D. (1993). Interested experts and policy advice: Multiple referrals under open rule. Games and Economic Behavior, 5, 3–43.
Fishman, M. J. and K. H. Hagerty (1990). The optimal amount of discretion to allow in disclosures. Quarterly Journal of Economics, 105, 427–44.
Glazer, J. and A. Rubinstein (2001). Debates and decisions: On a rationale of argumentation rules. Games and Economic Behavior, 36, 158–73.
Glazer, J. and A. Rubinstein (2004). On optimal rules of persuasion. Econometrica, 72(6), 1715–36.
Grice, P. (1989). Studies in the Way of Words. Harvard University Press, Cambridge, MA.
Krishna, V. and J. Morgan (2001). A model of expertise. The Quarterly Journal of Economics, 116(2), 747–75.
Lipman, B. L. and D. J. Seppi (1995). Robust inference in communication games with partial provability. Journal of Economic Theory, 66, 370–405.
Milgrom, P. and J. Roberts (1986). Relying on the information of interested parties. Rand Journal of Economics, 17, 18–32.
Rubinstein, A. (1996). Why are certain properties of binary relations relatively more common in natural language? Econometrica, 64, 343–56.
Shin, H. S. (1994). The burden of proof in a game of persuasion. Journal of Economic Theory, 64, 253–64.
Spector, D. (2000). Rational debate and one-dimensional conflict. Quarterly Journal of Economics, 115, 181–200.
10 On the Evolutionary Dynamics of Meaning-Word Associations

Tom Lenaerts and Bart de Vylder
1 Introduction
Over the last few decades there has been a surge of interest in the origin of meaning and language. Computer simulations based on rigorous assumptions regarding the underlying mechanisms of language acquisition have been used to explain both the origin of vocabulary and grammatical constructs. In general the idea is that through social interactions stable language dispositions, like meaning-word associations, spread through a population of agents. The manner in which these interactions occur defines the cultural transmission structure (Cavalli-Sforza and Feldman 1981; Boyd and Richerson 1985). Furthermore, the culturally transmitted artifacts, i.e. meaning-word associations, provide information which allows the agents to alter their internal language knowledge (lexicon). Different mechanisms can again be used and depend partially on the cultural transmission structure on which the model is focussing. We will refer to them as the cultural learning mechanisms (Tomasello et al. 1993). The combination of transmission structure and learning mechanism currently seems to define the category to which a language evolution model belongs. In the literature on language evolution models, one can observe two major categories which can be situated at opposite sides of an entire spectrum of models. The first category assumes a system where language is acquired through the statistical sampling of observations of language interactions. Two examples are the evolutionary language model defined by Martin Nowak et al. (Nowak et al. 1999) and the iterated learning model (ILM) defined by Simon Kirby and James Hurford (Kirby and Hurford 2002). As a consequence of their cultural transmission mechanism, these models are often simulated as standard evolutionary processes. In terms of lexicons, this means that those lexicons that allow for successful communication are better adapted to their environment and will take over the population. By making this assumption, the language transmission process focuses implicitly on a vertical transmission structure where genetic time and cultural time
are the same thing. Boyd and Richerson (1985) explain that in such models of symmetric inheritance, no conflicts between the cultural and biological evolutionary processes will occur. The second category assumes a language acquisition process wherein the listener is not observing some communication but actively participating. The importance of this participation is that the hearer/listener can become speaker in the next round, i.e. his role can change. This role change will have an impact on the way a vocabulary is acquired. Moreover, models belonging to this second category assume that the cultural learning process is functional. Functionality refers to the fact that words are communicated as a consequence of the agents’ personal objective to transmit meaning. The idea is based on the concept of the semiotic cycle (Steels and Baillie 2003), which will be further discussed in Section 2. From a cultural learning perspective this model belongs to the class of operant-conditioning models (Rosenthal and Zimmerman 1978). This cultural learning mechanism has no implications for the kind of transmission structure used in the model, i.e. it only requires that the agents participate actively in the communication. The existing work focuses on horizontal transmission between peers as opposed to vertical transmission between parents and offspring. Yet, the model can easily be extended to situations where the roles are fixed. A difference with the first category is that this model allows for asymmetric inheritance systems since there is a distinction between ‘genetic’ and ‘cultural’ time. In this chapter we focus on a model from the second category and apply it to the pointing-and-naming game (Steels 1996; Tomasello 2003). The goal of this game is to learn a shared set of meaning-word associations. In the following section a general description of the model will be provided and the necessary terminology will be explained. In Section 3, we will outline an evolutionary interpretation of the cultural learning mechanism. The derivation of the model is described in (Lenaerts et al. 2004); here we will focus only on the essential parts. At the end of this section we will also provide some information on the extension of this model toward oblique and vertical transmission (Boyd and Richerson 1985). Afterward, in Section 4, we will further discuss the similarities and differences with the other category of language models. Finally we provide some conclusions and future research directions.
2 A mechanism for origin of meaning and language
The question how an embodied autonomous agent can develop a repertoire of categories for conceptualizing his or her world and how a population of
such agents can build a shared communication system has been the reason for the set-up of an experiment called the ’Talking Heads’ (Steels et al. 2002). One underlying idea of this experiment is that language emerges through self-organization out of local interactions of language users. This advocates the view that language is a complex dynamical system. Another underlying idea is that meaning is not innate either but is built up steadily by each individual while interacting with his or her environment. Meaning is therefore tightly connected to bodily experiences and adapts itself to the demands and characteristics of the environment in which an agent finds himself. In the following sections an overview will be given of the mechanisms which form the background for the kind of models that will be described here.
2.1 The guessing game
In the ’Talking Heads’ experiment, the agents interact with each other by playing language games, more specifically, guessing games. Such a game proceeds as follows: Two agents are chosen randomly from a population. One agent takes the role of speaker, the other one the role of the hearer. In the following games, an agent can take a different role. Next, the speaker determines a context for the game. This context is some part of the environment the agents share, and restricts the possible objects the speaker can talk about. The context is also known by the hearer. Then, the speaker chooses an object from this context, which will be further referred to as the topic. Afterwards, the speaker gives a verbal hint to the hearer, which is an expression that identifies the topic with respect to the other objects in the context. Based on this hint, the hearer tries to guess what topic the speaker has chosen, and communicates his choice to the speaker by pointing to the object. The game succeeds if the topic guessed by the hearer is equal to the topic chosen by the speaker. The game fails if the guess was wrong or either the speaker or the hearer failed at some earlier point in the game. During and after this guessing game, both the speaker and the hearer can update their categorization and language, in order to increase the likelihood of a successful game in the future. At every point in time every agent has a map that defines the association between objects and meanings, and meanings and words. These relations change over time as a result of the language games and the identification of new topics and words. Therefore, the Talking Heads experiment can be used for studying semiotic dynamics. In general, a semiotic triangle, as depicted in Figure 10.1, consists of three entities: a referent, a meaning and an utterance. A referent refers to an entity in the real world.
Figure 10.1: The semiotic triangle describes the relation between a referent – items as they ’are’ –, the meaning of the object – the realization or perception of an item by humans –, and the utterance – the sign – that is used for denoting the perceived object.
A meaning corresponds to a category or a combination of categories, and an utterance is a physical signal transmitted from one agent to another. An interaction between a speaker and a hearer is called a semiotic cycle: starting from an object, the speaker associates a meaning with it. Next, this meaning is expressed as a word, which the hearer interprets again as having a meaning. Finally, this meaning together with the context determines an object. Thus, the semiotic triangles of the speaker and hearer are traversed in opposite directions, starting and ending with an object. In the scope of this chapter we will not address the origin of meaning. It is assumed that all agents share the same, finite set of meanings or categories. What we will investigate is how a population of agents manages to agree upon the names that are used to name these different categories. This can be studied by letting the agents play a somewhat simplified form of the guessing game, which is often referred to as the naming game.
2.2 A naming game
A naming game can be interpreted in two different ways. First, one can assume that in fact there are no categories involved, and that the agents have to agree upon the names they use for the different objects in the environment, resulting in proper names for these objects. In this interpretation, it is natural for an agent to point to an object. This allows a hearer to learn
directly the word/meaning association the speaker used. Second, if one assumes that there really are categories involved, but only that these categories are fixed and the same for each agent, then it is not clear how an agent could point to a category. The only thing an agent can do is point to an object, which has very likely more than one category associated with it. It is this interpretation we will adhere to in the rest of this chapter, as it fits more naturally in the guessing game described earlier. The naming game proceeds analogously to the guessing game. The two agents are selected randomly from the population, one speaker, one hearer. The context plays no role. One category/meaning is chosen at random and the speaker chooses a word to describe this meaning. The hearer interprets this word as a category. At this point it is assumed that the agents can, by pointing, make the distinction between success and failure of the game. Note however, that in case of failure, this does not mean that the hearer receives information on the category the speaker intended. Nor does the speaker receive any information on the category the hearer used to interpret the communicated word. There are a lot of possibilities in fully specifying the naming game. The situation that in the beginning different agents will use different words for the same meaning, and that an agent associates different words with the same meaning will always occur. In other words, in the beginning, synonymy is unavoidable. The existence of synonymy, however, does not necessarily imply that the communication between the agents performs badly, although a lot of synonymy will slow down the language development process. Regarding homonymy, there is more influence from the design of the experiment. One can assume that the number of different words the agents can invent and use is practically unlimited, or one could restrict the number of words the agents can use. In the former case, the event that two agents start using the same word for two different objects, is negligible. In the latter case, however, this event will be very likely and homonymy will be unavoidable, at least in the beginning. In the extreme case, one could investigate what happens when there are only as many words available as there are categories to describe, such that all agents have to agree upon the same one-to-one mapping between meanings and words. In this chapter, we will assume that the number of words is indeed restricted. While this restriction may sound artificial at first sight, there are two reasons why the assumption of a limited number of signals is not that unnatural. First, most of the words that are used in human languages are composed of smaller units. In English for instance, every word is composed of consonants and vowels. This compositionality is already a very complex phenomenon. So when we want to investigate the origins of language, we
could assume that the first reliably repeatable signals humans had access to were simple syllables and therefore not unlimited in number. Second, the assumption we made that agents cannot point directly to meanings implies that even in a successful language game, it is possible that the speaker and hearer associate a different meaning with the same word, when the two meanings are indistinguishable in the context. Thus, in the rest of this chapter, we will assume that the number and the form of words the agents can use is fixed and known beforehand. What will be investigated is how the agents tune their behavior in order to make their communication successful.
2.3 Formal specification of the naming game
Given a fixed number of meanings, m, and words, n, an agent consists of an association matrix, A, with dimensions m × n, of which an element contains the strength between a meaning and a form. These strengths are real numbers between 0 and 1. When an agent acts as a speaker and has to choose a word for a certain meaning, he chooses the word with the highest strength. Conversely, when a hearer has to guess which category the speaker meant with a certain word, he chooses the category with the highest strength. This determines the speak and hear behavior of the agents. But in order for the agents to arrive at a common language, they have to adapt their language knowledge after each game. Therefore, an agent will increase the strength of the used association and decrease competing associations if a game was successful. If a game fails, the agent does the opposite. More concretely, when meaning k was expressed with word l at time-step t and the game was successful, the updates for the speaker with association matrix a^s are:

a^s_{ij}(t + 1) = a^s_{ij}(t)                        for i ≠ k and j ≠ l
a^s_{kj}(t + 1) = (1 − α) a^s_{kj}(t)                for j ≠ l
a^s_{kl}(t + 1) = (1 − α) a^s_{kl}(t) + α

and for the hearer with association matrix a^h the updates are:

a^h_{ij}(t + 1) = a^h_{ij}(t)                        for i ≠ k and j ≠ l
a^h_{il}(t + 1) = (1 − α) a^h_{il}(t)                for i ≠ k
a^h_{kl}(t + 1) = (1 − α) a^h_{kl}(t) + α.
In all equations, α is a value in the range [0, 1] and refers to the amount by which the values of the lexicon are adjusted when the game succeeds or fails. The variables i, j and k represent the indices that indicate row and column in the lexicon. In both systems of equations, the first equation refers to the update procedure for all values outside the particular row k (for the speaker) or column l (for the hearer). The second equation in both systems describes how all the values in the particular row k (for the speaker) or column l (for the hearer), except the selected one at the intersection of row and column, are updated. The last equation in each system describes the update procedure for the selected meaning-word association. The same reasoning holds for the following systems of equations. If the game failed, this means that the hearer chose a meaning for word l different from k, say k′. The updates for the speaker are now

a^s_{ij}(t + 1) = a^s_{ij}(t)                                  for i ≠ k and j ≠ l
a^s_{kj}(t + 1) = (1 − α) a^s_{kj}(t) + α/(n − 1)              for j ≠ l
a^s_{kl}(t + 1) = (1 − α) a^s_{kl}(t)

and for the hearer analogously:

a^h_{ij}(t + 1) = a^h_{ij}(t)                                  for i ≠ k′ and j ≠ l
a^h_{il}(t + 1) = (1 − α) a^h_{il}(t) + α/(m − 1)              for i ≠ k′
a^h_{k′l}(t + 1) = (1 − α) a^h_{k′l}(t).
A typical run of the naming game using this updating mechanism can be seen in Figure 10.2. The plot shows the communicative success between the agents. This success is measured as the number of successful games within a certain time period. As can be seen, the communication converges to an optimal one.

Figure 10.2: A typical run of the naming game. This experiment was conducted using 10 agents. Each agent has a set of 7 meanings and 7 words. The update parameter α = 0.1. The plot shows the communicative success of the agents (a value between 0 and 1) against the number of games (×10^2).
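A compact simulation of the naming game just specified might look as follows; this is our own sketch rather than the authors' implementation, and the function names are invented for illustration.

import random

def make_agent(m, n):
    return [[random.random() for _ in range(n)] for _ in range(m)]

def play_game(agents, m, n, alpha=0.1):
    speaker, hearer = random.sample(agents, 2)
    k = random.randrange(m)                                 # meaning chosen at random
    l = max(range(n), key=lambda j: speaker[k][j])          # speaker's strongest word for k
    k_heard = max(range(m), key=lambda i: hearer[i][l])     # hearer's strongest meaning for l
    success = (k_heard == k)
    if success:                                             # reinforce the used association
        for j in range(n):
            speaker[k][j] *= (1 - alpha)
        speaker[k][l] += alpha
        for i in range(m):
            hearer[i][l] *= (1 - alpha)
        hearer[k][l] += alpha
    else:                                                   # failure: shift strength away from it
        for j in range(n):
            speaker[k][j] = (1 - alpha) * speaker[k][j] + (alpha / (n - 1) if j != l else 0)
        for i in range(m):
            hearer[i][l] = (1 - alpha) * hearer[i][l] + (alpha / (m - 1) if i != k_heard else 0)
    return success

random.seed(1)
m = n = 7
agents = [make_agent(m, n) for _ in range(10)]
results = [play_game(agents, m, n) for _ in range(10000)]
print(sum(results[-500:]) / 500)    # late-run communicative success, typically close to 1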
3 Evolutionary game model
In (Lenaerts et al. 2004) we described how language acquisition models like the naming game actually belong to the class of stimulus-response models and how they can be expressed in terms of learning automata. Furthermore, due to the relation between learning automata and replicator dynamics from evolutionary game theory (EGT) (Börgers and Sarin 1997), the updating scheme of the naming game can be expressed as a selection-mutation equation (Hofbauer and Sigmund 1998). This dynamical equation describes how the strengths that associate meanings and words change over time. The intuitive, yet simplified, idea of this relation is that, when performing one game, both speaker and hearer are represented by a collection of meaning-word associations. Concretely, based on the description in the previous section, the speaker of the game has for each possible meaning a population of words and the hearer has for each possible word a population of meanings. During the game, both speaker and hearer select one of these populations. In these two populations, each meaning-word association is present in a certain concentration which represents the relevance of the association: high (low) concentrations reflect a strong (weak) association. These concentrations change according to the success or failure of the game. Since success can be compared to fitness, the cultural learning scheme can easily be mapped onto the replicator equations. This relation between learning and biological evolution has always existed. Its origin lies in the fact that in both mechanisms, change is described as a gradual stepwise process from mediocre to better states. Mapping the naming game onto learning automata theory and incorporating the groundbreaking work of Börgers and Sarin (1997) made this relation explicit. In the next section, we discuss the selection-mutation model relevant for the evolution of meaning-word associations and its origin. Afterward we will further discuss the interpretation and provide some simple examples. In the end we will provide some extensions of the model toward one that can include oblique and vertical transmission as discussed in (Boyd and Richerson 1985).
3.1 Model specification
Replicator dynamics are a game theoretical interpretation of the replicator equation used in many dynamical studies of biological phenomena. Its most
important contribution is that it highlights the role of selection in the context of EGT and examines which equilibria can be reached. We refer for the details on the replicator dynamics to the introduction of this book. Here we will only summarize some general facts while explaining the analogy. When the state of a population, which is a list of strategy concentrations, is described by a vector x = (x_1, x_2, ..., x_n), the replicator equation describes how the different concentrations x_i change over time. An important factor in these equations is the success of each element upon interacting with other elements (represented by a matrix A) and how the individual success (i.e. (Ax)_i) compares to the average population success (i.e. x · Ax). It is assumed that elements with above-average success will increase in concentration whereas below-average elements will decrease and go extinct. This behavior is defined by Equation (10.1) (Hofbauer and Sigmund 1998):

dx_i/dt = x_i ((Ax)_i − x · Ax)    (10.1)
Yet, during replication errors can occur. The replicated offspring may change their type with respect to the type of their parents, i.e. they mutate. Mutations occur, as is assumed in most biological models, very infrequently. Yet they produce the necessary variation on which selection can act. In dynamical models in EGT, it is assumed that the probabilities of mutation from one type to another are given. Suppose that there are n types of elements; then the following mutation matrix can be specified:

U = ( µ_11  µ_12  ...  µ_1n
      µ_21  µ_22  ...  µ_2n
      ...   ...   ...  ...
      µ_n1  µ_n2  ...  µ_nn )
The variables µ_ij in matrix U represent the probabilities that one type can change into another type. Now Equation (10.1) can be extended in such a way that it incorporates not only change as a result of fitness differences, but also the possibility of change due to mutation (Hofbauer and Sigmund 1998):

dx_i/dt = x_i [(Ax)_i − x · Ax] + Σ_{j≠i} µ_ji x_j − Σ_{j≠i} µ_ij x_i    (10.2)
As can be seen, this second form of change can be either the result of mutations of other types x_j into the current type x_i (an increase of the type) or mutations from this type x_i into other types x_j (a decrease of the type). It is assumed in this equation that µ_ij is the same for all i and j.
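For illustration, Equation (10.2) can be integrated numerically with simple Euler steps; the sketch below is our own, and the toy payoff matrix and uniform mutation rate are arbitrary assumptions made only for the example.

def selection_mutation_step(x, A, mu, dt=0.01):
    """One Euler step of (10.2) with a uniform mutation rate mu between distinct types."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    avg = sum(x[i] * Ax[i] for i in range(n))
    return [x[i] + dt * (x[i] * (Ax[i] - avg)                 # selection term
                         + mu * (sum(x) - x[i])               # inflow from the other types
                         - (n - 1) * mu * x[i])               # outflow to the other types
            for i in range(n)]

A = [[1.0, 0.0, 0.0],      # toy payoff matrix: strategy 0 does best against itself
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 0.2]]
x = [1 / 3, 1 / 3, 1 / 3]
for _ in range(5000):
    x = selection_mutation_step(x, A, mu=0.001)
print([round(v, 3) for v in x])    # most of the weight ends up on strategy 0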
Equation (10.2) describes how the concentrations of different strategies change according to interactions between the elements of one population. Yet, in our analogy we described the interaction between two populations: a sub-population of words in the mind of the speaker and a sub-population of meanings in the mind of the hearer. In the context of EGT such a setup refers to multi-population models (Weibull 1996). This type of model is used for the study of asymmetric games, where the payoff matrices of the players are no longer the same. In the case of two populations, as we assume here, the two following replicator equations reflect the dynamics of the system:

dx_k/dt = x_k [(Ay)_k − x · Ay] + Σ_{l≠k} µ_lk x_l − Σ_{l≠k} µ_kl x_k    (10.3)

dy_i/dt = y_i [(Bx)_i − y · Bx] + Σ_{j≠i} µ_ji y_j − Σ_{j≠i} µ_ij y_i    (10.4)
Thus, according to our analogy, the variables x_i correspond to the speaker's words which are currently associated with a particular meaning and the variables y_i correspond to the hearer's meanings which are currently associated with the transmitted word. Interactions occur between the elements of both populations. To understand this, assume a configuration as shown in Figure 10.3, where the speaker has three possible words for a particular meaning and the hearer has three particular meanings for the uttered word. Each part x_i (y_j) of the circle represents the frequency of a meaning-word combination. One can determine how the frequency of each meaning-word association changes over time from the success or fitness of the interaction between these words and the meanings. In EGT the expected fitness is determined by the payoff two elements receive when they interact. In the context of our analogy, interaction refers to the use of a particular meaning-word association. The success or fitness is not determined by the association itself but by the outcome of the interaction in the context of transmitting meaning. To understand this better, let us consider a particular example. When a word is used in the naming game to communicate some meaning, the outcome can be either successful or unsuccessful. As can be seen in Figure 10.3, the speaker may select a meaning D2 and transmit one of the associated words. Since W3 has the highest concentration (x_3) in the population it will be selected. The hearer receives word W3 and selects one of the associated meanings. Since D2 has the highest concentration (y_2) in the second population (W_k) it will be selected. In this scenario, the interaction was a success because both speaker and hearer associated D2 with W3.
Figure 10.3: Multipopulation model with population Di corresponding to a particular meaning selected by the speaker and Wk the particular word that was understood by the hearer. The elements x1 , x2 and x3 correspond to the frequency of the association between meaning Di and word W1 , meaning Di and word W2 and meaning Di and word W3 respectively. The elements y1 , y2 and y3 correspond to the frequency of the association between word Wk and meaning D1 , word Wk and meaning D2 and word Wk and meaning D3 respectively. In EGT the success of word Wj in Di is determined by the interaction of the word with the different meanings Dl in the population Wk . This interaction is represented by the full lines. The dotted lines reflect the opposite interaction.
Consequently, the meaning-word association between D2 and W3 has above-average fitness and its concentration will increase. The expected fitness of the other associations will be below average and their concentrations will decrease. In terms of learning, this means that when a good association is used, this relation should be further exploited, and this is done by increasing the strength of that relation. Yet, when the game is unsuccessful, the meaning-word association that was used should decrease. In other words, the bad association between D2 and W3 should decrease in favor of the other associations in the population. In the example in Figure 10.3, a certain amount of the D2-W3 association should be removed and distributed over the two other associations. This re-distribution in case of failure is performed by a mutation of W3 elements into W1 and W2 elements. The same is true for the meanings in population W_k, where D2 elements are transformed into D1 and D3 elements. As a consequence, again in terms of learning, the system tries to explore the space of meaning-word associations further until it finds a good one. The separation between success and failure required in the pointing-and-naming game is not apparent in Equations (10.3) and (10.4). In EGT, both the selection and mutation parts of the equations are active at the same time and hence do not depend on the outcome of the interaction.
In the pointing-and-naming game, however, the relative influence of selection and mutation does depend on the outcome of the interaction. In Equations (10.3) and (10.4), this is expressed through the payoff matrices A and B and the mutation matrix U, as we will show in the following section.

3.2 Origin of the model
The analogy between learning and biological evolution can be made explicit through models of evolutionary game theory. The idea of defining this relation is not new. Börgers and Sarin (1997) specify a relation between the Cross learning model and replicator dynamics. They show that in the continuous time limit the learning model converges to the asymmetric version of the replicator dynamics from evolutionary game theory. By specifying this relation, the authors provide a non-biological interpretation of evolutionary game theory. It is this interpretation which explicitly links learning and evolutionary dynamics. This result is not limited to the Cross learning model. Due to the relation of this model with the general theory of Learning Automata (Tuyls et al. 2002) and with the Q-learning model from Reinforcement Learning (Tuyls et al. 2003), these results are generally applicable. In other words, since the naming game described in Section 2 is a form of stimulus-response game which can be expressed in terms of learning automata, it can also be expressed in terms of asymmetric replicator equations (Lenaerts et al. 2004). Learning automata represent a particular approach to learning. Narendra and Thathachar (1989) describe this automaton approach to learning as a process that involves ‘the determination of an optimal action out of a set of allowable actions. [...] these actions are assumed to be performed on an abstract random environment. The environment responds to the input action by producing an output, belonging to a set of allowable outputs which is probabilistically related to the input action’ (see Narendra and Thathachar, 1989, p. 35). In this context different automata are possible and variations depend on different aspects, such as the set of outputs that can be produced by the environment or whether or not the state-action associations change over time. Specifically those automata where these associations change, i.e. variable-structure stochastic automata, are important here. These automata use a general update mechanism which is equivalent to the update scheme described for the naming game in Section 2.3 under the assumption that δ^s = δ^f = α:

p^i_{kl}(t + 1) = p^i_{kl}(t) + δ^s (1 − β(t)) (1 − p^i_{kl}(t)) − δ^f β(t) p^i_{kl}(t)    (10.5)

p^i_{kh}(t + 1) = p^i_{kh}(t) − δ^s (1 − β(t)) p^i_{kh}(t) + δ^f β(t) ((w − 1)^{−1} − p^i_{kh}(t))    (10.6)
where p^i_{kl} refers to a particular state-action pair, δ^s and δ^f are the reward and penalty respectively, w refers to the number of words and β(t) refers to the outcome of the interaction with the environment. It is important to remember that, in the context of learning automata, β(t) = 0 refers to a successful outcome of the game and β(t) = 1 refers to a failure of the game. Equations (10.5) and (10.6) represent the general update schemes for the associations between states and actions of a variable-structure stochastic automaton. The difference between the two equations is that Equation (10.5) is used to update the association for the action that was actually performed in a certain state of the process, and that Equation (10.6) is used to update all the other state-action associations that were not used in that particular state of the process. As is the case for the naming game, the equations consist of different update schemes which depend on the output returned by the environment. Both speaker and hearer select a certain ‘action’ given a particular ‘state’. On the one hand, the speaker’s ‘state’ refers to the particular meaning he wants to transmit and the action is one of the competing words which can be used for that particular meaning. On the other hand, the hearer’s ‘state’ refers to the received word and the competing meanings correspond to the set of actions he can take given that particular situation. Now, if the interaction was a success (β(t) = 0) the strength of the selected state-action association is increased and the strengths of the other state-action pairs are decreased. For the speaker this is one of the competing words associated with that particular meaning and for the hearer it is one of the competing meanings that might be associated with the received word. If the environment reports a failure (β(t) = 1) the strength of the selected association is decreased and the strengths of all the other associations are increased equally. The same interpretation for states and actions is used here. Success and failure hence activate different parts of the updating scheme. The amount of increase and decrease is specified by the parameters δ^s and δ^f respectively. These equations can be used to derive the replicator dynamics for the speaker (cf. Equation (10.3)). A similar set of equations is required for the hearer and can in turn be used to derive the replicator dynamics of the hearer. To perform this task, the expected change E[Δp^i_{kl} | p^i_k(t), q^j_l(t)] of a particular association needs to be calculated:

E[Δp^i_{kl} | p^i_k(t), q^j_l(t)] = p^i_{kl}(t + 1) − p^i_{kl}(t)
This expected change specifies how the strength of a state-action association changes from one generation to the next given a particular state description for both the speaker strengths and hearer strengths. The derivation explained in (Lenaerts et al. 2004) produces the equation:
E[Δp^i_{kl} | p^i_k(t), q^j_l(t)] = p^i_{kl}(t) ( Σ_{r=1..d} A_{kr} q^j_{rl}(t) − Σ_{h=1..w} p^i_{kh}(t) Σ_{r=1..d} A_{hr} q^j_{rl}(t) )
        − µ_{kl} p^i_{kl}(t) Σ_{h=1..w} p^i_{kh}(t) Σ_{r=1..d} q^j_{rl}(t)
        + (µ_{kl} / (w − 1)) Σ_{h′≠l} p^i_{kh′}(t) Σ_{r=1..d} q^j_{rl}(t)    (10.7)

where k refers to the rows and l to the columns, w is the number of words and d the number of meanings. In this equation a number of substitutions were already introduced:

• A_ij = δ^s (1 − β(t)): The update amount in case of success is considered to be the payoff that is received by the words and meanings upon successful interaction. The collection of all the A_ij values defines the payoff matrix A:

A = ( 0               0               ...  0
      ...             ...             ...  ...
      δ^s(1 − β(t))   δ^s(1 − β(t))   ...  δ^s(1 − β(t))
      ...             ...             ...  ...
      0               0               ...  0 )
The resulting matrix A is w × d-dimensional and we will refer to this matrix as the payoff matrix. Each entry in this matrix represents the payoff that a certain word-meaning combination receives. Note that only the row is used which corresponds to the meaning selected by the speaker. All other entries are in this case irrelevant since they refer to other meanings which were not selected by the speaker. This structure is the consequence of the simplification of Equation (10.7) into a new equation which uses matrix multiplication for the selection part, as in Equation (10.1). Furthermore, each value δ^s(1 − β(t)) is only assigned when the communication is a success. This makes the payoff matrix an unusual one since all entries will become zero when the game fails. Yet it is required due to the distinction in how the update is performed between successful and unsuccessful interactions in the naming game.
• µ_ij = δ^f β(t): The update amount in case of failure is considered to be the mutation rate from one type to another. The collection of all these rates corresponds to the mutation matrix U that was described in the previous section:

U = ( δ^f β(t)   δ^f β(t)   ...  δ^f β(t)
      δ^f β(t)   δ^f β(t)   ...  δ^f β(t)
      ...        ...        ...  ...
      δ^f β(t)   δ^f β(t)   ...  δ^f β(t) )
As with the payoff matrix, the matrix U becomes zero when the communication was successful. The different sums in Equation (10.7) can be further simplified by writing them as matrix multiplications. This rewriting produces a replicator equation that describes the dynamics of meaning-word associations within the speaker's mind:

E[Δp^i_{kl} | p^i_k(t), q^j_l(t)] = p^i_{kl}(t) ( (A q^j_l(t))_k − p^i_k(t) · A q^j_l(t) )    (10.8)
        − µ_{kl} p^i_{kl}(t)    (10.9)
        + (µ_{kl} / (w − 1)) (1 − p^i_{kl}(t))    (10.10)
This equation clearly consists of three parts:

• Part (10.8) corresponds to a standard replicator equation as discussed in the context of evolutionary game theory. It describes the effect of selection on the frequency of a particular meaning-word combination relative to the average performance in the population.

• Parts (10.9) and (10.10) correspond to the effect that mutation has on the frequency of a particular meaning-word association. Concretely, Part (10.9) expresses the fact that meaning-word combinations can decrease in number and hence disappear, while Part (10.10) expresses an increase in frequency due to mutations of other meaning-word associations into this particular one (a small numerical sketch of all three parts follows below).

Given Equation (10.8), the same derivation has to be carried out for the hearer. In that case, the replicator equation describes how the meaning concentrations for a particular word change due to the outcome of the interaction. How the equation should be interpreted in general was already discussed in Section 3.1. In the next section we go into a bit more detail.
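The following sketch evaluates the right-hand side of Equations (10.8)-(10.10) for a single association. It is an illustration only: the variable names are ours, and for the index bookkeeping of the product (A q^j_l(t))_k to line up we additionally assume that the number of words equals the number of meanings.

import numpy as np

def expected_change(P, Q, A, mu_kl, k, l):
    # P: speaker matrix (rows = meanings, columns = words, rows sum to 1)
    # Q: hearer matrix; Q[:, l] is the distribution over meanings for word l
    # A: payoff matrix of the current game; mu_kl: mutation rate mu_kl
    w = P.shape[1]
    q_l = Q[:, l]
    Aq = A @ q_l
    selection = P[k, l] * (Aq[k] - P[k, :] @ Aq)      # Part (10.8)
    loss = -mu_kl * P[k, l]                           # Part (10.9)
    gain = mu_kl / (w - 1) * (1 - P[k, l])            # Part (10.10)
    return selection + loss + gain

With the matrices of the previous section, β(t) = 0 makes mu_kl zero and leaves only the selection part, while β(t) = 1 makes A zero and leaves only the two mutation parts.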
3.3 Interpretation of the model
Due to the particular definitions of the matrices A and U (and B for the hearer), the replicator equation is used in two different ways: upon success, selection is used; upon failure, mutation is used. This distinction is well known in learning theory. Concretely, when β(t) = 0, the selection-mutation rules for both speaker and hearer reduce to a pure selection model. Since no variation is introduced when the interaction succeeds, the selection dynamics only act on the given data and can converge to local optima. Hence, whether an agent finds the lexicon which is shared with the other individuals depends on the initial state of the lexicon. In other words, when the interaction was successful there is only exploitation of the known lexicon data and no exploration towards lexicons which are shared between all the individuals. When β(t) = 1, the mutation parts of the model become active and no selection occurs. As a consequence, a better exploration of the space of lexicons becomes possible and it will be easier to find a shared lexicon. A good balance between exploitation and exploration defines a good learning technique. This balance depends strongly on the kind of agents that are encountered. If all agents that one meets use the same words, then convergence is easy: just exploit what is known. If the mutation part were missing from the update scheme, then failure of an interaction would only result in two agents shrugging their shoulders.

Given the relation between the naming game and these replicator equations, one can wonder what will happen to the model when δ^s differs within a single row of the matrix A (and B) or across different rows. Moreover, one could study a situation where δ^s depends on the role, i.e. takes a different value for speaker and hearer. Such questions would change the current system of simple agents with homogeneous update rules into a system of heterogeneous agents, in which possibly more interesting dynamics can be investigated. The same argument can be made for the different entries of the matrix U.

So far, nothing has been said about the dynamics of an entire population of agents. Yet, as argued extensively by Boyd and Richerson (1985), since social learning causes phenotypic traits to be exchanged between individuals, social learning has population-level consequences. Thus, since language is acquired by individuals through actual communication, language is a population-level phenomenon. In the current context, the population is defined as a collection of lexicons (one for each agent). The replicator equations defined above describe how each agent updates its lexicon depending on the role it performs. A population of such lexicons is described as a distribution over the possible lexical matrices. Moreover, the dynamics defined by, for instance, Equation (10.7) describe how this distribution changes over time.
In other words, they describe how the average lexical configuration and the corresponding variance change over time. As argued by Boyd and Richerson (1985), learning has two opposing effects on the variance of the distribution of cultural variants in the population. First, it causes individuals to change their cultural variants toward a common goal; consequently, the variance in the population decreases. Second, errors made during learning increase the variance. Since we do not discuss extensions in which the exchange of lexical information is noisy, this second effect cannot be observed here; we leave it for later exploration. Nevertheless, the first effect is observable in the experiments. The cultural variants correspond to the lexical matrices of the different population members. The naming game causes the different individuals to modify their lexical variants toward a shared lexicon. Hence the variation in lexical information decreases. Moreover, it will eventually disappear, since the cultural inheritance process here is devoid of noise.
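A compact population-level simulation of this process might look as follows. It is a sketch only: the population size, parameter values, update helper and convergence measure are our own choices, not those used in the experiments reported above.

import numpy as np

rng = np.random.default_rng(0)
N, d, w = 10, 5, 5            # agents, meanings, words (our choice)
ds, df = 0.1, 0.1             # reward and penalty parameters (our choice)

# one uniform, row-stochastic lexical matrix per agent (rows = meanings)
population = [np.full((d, w), 1.0 / w) for _ in range(N)]

def reinforce(M, state, action, success):
    # linear reward-penalty update of row `state` of an association matrix
    n = M.shape[1]
    if success:
        M[state] *= (1 - ds)
        M[state, action] += ds
    else:
        M[state] *= (1 - df)
        M[state] += df / (n - 1)
        M[state, action] -= df / (n - 1)

for _ in range(20000):
    i, j = rng.choice(N, size=2, replace=False)        # random speaker and hearer
    speaker, hearer = population[i], population[j]
    meaning = rng.integers(d)                          # topic of this game
    word = rng.choice(w, p=speaker[meaning] / speaker[meaning].sum())
    guess = rng.choice(d, p=hearer[:, word] / hearer[:, word].sum())
    success = (guess == meaning)                       # outcome beta(t)
    reinforce(speaker, meaning, word, success)         # speaker-role update
    reinforce(hearer.T, word, meaning, success)        # hearer-role update (columns)

# crude measure of the lexical variation left in the population
mean_lexicon = np.mean(population, axis=0)
print(np.mean([(M - mean_lexicon) ** 2 for M in population]))

Printing the mean squared deviation from the population-average lexicon at regular intervals gives a rough picture of the first effect described above: as the agents coordinate, the lexical variation shrinks.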
4 Positioning the model
In this section, we examine how our replicator model for language evolution relates to other models discussed in the literature. Different models of language evolution are often designed to investigate or explain different phenomena, which makes a comparison with the proposed model more difficult. Nevertheless, we will discuss the models in the context of the evolution of meaning-word associations.

Several criteria can be used to classify models of language evolution. Horizontal versus vertical transmission determines whether language is transmitted within one generation or between successive generations. Random learning versus parental and role-model learning distinguishes different strategies for choosing an agent to listen to. Another distinction is whether agents merely observe and imitate other language users, or whether they more actively shape a language for its communicative function. Where applicable, these criteria will be used in the following discussion.

Most models of language evolution incorporate the genetic evolution of a so-called 'language acquisition device', an entity in the brain which is responsible for learning a language when exposed to it. In Hurford (1989), three strategies for learning meaning-word associations are discussed. Simulations demonstrate that one of the strategies, called 'Saussurean', is the most successful and dominates when exposed to natural selection based on the communicative success of an agent.
The main property of the 'Saussurean' strategy is that it biases the agents towards building a two-way mapping between meanings and signals. This resembles the model that we proposed, because we use the same association matrix for both speaking and listening. However, the simulations also showed that the Saussurean strategy does not guarantee convergence towards a perfect communication system, whereas our model was experimentally shown to do so. To conclude, one could say that we are more concerned with the dynamics of the meaning-word associations once the agents have a sufficiently powerful architecture, while the model just mentioned is concerned with the question of how such an architecture might have arisen in the first place. As a consequence, it sufficed for that model to treat language evolution as vertical transmission: the next generation learns the language from the previous one, because agents can only be selected for or against on a generational basis.

A very similar approach was taken in Nowak et al. (1999). Again, three different strategies for language acquisition were presented, which differ in the choice children make when deciding from whom they should learn a language. The strategies were parental learning, role-model learning and random learning. An important assumption in this model is that successful communication contributes to biological fitness, which means that agents who communicate well produce more offspring. The simulation runs show that both role-model learning and parental learning outperform random learning, the latter rarely reaching perfect communication. At first sight this result contradicts the results presented in the previous section: in our model, agents interact randomly with each other, independent of their communicative success, but the population still converges towards an optimal communication system. Although the transmission structure in Nowak's model is vertical and the one used in our model is horizontal, this difference does not explain the apparent paradox. The answer lies in the agent architecture. In Nowak's model, it is assumed that agents acquire a language by observing and imitating other agents, without doing any further processing on the data they receive. In our model, on the other hand, the agents are driven more strongly towards a successful communication system, by actively inhibiting or reinforcing alternative associations in the case of success and failure, respectively. In other words, one could say that in order to reach successful communication, the evolutionary dynamics in Nowak's model select between agents, while in our model selection occurs within an agent, between alternative meaning-word associations. Consequently, the two models differ in the speed at which a language can develop.

Another approach to modeling the evolution of language is described in Oliphant and Batali (1997).
In this paper too, different models for language learning are presented and compared with respect to their capability of developing and maintaining a successful communication system. These models are compared separately; there is no competition between them through genetic evolution. An important aspect of this paper is the clear distinction between an agent's sending behavior, the mapping from meanings to signals, and its receiving behavior, the mapping from signals to meanings. The two main learning models discussed are imitation learning and learning by obverting. An imitator literally imitates the sending and receiving behavior observed in the population. An obverter, on the other hand, chooses the send and receive functions that maximize communicative accuracy when communicating with the population. This means that its sending behavior is determined, in the optimal way, entirely by the population's receiving behavior, and vice versa for its receiving behavior. The authors argue that the imitation learning strategy is not capable of improving upon an existing communication system, but that with learning by obverting an optimal communication system will always be reached. An interesting implication is that the originally separate send and receive functions of an agent will perfectly match each other in the end. Or, as stated in the paper:

. . . the bidirectionality of signals is a consequence of how the learning procedure affects the population's communication system, not an assumed constraint on how learning is done.

This two-way mapping between meanings and words is probably more inherent in the model we proposed, as our agents' send and receive functions depend on the same association matrix. The transmission structure used in their simulations is horizontal, but an agent does all its learning before it becomes a member of the population, after which its communication system is fixed.

Yet another model for the evolution of language is the 'iterated learning model', of which an overview can be found in Kirby and Hurford (2002). The main assumption underlying this model is that the most important force shaping a language is not its utility but its learnability. Language evolution is therefore studied by iteratively letting a population of language learners (children) learn a language from a population of language users (adults). After a learning period, these learners become the users who provide input for the next generation of language learners. In this fashion, one can study what kind of language will eventually prevail after several generations. Many variants of this model are discussed in the literature. Many of the simulations conducted with it are used to study the emergence of compositionality in language, which is an issue we do not address.
In Smith (2004), however, the iterated learning model is also applied to the evolution of meaning-word associations. The language inventory of the agents consists of a lexical matrix, as in our model, and the choices the speaker and hearer make for producing and interpreting are also the same. The author then compares three different mechanisms for language acquisition, which differ in the bias they have towards homonyms: one mechanism is neutral, another is biased in favor of homonyms, and the third is biased against them. It is then claimed that the influence on language arising from the learner's bias is much stronger than the influence coming from natural selection. The latter manifests itself through the selection of the parents: agents with a higher communicative accuracy are more likely to provide input for the learners. When there is no selection pressure, (adult) agents are selected at random to provide input. Because the update scheme we proposed also has a bias against homonymy, and speakers are selected at random, one could say that our model conforms to the findings of this paper.
5 Conclusion
Models of language acquisition can be categorized in different ways. As argued in this work, when the criteria for categorization are the transmission structure and the learning mechanism, there are two major classes. We highlighted a particular model, the naming game as defined in Steels (1996), and showed that its learning mechanism is equivalent to a selection-mutation replicator dynamics as discussed in EGT. This relation makes explicit the intuitive notion that cultural learning has dynamics similar to those of an evolutionary system, hence the terminology 'cultural inheritance' and 'evolution'. Although this work presents only initial steps in investigating this relation in the context of language evolution, we feel that further research into it will prove worthwhile.

Different extensions of the model discussed above can be introduced to study the particular dynamics related to it. We do not provide an exhaustive list of possibilities here; only some will be highlighted. As explained in the introduction, the current model assumes that speaker and hearer actively participate in the communication process and that their roles can change. This kind of cultural learning system uses a horizontal transmission structure. Two alternative forms exist: oblique and vertical transmission (Boyd and Richerson 1985). We will use this terminology here since it was already established in Boyd and Richerson's excellent book on cultural evolution.
Oblique transmission is a form of cultural transmission which occurs within the same genetic generation; hence there is still a difference between cultural and genetic time. The difference with horizontal transmission is that the role assignment is no longer flexible: the speaker in this communication structure has the role of a teacher, whereas the hearer is a student. The direction of communication is hence one-way. Moreover, since the learning is done by the hearer, the speaker will probably not alter her knowledge. This setup corresponds to the idea of role-model learning described in Nowak et al. (1999).

Vertical transmission differs from oblique transmission in that the distinction between genetic and cultural time is removed. Cultural learning is performed over genetic generations; changes that might occur within a single genetic generation are not considered, and hence conflicts between the two learning schemes cannot be considered either. A mixture of, for instance, horizontal and vertical learning could provide interesting results on how both influence the overall dynamics of the system.

In relation to the issue of different time scales, one could consider extending this model with a system describing the evolution of the parameters (δ^f, δ^s, ...) by genetic inheritance. Such a system has been discussed, for instance, in the context of the iterated learning model (Smith 2004). It would be interesting to see how this would work in the current model.

A final extension which we consider relevant is to transform the replicator equation into the Price covariance model for evolution by natural selection (Price 1970, 1972). As recently discussed by Page and Nowak (2002), the Price equation can unify different dynamical models of evolution discussed in biology. Due to its relation with replicator dynamics, the update scheme of the naming game can also be expressed in this way. Given the relation of the Price equation with selection dynamics within and between groups, it might be interesting to see how these ideas fit the changes in the lexicons of the different agents.
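For reference, and keeping in mind that working out the correspondence in detail is left as future work, the covariance form of the Price equation (Price 1970) reads

\bar{w}\,\Delta\bar{z} = \mathrm{Cov}(w_i, z_i) + \mathrm{E}(w_i\,\Delta z_i)

where w_i is the fitness of type i, z_i the value of the character under study (here one would take the strength of a meaning-word association), and the second term collects the changes that arise during transmission. The selection-mutation structure of Equations (10.8)-(10.10) suggests that the selection part would be absorbed by the covariance term and the mutation parts by the transmission term, but we have not carried out this mapping here.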
References

Börgers, T. and R. Sarin (1997). Learning through reinforcement and replicator dynamics. Journal of Economic Theory, 77(1), 1–14.
Boyd, R. and P. Richerson (1985). Culture and the Evolutionary Process. The University of Chicago Press, Chicago and London.
Cavalli-Sforza, L. and M. Feldman (1981). Cultural Transmission and Evolution, volume 16 of Monographs in Population Biology. Princeton University Press.
Hofbauer, J. and K. Sigmund (1998). Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge, UK.
Hurford, J. (1989). Biological evolution of the Saussurean sign as a component of the language acquisition device. Lingua, 77(2), 187–222.
Kirby, S. and J. R. Hurford (2002). The Emergence of Linguistic Structure: An Overview of the Iterated Learning Model, chapter 6, pp. 121–148. Springer-Verlag, London.
Lenaerts, T., B. Jansen, K. Tuyls, and B. de Vylder (2004). The evolutionary language game: an orthogonal approach. Journal of Theoretical Biology (to appear).
Narendra, K. and M. Thathachar (1989). Learning Automata: An Introduction. Prentice-Hall International, Inc.
Nowak, M. A., J. B. Plotkin, and D. Krakauer (1999). The evolutionary language game. Journal of Theoretical Biology, 200(2), 147–162. doi:10.1006/jtbi.1999.0981.
Oliphant, M. and J. Batali (1997). Learning and the emergence of coordinated communication. The Newsletter of the Center for Research in Language, 11(1).
Page, K. and M. Nowak (2002). Unifying evolutionary dynamics. Journal of Theoretical Biology, 219, 93–98.
Price, G. (1970). Selection and covariance. Nature, 227, 520–521.
Price, G. (1972). Extension of covariance selection mathematics. Annals of Human Genetics, London, 35, 485–490.
Rosenthal, T. and B. Zimmerman (1978). Social Learning and Cognition. Academic Press, Inc., New York.
Smith, K. (2004). The evolution of vocabulary. Journal of Theoretical Biology, 228(1), 127–142.
Steels, L. (1996). Emergent adaptive lexicons. Proceedings of the Simulation of Adaptive Behavior Conference.
Steels, L. and J.-C. Baillie (2003). Shared grounding of event descriptions by autonomous robots. Robotics and Autonomous Systems, 43(2-3), 163–173.
Steels, L., F. Kaplan, A. McIntyre, and J. Van Looveren (2002). Crucial factors in the origins of word-meaning. In A. Wray, ed., The Transition to Language. Oxford University Press, Oxford, UK.
Tomasello, M. (2003). Constructing a Language: A Usage-Based Theory of Language Acquisition. Harvard University Press, Cambridge, MA, and London.
Tomasello, M., A. Kruger, and H. Ratner (1993). Cultural learning. Behavioral and Brain Sciences, 16(3), 495–552.
Tuyls, K., T. Lenaerts, K. Verbeeck, S. Maes, and B. Manderick (2002). Towards a relation between learning agents and evolutionary dynamics. In Proceedings of the Fourteenth Belgium-Netherlands Conference on Artificial Intelligence, pp. 315–322. K.U.Leuven, Belgium.
Tuyls, K., K. Verbeeck, and T. Lenaerts (2003). A selection-mutation model for Q-learning in multi-agent systems. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 693–700.
Weibull, J. (1996). Evolutionary Game Theory. MIT Press, Cambridge, MA.