ADAPTIVE PROCESSES IN ECONOMIC SYSTEMS
Roy E. Murphy, Jr. DEPARTMENT OF ECONOMICS STANFORD UNIVERSITY STANFORD, CALIFORNIA
1965
ACADEMIC PRESS
New York and London
COPYRIGHT © 1965, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC.
111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD.
Berkeley Square House, London W.1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 65-25003
PRINTED IN THE UNITED STATES OF AMERICA
To the Late Russell Varian, Inventor, Scientist, and Humanitarian
PREFACE

During my study of economic theory I have been frequently disturbed by what I consider serious deficiencies in economic theory. These deficiencies are as follows:

(a) In current economic theory, decision makers are generally assumed to possess full information or full knowledge of the parameters of the economic system in which they play an active role. However, in fact, few decision makers ever have full knowledge and still they make decisions that carry economic significance.

(b) The nature of the adaptive economic process, where decision makers increase their knowledge by the cumulative experience of "doing while learning," has not been explored in any generality.

(c) The theory of communications, although implicit in economic theory, does not play an explicit role. Actually the transmission of information is both time consuming and subject to equivocation.

(d) The analogy between the activities of an economic market undergoing stochastic exchanges of goods and services and the stochastic collision of gas molecules in a constant energy system has never been exploited to any great degree. This analogy could permit the application of a large body of statistical thermodynamic theory to many stochastic economic problems.

The assumption that decision makers possess full information about the parameters of an economic system is valid when the economist wishes to study the basic structure of an economic system. The basic structure must not be obscured by supposed irrational decisions from decision makers with less than full information. The assumption that decision makers possess full information is generally not valid when economic theory is used to formulate economic policy. In most actual situations the decision makers have only a vague idea about the system's parameters. However, because of the adaptive process, there is some hope that the decision makers will improve their knowledge of the system's parameters.

Three basic questions arise in the study of adaptive behavior:

(1) Under what conditions does the adaptive process always improve the behavior of the decision makers?

(2) What controls the rate at which the expected improvements in behavior occur?

(3) Can the adaptive process explain the diversity of observed behavior of supposed rational decision makers without appeal to the existence of individual utility functions?

Restricted answers to these questions will be found in this book.

Recently, a substantial amount of work has been accomplished in the application of dynamic programming techniques to statistical communications problems and self-adaptive control systems. This effort led to my recognition that this approach has important economic significance as well. In fact, the deficiencies listed as (a)-(c) were found to be greatly reduced by this approach. Furthermore, I found that this approach actually strengthened the analogy between stochastic economic processes and thermodynamic processes, deficiency (d).

Although I have demonstrated the usefulness of communications theory, self-adaptive control theory, and thermodynamic theory to certain economic processes herein, I cannot say that I have presented the complete story. Since time and space are finite, the reader will find only a beginning to the solution of these problems here. I hope, however, he will be induced to consider imaginative and useful applications and extensions of this theoretical approach.

At this early stage in the development of an adaptive economic theory there is insufficient conceptual stability to warrant a rigorous mathematical treatment. Although I have been forced to employ mathematical arguments to logically extend a simple intuitive view of adaptive processes, I do not, as an engineer and economist, pretend to advance a rigorous, axiomatic theory of these processes. This task I leave to those who are better equipped in the arts of mathematics.
My apology is made at the outset for the difficult notational problems which will soon confront the reader. As far as I know, a flexible and reasonable notational system for adaptive processes has not yet come into being, although progress is being made by many workers.

This research would not have been possible without the vision and support of the late Dr. Russell Varian, who also searched for some useful analogy between current economic theory and the theories of electrical engineering and physics.* Furthermore, without the continuing support of the Varian Foundation, in particular Mrs. Russell Varian, Dr. Edward Ginzton (Chairman of the Board, Varian Associates), and Professor Leonard Schiff (Physics Department, Stanford University), this work would have stopped in midstream. To these people I express my respectful gratitude.

I have also received much help and encouragement from Professor Kenneth J. Arrow (Economics Department, Stanford University), who initially drew my interest to adaptive economic problems, and from Professor Norman Abramson (Electrical Engineering Department, Stanford University), whose interest in information theory extends far beyond its use in electrical engineering problems. My appreciation is extended to Dr. Richard Bellman (RAND Corporation), whose interest in my work provided the opportunity to write this book. I wish also to acknowledge the comments and suggestions of Dr. Charles W. McClelland (Operations Research, Varian Associates) and Mr. Allen H. Norris (Head, Applied Mathematics, Varian Associates). I especially wish to acknowledge the influence on my thinking of an imaginative paper written by J. L. Kelly, Jr.,† who first drew the connection between economic investment problems and information theory.

My thanks are extended to Laura Staggers, who struggled through my obscure notes to produce a useful manuscript, and to the staff of Academic Press, whose patience and understanding have made the task less of a traumatic experience. Finally, I express my gratitude to my wife, Joyce, who philosophically accepted the presence of piles of books and notes in our home during the writing of this book and who shared my many moments of triumph and despair.

ROY E. MURPHY, JR.
Stanford, California
August, 1965

* For an interesting example, Dr. Varian was intrigued by the analogy of the principles of the bistable electronic circuit and the business cycle in the early 1930's. This theory thus actually preceded the work of Nicholas Kaldor who, in 1940, introduced a mathematical model of the business cycle which, unknown to Kaldor, was similar to the bistable circuit theory. See: N. Kaldor, A Model of the Trade Cycle, Economic J. 50, No. 197, 78-92 (1940).

† J. L. Kelly, Jr., A New Interpretation of Information Rate, Bell System Tech. J. 35, No. 4, 917-926 (1956).
CONTENTS

PREFACE

CHAPTER 1
Introduction
1.1 The Historical and Current Development of Adaptive Theory
1.2 Common Properties of Adaptive Processes
References

CHAPTER 2
The Mathematical Model
2.1 Introduction
2.2 Discrete Processes
2.3 Some Discrete Sequential Processes
2.4 The Role of the Decision Maker
2.5 The Optimal Type 2 (Adaptive) Process
2.6 The Constrained, Optimal Type 2 (Adaptive) Process
2.7 Causality and Markov Processes
2.8 The Dynamic Programming Model
References

CHAPTER 3
The Primitive Adaptive Process
3.1 The Notion of Learning and Adaptation
3.2 Some Hypothetical Experiments
3.3 An Adaptive Process of the First Kind
3.4 An Adaptive Process of the Second Kind
3.5 Mixed Adaptive Processes of the First and Second Kind
3.6 Summary
References

CHAPTER 4
Subjective Probability
4.1 Introduction
4.2 A Heuristic Example: The Three Investors
4.3 Economic Environmental Processes
4.4 A Digression on Statistical Estimators
4.5 Multinomial Subjective Probabilities
4.6 Properties of the Subjective Probability Vector
References

CHAPTER 5
The Role of Entropy in Economic Processes
5.1 The Concept of Entropy
5.2 Time Entropy and Information
5.3 The Entropy Paradox
References

CHAPTER 6
Adaptive Economic Processes
6.1 Introduction
6.2 General Deterministic Dynamic Economic Process
6.3 The Stochastic Dynamic Economic Process
6.4 The Adaptive Stochastic Dynamic Economic Process
6.5 Stochastic Growth Process

CHAPTER 7
An Adaptive Investment Model
7.1 The Objective Function
7.2 An Example: The Three Investors Again
7.3 An Investment Model with Full Liquidity
7.4 An Adaptive Investment Process with Full Liquidity
7.5 Consideration of the Special Constraint
7.6 The Multivalued Payoff Adaptive Investment Process
References

CHAPTER 8
Multiactivity Capital Allocation Processes
8.1 Introduction
8.2 The Adaptive Capital Allocation Process
8.3 The Properties of the Adaptive Capital Allocation Process in the Limit
References

CHAPTER 9
Economic State Space
9.1 Introduction
9.2 The Concept of an Economic State Space
9.3 Statistical Equilibrium and Enlightenment
9.4 The Subjective Entropy Trajectory
9.5 The Dynamics of the Subjective Entropy Trajectory
9.6 The Economic State Space Representation
9.7 A Digression: A Likely Value for the Environmental Entropy
References

CHAPTER 10
Interactions between Decision Makers in State Space
10.1 Introduction
10.2 The State Space Probability Function
10.3 Stochastic Equilibrium in the Market
10.4 Interperson Trading in the Investment Market

CHAPTER 11
The Conclusions
11.1 Individual Adaptive Behavior
11.2 Collective Adaptive Behavior

Bibliography
1. General Background on Adaptive and Evolutionary Behavior
2. Sources on the Concept of Subjective Probability
3. Sources on the Mathematical Theory of Intelligence
4. Dynamic Programming and Adaptive Control
5. Sources on Investment Theory; mainly on Multi-Project Investment under Uncertainty
6. Sources on Information Theory
7. Sources on the Concept of Entropy and Entropy-Gradient
8. Sources on Statistical Mechanics and Thermodynamics
9. Sources on Certain Techniques Used in This Book

SUBJECT INDEX
CHAPTER 1
INTRODUCTION

For of beasts, some are gregarious, others are solitary; they live in the way which is best adapted to sustain them ... and their habits are determined for them by nature in such a manner that they may obtain with greater facility the food of their choice.
Aristotle, Politics
1.1 The Historical and Current Development of Adaptive Theory

Economic science can be broadly defined as the study of mechanisms which allocate limited resources to productive uses so as to improve the welfare of the members of a society. We shall call such a mechanism an economic process, and such a society an economic system.

A great deal of interest centers around the study of dynamic economic processes. Generally, the effects of the rate of change of resources, particularly capital resources, are considered to be the essence of a dynamic economic process. It is recognized that a stream of actions taken by a number of decision makers in the economic system guides the rate of change of the system's resources. However, it is not clear what controls the rate of the stream of actions taken by the decision makers. In this book we shall consider the hypothesis that the rate of generation of information in the economic system controls the rate of decision making, and thus indirectly the rate of growth of the system's resources. Processes which are governed by the flow of information are often called adaptive processes.

The study of adaptive processes in both social and biological evolution is very old, stemming from the Greeks who used evolutionary principles in their search for truth. This adaptive process, which they called the dialectic, was applied later to social systems during the rise of the Hegelian concept of history. The Hegelian historical dialectic contains the essential components that lead to an adaptive decision-making system. We shall consider the dialectic nature of social systems to identify these essential components, and we shall note the key component especially for adaptive decision processes, i.e., the role of history in the formulation of the current decisions.

Since Hegel's period it has been recognized by many that social dynamics is an adaptive phenomenon in which constant experimental adjustments are being made, shifting the social structure to best suit the economic and physical restraints in which the society finds itself. These biological concepts of social dynamics are basic not only to the Hegelian dialectic but to the philosophies of the founders of the American concept of federal government, Jefferson and Madison, and still later, to the philosophy of Engels.

Hegel replaced a concept of a society in which history was simply a unique record of the past with a concept of society in which history was the engine of progress. In the Hegelian world, History's role is to produce an array of current problems to be resolved by society. Society, remembering the past triumphs and failures, adapts to the conditions of the present, and produces the decisions which may lead to future triumphs and failures.

Engels grasped the underlying stochastic nature of the adaptive decision process when he wrote:

In nature ... there are only blind unconscious agencies acting upon one another and out of whose interplay the general law comes into operation. ... In the history of society, on the other hand, the actors are all endowed with consciousness, are men acting with deliberation. ... That which is willed happens but rarely; in the majority of instances the numerous desired ends cross and conflict with one another. ... Thus the conflict of innumerable individual wills and individual actions in the domain of history produces a state of affairs entirely analogous to that in the realm of unconscious nature. ... Historical events thus appear on the whole to be likewise governed by chance.
But where on the surface accident holds sway, there actually it is always governed by inner, hidden laws and it is only a matter of discovering these laws.¹

The dialectic process illustrates certain fundamental aspects of adaptive processes which will aid our study of this phenomenon. First, there is a set of alternative views about the problems of the present. Second, there is some objective which is to be optimized by the selection of a subset of possible views. The crucial aspect that makes a stochastic decision process an adaptive stochastic decision process is the role of historical information or experience in the formulation of this optimum current decision. Free discussion and information about past experiences are necessary to evaluate the possible effect of current actions on the objective and to discover the limits on the set of possible actions. Third, a choice mechanism is necessary to determine which subset of views is optimal to society. Fourth, a police system is necessary to impose the optimal choice on the holders of other subsets of ideas. The society then experiences the consequences of its decision, and formulates the dialectic process all over again.

The crucial link is the choice mechanism. An essential concept of adaptation is the irreversibility of the process. Once a choice is made, it becomes recorded as part of the unalterable history of the past. Its effect, good or bad, is always felt in the making of current decisions. Most political thought centers about the qualities of one choice mechanism over another. In the view of Engels, the choice mechanism was an irreversible social revolution. In an attempt to reduce the dangers of social rigidity without the associated dangers of revolution, the original principles of American federal government included a choice mechanism in the indirect republican representation system. Madison, in the Federalist Paper No. 10, outlines the necessity of providing social mobility in U.S. institutions by a choice mechanism among social "factions." He said:
As long as the reason of man continues fallible, and he is at liberty to exercise it, different opinions will be formed.² ... The regulation of these various and interfering interests forms the principal task of modern legislation, and involves the spirit of party and faction in the necessary and ordinary operations of the government.³ ... If a faction consists of less than a majority, relief is supplied by the republican principle, which enables the majority to defeat [the faction's] sinister views by regular vote.⁴

Thus Madison outlined the decision process of American government and provided for the irreversible feature of sequential voting which is characteristic of adaptive processes. De Tocqueville pointed out the irreversible nature of American government when he said:

The American democracy ... is allowed to follow, in the formulation of laws, the natural instability of its desires.⁵ ... In America, as long as the majority is still undecided, discussion is carried on; but as soon as its decision is irrevocably pronounced, everyone is silent, and the friends as well as the opponents of the measure unite in assenting to its propriety.⁶ ... The very essence of democratic government consists in the absolute sovereignty of the majority. ... Most of the American constitutions have sought to increase this natural strength of the majority by artificial means.⁷
One aspect of the general decision process in government and one which we shall deal with herein is the role of the expectations of the future in the evaluation of the set of possible alternatives of the present. Until recently, this aspect has been neglected in consideration of governmental functions as an aid to the nation's decision makers. Lippmann, in his book "The Public Philosophy," brought this out when he said:

For besides the happiness and security of the individuals of whom a community is at any moment composed, there are also the happiness and security of the individuals of whom generation after generation it will be composed. If we think of it in terms of individual persons, the corporate body of the People is for the most part invisible and inaudible.⁸
The biological aspect of adaptation which emerged as the study of evolution was introduced by Malthus⁹ in 1798; but it was Darwin who clearly saw the mechanism by which adaptation takes place. In the closing chapter of "The Origin of Species," Darwin said:

As natural selection acts solely by accumulating slight, successive, favourable variations, it can produce no great or sudden modifications; it can act only by short and slow steps.¹⁰ ... [Modification of the species] has been effected chiefly through the natural selection of numerous successive, slight, favourable variations, aided in an important manner by the inherited effects of the use and disuse of parts; and in an unimportant manner, that is in relation to adaptive structures, whether past or present, by the direct action of external conditions, and by variations which seem to us in our ignorance to arise spontaneously.¹¹
It is clear that Darwin recognized the possibility of "peaceful" transformations in processes where natural conditions permitted the full play of the biological dialectic. He recognized the need for a selection from a set of possibilities so as to optimize an objective function, given a state of the external conditions. Darwin recognized the importance of the history of past inheritances and a stochastic mechanism which appears to produce spontaneous variations. For the year 1859, Darwin's contribution to the understanding of adaptive processes was remarkable.

Volterra, a mathematician, as early as 1901 devoted much of his time to the formulation and solution of a series of deterministic mathematical models of a two-species biological adaptive system which showed for the first time that interaction between parts of the same closed system could produce population cycles.¹² Lotka in 1924 published a study of mathematical biology in which, in recognition of the probabilistic nature of the adaptive process, he defined evolution as the history of a system undergoing irreversible changes. He said:
Having analyzed the submerged implications of the term evolution as commonly used, so as to bring them into the focus of our consciousness, and having recognized that evolution, so understood, is a history of a system in the course of irreversible transformations, we at once recognize also that the law of evolution is the law of irreversible transformations; that the direction of evolution ... is the direction of irreversible transformations. And this direction the physicist can define or describe in exact terms. For an isolated system, it is the direction of increasing entropy.¹³

Thus, the connection of the concept of adaptive processes and the concept of entropy (a measure of uncertainty in an isolated system) was established. Lotka went on to establish in a verbal way the relation of thermodynamic systems to biological systems:
We may have little doubt that the principles of thermodynamics or of statistical mechanics do actually control the processes occurring in systems in the course of organic evolution.¹⁴ ... What is needed, in brief, is something of the nature of what has been termed "Allgemeine Zustandslehre," a general method or Theory of State.¹⁵

It is toward such a Theory of State that our efforts have been applied in the adaptation of Boltzmann's H theorem to economic processes. Lotka failed to develop the stochastic theory he had envisioned, but the seeds of the application of statistical mechanics were sown by Rashevsky who, in his book on mathematical biology, used a thermodynamic occupancy model to illustrate the formation of Pareto wealth distributions for a closed social system.¹⁶ The work of the mathematical biologists has been taken up by a new science, the discipline of cybernetics. This has been the result of the realization of certain fundamental similarities in biology, automatic control, psychology, and economic theories which lend themselves to common treatment.

Psychologists, particularly Pavlov¹⁷ and Thorndike,¹⁸ began to study learning phenomena as a type of adaptive process at the turn of the century. Tolman¹⁹ in 1937 considered a mathematical description of the learning behavior of animals in "T" mazes. His "behavior ratio" is closely related to the frequency concept of the subjective probability theory. Recently these statistical learning models have been improved by Estes,²⁰ Bush and Mosteller,²¹ Flood,²² and others.† Tolman's white and black sensing "sowbug" is an ingenious example of the design of a light-sensitive adaptive mechanism which can be used to test learning theories.

McCulloch²⁴ has studied the resemblance between the function of neuron networks in the visual cortex and electronic computer logic circuits. This similarity led Rosenblatt²⁵ to consider an adaptive device which he calls a perceptron. Perceptrons, which are similar to biological neurons, can be connected into devices which learn to recognize complex patterns. Many other similar "neuronic" adaptive devices have been developed. Widrow²⁶ has developed an electrochemical simulated neuron which he uses to construct adaptive pattern recognition devices. These devices can learn to recognize human speech, and perform difficult balancing tasks.

A theoretical structure for the design of a learning machine which maximizes the rate of growth of its intelligence has been developed by the author.²⁷ This type of adaptive machine has the capacity to discriminate between what the machine considers relevant and irrelevant environmental information. This ability has been called adaptive concentration.
† For example, see text and references in Duncan Luce.²³

Turing²⁸ in 1936, during the study of certain propositions in mathematical logic, introduced the fundamental notion of a simple hypothetical machine or universal automaton, now known as a Turing machine. If a Turing machine is shown a mathematical series, the machine can adjust its operation so as to generate the continuing terms of this series. Turing showed that this is true for a series generated by any other machine, no matter how complicated this other machine may be. Thus, the simple Turing machine, given sufficient time, could duplicate the performance of a more complex machine. If we view the input series to the Turing machine as a sequence of event outcomes generated by the machine's environment, we could say that the Turing machine learns to cope with its environment.

The theory of the Turing machine was extended in 1948 by von Neumann²⁹ to even greater significance. He showed the conditions under which a Turing machine could construct other Turing machines. Furthermore, under certain conditions he showed that a Turing machine could construct other machines far more complex than itself. This result has far-reaching biological implications; but for the theory of adaptive processes in general, it provides a mathematical framework of a very fundamental nature. It appears reasonable to assume that a complex adaptive process can be decomposed into a hierarchy of simple Turing machine processes.

Turing machines and other adaptive devices require an input of information from the outside world. The role of information and a measure of the rate of flow of information was considered by Norbert Wiener. In his books, "Cybernetics" and "The Human Use of Human Beings," Wiener introduces the concept of information (which is directly related to the concept of entropy) in the study of adaptive processes. He said:
Information is a name for the content of what is exchanged with the outer world as we adjust to it, and make our adjustment felt on it. The process of receiving and of using information is the process of our adjusting to the contingencies of the outer environment, and of our living effectively within that environment.³⁰

Although the idea of devising a measure of information occurred probably first to Hartley³¹ and later to Fisher,³² Shannon,³³ Kolmogorov,³⁴ and Wiener,³⁵ the main influence of the role of information stems from the two papers by Shannon in which he formalized the concept.

Adaptive machines which simulate animals have actually been constructed by several experimenters. Walter's electromechanical "Tortoise" or machina speculatrix is a well-known example of a machine which receives and responds to information from its environment.³⁶ It may be argued, however, that this machine is not adaptive since it does not change its own behavior patterns. On the other hand, Ashby has constructed an electronic machine, the homeostat,³⁷ which is an adaptive mechanism. When an essential part of the machine is torn out, an adaptive process is initiated and the machine, itself, finds a replacement so that it may survive. The study of such machines has sharpened the concept of adaptation. Ashby has clearly shown some of the necessary conditions for adaptation to take place.³⁸,³⁹ Ashby emphasizes the role of finite step transformations in the theory of adaptation and thus sets the stage for the use of dynamic programming techniques for the solution of multistage adaptive decision processes.

Bellman's extension of dynamic programming to the study of adaptive control processes⁴⁰ has provided a flexible mathematical tool which is itself well adapted for the analysis and simulation of adaptive processes in general. A great body of literature now exists on the theory and application of adaptive control systems.† Although the adaptive economic problem differs in some fundamental ways from the adaptive control problem, much of this work is useful in the study of adaptive economic processes. We shall employ the dynamic programming technique in our study of adaptive economic processes. First we must distill from the historical and current development of adaptive theory a notion of a general adaptive process.
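As an illustration only, the notion of a multistage adaptive decision process can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the model developed in this book: the function name run_adaptive_process, the two actions "invest" and "hold," the unit payoffs, and the simple frequency estimate are all hypothetical, and the decision rule is a myopic threshold rule rather than a dynamic programming solution. It shows only the loop itself: a decision is taken on the basis of the historical record, the environment responds stochastically, and the outcome is added irreversibly to the record.

```python
import random


def run_adaptive_process(p_favorable, stages, seed=0):
    """Simulate a hypothetical two-action adaptive decision process.

    p_favorable is the environment's true (hidden) probability of a
    favorable outcome; the decision maker never sees it directly.
    """
    rng = random.Random(seed)
    # Historical record, summarized as outcome counts. Starting both
    # counts at 1 is an assumed neutral prior, not taken from the text.
    favorable, unfavorable = 1, 1
    wealth = 0.0
    history = []
    for _ in range(stages):
        # Subjective probability formed from the record so far.
        estimate = favorable / (favorable + unfavorable)
        # Myopic choice mechanism: act only if the record looks favorable.
        action = "invest" if estimate > 0.5 else "hold"
        # Stochastic response of the environment.
        outcome = rng.random() < p_favorable
        if action == "invest":
            wealth += 1.0 if outcome else -1.0
        # The outcome is assumed observable whatever the action, and it
        # becomes a permanent part of the record.
        if outcome:
            favorable += 1
        else:
            unfavorable += 1
        history.append((action, outcome))
    final_estimate = favorable / (favorable + unfavorable)
    return final_estimate, wealth, history


if __name__ == "__main__":
    estimate, wealth, history = run_adaptive_process(0.7, 500)
    print(f"subjective probability after 500 stages: {estimate:.3f}")
    print(f"accumulated payoff: {wealth:.1f}")
```

The sketch deliberately keeps the record as two counts so that the role of history is visible; a richer record, or a rule that looks ahead over future stages, would change the choice mechanism but not the structure of the loop.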
1.2 Common Properties of Adaptive Processes If we examine various social and biological processes, we see certain common properties which may lead to an understanding of these processes. First, in taking an action, we see that at any point in time an adaptive system can move off in anyone of many directions. Second, we find that a move in any of these directions will cause a change to the system and perhaps its environment. This change becomes part of a record, the historical record of the system which is unchangeable. Third, there is in each adaptive process a choice or decision-making function. With knowledge of the historical record and an evaluation of the effect of each action on the present and future states of the system, the decision-making function chooses one of the alternative actions. Fourth, following the taking of an action, the system mayor may not achieve the anticipated result. Uncertainties in the environment or the effect of the action may cause the system to move in a direction which was unintended. t
† For example, see Reference 41.
Fifth, if the result was unfortunate, there is no recourse. The only corrective measure is to re-establish the above sequence of events all over again. Certain fundamental notions will be required to formulate adaptive processes.
(1) A model of an adaptive process must be a sequential time process since uncertainties in the environment make success difficult to obtain and hold.
(2) We must find a way of treating the information contained in the historical record. Thus, a concept of information will be an important notion.
(3) We must have a stochastic representation of the effects of the system's environment which, as we see above, plays a vital role.
(4) The role of the decision-making function must be made explicit.
(5) We note that only one action is to be selected out of many possible alternatives. How this one action is selected is a fundamental notion with which we must deal.
(6) We must specify the ways in which the structure of the system is changed by each of the possible actions and the possible effects of the environment.
A mathematical model is the best way to provide answers to these fundamental requirements for the study of adaptive processes. In the next chapter some of these notions will take mathematical form.
References
1. Engels, F., "Ludwig Feuerbach," p. 48. International Publishers, New York, 1941.
2. Madison, J., Federalist Papers No. 10, p. 55. Natl. Home Library, Washington, D.C., 1937.
3. Madison, J., Federalist Papers No. 10, p. 56. Natl. Home Library, Washington, D.C., 1937.
4. Madison, J., Federalist Papers No. 10, p. 57. Natl. Home Library, Washington, D.C., 1937.
5. de Tocqueville, A., "Democracy in America," Vol. 1, p. 267. Vintage Books, New York, 1954.
6. de Tocqueville, A., "Democracy in America," Vol. 1, p. 273. Vintage Books, New York, 1954.
7. de Tocqueville, A., "Democracy in America," Vol. 1, p. 264. Vintage Books, New York, 1954.
8. Lippmann, W., "The Public Philosophy," p. 35. Mentor, New York, 1956.
9. Malthus, T., "Population: The First Essay," pp. 7-18. Michigan Univ. Press, Ann Arbor, 1959.
10. Darwin, C., "The Origin of Species," p. 413. Appleton, New York, 1883.
11. Darwin, C., "The Origin of Species," p. 421. Appleton, New York, 1883.
12. Volterra, V., "Fluctuations dans la Lutte pour la Vie." Gauthier-Villars, Imprimeur-Libraire, Paris, 1938.
13. Lotka, A., "Elements of Mathematical Biology," pp. 24, 26. Dover, New York, 1956.
14. Lotka, A., "Elements of Mathematical Biology," p. 36. Dover, New York, 1956.
15. Lotka, A., "Elements of Mathematical Biology," p. 40. Dover, New York, 1956.
16. Rashevsky, N., "Mathematical Biology of Social Behavior," pp. 72-75. Chicago Univ. Press, Chicago, 1951.
17. Pavlov, I. P., "Conditioned Reflexes, an Investigation of the Physiological Activity of the Cerebral Cortex" (Dover ed.). Dover, New York, 1960.
18. Thorndike, E. L., "The Fundamentals of Learning." Columbia Univ., Teachers College, New York, 1932.
19. Tolman, E. C., "Behavior and Psychological Man, Essays in Motivation and Learning." California Univ. Press, California, 1958.
20. Estes, W. K., Toward a statistical theory of learning, Psychol. Rev. 57, No. 2, 94-107 (1950).
21. Bush, R. R., and Mosteller, F., "Stochastic Models for Learning." Wiley (Interscience), New York, 1955.
22. Flood, M., Stochastic learning theory applied to choice experiments with rats, dogs, and men, Behavioral Sci. 7, No. 3, 289 (1962).
23. Luce, R. D., "Individual Choice Behavior." Wiley, New York, 1959.
24. McCulloch, W. S., The brain as a computing machine, Elect. Eng. 68, 492-497 (1949).
25. Rosenblatt, F., Approaches to the study of brain models, in "Principles of Self-Organization" (H. von Foerster and G. W. Zopf, Jr., eds.), pp. 385-402. Macmillan (Pergamon), New York, 1962.
26. Widrow, B., Generalization and information storage in networks of adaline "neurons," in "Self-Organizing Systems 1962" (M. C. Yovits, G. T. Jacobi, and G. D. Goldstein, eds.), pp. 435-461. Spartan Books, Washington, D.C., 1962.
27. Murphy, R. E., Relations between self-adaptive control theory and artificially intelligent behavior in a stationary stochastic environment, in "Artificial Intelligence," Publication S-142, Winter General Meeting, pp. 45-63. Inst. Elect. and Electron. Eng., 1963.
28. Turing, A. M., On computable numbers with an application to the Entscheidungsproblem, Proc. London Math. Soc. Ser. 2, 42, 230-265 (1937).
29. Von Neumann, J., The general and logical theory of automata, in "John von Neumann - Collected Works" (A. H. Taub, ed.), Vol. 5, pp. 288-328. Macmillan (Pergamon), New York, 1963.
30. Wiener, N., "The Human Use of Human Beings," pp. 17-18. Doubleday, New York, 1954.
31. Hartley, R. V. L., Transmission of information, Bell System Tech. J. 7, 535 (1928).
32. Fisher, R. A., "Contributions to Mathematical Statistics," pp. 26-47. Wiley, New York, 1950.
33. Shannon, C. E., A mathematical theory of communication, Bell System Tech. J. 27, 379-423 (1948); ibid. 27, 623-656 (1948).
34. Kolmogorov, A., Interpolation und Extrapolation von stationären zufälligen Folgen, Bull. Acad. Sci. USSR Ser. Math. 5, 3-14 (1941).
35. Wiener, N., "Cybernetics," pp. 75-77. Wiley, New York, 1948.
36. Walter, W. Grey, An electromechanical "animal," Discovery 11, No. 3, 90-93 (1950).
37. Ashby, W. R., "Design for a Brain," pp. 93-102. Wiley, New York, 1954.
38. Ashby, W. R., "Design for a Brain," pp. 238-240. Wiley, New York, 1954.
39. Ashby, W. R., "An Introduction to Cybernetics," pp. 195-259. Wiley, New York, 1956.
40. Bellman, R., "Adaptive Control Processes: A Guided Tour." Princeton Univ. Press, Princeton, New Jersey, 1961.
41. Proc. First Internat. Symp. on Optimizing and Adaptive Control, Rome, April, 1962. Instrum. Soc. Am., Pittsburgh, Pennsylvania, 1962.
CHAPTER 2
THE MATHEMATICAL MODEL
... mathematics is the art of giving the same name to different things.
Poincaré, Science and Method
2.1 Introduction
Casual examination of adaptation in social and biological systems provides a framework for the development of a mathematical model of adaptive processes. A mathematical model is a formal assembly of logical interrelations which facilitates the determination of the essential elements of some physical, biological, or social process. Furthermore, a mathematical model permits one to analyze the effects of external and internal parametric changes on the system's variables. The adaptive process can be studied by use of the mathematical theory of discrete processes. A discrete process is defined on a set of positive integers, usually called the stages of the process. This type of process is different from, but closely related to, a continuous process, which is defined on a continuous line usually called time. Although physical time appears to advance as a continuous process, events occur discretely. We are primarily concerned with the effect of events on adaptive decision makers, the passage of continuous time being an indirectly related notion. Due to this concern over events, particularly economic events, we shall find that the discrete process is a natural model for the adaptive process.
2.2 Discrete Processes
There are three fundamental notions in the theory of discrete processes. These are the notions of (1) stage, (2) state, and (3) transition.
Consider two sets of positive integers:
$$S = \{1, 2, \ldots, s, \ldots\} \tag{2.2.1}$$
and
$$T = \{1, 2, \ldots, t, \ldots\}, \tag{2.2.2}$$
the elements of which are known as the state indexes and the stage indexes, respectively, of a discrete process. There is a set of numbers (or more generally, vectors) defined on the ordered pairs of S × T, which we will call the value function. In some explicit sense this function will represent the value of the process at each of the state and stage indexes. In deterministic processes, the value function at a given stage index is single valued. Thus, in deterministic processes we always know that there is a unique state for each stage of the process. Stochastic processes are characterized by value functions which are multivalued. It is possible for these processes to be in many different states for a given stage; thus, we are never certain what the value of the process will be at a given stage. We shall be dealing with stochastic processes in this book and it is fortunate that certain tricks have been developed which make these processes mathematically tractable. Figure 2.1 shows graphic examples of these discrete deterministic and stochastic processes.
[FIG. 2.1. The notion of a deterministic process. Two panels plot the states (1 to 5) against the stages t (1 to 8).]
The notion of a discrete sequential process is connected with the assumption that the set T may have some order. Physically, although not always mathematically, this order is unique. The sequential process can be represented as a sequence of events, one following another in "consciousness," all related to some physical time-reference process by the concept of simultaneity. This physical time-reference process is generated by some energy-conservative atomic or astronomical system. Suppose that there is some unique mapping function, denoted as the time mapping function, which connects each stage index, t, to some event or point of the physical time-reference process. Furthermore, assume that the order of "consciousness" of the physical time-reference process is preserved by this mapping. We see that the time mapping function must be a strictly monotone increasing function of the index, t, and since the physical time-reference process is said to have a direction in time, so then does the set T. We shall call the ordered set T, thus defined, the set of entropy time indexes. We shall see in Chapter 5 how this name earns its significance. This emphasis is placed on the notion of the time index because generally economic processes are only loosely coupled with physical time.† Figure 2.2 shows, as one example, the time mapping function between several weeks of physical time and a number of periods of entropy time on the New York Stock Exchange. It will be noted that the "stage" of the discrete process is the daily marketing process, and the value is the number of transactions for that stage.
[FIG. 2.2. Time mapping function between physical time (N.Y. Stock Exchange) and entropy time (stock investment process).]
† A great deal of attention to the time mapping function is essential to the correct interpretation of economic statistics. Statisticians are considering this problem when they deal with the seasonality of economic time series. See, for example, Reference 1.
Although this
particular time mapping function is probably not the best for the stock market process, it does show what is meant by a time mapping function. If economic processes are based on the sequence of experienced events, then the use of an entropy time basis will simplify the representation of these processes. Once the characteristics of the economic process are known with respect to entropy time, the results can be transformed back to physical time by the use of the time mapping function for that process. The third notion of discrete processes is the idea of a transition function. Discrete sequential processes are defined on the ordered set T, and for each element or stage there is a function defined on T × S which determines what the state of the process will be at that stage. This function is known as the transition function. Most realistic processes do not have transition functions defined on the entire set of ordered pairs of S × T. Historical processes are a subclass of discrete sequential processes where the transition functions do not depend on the ordered pairs of T × S which follow the stage at which the transition function is defined. The notion of direction of time is important in historical processes, although mathematically we can, and sometimes do, devise processes which unfold backwards in time. There is a subclass of historical processes which are called Markov processes. In these processes the transition function depends only on the directly preceding stage. Such processes are quite common in physical and economic systems. This is fortunate since a great deal is known about Markov processes. In many sequential processes, the transition function at each stage t depends only on t and not on any preceding stages. Such processes are called independent sequential processes. The Bernoulli process and the more general multinomial process are simple examples of discrete independent sequential stochastic processes.
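The distinction between an independent sequential process and a Markov process can be sketched in a few lines of Python. This is only an illustration: the function names, the two-state example, and the probabilities are assumptions chosen for the sketch, not taken from the text.

```python
import random

def bernoulli_process(p, n_stages, rng):
    """Independent sequential process: each stage is drawn afresh, ignoring the past."""
    return [1 if rng.random() < p else 0 for _ in range(n_stages)]

def markov_process(transition, initial_state, n_stages, rng):
    """Markov process: the next state depends only on the directly preceding state.

    transition[s] lists the probabilities of each next state, given state s.
    """
    states = [initial_state]
    for _ in range(n_stages - 1):
        row = transition[states[-1]]
        next_state = rng.choices(range(len(row)), weights=row)[0]
        states.append(next_state)
    return states

rng = random.Random(0)  # fixed seed so the sketch is repeatable
independent_path = bernoulli_process(p=0.3, n_stages=10, rng=rng)

# Hypothetical "sticky" two-state chain: each state tends to persist.
sticky = [[0.9, 0.1], [0.2, 0.8]]
markov_path = markov_process(sticky, initial_state=0, n_stages=10, rng=rng)

# A degenerate chain that always switches state gives a deterministic path.
alternating = markov_process([[0.0, 1.0], [1.0, 0.0]], 0, 4, rng)
```

In the Bernoulli case the draw at each stage uses no information from earlier stages; in the Markov case only the last entry of `states` is consulted.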
Generally, we shall assume that "nature" or the environment of an adaptive process is the result of a discrete independent sequential stochastic process or a Markov process. In the introduction to some special types of discrete sequential processes which follows, we shall use the following conventions: A capital letter followed by a subscript is the index of a vector which is the value of the state of a process at the stage designated by the subscript. A function, symbol T with a subscript, will denote a transition function of the type designated by the subscript. In most cases the function will depend on several state vectors, generally its own previous state
vector and the state vectors of other related processes within the system. A stochastic process is generated by including a random-valued vector in the transition function. Thus, the resulting state could be one of several values depending on the value of the random vector. It is assumed that some auxiliary discrete independent sequential process is generating the value of the random vector, and this process is running synchronously with the fundamental process. We shall call the main process the structural process, and the auxiliary process the environmental process.
2.3 Some Discrete Sequential Processes†
Descriptive Processes (Type 0 Processes)
One of the most elementary sequential processes is the type 0 process. Many models of physical processes and economic processes are of this type. Type 0 processes are often called descriptive processes since they are employed to describe the state of a structural system in which only the a priori (previous) state and the environment have any effect on the system. Figure 2.3 is a block diagram of a typical type 0 process.
[FIG. 2.3. Block diagram of a type 0 process: the structural system receives the structural environmental vector $U_t$ and, through a one-stage lag, the a priori structural state vector $S_t$.]
† See, for example, Reference 2.
Given some initial state, the system proceeds to transform this state into a new state. This new state in turn becomes the a priori state for
the next stage of the process, and so on for as many stages as desired. The type 0 process is characterized by the transition function
$$S_{t+1} = T_s(S_t, U_t), \qquad t = 0, 1, 2, \ldots, \tag{2.3.1}$$
where $S_t$ is the a priori structural state vector of the system at stage $t$, given $S_0$, the initial state vector, and $U_t$ is the structural environmental vector, that is, the effect of the environment on the structural state of the process at stage $t$.
Sequential Decision Processes (Type 1 or Feedback Processes)
Type 1 processes are generated by transitions that are functions of the a priori structural state vector, the structural environment vector, and a decision vector. Since the current decision vector is in turn a function of some previous structural state vector, the term "feedback" process is most appropriate. Figure 2.4 shows the block diagram of a type 1 process. The structural transition function is given by
$$S_{t+1} = T_s(S_t, D_t, U_t), \qquad t = 0, 1, 2, \ldots, \tag{2.3.2}$$
where $D_t$ is the decision vector.
[FIG. 2.4. Block diagram of a type 1 process: the structural system, with output $S_{t+1}$, receives the decision vector and, through a one-stage lag, the a priori structural state vector; the decision function receives the decision maker's environmental vector.]
The decision vector in turn could be given by a decision function:
$$D_t = D(S_{t-\tau}, V_t), \qquad t = 0, 1, 2, \ldots, \tag{2.3.3}$$
where $V_t$ is the decision maker's environmental vector, that is, the effect of the environment on the decision maker's action at the $t$th stage of the process, and $\tau$ is some finite integer time lag. In many cases, type 1 processes are deterministic. Each decision vector is completely determined by the previous structural state vector and the decision maker's environment vector. In turn each structural state vector is completely determined by the a priori structural state vector, the structural environmental vector, and the decision vector. In these deterministic cases, the type 1 process can be reduced to a type 0 process by the substitution of Eq. (2.3.3) into Eq. (2.3.2). The most important type 1 processes are the stochastic processes. Here the structural state transition function and perhaps the decision function contain environmental vectors that are random variables. In many cases the random variables may be "explained" by the existence of probability density functions, which give the likelihood for the occurrence of each of the possible values of the random variables.
Adaptive Decision Processes (Type 2 Processes)
Type 2 processes are generated by structural state transition functions of the a priori structural state vector, the structural environmental vector, and the decision vector, just as in the case of type 1 processes. The difference is that the decision vector is a function of not only the previous state vector and the decision maker's environmental vector, but also a historical information vector. This historical information vector is itself determined by a transition function of the a priori historical information vector. The historical information vector contains the record of the past observations of the environment by the decision maker. Figure 2.5 shows the block diagram of a type 2 process. The structural state transition function is given by Eq. (2.3.2). The decision vector is determined by
$$D_t = D(S_{t-\tau}, H_{t-\tau}, V_t), \qquad t = 0, 1, 2, \ldots, \tag{2.3.4}$$
where $H_t$ is the historical information vector at some stage $t$, and is given by
$$H_{t+1} = T_h(\cdot), \qquad t = 0, 1, 2, \ldots, \tag{2.3.5}$$
[FIG. 2.5. Block diagram of a type 2 process: the structural system (decision vector and a priori structural state vector $S_t$ in, a posteriori structural state vector out, one-stage lag); the historical process (a priori historical information vector $H_t$ in, $H_{t+1}$ out, one-stage lag); and the decision function, fed through stage lags by the lagged state and historical information vectors and by the decision maker's environmental vector.]
where $H_0$ is given and is called the conviction vector, and $T_h(\cdot)$ is the historical information transition function. In deterministic type 2 processes, the historical information vector would contain no new information, since the effect of the environmental vectors on the structural state would be known in advance for each state of the process. However, in stochastic processes the historical information vector is the only clue to the random effects of the environmental vectors on the structural state of the system. Thus, the historical information vector is of use to the decision maker in the determination of a decision vector under uncertain environmental conditions. We
shall be concerned with this type of stochastic adaptive decision process in this book.
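The three process types can be compared side by side in a minimal Python sketch. Everything concrete here is an assumption made for illustration: scalar states, the fallback to the initial state before `lag` stages have passed, the choice to let the historical transition function record the observed structural environment, and the example functions themselves.

```python
def run_type0(T_s, S0, U):
    """Type 0: S[t+1] = T_s(S[t], U[t])."""
    S = [S0]
    for u in U:
        S.append(T_s(S[-1], u))
    return S

def run_type1(T_s, D, S0, U, V, lag=1):
    """Type 1: S[t+1] = T_s(S[t], D_t, U[t]) with D_t = D(S[t-lag], V[t])."""
    S = [S0]
    for t, (u, v) in enumerate(zip(U, V)):
        k = max(t - lag, 0)          # assumption: use S0 until `lag` stages exist
        S.append(T_s(S[t], D(S[k], v), u))
    return S

def run_type2(T_s, D, T_h, S0, H0, U, V, lag=1):
    """Type 2: as type 1, but D also sees the lagged historical information H."""
    S, H = [S0], [H0]
    for t, (u, v) in enumerate(zip(U, V)):
        k = max(t - lag, 0)
        S.append(T_s(S[t], D(S[k], H[k], v), u))
        H.append(T_h(H[t], u))       # assumption: history records the environment
    return S, H

# Hypothetical example: the environment pushes the state up by 2 every stage.
def additive(s, d, u):
    return s + d + u

def record(h, u):
    count, total = h
    return (count + 1, total + u)

def cancel_mean(s, h, v):
    count, total = h
    return -total / count if count else 0.0   # offset the observed average push

U = [2, 2, 2, 2]
V = [0, 0, 0, 0]
type0_path = run_type0(lambda s, u: s + u, 0, [1, 2, 3])
type1_path = run_type1(additive, lambda s, v: -s + v, 4, [0, 0, 0], [0, 0, 0])
type2_path, history = run_type2(additive, cancel_mean, record, 0, (0, 0.0), U, V)
```

In the type 2 run the state stops drifting once the lagged history contains an observation of the environment, which is the sense in which the historical information vector is "of use to the decision maker."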
The State of Equilibrium for a Sequential Process
In dealing with sequential processes it is necessary to consider the concept of equilibrium. Equilibrium occurs in a sequential process when for any stage the a posteriori state vector is equal to the a priori state vector. A process in continuous equilibrium would have a zero rate of change for the state vector. For a type 0 process to be in continuous equilibrium, it would be necessary that (2.3.6)
and for
t
= 0,1,2, ....
(2.3.7)
But Eq. (2.3.6) and (2.3.7) could not occur unless for
t=0,1,2, ....
(2.3.8)
Thus, a constant environment is necessary for a state of continuous equilibrium to occur in a type 0 process. This condition completely excludes the stochastic sequential process from an equilibrium state since the environment vector is a random variable. In dealing with stochastic sequential processes, a different concept of equilibrium must be devised which does not require a constant environment. In a later chapter we will extend the concept of equilibrium to the stochastic sequential processes. This extension must await the introduction of the notion of entropy as applied to stochastic decision processes.
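The argument about continuous equilibrium can be checked numerically with a short Python sketch. The linear transition function and the numbers below are assumptions chosen only so that a fixed point exists; the equilibrium test simply applies the definition that the a posteriori state equals the a priori state at every stage.

```python
def type0_path(T_s, S0, U):
    """Iterate the type 0 transition S[t+1] = T_s(S[t], U[t])."""
    S = [S0]
    for u in U:
        S.append(T_s(S[-1], u))
    return S

def in_continuous_equilibrium(path, tol=1e-12):
    """True when every a posteriori state equals its a priori state."""
    return all(abs(after - before) <= tol for before, after in zip(path, path[1:]))

def T_s(s, u):
    # Hypothetical linear structural system.
    return 0.5 * s + u

fixed_point = 2.0                       # solves s = 0.5 * s + 1.0
constant_env = [1.0] * 5
varying_env = [1.0, 1.5, 0.5, 1.0, 1.2]

steady = in_continuous_equilibrium(type0_path(T_s, fixed_point, constant_env))
disturbed = in_continuous_equilibrium(type0_path(T_s, fixed_point, varying_env))
```

With the constant environment the path never leaves the fixed point; as soon as the environmental value changes, the state moves and the equilibrium test fails.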
2.4 The Role of the Decision Maker
The decision maker in an adaptive sequential decision process is concerned with the determination of some unique action within the structure of the process. In an adaptive process it is the task of the decision maker to identify or determine:
(1) his objective,
(2) the structural state variables,
(3) the stochastic elements,
(4) the historical information vector,
(5) the strategy for handling risk and uncertainty,
(6) the optimality conditions, and
(7) the sensitivity of the process to his decisions.
The decision maker's objective can be stated in many ways. For example, the decision maker may either:
(1) achieve maximum rate of growth or a maximum value of the process in a fixed time or with fixed resources; or
(2) achieve a given target value or a given rate of growth in a minimum period of time or with a minimum use of some resource.
In stochastic processes the objective must be defined to recognize the fact that the state of the process is a stochastic variable. Thus, in a stochastic process the decision maker might desire to:
(1) achieve a maximum expected value or rate of growth with a given permissible level of risk due to the uncertainty of the process, or in fixed time, or with fixed resources; or
(2) achieve a given expected target value, or rate of growth with a minimum level of risk, or in a minimum period of time, or with a minimum use of resources.
Most processes can be formulated in terms of one of these objectives. In many situations more unconventional objectives may be required. In any case, an important task of a decision maker is to establish some objective and a strategy to handle the risk and uncertainty of a stochastic process. Following the establishment of an objective, the decision maker must determine which variables should enter into the evaluation of his objective. For example, if the objective is to minimize the risk while achieving some target value, then the variables which measure the expected value and the risk of the process must be identified and measured. These variables become the elements of the structural state vector of the
process. In many economic problems the structural state vector will have only one element: utility. In physical problems, the structural state vector may consist of many elements such as the position and momentum of each particle in a system.
2.5 The Optimal Type 2 (Adaptive) Process
Adaptation is observed during the passage of time. Generally the transition of the a priori structural state vector to the a posteriori structural state vector can be broken down into three interstage events. These interstage events are:
(1) reception of the a priori historical information vector. At the beginning of each stage a "package" of information is received from the previous stage which contains the a priori structural state vector and the historical information vector.
(2) determination of the decision maker's action. The a priori historical information vector suggests a number of possible alternative actions from which a unique action is determined by the decision function.
(3) the transformation of the structural state vector and historical information vector. The action determined by the decision function and the effect of the external parameters on the process transform the a priori structural state vector into the a posteriori structural state vector. This new vector may differ from the expectations of the decision maker when the decision was made. This difference generates a transformation of the historical information vector, producing the new information package for the following stage.
In the above sequence, the second interstage event, the action, can be specified more completely. In types 1 and 2 sequential processes, the choice of action can be made on a rational basis. The concept of a rational action requires knowledge of the objective for the process. The objective is, at the least, a function of the structural state vector, and since the change in the structural state vector is the likely result of the action taken, then the change in the objective is in part a result of the action. Thus the action which results in an optimal change in the objective is denoted as the optimal action. Choice of this optimal action by the decision maker generates the optimal decision vector.
There is a possibility that the decision maker will, for some reason or another, not choose the optimal action even though such an action exists. Processes of this
kind are important in the theory of learning, and these processes will be considered in Chapter 3. Biological adaptive processes generally assume that the size of the population of a species is good evidence of the optimality of the process. The objective is taken that a species should grow at the maximum possible rate; thus the optimal action at each stage of the process is to choose the action which assures the best likelihood of survival for the species. On the other hand, economic adaptive processes often are based on the assumption that the maximum accumulation of some form of wealth is good evidence of optimality. The hypothesis is generally made that the optimal action at any stage is the action which most likely leads to a maximum rate of growth of a form of wealth. In many cases, the form of wealth considered in economic processes is in the form of monetary units; in other cases, it is the utility of commodities which is accumulated. The sequence of interstage events can be modified to include the concept of optimality. To do so we would replace event (2) by the following:
(2a) determination of the decision maker's optimal action. The a priori historical information vector suggests a number of possible alternative actions which alter the structural state vector. At least one of these actions is unique in that this action will probably achieve an optimal move of the objective. When the decision maker chooses the optimal action, he has generated the optimal decision vector.
The sequence of interstage events of an adaptive process (information, decision, and transformation) will appear in each kind of adaptive process. Consider two examples, one a political adaptive process, the other a biological adaptive process. In the political adaptive process, the a priori information consists of the structural state vector of the government and the historical information vector.
A number of political issues can be identified by knowing the current state of government and history. Solutions to these current issues are suggested. From these solutions one solution is chosen as the optimal decision by some sort of voting procedure. The optimal solution is generally the one which is believed to achieve the best chance of survival for the government. The action resulting from this solution, and the effect of the environmental state (for example, world events) determine the transformations of the states of the government and history. Thus, new information is generated for the next stage of the process. In a biological adaptive process the a priori information consists of the current biological (structural) state of the organism and the current memory of the effects of the environment (the historical information vector). This information suggests a number of changes of the biological state. One of these changes, the optimal decision, appears to give the most likelihood of survival for the organism. Taking this optimal action and being subjected to environmental events results in a transformation of the biological state and the memory state of the organism. This transformation establishes the new information for the next stage of the process.
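The three interstage events (information, decision, transformation) can be traced in a small Python sketch. The setting is hypothetical: two alternative actions with 0/1 payoffs, a structural state that is accumulated wealth, a historical information vector of per-action `[wins, trials]` counts, and a simple greedy rule standing in for "the optimal action." None of these choices comes from the text.

```python
def optimal_action(H):
    """Event (2a): choose the action with the highest estimated payoff rate."""
    rates = [wins / trials for wins, trials in H]
    return max(range(len(H)), key=lambda a: rates[a])

def adaptive_stage(S, H, outcomes):
    """One stage: information (S, H) -> decision -> transformation."""
    action = optimal_action(H)            # decision from the a priori information
    payoff = outcomes[action]             # effect of the environment on that action
    S_next = S + payoff                   # a posteriori structural state (wealth)
    H_next = [list(h) for h in H]         # copy, then update the historical record
    H_next[action][0] += payoff
    H_next[action][1] += 1
    return S_next, H_next, action

def run(S0, H0, environment):
    """Repeat the stage once for every environmental event."""
    S, H, actions = S0, H0, []
    for outcomes in environment:
        S, H, action = adaptive_stage(S, H, outcomes)
        actions.append(action)
    return S, H, actions

# Conviction vector: both actions start with 1 win in 2 (imagined) trials.
H0 = [[1, 2], [1, 2]]
# Hypothetical environment: action 0 never pays, action 1 always pays.
environment = [(0, 1), (0, 1), (0, 1)]
wealth, history, actions = run(0, H0, environment)
```

After one disappointing stage the historical record lowers the estimate for action 0, and the decision maker switches to action 1 for the remaining stages.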
2.6 The Constrained, Optimal Type 2 (Adaptive) Process
Economic processes are processes which determine the allocation of some scarce factors in such a way as to achieve some objective. We have seen how a sequential decision process can be formulated as a goal-oriented process by including an objective which is, at the least, a function of the structural state vector. We now must tackle the concept of scarcity which is an essential part of the economic process. We shall do so by the use of the notion of constrained optimality. Suppose the rate of structural change in an adaptive process were unconstrained. If the information received at the beginning of a stage were known with perfect certainty, then it is possible that the process could adapt in one stage and remain in an equilibrium state thereafter. For example, suppose an adaptive organism, initially in equilibrium with its stationary environment, is transported to another region where a remarkable change in the environment is experienced. If the organism's rate of structural change were unconstrained, the organism could choose an action which would, in one stage, alter its structural state so as to again place the organism in equilibrium with its new environment. During the following stage no further adaptation would be required. This kind of adaptive process would not be of much interest, or be a very realistic model of actual processes. Adaptation generally takes place smoothly or in small quantum jumps. We shall assume that there exists an upper constraint on the total amount of change of the structural state vector. Furthermore, we shall assume that the adaptive process,
because of the optimality of the decision maker's actions, allocates this constrained total amount of change so that an optimal move toward the objective occurs. An example would clarify this point. Consider the political adaptive process. During a party's term in office, only a limited number of laws can be adopted and executed. Thus, only a limited rate of change in the structure of the society is possible. The government must allocate its limited efforts to obtain an optimum effect so that its re-election is assured at the end of the term. In this sense the political adaptive process is an economic adaptive process.
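The effect of an upper constraint on structural change can be sketched in Python as follows. The scalar state, the known target, and the names `constrained_step` and `adapt` are assumptions for illustration; the allocation of a limited total change among several components of a state vector is not modeled here.

```python
def constrained_step(S, target, max_change):
    """Move S toward the target, but by no more than max_change in one stage."""
    gap = target - S
    step = max(-max_change, min(max_change, gap))
    return S + step

def adapt(S0, target, max_change, n_stages):
    """Path of the structural state over n_stages of constrained adaptation."""
    path = [S0]
    for _ in range(n_stages):
        path.append(constrained_step(path[-1], target, max_change))
    return path

# Unconstrained change: the process adapts in a single stage.
one_stage = adapt(0.0, 10.0, float("inf"), 2)

# Constrained change: adaptation is spread over several stages.
gradual = adapt(0.0, 10.0, 3.0, 5)
```

The unconstrained path jumps to the target at once and stays there, while the constrained path approaches it in bounded increments, which is the smoother behavior the section argues is more realistic.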
2.7 Causality and Markov Processes
In Section 2.3 we saw that a sequential process was characterized by the properties of its transition functions for the solution of a general discrete process. We will see in this section that the properties of the transition functions have far-reaching consequences involving the notions of causality and determinism. We have for a general discrete process the solution
$$S_t = T(S_0, t) \qquad \text{for } t = 1, 2, \ldots, \tag{2.7.1}$$
where $S_0 = C$, $S_t$ is the state vector at stage $t$, $S_0$ is the initial state of the system, and $C$ is a given constant vector. Deterministic systems possess some very useful properties. Suppose we know some initial state, $S_0$, and we wish to calculate the state at some later stage $t_1$. From Eq. (2.7.1) we have
$$S_{t_1} = T(S_0, t_1), \tag{2.7.2}$$
where $S_0 = C$.
Suppose $\delta t$ more stages pass, putting us at stage $t_2$ (where $t_2 = t_1 + \delta t$) and we wish to calculate the state at stage $t_2$. Starting with an initial condition of $S_{t_1}$, we would have
$$S_{t_2} = T(S_{t_1}, \delta t), \tag{2.7.3}$$
where $\delta t > 0$.
If instead we went back to Eq. (2.7.2) and calculated directly,
$$S_{t_2} = T(S_0, t_2), \tag{2.7.4}$$
where $S_0 = C$, would we have the same state vector at $t_2$? In other words, would it be true that
$$T(S_{t_1}, \delta t) = T(S_0, t_2)? \tag{2.7.5}$$
The condition for the equality of these expressions (Eq. (2.7.5)) constitutes one of the fundamental points in the theory of difference equations. If Eq. (2.7.5) is valid, we say that the system so represented is a deterministic system. In classical mechanics, Eq. (2.7.5) expresses the principle of causality. But what could cause Eq. (2.7.5) to be not valid? Certainly we must be assured that a solution to Eq. (2.7.1) exists. Besides the existence of a solution, suppose we know that there are many solutions to Eq. (2.7.1); i.e., a solution exists but it is not unique. A stochastic process is characterized by the fact that a process may be in any one of several states at any given stage. Surely then a stochastic process could not lead to a solution which was both existent and unique. This would be a disaster to the mathematical analyst since practically all actual processes are stochastic processes. Fortunately, there is a way out of this dilemma. Suppose we do not try to find out the state of the process at some stage but only ask for the probability of a certain state. The set of probabilities for each of the possible states at any stage is the probability function or probability vector for the process at that stage. Now instead of calculating the actual state of the process at each stage we calculate the probability function of the process at each stage. In effect, we are substituting probability states for actual states, and basing the mathematical model on these probability vectors. From the probability calculus, we can write
of
P(i, t 2 ;j, to) =
!- P(i, t
2 ;
k, t 1 )P(k, t 1 ;j, to)
for any
kES
i,j
E
S,
u.,», > 0
and
to
= 1,2,...
(2.7.6)
where P(i, t 2 ;j , to) is, for example, the probability that the system will move from state j in stage to to state i in stage t 2 •
Note that each term in the summation of Eq. (2.7.6) is the probability that the system will move from $j$ to $i$ over a different path through $k$, where $k$ runs through the set of all the possible states, $S$, at stage $t_1$. Equation (2.7.6) is the stochastic analogue of Eq. (2.7.5). The principle of causality applies to the analysis of state probability vectors if Eq. (2.7.6) is valid. Equation (2.7.6) is known as the Chapman-Kolmogorov equation.⁴ If we let $t_0 = 0$, $t_1 = t$, and $t_2 = t + 1$, Eq. (2.7.6) becomes
$$Q(i, t+1) = \sum_{k \in S} P(i, t+1; k, t)\, Q(k, t) \quad \text{for each } i \in S, \text{ given } S_0, \tag{2.7.7}$$
where $Q(k, t)$ is, for example, the probability of being in state $k$ at stage $t$, given the initial state $S_0$. Note, for example, that
$$Q(k, t) = P(k, t_1; j, t_0), \tag{2.7.8}$$
if we started in the $j$th state at stage 0. The $P(\cdot, \cdot\,; \cdot, \cdot)$'s are called transition probability functions, and the $Q(\cdot, \cdot)$'s are the probability state vectors.

Suppose we make one important assumption regarding the transition probability functions. We assume that over the period of interest, say $t = 1, 2, \ldots, T$, the transition probability functions are not functions of the stage index, i.e., not functions of time. We rewrite Eq. (2.7.7) as
$$Q(i, t+1) = \sum_{k \in S} P(i, k)\, Q(k, t), \quad \text{given } S_0, \tag{2.7.9}$$
where
$$P(i, k) = P(i, t+1; k, t) \quad \text{for all } t = 1, 2, \ldots, T. \tag{2.7.10}$$
Equation (2.7.9) is the equation for a discrete Markov process. Note that once the current probability state vector, $Q(k, t)$, is specified, all the future probability state vectors can be determined. The probability state vectors before stage $t$ are not necessary, and can have no effect on the calculation of the future states of the process once $Q(k, t)$ is specified. This is the fundamental property of Markov processes.

To preserve determinism in the mathematical model, and to simplify the effect of history on an adaptive stochastic decision process, we shall assume that the Chapman-Kolmogorov relation holds, and that the transition probability functions are constant over the period of interest. This assumption is probably not unrealistic, since a vast majority of physical and economic processes closely resemble this kind of process. In particular, statistical evidence indicates that investment and learning processes, at least, are likely to be generated by simple Markov processes of the random walk type.⁵,⁶ We shall consider adaptive investment processes, and, to a limited extent, learning processes in later chapters.
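The Markov recursion of Eq. (2.7.9) and the Chapman-Kolmogorov relation of Eq. (2.7.6) can be illustrated numerically. The following is a minimal sketch in Python and is not part of the original text; the two-state transition probabilities, the initial vector, and the function names (`step`, `propagate`, `matmul`) are hypothetical choices made only for illustration. Exact fractions are used so that the two routes to stage 5 can be compared for strict equality.

```python
from fractions import Fraction

# Hypothetical stationary transition probabilities, P[i][k] = P(i, k):
# the probability of moving to state i from state k. Each column sums to one.
P = [
    [Fraction(9, 10), Fraction(1, 2)],
    [Fraction(1, 10), Fraction(1, 2)],
]

def step(P, Q):
    """Apply Eq. (2.7.9) once: Q(i, t+1) = sum over k of P(i, k) Q(k, t)."""
    n = len(Q)
    return [sum(P[i][k] * Q[k] for k in range(n)) for i in range(n)]

def propagate(P, Q, stages):
    """Advance the probability state vector Q by the given number of stages."""
    for _ in range(stages):
        Q = step(P, Q)
    return Q

def matmul(A, B):
    """Compose two multi-stage transition matrices, summing over the middle state k."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Q0 = [Fraction(1), Fraction(0)]   # assumed start: state 0 with certainty

# Stage-by-stage route: five applications of the one-stage recursion.
Q5_stepwise = propagate(P, Q0, 5)

# Chapman-Kolmogorov route: build the 2-stage and 3-stage transition
# matrices, compose them through the intermediate stage, then apply once.
P2 = matmul(P, P)
P3 = matmul(P2, P)
Q5_composed = step(matmul(P3, P2), Q0)

print(Q5_stepwise == Q5_composed)   # compares the two routes
print(sum(Q5_stepwise))             # total probability at stage 5
```

Only the current vector is carried from stage to stage in `propagate`; no earlier vectors are stored, which is the Markov property stated above.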
2.8 The Dynamic Programming Model

In the study of adaptive processes we have chosen to use the technique of dynamic programming as a method to generate solutions and properties for these processes. Bellman,⁷ who introduced the technique of dynamic programming, has recently extended the technique to the study of self-adaptive control systems. This extension has been summarized in the last chapters of Bellman's "Adaptive Control Processes."⁸ The application is more specifically covered in a series of papers on self-adaptive control systems and communication theory.⁹⁻¹⁴

The dynamic programming technique will be used because it leads to a model of a sequential discrete decision process closely resembling the actual process of economic decision making. More specifically, we shall consider only the stochastic variety of these processes, since the risk and uncertainty arising in such processes are generally present in economic decision making.

We shall assume that a stochastic environmental process generates a sequence of events that are experienced by the decision maker. The likelihood of any particular event is described by a probability function. In nonadaptive processes it is supposed that this probability function is known by the decision makers; however, in adaptive processes this function is not directly known. The adaptive decision maker does know the history of the past occurrences of the environmental events. It is in this sense that the adaptive decision maker possesses only limited information concerning the parameters of the economic environment.

We have seen that adaptive decision processes are irreversible. Once a decision is made, the effects of that decision are recorded as part of the history of the process. In the future, corrective action may be taken for a bad decision, but the result of the bad decision is permanently part of the history of the process.

The assumptions made above are made for the sake of greater realism. They are costly in terms of mathematical complexity. For example, the history of any process may have an enormous number of degrees of freedom. We shall have to make certain concessions to bring an otherwise complex problem to simple results. A necessary assumption of the dynamic programming technique is that the process described is a Markov process. This assumption leads to Bellman's "Principle of Optimality." The Markovian nature of a $T$-stage decision process assures that:
After any number of stages, say $t$, the effect on the objective function of the remaining $T - t$ stages of the process depends only on the state of the process at the end of the $t$th stage and subsequent stages.

In dynamic programming, the state vector of the process at the end of the $t$th stage is transformed into the state vector for the $(t + 1)$th stage. This transformation is accomplished partly by the decision maker by applying a decision vector. The optimum decision vector is the one which optimizes a specified objective function for the decision maker for the $(t + 1)$th stage and the remaining $T - t - 1$ stages. At the end of the process (stage $T$), the set of optimum decision vectors, sequentially chosen by the decision maker, is called by Bellman the "optimum policy."

In a dynamic program for a stochastic sequential decision process, the structural state vector must be redefined so as to contain the state probability function of the environmental process, since we shall require that the stochastic sequential decision process must meet the requirements of the Chapman-Kolmogorov relation. The Markovian property is obtained by defining the state vector in such a way as to contain the probability function of the environmental process. Such a requirement, incidentally, gives rise to mathematical models going either forward or backward in time.¹⁵ Such an ordering of events is not time as we know it in the physical world, since physical time has direction as well as order.¹⁶ Physical processes are irreversible. Although the actual processes we desire to study are irreversible in time, we are forced to ensure that the model of the process is reversible in time.

We turn our attention to the complexity of the state vector. Instead of defining the state vector so as to contain the state probability function of the stochastic process, we can simplify the vector by including only the expectation function of the stochastic variables. Other functions of the state probability function could be used, for example, the moment-generating function or the characteristic function, but herein we shall use the expectation function.
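The stage-by-stage optimization described above can be sketched as a backward recursion. The following Python fragment is an illustration only and is not taken from the original text: it treats the simpler nonadaptive case in which the transition probabilities are known, and the states, decisions, payoffs, and transition probabilities (`STATES`, `DECISIONS`, `reward`, `transition`) are hypothetical.

```python
def backward_induction(states, decisions, reward, transition, T):
    """Return (stage-1 values, list of per-stage optimal decisions) for a T-stage process."""
    value = {s: 0.0 for s in states}      # nothing is earned after stage T
    policy = []
    for _ in range(T):                    # work backward from stage T to stage 1
        new_value, stage_policy = {}, {}
        for s in states:
            best_d, best_v = None, float("-inf")
            for d in decisions:
                # Stage payoff plus the expected value of the remaining stages,
                # which depends only on the next state (Markov property).
                v = reward(s, d) + sum(p * value[s2]
                                       for s2, p in transition(s, d).items())
                if v > best_v:
                    best_d, best_v = d, v
            new_value[s], stage_policy[s] = best_v, best_d
        value = new_value
        policy.insert(0, stage_policy)    # earliest stage goes first
    return value, policy

# Hypothetical two-state, two-decision example.
STATES = ("low", "high")
DECISIONS = ("hold", "invest")

def reward(s, d):
    """Stage payoff: 1 in the high state, less a cost of 0.5 for investing."""
    base = 1.0 if s == "high" else 0.0
    return base - (0.5 if d == "invest" else 0.0)

def transition(s, d):
    """Assumed probabilities of the next state, given state s and decision d."""
    if d == "invest":
        return {"high": 0.8, "low": 0.2}
    return {"low": 1.0} if s == "low" else {"high": 0.5, "low": 0.5}

value, policy = backward_induction(STATES, DECISIONS, reward, transition, T=2)
print(policy)   # one decision rule per stage, earliest stage first
print(value)    # expected total payoff from each starting state
```

The list `policy` plays the role of the "optimum policy": one decision rule for each stage, each chosen with regard to the remaining stages only.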
The adaptive sequential decision process is a process in which the probability function of the environment is not known to the decision maker. We shall study a class of adaptive processes in which we assume that the functional form of the probability function is known, but the parameters of this function are not known. In this case, the decision maker possesses a historical state vector which is composed of a sequence of event descriptions coming from the environmental process. To reduce the number of variables of the transition function, which ordinarily would contain the entire history of the process, we shall assume that the history can be collapsed into a sufficient statistic which contains all the information in the historical sequence of event descriptions.

There are other reasons why we employ a sufficient statistic to convey historical information. We are concerned with the rationality of the decision maker. When speaking about a subjective process we have no right to hold that a decision maker must have knowledge of a particular type of historical statistic; but we do have the right to say that if a certain statistic is in some specified way better than any other statistic, then the decision maker who uses this statistic is a "rational" adaptive decision maker. The sufficient statistic is related to the theory of statistical estimators and subjective probabilities. We shall investigate in Chapter 4 the properties possessed by the best estimators, and the relations between these estimators and sufficient statistics.

References

1. Young, A., Census trading-day adjustment method, in Business Cycle Devel. Ser. ES1, No. 64-5, pp. 59-64. U.S. Dept. Commerce, Washington, D.C., 1964.
2. Bellman, R., Dynamic programming, intelligent machines, and self-organizing systems, "Mathematical Theory of Automata," Microwave Res. Inst. Symp. Ser. Vol. 12, pp. 1-11. Polytechnic Press, Brooklyn, New York, 1963.
3. Hille, E., "Functional Analysis and Semi-Groups," pp. 388-390, 423-438. Amer. Math. Soc., Providence, Rhode Island, 1948.
4. Feller, W., "An Introduction to Probability Theory and Its Applications," pp. 423-424. Wiley, New York, 1957.
5. Osborne, M. F. M., Brownian motion in the stock market, Operations Res. 7, No. 2, 145-173 (1959).
6. Kemeny, J. G., and Snell, J. L., Markov processes in learning theory, Psychometrika 22, 221-230 (1957).
7. Bellman, R., "Dynamic Programming." Princeton Univ. Press, Princeton, New Jersey, 1957.
8. Bellman, R., "Adaptive Control Processes," Chapter 16. Princeton Univ. Press, Princeton, New Jersey, 1961.
9. Bellman, R., and Kalaba, R., On adaptive control processes, IRE Trans. Auto. Control 4, No. 2, 1-9 (1959).
10. Friemer, M., A dynamic programming approach to adaptive control processes, IRE Trans. Auto. Control 4, No. 2, 10-15 (1959).
11. Bellman, R., and Kalaba, R., Dynamic programming and adaptive processes: mathematical foundations, IRE Trans. Auto. Control 5, No. 1, 5-10 (1960).
12. Aoki, M., Dynamic programming approach to a final-value control system with a random variable, IRE Trans. Auto. Control 5, No. 4, 270-282 (1960).
13. Aoki, M., On optimal and suboptimal policies in the choice of control forces for final-value systems, IRE Trans. Auto. Control 5, No. 3, 171-178 (1960).
14. Bellman, R., and Kalaba, R., On communication processes involving learning and random duration, IRE Intern. Conv. Record, Part 4, 16-21 (1958).
15. Reichenbach, H., "The Direction of Time," pp. 24-29. California Univ. Press, Berkeley, California, 1956.
16. Wiener, N., "Cybernetics," p. 44. Wiley, New York, 1948.
CHAPTER 3
THE PRIMITIVE ADAPTIVE PROCESS

If an individual enjoys well-ordered thoughts, it is quite possible that this side of his nature may grow more pronounced at the cost of other sides and thus may determine his mentality in increasing degree.
Albert Einstein, Autobiographical Notes

3.1 The Notion of Learning and Adaptation
In this chapter we shall develop a model of a primitive adaptive process which will illustrate some important characteristics of adaptive processes in general. We have seen that an element of uncertainty in a type 2 sequential process (adaptive process) is an essential requirement, because if there is no uncertainty the type 2 process will collapse into a type 1 process, and in turn the type 1 process will collapse into a type 0 process. The exact kind of uncertainty has not been specified up to now. We will consider two basic kinds of uncertainty.

The first kind is associated with the determination of "learning" rates of animals in "T" maze experiments. Here the environment of the learning process, which is a type of adaptive process, is deterministic. That is, the end of the "T" associated with a reward is the same for each observation in a given experiment. On the other hand, the animal does not appear to make decisions (at the cross of the "T") with certainty, at least in the early stages (observations) of an experiment. The uncertainty in adaptive processes of this first kind arises from a stochastic disturbance in the decision process.

The second kind of adaptive process is associated with behavior where "learning" (defined as the reduction of uncertainty in the decision-making process) has already taken place. Here decisions are made deterministically, but the environmental parameters are subject to uncertainty. Gambling processes are generally of this type. A person playing blackjack may have learned the optimal strategy so that his decisions are deterministic at each stage of the game. However, the dealing of the cards is a stochastic process, and thus the necessary random element is provided by a stochastic environment, the card-dealing process.

Thus, we find that an adaptive process of the first kind is one where the state of the environment is deterministic, but the decision maker is uncertain about it, resulting in an uncertain decision; on the other hand, an adaptive process of the second kind is one where the decision maker is confronted with an uncertain state of the environment, but given the state of history his current decision is deterministic. Although the observation of both kinds of adaptive processes results in data which are quite similar, the theoretical difference can be made clear through the examination of some special hypothetical experiments.
3.2 Some Hypothetical Experiments

Suppose a decision maker (in this case a "rational" man) is shown two urns, and unknown to him one contains all white balls and the other all black balls. He does not know which urn holds which color, but he does suspect that the composition of the urns will be constant during the experiment. On each stage (trial) a reward is paid for drawing a white ball, and a penalty is assessed for drawing a black ball. The decision maker has some subjective probability that one particular urn contains a particular color of ball, but we can restrict ourselves to the Laplacian case where he considers it equally likely that either urn contains white balls. At the first stage, he draws a ball from either of the urns, and examines it. If it were a white ball, on each successive stage (with probability of almost one) he would draw from the same urn, and thus to his surprise receive only white balls. If it were a black ball, on each successive stage (with probability of almost one) he would draw from the other urn, and to his surprise receive white balls thereafter.

We ask two questions: why would the "rational" man switch to the other urn in the case of drawing a black ball on the first trial, or stay with the chosen urn if the first ball had been white? Also, why would the probability of the switch be so high if the first ball were black, and the probability of remaining with the chosen urn so high if the first ball were white? In effect, the decision maker has apparently learned in one trial that a particular urn is rewarding because he may suspect that the environment of the experiment is deterministic. Somehow the deterministic environment (in other words, the probability that a particular urn contains a particular color is unity) permits the instant shifting of the expected return from a series of stages from zero to some positive value after the first stage. The rational man's strategy is that the color of the first ball is probably a good indication of the color of the remaining balls in that urn for such an environment. To answer the second question, we may conclude that the decision maker is intelligent or introspective; therefore he can work out the calculations determining a subjective expectation of return from a series of stages, and he has the conviction that his calculations are correct. Such could be the expected intelligence of a man; but a rat might not immediately "learn" how to calculate, and thus make a decision to maximize his return from a series of stages.

This brings us to the second experiment. Consider a psychologist's rat-testing "T" maze, consisting of a decision point and two paths to identical boxes, one containing food, the other containing nothing (hunger). The probability that the food is on one side is unity, and the probability that hunger is on the other side is also unity. Thus, we see the similarity to the urns of the first experiment. The rat will be expected to start his first stage (trial) with Laplacian subjective probabilities. If the first stage was rewarding, the rat could be expected to go in this same direction on following stages (trials) with gradually increasing probability following each success (reinforcement). If the first stage was a penalty, the rat could be expected to go to the other side on following stages, with gradually increasing probability following each success. The rat does not have sufficient introspection or confidence to calculate expected returns over time and formulate a strategy based on subjective probabilities. Therefore the rat's behavior in the experiment is entirely different from a man's behavior in a similar experiment.
Let us return to the urn experiment and relax the assumption that each urn must contain balls of one color. Given the same total number of balls in each urn, let us transfer a number of black balls from the black urn to the white urn, and an equal number of white balls from the white urn to the black urn. Both urns then will still contain the same number of balls as before, and the probability of drawing a white ball from one will equal the probability of drawing a black ball from the other. The man does not know the composition of each urn, but only knows that the composition will be constant over the stages comprising the experiment.

Again the decision maker will have some initial subjective probability of drawing a white ball from each urn. Let us suppose that he initially thinks a white ball is more likely to be drawn from urn 1. If he draws a white ball from urn 1, he will draw from this particular urn on the following stage. If on the succeeding stages he receives more white balls than black from urn 1, he will stay with this urn and never choose urn 2. However, if he draws a black ball from urn 1 on the first stage, on the following stage his conviction that urn 1 is the urn that has the greatest probability of producing white balls will be somewhat shaken. If in the draws on the succeeding stages he finds more black balls than white, at a crucial stage he will switch to urn 2 for the remaining stages, with a new conviction that urn 2 is the most rewarding.

To sum up the results of the above experiment, we see that the decision maker's behavior would be characterized by a strong stability. He "sticks to his guns" unless he clearly is shown that he is wrong. If shown to be wrong, he alters his choice decisively, and once having changed he remains with stability at the new behavior pattern.

The difference between this hypothetical human decision maker, decisive and confident as he draws from an urn, and the rat, uncertain and hesitant at the junction of the "T" maze, is striking. In the "T" maze experiment, an adaptive process of the first kind, it is assumed that the actions taken by the rat will be probabilistic and unreliable, but that the reward of the maze will remain the same throughout the experiment. On the other hand, in the urn experiment, an adaptive process of the second kind, it is assumed that the decisions will be deterministic and clear cut, but that the color of the balls drawn from the urns will be probabilistic and unreliable. We see then that the adaptive process of the first kind is in a sense a complement of the adaptive process of the second kind.
There is no doubt that in the actual behavior of rats in "T" mazes there exists some element of introspection, however weak. And in the behavior of humans there is no doubt that some erratic indecision occurs at the critical points. Thus, we are led to believe that neither the first kind nor the second kind of adaptive process is applicable to actual decision experiments. A process somewhere between these two extremes is necessary: an almost pure adaptive process of the first kind for low introspective animals, and an almost pure adaptive process of the second kind for highly introspective animals. Such a process can be devised, and we will do so in the following sections.
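The "sticks to his guns, then switches decisively" behavior of the human decision maker in the mixed-urn experiment can be sketched as a small simulation. This Python fragment is an illustration only: the text does not say how the "crucial stage" is determined, so the rule used here (switch once black draws exceed white draws by a fixed `threshold`) is an assumption, as are the function name and its parameters.

```python
import random

def run_urn_experiment(p_white_urn1, stages, threshold=3, seed=0):
    """Simulate the stay-or-switch behavior; return (urn chosen per stage, switch stage)."""
    rng = random.Random(seed)
    current_urn = 1            # assumed initial conviction: urn 1 is the rewarding urn
    switched_at = None
    whites = blacks = 0
    choices = []
    for t in range(1, stages + 1):
        choices.append(current_urn)
        # Urn 2 mirrors urn 1: white from urn 2 is as likely as black from urn 1.
        p_white = p_white_urn1 if current_urn == 1 else 1.0 - p_white_urn1
        if rng.random() < p_white:
            whites += 1
        else:
            blacks += 1
        # Hypothetical switching rule: abandon urn 1 once the evidence against it
        # is clear, then remain with urn 2 for all remaining stages.
        if switched_at is None and blacks - whites >= threshold:
            current_urn, switched_at = 2, t
    return choices, switched_at

print(run_urn_experiment(p_white_urn1=1.0, stages=6))   # urn 1 holds only white balls
print(run_urn_experiment(p_white_urn1=0.0, stages=6))   # urn 1 holds only black balls
```

The decisions are deterministic given the history of draws; all randomness enters through the color of the ball, which is the mark of a process of the second kind.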
3.3 An Adaptive Process of the First Kind

Consider the "T" maze experiment, which is analogous to the urn experiment above. Let us assume that some probability can be determined that measures the likelihood that the decision maker (rat or man) will go to a particular side of the "T," say the side leading to the payoff, on the $t$th stage. We will call this probability the Q probability for the $t$th stage. The initial value of this probability is not very important, since during the learning process the Q probability is successively refined and approaches unity after a great many stages (trials). One can say, however, that the greater the deviation of the initial probability from unity, the greater the number of stages required to reach some given level near unity. Let us define a function which transforms the a priori Q probability to an a posteriori Q probability for a given $(t + 1)$th stage, where the transform is a function of the results of the last, $t$th, stage, i.e., generally the intensity of the payoff. Thus, for $M$ alternative directions,
$$q_{i,t+1} = T_a(q_{it}, v_t) \quad \text{for each } i \in M, \tag{3.3.1}$$
subject to the condition that
$$q_{it} \ge 0 \quad \text{for each } i \in M \tag{3.3.2}$$
and
$$\sum_{i \in M} q_{it} = 1, \tag{3.3.3}$$
where $T_a(\cdot)$ is a transformation function for the decision parameters, $q_{it}$ is the probability that the decision maker will take the $i$th alternative on the $t$th stage of an adaptive process of the first kind, and $v_t$ is the kind of payoff received by the decision maker at the completion of the $t$th stage of the process. In the case of the "T" experiment, some simplification can be gained by limiting the payoff to a binary function with only two levels, such as food or no food. In this case, Eq. (3.3.1) reduces to
$$q_{1,t+1} = T_a(q_{1t}, v_t) \tag{3.3.4}$$
and
$$q_{2,t+1} = 1 - q_{1,t+1}, \tag{3.3.5}$$
where $q_{1t}$ is the probability that the decision maker will take the right-hand side of the "T" on the $t$th stage, and $q_{2t}$ is the probability that the decision maker will take the left-hand side of the "T" on the $t$th stage of the process. We may still further simplify Eq. (3.3.1) by introducing a particular linear form for the transform $T_a(\cdot)$ which has been shown to be a fair fit to data from actual learning experiments.¹ This special transform is:
(3.3.6)
and
(3.3.7)
where $\alpha_1$ and $\alpha_2$ are parameters fitted to the actual data, and $0 < \alpha_1 < 1$, $0 < \alpha_2 < 1$. It will be noted that Eq. (3.3.6) has a very important property. When $q_{1,t+1} = q_{1t}$, then $q_{1t}$, and hence $q_{1,t+1}$, must be identical to unity. This can occur, for values of $\alpha_1$ and $\alpha_2$ subject to the above restrictions, only after an infinite number of stages. This can be seen by assuming an initial Q probability, $q_{1,0}$, for the first stage, and solving the recursive Eq. (3.3.6) for the probability at any stage, $t$, in terms of the initial Q probability. This solution is given by
$$q_{1t} = \alpha_1^t q_{1,0} + (1 - \alpha_1^t) \tag{3.3.8}$$
and
$$q_{1t} = \alpha_2^t q_{1,0} + (1 - \alpha_2^t). \tag{3.3.9}$$
If $0 < \alpha_1 < 1$ and $0 < \alpha_2 < 1$, then, since
$$\lim_{t \to \infty} \alpha_1^t = 0 \quad \text{and} \quad \lim_{t \to \infty} \alpha_2^t = 0,$$
we have $\lim_{t \to \infty} q_{1t} = 1$. If $\alpha_1 = \alpha_2 = 1$, then $q_{1t} = q_{1,0}$ for all $t$. With these conditions and properties, it is seen that Eq. (3.3.6) does not violate the fundamental probability axioms and therefore is a valid transform for our purpose.

We now must define a logical structure which, when given the state of the environment (for example, the side of the "T" which produces a
payoff) and the decision (for example, to go to the right), indicates which payoff will be received. Let the state of the environment, $s_i$, be an element of the set of possible states, $S$, and let the action, $a_j$, be an element of the set of possible actions, $A$. In our particular example, the set of states consists of two elements, $s_1$ and $s_2$, and the set of decisions consists of two elements, $a_1$ and $a_2$. Let us define a set of payoffs, $V$, where the element $v_{ij}$ is associated with the $i$th state and the $j$th decision. We can write
$$V(s_i, a_j) = v_{ij}, \tag{3.3.10}$$
where $v_{ij} \in V$ and $s_i \in S$. For our particular experiment, Eq. (3.3.10) becomes
$$\begin{aligned} V(s_1, a_1) &= v_{11}, & V(s_2, a_1) &= v_{21}, \\ V(s_1, a_2) &= v_{12}, & V(s_2, a_2) &= v_{22}, \end{aligned} \tag{3.3.11}$$
where
$v_{11}$ is the payoff if the food is on the right-hand side and the decision maker goes to the right-hand side;
$v_{21}$ is the payoff if the food is on the left-hand side and the decision maker goes to the right-hand side;
$v_{12}$ is the payoff if the food is on the right-hand side and the decision maker goes to the left-hand side;
$v_{22}$ is the payoff if the food is on the left-hand side and the decision maker goes to the left-hand side.

Note that during the "T" maze experiment the state $s_2$ can never occur, since the payoff or food is on the right-hand side with certainty. Therefore, events $(s_2, a_1)$ and $(s_2, a_2)$ are associated with a probability of zero, events $(s_1, a_1)$ and $(s_1, a_2)$ reduce to the actions $a_1$ and $a_2$, respectively, and Eq. (3.3.11) reduces to
$$\begin{aligned} V(s_1, a_1) &= v_{11} \quad \text{(food)}, \\ V(s_1, a_2) &= v_{12} \quad \text{(no food)}. \end{aligned} \tag{3.3.12}$$
If the probability of food on the right-hand side, event $s_1$, is unity, then the probability of event $s_2$ is zero. With this in mind, we can find the probabilities of the payoffs from
$$\begin{aligned} \Pr(v_{11}) &= \Pr(a_1)\Pr(s_1) = \Pr(a_1), \\ \Pr(v_{12}) &= \Pr(a_2)\Pr(s_1) = \Pr(a_2). \end{aligned} \tag{3.3.13}$$
Now the probability of $a_1$ at any stage, say the $t$th, is $q_{1t}$; therefore Eq. (3.3.13) becomes
$$\begin{aligned} \Pr(v_{11}) &= q_{1t}, \\ \Pr(v_{12}) &= 1 - q_{1t} = q_{2t}. \end{aligned} \tag{3.3.14}$$
Using Eq. (3.3.14), we can obtain the expectation of the payoff at the $t$th stage from
$$E_t(V) = v_{11} q_{1t} + v_{12} q_{2t}. \tag{3.3.15}$$
In this experiment the expected payoff cannot be maximized, because there are no variables which the decision maker can consciously control except to make $q_{1t}$ as large as possible, and this is exactly what the learning process is accomplishing. The chief reason for introducing the expected payoff here is to prepare for the following section on the adaptive process of the second kind.
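The behavior of the Q probability under a linear transform, and the resulting expected payoff, can be illustrated with a short numerical sketch. The Python fragment below is not from the original text. It assumes that the linear transform has the form $q \leftarrow \alpha q + (1 - \alpha)$, which is one form consistent with the properties stated in this section (a fixed point at unity, approach to unity as $\alpha^t \to 0$, and no change when $\alpha = 1$); it uses a single hypothetical parameter `alpha` in place of $\alpha_1$ and $\alpha_2$, and the payoff values for food and no food are likewise assumed.

```python
def learn_step(q1, alpha):
    """One stage of the assumed linear operator; its fixed point is q1 = 1."""
    return alpha * q1 + (1.0 - alpha)

def q_closed_form(q1_0, alpha, t):
    """Closed-form solution of the assumed recursion after t stages."""
    return alpha ** t * q1_0 + (1.0 - alpha ** t)

def expected_payoff(q1, v11=1.0, v12=0.0):
    """Expected stage payoff: v11 (food) with probability q1, v12 (no food) otherwise."""
    return v11 * q1 + v12 * (1.0 - q1)

q1 = 0.5        # Laplacian starting probability of going to the right
alpha = 0.8     # hypothetical learning parameter, 0 < alpha < 1
history = [q1]
for _ in range(10):
    q1 = learn_step(q1, alpha)
    history.append(q1)

# Stage, Q probability, and expected payoff at a few selected stages.
for t in (0, 1, 5, 10):
    print(t, round(history[t], 4), round(expected_payoff(history[t]), 4))
```

Under these assumptions the Q probability, and with it the expected payoff, rises toward its upper limit without any conscious maximization by the decision maker.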
3.4 An Adaptive Process of the Second Kind

Let us return to the urn experiment. The decision maker has some subjective probability of drawing a white ball from urn 1. This subjective probability is not likely to be equal to the actual proportion of white balls in urn 1. In all likelihood, this subjective probability will be a residual of past experience, the history of the decision maker, or an insight into the motives of nature (the number of white balls in urn 1 is considered to be a parameter of the environmental process). The value of this subjective probability is to be determined by "spying" on nature, or statistically estimating the parameters of the environmental process. Given, then, that there is some stationary true value of the probability of drawing a white ball from the urn, we may define this probability by $p_{ij}$ in general, where this is the probability of drawing a ball of the $i$th kind from the $j$th urn; or specifically,
$p_{11}$, the probability of drawing a white ball from urn 1,
$p_{21}$, the probability of drawing a black ball from urn 1,
$p_{12}$, the probability of drawing a white ball from urn 2, and
$p_{22}$, the probability of drawing a black ball from urn 2.
Remember that in this urn experiment the probability of a white ball from the second urn is equal to the probability of a black ball from the first urn, and vice versa. Thus, $p_{11} = p_{22}$ and $p_{12} = p_{21}$, and it is easier to write that
$p_1$ is the probability of drawing a white ball from urn 1, or a black ball from urn 2, and
$p_2$ is the probability of drawing a black ball from urn 1, or a white ball from urn 2,
where $p_1 = p_{11} = p_{22}$ and $p_2 = p_{21} = p_{12}$.

We define the subjective probability of drawing a white ball from urn 1, or a black ball from urn 2, at the $t$th stage to be $p_{1t}$, and the subjective probability of drawing a black ball from urn 1, or a white ball from urn 2, at the $t$th stage to be $p_{2t}$, where $p_{1t} + p_{2t} = 1$. We suppose there exists some transform, $T_e(\cdot)$, which transforms the a priori probability, $p_{1t}$, into an a posteriori probability, $p_{1,t+1}$, subject to the limitations of the fundamental probability axioms. This transform is denoted generally by
$$p_{i,t+1} = T_e(p_{it}) \quad \text{subject to} \quad \sum_{i \in M} p_{it} = 1 \quad \text{and} \quad p_{it} \ge 0 \quad \text{for each } i \in M, \tag{3.4.1}$$
where $T_e(\cdot)$ is a transformation function for the subjective probability, and $p_{it}$ is the subjective probability of drawing a ball of the $i$th kind on the $t$th stage of the process. Limiting the payoff to two values, a reward and a penalty, we may simplify Eq. (3.4.1) to
$$p_{1,t+1} = \begin{cases} T_e^{+}(p_{1t}) \\ T_e^{-}(p_{1t}) \end{cases}, \qquad p_{2,t+1} = 1 - p_{1,t+1}, \tag{3.4.2}$$
where $T_e^{+}$ is the transform if a reward was received during the $t$th stage, and $T_e^{-}$ is the transform if a penalty was received during the $t$th stage. From the many possible forms for $T_e(\cdot)$, we choose the same linear transform used in the "T" maze experiment.
Thus we write
(3.4.3)
where $\beta_1$ and $\beta_2$ are parameters subject to the restriction that $0 \le \beta_1 \le 1$ and $0 \le \beta_2 \le 1$. Equation (3.4.3) possesses the same properties as Eq. (3.3.6); i.e., with the above restrictions on the $\beta$'s, Eq. (3.4.3) does not violate the fundamental probability axioms. Again we have the relations describing the logical structure, which are
$$\begin{aligned} V(s_1, a_1) &= v_{11}, & V(s_1, a_2) &= v_{12}, \\ V(s_2, a_1) &= v_{21}, & V(s_2, a_2) &= v_{22}, \end{aligned} \tag{3.4.4}$$
where
$s_1$ is state 1 (that a white ball is drawn from urn 1, or a black one from urn 2),
$s_2$ is state 2 (that a black ball is drawn from urn 1, or a white one from urn 2),
$a_1$ is action 1 (to draw from the first urn),
$a_2$ is action 2 (to draw from the second urn),
$v_{11}$ is the payoff if a white ball is drawn from urn 1,
$v_{21}$ is the payoff if a black ball is drawn from urn 1,
$v_{12}$ is the payoff if a black ball is drawn from urn 2, and
$v_{22}$ is the payoff if a white ball is drawn from urn 2.
We can simplify Eq. (3.4.4) by assuming that
$$v_{11} = v_{22} \quad \text{(reward)}$$
and
$$v_{12} = v_{21} \quad \text{(penalty)}.$$
We make the hypothesis that the decision to draw from urn 1 or 2 is determined by the maximization of the subjective expected payoff and the use of a strategy based on this expectation by the decision maker. Once the maximum subjective expectation is known, the strategy leads to a deterministic decision. For example, the probability of $a_1$ is 1 or 0, depending on the decision maker's strategy and the maximum subjective expected payoff. For each possible action, depending on the subjective probability, $p_{it}$, for that stage, there is a subjective expected payoff given by
$$E_{1t}(V) = v_{11} p_{1t} + v_{21} p_{2t} \quad \text{and} \quad E_{2t}(V) = v_{12} p_{1t} + v_{22} p_{2t}. \tag{3.4.5}$$
A logical strategy would be to choose with certainty the decision leading to the maximum subjective expected payoff. Thus, we would have
$$\max_{a \in A} \begin{cases} a_1: & E_{1t}(V) = v_{11} p_{1t} + v_{21} p_{2t} \\ a_2: & E_{2t}(V) = v_{12} p_{1t} + v_{22} p_{2t} \end{cases} \tag{3.4.6}$$
Throughout this section we have seeded the discussion with the terminology of game theory. This has been intentional and now we are in a position to introduce the game structure to develop a mixed adaptive process.
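The deterministic decision rule of this section, in which the decision maker draws from whichever urn has the larger subjective expected payoff, can be sketched as follows. This Python fragment is illustrative only: the numerical reward and penalty (+1 and -1), the function names, and the convention of choosing urn 1 when the two expectations are equal are assumptions that do not come from the text.

```python
def subjective_expected_payoffs(p1, v11, v21, v12, v22):
    """Subjective expected payoff of each action, given the subjective probability p1 of state 1."""
    p2 = 1.0 - p1
    e1 = v11 * p1 + v21 * p2   # action a1: draw from urn 1
    e2 = v12 * p1 + v22 * p2   # action a2: draw from urn 2
    return e1, e2

def choose_action(p1, reward=1.0, penalty=-1.0):
    """Choose with certainty the action with the larger subjective expected payoff."""
    # White balls (v11 from urn 1, v22 from urn 2) pay the reward;
    # black balls (v21 from urn 1, v12 from urn 2) pay the penalty.
    e1, e2 = subjective_expected_payoffs(
        p1, v11=reward, v21=penalty, v12=penalty, v22=reward)
    # Assumed tie-breaking convention: stay with urn 1 when e1 == e2.
    return ("a1", e1) if e1 >= e2 else ("a2", e2)

for p1 in (0.7, 0.5, 0.2):
    print(p1, choose_action(p1))
```

Given the subjective probability, the choice is made with probability 1 or 0; the only uncertainty left in the process is the color of the ball actually drawn.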
3.5 Mixed Adaptive Processes of the First and Second Kind

The adaptive process of the first kind is most suitable for the lower animals, while the adaptive process of the second kind applies to higher animals. Most subjects, human and nonhuman, fall in between these two extremes, and therefore some merger between the two processes is indicated. The relations involved are best shown in the context of a theoretical game between the decision maker and his opponent, his environment.² Let us consider a payoff matrix consisting of the rewards for possible combinations of actions, and subjective P probabilities of the states of the environment. In the two-urn experiment above, we have a payoff matrix:
(3.5.1)
where $p_{1t}$ and $p_{2t}$ are, from the point of view of game theory, the estimates of the environmental mixed strategy. The decision maker can play $a_1$ or $a_2$ to maximize the expected payoff given his estimate of the environment's mixed strategy. This approach results in a Bayesian pure strategy, but other strategies are possible. We shall examine them by use of a diagram (Fig. 3.1).
[FIG. 3.1. Expected payoff space for a primitive adaptive process. Vertical axis: $E(s_1, A)$, expected payoff for State 1 (white ball drawn from urn 1); horizontal axis: $E(s_2, A)$, expected payoff for State 2 (black ball drawn from urn 1). The two plotted points are the pure strategies "choose urn 1" and "choose urn 2."]
The vertical axis of the figure is the expected payoff, given that a white (black) ball has been drawn from urn 1 (2), while the horizontal axis is
the expected payoff, given that a black (white) ball has been drawn from urn 1 (2). Each of the decision maker's strategies appears as a point in this space. The two pure strategies, the choice of urn 1 with certainty or the choice of urn 2 with certainty, can be plotted, given the payoff matrix. For pure strategy 1 (choose urn 1 with certainty, $a_1$), the coordinates would be
$$E(s_1, a_1) = v_{11} \quad \text{(expected payoff, given that a white ball is drawn from urn 1)} \tag{3.5.2}$$
and
$$E(s_2, a_1) = v_{21} \quad \text{(expected payoff, given that a black ball is drawn from urn 1).} \tag{3.5.3}$$
For pure strategy 2 (choose urn 2 with certainty, $a_2$), the coordinates would be
$$E(s_1, a_2) = v_{12} \quad \text{(expected payoff from a black ball from urn 2)} \tag{3.5.4}$$
and
$$E(s_2, a_2) = v_{22} \quad \text{(expected payoff from a white ball from urn 2).} \tag{3.5.5}$$
A convex set is formed by the points comprising the straight line between these two pure strategy points. This line is the set of all admissible strategies, $A$, available to the decision maker. A special line can be defined on the strategy space, and is given by the equation
(3.5.6)
where $C$ is a constant, $E(s_1, A)$ is the expectation of payoff given any admissible strategy, $A$, and given that the state of the environment is $s_1$, and $E(s_2, A)$ is the expectation of payoff given any admissible strategy, $A$, and given that the state of the environment is $s_2$. Let
(3.5.7)
be the line that supports the convex set of admissible strategies from below. The strategy represented by the support point is the Bayesian strategy which maximizes the expected payoff, given the decision maker's estimate of the environment's strategy. It should be noted that, except for a critical value of $p_{1t}$ and $p_{2t}$ where the supporting line coincides with the convex set of admissible strategies, the Bayesian strategy will be a pure strategy. If during the process the decision maker's estimate of the environment's strategy changes in such a way as to pass through this critical value, he will switch from one pure strategy to the other. This, of course, is analogous to the decision maker switching at some stage from drawing from one urn to drawing from the other for the rest of the stages in the process. Because of the stochastic nature of the process, it would not be unusual for the switching to occur back and forth for several stages until one particular strategy (urn) emerges clearly as the superior. This uncertainty between the choice of the
two strategies becomes the crux of the matter when we introduce the first kind of adaptive process in the context of game theory. To place the two kinds of adaptive processes on the same footing, and to facilitate the synthesis of the combined process, we must recast the "T" maze experiment in terms of the urn experiment. It will be remembered that Q probabilities were defined as the probabilities for a right- or left-hand choice in the maze. We will now redefine them as the probabilities of a choice of urn 1 or urn 2. A transition equation was defined (Eq. (3.3.4)) which depended on the payoff obtained in the previous stage. As before, we may plot the decision maker's pure strategies described in terms of expected payoff, given the state of the environment. Again there are two pure strategies (in the maze experiment, the two pure strategies were to turn right with certainty and to turn left with certainty). Remember that the probability of the food being on the right side was unity, and naturally the probability of the absence of food on the left side was also unity. For the "T" maze experiment in terms of the urn experiment, we have $q_{1t}$ as the probability of selecting urn 1 during the $t$th stage and $q_{2t}$ as the probability of selecting urn 2 during the $t$th stage, where $q_{1t} + q_{2t} = 1$, and $q_{1t}, q_{2t} \ge 0$. In the "T" maze experiment, the usual assumption on the first trial is that the probability of a left- or right-hand turn is the same, i.e., the Laplacian assumption. Therefore,
(3.5.8)
Upon combining this information with the previous statements, we find that any pair of values $q_{1t}$ and $q_{2t}$ defines a mixed admissible strategy for the decision maker. For example, suppose we examine on Fig. 3.1 the first stage of a "T" maze process, where the decision maker has adopted the Laplacian initial conditions. Since all admissible strategies lie along a line between the points representing the two pure strategies, the initial mixed strategy must lie on this line. Returning to Eq. (3.3.15), we see that the expectation of payoff for the first stage is
(3.5.9)
From this we see that the decision maker's initial mixed strategy is a
point halfway between his two pure strategies. Since, in general, all points along this line are represented by the linear equation
(3.5.10)
where $y_1 + y_2 = 1$ and $y_1, y_2 \ge 0$, we may substitute for the $y$'s,
(3.5.11)
and
(3.5.12)
Any value for the $q$'s subject to the probability axioms will lie along the line between the two pure strategies. Therefore, the locus of all possible states of the first kind of adaptive process is the convex set of admissible strategies. It is now possible to combine both kinds of adaptive processes into one general mixed adaptive process. Suppose we alter the transformation equations (3.3.6) so that they depend not on the payoff received during the last stage but on the expectation of the payoff. For the adaptive process of the first kind, nothing would change, since the expected payoff, given that the reward side is chosen, is the payoff itself; in this process the probability of the payoff, given that the reward side was chosen, is unity. However, in the mixed process, the expectation of a reward, having chosen urn 1, is not necessarily equal to the reward (this is because the conditional probability of getting a white ball, given that the decision maker has chosen urn 1, is not necessarily unity). Therefore, in the mixed process, the a posteriori Q probability transition equation will depend on the maximum expected payoff as well as the a priori Q probability. We can write a new transition equation: for each $i \in M$,
(3.5.13)
where $\sum_{i \in M} q_{i,t+1} = 1$, $q_{i,t+1} \ge 0$ for each $i \in M$, and $E_t^*(V)$ is the expected payoff of the chosen alternative out of all $M$ alternatives. Since the urn experiment has only two alternatives, we can simplify Eq. (3.5.13) to
(3.5.14)
and
(3.5.15)
where
$T_1^m(q_{1t})$ applies if $E_t^*(V) = E_{1t}(V)$,
$T_2^m(q_{1t})$ applies if $E_t^*(V) = E_{2t}(V)$.
The expectation of payoff has changed greatly since we have permitted mixed strategies. The expected payoff is
$$E_t(V) = v_{11}\Pr(v_{11}) + v_{21}\Pr(v_{21}) + v_{12}\Pr(v_{12}) + v_{22}\Pr(v_{22}). \tag{3.5.16}$$
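Remembering Eq. (3.3.11), we can write
$$\Pr(v_{11}) = q_{1t}p_{1t}, \quad \Pr(v_{21}) = q_{1t}p_{2t}, \quad \Pr(v_{12}) = q_{2t}p_{1t}, \quad \Pr(v_{22}) = q_{2t}p_{2t};$$
thus, the expected payoff becomes
$$E_t(V) = v_{11}q_{1t}p_{1t} + v_{21}q_{1t}p_{2t} + v_{12}q_{2t}p_{1t} + v_{22}q_{2t}p_{2t}. \tag{3.5.17}$$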
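The expected payoff under a mixed strategy, as just derived, is a weighting of the four payoffs by the products of the Q and P probabilities. The following Python sketch evaluates it; the function name, the dictionary layout `v[(state, action)]`, and the numerical payoffs and probabilities are hypothetical choices made for illustration only.

```python
def mixed_expected_payoff(v, q1, p1):
    """Expected payoff when urn 1 is chosen with probability q1 and the
    subjective probability of state 1 is p1 (q2 = 1 - q1, p2 = 1 - p1)."""
    q2 = 1.0 - q1
    p2 = 1.0 - p1
    return (v[(1, 1)] * q1 * p1 + v[(2, 1)] * q1 * p2
            + v[(1, 2)] * q2 * p1 + v[(2, 2)] * q2 * p2)


# Hypothetical payoffs: +1 for a white ball, -1 for a black ball.
payoffs = {(1, 1): 1.0, (2, 1): -1.0, (1, 2): -1.0, (2, 2): 1.0}

p1 = 0.7  # assumed subjective probability of state 1
for q1 in (0.0, 0.5, 1.0):
    print(q1, mixed_expected_payoff(payoffs, q1, p1))
```

Setting `q1` to 1 or 0 recovers the two pure-strategy expectations, and intermediate values of `q1` interpolate linearly between them, in keeping with the straight line of admissible strategies described earlier.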
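Equations (3.5.14) and (3.5.15), after substituting the linear form of the $T^m$ function, become
[Eqs. (3.5.18) and (3.5.19): the linear transition equations in the Q probabilities, with parameters $\alpha_1$ and $\alpha_2$; the displayed forms are not recoverable from this copy.]
where
$\alpha_1$ applies if $E_t^*(V) = E_{1t}(V)$,
$\alpha_2$ applies if $E_t^*(V) = E_{2t}(V)$.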
3.6 Summary
Equations (3.4.3), (3.5.18), (3.3.11), and (3.5.17) essentially comprise the relations describing the mixed adaptive process for the urn experiment. We summarize them as follows:
(1) a transition function to change the a priori subjective probability of the state of the environment at the $t$th stage to an a posteriori subjective probability of the state of the environment:
(3.6.1)
and
(3.6.2)
where $\beta_1$ applies if in the $t$th stage either $v_{11}$ or $v_{12}$ was actually received, and $\beta_2$ applies if in the $t$th stage either $v_{22}$ or $v_{21}$ was actually received.
(2) a transition function to change an a priori probability of taking a particular decision at the $t$th stage to an a posteriori probability of taking a particular decision in the $(t+1)$th stage:
(3.6.3)
and
(3.6.4)
where $\alpha_1$ applies if in the $t$th stage either $v_{11}$ or $v_{12}$ was actually received, and $\alpha_2$ applies if in the $t$th stage either $v_{22}$ or $v_{21}$ was actually received.
(3) a logical structure equation defining the payoff function
and
(3.6.5)
(4) an expected payoff equation for each stage
(3.6.6)
It is important to note that the mixed process becomes either the first kind or the second kind of adaptive process if appropriate values are assigned to the P and Q probabilities. To convert the mixed process into an adaptive process of the first kind, let
$p_{1t} = 1$,
and
Then Eq. (3.6.1) is changed to
$p_{1,t+1} = 1$,
regardless of payoff. Therefore this is true for all stages.
Equation (3.6.3) remains the same. Equation (3.6.5) becomes Eq. (3.3.12) since $(s_2, a_1)$ and $(s_2, a_2)$ are events with probability zero. Equation (3.6.6) becomes Eq. (3.3.15); or
(3.6.7)
To convert the mixed process to an adaptive process of the second kind, we substitute
and
or vice versa.
Equation (3.6.5) remains the same. Equation (3.6.6) becomes
for (3.6.8)
or
for (3.6.9)
It should be noted that no explicit maximization of the expected payoff is necessary in the limiting case of a mixed process converted into an adaptive process of the second kind. The maximum of expected payoff is implied by the conditions on the Q probability accompanying the two values of the expectation. If $q_{1t} = 1$, then $E_{1t}(V) = E_t^*(V)$ because, if it had not, $q_{1t}$ would never have become unity.
[FIG. 3.2. Strategy shift during adaption. Vertical axis: $E(s_1, A)$, expected payoff for State 1 (white ball drawn from urn 1); horizontal axis: $E(s_2, A)$, expected payoff for State 2 (black ball drawn from urn 1). Marked: the initial subjective probability, the critical subjective probability line, and the limiting subjective probability line.]
With the help of Fig. 3.2, it is illuminating to examine the mixed process as it goes through its possible states. Suppose the decision maker's initial subjective reasoning was very poor and he was convinced that urn 2 contained all white balls. With these conditions, $p_{10} = 0$, $p_{20} = 1$, $q_{10} = 0$, and $q_{20} = 1$ are possible initial values, while suppose actually the probability for a white ball from urn 2 is [fraction illegible], and for a black ball is [fraction illegible]. As the decision maker draws balls from urn 2 he will be disappointed, more often than not, at each stage. $p_{1t}$ will grow, and so will $q_{1t}$. When $q_{1t} \neq 0$ for a certain number of stages, the decision maker will erratically choose urn 1, even though he has some idea remaining that $p_{1t}$ is less than $p_{2t}$, but with less conviction. As the data arrive during the stages of the process, and both $p_{1t}$ and $q_{1t}$ grow, the decision maker will become highly erratic and switch to urn 1 more and more frequently. Soon he will draw from urn 1 with very few drawings from urn 2, being convinced finally that urn 1 is more rewarding. Finally, as $q_{1t}$ approaches 1, and $p_{1t}$ approaches [fraction illegible], the decision maker will never bother
to draw from urn 2 again and will have complete conviction that urn 1 will produce a white ball [fraction illegible] of the time. We see then that the mixed process "blurs" the whole decision process so that no clear-cut decision is made as in the adaptive process of the second kind. However, in the limit, the decision maker acts as though he has made a clear-cut decision for all practical purposes. An indecisive person would be one whose Q probability changed slowly so that a great number of erratic decisions would be observed during a mixed adaptive process. A dogmatic person would be one whose Q probability changed rapidly so that few erratic decisions would be observed during a mixed adaptive process. It is important to notice that whether dogmatism or indecisiveness is appropriate during a mixed process can only
be ascertained by the payoff performance of two individuals with different $\alpha$ and $\beta$ parameters. A rational mixed adaptive process is one which possesses a unique transition function for the Q and P probabilities, which is in some sense better than any other possible transition function. For the second kind of process, such a unique best transition function will be sought, and this is the subject of the next chapter.
References
1. Bush, R. R. and Mosteller, F., "Stochastic Models for Learning." Wiley, New York, 1955.
2. Blackwell, D. and Girshick, M. A., "Theory of Games and Statistical Decisions," Chapters 3, 5, and 6. Wiley, New York, 1954.
CHAPTER 4
SUBJECTIVE PROBABILITY
The daily affairs of men are carried on within a framework of steady habits and confident beliefs, on the one hand, and of unpredictable strokes of fortune and precarious judgments, on the other.
Ernest Nagel, International Encyclopedia of Unified Science
4.1 Introduction
An essential part of a stochastic adaptive process is the historical information vector. The effect of the confluence of the decision maker's past actions and his environment must be transmitted to him so that the current action can be determined. History is essentially a listing of events, systematically labeled and kept in sequential order. In simple processes this presents no particular difficulty. The real problem is what to do with history. How can history be used to improve current decisions? By restricting our attention to some very simple adaptive processes this latter problem has a solution. We will call a technique which, given certain initial information, determines a course of action for a decision maker, a strategy. We want to find a strategy that uses history as part of the decision-making process. The Bayes strategy not only uses historical information but uses the decision maker's a priori beliefs as well. In 1763, an English philosopher, Thomas Bayes, developed a method to convert a priori current beliefs concerning a set of events into current probabilities for these events, given information related to these events. The Bayes formula has been misused over the years because of a misunderstanding in its application. The probabilities generated by the Bayes formula are personal probabilities because they stem from personal beliefs and personal historical experience. Perhaps Laplace caused his followers to misuse Bayes's formula when he computed the probability of the rise of the sun based on historical evidence over the past 5000 years, and a particular a priori belief now known
as the Laplace assumption. Laplace's probability for the rise of the sun was in fact Laplace's personal probability. Others can, and do, have other beliefs that lead to other probabilities. If we assume that the rise of the sun is the result of some mechanistic system of astronomical events, a statistical probability could be defined in the same way we define the probability of a well-balanced coin falling heads. This statistical probability may not be equal to the personal probabilities calculated by Laplace using the Bayes formula. Personal probabilities really have no place in statistical estimation unless we clearly agree on our a priori beliefs. Personal probabilities or, as we shall call them, subjective probabilities are useful in the analysis of adaptive processes, since here we are concerned with the actions of decision makers with different a priori beliefs. The Bayes formula provides a technique to evaluate the effect of historical information in an adaptive process. Other methods are available to do this too. One of interest is the stochastic approximation technique. When the Bayes strategy is employed, the type of probability distribution functions used must be specified; however, the stochastic approximation technique is distribution-free, i.e., no distribution need be specified. The stochastic approximation method is used in various forms in the design of adaptive automatic control devices, but this method does not lead to mathematical models of adaptive processes that are amenable to analysis. Therefore, most of the processes studied in this book will assume Bayes's strategies. The Bayes strategy is not only realistic, but it also leads to mathematical models that are relatively easy to analyze.
4.2 A Heuristic Example-The Three Investors
To illustrate the importance of subjective probability to the adaptive decision process, we shall consider the affairs of three investors who are offered the chance to purchase stock in the Slantwell Oil Corporation. Mr. Wise, who is also an accountant at Slantwell Oil, is the first of our three investors. He is known as an "insider" in the investment business. He knows that Slantwell is drilling in a location where, according to the corporation's geologists, a well will produce oil with 75 per cent reliability. Armed with this knowledge, Mr. Wise has just invested all his savings in Slantwell stock, which he feels will rise in price when the news of an oil strike reaches the public.
Mr. Holmes, the second of our three investors, is known in the investment business as a "technician." A technician is an investor who believes that the past and current prices of a stock indicate with some degree of reliability the future course of the price. Mr. Holmes has followed the price of Slantwell Oil stock for the past month and is convinced that a period of accumulation is occurring. During such a period, the price of a stock fluctuates over a narrow range as if it were uncertain whether to rise or fall. Mr. Holmes says that this observation indicates that insiders are buying or selling Slantwell stock. When the available supply of stock runs low, Mr. Holmes feels that the price will exhibit an upside or a downside breakout from the narrow accumulation range. The trouble is that Mr. Holmes does not know which way the breakout will occur. In such an accumulation period, the chances are about 50 per cent for either way. Mr. Holmes's experience indicates that a breakout of one point above (below) the accumulation range will indicate about 75 per cent probability for an upward (downward) price movement, respectively. If the breakout occurs and is upward, he will buy Slantwell stock; on the other hand, if the breakout occurs and is downward, he will sell Slantwell short.
Mr. Stone is the last of our investors. Recently, he read a number of statistical studies of stock market price changes that tended to conclude that the stock price mechanism is a random walk process where a rise is just as likely as a fall. This conclusion means that any cycles or trends in the price of stock are only the result of "noise" and not the result of a systematic underlying mechanism. This conclusion makes a lot of sense to Mr. Stone since over the years he never made much money in the stock market. Armed with this evidence, he decides to leave his money in his bank.
The current situation of the three investors is this: Mr. Wise has already invested his savings in Slantwell, convinced that the price will rise 75 per cent of the time. Mr. Holmes has not invested yet since at this time he feels that the chance of a rise is only 50 per cent until the breakout. When the breakout occurs he will invest or sell short, depending on the direction of the price. Mr. Stone has not invested either since he is also convinced that the chance of a rise in price is only 50 per cent. He is not looking for a breakout since he feels that the price of Slantwell is a random walk and devoid of any systematic trends or cycles. Suppose that we could give some kind of test to our three investors to measure their personal beliefs concerning the rise of Slantwell's price; i.e., their subjective probability for a rise in the price. We could
ask questions of this type: "What is the likelihood that 25 per cent of the time the price of Slantwell will rise?" Mr. Wise would say that this likelihood was pretty low, and we would have to devise a von Neumann utility test to place a value on the "pretty low" level. Doing this for each desired value of the percentage chance of a rise, which we shall denote as $p$, we can plot the results in the form of a probability function for the $k$th investor. In other words, we will normalize the "likelihoods" from the von Neumann tests so that the area under the plotted curve is unity.
[FIG. 4.1. The beta distribution (binomial case) for the three investors. Horizontal axis: probability, $p_1$ or $(1 - p_2)$, from 0 to 1.0.]
Figure 4.1 shows subjective probability density functions for
each of our three investors, which we shall assume to have been measured by the von Neumann tests. The curve for Mr. Wise (Fig. 4.1) indicates by the sharp peak at $p = \tfrac{3}{4}$ that Mr. Wise is quite sure that, on the average, in three out of every four days Slantwell will increase in price. He admits that there is a slight likelihood that he could be wrong, but not by much. The curve for Mr. Holmes, which is broad but maximum at $p = \tfrac{1}{2}$, shows that he is not certain at all that his guess of 50 per cent chance of a rise is accurate since future events (the breakout) may change the whole picture.
Mr. Stone is highly convinced that the chance of a rise is 50 per cent, as shown in this curve by the sharpness of the peak at $p = \tfrac{1}{2}$. It is necessary to find a mathematical function that will fit the results from our tests with the three investors. We will denote this function as the subjective a priori probability function. There are many functions that we could employ, but the beta probability function has many properties that are useful and instructive. For the case where we have two alternative events, such as a price rise or a price fall, the beta probability function for the $k$th investor has the form
$$g_k(p) = \frac{p^{\alpha_{1k}-1}(1-p)^{\alpha_{2k}-1}}{\int_0^1 p^{\alpha_{1k}-1}(1-p)^{\alpha_{2k}-1}\,dp}, \qquad \text{where } 0 \le p \le 1 \text{ and } \alpha_{1k}, \alpha_{2k} > 0,\ k = 1, 2, 3;$$
$$g_k(p) = 0 \quad \text{elsewhere.} \tag{4.2.1}$$
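As a rough numerical companion to Eq. (4.2.1), the sketch below evaluates a beta density and checks that the area under it is close to one. The function names and the midpoint-rule check are illustrative assumptions, not part of the text; the normalizing integral is computed through the standard identity relating the beta function to gamma functions, and the three parameter pairs are the conviction vectors reported in Table 4.1.

```python
import math


def beta_density(p, a1, a2):
    """Beta density g(p) with parameters a1, a2 (assumed >= 1 here, so the
    endpoints p = 0 and p = 1 cause no division by zero)."""
    if p < 0.0 or p > 1.0:
        return 0.0
    # Normalizing integral via B(a1, a2) = Gamma(a1) Gamma(a2) / Gamma(a1 + a2).
    log_b = math.lgamma(a1) + math.lgamma(a2) - math.lgamma(a1 + a2)
    return p ** (a1 - 1) * (1.0 - p) ** (a2 - 1) / math.exp(log_b)


def area_under_density(a1, a2, steps=10000):
    """Midpoint-rule check that the density integrates to about one."""
    h = 1.0 / steps
    return sum(beta_density((i + 0.5) * h, a1, a2) for i in range(steps)) * h


convictions = {"Mr. Wise": (30, 10), "Mr. Holmes": (2, 2), "Mr. Stone": (20, 20)}
for name, (a1, a2) in convictions.items():
    print(name, round(area_under_density(a1, a2), 6), round(beta_density(0.5, a1, a2), 3))
```

Comparing the densities at $p = 0.5$ for Mr. Holmes and Mr. Stone gives a feel for how two investors with the same central belief can differ sharply in conviction.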
Note that the beta probability function is characterized by two parameters, $\alpha_{1k}$ and $\alpha_{2k}$. By the use of this functional form we can characterize each of our three investors by assigning the values of these parameters which cause the beta probability function to fit the test data. We shall call these parameters for the $k$th investor the conviction vector for the $k$th investor. The test data in Fig. 4.1 lead to the values of the conviction vector given in Table 4.1, Columns 2 and 3.

TABLE 4.1

Decision maker         Conviction vector     Subjective probability
Name           k       α_1k      α_2k        a priori    after 1 rise
Mr. Wise       1       30        10          0.750       0.756
Mr. Holmes     2       2         2           0.500       0.600
Mr. Stone      3       20        20          0.500       0.512
As a check on our results, we might ask for the expected value of the subjective a priori probability function. This value should coincide with the probability beliefs expressed by our investors. The expected value of $p$ for the $k$th investor is given by
$$E_k[p] = \int_0^1 p\,g_k(p)\,dp. \tag{4.2.2}$$
Using Eq. (4.2.1), we find
$$E_k[p] = \frac{B(\alpha_{1k}+1,\ \alpha_{2k})}{B(\alpha_{1k},\ \alpha_{2k})} \tag{4.2.3}$$
where we denote
$$B(\gamma_1, \gamma_2) = \int_0^1 p^{\gamma_1-1}(1-p)^{\gamma_2-1}\,dp \tag{4.2.4}$$
as the beta function for any two parameters, $\gamma_1$ and $\gamma_2$, subject to the restrictions on Eq. (4.2.1). Now Eq. (4.2.3) can be reduced to a product of gamma functions. We have
$$E_k[p] = \frac{\Gamma(\alpha_{1k}+1)\,\Gamma(\alpha_{2k})\,\Gamma(\alpha_{1k}+\alpha_{2k})}{\Gamma(\alpha_{1k}+\alpha_{2k}+1)\,\Gamma(\alpha_{1k})\,\Gamma(\alpha_{2k})}.$$
Since
$$\Gamma(\gamma+1) = \gamma\,\Gamma(\gamma),$$
we can reduce the expectation of $p$ to
$$E_k[p] = \frac{\alpha_{1k}}{\alpha_{1k}+\alpha_{2k}}. \tag{4.2.5}$$
Substituting the conviction coefficients for our three investors will give us the expected subjective a priori probabilities. Table 4.1 shows these values in Column 4. We see that the conviction coefficients do in fact lead to the subjective a priori probabilities that were expressed by the investors. We also see in Table 4.1 that the same expected subjective a priori probability is obtained for Mr. Holmes and Mr. Stone, even though the conviction coefficients for these investors are quite different. So far we have considered only the initial conditions of a dynamic process. As we advance in time and receive information from the stock market (price quotations) we will see the significance of the difference in the values of the conviction coefficients. Suppose on the next trading day the price of Slantwell rose. Let us see what effect this observation has on the three investors. We want to find the expected subjective probabilities for the three investors, given that on one day out of a one-day sample the price of Slantwell rose. We want
$$E_k[p \mid n=1,\ t=1] = \int_0^1 p\,\Pr\nolimits_k[p \mid n=1,\ t=1]\,dp \tag{4.2.6}$$
where $n$ is the number of days on which a rise in price was observed; $t$ is the total number of days observed; and $\Pr_k[p \mid n=1,\ t=1]$ is the conditional subjective probability density function.
Bayes's formula provides a way to calculate the conditional subjective probabilities needed in Eq. (4.2.6). In our case, it is given by
$$\Pr\nolimits_k[p \mid n, t] = \frac{\Pr_k[n \mid p, t]\,g_k(p)}{\int_0^1 \Pr_k[n \mid p, t]\,g_k(p)\,dp} \tag{4.2.7}$$
where $\Pr_k[n \mid p, t]$ is the conditional probability of observing $n$ days where a rise in the price of Slantwell stock occurred, given that the probability of a rise is $p$ for the $k$th investor. Equation (4.2.1) gives the value of $g_k(p)$; but how can we determine the conditional probability, $\Pr_k[n \mid p, t]$? Suppose we assume that there exists some underlying physical or economic stochastic process that generates the price changes in Slantwell stock. For example, if the events (a rise or a fall) are mutually exclusive and are time-independent, we can represent the underlying stochastic process by the use of the binomial probability function. For each of the $k$ investors, we would have
$$\Pr[n \mid p, t] = \frac{t!}{n!\,(t-n)!}\,p^n(1-p)^{t-n} \quad \text{for } 0 \le p \le 1,\ 0 \le n \le t,$$
$$\Pr[n \mid p, t] = 0 \quad \text{elsewhere.} \tag{4.2.8}$$
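We now have all we need to solve Eq. (4.2.7) for the conditional subjective probabilities after an observation of the price change for one trading day. We have
$$\Pr\nolimits_k[p \mid n, t] = \frac{p^{\,n+\alpha_{1k}-1}(1-p)^{\,t-n+\alpha_{2k}-1}}{\int_0^1 p^{\,n+\alpha_{1k}-1}(1-p)^{\,t-n+\alpha_{2k}-1}\,dp}. \tag{4.2.9}$$
Now substitution of Eq. (4.2.9) into Eq. (4.2.6) results in
$$E_k[p \mid n, t] = \frac{\int_0^1 p^{\,n+\alpha_{1k}}(1-p)^{\,t-n+\alpha_{2k}-1}\,dp}{\int_0^1 p^{\,n+\alpha_{1k}-1}(1-p)^{\,t-n+\alpha_{2k}-1}\,dp}. \tag{4.2.10}$$
From the definition of the beta function, Eq. (4.2.4), we can write Eq. (4.2.10) as
$$E_k[p \mid n, t] = \frac{B(n+\alpha_{1k}+1,\ t-n+\alpha_{2k})}{B(n+\alpha_{1k},\ t-n+\alpha_{2k})}.$$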
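Rewriting the beta functions as gamma functions, we have
$$\frac{\Gamma(n+\alpha_{1k}+1)\,\Gamma(t-n+\alpha_{2k})\,\Gamma(t+\alpha_{1k}+\alpha_{2k})}{\Gamma(t+\alpha_{1k}+\alpha_{2k}+1)\,\Gamma(n+\alpha_{1k})\,\Gamma(t-n+\alpha_{2k})},$$
which can be reduced to
$$\frac{(n+\alpha_{1k})\,\Gamma(n+\alpha_{1k})\,\Gamma(t-n+\alpha_{2k})\,\Gamma(t+\alpha_{1k}+\alpha_{2k})}{(t+\alpha_{1k}+\alpha_{2k})\,\Gamma(t+\alpha_{1k}+\alpha_{2k})\,\Gamma(n+\alpha_{1k})\,\Gamma(t-n+\alpha_{2k})}.$$
Finally we can cancel common terms and arrive at
$$E_k[p \mid n, t] = \frac{n+\alpha_{1k}}{t+\alpha_{1k}+\alpha_{2k}}. \tag{4.2.11}$$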
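The closed form of Eq. (4.2.11) is easy to check numerically. The sketch below, with hypothetical function names, computes the expected subjective probability after $n$ rises in $t$ days both from the closed form and from a direct midpoint-rule evaluation of the Bayes ratio (a beta prior times a binomial likelihood, with constant factors cancelling). The conviction vectors are those of Table 4.1; the grid size is an arbitrary assumption.

```python
def posterior_mean(n, t, a1, a2):
    """Closed-form expected probability of a rise after n rises in t days."""
    return (n + a1) / (t + a1 + a2)


def posterior_mean_numeric(n, t, a1, a2, steps=20000):
    """Same quantity by midpoint-rule integration of prior times likelihood.
    The binomial coefficient and the prior's normalizer cancel in the ratio."""
    num = 0.0
    den = 0.0
    for i in range(steps):
        p = (i + 0.5) / steps
        w = p ** (n + a1 - 1) * (1.0 - p) ** (t - n + a2 - 1)
        num += p * w
        den += w
    return num / den


convictions = {"Mr. Wise": (30, 10), "Mr. Holmes": (2, 2), "Mr. Stone": (20, 20)}
for name, (a1, a2) in convictions.items():
    prior = posterior_mean(0, 0, a1, a2)       # a priori expectation
    after = posterior_mean(1, 1, a1, a2)       # after one observed rise
    check = posterior_mean_numeric(1, 1, a1, a2)
    print(name, round(prior, 3), round(after, 3), round(check, 3))
```

Setting $n = t = 0$ recovers the a priori expectation, so one function covers both the fourth and fifth columns of Table 4.1.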
Now if on the next day our investors observe a rise in the price of Slantwell stock, we can set $n$ and $t$ equal to one in Eq. (4.2.11) and find each investor's expected subjective probability for a rise after observing a rise on this day. Column 5 in Table 4.1 shows the results. Mr. Wise is still fully invested in Slantwell and his expectations for further rises have been reinforced. Mr. Holmes thinks that the topside breakout may be in the making, and he is sitting near his phone ready to place an order if this proves true. Mr. Stone feels that nothing has occurred which alters his hypothesis and has changed his opinion very little. Table 4.2 shows the observations of the three investors for the next 19 trading days. For each day, the conditional subjective probability of a rise for each investor has been calculated from Eq. (4.2.11). These probabilities have been entered in Table 4.2 for each investor.

TABLE 4.2
SUBJECTIVE PROBABILITIES (FOR A RISE IN PRICE) FOR THE THREE INVESTORS FOR 20 DAYS

Market Day      1     2     3     4     5     6     7     8     9     10
Price Change    +1    -1    +1    +1    -1    +1    +1    +1    +1    +1
Mr. Wise        .765  .738  .744  .750  .734  .740  .745  .750  .755  .760
Mr. Holmes      .600  .500  .571  .625  .556  .600  .636  .666  .694  .715
Mr. Stone       .512  .500  .512  .523  .511  .521  .532  .541  .550  .560

Market Day      11    12    13    14    15    16    17    18    19    20
Price Change    +1    +1    +1    -1    +1    +1    +1    -1    +1    +1
Mr. Wise        .765  .770  .775  .760  .764  .768  .772  .759  .764  .767
Mr. Holmes      .734  .750  .765  .722  .736  .750  .762  .726  .740  .750
Mr. Stone       .569  .577  .585  .575  .582  .590  .596  .586  .594  .600

In this case, Mr. Wise's convictions were in fact true and he made a large
profit. On the 12th day, Mr. Holmes bought Slantwell stock and made a fair profit by the 20th day. Mr. Stone, however, did not participate since he changed his opinion only slightly about the fluctuations of stock prices. He is waiting for the price to fall again and vindicate his theory.
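The day-by-day revision of each investor's subjective probability can be traced with a short loop over Eq. (4.2.11). The sketch below is illustrative only: the function name is hypothetical, and the sequence of price changes is the one read from the "Price Change" row of Table 4.2 (sixteen rises and four falls in twenty days). A few printed table entries differ from the formula in the third decimal place, so the code is meant to show the mechanism rather than to reproduce every printed digit.

```python
def track_beliefs(changes, a1, a2):
    """Expected probability of a rise after each day, for conviction (a1, a2)."""
    n = 0          # number of rises observed so far
    history = []
    for t, change in enumerate(changes, start=1):
        if change > 0:
            n += 1
        history.append((n + a1) / (t + a1 + a2))
    return history


# Price changes for market days 1 through 20, as read from Table 4.2.
changes = [+1, -1, +1, +1, -1, +1, +1, +1, +1, +1,
           +1, +1, +1, -1, +1, +1, +1, -1, +1, +1]

convictions = {"Mr. Wise": (30, 10), "Mr. Holmes": (2, 2), "Mr. Stone": (20, 20)}
for name, (a1, a2) in convictions.items():
    beliefs = track_beliefs(changes, a1, a2)
    print(name, [round(b, 3) for b in beliefs])
```

The weak conviction vector of Mr. Holmes lets his estimate move quickly toward the observed frequency of rises, while the strong prior of Mr. Stone holds his estimate much closer to one half over the same twenty days.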
4.3 Economic Environmental Processes
The economic environment (the event-generating process) found in the preceding example permitted only two events to occur: a rise or a fall in the price of Slantwell stock. The process generating these events was assumed to be a sequence of time-independent binomial or Bernoulli trials. The three investors are said to be in a binomial economic environment. We shall find many economic adaptive processes where the structure of the economic environment is more complex. For example, we could have a situation where any one of many events might possibly occur. A sequence of these events, mutually exclusive and time-independent, could be generated by a sequence of time-independent multinomial trials. This case would be an example of a multinomial economic environment. Another possibility is that the events within each stage are mutually exclusive but dependent over time. A sequence of this kind could be generated by a Markov chain process running synchronously with the decision-making process. This would be an example of a Markov economic environment. Any of the above types of environmental processes could appear as a feedback process. In a feedback process the event probabilities are in some degree controlled by the actions of the decision maker. It is beyond the scope of this book to consider feedback environmental processes. This exclusion is not as restrictive as it would first seem. In the study of economic processes it is often assumed that there are a great number of decision makers in the system, so many in fact that any one of them has little hope of changing the structure of the system. This assumption, which is one of the cornerstones of the concept of pure competition in economic theory, can be carried over to adaptive economic processes. Thus, we shall assume that any one decision maker cannot effect changes in the structure of his economic environment. In other words, we shall not consider feedback environmental processes.
We must also exclude time-dependent environmental processes such as the Markov environment. This leaves us with the multinomial environment which we shall consider in some detail. The exclusion of
time-dependent processes is somewhat restrictive. Fortunately, it has been found that many economic environmental processes (as indicated by the study of economic time series) are time-independent in the first or second difference sense." For example, the stock market price indexes are highly time-dependent, but the differences in price from period to period (the price changes) are for practical purposes timeindependent." This observation was exploited in the previous section and will be useful in Chapter 7 also. Time independence of the first differences cannot always be assumed when one considers individual stock prices. It is in cases of this type that we will find exclusion of timedependent environmental processes most restrictive. Consider the assumption of mutually exclusive sets of events (only one event out of a set of possible events can occur during a stage of the process). We shall define a set, M, of possible events which could occur as a result of a multinomial environmental process. We shall include in this set only those events which have some economic significance to the state of the process. In other words, only those events which affect a change in the structural state vector of the system will be included. We define the stage of the process as an interval of time short enough so that one and only one event of the set M can occur during that interval. A question comes up as to whether the event, "no change," should be included in the set M. Generally in economic processes, the "no change" event carries with it no reward for the decision maker, and its role is mainly one of scaling the economic problem to physical time. Most economic problems submit to simple analysis only when based on the state changing events in set M (excluding the "no change" event). 
If we base the definition of a stage on this set M, we see that the time of the process will not run in pace with clock time, but will have its own scale based on the changes to the state vector of the process. Since the change of state is a measure of the number of events which have occurred, or a measure of the state of the organization of the adaptive process, we define the time scale on which the stages of our processes are based as the entropy time scale. Under the conditions set forth above, we define a set of probabilities which describes the environmental process and which is referred to as the probability function of the economic environmental stochastic process. We have:
(1) Pr(that any one event out of the set M will occur in one stage) = 1.
(2) Pr(that any two or more events out of set M will occur in one stage) = 0.
Then we say:
(3) Pr(that the $i$th event out of the set $M$ will occur in the $t$th stage) $= p_i$, for $i \in M$ and all $t = 1, 2, \ldots, T$.
We let set $M$ consist of $m$ elements and define a vector composed of binary numbers, $x_{it} = 0, 1$; $i \in M$, $t = 1, 2, \ldots, T$, which take the value 0 if event $i$ has not occurred, or 1 if event $i$ has occurred. We make a further assumption:
(4) The event descriptions, the binary numbers $\{x_{it}\}$, are stochastically independent random variables.
This property can occur if and only if the joint probability function for a sequence of event descriptions can be factored into the product of the marginal probability functions for each event description. The ordered set, P, can be written in functional form suitable for further manipulation as:
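$$f(x_{1t}, x_{2t}, \ldots, x_{mt};\, p_1, p_2, \ldots, p_m) = \prod_{i \in M} p_i^{x_{it}}; \tag{4.3.1}$$
$$x_{it} = 0, 1; \quad \sum_{i \in M} x_{it} = 1; \quad i \in M,\ t = 1, 2, \ldots, T; \quad p_i \ge 0; \quad \sum_{i \in M} p_i = 1;$$
and $f = 0$ elsewhere.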
The function, $f(\cdot)$, is the probability of some particular vector, $\{x_{it}\}$, or, what is saying the same thing, some particular event in the $t$th stage. For example, suppose the vector $\{x_{it}\}$ was given by $\{x_{it}\} = \{0010\}$ for some $t$ and $m = 4$; i.e., the third event has occurred in the $t$th period. For this particular vector, Eq. (4.3.1) would give
$$f(0, 0, 1, 0;\, p_1, p_2, p_3, p_4) = p_1^0\, p_2^0\, p_3^1\, p_4^0 = p_3.$$
Suppose that this process has operated for $T$ stages. The decision maker has accumulated a sequence of vectors describing the history of the $T$ stages of the process. This set is given by
$$X_T = \{x_{11}, x_{21}, \ldots, x_{m1};\ x_{12}, x_{22}, \ldots, x_{m2};\ \ldots;\ x_{1T}, x_{2T}, \ldots, x_{mT}\},$$
where each vector is the $\{x_{it}\}$ for a particular stage and indicates which event occurred during that stage. Since we have $T$ stages, we can write
$$\sum_{t=1}^{T} \sum_{i \in M} x_{it} = T. \tag{4.3.3}$$
Furthermore, we know that within any vector, say the $t$th,
$$\sum_{i \in M} x_{it} = 1. \tag{4.3.4}$$
Also, the frequency of, say, the $i$th event is given by
$$\sum_{t=1}^{T} x_{it} = n_i, \qquad n_i \in N. \tag{4.3.5}$$
Note from Eqs. (4.3.5) and (4.3.3) that we have
$$\sum_{i \in M} n_i = T. \tag{4.3.6}$$
We propose to use the vector, N, as a statistic leading to the estimation of the parameters of the probability function of the economic environmental process, P.
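As a minimal sketch of Eqs. (4.3.1) and (4.3.5), the following Python fragment evaluates the one-stage probability function and reduces an event history to the frequency vector N. The function names and the five-stage history are illustrative assumptions, not part of the original text.

```python
from math import prod
from typing import List, Sequence


def stage_probability(x_t: Sequence[int], p: Sequence[float]) -> float:
    """Eq. (4.3.1): probability of one stage's event vector, prod p_i ** x_it."""
    if sum(x_t) != 1 or any(x not in (0, 1) for x in x_t):
        return 0.0  # the probability function is zero "elsewhere"
    return prod(p_i ** x for p_i, x in zip(p, x_t))


def frequency_vector(history: Sequence[Sequence[int]]) -> List[int]:
    """Eq. (4.3.5): n_i is the number of stages in which event i occurred."""
    m = len(history[0])
    counts = [0] * m
    for x_t in history:
        if len(x_t) != m or sum(x_t) != 1 or any(x not in (0, 1) for x in x_t):
            raise ValueError("each stage must record exactly one event")
        for i, x in enumerate(x_t):
            counts[i] += x
    return counts


# Hypothetical five-stage history with m = 4 events (assumed data).
history = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
N = frequency_vector(history)
print(N, sum(N) == len(history))  # the n_i sum to T, as in Eq. (4.3.6)
```

Only the counts survive the reduction; the order in which the events occurred is discarded, which is the sense in which N summarizes the history.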
4.4 A Digression on Statistical Estimators

To the statistician, the subjective probability functions (4.2.11) developed for the three investors in Section 4.2 were examples of statistically biased estimators of the probabilities of a binomial probability function. This similarity is intentional. Since statisticians know a great deal about the properties of estimators, we want to use this body of knowledge to discover the properties of our Bayesian subjective probabilities. Statisticians often talk about the "best" estimators. The term "best" is always well qualified so that there is little doubt how the estimator is best. We shall use the term best in a conventional sense. The best estimator of parameters of a stochastic process is defined as the estimator which is statistically unbiased and has a variance less than or equal to the variance of every other unbiased estimator of these parameters.
The results of a sequence of statistical observations can be mathematically represented by a number or vector which is called a statistic. We propose to use the vector N introduced in Section 4.3 as a statistic leading to the estimation of the vector P. This statistic can be shown to be statistically sufficient by the use of the Fisher-Neyman theorem [8]. We know that some unbiased estimator of the vector P exists (for example, the vector $\{x_{it}\}$ for any $t$ is such an estimator). Therefore we can use the Rao-Blackwell theorem [9] to find the class of estimators which lead to the best estimators. Since we have both a sufficient statistic and an unbiased estimator of P, the Rao-Blackwell theorem tells us that if we want to find a "best" estimator, we need only search through those estimators which are functions of the sufficient statistic, N. Since it can be shown that the probability function of the sufficient statistic, N, is a complete function† for all possible vectors, P, we know from the Lehmann-Scheffé theorem [10] that if we can find an estimator of P which is an unbiased function of N, it will be the only best estimator for P. Such an unbiased estimator for the multinomial probability function is the maximum likelihood estimator (m.l.e.). In this case, it can be shown that the maximum likelihood estimator [11] is given by
$$\hat{p}_i = \frac{n_i}{T}, \qquad i \in M. \tag{4.4.1}$$
We note that:
(1) The m.l.e. is a function of the sufficient statistic N.
(2) The m.l.e. is unbiased;‡ i.e.,
$$E(\hat{p}_i) = \frac{1}{T} E(n_i) = p_i, \qquad i \in M.$$
† Definition of a complete function: Let $f(x; p)$, $0 < p < 1$, be a probability density function (p.d.f.) of a random variable $X$. Let $u(x)$ be a continuous function of $x$ but not of $p$. If $E[u(x)] = 0$ for every $p$, $0 < p < 1$, requires that $u(x)$ be zero at every point for which there is some $p$, $0 < p < 1$, that makes $f(x; p) > 0$, then the p.d.f. $f(x; p)$ is called a "complete" function.
‡ See, for example, Reference 11.
We sum up these results in the following. Let $X_T = \{x_{11}, x_{21}, \ldots, x_{m1};\ x_{12}, x_{22}, \ldots, x_{m2};\ \ldots;\ x_{1T}, x_{2T}, \ldots, x_{mT}\}$ be the event history of a random process of $T$ stages (entropy time periods) having a probability function at any period, $t$,
$$f(x_{1t}, x_{2t}, \ldots, x_{mt};\, P) = \prod_{i \in M} p_i^{x_{it}},$$
where $x_{it} = 0, 1$; $\sum_{i \in M} x_{it} = 1$, for any $t = 1, 2, \ldots, T$, and $f = 0$ elsewhere, and where $P$ is a vector whose elements are subject to
$$p_i \ge 0, \qquad \sum_{i \in M} p_i = 1.$$
If $P$ is stationary for all $t = 1, 2, \ldots, T$, and if the vector $\hat{P}$ composed of elements $\hat{p}_i = (1/T) \sum_{t=1}^{T} x_{it}$ is the maximum likelihood estimator of $P$, then $\hat{P}$ is the unbiased, minimum variance (best) estimator of $P$.
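The unbiasedness of the maximum likelihood estimator can be checked exactly for a small case. The sketch below enumerates every possible event history of length T for an assumed probability vector and averages the m.l.e. over them using exact rational arithmetic. The probability vector, T, and the function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod


def mle(counts):
    """Maximum likelihood estimate p_i = n_i / T from a frequency vector."""
    total = sum(counts)
    return [Fraction(n, total) for n in counts]


def expected_mle(p, T):
    """Exact expectation of the m.l.e. over all m**T histories of length T."""
    m = len(p)
    expectation = [Fraction(0)] * m
    for seq in product(range(m), repeat=T):
        # Stages are independent, so a history's probability is a product.
        prob = prod((p[i] for i in seq), start=Fraction(1))
        counts = [seq.count(i) for i in range(m)]
        for i, estimate in enumerate(mle(counts)):
            expectation[i] += prob * estimate
    return expectation


# Hypothetical stationary environment with m = 3 events (assumed values).
P = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
E = expected_mle(P, T=4)
print(E == P)
```

Brute-force enumeration is only practical for small m and T; it is used here because it makes the expectation exact rather than simulated.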
4.5
Multinomial Subjective Probabilities
At the time a decision maker starts collecting historical information on a stochastic process, he already has some conviction concerning the probabilities of the events to follow. Suppose the economic environmental process possesses a stationary probability function with parameters, P. Then for some particular statistic or frequency vector, N, there is a definite probability which gives the likelihood of the occurrence of this particular N. The conditional probability function of any particular N during T stages (entropy time periods) can be represented by the multinomial function
$$f(N \mid P) = \frac{T!}{\prod_{i \in M} n_i!} \prod_{i \in M} p_i^{n_i}, \tag{4.5.1}$$
where $P$ is a vector whose elements are subject to
$$p_i \ge 0, \qquad \sum_{i \in M} p_i = 1,$$
and where $N$ is a vector whose elements are subject to
$$n_i \ge 0 \quad \text{and} \quad \sum_{i \in M} n_i = T,$$
and $f = 0$ elsewhere.
In Eq. (4.5.1), we denote $n_i$ as the frequency of the occurrence of the $i$th event in $T$ stages; $p_i$ as the actual probability of the $i$th event from the stationary probability function of the economic environmental process; and $f(N \mid P)$ as the conditional probability function of a particular frequency vector, $\{n_i\}$, given the vector $\{p_i\}$. It is necessary to obtain still another conditional probability function. We need $f(P \mid N)$, the conditional probability of a particular event probability vector, given that a particular frequency vector has been observed by the decision maker during the passing of $T$ stages. This probability, for the $k$th decision maker, is obtained by the use of Bayes's equation. It is given by
$$f_k(P \mid N) = \frac{f(N \mid P)\, g_k(P)}{\int_P f(N \mid P)\, g_k(P)\, dP}, \tag{4.5.2}$$
where $p_i \in P$ and $n_i \in N$, and $f_k = 0$ elsewhere.
The function $g_k(P)$ is defined as the a priori subjective probability function of the decision maker. The function to be assumed for $g_k(P)$ will be the Dirichlet or multinomial beta probability function [12], which will be shown to give $f_k(P \mid N)$ some useful properties. This function is given by
$$g_k(P) = [B(\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{km})]^{-1} \prod_{i \in M} p_i^{\alpha_{ki} - 1}, \tag{4.5.3}$$
where $\alpha_{ki} > 0$, $p_i \in P$, $i \in M$, for some $k$, and $g_k = 0$ elsewhere.

In Eq. (4.5.3) we define $\{\alpha_{ki}\}$ as a vector of parameters which represent the $k$th decision maker's a priori convictions about the probability function of the environmental process, and
$$B(\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{km}) = \int_P \prod_{i \in M} p_i^{\alpha_{ki} - 1}\, dP, \tag{4.5.4}$$
where $\alpha_{ki} > 0$, $p_i \in P$, for each $i \in M$ and the $k$th decision maker.
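We define the subjective probability vector (subjective estimator) for $P$ as $\tilde{P}_{kT}$, where
$$\tilde{p}_{kiT} = E(p_i \mid N) = \int_P p_i\, f_k(P \mid N)\, dP \tag{4.5.5}$$
for the $k$th decision maker.

Substitution of Eq. (4.5.1) into Eq. (4.5.2) and Eq. (4.5.3) into Eq. (4.5.2) and taking the expectation yields for $\tilde{p}_{kiT}$, after the constant factors $T!/\prod_{j \in M} n_j!$ and $[B(\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{km})]^{-1}$ cancel between numerator and denominator,
$$\tilde{p}_{kiT} = \frac{\int_P p_i \prod_{j \in M} p_j^{n_j + \alpha_{kj} - 1}\, dP}{\int_P \prod_{j \in M} p_j^{n_j + \alpha_{kj} - 1}\, dP}$$
for $p_i \in P$, $\alpha_{ki} > 0$ for some $k$, and $i \in M$. This reduces to
$$\tilde{p}_{kiT} = \frac{B(n_1 + \alpha_{k1}, \ldots, n_i + \alpha_{ki} + 1, \ldots, n_m + \alpha_{km})}{B(n_1 + \alpha_{k1}, \ldots, n_i + \alpha_{ki}, \ldots, n_m + \alpha_{km})}.$$
The multinomial beta function can be given by
$$B(y_1, y_2, \ldots, y_m) = \frac{\prod_{i \in M} \Gamma(y_i)}{\Gamma\!\left[\sum_{i \in M} y_i\right]}.$$
Therefore, we can write
$$\tilde{p}_{kiT} = \frac{\Gamma(n_i + \alpha_{ki} + 1) \prod_{j=1,\, j \ne i}^{m} \Gamma(n_j + \alpha_{kj})\ \Gamma\!\left[\sum_{j \in M} n_j + \sum_{j \in M} \alpha_{kj}\right]}{\Gamma\!\left(\sum_{j \in M} n_j + \sum_{j \in M} \alpha_{kj} + 1\right) \Gamma(n_i + \alpha_{ki}) \prod_{j=1,\, j \ne i}^{m} \Gamma(n_j + \alpha_{kj})}.$$
We have finally
$$\tilde{p}_{kiT} = \frac{n_i + \alpha_{ki}}{T + \sum_{j \in M} \alpha_{kj}}, \qquad n_i \ge 0,\ \alpha_{ki} > 0, \tag{4.5.6}$$
for some $k$ and $i \in M$.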
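A short numerical sketch may help fix the result. Assuming the Dirichlet prior of Eq. (4.5.3), the subjective probability is the posterior mean, which is the ratio of two multinomial beta functions and reduces to (n_i + alpha_i) / (T + sum of the alphas). The code below computes it both ways; the counts and prior parameters are hypothetical, and `lgamma` is used only to evaluate the gamma-function ratio stably.

```python
from math import exp, lgamma


def subjective_probabilities(counts, alpha):
    """Closed form of the posterior mean: (n_i + alpha_i) / (T + sum(alpha))."""
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]


def log_multinomial_beta(y):
    """log B(y_1, ..., y_m) = sum(log Gamma(y_i)) - log Gamma(sum(y_i))."""
    return sum(lgamma(v) for v in y) - lgamma(sum(y))


def posterior_mean_via_beta(counts, alpha, i):
    """Same quantity as a ratio of multinomial beta functions."""
    y = [n + a for n, a in zip(counts, alpha)]
    y_plus = list(y)
    y_plus[i] += 1  # the extra factor p_i raises the i-th exponent by one
    return exp(log_multinomial_beta(y_plus) - log_multinomial_beta(y))


# Hypothetical frequency vector and a priori parameters (assumed values).
counts = [1, 1, 3, 0]
alpha = [1.0, 1.0, 1.0, 1.0]
p_subjective = subjective_probabilities(counts, alpha)
print(p_subjective)

# With many more observations in the same proportions, the prior's weight
# shrinks and the estimate moves toward the relative frequency n_i / T.
p_large = subjective_probabilities([1000, 1000, 3000, 0], alpha)
print(p_large[2])
```

Note that an event never yet observed (the fourth one here) still receives positive subjective probability, because the a priori parameters act like counts added before any observation.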
This estimator will be denoted as the subjective probability vector for the probability function of the multinomial environmental process. This subjective probability vector is identical to the "Predictive Inference Function" postulated by Carnap [13]. Carnap called the parameter, $\alpha_{ki} / \sum_{i \in M} \alpha_{ki}$, the "logical width" of the event $i$ (for the $k$th decision maker). Recently, the subjective probability vector has been applied to the prediction of source information rates in electronic communication systems by Schwartz et al. [14]
4.6
Properties of the Subjective Probability Vector
Because the subjective probability vector is so important in the role of adaptive processes it is necessary to examine its properties and its relation to the unbiased, minimum variance estimator developed in the last section. The property of "statistical consistency" is desirable for any estimator used as a subjective probability vector in an adaptive scheme. The statistical consistency of the subjective probability vector implies that there is an end to the adaptive process where the decision maker (if he should live that long) will know the true probability vector of the environmental process with certainty. This requires also that this probability function does not change during this process; i.e., the probability vector is composed of stationary probabilities. The consistency of the subjective probability vector can be summed up as follows: Let $X_T = \{x_{11}, x_{21}, \ldots, x_{m1};\ x_{12}, x_{22}, \ldots, x_{m2};\ \ldots;\ x_{1T}, x_{2T}, \ldots, x_{mT}\}$ be the event history of a random process of $T$ stages (entropy time periods) having a probability function at any period, say the $t$th, of
$$f(x_{1t}, x_{2t}, \ldots, x_{mt};\, P) = \prod_{i \in M} p_i^{x_{it}},$$
where $p_i \in P$ such that $\sum_{i \in M} p_i = 1$, and where $x_{it} = 0, 1$ such that $\sum_{i \in M} x_{it} = 1$, $i \in M$, $t = 1, 2, \ldots, T$, and $f = 0$ elsewhere.
The vector $P$ is the probability function of the environmental process and is stationary over the stages in $T$. If the vector $\tilde{P}_{kT}$ is the subjective probability function (subjective estimator) for $P$ at stage $T$ and the $k$th decision maker, then
$$\operatorname*{plim}_{T \to \infty} \tilde{p}_{kiT} = p_i, \qquad \text{where } \tilde{p}_{kiT} \in \tilde{P}_{kT},\ p_i \in P, \text{ and any } k; \tag{4.6.1}$$
and the vector $\tilde{P}_{kT}$ is said to be a consistent estimator of $P$. The proof of this follows from the consideration of the asymptotic unbiasedness of the subjective estimator and the weak law of large numbers.

We now note the relation between the subjective probability function (subjective estimator) and the maximum likelihood estimator developed in the previous section. The subjective probability vector for $P$ is given by Eq. (4.5.6) while the m.l.e. is given by
$$\hat{p}_i = \frac{n_i}{T}, \qquad i \in M. \tag{4.6.2}$$
The components of the subjective probability vector can be given generally by
$$\tilde{p}_{kiT} = \frac{T \hat{p}_i + \alpha_{ki}}{T + \sum_{j \in M} \alpha_{kj}}, \qquad i \in M. \tag{4.6.3}$$
Remember that N is a sufficient statistic so that the subjective probability vector is a function of a sufficient statistic. We note that the subjective probability vector is biased, although it is asymptotically unbiased. Since the subjective probability vector is biased by the decision maker's convictions (logical width), it cannot be the best estimator (m.l.e.) as given by the Lehmann-Scheffé theorem. As time advances, however, the bias becomes relatively smaller and the subjective probability vector approaches the best estimator (m.l.e.). The subjective probability vector could be said to be the asymptotically best estimator. The consistency property of the subjective probability vector will be important to the theory of adaptive processes. We shall hear more about this property in later chapters.

References
1. Good, I. J., "Probability and the Weighing of Evidence," pp. 31-59. Hafner, New York, 1950.
2. Savage, L. J., "The Foundations of Statistics," pp. 27-68. Wiley, New York, 1954.
3. Kyburg, H. E., Jr., and Smokler, H. E. (eds.), "Studies in Subjective Probability." Wiley, New York, 1964.
4. Driml, M., and Hans, O., On experience theory problems, in "Trans. 2nd Prague Conf. on Information Theory, Statistical Decision Functions, Random Processes," pp. 93-111. Academic Press, New York, 1960.
5. Mosteller, F., and Nogee, P., An experimental measurement of utility, J. Political Economy 59, 371-404 (1951).
6. Tintner, G., "The Variate Difference Method." Principia Press, Bloomington, Indiana, 1940.
7. Osborne, M. F. M., Brownian movement in the stock market, Operations Res. 7, No. 2, 145-173 (1959).
8. Hogg, R. V., and Craig, A. T., "Introduction to Mathematical Statistics," p. 101. Macmillan, New York, 1959.
9. Hogg, R. V., and Craig, A. T., "Introduction to Mathematical Statistics," p. 104. Macmillan, New York, 1959.
10. Hogg, R. V., and Craig, A. T., "Introduction to Mathematical Statistics," p. 110. Macmillan, New York, 1959.
11. Mood, A. M., "Introduction to the Theory of Statistics," p. 270. McGraw-Hill, New York, 1950.
12. Wilks, S. S., "Mathematical Statistics," pp. 177-182. Wiley, New York, 1962.
13. Carnap, R., "Logical Foundations of Probability," p. 568. Chicago Univ. Press, Chicago, 1950.
14. Schwartz, L. S., Harris, B., and Hauptschein, A., Information rate from the viewpoint of inductive probability, IRE Intern. Conv. Record, Part 4, 102-111 (1959).
CHAPTER 5
THE ROLE OF ENTROPY IN ECONOMIC PROCESSES

Even so time exists not by itself . . . apart from the motion or quiet rest of things.
Lucretius, On the Nature of Things

5.1 The Concept of Entropy Time

From the beginning of life on this planet we have been evolving in a world surrounded by regular astronomical phenomena. We have established these astronomical phenomena as our time standards. For what do we use this measure, time? One clear point stands out: we use time as a common measure for the comparison of the rates of growth of various physical phenomena. When we measure length we find a standard which is invariant when transferred to the position of the object to be measured. Typically, such an invariant standard is a rigid stick. To measure the relative growth of one system to another, we cast about for the most invariant growth (or decay) system to use as a standard. A good standard is clock (sidereal) time, time geared to astronomical phenomena. For some reason the use of this standard for comparative measures of growth has obscured the understanding of the passage of time in physical systems. The thought has occurred to many people in physics and astronomy that the aging of a physical system should not be analyzed on the basis of clock time [1, 2]. If we base our concept of time on the measure of the aging or growth of a system, is it not more appropriate to analyze a system on the basis of a change in the state of organization of the system? Since the state of organization of a system is usually measured by the entropy of the system, the change of the state of organization could be called the entropy-gradient of that system. In Sir Arthur Eddington's book, "The Nature of the Physical World," he said:
Entropy-gradient is then the direct equivalent of the time of consciousness (a mental time concept) in both its aspects. Duration measured by physical clocks (timelike interval) is only remotely connected. [3]

Furthermore, Henri Bergson, in his book "Time and Free Will," said:
Granted that inner duration, perceived by consciousness, is nothing else but the melting of states of consciousness into one another, and the gradual growth of the ego, it will be said, notwithstanding, that the time which the astronomer introduces into his formulae, the time which our clocks divide into equal portions, this time, at least, is something different. It must be a measurable and therefore homogeneous magnitude. It is nothing of the sort, however, and a close examination will dispel this last illusion." The phrase, " ... the gradual growth of the ego ... ," is the metaphysical form of the concept of entropy-gradient. The entropy of a system is a function of the state probabilities of the system. When the state probabilities of a system remain constant, no conscious time or "entropy time" passes. Bergson says:
When I follow with my eyes on the dial of a clock the movement of the hand which corresponds to the oscillations of the pendulum, I do not measure duration, as it seems to be thought; I merely count simultaneities, which is very different . . . . Within myself a process of organization ... is going on, which constitutes true duration. . . . Now, let us withdraw for a moment the ego which thinks of these so-called successive oscillations: there will never be more than a single oscillation, and indeed only a single position of the pendulum and hence no duration. 5 The oscillating pendulum conveys no new information (it has no entropy-gradient) after the first observed cycle. Thus the entropy time, conscious time, or Bergsonian time, stops, relative to clock time. In a process in which there is a change of the state of organization, i.e., a change in the subjective probabilities of the state of organization, there is a change in the entropy time or a change in conscious time. Dynamic economic systems are a class of systems in which changes of the state of organization constantly occur. Even the most regular dynamic economic processes are not so regular that one feels that clock time is the best basis to measure the rate of change. We shall postulate that entropy time is a useful basis to measure growth or decay in economic processes, and should be substituted for clock time. Such a
substitution is not new in economic theory. Economic models have long used the concepts of a period of planning or a period of production in which a change of organization has occurred rather than a particular change of clock time. The period concept is intuitively related to the idea that an economically important change in state has occurred during the period, but never to the author's knowledge has the period been related to the entropy-gradient.

In stochastic processes of the Poisson class a fundamental assumption is made that time is divided into sufficiently short clock time intervals so that the probabilities of the occurrence of no event and one event add to one. It is interesting to note that such an assumption simplifies the stochastic model where it is applicable. Since each period of time or stage is associated with the presence or absence of an event, it follows that for each period of time a binary number exists which completely describes the occurrence of the period or stage. In other words, we can assign the number one or zero to represent that the event has, or has not, occurred. If the occurrence or the nonoccurrence of the event is equally likely, then such a binary number conveys exactly one bit† of information in the terms of communication theory. If the event is certain to occur, the binary number conveys no information. In between these two extremes each binary number conveys an amount of information which is equal to the entropy of the underlying stochastic process generating the event occurrences. Since the reception of this information indicates that a change in the state of organization of a system has occurred, it can likewise indicate a change in the entropy of that system. A change in the system entropy advances entropy time by one unit. In the Poisson process clock time likewise advances by one period. Thus, a one-to-one matching advance of entropy and clock time occurs in a Poisson process.
In addition, and more generally, the finite Markov chain process also is conditioned on the basis that one experiment results in one change in state.
5.2
Entropy and Information
Information theory as used in communication engineering is concerned with the mechanics of the transmission of messages accurately and efficiently [6]. It is taken for granted that the "message" is meaningful. We are mainly concerned with the economic value of information; however, the theory of information has much that will be useful for our purposes. Some of the useful properties of the measure of information, entropy, and the nature of information sources will be briefly considered in this section. Readers who desire a thorough background in information theory are advised to seek one of the many excellent general references on this subject [7, 8].

† The term "bit" was introduced by J. W. Tukey and is a contraction of "binary digit."

Suppose we consider a simple two-stage economic information situation. During the first stage the Slantwell Oil Company will discover that one of its new wells will either produce oil or not. During the second stage the directors of Slantwell will either vote a dividend or not. Noting that the alternative environmental events during each stage are mutually exclusive, we will define the probabilities:
$$\begin{aligned}
p(u_{11}, u_{21}) &= \Pr[\text{that the well is successful and a dividend is paid}],\\
p(u_{11}, u_{22}) &= \Pr[\text{that the well is successful and no dividend is paid}],\\
p(u_{12}, u_{21}) &= \Pr[\text{that the well is not successful and a dividend is paid}],\\
p(u_{12}, u_{22}) &= \Pr[\text{that the well is not successful and no dividend is paid}],
\end{aligned} \tag{5.2.1}$$
where
$$p(u_{11}, u_{21}),\ p(u_{11}, u_{22}),\ p(u_{12}, u_{21}),\ p(u_{12}, u_{22}) \ge 0,$$
$$p(u_{11}, u_{21}) + p(u_{11}, u_{22}) + p(u_{12}, u_{21}) + p(u_{12}, u_{22}) = 1,$$
$u_{11}, u_{12} \in U_1$ (the possible environmental events during stage 1), and $u_{21}, u_{22} \in U_2$ (the possible environmental events during stage 2).
From the probability calculus we know that
$$\begin{aligned}
p(u_{11}) &= \Pr[\text{the well is successful}] = p(u_{11}, u_{21}) + p(u_{11}, u_{22}),\\
p(u_{12}) &= \Pr[\text{the well is unsuccessful}] = p(u_{12}, u_{21}) + p(u_{12}, u_{22}),\\
p(u_{21}) &= \Pr[\text{the dividend is paid}] = p(u_{11}, u_{21}) + p(u_{12}, u_{21}),\\
p(u_{22}) &= \Pr[\text{the dividend is not paid}] = p(u_{11}, u_{22}) + p(u_{12}, u_{22}).
\end{aligned} \tag{5.2.2}$$
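Note that
$$p(u_{11}) + p(u_{12}) = 1$$
and
$$p(u_{21}) + p(u_{22}) = 1.$$
Suppose we know that the well was successful. What would be the probability that a dividend is paid? We are seeking the conditional probability and it is given by
$$p(u_{21} \mid u_{11}) = \frac{p(u_{11}, u_{21})}{p(u_{11})}. \tag{5.2.3}$$
Likewise the conditional probability for no dividend payment given that we know that the well was successful is given by
$$p(u_{22} \mid u_{11}) = \frac{p(u_{11}, u_{22})}{p(u_{11})}. \tag{5.2.4}$$
Furthermore, we have also
$$p(u_{21} \mid u_{12}) = \frac{p(u_{12}, u_{21})}{p(u_{12})}, \tag{5.2.5}$$
$$p(u_{22} \mid u_{12}) = \frac{p(u_{12}, u_{22})}{p(u_{12})}, \tag{5.2.6}$$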
for the situations where it is known that the oil well was unsuccessful. How much information have we gained through the knowledge that the well was successful? Suppose we define a measure of information by the relation
$$I(u_{21}; u_{11}) = \ln \frac{p(u_{21} \mid u_{11})}{p(u_{21})} \tag{5.2.7}$$
or
$$I(u_{22}; u_{11}) = \ln \frac{p(u_{22} \mid u_{11})}{p(u_{22})} \tag{5.2.8}$$
depending on whether or not the dividend is paid. Similar equations are defined for the situation where the oil well was known to be unsuccessful. These are
$$I(u_{21}; u_{12}) = \ln \frac{p(u_{21} \mid u_{12})}{p(u_{21})} \tag{5.2.9}$$
or
$$I(u_{22}; u_{12}) = \ln \frac{p(u_{22} \mid u_{12})}{p(u_{22})} \tag{5.2.10}$$
depending on whether or not a dividend is paid. This type of information has been called mutual information. Mutual information equations, (5.2.7) and (5.2.8), for example, have some interesting properties. Suppose that the probabilities, Eq. (5.2.1), are independent. For example, suppose the outcome of the directors' meeting is completely independent of the news about the oil well. We would have
$$p(u_{11}, u_{21}) = p(u_{11})\, p(u_{21}) \tag{5.2.11}$$
and
$$p(u_{11}, u_{22}) = p(u_{11})\, p(u_{22}). \tag{5.2.12}$$
Substitution of these equations into Eqs. (5.2.3) and (5.2.4), and this result into Eqs. (5.2.7) and (5.2.8), gives us
$$I(u_{21}; u_{11}) = \ln 1 = 0 \tag{5.2.13}$$
and
$$I(u_{22}; u_{11}) = \ln 1 = 0. \tag{5.2.14}$$
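The chain from joint probabilities to marginals, conditionals, and mutual information, Eqs. (5.2.2) through (5.2.14), can be sketched numerically. The joint probabilities below are invented for illustration and are not taken from the text; the dictionary layout and function names are likewise assumptions.

```python
from math import log


def marginals(joint):
    """Eq. (5.2.2): sum the joint table over the other stage's events."""
    first, second = {}, {}
    for (a, b), p in joint.items():
        first[a] = first.get(a, 0.0) + p
        second[b] = second.get(b, 0.0) + p
    return first, second


def mutual_information(joint, a, b):
    """Eq. (5.2.7): ln[p(b | a) / p(b)], in natural units."""
    first, second = marginals(joint)
    conditional = joint[(a, b)] / first[a]  # Eq. (5.2.3)
    return log(conditional / second[b])


# Hypothetical joint probabilities for (well outcome, dividend decision).
dependent = {
    ("success", "dividend"): 0.4, ("success", "none"): 0.1,
    ("failure", "dividend"): 0.1, ("failure", "none"): 0.4,
}
# Independent case: every joint probability is a product of marginals.
independent = {
    ("success", "dividend"): 0.25, ("success", "none"): 0.25,
    ("failure", "dividend"): 0.25, ("failure", "none"): 0.25,
}

i_dep = mutual_information(dependent, "success", "dividend")   # ln(0.8 / 0.5)
i_ind = mutual_information(independent, "success", "dividend")  # ln 1
i_neg = mutual_information(dependent, "success", "none")        # ln(0.2 / 0.5)
print(i_dep, i_ind, i_neg)
```

In the dependent table, learning that the well succeeded raises the probability of a dividend from 0.5 to 0.8, so the mutual information for that pair is positive, while for the pair (success, no dividend) it is negative.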
In other words, if the events in the second stage are independent of the outcome in the first stage, there is no mutual information in the knowledge that the oil well was successful. This is just what we would intuitively expect from an information function. Now suppose that there is a rule at Slantwell that a dividend is paid if an oil well is successful, and no dividend is paid if a well is unsuccessful. This means that
$$p(u_{21} \mid u_{11}) = 1 \tag{5.2.15}$$
or
$$p(u_{22} \mid u_{12}) = 1. \tag{5.2.16}$$
We see from Eqs. (5.2.7) and (5.2.10) that
$$I(u_{21}; u_{11}) = -\ln p(u_{21}) \equiv I(u_{21}) \tag{5.2.17}$$
and
$$I(u_{22}; u_{12}) = -\ln p(u_{22}) \equiv I(u_{22}). \tag{5.2.18}$$
This kind of information is called self information. In this case the self information is the information contained in the knowledge of the outcome of the second stage alone. We know that the event outcome in the first stage is revealed completely by the outcome in the second stage since a dividend payment means that an oil well must have been
successful and no dividend payment means that the oil well must have been unsuccessful. As a digression, note from Eq. (5.2.7), for example, that if
$$p(u_{21} \mid u_{11}) > p(u_{21}), \tag{5.2.19}$$
the mutual information is positive. On the other hand, if
$$p(u_{21} \mid u_{11}) < p(u_{21}), \tag{5.2.20}$$
then the mutual information is negative. In most cases we know that the events in stage two are neither completely independent of nor completely dependent on the outcome of the first stage. Since this means that
$$p(u_{21} \mid u_{11}) \le 1, \tag{5.2.21}$$
we have that
$$I(u_{21}; u_{11}) \le -\ln p(u_{21}) = I(u_{21}), \tag{5.2.22}$$
which can be generalized for the other information functions.

The self information is the amount of information given by the knowledge of the outcome of the current stage of a process. But we do not know which of our two events will occur in the second stage. Information is really a random variable. We can find out what the expected information is since we know the probabilities. The expected self information at the second stage of the example would be
$$E[I(U_2)] = H(U_2) = -p(u_{21}) \ln p(u_{21}) - p(u_{22}) \ln p(u_{22}), \tag{5.2.23}$$
where $H(U_2)$ is denoted as the entropy of the process at the second stage. The notion of entropy can be extended to a multi-event process. Suppose we have a set $M$ of $m$ different events, one of which occurs each stage. The entropy for such a process for, say, the $t$th stage is given by
$$H(U_t) = -\sum_{i \in M} p(u_{ti}) \ln p(u_{ti}). \tag{5.2.24}$$
Since probabilities are always nonnegative and equal to or less than 1, we see that the self information is always nonnegative. Therefore, the entropy is always nonnegative. If either $p(u_{21})$ or $p(u_{22})$ is zero, the entropy is zero since $0 \ln 0$ is taken to be zero. For this reason entropy
must reach a maximum somewhere between these endpoints. This point of maximum entropy is easily found by using the inequality:
$$\ln y \le y - 1, \qquad \text{for } y \ge 0. \tag{5.2.25}$$
Consider a general case for the entropy of the $t$th stage of a process with $m$ different mutually exclusive events possible during the stage. Consider the equality,
$$H(U_t) - \ln m = -\sum_{i \in M} p(u_{ti}) \ln[m\, p(u_{ti})].$$
Using Eq. (5.2.25), we have
$$H(U_t) - \ln m \le \sum_{i \in M} p(u_{ti}) \left[\frac{1}{m\, p(u_{ti})} - 1\right] = 0.$$
Now equality in Eq. (5.2.25) occurs when, and only when, $y = 1$. Therefore, when
$$p(u_{ti}) = \frac{1}{m} \qquad \text{for each } i \in M, \tag{5.2.26}$$
then
$$H(U_t) - \ln m = 0. \tag{5.2.27}$$
Therefore maximum entropy, that is,
$$H^*(U_t) = \ln m, \tag{5.2.28}$$
occurs when Eq. (5.2.26) holds. If, as in our example, $m = 2$, then the maximum entropy is $\ln 2$ and occurs when $p(u_{11}) = p(u_{12}) = \tfrac{1}{2}$. Figure 5.1 shows the value of the entropy of a binomial process (i.e., $m = 2$) for a range of values for $p(u_{11})$.

[FIG. 5.1. Entropy versus probability (binomial case). Horizontal axis: probability, $p_1$ or $(1 - p_2)$, from 0 to 1.0; vertical axis: entropy, scaled from 0 to $\ln 2$.]

Returning to the example again, suppose we wish to find the information gained by the knowledge of whether or not a dividend was paid, on the condition that we know for sure what happened to the oil well. What we want is a conditional information function. Taking an example, suppose we want to know how much information we gain when we discover that a dividend was paid, given that we know that the oil well was a success. We have that
$$I(u_{21} \mid u_{11}) = I(u_{21}) - I(u_{21}; u_{11}). \tag{5.2.29}$$
In other words, we gain less information by discovering that a dividend was paid after we know that the well was successful than we would if we were merely told that a dividend was paid and we had no knowledge of the well. The difference is just the knowledge about the dividend we can infer from the fact that the well was successful. Substitution of Eqs. (5.2.17) and (5.2.3) into Eq. (5.2.29) results in
$$I(u_{21} \mid u_{11}) = -\ln p(u_{21} \mid u_{11}). \tag{5.2.30}$$
Likewise for the other cases we have
$$I(u_{21} \mid u_{12}) = -\ln p(u_{21} \mid u_{12}), \tag{5.2.31}$$
$$I(u_{22} \mid u_{11}) = -\ln p(u_{22} \mid u_{11}), \tag{5.2.32}$$
and
$$I(u_{22} \mid u_{12}) = -\ln p(u_{22} \mid u_{12}). \tag{5.2.33}$$
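The entropy bound of Eq. (5.2.28) and the conditional information relation of Eqs. (5.2.29) and (5.2.30) can both be checked with a few lines of Python. This is a sketch under assumed numbers: the probabilities 0.5 and 0.8 are hypothetical values for p(u21) and p(u21 | u11), and the function names are illustrative.

```python
from math import log


def entropy(probs):
    """Eq. (5.2.24) in natural units, taking 0 ln 0 as zero."""
    return -sum(p * log(p) for p in probs if p > 0)


def self_information(p):
    """Self information of an event with probability p: -ln p."""
    return -log(p)


# Entropy is zero at the endpoints and largest for equal probabilities.
h_certain = entropy([1.0, 0.0])
h_binary_max = entropy([0.5, 0.5])            # ln 2
h_uniform4 = entropy([0.25] * 4)              # ln 4
h_skewed4 = entropy([0.7, 0.1, 0.1, 0.1])     # strictly below ln 4

# Conditional information, with hypothetical probabilities:
p_dividend = 0.5                  # p(u21), assumed
p_dividend_given_success = 0.8    # p(u21 | u11), assumed
mutual = log(p_dividend_given_success / p_dividend)     # Eq. (5.2.7)
conditional = self_information(p_dividend) - mutual     # Eq. (5.2.29)
print(h_binary_max, h_uniform4, h_skewed4, conditional)
```

With these numbers the self information of the dividend is ln 2, of which ln 1.6 is already supplied by knowing the well succeeded, leaving -ln 0.8 to be gained from the dividend announcement itself.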
The expected information for the second stage can be found from the above with
$$\begin{aligned}
E[I(U_2 \mid U_1)] = H(U_2 \mid U_1) = {} & -p(u_{21}, u_{11}) \ln p(u_{21} \mid u_{11}) - p(u_{21}, u_{12}) \ln p(u_{21} \mid u_{12})\\
& -p(u_{22}, u_{11}) \ln p(u_{22} \mid u_{11}) - p(u_{22}, u_{12}) \ln p(u_{22} \mid u_{12}),
\end{aligned} \tag{5.2.34}$$
where $H(U_2 \mid U_1)$ is the conditional entropy of the process at stage 2. Just as before we can generalize this notion of conditional entropy to the $m$ event process at stage $t$. We have
$$H(U_t \mid U_{t-1}) = -\sum_{i, j \in M} p(u_{ti}, u_{t-1,j}) \ln p(u_{ti} \mid u_{t-1,j}). \tag{5.2.35}$$
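A brief sketch of Eq. (5.2.35) for the two-stage example follows. The three joint tables are hypothetical: one with partial dependence, one with the deterministic dividend rule, and one with independent stages. Keys are (previous-stage event, current-stage event); the names are illustrative only.

```python
from math import log


def conditional_entropy(joint):
    """H(current | previous) = -sum p(prev, cur) ln p(cur | prev)."""
    previous = {}
    for (a, _b), p in joint.items():
        previous[a] = previous.get(a, 0.0) + p
    return -sum(p * log(p / previous[a]) for (a, _b), p in joint.items() if p > 0)


# Hypothetical joint tables for (well outcome, dividend decision).
dependent = {
    ("success", "dividend"): 0.4, ("success", "none"): 0.1,
    ("failure", "dividend"): 0.1, ("failure", "none"): 0.4,
}
deterministic = {("success", "dividend"): 0.5, ("failure", "none"): 0.5}
independent = {
    ("success", "dividend"): 0.25, ("success", "none"): 0.25,
    ("failure", "dividend"): 0.25, ("failure", "none"): 0.25,
}

h_dep = conditional_entropy(dependent)
h_det = conditional_entropy(deterministic)
h_ind = conditional_entropy(independent)
print(h_dep, h_det, h_ind)
```

In all three tables the unconditional entropy of the dividend decision is ln 2. Knowing the well outcome removes all of it under the deterministic rule, none of it under independence, and part of it in the intermediate case.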
To generalize the conditional information concept further will require a more compact notation. Let us indicate with the sequence, $z_1, z_2, \ldots, z_t$, the effect of the events which have occurred up to the $(t + 1)$th stage of an adaptive process. At each stage, one of a set $M$ containing $m$ mutually exclusive environmental events has occurred. For example, $z_\tau$ is the effect of the event which actually occurred during the $\tau$th stage. Denote $U$ as the ordered set of all the possible effects on the process due to each of the events. We would have for the $\tau$th stage
$$z_\tau \in U \qquad \text{for each } \tau = 1, 2, \ldots, t. \tag{5.2.36}$$
As before, we can define a sequence of probability functions, one for each stage, by
$$P(z_1) = \{\Pr(z_1 = u_i) : i \in M\}, \tag{5.2.37}$$
$$P(z_2 \mid z_1) = \{\Pr(z_2 = u_j \mid z_1 = u_i) : i, j \in M\}, \tag{5.2.38}$$
$$P(z_t \mid z_{t-1}, \ldots, z_2, z_1) = \{\Pr(z_t = u_k \mid z_{t-1} = u_j, \ldots, z_1 = u_i) : k, j, \ldots, i \in M\}. \tag{5.2.39}$$
These conditional probability functions can be used to form the joint probability functions for the events. For example, we have
$$P(z_1, z_2, \ldots, z_t) = P(z_1)\, P(z_2 \mid z_1) \cdots P(z_t \mid z_{t-1}, \ldots, z_1). \tag{5.2.40}$$
Marginal probability functions may be found by the appropriate summation of the joint probability functions. For example, we have
$$P(z_{t-1}) = \sum_{U_1 U_2 \cdots U_{t-2}} P(z_1, z_2, \ldots, z_{t-1}) \tag{5.2.41}$$
and
$$P(z_t, z_{t-1}) = \sum_{U_1 U_2 \cdots U_{t-2}} P(z_1, z_2, \ldots, z_t). \tag{5.2.42}$$
With these marginal probability functions we can determine the two stage conditional probability function. Suppose we want to know the
probability function of the events during the $t$th stage given that we know what events occurred during the previous, $(t - 1)$th, stage. We would have
$$P(z_t \mid z_{t-1}) = \frac{P(z_t, z_{t-1})}{P(z_{t-1})}. \tag{5.2.43}$$
Now using these probability functions we can form a sequence of entropy functions, one for each stage. We have for the expected information (entropy)
$$H(U_1) = -\sum_{U_1} P(z_1) \ln P(z_1), \tag{5.2.44}$$
$$H(U_2 \mid U_1) = -\sum_{U_1 U_2} P(z_1, z_2) \ln P(z_2 \mid z_1), \tag{5.2.45}$$
$$H(U_t \mid U_{t-1}, \ldots, U_1) = -\sum_{U_t U_{t-1} \cdots U_1} P(z_1, z_2, \ldots, z_t) \ln P(z_t \mid z_{t-1}, \ldots, z_1). \tag{5.2.46}$$
Equation (5.2.46) gives the expected information gained by knowledge of the tth event given that we know the outcome of the previous t - 1 events. In other words, it is the amount of information transmitted by the environment to the adaptive decision maker during stage t, if the decision maker knows what happened before. We shall adopt a shorthand notation for this conditional entropy given by (5.2.47)
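The two-stage case of the conditional entropy can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-event environment whose joint probabilities are supplied as a small table; the function name and the two example tables are illustrative and do not come from the text.

```python
import math

def conditional_entropy(joint):
    """Two-stage conditional entropy, in nats.

    Computes -sum_{i,j} P(z1=i, z2=j) * ln P(z2=j | z1=i), where
    joint[i][j] is the joint probability of effect i at stage 1
    followed by effect j at stage 2.
    """
    h = 0.0
    for row in joint:
        p_z1 = sum(row)                      # marginal P(z1 = i)
        for p_joint in row:
            if p_joint > 0.0:                # 0 * ln 0 is taken as 0
                h -= p_joint * math.log(p_joint / p_z1)
    return h

# Hypothetical joint distributions (assumed for illustration only).
independent = [[0.25, 0.25], [0.25, 0.25]]   # stage 2 ignores stage 1
persistent = [[0.45, 0.05], [0.05, 0.45]]    # stage 2 tends to repeat stage 1

h_independent = conditional_entropy(independent)
h_persistent = conditional_entropy(persistent)
print(h_independent, h_persistent)
```

When the second event depends strongly on the first, knowing the history leaves less to be learned, so the conditional entropy of the persistent table is smaller than that of the independent one.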
The conditional entropy, Eq. (5.2.47), plays an important role in the theory of adaptive processes. In Section 4.5 we noted that Bayes subjective probability functions were conditional probabilities. In other words, the subjective probability function is based on the decision maker's knowledge of the event outcomes of the previous stages of the decision process. Consequently, there is a subjective entropy concept which is based on subjective probability functions. Good⁹ in 1950 felt that entropy was in some way useful in estimation theory, and he pointed out that entropy could be defined on the basis of statistical estimators or, in our case, subjective probability functions. Remembering the notation of Chapter 4, we can write for the kth decision maker at stage t (5.2.48)
Thus we can define Good's entropy for the kth decision maker at stage t as

$$H_{kt} = -\sum_{U_t U_{t-1} \cdots U_1} P_k(z_t, z_{t-1}, \ldots, z_1) \ln P_k(z_t \mid z_{t-1}, \ldots, z_1). \tag{5.2.49}$$
Equation (5.2.49) is the decision maker's estimate of the entropy given by Eq. (5.2.46). In 1953 Bartlett¹⁰ and Good¹¹ modified the earlier concept. For reasons not of importance here, they defined a second form of subjective entropy, which is given for the kth decision maker at stage t by

$$\bar{H}_{kt} = -\sum_{U_t U_{t-1} \cdots U_1} P(z_t, z_{t-1}, \ldots, z_1) \ln P_k(z_t \mid z_{t-1}, \ldots, z_1). \tag{5.2.50}$$
Equation (5.2.50) plays a major role in the theory of adaptive processes because it represents the kth decision maker's estimate of entropy from the viewpoint of the rest of the system, that is, with respect to the actual probability function of the environment. For this reason, all decision makers' estimates of the entropy are comparable with each other and with the actual entropy of the system, which is given by Eq. (5.2.46). Subjective entropy functions of the second kind appear not only in adaptive processes but also, as pointed out by Good and Bartlett, in the theory of statistical estimation. In fact, Bartlett¹² has shown that Fisher's concept of information,¹³ which appears at first glance not to be related to entropy, is actually closely related. In this role the adaptive process and statistical estimation are entropy decreasing processes. This appears to violate the famed second law of thermodynamics, which calls for an increase in entropy in all closed systems. This paradox has been the subject of much interest and will be commented on in the next section.
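The difference between the two kinds of subjective entropy can be illustrated in the simplest setting. The sketch below assumes a single stage with no history, so every probability function is just a list of numbers; the actual probabilities and the sequence of improving estimates are hypothetical values chosen for illustration.

```python
import math

def entropy(p):
    """Entropy -sum p ln p of one probability function, in nats."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def entropy_second_kind(p_actual, p_subjective):
    """Subjective entropy of the second kind for a single stage:
    the log of the subjective probabilities, averaged with respect to
    the actual probabilities (requires p_subjective > 0 where p_actual > 0).
    """
    return -sum(p * math.log(q)
                for p, q in zip(p_actual, p_subjective) if p > 0.0)

# Hypothetical environment and a decision maker whose estimate improves.
actual = (0.7, 0.3)
estimates = [(0.5, 0.5), (0.6, 0.4), (0.7, 0.3)]

actual_entropy = entropy(actual)
first_kind = [entropy(q) for q in estimates]
second_kind = [entropy_second_kind(actual, q) for q in estimates]
print(actual_entropy, first_kind, second_kind)
```

In this example the second-kind values fall toward the actual entropy as the estimate approaches the actual probability function and never drop below it, which is the sense in which estimation decreases subjective entropy.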
5.3 The Entropy Paradox

Because of the consistency of the subjective probability function, it can be said in general that the adaptive process is one which proceeds to alter the subjective entropy of a decision maker so that in the limit in probability as entropy time passes, the subjective entropy equals the actual entropy of the system in which the decision maker finds himself. The subjective entropy can either fall or remain constant, depending on the initial subjective probability function. The interesting situation
is that the second law of thermodynamics states that in physical systems entropy always increases. Many philosophers contend that the direction of time is associated with the rise of entropy.¹⁴ If entropy can fall in social systems where adaptation is taking place, can people reverse time locally due to their organization activities?¹⁵ The answer comes in two parts. First, consider physical organizing activities of human and other lifelike systems. The Maxwell demon argument¹⁶ explains this possible decreasing entropy process. Such a continuous process can be shown to be impossible on the grounds of quantum physics.¹⁷ Such processes could exist for short periods, but ultimately the rise of entropy will continue again. Second, consider the subjective organization activities of human and other lifelike systems. These activities differ from physical organization in that subjective organization pertains to logical order of information, interpretation of statistical data, and many human organizational activities. It is the probability of decreasing entropy in this second type of organization that concerns us directly. A considerable effort on the part of philosopher-physicist Hans Reichenbach was devoted to this question. Concerned with the recording of information in his last book, he said:
In statistical interpretation, this additivity of information means that the larger the number of recorded items is, the lower is the probability that the items are a product of chance, and the higher the probability that they represent a record of the past. Strangely enough, this leads to the consequence that growing order is an indication of positive time, contrary to the entropy rule, according to which positive time tends to produce disorder. But this apparent contradiction is easily resolved. The order of the registered items does not represent a succession of states in an isolated system, but results from the space ensemble of individual interaction states, each of which produces a specific recorded item. And order in the space ensemble indicates a time direction perpendicular to the ensemble. We have here a time sequence, the order or information content of which grows with positive time ...¹⁸

Although it is not too clear, it appears that Reichenbach was referring to just what we have seen in the adaptive process, that the subjective entropy decrease is in a sense perpendicular to the flow of natural entropy. Physical time, or at least its direction, is determined by the natural entropy increase, not the probable subjective entropy decrease. The key words are "natural" and "subjective."
Although Reichenbach did not live to write the last chapter of "The Direction of Time," in which he planned to consider the relation between subjective experience of time and the natural time (in the entropy sense), it is felt that Reference 19 contains the essence of what he might have said. The key paragraph is:

One question remains to be discussed. What is the relation between the time of physics and the time of our experience? Why is the flow of psychological time identical with the direction of increasing entropy? The answer is simple: Man is part of nature, and his memory is a registering instrument subject to the laws of information theory. The increase of information defines the direction of subjective time. Yesterday's experiences are registered in our memory, those of tomorrow are not, and they cannot be registered before the tomorrow becomes today. The time of our experience is the time which manifests itself through a registering instrument. It is not a human prerogative to define a flow of time; every registering instrument does the same. What we call the time direction, the direction of becoming, is a relation between a registering instrument and its environment; and the statistical isotropy of the universe guarantees that this relation is the same for all such instruments, including human memory.¹⁹

The inclusion of recording instruments as subjective entropy decreasers is significant since it spoils any attempt by philosophers to ascribe the state of life to those things which decrease subjective entropy. It is not difficult to see that the whole concept of statistical inquiry is a subjective entropy-decreasing process, but what of the simple pen recorder? All real machines are subject to finite time lags. Thus, recording machines record the past, never the present or the future. In this sense they are averagers of the statistical uncertainties of the present events.
Being averagers, they are in effect statistical estimation machines and thus are subjective entropy decreasers. Since their trace is only a finite sample of the variations of natural events within the period of lag, it is a subjective estimate of these events only, and therefore the entropy decrease is subjective and perpendicular to the passage of time. Recording devices, then, do not reverse the direction of time any more than statistical inquiry or lifelike things reverse time. Adaptive systems are similar to the recording devices and other
organization systems; they are entropy decreasers only for subjective entropy. The entropy paradox is no more than the confusion of terms.

References

1. Eddington, Sir Arthur, "The Nature of the Physical World," p. 95. Michigan Univ. Press, Ann Arbor, Michigan, 1958.
2. Wiener, N., "Cybernetics," Chapter I. Wiley, New York, 1948.
3. Eddington, Sir Arthur, "The Nature of the Physical World," p. 95. Michigan Univ. Press, Ann Arbor, Michigan, 1958.
4. Bergson, H., "Time and Free Will," p. 107. Harper, New York, 1960.
5. Bergson, H., "Time and Free Will," p. 108. Harper, New York, 1960.
6. Shannon, C. and Weaver, W., "The Mathematical Theory of Communication." Illinois Univ. Press, Urbana, Illinois, 1962.
7. Abramson, N., "Information Theory and Coding." McGraw-Hill, New York, 1963.
8. Fano, R. M., "Transmission of Information." Wiley, New York, 1961.
9. Good, I. J., "Probability and the Weighing of Evidence," p. 75. Hafner, New York, 1950.
10. Bartlett, M. S., The statistical approach to the analysis of time-series, Trans. IRE PGIT-1, 81-101 (Feb. 1953).
11. Good, I. J., Discussion on Professor Bartlett's paper, Trans. IRE PGIT-1, 180-181 (Feb. 1953).
12. Bartlett, M. S., The statistical approach to the analysis of time-series, Trans. IRE PGIT-1, p. 88 (Feb. 1953).
13. Fisher, R. A., "Statistical Methods for Research Workers," pp. 325-329. Oliver and Boyd, Edinburgh and London, 1954.
14. Reichenbach, H., "The Direction of Time," p. 54. California Univ. Press, Berkeley, California, 1956.
15. Seifert, H. S., Can we decrease our entropy?, Am. Scientist 49, No. 2, 124A-134A (June 1961).
16. Szilard, L., Über die Entropieverminderung in einem Thermodynamischen System bei Eingriffen Intelligenter Wesen (The diminution of entropy in a thermodynamic system caused by the intervention of intelligent beings), Z. Physik 53, 840-856 (1929).
17. Wiener, N., "Cybernetics," p. 71. Wiley, New York, 1948.
18. Reichenbach, H., "The Direction of Time," p. 54. California Univ. Press, Berkeley, California, 1956.
19. Reichenbach, H., "Les Fondements logiques de la mécanique des quanta," Annales de l'Institut Henri Poincaré, Paris, 13, Part 2, 156 (1953).†

† Note: An English translation of a portion of this reference, including the quote herein, appears as an appendix in Ref. 14, p. 269.
CHAPTER 6
ADAPTIVE ECONOMIC PROCESSES

The element of time is a chief cause of those difficulties in economic investigations which make it necessary for man with his limited powers to go step by step; breaking up a complex question, studying one bit at a time, and at last combining his partial solutions into a more or less complete solution to the whole riddle.
Alfred Marshall, Principles of Economics

6.1 Introduction
The theory of economics is concerned with five major concepts. These are: (1) the production function, (2) the consumption function, (3) the price mechanism, (4) the objective function, and (5) the constraint function.

The production function is a transformation function which transforms inputs (capital, labor, and other resources) into outputs (goods and services). We are concerned with a particular type of production function, the use of capital to produce more capital (the investment process). We will implicitly assume that capital, through its use in providing for consumption at some future period, contributes to an individual's future welfare or utility.

A price structure is generally determined in the market place. In some societies, prices are determined by planning councils, but in both cases the price mechanism is an information transmission system which signals to each of the decision makers the actions of the other decision makers. In the investment process, prices are represented by the rates of return paid on each investment.

The objective function generally employed in economic theory is the maximization of individual expected utility. In the study of business
enterprise, expected utility is generally taken as synonymous with expected profit. In our study of adaptive investment processes we shall employ a type of utility function which leads decision makers to maximize the expected rate of growth of their resources.

A constraint function is necessary for the creation of an upper bound for the economic system. If an economic system were unbounded, unlimited wealth could be produced by the unlimited resources. Thus, the basic economic question is the allocation of limited resources so as to optimize the objective function. In the case of an investment process, each decision maker is constrained to allocate his total wealth, and no more, at each stage of the process.

It is important to clarify the role of these basic economic concepts in dynamic sequential growth processes. The rate of growth of resources in an economic system can be formulated as a dynamic programming process. In the following sections, this notion of an economic growth process will be introduced.
6.2 General Deterministic Dynamic Economic Process

Suppose we have a finite horizon economic system which is described at each stage by a vector $S_t$, composed of components $s_t \in S_t$. We are given some initial state vector at $t = 0$ denoted by $S_0$. The system moves in discrete time periods or stages, and in each stage a sequence of events and decisions occurs. We divide each stage into a sequence of subpoints designated as (a), (b), (c), etc. At subpoint (a), the state of the system inherited from the last time period is made known to the decision maker. At subpoint (b), the decision maker calculates the effect of each possible action vector, $a_t \in A_t$, on the future state vectors. At subpoint (c), the decision maker chooses from these decision vectors an optimum decision vector, $a_t^* \in A_t$, which maximizes the objective function of the process. At subpoint (d), the state given at subpoint (a), $S_{t-1}$, is transformed by the optimum decision vector into a new state vector $S_t$. At this subpoint, the events and decisions comprising the $t$th stage are complete. By the principle of optimality, we may write for the $T$th and last stage of the process (the horizon),

$$F_1(S_{T-1}) = \max_{a_T \in A_T} \{\Phi(S_T; A_T)\}, \tag{6.2.1}$$
where $F_1(S_{T-1})$ is the optimal value of a one-stage process, given state $S_{T-1}$, and $\Phi(S_T; A_T)$ is the objective function of the process. For the rest of the periods of the process,
(6.2.2)
where $F_{T-t}(S_t)$ is the optimal value of a $(T - t)$ stage process, given $S_t$.

The set $A_t$ is composed of those decision vectors which at each stage produce a maximum $\Phi(S_T; A_T)$ for the last stage. The set $A_t$ is denoted as the set of optimal decision vectors, or in Bellman's terms, the optimum policy.
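The principle of optimality used in this section can be sketched as a backward recursion. The example below is a minimal illustration, assuming the usual Bellman form in which the value with n stages to go is the maximum, over the possible actions, of the value with n - 1 stages to go from the resulting state. The scalar state, the three actions, the 10 percent return, and the logarithmic final objective are all hypothetical choices, not taken from the text.

```python
import math
from functools import lru_cache

T = 3                          # horizon (assumed)
ACTIONS = (0.0, 0.5, 1.0)      # hypothetical: fraction of the state put to work
RATE = 0.10                    # hypothetical deterministic return per stage

def transition(state, action):
    """New state produced by applying an action to the inherited state."""
    return round(state * (1.0 + action * RATE), 10)

def objective(final_state):
    """Objective evaluated at the horizon (log of the final state, assumed)."""
    return math.log(final_state)

@lru_cache(maxsize=None)
def optimal_value(state, stages_left):
    """Best attainable final objective with `stages_left` stages to go."""
    if stages_left == 0:
        return objective(state)
    return max(optimal_value(transition(state, a), stages_left - 1)
               for a in ACTIONS)

def optimal_policy(state, horizon):
    """Roll the recursion forward, recording the optimum action at each stage."""
    policy = []
    for stages_left in range(horizon, 0, -1):
        best = max(ACTIONS,
                   key=lambda a: optimal_value(transition(state, a), stages_left - 1))
        policy.append(best)
        state = transition(state, best)
    return policy, state

policy, final_state = optimal_policy(1.0, T)
print(policy, final_state)
```

Because nothing is uncertain here, the sequence of optimum actions can be computed once at the start and never revised, which is what distinguishes this case from the stochastic and adaptive processes that follow.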
6.3 The Stochastic Dynamic Economic Process

Now suppose the economic system is subjected to some underlying stochastic environmental process. The state vector, $S_{t+1}$, is no longer known a priori, because it is perturbed by the environmental random variable. In this case, we assume that it is possible to have a known probability function for the environmental random variables. The probability function of the state vector may be calculated from this known probability function of the environmental process. Within any stage $t$, at subpoint (a), the decision maker knows the a posteriori state of the economic system, $S_t$, inherited from the previous stage. At subpoint (b), the decision maker computes the effect on the future state vectors of the system by the set of possible decision vectors, $a_t \in A_t$, given that a particular stochastic environmental effect, $U_{it}$, has occurred. This is done for each $i \in M$. The mathematical expectation of the value of these effects is maximized at subpoint (c) by the decision maker's application to the system of an optimum action vector, $a_t^* \in A_t$. At subpoint (d), the $t$th event caused by the environmental stochastic process actually occurs and together with the optimum decision vector transforms the old a posteriori state vector, $S_t$, into a new state vector, $S_{t+1}$. We have then, as a result of the principle of optimality for the $T$th and last stage,

$$F_1(S_{T-1}) = \max_{a_T \in A_T} \left\{ \sum_{i \in M} P_i\, \Phi(S_T; A_T \mid U_{iT}) \right\}, \tag{6.3.1}$$
and, for all other stages,
(6.3.2)
We define

$i$ as the stochastic environmental event index number,
$M$ as the set of possible environmental event index numbers,
$P_i$ as the probability of the $i$th event,
$F_{T-t}(S_t; A_t \mid U_{it})$ as the maximum expected value of an optimal process for the remaining stages, given that the $i$th environmental event will occur in the $t$th stage, and
$\Phi(S_T; A_T \mid U_{iT})$ as the function to be maximized in the $T$th and last stage, given that the $i$th event will occur in that stage.

The set $A_t$ is the set of decision vectors at each stage which produces a maximum $\Phi(S_T; A_T)$ at the final stage, and is denoted as the set of optimum decision vectors.
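The stochastic recursion differs from the deterministic one only in that each candidate action is scored by a probability-weighted average over the environmental events. The sketch below assumes a hypothetical two-event environment with known probabilities, a scalar state, three possible actions, and a logarithmic final objective; none of these values come from the text.

```python
import math
from functools import lru_cache

# Hypothetical environment: (probability P_i, rate of return of event i).
EVENTS = ((0.6, 1.0), (0.4, -1.0))
ACTIONS = (0.0, 0.2, 0.5)      # hypothetical fractions of the state at risk

def transition(state, action, rate):
    """State after the chosen action and the realized event act together."""
    return state * (1.0 + action * rate)

def expected_after(state, action, stages_left):
    """Expectation over events of the value with one fewer stage to go."""
    return sum(p * expected_value(transition(state, action, r), stages_left - 1)
               for p, r in EVENTS)

@lru_cache(maxsize=None)
def expected_value(state, stages_left):
    """Maximum expected log of the final state with `stages_left` stages to go."""
    if stages_left == 0:
        return math.log(state)
    return max(expected_after(state, a, stages_left) for a in ACTIONS)

def best_action(state, stages_left):
    """Action that maximizes the expectation at the current stage."""
    return max(ACTIONS, key=lambda a: expected_after(state, a, stages_left))

print(best_action(1.0, 3), expected_value(1.0, 3))
```

With a logarithmic objective the chosen action in this example does not depend on the current state or on the number of stages remaining, a property that reappears in the growth process of Section 6.5.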
6.4 The Adaptive Stochastic Dynamic Economic Process

Again the economic system contains some environmental stochastic process. The a posteriori state vector, $S_t$, can be described only in probability; however, the parameters of the probability function of the environmental process are no longer known to the decision maker. A sequential sample (the history) of the $(t - 1)$ past stochastic environmental events is known to the decision maker at the beginning of the $t$th stage, from which a subjective estimate of the probability function of the environmental stochastic process could be derived. At subpoint (a), the decision maker obtains a posteriori knowledge of the state vector inherited from the last stage, and knowledge of the a posteriori subjective probabilities from the last stage. At subpoint (b), with this information, he computes the effect of each possible decision vector, $a_t \in A_t$, on the value of the final state vector, given that a particular environmental stochastic effect, $U_{it}$, has actually occurred. This is done for every $i \in M$ and $a_t \in A_t$. The mathematical expectation of the value of the final stage state vector is maximized at subpoint (c) by the decision maker's application of the particular decision vector which gives this maximum value, $a_t^* \in A_t$. At subpoint (d), the $t$th event resulting from the environmental process actually occurs. The a priori subjective probabilities are transformed by the occurrence
of this event into a posteriori probabilities. Likewise, the a priori state vector is transformed by both the event and the optimum decision vector into the a posteriori state vector for the next stage. It is a fundamental characteristic of an adaptive process that the performance of an adaptive decision maker may deviate from the performance expected from the same decision maker, but having full information about the probability function of the environmental process. This deviation results from the need for two sets of sequential equations to describe the process. The first set views the process through the eyes of the adaptive decision maker. The second set views the process through the eyes of the people who observe only the actual behavior of the variables following a decision, i.e., the market. Generally, these two kinds of equations are not the same. The first set of equations are expectations taken with respect to the decision maker's subjective probabilities, and they determine the decision maker's choice. The second set of equations are expectations taken with respect to the actual probability function of the environmental process, and they determine the expectation of the actual result of the decision maker's choice in the market. We have, by the use of the principle of optimality, from the decision maker's point of view for the last stage (6.4.1)
and for the other stages

$$f_{T-t+1}(S_{t-1}; P_t) = \max_{a_t \in A_t} \sum_{i \in M} P_{it}\, f_{T-t}(S_t; A_t \mid U_{it}). \tag{6.4.2}$$

We define

$P_t$ as the a posteriori subjective probability function inherited from the last stage (that is, the estimator of $P$), and
$P_{it}$ as an element of $P_t$, the subjective probability of the $i$th event at stage $t$.
Furthermore, we have (6.4.3)
where $i$ is the actual event which occurred in the $t$th stage, and $T(\cdot \mid \cdot)$ is the transition function; i.e., the function which transforms an a
priori subjective probability function into an a posteriori subjective probability function. From the market's point of view, we have for the last stage

$$F_1(S_{T-1}; P) = \sum_{i \in M} P_i\, \Phi(S_T; A_T^* \mid U_{iT}), \tag{6.4.4}$$

and for the other stages

$$F_{T-t+1}(S_{t-1}; P) = \sum_{i \in M} P_i\, F_{T-t}(S_t; A_t^* \mid U_{it}). \tag{6.4.5}$$
Here we define $P$ as the set of actual probabilities for the events, $i \in M$; i.e., the actual probability function of the environment, $P_i \in P$, and $a_t^* \in A_t^*$ as the optimal decision vector during the $t$th stage. Note that the equations from the decision maker's point of view must be solved to obtain the optimum decision vector so that the equations from the market's point of view may be evaluated.
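The two points of view can be made concrete for a single stage. In the sketch below the decision maker chooses an action by maximizing an expectation taken under his subjective probabilities, and the market then evaluates that same action under the actual probabilities. The event rates, the action set, and in particular the counting rule used as the transition function are assumptions made for illustration; the text does not specify them here.

```python
import math

RATES = (1.0, -1.0)            # hypothetical rates of return for the two events
ACTUAL_P = (0.6, 0.4)          # actual probabilities, unknown to the decision maker
ACTIONS = (0.0, 0.2, 0.5)      # hypothetical fractions of capital at risk

def expected_log_growth(action, probs):
    """Expected one-stage log growth of an action under a probability function."""
    return sum(p * math.log(1.0 + action * r) for p, r in zip(probs, RATES))

def choose_action(subjective_p):
    """Decision maker's view: maximize under the subjective probabilities."""
    return max(ACTIONS, key=lambda a: expected_log_growth(a, subjective_p))

def market_value(action):
    """Market's view: the same action evaluated under the actual probabilities."""
    return expected_log_growth(action, ACTUAL_P)

def transition_function(counts, event):
    """Assumed transition function: add the observed event to a tally."""
    updated = list(counts)
    updated[event] += 1
    return tuple(updated)

def subjective_probabilities(counts):
    """Subjective probability function implied by the tally."""
    total = sum(counts)
    return tuple(c / total for c in counts)

counts = (1, 1)                               # a priori tally (assumed)
first_choice = choose_action(subjective_probabilities(counts))
for event in (0, 0, 1, 0, 0):                 # hypothetical short history
    counts = transition_function(counts, event)
later_choice = choose_action(subjective_probabilities(counts))
informed_choice = choose_action(ACTUAL_P)
print(first_choice, later_choice, informed_choice)
print(market_value(later_choice), market_value(informed_choice))
```

After this short history the decision maker overestimates the favorable event and commits more than a fully informed decision maker would, so the market's expectation of his result falls short of the full-information value. This is the deviation in performance described above.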
6.5 Stochastic Growth Process

We have seen how the general adaptive process can be presented as a sequential set of equations, a dynamic program, when certain properties of the decision process are exploited. From the general form many actual sequential decision processes may be numerically solved in this way; we will restrict our attention to a special class of both stochastic and adaptive dynamic programs which arise in analyzing stochastic growth processes. This special class possesses analytic solutions which are of interest in the understanding of growth phenomena. Suppose that we let the variable, $r$, be defined on the set of random events, $i \in M$. The variable $r$ contributes to the growth of some state variable, $s$, at each stage $t$. We define a probability function of $r$ (the environmental stochastic process) by the following:

$$\Pr(r = r_i) = P_i, \qquad \sum_{i \in M} P_i = 1, \qquad \text{for all } i \in M. \tag{6.5.1}$$

We will denote the vector formed by the $\{r_i, i \in M\}$ as $R$. Furthermore, we define the variable, $r$, such that

$$s_t = s_0 \prod_{i \in M} (1 + r_i)^{n_i}, \tag{6.5.2}$$
where $s_0$ is the initial value of the state variable (initial condition), $n_i$ is the number of times $r_i$ occurs in $t$ stages (entropy time periods), and $\sum_{i \in M} n_i = t$, $n_i \ge 0$ for each $i \in M$. It is appropriate when studying random variables to consider the expectation of the variables. Taking the log of Eq. (6.5.2), we have

$$\ln s_t = \sum_{i \in M} [n_i \ln(1 + r_i)] + \ln s_0.$$

Taking the expectation, we find†

$$E(\ln s_t) = t \sum_{i \in M} [P_i \ln(1 + r_i)] + \ln s_0.$$

We define

$$g_t(s_0) = \frac{1}{t} [E(\ln s_t) - \ln s_0], \tag{6.5.3}$$

where $g_t(s_0)$ is the "expected growth rate" of the state variable $s$ after $t$ stages. We can write

$$g_t(s_0) = \sum_{i \in M} P_i \ln(1 + r_i). \tag{6.5.4}$$
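The expectation of the log of the state variable can be verified by brute force for a short process. The sketch below enumerates every possible sequence of events for a hypothetical three-event environment and compares the exact expectation with t times the expected growth rate plus the log of the initial state. The probabilities, growth increments, and initial value are illustrative assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math
from itertools import product

P = (0.5, 0.3, 0.2)            # hypothetical event probabilities
R = (0.10, 0.02, -0.05)        # hypothetical growth increments r_i
S0 = 100.0                     # hypothetical initial state

def expected_growth_rate(p, r):
    """Sum over events of P_i * ln(1 + r_i)."""
    return sum(pi * math.log(1.0 + ri) for pi, ri in zip(p, r))

def expected_log_state(p, r, s0, t):
    """Exact E(ln s_t), found by enumerating all event sequences of length t."""
    total = 0.0
    for sequence in product(range(len(p)), repeat=t):
        probability = math.prod(p[i] for i in sequence)
        s_t = s0 * math.prod(1.0 + r[i] for i in sequence)
        total += probability * math.log(s_t)
    return total

t = 4
g = expected_growth_rate(P, R)
exact = expected_log_state(P, R, S0, t)
print(exact, t * g + math.log(S0))
```

The two printed numbers agree to within rounding error, and the growth rate itself does not depend on the initial state or on the number of stages.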
We now extend the above to the controlled stochastic growth process, or decision growth process. In the process considered above, the expected rate of growth, $g_t$, is a function of parameters $s_0$, $R$, and $P$, none of which are considered to be "controllable" variables from the decision maker's point of view. Suppose the decision maker can control the variable, $R$, by a decision vector at stage $t$, $A_t$, and thus control the expected rate of growth of the process at stage $t$. The decision maker desires to optimize an objective function by determining the magnitude of this decision vector for each stage. We shall group these decision processes into two classes. In the first class, the decision maker chooses, without restriction, a decision vector which optimizes the objective function. In the other class we shall place the economic decision processes. For example, suppose that the decision maker is charged a sum of money equal to the sum of the elements of the decision vector he chooses. Suppose also that this decision maker has a limited amount of money to devote to the optimality effort, and he must not borrow additional money. We see that this decision maker's problem is considerably more difficult than in the first class of decision processes. He must determine an optimum allocation of his limited funds (the restricted elements of the decision vector) to optimize the objective function. We are concerned with economic decision processes.

† Remember that since the environmental process is multinomial, we have $E(n_i) = tP_i$ for each $i \in M$.

If we base the stages in our stochastic growth model on entropy time, we may write the equations for a one-period economic stochastic growth process. At the last stage, $T$ (the horizon), and given $s_{T-1}$, we have

$$E(\ln s_T) = \ln s_{T-1} + (1) \sum_{i \in M} P_i \ln(1 + a_{iT} r_i), \tag{6.5.5}$$
where (1) indicates that this is a one-stage process; $a_{iT}$ is the decision vector element for the $i$th event at stage $T$, with $a_{iT} \ge 0$ for each $i \in M$ and $\sum_{i \in M} a_{iT} = W$; and $W$ is the decision maker's total supply of funds. An optimum process would be, by definition, one which results in either a maximum or minimum expected rate of growth; i.e.,

$$F_1(s_{T-1}) = \ln s_{T-1} + \sum_{i \in M} P_i \ln(1 + a_{iT}^* r_i) \tag{6.5.6}$$
where $a_{iT}^*$ is the optimum value of $a_{iT}$, and $F_1(s_{T-1})$ is the maximum expected value of the log of the growth state variable for a one-stage process starting with $s_{T-1}$ at stage $T - 1$. By the principle of optimality, for a process at stage $t$ with $T - t + 1$ stages to go, we have
(6.5.7)
where we define $F_{T-t}(s_t \mid r_i a_{it})$ as the value of the log of the state variable of a $T - t$ stage optimal process starting with $s_t$, given that the $i$th increment of growth has actually occurred. It will be shown in Chapter 8 that such a dynamic program has a solution which is given by the form

$$F_1(s_{T-1}) = \ln s_{T-1} + E(\ln R) + \ln\left(W + \sum_{i \in M} \frac{1}{r_i}\right) - H, \tag{6.5.8}$$

$$F_{T-t+1}(s_{t-1}) = \ln s_{t-1} + (T - t)\left[E(\ln R) + \ln\left(W + \sum_{i \in M} \frac{1}{r_i}\right) - H\right] \quad \text{for } T - t + 1 > 1, \tag{6.5.9}$$
where

$$H = -\sum_{i \in M} P_i \ln P_i \quad \text{and} \quad E(\ln R) = \sum_{i \in M} P_i \ln r_i.$$

The parameter $H$, we remember, is called entropy and is a measure of the uncertainty in the system. It will also be shown in Chapter 8 that the set of optimal decision vectors (the optimal policy) is composed of subsets of elements given by

$$a_{it}^* = P_i \left(W + \sum_{i \in M} \frac{1}{r_i}\right) - \frac{1}{r_i}, \quad \text{for } i \in M \text{ and the } t\text{th stage.} \tag{6.5.10}$$
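The closed-form allocation and the corresponding one-stage growth can be checked numerically. The sketch below uses hypothetical probabilities, rates, and funds, chosen so that every element of the allocation comes out positive; the closed form is only claimed here for that case, since the non-negativity constraint is not enforced by the formula itself.

```python
import math

def optimal_allocation(p, r, w):
    """Allocation a_i = P_i * (W + sum_j 1/r_j) - 1/r_i for each event i."""
    k = w + sum(1.0 / rj for rj in r)
    return [pi * k - 1.0 / ri for pi, ri in zip(p, r)]

def expected_log_growth(a, p, r):
    """Sum over events of P_i * ln(1 + a_i * r_i)."""
    return sum(pi * math.log(1.0 + ai * ri) for ai, pi, ri in zip(a, p, r))

# Hypothetical values (assumed so that every a_i is positive).
P = (0.5, 0.3, 0.2)
R = (2.0, 3.0, 5.0)
W = 1.0

a_star = optimal_allocation(P, R, W)

entropy = -sum(pi * math.log(pi) for pi in P)
expected_ln_r = sum(pi * math.log(ri) for pi, ri in zip(P, R))
closed_form = expected_ln_r + math.log(W + sum(1.0 / ri for ri in R)) - entropy

# Shift a small amount of funds between two events; the total is unchanged.
perturbed = [a_star[0] + 0.01, a_star[1] - 0.01, a_star[2]]

print(a_star, sum(a_star))
print(expected_log_growth(a_star, P, R), closed_form)
print(expected_log_growth(perturbed, P, R))
```

The allocation exhausts the funds exactly, its expected log growth matches the constant-less-entropy expression, and moving funds away from it lowers the expected growth.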
It is important to note that the set of optimal decision vectors is composed of the same elements for all t = 1,2, ..., T. Such an optimal decision vector is said to be time invariant. Taking Eq. (6.5.9) and setting t = 0, we have
and we know from the definition of F that (6.5.12)
Therefore,

$$g_T^*(s_0) = E(\ln R) + \ln\left(W + \sum_{i \in M} \frac{1}{r_i}\right) - H, \tag{6.5.13}$$

where $g_T^*(s_0)$ is the maximum expected rate of growth of $s$ at stage $T$. The maximum expected rate of growth of an optimal decision stochastic growth process, given $T$ stages and the initial condition, $s_0$, is for any given system a constant less a measure of the uncertainty of that system. The above can be extended to cover the adaptive stochastic growth process; however, we shall do this for a specific process in the next chapter, and thus also extend the theory to a subject of considerable interest: the adaptive stochastic investment process.
CHAPTER 7
AN ADAPTIVE INVESTMENT MODEL

In all the different employments of stock, the ordinary rate of profit varies more or less with certainty or uncertainty of the returns. ... The ordinary rate of profit always rises more or less with risk.
Adam Smith, Wealth of Nations

7.1 The Objective Function

The rate of growth of a resource, say capital, in $t$ stages can be given by

$$g_t = \frac{1}{t} \ln \frac{K_t}{K_0}, \tag{7.1.1}$$
where $g_t$ is the rate of growth measured at stage $t$, $K_0$ is the initial amount of the decision maker's capital, and $K_t$ is the amount of capital at some $t$th stage of the process. We are concerned with the expected rate of growth at each stage of a stochastic growth process. Thus, we want

$$\bar{g}_t = \frac{1}{t} E\left[\ln \frac{K_t}{K_0}\right] = \frac{1}{t} \{E[\ln K_t] - \ln K_0\}, \tag{7.1.2}$$

where $\bar{g}_t$ is the expected rate of growth of capital at stage $t$, $K_0$ is the initial amount of capital and is a fixed variate, and $K_t$ is the capital at stage $t$ and is a random variable. In the investment models which follow, we shall maximize the expected rate of growth of a decision maker's capital subject to the constraint that the decision maker has some given finite amount of capital at each stage. This objective is related to the assumption of a Bernoulli objective function.¹,² The Bernoulli objective is to maximize the expected logarithm of the capital at some stage $t$. In our case, it is given by

$$\max E[\ln K_t] = t[\max \bar{g}_t] + \ln K_0.$$

It is clear from the above equation that maximizing the expected logarithm of capital is the same as maximizing the expected rate of growth of capital.
Although this objective function has much appeal for most investors, it does have some faults. First of all, the function is not finite. When $K_t$ is infinite, then so is the objective; when $K_t$ is zero, the objective is negative and infinite. If the value of the objective is considered to be utility, this means that the loss of a decision maker's capital is the worst possible event which could happen to him. For most decision makers this is just not true. If this objective function is to be used, it must be on the understanding that $K_t$ is neither very small nor very large. Furthermore, and more seriously, this objective function is not a function of risk. It is unreasonable to believe that two investments with the same expected rate of growth, but with different risks, would have equal appeal to an investor. The use of the variance of a stochastic process as a measure of risk has been made by many writers.s-'' Generally, the objective function is made to include expectation and variance terms in these studies. Risk will not be explicitly included in the objective function used in our study of adaptive processes. There is reason to believe that the competitive market mechanism will adjust the payoffs for each investment opportunity so that the average decision maker will receive an adequate payment for the risk he incurs. Objective functions based only on the expectations of capital growth would not be too unrealistic for these average decision makers.
7.2 An Example-The Three Investors Again

To illustrate the effects of the objective function on the behavior of the decision makers, we shall return again to the three investors introduced in Chapter 4. Let us suppose that each investor wishes to control the amount of his funds invested in Slantwell stock to maximize his expected rate of capital growth. For simplicity in this example we shall make the investment illiquid. In other words, when each investor decides what fraction of his funds to place in stock, he must hold that amount for the entire $T$ stages of the process. With this in mind, we shall define a decision variable, $a_k$, for each investor which will be denoted as the $k$th investor's portfolio ratio. We have $k$ variables defined by

$$A_k = \{a_k \text{ such that } 0 \le a_k \le 1\}, \tag{7.2.1}$$

where $a_k$ is the $k$th investor's portfolio ratio.
The portfolio ratio is the fraction of the investor's capital invested in stock. The rest of the investor's capital is assumed to be held in cash. At any stage t the amount of the kth investor's capital invested in stock is $a_k K_{kt}$, where $K_{kt}$ is the total capital owned by the kth investor at stage t. Suppose that during each stage of the process the price of Slantwell rises by $r_1$ points or falls by $r_2$ points (the entropy time assumption excludes the event of "no change"; see Section 4.3). For the two possible event outcomes in the tth stage, the kth investor's capital at the beginning of the (t + 1)th stage is
$$K_{k,t+1} = \begin{cases} a_k r_1 K_{kt} + a_k K_{kt} + (1 - a_k) K_{kt} & \text{if the price rises,} \\ -a_k r_2 K_{kt} + a_k K_{kt} + (1 - a_k) K_{kt} & \text{if the price falls.} \end{cases} \tag{7.2.2}$$
In this equation, $a_k r_1 K_{kt}$ is the capital gain from the investment if the price rises, $a_k r_2 K_{kt}$ is the capital loss from the investment if the price falls, $a_k K_{kt}$ is the capital invested, and $(1 - a_k) K_{kt}$ is the capital remaining in cash. Combining these terms for the kth investor, we have at the (t + 1)th stage, given that the price rose by $r_1$ points,
$$K_{k,t+1} = K_{kt}(1 + a_k r_1). \tag{7.2.3}$$
By the same token, if the price fell by $r_2$ points,
$$K_{k,t+1} = K_{kt}(1 - a_k r_2). \tag{7.2.4}$$
Suppose we examine the capital held by the kth investor after t stages, assuming that the initial capital is $K_{k0}$ and that $n_1 + n_2 = t$, where $n_1$ is the number of times the stock price rose by $r_1$ points and $n_2$ is the number of times the price fell by $r_2$ points. After t stages the kth investor would have
$$K_{kt} = K_{k0}(1 + a_k r_1)^{n_1}(1 - a_k r_2)^{n_2}. \tag{7.2.5}$$
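Equation (7.2.5) can be rewritten in terms of the rate of growth of capital as
$$g_{kt} = \frac{1}{t}\ln\frac{K_{kt}}{K_{k0}} = \frac{n_1}{t}\ln(1 + a_k r_1) + \frac{n_2}{t}\ln(1 - a_k r_2). \tag{7.2.6}$$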
This equation looks interesting, but remember that we do not know in advance what the numbers $n_1$ and $n_2$ are. Equation (7.2.6) can only tell us what has happened to the capital after t stages; it cannot tell us how to invest capital in the future. The situation is even worse when we realize that the investors do not even know the probabilities of a rise or fall of the price. Suppose our three investors could subscribe to an investment service with perfect foresight, and suppose that this service actually knows the probabilities of a rise or fall of Slantwell's stock. For a fee, this knowledge is passed on to the three investors. The investors now know that
$$\Pr(\text{price will rise by } r_1 \text{ points}) = p_1, \qquad \Pr(\text{price will fall by } r_2 \text{ points}) = p_2, \tag{7.2.7}$$
where $p_1 + p_2 = 1$. These probabilities can be used to determine the likelihood of a change in price for any stage. Since perfect foresight advises that these probabilities are constant throughout the investment period, the three investors would be correct in concluding that the events, the payoffs, are being generated by a binomial environmental process. The probability of some sequence of event outcomes, given this hypothesis, would be
$$\Pr(n_1 \mid p_1, p_2, t) = \frac{t!}{n_1!\,(t - n_1)!}\, p_1^{n_1} p_2^{t - n_1}, \tag{7.2.8}$$
where we note that $\Pr(n_2 \mid p_1, p_2, t) = 1 - \Pr(n_1 \mid p_1, p_2, t)$. Given this probability function for the frequency of the events, we can find the expected payoffs for the next t stages. These expectations are*
$$E[n_1 \mid p_1, p_2, t] = p_1 t \quad \text{and} \quad E[n_2 \mid p_1, p_2, t] = p_2 t. \tag{7.2.9}$$
* We have, for example,
$$E[n_1 \mid p_1, p_2, t] = \sum_{n_1 = 0}^{t} n_1 \frac{t!}{n_1!\,(t - n_1)!}\, p_1^{n_1} p_2^{t - n_1},$$
Substitution of Eq. (7.2.9) into Eq. (7.2.6) enables the kth investor to find the expected future rate of capital growth for the next t investment stages, given some portfolio ratio, $a_k$, and some initial capital, $K_{k0}$. This is given by
$$\bar{g}_{kt} = p_1 \ln(1 + a_k r_1) + p_2 \ln(1 - a_k r_2), \tag{7.2.10}$$
where $\bar{g}_{kt}$ is the expected rate of capital growth for the kth investor. Assuming that all the investors have subscribed to the perfect foresight service, Eq. (7.2.10) can be used as a transition equation where, in terms of the notation used in Chapter 2, we have
$S_t = K_{kt}$ for each k,
$U_t = (r_1, r_2)$, a random variable, and
$D_t = a_k$ for each t and each k.
Thus, we can devise a sequential decision process which will determine $D_t$, i.e., $a_k$, so as to optimize the objective; i.e., to maximize the expected rate of capital growth for the kth investor. We want to find
$$\max \bar{g}_{kt}$$
with respect to $a_k$, subject to $a_k \in A_k$. This problem is a nonlinear programming problem.† In this case, a simple approach would be to maximize $\bar{g}_{kt}$ with respect to $a_k$, and place restrictions on $(r_1, r_2)$ such that
$$0 < a_k < 1. \tag{7.2.11}$$
which can be rewritten as
$$E[n_1 \mid p_1, p_2, t] = p_1 t \sum_{n_1 = 1}^{t} \frac{(t - 1)!}{(n_1 - 1)!\,(t - n_1)!}\, p_1^{n_1 - 1} p_2^{t - n_1}.$$
Since the sum is a probability function taken over its whole space, it is unity and we have $E[n_1 \mid p_1, p_2, t] = p_1 t$.
† This is because we have constrained $a_k$ to lie in the positive unit space and because the objective function is logarithmic. If the objective function were linear, linear programming methods could be used to obtain an optimal $a_k$. On the other hand, if $a_k$ were unconstrained, the calculus of maxima and minima could be employed.
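Now if $(r_1, r_2)$ took on values beyond this restricted range, we would have what are called corner solutions for $a_k$; i.e., $a_k = 0$ or $a_k = 1$. The optimum unconstrained $a_k$ is found by setting the derivative of the objective function to zero and verifying that the second derivative is negative. We have for the first derivative
$$\left.\frac{d\bar{g}_{kt}}{da_k}\right|_{a_k = a_k^*} = \frac{p_1 r_1}{1 + a_k^* r_1} - \frac{p_2 r_2}{1 - a_k^* r_2} = 0, \tag{7.2.12}$$
and for the second derivative,
$$\frac{d^2\bar{g}_{kt}}{da_k^2} = -\frac{p_1 r_1^2}{(1 + a_k r_1)^2} - \frac{p_2 r_2^2}{(1 - a_k r_2)^2}. \tag{7.2.13}$$
Solving Eq. (7.2.12) for $a_k^*$, we get
$$a_k^* = \frac{p_1 r_1 - p_2 r_2}{r_1 r_2}. \tag{7.2.14}$$
Now if Eq. (7.2.14) is substituted into Eq. (7.2.13), we have
$$\left.\frac{d^2\bar{g}_{kt}}{da_k^2}\right|_{a_k = a_k^*} = \left(\frac{r_1 r_2}{r_1 + r_2}\right)^2 \left(-\frac{1}{p_1} - \frac{1}{p_2}\right) < 0. \tag{7.2.15}$$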
This condition always holds, since the first factor is always positive and, because of the probability definitions, the last factor is always negative. Therefore, if $a_k$ has an optimum value it will lead to a maximum $\bar{g}_{kt}$. Consider the constraint on $a_k$ given by Eq. (7.2.11). This means that
$$0 < \frac{p_1 r_1 - p_2 r_2}{r_1 r_2} < 1.$$
Thus we have a condition on $(r_1, r_2)$, given $(p_1, p_2)$, that
$$0 < p_1 r_1 - p_2 r_2 < r_1 r_2. \tag{7.2.16}$$
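The closed form of Eq. (7.2.14) and the corner solutions can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the numerical values of $p_1$, $r_1$, and $r_2$ are illustrative assumptions, and the payoffs are treated as fractional returns so that $1 - a_k r_2$ stays positive.

```python
import math

def expected_growth(a, p1, r1, r2):
    """Expected rate of capital growth, Eq. (7.2.10)."""
    p2 = 1.0 - p1
    return p1 * math.log(1.0 + a * r1) + p2 * math.log(1.0 - a * r2)

def optimal_ratio(p1, r1, r2):
    """Interior optimum of Eq. (7.2.14), clipped to the corner solutions 0 and 1."""
    p2 = 1.0 - p1
    a_star = (p1 * r1 - p2 * r2) / (r1 * r2)
    return min(1.0, max(0.0, a_star))

# Illustrative (assumed) parameters, not taken from the text.
p1, r1, r2 = 0.6, 0.5, 0.4

a_closed_form = optimal_ratio(p1, r1, r2)

# Brute-force check: evaluate Eq. (7.2.10) on a grid of portfolio ratios.
grid = [i / 1000 for i in range(1001)]
a_grid = max(grid, key=lambda a: expected_growth(a, p1, r1, r2))

print(a_closed_form, a_grid)
```

When inequality (7.2.16) fails, the same helper returns a corner: a ratio of zero if the expected payoff is not positive, and a ratio of one if the unconstrained optimum exceeds one.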
From this inequality we note that the expectation of the payoff must be positive if the portfolio ratio is to be nonzero. In other words, one could hardly expect the investor to invest his capital if the odds were unfavorable for a profit. The other condition, the right-hand inequality, is not so clear in its economic interpretation. Since we will assume that
$$0 < r_2 < 1,$$
the right-hand inequality tells us that no matter how close to one the probability, $p_1$, of a positive payoff, $r_1$, becomes, there is always the possibility that the portfolio ratio can be one. Take the limiting case, for example: $r_2 = 1$ and $p_1 = 1$ (and of course $p_2 = 0$). Then at this point $a_k = 1$ also. The right-hand inequality tells us that there is always a $p_1$ large enough so that $a_k$ can be made equal to 1. In the limiting case, where $r_2$ equals 1, the value of $p_1$ must be equal to 1 also. This can be interpreted economically. If $r_2$ equals 1, the decision maker can lose all his capital if event 2 ever occurs. The limiting case means, then, that an investor would fully invest in such a security only if the probability of a positive payoff were 1 (i.e., if the probability of a total loss of capital were zero). Equation (7.2.14) can be used to evaluate the maximum expected rate of growth of capital. Substitution of Eq. (7.2.14) into Eq. (7.2.10) gives us
$$\bar{g}^*_{kt} = \left[p_1 \ln p_1 + p_2 \ln p_2\right] + \left[p_1 \ln\frac{r_1 + r_2}{r_2} + p_2 \ln\frac{r_1 + r_2}{r_1}\right], \tag{7.2.17}$$
where $\bar{g}^*_{kt}$ is the maximum expected rate of capital growth for the kth investor after t stages. Note that the first term in Eq. (7.2.17) is equal to the negative entropy of the system. We can rewrite Eq. (7.2.17) as
$$\bar{g}^*_{kt} = -H + \ln(r_1 + r_2) - p_1 \ln(r_2) - p_2 \ln(r_1). \tag{7.2.18}$$
By adding and subtracting $\ln 2$ in Eq. (7.2.18), we have
$$\bar{g}^*_{kt} = \ln 2 - H + \ln\left(\frac{r_1 + r_2}{2}\right) - p_1 \ln(r_2) - p_2 \ln(r_1). \tag{7.2.19}$$
We see that the maximum expected rate of growth of capital is, in part, the difference between $\ln 2$ and H. Note that if uncertainty were a maximum, then $p_1 = p_2 = \tfrac{1}{2}$ and $H = H^* = \ln 2$. We can rewrite Eq. (7.2.19) as
$$\bar{g}^*_{kt} = H^* - H + \ln\left(\frac{r_1 + r_2}{2}\right) - p_1 \ln(r_2) - p_2 \ln(r_1), \tag{7.2.20}$$
where $H^*$ is the maximum entropy of the system. From the economic viewpoint, H becomes a risk discount, causing the rate of growth of capital to decrease as uncertainty increases and to increase as uncertainty decreases. The role of uncertainty as a check on the rate of growth (accumulation of capital) has been postulated by many economists, the most interesting being the paper of Kaldor, who postulated that uncertainty is the only effective limit to the rate of growth of firms; that is, of their capital assets. The last three terms of Eq. (7.2.20) require some justification. Suppose we make the substitutions:
$$r_1 = a, \tag{7.2.21}$$
$$r_2 = a, \tag{7.2.22}$$
which alter Eq. (7.2.20) so that the r's are symmetrical about zero. Thus, we get for Eq. (7.2.20),
$$\bar{g}^*_{kt} = H^* - H, \tag{7.2.23}$$
where $\bar{g}^*_{kt}$ is the symmetrical, maximum expected rate of capital growth. We see that the last three terms of Eq. (7.2.20) are a function of the asymmetrical character of the payoff structure. We define the last three terms of Eq. (7.2.20) as the coefficient of asymmetry, and in our problem it is
$$Y = \ln\left(\frac{r_1 + r_2}{2}\right) - p_1 \ln r_2 - p_2 \ln r_1. \tag{7.2.24}$$
Thus, we have the final form of the equation for the maximum expected rate of capital growth:
$$\bar{g}^*_{kt} = H^* - H + Y. \tag{7.2.25}$$
We note from Eq. (7.2.25) that if entropy is maximum, the maximum rate is equal to Y, and if Y is then also zero, the maximum rate is equal to zero. This has economic significance. Suppose that it is just as likely to win A dollars as to lose A dollars in some risky venture (maximum entropy and symmetry); then the maximum rate of accumulation will be zero. However, note that if entropy is maximum but Y is positive, then some positive rate of capital growth exists. In effect this means that if it is equally likely to win A dollars and to lose B dollars, where A is greater than B, then $\bar{g}^*_{kt}$ would be positive.
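The decomposition in Eq. (7.2.25) can be verified numerically against a direct evaluation of Eq. (7.2.10) at the optimum of Eq. (7.2.14). The sketch below is illustrative only: the function names and parameter values are assumptions, and the identity is checked only where the optimum is interior, i.e., where inequality (7.2.16) holds.

```python
import math

H_STAR = math.log(2)  # maximum entropy of a two-outcome process

def entropy(p1):
    """Entropy H of the binomial payoff process (natural logarithms)."""
    return -sum(p * math.log(p) for p in (p1, 1.0 - p1) if p > 0)

def asymmetry(p1, r1, r2):
    """Coefficient of asymmetry Y, Eq. (7.2.24)."""
    p2 = 1.0 - p1
    return math.log((r1 + r2) / 2.0) - p1 * math.log(r2) - p2 * math.log(r1)

def max_growth_direct(p1, r1, r2):
    """Eq. (7.2.10) evaluated at the interior optimum of Eq. (7.2.14)."""
    p2 = 1.0 - p1
    a = (p1 * r1 - p2 * r2) / (r1 * r2)
    return p1 * math.log(1.0 + a * r1) + p2 * math.log(1.0 - a * r2)

# Assumed illustrative cases: (p1, r1, r2).
cases = {
    "asymmetric, unequal odds": (0.6, 0.5, 0.4),
    "symmetric, equal odds": (0.5, 0.3, 0.3),
    "equal odds, gain exceeds loss": (0.5, 0.5, 0.4),
}
for label, (p1, r1, r2) in cases.items():
    direct = max_growth_direct(p1, r1, r2)
    decomposed = H_STAR - entropy(p1) + asymmetry(p1, r1, r2)
    print(f"{label}: direct={direct:.6f}, H* - H + Y={decomposed:.6f}")
```

The second case corresponds to winning or losing the same amount with equal probability, for which both H* - H and Y vanish; the third corresponds to equal odds with a gain larger than the loss, for which Y alone produces a positive rate.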
Another possibility is that A equals B, but the probability of winning A is greater than that of losing B. Here the entropy would be less than maximum and the asymmetry coefficient zero, so some positive $\bar{g}^*_{kt}$ would result. It is also possible that the entropy is less than maximum and the asymmetry coefficient is positive. In this case A is greater than B, and the probability of A is not equal to the probability of B. In addition, it must be remembered that inequality (7.2.16) must hold if we are to say that this case will lead to a positive $\bar{g}^*_{kt}$. Under certain conditions it is possible to have a negative asymmetry coefficient. Suppose it is highly likely to receive a small payoff and highly unlikely to lose a large amount. The chances are that the asymmetry coefficient in this case would be negative, while a positive maximum expected rate of capital growth would result. Negative Y would also result if the loss were high and the probability of loss were high also; but in such cases it must be remembered that this condition would, in all likelihood, violate inequality (7.2.16). Two possible cases, shown by the graphs in Figs. 7.1 and 7.2, illustrate some of the effects of these parameters in Eq. (7.2.25). One of the weaknesses of Eq. (7.2.25) is that we have assumed that the optimum portfolio ratio is constant over the duration of the process. Whether the optimum portfolio ratio is a constant or not should result from the volition of the investor, not from arbitrary conditions placed on the maximization. It will be shown in the next section that even if the optimum portfolio ratio is freed from this assumption, the investor will, by his volition, maintain a constant optimum portfolio ratio over the duration of the process.

FIG. 7.1. Symmetrical case. Solid line: the optimum portfolio ratio, $a^*$. Dashed line: the maximum expected rate of growth, $\bar{g}^*$. Horizontal axis: probability, $p_1$ or $(1 - p_2)$.

FIG. 7.2. Asymmetrical case. Solid line: the optimum portfolio ratio, $a^*$. Dashed line: the maximum rate of growth, $\bar{g}^*$. Horizontal axis: probability, $p_1$ or $(1 - p_2)$.
7.3 An Investment Model with Full Liquidity
We now consider the multistage investment process, but we do not assume that the portfolio ratio is constant over the duration of the process. We find an optimum portfolio ratio for each stage which is optimal for the current stage and for all those that follow. In other words, we make use of Bellman's principle of optimality. From the economic view this means that there is sufficient liquidity so that funds invested need only be committed for one stage. We start by redefining the portfolio ratios. For each investor, say the kth, we have a vector $A_k$. The elements of these vectors for a T-stage process are given by
$$A_k = \{a_{kt},\ t = 1, 2, \ldots, T, \text{ such that } 0 \le a_{kt} \le 1\}. \tag{7.3.1}$$
The maximization problem can now be stated as
$$\max_{A_k} \bar{g}_{kT} = \max_{A_k} \frac{1}{T} E\left[\ln\frac{K_{kT}}{K_{k0}}\right]. \tag{7.3.2}$$
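Equation (7.3.2) can be written as a sequence of terms, one for each period. We have
$$\bar{g}_{kT} = \frac{1}{T}\left\{E\left[\ln\frac{K_{kT}}{K_{k,T-1}}\right] + E\left[\ln\frac{K_{k,T-1}}{K_{k,T-2}}\right] + \cdots + E\left[\ln\frac{K_{k1}}{K_{k0}}\right]\right\}. \tag{7.3.3}$$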
Using Bellman's principle of optimality, we may maximize Eq. (7.3.3) one stage at a time. For a process containing T stages (the horizon is the Tth stage), optimization of the last stage is independent of the future because the last stage has no future. In the previous section, we optimized the portion of invested funds by maximizing the expected rate of capital growth. We can obtain the same results by maximizing the log of the capital at the end of the process, given the initial capital available for investment. It is more instructive to maximize the expected log of capital at the end of each stage, and then to show that this sequence leads to the maximization of the expected rate of capital growth for the whole duration of the process. With this in mind, the optimum portfolio ratio at the beginning of the Tth stage is found by maximization of
$$F_1(K_{k,T-1}) = \max_{a_{kT}} E(\ln K_{kT}) = \max_{a_{kT} \in A_k} \{\ln K_{k,T-1} + p_1 \ln(1 + a_{kT} r_1) + p_2 \ln(1 - a_{kT} r_2)\}, \tag{7.3.4}$$
where $K_{kT}$ is the kth investor's capital funds available for investment at the end of the Tth stage, $a_{kT}$ is the portfolio ratio for the kth investor at the beginning of the Tth stage, and $F_1(K_{k,T-1})$ is the expected log of the capital, $K_{kT}$, for a one-stage optimal process, given that the capital at the beginning of the stage is $K_{k,T-1}$. We have shown previously that the maximum of Eq. (7.3.4) occurs when
$$a^*_{kT} = \frac{p_1 r_1 - p_2 r_2}{r_1 r_2},$$
and at this value of $a_{kT}$, Eq. (7.3.4) reduces to
$$F_1(K_{k,T-1}) = \ln K_{k,T-1} + H^* - H + Y. \tag{7.3.5}$$
Furthermore, the optimum portfolio ratio for a two-stage process is given by the maximum of
$$F_2(K_{k,T-2}) = \max_{a_{kT},\, a_{k,T-1}} E\{\ln K_{kT}\} = \max_{a_{k,T-1}} \{p_1 F_1(K_{k,T-1} \mid \text{event 1}) + p_2 F_1(K_{k,T-1} \mid \text{event 2})\}. \tag{7.3.6}$$
Using Eq. (7.3.5) we define
$$F_1(K_{k,T-1} \mid \text{event 1}) = \ln[K_{k,T-2}(1 + a_{k,T-1} r_1)] + H^* - H + Y; \tag{7.3.7}$$
$$F_1(K_{k,T-1} \mid \text{event 2}) = \ln[K_{k,T-2}(1 - a_{k,T-1} r_2)] + H^* - H + Y. \tag{7.3.8}$$
Substituting Eqs. (7.3.7) and (7.3.8) into Eq. (7.3.6), we find
$$F_2(K_{k,T-2}) = \max_{a_{k,T-1}} \{p_1 \ln(1 + a_{k,T-1} r_1) + p_2 \ln(1 - a_{k,T-1} r_2)\} + \ln K_{k,T-2} + H^* - H + Y.$$
Note that
$$\max_{a_{k,T-1}} \{p_1 \ln(1 + a_{k,T-1} r_1) + p_2 \ln(1 - a_{k,T-1} r_2)\} = H^* - H + Y.$$
Therefore, we find that
$$F_2(K_{k,T-2}) = \ln K_{k,T-2} + 2(H^* - H + Y). \tag{7.3.9}$$
The above development can be repeated for T stages, but the solution for the tth stage can be given in a general statement. If it is true that:
(1) $r_1$ and $r_2$ are constant over the duration of a T-stage investment process,
(2) $p_1$ and $p_2$ are constant over the duration of the process,
(3) $p_1, p_2 \ge 0$, $p_1 + p_2 = 1$,
(4) $r_1 r_2 \ge p_1 r_1 - p_2 r_2 \ge 0$, and
(5) the principle of optimality holds,
then we have
(i) the solution of a (T - t)-stage process for the kth decision maker at stage (t - 1) is given by
$$F_{T-t}(K_{kt}) = \ln K_{kt} + (T - t)(H^* - H + Y), \tag{7.3.10}$$
and
(ii) the optimum portfolio ratio is given by
$$a^*_{kt} = \frac{p_1 r_1 - p_2 r_2}{r_1 r_2} \quad \text{for } t = 1, 2, \ldots, T. \tag{7.3.11}$$
Proof.
For a process at the tth period we can write
$$F_{T-t+1}(K_{k,t-1}) = \max_{a_{kt}} \{p_1 F_{T-t}(K_{kt} \mid \text{event 1}) + p_2 F_{T-t}(K_{kt} \mid \text{event 2})\}. \tag{7.3.12}$$
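With the help of Eq. (7.3.10) in the statement for a process in the tth stage, we can write
$$F_{T-t}(K_{kt} \mid \text{event 1}) = \ln[K_{k,t-1}(1 + a_{kt} r_1)] + (T - t)(H^* - H + Y)$$
and
$$F_{T-t}(K_{kt} \mid \text{event 2}) = \ln[K_{k,t-1}(1 - a_{kt} r_2)] + (T - t)(H^* - H + Y).$$
Substitution of these equations into Eq. (7.3.12) gives
$$F_{T-t+1}(K_{k,t-1}) = \max_{a_{kt}} \{\ln K_{k,t-1} + p_1 \ln(1 + a_{kt} r_1) + p_2 \ln(1 - a_{kt} r_2)\} + (T - t)(H^* - H + Y). \tag{7.3.13}$$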
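But we again note that
$$\max_{a_{kt}} \{p_1 \ln(1 + a_{kt} r_1) + p_2 \ln(1 - a_{kt} r_2)\} = H^* - H + Y.$$
Therefore, Eq. (7.3.13) becomes
$$F_{T-t+1}(K_{k,t-1}) = \ln K_{k,t-1} + [T - t + 1][H^* - H + Y], \tag{7.3.14}$$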
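which is the solution to a process at the tth stage, as given by Eq. (7.3.10). Thus, part (i) of the statement is proved by mathematical induction on t. Next, we can write
$$F_{T-t}(K_{kt}) = \max_{a_{k,t+1}} \{p_1 F_{T-t-1}(K_{k,t+1} \mid \text{event 1}) + p_2 F_{T-t-1}(K_{k,t+1} \mid \text{event 2})\}, \tag{7.3.15}$$
where the conditional functions are defined in the manner of Eqs. (7.3.7) and (7.3.8). Equation (7.3.15) then reduces to
$$F_{T-t}(K_{kt}) = \max_{a_{k,t+1}} \{p_1 \ln(1 + a_{k,t+1} r_1) + p_2 \ln(1 - a_{k,t+1} r_2)\} + (T - t - 1)(H^* - H + Y) + \ln K_{kt}. \tag{7.3.16}$$
We have already noted that the maximum of the bracketed term occurs when
$$a^*_{k,t+1} = \frac{p_1 r_1 - p_2 r_2}{r_1 r_2}. \tag{7.3.17}$$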
Since t is arbitrarily chosen and the p's and r's are constants, all the a's are equal for all t. This proves part (ii) of the statement.
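The induction above can be illustrated by brute-force backward recursion on the functional equation (7.3.12). The sketch below is not the author's procedure: it assumes illustrative values of $p_1$, $r_1$, and $r_2$, searches a coarse grid of portfolio ratios instead of using calculus, and uses hypothetical function names. It should return the same ratio at every stage and a value equal to the log of capital plus the number of remaining stages times $H^* - H + Y$.

```python
import math

# Assumed illustrative parameters; payoffs are treated as fractional returns.
P1, R1, R2 = 0.6, 0.5, 0.4
P2 = 1.0 - P1
GRID = [i / 20 for i in range(21)]  # candidate portfolio ratios 0, 0.05, ..., 1

def stage_growth(a):
    """One-stage expected log growth, the bracketed term of Eq. (7.3.13)."""
    return P1 * math.log(1.0 + a * R1) + P2 * math.log(1.0 - a * R2)

def solve(stages_left, capital):
    """Return (F, a): the optimal expected log of final capital and the
    maximizing portfolio ratio for the first of the remaining stages."""
    if stages_left == 0:
        return math.log(capital), None
    best_value, best_a = -math.inf, None
    for a in GRID:
        up, _ = solve(stages_left - 1, capital * (1.0 + a * R1))
        down, _ = solve(stages_left - 1, capital * (1.0 - a * R2))
        value = P1 * up + P2 * down
        if value > best_value:
            best_value, best_a = value, a
    return best_value, best_a

a_closed_form = (P1 * R1 - P2 * R2) / (R1 * R2)
for n in (1, 2, 3):
    value, a = solve(n, 50.0)
    predicted = math.log(50.0) + n * stage_growth(a_closed_form)
    print(n, a, round(value, 6), round(predicted, 6))
```

The recursion is exponential in the number of stages, so it is only practical for a handful of stages; it is meant as a check on Eqs. (7.3.10) and (7.3.11), not as a computational method.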
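It remains to be shown how the maximization of the expected $\ln K_{kT}$ leads to the maximization of the expected rate of capital growth. From Eq. (7.3.10), with t = 0, we have
$$F_T(K_{k0}) = \max_{A_k} E[\ln K_{kT}] = \ln K_{k0} + T(H^* - H + Y). \tag{7.3.18}$$
Subtracting $\ln K_{k0}$ and dividing by T, we get
$$\max_{A_k} \frac{1}{T} E\left[\ln\frac{K_{kT}}{K_{k0}}\right] = H^* - H + Y.$$
From previous work, we know that
$$\bar{g}^*_{kT} = H^* - H + Y;$$
therefore, we can write
$$\bar{g}^*_{kT} = \max_{A_k} \frac{1}{T} E\left[\ln\frac{K_{kT}}{K_{k0}}\right]. \tag{7.3.19}$$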
Economically, then, this means that under the restrictions set forth in the statement of the general solution, even when the investor has full liquidity and can vary the portfolio ratio at each stage of a multistage investment process, he does not. Thus, the maximum expected rate of capital growth is the same as in the illiquid case, where the portfolio ratios are forced to be equal throughout the process. If the constants, the p's and r's, become variables, the liquidity postulate would result in unequal elements in the optimum portfolio ratio vector. The assumption of constant p's will be relaxed in the next section. Examination of Eq. (7.3.17) shows that the portfolio ratio of each investor (assuming that every investor knows the true probability density function of the payoff process) is the same. Furthermore, because of the assumption of the logarithmic objective (maximization of the expected rate of growth), the initial holding of each investor's capital, $K_{kt}$, does not affect the choice of an optimal portfolio ratio. If this is true, then all investors will hold the same portion of their funds in securities. Of course, special objective functions could be postulated for each investor which could explain why actual investors might hold different portions of their funds in the same kind of security. However, this is really no solution, since we would then have to show why the investors have different objective functions. There is a more appealing approach. Investors do not have full knowledge of the probability function of the payoff process. The perfect foresight investment service simply does not exist.
7.4 An Adaptive Investment Process with Full Liquidity

So far, the concept of adaptation has not been brought into the investment problem. In the adaptive process we go one step farther than the introduction of risk (a stochastic payoff structure). In an adaptive process, the payoff probabilities are unknown parameters. Thus, in an adaptive process, we have both risk and uncertainty. In the adaptive investment process we have a probability function which plays a major role in the determination of the optimum portion of available funds to be invested each period, and of the maximum expected rate of capital growth. The values of the probabilities making up the probability function of the environmental stochastic process are in reality unknown to the decision maker. On what ground are we able to say anything definite about the rate of capital growth? As in the previous adaptive processes, we can substitute subjective probabilities based on a scheme which is, in some sense, the best for a given quantity of historical information. In Chapter 4 we derived such subjective probabilities and we shall employ them here. The binomial environmental process is characterized by the occurrence of certain kinds of random payoff events. We have two possible payoff events for each stage. We denote the number of payoffs of $r_1$ by $n_{1t}$, and the number of payoffs of $r_2$ by $n_{2t}$. We assume here that the decision maker has experienced t stages, or t events. The kth decision maker's subjective probabilities for the binomial environment are given by
(7.4.1)
for i = 1, 2.
We have gone about as far as we can with the simplified notational system for subjective expectations. Because of the complexity of a multistage adaptive process, we shall now be forced to introduce the notational scheme used in Chapter 5. This scheme is not as instructive as the previous scheme, but it is far more flexible.

NOTATION FOR SUBJECTIVE EXPECTATIONS. Suppose we identify the finite sequence of payoff events that have occurred up to the tth stage of an investment process by the vector
(7.4.2)
At any particular stage, say the tth, the event $z_t$ can be associated with any one of a finite set of possible values:
(7.4.3)
For the binomial case (m = 2), if $z_t = r_1$, then the payment of $r_1$ occurred at the tth stage; if $z_t = r_2$, then the payment of $r_2$ occurred at the tth stage. In our case, $R_t$ is the same for all t, and m = 2. We have assumed that the actual probability function for the payoff events is independent of the stage index, and is denoted by
(7.4.4)
for any t = 1, 2, ..., T.
For T stages, suppose we wish to know the actual probability that some particular sequence of payoff events would occur. For example, say $r_i$ is paid at t = 1, $r_j$ is paid at t = 2, ..., and $r_k$ is paid at t = T. Since the events are independent of the stage index, the joint probability of these payoffs, given by the product of the probabilities of each of these payoffs, would be
$$P(z_1, z_2, \ldots, z_T) = p_i\, p_j \cdots p_k. \tag{7.4.5}$$
On the other hand, the subjective probabilities are not independent of the stage index numbers. The joint subjective probability of the sequence of payoffs denoted above for the kth decision maker would be
$$P_k(z_1, z_2, \ldots, z_T) = \{\Pr\nolimits_k(z_1 = r_i) \cdot \Pr\nolimits_k(z_2 = r_j \mid z_1) \cdots \Pr\nolimits_k(z_T = r_k \mid z_1, \ldots, z_{T-1})\}. \tag{7.4.6}$$
Let us denote the function
$$P_k(z_t \mid z_1, z_2, \ldots, z_{t-1}) = \{\Pr\nolimits_k(z_t = r_i \mid z_1, z_2, \ldots, z_{t-1}) : i \in M\} \quad \text{for any } t = 1, 2, \ldots, T, \tag{7.4.7}$$
as the conditional subjective probability function of an event at the tth stage for the kth decision maker. Note that
(7.4.8)
In our old notation, we see from Eq. (7.4.7) that
$$P_k(z_t \mid z_1, z_2, \ldots, z_{t-1}) \equiv \{p_{kit} : i \in M\} \quad \text{for any } t = 1, 2, \ldots, T.$$
FIG. 7.3. Subjective probabilities for a three-stage process. (Tree diagram separating present from future, with branches $r_1$ and $r_2$ at each stage.)
Fig. 7.3 may clarify the notion of the conditional subjective probability function. Suppose we desire to find the marginal subjective probability of the events at the tth and (t - 1)th stages. This probability is given by
$$P_k(z_t, z_{t-1}) = \sum_{R_1 R_2 \cdots R_{t-2}} P_k(z_1, z_2, \ldots, z_{t-1}, z_t). \tag{7.4.9}$$
Also, the marginal subjective probability of just the (t - 1)th event is
$$P_k(z_{t-1}) = \sum_{R_1 R_2 \cdots R_{t-2}} P_k(z_1, z_2, \ldots, z_{t-2}, z_{t-1}). \tag{7.4.10}$$
Using Eqs. (7.4.9) and (7.4.10), we can evaluate the subjective conditional probability of the tth event, given knowledge of the (t - 1)th event, by
$$P_k(z_t \mid z_{t-1}) = \frac{P_k(z_t, z_{t-1})}{P_k(z_{t-1})}. \tag{7.4.11}$$
We now have at our disposal a notational scheme which is sufficiently compact to greatly simplify the equations that follow.

OPTIMIZING THE T-STAGE BINOMIAL ADAPTIVE INVESTMENT PROCESS WITH FULL LIQUIDITY. Returning to the investment problem and starting with the equations developed for a nonadaptive process, we have for the tth stage and the kth decision maker,
$$\ln\frac{K_{kt}}{K_{k0}} = n_{1t} \ln(1 + a_{kt} r_1) + n_{2t} \ln(1 - a_{kt} r_2). \tag{7.4.12}$$
Suppose we are presently at the end of the (T - 1)th stage of a T-stage process. We have, as a consequence of the information gained from the T - 1 stages of history just experienced, estimates of the probabilities of receiving one of a set of possible payoffs. Substituting T for t in Eq. (7.4.12), taking the initial state to be the state of the (T - 1)th stage, and taking the subjective expectation of Eq. (7.4.12) from the point of view of the kth decision maker, we have
$$\hat{E}_k\left(\ln\frac{K_{kT}}{K_{k,T-1}}\right) = (1)\,p_{k1T} \ln(1 + a_{kT} r_1) + (1)\,p_{k2T} \ln(1 - a_{kT} r_2), \tag{7.4.13}$$
where $\hat{E}_k(\cdot)$ indicates that the expectation is with respect to the subjective probabilities known to the kth decision maker. The factor "one" is emphasized by the parentheses to indicate that Eq. (7.4.13) is for one stage, the final stage of the process. From the market point of view (the actual expected rate of capital growth resulting from the decisions of the kth decision maker), we would have
$$E\left(\ln\frac{K_{kT}}{K_{k,T-1}}\right) = (1)\,p_1 \ln(1 + a_{kT} r_1) + (1)\,p_2 \ln(1 - a_{kT} r_2). \tag{7.4.14}$$
This expectation is taken with respect to the actual probability function of the environmental stochastic process. Since the decision maker knows only the subjective expectation of the result of his decisions when he sets a value for $a_{kT}$, he maximizes not Eq. (7.4.14) but Eq. (7.4.13), the ex ante equation, to obtain the optimum portion of funds to be invested in stage T. Maximization of Eq. (7.4.13) leads to an optimum portfolio ratio for the Tth stage given by
$$a^*_{kT} = \frac{p_{k1T}\, r_1 - p_{k2T}\, r_2}{r_1 r_2} \tag{7.4.15}$$
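subject to the condition that
$$0 < a^*_{kT} < 1, \tag{7.4.16}$$
i.e.,
$$0 < p_{k1T}\, r_1 - p_{k2T}\, r_2 < r_1 r_2. \tag{7.4.17}$$
Because the $p_{kiT}$'s are themselves random variables, we take note here that there is a finite probability that Eq. (7.4.17) will be violated even if it is true that
$$0 < p_1 r_1 - p_2 r_2 < r_1 r_2. \tag{7.4.18}$$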
The probability of this violation occurring will be derived later, but at this point let us condition our next results on the chance that such a violation does not occur during the stage under study. The actual expected rate of capital growth can be determined from the decision maker's optimal portfolio ratio. It is a function related to Eq. (7.4.13), but different in an important way. Equation (7.4.13) is an expectation with respect to the subjective probability function, since it is viewed from the decision maker's standpoint at the beginning of a stage. The actual expected rate of capital growth is an expectation with respect to the actual probability function of the payoff. It is the expectation of capital growth which is actually seen on the market at the end of a stage. This is given by
(7.4.19)
where we define
(7.4.20)
From the point of view of the decision maker the expected growth of capital is given by
(7.4.21)
where we define
(7.4.22)
and
In the dynamic programming notation, Eq. (7.4.21) is given as
$$f_1(K_{k,T-1}) = \ln K_{k,T-1} + H^* + Y_k(R_T \mid R_{T-1} \cdots R_1) - \hat{H}_k(R_T \mid R_{T-1} \cdots R_1). \tag{7.4.24}$$
For a two-stage process starting with $K_{k,T-2}$, we have from the decision maker's point of view,
$$f_2(K_{k,T-2}) = \max_{a_{k,T-1}} \{p_{k1,T-1}\, f_1(K_{k,T-1} \mid z_{T-1} = r_1) + p_{k2,T-1}\, f_1(K_{k,T-1} \mid z_{T-1} = r_2)\}. \tag{7.4.25}$$
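We define
$$f_1(K_{k,T-1} \mid z_{T-1} = r_1) = \ln[K_{k,T-2}(1 + a_{k,T-1} r_1)] + H^* + Y_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_1} - \hat{H}_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_1} \tag{7.4.26}$$
and
$$f_1(K_{k,T-1} \mid z_{T-1} = r_2) = \ln[K_{k,T-2}(1 - a_{k,T-1} r_2)] + H^* + Y_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_2} - \hat{H}_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_2}. \tag{7.4.27}$$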
In these equations we define, for i = 1, 2,
$$\hat{H}_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_i} = -\sum_{R_T} P_k(z_T \mid z_{T-1} = r_i, \ldots, z_1) \ln P_k(z_T \mid z_{T-1} = r_i, \ldots, z_1). \tag{7.4.28}$$
Note that
$$\hat{H}_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{R_{T-1}} P_k(z_{T-1} \mid z_{T-2}, \ldots, z_1)\, \hat{H}_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_i}. \tag{7.4.29}$$
Also we define, for i = 1, 2,
$$Y_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_i} = \ln\left[\frac{r_1 + r_2}{2}\right] - P_k(z_T = r_1 \mid z_{T-1} = r_i, \ldots, z_1) \ln r_2 - P_k(z_T = r_2 \mid z_{T-1} = r_i, \ldots, z_1) \ln r_1. \tag{7.4.30}$$
Again note that
$$Y_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{R_{T-1}} P_k(z_{T-1} \mid z_{T-2}, \ldots, z_1)\, Y_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1} = r_i}. \tag{7.4.31}$$
Substitution of Eqs. (7.4.26) and (7.4.27) into Eq. (7.4.25) results in
$$f_2(K_{k,T-2}) = \max_{a_{k,T-1}} \{p_{k1,T-1} \ln(1 + a_{k,T-1} r_1) + p_{k2,T-1} \ln(1 - a_{k,T-1} r_2)\} + \ln K_{k,T-2} + H^* + Y_k(R_T \mid R_{T-1} \cdots R_1) - \hat{H}_k(R_T \mid R_{T-1} \cdots R_1). \tag{7.4.32}$$
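Maximization of the bracketed term in Eq. (7.4.32) leads to
$$\max_{a_{k,T-1}} \{p_{k1,T-1} \ln(1 + a_{k,T-1} r_1) + p_{k2,T-1} \ln(1 - a_{k,T-1} r_2)\} = H^* + Y_k(R_{T-1} \mid R_{T-2} \cdots R_1) - \hat{H}_k(R_{T-1} \mid R_{T-2} \cdots R_1). \tag{7.4.33}$$
If an interior solution exists, then
$$0 < a^*_{k,T-1} < 1, \tag{7.4.34}$$
or what is the same thing,
$$0 < p_{k1,T-1}\, r_1 - p_{k2,T-1}\, r_2 < r_1 r_2. \tag{7.4.35}$$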
Again under the condition that Eq. (7.4.35) is not violated, we may write
$$f_2(K_{k,T-2}) = \ln K_{k,T-2} + 2H^* + Y_k(R_{T-1} \mid R_{T-2} \cdots R_1) + Y_k(R_T \mid R_{T-1} \cdots R_1) - \hat{H}_k(R_{T-1} \mid R_{T-2} \cdots R_1) - \hat{H}_k(R_T \mid R_{T-1} \cdots R_1). \tag{7.4.36}$$
The above process may be repeated for T stages, but a general solution may be given for any tth stage of the process by the following statement.
If $r_1$ and $r_2$ are constant over the T-stage adaptive investment process, the principle of optimality holds, and if at no stage in the process the condition
$$0 < p_{k1t}\, r_1 - p_{k2t}\, r_2 < r_1 r_2 \quad \text{for } t = 1, 2, \ldots, T \tag{7.4.37}$$
is violated, then
(i) the maximum value of the (t + 1)th stage from the point of view of the kth decision maker is given by
$$f_{T-t}(K_{kt}) = \ln K_{kt} + (T - t)H^* + \sum_{\tau = t+1}^{T} Y_k(R_\tau \mid R_{\tau-1} \cdots R_1) - \sum_{\tau = t+1}^{T} \hat{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1), \tag{7.4.38}$$
where
(7.4.39)
and
(7.4.40)
(ii) the optimal portfolio ratio for the kth decision maker is given by
$$a^*_{kt} = \frac{p_{k1t}\, r_1 - p_{k2t}\, r_2}{r_1 r_2}, \quad t = 1, 2, \ldots, T. \tag{7.4.41}$$
(iii) the maximum expected rate of capital growth for the kth decision maker from the market point of view is given by
(7.4.42)
where $\bar{g}^*_{kt}$ is the maximum expected rate of capital growth for the kth decision maker at the tth stage of the process, and where we define
(7.4.43)
Proof.
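For an adaptive process at the tth stage, we can write
$$f_{T-t+1}(K_{k,t-1}) = \max_{a_{kt}} \{p_{k1t}\, f_{T-t}(K_{kt} \mid z_t = r_1) + p_{k2t}\, f_{T-t}(K_{kt} \mid z_t = r_2)\}. \tag{7.4.44}$$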
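With the help of Eq. (7.4.38) of the solution statement, we can say
$$f_{T-t}(K_{kt} \mid z_t = r_1) = \ln[K_{k,t-1}(1 + a_{kt} r_1)] + (T - t)H^* + \sum_{\tau = t+1}^{T} Y_k(R_\tau \mid R_{\tau-1} \cdots R_1)\big|_{z_t = r_1} - \sum_{\tau = t+1}^{T} \hat{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1)\big|_{z_t = r_1}$$
and
$$f_{T-t}(K_{kt} \mid z_t = r_2) = \ln[K_{k,t-1}(1 - a_{kt} r_2)] + (T - t)H^* + \sum_{\tau = t+1}^{T} Y_k(R_\tau \mid R_{\tau-1} \cdots R_1)\big|_{z_t = r_2} - \sum_{\tau = t+1}^{T} \hat{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1)\big|_{z_t = r_2}.$$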
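In view of the definitions (7.4.26) and (7.4.27), we substitute these equations into Eq. (7.4.44) and get
$$f_{T-t+1}(K_{k,t-1}) = \max_{a_{kt}} \{p_{k1t} \ln(1 + a_{kt} r_1) + p_{k2t} \ln(1 - a_{kt} r_2)\} + \ln K_{k,t-1} + (T - t)H^* + \sum_{\tau = t+1}^{T} Y_k(R_\tau \mid R_{\tau-1} \cdots R_1) - \sum_{\tau = t+1}^{T} \hat{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1). \tag{7.4.45}$$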
We note again that
$$\max_{a_{kt}} \{p_{k1t} \ln(1 + a_{kt} r_1) + p_{k2t} \ln(1 - a_{kt} r_2)\} = H^* + Y_k(R_t \mid R_{t-1} \cdots R_1) - \hat{H}_k(R_t \mid R_{t-1} \cdots R_1).$$
Therefore, Eq. (7.4.45) becomes
$$f_{T-t+1}(K_{k,t-1}) = \ln K_{k,t-1} + (T - t + 1)H^* + \sum_{\tau = t}^{T} Y_k(R_\tau \mid R_{\tau-1} \cdots R_1) - \sum_{\tau = t}^{T} \hat{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1), \tag{7.4.46}$$
which is the solution to a process at the tth stage as given by Eq. (7.4.38) in the statement. Thus, the first part of the solution statement is proved by induction on t. When the bracketed term of Eq. (7.4.45) is maximized, the maximum occurs at
$$a^*_{kt} = \frac{p_{k1t}\, r_1 - p_{k2t}\, r_2}{r_1 r_2} \quad \text{for any } t = 1, 2, \ldots, T. \tag{7.4.47}$$
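This proves part (ii) of the solution statement. Extending Eq. (7.4.19), we can write the sequence
$$E^*[\ln K_{kt}] = E^*[\ln K_{k,t-1}] + H^* + Y - \bar{H}_k(R_t \mid R_{t-1} \cdots R_1),$$
$$E^*[\ln K_{k,t-1}] = E^*[\ln K_{k,t-2}] + H^* + Y - \bar{H}_k(R_{t-1} \mid R_{t-2} \cdots R_1),$$
and so on. Repeated substitution of these equations, one into the other, gives us
$$E^*(\ln K_{kt}) = \ln K_{k0} + t(H^* + Y) - \sum_{\tau = 1}^{t} \bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1).$$
Subtracting $\ln K_{k0}$ from both sides and dividing by t gives
$$\frac{1}{t} E^*\left[\ln\frac{K_{kt}}{K_{k0}}\right] = H^* + Y - \frac{1}{t}\sum_{\tau = 1}^{t} \bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1).$$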
This proves part (iii) of the solution statement. The general solution statement shows that the optimal portfolio ratio for the kth adaptive decision maker is a function of the magnitudes of the possible payoffs, the historical information concerning the frequency of the payoffs, and the a priori conviction of the decision maker. In contrast to the nonadaptive decision maker, there is good reason to believe that each adaptive decision maker behaves differently, even if all adaptive decision makers face the same payoff structure and history. This is because there is little possibility of all adaptive decision makers having the same a priori convictions, the α's.
For example, let us examine the implications of the general solution for our three investors, Mr. Holmes, Mr. Stone, and Mr. Wise. Using the data provided in Section 4.2, Table 4.1, we see that the three investors will follow quite different investment plans and end up with different amounts of capital, even having started with the same sum.

TABLE 7.1
PORTFOLIO RATIOS FOR THE THREE INVESTORS

Decision      Conviction vector    Observations         Subj. prob.        Port. ratio
maker         α_k1      α_k2       t      n_1    n_2    p_k1t    p_k2t     a_kt
Mr. Wise      30        10         100    70     30     .714     .286      .428
Mr. Holmes    2         2          100    70     30     .692     .308      .384
Mr. Stone     20        20         100    70     30     .643     .357      .286

r_1 = 1 point rise; r_2 = 1 point fall.
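The entries of Table 7.1 can be checked with a few lines of code. The sketch below assumes that the subjective probabilities of Chapter 4 take the form $p_{kit} = (\alpha_{ki} + n_{it}) / (\alpha_{k1} + \alpha_{k2} + t)$. That form is an assumption made here because it reproduces the tabulated probabilities, not a formula quoted from this section. The portfolio ratio is then Eq. (7.4.41) with $r_1 = r_2 = 1$. Function and variable names are illustrative.

```python
def subjective_probs(alpha, counts):
    """Assumed form: (conviction + observed count) / (total conviction + stages)."""
    total = sum(alpha) + sum(counts)
    return tuple((a + n) / total for a, n in zip(alpha, counts))

def portfolio_ratio(p1, p2, r1=1.0, r2=1.0):
    """Optimal portfolio ratio, Eq. (7.4.41)."""
    return (p1 * r1 - p2 * r2) / (r1 * r2)

# Conviction vectors (alpha_k1, alpha_k2) and observations from Table 7.1.
INVESTORS = {"Mr. Wise": (30, 10), "Mr. Holmes": (2, 2), "Mr. Stone": (20, 20)}
COUNTS = (70, 30)  # n_1 rises and n_2 falls in t = 100 stages

def table_rows():
    rows = {}
    for name, alpha in INVESTORS.items():
        p1, p2 = subjective_probs(alpha, COUNTS)
        rows[name] = (p1, p2, portfolio_ratio(p1, p2))
    return rows

for name, (p1, p2, a) in table_rows().items():
    print(f"{name:<11} p1={p1:.3f}  p2={p2:.3f}  a={a:.3f}")
```

The computed ratios agree with the table to within rounding; the tabulated ratios appear to have been formed from the already rounded probabilities, so the last digit can differ by one.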
Table 7.1 shows the results for the three investors following a number of entropy periods (stages). Now let us examine the adaptive investment process as the length of the process increases; i.e., the number of entropy stages increases without limit. The nature of such an adaptive process is tied up ill the effect of the limiting values for the subjective conditional entropy of each decision maker. In our particular process, where the payoff events are not dependent on the stage index, it can be shown that (7.4.49)
and we denote $H(R_\tau) = H(R)$ for $\tau = 1, 2, \ldots, T$. (7.4.50)
Therefore, we note that the average entropy is given by (7.4.51)
The average subjective entropy is not the same as Eq. (7.4.51) since the subjective probabilities are not independent of the stage index. The
limiting value of the average subjective entropy is of use in determining the probable limit of the adaptive process. Since $p_k(z_\tau \mid z_{\tau-1}, \ldots, z_1)$ is a consistent estimator (see Section 4.4), we have for the kth decision maker
(7.4.52)
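Since also
$$H(R_\tau) = -\sum_{R_\tau} P(z_\tau) \ln P(z_\tau)$$
and
$$\bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1) = -\sum_{R_\tau R_{\tau-1} \cdots R_1} P(z_\tau) P(z_{\tau-1}) \cdots P(z_1) \ln p_k(z_\tau \mid z_{\tau-1}, \ldots, z_1),$$
we have in probability as $\tau \to \infty$,
$$\operatorname{Plim}_{\tau \to \infty} \bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1) = -\sum_{R_\tau} P(z_\tau) \ln P(z_\tau).$$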
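Because of Eq. (7.4.50), we have for the probability limit of the kth decision maker's average subjective entropy,
$$\operatorname{Plim}_{T \to \infty} \frac{1}{T} \sum_{\tau=1}^{T} \bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1) = H(R). \tag{7.4.53}$$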
Under condition (7.4.35), we can then say
$$\operatorname{Plim}_{T \to \infty} g^*_{kT} = g^*. \tag{7.4.54}$$
We see then that the adaptive maximum expected rate of capital growth, in probability, approaches the full knowledge stochastic rate of growth as the process extends in time. Given the same initial capital for both processes, and using the full knowledge stochastic process as a
comparative base, we can define the relative efficiency of the kth decision maker in an adaptive binomial investment process as (7.4.55)
and since we have Eq. (7.4.55) at the limit, in probability (7.4.56)
Thus the relative efficiency of this adaptive process approaches unity in probability as time extends to infinity. However, one should note that during this period of adaptation, the kth decision maker has accumulated a definite loss in capital relative to the full knowledge process that is irretrievable. He can minimize this loss by the use of a "best" subjective estimator, but he can never eliminate this loss since it is the cost of insufficient information. This irretrievable loss is characteristic of irreversible processes.
7.5 Consideration of the Special Constraint

We are ready now to return to the consideration of the constraint (7.4.35) and the finite probability that it will be violated. We can say in addition to Eq. (7.4.41) that if
(7.5.1)
if
(7.5.2)
and
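We define three probabilities:
$$\varphi_{k1} = \Pr[0 < \mathcal{E}_k(r) < r_1 r_2], \tag{7.5.3}$$
$$\varphi_{k2} = \Pr[\mathcal{E}_k(r) < 0], \tag{7.5.4}$$
$$\varphi_{k3} = \Pr[\mathcal{E}_k(r) > r_1 r_2], \tag{7.5.5}$$
and we assert
$$\varphi_{k1} + \varphi_{k2} + \varphi_{k3} = 1. \tag{7.5.6}$$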
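Rewriting condition (7.4.35) in terms of $p_{k1t}$ alone, we find that for the lower critical bound, $\mathcal{E}_k(r) = 0$,
$$p_{k1t} = \frac{\mathcal{E}_k(r) + r_2}{r_1 + r_2} = \frac{r_2}{r_1 + r_2}, \tag{7.5.7}$$
and for the upper critical bound, $\mathcal{E}_k(r) = r_1 r_2$,
$$p_{k1t} = \frac{\mathcal{E}_k(r) + r_2}{r_1 + r_2} = \frac{r_2 (r_1 + 1)}{r_1 + r_2}. \tag{7.5.8}$$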
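The two critical bounds can be verified numerically. The sketch below assumes that the subjective expected payoff has the form $p_{k1t} r_1 - (1 - p_{k1t}) r_2$, which is the form consistent with conditions (7.5.12) and (7.5.19); the payoff values and function names are hypothetical.

```python
def expected_payoff(p1, r1, r2):
    """Assumed subjective expected payoff: gain r1 with prob. p1, loss r2 otherwise."""
    return p1 * r1 - (1.0 - p1) * r2


def critical_bounds(r1, r2):
    """Values of p_k1t at which the expected payoff equals 0 and r1*r2."""
    lower = r2 / (r1 + r2)
    upper = r2 * (r1 + 1.0) / (r1 + r2)
    return lower, upper


r1, r2 = 1.0, 0.5  # hypothetical payoffs
lower, upper = critical_bounds(r1, r2)
print("lower bound:", lower, "expected payoff:", expected_payoff(lower, r1, r2))
print("upper bound:", upper, "expected payoff:", expected_payoff(upper, r1, r2))
```

For the unit payoffs of Table 7.1 the same functions give bounds of 0.5 and 1.0, so the upper bound cannot be exceeded in that case.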
Substituting the estimation equation for $p_{k1t}$, we find that for the lower boundary line, (7.5.9)
and for the upper boundary line (7.5.10)
We note that the slope of the upper boundary is always greater than the slope of the lower boundary, and that the intercept with t = 0 of the upper boundary is always greater than the intercept of the lower boundary. Figure 7.4 shows a possible position for these boundaries. The adaptive process is a variant of the sequential analysis problems considered by Wald⁶ and others. In Fig. 7.4 we find that
(1) Area OABED => $0 < \mathcal{E}_k(r) < r_1 r_2$.
(2) Area ABC => $\mathcal{E}_k(r) > r_1 r_2$.
(3) Area DEF => $\mathcal{E}_k(r) < 0$.
(4) Area OAI is inaccessible because ($n_{1t} > t$) is not possible.
(5) Area ODH is inaccessible because ($n_{1t} < 0$) is not possible.
(6) Point A occurs when
(7) Point D occurs when
FIG. 7.4. Admissible values for $n_{1t}$.

The probability that $n_{1t}$ will be in area DEF can be written
$$\varphi_{k2} = \Pr\left[\frac{n_{1t} - n_{k1t}}{t} < 0\right]. \tag{7.5.11}$$
Now from Eq. (7.5.9) we see that
Note that
and since $n_{1t}/t$ is the consistent maximum likelihood estimator of $p_1$, we know that
$$\operatorname{Plim}_{t \to \infty} \frac{n_{1t}}{t} = p_1.$$
Furthermore, we note that a possible condition for $p_1$ is that
$$p_1 > \frac{r_2}{r_1 + r_2}. \tag{7.5.12}$$
Because of these facts we can write in the limit that
$$\lim_{t \to \infty} \Pr\left[\frac{n_{1t} - n_{k1t}}{t} < 0\right] = 0,$$
or
$$\lim_{t \to \infty} \varphi_{k2} = 0. \tag{7.5.13}$$
On the other hand, suppose
$$p_1 < \frac{r_2}{r_1 + r_2}; \tag{7.5.14}$$
then
$$\lim_{t \to \infty} \varphi_{k2} = 1. \tag{7.5.15}$$
Now the condition of Eq. (7.5.12) is equivalent to saying that
$$\mathcal{E}(r) > 0, \tag{7.5.16}$$
and also condition (7.5.14) is equivalent to saying that
$$\mathcal{E}(r) < 0. \tag{7.5.17}$$
We have maintained that Eq. (7.5.17) is not a valid expectation for the investment market as a whole; therefore, we will assume that only Eq. (7.5.12) is valid in a real market situation, and thus the limit of $\varphi_{k2}$ is zero as time extends to infinity. In effect, then, the lower bound falls away from the region of probable deviation of $n_{1t}$ faster than that region can expand with increasing time in a real market situation.

The probability that $n_{1t}$ will be in area ABC can be written
$$\varphi_{k3} = \Pr\left[\frac{n_{1t} - \bar{n}_{k1t}}{t} > 0\right]. \tag{7.5.18}$$
Now from Eq. (7.5.10) we see that
A possible condition for $p_1$ is that
$$p_1 < \frac{r_2 (r_1 + 1)}{r_1 + r_2}. \tag{7.5.19}$$
With these facts, we can write the limit
$$\lim_{t \to \infty} \Pr\left[\frac{n_{1t} - \bar{n}_{k1t}}{t} > 0\right] = 0,$$
or
$$\lim_{t \to \infty} \varphi_{k3} = 0. \tag{7.5.20}$$
On the other hand, suppose that
$$p_1 > \frac{r_2 (r_1 + 1)}{r_1 + r_2}; \tag{7.5.21}$$
then we find
$$\lim_{t \to \infty} \varphi_{k3} = 1. \tag{7.5.22}$$
Condition (7.5.19) is equivalent to saying
$$\mathcal{E}(r) < r_1 r_2. \tag{7.5.23}$$
This condition is one which was previously made and shown to be a characteristic of a real market situation. If we assume condition (7.5.19), we also assume that the limit (7.5.20) is valid. As a result, as t increases, the upper bound pulls away faster from the region of probable deviation of $n_{1t}$ than that region can expand.

Provided that the probability function of the environmental stochastic process does not violate Eqs. (7.5.12) and (7.5.19), it is quite likely that the subjective probabilities will not violate Eq. (7.4.35) in the later periods of the process. In earlier periods of the process the probability of a violation of Eq. (7.4.35) is governed in part by the accuracy of the initial convictions (logical width) and the closeness of a violation of Eqs. (7.5.12) and (7.5.19) by the probability function of the environmental stochastic process.
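The decline of the violation probability with time can be illustrated by an exact binomial calculation. The sketch below assumes that $n_{1t}$ is binomially distributed with parameter $p_1$ and that the subjective estimator is $p_{k1t} = (n_{1t} + \alpha_{k1})/(t + \alpha_{k1} + \alpha_{k2})$; the estimator form, the parameter values, and the function name are assumptions made for illustration.

```python
from math import comb


def prob_below_lower_bound(t, p1, r1, r2, alpha1, alpha2):
    """Pr[p_k1t < r2/(r1 + r2)] when n_1t ~ Binomial(t, p1).

    Assumes the estimator p_k1t = (n_1t + alpha1) / (t + alpha1 + alpha2).
    """
    bound = r2 / (r1 + r2)
    total = 0.0
    for n in range(t + 1):
        estimate = (n + alpha1) / (t + alpha1 + alpha2)
        if estimate < bound:
            # Binomial probability of exactly n rises in t stages.
            total += comb(t, n) * p1**n * (1.0 - p1) ** (t - n)
    return total


# Hypothetical case: true p1 = 0.7, unit payoffs, convictions (2, 2).
phi2 = {t: prob_below_lower_bound(t, p1=0.7, r1=1, r2=1, alpha1=2, alpha2=2)
        for t in (10, 50, 100, 200)}
for t, value in phi2.items():
    print(t, value)
```

Because $p_1 = 0.7$ satisfies condition (7.5.12) for unit payoffs, the computed probability shrinks as $t$ grows, in line with the limit (7.5.13).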
7.6 The Multivalued Payoff Adaptive Investment Process

The model introduced in the last section could be generalized in a number of ways. We would like to make it possible to introduce a multivalued payoff structure. This could be done in several ways, but by the extension of the entropy time concept we can introduce a multivalued
payoff structure while pointing out some interesting relationships between decision processes in entropy time and in physical time. First let us define the payoff structure in a special way by the set
$$R = \{r_l = lr, \text{ where } l = \pm 1, \pm 2, \ldots\}, \tag{7.6.1}$$
where r is some constant minimum incremental payoff. For example, in the context of the stock market, r could be related to a price change of 1/8th of a point. Now suppose the stages, the entropy time increments, are made very short so that a rise of several r's of magnitude in physical time could be represented in entropy time as a sequence of several stages. In other words, if a stock price rose 1 point in a day's trading, in entropy time eight stages could have passed during this day's trading (assuming that r = 1/8th point). The process in entropy time could be at the least a sequence of eight increases of 1/8th point each.

An investment process formulated in this way can be seen as a form of the classical random walk process. Bachelier⁷ introduced the notion that investment price statistics were probably the result of a random walk process. He also noticed the connection between random walk processes and the diffusion of heat in matter. This observation led to the modern thermodynamic theory and to the study of the Einstein-Wiener processes. Unfortunately, the economic aspect of Bachelier's work has been largely unexploited. Recently modern statistical methods have been applied in the analysis of investment price fluctuations. These studies have uncovered evidence that the investment price mechanism is the result of a random walk process.⁸⁻¹⁰

In the classical random walk model a particle is permitted to advance or fall back by a unit distance which is a random variable generated by a sequence of independent Bernoulli trials. Using this approach, the complex payoff process can be treated in the same manner as in the previous section except for a tighter definition of entropy time which we shall call the decomposition postulate. (Note: we let both $r_1$ and $r_2$ in the previous process equal r in this process. Thus, if r = 1/8, then $r_1$ = 1/8 and $r_2$ = 1/8.) Suppose that the physical time process is a Poisson process.
The probability of a rise in the price of a security of $l$ points in one time period would be given by
$$\Pr(R = r_l) = \frac{e^{-\lambda} \lambda^{l}}{l!}, \tag{7.6.2}$$
where $\lambda$ is the Poisson parameter.
Setting $l$ equal to 1, we have
$$\Pr(R = r_1) = \Pr(R = r) = e^{-\lambda} \lambda. \tag{7.6.3}$$
Now we know that the probability of one payoff of r points, event 1 in the entropy based process, has been given as $p_1$. Substitution of this fact into Eq. (7.6.3) and then Eq. (7.6.3) into Eq. (7.6.2) gives us the relation between the probabilities of both time bases. We have
$$\Pr(R = r_l) = \frac{p_1 \lambda^{(l-1)}}{l!}, \qquad l = 1, 2, \ldots. \tag{7.6.4}$$
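The substitution leading to Eq. (7.6.4) can be confirmed numerically. The sketch below takes the Poisson form $e^{-\lambda}\lambda^{l}/l!$ for Eq. (7.6.2) and sets $p_1 = e^{-\lambda}\lambda$ as in Eq. (7.6.3); the value of $\lambda$ and the function names are hypothetical.

```python
from math import exp, factorial


def poisson_rise(l, lam):
    """Probability of a rise of l increments in one clock period (Poisson form)."""
    return exp(-lam) * lam**l / factorial(l)


def rise_from_p1(l, lam):
    """The same probability written through p1 = exp(-lam) * lam."""
    p1 = poisson_rise(1, lam)
    return p1 * lam ** (l - 1) / factorial(l)


lam = 0.5  # hypothetical Poisson parameter
for l in range(1, 6):
    print(l, poisson_rise(l, lam), rise_from_p1(l, lam))
```

The two columns coincide for every $l$, as the algebra requires.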
This probability function tells us that there is a chance that many entropy time periods (and thus $l$ decision stages) may occur in one unit of clock time.

One might ask why decisions should not come at a constant clock time rate instead of a constant entropy time rate. Actually, the assumption of a one-to-one correspondence between clock time and the decision rate is not realistic. Consider an investor who is trading a speculative, widely fluctuating stock. If he is to increase his capital gains over, say, a month of clock time, he will have to observe the price fluctuations of this stock on an hour-to-hour basis. Thus, there may be about 160 potential decision stages (entropy time periods) for this investor in one month of clock time. On the other hand, suppose another investor purchases a conservative blue chip stock. He may only have to check its price once a week to assure the safety of his investment. During the month the first investor has required 160 potential decision stages and the second investor has only required 4 stages. From this example we see that the rate of passage of decision stages, and thus entropy time periods, actually increases and decreases with respect to clock time in realistic investment processes, and depends on the rate of transmission of information.

What are the limitations of the decomposition postulate? The essential problem which we must face is illustrated by the above example. Suppose that the investor who trades the speculative stock finds it necessary to stay each day at the stock ticker, watching minute-to-minute changes in the price of the stock. Now suppose that the general market is in a panic and all stocks are falling rapidly. The stock ticker is the only information transmission medium, and if it falls behind the quotations in New York our investor will find himself subjected to considerable risk.
Events will be occurring at a faster rate than the stock ticker's maximum information transmission rate. In this case the decomposition
postulate will not hold since a one-to-one relation between decisions and events cannot be maintained.

If we can be assured that the rate of information transmission within the economic system is not constrained, we could by the use of the decomposition postulate reduce the multivalued payoff processes to the simple investment model with a positive and a negative payoff by a suitable time transformation relation. Furthermore, if the probability relations between the clock time process and the entropy time process are known, they can be used to determine the maximum expected decision rate with respect to clock time. It is possible then to design an optimal economic information transmission system which will not limit the decision rates of the decision makers. The importance of adequate communication facilities in a modern economy must not be underestimated.

References
1. Bernoulli, D., Exposition of a new theory on the measurement of risk (Transl. by Louise Sommer), Econometrica 22, No. 1, 23-36 (1954).
2. Arrow, K., Alternative approaches to the theory of choice in risk-taking situations, Econometrica 19, No. 4, 404-435 (1951).
3. Tobin, J., Liquidity preference as a behavior toward risk, Rev. Economic Studies 25 (2), No. 67, 65-86 (1958).
4. Markowitz, H. M., "Portfolio Selection." Wiley, New York, 1959.
5. Kaldor, N., The equilibrium of the firm, Economic J. 44, 60-76 (1934).
6. Wald, A., "Sequential Analysis," pp. 93-94. Wiley, New York, 1947.
7. Bachelier, L., "Le Jeu, la Chance, et le Hasard." Flammarion, Paris, 1914.
8. Osborne, M. F. M., Brownian motion in the stock market, Operations Res. 7, No. 2, 145-173 (1959).
9. Borch, K., "Price Movements in the Stock Market," Economic Research Program No. 7, April 30, 1963. Princeton Univ., Princeton, New Jersey.
10. Granger, C. W. J., and Morgenstern, O., Spectral Analysis of New York Stock Market Prices, KYKLOS 16, No. 1, 1-27 (1963).
CHAPTER 8
MULTIACTIVITY CAPITAL ALLOCATION PROCESSES

The machines of nature . . . are still machines in their smallest parts ad infinitum.
Leibniz, Monadology

8.1 Introduction
In the investment process considered in the last section, "investment" was considered as an activity in which there was a possibility of capital gains or capital losses. The constant possibility of a loss created the necessity for the decision maker to hold some of his assets in the form of cash. Actually, however, this single investment activity is composed of a whole array of investment opportunities or activities. The decision maker has some choice over his participation in this array of activities; some he rejects outright; others he finds attractive and profitable. We would like to extend our analysis to this kind of multiactivity investment process. This task is not as easy as might be expected. In multiactivity processes there is a possibility of interaction between the payoff probabilities of the activities. Interactions between the payoff probabilities of the investment activities could be handled by the assumption of a Markovian environmental stochastic process which generates the probability of the payoff events. For example, in such a model the probability of the payoff of any one activity falling (rising), given that in the previous stages many of the other activities fell (rose), would be very high. In an adaptive version of such a process the decision makers would have to estimate the Markovian probability transition matrix to determine their optimal policies. Because of the budget constraint (the allocations of capital to the investment activities and cash must equal the whole amount of capital at each stage) and the positivity of the portfolio ratios, a difficult problem in nonlinear programming would
result, and we would be limited to numerical solutions. Such problems are beyond the scope of this book. Instead of tackling the general problem, we shall consider a simplified version of the multiactivity capital investment process which has theoretical significance. We shall make the investment activities noninteracting, and we shall limit the set of possible activities to those having only positive payoff events. Thus, each investment activity will have a positive expected payoff. This kind of multi-activity process will be called a capital allocation process.
8.2 The Adaptive Capital Allocation Process

Suppose we have an ordered set, M, consisting of $\bar{m}$ possible payoffs, one for each of $\bar{m}$ investment activities denoted by the index numbers $i \in M$. Suppose further that these payoffs are all positive. Based on entropy time, where one payoff event occurs at each stage of the process, we can write the kth decision maker's subjective expected rate of capital growth at stage T as
$$g_{kT} = \frac{1}{T}\,\mathcal{E}'\left[\ln \frac{K_{kT}}{K_{k0}}\right] = \frac{1}{T}\,\mathcal{E}'\left[\sum_{t=1}^{T} \ln \frac{K_{kt}}{K_{kt-1}}\right] = \frac{1}{T} \sum_{t=1}^{T} \mathcal{E}'\left[\ln \frac{K_{kt}}{K_{kt-1}}\right] \tag{8.2.1}$$
where $g_{kT}$ is the kth decision maker's subjective expected rate of capital growth for a T-stage process, and $K_{kt}$ is the kth decision maker's capital at stage t. The decision maker must maximize $g_{kT}$ by determining a sequence of vectors $A_{kt}$, one for each stage of the process, such that
$$A_{kt} = \Big\{a_{kit} : \sum_{i \in M} a_{kit} \le 1,\; a_{kit} \ge 0 \text{ for each } i \in M\Big\} \quad \text{for each } t = 1, 2, \ldots, T. \tag{8.2.2}$$
The vectors $A_{kt}$ are the policy vectors which consist of the capital allocation ratios (portfolio ratios) for each stage of the process. Now the decision maker must maximize $g_{kT}$ stage by stage. This is because the information he needs to form the subjective probability function is revealed stage by stage as the payoff events occur. Nevertheless, at each stage the decision maker must consider the future effect of each current decision. For this reason we shall use again the dynamic programming technique. We shall maximize $g_{kT}$ stage by stage, starting at the Tth stage, the last stage of the process.
We have, for the subjective expectation of the rate of capital growth for the kth decision maker and for the last, Tth, stage of the process,
$$\mathcal{E}'[\ln K_{kT}] - \ln K_{kT-1} = \sum_{i \in M} q_{kiT} \ln(1 + a_{kiT} r_i) \tag{8.2.3}$$
where $r_i$ is the positive payoff generated by the ith event, and $q_{kiT}$ is the subjective probability of the ith event for the kth decision maker at the beginning of stage T. We have that
$$\sum_{i \in M} q_{kiT} = 1 \quad \text{and} \quad q_{kiT} \ge 0 \text{ for each } i \in M.$$
Putting Eq. (8.2.3) in the dynamic programming form, we have
$$f_1(K_{kT-1}) = \max_{a_{kiT} \in A_{kT}} \Big\{\ln K_{kT-1} + \sum_{i \in M} q_{kiT} \ln(1 + a_{kiT} r_i)\Big\}, \tag{8.2.4}$$
subject to the constraints (8.2.2). We find this maximum value by use of the Kuhn-Tucker theorem.¹ First, from the ordered set M we form two subsets. The first, designated as $M^+_{kT}$, contains all the indexes of the portfolio ratios which are positive. The second subset, designated as $M^-_{kT}$, contains the indexes of the rest of the portfolio ratios, which are zero. The particular elements, or numbers of elements, in subsets $M^+_{kT}$ and $M^-_{kT}$ have yet to be determined, so we shall assign an arbitrary number, $m_{kT}$, of the $\bar{m}$ ratios to be in the subset $M^+_{kT}$. Since the function $g_{kT}$ is continuous, possesses continuous partial derivatives and is concave, the Kuhn-Tucker theorem in this application recognizes the two following conditions:

(i) The condition $\sum_{i \in M} a^*_{kiT} < 1$ (the kth decision maker holds cash), which implies
$$\frac{\partial g_{kT}}{\partial a_{kiT}}\bigg|_{a^*_{kiT}} = 0 \quad \text{for } i \in M^+_{kT} \tag{8.2.5}$$
and
$$\frac{\partial g_{kT}}{\partial a_{kiT}}\bigg|_{a^*_{kiT}} \le 0 \quad \text{for } i \in M^-_{kT}; \tag{8.2.6}$$
and

(ii) The condition $\sum_{i \in M} a^*_{kiT} = 1$ (the kth decision maker holds no cash), which implies
$$\frac{\partial g_{kT}}{\partial a_{kiT}}\bigg|_{a^*_{kiT}} - \lambda_{kT} = 0 \quad \text{for } i \in M^+_{kT} \tag{8.2.7}$$
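and
$$\frac{\partial g_{kT}}{\partial a_{kiT}}\bigg|_{a^*_{kiT}} - \lambda_{kT} \le 0 \quad \text{for } i \in M^-_{kT}. \tag{8.2.8}$$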
We note that $\lambda_{kT}$ is to be a nonnegative number, and
$$\frac{\partial g_{kT}}{\partial a_{kiT}}\bigg|_{a^*_{kiT}} = \frac{q_{kiT} r_i}{1 + a^*_{kiT} r_i} \quad \text{for } i \in M. \tag{8.2.9}$$
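In this model, Condition (i) is not possible (i.e., the kth decision maker never holds cash) since it can be seen from Eq. (8.2.9) and the restrictions on $r_i$ that $\partial g_{kT}/\partial a_{kiT}$ is always positive, and thus Eq. (8.2.5) and Eq. (8.2.6) cannot hold. Summing Eq. (8.2.7) over the subset $M^+_{kT}$, using Eq. (8.2.9) and the fact that
$$\sum_{i \in M^+_{kT}} a^*_{kiT} = 1, \tag{8.2.10}$$
we find as a value for $\lambda_{kT}$ that
$$\lambda_{kT} = \frac{\sum_{j \in M^+_{kT}} q_{kjT}}{1 + \sum_{j \in M^+_{kT}} 1/r_j}. \tag{8.2.11}$$
Substituting this value for $\lambda_{kT}$ into Eq. (8.2.7) and using Eq. (8.2.9), we find that
$$a^*_{kiT} = \frac{q_{kiT}}{\sum_{j \in M^+_{kT}} q_{kjT}} \Big[1 + \sum_{j \in M^+_{kT}} \frac{1}{r_j}\Big] - \frac{1}{r_i} \quad \text{for } i \in M^+_{kT}. \tag{8.2.12}$$
From Eq. (8.2.8) we find that
$$q_{kiT} r_i - \frac{\sum_{j \in M^+_{kT}} q_{kjT}}{1 + \sum_{j \in M^+_{kT}} 1/r_j} \le 0 \quad \text{for } i \in M^-_{kT}. \tag{8.2.13}$$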
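Since $a^*_{kiT} > 0$ for $i \in M^+_{kT}$, we can write Eqs. (8.2.12) and (8.2.13) as
$$q_{kiT} r_i > \frac{\sum_{j \in M^+_{kT}} q_{kjT}}{1 + \sum_{j \in M^+_{kT}} 1/r_j} \quad \text{for } i \in M^+_{kT} \tag{8.2.14}$$
and
$$q_{kiT} r_i \le \frac{\sum_{j \in M^+_{kT}} q_{kjT}}{1 + \sum_{j \in M^+_{kT}} 1/r_j} \quad \text{for } i \in M^-_{kT}. \tag{8.2.15}$$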
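We now order the $q_{kiT} r_i$'s of subset $M^+_{kT}$ in order of descending magnitude, starting with $q_{k1T} r_1$ and ending with $q_{kmT} r_m$. Also the subset of events $M^-_{kT}$ is ordered such that $q_{k,m+1,T}\, r_{m+1}$ is the largest and $q_{k\bar{m}T}\, r_{\bar{m}}$ is the smallest of the $q_{kiT} r_i$'s in this subset. We take in particular the smallest $q_{kiT} r_i$ in subset $M^+_{kT}$, i.e., $q_{kmT} r_m$, and the largest $q_{kiT} r_i$ in subset $M^-_{kT}$, that is, $q_{k,m+1,T}\, r_{m+1}$. For these $q_{kiT} r_i$'s we have from Eqs. (8.2.14) and (8.2.15) that
$$q_{k,m+1,T}\, r_{m+1} \le \frac{\sum_{j=1}^{m} q_{kjT}}{1 + \sum_{j=1}^{m} 1/r_j} < q_{kmT}\, r_m. \tag{8.2.16}$$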
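We may substitute, one at a time, all the permissible values of m ($m = 1, 2, \ldots, \bar{m}$) into Eq. (8.2.16) until the inequality holds. The value of m which satisfies Eq. (8.2.16) is denoted $m_{kT}$ and is the lower limit of the ordered subset $M^+_{kT}$. Using this method to find $m_{kT}$ we may rewrite Eq. (8.2.12) as
$$a^*_{kiT} = \frac{q_{kiT}}{\sigma_{kT}} \Big[1 + \sum_{j \in M^+_{kT}} \frac{1}{r_j}\Big] - \frac{1}{r_i} \quad \text{for } i \in M^+_{kT}. \tag{8.2.17}$$
We define
$$p_{kiT} = \frac{q_{kiT}}{\sigma_{kT}} \quad \text{for } i \in M^+_{kT}, \qquad \text{where} \quad \sigma_{kT} = \sum_{i \in M^+_{kT}} q_{kiT}. \tag{8.2.18}$$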
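The search for $m_{kT}$ and the resulting allocation can be sketched in a few lines. The code assumes the threshold test "largest excluded $q r$ is at most $\sum q / (1 + \sum 1/r)$, which is below the smallest included $q r$" and the allocation $a^*_i = p_i (1 + \sum 1/r_j) - 1/r_i$; these forms follow from the Kuhn-Tucker conditions as read here, and the function name and the numerical data are hypothetical.

```python
def optimal_allocation(q, r):
    """Return (allocation, active_indexes) for probabilities q and payoffs r > 0."""
    # Rank activities by subjective expected payoff q[i] * r[i], largest first.
    order = sorted(range(len(q)), key=lambda i: q[i] * r[i], reverse=True)
    active = []
    for m in range(1, len(order) + 1):
        candidate = order[:m]
        level = sum(q[i] for i in candidate) / (1.0 + sum(1.0 / r[i] for i in candidate))
        smallest_in = q[order[m - 1]] * r[order[m - 1]]
        largest_out = q[order[m]] * r[order[m]] if m < len(order) else 0.0
        if largest_out <= level < smallest_in:
            active = candidate
            break
    sigma = sum(q[i] for i in active)                 # threshold coefficient
    scale = 1.0 + sum(1.0 / r[i] for i in active)
    allocation = [0.0] * len(q)
    for i in active:
        allocation[i] = (q[i] / sigma) * scale - 1.0 / r[i]
    return allocation, active


q = [0.5, 0.3, 0.2]   # hypothetical subjective probabilities
r = [1.0, 2.0, 0.5]   # hypothetical positive payoffs
a, active = optimal_allocation(q, r)
print(a, sorted(active), sum(a))
```

In this example the third activity falls below the threshold and receives nothing, while the ratios of the two retained activities sum to one, so no cash is held.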
The conditional probability, $p_{kiT}$, is defined on the set of those events which actually contribute to the growth of capital. Because of the maximizing behavior of the decision maker, the remaining events contribute nothing to capital growth and are classed along with the "no change" events. We excluded these "no change" events in previous processes when the assumption of an entropy time base was made. Excluding the "no change" events, and basing the expectation of the rate of capital growth only on the subset of events which leads to capital growth for the decision maker, we can determine the maximum expected rate of growth of capital for the Tth stage. Since we have an expression for the optimal allocation vector, $A^*_{kT}$, we can substitute this expression back into Eq. (8.2.3) to find the optimal expected rate of capital growth for the Tth stage from the decision maker's point of view. We have
$$\mathcal{E}^*[\ln K_{kT}] = \ln K_{kT-1} + \sum_{i \in M} q_{kiT} \ln(1 + a^*_{kiT} r_i). \tag{8.2.19}$$
Since the log terms in Eq. (8.2.19) are zero for $i \in M^-_{kT}$, we can write Eq. (8.2.19) as
$$\mathcal{E}^*[\ln K_{kT}] = \ln K_{kT-1} + \sum_{i \in M^+_{kT}} q_{kiT} \ln(1 + a^*_{kiT} r_i). \tag{8.2.20}$$
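Substituting Eq. (8.2.12) for $a^*_{kiT}$ in Eq. (8.2.20) gives us
$$\mathcal{E}^*[\ln K_{kT}] = \ln K_{kT-1} + \sum_{i \in M^+_{kT}} q_{kiT} \ln q_{kiT} + \sum_{i \in M^+_{kT}} q_{kiT} \ln r_i + \sigma_{kT} \ln\Big(1 + \sum_{j \in M^+_{kT}} \frac{1}{r_j}\Big) - \sigma_{kT} \ln \sigma_{kT}. \tag{8.2.21}$$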
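Using Eq. (8.2.18), we can simplify Eq. (8.2.21) to
$$\mathcal{E}^*[\ln K_{kT}] = \ln K_{kT-1} + \sigma_{kT} \Big\{\sum_{i \in M^+_{kT}} p_{kiT} \ln p_{kiT} + \ln \sigma_{kT} + \sum_{i \in M^+_{kT}} p_{kiT} \ln r_i + \ln\Big(1 + \sum_{j \in M^+_{kT}} \frac{1}{r_j}\Big) - \ln \sigma_{kT}\Big\}. \tag{8.2.22}$$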
Cancelling terms, introducing notation similar to that used in Section 7.4 and writing in dynamic programming context, we have
$$f_1(K_{kT-1}) = \ln K_{kT-1} + \sigma_k(R_T \mid R_{T-1} \cdots R_1)\{H^*_k(R_T \mid R_{T-1} \cdots R_1) - H_k(R_T \mid R_{T-1} \cdots R_1) + Y_k(R_T \mid R_{T-1} \cdots R_1)\} \tag{8.2.23}$$
where we define
$$\sigma_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{i \in M^+_{kT}} q_{kiT}, \tag{8.2.24}$$
$$H^*_k(R_T \mid R_{T-1} \cdots R_1) = \ln m^+_{kT}, \tag{8.2.25}$$
$$H_k(R_T \mid R_{T-1} \cdots R_1) = -\sum_{i \in M^+_{kT}} p_{kiT} \ln p_{kiT}, \tag{8.2.26}$$
and
$$Y_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{i \in M^+_{kT}} p_{kiT} \ln r_i + \ln\Big(1 + \sum_{i \in M^+_{kT}} \frac{1}{r_i}\Big) - \ln m^+_{kT}. \tag{8.2.27}$$
We have now prepared the way for the consideration of a two-stage process starting with stage T - 1. We have from the decision maker's point of view (8.2.28) where the last terms in the summation are the expectations of growth in the next stage (stage T) from the decision maker's point of view and conditioned on the effect of the current event, i.e., $z_{T-1}$, on the next stage. We have for these terms
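$$\sigma_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i} \Big\{H^*_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i} - H_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i} + Y_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i}\Big\}. \tag{8.2.29}$$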
Substituting Eq. (8.2.29) into Eq. (8.2.28) we have
$$f_2(K_{kT-2}) = \ln K_{kT-2} + \sigma_k(R_T \mid R_{T-1} \cdots R_1)\{H^*_k(R_T \mid R_{T-1} \cdots R_1) - H_k(R_T \mid R_{T-1} \cdots R_1) + Y_k(R_T \mid R_{T-1} \cdots R_1)\} + \max_{a_{kiT-1} \in A_{kT-1}} \Big\{\sum_{i \in M} q_{kiT-1} \ln(1 + a_{kiT-1} r_i)\Big\}, \tag{8.2.30}$$
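where we define
$$\sigma_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{i \in M} q_{kiT-1}\, \sigma_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i}, \tag{8.2.31}$$
$$H^*_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{i \in M} q_{kiT-1}\, H^*_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i}, \tag{8.2.32}$$
$$H_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{i \in M} q_{kiT-1}\, H_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i}, \tag{8.2.33}$$
and
$$Y_k(R_T \mid R_{T-1} \cdots R_1) = \sum_{i \in M} q_{kiT-1}\, Y_k(R_T \mid R_{T-1} \cdots R_1)\big|_{z_{T-1}=r_i}. \tag{8.2.34}$$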
The term of Eq. (8.2.30) in braces can be maximized in the same way as the maximization of the growth in the Tth stage. We have
$$\max_{a_{kiT-1} \in A_{kT-1}} \Big\{\sum_{i \in M} q_{kiT-1} \ln(1 + a_{kiT-1} r_i)\Big\} = \sigma_k(R_{T-1} \mid R_{T-2} \cdots R_1)\{H^*_k(R_{T-1} \mid R_{T-2} \cdots R_1) - H_k(R_{T-1} \mid R_{T-2} \cdots R_1) + Y_k(R_{T-1} \mid R_{T-2} \cdots R_1)\}. \tag{8.2.35}$$
Substituting Eq. (8.2.35) into Eq. (8.2.30) we have finally
$$f_2(K_{kT-2}) = \ln K_{kT-2} + \sigma_k(R_T \mid R_{T-1} \cdots R_1)\{H^*_k(R_T \mid R_{T-1} \cdots R_1) - H_k(R_T \mid R_{T-1} \cdots R_1) + Y_k(R_T \mid R_{T-1} \cdots R_1)\} + \sigma_k(R_{T-1} \mid R_{T-2} \cdots R_1)\{H^*_k(R_{T-1} \mid R_{T-2} \cdots R_1) - H_k(R_{T-1} \mid R_{T-2} \cdots R_1) + Y_k(R_{T-1} \mid R_{T-2} \cdots R_1)\}. \tag{8.2.36}$$
Equation (8.2.36) is composed of terms, for example the subjective entropy terms, which are effects from the risk and uncertainty of the
current, (T - 1)th, and the Tth stage. For example, the subjective entropy for the Tth stage is conditioned on the event which takes place in the (T - 1)th stage. The subjective entropy of the (T - 1)th stage is not conditioned because there is no stage preceding it for a two-stage process. Furthermore, the subjective entropy at stage T - 1 can never be conditioned by the future events of the Tth stage; therefore, it stands by itself. We could go through this maximization process for a three-stage process and so forth. However, we will sum up the results with a general solution for the tth stage of a (T - t + 1)-stage process in the following general statement.

If it is true that:
(1) $r_i > 0$,
(2) $Q_{kt} = \{q_{kit} : \sum_{i \in M} q_{kit} = 1,\ q_{kit} \ge 0 \text{ for all } i \in M\}$ for each $t = 1, 2, \ldots, T$,
(3) $g_{kt}$ is a continuous concave function which possesses continuous first and second derivatives with respect to $a_{kit} \in A_{kt}$, and
(4) the principle of optimality holds;

then:

(i) for the tth stage (from the point of view of the kth decision maker) we have that
$$f_{T-t+1}(K_{kt-1}) = \ln K_{kt-1} + \sum_{\tau=t}^{T} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) H^*_k(R_\tau \mid R_{\tau-1} \cdots R_1) - \sum_{\tau=t}^{T} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) H_k(R_\tau \mid R_{\tau-1} \cdots R_1) + \sum_{\tau=t}^{T} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) Y_k(R_\tau \mid R_{\tau-1} \cdots R_1); \tag{8.2.37}$$
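also, (ii) the optimal capital allocation vector for the tth stage and the kth decision maker is composed of elements given by
$$a^*_{kit} = p_{kit} \Big(1 + \sum_{j \in M^+_{kt}} \frac{1}{r_j}\Big) - \frac{1}{r_i} \quad \text{for } i \in M^+_{kt}, \tag{8.2.38}$$
and
$$a^*_{kit} = 0 \quad \text{for } i \in M^-_{kt}; \tag{8.2.39}$$
furthermore,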
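(iii) the maximum expected rate of growth of capital for a T-stage process from the market point of view is given by
$$g^*_{kT} = \frac{1}{T} \sum_{\tau=1}^{T} \bar{\sigma}_k(R_\tau \mid R_{\tau-1} \cdots R_1) H^*_k(R_\tau \mid R_{\tau-1} \cdots R_1) - \frac{1}{T} \sum_{\tau=1}^{T} \bar{\sigma}_k(R_\tau \mid R_{\tau-1} \cdots R_1) \bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1) + \frac{1}{T} \sum_{\tau=1}^{T} \bar{\sigma}_k(R_\tau \mid R_{\tau-1} \cdots R_1) \bar{Y}_k(R_\tau \mid R_{\tau-1} \cdots R_1). \tag{8.2.40}$$

In the above equations, we define:
$$\sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) = \sum_{i \in M^+_{k\tau}} q_{ki\tau}, \tag{8.2.41}$$
$$H^*_k(R_\tau \mid R_{\tau-1} \cdots R_1) = \ln m^+_{k\tau}, \tag{8.2.42}$$
$$H_k(R_\tau \mid R_{\tau-1} \cdots R_1) = -\sum_{i \in M^+_{k\tau}} p_{ki\tau} \ln p_{ki\tau}, \tag{8.2.43}$$
$$Y_k(R_\tau \mid R_{\tau-1} \cdots R_1) = \sum_{i \in M^+_{k\tau}} p_{ki\tau} \ln r_i + \ln\Big(1 + \sum_{i \in M^+_{k\tau}} \frac{1}{r_i}\Big) - \ln m^+_{k\tau}, \tag{8.2.44}$$
$$\bar{\sigma}_k(R_\tau \mid R_{\tau-1} \cdots R_1) = \sum_{i \in M^+_{k\tau}} q_i, \tag{8.2.45}$$
$$\bar{H}_k(R_\tau \mid R_{\tau-1} \cdots R_1) = -\sum_{i \in M^+_{k\tau}} p_i \ln p_{ki\tau}, \tag{8.2.46}$$
and
$$\bar{Y}_k(R_\tau \mid R_{\tau-1} \cdots R_1) = \sum_{i \in M^+_{k\tau}} p_i \ln r_i + \ln\Big(1 + \sum_{i \in M^+_{k\tau}} \frac{1}{r_i}\Big) - \ln m^+_{k\tau}. \tag{8.2.47}$$

The proof of the solution for stage t of a (T - t + 1)-stage process is obtained by induction in the same way as shown in Section 7.3.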
The optimal capital allocation vectors are given for the T-stage process of the kth decision maker by Eqs. (8.2.38) and (8.2.39) for all t = 1,2, ..., T. Again we notice that the solution for the optimal capital allocation vectors is independent of the amount of capital the kth
decision maker possesses. This is of course the result of the type of objective function chosen for the process. We will find that the independence of the capital possessed and the capital allocation vectors is a useful property in the study of adaptive processes in state space in the next chapter.
8.3 The Properties of the Adaptive Capital Allocation Process in the Limit

We will assert without proof that the capital allocation process with full information (that is, the decision maker knows for certain the probability function of the environment) is characterized by the equations
$$G = \sigma(H^* - H + Y) \tag{8.3.1}$$
where
$$\sigma = \sum_{i \in M^+} q_i, \tag{8.3.2}$$
$$H^* = \ln m^+, \tag{8.3.3}$$
$$H = -\sum_{i \in M^+} p_i \ln p_i, \tag{8.3.4}$$
$$Y = \sum_{i \in M^+} p_i \ln r_i + \ln\Big(1 + \sum_{i \in M^+} \frac{1}{r_i}\Big) - \ln m^+, \tag{8.3.5}$$
and
$$p_i = \frac{q_i}{\sigma} \quad \text{for } i \in M^+. \tag{8.3.6}$$
In the above equations, the set $M^+$ and the number of elements of this set, $m^+$, are determined through the satisfaction of the inequality gained from the Kuhn-Tucker conditions for optimality. This is given by
$$q_{m+1} r_{m+1} \le \frac{\sum_{j=1}^{m} q_j}{1 + \sum_{j=1}^{m} 1/r_j} < q_m r_m, \tag{8.3.7}$$
where the $q_i r_i$'s are ordered in descending value. The m which satisfies this inequality becomes $m^+$, and the set of expected payoffs greater than and including the $m^+$th becomes the ordered set $M^+$.
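The decomposition $G = \sigma(H^* - H + Y)$ can be compared with the expected logarithmic growth computed directly from the optimal allocation. The sketch below assumes the threshold rule and the allocation $a^*_i = p_i(1 + \sum 1/r_j) - 1/r_i$ in the forms read from the Kuhn-Tucker conditions; the environment and the function names are hypothetical.

```python
from math import log


def active_set(q, r):
    """Indexes kept by the threshold rule, ranked by q[i] * r[i]."""
    order = sorted(range(len(q)), key=lambda i: q[i] * r[i], reverse=True)
    for m in range(1, len(order) + 1):
        kept = order[:m]
        level = sum(q[i] for i in kept) / (1.0 + sum(1.0 / r[i] for i in kept))
        # Stop when the next activity's expected payoff does not exceed the level.
        if m == len(order) or q[order[m]] * r[order[m]] <= level:
            return kept
    return order


def growth_terms(q, r):
    """Return (sigma * (H* - H + Y), direct expected log growth)."""
    kept = active_set(q, r)
    sigma = sum(q[i] for i in kept)
    p = {i: q[i] / sigma for i in kept}
    scale = 1.0 + sum(1.0 / r[i] for i in kept)
    h_star = log(len(kept))
    h = -sum(p[i] * log(p[i]) for i in kept)
    y = sum(p[i] * log(r[i]) for i in kept) + log(scale) - h_star
    direct = sum(q[i] * log(1.0 + (p[i] * scale - 1.0 / r[i]) * r[i]) for i in kept)
    return sigma * (h_star - h + y), direct


decomposed, direct = growth_terms([0.5, 0.3, 0.2], [1.0, 2.0, 0.5])
print(decomposed, direct)
```

Both numbers agree to rounding error, since the $\ln m^+$ terms in $H^*$ and $Y$ cancel and the remainder is the sum of $q_i \ln(1 + a^*_i r_i)$ over the retained activities.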
It should be made very clear that the above system of index numbers does not represent the same system of index numbers used at any stage within a process. This can be seen clearly if it is assumed that the initial subjective probability function possessed by the decision maker is seriously in error. In this case, the subjective expected payoffs will also be badly in error. Thus, the ordering of these expectations may be different from the ordering of the actual expectations. The critical value $m^+$ will also be different. The decision maker will be holding a very poor set of investments, and the rate of growth of capital will be low (the decision maker can be said to be confused).

What can be said about the adaptive process and the full certainty process when they seem to have little in common? Because of the consistency of the subjective probability function, a great deal may be said about the limit of the adaptive process as T increases to infinity. The consistency of the subjective probability function means that
$$\operatorname{Plim}_{T \to \infty} q_{kiT} = q_i \quad \text{for } i \in M. \tag{8.3.8}$$
It directly follows that (8.3.9)
and thus, by the examination of the inequality (8.2.16) for $m_{kT}$, it is seen that
$$\operatorname{Plim}_{T \to \infty} m_{kT} = m^+. \tag{8.3.10}$$
Therefore,
$$\operatorname{Plim}_{T \to \infty} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) = \sigma \quad \text{for each } \tau = 1, 2, \ldots, T, \tag{8.3.11}$$
and (8.3.12)
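With these equations, it is seen that
$$\operatorname{Plim}_{T \to \infty} H^*_k(R_\tau \mid R_{\tau-1} \cdots R_1) = H^*, \tag{8.3.13}$$
$$\operatorname{Plim}_{T \to \infty} H_k(R_\tau \mid R_{\tau-1} \cdots R_1) = H, \tag{8.3.14}$$
and
$$\operatorname{Plim}_{T \to \infty} Y_k(R_\tau \mid R_{\tau-1} \cdots R_1) = Y, \tag{8.3.15}$$
all for each $\tau = 1, 2, \ldots, T$.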
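These equations show that
$$\operatorname{Plim}_{T \to \infty} \Big[\frac{1}{T} \sum_{\tau=1}^{T} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) H^*_k(R_\tau \mid R_{\tau-1} \cdots R_1) - \frac{1}{T} \sum_{\tau=1}^{T} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) H_k(R_\tau \mid R_{\tau-1} \cdots R_1) + \frac{1}{T} \sum_{\tau=1}^{T} \sigma_k(R_\tau \mid R_{\tau-1} \cdots R_1) Y_k(R_\tau \mid R_{\tau-1} \cdots R_1)\Big] = \sigma(H^* - H + Y), \tag{8.3.16}$$
and thus
$$\operatorname{Plim}_{T \to \infty} g^*_{kT} = G. \tag{8.3.17}$$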
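The role of consistency in this limit can be illustrated without sampling noise. The sketch below assumes a subjective estimator of the form $q_{kiT} = (n_i + \alpha_i)/(T + \sum_j \alpha_j)$ and replaces the random counts $n_i$ by their idealized values $q_i T$; the estimator form, the environment, and the deliberately poor initial convictions are assumptions made for illustration.

```python
def subjective_estimate(counts, alpha):
    """Assumed estimator: (n_i + alpha_i) / (T + sum(alpha))."""
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]


q_true = [0.5, 0.3, 0.2]   # hypothetical environment
alpha = [1.0, 1.0, 8.0]    # hypothetical, badly mistaken initial convictions

errors = {}
for T in (10, 100, 1000, 10000):
    counts = [qi * T for qi in q_true]   # idealized frequencies, no sampling noise
    estimate = subjective_estimate(counts, alpha)
    errors[T] = max(abs(e - qi) for e, qi in zip(estimate, q_true))
    print(T, [round(e, 4) for e in estimate], round(errors[T], 4))
```

At T = 10 the estimates rank the third activity first, the confused state described above; by T = 10000 the largest error is below .001, so the subjective ordering and threshold coincide with those of the environment.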
The consistency property of the subjective probability function assures that as T increases, the actual expected rate of growth of capital with full information, and the subjective expected rate of growth with limited information, will converge in probability.

Consider now the variable $\sigma_k(\cdot)$. By examination of Eq. (8.2.41), it can be seen that
$$q_{k1t} \le \sigma_k(\cdot) \le 1. \tag{8.3.18}$$
The variable $\sigma_k(\cdot)$ equals 1 when $m_{kt} = \bar{m}$. In other words, $\sigma_k(\cdot)$ equals one when the decision maker invests in all the activities. When $\sigma_k(\cdot)$ equals $q_{k1t}$, the decision maker invests in one activity only, the one which is associated with the payoff, $q_{k1t} r_1$. The quantity $\sigma_k(\cdot)$ will be denoted as the subjective threshold coefficient. Furthermore, the quantity $\sigma$ as defined by Eq. (8.3.2) will be denoted as the natural threshold coefficient. It should be noted that there is a natural threshold for every environment; i.e., every set of probabilities, $q_i$, associated with values of $r_i$ for each $i \in M$. One of the tasks of an adaptive decision maker subjected to a given stochastic environment is to discover this threshold. Further, it is necessary for the decision maker to allocate his limited funds at each stage to investment in activities whose expected payoffs are superior to the threshold level.

The entropy of the environment has been defined on a conditional probability space. There is an advantage to be gained by restating this entropy based on the entire probability space. This entropy is given by
$$H_v = -\sum_{i \in M} q_i \ln q_i. \tag{8.3.19}$$
If the decision maker has a limiting threshold, $m^+$, as given in Eq. (8.3.10), the truncated entropy is given by
$$h_p = -\sum_{i \in M^+} q_i \ln q_i. \tag{8.3.20}$$
Because each term of the type found in Eq. (8.3.19) is always negative, and because there are more elements in set M than in set $M^+$, it can be seen that
$$h_p < H_v. \tag{8.3.21}$$
In the limit, the threshold reduces the system (environment plus decision maker) entropy. In general, this cannot be said. Since the decision maker could be highly confused (the subjective probability function may be very different from the actual probability function), he may seek to exclude those investments which could contribute the greatest capital growth. One of the values of the consistency property of the subjective probability function is that this confusion cannot last for a long time. Sooner or later, in all probability, the confusion would be reduced and the truncated entropy would fall below the actual entropy of the environment. The decision maker would be acting to reduce local entropy.

References
1. Kuhn, H. W. and Tucker, A. W., Nonlinear programming, in Proc. 2nd Berkeley Symp. on Mathematical Statistics and Probability, pp. 481-492. Univ. of California Press, Berkeley, California, 1951.
CHAPTER 9
ECONOMIC STATE SPACE

. . . it is especially needful to remember that economic problems are imperfectly presented when they are treated as problems of statical equilibrium, and not of organic growth.
Alfred Marshall, Principles of Economics

9.1 Introduction
Economic processes appear in all of our modern societies. They are so much a part of our daily lives that it is surprising that so little is known about them. Most economic analysis is concerned with the development of optimal policies in so-called "comparative static" situations, i.e., situations which arise during the state of deterministic equilibrium. Although the state of deterministic equilibrium is rarely experienced in real economic processes, these studies are useful since economic processes may always be tending to approach this terminal state.

For many years economists have been dissatisfied with their understanding of economic processes during the passage toward a state of deterministic equilibrium. Static optimal policies, although they might produce excellent results in a static world, may not be optimal during the transition from one state to another. The study of the economics of transition, often called dynamic economics, has been hindered by the limitations inherited from the study of static theories. The most unrealistic of these (in the dynamic context) is the assumption that all decision makers possess complete information concerning the rewards gained from their action, and the effects of the actions of all other decision makers. Much current economic analysis has been developed in the same way as one might develop a theory of the game Bridge, if all players had full knowledge of their opponents' hands. Such a Bridge theory could hardly improve the decisions made during an actual Bridge game, where the very essence of the game is the deduction of hidden information from the knowledge of revealed information. In the same way economic processes can be looked at as games against
opponents, where each opponent reveals his information discretely as the process unfolds. It is surprising that information theory has not become the major concern in the analysis of dynamic economic processes. This criticism of dynamic economic theory is really unfair. There have been few mathematical tools useful for the analysis of the role of information in dynamic economic processes until recently. One of these tools is the concept of economic state space.
9.2 The Concept of an Economic State Space

At any point in time, a decision maker in an economic system can be tagged with an ordered n-tuple which describes the magnitude of n economic variables. A function may be defined for each n-tuple which tells us how many decision makers bear this tag. We shall call this function the frequency distribution, and we shall call the tag the economic state. Consider, for example, the distribution of income in the United States. Each income is tagged by an income bracket. All the income earners within a bracket calculate their income tax by the same formula and thus have an element in common. For economic purposes, we are not concerned with the name of any person in a given bracket, but the number of persons in the bracket. Furthermore, the income earners may be tagged in other ways. Suppose we identify the occupation of each income earner by an index number. For example, we may have

1, for a farmer,
2, for an industrial worker,
3, for a professional, and
4, for a capitalist.
Now a person's tag contains two numbers: one telling us his income bracket, and the other his occupation. This tag is an ordered two-tuple. Given any ordered two-tuple, there is a number, given by the distribution function, which tells us how many income earners hold this tag. This two-tuple is an economic state, and the frequency distribution tells us how many persons are in this economic state. The notion of a state transition is defined by an ordered pair of ordered n-tuples. The first n-tuple denotes the state of origin or the a priori state, while the last n-tuple denotes the state of destination or
a posteriori state. A function may be defined on each kind of transition which has been called a transition probability in Section 2.7. The transition probability is a measure of the likelihood that a person in some given a priori state moves to some given a posteriori state. In deterministic processes this probability is given the value one. This means that each a posteriori state is rigidly determined by the preceding a priori state. For this reason, a decision maker's path from state to state in a deterministic process is completely determined once his initial state has been specified. We shall be mainly concerned with processes where some, or all, of the transition probabilities are less than one, or stochastic processes. Transition probabilities may also be zero. For example, if we admitted the age of an income earner as a variable which enters the state n-tuple, the probability of a transition from a higher age to a lower age would be zero, since time is not reversible. However, a forward transition in age would have a probability close to one (there is the possibility that the income earner may die before the next state is reached). We have spoken of the state variables as though there were denumerable sets of points, and therefore the state space was composed of a denumerable set of points. There is no reason why the concept cannot be extended to continuous state variables, with suitable conditions, so that a continuous path through state space can be described. In some cases this leads to an operational simplification of the mathematics of a process, and thus to a better understanding of the problem. However, in this case we shall be considering discrete economic events, which are suited to the more fundamental notion of a denumerable state space. We are concerned with two general economic state variables, resources and information. Much of economic theory is concerned with only the first of these, or in the allocation of resources. 
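The tagging scheme and the transition probabilities described above can be sketched in a few lines. This is only an illustration: the population, the age states, and every numerical probability below are hypothetical, and the helper name `transition_probability` is introduced here for convenience.

```python
from collections import Counter

# Hypothetical population: each tag is (income bracket, occupation index), with
# occupations 1 = farmer, 2 = industrial worker, 3 = professional, 4 = capitalist.
population = [(2, 1), (3, 2), (3, 2), (5, 3), (7, 4), (3, 2), (2, 1)]

# Frequency distribution: how many decision makers hold each economic state.
frequency = Counter(population)

# Hypothetical transition probabilities out of the age state 30 (illustrative
# numbers only). Pairs that are not listed have probability zero.
transition = {
    (30, 31): 0.998,      # forward transition in age, close to one
    (30, "dead"): 0.002,  # the income earner may die before the next state
}

def transition_probability(origin, destination):
    """Probability of moving from the a priori state to the a posteriori state."""
    return transition.get((origin, destination), 0.0)

print(frequency[(3, 2)])               # number of industrial workers in bracket 3
print(transition_probability(31, 30))  # a move to a lower age is impossible
```

A dictionary keyed by ordered pairs mirrors the definition of a transition as an ordered pair of n-tuples; a deterministic process would be the special case in which every listed probability equals one.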
We maintain that the allocation of resources is intimately connected with the state of information of the decision makers. While recognizing this essential connection, most economic theory has postulated that all decision makers possess full information concerning the implications of their decisions. This postulate results in the great interest in the notion of pure competition in economic theory and practice, where full information is one of the necessary assumptions. In stochastic economic processes the assumption of the full information postulate comes into direct conflict with the nature of the probabilistic postulates. The fundamental structure of a state space view of an economic process is determined by the structure of the state variables. Information
and resources have been chosen as state variables to represent an adaptive economic process. In the next sections we will develop a detailed statement as to what these variables are, and how they are determined for an adaptive capital allocation process.
9.3 Statistical Equilibrium and Enlightenment

A simplified version of the capital allocation process introduced in Chapter 8 will be developed, which will be useful in pointing out certain fundamental properties of adaptive processes in state space. Suppose we assume that the expected payoffs are large enough for each kind of event possible, so that the decision maker invests some finite amount of his resources in all the available projects. In other words, we shall assume that the threshold of investment is unity and constant over t, and thus that the process for decision maker k and stage T can be represented by simplified relations:
(9.3.1 ) where H*
=
in m,
Hk(R t I R t- 1
, ••• ,
R1 ) =
-
(9.3.2)
k
P(Rt)P(R t- 1 )
•••
P(R 1 )
RtRt_1···R 1
(9.3.3) and
Y= k Pi in T + in (1 + k 1fTj) i
in m.
(9.3.4)
iEM
iEM
Equation (9.3.1) possesses the same limiting properties as the similar Eq. (8.3.14) in Chapter 8. In other words, we have that (9.3.5)
where we define

H = -\sum_{i \in M} p_i \ln p_i. \qquad (9.3.6)
We can be more specific about this convergence. In terms of the simplified notation, we can express Eq. (9.3.3) for stage t as

H_k(R_t \mid R_{t-1}, \ldots, R_1) = -\sum_{i \in M} p_i \ln \hat{p}_{kit}. \qquad (9.3.7)
Suppose we find the minimum of H_k(\cdot) with respect to different values of \hat{p}_{kit} by the use of the Lagrangian equation given by (9.3.8)

Taking the partial derivatives of L with respect to each \hat{p}_{kit} and \lambda, then setting them equal to zero, we have one condition for each i \in M, together with the constraint that the \hat{p}_{kit} sum to one. Summing the first equations, we find

\sum_{i \in M} p_i = -\lambda^* \sum_{i \in M} \hat{p}^*_{kit}

or

\lambda^* = -1.

Therefore, we have

\hat{p}^*_{kit} = p_i \quad for each i \in M. \qquad (9.3.9)

Since the p_i's are positive, so also must be the \hat{p}^*_{kit}'s. The second partial derivatives of Eq. (9.3.8) with respect to each \hat{p}_{kit} are given for i, j \in M. From this we can say that H_k(\cdot) is a minimum when equal to H, and thus (9.3.10)
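The minimum property can be checked numerically. The sketch below, under assumed probabilities, evaluates the expression in Eq. (9.3.7) for many randomly drawn subjective distributions and confirms that none falls below the environmental entropy H of Eq. (9.3.6). The environment `p`, the random seed, and the helper names are hypothetical; a random search is used here in place of the Lagrangian argument and is not a proof.

```python
import math
import random

def subjective_entropy(p, p_hat):
    """-sum_i p_i ln p_hat_i, the form of Eq. (9.3.7)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, p_hat))

def random_distribution(m, rng):
    """A random, strictly positive probability vector of length m."""
    weights = [rng.random() + 1e-9 for _ in range(m)]
    total = sum(weights)
    return [w / total for w in weights]

p = [0.5, 0.3, 0.2]                  # hypothetical environment
H = subjective_entropy(p, p)         # equals Eq. (9.3.6) when p_hat = p

rng = random.Random(0)
trials = [subjective_entropy(p, random_distribution(len(p), rng))
          for _ in range(1000)]
smallest_gap = min(trials) - H       # stays non-negative: H is the lower bound
print(H, smallest_gap)
```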
This property means that the adaptive process reduces the entropy of the system (decision maker and environment) to the entropy of the environment as a lower bound. Furthermore, at this bound the subjective entropy must be constant; thus, the decision maker is in a state of statistical equilibrium with his environment.

Needless to say, in the above limit equations infinity is a long time away. For practical purposes the limit equations usually hold after what is termed a "reasonable" amount of time. Therefore, although statistical equilibrium for an adaptive process occurs after an infinite time, we shall say that a process is in the state of enlightenment after the passage of a "reasonable amount of time." This concept can be made more acceptable by the following definition. An adaptive process is in the state of enlightenment when the difference between the subjective entropy of the process and the actual entropy of the environmental process falls to within ±\delta of zero for at least T_f times. Although we must specify the error band, ±\delta, and the number T_f, we will see soon that the state of enlightenment, the practical analog of the state of statistical equilibrium, is useful in the analysis of the dynamic performance of actual adaptive processes.

Suppose we have T' stages of an adaptive allocation process when the decision maker, say the kth, is in the state of enlightenment. Since for all practical purposes the decision maker's subjective probability function is equal to the actual probability function of the environment, the optimal capital allocation vectors are identical for each stage. We have for each

i \in M and t' = 1, 2, \ldots, T'. \qquad (9.3.11)
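The definition of the state of enlightenment can be turned into a simple test on a recorded entropy series. The sketch below assumes that "at least T_f times" means T_f consecutive stages, which is one reading of the definition; the function name, the sample series, and the chosen band are hypothetical.

```python
def first_enlightened_stage(subjective, actual, delta, run_length):
    """Return the first stage from which |subjective - actual| <= delta holds
    for `run_length` consecutive stages, or None if that never happens."""
    run = 0
    for t, h in enumerate(subjective):
        run = run + 1 if abs(h - actual) <= delta else 0
        if run >= run_length:
            return t - run_length + 1
    return None

# Hypothetical subjective entropy readings at stages 0, 1, 2, ...
series = [1.10, 0.95, 0.84, 0.79, 0.81, 0.80, 0.80, 0.80]
stage = first_enlightened_stage(series, actual=0.80, delta=0.02, run_length=3)
print(stage)
```

A narrower band or a longer required run delays the stage at which the process is declared enlightened, which is why both quantities must be specified in advance.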
Under these conditions we find that the expected rate of capital growth for T' stages during the state of enlightenment, and given K k O' at the beginning, is given by 1,
itT- ~ T [E*(ln K kT,) -In KkO'] =
I. Pi ln(l + a7ri)
(9.3.12)
iEM
where itT' is the optimal expected rate of growth of capital for T' stages of a state of enlightenment for the kth decision maker. Because the environment is a multinomial process, we can write T'Pi = iii and thus E*(ln K k To)
-
K k O'
=
I. iii In(1 + a7 rJ
iEM
(9.3.13) (9.3.14)
This equation will be useful in Section 9.7, where we consider certain properties of a closed economic adaptive system which lead to a value for the limiting environmental entropy which is very likely to occur in a competitive system.
9.4 The Subjective Entropy Trajectory

We know where the decision maker's subjective entropy starts, and where, in time, it must end. What can we say about the path of the decision maker's subjective entropy in between these limiting cases? Introducing the simplified notation

(9.4.1)

we note that from the analytical point of view the subjective entropy function is a random variable. This is because H_{kt} is a function of the random variables \hat{p}_{kit}, i \in M, which are in turn functions of the random variables n_{it}, i \in M. The path of H_{kt} as a function of entropy time is irregular, a characteristic of random variables; yet we desire to show that H_{kt} is a function which decreases over entropy time towards the environmental entropy H. Since we cannot show this directly, we shall introduce another function in place of the subjective entropy function which we shall denote as the subjective entropy trajectory. This trajectory, for the kth decision maker and the tth stage, is given by

\mathcal{H}_{kt} = -\sum_{i \in M} p_i \ln E[\hat{p}_{kit}], \qquad (9.4.2)

where from Chapter 4 we know that

E[\hat{p}_{kit}] = \frac{\alpha_{ki} + t p_i}{t + \sum_{j \in M} \alpha_{kj}} \quad for each i \in M. \qquad (9.4.3)

The trajectory is not the expectation of H_{kt} but is what statisticians would call the "certainty equivalent" of H_{kt}. Since it is not a random variable, we can use the trajectory in the analysis of the dynamic properties of adaptive processes.
We now examine the sign of the derivative of the subjective entropy trajectory to see if it is negative. Taking the derivative of Eq. (9.4.2), we have (9.4.4)

Since (9.4.5)

we have that

\sum_{i \in M} \frac{dE[\hat{p}_{kit}]}{dt} = 0. \qquad (9.4.6)

Because of Eq. (9.4.6) we know that at least one term is positive and the rest of the terms are negative. Let us assign the index number j to this positive term. Separating this jth term, we can write Eq. (9.4.4) as (9.4.7)

Furthermore, Eq. (9.4.6) can be written as

\sum_{i \in M, i \neq j} \frac{dE[\hat{p}_{kit}]}{dt} = -\frac{dE[\hat{p}_{kjt}]}{dt}. \qquad (9.4.8)

Substitution of Eq. (9.4.8) into Eq. (9.4.7) gives us

\frac{d\mathcal{H}_{kt}}{dt} = -\sum_{i \in M, i \neq j} \left[ \frac{p_i}{E[\hat{p}_{kit}]} - \frac{p_j}{E[\hat{p}_{kjt}]} \right] \frac{dE[\hat{p}_{kit}]}{dt}. \qquad (9.4.9)

Differentiation of Eq. (9.4.3) gives us, for each i \in M, (9.4.10)

From Eq. (9.4.6) and the definition of the jth term we can write

\frac{dE[\hat{p}_{kit}]}{dt} < 0 \quad for all i \neq j \qquad (9.4.11)

and

\frac{dE[\hat{p}_{kjt}]}{dt} > 0. \qquad (9.4.12)

Putting these facts into Eq. (9.4.10), we have that (9.4.13) and (9.4.14)

This means that Eq. (9.4.9) must be negative, and thus the subjective entropy trajectory will be a strictly monotonically decreasing function of entropy time. These characteristics are clearly shown from the results of a digitally simulated adaptive capital allocation process in Fig. 9.1. We now know that the subjective entropy drops to the entropy of the environment, in the simplified process. In Section 9.7 we shall see that there is a likelihood in a competitive system that the entropy of the environment may have a unique value which depends on the payoff structure and the total resources of the system.
FIG. 9.1. Variation of the subjective entropy trajectory for Decision Maker k. [Horizontal axis: entropy time periods (stages), t; the level H is marked on the vertical axis.]
9.5 The Dynamics of the Subjective Entropy Trajectory

From the results of the previous sections we know that the subjective entropy trajectory for the kth decision maker starts at some initial value given by (9.5.1)

and drops to a limiting value given by

H = -\sum_{i \in M} p_i \ln p_i, \qquad (9.5.2)

as shown in Fig. 9.1. Furthermore, at some intermediate stage, t, the trajectory is given by

\mathcal{H}_{kt} = -\sum_{i \in M} p_i \ln \frac{\alpha_{ki} + t p_i}{t + \sum_{j \in M} \alpha_{kj}}. \qquad (9.5.3)
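The behavior of the trajectory in Eq. (9.5.3) can be tabulated directly. In the sketch below the environmental probabilities and the initial weights (written here as `alpha`, one per event) are hypothetical; equal weights are assumed so that the trajectory starts at the maximum entropy ln m.

```python
import math

def trajectory(p, alpha, t):
    """Subjective entropy trajectory at stage t, in the form of Eq. (9.5.3)."""
    total = t + sum(alpha)
    return -sum(pi * math.log((ai + t * pi) / total)
                for pi, ai in zip(p, alpha))

p = [0.6, 0.3, 0.1]         # hypothetical environmental probabilities
alpha = [1.0, 1.0, 1.0]     # hypothetical initial weights (no prior preference)
H = -sum(pi * math.log(pi) for pi in p)   # limiting value, Eq. (9.5.2)

path = [trajectory(p, alpha, t) for t in range(71)]
print(path[0], path[-1], H)
```

With these values the path starts at ln 3, falls at every stage, and stays above H, which is the shape sketched in Fig. 9.1.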
Suppose we want to study the dynamic behavior of the kth decision maker in state space. We have a difference equation connecting the resource state, K_{k,t-1}, and the resource state, K_{kt}, given by (9.5.4)

The kth decision maker's subjective entropy trajectory, \mathcal{H}_{kt}, is denoted as the dynamic driving function of the system. Solution of Eq. (9.5.4) is difficult, but an approximate solution can be obtained by the substitution of an approximation for the subjective entropy trajectory given by (9.5.5)

where \epsilon, 0 < \epsilon < 1, is a parameter which is characteristic of each system. Fig. 9.2 shows an example of Eq. (9.5.3) and the approximation (9.5.5). Using Eq. (9.5.5) in Eq. (9.5.4), we have

E[\ln K_{k,t+1}] \approx E[\ln K_{kt}] + H^* - H - (H_{k0} - H)\epsilon^t + Y. \qquad (9.5.6)
FIG. 9.2. Approximation to the subjective entropy trajectory for Decision Maker k. [Horizontal axis: entropy time periods (stages), t; the levels H_{k0} and H are marked on the vertical axis.]
Equation (9.5.6) can easily be solved by the use of the z-transform theory.1 Excluding the details, the solution of Eq. (9.5.6) for stage t is given by

E[\ln K_{kt}] \approx \ln K_{k0} + t(H^* - H) - (1 - \epsilon^t) \frac{H_{k0} - H}{1 - \epsilon} + tY. \qquad (9.5.7)
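The closed form can be checked without z-transforms by stepping the difference equation directly. The sketch below iterates Eq. (9.5.6) and compares the result with Eq. (9.5.7); all parameter values are hypothetical and chosen only for illustration.

```python
def iterate_log_capital(ln_k0, h_star, h, h_k0, eps, y, stages):
    """Step Eq. (9.5.6) forward; return E[ln K] at stages 0..stages."""
    values = [ln_k0]
    for t in range(stages):
        values.append(values[-1] + h_star - h - (h_k0 - h) * eps**t + y)
    return values

def closed_form(ln_k0, h_star, h, h_k0, eps, y, t):
    """Eq. (9.5.7): the solution of the difference equation at stage t."""
    return (ln_k0 + t * (h_star - h)
            - (1 - eps**t) * (h_k0 - h) / (1 - eps) + t * y)

# Hypothetical parameters: the decision maker starts at maximum entropy.
ln_k0, h_star, h, h_k0, eps, y = 0.0, 1.0986, 0.90, 1.0986, 0.9, 0.05

stepped = iterate_log_capital(ln_k0, h_star, h, h_k0, eps, y, stages=50)
solved = [closed_form(ln_k0, h_star, h, h_k0, eps, y, t) for t in range(51)]
steady_state_loss = (h_k0 - h) / (1 - eps)   # limit of the loss term as t grows
print(max(abs(a - b) for a, b in zip(stepped, solved)), steady_state_loss)
```

The agreement follows from the geometric sum 1 + \epsilon + ... + \epsilon^{t-1} = (1 - \epsilon^t)/(1 - \epsilon), which is the only nontrivial step in the solution.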
Figure 9.3 shows this solution for the trajectory approximation in Fig. 9.2. Several interesting properties are shown by Fig. 9.3. These are as follows.

(a) If the entropy of the environment is maximum, i.e., H = H*, little growth in capital resources is possible; however, the decision maker can lose some more capital growth finding this fact out. We have

E[\ln K_{kt}] \approx \ln K_{k0} - (1 - \epsilon^t) \frac{H_{k0} - H^*}{1 - \epsilon} + tY, \qquad (9.5.8)

where

E[\ln K_{k0}] = \ln K_{k0}, \quad at t = 0,

and

E[\ln K_{k\infty}] = \ln K_{k0} - \frac{H_{k0} - H^*}{1 - \epsilon} + \infty, \quad at t = \infty.
FIG. 9.3. The trajectory of the resource state, E(\ln K_{kt}). [Horizontal axis: entropy time, stages.]
Note, however, that if the decision maker knows that the environmental entropy is maximum, his expected losses are zero. That is, H_{k0} = H*, so we would have for all t, (9.5.9)

(b) Suppose that the entropy of the environment is some value H, where 0 < H < H*. The decision maker will suffer irretrievable loss in adapting to the environment. Examination of Fig. 9.3 shows that only if the decision maker initially knows the entropy of the environment, i.e., H_{k0} = H, will he be able to increase his capital at the maximum rate. We have, under these conditions,

E[\ln K_{kt}] \approx \ln K_{k0} + t(H^* - H) + tY. \qquad (9.5.10)

Now if he does not know H (and this is the usual case), he will be able to increase his capital resources but will be subjected to an irretrievable loss. The transient portion of the loss is given by

(1 - \epsilon^t) \frac{H_{k0} - H}{1 - \epsilon}.
As t approaches infinity, the steady state portion of the loss is given by

\frac{H_{k0} - H}{1 - \epsilon}.
9.6 The Economic State Space Representation

We see from Section 9.2 that we could classify at any period of entropy time all the decision makers in an economic system, according to their amount of information; or, in other words, their current state of knowledge. We shall call this classification variable the information state. Thus in each information state we would find some number of decision makers who possess the same amount of information. In Chapter 5 we showed that entropy is a measure of information. In defining economic state space, we have introduced the notion of a state of information. There is a degree of confusion in the use of these words which must be eliminated before we proceed to consider trajectories in state space.

The entropy of a system, H, is a measure of the rate at which information is transmitted by the environment. The stock of information held by the kth decision maker at stage t is measured by the quantity H* - \mathcal{H}_{kt}. In other words, if he knows the parameters of the environmental probability function, the stock of information is maximum; on the other hand, if he assumes the Laplacian hypothesis where all environmental events are considered to be equally likely, the stock of information will be zero, i.e., H* = \mathcal{H}_{kt}. This stock of information will be defined as the state of information for the decision maker.

The entropy of the environment, H, is the asymptotic limit of all adaptive behavior trajectories in state space. The distance between the zero axis and the H limit is a measure of economic risk in the system. The distance between \mathcal{H}_{kt} and H is the measure of economic uncertainty of the kth decision maker at stage t. The risk is common to all decision makers in the economic system, but the uncertainty is a personal thing, a characteristic of each decision maker in the system.
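The three quantities just defined (information state, risk, and uncertainty) can be computed side by side. The sketch below is an illustration only: it uses the expression -\sum p_i \ln \hat{p}_i as a stand-in for the decision maker's entropy at a given stage, and the environment and the function name are hypothetical.

```python
import math

def information_measures(p, p_hat):
    """Return the information state, risk, and uncertainty for beliefs p_hat."""
    h_star = math.log(len(p))                        # maximum entropy H*
    h_env = -sum(pi * math.log(pi) for pi in p)      # environmental entropy H
    h_subj = -sum(pi * math.log(qi) for pi, qi in zip(p, p_hat))
    return {
        "information_state": h_star - h_subj,        # stock of information
        "risk": h_env,                               # common to all decision makers
        "uncertainty": h_subj - h_env,               # personal to this decision maker
    }

p = [0.6, 0.3, 0.1]                                  # hypothetical environment
laplacian = information_measures(p, [1 / 3] * 3)     # all events equally likely
informed = information_measures(p, p)                # beliefs match the environment
print(laplacian)
print(informed)
```

Under the Laplacian hypothesis the stock of information is zero and the uncertainty is H* - H; with correct beliefs the uncertainty vanishes and only the risk H remains.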
The entropy of the environment is a key parameter of any economic system since it marks off the stochastic nature of the economic process into levels of risk and uncertainty for each member of the system. How can the H of a system be measured? This is not an easy task because among other things it implies that we can state an exhaustive list of all possible events.
However, in a system in stochastic equilibrium there is a unique value of H which depends only on the payoff structure and the total wealth of the system. In Section 9.7 this notion will be expanded in detail. We might expect that each of the decision makers in a particular information state would act in some identifiable fashion if presented with an economic opportunity. Actually, however, there may be many feasible actions for each of these decision makers. In many economic systems there might be some optimal action for the decision makers in an information state. Decision makers who take this optimal action will be called rational decision makers. We could not expect all decision makers within one information state to react in an identical way to a common economic opportunity. The decision maker in a particular information state may have different economic resources (for example, capital). Surely a wealthy decision maker and a poor decision maker might not find the same action optimal when presented with a common economic opportunity, even though they possessed the same level of information. When decision makers have the same amount of economic resources, we shall say that they have the same resource state. Thus, we can classify each decision maker by his state of information and his resource state. Decision makers can move from one state description, or cell, to another as a result of certain events. They may move horizontally, so to speak, by increasing their level of information. They may move vertically, so to speak, by changing their level of economic resources. They may even move diagonally by events which change their economic and their information states. In each state description, or cell, there will be a continual influx and outflux of decision makers, depending on the flow of economic events. 
We will be interested in Chapter 10 in finding out at some time how many decision makers enter and leave a particular cell, and in the long run how many decision makers end up in a particular cell. To do so we shall have to generate the state space of a specific economic process, and make a number of simplifying assumptions. A decision maker who loses resources in some economic event may also gain information. (In Fig. 9.4, we see four kinds of state space transitions which could result in an investment market.) The dashed transitions in Fig. 9.4 are not necessarily excluded from theoretical consideration, since all these transitions imply that some unlearning has occurred due to the occurrence of an economic event. Such unlearning could be attributed to a "Machiavellian" environment which deliberately
FIG. 9.4. Possible transitions in state space. Quadrant I: resources and information increase. Quadrant II: resources increase and information decreases. Quadrant III: resources and information decrease. Quadrant IV: resources decrease and information increases. [Horizontal axis: information state, H* - \mathcal{H}_{kt}.]
tries to confuse decision makers, or the effort of a human opponent who attempts to confuse the decision makers by transmitting misinformation at each stage of the process. These processes represent a form of economic multiperson game and are interesting, but we shall not consider them here. We assume, therefore, that at each stage in the process an event occurs which always increases the measure of information. We must also some day come to grips with the human phenomenon of forgetting. Forgetting produces a backwards transition of the information state which is not necessarily accompanied by an economic event. Forgetting could be included in our model as a random event which causes a decrease in the information state which is not directly associated with an economic event, but we shall not include forgetfulness in our processes since this generally only tends to bias the results. Let us proceed to identify the information state with greater precision. In Chapter 4 we saw that each decision maker possesses a conviction vector which describes the subjective probabilistic belief on the condition that no information has yet been received. This conviction vector represents the initial conditions for the economic decision process. We shall define the initial information state of the decision process on the basis of this vector. Furthermore, we have seen that the observation of an event can modify the initial conviction vector, and thus alter the
initial information state. Each kind of observed event alters the initial information state in a different way, so that decision makers in a particular initial information state soon move to a new information state as they observe the events at each stage of the process. The simplified adaptive capital allocation process developed in Section 9.3 is useful in the development of an interpretation of resource state. In this context we will make the economic resource state the decision maker's expected capital. Suppose there is a denumerable set of values for the expectation of capital. Each of these represents an economic resource state, and can be identified by an index number, k, k E K. We shall assume that the set of resource states can be ordered, and that the magnitude of the index k represents the order; the smallest resource state will be designated as State 1. In a sense, the set K is a constraint on the system. In fact we can define an economic system by several restrictions, one of which is the largest member of the set K.
FIG. 9.5. Construction of a state space trajectory. [Axes: resource state (log scale); entropy time, stages; information state.]
Restricting our attention to the single decision maker moving from cell to cell in state space, we use Eqs. (9.5.7) and (9.5.5) to define the kth decision maker's trajectory. Fig. 9.5 shows how these trajectories may be graphically generated.† The equation for the trajectory is

E[\ln K_{kt}] \approx \ln K_{k0} + (H^* - H + Y) \frac{\ln[(\mathcal{H}_{kt} - H)/(H_{k0} - H)]}{\ln \epsilon} - \frac{H_{k0} - \mathcal{H}_{kt}}{1 - \epsilon}. \qquad (9.6.1)

Note that initially, because

Figure 9.6 shows the trajectory of three decision makers who start with the same amount of information, but with different amounts of capital. Since the rate of expected growth of capital does not depend on the amount of capital held, each trajectory has identical curvature, but is displaced vertically. Consider, on the other hand, the trajectories shown in Fig. 9.7. Here the decision makers possess the same amount of initial capital, but start with different initial information; that is, start with different conviction vectors.

FIG. 9.6. Three investors: different initial capital, same initial information state. [Horizontal axis: information state, H* - \mathcal{H}_{kt}; vertical axis: resource state, with initial values such as \ln K_{20} and \ln K_{30}; the states of full information and of maximum information (enlightenment) are marked, together with the risk and uncertainty intervals.]

† For convenience, the resource state is plotted on a log scale. As a matter of fact, a resource of this type could be considered as Bernoulli utility.
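The vertical displacement of trajectories that share an initial information state can be seen numerically from Eq. (9.6.1). The sketch below evaluates that equation on a few trajectory values for two decision makers who differ only in initial log capital. The parameters are hypothetical, and the elimination of t assumes the exponential approximation \mathcal{H}_{kt} \approx H + (H_{k0} - H)\epsilon^t, which is inferred here from Eq. (9.5.6) rather than quoted from the text.

```python
import math

def log_capital_on_trajectory(ln_k0, h_star, h, h_k0, eps, y, h_traj):
    """Eq. (9.6.1): expected log capital at the trajectory value h_traj."""
    stages = math.log((h_traj - h) / (h_k0 - h)) / math.log(eps)
    return ln_k0 + (h_star - h + y) * stages - (h_k0 - h_traj) / (1.0 - eps)

# Hypothetical system parameters (illustrative values only).
h_star, h, h_k0, eps, y = math.log(3), 0.90, math.log(3), 0.9, 0.05
grid = [1.05, 1.00, 0.95, 0.91]      # trajectory values, information improving

poor = [log_capital_on_trajectory(1.0, h_star, h, h_k0, eps, y, x) for x in grid]
rich = [log_capital_on_trajectory(3.0, h_star, h, h_k0, eps, y, x) for x in grid]
gaps = [r - q for r, q in zip(rich, poor)]
print(gaps)
```

The gap between the two curves equals the difference in initial log capital at every information state, so the curves have the same shape and never meet.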
FIG. 9.7. Three investors: same initial capital, different initial information states. [Horizontal axis: information state, H* - \mathcal{H}_{kt}; the states of full information and of maximum information (enlightenment) are marked.]
We see that for each initial point in state space there will be a unique trajectory (assuming that all decision makers observe the same events, of course) through state space toward the state of enlightenment. At the state of enlightenment, no further adaptation takes place, only capital growth, and the trajectory approaches H asymptotically. If the reader examines the equations which lead to the trajectories shown in Fig. 9.6 and 9.7, he will note that if all decision makers observe the same sequence of events and follow optimal policies, any pair of trajectories cannot intersect. In this situation, the entire state space is filled with a field of nonintersecting trajectories. It can be seen that if any decision maker starts with a low level of information, he can never
catch up with those decision makers who start with more information. Such a decision maker will suffer an irretrievable loss of capital relative to other more knowledgeable decision makers. It is only in situations where some decision makers observe more or fewer events than others, or observe with more or less precision than others, that the trajectories can cross, and some decision makers can "catch up" to others. We have seen that communication channels with limited information rate can cause a loss of information for some decision makers. Fig. 9.8 shows an interesting case where Decision Maker 1 has a great deal of initial capital, but loses information due to a faulty communication channel. Decision Maker 2, possessing less capital, receives complete information on the events of the environment. Decision Maker 2 soon catches up with Decision Maker 1, and ends up with more capital. This situation is analogous to the stock market, where Decision Maker 1 could be a wealthy investor who receives his information through a daily financial newspaper, and Decision Maker 2 is a floor broker in the stock exchange itself. Unfortunately, we cannot, within the scope of this book, consider the interesting implications of channel capacity and equivocation in economic communication systems.

FIG. 9.8. Trajectory of two investors with information monopoly.
9.7 A Digression: A Likely Value for the Environmental Entropy

Heretofore, we have considered the actual probabilities of the economic environmental process as a given set of parameters. Given certain assumptions there can be derived a particular probability function which is highly likely to exist in a free economic market. The pioneering methods used by Boltzmann (1844-1906)2 and others 3-5 in the study of the statistical properties of thermodynamic processes can be used to advantage in the study of this economic process. Suppose there is a set, M, consisting of m different payoffs, each paying a particular positive amount, say r_i, for the ith payoff. If the environmental process has been running for T stages, the expected number of times the ith payoff has occurred will be denoted by \bar{n}_i. We note that for all the \bar{n}_i we have that

\sum_{i \in M} \bar{n}_i = T. \qquad (9.7.1)

There are many different historical sequences of payoffs which could result in the same specific set of event frequencies, \{\bar{n}_i\}, and fulfill Eq. (9.7.1). For a given T and some specific set \{\bar{n}_i\}, we could have as many as

\frac{T!}{\prod_{i \in M} \bar{n}_i!}

different historical sequences of payoffs. On a priori grounds we have no reason to suspect that any one payoff is any more likely to occur than any other. We can find the probability of a particular expected frequency of payments, say the set \{\bar{n}_i\}, by use of the multinomial probability function. The probability we desire is given by

\Pr[\{\bar{n}_i\}] = \frac{T!}{\prod_{i \in M} \bar{n}_i!} \left( \frac{1}{m} \right)^T, \qquad (9.7.2)

where \Pr[\{\bar{n}_i\}] is the probability of a particular set of expected event frequencies, \{\bar{n}_i\}, leading to a given growth of capital; \bar{n}_i is the expected number of times the ith payoff occurs during the T stages in the particular combination of events which leads to the given growth of capital; T is the total number of stages which, in entropy time, is equal to the number of events; and 1/m is the a priori probability of any payoff event.
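Equation (9.7.2) is easy to evaluate in logarithmic form, which avoids overflow in the factorials. The sketch below uses the log-gamma function from the Python standard library (ln x! = lgamma(x + 1)); the frequency sets are hypothetical.

```python
import math

def log_multinomial_probability(counts):
    """ln of Eq. (9.7.2): T!/prod(n_i!) * (1/m)^T, with T = sum(counts), m = len(counts)."""
    total = sum(counts)
    m = len(counts)
    log_sequences = math.lgamma(total + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_sequences - total * math.log(m)

even = log_multinomial_probability([10, 10, 10])    # hypothetical frequencies
skewed = log_multinomial_probability([20, 5, 5])
print(even, skewed)
```

With equal a priori probabilities and no further constraint, the evenly spread set is the more probable one; the constraint introduced later in the section is what moves the most likely set away from uniformity.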
Taking the log of Eq. (9.7.2), we find
\[ \ln \Pr[\{\bar{n}_i\}] = \ln T! - \sum_{i \in M} \ln \bar{n}_i! + T \ln \frac{1}{m}. \tag{9.7.3} \]
If $T$ and each $\bar{n}_i$ are large numbers, we may replace the factorials in Eq. (9.7.3) by use of the log of Stirling's formula,
\[ \ln x! \sim x \ln x - x + \tfrac{1}{2} \ln x + \tfrac{1}{2} \ln 2\pi. \tag{9.7.4} \]
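Substituting Eq. (9.7.4) into Eq. (9.7.3), we find that
\[ \ln \Pr[\{\bar{n}_i\}] \sim T \ln T - T + \sum_{i \in M} (\bar{n}_i - \bar{n}_i \ln \bar{n}_i) + T \ln \frac{1}{m} + \tfrac{1}{2} \ln T + \tfrac{1}{2} \ln 2\pi - \frac{m}{2} \ln 2\pi - \tfrac{1}{2} \sum_{i \in M} \ln \bar{n}_i. \tag{9.7.5} \]
Using Eq. (9.7.1), Eq. (9.7.5) reduces to
\[ \ln \Pr[\{\bar{n}_i\}] \sim T \ln T - \sum_{i \in M} \bar{n}_i \ln \bar{n}_i + T \ln \frac{1}{m} + \tfrac{1}{2} \ln T + \tfrac{1}{2} \ln 2\pi - \frac{m}{2} \ln 2\pi - \tfrac{1}{2} \sum_{i \in M} \ln \bar{n}_i. \tag{9.7.6} \]
We will now find the set $\{\bar{n}_i\}$ which will make Eq. (9.7.6) a maximum under the following constraints: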
\[ \sum_{i \in M} \bar{n}_i = T \tag{9.7.7} \]
and
\[ E[\ln K_{kT}] = \ln K_{k0} + \sum_{i \in M} \bar{n}_i \ln(1 + a^*_{ki} r_i). \tag{9.7.8} \]
We note that constraint (9.7.8) implies that the investment process being considered here as an example is the one described in Section 9.3. In other words, we assume that all decision makers have reached the state of full information (enlightenment) and are in the same resource
state, $k$. This assumption is a simple extension of the classical notion of the "representative economic man." This problem of finding the most likely set, $\{\bar{n}_i\}$, is a case of the classical occupation problem⁶ with special constraints. In such problems the Lagrangian method is used to find this most likely set. The Lagrange equation will be
\[ L = T \ln T - \sum_{i \in M} \bar{n}_i \ln \bar{n}_i + T \ln \frac{1}{m} + \tfrac{1}{2} \ln T + \tfrac{1}{2} \ln 2\pi - \frac{m}{2} \ln 2\pi - \tfrac{1}{2} \sum_{i \in M} \ln \bar{n}_i - \lambda \Big( \sum_{i \in M} \bar{n}_i - T \Big) - \mu \Big( \sum_{i \in M} \bar{n}_i \ln(1 + a^*_{ki} r_i) - E[\ln K_{kT}] + \ln K_{k0} \Big) \tag{9.7.9} \]
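where $\lambda$ and $\mu$ are Lagrangian multipliers. Treating the set $\{\bar{n}_i\}$ as a set of continuous variables, and taking the partial derivative of $L$ with respect to $\bar{n}_i$, $\lambda$, and $\mu$, we have that
\[ \frac{\partial L}{\partial \bar{n}_i} = -1 - \ln \bar{n}_i - \lambda - \mu \ln(1 + a^*_{ki} r_i) - \frac{1}{2} \left[ \frac{1}{\bar{n}_i} \right] \quad \text{for each } i \in M, \]
\[ \frac{\partial L}{\partial \lambda} = - \Big[ \sum_{i \in M} \bar{n}_i - T \Big], \]
and
\[ \frac{\partial L}{\partial \mu} = - \Big[ \sum_{i \in M} \bar{n}_i \ln(1 + a^*_{ki} r_i) - E[\ln K_{kT}] + \ln K_{k0} \Big]. \]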
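We have assumed that the $\bar{n}_i$'s for any $i \in M$ are large numbers. Therefore, the term $\tfrac{1}{2}[1/\bar{n}_i]$ in the above equation for $\partial L / \partial \bar{n}_i$ may, for all practical purposes, be dropped. On this assumption, upon setting these equations to zero, solving for the optimum set $\{\bar{n}_i^{(\mu)}\}$, and given some particular value for $\mu$, we have that
\[ \ln \bar{n}_i^{(\mu)} \sim -(1 + \lambda) - \mu \ln(1 + a^*_{ki} r_i), \quad \text{for each } i \in M, \]
where
\[ \sum_{i \in M} \bar{n}_i^{(\mu)} = T \]
and
\[ \sum_{i \in M} \bar{n}_i^{(\mu)} \ln(1 + a^*_{ki} r_i) = E\left[ \ln \frac{K_{kT}}{K_{k0}} \right]. \]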
Taking the antilog of both sides, we have
\[ \bar{n}_i^{(\mu)} \sim e^{-(1+\lambda)} (1 + a^*_{ki} r_i)^{-\mu}, \quad \text{for each } i \in M. \tag{9.7.10} \]
Summing these $m$ equations, we find that
\[ T = e^{-(1+\lambda)} \sum_{i \in M} (1 + a^*_{ki} r_i)^{-\mu}. \tag{9.7.11} \]
Substituting Eq. (9.7.11) into Eq. (9.7.10) results in
\[ \bar{n}_i^{(\mu)} \sim T \, \frac{(1 + a^*_{ki} r_i)^{-\mu}}{\sum_{h \in M} (1 + a^*_{kh} r_h)^{-\mu}}, \quad \text{for each } i \in M. \]
Since the probability function of the environmental process is a multinomial probability function, we know that the expected frequency of, say, the $i$th event is given by
\[ \bar{n}_i = T p_i, \quad \text{for each } i \in M. \]
Thus, we may write
\[ p_i^{(\mu)} \sim \frac{(1 + a^*_{ki} r_i)^{-\mu}}{\sum_{h \in M} (1 + a^*_{kh} r_h)^{-\mu}}, \quad \text{for each } i \in M, \tag{9.7.12} \]
with high probability for some value of $\mu$. The probability distribution function represented by Eq. (9.7.12) is analogous to the Maxwell-Boltzmann distribution of statistical mechanics, which indicates the most likely number of atoms to be found in a particular energy state in a closed system. There is a whole set of these most likely probability functions depending on the magnitude of the Lagrange multiplier $\mu$. For example, if $\mu$ were zero, Eq. (9.7.12) would reduce to
\[ p_i^{(0)} \sim \frac{1}{m}, \]
with high probability, for each $i \in M$, the a priori assumption of equally likely probability. Out of all these sets of most likely probabilities, one for each value of $\mu$, one set is unique. We will find this set by the use of a function which was first introduced to statistical mechanics by Max Planck. Herein,
this function will be called the economic characteristic function. The development of the economic interpretation of this function is based on the work of Khinchin⁷ in the field of statistical mechanics. In this application, we will define the function in a specialized form suitable for the stochastic investment process. The function is given by
\[ \psi = \ln \sum_{i \in M} (1 + a^*_{ki} r_i)^{-\mu} \tag{9.7.13} \]
where $\mu$ is the Lagrange multiplier of Eq. (9.7.12) and is greater than zero. Using Eqs. (9.7.12) and (9.7.13), we note the following: (9.7.14) with high probability, for each $i \in M$. (9.7.15)
Since there is a whole set of probability functions, one for each possible value of $\mu$, we wish to find one probability function from this set which possesses special properties. Let the $\mu$ for the probability function which possesses these properties be designated as $\mu^*$. When the set given by Eq. (9.7.12) was developed, we used condition (9.7.8), where the quantity $E[\ln(K_T/K_0)]$ was known and constant. We note that Eq. (9.7.14) has the same form; it is the expected rate of capital accumulation corresponding to every possible value of $\mu$. The particular value of $\mu$ which makes Eq. (9.7.15) equal to the known constant, $E[\ln(K_T/K_0)]$, is the unique value of $\mu$ which is desired, $\mu^*$. Thus, we have (9.7.16) and using the set of probabilities, $\{p_i^{(\mu^*)}\}$, which equates this expression we have with high probability, for each $i \in M$. (9.7.17)
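The selection of $\mu^*$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the list `growth` holds hypothetical values of $1 + a^*_{ki} r_i$, the target is taken as an expected log growth per stage, and bisection is my choice of root finder, not a method prescribed by the text. It evaluates probabilities of the form of Eq. (9.7.12) and the characteristic function of Eq. (9.7.13), then searches for the multiplier whose expected log growth matches the target.

```python
import math

def mb_probabilities(growth_factors, mu):
    """p_i(mu) proportional to (1 + a_i r_i)^(-mu), the form of Eq. (9.7.12)."""
    weights = [g ** (-mu) for g in growth_factors]
    z = sum(weights)
    return [w / z for w in weights]

def characteristic_function(growth_factors, mu):
    """psi(mu) = ln sum_i (1 + a_i r_i)^(-mu), the form of Eq. (9.7.13)."""
    return math.log(sum(g ** (-mu) for g in growth_factors))

def expected_log_growth(growth_factors, mu):
    """sum_i p_i(mu) ln(1 + a_i r_i): expected log growth per stage."""
    p = mb_probabilities(growth_factors, mu)
    return sum(pi * math.log(g) for pi, g in zip(p, growth_factors))

def solve_mu(growth_factors, target, lo=0.0, hi=50.0, tol=1e-12):
    """Bisection on mu; the expected log growth falls as mu rises."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_log_growth(growth_factors, mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

growth = [1.05, 1.20, 1.50, 2.00]        # hypothetical values of 1 + a_i r_i
uniform = mb_probabilities(growth, 0.0)  # mu = 0 gives the a priori 1/m
target = 0.25                            # hypothetical per-stage expected log growth
mu_star = solve_mu(growth, target)
print(uniform, mu_star, expected_log_growth(growth, mu_star))
```

Differentiating `characteristic_function` numerically at `mu_star` should return approximately the negative of `target`, in line with the later remark in Section 10.2 that $\mu^*$ is found where the partial of $\psi(\mu)$ equals the negative of the expected rate of capital growth.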
The existence of such a unique solution for $\mu^*$ has been assured by a theorem of Khinchin.⁸ The probability function $\{p_i^{(\mu^*)}\}$ is the unique Maxwell-Boltzmann function which meets all the conditions of the maximization of the likelihood function (9.7.6). It is interesting to note that the reciprocal of the parameter $\mu$ is analogous to the temperature of a physical system times the Boltzmann constant. We now can explore the relation of the above results to the entropy of the system. Given any value of $\mu$, the entropy of that particular Maxwell-Boltzmann function is given by
\[ H = - \sum_{i \in M} p_i^{(\mu)} \ln p_i^{(\mu)}. \tag{9.7.18} \]
Taking the log of Eq. (9.7.12), we have that
\[ \ln p_i^{(\mu)} \sim \ln(1 + a^*_{ki} r_i)^{-\mu} - \ln \sum_{h \in M} (1 + a^*_{kh} r_h)^{-\mu}, \quad \text{for each } i \in M. \tag{9.7.19} \]
Multiplying Eq. (9.7.19) by $p_i^{(\mu)}$, summing over $i$, and using Eqs. (9.7.13) and (9.7.18), we have that
\[ H \sim \mu \sum_{i \in M} p_i^{(\mu)} \ln(1 + a^*_{ki} r_i) + \psi. \tag{9.7.20} \]
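Taking the partial derivative of Eq. (9.7.20) with respect to $\mu$, we find that
\[ \frac{\partial H}{\partial \mu} \sim \sum_{i \in M} p_i^{(\mu)} \ln(1 + a^*_{ki} r_i) + \frac{\partial \psi}{\partial \mu}. \]
But substitution of Eq. (9.7.14) shows that
\[ \frac{\partial H}{\partial \mu} \sim 0, \quad \text{with high probability.} \tag{9.7.21} \]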
The process of maximizing the likelihood function (9.7.6) is an isentropic process. In effect, by maximizing (9.7.6), we are asking for the most likely set of probability functions of all the possible functions for some given value of the system entropy. Such a process is independent of the passage of irreversible time. It is similar to the maximization of utility in dynamic economic processes, without regard for the passage of time.
References
1. Zadeh, L. A., and Desoer, C. A., "Linear System Theory: The State Space Approach," pp. 486-494. McGraw-Hill, New York, 1963.
2. Boltzmann, L., "Vorlesungen über Gastheorie," Part 1, Section 1, Vol. 6, pp. 32-47. Verlag Von Johann, Leipzig, 1896.
3. Gibbs, J. Willard, "Elementary Principles in Statistical Mechanics." Dover Publications, New York, 1960.
4. Tolman, R. C., "The Principles of Statistical Mechanics," Chapter 4. Oxford Univ. Press, London and New York, 1938.
5. Schrödinger, E., "Statistical Thermodynamics," Chapter 5. Cambridge Univ. Press, London and New York, 1960.
6. Feller, William, "An Introduction to Probability Theory and Its Applications," Vol. I, 2nd ed., II.11.5 (p. 58). Wiley, New York, 1957.
7. Khinchin, A. I., "Mathematical Foundations of Statistical Mechanics," pp. 145-147. Dover, New York, 1949.
8. Khinchin, A. I., "Mathematical Foundations of Statistical Mechanics," p. 80. Dover, New York, 1949.
CHAPTER 10
INTERACTIONS BETWEEN DECISION MAKERS IN STATE SPACE
"... it may very well happen that two men of good faith, in complex matters where they possess exactly the same items of information, arrive at different conclusions about the probabilities of an event, and that, betting together, each imagines ... that it is he who is the thief and the other the fool."
Emile Borel, Valeur pratique et philosophie des probabilités
10.1 Introduction
At each stage, each decision maker in an economic system can be tagged by his information and capital (resource) state. All the decision makers in a given state space point or cell have common behavioral patterns. For example, in the capital allocation model, since each decision maker in a given state has the same subjective probability function, each has the same set of capital allocation vectors. In effect, all the decision makers act as one with the combined assets of each. Following each environmental event or payoff, each decision maker must trade securities (or in general, commodities) to remain optimal in his new state (since a payoff has been received and an event has occurred, the state must change each stage). But there is a catch; a change in the capital allocation vector requires a trade of securities between decision makers in each cell of the state space. Heretofore, we have assumed that these trades are always possible. But are they? A market trading transaction requires two or more decision makers for whom the trade is mutually advantageous. There must be a probability that such a trade is not possible. If this is so, decision makers can "hang up" in time, waiting for a transaction which will bring their capital allocation vectors into alignment with their information states. This possible disequilibrium between the capital allocation state (reflecting the actual holdings of the decision maker) and the information state will be the subject of this chapter.
One of the fundamental concepts in the analysis of processes in state space is the notion of the state probability function, the set of probabilities of finding decision makers in given states or cells. By making some plausible assumptions about the economic system, we can arrive at a state probability function which is more likely than others. Because of this stochastic transaction effect we find that there exists another level of uncertainty in the market behavior of decision makers, the uncertainty in finding transaction partners. The study of the interrelations between the environmental entropy and this new market entropy will complete this chapter.
10.2 The State Space Probability Function
If all of the decision makers are maximizers of the expected rate of capital growth, or at least uniform in their behavior toward subjective probabilities, there is a set of individual capital allocation vectors (portfolio ratios) corresponding to every information state. Assuming the investment model of Section 9.3, each information state, $j$, corresponds to a set of individual capital allocation vectors, $A^*_{jt}$, given by
\[ a^*_{ijt} = \hat{p}_{ijt} \Big[ 1 + \sum_{h \in M} \frac{1}{r_h} \Big] - \frac{1}{r_i}, \quad \text{for each } i \in M \tag{10.2.1} \]
where $a^*_{ijt}$ is the optimum capital allocation ratio for the $i$th kind of investment for a person in the $j$th information state at time $t$. We see that, given that some decision maker is in some information state, there is an associated subjective probability function. If this decision maker maximizes his expected rate of capital growth, there is an optimal set of individual capital allocation ratios for this decision maker. Suppose the investment market is a closed economic system in which there is a set, $Q$, of decision makers. These decision makers are assigned to a denumerable set, $J$, of different information states. Each decision maker in an information state has a specific capital allocation vector for $m$ different investments. We denote for the $j$th information state the vector $A^*_{jt}$ as the associated individual capital allocation vector at stage $t$. The expected rate of capital growth from the allocations of capital by one decision maker in the $j$th state for the $t$th stage is given by (10.2.2)
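A small numerical sketch may help fix the meaning of the allocation rule. It assumes, as my reading of the model, that the ratios sum to one, that event $i$ multiplies capital by $1 + a_i r_i$, and that the rule takes the form $a_i = p_i [1 + \sum_h 1/r_h] - 1/r_i$; the probabilities `p` and payoffs `r` are hypothetical. The code checks that the ratios sum to one and that shifting a little capital between two investments lowers the expected log growth.

```python
import math

def optimal_allocation(p, r):
    """a_i = p_i * (1 + sum_h 1/r_h) - 1/r_i (assumed form of the allocation rule)."""
    s = 1.0 + sum(1.0 / rh for rh in r)
    return [pi * s - 1.0 / ri for pi, ri in zip(p, r)]

def expected_log_growth(p, a, r):
    """sum_i p_i ln(1 + a_i r_i): expected rate of capital growth for one stage."""
    return sum(pi * math.log(1.0 + ai * ri) for pi, ai, ri in zip(p, a, r))

p = [0.5, 0.3, 0.2]   # hypothetical subjective probabilities
r = [1.0, 2.0, 4.0]   # hypothetical payoff rates
a_star = optimal_allocation(p, r)

# Move one percent of capital from the first investment to the second.
shifted = [a_star[0] - 0.01, a_star[1] + 0.01, a_star[2]]

print(a_star, sum(a_star))
print(expected_log_growth(p, a_star, r), expected_log_growth(p, shifted, r))
```

Under these assumptions the marginal quantity $p_i r_i / (1 + a_i r_i)$ is the same for every investment at `a_star`, which is the usual first-order condition for a constrained maximum of the expected log growth.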
The expected rate of capital growth for the capital held by the whole population is quite different from Eq. (10.2.2). We shall describe the events which lead to the growth of capital for an economic system as a whole. The economic system will be composed of individual decision makers, each seeking to maximize his individual expected rate of capital growth by the optimum allocation of his capital over a set, $M$, of risky investments. The microeconomic behavior of these decision makers is described in Section 9.3. Each event is described by a 3-tuple (three index numbers) which is, in part, related to the state of the decision makers. Consider the particular 3-tuple, $(i, j, k)$. This 3-tuple means that a payoff of magnitude $r_i$ has been made to a decision maker in the $j$th information state and the $k$th capital state. We define a probability function which describes the likelihood of the occurrence of any one of these events. We have:
\[ s_{ijk} = \Pr(r_i \text{ is paid to a decision maker in the } k\text{th capital state and the } j\text{th information state}) \tag{10.2.3} \]
where
\[ \sum_{i \in M,\, j \in J,\, k \in K} s_{ijk} = 1. \]
The decision makers each possess an initial amount of capital. Let us introduce a new set of ratios called interpersonal capital allocation ratios. Suppose the initial capital available to an investment system as a whole is given as $K_0$. The initial capital in the possession of each decision maker in state $k$ will be denoted as $b_k K_0$. We note that for each $k \in K$, (10.2.4) and
It is important that the difference between the interpersonal capital allocation ratios, $\{b_k\}$, and the individual capital allocation ratios, the $\{a^*_{ijt}\}$, be clearly understood. In an investment system, we are primarily interested in the individual allocation of capital, not the interpersonal allocation of capital. Thus, we shall assume that the number of decision makers with $b_k K_0$ units of capital is known for each state, $k$. The amount of capital possessed is partly related to the effectiveness of a decision maker's accumulation of capital. In other words, there is a relation
between the ratios, $\{b_k\}$, and the ratios, $\{a^*_{ijt}\}$, for any decision maker during the growth process. Therefore in the study of statistical equilibrium which follows we will assume that the trading in securities (a security is considered a warrant to receive a payoff with some unknown probability from a specific investment) occurs between the payoff events of the process. Thus the given number of decision makers in each state, $b_k$, cannot change during what we shall call the trading process. Consider the contributions to total system capital if event $(i, j, k)$ occurs in the first period. We have (10.2.5)
The first term represents the $i$th payoff to, and the conservation of principal for, a single decision maker in capital state $k$ and information state $j$. The last term conserves the principal of the rest of the decision makers who did not receive a payoff during the first stage. Consider a random number of these $(i, j, k)$ events which could occur in $t$ stages. We denote this number as $n_{ijkt}$. We wish to find the expectation of this random number. At this point, it would be advantageous to have a visual impression of the operation of this system of events. Visualize a three-dimensional space, the floor of which is laid off in rectangles or cells. For each cell there is a number of decision makers in the same capital state (they possess the same quantity of capital) and the same information state (they allocate their capital in the same way). The total number of decision makers is equal to $Q$, which is the area of the floor. Suppose the floor area of each cell is equal to the expected number of decision makers in that cell. Now suppose a mechanism drops labeled balls, with numbers, $i$, randomly selected from the set $M$, into the cells with equal probability per unit floor area. These balls represent the payoffs. The expected number of balls which fall in a particular cell will be proportional to the floor area of that cell. We denote the total number of balls dropped as $t$ (the number of stages of the process). We denote the floor area of a particular cell as $\bar{q}_{jkt}$. This number is equal to the expected number of decision makers in capital state $k$ and information state $j$. Out of the total number of balls dropped, let the expected number of, say, the $i$th kind, be $\bar{n}_{it}$. Now examine the contents of any cell. If the mechanism dropped $t$ balls, the expected number of balls of any kind in cell $jk$ will be equal to $t \bar{q}_{jkt}/Q$ (the relative floor area of the cell times the number of balls dropped). Of all the balls dropped, the expected fraction of balls marked $i$ will be $\bar{n}_{it}/t$. Then the expected
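number of balls marked $i$ in cell $jk$ will be $t \, \bar{n}_{it} \, \bar{q}_{jkt} / (tQ)$. We have then for the expectation of $n_{ijkt}$,
\[ \bar{n}_{ijkt} = \frac{\bar{n}_{it} \, \bar{q}_{jkt}}{Q}. \tag{10.2.6} \]
Now we know from Section 4.4 that
\[ \bar{n}_{it} = t \, p_i, \tag{10.2.7} \]
so that we can write Eq. (10.2.6) as
\[ \bar{n}_{ijkt} = \frac{t \, p_i \, \bar{q}_{jkt}}{Q}. \tag{10.2.8} \]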
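The ball-dropping picture lends itself to a quick simulation. The sketch below is a Monte Carlo illustration only; the payoff probabilities, the cell sizes, the seed, and the helper names `drop_balls` and `expected_count` are all assumptions introduced here. Each ball receives a label $i$ with probability $p_i$ and lands in a cell with probability proportional to that cell's floor area, so the count for any (label, cell) pair should be close to $t \, p_i \, \bar{q}_{jk} / Q$.

```python
import random
from collections import Counter

def drop_balls(t, payoff_probs, cell_sizes, seed=0):
    """Drop t labeled balls; return counts keyed by (payoff index, cell index)."""
    rng = random.Random(seed)
    # Label of each ball, drawn with the payoff probabilities p_i.
    labels = rng.choices(range(len(payoff_probs)), weights=payoff_probs, k=t)
    # Cell hit by each ball, drawn in proportion to floor area.
    cells = rng.choices(range(len(cell_sizes)), weights=cell_sizes, k=t)
    return Counter(zip(labels, cells))

def expected_count(t, p_i, q_jk, Q):
    """Expected number of balls marked i in a cell of area q_jk: t * p_i * q_jk / Q."""
    return t * p_i * q_jk / Q

payoff_probs = [0.5, 0.3, 0.2]   # hypothetical p_i
cell_sizes = [50, 30, 20]        # hypothetical expected occupancies of three cells
t = 200_000
Q = sum(cell_sizes)
counts = drop_balls(t, payoff_probs, cell_sizes)

for i, p_i in enumerate(payoff_probs):
    for c, q in enumerate(cell_sizes):
        print(i, c, counts[(i, c)], expected_count(t, p_i, q, Q))
```

The simulated counts fluctuate around the expected values, with relative deviations that shrink as the number of balls grows.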
Now suppose we look at the ex ante expectation of $n_{ijkt}$ (from the viewpoint of a decision maker in information state $j$, after $t$ stages of history have passed). He does not know $p_i$, only $p_{ijt}$. Assuming that $\bar{q}_{jkt}$ is given, the ex ante subjective expectation of $n_{ijkt}$ for the next period is given by
\[ \bar{n}_{ijkt} = \frac{(1) \, p_{ijt} \, \bar{q}_{jkt}}{Q}, \tag{10.2.9} \]
where (1) is to denote that this is for a one-stage process. For a one-stage process starting at $t - 1$, we have from the view of the decision makers who possess $(t - 1)$ stages of history, (10.2.10)
This equation leads to an optimal set of capital allocation ratios. These ratios are independent of the value of $\bar{q}_{jkt}$ and are given by (10.2.11) for each $i \in M$, which is derived in much the same way as similar equations in Section 9.3. The decision maker can determine his optimum capital allocation vector without knowledge of the expected number of the other decision
makers who are in the same capital state and information state. Assuming that $a^*_{ij,t+1}$ has been determined for each $i$ and $j$, we turn to the market view of the growth contributed by the decision makers in this period. We have, for the ex post expected system growth in the period,
\[ G^*_t = \sum_{i \in M,\, j \in J,\, k \in K} p_i \, \frac{\bar{q}_{jkt}}{Q} \, \ln(1 + a^*_{ijt} b_k r_i). \tag{10.2.12} \]
Now the $\bar{q}_{jkt}$ must be known to determine $G^*_t$. We can find the most likely value of $\bar{q}_{jkt}$ subject to the constraints that the number of decision makers remains constant at $Q$, and the maximum expected capital growth for the $t$th stage, $G^*_t$, remains constant. This is possible because we have already assumed an a priori probability function for the events $(ijk)$ with respect to any one decision maker. We stated that the a priori probability of event $(ijk)$ falling on any one decision maker is equal to $1/Q$. As a result of this assumption, we see that a decision maker's total capital and allocation of that capital cannot alter the probability function of the environmental stochastic process. This assumption is used in the economic analysis of competitive markets where the decision makers are assumed to be price takers. The likelihood function of a particular value of $\bar{q}_{jkt}$ is given by
\[ \Pr[\{\bar{q}_{jkt}\}] = \frac{Q!}{\prod_{j \in J,\, k \in K} \bar{q}_{jkt}!} \left[ \frac{1}{Q} \right]^{Q} \quad \text{for } \bar{q}_{jkt} > 0,\ j \in J,\ k \in K, \]
\[ \Pr[\{\bar{q}_{jkt}\}] = 0, \quad \text{elsewhere.} \tag{10.2.13} \]
We can maximize the log of Eq. (10.2.13) by the variation of $\bar{q}_{jkt}$, under the constraints given above, by the Lagrangian technique. Assuming $Q$ and each $\bar{q}_{jkt}$ to be large numbers, and using Stirling's formula, we have for the log of Eq. (10.2.13),
\[ \ln \Pr[\{\bar{q}_{jkt}\}] \sim -Q + Q \ln Q - \sum_{j \in J,\, k \in K} \bar{q}_{jkt} \ln \bar{q}_{jkt} + \sum_{j \in J,\, k \in K} \bar{q}_{jkt} - Q \ln Q. \]
By reduction, we find that
\[ \ln \Pr[\{\bar{q}_{jkt}\}] \sim - \sum_{j \in J,\, k \in K} \bar{q}_{jkt} \ln \bar{q}_{jkt}. \tag{10.2.14} \]
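The Lagrangian is given by
\[ L = - \sum_{j \in J,\, k \in K} \bar{q}_{jkt} \ln \bar{q}_{jkt} - \lambda \sum_{j \in J,\, k \in K} \bar{q}_{jkt} + \lambda Q - \frac{\mu}{Q} \sum_{i \in M,\, j \in J,\, k \in K} p_i \, \bar{q}_{jkt} \ln(1 + a^*_{ijt} b_k r_i) + \mu G^*_t \tag{10.2.15} \]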
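where $\lambda$ and $\mu$ are Lagrangian multipliers. Noting that the $a^*_{ijt}$'s are independent of the variations of $\bar{q}_{jkt}$, we take the partial derivative of $L$ with respect to the $\bar{q}_{jkt}$'s, $\lambda$, and $\mu$. The partials are evaluated at $\bar{q}^*_{jkt}$, where they equal zero. We have
\[ \frac{\partial L}{\partial \bar{q}_{jkt}} = -1 - \ln \bar{q}^*_{jkt} - \lambda - \frac{\mu}{Q} \sum_{i \in M} p_i \ln(1 + a^*_{ijt} b_k r_i) = 0, \quad \text{for each } k \in K,\ j \in J; \tag{10.2.16} \]
\[ \frac{\partial L}{\partial \lambda} = - \sum_{j \in J,\, k \in K} \bar{q}^*_{jkt} + Q = 0; \tag{10.2.17} \]
and
\[ \frac{\partial L}{\partial \mu} = - \frac{1}{Q} \sum_{i \in M,\, j \in J,\, k \in K} p_i \, \bar{q}^*_{jkt} \ln(1 + a^*_{ijt} b_k r_i) + G^*_t = 0. \tag{10.2.18} \]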
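This result can be written as
\[ \frac{\bar{q}^*_{jkt}}{Q} = \frac{\exp\big[ -(\mu/Q) \sum_{i \in M} p_i \ln(1 + a^*_{ijt} b_k r_i) \big]}{\sum_{l \in J,\, h \in K} \exp\big[ -(\mu/Q) \sum_{i \in M} p_i \ln(1 + a^*_{ilt} b_h r_i) \big]}. \tag{10.2.19} \]
This equation can be put in terms of the entropy corresponding to the $j$th information state for the $a^*_{ijt}$'s. We have that
\[ \frac{\bar{q}^*_{jkt}}{Q} = \frac{\exp[-(\mu/Q) H_{jt}]}{\sum_{l \in J} \exp[-(\mu/Q) H_{lt}]} \, c_k \tag{10.2.20} \]
where
\[ c_k = \frac{\exp\big[ -(\mu/Q) \ln\big( b_k + \sum_{i \in M} 1/r_i \big) \big]}{\sum_{h \in K} \exp\big[ -(\mu/Q) \ln\big( b_h + \sum_{i \in M} 1/r_i \big) \big]}. \tag{10.2.21} \]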
Equation (10.2.20) or (10.2.19) is a member of the Maxwell-Boltzmann class of probability functions. They show us that it is highly unlikely that a large number of decision makers can be found realizing large capital gains and that it is unlikely that just a few decision makers can be found realizing small capital gains. We note that the differences in capital growth for the same amount of capital initially are caused by the differences in the information state of the decision makers. The adaptive process will work to reduce these differences. The equations show us that if the adaptive process could be completed (the state of enlightenment), the differences in expected capital growth would be due only to the initial capital possessed by each decision maker. Further, note that the parameter $\mu$ is the same for all members of the economic system. This is because the constraint (10.2.12) is for the whole system. An economic system, then, could be defined as being composed of those decision makers who face the same collective constraints. The parameter $\mu$ is so far undetermined. In Section 9.7, a method was presented for the determination of a similar parameter. We find that, given the system's expected rate of capital growth, and by virtue of Khinchin's theorem (see Section 9.7), there exists a unique value of $\mu$ which is compatible with constraint (10.2.12). We denote this value of $\mu$ as $\mu^*$. This value of $\mu$ is found from the economic characteristic function for the system, $\psi(\mu)$ (see Section 9.7). When the partial derivative of $\psi(\mu)$ with respect to $\mu$ equals the negative of the system's expected rate of capital growth, the value of $\mu$ at this point is $\mu^*$. With the most likely value of $\bar{q}_{jkt}$ now on hand, the growth of the system in the stage can be found. We have for this growth, (10.2.22)
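The qualitative claim above, that cells with large expected capital gains are sparsely occupied, can be illustrated with a short sketch. It assumes that the most likely occupancy has the Maxwell-Boltzmann form $\bar{q}^*_{jk} = Q \exp[-(\mu/Q) g_{jk}] / Z$, where $g_{jk}$ stands for the expected log growth of a decision maker in cell $jk$ and $Z$ normalizes the total to $Q$. The cell labels, growth values, and the value of `mu` are hypothetical.

```python
import math

def cell_occupancy(Q, mu, cell_growth):
    """Most likely occupancy q*_jk = Q * exp(-(mu/Q) * g_jk) / Z (assumed form)."""
    weights = {cell: math.exp(-(mu / Q) * g) for cell, g in cell_growth.items()}
    z = sum(weights.values())
    return {cell: Q * w / z for cell, w in weights.items()}

Q = 1000       # hypothetical number of decision makers
mu = 5000.0    # hypothetical positive multiplier
cell_growth = {            # hypothetical expected log growth g_jk per cell
    ("j1", "k1"): 0.05,
    ("j1", "k2"): 0.10,
    ("j2", "k1"): 0.20,
    ("j2", "k2"): 0.40,
}

occupancy = cell_occupancy(Q, mu, cell_growth)
for cell in sorted(cell_growth, key=cell_growth.get):
    print(cell, cell_growth[cell], round(occupancy[cell], 1))
```

With a positive multiplier the occupancies sum to `Q` and fall as the expected growth of the cell rises; with `mu` equal to zero every cell would be equally occupied.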
Before turning to the market trading process and statistical equilibrium, we will find the contribution to the above growth by a single decision maker in the $jk$th cell. This contribution would be the variation of Eq. (10.2.22) with respect to the number $\bar{q}^*_{jkt}$. We have then
\[ \delta G^*_t \Big|_{j \in J,\, k \in K} = \frac{1}{Q} \sum_{i \in M} p_i \ln(1 + a^*_{ijt} b_k r_i). \tag{10.2.23} \]
We will use this equation in Section 10.4 to develop the notion of a market trading process.
10.3 Stochastic Equilibrium in the Market
Each decision maker in the economic system is in a specified capital state and a specified information state. We have seen that each information state is associated with a particular capital allocation vector (a portfolio). Suppose that the payoff events generated by the environment change the subjective probabilities and thus change the information states of the decision makers. Are we sure that the necessary securities can be traded to assure that the new optimum portfolio can be obtained for each decision maker? There appears to be a possibility that in a disorderly market the two or more decision makers needed to consummate a transaction could not find each other. Ultimately, the price mechanism would be pressed into action to bring the market into equilibrium. But if all the decision makers could find the necessary partners to consummate their transactions, there would be no price changes, since the market would be cleared. We see that the probability of one decision maker finding another is an essential part of the market clearing mechanism. Examples of the need for order are clearly seen in the security exchanges, where specialists in particular securities are fixed in location to facilitate the trading operation. Most markets are not so well organized, and disorder is quite pronounced. For example, consider the market for used automobiles. The probability of a Ford/Chevrolet deal is higher than the probability of an EMF/Wasp deal simply because there are far more Ford and Chevrolet owners than there are EMF and Wasp owners. The probability of a transaction is likely to be proportional to the number of pairs of decision makers seeking that transaction. In order to determine these probabilities, let us assign an optimum portfolio to each point in state space. Let the $j$th information state and $k$th capital state be associated with a particular portfolio, given by (10.3.1)
where $w_{\xi i}$ means that $w_{\xi i}$ capital units of the $i$th kind of investment are possessed by a decision maker with portfolio $\xi$. Let us suppose that there is a finite probability that the quantity of any investment will be a particular magnitude. This probability is defined by (10.3.2) for each $i \in M$.
Since the $\xi$th portfolio is made up of certain quantities of the $m$ investments in the magnitudes $\{w_{\xi i}\}$, the probability of finding a decision maker with the $\xi$th portfolio is the product of the probabilities that the portfolio will contain the specified quantities, $\{w_{\xi i}\}$. This probability is
\[ f_\xi = \prod_{i \in M} \Pr[w_{\xi i}]. \tag{10.3.3} \]
Suppose the $\xi$th portfolio is one-step communicable with some other portfolio, say the $\zeta$th, by the exchange of a specified quantity of security $u$ for security $v$. Suppose also the $\eta$th portfolio can be transformed into the $\rho$th portfolio by the transfer of the same quantity of security $v$ for $u$. Then, if a transaction were completed between the two decision makers, one with the $\xi$th portfolio and the other with the $\eta$th portfolio, the decision maker who had $\xi$ would now have $\zeta$, and the decision maker who had $\eta$ would now have $\rho$. We will assume that the probability of such a deal is proportional to the product of the probabilities of there being decision makers with the $\xi$th and $\eta$th kinds of portfolios. We have
\[ \Pr[\text{decision makers shifting from the } \xi\text{th to the } \zeta\text{th and } \eta\text{th to } \rho\text{th portfolios}] = f_\xi f_\eta. \tag{10.3.4} \]
Furthermore, we assume that the greater the probability of this event the greater is the probable change in the expected number of decision makers holding these portfolios. Thus we have
\[ \frac{d\bar{q}_\xi}{dt} = -\alpha f_\xi f_\eta Q, \tag{10.3.5} \]
\[ \frac{d\bar{q}_\eta}{dt} = -\alpha f_\xi f_\eta Q, \tag{10.3.6} \]
\[ \frac{d\bar{q}_\zeta}{dt} = \alpha f_\xi f_\eta Q, \tag{10.3.7} \]
\[ \frac{d\bar{q}_\rho}{dt} = \alpha f_\xi f_\eta Q, \tag{10.3.8} \]
where $\bar{q}_\xi$, for example, is the expected number of decision makers holding the $\xi$th kind of portfolio and $\alpha$ is a constant of proportionality. Such a market has a statistical parameter, which we shall call the market entropy. The market entropy, $H_m$, is a measure of the uncertainty in the distribution of decision makers holding each kind of portfolio. Starting at some initial condition, for example, where all $Q$ decision makers hold certain portfolios with probability one (market entropy equal to zero), the decision makers carry out transactions on a probabilistic basis. After the first transaction the probability of finding a decision maker holding a particular portfolio ceases to be one or zero and the market entropy increases. The problem is to prove that entropy will increase and to show when it will stop increasing and become a constant at its maximum. We define a system possessing a constant entropy as being in statistical equilibrium. The market entropy is given by
\[ H_m = - \sum_{\xi} f_\xi \ln f_\xi \tag{10.3.9} \]
where $\xi$ is the portfolio index number. Taking the derivative of $H_m$ with respect to time we find that
\[ \frac{dH_m}{dt} = - \sum_{\xi} \frac{df_\xi}{dt} \ln f_\xi - \sum_{\xi} \frac{df_\xi}{dt}. \]
Now
\[ \sum_{\xi} f_\xi = 1, \]
and so
\[ \sum_{\xi} \frac{df_\xi}{dt} = 0; \]
therefore
\[ \frac{dH_m}{dt} = - \sum_{\xi} \frac{df_\xi}{dt} \ln f_\xi. \tag{10.3.10} \]
Assume that we let only one transaction of the type described above be made per stage. For this transaction the change in market entropy is given by only
\[ \frac{dH_m}{dt} \Big|_{\xi\eta \to \zeta\rho} = - \frac{df_\xi}{dt} \ln f_\xi - \frac{df_\eta}{dt} \ln f_\eta - \frac{df_\zeta}{dt} \ln f_\zeta - \frac{df_\rho}{dt} \ln f_\rho, \tag{10.3.11} \]
the rest of the sum (10.3.10) being zero. We denote $[\xi\eta \to \zeta\rho]$ as representing a transaction where decision maker A moves from portfolio $\xi$ to portfolio $\zeta$, and decision maker B moves from portfolio $\eta$ to portfolio $\rho$.
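Furthermore, we know that the expected number of decision makers holding each portfolio is given by
\[ \bar{q}_\xi = Q f_\xi, \tag{10.3.12} \]
\[ \bar{q}_\eta = Q f_\eta, \tag{10.3.13} \]
\[ \bar{q}_\zeta = Q f_\zeta, \tag{10.3.14} \]
and
\[ \bar{q}_\rho = Q f_\rho. \tag{10.3.15} \]
Taking the derivative of the above four equations, using Eqs. (10.3.5)-(10.3.8), and substituting the result into Eq. (10.3.11), we have
\[ \frac{dH_m}{dt} \Big|_{\xi\eta \to \zeta\rho} = \alpha f_\xi f_\eta \ln f_\xi + \alpha f_\xi f_\eta \ln f_\eta - \alpha f_\xi f_\eta \ln f_\zeta - \alpha f_\xi f_\eta \ln f_\rho, \]
which may be rewritten
\[ \frac{dH_m}{dt} \Big|_{\xi\eta \to \zeta\rho} = \alpha f_\xi f_\eta \ln \frac{f_\xi f_\eta}{f_\zeta f_\rho}. \tag{10.3.16} \]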
Since the market is free (all portfolios are communicable), there is a finite probability of a reverse transaction occurring; i.e., a decision maker with portfolio $\zeta$ making a transaction with a decision maker with portfolio $\rho$, the two obtaining portfolios $\xi$ and $\eta$, respectively. By the same argument as before, we find
\[ \frac{dH_m}{dt} \Big|_{\zeta\rho \to \xi\eta} = \alpha f_\zeta f_\rho \ln \frac{f_\zeta f_\rho}{f_\xi f_\eta}. \tag{10.3.17} \]
Now the total change of market entropy possible between portfolios $\xi$, $\zeta$, $\eta$, and $\rho$ is given by the sum of the change from the forward and reverse transactions. Thus, summing Eqs. (10.3.16) and (10.3.17), we have that
\[ \frac{dH_m}{dt} = -\alpha \, (f_\zeta f_\rho - f_\xi f_\eta) \ln \frac{f_\xi f_\eta}{f_\zeta f_\rho}. \tag{10.3.18} \]
Because of the log form of Eq. (10.3.18), it possesses a unique property. If $f_\xi$, $f_\eta$, $f_\zeta$, and $f_\rho$ are all positive, which is the case here, then the expression
\[ (f_\zeta f_\rho - f_\xi f_\eta) \ln \frac{f_\xi f_\eta}{f_\zeta f_\rho} \le 0 \tag{10.3.19} \]
and thus
\[ \frac{dH_m}{dt} \ge 0. \tag{10.3.20} \]
Since the forward transaction, $[\xi\eta \to \zeta\rho]$, and reverse transaction, $[\zeta\rho \to \xi\eta]$, are selected in general, the same holds for all possible deals between communicable portfolios. Therefore, the sum of all the possible changes in market entropy, taken over all pairs of communicable portfolios, is still positive or zero; i.e.,
\[ \sum \frac{dH_m}{dt} \ge 0. \tag{10.3.21} \]
Market entropy always rises or is constant. We now must determine the conditions necessary for the change in market entropy to be zero, which is the definition of statistical market equilibrium. We note that
\[ \frac{dH_m}{dt} = 0 \]
when in general
\[ f_\xi f_\eta = f_\zeta f_\rho. \tag{10.3.22} \]
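The rise of market entropy toward statistical equilibrium can be sketched with a deterministic mean-field simulation. This is a simplification of the stochastic trading process, introduced here as an assumption: four portfolio kinds only, forward deals removing probability from $\xi$ and $\eta$ at rate $\alpha f_\xi f_\eta$, reverse deals restoring it at rate $\alpha f_\zeta f_\rho$, and simple Euler time steps. The starting probabilities, `alpha`, `dt`, and `steps` are illustrative values.

```python
import math

def market_entropy(f):
    """H_m = -sum f ln f over portfolio kinds, as in Eq. (10.3.9)."""
    return -sum(x * math.log(x) for x in f if x > 0.0)

def simulate(f, alpha=1.0, dt=0.01, steps=2000):
    """Euler steps of the net forward-minus-reverse flow between four portfolios.

    f = [f_xi, f_eta, f_zeta, f_rho]. Forward deals move probability from
    (xi, eta) to (zeta, rho) at rate alpha * f_xi * f_eta; reverse deals move
    it back at rate alpha * f_zeta * f_rho.
    """
    f = list(f)
    history = [market_entropy(f)]
    for _ in range(steps):
        net = alpha * (f[0] * f[1] - f[2] * f[3])
        f[0] -= net * dt
        f[1] -= net * dt
        f[2] += net * dt
        f[3] += net * dt
        history.append(market_entropy(f))
    return f, history

final_f, entropies = simulate([0.4, 0.4, 0.1, 0.1])
print(entropies[0], entropies[-1])
print(final_f[0] * final_f[1], final_f[2] * final_f[3])
```

In this sketch the entropy never falls from one step to the next, and the run ends with the two products $f_\xi f_\eta$ and $f_\zeta f_\rho$ essentially equal, which is the equilibrium condition stated in the text.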
Equation (10.3.22) says that market entropy will be maximized and remain constant at the point where the probability of any transaction is the same both ways; for example, the probability of decision maker A trading his security u for decision maker B's security, v, is the same as decision maker A trading his security, v, for decision maker B's security, u. Moreover, we may assume that during a transaction, the sum of the total capital involved remains constant. This fact we denote by the rule of conservation of capital. Thus, we may write (10.3.23)
and (10.3.24)
These equations, in addition to the requirement (10.3.22), are both satisfied by the assumption of Maxwell-Boltzmann probability functions for the general probability functions of Eq. (10.3.3). Thus, we have
\[ f_\xi = \exp\Big[ -\lambda_m - \mu \sum_{i \in M} w_{\xi i} \Big], \tag{10.3.25} \]
\[ f_\eta = \exp\Big[ -\lambda_m - \mu \sum_{i \in M} w_{\eta i} \Big], \tag{10.3.26} \]
\[ f_\zeta = \exp\Big[ -\lambda_m - \mu \sum_{i \in M} w_{\zeta i} \Big], \tag{10.3.27} \]
and
\[ f_\rho = \exp\Big[ -\lambda_m - \mu \sum_{i \in M} w_{\rho i} \Big]. \tag{10.3.28} \]
Equations (10.3.25)-(10.3.28) being exponentials, we know from the rule of conservation of capital that (10.3.29) but since only securities $u$ and $v$ were transferred between the above decision makers, Eq. (10.3.29) reduces to (10.3.30) because both Eqs. (10.3.23) and (10.3.24) are in capital units and can be summed. Now the assumption of exponential Maxwell-Boltzmann probability functions assures that Eq. (10.3.29) holds, and thus that Eq. (10.3.30) is true. Therefore, the conclusion is that a market described by the Maxwell-Boltzmann probability function is in truth a market in statistical equilibrium. Furthermore, any market of the type described, if not in statistical equilibrium, will increase its market entropy until it comes into statistical equilibrium. A further note is necessary on the bilateral transactions we have been considering in the stochastic trading process. It is not probable that any forward transaction would be followed immediately by a reverse transaction. It is more likely that several intermediate transactions may be made before the required reverse transaction can be consummated. As long as the portfolios are communicable, that is, as long as there exists no portfolio
where capital may be destroyed, never to enter the market again, sooner or later a reverse transaction will be made. We must note, however, that there is a small but finite probability that this chain of transactions leading to a reverse transaction may be infinite; thus, the necessity of assuming that $H_m$ is maximized in probability as the number of transactions becomes infinite. There appears no reason to assume that the mechanism of bilateral trade cannot be extended to multilateral trading. For example, for trilateral trade, Eq. (10.3.18) can be written
sn; = T
-0:
(j0 f PI,6
-
f , f TIl.f) In fdJ6 hf"J.
(10.3.31)
and zero change in market entropy would occur when
fdTlf. = fdpf6 .
(10.3.32)
Higher order multilateral trading appears to be unusual in actual market situations; therefore, its importance is doubtful in this model.
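The equilibrium argument of Eqs. (10.3.22)-(10.3.30) can be illustrated numerically. The following is only a minimal sketch: it assumes that $W$ for a decision maker is the capital held in each security, and the constants `lam` and `mu`, the function name, and the portfolios are hypothetical values chosen for illustration. Because the probabilities are exponentials of total capital and a bilateral trade only moves capital between the two parties, the product of the probabilities before the trade equals the product after it.

```python
import math


def mb_probability(holdings, lam=0.5, mu=0.1):
    """Maxwell-Boltzmann-type probability exp[-lam - mu * sum_i W_i].

    `holdings` lists the capital held in each security (an assumed reading
    of W); `lam` and `mu` are illustrative constants, not values from the text.
    """
    return math.exp(-lam - mu * sum(holdings))


# Capital held in securities (u, v) by decision makers A and B before a trade.
a_before = [4.0, 1.0]
b_before = [2.0, 3.0]

# A hands 1.5 capital units of u to B and receives 0.5 capital units of v.
# Capital only changes hands, so the combined total is conserved.
a_after = [a_before[0] - 1.5, a_before[1] + 0.5]
b_after = [b_before[0] + 1.5, b_before[1] - 0.5]

total_before = sum(a_before) + sum(b_before)
total_after = sum(a_after) + sum(b_after)

# Joint probability of the pair of states before and after the transaction.
forward = mb_probability(a_before) * mb_probability(b_before)
reverse = mb_probability(a_after) * mb_probability(b_after)

print("total capital:", total_before, total_after)
print("joint probabilities:", forward, reverse)
```

Any other amounts could be exchanged in the sketch; the two joint probabilities stay equal as long as the combined capital of the two traders is unchanged.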
10.4 Interperson Trading in the Investment Market

A particular information state implies a particular subjective probability function. A particular subjective probability function implies a particular capital allocation vector. Each decision maker can be tagged with a pair of index numbers, $(\xi, \gamma)$, denoted as a cell, which specifies the decision maker's capital allocation vector and capital state. Let us consider three other cells $(\eta, \varphi)$, $(\zeta, \gamma)$, and $(\rho, \varphi)$. These cells are defined so that during a particular market transaction between two decision makers, one in cell $(\xi, \gamma)$ and the other in cell $(\eta, \varphi)$, a change in the amounts of securities held by each transforms the decision makers into new cells, $(\zeta, \gamma)$ and $(\rho, \varphi)$, respectively. Suppose the securities traded are of the $u$th and $v$th type. We can write the contribution to system capital growth for any of the decision makers in these four cells in the following way:
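$$\delta G_{\xi\gamma t} = \frac{1}{Q} \sum_{i \in M} p_i \ln(1 + a^{\xi}_{it} r_i b_{\gamma t}), \tag{10.4.1}$$

$$\delta G_{\zeta\gamma t} = \frac{1}{Q} \sum_{i \in M} p_i \ln(1 + a^{\zeta}_{it} r_i b_{\gamma t}), \tag{10.4.2}$$

$$\delta G_{\eta\varphi t} = \frac{1}{Q} \sum_{i \in M} p_i \ln(1 + a^{\eta}_{it} r_i b_{\varphi t}), \tag{10.4.3}$$

and

$$\delta G_{\rho\varphi t} = \frac{1}{Q} \sum_{i \in M} p_i \ln(1 + a^{\rho}_{it} r_i b_{\varphi t}). \tag{10.4.4}$$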
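As a rough illustration of Eqs. (10.4.1)-(10.4.4), the contribution of one cell can be computed directly from the sum over securities. The sketch below is hedged: the function name, the numerical values, and the treatment of $Q$ as a plain normalizing constant are assumptions made for the example, since $Q$ is defined elsewhere in the text.

```python
import math


def growth_contribution(p, a, r, b, q=1.0):
    """(1/Q) * sum_i p_i * ln(1 + a_i * r_i * b) for one cell.

    p: environmental probabilities of the payoff events, one per security
    a: the cell's capital allocation vector
    r: payoffs per security
    b: the cell's capital distribution coefficient
    q: the constant Q, treated here as a plain normalizer (an assumption)
    """
    if not (len(p) == len(a) == len(r)):
        raise ValueError("p, a and r must have one entry per security")
    total = sum(p_i * math.log(1.0 + a_i * r_i * b)
                for p_i, a_i, r_i in zip(p, a, r))
    return total / q


# Hypothetical three-security market.
p = [0.5, 0.3, 0.2]
r = [0.10, 0.25, 0.40]
a_xi = [0.6, 0.3, 0.1]
b_gamma = 1.0

dg = growth_contribution(p, a_xi, r, b_gamma)
print("contribution of the cell:", dg)
```

A cell that holds nothing contributes zero, since every logarithm reduces to ln(1).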
In the above, we have made certain assumptions about the market trading operations as follows.

(a) All decision makers in the market (system) face the same environmental probability function, $P$.

(b) All decision makers receive the same payoffs, $\{r_i\}$, for the same kinds of securities.

(c) Each decision maker's capital is conserved before and after a transaction.

Now in addition, assume that:

(d) Market trading can be decomposed into consecutive isolated two-party transactions involving two types of securities, and such transactions do not effect changes in the positions of the other decision makers or other types of securities.

This assumption permits us to write for all the securities not entering into the transaction between cells $(\xi, \gamma)$ and $(\eta, \varphi)$,

for $i \neq u, v$; (10.4.5)

for $i \neq u, v$. (10.4.6)
With these equations we may conclude that the contribution to expected system growth by the securities not traded will not change after the transaction. We may then write

for $i \neq u, v$, (10.4.7)

and

for $i \neq u, v$. (10.4.8)
We now assume that: (e) The total contributions to the system's expected rate of growth of capital from the traded securities must be equal in money value before and after a transaction.
This assumption, and Eqs. (10.4.1)-(10.4.4) lead to the equations
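$$p_u \ln(1 + a^{\xi}_{ut} r_u b_{\gamma t}) + p_u \ln(1 + a^{\eta}_{ut} r_u b_{\varphi t}) = p_u \ln(1 + a^{\zeta}_{ut} r_u b_{\gamma t}) + p_u \ln(1 + a^{\rho}_{ut} r_u b_{\varphi t}) \tag{10.4.9}$$

and

$$p_v \ln(1 + a^{\xi}_{vt} r_v b_{\gamma t}) + p_v \ln(1 + a^{\eta}_{vt} r_v b_{\varphi t}) = p_v \ln(1 + a^{\zeta}_{vt} r_v b_{\gamma t}) + p_v \ln(1 + a^{\rho}_{vt} r_v b_{\varphi t}). \tag{10.4.10}$$

Adding Eqs. (10.4.9) and (10.4.10), we get

$$\begin{aligned} &p_u \ln(1 + a^{\xi}_{ut} r_u b_{\gamma t}) + p_v \ln(1 + a^{\xi}_{vt} r_v b_{\gamma t}) + p_u \ln(1 + a^{\eta}_{ut} r_u b_{\varphi t}) + p_v \ln(1 + a^{\eta}_{vt} r_v b_{\varphi t}) \\ &\quad = p_u \ln(1 + a^{\zeta}_{ut} r_u b_{\gamma t}) + p_v \ln(1 + a^{\zeta}_{vt} r_v b_{\gamma t}) + p_u \ln(1 + a^{\rho}_{ut} r_u b_{\varphi t}) + p_v \ln(1 + a^{\rho}_{vt} r_v b_{\varphi t}). \end{aligned} \tag{10.4.11}$$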
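Equations (10.4.9)-(10.4.11) can be made concrete with a small numerical sketch. Because the factor $p_u$ (or $p_v$) is common to every term of Eq. (10.4.9) (or Eq. (10.4.10)), each balance reduces to an equality of products of the form $(1 + a r b)$, which can be solved for the second trader's new allocation once the first trader's new allocation is chosen. The helper names and all numerical values below are hypothetical, and the sketch does not impose any budget normalization on the allocation vectors.

```python
import math


def balancing_allocation(a_xi, a_eta, a_zeta, r, b_gamma, b_phi):
    """Solve a balance of the Eq. (10.4.9) type for the partner's new allocation.

    After the common probability factor cancels, the balance reads
    (1 + a_xi*r*b_gamma)(1 + a_eta*r*b_phi)
        = (1 + a_zeta*r*b_gamma)(1 + a_rho*r*b_phi),
    which is solved here for a_rho.
    """
    before = (1 + a_xi * r * b_gamma) * (1 + a_eta * r * b_phi)
    return (before / (1 + a_zeta * r * b_gamma) - 1) / (r * b_phi)


def term(p, a, r, b):
    """One term p * ln(1 + a*r*b) of Eqs. (10.4.9)-(10.4.11)."""
    return p * math.log(1 + a * r * b)


# Hypothetical market data for the two traded securities u and v.
p_u, p_v = 0.6, 0.4
r_u, r_v = 0.20, 0.35
b_gamma, b_phi = 1.0, 2.0

a_xi = {"u": 0.5, "v": 0.5}      # first trader before the trade
a_eta = {"u": 0.3, "v": 0.7}     # second trader before the trade
a_zeta = {"u": 0.4, "v": 0.6}    # first trader after selling u for v
a_rho = {                        # second trader after the trade
    "u": balancing_allocation(a_xi["u"], a_eta["u"], a_zeta["u"], r_u, b_gamma, b_phi),
    "v": balancing_allocation(a_xi["v"], a_eta["v"], a_zeta["v"], r_v, b_gamma, b_phi),
}

# Left and right sides of Eq. (10.4.11).
before = (term(p_u, a_xi["u"], r_u, b_gamma) + term(p_v, a_xi["v"], r_v, b_gamma)
          + term(p_u, a_eta["u"], r_u, b_phi) + term(p_v, a_eta["v"], r_v, b_phi))
after = (term(p_u, a_zeta["u"], r_u, b_gamma) + term(p_v, a_zeta["v"], r_v, b_gamma)
         + term(p_u, a_rho["u"], r_u, b_phi) + term(p_v, a_rho["v"], r_v, b_phi))

print("second trader after the trade:", a_rho)
print("Eq. (10.4.11) left and right sides:", before, after)
```

In this example the second trader ends with more of security u and less of security v, the mirror image of the first trader's move, while the two sides of Eq. (10.4.11) agree.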
It is interesting to note, in digression, that these assumptions imply that no change occurs in the total expected rate of capital growth for the system during a transaction. This can be seen by adding Eqs. (10.4.7) and (10.4.8) to Eq. (10.4.11) so as to get, with the help of Eqs. (10.4.1)-(10.4.4), (10.4.12)
Since all other decision makers are not involved in the transaction, their individual contribution to the system's expected rate of capital growth is the same before and after the transaction. The equality (10.4.12) means that the expected rate of capital growth is conserved for the system during the transactions. Note that (10.4.13)
and (10.4.14)
by the definition that the four cells are in fact different cells; i.e., an actual transfer of some type of security must be made during a transaction. Individually, from the point of view of the market, one decision maker may be better off and the other decision maker may be worse off after the transaction. It is here that the stochastic trading model differs from the deterministic model, where it is assumed that any transaction where one of the two decision makers is worse off bears probability zero, while any transaction where either of the decision makers is better off, or just as well off, bears probability one (Pareto optimality). In a stochastic trading model a poor transaction does have a finite probability of occurring. This recognizes the fact that the decision maker in cell $(\xi, \gamma)$, at some particular stage $t$, does not know the actual $\delta G_{\xi\gamma t}$ but only the subjective value of a transaction as given by (10.4.15)
and this value may be different from the value given by Eq. (10.4.1). Remembering Eq. (10.4.11), we return to the Maxwell-Boltzmann probability function given by Eq. (10.2.19). From Section 10.3, we note that for a market in stochastic equilibrium it must be true that (10.4.16)
The probability $f^*$ signifies that this is a particular value of the probability function given in Eq. (10.2.19), which leads to the equality (10.4.16). This equality says that the probability of a transaction which transforms decision makers in cells $(\xi, \gamma)$ and $(\eta, \varphi)$ into cells $(\zeta, \gamma)$ and $(\rho, \varphi)$, respectively, is equal to the probability of a transaction which transforms decision makers in cells $(\zeta, \gamma)$ and $(\rho, \varphi)$ into cells $(\xi, \gamma)$ and $(\eta, \varphi)$, respectively. To obtain this result we must assume the following.

(f) The probability of a transaction between decision makers in cells $(\xi, \gamma)$ and $(\eta, \varphi)$ is the product of the probability of there being decision makers in cells $(\xi, \gamma)$ and $(\eta, \varphi)$.

(g) The rate of growth of the number of decision makers in a particular cell, say $(\xi, \gamma)$, is proportional to the probability of the kind of transactions which transfer decision makers into cell $(\xi, \gamma)$ and the probability of the kind of transactions which transfer decision makers out of cell $(\xi, \gamma)$.

(h) Definition. The condition of statistical equilibrium is obtained when the entropy of the system (the uncertainty of the distribution of decision makers in the cells) is both constant and maximum.
Assumptions (a)-(h) have been more or less intuitive statements which could be made about any economic stochastic system, whether in statistical equilibrium or not. We are interested in systems which possess Maxwell-Boltzmann probability functions. In such systems assumptions (a)-(h) lead to an assertion as follows.
In a stochastic investment market composed of decision makers classed in cells describing their capital state and capital allocation vectors, where assumptions (a)-(h) are valid, and where the probability of a decision maker being in any particular cell is given by the Maxwell-Boltzmann probability function, statistical equilibrium is obtained.

Proof. Substitution of Eq. (10.2.19) into Eq. (10.4.16) with some reduction gives us
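$$\sum_{i \in M} p_i \ln(1 + a^{\xi}_{it} r_i b_{\gamma t}) + \sum_{i \in M} p_i \ln(1 + a^{\eta}_{it} r_i b_{\varphi t}) = \sum_{i \in M} p_i \ln(1 + a^{\zeta}_{it} r_i b_{\gamma t}) + \sum_{i \in M} p_i \ln(1 + a^{\rho}_{it} r_i b_{\varphi t}). \tag{10.4.17}$$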
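Subtracting Eqs. (10.4.7) and (10.4.8) from Eq. (10.4.17), we have

$$\begin{aligned} &p_u \ln(1 + a^{\xi}_{ut} r_u b_{\gamma t}) + p_v \ln(1 + a^{\xi}_{vt} r_v b_{\gamma t}) + p_u \ln(1 + a^{\eta}_{ut} r_u b_{\varphi t}) + p_v \ln(1 + a^{\eta}_{vt} r_v b_{\varphi t}) \\ &\quad = p_u \ln(1 + a^{\zeta}_{ut} r_u b_{\gamma t}) + p_v \ln(1 + a^{\zeta}_{vt} r_v b_{\gamma t}) + p_u \ln(1 + a^{\rho}_{ut} r_u b_{\varphi t}) + p_v \ln(1 + a^{\rho}_{vt} r_v b_{\varphi t}). \end{aligned} \tag{10.4.18}$$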
Equation (10.4.18) is identical to Eq. (10.4.11). The assumption of a Maxwell-Boltzmann probability function and (f)-(h) leads to an equation which is the direct result of assumptions (a)-(e).

The role of adaptation in this system is important. We have assumed that during the transactions above no payoffs have been received from the investments. Thus, the system has been "hung up" in entropy time with respect to environmental events. This means that no changes have occurred to change the state of information of each decision maker, and thus his subjective probabilities or the capital state. Permitting payoff events to occur will cause mobility between the cells because the new payoff events will alter the amounts of capital and the subjective
probabilities for those decision makers receiving them. We note that the consistency property of the subjective probabilities means that

$$\operatorname*{plim}_{t \to \infty} p_{ijt} = p_i \quad \text{for } i \in M \text{ and } j \in J. \tag{10.4.19}$$
It is seen then that as entropy time passes, the number of information states decreases to one (the state of enlightenment). This is because the subjective probabilities for all decision makers are identical to the respective probabilities of the environmental process, and thus equal for all decision makers. Thus, in turn we can say
for $i \in M$ and $j \in J$. (10.4.20)
This collapse of the number of capital allocation vectors into one vector has interesting implications which will be explored in Section 11.2.

In the specific case of the adaptive investment process, the entropy paradox (see Section 5.3) is clearly evident. We have two kinds of entropy in our model: (1) a subjective entropy generated by the probabilities of payoffs received by the decision makers, and (2) the market entropy generated by the probable distribution of decision makers in the cells. The market entropy is a function of the subjective entropy. We know from Section 9.3 that the subjective entropy, in probability, decreases for each decision maker over entropy time; furthermore, in this chapter we have shown that the market entropy is always increasing or is constant and maximum. The interesting fact is that as the subjective entropy decreases, it decreases the number of cells on which market entropy is based. In the limit, with only one row of cells for the given capital state (a given set of interpersonal capital distribution coefficients, $\{b_k\}$), and the single individual information state, the maximum market entropy is minimized by the adaptive process. Thus, at any stage there is a force tending to increase the market entropy to a ceiling which is a limit resulting from the subjective entropy at that stage. In effect, the adaptive effort of the decision makers is to place a "lid" on the natural tendency for the market process to reach a state of maximum disorganization. This appears to be a fundamental role for the optimizing behavior of the adaptive decision maker.
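The "lid" described above can be illustrated with a deliberately simplified calculation. The sketch below ignores the capital-conservation constraint and uses the uniform distribution over cells as an upper bound on the market entropy; the function names and the counts of information and capital states are hypothetical. As adaptation reduces the number of information states, the number of cells shrinks and the ceiling on the market entropy falls with it.

```python
import math


def market_entropy(cell_probabilities):
    """Shannon entropy -sum f ln f of a distribution of decision makers over cells."""
    return -sum(f * math.log(f) for f in cell_probabilities if f > 0)


def entropy_ceiling(n_information_states, n_capital_states):
    """Upper bound on market entropy: the uniform distribution over all cells.

    This ignores the capital constraint of the text, so it is only a ceiling,
    not the constrained Maxwell-Boltzmann maximum.
    """
    n_cells = n_information_states * n_capital_states
    return market_entropy([1.0 / n_cells] * n_cells)


n_capital_states = 4                  # hypothetical number of capital states
information_states = (8, 4, 2, 1)     # adaptation merges information states

ceilings = [entropy_ceiling(k, n_capital_states) for k in information_states]
for k, h in zip(information_states, ceilings):
    print(k, "information states -> ceiling", round(h, 3))
```

With a single information state left, the ceiling is the logarithm of the number of capital states alone.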
CHAPTER 11
THE CONCLUSIONS

Shall I tell you what knowledge is? It is to know both what one knows and what one does not know.
Confucius
11.1 Individual Adaptive Behavior
11.1.1 The adaptive version of the principle of optimality leads to a sequential mathematical model (a dynamic program) of certain types of adaptive behavior of a decision maker confronted with a stochastic environment. It is convenient to postulate several conditions to come to this conclusion.
(a) In order to reduce the irreversible adaptive behavior process to a simple deterministic process having Markovian properties, it is convenient that the decision maker possess the capacity to formulate a subjective probability function from the probability function of the environmental process controlling the payoff events. This subjective probability function is a biased sufficient maximum likelihood estimator where the bias is defined as the initial conviction or logical width of the decision maker. The advantage of the use of such an estimator is reinforced by its association with the frequency concept of probability which is a common notion of the lay decision maker.

(b) The environmental process must possess a stationary probability function over the stages of interest. For example, if the actions taken by the decision maker were to alter the probability function of the environmental process, the probability function would not be stationary. Such an assumption is commonly made in economics as the postulate of ceteris paribus.

(c) The occurrences of any two events in time or phase space are statistically independent.
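Condition (a) can be illustrated with a small simulation. The estimator used below, a relative frequency blended with an initial conviction through a weight called `width`, is only an assumed stand-in for the biased estimator of Chapter 4, whose exact form is not reproduced in this section; the function names, the event probabilities, and the seed are likewise hypothetical. Under conditions (b) and (c) the events are drawn independently from a stationary distribution, and the estimate approaches the environmental probabilities as the number of stages grows.

```python
import random


def subjective_probabilities(counts, conviction, width):
    """Biased frequency estimate (n_i + width * c_i) / (t + width).

    counts:     observed number of occurrences of each event
    conviction: the decision maker's initial probabilities c_i
    width:      weight given to the initial conviction (an assumed parameter)
    """
    t = sum(counts)
    return [(n + width * c) / (t + width) for n, c in zip(counts, conviction)]


def simulate(true_p, conviction, width, stages, seed=0):
    """Draw independent events from a stationary process and re-estimate."""
    rng = random.Random(seed)
    counts = [0] * len(true_p)
    events = range(len(true_p))
    for _ in range(stages):
        i = rng.choices(events, weights=true_p)[0]
        counts[i] += 1
    return subjective_probabilities(counts, conviction, width)


true_p = [0.6, 0.3, 0.1]               # environmental probabilities (hypothetical)
conviction = [1 / 3, 1 / 3, 1 / 3]     # initial conviction: all events equally likely

early = simulate(true_p, conviction, width=10.0, stages=10)
late = simulate(true_p, conviction, width=10.0, stages=20000)
print("after 10 stages:   ", [round(x, 3) for x in early])
print("after 20000 stages:", [round(x, 3) for x in late])
```

With few observations the estimate stays close to the initial conviction; with many observations the conviction term is swamped by the observed frequencies.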
It may be commented that these conditions have been used for convenience in this study, but in future work it may be possible to relax them. For example, it has been considered that a time-weighted estimator could track with reasonable accuracy a slowly changing Markovian environmental process. Such an estimator is often found in practical forecasting techniques, such as the moving average method where nonstationarity of the underlying process is strongly suspected. Using such an estimator would make it possible to consider that the environmental process was in reality another decision maker playing a game of strategy with the first decision maker. The result would be a dynamic game theory in which two subjective probability functions (strategies) based on the players' actions would evolve. Furthermore, the assumption of independence may be unnecessarily restrictive if a more complicated model is admissible. Certainly in the real world in the bond and stock market it is somewhat unrealistic to ignore the interactions of similar bonds and stocks. If the investment opportunities are productive processes, more independence could be expected. It would be more realistic to formulate a model where interdependences, if they exist, have a role in the model so that by adaptation they will become recognized by the decision makers and will be included in their decisions.

11.1.2 The decision maker is considered to be making "rational" use of a sequence of historical events if he makes use of the "best" subjective probability function as defined in Chapter 4.
The concept of rational behavior is at best a subjective subject itself. To define rational behavior, we must acknowledge some ground rules, and see to it that our rational pattern meets them. Since the rules for a "best" estimator have been relatively firmly established by the mathematical statisticians, we can do no better than use these rules to evaluate our subjective probability function, and thus establish some rationality in using it.

11.1.3 Economic growth processes are one of a class of processes which give rise to analytic solutions to the dynamic program equations and which give solutions that are linear functions of entropy.
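The linear dependence on entropy can be seen in a simple special case. The sketch below is not the dynamic program of the earlier chapters; it assumes a Kelly-type setting in which exactly one of several mutually exclusive events pays, all capital is allocated, and `r[i]` is the gross return per unit of capital placed on event `i`. In that setting the proportional allocation $a_i = p_i$ gives an expected logarithmic growth of $\sum_i p_i \ln r_i - H(p)$, which is linear in the entropy $H(p)$. All names and numbers are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy -sum p ln p (natural logarithm)."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0)


def expected_growth(p, a, r):
    """Expected log growth sum_i p_i ln(a_i r_i) when exactly one event pays."""
    return sum(p_i * math.log(a_i * r_i)
               for p_i, a_i, r_i in zip(p, a, r) if p_i > 0)


r = [3.0, 3.0, 3.0]                    # gross payoffs (hypothetical)
cases = ([0.8, 0.1, 0.1], [0.5, 0.3, 0.2], [1 / 3, 1 / 3, 1 / 3])

results = []
for p in cases:
    growth = expected_growth(p, p, r)  # proportional allocation a_i = p_i
    linear = sum(p_i * math.log(r_i) for p_i, r_i in zip(p, r)) - entropy(p)
    results.append((entropy(p), growth, linear))
    print("entropy", round(entropy(p), 3), "growth", round(growth, 3))

# An allocation that ignores the probabilities does no better than a_i = p_i.
uniform_growth = expected_growth(cases[0], [1 / 3, 1 / 3, 1 / 3], r)
```

In this special case higher entropy goes with lower attainable growth, which is the sense in which entropy acts like a risk discount.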
Because of the exponential functional forms appearing in growth processes, the dynamic programming equations which result from these processes are readily solvable. These solutions involve the measure of uncertainty known as entropy. In the investment decision process
which we have considered in some detail, entropy appears to be the embodiment of the economic concept of a risk discount. In periods of high risk (high entropy) the maximum expected rate of capital growth is low, while in periods of low risk (low entropy) the maximum expected rate of capital growth is high. Thus, the expected rate of growth of a capital stock is not only a function of the payoffs of the productive or investment processes, but is also directly related to the measure of the uncertainty for the system.

11.1.4 In adaptive growth processes, solutions to the dynamic program equations involve linear functions of subjective entropy. Depending on the limited amount of information and the accuracy of the decision maker's initial conviction, the subjective entropy may be greater than the actual entropy (source entropy) of the system. Thus, a decision maker with little information may not achieve an expected rate of growth of capital that is achievable under Conclusion 11.1.3. In the adaptive investment process the risk discount of an individual decision maker may be greater than the discount appropriate to the actual environmental stochastic process.

11.1.5 Under the assumptions made in Conclusion 11.1.1, the subjective entropy for a decision maker will, as time approaches infinity, converge in probability to the actual entropy of the environmental process, and will be a decreasing function of time.

This conclusion, based on the use of Chebyshev's inequality and Shannon's theorems, is most interesting. The implication is that any decision maker will adapt to a given environmental process in such a way that his performance in probability will improve, rather than deteriorate. Such a discovery was first made by Darwin, who observed that changes in living things generally tend toward the improvement, rather than the degradation of the species as long as the environment does not change radically (quasi-stationary environmental process).
In some cases the change in the adaptive process is less rapid than the change in the environment, and the result may be the extinction of certain species.

11.1.6 In the absence of adaptation, given that the decision makers maximize the expected rate of the growth of capital, optimum individual allocations of capital (portfolio ratios) will remain constant over the duration of the process. Moreover, with adaptation taking place, the optimum
individual capital allocations will change as a function of the amount of information possessed by the decision maker. In all probability they will approach those optimum individual capital allocations of the nonadaptive process, given the same environmental process.
This conclusion is based on Conclusions 11.1.1-11.1.5, but is of interest for the investment process. The optimum individual capital allocations are a key indicator of the operations of the investment market. In all likelihood the deviations of individual capital allocations from the average capital allocations in the same market indicate the incomplete information possessed by the individuals, and not the existence of irrational behavior. For this reason, statistical estimations made from capital allocation ratio statistics are liable to be in error because such statistics are based on individual behavior under limited information conditions. For reasonable statistical studies it is necessary to postulate that all the individuals' convictions are self-compensating, and all individuals possess the same event histories. It is reasonable to believe that all persons in a communicable market possess the same event histories (stock market investor services serve in this useful role); however, there is no reason to assume that people possess self-compensating convictions. If anything can be learned from the stock market behavior it is the recognition of the mass movement of investor convictions from despair to enthusiasm (unfortunately, here the stock market services tend to propagate similarities in convictions). Probably only at the center of the swing of a cycle in stock prices are the investors' convictions self-compensating.

11.1.7 The time scale of an adaptive growth process depends on the entropy gradient; i.e., the rate of adaptation. Furthermore, a system in which the entropy gradient is zero is a static system in the sense that irreversible changes are not occurring.
The time concept in economic processes is a great deal more complex than it appears on the surface. Time taken as a measure of the aging or growth rate of a system is not equivalent to clock time, which is related to the universe. The normal concept of time is the comparison of the aging of the observed system to the aging or running down of a standard system (for example, a clock spring). Relative to clock time an economic system's speed of response is rapid if the rate of adaptation is high, and slow if the rate of adaptation is low. The time scale of such a system is then a function of the entropy gradient. The rate of change of
entropy is proportional to the rate of transmission of information and the effectiveness of the digestion of this information by the decision maker. In a system with poor transmission capabilities, i.e., where the information is so garbled that more information must be received to ungarble the original information, the rate of passage of entropy time is slower than in more efficient systems. Given the same parameters, a cyclic system (for example, the Samuelson business cycle model) with a high entropy gradient may have a shorter period than a system with a low entropy gradient. This conclusion has implications in the analysis of business cycles in countries with various levels of sophistication in handling information. Given the same environmental structure, countries with superior market communications and enlightened decision makers should experience shorter periods in cycles of economic activity than countries with isolated markets and unsophisticated decision makers.

11.1.8 Given some upper bound on the achievable expected rate of capital growth, the most likely probability function describing the probability of a payoff from any investment is an exponential function (Maxwell-Boltzmann function).

This conclusion, derived from the dynamic program solution, implies only what we observe in the investment market place. The investments with the largest payments are found very infrequently, while those paying a small return are numerous. It is interesting to note that because of the exponential character of the Maxwell-Boltzmann distribution, it is possible, but highly improbable, to find an investment opportunity which pays an individual the entire stock of capital in the system. One could say that in a world in which events were described by Maxwell-Boltzmann distributions, anything is possible but the best things are always improbable.
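The link between a bounded average and the exponential form can be checked numerically on a small discrete example. The sketch below is illustrative only: the payoff levels, the parameter `mu`, and the perturbation size are hypothetical. It builds an exponential distribution over equally spaced payoff levels, then perturbs it in a way that leaves both the total probability and the mean payoff unchanged, and compares the entropies of the two distributions.

```python
import math


def entropy(f):
    """Shannon entropy -sum f ln f of a discrete distribution."""
    return -sum(x * math.log(x) for x in f if x > 0)


def mean(f, levels):
    return sum(x * g for x, g in zip(f, levels))


levels = list(range(6))        # equally spaced payoff levels 0..5 (hypothetical)
mu = 0.7                       # illustrative decay parameter

weights = [math.exp(-mu * g) for g in levels]
z = sum(weights)
mb = [w / z for w in weights]  # exp(-lam - mu * g) with lam = ln z

# Perturbation (+eps, -2*eps, +eps) on levels 0, 1, 2: the probabilities still
# sum to one and, because the levels are equally spaced, the mean is unchanged.
eps = 0.01
other = mb[:]
other[0] += eps
other[1] -= 2 * eps
other[2] += eps

print("means:    ", mean(mb, levels), mean(other, levels))
print("entropies:", entropy(mb), entropy(other))
```

The perturbed distribution satisfies the same constraints but has lower entropy, consistent with the exponential form being the most likely distribution for a given mean.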
11.2 Collective Adaptive Behavior

11.2.1 Given assumptions (a)-(h) covered in Chapter 10, we conclude that in an adaptive investment market which is in statistical equilibrium the decision makers possess different individual capital allocation vectors which result from the different states of information each possesses. The probability of finding a person in a given information state and a given capital state is given by an exponential probability function of the Maxwell-Boltzmann type.
This conclusion is based on the material in Chapter 10. It is interesting to note that differences in the portfolio ratios between investors with the same amount of capital (the same capital state) can be attributed to the limited information about the environmental structure of the market, rather than any irrational behavior pattern. In contrast to investment models of the Tobin-Markowitz type, where differences in the portfolio ratios depend on differences in the individual utility of return and risk, this investment model requires that the decision makers only maximize the growth of their capital under conditions of insufficient information about the influence of the environment in the market.

Given some capital state, the Maxwell-Boltzmann probability function will give the probability of finding a decision maker in any particular information state. Each of these particular states is directly connected with a particular expected rate of capital growth. Thus, within a given capital state, it will be highly unlikely that an investor experiencing a high rate of capital growth can be found, and on the other hand, it will be highly likely that an investor experiencing a low capital growth can be found. Experience in the actual investment market only too well establishes the truth of this result for most investors. It would be interesting to study the distribution of the expected growth rates of capital in the investment market to see if indeed the exponential distribution is a good fit. It should also be possible to study other economic growth phenomena for this same regularity. The relations between statistical equilibrium and the Maxwell-Boltzmann probability function in an adaptive market should shed some light on how exponential wealth distributions arise, and how their parameters are controlled by social forces.
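A study of the kind suggested above could begin with a simple fit. The following sketch uses synthetic data drawn from an exponential distribution purely as a placeholder for observed growth rates, so it demonstrates the procedure rather than any empirical result; the function names, the sample size, the true rate, and the seed are all hypothetical. It estimates the rate by maximum likelihood and measures the largest gap between the empirical and fitted survival functions.

```python
import math
import random


def fit_exponential_rate(samples):
    """Maximum likelihood rate for the density rate * exp(-rate * x): 1 / mean."""
    return len(samples) / sum(samples)


def survival_gap(samples, rate):
    """Largest gap between the empirical and fitted survival functions."""
    ordered = sorted(samples)
    n = len(ordered)
    gap = 0.0
    for k, x in enumerate(ordered):
        model = math.exp(-rate * x)
        # The empirical survival function steps from (n - k)/n down to
        # (n - k - 1)/n at the k-th ordered observation.
        gap = max(gap, abs((n - k) / n - model), abs((n - k - 1) / n - model))
    return gap


# Synthetic stand-in for observed growth rates (not real market data).
rng = random.Random(1)
growth_rates = [rng.expovariate(2.0) for _ in range(5000)]

rate = fit_exponential_rate(growth_rates)
gap = survival_gap(growth_rates, rate)
print("fitted rate:", round(rate, 3), "largest survival gap:", round(gap, 4))
```

With real data, a large gap relative to the sample size would suggest that the exponential form is a poor description of the observed growth rates.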
11.2.2 As time goes to infinity, the adaptive process in probability converges the number of different information states to a single state as a limit, the state of enlightenment.
During adaptation, the decision makers occupy different information states because of differences in history and convictions. As time passes, the process of adaptation forces, in probability, every decision maker to the same information state, and thus the same set of individual capital allocation ratios. With only one state in the limit, all decision makers would allocate their capital the same way over the same investments. If identical individual capital allocation ratios are the limiting result of the adaptive process, then those factors which explain the rate of
adaptation-the entropy gradient-in turn should lead us to the social factors that control the differences in individual capital allocation. We know that the rate of adaptation is related to the decision maker's ability to decode (digest) the information transmitted by the events arising from the environmental process. From this we can conclude that if the environmental process is stationary, and the economic system has the ability to transmit and decode information, then in time the information states will in all probability become the same for all decision makers. However, if the environmental process is not stationary, but slowly changes, the direction of adaptive convergence may be in error, and the information states for each decision maker will become more dissimilar, the greatest expected capital growth being obtained by the decision makers whose convictions of the future direction of the environmental process, by luck or shrewd guess, lie in the proper direction. We can examine the effect of social phenomena on the ability of the decision makers to decode the information they receive, or the effect of the environmental process, to show the impact these social phenomena have on the differences in the allocation of individual capital. It is suggested that future study be directed toward (1) the determination of the role of technological change in the formation of the mechanism of the environmental process, a possible source of nonstationarity, and (2) the determination of the effect of improvements in communication, permitting more efficient information decoding by the decision makers. These phenomena may tend to balance each other, thus controlling the differences in individual portfolio ratios in the society.
11.2.3 Trading activity still occurs in an economic market in statistical equilibrium. A criticism of the classical economic theory of the stationary state is that at this state, marginal utilities are all equal between the market participants and the need for "higgling and piggling" ceases. In the theory of the firm, the same condition causes the cessation of the need for entrepreneurship since decision making is no longer necessary. Assumption of a changing cross section of the aging market participants to explain continuous decision making must also require the assumption of limited information for the new decision makers. If the new decision makers had full information, they would know the optimum set of goods and services at once, no decisions being necessary. Limited information is a necessity to explain the role of decision making in such processes,
and the adaptive process is the process which defines the role of information in an economic system. The above results are in part due to the absence of uncertainty in the classical model. In an adaptive stochastic economic system which is in statistical equilibrium, a stationary state of the classical type cannot exist. In statistical equilibrium the probability that a trade will occur which will make person A better off, and person B worse off, is equal to the probability that a trade will be made making person A worse off and person B better off. Thus the probability of an advantageous trade becomes zero in statistical equilibrium; but trading still goes on, the decision makers thinking they might make an advantageous trade.
BIBLIOGRAPHY

This list of references is not intended to be a complete bibliography of the books and papers of those authors who have contributed to adaptive theory. It is intended to be a guide to the reader who desires to extend his understanding in any of the many fields touched on herein.
1. General Background on Adaptive and Evolutionary Behavior

Arrow, K. J., The economic implications of learning by doing, Tech. Rept. No. 101 (7 Dec. 1961). Institute for Mathematical Studies in the Social Sciences, Applied Mathematics and Statistics Laboratories, Stanford, California.
Ashby, W. R., "Design for a Brain." Wiley, New York, 1954.
Bellman, R. (Ed.), "Mathematical Optimization Techniques." California Univ. Press, Berkeley, California, 1963.
Bellman, R., "Adaptive Control Processes." Princeton Univ. Press, Princeton, 1961.
Edgeworth, F. Y., "Mathematical Psychics." Kegan Paul, London, 1881.
Fogel, L. J., "Biotechnology: Concept and Applications." Prentice-Hall, Englewood Cliffs, New Jersey, 1963.
Haavelmo, T., "A Study in the Theory of Economic Evolution." North Holland, Amsterdam, 1954.
Hawkins, J. K., Self-organizing systems - a review and commentary, Proc. I.R.E. 49, 31-47 (1961).
Lotka, A. J., "Elements of Mathematical Biology." Dover, New York, 1956.
Rashevsky, N., "Mathematical Biology of Social Behavior." Chicago Univ. Press, Chicago, 1951.
Simon, H. A., "Models of Man." Wiley, New York, 1957.
Volterra, V., "Theory of Functionals and of Integro-Differential Equations." Dover, New York, 1959.
Von Foerster, H., and Zopf, G. W. (Ed.), "Principles of Self-Organization." Macmillan (Pergamon), New York, 1962.
Walter, W. G., "The Living Brain." Norton, New York, 1953.
Yovits, M., Jacobi, G., and Goldstein, G., "Self-Organizing Systems-1962." Spartan Books, Washington, D.C., 1962.
Yule, G. U., A mathematical theory of evolution, based on the conclusions of Dr. J. C. Willis, F.R.S., Philosophical Trans. Roy. Soc. London, Ser. B, 213, 21-87 (1924).
2. Sources on the Concept of Subjective Probability

Carnap, R., "Logical Foundations of Probability." Chicago Univ. Press, Chicago, 1950.
Driml, M., and Hans, O., On experience theory problems, Trans. 2nd Prague Conf. Information Theory, Statistical Decision Theory, Random Processes, Prague, June 1959, pp. 93-111. Academic Press, New York, 1960.
Good, I. J., "Probability and the Weighing of Evidence." Hafner, New York, 1956.
Hart, A. G., Risk, uncertainty, and the unprofitability of compounding probabilities, in "Studies in Mathematical Economics and Econometrics" (O. Lange, F. McIntyre, and T. O. Yntema, Ed.). Chicago Univ. Press, Chicago, 1942.
Keynes, J. M., "A Treatise on Probability." Macmillan, London, 1921; reprinted Harper & Row, New York, 1962.
Kyburg, H. E., Jr., and Smokler, H. E. (Ed.), "Studies in Subjective Probability." Wiley, New York, 1964.
Laplace, P. S., "A Philosophical Essay on Probabilities." Dover, New York, 1951.
Ramsey, F. P., "The Foundations of Mathematics and other Logical Essays." Humanities Press, New York, 1950.
Savage, L. J., "The Foundations of Statistics." Wiley, New York, 1954.
Schwartz, L. S., Harris, B., and Hauptschein, A., Information theory from the viewpoint of inductive probability, I.R.E. 1959 Natl. Convention Record, Part 4, pp. 102-111.
Suppes, P., The role of subjective probability and utility in decision-making, Proc. 3rd Berkeley Symp. Mathematical Statistics and Probability, Vol. V. California Univ. Press, Berkeley, California, 1956.
3. Sources on the Mathematical Theory of Intelligence
Aborn, M., and Rubenstein, H., Information theory and immediate recall, J. Exper. Psychol. 45, 260-266 (1952).
Bush, R. R., and Mosteller, F., "Stochastic Models for Learning." Wiley, New York, 1955.
Edwards, W., Behavioral decision theory, Ann. Rev. Psychol. 12, 473-498 (1961).
Guilford, J. P., The structure of the intellect, Psychological Bull. 53, 267-293 (1956).
Harmon, L. D., Levinson, J., and van Bergeijk, W., Analog models of neural mechanism, I.R.E. Trans. Inform. Theory 8, 107-112 (1962).
Kochen, M., and Galanter, E. H., The acquisition and utilization of information in problem solving and thinking, Information and Control 1, No. 3, 267-288 (1958).
MacKay, D. M., On comparing the brain with machines, Am. Scientist 42, 261-268 (1954).
Marill, T., Progress in artificial intelligence, I.R.E. Trans. Human Factors Electron. 2, 2 (1961).
McCulloch, W. S., The brain as a computing machine, Elec. Eng. 68, 492-497 (1949).
Miller, G. A., Human memory and the storage of information, I.R.E. Trans. Inform. Theory 2, 129-137 (1956).
Minsky, M., Steps toward artificial intelligence, Proc. I.R.E. 49, 8-30 (1961).
Murphy, R. E., Relations between self-adaptive control theory and artificially intelligent behavior in a stationary stochastic environment, Artificial Intelligence pp. 45-63, I.E.E.E. Publication S-142 (Jan. 1963).
Newell, A., Shaw, J. C., and Simon, H. A., The process of creative thinking, The Rand Corporation Rept. No. P-1320 (September, 1958).
Samuel, A. L., Some studies in machine learning using the game of checkers, IBM J. Res. Develop. 3, 210-229 (1959).
Shannon, C. E., and McCarthy, J. (Ed.), "Automata Studies." Princeton Univ. Press, Princeton, New Jersey, 1956.
Solomonoff, R. J., An inductive inference machine, I.R.E. Convention Record Part 2, 56-62 (1957).
Turing, A. M., On computable numbers with an application to the Entscheidungsproblem, Proc. London Math. Soc. 42, 230-265 (1937).
Uttley, A. M., Information, machines and brains, I.R.E. Trans. Inform. Theory 1, 143-149 (1953).
Von Neumann, J., The general theory of automata, in "John von Neumann-Collected Works" (A. H. Taub, Ed.), Vol. 5, pp. 288-328. Macmillan (Pergamon), New York, 1963.
Walter, W. G., An electromechanical "animal," Discovery 11, No. 3, 90 (1950).
Wechsler, D., Intelligence, quantum resonance and thinking machines, Trans. New York Acad. Sci. Ser. II, 22, 259-267 (1960).
4. Dynamic Programming and Adaptive Control
Aoki, M., On optimal and suboptimal policies in the choice of control forces for final-value systems, I.R.E. Trans. Auto. Control 5, 171-178 (1960).
Aoki, M., Dynamic programming approach to a final value control system with a random variable, I.R.E. Trans. Automatic Control 5, 270-282 (1960).
Bellman, R., A problem in the sequential design of experiments, Sankhya 16, Parts 3 and 4, 221-229 (1956).
Bellman, R., "Dynamic Programming." Princeton Univ. Press, Princeton, 1957.
Bellman, R., and Kalaba, R., On the role of dynamic programming in statistical information theory, I.R.E. Trans. Information Theory 3, 197-203 (1957).
Bellman, R., and Kalaba, R., On communication processes involving learning and random duration, I.R.E. 1958 Natl. Convention Record, 16-21.
Bellman, R., and Kalaba, R., A mathematical theory of adaptive control processes, Proc. Natl. Acad. Sci. 45, 1288-1290 (1959).
Bellman, R., and Kalaba, R., On adaptive control processes, I.R.E. Trans. Automatic Control 4, 1-9 (1959).
Bellman, R., and Kalaba, R., Dynamic programming and adaptive processes: mathematical foundation, I.R.E. Trans. Automatic Control 5, 5-10 (1960).
Bellman, R., "Adaptive Control Processes: A Guided Tour." Princeton Univ. Press, Princeton, New Jersey, 1961.
Bellman, R., Dynamic programming, intelligent machines, and self-organizing systems, in "Mathematical Theory of Automata" (J. Fox, Ed.), pp. 1-11. Brooklyn Polytechnic Press, Brooklyn, 1963.
Bellman, R., Mathematical model making as an adaptive process, in "Mathematical Optimization Techniques" (R. Bellman, Ed.), pp. 333-339. California Univ. Press, Berkeley, California, 1963.
Friemer, M., A dynamic programming approach to adaptive processes, I.R.E. Trans. Auto. Control 4, 10-15 (1959).
Howard, R. A., "Dynamic Programming and Markov Processes." Wiley, New York, 1960.
Marcus, M. B., The utility of a communication channel and applications to suboptimal information handling procedures, I.R.E. Trans. Inform. Theory 4, 147-151 (1958).
Marschak, J., On adaptive programming, Management Sci. 517-526 (July 1963).
5. Sources on Investment Theory; mainly on Multi-Project Investment under Uncertainty
Bernoulli, D., Exposition of a new theory on the measurement of risk, Papers of the Imperial Acad. Sci. Petersburg, 1738; translation by L. Sommer, Econometrica (January, 1954).
Borch, K., Price movements in the stock market, Paper No. 7, Economic Research Program. Princeton Univ., Princeton, New Jersey, April 30, 1963.
Bowman, M. J. (Ed.), "Expectations, Uncertainty, and Business Behavior." Social Science Research Council, New York, 1958.
Coleman, R. P., Formulas for investment and management decisions that maximize the expected exponential rate of capital growth, I.E.E.E. Trans. Eng. Management, 10, 174-177 (1963).
Dorfman, R., Samuelson, P. A., and Solow, R. M., "Linear Programming and Economic Analysis." McGraw-Hill, New York, 1958.
Egerton, R. A. D., "Investment Decisions Under Uncertainty." Liverpool Univ. Press, Liverpool, England, 1960.
Farrar, D. E., "The Investment Decision Under Uncertainty." Prentice-Hall, Englewood Cliffs, New Jersey, 1962.
Fisher, I., "The Theory of Interest." Macmillan, New York, 1930; reprinted, Kelley and Millman, New York, 1954.
Friedman, M. and Savage, L. J., The utility analysis of choices involving risk, in "Readings in Price Theory" (G. J. Stigler and K. E. Boulding, Eds.), American Economic Association, Chicago, 1952.
Granger, C. W. J. and Morgenstern, O., Spectral analysis of New York stock market prices, Kyklos 16, No. 1, 1-27 (1963).
Gurley, J. G. and Shaw, E. S., "Money in a Theory of Finance." The Brookings Institution, Washington, D.C., 1960.
Haavelmo, T., "A Study in the Theory of Investment." Chicago Univ. Press, Chicago, 1960.
Hardy, C. O., "Risk and Risk-bearing." Chicago Univ. Press, Chicago, 1931.
Hillier, F. S., Derivation of probabilistic information for the evaluation of risky investments, Management Sci. 443-457 (Apr. 1963).
Kaldor, N., The equilibrium of the firm, Economic J. 4, 60-76 (1934).
Kalecki, M., "Essays in the Theory of Economic Fluctuations." Allen and Unwin, London, 1939.
Kaufman, G. M., Sequential investment analysis under uncertainty, J. Business 36, No. 1, 39-64 (1963).
Knight, F. H., "Risk, Uncertainty, and Profit." Houghton Mifflin, Boston and New York, 1921.
Latane, H. A., Criteria for choice among risky ventures, J. Political Economy (Apr. 1959).
Lutz, F. and Lutz, V., "The Theory of Investment of the Firm." Princeton Univ. Press, Princeton, New Jersey, 1951.
Markowitz, H. M., "Portfolio Selection." Wiley, New York, 1959.
Masse, P., "Optimal Investment Decisions." Prentice-Hall, Englewood Cliffs, New Jersey, 1962.
Murphy, R. E., Adaptive processes in economic systems, Rept. 119. Institute for Mathematical Studies in the Social Sciences, Stanford University, Stanford, California, 30 July 1962.
Osborne, M. F. M., Brownian motion in the stock market, Operations Res. 7, No. 2, 145-173 (1959); (Also see "Comments" and "Reply," ibid. 7, No. 6, 806-811.)
Rosett, R. N., Estimating the utility of wealth from call options data, Rept. 6332, Nov. 27. Netherlands School of Economics, The Netherlands, 1963.
Shackle, G. L. S., "Decisions, Order and Time in Human Affairs." Cambridge Univ. Press, London and New York, 1961.
Tobin, J., Liquidity preference as behavior toward risk, Rev. Economic Studies 25 (2), No. 67, 65-86 (1958).
6. Sources on Information Theory
Abramson, N., "Information Theory and Coding." McGraw-Hill, New York, 1963.
Brillouin, L., "Science and Information Theory," 2nd Ed. Academic Press, New York, 1962.
Fano, R. M., "Transmission of Information." Wiley, New York, 1961.
Fisher, R. A., "Contributions to Mathematical Statistics." Wiley, New York, 1950.
Hartley, R. V. L., Transmission of information, Bell System Tech. J. 7, 535-563 (1928).
Kelly, J. L., A new interpretation of information rate, Bell System Tech. J. 35, No. 4, 917-926 (1956).
Khinchin, A. I., "Mathematical Foundations of Information Theory." Dover, New York, 1957.
Kolmogorov, A., Interpolation und Extrapolation von stationären zufälligen Folgen, Bull. Acad. Sci. U.S.S.R. Math. Ser. 5, 3-14 (1941).
Kullback, S., "Information Theory and Statistics." Wiley, New York, 1959.
Marschak, J., Remarks on the economics of information, Cowles Foundation Paper No. 146, Yale University (1960).
Marschak, J., Problems in information economics, in "Management Controls: New Directions in Basic Research" (C. P. Bonini, R. K. Jaedicke, and H. M. Wagner, Eds.), pp. 38-74. McGraw-Hill, New York, 1964.
Murphy, R. E., Information theory and economic processes, Proc. 7th Joint Western Regional Meeting of the Institute for Management Science and the Operations Research Society of America pp. 179-204. Western Periodicals, North Hollywood, California (1965).
Pinsker, M. S., "Information and Information Stability of Random Variables and Processes." Holden-Day, San Francisco, 1964.
Rosenblatt, M., "Random Processes." Oxford Univ. Press, London and New York, 1962.
Shannon, C. E., A mathematical theory of communication, Bell System Tech. J. 27, No. 3, 379-423 (1948); ibid. 27, No. 4, 623-656 (1948).
Wiener, N., What is information theory? I.R.E. Trans. Inform. Theory 2, 48 (1956).
7. Sources on the Concept of Entropy and Entropy-Gradient
Amber, G. H., Resonance-probability and entropy-evolution relationships, Proc. I.R.E. 46, 1962 (1958).
Bergson, H., "Time and Free Will." Harper, New York, 1960.
Brillouin, L., "Science and Information Theory," 2nd Ed. Academic Press, New York, 1962.
Eddington, Sir Arthur, "The Nature of the Physical World." Michigan Univ. Press, Ann Arbor, Michigan, 1958.
Fogel, L., A note on the fourth dimension, Proc. I.R.E. 42, 1699 (1954).
Mendelssohn, K., Probability enters physics, Am. Sci. 49, No. 1, 37-49 (1961).
Reichenbach, Hans, Les fondements logiques de la mécanique des quanta, Ann. Inst. Henri Poincaré 13, No. 2, 156 (1953).
Reichenbach, H., "The Direction of Time." California Univ. Press, Berkeley, California, 1956.
Seifert, H. S., Can we decrease our entropy? Am. Sci. 49, No. 2, 124-134 (1961).
Szilard, L., Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen, Z. Physik 53, 840-856 (1929).
Wiener, N., "Cybernetics." Wiley, New York, 1940.
Wiener, N., "The Human Use of Human Beings." Doubleday, New York, 1954.
8. Sources on Statistical Mechanics and Thermodynamics
Boltzmann, L., "Vorlesungen über Gastheorie." Verlag von Johann Ambrosius Barth, Leipzig, 1896.
Eisenschitz, R., "Statistical Theory of Irreversible Processes." Oxford Univ. Press, London and New York, 1958.
Gibbs, J. W., "Elementary Principles in Statistical Mechanics." Dover, New York, 1960.
Khinchin, A. I., "Mathematical Foundations of Statistical Mechanics." Dover, New York, 1949.
Schrödinger, E., "Statistical Thermodynamics." Cambridge Univ. Press, London and New York, 1960.
Tolman, R. C., "The Principles of Statistical Mechanics." Oxford Univ. Press, London and New York, 1958.
9. Sources on Certain Techniques Used in This Book
Arrow, K., Alternative approaches to the theory of choice in risk-taking situations, Econometrica 19, No. 4, 404-35 (1951).
Feller, W., "An Introduction to Probability Theory and its Applications," Vol. I. Wiley, New York, 1957.
Hogg, R. V. and Craig, A. T., "Introduction to Mathematical Statistics." Macmillan, New York, 1959.
Kuhn, H. W. and Tucker, A. W., Nonlinear programming, Proc. 2nd Berkeley Symp. Mathematical Statistics and Probability. California Univ. Press, Berkeley, 1951.
Luce, R. D., "Individual Choice Behavior." Wiley, New York, 1959.
Mood, A. M., "Introduction to the Theory of Statistics." McGraw-Hill, New York, 1950.
Parzen, E., "Modern Probability Theory and its Applications." Wiley, New York, 1960.
Wald, A., "Sequential Analysis." Wiley, New York, 1947.
Wilks, S. S., "Mathematical Statistics." Wiley, New York, 1962.
SUBJECT INDEX

A
Action, of a decision maker, 9, 22
Adaptive behavior, collective, 193-196
  individual, 189-193
Adaptive control, 8, 28, 53
Adaptive process, see Process, adaptive
Asymmetry coefficient, 102

B
Bayes equation, 66
Bayes strategy, see Strategy, Bayes
Bernoulli objective, see Utility, Bernoulli
Bernoulli probability, see Probability, binomial
Bernoulli trials, see Process, binomial
Binary number, 62, 73
Biology, mathematical, 5
Binomial environment, see Environment, binomial
Binomial probability, see Probability, binomial
Bit, 73
Boltzmann's constant, 167
Business cycle, 193

C
Capital, accumulation of, see Growth, expected rate of
  allocation of (individual), see Portfolio ratios
  allocation of (interpersonal), 171
  conservation of, 182, 184
  state, see State, capital
Cash, 97, 129, 132
Causality, principle of, 25-28
Certainty equivalent, 149
Chapman-Kolmogorov equation, 27, 29
Characteristic function, economic, 166, 176
Chebyshev's inequality, 191
Commodities, 23, 169
Communications, 128, 161, 193, 195
Competition, economic, 60, 145, 151
Confusion, of a decision maker, 140, 142, 157
Consistency, statistical, 68-69, 140-142, 188
Constraints, 24-25, 86-87, 95, 129
Consumption, economic, 86
Convictions, of a decision maker, 19, 49-51, 53, 56-57, 65, 69, 118-119, 157-158, 160, 189, 192, 194
Corner solutions, 100
Cybernetics, 7

D
Decision process, see Process, decision
Decision, optimal, 92-93, 100, 105
Decision vector, 17
Decomposition postulate, 126-128
Determinism, 25-28
Dialectic, 1
Dirichlet's probability, see Probability, multinomial beta
Discount, for risk, 101-102, 191
Dynamic driving function, 152
Dynamic economics, 143-144, 167
Dynamic process, see Process, dynamic
Dynamic programming, see Programming, dynamic

E
Economic process, see Process, economic
Economic system, 1, 158, 171
Efficiency, of an adaptive process, see Process, adaptive, efficiency of
Einstein-Wiener process, see Process, Einstein-Wiener
Enlightenment, state of, see State of enlightenment
Entrepreneurship, economic, 195
Entropy, 5, 7, 20, 73-82, 94, 101-103, 113, 116, 119-120
  actual (environmental), 141-142, 146-148, 152-155, 160, 162-167, 170, 190
  conditional, 79-81
  gradient, 71-73, 82-85, 192-195
  limit of, 82
  market, 170, 179-181, 183, 188
  maximum, 78, 101, 136, 138, 153-155
  paradox, 82-85, 188
  subjective, 82-85, 119-120, 136-138, 146-149, 149-151, 152-155, 188
  time, 14-15, 61, 65, 71-73, 92, 126-128, 130, 155, 188
  trajectory, 149-155, 159-161
  truncated, 142
Environment, 4, 24, 32-33, 38
  binomial, 60, 98
  Machiavellian, 156-157
  Markovian, 90, 129, 190
  multinomial, 60-63, 68-69, 148-149, 162-167
  probability function, see Probability, environmental
  process, see Process, environmental
Equilibrium, deterministic, 20, 24, 143
  statistical (stochastic), 20, 146-149, 172, 177-183, 186-187, 193, 195-196
Estimators, 30, 39, 63-69, 81
  best, 63-65, 69
  biased, 63
  maximum likelihood, 64, 69
  minimum variance, 64, 68
  subjective, 67
  unbiased, 63, 65, 68-69
Event, discrete, 12, 22-23, 145
  description, 62, 156, 192
  frequency, 66, 97-99, 122-125, 162, 194
  "no change," 61, 134
Evolution, see Process, adaptive
Expectation(s), 21, 29, 39, 67, 88
  subjective, 109-112, 130-131
Expected, payoff, see Payoff, expected
  rate of growth, see Growth, expected rate of
  utility, see Utility, expected

F
Feedback, see Process, feedback
Fisher-Neyman theorem, 64
Forecasting, 190
Foresight, perfect, 98-99, 108
Forgetting, 156-157
Future, effect of, 4, 8

G
Game theory, 42, 45, 143-144, 190
Growth, expected rate of (for a decision maker), 21, 23, 86-87, 91-94, 95-96, 98-99, 109, 113, 130-131, 134, 138, 148, 166, 170-171, 191, 193-194
  (for a system), 171, 174, 183-186

H
H theorem, 5
Historical information vector, 18-19, 52, 62, 118, see also State, information
Historical process, see Process, historical
Homeostat, 7
Horizon, of a decision maker, 87, 105

I
Illiquidity, economic, see Liquidity, economic
Induction, mathematical, 107
Information, conditional, 78-80
  decoding, 195
  distorted (misinformation), 157
  expected, 77-82
  Fisher's concept of, 82
  full, 90, 120-121, 139-142, 143-146, 163-164, 195
  limited, 141-142, 143-144, 192-195
  measure of, 7, 73-75, 155
  messages, 73, 74
  mutual, 75, 77
  revealed, 143-144
  self, 76
  state, see State, information
  theory, 7, 9, 22-24, 28, 155
  transmission of, 127-128, 193
  vector, see State, information
Introspection, 34-35
Investors, "The Three," 53-60, 96-104, 119
Intelligence, artificial, 6
Irretrievable loss, see Loss, irretrievable
Irreversibility, see Time, irreversibility of

K
Knowledge, see Information
Kuhn-Tucker theorem, 131-133, 139

L
Lagrangian function, 147, 164-166, 174-175
Laplacian assumption, 33, 45, 52-53
Learning theory, 6, 22-23, 32-33, 37
Lehmann-Scheffé theorem, 64, 69
Likelihood function, 58, 65, 162-163, 167, 174
Liquidity, economic, 96, 104, 108-109
Logical width, 68
Loss, irretrievable, 121, 154-155, 161

M
Machina speculatrix, 7
Market, clearing, 177
  entropy, see Entropy, market
  investment, 170, see also Market, stock
  process, see Process, trading
  stock, 14-15, 53-60, 61, 96-104, 126-128, 161, 190, 192
Market transactions, 170, 177, 179-188
Markov environment, see Environment, Markovian
Markov process, see Process, Markov
Maximum likelihood function, see Likelihood function, maximum
Maxwell's demon, 83
Maxwell-Boltzmann probability, see Probability, Maxwell-Boltzmann
Memory, human, 84
Model, mathematical, 12
Multinomial probability, see Probability, multinomial
Multinomial beta probability, see Probability, multinomial beta
Multinomial environment, see Environment, multinomial

N
Nature, spying on, 39
Neurons, 6

O
Objective function, see Utility function
Occupation problem (statistical), 164
Optimal decision, see also Decision, optimal
  time invariant, 94
Optimal policy, 29, 88-89, 129
Optimality, Pareto, 186
  principle of, 29, 87, 90, 93, 105, 137, 189
Organization, see Entropy

P
Pareto distribution, 5, 194
Pattern recognition, 6
Payoff, expected, 39-51, 98, 124-125
  function, 37-51, 96-104, 109-113, 118, 125-128, 146, 151, 156, 162, 169, 171-173, 177, 184
  interactions, 129
  matrix, 42
  probability, see Probability, payoff
Perceptron, 6
Poisson process, see Process, Poisson
Portfolio ratios, 92-93, 96-108, 112-125, 129-139, 146, 169-177, 187-188, 191-195
Predictive inference function, 68
Price mechanism, 86, 177
Price takers, 174
Probability, a priori, 36, 40, 46-47, 56-57, 66, 89-91, 174
  a posteriori, 36, 40, 46-47, 89-91
  beta, 55, 58-59
  Bernoulli, see Probability, binomial
  Dirichlet's, see Probability, multinomial beta
  environmental, 139-141, 148, 162, 184, 188, 189
  Maxwell-Boltzmann, 165, 167, 176, 182, 186, 193-194
  multinomial, 65-68
  multinomial beta, 66-67
  P type, 42, 48, 51
  payoff, 38-40, 171-173
  personal, see Probability, subjective
  Q type, 36-39, 45-46, 48, 50-51
  state, 144, 170-183, 186-188
  subjective, 34, 39-42, 52-69, 81-82, 89-91, 101, 109-112
  transition, 25-28, 144-145
Process, adaptive, decision, 2-4, 8-9, 12, 15, 18-24
    decision maker's view of, 90-91, 113-116, 130, 135, 137
    efficiency of, 121
    first kind, 36-39, 48
    limit of, 119-121, 139-142, 146-149, 154-155, 191, 194-195
    market view of, 90-91, 113-116
    mixed, 42-51
    primitive, 32-51
    properties of, 8-9
    second kind, 35, 39-51
  Bernoulli, see Process, binomial
  binomial, 15, 60, 98, 109, 112, 126
  controlled, 92
  continuous, 12
  decision, 8-9, 17-20, 92
  deterministic, 13, 26, 32-34, 87-88, 145
  discrete, 12, 15, 19, 145
  descriptive, 16-17
  economic, 1, 24, 87-91
  Einstein-Wiener, 126
  environmental, 16-20, 23, 29, 39, 42, 47, 60-63, 69, 88-91, 98, 109, 112, 189, 195
  feedback, 17, 60
  gambling, 32
  growth, 91-94, 190
  historical, 18-20, 21-24, 30
  isotropic, 167
  investment, 129-130
  Markov, 15, 25-28, 60, 73
  Poisson, 73, 126
  random walk, 28, 54, 126
  sequential, 9, 12-30
  social political, 8, 12, 23, 25
  stochastic, 9, 18, 20-21, 26-28, 32, 88-94
  structural, 16-22
  thermodynamic, 5, 126, 165
  time dependent, 61
  time independent, 61
  trading, 14, 169-170, 176-188, 195-196
  type 0, 16-17
  type 1, 17-18
  type 2, 18-20, 22-25
Production function, economic, 86
Production, period of, 73
Programming, dynamic, 8, 28-30, 91-93, 105-108, 114-118, 130-139, 189-193
  nonlinear, 99, 129-130
Psychology, mathematical, 6

R
Random walk, see Process, random walk
Rao-Blackwell theorem, 64
Rationality, 22, 30, 33, 156, 190, 192
Reinforcement, psychological, 34
Risk, economic, 21, 96, 101-103, 109, 194
Risk discount, see Discount, for risk

S
Seasonality, statistical, 14 footnote
Sequential analysis, 122
Shannon's theorems, 191
Stability, of decisions, 35, 49-51
Stage, 12-13
State, 5, 12-13, 143-162
  a priori, 16-18, 20, 22, 88-91, 145
  a posteriori, 20, 22, 88-91, 145
  capital, 87, 145-146, 156, 158-161, 169-174, 187, 193
  cells, 156, 159, 169-173, 177, 183-188
  of enlightenment, 146-149, 163-164, 188, 194
  historical, see State, information
  information, 155-162, 169-177, 183, 187, 193-195
  initial, 25, 145
  probability, see Probability, state
  resource, see State, capital
  stationary, economic, 195-196
State transition function, 12, 15-17, 22-23, 27, 36-37, 40-41, 46-48, 51, 90, 99, 129, 177-178
Stationarity (statistical), 189, 191, 195
Statics, economic, 143
Stochastic approximation, 53
Stochastic process, see Process, stochastic
Strategy, 34, 41-48, 52-53
  admissible, 44
  Bayes, 42, 44, 52-53
  mixed admissible, 45-47
  pure, 43-45
Strategy space, 43-51
Structural process, see Process, structural
Structure, logical, 37-38, 41
Sufficient, statistically, 30, 64, 69

T
T Maze, psychological, 6, 32-39, 45
Technological change, economic, 195
Temperature, thermodynamic, 167
Thermodynamic process, see Process, thermodynamic
Thermodynamic Second Law, 82-83
Threshold coefficient, 141-142
Time, Bergsonian (of consciousness), 14, 71-73, 84-85
  clock (natural), 12, 71-73, 83, 126-128, 192-193
  entropy, see Entropy, time
  index of, 14
  irreversibility of, 8, 28-29, 167, 189, 192
  order of, see Time mapping function
  standards, 71
Time mapping function, 14
Trading process, see Process, trading
Transformation function, see State transition function
Transition probability, see Probability, transition
Turing machine, 6-7

U
Utility, 21-22, 86-87, 92, 95-96, 159, 167, 194
  Bernoulli, 95, 159
  expected, 86-87, 96
  von Neumann, 55-56
Uncertainty, economic, 8, 21, 32, 44, 94, 101-102, 109, 190-191, 196
Urn experiment, 33-35, 39-51

V
Value function, 13
Variance, as a measure of risk, 96

Z
Z-transform, 153