However, when β is small, this is approximately equal to δφ ≈ 2βh φ*(1 − φ*) ≈ 0. This expression shows that the effect of increasing the expected advantage of choice 1 over 2 by h > 0 is nullified by the presence of small β. This is why we call this phenomenon the uncertainty trap. The economy cannot move out of φ* even when g is shifted upwards to favor choice 1. To summarize, an expansionary effort that shifts the g function by h > 0 yields δφ = −h(φ*)/g'(φ*) > 0 with large β, but is nearly zero with small β.
4. A Stochastic Model of Business Cycles

In standard economic explanations, business cycles are direct consequences of individual agents' choices in changing economic environments, such as consumers' intertemporal substitutions. We have three main objectives in this section. First, we demonstrate that aggregate fluctuations arise as an outcome of interactions of many sectors/agents in a simple model. Second, we show that the average level of aggregate output depends on the patterns of demand across sectors. Third, our simple quantity adjustment model clearly shows how some (actually only one in our model) sectors are randomly selected to act first, and their
actions alter aggregate outputs and the interaction patterns among sectors/agents, thus starting the stochastic cycles all over again.
4.1 A Quantity Adjustment Model

Consider an economy composed of K sectors, where sector i employs n_i workers, i = 1,...,K. (The variable n_i need not be the number of employees in a literal sense; it should be a variable that represents the 'size' of the sector in some sense, for example the number of assembly lines.) To present a simple model we assume that K and prices are fixed in this section. (Actually, K can change as sectors enter and exit; see Aoki (2002, Sec. 8).) The output of sector i is assumed to be given by a linear production function Y_i = c_i n_i for i = 1,2,...,K, where c_i is the productivity coefficient. The total output (GDP) is given by the sum over all sectors

Y = Σ_{i=1}^{K} Y_i
Demand for good i is given by s_i Y, where s_i is a positive share of the total output Y that falls on sector i goods, with Σ_i s_i = 1. Here the shares are treated as exogenously fixed; in the next section we let them depend on the total output (GDP) in explaining Okun's law. Each sector has an excess demand defined by

f_i = s_i Y − Y_i,  i = 1,2,...,K    (1)

Changes in Y due to changes in any one of the sector outputs affect the excess demands of all sectors. That is, there is an externality between aggregate output and the demands for the goods of individual sectors. Changes in the pattern of the s_i also affect this set of excess demands. The time evolution of the model is given by a continuous-time Markov chain, as described in Aoki (2002, Sec. 8.6). At each point in time, the sectors of the economy belong to one of two subgroups: one composed of sectors with positive excess demands for their products, and the other of sectors with negative excess demands. We denote the sets of sectors with positive and negative excess demands by I+ = {i : f_i > 0} and I− = {i : f_i < 0}, respectively. These two groups are used as proxies for groups of profitable and unprofitable sectors, respectively.
All profitable sectors wish to expand their production. All unprofitable sectors wish to contract their production. A novel feature of our model is that only one sector succeeds in adjusting its production up or down by one unit of labor at any given time. We use the notion of shortest holding time as a random selection mechanism for the sectors. That is, the sector with the shortest holding or sojourn time is the sector that jumps first. Only the sector that jumps first succeeds in implementing its desired adjustment. See Lawler (1995) or Aoki (2002, p. 28) for the notion of holding or sojourn time of a continuous-time Markov chain. We call the sector that jumps first the active sector. Variables of the active sector are denoted with subscript a.
4.2 Transition Rates

It is well known that the dynamics of this continuous-time Markov chain are determined uniquely by the transition rates. See Breiman (1968, Chapter 15). We assume that the economy initially has enough unemployed workers so that sectors incur zero costs of firing or hiring, and do not hoard workers. We also assume no on-the-job search by workers. To increase output the active sector calls back one (unit of) worker from the pool of workers who were earlier laid off by the various sectors. (The actual rehired worker is determined by a probabilistic mechanism that involves measuring distances among clusters of heterogeneous laid-off workers by ultrametrics; see Sec. 5. We note merely that our model can incorporate idiosyncratic variations in the profitability of sectors and frictions in hiring and firing.) When f_a < 0, n_a is reduced by one and the unemployment pool of sector a, U_a, is increased by one; that is, one worker is immediately laid off. When f_a is positive, n_a is increased by one. See the next section for a more detailed explanation.
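To make the mechanics of Secs. 4.1 and 4.2 concrete, the following minimal Python sketch simulates the quantity adjustment process. It is our illustrative reading of the text, not the authors' simulation code; in particular, drawing an independent unit-rate exponential holding time for every sector at each step is an assumption made here only to implement the shortest-holding-time selection.

```python
import random

def simulate(c, s, n, steps=10000, seed=0):
    """Quantity adjustment: the sector with the shortest exponential holding
    time becomes active and adjusts its employment by one unit of labor."""
    rng = random.Random(seed)
    K = len(c)
    gdp_path = []
    for _ in range(steps):
        Y = sum(c[i] * n[i] for i in range(K))            # total output (GDP)
        f = [s[i] * Y - c[i] * n[i] for i in range(K)]    # excess demands, Eq. (1)
        holding = [rng.expovariate(1.0) for _ in range(K)]
        a = min(range(K), key=lambda i: holding[i])       # active sector jumps first
        if f[a] > 0:
            n[a] += 1          # profitable sector recalls one laid-off worker
        elif f[a] < 0 and n[a] > 0:
            n[a] -= 1          # unprofitable sector lays off one worker
        gdp_path.append(Y)
    return gdp_path

# two sectors; demand falls mostly on the more productive sector
path = simulate(c=[1.0, 2.0], s=[0.3, 0.7], n=[300, 300])
print(path[0], path[-1])
```

Different share patterns s can then be compared through the average level of the resulting GDP path, in the spirit of the simulations reported in Sec. 4.4.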
4.3 Continuum of Equilibria

The equilibrium states of this model are such that all excess demands are zero, that is, s_i Y_e = c_i n_i^e, i = 1,2,...,K, where the subscript e of Y and the superscript e of n_i denote equilibrium values. Denoting the total equilibrium employment by L_e = Σ_i n_i^e, we have

η Y_e = L_e    (2)
where η = Σ_i s_i/c_i. This equation is the relation between the equilibrium level of GDP and that of employment. We see that this model has a continuum of equilibria. In equilibrium, the sizes of the sectors are distributed in proportion to the ratio s_i/c_i,

n_i^e = (s_i/c_i) Y_e for all i    (3)

In the next section we see that the parameter η plays an important role in the model behavior.
4.4 Model Behavior

Aoki (2002, Sec. 8.6) analyzes a simple version with K = 2 and shows that as the shares s_i are changed, so are the resulting aggregate output levels. In simulations we have used several different demand share patterns: some with more demand falling on more productive sectors and others with more demand falling on less productive sectors. Simulations verify the sector size distribution formula given above. For all demand patterns the aggregate outputs initially decrease when we start the model with initial conditions in which the n_i are too large for equilibrium values. The model quickly sheds excess labor and settles down to oscillate around the equilibrium level, i.e., business cycles. Loosely speaking, the more demand is concentrated on more productive sectors, the more quickly the model settles into business cycles. The more demand is concentrated on more productive sectors, the higher the average level of aggregate output. An interesting phenomenon is observed when the demand patterns are switched from more productive to less productive sectors and conversely. See Aoki and Yoshikawa (2005) for details.
5. New Model of Labor Dynamics

This section discusses Okun's law and the Beveridge curve by augmenting the model of the previous section with a mechanism for hiring and firing, while keeping the basic model structure the same.
5.1 A New State Vector

Consider, as before, an economy composed of K sectors, where sector i employs n_i workers, i = 1,...,K.
Sectors are now in one of two statuses: either in normal time or in overtime. That is, each sector has two capacity utilization regimes. The output of sector i is now given by

Y_i = c_i (n_i + v_i),

where v_i takes the value 0 in normal time and 1 in overtime. More explicitly, in normal time

Y_i = c_i n_i

for v_i = 0, where c_i is the productivity coefficient and n_i denotes the number of employees of sector i. In overtime, indicated by v_i = 1, the n_i workers produce output equal to

Y_i = c_i (n_i + 1).

Note that in overtime the labor productivity is higher than in normal time because c_i (n_i + 1)/n_i > c_i. This setup may be justified by possible underutilization of labor in normal time. The total output (GDP) is given by the sum over all sectors, as before, Y = Σ_i Y_i. Recall that demand for good i is given by s_i Y as in the previous section, where s_i is a positive share of the total output Y which falls on sector i goods, with Σ_i s_i = 1.
5.2 Transition Rates

To implement a simple model dynamics we assume the following; other arrangements of the details of the model behavior are of course possible. Each sector has three state vector components: the number of employed workers, n_i, the number of laid-off workers, U_i, and a binary variable v_i, where v_i = 1 means that sector i is in overtime status, producing output c_i(n_i + 1) with n_i employees. Sectors in overtime status all post one vacancy sign during the overtime status. When one of the sectors in overtime status becomes active with positive excess demand, it actually hires one additional unit of labor and cancels the overtime sign. When a sector in overtime becomes active with negative excess demand, it cancels the overtime, returns to normal time, and the vacancy sign is removed. When v_i = 0, sector i is in normal time, producing output c_i n_i with n_i workers. When one of the sectors, sector i say, in normal time becomes active with positive excess demand, it posts one vacancy sign and changes v_i into one. If this sector has negative excess demand when it becomes active, it fires one unit of labor. To summarize: when f_a < 0, n_a is reduced by one and U_a is increased by one, which means one worker is immediately laid off. We also assume
that v_a is reset to zero. When f_a is positive, we assume that it takes a while for the sector to hire one worker if it has not been in overtime status, i.e., if v_a is not 1. If sector a had previously posted a vacancy sign, then sector a now hires one worker and cancels the vacancy sign, i.e., resets v_a to zero. If it has not previously posted a vacancy sign, then it now posts a vacancy sign, i.e., sets v_a to 1, and increases its production with the existing number n_a of workers by going into the over-utilization state. The transition path may be stated as z to z', where

(n_a, U_a, v_a = 0) → (n_a, U_a, v_a = 1)

and

(n_a, U_a, v_a = 1) → (n_a + 1, U_a − 1, v_a = 0)

In either case the output of the active sector changes to Y_a' = Y_a + c_a. We next describe the variations in outputs and employment in business cycles near one of the equilibria.
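The hiring/firing/overtime rules of this subsection can be collected into a single state-update function for the active sector. The sketch below is one consistent reading of the rules above; the function and variable names are ours, not the authors'.

```python
def update_active_sector(n_a, U_a, v_a, f_a):
    """Apply the Sec. 5.2 rules to the active sector and return the new state.

    v_a = 1 means the sector is in overtime and has posted a vacancy sign;
    v_a = 0 means normal time.
    """
    if f_a > 0:
        if v_a == 1:
            # vacancy already posted: hire one worker, cancel the vacancy sign
            return n_a + 1, U_a - 1, 0
        # no vacancy posted yet: go into overtime and post a vacancy sign
        return n_a, U_a, 1
    if f_a < 0:
        if v_a == 1:
            # cancel overtime, return to normal time, remove the vacancy sign
            return n_a, U_a, 0
        # normal time with negative excess demand: fire one worker
        return n_a - 1, U_a + 1, 0
    return n_a, U_a, v_a

# example: a sector in overtime becomes active with positive excess demand
print(update_active_sector(n_a=50, U_a=5, v_a=1, f_a=2.0))  # -> (51, 4, 0)
```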
5.3 Hierarchical tree of unemployment pools

In our model jobs are created or destroyed by changes in the excess demand patterns. Pools of unemployed workers are heterogeneous because of the geographical locations of sectors, human capital, length of unemployment periods, and so on. A given sector i, say, has associated with it a pool of unemployed workers who are the laid-off workers of sector i. They have the highest probability of being called back if sector i is active and can hire one worker. Pools of workers who were laid off from sector j, j ≠ i, have a lower probability of being hired by sector i, depending on the distance d(i,j), called an ultrametric distance. These pools are organized into hierarchical trees with the pool of the laid-off workers from sector i at the root. The probability of hiring a worker from outside pool i is a decreasing function of d(i,j). The ultrametric distance between pools i and j is symmetric, d(i,j) = d(j,i), and satisfies what is called the ultrametric condition d(i,j) ≤ max{d(i,k), d(k,j)}. See Aoki (1996, p. 36, Chapter 7) for further explanation.
This ultrametric notion is also used in numerical taxonomy; see Jardine and Sibson (1971). For spin glasses and other physics, see Mezard and Virasoro (1985). See Schikhof (1984) for the mathematics involved. Feigelman and Ioffe (1991) give an example showing why the usual correlation coefficients between patterns do not work in hierarchical organization.
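As an illustration of the hierarchical organization just described, the sketch below encodes a toy two-level tree of unemployment pools and a call-back weight that decreases with the ultrametric distance. The tree, the distance values, and the exponential decay are assumptions made only for this example; the text requires only that d(i, j) satisfy the ultrametric condition and that the hiring probability decrease with d(i, j).

```python
import math

# toy hierarchy: each pool is tagged with a top-level branch and a leaf label
BRANCH = {1: ('A', 'A1'), 2: ('A', 'A2'), 3: ('B', 'B1'), 4: ('B', 'B2')}

def d(i, j):
    """Ultrametric tree distance: 0 within the same pool, 1 between sibling
    pools, 2 across the top-level branches."""
    if i == j:
        return 0
    return 1 if BRANCH[i][0] == BRANCH[j][0] else 2

def callback_weight(i, j, scale=1.0):
    """Relative probability that sector i recalls a worker laid off by sector j,
    assumed here to decay exponentially with the ultrametric distance."""
    return math.exp(-d(i, j) / scale)

# the ultrametric condition d(i,j) <= max(d(i,k), d(k,j)) holds on the tree
assert all(d(i, j) <= max(d(i, k), d(k, j))
           for i in BRANCH for j in BRANCH for k in BRANCH)
print(callback_weight(1, 1), callback_weight(1, 2), callback_weight(1, 3))
```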
5.4 Okun's Law

Okun's law is an empirical relationship between changes in GDP, Y, and the unemployment rate u. We define Okun's law by

ΔY/Y_e = −β ΔU/N    (4)
where N = L + U is the total population, of which L is employed and U is unemployed. In this paper we keep N fixed for simpler presentation. This numerical value of β is much larger than what one expects under the standard neoclassical framework. Take, for example, the Cobb-Douglas production function with no technical progress factor. Then GDP is given by Y = K^(1−α) L^α with α of about 0.7. We have ΔU = −ΔL, where ΔK and ΔN are assumed to be negligible in the short run. The production function then implies that ΔY/Y = α ΔL/L in the short run. That is, a one percent decrease in Y corresponds to an increase of ΔU/N = −(1/α)(ΔY/Y)(1 − U/N), i.e., an increase of a little over 1 percent in the unemployment rate. To obtain the number 4, as in Okun's law, we need some other effects, such as an increasing marginal product of labor or some other nonlinear effects. See Yoshikawa (2000).

We assume that economies fluctuate about their equilibrium state, and refer to the relation (4) as Okun's law, where Y_e is the equilibrium level of GDP, approximated by the central value of the variations in Y in simulation. Similarly, ΔU is the amplitude of the business cycle oscillation in the unemployed labor force. U_e is approximated by the central value of the oscillations in U, and Y_e and L_e are related by the equilibrium relation (2). The changes ΔY/Y and ΔU/U are read off from the scatter diagrams in simulation after allowing a sufficient number of time steps to ensure that the model is in a "stationary" state. In simulations we note that after a sufficient number of time steps have elapsed, the model is in or near the equilibrium distribution. Then Y and U are nearly linearly related with a negative slope, which can be read off from the scatter diagrams, i.e., ΔY = −x ΔU, and we derive the expression for the Okun coefficient, β = xN/Y_e = xη/(1 − U_e/N).

We next see that the situation changes as the demand shares are made to depend on Y. We now assume that the demand shares depend on Y, hence η depends on Y.
Differentiate the continuum-of-equilibria relation η(Y)Y = L (dropping the superscript e from now on) with respect to Y to obtain

dL/dY = η(Y) + Y dη/dY,

with

dη/dY = Σ_i (1/c_i)(ds_i/dY),

so that the coefficient of Okun's law becomes

β = N / [Y (η + Y dη/dY)]    (5)
Okun's law in the economics literature usually refers to changes in gross domestic product (GDP) and unemployment rates measured at two different time instants, such as one year apart. There may therefore be growth or decline in the economies. To avoid confusing the issues about the relations between GDP and unemployment rates during stationary business cycle fluctuations, that is, those without growth of GDP, and those with growth, we run our simulations in stationary states, assuming no change in the number of sectors, the productivity coefficients, or the total labor force in the model. Okun's law refers to a stable empirical relation between unemployment rates and the rate of change of GDP: a one percentage point increase (decrease) in the unemployment rate corresponds to a β percent decrease (increase) in GDP, where β is about 4 in the United States.
Example. Let the shares vary according to

s_i(Y) = s_i^e + γ(Y − Y_e),

where s_i^e is the equilibrium value of s_i, and the remaining share is adjusted to satisfy the condition that the sum equals 1. Here are some numbers. With
(s_1, s_2) = (1 − s, s), s_0i = 0.1, i = 1, 2, (1/c − 1) = 10^2, and γ = 10^-4, we obtain β = 2.1. With γ = 1.5 × 10^-4, β = 4.3. With γ = 3 × 10^-4, s_0 = 4 × 10^-2, and 1/c − 1 = 10^2, β = 3.6. The next two figures show the Okun coefficients derived from simulations with K = 10.
Fig. 1. Examples of Okun's law. [Two panels: "UY Curve and Okun Law for D1" and "UY Curve and Okun Law for D2"; x: OLS slope, α: Okun coefficient, convergence at 1500; horizontal axis: unemployed after convergence, vertical axis: aggregate output after convergence.]
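The way the Okun coefficient is read off the simulated scatter can be sketched as follows: fit the slope x of Y against U and convert it using the definition (4), β = xN/Y_e. The synthetic data below merely stand in for the simulation output; the numbers carry no meaning.

```python
import numpy as np

def okun_coefficient(Y, U, N):
    """Estimate the Okun coefficient from stationary business-cycle data.

    x is minus the OLS slope of Y on U, i.e. dY = -x dU, and beta = x * N / Y_e
    with Y_e taken as the central value of the Y oscillations.
    """
    x = -np.polyfit(U, Y, 1)[0]     # OLS slope of Y on U, sign reversed
    Y_e = np.mean(Y)                # equilibrium GDP ~ centre of the oscillation
    return x * N / Y_e

# stand-in for simulation output: Y and U oscillate around their equilibria
rng = np.random.default_rng(0)
U = 130 + rng.normal(0, 5, 400)
Y = 190 - 0.3 * (U - 130) + rng.normal(0, 0.5, 400)
print(okun_coefficient(Y, U, N=1000))
```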
5.5 Beveridge Curves

In the real world unemployment and vacancies coexist. The relation between the two is called the Beveridge curve. It is usually assumed that its position in the u–v plane is independent of aggregate demand. In simulations, however, we observe that the loci shift with Y. Our model has a distribution of productivities. Demand affects not only Y but also the relation between the unemployment and vacancy loci. This result is significant because it means that structural unemployment cannot be separated from cyclical unemployment due to demand deficiency. This implies that the notion of a natural rate of unemployment is not well defined.
The original Matlab program was written by an economics graduate student in the Department of Economics at the University of California at Los Angeles, L. Kalesnikov. It was later revised by two graduate students at the Faculty of Economics, University of Tokyo, Ikeda and Hamazaki.
Fig. 2. Examples of Beveridge Curves
5.6 Simulation Studies

Since the model is nonlinear and possibly possesses multiple equilibria, we use simulations to deduce some of the properties of the models. We pay attention to the phenomenon of trade-offs between GDP and unemployment, and to the scatter diagrams of GDP vs. unemployment, to gather information on business cycle behavior. Our model behaves randomly because the jumping sectors are random, the holding times being randomly distributed. This is different from the models in the literature, which behave randomly due to exogenously imposed technology shocks. As we indicate below, the state spaces of the model have many basins of attraction, each with nearly equal output levels. Simulations are used to gather information on model behavior. Various cases with K = 4, K = 8, and K = 10 have been run. Four hundred Monte Carlo runs of duration 7000 elementary time steps each have been run. Fig. 2 shows the average GDP of P1. It shows that after 700 time steps the model is in the closed set.
Simulation programs were written originally by V. Kalesnik, a graduate student at the University of California at Los Angeles, and later modified by F. Ikeda and M. Suda, graduate students at the Faculty of Economics, University of Tokyo.
5.7 Effects of Demand Management on Sector Sizes

Consider a demand pattern in which the low-productivity sectors receive a major portion of the aggregate demand, so that, as (2) shows, the least productive sector has the largest equilibrium size to meet the demand. Suppose that the model has entered the closed set and exhibits stationary business cycles, and suppose that the demand pattern is switched so that the high-productivity sectors now receive the major portion of the demand. One would conjecture that the stationary Y values will increase and the model will reach a new stationary state. When the size of the least productive sector is very large, however, the model will start by shrinking the size of the least productive sector more often than increasing the size of the more productive sectors. Under some conditions it is easy to show that the probability of a size reduction by the least productive sector is much larger than that of a size increase of the productive sectors, at least immediately after the switch of the demand pattern. When the productivity coefficients and demand shares satisfy certain conditions, the net reduction of Y is permanent, contrary to our expectation. See Aoki and Yoshikawa (2004) for details.
5.8 Summary of Findings from Simulations

Simulation results may be summarized as follows:

1. Larger shares of demand falling on more productive sectors result in higher average values of GDP.

2. The relationship between unemployment and vacancies depends on demand. Our simulations show that Beveridge curves shift up or down when Y goes down or up, respectively. In other words, when Y declines (goes up) the Beveridge curve shifts outward (downward).

3. The relationship between unemployment and the growth rate of GDP is described by a relation similar to Okun's law.

4. The economy reaches the 'equilibrium' faster with larger shares of demand falling on more productive sectors. This indicates that demand affects not only the level of GDP but also the adjustment speed toward equilibrium.

In other words, our simulations show that higher percentages of demand falling on more productive sectors produce four new results: (1) average GDPs are higher; (2) the Okun coefficients are larger; (3) transient responses are faster; and (4) the timing of demand pattern switches matters in changing GDP. Unlike the Cobb-Douglas or linear production functions, which lead to values of β less than one, we obtain values of the Okun coefficient from 2 to 4 in our simulations, depending on γ.
It is remarkable that we can deduce these results from models with linear, constant-coefficient production functions. This indicates the importance of the stochastic interactions among sectors introduced through the device of stochastic holding times. Using a related model, Aoki and Yoshikawa (2003) allow the positive excess demand sectors to go into overtime until they can fill the vacancy. This model produces the Beveridge curve shifts. The details are to appear in Aoki and Yoshikawa (2005).
6. Summing Up

We have advanced the following propositions and perspectives through our new stochastic approach to macroeconomics.

1. Equilibria of a macroeconomy are better described as probability distributions. Master equations describe the time evolution of a macroeconomy.

2. Sectoral reallocations of resources generate aggregate fluctuations or business cycles. Given heterogeneous microeconomic objectives and constraints, thresholds for changes in strategies differ across sectors/firms. It takes time for productivities across sectors to equalize. In the meantime, in responding to excess demands or supplies, the level of resource inputs in at least one sector changes, and macroeconomic conditions also change, initiating another round of changes.
References

Aoki, M. (1996): New Approaches to Macroeconomic Modeling, Cambridge University Press, New York.
Aoki, M. (1998): A simple model of asymmetrical business cycles: Interactive dynamics of a large number of agents with discrete choices, Macroeconomic Dynamics 2, 427-442.
Aoki, M. (2002): Modeling Aggregate Behavior and Fluctuations in Economics, Cambridge University Press, New York.
Aoki, M., and H. Yoshikawa (2002): Demand saturation-creation and economic growth, Journal of Economic Behaviour and Organization 48, 127-154.
Aoki, M., and H. Yoshikawa (2002): Modeling Aggregate Behavior and Fluctuations in Economics, Cambridge University Press, New York.
Aoki, M., and H. Yoshikawa (2003): A Simple Quantity Adjustment Model of Economic Fluctuation and Growth, in Heterogeneous Agents, Interaction and Economic Performance, R. Cowan and N. Jonard (eds), Springer, Berlin.
Aoki, M., and H. Yoshikawa (2003): Uncertainty, Policy Ineffectiveness, and Long Stagnation of the Macroeconomy, Working Paper No. 316, Stern School of Business, New York University.
Aoki, M., and H. Yoshikawa (2004): Effects of Demand Management on Sector Sizes and Okun's Law, presented at the 2004 Wild@ace conference, Torino, Italy. Forthcoming in the conference proceedings and in a special issue of Computational Economics.
Aoki, M., and H. Yoshikawa (2004): Stochastic Approach to Macroeconomics and Financial Markets, under preparation for the Japan-US Center UFJ monograph series on international financial markets.
Aoki, M., and H. Yoshikawa (2005): A New Model of Labor Market Dynamics: Ultrametrics, Okun's Law, and Transient Dynamics, pp. 204-219 in Nonlinear Dynamics and Heterogeneous Interacting Agents, T. Lux, S. Reitz, and E. Samanidou (eds), Lecture Notes in Economics and Mathematical Systems No. 550, Springer-Verlag, Berlin Heidelberg, 2005.
Aoki, M., and H. Yoshikawa (2005): Reconstructing Macroeconomics: A Perspective from Statistical Physics and Combinatorial Stochastic Processes, Cambridge University Press, New York, forthcoming 2006.
Aoki, M., H. Yoshikawa, and T. Shimizu (2003): The long stagnation and monetary policy in Japan: A theoretical explanation, in Conference in Honor of James Tobin: Unemployment: The US, Euro-area, and Japan, W. Semmler (ed), Routledge, New York.
Blanchard, O. (2000): Discussions of the Monetary Response: Bubbles, Liquidity Traps, and Monetary Policy, in R. Mikitani and A. S. Posen (eds), Japan's Financial Crisis and its Parallel to U.S. Experience, Inst. Int. Econ., Washington D.C.
Blanchard, O., and P. Diamond (1989): The Beveridge Curve, Brookings Papers on Economic Activity 1, 1-60.
Blanchard, O., and P. Diamond (1992): The flow approach to labor markets, American Economic Review 82, 354-59.
Davis, S. J., J. C. Haltiwanger, and S. Schuh (1996): Job Creation and Destruction, MIT Press, Cambridge MA.
Davis, S. J., and J. C. Haltiwanger (1992): Gross job creation, gross job destruction, and employment reallocation, Quart. J. Econ. 107, 819-63.
Feigelman, M. V., and L. B. Ioffe (1991): Hierarchical organization of memory in models of neural networks, in E. Domany, J. L. van Hemmen, and K. Schulten (eds), Springer, Berlin.
Girardin, E., and N. Horsewood (2001): Regime switching and transmission mechanisms of monetary policy in Japan at low interest rates, Working paper, Univ. de la Mediterranee, Aix-Marseille II, France.
Hamada, K., and Y. Kurosaka (1984): The relationship between production and unemployment in Japan: Okun's law in comparative perspective, European Economic Review, June.
Ingber, L. (1982): Statistical Mechanics of Neocortical Interactions, Physica D 5, 83-107.
Jardine, N., and R. Sibson (1971): Mathematical Taxonomy, John Wiley, London.
Krugman, P. (1998): It's baaack: Japan's Slump and the Return of the Liquidity Trap, Brookings Papers on Economic Activity 2, 137-203.
Lawler, G. (1995): Introduction to Stochastic Processes, Chapman & Hall, London.
Mezard, M., and M. A. Virasoro (1985): The microstructure of ultrametricity, J. Phys. 46, 1293-1307.
Mortensen, D. (1989): The persistence and indeterminacy of unemployment in search equilibrium, Scand. J. Econ. 91, 347-60.
Ogielski, A. T., and D. L. Stein (1985): Dynamics on ultrametric spaces, Phys. Rev. Lett. 55, 1634-1637.
Okun, A. M. (1983): Economics for Policymaking: Selected Essays of Arthur M. Okun, J. A. Pechman (ed.), MIT Press, Cambridge MA.
Schikhof, W. H. (1984): Ultrametric Calculus: An Introduction to p-adic Analysis, Cambridge Univ. Press, London.
Taylor, J. B. (1980): Aggregate Dynamics and Staggered Contracts, J. Pol. Econ. 88, 1-23.
Yoshikawa, H. (2003): Stochastic Equilibria, 2003 Presidential Address, the Japanese Economic Association, The Japanese Econ. Rev. 54, 1-27.
____________________________________________________________
Probability of Traffic Violations and Risk of Crime: A Model of Economic Agent Behavior

J. Mimkes
1. Introduction

The behavior of traffic agents is an important topic of recent discussions in the social and economic sciences (Helbing, 2002). The methods are generally based on the Fokker-Planck equation or on master equations (Weidlich, 1972, 2000). The present investigations are based on the statistics of binary decisions with constraints. This method is known as the Lagrange-LeChatelier principle of least pressure in many-decision systems (Mimkes, 1995, 2000). The results are compared to data for traffic violations and other criminal acts like shoplifting, theft, and murder. The Lagrange-LeChatelier principle is well known in thermodynamics (Fowler and Guggenheim, 1960) and is one way to support economics with concepts from physics (Stanley, 1996, 1999).
2. Probability with constraints

The distribution of N cars parked on the two sides of a street is easily calculated from the laws of combinations. The probability of N_l cars parking on the left side and N_r cars parking on the right side of the street is given by

P(N_l; N_r) = [N! / (N_l! N_r!)] (1/2)^N    (1)
In Fig. 1 the cars are evenly parked on both sides of the street. This even distribution always has the highest probability. According to Eq. (1), P(2;2) = 37.5 %.
Fig. 1. The distribution with one half of the cars on each side of the street is the most probable. According to Eq. (1) the probability for N_l = 2 and N_r = 2 is P(2;2) = 37.5 %.

Fig. 2. The distribution with all cars on one side and none on the other always has the least probability. For N_l = 0 and N_r = 4 we find P = 6.25 %.
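Eq. (1) is easy to verify numerically; the short snippet below reproduces the two probabilities quoted in the captions of Figs. 1 and 2 (it is only an illustration of the formula).

```python
from math import comb

def parking_probability(n_left, n_right):
    """P(N_l; N_r) of Eq. (1): binomial counting with a fair 1/2 per car."""
    n = n_left + n_right
    return comb(n, n_left) * 0.5 ** n

print(parking_probability(2, 2))  # 0.375  -> 37.5 %, Fig. 1
print(parking_probability(0, 4))  # 0.0625 ->  6.25 %, Fig. 2
```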
3. Constraints

In Fig. 3 the no-parking sign on the left side forces the cars to park on the right side only. The "no parking" sign is a constraint that enforces the least probable distribution of cars and makes the very improbable distribution of Fig. 2 most probable! Laws are constraints that will completely change the distribution in any system.
Fig. 3 The “no parking” sign enforces the least probable distribution of cars.
In Fig. 4 we find one driver ignoring the “no parking” sign. What is the probability of this unlawful distribution?
Fig. 4 One car is ignoring the “no parking” sign. This is an unlawful act and may be called a defect. In solids we expect the number of defects to depend on the energy (E) of formation in relation to the temperature (T ). The number of wrong parkers will depend on the amount (E) of the fine in relation to normal income (T ).
Fig. 4 shows an unlawful act. In physics the occupation of a forbidden site is called a defect, like an atom on a forbidden lattice site. The probability of this traffic defect may be calculated from the laws of statistics of many decisions with constraints, the Lagrange principle.
4. Probability with constraint (Lagrange principle)

The law of stochastic systems with constraints was introduced by Joseph de Lagrange (1736-1813):

L = E + T ln P → maximum!    (2)

L is the Lagrange function, a sum of the functions E and T ln P. The function P is the probability of the distribution of N cars according to Eq. (1). The function E stands for the constraint, the fine for wrong parking in Fig. 4. The parameter T is called the Lagrange factor and will be a mean amount of money. The natural logarithm of the probability, ln P, is called entropy. In the calculation of ln P according to Eq. (1) we may use the Stirling formula,

ln P = ln [N! / (N_w! N_r! 2^N)]
     = N ln N − N_w ln N_w − (N − N_w) ln(N − N_w) − N ln 2    (2a)

The N_w cars parked on the wrong side each have to pay a parking ticket (−E_0). Introducing the relative amount x of wrongly parked cars, x = N_w / N, we obtain the Lagrange function of the system of parked cars:

L(T, x) = N x (−E_0) + T N { −x ln x − (1 − x) ln(1 − x) − ln 2 } → max!    (2b)

At the maximum the derivative of L with respect to x is zero,

∂L/∂x = −N { E_0 + T [ ln x − ln(1 − x) ] } = 0    (2c)

This leads to the relative number x = N_w/N of wrongly parked cars as a function of the fine (−E_0) at a mean amount of money T,

N_w / N = 1 / [1 + exp(E_0 / T)]    (2d)
The distribution of traffic offenders in Flensburg, 2000 (Fig. 5) follows a Boltzmann function. Apparently this distribution is also valid for "social defects" like traffic violations, as indicated by the good agreement between calculation and data.
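A short numerical illustration of Eq. (2d), the fraction of wrongly parked cars as a function of the relative fine E_0/T; the grid of values is arbitrary and only the functional form comes from the text.

```python
from math import exp

def wrong_parking_fraction(E0_over_T):
    """Nw/N of Eq. (2d): fraction of cars parked on the forbidden side."""
    return 1.0 / (1.0 + exp(E0_over_T))

for ratio in (0.0, 1.0, 2.0, 4.0):
    print(f"E0/T = {ratio}: Nw/N = {wrong_parking_fraction(ratio):.3f}")
```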
5. Probability with two constraints (LeChatelier principle)

In Fig. 6 a car is again parked on the forbidden side. In contrast to Fig. 4 there is no space on the legal side. Due to the missing space the driver is
forced to park his car on the illegal side, if there is an important reason or stress for the driver to park in this lot.
Fig. 5. Distribution of traffic offenders in Flensburg, 2000 (A = 4000, R = 0, T = 2.2). Total number of all offenders: 2869. Red line: calculation according to (2d). [Axes: number of points vs. number of offenders.]
In Fig. 6 we have two constraints in the decision of the driver to park: the "no parking" sign (E) and the limited space (V). The Lagrange principle now has two constraints and two order parameters, (T) and (p):

L = E − pV + T ln P → maximum!    (3)

L: Lagrange function
E: 1st constraint, e.g. the fine for a parking violation
T: 1st Lagrange factor, e.g. the mean amount of parking costs
V: 2nd constraint, e.g. space, freedom
p: 2nd Lagrange factor, e.g. stress, pressure
P: probability, Eq. (1)
Equation (3) is also called the LeChatelier principle of least pressure. It is applied to all situations where people (or atoms) try to avoid external pressure.
Fig. 6. One car is forced to violate the "no parking" sign due to an external stress or pressure (p) that reduces the freedom of choice of the roadside.
6. Risk, stress and punishment

We may calculate the probability of the situation in Fig. 6. For N_r cars parked on the right side and N_w cars parked on the wrong side, with a parking ticket (E_P) and a parking space (v) needed per car, we obtain

L = N_w (E_P − p v) + T { N ln N − N_w ln N_w − N_r ln N_r − N ln 2 } → max!    (3a)
Fig. 7. Distribution of traffic violators under various risks of punishment. The probability of violations drops with rising risk and a rising relative fine E/T. [Curves for R = 0, 30 %, 60 %, 100 %; horizontal axis: relative fine E/T; vertical axis: relative number x of violators.]
At equilibrium the maximum is reached and we may again differentiate with respect to the number of delinquents, Nw.
∂L/∂N_w = (E_P − p v) − T { ln(N_w/N) − ln[(N − N_w)/N] } = 0    (4)
Solving for the ratio N_w/N we obtain the relative number of agents violating the law as a function of the relative fine (E_P/T):

N_w / N = (1 − R) / [1 − R + exp(E_P / T)]    (5)
N_w/N is the relative number of cars parked on the wrong side. E_P is the fine (or punishment) for the parking violation in relation to a mean value T.

Risk: In Eq. (5) the function (1 − R) has been introduced to replace the stress of finding a parking space (v):

1 − R = exp(− p v / T)    (6)

or

R = 1 − exp(− p v / T)    (6a)

The parameter R may be regarded as the risk that is taken into account under external pressure (p):

R(p = 0) = 0    (6b)

R(p → ∞) → 1    (6c)
Without stress (p = 0) no risk is taken, and at very high stress (p → ∞) a risk close to 100 % is taken into account. This is an important result: risk is caused by internal or external pressure. Only people under stress will take into account the risk of punishment. The risk (R) and punishment (E_P) of Eq. (5) have been plotted in Fig. 7. At no risk the distribution of right and wrong is 50:50; doing right or wrong does not matter if the risk of being caught is zero. With growing risk the number of violators is reduced, if the punishment is high enough. At a risk of 60 % the relative number of violators N_w/N is close to the Boltzmann distribution observed in Fig. 5. At 100 % risk nobody violates the law, even if the relative fine (E/T) is not very high. In Fig. 8 the percentage of shoplifters is shown as a function of the stolen value (E). The data are for Germany 2002 and have been taken from the German criminal statistics (BKA). The risk of being caught shoplifting is given by the ratio of solved to reported cases. For shoplifting the risk is R = 0.95 according to the statistics of the BKA.
Apparently the generally juvenile shoplifters link the value of the loot (E) to the amount of punishment (E_P); fear dominates over greed.
Fig. 8. Distribution of shoplifters as a function of the stolen value at 95 % risk in shoplifting in Germany 2002 (www.bka.de). The probability of violations drops with rising stolen value E. [Panel: "Shoplifting (R = 95 %, T = 150 €)"; horizontal axis: value in Euro; vertical axis: percentage of shoplifters.]
Fig. 9. Distribution of criminal violators under various risks (R) at relative fine E_P/T = 5 according to Eq. (7). At low risk nearly all agents will become criminals if the greed or the expected relative gain (E) is sufficiently high. [Panel: "Probability of crime at risk R and fine E_P/T = 5"; curves for R = 0 %, 30 %, 50 %, 90 %, 99 %, 100 %; horizontal axis: relative amount of gain E/T; vertical axis: relative number x of criminals.]
7. Criminal acts

In contrast to parking problems, in criminal acts the greed for the expected profit (E) reduces the fear of punishment (E_P), and Eq. (5) has to be replaced by

N_w / N = (1 − R) / {1 − R + exp[(E_P − E) / T]}    (7)
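Eqs. (5) and (7) can be tabulated directly. The sketch below uses Eq. (7), which reduces to Eq. (5) for zero expected gain; the parameter grids are arbitrary and serve only to reproduce the qualitative behaviour shown in Figs. 7 and 9.

```python
from math import exp

def violation_fraction(E_p_over_T, E_over_T=0.0, R=0.0):
    """Nw/N of Eq. (7); with E_over_T = 0 it reduces to Eq. (5)."""
    return (1.0 - R) / ((1.0 - R) + exp(E_p_over_T - E_over_T))

# parking-type violations (no gain): the fraction drops with the fine and the risk
for R in (0.0, 0.3, 0.6, 1.0):
    print(R, [round(violation_fraction(f, R=R), 3) for f in (0, 1, 2, 4)])

# criminal acts: a high expected gain overwhelms the fine unless the risk is ~100 %
print([round(violation_fraction(5.0, E_over_T=g, R=0.3), 3) for g in (0, 5, 10)])
```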
Fig. 10. Decreasing number of criminal violators under growing risk (R) according to Eq. (7) at a constant value of (E_P − E)/T = 5. [Panel: "Criminal acts under growing risk at constant E/T"; horizontal axis: risk; vertical axis: relative number of criminal acts.]
Fig. 11. Distribution of theft (square data points) under various risks (diamond data points) in the 15 federal states of Germany in 2002. There were 1.5 million thefts reported, with a total value of 692 million €, or a mean amount of 461 € per theft [www.bka.de]. Calculations (solid line) according to Eq. (7) with a negative exponent, (E_P − E)/T = −3 (greed dominating fear). [Panel: "Calculation and data on theft and risk in D 2002"; horizontal axis: German federal states; vertical axis: relative number of thefts, risk.]
Fig. 12. Distribution of murder (blue data points) under various risks (red data points) in Germany between 1993 and 2001 [www.bka.de]. Calculations (solid line) according to Eq. (7) with a negative exponent, (E_P − E)/T = −3 (greed dominating fear). [Panel: "Rate of murder D 2002, A = 1.9, E/T = −3"; legend: murders per 100, risk, calculation; horizontal axis: year.]
The risk (R) is again the same as in Eq. (6). The relative number of criminals N_w/N expected according to Eq. (7) has been plotted in Fig. 9. In Fig. 9 the relative number of criminals N_w/N grows with the expected illegal profit (E/T). According to the figure all agents will become criminals if the risk is very low and the expected profit is high enough: nearly everybody may be bribed! In countries with a low standard of living the probability of corruption is much higher than in countries with a high standard of living. Unfortunately, no data have been found to match the calculations.
8. Conclusions

The statistics of agent behavior under the constraint of laws has been applied to data on traffic and on crime. In both systems the statistical Lagrange-LeChatelier principle agrees with the well-recorded results. Behavior is apparently determined by risk and punishment (economic losses). Similar behavior of agents may be expected in all corresponding economic and social systems (situations) such as stock markets, financial markets, or social communities.
References

BKA (2002): Polizeiliche Kriminalstatistik 2002 der Bundesrepublik Deutschland, http://www.bka.de.
Fowler, R., and Guggenheim, E. A. (1960): Statistical Thermodynamics, Cambridge University Press.
Helbing, D. (2002): Volatile decision dynamics: experiments, stochastic description, intermittency control and traffic optimization, New J. Phys. 4, 33.
KBA (2002): Kraftfahrt-Bundesamt Flensburg 2003, http://www.kba.de.
Mimkes, J. (1995): Binary Alloys as a Model for Multicultural Society, J. Thermal Anal. 43, 521-537.
Mimkes, J. (2000): Die familiale Integration von Zuwanderern und Konfessionsgruppen - zur Bedeutung von Toleranz und Heiratsmarkt, in Th. Klein (ed.), Partnerwahl und Heiratsmuster, Verlag Leske und Budrich, Leverkusen.
Mimkes, J. (2000): Society as a many-particle system, J. Thermal Analysis 60, 1055-1069.
Stanley, H. E., L. A. N. Amaral, D. Canning, P. Gopikrishnan, Y. Lee, and Y. Liu (1999): Econophysics: Can physicists contribute to the science of economics?, Physica A 269, 156-169.
Stanley, M. H. R., L. A. N. Amaral, S. V. Buldyrev, S. Havlin, H. Leschhorn, P. Maass, M. A. Salinger, and H. E. Stanley (1996): Scaling Behavior in the Growth of Companies, Nature 379, 804-806.
Weidlich, W. (1972): The use of statistical models in sociology, Collective Phenomena 1, 51.
Weidlich, W. (2000): Sociodynamics, Harwood Acad. Publ., Amsterdam.
____________________________________________________________
Markov Nets and the NetLab Platform: Application to Continuous Double Auction

L. Muchnik and S. Solomon
1. Introduction and background

1.1 What's the problem?

In describing the dynamics of classical bodies one uses systems of differential equations (Newton's laws). Increasing the number of interacting bodies requires finer time scales and heavier computations. Thus one often takes a statistical approach (e.g. Statistical Mechanics, Markov Chains, Monte Carlo Simulations) which sacrifices the details of the event-by-event causality. The main assumption is that each event is determined only by the events immediately preceding it rather than by events in the arbitrary past. Moreover, time is often divided into slices and the various cause and effect events are assumed to take place in accordance with this arbitrary slicing. The dynamics of certain economic systems can be expressed similarly. However, in many economic systems the dynamics is dominated by specific events and by the specific reactions of the agents to those events. Thus, to keep the model meaningful, causality and in particular the correct ordering of events has to be preserved rigorously down to the lowest time scale. We introduce the concept of Markov Nets (MN), which allows one to represent exactly the causal structure of events in natural systems composed of many interacting agents. Markov Nets preserve the exclusive dependence of an effect event on the event directly causing it but make no assumption on the time lapse separating them. Moreover, in a Markov Net the possibility exists that an event is affected if another event happens in the meantime between its causation and its expected occurrence. We present a simulation platform (NatLab) that uses the MN formalism to make
simulations that preserve exactly the causal timing of events without paying an impossible computational cost.
Fig. 1. The figure represents two scenarios, (a) and (b), in which two traders react to the market’s having reached the price 10, by a buy and respectively a sell order. This moment is represented by the leftmost order book in both (a) and (b) sequences. The current price is marked by the two arrow heads at 10. We took, for definiteness, both offer and demand stacks occupied with equal size limit orders (represented by horizontal lines) at integer values around 10. The (a) sequence illustrates the case where the seller is faster and saturates the buy order at the top of the buy stack: 9 (this is shown by the order book in the middle). Then, the buyer saturates the lowest offer order: 11 (right-most column). Thus the final price is 11. The (b) figure illustrates the case where the buyer is faster. In this case the final price is 9. Thus an arbitrary small difference in the traders’ reaction times leads to a totally different subsequent evolution of the market. The problem is to devise a method that insures arbitrary precision in treating the timing of the various events (see discussion of Fig. 13 below).
The present paper describes the application of the “Markov Net” (MN) concept on a generic continuous time asynchronous platform (NatLab) (Muchnik, Louzoun, and Solomon 2005) to the study of continuous double-auction markets. The continuous double-auction market is a fundamental problem in economics. Yet most of the results in economics were obtained for markets that are discrete time, synchronous, or both. The markets evolve in a reality in which time is continuous and traders act asynchronously by sending bid and ask orders independently. A change in the sequence of arrival to the market of two orders may change the actual price by a significant value and eventually lead to completely different subsequent market development (even if their arrival time difference is arbitrarily small). Thus, to describe the market dynamics faithfully one has to insure arbitrary precision in the timing of each event (Fig. 1).
The insistence on preserving in great detail the causal order of the various events might seem pedantic and irrelevant for large disordered systems of many independent traders. After all, statistical mechanics is able to reproduce the equilibrium thermodynamic results of some Ising-like models without even considering the details of their dynamics (McCoy and Wu 2005). However, there are certain systems in which this requirement seems unavoidable. Economic markets are one of them. The classical theoretical framework for understanding market price formation is discrete and synchronous. It is based on the equilibrium price concept introduced by Walras in 1874. Walras' equilibrium price did not address the issue of market time evolution. He assumed that in order to compute the market price it is sufficient to aggregate the offer and demand of all the traders. The equilibrium price will then be the one that insures that there is no outstanding demand or offer left. The extension of the Walrasian equilibrium-price mechanism to time evolving markets was suggested only later by Hicks (1939, 1946), Grandmont (1977), and Radner (1972). At each time, the demand curves of all the traders were collected and aggregated synchronously. The intuition that ignoring the asynchronous character of the market may miss its very causality occurred to a number of economists in the past. In fact, Lindahl (1929) suggested that economic phenomena like the business cycle may be due exactly to the iterative process of firms / individuals reacting to each other's positions. This insight did not have wide impact because the methods used in modeling markets with perfect competition (General Equilibrium Theory) and markets with strategic interaction (Game Theory) are synchronous and, arguably, even timeless1. If the extension of the Walrasian paradigm (Walras 1874, 1909) to time evolving market prices would hold, the sequence in which the orders arrive would be irrelevant and the insistence on absolute respect of the time order of the various market events would be unnecessary. However, in spite of the conceptual elegance of the Walrasian construction, its application to time-dependent markets is in stark contrast with what one experiences in the real market. In reality the demand and offer order stacks include at each time only a negligible fraction of the shares and of the share holders. Moreover, only the orders at the top of the stack have a direct influence on the price changes. One may hope that the local fluctuations due to the particular set of current traders would amount to only some noise on top of
We are greatly indebted to Martin Hohnisch for very illuminating and informative correspondence on these points.
a more fundamental dynamics closer to the Walrasian paradigm. However, there are many indications that this is not the case. First, the multi-fractal structure of the market implies that there is no real separation between the dynamics at the shortest time scales and at the largest time scales (Mandelbrot, 1963; Liu et al., 1999). Thus one cannot indicate a time beyond which the market is bound to revert to the fundamental/equilibrium price. Second, the motivations, evaluations, and decisions of most of the traders are not framed in terms of an optimal long range investment but rather in terms of exploiting the short term fluctuations, whether warranted or not by changes in the "real" or "fundamental" value of the equity. Third, the most dramatic changes in the price take place at time scales of at most days, so aggregating over longer periods misses their causality. Fourth, the Walrasian auction requires each trader to define a personal demand function, i.e. to define his conditional bid/offer for any eventual price. In practice, not only do the traders not bother with, and do not have the necessary notions to make, their decisions for arbitrary improbable prices, but also the market microstructure just does not have the instruments capable of collecting these kinds of orders. All the arguments above can be summed up in a more formal, tautological form: the variation of the price over large time periods is the sum (in fact product) of the (relative) variations of the market at the single transaction level. A full understanding of the market dynamics is therefore included in the understanding of the single-trade dynamics. This idea has been investigated in models of high frequency financial data (Scalas 2005; Scalas, Gorenflo, Luckock et al. 2004; Scalas, Gorenflo, and Mainardi 2000). The fact that Walras did not emphasize this point is due not only to the lack at the time of detailed single-trade data. Ideologically he had no way to be primed in this direction: the program of Statistical Mechanics, of deducing global average thermodynamic properties from the aggregation of the binary interactions of individual molecules, was not accepted even in physics in his time. In recent times, however, there have been attempts to reformulate Walrasian ideas in a statistical mechanical framework (Smith and Foley 2004).
1.2 The Background of the "Markov Net" concept

One may be surprised at how long the natural continuous-time asynchronous reality has been represented in computers in terms of discrete-time synchronous approximations. Even after this became unfashionable, the lack of consideration for the precise causal order of events continued to
prevail through the use of random event ordering. Alternatively, in many applications the exact order in which the various events take place is left to uncontrollable fortuitous details such as arbitrary choices by the operating system or the relative work load on the computer processors. This is even more surprising given the fact that some fundamental scientific problems fall outside the limits imposed by these approximations. An outstanding example is the continuous time double auction market that we present in the second part of this paper. A possible explanation for this state of affairs is the influence that the Turing discrete linear machine paradigm has on the thinking about computers. This influence led to the implicit (thus automatic) misconception that discrete machines can only represent discrete-time evolution. The continuous time is then viewed as a limit of arbitrarily small time steps that can be achieved only by painstaking efforts. We will show that this is not necessarily the case. We are also breaking away from another implicit traditional assumption in computer simulations: the insistence on seeing the time evolution of the computer state as somewhat related to the time evolution of the simulated system. Even when the computer time intervals are not considered proportional to the simulated system time, the order in which the events are computed is assumed to be that of the simulated world. As we shall see, in our implementation of the Markov Nets, there is an occasional lack of correspondence between the two times. For instance, one may compute the time of a later (or even eventually un-happened) event before implementing the event that happens next according to the simulated world causality. The "Markov Net" (MN) idea is easier to understand if one disabuses oneself of the psychological tendency to assign to the simulated system (1) the discreteness and (2) the causality of the computer process simulating it. The mental effort to overcome this barrier is worthwhile: we can (without computational load penalty) go easily to any time precision (e.g. if desired, one can use a time resolution of the order of the time quantum - the time necessary for light to traverse a Planck length, ~ 10^-44 sec - in simulations that involve time scales relevant to daily trading activities). The only requirement would be to use double - or, if desired, arbitrary - real variable bit-length (Koren, 2002). Using the NatLab framework, one can apply the MN concept to applications in heterogeneous agent based macroeconomic models (Gallegati and Giulioni 2003).
2. Markov Net Definition and NatLab Description

2.1 Definition, Basic Rules, and Examples

In this chapter we describe sequences of events in the Markov Net formalism and the way the NatLab platform implements them. While ostensibly we just describe a sequence of simulated events, this will allow us to make explicit the way the MN treats the scheduling and re-scheduling of future events as the causal flow is generated by the advancing of the process from one present event to the next. A Markov Net is a set of events that happen in time to a set of agents. The events cause one another in time with a certain lag between cause and effect. Thus at the time of causation the effect event is only potential. Its ultimate happening may be affected by other events happening in the meantime. To take this into account, after executing the current event of the Markov Net, the putative happening times of all its directly caused potential effects are computed. The process then jumps to the earliest yet unhappened putative event, which thereby becomes the current event. From there, the procedure is iterated as above. Note that with the present definition, Markov Nets are deterministic structures. We leave the study of probabilistic Markov Nets for future research. The name "Markov Nets" has been chosen in order to emphasize the fact that the present events are always only the result of their immediate cause events (and not of the entire history of the Net). However, as opposed to a Markov Chain (Bharucha-Reid 1960, Papoulis 1984), the current state is not dependent only on the state of the system immediately preceding it. For instance, as seen below, the current event could be directly caused by an event that took place arbitrarily far in the past. This is to be contrasted with even the n-th order generalizations of Markov Chains, where the present can only be affected by a limited time band immediately preceding it. To recover continuum time in an n-th order Markov Chain, the time step is taken to 0 and the dynamics collapses into an n-th order stochastic differential equation (continuous Markov Chain) (Wiener 1923). In contrast, a Markov Net exists already in continuum time, and the past time-strip that influences the present is not fixed in advance but depends on the system dynamics. In this, it shares properties with the Poisson process (except that it allows for a rather intricate causal structure). Thus, only in special regimes can it be approximated by differential equations. In order to explain the functioning of a MN, one represents the various agents' time evolution as directed horizontal lines going from some initial time (at the left) towards the future (the right) (Fig. 2).
The events of receiving some input (being caused) or sending some output (causing) will be represented by points on these horizontal lines. Thus each event is associated with an agent line and with a specific time. Its position on the agent axis will correspond to the Markov Net time of the event (Fig. 3). A vertical line will indicate the current time. Note that in a Markov Net, one jumps from the time of one event to the next one without passing through or even considering any intermediate times. The actual transmission/causation will then be represented by a directed line starting on the sending (causing) agent’s axis at the sending (cause event) time and ending on the receiving agent axis at a (later) reception (effect event) time (Fig. 3). Note that while the Markov Net is at a given time (event), the process is (re)calculating and (re)scheduling all the future potential events directly caused (or modified) by it. These events may be later modified or invalidated by “meantime events” (events that have taken place in the meantime), so we represent them by open dots pointed to by grey dotted (rather than full black) lines. Those lines are only to represent that the events’ timings were computed and the events were scheduled, but in fact their timing or even occurrence can be affected by “meantime secondary effects” (secondary effects that have happened in the meantime) of the current state (Figs. 5-7). Incidentally, this procedure, which is explained here for deterministic events, can be extended to probabilistic events. The actual time that the transmission/causation takes is computed based on the state of the sending (causing) and receiving (affected) agents. Other time delays, e.g. between the arrival of an order to the market and its execution, are similarly represented. Since the transmission / causation takes time, it often happens that as a signal travels from one event (agent) to another, another event on the receiver site affects the reception (delaying it, cancelling it ,or changing its effect). The entire dynamics of the system can then be expressed in terms of its Markov Net: the communication/causation lines and their departure (cause) and arrival (effect) event times. We consider the internal processes of an agent (e.g. thinking about a particular input to produce a decision as output) as a special case in which an arrow starts and ends on events belonging to the same agent. The beginning of the arrow represents the time at which the process started and the end of the arrow represents the time the process ended (typically by a decision and sending a communication). The computation of the thinking time and the scheduling of the issue time for the decision are made immediately after the execution of the event that triggers the decision process.
Of course, as with the evaluation of other future events, if other meantime events affect the agent, the thinking process and the decision can be affected or even cancelled. Thus the scheduling of an event arrival is always provisional/tentative, because the relevant agent can undergo another "unexpected" event before the scheduled event. Such an intervening event will modify the state of the agent and in particular the timing or even the actual happening of the event(s) that the agent is otherwise scheduled to undergo. For instance, as shown in Fig. 8, as a result of certain "news" received at 15:00, the agent tentatively computes a certain internal deliberation that ends with a sell order at 17:00. If, however, at 16:00 the agent receives a message cancelling the "news", the ordering event tentatively scheduled for 17:00 is cancelled (Fig. 9) and some other event (going home at 16:30) can be tentatively scheduled (Fig. 9) and executed (Fig. 10).
Another example of a Markov Net, involving three agents, is shown in Fig. 11. Their interaction sequence is made explicit through the MN scheduling and re-scheduling mechanisms. At the beginning of the process two events are pre-scheduled to take place at times t1 (affecting the middle agent) and t2 (affecting the upper agent). These events cause the two agents to enter internal deliberation states to prepare some reaction. The middle agent succeeds in making a very fast decision - the decision at t3 - and causes the lower agent to send (at t4) a new signal to the upper agent. This new signal (received at t5) interrupts the internal deliberation state in which the upper agent existed. Thus, a scheduled decision event that was expected to happen at t6 as a result of t2 is cancelled (this is shown by the dashed arrow). Instead, as a result of the signal received from the lower agent, the upper agent enters a new internal deliberation, which concludes with an answer being sent (at t7) to the lower agent.
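Using the scheduler sketched above, the news/cancellation scenario of Figs. 8-10 could be expressed as follows; the clock times, labels, and causation rules are invented for the illustration.

```python
# Illustrative causation rules for the scenario of Figs. 8-10 (times in clock hours).
pending = {}                                     # label -> tentatively scheduled Event

def cause_fn(event, net):
    if event.label == "news at 15:00":
        order = Event(17.0, event.agent, "sell order")
        pending["order"] = order                 # tentative: may still be invalidated
        return [order]
    if event.label == "cancellation at 16:00":
        pending["order"].cancelled = True        # meantime event invalidates the order
        return [Event(16.5, event.agent, "go home")]
    return []

net = MarkovNet(cause_fn)
net.run([Event(15.0, "Agent", "news at 15:00"),
         Event(16.0, "Agent", "cancellation at 16:00")], t_max=24.0)
# Printed sequence: news at 15:00, cancellation at 16:00, go home at 16:30;
# the tentatively scheduled 17:00 sell order never happens.
```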
3. Applications to Continuous Double Auction

3.1 Simple Example of a Double Auction Markov Net

The simplest MN diagram of a continuous double-auction is shown in Fig. 12. In this diagram we show only one trader (the upper horizontal line) and the market (the lower horizontal line). The market price evolution is shown below the market line. The process described by this diagram starts with the agent setting P0 (represented on the price graph by a *) as an attention trigger. This is represented by a formal (0-time delay) message sent to the market (at time t1). When the price reaches P0 (at time t2), the agent notices it (t3), considers it (t4), and makes the decision to send a buy order (t5)
that will be executed at t6. At the same time (t5), the agent also sets a new attention trigger price P1.
Returning to the example in the introduction, we can now see how the Markov Net formalism takes care of the details of the traders' behavior (Tversky and Kahneman, 1974; Kahneman, Knetch, and Thaler, 1986; Akerlof, 1984) in order to represent faithfully the sequence of market events that their behavior engenders. In Fig. 12 one can see that at certain past times (t1 and t2), each of the traders has fixed 10 as an action trigger price threshold. This is represented in the MN conventions by messages that each of them sent to the market. This is only an internal NatLab platform technicality, such that the messages are considered as not taking any transmission time in the simulated world. As a consequence of these triggers, when the market reaches 10 at time t3 (following some action by another trader), each of them receives a "message" that is the representation on the platform of the fact that their attention is being awakened by the view (presumably on their computer screen) of the new price display equal to 10. Due to the delays in their perception (and other factors), their reception times (t4 and t5) are slightly different. Once they perceive the new situation, they proceed to think about it and reach a decision. Their deliberation times also differ from one another. Thus in the end they send their market orders at different times (t6 and t7). After taking into account the time each order takes to reach the market, one is in a position to determine the arrival times of each of the orders (t8 and t9). Thus, for given parameters (perception times, decision times, and transmission lags), those times are reproducibly computable and unambiguous market scenarios can be run. Fig. 13 A and B illustrate the two possible outcomes discussed in Fig. 1. Incidentally, upon seeing the effect of Agent 2's order, Agent 1 may wish to take fast action to cancel his own order. Of course this possibility depends on the existence of a second communication channel to the market that is faster than the one on which the initial order has been sent.
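For given lags, the arrival times of the two orders (t8 and t9 above) follow by simple addition; the numbers in the following fragment are invented purely for illustration.

```python
# Hypothetical lags (in seconds) for the two traders of Fig. 13.
trigger_time = 0.0                      # t3: the market first displays the trigger price 10
lags = {"Trader 1": dict(perception=0.8, decision=2.5, transmission=0.3),
        "Trader 2": dict(perception=1.1, decision=1.2, transmission=0.4)}

for trader, lag in lags.items():
    arrival = trigger_time + lag["perception"] + lag["decision"] + lag["transmission"]
    print(f"{trader}: order reaches the market at t = {arrival:.1f} s")
# Whichever order arrives first is executed first, so once the lags are fixed the
# market scenario is reproducible and unambiguous.
```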
3.2 Application of Markov Nets to a Realistic Continuous Double-Auction Market

We have performed computer-based experiments on a continuous double-auction market. As seen below (e.g. the spectacular loss of the professional strategy 7), the details of the timing of the various events at the lowest time scale (single trade) have an overwhelming influence on the market outcome.
Fig. 2. The line representing an agent time evolution in a Markov Net. Below it, the time axis is plotted.
Fig. 3. The black dot on the line of Agent 1 represents the currently happened event. The vertical dotted line represents the time at which the Markov Net evolution has currently arrived. The dashed arrow represents the potential causation of a future event (empty dot on the line of Agent 2) by the current event.
Fig. 4. This Markov Net represents two initial events (black dots) belonging to Agents 1 and 3 that cause two potential effect events to Agent 2.
Fig. 5. The Markov Net of Fig. 4 advances to the first event on the Agent 2 axis.
Fig. 6. As a consequence of the first event acting on Agent 2, the second event is modified and shifted to a later time.
Fig. 7. The modified second event acting on Agent 2 is executed.
Fig. 8. News arriving at 15:00 causes potentially a decision at 17:00.
Fig. 9. The news is cancelled by a new "news" at 16:00. The new "news" produces a potential effect at 16:30.
Fig. 10. In the absence of other meantime events, the effect event at 16:30 is actually executed.
Fig. 11. A Markov Net with three interacting agents and some interfering events.
Fig. 12. Schematic implementation of the continuous double auction.
Fig. 13. The order books are the same as in Fig. 1. The figure describes in detail the Markov Net implementation of the correct time order in such a situation. The upper (lower) panel describes the situation
The NatLab experiments were based on the "Avatars" method that we have introduced recently (Daniel, Muchnik, and Solomon, 2005) in order to ensure that our simulations/experiments are realistic. The "Avatars"
method allows us to capture in the computer the strategies of real traders (including professionals). By doing so we are crossing the line between computer simulations and experimental economics (Daniel, Muchnik, and Solomon 2005). In fact we have found that NatLab is a convenient medium to elicit, record, analyze, and validate economic behavior. Not only does NatLab operate in a most motivating and realistic environment, but it also provides a co-evolution arena for individual strategy dynamics and emergent market behavior. Casting the subjects' strategies within computer algorithms (Avatars) allows us to use them subsequently in various conditions: with and without dividend distribution, with and without interest on money, in the presence or absence of destabilizing strategies. We present below some experiments in detail.
We have collected the preferred trading strategies from seven subjects and have created (by varying the individual strategies' parameters) a set of 7000 trading agents. We have verified with the subjects that the resulting traders behave according to their intentions. We then made runs long enough to have a number of trades per trader of the order of 1000. In addition, we introduced 10,000 random traders to provide liquidity and stability. One can also consider their influence on the market as a surrogate for market makers (which we intend to introduce in future experiments). In this particular experiment we examined the influence on the market of having 1000 of the traders respect a daily cycle. We did so by first running the market simulation in the absence of the periodic traders and then repeating the runs in their presence.
Let us first describe some of the relevant traders, their algorithms, and their reaction to the introduction of the periodic daily trends (the numbers on the curves in the graphs in Fig. 15 to Fig. 18 correspond to the numbers in the following list).

1 Random
We introduced random traders. Each random trader extracts his belief about the future price from a flat, relatively narrow distribution around the present price and then offers or bids accordingly, with a limit order set to be executed at the price just below (offer) or above (bid) the current market price. The order volume is proportional to the number of the agent's shares (in the case of an offer) or to his cash (in the case of a bid) and to the relative distance between the current market price and the price the agent believes is right. As seen in Fig. 15, the random traders perform poorly; their average (Fig. 16) is very smooth because their guesses at each moment are uncorrelated. Their performance and behavior are not affected by the presence or absence of daily trends, due to their inability to adjust to them (Fig. 17, Fig. 18).

2 Short range mean reverters
They compute the average over a certain - very short - previous time period and assume that the price will return to it in the near future. On this basis they buy or sell a large fraction of their shares (a schematic code version of this rule is sketched after item 4 below). When executed on the market with no periodic daily trends (Fig. 15), this strategy took advantage of the observed short-range negative autocorrelation of price returns. In fact, it was the winning strategy in that case. Agents following this strategy had their behavior strongly correlated among themselves. They were able to adjust to the continuously changing fluctuations (Fig. 16). When periodic trends (and hence short-range correlations in returns annihilating the natural anti-correlations) were introduced (Fig. 17), the performance of the strategy dropped dramatically. However, even in this case, positive returns were recorded. In the case of the periodic market, one can clearly identify the emergence of correlation between the actions of all agents in this group (Fig. 18).

3 Evolutionary extrapolation
Each agent that follows this strategy tries to continuously evaluate its performance and occasionally switches between trend-following and mean-reverting behavior. Due to its relatively long memory horizon, any trend and correlation that could otherwise have been exploited was averaged out, and the strategy shows moderate performance (Fig. 15 and Fig. 17) with almost no spikes of consistent behavior (Fig. 16, Fig. 18).

4 Long range mean reverters
Long range mean reverters compare the long-range average to the short-term average in order to identify large fluctuations and submit moderate limit orders to exploit them. This strategy does not have any advantage when no well-defined fluctuations exist, and it cannot outperform the other strategies (Fig. 15). The actions of the 1000 agents following this strategy are uncorrelated, and their performance is uncorrelated and in general averages to 0 (Fig. 16). However, when long trends do appear, in the case when periodic fluctuations of the price are induced, agents following this strategy discover and exploit them (though with some delay) (Fig. 18) to their benefit. In fact, they expect that the price fluctuation will end and the price will eventually revert to the average. Unlike the short-range mean reverters, they are not disturbed by the small fluctuations around the - otherwise consistent - longer cycle. However, due to these agents' high risk aversion, their investment volumes are small, which prevents them from outperforming the best strategy in the daily periodic case (Fig. 17).
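The following is a minimal sketch of how the short-range mean-reverter decision rule (strategy 2) might look in code. It is our own illustration, not the Avatar collected from the subject; the window length, order-size fraction, and function name are invented.

```python
import numpy as np

def short_range_mean_reverter(prices, shares, cash, window=10, fraction=0.5):
    """Return a (side, volume, limit_price) order, or None.

    prices -- recent trade prices, most recent last
    shares -- current inventory; cash -- current cash holdings
    """
    if len(prices) < window:
        return None
    short_avg = float(np.mean(prices[-window:]))
    last = prices[-1]
    if last > short_avg:               # price above its short average: expect reversion down
        volume = int(fraction * shares)
        return ("sell", volume, last) if volume > 0 else None
    if last < short_avg:               # price below its short average: expect reversion up
        volume = int(fraction * cash / last)
        return ("buy", volume, last) if volume > 0 else None
    return None
```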
Fig. 15. Relative wealth in a market including the first seven types of traders. The strategy labels on the left border of the graph correspond to the numbers labeling the traders in the text.
Fig. 16. The shares’ trading history is plotted for each strategy. The average number of bought shares per time unit per trader is represented by the height of the graphs above 0. The number of sold shares is represented by the depth below 0.
Fig. 17. The relative wealth evolution in a market containing all types of traders including daily traders (number 8).
Fig. 18. The shares traded
Fig. 19. The activity of the daily trader (strategy number 8 in the list).
5 Rebalancers and trend followers
They keep 70% of their wealth in the riskless asset. The remaining 30% is used in large order volumes in a trend-following manner: if the short-range average price exceeds the long-range average by more than one standard deviation, they buy; if the price falls by more than one standard deviation, they sell. Agents practicing this strategy use market orders to ensure that their intentions are executed. In general this strategy is relatively immune to sudden price fluctuations, since it causes its agents to put at most 30% of their wealth at risk. It attempts to ride moderate trends. This does not produce remarkable performance in the first case, when trends are not regular; the strategy's performance is average in this case. However, it is the best possible strategy in the case of periodic fluctuations. The agents following this strategy discover the trends quickly and expect them to continue for as long as the price evolution is consistent with them. In fact, they consistently manage to predict the price evolution and make big earnings (Fig. 17).

6 Conservative mean reverters
Similar to the long range mean reverters (item 4 above) but with a slightly shorter-range average and less funds in the market. This does not make any difference when no fluctuations are present (Fig. 15). However, when the periodic fluctuations are introduced (Fig. 17), they expect the price to return to the average sooner than it actually does. Thus they misread the periodic fluctuations for rather long intervals.

7 Professional volume watchers
Professional volume watchers watch the aggregated volumes of the buy and sell limit orders. If a large imbalance towards the offer side is discovered, they sell, and vice versa. Their intention is to predict incipient dynamics and respond immediately upon the smallest possible indicators by issuing market orders, so that the deal is executed as soon as possible. This strategy, though suggested by a professional trader (and making some theoretical sense), fails to perform well in both cases. The reason for that is, in our opinion, a wrong interpretation of the imbalance in the orders. Agents of this type expect the volume of the limit orders in the book to indicate the excess demand or supply and hence give a clue to the immediate price change. However, in our case, the volumes of limit orders on both sides of the book are highly influenced by big market orders (for example, those issued by the rebalancers and trend followers) and in fact are the result of very recent past large deals rather than a sign of an incoming trend. Another way to express this effect is to recall that the prices are strongly anticorrelated at one tick. Thus this strategy is consistently wrong in its expectations. Interestingly enough, its disastrous outcome would be hidden by any blurring of the microscopic causal order (that would miss the one-tick anticorrelation).

8 The daily traders
Daily traders gradually sell all their holdings in the "morning" and then buy them back in the "evening hours". This causes the market price to fluctuate accordingly. What is more important, the behavior of all the other strategies is greatly affected, as well as their relative success. Some of these strategies adapt and exploit the periodic trends, while others lose systematically from those fluctuations.
Some of the results (specifically the time evolution of the average relative wealth of the traders representing each subject) are plotted in Fig. 15 and Fig. 17. The unit of time corresponds to the time necessary to have, on average, one deal per agent.
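As a further illustration, the daily cycle of strategy 8 can be caricatured in a few lines; the day length and the pacing of the orders below are placeholders, not the values used in the experiment.

```python
def daily_trader(time, shares, cash, price, day_length=1.0):
    """Sell the inventory gradually in the 'morning', buy it back in the 'evening'."""
    phase = (time % day_length) / day_length        # position within the simulated day
    if phase < 0.5 and shares > 0:                  # morning: unload a slice of the holdings
        return ("sell", max(1, shares // 10), price)
    if phase >= 0.5 and cash > price:               # evening: buy back a slice
        return ("buy", max(1, int(cash / (10 * price))), price)
    return None
```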
4. Conclusions and Outlook

For a long while, the discrete-time Turing machine concept and the tendency to see computers as digital emulations of a continuous reality led to simulation algorithms that misrepresented the causal evolution of systems at the single-event time scale. Thus the events took place only at certain fixed or random times. The decision of how to act at the current time was usually made by picking (systematically or randomly) agents and letting them act based on their current view of the system (in some simulations, event ordering was even left to the arbitrary decision of the operating system!). This neglected the lags between cause, decision, action, and effect. In the "Markov Net" representation it is possible for an event to be affected by arbitrary other events that were caused in the meantime between its causation and its happening. This is achieved exactly and without having to pay the usual price of taking a finer simulation mesh.
We have constructed a platform (NatLab) based on the Markov Net (MN) principle and performed a series of numerical and real-subject experiments in behavioral economics. In the present paper we have experimented with the interactions and emergent behavior of real subjects' preferred strategies in a continuous double-auction market. In the future, we propose to extend the use of the NatLab platform in a few additional directions:
• Experiment with the effect on the market of various features and events.
• Compare the efficiency of different trading strategies.
• Isolate the influence of (groups of) traders' strategies on the market.
• Study the co-evolution of traders' behavior.
• Find ways to improve market efficiency and stability.
• Study how people depart from rationality.
• Study how out-of-equilibrium markets achieve or do not achieve efficiency.
• Anticipate and respond to extreme events.
• Treat spatially extended and multi-item markets (e.g. real estate), firms' dynamics, economic development, novelty emergence, and propagation.
Last but not least, the mathematical properties of Markov Nets are begging to be analyzed.
Acknowledgments

We are very grateful to Martin Hohnisch for sharing with us his extensive and deep understanding of the economic literature on price formation mechanisms. We thank David Bree, Gilles Daniel, Diana Mangalagiu, Andrzej Nowak, and Enrico Scalas for very useful comments. We thank the users of the NatLab platform for their enthusiastic cooperation. This research is supported in part by a grant from the Israeli Academy of Sciences. The research of L.M. is supported by the Yeshaya Horowitz Association through The Center for Complexity Science. The present research should be viewed as part of a wider interdisciplinary effort in complexity that includes economists, physicists, biologists, computer scientists, psychologists, social scientists, and even humanistic fields (http://shum.huji.ac.il/-sorin/ccs/giacs-annex12.doc; http://shum.huji.ac.il/-sorin/ccs/co3_050413.pdf; http://europa.eu.int/comm/research/fp6/nest/pdf/nest_pathfinder_projects_en.pdf).
References

Akerlof G A (1984): An Economic Theorist's Book of Tales, Cambridge, UK: Cambridge University Press.
Bharucha-Reid A T (1960): Elements of the Theory of Markov Processes and Their Applications, New York: McGraw-Hill. ISBN: 0486695395.
Daniel G, Muchnik L, and Solomon S (2005): "Traders imprint themselves by adaptively updating their own avatar", Proceedings of the 1st Symposium on Artificial Economics, Lille, France, Sept.
Erez T, Hohnisch M, and Solomon S (2005): "Statistical Economics on Multi-Variable Layered Networks", in Economics: Complex Windows, M. Salzano and A. Kirman eds., Springer, p. 201.
Gallegati M and Giulioni G (2003): "Complex Dynamics and Financial Fragility in an Agent Based Model", Computing in Economics and Finance 2003 86, Society for Computational Economics.
Grandmont J M (1977): "Temporary General Equilibrium Theory", Econometrica, Econometric Society, vol. 45(3), pp. 535-72.
Hicks J R (1939): Value and Capital, Oxford, England: Oxford University Press.
Hicks J R (1946): Value and Capital: An Inquiry into Some Fundamental Principles of Economic Theory, Oxford: Clarendon.
http://europa.eu.int/comm/research/fp6/nest/pdf/nest_pathfinder_projects_en.pdf
http://shum.huji.ac.il/-sorin/ccs/co3_050413.pdf
http://shum.huji.ac.il/-sorin/ccs/giacs-annex12.doc
Kahneman D, Knetch J L, and Thaler R H (1986): "Fairness and the assumptions of economics", Journal of Business, LIX (1986): S285-300.
Koren I (2002): Computer Algorithms, Natick, MA: A. K. Peters. ISBN 1-56881-160-8.
Lindahl E R (1929): "The Place of Capital in the Theory of Price", Ekonomisk Tidskrift.
Liu Y, Gopikrishnan P, Cizeau P, Meyer M, Peng C K, and Stanley H E (1999): "Statistical properties of the volatility of price fluctuations", Phys. Rev. E 60, pp. 1390-1400.
Mandelbrot B (1963): "The variation of certain speculative prices", Journal of Business 36, pp. 394-419.
McCoy B M and Wu T T (1973): The Two-Dimensional Ising Model, Cambridge, MA: Harvard University Press, ISBN: 0674914406.
Muchnik L, Louzoun Y, and Solomon S (2005): "Agent Based Simulation Design Principles - Applications to Stock Market", in Practical Fruits of Econophysics, Springer Verlag Tokyo.
Papoulis A (1984): "Brownian Movement and Markoff Processes", Ch. 15 in Probability, Random Variables, and Stochastic Processes, 2nd ed., New York: McGraw-Hill, pp. 515-53.
Radner R (1972): "Existence of equilibrium of plans, prices and price expectations in a sequence of markets", Econometrica 40, pp. 289-303.
Scalas E (2005): "Five years of continuous-time random walks in econophysics", Proceedings of WEHIA 2004, http://econwpa.wustl.edu/eps/fin/papers/0501/05011005.pdf
Scalas E, Gorenflo R, Luckock H, Mainardi F, Mantelli M, and Raberto M (2004): "Anomalous waiting times in high-frequency financial data", Quantitative Finance 4, pp. 695-702. http://tw.arxiv.org/PS_cache/physics/pdf/0505/0505210.pdf
Scalas E, Gorenflo R, and Mainardi F (2000): "Fractional calculus and continuous-time finance", Physica A (Netherlands), vol. 284, pp. 376-84. http://xxx.lanl.gov/abs/cond-mat/0001120
Smith E and Foley D K (2004): "Classical thermodynamics and economic general equilibrium theory", http://cepa.newschool.edu/foleyd/econthermo.pdf
Tversky A and Kahneman D (1974): "Judgment under uncertainty: Heuristics and biases", Science, CLXXXV (Sept.), pp. 1124-31.
Walras L (1909): "Economique et Méchanique", Bulletin de la Société Vaudoise 48, pp. 313-25.
Walras L (1954) [1874]: Elements of Pure Economics, London: George Allen and Unwin.
Wiener N (1923): "Differential space", J. Math. Phys. 2, p. 131.
Synchronization in Coupled and Free Chaotic Systems

F.T. Arecchi, R. Meucci, E. Allaria, and S. Boccaletti
1. Introduction

Global bifurcations in dynamical systems are of considerable interest because they can lead to the creation of chaotic behaviour (Hilborn, 1994). Global bifurcations are to be distinguished from local bifurcations around an unstable periodic solution. Typically, they occur when a homoclinic point is created. A homoclinic point is an intersection point between the stable and the unstable manifold of a steady-state saddle point p on the Poincaré section of an (at least) 3D flow. The presence of a homoclinic point implies a complicated geometrical structure of both the stable and the unstable manifolds, usually referred to as a homoclinic tangle. When a homoclinic tangle has developed, a trajectory that comes close to the saddle point behaves in an erratic way, showing sensitivity to initial conditions.
Homoclinic chaos (Arneodo et al., 1985) represents a class of self-sustained chaotic oscillations that exhibit quite different behaviour as compared to phase-coherent chaotic oscillations. Typically, these chaotic oscillators possess the structure of a saddle point S embedded in the chaotic attractor. In a more complex system, the orbit may go close to a second singularity which stabilizes the trajectory away from the first one, even though the chaotic wandering around the first singularity remains unaltered. We are thus in the presence of a "heteroclinic" connection and will show experimentally some interesting peculiarities of it. Homoclinic and heteroclinic chaos (HC) have received considerable attention in many physical (Arecchi, Meucci, and Gadomski, 1987), chemical (Argoul, Arneodo, and Richetti, 1987), and biological systems (Hodgkin and Huxley, 1952; Feudel et al., 2000).
Homoclinic chaos has also been found in economic models, that is, in heterogeneous market models (Brock and Hommes, 1997, 1998; Foroni and Gardini, 2003; Chiarella, Dieci, and Gardini, 2001). The heterogeneity of expectations among traders introduces the key nonlinearity into these models. The physical system investigated is characterized by the presence, in its phase space, of a saddle focus SF and a heteroclinic connection to a saddle node SN. Interesting features have been found when an external periodic perturbation or a small amount of noise is added. Noise plays a crucial role, affecting the dynamics near the saddle focus and inducing a regularization of the chaotic behaviour. This peculiar feature leads to the occurrence of a stochastic resonance effect when a noise term is added to a periodic modulation of a system parameter. Besides the behaviour of a single chaotic oscillator, we have investigated the collective behaviour of a linear array of identical HC systems. Evidence of spatial synchronization over the whole length of the spatial domain is shown when the coupling strength is above a given threshold. The synchronization phenomena here investigated are relevant for their similarities and possible implications in different fields such as financial market synchronization, which is a complex issue not yet completely understood.
2. The physical system

The physical system consists of a single mode CO2 laser with a feedback proportional to the output intensity. Precisely, a detector converts the laser output intensity into a voltage signal, which is fed back through an amplifier to an intracavity electro-optic modulator, in order to control the amount of cavity losses. The average voltage on the modulator and the ripple around it are controlled by adjusting the bias and gain of the amplifier (Fig. 1). These two control parameters can be adjusted in a range where the laser displays a regime of heteroclinic chaos characterized by the presence of large spikes almost equal in shape but erratically distributed in their time separation T.
The chaotic trajectories starting from a neighbourhood of SF leave it slowly along the unstable manifold and have a fast and close return along the stable manifold after a large excursion (spike) which approaches the stabilizing fixed point SN (Fig. 2). Thus a significant contraction region exists close to the stable manifold. Such a structure underlies spiking behaviour in many neuron (Hodgkin and Huxley, 1952; Feudel et al. 2000),
chemical (Argoul, Arneodo, and Richetti, 1987), and laser (Arecchi, Meucci, and Gadomski, 1987) systems. It is important to note that these HC systems have intrinsically highly nonuniform dynamics, and the sensitivity to small perturbations is high only in the vicinity of SF, along the unstable directions. A weak noise can thus influence T significantly.
Fig. 1. Experimental setup. M1 and M2: mirrors; EOM: electro-optical modulator; D: diode detector; WG: arbitrary waveform and noise generator; R: amplifier; B0: applied bias voltage
Fig. 2. Time evolution of the laser with feedback and the reconstructed attractor
3. Synchronization to an external forcing

The HC spikes can be easily synchronized with respect to a small periodic signal applied to a control parameter. Here, we provide evidence of such a synchronization when the modulation frequency is close to the "natural frequency", that is, to the average frequency of the return intervals. The required modulation is below 1%. An increased modulation amplitude up to 2% provides a wider synchronization domain, which attracts a frequency range of 30% around the natural frequency. However, as we move away from the natural frequency, the synchronization is imperfect insofar as phase slips, that is, phase jumps of ±2π, appear.
Furthermore, applying a large negative detuning with respect to the natural frequency gives rise to synchronized bursts of homoclinic spikes
separated by approximately the average period, but occurring in groups of 2, 3, etc. within the same modulation period (locking 1:2, 1:3 etc.) and with a wide inter-group separation. A similar phenomenon occurs for large positive detuning; this time the spikes repeat regularly every 2, 3 etc. periods (locking 2:1, 3:1 etc.) (Fig. 3).
Fig. 3. Different responses of the chaotic laser to external periodic modulation at different frequencies with respect to the natural frequency of the system
The 1:1 synchronization regime has been characterized in the parameter space (amplitude and frequency) where synchronization occurs (Fig. 4). The boundary of the synchronization region is characterized by the occurrence of phase slips, where the phase difference between the laser signal and the external modulation presents a jump of 2π (grey dots).
4. Noise induced synchronization

Let us now consider the effects induced by noise. White noise is added as an additive perturbation to the bias voltage. From a general point of view, two identical systems which are not coupled, but are subjected to a common noise, can synchronize, as has been reported both in the periodic (Pikovsky, 1984; Jensen, 1998) and in the chaotic (Matsumoto and Tsuda, 1983; Pikovsky, 1992; Yu, Ott, and Chen, 1990) cases. For noise-induced synchronization (NIS) to occur, the largest Lyapunov exponent (LLE) (λ1 > 0 in a chaotic system) has to become negative (Matsumoto and Tsuda, 1983; Pikovsky, 1992; Yu, Ott, and Chen 1990). However, whether noise can induce synchronization of chaotic systems has been a subject of intense controversy (Maritan and Banavar, 1994; Pikovsky, 1994; Longa,
Curado, and Oliveira, 1996; Herzel and Freund, 1995; Malescio, 1996; Gade and Basu, 1996; Sanchez, Matias, and Perez-Munuzuri, 1997).
Fig. 4. Experimental reconstruction of the synchronization region for the parameters amplitude and frequency for the 1:1 synchronization regime
As introduced before, noise changes the competition between contraction and expansion, and synchronization (λ1 < 0) occurs if the contraction becomes dominant. We first carry out numerical simulations on the model which reproduces the dynamics of our HC system:

$$\dot{x}_1 = k_0\, x_1 \left( x_2 - 1 - k_1 \sin^2 x_6 \right) \qquad (1)$$
$$\dot{x}_2 = -\gamma_1 x_2 - 2 k_0 x_1 x_2 + g x_3 + x_4 + p_0 \qquad (2)$$
$$\dot{x}_3 = -\gamma_1 x_3 + g x_2 + x_5 + p_0 \qquad (3)$$
$$\dot{x}_4 = -\gamma_2 x_4 + z x_2 + g x_5 + z p_0 \qquad (4)$$
$$\dot{x}_5 = -\gamma_2 x_5 + z x_3 + g x_4 + z p_0 \qquad (5)$$
$$\dot{x}_6 = -\beta \left( x_6 - b_0 + \frac{r x_1}{1 + \alpha x_1} \right) + D\,\xi(t) \qquad (6)$$

In our case, x1 represents the laser output intensity, x2 the population inversion between the two resonant levels, and x6 the feedback voltage signal which controls the cavity losses, while x3, x4 and x5 account for molecular exchanges between the two levels resonant with the radiation field and the other rotational levels of the same vibrational band. Furthermore, k0 is the unperturbed cavity loss parameter, k1 determines the modulation strength, g is a coupling constant, γ1, γ2 are population relaxation rates, p0 is the pump parameter, z accounts for the number of rotational levels, and β, r, α are respectively the bandwidth, the amplification, and the saturation factors of the feedback loop. With the following parameters, k0 = 28.5714, k1 = 4.5556, γ1 = 10.0643, γ2 = 1.0643, g = 0.05, p0 = 0.016, z = 10, β = 0.4286, α = 32.8767, r = 160 and b0 = 0.1031, the model reproduces the regime of homoclinic chaos observed experimentally (Pisarchik, Meucci, and Arecchi, 2001). The model is integrated using a Heun algorithm (Garcia-Ojalvo and Sancho, 1999) with a small time step ∆t = 5·10^-5 ms (note that typically T ≈ 0.5 ms).
Since noise spreads apart those points of the flow which were close to the unstable manifold, the degree of expansion is reduced. This changes the competition between contraction and expansion, and contraction may become dominant at large enough noise intensity D. To measure these changes, we calculate the largest Lyapunov exponent (LLE) λ1 in the model as a function of D (Fig. 5a, dotted line). λ1 undergoes a transition from a positive to a negative value at Dc ≈ 0.0031. Beyond Dc, two identical laser models x and y with different initial conditions but the same noisy driving Dξ(t) achieve complete synchronization after a transient, as shown by the vanishing normalized synchronization error (Fig. 5a, solid line)
$$E = \frac{\left\langle\, |x_1 - y_1|\, \right\rangle}{\left\langle\, |x_1 - \langle x_1 \rangle|\, \right\rangle} \qquad (7)$$
At larger noise intensities, expansion again becomes significant; the LLE increases, and synchronization is lost when λ1 becomes positive, for D = 0.052. Notice that even when λ1 < 0, the trajectories still have access to the expansion region where small distances between them grow temporarily. As a result, when the systems are subjected to additional perturbations, synchronization is lost intermittently, especially for D close to the critical values. Actually, in the experimental laser system there also exists a small intrinsic noise source. To take into account this intrinsic noise in real systems, we introduce into the equation for x6 an equivalent amount of independent noise (with intensity D = 0.0005) in addition to the common one Dξ(t). By comparison, it is evident that the sharp transition to a synchronized regime in fully identical HC systems (Fig. 5a, solid line) is smeared out (Fig. 5a, closed circles).
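For readers who wish to reproduce the integration scheme, a stochastic Heun (predictor-corrector) step for Eqs. (1)-(6) could be sketched as follows. This is our own illustration using the parameter values quoted above; it is not the authors' code, and the initial condition is arbitrary.

```python
import numpy as np

# Parameter values quoted in the text
k0, k1 = 28.5714, 4.5556
g1, g2 = 10.0643, 1.0643            # gamma_1, gamma_2
g, p0, z = 0.05, 0.016, 10.0
beta, alpha, r, b0 = 0.4286, 32.8767, 160.0, 0.1031

def drift(x):
    """Deterministic part of Eqs. (1)-(6); x = (x1, ..., x6)."""
    x1, x2, x3, x4, x5, x6 = x
    return np.array([
        k0 * x1 * (x2 - 1.0 - k1 * np.sin(x6) ** 2),
        -g1 * x2 - 2.0 * k0 * x1 * x2 + g * x3 + x4 + p0,
        -g1 * x3 + g * x2 + x5 + p0,
        -g2 * x4 + z * x2 + g * x5 + z * p0,
        -g2 * x5 + z * x3 + g * x4 + z * p0,
        -beta * (x6 - b0 + r * x1 / (1.0 + alpha * x1)),
    ])

def heun_step(x, dt, D, rng):
    """One stochastic Heun step; the noise D*xi(t) enters only the x6 equation."""
    noise = np.zeros(6)
    noise[5] = D * np.sqrt(dt) * rng.standard_normal()
    pred = x + drift(x) * dt + noise                        # Euler predictor
    return x + 0.5 * (drift(x) + drift(pred)) * dt + noise  # trapezoidal corrector

rng = np.random.default_rng(0)
x = np.array([1e-3, 1.0, 0.0, 0.0, 0.0, 0.0])               # arbitrary initial condition
dt, D = 5e-5, 0.0031
for _ in range(10000):
    x = heun_step(x, dt, D, rng)
```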
In the experimental study of NIS, for each noise intensity D we repeat the experiment twice with the same external noise. Consistently with the numerical results with a small independent noise, E does not reach zero due to the intrinsic noise, and it increases slightly at large D (Fig. 5b).
Fig. 5. Lyapunov exponents calculated from the model of the laser system, and the synchronization error from the simulations and the experiments, showing the range of noise providing synchronization between two systems
Experimental evidence of the noise induced synchronization is also provided by the data reported in Fig. 6.
Fig. 6. Experimental evidence of noise-induced synchronization; the action of the common noise signal starts at time t = 0
5. Coherence Resonance and Stochastic Resonance

Coherence effects have been investigated by considering the distribution of the return time T, which is strongly affected by the presence of noise. These effects are more evident in the model than in the experiment.
The model displays a broad range of time scales. There are many peaks in the distribution P(T) of T, as shown in Fig. 7a. In the presence of noise, the trajectory on average cannot come closer to SF than the noise level. As a consequence the system spends a shorter time close to the unstable manifold. A small noise (D=0.0005) changes significantly the time scales of the model: P(T) is now characterized by a dominant peak followed by a few exponentially decaying ones (Fig. 7b).
Fig. 7. Noise-induced changes in time-scales. (a) D=0; (b) D=0.0005; and (c) D=0.01. The dotted lines show the mean interspike interval T0(D), which decreases with increasing D
This distribution of T is typical for small D in the range D = 0.00005÷ 0.002. The experimental system with only intrinsic noise (equivalent to D = 0.0005 in the model) has a very similar distribution P(T ) (not shown). At larger noise intensity D = 0.01, the fine structure of the peaks is smeared out and P(T ) becomes an unimodal peak in a smaller range (Fig. 7c). Note that the mean value T0(D)=〈T 〉t (Fig. 7, dotted lines) decreases with D. When the noise is rather large, it affects the dynamics not only close to S but also during the spiking, so that the spike sequence becomes fairly noisy. We observe the most coherent spike sequences at a certain intermediate noise intensity, where the system takes a much shorter time to escape from S after the fast reinjection, the main structure of the spike being preserved. As a result of noise-induced changes in time scales, the system displays a different response to a weak signal (A = 0.01) with a frequency
$f_e = f_0(D) = 1/T_0(D)$, i.e., equal to the average spiking rate of the unforced model. At D = 0, P(T) of the forced model still has many peaks (Fig. 8a), while at D = 0.0005, T is sharply distributed around the signal period Te = T0(D) (Fig. 8b). However, at larger intensity D = 0.01, P(T) becomes lower and broader again (Fig. 8c).
In the model and experimental systems, the pump parameter p0 is now modulated as
$$p(t) = p_0\left[1 + A \sin\left(2\pi f_e t\right)\right] \qquad (8)$$
by a periodic signal with a small amplitude A and a frequency fe. First we focus on the constructive effects of noise on phase synchronization. To examine phase synchronization due to the driving signal, we compute the phase difference θ(t) = φ(t) − 2πfe t. Here the phase φ(t) of the laser spike sequence is simply defined as (Pikovsky et al.)

$$\phi(t) = 2\pi\left(k + \frac{t - \tau_k}{\tau_{k+1} - \tau_k}\right), \qquad \tau_k \le t \le \tau_{k+1} \qquad (9)$$
where τ k is the spiking time of the kth spike.
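Equation (9) is simply a linear interpolation of the phase between consecutive spike times, and it translates directly into code; the sketch below is our own illustration and assumes the spike times have already been extracted from the signal.

```python
import numpy as np

def spike_phase(t, spike_times):
    """Phase phi(t) of Eq. (9): 2*pi*(k + (t - tau_k)/(tau_{k+1} - tau_k))."""
    tau = np.asarray(spike_times, dtype=float)
    k = np.searchsorted(tau, t, side="right") - 1     # index of the last spike before t
    if k < 0 or k >= len(tau) - 1:
        raise ValueError("t must lie between the first and last spike")
    return 2.0 * np.pi * (k + (t - tau[k]) / (tau[k + 1] - tau[k]))

# Phase difference with respect to the drive of Eq. (8):
# theta(t) = spike_phase(t, spikes) - 2*np.pi*f_e*t
```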
Fig. 8. Response of the laser model to a weak signal (A=0.01) at various noise intensities. (a) D=0; (b) D=0.0005; and (c) D=0.01. The signal period Te in (a), (b), and (c) corresponds to the mean interspike interval T0(D) of the unforced model (A=0)
We have investigated the synchronization region (1:1 response) of the laser model in the parameter space of the driving amplitude A and the relative initial frequency difference $\Delta\omega = \frac{f_e - f_0(D)}{f_0(D)}$, where the average frequency $f_0(D)$ of the unforced laser model is an increasing function of D. The actual relative frequency difference in the presence of the signal is calculated as $\Delta\Omega = \frac{\bar{f} - f_e}{f_0(D)}$, where $\bar{f} = 1/\langle T \rangle_t$ is the average spiking frequency of the forced laser model. The synchronization behaviour of the noise-free model is quite complicated and featureless at weak forcing amplitudes (about A < 0.012). As shown in Fig. 9a, there does not exist a tonguelike region similar to the Arnold tongue in phase-coherent oscillators; for a fixed A, ∆Ω is not a monotonous function of ∆ω and it vanishes only at some
specific signal frequencies; at stronger driving amplitudes (about A > 0.012), the system becomes periodic over a large frequency range. The addition of a small noise, D = 0.0005, drastically changes the response: a tonguelike region (Fig. 9b), where effective frequency locking (∆Ω ≤ 0.003) occurs, can be observed, similar to that in usual noisy phase-coherent oscillators. The synchronization region shrinks at a stronger noise intensity D = 0.005 (Fig. 9c). The very complicated and unusual response to a weak driving signal in the noise-free model has not been observed in the experimental system due to the intrinsic noise, whose intensity is equivalent to D = 0.0005 in the model.
Fig. 9. Synchronization region of the laser model at various noise intensities. A dot is plotted when ∆Ω ≤ 0.003. (a) D=0; (b) D=0.0005; and (c) D=0.005
We now study how the response is affected by the noise intensity D for a fixed signal period Te. Here, in the chaotic lasers without the periodic forcing, the average interspike interval T0(D) decreases with increasing D, and stochastic resonance (SR) similar to that in bistable or excitable systems can also be observed. We employ the following measure of coherence as an indicator of SR (Marino et al., 2002):
$$I = \frac{T_e}{\sigma_T} \int_{(1-\alpha)T_e}^{(1+\alpha)T_e} P(T)\, dT \qquad (10)$$
where 0 < α < 0.25 is a free parameter. This indicator takes into account both the fraction of spikes with an interval roughly equal to the forcing period Te and the jitter between spikes (Marino et al. 2002). SR of the 1:1 response to the driving signal has been demonstrated both in the model and in the experimental systems by the ratio $\langle T \rangle_t / T_e$ and by I in Fig. 10. Again, the behaviour agrees well in the two systems. For Te < T0(0), there exists a synchronization region where $\langle T \rangle_t / T_e \approx 1$. The noise intensity optimizing the coherence I is smaller than the one that induces coincidence of T0(D) and Te (dashed lines in Fig. 10a, c). It turns out that maximal I occurs when the dominant peak of P(T) (Fig. 9b) is located at Te. This kind of noise-induced synchronization has not been reported in usual stochastic resonance systems, where at large Te numerous firings occur randomly per signal period and result in an exponential background in P(T) of the forced system, while at small Te a 1:n response may occur, which means an aperiodic firing sequence with one spike per n driving periods on average (Marino et al. 2002; Benzi, Sutera, and Vulpiani 1981; Wiesenfeld and Moses 1995; Gammaitoni et al. 1998; Longtin and Chialvo 1998).
Fig. 10. Stochastic resonance for a fixed driving period. Left panel: model, A = 0.01, Te = 0.3 ms. Right panel: experiments, forcing amplitude 10 mV (A = 0.01) and period Te = 1.12 ms; here the intensity D is that of the added external noise. Upper panel: noise-induced coincidence of average time scales (dashed line, A = 0) and synchronization region. Lower panel: coherence of the laser output. α = 0.1 in Eq. (10)
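As a practical aside, the indicator of Eq. (10) can be estimated directly from a sample of interspike intervals, since the integral is just the fraction of intervals lying within α of the forcing period; the following short function is our own sketch of such an estimator.

```python
import numpy as np

def coherence_indicator(intervals, Te, alpha=0.1):
    """Estimate I of Eq. (10) from a sample of interspike intervals T."""
    T = np.asarray(intervals, dtype=float)
    sigma_T = T.std()                                   # jitter of the intervals
    inside = np.mean((T >= (1 - alpha) * Te) & (T <= (1 + alpha) * Te))
    return Te / sigma_T * inside                        # Te/sigma_T times the P(T) mass near Te
```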
6. Collective behaviour

Synchronization among coupled oscillators is one of the fundamental phenomena occurring in nonlinear dynamics. It has been largely investigated in biological and chemical systems. Here we consider a chain of nearest-neighbour coupled HC systems with the aim of investigating the emergence of collective behaviour by increasing the strength of the coupling (Leyva et al. 2003). Precisely, going back to the model Eqs. (1)-(6), we add a superscript i to denote the site index. Then in the last equation for $x_6^i$, we replace $x_1^i$ with
$$(1+\varepsilon)\, x_1^i + \varepsilon \left( x_1^{i+1} + x_1^{i-1} - 2\, \overline{x_1^i} \right)$$
where ε is a mutual coupling coefficient and $\overline{x_1^i}$ denotes a running time average of $x_1^i$. We report in Fig. 11 the transition from unsynchronized to synchronized regimes by showing the space-time representation of the evolution of the array. The transition to phase synchronization is anticipated by regimes where clusters of oscillators spike quasi-simultaneously (Leyva et al., 2003; Zheng, G. Hu, and B. Hu, 1998). The number of oscillators in the clusters increases with ε, extending eventually to the whole system (see Fig. 11c, d).
Fig. 11. Space-time representation of a chain of coupled chaotic oscillators for different values of ε: (a) 0.0, (b) 0.05, (c) 0.1, (d) 0.12, (e) 0.25
A better characterization of the synchronized pattern can be gathered by studying the emerging "defects" (Leyva et al., 2003). Each defect consists of a phase slip, that is, a spike that does not induce another spike in its immediate neighbourhood. In order to detect the presence of defects, we map the phase in the interval between two successive spikes of the same oscillator onto a closed circle (0, 2π), as usual in pulsing systems (for a review of the subject, see Pikovsky, Rosenblum, and Kurths 2001; Boccaletti et al. 2002). Notice that since a suitable observer threshold isolates the spikes, getting rid of the small chaotic inter-spike background, we care only about spike correlations. With this notation, we will consider that a defect has occurred when the phase of an oscillator has changed by 2π while the phase of an immediate neighbour has changed by 4π. Here, our "phase synchronization" term denotes a connected line from left to right, which does not necessarily imply equal-time occurrence of spikes at different
sites; indeed, the wavy unbroken lines which represent the prevailing trend as we increase ε are what we call “phase synchronization”.
7. Evidence of homoclinic chaos in financial market problems with heterogeneous agents

Important changes have influenced economic modelling during the last decades. Nonlinearities and noise effects have played a crucial role in explaining the occasionally highly irregular movements in economic and financial time series (Day 1994; Mantegna and Stanley, 1999). One of the peculiar differences between economics and other sciences is the role of expectations or beliefs in the decisions of agents operating on markets. In rational expectation models, largely used in classical economics (Muth 1961), agents evaluate their expectations on the knowledge of the linear market equilibrium equations. Nowadays the rational equilibrium hypothesis is considered unrealistic, and a growing interest is devoted to bounded rationality, where agents use different learning algorithms to form their beliefs. In the last decade several heterogeneous market models have been introduced in which at least two types of agents coexist (Brock and Hommes, 1997, 1998; Arthur, Lane, and Durlauf, 1997; Lux and Marchesi, 1999; Farmer and Joshi, 2002). The first group is composed of fundamentalists, who believe that the asset prices are completely determined by economic fundamentals. The other group is composed of chartists or technical traders, who believe that the asset prices are not determined by their fundamentals but can be derived using trading rules based on previous prices.
Brock and Hommes first discovered that a heterogeneous agent model with rational versus naive expectations can exhibit complicated price dynamics when the intensity of choice to switch between strategies is high (Brock and Hommes, 1997, 1998). Agents can either adopt a sophisticated predictor H1 (rational expectation) or a simple and low-cost predictor H2 (adaptive short memory or naive expectation). Near the steady state or equilibrium price, most of the agents use the naive predictor, and prices are driven far from the equilibrium. However, when prices diverge from their equilibrium, forecasting errors tend to increase, and as a consequence it becomes more convenient to switch to the sophisticated predictor. The prices will then move back to the equilibrium price. According to Brock and Hommes, a heterogeneous market can be considered as a complex adaptive system whose dynamics are ruled by a two-dimensional map for the variables pt and mt:
$$p_{t+1} = \frac{-b\,(1-m_t)}{2B + b\,(1+m_t)}\; p_t = f(p_t, m_t) \qquad (11)$$
$$m_{t+1} = \tanh\!\left( \frac{\beta}{2}\left[ \frac{b}{2}\left( \frac{b\,(1-m_t)}{2B + b\,(1+m_t)} + 1 \right)^{2} p_t^{2} - C \right] \right) = g(p_t, m_t) \qquad (12)$$

Here pt represents the deviation from the steady-state price $\bar{p}$ determined by the intersection of demand and supply, and mt is the difference between the two fractions of agents; β is the intensity of choice, indicating how fast agents switch predictors. The parameter C is the price difference between the two predictors. The parameter b is related to a linear supply curve derived from a quadratic cost function $c(q) = q^2/(2b)$, where q is the quantity of a given nonstorable good. The temporal evolution of the variables m and p at the onset of chaos is reported in Fig. 12. The corresponding chaotic attractor is shown in Fig. 13. This attractor occurs after the merging of the two coexisting 4-piece chaotic attractors. The importance of homoclinic bifurcations leading to chaotic dynamics in economic models with heterogeneous agents has recently been emphasized by I. Foroni and L. Gardini, who extended the theoretical investigations also to noninvertible maps (Foroni and Gardini, 2003; Chiarella, Dieci, and Gardini, 2001).
Fig. 12. Simulation of the model showing the chaotic evolution of the price p and of the agents' difference m. The parameters used are b = 1.35, B = 0.5, C = 1 and β = 5.3
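For completeness, the map (11)-(12) can be iterated in a few lines with the parameter values quoted in the caption of Fig. 12; this sketch (our own, with an arbitrary initial condition) reproduces the kind of time series shown in Figs. 12-13.

```python
import numpy as np

b, B, C, beta = 1.35, 0.5, 1.0, 5.3        # parameter values of Figs. 12-13

def bh_map(p, m):
    """One step of the heterogeneous-agent map, Eqs. (11)-(12)."""
    denom = 2.0 * B + b * (1.0 + m)
    p_next = -b * (1.0 - m) * p / denom
    err = (b * (1.0 - m) / denom + 1.0) ** 2 * p ** 2
    m_next = np.tanh(0.5 * beta * (0.5 * b * err - C))
    return p_next, m_next

p, m = 0.1, 0.0                             # arbitrary initial deviation and fraction difference
trajectory = []
for _ in range(2000):
    p, m = bh_map(p, m)
    trajectory.append((p, m))               # (p_t, m_t) series for plotting the attractor
```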
8. Conclusions

Heterogeneous market models have received much attention during the last decade for the richness of their dynamical behaviours, including homoclinic bifurcations to chaos, as pointed out by the simple evolutionary economic model proposed by Brock and Hommes. This is similar to the behaviour of a laser with electro-optic feedback. The feedback process, acting on the same time scale as the other two relevant variables of the system, that is, laser intensity and population inversion, is the crucial element leading to heteroclinic chaos. In such a system, chaos shows interesting features. It can be easily synchronized with respect to small periodic perturbations and, in the presence of noise, it displays several constructive effects such as stochastic resonance.
As we have stressed in this paper, homoclinic/heteroclinic chaos is a common feature in many different disciplines, including economics. Quite frequently, chaos is harmful, so controlling or suppressing its presence has been widely considered in recent years. However, in some situations chaos is a beneficial behaviour. Typically a chaotic regime disappears as a result of a crisis, which is an abrupt change from chaos to a periodic orbit or a steady-state solution. This occurrence finds its analogy in microeconomics when a firm is near bankruptcy. In such a critical condition, suitable and careful interventions are necessary in order to recover the usual cycle of the firm.
Fig. 13. Attractor of the dynamics of the model. The parameter values are the same as those reported in Fig. 12
Other important aspects concern the occurrence of synchronization phenomena in economics. Currently, attention is focused on synchronization among the world's economies, given their greater financial openness. Globalization effects lead to an increase of the links between the world's economies, in particular by means of capital markets and trade flows.
The authors acknowledge MIUR-FIRB contract n. RBAU01B49F_002.
References

Arecchi, F.T., R. Meucci, and W. Gadomski, (1987): "Laser Dynamics with Competing Instabilities", Phys. Rev. Lett. 58, 2205.
Argoul, F., A. Arneodo, and P. Richetti, (1987): "A propensity criterion for networking in an array of coupled chaotic systems", Phys. Lett. 120A, 269.
Arneodo, A., P.H. Coullet, E.A. Spiegel, and C. Tresser, (1985): "Asymptotic chaos", Physica D 14, 327.
Arthur, W., D. Lane, and S. Durlauf (eds.), (1997): The Economy as an Evolving Complex System II, Addison-Wesley, Redwood City, CA.
Benzi, R., A. Sutera, and A. Vulpiani, (1981): "The mechanism of stochastic resonance", J. Phys. A 14, L453.
Boccaletti, S., J. Kurths, G. Osipov, D. Valladares, and C. Zhou, (2002): "The synchronization of chaotic systems", Phys. Rep. 366, 1.
Brock, W.A., and C.H. Hommes, (1997): "A rational route to randomness", Econometrica 65, 1059.
Brock, W.A., and C.H. Hommes, (1998): "Heterogeneous beliefs and routes to chaos in a simple asset pricing model", Journ. of Econom. Dynam. and Control 22, 1235.
Chiarella, C., R. Dieci, and L. Gardini, (2001): "Asset price dynamics in a financial market with fundamentalists and chartists", Discrete Dyn. Nat. Soc. 6, 69.
Day, R.H., (1994): Complex Economics Dynamics, MIT Press, Cambridge, MA.
Farmer, J.D., and S. Joshi, (2002): "The Price Dynamics of Common Trading Strategies", Journ. of Econ. Behavior and Organization 49, 149.
Feudel, U. et al., (2000): "Homoclinic bifurcation in a Hodgkin-Huxley Model of Thermally Sensitive Neurons", Chaos 10, 231.
Foroni, I., and L. Gardini, (2003): "Homoclinic bifurcations in Heterogeneous Market Models", Chaos, Solitons and Fractals 15, 743-760.
Gade, P.M., and C. Basu, (1996): "The origin of non-chaotic behavior in identically driven systems", Phys. Lett. A 217, 21.
Gammaitoni, L., P. Hanggi, P. Jung, and F. Marchesoni, (1998): "Stochastic Resonance", Rev. Mod. Phys. 70, 223.
Garcia-Ojalvo, J., and J.M. Sancho, (1999): Noise in Spatially Extended Systems, Springer, New York.
Herzel, H., and J. Freund, (1995): "Chaos, noise, and synchronization reconsidered", Phys. Rev. E 52, 3238.
Hilborn, R.C., (1994): Chaos and Nonlinear Dynamics, Oxford University Press, Oxford.
Hodgkin, A.L., and A.F. Huxley, (1952): "A quantitative description of membrane current and its application to conduction and excitation in nerve", J. Physiol. 117, 500.
Jensen, R.V., (1998): "Synchronization of randomly driven nonlinear oscillators", Phys. Rev. E 58, R6907.
Leyva, I., E. Allaria, S. Boccaletti, and F.T. Arecchi, (2003): "Competition of synchronization domains in arrays of chaotic homoclinic systems", Phys. Rev. E 68, 066209.
Longa, L., E.M.F. Curado, and A. Oliveira, (1996): "Roundoff-induced coalescence of chaotic trajectories", Phys. Rev. E 54, R2201.
Longtin, A., and D. Chialvo, (1998): "Stochastic and Deterministic Resonances for Excitable Systems", Phys. Rev. Lett. 81, 4012.
Lux, T., and M. Marchesi, (1999): "Scaling and criticality in a stochastic multiagent model of a financial market", Nature 397, 498.
Malescio, G., (1996): "Noise and Synchronization in chaotic systems", Phys. Rev. 53, 6551.
Marino, F. et al., (2002): "Experimental Evidence of Stochastic Resonance in an Excitable Optical System", Phys. Rev. Lett. 88, 040601.
Mantegna, R.N., and H.E. Stanley, (1999): An Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge University Press.
Matsumoto, K., and I. Tsuda, (1983): "Noise-induced Order", J. Stat. Phys. 31, 87.
Maritan, A., and J.R. Banavar, (1994): "Chaos, Noise and Synchronization", Phys. Rev. Lett. 72, 1451.
Muth, J.F., (1961): "Rational Expectations and the Theory of Price Movements", Econometrica 29, 315.
Pikovsky, A., (1994): "Comment on "Chaos, Noise and Synchronization"", Phys. Rev. Lett. 73, 2931.
Pikovsky, A., M. Rosenblum, and J. Kurths, (2001): Synchronization: A Universal Concept in Nonlinear Sciences, Cambridge University Press.
Pikovsky, A.S., (1984): "Synchronization and stochastization of an ensemble of self-oscillators by external noise", Radiophys. Quantum Electron. 27, 576.
Pikovsky, A.S., M.G. Rosenblum, G.V. Osipov, and J. Kurths, (1997): "Phase Synchronization of Chaotic Oscillators by External Driving", Physica D 104.
Pisarchik, A.N., R. Meucci, and F.T. Arecchi, (2001): "Theoretical and experimental study of discrete behavior of Shilnikov chaos in a CO2 laser", Eur. Phys. J. D 13, 385.
Sanchez, E., M.A. Matias, and V. Perez-Munuzuri, (1997): "Analysis of synchronization of chaotic systems by noise: An experimental study", Phys. Rev. E 56, 4068.
Wiesenfeld, K., and F. Moss, (1995): "Stochastic resonance and the benefits of noise: from ice ages to crayfish and SQUIDs", Nature 373, 33.
198
F.T. Arecchi, R. Meucci, E. Allaria, and S. Boccaletti
Zheng, Z., G. Hu, and B. Hu, (1998): “Phase Slips and Phase Synchronization of Coupled Oscillators”, Phys. Rev. Lett. 81, 5318.
199
Part IV
Agent Based Models
____________________________________________________________
Explaining Social and Economic Phenomena by Models with Low or Zero Cognition Agents
P. Ormerod, M. Trabatti, K. Glass, and R. Colbaugh
1. Introduction We set up agent based models in which agents have low or zero cognitive ability. We examine two quite diverse socio-economic phenomena, namely the distribution of the cumulative size of economic recessions in the United States and the distribution of the number of crimes carried out by individuals. We show that the key macro-phenomena of the two systems can be shown to emerge from the behaviour of these agents. In other words, both the distribution of economic recessions and the distribution of the number of crimes can be accounted for by models in which agents have low or zero cognitive ability. We suggest that the utility of these low cognition models is a consequence of social systems evolving the “institutions” (e.g., topology and protocols governing agent interaction) that provide robustness and evolvability in the face of wide variations in agent information resources and agent strategies and capabilities for information processing. The standard socio-economic science model (SSSM) assumes very considerable cognitive powers on behalf of its individual agents. Agents are able both to gather a large amount of information and to process this efficiently so that they can carry out maximising behaviour. Criticisms of this approach are widespread, even within the discipline of economics itself. For example, general equilibrium theory is a major intellectual strand within mainstream economics. A key research task in the twentieth century was to establish the conditions under which the existence of general equilibrium could be proved. In other words, the conditions under which it could be guaranteed that a set of prices can be found under
which all markets clear. Radner (1968) established that when agents held different views about the future, the existence proof appears to require the assumption that all agents have access to an infinite amount of computing power. A different criticism is made by one of the two 2002 Nobel prize winners. One of them, Kahneman (2002), describes his empirical findings contradicting the SSSM model as having been easy, because of the implausibility of the SSSM to any psychologist. An alternative approach is to ascribe very low or even zero cognitive ability to agents. The second 2002 Nobel laureate, Smith (2002), describes results obtained by Gode and Sunder. An agent based model is set up for a single commodity, operating under a continuous double auction. The agents choose bids or asks completely at random from all those that do not impose a loss on the agent. They use no updating or learning algorithms. Yet, as Smith notes, these agents “achieve most of the possible social gains from trade”. In practice, agents may face environments that are both high dimensional and not time-invariant. Colbaugh and Glass (2003) set out a general model to describe the conditions under which agents can and cannot learn to behave in a more efficient manner, suggesting that the capability of agents to learn about their environment under these conditions is in general low. Ormerod and Rosewell (2003) explain key stylised facts about the extinction of firms by an agent based model in which firms are unable to acquire knowledge about either the true impact of other firms’ strategies on its own fitness, or the true impact of changes to its own strategies on its fitness. Even when relatively low levels of cognitive ability are ascribed to agents, the model ceases to have properties that are compatible with the key stylised facts. In this paper, we examine two quite diverse socio-economic phenomena, namely the distribution of the cumulative size of economic recessions in the United States and the distribution of the number of crimes carried out by individuals. We set up agent based models in which agents have low or zero cognitive ability. We show that the key macro-phenomena of the two systems can be shown to emerge from the behaviour of these agents. In other words, both the distribution of economic recessions and the distribution of the number of crimes can be accounted for by models in which agents have low or zero cognitive ability. Section 2 describes the model of the cumulative size of economic recessions, and section 3 sets out the crime model. Section 4 concludes by offering an explanation for the efficacy of these simple models for complex socio-economic phenomena.
2. Economic recessions The cumulative size of economic recessions, the percentage fall in real output from peak to trough, is analysed for 17 capitalist economies by Ormerod (2004). We consider here the specific case of the United States 1900-2004. There are in total 19 observations, ranging in size from the fall of barely one fifth of one per cent in GDP in the recession of 1982, to the fall of some 30 per cent in the Great Depression of 1930-33. Fig. 1 plots the histogram of the data, using absolute values. On a Kolmogorov-Smirnov test, the null hypothesis that the data are described by an exponential distribution with rate parameter 0.3 is not rejected at the standard levels of statistical significance (p = 0.59). The observation for the Great Depression is an outlier, suggesting the possibility of a bi-modal distribution, but technically the data follow an exponential distribution. An agent based model of the business cycle that accounts for a range of key features of American cycles in the twentieth century is given in Ormerod (2002). These include positive cross-correlation of output growth between individual agents, and the autocorrelation and power spectrum properties of aggregate output growth. The cumulative size distribution of recessions was not, however, considered. The model is populated by firms, which differ in size. These agents operate under uncertainty, are myopic, and follow simple rules of thumb behaviour rather than attempting to maximise anything. In other words, in terms of cognition they are at the opposite end of the spectrum to firms in the standard SSSM. The model evolves in a series of steps, or periods. In each period, each agent decides its rate of growth of output for that period, and its level of sentiment (optimism/pessimism) about the future. Firms choose their rate of growth of output according to:
xi(t) = Σj wj yj(t − 1) + εi(t)    (1)
where xi(t) is the rate of growth of output of agent i in period t, yi(t) is the sentiment about the future of the ith agent, and wi is the size of each individual firm (the size of firms is drawn from a power law distribution). Information concerning the sentiment of firms about the future can be obtained readily by reading, for example, the Wall Street Journal or the Financial Times. The variable εi(t) plays a crucial role in the model. This is a random variable drawn separately for each agent in each period from a normal distribution with mean zero and standard deviation sd1. Its role is to reflect
both the uncertainty that is inherent in any economic decision making and the fact that the agents in this model, unlike mainstream economic models that are based on the single representative agent, are heterogeneous.
Fig. 1. Histogram of cumulative absolute percentage fall in real US GDP, peak to trough, during recessions 1900-2002
The implications of any given level of overall sentiment for the growth rate of output of a firm differ both across the N agents and over time. Firms are uncertain about the precise implications of a given level of sentiment for the exact amount of output that they should produce. Further, the weighted sum of the sentiment of firms is based upon an interpretation of a range of information that is in the public domain. Agents again differ at a point in time and over time in how they interpret this information and in consequence differ in the value that they attach to this overall level of sentiment. The model is completed by the decision rule on how firms decide their level of sentiment:
yi(t) = (1 − β) yi(t − 1) − β [ Σj wj xj(t − 1) + ηi(t) ]    (2)
where ηi is drawn from a normal distribution with mean zero and standard deviation sd2. The coefficient on the weighted sum of firms’ output growth in the previous period, β, has a negative sign, reflecting the Keynesian basis of
the model. The variable ηi(t) again reflects agent heterogeneity and uncertainty. At any point in time, each agent is uncertain about the implications of any given level of overall output growth in the previous period for its own level of sentiment. In this model, it is as if agents operate on a fully connected network. Each agent takes account of the previous level of overall sentiment and output. In other words, it takes account of the decisions of all other agents in the model. In this context, this seems a reasonable approximation to reality. Information that enables firms to form a view on the previous overall levels of output and sentiment is readily available in the public domain, either from official economic statistics or from more general comment in the media. In practice, of course, whilst taking account of the overall picture, firms are likely to give particular weight to prospects in their own sector or sectors, so the actual network across which information is transmitted will contain a certain amount of sparsity. But the assumption of a fully connected network to transmit information seems reasonable. This apparently simple model is able to replicate many underlying properties of time series data on annual real output growth in the United States during the twentieth century. Fig. 2 below compares the cumulative distribution functions of total cumulative fall in real output in recessions generated by the theoretical model, and of the actual cumulative falls of real output in America in the twentieth century. The model is run for 5,000 periods; the number of agents is 100; β = 0.4, sd(ε) = 0.04, sd(η) = 0.5. On a Kolmogorov-Smirnov test, the null hypothesis that the two cumulative distribution functions are the same is not rejected (p = 0.254). In other words, equations (1) and (2) are able to generate economic recessions whose cumulative size distribution is similar to that of the actual data for the US.
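For readers who wish to experiment with equations (1) and (2), the following minimal Python sketch simulates the sentiment-output dynamics and extracts peak-to-trough falls. It is not the author's original code: the firm-size weights, the run length and the recession-detection rule are illustrative assumptions.

    import random

    N, T = 100, 5000
    beta, sd_eps, sd_eta = 0.4, 0.04, 0.5

    # Firm sizes drawn from a heavy-tailed distribution and normalised to sum to one;
    # the paper only says the sizes follow a power law, so this is an assumption.
    w = [random.paretovariate(1.5) for _ in range(N)]
    total_w = sum(w)
    w = [wi / total_w for wi in w]

    x = [0.0] * N            # output growth of each firm
    y = [0.0] * N            # sentiment of each firm
    agg_growth = []

    for t in range(T):
        agg_x = sum(wi * xi for wi, xi in zip(w, x))
        agg_y = sum(wi * yi for wi, yi in zip(w, y))
        # Equation (1): growth = lagged weighted sentiment plus firm-specific noise
        x_new = [agg_y + random.gauss(0.0, sd_eps) for _ in range(N)]
        # Equation (2): sentiment decays and reacts negatively to lagged aggregate growth
        y_new = [(1 - beta) * y[i] - beta * (agg_x + random.gauss(0.0, sd_eta)) for i in range(N)]
        x, y = x_new, y_new
        agg_growth.append(sum(wi * xi for wi, xi in zip(w, x)))

    # Cumulative fall over each spell of negative aggregate growth (illustrative rule)
    recessions, fall = [], 0.0
    for g in agg_growth:
        if g < 0:
            fall += -g
        elif fall > 0:
            recessions.append(fall)
            fall = 0.0
    if recessions:
        print(len(recessions), "recessions; largest cumulative fall:", round(max(recessions), 3))

The list of cumulative falls produced in this way can then be compared with the empirical distribution, for example with a Kolmogorov-Smirnov test.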
3. Crimes by individuals Cook et al. (2004) examine the distribution of the extent of criminal activity by individuals in two widely cited databases. The Pittsburgh Youth Study measures self-reported criminal acts over intervals of six months or a year in three groups of boys in the public school system in Pittsburgh, PA. The Cambridge Study in Delinquent Development records criminal convictions amongst a group of working-class youths in the UK over a 14-year period.
The range of the data is substantially different between these two measures of criminal activity, since one is based on convictions and the other on self-reported acts. However, there are similarities in characteristics of the data sets. Excluding the frequency with which zero crimes are committed or reported, a power law relationship between the frequency and rank of the number of criminal acts per individual describes the data well in both cases, and fits the data better than an exponential relationship. The exponent is virtually identical in both cases. A better fit is obtained for the tail of the distribution. The data point indicating the number of boys not committing any crime does not fit with the power law that characterizes the rest of the data; perhaps a crucial step in the criminal progress of an individual is committing the first act. Once this is done, the number of criminal acts committed by an individual can take place on all scales.
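As an illustration of the rank-frequency description used here, the short Python sketch below estimates a power-law exponent by an ordinary least-squares fit of log frequency on log rank. The counts are made-up example data, and this simple estimator is not necessarily the one used by Cook et al. (2004).

    import math

    # Hypothetical numbers of crimes per individual, zeros already excluded
    counts = sorted([12, 9, 7, 5, 5, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1], reverse=True)

    xs = [math.log(r) for r in range(1, len(counts) + 1)]   # log rank
    ys = [math.log(c) for c in counts]                      # log count

    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    print("estimated rank-frequency exponent:", round(slope, 2))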
Fig. 2. Cumulative distribution functions of the percentage fall in real output during economic recessions. The solid line is the empirical distribution function of real US GDP, peak to trough, during recessions 1900-2004. The dotted line is the distribution function of total output series generated by equations (1) and (2)
The stylised facts that characterize both data sets can be described as:
• Approximately two-thirds of all boys in the sample commit no crime during a given period of time.
• The distribution of the number of crimes committed by individuals who did commit at least one crime fits a power law with exponent of –1.
We aim to build a model that reconstructs these stylized facts. We imagine that a cohort of youths arrives at the age at which they start to commit crime. These youths are from relatively low income backgrounds, and are themselves relatively unskilled. This particular social type is responsible for a great deal of the total amount of crime that is committed. Of course, in reality, different individuals begin committing crimes at different ages with different motivations. The idea of this model is that the opportunity to commit a crime presents itself sequentially to a cohort over time. We use preferential attachment to represent the process by which a crime opportunity becomes attached to a particular individual, and so becomes an actual rather than a potential crime. Preferential attachment is widely used to describe how certain agents become extremely well connected in a network of interpersonal relationships as existing connections provide opportunities to establish new connections (Newman 2003, Albert and Barabasi 2002). Briefly, to grow a network of this type, new connections are added to the nodes of a graph with a probability equal to the proportion of the total number of connections that any given node has at any particular time. At the outset of our model, none of the agents have committed a crime. During each step a crime opportunity arises and is presented to an agent for attachment with a probability that increases with the number of crimes the agent has already committed. In the beginning, each agent has the same probability to experience a crime opportunity and at each step the opportunity arises for the jth individual to commit a crime with the following probability:
Pj = (nj + ε) / (Σi=1..N ni + εN)    (3)
where nj is the number of crimes agent j has already committed, N is the number of agents in the cohort, and 1/ε measures approximately how much more important the number of past crimes is, relative to mere membership of the cohort, in attracting a new crime opportunity. Once a crime opportunity is attached, the individual decides whether to commit the crime with a probability that increases with the number of past crimes committed. This is a reasonable hypothesis since, as more crimes are committed, an individual progressively loses scruples about committing new crimes. Initially agents have a heterogeneous attitude towards committing a crime (i.e., the probability of committing a crime once they have the opportunity), and this attitude increases every time they commit a crime. The initial probability for each agent is drawn from a Gaussian distribution (µ, σ) and then increased by a factor of α for every crime committed.
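A minimal Python sketch of this mechanism is given below. The variable names, the number of steps and the additive reading of the α increment are our own assumptions based on the description above, not the authors' code.

    import random

    N, STEPS = 200, 4000                     # cohort size as in the paper; step count assumed
    eps, mu, sigma, alpha = 0.4, 0.2, 0.2, 0.1

    crimes = [0] * N                                          # crimes committed so far
    attitude = [random.gauss(mu, sigma) for _ in range(N)]    # initial propensity to commit

    for _ in range(STEPS):
        # Equation (3): the opportunity attaches to agent j with probability (n_j + eps) / (sum n + eps*N)
        weights = [c + eps for c in crimes]
        j = random.choices(range(N), weights=weights)[0]
        # The selected agent commits the crime with its current attitude, clipped to [0, 1]
        if random.random() < min(max(attitude[j], 0.0), 1.0):
            crimes[j] += 1
            attitude[j] += alpha              # past crimes erode scruples (additive reading of alpha)

    offenders = [c for c in crimes if c > 0]
    print(len(offenders), "of", N, "agents committed at least one crime")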
The model is run for 200 agents with ε = 0.4, µ = σ = 0.2, α = 0.1 and the results closely match the stylized facts found for the data sets mentioned above. Out of 200 agents, 70 commit at least one crime and the cumulative distribution of the number of crimes committed by those with at least one crime follows a power law with exponent of -0.2 (so that the probability density has an exponent of −1.2), in excellent agreement with the stylized facts from the empirical study. It should be observed that the ratio of the parameters ε and µ influences the number of individuals committing no crime, and this may have a reasonable explanation in the fact that, in a community, the value of having committed past crimes decreases as the facility to commit crimes increases.
Fig. 3. Cumulative Distribution function of number of crimes committed per criminal. Logarithmic scale
4. Discussion The preceding examples demonstrate that agent-based models in which agents have zero or low cognitive ability are able to capture important aspects of “real world” social systems. We suggest that the utility of these low cognition models may be a consequence of social systems evolving the “institutions” (e.g., topology and protocols governing agent interaction) that provide robustness and evolvability in the face of wide variations in
agent information resources and agent strategies and capabilities for information processing. More precisely, suppose that the socio-economic system of interest evolves to realize some “objective”, using finite resources in an uncertain world, and by incrementally: (1) correcting observed defects, and (2) adopting innovations. In this case we can prove that such systems are “robust, yet fragile” and allow “deep information from limited data”, and that these properties increase as system maturity increases (Colbaugh and Glass, 2003). Further, if robustness and evolvability are both important, we can show that: (1) system sensitivity to variations in topology increases as the system matures, and (2) system sensitivity to variations in vertex characteristics decreases as the system matures (Colbaugh and Glass, 2004). In other words, these results suggest that as complex social systems (of this class) “mature”, the importance of accurately modeling their agent characteristics decreases. The implication is that the degree of cognition assigned to agents in socio-economic networks decreases as the system matures. Further, the importance of characterizing the network or institutional structure by which the agents are connected increases. The implication is that successful institutional structures evolve to enable even low cognition agents to arrive at “good” outcomes, and that in such situations the system behaves as if all agent information processing strategies are simple.
References
Albert, R., and A. Barabasi (2002): Statistical mechanics of complex networks, Rev. Mod. Physics, Vol. 74, pp. 47-97.
Colbaugh, R., and K. Glass (2003): Information extraction in complex systems, Proc. 2003 NAACSOS Conference, Pittsburgh, PA, June (plenary talk).
Colbaugh, R., and K. Glass (2004): Simple models for complex social systems, Technical Report, U.S. Department of Defense, February.
Cook, W., P. Ormerod, and E. Cooper (2004): Scaling behaviour in the number of criminal acts committed by individuals, Journal of Statistical Mechanics: Theory and Experiment, 2004, P07003.
Kahneman, D. (2002): Nobel Foundation interview, http://www.nobel.se/economics/laureates/2002/khaneman-interview.html
Newman, M. (2003): The structure and function of complex networks, SIAM Review, Vol. 45, No. 2, pp. 167-256.
Ormerod, P. (2002): The US business cycle: power law scaling for interacting units with complex internal structure, Physica A, 314, pp. 774-785.
Ormerod, P. (2004): Information cascades and the distribution of economic recessions in capitalist economies, Physica A, 341, pp. 556-568.
Ormerod, P., and B. Rosewell (2003): What can firms know? Proc. 2003 NAACSOS Conference, Pittsburgh, PA, June (plenary talk).
Radner, R. (1968): Competitive Equilibrium Under Uncertainty, Econometrica, 36.
Smith, V. L. (2003): Constructivist and ecological rationality in economics, American Economic Review, 93, pp. 465-508.
____________________________________________________________
Information and Cooperation in a Simulated Labor Market: A Computational Model for the Evolution of Workers and Firms
S. A. Delre and D. Parisi
1. Introduction In free markets workers and firms exchange work for salaries and they have both competing and mutual interests. Workers need to work to get a salary but they are interested in getting as high a salary as possible. Firms need to hire workers but they are interested in paying them as low a salary as possible. At the same time both categories need each other. Neoclassical microeconomics formalizes the labor market as any other goods market. Workers represent the supply of labor and firms represent the demand. On the one hand workers are assumed to have perfect knowledge about wages and the marginal rate of substitution between leisure and income so that they can decide how to allocate their hours: work in order to increase income and leisure in order to rest. On the other hand, firms know the level of wages, the price of the market, and the marginal physical product of labor (the additional production resulting from one more unit of labor) so that they can compute the quantity of labor hours they need. At the macro level, workers’ supplies and firms’ demands are aggregated and they intersect in the equilibrium point determining the level of wage and the level of employment. Neoclassical formalization of the labor market is the point of departure for many other works about labor markets (Ehrenberg and Smith, 1997; Bresnahan, 1989). However, the micro foundation of neoclassical economics is a strong simplification of reality and it does not take into consideration many other important factors like fairness (Rees, 1993; Fehr and Schmidt, 1999), bargaining power (Holt, 1995), information matching and social contacts (Rees and Schultz, 1966; Montgomery, 1991; Granovetter, 1995).
In order to include these phenomena in the analysis of labor markets, neoclassical assumptions need to be changed or at least relaxed and extended. Moreover, neoclassical economics provides a static explanation of the equilibrium in which macro-variables should be if workers and firms were always rational and information were always completely available. In the last three decades, alternative approaches have been born and have flourished, such as evolutionary economics (Nelson and Winter, 1982; Dosi and Nelson, 1994; Arthur et al., 1997) and agent-based computational economics (Epstein and Axtell, 1996; Tesfatsion, 2002a). Both approaches have in common the idea that economic agents have bounded rationality. The former focuses on the fitness of agents’ behaviors in the environment and how these behaviors adapt and evolve under the pressure of selection rules. The latter uses computational models in order to simulate economic agents’ behaviors and it shows how macro-regularities of the economy emerge from the micro-rules of the interactions of economic agents. These approaches have also introduced new interest in the field of labor markets, and many new works have appeared in order to explain crucial stylized facts like job concentration (Tesfatsion, 2001), the Beveridge curve, the Phillips curve vs the wage curve, Okun’s curve (Fagiolo et al., 2004), and equality and segregation (Tassier and Menczer, 2002). In this paper we present an agent-based model in order to study (a) the evolution of different labor markets; (b) the effects of unions on the labor market; and (c) the effects of social network structures on the value of information for workers. First, labor markets differ according to the constraints firms have when hiring new workers. Either employers are completely free to have a one-to-one bargaining process with the worker or they have to respect some laws, like a minimum wage, some bargaining norms, like fairness, or some policy, like a wage indexation scale. In this paper we present a comparison of the two extremes of this continuum: in the first scenario firms are completely constrained and they cannot change and adjust their behaviors; in the second scenario firms are able to change, evolve, and enter and leave the market according to their performance. Here we present the effects of these different situations on total production of the market, levels of salaries, firms’ profits, and value of information about a job. Second, in labor markets labor unions increase workers’ bargaining power. Unions limit competition among workers and they try to defend unionized employees’ interests by means of collective bargaining. However, unions’ influences can also affect non-unionized workers because they strongly influence the evolution of firms’ strategies. The model allows us to introduce a collective bargaining where a fraction of worker agents is
unionized and it shows how this unionization affects the average level of salaries. Third, many studies have shown that informal hiring methods like employee referrals and direct applications are very relevant when workers are looking for a new job (Rees and Schultz, 1970; Granovetter, 1995). Friends and social contacts are major sources of employment information. Montgomery (1991) has analyzed four databases about alternative job-finding methods and he found that approximately half of all employed workers found their job through social contacts like friends and relatives. We have connected the worker agents of our multi-agent model in different network structures and we have studied the effects of such structures on the value of information workers have about a job.
2. The model The categories of workers and firms are in a relation of reciprocal activation or mutualism (Epstein, 1997). The two categories need each other and when a category increases in number, it feeds back to the other category. Workers need firms in order to get a salary and firms need workers to develop, to produce, and to sell goods. On the one hand, in any given market, if workers were to disappear, firms also would disappear and vice versa. On the other hand, if firms increased in number because of greater environmental resources, workers would also increase in number in order to use those resources. In such an artificial world, assuming infinite resources in the environment, workers’ and firms’ populations would simply increase exponentially, but if we assume a limit for available resources (carrying capacity), the process is blocked and at a certain point workers and firms will stop increasing in number. However, although their mutual relationship induces both workers and firms to collaborate in order to increase in number, they compete with each other in that they have opposite interests: workers aim at getting higher salaries from firms and firms aim at paying lower salaries to workers because this enables them to compete more successfully with other firms in the market. Workers want many firms offering high salaries and firms want many workers accepting low salaries. We have reproduced such a situation in an agent-based simulation where worker agents and entrepreneur agents are two separate categories and they live together in an environment with a given carrying capacity. The simulation proceeds with discrete time-steps (cycles). A local selection algorithm (Menczer and Belew, 1996) has been used to evolve the behavior of worker and entrepreneur agents. Both worker agents and entrepreneur
agents are born, live, reproduce, and die. At birth each agent is endowed with a certain energy of which a constant quantity (individual consumption) is consumed at each time step in order for the agent to remain alive. If an agent’s energy goes to zero, the agent dies. To survive, the agent must procure other energy to reintegrate the consumed energy. As long as an agent succeeds in remaining alive, the agent periodically (every K1 cycles of the simulation) generates an offspring, i.e., a new agent of the same category, that inherits the genotype of its single parent. All agents die at a certain maximum age (K2 cycles of the simulation). The agents that are better able to procure energy live longer and have more offspring. Consequently the two populations of agents vary in number during the simulation run according to the strategies of the agents, and those agents that adopt the best strategies will evolve better than the others. At each time-step agents incur energy costs and gather energy: worker agents gain new energy from salaries and entrepreneurs from profits. Equations 1 and 2 describe respectively the evolution of energy for worker and entrepreneur agents:
ewi,t = ewi,t−1 − Cp − hi·Cr + si,t    (1)
eej,t = eej,t−1 − Cp + πj,t    (2)
where ewi,t indicates the energy of worker agent i at time t, Cp is the fixed cost of individual consumption needed to survive, hi·Cr is the cost of job search (hi indicates the number of offers evaluated by worker agent i and Cr the fixed cost of each offer), si,t indicates the salary worker agent i gets for its job, eej,t indicates the energy of entrepreneur agent j at time t, and πj,t indicates its profits at time t. Equation 3 describes the profits of entrepreneur agent j:
πj,t = P·yj,t − Cf − Σq sq,t    (3)
Profits are given by the revenue P·yj,t (where P is the fixed price and yj,t is the production) minus the fixed costs of the entrepreneur’s firm, Cf, and the salaries paid to the employed worker agents. The genetic algorithm permits the evolution of both worker and entrepreneur agents’ behaviors. Each agent, both worker and entrepreneur, possesses a different “genotype” that determines its behavior. For worker agents, the genotype specifies the minimum salary (minS) that the agent is ready to accept from an entrepreneur agent and the number of different entrepreneur agents (h) that it contacts when, being unemployed, it is looking for a job. For entrepreneur agents, the genotype only specifies the maximum salary (maxS) that it is ready to pay to a worker agent when the entrepreneur agent hires it. When a worker agent i is jobless, i contacts h
entrepreneur agents and it takes into consideration only the highest offer of entrepreneur agent j. Worker agent’s genotype (minSi) and entrepreneur agent’s genotype (maxSj) are compared and the worker agent i starts working for the entrepreneur agent j only if maxSj is equal to or higher than minSi. When a contract is set up, the salary of the worker agent will be set to the mean of maxSj and minSi as indicated in equation (4).
si,t = minSi + ½ (maxSj − minSi)    (4)
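In code, the search-and-bargaining step can be sketched as follows (illustrative Python with hypothetical helper names, not the authors' implementation).

    import random

    def best_offer(entrepreneurs, h):
        # The worker evaluates h randomly contacted entrepreneurs and keeps the best offer
        contacted = random.sample(entrepreneurs, min(h, len(entrepreneurs)))
        return max(contacted, key=lambda e: e["maxS"])

    def bargain(worker, entrepreneur):
        # Equation (4): a contract starts only if maxSj >= minSi; the salary is the midpoint
        if entrepreneur["maxS"] >= worker["minS"]:
            return worker["minS"] + 0.5 * (entrepreneur["maxS"] - worker["minS"])
        return None   # no contract

    worker = {"minS": 35.0}
    firms = [{"maxS": random.uniform(20, 60)} for _ in range(10)]
    print(bargain(worker, best_offer(firms, h=3)))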
All working contracts have the same length (K3 cycles of the simulation) and, when a contract expires, the worker agent becomes jobless and it looks again for a new job. For both worker agents and entrepreneur agents it is convenient that maxS and minS evolve towards stable values in order to increase the chances that contracts start. At the same time, for worker agents it is not convenient to have a very low minS because that means low salaries, and for entrepreneur agents it is not convenient to have a high maxS because that means high costs and less profits. The evolution of min S and max S (respectively average among worker agents and average among entrepreneur agents) indicates how the relation between worker agents and entrepreneur agents changes during the time of the simulation. Finally, the second part of workers’ genotype measures for a worker agent i how important information about a job is when it is jobless. The higher hi, the higher the chances to find a more remunerative job. However, hi also has a given cost (Cr), so that the evolution of h (average among worker agents) indicates how much worker agents are ready to pay to have more information about job offers. Entrepreneur agents continue hiring worker agents till the marginal revenue (the increase in production from hiring one more worker agent) is higher than or equal to the marginal cost (the salary paid to the new worker agent). We assume that the production curve is concave; that is, the marginal physical product of labor is declining (i.e., the more worker agents are employed by the entrepreneur agent, the less production will increase from hiring a new worker agent). Equation 5 describes the production curve of the firm of entrepreneur agent j:
yj = αN^z    (5)
where yj is the production of the firm of entrepreneur agent j, N is the number of workers employed by j, α is a constant indicating how much production an additional worker agent guarantees, and 0 < z ≤ 1 captures the decreasing contribution of each additional worker;
when the total production reaches the level of the carrying capacity, the market is divided and assigned to the entrepreneurs according to their relative contribution to the total production. Fig. 1 presents (a) the pseudo-code and (b) a graphical representation of the cycle of the simulation run. In order to analyze the effects of globalization, we connected worker agents in different network structures and let information travel through these networks. The nodes of the networks are worker agents and the arcs of the networks are the contacts among worker agents. If a contact exists between two worker agents, then they are friends and they can exchange information about their jobs. When a worker agent looks for a new job, it uses its contacts, it asks its friends for which entrepreneur agents they work, and it evaluates the offers of those entrepreneur agents. Different network structures have been taken into consideration: regular networks, small-world networks, and random networks (Watts and Strogatz, 1998). In a regular network, all nodes have contacts just with their neighbors, so that if a worker agent i is a friend of a worker agent j and worker agent j is a friend of a worker agent k, it is very likely that worker agents i and k are also friends. Such a network is very clustered and information travels very slowly through it. It can represent a highly segregated world where workers have just local information coming from their neighbors. If we take a regular network and we rewire each arc of the network with a probability p=1, then we obtain a random network where worker agents are connected completely randomly among themselves. Such a network is not clustered at all and information goes very fast through it. This structure can represent a global world where all contacts among worker agents have the same probability to exist. If we rewire each arc of the regular network with a probability p such that 0 < p < 1, we obtain a small-world network, an intermediate structure that is still fairly clustered but through which information travels much faster than in the regular network.
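The rewiring procedure described above is essentially the Watts-Strogatz construction. A bare-bones version is sketched below (our own Python, assuming an even number k of nearest neighbours and simple set-based adjacency lists).

    import random

    def ring_lattice(n, k):
        # Regular network: each node linked to its k nearest neighbours (k even)
        return {i: set((i + d) % n for d in range(1, k // 2 + 1)) |
                   set((i - d) % n for d in range(1, k // 2 + 1)) for i in range(n)}

    def rewire(adj, p):
        # Rewire each link with probability p: p=0 regular, 0<p<1 small-world, p=1 random
        n = len(adj)
        for (i, j) in [(i, j) for i in adj for j in adj[i] if i < j]:
            if random.random() < p:
                adj[i].discard(j); adj[j].discard(i)
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = random.choice(candidates)
                    adj[i].add(new); adj[new].add(i)
        return adj

    workers_net = rewire(ring_lattice(20, 4), p=0.2)   # 20 nodes and 40 links, as in Fig. 2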
3. The experimental design We have simulated two different scenarios. In the first scenario, entrepreneur agents are fixed in number and their genotypes do not evolve. This assumption represents a closed market where firms cannot freely decide about job offers. These markets have to respect many constraints created by laws like job protection, minimum salaries, etc. In the second scenario,
entrepreneur agents are free to evolve and the best entrepreneur agents determine the level of job offers. They evolve according to the efficiency of their strategies, and those entrepreneur agents who earn more have more chances to reproduce. In this case, markets are assumed to be totally free for firms.

(a) Pseudo-code of the simulation cycle:

    Initialize #W0 worker agents;
    Initialize #E0 entrepreneur agents;
    Assign genotypes to worker and entrepreneur agents;
    for C cycles {
        for each worker agent i {
            if (i is jobless) {
                evaluate hi offers;
                take the best offer;
                if (maxS >= minS) {
                    i sets up the contract;
                    i works, earns and consumes;
                } else {
                    i consumes;
                }
            } else {
                i works, earns and consumes;
            }
            if (ewi < 0) { i dies; }
            if (C mod K1 = 0) { i reproduces; }
            if (C mod K2 = 0) { i dies; }
            if (C mod K3 = 0) { i becomes jobless; }
        }
        for each entrepreneur agent j {
            if (marginal revenue >= marginal cost) {
                j keeps hiring worker agents;
            } else {
                j stops hiring;
            }
            j sells production, pays salaries and fixed costs;
            if (eej < 0) { j dies; }
            if (C mod K1 = 0) { j reproduces; }
            if (C mod K2 = 0) { j dies; }
        }
        Measure statistics of the evolution;
    }

(b) The graphical representation shows, repeated for N cycles, workers looking for a job and entrepreneurs offering one, followed by bargaining, working time and management, earning and consumption for both populations, and the creation of a new generation.
Fig. 1. The cycle of the simulation. (a) The pseudo-code. (b) A graphical representation of the genetic algorithm
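To make the cycle of Fig. 1 concrete, here is a deliberately stripped-down Python sketch of the second (fully evolving) scenario. The data structures, the parameter values, the mutation rules and the crude population caps that stand in for the carrying capacity are our own simplifications, so it should be read as an illustration of the loop rather than a reproduction of the authors' simulator.

    import random

    P, ALPHA, Z = 100, 1.0, 0.95       # price and production parameters (eq. 5: y = ALPHA * N**Z)
    CP, CR, CF = 30, 1, 50             # individual consumption, cost per offer evaluated, firm fixed cost
    K1, K2, K3 = 20, 100, 10           # reproduction period, maximum age, contract length (cycles)
    MAX_W, MAX_F = 2000, 400           # crude caps standing in for the carrying capacity

    def new_worker(minS=None, h=None):
        return {"e": random.uniform(0, 100), "age": 0, "firm": None, "left": 0, "wage": 0.0,
                "minS": random.uniform(0, 100) if minS is None else max(0.0, minS),
                "h": random.randint(1, 5) if h is None else min(5, max(1, h))}

    def new_firm(maxS=None):
        return {"e": random.uniform(100, 1000), "age": 0,
                "maxS": random.uniform(0, 100) if maxS is None else max(0.0, maxS)}

    def marginal_revenue(n):
        # Extra revenue from the (n+1)-th employee under the concave production function
        return P * ALPHA * ((n + 1) ** Z - n ** Z)

    workers = [new_worker() for _ in range(200)]
    firms = [new_firm() for _ in range(40)]

    for cycle in range(1, 1001):
        staff = {id(f): 0 for f in firms}
        for w in workers:
            if w["firm"] is not None:
                staff[id(w["firm"])] += 1

        # Workers: search, bargain (eq. 4), work; energy update as in eq. (1)
        for w in workers:
            w["age"] += 1
            w["e"] -= CP
            if w["firm"] is None and firms:
                w["e"] -= w["h"] * CR
                best = max(random.sample(firms, min(w["h"], len(firms))), key=lambda f: f["maxS"])
                wage = w["minS"] + 0.5 * (best["maxS"] - w["minS"])
                if best["maxS"] >= w["minS"] and marginal_revenue(staff[id(best)]) >= wage:
                    w["firm"], w["left"], w["wage"] = best, K3, wage
                    staff[id(best)] += 1
            if w["firm"] is not None:
                w["e"] += w["wage"]

        # Firms: produce (eq. 5), sell, pay wages and fixed costs; energy update as in eqs. (2)-(3)
        bill = {}
        for w in workers:
            if w["firm"] is not None:
                bill[id(w["firm"])] = bill.get(id(w["firm"]), 0.0) + w["wage"]
        for f in firms:
            f["age"] += 1
            f["e"] += P * ALPHA * staff[id(f)] ** Z - CF - bill.get(id(f), 0.0)

        # Contract expiry, deaths (energy or age), periodic reproduction with mutated genotypes
        for w in workers:
            if w["firm"] is not None:
                w["left"] -= 1
                if w["left"] == 0:
                    w["firm"] = None
        workers = [w for w in workers if w["e"] > 0 and w["age"] < K2]
        firms = [f for f in firms if f["e"] > 0 and f["age"] < K2]
        alive = {id(f) for f in firms}
        for w in workers:
            if w["firm"] is not None and id(w["firm"]) not in alive:
                w["firm"] = None              # employer went bankrupt
        if cycle % K1 == 0:
            if len(workers) < MAX_W:
                workers += [new_worker(w["minS"] + random.uniform(-2, 2),
                                       w["h"] + random.choice([-1, 0, 1])) for w in workers]
            if len(firms) < MAX_F:
                firms += [new_firm(f["maxS"] + random.uniform(-2, 2)) for f in firms]

    if workers:
        print("workers:", len(workers), "average minS:", round(sum(w["minS"] for w in workers) / len(workers), 1))
    if firms:
        print("firms:", len(firms), "average maxS:", round(sum(f["maxS"] for f in firms) / len(firms), 1))

Averages of minS and maxS recorded along such a run play the role of the min S and max S series discussed in Section 4.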
First, we compare the levels of production of the two scenarios at the steady states in order to analyze the efficiency of different markets at a global level. For similar amounts of resources (the same carrying capacity), more production means more efficiency of the system. Moreover, we also study how the average levels of energies change in the different markets and how worker agents’ and entrepreneur agents’ genotypes evolve. This allows us to analyze the efficiency of the system at the individual level, too. The higher the average levels of energy in the two populations, the better the conditions available for the agents; the higher the value of min S and max S, the higher the chances for worker agents to get a more remunerative job. Second, we study the effects of worker agents’ cooperation on the level of salaries. We let a fraction of worker agents (coop) cooperate and bargain collectively with entrepreneur agents. Those worker agents do not accept any salary less than a minimum collectively established (min_sal). We aim at seeing whether such cooperation could affect entrepreneur agents’ behaviors in bargaining, so we observe the variation of entrepreneur agents’ genotypes max S. The higher the value of max S at the steady state, the stronger the effect of the cooperation is.
Fig. 2. Different network structures for 20 nodes and 40 links. (a) Regular network, (b) Small-world network and (c) Random network
Third, in order to study the value of information when worker agents look for a job, we observe the evolution of h (average among the worker agents population) along the simulation runs in the two scenarios. The value of hi indicates how many offers the worker agent i evaluates in order to find a new job so that the higher the value of h , the higher the value of the information for worker agents. Finally, we connect worker agents in different network structures: regular network, small-world network, and random network, and we observe the influence of such structures on the value of h .
Table 1 summarizes the experimental design, indicating the assumptions, the independent and the dependent variables of the simulation experiments.

Given values and assumptions: c = [0, 3000]; P = 100; K1 = 20 cycles; K2 = 100 cycles; K3 = 10 cycles; α = 1; RangeMutationMinS = [-2, 2]; RangeMutationMaxS = [-2, 2]; RangeMutationH = [-0.2, 0.2]; ewi,0 = [0, 100]; eei,0 = [100, 1000]; #W0 = 1000; #E0 = 200; CC = 10^5; Cf = 50.

Independent variables: Cp = {30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85}; Cr = {0, 1, 2, 5, 10}; z = {0.95, 1}; p = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}; min_sal = {30, 40, 50, 60, 70, 80, 90, 100}; coop = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}.

Dependent variables: Y; ew; ee; max S; min S; h.
where c is the cycle of the simulation, P is the price; K1 is the number of simulation cycles after which each agent reproduces; K2 is the maximum number of cycles an agent lives; K3 is the number of cycles a worker agent works for an entrepreneur agent every time it finds a job; α is the additional contribution of a new worker to production; z represents the decreasing contribution to production of a new worker; RangeMutationMinS, RangeMutationMaxS and RangeMutationH indicate how much minSi, maxSj and hi can change when agents reproduce; eei,0 and ewi,0 are the amounts of energy entrepreneur and worker agents have at cycle 0; #W0 and #E0 are the numbers of worker and entrepreneur agents at the beginning of the simulation run; CC is the carrying capacity; Cf stands for the fixed costs of each entrepreneur agent’s firm at each cycle; Cp stands for the individual consumption at each cycle; Cr stands for the cost a worker has to pay for each evaluation of a job offer; p represents the degree of globalization of the worker agents’ network; min_sal is the minimum salary worker agents want when they bargain collectively; coop is the portion of worker agents that cooperate; Y is the total production;
ew and ee are the averages of worker agents’ and entrepreneur agents’ energies; maxS is the average value of maxSj; minS is the average value of minSi; h is the average value of hi.
Table 1. The set up of the simulation runs
4. Results In this section we first present how our system evolves in two different labor market environments (scenario1: given and constrained labor demand, and scenario2: evolving and free labor demand). Section 4.1 presents the
level of production, the level of contracts, and the level of wealth of worker and entrepreneur agents; in section 4.2 we study the effects of cooperation among workers; in section 4.3 we observe how worker agents expend different effort and resources in order to find a job in the two scenarios; and finally, in section 4.4 we describe the differences in the value of information for worker agents when they are connected through different network structures.
4.1 A comparison between a fixed labor demand and an evolving labor demand In Fig. 3 we present the levels of production in the two scenarios for similar conditions: Cp=30, Cr=10. In scenario1 worker agents can reproduce but entrepreneur agents cannot, and only minSi evolves while maxSj = [0, 100] is fixed for each entrepreneur agent. In scenario2 both worker agents and entrepreneur agents can reproduce and minSi and maxSj can evolve. When entrepreneur agents do not reproduce and do not evolve, the production does not even reach the level of carrying capacity. In a highly bureaucratic and closed market where new firms are not allowed to enter the market and old firms cannot change their strategies, the system cannot move to a better state unless there is an exogenous change like new technologies. This result represents a lock-in situation where both worker and entrepreneur agents are satisfied and they have no interest in using the additional available resources of the environment. When new entrepreneur agents are allowed to enter the market and to evolve their strategies, all the resources of the environment are used and production largely overcomes the carrying capacity. The level of production is much higher than the carrying capacity of the system because in our model the only form of competition among entrepreneur agents is the level of production. The more entrepreneur agents produce, the more chances of increasing their portion of the market they obtain. Because in scenario2 evolution rewards the most competitive entrepreneur agents, production is pushed to the maximum level, even much more than the carrying capacity of the system. In Fig. 4 we present the evolution of min S and max S in order to see how the genetic algorithm drives the evolution of the system in the two scenarios with the same conditions as before. In both scenarios a steady state is reached but min S and max S are much higher in scenario1. In such a situation worker agents can ask higher salaries because the labor demand varies more than in scenario2. While in scenario1 there are always entrepreneur agents that offer high salaries because they cannot evolve, in scenario2 entrepreneur agents co-evolve; in fact, their job offers become
lower and more similar. This means that there are better conditions for worker agents in scenario1 because they have the opportunity to obtain higher contracts. Surprisingly, this does not mean less profit for entrepreneur agents. In Fig. 5, we compare the average amount of energy of both categories, ew and ee, against time. It can be seen how both categories have a higher amount of energy in scenario1 than in scenario2. The initial period is very rich for both categories: ew reaches about 500 in scenario1 and 300 in scenario2; ee reaches more than 1200 in scenario1 and almost 600 in scenario2. After the initial boom, when the system reaches a steady state, ee always stays much higher than ew and the difference between ee and ew is much less in scenario1 than in scenario2. Why do worker and entrepreneur agents have less energy in scenario2 than in scenario1 if the level of production is much higher? Because the number of agents is much lower in scenario1 than in scenario2. Consequently, although there is less production in scenario1, all agents have more energy per head. Moreover, it can be observed that in scenario2 the level of the salary is just the minimum worker agents need to survive. Varying the value of Cp for 13 simulation runs (Cp = {30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90}), we let the system converge towards the
steady state and we collect values of minS and maxS after 3000 time-
steps. In Fig. 6 we observe how the level of minS and maxS depends on the fixed consumption Cp in scenario2. We also ran simulations for bigger values of Cp and we found that for Cp >= 85, the populations go to extinction. The selection pushes worker agents at the minimum level of salary. Those worker agents that are ready to earn just what they consume at each time-step are those who better reproduce. Asking a higher salary means less chance to find a job because almost all entrepreneur agents offer low salary and asking a lower salary means not having enough to survive.
4.2 The effects of cooperation As described in Fig. 6, in scenario2 entrepreneur agents are able to push worker agents to the limit of their individual consumption. Salaries evolve during the simulation run and then they stabilize just at the level of Cp. In this situation worker agents earn enough in order to keep working during their working life, and if the economy of the system grows, either entrepreneur agents earn more or the number of worker and entrepreneur agents increases so that the production can also increase. But what happens if worker agents cooperate in order to obtain higher salaries? What would happen if they set up a union that collectively bargains with entrepreneur
agents? We simulate such a situation, introducing two more independent variables in the basic set-up of the simulation: the percentage of cooperation (coop) and the minimum salary that worker agents who cooperate decide to ask (min_sal). Thus, if worker agents belong to the union and they are unemployed, they use min_sal instead of minS in the bargaining process with entrepreneur agents. That is, they do not accept any salary that is lower than min_sal. Finally, coop indicates the fraction of the worker population that cooperates.
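In code terms the union amounts to a one-line change in the bargaining step of the sketch given after Fig. 1: a cooperating worker substitutes the collectively agreed floor for its own genotype. The names below are hypothetical and follow that sketch.

    import random

    MIN_SAL, COOP = 60.0, 0.6            # collectively agreed floor and unionised fraction (examples)

    def reservation_wage(worker):
        # A cooperating (unionised) worker never accepts less than MIN_SAL
        return MIN_SAL if worker.get("union", False) else worker["minS"]

    workers = [{"minS": random.uniform(0, 100), "union": random.random() < COOP} for _ in range(10)]
    print([round(reservation_wage(w), 1) for w in workers])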
Fig. 3. The evolution of total production (Y) in the two scenarios Level of contracts (1) minS (scenario1)
70 minS and maxS
60 50
maxS (scenario1)
40 30
minS (scenario2)
20 10 3001
2751
2501
2251
2001
1751
1501
1251
1001
751
501
251
1
Fig. 4. The evolution of min S and max S in the two scenarios
We set the model in scenario2 to the parameter values Cp = 30, Cr = 10 and we collected the values of max S at the steady states for different values of min_sal and coop (min_sal = {30, 40, 50, 60, 70, 80, 90, 100} and coop = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}). Fig. 7 shows the results. Whatever salary the cooperating worker agents require, they manage to get better conditions if the cooperation is higher than 50%. We might have expected that the higher the salary they ask, the higher the percentage of cooperation they need. But our results indicate that the system has a threshold mechanism: if the cooperation involves less than half of the worker population, entrepreneur agents can resist and they do not change the way they bargain. In this case, they just hire those worker agents that do not cooperate, because there are many to be found and hired. On the contrary, if more than half of the population cooperates, then entrepreneur agents have to adapt and meet some of the worker agents’ requests. Then the value of max S increases linearly with the percentage of cooperation, and the slope of the line is determined by the minimum salary worker agents require.
Fig. 5. The evolution of ew and ee in the two scenarios
4.3 The value of information In Fig. 8a we present the evolution of h in the two scenarios for different costs of information (Cr = {0, 1, 2}, Cp = 30) and in Fig. 8b we show how h varies at the steady states for a larger parameter space (Cr = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, Cp = 30). The value of h at the steady states of the simulations is higher in scenario2 than in scenario1. Worker agents are
ready to pay more for information when entrepreneurs evolve. This means that information becomes more important when worker agents have lower salaries. When entrepreneur agents evolve they offer similar and lower salaries. Why do worker agents value information more although job offers are lower and the chances to find a better job are also lower? Moreover, in scenario2 worker agents earn the exact amount of energy to survive; then why does the evolution select those worker agents that decide to look for more job offers? The value of information explains this apparent strange result. In scenario1 worker agents live in a better condition than in scenario2 because they earn more and they easily find good jobs. They have no pressure to find a better job. On the contrary, at the steady state of scenario2, worker agents are pushed to a survival condition so that if a single worker agent is able to find a better job, it obtains a conspicuous advantage in comparison with the others. Selection rewards this advantage and that is why worker agents with a high h value are better selected.
4.4 Globalization and segregation: the effects of different social network structures The value of information depends on the conditions under which worker agents live when they look for a job. But also different network structures affect the value worker agents give to information about jobs. We set the model to Cr = 1, Cp = 50 and z = 0.95 and we investigate the parameter space of p from 0 to 1 for 11 conditions (p = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}). Compared to the previous simulation runs, the value of z has been lowered in order to avoid a lock-in situation where one single entrepreneur agent can hire all worker agents. Each worker agent i has 10 friends; consequently hi indicates how many of its friends the worker agent i consults when it is looking for a job. Thus hi can vary from 1 to 10. We collected results at the steady states after 3000 timesteps and Fig. 8 shows the results. Also, the value of h is bigger than 5 when the network is completely clustered. This is due to the low cost of information (Cr = 1) compared with the individual consumption (Cp = 50). In Fig. 9, it is clear that worker agents are more interested in job information when the network is less clustered because information is very redundant. In this world, worker agents are very segregated because their only sources of news about jobs are local neighbors. In this case, the more friends a worker agent asks for job information, the higher the chances to obtain new information, but in general these chances are small. On the contrary, when the network becomes less clustered, the chances of obtaining useful news about well-remunerated jobs are higher. That is why the value of h becomes higher for higher value of p. However, when p ap-
proaches 0.6, the value of h stabilizes around 9. For such a value, the network is already very randomized and information about good jobs is already well spread among worker agents. Any increase in randomization of the network does not affect the value of h anymore.
Fig. 6. The influence of individual consumption (Cp) on the level of salaries in scenario2
Fig. 7. Effects of cooperation in scenario2
Fig. 8a. The evolution of h in the two scenarios for different costs of the information (Cr)
Fig. 8b. The value of h at the steady states for different costs of the information in the two scenarios
5. Conclusions Here we have presented an agent-based model in order to simulate the evolution of workers’ and firms’ behaviors in the labor market. A sharp distinction has been made between completely free labor markets and com-
pletely constrained labor markets. When workers and firms individually bargain for the salary, the free labor market is globally more efficient, because the level of production is much higher than the carrying capacity, and locally less efficient, because the levels of wealth are worse both for workers and for firms. Nowadays policies like privatization, deregulation, and liberalization are generally associated with high production and higher efficiency of the global market. However, just as any medal has its reverse, in many markets such policies can create worse levels of wealth because of lower salaries. Our simple evolutionary model is able to simulate these two sides of the coin and it allows us to study more detailed phenomena of labor markets like cooperation and the value of information.
Fig. 9. The effects of network structures on the value of information for worker agents
According to our results, in a completely free market individual bargaining (one worker vs one firm) leads to minimum levels of salaries. The evolution rules of the genetic algorithm adopted in our model impose more pressure on workers than on firms. However, in labor markets, unions dim the pressure on workers and the unions may strongly influence the evolution of firms’ behaviors. Letting worker agents of our simulations cooperate and bargain collectively with firms, we simulate a unionization of workers. Our results show an increasing tension of the evolution of the system when the fraction of unionized workers increases. If less than half of the workers cooperate, then firms “win” and salaries stay at the same level of survival individual consumption. But if more than half of the workers cooperate, then workers “win” and they obtain an increase in the level of salaries. After this critical stage, the higher the fraction of coopera-
tion is, the higher the levels of salaries workers can obtain. However, if workers’ requests are too high (for example, all the profits of the firms), then firms simply abandon the market. The last part of this paper has focused on analyzing the value of information for workers in the labor market. How much effort do workers need for getting a job? How many offers do they consider before deciding which job to do? How important for workers is information about other job offers? Many studies have analyzed how workers try to find a job using referral networks (Granovetter, 1973; Tassier and Menczer, 2002). All these studies focus on the importance of friends and relatives in finding a job but they assume the labor demand of firms to be fixed and given. On the contrary, finding a job depends on the general conditions of the labor market. During recession periods finding a job is much more difficult than during periods of economic growth. Our model explains how the value of social contacts changes according to the conditions of the market. We found that in completely free labor markets, where salaries stabilize to the level of individual consumption, the value of information is systematically higher than in constrained labor markets. This phenomenon is very evident in labor markets today. High flexibility is accompanied by high competition for part-time and badly paid jobs also. In many big firms, the competition at the low levels of the organization is also very high and workers struggle for little increments in their work position. Clerks aim to become cash coordinators; team members want to become team managers, etc. On the contrary, in old bureaucracies like public firms, workers seem to pay little attention to the chances of promotion they have. In this case there is little competition to obtain a better position and usually workers exert no effort to obtain a better job. Finally we have adapted our model in order to study the effects of globalization and segregation in a labor market and we have found that the value workers give to information about jobs increases with the degree of globalization. The less clustered the network structure connecting workers, the higher the value of the information. Nowadays people search for better jobs much more than before. Especially in the last few decades, looking for a job has become a very important activity and it has been extended for the total working life of workers. The globalization of western society has positively influenced the possibility of having more not-local information and, thus, the chances of getting better jobs. Also for this reason, today we spend much more effort than before in looking for a “better” job.
References

Arthur, B. W., Durlauf, S. N. and Lane, D. (1997): The Economy as an Evolving Complex System II, SFI Studies in the Science of Complexity, Addison-Wesley, Reading, MA.
Bresnahan, T. F. (1989): Empirical Studies of Industries with Market Power, in Handbook of Industrial Organization, R. Schmalensee and R. D. Willig, eds., Vol. II, Elsevier Science Publishers B.V., Amsterdam.
Dosi, G. and Nelson, R. R. (1994): An Introduction to Evolutionary Theories in Economics, Journal of Evolutionary Economics, 4, 153-172.
Ehrenberg, R. G. and Smith, R. S. (1997): Modern Labor Economics, Sixth Edition, Addison-Wesley, Reading, MA.
Epstein, J. E. and Axtell, R. (1996): Growing Artificial Societies: Social Science from the Bottom Up, MIT Press, Cambridge, MA.
Epstein, J. E. (1997): Nonlinear Dynamics, Mathematical Biology, and Social Science, Addison-Wesley, Santa Fe Institute.
Fagiolo, G., Dosi, G. and Gabriele, R. (2004): Towards an Evolutionary Interpretation of Aggregate Labor Market Regularities, working paper, Sant’Anna School of Advanced Studies, Pisa, Italy.
Fehr, E. and Schmidt, K. M. (1999): A Theory of Fairness, Competition, and Cooperation, Quarterly Journal of Economics, 114, 817-868.
Granovetter, M. (1973): The Strength of Weak Ties, American Journal of Sociology, 78, 1360-1380.
Granovetter, M. (1995): Getting a Job: A Study on Contacts and Careers, Harvard University Press, Cambridge, MA.
Holt, C. (1995): Industrial Organization: A Survey of Laboratory Research, in Handbook of Experimental Economics, J. H. Kagel and A. E. Roth, eds., Princeton University Press, Princeton.
Menczer, F. and Belew, R. K. (1996): Latent Energy Environments, in Adaptive Individuals in Evolving Populations: Models and Algorithms, R. K. Belew and M. Mitchell, eds., Santa Fe Institute Studies in the Science of Complexity, Addison-Wesley, Reading, MA.
Montgomery, J. D. (1991): Social Networks and Labor-Market Outcomes: Toward an Economic Analysis, The American Economic Review, 81, 1408-1418.
Nelson, R. R. and Winter, S. (1982): An Evolutionary Theory of Economic Change, Harvard University Press, Cambridge, MA.
Rees, A. and Schultz, G. P. (1966): Information Networks in Labor Markets, American Economic Review, 56, 559-566.
Rees, A. and Schultz, G. P. (1970): Workers in an Urban Labor Market, University of Chicago, Chicago.
Rees, A. (1973): The Economics of Work and Pay, Harper and Row, New York.
Rees, A. (1993): The Role of Fairness in Wage Determination, Journal of Labor Economics, 1 (11), 243-252.
Richardson, L. F. (1960): Arms and Insecurity: A Mathematical Study of the Causes and Origins of War, Boxwood Press, Pittsburgh.
Tassier, T. and Menczer, F. (2001): Emerging Small-World Referral Networks in Evolutionary Labor Markets, IEEE Transactions on Evolutionary Computation, 5 (5).
Tassier, T. and Menczer, F. (2002): Social Network Structure, Equality and Segregation in a Labor Market with Referral Hiring, working paper, http://informatics.indiana.edu/fil/papers.asp
Tesfatsion, L. (2001): Structure, Behavior, and Market Power in an Evolutionary Labor Market with Adaptive Search, Journal of Economic Dynamics and Control, 25, 419-457.
Tesfatsion, L. (2002a): Agent-Based Computational Economics: Growing Economies from the Bottom Up, Artificial Life, 1 (8), 55-82.
Tesfatsion, L. (2002b): Hysteresis in an Evolutionary Labor Market with Adaptive Search, in S. H. Chen, Evolutionary Computation in Economics and Finance, Physica-Verlag, Heidelberg, New York.
Watts, D. J. and Strogatz, D. H. (1998): Collective Dynamics of Small-World Networks, Nature, 393, 440-442.
Part V
Applications
____________________________________________________________
Income Inequality, Corruption, and the Non-Observed Economy: A Global Perspective

E. Ahmed, J. B. Rosser, Jr., and M. V. Rosser
1. Introduction1

How large the non-observed economy (NOE) is and what determines its size in different countries and regions of the world is a question that has been and continues to be much studied by many observers (Schneider and Enste, 2000, 2002)2. The size of this sector in an economy has important ramifications. One is that it negatively affects the ability of a nation to collect taxes to support its public sector. The inability to provide public services can in turn lead more economic agents to move into the non-observed sector (Johnson, Kaufmann, and Shleifer, 1997). When such a sector is associated with criminal or corrupt activities it may undermine social capital and broader social cohesion (Putnam, 1993), which in turn may damage economic growth (Knack and Keefer, 1997; Zak and Knack, 2001). Furthermore, as international aid programs are tied to official
——————
1
The authors wish to thank Joaquim Oliveira for providing useful materials. We have also benefited from discussions with Daniel Cohen, Lewis Davis, Steven Durlauf, James Galbraith, Julio Lopez, Branko Milanovic, Robert Putnam, Lance Taylor, Erwin Tiongson, and the late Lynn Turgeon. The usual caveat applies. 2 Many terms have been used for the non-observed economy, including informal, unofficial, shadow, irregular, underground, subterranean, black, hidden, occult, illegal, and others. Generally these terms have been used interchangeably. However, in this paper we will make distinctions between some of these and thus we prefer to use the more neutral descriptor, non-observed economy, adopted for formal use by the UN System of National Accounts (SNA) (see Calzaroni and Ronconi, 1999; Blades and Roberts, 2002).
measures of the size of economies, these can be distorted by wide variations in the relative sizes of the NOE across different countries, especially among the developing economies. Early studies (Guttman, 1977; Feige, 1979; Tanzi, 1980; Frey and Pommerehne, 1984) emphasized the roles of high taxation and large welfare state systems in pushing businesses and their workers into the non-observed sector. Although some more recent studies have found the opposite, that higher taxes and larger governments may actually be negatively related to the size of this sector (Friedman, Johnson, Kaufmann, and Zoido-Lobatón, 2000), others continue to find the more traditional relationship (Schneider, 2002)3. Various other factors have been found to be related to the NOE at the global level, including degrees of corruption, degrees of over-regulation, the lack of a credible legal system (Friedman, Johnson, Kaufmann, and Zoido-Lobatón, 2000), the size of the rural sector, and the degree of ethnic fragmentation (Lassen, 2003). One factor that has been little studied in this mix is income inequality. To the best of our knowledge, the first published papers dealing empirically with such a possible relationship focused on transition economies (Rosser, Rosser, and Ahmed, 2000, 2003)4. For a major set of the transition economies they found a strong and robust positive relationship between income inequality and the size of the non-observed economy. The first of these studies also found a positive relationship between changes in these two variables during the early transition period, although the second study only found the levels relationship still holding significantly after taking account of several other variables. The most important other significant variable appeared to be a measure of the degree of macroeconomic instability, specifically the maximum annual rate of inflation a country had experienced during the transition.
——————
3
However, in Schneider and Neck (1993) it is argued that the complexity of a tax code is more important than its level of tax rates. 4 Until late February, 2004, we believed we were also the first to posit the idea theoretically. However, we thank Lewis Davis (2004) for bringing to our attention the theoretical model of Rauch (1993) that hypothesizes such a relationship in development in conjunction with the Kuznets curve. During the middle stage of development, inequality increases as many poor move to the city and participate in the “underemployed informal economy”, a concept that follows the discussion of de Soto (1989), although this resembles more the “underground” economy as defined later in our paper here. Rauch does not provide empirical data and his theoretical model differs from the one we present here and involves a different mechanism than ours as well.
In this paper we seek to extend the hypothesis of a relationship between the degree of income inequality and the size of the non-observed economy to the global data set studied by Friedman et al. However, we also include macroeconomic variables that they did not include. Our main conclusion is that the finding of our earlier studies carries over to the global data set: income inequality and the size of the non-observed economy possess a strong, significant, and robust positive correlation. The other variable that consistently shows up as similarly related is a corruption index; indeed, this is the most statistically significant single variable although income inequality may be slightly more economically significant. However, inflation is not significantly correlated for the global data set, in contrast to our findings for the transition countries, and neither is per capita GDP. In contrast with Friedman et al. measures of regulatory burden and property rights enforcement are weakly negatively correlated with the size of the non-observed economy but not significantly so. However, these are strongly negatively correlated with corruption, so we expect that they are working through that variable. The finding of Friedman et al that taxation rates are negatively correlated with the size of the non-observed economy holds only insignificantly in our multiple regressions. In addition we have looked at which variables are correlated in multiple regressions with income inequality and with levels of corruption. In a general formulation the two variables that are significantly correlated with income inequality are a positive relation with the size of the non-observed economy and a negative relation with taxation rates. Regarding the corruption index, the variables significantly correlated with it are negative relations with property rights enforcement and lack of regulatory burden, and a positive relation with the size of the non-observed economy. Real per capita GDP is curiously positively related at the 10 percent level. In the next section of the paper, theoretical issues will be discussed. The following section will deal with definitional and data matters. Then empirical results will be presented. The final section will present concluding observations.
2. Labor Returns in the Non-Observed Economy

Whereas Friedman et al. focus upon decisions made by business leaders, we prefer to consider decisions made by workers regarding which sector of the economy they wish to supply labor to. This allows us to emphasize more clearly the social issues involved in the formation of the non-observed economy that tend to be left out in such discussions. Focusing on decisions by business leaders does not lead readily to reasons why income
distribution might enter into the matter, and it may be that the use of such an approach in much of the previous literature explains why previous researchers have managed to avoid the hypothesis that we find to be so compelling. To us, factors such as social capital and social cohesion seem to be strongly related to the degree of income inequality and thus need to be emphasized. Before proceeding further we need to clarify our use of terminology. As noted in footnote 1 above, most of the literature in this field has not distinguished between such terms as “informal, underground, illegal, shadow”, etc., in referring to economic activities not reported to governmental authorities (and thus not generally appearing in official national and income product accounts, although some governments make efforts to estimate some of these activities and include them). In Rosser, Rosser, and Ahmed (2000, 2003) we respectively used the terms “informal” and “unofficial” and argued that all of these labels meant the same thing. However we also recognized there that there were different kinds of such activities and that they had very different social, economic, and policy implications, with some clearly undesirable on any grounds and others at least potentially desirable from certain perspectives, e.g. businesses only able to operate in such a manner due to excessive regulation of the economy (Asea, 1996)5. In this paper we use the term, “non-observed economy” (NOE), introduced by the United Nations System of National Accounts (SNA) in 1993 (Calzaroni and Ronconi, 1999), which has become accepted in policy discussions within the OECD (Blades and Roberts, 2002) and other international institutions. Calzaroni and Ronconi report that the SNA further subdivides the NOE into three broad categories: illegal, underground, and informal. There are further subdivisions of these regarding whether their status is due to statistical errors, underreporting, or non-registration, although we will not discuss further these additional details. The illegal sector is that whose activities would be in and of themselves illegal, even if they were to be officially reported, e.g. murder, theft, bribery, etc. Some of what falls into the category of corruption fits into this category, but not all. By and large these activities are viewed as unequivocally undesirable on social, economic, and policy grounds. Underground activities are those that are not illegal per se, but which are not reported to the government in order to avoid taxes or regulations. Thus they become —————— 5
Another positive aspect of non-observed economic activity of any sort arises from multiplier effects on the rest of the economy that it can generate (Bhattacharya, 1999).
illegal, but only because of this non-reporting. Many of these may be desirable to some extent socially and economically, even if the non-reporting of them reduces tax revenues and may contribute to a more corrupt economic environment. Finally, informal activities are those that take place within households and do not involve market exchanges for money. Hence they would not enter into national income and product accounts by definition, even if they were to be reported. They are generally thought to occur more frequently in rural parts of less developed countries and to be largely beneficial socially and economically. Although the broader implications of these different types of non-observed economic activity vary considerably, they share the feature that no taxes are paid on them to the government.

Although it is not necessary in order to obtain positive relations between our main variables, income inequality, corruption, and the size of the NOE, it is useful to consider conditions under which multiple equilibria arise as discussed in Rosser et al. (2003). This draws on a considerable literature, much of it in sociology and political science, which emphasizes positive feedbacks and critical thresholds in systems involving social interactions. Schelling (1978) was among the first in economics to note such phenomena. Granovetter (1978) was among the first in sociology, with Crane (1991) discussing cases involving negative social conduct spreading rapidly after critical thresholds are crossed. Putnam (1993) suggested the possibility of multiple equilibria in his discussion of the contrast between northern and southern Italy in terms of social capital and economic performance. Although Putnam emphasizes participation in civic activities as key in measuring social capital, others focus more on measures of generalized trust, found to be strongly correlated with economic growth at the national level (Knack and Keefer, 1997; Zak and Knack, 2001). Given that Coleman (1990) defines social capital as the strength of linkages between people in a society, it can be related closely to lower transactions costs in economic activity and to broader social cohesion. Rosser et al. (2000, 2003) argue that the link between income inequality and the size of the NOE is a two-way causal relationship, with the main links running through breakdowns of social cohesion and social capital. Income inequality leads to a lack of these, which in turn leads to a greater tendency to wish to drop out of the observed economy due to social alienation. Zak and Feng (2003) find transitions to democracy easier with greater equality. Going the other way, the weaker government associated with a large NOE reduces redistributive mechanisms and tends to aggravate income inequality6.
Bringing corruption into this relation simply reinforces it in both directions. Although no one prior to Rosser et al. directly linked income inequality and the NOE, some did so indirectly. Thus, Knack and Keefer (1997) noted that income equality and social capital were both linked to economic growth and hence presumably to each other. Putnam (2000) shows that among the states in the United States social capital is positively linked with income equality but is negatively linked with crime rates.

The formal argument in Rosser et al. (2003) drew on a model of participation in mafia activity due to Minniti (1995). That model was in turn based on ideas of positive feedback in Polya urn models due to Arthur, Ermoliev, and Kaniovski (1987, see also Arthur, 1994). The basic idea is that the returns to labor from participating in NOE activity increase for a while as the relative size of the NOE increases and then decrease beyond some point. This can generate a critical threshold and with it two distinct stable equilibrium states, one with a small NOE sector and one with a large NOE sector. In the model of criminal activity, the argument is that law and order begins to break down and then substantially breaks down at a certain point, which coincides with a substantially greater social acceptability of criminal activity. However, eventually a saturation effect occurs and the criminals simply compete with each other, leading to decreasing returns. Given that two of the major forms of NOE activity are illegal for one reason or another, similar kinds of dynamics can be envisioned.

Let N be the labor force; N_noe the proportion of the labor force in the NOE sector; r_j the expected return to labor activity in the NOE sector minus that of working in the observed sector for individual j; and a_j the difference, due solely to personal characteristics, between individual j's returns to working in the NOE and those from working in the observed economy. We assume that this variable is uniformly distributed on the unit interval, j ∈ [0,1], with a_j increasing in j, ranging from a minimum at a_0 to a maximum at a_1. We assume that this difference in returns between the sectors follows a cubic function. With all parameters assumed positive, this gives the return to working in the NOE sector for individual j as

r_j = a_j + (−α N_noe^3 + β N_noe^2 + γ N_noe),    (1)
—————— 6
This effect is seen further from studies showing that tax paying is tied to general trust and social capital (Scholz and Lubell, 1998; Slemrod, 1998).
with the term in parentheses on the right-hand side equaling f(N_noe). Fig. 1 shows this for three individuals, each with a different personal propensity to work in the NOE sector.

Broader labor market equilibrium is obtained by considering the stochastic dynamics of the decision-making of potential new labor entrants. Let N′ = N + 1; q(noe) = probability that a new potential entrant will work in the NOE sector; 1 − q(noe) = probability that a new potential entrant will work in the observed sector; with λ_noe = 1 with probability q(noe) and λ_noe = 0 with probability 1 − q(noe). This implies that

q(noe) = [a_1 − f(N_noe)]/(a_1 − a_0).    (2)

Thus after the change in the labor force the NOE share of it will be

N′_noe = N_noe + (1/N)[q(noe) − N_noe] + (1/N)[λ_noe − q(noe)].    (3)
The third term on the right is the stochastic element and has an expected value of zero (Minniti, 1995, p. 40). If q(noe) > N_noe, then the expected value of N′_noe > N_noe. This implies the possibility of three equilibria, with the two outer ones stable and the intermediate one unstable. This situation is depicted in Fig. 2. Our argument can be summarized by positing that the location of the interval [a_0, a_1] rises with an increase in either the degree of income inequality or the level of corruption in the society. Such an effect will tend to increase the probability that an economy will be at the upper equilibrium rather than at the lower one, and even if it does not move from the lower to the upper equilibrium, it will move to a higher equilibrium value. In other words, we would expect that either more income inequality or more corruption will result in a larger share of the economy being in the non-observed portion.
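The dynamics in equations (1)-(3) are easy to simulate. The sketch below is a minimal illustration, not the authors' code: the cubic parameters α, β, γ and the interval endpoints a_0, a_1 are purely illustrative values chosen for the example, and each entrant's choice is drawn with probability q(noe) as in equation (2). Shifting the interval [a_0, a_1] upward, as more inequality or corruption is argued to do, raises the NOE share at which the process settles.

```python
# Minimal simulation sketch of equations (1)-(3); all parameter values are assumed.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, gamma = 12.0, 15.0, 0.5            # illustrative cubic parameters

def f(n_noe):
    # term in parentheses in equation (1)
    return -alpha * n_noe**3 + beta * n_noe**2 + gamma * n_noe

def simulate(a0, a1, n_noe=0.1, N=100, steps=20000):
    for _ in range(steps):
        q = np.clip((a1 - f(n_noe)) / (a1 - a0), 0.0, 1.0)   # equation (2)
        lam = 1.0 if rng.random() < q else 0.0                # entrant's draw
        n_noe += (q - n_noe) / N + (lam - q) / N              # equation (3)
        N += 1                                                # N' = N + 1
    return n_noe

print(simulate(a0=-4.0, a1=2.0))   # settles near a low NOE share
print(simulate(a0=-1.0, a1=5.0))   # the upward-shifted interval settles near a higher share
```

The second term of the update pulls the share toward q(noe), while the third term is the zero-mean stochastic element noted above; because the step size shrinks as the labor force grows, the simulated share settles down over time.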
3. Variable Definitions and Data Sources

In the empirical analysis in this paper we present results using eight variables: a measure of the share of the NOE sector in each economy, a Gini index measure of the degree of income inequality in each economy, an index of the degree of corruption in each economy, real per capita income in each economy, inflation rates in each economy, a measure of the tax burden in each economy, a measure of the enforcement of property rights, and
a measure of the degree of regulation in each economy7. This set of variables produced equations for all of our dependent variables with very high degrees of statistical significance based on the F-test, as can be seen in Tables 2-4 below. Let us note the problems with measuring each of these variables and provide the sources we have used in our estimates.
Fig. 1. Relative returns to working in the non-observed sector for three separate individuals (vertical axis) as a function of the percent of the economy in the non-observed sector (horizontal axis)
—————— 7
Other variables have been included in other tests, including unemployment rates, aggregate GDP, a fiscal burden measure, and a general economic freedom index.. However, neither of the first two was significant as they were not in other studies as well. Real per capita GDP presumably is a better measure than aggregate GDP, anyway. Regarding fiscal burden, this is the same as our tax burden measure except that it includes the level of government spending. Most literature supports the idea that the tax aspect is the more important part of this and our results would support this. Finally, the overall economic freedom index contains five subindexes, three of which we are already using individually. Also one index going into it is a measure of “black market activity”, which looks like another measure directly of non-observed economic activity, or at least an important portion of it. So this variable has too many direct correlations with other variables to be of use. However, results of these regressions are available on request from the authors.
Fig. 2. Probability that the average new labor force entrant works in the non-observed sector, q(noe) (vertical axis), as a function of the percent of the economy in the non-observed sector (horizontal axis)
Without question the hardest of these to measure is the relative share of an economy that is not observed. The essence of the problem is that one is trying to observe that which by and large people do not wish to have observed. Thus there is inherently substantial uncertainty regarding any method or estimate, and there is much variation across different methods of estimating. Schneider and Enste (2000) provide a discussion of the various methods that have been used. However, they argue that for developed market capitalistic economies the most reliable method is one based on using currency demand estimates. An estimate is made of the relationship between GDP and currency demand in a base period; then deviations from this model’s forecasts are measured. This method, due to Tanzi (1980), is widely used within many high income countries for measuring criminal activity in general. Schneider and Enste recommend the use of electricity consumption models for economies in transition, a method originated by Lizzera (1979). Kaufmann and Kaliberda (1996) and also Lackó (2000) have made such estimates for transition economies, with these providing the basis for the previous work by Rosser et al. (2003). Kaufmann and Kaliberda’s estimates are similar in method to the currency demand one except that a relationship is estimated between GDP and electricity use in a base period, with deviations later providing the estimated share of the NOE. Lackó’s
approach differs in that she models household electricity consumption relations rather than electricity usage at the aggregate level. Another approach is MIMIC, or multiple indicator multiple cause, due originally to Frey and Pommerehne (1984) and used by Loayza (1996) to make estimates for various Latin American economies. This method involves deriving the measure from a set of presumed underlying variables. Unfortunately this method is not usable if one is testing for relationships between any of the underlying variables and the size of the NOE. In effect it already presumes to know what the relationship is. One more method is to look at discrepancies in national income and product accounts data between GDP estimates and national income estimates. Schneider and Enste list several other methods that have been used. However, these four are the ones underlying the numbers we use in our estimates. Although we use some alternatives to some of their other variables, we use the measures of the NOE that Friedman et al. use. These in turn are taken from tables appearing in an early version of Schneider and Enste. They have 69 countries listed and for many countries they provide two different estimates. By and large, for OECD countries they use currency demand estimates, mostly due to Schneider (1997) or Williams and Windebank (1995) or Bartlett (1990), with averages of the estimates provided when more than one is available. For transition economies, electricity consumption models are used, mostly from Kaufmann and Kaliberda, with a few from Lackó. Electricity consumption models are also used for the more scattered estimates for Africa and Asia, with most of these estimates drawn from the work of Lackó as reported in Schneider and Enste. For Latin America most of the estimates come from Loayza (1996), who used the MIMIC method. However, for some countries, electricity consumption model numbers are available, due to Lackó and reported by Schneider and Enste. Finally, the national income and product accounts discrepancy approach was the source for one country, Croatia, also as reported in Schneider and Enste. For our study we have selected the estimate from those available based on the prior arguments regarding which would be expected to be most accurate. Most of these numbers are for the early to mid-1990s.

Although not as problematic as the NOE, income inequality is also somewhat difficult to measure, with various competing approaches. The Gini coefficient is the most widely available number across different countries, although it is not available for all years for most countries. Furthermore, there are different data sources underlying estimates of it, with the surveys in higher income countries generally reflecting income whereas in poorer countries they often reflect just consumption patterns. For most of the transition countries we use estimates constructed
by Rosser et al. (2000); however, for the other countries we use the numbers provided by the UN Human Development Report 2002, which are also for various years in the 1990s. Of the 69 countries studied in Friedman et al. there are three for which no Gini coefficient data are available, Argentina, Cyprus, and Hong Kong. Hence they are not included in our estimates. Our measure of corruption is an index used by Friedman et al. that comes from Transparency International (1998). We note that the scale used for this index is higher in value for less corrupt nations and ranges from one to ten. This is in contrast to our NOE and Gini coefficient numbers, which rise with more NOE and more inequality. Thus, a positive relation between corruption and either of those other two variables will show up as a negative relationship for our variables. Real per capita GDP numbers come from the UN Human Development Report 2002 and are for the year 2000. The inflation rate estimate is from the same source but is an average for the 1990-2000 period. Our measure of tax burden comes from the Heritage Foundation’s 2001 Index of Economic Freedom (O’Driscoll, Holmes, and Kirkpatrick, 2001). This combines an estimate based on the top marginal income tax rate, the marginal tax rate faced by the average citizen and the top corporate tax rate and ranges from one (low tax burden) to 5 (high tax burden). This number increases as the taxation burden increases. Our measure of property rights enforcement comes from O’Driscoll et al. and ranges from one (high property rights enforcement) to five (low property rights enforcement). The measure of regulatory burden is also from O’Driscoll et al. and ranges from one (low regulatory burden) to five (high regulatory burden). Obviously there is a considerable amount of subjectivity involved in many of these estimates.
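The currency demand approach described above lends itself to a compact illustration. The following sketch implements only the stylized deviation logic summarized in the text (fit a currency-GDP relationship over a base period, then treat later "excess" currency as a proxy for unrecorded transactions), not Tanzi's full specification, which also conditions on tax and interest rate variables; the variable names and the velocity shortcut are assumptions for the example.

```python
# Stylized currency-demand (deviation) estimate of the NOE share; illustrative only.
import numpy as np

def noe_share_from_currency(log_gdp, log_currency, base_end):
    """log_gdp, log_currency: time series arrays; base_end: last index of the base period."""
    # 1. Estimate log currency demand as a linear function of log GDP over the base period.
    X_base = np.column_stack([np.ones(base_end), log_gdp[:base_end]])
    coef, *_ = np.linalg.lstsq(X_base, log_currency[:base_end], rcond=None)

    # 2. Predict currency demand for the whole sample and measure the excess over it.
    X_all = np.column_stack([np.ones(len(log_gdp)), log_gdp])
    predicted = np.exp(X_all @ coef)
    excess = np.clip(np.exp(log_currency) - predicted, 0.0, None)

    # 3. Convert excess currency into unrecorded output with a crude common-velocity
    #    assumption and express it as a share of recorded GDP.
    velocity = np.exp(log_gdp) / predicted
    return excess * velocity / np.exp(log_gdp)
```

The electricity consumption variants described above work analogously, with electricity use in place of currency holdings, which is why the two families of estimates are treated as close substitutes for transition and developing economies.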
4. Empirical Estimates

As a preliminary to our OLS multiple regressions, we display the correlation matrix for these seven variables as Table 1. What comes out of the OLS regressions is generally foreshadowed in this matrix and is consistent with it, with a few exceptions. Essentially for each of the three variables that we study as a dependent variable, the independent variables that prove to be statistically significant in the OLS regressions also have a high absolute value in the correlation matrix with the dependent variable. The two exceptions are that lack of property rights enforcement and regulatory burden appear strongly correlated with the NOE, but are not statistically so in the multiple regression, while their relations with corruption are the highest
bivariate correlations in the matrix, foreshadowing that corruption probably carries the effect of property rights and regulatory burden in the relevant multiple regression. We note that in all these tables, the NOE measure is labeled “MODIFIEDSHARE”, the Gini Coefficient is labeled “GINIINDEX”, the corruption index is labeled “CORRUPTION”, the inflation rate is labeled “DEFLATOR”, real per capita GDP is labeled “REALGDPCA”, the taxation burden index is labeled “TAXATION”, the index of property rights enforcement is labeled “PROPERTYRI”, and the regulatory burden index is labeled “REGULATION”. Table 2 shows the results for the OLS regression in which the measure of the non-observed economy is the dependent variable and the other six variables are the independent ones. The most statistically significant independent variable with respect to NOE is the corruption index, significant at the 5 percent level. The expected positive relationship between these two (shown by a negative sign) holds. The other significant variable, only so at the 5 percent level but not at the 1 percent level, is the Gini coefficient. This confirms that the finding of Rosser et al for the transition economies carries over to the world economy. We note that although we do not report them here, the qualitative results seen here show up consistently in other formulations of possible regressions with these and some other variables in various combinations. Following the arguments of McCloskey and Ziliak (1996) we also observe that the size of the coefficients for these two statistically significant variables are large enough to be considered economically significant as well. Thus, the presumed ceteris paribus relations would be that a ten percent increase in the Gini coefficient would be associated with a six percent increase in the share of GDP in the non-observed economy, while a ten percent increase in the rate of corruption (change in index value of one point) would be associated with four percent increase in the share of GDP in the non-observed economy. These are noticeable relationships economically, although one must be careful about making such extrapolations as these. However, one finding of Rosser et al. (2003) does not appear to carry over to the global data set. This is the statistically significant relationship between inflation and the size of the NOE, which even carried over to the growth of the NOE as well. A possible explanation of this that seems reasonable is that during the period of observation, the transition economies experienced much higher inflation than most of the rest of the world, with Ukraine reaching a maximum annual rate of more than 10,000 percent. This high inflation was strongly related to the general process of institutional collapse and breakdown that happened in those countries.
Table 1. Correlation Matrix

                     (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
(1) GINIINDEX        1.000000   0.479591  -0.299891   0.066234  -0.139274  -0.344565  -0.562805   0.293423   0.039558
(2) MODIFIEDSHARE    0.479591   1.000000  -0.624869   0.107299  -0.223452  -0.409824  -0.325932   0.527356   0.413125
(3) CORRUPTION      -0.299891  -0.624869   1.000000  -0.456403   0.195630   0.526753   0.365308  -0.851864  -0.764468
(4) DEFLATOR         0.066234   0.107299  -0.456403   1.000000  -0.125176  -0.241908  -0.273240   0.533411   0.523695
(5) REALGNP         -0.139274  -0.223452   0.195630  -0.125176   1.000000   0.823649   0.140774  -0.186629  -0.148728
(6) REALGDPCAPITA   -0.344565  -0.409824   0.526753  -0.241908   0.823649   1.000000   0.231894  -0.458439  -0.325608
(7) TAXATION        -0.562805  -0.325932   0.365308  -0.273240   0.140774   0.231894   1.000000  -0.366470  -0.106925
(8) PROPERTYRIGHTS   0.293423   0.527356  -0.851864   0.533411  -0.186629  -0.458439  -0.366470   1.000000   0.777601
(9) REGULATION       0.039558   0.413125  -0.764468   0.523695  -0.148728  -0.325608  -0.106925   0.777601   1.000000
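Assuming the underlying country data were loaded into a pandas DataFrame `df` with the column names used above (the data assembly itself is not shown here), a matrix like Table 1 could be reproduced directly:

```python
# Pairwise correlations over the available country observations (illustrative).
cols = ["GINIINDEX", "MODIFIEDSHARE", "CORRUPTION", "DEFLATOR", "REALGNP",
        "REALGDPCAPITA", "TAXATION", "PROPERTYRIGHTS", "REGULATION"]
print(df[cols].corr().round(6))
```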
One finding of Friedman et al. is not confirmed by our results: their finding that taxation burden is significantly negatively correlated with the size of the NOE. Our correlation matrix does show a negative bivariate correlation of -.287, but in this regression this becomes a weakly positive and statistically insignificant relation. There is an obvious possible explanation for the contrast between our finding and that of Friedman et al. As we shall see, there is a strong negative relation between taxation burden and income inequality. There is a negative bivariate correlation between taxation burden and the non-observed economy (see Table 1), but in the multiple regression it appears that it is dominated by the negative relation between taxation and income inequality. It would appear that the more important factor here is income inequality, and when a measure of it appears in an equation, the statistical significance (and even the sign) of taxation disappears. Thus, the fact that Friedman et al. left out income distribution in their various estimates appears to have profoundly distorted their findings. On the other hand, our results do not provide any support for the more traditional view that tax burden is a major factor in the growth of the non-observed economy, either. It is simply not statistically significant in either direction, particularly in a more fully specified model.

Table 3 shows the OLS regression results for the same set of variables but with the Gini coefficient as the dependent variable. The finding of Rosser et al. (2000, 2003) for the transition economies, that the size of the NOE is statistically significantly related to income inequality when the NOE enters as an independent variable, is confirmed for the global data set as well, although only at the 5 percent level, not at the 1 percent level.
Even more statistically significant, holding strongly at the 1 percent level, is tax burden, which is negatively correlated. It would appear that these tax burdens result in noticeable income redistribution, or if they do not, then nations with more equal income distributions are more willing to tolerate higher tax rates. However, just as in Table 2, the inflation measure also does not show up as statistically significant, although it was not significant in the equation for the Gini coefficient in Rosser et al. (2003). The one other variable that was significant in their study (negatively so), an index of Democratic Rights, is not included in this study, as it was specifically measured for the transition countries. None of the other variables are significant, and do not show up as being so in alternate formulations.

Table 2
Dependent Variable: MODIFIEDSHARE
Method: Least Squares
Date: 02/17/04   Time: 12:38
Sample (adjusted): 3 67
Included observations: 52; Excluded observations: 13 after adjusting endpoints

Variable            Coefficient    Std. Error    t-Statistic    Prob.
C                     20.40339      27.06869       0.753763     0.4550
GINIINDEX              0.649755      0.277039      2.345361     0.0236
CORRUPTION            -4.126899      1.749074     -2.359477     0.0228
REALGDPCAPITA         -6.71E-05      0.000210     -0.320209     0.7503
DEFLATOR              -0.021824      0.013136     -1.661379     0.1037
TAXATION               0.185368      3.271516      0.056661     0.9551
PROPERTYRIGHTS         0.201990      3.791687      0.053272     0.9578
REGULATION             2.099252      4.531270      0.463281     0.6454

R-squared              0.523851     Mean dependent var       26.68846
Adjusted R-squared     0.448100     S.D. dependent var       18.82131
S.E. of regression    13.98235      Akaike info criterion     8.254107
Sum squared resid   8602.267        Schwarz criterion         8.554298
Log likelihood      -206.6068       F-statistic               6.915435
Durbin-Watson stat     1.708203     Prob(F-statistic)         0.000015
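For readers who wish to see the mechanics, a regression of this form could be estimated as in the sketch below, assuming a pandas DataFrame `df` with one row per country and the column names used in the tables; the data assembly and the handling of the excluded observations are not shown, and this is an illustrative reconstruction rather than the authors' original code.

```python
import statsmodels.api as sm

regressors = ["GINIINDEX", "CORRUPTION", "REALGDPCAPITA",
              "DEFLATOR", "TAXATION", "PROPERTYRIGHTS", "REGULATION"]

X = sm.add_constant(df[regressors])                        # adds the constant term C
ols = sm.OLS(df["MODIFIEDSHARE"], X, missing="drop").fit()
print(ols.summary())   # coefficients, standard errors, t-statistics, R-squared, F-statistic
```

The estimates reported in Tables 3 and 4 follow the same pattern, with GINIINDEX or CORRUPTION moved to the dependent-variable position and MODIFIEDSHARE added to the regressor list.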
Regarding economic significance, the relation from the NOE to income inequality appears to be somewhat weaker than going the other way. Thus, a ten percent increase in the share of the non-observed economy in GDP would only be associated with about a two percent increase in the Gini coefficient. The taxation burden appears to be economically significant, with a twenty percent increase in tax burden leading to a forty percent decline in the Gini coefficient, which probably must be considered as holding only locally within a range.
Table 3
Dependent Variable: GINIINDEX
Method: Least Squares
Date: 02/17/04   Time: 12:37
Sample (adjusted): 3 67
Included observations: 52; Excluded observations: 13 after adjusting endpoints

Variable            Coefficient    Std. Error    t-Statistic    Prob.
C                     51.42799      11.62930       4.422278     0.0001
MODIFIEDSHARE          0.171024      0.072920      2.345361     0.0236
CORRUPTION             0.341917      0.951033      0.359522     0.7209
REALGDPCAPITA         -0.000131      0.000106     -1.242197     0.2207
DEFLATOR              -0.002270      0.006939     -0.327189     0.7451
TAXATION              -4.812388      1.513601     -3.179430     0.0027
PROPERTYRIGHTS         1.635447      1.929675      0.847525     0.4013
REGULATION            -3.045089      2.284738     -1.332796     0.1895

R-squared              0.471365     Mean dependent var       35.13846
Adjusted R-squared     0.387263     S.D. dependent var        9.164256
S.E. of regression     7.173550     Akaike info criterion     6.919317
Sum squared resid   2264.232        Schwarz criterion         7.219508
Log likelihood      -171.9022       F-statistic               5.604737
Durbin-Watson stat     1.694839     Prob(F-statistic)         0.000116
Finally, Table 4 shows the OLS regression for this same set of variables but with the corruption index as the dependent variable, again keeping in mind that a lower value of this index indicates more corruption. One variable is statistically significant at the 1 percent level: property rights enforcement, which is negatively correlated with corruption. Two variables are statistically significant at the 5 percent level: the size of the NOE, which is positively correlated with corruption (a negative sign in this regression), and the regulatory burden, which is also positively related to corruption. These results agree with the findings of Friedman et al. The role of the tax burden here would seem to fit better with the Friedman et al. view, with the tax variable operating through corruption, than with the more traditional view that higher tax rates directly bring about illicit activities of various sorts.
Table 4
Dependent Variable: CORRUPTION
Method: Least Squares
Date: 02/17/04   Time: 12:39
Sample (adjusted): 3 67
Included observations: 52; Excluded observations: 13 after adjusting endpoints

Variable            Coefficient    Std. Error    t-Statistic    Prob.
C                      9.601494      1.673072      5.738840     0.0000
GINIINDEX              0.008566      0.023827      0.359522     0.7209
MODIFIEDSHARE         -0.027215      0.011534     -2.359477     0.0228
REALGDPCAPITA          2.95E-05      1.64E-05      1.794987     0.0795
DEFLATOR              -8.57E-07      0.001100     -0.000779     0.9994
TAXATION               0.231810      0.263372      0.880160     0.3836
PROPERTYRIGHTS        -1.005120      0.268059     -3.749621     0.0005
REGULATION            -0.778700      0.349690     -2.226834     0.0311

R-squared              0.813109     Mean dependent var        5.303846
Adjusted R-squared     0.783376     S.D. dependent var        2.439621
S.E. of regression     1.135469     Akaike info criterion     3.232606
Sum squared resid     56.72872      Schwarz criterion         3.532798
Log likelihood       -76.04776      F-statistic              27.34736
Durbin-Watson stat     2.641767     Prob(F-statistic)         0.000000
Regarding economic significance, the tax and regulatory burden variables appear to have large coefficients, with both implying approximately that a twenty percent change in their levels would be associated with a ten percent change in the corruption level, although given that we are dealing with artificially constructed indexes on both sides of this equation, this interpretation must be taken with caution. The link from the NOE to corruption seems to be somewhat weaker, however, with it taking a fifty percent increase in NOE share to be associated with a ten percent change in corruption, a change of one point in the corruption index. Interestingly, income inequality does not seem to be significantly correlated with the level of corruption, although it could arguably have been expected to be correlated, based on our theoretical argument. Of course, corruption was not a significant independent variable in the estimate with the Gini coefficient as the dependent variable, either.
5. Summary and Conclusions

We have tested the relationships between the non-observed economy, income inequality, the level of corruption, real per capita income, inflation, tax burden, property rights enforcement, and regulatory burden on a set of 66 countries from all the regions of the world, although with somewhat fewer from the poorest regions of Africa and Asia due to data availability problems. The finding of Rosser, Rosser, and Ahmed (2000, 2003) that there appears to be a significant two-way relationship between the size of the non-observed economy (or informal or unofficial economy) and income inequality is confirmed when the data set is expanded to include nations representing a more fully global sample. The finding of Friedman, Johnson, Kaufmann, and Zoido-Lobatón (2000) that there is a strong relationship between the size of the non-observed economy and the level of corruption in an economy is confirmed, and appears also to be a significant two-way relationship, although somewhat stronger in going from corruption to the non-observed economy than the other way. On the other hand, at least one relationship found in each of these earlier studies is not confirmed in this study. Whereas Rosser, Rosser, and Ahmed (2003) found the maximum annual rate of inflation to be important in the size and, even more, the growth of the non-observed economy for the transition economies, this did not hold for the global data set using a decade average of inflation rates, and real per capita GDP was not statistically significant, either. It may be that using the maximum annual rate of inflation, as in the earlier study, would show something; or it may be that the transition economies are peculiar, with their exceptionally high rates of inflation in the 1990s being associated with much greater economic and social dislocations than were occurring in most other nations at the time, and this may account for the difference. The finding not confirmed from the Friedman, Johnson, Kaufmann, and Zoido-Lobatón study is that of a negative relationship between higher taxes and the size of the non-observed economy. Our results find no statistically significant relationship, which puts us in between this view and the alternative, more traditional, view that argues that higher taxes drive people into the non-observed economy. We hypothesize that the failure of Friedman et al. to include any measures of income inequality, which is strongly negatively correlated with our measure of tax burden, explains the contrast between our findings and theirs. Also, their finding that the non-observed economy increases with lack of property rights enforcement and with regulatory burdens is not directly reproduced here. However, we find strong relations between these and corruption, which is strongly linked with the non-observed economy,
suggesting perhaps that this is the pathway through which these variables have their effect.

Let us conclude with two related caveats. The first is to remind the reader that there are tremendous problems and uncertainties regarding much of the data used in this study, especially the estimates of the size of the non-observed economy. There are competing series for some of the other variables as well, especially for those that are indexes estimated by one organization or another. This leads to our second caveat, that caution should be exercised in making policy recommendations based on these findings. Nevertheless, our results do reinforce the warning delivered in Rosser and Rosser (2001): international organizations that are concerned about the negative impacts on revenue collection in various countries of having large non-observed sectors should be cautious about recommending policies that will lead to substantial increases in income inequality.
References

Arthur, W. B. (1994): Increasing returns and path dependence in the economy. Ann Arbor: University of Michigan Press.
Arthur, W. B., Ermoliev, Y. M., and Kaniovski, Y. M. (1987): Path-dependent processes and the emergence of macro-structure. European Journal of Operational Research 30: 294-303.
Asea, P. K. (1996): The informal sector: Baby or bath water? A comment. Carnegie-Rochester Conference Series on Public Policy 44: 163-71.
Bartlett, B. (1990): The underground economy: Achilles heel of the state? Economic Affairs 10: 24-27.
Bhattacharya, D. K. (1999): On the economic rationale of estimating the hidden economy. The Economic Journal 109: 348-59.
Blades, D., and Roberts, D. (2002): Measuring the non-observed economy. OECD Statistics Brief (5): 1-8.
Calzaroni, M., and Ronconi, S. (1999): Introduction to the non-observed economy: The conceptual framework and main methods of estimation. Chisinau Workshop Document NOE/02.
Coleman, J. (1990): Foundations of social theory. Cambridge, MA: Belknap Press of Harvard University.
Crane, J. (1991): The epidemic theory of ghettos and neighborhood effects on dropping out and teenage childbearing. American Journal of Sociology 96: 1226-59.
Davis, L. (2004): Explaining the evidence on inequality and growth: Marginalization and redistribution. Mimeo, Department of Economics, Smith College.
de Soto, H. (1989): The other path: The invisible revolution in the third world. New York: Harper and Row.
Feige, E. L. (1979): How big is the irregular economy? Challenge 22(1): 5-13.
Frey, B. S., and Pommerehne, W. (1984): The hidden economy: State and prospect for measurement. Review of Income and Wealth 30: 1-23.
Friedman, E., Johnson, S., Kaufmann, D., and Zoido-Lobatón, P. (2000): Dodging the grabbing hand: The determinants of unofficial activity in 69 countries. Journal of Public Economics 76: 459-93.
Granovetter, M. (1978): Threshold models of collective behavior. American Journal of Sociology 83: 1420-43.
Guttman, P. S. (1977): The subterranean economy. Financial Analysts Journal 34(6): 24-25, 34.
Johnson, S., Kaufmann, D., and Shleifer, A. (1997): The unofficial economy in transition. Brookings Papers on Economic Activity (2): 159-221.
Kaufmann, D., and Kaliberda, A. (1996): Integrating the unofficial economy into the dynamics of post-socialist economies. In Economic transition in the newly independent states, edited by B. Kaminsky. Armonk, NY: M.E. Sharpe.
Knack, S., and Keefer, P. (1997): Does social capital have an economic payoff? A cross-country investigation. Quarterly Journal of Economics 112: 1251-88.
Lackó, M. (2000): Hidden economy – an unknown quantity? Comparative analysis of hidden economies in transition countries, 1989-95. Economics of Transition 8: 117-49.
Lassen, D. D. (2003): Ethnic division, trust, and the size of the informal sector. Journal of Economic Behavior and Organization, forthcoming.
Lizzera, C. (1979): Mezzogiorno in controluce [Southern Italy in eclipse]. Naples: Enel.
Loayza, N. V. (1996): The economics of the informal sector: A simple model and some empirical evidence from Latin America. Carnegie-Rochester Conference Series on Public Policy 45: 129-62.
McCloskey, D. N., and Ziliak, S. T. (1996): The standard error of regressions. Journal of Economic Literature 34: 97-114.
Minniti, M. (1995): Membership has its privileges: Old and new mafia organizations. Comparative Economic Studies 37: 31-47.
O'Driscoll, G. P., Jr., Holmes, K. R., and Kirkpatrick, M. (2001): 2002 index of economic freedom. Washington and New York: The Heritage Foundation and the Wall Street Journal.
Putnam, R. D. (1993): Making democracy work: Civic traditions in Italy. Princeton, NJ: Princeton University Press.
Putnam, R. D. (2000): Bowling alone: The collapse and revival of American community. New York: Simon & Schuster.
Rauch, J. E. (1993): Economic development, urban underemployment, and income inequality. Canadian Journal of Economics 26: 901-918.
Rosser, J. B., Jr., and Rosser, M. V. (2001): Another failure of the Washington consensus on transition countries: Inequality and underground economies. Challenge 44(2): 39-50.
Rosser, J. B., Jr., Rosser, M. V., and Ahmed, E. (2000): Income inequality and the informal economy in transition economies. Journal of Comparative Economics 28: 156-71.
Rosser, J. B., Jr., Rosser, M. V., and Ahmed, E. (2003): Multiple unofficial economy equilibria and income distribution dynamics in systemic transition. Journal of Post Keynesian Economics 25: 425-47.
Schelling, T. C. (1978): Micromotives and macrobehavior. New York: W.W. Norton.
Schneider, F. (1997): Empirical results for the size of the shadow economy of western European countries over time. Institut für Volkswirtschaftslehre Working Paper No. 9710, Johannes Kepler Universität Linz.
Schneider, F. (2002): The size and development of the shadow economies of 22 transitional and 21 OECD countries during the 1990s. IZA Discussion Paper No. 514, Bonn.
Schneider, F., and Enste, D. H. (2000): Shadow economies: Size, causes, and consequences. Journal of Economic Literature 38: 77-114.
Schneider, F., and Enste, D. H. (2002): The shadow economy: An international survey. Cambridge, UK: Cambridge University Press.
Schneider, F., and Neck, R. (1993): The development of the shadow economy under changing tax systems and structures. Finanzarchiv N.F. 50: 344-69.
Scholz, J. T., and Lubell, M. (1998): Trust and taxpaying: Testing the heuristic approach. American Journal of Political Science 42: 398-417.
Slemrod, J. (1998): On voluntary compliance, voluntary taxes, and social capital. National Tax Journal 51: 485-91.
Tanzi, V. (1980): The underground economy in the United States: Estimates and implications. Banca Nazionale del Lavoro Quarterly Review 135: 427-53.
Transparency International (1998): Corruption perceptions index. Transparency International.
United Nations Development Program (2002): Human development report 2002: Deepening democracy in a fragmented world. New York: Oxford University Press.
Williams, C. C., and Windebank, J. (1995): Black market work in the European Community: Peripheral work for peripheral localities? International Journal of Urban and Regional Research 19: 23-39.
Zak, P. J., and Feng, Y. (2003): A dynamic theory of the transition to democracy. Journal of Economic Behavior and Organization 52: 1-25.
Zak, P. J., and Knack, S. (2001): Trust and growth. The Economic Journal 111: 295-321.
____________________________________________________________
Forecasting Inflation with Forecast Combinations: Using Neural Networks in Policy

P. McNelis and P. McAdam
1. Introduction1

Forecasting is a key activity for policy makers. Given the possible complexity of the processes underlying policy targets, such as inflation, output gaps, or employment, and the difficulty of forecasting in real time, recourse is often taken to simple models. A dominant feature of such models is their linearity. However, recent evidence suggests that simple, though non-linear, models may be at least as competitive as linear ones for forecasting macro variables. Our paper contributes to this important debate in a number of respects. We follow Stock and Watson (1999, 2001) and concentrate on Phillips curves for forecasting inflation. However, we do so using linear and encompassing non-linear approaches. We further use a transparent comparison methodology. To avoid “model-mining”, our approach first identifies the best performing linear model and then compares that against a trimmed-mean forecast of simple non-linear models, which Granger and Jeon (2004) call a “thick model”. We examine the robustness of our inflation forecasting results by using different indices and sub-indices as well
——————
1
This paper was prepared for the NEW2004 conference in Salerno. We thank, without implicating, Blake LeBaron, Clive Granger, Gonzalo Camba-Méndez, Roberto Mariano, Ricardo Mestre, Jim Stock, and conference participants for helpful comments and suggestions. The opinions expressed are not necessarily those of the ECB.
as conducting several types of out-of-sample comparisons using a variety of metrics. Specifically, using the Phillips-curve framework, this paper applies linear models and forecast combinations of neural networks (NN) to forecast monthly inflation rates. The appeal of the NN is that it efficiently approximates a wide class of non-linear relations. Our goal is to see how well this approach performs relative to the standard linear one, for forecasting with “real-time” and randomly-generated “split sample” or “bootstrap” methods. In the “real-time” approach, the coefficients are updated period-by-period in a rolling window, to generate a sequence of one-period-ahead predictions. Since policy makers are usually interested in predicting inflation at twelve-month horizons, we estimate competing models for this horizon, with the bootstrap and real-time forecasting approaches. It turns out that the forecast model – or, as it has recently been called, the “thick model” – based on trimmed-mean forecasts of several NN models dominates, in many cases, the linear model for out-of-sample forecasting with the bootstrap and the “real-time” method. Our “thick model” approach to neural network forecasting follows recent reviews of neural network forecasting methods by Zhang et al. (1998). They acknowledge that the proper specification of the structure of a neural network is a “complicated one” and note that there is no theoretical basis for selecting one specification or another for a neural network (Zhang et al., 1998, p. 44). We acknowledge this model uncertainty and consequently make use of the “thick model” as a sensible way to utilize alternative neural network specifications and “training methods” in a “learning” context.

The paper proceeds as follows. Section 2 lays out the basic model. Section 3 discusses key properties of the data and the methodological background. Section 4 presents the empirical results for the US case for the in-sample analysis, as well as for the twelve-month split-sample forecasts, and examines the “real-time” forecasting properties2. Section 5 concludes.
2. The Phillips Curve

We begin with the following forecasting model for inflation:

\pi^h_{t+h} - \pi_t = f(\Delta u_t, \ldots, \Delta u_{t-k}, \Delta \pi_t, \ldots, \Delta \pi_{t-m}) + e_{t+h}    (1)

\pi^h_{t+h} = \frac{1200}{h} \ln\left(\frac{P_t}{P_{t-h}}\right)    (2)

where π^h_{t+h} is the percentage rate of inflation for the price level P at an annualized value (at horizon t+h), u is the unemployment rate, and e_{t+h} is a random disturbance term, while k and m represent the lag lengths for unemployment and inflation. We estimate the model for h = 12. Given the discussion on the appropriate measure of inflation for monetary policy (e.g., Mankiw and Reis, 2003), we forecast using both the Consumer Price Index (CPI) and the Producer Price Index (PPI), as well as indices for food, energy, and services. We use monthly, seasonally adjusted data on US inflation and unemployment. The data come from the Federal Reserve Bank of St. Louis FRED database.

——————
² Following Stock and Watson (1999), the US is the normal benchmark in such academic (i.e., not specifically policy oriented) forecasting papers.
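As a concrete illustration of equations (1)–(2), the sketch below builds the forecast target and the lagged-difference regressors in Python. It is a minimal sketch only: the Stock–Watson-style timing convention, the series names, and the default lag choices k and m are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np
import pandas as pd

def annualized_inflation(p: pd.Series, h: int) -> pd.Series:
    """h-period inflation in annualized percent: (1200/h) * ln(P_t / P_{t-h})."""
    return (1200.0 / h) * np.log(p / p.shift(h))

def build_phillips_dataset(p: pd.Series, u: pd.Series, h: int = 12,
                           k: int = 1, m: int = 11):
    """Target pi^h_{t+h} - pi_t and the lagged-difference regressors of eq. (1)."""
    pi_h = annualized_inflation(p, h)      # h-period (e.g. twelve-month) inflation
    pi_1 = annualized_inflation(p, 1)      # one-month annualized inflation
    target = pi_h.shift(-h) - pi_1         # known only at t+h, aligned at date t
    d_u, d_pi = u.diff(), pi_1.diff()
    cols = {f"d_u_lag{j}": d_u.shift(j) for j in range(k + 1)}
    cols.update({f"d_pi_lag{j}": d_pi.shift(j) for j in range(m + 1)})
    frame = pd.DataFrame(cols)
    frame["target"] = target
    frame = frame.dropna()
    return frame["target"], frame.drop(columns="target")
```

With monthly price and unemployment series loaded from FRED (hypothetical names), a call such as y, X = build_phillips_dataset(cpi, unrate) would deliver the twelve-month-ahead target and its regressors.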
3. Non-linear Inflation Processes

Should the inflation/unemployment relation, or the inflation/economic-activity relation, be linear? Fig. 1 pictures the inflation-unemployment relation in the USA. Fig. 1 shows that inflation-unemployment observations cluster around both low and high rates. The eyeball case for non-linearities in the inflation-unemployment nexus is therefore straightforward to appreciate. We know, for example, that business cycles are typically highly non-linear – contractions differ from expansions in terms of their relative durations and strength or deepness (see, among others, Sichel, 1993) – and, given the so-called “Okun’s Law” mapping developments in output with unemployment, it would not be surprising to witness non-linearities in unemployment either. Indeed, without wishing to review the literature at length, we remark that many authors have regarded the unemployment-inflation nexus (and in particular the cyclical experience of unemployment) as inherently non-linear, given business-cycle asymmetries (as mentioned above) as well as such things as relative labor-market rigidities, asymmetric adjustment, differences in hiring and firing costs across different sectors in the economy and over the cycle, labor hoarding, hysteresis (or path dependency) in unemployment, etc. (e.g., Blanchard and Wolfers, 2000; Ljunqvist and Sargent, 2001; León-Ledesma and McAdam, 2004).
Fig. 1. Phillips curves: 1988-2001
3.1 Neural Networks Specifications

In this paper, we make use of a hybrid alternative formulation of the NN methodology: the basic multi-layer perceptron or feed-forward network, coupled with a linear jump connection or a linear neuron activation function. Following McAdam and Hughes-Hallett (1999), an encompassing NN can be written as:

n_{k,t} = \omega_0 + \sum_{i=1}^{I} \omega_i x_{t,i} + \sum_{j=1}^{J} \phi_j N_{t-1,j}    (3)

N_{k,t} = h(n_{k,t})    (4)

y_{i,t} = \gamma_{i,0} + \sum_{k=1}^{K} \gamma_{i,k} N_{k,t} + \sum_{i=1}^{I} \beta_i x_{i,t}    (5)
where the inputs (x) represent the current and lagged values of inflation and unemployment and the outputs (y) are their forecasts, and where the I regressors are combined linearly to form K neurons, which are transformed or “encoded” by the “squashing” function. The K neurons, in turn, are combined linearly to produce the “output” forecast³. Within this system, (3)–(5), we can identify representative forms. Simple (or standard) Feed-Forward, \phi_j = \beta_i = 0, \forall i, j, links inputs (x) to outputs (y) via the hidden layer. Processing is thus parallel (as well as sequential); in equation (5) we have both a linear combination of the inputs and a limited-domain mapping of these through a “squashing” function, h, in equation (4). Common choices for h include the log-sigmoid form (Fig. 2),

N_{k,t} = h(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

which transforms data to within a unit interval:

h : \mathbb{R} \to [0,1], \quad \lim_{n \to \infty} h(n) \to 1, \quad \lim_{n \to -\infty} h(n) \to 0
Other, more sophisticated, choices of the squashing function are considered in section 3.3. The attractive feature of such functions is that they represent threshold behavior of the type previously discussed. For instance, they model representative non-linearities (e.g., a Keynesian liquidity trap where “low” interest rates fail to stimulate the economy or “labor-hoarding” where economic downturns have a less than proportional effect on layoffs, etc.). Further, they exemplify agent learning – at extremes of non-linearity, movements of economic variables (e.g., interest rates, asset prices) will generate a less than proportionate response to other variables. However, if this movement continues, agents learn about their environment and start reacting more proportionately to such changes.
——————
³ Stock (1999) points out that the LSTAR (logistic smooth transition autoregressive) method is a special case of NN estimation. In this case, y_{t+h} = \alpha(L) y_t + d_t \beta(L) y_t + u_{t+h}, the switching variable d_t is a log-sigmoid function of past data, and determines the “threshold” at which the series switches.
Fig. 2. The log-sigmoid squashing function
We might also have Jump Connections, \beta_i \neq 0, \forall i, \phi_j = 0, \forall j: direct links from the inputs, x, to the outputs. An appealing advantage of such a network is that it nests the pure linear model as well as the feed-forward NN. If the underlying relationship between the inputs and the output is a pure linear one, then only the direct jump connectors, given by \{\beta_i\}, i = 1, \ldots, I, should be significant. However, if the true relationship is a complex non-linear one, then one would expect \{\omega\} and \{\gamma\} to be highly significant, while the coefficient set \{\beta\} would be expected to be relatively insignificant. Finally, if the underlying relationship between the input variables \{x\} and the output variable \{y\} can be decomposed into linear and non-linear components, then we would expect all three sets of coefficients, \{\beta, \omega, \gamma\}, to be significant. A practical use of the jump connection network is that it is a useful test for neglected non-linearity in a relationship between the input variables x and the output variable y⁴. In this study, we examine networks with varying specifications for the number of neurons in the hidden layer, with and without jump connections. The lag lengths for inflation and unemployment changes are selected on the basis of in-sample information criteria.
——————
⁴ For completeness, a final case in this encompassing framework is Recurrent networks (Elman, 1988), \phi_j \neq 0, \forall j, \beta_i = 0, \forall i, in which lagged values of the neurons enter the system as “memory”. However, this less popular network is not used in this exercise. For an overview of NNs, see White (1992).
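To make the encompassing form concrete, the following minimal NumPy sketch evaluates equations (3)–(5) for the feed-forward and jump-connection cases used in this study (the recurrent \phi-terms are omitted). All names, shapes, and the random weights below are purely illustrative.

```python
import numpy as np

def logsigmoid(n):
    """Squashing function h(n) = 1 / (1 + exp(-n)), mapping R into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n))

def nn_forecast(x, omega0, omega, gamma0, gamma, beta=None):
    """
    x      : (T, I) matrix of scaled inputs (lagged changes in inflation/unemployment)
    omega0 : (K,)  neuron intercepts;  omega : (K, I) input-to-neuron weights
    gamma0 : scalar output intercept;  gamma : (K,)  neuron-to-output weights
    beta   : (I,) direct input-to-output "jump connection" weights, or None
    """
    n = omega0 + x @ omega.T          # eq. (3), without the recurrent phi-term
    N = logsigmoid(n)                 # eq. (4): K "encoded" neurons
    y = gamma0 + N @ gamma            # eq. (5): linear combination of the neurons
    if beta is not None:              # jump connections nest the pure linear model
        y = y + x @ beta
    return y

# Example with four inputs and two hidden neurons:
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 4))
yhat = nn_forecast(x, rng.standard_normal(2), rng.standard_normal((2, 4)),
                   0.0, rng.standard_normal(2), beta=rng.standard_normal(4))
```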
3.2 Neural Network Estimation and Thick Models

The parameter vectors of the network, \{\omega\}, \{\gamma\}, \{\beta\}, may be estimated with non-linear least squares. However, given the solution method’s possible convergence to local minima or saddle points (e.g., see the discussion in Stock, 1999), we follow the hybrid approach of Quagliarella and Vicini (1998): we run the genetic algorithm for a reasonably large number of generations, one hundred, and then use the final weight vector [\{\hat{\omega}\}, \{\hat{\gamma}\}, \{\hat{\beta}\}] as the initialization vector for a gradient-descent minimization based on the quasi-Newton method. In particular, we use the algorithm advocated by Sims (2003). The genetic algorithm proceeds in the following steps: (1) create an initial population of coefficient vectors as candidate solutions for the model; (2) select two different candidates from the population by a fitness criterion (minimum sum of squared errors); (3) cross over the two selected candidates to create two offspring; (4) mutate the offspring; (5) hold a “tournament”, in which the parents and offspring compete to pass to the next generation on the basis of the fitness criterion. This process is repeated until the population of the next generation is equal in size to the population of the first, and it stops after “convergence” takes place with the passing of one hundred generations or more. A description of this algorithm can be found in McNelis (2005)⁵.

Quagliarella and Vicini (1998) point out that hybridization may lead to better solutions than those obtainable using the two methods individually. They argue that it is not necessary to carry out the gradient-descent optimization until convergence if one is going to repeat the process several times. The utility of the gradient-descent algorithm is its ability to improve the individuals it treats, so its beneficial effects can be obtained by performing just a few iterations each time.

Notably, following Granger and Jeon (2004), we make use of a “thick modeling” strategy: combining the forecasts of several NNs, based on different numbers of neurons in the hidden layer and different network architectures (feedforward and jump connections), to compete against the forecast of the linear model. The combination forecast is the “trimmed mean” forecast at each period, coming from an ensemble of networks, usually the same network estimated several times with different starting values for the parameter sets in the genetic algorithm, or slightly different networks. We numerically rank the predictions of the forecasting models, then remove the 100·α% largest and smallest cases, leaving the remaining 100·(1−2α)% to be averaged. In our case, we set α at 5%. Such an approach is similar to forecast combinations. The trimmed mean, however, is fundamentally more practical since it bypasses the complication of finding the optimal combination (weights) of the various forecasts.

——————
⁵ See also Duffy and McNelis (2001) for an example of the genetic algorithm with real, as opposed to binary, encoding.
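The sketch below illustrates the two pieces just described – a simple real-coded genetic algorithm following steps (1)–(5), and the trimmed-mean combination of an ensemble of forecasts. It is a hedged illustration only: the operators, population size, and tuning constants are assumptions chosen for exposition, not the authors' code (which the text notes is available on request).

```python
import numpy as np

def genetic_minimize(sse, dim, pop_size=40, generations=100, sigma=0.1, seed=0):
    """Steps (1)-(5): population, fitness selection, crossover, mutation, tournament."""
    rng = np.random.default_rng(seed)
    pop = rng.standard_normal((pop_size, dim))                   # (1) initial population
    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            i, j = rng.choice(pop_size, size=2, replace=False)   # (2) pick candidates,
            p1, p2 = sorted((pop[i], pop[j]), key=sse)           #     order by fitness
            cut = int(rng.integers(1, dim)) if dim > 1 else 0    # (3) one-point crossover
            c1 = np.concatenate([p1[:cut], p2[cut:]])
            c2 = np.concatenate([p2[:cut], p1[cut:]])
            c1 = c1 + sigma * rng.standard_normal(dim)           # (4) mutate offspring
            c2 = c2 + sigma * rng.standard_normal(dim)
            family = sorted([p1, p2, c1, c2], key=sse)           # (5) tournament
            new_pop.extend(family[:2])
        pop = np.array(new_pop[:pop_size])
    return min(pop, key=sse)

# The GA's best weight vector can then seed the quasi-Newton refinement, e.g.:
#   from scipy.optimize import minimize
#   theta = minimize(sse, genetic_minimize(sse, dim), method="BFGS").x

def trimmed_mean_forecast(forecasts, alpha=0.05):
    """Trimmed-mean 'thick model' forecast from a (T, M) array of M ensemble forecasts."""
    ordered = np.sort(np.asarray(forecasts), axis=1)
    cut = int(np.floor(alpha * ordered.shape[1]))
    kept = ordered[:, cut:ordered.shape[1] - cut] if cut > 0 else ordered
    return kept.mean(axis=1)
```

Here sse is assumed to be a user-supplied function mapping a stacked weight vector to the network's sum of squared errors; with twenty network forecasts per period and α = 0.05, trimmed_mean_forecast drops the single largest and smallest forecast at each date before averaging.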
3.3 Adjustment and Scaling of Data

For estimation, the inflation and unemployment “inputs” are stationary transformations of the underlying series. As in equation (1), the relevant forecast variables are the one-period-ahead first differences of inflation⁶. Besides stationary transformation and seasonal adjustment, scaling is also important for non-linear NN estimation. When input variables {x_t} and stationary output variables {y_t} are used in a NN, “scaling” facilitates the non-linear estimation process. The reason scaling is helpful is that the use of very high or small numbers, or series with a few very high or very low outliers, can cause underflow or overflow problems, with the computer stopping or, even worse, as Judd (1998, p. 99) points out, the computer continuing by assigning a value of zero to the values being minimized.

There are two main ranges used in linear scaling functions: as before, the unit interval, [0, 1], and [-1, 1]. Linear scaling functions make use of the maximum and minimum values of the series. The linear scaling function for the [0, 1] case transforms a variable x_k into x_k^* in the following way⁷:

x^*_{k,t} = \frac{x_{k,t} - \min(x_k)}{\max(x_k) - \min(x_k)}    (6)

A non-linear scaling method proposed by Helge Petersohn (University of Leipzig), transforming a variable x_k into z_k, allows one to specify the range of the transformed series: z_k and z_{-k} denote the values assigned to \max(x_k) and \min(x_k), respectively, both lying strictly inside (0, 1):

z_{k,t} = \left(1 + \exp\left(\frac{\ln(z_k^{-1} - 1) - \ln(z_{-k}^{-1} - 1)}{\max(x_k) - \min(x_k)}\,[x_{k,t} - \min(x_k)] + \ln(z_{-k}^{-1} - 1)\right)\right)^{-1}    (7)

Finally, Dayhoff and De Leo (2001) suggest scaling the data in a two-step procedure: first standardizing the series x to obtain z, then taking the log-sigmoid transformation of z:

z = \frac{x - \bar{x}}{\sigma_x}    (8)

x^* = \frac{1}{1 + \exp(-z)}    (9)

——————
⁶ As in Stock and Watson (1999), we find that there is little noticeable difference in results using seasonally adjusted or unadjusted data. Consequently, we report results for the seasonally adjusted data.
⁷ The linear scaling function for [-1, 1], transforming x_k into x_k^{**}, has the form
x^{**}_{k,t} = 2\,\frac{x_{k,t} - \min(x_k)}{\max(x_k) - \min(x_k)} - 1.
Since there is no a priori way to decide which scaling function works best, the choice depends critically on the data. The best strategy is to estimate the model with different types of scaling functions to find out which one gives the best performance. When we repeatedly estimate various networks for the “ensemble” or trimmed-mean forecast, we use identical networks employing different scaling functions. In our “thick model” approach, we use all three scaling functions for transforming the input variables. For the hidden-layer neurons, we use the log-sigmoid function for the neural network forecasts. The networks are simple, with one, two, or three neurons in one hidden layer, with randomly-generated starting values⁸, using the feedforward and jump connection network types. We thus make use of twenty different neural network “architectures” in our thick model approach: twenty different randomly-generated integer values for the number of neurons in the hidden layer, combined with different randomly-generated indicators for the network types and indicators for the scaling functions. Obviously, our thick model approach can be extended to a wider variety of specifications, but we show, even with this smaller set, the power of this approach⁹.

——————
⁸ We use different starting values as well as different scaling functions in order to increase the likelihood of finding the global, rather than a local, minimum.
⁹ We use the same lag structure for both the neural network and linear models. Admittedly, we do this as a simplifying computational shortcut. Our goal is thus to find the “value added” of the neural network specification, given the benchmark best linear specification. This does not rule out that alternative lag structures may work even better for neural network forecasting, relative to the benchmark best linear specification of the lag structure.
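For reference, a minimal sketch of the scaling choices in equations (6)–(9) follows. The Petersohn end-values z_hi and z_lo, and all function names, are illustrative assumptions.

```python
import numpy as np

def scale_unit(x):                       # eq. (6): linear map onto [0, 1]
    return (x - x.min()) / (x.max() - x.min())

def scale_pm1(x):                        # footnote 7: linear map onto [-1, 1]
    return 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0

def scale_petersohn(x, z_hi=0.9, z_lo=0.1):
    """Eq. (7): non-linear (logistic) scaling with chosen end values
    z_hi at max(x) and z_lo at min(x), both strictly inside (0, 1)."""
    slope = (np.log(1.0 / z_hi - 1.0) - np.log(1.0 / z_lo - 1.0)) / (x.max() - x.min())
    return 1.0 / (1.0 + np.exp(slope * (x - x.min()) + np.log(1.0 / z_lo - 1.0)))

def scale_dayhoff_deleo(x):              # eqs. (8)-(9): standardize, then log-sigmoid
    z = (x - x.mean()) / x.std()
    return 1.0 / (1.0 + np.exp(-z))
```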
In nonlinear neural network estimation, there is no closed-form solution for obtaining the parameter values of the network. The final values of the parameter estimates, and thus the predicted values of inflation, even with convergence, may differ slightly depending on the choice of the scaling function and the starting values of the estimates, for a given neural network structure. Since we are also varying the network structure, of course, we will have a spectrum of predicted values. From this set we derive the trimmed-mean forecast. This “thick model” approach is similar to “bagging predictors” in the machine-learning and artificial-intelligence literature (see Breiman, 1996).
3.4 The Benchmark Model and Evaluation Criteria

We examine the performance of the NN method relative to the benchmark linear model. In order to have a fair “race” between the linear and NN approaches, we first estimate the linear auto-regressive model, with varying lag structures for both inflation and unemployment. The optimal lag length for each variable, for each data set, is chosen based on the Hannan-Quinn criterion. We then evaluate the in-sample diagnostics of the best linear model to show that it is relatively free of specification error. For most of the data sets, we found that the best lag length for inflation, with the monthly data, was ten or eleven months, while one lag was needed for unemployment.

After selecting the best linear model and examining its in-sample properties, we then apply NN estimation and forecasting with the “thick model” approach discussed above, for the same lag length of the variables, with alternative NN structures of two, three, or four neurons, with different scaling functions, and with feedforward and jump connection architectures. We estimate this network alternative for thirty different iterations, take the “trimmed mean” forecasts of this “thick model” or network ensemble, and compare its forecasting properties with those of the linear model.

3.4.1 In-sample diagnostics

We apply the following in-sample criteria to the linear auto-regressive and NN approaches:
• R² goodness-of-fit measure;
• Ljung-Box (1978) and McLeod-Li (1983) tests for autocorrelation and heteroskedasticity – LB and ML, respectively;
• Engle-Ng (1993) LM test for symmetry of residuals – EN;
• Jarque-Bera test for normality of regression residuals – JB;
• Lee-White-Granger (1992) test for neglected non-linearity – LWG;
• Brock-Dechert-Scheinkman (1987) test for independence, based on the “correlation dimension” – BDS.

3.4.2 Out-of-sample forecasting performance

The following statistics examine the out-of-sample performance of the competing models:
• The root mean squared error estimate – RMSQ;
• The Diebold-Mariano (1995) test of the forecasting performance of competing models – DM;
• The Pesaran-Timmermann (1992) test of directional accuracy of the signs of the out-of-sample forecasts, as well as the corresponding success ratios for the signs of the forecasts – SR;
• The bootstrap test for “in-sample” bias.

For the first three criteria, we estimate the models recursively and obtain “real-time” forecasts. For the US data, we estimate the model from 1970.01 through 1990.01 and continuously update the sample, one month at a time, until 2003.01. For the euro-area data, we begin at 1980.01 and start the recursive real-time forecasts at 1995.01.

Obtain mean square error from estimation set:  SSE(n) = \frac{1}{n}\sum_{i=1}^{n}[y_i - \hat{y}_i]^2
Draw B samples of length n from estimation set:  \tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_B
Estimate coefficients of model for each set:  \Omega_1, \Omega_2, \ldots, \Omega_B
Obtain “out-of-sample” matrix for each sample
Calculate mean square error for the “out-of-sample” data in each draw:  SSE(n_b) = \frac{1}{n_b}\sum_{i=1}^{n_b}\left[\tilde{z}_{b,i} - \hat{\tilde{z}}_{b,i}(\Omega_b)\right]^2
Calculate average mean square error for B bootstraps:  SSE(B) = \frac{1}{B}\sum_{b=1}^{B} SSE(n_b)
Calculate “bias adjustment”:  \tilde{\omega}(0.632) = 0.632\,[SSE(n) - SSE(B)]
Calculate “adjusted error estimate”:  SSE(0.632) = (1 - 0.632)\,SSE(n) + 0.632\,SSE(B)

Table 1. “0.632” Bootstrap Test for In-Sample Bias
The bootstrap method is different. It is based on the original bootstrap due to Efron (1983), but serves another purpose: out-of-sample forecast evaluation. The reason for doing out-of-sample tests, of course, is to see how well a model generalizes beyond the original training or estimation set, or historical sample, for a reasonable number of observations. As mentioned, the recursive methodology allows only one out-of-sample error for each training set. The point of any out-of-sample test is to estimate the “in-sample bias” of the estimates, with a sufficiently ample set of data. LeBaron (1998) proposes a variant of the original bootstrap test, the “0.632 bootstrap” (described in Table 1)¹⁰. The procedure is to estimate the original in-sample bias by repeatedly drawing new samples from the original sample, with replacement, and using the new samples as estimation sets, with the remaining data from the original sample not appearing in the new estimation sets serving as clean test or out-of-sample data sets. However, the bootstrap test does not have a well-defined distribution, so there are no “confidence intervals” that we can use to assess whether one method of estimation dominates another in terms of this test of “bias”.

——————
¹⁰ LeBaron (1997) notes that the weighting 0.632 comes from the probability that a given point is actually in a given bootstrap draw, 1 - [1 - 1/n]^n ≈ 0.632.
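The sketch below mirrors the Table 1 procedure, using an ordinary-least-squares fit as a stand-in estimator; the choice of estimator, the number of draws B, and the names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def dot632_bootstrap(X, y, B=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    fit = lambda Xs, ys: np.linalg.lstsq(Xs, ys, rcond=None)[0]
    beta = fit(X, y)
    sse_n = np.mean((y - X @ beta) ** 2)                  # in-sample error, SSE(n)
    oos = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                  # draw a sample with replacement
        out = np.setdiff1d(np.arange(n), idx)             # left-out observations
        if out.size == 0:
            continue
        b = fit(X[idx], y[idx])                           # estimate on the bootstrap sample
        oos.append(np.mean((y[out] - X[out] @ b) ** 2))   # SSE(n_b) on clean data
    sse_B = float(np.mean(oos))                           # SSE(B)
    return 0.368 * sse_n + 0.632 * sse_B                  # adjusted error, SSE(0.632)
```

Ratios of the adjusted error across competing models (linear versus network) correspond to the “Bootstrap SSE” rows reported in Tables 2 and 3 below.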
4. Results¹¹

Table 2 contains the empirical results for the broad inflation indices for the US. The data set begins in 1970 and we “break” the sample to start “real-time forecasts” at 1990.01. With such a lag length for inflation, it is not surprising that the overall in-sample explanatory power of all of the linear models is quite high, over 0.99. The marginal significance levels of the Ljung-Box statistics indicate that we cannot reject serial independence in the residuals¹². The McLeod-Li tests for autocorrelation in the squared residuals are insignificant except for the US producer price index. We can reject normality in the regression residuals of the linear model. Furthermore, the Lee-White-Granger and Brock-Dechert-Scheinkman tests do not indicate “neglected non-linearity”, suggesting that the linear auto-regressive model, with lag length appropriately chosen, is not subject to obvious specification error. This model, then, is a “fit” competitor for the neural network “thick model” for out-of-sample forecasting performance.

The forecasting statistics based on the root mean squared error and success ratios are quite close for the linear and network thick models. What matters, of course, is the significance: are the real-time forecast errors statistically “smaller” for the network model, in comparison with the linear model? The answer is “Not always”. At the ten percent level, the forecast errors, for given autocorrelation corrections with the Diebold-Mariano statistics, are significantly better with the neural network approach for the US CPI and PPI. To be sure, the reduction in the root mean squared error statistic from moving to network methods is not dramatic, but the “forecasting improvement” is significant for the US. The bootstrapping sum of squared errors shows a small gain (in terms of percentage improvement) from moving to network methods for the US CPI and PPI.

——————
¹¹ The (Matlab) code and the data set used in this paper are available on request.
¹² Since our dependent variable is a 12-month-ahead forecast of inflation, the model by construction has a moving average error process of order 12, one current disturbance and 11 lagged disturbances. We approximate the MA representation with an AR(12) process, which effectively removes the serial dependence.
                        USA
                   CPI        PPI
Lags-Inf           10         10
Lags-Un            1          1

RSQ-LS             0.992      0.992
L-B*               0.948      0.851
McL-L*             0.829      0.000
E-N*               0.628      0.000
J-B*               0.001      0.000
LWG                0          1
BDS*               0.083      0.000

RSQ-NET            0.0992     0.0992

RMSQ-LS            0.214      0.386
RMSQ-NET           0.213      0.385
SR-LS              0.986      0.971
SR-NET             0.986      0.971
DM-1*              0.036      0.088
DM-2*              0.043      0.104
DM-3*              0.029      0.087
DM-4*              0.033      0.118
DM-5*              0.019      0.108

Bootstrap SSE-LS   0.079      0.182
Bootstrap SSE-NET  0.079      0.181
Ratio              0.997      0.993

Note: * represents probability values. Bold indicates those series that show superior performance of the network, either in terms of Diebold-Mariano or bootstrap ratios. DM-1, …, DM-5 allow for the out-of-sample forecast errors to be corrected for autocorrelations at lags 1 through 5.

Table 2. Diagnostic / Forecasting Results
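For reference, a minimal sketch of the Diebold–Mariano comparison behind the DM-1, …, DM-5 rows: the long-run variance of the loss differential is corrected for autocorrelation up to the stated lag. The two-sided normal p-value and the squared-error loss are illustrative conventions, not necessarily the authors' exact implementation.

```python
import numpy as np
from scipy.stats import norm

def diebold_mariano(e_lin, e_net, lag=1):
    """e_lin, e_net: out-of-sample forecast errors of the two competing models."""
    d = e_lin ** 2 - e_net ** 2          # squared-error loss differential
    T = len(d)
    d_bar = d.mean()
    dc = d - d_bar
    var = dc @ dc / T                    # gamma_0
    for k in range(1, lag + 1):          # autocorrelation correction, lags 1..lag
        gamma_k = dc[k:] @ dc[:-k] / T
        var += 2.0 * gamma_k
    var = max(var, 1e-12)                # guard against a non-positive HAC estimate
    stat = d_bar / np.sqrt(var / T)
    return stat, 2.0 * (1.0 - norm.cdf(abs(stat)))
```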
The usefulness of this “thick modeling” strategy for forecasting is evident from an examination of Fig. 3. There, we plot the standard deviations of the set of forecasts for each out-of-sample period of all of the models. This comprises, at each period, twenty-two different forecasts: one linear, one based on the trimmed mean, and the remaining twenty being neural network forecasts.
Notes: The vertical axis shows the standard deviations of the forecasts of the different models at each time (horizontal axis)
Fig. 3. Thick Model Forecast Uncertainty: US
We see in this figure that the thick model forecast uncertainty is highest in the early 1990’s and after 2000. The earlier period of uncertainty is likely due to the oil price shocks of the Gulf War. The uncertainty after 2000 is likely due to the collapse of the US share market. What is most interesting about the figure is that models diverge in their forecasts in times of abrupt structural change. It is, of course, in these times that the thick model approach is especially useful. When there is little or no structural change, models converge to similar forecasts, and one approach does about as well as any other. What about sub-indices? In Table 3, we examine the performance of the two estimation and forecasting approaches for food, energy, and service components. The bootstrap method shows a reduction in the forecasting error “bias” for all of the indices.
                        USA
                   Food       Energy     Services
Lags-Inf           10         11         10
Lags-Un            1          6          1

RSQ-LS             0.992      0.993      0.992
L-B*               0.728      0.971      0.728
McL-L*             0.000      0.043      0.000
E-N*               0.000      0.075      0.000
J-B*               0.000      0.000      0.000
LWG                5          0          5
BDS*               0.000      0.000      0.000

RSQ-NET            0.991      0.994      0.991

RMSQ-LS            0.332      2.123      0.332
RMSQ-NET           0.320      2.144      0.320
SR-LS              0.949      0.974      0.949
SR-NET             0.955      0.974      0.955
DM-1*              0.511      0.882      0.511
DM-2*              0.512      0.854      0.512
DM-3*              0.513      0.848      0.513
DM-4*              0.513      0.839      0.513
DM-5*              0.514      0.812      0.514

Bootstrap SSE-LS   0.402      3.001      0.402
Bootstrap SSE-NET  0.410      2.992      0.410
Ratio              0.998      0.994      0.998

Note: See notes to Table 2.

Table 3. Food, Energy, and Services Indices: Diagnostics and Forecasting
5. Conclusions

Forecasting inflation for industrialized countries is a challenging task. Notwithstanding the costs of developing tractable forecasting models, accurate forecasting is a key component of successful monetary policy. No model, however complex, can capture all of the major structural characteristics affecting the underlying inflationary process. Economic forecasting is a learning process, in which we search for better subsets of approximating models for the true underlying process. Here, we examined one approximating alternative, a “thick model” based on the NN specification, benchmarked against a well-performing linear process. We do not suggest that the network approximation is the only alternative or the best among a variety of alternatives¹³. However, the appeal of the NN is that it efficiently approximates a wide class of non-linear relations. Our results show that non-linear Phillips curve specifications based on thick NN models can be competitive with the linear specification. We have attempted a high degree of robustness in our results by using different indices and sub-indices as well as performing different types of out-of-sample forecasts using a variety of supporting metrics.

The performance of the neural network relative to a recursively-updated, well-specified linear model should not be taken for granted. Given that the linear coefficients are changing each period, there is no reason not to expect good performance, especially in periods when there is little or no structural change taking place. We show in this paper that the linear and neural network specifications converge in their forecasts in such periods. The payoff of the neural network “thick modeling” strategy comes in periods of structural change and uncertainty, such as the early 1990’s and after 2000.

When we examine the components of the CPI, we note that the non-linear models work especially well for forecasting inflation in the services sector. Since the service sector is, by definition, a highly labor-intensive industry and closely related to labor-market developments, this result appears to be consistent with recent research on relative labor-market rigidities and asymmetric adjustment.

This paper has tried to advocate the inherent case for non-linear modeling in a conventional policy setting – namely, that of forecasting inflation. We tried to show that such techniques as Neural Networks are by no means esoteric or invalid for forecasting exercises (as go on regularly in policy institutions) and thus can successfully be brought to the data and, in turn, to the policy analysis process.
—————— 13
One interesting competing approximating model is the auto-regressive model with drifting coefficients and stochastic volatilities, e.g., Cogley and Sargent (2002).
References

Blanchard, O. J., and J. Wolfers (2000): “The role of shocks and institutions in the rise of European unemployment”, Economic Journal, 110, 462, C1-C33.
Breiman, L. (1996): “Bagging predictors”, Machine Learning, 24, 123-140.
Brock, W., W. Dechert, and J. Scheinkman (1987): “A test for independence based on the correlation dimension”, Working Paper, Economics Department, University of Wisconsin at Madison.
Chen, X., J. Racine, and N. R. Swanson (2001): “Semiparametric ARX Neural Network models with an application to forecasting inflation”, Working Paper, Economics Department, Rutgers University.
Cogley, T., and T. J. Sargent (2002): “Drifts and Volatilities: Monetary Policies and Outcomes in Post-WWII US”. Available at: www.stanford.edu/~sargent.
Dayhoff, Judith E., and James M. De Leo (2001): “Artificial Neural Networks: Opening the black box”, Cancer, 91, 8, 1615-1635.
Diebold, F. X., and R. Mariano (1995): “Comparing predictive accuracy”, Journal of Business and Economic Statistics, 3, 253-263.
Duffy, J., and P. D. McNelis (2001): “Approximating and simulating the stochastic growth model: Parameterized expectations, Neural Networks and the genetic algorithm”, Journal of Economic Dynamics and Control, 25, 1273-1303.
Efron, B. (1983): “Estimating the error rate of a prediction rule: Improvement on cross validation”, Journal of the American Statistical Association, 78(382), 316-331.
Elman, J. (1988): “Finding structure in time”, University of California, mimeo.
Engle, R., and V. Ng (1993): “Measuring the impact of news on volatility”, Journal of Finance, 48, 1749-1778.
Fogel, D., and Z. Michalewicz (2000): How to Solve It: Modern Heuristics, New York: Springer.
Granger, C. W. J., and Y. Jeon (2004): “Thick modeling”, Economic Modeling, 21, 2, 323-343.
Granger, C. W. J., M. L. King, and H. L. White (1995): “Comments on testing economic theories and the use of model selection criteria”, Journal of Econometrics, 67, 173-188.
Judd, K. L. (1998): Numerical Methods in Economics, MIT Press.
LeBaron, B. (1998): “An Evolutionary Bootstrap Method for Selecting Dynamic Trading Strategies”, in A. P. N. Refenes, A. N. Burgess, and J. D. Moody (Eds.), Decision Technologies for Computational Finance, Amsterdam: Kluwer Academic Publishers, 141-160.
Lee, T. H., H. White, and C. W. J. Granger (1992): “Testing for neglected nonlinearity in time series models: A comparison of Neural Network models and standard tests”, Journal of Econometrics, 56, 269-290.
León-Ledesma, M., and P. McAdam (2004): “Unemployment, hysteresis and transition”, Scottish Journal of Political Economy, 51, 3, 377-401.
Ljunqvist, L., and T. J. Sargent (2001): “European unemployment: From a worker’s perspective”, Working Paper, Economics Department, Stanford University.
Mankiw, N. Gregory, and R. Reis (2003): “What measure of inflation should a central bank target?”, Journal of the European Economic Association, 1, 5, 1058-1086.
Marcellino, M. (2002): “Instability and non-linearity in the EMU”, Working Paper 211, Bocconi University, IGIER.
Marcellino, M., J. H. Stock, and M. W. Watson (2003): “Macroeconomic forecasting in the euro area: Country specific versus area-wide information”, European Economic Review, 47, 1-18.
McAdam, P., and A. J. Hughes Hallett (1999): “Non linearity, computational complexity and macro economic modeling”, Journal of Economic Surveys, 13, 5, 577-618.
McLeod, A. I., and W. K. Li (1983): “Diagnostic checking ARMA time series models using squared-residual autocorrelations”, Journal of Time Series Analysis, 4, 269-273.
McNelis, P. D. (2005): Neural Networks in Finance, Elsevier Academic Press.
Michalewicz, Z. (1996): Genetic Algorithms + Data Structures = Evolution Programs. Third Edition. Berlin: Springer.
Pesaran, M. H., and A. Timmermann (1992): “A simple nonparametric test of predictive performance”, Journal of Business and Economic Statistics, 10, 461-65.
Quagliarella, D., and A. Vicini (1998): “Coupling Genetic Algorithms and Gradient Based Optimization Techniques”, in D. J. Quagliarella et al. (Eds.), Genetic Algorithms and Evolution Strategy in Engineering and Computer Science, John Wiley and Sons.
Sargent, T. J. (2002): “Reaction to the Berkeley Story”, Web Page: www.stanford.edu/~sargent.
Sichel, D. E. (1993): “Business cycle asymmetry”, Economic Inquiry, 31, 224–236.
Sichel, D. E. (1994): “Inventories and the three phases of the business cycle”, Journal of Business and Economic Statistics, 12, 3, 269-77.
Sims, C. S. (2003): “Optimization Software: CSMINWEL”. Webpage: http://eco072399b.princeton.edu/yftp/optimize.
Stock, J. H. (1999): “Forecasting Economic Time Series”, in Companion in Theoretical Econometrics, Badi Baltagi (Ed.), Basil Blackwell.
Stock, J. H., and M. W. Watson (1998): “A comparison of linear and non-linear univariate models for forecasting macroeconomic time series”, NBER WP 6607.
Stock, J. H., and M. W. Watson (1999): “Forecasting inflation”, Journal of Monetary Economics, 44, 293-335.
Stock, J. H., and M. W. Watson (2001): “Forecasting output and inflation”, NBER WP 8180.
White, H. L. (1992): Artificial Neural Networks, Basil Blackwell.
Zhang, G. B., Eddy Patuwo, and M. Y. Hu (1998): “Forecasting with artificial neural networks: The state of the art”, International Journal of Forecasting, 14, 1, 35-62.
Part VI
Policy Issues
The Impossibility of an Effective Theory of Policy in a Complex Economy

K. Vela Velupillai
1. Preliminaries on Complexity and Policy

There is one main theme and correspondingly one formal result in this paper. On the basis of a general characterization of what is formally meant by a 'complex economy', underpinned by imaginative suggestions to this end in Foley (2003) and in Brock and Colander (2000; henceforth BC), it will be shown that an effective¹ theory of economic policy is impossible for such an economy. There is, in addition, also a half-baked conjecture: it will be suggested, seemingly paradoxically, that a 'complex economy' can be formally based on the foundations of orthodox general equilibrium theory and, hence, that a similar impossibility result is valid in this case, too. I have found Duncan Foley's excellent characterisation of the objects of study by the sciences of complexity (2003, p. 2) extremely helpful in providing a base from which to approach the study of a subject that is technically demanding, conceptually multi-faceted, and philosophically and epistemologically highly in-homogeneous:

Complexity theory represents an ambitious effort to analyze the functioning of highly organized but decentralized systems composed of very large numbers of individual components. The basic processes of life, involving the chemical interactions of thousands of proteins, the living cell, which localizes and organizes these processes, the human brain in which thousands of cells interact to maintain consciousness, ecological systems arising from the interaction of thousands of species, the processes of biological evolution from which new species emerges, and the capitalist economy, which arises from the interaction of millions of human individuals, each of them already a complex entity², are leading examples. (Foley, 2003, p. 2; italics added.)

——————
¹ I mean by 'effective' the formal sense of the word in (classical) recursion theory.
These objects share: [A] potential to configure their component parts in an astronomically large number of ways (they are complex), constant change in response to environmental stimulus and their own development (they are adaptive), a strong tendency to achieve recognizable, stable patterns in their configuration (they are self-organizing), and an avoidance of stable, self-reproducing states (they are non-equilibrium systems). The task complexity science sets itself is the exploration of the general properties of complex, adaptive, self-organizing, non-equilibrium systems. The methods of complex systems theory are highly empirical and inductive […] A characteristic of these […] complex systems is that their components and rules of interactions are non-linear […] The computer plays a critical role in this research, because it becomes impossible to say much directly about the dynamics of non-linear systems with a large number of degrees of freedom using classical mathematical analytical methods. (Foley, 2003, p. 2; emphasis added.)
In a similar vein, Fontana and Buss (1996) have suggested that complexity in a dynamical system arises as a result of the interactions between high dimensionality and non-linearity.
—————— 2
Note Foley’s interesting characterisation of each human individual in the interaction of millions in a decentralized system as a complex entity. This is diametrically opposed to the formalism in agent-based models of simple agents interacting to generate self-organized patterns of behaviour. In this latter class of models, the formalism chosen for the individual agent abstracts away from all its realistic complex characteristics. A Turing Machine formalism is the best way, in a precisely definable sense and consistent with formal economic theory, to encapsulate the full complexities of an individual agent.
“... [A] Fundamental problem in methodology [is that] the traditional theory of dynamical systems is not equipped for dealing with constructive processes3. We seek to solve this impasse by connecting dynamical systems with fundamental research in computer science. Many failures in domains of biological (e.g., development), cognitive (e.g., organization of experience), social (e.g., institutions), and economic science (e.g., markets) are nearly universally attributed to some combination of high dimensionality and nonlinearity. Either alone won’t necessarily kill you, but just a little of both is more than enough. This, then, is vaguely referred to as ‘complexity’ (p. 56-57; italics added).
The suggestions for a formalism of a 'complex economy' in BC are more specifically framed in the context of a discussion of the role of policy in such an economy. I hope the following concise summary is reasonably faithful to the spirit of their approach and encapsulates the more imaginative ideas explicitly. In section 2 of BC, titled 'How Complexity Changes Economists' Worldview', the authors list six ways in which a complexity vision brings about (or should bring about) a change in the worldview that is currently dominant in policy circles. The latter is given the imaginative representative name 'economic reporter', a person trained in one of the better and conventional graduate schools of economics and fully equipped with the tinted glasses that such an education provides: a worldview for policy that is underpinned by general equilibrium theory (henceforth GET) and game theory. These are the bright young things who go around the world seeing Nash equilibria and the two welfare theorems in the processes that economies generate, without the slightest clue as to how one can interpret actions, events, and institutional pathologies during processes, and their frequent paralysis. BC, as 'complexity theorists', aim to take away the intellectual props that these economic reporters carry as their fall-back position for policy recommendations: general equilibrium theory and the baggage that comes with it (although, curiously, they do not mention taking away that other prop: game theory, which, in my opinion, is far more sinister). With this aim in mind, the six changes that a 'complexity vision' may bring about in the worldview of the economic reporter, according to BC, are as follows:
—————— 3
I do not believe this refers to 'constructive' in the strict sense of constructive mathematics of any variety.
• With a complexity vision, the most important policy input is in institutional design.
• The complexity vision brings with it an attitude of theoretical neutrality to abstract debates about policy.
• The 'complexity-trained policy economist' will try to seek out the boundaries of the equivalent of the basins of attraction of dynamical systems - i.e., equipped with notions of criticality and their crucial role in providing adaptive flexibilities in the face of external disturbances, the complexity-trained economist will not be complacent that any observable dynamics is that of an elementary, characterizable, attractor.
• There will be more focus on inductive process models than on abstract deductive models.
• Due to the paramount roles played by positive feedback, underpinning path dependence, and increasing returns in the complexity-visioned economy, the attitude to policy will be honed towards the temporal dimension of policy being given a crucial role to play.
• The complexity worldview makes policy recommendations less certain simply because pattern detection is a hazardous activity and patterns are ephemeral, eternally transient phenomena.

On the whole, then, I base my interpretation of a formal dynamical system representing a 'complex economy' to be underpinned by high dimensionality and nonlinearity and configured on the boundaries of basins of attraction. However, I wish to add three caveats to these highly suggestive and entirely plausible elements for a characterization of a dynamical system encapsulating the essential features of a 'complex economy'. The first is the joint suggestion in Foley and in Fontana-Buss (1996), and in much of the standard literature of 'the sciences of complexity', that high dimensionality and nonlinearity are necessary ingredients for the manifestation of complex behavior by a formal dynamical system. I think it is quite easy to show complex behaviour in a low dimensional, asynchronously coupled, dynamical system, in any sense defined as desirable by the sciences of complexity. Secondly, it is easy to show, formally, that a dynamical system must possess the computability-based property of self-reproduction for it to be capable of self-organization. Thirdly, there is a difficulty in reconciling the three desiderata - bordering on formal impossibility - of institutional design, self-organization, and the idea of a dynamical system delicately poised on the boundaries of its basins of attraction. Finally, implicit - and occasionally also explicit - in all of the criteria suggested by the authors cited above for defining a dynamical system underpinning a 'complex economy', there is the
forceful call for a shift away from the formal methods of mathematics and philosophy of science customarily used in economic theory, and a move is advocated, by necessity if you like, towards the study of the high dimensional, nonlinear, dynamical systems using the methods of simulation and numerical studies based on the (digital) computer. I am in complete agreement with this ‘call’, although not just high dimensionality and nonlinearity call forth this shift towards reliance on the (digital) computer, induction/abduction and so on. Now, a serious acceptance of these admonitions requires that the formalisms of dynamical systems must also be underpinned by the mathematics of the computer - recursion theory. I am not aware of any systematic study by the theorists, advocates, and practitioners of a ‘complexity vision’ of basing their formalisms on recursion theory. Thus, there is an uneasy and easily demonstrable dissonance between the various above mentioned desiderata and the actual mathematical formalisms used. For example, it is one thing to state that a dynamical system representing a ‘complex economy’ should be configured at the boundary of the basin of attraction of attractors; it is quite another thing to make such a criterion numerically meaningful so that it can be investigated on the (digital) computer. Of course, if one is working with analogue computers, the ‘complexity vision’ can be kept consistent with classical mathematical analytical methods, up to a point. The dissonance is, however, only apparent and not real. This is because the fact is that the new mathematical formalisms for the new sciences of complexity came from two very special directions, almost simultaneously: developments in the theory and numerical study of non-linear dynamical systems and new paradigms for representing, in a computational format, ideas about self-reproduction, self-reconstruction, and self-organization in (not necessarily) high-dimensional systems. The interpretations of the latter in terms of the theory of the former and the representations of the former in terms of the paradigms of the latter were the serendipitous outcome that gave the sciences of complexity their most vigorous and sustained mathematical framework. The classic contributions to this story are the following: Turing, 1952; von Neumann, 1966; Lorenz, 1963; May, 1976; Ruelle and Takens, 1971, and Smale, 1967, from which emerged a vast, exciting, and interesting work that has, in my opinion, led to the complexity vision and the sciences of complexity. These contributions are classics and, like all classics, are still eminently readable - they have neither aged nor have the questions they posed become obsolete by the development of new theoretical technologies; if anything, the new theoretical technologies have reinforced the visions they foresaw and the scientific traditions and trends they initiated.
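As a toy illustration of what making the basin-boundary criterion 'numerically meaningful' on a digital computer involves, the sketch below grids initial conditions of the scalar flow xdot = x - x^3 (two attracting fixed points at +1 and -1, basin boundary at 0) and labels each by the attractor its computed trajectory approaches. The system, the integrator, and the tolerances are illustrative choices, not part of the paper's argument.

```python
import numpy as np

def attractor_label(x0, dt=0.01, steps=5000, tol=1e-3):
    """Integrate xdot = x - x^3 from x0 and report which attractor is reached."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x ** 3)           # explicit Euler step of the flow
        for a in (-1.0, 1.0):
            if abs(x - a) < tol:
                return a                 # trajectory judged to have entered A
    return float("nan")                  # undecided within the time budget

grid = np.linspace(-2.0, 2.0, 41)
labels = np.array([attractor_label(x0) for x0 in grid])
# The sign change in `labels` brackets the basin boundary near x = 0; the point
# x0 = 0 itself never settles and remains "undecided" within any finite budget.
```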
Let me end these preliminaries with some remarks on policy in a ‘complex economy’. BC make the entirely reasonable observation that:
Much of deductive standard economic theory has been directed at providing a general theory based on first principles. Complexity theory suggests that question may have no answer and, at this point, the focus of abstract deductive theory should probably be on the smaller questions - accepting that the economy is in a particular position, and talking about how policy might influence the movement from that position. That makes a difference for policy in what one takes as given - it suggests that existing institutions should be incorporated in the models, and that policy suggestions should be made in reference to models incorporating such institutions. (Brock and Colander, 2000, p. 79; emphasis added).
Was this not, after all, the message in the General Theory? Accepting that the economy is in a particular position of unemployment equilibrium and devising a theory for policy that would influence movement from that position to a more desirable position. Such a vision implied, in the General Theory, an economy with multiple equilibria and, at the hands of a distinguished array of nonlinear Keynesians, also that other hobby horse of the complexity visionaries, 'positive feedback' - in more conventional dynamic terms, locally unstable equilibria in a dynamical system subject to relaxation oscillations. In addition, in the early nonlinear Keynesian literature, when disaggregated macrodynamics was investigated, there were coupled markets, but the mathematics required to analyze coupled nonlinear differential equations was only in its infancy and these nonlinear Keynesians resorted to ad hoc numerical and geometric methods. So, we are in familiar territory, but not the terrain that is usually covered in the education of the economic reporters. To this extent BC are right on the mark about the policy-complexity nexus.

Were these precepts also not the credo of the pioneers of classical behavioural economics: Herbert Simon, Richard Day, and James March and Co? Studying adjustment processes phenomenologically, eschewing the search for first principles, underpinning economic theoretic closures with institutional assumptions, enriching rationality postulates by setting agents in explicit institutional and problem-solving contexts, seeking algorithmic foundations for behaviour⁴, and so on. Was there ever an economic agent abstracted away from an institutional setting in any of Simon's writings? Were not all of Day's agents, in his dynamic economics, behaving adaptively? So, these precepts have a noble pedigree in classical behavioural economics and in the economics of Keynes, and it is not as if those of us who were trained in these two traditions were not aware of the complexity-policy nexus. For over a quarter of a century I have taught macroeconomics emphasizing the 'paradox of saving', referring to the 'Banana Paradox' in the Treatise, the story of the origins of the classical theory of economic policy⁵, and the emergence of the Lucasian critique as an outcome of an awareness of the paradoxes of self-referentiality in a dynamical system capable of computation universality. Coming to terms with the wedge that has to be designed between 'parts' and 'wholes' that emerges in a macroeconomy subject to the 'paradox of saving', and related paradoxes, and respecting the conundrums of self-referentiality in underpinning it - the macroeconomy - in a system of rational individual behaviour, are the twin horns on which the 'complexity-policy' nexus has often floundered. As a result, the unfortunate trajectory of the theory of policy has oscillated between adherence to a mechanical view of the feasibility of policy and a nihilistic attitude advocating the irrelevancy of policy. The via media that is being wisely suggested in BC - except for their advocacy of institution design for a complex economy - is entirely justifiable for a complex economy.

——————
⁴ Which, automatically, brings with it undecidabilities, uncomputabilities, and other indeterminate problems that can only, always, be 'solved' pro tempore, aiming to determine the boundaries of basins of attraction numerically, and so on. This is a list that not only encompasses the Brock-Colander set of six-fold precepts but also one that is far richer in inductive content and retroductive - i.e., abductive - realization.
⁵ By the 'classical theory of economic policy' I am referring, now, to the so-called 'target-instrument' approach that is usually attributed to Ragnar Frisch, Jan Tinbergen, and Bent Hansen, and not the theory of policy of the classical economists. It may be useful to record the origins of this approach (as narrated to me by Mrs. Gertrud Lindahl, during personal conversations at her home in Lund, in 1983). In the early 1930s, the Social Democratic Minister of Finance of a Sweden grappling, like most other economies, with the ravages wrought in the labour market by the great depression, was Ernst Wigforss. He approached the two leading Swedish economists, Gunnar Myrdal and Erik Lindahl, both sympathetic to the political philosophy of the Social Democrats, and requested them to provide him with a 'theory for the underbalancing of the budget' so that he could justify the policy measures he was planning to implement to combat unemployment due to insufficient effective demand. He needed a 'theory', he told them, because the leader of the opposition in that Parliament was the Professor of Economics at Stockholm University, Gösta Bagge, who was versed only in a theory that would justify a balanced budget. Thus was born, via the framework devised in Myrdal's famous memorandum to Wigforss (Myrdal, 1934), the classical theory of economic policy, made mathematically formal, first, by Frisch, Tinbergen, and Hansen and famously known as the 'target-instrument' approach. It was built on the essential back of the 'paradox of saving', the main macroeconomic repository of the wedge between 'wholes' and 'parts' that makes a mockery of reductionism. The pioneers of macroeconomics, Keynes, Myrdal, Lindahl, Lundberg, and others, were theorists of the complexity-policy nexus before their time - rather like Molière's famous unconscious purveyor of prose, Monsieur Jourdain.
2. Undecidability of Policy in a 'Complex Economy'

I am not sure what BC mean by an 'elementary, characterizable, attractor' simply because the formalism for 'characterizability' is not precisely defined. I take it, however, that they mean by the phrase 'elementary characterizable attractor' the standard limit points, limit cycles, and 'strange' (i.e., chaotic) attractors. As for 'characterizable', I will assume effective characterization of the defining 'basins of attraction'. Then, given the observable trajectories of a dynamical system, say computed using simple Poincaré maps or the like, an 'elementary characterizable attractor' is one that can be associated with a Finite Automaton. Thus, limit points, limit cycles, and strange attractors are effectively characterizable in a computably trivial sense; however, dynamical systems capable of computation universality have to be associated with Turing Machines. Hence, trajectories that are generated by dynamical systems poised on the boundaries of the basins of attraction of simple attractors may possess undecidable properties due to the ubiquity of the Halting problem for Turing Machines, the emergence of Busy Beavers (i.e., uncomputabilities), etc. Any theory of policy, i.e., any rule - fixed or discretionary - that is a function of the values of the dynamics of an economy formalized as a dynamical system capable of computation universality will share these exotic properties.

I will assume an abstract model of a 'complex economy', or of an economy capable of 'complex behaviour', to be a dynamical system capable of computation universality. No other dynamical system would satisfy the imaginative characterizations suggested explicitly, and by implication, by Foley and BC, above. I will also have to assume, in the face of obvious space constraints, familiarity with the formal definition of a dynamical system (cf., for example, the obvious and accessible classic, Differential Equations, Dynamical Systems and Linear Algebra [Hirsch and Smale, 1974], or the more modern Introduction to Dynamical Systems [Brin and Stuck, 2002]), the necessary associated concepts from dynamical systems theory, and all the necessary notions from classical computability theory (for which the reader can, with profit and enjoyment, go to a classic like Theory of Recursive Functions and Effective Computability [H. Rogers, Jr.], or, at the frontiers, to "On the Nature of Turbulence" [Ruelle and Takens, 1971]). For ease of reference, the bare bones of the relevant definitions for dynamical systems are given below in the usual telegraphic form⁶. An intuitive understanding of the definition of a 'basin of attraction' is probably sufficient for a complete comprehension of the main result - provided there is reasonable familiarity with the definition and properties of Turing Machines (or partial recursive functions or equivalent formalisms encapsulated by Church's Thesis).

Definition 1 The Initial Value Problem (IVP) for an Ordinary Differential Equation (ODE) and Flows. Consider a differential equation:

\dot{x} = f(x)    (1)

where x is an unknown function of t ∈ I (say, t: time, and I an open interval of the real line) and f is a given function of x. Then, a function x is a solution of (1) on the open interval I if:

\dot{x}(t) = f(x(t)), \quad \forall t \in I    (2)

The initial value problem (IVP) for (1) is, then, stated as:

\dot{x} = f(x), \quad x(t_0) = x_0    (3)

and a solution x(t) for (3) is referred to as a solution through x_0 at t_0. Denote x(t) and x_0, respectively, as:

\varphi(t, x_0) = x(t), \quad \text{and} \quad \varphi(0, x_0) = x_0    (4)

where \varphi(t, x_0) is called the flow of \dot{x} = f(x).

Definition 2 Dynamical System. If f is a C^1 function (i.e., a member of the set of all differentiable functions with continuous first derivatives), then the flow \varphi(t, x_0), \forall t, induces a map of U \subseteq \mathbb{R} into itself, called a C^1 dynamical system on \mathbb{R}:

x_0 \mapsto \varphi(t, x_0)    (5)

if it satisfies the following (one-parameter group) properties:
1. \varphi(0, x_0) = x_0;
2. \varphi(t + s, x_0) = \varphi(t, \varphi(s, x_0)), \forall t, s.

A closed invariant set A is an attracting set of the flow if there is some neighbourhood V of A such that, \forall x \in V and \forall t \geq 0, \varphi(t, x) \in V and:

\varphi(t, x) \to A \text{ as } t \to \infty    (6)

——————
⁶ In the definition of a dynamical system given below I am not striving to present the most general version. The basic aim is to lead to an intuitive understanding of the definition of a basin of attraction so that the main theorem is made reasonably transparent. Moreover, the definition given below is for scalar ODEs, easily generalizable to the vector case.
Remark 7 It is important to remember that in dynamical systems theory contexts the attracting sets are considered the observable states of the dynamical system and its flow. Definition 8 The basin of attraction of the attracting set A of a flow, denoted, say, by QA, is defined to be the following set: &A - ^t
(7)
where φ_t(·) denotes the flow φ(·, ·), ∀t.
Remark 9 Intuitively, the basin of attraction of a flow is the set of initial conditions that eventually leads to its attracting set - i.e., to its limit set (limit points, limit cycles, strange attractors, etc). Anyone familiar with the definition of a Turing Machine and the famous Halting problem for such machines would immediately recognise the connection with the definition of basin of attraction and suspect that my main result is obvious and trivial.
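To make Definition 8 and Remark 9 concrete, the following sketch (with an illustrative scalar system and made-up numerical tolerances) classifies initial conditions of the elementary flow ẋ = x − x³ by the attracting point their trajectories approach. Precisely because this system is not capable of computation universality, the crude finite-time procedure below settles every case; Theorem 16 below says that no such procedure exists in general.

```python
# A minimal numerical illustration of Definition 8 / Remark 9: for the
# elementary scalar ODE  x' = x - x^3  (attracting points -1 and +1),
# classify initial conditions by the attractor their trajectory approaches.
# The time horizon, step size and tolerance are illustrative choices.
import numpy as np

def flow(x0, f=lambda x: x - x**3, dt=0.01, t_max=50.0):
    """Crude forward-Euler approximation of the flow phi(t, x0)."""
    x = x0
    for _ in range(int(t_max / dt)):
        x = x + dt * f(x)
    return x

def basin_label(x0, tol=1e-3):
    """Return the attracting point the trajectory from x0 approaches,
    or None if it stays on the basin boundary (here, the repeller at 0)."""
    x_final = flow(x0)
    for attractor in (-1.0, 1.0):
        if abs(x_final - attractor) < tol:
            return attractor
    return None

if __name__ == "__main__":
    for x0 in np.linspace(-2.0, 2.0, 9):
        print(f"x0 = {x0:+.2f}  ->  attractor: {basin_label(x0)}")
```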
On the policy side, my formal assumption is that by 'policy' is meant 'rules' and my obvious working hypothesis - almost a thesis, if not an axiom - is the following:
Claim 10 Every rule is reducible to a recursive rule.
Remark 11 This claim and the results below are valid whether 'rule' is meant to be an element from a set of preassigned rules (i.e., the notion of policy as a fixed 'rule' in the 'rules vs. discretion' dichotomy) or a rule as a (partial recursive or total) function of the current state of the dynamics of a complex economy (discretionary policy).
Remark 12 If anyone can suggest a rule that cannot be reduced to a recursive rule, it can only be due to an appeal to a non-algorithmic principle like an undecidable disjunction, magic, ESP or something similar.
Definition 13 Dynamical Systems capable of Computation Universality: A dynamical system capable of computation universality is one whose defining initial conditions can be used to program and simulate the actions of any arbitrary Turing Machine, in particular that of a Universal Turing Machine.
Proposition 14 Dynamical systems characterizable in terms of limit points, limit cycles or 'chaotic' attractors, called 'elementary attractors', are not capable of universal computation.
Proposition 15 Only dynamical systems whose basins of attraction are poised on the boundaries of elementary attractors are capable of universal computation.
——————
In the same sense in which the Walrasian Equilibrium Existence theorem is trivial and obvious for anyone familiar with the Brouwer (or similar) fixed point theorem(s). The finesse, however, was to formalise the Walrasian economy topologically, in the first place. A similar finesse is required here, but space does not permit me to go into details.
Firstly, 'recursive' is meant to be interpreted in its 'recursion theoretic' sense; secondly, this claim is, in fact, a restatement of Church's Thesis (cf. [1], p.34).
It may be useful to keep in mind the following caveat introduced in one of the famous papers on these matters by Kydland and Prescott (1980, p. 169): "[W]e emphasize that the choice is from a [fixed] set of fiscal policy rules".
Theorem 16 There is no effective procedure to decide whether a given observable trajectory is in the basin of attraction of a dynamical system capable of computation universality.
Proof. The first step in the proof is to show that the basin of attraction of a dynamical system capable of universal computation is recursively enumerable but not recursive. The second step, then, is to apply Rice’s theorem to the problem of membership decidability in such a set. First of all, note that the basin of attraction of a dynamical system capable of universal computation is recursively enumerable. This is so since trajectories belonging to such a dynamical system can be effectively listed simply by trying out, systematically, sets of appropriate initial conditions. On the other hand, such a basin of attraction is not recursive. To see this, suppose a basin of attraction of a dynamical system capable of universal computation is recursive. Then, given arbitrary initial conditions, the Turing Machine corresponding to the dynamical system capable of universal computation would be able to answer whether (or not) it will halt at the particular configuration characterising the relevant observed trajectory. This contradicts the unsolvability of the Halting problem for Turing Machines. Therefore, by Rice’s theorem, there is no effective procedure to decide whether any given arbitrary observed trajectory is in the basin of attraction of such recursively enumerable but not recursive basin of attraction. Given this result, it is clear that an effective theory of policy is impossible in a complex economy. Obviously, if it is effectively undecidable to determine whether an observable trajectory lies in the basin of attraction of a dynamical system capable of computation universality, it is also impossible to devise a policy - i.e., a recursive rule - as a function of the defining coordinates of such an observed or observable trajectory. Just for the record I shall state it as a formal proposition:
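The 'recursively enumerable but not recursive' step can be pictured with a small, purely illustrative sketch: membership of an initial condition in the basin can be confirmed by running the system and accepting when the target configuration is reached, but nothing licenses rejection after any finite number of steps. The step function and target below are stand-ins (the 3n+1 map, whose 'basin of 1' is the open Collatz problem), not Velupillai's construction.

```python
# A hedged sketch of a semi-decision procedure: we can *confirm* that an
# initial condition reaches the target configuration (accept when it does),
# but we can never soundly reject, because no bound on the number of steps
# works in general.  'step' stands in for one transition of the dynamical
# system; everything here is illustrative.

def semi_decide_membership(x0, step, is_target, max_steps=None):
    """Return True if the trajectory from x0 ever hits the target.
    With max_steps=None this may loop forever on non-members, which is
    exactly the point; a finite max_steps only yields 'don't know'."""
    x, n = x0, 0
    while max_steps is None or n < max_steps:
        if is_target(x):
            return True
        x, n = step(x), n + 1
    return None  # running longer might still accept

if __name__ == "__main__":
    # Toy system: the 3n+1 map with target configuration 1.  Whether every
    # x0 is 'in the basin' of 1 is the (open) Collatz problem, a reminder
    # that simulation alone settles only the positive cases.
    collatz = lambda n: n // 2 if n % 2 == 0 else 3 * n + 1
    for x0 in (7, 27, 97):
        print(x0, semi_decide_membership(x0, collatz, lambda n: n == 1,
                                         max_steps=10_000))
```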
Proposition 17 An effective theory of policy is impossible for a complex economy.
What if the realized trajectory lies outside the basin of attraction of a dynamical system capable of computation universality and the objective of policy is to drive the system to such a basin of attraction? This means the policy maker is trying to design a dynamical system capable of computational universality with initial conditions pertaining to one that does not have that capability. Or, equivalently, the policy maker is attempting to devise a method by which to make a Finite Automaton construct a Turing Machine, an impossibility. In other words, an attempt is being made endogenously to construct a 'complex economy' from a 'non-complex economy'. Much of this effort is, perhaps, what is called 'development economics' or 'transition economics', and I interpret the Brock-Colander remarks on institution design in this context and, therefore, claim that it is recursively impossible. Essentially, my claim is that it is recursively impossible to construct a system capable of computation universality using only the defining characteristics of a Finite Automaton. To put it more picturesquely, a non-algorithmic step must be taken to go from systems incapable of self-organisation to systems that are capable of self-organisation. This interpretation is entirely consistent with the original definition, explicitly stated, of an 'emergent property' or an 'emergent phenomenon', by George Henry Lewes. This is why 'development' and 'transition' are difficult issues to theorise about, especially for policy purposes.
Thus, the Brock-Colander desideratum of requiring the 'complexity-trained policy economist' to try to seek out the boundaries of the equivalent of the 'basins of attractors of dynamical systems' is a recursively undecidable task. It must, however, be remembered that this does not mean that the task is impossible in any absolute sense. There may well be non-recursive methods to seek out the boundaries of the equivalent of the basins of attraction of dynamical systems. There may also be ad hoc means by which recursive methods may be discovered for such a task. The above theorem seeks only to state that there are no general purpose effective methods for such a policy task. Hence the Brock-Colander admonition to be modest about policy proposals for a complex economy is entirely reasonable. Hence, also, when Brock-Colander hold the other horn of the complexity worldview and 'warn' the complexity vision holders that the 'complexity worldview makes policy recommendations less certain', they are being eminently realistic and insightful, although it may well be for other than algorithmic reasons.
3. Remarks on Generating a GET Complex Economy Both Duncan Foley and Fontana and Buss reflect accurately the intuition of complexity theorists that a conjunction of nonlinearity and high dimensionality (just a little of both), underpinned by adaptation (the basis for dynamics), might be the defining criteria for the genesis of a complex adaptive dynamical system (CADS). It is almost inevitable that the methods of ‘classical analytical mathematical methods’ are inadequate for the purposes of studying these systems in traditional ways - i.e., looking for closed form solutions - and the computer and its mathematics have to be harnessed in a serious way in the study of CADS. The additional ad hoc
criterion of choosing those dynamical systems whose attracting sets are located on the boundaries of elementary attractors is less compelling from a theoretical point of view, at least to this writer. In the citadel of economic theory, General Equilibrium Theory, there is more than ‘a little of both’, nonlinearity and high dimensionality. The general equilibrium theorist can justifiably claim that the core model has been studied with impeccable analytical rigour, ‘using classical mathematical analytical methods! Therefore, since there is so much more than ‘just a little of both’, nonlinearity and high dimensionality, in the general equilibrium model, it must encompass enough ‘complexity’ of some sort for us to be able to endow it with a ‘complexity vision’. Why, the puzzled economic theorist may ask, don’t we do precisely that, instead of building ad hoc models, without the usual micro closures10. On the other hand, there are three fundamental criticisms of orthodox theory - by which I will understand the ‘rigorous’ version of GET and not some watered down version in an intermediate textbook - implied by the complexity vision. Orthodox theory is weak on processes; it is almost silent on increasing returns technology; it is insufficiently equipped to handle disequilibria. The first obviates the need to consider adaptation; the second, in conjunction with the first, rules out positive feedback processes; and the third, again in conjunction with the first, circumvents the problem of the emergence of selforganised, novel, orders. I am in substantial agreement with these fundamental criticisms. However, I do not believe that orthodox theory has to be thrown overboard and that an entirely new closure has to be devised for the ‘complexity vision’ to be encapsulated in a dynamical system capable of computation universality to represent a complex economy To be fair, I should add that BC do maintain, in varying degrees of emphasis, that the ‘complexity vision’ should be viewed as complementing orthodox visions and they do not envisage or advocate a throwing out of the proverbial baby with the bath water. It is my belief that orthodox GET has been much maligned by the complexity theorists and I would like to suggest a pathway to redress some of the ‘mischief’. If the pathway is reasonably acceptable, then the complexity-policy nexus can be buttressed by the fundamental theorems of welfare economics as foundations for policy11. Lack of space prevents me —————— 10
I use this word ‘closure’ instead of the more loaded word ‘microfoundations’ deliberately. I believe the idea of the ‘closure’ of neoclassical economics - i.e., based on preferences, endowments and technology - is more fundamental and can be given many more foundations than the orthodox idea. 11 It must, however, be remembered that the more important of the two theorems as policy underpinnings is the second one. On the other hand, this theorem, in its
from making my case formally. I shall, however, suggest the broad line of attack that might make the case for the ‘defence’, so to speak. There are three elements to my pathway: 1. The Sonnenschein-Debreu-Mantel (S-D-M) (Sonnenschein, 1972) theorem on excess demand functions; 2. The Uzawa Equivalence theorem (Uzawa, 1962) between the Walrasian Equilibrium Existence theorem and the Brouwer fixed point theorem; 3. The Pour-El and Richards theorem (Pour-El, Boykan, and Richards, 1979) on the genesis of computable generation of a non-recursive real as the solution to the IVP of an ODE. As a consequence of the S-D-M theorem, the only structural properties that have to be preserved when, say, introducing ad hoc price dynamics for a general equilibrium exchange economy, are Walras’s Law and continuity. This is entirely compatible with the characterizations given above in defining flows of dynamical systems. Next, the main implication of the Uzawa equivalence theorem is that the economic equilibrium is a nonrecursive real whereas the initial conditions of the economy given by the endowments, etc., have to be recursive reals. The question is, then, how to devise an IVP for an ODE such that for recursive reals as inputs, a nonrecursive real solution is the result. Such a flow can then be associated with the price dynamics of a general equilibrium exchange economy in view of the arbitrariness sanctioned by the S-D-M theorem. It is at this point that the Pour-El and Richards construction comes into play and allows an unusual tatonnement to be devised such that its actions can simulate the activities of a Turing Machine - i.e., a tatonnement dynamics that is capable of computation universality. What of policy in such a GET complex economy? The only question of policy is whether the out-of-equilibrium tatonnement dynamics can be ‘speeded up’ towards the non-recursive real solution, while not violating the strictures of at least the second fundamental theorem of welfare economics. Needless to say, it is relatively easy to show that no effective procedure can be devised to achieve such speeding up. Since the frontiers of theoretical policy discussions are almost entirely about driving out-ofequilibrium configurations towards (stochastic) dynamic equilibrium paths, the bearing of such an ineffective result must be obvious to any sympathetic reader. It is, however, a complete research agenda and not an issue that can be settled with throwaway remarks at the end of a paragraph.
————— general form, relies for its proof on the Hahn-Banach theorem, which is recursively dubious.
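As a purely numerical aside, the kind of 'ad hoc price dynamics' sanctioned by the S-D-M theorem can be written down in a few lines: a tatonnement driven by the aggregate excess demand of a toy two-good, two-consumer Cobb-Douglas exchange economy, respecting only Walras's Law and continuity. The preference shares and endowments are invented for the sketch, and this is emphatically not the Pour-El and Richards construction, only the plain price dynamics on which such a construction would be grafted.

```python
# A minimal sketch, with made-up parameters, of excess-demand-driven price
# dynamics p' = z(p) for a two-good, two-consumer Cobb-Douglas exchange
# economy.  Walras's Law (p.z(p) = 0) and continuity are the only structural
# properties used; this is a plain numerical illustration, not the
# computation-universal tatonnement discussed in the text.
import numpy as np

ALPHA = np.array([[0.3, 0.7],   # consumer 1's Cobb-Douglas shares (assumed)
                  [0.6, 0.4]])  # consumer 2's shares (assumed)
ENDOW = np.array([[1.0, 0.0],   # consumer 1's endowment of goods 1 and 2
                  [0.0, 1.0]])  # consumer 2's endowment

def excess_demand(p):
    """Aggregate excess demand of the exchange economy at price vector p."""
    wealth = ENDOW @ p                          # p.w_i for each consumer
    demand = ALPHA * wealth[:, None] / p[None]  # x_ij = a_ij * wealth_i / p_j
    return demand.sum(axis=0) - ENDOW.sum(axis=0)

def tatonnement(p0, dt=0.05, steps=2000):
    p = np.array(p0, dtype=float)
    for _ in range(steps):
        p = p + dt * excess_demand(p)
        p = p / p.sum()          # keep prices on the unit simplex
    return p

if __name__ == "__main__":
    p_star = tatonnement([0.9, 0.1])
    print("approximate equilibrium prices:", p_star)
    print("Walras's Law check, p.z(p):", p_star @ excess_demand(p_star))
```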
4. Policy, Poetry, and Political Economy The ‘complex economy’ considered in the previous sections did not encapsulate any notion of emergence. Emergence, at least as defined and intended by the pioneers, John Stuart Mill, George Henry Lewes, and C Lloyd Morgan, was meant to be an intrinsically non-algorithmic concept. So, almost by definition and default, no question of policy as rules to maintain or drive trajectories of emergent dynamical systems to desired locations in phase space is even conceivable, at least not in any formal, effective way. Perhaps this is the reason for Hayek’s lifelong skepticism on the scope for policy in economies that emerge and form spontaneous orders. It is not for nothing that Harrod’s growth path was on a knife-edge and Wicksell’s cumulative process was a metastable dynamical system, located on the boundary defined by the basins of attractions of two stable elementary dynamical systems (one for the real economy, founded on a modified Austrian capital theory; the other for a monetary macroeconomy underpinned by a pure credit system). When policy discussions resort to reliance on special economic models, the same unease that causes disquiet when special interests advocate policies should be the outcome. Any number and kind of special dynamic economic models can be devised to justify almost anything - all the way from policy nihilism, the fashion of the day, to dogmatic insistence on rigid policies, justified on the basis of seemingly sophisticated, essentially ad hoc, models. Equally, studying patterns by simulating complex dynamical models and inferring structures, without grounding them on the mathematics of the computer is a dangerous pastime. A fortiori, suggesting policy measures on the basis of such inferred structures is doubly dangerous. Nothing in the formalism of the mathematics underlying the digital computer, the vehicle in which such investigations are conducted, and simulations by it, justifies formal inferences on implementable effective policies. Impossibility and undecidability results do not mean paralysis. Arrow’s impossibility theorem did not mean that democratic institution design was abandoned forever; Sen’s theory of the impossibility of a Paretian liberal did not forbid individual advocacy along lines that could only be interpreted as a philosophy of a Paretian liberal; Rabin’s powerful result that even though there are determined classical games, it is not possible to devise effective instructions to guide the theoretical winner to implement a winning strategy has not meant that game theory cannot be a useful guide to policy. Similarly, the impossibility of an effective theory of policy does not mean that the poets in our profession cannot devise enlightened policies that benefit a complex economy. I can do no better to illustrate this at-
titude than to repeat a ‘Keynes story’ that was told, almost with uncharacteristic awe, by the redoubtable Bertrand Russell:
On Sunday, August 2, 1914, I met [Keynes] hurrying across the Great Court of Trinity. I asked him what the hurry was and he said he wanted to borrow his brother-in-law’s motorcycle to go to London. ‘Why don’t you go by train’, I said. Because there isn’t time, he replied. I did not know what his business might be, but within a few days the bank rate, which panic-mongers had put up to ten per cent, was reduced to five per cent. This was his doing. (Russell, Autobiography, vol. 1, pp. 68-69).
There are similar stories about Lindahl’s intuition on policy relating to the bank rate. Keynes and Lindahl did not rely on mechanical deductions from formal mathematical models of the economy. Perhaps the growth of the complexity of economies calls forth more than intuition based on a thorough familiarity with the institutions of an economy and its behavioural underpinnings, the kind of familiarity a Keynes or a Lindahl possessed. There is, however, no question that the relevant knowledge and its manifestation in policy actions could have resulted from recursive procedures. Poetry is not an algorithmic endeavour - either in its creation or in its appreciation; nor is policy, especially in a complex economy. Essentially the main message of this seemingly negative paper is that the justification for policy cannot be sought in effective formalisms. One must resort to poetry and classical political economy, i.e., rely on imagination and compassion, for the visions of policies that have to be carved out to make institutions locate themselves in those metastable configurations that are defined by the boundaries in which dynamical systems capable of universal computation get characterised. This is, essentially, the enlightened message that I inferred from the Brock-Colander discussion of policy in a complex economy, the paper that guided my thoughts in framing the questions posed in this paper. I am not sure the answers will be welcomed by either Brock-Colander or Foley, the other thoughtful prop that was my inspiration for the framework and contents of this paper.
References Beeson, Michael J. (1985): Foundations of Constructive Mathematics, SpringerVerlag Berlin, Heidelberg New York.
Brin, Michael, and Garrett Stuck (2002): Introduction to Dynamical Systems, Cambridge University Press, Cambridge. Brock William A, and David Colander (2000): Complexity and Policy, in The Complexity Vision and the Teaching of Economics edited by David Colander, Chapter 5, pp. 73-96, Edward Elgar, Cheltenham. Cooper, S. Barry (2004): Computability Theory, Chapman & Hall/CRC, Boca Raton and London. Foley, Duncan K. (2003): Unholy Trinity: Labour, Capital and Land in the New Economy, Routledge, London. Fontana, Walter, and Leo W. Buss (1996): The Barrier of Objects: From Dynamical Systems to Bounded Organizations, in Barriers and Boundaries edited by J. Casti and A. Karlqvist, pp. 56-116; Addison-Wesley, Reading, MA. Hirsch, Morris W., and Stephen Smale (1974): Differential Equations, Dynamical Systems and Linear Algebra, Academic Press, New York and London. Kydland, Finn, and Edward C Prescott (1980): A Competitive Theory of Fluctuations and the Feasibility and Desirability of Stabilization Policy, in Rational Expectations and Economic Policy edited by Stanley Fischer, Chapter 5, pp. 169-198, The University of Chicago Press, Chicago and London. Lorenz, Edward N. (1963): Deterministic Nonperiodic Flow, Journal of Atmospheric Sciences, Vol. 20, pp.130-141. May, Robert M. (1976): Simple Mathematical Models with Very Complicated Dynamics, Nature, Vol. 261, June, 10; pp. 459-467. Myrdal, Gunnar (1934): Finanspolitikens Ekonomiska Verkningar, Statens Oflentliga Utredningar, 1934:1, Socialdepartmentet, Stockholm. Pour-El, Marian Boykan, and Ian Richards (1979): A Computable Ordinary Differential Equation which Possesses no Computable Solution, Annals of Mathematical Logic, Vol. 17, pp. 61-90. Rogers, Hartley Jr. (1967): Theory of Recursive Functions and Effective Computability, MIT Press, Cambridge, MA. Ruelle, David, and Floris Takens (1971): On the Nature of Turbulence, Communications in Mathematical Physics, Vol. 20, pp.167-92 (and Vol. 23, pp. 34344). Russell, Bertrand (1967 [1975]): Autobiography - Volume 1, George Allen & Unwin (Publishers) Ltd., 1975 (one-volume paperback edition). Smale, Steve (1967): Differentiable Dynamical Systems, Bulletin of the American Mathematical Society, Vol. 73, pp. 747-817. Sonnenschein, Hugo (1972): Market Excess Demand Functions, Econometrica, Vol. 40, pp. 549-563. Turing, Alan M. (1952): The Chemical Basis of Morphogenesis, Philosophical Transactions of the Royal Society, Series B, Biological Sciences, Vol. 237, Issue 641, August, 14; pp. 37-72. Uzawa, Hirofumi (1962): Walras’ Existence Theorem and Brouwer’s Fixed Point Theorem, The Economic Studies Quarterly, Vol. 8, No. 1, pp. 59-62. von Neumann, John (1966): Theory of Self-Reproducing Automata, Edited and completed by Arthur W. Burks, University of Illinois Press, Urbana, Illinois, USA.
____________________________________________________________
Implications of Scaling Laws for Policy-Makers
M. Gallegati, A. Kirman, and A. Palestrini
1. Introduction
The industrial dynamics literature has shown the existence of some stylized facts regarding firms' distribution, among which a very important one, for the reasons discussed in the paper, is the scaling law of firms' size (Axtell, 2001)1. Such distributions seemed, to economists and statisticians, a strange object. In fact, if the rate of growth of incomes is moderately correlated and the variance exists, by the central limit theorem, they would expect an approximately lognormal distribution. This statistical argument regarding the limiting distribution of multiplicative stochastic processes in economics was explicitly formulated in the pioneering work of Gibrat (1931) in industrial dynamics, in which he claims that, if the law of proportional effect holds - i.e. the firm's growth rate is independent of the firm's size - the distribution of firms' size must be right skewed and must be a member of the lognormal family. Ijiri and Simon (1964) showed that the lognormal was not the only asymptotic distribution consistent with Gibrat's law. Slightly modifying Gibrat's model, they were able to obtain a skewed firms' size distribution of the Yule type.
——————
1 The analysis of such distributions and their importance in economics takes us back to the famous 1897 work of the Italian economist Vilfredo Pareto in which he discovered that the distribution of income, above a certain threshold y0, follows a power law behavior. That is, the probability of observing an income equal to or greater than y is proportional to y to the power of −α, with the α exponent close to 1.5.
More recently Axtell (2001) and Gaffeo et al. (2003)2 have found that the distribution of firms' size follows a Zipf or power law instead of a lognormal distribution3. Moreover, Stanley et al. (1996) and Amaral et al. (1997) have found that the growth rate of firms' output, g^s, follows a Laplace probability density function. To explain this puzzle the literature has followed two lines of research. The first one is theoretical and focuses only on the statistical properties of the link between the distribution of the level of activity and that of the rates of change. For instance, Reed (2001) shows that independent rates of change do not generate a lognormal distribution of firms' size if the time of observation of units' characteristics is not deterministic but is itself a random variable following approximately an exponential distribution. In this case, even if Gibrat's law holds true at the individual level, units' characteristics will converge to a double Pareto distribution. Palestrini (2005) shows the exact conditions under which the lack of characteristic scale of firms' size implies a Laplace distribution for firms' growth rate. The second line of research stresses the importance of non-price interaction among firms with multiplicative shocks. For example, Delli Gatti et al. (2005) show the appearance of scaling phenomena in an industry composed of a large number of heterogeneous interacting agents. Bottazzi and Secchi (2003) obtain a Laplace distribution of firms' growth rates within the model put forward by Ijiri and Simon (1977), relaxing the assumption of independence of firms' opportunities to grow from firms' size. In this paper we want to explore the economic policy implications of such findings. Section 2 describes the importance of power law behaviors in explaining business cycles and economic growth. Section 3 presents our conclusions.
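The Reed (2001) mechanism recalled above is easy to reproduce numerically: Gibrat-type growth observed at an exponentially distributed age produces power-law tails, although the size at any fixed age is lognormal. The drift, volatility and entry rate used below are illustrative values, not estimates, and the tail-index calculation is a rough Hill-type statistic.

```python
# A quick numerical sketch of Reed's (2001) mechanism: geometric Brownian
# growth observed at an exponentially distributed age T yields power-law
# (double-Pareto) tails, although size at any *fixed* age is lognormal.
# Drift mu, volatility sigma and the entry rate lam are illustrative values.
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma, lam, s0 = 200_000, 0.03, 0.25, 0.05, 1.0

T = rng.exponential(1.0 / lam, size=n)                  # random firm ages
Z = rng.standard_normal(n)
S = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

# Hill-type estimate of the upper-tail exponent from the top 1% of sizes.
tail = np.sort(S)[-n // 100:]
alpha_hat = 1.0 / np.mean(np.log(tail / tail[0]))
print(f"estimated upper-tail exponent (random ages): {alpha_hat:.2f}")

# For comparison: the same statistic on sizes all observed at the fixed age
# 1/lam, i.e. a lognormal benchmark with a thinner upper tail.
S_fixed = s0 * np.exp((mu - 0.5 * sigma**2) / lam + sigma * np.sqrt(1 / lam) * Z)
tail_fixed = np.sort(S_fixed)[-n // 100:]
alpha_fixed = 1.0 / np.mean(np.log(tail_fixed / tail_fixed[0]))
print(f"same statistic at the fixed age 1/lam:       {alpha_fixed:.2f}")
```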
2. Scaling Laws and Economics The importance of power law probability functions resides in (1) the lack of a characteristic scale: i.e., the occurrence of rare and frequent events is governed by the same law; (2) power law behavior in the tail of a distribution is a property characterizing stable distributions: i.e., the distribution of —————— 2
But see also Ijiri and Simon, 1977.
3 Even though this result seems to hold true only in the aggregate but not at a sector level (Bottazzi-Secchi, 2001). This is still an open problem because the efficient estimate of scaling law distributions needs a number of observations not available at disaggregate levels.
the sum of identical and independent random variables belongs to the same family. Lévy (1925) proved that, even though it is not possible to write the analytical class of stable distributions, it is possible to analyze it in terms of its characteristic function with four parameters, M(t), whose logarithm is

ln M(t) = iμt − γ|t|^α [1 − iβ sgn(t) tan(πα/2)],   α ≠ 1
ln M(t) = iμt − γ|t| [1 + iβ sgn(t) (2/π) ln|t|],   α = 1

where 0 < α ≤ 2, γ is a positive scale factor, μ is a real number, and β is an asymmetry parameter ranging from −1 to 1. Expanding M(t) in Taylor series it is possible to show (Mantegna and Stanley, 2000) that the probability density function of a Lévy random variable x, in the tails, is proportional to x to the power of −(1 + α). When α = 2, β = 0 and γ = σ²/2 the distribution is Gaussian with mean μ and variance σ². The Gaussian family is the only member of the Lévy class for which the variance exists. If economic phenomena are characterized by stable distributions4, except for a very special case, it is not possible to apply the standard limit theorems (Gabaix, 2005). The existence of second moments, with only idiosyncratic shocks hitting firms' economic activity, implies no aggregate macro fluctuations. In fact, without aggregate shocks, the variance of the mean output of N firms is less than the maximum variance of firms' output, say σ²max, divided by N: a quantity that, for N going to infinity, vanishes. On the contrary, stable distributions with α < 2 do not need aggregate shocks to produce fluctuations because of the non-existence of second moments. These considerations are amplified when economic relationships are non-linear and characterized by direct interaction. These arguments point out the limits of mainstream economics that is based on reductionism, i.e. the methodology of classical mechanics. Such a view is coherent if the law of large numbers holds true, i.e.:
• the functional relationships among variables are linear;
• there is no direct interaction among agents;
• idiosyncratic shocks average out.
Since non-linearities are pervasive, mainstream economics generally adopts the trick of linearizing functional relationships. Moreover, agents are supposed to be all alike and are not supposed to interact. Therefore, an
——————
4
The first one to conjecture it explicitly was Mandelbrot (1960).
economic system can be conceptualized as consisting of several identical and isolated components, each one being a representative agent (RA). The optimal aggregate solution can be obtained by means of a simple summation of the choices made by each optimizing agent. Moreover, if the aggregate is the sum of its constitutive elements, its dynamics cannot but be identical to that of each single unit5. The ubiquitous RA, however, is at odds with the empirical evidence (Stoker, 1993), and this is a major problem in the foundation of general equilibrium theory (Kirman, 1992) and is not coherent with many econometric investigations and tools (Forni and Lippi, 1997). The search for natural laws in economics does not necessarily require the adoption of the reductionist paradigm. Scaling phenomena and power law distributions are a case in point. If a scaling behavior exists, then the search for universality can be pushed very far. But how much room is there for economic policies? First, economic policies based on the RA framework may provide wrong answers (Kirman, 1992)6. Second, power law distributions imply a high level of concentration (in which many very small firms coexist with few very big firms) that needs to be controlled. Third, scaling laws are stylized facts that do not hold true in a quantitative sense but only in a qualitative way. For example, the scale parameter α may not be stable but may vary in time. This is important because of the negative relationship between the scale exponent and the strong level of firms’ concentration that this kind of distribution implies. Furthermore, It will be shown below that a small variation in the scale parameter implies a large variation in the concentration index and in the aggregate rate of growth. Defining PL( y;α ) = Pr (Y ≤ y) the power law distribution function of an economic quantity y , the concentration in a market - expressed for example as the relative importance of the bigger ten firms - is clearly related to the tail of the distribution above a certain threshold k , T ( k ,α ) —————— 5
Furthermore, even in cases in which heterogeneity is considered, mainstream economics, in order to aggregate, needs to know the values of the average and the standard deviation of the agents' size. If there's scaling, none of these conditions will be guaranteed. What can assure us that a policy that increases the variance of, say, firms' sizes would be canceled out by the same opposite-sign policy? If there is scaling (or path dependence), then it will happen only by chance.
6 The literature has shown that the RA cannot cope, by definition, with redistributive policy and with dynamics. Comparative statics is also a straitjacket with the RA because of the lack of stability and the fact that the RA's preferences do not represent aggregate preferences.
T(k, α) = ∫_k^∞ dPL(y; α)

that, for power law economic variables, is proportional to the following

T(k, α) ∝ ∫_k^∞ y^−(1+α) dy.

From the above equation we can compute the derivative of the tail T(k, α) with respect to α. That is

dT(k, α)/dα ∝ d/dα ∫_k^∞ y^−(1+α) dy = −∫_k^∞ y^−(1+α) ln(y) dy ≤ 0.
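A quick numerical companion to these expressions (sample size, threshold and the grid of α values are arbitrary): the integral has the closed form k^−α/α, so for k ≥ 1 the tail mass falls as α rises, and the share of a simulated Pareto sample held by its largest decile, the concentration reading used in the text, falls with it.

```python
# Numerical companion to the tail expressions above: T(k, alpha) has the
# closed form k**(-alpha)/alpha, so for k >= 1 it shrinks as alpha grows.
# The top-decile share of a simulated Pareto(x_min = 1) sample gives the
# concentration reading mentioned in the text; all values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
k, n = 2.0, 100_000

for alpha in (1.1, 1.5, 2.0):
    tail_mass = k**(-alpha) / alpha                   # closed form of T(k, alpha)
    sample = (1.0 - rng.random(n)) ** (-1.0 / alpha)  # Pareto draws by inversion
    top = np.sort(sample)[-n // 10:]                  # biggest 10% of units
    share = top.sum() / sample.sum()
    print(f"alpha={alpha:.1f}  T(k={k})={tail_mass:.3f}  top-10% share={share:.2f}")
```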
The problem of sensitivity between the concentration index and parameter α was studied by Gini (1922). He investigated the puzzle of the low variation in the scale parameter across different countries and economic systems7 and the huge difference in concentration in the same countries/economic systems. Having sorted the individual characteristics xi of n economic units (i.e., xi −1 < xi ) the ratio
(x_{n−m+1} + x_{n−m+2} + … + x_n)/m

is the mean value of the m biggest economic units. The obvious inequality

(x_1 + x_2 + … + x_n)/n < (x_{n−m+1} + x_{n−m+2} + … + x_n)/m

implies that

(x_{n−m+1} + x_{n−m+2} + … + x_n)/(x_1 + x_2 + … + x_n) > m/n

In other terms, the fraction of total x owned by the biggest m economic units (first member of the inequality above) is greater than the fraction of units. Gini proposed, as an index of concentration, the exponent δ so that the inequality becomes an equality, that is
—————— 7
In fact, in the empirical analysis the income scale parameter seems to be around 1.5 and often in the range (1.1,1.8) (Guarini and Tassinari, 2000).
((x_{n−m+1} + x_{n−m+2} + … + x_n)/(x_1 + x_2 + … + x_n))^δ = m/n

This parameter depends on m, so that a better index is the average value of the parameter obtained with different m. He proved that, under certain conditions and with x_n going to infinity and α > 1, between α and δ there is the following relationship

δ = α/(α − 1).

The above asymptotic equation shows that a small variation of α implies a large variation in the concentration index; i.e., when α fluctuates between 1 and 2, δ goes from 2 to ∞. Because of the negative relationship and the importance of the scale parameter in economic fluctuations, the economic policies that reduce the concentration in the market (anti-trust, etc.) assume a central role for the policy-maker. The above considerations regarding stable distributions and the argument in Gabaix (2005) suggest that an important source of aggregate variation is the presence of idiosyncratic shocks at a micro level. This main message of the scaling law literature is revealed in the Delli Gatti et al. (2005) framework in which there are no aggregate shocks by construction. At any time period t, the supply side of the economy consists of finitely many competitive firms indexed with i = 1, …, Nt, each one located on an island. The total number of firms (hence, islands) Nt depends on t because of endogenous entry and exit processes. Let the i-th firm use capital (Kit) as the only input to produce a homogeneous output (Yit) by means of a linear production technology, Yit = φKit. Capital productivity (φ) is constant and uniform across firms, and capital stock never depreciates. The demand for goods in each island is affected by an iid idiosyncratic real shock. Since arbitrage opportunities across islands are imperfect, the individual selling price in the i-th island is the random outcome of a market process around the average market price of output Pt, according to the law Pit = uitPt, with expected value E(uit) = 1 and finite variance. By assumption, firms are fully rationed on the equity market, so that the only external source of finance at their disposal is credit. The balance sheet identity implies that firms can finance their capital stock by resorting either to net worth (Ait) or to bank loans (Lit), Kit = Ait + Lit. Under the assumption that firms and banks sign long-term contractual relationships, at each t, debt commitments in real terms for the i-th firm are ritLit, where rit
is the real interest rate8. If, for the sake of simplicity, the latter is also the real return on net worth, each firm incurs financing costs equal to rit(Lit + Ait) = ritKit. Total variable costs are proportional to financing costs9, gritKit, with g > 1. Therefore, profit in real terms (πit) is:
πit = uitYit − gritKit = (uitφ − grit)Kit

and expected profit is E(πit) = (φ − grit)Kit. In this economy, firms may go bankrupt as soon as their net worth becomes negative, that is Ait < 0. The law of motion of Ait is:

Ait = Ait−1 + πit

that is, net worth in the previous period plus (minus) profits (losses). The definition of profit and the law of motion of the net worth imply that the bankruptcy state occurs whenever:

uit < (1/φ)(grit − Ait−1/Kit) ≡ ūit.
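A minimal simulation of the mechanism just described, with made-up parameter values, is sketched below. Capital is tied to net worth through a fixed leverage ratio and the interest rate is held constant, both simplifications introduced here rather than parts of the model above; exiting firms are replaced one-for-one by small entrants. The point is qualitative: multiplicative price shocks plus the bankruptcy 'floor' generate a strongly right-skewed, concentrated distribution of net worth.

```python
# A made-up-parameter sketch of the mechanism above: multiplicative price
# shocks u_it, profits pi_it = (u_it*phi - g*r)*K_it, net worth
# A_it = A_it-1 + pi_it, bankruptcy when A_it < 0, replacement by a small
# entrant.  Capital follows a constant leverage rule K = leverage*A and the
# interest rate is constant -- simplifying assumptions of this sketch only.
import numpy as np

rng = np.random.default_rng(2)
N, T = 5_000, 2_000
phi, g, r, leverage, a_entry = 0.1, 1.05, 0.1, 10.0, 1.0

A = np.full(N, a_entry)
for _ in range(T):
    u = rng.uniform(0.0, 2.0, size=N)    # idiosyncratic price shocks, E(u) = 1
    K = leverage * A                     # assumed financing rule (not in model)
    A = A + (u * phi - g * r) * K        # profit or loss accrues to net worth
    bankrupt = A < 0
    A[bankrupt] = a_entry                # exit and one-for-one entry

A_sorted = np.sort(A)
print("mean / median net worth:", round(A.mean(), 3), "/", round(np.median(A), 3))
print("share of total net worth held by the top 1% of firms:",
      round(A_sorted[-N // 100:].sum() / A.sum(), 2))
```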
In this model, the presence of multiplicative (price) shocks and the "floor" described by the bankruptcy condition generates scaling laws regarding firms' size distribution. As specified above, in the model there are no aggregate shocks. All aggregate fluctuations derive from idiosyncratic shocks. These results may be shown using the Gabaix (2005) "island" representation of the Delli Gatti et al. (2005) framework. The net worth dynamic equation may be written as
∆Ait+1 = πit(uit)Ait

In other terms, the random variable describing a firm's profit is, from the equation defining profit, a function of relative price (uit) volatility (σit). Total equity of the economy is defined as:

At = Σ_{i=1}^{Nt} Ait
—————— 8
It follows that the credit lines periodically extended by the bank to each firm are based on a mortgaged debt contract. 9 As a matter of example, one can think of retooling and adjustment costs as being sustained each time the production process starts.
and the aggregate growth is

∆At+1/At = Σ_{i=1}^{Nt} πit(uit) (Ait/At)

Since the shocks are idiosyncratic, this implies that the volatility of aggregate growth is

σAt = ( Σ_{i=1}^{Nt} σ²iπt (Ait/At)² )^{1/2}

The above equation reveals that the two main sources of aggregate volatility are the magnitude of micro shocks and the concentration of firms. In order to understand this message more clearly, let us assume that the volatilities of the micro shocks are identical, (σit = σt) ⇒ (σiπt = σπt). This implies that we can write the aggregate volatility as a product of two terms

σAt = σπt ( Σ_{i=1}^{Nt} (Ait/At)² )^{1/2} = σπt ht
where ht is the Herfindahl of the economy that multiplies firms' volatility. In other terms, the magnitude of the idiosyncratic price shock is multiplied by the concentration index to produce aggregate fluctuations. A similar argument in Gabaix (2005) shows the importance of reducing firms' volatility of productivity growth, suggesting the use of economic policies able to control for individual volatility. An important example is patent policies reducing (at least the rate of growth of) legal protection of intellectual property. The economic growth literature (see Grossman and Helpman, 1991; Aghion, Harris, Howitt, and Vickers, 2001) has shown that stronger patent protection can in some cases reduce the overall pace of technological change and weaken the incentive to perform R&D. In other terms they show that a certain amount of imitation increases the flow of innovation. In the present work we stress that the same policies, when able to reduce micro-volatility, may also stabilize the aggregate quantities. The presence of firms' size scaling laws affects economic growth of the market in a heavy way even in a situation in which firms' growth distribution seems to possess all the moments in contrast with firms' size. In fact, as mentioned above, Stanley et al. (1996) and Amaral et al. (1997) have found that the growth rate of firms, g^s, follows a Laplace probability density function. But the rate of growth of the market, say gmt, is defined by
gmt = (Σ_{i=1}^{N} Sit)/(Σ_{i=1}^{N} Sit−1) − 1

where N is the number of firms. The above equation may also be written as

gmt = Σ_{i=1}^{N} git Sit−1 / (Σ_{i=1}^{N} Sit−1)

that, normalizing the total size to 1, reduces to

gmt = Σ_{i=1}^{N} git Sit−1

that is, an average of firms' growth rates weighted by their size. In other terms, the average firms' growth is the product of two distributions, one of which (firms' size) may not possess all the moments10. A more rigorous derivation of the aggregate rate of growth can be found using Gabaix's theorem (2005), which links the aggregate rate of growth to the scaling puzzle discovered by Stanley et al. (1996), i.e. firms' volatility seems to depend negatively on their size and follows a power law relationship11

σi(S) = k Si^−β
where k is a constant and with the scaling exponent β empirically close to 0.15.
Theorem (Gabaix 2005): in an island economy, if firms' volatility follows the above equation and if the size follows a power law distribution, then GDP fluctuations follow a Lévy distribution with exponent

min(1/2, (α + β − 1)/α)

——————
10 Scaling laws and the consequent high level of concentration that arises "naturally" in multiplicative stochastic economic phenomena imply that a concept like "the average rate of growth" may have no meaning. Suppose there is one very large firm which is growing, while all the others are having negative growth. On the average the growth rate could be positive at an aggregate level, while actually almost all firms have a higher probability of going bankrupt and as a result market concentration increases.
11 This result may seem in contrast with the above analysis showing the importance of reducing the concentration in the market in order to reduce aggregate fluctuation because big firms have smaller volatility. But this effect is weaker than what would happen if a firm of size S were composed of S independent units of size 1, which would predict a scale exponent equal to 1/2.
that implies the possibility that some moments may not exist. The theorem implies that the aggregate rate of growth is characterized by heavy tails, which is a very important fact when considering financial conditions in a firm's decision process. As suggested in Delli Gatti et al. (2005), the presence of scaling law dynamics makes it necessary to use policies able to control for financial crises of the economy for two reasons: (a) the probability of bankruptcy may cause large variation in big firms, determining not only a change in aggregate volatility but also in the growth trend; (b) because of the probability of bankruptcy, monetary policies act in an asymmetric way. Let us assume that the Central Bank keeps credit tight. As a consequence, the rate of interest rises, forcing the more financially fragile firms to go bankrupt. Moreover, because of the possible domino effects or bankruptcy avalanches, it is quite unrealistic to assume that the opposite monetary policy action would restore the status quo ante.
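The decomposition σAt = σπt ht above can be checked in a few lines: with equal-sized firms the Herfindahl term is N^−1/2 and idiosyncratic shocks wash out as N grows, whereas with Zipf-like sizes (here a Pareto sample with an illustrative exponent of 1.1) it stays orders of magnitude larger, so the same micro shocks survive aggregation.

```python
# A small check of sigma_At = sigma_pi * h_t, with h_t = sqrt(sum_i w_i^2)
# and w_i = A_it / A_t.  Equal sizes give h_t = N**(-1/2); Pareto sizes with
# an illustrative exponent of 1.1 keep h_t large even for big N.
import numpy as np

rng = np.random.default_rng(3)
sigma_pi = 0.10                    # common volatility of firms' profit shocks

for N in (1_000, 10_000, 100_000):
    equal = np.full(N, 1.0 / N)                       # equal-size benchmark
    sizes = (1.0 - rng.random(N)) ** (-1.0 / 1.1)     # Pareto(alpha = 1.1) sizes
    zipf = sizes / sizes.sum()                        # normalised weights A_i/A
    h_equal = np.sqrt((equal**2).sum())
    h_zipf = np.sqrt((zipf**2).sum())
    print(f"N={N:>7}  sigma_A (equal sizes) = {sigma_pi * h_equal:.5f}"
          f"   sigma_A (Zipf sizes) = {sigma_pi * h_zipf:.5f}")
```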
3. Conclusions In this paper, we have emphasized the straitjacket of the representative agent approach for economic policies. In particular, Kirman (1992) shows that such a theoretical device is badly equipped for analyzing redistributive policies and, above all, the analysis may be wrong even regarding the qualitative aspect of aggregate phenomena since there are situations where it is not suitable to apply the reductionist principle. A growing literature is now emphasizing the discovery of scaling phenomena in an economic system. In this paper we show that this finding implies a different approach in analyzing aggregate phenomena. In particular, we have shown that in order to stabilize, and to promote growth, the policy-maker should act at three micro-meso levels (1). Since the main source of aggregate fluctuations is idiosyncratic volatility (prices, productivity, etc.), it seems that economic policies able to control for individual volatility are very important with regard to aggregate fluctuations. An example for such policies are those that reduce (at least the rate of growth of) legal protection of intellectual property. The importance of these policies is in line with the economic growth literature (see Grossman and Helpman, 1991; Aghion, Harris, Howitt, and Vickers, 2001) showing that stronger patent protection can, in some cases, reduce the overall pace of technological change and weaken the incentive to perform R&D. In other terms they show that a certain amount of imitation increases the flow of innovation
(2). Economic policies that reduce the concentration in the market (antitrust, etc.) assume a central role for the policy-maker because aggregate fluctuations depend heavily on them. Power law distribution implies a high level of concentration in which many very small firms coexist with few very big ones. This argument suggests that microeconomic (idiosyncratic) shocks of very big firms are, in a certain sense, important aggregate shocks (3). The Delli Gatti et al. (2005) pioneer line of research suggests the importance - in situations characterized by a high level of concentration - of using economic policies able to control for financial crises in an economy for two reasons: (a) the probability of bankruptcy may cause large variation in big firms, causing not only a change in aggregate volatility but also in growth trend; (b) the probability of bankruptcy monetary policies may act in an asymmetric way.
References Aghion P., Harris C., Howitt P., and Vickers J. (2001): Competition, Imitation and Growth with Step-by-Step Innovation, Review of Economic Studies, 68, pp. 121-144. Amaral L., Buldyrev S., Havlin S., Leschhorn H., Maas P., Salinger M., Stanley E., and Stanley M. (1997), Scaling Behavior in Economics: I. Empirical Results for Company Growth, Journal de Physique, 7, pp. 621-633. Axtell R. (2001): Zipf Distribution of U.S. firm sizes, Science, 293, pp. 18181820. Bottazzi G., and Secchi A. (2003): Why Are Distribution of Frms’ Growth Rates Tent-shaped? Economic Letters, 80, pp. 415–420. Carroll C. (2001): Requiem for the Representative Consumer? Aggregate Implications of Microeconomics Consumption Behavior, American Economic Review, 90, pp. 110-115. Delli Gatti D., Di Guilmi C., Gaffeo E., Gallegati M., Giulioni G. and Palestrini A. (2005): A New Approach to Business Fluctuations: Heterogeneous Interacting Agents, Scaling Laws and Financial Fragility, Journal of Economic Behavior and Organization, 56, pp. 489-512. Forni M., and Lippi M. (1997): Aggregation and the Microfoundations of Dynamic Macroeconomics. Oxford University Press, Oxford. Gabaix X. (2005): Power Laws and the Granular Origins of Aggregate Fluctuations, mimeo M.I.T. and NBER. Gaffeo E., Gallegati M., and Palestrini A. (2003): On the Size Distribution of Firms. Additional Evidence from the G7 Countries. Physica A , 324, pp. 17123. Gibrat R. (1931): Les inégalités économiques; applications: aux inégalités des richesses, à la concentration des entreprises, aux populations des villes, aux statistiques des familles, etc., d’une loi nouvelle, la loi de l’effet proportionnel. Paris: Librairie du Recueil Sirey.
Gini C. (1922): Indici di concentrazione e di dipendenza, Biblioteca dell’Economista, 20. Grossman G., and Helpman E. (1991): Quality Ladders and Product Cycles, Quarterly Journal of Economics, 106, pp. 557-586. Guarini R., and Tassinari F. (2000): Statistica Economica, Il Mulino, Bologna. Hildebrand W., and Kirman A. (1988): Equilibrium Analysis. North-Holland, Amsterdam. Ijiri, Y., & Simon, H. (1964): Business firm growth and size. American Economic Review, 54, 77-89. Ijiri Y., and Simon, H. (1977): Skew Distributions and the Sizes of Business Firms, North Holland, Amsterdam. Kirman A. (1992): Whom or What Does the Representative Individual Represent? Journal of Economic Perspective, 6, pp. 117-136. Lebwel A. (1989): Exact Aggregation and the Representative Consumer, Quarterly Journal of Economics, 104, pp. 621-633. Lévy P. (1925): Calcul des probabilities. Gauthiers-Villars, Paris. Mandelbrot B.B. (1960): The Pareto-Lévy Law and the Distribution of Income, International Economic Review, 1:2, pp. 79-105. Mantegna R.N., and Stanley H. Eugene (2000): An Introduction to Econophysics: Correlations and Complexity in Finance. Cambridge University Press, Cambridge UK. Palestrini A. (2005): Analysis of Industrial Dynamics: the Relationship between Firms’ Size and Growth Rate, Working Paper. Pareto V. (1897): Cours d’Économie Politique. Macmillan, London. Reed W. (2001): The Pareto, Zipf and Other Power Laws, Economics Letters, 74, pp. 15-19. Stanley M. (1997): Scaling Behavior in Economics: I. Empirical Results for Company Growth, Journal de Physique, 7, pp. 621-633. Stanley M., Amaral L., Buldyrev S., Havlin S., Leschhorn, H., Maass P. Salinger M.A., and Stanley H. (1996): Scaling Behaviour in the Growth of Companies, Nature, 379, pp. 804–806. Stoker T. (1993): Empirical Approaches to the Problem of Aggregation Over Individuals, Journal of Economic Literature, 21, pp. 1827-1874.
____________________________________________________________
Robust Control and Monetary Policy Delegation
G. Diana and M. Sidiropoulos
1. Introduction In the recent literature on optimal monetary policy, policymakers are assumed to know the true model of the economy and observe accurately all relevant variables. The sources and properties of economic disturbance are also taken to be known. Uncertainty in this case arises only due to the unknown future realisations of these disturbances. In this context, “uncertainty” means the realisation of an event whose true probability distribution is known. Pure uncertainty, where the state space of outcomes is known but one is unable to assign probabilities, has largely been ignored. In practice, the policymaker’s choice is made in the face of tremendous uncertainty about the true structure of the economy, the impact policy actions have on the economy, and even about the current state of the economy. The policymaker is therefore unsure about his model, in the sense that there is a group of approximate models that he also considers as possibly true. Because uncertainty is pervasive, it is important to understand how alternative policies work when the policymaker cannot accurately observe important macro variables or when he employs a model of the economy that is incorrect in unknown ways. The resulting problem is one of robust control, in the sense of Hansen and Sargent (2004), where the objective is to choose a rule that will work under a range of different model specifications. The notion that policy decisions may be more robust if based on systematically distorted models of the economy is a key implication of the recent research on robust contro11. —————— 1
For an application of the robust control framework to optimal monetary rules, see Walsh (2004). The robust control framework is also applied to other topics in recent research. For example, in the field of environmental economics, see Roseta-Palma and Xepapadeas (2004) for an application to water management decisions.
In this context, it is particularly important to search for monetary policies that are able to deliver good macroeconomic outcomes even when policymakers are uncertain with regard to the true structure of the economy. The purpose of this paper is to show how a robust control framework of model misspecification doubts can be applied to the government’s problem of monetary policy delegation to a “conservative” central banker. More precisely, we try to give an answer to the question whether model uncertainty can affect the government’s optimal commitment to fight inflation, as well as its will to delegate the conduct of monetary policy to a “conservative” central banker. We proceed from the assumption that the government has a model of the economy that it believes is a reasonable approximation to the true model but that this approximating model may be subject to misspecification. Rather than viewing the set of possible misspecifications as simply random, the policymaker assumes “nature” is an evil agent who will choose the misspecification that makes the policymaker look as bad as possible. In such an environment, we find that the government’s robust choice reveals the emergence of a precautionary behaviour in the case of uncertainty about the true structure of the economy, reducing its willingness to delegate monetary policy to a “conservative” central banker in the sense of Rogoff’s classical 1985 article. The rest of the paper is organised as follows. Section 2 sets up a oneperiod model of monetary policy. Section 3 derives the discretionary equilibrium. Section 4 derives the optimal degree of conservativeness of the central banker. Section 5 summarises the main conclusions.
2. A one-period model of monetary policy The model used here is based on Rogoff’s (1985) game-theoretic model of monetary policy delegation in which the policymaker sets inflation in view of the following standard expectation-augmented Phillips curve: u = u ∗ − (π − π e ) + ε
(1)
where u is the unemployment rate, u* > 0 the natural rate of unemployment, π the inflation rate, π e the rationally expected inflation rate, and ε is a random variable with mean zero and variance 1. However, according to Hansen and Sargent (2004), we modify Rogoff's (1985) model by
assuming that the government views his model (equation 1) as an approximation. One way to represent the uncertainty about the process that governs the unemployment rate is to assume that the policymaker suspects that this process might actually be governed by

u = u* − (π − π e) + ε + h
(2)
where h is an unknown distortion to the mean of the shock, representing a possible specification error. However, the size of the distortion h must be bounded, as the policymaker has some information on the process. More precisely, we assume that the magnitude of the square of the specification error verifies:

h² ≤ η²
(3)
where the parameter η bounds the square of the government's specification error h². Restriction (3), together with equation (2), defines a set of models that the government considers as being possible outcomes. Preferences of the government are described by the following utility function:

Ug = −(1/2)(u² + π²)
(4)
where the government is assumed to stabilise unemployment and inflation around their target values, which are for simplicity fixed to zero. Following Rogoff (1985), monetary policy is delegated by the government to an independent or "conservative" central banker whose utility function is the following:

Ucb = −(1/2)[u² + (1 + φ)π²]
(5)
where φ > 0 is chosen by the government and is the optimal extra (relative) weight the central banker sets on inflation versus unemployment stabilisation2. We proceed now to the analysis of the central bank's policies using a backward induction analysis. We first derive the central bank's robust best response of π taking the private sector's setting of inflation expectations π e as fixed. Afterwards, we derive the government's decision about the optimal degree of central bank conservativeness. In
Rogoff (1985) demonstrates that the optimal extra (relative) weight the central banker sets on inflation is finite and strictly positive.
other words, the government chooses the optimal value of φ so as to attain satisfactory outcomes for all h² ≤ η².
3. The discretionary equilibrium under robust control
The uncertainty problem can be thought of as a zero-sum multiplier game between two players, where the central banker is maximising over π and "nature" is minimising over h. Then the problem can be written as:
max_π min_h  −(1/2)[u² + (1 + φ)π²] + (θ/2)h²
(6)
where both the minimisation and maximisation are subject to equation (2). The parameter θ is a fixed penalty parameter that reflects both the government and the central banker's desired degree of robustness. In other words, we assume that the government and the central banker share the same doubts about the accuracy of the model described by equation (1). Moreover, θ > 1 can be interpreted as a Lagrangian multiplier on constraint (3). The value θ = 1 is the breakdown point to be discussed later. The value for θ would be endogenous in the constrained Lagrangian, and it would be associated with the specific η value used in the constraint. The way the problem is written here, θ is chosen directly and the constraint is adapted accordingly. Note also that larger values of θ imply smaller sets of models, so that θ is an indicator of the precautionary behaviour of the authorities. In other words, the closer θ is to one, the more the government is unsure about the accuracy of the model it uses. In the opposite case, as θ → +∞, the government believes that its model is a good approximation of the true model of the economy. In the limit case, where θ = +∞, there is no misspecification and the government is convinced that the model it uses is the true one. From the first order conditions for π and h in problem (6), we obtain respectively the reaction functions:

π(θ) = θ(u* + ε + π e)/[θ + (1 + φ)(θ − 1)]
(7)

h(θ) = (1 + φ)(u* + ε + π e)/[θ + (1 + φ)(θ − 1)]
(8)
where π(θ) gives the central banker's (robust) best reaction function for setting π as a function of π e, while h(θ) determines the worst case model, given π e and the central banker's setting π(θ). Then, using
Then, using equation (7) and assuming rational expectations of the private sector (π^e = Eπ) yields

π^e(θ) = θ u* / [(θ − 1)(1 + φ)]    (9)
By substituting equation (9) into equation (7), the time-consistent rational expectations equilibrium inflation rate is readily found as

π(θ) = θ u* / [(θ − 1)(1 + φ)] + θ ε / [θ + (θ − 1)(1 + φ)]    (10)
From equation (10), it is clear that as θ → 1 the model breaks down, since the equilibrium inflation rate tends to infinity. Finally, substituting equations (8), (9) and (10) into equation (2), we obtain the equilibrium rate of unemployment

u(θ) = θ u* / (θ − 1) + θ(1 + φ) ε / [θ + (θ − 1)(1 + φ)]    (11)
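The behaviour of (10) and (11) in θ is easy to see numerically. The sketch below uses illustrative parameter values of our own choosing (u* = 1, ε = 0.2, φ = 0.5); it shows the blow-up as θ → 1 and the convergence to the no-misspecification limits derived below as θ grows.

```python
# Numerical illustration of the equilibrium rates (10)-(11); parameter values are ours.
u_star, eps, phi = 1.0, 0.2, 0.5

def pi_eq(theta):
    """Equilibrium inflation, equation (10)."""
    return theta/((theta - 1)*(1 + phi))*u_star + theta/(theta + (theta - 1)*(1 + phi))*eps

def u_eq(theta):
    """Equilibrium unemployment, equation (11)."""
    return theta/(theta - 1)*u_star + theta*(1 + phi)/(theta + (theta - 1)*(1 + phi))*eps

for theta in (1.05, 1.5, 2.0, 10.0, 1e6):
    print(f"theta = {theta:>9}: pi = {pi_eq(theta):8.3f}, u = {u_eq(theta):8.3f}")

# For comparison, the theta -> infinity limits (equations (12)-(13) in the text):
print("limits:", u_star/(1 + phi) + eps/(2 + phi), u_star + (1 + phi)*eps/(2 + phi))
```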
Using equations (10) and (11), we can see that when θ → +∞ there is no concern for model misspecification, since h(∞) = 0, and the standard rational expectations model arises. In this situation, the equilibrium inflation rate and unemployment rate are respectively given by

π(∞) = u* / (1 + φ) + ε / [1 + (1 + φ)]    (12)

u(∞) = u* + (1 + φ) ε / [1 + (1 + φ)]    (13)
Using equations (10) to (13), we establish the following proposition.
Proposition 1: If the approximating model is true, so that the authority’s concern about misspecification is misplaced, this concern causes the central banker to set both inflation and unemployment higher than if he knew the model for sure (i.e., θ = +∞ and hence h(∞) = 0). In other words, when the approximating model is correct, robust policies sacrifice macroeconomic performance.
Proof. The result follows directly from comparing equation (10) with (12) and equation (11) with (13). However, it can easily be shown that as the specification error increases (i.e., as the parameter θ decreases), the deterioration of macroeconomic performance and the government’s expected losses are lower under the robust policy (see Hansen and Sargent, 2004, chap. 5).
4. The optimal degree of conservativeness
We now consider the government’s optimal appointment of a conservative central banker. Since the government chooses the central banker optimally, φ must be solved for endogenously in order to maximise the government’s expected utility. Our objective is to analyse how the precautionary behaviour of the government, captured by θ, affects its optimal choice concerning the characteristics of the central banker, i.e. his degree of conservativeness φ. Thus, substituting equations (10) and (11) into the government utility function (4), we derive the government’s expected utility as a function of φ:

E(Ug) = −(θ²/2) { [1 + (1 + φ)²] (u*)² / [(θ − 1)²(1 + φ)²] + [1 + (1 + φ)²] ε² / [θ + (θ − 1)(1 + φ)]² }    (14)
Differentiating equation (14) with respect to φ, we obtain the following first-order condition:

∂E(Ug)/∂φ = 0 ⇔ f(φ; θ) = (1 + θφ)(θ − 1)²(1 + φ)³ / [θ + (θ − 1)(1 + φ)]² − (u*)² = 0    (15)
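Rather than working with the closed-form condition (15), the comparative statics can also be illustrated by maximising (14) directly over φ on a grid. The sketch below is ours: the parameter values are illustrative, and the ε² term in (14) is treated as the variance of the supply shock, normalised to one here purely as an assumption.

```python
# Grid-search illustration of the government's optimal choice of phi, maximising (14).
import numpy as np

u_star, var_eps = 1.0, 1.0                                # illustrative values (assumption)

def expected_Ug(phi, theta):
    """Expected government utility, equation (14)."""
    D = theta + (theta - 1.0)*(1.0 + phi)
    term_u = (1.0 + (1.0 + phi)**2) / ((theta - 1.0)**2 * (1.0 + phi)**2) * u_star**2
    term_e = (1.0 + (1.0 + phi)**2) / D**2 * var_eps
    return -0.5 * theta**2 * (term_u + term_e)

phi_grid = np.linspace(0.01, 20.0, 200_000)
for theta in (3.0, 5.0, 10.0, 50.0):
    phi_opt = phi_grid[np.argmax(expected_Ug(phi_grid, theta))]
    print(f"theta = {theta:5.1f}  ->  optimal phi ~ {phi_opt:.2f}")

# With these values the optimal phi rises as theta rises: a government that is less
# worried about misspecification appoints a more conservative central banker,
# which is the comparative static stated in Proposition 2 below.
```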
Finally, using equation (15), we establish the following proposition.
Proposition 2: The more the government desires a monetary policy that is robust to uncertainty about the true structure of the economy, the less it needs to appoint a conservative central banker with a high degree of inflation aversion.
Proof. From equation (15), and remembering that θ > 1, we can write

∂f/∂φ = θ(θ − 1)²(1 + φ)² [θ(1 + φ) + 3(1 + φθ) + (θ − 1)(1 + φ)²] / [θ + (θ − 1)(1 + φ)]⁴ > 0    (16)

and

∂f/∂θ = − (θ − 1)(1 + φ)³ [(θ − 1)(2 + φ)² − 2(1 + φ)] / [θ + (θ − 1)(1 + φ)]⁴ < 0    (17)

if

θ > 1 + 2(1 + φ) / (2 + φ)²    (18)
Note that inequality (17) holds only if condition (18) is satisfied. However, condition (18) is not restrictive, since it is satisfied as soon as θ > 1.5. Applying the implicit function rule, we obtain³:

∂φ/∂θ = − (∂f/∂θ) / (∂f/∂φ) > 0    (19)
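As a quick numerical check (ours, not the paper’s), the threshold in condition (18) can be evaluated for the values of φ used in footnote 3:

```python
# Threshold of condition (18): theta > 1 + 2*(1 + phi)/(2 + phi)**2.
for phi in (0.0, 1.0, 2.0):
    print(f"phi = {phi:.0f}:  theta > {1 + 2*(1 + phi)/(2 + phi)**2:.3f}")
# Prints 1.500, 1.444 and 1.375, matching the examples cited in footnote 3.
```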
According to Proposition 2, the government’s robust choice of φ reveals precautionary behaviour in the case of uncertainty about the true structure of the economy, reducing its willingness to delegate monetary policy to a “conservative” central banker. The intuition behind this proposition is that the more uncertain the government is about the true structure of the economy, the more reluctant it is to sacrifice macroeconomic performance. Consequently, when the structure of the economy is uncertain, it becomes optimal for the government to appoint a central banker characterised by a smaller degree of independence and conservativeness, in order to implement a more flexible monetary policy (in the sense of employment stabilisation) that will work well over a larger set of possible alternative models.
5. Conclusions
In this paper, building on Rogoff’s classic 1985 article, we examine how the “robust control” approach of Hansen and Sargent (2004) to model misspecification can be applied to the problem of monetary policy delegation to a “conservative” central banker. We show that the government’s robust choice reveals precautionary behaviour in the case of uncertainty about the true structure of the economy, reducing its willingness to delegate monetary policy to a “conservative” central banker with a high degree of inflation aversion. In other words, our analysis suggests that it is not optimal for a government to appoint a highly “conservative” central banker when there is uncertainty about the true structure of the economy.
——————
3 In fact, since φ > 0, it is clear that 2(1 + φ)/(2 + φ)² ≤ 0.5. For example, if φ = 1 condition (18) becomes θ > 1.444, and if φ = 2 it becomes θ > 1.375. In other words, the more independent the central banker, the more easily condition (18) is satisfied.
References
Hansen, L. P. and Sargent, T. J. (2003): Robust Control for Forward-Looking Models, Journal of Monetary Economics, 50(3), 586-604.
Hansen, L. P. and Sargent, T. J. (2004): Robust Control and Model Uncertainty in Macroeconomics, Princeton University Press.
Rogoff, K. (1985): The Optimal Degree of Commitment to an Intermediate Monetary Target, Quarterly Journal of Economics, 100, 1169-1189.
Roseta-Palma, C. and Xepapadeas, A. (2004): Robust Control in Water Management, Journal of Risk and Uncertainty, 29(1), 21-34.
Walsh, C. (2004): Robustly Optimal Instrument Rules and Robust Control: An Equivalence Result, Journal of Money, Credit, and Banking, 36(6), 1105-1113.