INTRODUCTION TO THE SERIES
The aim of the Handbooks in Economics series is to produce Handbooks for various branches of economics, each of which is a definitive source, reference, and teaching supplement for use by professional researchers and advanced graduate students. Each Handbook provides self-contained surveys of the current state of a branch of economics in the form of chapters prepared by leading specialists on various aspects of this branch of economics. These surveys summarize not only received results but also newer developments, from recent journal articles and discussion papers. Some original material is also included, but the main goal is to provide comprehensive and accessible surveys. The Handbooks are intended to provide not only useful reference volumes for professional collections but also possible supplementary readings for advanced courses for graduate students in economics.

KENNETH J. ARROW and MICHAEL D. INTRILIGATOR
PUBLISHER'S NOTE
For a complete overview of the Handbooks in Economics Series, please refer to the listing at the end of this volume.
CONTENTS OF THE HANDBOOK

VOLUME 1A
PART 1 - EMPIRICAL AND HISTORICAL PERFORMANCE

Chapter 1 Business Cycle Fluctuations in US Macroeconomic Time Series JAMES H. STOCK and MARK W. WATSON
Chapter 2 Monetary Policy Shocks: What Have We Learned and to What End? LAWRENCE J. CHRISTIANO, MARTIN EICHENBAUM and CHARLES L. EVANS
Chapter 3 Monetary Policy Regimes and Economic Performance: The Historical Record MICHAEL D. BORDO and ANNA J. SCHWARTZ
Chapter 4 The New Empirics of Economic Growth STEVEN N. DURLAUF and DANNY T. QUAH

PART 2 - METHODS OF DYNAMIC ANALYSIS
Chapter 5 Numerical Solution of Dynamic Economic Models MANUEL S. SANTOS
Chapter 6 Indeterminacy and Sunspots in Macroeconomics JESS BENHABIB and ROGER E.A. FARMER
Chapter 7 Learning Dynamics GEORGE W. EVANS and SEPPO HONKAPOHJA
Chapter 8 Micro Data and General Equilibrium Models MARTIN BROWNING, LARS PETER HANSEN and JAMES J. HECKMAN
PART 3 - MODELS OF ECONOMIC GROWTH
Chapter 9 Neoclassical Growth Theory ROBERT M. SOLOW
Chapter 10 Explaining Cross-Country Income Differences ELLEN R. McGRATTAN and JAMES A. SCHMITZ, Jr.
VOLUME 1B
PART 4 - CONSUMPTION AND INVESTMENT
Chapter 11 Consumption ORAZIO P. ATTANASIO
Chapter 12 Aggregate Investment RICARDO J. CABALLERO
Chapter 13 Inventories VALERIE A. RAMEY and KENNETH D. WEST
PART 5 - MODELS OF ECONOMIC FLUCTUATIONS
Chapter 14 Resuscitating Real Business Cycles ROBERT G. KING and SERGIO T. REBELO
Chapter 15 Staggered Price and Wage Setting in Macroeconomics JOHN B. TAYLOR
Chapter 16 The Cyclical Behavior of Prices and Costs JULIO J. ROTEMBERG and MICHAEL WOODFORD
Chapter 17 Labor-Market Frictions and Employment Fluctuations ROBERT E. HALL
Chapter 18 Job Reallocation, Employment Fluctuations and Unemployment DALE T. MORTENSEN and CHRISTOPHER A. PISSARIDES
VOLUME 1C

PART 6 - FINANCIAL MARKETS AND THE MACROECONOMY
Chapter 19 Asset Prices, Consumption, and the Business Cycle JOHN Y. CAMPBELL
Chapter 20 Human Behavior and the Efficiency of the Financial System ROBERT J. SHILLER
Chapter 21 The Financial Accelerator in a Quantitative Business Cycle Framework BEN S. BERNANKE, MARK GERTLER and SIMON GILCHRIST

PART 7 - MONETARY AND FISCAL POLICY
Chapter 22 Political Economics and Macroeconomic Policy TORSTEN PERSSON and GUIDO TABELLINI
Chapter 23 Issues in the Design of Monetary Policy Rules BENNETT T. McCALLUM
Chapter 24 Inflation Stabilization and BOP Crises in Developing Countries GUILLERMO A. CALVO and CARLOS A. VÉGH
Chapter 25 Government Debt DOUGLAS W. ELMENDORF and N. GREGORY MANKIW
Chapter 26 Optimal Fiscal and Monetary Policy V.V. CHARI and PATRICK J. KEHOE
PREFACE TO THE HANDBOOK
Purpose

The Handbook of Macroeconomics aims to provide a survey of the state of knowledge in the broad area that includes the theories and facts of economic growth and economic fluctuations, as well as the consequences of monetary and fiscal policies for general economic conditions.
Progress in Macroeconomics

Macroeconomic issues are central concerns in economics. Hence it is surprising that (with the exception of the subset of these topics addressed in the Handbook of Monetary Economics) no review of this area has been undertaken in the Handbook of Economics series until now. Surprising or not, we find that now is an especially auspicious time to present such a review of the field. Macroeconomics underwent a revolution in the 1970's and 1980's, due to the introduction of the methods of rational expectations, dynamic optimization, and general equilibrium analysis into macroeconomic models, to the development of new theories of economic fluctuations, and to the introduction of sophisticated methods for the analysis of economic time series. These developments were both important and exciting. However, the rapid change in methods and theories led to considerable disagreement, especially in the 1980's, as to whether there was any core of common beliefs, even about the defining problems of the subject, that united macroeconomists any longer.

The 1990's have also been exciting, but for a different reason. In our view, the modern methods of analysis have progressed to the point where they are now much better able to address practical or substantive macroeconomic questions - whether traditional, new, empirical, or policy-related. Indeed, we find that it is no longer necessary to choose between more powerful methods and practical policy concerns. We believe that both the progress and the focus on substantive problems have led to a situation in macroeconomics where the area of common ground is considerable, though we cannot yet announce a "new synthesis" that could be endorsed by most scholars working in the field. For this reason, we have organized this Handbook around substantive macroeconomic problems, and not around alternative methodological approaches or schools of thought.
The extent to which the field has changed over the past decade is considerable, and we think that there is a great need for the survey of the current state of macroeconomics that we and the other contributors to this book have attempted here. We hope that the Handbook of Macroeconomics will be useful as a teaching supplement in graduate courses in the field, and also as a reference that will assist researchers in one area of macroeconomics to become better acquainted with developments in other branches of the field.

Overview

The Handbook of Macroeconomics includes 26 chapters, arranged into seven parts. Part 1 reviews evidence on the Empirical and Historical Performance of the aggregate economy, to provide factual background for the modeling efforts and policy discussion of the remaining chapters. It includes evidence on the character of business fluctuations, on long-run economic growth and the persistence of cross-country differences in income levels, and on economic performance under alternative policy regimes. Part 2 on Methods of Dynamic Analysis treats several technical issues that arise in the study of economic models which are dynamic and in which agents' expectations about the future are critical to equilibrium determination. These include methods for the calibration and computation of models with intertemporal equilibria, the analysis of the determinacy of equilibria, and the use of "learning" dynamics to consider the stability of such equilibria. These topics are important for economic theory in general, and some are also treated in the Handbook of Mathematical Economics, the Handbook of Econometrics, and the Handbook of Computational Economics, for example, from a somewhat different perspective. Here we emphasize results - such as the problems associated with the calibration of general equilibrium models using microeconomic studies - that have particular application to macroeconomic models.
The Handbook then turns to a review of theoretical models of macroeconomic phenomena. Part 3 reviews Models of Economic Growth, including both the determinants of long-run levels of income per capita and the sources of cross-country income differences. Both "neoclassical" and "endogenous" theories of growth are discussed. Part 4 treats models of Consumption and Investment demand, from the point of view of intertemporal optimization. Part 5 covers Models of Economic Fluctuations. In the chapters in this part we see a common approach to model formulation and testing, emphasizing intertemporal optimization, quantitative general equilibrium modeling, and the systematic comparison of model predictions with economic time series. This common approach allows for consideration of a variety of views about the ultimate sources of economic fluctuations and of the efficiency of the market mechanisms that amplify and propagate them. Part 6 treats Financial Markets and the Macroeconomy. The chapters in this part consider the relation between financial market developments and aggregate economic
activity, both from the point of view of how business fluctuations affect financial markets, and how financial market disturbances affect overall economic activity. These chapters also delve into the question of whether financial market behavior can be understood in terms of the postulates of rational expectations and intertemporal optimization that are used so extensively in modern macroeconomics - an issue of fundamental importance to our subject that can be, and has been, subject to special scrutiny in the area of financial economics because of the unusual quality of available data. Finally, Part 7 reviews a number of Monetary and Fiscal Policy issues. Here we consider both the positive theory (or political economics) of government policymaking and the normative theory. Both the nature of ideal (or second-best) outcomes according to economic theory and the choice of simple rules that may offer practical guidance for policymakers are discussed. Lessons from economic theory and from experience with alternative policy regimes are reviewed. None of the chapters in this part focus entirely on international, or open economy, macroeconomic policies, because many such issues are addressed in the Handbook of International Economics. Nevertheless, open-economy issues cannot be separated from closed-economy issues, as the analysis of disinflation policies and currency crises in this part of the Handbook of Macroeconomics, or the analysis of policy regimes in Part 1 of the Handbook of Macroeconomics, make clear.
Acknowledgements

Our use of the pronoun "we" in this preface should not, of course, be taken to suggest that much, if any, of the credit for what is useful in these volumes is due to the Handbook's editors. We wish to acknowledge the tremendous debt we owe to the authors of the chapters in this Handbook, who not only prepared the individual chapters, but also provided us with much useful advice about the organization of the overall project. We are grateful for their efforts and for their patience with our slow progress toward completion of the Handbook. We hope that they will find that the final product justifies their efforts. We also wish to thank the Federal Reserve Bank of New York, the Federal Reserve Bank of San Francisco, and the Center for Economic Policy Research at Stanford University for financial support for two conferences on "Recent Developments in Macroeconomics" at which drafts of the Handbook chapters were presented and discussed, and we are especially grateful to Jack Beebe and Rick Mishkin, who made these two useful conferences happen. The deadlines, feedback, and commentary at these conferences were essential to the successful completion of the Handbook. We also would like to thank Jean Koentop for managing the manuscript as it neared completion.

Stanford, California
Princeton, New Jersey

John B. Taylor
Michael Woodford
Chapter 1

BUSINESS CYCLE FLUCTUATIONS IN US MACROECONOMIC TIME SERIES

JAMES H. STOCK
Kennedy School of Government, Harvard University and the NBER

MARK W. WATSON
Woodrow Wilson School, Princeton University and the NBER
Contents

Abstract
Keywords
1. Introduction
2. Empirical methods of business cycle analysis
2.1. Classical business cycle analysis and the determination of turning points
2.2. Isolating the cyclical component by linear filtering
3. Cyclical behavior of selected economic time series
3.1. The data and summary statistics
3.2. Discussion of results for selected series
3.2.1. Comovements in employment across sectors
3.2.2. Consumption, investment, inventories, imports and exports
3.2.3. Aggregate employment, productivity and capacity utilization
3.2.4. Prices and wages
3.2.5. Asset prices and returns
3.2.6. Monetary aggregates
3.2.7. Miscellaneous leading indicators
3.2.8. International output
3.2.9. Stability of the predictive relations
4. Additional empirical regularities in the postwar US data
4.1. The Phillips curve
4.2. Selected long-run relations
4.2.1. Long-run money demand
4.2.2. Spreads between long-term and short-term interest rates
4.2.3. Balanced growth relations
Acknowledgements
Appendix A. Description of the data series used in this chapter
A.1. Series used in Section 1

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
A.2. Series used in Section 2
A.3. Additional series used in Section 4
References
Abstract

This chapter examines the empirical relationship in the postwar United States between the aggregate business cycle and various aspects of the macroeconomy, such as production, interest rates, prices, productivity, sectoral employment, investment, income, and consumption. This is done by examining the strength of the relationship between the aggregate cycle and the cyclical components of individual time series, whether individual series lead or lag the cycle, and whether individual series are useful in predicting aggregate fluctuations. The chapter also reviews some additional empirical regularities in the US economy, including the Phillips curve and some long-run relationships, in particular long-run money demand, long-run properties of interest rates and the yield curve, and the long-run properties of the shares in output of consumption, investment and government spending.
Keywords

economic fluctuations, Phillips curve, long-run macroeconomic relations

JEL classification: E30
1. Introduction
This chapter summarizes some important regularities in macroeconomic time series data for the United States since World War II. Our primary focus is the business cycle. In their classic study, Burns and Mitchell (1946) offer the following definition of the business cycle:

A cycle consists of expansions occurring at about the same time in many economic activities, followed by similarly general recessions, contractions, and revivals which merge into the expansion phase of the next cycle; this sequence of changes is recurrent but not periodic; in duration business cycles vary from more than one year to ten or twelve years; they are not divisible into shorter cycles of similar character with amplitudes approximating their own. [Burns and Mitchell (1946), p. 3]
Figure 1.1 plots the natural logarithm of an index of industrial production for the United States from 1919 to 1996. (Data sources are listed in the Appendix.) Over these 78 years, this index has increased more than fifteen-fold, corresponding to an increase in its logarithm by more than 2.7 units. This reflects the tremendous growth of the US labor force and of the productivity of American workers over the twentieth century. Also evident in Figure 1.1 are the prolonged periods of increases and declines that constitute American business cycles. These fluctuations coincide with some of the signal events of the US economy over this century: the Great Depression of the 1930s; the subsequent recovery and growth during World War II; the sustained boom of the 1960s, associated in part with spending on the war in Vietnam; the recession of 1973-1975, associated with the first OPEC price increases; the disinflationary twin recessions of the early 1980s; the recession of 1990, associated with the invasion of Kuwait by Iraq; and the long expansions of the 1980s and the 1990s. To bring these cyclical fluctuations into sharper focus, Figure 1.2 plots an estimate
[Figure 1.1 here] Fig. 1.1. Industrial production index (logarithm of levels).
[Figure 1.2 here] Fig. 1.2. Business cycle component of industrial production index.

of the cyclical component of industrial production. (This estimate was obtained by passing the series through a bandpass filter that isolates fluctuations at business cycle periodicities, six quarters to eight years; this filter is described in the next section.) The vertical lines in Figure 1.2 indicate cyclical peaks and troughs, where the dates have been determined by business cycle analysts at the National Bureau of Economic Research (NBER). A chronology of NBER-dated cyclical turning points from 1854 to the present is given in Table 1 (the method by which these dates were obtained is discussed in the next section). Evidently, the business cycle is an enduring feature of the US economy.

In the next two sections, we examine the business cycle properties of 71 quarterly US economic time series. Although business cycles have long been present in the US, this chapter focuses on the postwar period for two reasons. First, the American economy is vastly different now than it was many years ago: new production and financial technologies, institutions like the Federal Reserve System, the rise of the service and financial sectors, and the decline of agriculture and manufacturing are but a few of the significant changes that make the modern business cycle different from its historical counterpart. Second, the early data have significant deficiencies and in general are not comparable to the more recent data. For example, one might be tempted to conclude from Figure 1.2 that business cycles have been less severe and less frequent in the postwar period than in the prewar period. However, the quality of the data is not consistent over the 78-year sample period, which makes such comparisons problematic. Indeed, Romer (1989) has argued that, after accounting for such measurement problems, cyclical fluctuations since World War II have been of the same magnitude as they were before World War I.
Although this position is controversial [see Balke and Gordon (1989), Diebold and Rudebusch (1992) and Watson (1994a)], there is general agreement that
Table 1
NBER business cycle reference dates a

Trough              Peak
December 1854       June 1857
December 1858       October 1860
June 1861           April 1865
December 1867       June 1869
December 1870       October 1873
March 1879          March 1882
May 1885            March 1887
April 1888          July 1890
May 1891            January 1893
June 1894           December 1895
June 1897           June 1899
December 1900       September 1902
August 1904         May 1907
June 1908           January 1910
January 1912        January 1913
December 1914       August 1918
March 1919          January 1920
July 1921           May 1923
July 1924           October 1926
November 1927       August 1929
March 1933          May 1937
June 1938           February 1945
October 1945        November 1948
October 1949        July 1953
May 1954            August 1957
April 1958          April 1960
February 1961       December 1969
November 1970       November 1973
March 1975          January 1980
July 1980           July 1981
November 1982       July 1990
March 1991

a Source: National Bureau of Economic Research.
comparisons of business cycles from different historical periods are hampered by the severe limitations of the early data. For these reasons, this chapter focuses on the postwar period for which a broad set of consistently defined data series are available, and which is in any event the relevant period for the study of the modern business cycle.

There are other important features of the postwar data that are not strictly related to the business cycle but which merit special emphasis. In the final section of this chapter, we therefore turn to an examination of selected additional regularities in postwar economic time series that are not strictly linked to the business cycle. These include the Phillips curve (the relationship between the rate of price inflation and the unemployment rate) and some macroeconomic relations that hold over the long run, specifically long-run money demand, yield curve spreads, and the consumption-income and consumption-investment ratios. These relations have proven remarkably stable over the past four decades, and they provide important benchmarks both for assessing theoretical macroeconomic models and for guiding macroeconomic policy.
2. Empirical methods of business cycle analysis

2.1. Classical business cycle analysis and the determination of turning points
There is a long intellectual history of the empirical analysis of business cycles. The classical techniques of business cycle analysis were developed by researchers at the National Bureau of Economic Research [Mitchell (1927), Mitchell and Burns (1938), Burns and Mitchell (1946)]. Given the definition quoted in the introduction, the two main empirical questions are how to identify historical business cycles and how to quantify the comovement of a specific time series with the aggregate business cycle.

The business cycle turning points identified retrospectively and on an ongoing basis by the NBER, which are listed in Table 1, constitute a broadly accepted business cycle chronology. NBER researchers determined these dates using a two-step process. First, cyclical peaks and troughs (respectively, local maxima and minima) were determined for individual series. Although these turning points are determined judgementally, the process is well approximated by a computer algorithm developed by Bry and Boschan (1971). Second, common turning points were determined by comparing these series-specific turning points. If, in the judgment of the analysts, the cyclical movements associated with these common turning points are sufficiently persistent and widespread across sectors, then an aggregate business cycle is identified and its peaks and troughs are dated. Currently, the NBER Business Cycle Dating Committee uses data on output, income, employment, and trade, both at the sectoral and aggregate levels, to guide their judgments in identifying and dating business cycles as they occur [NBER (1992)]. These dates typically are announced with a lag to ensure that the data on which they are based are as accurate as possible. Burns, Mitchell and their associates also developed procedures for comparing cycles in individual series to the aggregate business cycle. These procedures include measuring leads and lags of specific series at cyclical turning
points and computing cross-correlations on a redefined time scale that corresponds to phases of the aggregate business cycle.

The classical business cycle discussed so far refers to absolute declines in output and other measures. An alternative is to examine cyclical fluctuations in economic time series that are deviations from their long-run trends. The resulting cyclical fluctuations are referred to as growth cycles [see for example Zarnowitz (1992), ch. 7]. Whereas classical cycles tend to have recessions that are considerably shorter than expansions because of underlying trend growth, growth recessions and expansions have approximately the same duration. The study of growth cycles has advantages and disadvantages relative to classical cycles. On the one hand, separation of the trend and cyclical component is inconsistent with some modern macroeconomic models, in which productivity shocks (for example) determine both long-run economic growth and the fluctuations around that growth trend. From this perspective, the trend-cycle dichotomy is only justified if the factors determining long-run growth and those determining cyclical fluctuations are largely distinct. On the other hand, growth cycle chronologies are by construction less sensitive to the underlying trend growth rate in the economy, and in fact some economies which have had very high growth rates, such as postwar Japan, exhibit growth cycles but have few absolute declines and thus have few classical business cycles. Finally, the methods of classical business cycle analysis have been criticized for lacking a statistical foundation [for example, Koopmans (1947)]. Although there have been some modern treatments of these nonlinear filters [for example, Stock (1987)], linear filtering theory is better understood 1. Modern studies of business cycle properties therefore have used linear filters to distinguish between the trend and cyclical components of economic time series 2.
Although we note these ambiguities, in the rest of this chapter we follow the recent literature and focus on growth recessions and expansions 3.
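To make the first, series-specific step of the dating procedure described above concrete, the following sketch locates local maxima and minima in the spirit of Bry and Boschan (1971). It is a deliberate simplification: the function name, the window length, and the artificial sine-wave series are our own illustrative choices, and the censoring rules on minimum phase and cycle length used by the actual algorithm are omitted.

```python
import numpy as np

def turning_points(y, window=5):
    """Locate candidate cyclical peaks and troughs in a series.

    A greatly simplified sketch of the Bry-Boschan idea: a peak
    (trough) is a local maximum (minimum) relative to `window`
    observations on each side. The real algorithm adds censoring
    rules on phase and cycle length that are omitted here.
    """
    peaks, troughs = [], []
    for t in range(window, len(y) - window):
        segment = y[t - window: t + window + 1]
        if y[t] == segment.max():
            peaks.append(t)
        elif y[t] == segment.min():
            troughs.append(t)
    return peaks, troughs

# Example on an artificial "cycle": a 40-period sine wave with noise
rng = np.random.default_rng(0)
t = np.arange(120)
y = np.sin(2 * np.pi * t / 40) + 0.05 * rng.standard_normal(120)
peaks, troughs = turning_points(y)
```

Applied to a real series, the indices returned would then be compared across many series to look for the common turning points that define an aggregate cycle.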
1 A linear filter is a set of weights {a_i, i = 0, ±1, ±2, ...} that are applied to a time series y_t; the filtered version of the time series is ∑_{i=-∞}^{∞} a_i y_{t-i}. If the filtered series has the form ∑_{i=0}^{∞} a_i y_{t-i} (that is, a_i = 0, i < 0), the filter is said to be one-sided; otherwise the filter is two-sided. In a nonlinear filter, the filtered version of the time series is a nonlinear function of {y_t, t = 0, ±1, ±2, ...}.
2 See Hodrick and Prescott (1981), Harvey and Jaeger (1993), Stock and Watson (1990), Backus and Kehoe (1992), King and Rebelo (1993), Kydland and Prescott (1990), Englund, Persson and Svensson (1992), Hassler, Lundvik, Persson and Söderlind (1992), and Baxter and King (1994) for more discussion and examples of linear filtering methods applied to the business cycle.
3 This discussion treats the NBER chronology as a concise way to summarize some of the most significant events in the macroeconomy. A different use of the chronology is as a benchmark against which to judge macroeconomic models. In an early application of Monte Carlo methods to econometrics, Adelman and Adelman (1959) simulated the Klein-Goldberger model and found that it produced expansions and contractions with durations that closely matched those in the US economy. King and Plosser (1994) and Hess and Iwata (1997) carried out similar exercises. Pagan (1997) has shown, however, that a wide range of simple time series models satisfy this test, which indicates that it is not a particularly powerful way to discriminate among macroeconomic models. Of course, using the NBER dating methodology to describe data differs from using it to test models, and the low power of the test of the Adelmans simply implies that this methodology is better suited to the former task than the latter.
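The definition of a linear filter in footnote 1 can be illustrated directly. The sketch below is our own (the function name and the two example filters are not from the chapter); it applies a finite set of weights {a_i} to a series, showing both a one-sided filter (the first difference) and a two-sided symmetric filter.

```python
import numpy as np

def apply_linear_filter(y, weights, center):
    """Apply a finite linear filter: out[t] = sum_i a_i * y[t-i].

    weights[k] holds a_{k-center}, so center = 0 gives a one-sided
    filter (current and past values only), while a middle value of
    center gives a two-sided filter. Points where the filter window
    runs off the sample are set to NaN.
    """
    n, m = len(y), len(weights)
    out = np.full(n, np.nan)
    for t in range(m - 1 - center, n - center):
        out[t] = sum(w * y[t + center - k] for k, w in enumerate(weights))
    return out

y = np.arange(10.0)

# One-sided first-difference filter: a_0 = 1, a_1 = -1
diff = apply_linear_filter(y, [1.0, -1.0], center=0)

# Two-sided symmetric average: a_{-1} = a_1 = 0.25, a_0 = 0.5
smooth = apply_linear_filter(y, [0.25, 0.5, 0.25], center=1)
```

On the linear series used here the first-difference filter returns a constant 1.0 wherever it is defined, and the symmetric average reproduces the series exactly, which is one way to see that a symmetric moving average passes a linear trend through unchanged.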
[Figure 2.1 here] Fig. 2.1. Level of GDP.

[Figure 2.2 here] Fig. 2.2. Linearly detrended GDP.
2.2. Isolating the cyclical component by linear filtering

Quarterly data on the logarithm of real US GDP from 1947 to 1996 are plotted in Figure 2.1. As in the longer index of industrial production shown in Figure 1.1, cyclical fluctuations are evident in these postwar data. Without further refinement, however, it is difficult to separate the cyclical fluctuations from the long-run growth component. Moreover, there are some fluctuations in the series that occur over periods shorter than a business cycle, arising from temporary factors such as unusually harsh weather, strikes and measurement error. It is therefore desirable to have a method to isolate only those business cycle fluctuations of immediate interest.

If the long-run growth component in log real GDP is posited to be a linear time trend, then a natural way to eliminate this trend component is to regress the logarithm of GDP against time and to plot its residual. This "linearly detrended" time series, scaled to be in percentage points, is plotted in Figure 2.2. Clearly the cyclical fluctuations of output are more pronounced in this detrended plot. However, these detrended data still contain fluctuations of a short duration that are arguably not related to business cycles. Furthermore, this procedure is statistically valid only if the long-run growth component is a linear time trend, that is, if GDP is trend stationary (stationary around a linear
[Figure 2.3 here] Fig. 2.3. Growth rate of GDP.

time trend). This latter assumption is, however, questionable. Starting with Nelson and Plosser (1982), a large literature has developed around the question of whether GDP is trend stationary or difference stationary (stationary in first differences), that is, whether GDP contains a unit autoregressive root. Three recent contributions are Rudebusch (1993), Diebold and Senhadji (1996), and Nelson and Murray (1997). Nelson and Plosser (1982) concluded that real GDP is best modeled as difference stationary, and much of the later literature supports this view with the caveat that it is impossible to distinguish large stationary autoregressive roots from unit autoregressive roots, and that there might be nonlinear trends; see Stock (1994). Still, with a near-unit root and a possibly nonlinear trend, linear detrending will lead to finding spurious cycles.

If log real GDP is difference stationary, then one way to eliminate its trend is to first difference the series which, when the series is in logarithms, transforms the series into quarterly rates of growth. This first-differenced series, scaled to be in the units of quarterly percentage growth at an annual rate, is plotted in Figure 2.3. This series has no visible trend, and the recessions appear as sustained periods of negative growth. However, first-differencing evidently exacerbates the difficulties presented by short-run noise, which obscures the cyclical fluctuations of primary interest.

These considerations have spurred time series econometricians to find methods that better isolate the cyclical component of economic time series. Doing so, however, requires being mathematically precise about what constitutes a cyclical component. Here, we adopt the perspective in Baxter and King (1994), which draws on the theory of spectral analysis of time series data.
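The two trend-removal methods just discussed, linear detrending and first differencing, can be sketched as follows on an artificial difference-stationary series standing in for log real GDP. The series and all parameter values are invented for illustration; this is not the chapter's actual data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Artificial stand-in for log real GDP: a random walk with drift,
# i.e., a difference-stationary series (not actual GDP data).
log_gdp = np.cumsum(0.008 + 0.01 * rng.standard_normal(n))

# (a) Linear detrending: regress on a constant and a time trend and
# keep the residual (valid only if the series is trend stationary).
t = np.arange(n)
X = np.column_stack([np.ones(n), t])
beta, *_ = np.linalg.lstsq(X, log_gdp, rcond=None)
detrended = log_gdp - X @ beta

# (b) First differencing: quarterly growth, annualized, in percent.
growth = 400 * np.diff(log_gdp)
```

Because the simulated series is a random walk, the detrended residual here exhibits exactly the kind of spurious persistence the text warns about, while the first-differenced series is stationary but noisy.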
The height of the spectrum at a certain frequency corresponds to fluctuations of the periodicity that corresponds (inversely) to that frequency. Thus the cyclical component can be thought of as those movements in the series associated with periodicities within a certain range of business cycle durations. Here, we define this range of business cycle periodicities to be between six quarters and eight years 4. Accordingly, the ideal linear filter would preserve
4 The NBER chronology in Table 1 lists 30 complete cycles since 1858. The shortest full cycle (peak to peak) was 6 quarters, and the longest 39 quarters; 90% of these cycles are no longer than 32 quarters.
[Fig. 2.4. Filter gains: target (optimal), Hodrick-Prescott, and first-difference filters, plotted against frequency.]

these fluctuations but would eliminate all other fluctuations, both the high frequency fluctuations (periods less than six quarters) associated, for example, with measurement error and the low frequency fluctuations (periods exceeding eight years) associated with trend growth. In other words, the gain of the ideal linear filter is unity for business cycle frequencies and zero elsewhere 5. This ideal filter cannot be implemented in finite data sets because it requires an infinite number of past and future values of the series; however, a feasible (finite-order) filter can be used to approximate this ideal filter. Gains of this ideal filter and several candidate feasible filters are plotted in Figure 2.4. The first-differencing filter eliminates the trend component, but it exacerbates the effect of high frequency noise, a drawback that is evident in Figure 2.3. Another widely used filter is the Hodrick-Prescott filter [Hodrick and Prescott (1981)]. This filter improves upon the first-differencing filter: it attenuates less of the cyclical component and it does not amplify the high frequency noise. However, it still passes much of the high frequency noise outside the business cycle frequency band. The filter adopted in this study is Baxter and King's bandpass filter, which is designed to mitigate these problems [Baxter and King (1994)]. This feasible bandpass filter is based on a twelve-quarter centered moving average, where the weights are chosen to minimize the squared difference between the optimal and approximately optimal filters,
5 The spectral density of a time series x_t at frequency ω is s_x(ω) = (2π)^{-1} Σ_{j=-∞}^{∞} γ_x(j) exp(-iωj), where γ_x(j) = cov(x_t, x_{t-j}). The gain of a linear filter a(L) is |A(ω)|, where A(ω) = Σ_{j=-∞}^{∞} a_j exp(-iωj). The spectrum of a linearly filtered series, y_t = a(L)x_t, with L the lag operator, is s_y(ω) = |A(ω)|² s_x(ω). See Hamilton (1994) for an introduction to the spectral analysis of economic time series.
[Fig. 2.5. Bandpass-filtered GDP (business cycle).]

subject to the constraint that the filter has zero gain at frequency zero 6. Because this is a finite approximation, its gain is only approximately flat within the business cycle band and is nonzero for some frequencies outside this band. The cyclical component of real GDP, estimated using this bandpass filter, is plotted in Figure 2.5. This series differs from linearly detrended GDP, plotted in Figure 2.2, in two respects. First, its fluctuations are more closely centered around zero. This reflects the more flexible detrending method implicit in the bandpass filter. Second, the high frequency variations in detrended GDP have been eliminated. The main cyclical events of the postwar period are readily apparent in the bandpass filtered data. The largest recessions occurred in 1973-1975 and the early 1980s. The recessions of 1969-1970 and 1990-1991 each have shorter durations and smaller amplitudes. Other cyclical fluctuations are also apparent, for example the slowdowns in 1967 and 1986, although these are not classical recessions as identified by the NBER. During 1986, output increased more slowly than average, and the bandpass filtered data, viewed as deviations from a local trend, are negative during 1986. This corresponds to a growth recession, even though there was not the absolute decline that characterizes an NBER-dated recession. This distinction between growth recessions and absolute declines in economic activity leads to slight differences between official NBER peaks and local maxima in the bandpass filtered data. Notice from Figure 2.1 that output slowed markedly before the absolute turndowns that characterized the 1970, 1974, 1980 and 1990 recessions.
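The gain comparison in Figure 2.4 can be reproduced numerically from the formula in footnote 5. The sketch below (an illustration under the chapter's definitions, not the authors' program) evaluates the gain |A(ω)| of the first-difference filter 1 − L and the ideal bandpass gain, which is one for periods between 6 and 32 quarters and zero elsewhere:

```python
import numpy as np

def filter_gain(coeffs, omegas):
    """Gain |A(w)| of the linear filter a(L) = sum_j a_j L^j."""
    j = np.arange(len(coeffs))
    return np.abs(np.exp(-1j * np.outer(omegas, j)) @ np.asarray(coeffs, dtype=float))

omegas = np.linspace(0.01, np.pi, 500)          # frequencies, radians per quarter
fd_gain = filter_gain([1.0, -1.0], omegas)      # first-difference filter, 1 - L
ideal_gain = ((omegas >= 2 * np.pi / 32) & (omegas <= 2 * np.pi / 6)).astype(float)

# |1 - exp(-iw)| = 2|sin(w/2)| grows with frequency: the first-difference
# filter amplifies high-frequency noise rather than cutting it off.
```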
Peaks in the bandpass filter series correspond to the beginning of these slowdowns, while NBER peaks correspond to downturns in the level of GDP. The bandpass filtering approach permits a decomposition of the series into trend, cycle and irregular components, respectively corresponding to the low, business cycle, and high frequency parts of the spectrum. The trend and irregular components are
6 To obtain filtered values at the beginning and end of the sample, the series are augmented by twelve out-of-sample projected values at both ends of the sample, where the projections were made using forecasts and backcasts from univariate fourth-order autoregressive models.
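A minimal sketch of the Baxter-King construction: truncate the ideal bandpass weights at K = 12 quarters, then shift them so they sum to zero, which imposes the zero-gain-at-frequency-zero constraint. This is an illustrative reimplementation following the chapter's description, not the authors' code:

```python
import numpy as np

def baxter_king_weights(low_period=6.0, high_period=32.0, K=12):
    """Truncated ideal bandpass weights b_{-K}, ..., b_K, adjusted to sum to zero."""
    w1, w2 = 2 * np.pi / high_period, 2 * np.pi / low_period  # band edges in radians
    j = np.arange(1, K + 1)
    b = np.empty(2 * K + 1)
    b[K] = (w2 - w1) / np.pi                                  # center weight b_0
    b[K + 1:] = (np.sin(w2 * j) - np.sin(w1 * j)) / (np.pi * j)
    b[:K] = b[K + 1:][::-1]                                   # weights are symmetric
    return b - b.mean()                                       # zero gain at frequency zero

def bandpass_filter(x, K=12):
    """Centered moving average; K observations are lost at each end (the chapter
    instead pads the sample with AR(4) forecasts and backcasts, per footnote 6)."""
    return np.convolve(np.asarray(x, dtype=float), baxter_king_weights(K=K), mode="valid")

weights = baxter_king_weights()
```

Because the adjusted weights sum to zero, the filter annihilates any constant (and removes a linear trend up to approximation error), which is why the filtered series in Figure 2.5 is centered on zero.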
[Fig. 2.6. Bandpass-filtered GDP (trend).]
[Fig. 2.7. Bandpass-filtered GDP (irregular).]

plotted in Figures 2.6 and 2.7; the series in Figures 2.5-2.7 sum to log real GDP. Close inspection of Figure 2.6 reveals a slowdown in trend growth over this period, an issue of great importance that has been the focus of considerable research but which is beyond the scope of this chapter.
3. Cyclical behavior of selected economic time series

3.1. The data and summary statistics
The 71 economic time series examined in this chapter are taken from eight broad categories: sectoral employment; the National Income and Product Accounts (NIPA); aggregate employment, productivity and capacity utilization; prices and wages; asset prices; monetary aggregates; miscellaneous leading indicators; and international output. Most of the series were transformed before further analysis. Quantity measures (the NIPA variables, the monetary aggregates, the level of employment, employee hours, and production) are studied after taking their logarithms. Prices and wages are transformed by taking logarithms and/or quarterly differences of logarithms (scaled to
be percentage changes at an annual rate). Interest rates, spreads, capacity utilization, and the unemployment rate are used without further transformation. The graphical presentations in this section cover the period 1947:I-1996:IV. The early years of this period were dominated by some special features, such as the peacetime conversion following World War II and the Korean war and the associated price controls. Our statistical analysis therefore is restricted to the period 1953:I-1996:IV. Three sets of empirical evidence are presented for each series. This evidence examines comovements between each series and real GDP. Although the business cycle technically is defined by comovements across many sectors and series, fluctuations in aggregate output are at the core of the business cycle, so the cyclical component of real GDP is a useful proxy for the overall business cycle and is thus a useful benchmark for comparisons across series. First, the cyclical component of each series (obtained using the bandpass filter) is plotted, along with the cyclical component of output, for the period 1947-1996. For series in logarithms, the business cycle components have been multiplied by 100, so that they can be interpreted as percent deviations from long run trend. No further transformations have been applied to series already expressed in percentage points (inflation rates, interest rates, etc.). These plots appear in Figures 3.1-3.70. Note that the vertical scales of the plots differ. The thick line in each figure is the cyclical component of the series described in the figure caption, and the thin line is the cyclical component of real GDP. Relative amplitudes can be seen by comparing the series to aggregate output.
[Figures 3.1-3.70 plot the bandpass-filtered cyclical component of each series (thick line) together with the cyclical component of real GDP (thin line). Captions:]

Fig. 3.1. Contract and construction employment.
Fig. 3.2. Manufacturing employment.
Fig. 3.3. Finance, insurance and real estate employment.
Fig. 3.4. Mining employment.
Fig. 3.5. Government employment.
Fig. 3.6. Service employment.
Fig. 3.7. Wholesale and retail trade employment.
Fig. 3.8. Transportation and public utility employment.
Fig. 3.9. Consumption (total).
Fig. 3.10. Consumption (nondurables).
Fig. 3.11. Consumption (services).
Fig. 3.12. Consumption (nondurables + services).
Fig. 3.13. Consumption (durables).
Fig. 3.14. Investment (total fixed).
Fig. 3.15. Investment (equipment).
Fig. 3.16. Investment (nonresidential structures).
Fig. 3.17. Investment (residential structures).
Fig. 3.18. Change in business inventories (relative to trend GDP).
Fig. 3.19. Exports.
Fig. 3.20. Imports.
Fig. 3.21. Trade balance (relative to trend GDP).
Fig. 3.22. Government purchases.
Fig. 3.23. Government purchases (defense).
Fig. 3.24. Government purchases (non-defense).
Fig. 3.25. Employment (total employees).
Fig. 3.26. Employment (total hours).
Fig. 3.27. Employment (average weekly hours).
Fig. 3.28. Unemployment rate.
Fig. 3.29. Vacancies (Help Wanted index).
Fig. 3.30. New unemployment claims.
Fig. 3.31. Capacity utilization.
Fig. 3.32. Total factor productivity.
Fig. 3.33. Average labor productivity.
Fig. 3.34. Consumer price index (level).
Fig. 3.35. Producer price index (level).
Fig. 3.36. Oil prices.
Fig. 3.37. GDP price deflator (level).
Fig. 3.38. Commodity price index (level).
Fig. 3.39. Consumer price index (inflation rate).
Fig. 3.40. Producer price index (inflation rate).
Fig. 3.41. GDP price deflator (inflation rate).
Fig. 3.42. Commodity price index (inflation rate).
Fig. 3.43. Nominal wage rate (level).
Fig. 3.44. Real wage rate (level).
Fig. 3.45. Nominal wage rate (rate of change).
Fig. 3.46. Real wage rate (rate of change).
Fig. 3.47. Federal funds rate.
Fig. 3.48. Treasury Bill rate (3 month).
Fig. 3.49. Treasury Bond rate (10 year).
Fig. 3.50. Real Treasury Bill rate (3 month).
Fig. 3.51. Yield curve spread (long-short).
Fig. 3.52. Commercial paper/Treasury Bill spread.
Fig. 3.53. Stock prices.
Fig. 3.54. Money stock (M2, nominal level).
Fig. 3.55. Monetary base (nominal level).
Fig. 3.56. Money stock (M2, real level).
Fig. 3.57. Monetary base (real level).
Fig. 3.58. Money stock (M2, nominal rate of change).
Fig. 3.59. Monetary base (nominal rate of change).
Fig. 3.60. Consumer credit.
Fig. 3.61. Consumer expectations.
Fig. 3.62. Building permits.
Fig. 3.63. Vendor performance.
Fig. 3.64. Manufacturers' unfilled orders, durable goods industry.
Fig. 3.65. Manufacturers' new orders, non-defense capital goods.
Fig. 3.66. Industrial production, Canada.
Fig. 3.67. Industrial production, France.
Fig. 3.68. Industrial production, Japan.
Fig. 3.69. Industrial production, UK.
Fig. 3.70. Industrial production, Germany.
Second, the comovements evident in these figures are quantified in Table 2, which reports the cross-correlation of the cyclical component of each series with the cyclical component of real GDP. Specifically, this is the correlation between x_t and y_{t+k}, where x_t is the bandpass filtered (transformed) series listed in the first column and y_{t+k} is the k-quarter lead of the filtered logarithm of real GDP. A large positive correlation at k = 0 indicates procyclical behavior of the series; a large negative correlation at k = 0 indicates countercyclical behavior; and a maximum correlation at, for example, k = -1 indicates that the cyclical component of the series tends to lag the aggregate business cycle by one quarter. Also reported in Table 2 is the standard deviation of the cyclical component of each of the series. These standard deviations are comparable across series only when the series have the same units. For the series that appear in logarithms, the units correspond to percentage deviations from trend growth paths.
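The lead-lag correlations reported in Table 2 can be computed along the following lines (a sketch on synthetic data; x stands for a filtered candidate series and y for the filtered logarithm of real GDP):

```python
import numpy as np

def cross_correlation(x, y, k):
    """corr(x_t, y_{t+k}); a maximum at negative k means x lags the cycle."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if k >= 0:
        xs, ys = x[:len(x) - k], y[k:]
    else:
        xs, ys = x[-k:], y[:len(y) + k]
    return np.corrcoef(xs, ys)[0, 1]

# Synthetic example: x_t equals y_{t+1} by construction, so x leads y and
# the maximum cross-correlation occurs at k = 1.
rng = np.random.RandomState(0)
y = rng.standard_normal(200)
x = np.r_[y[1:], 0.0]
corrs = {k: cross_correlation(x, y, k) for k in range(-2, 3)}
```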
[Table 2. Cross-correlations of the cyclical component of each series with the cyclical component of real GDP, and standard deviations of the cyclical components.]
For the other series, the units are the native units of the series as described in the Appendix 7,8. The third set of evidence examines the lead-lag relations between these series and aggregate output from a somewhat different perspective. One formulation of whether a candidate series, for example consumption, leads aggregate output is whether current and past data on consumption help to predict future output, given current and past data on output. If so, consumption is said to Granger-cause output [Granger (1969), Sims (1972)]. The first numerical column in Table 3 reports the marginal R² that arises from using five quarterly lags of the candidate series to forecast output growth one quarter ahead, conditional on five quarterly lags of output growth; this is the R² of the regression of y_{t+1} on (y_t, ..., y_{t-4}, S_t, ..., S_{t-4}), minus the R² of the regression of y_{t+1} on (y_t, ..., y_{t-4}), where S_t denotes the candidate series. The second numerical column reports the marginal R² when the dependent variable is the four-quarter growth in output [log(GDP_{t+4}/GDP_t)], using the same set of regressors. The next two columns report these statistics, except that the two variables are reversed; that is, the marginal R² measures the extent to which past output growth predicts one- and four-quarter changes in the candidate series, holding constant past values of the candidate series. Care must be taken when interpreting Granger causality test results. Granger causality is not the same thing as causality as the term is commonly used in economic discourse. For example, a candidate variable might predict output growth not because it is a fundamental determinant of output growth, but simply because it reflects information on some third variable which is itself a determinant of output growth.
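The marginal R² statistic is the difference between the R²s of the two nested regressions just described. A sketch on synthetic data (illustrative, not the authors' program; here the candidate series S leads output growth by one quarter by construction):

```python
import numpy as np

def r_squared(y, X):
    """R-squared of an OLS regression of y on X plus an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

def marginal_r2(y, s, p=5):
    """R2 of y_{t+1} on (y_t..y_{t-p+1}, s_t..s_{t-p+1}) minus R2 on own lags alone."""
    T = len(y)
    ylags = np.column_stack([y[p - 1 - j : T - 1 - j] for j in range(p)])
    slags = np.column_stack([s[p - 1 - j : T - 1 - j] for j in range(p)])
    target = y[p:]
    return r_squared(target, np.column_stack([ylags, slags])) - r_squared(target, ylags)

rng = np.random.RandomState(1)
s = rng.standard_normal(300)
y = np.r_[0.0, 0.7 * s[:-1]] + 0.1 * rng.standard_normal(300)  # s Granger-causes y
gain = marginal_r2(y, s)
```

Because y_t is driven by s_{t-1}, the candidate's lags add substantial predictive content beyond output's own lags, and the marginal R² is large.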
Even if Granger causality is interpreted only as a measure of predictive content, it must be borne in mind that any such predictive content can be altered by the inclusion of additional variables. Still, the partial R²s in Table 3 provide a concrete measure of forecasting ability in bivariate relations, with which theoretical economic models should be consistent 9. Technology and policy have evolved over the postwar period, and this raises the possibility that these bivariate predictive relations might be unstable. The final two columns therefore report the p-values of a test for parameter stability, the Quandt Likelihood Ratio (QLR) test [Quandt (1960)], which tests for a single break in a regression. The column headed "QLR S→y" reports tests of the hypothesis that the coefficients on the candidate series and the intercept are constant in the predictive regression that produced the one-quarter ahead marginal R² reported in the first column. The column headed "QLR S→S" tests the stability of the coefficients and
7 To save space, the standard errors for the sample correlations in Table 2 are not reported. The median of all the standard errors of the cross-correlations in Table 2 is 0.10; 10% of the standard errors are less than 0.06, while 10% exceed 0.13.

8 The empirical results in Table 2 based on the bandpass filter are similar to ones obtained using the Hodrick-Prescott (1981) filter.

9 The observation that predictive content is not the same thing as economic causality is hardly new. Further discussion of Granger causality can be found in Zellner (1979), Granger (1980) and Geweke (1984).
[Table 3. Marginal R² statistics for bivariate predictive regressions, and p-values of QLR parameter stability tests.]
intercept in a fifth-order univariate autoregression of the candidate series. In both cases, if the test is significant at the 10% level, then the estimated break date is reported as well 10.

3.2. Discussion of results for selected series

3.2.1. Comovements in employment across sectors
A key notion of the business cycle is that fluctuations are common across sectors. Examination of the statistics for the sectoral employment variables sheds some light on the extent to which activity in different sectors moves with the aggregate cycle. Generally speaking, the cross-correlations in Table 2 indicate a large degree of positive association between these series and the cyclical component of real GDP. The cyclical component of contract and construction employment is more than twice as volatile as the cyclical component of real GDP, as measured by the ratio of the standard deviations of the two filtered series; by this measure, the cyclical component of manufacturing is 50% more volatile than the cyclical component of real GDP. Employment in services, in wholesale and retail trade, and in transportation and public utilities is also strongly procyclical, although the cyclical volatility of these series is much less than for contract and construction employment or for manufacturing employment. All these series have maximal cross-correlations at a lag of one or, for services employment and transportation and public utility employment, two quarters. These patterns are consistent with employment being procyclical with a slight lag and with cyclical fluctuations across industries occurring approximately simultaneously. The exceptions to this general pattern are employment in finance, insurance and real estate, in mining, and in government; these cross-correlations are distinctly lower than for the other sectors. It is not surprising that government employment exhibits no substantial cyclical movements. Although mining is highly volatile at business cycle frequencies, these movements are generally unrelated to the aggregate business cycle. Mining includes oil and gas extraction, areas in which employment expanded during the sharp energy price increases associated with the 1974-1975 and 1980 recessions.
Not apparent in these plots are the different trend growth rates in sectoral employment. For example, manufacturing employment grew at an average annual rate of 0.3% over
10 The QLR statistic is computed as follows: First a break date is posited, say date τ. The likelihood ratio statistic, F_τ, testing the null hypothesis of constant regression coefficients, against the alternative hypothesis that the regression coefficients changed at the break date τ, is computed by comparing the value of the Gaussian likelihood of the full-sample regression to the two relevant subsample regressions. The QLR statistic is max_{k0 ≤ τ ≤ T−k0} F_τ, where k0 is a trimming value, taken to be 15% of the sample size for the results in Table 3. Although this test was originally developed to detect a single break, it also has good power against alternatives with multiple breaks and slowly evolving coefficients. For a review of the QLR and other break tests, see Stock (1994). P-values for the QLR statistic were computed using the approximation developed by Hansen (1997).
J.H. Stock and M.W. Watson
the sample period, while service employment grew at an average annual rate of 4.0%. This produced large changes in the shares of employment in these sectors: the share of total employment in manufacturing fell from 36% in 1947 to 15% in 1996, while the share for services rose from 11% to 29%. This shift from employment in a cyclically volatile sector to employment in a less cyclically volatile sector may be partially responsible for the reduction in the business cycle variability of aggregate output (Figure 2.5) and aggregate employment (Figure 3.25) over the sample period. See Zarnowitz and Moore (1986) and Denson (1996) for a more detailed discussion of the effect of industrial composition on the business cycle.

3.2.2. Consumption, investment, inventories, imports and exports
Consumption, investment, inventories, and imports are all strongly procyclical. Based on the cross-correlations in Table 2, consumption moves approximately coincidently with the aggregate cycle, but the cyclical volatility of its components varies considerably. Consistent with the smoothing implied by the permanent income hypothesis, consumption of services is considerably less volatile than output over the cycle. In contrast, consumption of durables (which, importantly, measures purchases of durable goods rather than the service flow from those durable goods) is strongly procyclical and is far more cyclically volatile than real GDP or the other consumption measures. This too is consistent with consumers smoothing the stream of services derived from durables but with purchases of durables being concentrated in good economic times.

Some observers have suggested that exogenous shifts in consumption have been the proximate causes of certain cyclical episodes in the United States. For example, Gordon cites the 1955 auto boom as an example of an essentially unexplainable consumption shock which spurred an investment boom, which in turn led to particularly strong economic growth [Gordon (1980), p. 117]. Similarly, Blanchard (1993) puts most of the blame for the 1990-1991 recession on a negative consumption shock, presumably in reaction to the invasion of Kuwait by Iraq. These explanations suggest that changes in consumption might predict changes in output. Alternatively, consumers might observe an exogenous shock to the economy and accordingly adjust their consumption levels; if this adjustment occurs more rapidly on average than the associated adjustment in output, then changes in consumption will help to predict changes in output, although not because of exogenous movements in consumption but rather because of the exogenous shocks observed by consumers. The marginal R2s in Table 3 are consistent with both views.
However, the large values of these statistics should be interpreted cautiously, because many components of quarterly services consumption in particular are constructed by judgmental interpolation from ex post annual surveys and thus incorporate future data; this would tend to produce spurious Granger causality.

Investment in equipment and nonresidential structures is procyclical with a lag, based on the cross-correlations in Table 2. These series also lag output in the sense
Ch. 1: Business Cycle Fluctuations in US Macroeconomic Time Series
of Table 3: they produce only moderate improvements in forecasts of output, but output produces large improvements in forecasts of these series and of total investment, especially at the one-year horizon. The cyclical component of the change in business inventories relative to trend GDP is procyclical and large, with a standard deviation that is approximately 25% of the total cyclical standard deviation in GDP. In a mechanical sense, this means that changes in business inventories, which constitute but a small fraction of total GDP, account for one-fourth of the cyclical movements in GDP [see Blinder and Holtz-Eakin (1986)].

Investment in structures, especially residential structures, is procyclical and highly volatile. Housing can be thought of as an asset that provides a net revenue stream far into the future or as a consumer durable with a very low depreciation rate. Either way, housing prices will be interest sensitive and sensitive to fluctuations in the aggregate cycle, especially if potential homeowners face liquidity constraints. The strong procyclicality of housing and its good predictive properties for output in Table 3 are consistent with this interpretation.

Although imports are strongly procyclical, exports tend not to move strongly with the aggregate business cycle. On net, this leaves the trade balance countercyclical, as found by de la Torre (1997) for many other developed economies. It is noteworthy that government nondefense purchases exhibit considerable volatility at business cycle frequencies, but that their movements are largely unrelated to the business cycle. Moreover, government purchases make a negligible contribution to forecasting fluctuations in real GDP at either the one- or four-quarter horizon. This is consistent with exogenous nondefense spending not being a significant source of the postwar US business cycle 11.

3.2.3. Aggregate employment, productivity and capacity utilization
Like sectoral employment, total employment, employee hours and capacity utilization are strongly procyclical, and the unemployment rate is strongly countercyclical. The employment series lag the business cycle by approximately one quarter, while the capacity utilization rate is approximately coincident with the cycle. Other labor market series tend to lead the cycle, however, as measured by their cross-correlations and/or by the marginal R2s in Table 3. For example, the vacancy rate has considerable marginal predictive content for real GDP growth. This accords with Blanchard and Diamond's finding that the vacancy rate has substantial predictive content for new hires, given the lagged unemployment rate and lagged hires [Blanchard
11 Another explanation which is consistent with these correlations is that non-defense spending is fine-tuned optimally to stabilize output, which would imply that the spending series has no predictive content for future fluctuations in output. While a theoretical possibility, in practice this would require a reaction time and a degree of central control that are implausible in light of the slow and bureaucratic procurement process through which government purchases in the United States are actually made.
and Diamond (1989)] 12. Also, flows into unemployment, as measured by new claims for unemployment insurance, lead the cycle by one quarter in the sense of Table 2. Both total factor productivity and labor productivity are procyclical and slightly lead the cycle in the sense of Table 2. Both series also make modest contributions to forecasts of output.

3.2.4. Prices and wages
The statistics presented here make it possible to address two questions. First, are prices procyclical or countercyclical? Prices are commonly treated as procyclical, but recent studies by Kydland and Prescott (1990), Cooley and Ohanian (1991), and Backus and Kehoe (1992) present evidence that the cyclical component of prices is countercyclical. Second, are the business cycle properties of different price series similar or different?

First, consider the broad price measures (the Consumer Price Index (CPI) and the GDP deflator). Consistent with the findings of Kydland and Prescott (1990) and Backus and Kehoe (1992), the cyclical component of the level of prices is countercyclical. The evidence in Table 2 suggests that these broad measures lead the cycle by approximately two quarters. This correlation is strong (the cross-correlation with the CPI at a lead of two quarters is -0.68, for example), and inspection of the figures suggests that this countercyclical pattern has been relatively stable since 1953. Although these price levels are countercyclical, the cyclical components of the rates of inflation of these prices are strongly procyclical and lag the business cycle. This pattern is clearly apparent in the figures: the cyclical component of the CPI inflation rate declines during and after each of the eight recessions since 1953. This distinction between correlations in levels and correlations in first differences matters for the implications of these facts for economic models; see for example Ball and Mankiw (1994).

This pattern of leading, countercyclical price levels and lagging, procyclical rates of inflation is present for some but not all factor prices. The nominal wage index exhibits a pattern quite similar to the CPI. One explanation for this is the contractual indexing of nominal wages to the CPI, a practice that became widespread during the inflation of the 1970s.
In contrast, real wages have essentially no contemporaneous comovement with the business cycle. The cross-correlations suggest that changes in real wages lag the cycle by approximately one year, but these cross-correlations are low. Real wages have no predictive content for output growth at the one- or four-quarter horizons. The
12 Blanchard and Diamond (1989, footnote 24) use a modification of the help-wanted index, which adjusts for trend discrepancies between the help-wanted index and vacancies. These adjustments affect the trend level of the series, which is filtered out of the bandpass filtered version of the series that forms the basis of the results in Table 2.
weak cyclical movements of real wages have been viewed as poorly explained by a variety of macroeconomic theories [see Christiano and Eichenbaum (1992)] 13.

3.2.5. Asset prices and returns
Nominal interest rates are contemporaneously procyclical. The cross-correlations in Table 2 also indicate that interest rates are a leading indicator, with positive values of interest rates associated with cyclical declines in output approximately two to six quarters in the future. The leading indicator properties of interest rates, particularly the short-term rates, are also evident in Table 3: both three-month Treasury bills and the Federal Funds rate produce improvements in R2s exceeding 0.25 at the one-year horizon. Real rates are less cyclical than nominal rates; Table 2 suggests that they are weakly countercyclical and slightly leading, but Table 3 suggests that they have little predictive content for GDP growth at either the one- or four-quarter horizon.

The spread between long- and short-term interest rates has long been recognized as a leading indicator: an inverted yield curve (short rates exceeding long rates) is associated with subsequent declines in economic activity 14. Although the cross-correlations in Table 2 suggest that the yield curve actually lags the cycle, the considerable predictive content of the yield curve for real GDP at the one- and especially the four-quarter horizon is evident from the large marginal R2s in Table 3. It is also noteworthy that this forecasting relationship is unstable: the QLR test rejects at the 1% level and a break is estimated to have occurred in 1972. The risk premium for holding private debt, as measured by the spread between six-month commercial paper and the six-month US Treasury bill rate, is countercyclical with a lead of approximately one year. This series also has considerable predictive power for output [see Friedman and Kuttner (1993) for additional discussion and interpretation]. The statistics for stock prices must be interpreted with particular care.
A model that provides a good first approximation is that log stock prices follow a martingale, so that deviations of stock returns from their mean are unforecastable. Thus, as discussed in Section 3.1, the strong cyclical fluctuations in stock prices should be understood as a consequence of the bandpass filter; by retaining only fluctuations at these frequencies, the filtered version of stock prices will not be a martingale. Still, it is noteworthy that this filtered version is moderately procyclical and indeed somewhat leads the cycle. These cross-correlations and the marginal R2s are consistent with stock prices being a leading indicator of the cycle, which in turn is consistent with the principle that stock
13 Barsky, Parker and Solon (1994) provide evidence that the lack of relation between the real wage and the business cycle is in part an artifact of how the real wage index is constructed, in which the index weights fail to capture changes in the composition of employment over the business cycle. Holding composition constant, they conclude that real wages are procyclical.
14 See Estrella and Hardouvelis (1991) and Stock and Watson (1989). As of this writing, the spread between ten-year US Treasury bonds and the Federal Funds rate has been included in the composite Index of Leading Economic Indicators [The Conference Board (1996)].
prices reflect market participants' expectations of discounted future earnings. Notably, movements in real GDP do not substantially help to predict stock returns, a finding consistent with the view that log stock prices follow a martingale.

3.2.6. Monetary aggregates
In theory, money plays an important role in the determination of the price level and, because of various nominal frictions in the economy, can result in movements in real quantities. In practice, quantifying this link is difficult because it requires defining and measuring "money". The postwar period has seen extraordinary growth in the financial sector and in the diversity of financial instruments available to consumers and businesses, and these changes have made the task of measuring money a difficult one that has attracted considerable attention at central banks over the past decades. Here, we consider two measures of money: the monetary base, a variable which is essentially under the short-term control of the Federal Reserve Bank, and a broader aggregate, M2.

Over the full sample, the log level of nominal M2 is procyclical with a lead of two quarters, and the nominal monetary base is weakly procyclical and leading. Inspection of the plot of their cyclical components, however, suggests that these procyclical movements were more pronounced before 1980 than after; indeed, the contemporaneous cross-correlation between the cyclical components of nominal M2 and real GDP is 0.6 for 1959-1979, but this drops to -0.1 for 1980-1996. In contrast, the growth rates of nominal M2 and the nominal monetary base are countercyclical and lagging. The real monetary aggregates are more strongly procyclical than their nominal counterparts, but this relationship too has weakened since the mid 1980s.

There is a large literature on the empirical relationship between money and output. Over the past two decades, much of this literature has focused on whether money Granger-causes output [seminal works are Sims (1972, 1980)]. The results in Table 3 indicate that the real monetary base and real M2 both have predictive content for output.
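The "predictive content" statistics used throughout (the marginal R2s of Table 3) can be sketched as follows. This is our own simplified version of the idea, not the chapter's exact specification: the statistic is the increase in in-sample R2 when four lags of a candidate predictor are added to a univariate autoregression.

```python
import numpy as np

def lagmat(x, nlags):
    """Columns are x lagged 1..nlags, rows aligned to t = nlags..T-1."""
    T = len(x)
    return np.column_stack([x[nlags - j:T - j] for j in range(1, nlags + 1)])

def marginal_r2(y, x, nlags=4):
    """Increase in in-sample R2 from adding nlags lags of x to an
    autoregression of y on its own nlags lags (plus a constant)."""
    yy = y[nlags:]
    Z_ar = np.column_stack([np.ones(len(yy)), lagmat(y, nlags)])
    Z_full = np.column_stack([Z_ar, lagmat(x, nlags)])

    def r2(Z):
        b, *_ = np.linalg.lstsq(Z, yy, rcond=None)
        e = yy - Z @ b
        d = yy - yy.mean()
        return 1.0 - (e @ e) / (d @ d)

    return r2(Z_full) - r2(Z_ar)

# x Granger-causes y here, so its marginal R2 should be substantial.
rng = np.random.default_rng(2)
T = 400
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.5 * rng.standard_normal()
```

A series with no predictive content yields a marginal R2 near zero, apart from the small in-sample gain that any four added regressors produce.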
Like many forecasting relations with narrow definitions of money, those with the monetary base are unstable: the QLR test rejects all specifications with base money at the 1% level, and identifies a break in 1972. Stability is not rejected for the specifications with the broader aggregate, M2. Although the monetary aggregates have predictive power for output in these bivariate relations, once one controls for other aggregate variables, in particular interest rates, the predictive content of real or nominal monetary aggregates for real output is reduced, although nominal M2 is not eliminated from forecasts of nominal income [see Friedman and Kuttner (1992) and Feldstein and Stock (1994)].

3.2.7. Miscellaneous leading indicators
Over the years, economic forecasters have found many series which are precursors of the aggregate cycle but which do not fit neatly into the previous categories. The seminal work on leading economic indicators is Mitchell and Burns (1938). The
cyclical properties of a few such leading indicators are summarized in Tables 2 and 3. Building permits (housing starts) are a measure of future housing expenditures, and new orders are a measure of future expenditures on durable goods; both series are procyclical and have considerable predictive content for output. Expectations of future economic variables play an important role in modern macroeconomic theories, and consumer expectations are procyclical, lead the aggregate cycle, and have some predictive content for output.

3.2.8. International output
The economies of various countries are linked through trade in goods and services, financial markets, and the diffusion of technology. For these and other reasons, the cyclical components of output in developed economies exhibit some common comovement. Some of these comovements with the US cycle are summarized in Tables 2 and 3 for Canada, France, Japan, the United Kingdom, and Germany. The Canadian and US economies are closely linked, and not surprisingly the Canadian and US business cycles are highly correlated. The cycles in the other four countries are weakly positively correlated with and lag the US cycle. US output predicts UK output, but output from none of these five countries substantially helps to predict US output. These statistics only scratch the surface of the many important issues involved in the empirical analysis of international cyclical fluctuations, including the international transmission of business cycles, international comovements of consumption, the effect of common supply shocks, and risk sharing using foreign asset markets. These issues are beyond the scope of this survey of the US business cycle, and interested readers are referred to Backus and Kehoe (1992) and Baxter (1995).

3.2.9. Stability of the predictive relations
The QLR tests in Table 3 suggest a considerable amount of instability in these time series models. The hypothesis of stability is rejected at the 10% level in 18 of the 70 bivariate predictive relations, and in 36 of the 70 univariate autoregressions 15. If the relationships were stable, only seven rejections would be expected by random chance at the 10% level. In the bivariate relations, the rejections are concentrated in regressions involving the monetary base, wage rates, some measures of employment and unemployment, and some interest rates. Although the estimated breaks do not occur at a single date, most of the breaks in the bivariate models are estimated to have occurred in the late 1960s or early 1970s, a period associated with the reduction in the trend growth rate of the economy as seen in Figure 2.6.
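The mechanics of the QLR (sup-F) break test described in footnote 10 can be sketched as follows. This is a minimal homoskedastic-F illustration under assumed names (`qlr_stat` is our own), not the authors' code:

```python
import numpy as np

def ols_ssr(y, X):
    """Sum of squared residuals from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return float(e @ e)

def qlr_stat(y, X, trim=0.15):
    """Max over candidate break dates tau of the Chow F-statistic
    comparing the full-sample fit to the two subsample fits, with
    100*trim% of observations trimmed from each end."""
    T, k = X.shape
    ssr_full = ols_ssr(y, X)
    k0 = int(np.floor(trim * T))
    best, best_tau = -np.inf, None
    for tau in range(k0, T - k0):
        ssr_sub = ols_ssr(y[:tau], X[:tau]) + ols_ssr(y[tau:], X[tau:])
        f = ((ssr_full - ssr_sub) / k) / (ssr_sub / (T - 2 * k))
        if f > best:
            best, best_tau = f, tau
    return best, best_tau

# A regression whose coefficients shift halfway through the sample.
rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
y = np.where(np.arange(T) < 100, X @ [0.0, 1.0], X @ [2.0, -1.0])
y = y + 0.5 * rng.standard_normal(T)
stat, tau = qlr_stat(y, X)
```

P-values require the nonstandard sup-F distribution (the Hansen (1997) approximation cited in the text), which is not reproduced here.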
15 Stock and Watson (1996b) find similar evidence of instability in their examination of 5700 bivariate relations using US monthly data.
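The benchmark of "only seven rejections expected by random chance" can be checked with a back-of-the-envelope calculation: under stability and (counterfactually) independence across the 70 relations, the number of 10%-level rejections is Binomial(70, 0.10), and 18 or more rejections is extremely unlikely.

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(k, n + 1))

expected = 70 * 0.10              # about 7 rejections expected by chance
p_tail = binom_sf(18, 70, 0.10)   # probability of 18 or more rejections
```

The independence assumption is of course too strong for overlapping macro series, so this is only a rough calibration of how surprising 18 rejections would be.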
4. Additional empirical regularities in the postwar US data

4.1. The Phillips curve
Over the past 40 years the term "Phillips curve" has been used to denote three distinct characteristics of the unemployment-inflation relationship. The first is a stable statistical relationship between the unemployment rate and the level of inflation [Phillips (1958), Samuelson and Solow (1960)]. The second is a stable statistical relationship between the unemployment rate and changes in inflation (or more generally unanticipated inflation) [Gordon (1982a, 1982b)]. The third is a structural relationship describing the simultaneous adjustments of both real activity and prices to changes in aggregate demand [Friedman (1968); Phelps (1967); Lucas (1972); Taylor (1980)].

In this subsection we present evidence relating to the first two concepts of the Phillips curve as an empirical regularity. There is a large literature related to the third concept of the Phillips curve as a structural economic relation. The key issue in this literature is the econometric identification of aggregate demand shocks. There is a large literature on identifying aggregate shocks, most recently in the context of structural vector autoregressions [see King and Watson (1994) for a discussion in the context of the Phillips curve], but these matters go beyond the scope of this chapter and are not taken up here.

In this subsection we address three questions. First, is there a stable negative relationship between the unemployment rate and the rate of inflation, as first documented by Phillips (1958) for the UK and Samuelson and Solow (1960) for the US? Our answer to this question is a qualified no: while there is no stable relationship between the levels of inflation and unemployment, there is a clear and remarkably stable negative relation between the cyclical components of inflation and unemployment. Second, is there a stable negative relationship between the unemployment rate and future changes in the inflation rate?
Our answer to this question is yes: there are large marginal R2s associated with adding lags of the unemployment rate to an autoregression of changes in inflation, and the resulting forecasting relation is stable over the sample period. Third, does the empirical Phillips curve provide a useful basis for estimating the level of unemployment at which inflation is predicted to be constant, that is, the Non-Accelerating Inflation Rate of Unemployment (NAIRU)? Here, the answer is a qualified no: estimates of the NAIRU obtained from conventional specifications of the Phillips curve suggest that the NAIRU is well-defined empirically and has been fairly stable over the postwar period, but that the actual value of the NAIRU is imprecisely estimated.

Figure 4.1 is a scatterplot of the level of the unemployment rate and the quarterly inflation rate (computed from the CPI) from 1953:I to 1996:IV. There appears to be little relationship between the series, and indeed the simple correlation between the variables is 0.16. If attention is restricted to sub-periods, however, a negative but unstable relationship emerges (in Figure 4.1, data for the three periods 1953-1970, 1971-1983 and 1984-1996 are plotted using different symbols). Evidently there was
Figure 4.1. Scatterplot of the unemployment and inflation rates in levels. Open circles, 1953-1970; triangles, 1971-1983; solid circles, 1984-1996.
Figure 4.2. Scatterplot of cyclical components of the unemployment and inflation rates. Open circles, 1953-1970; triangles, 1971-1983; solid circles, 1984-1996.
a negative relation in the 1950s and 1960s, but this relation shifted out dramatically in the 1970s, and shifted back somewhat during the 1980s. Controlling for these shifts, there is relative stability: the sample correlation of the observations from 1953-1970 and 1971-1983 is -0.4, and falls to -0.3 in the 1984-1996 sub-period. This suggests that inflation and the unemployment rate may be negatively related over suitably short horizons, but that this relationship is obscured by their longer-run movements. To investigate this, Figure 4.2 presents a scatterplot of the cyclical components of the unemployment and inflation rates over the same period, computed using the bandpass filter. Recall from Section 2.2 that the bandpass filter eliminates the long-run (zero frequency) movements in these series. A clear negative relation is apparent. Moreover the relationship appears to be quite stable over the sub-samples; the full-sample correlation is -0.6 and ranges from -0.4 to -0.65 in the sub-sample periods. Taken together, Figures 4.1 and 4.2 suggest that there is not a stable relation between the levels of the unemployment and inflation rates but that there is a stable negative relation between the cyclical components of these series.

Figure 4.3 is a scatterplot of the annual change in the annual inflation rate over the next year (more precisely, 100[ln(CPI_{t+4}/CPI_t) - ln(CPI_t/CPI_{t-4})]) against the current unemployment rate. There is a negative relationship, although it is not quite as distinct as the relationship between the bandpass filtered levels of the series shown in Figure 4.2. These scatterplots fail to account for the possibly lengthy dynamic adjustment of prices and unemployment to macroeconomic shocks. Nevertheless, the main lessons
Figure 4.3. Scatterplot of the unemployment rate and changes in future inflation. Open circles, 1953-1970; triangles, 1971-1983; solid circles, 1984-1996.
from Figures 4.2 and 4.3 are supported by regressions that predict future inflation using lags of both the unemployment rate and inflation. The marginal R2s from adding four lags of the unemployment rate to a regression predicting inflation over the next k quarters using four quarterly lags of inflation are 0.18 for predicting inflation k = 1 quarter ahead, 0.23 two quarters ahead, 0.28 four quarters ahead, and 0.25 eight quarters ahead (these are in-sample marginal R2s for regressions run from 1953:I to 1996:IV). Moreover these regressions are stable: the QLR statistic for the one-step ahead forecasting regression has a p-value of 27%. Evidently the unemployment rate has considerable predictive content for annual inflation, and the QLR statistic fails to detect instability in this relationship.

The relative stability of the scatterplot in Figure 4.3 has led some to treat the NAIRU as an empirical expression of Friedman's notion of a natural rate of unemployment [Friedman (1968)]. Accordingly, this version of the Phillips curve has come to provide a guidepost for monetary policy: if unemployment persists too long below the NAIRU, inflation is predicted to increase. There is a significant literature on the estimation of the NAIRU; see for example Gordon (1982b, 1998) and the references therein. Currently, regression formulations of the Phillips curve typically include various control variables relating to specific factors such as the 1972-1974 wage and price controls and the energy price shocks of the 1970s in addition to lags of unemployment and inflation. Accordingly, a standard formulation of the Phillips curve is

Δπ_{t+1} = β(L)(u_t − ū) + γ(L)Δπ_t + δ(L)X_t + e_t,    (4.1)

where β(L), γ(L), and δ(L) are lag polynomials, u_t is the unemployment rate, π_t is the rate of inflation, X_t denotes the supply shock control variables, and ū is the NAIRU. In Equation (4.1), the NAIRU is assumed to be constant; alternatively, the NAIRU could
Table 4
Estimates of the slope of the Phillips curve and of the NAIRU, 1953-1996 a

Inflation series:                      CPI            CPI            GDP deflator   GDP deflator
NAIRU model:                           constant       spline         constant       spline

β(1)                                   -0.204         -0.367         -0.167         -0.237
(standard error)                       (.078)         (.121)         (.064)         (.105)

Estimates of NAIRU (ū_t) and 95% confidence intervals:
70:1                                   6.11           5.77           5.96           6.31
                                       (4.91, 7.73)   (4.56, 8.02)   (4.69, 7.39)   (4.86, 12.48)
80:1                                   6.11           7.05           5.96           6.63
                                       (4.91, 7.73)   (5.38, 8.40)   (4.69, 7.39)   (2.52, 8.28)
90:1                                   6.11           6.47           5.96           6.29
                                       (4.91, 7.73)   (4.63, 8.42)   (4.69, 7.39)   (2.76, 9.19)
F-test (p-value) of constant NAIRU     NA             1.53 (0.171)   NA             0.969 (0.448)

a Regression: Δπ_t = β(L)(u_{t-1} − ū_t) + γ(L)Δπ_{t-1} + δ(L)X_t + e_t. The regressions were estimated using quarterly data over the period 1953:I-1996:IV. Unemployment is the total civilian unemployment rate. All regressions contain four lags each of the change of inflation and the unemployment rate. The spline model of the NAIRU specifies the NAIRU as evolving according to a cubic spline, with three equidistant knot points. β(1) is the sum of the coefficients on lagged unemployment. The confidence intervals for the NAIRU are constructed using Fieller's method. In all specifications, one lag of a food and energy supply shock variable (the difference between food and energy inflation and general inflation) and a variable for the Nixon price controls (taken from Gordon 1982b) were included. For additional discussion and references see Staiger, Stock and Watson (1997).

be expressed as a flexible function of time to allow for potential time variation in the NAIRU. Table 4 reports estimates of Equation (4.1) for different measures of inflation and for different specifications of the NAIRU. These estimates indicate that β(1) (the sum of the coefficients on u_t and its lags) is statistically significant, and in this sense the NAIRU is well defined. There is some evidence that the NAIRU has changed over the postwar period; however, this time variation is moderate, within a range of approximately one percentage point of unemployment. Unemployment and its lags are strongly significant in these regressions. These results reinforce the conclusion that there is a stable Phillips relation between changes of inflation and unemployment. However, the resulting estimates of the NAIRU are imprecise: most of the actual values of unemployment over this period fall within the reported 95% confidence intervals for the NAIRU. Somewhat more precise estimates of the NAIRU can be obtained using certain (but not all) narrowly defined measures of core inflation.
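With a constant NAIRU, one lag each in β(L) and γ(L), and no supply-shock controls, Equation (4.1) reduces to Δπ_{t+1} = c + β0 u_t + γ0 Δπ_t + e_t, and the implied NAIRU is -c/β(1). The simulation below is purely illustrative (all parameter values are our own, not estimates from the chapter):

```python
import numpy as np

rng = np.random.default_rng(3)
T, nairu, beta0, gamma0 = 2000, 6.0, -0.3, 0.4

# Persistent unemployment fluctuating around the (known) NAIRU.
u = np.empty(T)
u[0] = nairu
for t in range(1, T):
    u[t] = nairu + 0.9 * (u[t - 1] - nairu) + 0.3 * rng.standard_normal()

# Changes in inflation generated by the one-lag special case of (4.1).
dpi = np.zeros(T)
for t in range(1, T):
    dpi[t] = beta0 * (u[t - 1] - nairu) + gamma0 * dpi[t - 1] \
             + 0.2 * rng.standard_normal()

# OLS on dpi_t = c + beta0*u_{t-1} + gamma0*dpi_{t-1} + e_t, then
# back out the implied NAIRU as -c / beta(1).
Z = np.column_stack([np.ones(T - 1), u[:T - 1], dpi[:T - 1]])
c, b, g = np.linalg.lstsq(Z, dpi[1:], rcond=None)[0]
nairu_hat = -c / b
```

Even in this clean setting the NAIRU estimate is a ratio of estimated coefficients, which is one reason the confidence intervals in Table 4 (constructed by Fieller's method) can be wide when β(1) is small.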
Generally speaking, however, the main findings of a
stable Phillips relation, with a NAIRU that is imprecisely measured, and unemployment having considerable marginal forecasting content for inflation, are highly robust across specifications; see Staiger, Stock and Watson (1997).

4.2. Selected long-run relations
The focus so far has been on fluctuations over business cycle frequencies. There are however some important relations among macroeconomic variables that might be expected to hold over long horizons, although their relationship might be less transparent over short horizons. In this section, we look at three such empirical relationships: long-run money demand; the spread between short- and long-term interest rates; and the so-called balanced growth relations, which refer to consumption-income and investment-income ratios.

The key hypothesis that permits examining these long-run relations is that linear combinations of the series based on these long-run relations are considerably less persistent than are the series themselves. Thus, although the rates on 90-day Treasury bills and 30-year Treasury bonds are each highly persistent series, the spread (or difference) between these two rates is less persistent and tends to revert to a constant mean. One formulation of this idea is that the long and short rate both have a unit root, but that the spread does not; in this case, the long and short rates are said to be cointegrated, with a cointegrating coefficient of one [Engle and Granger (1987)]. There is now a vast literature on cointegration; see Watson (1994b) for a survey. The treatment here focuses on examining the stability and reduced persistence of these long-run relations, rather than on the formal methods of cointegration.

The main measure of persistence used here is the magnitude of the largest autoregressive roots in the individual series and in the residual from the long-run relation. If this root is large, then shocks to that series are highly persistent; if the root is one, then the effect of that shock persists into the infinite future. On the other hand, if the root is small, then the process decays quickly after a shock.

4.2.1. Long-run money demand
The relation between money and output over the long run has been of enduring interest in economics. Annual data on the logarithm of M1 velocity (the ratio of output, here GNP, to M1) and the commercial paper rate over the period 1915-1996 are plotted in Figure 4.4. Evidently both the commercial paper rate and velocity exhibit trend movements, although this trend is variable. At a visual level, there appears to be considerable long-run comovement between these two series, although the comovements over short horizons are less strong; see Lucas (1988). Estimates of the long-run relation between the logarithm of real money, the logarithm of real GNP, and the nominal interest rate are given in Table 5. Estimates are computed using two methods: a cointegrating regression [specifically, the dynamic OLS (DOLS) method of Stock and Watson (1993)], and a method that does not require exact
Ch. 1:
Business Cycle Fluctuations in US Macroeconomic Time Series
Fig. 4.4. M1 velocity (log, scaled) and the commercial paper rate.
cointegration [the full information maximum likelihood (FIML) method of Stock and Watson (1996a)]. The residuals from the FIML estimates are plotted in Figure 4.5. The point estimates in Table 5 indicate that there is an income elasticity of approximately 0.9 and an interest semi-elasticity of approximately -0.1, values which accord with other estimates of these long-run coefficients [see Hoffman and Rasche (1991)]. In contrast to the series themselves, it is evident from Figure 4.5 that the residuals from the long-run money demand relation exhibit considerable mean reversion. The past twenty years have seen historically large deviations from this long-run relation, but these deviations appear to persist for only a few years.

Table 5
Estimates of long-run money demand, 1921-1996 a

Estimation method                               βy               βr
Dynamic OLS (DOLS, Stock and Watson 1993)       0.868 (0.070)    -0.094 (0.018)
Full information maximum likelihood (FIML)      0.874            -0.096

a The regression is mt = a + βy yt + βr rt + ut. Both DOLS and FIML are implemented using two leads and lags of the annual data. For the DOLS estimates, the regressions are run from 1918 to 1994, with earlier values used for initial conditions. Standard errors are in parentheses.

The standard errors computed using the DOLS method are predicated on the long-run money demand being a cointegrating regression, with log output (yt) and the interest rate (rt) having an exact unit root. However, the assumption of an exact unit root is not plausible for interest rates and need not be true for output, so these standard errors are questionable. Alternative 95% confidence regions that do not rely on the
J.H. Stock and M.W. Watson

Fig. 4.5. Long-run money demand residual.
exact unit root assumption, computed using the methods in Stock and Watson (1996a), contain a unit income elasticity. Thus these results are consistent with there being a stable long-run relation between velocity and interest rates, which can be thought of as a stable long-run money demand relation.
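The DOLS estimator used in Table 5 augments the levels regression with leads and lags of the first-differenced regressors. The following sketch illustrates the idea on simulated data; the function `dols`, the lag choice, and the artificial series are illustrative assumptions, not the chapter's actual implementation or data.

```python
# Sketch of dynamic OLS (DOLS) for a long-run relation m_t = a + b_y*y_t + b_r*r_t + u_t,
# in the spirit of Stock and Watson (1993): regress the level of m on the levels of the
# regressors plus leads and lags of their first differences. Simulated data only.
import numpy as np

def dols(m, X, leads_lags=2):
    """Regress m on X plus `leads_lags` leads and lags of diff(X); return long-run coefficients."""
    T, k = X.shape
    dX = np.diff(X, axis=0)                       # first differences, length T-1
    rows = range(leads_lags, T - leads_lags - 1)  # usable observations
    Z = []
    for t in rows:
        z = [1.0]
        z.extend(X[t])                            # levels carry the long-run coefficients
        for j in range(-leads_lags, leads_lags + 1):
            z.extend(dX[t + j])                   # leads and lags of the differences
        Z.append(z)
    Z = np.asarray(Z)
    y = m[list(rows)]
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta[1:1 + k]                          # coefficients on the levels only

# Simulated example: m_t = 0.9*y_t - 0.1*r_t + stationary noise
rng = np.random.default_rng(0)
T = 200
y = np.cumsum(rng.normal(size=T))                 # I(1) "income"
r = np.cumsum(rng.normal(size=T))                 # I(1) "interest rate"
m = 0.9 * y - 0.1 * r + rng.normal(scale=0.2, size=T)
b = dols(m, np.column_stack([y, r]))
print(b)  # close to [0.9, -0.1]
```

Because the regression is cointegrating, the level coefficients are superconsistent, so even this short simulated sample recovers the long-run elasticities accurately.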
4.2.2. Spreads between long-term and short-term interest rates

Annual data on interest rates on long-term high grade industrial bonds and short-term commercial paper, and the spread between these two rates, are plotted in Figure 4.6 over the period 1900-1996. These rates have fluctuated over a fairly large range over this period. They also exhibit considerable persistence: rates were low during much of the Depression and the 1940s, and were high relative to their historical values during the 1970s and 1980s. In contrast, the spread between these two rates is more stable and, during most episodes, exhibits considerably more short-term volatility. A similar pattern is evident in the postwar data in Figure 4.7 on 90-day Treasury bill rates and ten-year Treasury bond rates.

Fig. 4.6. Private interest rates and spread (industrial bond rate, commercial paper rate, spread).

Fig. 4.7. Interest rates on government debt and spread (10-year T-bond rate, 3-month T-bill rate, spread).

Table 6
Largest autoregressive roots of interest rates and spreads a

                                              Largest root
Series                        Sample period   OLS     Median unbiased   90% confidence interval
High grade industrial bonds   1900-1996       0.95    1.02              0.91-1.04
Commercial paper              1900-1996       0.86    0.90              0.78-1.03
Spread                        1900-1996       0.56    <0.60             <0.60-0.66
10-year Treasury Bond         1953-1996       0.84    1.05              0.83-1.09
90-day Treasury Bill          1953-1996       0.76    0.87              0.61-1.07
Spread                        1953-1996       0.22    <0.11             <0.11-0.43

a All estimates are based on annual data. OLS refers to ordinary least squares. The median unbiased estimates and the 90% confidence interval are computed by inverting the Dickey-Fuller (1979) unit root test statistic (including a constant and time trend) using the method described by Stock (1991), with the number of lags selected by the Akaike Information Criterion (AIC). Upper bounds (denoted by <) rather than point values are reported for the median unbiased estimate and confidence interval endpoints when these values are less than the smallest values tabulated by Stock (1991).

Of course, over these periods there have been great
changes in financial markets, and these changes would arguably induce instabilities in the relation between these rates. Empirical estimates of persistence, as measured by the value of the largest autoregressive root of each series, are given in Table 6. These estimates support the view that the spreads are considerably less persistent than the interest rates themselves 16. Indeed, the hypothesis of a unit root cannot be rejected for each of the four interest rate series. In contrast, the largest autoregressive roots for the two spreads are small.

4.2.3. Balanced growth relations
Another set of long-run relations are the so-called balanced growth relations among consumption, income and output. Simple stochastic equilibrium models that incorporate growth imply that even though these aggregate variables may contain trends, including stochastic trends, their ratios should be stationary; see King, Plosser and Rebelo (1988) and King, Plosser, Stock and Watson (1991). These aggregates are plotted in Figure 4.8, and their log ratios are plotted in Figure 4.9. Although the aggregates have grown significantly since 1953, their ratios have been more stable.

Fig. 4.8. Major macroeconomic aggregates (GNP, consumption, investment, government purchases).
16 Because the ordinary least squares (OLS) estimator of the largest autoregressive root is biased towards zero, a second, median unbiased estimator of this largest root is reported in Table 6. The median unbiased estimator is constructed following Stock (1991) by inverting the Dickey-Fuller (1979) test for a unit root in the relevant series. Also reported in Table 6 are 90% confidence intervals for this largest root, constructed using the method described by Stock (1991).
Fig. 4.9. Balanced growth ratios (logs): consumption/GNP, investment/GNP, government purchases/GNP.

Consistent with the high cyclical volatility of total investment in Table 2, the log investment/output ratio has been much more volatile than the log consumption/output ratio. Statistical evidence on the persistence of these series from 1953 to 1996 is presented in Table 7.

Table 7
Largest autoregressive roots of main NIPA aggregates and their ratios a

                          Growth rate        Largest root
Series                    (% per annum)   OLS     Median unbiased   90% confidence interval
Log levels:
  GDP (Y)                  3.1            0.89    1.06              0.96-1.10
  Consumption (C)          3.3            0.92    1.06              0.97-1.10
  Investment (I)           3.6            0.66    0.69              0.42-1.05
  Govt. purchases (G)      1.8            0.85    0.74              0.48-1.06
Log ratios:
  C-Y                      0.3            0.38    0.70              0.43-1.05
  I-Y                      0.4            0.51    0.32              <0.14-0.67
  G-Y                     -1.2            0.74    0.72              0.46-1.06

a Based on logarithms of annual data, 1953-1996. The method for estimating the largest autoregressive roots and for constructing confidence intervals is described in the notes to Table 6. The mean growth rate of each series was estimated using the Prais-Winsten method as described by Canjels and Watson (1997), with the same lag lengths as for the root statistics for that series.

The hypothesis of a unit autoregressive root is not rejected in favor of trend
stationarity at the 5% level for output, consumption or investment. Although a unit root cannot be rejected for the consumption-output ratio, the estimates of the largest root for the two balanced growth ratios are small. Although these statistics do not line up perfectly with the simple balanced growth predictions, they do suggest that these ratios are considerably more mean reverting than the aggregate series themselves. Plots and statistics for government purchases and the log government purchases/income ratio are also contained in these figures and tables. The trend growth rate of government purchases is considerably less than that of the other aggregates. The share of government purchases in output has dropped significantly over the postwar period, and this decline has been offset by an increase in the output shares of consumption and investment.
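The balanced growth logic of this subsection, that series sharing a common stochastic trend have log ratios far less persistent than the series themselves, can be illustrated with simulated data. Everything below is a hypothetical sketch, not the NIPA calculations.

```python
# If log consumption and log output share a common stochastic trend, the
# log ratio c_t - y_t is near white noise even though each level series
# behaves like a random walk with drift. Simulated data only.
import numpy as np

def ar1_coef(x):
    """OLS slope from regressing the demeaned series on its own lag."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])

rng = np.random.default_rng(2)
T = 400
trend = np.cumsum(0.008 + rng.normal(scale=0.01, size=T))  # common stochastic trend
y = trend + 0.01 * rng.normal(size=T)                      # log "output"
c = trend - 0.3 + 0.01 * rng.normal(size=T)                # log "consumption", lower level
root_level = ar1_coef(y)
root_ratio = ar1_coef(c - y)
print(root_level, root_ratio)  # near 1 for the level, near 0 for the ratio
```

The contrast mirrors Table 7: the log levels have estimated roots near one, while the balanced growth ratios are strongly mean reverting.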
Acknowledgements The authors have benefited from comments from and/or discussions with Michael Bordo, Christopher Carroll, Karen Dynan, Benjamin Friedman, Robert King, Jeffrey Miron, Adrian Pagan, Christopher Sims, and John Taylor. This research was supported in part by National Science Foundation Grants Nos. SBR-9409629 and SBR-9730489.
Appendix A. Description of the data series used in this chapter

This Appendix contains a description of the data series used in this chapter. Most of the series were obtained from Citibase; for these series, the uppercase names listed below refer to the Citibase labels for the series. The following abbreviations are used: sa = seasonally adjusted; saar = seasonally adjusted at an annual rate; par = percent at an annual rate. The numbers in parentheses in Section A.2 correspond to figure numbers used in Section 3. The series transformation, if any, is given in square brackets. If the QLR test result in Table 3 is based on the first difference of the series, this is noted in the series description by "QLR-FD".

A.1. Series used in Section 1
Industrial Production Index (total, 1992=100, saar). Source: Federal Reserve Board.

A.2. Series used in Section 2

(0) Gross Domestic Product
GDPQ: gross domestic product (bil 92 chained $, saar) [Log], QLR-FD
(1) Contract and Construction Employment
LPCC: employees on nonag. payrolls: contract construction (thous., sa) [Log], QLR-FD
(2) Manufacturing Employment
LPEM: employees on nonag. payrolls: manufacturing (thous., sa) [Log], QLR-FD
(3) Finance, Insurance and Real Estate Employment
LPFR: employees on nonag. payrolls: fin., insur. & real estate (thous., sa) [Log], QLR-FD
(4) Mining Employment
LPMI: employees on nonag. payrolls: mining (thous., sa) [Log], QLR-FD
(5) Government Employment
LPGOV: employees on nonag. payrolls: government (thous., sa) [Log], QLR-FD
(6) Service Employment
LPS: employees on nonag. payrolls: services (thous., sa) [Log], QLR-FD
(7) Wholesale and Retail Trade Employment
LPT: employees on nonag. payrolls: wholesale & retail trade (thous., sa) [Log], QLR-FD
(8) Transportation and Public Utility Employment
LPTU: employees on nonag. payrolls: trans. & public utilities (thous., sa) [Log], QLR-FD
(9) Consumption (Total)
GCQ: personal consumption expend-total (bil 92 chained $, saar) [Log], QLR-FD
(10) Consumption (Nondurables)
GCNQ: personal consumption expend-nondurables (bil 92 chained $, saar) [Log], QLR-FD
(11) Consumption (Services)
GCSQ: personal consumption expend-services (bil 92 chained $, saar) [Log], QLR-FD
(12) Consumption (Nondurables + Services) (AC)
GCNQ + GCSQ [Log], QLR-FD
(13) Consumption (Durables)
GCDQ: personal consumption expend-durables (bil 92 chained $, saar) [Log], QLR-FD
(14) Investment (Total Fixed)
GIFQ: fixed investment, total (bil 92 chained $, saar) [Log], QLR-FD
(15) Investment (Equipment)
GIPDEQ: private purch. of producers dur. equip. (bil 92 chained $, saar) [Log], QLR-FD
(16) Investment (Nonresidential Structures)
GISQF: purchases of nonres structures-total (bil 92 $, saar) [Log], QLR-FD
(17) Investment (Residential Structures)
GIRQ: fixed investment, residential (bil 92 chained $, saar) [Log], QLR-FD
(18) Change in Business Inventories (Relative to Trend GDP) (AC)
GVQ/GDPQT, where GDPQT is calculated as the low-pass filtered (periods > 8 years) component of GDPQ (unitless ratio, not in logarithms)
(19) Exports
GEXQ: exports of goods & services (bil 92 chained $, saar) [Log], QLR-FD
(20) Imports
GIMQ: imports of goods & services (bil 92 chained $, saar) [Log], QLR-FD
(21) Trade Balance (Relative to Trend GDP) (AC)
(GEXQ-GIMQ)/GDPQT, QLR-FD
(22) Government Purchases
GGEQ: gov. consumption exp. & gross investment (bil 92 chained $, saar) [Log], QLR-FD
(23) Government Purchases (Defense)
GGFENQ: nat. defense cons. exp. & gross inv. (bil 92 chained $, saar) [Log], QLR-FD
(24) Government Purchases (Non-Defense) (AC)
GGEQ-GGFENQ [Log], QLR-FD
(25) Employment: Total Employees
LPNAG: employees on nonag. payrolls: total (thous., sa) [Log], QLR-FD
(26) Employment: Total Hours
LPMHU: employee hours in nonagric. est. (bil. hours, saar) [Log], QLR-FD
(27) Employment: Average Weekly Hours (AC)
LPMHU/LPNAG [Log], QLR-FD
(28) Unemployment Rate
LHUR: unemployment rate: all workers, 16 years & over (%, sa), QLR-FD
(29) Vacancies (Help Wanted Index)
LHEL: index of help-wanted advertising in newspapers (1967=100, sa) [Log], QLR-FD
(30) New Unemployment Claims
LUINC: avg wkly initial claims, state unemploy. ins., exc. P. Rico (thous., sa) [Log], QLR-FD
(31) Capacity Utilization
IPXMCA: capacity util rate: manufacturing, total (% of capacity, sa) (frb), QLR-FD
(32) Total Factor Productivity (AC)
Solow's residual calculated using GDP less farm, housing and government (GBXHQF-GGEQ), employees on non-agriculture payrolls (LP), quarterly values of the capital stock [constructed by interpolating annual values of the fixed non-residential capital stock (KNQ) using quarterly values of fixed investment (GIFQ)], and a labor share value of 0.65, QLR-FD
(33) Average Labor Productivity
LBOUTU: output per hour all persons: nonfarm business (82=100, sa) [Log], QLR-FD
(34) Consumer Price Index (Level)
PUNEW: cpi-u: all items (82-84=100, sa) [Log], QLR-FD
(35) Producer Price Index (Level)
PW: producer price index: all commodities (82=100, nsa) [Log], QLR-FD
(36) Oil Prices
PW561: producer price index: crude petroleum (82=100, nsa) [Log], QLR-FD
(37) GDP Price Deflator (Level)
GDPD: gdp: implicit price deflator (index, 92=100) [Log], QLR-FD
(38) Commodity Price Index (Level)
PSCCOM: spot market price index: bls & crb: all commodities (67=100, nsa) [Log], QLR-FD
(39) Consumer Price Index (Inflation Rate)
Rate of Change in PUNEW (par), QLR-FD
(40) Producer Price Index (Inflation Rate)
Rate of Change in PW (par), QLR-FD
(41) GDP Price Deflator (Inflation Rate)
Rate of Change in GDPD (par), QLR-FD
(42) Commodity Price Index (Inflation Rate)
Rate of Change in PSCCOM (par), QLR-FD
(43) Nominal Wage Rate
LBCPU: compensation per hour: nonfarm business sec (1982=100, sa) [Log], QLR-FD
(44) Real Wage Rate (AC)
LBCPU/GMDC [Log], QLR-FD
(45) Nominal Wage Rate (Change)
Rate of change in LBCPU (par), QLR-FD
(46) Real Wage Rate (Change)
Rate of change in LBCPU/GMDC (par), QLR-FD
(47) Federal Funds Rate
FYFF: interest rate: federal funds (effective) (% per annum, nsa), QLR-FD
(48) Treasury Bill Rate (3 Month)
FYGM3: interest rate: US treasury bills, sec mkt, 3-mo. (% per ann, nsa), QLR-FD
(49) Treasury Bond Rate (10 Year)
FYGT10: interest rate: US treas. const maturities, 10-yr. (% per ann, nsa), QLR-FD
(50) Real Treasury Bill Rate (3 Month)
FYGM3 - Forecast of One Quarter of GMDC Growth, QLR-FD
(51) Yield Curve Spread (Long-Short) (AC)
FYGT10-FYGM3
(52) Commercial Paper/Treasury Bill Spread (AC)
FYCP-FYGM6
(53) Stock Prices
FSPCOM: S&P's common stock price index: composite (1941-43=10) [Log], QLR-FD
(54) Money Stock (M2, Nominal Level)
FM2: m2 (m1 + o'nite rps, euro$, g/p & b/d mmmfs & sav & sm time dep) (bil $, sa) [Log], QLR-FD
(55) Monetary Base (Nominal Level)
FMBASE: monetary base, adj for reserve req chgs (frb of st. louis) (bil $, sa) [Log], QLR-FD
(56) Money Stock (M2, Real Level) (AC)
FM2/GDPD [Log], QLR-FD
(57) Monetary Base (Real Level) (AC)
FMBASE/GDPD [Log], QLR-FD
(58) Money Stock (M2, Nominal Rate of Change)
Rate of Change in FM2 (par), QLR-FD
(59) Monetary Base (Real Rate of Change)
Rate of Change in FMBASE (par), QLR-FD
(60) Consumer Credit (AC)
CCIPY.GMPY, Consumer installment credit (bil, saar) [Log], QLR-FD
(61) Consumer Expectations
BCI Series UOM083, The Conference Board [Log], QLR-FD
(62) Building Permits
BCI Series A0M029, The Conference Board [Log]
(63) Vendor Performance
BCI Series A0M032, The Conference Board
(64) Mfrs' Unfilled Orders, Durable Goods Ind.
BCI Series A1M092, The Conference Board [Log], QLR-FD
(65) Mfrs' New Orders, Nondefense Capital Goods
BCI Series A0M027, The Conference Board [Log], QLR-FD
(66) Industrial Production - Canada
IPCAN: Industrial Production: Canada (1990=100, sa) [Log], QLR-FD
(67) Industrial Production - France
IPFR: Industrial Production: France (1987=100, sa) [Log], QLR-FD
(68) Industrial Production - Japan
IPJP: Industrial Production: Japan (1990=100, sa) [Log], QLR-FD
(69) Industrial Production - UK
IPUK: Industrial Production: United Kingdom (1987=100, sa) [Log], QLR-FD
(70) Industrial Production - Germany
IPWG: Industrial Production: West Germany/Germany (1990=100, sa) [Log], QLR-FD

A.3. Additional series used in Section 4
- Industrial Bond Yield: Yield on Long-Term Industrial Bonds (Highest Quality). Data from 1900-1946 are from the NBER Historical Data Base [see Feenberg and Miron (1997)], series
m13108. Data from 1947-1995 are from Citibase, series FYAAAI. Annual averages of monthly data.
- Commercial Paper Rates: Yield on 6-month Commercial Paper. Data from 1900-1946 are from the NBER Historical Data Base, series m13024. Data from 1947-1995 are from Citibase, series FYCP. Annual averages of monthly data.
- Money Supply: M1. Data from 1914-1958 are from the NBER Historical database, series m14016 and m14018 [Currency+DD, All Commercial Banks (SA), from Friedman and Schwartz (1963), Table A-1, Col. 7 and Friedman and Schwartz (1970), Table 1]. These were linked to the Citibase series FM1 (M1 from the Federal Reserve System) in 1959.
- Real GNP: Data from 1900-1928 are from Balke and Gordon (1989). Data from 1929-1995 are from the NIPA.
- GNP Deflator: Implicit Price Deflator, constructed as the ratio of nominal GNP to real GNP [Balke and Gordon (1989) for data from 1900-1928 and NIPA for data from 1929-1995].
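As a minimal sketch of the Solow-residual construction described for series (32) in Section A.2: under Cobb-Douglas technology with constant returns to scale and a labor share of 0.65, log TFP is log output minus the share-weighted log inputs. The arrays below are made-up stand-ins for the appendix's actual output, labor, and (interpolated) capital series.

```python
# Solow residual (log TFP) under Cobb-Douglas with constant returns:
#   log TFP = log Y - s_L * log L - (1 - s_L) * log K,  with s_L = 0.65.
import numpy as np

def solow_residual(log_y, log_l, log_k, labor_share=0.65):
    """Log total factor productivity as the share-weighted residual of output."""
    return log_y - labor_share * log_l - (1.0 - labor_share) * log_k

# Tiny worked example (illustrative quarterly observations, in logs)
log_y = np.array([4.60, 4.62, 4.65])
log_l = np.array([3.00, 3.01, 3.02])
log_k = np.array([5.10, 5.11, 5.11])
tfp = solow_residual(log_y, log_l, log_k)
print(tfp)
```

With quarterly capital interpolated from annual stocks using quarterly investment, as the appendix describes, the same formula applies period by period.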
References
Adelman, I., and F.L. Adelman (1959), "The dynamic properties of the Klein-Goldberger model", Econometrica 27:596-625.
Backus, D.K., and P.J. Kehoe (1992), "International evidence on the historical properties of business cycles", American Economic Review 82(4):864-888.
Balke, N.S., and R.J. Gordon (1989), "The estimation of prewar gross national product: methodology and new evidence", Journal of Political Economy 97:38-92.
Ball, L., and N.G. Mankiw (1994), "A sticky-price manifesto", Carnegie-Rochester Conference Series on Public Policy 41:127-151.
Barsky, R., J. Parker and G. Solon (1994), "Measuring the cyclicality of real wages: how important is the composition bias?", Quarterly Journal of Economics 109(1):1-25.
Baxter, M. (1995), "International trade and business cycles", in: G.M. Grossman and K. Rogoff, eds., Handbook of International Economics, vol. 3 (Elsevier, Amsterdam) 1801-1864.
Baxter, M., and R.G. King (1994), "Measuring business cycles: approximate band-pass filters for economic time series", manuscript (University of Virginia).
Blanchard, O.J. (1993), "Consumption and the recession of 1990-1991", American Economic Review 83(2, May):270-274.
Blanchard, O.J., and P.A. Diamond (1989), "The Beveridge curve", Brookings Papers on Economic Activity 1989(1):1-76.
Blinder, A.S., and D. Holtz-Eakin (1986), "Inventory fluctuations in the United States since 1929", in: R. Gordon, ed., The American Business Cycle: Continuity and Change, Studies in Business Cycles, vol. 25 (University of Chicago Press for the NBER, Chicago, IL).
Bry, G., and C. Boschan (1971), Cyclical Analysis of Time Series: Selected Procedures and Computer Programs (Columbia University Press for the NBER, New York).
Burns, A.F., and W.C. Mitchell (1946), Measuring Business Cycles (NBER, New York).
Canjels, E., and M.W. Watson (1997), "Estimating deterministic trends in the presence of serially correlated errors", Review of Economics and Statistics 79:184-200.
Christiano, L.J., and M. Eichenbaum (1992), "Current real-business-cycle theories and aggregate labor-market fluctuations", American Economic Review 82:430-450.
Conference Board (1996), "Details on the revision in the composite indexes", Business Cycle Indicators 1(11, December):3-5.
Cooley, T.F., and L.E. Ohanian (1991), "The cyclical behavior of prices", Journal of Monetary Economics 28:25-60.
de la Torre, M. (1997), "A study of a small open economy with non-tradeable goods", manuscript (Northwestern University).
Denson, E.M. (1996), The effects of changing industrial composition on the postwar economy, Ph.D. Dissertation (Northwestern University).
Dickey, D.A., and W.A. Fuller (1979), "Distribution of the estimators for autoregressive time series with a unit root", Journal of the American Statistical Association 74:427-431.
Diebold, F.X., and G.D. Rudebusch (1992), "Have postwar economic fluctuations been stabilized?", American Economic Review 82:993-1005.
Diebold, F.X., and A.S. Senhadji (1996), "The uncertain unit root in real GNP: a comment", American Economic Review 86:1291-1298.
Engle, R.F., and C.W.J. Granger (1987), "Cointegration and error correction: representation, estimation, and testing", Econometrica 55:251-276.
Englund, P., T. Persson and L.E.O. Svensson (1992), "Swedish business cycles: 1861-1988", Journal of Monetary Economics 30(3):343-372.
Estrella, A., and G.A. Hardouvelis (1991), "The term structure of interest rates as a predictor of real economic activity", Journal of Finance 46:555-572.
Feenberg, D., and J.A. Miron (1997), "Improving the accessibility of the NBER's historical data", Journal of Business and Economic Statistics 15:293-299.
Feldstein, M., and J.H. Stock (1994), "The use of a monetary aggregate to target nominal GDP", in: N.G.
Mankiw, ed., Monetary Policy (University of Chicago Press for the NBER, Chicago, IL) 7-70.
Friedman, B.M., and K.N. Kuttner (1992), "Money, income, prices, and interest rates", American Economic Review 82:472-492.
Friedman, B.M., and K.N. Kuttner (1993), "Why does the paper-bill spread predict real economic activity?", in: J.H. Stock and M.W. Watson, eds., Business Cycles, Indicators and Forecasting, Studies in Business Cycles, vol. 28 (University of Chicago Press for the NBER, Chicago, IL).
Friedman, M. (1968), "The role of monetary policy", American Economic Review 68:1-17.
Friedman, M., and A.J. Schwartz (1963), A Monetary History of the United States, 1867-1960 (Princeton University Press for the NBER, Princeton, NJ).
Friedman, M., and A.J. Schwartz (1970), Monetary Statistics of the United States (Columbia University Press for the NBER, New York).
Geweke, J. (1984), "Inference and causality in economic time series models", in: Z. Griliches and M. Intriligator, eds., Handbook of Econometrics, vol. 2 (Elsevier, Amsterdam) ch. 19:1101-1144.
Gordon, R.J. (1980), "Postwar macroeconomics: the evolution of events and ideas", in: M. Feldstein, ed., The American Economy in Transition (University of Chicago Press for the NBER, Chicago, IL).
Gordon, R.J. (1982a), "Price inertia and policy ineffectiveness in the United States, 1890-1980", Journal of Political Economy 90:1087-1117.
Gordon, R.J. (1982b), "Inflation, flexible exchange rates, and the natural rate of unemployment", in: M.N. Baily, ed., Workers, Jobs and Inflation (Brookings Institution, Washington).
Gordon, R.J. (1998), "Foundations of the Goldilocks economy: supply shocks and the time-varying NAIRU", Brookings Papers on Economic Activity 1998(2):297-333.
Granger, C. (1969), "Investigating causal relations by econometric models and cross-spectral methods", Econometrica 37:424-438.
Granger, C. (1980), "Testing for causality, a personal viewpoint", Journal of Economic Dynamics and Control 2:329-352.
Hamilton, J.D. (1994), Time Series Analysis (Princeton University Press, Princeton, NJ).
Hansen, B.E. (1997), "Approximate asymptotic p-values for structural-change tests", Journal of Business and Economic Statistics 15(1):60-67.
Harvey, A.C., and A. Jaeger (1993), "Detrending, stylized facts and the business cycle", Journal of Applied Econometrics 8(3):231-248.
Hassler, J., P. Lundvik, T. Persson and P. Söderlind (1992), The Swedish Business Cycle: Facts over 130 Years, Institute for International Economic Studies Monograph Series, no. 22 (University of Stockholm).
Hess, G.D., and S. Iwata (1997), "Measuring and comparing business cycle features", Journal of Business and Economic Statistics 15:432-444.
Hodrick, R., and E.C. Prescott (1981), "Post-war U.S. business cycles: an empirical investigation", Working Paper, Carnegie-Mellon University; printed 1997, Journal of Money, Credit and Banking 29:1-16.
Hoffman, D.L., and R.H. Rasche (1991), "Long run income and interest elasticities of money demand in the United States", Review of Economics and Statistics 73:665-674.
King, R.G., and C.I. Plosser (1994), "Real business cycles and the test of the Adelmans", Journal of Monetary Economics 33(2):405-438.
King, R.G., and S.T. Rebelo (1993), "Low frequency filtering and real business cycles", Journal of Economic Dynamics and Control 17:207-231.
King, R.G., and M.W. Watson (1994), "The post-war U.S. Phillips Curve: a revisionist econometric history", Carnegie-Rochester Conference Series on Public Policy 41:157-219.
King, R.G., C.I. Plosser and S.T. Rebelo (1988), "Production, growth, and business cycles: II. New directions", Journal of Monetary Economics 21:309-342.
King, R.G., C.I. Plosser, J.H. Stock and M.W. Watson (1991), "Stochastic trends and economic fluctuations", American Economic Review 81(4):819-840.
Koopmans, T.C.
(1947), "Measurement without theory", Review of Economics and Statistics 29:161-172.
Kydland, F.E., and E.C. Prescott (1990), "Business cycles: real facts and a monetary myth", Federal Reserve Bank of Minneapolis Quarterly Review, Spring 1990:3-18.
Lucas, R.E. (1972), "Expectations and the neutrality of money", Journal of Economic Theory 4(2):103-124.
Lucas, R.E. (1988), "Money demand in the United States: a quantitative review", Carnegie-Rochester Conference Series on Public Policy 29:137-168.
Mitchell, W.C. (1927), Business Cycles: The Problem and Its Setting (National Bureau of Economic Research, New York).
Mitchell, W.C., and A.F. Burns (1938), Statistical Indicators of Cyclical Revivals, NBER Bulletin 69 (National Bureau of Economic Research, New York); reprinted 1961, in: G.H. Moore, ed., Business Cycle Indicators (Princeton University Press, Princeton, NJ) ch. 6.
NBER (1992), Recessions (Release by the NBER's Public Information Office).
Nelson, C.R., and C.J. Murray (1997), "The uncertain trend in U.S. GDP", manuscript (University of Washington).
Nelson, C.R., and C.I. Plosser (1982), "Trends and random walks in macroeconomic time series", Journal of Monetary Economics 10(2):139-162.
Pagan, A.R. (1997), "Towards an understanding of some business cycle characteristics", Australian Economic Review 30:1-15.
Phelps, E.S. (1967), "Phillips Curves, expectations of inflation, and optimal inflation over time", Economica, NS 34:254-281.
Phillips, A.W.H. (1958), "The relation between unemployment and the rate of change of money wages in the United Kingdom, 1861-1957", Economica 25:283-299.
Quandt, R.E. (1960), "Tests of the hypothesis that a linear regression system obeys two separate regimes", Journal of the American Statistical Association 55:324-330.
Romer, C.D. (1989), "The prewar business cycle reconsidered: new estimates of Gross National Product, 1869-1908", Journal of Political Economy 97:1-37.
Rudebusch, G.D. (1993), "The uncertain unit root in real GNP", American Economic Review 83:264-272.
Samuelson, P.A., and R.M. Solow (1960), "Analytical aspects of anti-inflation policy", American Economic Review, Papers and Proceedings 50:177-194.
Sims, C.A. (1972), "Money, income and causality", American Economic Review 62:540-552.
Sims, C.A. (1980), "Macroeconomics and reality", Econometrica 48:1-48.
Staiger, D., J.H. Stock and M.W. Watson (1997), "The NAIRU, unemployment, and monetary policy", Journal of Economic Perspectives, Winter 1997:33-50.
Stock, J.H. (1987), "Measuring business cycle time", Journal of Political Economy 95:1240-1261.
Stock, J.H. (1991), "Confidence intervals for the largest autoregressive root in U.S. economic time series", Journal of Monetary Economics 28(3):435-460.
Stock, J.H. (1994), "Unit roots, structural breaks, and trends", in: R. Engle and D. McFadden, eds., Handbook of Econometrics, vol. IV (Elsevier, Amsterdam) ch. 46:2740-2843.
Stock, J.H., and M.W. Watson (1989), "New indexes of leading and coincident economic indicators", NBER Macroeconomics Annual 1989:351-394.
Stock, J.H., and M.W. Watson (1990), "Business cycle properties of selected U.S. economic time series, 1959-1988", Working Paper No. 3376 (National Bureau of Economic Research).
Stock, J.H., and M.W. Watson (1993), "A simple estimator of cointegrating vectors in higher-order integrated systems", Econometrica 61:783-820.
Stock, J.H., and M.W. Watson (1996a), "Confidence sets in regressions with highly serially correlated regressors", manuscript (Harvard University).
Stock, J.H., and M.W.
Watson (1996b), "Evidence on structural instability in macroeconomic time series relations", Journal of Business and Economic Statistics 14:11-30.
Taylor, J.B. (1980), "Aggregate dynamics and staggered contracts", Journal of Political Economy 88:1-23.
Watson, M.W. (1994a), "Business cycle durations and postwar stabilization of the U.S. economy", American Economic Review 84(1):24-46.
Watson, M.W. (1994b), "Vector autoregressions and cointegration", in: R. Engle and D. McFadden, eds., Handbook of Econometrics, vol. IV (Elsevier, Amsterdam) ch. 47:2843-2915.
Zarnowitz, V. (1992), Business Cycles: Theory, History, Indicators, and Forecasting, Studies in Business Cycles, vol. 27 (University of Chicago Press for the NBER, Chicago, IL).
Zarnowitz, V., and G.H. Moore (1986), "Major changes in cyclical behavior", in: R.J. Gordon, ed., The American Business Cycle: Continuity and Change (University of Chicago Press, Chicago, IL).
Zellner, A. (1979), "Causality and econometrics", Carnegie-Rochester Conference Series on Public Policy 10:9-54.
Chapter 2
MONETARY POLICY SHOCKS: WHAT HAVE WE LEARNED AND TO WHAT END? LAWRENCE J. CHRISTIANO Northwestern University, NBER and the Federal Reserve Bank of Chicago MARTIN EICHENBAUM Northwestern University, NBER and the Federal Reserve Bank of Chicago CHARLES L. EVANS Federal Reserve Bank of Chicago
Contents

Abstract
Keywords
1. Introduction
2. Monetary policy shocks: some possible interpretations
3. Vector autoregressions and identification
4. The effects of a monetary policy shock: a recursiveness assumption
4.1. The recursiveness assumption and VARs
4.2. Three benchmark identification schemes
4.2.1. The benchmark policy shocks displayed
4.2.2. What happens after a benchmark policy shock?
4.2.2.1. Results for some major economic aggregates
4.3. Results for other economic aggregates
4.3.1. US domestic aggregates
4.3.1.1. Aggregate real variables, wages and profits
4.3.1.2. Borrowing and lending activities
4.3.2. Exchange rates and monetary policy shocks
4.4. Robustness of the benchmark analysis
4.4.1. Excluding current output and prices from Ωt
4.4.2. Excluding commodity prices from Ωt: the price puzzle
4.4.3. Equating the policy instrument, St, with M0, M1 or M2
4.4.4. Using information from the federal funds futures market
4.4.5. Sample period sensitivity
4.5. Discriminating between the benchmark identification schemes
4.5.1. The Coleman, Gilles and Labadie identification scheme

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
4.5.2. The Bernanke-Mihov critique
4.5.2.1. A model of the federal funds market
4.5.2.2. Identifying the parameters of the model
4.5.2.3. The Bernanke-Mihov test
4.5.2.4. Empirical results
4.6. Monetary policy shocks and volatility
5. The effects of monetary policy shocks: abandoning the recursiveness approach
5.1. A fully simultaneous system
5.1.1. Sims-Zha: model specification and identification
5.1.2. Empirical results
6. Some pitfalls in interpreting estimated monetary policy rules
7. The effects of a monetary policy shock: the narrative approach
8. Conclusion
References
Abstract
This chapter reviews recent research that grapples with the question: What happens after an exogenous shock to monetary policy? We argue that this question is interesting because it lies at the center of a particular approach to assessing the empirical plausibility of structural economic models that can be used to think about systematic changes in monetary policy institutions and rules. The literature has not yet converged on a particular set of assumptions for identifying the effects of an exogenous shock to monetary policy. Nevertheless, there is considerable agreement about the qualitative effects of a monetary policy shock in the sense that inference is robust across a large subset of the identification schemes that have been considered in the literature. We document the nature of this agreement as it pertains to key economic aggregates.
Keywords monetary policy shocks, recursiveness assumption, benchmark analysis
1. Introduction

In the past decade there has been a resurgence of interest in developing quantitative, monetary general equilibrium models of the business cycle. In part, this reflects the importance of ongoing debates that center on monetary policy issues. What caused the increased inflation experienced by many countries in the 1970s? What sorts of monetary policies and institutions would reduce the likelihood of it happening again? How should the Federal Reserve respond to shocks that impact the economy? What are the welfare costs and benefits of moving to a common currency area in Europe? To make fundamental progress on these types of questions requires that we address them within the confines of quantitative general equilibrium models. Assessing the effect of a change in monetary policy institutions or rules could be accomplished using purely statistical methods. But only if we had data drawn from otherwise identical economies operating under the monetary institutions or rules we are interested in evaluating. We don't. So purely statistical approaches to these sorts of questions aren't feasible. And, real world experimentation is not an option. The only place we can perform experiments is in structural models. But we now have at our disposal a host of competing models, each of which emphasizes different frictions and embodies different policy implications. Which model should we use for conducting policy experiments? This chapter discusses a literature that pursues one approach to answering this question. It is in the spirit of a suggestion made by R.E. Lucas (1980). He argues that economists "... need to test them (models) as useful imitations of reality by subjecting them to shocks for which we are fairly certain how actual economies or parts of economies would react. The more dimensions on which the model mimics the answers actual economies give to simple questions, the more we trust its answers to harder questions." [R.E. Lucas (1980)]
The literature we review applies the Lucas program using monetary policy shocks. These shocks are good candidates for use in this program because different models respond very differently to monetary policy shocks [see Christiano, Eichenbaum and Evans (1997a)]. 1 The program is operationalized in three steps:
• First, one isolates monetary policy shocks in actual economies and characterizes the nature of the corresponding monetary experiments.
• Second, one characterizes the actual economy's response to these monetary experiments.
• Third, one performs the same experiments in the model economies to be evaluated and compares the outcomes with actual economies' responses to the corresponding experiments.
These steps are designed to assist in the selection of a model that convincingly
1 Other applications of the Lucas program include the work of Gali (1997), who studies the dynamic effects of technology shocks, and Rotemberg and Woodford (1992) and Ramey and Shapiro (1998), who study the dynamic effects of shocks to government purchases.
answers the question, "how does the economy respond to an exogenous monetary policy shock?" Granted, the fact that a model passes this test is not sufficient to give us complete confidence in its answers to the types of questions we are interested in. However this test does help narrow our choices and gives guidance in the development of existing theory. A central feature of the program is the analysis of monetary policy shocks. Why not simply focus on the actions of monetary policy makers? Because monetary policy actions reflect, in part, policy makers' responses to nonmonetary developments in the economy. A given policy action and the economic events that follow it reflect the effects of all the shocks to the economy. Our application of the Lucas program focuses on the effects of a monetary policy shock per se. An important practical reason for focusing on this type of shock is that different models respond very differently to the experiment of a monetary policy shock. In order to use this information we need to know what happens in response to the analog experiment in the actual economy. There is no point in comparing a model's response to one experiment with the outcome of a different experiment in the actual economy. So, to proceed with our program, we must know what happens in the actual economy after a shock to monetary policy. The literature explores three general strategies for isolating monetary policy shocks. The first is the primary focus of our analysis. It involves making enough identifying assumptions to allow the analyst to estimate the parameters of the Federal Reserve's feedback rule, i.e., the rule which relates policymakers' actions to the state of the economy. The necessary identifying assumptions include functional form assumptions, assumptions about which variables the Fed looks at when setting its operating instrument and an assumption about what the operating instrument is. 
In addition, assumptions must be made about the nature of the interaction of the policy shock with the variables in the feedback rule. One assumption is that the policy shock is orthogonal to these variables. Throughout, we refer to this as the recursiveness assumption. Along with linearity of the Fed's feedback rule, this assumption justifies estimating policy shocks by the fitted residuals in the ordinary least squares regression of the Fed's policy instrument on the variables in the Fed's information set. The economic content of the recursiveness assumption is that the time t variables in the Fed's information set do not respond to time t realizations of the monetary policy shock. As an example, Christiano et al. (1996a) assume that the Fed looks at current prices and output, among other things, when setting the time t value of its policy instrument. In that application, the recursiveness assumption implies that output and prices respond only with a lag to a monetary policy shock. While there are models that are consistent with the previous recursiveness assumption, it is nevertheless controversial. 2 This is why authors like Bernanke (1986),
2 See Christiano, Eichenbaum and Evans (1997b) and Rotemberg and Woodford (1997) for models that are consistent with the assumption that contemporaneous output and the price level do not respond to a monetary policy shock.
Sims (1986), Sims and Zha (1998) and Leeper et al. (1996) adopt an alternative approach. No doubt there are some advantages to abandoning the recursiveness assumption. But there is also a substantial cost: a broader set of economic relations must be identified. And the assumptions involved can also be controversial. For example, Sims and Zha (1998) assume, among other things, that the Fed does not look at the contemporaneous price level or output when setting its policy instrument and that contemporaneous movements in the interest rate do not directly affect aggregate output. Both assumptions are clearly debatable. Finally, it should be noted that abandoning the recursiveness assumption doesn't require one to adopt an identification scheme in which a policy shock has a contemporaneous impact on all nonpolicy variables. For example, Leeper and Gordon (1992) and Leeper et al. (1996) assume that aggregate real output and the price level are not affected in the impact period of a monetary policy shock. The second and third strategies for identifying monetary policy shocks do not involve explicitly modelling the monetary authority's feedback rule. The second strategy involves looking at data that purportedly signal exogenous monetary policy actions. For example, Romer and Romer (1989) examine records of the Fed's policy deliberations to identify times in which they claim there were exogenous monetary policy shocks. Other authors like Rudebusch (1995) assume that, in certain sample periods, exogenous changes in monetary policy are well measured by changes in the federal funds rate. Finally, authors like Cooley and Hansen (1989, 1997), King (1991), Christiano (1991) and Christiano and Eichenbaum (1995) assume that all movements in money reflect exogenous movements in monetary policy. The third strategy identifies monetary policy shocks by the assumption that they do not affect economic activity in the long run. 3 We will not discuss this approach in detail. 
We refer the reader to Faust and Leeper (1997) and Pagan and Robertson (1995) for discussions and critiques of this literature. The previous overview makes clear that the literature has not yet converged on a particular set of assumptions for identifying the effects of an exogenous shock to monetary policy. Nevertheless, as we show, there is considerable agreement about the qualitative effects of a monetary policy shock in the sense that inference is robust across a large subset of the identification schemes that have been considered in the literature. The nature of this agreement is as follows: after a contractionary monetary policy shock, short term interest rates rise, aggregate output, employment, profits and various monetary aggregates fall, the aggregate price level responds very slowly, and various measures of wages fall, albeit by very modest amounts. In addition, there is agreement that monetary policy shocks account for only a very modest percentage of the volatility of aggregate output; they account for even less of the movements in
3 For an early example of this approach see Gali (1992).
the aggregate price level. 4 The literature has gone beyond this to provide a richer, more detailed picture of the economy's response to a monetary policy shock (see Section 4.6). But even this small list of findings has proven to be useful in evaluating the empirical plausibility of alternative monetary business cycle models [see Christiano et al. (1997a)]. In this sense the Lucas program, as applied to monetary policy shocks, is already proving to be a fruitful one. Identification schemes do exist which lead to different inferences about the effects of a monetary policy shock than the consensus view just discussed. How should we select between competing identifying assumptions? We suggest one selection scheme: eliminate a policy shock measure if it implies a set of impulse response functions that is inconsistent with every element in the set of monetary models that we wish to discriminate between. This is equivalent to announcing that if none of the models that we are interested in can account for the qualitative features of a set of impulse response functions, we reject the corresponding identifying assumptions, not the entire set of models. In practice, this amounts to a set of sign and shape restrictions on impulse response functions [see Uhlig (1997) for a particular formalization of this argument]. Since we have been explicit about the restrictions we impose, readers can make their own decisions about whether to reject the identifying assumptions in question. In the end, the key contribution of the monetary policy shock literature may be this: it has clarified the mapping from identification assumptions to inference about the effects of monetary policy shocks. This substantially eases the task of readers and model builders in evaluating potentially conflicting claims about what actually happens after a monetary policy shock. The remainder of this chapter is organized as follows: Section 2: We discuss possible interpretations of monetary policy shocks.
Section 3: We discuss the main statistical tool used in the analysis, namely the Vector Autoregression (VAR). In addition we present a reasonably self-contained discussion of the identification issues involved in estimating the economic effects of a monetary policy shock. Section 4: We discuss inference about the effects of a monetary policy shock using the recursiveness assumption. First, we discuss the link between the recursiveness assumption and identified VAR's. Second, we display the dynamic response of various economic aggregates to a monetary policy shock under three benchmark identification schemes, each of which satisfies the recursiveness assumption. In addition, we discuss related findings in the literature concerning other aggregates not explicitly analyzed here. Third, we discuss the robustness of inference to various perturbations including: alternative identification schemes which also impose the recursiveness assumption, incorporating information from the federal funds futures market into the analysis and varying the subsample over which the analysis is conducted. Fourth, we consider
4 These latter two findings say nothing about the impact of the systematic component of monetary policy on aggregate output and the price level. The literature that we review is silent on this point.
some critiques of the benchmark identification schemes. Fifth, we consider the implications of the benchmark identification schemes for the volatility of various economic aggregates. Section 5: We consider other approaches which focus on the monetary authority's feedback rule, but which do not impose the recursiveness assumption. Section 6: We discuss the difficulty of directly interpreting estimated monetary policy rules. Section 7: We consider the narrative approach to assessing the effects of a monetary policy shock. Section 8: We conclude with a brief discussion of various approaches to implementing the third step of the Lucas program as applied to monetary policy shocks. In particular we review a particular approach to performing monetary experiments in model economies, the outcomes of which can be compared to the estimated effects of a policy shock in actual economies. In addition we provide some summary remarks.
2. Monetary policy shocks: some possible interpretations

Many economists think that a significant fraction of the variation in central bank policy actions reflects policy makers' systematic responses to variations in the state of the economy. As noted in the introduction, this systematic component is typically formalized with the concept of a feedback rule, or reaction function. As a practical matter, it is recognized that not all variations in central bank policy can be accounted for as a reaction to the state of the economy. The unaccounted variation is formalized with the notion of a monetary policy shock. Given the large role that the concepts of a feedback rule and a policy shock play in the literature, we begin by discussing several sources of exogenous variation in monetary policy. Throughout this chapter we identify a monetary policy shock with the disturbance term in an equation of the form

St = f(Ωt) + σs εt^s.    (2.1)

Here St is the instrument of the monetary authority, say the federal funds rate or some monetary aggregate, and f is a linear function that relates St to the information set Ωt. The random variable, σs εt^s, is a monetary policy shock. Here, εt^s is normalized to have unit variance, and we refer to σs as the standard deviation of the monetary policy shock. One interpretation of f and Ωt is that they represent the monetary authority's feedback rule and information set, respectively. As we indicate in Section 6, there are other ways to think about f and Ωt which preserve the interpretation of εt^s as a shock to monetary policy. What is the economic interpretation of these policy shocks? We offer three interpretations. The first is that εt^s reflects exogenous shocks to the preferences of
the monetary authority, perhaps due to stochastic shifts in the relative weight given to unemployment and inflation. These shifts could reflect shocks to the preferences of the members of the Federal Open Market Committee (FOMC), or to the weights by which their views are aggregated. A change in weights may reflect shifts in the political power of individual committee members or in the factions that they represent. A second source of exogenous variation in policy can arise because of the strategic considerations developed in Ball (1995) and Chari, Christiano and Eichenbaum (1998). These authors argue that the Fed's desire to avoid the social costs of disappointing private agents' expectations can give rise to an exogenous source of variation in policy like that captured by εt^s. Specifically, shocks to private agents' expectations about Fed policy can be self-fulfilling and lead to exogenous variations in monetary policy. A third source of exogenous variation in Fed policy could reflect various technical factors. For one set of possibilities, see Hamilton (1997). Another set of possibilities, stressed by Bernanke and Mihov (1995), focuses on the measurement error in the preliminary data available to the FOMC at the time it makes its decision. We find it useful to elaborate on Bernanke and Mihov's suggestion for three reasons. First, their suggestion is of independent interest. Second, we use it in Section 6 to illustrate some of the difficulties involved in trying to interpret the parameters of f. Third, we use a version of their argument to illustrate how the interpretation of monetary policy shocks can interact with the plausibility of alternative assumptions for identifying εt^s. Suppose the monetary authority sets the policy variable, St, as an exact function of current and lagged observations on a set of variables, xt. We denote the time t observations on xt and xt-1 by xt(0) and xt-1(1), where
xt(0) = xt + vt,    xt-1(1) = xt-1 + ut-1.    (2.2)
So, vt represents the contemporaneous measurement error in xt, while ut represents the measurement error in xt from the standpoint of period t + 1. If xt is observed perfectly with a one period delay, then ut = 0 for all t. Suppose that the policy maker sets St as follows:

St = β0 St-1 + β1 xt(0) + β2 xt-1(1).    (2.3)
Expressed in terms of correctly measured variables, this policy rule reduces to Equation (2.1) with:

f(Ωt) = β0 St-1 + β1 xt + β2 xt-1,    σs εt^s = β1 vt + β2 ut-1.    (2.4)
This illustrates how noise in the data collection process can be a source of exogenous variation in monetary policy actions. This example can be used to illustrate how one's interpretation of the error term can affect the plausibility of alternative assumptions used to identify εt^s. Recall the
recursiveness assumption, according to which εt^s is orthogonal to the elements of Ωt. Under what circumstances would this assumption be correct under the measurement error interpretation of εt^s? To answer this, suppose that vt and ut are classical measurement errors, i.e. they are uncorrelated with xt at all leads and lags. If β0 = 0, then the recursiveness assumption is satisfied. Now suppose that β0 ≠ 0. If ut = 0, then this assumption is still satisfied. However, in the more plausible case where β2 ≠ 0, ut ≠ 0 and ut and vt are correlated with each other, then the recursiveness condition fails. This last case provides an important caveat to measurement error as an interpretation of the monetary policy shocks estimated by analysts who make use of the recursiveness assumption. We suspect that this may also be true for analysts who do not use the recursiveness assumption (see Section 5 below), because in developing identifying restrictions, they typically abstract from the possibility of measurement error.
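To make the last case concrete, the following sketch (our numerical illustration, not part of the original chapter; all parameter values are made up) simulates the policy rule (2.3) with correlated classical measurement errors and checks whether the implied shock β1 vt + β2 ut-1 is orthogonal to the elements of Ωt:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200_000
beta0, beta1, beta2 = 0.5, 1.0, 0.5  # illustrative values with beta0 != 0

# True series x_t and classical measurement errors with corr(v_t, u_t) = 0.8
x = rng.normal(size=T)
cov = 0.25 * np.array([[1.0, 0.8], [0.8, 1.0]])
v, u = rng.multivariate_normal([0.0, 0.0], cov, size=T).T

# Policy rule (2.3): S_t = beta0 S_{t-1} + beta1 x_t(0) + beta2 x_{t-1}(1)
S = np.zeros(T)
for t in range(1, T):
    S[t] = beta0 * S[t - 1] + beta1 * (x[t] + v[t]) + beta2 * (x[t - 1] + u[t - 1])

# Implied policy shock from (2.4): sigma_s eps_t = beta1 v_t + beta2 u_{t-1}
shock = beta1 * v[1:] + beta2 * u[:-1]

# The shock is uncorrelated with the correctly measured x_t ...
corr_x = np.corrcoef(shock, x[1:])[0, 1]
# ... but correlated with S_{t-1}, an element of Omega_t, because u_{t-1}
# is correlated with v_{t-1}, which entered S_{t-1}. Recursiveness fails.
corr_S = np.corrcoef(shock, S[:-1])[0, 1]
print(corr_x, corr_S)
```

Setting beta0 = 0 (or ut ≡ 0) in the simulation drives the second correlation to zero, matching the cases discussed in the text.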
3. Vector autoregressions and identification

A fundamental tool in the literature that we review is the vector autoregression (VAR). A VAR is a convenient device for summarizing the first and second moment properties of the data. We begin by defining more precisely what a VAR is. We then discuss the identification problem involved in measuring the dynamic response of economic aggregates to a fundamental economic shock. The basic problem is that a given set of second moments is consistent with many such dynamic response functions. Solving this problem amounts to making explicit assumptions that justify focusing on a particular dynamic response function. A VAR for a k-dimensional vector of variables, Zt, is given by

Zt = B1 Zt-1 + ... + Bq Zt-q + ut,    E ut ut' = V.    (3.1)

Here, q is a nonnegative integer and ut is uncorrelated with all variables dated t - 1 and earlier. 5 Consistent estimates of the Bi's can be obtained by running ordinary least squares equation by equation on Equation (3.1). One can then estimate V from the fitted residuals. Suppose that we knew the Bi's, the ut's and V. It still would not be possible to compute the dynamic response function of Zt to the fundamental shocks in the economy. The basic reason is that ut is the one step ahead forecast error in Zt. In general, each element of ut reflects the effects of all the fundamental economic shocks. There is no reason to presume that any element of ut corresponds to a particular economic shock, say for example, a shock to monetary policy.
5 For a discussion of the class of processes that VAR's summarize, see Sargent (1987). The absence of a constant term in Equation (3.1) is without loss of generality, since we are free to set one of the elements of Zt to be identically equal to unity.
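A minimal sketch of this estimation step (our illustration, with made-up parameter values): simulate a bivariate VAR(1), recover B1 by equation-by-equation ordinary least squares, and estimate V from the fitted residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 50_000, 2

# A hypothetical VAR(1): Z_t = B1 Z_{t-1} + u_t with E[u_t u_t'] = V
B1 = np.array([[0.5, 0.1], [0.0, 0.3]])
V_true = np.array([[1.0, 0.3], [0.3, 0.5]])
u = rng.multivariate_normal(np.zeros(k), V_true, size=T)
Z = np.zeros((T, k))
for t in range(1, T):
    Z[t] = B1 @ Z[t - 1] + u[t]

# Equation-by-equation OLS of Z_t on Z_{t-1} recovers B1 ...
X, Y = Z[:-1], Z[1:]
B1_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
# ... and V is estimated from the fitted residuals
resid = Y - X @ B1_hat.T
V_hat = resid.T @ resid / len(resid)
```

With q lags one would simply stack Zt-1, ..., Zt-q in the regressor matrix; the OLS logic is unchanged.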
To proceed, we assume that the relationship between the VAR disturbances and the fundamental economic shocks, et, is given by A0 ut = et. Here, A0 is an invertible, square matrix and E et et' = D, where D is a positive definite matrix. 6 Premultiplying Equation (3.1) by A0, we obtain:

A0 Zt = A1 Zt-1 + ... + Aq Zt-q + et.    (3.2)

Here Ai is a k × k matrix of constants, i = 0, ..., q, and

Bi = A0^-1 Ai,  i = 1, ..., q,  and  V = A0^-1 D (A0^-1)'.    (3.3)
The response of Zt+h to a unit shock in et, Yh, can be computed as follows. Let ψh be the solution to the following difference equation:

ψh = B1 ψh-1 + ... + Bq ψh-q,    h = 1, 2, ...,    (3.4)

with initial conditions

ψ0 = I,    ψ-1 = ψ-2 = ... = ψ-q = 0.    (3.5)

Then,

Yh = ψh A0^-1,    h = 0, 1, ....    (3.6)
Here, the (j, l) element of Yh represents the response of the jth component of Zt+h to a unit shock in the lth component of et. The Yh's characterize the "impulse response function" of the elements of Zt to the elements of et. Relation (3.6) implies we need to know A0 as well as the Bi's in order to compute the impulse response function. While the Bi's can be estimated via ordinary least squares regressions, getting A0 is not so easy. The only information in the data about A0 is that it solves the equations in (3.3). Absent restrictions on A0 there are in general many solutions to these equations. The traditional simultaneous equations literature places no assumptions on D, so that the equations represented by V = A0^-1 D (A0^-1)' provide no information about A0. Instead, that literature develops restrictions on Ai, i = 0, ..., q, that guarantee a unique solution to A0 Bi = Ai, i = 1, ..., q. In contrast, the literature we survey always imposes the restriction that the fundamental economic shocks are uncorrelated (i.e., D is a diagonal matrix), and places no restrictions on Ai, i = 1, ..., q. 7 Absent additional restrictions on A0 we can set

D = I.    (3.7)
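The recursion in Equations (3.4)-(3.6) is straightforward to implement; the following sketch (our illustration, with an arbitrary A0) computes the Yh's given the Bi's and A0^-1:

```python
import numpy as np

def impulse_responses(B, A0inv, H):
    """Return [Y_0, ..., Y_H], where Y_h = psi_h @ A0inv and
    psi_h = B_1 psi_{h-1} + ... + B_q psi_{h-q}, psi_0 = I,
    psi_h = 0 for h < 0 (Equations 3.4-3.6). B is the list [B_1, ..., B_q]."""
    q, k = len(B), B[0].shape[0]
    psi = [np.eye(k)]
    for h in range(1, H + 1):
        psi.append(sum(B[i] @ psi[h - 1 - i] for i in range(min(q, h))))
    return [p @ A0inv for p in psi]

# For a VAR(1), psi_h = B1^h, so Y_h = B1^h @ A0inv
B1 = np.array([[0.5, 0.2], [0.0, 0.3]])
A0inv = np.linalg.inv(np.array([[2.0, 0.0], [0.5, 1.0]]))
Y = impulse_responses([B1], A0inv, 3)
```

Note that Y0 = A0^-1, so the contemporaneous responses are entirely determined by the identifying choice of A0.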
Also note that without any restrictions on the Ai's, the equations represented by A0 Bi = Ai, i = 1, ..., q, provide no information about A0. All of the information about
6 This corresponds to the assumption that the economic shocks are recoverable from a finite list of current and past Zt's. For our analysis, we only require that a subset of the et's be recoverable from current and past Zt's. 7 See Leeper, Sims and Zha (1996) for a discussion of Equation (3.7).
this matrix is contained in the relationship V = A0^-1 (A0^-1)'. Define the set of solutions to this equation by

Qv = {A0 : A0^-1 (A0^-1)' = V}.    (3.8)
In general, this set contains many elements. This is because A0 has k^2 parameters while the symmetric matrix, V, has at most k(k + 1)/2 distinct numbers. So, Qv is the set of solutions to k(k + 1)/2 equations in k^2 unknowns. As long as k > 1, there will in general be many solutions to this set of equations, i.e., there is an identification problem. To solve this problem we must find and defend restrictions on A0 so that there is only one element in Qv satisfying them. In practice, the literature works with two types of restrictions: a set of linear restrictions on the elements of A0 and a requirement that the diagonal elements of A0 be positive. Suppose that the analyst has in mind l linear restrictions on A0. These can be represented as the requirement τvec(A0) = 0, where τ is a matrix of dimension l × k^2 and vec(A0) is the k^2 × 1 vector composed of the k columns of A0. Each of the l rows of τ represents a different restriction on the elements of A0. We denote the set of A0 satisfying these restrictions by:

Qτ = {A0 : τvec(A0) = 0}.    (3.9)
In the literature that we survey, the restrictions summarized by τ are either zero restrictions on the elements of A0 or restrictions across the elements of individual rows of A0. Cross equation restrictions, i.e., restrictions across the elements of different rows of A0, are not considered. Next we motivate the sign restrictions that the diagonal elements of A0 must be strictly positive. 8 If Qτ ∩ Qv is nonempty, it can never be composed of just a single matrix. This is because if A0 lies in Qv ∩ Qτ, then the matrix obtained from A0 by changing the sign of all elements of an arbitrary subset of rows of A0 also lies in Qτ ∩ Qv. To see this, let W be a diagonal matrix with an arbitrary pattern of ones and minus ones along the diagonal. It is obvious that WA0 ∈ Qτ. Also, because W is orthonormal (i.e., W'W = I), WA0 ∈ Qv as well. Suppose we impose the restriction that the diagonal elements of A0 be strictly positive. This rules out matrices that are obtained from an A0 ∈ Qτ ∩ Qv by changing the signs of all the elements of a subset of its rows. In what follows we only consider A0 matrices that obey the sign restrictions. That is, we insist that A0 ∈ Qs, where

Qs = {A0 : A0 has strictly positive diagonal elements}.    (3.10)
From Equation (3.2) we see that the ith diagonal element of A0 being positive corresponds to the normalization that a positive shock to the ith element of et represents a positive shock to the ith element of Zt when the other elements of Zt are held fixed.

8 The following discussion ignores the possibility that Qτ ∩ Qv contains a matrix with one or more diagonal elements that are exactly zero. A suitable modification of the argument below can accommodate this possibility.
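The non-uniqueness of elements of Qv is easy to verify numerically. In this sketch (ours; the numbers are arbitrary), one A0 satisfying V = A0^-1 (A0^-1)' is built from a Cholesky factor of V, and premultiplying it by an orthonormal W (the text's sign-flip matrices are one special case; a rotation is another) yields a different element of Qv:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(3, 3))
V = M @ M.T  # an arbitrary positive definite V

# One element of Qv: the inverse of the lower-triangular Cholesky factor of V
A0 = np.linalg.inv(np.linalg.cholesky(V))
assert np.allclose(np.linalg.inv(A0) @ np.linalg.inv(A0).T, V)

# Premultiplying by any orthonormal W gives another element of Qv,
# since (W A0)^-1 ((W A0)^-1)' = A0^-1 W' W (A0^-1)' = V.
c, s = np.cos(0.7), np.sin(0.7)
W = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
A0_alt = W @ A0
assert np.allclose(np.linalg.inv(A0_alt) @ np.linalg.inv(A0_alt).T, V)
```

Unlike the sign-flip case, a generic rotation W need not preserve membership in Qτ, which is exactly why the linear restrictions in (3.9) can have identifying power.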
When there is more than one element in the set Qv ∩ Qτ ∩ Qs we say that the system is "underidentified", or "not identified". When Qv ∩ Qτ ∩ Qs has one element, we say it is "identified". So, in these terms, solving the identification problem requires selecting a τ which causes the system to be identified. Note that Qv ∩ Qτ is the set of solutions to k(k + 1)/2 + l equations in the k^2 unknowns of A0. In practice, the literature seeks to achieve identification by selecting a full row rank τ satisfying the order condition, l ≥ k(k - 1)/2. However, the order and sign conditions are not sufficient for identification. For example, when l = k(k - 1)/2 underidentification could occur for two reasons. First, a neighborhood of a given A0 ∈ Qv ∩ Qτ ∩ Qs could contain other matrices belonging to Qv ∩ Qτ ∩ Qs. This possibility can be ruled out by verifying a simple rank condition, namely that the matrix derivative with respect to A0 of the equations defining (3.8) is of full rank. 9 In this case, we say we have established local identification. A second possibility is that there may be other matrices belonging to Qv ∩ Qτ ∩ Qs but which are not in a small neighborhood of A0. 10 In general, no known simple conditions rule out this possibility. If we do manage to rule it out, we say the system is globally identified. 11 In practice, we use the rank and order conditions to verify local identification. Global identification must be established on a case by case basis. Sometimes, as in our discussion of Bernanke and Mihov (1995), this can be done analytically. More typically, one is limited to building confidence in global identification by conducting an ad hoc numerical search through the parameter space to determine if there are other elements in Qv ∩ Qτ ∩ Qs. The difficulty of establishing global identification in the literature we survey stands in contrast to the situation in the traditional simultaneous equations context.
9 Here we define a particular rank condition and establish that the rank and order conditions are sufficient for local identification. Let a be the k(k + 1)/2 dimensional column vector of parameters in A0 that remain free after imposing condition (3.9), so that A0(a) ∈ Qτ for all a. Let f(a) denote the k(k + 1)/2 dimensional row vector composed of the upper triangular part of A0(a)^{-1}[A0(a)^{-1}]′ − V. Let F(a) denote the k(k + 1)/2 by k(k + 1)/2 derivative matrix of f(a) with respect to a. Let a* satisfy f(a*) = 0. Consider the following rank condition: F(a) has full rank for all a ∈ D(a*), where D(a*) is some neighborhood of a*. We assume that f is continuous and that F is well defined. A straightforward application of the mean value theorem [see Bartle (1976), p. 196] establishes that this rank condition guarantees f(a) ≠ 0 for all a ∈ D(a*) and a ≠ a*. Let g_ℓ : [e_ℓ, ē_ℓ] → R^{k(k+1)/2} be defined by g_ℓ(e) = f(a* + ℓe), where ℓ is an arbitrary non-zero k(k + 1)/2 column vector, and e_ℓ and ē_ℓ are the smallest and largest values, respectively, of e such that (a* + ℓe) ∈ D(a*). Note that g_ℓ′(e) = ℓ′F(a* + ℓe) and e_ℓ < 0 < ē_ℓ. By the mean value theorem, g_ℓ(e) = g_ℓ(0) + g_ℓ′(γ)e for some γ between 0 and e. This can be written g_ℓ(e) = ℓ′F(a* + ℓγ)e. The rank condition implies that the expression to the right of the equality is nonzero, as long as e ≠ 0. Since the choice of ℓ ≠ 0 was arbitrary, the result is established.
10 A simple example is (x − a)(x − b) = 0, which is one equation with two isolated solutions, x = a and x = b.
11 We can also distinguish other concepts of identification. For example, asymptotic and small sample identification correspond to the cases where V is the population and finite sample value of the variance-covariance matrix of the VAR disturbances, respectively. Obviously, asymptotic identification could hold while finite sample identification fails, as well as the converse.
Ch. 2: Monetary Policy Shocks: What Have we Learned and to What End?
There, the identification problem only involves systems of linear equations. Under these circumstances, local identification obtains if and only if global identification obtains. The traditional simultaneous equations literature provides a simple set of rank and order conditions that are necessary and sufficient for identification. These conditions are only sufficient to characterize local identification for the systems that we consider. 12 Moreover, they are neither necessary nor sufficient for global identification.

We now describe two examples which illustrate the discussion above. In the first case, the order and sign conditions are sufficient to guarantee global identification. In the second, the order and sign conditions for identification hold, yet the system is not identified.

In the first example, we select τ so that all the elements above (alternatively, below) the diagonal of A0 are zero. If, in addition, we impose the sign restriction, then it is well known that there is only one element in Qv ∩ Qτ ∩ Qs, i.e., the system is globally identified. This result is an implication of the uniqueness of the Cholesky factorization of a positive definite symmetric matrix. This example plays a role in the section on identification of monetary policy shocks with a recursiveness assumption.

For our second example, consider the case k = 3 with the following restricted A0 matrix:
        [ a11    0    a13 ]
A0 =    [  0    a22   a23 ],
        [  0    a32   a33 ]
where aii > 0 for i = 1, 2, 3. Since there are three zero restrictions, the order condition is satisfied. Suppose that A0 ∈ Qv, so that A0 ∈ Qv ∩ Qτ ∩ Qs. Let W be a block diagonal matrix with unity in the (1, 1) element and an arbitrary 2 × 2 orthonormal matrix in the second diagonal block. Let W also have the property that WA0 has positive elements on the diagonal. Then, WW′ = I, and WA0 ∈ Qv ∩ Qτ ∩ Qs. 13 In this case we do not have identification, even though the order and sign conditions are satisfied. The reason for the failure of local identification is that the rank condition does not hold. If it did hold, then identification would have obtained. The failure of the rank condition in this example reflects that the second and third equations in the system are indistinguishable.
12 To show that the rank condition is not necessary for local identification, consider f(x) = (x − a)². For this function there is a globally unique zero at x = a, yet f′(a) = 0.
13 To see that this example is non-empty, consider the case a11 = 0.70, a13 = 0.40, a22 = 0.38, a23 = 0.50, a32 = 0.83, a33 = 0.71, and let the 2 × 2 lower block in W be

    [ 0.4941    0.8694 ]
    [ 0.8694   −0.4941 ].

It is easy to verify that WA0 satisfies the zero and sign restrictions on A0.
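The first example above, global identification under a triangular τ, is easy to check mechanically. The sketch below is ours, not the chapter's; it uses an arbitrary positive definite V and verifies that the inverse of the lower triangular Cholesky factor of V satisfies the restrictions defining Qv, Qτ and Qs:

```python
import numpy as np

# Sketch (ours): with tau restricting the upper triangle of A0 to zero and the
# sign convention of a positive diagonal, the inverse of the lower triangular
# Cholesky factor of V is the unique A0 satisfying A0^{-1} (A0^{-1})' = V.
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
V = B @ B.T + 3 * np.eye(3)   # stand-in for a VAR innovation covariance matrix

C = np.linalg.cholesky(V)     # unique lower triangular C with C C' = V
A0 = np.linalg.inv(C)         # inverse of lower triangular is lower triangular

# A0 is in Qv: A0^{-1} (A0^{-1})' = V ...
assert np.allclose(np.linalg.inv(A0) @ np.linalg.inv(A0).T, V)
# ... in Qtau (upper triangle zero) and in Qs (positive diagonal):
assert np.allclose(A0, np.tril(A0)) and np.all(np.diag(A0) > 0)
```

Uniqueness follows because any other lower triangular, positive-diagonal candidate would give a second Cholesky factorization of V.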
78
L.J. Christiano et al.
It is easy to show that every element in Qv ∩ Qτ ∩ Qs generates the same dynamic response function to the first shock in the system. To see this, note from Equation (3.5) that the first column of A0^{-1} is what characterizes the response of all the variables to the first shock. Similarly, the first column of (WA0)^{-1} controls the response of the transformed system to the first shock. But the result (WA0)^{-1} = A0^{-1}W′ and our definition of W imply that the first columns of (WA0)^{-1} and of A0^{-1} are the same. So, if one is only interested in the dynamic response of the system to the first shock, then the choice of the second diagonal block of W is irrelevant. An extended version of this observation plays an important role in our discussion of nonrecursive identification schemes below.
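This second example can be checked numerically. The sketch below is ours; it uses the values reported in footnote 13 and confirms that the rotated matrix WA0 satisfies the same restrictions and implies the same V, yet differs from A0, while the first columns of the inverses (and hence the responses to the first shock) coincide:

```python
import numpy as np

# Values from footnote 13; the check itself is ours, not the chapter's.
A0 = np.array([[0.70, 0.00, 0.40],
               [0.00, 0.38, 0.50],
               [0.00, 0.83, 0.71]])
W = np.array([[1.0, 0.0,     0.0],
              [0.0, 0.4941,  0.8694],
              [0.0, 0.8694, -0.4941]])   # orthonormal to 4 decimal places

WA0 = W @ A0
V = np.linalg.inv(A0) @ np.linalg.inv(A0).T

assert np.allclose(W @ W.T, np.eye(3), atol=1e-4)
# WA0 implies (essentially) the same V and satisfies the same zero and sign
# restrictions as A0 ...
assert np.allclose(np.linalg.inv(WA0) @ np.linalg.inv(WA0).T, V, atol=1e-2)
assert np.allclose([WA0[0, 1], WA0[1, 0], WA0[2, 0]], 0.0)
assert np.all(np.diag(WA0) > 0)
# ... yet WA0 != A0: the system is not identified. Still, the first columns of
# the inverses coincide, so the response to the first shock is unchanged.
assert not np.allclose(WA0, A0)
assert np.allclose(np.linalg.inv(WA0)[:, 0], np.linalg.inv(A0)[:, 0])
```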
4. The effects of a monetary policy shock: a recursiveness assumption
In this section we discuss one widely used strategy for estimating the effects of a monetary policy shock. The strategy is based on the recursiveness assumption, according to which monetary policy shocks are orthogonal to the information set of the monetary authority. Section 4.1 discusses the relationship between the recursiveness assumption and VARs. Section 4.2 describes three benchmark identification schemes which embody the recursiveness assumption. In addition, we display estimates of the dynamic effects of a monetary policy shock on various economic aggregates, obtained using the benchmark identification schemes. Section 4.3 reviews some results in the literature regarding the dynamic effects of a monetary policy shock on other economic aggregates, obtained using close variants of the benchmark schemes. Section 4.4 considers robustness of the empirical results contained in Section 4.2. Section 4.5 discusses various critiques of the benchmark identification schemes. Finally, Section 4.6 investigates the implications of the benchmark schemes for the volatility of various economic aggregates.

4.1. The recursiveness assumption and VARs
The recursiveness assumption justifies the following two-step procedure for estimating the dynamic response of a variable to a monetary policy shock. First, estimate the policy shocks by the fitted residuals in the ordinary least squares regression of St on the elements of Ωt. Second, estimate the dynamic response of a variable to a monetary policy shock by regressing the variable on the current and lagged values of the estimated policy shocks. In our analysis we find it convenient to map the above two-step procedure into an asymptotically equivalent VAR-based procedure. There are two reasons for this. First, the two-step approach implies that we lose a number of initial data points equal to the number of dynamic responses that we wish to estimate, plus the number of lags, q, in Ωt. With the VAR procedure we only lose the latter. Second, the VAR methodology provides a complete description of the data generating process for the elements of Ωt.
This allows us to use a straightforward bootstrap methodology for use in conducting hypothesis tests. We now indicate how the recursiveness assumption restricts A0 in Equation (3.2). Partition Zt into three blocks: the k1 variables, X1t, whose contemporaneous values appear in Ωt; the k2 variables, X2t, which only appear with a lag in Ωt; and St itself. Then, k = k1 + k2 + 1, where k is the dimension of Zt. That is:
Zt = [X1t′, St, X2t′]′.

We consider k1, k2 ≥ 0. To make the analysis interesting we assume that if k1 = 0, so that X1t is absent from the definition of Zt, then k2 > 1. Similarly, if k2 = 0, then k1 > 1. The recursiveness assumption places the following zero restrictions on A0:

        [   a11       0        0     ]
        [ (k1×k1)   (k1×1)   (k1×k2) ]
        [                            ]
A0 =    [   a21      a22       0     ]          (4.1)
        [ (1×k1)    (1×1)    (1×k2)  ]
        [                            ]
        [   a31      a32      a33    ]
        [ (k2×k1)   (k2×1)   (k2×k2) ]
Here, expressions in parentheses indicate the dimension of the associated matrix and a22 = 1/σ, where σ > 0. The zero block in the middle row of this matrix reflects the assumption that the policy maker does not see X2t when St is set. The two zero blocks in the first row of A0 reflect our assumption that the monetary policy shock is orthogonal to the elements in X1t. These blocks correspond to the two distinct channels by which a monetary policy shock could in principle affect the variables in X1t. The first of these blocks corresponds to the direct effect of St on X1t. The second block corresponds to the indirect effect that operates via the impact of a monetary policy shock on the variables in X2t. We now show that the recursiveness assumption is not sufficient to identify all the elements of A0. This is not surprising, in light of the fact that the first k1 equations are indistinguishable from each other, as are the last k2 equations. Significantly, however, the recursiveness assumption is sufficient to identify the object of interest: the dynamic response of Zt to a monetary policy shock. Specifically, we establish three results. The first two are as follows: (i) there is a nonempty family of A0 matrices, one of which is lower triangular with positive terms on the diagonal, which are consistent with the recursiveness assumption [i.e., satisfy Equation (4.1)] and satisfy A0^{-1}(A0^{-1})′ = V; and (ii) each member of this family generates precisely the same dynamic response function of the elements of Zt to a monetary policy shock. Result (iii) is that if we adopt the normalization of always selecting the lower triangular A0 matrix identified in (i), then the dynamic responses of the variables in Zt are invariant to the ordering of variables in X1t and X2t.
To prove (i)-(iii) it is useful to establish a preliminary result. We begin by defining some notation. Let the ((k1 + 1)k2 + k1) × k² matrix τ summarize the zero restrictions on A0 in Equation (4.1). So, Qτ is the set of A0 matrices consistent with the recursiveness assumption. Let Qv be the set of A0 matrices defined by the property that A0^{-1}(A0^{-1})′ = V [see Equation (3.8)]. In addition, let

        [ W11   0    0   ]
W =     [  0    1    0   ],          (4.2)
        [  0    0   W33  ]
where W is partitioned conformably with A0 in Equation (4.1) and W11 and W33 are arbitrary orthonormal matrices. Define Q_Ā0 = {A0 : A0 = WĀ0, for some W satisfying (4.2)}. Here Ā0 is a matrix conformable with W. We now establish the following result:

Q_Ā0 = Qv ∩ Qτ,          (4.3)

where Ā0 is an arbitrary element of Qv ∩ Qτ. It is straightforward to establish that A0 ∈ Q_Ā0 implies A0 ∈ Qv ∩ Qτ. The result A0 ∈ Qv follows from orthonormality of W and the fact Ā0 ∈ Qv. The result A0 ∈ Qτ follows from the block diagonal structure of W in Equation (4.2). Now consider an arbitrary A0 ∈ Qv ∩ Qτ. To show that A0 ∈ Q_Ā0, consider the candidate orthonormal matrix W = A0Ā0^{-1}, where invertibility of Ā0 reflects Ā0 ∈ Qv. Since W is the product of two block-lower triangular matrices, it too is block-lower triangular. Also, it is easy to verify that WW′ = I. The orthonormality of W, together with block-lower triangularity, imply that W has the form (4.2). This establishes A0 ∈ Q_Ā0 and, hence, Equation (4.3).

We now prove result (i). The fact that Qv ∩ Qτ is not empty follows from the fact that we can always set A0 equal to the inverse of the lower triangular Cholesky factor of V. The existence and invertibility of this matrix is discussed in Hamilton (1994, p. 91). 14 To see that there is more than one element in Qv ∩ Qτ, use the characterization result (4.3), with Ā0 equal to the inverse of the Cholesky factor of V. Construct the orthonormal matrix W ≠ I by interchanging two of either the first k1 rows or the last k2 rows of the k-dimensional identity matrix. 15 Then, WĀ0 ≠ Ā0. Result (i) is established because WĀ0 ∈ Qv ∩ Qτ.

14 The Cholesky factor of a positive definite, symmetric matrix, V, is a lower triangular matrix, C, with the properties (i) it has positive elements along the diagonal, and (ii) it satisfies CC′ = V.
15 Recall, orthonormality of a matrix means that the inner product between two different columns is zero and the inner product of any column with itself is unity. This property is obviously satisfied by the identity matrix. Rearranging the rows of the identity matrix just changes the order of the terms being added in the inner products defining orthonormality, and so does not alter the value of column inner products. Hence a matrix obtained from the identity matrix by arbitrarily rearranging the order of its rows is orthonormal.
We now prove result (ii). Consider any two matrices A0, Ã0 ∈ Qv ∩ Qτ. By Equation (4.3) there exists a W satisfying Equation (4.2) with the property Ã0 = WA0, so that Ã0^{-1} = A0^{-1}W′. In conjunction with Equation (4.2), this expression implies that the (k1 + 1)th columns of Ã0^{-1} and A0^{-1} are identical. But, by Equation (3.6), the implied dynamic responses of Zt+i, i = 0, 1, ..., to a monetary policy shock are identical too. This establishes result (ii).

We now prove (iii) using an argument essentially the same as the one used to prove (ii). We accomplish the proof by starting with a representation of Zt in which A0 is lower triangular with positive diagonal elements. We then arbitrarily reorder the first k1 and the last k2 elements of Zt. The analog to A0 in the resulting system need not be lower triangular with positive diagonal elements. We then apply a particular orthonormal transformation which results in a lower triangular system with positive diagonal elements. The response of the variables in Zt to a monetary policy shock is the same in this system and in the original system.

Consider Z̃t = DZt, where D is the orthonormal matrix constructed by arbitrarily reordering the columns within the first k1 and the last k2 columns of the identity matrix. 16 Then, Z̃t corresponds to Zt with the variables in X1t and X2t reordered arbitrarily. Let Bi, i = 1, ..., q, and V characterize the VAR of Zt and let A0 be the unique lower triangular matrix with positive diagonal terms with the property A0^{-1}(A0^{-1})′ = V. Given the Bi's, A0 characterizes the impulse response function of the Zt's to et [see Equations (3.4)-(3.6)]. The VAR representation of Z̃t, obtained by suitably reordering the equations in (3.1), is characterized by DBiD′, i = 1, ..., q, and DVD′. 17 Also, it is easily verified that (A0D′)^{-1}[(A0D′)^{-1}]′ = DVD′, and that, given the DBiD′'s, A0D′ characterizes the impulse response function of the Z̃t's to et. Moreover, these responses coincide with the responses of the corresponding variables in Zt to et. Note that A0D′ is not in general lower triangular. Let Ã0 = A0D′:
        [ ã11    0     0  ]
Ã0 =    [ ã21   ã22    0  ],
        [ ã31   ã32   ã33 ]
where ãii is full rank, but not necessarily lower triangular, for i = 1, 3. Let the QR decomposition of these matrices be ãii = QiRi, where Qi is a square, orthonormal
16 The type of reasoning in the previous footnote indicates that permuting the columns of the identity matrix does not alter orthonormality.
17 To see this, simply premultiply Equation (3.1) by D on both sides and note that DBiZt−i = DBiD′DZt−i, because D′D = I.
matrix, and Ri is lower triangular with positive elements along the diagonal. This decomposition exists as long as ãii, i = 1, 3, is nonsingular, a property guaranteed by the fact A0 ∈ Qv ∩ Qτ [see Strang (1976), p. 124]. 18 Let

        [ Q1′   0    0  ]
W =     [  0    1    0  ].
        [  0    0   Q3′ ]

Note that WW′ = I, (WÃ0)^{-1}[(WÃ0)^{-1}]′ = DVD′, and WÃ0 is lower triangular with positive elements along the diagonal. Since (WÃ0)^{-1} = Ã0^{-1}W′, the (k1 + 1)th columns of Ã0^{-1}W′ and Ã0^{-1} coincide. We conclude that, under the normalization that A0 is lower triangular with positive diagonal terms, the response of the variables in Zt to a monetary policy shock is invariant to the ordering of variables in X1t and X2t. This establishes (iii). We now summarize these results in the form of a proposition.
Proposition 4.1. Consider the sets Qv and Qτ.
(i) The set Qv ∩ Qτ is nonempty and contains more than one element.
(ii) The (k1 + 1)th column of γi, i = 0, 1, ..., in Equation (3.6) is invariant to the choice of A0 ∈ Qv ∩ Qτ.
(iii) Restricting A0 ∈ Qv ∩ Qτ to be lower triangular with positive diagonal terms, the (k1 + 1)th column of γi, i = 0, 1, ..., is invariant to the ordering of the elements in X1t and X2t.

We now provide a brief discussion of (i)-(iii). According to results (i) and (ii), under the recursiveness assumption the data are consistent with an entire family, Qv ∩ Qτ, of A0 matrices. It follows that the recursiveness assumption is not sufficient to pin down the dynamic response functions of the variables in Zt to every element of et. But each A0 ∈ Qv ∩ Qτ does generate the same response to one of the et's, namely the one corresponding to the monetary policy shock. In this sense, the recursiveness assumption identifies the dynamic response of Zt to a monetary shock, but not the response to other shocks. In practice, computational convenience dictates the choice of some A0 ∈ Qv ∩ Qτ. A standard normalization adopted in the literature is that the A0 matrix is lower triangular with nonnegative diagonal terms. This still leaves open the question of how to order the variables in X1t and X2t. But, according to result (iii), the dynamic response of the variables in Zt to a monetary policy shock is invariant to this ordering. At
18 Actually, it is customary to state the QR decomposition of the (n × n) matrix A as A = QR, where R is upper triangular. We get it into lower triangular form by constructing the orthonormal matrix E with zeros everywhere and 1's in the (n + 1 − i, i)th entries, i = 1, 2, ..., n, applying the QR decomposition to EAE = QR, and writing A = (EQE)(ERE); ERE is lower triangular. The orthonormal matrix to which we refer in the text is actually EQE.
the same time, the dynamic impact on Zt of the nonpolicy shocks is sensitive to the ordering of the variables in X1t and X2t. The recursiveness assumption has nothing to say about this ordering. Absent further identifying restrictions, the nonpolicy shocks and the associated dynamic response functions simply reflect normalizations adopted for computational convenience.
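Proposition 4.1 lends itself to a direct numerical check. The sketch below is ours; the dimensions k1 = k2 = 2 are chosen purely for illustration. It reorders the variables within the X1t and X2t blocks and recovers the lower triangular factor using the QR-of-the-reversed-matrix construction described in footnote 18 (applied here to the full matrix rather than blockwise; the result agrees by uniqueness of the Cholesky factor):

```python
import numpy as np

rng = np.random.default_rng(2)
k1, k2 = 2, 2
k = k1 + 1 + k2                       # the policy shock is the (k1+1)th element
B = rng.standard_normal((k, k))
V = B @ B.T + k * np.eye(k)           # a positive definite innovation covariance

# Lower triangular A0 with positive diagonal: inv(A0) is the Cholesky factor of V.
A0 = np.linalg.inv(np.linalg.cholesky(V))

# D reorders variables within X1t and within X2t (here: a swap in each block).
D = np.eye(k)[[1, 0, 2, 4, 3]]

def lower_qr(A):
    """Factor A = Q_tilde @ R_tilde, with Q_tilde orthonormal and R_tilde lower
    triangular with positive diagonal, via QR of the reversed matrix."""
    n = len(A)
    E = np.eye(n)[::-1]               # 1's on the anti-diagonal
    Q, R = np.linalg.qr(E @ A @ E)
    s = np.diag(np.sign(np.diag(R)))  # flip signs so the diagonal is positive
    return E @ Q @ s @ E, E @ s @ R @ E

# A0 @ D' is block lower triangular but in general not lower triangular;
# Q_tilde' plays the role of the orthonormal W in the proof of result (iii).
Q_tilde, A0_tilde = lower_qr(A0 @ D.T)

assert np.allclose(Q_tilde @ A0_tilde, A0 @ D.T)
assert np.allclose(A0_tilde, np.tril(A0_tilde)) and np.all(np.diag(A0_tilde) > 0)
# A0_tilde is consistent with the reordered covariance D V D' ...
assert np.allclose(np.linalg.inv(A0_tilde) @ np.linalg.inv(A0_tilde).T, D @ V @ D.T)
# ... and the policy-shock column of the impact responses is just permuted
# along with the variables, as Proposition 4.1(iii) asserts.
assert np.allclose(np.linalg.inv(A0_tilde)[:, k1], D @ np.linalg.inv(A0)[:, k1])
```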
4.2. Three benchmark identification schemes

We organize our empirical discussion around three benchmark recursive identification schemes. These correspond to different specifications of St and Ωt. In our first benchmark system, we measure the policy instrument, St, by the time t federal funds rate. This choice is motivated by institutional arguments in McCallum (1983), Bernanke and Blinder (1992) and Sims (1986, 1992). Let Yt, Pt, PCOMt, FFt, TRt, NBRt, and Mt denote the time t values of the log of real GDP, the log of the implicit GDP deflator, the smoothed change in an index of sensitive commodity prices (a component in the Bureau of Economic Analysis' index of leading indicators), the federal funds rate, the log of total reserves, the log of nonborrowed reserves plus extended credit, and the log of either M1 or M2, respectively. Here all data are quarterly. Our benchmark specification of Ωt includes current and four lagged values of Yt, Pt, and PCOMt, as well as four lagged values of FFt, NBRt, TRt and Mt. We refer to the policy shock measure corresponding to this specification as an FF policy shock.

In our second benchmark system we measure St by NBRt. This choice is motivated by arguments in Eichenbaum (1992) and Christiano and Eichenbaum (1992) that innovations to nonborrowed reserves primarily reflect exogenous shocks to monetary policy, while innovations to broader monetary aggregates primarily reflect shocks to money demand. We assume that Ωt includes current and four lagged values of Yt, Pt, and PCOMt, as well as four lagged values of FFt, NBRt, TRt and Mt. We refer to the policy shock measure corresponding to this specification as an NBR policy shock.

Note that in both benchmark specifications, the monetary authority is assumed to see Yt, Pt and PCOMt when choosing St. 19 This assumption is certainly arguable because quarterly real GDP data and the GDP deflator are typically known only with a delay.
Still, the Fed does have at its disposal monthly data on aggregate employment, industrial output and other indicators of aggregate real economic activity. It also has substantial amounts of information regarding the price level. In our view the assumption that the Fed sees Yt and Pt when it chooses St seems at least as plausible as assuming that it does not. 20 Below we document the effect of deviating from this benchmark assumption.
19 Examples of analyses which make this type of information assumption include Christiano and Eichenbaum (1992), Christiano et al. (1996a, 1997a), Eichenbaum and Evans (1995), Strongin (1995), Bernanke and Blinder (1992), Bernanke and Mihov (1995), and Gertler and Gilchrist (1994).
20 See for example the specifications in Sims and Zha (1998) and Leeper et al. (1996).
Notice that under our assumptions, Yt, Pt and PCOMt do not change in the impact period of either an FF or an NBR policy shock. Christiano et al. (1997b) present a dynamic stochastic general equilibrium model which is consistent with the notion that prices and output do not move appreciably in the impact period of a monetary policy shock. The assumption regarding PCOMt is more difficult to assess on theoretical grounds absent an explicit monetary general equilibrium model that incorporates a market for commodity prices. In any event, we show below that altering the benchmark specification to exclude the contemporaneous value of PCOMt from Ωt has virtually no effect on our results. 21

In the following subsection we display the time series of the two benchmark policy shock estimates. After that, we study the dynamic response of various economic time series to these shocks. At this point, we also consider our third benchmark system, a variant of the NBR policy shocks associated with Strongin (1995). Finally, we consider the contribution of different policy shock measures to the volatility of various economic aggregates.
4.2.1. The benchmark policy shocks displayed

We begin by discussing some basic properties of the estimated time series of the FF and NBR policy shocks. These are obtained using quarterly data over the sample period 1965:3-1995:2. Figure 1 contains two time series of shocks. The dotted line depicts the quarterly FF policy shocks. The solid line depicts the contemporaneous changes in the federal funds rate implied by contractionary NBR policy shocks. In both cases the variable Mt was measured as M1t. Since the policy shock measures are by construction serially uncorrelated, they tend to be noisy. For ease of interpretation we report the centered, three quarter moving average of the shock, i.e., we report (ε_{t+1}^s + ε_t^s + ε_{t−1}^s)/3. Also, for convenience we include shaded regions, which begin at a National Bureau of Economic Research (NBER) business cycle peak, and end at a trough. The two shocks are positively correlated, with a correlation coefficient of 0.51. The estimated standard deviation of the FF policy shocks is 0.71, at an annual rate. The estimated standard deviation of the NBR policy shocks is 1.53% and the standard deviation of the implied federal funds rate shock is 0.39, at an annual rate. In describing our results, we find it useful to characterize monetary policy as "tight" or "contractionary" when the smoothed policy shock is positive, and "loose" or "expansionary" when it is negative. According to the FF policy shock measure, policy was relatively tight before each recession, and became easier around the time of the trough. 22 A similar pattern is observed for the movements in the federal funds rate
21 This does not mean that excluding lagged values from Ωt has no effect on our results.
22 In Figure 1, the beginning of the 1973-74 recession appears to be an exception to the general pattern. To some extent this reflects the effects of averaging, since there was a 210 basis point FF policy shock in 1973Q3.
Fig. 1. Contractionary benchmark policy shocks in units of federal funds rate (three-month centered, equal-weighted moving average). The dotted line depicts the quarterly FF policy shocks. The solid line depicts the contemporaneous changes in the federal funds rate implied by contractionary NBR policy shocks. In both cases the variable Mt was measured as M1t.
implied by the NBR shocks, except that in the 1981-1982 period, policy was loose at the start, very tight in the middle, and loose at the end of the recession.

4.2.2. What happens after a benchmark policy shock?

4.2.2.1. Results for some major economic aggregates. Figure 2 displays the estimated impulse response functions of contractionary benchmark FF and NBR policy shocks on various economic aggregates included in Ωt. These are depicted in columns 1 and 2, respectively. Column 3 reports the estimated impulse response functions from a third policy shock measure which we refer to as an NBR/TR policy shock. This shock measure was proposed by Strongin (1995), who argued that the demand for total reserves is completely interest inelastic in the short run, so that a monetary policy shock initially only rearranges the composition of total reserves between nonborrowed and borrowed reserves. Strongin argues that, after controlling for movements in certain variables that are in the Fed's information set, a policy shock should be measured as the
Fig. 2. The estimated impulse response functions of contractionary benchmark FF and NBR policy shocks on various economic aggregates included in Ωt (columns 1 and 2). Column 3 reports the estimated impulse response functions from a third policy shock measure which we refer to as an NBR/TR policy shock. Rows display the responses of Y, Price, Pcom, FF, NBR, TR and M1 (or M2) to a monetary policy (MP) shock. The solid lines in the figure report the point estimates of the different dynamic response functions. Dashed lines denote a 95% confidence interval for the dynamic response functions.
innovation to the ratio of nonborrowed to total reserves. We capture this specification by measuring St as NBRt and assuming that Ωt includes the current value of TRt. With this specification, a shock to ε_t^s does not induce a contemporaneous change in TRt.

All three identification schemes were implemented using M1 and M2 as our measure of money. This choice turned out to have very little effect on the results. The results displayed in Figure 2 are based on a system that included M1. The last row of Figure 2 depicts the impulse response function of M2 to the different policy shock measures, obtained by replacing M1 with M2 in our specification of Ωt. The solid lines in the figure report the point estimates of the different dynamic response functions. Dashed lines denote a 95% confidence interval for the dynamic response functions. 23

The main consequences of a contractionary FF policy shock can be summarized as follows. First, there is a persistent rise in the federal funds rate and a persistent drop in nonborrowed reserves. This finding is consistent with the presence of a strong liquidity effect. Second, the fall in total reserves is negligible initially. But eventually total reserves fall by roughly 0.3 percent. So according to this policy shock measure, the Fed insulates total reserves in the short run from the full impact of a contraction in nonborrowed reserves by increasing borrowed reserves. 24 This is consistent with the arguments in Strongin (1995). Third, the response of M1 is qualitatively similar to the response of TR. In contrast, for the M2 system, the FF policy shock leads to an immediate and persistent drop in M2. Fourth, after a delay of two quarters, there is a sustained decline in real GDP. Notice the 'hump shaped' response function, with the maximal decline occurring roughly a year to a year and a half after the policy shock. Fifth, after an initial delay, the policy shock generates a persistent decline in the index of commodity prices.
The GDP deflator is flat for roughly a year and a half after which it declines.
23 These were computed using a bootstrap Monte Carlo procedure. Specifically, we constructed 500 time series on the vector Zt as follows. Let u_t, t = 1, ..., T, denote the vector of residuals from the estimated VAR. We constructed 500 sets of new time series of residuals, u_t(j), t = 1, ..., T, j = 1, ..., 500. The tth element of the jth set was selected by drawing randomly, with replacement, from the set of fitted residual vectors. For each j, we constructed a synthetic time series of Zt, denoted Zt(j), using the estimated VAR and the historical initial conditions on Zt. We then re-estimated the VAR using Zt(j) and the historical initial conditions, and calculated the implied impulse response functions for j = 1, ..., 500. For each fixed lag, we calculated the 12th lowest and 487th highest values of the corresponding impulse response coefficients across all 500 synthetic impulse response functions. The boundaries of the confidence intervals in the figures correspond to a graph of these coefficients. In many cases the point estimates of the impulse response functions are quite similar to the mean value of the simulated impulse response functions. But there is some evidence of bias, especially for Y, M2, NBR and FF. The location of the solid lines inside the confidence intervals indicates that the estimated impulse response functions are biased towards zero in each of these cases. See Kilian (1998) and Parekh (1997) for different procedures for accommodating this bias.
24 A given percentage change in total reserves corresponds roughly to an equal dollar change in the total and nonborrowed reserves. Historically, nonborrowed reserves are roughly 95% of total reserves. Since 1986, that ratio has moved up, being above 98% most of the time.
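The bootstrap in footnote 23 can be sketched schematically. The code below is ours, not the chapter's: it uses a hypothetical bivariate VAR(1) on simulated data and 200 draws rather than the chapter's larger quarterly VAR with 500 draws; the percentile step mirrors taking the 12th lowest and 487th highest of 500 ordered draws:

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_draws = 200, 200
B = np.array([[0.5, 0.1],
              [0.0, 0.7]])                   # hypothetical VAR(1) coefficients

# Simulate a sample, then treat it as "the data".
Z = np.zeros((T, 2))
for t in range(1, T):
    Z[t] = Z[t - 1] @ B.T + rng.standard_normal(2)

def estimate_var1(Z):
    """OLS estimate of Z_t = B Z_{t-1} + u_t; returns B_hat and residuals."""
    X, Y = Z[:-1], Z[1:]
    B_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
    return B_hat, Y - X @ B_hat.T

def irf(B_hat, horizons=8):
    """Responses of Z to a unit reduced-form innovation in the first variable."""
    out, x = [], np.array([1.0, 0.0])
    for _ in range(horizons):
        out.append(x.copy())
        x = B_hat @ x
    return np.array(out)

B_hat, resid = estimate_var1(Z)
draws = []
for _ in range(n_draws):
    u = resid[rng.integers(0, len(resid), size=len(resid))]  # resample residuals
    Zb = np.zeros_like(Z)
    Zb[0] = Z[0]                                             # historical initial condition
    for t in range(1, T):
        Zb[t] = Zb[t - 1] @ B_hat.T + u[t - 1]
    draws.append(irf(estimate_var1(Zb)[0]))                  # re-estimate, recompute IRF

draws = np.array(draws)                                      # (n_draws, horizons, 2)
lower = np.percentile(draws, 2.5, axis=0)
upper = np.percentile(draws, 97.5, axis=0)
```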
Before going on, it is of interest to relate these statistics to the interest elasticity of the demand for NBR and M1. Following Lucas (1988, 1994), suppose the demand for either of these two assets has the following form:

Mt = fM(Ωt) − φFFt + ε_t^d,

where ε_t^d denotes the money demand disturbance and Mt denotes the log of either M1 or NBR. Here, φ is the short run, semi-log elasticity of money demand. A consistent estimate of φ is obtained by dividing the contemporaneous response of Mt to a unit policy shock by the contemporaneous response of FFt to a unit policy shock. This ratio is just the instrumental variables estimate of φ using the monetary policy shock. The consistency of this estimator relies on the assumed orthogonality of ε_t^s with ε_t^d and the elements of Ωt. 25 Performing the necessary calculations using the results in the first column of Figure 2, we find that the short run money demand elasticities for M1 and NBR are roughly −0.1 and −1.0, respectively. The M1 demand elasticity is quite small, and contrasts sharply with estimates of the long run money demand elasticity. For example, the analogous number in Lucas (1988) is −8.0. Taken together, these results are consistent with the widespread view that the short run money demand elasticity is substantially smaller than the long run elasticity [see Goodfriend (1991)].

We next consider the effect of an NBR policy shock. As can be seen, with two exceptions, inference is qualitatively robust. The exceptions have to do with the impact effect of a policy shock on TR and M1. According to the FF policy shock measure, total reserves are insulated, roughly one to one, contemporaneously from a monetary policy shock. According to the NBR policy shock measure, total reserves fall by roughly one half of a percent. Consistent with these results, an NBR policy shock leads to a substantially larger contemporaneous reduction in M1, compared to the reduction induced by an FF policy shock.
Interestingly, M2 responds in very similar ways to an F F and an NBR policy shock.
25 To see this, note first the consistency of the instrumental variables estimator: Cov(Mt, εst)/Cov(FFt, εst) → −φ, where εst denotes the monetary policy shock. Note too that: Cov(Mt, εst) = φMσ², Cov(FFt, εst) = φRσ², where φM and φR denote the contemporaneous effects of a unit policy shock on Mt and FFt, respectively, and σ² denotes the variance of the monetary policy shock. The result, that the instrumental variables estimator coincides with φM/φR, follows by taking the ratio of the above two covariances. These results also hold if Mt, FFt, and Ωt are nonstationary. In this case, we think of the analysis as being conditioned on the initial observations.
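The covariance algebra in footnote 25 is easy to verify by simulation (a stylized static sketch; the values φ = 0.1 and φR = 1 and the shock structure are illustrative assumptions, not estimates from the chapter):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200_000
phi, phi_R = 0.1, 1.0          # money demand semi-elasticity; impact of the policy shock on FF

eps_s = rng.normal(size=T)     # monetary policy shock (the instrument)
eps_d = rng.normal(size=T)     # money demand disturbance, orthogonal to eps_s
v = rng.normal(size=T)         # nonpolicy movements in FF, also orthogonal to eps_s

FF = phi_R * eps_s + v         # funds rate: responds to the policy shock plus other factors
M = -phi * FF + eps_d          # money demand: M_t = f(.) - phi * FF_t + eps_d (constant dropped)

# IV estimator: Cov(M, eps_s) / Cov(FF, eps_s) -> -phi,
# which equals phi_M / phi_R with phi_M = -phi * phi_R the impact of the shock on M.
iv = np.cov(M, eps_s)[0, 1] / np.cov(FF, eps_s)[0, 1]
```

With draws of this size, `iv` settles near −0.1, matching −φ = φM/φR.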
Ch. 2: Monetary Policy Shocks: What Have we Learned and to What End?
From column 3 of Figure 2 we see that, aside from TR and M1, inference is also qualitatively similar for an NBR/TR policy shock. By construction TR does not respond in the impact period of a policy shock. While not constrained, M1 also hardly responds in the impact period of the shock but then falls. In this sense the NBR/TR shock has effects that are more similar to an FF policy shock than an NBR policy shock. A maintained assumption of the NBR, FF and NBR/TR policy shock measures is that the aggregate price level and output are not affected in the impact period of a monetary policy shock. On a priori grounds, this assumption seems more reasonable for monthly than for quarterly data. So it seems important to document the robustness of inference to working with monthly data. Indeed this robustness has been documented by various authors. 26 Figure 3 provides such evidence for the benchmark policy shocks. It is the analog of Figure 2 except that it is generated using monthly rather than quarterly data. In generating these results we replace aggregate output with nonfarm payroll employment, and the aggregate price level is measured by the implicit deflator for personal consumption expenditures. Comparing Figures 2 and 3 we see that qualitative inference is quite robust to working with the monthly data. To summarize, all three policy shock measures imply that in response to a contractionary policy shock, the federal funds rate rises, monetary aggregates decline (although some with a delay), the aggregate price level initially responds very little, aggregate output falls, displaying a hump shaped pattern, and commodity prices fall. In the next subsection, we discuss other results regarding the effects of a monetary policy shock. We conclude this subsection by drawing attention to an interesting aspect of our results that is worth emphasizing.
The correlations between our three policy shock measures are all less than one (see, for example, Figure 1). 27 Nevertheless, all three lead to similar inference about the qualitative effects of a disturbance to monetary policy. One interpretation of these results is that all three policy shock measures are dominated by a common monetary policy shock. Since the bivariate correlations among the three are less than one, at least two must be confounded by nonpolicy shocks as well. Evidently, the effects of these other shocks are not strong enough to alter the qualitative characteristics of the impulse response functions. It is interesting to us just how low the correlation between the shock measures can be without changing the basic features of the impulse response functions. A similar set of observations emerges if we consider small perturbations to the auxiliary assumptions needed to implement a particular identification scheme. For example, suppose we implement the benchmark FF model in two ways: measuring Mt by the growth rate of M2 and by the log of M1.
26 See for example Geweke and Runkle (1995), Bernanke and Mihov (1995) and Christiano et al. (1996b).
27 Recall, the estimated correlation between an FF and an NBR shock is 0.51. The analog correlation between an NBR/TR shock and an FF shock is 0.65. Finally, the correlation between an NBR/TR shock and an NBR shock is 0.82.

Fig. 3. Evidence for benchmark policy shocks. Analog of Figure 2, but using monthly rather than quarterly data. [Panels display the responses of employment (EM), the price level, Pcom, FF, NBR, TR and the monetary aggregate to a monetary policy shock in the monthly Fed Funds, NBR and NBR/TR models, each with M1 and with M2.]

The resulting policy shock measures have a correlation coefficient of only 0.85. This reflects in part that in several episodes the two shock measures give substantially different impressions about the state of
monetary policy. For example in 1993Q4, the M1 based shock measure implies a 20 basis point contractionary shock. The M2 growth rate based shock measure implies an 80 basis point contractionary shock. These types of disagreements notwithstanding, both versions of the benchmark FF model give rise to essentially the same inference about the effect of a given monetary policy shock. We infer from these results that while inference about the qualitative effects of a monetary policy shock appears to be reliable, inference about the state of monetary policy at any particular date is not. 4.3. Results for other economic aggregates
In the previous section we discussed the effects of the benchmark policy shocks on various economic aggregates. The literature has provided a richer, more detailed picture of the way the economy responds to a monetary policy shock. In this section we discuss some of the results that have been obtained using close variants of the benchmark policy shocks. Rather than provide an exhaustive review, we highlight a sample of the results and the associated set of issues that they have been used to address. The section is divided into two parts. The first subsection considers the effects of a monetary policy shock on domestic US economic aggregates. In the second subsection, we discuss the effects of a monetary policy shock on exchange rates. The papers we review use different sample periods as well as different identifying assumptions. Given space constraints, we refer the reader to the papers for these details. 4.3.1. US domestic aggregates
The work in this area can be organized into two categories. The first category pertains to the effects of a monetary policy shock on different measures of real economic activity, as well as on wages and profits. The second category pertains to the effects of a monetary policy shock on the borrowing and lending activities of different agents in the economy. 4.3.1.1. Aggregate real variables, wages and profits. In Section 4.2.2 we showed that aggregate output declines in response to contractionary benchmark FF and NBR policy shocks. Christiano et al. (1996a) consider the effects of a contractionary monetary policy shock on various other quarterly measures of economic activity. They find that after a contractionary benchmark FF policy shock, unemployment rises after a delay of about two quarters. 28 Other measures of economic activity respond more quickly to the policy shock. Specifically, retail sales, corporate profits in retail trade
28 Working with monthly data, Bernanke and Blinder (1992) also find that unemployment rises after a contractionary monetary policy shock. The shock measure which they use is related to our benchmark FF policy shock measure in the sense that both are based on innovations to the Federal Funds rate and both impose a version of the recursiveness assumption.
and nonfinancial corporate profits immediately fall while manufacturing inventories immediately rise. 29 Fisher (1997) examines how different components of aggregate investment respond to a monetary policy shock [see also Bernanke and Gertler (1995)]. He does so using shock measures that are closely related to the benchmark FF and NBR policy measures. Fisher argues that all components of investment decline after a contractionary policy shock. But he finds important differences in the timing and sensitivity of different types of investment to a monetary policy shock. Specifically, residential investment exhibits the largest decline, followed by equipment, durables, and structures. In addition he finds a distinctive lead-lag pattern in the dynamic response functions: residential investment declines the most rapidly, reaching its peak response several quarters before the other variables do. Fisher uses these results to discuss the empirical plausibility of competing theories of investment. Gertler and Gilchrist (1994) emphasize a different aspect of the economy's response to a monetary policy shock: large and small manufacturing firms' sales and inventories. 30 According to Gertler and Gilchrist, small firms account for a disproportionate share of the decline in manufacturing sales that follows a contractionary monetary policy shock. In addition they argue that while small firms' inventories fall immediately after a contractionary policy shock, large firms' inventories initially rise before falling. They use these results, in conjunction with other results in their paper regarding the borrowing activities of large and small firms, to assess the plausibility of theories of the monetary transmission mechanism that stress the importance of credit market imperfections. Campbell (1997) studies a different aspect of how the manufacturing sector responds to a monetary policy shock: the response of total employment, job destruction and job creation.
Using a variant of the benchmark F F policy shock measure, Campbell finds that, after a contractionary monetary policy shock, manufacturing employment falls immediately, with the maximal decline occurring roughly a year after the shock. The decline in employment primarily reflects increases in job destruction as the policy shock is associated with a sharp, persistent rise in job destruction but a smaller, transitory fall in job creation. Campbell argues that these results are useful as a guide in formulating models of cyclical industry dynamics. We conclude this subsection by discussing the effects of a contractionary monetary policy shock on real wages and profits. Christiano et al. (1997a) analyze various measures of aggregate real wages, manufacturing real wages, and real wages for ten 2 digit SIC level industries. In all cases, real wages decline after a contractionary benchmark F F policy shock, albeit by modest amounts. Manufacturing real wages
29 The qualitative results of Christiano et al. (1996a) are robust to whether they work with benchmark NBR, FF policy shocks or with Romer and Romer (1989) shocks.
30 Gertler and Gilchrist (1994) use various monetary policy shock measures, including one that is related to the benchmark FF policy shock as well as the onset of Romer and Romer (1989) episodes.
fall more sharply than economy-wide measures. Within manufacturing, real wages fall more sharply in durable goods industries than in nondurable goods industries. Christiano et al. (1997a) argue that these results cast doubt on models of the monetary transmission mechanism which stress the effects of nominal wage stickiness per se. This is because those types of models predict that real wages should rise, not fall, after a contractionary monetary policy shock. To study the response of real profits to a monetary policy shock, Christiano et al. (1997a) consider various measures of aggregate profits as well as before tax profits in five sectors of the economy: manufacturing, durables, nondurables, retail, and transportation and utilities. In all but two cases, they find that a contractionary FF policy shock leads to a sharp persistent drop in profits. 31 Christiano et al. (1997a) argue that these results cast doubt on models of the monetary transmission mechanism which stress the effects of sticky prices per se but don't allow for labor market frictions whose effect is to inhibit cyclical movements in marginal costs. This is because those types of models predict that profits should rise, not fall, after a contractionary monetary policy shock. Finally, we note that other authors have obtained similar results to those cited above using policy shock measures that are not based on the recursiveness assumption. For example, policy shock measures based on the identifying assumptions in Sims and Zha (1998) lead to a qualitatively similar impact on wages, profits and various measures of aggregate output as the benchmark FF policy shock. Similarly, Leeper, Sims and Zha's (1996) results regarding the response of investment are quite similar to Fisher's. 4.3.1.2. Borrowing and lending activities. Various authors have investigated how a monetary policy shock affects borrowing and lending activities in different sectors of the economy.
In an early contribution, Bernanke and Blinder (1992) examined the effects of a contractionary monetary policy shock on bank deposits, securities and loans. Their results can be summarized as follows. A contractionary monetary policy shock (measured using a variant of the benchmark FF policy shock) leads to an immediate, persistent decline in the volume of bank deposits as well as a decline in bank assets. The decline in assets initially reflects a fall in the amount of securities held by banks. Loans are hardly affected. Shortly thereafter security holdings begin climbing back to their preshock values while loans start to fall. Eventually, securities return to their pre-shock values and the entire decline in deposits is reflected in loans. Bernanke and Blinder (1992) argue that these results are consistent with theories of the monetary transmission mechanism that stress the role of credit market imperfections. Gertler and Gilchrist (1993, 1994) pursue this line of inquiry and argue that a monetary policy shock has different effects on credit flows to small borrowers (consumers and small firms) versus large borrowers. Using a variant of the benchmark FF policy
31 The two exceptions are nondurable goods and transportation and utilities. For these industries they cannot reject the hypothesis that profits are unaffected by a contractionary policy shock.
shock, they find that consumer and real estate loans fall after a contractionary policy shock but commercial and industrial loans do not [Gertler and Gilchrist (1993)]. In addition, loans to small manufacturing firms decline relative to large manufacturing firms after a contractionary monetary policy shock. In their view, these results support the view that credit market imperfections play an important role in the monetary transmission mechanism. Christiano et al. (1996a) examine how net borrowing by different sectors of the economy responds to a monetary policy shock. Using variants of the FF and NBR benchmark policy shocks, they find that after a contractionary shock to monetary policy, net funds raised in financial markets by the business sector increases for roughly a year. Thereafter, as the decline in output induced by the policy shock gains momentum, net funds raised by the business sector begin to fall. Christiano et al. (1996a) argue that this pattern is not captured by existing monetary business cycle models. 32 Christiano et al. (1996a) also find that net funds raised by the household sector remains unchanged for several quarters after a monetary policy shock. They argue that this response pattern is consistent with limited participation models of the type discussed in Christiano et al. (1997a,b). Finally, Christiano et al. (1996a) show that the initial increase in net funds raised by firms after a contractionary benchmark FF policy shock coincides with a temporary reduction in net funds raised (i.e., borrowing) by the government. This reduction can be traced to a temporary increase in personal tax receipts. After about a year, though, as output declines further and net funds raised by the business and household sectors falls, net funds raised by the government sector increases (i.e., the government budget deficit goes up).
Taken together, the above results indicate that a contractionary monetary policy shock has differential effects on the borrowing and lending activities of different agents in the economy. Consistent with the version of the Lucas program outlined in the introduction to this survey, these findings have been used to help assess the empirical plausibility of competing theories of the monetary transmission mechanism. 4.3.2. Exchange rates and monetary policy shocks
Various papers have examined the effects of a monetary policy shock on exchange rates. Identifying exogenous monetary policy shocks in an open economy can lead to substantial complications relative to the closed economy case. For example, in some countries, monetary policy may not only respond to the state of the domestic economy but also to the state of foreign economies, including foreign monetary policy actions. At least for the USA, close variants of the benchmark policy shock measures continue to give reasonable results. For example, Eichenbaum and Evans (1995) consider variants of the benchmark FF and NBR/TR policy shock measures in which some
32 Christiano et al. (1996a) and Gertler and Gilchrist (1994) discuss possible ways to account for this response pattern.
foreign variables appear in the Fed's reaction function. A maintained assumption of their analysis is that the Fed does not respond contemporaneously to movements in the foreign interest rate or the exchange rate. Eichenbaum and Evans use their policy shock measures to study the effects of a contractionary US monetary policy shock on real and nominal exchange rates as well as domestic and foreign interest rates. 33 They find that a contractionary shock to US monetary policy leads to (i) persistent, significant appreciations in US nominal and real exchange rates, (ii) persistent decreases in the spread between foreign and US interest rates, and (iii) significant, persistent deviations from uncovered interest rate parity in favor of US investments. 34 Under uncovered interest rate parity, the larger interest rate differential induced by a contractionary US monetary policy shock should be offset by expected future depreciations in the dollar. Eichenbaum and Evans' empirical results indicate that the opposite is true: the larger return is actually magnified by expected future appreciations in the dollar. Eichenbaum and Evans discuss the plausibility of alternative international business cycle models in light of their results. While variants of the benchmark FF identification scheme generate results that are consistent with traditional monetary analyses when applied to the USA, this is generally not the case when they are used to identify foreign monetary policy shocks. For example, Grilli and Roubini (1995) consider policy shock measures for non-US G7 countries that are closely related to Eichenbaum and Evans' measures. Using these measures, they find that a contractionary shock to a foreign country's monetary policy leads initially to a depreciation in the foreign country's currency.
Grilli and Roubini argue that this result reflects that the measured policy shocks are confounded by the systematic reaction of foreign monetary policy to US monetary policy and expected inflation. This motivates them to construct an alternative policy shock measure which is based on the recursiveness assumption and a measure of St equal to the spread between foreign short term and long term interest rates. With this measure, they find that a contractionary shock to foreign monetary policy leads to a transitory appreciation in the foreign exchange rate and a temporary fall in output. In contrast to Grilli and Roubini, authors like Cushman and Zha (1997), Kim and Roubini (1995), and Clarida and Gertler (1997) adopt identification schemes that do not employ the recursiveness assumption. In particular, they abandon the assumption that the foreign monetary policy authority only looks at predetermined variables when setting its policy instrument. Cushman and Zha (1997) assume that Bank of Canada officials look at contemporaneous values of the Canadian money supply, the exchange rate, the US interest rate and an index of world commodity prices when setting a short term Canadian interest rate. Kim and Roubini (1995) assume that the reaction
33 The foreign countries which they look at are Japan, Germany, Italy, France and Great Britain. 34 Sims (1992) and Grilli and Roubini (1995) also analyze the effect of a monetary policy shock on US exchange rates using close variants of the FF benchmark policy shock. They too find that a contractionary policy shock leads to an appreciation of the US exchange rate.
function of foreign central bankers includes contemporaneous values of the money supply, the exchange rate and the world price of oil (but not the federal funds rate). Clarida and Gertler (1997) assume that the Bundesbank's reaction function includes current values of an index of world commodity prices, the exchange rate, as well as the German money supply (but not the US federal funds rate). 35 In all three cases, it is assumed that the money supply and the exchange rate are not predetermined relative to the policy shock. As a consequence, monetary policy shocks cannot be recovered from an ordinary least squares regression. Further identifying assumptions are necessary to proceed. The precise identifying assumptions which these authors make differ. But in all cases, they assume the existence of a group of variables that are predetermined relative to the policy shock. 36 These variables constitute valid instruments for estimating the parameters in the foreign monetary policy maker's reaction function. We refer the reader to the papers for details regarding the exact identifying assumptions. 37 With their preferred policy shock measures, all three of the above papers find that a contractionary foreign monetary policy shock causes foreign exchange rates to appreciate and leads to a rise in the differential between the foreign and domestic interest rate. 38 In this sense, their results are consistent with Eichenbaum and Evans' evidence regarding the effects of a shock to monetary policy. In addition, all three papers provide evidence that a contractionary foreign monetary policy shock drives foreign monetary aggregates and output down, interest rates up, and affects the foreign price level only with a delay. In this sense, the evidence is consistent with the evidence in Section 4.2.2 regarding the effect of a benchmark FF policy shock on the US economy. 4.4.
Robustness of the benchmark analysis In this subsection we assess the robustness of our benchmark results to various perturbations. First, we consider alternative identification schemes which also impose the recursiveness assumption. Second, we consider the effects of incorporating information from the federal funds futures market into the analysis. Finally, we analyze the subsample stability of our results.
35 Clarida et al. (1998) provide a different characterization of the Bundesbank's reaction function as well as the reaction functions of five other central banks. 36 For example, in all these cases it is assumed that a measure of commodity prices, foreign industrial production, the foreign price level and the federal funds rate are predetermined relative to the foreign monetary policy shock. 37 Clarida and Galí (1994) use long run identifying restrictions to assess the effects of nominal shocks on real exchange rates. 38 Consistent with the evidence in Eichenbaum and Evans (1995), Cushman and Zha (1997) find that a contractionary foreign monetary policy shock induces a persistent, significant deviation from uncovered interest parity in favor of foreign investments.
4.4.1. Excluding current output and prices from Ωt
The estimated one-step-ahead forecast errors in Yt and FFt are positively correlated (0.38), while those in Yt and NBRt are negatively correlated (−0.22). Any identification scheme in which St is set equal to either the time t federal funds rate or nonborrowed reserves must come to terms with the direction of causation underlying this correlation: Does it reflect (a) the endogenous response of policy to real GDP via the Fed's feedback rule, or (b) the response of real GDP to policy? Our benchmark policy measures are based on the assumption that the answer to this question is (a). Under this assumption we found that a contractionary monetary policy shock drives aggregate output down. Figure 4 displays the results when the answer is assumed to be (b). Specifically, columns 1 and 3 report the estimated impulse response functions of various economic aggregates to policy shock measures that were computed under the same identification assumptions as those underlying the FF and NBR policy shocks except that Yt is excluded from Ωt. The key result is that under these identifying assumptions, a contractionary policy shock drives aggregate output up before driving it down. In other respects, the results are unaffected. It might be thought that the initial response pattern of output could be rationalized by monetary models which stress the effects of an inflation tax on economic activity, as in Cooley and Hansen (1989). It is true that in these models a serially correlated decrease in the money supply leads to an increase in output. But in these models this happens via a reduction in anticipated inflation and in the interest rate. Although the candidate policy shock is associated with a serially correlated decrease in the money supply, it is also associated with a rise in the interest rate and virtually no movement in the price level.
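The role of the ordering assumption can be illustrated with a two-variable sketch. The recursive identification is implemented here as a standard Cholesky factorization of hypothetical (Y, FF) innovations carrying the 0.38 correlation from the text; everything else is illustrative, not the chapter's full VAR:

```python
import numpy as np

rng = np.random.default_rng(2)

# Reduced-form one-step-ahead forecast errors in (Y, FF), correlated 0.38 as in the text.
Sigma = np.array([[1.0, 0.38],
                  [0.38, 1.0]])
u = rng.multivariate_normal(np.zeros(2), Sigma, size=200_000)

def policy_shock(u, policy_first):
    """Recursive identification: the policy shock is the innovation in the policy
    instrument, orthogonalized with respect to whatever is ordered before it."""
    order = [1, 0] if policy_first else [0, 1]   # variable 0 = Y, variable 1 = FF
    P = np.linalg.cholesky(np.cov(u[:, order].T))
    eps = np.linalg.solve(P, u[:, order].T)      # mutually orthogonal structural innovations
    return eps[0] if policy_first else eps[1]

def impact_on_Y(u, policy_first):
    """Contemporaneous response of Y to a unit policy shock under each ordering."""
    eps = policy_shock(u, policy_first)
    return np.cov(u[:, 0], eps)[0, 1] / np.var(eps)

# Ordering Y before FF (answer (a)): Y is predetermined, so its impact response is ~0.
# Ordering FF before Y (answer (b)): Y inherits the 0.38 comovement on impact, so a
# contractionary (positive FF) shock moves Y up contemporaneously.
```

This is exactly the mechanism behind columns 1 and 3 of Figure 4: once Yt is dropped from Ωt, the identified contractionary shock raises output on impact.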
This response pattern is clearly at variance with models in which the key effects of monetary policy shocks are those associated with the inflation tax. We do not know of other models which can rationalize a rise in output after a contractionary monetary policy shock. Absent some coherent model that can account for the response functions in columns 1 and 3 of Figure 4, we reject the underlying identifying assumptions as being implausible. We suspect that the resulting shock measures confound policy and nonpolicy disturbances. Columns 2 and 4 of Figure 4 report the estimated impulse response functions to policy shock measures computed under the same identification assumptions as those underlying the FF and NBR policy shocks except that Pt is excluded from Ωt. As can be seen, the benchmark results are virtually unaffected by this perturbation. 4.4.2. Excluding commodity prices from Ωt: The price puzzle
On several occasions in the postwar era, a rise in inflation was preceded by a rise in the federal funds rate and in commodity prices. An example is the oil price shock in 1974. Recursive identification schemes that set St equal to FFt and do not include commodity prices in Ωt as leading indicators of inflation in the Fed's feedback rule sometimes imply that contractionary monetary policy shocks lead to a sustained rise in
Fig. 4. Results when the answer is assumed to be the response of real GDP to policy. Columns 1 and 3 report the estimated impulse response functions of various economic aggregates to policy shock measures that were computed under the same identification assumptions as those underlying the FF and NBR policy shocks except that Yt is excluded from Ωt. Columns 2 and 4 report the estimated impulse response functions to policy shock measures computed under the same identification assumptions as those underlying the FF and NBR policy shocks except that Pt is excluded from Ωt. As can be seen, the benchmark results are virtually unaffected by this perturbation.
the price level. 39 Eichenbaum (1992) viewed this implication as sufficiently anomalous relative to standard theory to justify referring to it as "the price puzzle". 40 Sims (1992) conjectured that prices appeared to rise after certain measures of a contractionary policy shock because those measures were based on specifications of Ωt that did not include information about future inflation that was available to the Fed. Put differently, the conjecture is that policy shocks which are associated with substantial price puzzles are actually confounded with nonpolicy disturbances that signal future increases in prices. Christiano et al. (1996a) and Sims and Zha (1998) show that when one modifies such shock measures by including current and lagged values of commodity prices in Ωt, the price puzzle often disappears. It has now become standard practice to work with policy shock measures that do not generate a price puzzle. To document both the nature of the puzzle and its resolution, Figure 5 displays the impulse response of Pt to eight different contractionary monetary policy shock measures. The top and bottom rows display the effects of shocks to systems in which St is measured by FFt and NBRt, respectively. Columns 1-4 correspond to policy shock measures in which (i) the current values of Pt and Yt and current and lagged values of PCOMt are omitted from Ωt, (ii) current and lagged values of PCOMt are omitted from Ωt, (iii) the current value of PCOMt is omitted from Ωt, and (iv) Ωt is given by our benchmark specification, respectively. A number of interesting results emerge here. First, policy shock measures based on specifications in which current and lagged values of PCOM are omitted from Ωt imply a rise in the price level that lasts several years after a contractionary policy shock.
Second, according to the point estimates, the price puzzle is particularly pronounced for the specification in which the current values of Yt and Pt are also excluded from Ωt (column 1). Recall that deleting Pt from Ωt had virtually no effect on our results. These findings suggest that current Y and current and past PCOM play a similar role in purging policy shock measures of nonpolicy disturbances. Third, the 95% confidence intervals displayed in Figure 5 indicate that the price puzzle is statistically significant for the Fed Funds based shock measures associated with columns 1 and 2 in Figure 5. 41
39 The first paper that documents the "price puzzle" for the USA and several other countries appears to be Sims (1992). 40 There do exist some models that predict a temporary rise in the price level after a contraction. These models stress the role of self-fulfilling shocks to expectations in the monetary transmission mechanism. See for example Beaudry and Devereux (1995). Also there exist some limited participation models of the monetary transmission mechanism in which the impact effect of contractionary monetary policy shocks is so strong that prices rise in the impact period of the policy shock. See for example Fuerst (1992) and Christiano et al. (1997a). 41 We used the artificial data underlying the confidence intervals reported in Figure 5 to obtain a different test of the price puzzle. In particular, we computed the number of times that the average price response over the first 2, 4 and 6 quarters was positive. For the FF model underlying the results in column 1 the results were 96.4%, 97.2%, and 98.0%, respectively. Thus, at each horizon, the price puzzle is significant at the 5% significance level. For the FF model underlying the second column, the results are 95.6%, 94.6%, and 89.8%, so that there is a marginally significant price puzzle over the first year. Regardless of the horizon, the price puzzle was not significant at even the 10% significance level for the other specifications in Figure 5.

Fig. 5. The impulse response of Pt to eight different contractionary monetary policy shock measures. The top and bottom rows display the effects of shocks to systems in which St is measured by FFt and NBRt, respectively. Columns 1-4 correspond to policy shock measures in which (i) the current value of Pt, Yt and current and lagged values of PCOMt are omitted from Ωt, (ii) current and lagged values of PCOMt are omitted from Ωt, (iii) the current value of PCOMt is omitted from Ωt, and (iv) Ωt is given by our benchmark specification, respectively.

Fourth, consistent with results in Eichenbaum (1992), the price puzzle is less severe for the NBR based policy shocks. Fifth, little evidence of a price puzzle exists for the benchmark FF and NBR policy shocks. We conclude this section by noting that, in results not reported here, we found that the dynamic responses of nonprice variables to monetary policy shocks are robust to deleting current and lagged values of PCOM from Ωt.

4.4.3. Equating the policy instrument, St, with M0, M1 or M2

There is a long tradition of identifying monetary policy shocks with statistical innovations to monetary aggregates like the base (M0), M1 and M2. Indeed this was
the standard practice in the early literature on the output and interest rate effects of an unanticipated shock to monetary policy. 42 This practice can be thought of as setting St equal to a monetary aggregate like M0, M1 or M2 and using a particular specification of Ωt. We refer the reader to Leeper, Sims and Zha (1996) and Cochrane (1994) for critical reviews of this literature. Here we discuss the plausibility of the identification schemes underlying M based policy shock measures by examining the implied response functions of various economic aggregates. Figure 6 reports estimated response functions corresponding to six policy measures. Columns 1 and 2 pertain to policy shock measures in which St is set equal to M0t. Column 1 is generated assuming that Ωt consists of 4 lagged values of Yt, Pt, PCOMt, FFt, NBRt and M0t. For column 2, we add the current value of Yt, Pt, and PCOMt to Ωt. Columns 3 and 4 are the analogs of columns 1 and 2 except that M0t is replaced by M1t. Columns 5 and 6 are the analogs of columns 1 and 2 except that M0t is replaced by M2t. We begin by discussing the dynamic response functions corresponding to the M0 based policy shock measures. Notice that the responses in column 1 are small and estimated very imprecisely. Indeed, it would be difficult to reject the hypothesis that Y, P, PCOM, and FF are all unaffected by the policy shock. Once we take sampling uncertainty into account, it is hard to argue that these response functions are inconsistent with the benchmark policy shock measure based response functions. In this limited sense, inference is robust. Still, the point estimates of the response functions are quite different from our benchmark results. In particular, they indicate that a contractionary policy shock drives Pt and FFt down. The fall in Pt translates into a modest decline in the rate of inflation. 43 After a delay of one or two periods, Yt rises by a small amount.
The delay aside, this response pattern is consistent with a simple neoclassical monetary model of the sort in which there is an inflation tax effect on aggregate output [see for example Cooley and Hansen (1989)]. The response functions in column 2 are quite similar to those in column 1. As before, they are estimated with sufficient imprecision that they can be reconciled with various models. The point estimates themselves are consistent with simple neoclassical monetary models. Compared to column 1, the initial decline in Yt after a contractionary policy shock is eliminated, so that the results are easier to reconcile with a simple neoclassical monetary model. The impulse response functions associated with the M1 based policy shocks in columns 3 and 4 are similar to those reported in columns 1 and 2, especially when sampling uncertainty is taken into account. The point estimates themselves seem harder to reconcile with a simple monetary neoclassical model. For example, according to 42 See for example Barro (1977), Mishkin (1983), S. King (1983) and Reichenstein (1987). For more recent work in this tradition see King (1991) and Cochrane (1994). 43 The fall in Pt translates into an initial .20 percent decline in the annual inflation rate. The maximal decline in the inflation rate is about .25 percent which occurs after 3 periods. The inflation rate returns to its preshock level after two years.
Fig. 6. Estimated response functions corresponding to six policy measures.
column 3, output falls for over two quarters after a contractionary policy shock. The fact that output eventually rises seems difficult to reconcile with limited participation or sticky wage/price models. This is also true for the results displayed in column 4. Moreover, the results in that column also appear to be difficult to reconcile with the neoclassical monetary model. For example, initially inflation is hardly affected by a monetary contraction, after which it actually rises. Sampling uncertainty aside, we conclude that the M1 based policy shock measures are difficult to reconcile with known (at least to us) models of the monetary transmission mechanism. Finally, consider the M2 based policy shock measures. Here a number of interesting results emerge. First, the impulse response functions are estimated more precisely than those associated with the M0 and M1 based policy shock measures. Second, the impulse response functions share many of the qualitative properties of those associated with the benchmark policy shock measures. In particular, according to both columns 5 and 6, a contractionary monetary policy shock generates a prolonged decline in output and a rise in FFt. Also the price level hardly changes for roughly 3 quarters. This is true even for the policy shock measure underlying column 5, where the price level is free to change in the impact period of the shock. There is one potentially important anomaly associated with the M2 based policy shock measures: after a delay, NBR and M2 move in opposite directions. In sum, the M based policy shock measures provide mixed evidence on the robustness of the findings associated with our benchmark policy shocks. The response functions associated with the M0 and M1 policy shock measures are estimated quite imprecisely. In this sense they do not provide evidence against robustness.
The point estimates of the response functions associated with the M1 based policy shock measures are hard to reconcile with existing models of the monetary transmission mechanism. But the point estimates associated with the M0 based policy shock measures are consistent with simple neoclassical monetary models. If one wants evidence that is not inconsistent with simple neoclassical monetary models, this is where to look. Finally, apart from the anomalous response of NBR, qualitative inference about the effects of a monetary policy shock is robust to whether we work with the M2 based policy shock measure or the benchmark policy shock measures.
4.4.4. Using information from the federal funds futures market

An important concern regarding the benchmark policy shock measures is that they may be based on a smaller information set than the one available to the monetary authority or private agents. Rudebusch (1996) notes that one can construct a market-based measure of the one-month-ahead unanticipated component of the federal funds rate. He does so using data from the federal funds futures market, which has been active since late 1988. 44 He recognizes that a component of the unanticipated move in the 44 See Brunner (1994), Carlson et al. (1995), and Krueger and Kuttner (1996) for further discussion and analysis of the federal funds futures market.
federal funds rate reflects the Federal Reserve's endogenous response to the economy. To deal with this problem, he measures the exogenous shock to monetary policy as the part of the unanticipated component of the federal funds rate which is orthogonal to a measure of news about employment. In Rudebusch's view, the correlation between the resulting measure and our FF benchmark policy shock measure is sufficiently low to cast doubt upon the latter. 45 But policy shock measures can display a low correlation, while not changing inference about the economic effects of monetary policy shocks. We now investigate whether and how inference is affected by incorporating federal funds futures market data into the analysis. To study this question, we repeated the benchmark FF analysis, replacing FFt with FFt - FMt-1 in the underlying monthly VAR. 46 Here FMt-1 denotes the time t-1 futures rate for the average federal funds rate during time t. 47 We refer to the orthogonalized disturbance in the FFt - FMt-1 equation as the FM policy shock. In addition, because of data limitations, we redid the analysis for what we refer to as the Rudebusch sample period, 1989:04-1995:03. Because of the short sample period, we limit the number of lags in the VAR to six. Before considering impulse response functions to the policy shocks, we briefly discuss the shocks themselves. Panel A of Figure 7 displays the FM policy shocks for the period 1989:10-1995:03. In addition, we display FF policy shocks for the same period. These were computed using our benchmark, monthly VAR model, estimated over the whole sample period, using six lags in the VAR. Panel B is the same as Panel A, except that the VAR underlying the benchmark FF policy shocks is estimated using data only over the Rudebusch sample period. A few features of Figure 7 are worth noting.
First, the shock measures in Panel A are of roughly similar magnitude, with the standard deviations of the benchmark and FM policy shocks being 0.22 and 0.16, respectively. Consistent with the type of findings reported by Rudebusch, the correlation between the two shock measures is relatively low, 0.34. 48 Second, when we estimate the VARs underlying the benchmark FF and FM policy shocks over the same sample period, the correlation rises to approximately 0.45. Interestingly, the FF policy shocks now have a smaller standard deviation than the FM policy shocks. 49 We now proceed to consider robustness of inference regarding the effects of monetary policy shocks. The dynamic response functions to an FM policy shock, together with 95% confidence intervals, are displayed in column 1 of Figure 8. There
45 See Sims (1996) for a critique of Rudebusch's analysis. 46 Evans and Kuttner (1998) find that small, statistically insignificant deviations from futures market efficiency partially account for the low correlations between variants of the FF benchmark policy shocks and FFt - FMt-1. 47 These data were taken from Krueger and Kuttner (1996). 48 Rudebusch actually reports the R2 in the regression relation between the two shocks. This is the square of the correlation between the two variables. So, our correlation translates into an R2 of 0.12. 49 Given the short sample, it is important to emphasize that the standard deviations have been adjusted for degrees of freedom.
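Footnote 48's point — that the R2 from regressing one shock series on the other is just the squared correlation, so a correlation of 0.34 implies an R2 of about 0.12 — can be verified numerically. The two shock series below are hypothetical stand-ins for the FM and benchmark FF shocks, not the actual data.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two hypothetical policy-shock series with a modest common component,
# standing in for the FM and benchmark FF shocks (illustrative only).
common = rng.standard_normal(66)
fm_shock = 0.16 * (0.4 * common + rng.standard_normal(66))
ff_shock = 0.22 * (0.4 * common + rng.standard_normal(66))

corr = np.corrcoef(fm_shock, ff_shock)[0, 1]

# R2 from regressing one shock on the other (with an intercept)
# equals the squared correlation, as footnote 48 notes.
X = np.column_stack([np.ones_like(ff_shock), ff_shock])
beta, *_ = np.linalg.lstsq(X, fm_shock, rcond=None)
resid = fm_shock - X @ beta
r2 = 1 - resid.var() / fm_shock.var()
assert abs(r2 - corr ** 2) < 1e-10
```

With a correlation of 0.34, this identity gives an R2 of 0.34² ≈ 0.12, matching the footnote.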
Fig. 7. Top: the FM policy shocks for the period 1989:10-1995:03. In addition, we display FF policy shocks for the same period. These were computed using our benchmark, monthly VAR model, estimated over the whole sample period, using six lags in the VAR. Bottom: the same as top, except that the VAR underlying the benchmark FF policy shocks is estimated using data only over the Rudebusch sample period.
are two obvious features to these results. First, the policy shock itself is very small (a little over 10 basis points). Second, with the exception of FFt - FMt-1, the response of the other variables is not significantly different from zero at all lags. To compare these results with those based on the benchmark FF policy shocks, we need to control for the difference in sample periods and lag lengths. To this end, we report the impulse response functions and standard errors of the 6 lag benchmark FF model estimated over the Rudebusch sample period. These are displayed in column 2 of Figure 8. We see that the same basic message emerges here as in column 1: over the Rudebusch sample period, the shocks are small and the impulse response functions are imprecisely estimated. We conclude that there is no evidence to support
Fig. 8. The dynamic response functions to an FM policy shock, together with 95% confidence intervals, are displayed in column 1. There are two obvious features to these results. First, the policy shock itself is very small (a little over 10 basis points). Second, with the exception of FFt - FMt-1, the response of the other variables is not significantly different from zero at all lags.
the notion that inference is sensitive to incorporating federal funds futures market data into the analysis. This conclusion may very well reflect the limited data available for making the comparison.

4.4.5. Sample period sensitivity
Comparing the results in Figure 8 with our full sample, benchmark FF results (see column 1, Figure 2) reveals that the impulse response functions are much smaller in the Rudebusch sample period. A similar phenomenon arises in connection with our benchmark NBR model. Pagan and Robertson (1995) characterize this phenomenon as the "vanishing liquidity effect". Wong (1996) also documents this phenomenon for various schemes based on the recursiveness assumption. These findings help motivate the need to study the robustness of inference to different sample periods. We now proceed to investigate subsample stability. Our discussion is centered around two general questions. First, what underlies the difference in impulse response functions across subsamples? Here, we distinguish between two possibilities. One possibility is that the difference reflects a change in the size of the typical monetary policy shock. The other possibility is that it reflects a change in the dynamic response to a shock of a given magnitude. We will argue that, consistent with the findings in Christiano's (1995) discussion of the vanishing liquidity effect, the evidence is consistent with the hypothesis that the first consideration dominates. Second, we discuss robustness of qualitative inference. Not surprisingly in view of our findings regarding the first question, we find that qualitative inference about the effects of a monetary policy shock is robust across subsamples. This last finding is consistent with results in Christiano et al. (1996b). In the analysis that follows, we focus primarily on results for the benchmark FF policy shocks. We then briefly show that our conclusions are robust to working with the NBR policy shocks. To begin our analysis of subsample stability, we test the null hypothesis that there was no change at all in the data generating mechanism for the Rudebusch sample period.
To this end, we constructed confidence intervals for the impulse response functions in column 2 of Figure 8 under the null hypothesis that the true model is the one estimated using data over the full sample. 50 The resulting confidence intervals are reported in column 3. In addition, that column reports for convenience the estimated response functions from column 2. We see that the estimated impact effect of a one standard deviation policy shock on the federal funds rate (see the
50 These confidence intervals were computed using a variant of the standard bootstrap methodology employed in this paper. In particular, we generated 500 artificial time series, each of length equal to that of the full sample, using the six lag, benchmark full sample FF VAR and its fitted disturbances. In each artificial time series we estimated a six lag, benchmark FF VAR model using artificial data over the period corresponding to the Rudebusch sample period. The 95% confidence intervals are based on the impulse response functions corresponding to the VARs estimated from the artificial data.
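The bootstrap described in footnote 50 can be sketched in a few lines. To keep the sketch short, a univariate AR(1) stands in for the six-lag VAR, and the data are simulated rather than the paper's series; the structure (simulate full-length artificial series from the full-sample estimate and its fitted disturbances, re-estimate on the subsample window, take percentiles of the resulting impulse responses) follows the footnote.

```python
import numpy as np

def irf_ar1(phi, horizon):
    # Impulse response of an AR(1): phi**h at horizon h.
    return phi ** np.arange(horizon)

def estimate_ar1(y):
    # OLS slope of y_t on y_{t-1} (no intercept, for brevity).
    return float(np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1]))

def bootstrap_subsample_ci(y, sub_start, sub_end, horizon=12, n_boot=500, seed=0):
    """CIs for subsample impulse responses under the null that the
    full-sample model generated the data (cf. footnote 50), with an
    AR(1) standing in for the six-lag VAR."""
    rng = np.random.default_rng(seed)
    phi_full = estimate_ar1(y)
    resid = y[1:] - phi_full * y[:-1]           # fitted disturbances
    T = len(y)
    irfs = np.empty((n_boot, horizon))
    for b in range(n_boot):
        e = rng.choice(resid, size=T, replace=True)
        ystar = np.empty(T)
        ystar[0] = y[0]
        for t in range(1, T):                   # simulate full-length series
            ystar[t] = phi_full * ystar[t - 1] + e[t]
        phi_sub = estimate_ar1(ystar[sub_start:sub_end])  # subsample estimate
        irfs[b] = irf_ar1(phi_sub, horizon)
    lower, upper = np.percentile(irfs, [2.5, 97.5], axis=0)
    return lower, upper

# Hypothetical data in place of the monthly VAR series.
rng = np.random.default_rng(42)
y = np.zeros(360)
for t in range(1, 360):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()
lo, hi = bootstrap_subsample_ci(y, sub_start=288, sub_end=360)
```

An estimated subsample response falling outside these bands, as the funds-rate impact effect does in column 3 of Figure 8, is evidence against the no-change null.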
fourth row of column 3) lies well below the 95% confidence interval. So, we reject the null hypothesis that there was no change at all in the data generating mechanism in the Rudebusch sample. 51 Next, we modified the null hypothesis to accommodate the notion that the only thing which changed in the Rudebusch sample was the nature of the monetary policy shocks. In all other respects, the data generating mechanism is assumed to remain unchanged. Under this null hypothesis, we generated 95% confidence intervals for the estimated impulse response functions in column 2 of Figure 8. 52 These confidence intervals are reported in column 4 of Figure 8, which also repeats for convenience the point estimates from column 2. Notice that, with one exception, all of the estimated impulse response functions lie within the plotted confidence intervals. 53 The exception is that the impulse response function of PCOM lies just outside the plotted confidence intervals for roughly the first six periods. Based on these results, we conclude that there is little evidence against the joint hypothesis that (i) the response of the aggregates to a given policy shock is the same in the two sample periods and (ii) the size of the shocks was smaller in the post 1988:10 period. For any particular subsample, we refer to these two conditions as the modified subsample stability hypothesis. We now consider the stability of impulse response functions in other subsamples. Figure 9 reports response functions to monthly benchmark FF policy shocks, estimated over four subsamples: the benchmark sample, and the periods 1965:1-1979:9, 1979:10-1994:12, and 1984:2-1994:12. In each case, the method for computing confidence intervals is analogous to the one underlying the results in column 4 of Figure 8. 54 From Figure 9 we see that the estimated response functions for

51 The procedure we have used to reject the null hypothesis of no change versus the alternative of a change in 1989 implicitly assumes the choice of break date is exogenous with respect to the stochastic properties of the data. There is a large literature (see Christiano (1992) and the other papers in that Journal of Business and Economic Statistics volume) which discusses the pitfalls of inference about break dates when the choice of date is endogenous. In this instance our choice was determined by the opening of the federal funds futures market. Presumably, this date can be viewed as exogenous for the purpose of our test. 52 With one exception, these confidence intervals were computed using the procedure described in the previous footnote. The exception has to do with the way the shocks were handled. In particular, the artificial data were generated by randomly sampling from the orthogonalized shocks, rather than the estimated VAR disturbances. Residuals other than the policy shocks were drawn, with replacement, from the full sample period set of residuals. The policy shocks were drawn from two sets. Shocks for periods prior to the analog of the Rudebusch sample period were drawn, with replacement, from the pre-Rudebusch sample fitted policy shocks. Shocks for periods during the analog of the Rudebusch sample period were drawn, with replacement, from the Rudebusch sample fitted policy shocks. 53 In this manuscript, we have adopted the extreme assumption that the stochastic properties of the policy shock changed abruptly on particular dates. An alternative is that the changes occur smoothly in the manner captured by an ARCH specification for the policy shocks. Parekh (1997) pursues this interpretation. He modifies our bootstrap procedures to accommodate ARCH behavior in the shocks. 54 That is, they are computed under the assumption that the data generating mechanism is the six lag, full sample estimated VAR with policy shocks drawn only from the relevant subsample. All other shocks are drawn randomly from the full sample of fitted shocks.
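The shock-resampling scheme of footnotes 52 and 54 — policy shocks drawn from regime-specific pools, all other residuals from the full sample — can be sketched as below. The arrays are hypothetical fitted shocks, constructed only to illustrate the mechanics (a large-variance pre-break pool and a small-variance post-break pool).

```python
import numpy as np

def draw_shocks(policy_shocks, other_resids, break_idx, rng):
    """One bootstrap draw of the structural shocks under the scheme in
    footnote 52: policy shocks before the break are resampled from the
    pre-break fitted policy shocks, those after from the post-break pool,
    while all other residuals are drawn from the full sample."""
    T = len(policy_shocks)
    pre, post = policy_shocks[:break_idx], policy_shocks[break_idx:]
    policy_star = np.concatenate([
        rng.choice(pre, size=break_idx, replace=True),
        rng.choice(post, size=T - break_idx, replace=True),
    ])
    # Non-policy residuals: rows drawn with replacement from the full sample.
    other_star = other_resids[rng.integers(0, T, size=T)]
    return policy_star, other_star

# Hypothetical fitted shocks: large pre-break variance, small post-break,
# mimicking the change in shock size discussed in the text.
rng = np.random.default_rng(3)
policy = np.concatenate([0.22 * rng.standard_normal(280),
                         0.10 * rng.standard_normal(80)])
others = rng.standard_normal((360, 5))
p_star, o_star = draw_shocks(policy, others, break_idx=280, rng=rng)
```

Feeding such draws through the full-sample VAR generates artificial data in which only the size of the policy shocks differs across regimes, which is exactly the modified subsample stability null.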
Fig. 9. Response functions to monthly benchmark FF policy shocks, estimated over four subsamples: the benchmark sample, and the periods 1965:1-1979:9, 1979:10-1994:12, and 1984:2-1994:12.
employment, P, PCOM, and M1 almost always lie within the confidence intervals. For the third and fourth sample periods there is no evidence against the modified subsample stability hypothesis. There is some marginal evidence against the hypothesis in the first subsample. In particular, the PCOM and price level responses lie outside the plotted confidence interval at some horizons. We find these results somewhat troubling, since they may indicate that the benchmark FF policy shocks are contaminated by other shocks to which the Fed responds. Despite this, the overall impression one gets from these results is that the modified subsample stability hypothesis is not rejected for the benchmark FF policy shocks. At the same time, there is strong evidence that the variance of the policy shocks changed over the sample. One interpretation is that the early 1980s were a period in which policy shocks were very large, but that the shocks were of comparable magnitude and substantially smaller size throughout the rest of the post-war period. One bit of evidence in favor of this view is that the estimated policy shocks in the second and fourth sample periods are reasonably similar in size, 20 basis points versus 12 basis points, respectively. We now briefly point out that qualitative inference is robust across subsamples. For each subsample we find evidence consistent with a liquidity effect. Specifically, a policy-induced rise in the federal funds rate is associated with a decline in nonborrowed reserves, total reserves and M1. In addition, the contractionary policy shock is associated with a delayed response of employment and a very small change in the price level. We now consider the results for the benchmark NBR policy shocks, reported in Figure 10. The overall impression conveyed here is similar to what we saw in Figure 9. There is relatively little evidence against the modified subsample stability hypothesis.
For the most part, the point estimates all lie within the plotted confidence intervals. Note that the impulse response functions are qualitatively robust across subsamples. We now turn to a complementary way of assessing subsample stability, which focuses on the magnitude of the liquidity effect. Panels A and B of Table 1 report summary statistics on the initial liquidity effect associated with the benchmark FF and NBR identification schemes, respectively. In that table, FF/NBR denotes the average of the first three responses in the federal funds rate, divided by the average of the first three responses in nonborrowed reserves. These responses are taken from the appropriate entries in Figure 9. As a result, FF/NBR denotes the percentage point change in the federal funds rate resulting from a policy-induced one percent change in NBR. FF/M1 denotes the corresponding statistic with the policy-induced change in M1 in the denominator. Because of the shape of the impulse response function in M1, we chose to calculate this statistic by averaging the first six responses in FF and M1. The statistics are reported for the four sample periods considered in Figure 9. In addition, the 95% confidence intervals are computed using the appropriately modified version of the bootstrap methodology used to compute confidence intervals in Figure 9. Panel B is the exact analog of Panel A, except that the results are based on the NBR policy shocks.
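The FF/NBR summary statistic just defined — the average of the first three funds-rate responses divided by the average of the first three nonborrowed-reserves responses — is straightforward to compute. The impulse responses below are hypothetical numbers chosen to reproduce the full-sample Panel A value of Table 1, not the paper's estimates.

```python
import numpy as np

def liquidity_ratio(ff_resp, money_resp, k):
    """Average of the first k responses of the funds rate divided by the
    average of the first k responses of the monetary aggregate: the
    percentage-point change in FF per one-percent policy-induced change
    in the aggregate (k = 3 for NBR, k = 6 for M1 in Table 1)."""
    return float(np.mean(ff_resp[:k]) / np.mean(money_resp[:k]))

# Hypothetical impulse responses: a contractionary shock that raises FF
# and lowers NBR, calibrated to give the full-sample Table 1 value.
ff_resp = np.array([0.60, 0.45, 0.30, 0.15, 0.05])
nbr_resp = np.array([-0.50, -0.48, -0.45, -0.40, -0.35])
ratio = liquidity_ratio(ff_resp, nbr_resp, k=3)
# ratio is about -0.94: a one-percent policy-induced increase in NBR goes
# with roughly a one-percentage-point fall in the funds rate.
```

The FF/M1 column of Table 1 uses the same function with k = 6, reflecting the slower-building M1 response.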
Fig. 10. Results for the benchmark NBR policy shocks.
Table 1
The liquidity effect, sample period sensitivity (95% confidence intervals in parentheses)

Subsample         FF/NBR                  FF/M1

Panel A: FF Policy Shocks
65:01-94:12       -0.94 (-1.30, -0.73)    -2.17 (-3.61, -1.36)
65:01-79:09       -0.70 (-2.64, -0.55)    -1.88 (-8.32, -0.72)
79:10-94:12       -0.71 (-1.95, -0.64)    -1.13 (-4.48, 0.82)
84:02-94:12       -0.69 (-5.52, 1.86)     -0.97 (-13.92, 13.39)

Panel B: NBR Policy Shocks
65:01-94:12       -0.23 (-0.29, -0.17)    -1.14 (-2.10, -0.59)
65:01-79:09       -0.07 (-0.36, -0.11)    -2.08 (-4.86, -0.14)
79:10-94:12       -0.27 (-0.35, -0.13)    -0.37 (-3.56, -0.15)
84:02-94:12       -0.13 (-0.45, -0.04)    -0.47 (-10.12, 5.35)
We begin our discussion by reviewing the results in Panel A. The full sample results indicate that a one percent policy-shock induced increase in nonborrowed reserves results in roughly a one percentage point reduction in the federal funds rate. A one percent policy-shock induced increase in M1 results in roughly a two percentage point decline in the federal funds rate. The point estimates do vary across the subsamples. However, the evidence suggests that the differences in estimated responses can be accounted for by sampling uncertainty. In particular, there is little evidence against the null hypothesis that the true responses are the same in the subsamples. This is evident from the fact that the confidence intervals in the subsamples include the point estimates for the full sample. Turning to Panel B, we see that, using the NBR identification scheme, we obtain point estimates of the responses that are generally smaller. Again, there is little evidence against subsample stability. We now summarize our findings regarding subsample stability. We have two basic findings. First, there is evidence that the variance of the policy shocks is larger in the early 1980s than in the periods before or after. Second, we cannot reject the view that the response of economic variables to a shock of given magnitude is stable over the different subsamples considered.
We conclude this section by noting that other papers have also examined the subsample stability question. See, for example, Balke and Emery (1994), Bernanke and Mihov (1995) and Strongin (1995). These papers focus on a slightly different question than we do. They investigate whether the Fed adopted different operating procedures in different subperiods, and provide some evidence that different specifications of the policy rule in Equation (2.1) better characterize different subsamples. At the same time, Bernanke and Mihov (1995) and Strongin (1995) do not find that the dynamic response functions to a monetary policy shock are qualitatively different over the different subsample periods that they consider. In this sense, their results are consistent with ours.

4.5. Discriminating between the benchmark identification schemes
In the introduction we sketched a strategy for assessing the plausibility of different identification schemes. The basic idea is to study the dynamic response of a broad range of variables to a monetary policy shock. We dismiss an identification scheme if it implies a set of dynamic response functions that is inconsistent with every model we are willing to consider. The first subsection illustrates our approach by comparing the plausibility of two interpretations of an orthogonalized shock to NBR. These amount to two alternative identification schemes. The first corresponds to the benchmark NBR identification scheme described in Section 4. Under this scheme, an orthogonalized contractionary shock to NBR is interpreted as a negative money supply shock. The second scheme, recently proposed by Coleman, Gilles and Labadie (1996), interprets the same shock as either a positive shock to money demand, or as news about a future monetary expansion. When we use our strategy to assess their identification scheme, we find that we can dismiss it as implausible. 55 The second subsection contrasts our approach to discriminating among identification schemes with one recently proposed in Bernanke and Mihov (1995). We review their methodology and explain why we think our approach is more likely to be fruitful.

4.5.1. The Coleman, Gilles and Labadie identification scheme
According to Coleman, Gilles and Labadie (1996), understanding why an NBR policy shock may not coincide with an exogenous contractionary shock to monetary policy requires understanding the technical details of the way the Fed allocates the different tasks of monetary policy between the discount window and the Federal Open Market Committee. They argue, via two examples, that a contractionary NBR shock may correspond to other types of shocks.
55 The discussion presented here summarizes the analysis in Christiano (1996).
In their first example, they argue that a negative NBR shock may actually correspond to a positive shock to the demand for money. The argument goes as follows. Suppose that there was a shock to the demand for TR, M1 or M2 that drove up the interest rate. Absent a change in the discount rate, this would lead to an increase in borrowed reserves via the discount window. Suppose in addition that the FOMC believes that the managers of the discount window always over-accommodate shocks to the demand for money, and responds by pulling nonborrowed reserves out of the system. An attractive feature of this story is that it can potentially account for the fact that the federal funds rate is negatively correlated with nonborrowed reserves and positively correlated with borrowed reserves [see Christiano and Eichenbaum (1992)]. Unfortunately, the story has an important problem: it is hard to see why a positive shock to money demand would lead to a sustained decline in total reserves, M1 or M2. But this is what happens after an NBR policy shock (see Figure 2). In light of this fact, the notion that a negative NBR policy shock really corresponds to a positive money demand shock seems unconvincing. In their second example, Coleman, Gilles and Labadie argue that a negative NBR shock may actually correspond to a positive future shock to the money supply. The basic idea is that the Fed signals policy shifts in advance of actually implementing them, and that a signal of an imminent increase in total reserves produces an immediate rise in the interest rate. Such a rise would occur in standard neoclassical monetary economies of the type considered by Cooley and Hansen (1989). Suppose that the rise in the interest rate results in an increase in borrowed reserves. If the Fed does not wish the rise in borrowed reserves to generate an immediate rise in total reserves, it would respond by reducing nonborrowed reserves.
This interpretation of the rise in the interest rate after an NBR policy shock is particularly interesting because it does not depend on the presence of a liquidity effect. Indeed, this interpretation presumes that the interest rate rises in anticipation of a future increase in the money supply. To the extent that the interpretation is valid, it would constitute an important attack on a key part of the evidence cited by proponents of the view that plausible models of the monetary transmission mechanism ought to embody strong liquidity effects. Again there is an important problem with this interpretation of the evidence: the anticipated rise in the future money supply that the contractionary NBR policy shock is supposed to be proxying for never happens: TR, M1 and M2 fall for over two years after a contractionary NBR policy shock. In light of this, the notion that a contractionary NBR policy shock is proxying for expansionary future money supply shocks seems very unlikely.

4.5.2. The Bernanke-Mihov critique
The preceding subsection illustrates our methodology for assessing the plausibility of different identification schemes. Bernanke and Mihov (BM) propose an alternative approach. Under the assumption that the policy function is of the form of Equation (2.1), they develop a particular test of the null hypothesis that ε_t^s is a monetary policy shock
against the alternative that ε_t^s is confounded by nonmonetary policy shocks to the market for federal funds. To implement their test, Bernanke and Mihov develop a model of the federal funds market which is useful for interpreting our benchmark identification schemes. These schemes are all exactly identified, so that each fits the data equally well. To develop a statistical test for discriminating between these schemes, BM impose a particular overidentifying restriction: the amount that banks borrow at the discount window is not influenced by the total amount of reserves in the banking system. BM interpret a rejection of a particular overidentified model as a rejection of the associated NBR, FF or NBR/TR identification scheme. But a more plausible interpretation is that it reflects the implausibility of their overidentifying restriction. This is because that restriction is not credible in light of existing theory about the determinants of discount window borrowing and the empirical evidence presented below.
4.5.2.1. A model of the federal funds market. BM assume that the demand for total reserves is given by

TR_t = f_TR(Ω_t) − αFF_t + σ_d ε_t^d,   (4.4)
where f_TR(Ω_t) is a linear function of the elements of Ω_t, α, σ_d > 0, and ε_t^d is a unit-variance shock to the demand for reserves which is orthogonal to Ω_t. According to Equation (4.4), the demand for total reserves depends on the elements of Ω_t and responds negatively to the federal funds rate. The demand for borrowed reserves is:
BR_t = f_BR(Ω_t) + βFF_t − γNBR_t + σ_b ε_t^b,   (4.5)
where f_BR(Ω_t) is a linear function of the elements of Ω_t and σ_b > 0. The unit-variance shock to borrowed reserves, ε_t^b, is assumed to be orthogonal to Ω_t. BM proceed throughout under the assumption that γ = 0. Below, we discuss in detail the rationale for specification (4.5). 56 Finally, they specify the following Fed policy rule for setting NBR_t:
NBR_t = f_NBR(Ω_t) + e_t,   (4.6)
where

e_t = φ_d σ_d ε_t^d + φ_b σ_b ε_t^b + σ_s ε_t^s.   (4.7)
Here, ε_t^s is the unit-variance exogenous shock to monetary policy. By assumption, ε_t^d, ε_t^b, ε_t^s are mutually orthogonal, both contemporaneously and at all leads and lags.
56 We follow BM in not including the interest rate charged at the discount window (the discount rate) as an argument in Equation (4.5). BM rationalize this decision on the grounds that the discount rate does not change very often.
The parameters φ_d and φ_b control the extent to which the Fed responds contemporaneously to shocks to the demand for total reserves and borrowed reserves. Using the fact that TR_t = NBR_t + BR_t, and solving Equations (4.4)-(4.7), we obtain

[TR_t, NBR_t, FF_t]' = F(Ω_t) + u_t,   u_t = Bε_t,   (4.8)

where

F(Ω_t) = [ (β f_TR(Ω_t) + α f_BR(Ω_t) + α(1−γ) f_NBR(Ω_t))/(α+β),
           f_NBR(Ω_t),
           (f_TR(Ω_t) − f_BR(Ω_t) − (1−γ) f_NBR(Ω_t))/(α+β) ]',   (4.9)

B = (1/(α+β)) ×
    [ σ_d(β + α(1−γ)φ_d)     ασ_b(1 + (1−γ)φ_b)     α(1−γ)σ_s  ]
    [ (α+β)φ_d σ_d           (α+β)φ_b σ_b           (α+β)σ_s   ]
    [ σ_d(1 − (1−γ)φ_d)      −σ_b(1 + (1−γ)φ_b)     −(1−γ)σ_s  ],   (4.10)

and

ε_t = [ε_t^d, ε_t^b, ε_t^s]'.   (4.11)
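As a numerical cross-check on the solution, B in Equation (4.10) can be built from illustrative parameter values and verified against the structural equations (4.4)-(4.7), with the Ω_t terms suppressed. The parameter values below are hypothetical, chosen only to respect the sign restrictions; this is a sketch, not an estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values, for illustration only
alpha, beta, gamma = 0.05, 0.03, 0.4
phi_d, phi_b = 0.2, -0.1
sd, sb, ss = 0.011, 0.013, 0.013   # sigma_d, sigma_b, sigma_s
k = alpha + beta

# B from Equation (4.10); rows ordered (TR, NBR, FF)
B = np.array([
    [sd*(beta + alpha*(1 - gamma)*phi_d)/k, alpha*sb*(1 + (1 - gamma)*phi_b)/k, alpha*(1 - gamma)*ss/k],
    [phi_d*sd,                              phi_b*sb,                           ss],
    [sd*(1 - (1 - gamma)*phi_d)/k,          -sb*(1 + (1 - gamma)*phi_b)/k,      -(1 - gamma)*ss/k],
])

eps = rng.normal(size=3)                    # (eps_d, eps_b, eps_s)
u_tr, u_nbr, u_ff = B @ eps                 # innovations, Equation (4.8)

e = phi_d*sd*eps[0] + phi_b*sb*eps[1] + ss*eps[2]            # Equation (4.7)
assert np.isclose(u_nbr, e)                                  # Equation (4.6)
assert np.isclose(u_tr, -alpha*u_ff + sd*eps[0])             # Equation (4.4)
u_br = u_tr - u_nbr                                          # TR = NBR + BR
assert np.isclose(u_br, beta*u_ff - gamma*u_nbr + sb*eps[1]) # Equation (4.5)
```

The assertions confirm that each column of B is consistent, term by term, with the demand, borrowing and policy-rule equations.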
4.5.2.2. Identifying the parameters of the model. We now turn to the problem of identifying the parameters of the money market model. As in Section 3, we first estimate u_t using the fitted disturbances, û_t, in a linear regression of the money market variables on Ω_t, and then estimate ε_t from ε̂_t = B⁻¹û_t using a sample estimate of B. The latter can be obtained by solving
V = BB',   (4.12)
where V is the Gaussian maximum likelihood estimate of Eu_t u_t' which respects the restrictions, if any, implied by Equation (4.12) and the structure of B in Equation (4.10). The estimate, V, is obtained by maximizing

−(T/2){log|V| + tr(SV⁻¹)},

where

S = (1/T) Σ_{t=1}^T û_t û_t',   (4.13)

subject to conditions (4.10)-(4.12). When the latter restrictions are not binding, the solution to this maximization problem is V = S. 57

57 BM use a slightly different estimation strategy. See the appendix in BM.
Denote the model's eight structural parameters by

ψ = [α, β, γ, φ_d, φ_b, σ_d², σ_b², σ_s²].   (4.14)

Let ψ_T denote a value of ψ which implies a B that satisfies condition (4.12). The model is underidentified if there exist other values of ψ that have this property too. The model is exactly identified if ψ_T is the only value of ψ with this property. Finally, the model is overidentified if the number of structural parameters is less than six, the number of independent elements in S. Given the symmetry of V, condition (4.12) corresponds to six equations in eight unknown parameters: α, β, γ, φ_d, φ_b, σ_d², σ_b², σ_s². To satisfy the order condition discussed in Section 3, at least two more restrictions must be imposed. Recall that the FF, NBR and NBR/TR identification schemes analyzed in the previous section correspond to a particular orthogonality condition on the monetary policy shock. These conditions are satisfied in special cases of the federal funds market model described above. Each special case corresponds to a different set of two restrictions on the elements of ψ. In each case, the estimation procedure described above reduces to first setting V = S and then solving the inverse mapping from V to the free elements of ψ in condition (4.12). The uniqueness of this inverse mapping establishes global identification. When S_t = NBR_t, relations (4.8)-(4.10) imply that the measured policy shock is given by Equation (4.7). So, from the perspective of this framework, our NBR system assumes:

φ_d = φ_b = 0.   (4.15)
The free parameters in ψ are uniquely recovered from V as follows:

σ_s² = V_22,   α = −V_21/V_32,   (4.16)

β = (V_11 + αV_31)/(V_31 + αV_33),   σ_d² = (β + α)[V_31 + αV_33],   (4.17)

γ = 1 − V_21(β + α)/(ασ_s²),   σ_b² = V_33(α + β)² − σ_d² − (1 − γ)²σ_s²,   (4.18)
where V_ij refers to the (i,j) element of V. When S_t = FF_t, then

e_t = [φ_d(γ − 1) + 1]/(β + α) σ_d ε_t^d + [−1 + φ_b(γ − 1)]/(β + α) σ_b ε_t^b + (γ − 1)/(β + α) σ_s ε_t^s.   (4.19)

From the perspective of this framework, the benchmark FF system assumes:

φ_d = 1/(1 − γ),   φ_b = −φ_d.   (4.20)
The free parameters in ψ are recovered from V as follows:

c ≡ (γ − 1)/(β + α) = V_33/V_32,   σ_s² = V_32²/V_33,   α = −V_31/V_33,   (4.21)

σ_d² = V_11 − α²c²σ_s²,   γ = 1 − σ_d²/(V_21 + αcσ_s²),   (4.22)

σ_b² = (1 − γ)²(V_22 − σ_s²) − σ_d²,   β = (γ − 1)V_32/V_33 − α.   (4.23)
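The FF-system mapping admits the same numerical check: under restriction (4.20) the FF innovation is the pure policy shock, and (4.21)-(4.23) invert V = BB'. The parameter values are again hypothetical.

```python
import numpy as np

# Hypothetical parameter values; phi_d and phi_b follow restriction (4.20)
alpha, beta, gamma = 0.05, 0.03, 0.4
sd, sb, ss = 0.011, 0.013, 0.013
phi_d = 1.0 / (1.0 - gamma)
phi_b = -phi_d
c = (gamma - 1.0) / (beta + alpha)

# Under (4.20) the FF innovation loads only on the policy shock;
# rows ordered (TR, NBR, FF)
B = np.array([
    [sd,         0.0,       -alpha*c*ss],
    [phi_d*sd,   phi_b*sb,   ss],
    [0.0,        0.0,        c*ss],
])
V = B @ B.T

# Recovery via Equations (4.21)-(4.23)
a = -V[2, 0] / V[2, 2]
ss2 = V[2, 1]**2 / V[2, 2]
c_hat = V[2, 2] / V[2, 1]
sd2 = V[0, 0] - a**2 * c_hat**2 * ss2
g = 1.0 - sd2 / (V[1, 0] + a*c_hat*ss2)
sb2 = (1.0 - g)**2 * (V[1, 1] - ss2) - sd2
b = (g - 1.0) * V[2, 1] / V[2, 2] - a

assert np.allclose([a, b, g], [alpha, beta, gamma])
assert np.allclose([sd2, sb2, ss2], [sd**2, sb**2, ss**2])
```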
The NBR/TR system assumes:

α = φ_b = 0.   (4.24)

Under these conditions, it is easy to verify that the error of the regression of NBR_t on Ω_t and TR_t is σ_s ε_t^s. The free parameters of the money market model are recovered from V as follows:

σ_d² = V_11,   φ_d = V_21/V_11,   σ_s² = V_22 − (φ_d)²σ_d²,   (4.25)

c_1 = (V_32 − φ_d V_31)/σ_s²,   c_2 = V_31/σ_d²,   σ_b² = β²[V_33 − c_2²σ_d² − c_1²σ_s²],   (4.26)

β = [c_2 − φ_d c_1]⁻¹,   γ = βc_1 + 1.   (4.27)
Restrictions (4.15), (4.20) and (4.24) guarantee that the benchmark NBR, FF and NBR/TR policy shock measures, respectively, are not polluted by nonmonetary policy shocks.

4.5.2.3. The Bernanke-Mihov test. Recall that the basic purpose of the money market model discussed above is to help assess whether different monetary policy shock measures are polluted by nonpolicy shocks to the money market. In the case of the NBR policy system this amounts to testing restriction (4.15). For the FF and NBR/TR systems this corresponds to testing restrictions (4.20) and (4.24), respectively. The problem is that, since each of these systems is exactly identified, the restrictions cannot be tested using standard statistical procedures. From this perspective, the money market model is not helpful. As the model stands, to assess the different identification schemes, one must revert to the strategy laid out in the previous section. Namely, one must examine the qualitative properties of the impulse response functions. Instead BM impose an additional maintained assumption on the model. Specifically, they assume γ = 0, i.e., the demand for borrowed reserves does not depend on the level of nonborrowed reserves. With this additional restriction, the NBR, FF and NBR/TR models have only five structural parameters, so each is overidentified. Consequently, each can be tested using standard likelihood ratio methods. An important limitation of this approach is that we can always interpret a rejection as evidence against the maintained hypothesis, γ = 0, rather than as evidence against the NBR, FF or NBR/TR identification schemes. A rejection would be strong evidence against one of these identification schemes only to the extent that one had overwhelmingly sharp priors
that γ really is zero. In fact, there are no compelling reasons to believe that γ is zero. Just the opposite is true. Standard dynamic models of the market for reserves suggest that γ is not zero. Consider for example Goodfriend's (1983) model of a bank's demand for borrowed reserves. Goodfriend highlights two factors that affect a bank's decision to borrow funds from the Federal Reserve's discount window. The first factor is the spread between the federal funds rate and the Fed's discount rate (here assumed constant). The higher this spread is, the lower is the cost of borrowing funds from the discount window, relative to the cost of borrowing in the money market. The second factor is the existence of nonprice costs of borrowing at the Federal Reserve discount window. These costs rise for banks that borrow too much or too frequently, or that are perceived to be borrowing simply to take advantage of the spread between the federal funds rate and the discount rate. Goodfriend writes down a bank objective function which captures both of the aforementioned factors and then derives a policy rule for borrowed reserves that is of the following form:

BR_t = λ_1 BR_{t−1} − λ_2 h FF_t − h Σ_{i=2}^∞ λ_2^i E_t(FF_{t−1+i}),   −1 < λ_1, λ_2 < 0,  h > 0.   (4.28)
Here E_t denotes the conditional expectation based on information at time t. Reflecting the presence of the first factor in banks' objective functions, the current federal funds rate enters the decision rule for BR with a positive coefficient. The variable, BR_{t−1}, enters this expression with a negative coefficient because of the second factor. The presence of the expected future federal funds rate in the policy rule reflects both factors. For example, when E_tFF_{t+1} is high, banks want BR_t to be low so that they can take full advantage of the high expected funds rate in the next period without having to suffer large nonprice penalties at the discount window. The crucial thing to note from Equation (4.28) is that any variable which enters E_t(FF_{t−1+i}) also enters the "demand for borrowed reserves" (4.5). So, if nonborrowed reserves help forecast future values of the federal funds rate, γ should not equal zero. To assess the empirical importance of this argument we proceeded as follows. We regressed FF_t on 12 lagged values (starting with month t − 1) of data on employment, P, PCOM, FF, NBR, and TR. The estimation period for the regression is the same as for our monthly benchmark VAR's. We computed an F-statistic for testing the null hypothesis that all the coefficients on NBR in this equation are equal to zero. The value of this statistic is 3.48, which has a probability value of less than 0.001 percent using conventional asymptotic theory. Given our concerns about the applicability of conventional asymptotic theory in this context, we also computed the probability value of the F-statistic using an appropriately modified version of the bootstrap methodology used throughout this chapter. Specifically, we estimated a version of our benchmark monthly VAR in
which all values of NBR were excluded from the federal funds equation. 58 Using the estimated version of this VAR, we generated 500 synthetic time series by drawing randomly, with replacement, from the set of fitted residuals. On each synthetic data set, we computed an F-statistic using the same procedure that was applied in the actual data. Proceeding in this way, we generated a distribution for the F-statistic under the null hypothesis that lagged values of NBR do not help forecast the federal funds rate. We find that none of the simulated F-statistics exceed the empirical value of 3.48. This is consistent with the results reported in the previous paragraph, which were based on conventional asymptotic distribution theory. Based on this evidence, we reject the null hypothesis that lagged values of NBR are not useful for forecasting future values of FF and the associated hypothesis that NBR is not an argument of the demand for BR. The argument against the BM exclusion restriction (γ = 0) is a special case of the general argument against exclusion restrictions presented in Sargent (1984) and Sims (1980). In fact, this argument suggests that none of the parameters of BM's money market model are identified, since even exact identification relies on the exclusion of NBR and BR from total reserves demand (4.4), and TR from the borrowed reserves function (4.5). There is another reason not to expect γ = 0. The second factor discussed above suggests that a bank which is not having reserve problems, but still borrows funds at the discount window, may suffer a higher nonprice marginal cost of borrowing. This would happen if the discount window officer suspected such a bank were simply trying to profit from the spread between the federal funds rate and discount rate. 59 Presumably a bank that possesses a large amount of nonborrowed reserves could be viewed as having an "ample supply of federal funds".
The appropriate modification to the analysis in Goodfriend (1983) which reflects these considerations leads to the conclusion that NBR_t should enter on the right hand side of Equation (4.28) with a negative coefficient. We conclude that what we know about the operation of the discount window and the dynamic decision problems of banks provides no support for the BM maintained hypothesis that γ is equal to zero.

4.5.2.4. Empirical results. To make concrete the importance of BM's maintained assumption that γ = 0, we estimated both the restricted and unrestricted NBR, FF and NBR/TR models, as discussed above. The results are reported in Tables 2a and 2b. Each table reports results based on two data sets, the BM monthly data and the quarterly data used in the rest of this chapter. For the BM data, we used their estimated S matrix, which they kindly provided to us. The column marked "restricted" reports results for the model with γ = 0. These correspond closely to those reported by BM. 60 The
58 Each equation in this VAR was estimated separately using OLS and 12 lags of the right hand side variables. 59 Regulation A, the regulation which governs the operation of the discount window, specifically excludes borrowing for this purpose. 60 The small differences between the two sets of results reflect different estimation methods.
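The residual bootstrap used for the F-test above can be sketched in a single-equation setting. The data below are synthetic stand-ins (the actual exercise uses the monthly VAR with 12 lags of each variable), so only the mechanics, not the numbers, carry over.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_stat(y, X, excl):
    """F-statistic for the null that the coefficients on columns `excl`
    of X are jointly zero in the OLS regression of y on X."""
    def ssr(Z):
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        e = y - Z @ b
        return e @ e
    keep = [j for j in range(X.shape[1]) if j not in excl]
    s_r, s_u = ssr(X[:, keep]), ssr(X)
    return (s_r - s_u) / len(excl) / (s_u / (len(y) - X.shape[1]))

# Synthetic stand-in for the funds-rate regression: column 2 plays the
# role of lagged NBR and genuinely helps forecast y
T = 360
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
y = X @ np.array([0.1, 0.5, 0.3]) + 0.2 * rng.normal(size=T)

F_emp = f_stat(y, X, excl=[2])

# Bootstrap under the null: re-estimate with column 2 excluded, resample
# the fitted residuals with replacement, recompute F on each synthetic y
keep = [0, 1]
b_r, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
resid = y - X[:, keep] @ b_r
F_boot = [f_stat(X[:, keep] @ b_r + rng.choice(resid, size=T, replace=True),
                 X, excl=[2]) for _ in range(500)]
p_value = np.mean(np.array(F_boot) >= F_emp)
```

The bootstrap p-value is the fraction of null-generated F-statistics that exceed the empirical one, exactly as in the text.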
Table 2a
Estimation results for money market models

                NBR model                                        FF model
           B-M Data, 1965:1-1994:12     1965:Q3-1995:Q2     B-M Data, 1965:1-1994:12     1965:Q3-1995:Q2
           restricted   unrestricted    unrestricted        restricted   unrestricted    unrestricted

α          0.009        0.035           0.022               -0.003       -0.003          -0.001
           (0.00763)    (0.00763)       (0.00550)           (0.00099)    (0.00070)       (0.00104)
β          0.03•        0.012           0.012               0.012        0.012           0.012
           (0.00269)    (0.00129)       (0.00151)           (0.00106)    (0.00091)       (0.00107)
γ          0            0.481           0.279               0            -0.103          -0.073
                        (0.03229)       (0.05599)                        (0.04849)       (0.05864)
φ_d        0            0               0                   1            1               1
σ_d        0.011        0.020           0.022               0.009        0.009           0.013
           (0.00065)    (0.00333)       (0.00387)           (0.00033)    (0.00023)       (0.00058)
σ_s        0.013        0.013           0.018               0.004        0.004           0.008
           (0.00048)    (0.00048)       (0.00114)           (0.00070)    (0.00047)       (0.00107)
σ_b        0.013        0.007           0.009               0.009        0.010           0.011
           (0.00100)    (0.00033)       (0.00070)           (0.00035)    (0.00053)       (0.00077)
p-value    0.000                                            0.052
columns marked "unrestricted" report the analog results when the restriction, γ = 0, is not imposed. The bottom row of Tables 2a and 2b reports the p-values for testing the monthly restricted versus unrestricted model. 61 Several results in these tables are worth noting. To begin with, according to column 1 of Table 2a, BM's restricted NBR model is strongly rejected. Recall, they interpret this rejection as reflecting that φ_d and/or φ_b are nonzero. As we have stressed, one can just as well infer that γ is not zero. In fact, from column 2 we see that the estimated value of γ is positive and highly statistically significant. Of course, this result would not be particularly interesting if the estimated values of the other parameters in the unrestricted model violated BM's sign restrictions. But this is not the case. All the parameter values satisfy BM's sign restrictions. This is the case whether we use monthly or quarterly data. Taken together, our results indicate that BM's claim to have rejected the benchmark NBR model is unwarranted.
61 We use a likelihood ratio statistic which, under the null hypothesis, has a chi-square distribution with 1 degree of freedom.
Table 2b
Estimation results, restricted and unrestricted models

                NBR/TR model
           B-M Data, 1965:1-1994:12     1965:Q3-1995:Q2
           restricted   unrestricted    unrestricted

α          0            0               0
β          0.046        0.038           0.026
           (0.00424)    (0.00358)       (0.00422)
γ          0            -0.011          0.200
                        (0.10371)       (0.07836)
φ_d        0.802        0.802           0.886
           (0.06350)    (0.06350)       (0.09664)
σ_d        0.009        0.009           0.013
           (0.00033)    (0.00033)       (0.00082)
σ_s        0.011        0.011           0.014
           (0.00040)    (0.00040)       (0.00087)
σ_b        0.019        0.016           0.015
           (0.00188)    (0.00181)       (0.00234)
p-value    0.032
Next, from column 4 of Table 2a, we see that, consistent with BM's results, the FF model cannot be rejected on the basis of the likelihood ratio test. Notice, however, that the estimated value of α is negative. Indeed, the null hypothesis, α ≥ 0, is strongly rejected. This calls into question the usefulness of their model for interpreting the benchmark FF identification scheme for the sample period as a whole. 62 Finally, note from Table 2b that the NBR/TR model is not strongly rejected by BM's likelihood ratio test, and the parameter values are consistent with all of BM's sign restrictions. In sum, BM have proposed a particular way to test whether the policy shock measures associated with different identification schemes are polluted by nonpolicy shocks. The previous results cast doubt on the effectiveness of that approach.

4.6. Monetary policy shocks and volatility
Up to now we have focused on answering the question: What are the dynamic effects of a monetary policy shock? A related question is: How have monetary policy
62 BM actually argue that this model is most suitable for the pre-1979 period. Here too, their point estimate of α is negative and significantly different from zero.
shocks contributed to the volatility of various economic aggregates? The answer to this question is of interest for two reasons. First, it sheds light on the issue of whether policy shocks have been an important independent source of impulses to the business cycle. Second, it sheds light on identification strategies which assume that the bulk of variations in monetary aggregates reflect exogenous shocks to policy. For example, this is a maintained assumption in much of the monetized real business cycle literature. 63 Table 3 summarizes the percentage of the variance of the k-step-ahead forecast errors in P, Y, PCOM, FF, NBR, TR and M1 that is attributable to the quarterly benchmark FF, NBR and NBR/TR policy shocks. Analog results for policy shock measures based on M0, M1, and M2 are reported in Table 4. We begin by discussing the results based on the benchmark policy measures. First, according to the benchmark FF measure, monetary policy shocks have had an important impact on the volatility of aggregate output, accounting for 21%, 44% and 38% of the 4, 8 and 12 quarter ahead forecast error variance of output, respectively. However, these effects are smaller when estimated using the NBR/TR policy shock measures, and smaller still for the benchmark NBR policy shocks. Indeed, the latter account for only 7%, 10% and 8% of the 4, 8 and 12 quarter ahead forecast error variance of output. Evidently, inference about the importance of monetary policy shocks depends sensitively on which policy shock measure is used. In addition, conditioning on the policy shock measure, there is substantial sampling uncertainty regarding how important policy shocks are in accounting for the variance of the k-step forecast error. Second, none of the policy shock measures account for much of the volatility of the price level, even at the three year horizon. In addition, only the FF benchmark policy shock measure accounts for a nontrivial portion of the variability of PCOM. Evidently, monetary policy shocks are not an important source of variability in prices, at least at horizons of up to three years. Third, regardless of whether we identify S_t with the federal funds rate or NBR, policy shocks account for a large percent of the volatility of S_t at the two quarter horizon. However, their influence declines substantially over longer horizons. Fourth, according to the benchmark FF and NBR/TR measures, monetary policy shocks play a very minor role in accounting for the variability in TR and M1. Policy shocks play a more important role according to the benchmark NBR measure. Even here, most of the volatility in TR and M1 arises as a consequence of nonpolicy shocks. Identification strategies which assume that monetary aggregates are dominated by shocks to policy are inconsistent with these results. Finally, policy shocks are more important in explaining the volatility in M2 than for TR or M1. This is true regardless of which benchmark policy measure we consider. Still, the variation in M2 due to policy shocks never exceeds 50%.
63 See Cooley and Hansen (1989), Chari et al. (1996) and Christiano and Eichenbaum (1995).
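The decompositions reported in Tables 3 and 4 are k-step-ahead forecast-error variance shares. A generic sketch of the computation, for any VAR with an identified impact matrix B (u_t = Bε_t, ε_t orthonormal), is:

```python
import numpy as np

def fevd(A_list, B, H):
    """Forecast-error variance decomposition for y_t = A_1 y_{t-1} + ...
    + A_p y_{t-p} + u_t with u_t = B eps_t.  Returns an (H, n, n) array
    whose (h, i, j) entry is the share of the (h+1)-step forecast-error
    variance of variable i attributable to structural shock j."""
    n = B.shape[0]
    p = len(A_list)
    C = np.zeros((n * p, n * p))           # companion matrix
    C[:n, :] = np.hstack(A_list)
    if p > 1:
        C[n:, :-n] = np.eye(n * (p - 1))
    shares = np.zeros((H, n, n))
    Psi = np.eye(n * p)
    acc = np.zeros((n, n))                 # running sum of (Psi_h B)**2
    for h in range(H):
        theta = Psi[:n, :n] @ B            # h-step structural MA matrix
        acc += theta ** 2
        shares[h] = acc / acc.sum(axis=1, keepdims=True)
        Psi = C @ Psi
    return shares

# Toy check: in a diagonal VAR(1) with B = I, each variable's forecast
# error is due entirely to its own shock at every horizon
shares = fevd([np.diag([0.5, 0.9])], np.eye(2), H=12)
```

Applied to the estimated benchmark systems, the rows of `shares` at h = 3, 7, 11 correspond to the 4, 8 and 12 quarter entries discussed in the text.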
Next we consider the results obtained for policy shock measures based on M0, M1, and M2. The VAR's underlying these results correspond to the ones underlying the results reported in columns 2, 4 and 6 in Figure 6. In each case, S_t is equated to either M0, M1 or M2, and the information set, Ω_t, includes current and past values of Y_t, P_t, PCOM_t as well as lagged values of FF_t, TR_t and S_t. A number of results are interesting to note here. First, the M0 and M1-based policy shock measures account for only a trivial fraction of the fluctuations in output. In contrast, at horizons greater than a year, M2-based policy measures account for a noticeably larger fraction of output variations. While they account for a smaller fraction of output volatility than do the FF policy shocks, they are similar on this dimension to the NBR/TR policy shock measures. Second, neither the M0 nor M1-based policy shock measures account for more than a trivial part of the volatility of P and PCOM. Policy shock measures based on M2 play a somewhat larger role at horizons of a year or longer. However, there is considerable sampling uncertainty about these effects. Finally, at horizons up to a year, M0, M1, and M2-based policy shocks account for sizeable percentages of the variance of M0, M1, and M2, respectively. At longer horizons the percentages are lower. Viewed across both sets of identification strategies that we have discussed, there is a great deal of uncertainty about the importance of monetary policy shocks in aggregate fluctuations. The most important role for these shocks emerged with the FF-based measure of policy shocks. The smallest role is associated with the M0 and M1-based policy shock measures. We conclude this subsection by noting that even if monetary policy shocks have played only a very small role in business fluctuations, it does not follow that the systematic component, f in Equation (2.1), of monetary policy has played a small role. The same point holds for prices. A robust feature of our results is that monetary policy shocks account for a very small part of the variation in prices. This finding does not deny the proposition that systematic changes in monetary policy, captured by f, can play a fundamental role in the evolution of prices at all horizons of time.
5. The effects of monetary policy shocks: abandoning the recursiveness approach

In this section we discuss an approach to identifying the effects of monetary policy shocks that does not depend on the recursiveness assumption. Under the recursiveness assumption, the disturbance term, ε_t^s, in the monetary authority's reaction function [see Equation (2.1)] is orthogonal to the elements of their information set Ω_t. As discussed above [see Equation (4.1)], this assumption corresponds to the notion that economic variables within the quarter are determined in a block recursive way: first, the variables associated with goods markets (prices, employment, output, etc.) are determined; second, the Fed sets its policy instrument (i.e., NBR in the case of the
benchmark NBR system, and FF in the case of the benchmark FF system); and third, the remaining variables in the money market are determined. To help compare the recursiveness assumption with alternative identifying assumptions, it is convenient to decompose it into two parts. First, it posits the existence of a set of variables that is predetermined relative to the policy shock. Second, it posits that the Fed only looks at predetermined variables in setting its policy instrument. Together, these assumptions imply that monetary policy shocks can be identified with the residuals in the ordinary least squares regression of the policy instrument on the predetermined variables. The papers discussed in this section abandon different aspects of the recursiveness assumption. All of them drop the assumption that the Fed only looks at variables that are predetermined relative to the monetary policy shock. This implies that ordinary least squares is not valid for isolating the monetary policy shocks. Consequently, all these papers must make further identifying assumptions to proceed. The papers differ in whether they assume the existence of variables which are predetermined relative to the monetary policy shock. Sims and Zha (1998) assume there are no variables with this property. In contrast, papers like Sims (1986), Gordon and Leeper (1994), and Leeper, Sims and Zha (1996) assume that at least a subset of goods market variables are predetermined. Under their assumptions, these variables constitute valid instruments for estimating the parameters of the Fed's policy rule. The section is organized as follows. First, we discuss the identifying assumptions in the paper by Sims and Zha (1998). We then compare their results with those obtained using the benchmark identification schemes. Finally, we briefly consider the analyses in the second group of papers mentioned above.
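In a stylized simulation with synthetic stand-in data (not the chapter's series), the two parts of the recursiveness assumption deliver the policy shock as an ordinary least squares residual:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data: two predetermined "goods market" variables and a policy
# instrument set by a rule over them plus an exogenous policy shock
T = 200
X = rng.normal(size=(T, 2))                 # predetermined within the period
eps_s = rng.normal(size=T)                  # exogenous policy shock
S = X @ np.array([0.4, -0.2]) + eps_s       # policy rule plus shock

# OLS of the instrument on the predetermined variables recovers the shock
Z = np.column_stack([np.ones(T), X])
b, *_ = np.linalg.lstsq(Z, S, rcond=None)
shock_hat = S - Z @ b
corr = np.corrcoef(shock_hat, eps_s)[0, 1]  # close to 1 in large samples
```

When the Fed instead responds to variables that are not predetermined, X is correlated with eps_s and this regression no longer isolates the shock, which is why the papers below need further identifying assumptions.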
5.1. A fully simultaneous system
This section is organized as follows. In the first subsection we discuss the specification of the Sims and Zha (1998) (SZ) model and corresponding identification issues. In the second subsection, we compare results obtained with a version of the SZ model to those obtained using the benchmark policy shocks.

5.1.1. Sims-Zha: model specification and identification
We begin our discussion of the SZ model by describing their specification of the money supply equation. It is analogous to our policy function (2.1), with St identified with a short term interest rate, Rt. Sims and Zha (1998) assume that the only contemporaneous variables which the Fed sees when setting St are a producer's price index for crude materials (Pcm) and a monetary aggregate (M). In addition, the Fed is assumed to see a list of lagged variables to be specified below. Note that unlike the benchmark systems, Ωt does not contain the contemporaneous values of the aggregate price level
Ch. 2: Monetary Policy Shocks: What Have we Learned and to What End?
and output. As Sims and Zha (1998) point out, this is at best only a reasonable working hypothesis. 64 The reaction function in the SZ model can be summarized as follows:

Rt = const + a1 Mt + a2 Pcmt + fS(Zt−1, …, Zt−q) + σS ε^S_t,   (5.1)

where fS(Zt−1, …, Zt−q) is a linear function of past values of all the variables in the system, q > 0, σS > 0, and ε^S_t is a serially uncorrelated monetary policy shock. Sims and Zha (1998) assume that Pcm and M are immediately affected by a monetary policy shock. As noted above, this rules out ordinary least squares as a method to estimate Equation (5.1). Instrumental variables would be a possibility if they made the identifying assumption that there exists a set of variables predetermined relative to the monetary policy shock. However, they are unwilling to do so. They make other identifying assumptions instead. First, they postulate a money demand function of the form:

Mt − Pt − Yt = const + b1 Rt + fM(Zt−1, …, Zt−q) + σM ε^M_t.   (5.2)
Here, fM(Zt−1, …, Zt−q) is a linear function of past values of all the variables in the system, σM > 0, and ε^M_t is a serially uncorrelated shock to money demand. Recall, Yt and Pt denote aggregate output and the price level. Note that the coefficients on Yt and Pt are restricted to unity. Sims and Zha display a model which rationalizes a money demand relationship like Equation (5.2). 65 Second, they assume that Pcmt responds contemporaneously to all shocks in the system. They motivate this assumption from the observation that crude materials prices are set in auction markets. Third, as noted above, they are not willing to impose the assumption that goods market variables like P and Y are predetermined relative to the monetary policy shock. Clearly, they cannot allow P and Y to respond to all shocks in an unconstrained way, since the system would then not be identified. Instead, they limit the channels by which monetary policy and other shocks have a contemporaneous effect on P and Y. To see how they do this, it is convenient to define a vector of variables denoted by Xt, which includes Pt and Yt. Sims and Zha impose the restriction that Xt does not respond directly to Mt or Rt, but that it does respond to Pcmt. A monetary
64 This is because the Fed does have at its disposal various indicators of price and output during the quarter. For example, the Fed has access to weekly reports on unemployment claims and retail sales. Also, two weeks prior to each FOMC meeting, policymakers have access to the "Beige Book", which is compiled from nationwide surveys of business people. In addition, FOMC members are in constant contact with members of the business community. Moreover, the Fed receives, with a one month lag, various monthly measures of output and prices (e.g. employment, wages and the consumer price level).
65 Their model rationalizes a relationship between the contemporaneous values of Mt, Pt, Yt and St. One can rationalize the lagged terms in the money demand equation if there is a serially correlated shock to the marginal product of money in their model economy. Ireland (1997) and Kim (1998) rationalize similar relationships with Y replaced by consumption.
policy shock has a contemporaneous impact on the variables in Xt via its impact on Pcmt.
To see this, first let

Xt = [Yt, Pt, Wt, Pimt, Tbkt]′,    Zt = [Pcmt, Mt, Rt, Xt′]′,
where Pim denotes the producer price index of intermediate materials, W denotes average hourly earnings of nonagricultural workers, and Tbk denotes the number of personal and business bankruptcy filings. The assumptions stated up to now imply the following restrictions on the matrix A0 in representation (3.2) of Zt:

       [ a11   a12   a13   a14   a15   a16   a17   a18 ]
       [  0    a22   a23  −a22  −a22    0     0     0  ]
       [ a31   a32   a33    0     0     0     0     0  ]
A0 =   [ a41    0     0    a44   a45   a46   a47   a48 ]      (5.3)
       [ a51    0     0    a54   a55   a56   a57   a58 ]
       [ a61    0     0    a64   a65   a66   a67   a68 ]
       [ a71    0     0    a74   a75   a76   a77   a78 ]
       [ a81    0     0    a84   a85   a86   a87   a88 ]
The first row of A0 corresponds to the Pcm equation. The second and third rows correspond to the money demand equation (5.2) and to the monetary policy rule (5.1), respectively. The next five rows correspond to Xt. The second and third elements of et in Equation (3.2) correspond to ε^M_t and ε^S_t. It is evident from Equation (5.3) that the impact of a monetary policy shock operates on Xt via its influence on Pcm. Specifically, this reflects the fact that the (4, 1) to (8, 1) elements of A0 are potentially nonzero. If we impose that these elements are zero, then, given the other zero restrictions in the second and third columns of A0, the variables in Xt are predetermined relative to a monetary policy shock. We now consider identification of the SZ model. Notice that the last five rows in A0 have the same restrictions, suggesting that Equation (3.2) is not identified. To see that this is in fact the case, consider the following orthonormal matrix:
      [  I      0
       (3×3)  (3×5)
W =      0      W̃
       (5×3)  (5×5) ],        (5.4)
where the dimensions are indicated in parentheses and W̃ is an arbitrary orthonormal matrix. Note that if A0 satisfies (i) the restrictions in Equation (5.3) and (ii) the relation
A0^{-1}(A0^{-1})′ = V, then WA0 does too. Here, V denotes the variance covariance matrix of the fitted residuals in the VAR (3.1) for Zt. By the identification arguments in Section 3, representation (3.2) with A0 and with WA0 are equivalent from the standpoint of the data. That is, there is a family of observationally equivalent representations (3.2) for the data. Each corresponds to a different choice of A0. We now discuss the implications of this observational equivalence result for impulse response functions. Recall from Equation (3.6) that, conditional on the Bi's characterizing the VAR of Zt, the dynamic response functions of Zt to et are determined by A0^{-1}. Also, note that (WA0)^{-1} = A0^{-1} W′. Two important conclusions follow from these observations. First, the impulse response functions of Zt to the first three elements of et are invariant to the choice of A0 belonging to the set of observationally equivalent A0's defined above, i.e., generated using W's of the form given by Equation (5.4). Second, the dynamic response functions to the last five elements of et are not. To the extent that one is only interested in the response functions to the first three elements of et, the precise choice of W̃ is irrelevant. Sims and Zha choose to work with the A0 satisfying Equation (5.3) and the additional restriction that the bottom right 5 × 5 block of A0 is upper triangular. 66 The corresponding dynamic response functions of Zt to the last five shocks in et simply reflect this normalization. We now make some summary remarks regarding identification of the SZ model. In Section 3 we discussed an order condition which, in conjunction with a particular rank condition, is sufficient for local identification. According to that order condition, we need at least 28 restrictions on A0. The restrictions in Equation (5.3), along with the normalization mentioned in the previous paragraph, represent 31 restrictions on A0.
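The restriction count can be verified mechanically. The sketch below is illustrative (not the chapter's code): it encodes the exclusion pattern of (5.3), the two cross-equation ties to −a22, and the upper-triangular normalization of the lower right 5 × 5 block, and tallies them against the 28 restrictions the order condition requires:

```python
# Count the identifying restrictions embodied in (5.3) plus the
# normalization, as a check on the order condition: an 8-variable
# system needs at least n(n-1)/2 = 28 restrictions on A0.
# 'x' = free element, '0' = exclusion, 'm' = element tied to -a22.
pattern = [
    "xxxxxxxx",   # Pcm equation: responds to everything
    "0xxmm000",   # money demand: M, R free; Y, P coefficients tied to -a22
    "xxx00000",   # policy rule: R responds to Pcm and M only
    "x00xxxxx",   # X-block rows: Pcm enters; M and R excluded
    "x00xxxxx",
    "x00xxxxx",
    "x00xxxxx",
    "x00xxxxx",
]
zeros = sum(row.count("0") for row in pattern)   # exclusion restrictions
ties = 2                                         # the two -a22 equality ties
normalization = 5 * 4 // 2                       # upper-triangular 5x5 block
total = zeros + ties + normalization
print(zeros, total)   # 19 exclusions; 31 restrictions in all, vs. 28 required
```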
So, we satisfy one of the sufficient conditions for identification. The rank condition must be assessed at the estimated parameter values. Finally, to help guarantee global identification, Sims and Zha impose the restriction that the diagonal elements of A0 are positive.

5.1.2. Empirical results
We organize our discussion of the empirical results around three major questions. First, what are the effects of a contractionary monetary policy shock using the SZ identification scheme? Second, how do these effects compare to those obtained using the benchmark identification schemes? Third, what is the impact on Sims and Zha's (1998) results of their assumption that the variables in Xt respond contemporaneously to a monetary policy shock?
66 Such an A0 matrix is contained in the set of observationally equivalent A0's as long as that set is non-empty. To see this, suppose there is some A0 that satisfies (i) Equation (5.3) and (ii) the relation A0^{-1}(A0^{-1})′ = V. Let QR denote the QR decomposition of the lower right 5 × 5 block of this A0: the 5 × 5 matrix Q is orthonormal and R is upper triangular. Then, form the orthonormal matrix W as in Equation (5.4), with W̃ = Q′. The matrix WA0 satisfies (i) and (ii), with the additional restriction on Equation (5.3) that its lower right 5 × 5 block is upper triangular. This establishes the result sought.
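The construction in footnote 66 is easy to check numerically: for any orthonormal W of the block form (5.4), WA0 implies the same V as A0 and leaves the responses to the first three shocks (the first three columns of A0^{-1}) unchanged, while the QR step delivers the triangular normalization. In the sketch below A0 is an arbitrary random nonsingular matrix, not an estimate:

```python
import numpy as np

# Numerical illustration of the observational-equivalence argument
# and of footnote 66's QR construction of the normalization.
rng = np.random.default_rng(1)
A0 = rng.normal(size=(8, 8))          # arbitrary stand-in, not estimates

# Build W = blockdiag(I_3, Q') from the QR factors of A0's lower-right block.
Q, R = np.linalg.qr(A0[3:, 3:])
W = np.eye(8)
W[3:, 3:] = Q.T

WA0 = W @ A0
V1 = np.linalg.inv(A0) @ np.linalg.inv(A0).T     # V implied by A0
V2 = np.linalg.inv(WA0) @ np.linalg.inv(WA0).T   # V implied by W*A0

print(np.allclose(V1, V2))                         # same implied V
print(np.allclose(np.linalg.inv(A0)[:, :3],
                  np.linalg.inv(WA0)[:, :3]))      # same responses to shocks 1-3
print(np.allclose(np.tril(WA0[3:, 3:], -1), 0))    # block is now upper triangular
```

The last check reflects that the lower right block of WA0 equals Q′QR = R, which is upper triangular by construction.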
To answer these questions, we employ a version of the SZ model in which Mt corresponds to M2 growth and Rt corresponds to the 3 month Treasury bill rate. 67 The four-lag VAR model was estimated using data over the period 1965Q3-1995Q2. 68 Our results are presented in column 1 of Figure 11. The solid lines correspond to our point estimates of the dynamic response of the variables in Zt to a contractionary monetary policy shock. The dotted lines represent 95% confidence intervals about the mean of the impulses. 69 The main consequences of a contractionary SZ policy shock can be summarized as follows. First, there is a persistent decline in the growth rate of M2 and a rise in the interest rate. Second, there is a persistent decline in the GDP deflator and the prices of intermediate goods and crude materials. Third, after a delay, the shock generates a persistent decline in real GDP. Finally, note that the real wage is basically unaffected by the SZ policy shock. Comparing these results with those in Figure 2, we see that the qualitative response of the system to an SZ policy shock is quite similar to those in the benchmark FF and NBR systems. It is interesting to note that the estimated SZ policy shocks are somewhat smaller than the estimated benchmark FF policy shocks. For example, the impact effect of a benchmark FF policy shock on the federal funds rate is about 70 basis points, while the impact of an SZ policy shock on the three-month Treasury bill rate is about 40 basis points. At the same time, the SZ policy shock measure is roughly of the same order of magnitude as an NBR policy shock. In both cases a policy shock is associated with a forty basis point move in the federal funds rate. We now turn to the third question posed above. We show that Sims and Zha's insistence that Xt is not predetermined relative to a monetary policy shock has essentially no impact on their results.
To do this, we simply shut down the coefficients in A0 which allow a monetary policy shock to have a contemporaneous impact on Xt and reestimate the system. Column 2 in Figure 11 reports the results. Comparing columns 1 and 2, we see that inference is virtually unaffected. It is interesting to compare the SZ model with the analysis in Leeper et al. (1996). They work with a system that contains more variables. But, the fundamental difference is that they impose the assumption that goods market variables are predetermined
67 The variable, Tbk, is not used in our analysis. Also, SZ measure M as the log level of M2. Comparing the estimated dynamic response functions to a monetary shock in our version of SZ with those in SZ, it can be verified that these two perturbations make essentially no difference to the results.
68 The variable, Pcm, was measured as the log of the producer price index for crude materials, SA; Pim is the logged producer price index for intermediate materials, SA; Y is logged GDP in fixed-weight 1987 dollars, SA; P is the logged GDP deflator derived from nominal GDP and GDP in fixed-weight 1987 dollars, SA; R is the three-month Treasury bill rate; and M is the change in the log of M2, SA. These data series are taken from the Federal Reserve Board's macroeconomic database. The real wage is logged average hourly earnings of private nonagricultural production workers divided by the GDP deflator, SA, derived from the Citibase data set.
69 These were computed using the procedure described in Sims and Zha (1995).
[Figure 11 here. Sims-Zha model: dynamic responses to a contractionary monetary policy shock, quarterly and monthly specifications. Panels: MP shock => GDP; GDP price deflator / PCE price deflator; FF; growth in M2; crude materials prices; intermediate materials prices; real wages.]
Fig. 11.
relative to a monetary policy shock. 70 The response to a monetary policy shock of the variables that these analyses have in common is very similar. This is consistent with our finding that the absence of predeterminedness of goods market variables in the SZ model is not important. A number of other studies also impose predeterminedness of at least some goods market variables. These include Sims (1986), who assumes predeterminedness of investment, and Gordon and Leeper (1994), who assume all goods market variables and the 10 year Treasury rate are predetermined. Inference about the dynamic response of economic aggregates is very similar across these papers, Sims and Zha (1998), Leeper et al. (1996) and the benchmark systems.
6. Some pitfalls in interpreting estimated monetary policy rules

In Sections 4 and 5 we reviewed alternative approaches for identifying the effects of a monetary policy shock. A common feature of these different approaches is that they make enough identifying assumptions to enable the analyst to estimate the parameters of the Federal Reserve's feedback rule. A natural question is: why did we not display or interpret the parameter estimates? The answer is that these parameters are not easily interpretable. In this section we describe three examples which illustrate why estimated policy rules are difficult to interpret in terms of the behavior of the monetary authority. We emphasize, however, that the considerations raised here need not pose a problem for the econometrician attempting to isolate monetary policy shocks and their consequences. The central feature of our examples is that the policy maker reacts to data that are different from the data used by the econometrician. In the first example, the decision maker uses error-corrupted data, while the econometrician uses error-free data. In the second and third examples the decision maker reacts to a variable that is not in the econometrician's data set. The policy rule parameters estimated by the econometrician are a convolution of the parameters of the rule implemented in real time by the policy maker and the parameters of the projection of the missing data onto the econometrician's data set. It is the convolution of these two types of parameters which makes it difficult to assign behavioral interpretations to the econometrician's estimated policy rule parameters. Our first example builds on the measurement error example discussed in Section 2. We assume there is measurement error in the data used by real time policy makers, while the econometrician uses final revised data. We suppose xt + vt corresponds to the
70 In their description of the model, monetary policy shocks impact on the analog of Xt via a limited set of variables. In practice, however, they set the coefficients on these variables equal to zero. So, all their estimated systems have the property that the goods market variables are predetermined relative to the monetary policy shock.
raw data received by the primary data collection agency and that vt reflects classical reporting and transmission errors that are uncorrelated with the true variable, xt, at all leads and lags. In addition, we suppose that the reporting errors are discovered in one period, so that ut in Equation (2.2) is zero. We assume that the data collection agency (or the staff of the policy maker) reports its best guess, x̂t, of the true data, xt, using its knowledge of the underlying data generating mechanism and the properties of the measurement error process. 71 Finally, suppose that xt evolves according to

xt = ρ1 St−1 + ρ2 xt−1 + ωt,

where ωt is uncorrelated with all variables dated t − 1 and earlier. Suppose the data collection authority computes x̂t as the linear projection of xt on the data available to it. Then,

x̂t = P[xt | St−1, xt + vt, xt−1] = a0 St−1 + a1 (xt + vt) + a2 xt−1,   (6.1)

where the ai's are functions of ρ1, ρ2, and the variances of ωt and vt. Now, suppose that the policy authority is only interested in responding to xt, and that it attempts to do so by setting

St = a x̂t   (6.2)

in real time. Substituting Equation (6.1) into this expression, we see that Equation (6.2) reduces to Equations (2.1) and (2.4) with

β0 = a a0,   β1 = a a1,   β2 = a a2.   (6.3)
Notice how different the econometrician's estimated policy rule, (2.4) and (6.3), is from the real time policy rule (6.2). The β's in the estimated policy rule are a convolution of the behavioral parameter, a, the measurement error variance, and the parameters governing the data generating mechanism underlying the variables that interest the policy maker. 72 Also notice that an econometrician who estimates the policy rule using the recursiveness assumption will, in population, correctly identify the monetary policy shock with a a1 vt. This example shows how variables might enter f, perhaps even with long lags, despite the fact that the policy maker does not care about them per se. In the example, the variables St−1 and xt−1 enter only because they help solve a signal extraction problem. Finally, the example illustrates some of the dangers involved in trying to give
71 For a discussion of the empirical plausibility of this model of the data collection agency, see Mankiw et al. (1984) and Mankiw and Shapiro (1986).
72 See Sargent (1989) for a discussion of how to econometrically unscramble parameters like this in the presence of measurement error.
a structural interpretation to the coefficients in f. Suppose a0 and a are positive. An analyst might be tempted to interpret the resulting positive value of β0 as reflecting a desire to minimize instrument instability. In this example, such an interpretation would be mistaken. Significantly, even though the estimated policy rule has no clear behavioral interpretation, the econometrician in this example correctly identifies the exogenous monetary policy shock. For our second example, we assume that the policy maker responds only to the current innovation in some variable, for example, output. In particular suppose that St = a et + σS ε^S_t, where et is the innovation to which the policy maker responds, a is the policy parameter, and ε^S_t is the exogenous policy shock. Suppose that et is related to data in the following way, et = Σ_{i=0}^∞ βi xt−i, so that in Equation (2.1),

f(Ωt) = a Σ_{i=0}^∞ βi xt−i.
Suppose the econometrician makes the correct identification assumptions and recovers f(Ωt) exactly. An analyst with sharp priors about the number of lags in the policy maker's decision rule, or about the pattern of coefficients in that rule, might be misled into concluding that fundamental specification error is present. In fact, there is not. The disturbance recovered by the econometrician, St − f(Ωt), corresponds exactly to the exogenous monetary policy shock. Our final example is taken from Clarida and Gertler (1997) and Clarida et al. (1997, 1998). They consider the possibility that the rule implemented by the policy authority has the form St = a Et xt+1 + σS ε^S_t. In this case, f(Ωt) = a Et xt+1, and Ωt contains all the variables pertinent to the conditional expectation, Et xt+1. Assuming there is substantial persistence in xt, f will contain long lags and its coefficients will be hard to interpret from the standpoint of the behavior of policy makers. 73 These examples suggest to us that direct interpretation of estimated policy rules is fraught with pitfalls. This is why we did not discuss or report the estimated policy rules. Instead, we focused on dynamic response functions of economic aggregates to monetary policy shocks.
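A small simulation makes the convolution result of the first example concrete. All parameter values below are illustrative choices, not estimates; in particular, the filtering weights (a0, a1, a2) are simply posited here, whereas in the text they solve the agency's signal extraction problem:

```python
import numpy as np

# Simulation of the measurement-error example: the agency filters noisy
# data with weights (a0, a1, a2), the policy maker sets S_t = a * xhat_t,
# and the econometrician regresses S_t on the final (error-free) data.
# The estimated rule coefficients are the convolution a*(a0, a1, a2),
# and the OLS residual recovers a*a1*v_t.
rng = np.random.default_rng(2)
T = 20_000
rho1, rho2 = 0.2, 0.6          # law of motion for x_t
a0, a1, a2 = 0.1, 0.7, 0.25    # agency's filtering weights (illustrative)
a = 1.5                        # policy maker's behavioral parameter

x = np.zeros(T); S = np.zeros(T)
v = 0.5 * rng.normal(size=T)   # classical reporting error
w = rng.normal(size=T)         # innovation to x_t
for t in range(1, T):
    x[t] = rho1 * S[t-1] + rho2 * x[t-1] + w[t]
    xhat = a0 * S[t-1] + a1 * (x[t] + v[t]) + a2 * x[t-1]
    S[t] = a * xhat            # real-time rule (6.2)

# Econometrician: OLS of S_t on final data (S_{t-1}, x_t, x_{t-1}).
X = np.column_stack([S[:-1], x[1:], x[:-1]])
beta, *_ = np.linalg.lstsq(X, S[1:], rcond=None)
resid = S[1:] - X @ beta
print(beta)                                      # ~ a*(a0, a1, a2)
print(np.corrcoef(resid, a * a1 * v[1:])[0, 1])  # residual tracks a*a1*v_t
```

None of the individual behavioral or filtering parameters can be read directly off the estimated β's, which is exactly the interpretive pitfall the example is meant to illustrate.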
7. The effects of a monetary policy shock: the narrative approach
In the previous sections, we have discussed formal statistical approaches to identifying exogenous monetary policy shocks and their effects on the economy. The central
73 Clarida et al. (1997, 1998) estimate the parameters of forward looking policy rules, so that in principle they can uncover interpretable parameters like a.
problem there lies with the identification of the exogenous monetary policy shock itself. As we discussed above, there are many reasons why shocks measured in this way may not be exogenous. These include all the reasons that policy rules, like (2.1), might be misspecified. For example, there may be subsample instability in the monetary policy rule, or policymakers' information sets may be misspecified. In addition, the various auxiliary assumptions that must be made in practice, e.g., the specification of lag lengths, are always subject to question. Romer and Romer motivate what they call the narrative approach as a way of identifying monetary policy shocks that avoids these difficulties. 74 This section is organized as follows. First, we discuss the specific identifying assumptions in Romer and Romer's analysis. Second, we contrast results obtained under their assumptions with the benchmark results reported above. 75 Any approach that wishes to assess the effects of a monetary policy action on the economy must grapple with the endogeneity problem. Romer and Romer (1989) do so by identifying episodes (p. 134) "... when the Federal Reserve specifically intended to use the tools it had available to attempt to create a recession to cure inflation." They select such episodes based on records pertaining to policy meetings of the Federal Reserve. They interpret the behavior of output in the wake of these episodes as reflecting the effects of monetary policy actions and not some other factors. To justify this interpretation, they make and attempt to defend two identifying assumptions. First, in these episodes, inflation did not exert a direct effect on output via, say, the anticipated inflation tax effects emphasized in Cooley and Hansen (1989). Second, in these episodes inflation was not driven by shocks which directly affected output, such as supply shocks.
These two assumptions underlie their view that the behavior of output in the aftermath of a Romer and Romer episode reflected the effects of the Fed's actions. The Romer and Romer (1989) episodes are: December 1968; April 1974; August 1978; October 1979. We follow Kashyap et al. (1993) by adding the 1966 credit crunch (1966:2) to the index of monetary contractions. In addition, we add the August 1988 episode identified by Oliner and Rudebusch (1996) as the beginning of a monetary contraction. 76 For ease of exposition, we refer to all of these episodes as Romer and Romer episodes. It is difficult to judge on a priori grounds whether the narrative approach or the strategy discussed in the previous sections is better. The latter approach can lead to misleading results if the wrong identifying assumptions are made in specifying the Fed's policy rule. A seeming advantage of Romer and Romer's approach is that one is not required formally to specify a Fed feedback rule. But there is no free lunch.
74 They attribute the narrative approach to Friedman and Schwartz (1963). 75 See Christiano et al. (1996b), Eichenbaum and Evans (1995) and Leeper (1997) for a similar comparison. 76 In a later paper, Romer and Romer (1994) also add a date around this time.
[Figure 12 here: three-month centered, equal-weighted moving averages of the contractionary benchmark policy shocks from the NBR model and the Fed Funds model, 1967-1988, with Romer dates marked.]
Fig. 12. Contractionary benchmark policy shocks in units of federal funds rate; three-month centered, equal-weighted moving average, with Romer dates.
As we pointed out, they too must make identifying assumptions which are subject to challenge. Shapiro (1994), for example, challenges the usefulness of these dates on the grounds that they do not reflect an exogenous component of monetary policy. In his view, they reflect aspects of monetary policy that are largely forecastable using other macro variables. An additional shortcoming of the Romer and Romer approach, at least as applied to postwar monetary policy, is that it delivers only a few episodes of policy actions, with no indications of their relative intensity. In contrast, the strategy discussed in the previous sections generates many "episodes", one for each date in the sample period, and a quantitative measure of the intensity of the exogenous shock for each date. So in principle, this approach can generate more precise estimates of the effects of a monetary policy shock. It is of interest to compare the Romer and Romer episodes with the benchmark FF and NBR shocks. According to Figure 12, with one exception each Romer and Romer episode is followed, within one or two quarters, by a contractionary FF and NBR policy shock. The exception is October 1979, which is not followed
by a contractionary NBR policy shock. 77 At the same time, we identify several contractionary policy shocks which are not associated with a Romer and Romer episode. We now turn to the issue of how qualitative inference is affected by use of the Romer and Romer index. To determine the impact of a Romer and Romer episode on the set of variables, Zt, we proceed as follows. First, we define the dummy variable, dt, to be one during a Romer and Romer episode, and zero otherwise. Second, we modify the benchmark VAR to include current and lagged values of dt:

Zt = A(L) Zt−1 + β(L) dt + ut.   (7.1)

Here, β(L) is a finite ordered vector polynomial in nonnegative powers of L. We estimate Equation (7.1) using equation-by-equation least squares. For calculations based on quarterly data, the highest powers of L in A(L) and in β(L) are 5 and 6, respectively. For calculations based on monthly data, the corresponding figures are 11 and 12. The response of Zt+k to a Romer and Romer episode is given by the coefficient on L^k in the expansion of [I − A(L)L]^{-1} β(L). To obtain confidence intervals for the dynamic response function of Zt, we apply a version of the bootstrap Monte Carlo procedure used above which accommodates the presence of dt in Equation (7.1). In principle, the right way to proceed is to incorporate into the bootstrap simulations a model of how the Fed and then Romer and Romer process the data in order to assign values to dt. This task is clearly beyond the scope of this analysis. In our calculations, we simply treat dt as fixed in repeated samples. We also report results obtained using the monetary policy index constructed by Boschen and Mills (1991). Based on their reading of the FOMC minutes, Boschen and Mills rate monetary policy on a discrete scale, {−2, −1, 0, 1, 2}, where −2 denotes very tight and +2 denotes very loose. To look at the effects of this policy measure, we include it in our definition of Zt and calculate the dynamic response of the variables in Zt to an innovation in the Boschen and Mills index. Figure 13 reports the monthly data based estimates of the dynamic response of various aggregates to a Romer and Romer shock and an innovation in the Boschen and Mills index. To facilitate comparisons, column 1 reproduces the dynamic response functions associated with our monthly benchmark FF policy shocks.
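The coefficient on L^k in [I − A(L)L]^{-1} β(L) from Equation (7.1) can be computed by a standard recursion, resp_k = β_k + Σ_j A_j resp_{k−j}. The sketch below uses small random matrices in place of estimated VAR coefficients, purely for illustration:

```python
import numpy as np

# Dynamic response of Z_{t+k} to the dummy d_t in Equation (7.1),
# computed recursively from the VAR lag matrices A_1..A_p and the
# dummy-coefficient vectors beta_0..beta_q. Sizes are illustrative.
rng = np.random.default_rng(3)
n, p, q, horizon = 3, 2, 3, 12
A = [0.3 * rng.normal(size=(n, n)) for _ in range(p)]   # A_1..A_p
beta = [rng.normal(size=n) for _ in range(q + 1)]       # beta_0..beta_q

resp = []
for k in range(horizon):
    r = beta[k].copy() if k <= q else np.zeros(n)
    for j in range(1, min(k, p) + 1):
        r = r + A[j - 1] @ resp[k - j]   # feed earlier responses through A(L)
    resp.append(r)

# Sanity check: the impact response is just beta_0.
print(np.allclose(resp[0], beta[0]))
```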
According to our point estimates, the qualitative responses to an FF policy shock and a Romer and Romer episode shock are quite similar: the federal funds rate rises, the price level is not much affected, at least initially, employment falls with a delay, PCOM falls, and all the monetary aggregates (NBR, M1 and M2) fall. It is interesting that the initial impacts of a Romer and Romer episode on employment
77 We cannot estimate benchmark shocks for 1966:2 because of data limitations.
[Figure 13 here. Columns: monthly Fed Funds model, monthly Romer model, and monthly Boschen & Mills model (each with M1, and with M2 in the bottom rows). Panels: MP shock => EM, Price, Pcom, FF, NBR, TR, M1, M2, and Boschen-Mills index.]
Fig. 13. The monthly data based estimates of the dynamic response of various aggregates to a Romer and Romer shock and an innovation in the Boschen and Mills index. To facilitate comparisons, column 1 reproduces the dynamic response functions associated with our monthly benchmark FF policy shocks.
and the price level are quite small. Unlike the identification schemes underlying the benchmark shock measures, this is not imposed by the Romer and Romer procedure. There are some differences between the estimated effects of the two shock measures. These pertain to the magnitude and timing of the responses. Romer and Romer episodes coincide with periods in which there were large rises in the federal funds rate. The maximal impact on the federal funds rate after a Romer and Romer episode is roughly 100 basis points. In contrast, the maximal impact on the federal funds rate induced by an FF policy shock is roughly 60 basis points. Consistent with this difference, the maximal impact of a Romer and Romer shock on employment, PCOM, NBR, TR, M1 and M2 is much larger than that of an FF policy shock. Finally, note that the response functions to a Romer and Romer shock are estimated less precisely than the response functions to an FF policy shock. Indeed, there is little evidence against the hypothesis that output is unaffected by a Romer and Romer shock. 78 While similar in some respects, the estimated response functions to an innovation in the Boschen and Mills index do differ in some important ways from both the FF and Romer and Romer shocks. First, the impact of a Boschen and Mills shock is delayed compared to the impact of the alternative shock measures. For example, the maximal increase in the federal funds rate occurs 14 months after a Boschen and Mills shock. In contrast, the maximal increase of the federal funds rate occurs 1 and 3 periods after an FF and a Romer and Romer shock, respectively. Another anomaly associated with the Boschen and Mills response functions is the presence of a price puzzle: both PCOM and the price level rise for a substantial period of time after a contraction.
Figure 14 reports the quarterly data based estimates of the dynamic response of various aggregates to a Romer and Romer shock and an innovation in the Boschen and Mills index. The key finding here is that the qualitative properties of the estimated impulse response functions associated with the three policy shock measures are quite similar. Unlike the monthly results, where employment initially rises in response to a Romer and Romer episode, there is no initial rise in aggregate output. The only other major difference is that, as with the monthly data, the maximal impact of a Boschen and Mills shock measure on the federal funds rate is substantially delayed relative to the other two shock measures. Integrating over the monthly and quarterly results, we conclude that qualitative inference about the effects of a monetary policy shock is quite robust to the different shock measures discussed in this section.
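The machinery behind figures of this kind can be illustrated with a small recursive VAR. The sketch below is purely illustrative: it fits a VAR(1) by OLS to simulated data for two stand-in variables and computes orthogonalized impulse responses via a Cholesky factorization, the recursiveness assumption underlying the benchmark shock measures. The coefficients and data are hypothetical, not the chapter's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a toy two-variable system (stand-ins for output and the funds
# rate). Purely illustrative: the chapter's VARs include many more variables
# (Y, P, PCOM, FF, NBR, TR, M1 or M2) and are fit to actual US data.
T, k = 400, 2
A_true = np.array([[0.7, -0.1],
                   [0.2,  0.5]])            # hypothetical VAR(1) coefficients
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.normal(size=k)

# Fit the VAR(1) by OLS: regress y_t on y_{t-1}.
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T

# Recursiveness assumption: orthogonalize the residuals with a Cholesky
# factorization, so shocks are ordered as in the benchmark identification.
U = Y - X @ A_hat.T
P = np.linalg.cholesky(np.cov(U.T))

# Impulse response at horizon h to a one-standard-deviation orthogonalized
# shock: Theta_h = A_hat^h @ P.
H = 12
irf = np.empty((H, k, k))
Ah = np.eye(k)
for h in range(H):
    irf[h] = Ah @ P
    Ah = Ah @ A_hat
```

Column j of `irf[h]` is the horizon-h response of both variables to the jth orthogonalized shock; plotting these columns against h produces panels like those in Figures 13 and 14.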
78 Romer and Romer report statistically significant effects on output. This difference could arise for two reasons. First, we include more variables in our analysis than do Romer and Romer. Second, we compute standard errors using a different method than they do.
L.J. Christiano et al.
[Figure 14 panels: columns show the Fed Funds model, the Romer model, and the Boschen and Mills model (estimated with M1 and, in the final rows, with M2); rows show the responses of Y, Price, Pcom, FF, NBR, TR, M1, M2 and the Boschen and Mills index to an MP shock. Graphics not reproduced.]
Fig. 14. The quarterly data based estimates of the dynamic response of various aggregates to a Romer and Romer shock and an innovation in the Boschen and Mills index.
8. Conclusion
In this chapter we have reviewed the recent literature that grapples with the question: What happens after a shock to monetary policy? This question is of interest because it lies at the center of the particular approach to model evaluation that we discussed: the Lucas program applied to monetary economics. The basic step in that program involves subjecting monetary models to a particular experiment: a monetary policy shock. Since alternative models react very differently to such a shock, this experiment can, in principle, be the basis of a useful diagnostic test. But to be useful in practice, we need to know how the actual economy responds to the analog experiment. Isolating these data based experiments requires identifying assumptions. We argued that qualitative inference about the effects of a monetary policy shock is robust across many, but not all, of the sets of identifying assumptions that have been pursued in the literature. A key question remains: How can the results of the literature we reviewed be used to quantitatively assess the performance of a particular model? Much of the empirical literature on monetary policy shocks proceeds under the assumption that monetary policy is highly reactive to the state of the economy. In sharp contrast, analyses of quantitative general equilibrium models often proceed under much simpler assumptions about the nature of the monetary authority's reaction function. This leads to an obvious problem: unless the monetary policy rule has been specified correctly, the nature of the monetary experiment being conducted in the model is not the same as the experiment in the data. One way to deal with the problem is to solve theoretical models using estimated reaction functions taken from the policy shock literature. There are two potential problems associated with this approach. First, and most importantly, it is often the case that models have multiple equilibria when policy is specified as a relationship between endogenous variables.
Second, the complexity of estimated reaction functions makes it difficult (at least for us) to gain intuition for the way a monetary policy shock impacts on a model economy. Christiano et al. (1997b) suggest an alternative approach to ensuring the consistency between model and data based experiments. The basic idea is to calculate the dynamic effects of a policy shock in a model economy under the following representation of monetary policy: the growth rate of money depends only on current and past shocks to monetary policy. Formally, such a specification represents the growth rate of money as a univariate, exogenous stochastic process. However, this representation cannot be developed by examining the univariate time series properties of the growth rate of money, say by regressing the growth rate of money on its own lagged values. Instead, the representation must be based on the estimated impulse response function of the growth rate of money to a monetary policy shock. The rationale underlying the proposal by Christiano et al. (1997b) is as follows. To actually implement a particular monetary policy rule, the growth rate of money must (if only implicitly) respond to current and past exogenous shocks in an appropriate way. This is true even when the systematic component of policy is thought of as a
relationship between endogenous variables, like the interest rate, output and inflation. The literature on monetary policy shocks provides an estimate of the way the growth rate of money actually does respond to a particular shock - a monetary policy shock. For concreteness, we refer to the estimated impulse response function of the growth rate of money to a policy shock as "the exogenous monetary policy rule". 79 Suppose that an analyst solves a monetary model under the assumption that policy is given by the exogenous policy rule. In addition, suppose that the model has been specified correctly. In this case, the dynamic responses of the model variables to a policy shock should be the same as the dynamic response functions of the corresponding variables to a policy shock in the VAR underlying the estimate of the exogenous policy rule [see Christiano et al. (1997b)]. This is true even if the monetary policy shock was identified in the VAR assuming a policy rule that was highly reactive to the state of the economy. So, the empirical plausibility of a model can be assessed by comparing the results of an exogenous policy shock in the model to the results of a policy shock in a VAR. It is often the case that a model economy will have multiple equilibria when policy is represented as a relationship between endogenous variables. Each may be supported by a different rule for the way the growth rate of money responds to fundamental economic shocks. Yet, for any given rule relating the growth rate of money to these shocks, it is often (but not always) the case that there is a unique equilibrium [see Christiano et al. (1997b) for examples]. Under these circumstances the proposal by Christiano et al. (1997b) for evaluating models is particularly useful. The monetary policy shock literature tells us which exogenous policy rule the Fed did adopt and how the economy did respond to a policy shock.
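The exogenous-rule representation can be sketched numerically. The coefficients below are hypothetical, chosen only to match the qualitative shape reported in footnote 79 (a small contemporaneous effect and larger lagged effects for M1 growth, geometric decay for M2 growth); they are not estimates from the chapter.

```python
import numpy as np

# Hypothetical MA coefficients with the qualitative shape footnote 79
# describes for M1 growth: small contemporaneous effect, larger lagged ones.
theta = np.array([0.1, 0.5, 0.4, 0.2])     # illustrative, not estimates

def money_growth(eps, theta):
    """Exogenous money rule: mu_t = sum_j theta[j] * eps[t - j]."""
    return np.convolve(eps, theta)[:len(eps)]

# The impulse response of money growth to a unit policy shock is theta itself:
# feeding in a one-time unit shock traces out the MA coefficients.
eps = np.zeros(10)
eps[0] = 1.0
irf_m1 = money_growth(eps, theta)

# The M2 case in footnote 79: an AR(1) rule, whose response decays
# geometrically at rate rho (rho is hypothetical).
rho = 0.7
irf_m2 = rho ** np.arange(10)
```

Solving a model under `money_growth` as the policy specification and comparing its impulse responses to the VAR-based ones is the model-evaluation exercise described in the text.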
These responses can be compared to the unique prediction of the model for what happens after a shock to monetary policy. However, it is unclear how to proceed under a parameterization of monetary policy in which there are multiple equilibria. We conclude by noting that we have stressed one motivation for isolating the effects of a monetary policy shock: the desire to isolate experiments in the data whose outcomes can be compared with the results of analog experiments in models. Authors like Sims and Zha (1998) and Bernanke et al. (1997) have pursued a different motivation. These authors argue that if the analyst has made enough assumptions to isolate another fundamental shock to the economy, then it is possible to understand the consequences of a change in the systematic way that monetary policy responds to that shock, even in the absence of a structural model. Their arguments depend in a critical way on ignoring the Lucas critique. This may or may not be reasonable in their particular applications. We are open minded but skeptical. For now we rest our
79 Christiano et al. (1997b) argue that a good representation for the exogenous monetary policy rule relating the growth rate of M1 to current and past policy shocks is a low-order MA process with a particular feature: the contemporaneous effect of a monetary policy shock is small while the lagged effects are much larger. In contrast, the dynamic response function of the growth rate of M2 to current and past policy shocks is well approximated by an AR(1) process.
case for the usefulness of the monetary policy shock literature on the motivation we have pursued: the desire to build structural economic models that can be used to think about systematic changes in policy institutions and rules.
References

Balke, N.S., and K.M. Emery (1994), "The federal funds rate as an indicator of monetary policy: evidence from the 1980's", Economic Review (Federal Reserve Bank of Dallas) First Quarter, 1-16.
Ball, L. (1995), "Time-consistent policy and persistent changes in inflation", Journal of Monetary Economics 36(2):329-350.
Barro, R.J. (1977), "Unanticipated money growth and unemployment in the United States", American Economic Review 67(2):101-115.
Bartle, R.G. (1976), The Elements of Real Analysis, 2nd edition (Wiley, New York).
Beaudry, P., and M.B. Devereux (1995), "Money and the real exchange rate with sticky prices and increasing returns", Carnegie-Rochester Conference Series on Public Policy 43:55-101.
Bernanke, B.S. (1986), "Alternative explanations of the money-income correlation", Carnegie-Rochester Conference Series on Public Policy 25:49-99.
Bernanke, B.S., and A.S. Blinder (1992), "The federal funds rate and the channels of monetary transmission", American Economic Review 82(4):901-921.
Bernanke, B.S., and M. Gertler (1995), "Inside the black box: the credit channel of monetary policy transmission", Journal of Economic Perspectives 9(4):27-48.
Bernanke, B.S., and I. Mihov (1995), "Measuring monetary policy", Working Paper No. 5145 (NBER).
Bernanke, B.S., M. Gertler and M.W. Watson (1997), "Systematic monetary policy and the effects of oil price shocks", Brookings Papers on Economic Activity 1997(1):91-142.
Boschen, J.F., and L.O. Mills (1991), "The effects of countercyclical monetary policy on money and interest rates: an evaluation of evidence from FOMC documents", Working Paper 91-20 (Federal Reserve Bank of Philadelphia).
Brunner, A.D. (1994), "The federal funds rate and the implementation of monetary policy: estimating the federal reserve's reaction function", International Finance Discussion Paper No. 466 (Board of Governors of the Federal Reserve System).
Campbell, J. (1997), "Cyclical job creation, job destruction and monetary policy", manuscript (University of Rochester).
Carlson, J.B., J.M. McIntire and J.B. Thomson (1995), "Federal funds futures as an indicator of future monetary policy: a primer", Federal Reserve Bank of Cleveland Economic Review 31(1):20-30.
Chari, V.V., P.J. Kehoe and E.R. McGrattan (1996), "Sticky price models of the business cycle: the persistence problem", Staff Report 217 (Federal Reserve Bank of Minneapolis).
Chari, V.V., L.J. Christiano and M. Eichenbaum (1998), "Expectation traps and discretion", Journal of Economic Theory 81(2):462-492.
Christiano, L.J. (1991), "Modeling the liquidity effect of a money shock", Federal Reserve Bank of Minneapolis Quarterly Review 15(1):3-34.
Christiano, L.J. (1992), "Searching for a break in GNP", Journal of Business and Economic Statistics 10(3):237-250.
Christiano, L.J. (1995), "Resolving the liquidity effect: commentary", Federal Reserve Bank of St. Louis Review 77(3):55-61.
Christiano, L.J. (1996), "Identification and the liquidity effect: a case study", Federal Reserve Bank of Chicago Economic Perspectives 20(3):2-13.
Christiano, L.J., and M. Eichenbaum (1992), "Identification and the liquidity effect of a monetary policy shock", in: A. Cukierman, Z. Hercowitz and L. Leiderman, eds., Political Economy, Growth and Business Cycles (MIT Press, Cambridge and London) 335-370.
Christiano, L.J., and M. Eichenbaum (1995), "Liquidity effects, monetary policy and the business cycle", Journal of Money, Credit and Banking 27(4):1113-1136.
Christiano, L.J., M. Eichenbaum and C.L. Evans (1996a), "The effects of monetary policy shocks: evidence from the flow of funds", Review of Economics and Statistics 78(1):16-34.
Christiano, L.J., M. Eichenbaum and C.L. Evans (1996b), "Identification and the effects of monetary policy shocks", in: M. Blejer, Z. Eckstein, Z. Hercowitz and L. Leiderman, eds., Financial Factors in Economic Stabilization and Growth (Cambridge University Press, Cambridge) 36-74.
Christiano, L.J., M. Eichenbaum and C.L. Evans (1997a), "Sticky price and limited participation models: a comparison", European Economic Review 41(6):1201-1249.
Christiano, L.J., M. Eichenbaum and C.L. Evans (1997b), "Modeling money", Working Paper 97-17 (Federal Reserve Bank of Chicago).
Clarida, R., and J. Gali (1994), "Sources of real exchange rate fluctuations: how important are nominal shocks?", Carnegie-Rochester Conference Series on Public Policy 41:1-56.
Clarida, R., and M. Gertler (1997), "How the Bundesbank conducts monetary policy", in: C.D. Romer and D.H. Romer, eds., Reducing Inflation: Motivation and Strategy (University of Chicago Press) 363-406.
Clarida, R., J. Gali and M. Gertler (1997), "Monetary policy rules and macroeconomic stability: evidence and some theory", manuscript (New York University).
Clarida, R., J. Gali and M. Gertler (1998), "Monetary policy rules in practice: some international evidence", European Economic Review 42(6):1033-1067.
Cochrane, J.H. (1994), "Shocks", Carnegie-Rochester Conference Series on Public Policy 41:295-364.
Coleman II, W.J., C. Gilles and P.A. Labadie (1996), "A model of the federal funds market", Economic Theory 7(2):337-357.
Cooley, T.F., and G.D. Hansen (1989), "The inflation tax in a real business cycle model", American Economic Review 79(4):733-748.
Cooley, T.F., and G.D. Hansen (1997), "Unanticipated money growth and the business cycle reconsidered", Journal of Money, Credit and Banking 29(4, Part 2):624-648.
Cushman, D.O., and T. Zha (1997), "Identifying monetary policy in a small open economy under flexible exchange rates", Journal of Monetary Economics 39(3):433-448.
Eichenbaum, M. (1992), "Comment on interpreting the macroeconomic time series facts: the effects of monetary policy", European Economic Review 36(5):1001-1011.
Eichenbaum, M., and C.L. Evans (1995), "Some empirical evidence on the effects of shocks to monetary policy on exchange rates", Quarterly Journal of Economics 110(4):975-1010.
Evans, C.L., and K. Kuttner (1998), "Can VARs describe monetary policy?", Research Paper 9812 (Federal Reserve Bank of New York).
Faust, J., and E.M. Leeper (1997), "When do long-run identifying restrictions give reliable results?", Journal of Business and Economic Statistics 15(3):345-353.
Fisher, J. (1997), "Monetary policy and investment", manuscript (Federal Reserve Bank of Chicago).
Friedman, M., and A.J. Schwartz (1963), A Monetary History of the United States: 1867-1960 (Princeton University Press, Princeton, NJ).
Fuerst, T. (1992), "Liquidity, loanable funds, and real activity", Journal of Monetary Economics 29(1):3-24.
Gali, J. (1992), "How well does the IS-LM model fit post war data?", Quarterly Journal of Economics 107(2):709-738.
Gali, J. (1997), "Technology, employment, and the business cycle: do technology shocks explain aggregate fluctuations?", Working Paper No. 5721 (NBER).
Gertler, M., and S. Gilchrist (1993), "The role of credit market imperfections in the monetary transmission mechanism: arguments and evidence", Scandinavian Journal of Economics 95(1):43-64.
Gertler, M., and S. Gilchrist (1994), "Monetary policy, business cycles and the behavior of small manufacturing firms", Quarterly Journal of Economics 109(2):309-340.
Geweke, J.F., and D.E. Runkle (1995), "A fine time for monetary policy?", Federal Reserve Bank of Minneapolis Quarterly Review 19(1):18-31.
Goodfriend, M. (1983), "Discount window borrowing, monetary policy, and the post-October 6, 1979 Federal Reserve operating procedure", Journal of Monetary Economics 12(3):343-356.
Goodfriend, M. (1991), "Interest rates and the conduct of monetary policy", Carnegie-Rochester Conference Series on Public Policy 34:7-30.
Gordon, D.B., and E.M. Leeper (1994), "The dynamic impacts of monetary policy: an exercise in tentative identification", Journal of Political Economy 102(6):1228-1247.
Grilli, V., and N. Roubini (1995), "Liquidity and exchange rates: puzzling evidence from the G-7 countries", Working Paper No. S/95/31 (New York University Salomon Brothers).
Hamilton, J.D. (1994), Time Series Analysis (Princeton University Press, Princeton, NJ).
Hamilton, J.D. (1997), "Measuring the liquidity effect", American Economic Review 87(1):80-97.
Ireland, P.N. (1997), "A small, structural, quarterly model for monetary policy evaluation", Carnegie-Rochester Conference Series on Public Policy 47:83-108.
Kashyap, A.K., J.C. Stein and D.W. Wilcox (1993), "Monetary policy and credit conditions: evidence from the composition of external finance", American Economic Review 83(1):78-98.
Kilian, L. (1998), "Small-sample confidence intervals for impulse response functions", Review of Economics and Statistics 80(2):218-230.
Kim, J. (1998), "Monetary policy in a stochastic equilibrium model with real and nominal rigidities", Finance and Economics Discussion Series 98-02 (Board of Governors of the Federal Reserve System).
Kim, S., and N. Roubini (1995), "Liquidity and exchange rates: a structural VAR approach", manuscript (New York University).
King, R.G. (1991), "Money and business cycles", Proceedings (Federal Reserve Bank of San Francisco).
King, S. (1983), "Real interest rates and the interaction of money, output and prices", manuscript (Northwestern University).
Krueger, J.T., and K.N. Kuttner (1996), "The fed funds futures rate as a predictor of Federal Reserve policy", Journal of Futures Markets 16(8):865-879.
Leeper, E.M. (1997), "Narrative and VAR approaches to monetary policy: common identification problems", Journal of Monetary Economics 40(3):641-657.
Leeper, E.M., and D.B. Gordon (1992), "In search of the liquidity effect", Journal of Monetary Economics 29(3):341-369.
Leeper, E.M., C.A. Sims and T. Zha (1996), "What does monetary policy do?", Brookings Papers on Economic Activity 1996(2):1-63.
Lucas Jr, R.E. (1980), "Methods and problems in business cycle theory", Journal of Money, Credit and Banking 12(4):696-715.
Lucas Jr, R.E. (1988), "Money demand in the United States: a quantitative review", Carnegie-Rochester Conference Series on Public Policy 29:137-167.
Lucas Jr, R.E. (1994), "On the welfare cost of inflation", Working Papers in Applied Economic Theory 94-07 (Federal Reserve Bank of San Francisco).
Mankiw, N.G., and M.D. Shapiro (1986), "News or noise: an analysis of GNP revisions", Survey of Current Business 66(5):20-25.
Mankiw, N.G., D.E. Runkle and M.D. Shapiro (1984), "Are preliminary announcements of the money stock rational forecasts?", Journal of Monetary Economics 14(1):15-27.
McCallum, B.T. (1983), "A reconsideration of Sims' evidence regarding monetarism", Economics Letters 13(2-3):167-171.
Mishkin, F.S. (1983), A Rational Expectations Approach to Macroeconometrics: Testing Policy Ineffectiveness and Efficient-Markets Models (University of Chicago Press, Chicago, IL).
Oliner, S.D., and G.D. Rudebusch (1996), "Is there a broad credit channel for monetary policy?", Federal Reserve Bank of San Francisco Economic Review 1:3-13.
Pagan, A.R., and J.C. Robertson (1995), "Resolving the liquidity effect", Federal Reserve Bank of St. Louis Review 77(3):33-54.
Parekh, G. (1997), "Small sample bias, unit roots, and conditional heteroskedasticity in macroeconomic autoregression", PhD dissertation (Northwestern University).
Ramey, V.A., and M.D. Shapiro (1998), "Costly capital reallocation and the effects of government spending", Carnegie-Rochester Conference Series on Public Policy 48:145-194.
Reichenstein, W. (1987), "The impact of money on short-term interest rates", Economic Inquiry 25(1):67-82.
Romer, C.D., and D.H. Romer (1989), "Does monetary policy matter? A new test in the spirit of Friedman and Schwartz", NBER Macroeconomics Annual 1989 (MIT Press, Cambridge) 121-170.
Romer, C.D., and D.H. Romer (1994), "Monetary policy matters", Journal of Monetary Economics 34:75-88.
Rotemberg, J.J., and M. Woodford (1992), "Oligopolistic pricing and the effects of aggregate demand on economic activity", Journal of Political Economy 100(6):1153-1207.
Rotemberg, J.J., and M. Woodford (1997), "An optimization-based econometric framework for the evaluation of monetary policy", NBER Macroeconomics Annual, 297-345.
Rudebusch, G.D. (1995), "Federal Reserve interest rate targeting, rational expectations and the term structure", Journal of Monetary Economics 35(2):245-274.
Rudebusch, G.D. (1996), "Do measures of monetary policy in a VAR make sense?", Working Papers in Applied Economic Theory 96-05 (Federal Reserve Bank of San Francisco).
Sargent, T.J. (1984), "Autoregressions, expectations and advice", American Economic Review 74(2):408-415.
Sargent, T.J. (1987), Macroeconomic Theory, 2nd edition (Academic Press, Boston, MA).
Sargent, T.J. (1989), "Two models of measurements and the investment accelerator", Journal of Political Economy 97(2):251-287.
Shapiro, M.D. (1994), "Federal Reserve policy: cause and effect", in: N.G. Mankiw, ed., Monetary Policy (University of Chicago Press, Chicago, IL).
Sims, C.A. (1980), "Macroeconomics and reality", Econometrica 48(1):1-48.
Sims, C.A. (1986), "Are forecasting models usable for policy analysis?", Federal Reserve Bank of Minneapolis Quarterly Review 10(1):2-16.
Sims, C.A. (1992), "Interpreting the macroeconomic time series facts: the effects of monetary policy", European Economic Review 36(5):975-1000.
Sims, C.A. (1996), "Comments on 'Do measures of monetary policy in a VAR make sense?' by Glenn Rudebusch", manuscript (Yale University).
Sims, C.A., and T. Zha (1995), "Error bands for impulse responses", Working Paper 95-6 (Federal Reserve Bank of Atlanta).
Sims, C.A., and T. Zha (1998), "Does monetary policy generate recessions?", Working Paper 98-12 (Federal Reserve Bank of Atlanta).
Strang, G. (1976), Linear Algebra and its Applications (Academic Press, New York).
Strongin, S. (1995), "The identification of monetary policy disturbances: explaining the liquidity puzzle", Journal of Monetary Economics 34(3):463-497.
Uhlig, H. (1997), "What are the effects of monetary policy? Results from an agnostic identification procedure", manuscript (Tilburg University).
Wong, K.-F. (1996), "Variability in the effects of monetary policy on economic activity", unpublished manuscript (University of Wisconsin) October.
Chapter 3
MONETARY POLICY REGIMES AND ECONOMIC PERFORMANCE: THE HISTORICAL RECORD
MICHAEL D. BORDO
Rutgers University, New Brunswick, and NBER
ANNA J. SCHWARTZ
National Bureau of Economic Research, New York

Contents
Abstract
Keywords
1. Policy regimes, 1880-1995
   1.1. Definition of a policy regime
   1.2. Types of regimes
   1.3. Rules vs. discretion in monetary regimes
2. International monetary regimes
   2.1. The gold standard
        2.1.1. Gold as a monetary standard
        2.1.2. Gold and the international monetary system
        2.1.3. Central banks and the rules of the game
        2.1.4. Theory of commodity money
        2.1.5. The gold standard as a rule
        2.1.6. The viability of the gold standard
   2.2. Interwar vicissitudes of the gold standard
   2.3. Bretton Woods
   2.4. The recent managed float and the European Monetary System
3. Episodes in US central banking history
   3.1. Origins of US central banking
   3.2. Federal Reserve 1914
   3.3. Interwar years, 1919-1941
        3.3.1. 1919-1929
        3.3.2. The Great Depression of 1929-1933
               3.3.2.1. Policy continuity?
               3.3.2.2. Banking panics
               3.3.2.3. Transmission of the monetary collapse to the real economy
               3.3.2.4. The October 1929 stock market crash

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
M.D. Bordo and A.J. Schwartz
               3.3.2.5. Would stable money have attenuated the depression?
               3.3.2.6. Gold standard policies in transmitting the Great Depression
        3.3.3. 1933-1941
   3.4. Bretton Woods, 1946-1971
        3.4.1. 1946-1951
        3.4.2. Federal Reserve discretionary regime, 1951-1965
        3.4.3. Breakdown of Bretton Woods, 1965-1971
   3.5. Post-Bretton Woods, 1971-1995
        3.5.1. 1971-1980
        3.5.2. Shifting the focus of monetary policy, 1980-1995
   3.6. Conclusion
        3.6.1. Breakdown of the gold standard, 1914-1971
        3.6.2. The Great Depression, 1929-1933
        3.6.3. The Great Inflation, 1965-1980
4. Monetary regimes and economic performance: the evidence
   4.1. Overview
   4.2. Theoretical issues
   4.3. Measures of macroeconomic performance, by regime
   4.4. Inflation and output levels and variability
        4.4.1. Inflation
        4.4.2. Real per capita income growth
   4.5. Stochastic properties of macrovariables
   4.6. Inflation persistence, price level predictability, and their effects on financial markets
        4.6.1. Inflation persistence
        4.6.2. Price level uncertainty
        4.6.3. Effects on financial markets
   4.7. Temporary and permanent shocks
5. Overall assessment of monetary policy regimes
Acknowledgments
Appendix A. Data sources
   A.1. United States of America
   A.2. United Kingdom
   A.3. Germany
   A.4. France
   A.5. Japan
References
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
Abstract

Monetary policy regimes encompass the constraints or limits imposed by custom, institutions and nature on the ability of the monetary authorities to influence the evolution of macroeconomic aggregates. This chapter surveys the historical experience of both international and domestic (national) aspects of monetary regimes from the nineteenth century to the present. We first survey the experience of four broad international monetary regimes: the classical gold standard 1880-1914; the interwar period in which a short-lived restoration of the gold standard prevailed; the postwar Bretton Woods international monetary system (1946-1971) indirectly linked to gold; the recent managed float period (1971-1995). We then present in some detail the institutional arrangements and policy actions of the Federal Reserve in the United States as an important example of a domestic policy regime. The survey of the Federal Reserve subdivides the demarcated broad international policy regimes into a number of episodes. A salient theme in our survey is that the convertibility rule or principle that dominated both domestic and international aspects of the monetary regime before World War I has since declined in its relevance. At the same time, policymakers within major nations placed more emphasis on stabilizing the real economy. Policy techniques and doctrine that developed under the pre-World War I convertible regime proved to be inadequate to deal with domestic stabilization goals in the interwar period, setting the stage for the Great Depression. In the post-World War II era, the complete abandonment of the convertibility principle, and its replacement by the goal of full employment, combined with the legacy of inadequate policy tools and theory from the interwar period, set the stage for the Great Inflation of the 1970s.
The lessons from that experience have convinced monetary authorities to reemphasize the goal of low inflation, as it were, committing themselves to rule-like behavior.
Keywords

gold standard, Bretton Woods, managed float, Federal Reserve, domestic policy regime, convertibility rule, stabilization goals, Great Depression, Great Inflation of the 1970s, rules, nominal anchor, exchange rate arrangements, inflation level, inflation variability, output level, output variability, trend stationary process, difference stationary process, inflation persistence, price level uncertainty, permanent shocks, temporary shocks

JEL classification: E42, E52
1. Policy regimes, 1880-1995

1.1. Definition of a policy regime

Monetary policy regimes encompass the constraints or limits imposed by custom, institutions and nature on the ability of the monetary authorities to influence the evolution of macroeconomic aggregates. We define a monetary regime as a set of monetary arrangements and institutions accompanied by a set of expectations - expectations by the public with respect to policymakers' actions and expectations by policymakers about the public's reaction to their actions. By incorporating expectations, a monetary regime differs from the older concept of a monetary standard, which referred simply to the institutions and arrangements governing the money supply 1.
1.2. Types of regimes

Two types of regimes have prevailed in history: one based on convertibility into a commodity, generally specie, and the other based on fiat. The former prevailed in the USA in various guises until Richard Nixon closed the gold window in August 1971, thereby terminating the gold convertibility feature of the Bretton Woods international monetary system. The latter is the norm worldwide today. The two types of regimes relate closely to the concept of a nominal anchor to the monetary system. A nominal anchor is a nominal variable that serves as a target for monetary policy. Under specie convertible regimes, the currency price of specie (gold and/or silver coin) is the nominal anchor. Convertibility at that price ensures that price levels will return to some mean value over long periods of time 2. Regimes have both a domestic (national) and an international aspect. The domestic aspect pertains to the institutional arrangements and policy actions of monetary authorities. The international aspect relates to the monetary arrangements between nations. Two basic types of international monetary arrangements prevail - fixed and flexible exchange rates, along with a number of intermediate variants including adjustable pegs and managed floating.
1 See Leijonhufvud (1984) and Bordo and Jonung (1996). Eichengreen (1991a, p. 1) defines "a regime as an equilibrium in which a set of rules or procedures governing the formulation of public policy generates stable expectations among market participants". He views a monetary regime "as a set of rules or procedures affecting money's ability to provide one or more of [the] three functions [of money]".
2 A moving nominal anchor is used by central banks today. The monetary authorities pursue an inflation target based on the desired growth rate of a nominal variable, treating the inherited past as bygones. In this regime, although the inflation rate is anchored, the price level rises indefinitely [Flood and Mussa (1994)].
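The point in footnote 2, that a moving nominal anchor pins down the inflation rate while letting the price level drift indefinitely, can be illustrated with a toy simulation; all parameters below are hypothetical, chosen only to contrast a mean-reverting (convertible-regime) price level with a drifting (fiat, base-drift) one.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two stylized log-price-level processes (illustrative parameters):
# - "convertible" regime: the price level is stationary around a constant,
#   so shocks die out and long-horizon forecasts stay anchored;
# - "fiat" regime with a moving anchor: inflation is anchored at pi_bar,
#   but shocks to the price level are never undone (base drift).
T, reps = 200, 2000
phi, pi_bar = 0.8, 0.02

p_conv = np.zeros((reps, T))
p_fiat = np.zeros((reps, T))
e1 = rng.normal(0, 0.02, (reps, T))
e2 = rng.normal(0, 0.02, (reps, T))
for t in range(1, T):
    p_conv[:, t] = phi * p_conv[:, t - 1] + e1[:, t]       # mean reverting
    p_fiat[:, t] = p_fiat[:, t - 1] + pi_bar + e2[:, t]    # drifting

# Price-level uncertainty at a long horizon: bounded in the convertible
# regime, growing without bound under the moving nominal anchor.
var_conv = p_conv[:, -1].var()
var_fiat = (p_fiat[:, -1] - pi_bar * (T - 1)).var()
```

The growing forecast variance under the drifting process is one way to state the chapter's later contrast between trend stationary and difference stationary price-level behavior across regimes.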
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
1.3. Rules vs. discretion in monetary regimes
Alternative monetary regimes can be classified as following rules or discretion. The convertible metallic regimes that prevailed into the twentieth century were based on a rule - adherence to the fixed price of specie. The rule operated in both the domestic and the international aspects of the regime. In the international aspect, maintenance of the fixed price of specie at its par value by its adherents ensured fixed exchange rates. The fixed price of domestic currency in terms of specie provided a nominal anchor to the international monetary system. Fiat or inconvertible regimes can also be based on rules if the authorities devise and credibly commit to them. At the domestic level, rules setting the growth rates of monetary aggregates or targeting the price level are examples. At the international level, fixed exchange rate regimes such as the European Monetary System (EMS) are based on a set of well-understood intervention principles and the leadership of a country dedicated to maintaining the nominal anchor.

This chapter surveys the historical experience of both international and domestic (national) aspects of monetary regimes from the nineteenth century to the present. We first survey the experience of four broad international monetary regimes: the classical gold standard 1880-1914; the interwar period in which a short-lived restoration of the gold standard prevailed; the postwar Bretton Woods international monetary system (1946-1971) indirectly linked to gold; and the recent managed float period (1971-1995). We then present in some detail the institutional arrangements and policy actions of the Federal Reserve in the United States as an important example of a domestic policy regime. The survey of the Federal Reserve subdivides the demarcated broad international policy regimes into a number of episodes.
A salient theme in our survey is that the convertibility rule or principle that dominated both domestic and international aspects of the monetary regime before World War I has since declined in its relevance. At the same time, policymakers within major nations placed more emphasis on stabilizing the real economy. Policy techniques and doctrine that developed under the pre-World War I convertible regime proved to be inadequate to deal with domestic stabilization goals in the interwar period, setting the stage for the Great Depression. In the post-World War II era, the complete abandonment of the convertibility principle, and its replacement by the goal of full employment, combined with the legacy of inadequate policy tools and theory from the interwar period, set the stage for the Great Inflation of the 1970s. The lessons from that experience have convinced monetary authorities to reemphasize the goal of low inflation, as it were, committing themselves to rule-like behavior.
2. International monetary regimes

2.1. The gold standard
The classical gold standard which ended in 1914 served as the basis of the convertibility principle that prevailed until the third quarter of the twentieth century.
M.D. Bordo and A.J. Schwartz
We discuss five themes that dominate an extensive literature. The themes are: gold as a monetary standard; gold and the international monetary system; central banks and the "rules of the game"; the commodity theory of money; the gold standard as a rule.

2.1.1. Gold as a monetary standard
Under a gold standard the monetary authority defines the weight of gold coins, or alternatively fixes the price of gold in terms of national currency. The fixed price is maintained by the authority's willingness freely to buy and sell gold at the mint price. There are no restrictions on the ownership or use of gold.

The gold standard evolved from earlier commodity money systems. Earlier commodity money systems were bimetallic - gold was used for high-valued transactions, silver or copper coins for low-valued ones. The bimetallic ratio (the ratio of the mint price of gold relative to the mint price of silver) was set close to the market ratio to ensure that both metals circulated. Otherwise, Gresham's Law ensured that the overvalued metal would drive the undervalued metal out of circulation.

The world switched from bimetallism to gold monometallism in the 1870s. Debate continues to swirl over the motivation for the shift. Some argue that it was primarily political [Friedman (1990a), Gallarotti (1995), Eichengreen (1996)] - nations wished to emulate the example of England, the world's leading commercial and industrial power. When Germany used the Franco-Prussian War indemnity to finance the creation of a gold standard, other prominent European nations also did so 3. Others argue that massive silver discoveries in the 1860s and 1870s as well as technical advances in coinage were the key determinants [Redish (1990)]. Regardless of the cause, recent research suggests that the shift was both unnecessary and undesirable: France, the principal bimetallic nation, had reserves of both metals large enough to continue to maintain the standard [Oppers (1996), Flandreau (1996)]; and remaining on a bimetallic standard, through the production and substitution effects earlier analyzed by Irving Fisher (1922), would have provided greater price stability than did gold monometallism [Friedman (1990b)].
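The logic of Gresham's Law can be made concrete with a small sketch. The function and its tolerance parameter are our own illustrative constructions, not anything from the chapter; only the mint and market ratios quoted in the comments are historical.

```python
# Illustrative sketch of Gresham's Law under bimetallism.
# mint_ratio:   ounces of silver legally equal to one ounce of gold
# market_ratio: ounces of silver that trade for one ounce of gold
# (the function name and the tolerance band are hypothetical conveniences)

def circulating_metal(mint_ratio: float, market_ratio: float,
                      tol: float = 0.0) -> str:
    """Return which metal remains in circulation."""
    if abs(mint_ratio - market_ratio) <= tol:
        return "both"  # mint ratio close to market ratio: both circulate
    if mint_ratio > market_ratio:
        # Gold is overvalued at the mint, so silver is worth more as
        # bullion than as coin: silver is melted or exported.
        return "gold"
    return "silver"    # symmetric case: gold is driven out

# The US mint ratio of 16:1 after 1834 exceeded the market ratio of
# roughly 15.5:1, so silver left circulation and gold coin prevailed.
print(circulating_metal(16.0, 15.5))  # -> gold
```

The same comparison, run with the mint ratio below the market ratio, drives gold out instead, which is why bimetallic authorities tried to keep the legal ratio close to the market ratio.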
The simplest variant of the gold standard was a pure gold coin standard. Such a system entails high resource costs and, consequently, in most countries substitutes for gold coin emerged. In the private sector, commercial banks issued notes and deposits convertible into gold coins, which in turn were held as reserves to meet conversion demands. In the public sector, prototypical central banks (banks of issue) were established to help governments finance their ever-expanding fiscal needs [Capie, Goodhart and Schnadt (1994)]. These notes were also convertible, backed by gold reserves. In
3 Gallarotti (1995) describes the shift of political power in favor of the gold standard in Germany. See Friedman and Schwartz (1963) and Friedman (1990b) for a discussion of the US switch de facto to gold in 1879.
wartime, convertibility was suspended, but always on the expectation of renewal upon termination of hostilities. Thus the gold standard evolved into a mixed coin and fiduciary system based on the principle of convertibility.

A key problem with the convertible system was the risk of conversion attacks - of internal drains when a distrustful public attempted to convert commercial bank liabilities into gold, and of external drains when foreign demands on a central bank's gold reserves threatened its ability to maintain convertibility. In the face of this rising tension between the substitution of fiduciary money for gold and the stability of the system, central banks learned to become lenders of last resort and to use the tools of monetary policy to protect their gold reserves [Bagehot (1873), Redish (1993), Rockoff (1986)].

The gold standard, in both the pure coin variety and the more common mixed variety, was a domestic monetary standard which evolved in most countries through market-driven processes. By defining its unit of account as a fixed weight of gold, or alternatively by fixing the price of gold, each monetary authority also fixed its exchange rate with other gold standard countries and became part of an international gold standard.

2.1.2. Gold and the international monetary system
The international gold standard evolved from domestic standards by the fixing of the price of gold by member nations. Under the classical gold standard fixed exchange rate system, the world's monetary gold stock was distributed according to the member nations' demands for money and use of substitutes for gold. Disturbances to the balance of payments were automatically equilibrated by the Humean price-specie flow mechanism. Under that mechanism, arbitrage in gold kept nations' price levels in line. Gold would flow from countries with balance of payments deficits (caused, for example, by higher price levels) to those with surpluses (caused by lower price levels), in turn keeping their domestic money supplies and price levels in line. Some authors stressed the operation of the law of one price and commodity arbitrage in traded goods prices, others the adjustment of the terms of trade, still others the adjustment of traded relative to nontraded goods prices [Bordo (1984)]. Debate continues on the details of the adjustment mechanism; however, there is consensus that it worked smoothly for the core countries of the world although not necessarily for the periphery [Ford (1962), DeCecco (1974), Fishlow (1985)]. It also facilitated a massive transfer of long-term capital from Europe to the new world in the four decades before World War I on a scale relative to income which has yet to be replicated. Although in theory exchange rates were supposed to be perfectly rigid, in practice the rate of exchange was bounded by upper and lower limits - the gold points - within which the exchange rate floated. The gold points were determined by transactions costs, risk, and other costs of shipping gold. Recent research indicates that although in the classical period exchange rates frequently departed from par, violations of the gold points were rare [Officer (1986, 1996)], as were devaluations [Eichengreen (1985)]. 
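The gold-point bounds just described can be stated compactly; the notation below is ours, not the chapter's.

```latex
% Gold-point band (illustrative notation):
%   \bar{e} = mint parity, c = per-unit cost of shipping gold
%   (freight, insurance, interest forgone, risk)
\[
  \bar{e} - c \;\le\; e_t \;\le\; \bar{e} + c .
\]
% If e_t rose above \bar{e} + c, it became cheaper to settle foreign
% payments by exporting gold than by buying foreign exchange; the
% resulting gold flow pushed e_t back inside the band. The lower
% gold point worked symmetrically through gold imports.
```

Within the band the exchange rate could float, which is consistent with the evidence cited above that departures from par were frequent while violations of the gold points were rare.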
Adjustment to balance of payments disturbances was greatly facilitated by short-term
capital flows. Capital would quickly flow between countries to iron out interest rate differences. By the end of the nineteenth century the world capital market was so efficient that capital flows largely replaced gold flows in effecting adjustment.

2.1.3. Central banks and the "rules of the game"
Central banks also played an important role in the international gold standard. By varying their discount rates and using other tools of monetary policy they were supposed to follow "the rules of the game" and speed up adjustment to balance of payments disequilibria. In fact many central banks violated the rules [Bloomfield (1959), Dutton (1984), Pippenger (1984), Giovannini (1986), Jeanne (1995), Davutyan and Parke (1995)] by not raising their discount rates or by using "gold devices" which artificially altered the price of gold in the face of a payments deficit [Sayers (1957)]. But the violations were never sufficient to threaten convertibility [Schwartz (1984)]. They were in fact tolerated because market participants viewed them as temporary attempts by central banks to smooth interest rates and economic activity while keeping within the overriding constraint of convertibility [Goodfriend (1988)].

An alternative interpretation is that violations of the rules of the game represented the operation of an effective target zone bordered by the gold points. Because of the credibility of commitment to gold convertibility, monetary authorities could alter their discount rates to affect domestic objectives by exploiting the mean reversion properties of exchange rates within the zone [Svensson (1994), Bordo and MacDonald (1997)].

An alternative to the view that the gold standard was managed by central banks in a symmetrical fashion is that it was managed by the Bank of England [Scammell (1965)]. By manipulating its Bank rate, the Bank could attract whatever gold it needed; furthermore, other central banks adjusted their discount rates to its rate. They did so because London was the center for the world's principal gold, commodities, and capital markets, outstanding sterling-denominated assets were huge, and sterling served as an international reserve currency (as a substitute for gold).
There is considerable evidence supporting this view [Lindert (1969), Giovannini (1986), Eichengreen (1987)]. There is also evidence which suggests that the two other European core countries, France and Germany, had some control over discount rates within their respective economic spheres [Tullio and Wolters (1996)]. Although the gold standard operated smoothly for close to four decades, there were periodic financial crises. In most cases, when faced with both an internal and an external drain, the Bank of England and other European central banks followed Bagehot's rule of lending freely but at a penalty rate. On several occasions (e.g. 1890 and 1907) even the Bank of England's adherence to convertibility was put to the test and, according to Eichengreen (1992), cooperation with the Banque de France and other central banks was required to save it. Whether this was the case is a moot point. The cooperation that did occur was episodic, ad hoc, and not an integral part of the operation of the gold standard. Of greater importance is that, during periods of financial crisis, private capital flows aided the Bank. Such stabilizing capital movements likely
reflected market participants' belief in the credibility of England's commitment to convertibility.

By the eve of World War I, the gold standard had evolved de facto into a gold exchange standard. In addition to substituting fiduciary national monies for gold to economize on scarce gold reserves, many countries also held convertible foreign exchange (mainly deposits in London). Thus the system evolved into a massive pyramid of credit built upon a tiny base of gold. As pointed out by Triffin (1960), the possibility of a confidence crisis, triggering a collapse of the system, increased as the gold reserves of the center diminished. The advent of World War I triggered such a collapse as the belligerents scrambled to convert their outstanding foreign liabilities into gold.

2.1.4. Theory of commodity money
The gold standard contained a self-regulating mechanism that ensured long-run monetary and price level stability, the mechanism described by the commodity theory of money. It was most clearly analyzed by Irving Fisher (1922) although well understood by earlier writers. The price level of the world, treated as a closed system, was determined by the interaction of the money market and the commodity or bullion market. The real price (or purchasing power) of gold was determined by the commodity market; and the price level was determined by the demand for and supply of monetary gold. The demand for monetary gold was derived from the demand for money while the monetary gold stock was the residual between the total world gold stock and the nonmonetary demand. Changes in the monetary gold stock reflected gold production and shifts between monetary and nonmonetary uses of gold [Barro (1979)].

Under the self-equilibrating gold standard, once-for-all shocks to the demand for or supply of monetary gold would change the price level. These would be reversed as changes in the price level affected the real price of gold, leading to offsetting changes in gold production and shifts between monetary and nonmonetary uses of gold. This mechanism produced mean reversion in the price level and a tendency towards long-run price stability. In the shorter run, shocks to the gold market or to real activity created price level volatility. Evidence suggests that the mechanism worked roughly according to the theory [Cagan (1965), Bordo (1981), Rockoff (1984)] but other factors were also important - including government policy towards gold mining and the level of economic activity [Eichengreen and McLean (1994)].

This simple picture is complicated by a number of important considerations. These include technical progress in gold mining; the exhaustion of high-quality ores; and depletion of gold as a durable exhaustible resource.
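The self-equilibrating mechanism can be reduced to a three-equation sketch; the notation is our own illustrative reduction, not Fisher's or the chapter's.

```latex
% P = world price level, P_g = fixed mint price of gold,
% G^m = monetary gold stock, g = real price (purchasing power) of gold,
% \mu, V, Y = money multiplier, velocity, real income (held fixed here).
\begin{align*}
  M &= \mu\,P_g G^m, \qquad MV = PY
      && \Rightarrow\; P = \frac{\mu V}{Y}\,P_g G^m
      && \text{(money market)} \\
  g &= \frac{P_g}{P}
      && && \text{(purchasing power of gold)} \\
  \dot{G}^m &= f(g), \quad f' > 0
      && && \text{(production and shifts from nonmonetary use)}
\end{align*}
% A negative shock to G^m lowers P, which raises g = P_g / P, which
% stimulates f(g) and rebuilds G^m until P returns toward its mean.
```

The sketch makes the source of mean reversion explicit: with the nominal price of gold fixed, any fall in the price level automatically raises the real return to producing and monetizing gold.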
With depletion, in the absence of offsetting technical change, a gold standard must inevitably result in long-run deflation [Bordo and Ellson (1985)]. Although there is evidence that the gold standard was self-regulating, the lags involved were exceedingly long and variable (between 10 and 25 years, according to Bordo (1981)), so that many observers have been unwilling to rely on the mechanism as a basis for world price stability, and prominent contemporary
authorities advocated schemes to improve upon its performance. Others, e.g., Keynes (1930), doubted the operation of the self-regulating mechanism and attributed whatever success the gold standard had before 1914 to purely adventitious acts - timely gold discoveries in Australia and California in the 1850s, invention of the cyanide process in the 1880s, and gold discoveries in South Africa and Alaska in the 1890s.

2.1.5. The gold standard as a rule
One of the most important features of the gold standard was that it embodied a monetary rule or commitment mechanism that constrained the actions of the monetary authorities. To the classical economists, forcing monetary authorities to follow rules was viewed as preferable to subjecting monetary policy to the discretion of well-meaning officials. Today a rule serves to bind policy actions over time. This view of policy rules, in contrast to the earlier tradition that stressed both impersonality and automaticity, stems from the recent literature on the time inconsistency of optimal government policy. In terms of the modern perspective of Kydland and Prescott (1977) and Barro and Gordon (1983), the rule served as a commitment mechanism to prevent governments from setting policies sequentially in a time inconsistent manner. According to this approach, adherence to the fixed price of gold was the commitment that prevented governments from creating surprise fiduciary money issues in order to capture seigniorage revenue, or from defaulting on outstanding debt [Bordo and Kydland (1996), Giovannini (1993)]. On this basis, adherence to the gold standard rule before 1914 enabled many countries to avoid the problems of high inflation and stagflation that troubled the late twentieth century.

The gold standard rule in the century before World War I can also be interpreted as a contingent rule, or a rule with escape clauses [Grossman and Van Huyck (1988), DeKock and Grilli (1989), Flood and Isard (1989), Bordo and Kydland (1996)]. The monetary authority maintained the standard - kept the price of the currency in terms of gold fixed - except in the event of a well understood emergency such as a major war. In wartime it might suspend gold convertibility and issue paper money to finance its expenditures, and it could sell debt issues in terms of the nominal value of its currency, on the understanding that the debt would eventually be paid off in gold or in undepreciated paper.
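The time-inconsistency argument invoked earlier in this section can be illustrated with a bare-bones Barro-Gordon setup; the notation is the standard textbook one, not this chapter's.

```latex
% pi = inflation, pi^e = expected inflation, y_n = natural output,
% y^* > y_n = the policymaker's over-ambitious output target.
\begin{align*}
  \min_{\pi}\;& \tfrac{1}{2}\pi^2 + \tfrac{\lambda}{2}\,(y - y^*)^2,
    \qquad y = y_n + b\,(\pi - \pi^e) \\
  \text{FOC:}\quad & \pi = -\lambda b\,\bigl(y_n + b(\pi - \pi^e) - y^*\bigr) \\
  \pi = \pi^e \;\Rightarrow\;& \pi^{\mathrm{disc}} = \lambda b\,(y^* - y_n) > 0 .
\end{align*}
% Under discretion inflation is positive with no output gain; a binding
% rule (here, the fixed gold price) delivers pi = 0 at the same output
% y_n, which is why commitment dominates sequential optimization.
```

In this reading, gold convertibility plays the role of the binding rule: it pins down expectations and removes the inflation bias that discretion would otherwise generate.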
The rule was contingent in the sense that the public understood that the suspension would last only for the duration of the wartime emergency plus some period of adjustment, and that afterwards the government would adopt the deflationary policies necessary to resume payments at the original parity. Observing such a rule would allow the government to smooth its revenue from different sources of finance: taxation, borrowing, and seigniorage [Lucas and Stokey (1983), Mankiw (1987)]. That is, in wartime when present taxes on labor effort would reduce output when it was needed most, using future taxes or borrowing would be optimal. At the same time positive collection costs might also make it optimal to use the inflation tax as a substitute for conventional taxes [Bordo and Végh (1998)].
A temporary suspension of convertibility would then allow the government to use the optimal mix of the three taxes 4. It is crucial that the rule be transparent and simple and that only a limited number of contingencies be included. Transparency and simplicity avoided the problems of moral hazard and incomplete information [Canzoneri (1985), Obstfeld (1991)], i.e., prevented the monetary authority from engaging in discretionary policy under the guise of following the contingent rule. In this respect a second contingency - a temporary suspension in the face of a financial crisis, which in turn was not the result of the monetary authority's own actions - might also have been part of the rule. However, because of the greater difficulty of verifying the source of the contingency than in the case of war, invoking the contingency under conditions of financial crisis, or in the case of a shock to the terms of trade - a third possible contingency - would be more likely to create suspicion that discretion was the order of the day.

The basic gold standard rule is a domestic rule and it was enforced by the reputation of the gold standard itself, i.e., by the historical evolution of gold as money. An alternative commitment mechanism was to guarantee gold convertibility in the constitution, as was done in Sweden before 1914 [Jonung (1984)].

The gold standard contingent rule worked successfully for the "core" countries of the classical gold standard: Britain, France, and the USA [Bordo and Schwartz (1996a)]. In all these countries the monetary authorities adhered faithfully to the fixed price of gold except during major wars. During the Napoleonic War and World War I for England, the Civil War for the USA, and the Franco-Prussian War for France, specie payments were suspended and paper money and debt were issued. But in each case, after the wartime emergency had passed, policies leading to resumption at the prewar parity were adopted.
Indeed, successful adherence to the pre-World War I rule may have enabled the belligerents to obtain access to debt finance more easily in subsequent wars. In the case of Germany, the fourth "core" country, no occasions arose for application of the contingent aspect of the rule before 1914. Otherwise its record of adherence to gold convertibility was similar to that of the other three countries. Unlike the core countries, a number of peripheral countries had difficulty in following the rule
4 The evidence on revenue smoothing is mixed. According to Mankiw (1987), both the inflation tax and conventional taxes should follow a martingale process and a regression of the inflation rate on the average tax rate should have a positive and significant coefficient, as Mankiw, as well as Poterba and Rotemberg (1990) and Trehan and Walsh (1990), found for the post-World War I United States. However, Bordo and White (1993) for the Napoleonic War suspension of convertibility by Britain, Lazaretou (1995) for Greece in periods of inconvertibility in the nineteenth century, and Goff and Toma (1993) for the USA under the classical gold standard reject the hypothesis of revenue smoothing but not that of tax smoothing. As Goff and Toma (1993) argue, seigniorage smoothing would not be expected to prevail under a specie standard where the inflation rate does not exhibit persistence (which was the case during the British and the Greek inconvertibility episodes). The Bordo and White, and Lazaretou results suggest that, although specie payments were suspended, the commitment to resume prevented the government from acting as it would under the pure fiat regime postulated by the theory.
and their experience was characterized by frequent suspensions of convertibility and devaluations. One author argues that the commitment to gold convertibility by England and the other core countries was made possible by a favorable conjuncture of political economy factors. The groups who were harmed by the contractionary policies, required in the face of a balance of payments deficit to maintain convertibility, did not have political power before 1914. By contrast, in some peripheral countries, powerful political groups, e.g., Argentine ranchers and American silver miners, benefited from inflation and depreciation [Eichengreen (1992)]. The gold standard rule originally evolved as a domestic commitment mechanism but its enduring fame is as an international rule. As an international standard, the key rule was maintenance of gold convertibility at the established par. Maintenance of a fixed price of gold by its adherents in turn ensured fixed exchange rates. The fixed price of domestic currency in terms of gold provided a nominal anchor to the international monetary system. According to the game theoretic literature, for an international monetary arrangement to be effective both between countries and within them, a time-consistent credible commitment mechanism is required [Canzoneri and Henderson (1991)]. Adherence to the gold convertibility rule provided such a mechanism. Indeed, Giovannini (1993) finds the variation of both exchange rates and short-term interest rates within the limits set by the gold points in the 1899-1909 period consistent with market agents' expectations of a credible commitment by the core countries to the gold standard rule. In addition to the reputation of the domestic gold standard and constitutional provisions which ensured domestic commitment, adherence to the international gold standard rule may have been enforced by other mechanisms [see Bordo and Kydland (1996)]. 
These include: the operation of the rules of the game; the hegemonic power of England; central bank cooperation; and improved access to international capital markets. Indeed the key enforcement mechanism of the gold standard rule for peripheral countries was access to capital obtainable from the core countries. Adherence to the gold standard was a signal of good behavior, like the "good housekeeping seal of approval"; it explains why countries that always adhered to gold convertibility paid lower interest rates on loans contracted in London than others with less consistent performance [Bordo and Rockoff (1996)].

2.1.6. The viability of the gold standard
The classical gold standard collapsed in 1914. It was reinstated as a gold exchange standard between 1925 and 1931, and as the gold dollar standard from 1959 to 1971. The gold standard, while highly successful for a time, lost credibility in its 20th century reincarnations and was formally abandoned in 1971.

Among the weaknesses which contributed to its abandonment was the cost of maintaining a full-bodied gold standard. Friedman (1953) estimated the cost for the USA in 1950 as 1.5 percent of real GNP. Shocks to the demand for and supply of
gold that produced drift in the price level also weakened support for the gold standard, leading many economists to advocate schemes for reform [Cagan (1984)]. Finally, in a growing world, the gold standard, based on a durable exhaustible resource, posed the prospect of deflation.

The key benefits of the gold standard, in hindsight, were that it provided a relatively stable nominal anchor and a commitment mechanism to ensure that monetary authorities followed time consistent policies. However, the gold standard rule of maintaining a fixed price of gold meant, for a closed economy, that continuous full employment was not a viable policy objective and, for an open economy, that domestic policy considerations would be subordinated to those of maintaining external balance. In the twentieth century few countries have been willing to accept the gold standard's discipline [Schwartz (1986b)].

2.2. Interwar vicissitudes of the gold standard
The outbreak of World War I in August 1914 led to a massive worldwide financial crisis as investors across the world scrambled to liquidate sterling and other financial assets in exchange for domestic currency and gold. The response to the crisis and the need by the European belligerents for gold to pay for war material led to the breakdown of the gold standard.

After the war the UK and other countries expressed a strong preference to return to gold at the original parity, following the gold standard contingent rule [see the Cunliffe Report (1918)]. At the Genoa Conference in 1922, the Financial Commission, under British leadership, urged that the world return to the gold standard. However, the system it advocated was a gold exchange standard that encouraged member countries to make their currencies convertible into gold but to use foreign exchange (the currencies of key reserve countries, the UK and the USA) as a substitute for gold. The experts also encouraged members to restrict the use of gold as currency, thus establishing a gold bullion standard, and to cooperate when raising or lowering their discount rates to prevent competition for gold. The motivation to economize on gold was a belief that the world would suffer a severe gold shortage in coming decades.

The gold standard was restored worldwide in the period 1924-1927. It lasted globally only until 1931. The key event in its restoration was the return in April 1925 by the UK to convertibility at the prewar parity of $4.86. This parity is believed to have overvalued sterling by between 5 and 15 percent, depending on the price index used [Keynes (1925), Redmond (1984)] 5.
5 A vociferous debate continues between the followers of Keynes, who attribute the UK's weak economic performance and high unemployment in the 1920s to the decision to return to gold at an overvalued parity, and those who attribute the high unemployment to policies that raised the replacement ratio (the ratio of unemployment benefits to money wages), as well as other supply side factors. See, e.g., Pollard (1970); Thomas (1981); and Benjamin and Kochin (1979, 1982). For a recent discussion of the economics of resumption in 1925, see Bayoumi and Bordo (1998).
Countries with high inflation, such as France and Italy, returned to gold but at a greatly devalued parity. It took France seven years to stabilize the franc after the war. As described by Eichengreen (1992), the franc depreciated considerably in the early 1920s, reflecting a war of attrition between the left and the right over the financing of postwar reconstruction and over new fiscal programs [Alesina and Drazen (1991)]. The weakness of the franc was halted by Poincaré's 1926 stabilization program which restored budget balance, low money growth, and an independent central bank [Sargent (1984), Prati (1991)]. Germany, Austria, and other countries which had endured hyperinflation all stabilized their currencies in 1923/1924 and, with the aid of the League of Nations, all returned to gold convertibility at greatly devalued parities 6.

The gold standard was restored on the basis of the recommendations of Genoa. Central bank statutes typically required a cover ratio for currencies of between 30 and 40 percent, divided between gold and foreign exchange. Central reserve countries were to hold reserves only in the form of gold.

The gold exchange standard suffered from a number of serious flaws compared to the prewar gold standard [Kindleberger (1973), Temin (1989), Eichengreen (1992, 1996)]. The first was the adjustment problem. The UK with an overvalued currency ran persistent balance of payments deficits and gold outflows, which imparted deflationary pressure and, in the face of sticky prices and wages, low growth and high unemployment. This also required the Bank of England to adopt tight monetary policies to defend convertibility. At the other extreme, France with an undervalued currency enjoyed payments surpluses and gold inflows. The Banque de France did not allow the gold inflows to expand the money supply and raise the price level. It sterilized the inflows and absorbed monetary gold from the rest of the world 7.
At the same time the USA, the world's largest gold holder, also sterilized gold inflows and prevented the adjustment mechanism from operating [Friedman and Schwartz (1963)]. The second problem was the liquidity problem. Gold supplies were believed to be inadequate to finance the growth of world trade and output. This in turn was a legacy of high World War I inflation which reduced the real price of gold. The League of Nations in the First Interim Report of the Gold Delegation [League of Nations (1930)] tried to forecast the world demand for and supply of gold in the next decade. The Report argued that, unless further attempts to economize on gold succeeded, the world was destined to suffer from massive deflation. That happened in the period 1929-1933, not because of a gold shortage but because of the Great Depression [Bordo and Eichengreen (1998)].
6 According to Sargent (1984), because the reform package was credibly believed to signal a change in the monetary regime, the price level stabilized with no adverse real effects. Wicker (1986), by contrast, presents evidence of a substantial increase in unemployment in Austria, Hungary, and Poland, which persisted for several years. 7 According to Eichengreen (1992), a change in the statutes of the Banque de France following the Poincaré stabilization prevented the Banque from using open market operations to expand the money supply. Meltzer (1995b, Chapter 5) disputes this interpretation, arguing that the Banque was not required to deflate the world economy by selling foreign exchange for gold.
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
In the face of the perceived gold shortage, following the strictures of Genoa, central banks substituted foreign exchange for gold. This in turn created a confidence problem. As outstanding pounds and dollars increased relative to gold reserves in London and New York, the likelihood grew that some shock would lead to a speculative attack on sterling or the dollar by foreign holders fearful that they would be unable to convert their balances. Indeed this is what happened to sterling in 1931 [Capie, Mills and Wood (1986)] and to the dollar in 1933 [Wigmore (1987)]. The final problem plaguing the gold exchange standard was a lack of credibility. A change in the political complexion of many European countries (the growth of labor unions and left-wing parties) after World War I made it more difficult to defend convertibility if it meant adopting deflationary monetary policy [Eichengreen (1992, 1996), Simmons (1994)]. Speculative attacks made short-term capital flows destabilizing instead of stabilizing, as they had been before World War I. The lack of credibility could have been offset, according to Eichengreen (1992), by increased central bank cooperation, but it was not forthcoming. The system collapsed in the face of the shocks of the Great Depression 8.

2.3. Bretton Woods
Bretton Woods was the world's last convertible regime. It fits within the context of the gold standard because the USA, the most important commercial power, defined its parity in terms of gold and all other members defined their parities in terms of dollars. The planning that led to Bretton Woods aimed to avoid the chaos of the interwar period [Ikenberry (1993)]. The ills to be avoided were deduced from the historical record: floating exchange rates, condemned as prone to destabilizing speculation in the early 1920s; the subsequent gold exchange standard that enforced the international transmission of deflation in the early 1930s; and devaluations after 1933 that were interpreted as beggar-thy-neighbor actions and declared to be wrong, as was resort to trade restrictions, exchange controls, and bilateralism [Nurkse (1944)]. To avoid these ills, an adjustable peg system was designed that was expected to combine the favorable features of the fixed exchange rate gold standard and flexible exchange rates. Both John Maynard Keynes representing the UK and Harry Dexter White representing the United States planned an adjustable peg system to be coordinated by an
8 Eichengreen (1990) contrasts two alternative explanations for the collapse of the gold exchange standard: it collapsed after the start of the Great Depression in 1929 because of a scramble by central banks for gold in the face of a loss of confidence in the reserve country currencies; or it collapsed as a consequence of inappropriate policies followed by the USA and France in sterilizing gold inflows and thereby creating deflationary pressure on the international monetary system. Cross-country regressions for 24 countries over the period 1929-1935 explaining the demands for international reserves, gold and foreign exchange, including dummy variables for the USA and France, provide strong support for the latter hypothesis.
international monetary agency. The Keynes plan gave the International Clearing Union substantially more resources and power than White's United Nations Stabilization Fund, but both institutions were to exert considerable power over the domestic financial policy of the members. The British plan contained more domestic policy autonomy than did the US plan, while the American plan put more emphasis on exchange rate stability. The Articles of Agreement signed at Bretton Woods, New Hampshire, in July 1944 represented a compromise between the American and British plans. It combined the flexibility and freedom for policy makers of a floating rate system which the British team wanted, with the nominal stability of the gold standard rule emphasized by the USA. The system established was a pegged exchange rate, but members could alter their parities in terms of gold and the dollar in the face of a fundamental disequilibrium. Members were encouraged to rely on domestic stabilization policy to offset temporary disturbances to their payments balances and they were protected from speculative attack by capital controls. The International Monetary Fund (IMF) was to provide temporary liquidity assistance and to oversee the operation of the system [Bordo (1993a)]. Although based on the principle of convertibility, with the USA rather than England as the center country, Bretton Woods differed from the classical gold standard in a number of fundamental ways. First, it was an arrangement mandated by an international agreement between governments, whereas the gold standard evolved informally. Second, domestic policy autonomy was encouraged even at the expense of convertibility, in sharp contrast to the gold standard where convertibility was key. Third, capital movements were suppressed by controls [Marston (1993), Obstfeld and Taylor (1998)]. The Bretton Woods system faced a number of problems in getting started, and it took 12 years before the system achieved full operation. 
Each of the two key problems in the early years - bilateralism and the dollar shortage - was largely solved by developments outside the Bretton Woods arrangements. The dollar shortage was solved by massive US Marshall Plan aid and the devaluation of sterling and other currencies in 1949. Multilateralism was eventually achieved in Western Europe in 1958 following the establishment in 1950 of the European Payments Union [Eichengreen (1995)]. The period 1959-1967 was the heyday of Bretton Woods. The system had become a gold dollar standard whereby the United States pegged the price of gold and the rest of the world pegged their currencies to the dollar. The dollar emerged as the key reserve currency in this period, reflecting both its use as an intervention currency and a growing demand by the private sector for dollars as international money. This growth in dollar demand reflected stable US monetary policy. Also the system evolved a different form of international governance than envisioned at Bretton Woods. The IMF's role as manager was eclipsed by that of the USA in competition with the other members of the G-10. According to Dominguez (1993), although the IMF provided many valuable services, it was not successful in serving as a commitment mechanism.
The Bretton Woods system, in its convertible phase from 1959 to 1971, was characterized by exceptional macroeconomic performance in the advanced countries (see Section 4 below). It had the lowest and most stable inflation rate and highest and most stable real growth rates of any modern regime. However, it was short-lived. Moreover, it faced smaller demand and supply shocks than under the gold standard. This suggests that the reason for the brevity of its existence was not the external environment but, as with the gold exchange standard, structural flaws in the regime and the lack of a credible commitment mechanism by the center reserve country. The three problems of adjustment, liquidity, and confidence dominated academic and policy discussions during this period. The debate surrounding the first focused on how to achieve adjustment in a world with capital controls, fixed exchange rates, and domestic policy autonomy. Various policy measures were proposed to aid adjustment [Obstfeld (1993)]. For the United States, the persistence of balance of payments deficits after 1957 was a source of concern. For some it demonstrated the need for adjustment; for others it served as the means to satisfy the rest of the world's demand for dollars. For monetary authorities the deficit was a problem because of the threat of a convertibility crisis, as outstanding dollar liabilities rose relative to the US monetary gold stock. US policies to restrict capital flows and discourage convertibility did not solve the problem. The main solution advocated for the adjustment problem was increased liquidity. Exchange rate flexibility was strongly opposed. The liquidity problem evolved from a shortfall of monetary gold beginning in the late 1950s. The gap was increasingly made up by dollars, but, because of the confidence problem, dollars were not a permanent solution. New sources of liquidity were required, answered by the creation of Special Drawing Rights (SDRs). 
However, by the time SDRs were injected into the system, they exacerbated worldwide inflation [Genberg and Swoboda (1993)]. The key problem of the gold-dollar system was how to maintain confidence. If the growth of the monetary gold stock was not sufficient to finance the growth of world real output and to maintain US gold reserves, the system would become dynamically unstable [Triffin (1960), Kenen (1960)]. Indeed the system was subject to growing speculative attacks, in which market agents anticipated the inconsistency between nations' financial policies and maintenance of pegged exchange rates [Garber and Flood (1984), Garber (1993)]. Although capital flows were blocked in most countries, controls were increasingly evaded by various devices including the use of leads and lags - the practice of accelerating payments in domestic currency and delaying foreign currency receipts in the expectation of a devaluation of the domestic currency [Obstfeld and Taylor (1998)]. Thus successful attacks occurred against sterling in 1947, 1949 and 1967 and the franc in 1968 [Bordo and Schwartz (1996b)]. From 1960 to 1967, the United States adopted a number of policies to prevent conversion of dollars into gold. These included the Gold Pool, swaps, Roosa bonds, and moral suasion. The defense of sterling was a first line of defense for the dollar. When none of the measures worked the dollar itself was attacked via a run on the
London gold market in March 1968, leading to the adoption of the two-tier gold market arrangement. This solution temporarily solved the problem by demonetizing gold at the margin and hence creating a de facto dollar standard. The Bretton Woods system collapsed between 1968 and 1971 in the face of US monetary expansion that exacerbated worldwide inflation. The United States broke the implicit rules of the dollar standard by not maintaining price stability [Darby et al. (1983)]. The rest of the world did not want to absorb dollars and inflate. They were also reluctant to revalue. The Americans' hand was forced by British and French decisions to convert dollars into gold. The impasse was resolved by President Richard Nixon's closing of the gold window, ending convertibility on 15 August 1971. Another important source of strain on the system was the unworkability of the adjustable peg under increasing capital mobility. Speculation against a fixed parity could not be stopped by either traditional policies or international rescue packages. The breakdown of Bretton Woods marked the end of US financial predominance in the international monetary system. The absence of a new center of international management set the stage for a multipolar system. Under the Bretton Woods system, as under the classical gold standard, a set of rules was established, based on the convertibility of domestic currency into gold, although under Bretton Woods only the United States was required to maintain it 9. Also, as under the gold standard, the rule was a contingent one. Under Bretton Woods the contingency, which would allow a change of parity, was a fundamental disequilibrium in the balance of payments, although fundamental disequilibrium was never clearly defined. Unlike the example of Britain under the gold standard, however, the commitment to maintain gold convertibility by the USA, the center country, lost credibility by the mid-1960s. Also the contingency aspect of the rule proved unworkable. 
With fundamental disequilibrium being ill-defined, devaluations were avoided as an admission of failed policy. In addition, devaluations invited speculative attack even in the presence of capital controls. Once controls were removed, the system was held together only by G-10 cooperation and once inconsistencies developed between the interests of the USA and other members, even cooperation became unworkable. In conclusion, under Bretton Woods gold still served as a nominal anchor. This link to gold likely was important in constraining US monetary policy, at least until the mid-1960s, and therefore that of the rest of the world. This may explain the low inflation rates and the low degree of inflation persistence observed in the 1950s and 1960s [Alogoskoufis and Smith (1991), Bordo (1993b)]. However, credibility was considerably weaker than under the gold standard and it was not as effective a nominal anchor [Giovannini (1993)]. Moreover, when domestic interests clashed with convertibility, the anchor chain was stretched and then discarded [Redish (1993)]. This was evident in the US reduction and then removal of gold reserve requirements in 1965
9 McKinnon (1993) also views Bretton Woods and the gold standard as regimes based on a set of rules.
and 1968, the closing of the Gold Pool in 1968 and the gold window itself in 1971. The adoption of the Second Amendment to the IMF Articles of Agreement in 1976 marked the absolute termination of a role for gold in the international monetary system. With the closing of the gold window and the breakdown of the Bretton Woods system, the last vestiges of the fixed nominal anchor of the convertibility regime disappeared. The subsequent decade under a fiat money regime and floating exchange rates exhibited higher peacetime inflation in advanced countries than in any other regime. An interesting unanswered question is whether the demise of the fixed nominal anchor and the convertibility principle explains the subsequent inflation or whether a change in the objectives of monetary authorities - full employment rather than convertibility and price stability - explains the jettisoning of the nominal anchor.
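The confidence problem at the heart of the Bretton Woods narrative, Triffin's point that dollar liabilities growing faster than the monetary gold stock must steadily erode the gold backing of the dollar, can be illustrated with a small numerical sketch. The Python below is a hypothetical illustration, not historical data: the growth rates, starting values, and the helper name `backing_ratio_path` are all assumptions chosen for the example.

```python
# Illustrative sketch of the Triffin dilemma: world reserve demand grows
# faster than the monetary gold stock, so dollar liabilities fill the gap
# and the US gold-backing ratio declines toward a confidence crisis.
# All parameter values are hypothetical, chosen only for illustration.

def backing_ratio_path(gold0, liabilities0, g_gold, g_reserves, years):
    """Gold/liability ratio year by year under constant growth rates."""
    gold, liabilities, path = gold0, liabilities0, []
    for _ in range(years):
        gold *= 1 + g_gold             # slow growth of monetary gold
        liabilities *= 1 + g_reserves  # liabilities expand with reserve demand
        path.append(gold / liabilities)
    return path

# Gold stock indexed at 100 against 80 in outstanding dollar liabilities,
# gold growing 1.5% a year, reserve demand growing 6% a year.
path = backing_ratio_path(100.0, 80.0, 0.015, 0.06, 12)
print([round(r, 2) for r in path])
```

With these assumed rates the backing ratio falls monotonically, from roughly 1.2 to roughly 0.74 over a dozen years; the precise numbers are immaterial, only the inevitability of the decline, which is what made a run on the reserve currency rational once holders doubted convertibility.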
2.4. The recent managed float and the European Monetary System

As a reaction to the flaws of the Bretton Woods system, the world turned to generalized floating exchange rates in March 1973. Though the early years of floating were often characterized as a dirty float, whereby monetary authorities intervened extensively to affect both the level and volatility of exchange rates, by the 1990s the regime had evolved into one in which exchange market intervention occurred primarily with the intention of smoothing fluctuations. Again, in the 1980s exchange market intervention was used by the Group of Seven countries as part of a strategy of policy coordination. In recent years, floating exchange rates have been assailed from many quarters for excessive volatility in both nominal and real exchange rates, which in turn increases macroeconomic instability and raises the costs of international transactions. Despite these problems, the ability of the flexible regime to accommodate the massive oil price shocks of the 1970s as well as other shocks in subsequent years without significant disruption, together with the perception that pegged exchange rate arrangements amongst major countries are doomed to failure, renders the prospects for significant reform of the present system at the world level remote. Based upon the Bretton Woods experience, major countries are unwilling to compromise their domestic interests for the sake of the dictates of an external monetary authority or to be subject to the constraints of an international exchange rate arrangement which they cannot control [Bordo (1995)]. This is not the case at the regional level, where there is a greater harmony of interests than between major countries. Indeed Europe is moving unsteadily towards creating a monetary union with a common currency. 
On the road to that end, the EMS established in 1979 was modelled after Bretton Woods (although not based on gold), with more flexibility and better financial resources [Bordo (1993b)]. It was successful for a few years in the late 1980s when member countries followed policies similar to those of Germany, the center country [Giavazzi and Giovannini (1989)]. It broke down in 1992 to 1993 in a manner similar to the collapse of Bretton Woods in 1968-1971. It also collapsed for similar reasons - because pegged exchange rates, capital mobility, and
policy autonomy do not mix. It collapsed in the face of a massive speculative attack on countries that adopted policies inconsistent with their pegs to the D-mark and also on countries that seemingly observed the rules, but whose ultimate commitment to the peg was doubted. The doubt arose because of rising unemployment in the latter. The lesson from this experience is that the only real alternatives for the European countries are monetary union, perfectly fixed exchange rates and the complete loss of monetary independence, or else floating. Halfway measures such as pegged exchange rate systems do not last. Schemes to reimpose capital controls [Eichengreen, Tobin and Wyplosz (1995)] will be outwitted and will only misallocate resources. The legacy of the gold standard and its variants for EMU is the role of gold as the nominal anchor and of a credible policy rule to maintain it. Cooperation and harmonization of policies under the gold standard was episodic and not by design - in contrast with Bretton Woods, EMS and EMU. For the EMU to succeed, members must have the same credible commitment to their goal as did the advanced nations to the gold standard rule a century ago. That is, they must sacrifice domestic to international stability. The advent of generalized floating in 1973 allowed each country more flexibility to conduct independent monetary policies. In the 1970s inflation accelerated as advanced countries attempted to use monetary policy to maintain full employment. However, monetary policy could be used to target the level of unemployment only at the expense of accelerating inflation [Friedman (1968), Phelps (1968)]. In addition, the USA and other countries used expansionary monetary policy to accommodate oil price shocks in 1973 and 1979. The high inflation rates that ensued led to a determined effort by monetary authorities in the USA and UK and other countries to disinflate. 
The 1980s witnessed renewed emphasis by central banks on low inflation as their primary (if not sole) objective. Although no formal monetary rule has been established, a number of countries have granted their central banks independence from the fiscal authority and have also instituted mandates for low inflation or price stability. Whether we are witnessing a return to a rule like the convertibility principle and a fixed nominal anchor is too soon to tell. We now turn from the general discussion of domestic and international monetary regimes to survey an important example of a domestic regime - the USA.
3. Episodes in US central banking history

3.1. Origins of US central banking
Before the passage of the Federal Reserve Act in 1913, the United States did not have a central bank, but it did adhere successfully to a specie standard from 1792 on, except for a brief wartime suspension at the end of the War of 1812 and the 17-year greenback episode from 1862 to 1879. From 1879 to 1914, the United States adhered to the gold standard without a central bank. With the exception of a period in the 1890s, when
agitation for free coinage of silver led to capital flight and threats of speculative attacks on the dollar [Grilli (1990), Calomiris (1993)], US commitment to gold convertibility was as credible as that of the other core countries [Giovannini (1993)]. Although a formal central bank was not in place before 1914, other institutions performed some of its functions. The Independent Treasury, established in 1840, served as a depository for federal government tax receipts in specie. On a number of occasions, by transferring specie to commercial banks, by judicious timing of its debt management, and by disbursement of the budget surplus, the Treasury mitigated financial stress. It even engaged in primitive open market operations, according to Timberlake (1993, ch. 6). Clearing house associations in various financial centers, beginning with New York in 1857, provided lender of last resort services of a central bank by issuing emergency currency [Timberlake (1993), ch. 14], but often after rates became extremely high - 100 percent in 1907. The Federal Reserve system was established to deal more systematically than had the Treasury and the clearing houses with the perceived problems of the banking system, including periodic financial panics and seasonally volatile short-term interest rates. It came into existence at the end of the classical gold standard era, yet it was founded directly upon the precepts of central banking under the gold standard: use of discount rate policy to defend gold convertibility, and the importance of a lender of last resort [Meltzer (1995a), ch. 2]. In addition, the new institution was organized to smooth seasonal movements in short-term interest rates by providing an elastic money supply. By accommodating member bank demand for rediscounts, based on eligible, self-liquidating commercial bills, the reserve banks were designed to promote sufficient liquidity to finance economic activity over the business cycle [Meltzer (1996), ch. 3]. 
The remaining subsections cover episodes of the eighty-odd years of the Federal Reserve's existence within the broad regimes demarcated in Section 2: 1919-1941; 1946-1971; 1971-1995 10. The environment in which the system operated in each of these episodes was vastly different from that envisioned by the founders. Monetary policy changes took place. The changes reflected the influence of three sets of players, who shaped the saga of the system: Congress, by legislation and oversight; the system's officials, by their efforts to fulfill its mission, as they understood it; and the research community, by its interpretation and evaluation of the system's performance. Our discussion comments on these sources of influence on the system. To accompany the discussion, Figures 3.1a-f present annual series for six important macroeconomic aggregates, 1914-1995: CPI and real per capita income; M2 and the monetary base; the short-term commercial paper rate; and a long-term bond yield. Vertical lines on each plot mark the separate monetary policy episodes that distinguish the Federal Reserve era.
10 We omit war years, 1915-1918 and 1941-1946. World War II for the USA began later than for the European countries, hence the difference between the dating of the Fed episodes and the broad international regimes in Sections 2 and 4.
Fig. 3.1a. Real per capita income, 1914-1995, USA. Regimes: 1. World War I; 2. 1920s; 3. Great Contraction; 4. the recovery; 5. interest rate peg; 6. Fed discretionary regime; 7. breakdown of convertibility principle; 8. shifting the focus of monetary policy. Data sources: see Appendix A.
Fig. 3.1b. CPI, 1914-1995, USA. See Figure 3.1a for legend.
Fig. 3.1c. Monetary base, 1914-1995, USA. See Figure 3.1a for legend.
Fig. 3.1d. M2, 1914-1995, USA. See Figure 3.1a for legend.
Fig. 3.1e. Short-term interest rate, 1914-1995, USA. See Figure 3.1a for legend.
Fig. 3.1f. Long-term interest rate, 1914-1995, USA. See Figure 3.1a for legend.
3.2. Federal Reserve 1914
In the 30 sections of the Federal Reserve Act that was signed into law on 13 December 1913, Congress sketched the outlines of the system it sought to create. Its structure included: a board based in Washington, DC, of five (increased to six, June 1922) appointees of the President, one of whom he would designate as the Governor, plus the Comptroller of the Currency, and the Secretary of the Treasury as ex officio chairman; no fewer than eight and no more than twelve Federal Reserve banks, each located in a principal city, the final number and boundaries of the districts to be determined by a committee of the Secretaries of Treasury and Agriculture and the Comptroller of the Currency; and a Federal Advisory Council of one banker elected by each reserve bank. By this structure Congress intended to create a system of semi-autonomous regional reserve banks, loosely subject to the supervision of the Washington board. Over the next two decades the board and the reserve banks, and the reserve banks among themselves, would be pitted against one another in a struggle to determine which was dominant. The principal change the Federal Reserve Act introduced was the provision of an "elastic currency", Federal Reserve notes (or, equivalently, member bank deposits at the reserve banks). "Elastic" meant that the new Federal Reserve money would be subject to substantial change in quantity over short periods, thus requiring some body to control the creation and retirement of the money, some means for creating and retiring the money, and some criteria to determine the amount to be created or retired [Friedman and Schwartz (1963)]. Both the board and the reserve banks, without clear lines of demarcation of their respective powers, were given joint control of the creation and retirement of Federal Reserve money. 
The means for creating it were gold inflows, rediscounting of "eligible" paper, discounting of foreign trade acceptances, and open market purchases of government securities, bankers' acceptances, and bills of exchange. Retirements involved the converse. The criteria for determining the amount of Federal Reserve money were, on the one hand, a gold standard rule, imposing the requirement of a 40 percent gold reserve against notes and a 35 percent gold reserve against deposits, and convertibility of Federal Reserve money into gold on demand at the Treasury Department or into gold and lawful money at any Federal Reserve bank; and, on the other hand, a real bills doctrine, according to which the amount issued would be linked to "notes, drafts, and bills of exchange arising out of actual commercial transactions" (section 13), offered for discount at rates to be established "with a view of accommodating commerce and business" (section 14d). In addition to gold backing, each dollar of Federal Reserve notes was also to be secured by a 60 percent commercial paper collateral requirement. The two criteria were on the surface contradictory. While the gold standard rule requires the stock of notes and deposits to be whatever is necessary to balance international payments over the long run, in the short run the stock of gold reserves and international capital market flows can accommodate temporary imbalances.
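The statutory cover ratios just described lend themselves to simple balance-sheet arithmetic. A minimal sketch in Python, using the 40 and 35 percent gold requirements stated in the Act; the balance-sheet figures and helper names (`free_gold`, `max_extra_notes`) are invented for illustration, and the 60 percent commercial-paper collateral rule is deliberately ignored:

```python
# Back-of-the-envelope arithmetic for the Federal Reserve Act's gold cover:
# 40 percent gold against notes, 35 percent gold against deposits.
# Balance-sheet figures below are hypothetical, not historical balances.

NOTE_COVER = 0.40     # statutory gold reserve ratio against notes
DEPOSIT_COVER = 0.35  # statutory gold reserve ratio against deposits

def free_gold(gold, notes, deposits):
    """Gold held in excess of the statutory requirement ('free gold')."""
    return gold - (NOTE_COVER * notes + DEPOSIT_COVER * deposits)

def max_extra_notes(gold, notes, deposits):
    """Largest further note issue the gold constraint alone permits,
    holding deposits fixed (commercial-paper collateral rule ignored)."""
    return free_gold(gold, notes, deposits) / NOTE_COVER

# Example: gold 1000, notes 1500, member-bank deposits 800 (index units)
print(free_gold(1000, 1500, 800))        # 1000 - (600 + 280) = 120
print(max_extra_notes(1000, 1500, 800))  # 120 / 0.40 = 300
```

The point of the arithmetic is the one the text makes: the gold constraint caps the *total* of notes and deposits, while the real bills criterion, tied to the volume of eligible paper, imposes no such cap.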
However, the gold standard does not determine the division of the stock of money between currency and deposits, although facilitating shifts between the two forms of money was a crucial attribute of the new institution. The real bills criterion, by contrast, which was linked to this division, sets no limit to the quantity of money. A basic monetary problem that the Federal Reserve Act was intended to solve was an attempt by the public to shift from holding deposits to holding currency. Such attempts had led to a series of banking crises before 1914 [Schwartz (1986a)]. The solution was to introduce a form of currency that could be rapidly expanded - the role of the Federal Reserve note - and to enable commercial banks readily to convert their assets into such currency - the role of rediscounting. By limiting the lender of last resort to rediscounting only such paper as arose from "actual commercial transactions" as opposed to paper arising from "speculative transactions" (i.e., loans backed by stock market collateral), the Federal Reserve Act sustained the real bills doctrine but, in so doing, it confused the elasticity of one component of the money stock relative to another with the elasticity of the total. Systemwide open market operations were not contemplated in the Act. Each reserve bank had discretion to choose the amount of government securities to buy and sell and first claim on the earnings of its government securities portfolio. The Federal Reserve Act gave the board and the reserve banks the right to regulate interest rates. As a result, the behavior of short-term interest rates changed. Before the Federal Reserve began operations, nominal interest rates displayed extreme seasonality, which was linked to financial crises [Kemmerer (1910), ch. 2; Macaulay (1938), chart 20; Shiller (1980), pp. 136-137; Clark (1986), Miron (1986), Mankiw, Miron and Weil (1987), Miron (1996)]. 
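How such a change in seasonality is typically measured deserves a brief illustration. The sketch below uses synthetic monthly data, not the historical series the literature studies: it compares the amplitude (max minus min) of monthly means across two subsamples, which is equivalent to regressing the rate on a full set of monthly dummies in each subsample. The seasonal amplitudes, noise level, and function names are all assumptions for the example.

```python
# Synthetic illustration of measuring a fall in interest-rate seasonality:
# compare the amplitude (max minus min) of monthly means in two subsamples.
# With a full set of monthly dummies and no other regressors, the fitted
# dummy coefficients are exactly these monthly means.
import math
import random

random.seed(0)

def synthetic_rates(years, seasonal_amp):
    """Monthly rates: 5% mean + sinusoidal seasonal cycle + noise."""
    rates = []
    for _ in range(years):
        for m in range(12):
            seasonal = seasonal_amp * math.sin(2 * math.pi * m / 12)
            rates.append(5.0 + seasonal + random.gauss(0, 0.3))
    return rates

def seasonal_amplitude(rates):
    """Max minus min of the twelve monthly means."""
    monthly = [rates[m::12] for m in range(12)]
    means = [sum(obs) / len(obs) for obs in monthly]
    return max(means) - min(means)

pre = synthetic_rates(20, seasonal_amp=1.5)   # stand-in for the pre-1914 regime
post = synthetic_rates(20, seasonal_amp=0.3)  # stand-in for the post-founding regime
print(seasonal_amplitude(pre) > seasonal_amplitude(post))  # True
```

On these assumptions the pre-sample amplitude is roughly five times the post-sample amplitude, a sharp reduction of the kind the studies cited above document, though here it is built into the simulated data rather than estimated from history.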
Once in operation, it apparently altered the process generating short-term interest rates. According to Barro (1989), the shifts in monetary policy involved changes in the process for monetary-base growth. Federal Reserve policy did not completely eliminate seasonality in nominal interest rates, but substantially reduced its amplitude. Why the policy of smoothing was quickly effective in reducing seasonality and other transitory movements in nominal interest rates has been the subject of debate. Was it the founding of the Federal Reserve, as Miron (1986) and Goodfriend (1991) contend, or the abandonment of the gold standard by many countries in 1914 that led to diminished interest rate seasonality, as Truman Clark (1986) contends, or was there no regime change at all, as Fishe and Wohar (1990) maintain? Whichever interpretation one adopts, if one regards the nominal interest rate as the implicit tax on holding real money balances, smoothing the nominal interest rate over the year is a benefit but only of small consequence in raising welfare. McCallum (1991) suggests, however, that seasonal interest rate smoothing encouraged Federal Reserve smoothing in nonseasonal ways also, which was probably detrimental to monetary policy more generally. Goodfriend (1988) asks how the Federal Reserve was able to combine a commitment to a fixed dollar price of gold, on its founding, with interest rate smoothing. His answer is that, under a gold standard, the Federal Reserve could choose policy rules for both
M.D. Bordo and A.J. Schwartz
money and gold. It varied its stockpile of gold in supporting a fixed price of gold, and used monetary policy to target interest rates.

Semi-autonomous reserve banks, according to the Federal Reserve Act, would each establish discount rates in accordance with regional demand for and supply of rediscounts, subject to review and determination of the board (section 13). Discount rates were to vary by types of eligible paper and by different maturities. Where the power rested to initiate discount rate changes would become contentious. The example of the Bank of England in setting its rate above market rates influenced early reserve bank belief that discount rates should be penalty rates. This belief conflicted with the political interest in using the Act to achieve a low level of interest rates [Meltzer (1996), ch. 3].

The Federal Reserve Act also included a fiscal provision (section 7). Member banks own the reserve banks, and are paid a 6 percent cumulative dividend on their capital stock, as if the reserve banks were a public utility and the board were the regulatory body [Timberlake (1993)]. Expenses of both the reserve banks and the board were paid from earnings on assets. Timberlake finds a contradiction in regarding the reserve banks as both the income-earning utility and regulators of the commercial banking system. The net earnings of the reserve banks, according to the law, after payment of dividends were to be divided between the surplus account and the Treasury. However, before they needed to turn over any part of their earnings to the government, the reserve banks could build up their surplus until it equaled their subscribed capital (originally 40 percent of subscribed capital, changed in March 1919 to 100 percent) and, even then, 10 percent of net earnings would continue to be added to the surplus before the remainder was paid to the Treasury as a franchise tax on the note issue.
The objective of the Federal Reserve was to serve as a lender of last resort and thus eliminate financial crises, to be achieved by interest rate smoothing, according to the consensus view of writers on its founding. It would issue notes and deposits, based on real bills, and convertible into gold on demand.

Toma (1997) regards the foregoing specification of the intent of the Federal Reserve Act as misconceived. Based on a public choice approach, he describes the reserve banks as a network of competitive clearinghouses that were to provide liquidity for retail banks. Assigning money creation powers to the Federal Reserve was a way of funding the general government, which could indeed raise revenue for itself by granting monopoly status to a clearinghouse and taxing its profits. That strategy, however, would reduce the liquidity the clearinghouses were to offer banks. Greater government financing needs meant less liquidity supplied by the reserve industry and greater bank fragility. Hence, for Toma, the founding of the Federal Reserve reflected a tradeoff between government revenue needs and financial stability. Since prospective government seigniorage requirements were low in 1913, financial stability goals dominated.

Toma also disputes the role of interest rate smoothing. The solution to the financial crisis problem in his view did not rely on interest rate control. Instead, the Federal
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
Reserve rebated earnings to large city banks through an in-kind payment of check-clearing services, and subsidized loans during the fall when discount rates were constant and market interest rates rose. Hence the probability of a financial crisis was reduced. Manipulation of market interest rates was not required.

Toma's emphasis on government revenue needs as an important element in the thinking of the founders of the Federal Reserve would carry weight if he could cite evidence to this effect from the lengthy debate preceding the law's enactment. As it is, his evidence is that public finance considerations accounted for the creation of the national banking system and 19th century central banks. These examples do not clinch his case. Similarly, Toma's argument that interest rate smoothing was not needed for financial stability because it was achieved by the alternative means he identifies does not challenge the fact that smoothing occurred.

3.3. Interwar years, 1919-1941

3.3.1. 1919-1929
The system's experiences during World War I and the aftermath left the policy guidelines of the Federal Reserve Act of questionable value. The gold criterion had become operative only when inflation rose in 1919-1920, and the system's gold reserve ratio plunged. In the face of that decline, the system had contracted. However, when gold inflows followed, and the gold criterion signaled the need to lower interest rates, the real bills criterion signaled the opposite policy. The real bills criterion had been emasculated by wartime finance considerations, but in 1920 member bank indebtedness to the reserve banks and their large portfolios of government securities signaled a need for higher interest rates. Moreover, the steep discount rates in 1920-1921 were not penalty rates since they were lower than open market rates on commercial paper [Meltzer (1996), ch. 3]. In the deep contraction of 1920-1921 the system had no compass by which to steer to keep to a chosen course.

The violent swings of prices that marked the inflation of 1919 and deflation of 1920 were the background to Federal Reserve performance in the years before the Great Depression. No disputes exist about what the Federal Reserve's actions were, but a contentious literature has arisen about the interpretation of those actions. The issues concern the Federal Reserve's commitment to the gold standard criterion and the real bills doctrine, and whether stabilization of the business cycle became its goal.

With respect to the gold standard criterion, the problem for the Federal Reserve was that gold standard rules appeared to be inapplicable in a world where only the United States maintained gold payments. The flow of gold to the United States in 1921-1922 threatened monetary stability if the authorities responded with expansionary actions. But gold sterilization was incompatible with using the gold reserve ratio as a guide to Federal Reserve credit.
From 1923 on gold movements were largely offset by movements in Federal Reserve credit, so essentially no relation is observed between the gold movements and the
monetary base [Friedman and Schwartz (1963), pp. 279-284]. The system justified sterilization of gold movements on three grounds: pending the return to gold standards by countries abroad, much of the gold was in this country only temporarily; gold movements could not serve their equilibrating role with most of the world not on the gold standard; sterilization of the inflow was desirable to increase the gold stock in view of increased short-term foreign balances here. Once other countries returned to the gold standard, however, these reasons were no longer valid, although the system still repeated them. Wicker's (1965, pp. 338-339) objection to regarding gold sterilization as a significant indicator of monetary policy is that "Federal Reserve monetary policy may not have been at all times rationally conceived and administered" (p. 338). He sees a conflict between sterilization for domestic considerations and the commitment to fully convertible gold currencies abroad, but he concedes that the Federal Reserve rejected the reserve ratio as a guide, although only until the international gold standard would be fully restored. To replace the gold reserve ratio, the Tenth Annual Report [Federal Reserve Board (1924)] of the Federal Reserve system maintained that credit would not be excessive "if restricted to productive uses". This seems to be a procyclical needs of trade doctrine. The Report distinguishes between "productive" and "speculative" use of credit, the latter referring to speculative accumulation of commodity stocks, not stock market speculation. Wicker argues that the Report emphasized a quantitative as well as a qualitative criterion for the adequacy of bank credit, and that the system was not guilty of the real bills fallacy (1965, pp. 340-341). How the quantitative criterion was to be applied in practice Wicker does not explain. 
In a 1922 speech at Harvard, Strong showed that he understood that the qualitative criterion was ineffectual, noting that "the definition of eligibility does not effect the slightest control over the use to which the proceeds (of Federal Reserve credit) are put" [Chandler (1958), p. 198; Meltzer (1996), ch. 3].

A third issue that divides commentators on monetary policy during the 1920s is whether the system consciously pursued the goal of stabilizing the business cycle. After its unfortunate experience with the discount rate in 1919-1920 as the instrument to implement monetary policy, in the following years the system experimented with open market operations. They were initially regarded as a means to obtain earnings for the reserve banks. The banks individually bought government securities without apparent concern for the influence of those purchases on the money market, with the result that their uncoordinated operations disturbed the government securities market. The Treasury's dismay led the reserve banks in May 1922 to organize a committee of five governors from eastern reserve banks to execute joint purchases and sales and to avoid conflicts with Treasury plans for new issues or acquisitions for its investment accounts. The committee met for the first time on 16 May at the New York reserve bank and elected Strong as permanent chairman. Although the centralization of open market operations led to a recognition of the bearing of purchases and sales on monetary policy, the recognition did not come immediately.
Opposition to open market operations was voiced by Adolph Miller, an economist member of the reserve board. He argued that changes in member bank borrowing offset open market operations and therefore had no effect on credit conditions. In his view the reserve banks should limit provision of credit to rediscounting bills that member banks submitted. Opposition to open market operations was related to a general view of monetary policy that distinguished sharply between discounts and bankers' acceptances, on the one hand, and government securities on the other as sources of credit expansion. Reserve creation by buying bills and discounting bills was regarded as financing a genuine business transaction, while reserve creation by buying government securities had no such direct connection with the needs of trade. Reserve creation in the latter case might filter into loans on Wall Street. These conflicting domestic policy positions were intertwined with international considerations. The system attached great importance to the reestablishment of a worldwide gold standard, but official literature contained no discussion of the policy measures appropriate to achieve the objective. Strong played the leading role in the system's relations with other countries, promoting credit arrangements with countries that returned to the gold standard during the 1920s. From Strong's standpoint, easing measures in 1927 served two purposes: overcoming slack business conditions, despite his concern about speculation in the stock market; and helping to strengthen European exchange rates. Recession in the United States reached a trough in November 1927, and European exchange rates strengthened. Wicker (1965, p. 343) disputes that stabilization through skilful open market operations was Strong's objective. He contends that open market purchases in 1924 and 1927 were intended to reduce US interest rates relative to Britain's to encourage a gold flow to London. 
According to Wicker, for Strong it was through the restoration of the world gold standard that stabilization of national economies would automatically occur. Wicker concludes, "The error of assigning too much weight to domestic stability as a major determinant of monetary policy has arisen ... out of a faulty and inadequate account of the nature of Benjamin Strong's influence on open market policy and a tendency to exaggerate the extent to which some Federal Reserve officials understood the use of open market policy to counteract domestic instability".

Wheelock (1991) models econometrically the Federal Reserve's open market policy from 1924 to 1929 with alternative explanatory variables. His results confirm that the Federal Reserve attempted "to limit fluctuations in economic activity, to control stock market speculation, and to assist Great Britain retain gold" (p. 29). He is unable, however, to discriminate between Wicker's approach and that of Friedman and Schwartz.

Toma (1997) disputes the Friedman and Schwartz view that the Federal Reserve in the 1920s discovered how to use open market policy to fine tune the economy and that those years were the high tide of the system. He contends that the system had no such stabilization powers. Open market purchases tend to reduce the volume of discounting and open market sales to increase it - the so-called scissors effect - that
Adolph Miller had earlier mentioned, for reasons different from Toma's. For Toma the private banking system eliminated any lasting effect these operations might have had on Federal Reserve credit (p. 80), and the relative stability of the 1920s cannot be attributed to fine tuning by the Federal Reserve (p. 87). In his view the stability is associated with monetary restraint that competitive open market operations of profit-seeking reserve banks induced. The period of the 1920s, for him, was "one of reserve bank competition interrupted by occasional episodes of coordination" (p. 73). Toma also contends that the Federal Reserve did not use centralized open market operations to smooth interest rates during the 1920s (p. 80). He reports that seasonal behavior of Federal credit during 1922-1928 was "driven by the demands of the private banking system (i.e., discount loans and bankers' acceptances) rather than by open market operations".

Two fallacies undermine Toma's positions. He treats the scissors effect as if it were a one-to-one offset of open market operations. The inverse relation between borrowing and open market operations was hardly that close [Meltzer (1995b), ch. 5]. In addition, Toma's insistence that open market operations continued to be decentralized after the OMIC was established is incorrect. His portrayal of the system in a public choice framework seems far removed from the facts.

3.3.2. The Great Depression of 1929-1933
No period of Federal Reserve history has elicited as much discussion as the four years of economic collapse that began in August 1929 and ended in March 1933. Since our subject is monetary regimes, we exclude the view that the contraction can be explained by real business cycle theory [Prescott (1996)] 11. Instead we deal with issues on which opinions are divided among students of the period for whom monetary policy is the central focus. There are six principal issues: (1) Was there a significant change in Federal Reserve conduct of monetary policy between 1923-1929 and 1929-1933? (2) Were bank failures a significant contributor to the economic collapse? (3) How was the monetary collapse transmitted to the real economy? (4) Did the stock market crash in October 1929 play an important role in initiating the economic decline? (5) Had the Federal
11 Bernanke and Carey (1996) note that "any purely real theory" (p. 880) is unable to give a plausible explanation of the strong inverse relationship they find (across a panel of countries over the period 1931-1936) between output and real wages, and of their finding that countries that adhered to the gold standard typically had low output and high real wages, while countries that left the gold standard early had high output and low real wages. The dominant source of variation between the two sets of countries was differences in money stocks and hence in levels of aggregate demand. Another view attributes the severity of the Great Depression to the collapse of world trade following the passage of the Smoot-Hawley tariff in 1930 [Meltzer (1977), Crucini and Kahn (1996)]. The importance of the tariff act and the retaliation it provoked are minimized as an important cause of the downturn in Eichengreen (1989) and Irwin (1996).
Reserve not allowed the money stock to decline, would the depression have been attenuated? (6) Did gold standard policies transmit the depression to the rest of the world?

3.3.2.1. Policy continuity?

Friedman and Schwartz (1963) maintain that during the
1920s the Federal Reserve responded effectively to fluctuations in economic activity, but during the depression it did not. They attribute the change to the death of Benjamin Strong in 1928. It removed from the scene the dominant figure in the system who had the best understanding of its capabilities. No one with equal authority replaced Strong. Power within the system shifted from him to a leaderless conference of reserve bank governors and a board that had no stature.

Challenges to the foregoing position have been mounted by Wicker (1965), Brunner and Meltzer (1968), Temin (1989), Wheelock (1991), and Meltzer (1995b). They find no shift in Federal Reserve performance between the Strong years and the depression years. For Wicker, who believes international considerations dominated open market operations in the 1920s, the reason the Federal Reserve saw no need for action in 1929-1931 was that those years posed no threat to the gold standard. When Britain abandoned the gold standard in 1931, however, the system raised discount rates in order to maintain convertibility. It was acting on the consistent principle that domestic stability was subordinate to the gold standard. Temin agrees with Wicker.

Brunner and Meltzer (1968, p. 341) do not accept the argument for continuity based on the primacy of international considerations. Rather, they trace the continuity to the Federal Reserve's mistaken monetary policy strategy, which they assert has been an unchanging characteristic of its performance. For the system, a low level of nominal interest rates and of member bank borrowing are indicators of monetary ease, a high level, of monetary tightness. In 1924 and 1927, interest rates and bank borrowing had declined only moderately, hence they indicated relative monetary tightness, justifying open market purchases.
During the depression years, since interest rates and member bank borrowing were at exceptionally low levels, they signified to the Federal Reserve that there was monetary ease and that injections of reserves were unneeded. Based on regression estimates of the demand for borrowed reserves for all member banks in the New York reserve district and for weekly reporting member banks in New York City, Wheelock (1991) also finds Federal Reserve behavior largely consistent throughout the 1920s and the depression. Meltzer (1995b, ch. 5) disagrees with the view that, had Strong lived, policies would have differed. He describes Strong as an ardent upholder of market interest rates and borrowing as the main indicators of monetary policy. Since Strong approved of the deflationary policy of 1920-1921, he sees no reason to believe that Strong would have opposed deflation from 1929 to 1931. Meltzer notes that, while the real bills doctrine and member bank borrowing as policy indicator were the prevailing principles of Federal Reserve officials, and some so-called liquidationists supported a more deflationary policy, support for expansionary policy was at best a future possibility, "much of the time" not under consideration
during the depression years. For Friedman and Schwartz, Strong was not a slavish follower of the prevailing principles of the Federal Reserve, and there is enough evidence in his speeches and Congressional testimony to suggest that he would not have passively observed cataclysmic economic decline without championing policies he knew had succeeded in 1924 and 1927. Hetzel (1985) provides the evidence on Strong's views. The expansionist position taken by the New York reserve bank during 1930 is also persuasive evidence that policy would have been different had Strong then been alive.

3.3.2.2. Banking panics

Friedman and Schwartz (1963) identified four banking panics between October 1930 and March 1933, and found them largely responsible for the steep contraction in the stock of money that took place. A bank failure not only eliminated its deposits from the money stock, but also diminished the public's confidence in other banks, with the result that holding currency became preferable to holding a bank's liabilities. A withdrawal of deposits in the form of currency reduced bank reserves. Given Federal Reserve policy to hold back on the provision of reserves, both the deposit-currency and the deposit-reserve ratios declined, contributing far more to the decline in the money stock than did a bank failure.

Recent research on banking panics has centered on whether it is accurate to designate the cluster of bank failures in November 1930-January 1931 as the first banking panic, as Friedman and Schwartz do; the geographical boundaries of each of the panics and whether they had a national impact; whether periods additional to those Friedman and Schwartz designated qualify as bona fide panics; whether panics during the Great Depression differed from pre-1914 examples; and the causes of bank suspensions.

Wicker (1996) is the author of the most substantial empirical work on the microeconomic level of banking panics during the Great Depression.
He has combed old newspaper files to learn the names and locations of failed banks, and compiled data on currency inflows and outflows by Federal Reserve districts to track the fallout from a concentration of bank failures in panic subperiods. Controversy over the validity of the assignment by Friedman and Schwartz of special significance to the failure of the Bank of United States in December 1930 dates back to Temin (1976). He asserted that failures were induced by the decline in agricultural income and in the prices of relatively risky long-term securities held by banks, and that the failure of the Bank of United States did not precipitate a liquidity crisis. In his Lionel Robbins lecture [Temin (1989)], he repeated the view that the first banking panic was a minor event. White (1984), who found that balance sheets of failed banks in the 1930s did not differ from those of the 1920s, denied the characterization by Friedman and Schwartz of bank failures in 1930 as cases of illiquidity, unlike pre-1914 cases of insolvency. White overlooks the fact that runs on banks in distress in the 1920s were rare [Schwartz (1988)], but in the 1930s were common. Wicker (1980) called attention to the omission by Friedman and Schwartz of the failure in November 1930 of Caldwell and Company, the largest investment banking house in the South, that led to runs on 120 banks in
four states. He concludes [Wicker (1996), p. 32] that, on the evidence of Temin, White, and his own research, "the 1930 crisis was a region specific crisis without noticeable national economic effects". He believes the second crisis, from April to August 1931, was perhaps also region specific, without clearly identifiable national effects (p. 18). Wicker also identifies a fifth panic in June 1932 in the city of Chicago, comparable in severity, he says, to the 1930 panic. However, the deposit-currency ratio and the money stock, which are national in coverage, fell in a way that unmistakably records the incidence of the first two banking crises. Regional disparities are not incompatible with national effects. As for the absence of noticeable national economic effects, does Wicker suggest that economic activity did not deteriorate between October 1930 and August 1931?

Some attention has been given to the question whether banks that fail during panics are in the main illiquid or insolvent. Calomiris and Gorton (1991) find the answer depends on which of two rival theories applies. The random withdrawal theory associates bank suspensions with illiquidity induced by contagion of fear. The asymmetric information theory associates suspensions with insolvency due to malfeasance. Saunders and Wilson (1993) found contagion effects in a sample of national banks over 1930-1932, but did not examine panic and nonpanic months separately. Wicker also notes contagion effects in the Caldwell collapse in November 1930.

Wicker (1996) highlights a difference between pre-1914 and Great Depression panics. In the former the New York money market was the center of the crisis. In 1930 and 1931, however, the crisis originated in the interior of the country, with minimal central money market involvement. Wicker credits the Federal Reserve with this result: "there were no spikes in the call money rate or other short-term interest rates" (p. 23).
However, he faults the Federal Reserve for not attempting to restore depositor confidence through open market purchases.
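The money-stock arithmetic that underlies these accounts of the panics can be made concrete. In the Friedman-Schwartz framework the money stock M is determined by high-powered money H together with the deposit-currency ratio D/C and the deposit-reserve ratio D/R. The sketch below, in Python with hypothetical ratio values chosen purely for illustration (they are not the historical magnitudes), shows how a panic that lowers both ratios contracts M even when H is unchanged:

```python
def money_stock(high_powered, deposit_currency, deposit_reserve):
    """Friedman-Schwartz money supply identity:
    M = H * (D/R) * (1 + D/C) / ((D/R) + (D/C)),
    where D/C is the deposit-currency ratio and D/R the deposit-reserve ratio."""
    dc, dr = deposit_currency, deposit_reserve
    return high_powered * dr * (1 + dc) / (dr + dc)

# Hypothetical ratios before and after a panic; H held fixed at 7.0.
m_pre = money_stock(7.0, deposit_currency=11.0, deposit_reserve=13.0)   # 45.5
m_post = money_stock(7.0, deposit_currency=6.0, deposit_reserve=9.0)    # 29.4
```

Because the two ratios enter the identity jointly, their simultaneous decline reduces the money stock far more than the direct elimination of failed banks' deposits, which is the point of the Friedman-Schwartz account summarized above.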
3.3.2.3. Transmission of the monetary collapse to the real economy

The literature
on the propagation of the depression takes two different approaches. One stresses real wage and price rigidity as the propagator [on price setting in product markets and wage setting in labor markets, see Gordon (1990)]. The other approach stresses the consequences of price deflation, whether anticipated or unanticipated. The disruption of the process of financial intermediation owing to bank failures has also been studied as a nonmonetary link to output decline.

O'Brien (1989) provides empirical evidence on nominal wage rigidity in the late 1920s and thereafter. Manufacturing firms became convinced following the 1920-1922 steep wage and price decline that maintaining wage rates during a downturn was necessary if precipitous sales declines were to be avoided. They did so collectively and voluntarily. The puzzle is why firms adhered to the policy once the severity of the sales decline in 1929-1931 became evident. It took until the fall of 1931 for many firms to decide that wage cuts would not have adverse consequences for productivity.
Based on data for 22 countries, 1929-1936, Bernanke and Carey (1996) assess empirically whether slow adjustment of nominal wages was an important factor in the depression. They found a strong inverse relationship between output and real wages. They do not offer their own explanation of the failure of wages and other costs to fall along with prices that thus contributed to the rise in unemployment and the decline in sales. They cite conjectures by other researchers of coordination failures or politicization of wage and price setting as possible explanations 12.

The issue whether the price deflation during the Great Depression was anticipated or not is important for choosing between the debt deflation hypothesis and high ex ante real interest rates as the explanation for the severity of the Great Depression. According to the debt deflation hypothesis, unanticipated deflation increases the real burden of nominal debt, curtails expenditures, and makes it more difficult for borrowers to repay bank loans. As a result bank balance sheets deteriorate, and banks ultimately may fail. Financial intermediation is reduced, with negative effects on economic activity. However, if deflation was anticipated, the explanation for the severity of the Great Depression turns on a collapse of consumption and investment expenditures driven by high real interest rates.

No conclusive evidence can be cited in support of deflation as either unanticipated or anticipated. Research findings diverge. Barsky (1987) and Cecchetti (1992) concluded that simple time series models predicted price changes. An opposite conclusion was reached by Dominguez, Fair and Shapiro (1988), on the basis of forecasts from VAR models using data ending at various dates between September 1929 and June 1930.
Hamilton (1987, 1992) links unanticipated deflation to the Federal Reserve's tight monetary policy in 1928, and shows that deflation was not anticipated in selected commodities markets for which he examined the relationship between spot and futures prices. Nelson (1991) found in reviewing the contemporary business press that there was some expectation that prices would decline but not the degree or the duration of the decline. Evans and Wachtel (1993) construct a test using data on inflation and interest rates that suggests that time series forecasts of price change, such as Cecchetti reported, are not accurate representations of what people expected prices would be. The prospect of future policy changes or knowledge of past changes of policy made them highly uncertain about the future behavior of prices. They expected little of the deflation that actually occurred. Evans and Wachtel indicate that, in 1930-1933, with anticipated deflation of no more than 2 percent and nominal interest rates ranging between 5 and
12 Bordo, Erceg and Evans (1997) simulate over the interwar period an equilibrium model of the business cycle with sticky wages embodied in Fischer (1977) and Taylor (1980) staggered contracts. They show that monetary contraction closely replicates the downturn in output until early 1933. Thereafter, their monetary model produces a much faster recovery than actually occurred. Other forces, such as Roosevelt's NIRA policy [Weinstein (1981)] and technology shocks, may be important in accounting for the recovery.
1 percent, the ex ante real rate of interest was unlikely to have exceeded 7 percent and was probably much smaller. The foregoing studies focus on the United States. Bernanke and James (1991) in an examination of the experience of 24 countries find that the extent of the worldwide deflation was less than fully anticipated in view of two facts: the nominal interest rate floor was not binding in the deflating countries, and nominal returns on safe assets were similar whether countries did or did not remain on the gold standard. The issue whether price deflation during the Great Depression was anticipated or unanticipated is still unresolved. Another nonmonetary channel that served to propagate the depression has also been studied. Bernanke (1983) introduced the decline in financial intermediation as a nonmonetary shock, operating as an independent force in producing real economic decline in the 1930s. The disruption of financial markets as a result of the reduction in banks' ability to lend engendered a fall in the net worth of households and firms holding nominally fixed debt. The ensuing debt crisis became an important propagator of economic contraction, increasing the number of bankruptcies [see also Bernanke (1995), Bernanke and Gertler (1989), Calomiris (1993)]. Brunner and Meltzer (1968, 1993) accept Bernanke's emphasis on the importance of the credit market in the transmission of shocks but not his treatment of it and the debt crisis as a separate and independent exogenous shock. They regard it as an induced response to the monetary authorities' failure to counter deflation.
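The real-rate bound in the Evans and Wachtel discussion follows from the Fisher relation. A minimal check, in Python, treating the figures in the text (nominal rates between 5 and 1 percent, anticipated deflation of no more than 2 percent) as illustrative inputs:

```python
def ex_ante_real_rate(nominal, expected_inflation):
    """Fisher relation: r = (1 + i) / (1 + pi_e) - 1,
    where i is the nominal rate and pi_e expected inflation
    (negative pi_e means expected deflation)."""
    return (1 + nominal) / (1 + expected_inflation) - 1

# Anticipated deflation of 2 percent (pi_e = -0.02) combined with nominal
# rates of 5 percent and 1 percent brackets the ex ante real rate:
r_upper = ex_ante_real_rate(0.05, -0.02)  # about 0.071
r_lower = ex_ante_real_rate(0.01, -0.02)  # about 0.031
```

With anticipated deflation capped at 2 percent, the ex ante real rate cannot much exceed 7 percent even at the top of the nominal-rate range, which is the arithmetic behind the Evans-Wachtel conclusion.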
3.3.2.4. The October 1929 stock market crash

The Dow Jones Industrial Index was between 300 and 320 during the first half of 1929 until the end of June, when it stood at 333. It climbed during the following months and peaked at 381 on 2 September. By the end of September, the index had fallen to 343. On 23 October stock prices dropped to 305. The crash came on 24 October, "Black Thursday". By 6 November the index was down to 231. A week later the index had fallen to 199. This was the low following the crash [Wigmore (1985), pp. 4-26 and Table A-19].

It is commonly believed that the stock market crash reduced the willingness of consumers to spend. It is said to have caused "a collapse in domestic consumption spending" [Romer (1993), p. 29] because it created uncertainty, decreased wealth and reduced the liquidity of households' balance sheets [Mishkin (1978)]. Temin (1976) specifically rejects an explanation of the fall in consumption as reflecting the effect on wealth of the stock market crash, on the ground that the wealth effect was too small. He regards the fall as autonomous and unexplained. Yet econometric evidence in support of this proposition is far from convincing. In her recent paper Romer bases her regressions on her intuition that stock market variability made people temporarily uncertain about the level of their future income and thus caused them to postpone durable goods purchases and to increase spending on nondurables. Her model predicts a greater fall in 1930 in durables than actually occurred, does not predict the slight fall in perishables, and overpredicts a rise in semidurables.
184
M.D. Bordo and A.J. Schwartz
Romer goes on to examine the estimated effect of stock market variability following the October 1987 crash and suggests that uncertainty was both more severe and more persistent in 1929-1930 than in 1987-1988, and that this explains why consumers began spending again in 1988 while they continued to defer purchases of durable goods in 1930. A key difference that Romer does not note is that the stock of money grew 4.9 percent (M1; 5.5 percent, M2) in the year following the 1987 crash. A policy issue that has not been addressed in recent research on the 1929 stock market crash is whether the Federal Reserve then should have made itself an "arbiter of security speculation" (in the words of the press statement released by the board on 9 February 1929). The board wrangled with the reserve banks by insisting that moral suasion rather than raising the discount rate would curb speculation. In the end the discount rate was raised. It broke the bull market but also sacrificed stable economic growth. The question of the system's responsibility for stock market valuations applies not only to 1929 but to 1987 and 1997. 3.3.2.5. Would stable money have attenuated the depression? McCallum (1990) showed that his base rule (with feedback) would have avoided the severe decline in nominal income that occurred between 1929 and 1933. Following McCallum's methodology of using an empirical model of the economy based on interwar data to examine how a counterfactual policy would have performed, Bordo, Choudhri and Schwartz (1995) considered two variants of Milton Friedman's constant money growth rule and estimated separate relations for output and the price level. Basic simulations of both variants yielded results consistent with claims that, had a stable money policy been followed, the depression would have been mitigated and shortened.
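McCallum's base rule with feedback is often written in the following quarterly form (our rendering from McCallum's work; the numerical coefficients are McCallum's standard illustrative choices, not values reported in this chapter):

```latex
\Delta b_t = 0.00739
  \;-\; \tfrac{1}{16}\bigl[(x_{t-1}-b_{t-1}) - (x_{t-17}-b_{t-17})\bigr]
  \;+\; \lambda\,(x^{*}_{t-1} - x_{t-1})
```

Here $b_t$ is the log of the monetary base, $x_t$ the log of nominal GNP, and $x^{*}_t$ a target path for log nominal GNP growing at 3 percent per year (0.00739 per quarter). The middle term subtracts the average growth of base velocity over the preceding four years, and $\lambda > 0$ (e.g., 0.25) is the feedback coefficient; Friedman's constant money growth (k percent) rule corresponds to dropping both correction terms.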
The view that a k percent rule (constant money growth rule) is suboptimal [Eichenbaum (1992)] compares economic performance under constant money growth with alternative rules or discretion that yield a superior outcome. Focus on the constant money growth policy relative to actual performance during the depression shows that it was clearly preferable. 3.3.2.6. Gold standard policies in transmitting the Great Depression. Recent research gives the gold standard a major role in the causation and transmission of the depression, but assigns no special significance to US monetary policy, although Bernanke and James (1991) note that US panics may have contributed to the severity of the world deflation. They stress the close connection between deflation and nations' adherence to the gold standard, but find the case for nominal wage stickiness or real interest rates as transmission mechanisms dubious. They favor financial crises as the mechanism by which deflation can induce depression. Another view [Temin (1989, 1993)] is that gold standard ideology, which accorded external balance more weight than internal balance, produced the transmission, with financial crises constituting another transmission channel. According to Temin (1989, p. 84), dealing only with the United States, it is hard to explain how the initial downturn
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
185
was spread and intensified to produce three or four years of contraction, much less the international propagation mechanism 13. The operation of the gold standard in interwar years was impaired by forced contraction in countries losing gold without producing expansion in countries gaining gold [Eichengreen (1992)]. Instead of gold standard ideology, Meltzer (1995b) emphasizes the hold of the belief that there had been a speculative situation between 1921 and 1929; he asks (1995b, ch. 5) why deficit countries chose to deflate rather than suspend convertibility, which happened many times in the 19th century. His answer is that policy makers in many of these countries believed that deflation was the corrective needed in response to previous speculative excesses. What was paramount in their minds was not so much the gold standard imperative as it was the real bills doctrine. Similarly, with respect to Federal Reserve failure to purchase government securities in 1930 and most of 1931, when the system's reserve ratio was generally twice the required ratio, and subsequently when the "free gold problem" 14 was alleged to prevent such action, the explanation for Meltzer was the real bills doctrine, the belief that deflation was exacted by earlier speculative credit expansion. The board could have suspended reserve requirements in 1932-1933 rather than compel intensified contraction, but did not 15. Meltzer's perspective suggests that it was not an unyielding commitment to the gold standard that enforced deflation on the world. It was the failure of policy makers to exercise temporary release from the commitment, which was a well-established feature of the gold standard, in response to an internal or external drain [Bordo and Kydland (1995)]. And the failure can be traced to the hold of the real bills doctrine and unawareness of the distinction between nominal and real interest rates.
A subject that needs to be explored is whether it is true that expansionary monetary policy by the Federal Reserve would have been futile because it would have aroused suspicion that the United States intended to leave the gold standard, and consequently resulted in gold losses. For two reasons this scenario is hard to credit. In the first place,
13 A response to this view was made by Haberler (1976, p. 8): Given the dominant position of the US economy and the monetary arrangements and policy maxims of the time - fixed exchanges under the new gold standard - the depression that came about in the United States was bound to spread to the four corners of the world. This does not mean that there were no other focal points of depression elsewhere in the world, for example in Central Europe; but the American infection clearly was the most virulent and the United States was in the strongest position to stop the slide. 14 Eichengreen (1992) argues that low free gold reserves prevented the system from conducting expansionary policy after 1931. Friedman and Schwartz (1963) and Meltzer (1995b, Ch. 5) regard free gold reserves as a pretext for the system's inaction that is explained by totally different reasons. 15 On 3 March 1933, when the New York reserve bank's reserve percentage fell below its legal limit, the board suspended reserve requirements for thirty days, too late to alter the imminent collapse of the system.
it does not acknowledge the enormous size of US gold reserves. In February 1933, when there was both an internal and external drain, reflecting lack of confidence in Roosevelt's commitment to gold, the gold loss was $263 million. Gold reserves of $4 billion remained. In the second place, had expansionary monetary policy been in place, it would have stabilized the money supply and propped up the banking system. A quantitative estimate of the gold loss coefficient under these conditions, we conjecture, would reveal it to be modest in size, and would dispose of the argument that the possibility of expansionary monetary policy was illusory.
3.3.3. 1933-1941
The passivity of the Federal Reserve during the depression continued after it ended but under wholly different circumstances. New Deal institutional changes transformed monetary policy. Institutional changes that enhanced the authority of the board at the expense of the reserve banks ironically were the setting in which the Federal Reserve was overshadowed by the Treasury. The Treasury became the active monetary authority, while the Federal Reserve was passive. The main source of growth in the base was gold imports, which surged as foreigners took advantage of the steadily rising price of gold in 1933, a price the Gold Reserve Act of 1934 fixed at $35 an ounce. When the Treasury bought gold, it paid with a check at a reserve bank, which increased member bank reserves. The Treasury could print a corresponding amount of gold certificates, which it could deposit at the reserve bank to restore its deposits. These transactions accounted for the major movements in the monetary base. However, as a result of the gold sterilization program the Treasury adopted in December 1936, in the first nine months of 1937 the monetary base did not reflect the growth of the gold stock. During that period, the Treasury paid for the gold it bought by borrowing rather than by using the cash balances it could create on the basis of the gold. This was similar to sterilization by the Federal Reserve in the 1920s, when it sold government securities to offset the effect on the monetary base of gold inflows. The difference was that in the 1930s the Treasury rather than the Federal Reserve sold the bonds and took the initiative in sterilizing gold. The Treasury's gold sterilization program became effective at a time when the Federal Reserve undertook its first monetary policy action since the New Deal was in place. The sharp rise in member bank excess reserves beginning after the banking panic of 1933 was seen as raising dangers of future inflation.
Sales of securities would have been desirable but for the need for adequate earnings. The system's room for maneuver was further limited by the political context within which it had to operate, since the Treasury could nullify anything it wished to do. The one option the Federal Reserve thought it had was to reduce excess reserves by exercising the power to double reserve requirements that the Banking Act of 1935 gave it. It did so in three steps between
August 1936 and May 1937. Given the banks' demand for prudential reserves, the action backfired and led to recession. Reserve requirements were not reduced until April 1938, to a level that eliminated one-quarter of the combined effect of the earlier rises. A start toward Treasury desterilization was made in September 1937, when the board requested the Treasury to release $300 million from the inactive gold account. The board itself, of course, could have achieved the economic equivalent by buying $300 million of government securities. On 19 April 1938 the Treasury discontinued the inactive gold account. Romer (1992) highlights money growth in stimulating real output growth between 1933 and 1942. Three other studies examine Federal Reserve behavior during those years: Eichengreen and Garber (1991), Calomiris and Wheelock (1998), and Toma (1997). Eichengreen and Garber regard monetary policy in 1933-1940 as foreshadowing wartime practices. The Federal Reserve acceded to Treasury requests in 1935 to moderate the rise in interest rates, and it purchased long-term government bonds for the first time in its history. In April 1937, after the second increase in reserve requirements, the Federal Reserve again bought government bonds to moderate interest rate rises, acknowledging in 1938 its responsibility for "orderly conditions in the government securities market". The reason it did so, according to Eichengreen and Garber, was that changes in bond prices might endanger financial and economic security. Calomiris and Wheelock attribute the Treasury's dominance to the increase in its resources generated by gold and silver purchase programs which enabled it to alter bank reserve positions and to intervene directly in financial markets. In fact, the Treasury always had these powers. It was the New Deal political environment which was hospitable to their use. That had not been the case in preceding administrations.
A shift in the focus of monetary policy away from markets for commercial paper and bankers acceptances and toward the market for government securities seems to Calomiris and Wheelock less a result of economic conditions than of Administration pressure. With the gold standard constraint absent and Federal Reserve independence diminished, monetary policy was free to monetize government debt, Calomiris and Wheelock conclude. Of course, it was the continued growth of the monetary gold stock that freed the Federal Reserve from the gold reserve constraint, not the absence of a legal gold standard constraint. In Toma's (1997) interpretation of the New Deal period, the government's financing requirements took center stage and induced changes in monetary institutions. In his view, New Deal legislation increased the seigniorage capacity of the monetary sector and fundamentally changed the Treasury's monetary authority. The Treasury took possession of the monetary gold stock, and with the allowance for change in the dollar price of gold (the President could fix the weight of the gold dollar at any level between 50 and 60 percent of its prior legal weight; he specified 59.06 percent, which raised the official gold price from $20.67 to $35 an ounce), a long-run constraint on the government's monetary powers was relaxed. A positive probability of
future upward revaluation of the official gold price created the opportunity for future Treasury profits. The Treasury had money-creating powers equal to those of the Federal Reserve. Neither the Federal Reserve nor the Treasury had to share with each other revenue from money creation. After 1933 the Federal Reserve could keep all its earnings and make no transfers to the Treasury. And only the Treasury benefited from gold inflows since the gold certificates the Federal Reserve received did not give it legal title to the gold. Toma explains the Federal Reserve's constant credit policy as a way of assigning monopoly rights to the Treasury as the money producer. The Treasury happened to be the least cost producer; it could provide the government's seigniorage requirement by the increase in the monetary base that was equal to or less than the value of gold inflows. In effect, the Federal Reserve paid the Treasury for the right to operate by forgoing its role as money producer. The doubling of reserve requirements, on Toma's interpretation, occurred because of an increase in the government's financing needs. The legislative authorization of flexibility in reserve requirements provided not only for the government's needs but also for the Federal Reserve's earnings objective. Had reserve requirements not been increased, the government's seigniorage revenue would have been lower, and income tax rates would have been higher, damaging real economic activity. Higher reserve requirements imposed costs on retail banks, so policy makers established federal deposit insurance as one way to moderate adverse stability implications for the financial system. Toma's version of events does not square with the record. The Federal Reserve was concerned with its own earnings needs, not with maximizing the government's seigniorage revenue.
The reserve requirement increases led to government securities sales by member banks that raised interest rates for the Treasury, hardly the optimal principal agent relationship. Toma's linkage of the passage of federal deposit insurance with the reserve requirement increases rewrites the history of that act, which was a response to depression bank failures. 3.4. Bretton Woods, 1946-1971 3.4.1. 1946-1951
As in World War I, Federal Reserve credit outstanding rather than gold accounted for the increase in the monetary base during World War II. The Federal Reserve again became the bond-selling window of the Treasury and used its powers almost entirely for that purpose. After World War II ended, as after World War I, the system continued the wartime policy of providing the reserves demanded at a fixed cost: through supporting the price of government securities at unchanged levels. During the immediate postwar period and for some time thereafter, the Federal Reserve did not question the desirability of supporting the price of government
obligations. On 10 July 1947, however, the posted 3/8 of 1 percent buying rate on Treasury bills and the repurchase option granted to sellers of bills were terminated. The Treasury, which had been reluctant to see any change in the pattern of rates, was reported to have consented to the rise in interest costs on its short-term debt owing to the offset created by the adoption on 23 April 1947 by the system of a policy of paying into the Treasury approximately 90 percent of the net earnings of the reserve banks. The next step in the program of raising the support rates somewhat was the sharp narrowing of the difference between short and long rates as a result of a rise in rates on bills and certificates. This led to a shift to short-term securities by individual holders and to a reverse shift by the Federal Reserve. The $5 billion of bonds the system bought was offset by a reduction of some $6 billion in its holdings of short-term securities, so there was monetary contraction in 1948. The contraction was not, however, recognized, and fears of inflation prevailed even as inflationary pressure was in fact waning. Banks were urged to avoid making nonessential loans, discount rates were raised to 1.5 percent in 1948, reserve requirements were raised in September after Congress authorized a temporary increase in the legal maximum, and consumer credit controls were reinstated. The system was slow in reacting to the cyclical decline that began in November 1948. Not until March-April 1949 were credit controls eased. Between May and September, six successive reductions were made in reserve requirements. In June the system announced that it would not seek to prevent bond prices from rising. For the time being, the system regained some control over its credit outstanding.
After the final reduction in reserve requirements in September 1949, the system held outstanding credit roughly constant for the balance of the year and early 1950, and hence refrained from offsetting the expansionary influence of the released reserves. The outbreak of the Korean War in June 1950 unleashed a speculative boom. The accompanying rise in interest rates pushed up yields to levels at which the Federal Reserve was committed to support government security prices. Concern grew that the support program would become the engine for an uncontrollable expansion of the money stock. The system's desire to be freed from this commitment was, however, accomplished only after protracted negotiations with the President and the Treasury, which was fearful of losing the advantage of a ready residual buyer of government securities and of low interest rates. In March 1951 an agreement with the Treasury was finally reached, relieving the system of responsibility for supporting the government security market at pegged prices. Eichengreen and Garber (1991) contend that the existing literature lacks a formal analysis of why investors were willing to hold Treasury securities at low interest rates in the 1940s, and why this willingness disappeared at the end of the decade. They build on the explanation by Friedman and Schwartz (1963) that expectations of deflation after the war induced the public to hold higher amounts of liquid assets than they otherwise would, and that expectations of inflation after 1948 induced the public to hold smaller amounts of liquid assets than they otherwise would. In 1946-1948, the implication of the target zone approach that they adopt is that the 1948 increases in reserve
requirements and the 1949 bond sales by the Federal Reserve can be thought of as keeping the price level below the upper bound. Bank liquidity declined, and inflationary pressure subsided. Eventually the Federal Reserve reduced reserve requirements as if the price level were approaching the lower bound of the implicit price zone, and by the end of 1949 M1 began to rise. Interest rates rose with inflationary expectations and the cap on interest rates became inconsistent with Korean War imperatives. That is why the Accord with the Treasury was negotiated, if the Eichengreen and Garber analysis is accepted. A question Eichengreen and Garber pose and answer is why the Federal Reserve was concerned about price and interest rate stability - referring to an interest rate peg, not a target - in the aftermath of World War II and not in other periods. They say it was not the system's subservience to the Treasury's pursuit of low debt-service costs that is the answer. Instead, it was fear that a rise in interest rates would cause capital losses on commercial bank portfolios and undermine the stability of the banking system. Despite the fact that by 1951 the banks' vulnerability to capital losses had been attenuated, the Federal Reserve was still concerned to minimize them, and the Treasury helped by offering at par nonmarketable bonds with 2.75 percent yields in exchange for 2.5 percent long-term bonds marketed in 1945. Toma (1997) disagrees with Eichengreen and Garber that the Federal Reserve adopted the stable interest rate program for financial stability reasons. He assigns the seigniorage motive as the driving force with financial stability as at best a secondary consideration. According to Toma, coordination between the Treasury and the Federal Reserve as the two money producers substituted for the gold standard in limiting monetary growth. It seems to us quixotic, however, to describe wartime inflationary monetary growth as a substitute for the gold standard.
3.4.2. Federal Reserve discretionary regime, 1951-1965
The Treasury-Federal Reserve Accord overthrew the dominance of Treasury financing needs over monetary policy. In 1951, after more than 20 years of depression and war, the Federal Reserve had to formulate the criteria by which it would operate as an independent central bank. At that date the Bretton Woods system was in a formative stage, but under its aegis the US commitment to the convertibility of the dollar into gold initially seemed impregnable. By the end of the 1950s, however, as the gold stock began to decline, preventing gold outflows became a major objective of the Treasury as well as the Federal Reserve. A more immediate criterion for monetary policy than the convertibility principle was that the Federal Reserve should "lean against the wind", by taking restrictive action during periods of economic expansion, and expansionary action during periods of economic contraction. The countercyclical theme in the period ending 1965 was generally described in terms of avoiding either inflation or deflation, but full employment was also accepted as an equally important goal of monetary policy.
The specific operating strategy for implementing "leaning against the wind" that the Federal Reserve adopted was unchanged from its practice in the 1920s [Calomiris and Wheelock (1998)]. It used open market operations to affect the level of discount window borrowing and free reserves - excess reserves minus borrowings. The theory of bank borrowing the Federal Reserve developed was that a change in nonborrowed reserves, i.e., reserves provided by open market operations, forced banks to adjust the amount they borrowed. A tradition at the Federal Reserve against borrowing acted to restrain borrowing, even if it were profitable for banks to do so. According to the theory, when free or net reserves were high, market interest rates tended to fall, and bank credit and the money supply tended to grow. When free reserves were low or negative, i.e., net borrowed reserves, market rates tended to rise, and bank credit and the money supply tended to contract [Brunner and Meltzer (1964)]. Because of this framework, the Federal Reserve has seen itself as exercising a dominant influence on the evolution of short-term market interest rates. In the 1951-1965 period, it targeted the Federal funds rate indirectly by using the discount rate and borrowed reserves target. This is now known as interest rate smoothing, a procedure that was earlier known as free reserves or net borrowed reserves targeting [Goodfriend (1991)]. The intention of indirect targeting is to avoid fluctuations and minimize surprise changes in interest rates. Removing seasonality in interest rates, however, is not the main aspect of smoothing under consideration here. Goodfriend describes the modus operandi of indirect targeting in the 1950s as follows. The Federal Reserve estimated the banks' demand for reserves during a defined period and provided most of the reserves by open market purchases. The balance had to be obtained from the discount window, where borrowing became a privilege not a right.
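The mechanics Goodfriend describes can be sketched in a stylized calculation. The model and all numbers below are our own illustrative assumptions, not the Federal Reserve's actual borrowing function: banks' demand for borrowed reserves is taken to rise linearly with the spread of the funds rate over the discount rate, so a borrowed-reserves target pins down the funds rate only indirectly, and only as precisely as the borrowing function is stable.

```python
# Stylized sketch of 1950s "indirect" Federal funds rate targeting.
# Hypothetical borrowing function: borrowing = b0 + beta * (funds - discount),
# where b0 reflects the "tradition against borrowing" and beta the banks'
# sensitivity to the spread. Rates in percent, reserves in $ billions.

def implied_funds_rate(borrowing_target, discount_rate, b0=0.2, beta=0.8):
    """Funds rate consistent with hitting the borrowed-reserves target
    (invert the borrowing function for the spread). Parameters hypothetical."""
    return discount_rate + (borrowing_target - b0) / beta

# With a stable borrowing function the target works as intended:
f = implied_funds_rate(borrowing_target=0.6, discount_rate=1.5)
print(f)  # 2.0: funds rate ends up 50 basis points over the discount rate

# But if the tradition against borrowing strengthens (b0 falls), the same
# borrowed-reserves target implies a higher funds rate; the market can only
# estimate the range in which the rate should fall.
f_shift = implied_funds_rate(borrowing_target=0.6, discount_rate=1.5, b0=0.1)
print(f_shift)  # 2.125
```

The instability of the demand for borrowed reserves, in other words, is exactly why the indirect procedure could not target the funds rate precisely.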
The Federal Reserve thus targeted borrowed reserves. The amount the banks were willing to borrow, however, depended positively on the spread between the Federal funds rate and the discount rate. Accordingly, the Federal Reserve targeted the Federal funds rate indirectly. Because the demand for borrowed reserves was unstable, it could not target borrowing exactly. In the relation between borrowed reserves and a discount rate-Federal funds rate combination, there was no tight linkage between the Federal funds rate and the discount rate. As a result, the market could not readily determine precisely what the indirect Federal funds rate target was, but it could estimate the range in which the funds rate should fall. Goodfriend's explanation for the Federal Reserve's preference for indirect targeting, even if the result was market misinterpretation of its intention, was that the procedure gave it the option to make changes quietly, keeping target changes out of the headlines. As we shall see, in 1994 it reversed the position it had held for decades and began to announce changes in the Federal funds rate, by that time a directly targeted rate, immediately after an FOMC decision. Capturing headlines did not have the adverse effects on monetary policy the Federal Reserve had for so long claimed would occur. For monetarist criticism of interest rate smoothing one must turn to earlier studies [Brunner and Meltzer (1964), Meigs (1962)]. Essentially, the criticism of interest rate smoothing is that, if the Federal Reserve sets the price of bank reserves and lets
the market determine the quantity demanded, it abdicates control over the quantity. Goodfriend does not pose the normative question whether the procedure is optimal. Poole (1991), the discussant, does. He tries to make the case for the Federal Reserve's implementation of policy through the Federal funds rate rather than through monetary aggregates control, the preferable alternative for him. The smoothing arguments for interest rate control - it smooths the flow of revenue from the inflation tax; it stabilizes unemployment and inflation; it stabilizes rates at all maturities - in Poole's analysis lack substance. The only argument that he finds plausible is the belief that asset prices under the alternative policy of steady money growth could differ significantly from full-employment equilibrium levels and that the Federal Reserve can anchor interest rates at approximately the correct level when the market cannot do as well. Successful central banks, according to Poole, permit short-run fluctuations in monetary growth but adjust money market interest rates as necessary to constrain money aggregates in the long run from growing too fast or too slow. The Federal Reserve's performance since 1992 provides support for Poole's conclusion. Interest rate smoothing by the Federal Reserve during the decade and a half from 1951 did not preclude a low average inflation rate, but it also yielded unstable industrial output, as contemporaries judged it. Whether this outcome could have been avoided had the Federal Reserve's objective been only the price level and not also output is a subject to which we return when we discuss the 1990s. 3.4.3. Breakdown of Bretton Woods, 1965-1971
Money growth accelerated in the early 1960s and persisted through the 1970s. US inflation began to accelerate in 1964, with a pause in 1966-1967, and was not curbed until 1980. An inflationary monetary policy was inappropriate for the key reserve currency in the Bretton Woods system. US balance of payments deficits from the late 1950s threatened a convertibility crisis as outstanding dollar liabilities rose and the monetary gold stock dwindled. To prevent conversion of dollars into gold, the United States and other central banks formed the London Gold Pool in 1961 to peg the price of gold at $35 an ounce, established a network of currency swaps with the other central banks, and issued bonds denominated in foreign currencies. These measures fell short. As long as the link with the dollar was unbroken, US inflation condemned the rest of the world to inflate. The only way to restrain US policy was to convert dollars into gold. French and British intentions to do just that prompted US suspension of gold convertibility in August 1971. Generalized floating of exchange rates followed (see Section 2.4 above). 3.5. Post-Bretton Woods, 1971-1995 3.5.1. 1971-1980
As tenuous as the convertibility obligation had become by the mid-1960s, its absence after the early 1970s totally removed the discipline of convertibility from domestic
monetary policy. The Federal Reserve was freed of commitment to maintain a stable price level. To cope with inflation that they blamed on supply-side shocks or shifts in demand for money, policy makers turned to incomes policy, which soon failed. Peacetime inflationary episodes as a result came to be associated with discretionary monetary policy. The episode from 1965 to 1980 is commonly attributed to the willingness of the Federal Reserve to fund government expenditures for the Vietnam war and Great Society social programs and to the authority's belief that it could exploit short-run Phillips curve tradeoffs. Raising monetary growth to provide employment was consonant with Federal Reserve discretion. When the inflation rate accelerated, the authority became ensnared in a trap it itself had set. Monetarist doctrine had convinced Federal Reserve officials that reducing monetary growth in order to curb inflation would produce a recession. They could not bring themselves to choose that option, because of the political costs. So they permitted continuance of high monetary growth rates and ever-rising inflation until Paul Volcker broke the spell in 1979. Monetary policy in this period, as in earlier ones, was implemented by control over interest rates rather than control over money growth. The dangers of operating with an interest rate instrument became clear when rising interest rates from the mid-1960s on reflected growing fears of inflation, not restrictive monetary policy. Rising interest rates were accompanied by high money growth. In January 1970, in response to criticism of its policymaking, the FOMC for the first time adopted a money growth target. In 1975 Congress passed Joint Congressional Resolution 133 requiring the Federal Reserve to adopt and announce 1-year money growth targets and, in October 1979, the reason for the change in Federal Reserve operating procedures was said to be more precise control of money growth.
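Annual targets of this kind were vulnerable to "base drift": each year's target range was set as a growth rate from the previous year's actual fourth-quarter money stock, so overshoots were built into the next base and never clawed back. A small arithmetic sketch (all numbers hypothetical) shows how modest annual misses cumulate in the level:

```python
# Base drift under annual money growth targets: each year's target is a
# growth rate applied to the previous year's ACTUAL fourth-quarter level,
# so overshoots are built into the next base and never reversed.
# All numbers are hypothetical.

target_growth = 0.05                 # announced 5 percent target each year
actual_growth = [0.08, 0.08, 0.08]   # persistent 8 percent overshoots

m_drift = 100.0                      # money stock with a drifting base
for g in actual_growth:
    m_drift *= 1 + g                 # next year's base = this year's actual level

m_fixed = 100.0 * (1 + target_growth) ** len(actual_growth)  # no-drift path

print(round(m_drift, 1))   # 126.0 (level after three overshoot years)
print(round(m_fixed, 1))   # 115.8 (level had the original path been enforced)
# A 3-point annual miss cumulates to a level gap of about 10 points.
```

The point of the sketch is that under base drift the price-level consequences of target misses are permanent, which is one reason the targets did little to restrain inflation.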
The Federal Reserve announced the target growth range each year on a base equal to the actual level of the money stock in the fourth quarter of the previous year. In the late 1970s, above-target money growth in one year was built into the next year's target, and in 1981, below-target money growth was built into the 1982 target. The Federal Reserve thus permitted base drift, contributing to instability of money growth. These differences between targets and actual money growth were a consequence of the Federal Reserve's policy of maintaining a narrow, short-run target range for the Federal funds rate, unchanged from its operating procedures before monetary growth targets were adopted 16. One change in Federal Reserve operational procedure during the period was its gradual shift during the early 1970s from indirect targeting to direct targeting of the Federal funds rate within a narrow band specified by the FOMC each time it
16 The breakdown, once an aggregate is selected for monetary control, of a stable relationship with prices and nominal income that existed before adoption of the targeted aggregate is said to arise because of financial innovations. The breakdown of the relationship has come to be known as Goodhart's Law [Goodhart (1989)]. It is true that financial innovation does occur and affects the definition of any monetary aggregate and the predictability of its velocity. There is no evidence, however, that links monetary targeting to innovation.
M.D. Bordo and A.J. Schwartz
met [Goodfriend (1991)]. The range within which the rate was allowed to move was commonly 25 basis points. The Federal Reserve managed the rate within the band by open market operations, adding reserves to maintain the rate at the upper bound of the band, subtracting reserves to maintain the rate at the lower bound. A move of the band up or down signaled a change in the target, which the market readily perceived. The financial press usually reported a change the day after the Federal Reserve implemented it [Cook and Hahn (1989)]. To support the current target, the Federal Reserve had to accommodate changes in money demand. It had to supply the level of reserves that would keep the Federal funds rate within the narrow band the FOMC set for periods between meetings. This is another way of explaining how it became an engine of inflation during the second half of the 1970s, given that it had no nominal anchor and that the current target could be too low. If the Federal Reserve was slow to raise the target, or raised it by too little as total nominal spending in the economy rose, rapid money growth resulted, and with it higher inflation. Furthermore, interest rate smoothing could itself be a determinant of the inflation-generating process. Goodfriend (1987) shows that rate smoothing with a price level objective induces a nontrend-stationary process for the money stock and the price level, contributing to drift in both. Interest rate smoothing increases both the price level forecast error variance and the variability of expected inflation. So interest rate smoothing tends to create macroeconomic instability 17.
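Goodfriend's nonstationarity point can be illustrated with a textbook calculation (our stylized example, not his model): the h-step-ahead forecast-error variance of a random-walk price level grows without bound in h, while that of a trend-stationary process is bounded, which is the sense in which drift makes the long-run price level increasingly unpredictable.

```python
# Stylized illustration: forecast-error variances for a random-walk (drifting)
# price level versus a trend-stationary AR(1) price level. The persistence
# parameter phi and the innovation variance sigma2 are illustrative assumptions.

def fev_random_walk(h, sigma2=1.0):
    """h-step-ahead forecast-error variance of a random walk: h * sigma2."""
    return h * sigma2

def fev_ar1(h, phi=0.9, sigma2=1.0):
    """h-step-ahead forecast-error variance of a stationary AR(1):
    sigma2 * (1 - phi^(2h)) / (1 - phi^2), bounded above by sigma2 / (1 - phi^2)."""
    return sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)

for h in (1, 10, 50):
    print(f"h={h:2d}  random walk: {fev_random_walk(h):5.1f}   AR(1): {fev_ar1(h):4.2f}")
```

At h = 1 the two coincide; as the horizon lengthens the random-walk variance grows linearly while the AR(1) variance levels off near sigma2/(1 - phi^2).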
3.5.2. Shifting the focus of monetary policy, 1980-1995 In the period following the inflation episode of 1965-1980, operating procedures at the Federal Reserve underwent modifications. The adoption by the FOMC on 6 October 1979 of targeting on nonborrowed reserves in place of direct Federal funds rate targeting represented an admission that earlier interest rate smoothing had failed to provide noninflationary monetary growth. The new procedure was designed to supply banks with the average level of total reserves that would produce the rate of monetary growth the FOMC desired over the period from a month before a meeting to some future month, without regard for the accompanying possible movement of the Federal funds rate outside a widened range of 400 basis points.
17 An empirical study of UK monetary policy, 1976-1985, by Bordo, Choudhri and Schwartz (1990) suggests that rate smoothing by the Bank of England allowed money stock base drift to reduce the predictability of the trend price level. Had the Bank of England followed a trend-stationary money supply rule, it would have reduced the variance of the trend in prices by more than one-half. Ireland (1993) extends this analysis to the US case. He shows that the Friedman rule would have reduced long-run price uncertainty by 82 percent over the 1915-1990 period.
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
At each FOMC meeting a decision was made not only about the desired growth rate of M1 and M2 but also about the average level of borrowed reserves that it was assumed the banks would desire over the intermeeting period. The staff then estimated a weekly total reserves path from which it subtracted the borrowing assumption to arrive at a nonborrowed reserves path on which the Open Market Desk targeted open market purchases. It sought to keep the average level of nonborrowed reserves between FOMC meetings equal to the nonborrowed reserves path. Under this procedure an increase in the demand for reserves was not mechanically accommodated; in the event, to keep total reserves on their path, nonborrowed reserves might be decreased. When total reserves were above the path level, the level of the nonborrowed reserves path or the discount rate was adjusted to reduce deviations of the money aggregates from their desired rate of growth. When the nonborrowed reserves path was lowered, banks were compelled to increase their borrowings, as a result of which the Federal funds rate rose. A 3 percent surcharge on discount window borrowings by banks with deposits of $500 million or more that borrowed frequently was first imposed by the Federal Reserve on 14 March 1980, eliminated a few months later, then reimposed at a lower rate, which was subsequently raised, and later again lowered until finally eliminated on 17 November 1981. Despite the official description of the operation of the nonborrowed reserves procedure, movements in the Federal funds rate were far from automatic [Cook (1989), Goodfriend (1993)]. There were judgmental adjustments to the nonborrowed reserves path at FOMC meetings and between FOMC meetings that changed the amount of reserves banks were expected to borrow at the discount rate, in effect changing the funds rate target. There were also changes in the discount rate and, as just noted, in the surcharge.
Goodfriend concludes that the 1979-1982 period was one of aggressive Federal funds rate targeting rather than of nonborrowed reserve targeting. At its 5 October 1982 meeting, the FOMC abandoned nonborrowed reserve targeting. The Federal Reserve interpreted its experience over the preceding three years as demonstrating that short-run control of monetary aggregates was inferior to interest rate smoothing for stabilization. The outcome of the experiment was that, although M1 growth slowed on average, its volatility tripled compared to the period preceding October 1979 [Friedman (1984)], the Federal funds rate became highly volatile [Gilbert (1994)], and both nominal and real GDP displayed exceptionally large quarterly fluctuations [Friedman (1984)]. Goodfriend (1983) attributed the Federal Reserve's difficulty with reserve targeting to the unreliability of the demand function for discount window borrowing on which its operating procedure critically depended. Pierce (1984) found that the flaw in the operating procedure was produced by lagged reserve accounting, in effect at the time, under which required reserves were based on deposit liabilities two weeks earlier. Therefore, only free reserves could serve as a target and, hence, borrowing estimates, which were inaccurate, became crucial. The upshot was that open market operations destabilized money growth.
On 5 October 1982, when the Federal Reserve suspended the nonborrowed reserves procedure, it shifted to targeting borrowed reserves. In line with this change, the FOMC at each meeting stated its instruction to the Open Market Desk for open market operations to achieve either more or less reserve restraint. More restraint was equivalent to a higher level of borrowings; less, to a lower level. If the demand for total reserves increased, the Federal funds rate and borrowings would rise. In order to reduce borrowed reserves to their predetermined desired level, nonborrowed reserves had to increase, with the effect of reducing the Federal funds rate. No change in borrowed reserves or the funds rate would then occur. This amounted to indirect targeting of the Federal funds rate. To keep the total of reserves the banks borrowed near some desired level, the spread between the Federal funds rate and the discount rate had to be such that banks would have an incentive to borrow that level of reserves. An increase in the spread induced banks to increase their borrowings. It could be achieved by changing the discount rate or the Federal funds rate. The target level of borrowings was attained by providing the appropriate amount of nonborrowed reserves. The borrowed reserves target operated with loose control of the funds rate. Some time around 1992 the Federal Reserve began to target the Federal funds rate directly in a narrow band. Target changes were made in small steps of 25-50 basis points, usually separated by weeks or months, and not soon reversed. The FOMC directive has not, however, specified the target Federal funds rate, but refers to degrees of reserve restraint that would be acceptable. The model of this regime that Rudebusch (1995) sets up and simulates replicates Federal Reserve operations. Nevertheless, since February 1994, the Federal Reserve has announced at the conclusion of each FOMC meeting a change in the funds rate if one has been made.
A further procedural change was made in mid-December 1996 in Federal Reserve daily money market operations, revealed at a press conference at the New York Reserve Bank. The System will announce, when it enters the market, the size of its open market operations, to be conducted from System accounts rather than from its customer accounts. The objective is to inform the market about the amount of liquidity the open market operations provide to or withdraw from the banking system. So in the 1920s and since the 1950s, the Federal Reserve in one way or another has targeted the Federal funds rate, while simultaneously announcing a money growth target. In the years since 1992 it has apparently taken low inflation as its sole objective and has succeeded in adjusting the target rate to that end. A side effect is that monetary volatility has been low, and the real economy has not been buffeted by monetary shocks, facilitating low unemployment and financial market stability. Only possible inflation of equity market prices seems troubling. The Federal Reserve, along with other central banks, changed its policy goals during this period. The primary goal became resisting inflationary pressures. It did so aggressively in 1980-1982. Disinflation was largely accomplished by 1983, when the inflation rate declined to 4 percent per annum. Goodfriend (1993) interprets rising long-term rates in 1983 and 1987 as signaling expectations that the Federal Reserve might again allow inflation to increase. The Federal Reserve met the test by raising
the Federal funds rate long enough to contain the inflation scare. Goodfriend remarks on the fragility of the credibility of the Federal Reserve and on how costly it is to maintain. In 1996-1997 the long rate at 6.5-7 percent was high enough to suggest that Goodfriend's assessment of the Federal Reserve's credibility is accurate. The duration of a 30-year bond at an interest rate of 6.75 percent is 14.8 years. Who would confidently predict that the then current inflation rate of 2.5 percent would not increase over that horizon? So the expectations explanation for the success of monetary policy targeted on the funds rate seems questionable. The basic problem is that there are no institutional underpinnings of the low-inflation policy. There is no guarantee that the successor to the present chairman of the Federal Reserve will also have a strong aversion to inflation. The durability of Federal Reserve commitment to price stability is a question that only the future will determine. Of the 82 years that the Federal Reserve has been in existence, only 18 can be termed years of stable (consumer) prices: 1923-1929 (average per year price change of 0.3 percent); 1960-1965 (average per year price change of 1.3 percent); 1992-1995 (average per year price change of 2.8 percent). The most recent episode is too brief for its staying power to be taken for granted. Arguments in favor of a stable price level in preference to a low inflation rate have been advanced by Feldstein (1996, 1997) and Svensson (1996a,b). Svensson compares price level and inflation targeting when society (the principal) delegates the choice to a central bank (the agent), under the assumption that output and employment are at least moderately persistent. The decision rule the central bank follows under discretion for inflation targeting is a linear feedback rule for inflation on employment. The variance of inflation is proportional to the variance of employment.
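In stylized notation (ours, not Svensson's exact formulation), this inflation-targeting feedback rule and its variance implication can be written as:

```latex
% Stylized sketch: inflation fed back linearly on lagged employment n_{t-1};
% \bar{\pi} and \lambda are illustrative notation, not Svensson's.
\pi_t = \bar{\pi} + \lambda\, n_{t-1}
\qquad\Longrightarrow\qquad
\operatorname{Var}(\pi_t) = \lambda^{2}\,\operatorname{Var}(n_{t-1})
```

so that persistent employment fluctuations translate directly into inflation variance.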
Under price level targeting, the decision rule is a linear feedback rule for the price level on employment. Inflation, the change in the price level, is then a linear function of the change in employment. On the basis of a very special set of assumptions, Svensson concludes that society will be better off assigning a price level target rather than an inflation target to the central bank because the variance of inflation will be lower, there is no inflation bias, and employment variability will be the same as under inflation targeting. Feldstein bases his argument on the interaction of taxes and inflation, which biases the allocation of resources in favor of current consumption and in favor of owner-occupied housing. The higher the inflation rate, the bigger the bias. Reducing the inflation rate by 2 percentage points would raise the level of real GDP by 2/3 of 1 percent each year in the future, as long as the inflation rate remained at the lower level. Feldstein maintains that the arguments against going from low inflation to price stability do not singly or collectively outweigh the tax-inflation case for going to price stability or even to a lower inflation rate. One argument for inflation targeting is that reducing the permanent rate of inflation requires a loss of output. With a target price path, the monetary authority offsets past errors, creating more uncertainty about short-term inflation than with an inflation target [Fischer (1994), pp. 281-284]. Feldstein's response is that the output loss is
temporary: a shortfall of GDP of 2.5 percent below what it would otherwise be, for two years, to reduce the inflation rate by 2 percentage points. That is why he compares the one-time loss of reducing the inflation rate with the permanent increase of real GDP from reducing the tax-inflation effect. Another argument for inflation targeting has been made by Akerlof, Dickens and Perry (1996). They contend that a very low level of inflation may lead to higher unemployment than at a higher inflation level because workers are unwilling to accept nominal wage decreases. Feldstein's response is that, by reducing fringe benefits, it is possible to reduce a worker's compensation without reducing his money wage rate. Akerlof, Dickens and Perry also assume that workers do not learn that falling prices raise real wages. Whether the price level or inflation is the target, a central bank has to determine the pace at which to try to achieve either one. The question is whether it is optimal to move immediately to the target. One answer is that gradualism is acceptable in the absence of a cost in terms of growth foregone [Dornbusch (1996), p. 102]. The information and transactions costs of moving from the old to the new regime also argue for a gradual return to a noninflationary position. Long-term borrowing and lending contracts and employment contracts arranged under the old regime need to be unwound. Advance announcement of the gradualism policy would give the private sector time to adjust its expectations. The speed of adjustment of monetary policy should respond to the speed with which expectations adjust, and the gradualist prescription rests on the view that expectations adjust slowly. Feldstein suggests that this view needs to be modified: disinflation should proceed forthwith when political support for the policy permits it to go forward, since political support is indispensable but is not always at hand.
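Feldstein's comparison of a one-time output loss against a permanent gain can be made concrete with a back-of-the-envelope present-value calculation (a sketch; the discount rates are our illustrative assumptions, not Feldstein's):

```python
# One-time disinflation cost: a GDP shortfall of 2.5 percent for two years,
# i.e. roughly 5 percent of one year's GDP in total (discounting within the
# two years is ignored for simplicity).
cost = 0.025 * 2

def pv_permanent_gain(annual_gain, discount_rate):
    """Present value of a perpetual annual gain, as a share of one year's GDP."""
    return annual_gain / discount_rate

gain = (2.0 / 3.0) / 100.0  # permanent gain: 2/3 of 1 percent of GDP per year

for r in (0.03, 0.05, 0.10):  # illustrative real discount rates
    pv = pv_permanent_gain(gain, r)
    print(f"r = {r:.0%}: PV of permanent gain = {pv:.1%} of GDP vs one-time cost {cost:.1%}")
```

On these numbers the permanent tax-inflation gain exceeds the one-time disinflation cost at any plausible discount rate, which is the force of Feldstein's comparison.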
A stronger argument for speedy adjustment than Feldstein's is the rational expectations approach that treats expectations as adjusting quickly, and hence finds shock treatment preferable. Sargent's view (1986, p. 150) is that "gradualism invites speculation about future reversals, or U-turns in policy". A major consideration in the choice between gradualism and shock treatment is the initial position. With moderate inflation of 8-10 percent, as observed in advanced countries, gradualism may be the answer. With very high inflation rates of 1000 percent per year, as recently experienced in Latin America, gradualism is meaningless. Only shock treatment will suffice. Still another view, dubbed "opportunistic disinflation" [Orphanides and Wilcox (1996)], argues that the Federal Reserve should conduct contractionary monetary policy only during business expansions; during recessions, it should abstain, counting on recessionary tendencies themselves to produce further disinflation. McCallum (1996, p. 112) notes a confusion in this view between regime design, with which the paper advocating opportunistic disinflation is concerned, and the issue of managing the transition from a regime with higher inflation to a regime with a lower level of inflation. Opportunistic disinflation is not a contribution to the literature on the timing of disinflation during the transition. If there is a temporary cost in bringing down inflation, how high is that cost? Unfortunately, no quantitative estimates exist of the cost in lost output and employment
of a disinflation of a given magnitude pursued over a given period. Hypothetical scenarios based on differing models arrive at qualitatively different conclusions. The announcement of a perfectly credible disinflation will either entail no expected output loss [King (1996)] or, perhaps, an increase in cumulative output [Ball (1994)]. The cost depends on the speed of adjustment of anticipations, which in turn depends on the underlying price level performance of the monetary regime. Alan Greenspan at the Tercentenary Symposium of the Bank of England [Greenspan (1994, p. 259)] remarked: "... the pressure towards reserving or rather focusing central bank activity to the equivalent of the gold standard will become increasingly evident". If this is a correct prediction that price stability will be the single goal of the Federal Reserve over the long term, and if it is achieved, price stability may well become a credible surrogate for convertibility. The system will then end up fulfilling a key element of the vision of its founders. 3.6. Conclusion
Three events stand out in our survey of monetary policy episodes and macroeconomic performance. One is the breakdown of the gold standard in stages over the period from 1914 to 1971. The second is the Great Depression of 1929-1933. The third is the Great Inflation of 1965-1980. Escaping the macroeconomic experience that marked each of these watershed events became the driving force for change. The change was intellectual, reflecting what was perceived as the problem and what was deduced as its solution, and it led in turn to a new monetary policy episode succeeding each of these events. The new episode in turn exhibited unforeseen deficiencies. To conclude the section, we comment on the way the triad of events unfolded. 3.6.1. Breakdown of the gold standard, 1914-1971
After World War I, the discipline of the gold standard came to be regarded as an impediment to the management of the economy to achieve the objectives of growth and high employment. The deep depressions of the interwar years were the measure by which the economy under a gold standard was judged to be a failure. The loosening of the link to gold after World War I presaged its abandonment 50 years later. Although price stability was generally included among the goals of the post-World War II era, stability of employment took precedence. The instability of the interwar years led to the creation of the Bretton Woods system, which had a good record of price and output stability until the mid-1960s. Nevertheless, the convertibility principle lost favor. Improving the real performance of the economy was given pride of place. To achieve the improvement, the task was assigned to government management of monetary and fiscal policy, not to impersonal market forces. The simple rule for governments to maintain a fixed price of gold was set aside in 1971, but the seeds of the downfall of that rule were sown earlier in the postwar
years as country after country opted for monetary independence, full employment, and economic growth. Countries rejected the restraints that the operation of a fixed exchange rate imposed on the pursuit of these widely supported national objectives. In the United States, where the share of international trade was a minor factor in aggregate national income, the view prevailed that the domestic economy should not be hostage to the balance of payments. Maintenance of the price of gold was not an objective of the Employment Act of 1946. The growth of government itself has destroyed the viability of a gold standard. A real gold standard was feasible in a world in which government spent 10 percent of national income, as in Britain and the USA pre-World War I. It is not feasible in a world in which governments spend half or more of national income. 3.6.2. The Great Depression, 1929-1933
The Great Depression was sui generis. To explain it, it is necessary to examine policy errors and the weaknesses of the interwar gold standard. It is a consensus view that monetary contraction began in the United States, and was transmitted to the rest of the world by fixed exchange rates. Monetary contraction began in 1928 to curb a boom on the New York Stock Exchange. Although the stock market crashed in October 1929, the policy of contraction was not then halted. Instead, it was pursued relentlessly by the Federal Reserve until the spring of 1932. The Federal Reserve mistakenly believed that monetary policy had been overexpansionary in the 1920s and that deflation was the proper remedy. In fact the system had achieved stable economic growth from 1922 to 1929 with falling wholesale prices. The US gold stock rose during the first two years of the 1929-1933 contraction, but the Federal Reserve did not permit the inflow of gold to expand the US money stock. It not only sterilized the inflow, it went much further. The US quantity of money moved perversely, going down as the gold stock went up, contrary to gold standard rules. Under a fixed exchange rate system, shocks in one country's income, employment, and prices tend to be transmitted to its trading partners' income, employment, and prices. Absent policy changes in the USA, the only recourse for countries on the gold standard was to cut the fixed exchange rate link. The first major country to do so was Britain. After runs on sterling, it abandoned the gold standard in September 1931. The international monetary system split in two, one part following Britain to form the sterling area; the other part, the gold bloc, following the United States. The trough of the depression in Britain and in other countries that accompanied her in leaving gold was reached in the third quarter of 1932. 
In the two weeks following Britain's departure from gold, central banks and private holders in foreign countries converted substantial amounts of their dollar assets in the New York money market to gold. The US gold stock declined by the end of October 1931 to about its level in 1929. The Federal Reserve, which had not responded to
an internal drain from December 1930 to September 1931 as a series of runs on banks, bank failures, and shifts from bank deposits to currency by anxious depositors produced downward pressure on the US quantity of money, responded vigorously to the external drain. A sharp rise in discount rates ended the gold drain but intensified bank failures and runs on banks. In October 1931, unlike the situation in 1920, the system's reserve ratio was far above its legal minimum. The system overreacted to the gold outflow and magnified the internal drain. Federal Reserve officials believed that purchases of government securities, which would have relieved monetary contraction, were inconsistent with the real bills doctrine that the Federal Reserve Act enshrined. They resisted engaging in such purchases until March 1932, when they undertook to do so; there followed widespread revival in the real economy in the summer and fall. The termination of the purchase program during the summer was followed in the six months from October 1932 by mounting banking difficulties. States began to declare banking holidays. By February 1933, fears of a renewed foreign drain added to the general anxiety. For the first time also, the internal drain took the form of a specific demand by depositors for gold coin and gold certificates in place of Federal Reserve notes or other currency. The Federal Reserve reacted as it had in September 1931, raising discount rates in February 1933 in reaction to the external drain but not seeking to counter either the external or the internal drain by extensive open market purchases. The drains continued until 4 March, when the Federal Reserve banks and all the leading exchanges did not open for business. A nationwide banking holiday was proclaimed after midnight on 6 March by the incoming administration, which ushered in a new regime.
3.6.3. The Great Inflation, 1965-1980
By the mid-1960s, the convertibility principle no longer dominated central bank policies. The goal of full employment supplanted it in the minds of central bank and government officials. The Phillips curve presented them with a course of action that promised higher employment at the cost of rising inflation, a cost that was typically dismissed as insignificant. An additional factor that nurtured an acceleration of inflation was central bank reliance on short-term interest rates as the instrument to control monetary growth. Under noninflationary conditions, this practice produced a procyclical movement in monetary growth. Under the gathering inflationary conditions from the mid-1960s, the inflation premium that became embedded in interest rates made the instrument unreliable as an indicator of restriction or ease. Reliance on it contributed to a rise in the rate of monetary growth. It was not until the 1970s, when ever higher inflation was accompanied by a decline in economic activity and a rise in unemployment, that pressure arose to reverse the policies and procedures that led to the Great Inflation. The upshot was a shift to a new regime in 1979, in which disinflation was the guiding principle. The regime since
the last decade has focused on price stability, reviving the peacetime domestic objective of the classical gold standard.
4. Monetary regimes and economic performance: the evidence 4.1. Overview
Having surveyed the history of international monetary regimes and of the institutional arrangements and episodes in Federal Reserve history viewed as a domestic policy regime, we ask the question: under what conditions is one or another type of monetary regime best for economic performance? One based on convertibility into specie (gold and/or silver), in which the monetary authority defines its monetary unit in terms of a fixed weight of specie and ensures that paper money claims on the specie monetary unit are always interchangeable for specie? Or one based on government fiat? Alternatively, in the international monetary sphere, which international monetary regime is superior: one based on fixed exchange rates? One based on floating rates? Or some intermediate variant such as the adjustable peg that characterized the Bretton Woods system and the EMS? Or the managed float which prevails in the world today? Evidence on the performance of alternative monetary regimes is crucial in assessing which regime is best for welfare. 4.2. Theoretical issues
Traditional theory posits that a convertible regime, such as the classical gold standard that prevailed 1880-1914, is characterized by a set of self-regulating market forces that tend to ensure long-run price level stability. These forces operate through the classical commodity theory of money [Bordo (1984)]. According to that theory, substitution between monetary and nonmonetary uses of gold and changes in production will eventually offset any inflationary or deflationary price level movements. The fixed nominal anchor also ensures long-run price predictability and hence protects long-term contracts. It also may foster investment in long-lived projects [Klein (1975), Leijonhufvud (1984), Flood and Mussa (1994)]. Adherence to the fixed nominal anchor, by providing credibility to monetary policy, contributes to low inflation both by restraining money growth and by enhancing money demand [Ghosh et al. (1996)]. However, while ensuring long-run price stability and predictability, a gold standard provided no immunity to unexpected shocks to the supply of or demand for gold. Such shocks could have significant short-run effects on the price level. In a world with nominal rigidities they would generate volatility in output and employment 18.
18 According to Fischer (1994), in a comparison of price level stability versus low inflation, these volatility costs outweigh the benefits of long-run price level predictability.
Indeed, because of the problem of wide swings in the price level around a stable mean under the gold standard, Fisher (1920), Marshall (1926), Wicksell (1898), and others advocated reforms such as the compensated dollar and the tabular standard that would preserve the fixed nominal anchor yet avoid swings in the price level [Cagan (1984)]. In an inconvertible fiat money regime, without a nominal anchor, monetary authorities in theory could use open market operations, or other policy tools, to avoid the types of shocks that may jar the price level under a specie standard and hence provide both short-run and long-run price stability. However, in the absence of a fixed nominal anchor, some other type of commitment would be required to prevent the monetary authority from using seigniorage to satisfy the government's fiscal demands, or to maintain full employment. In its international dimension, the convertible regime was one of fixed exchange rates and a stable nominal anchor for the international monetary system. Stability, however, came at the expense of exposure to foreign shocks through the balance of payments. In the presence of wage and price stickiness, these shocks again could produce volatile output and employment. Adherence to the international convertible regime also implied a loss of monetary independence. Under such a regime the monetary authorities' prime commitment was to maintain convertibility of their currencies into the precious metal and not to stabilize the domestic economy. In a fiat (inconvertible) money regime, adhering to a flexible exchange rate provides insulation against foreign shocks 19. However, as in a convertible regime, countries in fiat money regimes can adopt fixed exchange rates with each other. The key advantage is that it avoids the transactions cost of exchange. 
However, a fixed rate system based on fiat money may not provide the stable nominal anchor of the specie convertibility regime unless all members define their currencies in terms of the currency of one dominant country (e.g., the USA under Bretton Woods or Germany in the EMS). The dominant country in turn must observe the rule of price stability [Giavazzi and Pagano (1988)]. The theoretical debate on the merits of fixed and flexible exchange rates stemming from Nurkse's (1944) classic indictment of flexible rates and Friedman's (1953) classic defense is inconclusive 20. It is difficult to defend an unambiguous ranking of exchange rate arrangements 21. Hence, evidence on the performance of alternative monetary
19 Theoretical developments in recent years have complicated the simple distinction between fixed and floating rates. In the presence of capital mobility, currency substitution, policy reactions, and policy interdependence, floating rates no longer necessarily provide insulation from either real or monetary shocks [Bordo and Schwartz (1989)]. Moreover, according to recent real business cycle approaches, no relationship may exist between the international monetary regime and transmission of real shocks [Baxter and Stockman (1989)]. 20 For surveys, see Frenkel and Mussa (1985) and Bordo and Schwartz (1989). Also see McCallum (1997), p. 15. 21 See, for example, Helpman and Razin (1979) and Helpman (1981).
M.D. Bordo and A.J. Schwartz
regimes is crucial in assessing the conditions under which one or another regime is best for welfare 22.
4.3. Measures of macroeconomic performance, by regime

In Table 4.1 we present annual data on two key measures of economic performance, the inflation rate (GNP deflator) and the growth rate of real per capita income (GNP), for the five largest industrial countries across four regimes over the period 1881-1995 23. The regimes covered are: the classical gold standard (1881-1913); the interwar period (1919-1938); Bretton Woods (1946-1970); and the present floating exchange rate regime (1973-1995) 24. We divide the Bretton Woods period into two subperiods: the preconvertible phase (1946-1958) and the convertible phase (1959-1970) 25. We divide the recent float into two subperiods: high inflation (1973-1982) and low inflation (1983-1995). For the United States over the period 1880-1929, we show data from two sources: Balke and Gordon (1986), and Romer (1989). All sources for the USA and other countries are shown in the Data Appendix. For each variable and each country we present two summary statistics: the mean and standard deviation. As a summary statistic for the countries taken as a group, we show the grand mean 26. We comment on the statistical results for each variable.
22 Meltzer (1990) argues the need for empirical measures of the excess burdens associated with flexible and fixed exchange rates - the costs of increased volatility, on the one hand, compared to the output costs of sticky prices, on the other. His comparison between EMS and non-EMS countries in the postwar period, however, does not yield clear-cut results. 23 For similar comparisons for the G-7 see Bordo (1993b). For 21 countries including advanced and developing countries see Bordo and Schwartz (1996b). Other studies comparing historical regime performance include: Bordo (1981); Cooper (1982); Meltzer (1986); Schwartz (1986b); Meltzer and Robinson (1989); Eichengreen (1993b); and Mills and Wood (1993). 24 One important caveat is that the historical regimes presented here do not represent clear-cut examples of fixed and floating exchange rate regimes. The interwar period is not an example of either a fixed or floating rate regime. It comprises three regimes: a general floating rate system from 1919 to 1925, the gold exchange standard from 1926 to 1931, and a managed float to 1939. For a detailed comparison of the performances of these three regimes in the interwar period, see Eichengreen (1991b). We include this regime as a comparison to the other three more clear-cut cases. The Bretton Woods regime cannot be characterized as a fixed exchange rate regime throughout its history. The preconvertibility period was close to the adjustable peg envisioned by its architects, and the convertible period was close to a de facto fixed dollar standard. Finally, although the period since 1973 has been characterized as a floating exchange rate regime, at various times it has been subject to varying degrees of management. 25 We also examined the period (1946-1973), which includes the three years of transition from the Bretton Woods adjustable peg to the present floating regime. The results are similar to those of the 1946-1970 period.
26 Bordo (1993b) also presents data on seven other variables: money growth, nominal and real short-term and long-term interest rates and nominal and real exchange rates. Bordo and Schwartz (1996b) show the same data plus the government budget deficit relative to GDP for fourteen additional countries.
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
[Table 4.1: Descriptive statistics of inflation and real per capita income growth, five countries, by regime.]
[Figure 4.1: Annual inflation rate, 1880-1995, five countries.]

4.4. Inflation and output levels and variability

4.4.1. Inflation
The rate of inflation was lowest during the classical gold standard period (Figure 4.1). This was true for every country except Japan, which did not go on the gold standard until 1897. During the interwar period mild deflation prevailed. The rate of inflation during the Bretton Woods period was, on average and for every country except Japan, lower than during the subsequent floating exchange rate period. During the Bretton Woods convertible period the inflation rate in the USA, the UK and France was higher than in the preceding subperiod; the reverse was true for Germany and Japan, but on average there was not much difference between the
Ch. 3: Monetary Policy Regimes and Economic PerJbrmance: The Historical Record
207
subperiods. During the floating regime inflation has been lower in the recent subperiod of low inflation than during the Bretton Woods convertible subperiod, except in the USA and UK 27. The Bretton Woods period had the most stable inflation rate as judged by the standard deviation. The managed float and the gold standard periods were next. The interwar period was the most unstable. However, when subperiods of the regimes are distinguished, the recent decade of low inflation was the most stable, followed by the Bretton Woods convertible regime, then the inflation phase of the float, and last, the gold standard period. In general, the descriptive evidence of lower inflation under the gold standard and the Bretton Woods convertible regime than under the other regimes is consistent with the view that convertible regimes provide an effective nominal anchor. The marked low inflation of the recent decade suggests that the equivalent of the convertibility principle may be operating. At the same time, evidence that inflation variability on average was higher in the classical gold standard period than in most other regimes is consistent with the commodity theory of money and the price-specie flow mechanism which posits offsetting changes in the monetary gold stock 28. The evidence on inflation and inflation variability is also consistent with the behavior of two other nominal variables [Bordo (1993b)]. First, money growth was generally lowest under the gold standard across all countries, followed by the Bretton Woods convertible regime. It was most stable during the Bretton Woods convertible regime. Second, long-term nominal interest rates were lowest during the classical gold standard period. During Bretton Woods they were lower than in the recent float [see also McKinnon (1988)].

4.4.2. Real per capita income growth

Generally, the Bretton Woods period, especially the convertible period, exhibited the most rapid output growth of any monetary regime, and, not surprisingly, the interwar
27 The dispersion of inflation rates between countries was lowest during the classical gold standard and to a lesser extent during the Bretton Woods convertible subperiod compared to the floating rate period and the mixed interwar regime [Bordo (1993b)]. This evidence is consistent with the traditional view of the operation of the classical price-specie-flow mechanism and commodity arbitrage under fixed rates and insulation and greater monetary independence under floating rates. 28 Supporting evidence is provided in a recent study by Ghosh et al. (1996). Classifying the exchange rate systems for 136 countries over the period 1960 to 1990 into pegged, intermediate, and floating, they adopt a methodology similar to that of Table 4.1. They find that the unconditional mean inflation rate for countries on pegged exchange rates was significantly lower than for those that did not peg. This result holds up, controlling for the 1960s during which most countries adhered to Bretton Woods. The only exception was high-income floating countries which had lower than average inflation rates. Their results are unchanged when conditioned on a set of determinants of inflation, and when account is taken of possible endogeneity of the exchange rate regime. With respect to the volatility of inflation, they found it to be highest among floaters, again with the exception of high income countries. For them, it was the lowest.
period the lowest (Figure 4.2). Output variability was also lowest in the convertible subperiod of Bretton Woods, but because of higher variability in the preconvertibility period, the Bretton Woods system as a whole was more variable than the floating exchange rate period. Both pre-World War II regimes exhibit considerably higher variability than their post-World War II counterparts. The comparison does not apply to the USA based on the Romer data 29, 30, 31. To link rapid growth in the industrialized countries in the quarter century following World War II to the Bretton Woods international monetary system [Bretton Woods Commission (1994)] seems less compelling than for other aspects of macroeconomic performance. First, there is little conclusive evidence linking exchange rate volatility to either trade flows or the level of investment [Mussa et al. (1994)], avenues by which a stable exchange rate regime might have affected economic growth. Although Ghosh et al. (1996) find evidence linking real growth to the growth of investment and trade for pegged countries, they also find total factor productivity growth to be an important channel of growth for floaters. Second, although trade liberalization may have played an important role in the acceleration of growth rates in the European economies during the Golden Age, most of the liberalization of trade, before nations declared Article VIII current account convertibility in December 1958, was under the aegis of institutions developed outside of the Bretton Woods framework - the Marshall Plan, the Organization for European Economic Cooperation (OEEC), the European Payments Union (EPU), and the European Coal and Steel Community (ECSC) [Eichengreen (1995)]. Finally, the Bretton Woods arrangements might have contributed to postwar growth by being part of the overall package creating political and economic stability - "the Pax Americana" - that was a reaction to the chaos of the interwar and World War II periods.
In this view, rapid postwar growth represented a "catch up" by the European nations and Japan from low levels of per capita output compared to that of the leading industrial country, the USA. The "catch up" by these nations was encouraged by the USA. They adopted the leader's best-practice technology and hence grew at a much more rapid rate than before [Abramovitz (1986)] 32.
29 The Bretton Woods regime also exhibited the lowest dispersion of output variability between countries of any regime, with the interwar regime the highest [Bordo (1993b)]. The lower dispersion of output variability under Bretton Woods may reflect conformity between countries' business fluctuations, created by the operation of the fixed-exchange-rate regime [Bordo and Schwartz (1989)]. 30 The Hodrick-Prescott filter alternative to the first differences used in Table 4.1 yields basically the same rankings of regimes. 31 In their 1960-1990 sample, Ghosh et al. (1996) find little connection between adherence to a pegged exchange rate and growth, once account is taken of the 1960s experience. High-income floaters generally had more rapid growth than low-income floaters. There was little correlation between output volatility and the regime. 32 In an institutional vein, it has been argued that the Bretton Woods framework (plus GATT) contributed to growth by providing an overall framework of rules. Within them Western European nations solved a hierarchy of coordination problems, allowing them to encourage investment in growth-generating export
[Figure 4.2: Annual real per capita income growth, 1880-1995, five countries.]

Adherence to the convertibility rules of the Bretton Woods system by the USA and other industrialized countries may possibly explain the stability of real output in that regime. Money growth, but not the growth of real government spending, was less variable under Bretton Woods than under the succeeding float [Bordo (1993b),
sectors [Eichengreen (1995)]. Without the Bretton Woods framework it might not have been possible to solve prisoner's dilemma games between labor and capital within each country taken in isolation and, for the OEEC, EPU, Marshall Plan, and ECSC, to liberalize trade on comparative advantage lines between the members. Given that the European regional arrangements occurred outside of, and because of, shortcomings in the Bretton Woods arrangements, one wonders if institutional developments would have been much different if the European countries were not party to Bretton Woods at all.
Eichengreen (1993a)]. Also temporary (aggregate demand) shocks, measured using the Blanchard-Quah (1989) procedure, presumably incorporating policy actions, were lowest under Bretton Woods of any regime [Bordo (1993b), Bayoumi and Eichengreen (1994a,b)]. According to Eichengreen (1993b), the credibility of commitment to the nominal anchor, as evidenced by the low degree of inflation persistence under Bretton Woods, made inflationary expectations mean reverting (see Table 4.2). This produced a flatter short-run aggregate supply curve than under the float where, in the absence of a nominal anchor, inflationary expectations became extrapolative. Under these conditions stabilization policy could be effective in stabilizing output. That activist stabilization policy is in the main responsible for the low real output variability under Bretton Woods is doubtful. For the USA, activist Keynesian policies were a product of the late 1960s and 1970s and, for the other countries, the ongoing conflict between internal and external balance dominated policy making. A more likely explanation for real output stability was the absence of serious permanent (aggregate supply) shocks. Bordo (1993b) and Bayoumi and Eichengreen (1994a,b) show permanent (supply) shocks - presumably independent of the monetary regime - to be the lowest under Bretton Woods of any regime. In sum, there is compelling evidence linking convertible regimes to superior nominal performance. Whether such a connection can be made for the real side is less obvious. More evidence is required.

4.5. Stochastic properties of macrovariables
We investigated the stochastic properties (of the log) of the price level and (of the log) of real per capita GNP across monetary regimes 33. Economic theory suggests that the stochastic properties of the price level and other nominal series would be sensitive to the regime. Under convertible regimes based on a nominal anchor, the price level should follow a trend-stationary process, whereas under a regime not so anchored, it should follow a difference-stationary process or a random walk. By contrast there are few predictions that can be made about the stochastic properties of real output under different regimes. To ascertain the stochastic properties of the (log of the) price level and the (log of the) real per capita GNP across monetary regimes, we follow the approach of Cochrane (1988) and Cogley (1990) and calculate the variance ratio. This statistic, defined as the ratio of 1/k times the variance of the series' k-period differences divided by the variance
33 A controversial literature has centered on whether real GNP and other time series are trend stationary or difference stationary [Nelson and Plosser (1982)] or, alternatively, whether GNP and other series contain a substantial unit root. This debate pertains to different theories of the business cycle: those emphasizing real factors positing a unit root (the primacy of real shocks), and those emphasizing monetary and other actions in the face of price rigidities positing reversion to a long-run trend (the primacy of transitory shocks).
[Figure 4.3: Variance ratio for the price level by regimes - panels for the US, UK, Germany, France, Japan, and the G5 aggregate.]

of first differences, provides a point estimate of the size of the unit root, rather than a test for the existence or absence of a unit root, as in the earlier literature. The variance ratio is the variance of the unit root component of a series relative to the variance of the trend-stationary component. If the ratio is above one, the series contains a substantial unit root and is clearly difference stationary. When it is below one, the unit root represents a much smaller fraction of the variance of the series; and when it is zero, the series is completely trend stationary 34.
34 Initially we tested for a unit root in both series in the different regimes using the Dickey-Fuller test (1979) and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test (1992). The results detecting the
Figure 4.3 shows the variance ratio of the log of the price level for the five countries and their aggregate by regime 35. From the figure there appears to be a marked difference between the gold standard, interwar, and Bretton Woods regimes on the one hand and the recent float on the other. For the USA, the ratio rises above three during the float and then declines below one after eight years; under the gold standard it gradually rises above two for 13 years and then declines to zero. In the other regimes it declines to zero. For the other four countries and for the aggregate, for all regimes except the float, the ratio quickly declines below one. These results, which suggest that the price level is trend stationary under convertible regimes, but apparently not in the inconvertible fiat regime, generally are consistent with the evidence on persistence and price predictability described in the following subsection. The findings, however, are at best suggestive, since they are based on short samples of annual data for which it may not be possible to draw on asymptotic theory and perform tests of statistical significance. In Figure 4.4 (overleaf), which shows the variance ratio of the log of real per capita GNP, it is difficult to detect a distinct pattern across countries by regimes. The only exception is a marked rise in the variance ratio in the interwar period in the USA and Germany, the two countries hardest hit by the Great Depression. For the aggregate, however, it appears as if the gold standard and interwar ratios decline quickly below one, whereas in both postwar regimes they do so only after three to five years. That shocks to output seem to be more long-lived in the post-World War II period than prewar is more likely consistent with explanations other than the nature of the monetary regime.
4.6. Inflation persistence, price level predictability, and their effects on financial markets

4.6.1. Inflation persistence

An important piece of evidence on regime performance is the persistence of inflation. Evidence of persistence in the inflation rate suggests that market agents expect that monetary authorities will continue to pursue an inflationary policy; its absence would be consistent with market agents' belief that the authorities will pursue a stable monetary rule such as the gold standard's convertibility rule.
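The persistence measure used in this literature (and reported in Table 4.2 below) is the slope coefficient from an AR(1) regression of inflation on its own lag. A minimal sketch of that regression, on synthetic data (our own illustrative implementation, not the authors' code):

```python
import numpy as np

def ar1_persistence(infl):
    """OLS slope and standard error from regressing inflation on its own
    lag (with an intercept). A slope near zero is consistent with
    white-noise inflation under a credible nominal anchor; a slope near
    one signals highly persistent inflation."""
    y, x = infl[1:], infl[:-1]
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)            # residual variance
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], se

# Synthetic inflation series with true persistence 0.8.
rng = np.random.default_rng(1)
pi = np.empty(2000)
pi[0] = 0.0
for t in range(1, 2000):
    pi[t] = 0.8 * pi[t - 1] + rng.standard_normal()
```

On a white-noise series the estimated slope hovers near zero, matching the chapter's characterization of gold-standard inflation; on the persistent series it recovers a value near 0.8.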
presence or absence of a unit root were inconclusive. The Dickey-Fuller test rejected the hypothesis of a unit root for the price level for the USA only during the Bretton Woods period. For real output, the unit root is rejected only for the USA and France during Bretton Woods. These results are generally in accordance with the original Nelson and Plosser (1982) findings. On the other hand, the KPSS test could not reject the hypothesis that both series are trend stationary universally across regimes at the five percent level. 35 To calculate the aggregates we used current GNP weights in current US dollars.
[Figure 4.4: Variance ratio for real per capita income by regimes - panels for the US, UK, France, Germany, Japan, and the G5 aggregate.]
Evidence of inflation persistence can be gleaned from an AR(1) regression on CPI inflation. Table 4.2 presents the inflation rate coefficient from such regressions for five countries over successive regimes since 1880, as well as the standard errors, and the Dickey-Fuller tests for a unit root. The results show an increase in inflation persistence for most countries between the classical gold standard and the interwar period, and also between the interwar period and the post-World War II period as a whole 36. Within the
36 Alogoskoufis and Smith (1991) also show, based on AR(1) regressions of the inflation rate, that inflation persistence in the USA and the UK increased between the classical gold standard period and the interwar period and between the interwar period and the post-World War II period. Also see
post-World War II period, inflation persistence is generally lower (but not in France and Japan) in the preconvertible Bretton Woods period than in the convertible period. This suggests that, though the immediate post-World War II period was characterized by rapid inflation, market agents might have expected a return to a stable price regime. The higher degree of persistence in the convertible regime suggests that this expectation lost credence. Persistence was generally highest during the float and it did not decline much between the high inflation and low inflation episodes 37. This may mean that the public is aware of the absence of a stable nominal anchor 38.

4.6.2. Price level uncertainty
An important distinction between a convertible or fixed nominal anchor regime (or even one dedicated to price level stability) and an inconvertible regime (or one following an inflation target) is lower long-run price level uncertainty. This reflects the trend-stationary (mean reversion) process underlying a convertible regime, compared to the difference-stationary process of an inconvertible regime. Moreover, forecast errors should increase linearly as the time horizon is lengthened [Leijonhufvud (1984), Fischer (1994)]. Early evidence, by Klein (1975) for the USA, showing lower long-run price level uncertainty under the pre-1914 gold standard, the interwar period and the 1950s, compared to the 1960s and 1970s, is supported by stochastic simulations of hypothetical price level paths by Fischer (1994), Duguay (1993) and Lebow, Roberts and Stockton (1992) 39. While a convertible regime (or one dedicated to price level stability) yields lower long-run price level uncertainty, short-run price level uncertainty may be higher, as a consequence of the equilibrating changes in the monetary gold stock (or offsetting changes in money supply required to maintain price stability), than under an inconvertible (or inflation targeting) regime, where price level increases need not be reversed. In this regard, Klein (1975) using annual data for the USA, Meltzer (1986)
Alogoskoufis (1992), who attributes the increase in persistence to the accommodation of shocks by the monetary authorities. 37 However, Emery (1994), using quarterly data, finds that inflation persistence in the USA declined significantly between 1973-1981 and 1981-1990. 38 Supportive evidence, based on autocorrelations and time series models of CPI and WPI inflation for the USA, UK, France, and Italy in the nineteenth and twentieth centuries, shows that inflation under the gold standard was very nearly a white noise process, whereas in the post-World War II period it exhibited considerable persistence [Klein (1975), Barsky (1987), Bordo and Kydland (1996)]. 39 Bordo and Jonung (1996), using the univariate Multi-State Kalman Filter methodology, measured forecast errors in inflation at one-, five-, and ten-year horizons for sixteen countries over the period 1880-1990, across regimes. They found that forecast errors at the one-year horizon were lowest on average for the advanced G-11 countries during the Bretton Woods convertible regime, followed by the gold standard and the floating rate period. Also they found that the inflation forecast error increased with time across all regimes, but much more so under the recent float, as Leijonhufvud (1984) predicted.
using quarterly US data, and Meltzer and Robinson (1989) using annual data for seven countries observed higher short-run price level uncertainty for the gold standard than under subsequent regimes 40.

4.6.3. Effects on financial markets

Adherence or non-adherence to a nominal anchor also had implications for financial markets. Mean reversion in price level expectations anchored the term structure of interest rates. Under the gold standard in the USA and the UK, the long-term-short-term interest rate spread predicted short-term rates according to the expectations theory. Under the subsequent fiat money regime, in which monetary authorities smoothed short-term interest rates, the relationship broke down. Similarly the response of long-term rates to shocks to short-term rates increased after 1914 as short-term rates exhibited more persistence [Mankiw and Miron (1986), Mankiw, Miron and Weil (1987), Miron (1996)]. Moreover, the Fisher effect - the correlation between nominal interest rates and expected inflation - is hard to detect before 1914 because inflation was a white noise process whereas, later in the twentieth century, when inflation became more persistent, it became more apparent [Barsky (1987), Mishkin (1992)].

4.7. Temporary and permanent shocks

An important issue is the extent to which the performance of alternative monetary regimes, as revealed by the data in Table 4.1, reflects the operation of the monetary regime in constraining policy actions or the presence or absence of shocks to the underlying environment. One way to shed light on this issue is to identify such shocks. Authors have used structural VARs to calculate permanent and temporary output shocks to identify differences in behavior across regimes. In a number of recent papers Bayoumi and Eichengreen [e.g.
Bayoumi and Eichengreen (1994a,b)] have extended the bivariate structural vector autoregression (VAR) methodology developed by Blanchard and Quah (1989), which identified permanent shocks as shocks to aggregate supply and temporary shocks as shocks to aggregate demand. According to Bayoumi and Eichengreen, aggregate supply shocks reflect shocks to the environment and are independent of the regime, but aggregate demand shocks likely reflect policy actions and are specific to the regime 41.
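The mechanics of this identification can be sketched in a few lines of linear algebra: estimate a reduced-form VAR in output growth and inflation, compute the long-run multiplier matrix, and take a Cholesky factor of the long-run covariance so that the second (temporary) shock has no long-run effect on output. The following numpy implementation is a minimal illustration on synthetic data; the two-lag specification, variable ordering, and function name are our assumptions, not the exact specification used by Bayoumi and Eichengreen.

```python
import numpy as np

def blanchard_quah(dy, dp, lags=2):
    """Bivariate Blanchard-Quah decomposition.
    dy: output growth; dp: inflation (1-D arrays of equal length).
    Returns (permanent, temporary) structural shock series."""
    X = np.column_stack([dy, dp])
    T, n = X.shape
    # Reduced-form VAR(lags) with intercept, estimated by OLS.
    Z = np.column_stack([np.ones(T - lags)] +
                        [X[lags - j - 1:T - j - 1] for j in range(lags)])
    Y = X[lags:]
    B = np.linalg.lstsq(Z, Y, rcond=None)[0]
    U = Y - Z @ B                          # reduced-form residuals
    Sigma = U.T @ U / (T - lags)           # residual covariance
    # Long-run multiplier Psi(1) = (I - A_1 - ... - A_p)^{-1}.
    A1 = sum(B[1 + j * n:1 + (j + 1) * n].T for j in range(lags))
    Psi1 = np.linalg.inv(np.eye(n) - A1)
    # Lower-triangular Cholesky of the long-run covariance imposes that
    # the second (temporary) shock has no long-run effect on output.
    C1 = np.linalg.cholesky(Psi1 @ Sigma @ Psi1.T)
    B0 = np.linalg.inv(Psi1) @ C1          # contemporaneous impact matrix
    E = U @ np.linalg.inv(B0).T            # structural shocks, unit variance
    return E[:, 0], E[:, 1]                # permanent, temporary

rng = np.random.default_rng(2)
dy = rng.standard_normal(400)
dp = 0.5 * dy + rng.standard_normal(400)
perm, temp = blanchard_quah(dy, dp)
```

By construction the recovered shocks are orthonormal (their sample covariance is the identity), and the long-run response matrix Psi(1)B0 = C1 is lower triangular, which is the restriction that makes the second shock "temporary" for output.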
40 Klein (1975) based his conclusions on a 6-year moving standard deviation of the annual rate of price change; Meltzer (1986) and Meltzer and Robinson (1989) calculated 1-period ahead forecast errors, using a univariate Multi-State Kalman Filter. Simulations by Fischer (1994) of univariate models showed higher short-run forecast errors under a price level target than under a low inflation target. 41 Restrictions on the VAR identify an aggregate demand disturbance, which is assumed to have only a temporary impact on output and a permanent impact on the price level, and an aggregate supply disturbance, which is assumed to have a permanent impact on both prices and output. Overidentifying restrictions, namely, that demand shocks are positively correlated and supply shocks are negatively correlated with prices, are tested by examining the impulse response functions to the shocks.
The methodology developed by Blanchard and Quah (1989) raises econometric issues 42. More controversial, however, is the labeling of the shocks as aggregate supply and demand shocks, as Bayoumi and Eichengreen (1994a,b) do. Interpreting shocks with a permanent impact on output as supply disturbances and shocks with a temporary impact on output as demand disturbances implies that one accepts the aggregate demand-aggregate supply model as correct. For our purpose, it is not necessary to take a stand on this issue. We reach no conclusion that depends on differentiating the two types of shocks, or whether one type predominates. It is enough to retain the more neutral descriptions of temporary and permanent shocks when relying on the VAR results to identify underlying disturbances across regimes 43. Figure 4.5 summarizes the results of this line of research 44. It displays the permanent (aggregate supply) and temporary (aggregate demand) shocks for the five-country aggregate for the data underlying Table 4.1 45. For these countries, both temporary
42 Lippi and Reichlin (1993) point out that the Blanchard-Quah procedure assumes that the error terms in the model are fundamental, whereas results are different with nonfundamental representations. This comment, however, applies to all dynamic econometric analyses, not the Blanchard-Quah procedure in particular [Blanchard and Quah (1993)]. Likewise, the comment by Faust and Leeper (1994) that using finite-horizon data and problems of time aggregation cast doubt on the identification of the shocks applies also to other strategies for isolating shocks from responses, and analyzing the speed of adjustment. 43 For two reasons Bayoumi and Eichengreen (1994b) strongly defend use of the aggregate demand-aggregate supply framework. First, it allows attributing the difference in macroeconomic behavior between fixed and floating exchange rate regimes to a change in the slope of the aggregate demand curve. Second, the model implies that demand shocks should raise prices, supply shocks lower them. These responses are not imposed, hence can be thought of as "over-identifying restrictions" that the data satisfy. However, they acknowledge that the shocks could be misidentified as supply, when they are temporary, and demand, when they are permanent. Finally, a limitation of this approach is that it is difficult to identify the historical events in each monetary regime that correspond to the statistical results. Some authors have conjectured what these events might be.
Bordo and Schwartz (1996a) and Bordo and Jonung (1996) apply the methodology for comparisons over a larger set of countries. Cecchetti and Karras (1994) and Betts, Bordo and Redish (1996) follow Gali (1992) in decomposing the aggregate demand shock into an LM (money) shock and an IS (rest of aggregate demand) shock, applying it to historical data over the interwar period for the USA and Canada, respectively. A different labeling has been adopted by Robinson and Wickens (1992), who refer to shocks with a temporary impact on output as nominal shocks, and those with a permanent effect on output as real shocks.
[45] The shocks were calculated from a two-variable vector autoregression in the rate of change of the price level and real output. The VARs are based on three separate sets of data: 1880-1913, 1919-1939, and 1946-1995, omitting the war years because complete data are available for only two of the countries. The VARs have two lags. We derived the number of lags using the Akaike (1974) procedure. We rendered the two series
218
M.D. Bordo and A.J. Schwartz
Fig. 4.5. Permanent (aggregate supply) and temporary (aggregate demand) shocks 1883-1995, omitting war years, G5 aggregate.

and permanent shocks were considerably larger before World War II than afterwards. Both types of shocks, but especially permanent shocks, were much larger under the classical gold standard than during the two post-World War II regimes. There is not much difference in the size of both types of shocks between Bretton Woods and the subsequent float, although the Bretton Woods convertible regime was the most tranquil of all the regimes. Thus, this evidence suggests that the superior real performance of the Bretton Woods convertible period may have a lot to do with the lower incidence of shocks compared to the gold standard and interwar periods. This raises an interesting question: why was the classical gold standard durable in the face of substantial shocks (it lasted approximately 35 years), whereas Bretton Woods was fragile (the convertible phase lasted only 12 years) in the face of the mildest shocks in the past century? One possible answer is more rapid adjustment of prices and output to shocks under the gold standard than under the postwar regimes. Evidence in Bordo (1993b), based on calculations from the impulse response functions derived from the bivariate autoregressions underlying Figure 4.5, reveals that the response of both output and
stationary by first differencing. The aggregate income growth and inflation rates are a weighted average of the rates in the different countries. The weights for each year are the share of each country's nominal national income in the total income of the five countries, where the national income data are converted to US dollars using current exchange rates.
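The two-variable decomposition described in footnote 45 can be sketched in code. This is a minimal illustration with simulated data, not the authors' program: the function name and the toy series are hypothetical, and the substantive step is the Blanchard-Quah long-run restriction that the "temporary" (demand) shock has no permanent effect on output.

```python
import numpy as np

def bq_decompose(data, lags=2):
    """Blanchard-Quah decomposition of a bivariate VAR.

    data: T x 2 array ordered (output growth, inflation).
    Identification: the second ("temporary"/demand) shock is
    restricted to have no long-run effect on output.
    Returns the structural impact matrix B0 and the shock series.
    """
    T, n = data.shape
    # Regressors: a constant plus p lags of both variables
    X = np.hstack([np.ones((T - lags, 1))] +
                  [data[lags - i - 1:T - i - 1] for i in range(lags)])
    Y = data[lags:]
    coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coefs
    sigma = resid.T @ resid / (len(Y) - X.shape[1])
    # Long-run multiplier C(1) = (I - A1 - ... - Ap)^(-1)
    A_sum = sum(coefs[1 + i * n:1 + (i + 1) * n].T for i in range(lags))
    C1 = np.linalg.inv(np.eye(n) - A_sum)
    # Cholesky of the long-run covariance gives a lower-triangular
    # long-run impact matrix, so shock 2 has zero long-run output effect.
    F = np.linalg.cholesky(C1 @ sigma @ C1.T)
    B0 = np.linalg.solve(C1, F)           # impact matrix: u_t = B0 e_t
    shocks = resid @ np.linalg.inv(B0).T  # recovered structural shocks e_t
    return B0, shocks

# Toy bivariate series standing in for (output growth, inflation)
rng = np.random.default_rng(0)
y = rng.standard_normal((400, 2)) @ np.array([[1.0, 0.3], [0.2, 1.0]])
B0, shocks = bq_decompose(y)
```

In the application described above, the lag length would be chosen by the Akaike criterion and the input series would be the GDP-weighted five-country aggregates rather than a simulated draw.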
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
prices to both temporary and permanent shocks in the G-7 aggregate and in most of the individual countries was markedly more rapid under the gold standard than under the postwar regimes. Within the postwar regimes, the response was also more rapid under Bretton Woods than under the float [also see Bayoumi and Eichengreen (1994a)]. Perhaps countries under the gold standard were able to endure the greater shocks that they faced owing to both greater price flexibility and greater factor mobility before World War I [Bordo (1993b)]. Alternatively, perhaps the gold standard was more durable than Bretton Woods because, before World War I, the suffrage was limited, central banks were often privately owned and, before Keynes, there was less understanding of the link between monetary policy and the level of economic activity. Hence, there was less of an incentive for the monetary authorities to pursue full-employment policies which would threaten adherence to convertibility [Eichengreen (1992)]. Another explanation for the relative longevity of the international gold standard and the short life of Bretton Woods may be the design of the monetary regime, and specifically the presence or absence of a credible commitment mechanism (or a monetary rule). As shown in Section 2, although Bretton Woods, like the gold standard, was a regime based on rules, the system did not provide a credible commitment mechanism, such as the gold standard contingent rule, for the core countries. That outcome may in turn have reflected a shift in society's objectives away from convertibility and price stability towards domestic real stability.
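The speed-of-adjustment comparison rests on impulse response functions computed from the estimated structural VARs. A hedged sketch of that calculation, using hypothetical coefficient matrices rather than the estimates behind Figure 4.5, is:

```python
import numpy as np

def structural_irf(A_list, B0, horizon=20):
    """Structural impulse responses: Psi_0 = B0, and for h >= 1,
    Psi_h = sum_i A_i @ Psi_{h-i} over the VAR lag matrices A_i."""
    n = B0.shape[0]
    psis = [B0]
    for h in range(1, horizon + 1):
        psi = np.zeros((n, n))
        for i, A in enumerate(A_list, start=1):
            if h - i >= 0:
                psi += A @ psis[h - i]
        psis.append(psi)
    return np.array(psis)  # shape (horizon+1, n, n)

def half_life(irf, var=0, shock=0):
    """First horizon at which the response of `var` to `shock`
    falls below half of its impact effect - a crude speed-of-
    adjustment measure of the kind compared across regimes."""
    path = np.abs(irf[:, var, shock])
    target = path[0] / 2
    for h, v in enumerate(path[1:], start=1):
        if v < target:
            return h
    return None

# Toy VAR(2) coefficients and impact matrix (hypothetical, for
# illustration only - not the estimated values)
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.1, 0.0], [0.05, 0.1]])
B0 = np.eye(2)
irf = structural_irf([A1, A2], B0)
```

Applied regime by regime, a shorter half-life under the gold standard than under the postwar regimes would correspond to the faster output and price adjustment reported by Bordo (1993b).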
5. Overall assessment of monetary policy regimes

The historical record and evidence on the performance of monetary policy regimes leave unanswered questions concerning the forces that predispose policy makers to adopt and then to abandon a regime. We do not know in detail why so many countries chose the gold standard before World War I as the monetary regime par excellence. Was it simply path dependence, since monetary systems evolved from specie-based regimes, and the success of England, the leading commercial power, which accidentally shifted to gold in the early 18th century, led many silver and bimetallic adherents, as well as those on paper standards, in turn to switch to gold? Was it the opinion of experts who testified before commissions regarding the choice that swayed the decision makers? Was it economic theory that convinced the leaders of public opinion? Was it the experience of inflationary fiat money in preceding regimes that carried the day? Finally, was the gold standard viable because the scope of government activity was limited? In the case of the United States, we know that the combination of real bills and gold standard rules in the Federal Reserve Act reflected the influence of bankers and public servants as well as the testimony of representatives of foreign central banks. Yet the wartime departures from the arrangements that the Act prescribed were never undone.
The explanation of the abandonment of the gold standard under wartime conditions, in both wars, poses no problem. When financing government becomes the primary concern of the monetary and fiscal authorities, gold standard rules cannot be sustained. Peacetime limits on money creation give way to the requirement to provide the financial sinews of war in tandem with contributions from taxation and government debt issues. Accounting for the monetary regime choices in postwar eras, however, raises many questions. Did the rise of democracy and the power of the labor movement make adherence to the strictures of the gold standard less acceptable? Has the growth of government been inimical to the requirements of a real gold standard? Were the political disorders of the post-World War I decades so crippling - reparations, war debts, US isolationism, rearmament, fascist and communist dictatorships - that, after the brief restoration of the gold standard, no international monetary regime was viable? Was the brief restoration possible only because the American Benjamin Strong and the Englishman Montagu Norman willed it? Bretton Woods, post-World War II, again represented the will of an American, Harry Dexter White, and an Englishman, Maynard Keynes, but this time, more so than in the 1920s, the economic weight of the US backing of its representative was overwhelming. Is that why countries fell into line to apply for membership in the system? Or, alternatively, was the Anglo-Saxon system imposed on other leading countries, which, because they were either occupied or were enemies during the war, had no input in constructing Bretton Woods? Post-Bretton Woods, the questions center on the commonality of the experience of stagflation in the 1970s and the switch since the 1980s to low inflation as the objective of domestic monetary policy regimes.
We can observe the change in procedures that central banks adopted in order to achieve the low inflation result, but pinpointing the forces that led country after country to change its monetary policy objective is harder. We have learned much about the virtues and shortcomings of the monetary regimes that the world has experienced since 1880. Much more still has to be learned.
Acknowledgments
For helpful suggestions we thank Milton Friedman, Marvin Goodfriend, Robert Gordon, Peter Ireland, Lars Jonung, Allan Meltzer, John Taylor, and Geoffrey Wood. Able research assistance was provided by Jong Woo Kim.
Appendix A. Data sources

A.1. United States of America
(1) Population. 1880-1975, Bordo and Jonung (1987). 1976-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 99z.
(2) M2. 1880-1947, Bordo and Jonung (1987). 1948-1989, data supplied by Robert Rasche. 1990-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 59mb.
(3) Real GNP. 1880-1945, Balke and Gordon (1986), pp. 781-783, col. 2. 1946-1989, The Economic Report of the President, 1991, p. 288. Real GDP, 1990-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 99br.
(4) Deflator. 1880-1945, Balke and Gordon (1986), pp. 781-783, col. 2. 1946-1989, The Economic Report of the President, 1991, p. 290. 1990-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 99bir.
(5) Money Base. 1880-1982, Balke and Gordon (1986), pp. 784-786, col. 4. 1983-1995, International Financial Statistics Yearbook, 1996, line 14.
(6) Consumer Price Index. 1880-1970, US Bureau of the Census (1975), Historical Statistics of the United States: Colonial Times to 1970: Bicentennial Edition (Washington, DC), pp. 210-211 (hereafter cited as Historical Statistics). 1971-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 64.
(7) Short-Term Interest Rate. Commercial paper rate. 1880-1986, Bordo and Jonung (1987). 1987-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 60bc.
(8) Long-Term Interest Rate. Long-term government bond yield. 1880-1986, Bordo and Jonung (1987). 1987-1989, Bordo and Jonung (1990), pp. 165-197. 1990-1995, International Financial Statistics Yearbook, 1996, pp. 787-791, line 61.
A.2. United Kingdom
(1) Population. 1880-1975, Bordo and Jonung (1987). 1976-1995, International Financial Statistics Yearbook, 1996, pp. 782-785, line 99z.
(2) Real NNP. 1880-1985, Bordo and Jonung (1987). 1986-1989, Central Statistical Office, Economic Trends (various issues). Real GDP, 1990-1995, International Financial Statistics Yearbook, 1996, pp. 782-785, line 99br.
(3) Deflator. 1880-1985, Bordo and Jonung (1987). 1986-1989, Central Statistical Office, Economic Trends (various issues). 1990-1995, International Financial Statistics Yearbook, 1996, pp. 782-785, line 99bir.
(4) Consumer Price Index. 1880-1965, Feinstein's retail price series [Capie and Webber (1985), vol. 1, table III.(12)]. 1966-1995, International Financial Statistics Yearbook, 1996, pp. 782-785, line 64.
(5) Exchange Rate. US Dollar/Pound. 1880-1939, Friedman and Schwartz (1982), table 4.9, col. 8, pp. 130-135. 1947-1995, International Financial Statistics (various issues), pp. 782-785, line rh.

A.3. Germany
(1) Population. 1880-1979, Sommariva and Tullio (1987), pp. 234-236. 1980-1995, International Financial Statistics Yearbook, 1996, pp. 376-379, line 99z.
(2) Real GNP. 1880-1985, data underlying Meltzer and Robinson (1989). Real GDP, 1986-1995, International Financial Statistics Yearbook, 1996, pp. 376-379, line 99br.
(3) Deflator. 1880-1985, Meltzer and Robinson (1989). 1986-1995, International Financial Statistics Yearbook, 1996, pp. 376-379, line 99bir.
(4) Consumer Price Index. 1880-1979, Sommariva and Tullio (1987), pp. 231-234. 1980-1995, International Financial Statistics Yearbook, 1996, pp. 376-379, line 64.
(5) Exchange Rate. Deutsche Mark/US Dollar. 1880-1979, Sommariva and Tullio (1987), pp. 231-234. 1990-1995, International Financial Statistics Yearbook, 1996, pp. 376-379, line rh.

A.4. France
(1) Population. 1880-1949, Mitchell (1978), table A1. 1950-1995, International Financial Statistics Yearbook, 1996, pp. 364-367, line 99z.
(2) Real GDP. 1880-1900, calculated from the Toutain index [Saint Marc (1983), pp. 99-100]. 1901-1949, Sauvy (1954). 1950-1988, INSEE, Statistique annuaire de la France retrospectif (1966) and Statistique annuaire de la France (various issues). 1989-1995, International Financial Statistics Yearbook, 1996, pp. 364-367, line 99br.
(3) Deflator. Calculated as the ratio of nominal to real GDP. Nominal GDP, 1880-1913, Levy-Leboyer and Bourguignon (1990), table A-III. 1914-1988, INSEE, Statistique annuaire de la France retrospectif (1966) and
Statistique annuaire de la France (various issues). 1989-1995, International Financial Statistics Yearbook, 1996, pp. 364-367, line 99bir.
(4) Consumer Price Index. 1880-1969, Saint Marc (1983), p. 107. 1970-1995, International Financial Statistics Yearbook, 1996, pp. 364-367, line 64.
(5) Exchange Rate. French Franc/US Dollar. 1880-1969, Saint Marc (1983), p. 107. 1970-1995, International Financial Statistics Yearbook, 1996, pp. 364-367, line rh.

A.5. Japan

(1) Population. 1880-1949, Bureau of Statistics (1957), Japan Statistical Yearbook. 1950-1995, International Financial Statistics Yearbook (various issues), pp. 458-461, line 99z.
(2) Real GNP. 1885-1988, data supplied by Robert Rasche. Real GDP, 1989-1995, International Financial Statistics Yearbook, 1996, pp. 458-461, line 99br.
(3) Deflator. 1885-1988, data supplied by Robert Rasche. 1989-1995, International Financial Statistics Yearbook, 1996, pp. 458-461, line 99bir.
(4) Consumer Price Index. 1950-1995, International Financial Statistics Yearbook (various issues), pp. 458-461, line 64.
(5) Exchange Rate. Japanese Yen/US Dollar. 1880-1989, data supplied by James Lothian. 1990-1995, International Financial Statistics Yearbook, 1996, pp. 458-461, line rh.
References

Abramovitz, M. (1986), "Catching up, forging ahead, and falling behind", Journal of Economic History 46(2, June):385-406.
Akaike, H. (1974), "A new look at the statistical model identification", IEEE Transactions on Automatic Control AC-19:716-723.
Akerlof, G.A., W.T. Dickens and G.L. Perry (1996), "The macroeconomics of low inflation", Brookings Papers on Economic Activity 1996(1):1-76.
Alesina, A., and A. Drazen (1991), "Why are stabilizations delayed?", American Economic Review 81(5):1170-1188.
Alogoskoufis, G.S. (1992), "Monetary accommodation, exchange rate regimes and inflation persistence", Economic Journal 102(412, May):461-480.
Alogoskoufis, G.S., and R. Smith (1991), "The Phillips curve, the persistence of inflation and the Lucas critique: evidence from exchange-rate regimes", American Economic Review 81(2):1254-1273.
Bagehot, W. (1873), Lombard Street. Reprint of the 1915 edition (Arno Press, New York, 1969).
Balke, N.S., and R.J. Gordon (1986), "Appendix B: Historical data", in: R.J. Gordon, ed., The American Business Cycle: Continuity and Change (University of Chicago Press, Chicago, IL).
Ball, L. (1994), "Credible disinflation with staggered price-setting", American Economic Review 84(March):282-289.
Barro, R.J. (1979), "Money and the price level under the gold standard", Economic Journal 89:12-33.
Barro, R.J. (1989), "Interest-rate targeting", Journal of Monetary Economics 23(January):3-30.
Barro, R.J., and D.B. Gordon (1983), "Rules, discretion and reputation in a model of monetary policy", Journal of Monetary Economics 12:101-121.
Barsky, R.B. (1987), "The Fisher hypothesis and the forecastability and persistence of inflation", Journal of Monetary Economics 19(1, January):3-24.
Baxter, M., and A.C. Stockman (1989), "Business cycles and the exchange-rate regime: some international evidence", Journal of Monetary Economics 23(May):377-400.
Bayoumi, T., and M.D. Bordo (1998), "Getting pegged: comparing the 1879 and 1925 gold resumptions", Oxford Economic Papers 50:122-149.
Bayoumi, T., and B. Eichengreen (1994a), "Economic performance under alternative exchange rate regimes: some historical evidence", in: P. Kenen, F. Papadia and F. Saccomani, eds., The International Monetary System (Cambridge University Press, Cambridge) 257-297.
Bayoumi, T., and B. Eichengreen (1994b), "Macroeconomic adjustment under Bretton Woods and the post-Bretton Woods float: an impulse-response analysis", Economic Journal 104(July):813-827.
Benjamin, D., and L. Kochin (1979), "Searching for an explanation of unemployment in interwar Britain", Journal of Political Economy 87:441-478.
Benjamin, D., and L. Kochin (1982), "Unemployment and unemployment benefits in 20th Century Britain: a reply to our critics", Journal of Political Economy 90:410-436.
Bernanke, B.S. (1983), "Non-monetary effects of the financial crisis in the propagation of the great depression", American Economic Review 73(June):257-276.
Bernanke, B.S. (1995), "The macroeconomics of the great depression: a comparative approach", Journal of Money, Credit and Banking 27(February):1-28.
Bernanke, B.S., and K. Carey (1996), "Nominal wage stickiness and aggregate supply in the great depression", Quarterly Journal of Economics 111(August):853-883.
Bernanke, B.S., and M. Gertler (1989), "Agency costs, net worth, and business fluctuations", American Economic Review 79(March):14-31.
Bernanke, B.S., and H. James (1991), "The Gold Standard, deflation and financial crisis in the great depression: an international comparison", in: R.G. Hubbard, ed., Financial Markets and Financial Crises (University of Chicago Press, Chicago, IL) 33-68.
Betts, C.M., M.D. Bordo and A. Redish (1996), "A small open economy in depression: lessons from Canada in the 1930s", Canadian Journal of Economics 29(February):1-36.
Blanchard, O.J., and D.T. Quah (1989), "The dynamic effects of aggregate demand and aggregate supply disturbances", American Economic Review 79(September):655-673.
Blanchard, O.J., and D.T. Quah (1993), "The dynamic effects of aggregate demand and supply disturbances: reply", American Economic Review 83(June):653-658.
Bloomfield, A. (1959), Monetary Policy under the International Gold Standard, 1880-1914 (Federal Reserve Bank of New York, New York).
Bordo, M.D. (1981), "The Classical Gold Standard: some lessons for today", Federal Reserve Bank of St. Louis Review 63(May):2-17.
Bordo, M.D. (1984), "The Gold Standard: the traditional approach", in: M.D. Bordo and A.J. Schwartz, eds., A Retrospective on the Classical Gold Standard, 1821-1931 (University of Chicago Press, Chicago, IL) 23-119.
Bordo, M.D. (1993a), "The Bretton Woods international monetary system: a historical overview", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System: Lessons for International Monetary Reform (University of Chicago Press, Chicago, IL) 3-108.
Bordo, M.D. (1993b), "The Gold Standard, Bretton Woods and other monetary regimes: an historical appraisal", in: Dimensions of Monetary Policy: Essays in Honor of Anatole B. Balbach. Federal Reserve Bank of St. Louis Review, Special Issue, April-May.
Bordo, M.D. (1995), "Is there a good case for a new Bretton Woods International Monetary System?", AEA Papers and Proceedings (May):317-322.
Bordo, M.D., and B. Eichengreen (1998), "Implications of the Great Depression for the development of the International Monetary System", in: M.D. Bordo, C. Goldin and E.N. White, eds., The Defining Moment: The Great Depression and the American Economy in the 20th Century (University of Chicago Press, Chicago, IL).
Bordo, M.D., and R.E. Ellson (1985), "A model of the Classical Gold Standard with depletion", Journal of Monetary Economics 16(1, July):109-120.
Bordo, M.D., and L. Jonung (1987), The Long-Run Behavior of Velocity of Circulation: The International Evidence (Cambridge University Press, New York).
Bordo, M.D., and L. Jonung (1990), "The long-run behavior of velocity: the institutional approach revisited", Journal of Policy Modeling 12(Summer):165-197.
Bordo, M.D., and L. Jonung (1996), "Monetary regimes, inflation and monetary reform", in: D. Vaz and K. Velupillai, eds., Inflation, Institutions and Information, Essays in Honor of Axel Leijonhufvud (Macmillan Press, London).
Bordo, M.D., and F.E. Kydland (1995), "The Gold Standard as a rule: an essay in exploration", Explorations in Economic History 32(4, October):423-464.
Bordo, M.D., and F.E. Kydland (1996), "The Gold Standard as a commitment mechanism", in: T. Bayoumi, B. Eichengreen and M. Taylor, eds., Modern Perspectives on the Gold Standard (Cambridge University Press, Cambridge).
Bordo, M.D., and R. MacDonald (1997), "Violations of the 'Rules of the Game' and the credibility of the Classical Gold Standard, 1880-1914", Working Paper (NBER, July).
Bordo, M.D., and H. Rockoff (1996), "The Gold Standard as a 'Good Housekeeping Seal of Approval'", Journal of Economic History 56(2, June):384-428.
Bordo, M.D., and A.J. Schwartz (1989), "Transmission of real and monetary disturbances under fixed and floating rates", in: J.A. Dorn and W.A. Niskanen, eds., Dollars, Deficits and Trade (Kluwer, Boston) 237-258.
Bordo, M.D., and A.J. Schwartz (1996a), "Why clashes between internal and external stability goals end in currency crises, 1797-1994", Open Economies Review 7(Suppl. 1):437-468.
Bordo, M.D., and A.J. Schwartz (1996b), "The operation of the Specie Standard: evidence for core and peripheral countries, 1880-1990", in: J. Braga de Macedo, B. Eichengreen and J. Reis, eds., Currency Convertibility: The Gold Standard and Beyond (Routledge, New York) 11-83.
Bordo, M.D., and C.A. Végh (1998), "What if Alexander Hamilton had been Argentinean: a comparison of the early monetary experiences of Argentina and the United States", Working Paper No. 6862 (NBER).
Bordo, M.D., and E. White (1993), "British and French finance during the Napoleonic Wars", in: M.D. Bordo and F. Capie, eds., Monetary Regimes in Transition (Cambridge University Press, Cambridge).
Bordo, M.D., E.U. Choudhri and A.J. Schwartz (1990), "Money stock targeting, base drift, and price-level predictability: lessons from the U.K. experience", Journal of Monetary Economics 25(March):253-272.
Bordo, M.D., E.U. Choudhri and A.J. Schwartz (1995), "Could stable money have averted the great contraction?", Economic Inquiry 33(July):484-505.
Bordo, M.D., C.J. Erceg and C.L. Evans (1997), "Money, sticky wages and the Great Depression", Working Paper No. 6071 (NBER, June).
Bretton Woods Commission (1994), "Bretton Woods: looking to the future" (Bretton Woods Commission, Washington, DC).
Brunner, K., and A.H. Meltzer (1964), An Analysis of Federal Reserve Monetary Policymaking. House Committee on Banking and Currency (Government Printing Office, Washington, DC).
Brunner, K., and A.H. Meltzer (1968), "What did we learn from the monetary experience of the United States in the Great Depression?", Canadian Journal of Economics 1(May):334-348.
Brunner, K., and A.H. Meltzer (1993), Money in the Economy: Issues in Monetary Analysis. Raffaele Mattioli Lectures (Cambridge University Press, Cambridge).
Cagan, P. (1965), Determinants and Effects of Changes in the Stock of Money 1875-1960 (Columbia University Press, New York).
Cagan, P. (1984), "On the Report of the Gold Commission 1982 and convertible monetary systems", Carnegie-Rochester Conference Series on Public Policy 21(Spring):247-267.
Calomiris, C.W. (1993), "Greenback resumption and silver risk: the economics and politics of monetary regime change in the United States, 1862-1900", in: M.D. Bordo and F. Capie, eds., Monetary Regimes in Transition (Cambridge University Press, Cambridge).
Calomiris, C.W., and G. Gorton (1991), "The origin of banking panics: models, facts, and bank regulation", in: R.G. Hubbard, ed., Financial Markets and Financial Crises (University of Chicago Press, Chicago, IL) 109-173.
Calomiris, C.W., and D.C. Wheelock (1998), "Was the Great Depression a watershed for American monetary policy?", in: M.D. Bordo, C. Goldin and E.N. White, eds., The Defining Moment: The Great Depression and the American Economy in the 20th Century (University of Chicago Press, Chicago, IL) 23-65.
Canzoneri, M.B. (1985), "Monetary policy games and the role of private information", American Economic Review 75(December):1056-1070.
Canzoneri, M.B., and D.W. Henderson (1991), Monetary Policy in Interdependent Economies (MIT Press, Cambridge, MA).
Capie, F., and A. Webber (1985), A Monetary History of the United Kingdom (Allen & Unwin, London).
Capie, F., T.C. Mills and G.E. Wood (1986), "What happened in 1931?", in: F. Capie and G.E. Wood, eds., Financial Crises and the World Banking System (Macmillan, London) 120-148.
Capie, F., C. Goodhart and N. Schnadt (1994), "The development of central banking", in: F. Capie, C. Goodhart, S. Fischer and N. Schnadt, The Future of Central Banking (Cambridge University Press, Cambridge) 1-231.
Cecchetti, S.G. (1992), "Prices during the Great Depression: was the deflation of 1930-1932 really unanticipated?", American Economic Review 82(March):141-156.
Cecchetti, S.G., and G. Karras (1994), "Sources of output fluctuations during the interwar period: further evidence on the causes of the Great Depression", Review of Economics and Statistics 76(February):80-102.
Chandler, L.V. (1958), Benjamin Strong: Central Banker (Brookings, Washington, DC).
Clark, T.A. (1986), "Interest rate seasonals and the Federal Reserve", Journal of Political Economy 94(February):76-125.
Cochrane, J.H. (1988), "How big is the random walk in GNP?", Journal of Political Economy 96(October):893-920.
Cogley, T. (1990), "International evidence on the size of the random walk in output", Journal of Political Economy 98(June):501-518.
Cook, T. (1989), "Determinants of the Federal Funds rate: 1979-1982", Federal Reserve Bank of Richmond Economic Review 75(January-February):3-19.
Cook, T., and T. Hahn (1989), "The effect of changes in the Federal Funds rate target on market interest rates in the 1970s", Journal of Monetary Economics 24(November):331-351.
Cooper, R. (1982), "The gold standard: historical facts and future prospects", Brookings Papers on Economic Activity 1982(1):1-45.
Crucini, M.J., and J. Kahn (1996), "Tariffs and aggregate economic activity: lessons from the Great Depression", Journal of Monetary Economics 38(December):427-467.
Cunliffe Report (1918), First Interim Report of the Committee on Currency and Foreign Exchanges after the War. Cmnd 9182. Reprinted 1979 (Arno Press, New York).
Darby, M.R., J.R. Lothian, A.E. Gandolfi, A.J. Schwartz and A.C. Stockman (1983), The International Transmission of Inflation (University of Chicago Press, Chicago, IL).
Davutyan, N., and W.R. Parke (1995), "The operations of the Bank of England, 1890-1908: a dynamic probit approach", Journal of Money, Credit and Banking 27(4, November, Part I):1099-1112.
DeCecco, M. (1974), Money and Empire: The International Gold Standard: 1890-1914 (Rowman and Littlefield, London).
DeKock, G., and V. Grilli (1989), "Endogenous exchange rate regime switches", Working Paper No. 3066 (NBER, August).
Dickey, D.A., and W.A. Fuller (1979), "Distribution of the estimators for autoregressive time series with a unit root", Journal of the American Statistical Association, Part I, 74(June):427-431.
Dominguez, K. (1993), "The role of international organizations in the Bretton Woods system", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System (University of Chicago Press, Chicago, IL) 357-404.
Dominguez, K., R.C. Fair and M.D. Shapiro (1988), "Forecasting the Depression: Harvard versus Yale", American Economic Review 78(September):595-612.
Dornbusch, R. (1996), "Commentary: How should central banks reduce inflation? - Conceptual issues", in: Achieving Price Stability (Federal Reserve Bank of Kansas City, August) 93-103.
Duguay, P. (1993), "Some thoughts on price stability versus zero inflation", mimeograph (Bank of Canada).
Dutton, J. (1984), "The Bank of England and the rules of the game under the international Gold Standard: new evidence", in: M.D. Bordo and A.J. Schwartz, eds., A Retrospective on the Classical Gold Standard (University of Chicago Press, Chicago, IL) 173-202.
Eichenbaum, M. (1992), "Comment on 'Central bank behavior and the strategy of monetary policy: observations from six industrialized countries'", in: O.J. Blanchard and S. Fischer, eds., NBER Macroeconomics Annual 1992 (MIT Press, Cambridge) 228-234.
Eichengreen, B. (1985), "Editor's introduction", in: B. Eichengreen, ed., The Gold Standard in Theory and History (Methuen, London).
Eichengreen, B. (1987), "Conducting the international orchestra: Bank of England leadership under the Classical Gold Standard", Journal of International Money and Finance 6:5-29.
Eichengreen, B. (1989), "The political economy of the Smoot-Hawley Tariff", in: R.L. Ransom, P.H. Lindert and R. Sutch, eds., Research in Economic History, vol. 12 (JAI Press, Greenwich, CT) 1-43.
Eichengreen, B. (1990), Elusive Stability (Cambridge University Press, New York).
Eichengreen, B. (1991a), "Editor's introduction", in: B. Eichengreen, ed., Monetary Regime Transformations (Edward Elgar, Cheltenham).
Eichengreen, B. (1991b), "Comparative performance of fixed and flexible exchange rate regimes: interwar evidence", in: N. Thygesen, K. Velupillai and S. Zambelli, eds., Business Cycles: Theories, Evidence and Analysis (Macmillan, London) 229-272.
Eichengreen, B. (1992), Golden Fetters: The Gold Standard and the Great Depression, 1919-1939 (Oxford University Press, New York).
Eichengreen, B. (1993a), "History of the international monetary system: implications for research in international macroeconomics and finance", in: F. van der Ploeg, ed., Handbook of International Macroeconomics (Blackwell, Oxford).
Eichengreen, B. (1993b), "Three perspectives on the Bretton Woods System", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System (University of Chicago Press/NBER, Chicago/New York).
Eichengreen, B. (1995), "Institutions and economic growth: Europe after World War II", in: N.F.R. Crafts and G. Toniolo, eds., Comparative Economic Growth of Postwar Europe (Cambridge University Press, Cambridge).
Eichengreen, B. (1996), Globalizing Capital: A History of the International Monetary System (Princeton University Press, Princeton, NJ).
Eichengreen, B., and P.M. Garber (1991), "Before the U.S. Accord: U.S. monetary-financial policy,
1945-51", in: R.G. Hubbard, ed., Financial Markets and Financial Crises (University of Chicago Press, Chicago, 1L) 175-205. Eichengreen, B., and I. McLean (1994), "The supply of gold under the pre-1914 Gold Standard", Economic History Review 48:288-309. Eichengreen, B., J. Tobin and C. Wyplosz (1995), "Two cases for sand in the wheels of international finance", Economic Journal 105(January):16~172. Emery, K.M. (1994), "Inflation persistence and Fisher effects: evidence of a regime change", Journal of Economics and Business 46(August):141-152. Evans, M., and E Wachtel (1993), "Were price changes during the great depression anticipated?: evidence from nominal interest rates", Journal of Monetary Economics 32(August):3-34. Faust, J., and E.M. Leeper (1994), "When do long-run identifying restrictions give reliable results?", Working Paper 94-2 (Federal Reserve Bank of Atlanta). Federal Reserve Board (1924), Tenth Annual Report (Government Printing Office, Washington, DC). Feldstein, M. (1996), "Overview", in: Achieving Price Stability (Federal Reserve Bank of Kansas City) 319-329. Feldstein, M. (1997), "The costs and benefits of going from low inflation to price stability", in: C. Romer and D. Romer, eds., Reducing Inflation (University of Chicago Press, Chicago, IL). Fischer, S. (1977), "Long-term contracts, rational expectations, and the optimal money supply rule", Journal of Political Economy 85(February):191~05. Fischer, S. (1994), "Modern central banking", in: E Capie, C. Goodhart, S. Fischer and N. Schnadt, The Future of Central Banking (Cambridge University Press, Cambridge) 262~08. Fishe, R.P.H., and M.E. Wohar (1990), "The adjustment of expectations to a change in regime: comment", American Economic Review 80(September):968-976. Fisher, I. (1920), Stabilizing the Dollar (Macmillan, New York). Fisher, I. (1922), The Purchasing Power of Money (Augustus M. Kelley Reprint, New York, 1965). Fishlow, A. 
(1985), "Lessons from the past: capital markets during the 19th Century and the interwar period", International Organization 39:383-439. Flandreau, M. (1996), "The French Crime of 1873: an essay on the emergence of the international Gold Standard, 1870-1880", Journal of Economic History 51(4, December):862-897. Flood, R.P., and P. Isard (1989), "Simple rules, discretion and monetary policy", Working Paper No. 2934 (NBER). Flood, R.P., and M. Mussa (1994), "Issues concerning nominal anchors for monetary policy", in: T.J.T. Balino and C. Cottarelli, eds., Framework for Monetary Stability: Policy Issues and Country Experiences (International Monetary Fund, Washington, DC). Ford, A.G. (1962), The Gold Standard 1880-1914: Britain and Argentina (Clarendon Press, Oxford). Frenkel, J.A., and M.L. Mussa (1985), "Asset markets, exchange rates, and the balance of payments", in: R.W. Jones and P.B. Kenen, eds., Handbook of International Economics, vol. 2 (North-Holland, Amsterdam) Chapter 14. Friedman, M. (1953), "The case for flexible exchange rates", in: Essays in Positive Economics (University of Chicago Press, Chicago, IL) 157-203. Friedman, M. (1968), "The role of monetary policy", American Economic Review 58(March):1-17. Friedman, M. (1984), "Lessons from the 1979-82 monetary policy experiment", American Economic Review 74(May):397-400. Friedman, M. (1990a), "Bimetallism revisited", Journal of Economic Perspectives 4(4):85-104. Friedman, M. (1990b), "The Crime of 1873", Journal of Political Economy 98(December):1159-1194. Friedman, M., and A.J. Schwartz (1963), A Monetary History of the United States, 1867-1960 (Princeton University Press, Princeton, NJ). Friedman, M., and A.J. Schwartz (1982), Monetary Trends in the United States and the United Kingdom (University of Chicago Press, Chicago, IL). Gali, J. (1992), "How well does the IS-LM model fit postwar U.S. data?", Quarterly Journal of Economics 107(May):709-738.
Ch. 3: Monetary Policy Regimes and Economic Performance: The Historical Record
Gallarotti, G.M. (1995), The Anatomy of an International Monetary Regime: The Classical Gold Standard 1880-1914 (Oxford University Press, New York). Garber, P.M. (1993), "The collapse of the Bretton Woods fixed exchange rate system", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System (University of Chicago Press, Chicago, IL). Garber, P.M., and R.P. Flood (1984), "Gold monetization and gold discipline", Journal of Political Economy 92(February):90-107. Genberg, H., and A. Swoboda (1993), "The provision of liquidity in the Bretton Woods system", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System (University of Chicago Press, Chicago, IL) 269-306. Ghosh, A.R., A.M. Gulde, J.D. Ostry and H. Wolf (1996), "Does the nominal exchange rate regime matter?", Working Paper 121 (IMF, November). Giavazzi, F., and A. Giovannini (1989), Limiting Exchange Rate Flexibility (MIT Press, Cambridge, MA). Giavazzi, F., and M. Pagano (1988), "The advantage of tying one's hands: EMS discipline and central bank credibility", European Economic Review 32:1055-1082. Gilbert, R.A. (1994), "A case study in monetary control: 1980-82", Federal Reserve Bank of St. Louis Review (September/October):35-55. Giovannini, A. (1986), "'Rules of the Game' during the International Gold Standard: England and Germany", Journal of International Money and Finance 5:467-483. Giovannini, A. (1993), "Bretton Woods and its precursors: rules versus discretion in the history of international monetary regimes", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System: Lessons for International Monetary Reform (University of Chicago Press, Chicago, IL). Goff, B.L., and M. Toma (1993), "Optimal seigniorage, the Gold Standard, and central banking financing", Journal of Money, Credit and Banking 25(February):79-95. Goodfriend, M.
(1983), "Discount window borrowing, monetary policy, and the post-October 6, 1979 Federal Reserve operating procedure", Journal of Monetary Economics 12(3, September):343-356. Goodfriend, M. (1987), "Interest rate smoothing and price level trend-stationarity", Journal of Monetary Economics 19(May):335-348. Goodfriend, M. (1988), "Central banking under the Gold Standard", Carnegie-Rochester Conference Series on Public Policy 19:85-124. Goodfriend, M. (1991), "Interest rates and the conduct of monetary policy", Carnegie-Rochester Conference Series on Public Policy 34:7-30. Goodfriend, M. (1993), "Interest rate policy and the inflation scare problem: 1979-1992", Federal Reserve Bank of Richmond Economic Quarterly 79(Winter):1-24. Goodhart, C.A.E. (1989), "The conduct of monetary policy", Economic Journal 99(June):293-346. Gordon, R.J. (1990), "What is new-Keynesian economics?", Journal of Economic Literature 28(September):1115-1171. Greenspan, A. (1994), "Open session: The development of central banking", in: F. Capie, C. Goodhart, S. Fischer and N. Schnadt, eds., The Future of Central Banking: The Tercentenary Symposium of the Bank of England (Cambridge University Press, Cambridge) 259. Grilli, V.U. (1990), "Managing exchange rate crises: evidence from the 1890s", Journal of International Money and Finance 9(September):258-275. Grossman, H.J., and J.B. Van Huyck (1988), "Sovereign debt as a contingent claim: excusable default, repudiation, and reputation", American Economic Review 78:1088-1097. Haberler, G. (1976), The World Economy, Money, and the Great Depression 1919-1939 (American Enterprise Institute for Public Policy Research, Washington, DC). Hamilton, J.D. (1987), "Monetary factors in the Great Depression", Journal of Monetary Economics 19(March):145-170.
Hamilton, J.D. (1992), "Was the deflation during the Great Depression anticipated? Evidence from the commodity futures market", American Economic Review 82(March):157-178. Helpman, E. (1981), "An exploration in the theory of exchange rate regimes", Journal of Political Economy 89(5):865-890. Helpman, E., and A. Razin (1979), "Toward a consistent comparison of alternative exchange-rate regimes", Canadian Journal of Economics 12:394-409. Hetzel, R.L. (1985), "The rules versus discretion debate over monetary policy in the 1920s", Federal Reserve Bank of Richmond Economic Quarterly 71(November/December):3-14. Ikenberry, G.J. (1993), "The political origins of Bretton Woods", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System: Lessons for International Monetary Reform (University of Chicago Press, Chicago, IL) 155-182. Ireland, P.N. (1993), "Price stability under long-run monetary targeting", Federal Reserve Bank of Richmond Economic Quarterly 79(1, Winter):25-45. Irwin, D.A. (1996), Against the Tide: An Intellectual History of Free Trade (Princeton University Press, Princeton). Jeanne, O. (1995), "Monetary policy in England 1893-1914: a structural VAR analysis", Explorations in Economic History 32:302-326. Jonung, L. (1984), "Swedish experience under the Classical Gold Standard, 1873-1914", in: M.D. Bordo and A.J. Schwartz, eds., A Retrospective on the Classical Gold Standard, 1821-1931 (University of Chicago Press, Chicago, IL). Kemmerer, E.W. (1910), Seasonal Variations in the Demand for Currency and Capital in the United States, National Monetary Commission (Government Printing Office, Washington, DC). Kenen, P.B. (1960), "International liquidity and the balance of payments of a reserve-currency country", Quarterly Journal of Economics (November):572-586. Keynes, J.M. (1925), "The economic consequences of Mr. Churchill", in: The Collected Writings of John Maynard Keynes, vol. IX, Essays in Persuasion (1972, Macmillan, London).
Keynes, J.M. (1930), The Applied Theory of Money: A Treatise on Money, Volume 1 of The Collected Writings (1971, Cambridge University Press, Cambridge). Kindleberger, C.P. (1973), The World in Depression, 1929-1939 (University of California Press, Berkeley, CA). King, M. (1996), "How should central banks reduce inflation? Conceptual issues", in: Achieving Price Stability (Federal Reserve Bank of Kansas City, August) 53-91. Klein, B. (1975), "Our new monetary standard: measurement and effects of price uncertainty, 1880-1973", Economic Inquiry 13:461-484. Kwiatkowski, D., P.C.B. Phillips, P. Schmidt and Y. Shin (1992), "Testing the null hypothesis of stationarity against the alternative of a unit root: how sure are we that economic time series have a unit root?", Journal of Econometrics 54(October/December):159-178. Kydland, F.E., and E.C. Prescott (1977), "Rules rather than discretion: the inconsistency of optimal plans", Journal of Political Economy 85:473-491. Lazaretou, S. (1995), "Government spending, monetary policies and exchange rate regime switches: the Drachma in the Gold Standard period", Explorations in Economic History 32(1, January):28-50. League of Nations (1930), First Interim Report of the Gold Delegation of the Financial Committee (Geneva). Lebow, D.E., J.O. Roberts and D.J. Stockton (1992), "Economic performance under price stability", Working Paper No. 125 (Board of Governors, Federal Reserve System, Division of Research and Statistics, April). Leijonhufvud, A. (1984), "Constitutional constraints on the monetary power of government", in: R.B. McKenzie, ed., Constitutional Economics (Lexington Books, Lexington, MA) 95-107. Levy-Leboyer, M., and F. Bourguignon (1990), The French Economy in the Nineteenth Century (Cambridge University Press, New York).
Lindert, P.H. (1969), Key Currencies and Gold, 1900-1913, Princeton Studies in International Finance (Princeton University Press, Princeton). Lippi, M., and L. Reichlin (1993), "The dynamic effects of aggregate demand and supply disturbances: comment", American Economic Review 83(June):644-653. Lucas Jr, R.E., and N.L. Stokey (1983), "Optimal fiscal and monetary policy in an economy without capital", Journal of Monetary Economics 12:55-93. Macaulay, F.R. (1938), Some Theoretical Problems Suggested in the Movements of Interest Rates, Bond Yields, and Stock Prices in the United States since 1856 (National Bureau of Economic Research, New York). Mankiw, N.G. (1987), "The optimal collection of seigniorage: theory and evidence", Journal of Monetary Economics 20:327-341. Mankiw, N.G., and J.A. Miron (1986), "The changing behavior of the term structure of interest rates", Quarterly Journal of Economics 101(May):211-228. Mankiw, N.G., J.A. Miron and D.N. Weil (1987), "The adjustment of expectations to a change in regime: a study of the founding of the Federal Reserve", American Economic Review 77(June):358-374. Marshall, A. (1926), Official Papers (Macmillan, London). Marston, R.C. (1993), "Interest differentials under Bretton Woods and the post-Bretton Woods float: the effects of capital controls and exchange risk", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System: Lessons for International Monetary Reform (University of Chicago Press, Chicago, IL) 515-540. McCallum, B.T. (1990), "Could a monetary base rule have prevented the Great Depression?", Journal of Monetary Economics 26(1):3-26. McCallum, B.T. (1991), "Seasonality and monetary policy: a comment", Carnegie-Rochester Conference Series on Public Policy 34:71-76. McCallum, B.T. (1996), "Commentary: how should central banks reduce inflation? - Conceptual issues", in: Achieving Price Stability (Federal Reserve Bank of Kansas City) 105-114. McCallum, B.T.
(1997), "Issues in the design of monetary policy rules", Working Paper No. 6016 (NBER, April). McKinnon, R.I. (1988), "An international Gold Standard without gold", Cato Journal 8(Fall):351-373. McKinnon, R.I. (1993), "International money in historical perspective", Journal of Economic Literature 31(1, March):1-44. Meigs, A.J. (1962), Free Reserves and the Money Supply (University of Chicago Press, Chicago, IL). Meltzer, A.H. (1977), "Monetary and other explanations of the start of the Great Depression", Journal of Monetary Economics 2:455-471. Meltzer, A.H. (1986), "Some evidence on the comparative uncertainty experienced under different monetary regimes", in: C.D. Campbell and W.R. Dougan, eds., Alternative Monetary Regimes (Johns Hopkins University Press, Baltimore, MD) 122-153. Meltzer, A.H. (1990), "Some empirical findings on differences between EMS and non-EMS regimes: implications for currency blocs", Cato Journal 10(2):455-483. Meltzer, A.H. (1995a), "The development of central banking, theory and practice", mimeograph, in: A History of the Federal Reserve (Carnegie-Mellon University) Chapter 2. Meltzer, A.H. (1995b), "Why did monetary policy fail in the Thirties?", mimeograph, in: A History of the Federal Reserve (Carnegie-Mellon University) Chapter 5. Meltzer, A.H. (1996), "In the beginning", mimeograph, in: A History of the Federal Reserve (Carnegie-Mellon University) Chapter 3. Meltzer, A.H., and S. Robinson (1989), "Stability under the Gold Standard in practice", in: M.D. Bordo, ed., Monetary History and International Finance: Essays in Honor of Anna J. Schwartz (University of Chicago Press, Chicago, IL) 163-195. Mills, T.C., and G.E. Wood (1993), "Does the exchange rate regime affect the economy?", Federal Reserve Bank of St. Louis Review (July/August):3-20.
Miron, J.A. (1986), "Financial panics, the seasonality of the nominal interest rate, and the founding of the Fed", American Economic Review 76(March):125-140. Miron, J.A. (1996), The Economics of Seasonal Cycles (MIT Press, Cambridge, MA). Mishkin, F.S. (1978), "The household balance sheet and the Great Depression", Journal of Economic History 38(December):918-937. Mishkin, F.S. (1992), "Is the Fisher effect for real?", Journal of Monetary Economics 30(2):195-215. Mitchell, B.R. (1978), European Historical Statistics, 1750-1970 (Columbia University Press, New York). Mussa, M., M. Goldstein, P.B. Clark, D. Mathieson and T. Bayoumi (1994), "Improving the International Monetary System: constraints and possibilities", Occasional Paper 116 (International Monetary Fund, Washington, DC, December). Nelson, C.R., and C.I. Plosser (1982), "Trends and random walks in macroeconomic time series: some evidence and implications", Journal of Monetary Economics 10(September):139-162. Nelson, D.B. (1991), "Was the deflation of 1929-1930 anticipated? The monetary regime as viewed by the business press", in: R.L. Ransom and R. Sutch, eds., Research in Economic History, vol. 13 (JAI Press, Greenwich, CT) 1-65. Nurkse, R. (1944), International Currency Experience (League of Nations, Geneva). O'Brien, A.P. (1989), "A behavioral explanation for nominal wage rigidity during the Great Depression", Quarterly Journal of Economics 104(November):719-735. Obstfeld, M. (1991), "Destabilizing effects of exchange rate escape clauses", Working Paper No. 3603 (NBER). Obstfeld, M. (1993), "The adjustment process", in: M.D. Bordo and B. Eichengreen, eds., A Retrospective on the Bretton Woods System (University of Chicago Press, Chicago, IL). Obstfeld, M., and A. Taylor (1998), "The Great Depression as a watershed: international capital mobility over the long run", in: M.D. Bordo, C. Goldin and E.N.
White, eds., The Defining Moment: The Great Depression and the American Economy in the Twentieth Century (University of Chicago Press, Chicago, IL). Officer, L. (1986), "The efficiency of the Dollar-Sterling Gold Standard, 1890-1908", Journal of Political Economy 94(October):1038-1073. Officer, L. (1996), Between the Dollar-Sterling Gold Points: Exchange Rates, Parity and Market Behavior (Cambridge University Press, New York). Oppers, S. (1996), "Was the worldwide shift to gold inevitable? An analysis of the end of Bimetallism", Journal of Monetary Economics 37:143-162. Orphanides, A., and D.W. Wilcox (1996), "The opportunistic approach to disinflation", Discussion Paper 96-24, mimeograph (Federal Reserve Board). Phelps, E.S. (1968), "Money wage dynamics and labor market equilibrium", Journal of Political Economy 76(July-August):678-711. Pierce, J.L. (1984), "Did financial innovation hurt the Great Monetarist Experiment?", American Economic Review 74(May):392-396. Pippenger, J. (1984), "Bank of England operations, 1893-1913", in: M.D. Bordo and A.J. Schwartz, eds., A Retrospective on the Classical Gold Standard, 1821-1931 (University of Chicago Press, Chicago, IL). Pollard, S., ed. (1970), The Gold Standard and Employment Policies Between the Wars (Methuen, London). Poole, W. (1991), "Interest rates and the conduct of monetary policy", Carnegie-Rochester Conference Series on Public Policy 34:31-40. Poterba, J.M., and J.J. Rotemberg (1990), "Inflation and taxation with optimizing government", Journal of Money, Credit and Banking 22:1-18. Prati, A. (1991), "Poincaré's stabilization: stopping a run on government debt", Journal of Monetary Economics 27(2, April):213-240. Prescott, E.C. (1996), Profile, The Region (Federal Reserve Bank of Minneapolis).
Redish, A. (1990), "The evolution of the Gold Standard in England", Journal of Economic History (December):789-806. Redish, A. (1993), "Anchors aweigh: the transition from commodity money to fiat money in Western economies", Canadian Journal of Economics 26(4, November):777-795. Redmond, J. (1984), "The Sterling overvaluation in 1925: a multilateral approach", Economic History Review, 2nd ser. 37:520-532. Robinson, D., and M.R. Wickens (1992), "Measuring real and nominal macroeconomic shocks and their international transmission under different monetary systems", Discussion Paper (London Business School, Center for Economic Forecasting). Rockoff, H. (1984), "Some evidence on the real price of gold, its cost of production, and commodity prices", in: M.D. Bordo and A.J. Schwartz, eds., A Retrospective on the Classical Gold Standard, 1821-1931 (University of Chicago Press, Chicago, IL). Rockoff, H. (1986), "Walter Bagehot and the theory of central banking", in: F. Capie and G.E. Wood, eds., Financial Crises and the World Banking System (Macmillan, London) 160-180. Romer, C.D. (1989), "The prewar business cycle reconsidered: new estimates of gross national product, 1869-1908", Journal of Political Economy 97(1, February):1-37. Romer, C.D. (1992), "What ended the Great Depression?", Journal of Economic History 52(December):757-784. Romer, C.D. (1993), "The Nation in Depression", Journal of Economic Perspectives 7(Spring):19-39. Rudebusch, G.D. (1995), "Federal Reserve interest rate targeting, rational expectations, and the term structure", Journal of Monetary Economics (April):245-274. Saint Marc, M. (1983), Histoire Monétaire de la France, 1800-1980 (Presses Universitaires de la France, Paris). Sargent, T. (1984), "Stopping moderate inflations: the methods of Poincaré and Thatcher", in: R. Dornbusch and M.H. Simonsen, eds., Inflation, Debt and Indexation (MIT Press, Cambridge, MA). Sargent, T. (1986), Rational Expectations and Inflation (Harper & Row, New York).
Saunders, A., and B. Wilson (1993), "Contagious bank runs: evidence from the 1929-1933 period", mimeograph (New York University Salomon Center). Sauvy, A. (1954), Rapport sur le Revenu National Présenté (Conseil Economique, Paris, March). Sayers, R.S. (1957), Central Banking After Bagehot (Clarendon Press, Oxford). Scammell, W.M. (1965), "The working of the Gold Standard", Yorkshire Bulletin of Economic and Social Research 12(May):32-45. Schwartz, A.J. (1984), "Introduction", in: M.D. Bordo and A.J. Schwartz, eds., A Retrospective on the Classical Gold Standard, 1821-1931 (University of Chicago Press, Chicago, IL). Schwartz, A.J. (1986a), "Real and pseudo-financial crises", in: F. Capie and G.E. Wood, eds., Financial Crises and the World Banking System (Macmillan, London) 10-31. Schwartz, A.J. (1986b), "Alternative monetary regimes: the Gold Standard", in: C.D. Campbell and W.R. Dougan, eds., Alternative Monetary Regimes (Johns Hopkins University Press, Baltimore, MD) 44-72. Schwartz, A.J. (1988), "Financial stability and the federal safety net", in: W.S. Haraf and R.M. Kushmeider, eds., Restructuring Banking Financial Services in America (American Enterprise Institute, Washington, DC) 34-52. Shiller, R.J. (1980), "Can the Fed control real interest rates?", in: S. Fischer, ed., Rational Expectations and Economic Policy (University of Chicago Press, Chicago, IL) 117-156; 165-167. Simmons, B. (1994), Who Adjusts: Domestic Sources of Foreign Economic Policy During the Interwar Years (Princeton University Press, Princeton). Sommariva, A., and G. Tullio (1987), German Macroeconomic History, 1880-1979 (St. Martin's Press, New York). Svensson, L.E.O. (1994), "Why exchange rate bands? Monetary independence in spite of fixed exchange rates", Journal of Monetary Economics 33(1):157-199.
Svensson, L.E.O. (1996a), "Commentary: how should monetary policy respond to shocks while maintaining long-run price stability? - Conceptual issues", in: Achieving Price Stability (Federal Reserve Bank of Kansas City) 209-219. Svensson, L.E.O. (1996b), "Price level targeting vs inflation targeting: a free lunch?", Working Paper No. 5719 (NBER, August). Taylor, J.B. (1980), "Aggregative dynamics and staggered contracts", Journal of Political Economy 88(1, February):1-23. Temin, P. (1976), Did Monetary Forces Cause the Great Depression? (W.W. Norton, New York). Temin, P. (1989), Lessons from the Great Depression (MIT Press, Cambridge, MA). Temin, P. (1993), "Transmission of the Great Depression", Journal of Economic Perspectives 7(Spring):87-102. Thomas, T.J. (1981), "Aggregate demand in the United Kingdom 1918-45", in: R. Floud and D.N. McCloskey, eds., The Economic History of Britain since 1700, vol. 2 (Cambridge University Press, Cambridge). Timberlake, R.H. (1993), Monetary Policy in the United States: An Intellectual and Institutional History (University of Chicago Press, Chicago, IL). Toma, M. (1997), Competition and Monopoly in the Federal Reserve System, 1914-1951 (Cambridge University Press, Cambridge). Trehan, B., and C.E. Walsh (1990), "Seigniorage and tax smoothing in the United States: 1914-1986", Journal of Monetary Economics 25:97-112. Triffin, R. (1960), Gold and the Dollar Crisis (Yale University Press, New Haven, CT). Tullio, G., and J. Wolters (1996), "Was London the conductor of the international orchestra or just the triangle player? An empirical analysis of asymmetries in interest rate behaviour during the Classical Gold Standard, 1876-1913", Scottish Journal of Political Economy 43(September):419-443. Weinstein, M.M. (1981), "Some macroeconomic impacts of the National Industrial Recovery Act, 1933-1935", in: K. Brunner, ed., The Great Depression Revisited (Martinus Nijhoff, Boston) 262-281. Wheelock, D.C.
(1991), The Strategy and Consistency of Federal Reserve Monetary Policy 1924-1933 (Cambridge University Press, Cambridge). White, E.N. (1984), "A reinterpretation of the banking crisis of 1930", Journal of Economic History 44(March):119-138. Wicker, E. (1965), "Federal Reserve monetary policy, 1922-33: a reinterpretation", Journal of Political Economy 73(August):325-343. Wicker, E. (1980), "A reconsideration of the causes of the banking panic of 1930", Journal of Economic History 40(September):571-583. Wicker, E. (1986), "Terminating hyperinflation in the dismembered Habsburg Monarchy", American Economic Review 76(June):350-364. Wicker, E. (1996), The Banking Panics of the Great Depression (Cambridge University Press, Cambridge). Wicksell, K. (1898), Interest and Prices: A Study of the Causes Regulating the Value of Money [Translated from the original German by R.F. Kahn (Macmillan, London, 1936)]. Wigmore, B.A. (1985), The Crash and Its Aftermath: A History of Securities Markets in the United States, 1929-1933 (Greenwood Press, Westport, CT). Wigmore, B.A. (1987), "Was the Bank Holiday of 1933 caused by a run on the Dollar?", Journal of Economic History 47(3, September):739-756.
Chapter 4
THE NEW EMPIRICS OF ECONOMIC GROWTH*
STEVEN N. DURLAUF and DANNY T. QUAH
University of Wisconsin, Madison and LSE
Contents
Abstract
Keywords
1. Introduction
2. Preliminaries and stylized facts
3. Theoretical models
4. From theory to empirical analysis
4.1. The neoclassical model: one capital good, exogenous technical progress
4.2. The neoclassical model: multiple capital goods
4.3. Endogenous growth: asymptotically linear technology
4.4. Nonconvexities and poverty traps
4.5. Endogenous growth: R&D and endogenous technical progress
4.6. Growth with cross-country interactions
5. Empirical techniques
5.1. Cross-section regression: β-convergence
5.2. Augmented cross-section regression
5.3. Panel-data analysis
5.4. Time series: unit roots and cointegration
5.5. Clustering and classification
5.6. Distribution dynamics
6. Conclusion
Appendix A. Proofs and additional discussions
A.1. Single capital good, exogenous technical progress
A.2. Endogenous growth: asymptotically linear technology
* We thank the British Academy, the ESRC, the John D. and Catherine T. MacArthur Foundation, the NSF, and the Santa Fe Institute for financial support. Kim-Sau Chung, Donald Hester, Brian Krauth, Rodolfo Manuelli, Hashem Pesaran, John Taylor and Jonathan Temple provided valuable comments. Kim-Sau Chung provided outstanding research assistance. The word "empiric" has, in Webster's Unabridged Dictionary, two definitions: first, "that depending upon the observation of phenomena"; second, "an ignorant and unlicensed pretender; a quack; a charlatan". We have chosen, nevertheless, to use the word in the title as we think it conveys the appropriate meaning.
Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
A.3. Distribution dynamics
Appendix B. Data
References
S.N. Durlauf and D.T. Quah
Abstract We provide an overview of recent empirical research on patterns of cross-country growth. The new empirical regularities considered differ from earlier ones, e.g., the well-known Kaldor stylized facts. The new research no longer makes production function accounting a central part of the analysis. Instead, attention shifts more directly to questions like, Why do some countries grow faster than others? It is this changed focus that, in our view, has motivated going beyond the neoclassical growth model.
Keywords
classification, convergence, cross-section regression, distribution dynamics, endogenous growth, neoclassical growth, regression tree, threshold, time series, panel data
JEL classification: C21, C22, C23, D30, E13, O30, O41
1. Introduction
Economists study growth across countries for at least three reasons. First, understanding the sources of varied patterns of growth is important: persistent disparities in aggregate growth rates across countries have, over time, led to large differences in welfare. Second, the intellectual payoffs are high: the theoretical hypotheses that bear on economic growth are broad and, perhaps justifiably, ambitious in scale and scope. Third, the first wave of new empirical growth analyses, by making strong and controversial claims, has provoked yet newer ways of analyzing cross-country income dynamics. These newer techniques are, in turn, generating fresh stylized facts on growth with important implications for theory. This chapter provides one overview of the current state of macroeconomists' knowledge on cross-country growth. Since a number of excellent summaries on this subject already exist [e.g., Barro and Sala-i-Martin (1995), Jones (1997), Pritchett (1997), Romer (1996)], it is useful to clarify how our presentation differs. First, our emphasis is empirical: we develop different growth models focusing on their observable implications for cross-country income data. To bring out key ideas, we eschew overly-restrictive and detailed parametric assumptions on the theoretical models that we develop below. We seek only restrictions on data that follow from a general class of models. At the same time, we show that it is relatively easy to specialize from our analysis to the various empirical specifications that have become standard in the literature. This allows assessing the generality and robustness of earlier empirical findings. Second, we provide an organizing framework for the different econometric approaches - time-series, panel-data, cross-section, and distribution dynamics - used by researchers. We survey what we take to be the important and econometrically sound findings, and we attempt to explain the different conclusions found across some of these studies.
We describe the links between alternative econometric specifications used in the literature and different observable implications of growth models. By organizing the discussion around a single general framework, we seek to gauge how far the empirical literature has succeeded at discriminating across alternative theories of growth. The questions studied in the new empirical growth literature differ from those in earlier empirical work embodying Kaldor's stylized facts [Kaldor (1963)] or those in a Solow-Denison production function accounting exercise [Solow (1957), Denison (1974)]. The new literature emphasizes understanding cross-country patterns of income, not the stability within a single economy of factor shares or "great ratios" (the ratio of output to capital, consumption, or investment).1 It eschews understanding
1 Some researchers have remarked that the original growth models should be viewed as explaining only within-country dynamics. Cross-country evidence, therefore, should not be taken to refute or support those theoretical models - especially with parameters and circumstances being so different across
growth exclusively in terms of factor inputs. It freely uses all kinds of auxiliary explanatory factors, thus no longer making the production function residual a primary part of the analysis, as was previously done. The remainder of this chapter is organized as follows. Section 2 develops some initial stylized facts: they differ from those typically given in empirical growth papers. We begin with them as they seem natural from the perspective of the theoretical framework we adopt. Sections 3 and 4 sketch some theoretical models that we use to organize the subsequent presentation of empirical results and models. Our goal is to provide a structure sufficiently rich to accommodate a range of theoretical perspectives and, at the same time, to allow comparing different empirical growth studies. Section 5 presents empirical models and critically evaluates the empirical findings and methodologies in the literature. Section 6 provides conclusions. Appendices A and B are the Technical and Data Appendices, covering material omitted from the main text for expositional convenience.
2. Preliminaries and stylized facts

Theoretical growth models typically analyze the behavior of a single representative national economy. However, turning to the observed historical experiences of national economies in the twentieth century, what is most striking instead is how no single national economy is usefully viewed as representative. Rather, understanding cross-country growth behavior requires thinking about the properties of the cross-country distribution of growth characteristics. What properties are most salient? A first set of stylized facts relates to the world population distribution. Most of the world's economies are small. Over the period 1960-1964, the largest 5% of the world's economies contained 59.0% of the world's population; the largest 10% contained 70.9%.2 A quarter-century later, over the period 1985-1989, the largest 5% of economies held 58.3% of the population; the largest 10%, 70.2%. In both periods, the lower 50% of the world's economies ranked by population held in total less than 12.5% of the world's population.
economies. There are at least two arguments against this position. First, even accepting the premise, it has long been part of scientific analysis that theories be tested by going beyond their original domain, without liberally adding free parameters in the process. Looking rigorously at cross-country evidence to assess growth models is simply part of that research tradition. Second, and more specifically on the topic, economists from at least Kaldor (1963) on have marshalled cross-country stylized facts as compelling starting points for discussions about economic growth. Indeed, Lucas (1988) and Romer (1986, 1994) use exactly income comparisons across countries to motivate their endogenous growth analyses. 2 Hereafter, "the world's economies" refers to the 122 countries with essentially complete income and population data for 1960-1989 in the Summers-Heston V6 database [Summers and Heston (1991)]. These countries are identified in Appendix B.
Ch. 4:
The New Empirics of Economic Growth
A second set of facts relates to the stability of these cross-country population distributions. For the last 35 years, the percentiles associated with the distribution of population across countries have been remarkably stable. This is not to say that those countries now highly populated have always been highly populated, rather that the distribution of cross-section differences has changed little. Indeed, churning within a stable cross-section distribution will figure prominently in discussions below. Economists have typically been most interested in growth models as a way to understand the behavior of per capita income or per worker output (labor productivity). What are the stylized facts here? From 1960 through 1989, world income per capita increased at an annual average rate of 2.25%. However, per capita incomes in individual economies varied widely around the world average. Averaged over 1960-1964, the poorest 10% of the world's national economies (in per capita incomes, taken at the beginning of the interval) each had per capita incomes less than 0.22 times the world average; those economies contained 26.0% of the world's population. Poor economies therefore appear to be also large ones, although it is actually China alone accounting for most of that population figure. By contrast, the richest 10% of national economies each had per capita incomes exceeding 2.7 times the world average, while altogether containing 12.5% of the world's population. By 1985-1989, the 10th percentile per capita income level had declined to 0.15 times the world average - those economies then held only 3.3% of the world's population, as China became relatively richer and was no longer a member of this group. At the same time the 90th percentile per capita income level increased to 3.08 times the world average; those economies' share of the world population fell to 9.3%.
In contrast to the stability of population size distributions, the cross-country distributions of per capita incomes seem quite volatile. The extremes appear to be diverging away from each other - with the poor becoming poorer, and the rich richer. However, that is not the entire picture. In 1960-1964, the income distance between the 15th and 25th percentiles was 0.13 times world per capita income; by 1985-1989, this distance had fallen to 0.06. Over this same time period, the income distance between the 85th and 95th percentiles fell from 0.98 times world per capita income to 0.59. Thus, while the overall spread of incomes across countries increased over this 25-year period, that rise was far from uniform. Within clusters, one sees instead a fall in the spread between (relatively) rich and (relatively) poor. Figure 1 plots a stylized picture of the empirical regularities just described. The figure shows the distribution of income across national economies at two different points in time. It caricatures the increase in overall spread together with the reduction in intra-distribution inequalities by an emergence of distinct peaks in the distribution. Figure 1 also shows, to scale, the historical experiences of some relative growth successes and failures. Singapore and South Korea experienced high growth relative to the world average, Venezuela the opposite. The above constitutes an initial set of stylized facts around which we organize our discussion of economic growth in this chapter. We focus on the dynamics of per capita incomes as providing the background against which to assess alternative empirical
S.N. Durlauf and D.T. Quah
[Figure: cross-country income distributions at two dates t and t + s, plotted against time, with income increasing from "poor" to "rich" on the vertical axis.]
Fig. 1. Evolving cross-country income distributions. Post-1960 experiences projected over 40 years for named countries are drawn to scale, relative to actual historical cross-country distributions.
analyses of growth. In this we depart from, say, Kaldor's (1963) stylized facts - the stability of factor shares, the variability of factor input quantities, the stability of time-averaged growth rates in income and in physical capital investment, and so on. Recent empirical analyses of growth and convergence study how alternative conditioning economic variables or different economic hypotheses imply differing behavior for time paths of per capita incomes. We think it useful, therefore, to focus on exactly those dynamics.
3. Theoretical models
This section develops a growth model on which we will base our analysis of the empirical literature. The model is designed to ease comparison across different studies, and to clarify the lessons from empirical work for theoretical reasoning. Consider a closed economy, with total output denoted Y. Let the quantity of labor input be N, and assume that the stock of human capital H is embodied in the labor force so that the effective labor input is Ñ = NH. There are different kinds of physical
capital; write them as the vector K = (K_1, K_2, …). Finally, let A be the (scalar) state of technology. We use two different production technologies in the discussion: Y = F̃(K, N, A), where either

F̃(K, N, A) = F(K, NA)    (1a)

or

F̃(K, N, A) = AF(K, N).    (1b)
The distinction between these is whether technical change is labor-augmenting (1a) or Hicks-neutral (1b). We will generally employ Equation (1a), but will draw on Equation (1b) to provide certain links to the literature. Initially, we assume that F is twice differentiable, homogeneous of degree 1, increasing, and jointly concave in all its arguments and strictly concave in each. Different combinations of these assumptions will be relaxed when we consider endogenous growth models. In addition, we require some Inada-type conditions on F such that ∀l, and for all A, N, K_1, K_2, …, K_{l-1}, K_{l+1}, … greater than 0:

lim_{K_l → 0} F(K_1, …, K_{l-1}, K_l, K_{l+1}, …, N, A) ≥ 0,    (2)

and ∀l:

∂F/∂K_l → ∞ as K_l → 0.    (3)
The homogeneity of degree 1 and concavity assumptions rule out increasing-returns endogenous growth. However, as we will see below, they can nevertheless generate observations usually taken as evidence for endogenous growth models with technological nonconvexities. Define quantities in per effective labor unit terms as ỹ ≝ Y/(ÑA) and the vector k̃ ≝ (ÑA)⁻¹K. These are unobservable, however, and so we write their measured counterparts as:

y ≝ ỹHA = Y/N,
k ≝ (k_1, k_2, …) = k̃HA = N⁻¹K.

The definitions imply y = F(k, HA) under assumption (1a) and y = AF(k, H) under assumption (1b). In turn, under assumption (1a) total output can be rewritten as

Y = ÑA × F((ÑA)⁻¹K, 1) ⟹ ỹ = f(k̃),
where f(·) ≝ F(·, 1). This gives the growth rate in per worker output y as

ẏ/y = Ḣ/H + Ȧ/A + f(k̃)⁻¹ [∇f(k̃)]′ dk̃/dt,

with ∇f denoting the gradient of f:

[∇f(k̃)]′ = (∂f(k̃)/∂k̃_1, ∂f(k̃)/∂k̃_2, …).

But

[∇f(k̃)]′ dk̃/dt = Σ_l (∂f(k̃)/∂k̃_l) k̃_l × (k̇_l/k_l − Ḣ/H − Ȧ/A),

so that defining

s_l(k̃) ≝ (∂f(k̃)/∂k̃_l) × k̃_l / f(k̃)

(necessarily, s_l(k̃) ∈ [0, 1] and Σ_l s_l(k̃) ≤ 1) we have the growth equation

ẏ/y = Ḣ/H + Ȧ/A + Σ_l s_l(k̃) × (k̇_l/k_l − Ḣ/H − Ȧ/A),

or

ẏ/y = (1 − Σ_l s_l(k̃)) × (Ḣ/H + Ȧ/A) + Σ_l s_l(k̃) × k̇_l/k_l.    (4a)

(Equation 4a refers to both the expressions above, as they are logically identical.) Applying similar reasoning to specification (1b) we obtain the growth equation

ẏ/y = Ḣ/H + Ȧ/A + Σ_l s_l(k̃A) × (k̇_l/k_l − Ḣ/H),    (4b)

where the functions s_l are defined as before, only here they are evaluated at k̃A rather than k̃. But no matter where they are evaluated, each s_l is nonnegative, and their sum
is bounded from above by 1. When F is Cobb-Douglas, each s_l is constant. More generally, nonnegativity and boundedness of the s_l's follow from the assumptions that F is increasing, homogeneous, and concave. The growth-rate terms multiplying the s_l on the right-hand side of Equations (4a) and (4b) can therefore have only similarly bounded impact on the growth rate ẏ/y (i.e., the impact of k̇_l/k_l on ẏ/y is never more than one-for-one). To study the dynamics of this system under different economic assumptions, we first provide some definitions. We say balanced growth is a collection of time paths in observable per capita quantities (y, k) with

ẏ/y = k̇_l/k_l = a constant, ∀l.    (5)

A balanced-growth equilibrium is a collection of time paths in (y, k) satisfying balanced growth (5) and consistent with the decisions of all economic agents in a specific model. Finally, an equilibrium tending towards balanced growth is a collection of time paths in (y, k) consistent with a specific economic model and satisfying

lim_{t→∞} ẏ(t)/y(t) exists, and lim_{t→∞} (ẏ(t)/y(t) − k̇_l(t)/k_l(t)) = 0 ∀l.    (6)
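As a purely illustrative check on the decomposition in Equation (4a) - a sketch of our own, not part of the chapter's analysis, with hypothetical parameter values throughout - the following simulates exponentially growing inputs under a two-capital Cobb-Douglas technology, for which each share s_l equals the constant exponent a_l, and confirms that the directly computed growth rate of y matches the right-hand side of (4a):

```python
# Illustrative check of Equation (4a); all parameter values are hypothetical.
a = [0.3, 0.2]                    # Cobb-Douglas exponents a_l, sum < 1
g_K = [0.05, 0.03]                # growth rates of K_1, K_2
g_H, g_A, v = 0.01, 0.02, 0.015   # growth rates of H, A, N

def log_y(t):
    """log of per worker output y = f(ktilde)*H*A, with
    ktilde_l = K_l/(N*H*A) and all inputs growing exponentially."""
    log_NHA = (v + g_H + g_A) * t
    log_f = sum(a[l] * (g_K[l] * t - log_NHA) for l in range(2))
    return log_f + (g_H + g_A) * t

# growth rate of y by centered finite difference
dt = 1e-5
growth_direct = (log_y(dt) - log_y(-dt)) / (2 * dt)

# right-hand side of (4a): s_l = a_l for Cobb-Douglas; kdot_l/k_l = g_K[l] - v
growth_4a = g_H + g_A + sum(a[l] * ((g_K[l] - v) - g_H - g_A) for l in range(2))

assert abs(growth_direct - growth_4a) < 1e-8
```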
Conditions (5) and (6) are appropriate to use when working with the observable quantities y and k. Translating them to the technology-adjusted ỹ and k̃ is trivial and often convenient when discussing theoretical models. We will do so freely below. Also, conditions (5) and (6) are, again, appropriate when the model is deterministic. For stochastic models, they can be modified to be, for instance, statements on expectations. We consider some of those below in Section 5. In the best-known case - the neoclassical growth model with exogenous technical progress - technology is assumed to be

A(t) = A(0) e^{ξt},

so that ξ is the exogenously given constant rate of technical progress. Balanced-growth equilibrium then occurs with (y, k) growing at rate ξ, and therefore implying (ỹ, k̃) constant and finite. That equilibrium, under the standard assumptions we have made here, is approached from almost all initial values of k̃. In other situations, such as under endogenous growth, there is no guarantee that a balanced-growth equilibrium exists. We will then be interested in whether there are equilibria that tend towards balanced growth, and if so, what characteristics those show. Distinguishing balanced-growth equilibrium and balanced growth is useful to understand how adding economic structure to Equations (4a) and (4b) can produce new
insights. For instance, suppose technical change is labor-augmenting so that growth follows Equation (4a). Suppose further that F is Cobb-Douglas, so that

F(K, ÑA) = (Π_l K_l^{a_l}) (ÑA)^{1 − Σ_l a_l}, with a_l > 0 and Σ_l a_l ∈ (0, 1),

giving

f(k̃) = Π_l k̃_l^{a_l}.

Equation (4a) then becomes

ẏ/y − Ḣ/H − Ȧ/A = Σ_l a_l × (k̇_l/k_l − Ḣ/H − Ȧ/A),

so that under balanced growth (5), with ẏ/y = k̇_l/k_l for all l,

ẏ/y − Ḣ/H − Ȧ/A = (Σ_l a_l) × (ẏ/y − Ḣ/H − Ȧ/A).

Since the multiplier (Σ_l a_l) is strictly less than 1, equality between ẏ/y and k̇_l/k_l can occur only at

ẏ/y − Ḣ/H − Ȧ/A = 0,

independent of any other economic structure beyond the technology specification. [We will see this below when we study the Solow-Swan model (Solow 1956, 1957, Swan 1956), its general equilibrium Cass-Koopmans version (Cass 1965, Koopmans 1965), and the modification due to Mankiw, Romer and Weil (1992).] The reasoning just given extends naturally to production technologies beyond Cobb-Douglas when the counterpart to Σ_l a_l (or, more generally, Σ_l s_l(k̃)) is not constant but always remains strictly less than 1. The reasoning fails, instructively, in the following counterexample. Suppose k̃ is scalar but F is CES with
F(K, ÑA) = [γ_K K^a + γ_N (ÑA)^a]^{1/a},    0 < a < 1 and γ_K, γ_N > 0,

so that f(k̃) = [γ_K k̃^a + γ_N]^{1/a}. Then

s_1(k̃) = γ_K / (γ_K + γ_N k̃^{−a}) → 1 as k̃ → ∞.

Here, it is now possible to have ẏ/y and k̇/k always positive and tending towards positive balanced growth in a way that varies with economic parameters. This behavior
occurs also in endogenous growth models that exploit externalities and increasing returns [Romer (1986)] or in models with "asymptotically linear" production technology [Jones and Manuelli (1990), Rebelo (1991)]. Our definition of balanced-growth equilibrium compares the growth rates ẏ/y and k̇_l/k_l. This is not sensible for technology (1b), where we see that Ȧ/A appears with ẏ/y − Ḣ/H but not with k̇_l/k_l − Ḣ/H. The definition of balanced growth is, then, not generally useful for such technologies, although special cases exist when it is - for instance where A is suitably endogenized. If factor input markets are competitive and F fully describes the contribution of factor inputs to production, then s_l is the factor share of total output paid to the owners of the lth physical capital good. However, the discussion thus far has made no assumptions about market structure, the behavior of economic agents, the processes of capital accumulation and technological progress, and so on. Production functions (1a) and (1b) imply, respectively, (4a) and (4b), regardless of whether savings rates are endogenous (as in the Cass-Koopmans approach) or exogenous (as in the Solow-Swan formulation). The implications hold independent of whether technology A evolves exogenously, or endogenously through physical capital accumulation or R&D investment. Thus, growth theories whose substantive differences lie in alternative F specifications can be compared by studying the different restrictions they imply for dynamics (4a) and (4b). This reasoning provides a useful insight for empirically distinguishing endogenous and neoclassical growth models. In so far as many models differ substantively only through alternative specifications of the production technology, formulating them within a general equilibrium framework might have only limited payoff empirically.
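The limiting behavior of the CES share s_1(k̃) in the counterexample above can be verified numerically. The sketch below is our own illustration, with hypothetical values for γ_K, γ_N, and a:

```python
# Share s_1(ktilde) for the CES technology; gamma_K, gamma_N, a hypothetical.
gamma_K, gamma_N, a_ces = 0.4, 0.6, 0.5

def s1(ktilde):
    # s_1(ktilde) = gamma_K / (gamma_K + gamma_N * ktilde**(-a))
    return gamma_K / (gamma_K + gamma_N * ktilde ** (-a_ces))

shares = [s1(k) for k in (1.0, 10.0, 100.0, 1e6)]
assert all(x < y for x, y in zip(shares, shares[1:]))  # monotone increasing
assert 1.0 - shares[-1] < 1e-2                         # tending towards 1
```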
To be clear, doing so is important for issues such as existence or optimality, and sometimes can place further qualitative restrictions on the behavior of particular aggregates. However, it provides no fundamentally new empirical perspective. Indeed, studies such as Barro and Sala-i-Martin (1991, 1992), while using general equilibrium formulations to justify their empirical analyses, typically consider regression models observationally equivalent to the Solow-Swan model with exogenous savings rates. Many approaches to studying growth empirics can be viewed as tracing out implications of either Equation (4a) or Equation (4b). For example, under (4a) a researcher investigating the determinants of long-run economic growth might consider situations where the last summand - the term involving the different capital stocks - vanishes, and seek only to understand the economic forces driving Ḣ/H and Ȧ/A. Alternatively, a researcher interested in the dynamics surrounding the time path implied by Ḣ/H + Ȧ/A might seek to model only Σ_l s_l(k̃) × (k̇_l/k_l − Ḣ/H − Ȧ/A) or Σ_l s_l(k̃A) × (k̇_l/k_l − Ḣ/H), taking as given (conditioning on) Ḣ/H and Ȧ/A. This is exactly what is done in studies of conditional β-convergence (defined in Section 5 below): see, e.g., Barro and Sala-i-Martin (1992) or Mankiw, Romer and Weil (1992). Finally, this formulation highlights how certain terminologies have been used inconsistently in the literature. For example, while Lucas (1988) uses a definition of
human capital that is H in our formulation, Mankiw, Romer and Weil (1992) use a definition of human capital that is one of the components in vector K. Of course, both definitions are consistent with higher human capital improving labor productivity, but they do so in conceptually distinct ways. While interesting exceptions exist, a wide range of growth models can be cast as special cases of our framework. We use it then as an organizing structure for the analysis of empirical work that follows.
4. From theory to empirical analysis
In this section, we consider a number of growth models in the literature, and study how they restrict observations on growth dynamics.

4.1. The neoclassical model: one capital good, exogenous technical progress
The first specific structure we consider is the neoclassical growth model, as developed in Barro and Sala-i-Martin (1992), Cass (1965), Koopmans (1965), Solow (1956, 1957), and Swan (1956). As argued in Section 3, the key empirical implications of the neoclassical model depend solely on the assumed production function. However, some quantitative features of the dynamics do depend on preferences. To clarify those, we study a general equilibrium formulation here. The neoclassical model assumes the production function (1a) supplemented with the following:
Ḣ/H = 0,    normalizing H(0) = 1,    (7a)
Ȧ/A = ξ ≥ 0,    given A(0) > 0,    (7b)
Ṅ/N = v ≥ 0,    given N(0) > 0,    (7c)
K scalar,    given K(0) > 0.    (7d)
(7d)
These assumptions say that only physical capital is accumulated, and population growth and technical change are exogenous. In addition, assume that VNA > 0
lim F ( K , l , ~ j _ O. K--+oo K
(8)
Let physical capital depreciate exponentially at rate δ > 0. Physical capital accumulation will be assumed to follow one of two possibilities. First, as in Solow
(1956) and Swan (1956), suppose savings is a constant fraction τ ∈ (0, 1) of income. Then,

(dk̃/dt)/k̃ = τ f(k̃)/k̃ − (δ + v + ξ).    (9a)
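The global stability of the accumulation dynamics (9a) can be illustrated with a simple Euler integration of dk̃/dt = τf(k̃) − (δ + v + ξ)k̃ under Cobb-Douglas f. This is a sketch of our own; the parameter values are hypothetical:

```python
# Euler integration of (9a) with f(ktilde) = ktilde**alpha.
# All parameter values are hypothetical.
tau, delta, v, xi, alpha = 0.2, 0.05, 0.01, 0.02, 1.0 / 3.0

# steady state solves tau * f(k*) / k* = delta + v + xi
k_star = (tau / (delta + v + xi)) ** (1.0 / (1.0 - alpha))

def simulate(k0, dt=0.01, T=600.0):
    k = k0
    for _ in range(int(T / dt)):
        k += dt * (tau * k ** alpha - (delta + v + xi) * k)
    return k

# convergence from both below and above the steady state
assert abs(simulate(0.1 * k_star) - k_star) < 1e-6 * k_star
assert abs(simulate(10.0 * k_star) - k_star) < 1e-6 * k_star
```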
As the second possibility, suppose as in Cass (1965) and Koopmans (1965) that economy-wide savings is determined by the optimization problem

max_{{c(t), K(t)}, t ≥ 0} N(0) ∫_0^∞ U(c(t)) e^{−(ρ−v)t} dt,    ρ > v + ξ ≥ 0,

subject to

K̇(t) = Y(t) − c(t) N(t) − δ K(t),

U(c) = (c^{1−θ} − 1)/(1 − θ),    θ > 0,    (10)

and (1a), (7a-d). The maximand in Equation (10) is the number of people multiplied by what each enjoys in present discounted value of utility from consumption c. The K̇ constraint says that capital accumulates from the output left over after total consumption and depreciation. Coefficient θ parametrizes the intertemporal elasticity of substitution in consumption, while ρ is the discount rate. We emphasize that we have restricted ρ to be not just nonnegative but to exceed the sum of the rates of population growth and technical change,

ρ > v + ξ.    (11)
Equation (10) determines consumption and thus savings and investment to maximize social welfare. Define c̃ to be per capita consumption normalized by technology, i.e., c̃ = c/A. Appendix A shows that the necessary first-order conditions to Equation (10) are:

(dk̃/dt)/k̃ = f(k̃)/k̃ − c̃/k̃ − (δ + v + ξ),
(dc̃/dt)/c̃ = (∇f(k̃) − [ρ + δ + θξ]) θ⁻¹,    (9b)
lim_{t→∞} k̃(t) e^{−(ρ−v−ξ)t} = 0.
A balanced-growth equilibrium is a positive time-invariant technology-normalized capital stock k̃* (together with the implied ỹ* = f(k̃*)) such that under Equation (9a)
dk̃/dt = 0, i.e., f(k̃*)/k̃* = (δ + v + ξ) τ⁻¹,

Fig. 2. Solow-Swan growth and convergence. [Figure: the curve f(k̃)k̃⁻¹ plotted against k̃, crossing the horizontal line (δ + v + ξ)τ⁻¹ at k̃*.] The function f(k̃)k̃⁻¹ is continuous, and tends to infinity and zero as k̃ tends to zero and infinity respectively. Moreover, it is guaranteed to be monotone strictly decreasing. The vertical distance between f(k̃)k̃⁻¹ and (δ + v + ξ)τ⁻¹ is τ⁻¹ (dk̃/dt)/k̃. Convergence to the steady state k̃* therefore occurs for all initial values of k̃.

and under Equation (9b)

dk̃/dt = 0 and dc̃/dt = 0, where c̃* = f(k̃*) − (δ + v + ξ) k̃* ∈ (0, f(k̃*)).

(Our balanced-growth equilibrium definition implies that we can specialize to time-invariant k̃.) Balanced-growth predictions are identical under either accumulation assumption, (9a) or (9b). To see this, note that at balanced-growth equilibrium under assumption (9b) we can find τ in (0, 1) such that

c̃* = f(k̃*) − (δ + v + ξ) k̃* = (1 − τ) f(k̃*),

as both k̃ and c̃ are constant through time; Equation (9b) thus reduces to (9a). Two questions arise from this formulation. First, does a balanced-growth equilibrium always exist? And, second, even if both formulations have the same empirical implications in long-run steady state, do transitions to steady state differ? Figure 2 shows that a unique balanced-growth equilibrium exists and that k̃ satisfying assumption (9a) is dynamically stable everywhere in the region k̃ > 0 (Appendix A also proves this). Since ỹ = f(k̃), we immediately have that output per effective worker too has a unique, globally stable steady state.
The dynamics of this model can be understood further by taking a Taylor series expansion of log k̃ about the steady state log k̃*:

d/dt log k̃ ≈ τ (∇f(k̃*) − f(k̃*) k̃*⁻¹) × (log k̃ − log k̃*).

For F Cobb-Douglas,

F(K, ÑA) = K^a (ÑA)^{1−a}, a ∈ (0, 1) ⟹ f(k̃) = k̃^a,    (12)

this first-order series expansion becomes

d/dt log k̃ = −(1 − a)(δ + v + ξ) × (log k̃ − log k̃*) = λ × (log k̃ − log k̃*),

where we have defined

λ ≝ −(1 − a)(δ + v + ξ) < 0.    (13)

Solving this differential equation gives

log k̃(t) − log k̃* = (log k̃(0) − log k̃*) e^{λt},
log ỹ(t) − log ỹ* = (log ỹ(0) − log ỹ*) e^{λt} → 0 as t → ∞,    (14a)

i.e., log k̃ and log ỹ converge to their respective steady-state values log k̃* and log ỹ* ≝ log f(k̃*) exponentially at rate |λ|. As a increases to 1 this rate of convergence approaches 0: thus, the larger is the Cobb-Douglas coefficient on physical capital, the slower does log ỹ converge to its steady-state value. Under the Cobb-Douglas assumption (12), the accumulation equation (9a) and Figure 2 imply the steady-state level

ỹ* = [(δ + v + ξ)⁻¹ τ]^{a/(1−a)}.    (15)
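To get a feel for the magnitudes in Equations (13) and (15), the following sketch (our own illustration, with hypothetical parameter values) computes the convergence rate, its implied half-life, and the steady-state level for several capital exponents, confirming that convergence slows as a rises:

```python
import math

# Convergence rate (13) and steady-state level (15); hypothetical values.
delta, v, xi, tau = 0.05, 0.01, 0.02, 0.2

def convergence_rate(a):
    return -(1 - a) * (delta + v + xi)                      # Equation (13)

def y_star(a):
    return ((delta + v + xi) ** -1 * tau) ** (a / (1 - a))  # Equation (15)

for a in (1.0 / 3.0, 0.6, 0.9):
    lam = convergence_rate(a)
    half_life = math.log(2) / abs(lam)   # years for the log-gap to halve
    assert lam < 0 and y_star(a) > 0

# the larger the capital exponent a, the slower the convergence
rates = [abs(convergence_rate(a)) for a in (1.0 / 3.0, 0.6, 0.9)]
assert rates[0] > rates[1] > rates[2]
```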
Equation (15) gives steady-state income levels as depending positively on the saving rate and negatively on the labor force growth rate. Before discussing in detail the empirical implications of Equation (14a), we turn to how the Solow-Swan and the general equilibrium Cass-Koopmans versions of this
model differ in their observable predictions. First, rewrite the first two equations in Equation (9b) as

d/dt (log k̃, log c̃)′ = ( f(k̃)/k̃ − c̃/k̃ − (δ + v + ξ), (∇f(k̃) − [ρ + δ + θξ]) θ⁻¹ )′.    (16)

Define the zero of ((dk̃/dt)/k̃, (dc̃/dt)/c̃) by (k̃*, c̃*). (Appendix A establishes that this is well-defined.) Then the first-order Taylor series expansion of (log k̃, log c̃) about the steady state (log k̃*, log c̃*) is:

d/dt (log k̃, log c̃)′ ≈ M × (log k̃ − log k̃*, log c̃ − log c̃*)′,
M ≝ ( ∇f(k̃*) − (f(k̃*) − c̃*)/k̃*    −c̃*/k̃* ; ∇²f(k̃*) k̃* θ⁻¹    0 ).    (17)

The coefficient matrix M in Equation (17) has determinant ∇²f(k̃*) c̃* θ⁻¹ < 0, so its eigenvalues are real and of opposite sign. Moreover, its trace is

∇f(k̃*) − (f(k̃*) − c̃*)/k̃* = (ρ + δ + θξ) − (δ + v + ξ) = ρ − (v + ξ) + θξ > 0.

Denote the eigenvalues of M by λ_1 > 0 > λ_2.

Fig. 3. Eigenvalues in the Cass-Koopmans model. [Figure: the eigenvalue pair (λ_1, λ_2) determined by λ_1 + λ_2 = ρ − v − ξ + θξ > 0 and λ_1 λ_2 = ∇²f(k̃*) c̃* θ⁻¹ < 0.] Since ∇²f(k̃*) c̃* θ⁻¹ = (∇²f(k̃*) k̃*) × [f(k̃*)/k̃* − (δ + v + ξ)] θ⁻¹, if f(k̃) = k̃^a with a ∈ (0, 1), then as a increases towards unity the negative eigenvalue λ_2 rises towards zero.

Figure 3 uses these determinant and trace properties to establish how λ_1 and λ_2 vary with the parameters of the model. For
the Cobb-Douglas technology f(k̃) = k̃^a, the eigenvalue λ_2 increases towards 0 as a rises towards 1. Eigenvalue λ_2 determines the dynamics local to the steady state as:

log k̃(t) − log k̃* = (log k̃(0) − log k̃*) e^{λ_2 t},
log c̃(t) − log c̃* = (log c̃(0) − log c̃*) e^{λ_2 t},    (18)

with [log k̃(0) − log k̃*] and [log c̃(0) − log c̃*] satisfying a specific proportionality condition described in Appendix A. Then for technology (12), with ỹ* = (k̃*)^a, the first equation in (18) gives

log ỹ(t) − log ỹ* = (log ỹ(0) − log ỹ*) e^{λ_2 t} → 0 as t → ∞.    (14b)
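The trace and determinant properties of M, and the behavior of λ_2 under Cobb-Douglas technology, can be checked numerically. In this sketch of ours the parameter values are hypothetical, and the steady state is computed from ∇f(k̃*) = ρ + δ + θξ:

```python
import math

# Eigenvalues of M in (17) for f(ktilde) = ktilde**a, from the trace and
# determinant; all parameter values are hypothetical.
rho, delta, v, xi, theta = 0.06, 0.05, 0.01, 0.02, 2.0

def eigenvalues(a):
    # steady state: grad f(k*) = a * k***(a-1) = rho + delta + theta*xi
    k_star = (a / (rho + delta + theta * xi)) ** (1.0 / (1.0 - a))
    c_star = k_star ** a - (delta + v + xi) * k_star   # consumption c* > 0
    f2 = a * (a - 1) * k_star ** (a - 2)               # f''(k*) < 0
    tr = rho - v - xi + theta * xi                     # trace of M
    det = f2 * c_star / theta                          # determinant of M
    disc = math.sqrt(tr ** 2 - 4 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0        # lambda_1 > 0 > lambda_2

l1, l2 = eigenvalues(1.0 / 3.0)
assert l1 > 0 > l2
# lambda_2 rises towards zero as a rises towards 1
assert eigenvalues(1.0 / 3.0)[1] < eigenvalues(0.6)[1] < eigenvalues(0.9)[1] < 0
```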
Comparing Equations (14a) and (14b) we see that assumptions (9a) and (9b) deliver identical observable implications - not just in steady-state balanced growth, but also locally around the steady state. The convergence rates λ and λ_2 have different interpretations as they depend on different economic parameters. However, they vary in the same way when the technology parameter a changes. How are these common observable implications useful for understanding patterns of cross-country growth? Parallel to the theoretical development above, we interpret the bulk of the empirical literature as concerned with two sets of implications: first, steady-state balanced-growth predictions and, second, (convergence) predictions local to steady state. Without loss, write the convergence coefficient as λ in both (14a) and (14b). From observed per capita income y = ỹHA = ỹA we have:

log y(t) = log ỹ(t) + log A(t) = log ỹ* + [log ỹ(0) − log ỹ*] e^{λt} + log A(0) + ξt.

Moreover, since ỹ* = f(k̃*) and f(k̃*)/k̃* = (δ + v + ξ) τ⁻¹, there is some function g such that ỹ* = g((δ + v + ξ)⁻¹ τ). We can therefore write the implied sample path in observable per capita income as

log y(t) = log(g((δ + v + ξ)⁻¹ τ)) + log A(0) + ξt + [log y(0) − (log(g((δ + v + ξ)⁻¹ τ)) + log A(0))] e^{λt},    (19)

and its time derivative

d/dt log y(t) = ξ + λ × [log y(0) − (log(g((δ + v + ξ)⁻¹ τ)) + log A(0))] e^{λt}.    (19′)

From Equation (19), log y can be viewed as having two components: a convergence component (the term involving e^{λt}) and a levels component (the rest of the right-hand side).
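The criss-crossing possibility that Equation (19) admits can be demonstrated directly. In the sketch below (our own, with hypothetical values for ξ, λ, and the level terms), two economies share g(·) and ξ but differ in A(0) and initial income; the economy headed for the higher steady-state path starts out poorer and later overtakes the other:

```python
import math

# Equation (19) sample paths for two economies; all values hypothetical.
xi, lam = 0.02, -0.05
log_ystar = 0.0   # stands in for log g((delta+v+xi)**-1 * tau), common to both

def log_y(t, logA0, logy0):
    level = log_ystar + logA0 + xi * t                 # levels component
    gap0 = logy0 - (log_ystar + logA0)                 # initial deviation
    return level + gap0 * math.exp(lam * t)            # plus convergence term

# economy 2: high steady-state path, starts far below it
# economy 3: low steady-state path, starts slightly above it
gap = [log_y(t, 1.0, -0.5) - log_y(t, 0.0, 0.2) for t in range(0, 201, 5)]

assert gap[0] < 0    # economy 3 initially richer
assert gap[-1] > 0   # economy 2 eventually overtakes: paths criss-cross
```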
[Figure: log y(t) against time; two parallel steady-state paths, ξt + (log ỹ* + log A(0))_a and ξt + (log ỹ* + log A(0))_b, with economies starting at log y_1(0), log y_2(0), log y_3(0), and log y_4(0) converging to them.]
Fig. 4. Growth and convergence in the neoclassical model: two different possible steady-state paths corresponding to two possible values for the sum log ỹ* + log A(0) = log(g((δ + v + ξ)⁻¹ τ)) + log A(0). As long as this sum remains unobserved or unrestricted, any pattern of cross-country growth and convergence is consistent with the model. As drawn, the a value applies to economies at y_1(0) and y_2(0), while the b value to y_3(0) and y_4(0). Economies 1 and 2 converge towards each other, and similarly economies 3 and 4. At the same time, however, economies 2 and 3, although each obeying the neoclassical growth model, are seen to approach one another, criss-cross, and then diverge.
Figure 4 displays a graphical representation of Equation (19) for two possible values of log(g((δ + v + ξ)⁻¹ τ)) + log A(0). The figure shows two different possible steady-state paths - corresponding to two possible values for the sum log ỹ* + log A(0) = log(g((δ + v + ξ)⁻¹ τ)) + log A(0). Relative to typical claims in the literature, Figure 4 conveys a negative message. As long as log ỹ* + log A(0) remains unobserved or unrestricted, any pattern of cross-country growth and convergence is consistent with the model. As drawn in Figure 4, the a value applies to economies at y_1(0) and y_2(0) while the b value to y_3(0) and y_4(0). Economies 1 and 2 converge towards each other, as do economies 3 and 4. At the same time, however, economies 2 and 3, although each obeying the neoclassical growth model, are seen to approach one another, criss-cross, and then diverge. We can now organize those empirical studies that use the neoclassical growth model for their theoretical underpinnings. Cross-section regression analyses, such as Barro and Sala-i-Martin (1992), Baumol (1986), DeLong (1988), Mankiw, Romer and Weil (1992), and Sachs and Warner (1995), estimate variants of Equation (19). Mankiw, Romer and Weil (1992), in particular, consider two versions of Equation (19): first, when the term in e^{λt} is already at its limiting value, then the first component of the
expression is taken to "explain" the steady-state cross-section distribution of income. 3 Second, when the term in e^{λt} is taken to be central - and the rest of the right-hand side of Equation (19) is given (or comprises nuisance parameters) - the equation is taken to "explain" convergence in income. This second interpretation motivates the convergence analyses of the other papers mentioned above. 4 In our reading of the empirical literature, there is some confusion over the goals of the analysis. On the one hand, a researcher might study Equation (19) to estimate the coefficients of interest in it. But the only parameters related to the economic reasoning in Equation (19) are those in the function g, i.e., parameters of the production function. Thus, standard econometric techniques applied to this equation might be useful for recovering such parameters. A researcher might go further and seek, in an ad hoc way, to parameterize A(0) and ξ as functions of other economic variables. While this might be useful for regression fitting, its results are difficult to interpret in terms of the original economic analysis. After all, A(0) and ξ played no integral role in the theoretical reasoning, and it is unclear that a structural model incorporating these other variables would produce a regression of the type typically estimated. A second goal of an empirical analysis of Equation (19) is to address questions of cross-country patterns of growth. We think, however, that all such analyses, even at their most successful, are silent on those questions. From Figure 4, as long as A(0) is unrestricted or omitted from the analysis, no study of Equation (19) can reveal how cross-country incomes evolve. One interpretation of the preceding is that the basic model's key implications are both too strong and too weak.
If A(0) were required to be identical across economies, then the growth and convergence predictions in Figure 2 are likely inconsistent with the inequality dynamics in cross-country incomes we described in Section 2. If, on the other hand, a researcher goes to the opposite extreme and allows A(0) to differ arbitrarily across economies, then the theoretical model says little about cross-country patterns of growth. The free parameters A(0) carry the entire burden of explanation. Finally, should a researcher take a middle path, and restrict A(0) to depend on specific economic variables in an ad hoc manner, then that researcher might well end up fitting the data satisfactorily. However, the results of such a procedure can be difficult to interpret within the Solow-Swan (or Cass-Koopmans) growth model. 5
3 The Mankiw-Romer-Weil formulation, of course, includes human capital accumulation. That feature is ignored for expositional convenience here as it does not affect our basic point. We return to it below. 4 An earlier literature [e.g., Grier and Tullock (1989)] studied similar regression equations with growth on the left-hand side and explanatory variables on the right. We distinguish this from the work described in the text only because that earlier research did not show any preoccupation with convergence. It instead investigated, using exploratory empirical techniques, only the determinants of growth - an important question, certainly, but distinct from the simultaneous interest in convergence that characterizes the newer literature. 5 Mankiw, Romer and Weil (1992) is a key exception. Those authors focus on that part of the steady-state path that depends on savings and population growth rates, not on A(0), and suggest that their human capital modification of the Solow-Swan model does fit the data. We discuss that model below.
Empirical studies such as Bernard and Durlauf (1995, 1996), Durlauf and Johnson (1995), and Quah (1997) seek to circumvent some of the criticisms we have just described. One strand of this work estimates models that explicitly nest the traditional neoclassical setup. Another strand seeks to identify those features of the long-run behavior of cross-country incomes that are invariant with respect to finely-detailed structural assumptions. Before turning to more detailed empirics, however, we describe models that depart from the basic set of assumptions in the neoclassical growth model. This is easy to do given the structure we have set up. Again, our goal is not to repeat discussion already found elsewhere, but to survey in a unified way the empirical implications of the different classes of models.

4.2. The neoclassical model: multiple capital goods

A well-known model due to Mankiw, Romer and Weil (1992) (hereafter MRW) adds human capital to the Solow-Swan model, and develops empirics that potentially better explain the cross-country income data than models that account only for physical capital accumulation following Solow's original work. The MRW model fits in our framework as follows. Again, take production technology (1a), and assume (7a-c). In place of Equation (7d), let K have two components, the first called physical capital Kp and the second human capital Kh:

K = (Kp, Kh)′.
(7d')
(Distinguish Kh from the concept of human capital that is H: the latter multiplies the labor input N to produce effective labor input, while the former is an entry in the vector of capital stocks, and thus is better viewed as analogous to physical capital Kp.) Extend the accumulation assumption (9a) to

dKp/dt = τp Y − δp Kp,   τp, δp > 0,
dKh/dt = τh Y − δh Kh,   τh, δh > 0,   (9a′)
τp + τh < 1.

Then the technology-intensive effective capital stocks k̂ = (k̂p, k̂h)′, with k̂p = Kp/NA and k̂h = Kh/NA, satisfy

(dk̂p/dt)/k̂p = τp ŷ/k̂p − (δp + v + ξ),
(dk̂h/dt)/k̂h = τh ŷ/k̂h − (δh + v + ξ).

A balanced-growth equilibrium is a positive time-invariant triple (ŷ, k̂p, k̂h)* such that

τp ŷ/k̂p = δp + v + ξ  and  τh ŷ/k̂h = δh + v + ξ.
Ch. 4: The New Empirics of Economic Growth
255
When F is Cobb-Douglas, so that

f(k̂p, k̂h) = (k̂p)^{αp} (k̂h)^{αh},   αp, αh > 0 and αp + αh < 1,   (20)

straightforward calculation establishes that a balanced-growth equilibrium has:

log k̂p* = (1 − αp − αh)^{−1} [(1 − αh) log((δp + v + ξ)^{−1} τp) + αh log((δh + v + ξ)^{−1} τh)],

log k̂h* = (1 − αp − αh)^{−1} [αp log((δp + v + ξ)^{−1} τp) + (1 − αp) log((δh + v + ξ)^{−1} τh)],

and

log ŷ* = (1 − αp − αh)^{−1} [αp log((δp + v + ξ)^{−1} τp) + αh log((δh + v + ξ)^{−1} τh)].   (15′)
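The balanced-growth equilibrium of this two-capital model can also be verified numerically: iterating the two accumulation conditions to their fixed point must reproduce the closed-form solution (15′). A minimal sketch in Python (all parameter values are illustrative, not calibrated to any dataset):

```python
import math

# Illustrative parameters (not calibrated): alpha_p, alpha_h with ap + ah < 1,
# depreciation rates, population growth v, technology growth xi, savings rates.
ap, ah = 0.4, 0.3
dp, dh = 0.05, 0.05
v, xi = 0.02, 0.02
tp, th = 0.2, 0.1

# Fixed-point iteration on the balanced-growth conditions:
#   kp = tp*y/(dp+v+xi),  kh = th*y/(dh+v+xi),  y = kp**ap * kh**ah.
kp, kh = 1.0, 1.0
for _ in range(2000):
    y = kp**ap * kh**ah
    kp = tp * y / (dp + v + xi)
    kh = th * y / (dh + v + xi)

# Closed forms from Equation (15'):
D = 1.0 - ap - ah
P = math.log(tp / (dp + v + xi))
H = math.log(th / (dh + v + xi))
log_kp = ((1 - ah) * P + ah * H) / D
log_kh = (ap * P + (1 - ap) * H) / D
log_y = (ap * P + ah * H) / D

assert abs(math.log(kp) - log_kp) < 1e-8
assert abs(math.log(kh) - log_kh) < 1e-8
assert abs(math.log(kp**ap * kh**ah) - log_y) < 1e-8
```

In logs the iteration is linear with contraction factor αp + αh < 1, which is one way to see the global stability of the system discussed in the text.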
Equation (15′) is the MRW counterpart to the Solow-Swan levels prediction (15). It specializes to the latter when αh is set to 0; otherwise, it comprises a geometric average of contributions from physical and human capital. It is easy to show in the state space (k̂p, k̂h) that this system is globally stable and converges to balanced-growth equilibrium. In general, then, all dynamics - including those of ŷ - depend on the bivariate state vector (k̂p, k̂h). This would suggest that, in a growth regression, studying the (one-dimensional) coefficient on initial income alone, with or without auxiliary ad hoc conditioning, gives a misleading picture of dynamics local to the steady state. However, with additional restrictions on model parameters, conditioning on the level of ŷ(t) can render the local convergence behavior of ŷ independent of the state (k̂p(t), k̂h(t)). Mankiw, Romer and Weil (1992) achieve this by setting equal the depreciation rates on human and physical capital, i.e., δp = δh. From Equation (20), and taking the first-order Taylor series expansion in log ŷ, log k̂p, and log k̂h, we have:

(dŷ/dt)/ŷ = αp (dk̂p/dt)/k̂p + αh (dk̂h/dt)/k̂h
          = αp [τp ŷ/k̂p − (δp + v + ξ)] + αh [τh ŷ/k̂h − (δh + v + ξ)]
          ≈ αp [(δp + v + ξ) ((log ŷ − log ŷ*) − (log k̂p − log k̂p*))]
            + αh [(δh + v + ξ) ((log ŷ − log ŷ*) − (log k̂h − log k̂h*))],
256
S.N. Durlauf and D.T. Quah
so that setting δp = δh = δ, and using log ŷ = αp log k̂p + αh log k̂h, then gives

(dŷ/dt)/ŷ = −(1 − αp − αh)(δ + v + ξ) × (log ŷ − log ŷ*).   (21)
Under this MRW specification the sample path (19) changes so that the levels and convergence components include terms in τh and αh. The observable implications remain unchanged: observed per capita income evolves in balanced-growth equilibrium as A(t); away from steady state, observed per capita income converges towards that balanced-growth path. The dynamics are still as given in Figure 4. The MRW model has been used as the basis for numerous empirical studies. To aid our subsequent discussion of those studies, we develop a more explicit representation of the model's predictions. From Equation (21) now let

λ ≡ −(1 − αp − αh)(δ + v + ξ) < 0,   (22)
so that

log ŷ(t) − log ŷ* = [log ŷ(0) − log ŷ*] e^{λt},
log ŷ(t + T) − log ŷ* = [log ŷ(t) − log ŷ*] e^{λT}.

Transforming to get observable log y(t), this becomes:

log y(t + T) − [log A(0) + (t + T)ξ] = (1 − e^{λT}) log ŷ* + [log y(t) − log A(0) − tξ] e^{λT}

⇒ log y(t + T) − log y(t) = (1 − e^{λT}) log ŷ* + (e^{λT} − 1) log y(t) + (1 − e^{λT}) log A(0) + (t + T − e^{λT} t)ξ.

Substituting in Equation (15′) for the steady state log ŷ* gives

log y(t + T) − log y(t) = (1 − e^{λT}) log A(0) + (t + T − e^{λT} t)ξ + (e^{λT} − 1) log y(t)
    + (1 − e^{λT}) [αp/(1 − αp − αh)] log τp
    + (1 − e^{λT}) [αh/(1 − αp − αh)] log τh
    − (1 − e^{λT}) [(αp + αh)/(1 − αp − αh)] log(δ + v + ξ).   (23)
In words, growth depends on some (exogenously given) constants, the initial level log y(t), savings rates, technological parameters, and the population growth rate. Since λ < 0, the coefficient on the initial level log y(t) should be negative.
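To make the sign prediction concrete, one can trace out the growth-initial income relation implied by Equation (21) directly, working with ŷ and abstracting from the A(0) and ξ terms of (23). A small sketch (parameter values illustrative only):

```python
import math

# Illustrative parameters: lam is Equation (22), lam = -(1-ap-ah)*(delta+v+xi).
ap, ah = 0.4, 0.3
delta, v, xi = 0.05, 0.02, 0.02
lam = -(1 - ap - ah) * (delta + v + xi)   # about -0.027, negative
T = 25.0
log_ystar = 1.0                           # illustrative steady state (in logs)

def growth(log_y0):
    """Growth of log y-hat over horizon T implied by Equation (21)."""
    log_yT = log_ystar + (log_y0 - log_ystar) * math.exp(lam * T)
    return log_yT - log_y0

# The slope of growth on initial income is exp(lam*T) - 1 < 0:
slope = (growth(2.0) - growth(1.0)) / (2.0 - 1.0)
assert abs(slope - (math.exp(lam * T) - 1.0)) < 1e-12
assert slope < 0
# Economies poorer than their steady state grow faster:
assert growth(0.5) > growth(1.5)
```

The regression coefficient on initial income in (23) is exactly this e^{λT} − 1 term.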
Comparing MRW's convergence rate (22) with Solow-Swan's (13), the only difference is the addition of αh in the former. Thus, keeping fixed αp (physical capital's coefficient), δ, v, and ξ, MRW's addition of human capital to the neoclassical model implies a λ closer to zero, i.e., a slower rate of convergence, than in the Solow-Swan model. In both the MRW and traditional neoclassical models the levels of balanced-growth income time paths can vary with the parameters of preferences and technology (τ, ρ, θ, and α). However, the rate of change of those balanced-growth income paths is always just the exogenously given ξ = Ȧ/A. This is useful to remember when working with representations such as Equation (23): although the dependent variable in the regression equation is a growth rate, these models do not explain growth rates over long time horizons. It is this that makes it useful to label these models ones of exogenous growth.
4.3. Endogenous growth: asymptotically linear technology

We now consider a range of models that generate long-run growth from other than exogenous technical change. When possible, we will show how such models can be derived by straightforward perturbations of the parameterizations we have used to describe the neoclassical model. 6 Assume, as in the standard one-capital neoclassical model, Equations (1a) and (7a-d), but instead of Equation (8), suppose that

∀ NA > 0:  lim_{K→∞} F(K, NA)/K > 0.   (24)

For instance, the CES production function

F(K, NA) = [γK K^a + γN (NA)^a]^{1/a},   γK, γN > 0,  a ∈ (0, 1),

is homogeneous of degree 1, concave, and satisfies Equations (2), (3) and (24) with

∀ NA > 0:  lim_{K→∞} F(K, NA)/K = γK^{1/a} > 0.

Call a production function satisfying condition (24) asymptotically linear. The motivation for this terminology comes from f(k̂) varying linearly with k̂ as the latter gets large. 7
6 Such a strategy is inspired by Solow (1956, Example 3); see also Jones and Manuelli (1990). 7 Of course, even if the limiting f(k̂)/k̂ were zero rather than positive, we would still have asymptotic linearity (albeit trivially), but we hereafter ignore this possibility when using the phrase. A useful alternative is to say that condition (24) implies f(k̂) is O(k̂) (or big-oh k̂), following standard terminology in statistics and elsewhere. Duffy and Papageorgiou (1997) find that a CES specification for the aggregate production function fits cross-country data better than a Cobb-Douglas, and moreover that the elasticity of substitution between capital and labor exceeds one. This evidence implies the possibility for endogenous growth of the kind described in Jones and Manuelli (1990) and this subsection.
Fig. 5. Asymptotically linear (O(k̂)) growth and convergence. The continuous function f(k̂)/k̂ tends to infinity as k̂ tends to zero and to lim_{k̂→∞} f(k̂)/k̂ > 0 as k̂ tends to infinity. Moreover, it is guaranteed to be monotone strictly decreasing for finite k̂. The vertical distance between f(k̂)/k̂ and (δ + v + ξ)τ^{−1} is τ^{−1}(dk̂/dt)/k̂. If lim_{k̂→∞} f(k̂)/k̂ < (δ + v + ξ)τ^{−1} then convergence occurs as in the Solow-Swan model, with some constant finite k̂* describing balanced-growth equilibrium. However, if lim_{k̂→∞} f(k̂)/k̂ > (δ + v + ξ)τ^{−1} then (dk̂/dt)/k̂ is always positive, and balanced growth obtains only as k̂ ↗ ∞. Every initial k̂(0) is part of an equilibrium tending towards balanced growth.

By l'Hôpital's rule, condition (24) gives

lim_{k̂→∞} f′(k̂) = lim_{k̂→∞} f(k̂)/k̂ > 0,  and thus  lim_{k̂→∞} f′(k̂)k̂/f(k̂) = 1,
so that, following the reasoning in Section 3, balanced-growth equilibria with positive (dŷ/dt)/ŷ are now possible. Let capital accumulation follow (9a) as before. Whereas previously Figure 2 established existence of a unique balanced-growth equilibrium with finite (ŷ*, k̂*) and (dk̂/dt)/k̂ = 0, Figure 5 now shows a range of possibilities. Taking technology parameters as fixed, define the threshold savings rate

τ̲ = (δ + v + ξ) / [lim_{k̂→∞} f(k̂)/k̂].

The numerator is the rate at which technology-adjusted physical capital per worker naturally "dissipates", given the rates of depreciation, population growth, and exogenous technology development. The denominator is physical capital's limiting average product, which equals the limiting marginal product. This expression thus displays a tension between two opposing forces: the more productive physical capital is in the limit, the lower is the threshold savings rate, whereas the faster capital naturally dissipates, the higher is the threshold. If τ̲ is at least 1, then all feasible savings rates τ ∈ (0, 1) imply the same behavior as the Solow-Swan outcome: growth in y occurs
in the long run at rate ξ. However, if τ̲ is less than 1, more intricate long-run dynamics can manifest. When an economy has τ less than τ̲, again, the result is the Solow-Swan outcome. But when economies have sufficiently high savings rates, i.e., τ ∈ (τ̲, 1), then (dk̂/dt)/k̂ always exceeds a time-invariant positive quantity, and has limiting behavior given by

lim_{t→∞} (dk̂/dt)/k̂ = [lim_{k̂→∞} f(k̂)/k̂] τ − (δ + v + ξ) > 0.

Moreover, such (y, k) paths tend towards balanced-growth equilibrium since

(dŷ/dt)/ŷ − (dk̂/dt)/k̂ = [f′(k̂(t))k̂(t)/f(k̂(t)) − 1] (dk̂/dt)/k̂ → 0  as t → ∞.

The long-run growth rates are then

(dy/dt)/y → ξ + [ (lim_{k̂→∞} f(k̂)/k̂) τ − (δ + v + ξ) ] > ξ;

they increase in τ, meaning that economies saving a higher fraction of their income grow faster in the long run. It is this growth effect that makes the current specification an "endogenous growth" model. Compare this with the standard neoclassical growth model, where savings rates affect only the levels of balanced-growth sample paths, not growth rates. This relation between savings and long-run income growth applies only to those economies with savings rates exceeding the threshold value τ̲. All economies with savings rates below this value cannot influence long-run income growth rates by changing their savings behavior (unless they move savings rates above that threshold). What observable implications follow from this? If savings rates were uniformly distributed across countries, there should be one cluster of economies around the same low per capita income growth rate and a different group with scattered income growth rates increasing in savings rates; see, for instance, Figure 6. As in the standard neoclassical model, this asymptotically linear technology model can be given a general equilibrium interpretation. Recall assumption (9b), and assume the preference parameter θ satisfies

[lim_{k̂→∞} f(k̂)/k̂ − (ρ + δ)] / ξ > θ > [lim_{k̂→∞} f(k̂)/k̂ − (ρ + δ)] / (ρ − v) > 0.   (25)

From Equation (10), the parameter θ is the inverse of the intertemporal elasticity of substitution. Thus, Equation (25) states that this elasticity can be neither too high nor too low: it must respect bounds varying with technology parameters.
Fig. 6. Threshold effect of savings on long-run income growth rates in the O(k̂) model. For economies with savings rates τ less than the threshold value τ̲, the long-run income growth rate is ξ, independent of τ. If τ > τ̲, however, then savings rates positively affect long-run growth.
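The threshold behavior summarized in Figures 5 and 6 is easy to reproduce numerically with a CES technology of the kind introduced above. In the sketch below (parameter values illustrative), a savings rate below τ̲ yields Solow-Swan convergence with growth in k̂ dying out, while one above τ̲ yields sustained growth at roughly [lim f(k̂)/k̂]τ − (δ + v + ξ):

```python
import math

# CES per-effective-worker technology: f(k) = (gK*k**a + gN)**(1/a), 0 < a < 1,
# so the average product f(k)/k falls monotonically to gK**(1/a) > 0 as k grows.
gK, gN, a = 0.5, 0.5, 0.5
delta, v, xi = 0.05, 0.02, 0.02

def apk(log_k):
    """Average product f(k)/k as a function of log k."""
    return (gK + gN * math.exp(-a * log_k)) ** (1.0 / a)

limit_apk = gK ** (1.0 / a)               # lim f(k)/k = 0.25
tau_bar = (delta + v + xi) / limit_apk    # threshold savings rate, here 0.36

def limiting_growth(tau, dt=0.05, steps=100000):
    """Euler-integrate dlog(k)/dt = tau*f(k)/k - (delta+v+xi); return final rate."""
    log_k = 0.0
    for _ in range(steps):
        log_k += dt * (tau * apk(log_k) - (delta + v + xi))
    return tau * apk(log_k) - (delta + v + xi)

assert abs(limiting_growth(0.2)) < 1e-6   # tau < tau_bar: growth in k dies out
g_high = limiting_growth(0.6)             # tau > tau_bar: sustained growth
assert abs(g_high - (0.6 * limit_apk - (delta + v + xi))) < 1e-3
```

Working in log k̂ keeps the simulation well behaved even though k̂ itself diverges in the high-savings case.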
From ρ > v, Equation (25) implies that lim_{k̂→∞} f(k̂)/k̂ > ρ + δ. For the interval of feasible values for θ to exist, it suffices that ξ < ρ − v, which in turn follows from Equation (11). Finally, these relations imply

lim_{k̂→∞} f(k̂)/k̂ > δ + v + ξ,

which had been used earlier to guarantee τ̲ < 1. Thus, Equation (25) is related to, but strengthens, the assumption underlying Figure 5. In Appendix A, we show that Equation (25) implies that there exists a balanced-growth equilibrium with a positive growth rate given by

lim_{t→∞} (dŷ/dt)/ŷ = [ lim_{k̂→∞} f(k̂)/k̂ − (ρ + δ + θξ) ] θ^{−1} > 0,
and that for every initial k̂(0) there exists an equilibrium tending towards balanced growth. If, however, θ is too large, then the unique balanced-growth equilibrium has lim_{t→∞} (dŷ/dt)/ŷ = 0. The equilibria have exactly the character described above in the discussion surrounding Figure 5, only with θ^{−1} replacing τ. The models in Rebelo (1991) and Romer (1986) differ from those above in several important ways. Rebelo (1991) uses a linear AK specification in place of the usual convex production technologies. (Linearity, of course, implies asymptotic linearity.) Equilibrium in that model tends towards balanced growth. Romer (1986) distinguishes the productive effects of individual-specific physical capital from economy-wide externalities induced by private accumulation. Romer's
model uses the production technology (1b) with the arguments to F identified as the actions of private agents, and lets A depend on K, but with K defined as the social or aggregate outcome. Private agents ignore the effects of their actions on A; there is an externality in private agents' decisions to accumulate physical capital. In Romer's model, as far as private agents are concerned, A still evolves exogenously. In equilibrium, of course, A depends on the purposeful actions of economic agents, and thus is properly viewed as endogenous. Private agents' optimizing decisions on consumption and savings remain identical to those in the standard neoclassical model. At the same time, the equilibrium aggregate outcome can display ongoing, endogenously-determined growth differing from the standard model. Moreover, the model also allows evaluating the efficiency properties of particular decentralized economic equilibria. Some versions of Romer's model imply equilibria tending towards balanced growth; others display ongoing growth but with no tendency towards balanced growth. 8 Essential economic features therefore differ. However, the model of Rebelo (1991) and certain versions of the general model in Romer (1986) resulting in ongoing endogenous growth have, in essence, the same mathematical structure as that described earlier in this section. Their observable implications, therefore, are also the same. One apparently natural conclusion from these models is that the researcher should now calculate regressions across economies of income growth rates on savings rates, tax rates, and so on - variables that in the analyses of Jones and Manuelli (1990), Rebelo (1991), and Romer (1986) potentially affect long-run growth rates. Such regressions would resemble the MRW regression (23) except that there is now no reason for the initial condition log y(t) to appear with a negative coefficient.
This line of reasoning suggests that what distinguishes exogenous and endogenous growth models is whether the initial condition log y(t) enters negatively in an equation explaining growth rates. Note, though, that this endogenous growth analysis does not imply that the initial condition log y(t) should never appear in an estimated regression. Rather, that initial condition is absent only in the balanced-growth limit, i.e., with k̂ infinite. But in any balanced-growth limit, even the exogenous-growth neoclassical model has the initial condition vanish from the right of relation (19), (19′), or (23).

4.4. Nonconvexities and poverty traps
An alternative class of models has focused on specific nonconvexities in the aggregate production function. 9 This research has analyzed the implications of such nonconvexities for the relation between initial conditions and the steady-state behavior
8 A suitably parameterized model following Example 1 in Romer (1986, p. 1028) yields equilibria tending towards balanced growth. 9 Increasing returns to scale, of the kind studied in Romer (1986), is also a nonconvexity, of course. What we mean instead are those nonconvexities associated specifically with certain threshold effects we will describe below.
of aggregate output. Models with nonconvexities, unlike the neoclassical model, lead to long-run dependence in the time-series properties of aggregate output. Specifically, nonconvex models can display poverty traps, where economies with low initial incomes or capital stocks converge to one steady-state level of per capita output, while economies with high initial incomes or capital stocks converge to a different steady-state level. Examples of such models include those by Durlauf (1993), Galor and Zeira (1993), and Murphy, Shleifer and Vishny (1989). The model due to Azariadis and Drazen (1990) is particularly convenient for illustrating the empirical differences between this framework and the neoclassical approach. The Azariadis-Drazen model works off thresholds in the accumulation of human and physical capital. These thresholds stem from spillovers between individual investments arising when aggregate capital is sufficiently high. In effect, economies with insufficient aggregate capital have different production functions from those with sufficiently high aggregate capital. We present the basic ideas of Azariadis and Drazen (1990) in our framework as follows. Modify the MRW production technology (20) to:
αp(t) = ᾱp if k̂p(t) > tk̂p(t), and αp(t) = α̲p otherwise;
αh(t) = ᾱh if k̂h(t) > tk̂h(t), and αh(t) = α̲h otherwise;   (26)

where the explicit (t) indicates variables changing through time and the coefficients αp(t), αh(t) vary with the underlying state (k̂p, k̂h). The quantities tk̂p(t) and tk̂h(t) are the threshold levels of the capital stocks at which those coefficients shift.
Fig. 7. Multiple locally stable steady states. Either of the two possible limit points k̲̂ or k̂̄ obtains, depending on whether k̂(0) lies below or above k̂C. The dark kinked line describes k̂(t+1) as a function of k̂(t) in the Galor-Zeira model, as applied by Quah (1996b) to study economies confronting imperfect capital markets. If a cross section of economies had randomly distributed initial conditions k̂(0), then over time the cross-section distribution of k̂'s (and thus of ŷ's) will tend towards a clustering around k̲̂ and k̂̄.

Under these assumptions, the law of motion for economy j changes from Equation (23) to have αp, αh, and thus λ depend on time and state:

log yj(t + T) − log yj(t) = Tξ + (1 − e^{λj T}) [log Aj(0) + tξ]
    + (1 − e^{λj T}) (1 − αpj − αhj)^{−1} [αpj log τpj + αhj log τhj − (αpj + αhj) log(δ + vj + ξ)]
    − (1 − e^{λj T}) log yj(t).   (27)
Durlauf and Johnson (1995) study Equation (27) and find evidence for multiple regimes in cross-country dynamics. They conclude that initial conditions matter, and that the MRW extension of the neoclassical model does not successfully explain the patterns of growth across countries. We discuss their findings in greater detail below in Section 5.5. Dynamics similar to those in the Durlauf-Johnson equation (27) also obtain in the model of Galor and Zeira (1993). Quah (1996b) applies Galor and Zeira's ideas to study empirically cross sections of economies (rather than cross sections of families as in the original model). Figure 7, a two-regime counterpart to Equation (27), is used to motivate analysis of the distribution dynamics in cross-country incomes.
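The clustering logic of Figure 7 can be mimicked computationally with a stylized kinked law of motion. The map below is a hypothetical piecewise-linear stand-in for the Galor-Zeira dynamics, not the model itself; it has two locally stable steady states separated by an unstable threshold:

```python
import random

# Stylized kinked law of motion in logs: x(t+1) = g(x(t)), a hypothetical
# piecewise-linear map with stable fixed points x_lo, x_hi and an unstable
# threshold at x_c (the analogue of k-hat-C in Figure 7).
x_lo, x_c, x_hi = 0.0, 1.0, 2.0

def g(x):
    # Below the threshold, contract towards x_lo; above it, towards x_hi.
    return x_lo + 0.5 * (x - x_lo) if x < x_c else x_hi + 0.5 * (x - x_hi)

random.seed(0)
initial = [random.uniform(-1.0, 3.0) for _ in range(1000)]  # random k(0)'s
final = []
for x in initial:
    for _ in range(100):
        x = g(x)
    final.append(x)

# The cross section clusters ("twin peaks") at the two stable steady states:
assert all(abs(x - x_lo) < 1e-9 or abs(x - x_hi) < 1e-9 for x in final)
assert any(abs(x - x_lo) < 1e-9 for x in final)
assert any(abs(x - x_hi) < 1e-9 for x in final)
```

Starting from a randomly distributed cross section of initial conditions, the distribution collapses onto a two-point clustering, exactly the polarization outcome discussed in the text.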
This formulation gives an interpretation different from that in Azariadis and Drazen (1990) and Durlauf and Johnson (1995). Here, only one law of motion exists across economies: that given in Figure 7. However, that law of motion displays a polarization effect, namely, economies evolve towards one of two distinct steady states [see, e.g., Esteban and Ray (1994)]. Regardless of the interpretation, however, the observable implications are the same. Already-rich economies converge to a high steady-state level; already-poor ones, to a low steady-state level.

4.5. Endogenous growth: R&D and endogenous technical progress

Yet a different class of endogenous growth models turns to features of the production technology (1) thus far unconsidered. We have already described Romer's (1986) model with accumulation externalities, where the variable A in Equation (1b) is taken to depend on the social outcome in capital investment. While A - the ultimate cause of growth - evolves endogenously, it is not the consequence of a deliberate action by any economic agent. One class of endogenous growth models makes A directly the result of such choices. Our immediate concern is: how do the empirical implications then differ? Certain key details differ, but the models of Aghion and Howitt (1992), Grossman and Helpman (1991), Jones (1995a), and Romer (1990) all associate the evolution of A with a measurable input such as research and development expenditure, the number of scientists and engineers, and so on. By contrast, models such as those in Lucas (1988, 1993) focus on improvement in H - human capital embodied in the labor force - as the source of endogenous growth. When the production technology is (1a), the resulting dynamics in measured per capita income will be indistinguishable across A and H improvements. The empirical approach suggested by this reasoning focuses on variables that proxy the effects and economic costs of research activity.
Jones (1995b) notes that the US, for one, has seen permanent changes neither in growth rates nor in the trend path level of per capita GDP since 1880. Yet resources devoted to R&D, by almost any measure, have increased dramatically in the last half century alone. Thus, in Jones' analysis, R&D-based growth models (or, indeed, all growth models with "scale effects") are at odds with empirical evidence. This conclusion has to be tempered somewhat in light of results from two distinct lines of research. Recall that the empirical evidence in Jones (1995b) takes two forms: his Figure 1, indicating stability of an (ex ante estimated) deterministic time trend; and his Table 1, showing the time-series stability properties of US GDP per capita growth rates. This should be compared with the line of research beginning from the unit-root analyses of Nelson and Plosser (1982), extending through the breaking-trend research of Perron (1989) (and numerous others since), arguing that, over different timespans, the time-series properties of different income measures do show permanent changes. We do not suggest here that the evidence is decisive one way or the other, merely that circumspection is called for in these univariate time-series analyses. The second line
of research is that from, e.g., Ben-David (1996), where permanent growth and trend path changes - across time samples comparable to that in Jones' work - are, indeed, observed for a wide range of countries other than the US. The subtlety of statistical tests on these growth series, and the wide range of variation observable in the data, had indeed formed part of the empirical motivation in the early endogenous growth discussion in Romer (1986). Coe and Helpman (1995) investigate the dependence of a country's A levels on domestic and foreign R&D capital. They relate their estimates of such cross-country spillovers to the openness of an economy to trade. Their findings are two-fold: first, beneficial cross-country R&D spillovers are stronger, the more open is an economy. Across the G7, in particular, up to one quarter of the total benefits of R&D investment can accrue to one's trade partners. Second, the estimated effects on A of R&D - both foreign and domestic - are large. Coe and Helpman chose to conduct their analysis entirely in terms of productivity and income levels. The Coe-Helpman and Jones analyses, although substantively interesting, raise issues that differ from our focus in this chapter. We therefore do not discuss them further below.

4.6. Growth with cross-country interactions
Lucas (1993) presents a growth model with empirical implications that differ markedly from those we have considered above. The model shows how taking into account patterns of cross-country interaction - in this case, human capital spillovers - alters conclusions on patterns of growth, even when one considers fixed and quite standard production technologies. 10 In the notation of Equation (1) take A and N to be constant and equal to 1, but let there now be work effort w ∈ [0, 1] so that:

Y = F(K, wH)  ⇒  y = F(k, wH),
with F satisfying assumptions (2), (3) and (8) as in the Solow-Swan model. The harder the labor force works, the higher is w, and thus the more output can be produced for a given quantity of human capital H. Assume there is no depreciation and adopt the Solow-Swan savings assumption, so that:

dk/dt = τy.   (28)
Begin by letting

(dH/dt)/H = G(w),   G(w) > 0 for w > 0,   (29)
10 To emphasize, it is spillovers across economies that will be of interest here, not spillovers within an economy, such as one might find in models with externalities.
so that how fast human capital accumulates depends on work effort w. If the economy shows learning by doing, then G′ > 0; on the other hand, schooling effects or resting effects (where, having rested, labor is subsequently more efficient) give G′ < 0. A balanced-growth equilibrium is a configuration of time paths (y, k, H, w) satisfying Equations (28) and (29) such that

(dy/dt)/y = (dk/dt)/k = (dH/dt)/H  and  w = w̄ constant.

Since w varies in a bounded interval, it is natural to take it constant in balanced growth. Further, assuming identical preferences across economies, all countries then select a common constant effort level w̄. A theory of differing cross-country growth rates can be constructed from allowing w̄ to vary, but that is not considered here. From Equation (28), we have in balanced growth
(dk/dt)/k = τ y/k = τ F(k, w̄H)/k = τ F(k/(w̄H), 1) · (k/(w̄H))^{−1}

(using homogeneity of degree 1 in F). Moreover, subtracting (dH/dt)/H = G(w̄) from both sides yields

(dk/dt)/k − (dH/dt)/H = τ F(k/(w̄H), 1) · (k/(w̄H))^{−1} − G(w̄).

The right-hand side of this generates the same graph as Figure 2, substituting G(w̄) for δ + v + ξ and k/(w̄H) for k̂. Thus, we see that balanced-growth equilibrium exists, is unique, and is globally stable. Indeed, once again, Figure 4 describes equilibrium time paths in y, and all the previous remarks apply. The substantive difference between the two models is that
H(t) = H(0) e^{G(w̄)t}

in the interactions model replaces the neoclassical technical progress term A(t). Because k/H is constant across economies in balanced growth, economies evolve with per capita incomes following parallel paths. These levels of per capita income are determined by the initial level of human capital H(0). As before, for a given economy, per capita income converges to its balanced-growth path. However, the balanced-growth paths of different economies will not be the same, unless those economies are identical in all respects, including initial conditions.
Fig. 8. Common average H̄. Because the evolution of human capital across economies depends on the world's average (the symbol J denotes the entire cross section, C a clustering or club), convergence occurs to a degenerate point mass.

Next, suppose there are cross-country spillovers in accumulating human capital. Write world average human capital as H̄, and suppose each economy is small relative to the rest of the world. Change Equation (29) to

dHj/dt = G(w) Hj^{1−κ} H̄^{κ},   for economy j,   κ ∈ [0, 1].   (29′)
The parameter κ measures the strength of cross-country spillovers in human capital. The larger is this parameter, the more does economy j's human capital evolve in step with the world average. Conversely, when κ is zero, Equation (29′) reduces to (29), where no cross-country spillover occurs. From Equation (29′), write

(dHj/dt)/Hj − G(w) = G(w) [ (H̄/Hj)^{κ} − 1 ].
This says that when Hj exceeds the world average H̄, growth in human capital in economy j slows below G(w). On the other hand, when Hj is low relative to H̄, growth speeds up and (dHj/dt)/Hj exceeds G(w). Applying this to balanced growth with w = w̄ - and recalling that each economy j is small relative to the world average - we see that the ratio H̄/Hj is globally stable around a unique steady-state value of unity, so that eventually Hj = H̄ for all j. But then all equilibrium observed time paths in Figure 4 must coincide, so that the distribution of incomes across economies eventually converges to a point mass, as in Figure 8.
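The global stability of Hj/H̄ around unity can be checked directly: under the spillover specification (29′), and with H̄ itself growing at rate G(w̄), the ratio Rj = Hj/H̄ obeys dRj/dt = G(w̄)(Rj^{−κ} − 1)Rj. A sketch with illustrative numbers:

```python
# Illustrative values for G(w-bar) and the spillover strength kappa.
G, kappa = 0.02, 0.5
dt, steps = 0.1, 20000

def final_ratio(R0):
    """Euler-integrate dR/dt = G*(R**(-kappa) - 1)*R from R(0) = R0."""
    R = R0
    for _ in range(steps):
        R += dt * G * (R ** (-kappa) - 1.0) * R
    return R

# Whether an economy starts far below or far above the world average,
# the ratio H_j / H-bar converges to one:
assert abs(final_ratio(0.1) - 1.0) < 1e-6
assert abs(final_ratio(5.0) - 1.0) < 1e-6
```

With κ > 0 the right-hand side is positive for Rj < 1 and negative for Rj > 1, which is exactly the global stability argument in the text.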
Fig. 9. Distinct average H̄ across clubs. Each economy now has a natural clustering - either C0 or C1, again with J the entire cross section, J = C0 ∪ C1 - so that the relevant average H̄ differs across economies. As drawn here, convergence occurs to a two-point or twin-peaked distribution.
What are the principal empirical conclusions to take away from this discussion? Whether or not convergence happens - in the sense that all economies converge to a common level of per capita output (illustrated in Figure 8) - is a matter here of accounting for the interactions across countries, not only of assumptions on the form of the production function. Whether the cross-section distribution piles up at a single value, as in Figure 8, depends on the nature of those interactions. It is easy to see that if we allowed natural groupings of economies to form, so that economies within a group interact more with each other than with those outside, then the "average" H̄ that they converge to will, in general, vary across groups. Depending on other assumptions, one can construct models where convergence takes the form of convergence-club dynamics, as in Figure 9 [e.g., Quah (1997)]. 11 The empirical intuition emerging from these models matches well that from the stylized facts discussed in Section 2.
5. Empirical techniques

This section describes a variety of empirical approaches that have been used in growth analysis.
11 Models displaying persistent inequality between families due to neighborhood spillover effects [e.g., Bénabou (1993) and Durlauf (1996)] are also driven by endogenous formation of interaction networks.
5.1. Cross-section regression: β-convergence

The most common approach to growth and convergence applies cross-section regression analysis to variants of Equations (19) and (19′). 12 Taking δ, v, ξ, and τ to be time-averaged measures for each country, the term g((δ + v + ξ)^{−1}τ) is determined up to unknown parameters in an assumed production function. When the researcher tacks on a least-squares residual on the right of Equation (19) or (19′), cross-section least-squares regression with hypothesized steady-state levels or time-averaged growth rates in income potentially recovers the unknown parameters in these equations. Barro and Sala-i-Martin (1992) focus on the initial-condition term e^{λt} × [log ŷ(0) − log ŷ*(0)] in Equation (19′), and ask if the coefficient λ is negative. If so, then the data are said to satisfy β-convergence (β in their paper is −λ in this chapter). In Barro and Sala-i-Martin (1991) the leading term in Equation (19′), the common technology growth rate ξ, is constrained to be identical across regional economies in the cross section. If the same assumption is made in our model, a negative λ implies unconditional β-convergence. Following Barro and Sala-i-Martin (1992), when this leading term depends on auxiliary economic variables - measures of democracy, political stability, industry and agriculture shares in countries, rates of investment - a negative λ implies conditional β-convergence. 13 In most empirical studies, the choices of additional control variables are ad hoc across datasets and political units. As one example, the data appendix in Levine and Renelt (1992) lists over 50 possibilities.
Among the range of controls that have appeared in the literature are the growth of domestic credit, its standard deviation, inflation and its standard deviation, an index of civil liberties, numbers of revolutions and coups per year, rates of primary and secondary enrollment, and measures of exchange-rate distortion and outward orientation. 14 Following the publication of Levine and Renelt's paper, yet other control variables have been introduced. We discuss further below the issues raised by these additional regressors. Barro and Sala-i-Martin (1992) and Sala-i-Martin (1996) assert that with the right conditioning variables, a rate of convergence of 2% per year is uniformly obtained across a broad range of samples. They draw two implications: first, in a Cobb-Douglas production function for aggregate output, physical capital's coefficient is over 0.9, appreciably larger than the 0.4 implied by factor shares in national income accounts. Second, convergence occurs: the poor do catch up with the rich.
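Schematically, a β-convergence test is just a cross-section least-squares regression of growth on initial income plus steady-state controls. The following sketch on synthetic data (all numbers invented for illustration, not drawn from any study) recovers a negative initial-income coefficient equal to e^{λT} − 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 100, 25
lam = -0.02                                  # "true" convergence rate, illustrative

# Synthetic cross section: heterogeneous steady states and initial incomes.
log_ystar = rng.normal(1.0, 0.3, n)
log_y0 = log_ystar + rng.normal(0.0, 1.0, n)
growth = (1 - np.exp(lam * T)) * (log_ystar - log_y0) + rng.normal(0, 0.05, n)

# Conditional beta-convergence regression: growth on initial income,
# controlling for (a proxy of) the steady state.
X = np.column_stack([np.ones(n), log_y0, log_ystar])
coef, *_ = np.linalg.lstsq(X, growth, rcond=None)

beta_hat = coef[1]                           # should be near exp(lam*T) - 1 < 0
assert beta_hat < 0
assert abs(beta_hat - (np.exp(lam * T) - 1)) < 0.05
```

Dropping the steady-state control turns this into an unconditional β-convergence regression, and with heterogeneous steady states the estimated initial-income coefficient would then be biased, which is one way to see why the choice of conditioning variables matters so much in this literature.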
12 Well-known examples include Barro and Sala-i-Martin (1991, 1992), Baumol (1986), and Mankiw, Romer and Weil (1992), but the list is legion. 13 Some researchers use the phrase absolute β-convergence to mean unconditional β-convergence. We prefer just to contrast conditional and unconditional. Thus, we also do not distinguish situations where the conditioning uses variables appearing in the original Solow-Swan model from where the conditioning uses yet a broader range of variables. 14 Of course, none of these is explicitly modelled in either neoclassical or endogenous growth analyses.
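The mechanics of an unconditional β-convergence regression can be sketched with simulated data. The sketch below is illustrative only - all parameter values and variable names are hypothetical, not estimates from any dataset: generate growth from a linearized transition of the Equation (19′) type, regress growth on initial income, and back out the implied convergence rate λ from the slope.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cross section: 100 economies observed over T = 25 years.
n, T = 100, 25
true_lam = -0.02                            # assumed convergence rate, lambda < 0
xi = 0.02                                   # common technology growth rate
log_y0 = rng.normal(8.0, 1.0, n)            # log income per capita at t = 0
log_ystar = rng.normal(9.0, 0.5, n)         # unobserved steady-state levels

# Linearized transition plus an appended least-squares residual:
#   log y(T) - log y(0) = xi*T + (e^{lam*T} - 1)(log y(0) - log y*) + noise
growth = xi * T + (np.exp(true_lam * T) - 1.0) * (log_y0 - log_ystar) \
    + rng.normal(0.0, 0.1, n)

# Unconditional beta-convergence regression: growth on a constant and log y(0).
X = np.column_stack([np.ones(n), log_y0])
(c0, b1), *_ = np.linalg.lstsq(X, growth, rcond=None)

# The slope estimates e^{lam*T} - 1, so the implied convergence rate is:
lam_hat = np.log(1.0 + b1) / T
print(f"slope: {b1:.3f}, implied lambda: {lam_hat:.4f}")  # negative slope: beta-convergence
```

Because the steady-state levels here are drawn independently of initial income, the unconditional slope is informative; when steady states correlate with initial income, the same regression confounds λ with steady-state heterogeneity, which is the role the conditioning variables above are meant to play.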
S.N. Durlauf and D.T. Quah

Table 1
Cross-section regressions: initial output and literacy-based sample breaks a

                              MRW        yj(1960) < 1950 and      54% <= LRj(1960)
                                         LRj(1960) < 54%
Observations                  98         42                       42

Unconstrained regressions
log yj(1960)                  -0.29 t    -0.44 t                  -0.43 t
                              (0.06)     (0.16)                   (0.08)
log(δ + νj + ξ)               -0.38      -0.54
                              (0.29)     (0.47)                   (0.28)
log τpj                       0.52 t     0.31 t                   0.69 t
                              (0.09)     (0.11)                   (0.17)
log τhj                       0.23 t     0.21 t                   0.11
                              (0.06)     (0.09)                   (0.16)
R̄2                            0.46       0.27                     0.48

Constrained regressions
ap                            0.43 t     0.28 t                   0.51 t
ah                            0.24 t     0.22 t                   0.11
R̄2                            0.42       0.28                     0.50

a Dependent variable: log yj(1985) − log yj(1960). The Table reports a selection of results from Durlauf and Johnson (1995, Table 2), with the notation changed to match this chapter's. The t symbol denotes significance at 5% asymptotic level. Parentheses enclose estimated standard errors. Constrained regressions indicate estimation imposing the restriction λ = −(1 − ap − ah)(δ + νj + ξ). The original MRW paper never reported results using such a restriction, and thus the MRW column is from Durlauf and Johnson (1995).
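The implied convergence rate reported below can be recovered by simple arithmetic from the table: over the 25-year span 1960-1985, the coefficient on log yj(1960) estimates e^{λT} − 1, so λ = ln(1 + b)/T.

```python
import numpy as np

# Implied convergence rate from Table 1, MRW column: the coefficient on
# log y_j(1960) over the 25 years 1960-1985 estimates e^{lambda*T} - 1.
b, T = -0.29, 25
lam = np.log(1.0 + b) / T
print(round(abs(lam), 3))  # 0.014
```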
Mankiw, Romer and Weil (1992) provide an essentially equivalent β-convergence analysis when they add human capital investment as an additional control. Their analysis differs from the vast majority of such studies in that their modification of the basic growth regression is justified by an explicit economic model; namely, they estimate the exact law of motion generated by the Solow model with Cobb-Douglas technology. The second column of Table 1 presents a baseline MRW estimate. From the estimated coefficient on log yj(1960) the implied convergence rate |λ| is 0.014, similar to Barro and Sala-i-Martin's 2%; however, the estimate of ap is only 0.43, in line with physical capital's factor share in national income accounts. Recalling the earlier comparison between Equations (13) and (22), we note that the key contribution in Mankiw, Romer and Weil (1992) is to alter Barro and Sala-i-Martin's first conclusion. In MRW a low estimated rate of convergence does not imply
Ch. 4: The New Empirics of Economic Growth
a large coefficient ap for physical capital. Indeed, as seen in Tables IV, V and VI of their paper, Mankiw, Romer and Weil find convergence rates similar to Barro and Sala-i-Martin's estimates. The difference between the two papers is the structural interpretation of that 2% rate of convergence. 15 Researchers have identified a number of econometric problems with conditional β-convergence analysis. Binder and Pesaran (1999), Den Haan (1995) and Kocherlakota and Yi (1995) argue that how one augments the growth model with stochastic disturbances profoundly affects the inference to be drawn from the data. 16 Their point resembles the classical econometric result where serially correlated disturbances in distributed lag equations lead to regression estimators that are inconsistent for the parameters of interest. A more fundamental interpretive difficulty for β-convergence analysis arises from recalling Figure 4, where cross-country growth patterns can exhibit highly nonlinear dynamics. Suppose that the a and b values there index multiple steady-state equilibria in the sense of, say, Azariadis and Drazen (1990). The figure then graphically illustrates the point in Bernard and Durlauf (1996) and Durlauf and Johnson (1995) that models having multiple steady states can display convergence of the kind studied in Barro and Sala-i-Martin (1992), Mankiw, Romer and Weil (1992), and others. Thus, for discriminating between models having widely different policy implications, standard cross-country tests of convergence need not provide great insight. While, under the neoclassical model, the conventional cross-country growth equation is (approximately) linear, under many endogenous growth models, it is profoundly nonlinear. As shown in Bernard and Durlauf (1996), using a linear specification to test one model versus another is then of limited use.
Put differently, relative to the class of endogenous growth models, no uniformly most powerful test exists under the null hypothesis of the neoclassical model. To emphasize the point, recall from Section 4 that while the Romer (1986) model produces observations not satisfying (conditional) β-convergence, data generated by the Azariadis-Drazen (1990) model might - even though in both kinds of endogenous growth models, global convergence fails.
15 Cohen (1996) takes this "deconstruction" exercise a step further, and in a different direction. He argues that typically constructed stocks of human and physical capital show unconditional β-convergence, even if per capita income does not. He concludes that it is the dynamics of the Solow residual across countries that account for this, and suggests a vintage human capital model to explain it. 16 This result on the importance of the stochastic specification is related to but different from that in Kelly (1992) and Leung and Quah (1996). These authors show that an appropriate stochastic specification can distort, not just statistical inference, but the underlying relation between physical capital's coefficient in the production function and the convergence or divergence properties of observed per capita income. In some of the examples they construct, even technologies displaying increasing returns to scale can give convergence of the cross-section distribution to a degenerate point mass. There is of course a voluminous theoretical literature on stochastic growth providing conditions under which regular behavior emerges [see, e.g., the references in Stokey and Lucas (1989) (with Prescott)]. The resulting empirical analysis can then still be close to that from Section 4, but the issues we discuss remain outstanding.
The linear/nonlinear distinction we have just drawn is not mere nitpicking. The lack of attention to the implications of nonlinear alternatives to the neoclassical growth model in assessing empirical results is one basis for our rejecting the commonly held position summarized in Barro (1997): "It is surely an irony that one of the lasting contributions of endogenous growth theory is that it stimulated empirical work that demonstrated the explanatory power of the neoclassical growth model". If the explanatory power of a model means, as we think it should, demonstrating that greater understanding of some phenomenon derives from that model as opposed to its alternatives, rather than merely compatibility with some empirical observations, then evidence of β-convergence simply does not provide the sort of corroboration of the neoclassical model claimed by Barro and many others. 17 Barro and Sala-i-Martin (1991) recognize that part of the importance of the convergence-rate estimate lies in its ability to shed light on whether and how rapidly poorer economies are catching up with the richer ones. They attempt to analyze this question through use of their concept of σ-convergence. They define σ-convergence to occur when the cross-section standard deviations of per capita incomes diminish over time. This type of convergence differs from β-convergence; that they are not the same illustrates some of the conceptual difficulties associated with statistical convergence measures in general and cross-country growth regressions in particular. But σ-convergence too is problematic. To understand those difficulties, it is convenient to begin with a further look at β-convergence. For simple stochastic models constructed around Equation (19), quite elaborately varied behavior for the cross-section distribution is consistent with even well-behaved (unconditional) β-convergence. Figures 10a-10c, similar to those in Quah (1996c), show three possibilities.
It is easy to generate all three from a single fixed model satisfying the same transition dynamics as given in Equation (19), varying only y(0) and the variance of the regression residual term (itself ad hoc and not suggested by any explicit economic structure). Thus, the same β-convergence statistics are found in all three cases, even though implications on the poor catching up with the rich differ across them. We can make this argument explicit by drawing on reasoning given in Quah (1993b). Remove from each observed y its upward-sloping steady-state growth path in Figures 10a-10c, so that all the y's have mean zero. Suppose, moreover, that in the long run these transformed y's satisfy two conditions: (i) Holding the cross-sectional economy j fixed, the time-series process yj is stationary with finite second moments. This holds for all j. (ii) Holding the time point t fixed, the collection of random variables {yj(t): integer j} is independent and identically distributed. This holds for all t. These restrictions are innocuous, given the points we wish to make here: essentially the same conclusions hold under quite general conditions.
17 See Galor (1996) for further discussion.
Fig. 10a. σ-divergence towards a σ-constant stationary state. The figure shows a cross section of economies that begin close together relative to their steady-state distribution and then spread out over time to converge in distribution to a well-defined steady state. Such dynamics are easy to generate, even with iid economies, each satisfying a covariance stationary linear autoregressive process.
Fig. 10b. Coincident β- and σ-convergence. The figure shows a cross section of economies where β- and σ-convergence coincide. All economies converge smoothly towards the common steady-state growth path. Similarly, the dispersion of the cross-section distribution declines to zero.
For an arbitrary pair of time points t1 and t2 with t1 < t2, the population cross-section regression of log y(t2) on a constant and log y(t1) is, by definition, the projection

P[log y(t2) | 1, log y(t1)] = E_C log y(t2) + b (log y(t1) − E_C log y(t1)),

where

b = Var_C^{-1}(log y(t1)) · Cov_C(log y(t2), log y(t1)),
Fig. 10c. σ-convergent limit with ongoing intra-distribution churning. The figure shows a cross section of economies at the steady-state distribution limit, but displaying ongoing intra-distribution dynamics. This situation might be viewed as the distributional endpoint of the earlier Figure 10a.
Fig. 10d. σ-convergent limit without intra-distribution churning. The figure shows a cross section of economies at the steady-state distribution limit, but unlike in Figure 10c there are no ongoing intra-distribution dynamics. All economies simply move in parallel.
where the C subscript denotes cross-section. Rearranging the projection so that growth rates appear on the left gives

P[log y(t2) − log y(t1) | 1, log y(t1)] = [E_C log y(t2) − b E_C log y(t1)] − (1 − b) log y(t1).    (30)
The sign of the coefficient on log y(t1) in this regression depends on whether b exceeds 1. The projection coefficient b, in turn, depends on how large the covariance
between growth and initial income is relative to the variance of initial income. Suppose that we are in the situation described by Figure 10c, where the long-run stationary steady state has been reached and log y(t) has its cross-sectional variances invariant in time. Since t2 > t1, equation (30) is a regression of growth rates on initial conditions. The Cauchy-Schwarz inequality

|Cov_C(log y(t2), log y(t1))| ≤ Var_C^{1/2}(log y(t2)) Var_C^{1/2}(log y(t1))

(with the inequality strict except in degenerate cases) then implies that −(1 − b) in Equation (30) is negative. In words, the conditional average - for that is what is represented by a cross-section regression - shows its growth rate negatively related to its initial level. That might, at first, suggest that we should see converging cross-section dynamics like those in Figure 10b, where the poor eventually attain the same income levels as the rich. However, recall that this negative relation between growth rates and initial levels has been constructed precisely when the cross-section dynamics are instead those in Figure 10c, where the gap between poorest and richest is always constant. More elaborate examples are easily constructed. For one, we need not consider situations only at long-run steady state. Since - outside of degenerate cases - the Cauchy-Schwarz inequality is strict, it is easy to find examples where −(1 − b) is negative even when Var_C(log y(t2)) is bigger than Var_C(log y(t1)), i.e., the cross-section dispersion is increasing even as the regression representation is suggesting dynamics like Figure 10b. Moreover, if one perturbs the regressor so that it is not log y(t1) but instead some other log y(t0) then the same argument shows that the regression coefficient on the "initial" level can be positive regardless of whether the cross-section distribution is expanding, diminishing, or unchanged in dispersion.
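The construction just described is easy to reproduce numerically. In the sketch below (all parameter values hypothetical), each economy follows the same stationary AR(1), so the cross section sits at its Figure 10c limit: the growth-on-initial-level regression nevertheless returns a negative slope, even though cross-section dispersion never shrinks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each economy j follows the same stationary AR(1) in detrended log income:
#   y_j(t+1) = rho * y_j(t) + e_j(t),  economies iid across j.
n, rho, sigma_e = 2000, 0.9, 1.0
stat_sd = sigma_e / np.sqrt(1.0 - rho**2)       # stationary standard deviation

y1 = rng.normal(0.0, stat_sd, n)                # cross section at t1, drawn from the
y2 = rho * y1 + rng.normal(0.0, sigma_e, n)     # stationary distribution; step to t2

# Cross-section regression of "growth" y2 - y1 on a constant and y1:
X = np.column_stack([np.ones(n), y1])
coef, *_ = np.linalg.lstsq(X, y2 - y1, rcond=None)
slope = coef[1]                                  # estimates -(1 - b); here b = rho < 1

print(f"growth-on-initial-level slope: {slope:.3f}")             # negative: "beta-convergence"
print(f"dispersion at t1 vs t2: {y1.std():.3f} vs {y2.std():.3f}")  # yet sigma is constant
```

The negative slope is exactly the regression-to-the-mean effect of Equation (30): b = ρ < 1, so −(1 − b) < 0, while the cross-section standard deviation stays at its stationary value.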
Different interpretations can be given to the effects we have just described - one early manifestation of these is known in the statistics literature as Galton's Fallacy or Galton's Paradox [see, e.g., Friedman (1992), Maddala (1988, 3.12), Stigler (1986, ch. 8), or Quah (1993b)]. 18 We prefer to regard the situation constructed above as one where knowledge of what happens to the conditional average (the regression representation) is uninformative for what happens to the entire cross section. In this interpretation, further β-convergence regression analysis of the growth equation (23) - be it with cross-section data, panel data, or any other structure; be it conditional or unconditional - cannot reveal whether the poor will catch up with the rich. These considerations suggest instead directly analyzing the dynamics of the cross-section distribution. Doing so goes beyond studying just σ-convergence, as the latter studies only one aspect of the distribution at each point in time. Moreover, σ-convergence is silent on whether clusters form within the cross section (as in the emerging twin peaks
18 This connection had been impressed on Quah by G.S. Maddala and Marc Nerlove separately in private communications.
of Figure 1) and on whether transitions occur within the distribution: both Figure 10c and Figure 10d show the same σ-convergence dynamics, yet economic behavior across them must differ dramatically.

5.2. Augmented cross-section regression
More recent empirical growth studies have tried to go beyond the original cross-section regressions and, instead, emphasize identifying those factors that explain international differences. Relative to the neoclassical growth model of Section 4, these exercises can be interpreted as parameterizing A. Table 2 surveys those regressors that, in the literature, have been used in cross-country regressions. 19 In addition to the four variables suggested by the augmented Solow-Swan model (initial income and the rates of human capital investment, physical capital investment, and population growth), the table includes 36 different categories of variables and 87 specific examples. Recall that the sample to which nearly all these additional control variables have been applied has only about 100 observations (the size of the subsample typically used from the Heston-Summers dataset). While these augmented cross-section regression studies have suggested some insightful extensions of the neoclassical growth model, we find problematic the lessons drawn from some of the empirical findings. First, many studies fail to make clear whether the regressions they consider can be interpreted within some economic model. It is certainly always possible to let A be a linear function of arbitrary control variables. But exploiting that hypothesized linear function need not be a useful way of studying the control in question. For example, the threshold externality in the Azariadis-Drazen model can be viewed as a latent variable indexing the aggregate production function. Such an interpretation is plausible for factors ranging from international market access to political regime - the ability of a society to innovate and to exploit readily available opportunities is influenced by political culture, with well documented historical examples going as far back as Athens and Sparta. However, we conclude from the model that these factors induce nonlinearities in the growth relation.
Linear regressions are, not surprisingly, unable to get at the features of interest. Moreover, it is unclear what exercise a researcher conducts by adding a particular control variable, even when the variable is motivated by a particular economic theory. The basic Solow-Swan model admits an immense range of extensions through factors such as inequality, political regime, or trade openness. These are often highly correlated with one another, and are neither mutually exclusive nor prioritized as possible explanations of growth. Hence, it is difficult to assign much import to the statistical
19 Temple (1996) provides an excellent literature overview discussing some of these studies in greater detail.
Table 2
Growth regression compilation a

Explanatory variable          Reference          Finding b
Change in labor force participation rate    Blomstrom, Lipsey and Zejan (1996)    +*
Corruption    Mauro (1995)
Capitalism (level)
Sala-i-Martin (1997)
+*
Democracy, some
Barro (1996, 1997)
+*
more
Barro (1996, 1997)
-*
overall
Alesina, Ozler, Roubini and Swagel (1996)
?
Domestic credit, growth rate
Levine and Renelt (1992)
+f
volatility of growth rate
Levine and Renelt (1992)
+f
Barro and Lee (1994)
-
Barro and Lee (1994) Barro (1996) Barro (1997) Caselli, Esquivel and Lefort (1996) Forbes (1997)
-* -*
Education, college level female
+* -*
female growth
Barro and Lee (1994)
-*
male
Barro and Lee (1994) Caselli, Esquivel and Lefort (1996) Forbes (1997)
+* -* +*
male growth
Barro and Lee (1994)
+*
overall
Barro (1991) Knowles and Owen (1995) Levine and Renelt (1992) Mankiw, Romer and Weil (1992)
+* + +f +*
primary
Barro (1997)
-
Barro (1996) Barro and Lee (1994) Easterly (1993) Harrison (1995) Levine and Renelt (1992) Sala-i-Martin (1997)
-* -*
Easterly (1993) Harrison (1995) Sala-i-Martin (1997)
-
Barro (1996, 1997) Barro and Lee (1994) Caselli, Esquivel and Lefort (1996) Easterly, Kremer, Pritchett and Summers (1993)
+* + +* +*
External debt (dummy)
Easterly, Kremer, Pritchett and Summers (1993)
-
Fertility
Barro (1991, 1996, 1997) Barro and Lee (1994)
-* -*
Exchange rates (real), black market premium
distortions
terms of trade improvement
-* -f -*
-*
Financial repression
Easterly (1993)
-*
Financial sophistication
King and Levine (1993)
+*
Fraction college students, engineering
Murphy, Shleifer and Vishny (1991)
+*
Murphy, Shleifer and Vishny (1991)
-*
Barro (1991, 1996, 1997) Barro and Lee (1994) Caselli, Esquivel and Lefort (1996)
-* -* +*
growth in consumption
Kormendi and Meguire (1985)
+
deficits
Levine and Renelt (1992)
-f
investment
Barro (1991)
+
Alesina, Ozler, Roubini and Swagel (1996)
+*
Alesina, Ozler, Roubini and Swagel (1996) Easterly, Kremer, Pritchett and Summers (1993)
+ +
Health (various proxies)
Barro (1997) Barro and Lee (1994) Caselli, Esquivel and Lefort (1996) Knowles and Owen (1995)
+* +* -* +*
Inequality, democracies
Persson and Tabellini (1994)
-*
non-democracies
Persson and Tabellini (1994)
+*
overall
Alesina and Rodrik (1994) Forbes (1997)
-* +*
law Government, consumption
Growth rates, G7 G7 lagged
Inflation, change
Kormendi and Meguire (1985)
level (above 15%)
Barro (1997)
level
Levine and Renelt (1992)
variability
Barro (1997) Levine and Renelt (1992)
+
Barro (1991, 1997) Barro and Lee (1994) Barro and Sala-i-Martin (1992) Ben-David (1996) Caselli, Esquivel and Lefort (1996) Cho (1996) Kormendi and Meguire (1985) Levine and Renelt (1992) Mankiw, Romer and Weil (1992) Romer (1993)
-* -* -* -* -* +* -* -r * -*
Barro (1997)
-*
Initial income
(interacted with male schooling)
-* f f
Investment ratio
Barro ( 1991) Barro (1996, 1997) Barro and Lee (1994) Caselli, Esquivel and Lefort (1996) Levine and Renelt (1992) Mankiw, Romer and Weil (1992)
+* + +* +* +r +*
Investment, equipment, fixed capital
Blomstrom, Lipsey and Zejan (1996) DeLong and Summers (1993) Sala-i-Martin (1997)
+* +*
Sala-i-Martin (1997)
+*
Sala-i-Martin (1997)
+*
Mining (fraction of GDP)
Sala-i-Martin (1997)
+*
Money growth
Kormendi and Meguire (1985)
+
Politics, civil liberties
Barro and Lee (1994) Kormendi and Meguire (1985) Levine and Renelt (1992) Sala-i-Martin (1997)
* + ?f +*
instability
Alesina, Ozler, Roubini and Swagel (1996) Barro (1991) Barro and Lee (1994) Caselli, Esquivel and Lefort (1996) Levine and Renelt (1992) Sala-i-Martin (1997)
-* -* -* -* -f *
political rights
Barro and Lee (1994) Sala-i-Martin (1997)
+* +*
Barro and Lee (1994) Kormendi and Meguire (1985) Levine and Renelt (1992) Mankiw, Romer and Weil (1992)
+ -* -f *
< 15 years
Barro and Lee (1994)
-*
_> 65 years
Barro and Lee (1994)
?
Easterly (1993) Harrison (1995)
+ -*
Barro (1991) Easterly (1993)
* -*
non-equipment Latitude (absolute)
Population growth
Price distortion, consumption investment Price levels, consumption investment
Easterly (1993)
+
Easterly (1993)
-*
Regions, latitude (absolute)
Sala-i-Martin (1997)
+*
East Asia
Barro (1997) Barro and Lee (1994)
+ +
former Spanish colony
Sala-i-Martin (1997)
Latin America
Barro (1991) Barro (1997) Barro and Lee (1994) Sala-i-Martin (1997)
-*
Barro (1991) Barro (1997) Barro and Lee (1994) Sala-i-Martin (1997)
-*
Religion, Buddhist
Sala-i-Martin (1997)
+*
Catholic
Sala-i-Martin (1997)
-*
sub-Saharan Africa
*
-* -*
-* -*
Confucian
Sala-i-Martin (1997)
+*
Muslim
Sala-i-Martin (1997)
+*
Protestant
Sala-i-Martin (1997)
-*
Rule of law
Barro (1996, 1997) Sala-i-Martin (1997)
+* +*
Scale effects, total area
Sala-i-Martin (1997)
?
total labor force
Sala-i-Martin (1997)
?
Frankel and Romer (1996) Frankel, Romer and Cyrus (1996) Harrison (1995) Levine and Renelt (1992)
+* +*
primary products in total exports (fraction)
Sala-i-Martin (1997)
-*
export-GDP ratio (change)
Kormendi and Meguire (1985)
+*
FDI relative to GDP
Blomstrom, Lipsey and Zejan (1996)
-
Trade, export/import/total trade as fraction of GDP
machinery and equipment imports    Romer (1993)
Trade policy, import penetration
+f
+*
Levine and Renelt (1992)
?f
Leamer index
Levine and Renelt (1992)
-f
openness (change)
Harrison (1995)
+*
openness (level)
Harrison (1995) Levine and Renelt (1992)
+* ?f
Trade policy (cont'd), outward orientation tariffs years open, 1950-1990 Variability, growth innovations
Levine and Renelt (1992)
?f
money War, casualties per capita duration occurrence
Barro and Lee (1994) Sala-i-Martin (1997) Kormendi and Meguire (1985) Ramey and Ramey (1995) Kormendi and Meguire (1985) Easterly, Kremer, Pritchett and Summers (1993) Barro and Lee (1994) Barro and Lee (1994) Sala-i-Martin (1997)
+* -* -* * + + -*
a In this table we can give no more than a flavor of the findings extant. Detailed variable definitions can be found in the individual references. b Symbols: * denotes a claim of significance (authors' significance levels differ across studies, and are not always explicitly reported); ? denotes that the author(s) did not report the result; and f and r indicate fragility and robustness in the sense used by Levine and Renelt (1992).
significance of an arbitrarily chosen subset of possible controls. We therefore find unpersuasive claims that these regressions are able to identify economic structure. The problem of open-ended alternative models also extends to various attempts in the literature to find instruments for the various baseline and augmented Solow-Swan regressors, which are of course typically endogenous themselves. Frankel and Romer (1996) use geographic variables to instrument their measure of trade openness. However, that these variables are exogenous with respect to trade openness does not make them legitimate instruments. For example, from the perspective of European and Asian history it is wholly plausible that land mass correlates with military expenditures and military strength, which themselves correlate with tax rates and political regime - two alternative augmentations of the Solow model which have been proposed. Because growth explanations are so broad, it is especially easy to construct plausible reasons why "exogenous" instruments are less useful than they might first appear. The failure of the growth model to naturally generate useful instruments contrasts with rational expectations models whose structure produces such instruments automatically from the orthogonality of forecast errors and available information. This reasoning has led to a reexamination of the empirical conclusions from this line of work. The issue has been addressed in two ways. First, Levine and Renelt (1992) have challenged many of the findings in cross-country growth regressions. They emphasized that findings of statistical significance may be fragile due to dependence on additional controls whose presence or absence is not strongly motivated by any theory.
By applying Leamer's [Leamer (1978)] extreme bounds analysis (thereby identifying the range of coefficient estimates for a given regressor generated by alternative choices of additional regressors) they found that only the physical capital investment rate and, to a weaker degree, initial income are robustly related to cross-country growth rate differentials. Levine and Renelt (1992) have identified a serious problem with the empirical growth literature. However, their procedure for dealing with the problem is itself problematic. The difficulty may be most easily seen in the following example. Suppose that one is interested in the coefficient b0 relating variables X and Y, where the true data generating process is given by

Yj = Xj b0 + εj,

with X deterministic and ε normally distributed N(0, σ²). Suppose the researcher considers a set of controls {Zl : integer l}, each Zl being separately entered in the regression:

Yj = Xj b + Zlj cl + εj.    (31)

Assume that the Zl's are nonstochastic and, in sample, have zero cross-product with X. Denote the sample second moments of X and Zl by ||X||² and ||Zl||² respectively. Then OLS on Equation (31) produces b̂ estimates that are draws from the normal distribution N(b0, (||X||² + ||Zl||²)⁻¹ σ²). Since the Zl are deterministic, as researchers increase the number of separate Zl used in different regression analyses, so does the probability increase that some draw on b̂ will have sign opposite to that on the true b0. 20 The problem is that the b̂ distribution has support that can become unbounded due to sampling variation induced by the arbitrarily chosen regressors. Without a theory on how to control this problem, it is difficult to draw strong conclusions about the fragility of regression coefficients. Hence, while we find the Levine and Renelt analysis suggestive, the import of the challenge is unclear. Sala-i-Martin (1997) has attempted to deal with this limitation by calling "robust" only those variables found statistically significant in 95% of a group of regressions in a wide range of possible combinations of controls. This work finds that many more variables appear to be robust. These variables fall into 9 categories: (1) region (dummy variables for Sub-Saharan Africa and Latin America), (2) political structure (measures of rule of law, civil liberties and political instability), (3) religion, (4) market distortions (measured with reference to official and black market exchange rates), (5) equipment investment, (6) natural resource production, (7) trade openness, (8) degree of capitalism, and (9) former Spanish colonies.
20 The basic argument clearly still applies when the Zl are stochastic, even though then the b̂ distributions are not typically normal.
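A minimal version of the extreme-bounds exercise can be sketched as follows (all data and variable names hypothetical): fix a focus regressor, cycle through small subsets of candidate controls, and record the range of its estimated coefficients. In the Levine-Renelt usage, a regressor is called fragile when those bounds change sign or lose significance.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# Stylized extreme-bounds exercise: regress growth on a focus variable
# (an "investment rate") while cycling through every small subset of
# candidate controls, recording the range of the focus coefficient.
n = 100
invest = rng.normal(size=n)
controls = rng.normal(size=(n, 5))              # 5 arbitrary candidate controls,
controls[:, 0] += 0.8 * invest                  # one correlated with the focus variable
growth = 0.5 * invest + 0.3 * controls[:, 0] + rng.normal(0.0, 0.5, n)

estimates = []
for k in range(3):                              # control subsets of size 0, 1, 2
    for subset in combinations(range(5), k):
        X = np.column_stack([np.ones(n), invest] + [controls[:, i] for i in subset])
        coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
        estimates.append(coef[1])               # coefficient on the focus variable

lo, hi = min(estimates), max(estimates)
print(f"extreme bounds on the focus coefficient: [{lo:.2f}, {hi:.2f}]")
```

The sketch also illustrates the objection raised above: the width of the bounds is driven by the covariance structure between the focus regressor and the arbitrarily chosen controls, not by any economic property of the focus variable itself.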
However, it is again unclear how to interpret such results. Suppose that one were to take a given regression relationship and begin to include alternative sets of right-hand-side variables which were in each case orthogonal to the original regressors. The presence or absence of these regressors would have (by assumption) no effect on estimated coefficient size or estimated standard errors. Hence, one could always generate an arbitrarily large number of regressions with the same significant coefficient but with no implications as to whether the coefficient estimate is or is not robust. Hence, it is impossible to know whether Sala-i-Martin's exercise actually reveals something about robustness, or merely something about the covariance structure of the controls which he studies. Further, the exercise assumes that robustness is interesting outside of the context of which variables are under study. The fact that the presence of one variable in a growth regression renders another insignificant is not vitiated by the fact that others do not do so, when the first is of economic interest, and the others are not. The problem with both these approaches to robustness of control variables in growth regressions is that they attempt to use mechanical statistical criteria in identifying factors whose interest and plausibility is motivated by economic (or social science) theory. The dimensions along which one wants estimates to be robust are determined by the goals of the researcher, which cannot be reduced to algorithms of the kind that have been employed.

5.3. Panel-data analysis
To permit unobservable country-specific heterogeneity in growth regressions, Benhabib and Spiegel (1997), Canova and Marcet (1995), Caselli, Esquivel and Lefort (1996), Evans (1998), Islam (1995), Lee, Pesaran and Smith (1997), and Nerlove (1996) have used panel-data methods to study the cross-country income data. Following traditional motivation in panel-data econometrics [e.g., Chamberlain (1984)], many such studies seek to eliminate, in the notation of Section 4, unobservable country-level heterogeneity in A(0). Those heterogeneities, denoted individual effects in the language of panel-data econometrics, constitute nuisance parameters that within the conventional framework the researcher attempts to remove. 21 Panel-data studies proceed from the neoclassical (MRW) model (23) as follows. Assume that depreciation δ and technology growth ξ are constant across economies.
21 Canova and Marcet (1995) and Evans (1998) are exceptions to this. Canova and Marcet analyze a Bayesian-motivated parameterization of the individual effects, and conclude that those effects do, indeed, differ across economies. Evans, using a different statistical technique, concludes the same. Evans follows Levin and Lin (1992) and Quah (1994) in taking an underlying probability model where both time and cross-section dimensions in the panel dataset are large. This contrasts with standard panel-data studies where the time dimension is taken to be relatively small. The large N, large T framework then allows inference as if the individual effects are consistently estimated, and permits testing for whether they differ across countries. See also Im, Pesaran and Shin (1997).
S.N. Durlauf and D.T. Quah
Fix horizon T, append a residual on the right, and redefine coefficients to give, across economies j, the regression equation

log y_j(t + T) - log y_j(t) = b0 + b1 log y_j(t) + b2 log τ_{p,j} + b3 log τ_{h,j} + b4 log(δ + ν_j + ξ) + ε_{j,t},   (32)

with, by definition,

b0 = (1 - e^{λT}) log A(0) + (t + T - e^{λT} t) ξ,
b1 = e^{λT} - 1,
b2 = (1 - e^{λT}) α_p (1 - α_p - α_h)^{-1},
b3 = (1 - e^{λT}) α_h (1 - α_p - α_h)^{-1},
b4 = -(1 - e^{λT}) (α_p + α_h) (1 - α_p - α_h)^{-1}.
Let T = 1 and assume that b0 is a random variable with unobservable additive components varying in j and t:

log y_j(t + 1) - log y_j(t) = μ_j + ω_t + b1 log y_j(t) + b2 log τ_{p,j} + b3 log τ_{h,j} + b4 log(δ + ν_j + ξ) + ε_{j,t}.   (33)

This formulation differs from the original MRW specification in two ways. First, the law of motion for output is taken in one-period adjustments. This is inessential, however, and the researcher is free to recast Equation (32) with T set to whatever the researcher deems appropriate. Second, the (originally) constant b0 is decomposed into economy-specific and time-specific effects:

b0 = μ_j + ω_t.   (34)
Panel-data methods, applied to the model above, have produced a wide range of empirical results. While Barro and Sala-i-Martin (1991, 1992) defend a 2% annual rate of convergence from cross-section regressions, estimates from panel-data analyses have been more varied. Lee, Pesaran and Smith (1997, 1998) conclude annual convergence rates are approximately 30% when one allows heterogeneity in all the parameters. Islam (1995) permits heterogeneity only in the intercept terms, and finds annual convergence rates between 3.8% and 9.1%, depending on the subsample under study. Caselli, Esquivel and Lefort (1996) suggest a convergence rate of 10%, after conditioning out individual heterogeneities and instrumenting for dynamic endogeneity. Nerlove (1996), by contrast, finds estimates of convergence rates that are even lower than those generated by cross-section regression. He explains this difference as being due to finite
Ch. 4: The New Empirics of Economic Growth
sample biases in the estimators employed in the other studies using the neoclassical growth model. The disparate results across panel-data studies can sometimes, but not always, be attributed to the different datasets that different researchers have employed.

The use of a panel-data structure has advantages and disadvantages. One significant advance comes from clarifying the difficulties in interpreting the standard cross-section regression. In particular, the dynamic panel (33) typically displays correlation between lagged dependent variables and the unobserved residual. The resulting regression bias depends on the number of observations in time and disappears only when that number becomes infinite. Moreover, the bias does not disappear with time averaging. Thus, if the dynamic panel were the underlying structure, standard cross-section regressions would not consistently uncover the true structural parameters.

But beyond simply pointing out difficulties with the cross-section OLS formulation, the panel-data structure has been argued, on its own merits, to be more appropriate for analyzing growth dynamics. For instance, Islam (1995) shows how time- and country-specific effects can arise when per capita output is the dependent variable instead of output per effective worker (Islam argues this substitution to be appropriate). Alternatively, one might view the error structure as a consequence of omitted variables in the growth equation, whereupon the separate time and country effects in Equation (34) have alternative natural interpretations. These instances of the greater flexibility (and, thus, reduced possibilities for misspecification) allowed by panel-data analyses - unavailable to cross-section regression studies - account for their broader econometric use more generally, not just in studies of economic growth. However, the putatively greater appeal of panel-data studies should not go unchallenged.
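The lagged-dependent-variable bias just mentioned can be seen in a short Monte Carlo (a hypothetical sketch, not any of the cited studies' actual specifications): with a short time dimension, the within estimator of a dynamic panel is biased downward, and the bias shrinks only as the number of time periods grows.

```python
import numpy as np

rng = np.random.default_rng(2)
J, T, rho = 200, 5, 0.8   # many countries, short time dimension, true AR coefficient

estimates = []
for _ in range(200):
    mu = rng.normal(size=J)                 # unobserved country effects
    y = np.zeros((J, T + 1))
    y[:, 0] = mu / (1 - rho)                # start each country near its own mean
    for t in range(T):
        y[:, t + 1] = mu + rho * y[:, t] + rng.normal(size=J)
    # Within (fixed-effects) estimator of rho: demean within country,
    # then regress y_t on y_{t-1}.
    ylag = y[:, :-1] - y[:, :-1].mean(axis=1, keepdims=True)
    ycur = y[:, 1:] - y[:, 1:].mean(axis=1, keepdims=True)
    estimates.append((ylag * ycur).sum() / (ylag ** 2).sum())

# The average estimate sits well below the true value 0.8: the demeaning
# correlates the transformed regressor with the transformed residual.
print(np.mean(estimates))
```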
To see the potential disadvantages, consider again the decomposition in Equation (34). For researchers used to the conventions in panel-data econometric analysis, this generalization from a constant unique b0 is natural. But for others, it might appear to be a proliferation of free parameters not directly motivated by economic theory. Freeing b0 so that it can vary across countries and over time can only help a theoretical model fit the data better. Restricting b0 to be identical across countries and over time - when, in reality, b0 should differ - can result in a model that is misspecified, thereby lowering confidence that the researcher has correctly identified and estimated the parameters of interest. This advantage of a panel-data approach applies generally, and is not specific to growth and convergence. But for convergence studies, the flexibility from decomposing b0 into economy-specific and time-specific components can instead be problematic, giving rise to misleading conclusions. We describe two scenarios where we think this might naturally occur.

First, note that Equation (32) implies that A(0) (and thus b0 through μ_j) forms part of the long-run path towards which the given economy converges (see again Figures 10a-10d). Ignore Galton's Fallacy to sharpen the point here. If the researcher insists that A(0) be identical across economies, then that researcher concludes convergence to an underlying steady-state path precisely when catching up between poor and rich takes place. Thus, the implication from a convergence finding is transparent: it translates
directly into a statement about catching up (again, abstracting away from Galton's Fallacy). By contrast, when the researcher allows A(0) to differ across economies, finding convergence to an underlying steady-state path says nothing about whether catching up occurs between poor and rich: Figures 10a-10d show different possibilities. This is not just the distinction between conditional and unconditional convergence. In panel-data analysis, it is considered a virtue that the individual heterogeneities A(0) are unobservable, and not explicitly modelled as functions of observable right-hand side explanatory variables. By leaving free those individual heterogeneities, the researcher gives up hope of examining whether poor economies are catching up with rich ones. The use of panel-data methods therefore compounds the difficulties in interpreting convergence regression findings in terms of catch-up from poor to rich.

For the second scenario, recall that the problem the panel-data regression Equation (33) traditionally confronts is the possibility that the μ_j's, the individual-specific effects, are correlated with some of the right-hand side variables. If not for this, OLS on Equation (33) would allow both consistent estimation and (with appropriately corrected standard errors) consistent inference. 22 One class of solutions to the inconsistency problem derives from transforming Equation (33) to annihilate the μ_j. For instance, in the so-called "fixed-effects" or within estimator, one takes deviations from time-averaged sample means in Equation (33), and then applies OLS to the transformed equation to provide consistent estimates for the regression coefficients. But note that in applying such an individual-effects-annihilating transformation, the researcher winds up analyzing a left-hand side variable purged of its long-run (time-averaged) variation across countries.
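A numerical sketch (hypothetical data) makes the purging concrete: after the within transformation, the time-averaged cross-country variation that distinguishes permanently rich from permanently poor economies is identically zero.

```python
import numpy as np

rng = np.random.default_rng(1)
J, T = 20, 30    # hypothetical panel: 20 countries, 30 years

# Incomes dominated by permanent cross-country differences mu_j.
mu = rng.normal(scale=2.0, size=(J, 1))     # country effects
y = mu + 0.1 * rng.normal(size=(J, T))      # plus small transitory noise

# Within (fixed-effects) transformation: deviations from each
# country's own time average.
within = y - y.mean(axis=1, keepdims=True)

print(y.mean(axis=1).var())        # large: long-run variation, driven by mu_j
print(within.mean(axis=1).var())   # zero (up to floating point): annihilated
```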
Such a method, therefore, leaves unexplained exactly the long-run cross-country growth variation originally motivating this empirical research. The resulting estimates are, instead, pertinent only for higher-frequency variation in the left-hand side variable: this might be of greater interest for business cycles research than it is for understanding patterns of long-run economic growth across countries. 23 Our point is general: it applies not just to the fixed-effects estimator, but also to the first-difference estimator, and indeed to any panel-data technique that conditions out the individual effects as "nuisance parameters". In dealing with the correlation between individual effects and right-hand side variables - a properly-justified problem in microeconometric studies [again see, e.g., Chamberlain (1984)] - the solution
22 OLS might not be efficient, of course, and GLS might be preferred where one takes into account the covariance structure of the μ_j's. 23 This statement clearly differs from saying that fixed-effects estimators are inconsistent in dynamic models without strict exogeneity of the regressors [e.g., Chamberlain (1984)]. The absence of strict exogeneity characterizes Equation (33), and thus is an additional problem with fixed-effects estimators. This shortcoming has motivated studies such as Caselli, Esquivel and Lefort (1996) that use techniques appropriate for such correlation possibilities. However, those techniques do nothing for the short-run/long-run issue we raise.
offered by panel-data techniques ends up profoundly limiting our ability to explain patterns of cross-country growth and convergence. 24

Interestingly, that conditioning out country-specific effects leaves only high-frequency income movements to be explained creates not only the problem just described, but also its dual. Over what time horizon is a growth model supposed to apply? Many economists (or Solow and Swan themselves in the original papers, for that matter) regard growth analyses as relevant over long time spans. Averaging over the longest time horizon possible - as in cross-section regression work - comes with the belief that such averaging eliminates business cycle effects that likely dominate per capita income fluctuations at higher frequencies. By contrast, Islam (1995, p. 1137) has argued that since Equation (23) is "based on an approximation around the steady state ... it is, therefore, valid over shorter periods of time". However, we think this irrelevant. Different time scales for analyzing the model are mutually appropriate only if the degree of misspecification in the model is independent of time scale. In growth work, one can plausibly argue that misspecification is greater at higher frequencies. Taking Islam's argument seriously, one might attempt using the neoclassical growth model to explain even weekly or daily income fluctuations in addition to decadal movements.

5.4. Time series: unit roots and cointegration
An alternative approach to long-run output dynamics and convergence based on time-series ideas has been developed in Bernard and Durlauf (1995, 1996), Durlauf (1989), and Quah (1992). Convergence here is identified not as a property of the relation between initial income and growth over a fixed sample period, but instead of the relationship between long-run forecasts of per capita output, taking as given initial conditions. Bernard and Durlauf (1996) define time-series forecast convergence as the equality of long-term forecasts taken at a given fixed date. Thus, given ℱ_t, the information available at date t, economies j and j′ show time-series forecast convergence at t when

lim_{T→∞} E( y_j(t + T) - y_{j′}(t + T) | ℱ_t ) = 0,

i.e., the long-term forecasts of per capita output are equal given information available at t. It is easy to show that time-series forecast convergence implies β-convergence when growth rates are measured between t and t + T for some fixed finite horizon T. The critical distinction between time-series forecast convergence and β-convergence is that an expected reduction in contemporary differences (β-convergence) is not the same as the expectation of their eventual disappearance.
24 Quah (1996c, p. 1367) has also argued this.
This dynamic definition has the added feature that it distinguishes between convergence between pairs of economies and convergence for all economies simultaneously. Of course, if convergence holds between all pairs then convergence holds for all. Some of the theoretical models we have described - in particular, those with multiple steady states - show that convergence need not be an all-or-nothing proposition. Subgroups of economies might converge, even when not all economies do. To operationalize this notion of convergence, a researcher examines whether the difference between per capita incomes in selected pairs of economies can be characterized as a zero-mean stationary stochastic process. Hence, forecast convergence can be tested using standard unit root and cointegration procedures. Under the definition, a deterministic (nonzero) time trend in the cross-pair differences is as much a rejection of convergence as is the presence of a unit root.

In the literature applying these ideas, two main strands can be distinguished. The first, typified by Bernard and Durlauf (1995, 1996), restricts analysis to particular subgroups of economies, for instance the OECD. This allows the researcher to use long time-series data, such as those constructed by Maddison (1989). Multivariate unit root and cointegration tests reject the null hypothesis that there is a single unit-root process driving output across the OECD economies - thus, across all the economies in the OECD grouping, time-series forecast convergence can be rejected. At the same time, however, individual country pairs - for instance, Belgium and the Netherlands - do display such convergence. In a second strand, Quah (1992) studies the presence of common stochastic trends in a large cross section of aggregate economies. He does this by subtracting US per capita output from the per capita output of every economy under study, and then examining whether unit roots remain in the resulting series.
Because the number of time-series observations is the same order of magnitude as the number of countries, random-field asymptotics are used to compute significance levels. Quah's results confirm those of Bernard and Durlauf described above. He rejects the null hypothesis of no unit roots in the per capita output difference series; in other words, he finds evidence against convergence (in the sense given by the forecasting definition).

Time series approaches to convergence are subject to an important caveat. The statistical analysis under which convergence is tested maintains that the data under consideration can be described by a time-invariant data generating process. However, if economies are in transition towards steady state, their associated per capita output series will not satisfy this property. Indeed, as argued by Bernard and Durlauf (1996), the time series approach to convergence, by requiring that output differences be zero-mean and stationary, imposes a condition inconsistent with that implied in cross-section regressions, namely that the difference between a rich and poor economy have a nonzero mean. Time-series and cross-section approaches to convergence rely on different interpretations of the data under consideration. Hence they can provide conflicting evidence; in practice, the two approaches commonly do.
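The mechanics of such a test can be sketched numerically (simulated series, not the Maddison or Heston-Summers data): a simple Dickey-Fuller regression of the change in a cross-pair income gap on its lagged level separates a mean-reverting gap from a random-walk one. In practice one would of course compare the statistic against tabulated Dickey-Fuller critical values rather than eyeball the raw coefficient.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 500

# Two hypothetical log-income gaps between country pairs: one
# mean-reverting (consistent with forecast convergence), one a
# random walk (no convergence).
gap_stat = np.zeros(T)
for t in range(1, T):
    gap_stat[t] = 0.7 * gap_stat[t - 1] + rng.normal()
gap_rw = np.cumsum(rng.normal(size=T))

def df_coef(x):
    # Dickey-Fuller regression of the change on the lagged level;
    # the coefficient estimates rho - 1.
    dx, xlag = np.diff(x), x[:-1]
    return (xlag * dx).sum() / (xlag ** 2).sum()

print(df_coef(gap_stat))  # clearly negative, near 0.7 - 1 = -0.3
print(df_coef(gap_rw))    # close to zero
```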
5.5. Clustering and classification
Following Azariadis and Drazen's (1990) theoretical insights, Durlauf and Johnson (1995) study Equation (27), and find evidence for multiple regimes in cross-country growth dynamics. They do this in the dataset originally used by MRW [Mankiw, Romer and Weil (1992)], identifying sample splits so that within any given subsample all economies obey a common linear cross-section regression equation. Durlauf and Johnson allow economies with different 1960 per capita incomes and literacy rates (LR) to be endowed with different aggregate production functions. Using a regression-tree procedure 25 to identify threshold levels endogenously, Durlauf and Johnson find the MRW dataset displays four distinct regimes determined by initial conditions: (1) yj(1960) < $800; (2) $800 ≤
25 Breiman, Friedman, Olshen and Stone (1984) describe the regression-tree procedure and its properties.
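A minimal version of the threshold search can be sketched as follows (illustrative data and threshold, not Durlauf and Johnson's actual estimates): for a single split variable, try candidate thresholds and keep the one minimizing the summed squared residuals of separate linear fits on each side.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical cross-section in which growth obeys different linear
# regimes on either side of a planted initial-income threshold at 1.0.
n = 200
y0 = rng.uniform(0.2, 3.0, size=n)                        # initial income
growth = np.where(y0 < 1.0, 0.05 - 0.02 * y0, 0.01 + 0.01 * y0)
growth = growth + 0.002 * rng.normal(size=n)

def split_sse(threshold):
    # Combined residual sum of squares from separate linear fits of
    # growth on initial income, below and above the threshold.
    total = 0.0
    for mask in (y0 < threshold, y0 >= threshold):
        X = np.column_stack([np.ones(mask.sum()), y0[mask]])
        resid = growth[mask] - X @ np.linalg.lstsq(X, growth[mask], rcond=None)[0]
        total += (resid ** 2).sum()
    return total

# One regression-tree step: choose the threshold minimizing the SSE.
candidates = np.quantile(y0, np.linspace(0.1, 0.9, 81))
best = min(candidates, key=split_sse)
print(best)  # close to the planted threshold of 1.0
```

A full regression tree applies this step recursively within each subsample, which is how multiple regimes such as Durlauf and Johnson's four can emerge.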
but that they do converge (to different limits). Interestingly, Franses and Hobijn additionally find that productivity convergence does not lead to convergence in social indicators like infant mortality. This work suggests that a richer notion of convergence, one accounting explicitly for the multivariate nature of aggregate socioeconomic characteristics, warrants further study.
5.6. Distribution dynamics

Bianchi (1997), Desdoigts (1994), Jones (1997), Lamo (1996), and Quah (1993a,b, 1996b, 1997) have studied the predictions of the theoretical growth models in terms of the behavior of the entire cross-section distribution. While this work is often quite technical, it can be viewed as just a way to make precise the ideas previously described informally in Section 2. Turning back to Figure 1, label the cross-section distribution Ft at time period t, and call the associated (probability) measure φ_t. Figure 1 can then be interpreted as describing the evolution of a sequence of measures {φ_t : t ≥ 0}. In empirical work on distribution dynamics, the researcher seeks a law of motion for the stochastic process {φ_t : t ≥ 0}. With such a scheme in hand, one can ask about the long-run behavior of φ_t: if φ_t displayed tendencies towards a point mass, then one can conclude that there is convergence towards equality. If, on the other hand, φ_t shows tendencies towards limits that have yet other properties - normality or twin peakedness or a continual spreading apart - then those too would be revealed from the law of motion. Moreover, having such a model would allow one to study the likelihood and potential causes of poorer economies becoming richer even than those already currently rich, and similarly the likelihood and potential causes of those already rich regressing to become relatively poor. Finally, a researcher with access to such a law of motion can look further to ask what brings about particular patterns of cross-country growth.

The simplest scheme for modelling the dynamics of {φ_t : t ≥ 0} is analogous to the first-order autoregression from standard time-series analysis:

φ_t = T*(φ_{t-1}, u_t) = T_u*(φ_{t-1}),  t ≥ 1,   (35)

where T* is an operator that maps the Cartesian product of measures and generalized disturbances u to probability measures, and T_u* absorbs the disturbance into the definition of the operator. (See Appendix A for the meaning of * in the two operators T* and T_u*.) This is no more than a stochastic difference equation taking values that are entire measures. Equivalently, it is an equation describing the evolution of the distribution of incomes across economies.

A first pass at Equation (35) discretizes the income space, whereupon the measures φ_t can be represented by probability vectors. For instance, Quah (1993a) considers dividing income observations into five cells: the first comprising per capita incomes no greater than 1/4 the world average (at each date); the second, incomes greater than 1/4 but no more than 1/2; the third, incomes greater than 1/2 but no more than
Table 3
Cross-country income dynamics a

Upper endpoint  (Number)   1/4    1/2    1      2      ∞
1/4             (456)      0.97   0.03
1/2             (643)      0.05   0.92   0.04
1               (639)             0.04   0.92   0.04
2               (468)                    0.04   0.94   0.02
∞               (508)                           0.01   0.99
Ergodic                    0.24   0.18   0.16   0.16   0.30

a 118 economies, relative to world per capita income, 1960-1984. Grid: (0, 1/4, 1/2, 1, 2, ∞). This table is a portion of Table 1 from Quah (1993a). The table shows transition dynamics over a single-year horizon. The cells are arrayed in increasing order, so that the lower right-hand portion of the table shows transitions from the rich to the rich. The numbers in parentheses in the leftmost column are the number of economy/year pairs beginning in a particular cell. Cells showing 0 to two decimal places are left blank; rows might not add to 1 because of rounding. The Ergodic row gives the long-run distribution from transitions according to the law of motion given in the matrix.
the average; the fourth, greater than the average but no more than double; and finally, in the fifth cell, all other incomes. In terms of Figure 1, at any given date t, a five-element probability vector φ_t completely describes the situation. Moreover, since we observe which economies transit to different cells in this discretization (and the cells from which they came), we can construct a matrix M_t whose rows and columns are indexed by the elements of the discretization, and where each row of M_t gives the fraction of economies beginning from that row element ending up in the different column elements. By construction M_t has the properties of a transition probability matrix: its entries are nonnegative and its row sums are all 1. If we assume that the underlying transition mechanism is time-invariant, then one can average the M_t to obtain a single transition probability matrix M describing the dynamics of the (discretized) distribution. Table 3 shows such an M, as estimated in Quah (1993a). Because the transitions are only over a one-year horizon, it is unsurprising that the diagonal entries are close to 1, and most of the other entries are zero. What interests us, however, is not any single one of these numbers but what the entire law of motion implies. The row labelled Ergodic is informative here. To understand what it says, note that by construction

φ_{t+1} = M′ φ_t, so that ∀ s ≥ 1: φ_{t+s} = (M′)^s φ_t.   (36)
Since M is a transition probability matrix, its largest eigenvalue is 1, and the left eigenvector corresponding to that eigenvalue can be chosen to have all entries
nonnegative, summing to 1. Generically, that largest eigenvalue is unique, so that M^s converges to a rank-one transition probability matrix. But then all its rows must be equal, and moreover equal to that probability vector φ_∞ satisfying

φ_∞ = M′ φ_∞.
The vector φ_∞ is the Ergodic row vector; it corresponds to the limit of relation (36) as s → ∞. In words, φ_∞ is the long-run limit of the distribution of incomes across economies. 26 Table 3 shows that limiting distribution to be twin-peaked. Although in the observed sample, economies are almost uniformly distributed across cells - if anything, there is a peak in the middle-income classes - as time evolves, the distribution is predicted to thin out in the middle and cluster at rich and poor extremes. This polarization behavior is simply a formalization of the tendencies suggested in Figure 1.

Such analysis leads to further questions. How robust are these findings? The discretization used to construct the transition probability matrix is crude and ad hoc. Moving from a continuous income state space - Figure 1 - to a discrete one comprising cells - Table 3 - aliases much of the fine detail on the dynamics. Does changing the discretization alter the conclusions? To address these issues, we get rid of the discretization. In Appendix A we describe the mathematical reasoning needed to do this. The end result is a stochastic kernel - the appropriate generalization of a transition probability matrix - which can be used in place of the matrix M in the analysis. Quah (1996b, 1997) estimates such kernels. Figures 11a and 11b show the kernel for the transition dynamics across 105 countries over 1961 through 1988, where the transition horizon has been taken to be 15 years. The twin-peaked nature of the distribution dynamics is apparent now, without the aliasing effects due to discretization.

Bianchi (1997) and Jones (1997) eschew dealing with the stochastic kernel by considering the cross-section distribution Ft for each t in isolation. This ignores information on transition dynamics, but is still useful for getting information on the shape dynamics in F. Each Ft is estimated nonparametrically.
Bianchi (1997) goes further and applies to each Ft a bootstrap test for multimodality (twin-peakedness, after all, is just bimodality). Bianchi finds that in the early part of the sample (the early 1960s) the data show unimodality. However, by the end of the sample (the late 1980s) the data reject unimodality in favor of bimodality. Since Bianchi imposes less structure in his analysis - nowhere does he consider intradistribution dynamics, or in the language of Appendix A, the structure of T* - one guesses that his findings
26 Potential inconsistency across M matrices estimated over single- and multiple-period transitions is a well-known problem from the labor and sociology literature [e.g., Singer and Spilerman (1976)]. Quah (1993a) shows that, in the Heston-Summers cross-country application, the long-run properties of interest are, approximately, invariant to the transition period used in estimation.
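The discretization, transition-count, and ergodic-row calculations behind Table 3 can be sketched as follows (simulated incomes, not the Heston-Summers data; the resulting numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical panel of incomes relative to the world average.
J, T = 118, 25
log_y = rng.normal(size=(J, 1)) + 0.2 * rng.normal(size=(J, T)).cumsum(axis=1)
y = np.exp(log_y)

# Discretize with the Table 3 grid (0, 1/4, 1/2, 1, 2, infinity).
cells = np.searchsorted(np.array([0.25, 0.5, 1.0, 2.0]), y)

# Count one-year transitions and normalize rows, pooling all years
# (a close cousin of averaging the annual matrices M_t).
counts = np.zeros((5, 5))
for j in range(J):
    for t in range(T - 1):
        counts[cells[j, t], cells[j, t + 1]] += 1
M = counts / counts.sum(axis=1, keepdims=True)

# Ergodic distribution: the left eigenvector of M for eigenvalue 1,
# normalized to sum to 1 -- the common limit of the rows of M^s.
vals, vecs = np.linalg.eig(M.T)
v = np.real(vecs[:, np.argmax(np.real(vals))])
ergodic = v / v.sum()

print(np.allclose(M.sum(axis=1), 1.0))  # a proper transition matrix
print(np.round(ergodic, 2))             # illustrative long-run distribution
```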
Fig. 11a. Relative income dynamics across 105 countries, 1961-1988. For clarity, this stochastic kernel is one taken over a fifteen-year transition horizon. The kernel can be viewed as a continuum version of a transition probability matrix. Thus, high values along the diagonal indicate a tendency to remain. A line projected from a fixed value on the Period t axis traces out a probability density over the kernel, describing relative likelihoods of transiting to particular income values in Period t + 15. The emerging twin-peaks feature is evident here, now without the aliasing possibilities in discrete transition probability matrices.

are more robust to possible misspecification. Here again, however, twin-peakedness manifests.

We have taken care, in building up the theoretical discussion from the previous sections, to emphasize that those models give, among other things, ways to interpret these distribution dynamics. An observed pattern in the distribution dynamics of cross-country growth and convergence can be viewed as a reduced form - and one can ask if it matches the theoretical predictions of particular classes of models. We view in exactly this way the connection between the empirics just discussed and the distribution dynamics of models such as Lucas's (1993) described in Section 4 above. The work just described, while formalizing certain facts about the patterns of cross-country growth, does not yet provide an explanation for those patterns. Putting this differently, we need to ask what it is that explains these reduced forms in distribution dynamics. In light of our discussion above on the restrictions implied by cross-
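A rough version of such a kernel estimate can be sketched numerically (simulated incomes and an off-the-shelf Gaussian product kernel; Quah's actual estimator, data, and bandwidths differ):

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical relative incomes at t and t+15, built to pull apart
# into low and high clusters.
n = 105
x = rng.uniform(0.2, 3.0, size=n)                                   # period t
y = np.where(x < 1.0, 0.6 * x, 1.2 * x) + 0.1 * rng.normal(size=n)  # period t+15

def stochastic_kernel(x, y, grid, h=0.25):
    # Conditional density of y given x on a grid: a continuum analogue
    # of a transition probability matrix, each row integrating to one.
    gauss = lambda u: np.exp(-0.5 * (u / h) ** 2)
    step = grid[1] - grid[0]
    kern = np.empty((grid.size, grid.size))
    for i, x0 in enumerate(grid):
        w = gauss(x - x0)                       # weight observations near x0
        dens = (w[:, None] * gauss(y[:, None] - grid[None, :])).sum(axis=0)
        kern[i] = dens / (dens.sum() * step)    # normalize the row
    return kern

grid = np.linspace(0.1, 3.6, 36)
kern = stochastic_kernel(x, y, grid)
row_mass = kern.sum(axis=1) * (grid[1] - grid[0])
print(np.allclose(row_mass, 1.0))  # each conditional density integrates to 1
```

Plotting `kern` as a surface over the grid would give a picture in the spirit of Figure 11a: mass splitting away from the diagonal toward low and high incomes.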
Fig. 11b. Relative income dynamics across 105 countries, 1961-1988, contour plot. Contour plot levels at 0.2, 0.35, 0.5. This figure is just the view from above of Figure 11a, where contours have been drawn at the indicated levels and then projected onto the base of the graph.
country interactions, we conjecture that this "explaining distribution dynamics" needs to go beyond representative-economy analysis. Quah (1997) has addressed exactly this issue: in the spirit of our discussion above on theoretical models with cross-country interaction, Quah asks for the patterns of those interactions that can explain these reduced-form stochastic kernels. He finds that the twin-peaks dynamics can be explained by spatial spillovers and patterns of cross-country trade - who trades with whom, not just how open or closed an economy is.
6. Conclusion
We have provided an overview of recent empirical work on patterns of cross-country growth. We think the profession has learned a great deal about how to match those empirical patterns to theoretical models. But as researchers have learnt more, the
criteria for a successful confluence of theory and empirical reality have also continued to sharpen. In Section 2 we described some of the new stylized facts on growth - they differ from Kaldor's original set. It is this difference, together with the shift in priorities, that accounts for wishing to go beyond the original neoclassical growth model. Neither the newer empirical nor theoretical research has focused on preserving the stability of the "great ratios" or of particular factor prices. Instead, attention has shifted to a more basic set of questions: why do some countries grow faster than others? What makes some countries prosper while others languish? Sections 3 and 4 described a number of well-known theoretical growth models and presented their empirical implications. Although a considerable fraction of the empirical work extant has studied growth and convergence equations - whether in cross-section or panel data - we have tried to highlight first, that those equations might be problematic and second, that in any case they need not be the most striking and useful implications of the theory. Distribution-dynamics models make this particularly clear. Appropriate empirical analysis for all the different possibilities we have outlined above is an area that remains under study. Section 5 described a spectrum of empirical methods and findings related to studying patterns of cross-country growth. The range is extensive and, in our view, continues to grow as researchers understand more about both the facts surrounding growth across countries and the novel difficulties in carrying out empirical analyses in this research area.
At the same time, we feel that the new empirical growth literature remains in its infancy. While the literature has shown that the Solow model has substantial statistical power in explaining cross-country growth variation, sufficiently many problems exist with this work that the causal significance of the model is still far from clear. Further, the new stylized facts of growth, as embodied in nonlinearities and distributional dynamics, have yet to be integrated into full structural econometric analysis. While we find the new empirics of economic growth to be exciting, we also see that much remains to be done.

Appendix A. Proofs and additional discussions

This appendix collects together proofs and additional discussion omitted from the main presentation. It is intended to make this chapter self-contained, but without straying from the empirical focus in the principal sections.

A.1. Single capital good, exogenous technical progress
The classical Cass-Koopmans [Cass (1965), Koopmans (1965)] analysis produces dynamics (9b) from the optimization program (10). To see this, notice that given assumptions (1a) and (7a-c),

K̇(t) = Y(t) - c(t) N(t) - δ K(t)
can be rewritten as

k̇ = y - c - (δ + ν) k.
The original problem (10) can then be analyzed as

max_{ {c(t), k(t)}_{t≥0} } ∫_0^∞ U(c(t)) e^{-(ρ-ν)t} dt subject to k̇ = F(k, A) - c - (δ + ν) k.

The first-order conditions for this are:

ċ U″ = (ρ + δ - ∂F(k, A)/∂k) U′,
k̇ = F(k, A) - c - (δ + ν) k,
lim_{t→∞} k(t) e^{-(ρ-ν)t} = 0.
Rewrite these in growth rates and then in technology-normalized form; use the parameterized preferences U from program (10); and recall that F homogeneous of degree 1 means its first partials are all homogeneous of degree 0. This yields the dynamics (9b).

Turn now to convergence. In order to understand Figure 2, note that if we define $g(\tilde k) \stackrel{\mathrm{def}}{=} f(\tilde k)/\tilde k$, then on $\tilde k > 0$ the function $g$ is continuous and strictly decreasing:
$$\nabla g(\tilde k) = \nabla f(\tilde k)\,\tilde k^{-1} - f(\tilde k)\,\tilde k^{-2}
= \left[\tilde k\,\nabla f(\tilde k) - f(\tilde k)\right]\tilde k^{-2} < 0$$
by concavity and $\lim_{\tilde k \to 0} f(\tilde k) \ge 0$ from Equation (2). Moreover, $\lim_{\tilde k \to 0} g(\tilde k) \to \infty$ (directly if $\lim_{\tilde k \to 0} f(\tilde k) > 0$; by l'Hôpital's Rule and Equation (3) otherwise) and $\lim_{\tilde k \to \infty} g(\tilde k) = 0$ from Equation (8). These endpoints straddle $(\delta + \nu + \xi)\tau^{-1}$, and therefore the intersection $\tilde k^*$ exists, and $\tilde k$ satisfying $\tau^{-1}\dot{\tilde k}/\tilde k = g(\tilde k) - (\delta + \nu + \xi)\tau^{-1}$ is dynamically stable everywhere on $\tilde k > 0$.

To see that $(\hat k^*, \hat c^*)$, the zero of Equation (16), is well-defined, let $\hat k^*$ solve
$$\nabla f(\hat k^*) = \rho + \delta + \theta\xi \quad (\hat k^* > 0)$$
and notice that then
$$\hat c^* \stackrel{\mathrm{def}}{=} \left[\frac{f(\hat k^*)}{\hat k^*} - (\delta + \nu + \xi)\right] \hat k^* > 0$$
since
$$\frac{f(\hat k^*)}{\hat k^*} \ge \nabla f(\hat k^*) = \rho + \delta + \theta\xi > \delta + \nu + \xi$$
from the assumption $\rho > \nu + \xi$.
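As a concrete check of the existence argument, the intersection $\tilde k^*$ can be computed numerically. The sketch below assumes a Cobb-Douglas technology $f(\tilde k) = \tilde k^\alpha$, for which a closed form is available, and illustrative parameter values (here `tau` denotes the savings rate); these are assumptions for illustration, not values used in the chapter.

```python
# Steady state of the technology-normalized Solow dynamics
#   k'/k = tau*g(k) - (delta + nu + xi),   g(k) = f(k)/k,
# for Cobb-Douglas f(k) = k**alpha, where g is strictly decreasing,
# so the root is unique and bisection finds it.
alpha, tau, delta, nu, xi = 0.33, 0.2, 0.05, 0.01, 0.02

def g(k):
    return k**alpha / k  # average product of capital, strictly decreasing

def bisect(f, lo, hi, tol=1e-12):
    # f is decreasing with f(lo) > 0 > f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

kstar = bisect(lambda k: tau * g(k) - (delta + nu + xi), 1e-9, 1e9)

# closed form for Cobb-Douglas: k* = (tau/(delta+nu+xi))**(1/(1-alpha))
kstar_cf = (tau / (delta + nu + xi)) ** (1 / (1 - alpha))
print(kstar, kstar_cf)
```

Because $g$ crosses $(\delta + \nu + \xi)\tau^{-1}$ from above, $\dot{\tilde k} > 0$ below $\tilde k^*$ and $\dot{\tilde k} < 0$ above it, which is the global stability claimed in the text.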
To see how Equation (18) follows from Equation (17), notice that since $M$'s eigenvalues are distinct and different from zero, we can write its eigenvalue-eigenvector decomposition:
$$M = V_M \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} V_M^{-1},
\quad \lambda_1 > 0 > \lambda_2,$$
with $V_M$ full rank and having columns equal to $M$'s right eigenvectors. Then the unique stable solution of Equation (17) is
$$\begin{pmatrix} \log \hat k(t) - \log \hat k^* \\ \log \hat c(t) - \log \hat c^* \end{pmatrix}
= \begin{pmatrix} \log \hat k(0) - \log \hat k^* \\ \log \hat c(0) - \log \hat c^* \end{pmatrix} e^{\lambda_2 t},$$
with
$$V_M^{-1} \times \begin{pmatrix} \log \hat k(0) - \log \hat k^* \\ \log \hat c(0) - \log \hat c^* \end{pmatrix}$$
having 0 as its first entry. (This proportionality property can always be satisfied since $\hat c(0)$ is free to be determined while $\hat k(0)$ is given as an initial condition.) This timepath constitutes a solution to the differential Equation (17), for it implies
$$\frac{d}{dt}\begin{pmatrix} \log \hat k(t) - \log \hat k^* \\ \log \hat c(t) - \log \hat c^* \end{pmatrix}
= \lambda_2 \times \begin{pmatrix} \log \hat k(t) - \log \hat k^* \\ \log \hat c(t) - \log \hat c^* \end{pmatrix}
= V_M \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} V_M^{-1}
\begin{pmatrix} \log \hat k(t) - \log \hat k^* \\ \log \hat c(t) - \log \hat c^* \end{pmatrix}
= M \begin{pmatrix} \log \hat k(t) - \log \hat k^* \\ \log \hat c(t) - \log \hat c^* \end{pmatrix},$$
where the middle equality uses the fact that $V_M^{-1}$ applied to the deviation vector has zero first entry, so only the eigenvalue $\lambda_2$ acts. This solution is clearly stable. Since any other solution contains an exponential in $\lambda_1 > 0$, this solution is also the unique stable one.

A.2. Endogenous growth: asymptotically linear technology
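The saddle-path argument can be checked numerically: take any 2x2 matrix with one positive and one negative eigenvalue, start on the stable eigenvector (so that $V_M^{-1}$ applied to the initial deviation has zero weight on the unstable root), and confirm that $e^{Mt}$ then reduces to scalar decay at rate $\lambda_2$. The matrix below is an arbitrary illustration, not the chapter's $M$.

```python
import numpy as np

# A 2x2 example with one positive and one negative eigenvalue, mimicking
# the saddle-path structure of the linearized system  d/dt x(t) = M x(t).
M = np.array([[0.0, 1.0],
              [2.0, 1.0]])          # eigenvalues: 2 and -1
lams, V = np.linalg.eig(M)
i_neg = int(np.argmin(lams))        # index of the stable (negative) eigenvalue
lam2 = lams[i_neg]

# pick x(0) proportional to the stable eigenvector, so V^{-1} x(0)
# puts zero weight on the unstable eigenvalue
x0 = V[:, i_neg]
w = np.linalg.solve(V, x0)          # coordinates of x0 in the eigenbasis

# candidate solution: pure exponential decay at rate lam2
t = 1.7
x_t = x0 * np.exp(lam2 * t)

# exact solution via the matrix exponential e^{Mt} = V diag(e^{lam t}) V^{-1}
expMt = V @ np.diag(np.exp(lams * t)) @ np.linalg.inv(V)
print(np.allclose(expMt @ x0, x_t))   # True: the path stays on the stable manifold
```

Any component along the unstable eigenvector would grow like $e^{\lambda_1 t}$, which is why the initial condition restriction pins down the unique stable path.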
We need to verify that Equations (24) and (25) imply the existence of a balanced-growth equilibrium with positive $\lim_{t\to\infty} \dot{\hat c}/\hat c$ and $\lim_{t\to\infty} \hat c(t)/\hat k(t)$. Along the optimal path $\hat c$ is a function of $\hat k$. Consider conditions (9b) as $\hat k \to \infty$ (since we are interested in equilibria with $\hat c(t)/\hat k(t)$ bounded from below by a positive quantity). Then
$$\lim_{\hat k \to \infty} \frac{\dot{\hat c}}{\hat c}
= \left(\lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - [\rho + \delta + \theta\xi]\right)\theta^{-1},$$
using $\lim_{\hat k \to \infty} \nabla f(\hat k) = \lim_{\hat k \to \infty} f(\hat k)\hat k^{-1}$, and
$$\lim_{\hat k \to \infty} \frac{\dot{\hat k}}{\hat k}
= \lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\delta + \nu + \xi) - \lim_{\hat k \to \infty} \frac{\hat c}{\hat k}.$$
For these to be equal,
$$\lim_{\hat k \to \infty} \frac{\hat c}{\hat k}
= \lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\delta + \nu)
- \left[\lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\rho + \delta)\right]\theta^{-1}
> \lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\delta + \nu) - (\rho - \nu)
= \lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\rho + \delta) > 0.$$
The long-run growth rate is
$$\left(\lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - [\rho + \delta + \theta\xi]\right)\theta^{-1},$$
which is positive from Equation (25):
$$0 < \lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\rho + \delta + \theta\xi).$$
Finally, along such balanced-growth paths we have $\lim_{t\to\infty} \hat k(t)\, e^{-(\rho - \nu - \xi)t} = 0$ since
$$\left[\lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\rho + \delta)\right]\theta^{-1} < \rho - \nu
\;\Longrightarrow\;
\rho - \nu - \xi > \left[\lim_{\hat k \to \infty} f(\hat k)\hat k^{-1} - (\rho + \delta + \theta\xi)\right]\theta^{-1}.$$
If $\theta$ is too large [exceeding the upper bound in Equation (25)], then this model collapses to the traditional neoclassical model, where balanced-growth equilibrium has finite $(\hat c^*, \hat k^*)$ and neither preference nor technology parameters (apart from $\xi$) influences the long-run growth rate.
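A quick numerical illustration of the asymptotically linear case: with $f(\hat k) = A\hat k + B\hat k^\alpha$, the average product $f(\hat k)/\hat k$ tends to $A$ from above, and the long-run growth rate formula above can be evaluated directly. All parameter values here are illustrative assumptions, not calibrations from the chapter.

```python
# Asymptotically linear technology: f(k) = A*k + B*k**alpha satisfies
# f(k)/k -> A as k -> infinity, so the long-run growth rate of the
# normalized system is (A - (rho + delta + theta*xi)) / theta.
A, B, alpha = 0.20, 1.0, 0.5
rho, delta, theta, xi = 0.03, 0.05, 2.0, 0.01

def avg_product(k):
    return (A * k + B * k**alpha) / k

# average product approaches A from above as k grows
for k in (1e2, 1e4, 1e8):
    print(k, avg_product(k))

growth = (A - (rho + delta + theta * xi)) / theta
print(growth)   # positive, so a balanced-growth equilibrium with growth exists
```

With these numbers the limiting average product is 0.20 and the long-run growth rate is 0.05; shrinking $A$ below $\rho + \delta + \theta\xi$ would make the limit negative and return the model to the neoclassical case with a finite steady state.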
A.3. Distribution dynamics
Rigorous expositions of the mathematics underlying a formulation like Equation (35) can be found in Chung (1960), Doob (1953), Futia (1982), and Stokey and Lucas (1989) (with Prescott)²⁷. Since we are concerned here with real-valued incomes, the underlying state space is the pair $(\mathbb{R}, \mathfrak{R})$, i.e., the real line $\mathbb{R}$ together with the collection $\mathfrak{R}$ of its Borel sets. Let $B(\mathbb{R}, \mathfrak{R})$ denote the Banach space of bounded finitely-additive set functions on the measurable space $(\mathbb{R}, \mathfrak{R})$ endowed with the total variation norm: for $\varphi$ in $B(\mathbb{R}, \mathfrak{R})$,
$$|\varphi| = \sup \sum_{j=1}^{n} |\varphi(A_j)|,$$
where the supremum in this definition is taken over all finite measurable partitions $\{A_j : j = 1, 2, \ldots, n\}$ of $\mathbb{R}$.

Empirical distributions on $\mathbb{R}$ can be identified with probability measures on $(\mathbb{R}, \mathfrak{R})$; those are, in turn, just the countably-additive elements in $B(\mathbb{R}, \mathfrak{R})$ assigning value 1 to the entire space $\mathbb{R}$. Let $\mathfrak{B}$ denote the Borel $\sigma$-algebra generated by the open subsets (relative to the total variation norm topology) of $B(\mathbb{R}, \mathfrak{R})$. Then $(B, \mathfrak{B})$ is another measurable space. Note that $B$ includes more than just probability measures: an arbitrary element $\varphi$ in $B$ could be negative; $\varphi(\mathbb{R})$ need not be 1; and $\varphi$ need not be countably-additive. On the other hand, a collection of probability measures is never a linear space: that collection does not include a zero element; if $\varphi_1$ and $\varphi_2$ are probability measures, then $\varphi_1 - \varphi_2$ and $\varphi_1 + \varphi_2$ are not; neither is $x\varphi_1$ a probability measure for $x \in \mathbb{R}$ except at $x = 1$. By contrast, the set of bounded finitely-additive set functions certainly is a linear space, and as described above, is easily given a norm and then made Banach.

Why embed probability measures in a Banach space as we have done here? A first reason is so that distances can be defined between probability measures; it then makes sense to talk about two measures - and their associated distributions - getting closer to one another. A small step from there is to define open sets of probability measures, and thereby induce (Borel) $\sigma$-algebras on probability measures. Such $\sigma$-algebras then allow modelling random elements drawn from collections of probability measures, and thus from collections of distributions. The data of interest when modelling the dynamics of distributions are precisely random elements taking values that are probability measures.
27 Economic applications of these tools have also appeared in stochastic growth models [e.g., the examples in Stokey and Lucas (1989, ch. 16) (with Prescott)], income distribution dynamics [e.g., Loury (1981)], and elsewhere. Using these ideas for studying distribution dynamics, rather than analyzing a time-series stochastic process, say, exploits a duality in the mathematics. This is made explicit in Quah (1996a), a study dealing not with cross-country growth but business cycles instead.
In this scheme, then, each $\varphi_t$ associated with the observed cross-sectional income distribution $F_t$ is a measure in $(B, \mathfrak{B})$. If $(\Omega, \mathfrak{F}, \Pr)$ is the underlying probability space, then $\varphi_t$ is the value of an $\mathfrak{F}/\mathfrak{B}$-measurable map $\Phi_t : (\Omega, \mathfrak{F}) \to (B, \mathfrak{B})$. The sequence $\{\Phi_t : t \ge 0\}$ is then a $B$-valued stochastic process. To understand the structure of operators like $T^*$ it helps to use the following:

Stochastic Kernel Definition. Let $\varphi$ and $\psi$ be elements of $B$ that are probability measures on $(\mathbb{R}, \mathfrak{R})$. A stochastic kernel relating $\varphi$ and $\psi$ is a mapping $M_{(\varphi,\psi)} : \mathbb{R} \times \mathfrak{R} \to [0, 1]$ satisfying:
(i) $\forall y$ in $\mathbb{R}$, the restriction $M_{(\varphi,\psi)}(y, \cdot)$ is a probability measure;
(ii) $\forall A$ in $\mathfrak{R}$, the restriction $M_{(\varphi,\psi)}(\cdot, A)$ is $\mathfrak{R}$-measurable;
(iii) $\forall A$ in $\mathfrak{R}$, we have $\varphi(A) = \int M_{(\varphi,\psi)}(y, A)\, d\psi(y)$.

To see why this is useful, first consider (iii). At an initial point in time, for given $y$, there is some fraction $d\psi(y)$ of economies with incomes close to $y$. Count up all economies in that group who turn out to have their incomes subsequently fall in a given $\mathfrak{R}$-measurable subset $A \subseteq \mathbb{R}$. When normalized to be a fraction of the total number of economies, this count is precisely $M(y, A)$ (where the $(\varphi, \psi)$ subscript can now be deleted without loss of clarity). Fix $A$, weight the count $M(y, A)$ by $d\psi(y)$, and sum over all possible $y$, i.e., evaluate the integral $\int M(y, A)\, d\psi(y)$. This gives the fraction of economies that end up in state $A$ regardless of their initial income levels. If this equals $\varphi(A)$ for all measurable subsets $A$, then $\varphi$ must be the measure associated with the subsequent income distribution. In other words, the stochastic kernel $M$ is a complete description of transitions from state $y$ to any other portion of the underlying state space $\mathbb{R}$. Conditions (i) and (ii) simply guarantee that the interpretation of (iii) is valid. By (ii), the right-hand side of (iii) is well-defined as a Lebesgue integral. By (i), the right-hand side of (iii) is a weighted average of probability measures $M(y, \cdot)$, and thus is itself a probability measure.

How does this relate to the structure of $T^*$? Let $b(\mathbb{R}, \mathfrak{R})$ be the Banach space, under the sup norm, of bounded measurable functions on $(\mathbb{R}, \mathfrak{R})$. Fix a stochastic kernel $M$ and define the operator $T$ mapping $b(\mathbb{R}, \mathfrak{R})$ to itself by: $\forall f$ in $b(\mathbb{R}, \mathfrak{R})$, $\forall y$ in $\mathbb{R}$,
$$(Tf)(y) = \int f(x)\, M(y, dx).$$
Since $M(y, \cdot)$ is a probability measure, the image $Tf$ can be interpreted as a forward conditional expectation. For example, if all economies in the cross section begin with incomes $y$, and we take $f$ to be the identity map, then $(Tf)(y) = \int x\, M(y, dx)$ is next period's average income in the cross section, conditional on all economies having income $y$ in the current period.

Clearly, $T$ is a bounded linear operator. Denote the adjoint of $T$ by $T^*$. By the Riesz Representation Theorem, the dual space of $b(\mathbb{R}, \mathfrak{R})$ is just $B(\mathbb{R}, \mathfrak{R})$ (our original collection of bounded finitely-additive set functions on $\mathfrak{R}$); thus $T^*$ is a bounded linear
operator mapping $B(\mathbb{R}, \mathfrak{R})$ to itself. It turns out that $T^*$ is also exactly the mapping in (iii) of the Stochastic Kernel Definition, i.e., $\forall \psi$ probability measures in $B$, $\forall A$ in $\mathfrak{R}$:
$$(T^* \psi)(A) = \int M(y, A)\, d\psi(y).$$
(This is immediate from writing the left-hand side as
$$(T^* \psi)(A) = \int \mathbf{1}_A\, d(T^* \psi)(y)
= \int (T\mathbf{1}_A)(y)\, d\psi(y) \qquad \text{(adjoint)}$$
$$= \int \left[\int \mathbf{1}_A(x)\, M(y, dx)\right] d\psi(y) \qquad \text{(definition of } T\text{)}$$
$$= \int M(y, A)\, d\psi(y), \qquad \text{(calculation)}$$
with $\mathbf{1}_A$ the indicator function for $A$.)
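On a finite grid of income states, the abstract operators above take a concrete matrix form, which may help fix ideas: the kernel $M$ becomes a row-stochastic matrix $P$, the operator $T$ acts on functions as multiplication by $P$, and its adjoint $T^*$ acts on distributions as multiplication by $P^\top$. The grid size and transition probabilities below are illustrative assumptions, not estimates from the chapter's data.

```python
import numpy as np

# Discretized stochastic kernel: row y gives M(y, .), a probability
# distribution over next-period income states, so P is row-stochastic.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.2, 0.7]])

f = np.array([1.0, 2.0, 4.0])      # a bounded function of income
psi = np.array([0.5, 0.3, 0.2])    # current cross-sectional distribution

Tf = P @ f          # (Tf)(y): conditional expectation of f given state y
Tpsi = P.T @ psi    # (T* psi): next period's cross-sectional distribution

# adjoint identity: integrating Tf against psi equals integrating f
# against T* psi, the matrix version of the derivation above
print(np.isclose(psi @ Tf, Tpsi @ f))   # True
print(Tpsi)                             # total mass is preserved
```

Iterating `P.T @ psi` traces out the sequence of cross-sectional distributions, which is the discrete analogue of the $B$-valued process $\{\Phi_t\}$.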
Appendix B. Data
The data used in Section 2 are from version V6 of Summers and Heston (1991). Income is taken to be real GDP per capita in constant dollars using a Chain Index (at 1985 international prices) (series RGDPCH). Economies not having data in 1960 and 1989 were excluded. The remaining sample comprised 122 economies (integers in parentheses are the indexes in the Summers-Heston database):

1 (1) Algeria; 2 (2) Angola; 3 (3) Benin; 4 (4) Botswana; 5 (5) Burkina Faso; 6 (6) Burundi;
7 (7) Cameroon; 8 (8) Cape Verde Islands; 9 (9) Central African Republic; 10 (10) Chad; 11 (11) Comoros; 12 (12) Congo;
13 (14) Egypt; 14 (16) Gabon; 15 (17) Gambia; 16 (18) Ghana; 17 (19) Guinea; 18 (20) Guinea Bissau;
19 (21) Ivory Coast; 20 (22) Kenya; 21 (23) Lesotho; 22 (25) Madagascar; 23 (26) Malawi; 24 (27) Mali;
25 (28) Mauritania; 26 (29) Mauritius; 27 (30) Morocco; 28 (31) Mozambique; 29 (32) Namibia; 30 (33) Niger;
31 (34) Nigeria; 32 (35) Reunion; 33 (36) Rwanda; 34 (37) Senegal; 35 (38) Seychelles; 36 (39) Sierra Leone;
37 (40) Somalia; 38 (41) South Africa; 39 (43) Swaziland; 40 (44) Tanzania; 41 (45) Togo; 42 (46) Tunisia;
43 (47) Uganda; 44 (48) Zaire; 45 (49) Zambia; 46 (50) Zimbabwe; 47 (52) Barbados; 48 (54) Canada;
49 (55) Costa Rica; 50 (57) Dominican Republic; 51 (58) El Salvador; 52 (60) Guatemala; 53 (61) Haiti; 54 (62) Honduras;
55 (63) Jamaica; 56 (64) Mexico; 57 (65) Nicaragua; 58 (66) Panama; 59 (67) Puerto Rico; 60 (71) Trinidad and Tobago;
61 (72) USA; 62 (73) Argentina; 63 (74) Bolivia; 64 (75) Brazil; 65 (76) Chile; 66 (77) Colombia;
67 (78) Ecuador; 68 (79) Guyana; 69 (80) Paraguay; 70 (81) Peru; 71 (82) Suriname; 72 (83) Uruguay;
73 (84) Venezuela; 74 (86) Bangladesh; 75 (88) China; 76 (89) Hong Kong; 77 (90) India; 78 (91) Indonesia;
79 (92) Iran; 80 (94) Israel; 81 (95) Japan; 82 (96) Jordan; 83 (97) Korean Republic; 84 (100) Malaysia;
85 (102) Myanmar; 86 (105) Pakistan; 87 (106) Philippines; 88 (108) Saudi Arabia; 89 (109) Singapore; 90 (110) Sri Lanka;
91 (111) Syria; 92 (112) Taiwan;
93 (113) Thailand; 94 (116) Austria; 95 (117) Belgium; 96 (119) Cyprus; 97 (120) Czechoslovakia; 98 (121) Denmark;
99 (122) Finland; 100 (123) France; 101 (125) Germany, West; 102 (126) Greece; 103 (128) Iceland; 104 (129) Ireland;
105 (130) Italy; 106 (131) Luxembourg; 107 (132) Malta; 108 (133) Netherlands; 109 (134) Norway; 110 (136) Portugal;
111 (137) Romania; 112 (138) Spain; 113 (139) Sweden; 114 (140) Switzerland; 115 (141) Turkey; 116 (142) UK;
117 (143) USSR; 118 (144) Yugoslavia; 119 (145) Australia; 120 (146) Fiji; 121 (147) New Zealand; 122 (148) Papua New Guinea.

The clustering-classification results described in Section 5 derive from the following subsample split [taken from Durlauf and Johnson (1995), Table IV]:
(1) yj(1960) < $800: Burkina Faso, Burundi, Ethiopia, Malawi, Mali, Mauritania, Niger, Rwanda, Sierra Leone, Tanzania, Togo, Uganda;
(2) $800 <= yj(1960) <= $4850 and LRj(1960) < 46%: Algeria, Angola, Benin, Cameroon, Central African Republic, Chad, Congo (People's Republic), Egypt, Ghana, Ivory Coast, Kenya, Liberia, Morocco, Mozambique, Nigeria, Senegal, Somalia, Sudan, Tunisia, Zambia, Zimbabwe, Bangladesh, India, Jordan, Nepal, Pakistan, Syria, Turkey, Guatemala, Haiti, Honduras, Bolivia, Indonesia, Papua New Guinea;
(3) $800 <= yj(1960) <= $4850 and 46% <= LRj(1960): Madagascar, South Africa, Hong Kong, Israel, Japan, Korea, Malaysia, Philippines, Singapore, Sri Lanka, Thailand, Greece, Ireland, Portugal, Spain, Costa Rica, Dominican Republic, El Salvador, Jamaica, Mexico, Nicaragua, Panama, Brazil, Colombia, Ecuador, Paraguay, Peru;
(4) $4850 < yj(1960): Austria, Belgium, Denmark, Finland, France, Germany (Federal Republic), Italy, Netherlands, Norway, Sweden, Switzerland, UK, Canada, Trinidad and Tobago, USA, Argentina, Chile, Uruguay, Venezuela, Australia, New Zealand.

References
Aghion, P., and P. Howitt (1992), "A model of growth through creative destruction", Econometrica 60(2):323-351.
Alesina, A., and D. Rodrik (1994), "Distributive politics and economic growth", Quarterly Journal of Economics 109(2):465-490.
Alesina, A., S. Ozler, N. Roubini and P. Swagel (1996), "Political instability and economic growth", Journal of Economic Growth 1(2):189-211.
Azariadis, C., and A. Drazen (1990), "Threshold externalities in economic development", Quarterly Journal of Economics 105(2):501-526.
Barro, R.J. (1991), "Economic growth in a cross-section of countries", Quarterly Journal of Economics 106(2):407-443.
Barro, R.J. (1996), "Democracy and growth", Journal of Economic Growth 1(1):1-27.
Barro, R.J. (1997), Determinants of Economic Growth (MIT Press, Cambridge, MA).
Barro, R.J., and J.-W. Lee (1994), "Sources of economic growth", Carnegie-Rochester Conference Series on Public Policy 40:1-57.
Barro, R.J., and X. Sala-i-Martin (1991), "Convergence across states and regions", Brookings Papers on Economic Activity 1991(1):107-182.
Barro, R.J., and X. Sala-i-Martin (1992), "Convergence", Journal of Political Economy 100(2):223-251.
Barro, R.J., and X. Sala-i-Martin (1995), Economic Growth (McGraw-Hill, New York).
Baumol, W.J. (1986), "Productivity growth, convergence, and welfare", American Economic Review 76(5):1072-1085, December.
Ben-David, D. (1996), "Trade and convergence among countries", Journal of International Economics 40(3/4):279-298.
Bénabou, R. (1993), "Workings of a city: location, education, and production", Quarterly Journal of Economics 108(3):619-652.
Benhabib, J., and M.M. Spiegel (1997), "Cross-country growth regressions", Working Paper 97-20, C.V. Starr Center, New York University.
Bernard, A.B., and S.N. Durlauf (1995), "Convergence in international output", Journal of Applied Econometrics 10(2):97-108.
Bernard, A.B., and S.N. Durlauf (1996), "Interpreting tests of the convergence hypothesis", Journal of Econometrics 71(1-2):161-174.
Bianchi, M. (1997), "Testing for convergence: evidence from non-parametric multimodality tests", Journal of Applied Econometrics 12(4):393-409.
Binder, M., and M.H. Pesaran (1999), "Stochastic growth models and their econometric implications", Working Paper (University of Maryland, February).
Blomstrom, M., R.E. Lipsey and M. Zejan (1996), "Is fixed investment the key to economic growth?", Quarterly Journal of Economics 111(1):269-276.
Breiman, L., J.H. Friedman, R.A. Olshen and C.J. Stone (1984), Classification and Regression Trees (Chapman and Hall, New York).
Canova, F., and A. Marcet (1995), "The poor stay poor: non-convergence across countries and regions", Discussion Paper 1265, CEPR, November.
Caselli, F., G. Esquivel and F. Lefort (1996), "Reopening the convergence debate: a new look at cross-country growth empirics", Journal of Economic Growth 1(3):363-389.
Cass, D. (1965), "Optimal growth in an aggregate model of capital accumulation", Review of Economic Studies 32:233-240.
Chamberlain, G. (1984), "Panel data", in: Z. Griliches and M.D. Intriligator, eds., Handbook of Econometrics, vol. II (Elsevier North-Holland, Amsterdam) chapter 22, pages 1247-1318.
Cho, D. (1996), "An alternative interpretation of conditional convergence results", Journal of Money, Credit and Banking 28(4):669-681.
Chung, K.L. (1960), Markov Chains with Stationary Transition Probabilities (Springer, Berlin).
Coe, D.T., and E. Helpman (1995), "International R&D spillovers", European Economic Review 39(5):859-887.
Cohen, D. (1996), "Tests of the 'Convergence Hypothesis': some further results", Journal of Economic Growth 1(3):351-362.
DeLong, J.B. (1988), "Productivity growth, convergence, and welfare: a comment", American Economic Review 78(5):1138-55.
DeLong, J.B., and L.H. Summers (1993), "How strongly do developing economies benefit from equipment investment", Journal of Monetary Economics 32(3):395-415.
den Haan, W.J. (1995), "Convergence in stochastic growth models: the importance of understanding why income levels differ", Journal of Monetary Economics 35(1):65-82.
Denison, E.F. (1974), Accounting for United States Growth, 1929-1969 (The Brookings Institution, Washington, DC).
Desdoigts, A. (1994), "Changes in the world income distribution: a non-parametric approach to challenge the neoclassical convergence argument", Ph.D. Thesis (European University Institute, Florence, June).
Doob, J.L. (1953), Stochastic Processes (Wiley, New York).
Duffy, J., and C. Papageorgiou (1997), "The specification of the aggregate production function: a cross-country empirical investigation", Working paper, University of Pittsburgh.
Durlauf, S.N. (1989), "Output persistence, economic structure, and the choice of stabilization policy", Brookings Papers on Economic Activity 1989(2):69-116.
Durlauf, S.N. (1993), "Nonergodic economic growth", Review of Economic Studies 60(2):349-366.
Durlauf, S.N. (1996), "A theory of persistent income inequality", Journal of Economic Growth 1(1):75-93.
Durlauf, S.N., and P.A. Johnson (1995), "Multiple regimes and cross-country growth behavior", Journal of Applied Econometrics 10(4):365-384.
Easterly, W. (1993), "How much do distortions affect growth?", Journal of Monetary Economics 32(2):187-212.
Easterly, W., M. Kremer, L. Pritchett and L.H. Summers (1993), "Good policy or good luck? Country growth performance and temporary shocks", Journal of Monetary Economics 32(3):459-483.
Esteban, J.-M., and D. Ray (1994), "On the measurement of polarization", Econometrica 62(4):819-851.
Evans, P. (1998), "Using panel data to evaluate growth theories", International Economic Review 39(2):295-306.
Forbes, K. (1997), "Back to the basics: the positive effect of inequality on growth", Working paper (MIT, Cambridge, MA).
Frankel, J.A., and D. Romer (1996), "Trade and growth: an empirical investigation", Working Paper No. 5476 (NBER, June).
Frankel, J.A., D. Romer and T. Cyrus (1996), "Trade and growth in east Asian countries: cause and effect?", Working Paper No. 5732 (NBER, June).
Franses, P.H., and B. Hobijn (1995), "Convergence of living standards: an international analysis", Technical Report 9534/A, Econometric Institute, Erasmus University Rotterdam, September.
Friedman, M. (1992), "Do old fallacies ever die?", Journal of Economic Literature 30(4):2129-2132.
Futia, C. (1982), "Invariant distributions and the limiting behavior of Markovian economic models", Econometrica 50(1):377-408.
Galor, O. (1996), "Convergence? Inferences from theoretical models", Economic Journal 106(437):1056-1069.
Galor, O., and J. Zeira (1993), "Income distribution and macroeconomics", Review of Economic Studies 60(1):35-52.
Grier, K.B., and G. Tullock (1989), "An empirical analysis of cross-national economic growth, 1951-80", Journal of Monetary Economics 24(2):259-276.
Grossman, G.M., and E. Helpman (1991), Innovation and Growth in the Global Economy (MIT Press, Cambridge, MA).
Harrison, A. (1995), "Openness and growth: a time-series, cross-country analysis for developing countries", Working Paper No. 5221 (NBER).
Im, K.S., M.H. Pesaran and Y. Shin (1997), "Testing for unit roots in heterogeneous panels", Working Paper (University of Cambridge, December).
Islam, N. (1995), "Growth empirics: a panel data approach", Quarterly Journal of Economics 110(4):1127-1170.
Jones, C.I. (1995a), "R&D-based models of economic growth", Journal of Political Economy 103(3):759-784.
Jones, C.I. (1995b), "Time series tests of endogenous growth models", Quarterly Journal of Economics 110:495-525.
Jones, C.I. (1997), "On the evolution of the world income distribution", Journal of Economic Perspectives 11(3):19-36, Summer.
Jones, L.E., and R.E. Manuelli (1990), "A convex model of equilibrium growth: theory and implications", Journal of Political Economy 98(5, part 1):1008-1038.
Kaldor, N. (1963), "Capital accumulation and economic growth", in: F.A. Lutz and D.C. Hague, eds., Proceedings of a Conference Held by the International Economics Association (Macmillan, London).
Kelly, M. (1992), "On endogenous growth with productivity shocks", Journal of Monetary Economics 30(1):47-56.
King, R.G., and R. Levine (1993), "Finance and growth: Schumpeter might be right", Quarterly Journal of Economics 108(3):717-737.
Knowles, S., and P.D. Owen (1995), "Health capital and cross-country variation in income per capita in the Mankiw-Romer-Weil model", Economics Letters 48(1):99-106.
Kocherlakota, N.R., and K.-M. Yi (1995), "Can convergence regressions distinguish between exogenous and endogenous growth models?", Economics Letters 49:211-215.
Koopmans, T.C. (1965), "On the concept of optimal economic growth", in: The Econometric Approach to Development Planning (North-Holland, Amsterdam).
Kormendi, R.C., and P. Meguire (1985), "Macroeconomic determinants of growth: cross-country evidence", Journal of Monetary Economics 16(2):141-163.
Lamo, A.R. (1996), "Cross-section distribution dynamics", Ph.D. Thesis (London School of Economics, February).
Leamer, E.E. (1978), Specification Searches: Ad Hoc Inference from Non-Experimental Data (Wiley, New York).
Lee, K., M.H. Pesaran and R.P. Smith (1997), "Growth and convergence in a multi-country empirical stochastic Solow model", Journal of Applied Econometrics 12(4):357-392.
Lee, K., M.H. Pesaran and R.P. Smith (1998), "Growth empirics: a panel data approach - a comment", Quarterly Journal of Economics 113(452):319-323.
Leung, C., and D. Quah (1996), "Convergence, endogenous growth, and productivity disturbances", Journal of Monetary Economics 38(3):535-547.
Levin, A., and C. Lin (1992), "Unit root tests in panel data: asymptotic and finite-sample properties", Working paper, Economics Department, UCSD, San Diego.
Levine, R., and D. Renelt (1992), "A sensitivity analysis of cross-country growth regressions", American Economic Review 82(4):942-963.
Loury, G.C. (1981), "Intergenerational transfers and the distribution of earnings", Econometrica 49(4):843-867.
Lucas Jr, R.E. (1988), "On the mechanics of economic development", Journal of Monetary Economics 22(3):3-42.
Lucas Jr, R.E. (1993), "Making a miracle", Econometrica 61(2):251-271.
Maddala, G.S. (1988), Introduction to Econometrics (Macmillan, New York).
Maddison, A. (1989), The World Economy in the 20th Century (Development Centre of the OECD, Paris).
Mankiw, N.G., D. Romer and D.N. Weil (1992), "A contribution to the empirics of economic growth", Quarterly Journal of Economics 107(2):407-437.
Mauro, P. (1995), "Corruption and growth", Quarterly Journal of Economics 110(3):681-713.
Murphy, K.M., A. Shleifer and R.W. Vishny (1989), "Industrialization and the big push", Journal of Political Economy 97(4):1003-1026.
Murphy, K.M., A. Shleifer and R.W. Vishny (1991), "The allocation of talent: implications for growth", Quarterly Journal of Economics 106(2):503-530.
Nelson, C.R., and C.I. Plosser (1982), "Trends and random walks in macroeconomic time series", Journal of Monetary Economics 10:129-162.
Nerlove, M. (1996), "Growth rate convergence, fact or artifact?", Working paper, University of Maryland, June.
Perron, P. (1989), "The great crash, the oil price shock, and the unit root hypothesis", Econometrica 57(6):1361-1401.
Persson, T., and G. Tabellini (1994), "Is inequality harmful for growth?", American Economic Review 84(3):600-621.
Pritchett, L. (1997), "Divergence, big time", Journal of Economic Perspectives 11(3):3-17, Summer.
Quah, D. (1992), "International patterns of growth: I. Persistence in cross-country disparities", Working paper, LSE, London, October.
Quah, D. (1993a), "Empirical cross-section dynamics in economic growth", European Economic Review 37(2/3):426-434.
Quah, D. (1993b), "Galton's Fallacy and tests of the convergence hypothesis", The Scandinavian Journal of Economics 95(4):427-443.
Quah, D. (1994), "Exploiting cross section variation for unit root inference in dynamic data", Economics Letters 44(1):9-19.
Quah, D. (1996a), "Aggregate and regional disaggregate fluctuations", Empirical Economics 21(1):137-159.
Quah, D. (1996b), "Convergence empirics across economies with (some) capital mobility", Journal of Economic Growth 1(1):95-124.
Quah, D. (1996c), "Empirics for economic growth and convergence", European Economic Review 40(6):1353-1375.
Quah, D. (1997), "Empirics for growth and distribution: polarization, stratification, and convergence clubs", Journal of Economic Growth 2(1):27-59.
Ramey, G., and V.A. Ramey (1995), "Cross-country evidence on the link between volatility and growth", American Economic Review 85(5):1138-1151.
Rebelo, S.T. (1991), "Long-run policy analysis and long-run growth", Journal of Political Economy 99(3):500-521.
Romer, D. (1996), Advanced Macroeconomics (McGraw-Hill, New York).
Romer, P.M. (1986), "Increasing returns and long-run growth", Journal of Political Economy 94:1002-1037.
Romer, P.M. (1990), "Endogenous technological change", Journal of Political Economy 98(5, Part 2):S71-S102.
Romer, P.M. (1993), "Idea gaps and object gaps in economic development", Journal of Monetary Economics 32(3):543-574.
Romer, P.M. (1994), "The origins of endogenous growth", Journal of Economic Perspectives 8(1):3-22, Winter.
Sachs, J.D., and A.M. Warner (1995), "Economic reform and the process of global integration", Brookings Papers on Economic Activity 1995(1):1-95.
Sala-i-Martin, X. (1996), "Regional cohesion: evidence and theories of regional growth and convergence", European Economic Review 40(6):1325-1352.
Sala-i-Martin, X. (1997), "I just ran two million regressions", American Economic Association Papers and Proceedings 87(2):178-183.
Singer, B., and S. Spilerman (1976), "Some methodological issues in the analysis of longitudinal surveys", Annals of Economic and Social Measurement 5:447-474.
Solow, R.M. (1956), "A contribution to the theory of economic growth", Quarterly Journal of Economics 70(1):65-94.
Solow, R.M. (1957), "Technical change and the aggregate production function", Review of Economics and Statistics 39:312-320.
Stigler, S.M. (1986), The History of Statistics: The Measurement of Uncertainty before 1900 (Belknap Press of Harvard University, Cambridge, MA).
Stokey, N.L., and R.E. Lucas Jr (1989) (with Edward C. Prescott), Recursive Methods in Economic Dynamics (Harvard University Press, Cambridge, MA).
Summers, R., and A. Heston (1991), "The Penn World Table (Mark 5): an expanded set of international comparisons, 1950-1988", Quarterly Journal of Economics 106(2):327-368.
Swan, T.W. (1956), "Economic growth and capital accumulation", Economic Record 32:334-361.
Temple, J. (1996), "Essays on economic growth", Ph.D. Thesis (Oxford, October).
Chapter 5
NUMERICAL SOLUTION OF DYNAMIC ECONOMIC MODELS*

MANUEL S. SANTOS
Department of Economics, University of Minnesota
Contents

Abstract 312
Keywords 312
1. Introduction 313
2. The model and preliminary considerations 314
3. Bellman's equation and differentiability of the value function 319
   3.1. Bellman's equation and the contraction property of the dynamic programming algorithm 320
   3.2. Differentiability of the value function 321
4. A numerical dynamic programming algorithm 324
   4.1. Formulation of the numerical algorithm 324
   4.2. Existence of numerical solutions and derivation of error bounds 326
   4.3. Stability of the numerical algorithm 328
   4.4. Numerical maximization 329
   4.5. Numerical integration 332
5. Extensions of the basic algorithm 334
   5.1. Multigrid methods 334
   5.2. Policy iteration 336
   5.3. Modified policy iteration 338
   5.4. Polynomial interpolation and spline functions 340
      5.4.1. Polynomial interpolation 340
      5.4.2. Spline functions 344
6. Numerical approximations of the Euler equation 345
   6.1. Numerical methods for approximating the Euler equation 347
   6.2. Accuracy based upon the Euler equation residuals 352
7. Some numerical experiments 355
   7.1. A one-sector deterministic growth model with leisure 355
   7.2. A one-sector chaotic growth model 362
   7.3. A one-sector stochastic growth model with leisure 364
8. Quadratic approximations 368
9. Testing economic theories 375
10. A practical approach to computation 379
References 382

* The author is grateful to Jerry Bona, Antonio Ladron de Guevara, Ken Judd, John Rust, John Taylor and Jesus Vigo for helpful discussions on this topic. Special thanks are due to Adrian Peralta-Alva for his devoted computational assistance.

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
Abstract This chapter is concerned with numerical simulation of dynamic economic models. We focus on some basic algorithms and assess their accuracy and stability properties. This analysis is useful for an optimal implementation and testing of these procedures, as well as to evaluate their performance. Several examples are provided in order to illustrate the functioning and efficiency of these algorithms.
Keywords dynamic economic model, value function, policy function, Euler equation, numerical algorithm, numerical solution, approximation error JEL classification: C61, C63, C68
1. Introduction
This chapter offers an overview of some important methods for simulating solutions of dynamic economic models with the aid of high-performance computers. The recent surge of research in this area has been impelled by current developments in computer processing, algorithm design, software and data storage. This progress has fostered the numerical analysis of a wide range of problems to limits beyond what one could possibly foresee a few years ago. Since advances in our computational capabilities are likely to continue, it is expected that numerical simulation of economic models will be an attractive and expanding research field. A basic concern in science is understanding model predictions. Classical mathematical methods can help us derive basic qualitative properties of solutions such as existence, uniqueness and differentiability. But these methods usually fail to afford us the specific information necessary to test a model. There are some well-known economic examples in which optimal decisions have an analytical representation or closed-form solution (e.g., models with linear decision rules, or with constant elasticities for consumption and saving). In these cases, optimal policies are generally derived from algebraic manipulations or analytical techniques, and computational methods are usually not needed. Such a state of affairs, however, is not the most common situation. Most dynamic economic models feature essential nonlinearities stemming from intra- and intertemporal substitutions over non-constant margins. (These nonlinearities become more pronounced when uncertainty is present in the decision problem.) Digital computers are then the most plausible way to understand the behavior of a given model with a view toward its eventual testing. And one should expect that computational techniques will help to bridge the traditional gap between theoretical developments and empirical economic analysis.
Over the past decades, economic thinking has achieved levels of rigor and argumentation comparable to any other scientific discipline. The principles of axiomatization and mathematical logic are well rooted in economic theory. Also, empirical work has endorsed the underlying postulates of statistical analysis. If our main objective is to collect the fruits of this scientific endeavor, the same accepted practices should prevail for solving economic models. A framework for carrying out and reporting numerical experiments is presented in Bona and Santos (1997). Our purpose here is to focus on the accuracy and stability properties of some algorithms currently used by economists, and evaluate their performance in the context of some growth models. Accuracy seems to be a minimal requirement for judging a numerical simulation. And once we have a theory of the error involved in a numerical approximation, we are in a better position to devise more efficient algorithms, and to test and debug the computer code. Stability is concerned with possible variations that numerical errors and misspecifications of parameter values may inflict on the computed solution. Unstable algorithms may lead to odd outcomes, and may considerably lessen the power of a numerical simulation in testing a particular theory. Our study of accuracy and stability properties will be
M.S. Santos
complemented with some numerical experiments where we discuss further aspects of the implementation and performance of these algorithms. Computational tools have been applied to a wide variety of problems in economics and finance. Rather than providing a thorough review of these applications, the present work focuses on the analysis of some fundamental algorithms as applied to some simple growth models. Once the functioning of these algorithms is understood in this basic context, these same techniques should be of potential interest for solving other model economies, even though the assumptions of strong concavity and differentiability are fundamental to some of our results. There are several survey papers on this topic, which to a certain extent may be complementary to the present one. Kehoe (1991) reviews the literature on static general equilibrium models, along with certain numerical methods for dynamic economies. Our paper is in the spirit of Taylor and Uhlig (1990), who describe several computational methods and evaluate their performance. In our case, we shall concentrate on fewer methods, place more emphasis on their accuracy and stability properties, and carry out alternative numerical tests. Marcet (1994) reexamines the literature on the so-called parameterized expectations algorithm and presents a sample of its applications. This method computes the optimal law of motion from a direct approximation of the Euler equation. A variety of other methods that approximate the Euler equation are laid out in Judd (1992, 1996), who has advocated the use of polynomial approximations with certain desired orthogonality properties. Variants of both Marcet's and Judd's procedures along with some other related algorithms are reevaluated in Christiano and Fisher (1994). Finally, Rust (1996) considers several numerical methods for solving dynamic programming problems, and analyzes their complexity properties.
Complexity theory presents an integrated framework for assessing the efficiency of algorithms, although some of these asymptotic results may not be binding in simple applications.
2. The model and preliminary considerations
We begin our analysis with a stochastic, reduced-form model of economic growth in which the solution to the optimal planning problem may be interpreted as the equilibrium law of motion of a decentralized economy. Our framework is encompassed in the class of economies set out in Stokey and Lucas (1989). The reader is referred to this monograph for some basic definitions and technical points raised in the course of our discussion. The usefulness of this relatively abstract setting for carrying out numerical computations will be illustrated below with some simple examples. Let (K, \mathcal{K}) and (Z, \mathcal{Z}) be measurable spaces, and let (S, \mathcal{S}) = (K \times Z, \mathcal{K} \times \mathcal{Z}) be the product space. The set K contains all possible values for the endogenous state variable, Z is the set of possible values for the exogenous shock, and S is the set of state values for the system. The evolution of the random component \{z_t\}_{t \ge 0} is governed by a stochastic law defined by a function \varphi : Z \times E \to Z
Ch. 5: Numerical Solution of Dynamic Economic Models
and an i.i.d. process \{\varepsilon_t\}_{t \ge 1}, with \varepsilon_t \in E, where z_t = \varphi(z_{t-1}, \varepsilon_t). It follows that the mapping \varphi induces a time-invariant transition function Q on (Z, \mathcal{Z}). Moreover, for each z_0 in Z one can define a probability measure \mu^t(z_0, \cdot) on every t-fold product space (Z^t, \mathcal{Z}^t) = (Z \times Z \times \cdots \times Z, \mathcal{Z} \times \mathcal{Z} \times \cdots \times \mathcal{Z}) comprising all partial histories of the form z^t = (z_1, \ldots, z_t). The physical constraints of the economy are summarized by a given feasible technology set, \Omega \subset K \times K \times Z, which is the graph of a continuous correspondence, \Gamma : K \times Z \to K. The intertemporal objective is characterized by a one-period return function v on \Omega and a discount factor, 0 < \beta < 1. The optimization problem is to find a sequence of (measurable) functions, \{\pi_t\}_{t \ge 0}, \pi_t : Z^{t-1} \to K, as a solution to
W(k_0, z_0) = \max_{\{\pi_t\}_{t \ge 0}} \sum_{t=0}^{\infty} \beta^t \int_{Z^t} v(\pi_t, \pi_{t+1}, z_t) \, \mu^t(z_0, dz^t)

subject to (\pi_t, \pi_{t+1}, z_t) \in \Omega,
z_{t+1} = \varphi(z_t, \varepsilon_{t+1}), \quad (k_0, z_0) fixed, \quad \pi_0 = k_0,
0 < \beta < 1, and t = 0, 1, 2, \ldots.   (2.1)
As presently illustrated, this stylized framework comprises several basic macroeconomic models, and it is appropriate for computational purposes.

Example 2.1. A one-sector deterministic growth model with leisure: Consider the following simple dynamic optimization problem:

\max_{\{c_t, l_t, i_t\}_{t \ge 0}} \sum_{t=0}^{\infty} \beta^t [\lambda \log c_t + (1 - \lambda) \log l_t]

subject to c_t + i_t \le A k_t^{\alpha} (1 - l_t)^{1-\alpha},
k_{t+1} = i_t + (1 - \delta) k_t,
k_t, c_t \ge 0, \quad 0 \le l_t \le 1,
0 < \beta < 1, \quad 0 < \lambda \le 1, \quad 0 < \alpha < 1, \quad 0 \le \delta \le 1, \quad A > 0,
k_0 given, \quad t = 0, 1, 2, \ldots.   (2.2)
At each time t = 0, 1, 2, \ldots, the economy produces a single good that can be either consumed, c_t, or invested, i_t, for the accumulation of physical capital, k_t. There is available a normalized unit of time that can be spent in leisure activities, l_t, or in the production sector, (1 - l_t). The one-period utility function is logarithmic, and 0 < \lambda \le 1 is the weight of consumption. The one-period production function is Cobb-Douglas, A k^{\alpha} (1 - l)^{1-\alpha}, characterized by parameters A and \alpha, where A > 0 is a normalizing parameter (although for given measurement units it may represent the technology level), and 0 < \alpha < 1 is interpreted as the income share of physical capital. Physical capital, k_t, is subject to a depreciation factor, 0 \le \delta \le 1.
We observe the following asymmetry between variables c_t, l_t and k_t. At the beginning of each time t \ge 0, variable k_t has already been determined, and from such a value one can figure out the set of future technologically feasible plans \{c_\tau, l_\tau, k_\tau\}_{\tau \ge t} for the economy. More specifically, k_t is a state variable. On the other hand, c_t and l_t can only affect the value of the current one-period utility, for given k_t and k_{t+1}. That is, c_t and l_t are control variables. Let R_+ be the set of non-negative numbers. For all given feasible pairs (k, k'), define

v(k, k') = \max_{c, l} \; \lambda \log c + (1 - \lambda) \log l

subject to A k^{\alpha} (1 - l)^{1-\alpha} + (1 - \delta) k - c - k' \ge 0. Let c(k, k') and l(k, k') represent the optimal choices for this one-period optimization problem. For every k \ge 0, let

\Gamma(k) = \{k' \in R_+ : A k^{\alpha} (1 - l(k, k'))^{1-\alpha} + (1 - \delta) k - k' \ge 0\},
and let \Omega denote the graph of the correspondence \Gamma. Let us now write the following optimization problem:

\max_{\{k_t\}_{t \ge 0}} \sum_{t=0}^{\infty} \beta^t v(k_t, k_{t+1})

subject to (k_t, k_{t+1}) \in \Omega,
k_0 fixed, \quad 0 < \beta < 1, and t = 0, 1, 2, \ldots.   (2.3)
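In practice, the one-period return v(k, k') entering (2.3) must itself be evaluated numerically at each point: after substituting the resource constraint, it is a concave one-dimensional maximization in l. The following Python sketch solves it by golden-section search; the parameter values, and the function name `static_choices`, are illustrative assumptions of ours, not taken from the chapter.

```python
import math

# Static subproblem of Example 2.1:
#   v(k, k') = max_{c, l} lam*log(c) + (1-lam)*log(l)
#   s.t.     c = A*k^alpha*(1-l)^(1-alpha) + (1-delta)*k - k' >= 0.
# The objective is concave in l, so golden-section search finds the optimum.
def static_choices(k, k_next, A=1.0, alpha=0.33, delta=0.1, lam=0.7, tol=1e-10):
    def cons(l):                               # consumption implied by leisure l
        return A * k ** alpha * (1.0 - l) ** (1.0 - alpha) + (1.0 - delta) * k - k_next
    def obj(l):
        c = cons(l)
        if c <= 0.0 or l <= 0.0:
            return -math.inf                   # infeasible choices
        return lam * math.log(c) + (1.0 - lam) * math.log(l)
    invphi = (math.sqrt(5.0) - 1.0) / 2.0      # golden-section search on (0, 1)
    a, b = 1e-12, 1.0 - 1e-12
    c1, c2 = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if obj(c1) < obj(c2):
            a, c1 = c1, c2
            c2 = a + invphi * (b - a)
        else:
            b, c2 = c2, c1
            c1 = b - invphi * (b - a)
    l_star = 0.5 * (a + b)
    return obj(l_star), cons(l_star), l_star   # (v, c, l) at the optimum
```

The first component is the value v(k, k'); the correspondence \Gamma(k) then consists of those k' for which the constraint can be satisfied with c \ge 0.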
Then, both problems (2.2) and (2.3) contain the same set of optimal solutions. In other words, if \{k_t\}_{t \ge 0} is an optimal solution to problem (2.3), then \{c(k_t, k_{t+1}), l(k_t, k_{t+1}), i(k_t, k_{t+1})\}_{t \ge 0} is an optimal solution to problem (2.2), and vice versa.

Example 2.2. A one-sector stochastic growth model with leisure: Uncertainty is now introduced in this simple setting. The maximization problem is written as follows:

\max_{\{c_t, l_t, i_t\}_{t \ge 0}} E_0 \sum_{t=0}^{\infty} \beta^t [\lambda \log c_t + (1 - \lambda) \log l_t]

subject to c_t + i_t = z_t A k_t^{\alpha} (1 - l_t)^{1-\alpha},
k_{t+1} = i_t + (1 - \delta) k_t,
\log z_{t+1} = \rho \log z_t + \varepsilon_{t+1},
0 < \beta < 1, \quad 0 < \lambda \le 1, \quad A > 0, \quad 0 < \alpha < 1, \quad 0 \le \delta \le 1, \quad 0 \le \rho < 1,
k_t, c_t \ge 0, \quad 0 \le l_t \le 1,
k_0 and z_0 given, \quad t = 0, 1, 2, \ldots.   (2.4)

Here, random variable z_t \ge 0 enters the production function, and follows the law of motion z_t = \varphi(z_{t-1}, \varepsilon_t), for \varphi(z, \varepsilon) = z^{\rho} e^{\varepsilon}. Analogously to the previous optimization
problem, one can derive optimal controls c(k, k', z) and l(k, k', z), and define the one-period return function v(k, k', z) and the technological correspondence \Gamma(k, z), so that the optimization problem can be expressed as in (2.1). The state variables are k and z, and the control variables are c and l. The objective function is now the discounted sum of expected utilities, and E_0 is the expectations operator at time 0. There are certain variations of these simple examples worth considering. For instance, the one-period utility function may be of the form

\frac{(c^{\lambda} l^{1-\lambda})^{1-\sigma} - 1}{1 - \sigma} \quad \text{for} \quad \sigma > 0.
Since this is a monotonic transformation of the previous one-period utility, the optimal control functions, c(k, k', z) and l(k, k', z), remain unchanged. Likewise, the derivation of the one-period return function v(k, k', z), and the technological correspondence \Gamma(k, z), proceed accordingly. Another interesting extension is to allow for an (endogenous) control variable that can influence the law of motion of the stochastic variable z.
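The exogenous law of motion of Example 2.2 is a first-order autoregression in logs, so z stays strictly positive. A small Python sketch simulates it; the Gaussian shock scale sigma is a hypothetical choice of ours, since the chapter only requires an i.i.d. process.

```python
import math
import random

# Simulate the shock process of Example 2.2: log z' = rho*log z + eps, i.e.
# z' = phi(z, eps) = z**rho * exp(eps), with eps ~ N(0, sigma^2) (sigma assumed).
def simulate_z(z0, rho=0.9, sigma=0.05, T=1000, seed=42):
    rng = random.Random(seed)
    z, path = z0, []
    for _ in range(T):
        eps = rng.gauss(0.0, sigma)
        z = z ** rho * math.exp(eps)           # z remains strictly positive
        path.append(z)
    return path
```

With 0 \le \rho < 1 the log of the process is a stationary AR(1) with mean zero and variance \sigma^2/(1 - \rho^2), so simulated paths fluctuate around z = 1.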
Example 2.3. A one-sector stochastic growth model with endogenous uncertainty: Consider the following dynamic optimization problem:

\max_{\{c_t, l_t, e_t, u_t, i_t\}_{t \ge 0}} E_0 \sum_{t=0}^{\infty} \beta^t [\lambda \log c_t + (1 - \lambda) \log l_t]

subject to c_t + i_t = z_t A k_t^{\alpha} e_t^{1-\alpha},
k_{t+1} = i_t + (1 - \delta) k_t,
\log z_{t+1} = \rho \log z_t + (1 - u_t) \varepsilon_{t+1},
0 < \beta < 1, \quad 0 < \lambda < 1, \quad A > 0, \quad 0 < \alpha < 1, \quad 0 \le \delta \le 1, \quad 0 \le \rho < 1,
k_t, c_t \ge 0, \quad l_t \ge 0, \quad u_t \ge 0, \quad e_t \ge 0, \quad 0 \le l_t + u_t + e_t \le 1,
k_0 and z_0 given, \quad t = 0, 1, 2, \ldots.   (2.5)
Here, the available unit of time can be allocated over three different margins: leisure activities, l_t, working in the production sector, e_t, or time spent in reducing the variance of the random shock, u_t. One can find these kinds of trade-offs in models with durable goods, human capital accumulation, or health care [cf. Becker (1993), Ladrón de Guevara et al. (1997), Rust (1987)]. The reduced version of this optimization problem involves a return function, v(k_t, k_{t+1}, z_t, z_{t+1}), and a technological correspondence, \Gamma(k_t, z_t). Hence, variable z_{t+1} appears as an additional argument of function v.
Strictly speaking, model (2.5) cannot be embodied in the framework of our baseline optimization problem (2.1), since our optimization problem specifies an exogenous law of motion for variable z. Most of our results below, however, can be extended to a more general setting in which the stochastic process may be endogenously influenced by some control variables. As a matter of fact, our optimization problem (2.1) contemplates endogenous and exogenous state variables, and for convenience we have assumed an exogenous process for the stochastic law of motion. One conclusion to be drawn from these examples is that the above modelization (2.1) is very useful for computational purposes. In most economic models, there are state and control variables. However, control variables can generally be calculated in terms of the state variables. Hence, recasting the model in reduced form - in terms of the state variables - will generally lower its computational burden. For convenience of exposition, we now impose certain assumptions on the reduced form model (2.1). As one can infer from the above examples, these postulates may be derived from more primitive formulations, and hold in most standard macroeconomic applications.
Assumption 1: The set K \times Z \subset R^l \times R^m is compact, and for each fixed z the set \Omega_z = \{(k, k') \mid (k, k', z) \in \Omega\} is convex.
Assumption 2: The mapping v : \Omega \to R is continuous, and on the interior of its domain it is differentiable of class C^2 with bounded first- and second-order derivatives. Moreover, for all fixed z there exists some constant \eta > 0 such that v(k, k', z) + \tfrac{1}{2} \eta \|k'\|^2 is concave as a function of (k, k').
Assumption 3: For each interior point (k_0, z_0) in K \times Z every optimal realization \{k_t, z_t\}_{t \ge 0} has the property that (k_t, k_{t+1}, z_t) \in int(\Omega) for each t \ge 0.
Assumption 4: The function \varphi : Z \times E \to Z is continuous, and for each fixed \varepsilon the mapping \varphi(\cdot, \varepsilon) is C^2, and the derivative functions D_1 \varphi(z, \varepsilon) and D_{11} \varphi(z, \varepsilon) are bounded and jointly continuous over all points (z, \varepsilon) in Z \times E. Also, there are non-negative constants 0 \le \rho < 1/\beta^{1/2} and C \ge 0 such that the first- and second-order partial derivatives of z_t with respect to z_0, \partial z_t / \partial z_0 and \partial^2 z_t / \partial z_0^2, have the property that

\left\| \frac{\partial z_t}{\partial z_0} \right\| \le C \rho^t \quad \text{and} \quad \left\| \frac{\partial^2 z_t}{\partial z_0^2} \right\| \le C \rho^t \quad \text{for each } t > 0.
These assumptions are entirely standard [cf. Stokey and Lucas (1989), Ch. 9]. In Assumption 1, the compactness of the domain seems a natural restriction for the purposes of computing numerical solutions. And it should be observed that the convexity requirement on the technological correspondence precludes the existence of
increasing returns to scale. In Assumption 2, the norm \|k'\| is the usual Euclidean norm. Hence, such hypothesis imposes a strong form of concavity on the second component of the utility function, and over compact sets the condition is weaker than the conventional postulate that the Hessian matrix D^2 v_z(k, k') be negative definite^1 over all points (k, k', z) in \Omega. The interiority condition asserted in Assumption 3 is necessary to establish subsequently the smoothness of optimal paths. As in the above examples, this assumption is satisfied in most models that feature Inada-type conditions [cf. Brock and Mirman (1972)], or when the domain is restricted to a certain absorbing region containing the asymptotic dynamics of the system. Regarding Assumption 4, note that z_t = \varphi(\varphi(\cdots(\varphi(z_0, \varepsilon_1), \varepsilon_2) \cdots), \varepsilon_t). The assumption then requires the existence of some constants 0 \le \rho < 1/\beta^{1/2} and C \ge 0 such that the matrix norms
\left\| \frac{\partial}{\partial z_0} \varphi(\varphi(\cdots(\varphi(z_0, \varepsilon_1), \varepsilon_2) \cdots), \varepsilon_t) \right\| \le C \rho^t, \qquad \left\| \frac{\partial^2}{\partial z_0^2} \varphi(\varphi(\cdots(\varphi(z_0, \varepsilon_1), \varepsilon_2) \cdots), \varepsilon_t) \right\| \le C \rho^t

for every realization (\varepsilon_1, \ldots, \varepsilon_t) and t > 0. Thus, this condition limits the asymptotic growth of the stochastic process, and it is satisfied in most models with bounded first- and second-order moments^2. For the purposes of our analysis such condition would not be needed in situations where z is an endogenous variable [e.g., see the contrast between derivatives (3.3) and (3.4) below].
3. Bellman's equation and differentiability of the value function

In this section, we introduce the methodology of dynamic programming, and recall some qualitative properties of optimal solutions. These methods have proven very
^1 For functions v over a set \Omega \subset R^l \times R^l \times R^m, Dv(k_0, k_1, z) will denote the (first-order) derivative of v evaluated at an interior point (k_0, k_1, z), and D_i v(k_0, k_1, z), i = 1, 2, 3, will denote the partial derivative of v with respect to the ith component variable. Similarly, D_{ij} v(k_0, k_1, z) will denote a second-order partial derivative of v with respect to the ith and jth components. Sometimes we will use the notation D^2 v_z(k_0, k_1) to represent the Hessian matrix of the mapping v(\cdot, \cdot, z), where z is held fixed.
^2 An additional, technical condition usually imposed in this stochastic framework is that the transition function Q on (Z, \mathcal{Z}) is weakly continuous or that it satisfies the Feller property [cf. Stokey and Lucas (1989), Ch. 9]. For the particular stochastic law of motion considered here, we can show that this technical assumption always holds. Thus, let f be a bounded continuous function on K \times Z. Let \psi(k, z) = \int f(k, z') Q(z, dz'), and assume that the sequence \{(k_n, z_n)\}_{n \ge 0} converges to (k, z). Now, in order to show that \psi(k, z) is a continuous function, write the integral \int f(k, z') Q(z, dz') in the form \int f(k, \varphi(z, \varepsilon)) \mu(d\varepsilon), and then apply the Lebesgue dominated convergence theorem. This alternative way of writing the integral will also be advantageous to compute certain derivatives, since in such a case it will not be necessary to differentiate over the transition function Q [cf. expressions (3.4) and (3.5) below].
useful for the construction of reliable numerical procedures. Hence, the design and implementation of efficient numerical algorithms provides an added stimulus for the study of analytical tools and for a reexamination of qualitative properties of optimal solutions such as existence, uniqueness, stability, and differentiability.
3.1. Bellman's equation and the contraction property of the dynamic programming algorithm

Under the above assumptions the value function W(k_0, z_0), given in Equation (2.1), is well defined and jointly continuous [cf. Stokey and Lucas (1989)]. Moreover, for each fixed z_0 the mapping W(\cdot, z_0) is concave, and satisfies the Bellman equation
W(k_0, z_0) = \max_{k_1} \; v(k_0, k_1, z_0) + \beta \int_Z W(k_1, z_1) \, Q(z_0, dz_1)   (3.1)
subject to (k_0, k_1, z_0) \in \Omega. The optimal value W(k_0, z_0) is attained at a unique point given by the policy function k_1 = g(k_0, z_0). The policy function is also continuous. Furthermore, an iterated substitution on the right-hand side of (3.1) shows that the set of optimal contingency plans \{k_t, z_t\}_{t \ge 0} is a Markov process determined by the optimal policy k_{t+1} = g(k_t, z_t). From a computational point of view, Bellman's equation is a functional equation, and function W is a fixed point of this equation^3. The most common way to compute this fixed point is by the following recursive algorithm known as the method of successive approximations or value-function iteration. Let \mathcal{W} be the space of bounded, continuous functions V on the state space K \times Z endowed with the norm \|V\| = \max_{(k,z) \in K \times Z} |V(k, z)|. Define the (non-linear) operator T : \mathcal{W} \to \mathcal{W} as
T(V)(k_0, z_0) = \max_{k_1} \; v(k_0, k_1, z_0) + \beta \int_Z V(k_1, z_1) \, Q(z_0, dz_1) \quad \text{subject to } (k_0, k_1, z_0) \in \Omega   (3.2)
for V \in \mathcal{W}. This functional mapping is known as the dynamic programming operator. Blackwell (1965) and Denardo (1967) first observed that T is a contractive application on \mathcal{W} with modulus 0 < \beta < 1, i.e. \|T V_0 - T V_1\| \le \beta \|V_0 - V_1\| for V_0, V_1 \in \mathcal{W}. It follows that W is the unique fixed point under T, and \|W - V_n\| \le \beta^n \|W - V_0\| for V_n = T^n V_0, where T^n denotes the n-times composition of T. Some simple examples can be constructed in which the constant \beta is a tight upper bound. Hence, the method of successive approximations usually yields a linear rate of convergence to the fixed point W.
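The contraction property is easy to verify numerically on a finite-state version of the operator (3.2), where the integral against Q(z_0, \cdot) becomes a sum over transition probabilities. The Python sketch below builds a randomly generated finite dynamic program (all names and the test instance are ours, for illustration only) whose Bellman operator satisfies Blackwell's inequality exactly:

```python
import random

# Finite-state dynamic program: T(V)(s) = max_a [u(s,a) + beta*sum_s' P(s'|s,a)V(s')].
def make_operator(n_states, n_actions, beta, seed=0):
    rng = random.Random(seed)
    u = [[rng.uniform(0.0, 1.0) for _ in range(n_actions)] for _ in range(n_states)]
    P = []                                    # P[s][a] is a probability vector
    for s in range(n_states):
        rows = []
        for a in range(n_actions):
            w = [rng.uniform(0.0, 1.0) for _ in range(n_states)]
            tot = sum(w)
            rows.append([x / tot for x in w])
        P.append(rows)
    def T(V):
        # one application of the (discrete) dynamic programming operator
        return [max(u[s][a] + beta * sum(p * v for p, v in zip(P[s][a], V))
                    for a in range(n_actions))
                for s in range(n_states)]
    return T

def sup_norm(V0, V1):
    return max(abs(a - b) for a, b in zip(V0, V1))
```

For any pair V_0, V_1 one finds sup_norm(T(V_0), T(V_1)) \le \beta sup_norm(V_0, V_1), and iterating T from an arbitrary starting guess converges linearly to the fixed point, as stated in the text.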
^3 It should be pointed out that certain regularity conditions are required for the existence of a fixed point for Bellman's equation. In stochastic problems, a main technical difficulty is the measurability of the value function [cf. Stokey and Lucas (1989), p. 253].
3.2. Differentiability of the value function

Under general assumptions, continuity of the value and policy functions can be established following the method of successive approximations. That is, by the theorem of the maximum [e.g. Stokey and Lucas (1989), p. 62], continuity is preserved at each iteration step, V_n = T(V_{n-1}) for n \ge 1, and the contraction property of T implies that convergence is uniform. This method, however, has not proved so effective for studying differentiability properties. For this purpose various tools of analysis have been introduced. Invoking some basic properties of the subgradient of a concave function [cf. Benveniste and Scheinkman (1979)], one can show that at every interior point (k_0, z_0) function W is differentiable with respect to k, and the partial derivative is given by the familiar envelope condition
D_1 W(k_0, z_0) = D_1 v(k_0, k_1, z_0),   (3.3)
where k_1 is the optimal point. Moreover, a straightforward calculation allows us to check that

D_2 W(k_0, z_0) = \sum_{t=0}^{\infty} \beta^t \int_{Z^t} D_3 v(k_t, k_{t+1}, z_t) \cdot \frac{\partial z_t}{\partial z_0} \, \mu^t(z_0, dz^t),   (3.4)
where again the right-hand side of Equation (3.4) is evaluated at the optimal contingency plan \{k_t, z_t\}_{t \ge 0}. Hence, W is a C^1 mapping. Therefore, at interior points the optimal policy k_1 = g(k_0, z_0) can be characterized by the first-order condition
D_2 v(k_0, k_1, z_0) + \beta \int_Z D_1 W(k_1, z_1) \, Q(z_0, dz_1) = 0.   (3.5)
Under the foregoing assumptions, it follows from a mere application of the implicit function theorem to (3.5) that if W is differentiable of class C^2 then g is C^1. The converse result is not so straightforward, as Equation (3.4) involves an infinite series of partial derivatives. In a discrete-time stochastic growth framework, the second-order differentiability of the value function has been studied by Blume, Easley and O'Hara (1982), Gallego (1993) and Santos and Vigo (1995). The assumptions imposed by Blume, Easley and O'Hara are fairly strong for the present context. Their method of proof builds on the idea that uncertainty should smooth out the optimization problem, since the integral operation in problem (2.1) can act as a convolution on function W. To guarantee this nice smoothness property of the integral, these authors postulate certain separability and invertibility conditions on the stochastic process as well as smooth density functions. These assumptions would be fairly stringent for our purposes. Indeed, in the present model random variables may be discrete, or may not have a smooth density, thus allowing for event trees or other commonly studied modelizations with uncertainty.
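The envelope condition (3.3) can be checked in closed form in a textbook deterministic special case (log utility, Cobb-Douglas production, full depreciation, no leisure), where v(k, k') = \log(A k^{\alpha} - k'), the policy is g(k) = \alpha \beta A k^{\alpha}, and W(k) = B + [\alpha/(1 - \alpha\beta)] \log k. This example is standard in the literature but is not one of the chapter's numbered examples; the short Python sketch below (with illustrative parameter values) verifies D_1 W(k) = D_1 v(k, g(k)) numerically:

```python
# Closed-form check of the envelope condition in the log/Cobb-Douglas model
# with full depreciation: u(c) = log c, f(k) = A k^alpha, delta = 1.
alpha, beta, A = 0.3, 0.95, 1.0
C = alpha / (1.0 - alpha * beta)              # W(k) = B + C log k

def g(k):
    return alpha * beta * A * k ** alpha      # closed-form policy k' = g(k)

def D1v(k, k1):
    # D_1 v(k, k') for v(k, k') = log(A k^alpha - k')
    return alpha * A * k ** (alpha - 1.0) / (A * k ** alpha - k1)

def D1W(k):
    return C / k                              # derivative of the value function

for k in (0.2, 0.5, 1.0, 2.0):
    assert abs(D1W(k) - D1v(k, g(k))) < 1e-12
```

Both sides reduce analytically to \alpha / [k(1 - \alpha\beta)], so the agreement is exact up to floating-point rounding.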
Gallego (1993) extends the method of analysis presented by Santos (1991, 1994). This approach was originally developed in deterministic growth models, but it turns out to extend in a simple manner to our stochastic framework. The idea here is to construct a "candidate" mapping for the second-order derivative of the value function that has nice continuity properties. This mapping is derived as the solution to a quadratic approximation of the original optimization problem. This associated problem is much easier to manipulate, and under the above assumptions, one simply checks that it defines the second-order derivative of W. Moreover, the quadratic optimization problem provides bounds for such derivative in terms of primitive data of the model, and it should be a benchmark for assessing the suitability of alternative quadratic approximations encountered in the literature. Both Blume et al. (1982) and Gallego (1993) focus on the second-order differentiability of W on K. Santos and Vigo (1995) extend this analysis to allow for differentiability on Z. These are their main results:

Theorem 3.1. Under Assumptions (1)-(4) the value function W is a C^2 mapping on int(K \times Z).
Corollary 3.2. Under Assumptions (1)-(4) the policy function g is a C^1 mapping on int(K \times Z).
The derivatives of these functions can be computed from primitive elements of the model in the following recursive way [cf. op. cit.]. First, D_{11} W(k_0, z_0) is determined by the associated quadratic optimization problem

x_0 \cdot D_{11} W(k_0, z_0) \cdot x_0 = \max_{\{x_t\}_{t \ge 0}} \sum_{t=0}^{\infty} \beta^t \int_{Z^t} \left[ (x_t, x_{t+1}) \cdot D^2 v_{z_t}(k_t, k_{t+1}) \cdot (x_t, x_{t+1}) \right] \mu^t(z_0, dz^t)

subject to z_{t+1} = \varphi(z_t, \varepsilon_{t+1}), \quad t = 0, 1, 2, \ldots.   (3.6)
Here the maximization proceeds over all measurable functions \{x_t\}_{t \ge 0}, x_t : Z^{t-1} \to R^l for t \ge 1, with x_0 and z_0 fixed; the one-period objective D^2 v_{z_t}(k_t, k_{t+1}) is the Hessian matrix of the mapping v(\cdot, \cdot, z_t) for given z_t, and \{k_t, z_t\}_{t \ge 0} is the optimal contingency plan to problem (2.1) for the initial value (k_0, z_0). From this characterization, one readily proves that the optimal plan \{\hat{x}_t\}_{t \ge 0} to maximization problem (3.6) determines the derivative of the policy function with respect to k_0. That is, \hat{x}_t = D_1 g^t(k_0, z_0) \cdot x_0 for t \ge 1, where D_1 g^t(k_0, z_0) denotes the derivative of the function g(g(\cdots g(k_0, z_0) \cdots), z_{t-2}), z_{t-1}) with respect to k_0 for every possible realization (z_1, z_2, \ldots, z_{t-1}). Given that (x_0, 0, 0, 0, \ldots) is a feasible solution to maximization problem (3.6), we obtain that
\|D_{11} W(k_0, z_0)\| \le \|D_{11} v(k_0, g(k_0, z_0), z_0)\| \le L, \quad \text{where} \quad L = \sup_{(k_0, k_1, z_0) \in \Omega} \|D^2 v(k_0, k_1, z_0)\|.   (3.7)
Moreover, if \{\hat{x}_t\}_{t \ge 0} is an optimal solution to problem (3.6) with \|\hat{x}_0\| = 1, then in view of the asserted concavity of v(k, k', z) we must have

\eta \sum_{t=0}^{\infty} \beta^t \int_{Z^t} \hat{x}_{t+1} \cdot \hat{x}_{t+1} \, \mu^t(z_0, dz^t) \le L,   (3.8)

where \hat{x}_{t+1} \cdot \hat{x}_{t+1} denotes the inner product multiplication at every possible value of the random vector \hat{x}_{t+1} and \eta > 0 is the lower estimate of the curvature of the return function, as specified in Assumption 2. Observe that condition (3.8) places an upper bound on the exponential growth factor of the derivative \hat{x}_t = D_1 g^t(k_0, z_0) \cdot x_0. Indeed, (3.8) implies that

\sum_{t=0}^{\infty} \beta^t \int_{Z^t} \|D_1 g^{t+1}(k_0, z_0)\|^2 \, \mu^t(z_0, dz^t) \le \frac{L}{\eta}.   (3.9)
On the other hand, differentiation of D_2 W(k_0, z_0) in (3.4) with respect to k_0 yields

D_{12} W(k_0, z_0)^T = D_{21} W(k_0, z_0) = \sum_{t=0}^{\infty} \beta^t \int_{Z^t} \left( \frac{\partial z_t}{\partial z_0} \right)^T \left[ D_{31} v(k_t, k_{t+1}, z_t) \cdot D_1 g^t(k_0, z_0) + D_{32} v(k_t, k_{t+1}, z_t) \cdot D_1 g^{t+1}(k_0, z_0) \right] \mu^t(z_0, dz^t).

Now, taking matrix norms we have

\|D_{12} W(k_0, z_0)^T\| = \|D_{21} W(k_0, z_0)\| \le \sum_{t=0}^{\infty} \beta^t \int_{Z^t} \left\| \frac{\partial z_t}{\partial z_0} \right\| \left( \|D_{31} v(k_t, k_{t+1}, z_t)\| \, \|D_1 g^t(k_0, z_0)\| + \|D_{32} v(k_t, k_{t+1}, z_t)\| \, \|D_1 g^{t+1}(k_0, z_0)\| \right) \mu^t(z_0, dz^t)
\le \left[ 1 + 2C (L/\eta)^{1/2} (1 - \beta \rho^2)^{-1/2} \right] G,   (3.10)

where the last inequality follows from Assumption 4 and condition (3.9) for G = \sup \{\|D_{31} v\|, \|D_{32} v\|\}. Similar upper bounds can be obtained for \|D_2 g\| and \|D_{22} W\|. Note from these computations that in general variable z will have a more pronounced effect on the second-order derivatives of the value function. Much less is known about higher-order differentiability properties of the value and policy functions. In simple deterministic models, regular examples can be constructed where the value function is C^2 but fails to be C^3 [cf. Araujo (1991) and Santos
(1994)]. This lack of differentiability is closely related to certain non-independence (or resonance) conditions of the eigenvalues at a given unstable steady state. (For a pair of eigenvalues, \lambda_1 and \lambda_2, the resonance conditions amount to \lambda_1^n = \lambda_2 for some integer n > 0.) Such conditions are nevertheless pathological (i.e., not robust to small perturbations of the model), although further sources of non-differentiability may arise in multivariate models with complex behavior, or in stochastic frameworks. In conclusion, there exists a reasonably general theory for first- and second-order derivatives of the value function, and such derivatives may be bounded from primitive elements of the model. Relatively little is known about existence and characterization of higher-order derivatives^4.
4. A numerical dynamic programming algorithm

Our purpose now is to formulate a numerical algorithm for computing functions W and g, and to study its accuracy properties. In the actual implementation of the algorithm, there are additional errors stemming from the numerical maximization and integration of these functions, and these errors may unfold over the iterative process. We shall present here a stability analysis of these approximations, along with a discussion of available methods for numerical maximization and integration. In subsequent sections we shall be concerned with possible extensions of this basic algorithm to speed up computations, and with its numerical implementation.
4.1. Formulation of the numerical algorithm

Our numerical algorithm is based on a discretized version of the method of successive approximations. The basic idea underlying this analysis is to restrict the set of functions \mathcal{W} to a finite-dimensional domain so that the algorithm can be coded in a finite number of computer instructions. Our functions will be defined on the whole domain K \times Z via piecewise affine interpolation of their values over a finite set of regular points. In the following section, we shall consider alternative interpolation schemes. In the sequel we assume that the state space S = K \times Z is a polyhedron. This does not entail much loss of generality for most economic applications. Let \{S^j\} be a finite
^4 A thoughtful method to compute higher-order derivatives of the value and policy functions is laid out in Judd (1996) and Gaspar and Judd (1997). This method is only operative for steady-state solutions, assuming the existence of such derivatives.
family of simplices^5 such that \cup_j S^j = S and int(S^i) \cap int(S^j) = \emptyset for every pair S^i, S^j. Define the grid level or mesh size as h = \max_j \operatorname{diam}(S^j). Let (k^j, z^j) be a generic vertex of the triangulation. Consider then the space of piecewise affine functions
\mathcal{W}^h = \{V^h : S \to R \mid V^h \text{ is bounded and continuous, and } DV^h \text{ is constant in } int(S^j) \text{ for each } S^j\}.
It follows that every function V^h in \mathcal{W}^h is determined by the nodal values V^h(k^j, z^j), for all vertex points (k^j, z^j). These nodal values uniquely define a piecewise affine function over the whole domain S, compatible with the given triangulation \{S^j\}. Also, \mathcal{W}^h is a closed subspace of \mathcal{W}, equipped with the norm

\|V^h\| = \max_{(k,z) \in K \times Z} |V^h(k, z)| \quad \text{for} \quad V^h \in \mathcal{W}^h.
For a given triangulation \{S^j\} with mesh size h, we now consider the following algorithm for value-function iteration:
(i) Initial step: Select an accuracy level TOL_W and an initial guess W_0^h.
(ii) Value function evaluation: Let

W_{n+1}^h(k^j, z^j) = \max_{k_1} \; v(k^j, k_1, z^j) + \beta \int_Z W_n^h(k_1, z_1) \, Q(z^j, dz_1)   (4.1)

subject to (k^j, k_1, z^j) \in \Omega for each vertex point (k^j, z^j).
(iii) End of iteration: If \|W_{n+1}^h - W_n^h\| \le TOL_W, stop; else, increment n by 1, and return to step (ii).
For present purposes, it should be understood that the maximization and integration operations in algorithm (4.1) are performed exactly. As a matter of fact, in the actual implementation of the algorithm, one solves for the integral on the right-hand side of Equation (4.1), and then maximizes over k_1. These functional evaluations define a mapping T^h : \mathcal{W} \to \mathcal{W}^h such that W_{n+1}^h = T^h(W_n^h). The mapping T^h is a discretized version of the dynamic programming algorithm.
^5 A simplex S^j in R^l is the set of all convex combinations of l + 1 given points [cf. Rockafellar (1970)]. Thus, a simplex in R^1 is an interval, a simplex in R^2 is a triangle, and a simplex in R^3 is a tetrahedron.
Accuracy level TOL_W is selected so as to provide a good approximation W_{n+1}^h to the fixed point W^h of the following functional equation

W^h(k_0^j, z_0^j) = \max_{k_1} \; v(k_0^j, k_1, z_0^j) + \beta \int_Z W^h(k_1, z_1) \, Q(z_0^j, dz_1)   (4.2)

subject to (k_0^j, k_1, z_0^j) \in \Omega for each vertex point (k_0^j, z_0^j). This is the corresponding discretized version of Bellman's Equation (3.1), which is required to hold at the finite set of vertex points. Using the above iterative scheme, we now establish existence of the fixed point W^h and provide estimates for the approximation error involved in this discretization.
4.2. Existence of numerical solutions and derivation of error bounds

A basic topic in numerical analysis is the existence and accuracy properties of numerical solutions. Relatively little is known, however, about the existence and accuracy properties of most numerical algorithms used by economists. Li (1993) and Marcet and Marshall (1994) are notable exceptions to this trend. Li provides a derivation of error bounds for a simple monetary model. Marcet and Marshall prove certain asymptotic properties of a parameterized expectations algorithm. Although Marcet and Marshall's results are comforting, they cannot be generally invoked in numerical computations since they simply assert that the approximation scheme converges to the exact solution as the computational cost goes to infinity. We now review some accuracy properties of our numerical model^6.

Lemma 4.1. Under Assumptions (1)-(4), Equation (4.2) has a unique solution W^h in \mathcal{W}^h.

Proof: The proof is the standard one. One immediately sees that T^h is a contraction mapping with modulus 0 < \beta < 1. By a well-known fixed-point theorem, Equation (4.2) has a unique fixed point W^h in \mathcal{W}^h. \square

It is worth noticing that the contraction property of the operator T^h implies that the sequence \{W_n^h\}_{n \ge 0} generated by Equation (4.1) converges linearly to the fixed point W^h.

Lemma 4.2. Let W be the value function in Equation (3.1). Let \gamma = \|D^2 W\|. Then, under Assumptions (1)-(4) it must hold that

\|TW - T^h W\| \le \frac{\gamma}{2} h^2.
^6 The present analysis is taken from Santos and Vigo (1998). One can find related discretization procedures and derivations of error bounds for the computation of the value function [e.g. Bertsekas (1975), Chow and Tsitsiklis (1991), Falcone (1987), Fox (1973), Kitanidis and Foufoula-Georgiou (1987), and Whitt (1978, 1979)]. None of these latter papers, however, derives a quadratic order of convergence for the computed value function in an infinite-horizon optimization setting (as h goes to zero).
Observe that at each nodal value T^h(W)(k^j, z^j) = W(k^j, z^j); moreover, T^hW is piecewise affine, and the function W is C². Then the result follows from a standard application of Taylor's theorem [see Santos and Vigo (1998) for further details].

Theorem 4.3. Let W be the fixed point of Equation (3.1) and W^h be the fixed point of Equation (4.2). Then, under Assumptions (1)–(4) it must hold that

‖W − W^h‖ ≤ [γ/(2(1 − β))] h².

Proof: Let T and T^h be as defined previously from Equations (3.2) and (4.1), respectively. We must then have

‖W − W^h‖ = ‖TW − T^hW^h‖ ≤ ‖TW − T^hW‖ + ‖T^hW − T^hW^h‖ ≤ ‖TW − T^hW‖ + β ‖W − W^h‖,

where use is made in these computations of the triangle inequality and of Lemma 4.1. Therefore,

‖W − W^h‖ ≤ [1/(1 − β)] ‖TW − T^hW‖.

Theorem 4.3 is now a direct consequence of Lemma 4.2. □
Inequalities (3.7)–(3.10) provide upper estimates for the parameter γ = ‖D²W‖. These estimates can be useful to bound the observed error in specific applications. It should be noted that the constant γ/(2(1 − β)) becomes unbounded for β = 1. This singularity seems to be related to the fact that if the approximation error in a single period may be up to (γ/2)h² (Lemma 4.2), then the cumulative error over the entire infinite horizon may extend up to [γ/(2(1 − β))] h². An immediate consequence of Theorem 4.3 is the following useful result:

Corollary 4.4. Let g(k^j, z^j) be the optimal policy for the original value function W at a vertex point (k^j, z^j), and let g^h(k^j, z^j) be the optimal policy for the approximate value function W^h at the vertex point (k^j, z^j). Then,

‖g(k^j, z^j) − g^h(k^j, z^j)‖ ≤ (2γ/[η(1 − β)])^{1/2} h   for all (k^j, z^j).
It follows from this analysis that for the computed value function the order of convergence is quadratic in h, whereas for the computed policy function the order of convergence is linear. As is to be expected, a key assumption for the linear convergence of the optimal policy is a positive lower estimate η on the concavity of the return function, as postulated in Assumption 2.
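The linear convergence of successive approximations noted after Lemma 4.1 is easy to observe numerically. The sketch below iterates a discretized Bellman operator on a one-dimensional capital grid; the log-utility return, the ad hoc technology, and all parameter values are our own illustrative choices, not taken from the chapter:

```python
# Value function iteration on a small discretized problem, to observe the
# linear convergence of successive approximations: successive sup-norm
# differences contract by at least the factor beta.

import math

beta = 0.9
grid = [0.2 + 0.1 * i for i in range(30)]        # capital grid, mesh h = 0.1

def u(k, k_next):
    c = k ** 0.3 + 0.5 * k - k_next              # consumption under an ad hoc technology
    return math.log(c) if c > 0 else -1e10       # log utility; infeasible choices penalized

def bellman(W):
    """One application of the discretized Bellman operator T^h."""
    return [max(u(k, kp) + beta * w for kp, w in zip(grid, W)) for k in grid]

W = [0.0] * len(grid)
errors = []
for _ in range(60):
    W_new = bellman(W)
    errors.append(max(abs(a - b) for a, b in zip(W_new, W)))
    W = W_new

# ratios of successive differences settle down near beta, and never exceed it
ratios = [errors[t + 1] / errors[t] for t in range(40, 50)]
print(max(ratios))
```

The contraction property guarantees each ratio is at most β; after the transient, the observed ratios sit very close to β itself.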
4.3. Stability of the numerical algorithm

Our results indicate that the numerical model has some desired accuracy properties. In particular applications, however, these computations are subject to round-off errors and to inaccuracies due to numerical maximizations and integrations. These numerical errors may get propagated in unexpected ways over the iteration process. A numerical algorithm is said to be unstable if small computational errors produce large variations in the solution. Unstable algorithms should be avoided. We now show that our algorithm has some desired stability properties.

For a given triangulation {S^j} with mesh size h, define a functional operator T_ε^h: 𝒲 → 𝒲^h such that |T_ε^h(V)(k^j, z^j) − T^h(V)(k^j, z^j)| ≤ ε, for all V in 𝒲 and all vertex points (k^j, z^j), for fixed ε > 0. The interpretation is that under the operator T_ε^h the computational error is not greater than ε for all functions in the space 𝒲. If such a distance is preserved for all nodal values V(k^j, z^j), it follows that the constructed piecewise linear interpolations of {T_ε^h(V)(k^j, z^j)} and {T^h(V)(k^j, z^j)} are also within an ε-distance over the whole domain S. In accordance with this interpretation, we postulate the following regularity conditions on the functional operator T_ε^h:
(i) Monotonicity: T_ε^h V′ ≥ T_ε^h V for V′ ≥ V;
(ii) Discounting: T_ε^h(V + a) ≤ T_ε^h V + βa for every V and every constant function a.
These properties are preserved under most standard numerical maximization and integration procedures. That is, condition (i) will hold if the maximization and integration schemes preserve monotonicity, and condition (ii) entails that adding a constant function to the maximizations and integrations results in an equivalent change in the corresponding solution to both operations. (Of course, inequality (ii) may be problematic in rounding off very small numbers.) Under conditions (i) and (ii), the functional operator T_ε^h is a contraction mapping on 𝒲 with modulus β (cf. Lemma 4.1).
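The force of the contraction property here is that per-iteration errors of size ε cannot compound beyond ε/(1 − β). A scalar toy operator (an illustrative choice of ours, not from the chapter) makes the bound concrete:

```python
# A scalar contraction perturbed by a per-iteration error of size eps:
# the perturbed fixed point stays within eps/(1 - beta) of the exact one.

beta, eps = 0.8, 1e-3

def T(x):
    return beta * x + 1.0          # exact operator; fixed point 1/(1 - beta) = 5

def T_eps(x):
    return beta * x + 1.0 + eps    # worst-case perturbed operator

x = y = 0.0
for _ in range(500):
    x, y = T(x), T_eps(y)

bound = eps / (1 - beta)
print(abs(x - y), bound)           # observed gap vs. the bound eps/(1 - beta)
```

Because the perturbation here is the worst case (a constant shift of ε in every iteration), the observed gap attains the bound almost exactly.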
Our next result bounds the distance between the fixed points of the operators T^h and T_ε^h:

Theorem 4.5. Let W be the fixed point of T, let W^h be the fixed point of T^h, and let W_ε^h be the fixed point of T_ε^h. Assume that T_ε^h satisfies conditions (i) and (ii). Then, under Assumptions (1)–(4), we have:

(1) ‖W^h − W_ε^h‖ ≤ ε/(1 − β);

(2) ‖W − W_ε^h‖ ≤ [γ/(2(1 − β))] h² + ε/(1 − β).
Part (1) can be established from the method of proof of Theorem 4.3. Part (2) is a consequence of the triangle inequality, using part (1) and Theorem 4.3. Again, the intuition for part (1) is that if ε > 0 is the possible computational error in each iteration, then ε/(1 − β) should be an upper bound for the cumulative error over the entire infinite horizon. Indeed, this estimate can be obtained from the recursion |(cumulative error)_t| ≤ |(current error)_t| + β |(cumulative error)_{t−1}|.

The bounds established in Theorem 4.5 can be useful for an efficient design of the computer code. Thus, it would not be optimal to operate with a very fine grid
of points in cases where the approximation errors from maximization and integration are relatively large. An efficient implementation of the algorithm requires that these errors be balanced; that is, h²/ε ≈ 2/γ. This benchmark value may be adjusted depending upon the computational cost of reducing each of these errors, and on further operational features of the algorithm discussed in subsequent sections. Of course, the computational error ε stems from the maximization and integration approximations, and these individual errors should also be balanced. Routines for maximization and integration usually provide good estimates of these approximations. Moreover, it should be realized that if the maximization is carried out over a discrete set of grid points {k^j} with mesh size h, then the additional error involved in this approximation is of order h², since the first-order derivative at a maximizer is equal to zero. On the other hand, if the integration is performed over a discretized space, as an approximation for an underlying continuous-valued random variable, then the additional error will depend on the integration scheme and the differentiability properties of the value function. [Observe that in general the variable z has a more pronounced effect on the derivatives of the value function; cf. Equations (3.4) and (3.10).] Thus, one should make reasonable choices concerning discretizations of the state spaces K and Z so that the involved approximation errors are of the same magnitude. It seems that commonly found computations which restrict the uncertainty space Z to very few states as compared to the space of capitals K [e.g., Christiano (1990) and Danthine, Donaldson and Mehra (1989)] may obtain more accurate approximations for the same computational cost by considering more balanced grids over the whole space K × Z.

4.4. Numerical maximization

Methods for numerical optimization are covered in Gill, Murray and Wright (1981), Kahaner, Moler and Nash (1989), and Press et al. (1992). In addition, there are several professionally designed subroutines for the solution of various specialized problems, as well as the two all-purpose libraries NAG and IMSL. Here, we shall offer a brief introduction to these methods along with a discussion of some specific issues concerning the analysis of error and the implementation of these numerical procedures.

As in classical mathematical analysis, a distinction is made in numerical maximization between smooth and non-smooth problems, global and local optimization, constrained and unconstrained solutions, and maximization in one and several dimensions. These properties not only dictate the nature of the techniques employed to tackle the optimization, but they also bear on practical computational considerations. Thus, a method for numerical maximization of non-smooth functions is generally inefficient for smooth problems. Likewise, a numerical method for maximization in several variables will not generally be suitable for one-dimensional problems.

Algorithms for numerical maximization of smooth functions usually search over the whole domain of feasible solutions, as opposed to restricting the search to a grid of prespecified points. The software usually indicates the tolerance level or interval of
uncertainty. Since at an interior maximum point the derivative of the function is equal to zero, it should be understood that if the computed maximum is at an ε-distance from the true one, then the approximation error in the functional value is of order ε². If the search for the maximum is restricted to a grid of prespecified points with mesh size h [e.g., Christiano (1990)], then the additional error incurred in this approximation would be of order h².

Although the search for a maximum over a grid of points is a very simple strategy associated with a relatively small approximation error, methods based on functional evaluations are not generally computationally efficient for smooth problems. There are faster, more powerful algorithms that take advantage of the information provided by the derivatives of the functions. In our case, our mappings are piecewise linear, and hence the gradient is defined at almost every point. Moreover, the curvature of these mappings can be bounded in a certain sense, since the first-order derivatives are determined by the envelope theorem [cf. equality (3.3)], and an upper estimate of the rate of change of these derivatives is the maximum value of the second-order derivative of the return function [cf. inequality (3.7) above, or Montrucchio (1987) for more general arguments]. Hence, our functions possess some smoothness properties and have bounded curvature. Numerical maximization methods based on simple functional evaluations would then generally be inefficient. Of course, smoothness can be obtained under higher-order interpolations; and if these interpolations preserve concavity, the optimization problem may be more tractable.

Another important consideration is the dimensionality of the optimization problem. Numerical methods for maximization in several dimensions are generally based on one-line maximizations. The choice of these directions is key for defining the search method. This choice, however, is trivial for unidimensional problems.
For univariate maximization, a typical initial step is to bracket the maximum. That is, in some very simple way one selects a triplet of points a < b < c such that f(b) is greater than both f(a) and f(c). (This choice guarantees the existence of a maximum inside the chosen interval; moreover, if the objective is concave, such a solution is the desired global maximum.) Once the maximum has been bracketed, the searching process should exploit regularity properties of the function, using either smooth approximations or functional evaluations. There are also hybrid procedures that combine both types of information. This is the strategy followed by Brent's method [cf. Press et al. (1992), Sect. 10.2], and it seems well suited to our case, where the functions have kinks at the vertex points but at the same time preserve certain smoothness properties. The method proceeds along the following steps:

(a) Smooth approximation. The routine selects three given function values and constructs a parabolic approximation. Then it quickly determines the maximum of the parabola. If this maximum point falls within certain limits (i.e., the maximum is cooperative), then this value is added in the next iteration for a subsequent parabolic approximation, until a desired level of accuracy is achieved. Convergence to the true maximum is of order 1.324.
(b) Golden-section search. If the parabolic approximation is not a reasonable one, then the routine switches to a more reliable but slower method called golden-section search. This procedure is analogous to the familiar method of bisection for finding the zeroes of a univariate function. Given at each stage a bracketing triplet of points, golden-section search tries a point that is a fraction 0.38197 into the larger of the two intervals from the central point of the triplet. With the four points now available, the procedure then selects a new bracketing triplet. Following this iterative process, the interval of search is reduced at each stage by the factor 1 − 0.38197 = 0.61803, which corresponds to the rate of convergence of this method.

Brent's method falls into the class of so-called safeguarded procedures, which combine fast algorithms with slower, more reliable ones. Press et al. (1992, Sect. 10.3) discuss another method of this nature which seems appropriate for univariate concave optimization. The method proceeds as follows. Given a bracketing triplet of points a < b < c, one determines the direction of the derivative at the intermediate point, b. This information then defines the next interval of search, which would be either [a, b] or [b, c]. The values of the derivatives at the two chosen points can then be used to produce another intermediate point by some root-finding procedure such as the secant method. If this method yields values beyond certain limits, then one bisects the interval under consideration. Of course, in order to implement this latter safeguarded procedure, concavity and smoothness properties of the univariate problem are essential. In the unidimensional case, concavity is always preserved by piecewise linear interpolations. As for differentiability, one could compute for instance one-sided derivatives, or else resort to higher-order interpolations preserving first-order differentiability.
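A minimal golden-section search can be written in a few lines. The fractions 0.38197 and 0.61803 arise from the golden ratio; the concave test function below is an arbitrary example of ours, not one from the chapter:

```python
# Golden-section search for a univariate maximum: each step keeps a
# bracketing interval and shrinks it by the factor 0.61803 per iteration.

import math

R = (math.sqrt(5) - 1) / 2                    # 0.61803..., golden ratio conjugate

def golden_max(f, a, c, tol=1e-8):
    """Maximize f on [a, c], assuming a single interior maximum."""
    b, d = c - R * (c - a), a + R * (c - a)   # two interior trial points
    while c - a > tol:
        if f(b) > f(d):                       # maximum bracketed in [a, d]
            c, d = d, b
            b = c - R * (c - a)
        else:                                 # maximum bracketed in [b, c]
            a, b = b, d
            d = a + R * (c - a)
    return 0.5 * (a + c)

x_star = golden_max(lambda x: -(x - 1.3) ** 2, 0.0, 4.0)
print(x_star)   # approx 1.3
```

The bracket shrinks by 0.61803 per step regardless of the function's values, which is exactly the linear convergence rate quoted above for the safeguarded branch.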
Regarding multivariate optimization, there is also a host of algorithms, the usefulness of which will depend on the dimensionality, smoothness and concavity properties of the optimization problem. In recent years, a considerable amount of attention has been devoted to numerical procedures for non-smooth optimization with both concave and non-concave objectives [e.g., see Bazaraa et al. (1993, Ch. 8), Hiriart-Urruty and Lemaréchal (1993) and Shor (1985)]. A simple algorithm in this class of non-differentiable problems is the downhill simplex method (also called the polytope algorithm). For smooth optimization, two popular procedures are quasi-Newton and conjugate-gradient methods. These latter two methods can also be applied to non-smooth problems, making use of finite-difference approximations for the first-order derivatives.

In an n-dimensional space, the downhill simplex method considers n + 1 functional evaluations, say at points x_0, x_1, …, x_n. These points can be visualized as the vertices of a simplex or polytope. In the next step a new simplex is constructed by producing a new vertex that will replace the point with the worst functional evaluation. Depending on the new value, the polytope may further expand or contract in that direction. A new iteration then starts by replacing the worst point, and this iterative process goes on until a desired solution is attained. Under this procedure, convergence to a maximum is not
usually guaranteed, but this seems to be a convenient way to find a maximum in cases where one cannot approximate the derivatives of the function.

Quasi-Newton methods derive estimates of the curvature of the function without explicitly computing the second-order derivatives. Thus, each iteration starts at a point x_k with a matrix B_k which reflects second-order information, and which is supposed to be an approximation of the true Hessian if the function is sufficiently smooth. (At the initial stage one usually starts with B_0 equal to the identity matrix, and in such a case the algorithm reduces to the steepest-descent method.) Then the search direction, p_k, is the solution to B_k p_k = −g_k, where g_k is the gradient vector. Subsequently, maximization is carried out in this direction, that is, on the line x_k + αp_k. A choice of a number α defines a new point x_{k+1} = x_k + αp_k, and completes the iteration. The Hessian estimate B_{k+1} is then updated following some standard methods [cf. Press et al. (1992), p. 420]. The whole process stops when the gradient g_k is sufficiently small.

Conjugate-gradient methods construct a sequence of search directions which satisfy certain orthogonality and conjugacy conditions so as to improve at each step the search for a maximum. Conjugate-gradient methods do not require estimates or knowledge of the Hessian matrix. Hence, their applicability extends more naturally to large-scale problems.
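As a bare-bones illustration of the direction-plus-line-maximization structure just described, here is steepest ascent, the B_k = I special case of the quasi-Newton scheme, with a central-difference gradient and a simple backtracking step; the quadratic objective and starting point are our own toy choices:

```python
# Steepest ascent with backtracking line search (the B_k = I case of the
# quasi-Newton scheme): gradient by central finite differences, step length
# halved until the move strictly improves the objective.

def grad(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    g = []
    for i in range(len(x)):
        xp, xm = x[:], x[:]
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def steepest_ascent(f, x, iters=200):
    for _ in range(iters):
        g = grad(f, x)
        a = 1.0
        # backtrack until the step along the gradient strictly improves f
        while f([xi + a * gi for xi, gi in zip(x, g)]) <= f(x) and a > 1e-12:
            a *= 0.5
        x = [xi + a * gi for xi, gi in zip(x, g)]
    return x

f = lambda v: -(v[0] - 1.0) ** 2 - 2.0 * (v[1] + 0.5) ** 2   # concave toy objective
x = steepest_ascent(f, [3.0, 3.0])
print(x)   # approx [1.0, -0.5]
```

A genuine quasi-Newton method would replace the raw gradient by the solution of B_k p_k = −g_k, with B_k updated from successive gradients; the line-search skeleton stays the same.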
4.5. Numerical integration

The integral operation in Equation (4.1) can be effected by standard numerical procedures. In general, numerical integration can easily be carried out in one dimension, but it becomes costly in several dimensions. Professionally designed software usually provides an estimate of the approximation error or tolerance, which in most cases can be adjusted.

An n-point quadrature formula, Σ_{i=1}^n w_i f(x_i), is an estimate of a given integral, I = ∫_a^b f(x) dx, where the w_i and x_i are called weights and nodes. These values depend on a, b and n, but not on f. The difference R_n = ∫_a^b f(x) dx − Σ_{i=1}^n w_i f(x_i) is called the remainder or error. There are several well-known quadrature rules which, under certain regularity conditions, yield bounds for the approximation error. The mid-point rule takes w = b − a and x = (1/2)(b + a). The trapezoidal rule has weights w_1 = w_2 = (1/2)(b − a) and nodes x_1 = a, x_2 = b. Each of these rules can be compounded. For example, let us divide the interval [a, b] into N equally-sized panels, and let S_N be the integral value obtained by applying the trapezoidal rule to each of these panels. Then the computations for S_N can be reused to calculate S_{2N}. The compounded Simpson's rule can be defined as S := (4/3) S_{2N} − (1/3) S_N, and for sufficiently smooth functions the approximation
error under this latter rule is "fourth order" [cf. Kahaner et al. (1989) and Press et al. (1992)]. Compounding allows one to attain higher accuracy using previous functional evaluations. Compounding is also very convenient for tracking the approximation error numerically. Theoretical error bounds are usually too conservative. Further, a quadrature rule is not useful unless there is some way to estimate the remainder R_n [Kahaner et al. (1989), p. 150].

Another basic family of integration rules is Gaussian quadrature. Here, the weights and the nodes are freely selected so that certain integrands can be more effectively approximated. An n-point Gaussian quadrature rule can integrate exactly every polynomial up to degree 2n − 1; integration of polynomials of higher degree would generally entail an approximation error. Hence, Gaussian quadrature is very efficient for the integration of smooth, polynomial-like functions, but may not perform well for other types of integrands. More generally, Gaussian quadratures are constructed so as to integrate exactly polynomials times some weighting function ρ(x); that is, weights w_i and nodes x_i are chosen to satisfy ∫_a^b ρ(x) p(x) dx = Σ_{i=1}^n w_i p(x_i), where p(x) is a polynomial. For the particular choice ρ(x) = 1/√(1 − x²), the rule is termed Gauss–Chebyshev integration, and for ρ(x) = 1 the rule is termed Gauss–Legendre integration.

Compounding is not possible for Gaussian rules, since the nodes of an n-point rule are distinct from those of an m-point rule. (Only if n and m are odd will the rules have the mid-point in common.) There are, however, ways to estimate the error for Gaussian quadratures. Let G_n = Σ_{i=1}^n w_i f(x_i) be an n-point quadrature of polynomial degree 2n − 1. Then define

K_{2n+1} = Σ_{i=1}^n a_i f(x_i) + Σ_{j=1}^{n+1} b_j f(y_j).

Here, K_{2n+1} has n + 1 additional nodes y_j, and different coefficients a_i, b_j. These values can be specified so that K_{2n+1} is of polynomial degree 3n + 1. The two rules (G_n, K_{2n+1}) form a Gauss–Kronrod pair. The difference |G_n − K_{2n+1}| is generally a fairly pessimistic error bound for the integral estimate K_{2n+1}. Gauss–Kronrod quadrature rules are usually regarded as very efficient methods for calculating integrals of smooth functions.

For double integrals, ∫∫ f(x, y) dy dx, an obvious procedure is to solve iteratively one-dimensional integrals. This is called a product rule. Here, the approximation error can be bounded from the estimates of the one-dimensional quadratures. In some cases, especially for integrals over many dimensions, it is best to acknowledge the multidimensional nature of the approximation problem and resort to a more direct integration rule. This would be called a non-product rule. Sometimes suitable transformations can be made to facilitate computations, but numerical integration of multiple integrals may become rather costly or infeasible. [Davis and Rabinowitz
(1984) and Stroud (1972) are still useful references.] An alternative route is Monte Carlo integration, which is much easier to implement. Monte Carlo methods approximate the value of the integral by an average of functional evaluations over random points. Now the error is actually a stochastic variable, and hence one can make some probabilistic inferences. Thus, if N is the number of sampled points, then the expected error goes to zero at the rate N^{−1/2}. This result holds under certain mild conditions on the integrand, without reference to the dimensionality. Over the past three decades, there has been active research to improve these estimates of Monte Carlo integration, using "quasi-random" methods or "low-discrepancy points". The general idea of these deterministic methods (with some randomized extensions) is to sample the regions of integration more carefully, so that the error may exhibit on a worst-case basis a convergence rate close to N^{−1} [e.g., see Geweke (1996), Niederreiter (1992) and Press et al. (1992) for an introductory account of this theory, and Papageorgiou and Traub (1996), Paskov (1996), and Tan and Boyle (1997) for some numerical evaluations]. These latter results usually require some sort of Lipschitz continuity, and the constants involved in the orders of convergence may depend on the dimensionality of the domain.
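The compounding identity discussed above, combining the trapezoidal values S_N and S_{2N} as (4 S_{2N} − S_N)/3, is easy to verify on a smooth test integrand; e^x on [0, 1] is our own arbitrary choice with a known answer:

```python
# Compounded trapezoidal rule and the Simpson combination (4*S_2N - S_N)/3:
# the combination raises the order of the error from h^2 to h^4.

import math

def trapezoid(f, a, b, N):
    """Compound trapezoidal rule over N equally-sized panels."""
    h = (b - a) / N
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, N)))

f, a, b = math.exp, 0.0, 1.0
exact = math.e - 1.0

S_N, S_2N = trapezoid(f, a, b, 64), trapezoid(f, a, b, 128)
simpson = (4.0 * S_2N - S_N) / 3.0

print(abs(S_2N - exact))     # O(h^2) trapezoidal error
print(abs(simpson - exact))  # O(h^4) error, several orders of magnitude smaller
```

In a production setting one would also reuse the evaluations of S_N when forming S_{2N} (only the new mid-panel points need evaluating), which is the practical payoff of compounding.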
5. Extensions of the basic algorithm

In this section we introduce some variations of the preceding algorithm which may accelerate the computation of the value and policy functions. There are two natural ways to modify the original method of successive approximations: (a) introducing an alternative iteration process, or (b) appending a different interpolation procedure under the same iteration structure. The accuracy properties of algorithms in class (a) remain unchanged, even though one may obtain considerable savings in computational time. The idea is that the method of successive approximations is slow, and there are other possible ways to compute the fixed point. The accuracy properties of algorithms in class (b) may change, although it should be recalled that higher-order approximants do not always yield higher accuracy, and these approximations may be more costly to implement. A common feature of these extensions is that they have not been extensively used and tested in the economics literature; but some of them seem to be of potential interest, and may be quite useful in particular applications.

5.1. Multigrid methods

Multigrid methods have been widely applied to the solution of partial differential equations [cf. Stoer and Bulirsch (1993)]. In a natural way, these methods have been proposed by several authors for the solution of dynamic programs. Chow and Tsitsiklis (1991) have argued that the complexity of a multigrid algorithm is, in a certain sense, optimal.
Also, Santos and Vigo (1995) implement this algorithm for computing solutions of economic growth problems, and report substantial speedups with respect to the single-grid method, especially for cases with fine grids or with high discount factors.

To motivate this approach, assume that in computing the value function the desired precision parameter h has been fixed, and there is no good initial guess W_0 to start the process. If W^h is the fixed point of T^h, and W is the fixed point of T, then it follows from the contraction property of these operators that in the first iteration the approximation error is bounded by

β ‖W^h − W_0‖ + M h².    (5.1)

Likewise, in the nth iteration the approximation error is bounded by

β^n ‖W^h − W_0‖ + M h².    (5.2)
In these calculations, the distance between W_0 and the value function W has been decomposed in the following way. The term β^n ‖W^h − W_0‖ bounds the distance between the fixed point W^h and the nth iterate of W_0 under T^h for n ≥ 1, whereas M h² bounds the distance ‖W − W^h‖ for M = γ/[2(1 − β)]. This second component is fixed for given h, whereas the first one decreases by the factor 0 < β < 1. Hence, if our original guess is not close to W, initial reductions in the approximation error are relatively large (in absolute value) as compared to the second term M h². Since these gains are basically related to the distance ‖W^h − W_0‖ and the contraction property of the operator T^h, a coarser grid may lead to similar results, even though it entails less computational cost. Thus, it may be beneficial to consider coarser grids in the early stages of the computational process.

Formally, the multigrid method proceeds as follows. Let ({S^{h_i}})_i be a sequence of triangulations, for i = 0, 1, …, n. Let h_i be the mesh size of triangulation {S^{h_i}}. Suppose that h_0 > h_1 > ⋯ > h_i > ⋯ > h_n. Then take an arbitrary initial guess W_0 and implement the above iterative method (4.1) under the coarsest partition {S^{h_0}}, to obtain a fixed point W^{h_0}. Next, choose W^{h_0} as the initial condition for computing W^{h_1}. And follow the same procedure for subsequent finer grids: pick W^{h_{i−1}} as the initial choice for computing W^{h_i}, for i = 1, 2, 3, …, n. For the same level of accuracy, this method may reduce the computational burden, since early computations are performed over coarser grids at less computational cost. Likewise, one can generally obtain reasonable initial guesses of the value and policy functions at a very low cost from numerical computations over coarse grids. Our previous error analysis becomes useful for an optimal implementation of this method.
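A two-level sketch of the scheme conveys the idea: solve on a coarse grid, interpolate the coarse solution onto a fine grid as the initial guess, and finish there. The one-state model, log-utility return, and parameter values below are all our own illustrative choices. The point of the experiment is that the warm-started fine-grid phase needs fewer iterations than a cold start:

```python
# Two-level multigrid value iteration: coarse solve, piecewise linear
# interpolation onto the fine grid, then fine solve from the warm start.

import math

beta = 0.9

def make_grid(n):
    return [0.2 + 2.8 * i / (n - 1) for i in range(n)]   # capital grid on [0.2, 3.0]

def u(k, kp):
    c = k ** 0.3 + 0.5 * k - kp
    return math.log(c) if c > 0 else -1e10

def solve(grid, W0, tol=1e-5):
    """Successive approximations until the sup-norm difference falls below tol."""
    W, n_iter = W0[:], 0
    while True:
        W_new = [max(u(k, kp) + beta * w for kp, w in zip(grid, W)) for k in grid]
        n_iter += 1
        if max(abs(a - b) for a, b in zip(W_new, W)) < tol:
            return W_new, n_iter
        W = W_new

def interp(xs, ys, x):
    """Piecewise linear interpolation of the points (xs, ys) at x."""
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return (1 - t) * ys[i] + t * ys[i + 1]
    return ys[-1]

coarse, fine = make_grid(20), make_grid(100)

_, n_single = solve(fine, [0.0] * len(fine))          # cold start on the fine grid

W_c, n_coarse = solve(coarse, [0.0] * len(coarse))    # cheap coarse-grid solve
guess = [interp(coarse, W_c, k) for k in fine]
_, n_multi = solve(fine, guess)                       # warm start on the fine grid

print(n_single, n_coarse, n_multi)
```

Since each coarse iteration costs a fraction of a fine one (25 times fewer comparisons here), shifting iterations to the coarse grid is where the savings come from, exactly as the decomposition (5.2) suggests.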
As discussed in the preceding sections, inaccuracies of the maximization and integration operations must be of order of magnitude h_i², where h_i is the mesh size of triangulation {S^{h_i}}. Hence, in early stages of the iteration process these subroutines can be set at a looser tolerance level so as to be effected more quickly. Furthermore, from early maximizations one can derive good guesses, so that the search for the maximum
can be restricted to smaller regions while proceeding with finer partitions.

Another crucial issue is the timing of the grid change. A coarser grid is more cheaply effected, but the gain in the approximation error is also smaller; i.e., the right-hand term of expression (5.2) is larger, and the contraction property of the operator does not apply to that term. For an optimal grid change, it seems that an appropriate benchmark value is the gain achieved per work expended. Let us then estimate the approximation errors in expression (5.2), and proceed with the corresponding calculations for a specific illustration. A good bound for the first term of expression (5.2) can be obtained from the contraction property of our operators; indeed,

‖W^{h_i} − W_n‖ ≤ ‖W_{n+1} − W_n‖ / (1 − β).

As for the term M h², we need an estimate of the second-order derivative of the value function. This estimate may be calculated from our theoretical analysis of Section 3, although more operational bounds are usually obtained from computational experiments (cf. Section 7).

To discuss the issue under concern in a more precise manner, assume that there is but one state variable, and that the grid sizes (h_i)_{i=0}^n follow the relation h_i = h_0/2^i, i = 1, 2, …, n. Then moving from mesh size h_i to mesh size h_{i+1} will roughly double the computational cost of each iteration, since there are twice as many vertex points. On the other hand, the additional gain in the approximation error is determined by the second term of expression (5.2), and for a grid change this term is estimated to be (3/4) M h_i², since convergence is quadratic. Therefore, to benefit from the grid change, the gain in the approximation error must double, and this happens when the ratio ‖W^{h_i} − W_n‖/(M h_i²) ≈ 3. For multidimensional problems, the optimal ratio would be smaller, as the computational cost increases exponentially.
5.2. Policy iteration⁷

The method of policy iteration is credited to Bellman (1955, 1957) and Howard (1960) [see Puterman and Brumelle (1979) for some early history]. In economics, the method has been successfully applied to some optimization problems involving large discrete state spaces [e.g., Rust (1987)]. There are, however, certain unresolved issues concerning the implementation of this procedure. From a theoretical point of view, the iterative scheme exhibits locally a quadratic convergence rate, but an optimal radius or region of convergence has not been established. On the other hand, further computational work is needed to assess its performance in a wide range of economic applications.

In the context of our previous discretization procedure, for a given triangulation {S^j} with mesh size h, we now consider the following iterative scheme for policy iteration:

7 This presentation draws upon unpublished work of the author with John Rust.
(i) Initial step: Select an accuracy level TOLW and an initial guess W_0^h.
(ii) Policy improvement step: Find the policy g_n^h(k_0^j, z_0^j) that solves

B(W_n^h)(k_0^j, z_0^j) ≡ −W_n^h(k_0^j, z_0^j) + max_k { v(k_0^j, k, z_0^j) + β ∫_Z W_n^h(k, z′) Q(z_0^j, dz′) }    (5.3)

for each vertex point (k_0^j, z_0^j).
(iii) Policy evaluation step: Find W_{n+1}^h(k_0^j, z_0^j) satisfying

W_{n+1}^h(k_0^j, z_0^j) = v(k_0^j, g_n^h(k_0^j, z_0^j), z_0^j) + β ∫_Z W_{n+1}^h(g_n^h(k_0^j, z_0^j), z′) Q(z_0^j, dz′)    (5.4)

for each vertex point (k_0^j, z_0^j).
(iv) End of iteration: If ‖W_{n+1}^h − W_n^h‖ ≤ TOLW, stop; else, increment n by 1 and return to step (ii).

It should be understood that all functions derived from these iterations are piecewise linear and compatible with the given triangulation {S^j}. Thus, B maps the space of continuous functions, 𝒲, into the space of piecewise linear functions, 𝒲^h. Of course, the fixed point of Equations (5.3)–(5.4) corresponds to the fixed point of Equation (4.2), and the existence of such a unique solution W^h has been established in Lemma 4.1. Therefore, the error bounds derived in Theorem 4.3 and Corollary 4.4 apply here.

Observe that step (ii) corresponds to a single iteration of the method of successive approximations. The innovation of policy iteration lies in step (iii). For continuous random variables this step is generally non-trivial, since it involves the computation of a function W_{n+1}^h which appears on both sides of the equation, and under an integral sign. In such situations, in order to facilitate calculations one may resort to a complete discretization of the space Z, with a further loss of accuracy⁸.

To be more specific about step (iii), let us consider a return function v(k_t, k_{t+1}), with k_t, k_{t+1} in R_+. Assume that {k^i}_{i=1}^N are the grid points. Then, for a piecewise linear function g_n^h, Equation (5.4) can be written as follows:

v_{g_n^h} = [I − β P_{g_n^h}] W_{n+1}^h,    (5.5)

where v_{g_n^h} is an N-dimensional vector with elements v(k^i, g_n^h(k^i)), i = 1, …, N; on the right-hand side, I is the N × N identity matrix, and P_{g_n^h} is an N × N matrix generated by the policy g_n^h in the following way: if k^l = g_n^h(k^i), for i = 1, …, N, and k^l = λk^{j−1} + (1 − λ)k^j for some j, then the ith row of P_{g_n^h} is all zeroes, except for the (j − 1)th and jth entries, which are equal to λ and (1 − λ), respectively. As a result
8 Technically, this is a Fredholm equation of the second kind. A natural approach for solving this problem is to use numerical integration and collocation [cf. Dahlquist and Bjorck (1974), pp. 396-397], and limit the search for the fixed point to a finite system of linear equations.
M.S. Santos
338
of this construction, each row of matrix P_{g_{n+1}^h} is made up of non-negative elements that add up to unity. Hence, the matrix [I − βP_{g_{n+1}^h}] is always invertible, for 0 ≤ β < 1.

An advantage of the discretization procedure of Section 4.1 is that under certain regularity conditions the (Fréchet) derivative of operator B at W_n^h exists and is given by −[I − βP_{g_{n+1}^h}]. Moreover, if {W_n^h}_{n≥0} is a sequence generated by Equations (5.3)-(5.4), it follows from these equations that this sequence satisfies

W_{n+1}^h = W_n^h + [I − βP_{g_{n+1}^h}]^{−1} B(W_n^h).   (5.6)

Therefore, policy iteration is equivalent to Newton's method applied to operator B [cf. Puterman and Brumelle (1979)]. As is well known, Newton's method exhibits locally a quadratic rate of convergence; i.e., in a certain neighborhood of W^h there exists some constant L > 0 such that ||W^h − W_{n+1}^h|| ≤ L ||W^h − W_n^h||² for sequences generated by Equation (5.6). Moreover, the region of quadratic convergence and the complexity properties of this method are determined by the second-order derivative of B [or by the Lipschitz properties of the derivative of B, if second-order derivatives do not exist; e.g., Traub and Wozniakowski (1979)].

Observe that as we proceed to finer grids or partitions, the numbers of rows and columns of matrix P_{g_{n+1}^h} increase accordingly. Likewise, for the same perturbations of the function W^h (using as distance the sup norm), the values of these entries vary more rapidly for finer grids. Hence, the second-order differentiability properties of B cannot be bounded independently of the grid size. Indeed, such derivatives become unbounded as h goes to zero. This is to be contrasted with the method of successive approximations, where the rate of convergence is linear, bounded by factor β (i.e., ||W^h − W_{n+1}^h|| ≤ β ||W^h − W_n^h|| for all n ≥ 1). This bound is independent of the grid size, and of the distance from the fixed point.

Therefore, policy iteration exhibits quadratic convergence to the fixed point of the algorithm, at the cost of introducing the more complex computational step (5.4); quadratic convergence is local, and the region of quadratic convergence (as well as the constant involved in the order of convergence) may depend on the mesh size of the triangulation. Each iteration involves a matrix inversion; although such an operation may be achieved by some efficient procedures exploiting the structure of the problem, its computational cost increases exponentially with the size of the grid.
These considerations lead us to think that policy iteration may not perform so well for very fine partitions of the state space. Some illustrative computations will be reported in Section 7.
5.3. Modified policy iteration

The method of modified policy iteration was originally discussed by Morton (1971). The main purpose is to speed convergence to the fixed point W^h, without incurring the computational burden of policy iteration.
Ch. 5:
Numerical Solution o f Dynamic Economic Models
339
In the framework of our discretization procedure, for a given triangulation {S^j} of mesh size h, the following iterative scheme will be considered for modified policy iteration:

(i) Initial step: Select an accuracy level TOLW and a function W_0^h.

(ii) Policy improvement step: Find g_{n+1}^h(k_0^j, z_0^j) that solves

max_{k_1} v(k_0^j, k_1, z_0^j) + β ∫_Z W_n^h(k_1, z') Q(z_0^j, dz')   (5.7)

for each vertex point (k_0^j, z_0^j).

(iii) Policy evaluation step: For a fixed integer m ≥ 1, let

W_{n+1}^h(k_0^j, z_0^j) = Σ_{t=0}^{m−1} β^t ∫ v(g_{n+1}^{ht}(k_0^j, z_0^j), g_{n+1}^{h,t+1}(k_0^j, z_0^j), z_t) Q^t(z_0^j, dz^t) + β^m ∫ W_n^h(g_{n+1}^{hm}(k_0^j, z_0^j), z') Q^m(z_0^j, dz')   (5.8)

for each vertex point (k_0^j, z_0^j).

(iv) End of iteration: If ||W_{n+1}^h − W_n^h|| ≤ TOLW, stop; else, increment n by 1, and return to step (ii).

As in Section 3, the term g^{ht} refers to the composite function

g_{n+1}^{ht}(k_0, z_0) = g_{n+1}^h(g_{n+1}^h(··· g_{n+1}^h(k_0, z_0) ···, z_{t−2}), z_{t−1})

for every possible realization (z_1, z_2, ..., z_{t−1}). Hence, for m > 1 the right-hand side of Equation (5.8) involves the calculation of multidimensional integrals, which in some cases may be rather complex. Observe that if m = 1 the iterative scheme reduces to the dynamic programming algorithm, and as m goes to infinity the method approaches policy iteration. For a related numerical framework, Puterman and Shin (1978) have established global convergence to the fixed point W^h, with an asymptotic, linear rate of convergence equal to β^m.
With respect to the dynamic programming algorithm, the above iterative process avoids the use of repetitive maximizations, which are generally fairly costly. Thus, the optimal m may be greater than 1 [see Christiano (1990), and Example 7.1 below]. Our experience is that modified policy iteration algorithms usually perform well in deterministic optimization problems or in cases where multiple integrals are easily calculated.
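A minimal sketch of this scheme for a deterministic model (so the integrals in (5.8) drop out) might look as follows; the grid, the log-utility/Cobb-Douglas primitives, and the choice m = 20 are illustrative assumptions, not taken from the chapter:

```python
import numpy as np

# Illustrative deterministic model (assumptions, not from the chapter):
# v(k, k') = log(A k^alpha - k'), full depreciation.
alpha, beta, A = 0.3, 0.95, 1.0
N, m = 200, 20                                   # grid size, evaluation sweeps
grid = np.linspace(0.05, 0.5, N)
c = A * grid[:, None] ** alpha - grid[None, :]
v = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e10)

W = np.zeros(N)
for it in range(2000):
    # (ii) Policy improvement (5.7): one costly maximization sweep.
    g = np.argmax(v + beta * W[None, :], axis=1)
    v_g = v[np.arange(N), g]
    # (iii) Partial policy evaluation (5.8): m cheap backups under the
    # fixed policy g, instead of the exact linear solve of policy iteration.
    W_new = W
    for _ in range(m):
        W_new = v_g + beta * W_new[g]
    if np.max(np.abs(W_new - W)) < 1e-8:
        W = W_new
        break
    W = W_new
```

With m = 1 the loop is exactly the dynamic programming algorithm; raising m trades extra cheap backups for fewer maximization sweeps, in line with the asymptotic rate β^m.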
5.4. Polynomial interpolation and spline functions

For certain applications it may be more convenient to consider alternative approximation schemes, such as piecewise multilinear functions over rectangular subdivisions, or higher-order interpolants (e.g., polynomials and splines). Under our previous assumptions, piecewise multilinear approximations would again yield an approximation error for the computed value function of order h². Moreover, for cases in which the value function W is higher-order differentiable, it is possible to obtain better orders of convergence under more appropriate approximation schemes. In particular, if the value function is C^k for k ≥ 3, then it is also plausible to derive convergence of order k using higher-order interpolants. As discussed in Section 3, fairly little is known about higher-order differentiability properties of the value and policy functions, and there are no known operational methods to bound these derivatives whenever they exist. In those cases, a more complex interpolant may not yield better accuracy, since the approximation error may depend on the size of the higher-order derivatives⁹.

There are nevertheless certain situations where the use of higher-order interpolants may be advantageous. An obvious case is when for the family of models under consideration the value and policy functions are smooth with relatively small high-order derivatives. But even if these derivatives do not always exist, one may be led to believe that the functions are reasonably well behaved, and that smooth approximations will give good results. Additionally, if concavity is preserved, smooth approximations facilitate the application of more efficient numerical maximization subroutines, accelerating the computation process.
5.4.1. Polynomial interpolation

The use of polynomial approximation in dynamic programming dates back to Bellman, Kalaba and Kotkin (1963). These authors argue in favor of polynomial bases with certain orthogonality properties, such as the Chebyshev and Legendre polynomials. Generally, the use of orthogonal bases facilitates the computations and leads to better accuracy. The Chebyshev polynomial of degree m is denoted T_m(x), and is defined by the relation

T_m(x) = cos(m arccos x),  m = 0, 1, 2, ....   (5.9)
⁹ In order to extend the analysis of Section 4 to higher-order interpolations, a technical problem is to establish the monotonicity of the discretized maximization operator T_h, asserted in Lemma 4.1. This property is not generally satisfied for higher-order interpolants [cf. Judd and Solnick (1997)].
Combining this definition with some trigonometric identities, we obtain for m ≥ 1 the functional expressions

T_0(x) = 1,
T_1(x) = x,
T_2(x) = 2x² − 1,
...
T_{m+1}(x) = 2xT_m(x) − T_{m−1}(x).

Each polynomial T_m has m zeroes in the interval [−1, 1]. These solutions are specified by the values

x_k = cos( π(2k − 1) / (2m) ),  k = 1, 2, ..., m.   (5.10)
The location of these zeroes is such that 2^{−(m−1)} T_m is the polynomial of degree m with leading coefficient 1 which deviates least from zero over the interval [−1, 1]. Hence, these polynomials feature minimal oscillations, and this is an attractive property for the purposes of interpolation. Another important property is the following discrete orthogonality relation. Let x_k (k = 1, ..., m) be the m zeroes of T_m(x). Then, for i, j < m,

Σ_{k=1}^{m} T_i(x_k) T_j(x_k) =  0,    if i ≠ j,
                                 m/2,  if i = j ≠ 0,
                                 m,    if i = j = 0.     (5.11)
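As a quick check on these identities, a short script (illustrative, not from the chapter) can build T_m via the recursion above, evaluate it at the zeroes (5.10), and verify the discrete orthogonality relation (5.11):

```python
import numpy as np

def cheb_T(m, x):
    """Evaluate the Chebyshev polynomial T_m at x via the recursion
    T_{m+1}(x) = 2x T_m(x) - T_{m-1}(x)."""
    x = np.asarray(x, dtype=float)
    t_prev, t = np.ones_like(x), x.copy()
    if m == 0:
        return t_prev
    for _ in range(m - 1):
        t_prev, t = t, 2 * x * t - t_prev
    return t

m = 7
k = np.arange(1, m + 1)
zeros = np.cos(np.pi * (2 * k - 1) / (2 * m))     # the zeroes (5.10)

# T_m vanishes at its zeroes ...
assert np.allclose(cheb_T(m, zeros), 0.0, atol=1e-12)

# ... and the discrete orthogonality relation (5.11) holds for i, j < m.
for i in range(m):
    for j in range(m):
        s = np.sum(cheb_T(i, zeros) * cheb_T(j, zeros))
        expected = m if i == j == 0 else (m / 2 if i == j else 0.0)
        assert abs(s - expected) < 1e-10
```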
To illustrate the relevance of appropriately locating the nodes to minimize possible oscillations from polynomial interpolation, let us examine the following two examples. First, consider the function f(x) = √x, x ∈ [0, 1]. Functions of this form are frequently observed in economic models. Successive interpolations of this function will be taken at both equally spaced points and at the Chebyshev zeroes¹⁰. Table 1 reports the approximation error e_m = max_{x∈[0,1]} |f(x) − p_m(x)| for various polynomial degrees under both interpolation schemes. It can be observed that uniform convergence is ensured for Chebyshev interpolation, but not for interpolation at equally spaced points. Indeed, under this latter interpolation, the error grows without bound.

¹⁰ For a sequence of m + 1 distinct points {x_i}_{i=0}^m, interpolation at the values {f(x_i)}_{i=0}^m uniquely defines a polynomial p_m(x) of degree m. For equally spaced points, x_0 and x_m would correspond to the extreme points of the interval. For interpolation at the Chebyshev zeroes the nodes are x_i = ½(y_i + 1) for all i, where {y_i}_{i=0}^m are the zeroes of the Chebyshev polynomial T_{m+1}(y) defined on [−1, 1]. That is, in this latter case a change of units is needed, since the Chebyshev polynomials are defined on the interval [−1, 1], and the function f(x) = √x is restricted to the interval [0, 1].
Table 1
Approximation errors for the function f(x) = √x, x ∈ [0, 1]ᵃ

Vertex    Chebyshev        Point of      Interpolation at        Point of
points    interpolation    max. error    equally spaced points   max. error
10        5.01×10⁻²        0.0           1.72×10⁻²               5.55×10⁻²
25        2.00×10⁻²        0.0           5.86×10⁻³               2.08×10⁻²
50        1.00×10⁻²        0.0           2.66×10⁻³               1.02×10⁻²
75        6.70×10⁻³        0.0           2.59×10⁰                6.75×10⁻³
100       5.00×10⁻³        0.0           4.28×10⁹                5.05×10⁻³
200       2.50×10⁻³        0.0           4.69×10³⁹               9.97×10⁻¹
300       1.60×10⁻³        0.0           3.72×10⁶⁸               9.98×10⁻¹
400       1.20×10⁻³        0.0           9.50×10⁹⁸               1.25×10⁻³
500       1.00×10⁻³        0.0           1.36×10¹²⁹              9.98×10⁻¹

ᵃ Columns 2 and 4 report the approximation errors e_m = max_{x∈[0,1]} |f(x) − p_m(x)|, with p_m(x) the polynomial interpolant of degree m. Columns 3 and 5 report the point x in [0, 1] where e_m attains the maximum value for each of these interpolants.
Another notorious example of non-convergence can be obtained from the function f(x) = |x|, x ∈ [−1, 1]. This function is non-differentiable at x = 0, and has constant derivatives at every other point. Again, simple functions of this form are commonly observed in economics. Non-differentiabilities, such as that of point x = 0, may arise from optimization at a boundary surface. In an analogous manner, Table 2 displays the approximation error e_m for polynomials of various degrees under both interpolation procedures. As in the previous example, only Chebyshev interpolation guarantees uniform convergence. As a matter of fact, it can be shown [cf. Natanson (1965), p. 30] that for interpolation at uniformly spaced points convergence occurs only at points −1, 0, 1, and not at any other point. (Convergence at the extreme points −1, 1 is guaranteed by construction.)

In the case of the function √x a main problem for polynomial interpolation is that the derivatives are unbounded at point x = 0, whereas for the function |x| the derivatives are not defined at point x = 0. Since the polynomial interpolant is jointly defined over the whole domain, sharp changes in the derivatives of the function may lead to large oscillations in the interpolant. These oscillations are somewhat minimized under Chebyshev interpolation, and in both of the above examples this interpolation procedure displays uniform convergence. There are, however, continuous functions for which Chebyshev interpolation may fail to converge uniformly. Although there is no known functional form with such a property, lack of convergence may be established by a constructive argument [cf. Natanson (1965), Ch. 2]. Indeed, from the construction of such a mapping one could actually show that the class of continuous functions for
Table 2
Approximation errors for the function f(x) = |x|, x ∈ [−1, 1]ᵃ

Vertex    Chebyshev        Point of      Interpolation at        Point of
points    interpolation    max. error    equally spaced points   max. error
10        5.50×10⁻²        −5.26×10⁻²    7.47×10⁻²               −5.55×10⁻¹⁷
25        2.31×10⁻²        −6.12×10⁻²    5.83×10²                9.58×10⁻¹
50        1.11×10⁻²        −1.01×10⁻²    4.77×10⁷                −9.79×10⁻¹
75        7.73×10⁻³        −2.01×10⁻²    2.14×10¹⁶               9.86×10⁻¹
100       5.58×10⁻³        −5.02×10⁻³    3.09×10²¹               −9.89×10⁻¹
200       2.79×10⁻³        −2.50×10⁻³    2.35×10⁵⁰               9.94×10⁻¹
300       1.86×10⁻³        −1.66×10⁻³    5.81×10⁷⁹               9.96×10⁻¹
400       1.39×10⁻³        −1.25×10⁻³    2.31×10¹⁰⁹              9.97×10⁻¹
500       1.11×10⁻³        −1.00×10⁻³    1.19×10¹³⁹              9.97×10⁻¹

ᵃ Columns 2 and 4 report the approximation errors e_m = max_{x∈[−1,1]} |f(x) − p_m(x)|, with p_m(x) the polynomial interpolant of degree m. Columns 3 and 5 report the point x in [−1, 1] where e_m attains the maximum value for each of these interpolants.
a Columns 2 and 4 report the approximation errors % - maxxc [ 1,1] If(x)-pm(x)l, with pro(X) the polynomial interpolant of degree m. Columns 3 and 5 report the point x in [-1, 1] where em attains the maximum value for each of these interpolants. w h i c h C h e b y s h e v interpolation m a y fail to converge u n i f o r m l y is non-negligible in the metric space i n d u c e d by the m a x norm. For continuous functions, u n i f o r m c o n v e r g e n c e can be insured for a m o r e fanciful, H e r m i t i a n interpolation at the C h e b y s h e v nodes [cf. R i v l i n (1990), Th. 1.3]. However, this procedure m a y b e c o m e a w k w a r d for computational purposes, since there is no handy way to estimate the a p p r o x i m a t i o n error. For continuously differentiable functions, m o r e operative error bounds are available. Thus, assume that f is a C k function on [ - 1 , 1]. Then, it can be shown [cf. Rivlin (1969, T h e o r e m s 1.5, 4.1, 4.5), Judd (1992)] that for C h e b y s h e v interpolation, em =
max
l~x~l
If(x)-pm(X)[<~
M log m mk
IIfkl[
(5.12)
for all m > k; here, M is a certain constant that depends on k, and Itfkll is the m a x i m u m value o f the kth-order derivative o f f . A s o b s e r v e d by Judd (1992), p i e c e w i s e linear interpolation dominates asymptotically p o l y n o m i a l interpolation for k K 2, whereas for k ~ 3 p o l y n o m i a l interpolation exhibits a h i g h e r convergence order. O f course, in practical applications one should also take into account the constant terms i n v o l v e d in these error bounds. As discussed in Section 3, high-order derivatives o f the policy function m a y g r o w without b o u n d or m a y fail to exist. In those cases, p o l y n o m i a l interpolation m a y lead to detrimental results, and indeed m a n y authors w a r n against its extensive use. The f o l l o w i n g excerpt is taken f r o m Press et al. (1992), p. 101: Unless there is a solid evidence that the interpolation function is close in form to file true function, f , it is a good idea to be cautious about polynomial interpolation. We enthusiastically
endorse interpolation with 3 or 4 points, we are perhaps tolerant of 5 or 6; but we rarely go higher than that unless there is quite rigorous monitoring of estimated errors.

A further problem with polynomial interpolation is that the functions may lose their original shape. Concavity and monotonicity are not usually preserved. These are key properties for numerical maximization and related operations. On the other hand, polynomials possess exact derivatives and integrals, and as simple, smooth functions they can allow for some other convenient manipulations.

Our commentary thus far has been limited to polynomial interpolation for functions of one variable. There is much less practical experience with multivariate polynomial interpolation, a topic surrounded by further technical difficulties [cf. Lorentz (1992), Xu (1996)]. One notorious problem is that for a given set of points and a proper polynomial subspace, the interpolant may not be uniquely defined. To obtain good results, either the functional domain or the location of the nodes must be restricted. A well-behaved family of multidimensional interpolants are those defined as products of monomials¹¹ (i.e., the so-called tensor products), where many unidimensional arguments carry through. Indeed, for tensor products regular polynomial interpolation is uniquely defined; further, error bounds of the type (5.12) are also available, even though these estimates are somewhat diminished. For further details, see Hämmerlin and Hoffmann (1991, Ch. 6).

5.4.2. Spline functions
Let a = x_0 < x_1 < x_2 < ··· < x_n = b be an array of vertex points in the real line. Then, a spline function of degree k is a C^{k−1} mapping on [a, b] that coincides on each interval [x_i, x_{i+1}] with a polynomial of degree k. This definition generalizes to the multidimensional case by considering tensor products over unidimensional functions [cf. Schumaker (1981)]. A piecewise linear function would correspond to a spline of degree 1.

Splines combine in subtle ways the benefits of polynomials and piecewise interpolation, since they allow for a tight control of the function, preserving the smoothness of the interpolant. More precisely:
(i) As piecewise functions, splines avoid the typical oscillating behavior of polynomial interpolation.
(ii) As higher-order interpolants, splines may exhibit better orders of convergence than piecewise linear functions.
(iii) As smooth approximations, splines permit the application of powerful numerical maximization methods.
¹¹ For instance, for bivariate interpolation in the (x, y)-plane, a polynomial p_{nk}(x, y) would be defined at each point as p_{nk}(x, y) = p_n(x)p_k(y), where p_n(x) is a polynomial of degree n in x and p_k(y) is a polynomial of degree k in y.
As already discussed, polynomials may have a poor performance for the approximation of certain functions. By focussing on a certain grid of points, splines may overcome this unnatural feature of polynomials. Piecewise linear approximations and splines are finite element methods, which allow for a local control of the approximation error. The main trade-off with respect to piecewise linear approximations is that splines may yield better orders of convergence, but generally require greater computational cost for determining the appropriate coefficients. Cubic splines and B-splines [cf. Schumaker (1981)] are two examples of higher-order interpolants that can be implemented at a relatively low computational cost. An additional advantage of splines is that the resulting function is smooth, so that Newton-type methods - which rely on the computation of first- and second-order derivatives - can be applied for numerical maximization. Of course, for a successful implementation of a Newton-type method, the interpolation must preserve certain concavity properties, suggesting that the order of the interpolant cannot be too high. Indeed, there are certain quadratic splines that preserve the concavity of the approximation [cf. Schumaker (1983)], but it seems much harder to preserve concavity for splines of higher order. In those cases, it appears more practical to check concavity for some test functions [cf. Johnson et al. (1993)].

There has not been much numerical experimentation in economics on the performance of splines. It seems, however, that these functions may be greatly beneficial for cases of fine grids and smooth optimal policies¹². Some illustrative computations are provided in Section 7. Spline interpolation has been advocated by Johnson et al. (1993). These authors study a four-dimensional dynamic programming problem, and report reductions in CPU time over piecewise linear functions by factors of 250-300, for a given level of accuracy. The gains stem from both faster numerical maximization and better accuracy properties of splines. For the numerical maximization, Johnson et al. (1993) apply a quasi-Newton algorithm under spline interpolation, and a polytope algorithm under piecewise linear interpolation. It appears that these gains are overestimated, since quasi-Newton methods can still be used for piecewise linear interpolants provided that these functions preserve some regularity properties (see the discussion in Section 4.4).
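Point (i) above can be sketched with a natural cubic spline; the script below is illustrative (the Runge function 1/(1 + 25x²) is a textbook test case, not taken from the chapter), and compares the spline with the degree-10 interpolating polynomial on the same equally spaced nodes:

```python
import numpy as np

def natural_cubic_spline(x, y):
    """Return a callable natural cubic spline through the points (x_i, y_i).
    The second derivatives M_i solve the standard tridiagonal system, with
    M_0 = M_n = 0 (the 'natural' boundary condition)."""
    n = len(x) - 1
    h = np.diff(x)
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0                      # natural boundary conditions
    for i in range(1, n):
        A[i, i - 1], A[i, i], A[i, i + 1] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    M = np.linalg.solve(A, rhs)

    def s(t):
        t = np.asarray(t, dtype=float)
        i = np.clip(np.searchsorted(x, t) - 1, 0, n - 1)
        dl, dr = t - x[i], x[i + 1] - t
        return (M[i] * dr**3 + M[i + 1] * dl**3) / (6 * h[i]) \
            + (y[i] / h[i] - M[i] * h[i] / 6) * dr \
            + (y[i + 1] / h[i] - M[i + 1] * h[i] / 6) * dl
    return s

# Runge's example: the polynomial interpolant oscillates, the spline does not.
f = lambda t: 1.0 / (1.0 + 25.0 * t**2)
x = np.linspace(-1.0, 1.0, 11)
s = natural_cubic_spline(x, f(x))
p = np.polyfit(x, f(x), len(x) - 1)              # degree-10 interpolant

ts = np.linspace(-1.0, 1.0, 2001)
err_spline = np.max(np.abs(f(ts) - s(ts)))
err_poly = np.max(np.abs(f(ts) - np.polyval(p, ts)))
```

The tridiagonal system here is solved densely for brevity; a production implementation would use a banded solver, which is what keeps the cost of determining the spline coefficients low.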
6. Numerical approximations of the Euler equation

Our main objective so far has been the computation of the value function from Bellman's equation (3.1). Since it does not seem plausible to compute this fixed point directly, several iterative algorithms have been considered that guarantee convergence to the fixed point. These algorithms are reliable, but they involve at each step costly
¹² Splines could still be effective in models lacking higher-order differentiability of the policy function, especially if such derivatives exist almost everywhere and are bounded.
numerical maximizations and integrations. In this section, we discuss alternative solution methods based on approximating the Euler equation. In general, these methods do not guarantee global convergence to a desired solution, but are sometimes very effective, since they approximate locally the fixed point at higher convergence rates. We shall also review some valuable results for testing the accuracy properties of these algorithms. Our starting point is the following Euler equation:

D_2v(k_0, k_1, z) + β ∫_Z D_1v(k_1, k_2, z') Q(z, dz') = 0.   (6.1)
Under our previous assumptions of interiority and differentiability, this equation must be satisfied along every optimal orbit. Hence, this equation holds at all (k_0, z) such that k_1 = g(k_0, z) and k_2 = g(g(k_0, z), z'). Moreover, under the present assumptions, function g is the unique solution to this functional equation [cf. Stokey and Lucas (1989, Ch. 4)].

Several discretization procedures are available to compute the optimal policy function g from Equation (6.1). In deterministic problems, there are well-established methods for the solution of ordinary differential equations [e.g., see Gear (1971) and Lambert (1991), and Mulligan (1993) for a recent application to economic problems]. In order to extend this approach to our economic framework, the basic idea is to approximate the graph of the policy function as the set of solutions that satisfy at all times the second-order system of difference equations implicitly defined by Equation (6.1). Thus, let us assume that uncertainty is not present in the analysis, so that the Euler equation may be written in the following simple form:

D_2v(k_0, g(k_0)) + βD_1v(g(k_0), g²(k_0)) = 0.   (6.2)
Suppose now that g has a unique, globally stable steady state, k* = g(k*). This stationary solution k* can readily be calculated from Equation (6.2). Furthermore, the derivative of the policy function Dg(k*) can be determined from the tangent space of the stable manifold of the system. Hence, in a small neighborhood of k*, function g may be approximated by its derivative Dg(k*). Once these functional values have been estimated for a given neighborhood, a global approximation can be obtained by iterating backwards on Equation (6.2). Indeed, computing the stable manifold of system (6.2) amounts to computing the graph of the policy function. Moreover, error estimates for all these approximations can easily be derived.

The approach just described allows us to compute directly the policy function, and avoids the explicit use of numerical maximizations, typical of the slower algorithms of the preceding sections. However, this computational procedure breaks down in deterministic models with complex dynamics or in the presence of uncertainty. Indeed, in a stochastic framework the stationary state would correspond to an invariant distribution, which generally cannot be well approximated locally by the derivative
of the policy function. Hence, we need alternative methods that can tackle the Euler equation in a more global way.
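The computation of k* and Dg(k*) just described can be sketched for an assumed example (log utility with Cobb-Douglas production and full depreciation, an illustrative choice for which the stable root of the linearized Euler equation is known to equal α):

```python
import numpy as np

# Illustrative primitives (assumptions): v(k, k') = log(A k^alpha - k'),
# full depreciation, so the exact policy is g(k) = alpha*beta*A*k^alpha.
alpha, beta, A = 0.3, 0.95, 1.0

# Steady state from Equation (6.2): beta * alpha * A * k^(alpha-1) = 1.
k_ss = (alpha * beta * A) ** (1.0 / (1.0 - alpha))
c_ss = A * k_ss ** alpha - k_ss

# Second derivatives of v(k, k') = u(f(k) - k') at the steady state,
# with u(c) = log(c) and f(k) = A k^alpha.
fp = alpha * A * k_ss ** (alpha - 1.0)            # f'(k*) = 1/beta
fpp = alpha * (alpha - 1.0) * A * k_ss ** (alpha - 2.0)
up, upp = 1.0 / c_ss, -1.0 / c_ss ** 2
v11 = upp * fp ** 2 + up * fpp
v12 = -upp * fp                                    # = v21 by symmetry
v22 = upp

# Linearizing (6.2) around k* and substituting dk_{t+1} = lam * dk_t gives
# v21 + (v22 + beta*v11)*lam + beta*v12*lam^2 = 0; Dg(k*) is the stable root.
roots = np.roots([beta * v12, v22 + beta * v11, v12])
dg = float(np.real(roots[np.abs(roots) < 1.0][0]))
```

The quadratic has one root inside and one root outside the unit circle; selecting the stable one is exactly the tangent-space computation described above.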
6.1. Numerical methods for approximating the Euler equation

Since numerical methods from ordinary difference equations are not directly applicable to stochastic models, an alternative approach is to compute function g directly from Equation (6.1) using a standard discretization procedure (e.g., using Newton's method over a finite-dimensional space of functions that approximate g). But given that function g is implicitly defined by such a non-linear equation, it seems that a more fruitful computational strategy would be to approximate directly the second term of Equation (6.1) - and consequently the first - as a function of (k, z); then, from this approximation we can compute function g. This approach avoids some of the non-linearities involved in attempting to compute the implicitly defined function g directly, and consequently it may lead to more operational numerical schemes. The family of methods following this approach is known as parameterized expectations algorithms [e.g., Wright and Williams (1984) for an early application, and den Haan and Marcet (1990), Judd (1992), and Christiano and Fisher (1994) for an account of recent developments]. This terminology may seem deceptive, since such algorithms can also be used for computing solutions of deterministic dynamic problems. Hence, "algorithms approximating the Euler equation" seems to be a more appropriate designation. We shall first outline a general method for carrying out these computations, and then focus on a simple algorithm due to Christiano and Fisher (1994). As with other algorithms in this family, there are potential problems concerning its implementation; in particular, a solution may not exist, there may be multiple solutions, the algorithm may fail to converge to a desired solution, and formal error estimates are not available.
Our description of a general framework for approximating the Euler equation proceeds as follows: (Step 1) Select an n-dimensional space gt of real-valued, non-negative functions q/o(k,z); each function t/to can be defined by a vector a C R n. (Step 2) Compute a function U = go(k, z) from the condition
D2v(k, k', z) + ~a(k, z) = O.
(6.3)
(Step 3) Define a system of n-dimensional equations
cpi(!tto-ma) = 0,
i = 1,2,...,n,
(6.4)
where each q~i is a real-valued mapping from the space of/-dimensional functions over (k, z), and mo denotes the mapping
ma(k, z) = [3 [ Dl O(go(k, z), ga(gu(k, z), z'), z') Q(z, dz'). ,Iz
(Step 4) Find a solution a* for system (6.4).
There are several issues involved in this numerical procedure. The main objective is to attain a certain level of accuracy for a reasonable computational cost. In step 1, one chooses a suitable n-dimensional space of functions Ψ. This space could be generated by the ordinary polynomials of degree n − 1 [e.g., Marcet and Marshall (1994)], by polynomials with orthogonal bases [cf. Judd (1992)], by piecewise linear functions [e.g., McGrattan (1996)], or by spline functions. Given an element Ψ_a, in step 2 we can compute from Equation (6.3) a function g_a. As already discussed, this indirect procedure for computing the policy function may simplify the solution method, since the first term in Equation (6.1) is now approximated by a known function Ψ_a, and hence all non-linearities are just embedded in mapping m_a ¹³. Then, in step 3 we formulate an n-dimensional system of equations to determine the fixed point a*, which may be calculated by some root-finding numerical procedure. The most powerful solution methods are those based upon Newton's method, involving the inversion of a certain Jacobian matrix; in those situations, one should check that the problem is well-conditioned [i.e., that the matrix to be inverted is not nearly singular; see Press et al. (1992)]. Newton's method ensures a quadratic order of convergence provided that the initial guess is close enough to the true solution.

Unless Ψ_a(k, z) is a sufficiently good approximation, Equation (6.3) may not have a feasible solution. Indeed, k' = g_a(k, z) is the maximizer for the optimization problem

max_{k'}  v(k, k', z) + Ψ_a(k, z) · k'.   (6.5)

In view of the concavity of v(k, k', z) in k', numerical maximization of problem (6.5) may be a reliable strategy for solving Equation (6.3). Alternatively, a solution to Equation (6.3) may be obtained by some root-finding method, and for multisector models a good initial guess is usually required in order to guarantee convergence to a desired solution. In models with one variable, computation of Equation (6.3) may just involve some simple algebraic operations. Observe that the choice of the space Ψ limits the set of plausible conditions to be imposed in step 3, and consequently the available solution methods to be applied in step 4. Thus, simultaneous consideration of steps (1)-(4) is required for an optimal design of the numerical algorithm.

Computation of the fixed point a* may be a rather delicate problem, since function m_a could be highly non-linear, involving a conditional expectation; further, a solution may not exist, or there may be multiple solutions. Press et al. (1992, p. 372) share the following views on these issues:
¹³ A common belief in favor of this computational approach is that the conditional expectation function m(k, z) = β ∫_Z D_1v(g(k, z), g(g(k, z), z'), z') Q(z, dz') is smoother than other functions characterizing the optimal solution, such as the policy function g(k, z). A straightforward application of the chain rule shows, however, that in general function m(k, z) cannot be smoother than function g(k, z). In a revised version of the original paper, Christiano and Fisher argue that, under an alternative approximation due to Wright and Williams (1984), function m(k, z) may be smoother than g(k, z) in some specific situations. This seems, however, a minor improvement; in addition, the Wright-Williams approach may lead to further complexities in the computation involved in Equation (6.3), as concavity is lost.
We make an extreme but wholly defensible statement: There are no good, general methods for solving systems of more than one nonlinear equation. Furthermore, it is not hard to see why (very likely) there never will be any good, general methods. For problems in more than two dimensions, we need to find points mutually common to N unrelated zero-contour surfaces, each of dimension N − 1. You see that root finding becomes virtually impossible without insight! You will almost always have to use additional information, specific to your particular problem, to answer such basic questions as, "Do I expect a unique solution?", and "Approximately where?"...

We should then highlight some important differences with respect to the family of algorithms of the preceding sections. Numerical dynamic programming may be slow and computationally costly. However, global convergence to a desired solution always holds, and the associated maximization and integration operations may be effected by relatively reliable procedures. In contrast, solving non-linear systems of equations is conceptually a much more complex numerical problem, although under a good initial candidate for the approximate solution convergence could be faster, assuming that the system is well-behaved (i.e., singularities do not unfold in the computational procedure). Given the inherent complexity of non-linear systems, we should not expect these solution methods to be operational for very large equation systems, or for discretizations involving a great number of vertex points.

As a representative element in the class of numerical methods approximating the Euler equation, we now present an algorithm put forth by Christiano and Fisher (1994). The finite-dimensional space Ψ will be generated by the Chebyshev polynomials. These polynomials satisfy certain desired orthogonality properties, which will facilitate the implementation of the numerical model, and allow for a more accurate approximation.
To describe the workings of the algorithm, let us confine ourselves to the simple framework of Equation (6.2), i.e. a deterministic setting in which k is a scalar variable. Each basis function ψ_a is written in the form

ψ_a = exp( Σ_{j=0}^{n−1} a_j T_j(x) ),

where T_j(x) is the Chebyshev polynomial of degree j,

x = 2 (log k − log k̲)/(log k̄ − log k̲) − 1,

and [k̲, k̄] is the interval of feasible capitals^14. The exponential function ensures that each basis function is non-negative; also, the new variable x ∈ [−1, 1]. We can then express k as a function of x, which in more compact form will be written as k = η(x).
^14 An alternative parameterization is to let x = 2(k − k̲)/(k̄ − k̲) − 1.
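To fix ideas, the change of variable and the exponentiated Chebyshev basis can be sketched in a few lines (an illustrative Python fragment; the function names and the sample interval below are ours, not part of the original algorithm):

```python
import numpy as np

def x_of_k(k, k_lo, k_hi):
    """Map k in [k_lo, k_hi] into x in [-1, 1] through logs (footnote 14 gives a linear variant)."""
    return 2.0 * (np.log(k) - np.log(k_lo)) / (np.log(k_hi) - np.log(k_lo)) - 1.0

def eta(x, k_lo, k_hi):
    """Inverse map k = eta(x): recover the capital stock from x in [-1, 1]."""
    return np.exp(np.log(k_lo) + 0.5 * (x + 1.0) * (np.log(k_hi) - np.log(k_lo)))

def psi(a, k, k_lo, k_hi):
    """Basis function psi_a(k) = exp(sum_j a_j T_j(x(k))), non-negative by construction."""
    return np.exp(np.polynomial.chebyshev.chebval(x_of_k(k, k_lo, k_hi), a))
```

Whatever the coefficient vector a, psi stays strictly positive, which is precisely what the exponential parameterization is meant to guarantee.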
[If there are many state variables, tensor products can generate polynomial bases with orthogonality properties of the type (5.11), e.g. see Judd (1992).] Step 2 requires computation of the function g_a, and this is straightforward in some simple models [cf. Christiano and Fisher (1994)], since ψ_a(k) is equal to the inverse of the marginal utility of consumption. In step 3, the algorithm makes use of a collocation method so that condition (6.4) reduces to the following system of equations:

ψ_a(η(x_i)) = m_a(η(x_i)),    i = 1, 2, …, n.    (6.6)

Here, x_i (i = 1, 2, …, n) are the zeroes of the nth-degree polynomial T_n(x). Taking logs in Equation (6.6), and making use of the orthogonality conditions (5.11), we then obtain

a_j = (μ̂/n) Σ_{i=1}^{n} T_j(x_i) log m_a(η(x_i)),    j = 0, 1, …, n − 1,    (6.7)

where μ̂ = 2 if j ≥ 1, and μ̂ = 1 if j = 0. Observe that all non-linearities in system (6.7) appear on the right-hand side. This seemingly simple form may facilitate computation of a fixed point a*, and such a simple structure stems from both the method of approximating the Euler equations and the orthogonality properties of the polynomials. Let

Φ_j(a) = (μ̂/n) Σ_{i=1}^{n} T_j(x_i) log m_a(η(x_i)),    j = 0, 1, …, n − 1.
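In code, the right-hand side of (6.7) is one matrix-vector product once the collocation nodes are tabulated (a hedged Python sketch; log_m, which evaluates log m_a(η(x_i)) at the nodes, is a hypothetical model-specific routine that the user must supply):

```python
import numpy as np

def chebyshev_zeros(n):
    """The n zeros of the degree-n Chebyshev polynomial T_n."""
    i = np.arange(1, n + 1)
    return np.cos((2 * i - 1) * np.pi / (2 * n))

def phi(a, log_m, n):
    """One application of the map in (6.7):
    a_j <- (mu_j/n) * sum_i T_j(x_i) * log m_a(eta(x_i)),
    with mu_j = 1 for j = 0 and mu_j = 2 for j >= 1."""
    x = chebyshev_zeros(n)
    y = log_m(a, x)                                     # log m_a(eta(x_i)), i = 1..n
    T = np.polynomial.chebyshev.chebvander(x, n - 1)    # T[i, j] = T_j(x_i)
    mu = np.full(n, 2.0)
    mu[0] = 1.0
    return (mu / n) * (T.T @ y)

def iterate_to_fixed_point(a0, log_m, n, tol=1e-10, max_iter=500):
    """Successive approximations a <- phi(a); no stability guarantee in general."""
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        a_new = phi(a, log_m, n)
        if np.max(np.abs(a_new - a)) < tol:
            return a_new
        a = a_new
    return a
```

The discrete orthogonality of the T_j at the nodes x_i is what collapses the projection step into the single product T.T @ y.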
Then, step 4 involves the computation of a fixed point a* = Φ^n(a*); i.e., solving the equation system (6.6) or (6.7). This computation may be most efficiently effected via standard solution methods for non-linear equations. Christiano and Fisher (1994) resort to a Newton-Raphson algorithm, taking as initial guess the solution of an associated quadratic optimization problem. For more complex problems, homotopy methods may be more effective, or successive refinements of the algorithm, using as initial guess the solution of a coarser discretization (cf. Example 7.2, below). Alternatively, one could iterate on the map a_{t+1} = Φ^n(a_t), and verify if the sequence converges to a fixed point; but unless we are sure of the stability properties of the mapping Φ^n, convergence here could be slower or even more problematic. As illustrated by Christiano and Fisher (1994), the algorithm has been fast and accurate in some simple test cases. However, this good performance may not be observed in more complex applications, since computation of a fixed point a* in Equation (6.7) is a rather delicate problem, and the indiscriminate use of polynomial interpolation may result in a poor approximation. As a matter of fact, there is relatively little theoretical work on the performance and asymptotic properties of this class of algorithms, and it seems that the following issues need further investigation:
(i) Existence: The non-linear system of equations characterizing a particular algorithm may not have a solution a*, and hence we may be unable to produce a reasonable approximation.
(ii) Multiplicity: There may be multiple solutions.
(iii) Computability: Even if there is a unique solution, the system of non-linear equations may not be amenable to computation using standard numerical techniques.
(iv) Accuracy: There is no theory on error bounds or accuracy properties for this family of algorithms.
A related discussion on the existence of a fixed point a* is contained in Marcet and Marshall (1994). These authors suggest the use of the Brouwer fixed-point theorem. To the best of our knowledge, at present there are neither rigorous proofs, nor well-known counterexamples, regarding the existence of a fixed point a*; moreover, it is hard to document whether existence has been a serious operational issue. Under our assumptions in Section 2, the previously discussed algorithm may generate multiple solutions. The multiplicity of solutions may signal the presence of singularities, which limit the range of methods for solving systems such as (6.7); however, multiplicity should not raise further logical concerns. If there are multiple solutions, one would be advised to select the one with the smallest Euler equation residuals (cf. Theorem 6.1, below); else, if all solutions exhibit small residuals, then the corresponding policy functions cannot be far apart from each other. Application of Newton-type methods for solving systems of non-linear equations involves inversions of matrices of derivatives, and these matrices may be singular (or nearly singular). Although Newton-type methods can attain quadratic orders of convergence to a fixed point a*, collinearities may result in inaccurate solutions or in rather small regions of convergence. Such difficulties may especially arise in complex stochastic models, with costly numerical integrations, and where a given level of accuracy may require fine grids or polynomials of high degree.
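A bare-bones Newton-Raphson iteration on G(a) = a − Φ^n(a), with a finite-difference Jacobian and a crude conditioning guard, illustrates where these difficulties enter (a generic sketch under our own naming; phi stands for any smooth map whose fixed point is sought):

```python
import numpy as np

def newton_fixed_point(phi, a0, tol=1e-12, max_iter=50, fd_step=1e-7):
    """Solve a = phi(a) by Newton's method applied to G(a) = a - phi(a).
    Aborts when the finite-difference Jacobian is nearly singular."""
    a = np.array(a0, dtype=float)
    n = a.size
    for _ in range(max_iter):
        G = a - phi(a)
        if np.max(np.abs(G)) < tol:
            return a
        J = np.empty((n, n))
        for j in range(n):                 # finite-difference Jacobian of G, column j
            e = np.zeros(n)
            e[j] = fd_step
            J[:, j] = ((a + e) - phi(a + e) - G) / fd_step
        if np.linalg.cond(J) > 1e12:       # collinearity: the Newton step is unreliable
            raise np.linalg.LinAlgError("nearly singular Jacobian")
        a = a - np.linalg.solve(J, G)
    return a
```

The conditioning test is exactly the place where, in delicate applications, one would switch to a homotopy method or to a coarser-grid initial guess.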
Regarding accuracy, there is no formal derivation of error bounds for the class of algorithms studied in this section. In this analysis two types of accuracy results are pertinent: (a) Accuracy of a given numerical scheme approximating the Euler equation. (b) Derivation of error bounds for the policy and value functions using the Euler equation residuals. Concerning point (a), in the absence of a good theory for bounding or estimating high-order derivatives of the policy function, it seems a difficult task to obtain tight error estimates for polynomial approximations, especially in stochastic models with several state variables. Likewise, stability properties [cf. Section 4.4] for algorithms using Newton's method will depend primarily on the condition number of the Jacobian matrix, and this analysis has to be directed specifically to each particular application. On the other hand, understanding how the size of the Euler equation residuals translates into approximation errors for the value and policy functions is not only essential to derive theoretical error bounds for the computed value and policy functions
under this family of algorithms, but it is also a key step for further implementational issues. For instance, evaluation of the residuals could allow us to assess the accuracy of competing numerical solutions, or the accuracy of a given numerical solution regardless of our confidence in the algorithm.
6.2. Accuracy based upon the Euler equation residuals

In practical applications, accuracy can be checked in simple test cases against analytical solutions, or against more reliable numerical methods. These indirect procedures may nevertheless be awkward or infeasible in some situations. That is, computing models with closed-form solutions is only illustrative of the performance of the algorithm in real applications, and the use of reliable numerical methods to test an algorithm involves further computational cost. We shall present here a recent result that allows us to bound the approximation error for any arbitrary solution. This analysis is based upon the computation of the Euler equation residuals. Computation of the residuals is a relatively easy task which involves functional evaluations, and hence it can be effected for arbitrarily large samples of points at a reasonable computational cost. In order to proceed more formally we need the following terminology. Let ξ be a measurable selection of the technology correspondence Γ. Define W_ξ as

W_ξ(k_0, z_0) = Σ_{t=0}^{∞} β^t ∫_{Z^t} v(ξ^t(k_0, z_0), ξ^{t+1}(k_0, z_0), z_t) μ^t(z_0, dz_t).
As before, ξ^t(k_0, z_0) = ξ(ξ(···(ξ(k_0, z_0), z_1)···, z_{t−2}), z_{t−1}) for every possible realization (z_1, z_2, …, z_{t−1}). The interpretation is that ξ is the computed policy function, and W_ξ is the resulting value function under the plan generated by ξ. The following result applies to every numerical solution ξ independently of the algorithm under which it may have been secured.

Theorem 6.1 [Santos (1999)]. Let ε > 0. Assume that

‖ D_2 v(k_0, ξ(k_0, z_0), z_0) + β ∫_Z D_1 v(ξ(k_0, z_0), ξ^2(k_0, z_0), z_1) Q(z_0, dz_1) ‖ ≤ ε    (6.8)

for all (k_0, z_0). Then, we have

(i) ‖W − W_ξ‖ ≤ C_1 ε²,  and  (ii) ‖g − ξ‖ ≤ C_2 ε,

for sufficiently small ε > 0, where the constants C_1 and C_2 depend only on β, η and L, as specified below. In plain words, this theorem asserts that the approximation error of the policy function is of the same order of magnitude as that of the Euler equation residuals, ε,
whereas the approximation error of the value function is of order ε². Furthermore, the constants supporting these orders of convergence only depend on: (a) the discount factor, β; (b) the minimal curvature of the return function, η, as specified in Assumption 2; and (c) the norm of the second derivative of the return function, L, as specified in Equation (3.7). Sharper error bounds may be obtained by invoking further properties of the solution [cf. Santos (1999), Th. 3.3]. To understand the nature of this result, it may be instructive to focus on an analogous analysis of the finite-dimensional case. Thus, assume that F : R^l → R is a strongly concave, C² mapping. Let

x* = arg max_{x ∈ R^l} F(x).

Then, the derivative DF(x*) = 0. Moreover, by virtue of the strong concavity of this function,

‖DF(x̂)‖ ≤ ε  implies  ‖x̂ − x*‖ ≤ Nε,    (6.9)

where the constant N is inversely related to the curvature of F, and so this estimate can be chosen independently of ε, for ε small enough. In addition, concavity implies that

F(x*) − F(x̂) ≤ DF(x̂)·(x* − x̂) ≤ ‖DF(x̂)‖ ‖x* − x̂‖ ≤ Nε²,    (6.10)

where the last inequality follows from inequality (6.9). Therefore, in the finite-dimensional case, there exists a constant N such that

F(x*) − F(x̂) ≤ Nε²  and  ‖x̂ − x*‖ ≤ Nε.    (6.11)
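These finite-dimensional estimates are easy to verify numerically. For instance, for the strongly concave quadratic F(x) = −a(x − x*)², the inequalities (6.9)-(6.11) hold with N = 1/(2a) (a toy check; the particular F is our own choice):

```python
import numpy as np

a_curv = 2.0                         # curvature: |F''| = 2 * a_curv
x_star = 1.5
F = lambda x: -a_curv * (x - x_star) ** 2
dF = lambda x: -2.0 * a_curv * (x - x_star)
N = 1.0 / (2.0 * a_curv)             # constant inversely related to the curvature of F

for x_hat in np.linspace(0.0, 3.0, 61):
    eps = abs(dF(x_hat))             # size of the "residual" at the candidate x_hat
    assert abs(x_hat - x_star) <= N * eps + 1e-12        # (6.9)
    assert F(x_star) - F(x_hat) <= N * eps ** 2 + 1e-12  # (6.10)-(6.11)
```

Here the first bound holds with equality, while the value gap is only N eps²/2, mirroring the extra order of accuracy obtained for the value function.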
Matters are not so simple in the infinite-horizon model, since asymptotically discounting brings down the curvature to zero, although it should be realized that the results established in Theorem 6.1 are weaker than those in Equation (6.11). That is, Theorem 6.1 provides a bound for the approximation error of the computed policy function k_1 = ξ(k_0, z_0), but not for the entire orbit {ξ^t}_{t=0}^∞. As is to be expected, we now have that the approximation error is not only influenced by the curvature of the return function v, but also by the discount factor, β. To understand in a more precise sense the influence of the discount factor β on these estimates, note that the approximation error of the value function is bounded by the discounted sum of expected future deviations ‖ξ^t(k_0, z_0) − g^t(k_0, z_0)‖ times
the maximum size of the Euler equation residuals [cf. inequality (6.10)]. Moreover, ‖ξ^t(k_0, z_0) − g^t(k_0, z_0)‖ can be bounded iteratively for each t ≥ 1 from the derivatives of the policy function g(k_0, z_0), and such derivatives have an asymptotic exponential growth factor no greater than 1/√β [cf. Equation (3.9)]. Therefore, a higher β widens our estimate of the approximation error of the value function from the Euler equation residuals, and allows for a higher asymptotic growth of the derivatives of the policy function. Accuracy checks based upon the size of the Euler equation residuals have been proposed by den Haan and Marcet (1994) and Judd (1992), even though neither of these authors provides error bounds for the computed value and policy functions. These estimates can be easily obtained from Theorem 6.1, since the constants involved in those orders of convergence can be calculated from primitive data of the model, and sharper error estimates may be obtained by invoking further properties of the solution. Theoretical error bounds, however, are usually fairly pessimistic. Accordingly, one may consider that a main contribution of Theorem 6.1 is to establish asymptotic orders of convergence for the computed value and policy functions, and with these results now available one may proceed to estimate numerically the constants associated with those convergence orders [cf. Santos (1999)]. While not as convincing as a complete analysis, the information gathered from a numerical assessment of the error may give us a more precise idea of our approximation for the specific model under consideration. The prevailing view in economics is that Euler equation residuals should be free from dimensionality or measurement units [cf. Judd (1992), p. 437]. Indeed, errors must be evaluated in light of all normalizations and scalings of the functions considered in the analysis, and unit-free measures of error are usually convenient and informative.
It should be understood, however, that any measure or elasticity of this sort will not provide a complete picture of our approximation, since as shown previously the accuracy of the residuals is tied to the curvature of the return function and to the discount factor. As will be argued in Section 9, numerical analysis offers an attractive framework for the study of economic models, and a crucial step in these computational experiments is to show that the errors involved are sufficiently small so as to make proper inferences about the true behavior of the economic model. Without further regard to the properties of the model, there is no hard and fast rule that will always ensure us of the validity of our computations. As is to be expected, the bounds in Equation (6.10) and Theorem 6.1 are in a certain sense invariant to a rescaling of the variables, since the curvature of the return function varies accordingly. In other words, after a change of units a suitable modification of the tolerance allowed for the Euler residuals yields the same level of accuracy for the computed value and policy functions [cf. Santos (1999)]. Of course, regardless of the scale specified in a numerical exercise, accuracy should be fixed to a level such that despite all the numerical errors, the conclusions drawn from our computations remain essentially true for the original economic model.
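To illustrate how cheaply the residuals can be evaluated, consider the familiar one-sector model with log utility, Cobb-Douglas technology and full depreciation, whose exact policy is k_{t+1} = αβA k_t^α. A unit-free residual check over a large sample of capital stocks costs two policy evaluations per point (a sketch; the 2% perturbation of the policy is our own test case, not taken from the text):

```python
import numpy as np

beta, alpha, A = 0.95, 0.34, 10.0

def g_exact(k):
    """Exact policy under log utility, Cobb-Douglas technology and delta = 1."""
    return alpha * beta * A * k**alpha

def euler_residual(k, policy):
    """Unit-free Euler residual 1 - beta * (c_t/c_{t+1}) * f'(k_{t+1}) for log utility."""
    k1 = policy(k)
    k2 = policy(k1)
    c0 = A * k**alpha - k1
    c1 = A * k1**alpha - k2
    return 1.0 - beta * (c0 / c1) * alpha * A * k1**(alpha - 1.0)

ks = np.linspace(0.5, 10.0, 1000)
res_exact = np.max(np.abs(euler_residual(ks, g_exact)))            # zero up to round-off
res_perturbed = np.max(np.abs(euler_residual(ks, lambda k: 1.02 * g_exact(k))))
```

The exact policy drives the residual to round-off level, while a 2% distortion of the policy shows up immediately as a residual of roughly the same relative magnitude.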
7. Some numerical experiments

The preceding algorithms will now be applied to some simple growth models. Most of our discussion will focus on the accuracy and computational cost of these numerical approximations. Our numerical computations were coded in standard FORTRAN 77, and run on a DEC Alpha 2100 (with a dual processor, each component rated at 250 MHz), which in double-precision floating-point arithmetic allows for sixteen-digit accuracy. Subject to this technological limit, our dynamic programming algorithm is in principle free from a fixed accuracy level. Accuracy is determined by the following parameter values: h, the mesh size; TOLW, the accuracy imposed in the iterative scheme: the program stops if two consecutive value functions W_n^h and W_{n+1}^h satisfy ‖W_{n+1}^h − W_n^h‖ ≤ TOLW; TOLI, the accuracy attained in integration; and TOLM, the accuracy attained in maximization. For the PEA-collocation algorithm of Christiano and Fisher (1994), the accuracy level hinges on the degree of polynomial interpolation, and the degree of these interpolants is chosen so that the Euler equation residuals are sufficiently small.
7.1. A one-sector deterministic growth model with leisure

In our first set of experiments we shall attempt to compute the value and policy functions for the simple growth model of Example 2.1. For δ = 1, these functions have analytical forms, and such forms will serve as a benchmark for our numerical results. For convenience, we again write the optimization problem

max_{{c_t, l_t, i_t}_{t=0}^∞} Σ_{t=0}^∞ β^t [λ log c_t + (1 − λ) log l_t]

subject to

c_t + i_t = A k_t^α (1 − l_t)^{1−α},
k_{t+1} = i_t + (1 − δ) k_t,    (7.1)
k_t, c_t ≥ 0,  0 ≤ l_t ≤ 1,
0 < β < 1,  0 < λ < 1,  0 < α < 1,  0 ≤ δ ≤ 1,  A > 0,  k_0 given,  t = 0, 1, 2, ….

As is well known, for δ = 1 the value function W(k_0) takes the simple form W(k_0) = B + C log k_0, where B and C are constants such that C = λα/(1 − αβ). Likewise, the policy function k_{t+1} = g(k_t) takes the form

k_{t+1} = αβ A k_t^α (1 − l)^{1−α},  with  l = (1 − λ)(1 − αβ) / [λ(1 − α) + (1 − λ)(1 − αβ)].

It follows that the system has a unique steady state, k* > 0, which is globally stable.
The existence of one state variable, k, and two controls, l and c, suggests that all numerical maximizations may be efficiently carried out with a unique choice variable. We then write the model in a more suitable form for our computations. The solution to the one-period maximization problem is determined by the following system of equations:

λ/c_0 = μ,    (1 − λ)(1 − l_0)^α / [l_0 A k_0^α (1 − α)] = μ,

where μ > 0 is the Lagrange multiplier. After some simple rearrangements, we obtain

c_0 = λ A k_0^α l_0 (1 − α) / [(1 − λ)(1 − l_0)^α].

Likewise,

k_1 = A k_0^α (1 − l_0)^{−α} [(1 − l_0) − λ l_0 (1 − α)/(1 − λ)].

The iterative process W_{n+1}^h = T_h(W_n^h) in (4.1) is then effected as follows:

W_{n+1}^h(k_0) = max_{l_0} { λ log[ λ A k_0^α l_0 (1 − α) / ((1 − λ)(1 − l_0)^α) ] + (1 − λ) log(l_0)
        + β W_n^h( A k_0^α (1 − l_0)^{−α} [(1 − l_0) − λ l_0 (1 − α)/(1 − λ)] ) }.    (7.2)

Although Equation (7.2) may appear more cumbersome than the original formulation (7.1), this form will prove more appropriate for our computations as it only involves maximization in one variable. We initially consider parameter values β = 0.95, λ = 1/3, A = 10, α = 0.34, δ = 1. For such values the stationary state is k* = 1.9696. For the purposes of this exercise, the domain of possible capitals, K, is restricted to the interval [0.1, 10]. Under these conditions it is then easy to check that Assumptions (1)-(3) are all satisfied. Over the feasible interval of capital stocks, [0.1, 10], we consider a uniform grid of points k^j with step size h. In this simple univariate case our interpolations yield concave, piecewise-linear functions. The maximization at vertex points k^j in Equation (4.1) is effected by Brent's algorithm [cf. Press et al. (1992)] with TOLM = 10^-8. Such a high precision should allow us to trace out the errors derived from other discretizations embedded in our algorithm. The computer program is
Table 3
Example 7.1. Computational method: dynamic programming algorithm with linear interpolation^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   Max. error in W
100             10^-1       91           3.81       5.31×10^-2        3.69×10^-2
1000            10^-2       181          73.46      5.76×10^-3        3.68×10^-4
10000           10^-3       271          1061.41    5.93×10^-4        3.80×10^-6

^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.
instructed to stop once two consecutive value functions W_{n+1}^h = T_h(W_n^h) satisfy the inequality

‖W_{n+1}^h − W_n^h‖ ≤ TOLW = h²/5.

Since T_h is a contractive operator with modulus 0 < β < 1, the fixed point W^h = T_h(W^h) in Equation (4.2) should then lie within a distance

‖W^h − W_n^h‖ ≤ h²/[5(1 − β)].

(As will be shown below, in this case the constant 1/5 balances roughly the truncation and approximation errors. Alternatively, this constant could be set in accordance with the estimates obtained at the end of our discussion of the multigrid algorithm in Section 5; in this example, both procedures yield similar values.) We start this numerical exercise with h = 10^-1 and the initial condition W_0 ≡ 0. In computing the approximate value function W^h for h = 10^-1 the program stops after n = 91 iterations with a reported CPU time of 3.81 seconds. We then proceed in the same manner with h = 10^-2 and h = 10^-3, taking as initial condition W_0 = 0. All the calculations are reported in Table 3, which includes the maximum observed error for the value and policy functions. One can see that the constant stemming from our numerical computations associated with the quadratic convergence of the approximation error of the value function takes values around 3.8, whereas the corresponding constant for the linear convergence of the approximation error of the policy function takes values around 0.59. Both functions converge as predicted by our error analysis. (The relatively small value for the constant associated with the error of the computed policy function seems to be due to the simple structure of the optimization problem.) The evolution of these errors suggests that most of the computational effort is spent in building up the value function for the infinite-horizon optimization problem. Consequently, better initial guesses or more direct procedures for computing the value function may considerably lower the CPU time.
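The loop just described can be replicated in a few lines. The following sketch (our own Python reimplementation, not the authors' FORTRAN 77 program; the grid sizes, the leisure grid and the stopping tolerance are illustrative choices) iterates (7.2) under piecewise-linear interpolation, replacing Brent's method by a search over a discretized leisure grid, and compares the computed policy with the closed form available for δ = 1:

```python
import numpy as np

beta, lam, A, alpha = 0.95, 1/3, 10.0, 0.34      # parameter values of the exercise (delta = 1)

k_grid = np.linspace(0.5, 10.0, 300)             # capital grid (a subinterval of [0.1, 10])
l_grid = np.linspace(0.05, 0.74, 600)            # leisure grid; k_1 > 0 requires l < 0.752 here

K, L = np.meshgrid(k_grid, l_grid, indexing="ij")
y = A * K**alpha
C0 = lam * (1 - alpha) * y * L / ((1 - lam) * (1 - L)**alpha)                 # consumption
K1 = y * (1 - L)**(-alpha) * ((1 - L) - lam * (1 - alpha) * L / (1 - lam))    # next capital
U = lam * np.log(C0) + (1 - lam) * np.log(L)                                  # one-period return

W = np.zeros_like(k_grid)
for it in range(2000):
    # piecewise-linear interpolation of W at K1 (np.interp clamps at the grid ends)
    Q = U + beta * np.interp(K1.ravel(), k_grid, W).reshape(K1.shape)
    W_new = Q.max(axis=1)
    if np.max(np.abs(W_new - W)) < 1e-6:
        W = W_new
        break
    W = W_new

# computed policy versus the closed form k' = alpha*beta*A*k^alpha*(1-l)^(1-alpha)
Q = U + beta * np.interp(K1.ravel(), k_grid, W).reshape(K1.shape)
g_num = K1[np.arange(k_grid.size), Q.argmax(axis=1)]
l_star = (1 - lam) * (1 - alpha * beta) / (lam * (1 - alpha) + (1 - lam) * (1 - alpha * beta))
g_true = alpha * beta * A * k_grid**alpha * (1 - l_star)**(1 - alpha)
```

On a modest grid the computed policy already tracks the closed form to within a few percent, in line with the linear convergence reported in Table 3.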
Table 4
Example 7.1. Upper estimates for the observed error^a

Domain for k_0   M          M + 1/[5(1−β)]   M̃          M̃ + 1/[5(1−β)]
[1, 10]          1.6740     5.6740           1.9456     5.9456
[0.5, 10]        6.6962     10.6962          7.7824     11.7824
[0.1, 10]        167.4051   171.4052         194.5609   198.5609
^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.

In order to compare these numerical estimates with our previous theoretical analysis we first decompose the approximation error for the value function as suggested in Equation (5.2). Thus,

|W(k) − W^h(k)| ≤ M h²

is the error resulting from our numerical algorithm, and

|W^h(k) − W_n^h(k)| ≤ h²/[5(1 − β)]

is the error resulting from stopping the iteration process in finite time. The right-hand side of this inequality is equal to 4h². The constant M = γ/[2(1 − β)] from Theorem 4.3 depends on the maximum value γ of the second-order derivative of W, and such derivative gets unbounded at the origin. Since points near the origin are never visited, in particular applications one is more interested in computing the value and policy functions over a significant domain containing the asymptotic dynamics. This allows one to get more operational estimates for M. In Table 4 we list values for M over several alternative domains. For instance, the interval [1, 10] almost includes the point ½k*, and here M = 1.6740. Therefore, in this case the upper estimate [M + 1/[5(1 − β)]]h² from Equation (4.1) is equal to 5.6740 h², whereas the observed error is bounded by 4h². Consequently, the observed error falls into the range imposed by our theoretical analysis. In situations where the value function does not feature a closed-form solution, one can use instead the alternative estimate M̃, derived from our upper bound (3.7) of the second-order derivative of W in terms of primitive data of the model. These values are also listed in Table 4. This difference between the observed error and our upper estimates is something to be expected in practice, since our results are meant to bound the error on a worst-case basis. We also considered further variations of the preceding exercise that we now summarize:
(1) Two-dimensional maximization: In the spirit of problem (7.1), all iterations could alternatively involve a maximization over two variables (say l and k). For this case, our computations (see Table 5) show that the additional times reported are always beyond one-third of those in Table 3. The two-dimensional maximization was effected by the routine min_con_gen_lin of IMSL.
Table 5
Example 7.1. Computational method: dynamic programming algorithm with linear interpolation and with two-variable maximization^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   Max. error in W
100             10^-1       91           11.19      5.31×10^-2        2.69×10^-2
1000            10^-2       181          222.20     5.76×10^-3        3.68×10^-4
10000           10^-3       271          3155.24    5.96×10^-4        3.80×10^-6

^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.

Table 6
Example 7.1. Computational method: dynamic programming algorithm with linear interpolation^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   Max. error in W
100             10^-1       460          18.71      4.77×10^-2        1.980×10^-1
1000            10^-2       920          378.35     5.57×10^-3        1.949×10^-3
10000           10^-3       1379         5367.44    5.97×10^-4        1.900×10^-5
^a Parameter values: β = 0.99, λ = 1/3, A = 10, α = 0.34 and δ = 1.

(2) Discount factors β close to 1: Our theoretical analysis suggests that as β approaches 1, the constants involved in the orders of convergence may become unbounded. For β = 0.99 and all the above parameter values, the constant of the approximation error of the value function goes up to 20, which is over a 5-fold increase with respect to the preceding figure. This is roughly the ratio, [1/(1 − 0.99)]/[1/(1 − 0.95)] = 5, predicted by our error analysis, upscaled by a further increase in the second-order derivative of function W [i.e., W(k) = B + C log k with C = λα/(1 − αβ)]. Hence, one should expect these constants to become unbounded as β approaches unity. Table 6 reports in an analogous way further information regarding this numerical experiment with β = 0.99.
(3) Multigrid methods: For multigrid methods, we have considered the following simple variation of the method of successive approximations: Start with h_1 = 10^-1, and take W^{h_1} as the initial guess for the iteration process with grid level h_2 = 10^-2. Then, take W^{h_2} as the initial value to start the iterations for grid level h_3 = 10^-3. This procedure leads to considerable speedups, and the CPU time gets down to one half (cf. Tables 6 and 7).
(4) Higher-order approximations: Instead of the space of piecewise linear functions, one could focus on alternative finite-dimensional functional spaces involving higher-order interpolations or spline functions. The original numerical experiment in Table 3 is now replicated in Table 8, using cubic splines, and in Table 9, using shape-preserving splines (i.e., quadratic splines that preserve monotonicity and
Table 7
Example 7.1. Computational method: multigrid with linear interpolation^a

Vertex points   Mesh size   Iterations   CPU time
1000            10^-2       460          210.67
10000           10^-3       932          1802.19

^a Parameter values: β = 0.99, λ = 1/3, A = 10, α = 0.34 and δ = 1.

Table 8
Example 7.1. Computational method: dynamic programming algorithm with cubic spline interpolation^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   Max. error in W
100             10^-1       181          9.59       3.61×10^-4        6.13×10^-5
1000            10^-2       361          201.41     1.74×10^-6        3.45×10^-8
10000           10^-3       543          3630.06    1.74×10^-6        8.41×10^-11

^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.

Table 9
Example 7.1. Computational method: dynamic programming algorithm with shape-preserving spline interpolation^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   Max. error in W
100             10^-1       136          6.99       1.51×10^-3        3.65×10^-3
1000            10^-2       271          141.03     1.98×10^-5        3.59×10^-6
10000           10^-3       540          2951.07    1.88×10^-6        4.42×10^-10
^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.

concavity). One can observe that cubic splines have roughly the same performance as shape-preserving splines, since the functions to be approximated are relatively simple and the various grids considered are relatively fine. The possible negative effects on numerical maximization from the loss of concavity under cubic splines seem to be minor; furthermore, cubic splines yield better approximations, and have a lower implementation cost, as no effort is spent in checking concavity^15. From Tables 3, 8 and 9, we can also observe that splines yield better accuracy results per work expended than linear interpolations. For instance, we can see that one thousand points under spline interpolation (second column in Table 8) lead to much better error estimates than ten thousand points under linear interpolations
^15 For cubic splines TOLW is set to h⁴/5, and for shape-preserving splines TOLW is set to h³/5.
Table 10
Example 7.1. Computational method: policy iteration with linear interpolation^a

Vertex points   Mesh size     Iterations   CPU time   Max. value of L   Max. error in W   Max. error in g
100             1.00×10^-1    6            0.51       131.7472          4.08×10^-4        5.9741×10^-2
300             3.31×10^-2    6            7.84       1318.7690         9.00×10^-5        2.9869×10^-2
750             1.32×10^-2    8            228.93     126.9136          6.00×10^-6        9.3287×10^-3
1000            9.90×10^-3    11           963.73     59943.0753        6.00×10^-6        5.7674×10^-3
3000            3.30×10^-3    9            28197.48   218854.7858       1.00×10^-6        2.2403×10^-3
^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.

Table 11
Example 7.1. Computational method: modified policy iteration with linear interpolation^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   m    Max. error in W
100             10^-1       6            0.5037     6.24×10^-2        65   9.42×10^-4
1000            10^-2       8            7.2365     4.32×10^-3        65   1.31×10^-5
10000           10^-3       12           146.30     3.95×10^-4        65   9.39×10^-7
^a Parameter values: β = 0.95, λ = 1/3, A = 10, α = 0.34 and δ = 1.

(third column in Table 3). Moreover, the computational cost under cubic spline interpolation is lower: the number of vertex points has been reduced by a factor of 10, whereas for each vertex point the time cost of spline interpolation (instead of piecewise linear interpolation) goes up by less than a factor of 2. Consequently, the outcome of these exercises is that splines outperform linear interpolations by factors of 5-50, and these gains may increase exponentially for finer grids.
(5) Policy iteration: As previously argued, policy iteration becomes less attractive for fine grids, since the region of quadratic convergence may get smaller and the computational burden associated with inverting large matrices increases considerably. These results are confirmed in Table 10. It can be seen that the constant
L = ‖W^h − W_{n+1}^h‖ / ‖W^h − W_n^h‖²

(associated with the quadratic order of convergence of the algorithm) increases as the grid is refined, which leads to further iterations. However, for fine grids the major cost incurred seems to lie in the inversion of large matrices. Comparison of Tables 3 and 10 shows that policy iteration dominates dynamic programming for small and medium-sized grids, but it has a much worse performance for grids beyond 1000 points.
Table 12
Example 7.1. Computational method: modified policy iteration with linear interpolation^a

Vertex points   Mesh size   Iterations   CPU time   Max. error in g   m     Max. error in W
100             10^-1       8            1.63       2.76×10^-2        250   1.44×10^-3
1000            10^-2       9            20.26      3.81×10^-3        250   8.31×10^-4
10000           10^-3       13           319.63     4.13×10^-4        250   5.56×10^-6
^a Parameter values: β = 0.99, λ = 1/3, A = 10, α = 0.34 and δ = 1.

(6) Modified policy iteration: Considerable gains were also attained with modified policy iteration (cf. Tables 3 and 6, and Tables 11 and 12); indeed, in some cases the iteration process stops ten times faster. For this simple example, we found that the optimal number of compositions on the policy function (i.e., the number m in Section 5.3) was of the order of m = 65 for β = 0.95, and m = 250 for β = 0.99. Of course, an optimal m is also bound to depend on the grid size, and further savings are obtained when modified policy iteration is combined with multigrid or with spline interpolation.
(7) PEA-collocation: The PEA-collocation method of Christiano and Fisher (1994) yields in this case the exact solution, for x = 2(log k − log k̲)/(log k̄ − log k̲) − 1. Hence, this example does not seem to be an appropriate application of this method. To this end, we shall consider a chaotic law of motion in the next example.
(8) Depreciation parameter, δ < 1: Under full depreciation, δ = 1, the model has an analytical solution, and this becomes handy to compute exactly the approximation error. However, the optimal rule is rather simple, since the variable l remains constant along the optimal solution. To make sure that this property was not affecting the performance of our algorithms, the above computational experiments were replicated for parameterizations δ = 0.05 and δ = 0.10, with no noticeable change in the reported CPU times. As expected, though, the computational error of the policy function went up in all cases. (This error was estimated numerically, considering as the true solution that of a sufficiently fine approximation.)
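In schematic form, modified policy iteration interleaves one maximization step with m cheap evaluation sweeps of the current policy (a generic Python sketch for a deterministic problem discretized to a finite choice set; the arrays U and succ encoding one-period returns and successor states are hypothetical inputs, and m = 65 merely echoes the value found above for β = 0.95):

```python
import numpy as np

def modified_policy_iteration(U, succ, beta, m=65, tol=1e-10, max_outer=1000):
    """U[j, i]    : one-period return of choice i at state j
    succ[j, i] : index of the successor state under choice i
    m          : number of compositions on the fixed policy per greedy step."""
    n = U.shape[0]
    rows = np.arange(n)
    W = np.zeros(n)
    for _ in range(max_outer):
        Q = U + beta * W[succ]              # maximization (greedy) step
        policy = Q.argmax(axis=1)
        W_new = Q[rows, policy]
        u_pol, s_pol = U[rows, policy], succ[rows, policy]
        for _ in range(m):                  # m cheap evaluation sweeps, no matrix inversion
            W_new = u_pol + beta * W_new[s_pol]
        if np.max(np.abs(W_new - W)) < tol:
            return W_new, policy
        W = W_new
    return W, policy
```

Relative to full policy iteration, the m sweeps replace the inversion of a large matrix, which is what produces the savings documented in Tables 11 and 12.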
7.2. A one-sector chaotic growth model

In order to apply the PEA-collocation algorithm, we consider a deterministic example with a chaotic law of motion. Both the return function and the policy function are polynomials, and hence the PEA-collocation algorithm is expected to perform well. The example is taken from Boldrin and Montrucchio (1986). Here, the graph of the technological correspondence Ω is the (two-dimensional) square [0, 1] × [0, 1]. The return function v : Ω → R is defined as

v(k_t, k_{t+1}) = 150k_t − 24k_t² + 4k_t k_{t+1}(1 − k_t) − 1.608k_{t+1} − 0.32848k_{t+1}² + 0.17152k_{t+1}³ − 0.08576k_{t+1}⁴,
Ch. 5: Numerical Solution of Dynamic Economic Models
363
Table 13
Chaotic growth model (Section 7.2). Computational method: dynamic programming algorithm with linear interpolation

Vertex points  Iterations  CPU time  Max. error in g  Max. Euler eq. residuals
100            5           0.1142    2.7438×10⁻³      2.7726×10⁻³
1000           6           0.9487    2.9135×10⁻⁴      2.9091×10⁻⁴
10000          7           10.8873   2.9134×10⁻⁵      2.9134×10⁻⁵
20000          7           21.7121   1.4603×10⁻⁵      1.4537×10⁻⁵
Table 14
Chaotic growth model (Section 7.2). Computational method: dynamic programming algorithm with shape-preserving spline interpolation

Vertex points  Iterations  CPU time  Max. error in g  Max. Euler eq. residuals
100            6           0.29      3.41×10⁻⁵        3.27×10⁻⁵
1000           7           3.26      4.38×10⁻⁷        4.40×10⁻⁷
10000          9           44.11     1.97×10⁻⁷        1.77×10⁻⁷
20000          10          108.41    1.92×10⁻⁷        1.89×10⁻⁷
Table 15
Chaotic growth model (Section 7.2). Computational method: PEA-collocation

Vertex points  CPU time  Max. error in g  Max. Euler eq. residuals
3              0.1112    4.5305×10⁻²      4.6267×10⁻²
5              0.4958    4.0453×10⁻²      4.0704×10⁻²
8              0.4089    9.4166×10⁻⁴      9.4413×10⁻⁴
12             1.2248    2.1193×10⁻⁴      2.1155×10⁻⁴
15             2.4517    1.8828×10⁻⁴      1.8664×10⁻⁴
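As derived below, the optimal policy of this example is the chaotic map g(k) = 4k(1 − k) on the unit interval. A few lines of Python, with arbitrary initial stocks of our own choosing, illustrate why accurate approximations matter here: two nearby initial conditions separate rapidly under iteration of the policy.

```python
import numpy as np

# Iterate the chaotic policy g(k) = 4k(1 - k) from two nearby initial
# capital stocks (the starting values are arbitrary illustrations).
def iterate(k0, T):
    path = np.empty(T + 1)
    path[0] = k0
    for t in range(T):
        path[t + 1] = 4.0 * path[t] * (1.0 - path[t])
    return path

a_path = iterate(0.2, 50)
b_path = iterate(0.2 + 1e-8, 50)   # perturbed by 10^-8
gap = np.abs(a_path - b_path)      # grows roughly like 2^t until O(1)
```

The map sends [0, 1] into itself, so both trajectories remain feasible, yet an initial discrepancy of 10⁻⁸ grows to order one within a few dozen periods.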
a n d the d i s c o u n t f a c t o r , / 3 = 0.01072. U n d e r this specification, the p o l i c y f u n c t i o n g : [0, 1] ---+ [0, 1] is g i v e n b y t h e c h a o t i c m a p 4k(1 - k) t h a t g o e s f r o m the u n i t i n t e r v a l into itself. C o m p u t a t i o n o f this m o d e l u s i n g o u r p r e v i o u s d i s c r e t i z e d d y n a m i c p r o g r a m m i n g a l g o r i t h m is a relatively e a s y task, since t h e d i s c o u n t f a c t o r / 3 is r a t h e r low; h e n c e , the c o n t r a c t i o n p r o p e r t y o f t h e o p e r a t o r is u n u s u a l l y p r o n o u n c e d . Indeed, w e c a n see f r o m Tables 13 a n d 14 t h a t it t a k e s s e v e n v a l u e - f u n c t i o n iterations to r e a c h a c c u r a c y levels for t h e p o l i c y f u n c t i o n o f o r d e r 10 -4, w i t h r e p o r t e d C P U t i m e s less t h a n 1 m i n u t e . A s i l l u s t r a t e d in Table 15, s u c h a c c u r a c y levels c a n also b e a c h i e v e d w i t h the P E A -
364
M.S. Santos
collocation method of Christiano and Fisher (1994) with polynomials of degree 14. Moreover, even in this example with small discounting, PEA-collocation seems to be faster ~6 In all these computations, the good performance of the PEA-collocation algorithm is partly due to the following successful implementation of the multigrid method: The algorithm is coded so that the computed solution for a given grid size is taken as the initial guess for the subsequent refinement. Thus, the fixed point of the 3-vertexpoint grid is used as the initial guess for the computed solution of the 5-vertex-point grid, which in turn is the initial guess for the computed solution of the 8-vertex-point grid, and so forth. (Our guess for the initial 3-vertex-point grid was derived from a linearization of the Euler equation at the unique interior steady-state value.) This multigrid-type iterative procedure yields faster and more reliable outcomes than the simple PEA-collocation algorithm, since the computed solution from a previous grid is generally the best available starting point for the next round of computations. In their last columns, Tables 13-15 report the m a x i m u m size of the Euler equation residuals associated with the computed policy function. From Theorem 6.1, the size of the residuals should be o f the same order of magnitude as the error of the computed policy function. This pattern is actually confirmed in all the tables. Indeed, the constants associated with these orders of convergence are close to 1 in all three cases. 7.3. A o n e - s e c t o r s t o c h a s t i c g r o w t h m o d e l with leisure
We next consider the stochastic version of the growth model presented in Example 2.2 for a parameterization in which the value and policy functions retain exact analytical forms. The problem is written as

max_{{c_t, l_t, i_t}_{t=0}^∞}  E_0 Σ_{t=0}^∞ β^t [λ log c_t + (1 − λ) log l_t]

subject to

c_t + i_t = z_t A k_t^a (1 − l_t)^{1−a},
k_{t+1} = i_t + (1 − δ)k_t,                                        (7.3)
log z_{t+1} = ρ log z_t + ε_{t+1},
0 < β < 1,  0 < λ ≤ 1,  A > 0,  0 < a < 1,  0 ≤ δ ≤ 1,  0 ≤ ρ < 1,
k_t, c_t ≥ 0,  0 ≤ l_t ≤ 1,  k_0 and z_0 given,  t = 0, 1, 2, ….
¹⁶ In the implementation of the algorithm, variable x was defined as x = 2(k − k̲)/(k̄ − k̲) − 1. The alternative implementation discussed above, x = 2(log k − log k̲)/(log k̄ − log k̲) − 1, requires k > 0.
where ε_t is an i.i.d. process with zero mean. For δ = 1, the value function W has an analytical form given by

W(k_0, z_0) = B + C log k_0 + D log z_0,

where

C = λa/(1 − aβ),    D = λ/[(1 − aβ)(1 − ρβ)].

Also, as previously, the optimal policy is a constant fraction of total production,

k_{t+1} = aβ z_t A k_t^a (1 − l_t)^{1−a}

with

l = (1 − λ)(1 − aβ) / [λ(1 − a) + (1 − λ)(1 − aβ)].
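Under the calibration used below (β = 0.95, λ = ½, A = 10, a = 0.34, ρ = 0.9), these closed-form objects are easy to evaluate and sanity-check. The Python sketch below verifies two fixed-point relations that the coefficients must satisfy by the Bellman equation, C = λa + aβC and D = λ + βC + βρD, and confirms that the constant leisure level lies strictly inside (0, 1); the fixed-point relations are our restatement of the derivation, not formulas printed in the text.

```python
# Closed-form objects of the model under full depreciation (delta = 1).
beta, lam, A, a, rho = 0.95, 0.5, 10.0, 0.34, 0.9

C = lam * a / (1.0 - a * beta)
D = lam / ((1.0 - a * beta) * (1.0 - rho * beta))
l = ((1.0 - lam) * (1.0 - a * beta)
     / (lam * (1.0 - a) + (1.0 - lam) * (1.0 - a * beta)))

def policy(k, z):
    """Optimal next-period capital: a constant fraction a*beta of production."""
    return a * beta * z * A * k**a * (1.0 - l)**(1.0 - a)
```

Because investment is the fraction aβ < 1 of production, consumption stays positive for any state with k > 0, which is what makes this parameterization a clean accuracy benchmark.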
We fix parameter values, β = 0.95, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.90. Also, we restrict the feasible domain so that k ∈ [0.1, 10], ε ∈ [−0.032, 0.032], and z is such that log z ∈ [−0.32, 0.32]. The random process ε comes from a normal distribution, where the density has been rescaled in order to get a cumulative mass equal to unity. As in Prescott (1986) we assume a standard deviation σ_ε = 0.008. Observe then that the end-points of the domain of variable ε are four standard deviations away from the mean. One can again check that under these restrictions Assumptions (1)-(4) are satisfied. Indeed, in this simple case one can show that the model has a globally stable invariant distribution. As the random shock has a small variance, all paths eventually fluctuate around the point (k*, z̄) = (1.9696, 1), where k* is the steady-state value of the deterministic model and z̄ is roughly the unconditional mean of the random process. Consequently, as in Section 7.1, to estimate the value M of Theorem 4.3 it is reasonable to restrict ourselves to a certain domain containing the ergodic set, such as Γ = {(k, z) | ½ ≤ k ≤ 10, e⁻⁰·³² ≤ z ≤ e⁰·³²}. This set is large enough to encompass most plausible economic applications using this framework, and here Mh² = 64.9374h². As in the preceding example, to this estimate we should add the other component of the observed error concerning the fact that the iteration process is stopped in finite time.

Over the feasible domain of state variables we set out a uniform grid of vertex points (kʲ, zʲ) with mesh size h. Our numerical procedure then follows the iterative process specified in Equation (4.2) with an initial value W_0 ≡ 0. As in Section 7.1, the algorithm is written so that only unidimensional maximizations need to be considered. Thus, each iteration proceeds as follows:

W^h_{n+1}(k_0, z_0) = max_{l_0} { λ log[ λ(1 − a) z_0 A k_0^a (1 − l_0)^{1−a} l_0 / ((1 − λ)(1 − l_0)) ] + (1 − λ) log l_0
                      + β ∫ W^h_n( z_0 A k_0^a (1 − l_0)^{−a} [ (1 − l_0) − λ(1 − a) l_0/(1 − λ) ], z_1 ) Q(z_0, dz_1) }.    (7.4)

All the integrations have been carried out under the subroutines qsimp and qtrap, as specified in Press et al. (1992, Sect. 2.4), with TOLI = 10⁻⁹. These subroutines follow
Table 16
Example 7.3. Computational method: dynamic programming algorithm with linear interpolation ᵃ

Vertex points  Mesh size  Iterations  CPU time  Max. error in g  Max. error in W
43 × 3         0.3872     10          1.44      1.12×10⁻¹        2.61
143 × 9        10⁻¹       57          87.49     3.63×10⁻²        2.41×10⁻¹
500 × 33       0.0282     108         2198.95   1.06×10⁻²        6.03×10⁻²

ᵃ Parameter values: β = 0.95, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.
Table 17
Example 7.3. Computational method: dynamic programming algorithm with linear interpolation ᵃ

Vertex points  Mesh size  Iterations  CPU time  Max. error in g  Max. error in W
43 × 3         0.3872     25          3.98      1.27×10⁻¹        10.57
143 × 9        10⁻¹       286         445.89    3.68×10⁻²        1.11
500 × 33       0.0282     550         9839.58   1.06×10⁻²        1.56×10⁻¹
ᵃ Parameter values: β = 0.99, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.

an N-stage refinement of an extended trapezoidal rule, and as argued in this treatise, they are fairly efficient for the integration of relatively simple problems. Again, for univariate numerical maximizations we have employed Brent's algorithm with tolerance level TOLM = 10⁻⁸. The iteration process in Equation (7.4) stops when

‖W^h_{n+1} − W^h_n‖ ≤ TOLW = h².
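The conditional expectation inside Equation (7.4) can be sketched in a few lines: quadrature over the truncated, rescaled normal innovation combined with interpolation of W^h_n. In the Python fragment below the grid sizes, the trapezoidal-style quadrature and the stand-in function for W^h_n are our own illustrative choices, not the chapter's exact implementation.

```python
import numpy as np

rho, sigma = 0.9, 0.008
kgrid = np.linspace(0.1, 10.0, 50)                 # vertex points in k
zgrid = np.exp(np.linspace(-0.32, 0.32, 9))        # vertex points in z

# truncated normal on [-0.032, 0.032], rescaled to unit mass as in the text
eps = np.linspace(-0.032, 0.032, 41)
w = np.exp(-0.5 * (eps / sigma) ** 2)
w /= w.sum()

def expected_value(W, z0):
    """E[W(k_j, z') | z0] at every vertex k_j, with log z' = rho*log z0 + eps."""
    z_next = z0**rho * np.exp(eps)
    EW = np.empty(kgrid.size)
    for j in range(kgrid.size):
        # one-dimensional interpolation of W(k_j, .) over z only
        EW[j] = np.sum(w * np.interp(z_next, zgrid, W[j, :]))
    return EW      # then interpolate EW over k to evaluate at arbitrary k'

# stand-in for W_n: an additively separable test function
W = np.log(kgrid)[:, None] + np.log(zgrid)[None, :]
EW = expected_value(W, 1.0)
```

For this separable test function and z_0 = 1, the expectation reduces to log k up to interpolation error, which gives a quick correctness check before the routine is used inside the maximization.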
It should be noted that for the implementation of the algorithm (but not for the final calculation of the approximation errors) all interpolations need only be unidimensional. That is, although W^h_n in Equation (7.4) is defined over a grid in a two-dimensional space, a unidimensional interpolation over z allows us to compute the integral in Equation (7.4) for given kʲ. Then, the integral values are interpolated over k to define a continuous objective for the univariate maximization.

Table 16 presents information on our numerical experiment for several values of h. It can be observed that the error term is always bounded by 24h². Hence, the constant stemming from our computations is bounded above by 24, whereas our estimate of the observed error,

e_h(k, z) = |W(k, z) − W^h_n(k, z)| ≤ |W(k, z) − W_h(k, z)| + |W_h(k, z) − W^h_n(k, z)|
          ≤ 64.397h² + 20h² ≤ 84.397h².                                   (7.5)

We again emphasize that this is a result to be expected in particular applications, since these estimates are by construction rough upper bounds of the maximum
Table 18
Example 7.3. Computational method: multigrid with linear interpolation ᵃ

Vertex points  Mesh size  Iterations  CPU time
143 × 9        10⁻¹       158         220.70
500 × 33       0.0282     341         6370.53

ᵃ Parameter values: β = 0.99, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.
Table 19
Example 7.3. Computational method: dynamic programming algorithm with linear interpolation and with two-variable maximization ᵃ

Vertex points  Mesh size  Iterations  CPU time  Max. error in g  Max. error in W
43 × 3         0.3872     10          2.69      1.12×10⁻¹        2.61
143 × 9        10⁻¹       57          160.47    3.63×10⁻²        0.241
500 × 33       0.0282     108         4130.08   1.06×10⁻²        6.03×10⁻²
ᵃ Parameter values: β = 0.95, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.

approximation error over the entire infinite horizon, and include points fairly distant from the ergodic set.

As in the preceding example, we considered alternative numerical experiments for several values of β, and replicated the original computations under the multigrid method. Table 17 reports the corresponding numerical results for discount factor β = 0.99. As is to be expected, the constants involved in the orders of convergence are about five times larger. Likewise, Table 18 replicates the computations of Table 17 under the multigrid method. It can be seen that the required CPU time gets down roughly to two thirds of that of the original experiment. These computational costs seem very reasonable, as compared to similar, rougher procedures for this basic problem [e.g., Christiano (1990), Coleman (1990) and Tauchen (1990)]. We have also carried out the numerical experiment under the original formulation (2.4) in a two-dimensional maximization framework. As shown in Table 19, the time load doubles in all three cases.

In view of the gains obtained by splines in the preceding examples, the value and policy functions of our stochastic model were also computed by the dynamic programming algorithm with spline interpolation¹⁷, with significant savings in computing time (cf. Tables 20 and 21). Indeed, considering accuracy achieved per work expended, multigrid with spline interpolation outperforms the dynamic programming
17 For fine grids, the best performance was observed in experiments with shape-preserving spline interpolation over variable k, and cubic spline interpolation over variable z.
Table 20
Example 7.3. Computational method: dynamic programming algorithm with spline interpolation ᵃ

Vertex points  Mesh size  Iterations  CPU time  Max. error in g  Max. error in W
43 × 3         0.3872     28          7.02      1.92×10⁻¹        9.58×10⁻¹
143 × 9        10⁻¹       130         342.89    3.11×10⁻³        4.99×10⁻³
500 × 33       0.0282     234         8162.89   3.82×10⁻⁴        2.42×10⁻⁵
ᵃ Parameter values: β = 0.95, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.

Table 21
Example 7.3. Computational method: multigrid algorithm with spline interpolation ᵃ

Vertex points  Mesh size  Iterations  CPU time  Max. error in g  Max. error in W
143 × 9        10⁻¹       112         290.05    3.11×10⁻³        4.82×10⁻³
500 × 33       0.0282     170         5935.71   3.82×10⁻⁴        2.46×10⁻⁵
ᵃ Parameter values: β = 0.95, λ = ½, A = 10, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.

algorithm with linear interpolation by factors of 100-1000 for the finest grids. Thus, using our computational facilities, it would have been practically unfeasible to get tolerance levels for the value function of order 10⁻⁶, whereas it took less than three hours for the dynamic programming algorithm with cubic splines. Finally, the PEA-collocation algorithm of Christiano and Fisher was not considered in this case, since for our parameterization with full depreciation this method yields the exact solution. This algorithm will be tested in the next section, where we attempt to solve an economy with more plausible depreciation values. As evaluated by the Euler equation residuals, in all our numerical experiments PEA-collocation is much faster than the dynamic programming algorithm, and can achieve high accuracy levels.
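The warm-start logic behind the multigrid entries of Tables 18 and 21 is simple to reproduce. The Python sketch below applies it to a stripped-down deterministic model (log utility, full depreciation, no leisure — our simplification), comparing the iteration count of a cold-started value iteration on a fine grid with one initialized from an interpolated coarse-grid solution; grid sizes and tolerances are illustrative.

```python
import numpy as np

beta, A, a = 0.95, 10.0, 0.34

def value_iteration(n, V0_grid, V0, tol=1e-6):
    """Discretized value iteration on an n-point grid, warm-started from a
    value function V0 given on V0_grid (carried over by linear interpolation)."""
    grid = np.linspace(0.1, 10.0, n)
    V = np.interp(grid, V0_grid, V0)
    c = A * grid[:, None]**a - grid[None, :]
    util = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)
    iterations = 0
    while True:
        V_new = (util + beta * V[None, :]).max(axis=1)
        iterations += 1
        if np.max(np.abs(V_new - V)) < tol:
            return grid, V_new, iterations
        V = V_new

zero_grid = np.array([0.1, 10.0])
# cold start: fine grid from scratch
_, V_cold, it_cold = value_iteration(400, zero_grid, np.zeros(2))
# multigrid: coarse solve first, then warm-start the fine grid
cg, V_coarse, _ = value_iteration(40, zero_grid, np.zeros(2))
_, V_warm, it_warm = value_iteration(400, cg, V_coarse)
```

The warm-started run needs noticeably fewer fine-grid iterations, which is where the CPU savings reported in Table 18 come from; both runs converge to the same discretized fixed point.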
8. Quadratic approximations

An old and popular strategy for the simulation of solutions is to approximate the model with a one-period quadratic return function and linear feasibility constraints. This approach reduces substantially the computational complexity, since the Euler equation and the policy function are linear in the state variables. Thus, it is possible to compute quadratic models in many dimensions, and this procedure is very convenient whenever the approximation is accurate. Computational algorithms for solving quadratic optimization problems have been recently surveyed in Amman (1996) and Anderson et al. (1996).

Several quadratic approximation methods have been proposed in the literature, and an important issue is the accuracy of these computations. For our original optimization
problem (2.1), a standard approach would be to take a second-order Taylor's expansion of the one-period return function v(k_t, k_{t+1}, z_t) at a given point (k̄, k̄, z̄), and then maximize over the new objective. A typical optimization problem would thus be expressed as follows:

max_{{x_t}_{t=0}^∞}  Σ_{t=0}^∞ β^t ∫ (x_t, x_{t+1}, z_t) · D²v(k̄, k̄, z̄) · (x_t, x_{t+1}, z_t) μ_t(z_0, dz_t)        (8.1)

subject to (x_0, z_0) fixed, and t = 0, 1, 2, …, where D²v(k̄, k̄, z̄) is the Hessian matrix of v at (k̄, k̄, z̄), and μ_t is the probability law induced by the mapping φ. Under certain mild regularity conditions [e.g. Anderson et al. (1996)], an optimal solution {x*_t} exists for problem (8.1) and it is unique. Let x_1 = η(x_0, z_0) be the optimal policy for the quadratic optimization problem. Then, for k_0 = k̄ + x_0, the computed policy function g̃ for our original optimization problem (2.1) is derived from k_1 = g̃(k_0, z_0) = k̄ + η(x_0, z_0). Of course, this approximation is only supposed to be accurate for small perturbations around the vector (k̄, k̄, z̄), usually assumed to be the steady state of a deterministic version of the model. But even for such small perturbations, the following problems may arise:

(i) Biased estimates for the first-order moments: In simple situations in which problem (2.1) generates a globally stable invariant distribution for the set of optimal solutions, the first-order sample moments of k and z are not necessarily equal to those of the deterministic steady state [e.g., see Christiano (1990) and den Haan and Marcet (1994) for some illustrative computations, and Becker and Zilcha (1997) for an analytic example]. Mean-preserving perturbations of the stochastic innovation ε may lead to shifts in first-order moments from either non-linearities in the function φ or from economic maximizing behavior, since agents may want to carry over a larger stock of capital to insure against unexpected shocks.

(ii) Biased estimates for the slope: Function η is not the best linear approximation at point (k̄, z̄) of the policy function g for optimization problem (2.1). As pointed out in Section 3, the derivative D_1 g(k̄, z̄) is determined by the quadratic optimization problem (3.6); such optimization problem is slightly different and harder to compute, since variable z appears in a fundamentally non-linear way.
[See Gaspar and Judd (1997) for an alternative discussion on the computation of these derivatives.] Hence, for stochastic shocks with a large variance, optimization problems (3.6) and (8.1) may yield different solutions, and function η may not be an appropriate estimate of the derivative of g.

An alternative quadratic approximation can be obtained from a second-order Taylor's expansion of the return function over the log values of k and z. In such a case, the optimal policy of the constructed optimization problem [e.g., see Christiano (1990) and King, Plosser and Rebelo (1988)] yields the exact solution for the stochastic model with full physical capital depreciation in Equation (7.4). Hence, this optimization problem is expected to generate higher sample moments, ameliorating thus the loss
Table 22
Example 7.3. Standard deviations, σ(i), and correlation coefficients, corr(i, j), for i, j = k, c, u, y ᵃ

Method                   σ(y)    σ(i)    σ(c)    σ(u)  corr(w, u)  corr(c, y)
Exact solution           0.3524  0.1186  0.2337  0.0   1.0         1.0
Dynamic programming      0.3524  0.1186  0.2337  0.0   1.0         1.0
Quadratic approximation  0.3523  0.1185  0.2337  0.0   1.0         0.9999

ᵃ Parameter values: β = 0.99, λ = ½, A = 17, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008.
of accuracy discussed in point (i) regarding the linear quadratic model. But the log-linear approximation does not provide an exact solution of the derivative of the policy function [cf. optimization problem (3.6)], and so it may also lead to inaccurate results, even for small shocks.

There have been several accuracy tests for quadratic approximations in the standard stochastic growth model [e.g., Christiano (1990), Danthine, Donaldson and Mehra (1989), Dotsey and Mao (1992)]. This is a topic of particular concern in the real business cycle literature, where the stochastic innovation is usually calibrated with a relatively small variance. It is then expected that the quadratic approximation would mimic reasonably well the invariant distribution of the non-linear solution. However, for the aforementioned accuracy tests the authors restrict the law of motion of z to a discrete stochastic chain with three states. Such discretization cannot generally be considered as a good approximation of the underlying non-linear model. Therefore, to conduct these comparisons, more accurate simulations of the original non-linear model are needed.

We shall present here further numerical evidence from the more accurate computational procedures developed in the preceding sections. To understand the nature of these approximations, we begin our analysis with a parameterization with an analytic solution. Thus, Table 22 reports the standard deviations and correlation coefficients for physical capital, k, consumption, c, work, u = 1 − l, output, y = zAk^a u^{1−a}, and labor productivity, w = (1 − a)zAk^a u^{−a}, for the log-linear model computed in Section 7.3 with β = 0.99. These sample moments have been obtained from 10000 draws of the i.i.d. random process {ε_t} that enabled us to construct a random path for state variable z. The solution for the quadratic approximation is derived from optimization problem (8.1), for the deterministic steady state (k̄, k̄, z̄) = (4.669, 4.669, 1).
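The mechanics behind a quadratic approximation of this kind can be sketched with a small discounted linear-quadratic solver. The Python fragment below iterates the Riccati recursion for a generic problem max −Σ_t β^t (x_t'Qx_t + u_t'Ru_t) subject to x_{t+1} = Ax_t + Bu_t; the resulting feedback u = −Fx is exactly linear in the state, which is both the computational attraction and the source of the biases discussed above. The matrices are illustrative stand-ins, not the Hessian of the growth model.

```python
import numpy as np

beta = 0.99
A = np.array([[0.95]])   # state transition (illustrative)
B = np.array([[1.0]])    # control loading  (illustrative)
Q = np.array([[1.0]])    # state cost
R = np.array([[0.5]])    # control cost

P = np.zeros_like(Q)
for _ in range(10_000):
    # discounted Riccati operator
    K = beta * B.T @ P @ A
    M = R + beta * B.T @ P @ B
    P_next = Q + beta * A.T @ P @ A - K.T @ np.linalg.solve(M, K)
    if np.max(np.abs(P_next - P)) < 1e-12:
        P = P_next
        break
    P = P_next

# linear feedback u = -F x and the implied closed-loop dynamics
F = np.linalg.solve(R + beta * B.T @ P @ B, beta * B.T @ P @ A)
closed_loop = A - B @ F
```

Because the feedback rule is linear, simulating the approximated model is trivial in any dimension — the "substantial reduction in computational complexity" referred to at the start of this section.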
The solution of the dynamic programming algorithm is derived from the multigrid method with spline interpolation, where the finest grid contains 1000 × 70 vertex points evenly spread over the domain K × Z = [2.5, 7.1711] × [e⁻⁰·³², e⁺⁰·³²]. From Table 22 we observe that the computed moments are relatively accurate for both the quadratic approximation and the numerical dynamic programming algorithm. The most significant deviations are observed in the standard deviations of output, σ(y), and investment, σ(i), where the quadratic approximation yields slightly smaller
Fig. 1. Example 7.3. Euler equation residuals for the dynamic programming algorithm (DP) and quadratic approximation (LQ) policy functions for z = 1. Parameter values: β = 0.99, λ = ½, A = 17, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008. In the deterministic case (i.e., for σ_ε = 0), this parameterization yields a steady-state value k* = 4.669.

estimates. However, for the chosen calibration of parameter values with σ_ε = 0.008, these differences are not substantial. To gain further understanding of these results, Figures 1-3 plot the Euler equation residuals and the computed policy functions over the capital domain [2.5, 7.1711], for z = 1 fixed. In this example, the Euler equation residuals are small, and so are the approximation errors of the policy function. For the quadratic approximation, these errors are of order 10⁻³ for a sizeable portion of the domain.¹⁸ This approximation is nearly exact near the steady-state value, k* = 4.669, but it becomes less accurate as we deviate from the central point. Similar patterns and quantitative estimates are observed for alternative feasible values of z.

Let g(k, z) ≡ aβzAk^a(1 − l)^{1−a} be the true policy function, g^DP the policy function derived from the dynamic programming algorithm, and g^LQ the solution derived from the quadratic approximation. By virtue of the concavity of g in k, the linear policy g^LQ overestimates the true policy for both positive and negative deviations from the steady-state value k* (cf. Figures 2 and 3). Moreover, the concavity of g also implies that this difference grows faster for negative deviations from k* (i.e., for low values of k). Consequently, the quadratic model overestimates output fluctuations for large values of k (i.e., for k > k*) and underestimates output fluctuations for small values of k. These countervailing effects
¹⁸ In all the figures, a number mEn means m × 10ⁿ.
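Residual plots like Figures 1 and 4 can be reproduced once a candidate policy is available. The Python sketch below uses a stripped-down deterministic model (log utility, full depreciation, no leisure — our simplification, with β = 0.95 and A = 10 so that the steady state falls inside the plotted capital domain) in which the exact policy k' = aβAk^a is known; a linear policy tangent to it at the steady state stands in for the LQ solution.

```python
import numpy as np

beta, A, a = 0.95, 10.0, 0.34

def residual(g, k):
    """Unit-free Euler equation residual of a candidate policy g:
    R(k) = beta*a*A*g(k)^(a-1) * c(k)/c(g(k)) - 1, identically zero if exact."""
    c0 = A * k**a - g(k)               # consumption today
    c1 = A * g(k)**a - g(g(k))         # consumption tomorrow
    return beta * a * A * g(k)**(a - 1.0) * c0 / c1 - 1.0

g_exact = lambda k: a * beta * A * k**a
kstar = (a * beta * A) ** (1.0 / (1.0 - a))    # deterministic steady state
# tangent line at k*: since g_exact(k*) = k*, its slope there is exactly a
g_lin = lambda k: kstar + a * (k - kstar)

kk = np.linspace(2.5, 7.1711, 200)
r_exact = residual(g_exact, kk)
r_lin = residual(g_lin, kk)
```

The exact policy yields residuals at machine precision, while the linear policy's residuals vanish at k* and grow with the distance from it — the pattern visible in Figure 1.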
Fig. 2. Example 7.3. Policy functions for the dynamic programming algorithm (DP) and quadratic approximation (LQ) for z = 1. Parameter values: β = 0.99, λ = ½, A = 17, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008. In the deterministic case (i.e., for σ_ε = 0), this parameterization yields a steady-state value k* = 4.669.

Fig. 3. Example 7.3. Difference between the computed policy functions, g^LQ − g^DP, for z = 1. Parameter values: β = 0.99, λ = ½, A = 17, a = 0.34, δ = 1, ρ = 0.9 and σ_ε = 0.008. In the deterministic case (i.e., for σ_ε = 0), this parameterization yields a steady-state value k* = 4.669.
Table 23
Example 7.3. Standard deviations, σ(i), and correlation coefficients, corr(i, j), for i, j = k, c, u, y ᵃ

Method                   σ(y)    σ(i)    σ(c)    σ(u)    corr(w, u)  corr(c, y)
PEA-collocation          0.0283  0.0170  0.0142  0.0034  0.5436      0.8867
Dynamic programming      0.0287  0.0174  0.0142  0.0035  0.5498      0.8838
Quadratic approximation  0.0280  0.0167  0.0141  0.0033  0.5507      0.8917

ᵃ Parameter values: β = 0.99, λ = ½, A = 1.1, a = 0.34, δ = 0.05, ρ = 0.9 and σ_ε = 0.008.
balance each other and lead to a similar estimate for σ(y). Since the approximation error is more pronounced for small values of k, quadratic approximations would then have a tendency to underestimate σ(y).

Finally, Table 23 and Figures 4 and 5 replicate the above computations for a more realistic parameterization with depreciation factor δ = 0.05. This calibration of the model does not possess a closed-form solution, and consequently we can only report computations corresponding to our numerical methods - quadratic approximation, PEA-collocation and the dynamic programming algorithm. Concerning the accuracy

Fig. 4. Example 7.3. Euler equation residuals for PEA and quadratic approximation (LQ) policy functions for z = 1. Parameter values: β = 0.99, λ = ½, A = 1.1, a = 0.34, δ = 0.05, ρ = 0.9 and σ_ε = 0.008. In the deterministic case, this parameterization yields a steady-state value k* = 5.0294.
Fig. 5. Example 7.3. Difference between the computed policy functions, g^LQ − g^DP, for z = 1. Parameter values: β = 0.99, λ = ½, A = 1.1, a = 0.34, δ = 0.05, ρ = 0.9 and σ_ε = 0.008. In the deterministic case (i.e., for σ_ε = 0), this parameterization yields a steady-state value k* = 5.0294.

of quadratic approximations, there are no substantial variations with respect to the preceding experiment. As before, this approximation yields good estimates for the second-order moments, and the most significant differences are downward biases for the standard deviations of both output and investment, which are nevertheless fairly small. Also, Euler equation residuals and approximation errors for the linear policy function are of order 10⁻³ in a significant region. This approximation becomes increasingly less accurate as we deviate from the steady-state solution, especially for small values of k. In this experiment, we should stress the good performance of PEA-collocation, since an 8 × 5 vertex-point grid achieves Euler equation residuals of order 10⁻⁶ for all points in the domain (cf. Figure 4). Obtaining residuals of this order of magnitude under the dynamic programming algorithm requires considerably more grid points, and a computational effort of the order of a thousand times higher. One should bear in mind that in all these computations, the standard deviation of the innovation is very small, i.e., σ_ε = 0.008. Peralta-Alva and Santos (1998) explore in detail the sensitivity of quadratic approximations to variations in the parameters β, δ, ρ and σ_ε. The most sizeable deviations regarding second-order moments are observed for variations in parameters ρ and σ_ε, although substantial changes in other parameter values may also have non-negligible effects.
Fig. 6. Testing economic theories. [Diagram linking the following elements: observations selected for calibration; economic model; exact solutions; numerical solutions; accuracy tests; numerical model; observations selected for testing.]
9. Testing economic theories

In this section, we focus on certain basic issues concerning the process of validation of economic models, where computer simulations must play a prominent role. Our aim is to provide a useful framework for the analysis and evaluation of numerical procedures.

As a result of some conjecture or proposed theory, we consider that a mathematical model has been constructed, with the ultimate goal of rationalizing an economic situation, phenomenon or activity. As is well understood, in general exact solutions are not readily available. Hence, for the purpose of testing an economic model one has to follow the indirect route of computing suitable approximations. To gain confidence in these computations, one needs to weigh a wide range of sources of error that go from the construction of the model to the formal testing of the solutions. A framework for this discussion is outlined in Figure 6, which highlights some major steps in the process of scientific inquiry.

In order to perform quantitative experiments, a model must be calibrated. That is, specific functional forms and corresponding parameter values must be stipulated so as to derive definite predictions. Considering that models are artificial constructs, it does not seem plausible to assign these values independently of the underlying postulates. Thus, a parameter such as the elasticity of intertemporal substitution for aggregate consumption can be difficult to identify in the data. But even if it is identified, such a parameter may proxy inherent simplifications in the model such as the absence of a government sector, the absence of leisure, or of home production, or of some other observable or non-observable (i.e., not easy to measure) components not explicitly modelled. Besides, from a conceptual perspective this parameter may be attached a different meaning in environments with bounded rationality or with alternative mechanisms for forming expectations.
Evidence from panel data and microeconomic studies may help to pin down parameter values, but this evidence is not conclusive on most occasions. Thus, Browning, Hansen and Heckman (ch. 8, this volume) observe that estimates from microeconomic studies are not readily transportable to macroeconomic models. Likewise, estimating a subset of parameter values from some independent observations may lead to biases and inconsistencies in the ulterior calibration of the remaining ones [cf. Gregory and Smith (1993) and Canova and Ortega (1996)]. In view of all these complexities, a more operative strategy for calibrating a model is simply to select a set of observations where all model parameters may be jointly estimated. For instance, a common practice nowadays is to calibrate a business cycle model from some facts related to growth theory [e.g., Christiano and Eichenbaum (1992) and Cooley and Prescott (1995)] without focussing so much attention on microeconomic studies. Of course, alternative sets of observations selected for calibration may lead to different parameter values, and the ability of a certain range of parameter values to account for a wide group of well established observations may enhance our confidence in the model chosen. All these calibrations may be associated with standard errors stemming from uncertainties present in the data [cf. Christiano and Eichenbaum (1992)]. The approach just described presents a subtle difference with respect to what would be called a naive interpretation of the positivist (or falsificationist) view, often associated with the writings of Friedman (1953) and Popper (1965). Under these latter methodological programs, a model could in principle be tested without a previous confrontation to an independent or unrelated set of data. 
In such circumstances, functional forms and parameter values could be chosen so as to ensure a best fit to the data or sets of data to be accounted for, without further regard to other seemingly unrelated events that the theory was not initially purported to explain.

After an economic model has been specified, one can proceed to its further analysis. Some basic properties, such as existence of solutions, existence of stationary paths or invariant distributions, monotonicity, differentiability, uniqueness and stability, may be explored by classical mathematical methods. But most often these tools fail to provide us with the quantitative information necessary to test a model. In such circumstances, numerical approximations may be needed to deepen our understanding of the model's predictions. The numerical model is not usually intended to rationalize an economic situation. This model is generally an algorithmic device, to be implemented on a computer with a view toward simulating the true behavior of the original economic model. Therefore, all errors involved in these approximations should be made sufficiently small so that our inferences based upon the numerical simulations remain basically true for the exact solution. (Accuracy tests are designed to bound the error of the approximate solution.) It should be emphasized that there is no universal criterion or hard and fast rule for the approximation error that can be valid for all applications. The appropriate size of the error will depend on further considerations such as the purpose of the investigation, the sensitivity of the solution to initial conditions and parameter values,
Ch. 5: Numerical Solution of Dynamic Economic Models
and the conditions under which a model is tested. In other words, our ability to make good inferences from approximate solutions is not unambiguously determined by the size of the approximation error.

Since few mathematical models yield closed-form solutions, one may be tempted to specify from the start a numerical model (i.e., a collection of rules which could be coded as a finite set of computer instructions), avoiding the formulation of an abstract economic model. This point of view may seem very appealing 19, but some practical considerations work in favor of formulating an abstract model. First of all, there is a well developed mathematical theory concerning regular spaces of functions over continuum quantities, which has no counterpart for discrete variables. Indeed, our most powerful mathematical tools apply to locally linear domains, with continuous or differentiable functions. Second, computations introduce approximations and round-off errors, and it becomes tedious to verify certain properties such as existence, stability, monotonicity of solutions or the constancy of a given elasticity. Validation of these results via numerical computations would require a full sampling of the state and parameter spaces, and this is not usually the most effective way to establish a given property of the solution. Thus, in most situations it is not plausible to base our entire analysis on a numerical model, and a more abstract framework is necessary. Computer simulations, though, may help us make further theoretical progress, since these computations may give rise to reasonable conjectures that can be subsequently examined by mathematical analysis or by further numerical experiments.

Finally, predictions and properties of solutions of the economic model should be contrasted with the underlying real-world situation.
In some simple cases, these critical comparisons may be effected without resort to a formal analysis (e.g., a model may be dismissed if its predictions about the rates of interest are very distant from the observed data). In more subtle situations, statistical techniques are usually needed. Some recent work has focussed attention on testing the output of computer simulations [cf. Canova and Ortega (1996), Christiano and Eichenbaum (1992), Gregory and Smith (1993), Kim and Pagan (1995), and references therein]. In setting up an appropriate framework for model testing, one should realize that data sets and computer simulations are subject to sources of uncertainty of a different nature. Data sets are characterized by measurement and sampling errors, whereas simulations involve approximations of the original model along with further numerical errors. While measurement errors may resemble round-off errors and machine failures, sampling errors may be avoided in model simulations. Indeed, the moments of an invariant distribution of a model can be accurately reproduced by resorting to arbitrarily large sample paths or by theoretical analysis, although these model statistics may not be uniquely defined in the presence of multiple equilibria or multiple
19 After all, one is ultimately interested in quantitative assessments; moreover, the numerical model may often offer a more faithful approximation of the real-world situation, where some features and quantities are also discrete.
invariant distributions, and are subject to the uncertainty stemming from parameter calibrations 20. Consequently, for evaluating the performance of a model system we require a detailed study of both the economic and numerical models, together with an analysis of the available data. Each element in this chain has its own specificities, which may give rise to different types of errors that must be accounted for in the process of testing a particular theory or conjecture. Let us briefly discuss some of the major components:

(a) Sensitivity of solutions to initial conditions: Small changes in the state variables may lead to fundamentally different predictions, and such pathological behavior may appear in both theoretical and numerical models, especially in environments with chaotic dynamics, multiple steady states or invariant distributions, or indeterminate equilibria. These are instances in which there are no definite predictions, and one should investigate how these instabilities play out in the numerical model and in the data analysis. But even if solutions are reasonably well-behaved, a distinction should be made between steady-state behavior and transitional dynamics. Often, it is useful to compute the speed of convergence to a given steady state or invariant distribution.

(b) Uncertainty of the calibrated parameters: A model may have definite predictions regarding steady-state behavior and transitional dynamics, but such predictions are subject to the uncertainty stemming from the parameter space. For a given set of parameter values, arbitrarily good estimates of unconditional moments of the model's invariant distributions may generally be obtained from large sets of simulations. These moments depend on parameter values, and correspondingly follow the probability law induced by the parameter space.

(c) Rounding and chopping errors: Computer arithmetic is not exact, and hence calculations are subject to errors.
For rounding errors, it is reasonable to presume that they are normally distributed. Furthermore, their order of magnitude is small, in most cases around 10^-15. But care should be exercised, since in the presence of instabilities these errors may grow over the iterative scheme in many unexpected ways. Their cumulative effect on the computed solution is usually more sizeable.

(d) Approximation errors: These are the errors involved in the discretization of the theoretical model. Again, their cumulative effect on the computed solution may be considerably larger than the error in the functions being approximated (cf. Lemma 4.2 and Theorem 4.3). Approximation errors often contain systematic components, and hence their statistical modelling may become problematic. For instance, the numerical algorithm presented in Section 4 always underestimates the value function. Systematic biases may also affect the curvature of the interpolants.
20 A further issue in model testing is to determine if the data selected correspond to a steady-state situation, or if transitional dynamics are playing an important role in the analysis.
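The orders of magnitude mentioned in item (c) are easy to verify directly. The following Python sketch (an illustration of our own, not part of the chapter's algorithms) prints the machine epsilon of double-precision arithmetic, the rounding error of a single operation, and the much larger error accumulated by a naive iterative sum:

```python
import sys

# Machine epsilon of IEEE double precision: about 2.2e-16, consistent
# with the figure of roughly 10^-15 for a single rounding error.
eps = sys.float_info.epsilon

# A single floating-point operation errs at about this level.
r = (0.1 + 0.2) - 0.3          # not exactly zero

# Over an iterative scheme the errors accumulate: a naive sum of 0.1
# ten million times drifts far above the single-operation error level.
s = 0.0
for _ in range(10_000_000):
    s += 0.1
acc_err = abs(s - 1_000_000.0)
print(eps, r, acc_err)
```

The accumulated error in the final sum exceeds the single-operation error by many orders of magnitude, which is the cumulative effect described in items (c) and (d).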
(e) Sampling errors: As we only observe a limited number of realizations of a random variable, it is well understood that our inferences are subject to sampling error. Statistical theory can help us gauge this error for both small and large data samples.

(f) Measurement errors: This is also a classical topic of statistical theory. It is well known that imperfect measurement may bias the sample moments of our estimates.

Several recent attempts at analyzing computer simulations [see the recent surveys by Canova and Ortega (1996), and Kim and Pagan (1995)] contemplate some of these sources of error. For instance, Christiano and Eichenbaum (1992) advance a statistical framework for testing second-order moments assuming that uncertainty stems from calibrated parameter values and sample observations. As in other alternative approaches, the implicit postulate is that the remaining errors are so small that they can be safely ignored. But, in cases where sensitivity of solutions to initial conditions and approximation errors become relevant considerations, a formal analysis of these influences is needed before embedding them into a statistical framework. For the same significance level, approximation errors may change or widen the critical region, or lessen the power of a test. If errors are of different orders of magnitude, an obvious step is to lessen the influence of the most critical ones. Approximation or numerical errors can generally be reduced at the expense of more computational effort, and the gains from reducing these errors should be evaluated against the incurred cost. Of course, these considerations apply to all other errors as well, but often their costs are prohibitive.

Summarizing, this section started with the basic idea that numerical analysis is a useful tool in the study of economic models, and our purpose has been to highlight several sources of error to be accounted for in testing an economic theory. Numerical analysis yields approximate solutions.
Thus, to make proper inferences about an economic model, the approximation error must be sufficiently small. This error must be evaluated in conjunction with further properties of both the theoretical and numerical models, and the statistical properties of the data. It does not seem plausible to derive a universal measure of error - or any other purely econometric statistic - that can yield definite answers in all situations. And the benefits derived from a better approximation have to be balanced against the additional computational effort.
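The interplay between items (b) and (e) can be illustrated in the one-sector stochastic growth model with logarithmic utility, Cobb-Douglas technology and full depreciation, whose exact policy function k' = αβzk^α is known in closed form [cf. Brock and Mirman (1972)]. The Python sketch below (our own illustration; parameter values are purely illustrative) computes the time average of ln k along simulated sample paths and compares it with the exact stationary mean, showing how sampling error shrinks as the sample path is lengthened:

```python
import numpy as np

# Brock-Mirman model: u(c) = ln c, f(k, z) = z*k**alpha, full
# depreciation.  The exact policy is k' = alpha*beta*z*k**alpha, so
# ln k follows the AR(1): ln k' = ln(alpha*beta) + alpha*ln k + ln z.
alpha, beta = 0.36, 0.95        # illustrative parameter values
mu, sigma = 0.0, 0.1            # ln z ~ N(mu, sigma^2)

def mean_lnk(T, seed):
    """Time average of ln k over a simulated sample path of length T."""
    rng = np.random.default_rng(seed)
    lnk = (np.log(alpha * beta) + mu) / (1 - alpha)   # start at the mean
    total = 0.0
    for _ in range(T):
        lnk = np.log(alpha * beta) + alpha * lnk + rng.normal(mu, sigma)
        total += lnk
    return total / T

# Exact stationary mean of ln k, available here by theoretical analysis.
exact = (np.log(alpha * beta) + mu) / (1 - alpha)

# Sampling error shrinks as the sample path grows.
for T in (1_000, 100_000):
    print(T, abs(mean_lnk(T, seed=0) - exact))
```

In a model simulation this sampling error can thus be driven arbitrarily close to zero, whereas the analogous error in an observed data set of fixed length cannot; and the whole exercise remains conditional on the calibrated values of α, β, μ and σ, as stressed in item (b).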
10. A practical approach to computation

This chapter has reviewed several numerical techniques for solving economic models. Our analysis has been restricted to a family of standard growth models in which optimal solutions may be decentralized as competitive allocations. This equivalence between optimal solutions and competitive allocations breaks down in the presence of externalities, incompleteness of financial markets, taxation, public expenditure, money, and other frictions or governmental interventions. Additionally, there are
alternative frameworks that have not been considered here, such as overlapping generations economies or models based on game-theoretical assumptions. The array of methods presented in this paper should nevertheless be of potential interest for solving these other formulations. Indeed, one should expect that in the near future a good share of research efforts will leave the frictionless economic framework considered in this paper and aim at a rigorous simulation of more complex economies. For a good start on some of these topics, the reader is referred to the recent monograph edited by Cooley (1995). Chapters 1-4, 7 and 12 of that volume focus especially on computational issues. For further related work and extensions, the following is a very partial list of theoretical and applied papers on computation:
(a) Economies with heterogeneous agents and borrowing constraints: Castañeda, Díaz-Giménez and Ríos-Rull (1997), Huggett and Ventura (1997), Krieger (1996), Krusell and Smith (1995), and Ríos-Rull (1997);
(b) Economies with taxes: Bizer and Judd (1989), Chari and Kehoe (ch. 26, this volume), Coleman (1991), and Jones, Manuelli and Rossi (1993);
(c) Suboptimal equilibria: Baxter (1991), and Greenwood and Huffman (1995);
(d) Monetary economies: Cooley and Hansen (1989), Giovannini and Labadie (1991), and Lucas and Stokey (1987);
(e) Finance models: Duffie (1996, ch. 11), Boyle, Broadie and Glasserman (1997), and Heaton and Lucas (1996);
(f) Overlapping generations economies: Auerbach and Kotlikoff (1987), Kehoe and Levine (1985), and Kehoe, Levine, Mas-Colell and Woodford (1991);
(g) Asymmetric information: Prescott (1997), and Phelan and Townsend (1991);
(h) Game theory: McKelvey and McLennan (1996), and Dutta and Sundaram (1993).
Given the ample variety of economic models and computational techniques, a researcher will most often be faced with the choice of an appropriate numerical method.
Our purpose now is to outline a set of preliminary steps which may prove useful in the selection and implementation of a numerical procedure. The first basic principle to bear in mind is that no numerical method performs best in all situations. Hence, the choice will generally be complex, and the numerical procedure must be suited to the analytic nature of the problem under consideration. At this stage, a theoretical study of the model is most helpful. Qualitative properties of optimal solutions, such as existence or differentiability, should shed light on the error stemming from different approximations. The existence and stability properties of steady states or invariant distributions will help determine the mesh size or the order of the approximant, as well as the most suitable restriction of the domain. In addition, a theoretical analysis of the model should provide valuable clues for the computation process, such as the choice of an initial guess for the solution, the efficient manipulation of state and control variables to simplify the model, or the appropriate subroutines for integration, maximization, and related operations. Also, it is useful to undertake a theoretical analysis of the numerical model, and examine differences between the dynamic behavior of its solutions and their continuum analogues.

A second point to be stressed is that for smooth, concave models involving one or two state variables there are generally reliable algorithms that can compute the solution
in reasonable time at a desired level of accuracy. As already stressed, subroutines for integration and maximization in one dimension are fairly efficient. Moreover, technological developments will facilitate the application of reliable methods in the near future. It seems then that the use of less rigorous or less reliable approximation procedures becomes more attractive for more difficult, time-consuming computational problems. Although quadratic approximations and methods approximating the Euler equation performed remarkably well for all models considered in this paper, it should be realized that in most cases these models can be solved via the discretized dynamic programming algorithm combined with spline interpolation. An efficient use of the multigrid algorithm (and, to a lesser extent, of policy and modified policy iteration) may help minimize the computing time.

There are, however, computational problems in which the use of reliable methods becomes awkward or infeasible. These models may simply lack concavity, interiority of solutions, or smoothness, or involve several state variables. Techniques for the computation of non-smooth or discrete problems are generally less powerful, and sometimes model-specific; hence, at this general level of discussion, it seems difficult to offer specific guidelines. On the other hand, for the computation of large-scale models, we suggest the following sequential procedure:
(i) Quadratic approximations;
(ii) Globally convergent numerical methods in which accuracy can be controlled;
(iii) Faster computational procedures, which may lack global convergence or a formal derivation of error bounds.
If the model contains a globally stable steady state or invariant distribution, then it seems natural to start with a quadratic approximation. To assess the accuracy of this approach it may be helpful to calculate the (exact) derivatives of the value and policy functions, or simply check the Euler equation residuals.
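As a concrete sketch of such a residual check, consider the deterministic growth model with logarithmic utility, Cobb-Douglas technology and full depreciation, where the exact policy k' = αβk^α is known and serves as a convenient test case (this Python fragment is our own illustration, with illustrative parameter values):

```python
import numpy as np

# Deterministic growth model with u(c) = ln c and f(k) = k**alpha under
# full depreciation; the exact policy k' = alpha*beta*k**alpha is known.
alpha, beta = 0.36, 0.95

def euler_residual(g, k):
    """Unit-free residual 1 - beta*(c/c')*f'(k') for a candidate policy g,
    which is zero (up to rounding) when g satisfies the Euler equation."""
    k1 = g(k)                          # next period's capital
    c0 = k**alpha - k1                 # consumption today
    c1 = k1**alpha - g(k1)             # consumption tomorrow
    return 1.0 - beta * (c0 / c1) * alpha * k1**(alpha - 1.0)

exact_policy = lambda k: alpha * beta * k**alpha
wrong_policy = lambda k: 1.02 * alpha * beta * k**alpha   # a 2% policy error

grid = np.linspace(0.05, 0.5, 50)
print(np.max(np.abs(euler_residual(exact_policy, grid))))  # ~ machine precision
print(np.max(np.abs(euler_residual(wrong_policy, grid))))  # visibly nonzero
```

The exact policy leaves residuals at the level of rounding error, whereas a two-percent perturbation of the policy produces residuals of about 2 x 10^-2 over the whole grid, so the residual is informative about the size of the policy error.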
From these residuals, one may be able to estimate approximation errors for the value and policy functions (cf. Section 6.2). The quadratic model may also provide a good initial guess of the solution for the remaining, more sophisticated computational methods.

Reliable algorithms, which are amenable to a progressive control of the approximation error, can also be useful in the computation of large-scale dynamic problems. In these cases, one may resort to coarse grids with higher-order approximants. (For instance, Johnson et al. (1993) solve a four-dimensional dynamic model with a discretized version of the dynamic programming algorithm with spline interpolation.) Comparisons of outcomes with progressively finer grids - or against simple test cases with closed-form solutions - may serve to appraise numerically the constants involved in the approximation errors. Likewise, a close examination of the Euler equation residuals may allow us to evaluate the accuracy of these methods. Reliable algorithms should also provide reasonable initial guesses for the computation of a model under faster numerical methods.

Sometimes, the only feasible route for solving a model is via faster algorithms, which may lack global convergence or are not easily amenable to a formal error analysis. The efficient design of such computational procedures often involves
a combined application of standard techniques from numerical analysis in ways suggested by a previous analysis of error for reliable algorithms. There are also some subtle issues concerning the implementation of these procedures. As suggested in Section 6, one should check for existence and uniqueness of solutions; moreover, application of Newton-type methods to find zeroes of non-linear equations requires that the system be locally well-conditioned. For the practical operation of the algorithm, it may be helpful to start with an initial candidate from (i) or (ii), and then compute successively finer approximants, taking as the initial guess in each step the solution obtained from the previous approximation. Comparisons of the outcomes and coefficients obtained from successive approximants may shed light on the stability and accuracy of the numerical method. Moreover, an evaluation of the Euler equation residuals may be the most effective way to estimate numerically the approximation error [cf. Santos (1999)].
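The successive-refinement strategy just described can be sketched for the discretized dynamic programming algorithm in the same test model with a known closed-form policy. The Python fragment below is an illustrative implementation of our own (linear interpolation for warm starts, illustrative parameter values), not the chapter's algorithm:

```python
import numpy as np

alpha, beta = 0.36, 0.95                       # illustrative parameters
kss = (alpha * beta) ** (1.0 / (1.0 - alpha))  # deterministic steady state

def solve_dp(n, warm_start=None, tol=1e-8):
    """Discretized value iteration on an n-point grid; the optional warm
    start interpolates a value function computed on a coarser grid."""
    grid = np.linspace(0.2 * kss, 1.8 * kss, n)
    c = grid[:, None] ** alpha - grid[None, :]            # c(k, k')
    u = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)
    v = np.zeros(n) if warm_start is None else np.interp(grid, *warm_start)
    while True:
        v_new = (u + beta * v[None, :]).max(axis=1)       # Bellman operator
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    policy = grid[(u + beta * v[None, :]).argmax(axis=1)]
    return grid, v, policy

# Coarse solve, then a finer grid warm-started from the coarse solution.
g1, v1, _ = solve_dp(50)
g2, v2, p2 = solve_dp(500, warm_start=(g1, v1))

# Against the known closed form k' = alpha*beta*k**alpha, the policy
# error is of the order of the spacing of the fine grid.
err = np.max(np.abs(p2 - alpha * beta * g2**alpha))
print(err)
```

Comparing the coarse and fine policies with each other, and here with the closed form, gives a direct numerical appraisal of the approximation error, in the spirit of the grid-refinement comparisons discussed above.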
References

Amman, H.M. (1996), "Numerical methods for linear-quadratic models", in: H.M. Amman, D.A. Kendrick and J. Rust, eds., Handbook of Computational Economics (Elsevier, Amsterdam) 588-618.
Anderson, E.W., L.P. Hansen, E.R. McGrattan and T.J. Sargent (1996), "Mechanics of forming and estimating dynamic linear economies", in: H.M. Amman, D.A. Kendrick and J. Rust, eds., Handbook of Computational Economics (Elsevier, Amsterdam) 173-252.
Araujo, A. (1991), "The once but not twice differentiability of the policy function", Econometrica 59:1383-1391.
Auerbach, A.J., and L.J. Kotlikoff (1987), Dynamic Fiscal Policy (Cambridge University Press, Cambridge).
Baxter, M. (1991), "Approximating suboptimal dynamic equilibria: an Euler equation approach", Journal of Monetary Economics 27:173-200.
Bazaraa, M.S., H.D. Sherali and C.M. Shetty (1993), Nonlinear Programming (Wiley, New York).
Becker, G.S. (1993), "Nobel lecture: The economic way of looking at behavior", Journal of Political Economy 101:385-409.
Becker, R., and I. Zilcha (1997), "Stationary Ramsey equilibrium under uncertainty", Journal of Economic Theory 75(1):122-140.
Bellman, R. (1955), "Functional equations in the theory of dynamic programming V: Positivity and quasilinearity", Proceedings of the National Academy of Sciences, USA 41:743-746.
Bellman, R. (1957), Dynamic Programming (Princeton University Press, Princeton, NJ).
Bellman, R., R. Kalaba and B. Kotkin (1963), "Polynomial approximation - a new computational technique in dynamic programming: Allocation processes", Mathematics of Computation 17:155-161.
Benveniste, L.M., and J.A. Scheinkman (1979), "On the differentiability of the value function in dynamic models of economics", Econometrica 47:727-732.
Bertsekas, D.P. (1975), "Convergence of discretization procedures in dynamic programming", IEEE Transactions on Automatic Control 20:415-419.
Bizer, D., and K. Judd (1989), "Taxation and uncertainty", American Economic Review 79:331-336.
Blackwell, D. (1965), "Discounted dynamic programming", Annals of Mathematical Statistics 36:226-235.
Blume, L.E., D. Easley and M. O'Hara (1982), "Characterization of optimal plans for stochastic dynamic programs", Journal of Economic Theory 28:221-234.
Boldrin, M., and L. Montrucchio (1986), "On the indeterminacy of capital accumulation paths", Journal of Economic Theory 40:26-39.
Bona, J.L., and M.S. Santos (1997), "On the role of computation in economic theory", Journal of Economic Theory 72:241-281.
Boyle, P., M. Broadie and P. Glasserman (1997), "Monte Carlo methods for security pricing", Journal of Economic Dynamics and Control 21:1267-1321.
Brock, W.A., and L.J. Mirman (1972), "Optimal economic growth and uncertainty: the discounted case", Journal of Economic Theory 4:479-513.
Canova, F., and E. Ortega (1996), "Testing calibrated general equilibrium models", mimeograph (Department of Economics, Universitat Pompeu Fabra, Barcelona, Spain).
Castañeda, A., J. Díaz-Giménez and J.-V. Ríos-Rull (1997), "Unemployment spells, cyclically moving factor shares and income distribution dynamics", mimeograph (Federal Reserve Bank of Minneapolis).
Chow, C.-S., and J.N. Tsitsiklis (1991), "An optimal one-way multigrid algorithm for discrete-time stochastic control", IEEE Transactions on Automatic Control 36:898-914.
Christiano, L.J. (1990), "Linear-quadratic approximation and value-function iteration: a comparison", Journal of Business and Economic Statistics 8:99-113.
Christiano, L.J., and M. Eichenbaum (1992), "Current real-business-cycle theories and aggregate labor-market fluctuations", American Economic Review 82:430-450.
Christiano, L.J., and J. Fisher (1994), "Algorithms for solving dynamic models with occasionally binding constraints", manuscript (University of Western Ontario).
Coleman, W.J. (1990), "Solving the stochastic growth model by policy-function iteration", Journal of Business and Economic Statistics 8:27-29.
Coleman, W.J. (1991), "Equilibrium in a production economy with income taxes", Econometrica 59:1091-1104.
Cooley, T.F., ed. (1995), Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ).
Cooley, T.F., and G.D. Hansen (1989), "The inflation tax in a real business cycle model", American Economic Review 79:733-748.
Cooley, T.F., and E.C. Prescott (1995), "Economic growth and business cycles", in: T.F. Cooley, ed., Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ) 39-65.
Dahlquist, G., and A. Björck (1974), Numerical Methods (Prentice-Hall, Englewood Cliffs, NJ).
Danthine, J.-P., J.B. Donaldson and R. Mehra (1989), "On some computational aspects of equilibrium business cycle theory", Journal of Economic Dynamics and Control 13:449-470.
Davis, P.J., and P. Rabinowitz (1984), Methods of Numerical Integration (Academic Press, New York).
den Haan, W.J., and A. Marcet (1990), "Solving the stochastic growth model by parameterizing expectations", Journal of Business and Economic Statistics 8:31-34.
den Haan, W.J., and A. Marcet (1994), "Accuracy in simulations", Review of Economic Studies 61:3-17.
Denardo, E.V. (1967), "Contraction mappings in the theory underlying dynamic programming", SIAM Review 9:165-177.
Dotsey, M., and C.S. Mao (1992), "How well do linear approximation methods work?", Journal of Monetary Economics 29:25-58.
Duffie, D. (1996), Dynamic Asset Pricing Theory (Princeton University Press, Princeton, NJ).
Dutta, P.K., and R.K. Sundaram (1993), "Markovian games and their applications I: Theory", mimeograph (Columbia University).
Falcone, M. (1987), "A numerical approach to the infinite horizon problem of deterministic control theory", Applied Mathematics and Optimization 15:1-13.
Fox, B.L. (1973), "Discretizing dynamic programs", Journal of Optimization Theory and Applications 11:228-234.
Friedman, M. (1953), "The methodology of positive economics", in: M. Friedman, ed., Essays in Positive Economics (Chicago University Press, Chicago, IL).
Gallego, A.M. (1993), "On the differentiability of the value function in stochastic growth models", manuscript (Universidad de Alicante).
Gaspar, J., and K.J. Judd (1997), "Solving large-scale rational expectations models", Macroeconomic Dynamics 1:45-75.
Gear, C.W. (1971), Numerical Initial Value Problems in Ordinary Differential Equations (Prentice-Hall, Englewood Cliffs, NJ).
Geweke, J. (1996), "Monte Carlo simulation and numerical integration", in: H.M. Amman, D.A. Kendrick and J. Rust, eds., Handbook of Computational Economics (Elsevier, Amsterdam) 731-800.
Gill, P.E., W. Murray and M.H. Wright (1981), Practical Optimization (Academic Press, New York).
Giovannini, A., and P. Labadie (1991), "Asset prices and interest rates in cash-in-advance models", Journal of Political Economy 99:1215-1251.
Greenwood, J., and G.W. Huffman (1995), "On the existence of nonoptimal equilibria in dynamic stochastic economies", Journal of Economic Theory 65:611-623.
Gregory, A.W., and G.W. Smith (1993), "Statistical aspects of calibration in macroeconomics", in: G.S. Maddala, C.R. Rao and H.D. Vinod, eds., Handbook of Statistics, vol. 11 (Elsevier, Amsterdam) 703-719.
Hämmerlin, G., and K.-H. Hoffmann (1991), Numerical Mathematics (Springer, Berlin).
Heaton, J., and D.J. Lucas (1996), "Evaluating the effects of incomplete markets on risk sharing and asset pricing", Journal of Political Economy 104:443-487.
Hiriart-Urruty, J.B., and C. Lemaréchal (1993), Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods (Springer, Berlin).
Howard, R. (1960), Dynamic Programming and Markov Processes (MIT Press, Cambridge, MA).
Huggett, M., and G. Ventura (1997), "Understanding why high income households save more than low income households", mimeograph (Centro de Investigación Económica, ITAM, Mexico City).
Johnson, S.A., J.R. Stedinger, C.A. Shoemaker, Y. Li and J.A. Tejada-Guibert (1993), "Numerical solution of continuous-state dynamic programs using linear and spline interpolation", Operations Research 41(3):484-500.
Jones, L.E., R.E. Manuelli and P.E. Rossi (1993), "Optimal taxation in models of endogenous growth", Journal of Political Economy 101:485-517.
Judd, K.L. (1992), "Projection methods for solving aggregate growth models", Journal of Economic Theory 58:410-452.
Judd, K.L. (1996), "Approximation, perturbation, and projection methods in economic analysis", in: H.M. Amman, D.A. Kendrick and J. Rust, eds., Handbook of Computational Economics (Elsevier, Amsterdam) 511-585.
Judd, K.L., and A. Solnick (1997), "Numerical dynamic programming with shape-preserving splines", mimeograph (Stanford University).
Kahaner, D., C. Moler and S. Nash (1989), Numerical Methods and Software (PTR Prentice Hall, Englewood Cliffs, NJ).
Kehoe, T.J. (1991), "Computation and multiplicity of equilibrium", in: W. Hildenbrand and H. Sonnenschein, eds., Handbook of Mathematical Economics (North-Holland, Amsterdam) 2049-2143.
Kehoe, T.J., and D.K. Levine (1985), "Comparative statics and perfect foresight", Econometrica 53:433-454.
Kehoe, T.J., D.K. Levine, A. Mas-Colell and M. Woodford (1991), "Gross substitutability in large square economies", Journal of Economic Theory 54:1-25.
Kim, K., and A. Pagan (1995), "The econometric analysis of calibrated macroeconomic models", in: H. Pesaran and M. Wickens, eds., Handbook of Applied Econometrics, vol. 1 (Blackwell Press, London) 356-390.
King, R.G., C.I. Plosser and S.T. Rebelo (1988), "Production, growth and business cycles. I. The basic neoclassical model", Journal of Monetary Economics 21:195-232.
Kitanidis, P.K., and E. Foufoula-Georgiou (1987), "Error analysis of conventional discrete and gradient dynamic programming", Water Resources Research 23:845-848.
Krieger, S. (1996), "The general equilibrium dynamics of investment, scrapping and reorganization", mimeograph (University of Chicago).
Krusell, P., and A.A. Smith (1995), "Income and wealth heterogeneity in the macroeconomy", mimeograph (University of Pennsylvania).
Ladrón de Guevara, A., S. Ortigueira and M.S. Santos (1997), "Equilibrium dynamics in two-sector models of endogenous growth", Journal of Economic Dynamics and Control 21:115-143.
Lambert, J.D. (1991), Numerical Methods for Ordinary Differential Systems (Wiley, New York).
Li, J.X. (1993), Essays in mathematical economics and economic theory, Ph.D. dissertation (Department of Economics, Cornell University).
Lorentz, R.A. (1992), Multivariate Birkhoff Interpolation (Springer, New York).
Lucas, R.E., and N.L. Stokey (1987), "Money and interest in a cash-in-advance economy", Econometrica 55:491-514.
Marcet, A. (1994), "Simulation analysis of dynamic stochastic models: application to theory and estimation", in: C.A. Sims, ed., Advances in Econometrics, Sixth World Congress, vol. II (Cambridge University Press, Cambridge) 91-118.
Marcet, A., and D.A. Marshall (1994), "Solving non-linear rational expectations models by parameterized expectations", manuscript (Universitat Pompeu Fabra, Barcelona, Spain).
McGrattan, E.R. (1996), "Solving the stochastic growth model with a finite element method", Journal of Economic Dynamics and Control 20:19-42.
McKelvey, R.D., and A. McLennan (1996), "Computation of equilibria in finite games", in: H.M. Amman, D.A. Kendrick and J. Rust, eds., Handbook of Computational Economics (Elsevier, Amsterdam) 87-142.
Montrucchio, L. (1987), "Lipschitz continuous policy functions for strongly concave optimization problems", Journal of Mathematical Economics 16:259-273.
Morton, T.E. (1971), "On the asymptotic convergence rate of cost differences for Markovian decision processes", Operations Research 19:244-248.
Mulligan, C.B. (1993), "Computing transitional dynamics in recursive growth models: the method of progressive paths", manuscript (University of Chicago).
Natanson, I.P. (1965), Constructive Function Theory, vol. 3 (Frederick Ungar Publishing Company, New York).
Niederreiter, H. (1992), Random Number Generation and Quasi-Monte Carlo Methods (SIAM, Philadelphia, PA).
Papageorgiou, A., and J.F. Traub (1996), "New results on deterministic pricing of financial derivatives", working paper no. 96-06-040 (Santa Fe Institute).
Paskov, S.H. (1996), "New methodologies for valuing securities", in: S. Pliska and M. Dempster, eds., Mathematics of Derivative Securities (Isaac Newton Institute, Cambridge).
Peralta-Alva, A., and M.S. Santos (1998), "Accuracy of quadratic approximations in stochastic models of economic growth", mimeograph (University of Minnesota).
Phelan, C., and R.M. Townsend (1991), "Computing multi-period, information-constrained optima", Review of Economic Studies 59:853-881.
Popper, K. (1965), The Logic of Scientific Discovery (Harper Torchbooks, New York).
Prescott, E.C. (1986), "Theory ahead of business cycle measurement", Quarterly Review, Federal Reserve Bank of Minneapolis 10(4):9-22.
Prescott, E.S. (1997), "Computing private information problems with dynamic programming methods", mimeograph (Federal Reserve Bank of Richmond).
Press, W.H., S.A. Teukolsky, W.T. Vetterling and B.P. Flannery (1992), Numerical Recipes in FORTRAN: The Art of Scientific Computing (Cambridge University Press, Cambridge).
Puterman, M.L., and S.L. Brumelle (1979), "On the convergence of policy iteration in stationary dynamic programming", Mathematics of Operations Research 4:60-69.
Puterman, M.L., and M.C. Shin (1978), "Modified policy iteration algorithms for discounted Markov decision problems", Management Science 24:1127-1137.
386
M.S. Santos
Rios-Rull, J.-V. (1997), "Computation of equilibria in heterogeneous agent models", Staff Report no. 231 (Federal Reserve Bank of Minneapolis). Rivlin, T.J. (1969), An Introduction to the Approximation of Functions (Dover Publications, New York). Rivlin, T.J. (1990), Chebyshev Polynomials (Wiley, New York). Rockafellar, R.T. (1970), Convex Analysis (Princeton University Press, Princeton, N J). Rust, J. (1987), "Optimal replacement of GMC bus engines: an empirical model of Harold Zurcher", Econometrica 55:999-1033. Rust, J. (1996), "Numerical dynamic programming in economics", in: H.M. Amman, D.A. Kendrick and J. Rust, eds., Handbook of Computational Economics (Elsevier, Amsterdam) 619-729. Santos, M.S. (1991), "Smoothness of the policy function in discrete-time economic models", Econometrica 59:1365-1382. Santos, M.S. (1994), "Smooth dynamics and computation in models of economic growth", Journal of Economic Dynamics and Control 18:879-895. Santos, M.S. (1999), "Accuracy of numerical solutions using the Euler equation residuals", Econometrica, forthcoming. Santos, M.S., and J. Vigo (1995), "Accuracy estimates for a numerical approach to stochastic growth models", Discussion paper no. 107, IEM (Federal Reserve Bank of Minneapolis). Santos, M.S., and J. Vigo (1998), "Analysis of a numerical dynamic programming algorithm applied to economic models", Econometriea 66:409~426. Schumaker, L.L. (1981), Spline Functions: Basic Theory (Wiley/Interscience, New York). Schumaker, L.L. (1983), "On shape preserving quadratic spline interpolation", SIAM Journal of Numerical Analysis 20(4):85~864. Shor, N.Z. (1985), Minimization methods for nondifferentiable functions (Springer, Berlin). Stoer, J., and R. Bulirsch (1993), Introduction to Numerical Analysis (Springer, New York). Stokey, N.L., and R.E. Lucas (1989), Recursive Methods in Economic Dynamics (Harvard University Press, Cambridge, MA). Stroud, A.H. 
(1972), Approximate Calculations of Multiple Integrals (Prentice Hall, Englewood Cliffs, N J). Tan, K.S., and P.E Boyle (1997), "Applications of scramble low discrepancy sequences to the valuation of complex securities", mimeograph (University of Waterloo). Tanchen, G. (1990), "Solving the stochastic growth model by using quadrature methods and valuefunction iterations", Journal of Business and Economic Statistics 8:49-51. Taylor, J.B., and H. Uhlig (1990), "Solving non-linear stochastic growth models: a comparison of alternative solution methods", Journal of Business and Economic Statistics 8:1 18. Traub, J.E, and H. Wozniakowski (1979), "Convergence and complexity of Newton iteration for operator equations", Journal of the Association of Computing Machinery 26:250-258. Whitt, W (1978), "Approximations of dynamic programs, i", Mathematics of Operations Research 3:231243. WhiR, W. (1979), "Approximations of dynamic programs, II", Mathematies of Operations Research 4:179-185. Wright, B.D., and J.C. Williams (1984), "The welfare effects of the introduction of storage", Quarterly Journal of Economics 99:169-182. Xu, Y. (1996), "Lagrange interpolation on Chebyshev points of two variables", Journal of Approximation Theory 87:220238.
Chapter 6

INDETERMINACY AND SUNSPOTS IN MACROECONOMICS

JESS BENHABIB
New York University

ROGER E.A. FARMER
UCLA
Contents

Abstract 388
Keywords 388
1. Introduction 389
2. Why should we care? 390
 2.1. Technical aspects of linear models 391
 2.2. Indeterminacy and propagation mechanisms in real models of business cycles 393
 2.3. Indeterminacy and propagation mechanisms in monetary models of business cycles 395
3. Indeterminacy in real models 398
 3.1. A framework for comparing different models 398
 3.2. The one-sector model with increasing returns 400
 3.3. The two-sector model with increasing returns 401
 3.4. The two-sector model with constant marginal returns 403
 3.5. Fixed costs and the role of profits 405
 3.6. Models with variable markups 405
4. Indeterminacy in monetary models 407
 4.1. Monetary models with one state variable and fixed labor supply 408
 4.2. Money in the utility function and the production function 410
 4.3. Monetary models with one state variable and a variable labor supply 411
 4.4. Monetary models with several state variables 413
5. Indeterminacy and policy feedback 416
 5.1. Fiscal policy feedback 416
 5.2. Monetary policy feedback 417
 5.3. Interest rate rules and indeterminacy 419
 5.4. Monetary models and sticky prices due to frictions in the trading process 422
6. Indeterminacy and models of endogenous growth 423
7. Some related work 426
8. Empirical aspects of models with indeterminacy 427
 8.1. Real models and propagation dynamics 427
  8.1.1. One-sector models 428
  8.1.2. Two-sector models 433
  8.1.3. Multi-sector models 435
 8.2. Monetary models and the monetary transmission mechanism 437
9. Some criticisms of the use of models with indeterminate equilibria to describe data 438
 9.1. Equilibrium selection 438
 9.2. Equilibrium forecast functions 440
 9.3. Does indeterminacy have observable implications? 442
10. Conclusion 442
Acknowledgements 443
References 443

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
Abstract
This chapter gives an overview of the recent literature on indeterminacy and sunspots in macroeconomics. It discusses some of the conceptual and technical aspects of this literature, and provides a simple framework for illustrating the mechanisms of various dynamic equilibrium models that give rise to indeterminate equilibria. The role of external effects, monopolistic competition, and increasing returns in generating indeterminacy is explored for one-sector and multi-sector models of real business cycles and of economic growth. Indeterminacy is also studied in monetary models, as well as in models where monetary and fiscal policy are endogenous and determined by feedback rules. Particular attention is paid to the empirical plausibility of these models and their parametrizations in generating indeterminate equilibria. An overview of calibrated macroeconomic models with sunspot equilibria is given, and their successes and shortcomings in matching properties of data are assessed. Finally, some issues regarding the selection of equilibria, the observable implications, and difficulties of forecasting that arise in such models are briefly addressed.
Keywords: indeterminacy, multiple equilibria, sunspots

JEL classification: E00, E3, O40
1. Introduction

Modern macroeconomics is based on dynamic general equilibrium theory, and for some time it has been known that, unlike static general equilibrium theory, in dynamic general equilibrium economies equilibria may be indeterminate 1. Indeterminacy means that there may be an infinite number of equilibria, all very close to each other, and the existence of indeterminacy in a dynamic model has, in the past, been considered to be a weakness of a theory that should be avoided by careful modeling assumptions. In contrast, a recent literature has grown up in macroeconomics that exploits the existence of an indeterminate set of equilibria as a means of understanding macroeconomic data. This chapter surveys this literature and places it in the context of other recent developments in macroeconomics.

The literature on quantitative aspects of indeterminacy can be organized around three strands. First, there is work that uses models with indeterminate equilibria to explain the propagation mechanism of the business cycle. Second, there is a group of papers that uses indeterminacy to explain the monetary transmission mechanism, specifically the fact that prices are sticky; and third, there is work in growth theory that uses indeterminacy to understand why the per capita incomes of countries that are similar in their fundamentals nevertheless save and grow at different rates. In this survey we explain the ideas that underlie each of these strands; we discuss the mechanisms that lead to indeterminacy and we report on the current state of quantitative models of the business cycle. We pay particular attention to areas in which models with indeterminate equilibria might offer a significant improvement over a more conventional approach.
A closely related concept to that of indeterminacy is the idea of a sunspot equilibrium, an idea developed by Cass and Shell [Shell (1977), Cass and Shell (1983)], to refer to equilibrium allocations influenced by purely extrinsic belief shocks in general equilibrium models 2. A sunspot equilibrium is one in which agents receive different allocations across states with identical fundamentals; that is, preferences,
1 Gale (1974) first demonstrated that indeterminacy occurs in Samuelson's "consumption-loans" model and Calvo (1978) was one of the first to discuss the issue in this context. Kehoe and Levine (1985) have an excellent discussion of the conditions under which indeterminacy can and cannot occur in infinite horizon general equilibrium economies.
2 Azariadis (1981) was the first published paper to show that sunspots may be responsible for business cycles although he uses the term self-fulfilling prophecies, originally coined by Robert K. Merton (1948). Woodford (1986, 1988) further demonstrated how sunspots could be relevant to understanding macroeconomic fluctuations. Howitt and McAfee (1992) use the term 'animal spirits' (popularized by Keynes in the General Theory) to refer to the same concept. It is perhaps unfortunate that these terms are now closely connected. Jevons, for example, who worked on sunspots in the 19th century, did not intend that 'sunspots' should refer to extrinsic uncertainty; instead he believed that there was a real link between the sunspot cycle, the weather and the agricultural sector of the US economy. Similarly, Keynes did not use animal spirits to mean self-fulfilling beliefs; instead his view of uncertainty was closer to Frank Knight's concept of an event for which there is too little information to make a frequentist statement about probabilities.
endowments and technology are the same but consumption and/or production differ. Sunspot equilibria can often be constructed by randomizing over multiple equilibria of a general equilibrium model, and models with indeterminacy are excellent candidates for the existence of sunspot equilibria since there are many equilibria over which to randomize. Sunspots cannot occur in finite general equilibrium models with complete markets since their existence would violate the first welfare theorem; risk averse agents will generally prefer an allocation that does not fluctuate to one that does. Examples of departures from the Arrow-Debreu structure that permit the existence of sunspots include (1) incomplete participation in insurance markets as in the overlapping generations model, (2) incomplete markets due to transactions costs or asymmetric information, (3) increasing returns to scale in the technology, (4) market imperfections associated with fixed costs, entry costs or external effects, and (5) the use of money as a medium of exchange.

We have drawn attention to three strands of literature: business cycles, monetary transmission and economic growth. The literature that uses indeterminacy and sunspots to understand business cycles is more fully developed than the work on growth theory, and models have been developed of both business cycles and the monetary transmission mechanism that provide quantitative explanations of economic data. These models exploit two ideas; first that indeterminacy may provide a rich source of propagation dynamics to an equilibrium model and second that sunspots may provide an alternative impulse to technology or taste shocks. In monetary models the dynamics of indeterminate equilibria have been exploited to explain how a purely nominal shock may have real effects in the short run without invoking artificial barriers to price adjustment.
In addition to their contribution to the theory of economic fluctuations, models of indeterminacy have been used in the literature on economic growth to explain different and sometimes divergent growth rates of countries and regions that start out with similar endowments and wealth levels. Despite the explosion of research in modern growth theory, many important questions remain unsettled and results are frequently not robust to alternative empirical specifications [see Levine and Renelt (1992)]. The growth literature on indeterminacy highlights the possibility that economic fundamentals alone will not pin down the savings rates for different countries since countries with identical endowments and wealth levels may coordinate on different equilibrium savings rates that may be determined by cultural, social or historical considerations.
2. Why should we care?
The initial work on indeterminacy in general equilibrium models was often abstract and far removed from issues of economic policy. Part of our goal in this survey is to dispel the misconception that indeterminacy is an esoteric area that is unconnected with the core of macroeconomics. We will show that, if one accepts dynamic general
equilibrium theory as an organizing principle, the possibility of indeterminacy is part of the package. Furthermore, indeterminate equilibria can illuminate a number of issues that are otherwise puzzles. Two issues that we will discuss in this section are (1) the role of beliefs in business fluctuations, and (2) the monetary transmission mechanism. In our concluding comments at the end of the chapter we will draw attention to some unanswered questions associated with the research agenda. These include the question of co-ordination of beliefs on a specific equilibrium and the way that an equilibrium is maintained.

2.1. Technical aspects of linear models
In deterministic models of dynamic economies, indeterminacy implies the existence of many equilibrium paths for quantities and prices that can be indexed by specifying initial conditions for prices. In stationary stochastic contexts, the effects of initial conditions on the evolution of the economic variables fade away. Nevertheless the indeterminacy of equilibrium in these environments allows the introduction of exogenous shocks that are not based on fundamentals. Such shocks would be inconsistent with equilibrium if the rational expectations equilibrium were unique. However, this is no longer the case in the presence of indeterminacy. As long as the sunspot shocks follow a stochastic process that is consistent with the expectations of agents, equilibrium conditions can be satisfied, and sunspots will affect the evolution of real economic variables. Since the stochastic process for sunspots can typically be chosen from a wide class, there are many possible stationary rational expectations equilibria. The particular equilibrium that prevails depends upon the beliefs that agents use to forecast future values of prices, and it is in this sense that sunspots "select" a stochastic equilibrium.

We begin by discussing some technical aspects of linear stochastic models in order to illustrate the content of indeterminacy for the applied econometrician. Our discussion centers on solution methods for linear models and illustrates the implications of indeterminacy for the methods that are used to formulate, simulate and estimate these models 3. Later in the chapter we will discuss the class of behavioral models that give rise to approximate linear models. These behavioral models are typically derived from an infinite-horizon maximizing problem solved by a representative agent, although there is no reason to maintain the representative agent assumption and similar linear models follow from a much larger class of dynamic general equilibrium models 4.
3 For a more complete discussion of solution methods in linear models with and without indeterminacies the reader is referred to Farmer (1993), King, Plosser and Rebelo (1987) or Blanchard and Kahn (1980). For an alternative, excellent, treatment of sunspots in a variety of models the reader is referred to the survey by Chiappori, Geoffard and Guesnerie (1992).
4 Kehoe and Levine (1985) show that infinite horizon models with a finite number of agents behave very much like the finite commodity model. The key distinction [originally pointed out by Shell (1971)] is between competitive models with a finite number of infinitely lived agents, in which there is generically
We start with the assumption that we have already solved for the equilibrium of a dynamic model and that the non-stochastic version of this model contains a balanced growth path. Linearizing around this balanced growth path leads to a system of equations of the form

    y_t = A y_{t-1} + B E_t[y_{t+1}] + C x_t + u_t,    (2.1)
    x_t = D x_{t-1} + v_t,    (2.2)

where y is a vector of endogenous variables, x is a vector of policy variables, u and v are conformable vectors of stochastic shocks, and A, B, C, and D are matrices of parameters that are found by taking first-order Taylor series approximations to the functions that describe a non-stochastic version of the model around its balanced growth path. These equations consist of market clearing conditions, Euler equations and static first-order conditions and a set of transversality conditions that impose boundedness conditions on the elements of y_t. We assume that policy is stationary, that is, the roots of D are all within the unit circle. To find the rational expectations solution to this model, one can rewrite Equations (2.1) and (2.2) as follows:

    Ã z_t = B̃ z_{t+1} + C̃ e_{t+1},    (2.3)

where

    z_t ≡ [y_{t-1}, x_{t-1}, y_t]',    Ã ≡ [A 0 0; 0 D 0; 0 0 I],    (2.4)

    e_{t+1} ≡ [u_t, v_t, y_{t+1} − E_t[y_{t+1}]]',
    B̃ ≡ [I −C −B; 0 I 0; I 0 0],    C̃ ≡ [−I 0 B; 0 −I 0; 0 0 0].    (2.5)

(The first row of this system restates Equation (2.1), with the forecast error y_{t+1} − E_t[y_{t+1}] substituted for the expectation; the second row restates Equation (2.2); and the third row is an identity.) Premultiplying by Ã^{-1}, and using the notation Φ ≡ Ã^{-1}B̃, Γ ≡ Ã^{-1}C̃, leads to

    z_t = Φ z_{t+1} + Γ e_{t+1}.    (2.6)

Generally, one can invert Equation (2.6) and write z_{t+1} as a function of z_t, but since some of the roots of the matrix Φ^{-1} lie outside the unit circle, this procedure does not typically allow one to construct the stochastic process for z_t that constitutes the rational expectations equilibrium. The problem is that arbitrary solutions to Equation (2.6) fail to remain bounded and they violate the transversality conditions of one or more of the agents in the underlying equilibrium model. In other words, the system

    z_{t+1} = Φ^{-1} z_t − Φ^{-1} Γ e_{t+1},    (2.7)

is explosive. In determinate rational expectations models one can eliminate the effect of explosive roots of Φ^{-1} by placing restrictions on z_t. Suppose we partition the vector z_t into two disjoint sets z_t^1, of dimension n_1, and z_t^2, of dimension n_2, where z_t^1 contains those variables that are predetermined at date t, and z_t^2 contains those variables that are free to be chosen by the equilibrium conditions of the model. Let λ be the roots of Φ, and partition λ into two sets λ^1, of dimension m_1, and λ^2, of dimension m_2, where λ^1 consists of the roots of Φ that are outside the unit circle and λ^2 of those that are inside the unit circle 5. The condition for a determinate solution is that one has exactly as many non-predetermined variables as non-explosive roots of Φ; in other words, n_2 = m_2. The solution to the determinate model is found by diagonalizing the matrix Φ and writing Equation (2.7) as a system of scalar equations in the (possibly complex) variables ξ_t, where the ξ_t are linear combinations of the z_t formed from the rows of the inverse matrix of eigenvectors of Φ. The elements of ξ_t associated with stable roots of Φ are set to zero; in the case when m_2 = n_2 these elements provide exactly enough linear restrictions to exactly determine the behavior of the z_t. In the case when there are fewer stable roots of Φ than non-predetermined initial conditions there is the possibility of multiple indeterminate equilibria. We illustrate the importance of this issue for economics with two examples.

(Footnote 4, continued:) a finite odd number of equilibria, and those with an overlapping generations structure in which there is a double infinity of goods and agents.
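The root-counting condition n_2 = m_2 can be checked mechanically. The sketch below is a minimal illustration in Python: it builds a first-order form for a scalar version of the system in Equations (2.1)-(2.2), computes the roots of Φ, and compares the number of stable roots with the number of non-predetermined variables. All parameter values (a = 0.2, c = 1, d = 0.9, and the two values of b) are illustrative assumptions, not taken from the chapter, and the particular stacking of the system is one of several equivalent choices.

```python
import numpy as np

def phi_eigenvalues(a, b, c, d):
    """Eigenvalues of Phi for the scalar example
         y_t = a*y_{t-1} + b*E_t[y_{t+1}] + c*x_t + u_t,
         x_t = d*x_{t-1} + v_t,
    stacked in first-order form A~ z_t = B~ z_{t+1} + C~ e_{t+1}
    with z_t = (y_{t-1}, x_{t-1}, y_t)' and Phi = inv(A~) @ B~."""
    A_tilde = np.diag([a, d, 1.0])
    B_tilde = np.array([[1.0, -c, -b],    # eq. for y, forecast error substituted in
                        [0.0, 1.0, 0.0],  # eq. for x
                        [1.0, 0.0, 0.0]]) # identity row
    return np.linalg.eigvals(np.linalg.inv(A_tilde) @ B_tilde)

def classify(eigs, n_free):
    """Compare m2 (stable roots of Phi) with n2 (non-predetermined variables)."""
    m_stable = int(np.sum(np.abs(eigs) < 1.0))
    if m_stable == n_free:
        return "determinate"
    return "indeterminate" if m_stable < n_free else "no bounded equilibrium"

# y_t is the only non-predetermined variable here, so n2 = 1.
for b in (0.5, 2.0):  # weak vs. strong feedback from expected future y
    eigs = phi_eigenvalues(a=0.2, b=b, c=1.0, d=0.9)
    print(f"b={b}: {classify(eigs, n_free=1)}")
```

With these illustrative numbers, b = 0.5 yields one stable root of Φ (m_2 = n_2 = 1, determinate), while the stronger expectational feedback b = 2.0 pushes every root of Φ outside the unit circle (m_2 = 0 < n_2, indeterminate).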
2.2. Indeterminacy and propagation mechanisms in real models of business cycles

The idea behind models of indeterminacy as a source of propagation of shocks can be best understood within the context of the Cass-Koopmans stochastic growth model. The equilibrium of this model can be described as the solution to a system of difference equations. If we allow for a productivity shock that enters the model as a disturbance to the production function, this system has three state variables: consumption, capital and productivity:

    [c̃_t, k̃_t, s̃_t]' = Φ [c̃_{t+1}, k̃_{t+1}, s̃_{t+1}]' + Γ [e_{t+1}, ũ_{t+1}]'.    (2.8)

The variables c and k are consumption and capital, and s is productivity. The tildes denote deviations from the balanced growth path; ũ is the innovation to the productivity shock and e, defined as

    e_{t+1} = c̃_{t+1} − E_t[c̃_{t+1}],    (2.9)

5 There is an important knife-edge special case in which one or more roots are exactly on the unit circle which we shall not explore in detail.
is the one-step-ahead forecast error of consumption. In the linearized model, there are two other endogenous variables, output and labor, that are described as linear functions of the "state variables" c, k and s. Using the definitions from the previous section, k and s are predetermined and c is non-predetermined. When the model is derived from a maximizing model with a constant returns-to-scale technology one can show that the matrix Φ has two unstable roots (outside the unit circle) and one stable root (inside the unit circle). Since there is one non-predetermined variable, c, and one stable root of Φ, one has a unique equilibrium; this equilibrium is found by eliminating the influence of the stable root of Φ. Since the roots of Φ^{-1} are the inverses of the roots of Φ, this procedure eliminates the influence of the unstable root of Φ^{-1} in the equation

    [c̃_{t+1}, k̃_{t+1}, s̃_{t+1}]' = Φ^{-1} [c̃_t, k̃_t, s̃_t]' − Φ^{-1} Γ [e_{t+1}, ũ_{t+1}]',    (2.10)

by making c̃ a function of k̃ and s̃.

In the special case when there are no shocks to productivity, s̃ is identically zero and the steady state of the bivariate system in c̃ and k̃ is a saddle point. In this case the unique solution makes c̃ a function of k̃ that places the system on the stable branch of the saddle. In the stochastic case c̃ depends not only on k̃ but also on s̃. In the stochastic model, the expectational error e_{t+1} is a function only of the innovation to the productivity shock ũ_{t+1} and there is thus no independent role for errors in beliefs to influence outcomes. This was the original point of the rational expectations revolution; it is possible to show that, if there is a unique rational expectations equilibrium, expectations must be a unique function of fundamentals.

When there is a unique equilibrium, it can be described as the solution to a second-order stochastic difference equation of the form
    [k̃_{t+1}, s̃_{t+1}]' = Φ̄ [k̃_t, s̃_t]' + Γ̄ ũ_{t+1},    (2.11)

with a set of side conditions,

    [c̃_t, ỹ_t, l̃_t]' = C̄ [k̃_t, s̃_t]',    (2.12)

that determine the values of the other variables of the system. The matrices Φ̄ and Γ̄ are functions of the elements of Φ and Γ, found by replacing c̃ with a function of k̃ and s̃; ỹ and l̃ represent output and employment, and the elements of C̄ are found from the linearized first-order conditions of the original equilibrium model.

The main idea of the recent indeterminacy literature is that small departures from the assumptions of the real business cycle model lead to big departures from its implications. Farmer and Guo (1994, 1995) take a variant of the Cass-Koopmans stochastic growth model, originally studied by Benhabib and Farmer (1994), in which the technology displays increasing returns 6. Their model has a representation of the same form as Equation (2.8) but, in contrast to the standard version of this model, in the Farmer-Guo version all the roots of the matrix Φ lie outside of the unit circle. It follows that all roots of Φ^{-1} are inside the unit circle and hence Equation (2.10) describes a stationary process for an arbitrary series of iid shocks e_t. In the standard model the forecast errors e_t are functions of the fundamental errors ũ_t. In contrast, in the increasing returns model, the forecast errors enter as independent shocks to the business cycle.

There are two important implications of models of indeterminacy for business cycle dynamics. The first is that animal spirits can represent an independent impulse to the business cycle even in a model in which all agents have rational expectations. The second is that models with indeterminacy display much richer propagation dynamics than models with a unique determinate equilibrium. In the model with increasing returns, for example, the matrix Φ may generate dynamics that lead to hump-shaped responses to shocks, wherever these shocks originate. In the standard model, the dynamics of consumption, capital and GDP are dictated mainly by the assumed form of the dynamics of the shocks 7.

2.3. Indeterminacy and propagation mechanisms in monetary models of business cycles

We have described how models with indeterminate equilibria can potentially be used to understand business cycles. A second area in which indeterminacy may prove important is in models of the monetary transmission mechanism. Once again, it is the propagation mechanism inherent in models with indeterminacy that sets these models apart.
Early papers on the issue of indeterminacy and monetary propagation were set in the context of the two-period overlapping generations model; papers in this literature include Geanakoplos and Polemarchakis (1986), Azariadis and Cooper (1985), Farmer and Woodford (1997), Farmer (1991, 1992) and Chiappori and Guesnerie (1994). Later work has switched focus, and more recently there has been an attempt to include money in infinite-horizon models. Papers that exploit the existence of indeterminate monetary equilibria in an infinite-horizon framework include Woodford (1986, 1988), Beaudry and Devereux (1993), Bennett (1997), Lee (1993), Matheny (1992, 1998), Matsuyama (1991b), Benhabib and Farmer (1996b), and Farmer (1997). One of the key ideas in this literature is that indeterminacy can be used to understand the monetary transmission mechanism.

6 The model used by Farmer and Guo is the one explored by Benhabib and Farmer (1994), although similar results would follow from the models of Gali (1994) or Rotemberg and Woodford (1992).
7 See, for example, the paper by Cogley and Nason (1995) who point out the discrepancies between the dynamic predictions of the RBC model and the much richer dynamics apparent in US data.
The equilibria of a simple monetary economy can often be characterized by a functional equation of the form

    m_t = E_t[G(m_{t+1}, μ_{t+1}, u_t)],    (2.13)

where m_t represents the real value of monetary balances, μ_{t+1} is the ratio of the money supply at date t+1 to the money supply at date t, u_t is a fundamental shock with known probability distribution, E_t is the expectations operator and G is a known function. There exist examples of more complicated monetary rational expectations models that include additional state variables and allow for endogenous capital formation. These more complicated models rely on the same key insight as simpler models with a single state variable: in models with indeterminacy, prices may be predetermined one period in advance and yet all markets may clear at all dates and agents may have rational expectations. Assuming that Equation (2.13) has a steady state, we may linearize around it to generate an equation that must be (approximately) satisfied by any sequence of real balances in a rational expectations equilibrium:

    m̃_t = a E_t[m̃_{t+1}] + β μ̃_{t+1} + γ u_t.    (2.14)

The variables m̃ and μ̃ represent deviations from the non-stochastic steady state. In a standard monetary model the parameter a is between zero and one in absolute value. In standard models one solves Equation (2.14) by iterating forwards to find the current value of real balances as a function of the rule governing the evolution of μ_t. In models with indeterminacy, on the other hand, a may be greater than one and in this case there exist many solutions to this model of the form

    m̃_{t+1} = (1/a) m̃_t − (β/a) μ̃_{t+1} − (γ/a) u_t + e_{t+1},    (2.15)
where e_{t+1} is an arbitrary iid sunspot sequence. Farmer and Woodford (1997) showed that one of these equilibria has the property that the price at date t+1 is known at date t; in this sense prices are "sticky" even though there is no artificial barrier to prevent them from adjusting each period. The existence of a predetermined price equilibrium is significant because it offers the possibility of using equilibrium theory to understand one of the most difficult puzzles in monetary economics: the characteristics of the monetary transmission mechanism.

In his classic essay "Of Money", published in the eighteenth century, David Hume described the empirical facts that at that time were known to characterize the aftermath of what today we would call an "unanticipated monetary injection". Following an addition of money to an economy we typically observe (1) a short-run increase in real economic activity, (2) a fall in short rates of interest, and (3) an increase in the real value of money balances. In the short run, the price level does not respond. Over a longer period of time the interest rate increases to its initial level, real economic
activity contracts and the full impact of the monetary increase is felt in prices. David Hume's observations were based on the effect of the importation of gold to Europe in the aftermath of the discovery of the "New World", but his observations have proven remarkably consistent with modern econometric evidence based on the analysis of vector autoregressions that allow us to construct estimates of impulse response functions 8.

The sequence of events described in the previous paragraph constitutes a description of what we believe to be a consensus view of the facts concerning the "monetary transmission mechanism". But although these facts have been known for two hundred years we still have to reach a consensus theory that can account for them. The current leading contenders as explanations of the monetary transmission mechanism are some version of the "menu costs" model due to Akerlof and Yellen (1985) and Mankiw (1985), the contract approach of Taylor (1980), the staggered price setting model of Calvo (1983) or the closely related cost-of-adjustment models of Rotemberg (1982, 1996). Each of these approaches has its merits. However, using standard approaches, it is often difficult to generate impulse response functions that resemble the data. Chari, Kehoe and McGrattan (1996) point out some problems of the staggered price setting approach in capturing monetary dynamics. The cost-of-adjustment model of Rotemberg (1996) with quadratic adjustment costs does a better job empirically, and new research in this area is likely to achieve further improvements in explaining propagation dynamics. In part this progress may arise from recognition that monetary models, with or without menu costs, staggered price setting, or informational problems that rationalize labor contracts, can contain a continuum of rational expectations equilibria (see for example Section 5.4 below).
By allowing the data to pick an equilibrium in which prices are slow to respond to new information, one may make considerable progress in reconciling equilibrium theory with the facts. This is the main message of the recent literature that we will review in Sections 4 and 5.2. As with the work on indeterminacy and business cycles, the main criticisms of monetary models with indeterminacy have been leveled at the plausibility of the mechanisms that cause indeterminacy to arise. For example, one can show that for certain classes of fiscal policy, monetary overlapping generations models possess two steady states; one is determinate and one is indeterminate. Since these examples are difficult to match with time series data they are open to the criticism that they have little relevance to real world economies. This criticism led to a set of papers based on the infinite-horizon model in which money enters because of a cash-in-advance constraint, or as a result of money in the utility function or in the production function. Models in this class have proved harder to dismiss than two-period overlapping
8 There is a huge array of work that studies the empirical characteristics of monetary impulse response functions in the USA [e.g. Sims (1980, 1989)]. One of the constant features of these data is the fact that the price level is slow to respond to purely nominal disturbances; that is, prices are "sticky". Nor does the characteristic of a slow price response seem to be peculiar to the USA, as demonstrated by Sims (1992), who compares the USA, the UK, France, Germany and Japan.
398
J. Benhabib and R.E.A. Farmer
generations models, although they leave open a number of issues to which we will return in Section 4. The main unanswered issues are: (1) how do agents co-ordinate on a particular equilibrium; (2) are models with indeterminacy possible for plausible parameter values; and (3) what are the implications of these models for a theory of optimal monetary policy?
3. Indeterminacy in real models

In this section we turn our attention to the specific mechanisms that give rise to indeterminacy. Our goal is to provide a common framework to assess the mechanisms that have been discussed in the recent literature. In the class of models that we will focus on, the source of indeterminacy may be viewed as arising from a coordination problem, and in this sense our survey provides an extension of the survey of static coordination problems by Cooper and John (1988) to models in which there is an essential dynamic element to the equilibrium concept. We begin by providing a very simple example that contains the essential elements of the models discussed in recent literature, and in the remaining parts of Section 3 we elaborate on the elements that are specific to each of a number of variants of this basic mechanism. Consider a specific equilibrium path for prices and rates of return; we are going to illustrate how, beginning from one particular equilibrium path, it may be possible to construct another. Suppose that agents collectively change their expectations and come to believe that the rate of return on an asset will increase. As a consequence of this belief, they begin to accumulate this asset at a faster rate and its price increases. Suppose the return on the asset indeed tends to increase with higher stocks, maybe because of the presence of increasing returns, or a mechanism that mimics increasing returns. Since the overall rate of return on assets must remain equal to an intertemporal rate of discount, maintenance of the new belief as an equilibrium path requires an expected depreciation in the price of the asset, or a capital loss, to offset the initial increase in its rate of return. If this price decline is sufficient to contain the explosive accumulation of the asset, then the resulting new path is also an equilibrium. We can now repeat this argument starting with the new equilibrium path, and construct yet another equilibrium.
Since there are infinitely many such paths, the original equilibrium is indeterminate. Some of the models that we will discuss work through a mechanism in which increasing returns to scale is an essential part of the argument. However, as a number of the models described in this survey will demonstrate, indeterminacy does not necessarily require increasing returns to scale.

3.1. A framework for comparing different models

Recent interest in models with increasing returns was inspired by the literature on endogenous growth initiated by Lucas (1988) and Romer (1990). Early work by these authors showed how to make increasing returns consistent with an equilibrium growth
model by building externalities or monopolistic competition into an otherwise standard dynamic general equilibrium model. The work of Hall (1988, 1990) and of Caballero and Lyons (1992) provided a further impetus to the increasing returns agenda by suggesting that externalities might be important not only in generating growth but also as a business cycle propagation mechanism. Their work suggested that the degree of increasing returns, exhibited via external effects or markups, was significant in many sectors of the economy. Subsequently a number of authors, notably Basu and Fernald (1995, 1997), Burnside, Eichenbaum and Rebelo (1995), and Burnside (1996), have scaled down the early estimates of Hall, bringing them closer to constant returns, and in certain cases even finding decreasing returns in some industries. Earlier theoretical models of indeterminacy, for example the model of Benhabib and Farmer (1994), had relied on large increasing returns in line with Hall's original estimates. Subsequent theoretical work, however, has substantially reduced the degree of increasing returns needed to generate indeterminacy [as in Benhabib and Farmer (1996a)], and it has now become clear that in an economy with some small market imperfections, even a technology with constant marginal returns can generate indeterminacy 9. To illustrate the critical elements that generate indeterminacy, we begin with a simple equilibrium structure that can, with slight modifications, accommodate a number of different models. The standard model with infinitely-lived identical agents maximizing the discounted sum of instantaneous utilities given by U(c) + V(1 − L), where c is consumption and L is labor, gives rise to the following system of equations:
$$U'(c) = p, \tag{3.1}$$

$$U'(c)\,\omega_0(k, L; \bar{k}, \bar{L}) = V'(1 - L), \tag{3.2}$$

$$\dot{k} = y\big(k, L(p,k); \bar{k}, \bar{L}(p,k)\big) - gk - c, \tag{3.3}$$

$$r = \frac{\dot{p}}{p} + \Big(\omega_1\big(k, L(p,k); \bar{k}, \bar{L}(p,k)\big) - g\Big). \tag{3.4}$$
Here ω₀ is the marginal product of labor, at this point taken to be equal to the real wage, and ω₁ is the rental rate on capital, equal to the marginal product of capital; the marginal products are taken with respect to the private inputs, keeping the aggregate inputs k̄ and L̄ fixed. The depreciation rate is g, the discount rate is r, the shadow price of capital is p, and the production function is given by y. In Equations (3.3) and (3.4), it is assumed that L(p, k) has been obtained by solving Equations (3.1) and (3.2). Equation (3.2) represents the labor market equilibrium, Equation (3.3) equates net investment to capital accumulation, and Equation (3.4) is the standard "Euler" equation requiring
9 A separate and related branch of the literature that we do not have space to cover in this survey has shown that search externalities also give rise to indeterminacy. Examples of papers that illustrate this possibility are those by Howitt and McAfee (1988) and Boldrin, Kiyotaki and Wright (1993). See also Matsuyama (1991a) for indeterminacy in a model of industrialization.
the equality of the return on the asset (its net marginal product plus its shadow price appreciation) to the rate of discount. The only non-standard feature of the model above is the inclusion of external effects, generated by the aggregate inputs k̄ and L̄ in the production function. Of course some deviation from the standard framework in the form of a market imperfection must be introduced to obtain indeterminacy, since the standard representative agent model has locally unique equilibria.

3.2. The one-sector model with increasing returns
We start by investigating the one-sector model of Benhabib and Farmer (1994), which demonstrates how indeterminacy can arise in a representative agent model with increasing returns 10. Increasing returns in this model is reconciled with private optimization by introducing either external effects in production, or a monopolistically competitive market structure where firms face downward sloping demand curves. Benhabib and Farmer show that the simple one-sector model with external effects is identical, in the sense of giving rise to the same reduced form, to a model with monopolistic competition and constant markups. To see how indeterminacy comes about in a model with external effects and increasing returns, consider starting from an equilibrium path, but let the agents believe that there is an equilibrium in which the shadow price of investment p is higher than its current value, and that future returns justify a higher level of investment. If agents act on this belief, the higher current price of investment reduces consumption and induces agents to divert GDP from consumption to investment. If there were no externalities, investment would increase, and the marginal product of capital would begin to decline as additional capital is accumulated. This decline would have to be offset by capital gains in the form of increases in the shadow price of capital in order to validate the belief of agents that higher rates of investment will yield the appropriate return. Trajectories of this sort for investment and prices may be maintained for a number of periods, but the resulting over-accumulation of capital and the exploding prices will violate transversality conditions: an agent will never get to consume enough in the future to justify the sacrifice of a higher rate of investment. In other words, an agent conjecturing such a path of prices and returns would be better off consuming, rather than accumulating additional capital at such a pace. 
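Before turning to the indeterminate case, it may help to fix ideas with the determinate benchmark. The sketch below computes the steady state of the system (3.1)–(3.4) under simplifying assumptions of our own (not from the original papers): Cobb–Douglas technology y = Ak^a, fixed labor, and no external effects, so that Equation (3.2) drops out. Setting ṗ = k̇ = 0 gives ω₁ = aAk^{a−1} = r + g and c = y − gk. All parameter values are hypothetical.

```python
# Steady state of (3.1)-(3.4) under illustrative assumptions:
# Cobb-Douglas y = A*k**a, fixed labor, no externalities.
# Parameter values are hypothetical, chosen only for illustration.

def steady_state(a=0.3, A=1.0, r=0.05, g=0.1):
    # p-dot = 0 in (3.4) requires omega_1 = a*A*k**(a-1) = r + g
    k = (a * A / (r + g)) ** (1.0 / (1.0 - a))
    y = A * k ** a
    # k-dot = 0 in (3.3) requires c = y - g*k
    c = y - g * k
    return k, y, c

k, y, c = steady_state()
print(round(k, 3), round(c, 3))
```

In the determinate case, the saddle-path dynamics select a unique trajectory converging to this point; the indeterminacy arguments that follow ask when a continuum of convergent paths exists instead.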
Consider now an alternative parametrization of this model in which externalities are sufficiently big to permit indeterminate equilibria. Once again, let agents conjecture that there exists an alternative equilibrium path, starting with a price of investment p, which is higher than the one in the current steady-state equilibrium. The higher price will cause agents to divert GDP from consumption to investment but, if externalities are strong enough, they will simultaneously increase their consumption of leisure. The
10 See also Benhabib and Rustichini (1994), Boldrin and Rustichini (1994) and Chamley (1993).
increase in leisure will cause GDP to decline, and investment will eventually fall as well. Benhabib and Farmer show that this dynamic argument has a representation in terms of labor demand and supply curves, with strong increasing returns to the labor input, and where the labor demand curve slopes up more steeply than the labor supply curve. In this framework shifts of the curves generated by conjectured changes in the shadow price of capital can lead to the contractionary employment effects mentioned above. As the marginal product of capital falls with the decline of labor, the shadow price of investment must appreciate to produce a capital gain, because in equilibrium the overall return on capital must be equal to the rate of discount. This reinforces the original impulse of a higher relative price for the capital good. The contraction of labor however causes GDP and investment to decline, and the capital stock starts to fall. The decline in the stock of capital reverses the process, because it shifts the labor demand curve down. Since the labor demand curve slopes up more steeply than the labor supply curve, a downward shift in labor demand tends to increase employment. This is the critical element that gives rise to indeterminacy in the model. Higher employment and the low level of the capital stock both cause the marginal product of capital to increase, and intertemporal equilibrium now requires a depreciation, rather than appreciation, in the price of capital to equate the overall return to the discount rate. As the price of capital falls, the economy returns to its original steady state along this new equilibrium path. The key to indeterminacy in this model then lies in the non-standard slopes of the labor demand and supply curves, which induce a perverse labor market response. This feature, which requires a high enough externality to induce increasing returns in the labor input by itself, is what makes the model empirically implausible.
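The slope comparison at the heart of this argument can be stated as a simple inequality. Under stylized assumptions of our own (log-linear curves, Cobb–Douglas technology y = k^a L^b with the externality absorbed into b, and labor disutility of the form L^{1+χ}/(1+χ)), labor demand has slope b − 1 in (log wage, log labor) space and labor supply has slope χ; the "perverse" configuration requires b − 1 > χ. A minimal, hypothetical check:

```python
# Slope condition from the labor-market argument above.  Labor demand from
# y = k**a * L**b has log-slope b - 1; labor supply with disutility
# L**(1+chi)/(1+chi) has log-slope chi.  The perverse configuration needs an
# upward-sloping demand curve that is steeper than supply.
# Parameter values below are hypothetical.

def perverse_labor_market(b, chi):
    return (b - 1.0) > chi

print(perverse_labor_market(b=1.4, chi=0.0))   # strong labor externality
print(perverse_labor_market(b=0.7, chi=0.0))   # standard decreasing returns
```

With χ = 0 (indivisible-labor-style supply), the condition reduces to increasing returns in labor alone (b > 1), which is why the one-sector model needs an empirically implausible externality.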
3.3. The two-sector model with increasing returns
A more satisfactory specification leading to indeterminacy is given by Benhabib and Farmer (1996a). They start with a two-sector model, but with identical production functions in the consumption and investment sectors. One might think that by making the production functions of the two sectors identical, the model would collapse to a one-sector economy. However, the two-sector structure is preserved by the distinct external effects in each sector, each arising from their own sectoral outputs, rather than from aggregate output. The model yields a linear production possibilities frontier (ppf) from the private perspective, but one that is convex (to the origin) from the social perspective. To develop a parallel to the exercise conducted for the one-sector case above, consider again starting at a steady state and increasing the (shadow) price of capital p. Since we have a two-sector model, Equation (3.1) has to be modified to reflect the relative price of capital in terms of the consumption good. If we denote this price by q, we get

$$q\,U'(c) = p. \tag{3.5}$$
A higher q now raises consumption since, given the convexity of the ppf, an increase in consumption relative to investment will be associated with an increase in the relative price of capital. The impact effect on p is ambiguous because U′(c) declines as well, but if the curvature of U(c) is not too severe, p and q will change in the same direction. Benhabib and Farmer (1996a) use logarithmic utility of consumption in their model. Nevertheless, the reader may want to think of U′(c) as constant in order to get a clearer picture of the logic of the argument. When consumption increases, the supply curve of labor shifts to the left, and since the demand curve for labor slopes down in the two-sector model, the result is a contraction in labor, and also in investment. As in the previous case, the decline in labor decreases the marginal product of capital. Maintaining intertemporal equilibrium now requires an appreciation in the (shadow) price of capital to keep the overall return equal to the discount rate, reinforcing the initial rise in q, and therefore in p. However, the decline of investment and of the capital stock must eventually raise the marginal product of capital, and reverse the appreciation of p: intertemporal equilibrium eventually requires a depreciation of the shadow price of capital. The process therefore is reversed, and the economy moves back towards the steady state, giving rise to an alternative equilibrium trajectory. It is clear in this case that the elements responsible for the "stability" of the steady state, and therefore for "indeterminacy," are no longer the perverse slopes of demand and supply in the labor market, but the convexity of the ppf. Benhabib and Farmer (1996a) find that even a small output externality, resulting in increasing returns as low as 1.07, coupled with standard values for the other parameters, is sufficient to generate indeterminacy.
Furthermore their calibration analysis incorporating iid sunspot shocks does quite well by the standards of recent real business cycle analysis. In spite of an apparent improvement over the one-sector model, one may question the implication of a convex social ppf, which implies that the sectoral aggregate supply curves slope down. This issue has been empirically investigated by Shea (1993). He studies 26 manufacturing industries and finds that in 16 of them, the supply curves slope up. The question therefore arises as to whether indeterminacy requires increasing returns in all sectors. Recent results of Harrison (1996) on the two-sector model show that indeterminacy obtains for roughly the same parametrizations as in Benhabib and Farmer (1996a), with the exception that only the investment sector is assumed to have increasing returns. Furthermore, her estimates of increasing returns and externalities, obtained by allowing them to vary across sectors, indicate that they may be sufficiently high in the investment sectors to support indeterminate equilibria 11. Perli (1994) obtains similar results by introducing a home production sector as in Benhabib, Rogerson and Wright (1991), with sector-specific external effects only in the
11 Basu and Fernald (1997) do indeed find that there is heterogeneity of returns to scale across industries.
market sector, and he obtains indeterminacy with small increasing returns. Furthermore the time series generated by his calibrated model are comparable to US postwar series. In a recent paper Weder (1996) also shows that in a two-sector model indeterminacy can arise with mild increasing returns in the investment sector alone. His specification relies on imperfectly competitive product markets rather than external effects.

3.4. The two-sector model with constant marginal returns
So far it would seem that some increasing returns may be necessary to generate indeterminacy. In a recent paper Benhabib and Nishimura (1998) demonstrate that social constant returns to scale, coupled with some small external effects that imply very mild decreasing returns at the private level, can also generate indeterminacy. In their framework there are private decreasing returns to scale and small external effects that give rise to constant returns to scale at the level of the aggregate economy. In the two models we discussed in Sections 3.2 and 3.3, a rise in the shadow price of capital eventually induces the capital stock to fall and its marginal product to increase. Consequently, intertemporal equilibrium requires the price of capital to fall, reversing its original increase. This mechanism can be duplicated in a two-sector model without upward sloping labor demand curves or a convex ppf. The reason is that in models with more than one sector, the marginal product of capital depends not only on factor inputs, but also on the composition of output and on the relative factor intensities of the underlying technology. We can express the technology of the two-sector economy in per capita variables as a transformation surface given by c = T(y, k). Such a technology implies that T₁ = −p and T₂ = ω₁, where the subscripts of T indicate derivatives with respect to the appropriate argument. T₁ then is the slope of the ppf, while T₂ corresponds to the marginal product of capital in the production of the capital good 12. Consider first a simple two-sector model without externalities, and production functions that differ across the consumption and investment sectors. For simplicity also assume that total labor is fixed, and that the utility function is linear in consumption, so that in terms of Equation (3.5) we have q = p.
If the production possibility frontier is strictly concave, we can invert the relation T₁(y, k) = −p to obtain the output of the investment good as y = y(k, p). Equation (3.3) must now be modified as follows:

$$\dot{k} = y(k, p) - gk. \tag{3.6}$$
Equations (3.4) and (3.6) now fully describe the dynamics of the system in (k, p). With external effects suppressed and a fixed labor supply, the local dynamics around the steady state will depend on the Jacobian matrix J:

$$J = \begin{pmatrix} \dfrac{\partial y}{\partial k} - g & \dfrac{\partial y}{\partial p} \\[1.5ex] -\dfrac{\partial \omega_1}{\partial k} & -\dfrac{\partial \omega_1}{\partial p} + (r+g) \end{pmatrix}.$$
12 That T₂ = ω₁ is not immediate, but follows from efficiency conditions and envelope theorems.
Note that the lower-left element, J₂₁ = −∂ω₁/∂k, is identically zero. This is because under constant (social) returns to scale relative factor prices uniquely determine prices as well as input coefficients. To demonstrate that equilibrium is determinate we must show that the roots of J have opposite sign; but, since J₂₁ is zero, the roots of J are equal to the elements on the main diagonal, J₁₁ and J₂₂. Determinacy is the assertion that these elements have opposite sign. Benhabib and Nishimura demonstrate that the signs of J₁₁ and J₂₂ are related to two familiar theorems in the international trade literature, the Stolper-Samuelson theorem and the Rybczynski theorem. We deal here only with the case in which the investment good is labor intensive, although the argument that we will present can be easily extended to the case when the investment good is capital intensive, by reversing the signs of the two inequalities that we will present. Consider first the Stolper-Samuelson theorem, which asserts (in the labor intensive case) that a rise in p will decrease the rental price of capital, ω₁. In symbols this asserts that

$$\frac{\partial \omega_1}{\partial p} < 0.$$

Since, at the steady state, ω₁ = p(r + g), the Stolper-Samuelson theorem implies that the element J₂₂ is positive. Now consider the Rybczynski theorem, which asserts that if more capital is used in the investment goods sector, output of investment goods will rise less than proportionately. In symbols this is represented by the inequality

$$\frac{\partial y}{\partial k} < \frac{y}{k}.$$

Since, at the steady state, y = gk, the Rybczynski theorem implies that the element J₁₁ is negative. It follows, for the case when J₂₁ is zero, that the Stolper-Samuelson and Rybczynski theorems can be used to establish that equilibrium is determinate. More generally, in multi-sector models J₁₁ and J₂₂ will be matrices rather than scalars, and with linear utility J₂₁ will still be a zero matrix, so that the roots of J will be given by the roots of J₁₁ and J₂₂. However, as shown in Benhabib and Nishimura (1998), the duality between the Stolper-Samuelson theorem and the Rybczynski theorem will imply that at least half of the roots of J will be positive, implying that the equilibrium is determinate 13. What happens when there are external effects? Benhabib and Nishimura (1998) establish that in this case one can break the reciprocal relation between the Rybczynski and Stolper-Samuelson theorems. Output effects on investment are still governed by the logic of the Rybczynski theorem, but the Stolper-Samuelson theorem requires that
13 In some cases, more than half of the roots of J can have positive real parts. This situation however is associated with the presence of optimal cycles and not indeterminacy. See Benhabib and Nishimura (1979).
costs of production equal prices. With external effects, markets are distorted and the relevant factor intensities can be reversed. This reversal may break the saddle-point property (even when external effects are 'small') and generate a situation in which more than half of the roots of J are negative and there are multiple indeterminate equilibria. Benhabib and Nishimura (1998) calibrate and simulate a discrete-time version of their model, with one consumption and two investment goods, which incorporates iid sunspot shocks and logarithmic utility. Their calibration can match the moments of GDP, consumption, aggregate investment and hours in US data as well as any other standard RBC model, and can generate impulse responses to technology shocks that resemble the hump-shaped impulse responses generated with vector autoregressions on US data. (See Section 8.1.2 below.)

3.5. Fixed costs and the role of profits
The results sketched above suggest that introducing even small market imperfections into the standard infinitely-lived representative agent model can produce empirically plausible indeterminacy even under constant social returns. Increasing returns to scale at the level of the aggregate social production function are not necessary for indeterminacy. In order to generate this result, however, there must be decreasing returns at the level of private firms, and an implication of this is that firms will earn positive profits. In the parametrized examples given by Benhabib and Nishimura (1998) these profits are quite small because the size of external effects, and therefore the degree of decreasing returns needed for indeterminacy, are minimal. Nevertheless positive profits would invite entry, and unless the number of firms is fixed, a fixed cost of entry must be assumed to determine the number of firms along the equilibrium path. Such a market structure would exhibit increasing private marginal costs but constant social marginal costs, which is in line with current empirical work on this subject. It seems therefore that models of indeterminacy based on market imperfections which drive a wedge between private and social returns must have some form of increasing returns, no matter how small, either in variable costs, or through a type of fixed cost that prevents entry in the face of positive profits 14. The point is that while some small wedge between private and social returns is necessary for indeterminacy, this in no way requires decreasing marginal costs, or increasing marginal returns in production.

3.6. Models with variable markups
The use of variety in intermediate or consumption goods, coupled with a monopolistically competitive market structure, has been incorporated by a number of
14 For a more detailed statement of this argument see the papers by Gali (1994) and Gali and Zilibotti (1995).
authors into the standard optimal growth model. Woodford (1987) first demonstrated that a monopolistically competitive market structure with free entry, coupled with variable markups, can lead to indeterminacy and to self-fulfilling sunspot equilibria. He constructs a model in which aggregate output, investment, and employment are driven by expectations of aggregate demand. Furthermore, kinks in the demand curves faced by firms give rise to variable markups, allowing the quantity of labor supplied to adjust to the quantity demanded through changes of aggregate demand in the goods market. As noted by Woodford, this labor market structure with variable markups relaxes the rigid relation between wages and the marginal product of labor that is a feature of competitive market models, and avoids the implication that wages must move countercyclically in the absence of technology shocks 15. Two related approaches, using variable markups, are those of Gali (1994, 1996) and Rotemberg and Woodford (1992). Gali develops a monopolistically competitive market structure in which firms mark up price over marginal cost. He shows that changes in the composition of aggregate demand between investment and consumption may cause the markup to vary systematically over the business cycle, and he uses the countercyclical markup to demonstrate the possibility that there may be many indeterminate equilibria. Rotemberg and Woodford develop a model of business cycles based on implicit collusion among firms that strategically vary markups depending on the state of aggregate demand. Their model also gives rise to indeterminacy, and although the exact mechanism is somewhat different from that in Gali or Benhabib and Farmer, the implications for business cycle data are similar, as demonstrated in the recent paper by Schmitt-Grohé (1997). We will return to this paper in Section 8.1.1, in which we discuss empirical aspects of indeterminacy.
We can illustrate the mechanism that gives rise to indeterminacy in the models of Gali, and of Rotemberg and Woodford, in the context of our simplified model consisting of Equations (3.1), (3.3) and (3.4). For simplicity we assume that labor is fixed, although variable labor would allow a smaller and less variable markup to generate indeterminacy. First we note that the return on (or user cost of) capital in Equation (3.4) will have to be modified: it is not the marginal product but the marginal product divided by the markup:
$$r = \frac{\dot{p}}{p} + \left(\frac{\omega_1\big(k, L(p,k); \bar{k}, \bar{L}(p,k)\big)}{\mu} - g\right). \tag{3.7}$$
Here, we simply assume that the markup μ is related to p, although in the works cited above the variability of the markup is derived from more basic structural assumptions. The influence of a variable markup works in the following way. Starting at a steady-state equilibrium, an increase in p raises the share of investment
15 This issue is also related to the comovements of consumption and output, and is discussed further below in Section 8.1.2.
and leads to faster capital accumulation. The higher investment share also lowers the markup and raises ω₁/μ, even though ω₁ declines. With r fixed, ṗ/p then must decline, reversing the process and driving the economy back towards the steady state. This then is another equilibrium, implying that the initial value of p, and therefore the equilibrium trajectory, is indeterminate. The same process would work if the markup depended directly on the capital stock, so that ω₁/μ increased with a rise in the capital stock. Gali (1994, 1996) produces variable markups by introducing different demand elasticities for investment and consumption goods. The average markup depends on the relative shares of consumption and investment in aggregate output. Gali assumes the elasticities are such that the markup is negatively related to the investment share, and he presents evidence from US data to support his contention that this assumption is a reasonable first approximation to the facts. In the model of Rotemberg and Woodford (1992) the variability of μ results from an arrangement between firms that share the market and earn a stream of profits in an implicit collusion arrangement. The equilibrium conditions imply that the markup μ depends on the ratio of the values of the firms to aggregate output. Implicit collusion among firms requires countercyclical markups to maintain discipline, and to prevent deviation from the collusive arrangement. One of the important implications of models with variable markups, as shown in the paper by Schmitt-Grohé (1997), is that indeterminacy in these models can occur with a lower degree of increasing returns to scale than in the case of constant markups. We return to this idea in Section 8.1.1.
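The markup mechanism can be summarized by the same eigenvalue accounting used throughout this section: with k predetermined and p free, the steady state of the (k, p) system is locally indeterminate when the number of stable roots of its Jacobian exceeds the number of predetermined variables, since any nearby initial p then starts a convergent equilibrium path. A sketch with hypothetical Jacobians (the markup feedback is simply assumed to flip the sign of the ṗ response to p):

```python
import numpy as np

# Counting criterion for local indeterminacy in a continuous-time system:
# more eigenvalues with negative real part than predetermined variables
# implies a continuum of convergent equilibrium paths.
# Both Jacobians below are hypothetical illustrations.

def is_indeterminate(J, n_predetermined=1):
    stable = int(sum(ev.real < 0 for ev in np.linalg.eigvals(J)))
    return stable > n_predetermined

J_markup = np.array([[-0.04,  0.50],
                     [-0.02, -0.10]])   # markup feedback: trace < 0, det > 0
J_saddle = np.array([[-0.04,  0.50],
                     [ 0.00,  0.15]])   # constant-markup benchmark

print(is_indeterminate(J_markup), is_indeterminate(J_saddle))
```

For a 2×2 system, a negative trace together with a positive determinant guarantees that both roots have negative real parts, which is the indeterminate configuration in the first example.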
4. Indeterminacy in monetary models

In this section we discuss indeterminacy in models that include the real value of money balances as an argument of the utility function or of the production function, following the work of Patinkin (1956). Brock (1974) was the first to formally discuss the possibility of indeterminacy associated with self-fulfilling hyperinflationary and deflationary equilibria, as well as conditions that rule out such equilibria, in a model of an infinitely lived representative agent with real balances in the utility function. Since Brock's paper, much of the literature on indeterminacy in monetary models has focused on self-fulfilling inflations and deflations, rather than on the possibility of a continuum of equilibria converging to a steady state, possibly because the former can occur under relatively weak conditions. Hyperinflationary equilibria can be ruled out if it is assumed that it is prohibitively costly for the economy to operate at low levels of real balances, or close to a barter economy. For certain classes of monetary policies, speculative deflations can also be ruled out by restrictions on preferences. Obstfeld and Rogoff (1983) provide conditions under which such hyperinflationary and deflationary equilibria can be ruled out. An extensive overview of indeterminacy in a cash-in-advance or a cash good-credit good model is given by Woodford (1994). He analyzes the conditions for the existence of hyperinflationary and deflationary equilibria as well
as equilibria described by a continuum of paths that converge to a steady state and which can give rise to stationary sunspot equilibria 16. One of the first to note the possibility of indeterminacy in the form of a continuum of equilibria converging to a steady state was Calvo (1979), and there have been many related papers since. Wilson (1979) gives some of the early examples of indeterminate equilibria in the context of a cash-in-advance model. Taylor (1977) has one of the first discussions of indeterminacy in a monetary model, where he proposes a selection principle that picks the equilibrium solution exhibiting the minimum variance. McCallum (1983) provides a survey of the early literature on monetary indeterminacy and also proposes a 'minimal state variable solution' to select an equilibrium. Rather than try to survey additions to this vast body of work, we concentrate instead on recent papers that have exploited indeterminacy to address and study empirical features of the business cycle. Our point of departure then is the recent interest in calibrated versions of indeterminate models, and in particular the idea that indeterminacy can be a feature that allows us additional freedom to explain properties of economic fluctuations that are otherwise difficult to understand.
4.1. Monetary models with one state variable and fixed labor supply
We begin our discussion with a class of models in which money plays the role of facilitating exchange and in which all other aspects of the model are stripped down to a bare minimum. These are models in which real money balances are the sole state variable. By focusing on the simplest possible monetary model we will highlight an idea that holds in more general examples: indeterminacy is most easily obtained in monetary models when changes in the stock of real balances have large effects on output. These effects can come from including money in the utility function as in Calvo (1979), money in the production function as in Benhabib and Farmer (1996b), or from a cash-in-advance constraint as in the calibrated monetary models of Cooley and Hansen (1989, 1991). To relate monetary models to our discussion of indeterminacy in real economies, suppose that output depends on real balances: y = y(m), with y increasing and strictly concave 17. We assume that money is injected into the economy in equal lump-sum
16 See also the symposium on "Determinacy of equilibrium under alternative policy regimes" in Economic Theory, volume 4, no. 3, 1994. There is also an extensive empirical literature initiated by Flood and Garber (1980) that tests for the existence of speculative bubbles.
17 One way to derive a model of money in the production function is by assuming that firms face transaction costs, T(q, m), where q is gross output and m is real balances. In this case net output is given by y(m) = q - T(q, m). If one specifies T as a Leontief function then the model reduces to a cash-in-advance specification. One could also replace q with consumption c in T to derive a variant of a model with money in the utility function.
Ch. 6: Indeterminacy and Sunspots in Macroeconomics
transfers to all agents, and that nominal balances grow at the rate σ. Then the net rate of return on holding money is
$$y'(m) - \pi = y'(m) - \left(\sigma - \frac{\dot m}{m}\right),$$
where π is inflation, and π ≡ σ - ṁ/m in equilibrium. If we assume that money is the only asset, then ċ = y'(m) ṁ. Now using equations (3.1) and (3.4), we obtain
$$\dot m \left(\frac{1}{m} + \frac{U''(c)\, y'(m)}{U'(c)}\right) = r + \sigma - y'(m). \quad (4.1)$$
If we define the elasticities
$$e_c \equiv -\frac{U''(c)\, c}{U'(c)}, \qquad e_m \equiv \frac{y'(m)\, m}{y(m)}, \quad (4.2)$$
then Equation (4.1) becomes
$$\dot m = \frac{m\,\bigl(r + \sigma - y'(m)\bigr)}{1 - e_c e_m}. \quad (4.3)$$
Equation (4.3) is a differential equation that must be obeyed by paths for real balances that are consistent with rational expectations equilibrium. The model has an equilibrium in which real balances are constant and equal to m*. This steady-state equilibrium is defined as the solution to the equation
$$r + \sigma = y'(m^*).$$
Since equilibria can be described as functions of a single state variable, there will exist a set of indeterminate equilibria if the steady state in equation (4.3) is locally stable: this requires the right-hand side of the equation to be decreasing in m at m* 18. Since m(r + σ - y'(m)) is increasing in m at m*, a sufficient condition for indeterminacy is that money is sufficiently productive, or alternatively put, that e_m is large enough so that e_m e_c > 1 19. The intuition for this result is straightforward. When money is not sufficiently productive, an increase in nominal balances or alternatively an initial low price level
18 It is easy to show that over the range of m for which y'(m) > 0, higher values of initial m lead to higher welfare because m and output will remain higher at all points in time.
19 Note that if the optimal quantity of money rule is implemented so that σ = -r, at the steady state we will have y'(m) = 0 and e_m = 0. Once a level of m is attained at which y'(m) = 0, it is natural to think that this will continue to hold for higher levels of m because even if money balances cannot cut transaction costs any further, they cannot hurt either. In such a case we will have a continuum of steady-state equilibria with real balances satisfying y'(m) = 0, as discussed in Woodford (1987).
that corresponds to real balances higher than m*, generates an excess supply of money that spills onto the goods market: prices have to jump so that output, y(m*), and real balances, m*, remain fixed at their steady-state values. If prices did not jump to restore the steady-state values, real balances would rise, the return on money y'(m) would decline, and the higher money balances would be held only if some further deflation were expected. If such deflationary expectations were confirmed, real balances would eventually explode, and unless certain assumptions are made to rule out such speculative deflationary equilibria, we would still have indeterminacy, even though the steady-state level of real balances is unstable. On the other hand, if money were sufficiently productive, an increase in real balances at m* would increase output and create a net excess demand for money, rather than an excess supply. More formally, the higher real balances would raise consumption and reduce the marginal utility of current consumption so much that the agents would want to hold higher levels of the monetary asset 20. Therefore an increase in nominal balances, or a low initial price that places real balances above m*, would generate a net excess demand for money which would have to be offset by expected inflation in order to restore equilibrium. Inflation would then drive real balances back down to their steady-state value without requiring an instantaneous jump in prices. Thus the steady state is stable, with a continuum of trajectories of real balances converging to it.

4.2. Money in the utility function and the production function

The model that we discussed in Section 4.1 included money as an argument in the production function. In this section we will show how models of this kind can be extended to include money as an argument of both the production and utility functions.
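Before adding money to the utility function, the indeterminacy condition of Section 4.1 can be verified numerically. The sketch below is our own illustration, not taken from the papers surveyed: it assumes the hypothetical forms y(m) = A m^α (so that e_m = α) and CRRA utility with coefficient γ (so that e_c = γ), computes the steady state of Equation (4.3), and checks that it is locally stable, and hence surrounded by a continuum of converging equilibria, exactly when e_c e_m = γα exceeds one.

```python
# Illustrative check of the Section 4.1 indeterminacy condition e_c * e_m > 1.
# Functional forms and parameter values are hypothetical: y(m) = A*m**alpha
# (so e_m = alpha) and CRRA utility with coefficient gamma (so e_c = gamma).
A, r, sigma = 1.0, 0.04, 0.02

def m_star(alpha):
    # steady state of (4.3): y'(m*) = r + sigma, i.e. A*alpha*m**(alpha-1) = r + sigma
    return ((r + sigma) / (A * alpha)) ** (1.0 / (alpha - 1.0))

def mdot(m, alpha, gamma):
    # equation (4.3): mdot = m*(r + sigma - y'(m)) / (1 - e_c*e_m)
    return m * (r + sigma - A * alpha * m ** (alpha - 1.0)) / (1.0 - gamma * alpha)

def locally_stable(alpha, gamma, h=1e-6):
    # a locally stable steady state means a continuum of equilibria converging to it
    ms = m_star(alpha)
    slope = (mdot(ms * (1 + h), alpha, gamma) - mdot(ms * (1 - h), alpha, gamma)) / (2 * h * ms)
    return slope < 0

print(locally_stable(alpha=0.6, gamma=3.0))    # e_c*e_m = 1.8 > 1: indeterminate
print(locally_stable(alpha=0.004, gamma=2.0))  # e_c*e_m = 0.008 < 1: determinate
```

The central-difference slope simply reproduces, for this parametrization, the analytical condition that the right-hand side of (4.3) be decreasing in m at m*.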
This modification will be useful in our discussion of policy rules in Section 5.3, where money in the utility function allows us to demonstrate that indeterminacy can arise in a broader range of cases than in the simple production-function model of Section 4.1. We model the utility function with the specification U(c, m), where U is increasing and concave in c and m. When money enters both the production and utility functions, Equation (4.3) becomes
$$\dot m = \frac{m\left(r + \sigma - y'(m) - \dfrac{U_m(c,m)}{U_c(c,m)}\right)}{1 - e_c e_m - e_{cm}}, \quad (4.4)$$
where U and its derivatives are evaluated at c = y(m), and the cross-partial term
$$e_{cm} \equiv -\frac{m\, U_{cm}}{U_c} \quad (4.5)$$
20 Note here that this effect is the result not only of the high marginal productivity of money, reflected in e_m, but also of the rate at which marginal utility declines, reflected in e_c. This is clear from the condition for indeterminacy, which requires the product e_m e_c to be large.
plays an important role in the way the model behaves. This term measures the effect of holding extra real balances on the marginal utility of consumption. The term U_m/U_c, given by the expression
$$\frac{U_m(y(m), m)}{U_c(y(m), m)},$$
is also important in determining whether equilibrium is determinate, since if U_m/U_c is decreasing in m the steady state will be locally stable, and therefore indeterminate, whenever 1 - e_c e_m - e_{cm} < 0. It seems reasonable to assume that neither consumption nor money are inferior goods,
$$U_{mm} - \frac{U_m}{U_c}\, U_{cm} < 0.$$
But this is not enough to determine whether U_m/U_c is increasing or decreasing in m. It might seem that the discussion in this section is of little empirical relevance. Perhaps utility functions that allow for peculiar cross partials can be easily ruled out by data, and we should restrict attention to logarithmic functions or at least to utility functions that are separable in consumption and real balances. Unfortunately, this argument negates the very reason for including money in the utility function in the first place, since it is precisely changes in the marginal utility of transacting that one might expect to characterize a monetary economy. In Section 4.4 we discuss a calibrated model used by Farmer (1997) in which he demonstrates that models with indeterminacy can be used to mimic impulse response functions in the data, in addition to capturing the more salient features of velocity and the rate of interest in US data. Farmer includes money in the utility function and chooses a weighted sum of CES utility functions that allows the term d(U_m/U_c)/dm to have either sign. It is precisely this flexibility that allows the model to capture the empirical features that we will describe in Section 4.4.

4.3. Monetary models with one state variable and a variable labor supply
In this section we will show how the extension of monetary models to allow for a second factor of production, labor, can increase the ability of these models to describe data by generating indeterminacy for a more plausible range of the parameter space. Models of this kind still have a single state variable, since one can show that, in equilibrium, hours worked are a function of real balances. As in the case of models with a fixed supply of labor, indeterminacy is most likely to occur when money has a big effect on output. There is a growing literature on calibrated monetary models, using cash-in-advance or money-in-the-utility-function approaches, that finds a unique rational expectations equilibrium; examples include Cooley and Hansen (1989, 1991) and related literature.
One reason why calibrated models of money may appear to have a unique determinate equilibrium is that these models often use simple functional forms that allow a single parameter to capture the importance of money. Recent work by Benhabib and Farmer (1996b) and Farmer (1997) demonstrates that indeterminacy may occur in a monetary model for realistically calibrated parameter values by modeling the role of money with more flexible functional forms that nest the cash-in-advance model as a special case. For example, in Benhabib and Farmer (1996b), output is produced using labor and the services of money:
$$y = y(m, l). \quad (4.6)$$
If one makes the assumption that the technology is Cobb-Douglas, there is a single parameter that captures the effect of money: the elasticity of output with respect to real balances. This parameter can be directly measured in the same way that one measures the elasticity of output with respect to labor, through the share of resources used by the firm in transacting. This leads to the calibrated measure
$$e_m = \frac{m\, y_m(m, l)}{y(m, l)} = \frac{i\, m}{y(m, l)}, \quad (4.7)$$
where i is the opportunity cost of holding money and the left-hand side of Equation (4.7) is the elasticity of output with respect to real balances. Since the opportunity cost of holding money cannot be much more than 2% and the velocity of circulation, y/m, is around 5 in post-war data, the elasticity of money in production must be small: less than half of one percent. This kind of magnitude is not enough to generate big effects. Suppose, however, that money is highly complementary with other factors. In this case Benhabib and Farmer (1996b) show that indeterminacy may hold in a monetary model with an otherwise standard constant-returns-to-scale technology. They use a technology of the form
$$y = \left(a m^{e} + l^{e}\right)^{1/e},$$
which collapses to a Leontief technology as e approaches minus infinity and to a Cobb-Douglas technology as e approaches 0. The Leontief (or cash-in-advance) technology is rejected by the data since it can be shown to imply that the interest elasticity of the demand for money should be zero. The Cobb-Douglas function is also rejected since it would imply an interest elasticity of the demand for money of minus one. In the data, recent studies 21 find that the interest elasticity of the demand for money is close to -0.5, and one can use this fact to calibrate the value of e. Models that are calibrated to capture both money's share of output and the elasticity of money demand can be calibrated to display indeterminate equilibria. The reason
21 See, for example, Hoffman, Rasche and Tieslau (1995).
why indeterminacy is more easily obtained in this case is that, in equilibrium, there is a relationship between real balances and labor demand that is found by solving the first-order conditions in the labor market. If one solves for labor demand as a function of real balances [call this function l(m)] using this condition and substitutes the result back into the production function, one arrives at the equation y = y(m, l(m)).
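The amplification at work here can be illustrated numerically. The sketch below is our own, with hypothetical parameters and a deliberately simplified labor market: the firm equates the marginal product of labor to a fixed real wage w (in the model of the text the wage is instead determined jointly with household behavior). It compares the small direct (share-based) elasticity of output with respect to real balances from Equation (4.7) with the total elasticity along l(m).

```python
# Illustrative amplification check for the CES technology y = (a*m**e + l**e)**(1/e).
# All parameters are hypothetical; the labor market is simplified to a fixed wage w.
a, e, w = 1e-4, -1.0, 0.9

def y(m, l):
    return (a * m ** e + l ** e) ** (1.0 / e)

def mpl(m, l):
    # marginal product of labor; decreasing in l for e < 1
    return y(m, l) ** (1.0 - e) * l ** (e - 1.0)

def labor_demand(m, lo=1e-6, hi=1e6):
    # bisection: find l(m) such that mpl(m, l) = w
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mpl(m, mid) > w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def elasticities(m, h=1e-6):
    l = labor_demand(m)
    direct = a * m ** e / (a * m ** e + l ** e)  # partial elasticity of y w.r.t. m
    total = (y(m + h, labor_demand(m + h)) - y(m - h, labor_demand(m - h))) \
            / (2 * h) * m / y(m, l)              # elasticity along l(m)
    return direct, total

direct, total = elasticities(1.0)
print(direct < total)  # the indirect effect through l(m) amplifies money's impact
```

Because labor and real balances rise together when they are strong complements, the total elasticity along l(m) exceeds the direct partial elasticity, which is the sense in which the indirect effect of money on output can be large even when its measured share is small.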
Calibrations of the production function using Equation (4.7) lead to the conclusion that the elasticity of y with respect to its first argument is small. However, although the direct effect of money on output may be small, the indirect effect, operating through the fact that labor and real balances increase together, can be large: the elasticity of y with respect to its second argument may be substantial. Benhabib and Farmer (1996b) exploit the fact that their parametrization leads to indeterminacy to match a number of features of the monetary propagation mechanism.

4.4. Monetary models with several state variables
The Benhabib and Farmer explanation of monetary dynamics works by picking an equilibrium in which the price is predetermined one period in advance, so that an increase in the nominal quantity of money causes an increase in real balances and employment. Beaudry and Devereux (1993) and Farmer (1997) build on this idea by building money into versions of a real business cycle economy. The paper by Beaudry and Devereux adds money to a structure in which there is already a real indeterminacy because of increasing returns to scale. The work by Farmer adds money into the utility function and has a production sector that is identical to the standard real business cycle model. Both sets of authors calibrate their economies to fit the broad features of the US economy (both real and monetary), and both models perform as well as or better than a standard RBC model at replicating the second moments of US time series on consumption, investment, capital, GDP and employment. The following discussion is based on Farmer (1997), who allows for a fairly general specification of utility of the form
$$U = U(C, m, l),$$
where C is consumption, m is real balances and l is labor supply. In the spirit of the real business cycle models of King, Plosser and Rebelo, Farmer argues that one should restrict attention to utility functions that allow for balanced growth, and he shows that this implies that utility must be homogeneous of degree 1 - ρ (a real number less than one) in m and C. The class of functions used in the paper is of the form
$$U = \frac{X(C, m)^{1-\rho}}{1-\rho} - W(C, m)^{1-\rho}\, V(l), \quad (4.8)$$
where X and W are CES aggregators and V is an increasing convex function that measures the disutility of working. The following discussion is based on the special case of this utility function:
$$U = \frac{C^{1-\rho}}{1-\rho} - m^{1-\rho}\, V(l), \qquad \rho > 1.$$
The production side of the model is a standard RBC economy in which output is produced with the technology
$$Y = F(K, l)\, S,$$
where F is Cobb-Douglas and S is an autocorrelated productivity shock. Farmer considers two kinds of monetary policies: policies in which there is an interest rate rule of the kind
$$i_t = \bar{\imath},$$
under which the nominal rate is pegged, and money growth rules of the kind
$$M_t = \mu M_{t-1},$$
where i is the nominal rate of interest, M is the nominal quantity of money and μ is the money growth factor. In the case when the monetary authority fixes the money growth rate in advance, the model can be described by a four-variable difference equation of the form 22
$$x_{t+1} = A\, x_t + \varepsilon_{t+1},$$
where the state vector x_t includes consumption c_t, the capital stock K_t and real balances m_t, and where the innovation ε_{t+1} stacks the fundamental shocks u¹ and u² and the sunspot shocks e¹ and e². Unlike the sunspot models that we have discussed so far, Farmer allows for multiple shocks to both sunspots and fundamentals, and he calibrates the magnitude of the shocks by estimating the variance-covariance matrix of the residuals from a four-variable vector autoregression on US data. Indeterminacy in this model can be understood by appealing to the Benhabib and Farmer (1994) results on the real model with increasing returns. Consider the case in which the Central Bank pegs the nominal interest rate. It is well known that this policy
22 The variables in Farmer (1997) are divided by a growing productivity term to deal with nonstationarities in the data. For brevity we omit this refinement in our discussion.
rule leads to price level indeterminacy. What Farmer shows is that for utility functions in the class described by Equation (4.8) there may also be a real indeterminacy. Optimal decisions in Farmer's model are characterized by three Euler equations (one for capital, one for money and one for bonds) and one static first-order condition describing the labor market. One may combine the Euler equations for money and bonds to yield a second static first-order condition:
$$\hat m = \hat c + \frac{1+\chi}{\rho}\,\hat l - \frac{1}{\rho}\,\hat\imath, \quad (4.9)$$
where the variables ĉ, m̂, and l̂ are the logarithms of consumption, real balances and labor supply, and î is the log nominal interest rate. This equation plays the role of the "demand for money" in this model. The labor market equations can also be broken down into demand and supply of labor equations, as in the real model discussed in Section 3.2. These demand and supply equations are:
$$(1-\rho)\,\hat m + \rho\,\hat c + \chi\,\hat l = \hat\omega, \quad (4.10)$$
$$(1-a)\,\hat k + (a-1)\,\hat l = \hat\omega, \quad (4.11)$$
where ω̂ is the log of the real wage and k̂ is the log of capital. Equation (4.11) is a "labor demand" equation. If we were to graph the real wage against labor demanded and supplied, it would be represented by a downward sloping line, shifted by changes in the capital stock. Equation (4.10) is a "labor supply" equation. On the same graph, it would be represented by an upward sloping line that is shifted by changes in consumption or changes in real balances. The key to understanding indeterminacy in the monetary model is to notice that one can replace m̂ in Equation (4.10) by a function of ĉ, l̂ and î, from the money market equilibrium condition, Equation (4.9). This leads to the hybrid equation
$$\hat c - \frac{1-\rho}{\rho}\,\hat\imath + \left(\frac{1+\chi}{\rho} - 1\right)\hat l = \hat\omega. \quad (4.12)$$
In the real model in continuous time, Benhabib and Farmer show that indeterminacy occurs when the labor demand and supply curves cross with the wrong slopes. This occurs in their model as a result of increasing returns to scale in production that cause the labor demand curve to slope up. If one eliminates real balances from the labor market equations, using Equation (4.9), the resulting model has exactly the same structure as the real model of Benhabib and Farmer, with the additional twist that the interest rate enters as an exogenous variable. Equation (4.12) is a compound equation that combines the labor supply curve and the money demand equation; this plays the same role as the standard labor demand equation in the real model. Notice that this hybrid "labor supply curve" slopes down whenever (1 + χ)/ρ is less than 1. Using the Benhabib-Farmer indeterminacy condition, it follows that the monetary model has a
real indeterminacy whenever the "labor supply" curve slopes down more steeply than the labor demand curve; for reasonable calibrations this occurs when χ is small (elastic labor supply), a equals 2/3 (labor's share of national income) and ρ is bigger than 1.5.
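The slope comparison behind this calibration can be checked with elementary arithmetic. In the sketch below (our own illustration; the parameter values are hypothetical and chosen only to bracket the text's condition) the hybrid "labor supply" schedule (4.12) has slope (1 + χ)/ρ - 1 and labor demand (4.11) has slope a - 1 in the wage-hours plane.

```python
# Slope comparison behind the indeterminacy condition of the calibrated monetary
# model: chi is the inverse labor supply elasticity, alpha labor's share (the "a"
# of the text), rho the curvature parameter of the utility function.
def real_indeterminacy(chi, alpha, rho):
    supply_slope = (1.0 + chi) / rho - 1.0  # hybrid "labor supply" schedule (4.12)
    demand_slope = alpha - 1.0              # labor demand schedule (4.11)
    # indeterminacy: supply slopes down more steeply than labor demand
    return supply_slope < demand_slope < 0

print(real_indeterminacy(chi=0.05, alpha=2/3, rho=1.6))  # elastic labor, rho > 1.5
print(real_indeterminacy(chi=0.05, alpha=2/3, rho=1.2))  # rho too small
```

With χ near zero and a = 2/3 the condition (1 + χ)/ρ - 1 < a - 1 reduces to ρ > 1.5, matching the calibration reported in the text.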
5. Indeterminacy and policy feedback

So far we have discussed indeterminacy in the context of models with a government sector, but we have allowed only government policies that are determined by simple rules such as fixed money growth rates or fixed government debt. In this section we will examine indeterminacy that may arise as a consequence of more complicated government policies that allow for feedback from the private sector to future values of fiscal or monetary policy variables.
5.1. Fiscal policy feedback

We begin with a class of models in which there are "fiscal increasing returns", first discussed and elaborated on by Blanchard and Summers (1987). In the simplest formulation of such a model an increase in the capital stock can increase the post-tax return on capital, because it expands the tax base and reduces the tax rate. If G is the constant level of real government expenditures and G = τ f(k_t), where τ is the tax on capital and f(k) is income, we can obtain the analogue of Equation (3.4) as
$$\frac{\dot p}{p} = r - f'(k)\left(1 - \frac{G}{f(k)}\right). \quad (5.1)$$
If the after-tax return f'(k)(1 - G/f(k)) is increasing in k, a shift in p will raise investment and the capital stock, as well as the return to capital, so that Equation (5.1) will be satisfied only if ṗ/p falls. This reverses the original rise in p and moves the system back toward the steady state, generating another equilibrium path. In fact, as shown in Velasco (1996), such a system has two steady-state values for k, corresponding to a high and a low tax rate, with the low-tax steady state representing a saddlepoint. Note that the term (1 - G/f(k)) is analogous to the reciprocal of a markup that varies inversely with the stock of capital k. Two related papers are those of Guo and Lansing (1998) and Schmitt-Grohé and Uribe (1997a). Guo and Lansing explicitly compare the welfare properties of alternative fiscal policies in a model with increasing returns in the production sector. Their focus is on the ability to Pareto-rank alternative equilibria, with an eye to asking if models of indeterminacy might eventually be used to conduct welfare analysis and to design optimal fiscal policies that select the best equilibrium. The model of Schmitt-Grohé and Uribe includes labor and capital taxes, and generates two steady states by fixing government revenues and requiring the tax rate to be determined endogenously. Their model does not rely on explicit increasing returns to generate indeterminacy,
although the labor market effects are similar to those of Benhabib and Farmer (1994), with upward sloping labor demand curves. The mechanism that operates in their paper works through increases in employment that decrease equilibrium tax rates and raise the after-tax return on labor. The tax rates at the indeterminate steady state are below those that maximize revenue on the Laffer curve. Schmitt-Grohé and Uribe provide a calibration of their model to fit the US data and show that a successful calibration requires an elastic labor supply and a labor tax rate above the share of capital in aggregate income. They introduce a non-taxed home production sector, which allows indeterminacy under realistic tax rates and labor supply elasticities.

5.2. Monetary policy feedback
The monetary models discussed so far assume no feedback from the private economy to government behavior. In practice, however, central banks typically react to the private sector, and the existence of central bank reaction functions has led to the development of a literature in which it is the central bank itself that is responsible for indeterminacy. Many of the early monetary models simply assumed that the path of the money supply is an exogenous process determined by the central bank. In practice, central banks do not control a monetary aggregate directly. For example, in the USA the Federal Reserve System manipulates non-borrowed reserves on a day-to-day basis in an attempt to peg an interest rate (the Federal Funds rate) at a level that is revised periodically in light of economic conditions. Why does much of the literature assume that the central bank controls the money supply when in practice interest rate control is more common? One reason is that, as pointed out by Sargent and Wallace (1975), interest rate rules lead to price level indeterminacy, and until recently most authors have avoided building models with indeterminate equilibria because it was not known how to match up models of this kind with data. Recently there has been more interest in the design of central bank operating rules, and this has led to a revival of interest in indeterminacy and its implications in calibrated monetary models 23. One of the first to derive an indeterminate equilibrium from a central bank reaction function is Black (1974). He assumes that the central bank responds, at time t, to the inflation rate between times t - 1 and t, decreasing (increasing) real money balances if this inflation rate is positive (negative). In the absence of a central bank reaction of this kind, higher inflation would be required to sustain equilibrium in response to an initial (upward) departure of the initial price from its unique equilibrium level.
When the central bank follows a contractionary reactive policy, inflation is no longer necessary to
23 To do justice to the literature on central bank policy rules would require a separate survey; the reader is merely referred to two recent papers in the area, Taylor (1996) and Svensson (1996), and the literature cited therein. Also see the conference issue of the Journal of Monetary Economics (1997) no. 5, which collects together a number of related papers on the issue of 'Rules and Discretion in Monetary Policy'.
sustain equilibrium. If the monetary policy response is sufficiently strong, prices must decline to offset the expected further contraction of nominal balances, reversing the inflation and returning the system to its steady-state level of real balances. Therefore deviations of real balances from steady-state levels are reversed, and the initial price level is indeterminate. More recently, Leeper (1991) and Schmitt-Grohé and Uribe (1997b) have studied similar models where the monetary policy rule ties the nominal rate of interest to past inflation. One way to interpret the policy rule in models of this class is to assume that current inflation is forecast by past inflation. If one assumes that marginal utility and endowments are constant, and that the utility function is separable in real balances and consumption, Leeper's model can be characterized by a discrete Euler equation of the form
$$\frac{P_{t+1}}{P_t} = \beta\, i_{t+1}. \quad (5.2)$$
In this equation β is the discount factor, i_{t+1} is the nominal interest rate, representing the payout at time t + 1 to an investment in t, and P_t is the price level at time t. If we assume a simplified feedback rule, with constants α and γ,
$$i_{t+1} = \alpha\, \frac{P_t}{P_{t-1}} + \gamma, \quad (5.3)$$
and combine this with Equation (5.2) to obtain a first-order difference equation in the inflation rate, one can show that there exists an indeterminate set of equilibria if αβ < 1. If on the other hand the nominal rate responds to the contemporaneous inflation, or in a stochastic model to expected inflation, so that
$$i_{t+1} = \alpha\, E_t\!\left(\frac{P_{t+1}}{P_t}\right) + \gamma,$$
real indeterminacy disappears and the inflation rate is pinned down. The price level, however, is indeterminate, because the interest rate rule can now accommodate any level of nominal balances consistent with price expectations, as noted by Sargent and Wallace (1975) for policies that peg the nominal interest rate. Woodford (1991, 1995, 1996) has argued that the price level can nevertheless be pinned down to eliminate nominal indeterminacy if we introduce departures from "Ricardian equivalence", that is, if the government does not follow policies that are solvent under all possible equilibria. Woodford's distinction between Ricardian and non-Ricardian regimes is similar to Leeper's (1991) distinction between active and passive monetary policies. In the case of a non-Ricardian policy it is the requirement that private agents have confidence in the solvency of the government budget rule that 'selects' an equilibrium. In this case Woodford argues that the price level is determined by fiscal, rather than monetary, policy 24. [See also Sims (1997).]
24 See also footnotes 27 and 30 below.
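The backward-looking case is easy to simulate. The following sketch (our own illustration, with purely hypothetical parameter values) iterates the inflation dynamics obtained by combining (5.2) with a rule that sets the nominal rate as a linear function of lagged inflation, π_{t+1} = β(α π_t + γ): when αβ < 1 every initial inflation rate converges to the same steady state, so that equilibria form a continuum indexed by the initial price level.

```python
# Simulation of the first-order inflation dynamics pi_{t+1} = beta*(alpha*pi_t + gamma)
# implied by (5.2) and a backward-looking interest rate rule; alpha*beta < 1 here,
# so every initial condition converges: the initial price level is indeterminate.
beta, alpha, gamma = 0.96, 0.8, 0.3

def path(pi0, T=200):
    pi = pi0
    for _ in range(T):
        pi = beta * (alpha * pi + gamma)
    return pi

pi_ss = beta * gamma / (1.0 - alpha * beta)  # fixed point of the map
print(abs(path(0.5) - pi_ss) < 1e-9, abs(path(2.0) - pi_ss) < 1e-9)
```

Raising α so that αβ > 1 turns the map into an expansive one, in which case only the steady-state inflation path remains bounded and the equilibrium is determinate.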
The discussion of Leeper's analysis makes clear that the presence and nature of indeterminacy is influenced by the existence of lags in the implementation of policy. In particular, with separability of consumption and real balances in the utility function, real indeterminacy would be ruled out in a continuous-time formulation unless delays were explicitly introduced into the policy rule. The continuous-time formulation of monetary dynamics under interest rate policy rules given in the next subsection also illustrates this point. On the other hand, even with separability in the utility function and interest rules where the nominal rate responds to contemporaneous inflation, the slightest price stickiness can convert price level indeterminacy into real indeterminacy, as shown later in the next subsection in the discussion of the sticky-price model of Calvo (1983).

5.3. Interest rate rules and indeterminacy
In general, policy rules that tie the interest rate to current inflation, or in a stochastic model to expected inflation, imply growth rates for nominal balances that are not constant. By backing out the implied growth rates of nominal balances, we can study equilibria of monetary models of the type studied in Section 4. To illustrate how such monetary feedback policies can generate indeterminacies, we will use a simple continuous-time model, based on work by Benhabib, Schmitt-Grohé and Uribe (1998). We begin by describing the structure of the private economy. For simplicity we assume that there is no production, and that a representative agent receives a constant, nonstorable endowment e at each moment in time. The agent carries wealth from one period to the next by holding bonds, B, and money balances, M. We define the real holdings of these two assets as m ≡ M/P and b ≡ B/P, where P is the price level, and we let a refer to total real assets, so that at each instant a = b + m. Money balances do not pay interest, but bonds pay a nominal interest rate which we denote by i. The flow budget constraint of the agent is given by
$$\dot a = a\,(i - \pi) - i\, m + e + T - c,$$
where π = Ṗ/P is the inflation rate and T denotes lump-sum transfers or taxes. Now we turn to the structure of preferences. We denote the utility function of the agent by U(c, m), which we assume to be increasing and concave in consumption and real balances, and we assume that the agent maximizes the discounted sum of utilities over an infinite horizon, with discount rate ρ. The first-order conditions for this problem yield the equation
$$U_c(c, m) = p$$
[this is the monetary analog of Equation (3.1) given in Section 3.1], the portfolio condition which equates the return on bonds to the marginal benefit of holding money:
$$i = \frac{U_m(c, m)}{U_c(c, m)}, \quad (5.4)$$
and the following Euler equation, which is the analog of Equation (3.4):
$$\frac{\dot p}{p} = \rho + \pi - \frac{U_m}{U_c}. \quad (5.5)$$
Since endowments are constant, market clearing in goods requires that c = e, so that ċ = 0. Totally differentiating U_c(c, m) = p, and noting that ċ is zero, we have U_cm ṁ = ṗ. If we substitute this into Equation (5.5), and use the money market identity
$$\pi = \sigma - \frac{\dot m}{m}, \quad (5.6)$$
where σ is the growth rate of nominal money balances, Equation (5.5) becomes
$$\frac{\dot m}{m}\left(1 - \varepsilon_{cm}\right) = \rho + \sigma - \frac{U_m}{U_c}, \quad (5.7)$$
where ε_cm ≡ -U_cm m/U_c. To discuss policy we use a continuous-time version of the same rule given by Equation (5.3). Notice that unlike the case when the central bank picks σ, we must find an expression for σ as a function of the interest rate by solving for it from the policy rule. To accomplish this task we write the following representation of the monetary policy rule:
$$i = R + a\,\bigl(\pi - (R - \rho)\bigr),$$
where R and a are constants. If we use the definition of inflation (5.6) and the first-order condition (5.4),
$$i = \frac{U_m}{U_c} = R + a\,\bigl(\pi - (R - \rho)\bigr) = R + a\left(\sigma - \frac{\dot m}{m} - (R - \rho)\right), \quad (5.8)$$
we can find an expression for the money growth rate σ:
$$\sigma = \frac{\dot m}{m} + a^{-1}\left(\frac{U_m}{U_c} - R\right) + R - \rho. \quad (5.9)$$
m Finally, by substituting this expression into Equation (5.7) we obtain the following differential equation that characterizes time paths for real balances that are consistent with equilibrium in the economy with interest rate feedback 25' 26: m = (ecru) -~
m ~- -R
( a - 1).
(5.10)
Once again it seems reasonable to suppose that money is a normal good. This implies that Um/Uc will be decreasing in m, which implies that the nominal rate i and the

²⁵ Note that, as in the discussion of Leeper's (1991) discrete-time model, if the utility function is separable in consumption and money so that ε_cm = 0, there is no real indeterminacy when the nominal interest rate is set as a function of current expected inflation: Equation (5.10) determines only the level of m.
²⁶ The model easily generalizes to the case where money is productive as well, if we replace e with y(m), with y′(m) ≥ 0. In that case Equation (5.4) becomes i = Um/Uc + y′(m). Details of the analysis are straightforward and are left to the reader.
Ch. 6: Indeterminacy and Sunspots in Macroeconomics
demand for real balances m are inversely related. This model has a unique steady state, defined by the level of real balances, m̄, for which (Um/Uc)(m̄) = R. Further, the differential equation (5.10) will be stable if ε_cm and (a − 1) are of the same sign. Since a measures the sensitivity of the central bank to the inflation rate in setting the nominal interest rate, it follows that, depending on the sign of ε_cm, there may be multiple equilibria with interest rate rules that are either sensitive (a > 1) or insensitive (a < 1) to the inflation rate.²⁷ The mechanism at work here depends on the feedback rule: for example, a rise in real balances causes the nominal rate, which in equilibrium must be equal to Um/Uc, to fall. This induces a tighter monetary policy that reins in inflation. Therefore, even if Um/Uc declines, the net return to holding money may either increase or decrease, depending on the strength of the central bank response to the nominal rate. The other channel through which the demand for money is affected is through the effect of money on the marginal utility of goods, as discussed in Section 4.1: depending on the sign of ε_cm, the demand for money may increase or decrease with a rise in real balances. Therefore both a and ε_cm play a role in determining the nature of the dynamics of m and the stability of the steady state. The results in this section also cover a cash-in-advance economy as a special case. A cash-in-advance model is equivalent to having money in the utility function where consumption and money are combined with a CES aggregator, which in the limit becomes a Leontief production function as the elasticity of substitution goes to zero. Since in such a case ε_cm < 0, indeterminacy with the interest rate rule used above and a cash-in-advance constraint is only possible if a < 1.²⁸ The results obtained in this section depend on the linearity of the feedback rule given by Equation (5.3).
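To make the stability claim concrete, the following minimal sketch integrates the law of motion (5.10) for a hypothetical non-separable utility U(c, m) = (c^θ m^(1−θ))^(1−γ)/(1 − γ), for which Um/Uc = ((1 − θ)/θ)(c/m) and ε_cm = (1 − θ)(γ − 1) > 0 when γ > 1. All parameter values are illustrative assumptions, not taken from the chapter:

```python
# Hypothetical parametrization (not from the chapter): U(c, m) = (c**th * m**(1-th))**(1-g)/(1-g),
# so Um/Uc = ((1-th)/th)*(c/m) and eps_cm = -Ucm*m/Uc = (1-th)*(g-1) > 0 for g > 1.
th, g, e, R = 0.8, 2.0, 1.0, 0.05      # preference weight, curvature, endowment, steady-state rate
eps_cm = (1.0 - th) * (g - 1.0)
m_bar = (1.0 - th) / th * e / R        # steady state: Um/Uc = R

def m_dot(m, a):
    """Equation (5.10): m' = eps_cm**-1 * m * (Um/Uc - R) * (a - 1)/a."""
    return (1.0 / eps_cm) * m * ((1.0 - th) / th * e / m - R) * (a - 1.0) / a

def simulate(a, m0, dt=0.01, T=200.0):
    m = m0
    for _ in range(int(T / dt)):        # simple Euler integration
        m += dt * m_dot(m, a)
    return m

# With eps_cm > 0 and an active rule (a > 1), the steady state is stable: paths that
# start away from m_bar converge back to it, so the initial m is indeterminate.
m_active = simulate(a=1.5, m0=1.2 * m_bar)
# With a passive rule (a < 1) the same displacement explodes away from the steady state.
m_passive = simulate(a=0.5, m0=1.02 * m_bar, T=20.0)
print(m_bar, m_active, m_passive)
```

With ε_cm < 0 (the cash-in-advance case) the roles of a > 1 and a < 1 are reversed, matching the same-sign condition in the text.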
Indeed, from the perspective of global analysis, the situation is more complicated. Nominal interest rates must be bounded below, since the central bank cannot enforce negative rates. Benhabib, Schmitt-Grohé and Uribe (2000) then show that if the feedback rule used by the central bank, i(π), is non-decreasing, and there exists a steady state at π* where a = i′(π*) > 1, that is, where monetary policy is active, then there must also exist another steady-state value of π at which a < 1, that is, where monetary policy is passive. [This can easily be seen simply by graphing both sides of the steady-state relationship, or the Fisher equation, ρ + π = i(π).] In such cases global indeterminacy holds, even though local analysis around one of the steady states may indicate local determinacy.
²⁷ Requiring the discounted value of government debt to remain asymptotically finite, as in Woodford (1996), eliminates price level indeterminacy but not the real indeterminacies discussed in this section. Benhabib, Schmitt-Grohé and Uribe (1998) show that under indeterminacy, any one of the equilibrium trajectories of the real variables will have the discounted value of government debt remain asymptotically finite for an appropriate choice of the initial price level. See however footnote 30 for the case with sticky prices.
²⁸ See Benhabib, Schmitt-Grohé and Uribe (1998).
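The graphical argument for the second steady state can be checked numerically. Steady states solve the Fisher equation ρ + π = i(π); the exponential feedback rule below is a hypothetical example of a non-decreasing i(π) that is active at one steady state, and a second, passive steady state then appears at lower inflation:

```python
import numpy as np

# Hypothetical non-decreasing feedback rule, not the chapter's: it satisfies the
# Fisher equation at pi = 0.021 with slope i'(0.021) = 1.53 > 1 (active policy).
rho = 0.03

def i_rule(pi):
    return 0.051 * np.exp(30.0 * (pi - 0.021))

pi_grid = np.linspace(-0.02, 0.03, 4999)
f = i_rule(pi_grid) - (rho + pi_grid)            # f = 0 at a steady state
roots = pi_grid[np.flatnonzero(f[1:] * f[:-1] < 0.0)]
slopes = 30.0 * i_rule(roots)                    # i'(pi) = 30*i(pi) for this rule
print(roots, slopes)                             # lower root: slope < 1; upper: slope > 1
```

Two sign changes are found: the active steady state near π = 0.021 and a passive one at lower (here negative) inflation, as the graphing argument predicts.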
5.4. Monetary models and sticky prices due to frictions in the trading process

In this section we will discuss the role of interest rate rules in generating indeterminacy under "sticky" prices. Recently Woodford (1996), Chari, Kehoe and McGrattan (1996), Clarida, Gali and Gertler (1997), and Kiley (1998), among others, have studied models with sticky prices, based on variants of a staggered price setting model originally due to Calvo (1983) and Taylor (1980). These papers study monetary policies that target the nominal interest rate as a function of past or current inflation, and each of them has noted the possibility of indeterminacy in models in which staggered price setting is assumed to be part of the environment rather than part of the equilibrium concept. One approach to modelling sticky prices, due to Calvo (1983), is to specify that firms can change their prices at random intervals; with a continuum of firms, a fixed fraction of them can do so at each instant. The firms set their prices fully expecting that their price will remain fixed over a random interval while some of the other firms will change their price, and aggregate demand will also vary. This structure may be interpreted as one of monopolistic competition with firms facing downward-sloping demand curves which depend on aggregate demand and on the prices of other firms. The following example is based on the Calvo model. We assume that money enters the utility function and we write the Euler equation for consumption as

ċ = (c/ε_c)(Um/Uc − r − π),  (5.11)

where the portfolio condition again implies that i = Um/Uc. Substituting the policy rule for the nominal interest rate given by the first equality in Equation (5.8) into Equation (5.11), we can rewrite the Euler equation as

ċ = (c/ε_c)((a − 1)(π − (R − r))).  (5.12)

Under sticky prices the inflation rate π is governed by the dynamics of staggered prices, which leads to the following equation describing the rate of change of inflation:

π̇ = b(q − c).  (5.13)
Here q and b are constants, with q representing a capacity level associated with full employment: (q − c) may be interpreted as excess aggregate demand.²⁹ Equations (5.13) and (5.12) constitute a system of differential equations in (c, π), where neither c nor π is a predetermined variable, and the local dynamics of these equations depend on the Jacobian of the system evaluated at the steady state. If a < 1,

²⁹ For a discussion of the relation between this equation and the standard Phillips curve see Calvo (1983).
the steady state is indeterminate, since the Jacobian of the linearized dynamics around the steady state has one negative root. If a > 1, the relevant roots are imaginary with zero real part, and the stability properties of the steady state depend on higher-order terms of the Taylor expansion in the linearization.³⁰ The novelty of the class of models with staggered price setting is that indeterminacy may arise for reasons that are independent of other mechanisms, in the sense that real indeterminacy may disappear if one removes staggered price setting. In our earlier formulation with flexible prices and a nominal interest rate feedback rule, real indeterminacy was only possible if money entered the utility function in a nonseparable manner. But with Calvo-style price setters, real indeterminacy may occur even with a separable utility function. It follows that real indeterminacy in this case is attributable directly to the monopolistically competitive price setting mechanism that we introduced to model sticky prices.³¹ One way to interpret these results is to note that the price level indeterminacy that occurs under interest rate rules and flexible prices with separable preferences turns into real indeterminacy as soon as we introduce some degree of price stickiness. This is in contrast to our earlier discussion of indeterminacy in monetary models in which sticky prices implement one of the possible set of equilibria. In the staggered price setting literature it is the sticky prices that cause indeterminacy, rather than the other way around.
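The root structure just described can be verified directly from the Jacobian of the system (5.12)-(5.13) at the steady state (c = q, π = R − r). The parameter values below are illustrative assumptions, not a calibration from the chapter:

```python
import numpy as np

# Jacobian of the Calvo-style system (5.12)-(5.13) at the steady state (c = q, pi = R - r).
q, b, eps_c, R, r = 1.0, 0.5, 1.0, 0.06, 0.03   # illustrative values

def jacobian(a):
    # c_dot  = (c/eps_c)*(a - 1)*(pi - (R - r)):  d/dc = 0,  d/dpi = q*(a - 1)/eps_c
    # pi_dot = b*(q - c):                          d/dc = -b, d/dpi = 0
    return np.array([[0.0, q * (a - 1.0) / eps_c],
                     [-b, 0.0]])

lam_passive = np.linalg.eigvals(jacobian(a=0.5))  # a < 1: real roots, one negative
lam_active = np.linalg.eigvals(jacobian(a=1.5))   # a > 1: purely imaginary pair
print(lam_passive, lam_active)
```

Since the Jacobian has zero trace, its eigenvalues satisfy λ² = −bq(a − 1)/ε_c: real with opposite signs for a < 1, purely imaginary for a > 1, exactly the two cases discussed above.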
6. Indeterminacy and models of endogenous growth

Our discussion so far has centered on models of business cycles. Another important area in which indeterminacy plays a role is economic growth. Recently, Levine and Renelt (1992) demonstrated the lack of robustness of many of the empirical results explaining the differences in the growth rates of countries by institutional and policy differences, and by differences in their rates of factor accumulation, initial wealth and income distribution. The presence of indeterminacies offers an additional and
³⁰ Benhabib, Schmitt-Grohé and Uribe (1998) also discuss a sticky price model based on Rotemberg (1996), where agents optimally choose how much to adjust their prices at each instant. They show that indeterminacy obtains for a < 1 just as in Calvo's model, but that it also obtains for a > 1 under some conditions. In the latter case the steady state can have two stable roots rather than one, so the stable manifold is of dimension two. Benhabib, Schmitt-Grohé and Uribe (1998) show that requiring the discounted value of government debt to remain asymptotically finite, as in Woodford (1996), restricts initial conditions so the dimension of the restricted stable manifold is reduced to one: this however still implies real indeterminacy since neither c nor π is a predetermined variable. Furthermore they also show that when a > 1, the steady state may be totally unstable with two positive roots, in which case indeterminacy takes the form of the stability of a limit cycle rather than of the steady state. See also Benhabib, Schmitt-Grohé and Uribe (2000).
³¹ As noted by Kiley (1998), sticky prices have the effect of increasing the responsiveness of output to monetary shocks, and in this sense they are "productive."
complementary explanation of why countries that have very similar endowments and fundamentals nevertheless save and grow at different rates.³² The recent literature on endogenous growth, initiated by Lucas (1988) and Romer (1990), contains elements of market imperfections that can be shown to generate indeterminacies under reasonable parametrizations. In contrast to the business cycle literature, however, in models of endogenous growth it is the balanced growth path that is indeterminate, rather than the steady-state level of GDP. The distinctive feature of endogenous growth models is their production technology, which allows proportional growth in some accumulated assets like human or physical capital, or the stock of research and development. The fact that the technology allows for linear growth implies that there must exist increasing returns at the social level to overcome diminishing returns at the private level. It is a small step from here to generate indeterminacy through complementarities between the factors of production. An interesting feature of endogenous growth models is their ability to generate multiple balanced growth paths in conjunction with indeterminacy. We can illustrate how multiple balanced growth paths and indeterminacy can arise in such models with small modifications to the simple structure of Equations (3.1), (3.2), (3.3) and (3.4). We will rely on a simple illustrative structure of production that is linear in an accumulated asset, and with sufficiently strong external effects from the labor input. Our endogenous growth model will have a balanced growth path, along which the ratio of the asset to consumption will be constant. If we denote the asset by k, we will have c = sk, where c is consumption and s is a constant. For simplicity let us assume that the utility of consumption is logarithmic, and that the production function is of the Cobb-Douglas form, y = k^a k̄^(1−a) L^β, where k̄ represents an external effect.
Consider the endogenous growth version of Equation (3.4), where we have replaced p by c using Equation (3.1):

ċ/c = w₁(L) − (r + g).  (6.1)

Note that since y is linear in k, w₁ only depends on L, and is given by w₁ = aL^β. We can also write Equation (3.3) for the goods market equilibrium as

k̇/k = a(L) − s − g,  (6.2)
³² For a study of the empirical relevance of indeterminacy in explaining economic growth, see Benhabib and Gali (1994).
where a(L) is the average product of capital, which is only a function of L because y is linear in k: a(L) = L^β. Since s is constant along the balanced growth path, the difference between the right-hand sides of Equations (6.1) and (6.2) must be zero³³:

a(L) − s − w₁(L) + r = a(L)(1 − a) − s + r = 0.  (6.3)

The second equality follows because the marginal and average products of capital, w₁(L) and a(L), are proportional, and in our Cobb-Douglas example their difference is a(L)(1 − a). We can also express s as a function of L by using the labor market equilibrium condition given by Equation (3.2):

s = c/k = m(L)/v′(1 − L) ≡ v(L).  (6.4)

Here m(L) is the marginal product of labor divided by k. Substituting this expression into Equation (6.3) we have

a(L)(1 − a) − v(L) = −r.  (6.5)
Equation (6.5) can have one, two or no solutions corresponding to the balanced growth paths, depending on the parameters of the model. The left-hand side of Equation (6.5) is monotonic in L if v(L) is decreasing, but if v(L) is increasing, there may be two balanced growth paths. An increasing v(L), however, is only possible if the marginal product of labor is increasing, and this requires a significant labor externality. This is precisely what happens in the endogenous growth version of the model in Benhabib and Farmer (1994), when the labor externalities are high enough. One of the balanced growth paths is determinate, while the other is indeterminate. A more extensive analysis of a related mechanism in the Lucas (1988) model, with small external effects confined only to the research sector, is given in Benhabib and Perli (1994) [see also Xie (1994)]. They show that multiple balanced growth paths and indeterminacy can appear for reasonable parametrizations of the model.³⁴ A similar mechanism produces indeterminacy and multiple balanced growth paths in Romer's (1990) model, as analyzed by Benhabib, Perli and Xie (1994) and by Evans,
³³ This approach of equating two growth rates to (graphically) characterize multiple steady states is also taken by Evans, Honkapohja and Romer (1996). To generate sufficiently high returns that justify the higher investment and growth rates, the authors rely on production complementarities giving rise to external effects from the introduction of new intermediate goods, rather than from postulating increasing returns to the labor input.
³⁴ The conditions for indeterminacy in the Lucas model are less stringent than those presented above because it is a two-sector model. In fact the two-sector structure allows for indeterminacy with a fixed labor supply just like the two-sector model discussed in Section 3.1, but requires the utility of consumption not to exhibit too much curvature. This same feature arises in the two-sector model above, and it is the reason for introducing a third sector in Benhabib and Nishimura (1998).
Honkapohja and Romer (1996). Evans, Honkapohja and Romer (1996) also study a modification of the Romer model by introducing adjustment costs that generate a nonlinear production possibility curve between the consumption and investment sectors. Their model has three balanced growth paths, two of which are stable under a learning mechanism. Introducing sunspots induces jumps across the two stable (indeterminate) balanced growth paths, and generates fluctuations in the growth rate. Such regime switching equilibria giving rise to sunspot fluctuations in the growth rate are also studied, both theoretically and empirically, in Christiano and Harrison (1996). As in Benhabib and Perli (1994), they observe that indeterminacy can arise even if the balanced growth paths are locally determinate, because rates of investment can be chosen to place the economy on either one of them. Another specification generating indeterminacy is given in the endogenous growth model of Gali and Zilibotti (1995). They use a model with monopolistic competition, coupled with fixed costs and entry. Markups are inversely related to entry and to the capital stock, so that raw returns can increase in k. This model is an endogenous growth version of the variable markup model of Gali (1994). It gives rise to two balanced growth paths, one with zero growth representing a corner solution, and the other one with a positive growth rate. Furthermore there is a range of initial conditions for capital in which the equilibrium trajectory is indeterminate, and may converge to either of the balanced growth paths depending on the initial choice of consumption.
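The root-counting logic behind the balanced-growth condition of Equation (6.5) can be sketched numerically. The condition is written here as f(L) = (1 − α)a(L) − v(L) + r = 0 with a(L) = L^β; the two v(L) schedules below are hypothetical reduced-form choices, not the chapter's calibration, with β > 1 standing in for a strong labor externality:

```python
import numpy as np

# Count balanced growth paths as sign changes of f(L) on (0, 1].
# alpha is the private capital share, beta the social labor exponent (illustrative).
alpha, beta, r = 0.3, 1.2, 0.05

def n_bgp(v):
    L = np.linspace(1e-6, 1.0, 100001)
    f = (1.0 - alpha) * L**beta - v(L) + r
    return int(np.count_nonzero(np.sign(f[1:]) != np.sign(f[:-1])))

one = n_bgp(lambda L: 0.5 - 0.2 * L)       # decreasing v(L): a unique path
two = n_bgp(lambda L: 0.16 + 0.6 * L**2)   # increasing v(L): two paths can coexist
print(one, two)
```

With a decreasing v(L) the condition crosses zero once; with an increasing v(L) of the right shape it crosses twice, reproducing the two balanced growth paths discussed in the text.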
7. Some related work
So far the framework presented by Equations (3.1)-(3.4) assumed that the preferences were standard, and in particular that the discount rate was constant. We may however allow the discount rate to be affected by some social norm, proxied for example by the value of aggregate consumption. If preferences and the discount rate are subject to such external effects, it is clear from Equation (3.4) that they can substitute for external effects and increasing returns in technology. A higher price of investment may lead to a higher capital stock, and may well decrease the marginal returns to capital. If the discount rate declines as well however, increasing price appreciations in the price of the capital good may be unnecessary to sustain equilibrium. The price of the investment good may well decline, and move back towards its stationary equilibrium value, generating indeterminacy. Such a mechanism is explored in detail in a recent paper by Drugeon (1996). In general, endogenous preferences coupled with some market imperfections are likely to provide a basis for multiple equilibria and indeterminacy. An alternative route to indeterminacy may be through increasing returns not in the production function, but in the utility function. In such a setup there must be sufficient discounting of the future to assure that utilities remain finite in equilibrium. In a recent paper Cazzavilan (1996) studies indeterminacy in such a model, where public goods financed by taxes enter the (constant returns to scale) production function. Since the
public goods are productive, they create externalities because agents take the tax rate as given. The result is an endogenous growth structure with an indeterminate balanced growth path. Indeterminacy can also arise from variations in capacity utilization if utilization rates co-move with labor, as would be the case if intensified utilization accelerates capital depreciation. This possibility has recently been shown by Wen (1998). In his model a shift in production towards investment will raise the capital stock, but an associated increase in labor will cause the marginal product of capital to increase rather than decrease, very much like the model of Benhabib and Farmer (1994). The reason for the expansion in labor however is not an upward sloping demand curve for labor due to external effects, but a rightward shift in the labor demand curve due to increased capacity utilization. Wen calibrates his model to US data and finds that indeterminacy can provide a remarkably good match to the data with mild increasing returns in the order of 0.1. Guo and Sturzenegger (1994) study the application of indeterminacy to the study of international consumption data. The RBC model has trouble with the fact that consumption across countries is predicted to be perfectly correlated under simple variants of the international RBC model with complete markets. But in practice the correlation between consumption across countries is quite low. The Guo and Sturzenegger explanation drives business cycles with sunspots as in the single-country model of Farmer and Guo (1994), but they assume that agents are unable to perfectly insure across countries. Their calibrated model does a fairly good job of explaining the cross-country data and is one of the first applications of empirical models of indeterminacy to international data sets. 
We should note that we have not touched upon the literature that deals with indeterminacy in overlapping generations models or in finite markets with incomplete participation or market imperfections. Some recent overviews of these topics can be found in Balasko, Cass and Shell (1995) or Bisin (1997), among others.
8. Empirical aspects of models with indeterminacy In Section 2 we mentioned two areas in which models with indeterminate equilibria might potentially improve upon existing models of the business cycle. The first is that of propagation dynamics and the second is related to monetary features of business cycles. In this section we elaborate on the claim that indeterminacy might be a fruitful research direction by surveying known results in which some progress has been made on each of these issues. 8.1. Real models and propagation dynamics
The real business cycle literature represented a major departure from the Keynesian models that preceded it. On the theoretical front RBC theorists argued that the correct
way forward for macroeconomics is some version of dynamic general equilibrium theory. On the empirical front they argued that the standards for what should be considered a successful description of the data should be considerably relaxed from the requirements imposed by time-series econometricians. Following the approach initiated by Kydland and Prescott (1990), much of the RBC literature dispenses with attempts to study the low-frequency components of time series by passing data (both actual and simulated) through a filter that leaves only high-frequency components.³⁵ If simulated data from an artificial model can replicate a few of the moments of the data from an actual economy then RBC economists argue that the model is a successful description of the real world. There is much to disagree with in the RBC methodology. It has nevertheless had the effect of providing a unified framework for comparing and evaluating alternative economic theories. In this section of the survey we will turn our attention to calibrated models of indeterminacy that have used the RBC methodology to provide competing explanations of business cycle phenomena. These models all build on some simple variant of a representative-agent economy, and the variables they describe include consumption, investment, GDP and employment as a subset. It is therefore possible to ask how their predictions compare with those of the benchmark model.³⁶

8.1.1. One-sector models
In Section 2.2 we pointed out that the one-sector real business cycle model, driven by productivity shocks, has a representation as a difference equation in three state variables. We reproduce this equation here:
[ĉ_{t+1}, k̂_{t+1}, ŝ_{t+1}]′ = Φ⁻¹ [ĉ_t, k̂_t, ŝ_t]′ + Φ⁻¹Γ [e_{t+1}, u_{t+1}]′.  (8.1)
The variables ĉ, k̂ and ŝ represent deviations of consumption, capital and the productivity shock from their balanced growth paths; e is a belief shock and u is an innovation to the productivity shock. The variables k_{t+1} and s_{t+1} are determined at date t, but c_{t+1} is free to be determined at date t + 1 by the equilibrium conditions of the model. If the matrix Φ⁻¹ has three roots inside the unit circle then it is possible to construct equilibria in which the business cycle is driven purely by iid sunspot errors (the variable e_{t+1}), and the artificial data constructed in this way can be compared with actual data in the same way that one matches RBC models by comparing moments. This idea was exploited by Farmer and Guo (1994), who pointed out that there are
³⁵ Hodrick and Prescott (1980) advocated this approach in their widely circulated discussion paper. Although the HP filter is widely used in the literature it has also been widely criticized, since the filter itself can alter the covariance properties of the filtered data in ways that may introduce spurious cycles.
³⁶ For an interesting perspective on this issue, see Kamihigashi (1996).
some dimensions in which the sunspot model can perform better than models driven by fundamentals. We return to this idea shortly. To get a better idea of how an array of sunspot models compare with each other, with the RBC model and with the data, Schmitt-Grohé (1997) analyses four different models, all of which are calibrated in a similar way, and all of which have a representation of the kind illustrated in Equation (8.1). The models that she studies are (1) a model similar to that of Gali (1994) in which changes in the Composition of Aggregate Demand (the CAD model) allow the markup to be countercyclical; (2) a model based on Rotemberg and Woodford (1992) in which markups may again be countercyclical but in this case the variability of the markup follows from Implicit Collusion (the IC model); (3) a model with increasing returns and decreasing marginal costs (the IR model); and finally (4) a model with externalities (the EXT model) based on the work of Farmer and Guo (1994). The main question addressed by her work is "For what values of the parameters can the matrix Φ⁻¹ in Equation (8.1) have three roots all inside the unit circle?" This is an interesting question in light of the results of Farmer and Guo (1994) since, when all of the roots of Φ⁻¹ are inside the unit circle, one can generate artificial time series for consumption, investment, hours and GDP by simulating sequences of variables using the equation
[ĉ_{t+1}, k̂_{t+1}]′ = [Φ⁻¹]₁₁ [ĉ_t, k̂_t]′ + [Φ⁻¹Γ]₁₁ e_{t+1},  (8.2)
where [Φ⁻¹]₁₁ and [Φ⁻¹Γ]₁₁ are the 2 × 2 upper-left blocks of Φ⁻¹ and Φ⁻¹Γ. The formulation of the model in Equation (8.2) is one in which equilibrium business cycles in which all agents are fully rational are driven purely by sunspots. Schmitt-Grohé (1997) simulates series of artificial data for all four types of one-sector model. In each case she calibrates the baseline parameters as in a standard RBC model, laid out in work by King, Plosser and Rebelo (1987), and she sets the increasing returns, externality and markup elasticity parameters in a way that minimizes the degree of aggregate increasing returns but still allows equilibria to be indeterminate. Table 1 reproduces sections a, b and c of Table 6, page 136 in Schmitt-Grohé (1997). The first two data columns of this table are reproduced from King, Plosser and Rebelo (1987) and they illustrate the dimensions on which the RBC model is often evaluated. The column labeled "RBC model" contains statistics generated by simulations of a 'standard' RBC model in which the source of business cycle dynamics is a highly persistent productivity shock. Column 1, for comparison, gives the US data. Columns 3 through 6 are statistics generated by Schmitt-Grohé in which each of the four sunspot models is used to simulate data but, in contrast to the RBC model, each of these columns simulates data generated by a pure sunspot shock. The main point of this table is to illustrate that by the standards of the calibration literature, the models driven purely by sunspots perform about as well. This in itself is an interesting observation because although early work on sunspot models had demonstrated that sunspots could exist, much of this literature had little or no connection with data.
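The mechanics behind Equation (8.2) can be illustrated with a small simulation: pick a stable 2 × 2 transition matrix, feed it only an iid sunspot shock, and compute the same kinds of moments reported in Table 1. The matrix and loading vector below are illustrative assumptions, not one of Schmitt-Grohé's calibrations:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.7, 0.2],
              [0.1, 0.9]])            # both eigenvalues inside the unit circle
B = np.array([1.0, 0.3])              # loading of the sunspot shock on (c_hat, k_hat)
assert np.all(np.abs(np.linalg.eigvals(A)) < 1.0)

T = 100_000
x = np.zeros(2)
sim = np.empty((T, 2))
for t in range(T):
    x = A @ x + B * rng.standard_normal()   # the only impulse is the sunspot e_{t+1}
    sim[t] = x

c_hat, k_hat = sim[:, 0], sim[:, 1]
rel_std = c_hat.std() / k_hat.std()               # relative volatility, as in Table 1a
ar1_k = np.corrcoef(k_hat[1:], k_hat[:-1])[0, 1]  # persistence, as in Table 1b
print(rel_std, ar1_k)
```

Because both roots of A lie inside the unit circle, the series are stationary and their second moments are well defined, which is what makes the moment-matching comparison with US data possible at all.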
Table 1
Results of different models

                  US data  RBC model  CAD model  IC model  IR model  EXT model
a. Relative standard deviation: std(x)/std(output)
  Output           1.00     1.00       1.00       1.00      1.00      1.00
  Consumption      0.69     0.64       0.35       0.39      0.82      0.91
  Investment       1.35     2.31       3.36       3.41      2.32      1.82
  Hours            0.52     0.48       0.71       0.70      0.43      0.32
  Real Wage        1.14     0.69       0.42       0.44      0.83      0.91

b. Autocorrelation coefficient AR(1)
  Output           0.96     0.93       0.89       0.71      0.60      0.81
  Consumption      0.98     0.99       0.98       0.98      1.00      1.00
  Investment       0.93     0.88       0.88       0.66     -0.08      0.16
  Hours            0.52     0.86       0.88       0.66     -0.24     -0.12
  Real Wage        0.97     0.98       0.94       0.88      0.97      0.99

c. Contemporaneous correlation with output
  Consumption      0.85     0.82       0.65       0.58      0.84      0.92
  Investment       0.60     0.92       0.97       0.96      0.82      0.82
  Hours            0.07     0.79       0.85       0.86      0.56      0.42
  Real Wage        0.76     0.90       0.91       0.86      0.90      0.95
Earlier in this survey we drew attention to two aspects in which sunspot models with indeterminate equilibria are different from standard models with a unique equilibrium. The first was that models with sunspots can generate an alternative source of the impulse to the business cycle, and it is this claim, that sunspots may be a primary impulse, that is evaluated in Table 1. A second, and perhaps more interesting, feature of models with indeterminacy is that they offer an alternative explanation of propagation dynamics. To evaluate this claim, Farmer and Guo (1994) generate a set of impulse response functions from three different models and they compare these impulse response functions with those from US data. The impulse response functions to innovations in output for US data are derived from a vector autoregression of output, employment, consumption and investment with a linear time trend and five lags, over the period 1954.1 to 1991.3. The three models are a standard RBC economy (the same calibration as the RBC economy in Table 1) and two different models with externalities. One of these models is calibrated as in work by Baxter and King (1991) who introduce externalities but calibrate these externalities in a way that is not large
enough to generate indeterminacy. The second is a calibration with indeterminacy in line with the EXT model discussed by Schmitt-Grohé. Figure 1 compares the impulse responses in each of these three models with the impulse response to a set of shocks in the US data. Notice, in particular, the dynamic pattern of investment in the data and compare it with models 1 and 2 in Figure 1. The impulse responses for US data show clear evidence of a cyclical response pattern whereas models 1 and 2 (the RBC model and the externality model without indeterminacy) both show monotonic convergence patterns. Farmer and Guo point out that monotonic convergence in the RBC economy follows from the fact that, although the dynamics in k and s are two-dimensional, there is no feedback in the dynamics of the system from the productivity shock s to the capital stock k. The system
[k̂_{t+1}, ŝ_{t+1}]′ = A [k̂_t, ŝ_t]′ + [0, u_{t+1}]′

that characterizes the RBC dynamics has monotonic impulse response functions because the matrix A is upper triangular and it necessarily has two real roots. The equation

[ĉ_{t+1}, k̂_{t+1}]′ = A [ĉ_t, k̂_t]′ + B e_{t+1},
on the other hand, that characterizes the dynamics of the sunspot models, incorporates feedback both from c_t to k_t and vice versa; hence the matrix A that determines the properties of the impulse response functions in this case can have complex roots. It is this feature that Farmer and Guo exploit to generate the features of the dynamic responses illustrated in Figure 1. Although the one-sector models discussed above do a relatively good job of describing data, they rely on large markups and imperfections to generate indeterminacy, which may not be empirically plausible. Schmitt-Grohé (1997) concludes that while "... the relative volatility, autocorrelation, and contemporaneous correlation properties of macroeconomic aggregates predicted by each of the endogenous business cycle models are broadly consistent with those actually observed in the US data ... the degree of market power or returns to scale required for the existence of expectation driven business cycles lies in the upper range of available empirical estimates." The more recent models of Perli (1994), Schmitt-Grohé and Uribe (1997a) and Wen (1998), which modify the one-sector model by introducing home production, taxes, and variable capacity utilization, are also successful in their calibration analysis but they avoid the high degree of increasing returns to scale required by Benhabib and Farmer (1994) to generate indeterminacy. In the next subsection we discuss the empirical performance of multi-sector models which do not rely on large market distortions or external effects to generate indeterminacy.
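The contrast between monotone and cyclical propagation is easy to reproduce: a triangular transition matrix has real roots and monotone impulse responses, while two-way feedback can deliver complex roots and damped oscillations. The matrices below are illustrative assumptions, not the Farmer-Guo calibration:

```python
import numpy as np

A_rbc = np.array([[0.95, 0.5],
                  [0.0, 0.9]])        # upper triangular: roots 0.95 and 0.9, both real
A_sun = np.array([[0.9, -0.5],
                  [0.3, 0.9]])        # feedback in both directions: complex roots

def impulse(A, T=40):
    """Response of the first state to a unit shock at date 0."""
    x = np.array([1.0, 0.0])
    path = np.empty(T)
    for t in range(T):
        path[t] = x[0]
        x = A @ x
    return path

resp_rbc, resp_sun = impulse(A_rbc), impulse(A_sun)
print(np.all(np.diff(resp_rbc) <= 0.0),   # triangular case: monotone decay
      np.any(resp_sun < 0.0))             # feedback case: the response overshoots zero
```

The complex roots of the second matrix have modulus just below one, so the response spirals back to zero, qualitatively matching the cyclical investment responses found in the US data.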
J. Benhabib and R.E.A. Farmer
[Fig. 1. Impulse responses to shocks: US data compared with the three calibrated models.]
Ch. 6: Indeterminacy and Sunspots in Macroeconomics
8.1.2. Two-sector models

In this section we discuss a class of two-sector models that are able to generate indeterminate equilibria for much lower degrees of returns to scale or market imperfections than the one-sector models discussed above. However, the lower increasing returns make it much harder to obtain procyclical consumption by relying exclusively on sunspot shocks and ruling out technology shocks. Indeed, one of the most successful features of the RBC models is their ability to deliver procyclical consumption and to avoid the countercyclical wages implied by the neoclassical model without shocks to technology. The discussion below centers on the issue of procyclical consumption in calibrated multi-sector models that require little or no increasing returns to generate indeterminate equilibria. We begin our discussion with the calibrated two-sector model of Benhabib and Farmer (1996a), discussed in Section 3.3. This model can generate indeterminate equilibria for sector-specific external effects that are significantly milder than those needed for models with a one-sector technology; one obtains indeterminacy for returns to scale in the consumption and investment sectors of about 1.07, when one assumes that, net of external effects, firms face production technologies that exhibit constant returns. But although 1.07 will generate indeterminacy, it is not enough to successfully match the various moments of US macroeconomic data, at least in the case when business cycles are solely driven by sunspots. Successfully matching the data requires returns to scale of around 1.2. Unlike in the earlier model of Benhabib and Farmer (1994), returns to scale of 1.2 do not imply an upward sloping labor demand curve, but they still remain high in light of recent empirical work by Basu and Fernald (1997) and others.
The main reason that a high externality is needed for a reasonable calibration is to assure that consumption is procyclical when the only stochastic shocks in the model are sunspots. We can easily illustrate this point, following the discussion in Benhabib and Farmer (1996a). Let U'(C) be the marginal utility of consumption, V'(-L) the marginal utility of leisure and MPL(L) the marginal product of labor, where for simplicity we ignore the dependence of MPL on capital. The first-order condition for the choice of labor in a standard one-sector model takes the form

$$ U'(C)\, MPL(L) = V'(-L). $$

Suppose that employment increases spontaneously in this model, as would be the case if "sunspots" were the dominant source of fluctuations. In this case the increase in L would decrease MPL and increase V'(-L): equality in the first-order condition for labor will be restored only if C were to fall and U'(C) to rise. In other words, pure sunspot fluctuations will cause consumption to be countercyclical. In the following discussion we identify several channels that might break this link. (1) The first possibility is that demand and/or supply curves may have non-standard slopes. If the marginal product of labor, MPL, is increasing in L, which gives
an upward sloping labor demand, or if V'(-L) is decreasing in L, which gives a downward sloping labor supply curve, then an increase in L may be associated with an increase in C. When we estimate a model that involves this first-order condition, the procyclical consumption in the data may well force the estimated parameters to imply an upward sloping demand, a downward sloping supply, or both; this, for example, is exactly what Farmer and Guo (1994) find when they estimate a one-sector model. The existence of an upward sloping demand curve for labor requires externalities or monopolistic competition, but a downward sloping supply curve can occur even when utility functions are concave. For example, an alternative specification of utility that permits procyclical consumption would replace U'(C) and V'(-L) with U_1(C, L) and U_2(C, L). This non-separability may allow the labor supply curve to slope down even in the absence of externalities. However, one may show that a downward sloping labor supply curve also implies that consumption is an inferior good. (2) A second way in which one may reintroduce procyclical consumption follows from work on monopolistic competition. In this setting the relevant variable for the first-order condition for labor is not MPL, but MPL adjusted for the markup. If the markup is constant the conclusions that follow from the first-order condition are unchanged, but if the markup is countercyclical, then procyclical consumption can be rescued, as in the models of Rotemberg and Woodford (1992) or Gali (1994). (3) All of the above discussion is concerned with the difficulty of explaining procyclical consumption in models in which all shocks arise from sunspots, as, for example, in Farmer and Guo (1994).
Procyclical consumption is easier to obtain with technology shocks since in this case output may rise sufficiently to allow both investment and consumption to increase in response to a positive shock, even though labor may move out of the production of consumption goods to the production of investment goods. Indeterminacy would still remain, so that given the capital stock and the realization of the technology shock, investment and consumption would not be uniquely determined. In other words, even if one thinks that technology shocks provide the impulse to the business cycle, indeterminacy still has a considerable amount to add to the story by providing a plausible explanation of an endogenous propagation mechanism. Benhabib and Farmer (1996a) pursue this last route in their calibration, and with the help of increasing returns of 1.2, obtain a correlation of 0.54 between consumption and GDP. The same calibration with sunspot shocks alone gives a correlation of 0.32, which is still positive due to movements in the capital stock, but low relative to this correlation in US data. Lowering the external effects so that returns to scale are only of the order 1.11 yields countercyclical consumption 37.
37 For returns to scale around 1.1 the impulse response functions are driven by positive real roots within the unit circle, whereas for returns of 1.2, the roots are again complex, yielding oscillatory impulse responses.
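The countercyclical-consumption mechanism in the labor first-order condition U'(C) MPL(L) = V'(-L) discussed above can be checked with a small numerical sketch. The functional forms and parameter values below are illustrative assumptions (log utility, Cobb-Douglas production, isoelastic disutility of labor), not the calibrations discussed in the text:

```python
# Solve the labor first-order condition U'(C) * MPL(L) = V'(-L) for C
# under assumed functional forms: U'(C) = 1/C (log utility),
# MPL = (1 - alpha) * K**alpha * L**(-alpha) (Cobb-Douglas), and
# V'(-L) = B * L**chi (convex disutility of labor).
# All parameter values are illustrative.

alpha, chi, B, K = 0.33, 0.5, 1.0, 10.0

def consumption(L):
    """C implied by the labor first-order condition, holding K fixed."""
    mpl = (1 - alpha) * K**alpha * L**(-alpha)
    return mpl / (B * L**chi)

# A "sunspot" rise in employment, with no technology shock, forces
# consumption DOWN: C is strictly decreasing in L.
for L in (0.9, 1.0, 1.1):
    print(L, consumption(L))
```

With these forms C is proportional to L raised to the negative power alpha + chi, so any increase in employment that is not accompanied by a technology shock must be met by a fall in consumption, which is the countercyclicality problem described in the text.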
(4) An alternative approach is to introduce a naturally countercyclical sector that will feed labor into the economy during booms, and absorb labor during recessions. The "home" sector will serve that purpose, even in the absence of technology shocks, and will deliver procyclical consumption as well as procyclical employment in the consumption sector. In such a setup, ignoring the home sector and the movements of labor between home and market may indeed make it seem as if leisure is inferior. A calibrated model of indeterminacy and sunspots along such lines is given by Perli (1994). The model of Benhabib and Farmer (1996a), as discussed in Section 3.3, relies on identical technologies in the two sectors, which nevertheless give rise to a nonlinear production possibilities frontier because of sector-specific externalities. In a multi-sector model without identical technologies, the marginal products of labor and the capital goods depend not only on factor stocks, but on the composition of output, which is endogenous. As pointed out in Section 3.4, this may allow the marginal product of a capital good to increase in response to an increase in its stock and give rise to indeterminacy, even though we have constant marginal costs in the production technology. Furthermore, this may also alleviate the difficulty of obtaining procyclical consumption because the marginal product of labor now depends not just on L, but on the composition of output.

8.1.3. Multi-sector models
Benhabib and Nishimura (1998) calibrate a three-sector model under constant social returns using a standard RBC parametrization. The presence of external effects coupled with constant social returns results in private decreasing returns, and necessitates some fixed costs to prevent entry. These, however, can be taken to be small since the external effects are also small, implying private decreasing returns of the order 0.93 in each sector. Utility is assumed logarithmic in consumption and separable between leisure and consumption, and is parametrized to imply a labor supply elasticity of 5. The production functions are Cobb-Douglas, and the quarterly discount rate is taken as 0.11. The model allows for iid sunspot shocks, as well as technology shocks driven by a first-order autoregressive process with standard persistence parameters. Table 2 gives the moments of simulated data, with numbers in parentheses corresponding to US quarterly data 38. In the table, investment corresponds to its aggregated value, evaluated at the current relative prices of the two investment goods. GNP contains consumption plus investment, with the price of the consumption good normalized to unity each period. The impulse responses, generated by the linearized dynamics of the system around the steady state, are driven by positive real roots
38 The statistics for the USA are from HP-filtered postwar data, and are in line with standard ones in the RBC literature. They differ from the US statistics given by Schmitt-Grohé (1997), who relies on unfiltered statistics reported in King, Plosser and Rebelo (1987).
Table 2
Three-sector calibration (US quarterly data in parentheses)

                              GNP           Consumption    Investment     Labor
Relative standard deviation   1.00          0.74 (0.73)    3.32 (3.20)    0.70 (1.16)
Correlation with GNP          1.00          0.53 (0.76)    0.83 (0.90)    0.71 (0.86)
AR(1) coefficient             0.93 (0.90)   0.97 (0.84)    0.92 (0.76)    0.80 (0.90)
within the unit circle, and resemble the hump-shaped impulse responses generated with vector autoregressions on US data. Figure 2 shows the impulse responses for consumption, investment and GNP, generated by an aggregate productivity shock impacting the three sectors simultaneously.

[Fig. 2. Impulse responses for GNP, C and I over 50 quarters.]

The aggregative shock leads to a surge of investment, initially at the expense of consumption. Again we find that this feature, that is, the initial negative response of consumption to the aggregative technology shock, typically arises for standard RBC
calibrations of multi-sector models, whether or not they have any external effects or exhibit indeterminate equilibria. GNP also drops by a small amount when the shock hits, but rises immediately afterward as investment surges, and then subsides, generating the hump-shaped response associated with the data. Another feature, shared with calibrated multi-sector models without external effects or market distortions that have determinate equilibria, is that prices and outputs of investment goods tend to be more volatile than the aggregated value of investment, with some sectors even exhibiting countercyclical behavior [see for example Benhabib, Perli and Plutarchos (1997)]. These counterfactual observations about calibrated multi-sector models in the context of a determinate economy have led Huffman and Wynne (1996) to introduce adjustment costs for the sectoral reallocation of factors of production. It seems then that, with or without sunspots and multiple equilibria, the multi-sector real business cycle models solve some of the empirical issues encountered in simpler one-sector models, but also introduce empirical complications of their own 39.
8.2. Monetary models and the monetary transmission mechanism

A second area in which calibrated monetary models are registering some progress is in describing the dynamics of the monetary transmission mechanism. Recall that the model described by Farmer (1997) has a representation as a difference equation of the form
$$
\begin{bmatrix} \mu_{t+1} \\ C_{t+1} \\ K_{t+1} \\ m_{t+1} \end{bmatrix}
= A \begin{bmatrix} \mu_t \\ C_t \\ K_t \\ m_t \end{bmatrix}
+ B \begin{bmatrix} u^1_{t+1} \\ u^2_{t+1} \\ e^1_{t+1} \\ e^2_{t+1} \end{bmatrix},
\qquad (8.3)
$$
where μ is the money growth rate, C is consumption, K is capital and m is real balances. The variables u¹ and u² are fundamental shocks and e¹ and e² are sunspot shocks. The model has two variables, μ_{t+1} and K_{t+1}, that are determined at date t and two variables, m_{t+1} and C_{t+1}, that are free to be determined by the equilibrium behavior of agents in the model. The condition for there to be a unique rational expectations equilibrium is that two of the four roots of the matrix A are inside, and two roots are outside, the unit circle. There are two possible dimensions for sunspots to influence the equilibrium of this model depending on whether three or four of these roots are stable. Farmer shows that it is relatively easy to choose calibrated values of the parameters in a way that makes all four of these roots lie within the unit circle and, in this case, he shows that one is free to pick stationary iid white noise processes for each of the two sunspot variables, e¹ and e². He then goes on to show that the variance-covariance
39 In a recent paper Weder (1998) introduces a model with three sectors consisting of separate investment, consumption and durable consumption goods with variable average markups to address some of the empirical issues that arise in calibrating multisector models.
matrix of the vector {u¹, u², e¹, e²} can be estimated from the residuals of a vector autoregression on US data 40. The important point from this discussion is that it suggests an empirical approach to the resolution of indeterminacy. If agents live in a world that is well described by a model in which equilibrium is indeterminate, these individuals must still act. To make a decision it is necessary to form an expectation of what will happen; the fact that there are many possible expectations that might be fulfilled is not a problem for the agent. He need only pick one of them. Farmer argues that there is some coordination mechanism that causes agents to act in a particular way and that this mechanism can be represented by a fixed expectation function. The econometrician will observe the outcome of this expectation function. Let us suppose that agents have solved the coordination problem and that they coordinate on the same equilibrium from one year to the next. This implies that the response to a given constellation of fundamental shocks will have the same probability structure in each year. In terms of the monetary VAR described by Equation (8.3), the particular equilibrium in which we find ourselves will show up in the covariance matrix of the residuals of the VAR. Each of the possible sunspot equilibria will result in a different joint covariance structure of the sunspot shocks with the fundamentals. We expand on this point in the following section, in which we address some criticisms that have been leveled against models with indeterminacy as descriptions of data.
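The root-counting logic behind determinacy in a linear system like (8.3) can be sketched mechanically; the matrix and the bookkeeping function below are illustrative, not Farmer's calibration:

```python
import numpy as np

# With two predetermined variables (mu, K) and two jump variables
# (C, m), uniqueness of the rational expectations equilibrium requires
# exactly two roots of A inside the unit circle; three or four stable
# roots leave one or two dimensions open for sunspots.
# The example matrices are arbitrary stand-ins.

def stable_roots(A):
    """Number of eigenvalues of A strictly inside the unit circle."""
    return sum(abs(r) < 1 for r in np.linalg.eigvals(A))

def classify(A, n_predetermined=2):
    n_stable = stable_roots(A)
    if n_stable == n_predetermined:
        return "determinate"
    if n_stable > n_predetermined:
        return "indeterminate (sunspots possible)"
    return "no stationary equilibrium"

A_unique = np.diag([0.9, 0.8, 1.5, 1.2])   # two stable, two unstable
A_sunspot = np.diag([0.9, 0.8, 0.7, 0.6])  # all four roots stable

print(classify(A_unique))   # determinate
print(classify(A_sunspot))  # indeterminate (sunspots possible)
```

Diagonal matrices are used only so the roots are visible by inspection; for a calibrated model one would apply the same count to the eigenvalues of the full matrix A.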
9. Some criticisms of the use of models with indeterminate equilibria to describe data

In this section we evaluate and address a number of concerns that have been raised by critics of models of indeterminacy and of the use of indeterminacy to explain economic data. We begin with the issue of how an equilibrium is chosen in a model where many things can happen.

9.1. Equilibrium selection
In any model with multiple equilibria one must address the issue of how an equilibrium comes about; this is true of finite general equilibrium models with multiple determinate
40 Farmer calibrates his model and reports impulse response functions that appear to match well with US data. These impulse response functions exploit the indeterminacy of equilibrium to generate price responses to monetary shocks that mimic those that we observe in the data. In a private communication, Kiril Sossounov has pointed out to us that there is a computational error in the program used to generate the impulse response functions reported in Farmer's paper. For this reason, we have not reproduced them in this survey. The basic point of the paper, that including money in the utility function can lead to indeterminacy, is correct. But it is an open question as to whether indeterminacy occurs for a range of the parameter space that can mimic the low share of resources used through holding money.
equilibria and it is, a fortiori, true of dynamic models with indeterminate equilibria. In dynamic models one thinks of the economy as evolving in a sequence of periods. In each period, agents form forecasts of future prices and they condition their excess demand functions on these forecasts. In an economy with a finite number n of commodities each period and a finite number m of agents with time-separable preferences, one can write the equilibrium of the economy as the solution to a set of equations that set excess demand functions equal to zero:
$$
Z_t\left(p_t, s_t, p^1_{t+1}[s_{t+1}], \ldots, p^m_{t+1}[s_{t+1}], W^1_{t+1}, \ldots, W^m_{t+1}\right) = 0,
\qquad (9.1)
$$
where Z_t is the n-dimensional vector of excess demands, s_t is the state of nature, p_t is the n-dimensional vector of prices, p^i_{t+1}[s_{t+1}] is the ith agent's belief of the value of the price vector at date t+1 in state of nature s_{t+1}, and W^i_{t+1} is the ith agent's belief of the value of his wealth. Wealth must be forecast since it depends on the present value of future prices in all possible realizations of states. Rational expectations is the assumption that all agents know future state-dependent prices and can therefore correctly forecast their future wealth. When a model has a unique equilibrium, the set of excess demand functions at each date has a unique solution for p_t when each expected price vector is replaced by the actual price vector in that state and when the wealth of each agent is computed accordingly. When a model has an indeterminate set of equilibria, there are many solutions to these equations: excess demand functions alone, reflecting preferences and technology, are insufficient to pin down an equilibrium. In either case, the equilibrium assumption does not address the problem of how rational expectations come about. Most work on rational expectations models begins with the assumption that there is a representative agent, thereby drastically reducing the complexity of the problem. The usual justification for rational expectations is to appeal to the assumption that the world is stationary, and to argue that in a stationary environment agents would eventually come to learn the unique set of state-dependent prices. There is a body of work on out-of-equilibrium learning that begins by conjecturing that there exists a learning rule used by agents to forecast the future. The main result of this literature is to show that rules of this kind can select an equilibrium. Initially, some authors conjectured that 'plausible' learning rules would always select a determinate equilibrium, but this has proved not to be the case.
Woodford (1990), for example, has shown that a simple learning rule can converge to a sunspot equilibrium and Duffy (1994) has demonstrated that learning rules can converge to one of a set of indeterminate equilibria. Grandmont (1994) puts forward the view that the problem is so complex that agents are unlikely ever to learn how to behave in a rational expectations equilibrium. For a more detailed exposition of the issues concerning equilibrium selection in models with endogenous cycles and sunspot equilibria the reader is referred to the survey by Guesnerie and Woodford (1992).
9.2. Equilibrium forecast functions
A separate, but related, question is how a given equilibrium is maintained. It is all very well to assume that agents know future prices, but how do they behave in any given state? One possibility is that agents use a forecast rule that maps from current and past observable variables to future values of state-dependent prices. Consider a model of the form

$$ x_t^D = a p_t + b E_t[p_{t+1}], \qquad (9.2) $$

$$ x_t^S = s_t, \qquad (9.3) $$

where x_t^D is aggregate demand, s_t is aggregate supply, which we take to be an iid sequence of random variables with mean zero and bounded support, and p_t is the log price. Equating demand and supply leads to a functional equation that must be satisfied by stochastic processes for p_t that are candidate equilibria:

$$ p_t = \frac{1}{a} s_t - \frac{b}{a} E_t[p_{t+1}]. \qquad (9.4) $$
There are two cases to consider. If |b/a| < 1 then there is a locally unique equilibrium given by

$$ p_t = \frac{1}{a} s_t. \qquad (9.5) $$

This is the case of a unique determinate equilibrium. But if |b/a| > 1 then there is a set of equilibria of the form

$$ p_{t+1} = \frac{1}{b} s_t - \frac{a}{b} p_t + e_{t+1}, \qquad (9.6) $$
where e_{t+1} is an arbitrary iid sunspot shock with zero conditional mean. In the determinate case, agents can forecast using the equilibrium function (9.5). Plugging this function into Equation (9.2) leads to the expectation

$$ E_t[p_{t+1}] = E_t\left[\frac{s_{t+1}}{a}\right] = 0, \qquad (9.7) $$

which implies that demand is given by the function

$$ x_t^D = a p_t. \qquad (9.8) $$

As the current price varies, market demand varies with current price according to Equation (9.8). A Walrasian auctioneer, calling out prices, would find a unique price, p_t = s_t/a, at which demand equals supply.
In the indeterminate case it is not so obvious how an equilibrium could be maintained. Suppose that agents forecast the future using the equilibrium pricing rule, Equation (9.6). Substituting this rule back into Equation (9.2) leads to the demand function

$$ x_t^D = s_t, $$

which is identical to supply for all possible values of p_t. Equation (9.6) cannot be used to forecast the future price since, if agents were to use the equilibrium price function, demand would equal supply at any possible price. But although Equation (9.6) cannot be used to forecast, there is a rule that can. Suppose that agents use only lagged information to forecast the future price. In particular, suppose that they use the rule

$$ p_{t+1} = \frac{1}{b} s_t + e_{t+1} - \frac{a}{b^2} s_{t-1} - \frac{a}{b} e_t + \frac{a^2}{b^2} p_{t-1}, \qquad (9.9) $$

which is obtained by using Equation (9.6), lagged one period, to substitute for p_t in Equation (9.6) itself. We will refer to this rule as a forecast function. Using Equation (9.9) they can compute an expectation of the price in period t+1:

$$ E_t[p_{t+1}] = \frac{1}{b} s_t - \frac{a}{b^2} s_{t-1} - \frac{a}{b} e_t + \frac{a^2}{b^2} p_{t-1}. \qquad (9.10) $$

Plugging this expectation back into Equation (9.2) leads to the demand function

$$ x_t^D = a p_t + s_t - \frac{a}{b} s_{t-1} - a e_t + \frac{a^2}{b} p_{t-1}, \qquad (9.11) $$

that shows how current demand varies with price if agents use Equation (9.9) to forecast. Equating demand to supply, it follows that the current price will be determined by the equation

$$ p_t = \frac{1}{b} s_{t-1} + e_t - \frac{a}{b} p_{t-1}, \qquad (9.12) $$

which is the equilibrium pricing rule that we introduced in Equation (9.6), lagged by one period. Let us recapitulate what we have said. We have shown that if agents forecast the future price using the forecast function (9.9), then the actual price will be described by the stochastic difference equation (9.12). Since the forecast function was obtained from the actual pricing rule, the forecast function is rational. To verify this, one can substitute the equilibrium price rule (9.12) into the forecast function. Furthermore, the sequence of error terms e_t is arbitrary. We have shown that there are arbitrary forecast functions, each of which can support a different rational expectations equilibrium 41.
41 For a generalization of this argument to a higher-dimensional linear model see Matheny (1996).
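The forecast-function construction can also be verified by simulation; the parameter values and shock distributions below are illustrative assumptions:

```python
import numpy as np

# Simulation sketch of the indeterminate case |b/a| > 1: prices follow
# the equilibrium rule (9.12) while agents forecast with the lagged
# rule (9.9), which uses only date-t and earlier information.

rng = np.random.default_rng(0)
a, b = 1.0, 2.0            # |b/a| = 2 > 1: the indeterminate case
T = 10_000
s = rng.uniform(-1, 1, T)  # fundamental supply shocks, mean zero
e = rng.uniform(-1, 1, T)  # arbitrary iid sunspot shocks, mean zero

p = np.zeros(T)
forecast = np.zeros(T)     # forecast[t] = E_t[p_{t+1}] from rule (9.9)
for t in range(1, T - 1):
    # Equilibrium pricing rule (9.12): p_t = s_{t-1}/b + e_t - (a/b) p_{t-1}
    p[t] = s[t - 1] / b + e[t] - (a / b) * p[t - 1]
    # Forecast function (9.9), conditional expectation given date t
    forecast[t] = (s[t] / b - a * s[t - 1] / b**2
                   - a * e[t] / b + (a / b)**2 * p[t - 1])

# Rationality check: the realized forecast error p_{t+1} - E_t[p_{t+1}]
# collapses algebraically to the sunspot e_{t+1}, which has zero
# conditional mean and is unforecastable at date t.
err = p[2:T - 1] - forecast[1:T - 2]
print(np.max(np.abs(err - e[2:T - 1])))  # ~0 up to floating point
```

The same simulation run with a different sunspot sequence e produces a different price path supported by the same fundamentals, which is the multiplicity the text describes.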
9.3. Does indeterminacy have observable implications?

Some authors have been concerned that models with indeterminate equilibria may not be useful models since, it might be thought, anything can happen. This argument is false. In fact, models with indeterminate equilibria place relatively strong restrictions on the moments of data once one closes these models by specifying a process that determines the formation of beliefs. For example, consider the Farmer-Guo version of the RBC model with increasing returns. We showed earlier that this model is described by a set of equations of the form

$$
\begin{bmatrix} k_{t+1} \\ s_{t+1} \end{bmatrix}
= \Psi \begin{bmatrix} k_t \\ s_t \end{bmatrix}
+ \Gamma \begin{bmatrix} \phi_{t+1} \\ u_{t+1} \end{bmatrix}.
\qquad (9.13)
$$

It is true that if one allows the sequence of forecast errors φ_t to be arbitrary, then this model allows additional freedom to describe the data 42. But once one specifies a stationary stochastic process for the joint determination of sunspots and fundamentals, this model places strong restrictions on the joint process determining the evolution of the state variables. Indeed, Aiyagari (1995) has argued that these restrictions are falsified in data, and this criticism of the Benhabib-Farmer (1994) model is in part responsible for the research agenda on two-sector models that we described above. Although models of indeterminacy do place restrictions on data, these restrictions are often less severe than those of the standard real business cycle model. Indeed, it is the fact that some of the restrictions of the standard model are often rejected by the data that is one of the prime motivations for considering a wider class of economies.
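The sense in which a closed model restricts the data can be made concrete with a small linear sketch. Once the joint shock process is specified, a stationary system of the form x_{t+1} = Ψ x_t + ε_{t+1} pins down the entire second-moment structure of the states; the matrices below are illustrative assumptions, not a calibration from the chapter:

```python
import numpy as np

# The unconditional covariance Sigma of a stationary linear system
# x_{t+1} = Psi x_t + eps_{t+1}, with innovation covariance Omega,
# solves the discrete Lyapunov equation Sigma = Psi Sigma Psi' + Omega.
# These implied moments are the testable restrictions referred to in
# the text. The matrices are illustrative.

Psi = np.array([[0.9, 0.2],
                [0.1, 0.7]])        # stable transition matrix
Omega = np.array([[0.010, 0.002],
                  [0.002, 0.010]])  # innovation covariance

# Solve the Lyapunov equation by vectorization:
# vec(Sigma) = (I - Psi kron Psi)^{-1} vec(Omega)
n = Psi.shape[0]
vec_sigma = np.linalg.solve(np.eye(n * n) - np.kron(Psi, Psi),
                            Omega.flatten())
Sigma = vec_sigma.reshape(n, n)

# Model-implied first-order autocorrelations of the states, which can
# be confronted with their sample counterparts in US data.
autocov1 = Psi @ Sigma
autocorr = np.diag(autocov1) / np.diag(Sigma)
print(autocorr)
```

Changing the assumed joint distribution of sunspots and fundamentals changes Omega and hence every implied moment, which is why a closed model of this kind is falsifiable even though its equilibrium is indeterminate.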
10. Conclusion
The central theme of this chapter is that the standard infinite-horizon model, modified to incorporate some mild market imperfection, often supports an indeterminate set of equilibria. When the non-stochastic version of a model has an indeterminate set of equilibria, variants of the model that explicitly incorporate uncertainty will typically support a continuum of stationary rational expectations equilibria, some of which may be driven by sunspots. In this sense the property that the equilibria of finite Arrow-Debreu economies are determinate is fragile. An implication of this argument is that minor perturbations of the (Hamiltonian) structure of a representative agent model allow self-fulfilling expectations to have a significant influence on the dynamics of prices and output. Furthermore, the economic
42 On the other hand, even with arbitrary forecast errors for sunspot shocks, without technology shocks it would not be possible to match the procyclicality of consumption in the data, for the reasons cited in Section 8.1.2.
mechanisms which give rise to such perturbations are varied, and the investigation of these mechanisms is a fruitful one, since it can potentially account for features of the time series data that are otherwise difficult to understand. The models that we have discussed in this survey may lead to the development of a rich theory of economic policy. In some situations, as in models involving monetary policies with feedback rules, sunspots may exist under some policy regimes but not under others. In other instances, as in models where coordinating on higher investment rates leads to Pareto-superior outcomes, the kind of policies needed to achieve such coordination may be quite complex, and even difficult to implement. The important consideration, however, is not so much to find policies that eliminate the possibility of multiple or sunspot equilibria, but to design policies that will select and implement the best possible equilibrium. Even if it is not possible to design policies that will select the best equilibrium, or to completely eliminate sunspot equilibria, the models that we have described in this survey may enable us to design Pareto-improving policy rules. The argument that equilibria are indeterminate may be wrong; but interventionist policy arguments couched in this language are at least capable of comparison with their non-interventionist counterparts. If a dialogue is to be developed between those who favor active intervention and those who do not, it is important that the two groups speak the same language. Dynamic general equilibrium theory, allowing for indeterminacies, is exactly the kind of vehicle that is required to further communication in this debate.
Acknowledgements

We wish to thank Roland Benabou, Jordi Gali, Stephanie Schmitt-Grohé, Jang-Ting Guo, Sharon Harrison, Takashi Kamihigashi, Roberto Perli, Martin Uribe and Michael Woodford for very useful discussions and comments. Technical support from the C.V. Starr Center for Applied Economics at New York University and from the Program for Dynamic Economics at UCLA is gratefully acknowledged. Farmer's research was supported by the National Science Foundation, grant #952912.
References

Aiyagari, S.R. (1995), "The econometrics of indeterminacy: an applied study: A comment", Carnegie-Rochester Conference Series on Public Policy 43:273-284.
Akerlof, G.A., and J.L. Yellen (1985), "Can small deviations from rationality make significant differences to economic equilibrium?", American Economic Review 75:708-720.
Azariadis, C. (1981), "Self-fulfilling prophecies", Journal of Economic Theory 25:380-396.
Azariadis, C., and R. Cooper (1985), "Nominal wage-price rigidity as a rational expectations equilibrium", American Economic Review 73:31-36.
Balasko, Y., D. Cass and K. Shell (1995), "Market participation and sunspot equilibria", Review of Economic Studies 62:491-512.
Basu, S., and J.G. Fernald (1995), "Are apparent productive spillovers a figment of specification error?", Journal of Monetary Economics 36:165-188.
Basu, S., and J.G. Fernald (1997), "Returns to scale in US production: estimates and implications", Journal of Political Economy 105:249-283.
Baxter, M., and R.G. King (1991), "Productive externalities and business cycles", discussion paper #53 (Institute for Empirical Macroeconomics, Federal Reserve Bank of Minneapolis).
Beaudry, P., and M. Devereux (1993), "Monopolistic competition, price setting and the effects of real and monetary shocks", discussion paper 93-34 (Department of Economics, University of British Columbia).
Benhabib, J., and R.E.A. Farmer (1994), "Indeterminacy and increasing returns", Journal of Economic Theory 63:19-41.
Benhabib, J., and R.E.A. Farmer (1996a), "Indeterminacy and sector specific externalities", Journal of Monetary Economics 37:397-419.
Benhabib, J., and R.E.A. Farmer (1996b), "The monetary transmission mechanism", working paper 96-13 (C.V. Starr Center of Applied Economics, New York University).
Benhabib, J., and J. Gali (1994), "On growth and indeterminacy: some theory and evidence", Carnegie-Rochester Conference Series on Public Policy 43:163-212.
Benhabib, J., and K. Nishimura (1979), "The Hopf bifurcation and the existence and stability of closed orbits in multisector models of optimal economic growth", Journal of Economic Theory 21:421-444.
Benhabib, J., and K. Nishimura (1998), "Indeterminacy and sunspots with constant returns", Journal of Economic Theory 81:58-96.
Benhabib, J., and R. Perli (1994), "Uniqueness and indeterminacy: transitional dynamics in a model of endogenous growth", Journal of Economic Theory 63:113-142.
Benhabib, J., and A. Rustichini (1994), "Introduction to the symposium on growth, fluctuations and sunspots", Journal of Economic Theory 63:1-19.
Benhabib, J., R. Rogerson and R. Wright (1991), "Homework in macroeconomics: household production and aggregate fluctuations", Journal of Political Economy 99:1166-1187.
Benhabib, J., R. Perli and D. Xie (1994), "Monopolistic competition, indeterminacy and growth", Ricerche Economiche 48:279-298.
Benhabib, J., R. Perli and S. Plutarchos (1997), "Persistence of business cycles in multisector models", Economic Research Report 97-19 (C.V. Starr Center for Applied Economics, New York University).
Benhabib, J., S. Schmitt-Grohé and M. Uribe (1998), "Monetary policy and multiple equilibria", working paper 98-02 (C.V. Starr Center of Applied Economics, New York University).
Benhabib, J., S. Schmitt-Grohé and M. Uribe (2000), "The perils of Taylor rules", Journal of Economic Theory, forthcoming.
Bennett, R. (1997), "Essays on Money", Ph.D. Thesis (UCLA).
Bisin, A. (1997), "At the roots of indeterminacy", in: P. Battigalli et al., eds., Decisions, Games and Markets (Kluwer, New York).
Black, F. (1974), "Uniqueness of the price level in a monetary growth model with rational expectations", Journal of Economic Theory 7:53-65.
Blanchard, O.J., and C.M. Kahn (1980), "The solution of linear difference models under rational expectations", Econometrica 48:1305-1313.
Blanchard, O.J., and L.H. Summers (1987), "Fiscal increasing returns, hysteresis, real wages and unemployment", European Economic Review 31:543-559.
Boldrin, M., and A. Rustichini (1994), "Indeterminacy of equilibria in models with infinitely-lived agents and external effects", Econometrica 62:323-342.
Boldrin, M., N. Kiyotaki and R. Wright (1993), "A dynamic equilibrium model of search, production, and exchange", Journal of Economic Dynamics and Control 17:723-758.
Brock, W.A. (1974), "Money and growth: the case of long run perfect foresight", International Economic Review 17:750-777.
Burnside, C. (1996), "Production function regressions, returns to scale, and externalities", Journal of Monetary Economics 37:177-201.
Ch. 6.
Indeterminacy and Sunspots in Macroeconomics
445
Burnside, C., M. Eichenbaum and S.T. Rebelo (1995), "Capacity utilization and returns to scale", NBER Macroeconomics Annual 10:67-110.
Caballero, R.J., and R.K. Lyons (1992), "External effects in US cyclical productivity", Journal of Monetary Economics 29:209-226.
Calvo, G.A. (1978), "On indeterminacy of interest rates and wages with perfect foresight", Journal of Economic Theory 19:321-337.
Calvo, G.A. (1979), "On models of money and perfect foresight", International Economic Review 20:83-103.
Calvo, G.A. (1983), "Staggered prices in a utility maximizing framework", Journal of Monetary Economics 12:383-398.
Cass, D., and K. Shell (1983), "Do sunspots matter?", Journal of Political Economy 91:193-227.
Cazzavilan, G. (1996), "Public spending, endogenous growth and endogenous fluctuations", Journal of Economic Theory 71:394-415.
Chamley, C. (1993), "Externalities and dynamics in models of 'learning or doing'", International Economic Review 34:583-610.
Chari, V.V., P.J. Kehoe and E.R. McGrattan (1996), "Sticky price models of the business cycle: can the contract multiplier solve the persistence problem?", Staff Report 217 (Federal Reserve Bank of Minneapolis, Research Department).
Chiappori, P.A., and R. Guesnerie (1994), "Rational random walks", Review of Economic Studies 60:837-864.
Chiappori, P.A., P.Y. Geoffard and R. Guesnerie (1992), "Sunspot fluctuations around a steady state: the case of multidimensional one-step forward looking economic models", Econometrica 60:1097-1126.
Christiano, L.J., and S.G. Harrison (1996), "Chaos, sunspots and automatic stabilizers", Working Paper No. 5703 (NBER).
Clarida, R., J. Gali and M. Gertler (1997), "Monetary policy rules and macroeconomic stability: evidence and some theory", working paper 98-01 (C.V. Starr Center of Applied Economics, New York University).
Cogley, T., and J.M. Nason (1995), "Output dynamics in real business cycle models", American Economic Review 85(3):492-511.
Cooley, T.F., and G.D. Hansen (1989), "The inflation tax in a real business cycle model", American Economic Review 79:733-748.
Cooley, T.F., and G.D. Hansen (1991), "The welfare costs of moderate inflations", Journal of Money, Credit and Banking 23:483-503.
Cooper, R., and A. John (1988), "Coordinating coordination failures in Keynesian models", Quarterly Journal of Economics 103:441-463.
Drugeon, J.P. (1996), "A model with endogenously determined cycles, discounting & growth", Economic Theory 12:349-370.
Duffy, J. (1994), "On learning and the nonuniqueness of equilibrium in an overlapping generations model with fiat money", Journal of Economic Theory 64(2):541-553.
Evans, G.W., S. Honkapohja and P.M. Romer (1996), "Growth cycles", Working Paper No. 5659 (NBER).
Farmer, R.E. (1991), "Sticky prices", Economic Journal 101:1369-1379.
Farmer, R.E. (1992), "Nominal price stickiness as a rational expectations equilibrium", Journal of Economic Dynamics and Control 16:317-337.
Farmer, R.E. (1993), The Macroeconomics of Self-Fulfilling Prophecies (MIT Press, Cambridge, MA).
Farmer, R.E. (1997), "Money in a real business cycle model", Journal of Money, Credit and Banking 29:568-611.
Farmer, R.E., and J.-T. Guo (1994), "Real business cycles and the animal spirits hypothesis", Journal of Economic Theory 63:42-73.
Farmer, R.E., and J.-T. Guo (1995), "The econometrics of indeterminacy: an applied study", Carnegie-Rochester Conference Series on Public Policy 43:225-272.
446
J. Benhabib and R.E.A. Farmer
Farmer, R.E., and M. Woodford (1997), "Self-fulfilling prophecies and the business cycle", Macroeconomic Dynamics 1(4):740-769.
Flood, R.P., and P.M. Garber (1980), "Market fundamentals versus price level bubbles: the first tests", Journal of Political Economy 88:745-770.
Gale, D. (1974), "Pure exchange equilibrium in dynamic economic models", Journal of Economic Theory 6:12-36.
Gali, J. (1994), "Monopolistic competition, business cycles, and the composition of aggregate demand", Journal of Economic Theory 63:73-96.
Gali, J. (1996), "Multiple equilibria in a growth model with monopolistic competition", Economic Theory 8:251-266.
Gali, J., and F. Zilibotti (1995), "Endogenous growth and poverty traps in a Cournotian model", Annales d'Economie et de Statistique 37/38:197-213.
Geanakoplos, J.D., and H.M. Polemarchakis (1986), "Walrasian indeterminacy and Keynesian macroeconomics", Review of Economic Studies 53:755-779.
Grandmont, J.-M. (1994), "Expectations formation and stability of large socioeconomic systems", discussion paper (CEPREMAP, Paris).
Guesnerie, R., and M. Woodford (1992), "Endogenous fluctuations", in: J.-J. Laffont, ed., Advances in Economic Theory (Cambridge University Press, Cambridge) 289-412.
Guo, J.-T., and K. Lansing (1998), "Indeterminacy and stabilization policy", Journal of Economic Theory 88(2):481-490.
Guo, J.-T., and F. Sturzenegger (1994), "Crazy explanations of the international business cycle", working paper (UCLA).
Hall, R.E. (1988), "The relation between price and marginal cost in U.S. industry", Journal of Political Economy 96:921-947.
Hall, R.E. (1990), "Invariance properties of Solow's productivity residual", in: P. Diamond, ed., Growth, Productivity, Unemployment (MIT Press, Cambridge, MA) 71-112.
Harrison, S.H. (1996), "Production externalities and indeterminacy in a two-sector model: theory and evidence", working paper (Northwestern University).
Hodrick, R., and E.C. Prescott (1980), "Post-war U.S. business cycles: an empirical investigation", mimeograph (Carnegie-Mellon University, Pittsburgh). Recently published in Journal of Money, Credit and Banking 29 (1997) 1-16.
Hoffman, D.L., R.H. Rasche and M.A. Tieslau (1995), "The stability of long run money demand in five industrial countries", Journal of Monetary Economics 35:317-340.
Howitt, P., and R.P. McAfee (1988), "Stability of equilibria with externalities", Quarterly Journal of Economics 103:261-278.
Howitt, P., and R.P. McAfee (1992), "Animal spirits", American Economic Review 82:493-507.
Huffman, G.W., and M.A. Wynne (1996), "The role of intertemporal adjustment costs in a multi-sector economy", working paper (Southern Methodist University).
Kamihigashi, T. (1996), "Real business cycle models and sunspot fluctuations are observationally equivalent", Journal of Monetary Economics 37:105-117.
Kehoe, T.J., and D.K. Levine (1985), "Comparative statics and perfect foresight in infinite horizon economies", Econometrica 53:433-453.
Kiley, M.T. (1998), "Staggered price setting, partial adjustment, and real rigidities", manuscript (Federal Reserve Board, Division of Research and Statistics, Washington).
King, R.G., C.I. Plosser and S.T. Rebelo (1987), "Production, growth and business cycles. I. The basic neo-classical model", Journal of Monetary Economics 21:195-232.
Kydland, F.E., and E.C. Prescott (1990), "Business cycles: real facts and a monetary myth", Quarterly Review, Federal Reserve Bank of Minneapolis 14(2):3-18.
Lee, J.Y. (1993), "Essays on money and business cycles", Ph.D. Thesis (UCLA).
Leeper, E.M. (1991), "Equilibria under 'active' and 'passive' monetary and fiscal policies", Journal of Monetary Economics 27:129-147.
Levine, R., and D. Renelt (1992), "A sensitivity analysis of cross-country growth regressions", American Economic Review 82(4):942-963.
Lucas, R. (1988), "The mechanics of development", Journal of Monetary Economics 22:3-42.
Mankiw, N.G. (1985), "Small menu costs and large business cycles: a macroeconomic model of monopoly", Quarterly Journal of Economics 100:529-538.
Matheny, K.J. (1992), "Essays on beliefs and business cycles", Ph.D. Thesis (UCLA).
Matheny, K.J. (1996), "Equilibrium beliefs in linear rational expectations models", mimeograph (Krannert Graduate School of Management, Purdue University).
Matheny, K.J. (1998), "Non-neutral responses to money supply shocks when consumption and leisure are Pareto substitutes", Economic Theory, forthcoming.
Matsuyama, K. (1991a), "Increasing returns, industrialization, and indeterminacy of equilibrium", Quarterly Journal of Economics 106:617-650.
Matsuyama, K. (1991b), "Endogenous price fluctuations in an optimizing model of a monetary economy", Econometrica 59:1617-1631.
McCallum, B.T. (1983), "On non-uniqueness in rational expectations models: an attempt at perspective", Journal of Monetary Economics 11(2):139-168.
Merton, R.K. (1948), "The self-fulfilling prophecy", Antioch Review 8:193-210.
Obstfeld, M., and K. Rogoff (1983), "Speculative hyperinflations in maximizing models: can we rule them out?", Journal of Political Economy 91:675-687.
Patinkin, D. (1956), Money, Interest and Prices, 2nd edition (MIT Press, Cambridge, MA).
Perli, R. (1994), "Indeterminacy, home production and the business cycle: a calibration analysis", working paper (New York University); Journal of Monetary Economics 41(1):105-125.
Romer, P.M. (1990), "Endogenous technological change", Journal of Political Economy 98:S71-S102.
Rotemberg, J.J. (1982), "Sticky prices in the United States", Journal of Political Economy 90:1187-1211.
Rotemberg, J.J. (1996), "Price, output and hours: an empirical analysis based on a sticky price model", Journal of Monetary Economics 37:505-533.
Rotemberg, J.J., and M. Woodford (1992), "Oligopolistic pricing and the effects of aggregate demand on economic activity", Journal of Political Economy 100:1153-1207.
Sargent, T.J., and N. Wallace (1975), "Rational expectations, the optimal monetary instrument and the optimal money supply rule", Journal of Political Economy 83:241-254.
Schmitt-Grohé, S. (1997), "Comparing four models of aggregate fluctuations due to self-fulfilling expectations", Journal of Economic Theory 72:96-147.
Schmitt-Grohé, S., and M. Uribe (1997a), "Balanced budget rules, distortionary taxes, and aggregate instability", Journal of Political Economy 105:976-1000.
Schmitt-Grohé, S., and M. Uribe (1997b), "Price level determinacy and monetary policy under a balanced-budget requirement", working paper 97-17 (Board of Governors of the Federal Reserve System, Washington, DC).
Shea, J. (1993), "Do supply curves slope up?", Quarterly Journal of Economics 108:1-32.
Shell, K. (1971), "Notes on the economics of infinity", Journal of Political Economy 79:1002-1011.
Shell, K. (1977), "Monnaie et allocation intertemporelle", Séminaire d'Econométrie Roy-Malinvaud, Centre National de la Recherche Scientifique, Paris, November 21, 1977, mimeograph (title and abstract in French, text in English).
Sims, C.A. (1980), "Comparison of interwar and postwar business cycles", American Economic Review 70:250-257.
Sims, C.A. (1989), "Models and their uses", American Journal of Agricultural Economics 71:489-494.
Sims, C.A. (1992), "Interpreting the macroeconomic time series facts", European Economic Review 36:975-1011.
Sims, C.A. (1997), "Fiscal foundations of price stability in open economies", mimeograph (Yale University).
Svensson, L.E.O. (1996), "Inflation forecast targeting: implementing and monitoring inflation targets", discussion paper (Institute for International Economic Studies, Stockholm University).
Taylor, J.B. (1977), "Conditions for unique solutions to stochastic macroeconomic models with rational expectations", Econometrica 45:1377-1385.
Taylor, J.B. (1980), "Aggregate dynamics and staggered contracts", Journal of Political Economy 88:1-23.
Taylor, J.B. (1996), "Policy rules as a means to a more effective monetary policy", discussion paper no. 449 (Center for Economic Policy Research, Stanford University).
Velasco, A. (1996), "Animal spirits, capital repatriation and investment", Journal of International Money and Finance 15:221-238.
Weder, M. (1996), "Animal spirits, technology shocks and the business cycle", working paper (Humboldt University).
Weder, M. (1998), "Fickle consumers, durable goods and business cycles", working paper; Journal of Economic Theory 81:37-57.
Wen, Y. (1998), "Capacity utilization under increasing returns to scale", working paper; Journal of Economic Theory 81:7-36.
Wilson, C.A. (1979), "An infinite horizon model with money", in: J. Green and J. Scheinkman, eds., General Equilibrium, Growth and Trade (Academic Press, New York) 81-104.
Woodford, M. (1986), "Stationary sunspot equilibria in a finance constrained economy", Journal of Economic Theory 40:128-137.
Woodford, M. (1987), "Credit policy and the price level in a cash-in-advance economy", in: W.A. Barnett and K.J. Singleton, eds., New Approaches in Monetary Economics (Cambridge University Press, New York) 52-66.
Woodford, M. (1988), "Expectations, finance and aggregate instability", in: M. Kohn and S.-C. Tsiang, eds., Finance Constraints, Expectations and Macroeconomics (Oxford University Press, New York) 230-261.
Woodford, M. (1990), "Learning to believe in sunspots", Econometrica 58:277-307.
Woodford, M. (1991), "Self-fulfilling expectations and fluctuations in aggregate demand", in: G. Mankiw and D. Romer, eds., New Keynesian Economics, vol. 2 (MIT Press, Cambridge, MA) 77-110.
Woodford, M. (1994), "Monetary policy and price level determinacy in a cash-in-advance economy", Economic Theory 4:345-380.
Woodford, M. (1995), "Price level determinacy without control of a monetary aggregate", Carnegie-Rochester Conference Series on Public Policy 43:1-46.
Woodford, M. (1996), "Control of public debt: a requirement for price stability", Working Paper No. 5684 (NBER).
Xie, D. (1994), "Divergence in economic performance: transitional dynamics with multiple equilibria", Journal of Economic Theory 63(1):97-112.
Chapter 7
LEARNING DYNAMICS

GEORGE W. EVANS
University of Oregon

SEPPO HONKAPOHJA
University of Helsinki

Contents

Abstract Keywords 1. Introduction 1.1. Expectations and the role of learning 1.1.1. Background 1.1.2. Role of learning in macroeconomics 1.1.3. Alternative reduced forms 1.2. Some economic examples 1.2.1. The Muth model 1.2.2. A linear model with multiple REE 1.2.3. The overlapping generations model with money 1.3. Approaches to learning 1.3.1. Rational learning 1.3.2. Eductive approaches 1.3.3. Adaptive approaches 1.4. Examples of statistical learning rules 1.4.1. Least squares learning in the Muth model 1.4.2. Least squares learning in a linear model with multiple REE 1.4.3. Learning a steady state 1.4.4. The seignorage model of inflation 1.5. Adaptive learning and the E-stability principle 1.6. Discussion of the literature 2. General methodology: recursive stochastic algorithms 2.1. General setup and assumptions 2.1.1. Notes on the technical literature 2.2. Assumptions on the algorithm 2.3. Convergence: the basic results 2.3.1. ODE approximation

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford © 1999 Elsevier Science B.V. All rights reserved

449
452 452 453 453 453 454 455 456 456 457 458 461 461 462 464 465 465 467 468 471 472 473 475 475 476 476 478 478
450
G. W. Evans and S. Honkapohja
2.3.2. Asymptotic analysis 2.4. Convergence: further discussion 2.4.1. Immediate consequences 2.4.2. Algorithms with a projection facility 2.5. Instability results 2.6. Further remarks 2.7. Two examples 2.7.1. Learning noisy steady states 2.7.2. A model with a unique REE 2.8. Global convergence 3. Linear economic models 3.1. Characterization of equilibria 3.2. Learning and E-stability in univariate models 3.2.1. A leading example 3.2.1.1. A characterization of the solutions 3.2.1.2. E-stability of the solutions 3.2.1.3. Strong E-stability 3.2.1.4. E-stability and indeterminacy 3.2.2. The leading example: adaptive learning 3.2.2.1. Adaptive and statistical learning of MSV solution 3.2.2.2. Learning non-MSV solutions 3.2.2.2.1. Recursive least squares learning: the AR(1) case 3.2.2.2.2. Learning sunspot solutions 3.2.3. Lagged endogenous variables 3.2.3.1. A characterization of the solutions 3.2.3.2. Stability under learning of the AR(1) MSV solutions 3.2.3.3. Discussion of examples 3.3. Univariate models: further extensions and examples 3.3.1. Models with t dating of expectations 3.3.1.1. Alternative dating 3.3.2. Bubbles 3.3.3. A monetary model with mixed datings 3.3.4. A linear model with two forward leads 3.4. Multivariate models 3.4.1. MSV solutions and learning 3.4.2. Multivariate models with time t dating 3.4.3. Irregular models 4. Learning in nonlinear models 4.1. Introduction 4.2. Steady states and cycles in models with intrinsic noise 4.2.1. Some economic examples 4.2.2. Noisy steady states and cycles 4.2.3. Adaptive learning algorithms
479 480 480 480 481 482 483 483 484 486 487 487 488 488 489 490 491 491 493 493 493 494 494 495 496 496 496 497 497 499 499 500 501 502 503 505 505 506 506 507 507 509 510
Ch. 7: Learning Dynamics
451
4.2.4. E-stability and convergence 4.2.4.1. Weak and strong E-stability 4.2.4.2. Convergence 4.2.4.3. The case of small noise 4.2.5. Economic models with steady states and cycles 4.2.5.1. Economic examples continued 4.2.5.2. Other economic models 4.3. Learning sunspot equilibria 4.3.1. Existence of sunspot equilibria 4.3.2. Analysis of learning 4.3.2.1. Formulation of the learning rule 4.3.2.2. Analysis of convergence 4.3.3. Stability of SSEs near deterministic solutions 4.3.4. Applying the results to OG and other models 5. Extensions and recent developments 5.1. Genetic algorithms, classifier systems and neural networks 5.1.1. Genetic algorithms 5.1.2. Classifier systems 5.1.3. Neural networks 5.1.4. Recent applications of genetic algorithms 5.2. Heterogeneity in learning behavior 5.3. Learning in misspecified models 5.4. Experimental evidence 5.5. Further topics 6. Conclusions References
511 512 513 513 513 513 514 515 516 517 517 518 520 520 521 521 521 523 524 525 527 528 530 531 533 533
Abstract
This chapter provides a survey of the recent work on learning in the context of macroeconomics. Learning has several roles. First, it provides a boundedly rational model of how rational expectations can be achieved. Secondly, learning acts as a selection device in models with multiple REE (rational expectations equilibria). Third, the learning dynamics themselves may be of interest. While there are various approaches to learning in macroeconomics, the emphasis here is on adaptive learning schemes in which agents use statistical or econometric techniques in self-referential stochastic systems. Careful attention is given to learning in models with multiple equilibria. The methodological tool is to set up the economic system under learning as a SRA (stochastic recursive algorithm) and to analyze convergence by the method of stochastic approximation based on an associated differential equation. Global stability, local stability and instability results for SRAs are presented. For a wide range of solutions to economic models the stability conditions for REE under statistical learning rules are given by the expectational stability principle, which is treated as a unifying principle for the results presented. Both linear and nonlinear economic models are considered and in the univariate linear case the full set of solutions is discussed. Applications include the Muth cobweb model, the Cagan model of inflation, asset pricing with risk neutrality, the overlapping generations model, the seignorage model of inflation, models with increasing social returns, IS-LM-Phillips curve models, the overlapping contract model, and the Real Business Cycle model. Particular attention is given to the local stability conditions for convergence when there are indeterminacies, bubbles, multiple steady states, cycles or sunspot solutions. 
The survey also discusses alternative approaches and recent developments, including Bayesian learning, eductive approaches, genetic algorithms, heterogeneity, misspecified models and experimental evidence.
Keywords

expectations, learning, adaptive learning, least squares learning, eductive learning, multiple equilibria, expectational stability, stochastic recursive algorithms, sunspot equilibria, cycles, multivariate models, MSV solutions, stability, instability, ODE approximation, stochastic approximation, computational intelligence, dynamic expectations models

JEL classification: E32, D83, D84, C62
1. Introduction
1.1. Expectations and the role of learning

1.1.1. Background

In modern macroeconomic models the role of expectations is central. In a typical reduced form model a vector of endogenous variables y_t depends on lagged values y_{t-1}, on expectations of the next period's values, y^e_{t+1}, and perhaps on a vector of exogenous shocks u_t, e.g. taking the form y_t = F(y_{t-1}, y^e_{t+1}, u_t), where for the moment assume F to be linear. Of course, in some models the dependence on y_{t-1} or u_t may be absent. The information set available when y^e_{t+1} is formed typically includes {y_{t-i}, u_{t-i}, i = 1, 2, 3, ...} and may or may not also include the contemporaneous values y_t and u_t. A useful notation, if y_t, u_t are in the information set, is E*_t y_{t+1}, and we write the reduced form as

y_t = F(y_{t-1}, E*_t y_{t+1}, u_t).    (1)

If y_t and u_t are not included in the information set then we write y^e_{t+1} as E*_{t-1} y_{t+1}. In the economic models we consider in this survey, these expectations are those held by the private agents in the economy, i.e. of the households or the firms. Models in which policy makers, as well as private agents, must form expectations raise additional strategic issues which we do not have space to explore¹. Following the literature, we restrict attention to models with a large number of agents in which the actions of an individual agent have a negligible effect on the values y_t. Closing the model requires a theory of how expectations are formed. In the 1950s and 1960s the standard approach was to assume adaptive expectations, in which expectations are adjusted in the direction of the most recent forecast error, e.g. in the scalar case, and assuming y_t is in the information set, E*_t y_{t+1} = E*_{t-1} y_t + γ(y_t - E*_{t-1} y_t) for some value of 0 < γ ≤ 1. Though simple and often well behaved, a well-known disadvantage of adaptive expectations is that in certain environments it will lead to systematic forecast errors, which appears inconsistent with the assumption of rational agents. The rational expectations revolution of the 1970s has led to the now standard alternative assumption that expectations are equal to the true conditional expectations in the statistical sense. Rational expectations (in this standard interpretation used in macroeconomics) is a strong assumption in various ways: it assumes that agents know the true economic model generating the data and implicitly assumes coordination of expectations by the agents². It is, however, a natural benchmark assumption and is widely in use.
1 See Sargent (1999) for some models with learning by policy makers. 2 A rational expectations equilibrium can be interpreted as a Nash equilibrium, a point made in Townsend (1978) and Evans (1983). It is thus not rational for an individual to hold "rational expectations" unless all other agents are assumed to hold rational expectations. See the discussion in Frydman and Phelps (1983).
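The systematic forecast errors that adaptive expectations can produce are easy to see numerically. A minimal sketch, with an illustrative gain γ = 0.5 and a deterministic trend as the data-generating process (both choices are assumptions made here, not taken from the text):

```python
def adaptive_forecasts(y, gamma=0.5):
    """One-step forecasts from the adaptive expectations rule
    E*_t y_{t+1} = E*_{t-1} y_t + gamma * (y_t - E*_{t-1} y_t)."""
    forecasts = [y[0]]                # initialize the forecast of y_0 at y_0
    for yt in y[:-1]:
        prev = forecasts[-1]
        forecasts.append(prev + gamma * (yt - prev))
    return forecasts

# On a deterministic upward trend y_t = t, the forecast error y_t - E*_{t-1} y_t
# converges to the constant 1/gamma: the rule under-predicts period after period.
y = list(range(50))
f = adaptive_forecasts(y)
errors = [yt - ft for yt, ft in zip(y, f)]
```

The error recursion here is e_{t+1} = (1 - γ)e_t + 1, whose fixed point 1/γ is exactly the kind of systematic bias that the rational expectations critique targets.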
More recently a literature has developed in which the RE (rational expectations) assumption has been replaced by the assumption that expectations follow a learning rule, either a stylized or a real-time learning rule, which has the potential to converge to RE. An example of a learning rule is one in which agents use a linear regression model to forecast the variables of interest and estimate the required parameters by least squares, updating the parameter estimates each period to incorporate new data. Modeling expectations in this fashion puts the agents in the model in a symmetric position with the economic analyst, since, when studying real economies, economists use econometrics and statistical inference. In contrast, under RE the agents in the model economy have much more information than the outside observer. It is worth emphasizing that most of the literature on learning reviewed in this paper has followed standard practice in macroeconomics and postulates the assumption of a representative agent as a simplification. This implies that the expectations and the learning rules of different agents are assumed to be identical. Some recent papers allow for heterogeneity in learning and this work is discussed below.

1.1.2. Role of learning in macroeconomics
Introducing learning into dynamic expectations models has several motivations. First, learning has been used to address the issue of the plausibility of the RE assumption in a particular model: could boundedly rational agents arrive at RE through a learning rule? This issue is of interest as it provides a justification for the RE hypothesis. The early work by DeCanio (1979), Bray (1982) and Evans (1983) focused on this, and some further papers are Bray and Savin (1986), Fourgeaud, Gourieroux and Pradel (1986), Marcet and Sargent (1989b), and Guesnerie (1992). This view is forcefully expressed by Lucas (1986), though he views the adjustment as very quick. Secondly, there is the possibility of models with multiple REE (rational expectations equilibria). If some REE are locally stable under a learning rule, while others are locally unstable, then learning acts as a selection device for choosing the REE which we can expect to observe in practice. This point was made in Evans (1985) and Grandmont (1985) and developed, for example, in Guesnerie and Woodford (1991) and Evans and Honkapohja (1992, 1994b, 1995a). Extensive recent work has been devoted to obtaining stability conditions for convergence of learning to particular REE and this work is discussed in detail in the later sections of this paper. A particular issue of interest is the conditions under which there can be convergence to exotic solutions, such as sunspot equilibria. This was established by Woodford (1990). Thirdly, it may be of interest to take seriously the learning dynamics itself, e.g. during the transition to RE. Dynamics with learning can be qualitatively different from, say, fully rational adjustment after a structural change. This has been the focus of some policy oriented papers, e.g. Taylor (1975), Frydman and Phelps (1983), Currie, Garratt and Hall (1993) and Fuhrer and Hooker (1993). 
It has also been the focus of some recent work on asset pricing, see Timmermann (1993, 1996) and Bossaerts (1995). Brian Arthur [see e.g. papers reprinted in Arthur (1994)] has emphasized path-dependence of adaptive learning dynamics in the presence of multiple equilibria. If the model is misspecified by the agents, then this can effectively lead to persistent learning dynamics as in Evans and Honkapohja (1993a), Marcet and Nicolini (1998) and Timmermann (1995). Even if the model is not misspecified, particular learning dynamics may not fully converge to an REE and the learning dynamics may be of intrinsic interest. This arises, for example, in Arifovic (1996), Evans and Ramey (1995), Brock and Hommes (1996, 1997), and Moore and Schaller (1996, 1997)³. The theoretical results on learning in macroeconomics have begun to receive some support in experimental work [e.g. Marimon and Sunder (1993, 1994) and Marimon, Spear and Sunder (1993)], though experimental work in macroeconomic set-ups remains relatively limited. We review this work in Section 5.4. The implications of these results have led also to one further set of issues: the effects of policy, and appropriate policy design, in models with multiple REE. For example, if there are multiple REE which are stable under learning, then policy may play a role in which equilibrium is selected, and policy changes may also exhibit hysteresis and threshold effects. The appropriate choice of policy parameters can eliminate or render unstable inefficient steady states, cycles or sunspot equilibria. For examples, see Evans and Honkapohja (1993a,b, 1995b). Howitt (1992) provides examples in which the stability under learning of the REE is affected by the form of the particular monetary policy⁴. A further application of learning algorithms is that they can also be used as a computational tool to solve a model for its REE. This point has been noted by Sargent (1993). An advantage of such algorithms is that they find only "learnable" REE. A well-known paper illustrating a computational technique is Marimon, McGrattan and Sargent (1989).
A related approach is the method of parameterized expectations, see Marcet (1994) and Marcet and Marshall (1992).
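As a concrete sketch of the adaptive approach, consider a stylized self-referential model in which the realized outcome depends on the forecast, y_t = μ + α E*_{t-1} y_t + η_t, and agents forecast with the recursively updated sample mean of past outcomes (least squares on a constant). The parameter values below are illustrative assumptions; with α < 1 the estimate converges to the REE value μ/(1 - α), in line with the expectational stability principle discussed later in the chapter.

```python
import random

random.seed(0)

# Self-referential learning sketch: the outcome feeds back from the agents'
# forecast, y_t = mu + alpha * forecast + eta_t.  Agents forecast with the
# sample mean of past outcomes, updated recursively with decreasing gain 1/t.
# mu, alpha and the noise scale are illustrative choices, not from the text.
mu, alpha, sd = 1.0, 0.5, 0.1
ree = mu / (1.0 - alpha)            # rational expectations equilibrium mean

forecast = 0.0                      # arbitrary initial expectation
for t in range(1, 20001):
    y = mu + alpha * forecast + random.gauss(0.0, sd)
    forecast += (y - forecast) / t  # recursive mean update
```

The decreasing gain 1/t makes this exactly the kind of stochastic recursive algorithm analyzed in Section 2 via an associated ordinary differential equation.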
1.1.3. Alternative reduced forms

The models we will consider have various reduced forms, and some preliminary comments are useful before turning to some economic examples. The form (1) assumed that contemporaneous information is available when expectations are formed. If alternatively the information set is {y_{t-i}, u_{t-i}, i = 1, 2, 3, ...} then y_t may also (or
3 These lines of research in macroeconomics correspond to parallel developments in game theory. For a survey of learning in economics which gives a greater role to learning in games, see Marimon (1997). See also Fudenberg and Levine (1998).
4 Here policy is modeled as a rule which atomistic private agents take as part of the economic structure. Modeling policy as a game is a different approach, see e.g. Cho and Sargent (1996a) and Sargent (1999) for the latter in the context of learning.
instead) depend on E*_{t-1} y_t, the expectation of y_t formed at t - 1, so that the reduced form is

y_t = F(y_{t-1}, E*_{t-1} y_t, E*_{t-1} y_{t+1}, u_t)  or  y_t = F(y_{t-1}, E*_{t-1} y_t, u_t).

The Muth model, below, is the special case y_t = F(E*_{t-1} y_t, u_t). In nonlinear stochastic models a point requiring some care is the precise quantity about which expectations are formed. Even assuming that lagged values y_{t-i} are not present, the required reduced form might be

y_t = H(E*_t G(y_{t+1}, u_{t+1}), u_t).

In many of the early examples the model is nonstochastic. The reduced form then becomes y_t = H(E*_t G(y_{t+1})). If H is invertible then by changing variables to ỹ_t = H⁻¹(y_t) the model can be transformed to

ỹ_t = E*_t f(ỹ_{t+1}),    (2)

where f(ỹ) = G(H(ỹ)). The form (2) is convenient when one considers the possibility of stochastic equilibria for models with no intrinsic randomness, see Section 4.3. In nonstochastic models, if agents have point expectations, these transformations are unnecessary and the model can again simply be analyzed in the form

y_t = f(E*_t y_{t+1}),    (3)

where f(y) = H(G(y)). This is standard, for example, in the study of learning in Overlapping Generations models. Finally, mixed datings of expectations appear in some models. For example, the seignorage model of inflation often considers a formulation in which
1.2. Some economic examples
It will be helpful at this stage to give several economic examples which we will use to illustrate the role of learning.

1.2.1. The Muth model
The "cobweb" model of a competitive market in which the demand for a perishable good depends on its price and the supply, due to a production lag, depends on its
expected price, was originally solved under rational expectations by Muth (1961). Consider the structural model qt = m l - m2Pt + Olt, q¢ = r l E ~ lpt + r ~ w t - i + vzt,
where m2, rl > 0, Vlt and vzt are unobserved white noise shocks and wt 1 is a vector o f exogenous shocks, also assumed white noise for convenience, qt is output, Pt is price, and the first equation represents demand while the second is supply. The reduced form for this model is (4)
Pt = ~ + a E t lPt + Y l w t 1 + ~lt,
where ~ = m l / m 2 , ] / = - r 2 / m 2 , and a = - r l / m 2 There is a unique REE in this model given by
~t = (vlt - 02~)/m2. Note that a < 0.
pt = a + blwt-1 + ~/t, where a=(1-a) ~/~, b = ( 1 a ) - l ) ,. Under RE, E t _ l P t = ~l + [Jwt_ 1. L u c a s a g g r e g a t e s u p p l y m o d e l . A n identical reduced form arises from the following simple macroeconomic model in the spirit o f Lucas (1973). Aggregate output is given
by

q_t = q̄ + θ(p_t - E_{t-1}^* p_t) + ζ_t,
while aggregate demand is given by the quantity theory equation

m_t + v_t = p_t + q_t,
and the money supply follows the policy rule

m_t = m̄ + ρ_0 u_t + ρ_1' w_{t-1}.
Here θ > 0 and the shocks ζ_t, v_t, u_t and w_t are assumed for simplicity to be white noise. Solving for p_t in terms of E_{t-1}^* p_t, w_{t-1} and the white noise shocks yields the reduced form (4). For this model 0 < α = θ(1 + θ)^{-1} < 1.

1.2.2. A linear model with multiple REE
Reduced form models of the form

y_t = α + β E_t^* y_{t+1} + δ y_{t-1} + κ' w_t + v_t
(5)
arise from various economic models. Here y_t is a scalar, v_t is a scalar white noise shock and w_t is an exogenous vector of observables which we will assume follows a stationary first-order VAR (vector autoregression)

w_t = ρ w_{t-1} + e_t.
(6)
Variables dated t are assumed to be in the time t information set. An example is the linear-quadratic market model described in Sections XIV.4 and XIV.6 of Sargent (1987).
458
G. W. Evans and S. Honkapohja
The standard procedure is to obtain solutions of the form

y_t = ā + b̄ y_{t-1} + c̄' w_t + d̄ v_t,
(7)
where b̄ satisfies the quadratic b̄^2 - β^{-1} b̄ + β^{-1} δ = 0. For many parameter values there will be a unique stationary solution with |b̄| < 1. However, if externalities or taxes are introduced into the model as in Section XIV.8 of Sargent (1987), then for appropriate parameter values both roots of the quadratic are real and have absolute value less than unity, so that there are two stationary solutions of the form (7). (In this case there also exist solutions that depend on sunspots.)

1.2.3. The overlapping generations model with money
The standard Overlapping Generations model with money provides an example of a model with REE cycles 5. Assume a constant population of two-period lived agents. There are equal numbers of young and old agents, and at the end of each period the old agents die and are replaced in the following period by young agents. In the simple version with production, the utility function of a representative agent born at the beginning of period t is U(c_{t+1}) - W(n_t), where c_{t+1} is consumption when old and n_t is labor supplied when young. U is assumed increasing and concave and W is assumed increasing and convex. We assume that output of the single perishable good q_t for the representative agent is given by q_t = n_t, and that there is a fixed stock of money M. The representative agent produces output in t, trades the goods for money, and then uses the money to buy output for consumption in t + 1. The agent thus chooses n_t, M_t and c_{t+1} subject to the budget constraints p_t n_t = M_t = p_{t+1} c_{t+1}, where p_t is the price of goods in year t. In equilibrium c_t = n_t, because the good is perishable, and M_t = M. The first-order condition for the household is

W'(n_t) = E_t^* ((p_t/p_{t+1}) U'(c_{t+1})).

Using the market clearing condition c_{t+1} = n_{t+1} and the relation p_t/p_{t+1} = n_{t+1}/n_t, which follows from the market clearing condition p_t n_t = M, we obtain the univariate equation n_t W'(n_t) = E_t^* (n_{t+1} U'(n_{t+1})). Since w(n) ≡ n W'(n) is an increasing function we can invert it to write the reduced form as

n_t = H(E_t^* G(n_{t+1})),
(8)
where H(·) ≡ w^{-1}(·) and G(n) = n U'(n). If one is focusing on nonstochastic solutions, the form (8) expresses the model in terms of labor supply (or equivalently

5 Overlapping generations models are surveyed, for example, in Geanakoplos and Polemarchakis (1991).
[Fig. 1. Panels (a)-(d) plot the map n_t = f(n_{t+1}), with n_{t+1} on the horizontal axis and n_t on the vertical axis: (a) f increasing, with a single interior steady state; (b) f hump-shaped, with a steady state and perfect foresight cycles; (c) two interior steady states n_L and n_H; (d) three interior steady states n_L < n_U < n_H.]
real balances) and assuming point expectations one has E_t^* G(n_{t+1}) = G(E_t^* n_{t+1}) and one can write n_t = f(E_t^* n_{t+1}) for f = H ∘ G. The model can also be expressed in terms of other economically interpretable variables, such as the inflation rate π_t = p_t/p_{t-1}. One takes the budget constraint and the household's first-order condition which, under appropriate assumptions, yield savings (real balances) as a function m_t ≡ M/p_t = S(E_t^* π_{t+1}). Then the identity m_t π_t = m_{t-1} yields π_t = S(E_{t-1}^* π_t)/S(E_t^* π_{t+1}). Depending on the utility functions U and W, the reduced form function f can have a wide range of shapes. If the substitution effect dominates everywhere then f will be increasing and there will be a single interior steady state (see Figure 1a), but if the income effect dominates over part of the range then f can be hump-shaped. (An
autarkic steady state can also exist for the model.) In consequence the OG model can have perfect foresight cycles as well as a steady state (see Figure 1b). Grandmont (1985) showed that for some choices of preferences there coexist perfect foresight cycles of every order 6. Whenever there are cycles in the OG model, there are multiple equilibria, so that the role of learning as a selection criterion becomes important. Various extensions of the OG model can give rise to multiple (interior) steady states. We briefly outline here two extensions of the OG model which lead to the possibility of multiple steady states.

Extension 1 (seignorage model). In the first extension we introduce government purchases financed by seignorage. Assuming that there is a fixed level of real government purchases g financed entirely by printing money, we have g = (M_t - M_{t-1})/p_t. The first-order condition for the household is the same as in the basic model. Using the market clearing conditions p_t n_t = M_t, p_{t+1} c_{t+1} = M_t and c_{t+1} = n_{t+1} - g, we have p_t/p_{t+1} = (n_{t+1} - g)/n_t, which yields n_t = H(E_t^* ((n_{t+1} - g) U'(n_{t+1} - g))) or n_t = f(E_t^* n_{t+1} - g) for nonstochastic equilibria 7. In the case where the substitution effect dominates and where f is an increasing concave function which goes through the origin, this model has two interior steady states provided g > 0 is not too large. See Figure 1c. It can be verified that the steady state n = n_H corresponds to higher employment and lower inflation relative to n = n_L.

Extension 2 (increasing social returns). Assume again that there is no government spending and that the money supply is constant. However, replace the simple production function q_t = n_t by the function q_t = F(n_t, N_t), where N_t denotes aggregate labor effort and represents a positive production externality. We assume F_1 > 0, F_2 > 0 and F_{11} < 0. Here N_t = K n_t, where K is the total number of agents in the economy. The first-order condition is now
W'(n_t) = E_t^* ((p_t/p_{t+1}) F_1(n_t, K n_t) U'(c_{t+1})).
Using p_t/p_{t+1} = q_{t+1}/q_t and c_{t+1} = q_{t+1} we have

W'(n_t) F(n_t, K n_t)/F_1(n_t, K n_t) = E_t^* (F(n_{t+1}, K n_{t+1}) U'(F(n_{t+1}, K n_{t+1}))).
Letting ψ(n_t) denote the left-hand-side function, it can be verified that ψ(n_t) is a strictly increasing function of n_t. Solving for n_t and assuming point expectations
6 Conditions for the existence of k-cycles are discussed in Grandmont (1985) and Guesnerie and Woodford (1992).
7 Alternatively, the model can be expressed in terms of the inflation rate in the form π_t = S(E_{t-1}^* π_t)/(S(E_t^* π_{t+1}) - g).
yields n_t = f̃(E_t^* n_{t+1}) for a suitable f̃. For appropriate specifications of the utility functions and the production functions it is possible to obtain reduced-form functions f̃ which yield three interior steady states, as in Figure 1d. Examples are given in Evans and Honkapohja (1995b). Employment levels n_L < n_U < n_H correspond to low, medium and high output levels, and the steady states n_L and n_U can be interpreted as coordination failures.

1.3. Approaches to learning
Several distinct approaches have been taken to learning in dynamic expectations models. We will broadly classify them into (i) Rational learning, (ii) Eductive approaches and (iii) Adaptive learning. The focus of this chapter is on adaptive learning, but we will provide an overview of the different approaches.

1.3.1. Rational learning
A model of rational learning, based on Bayesian updating, was developed by Townsend (1978) in the context of the cobweb model. In the simplest case, it is supposed that agents know the structure of the model up to one unknown parameter, m_1, the demand intercept. There is a continuum of firms and each firm has a prior distribution for m_1. The prior distributions of each firm are common knowledge. Townsend shows that there exist Nash equilibrium decision rules, in which the supply decision of each firm depends linearly on its own mean belief about m_1 and the mean beliefs of others. Together with the exogenous shocks, this determines aggregate supply q_t and the price level p_t in period t, and firms use time t data to update their priors. It also follows that for each agent the mean belief about m_1 converges to m_1 as t → ∞, and that the limiting equilibrium is the REE. Townsend extends this approach to consider versions in which the means of the prior beliefs of other agents are unknown, so that agents have distributions on the mean beliefs of others, as well as distributions on the mean of the market's distributions on the mean beliefs of others, etc. Under appropriate assumptions, Townsend is able to show that there exist Nash equilibrium decision rules based on these beliefs and that they converge over time to the REE. This approach is explored further in Townsend (1983). Although this general approach does exhibit a process of learning which converges to the REE, it sidesteps the issues raised above in our discussion of the role of learning. In particular, just as it was asked whether the REE could be reached by a boundedly rational learning rule, so it could be asked whether the Nash equilibrium strategies could be reached by a learning process. In fact the question of how agents could ever coordinate on these Nash equilibrium decision rules is even more acute, since they are based on ever more elaborate information sets.
The work by Evans and Ramey (1992) on expectation calculation can also be regarded as a kind of rational learning, though in their case there is not full convergence to the REE (unless calculation costs are 0). Here agents are endowed with calculation
algorithms, based on a correct structural model, which agents can use to compute improved forecasts. Agents balance the benefits of improved forecasts against the time and resource costs of calculation and are assumed to do so optimally. Formally, since their decisions are interdependent, they are assumed to follow Nash equilibrium decision rules in the number of calculations to make at each time. Because of the costs of expectation calculation, the calculation equilibrium exhibits gradual and incomplete adjustment to the REE. In a "Lucas supply-curve" or "natural rate" macroeconomic model, with a reduced form close to that of the "cobweb" model, they show how monetary nonneutrality, hysteresis and amplification effects can arise. As with Townsend's models, the question can be raised as to how agents learn the equilibrium calculation decision rules 8.

1.3.2. Eductive approaches
Some discussions of learning are "eductive" in spirit, i.e. they investigate whether the coordination of expectations on an REE can be attained by a mental process of reasoning 9. Some of the early discussions of expectational stability, based on iterations of expectation functions, had an eductive flavor, in accordance with the following argument. Consider the reduced form model (4) and suppose that initially all agents contemplate using some (nonrational) forecast rule

E_{t-1}^0 p_t = a^0 + b^{0'} w_{t-1}.
(9)
Inserting these expectations into Equation (4) we obtain the actual law of motion which would be followed under this forecast rule:

p_t = (μ + α a^0) + (α b^0 + γ)' w_{t-1} + η_t,
and the true conditional expectation under this law of motion:

E_{t-1} p_t = (μ + α a^0) + (α b^0 + γ)' w_{t-1}.
Thus if agents conjecture that other agents form expectations according to Equation (9) then it would instead be rational to form expectations according to

E_{t-1}^1 p_t = a^1 + b^{1'} w_{t-1},

where a^1 = μ + α a^0 and b^1 = γ + α b^0.
8 Evans and Ramey (1998) develop expectation calculation models in which the Nash equilibrium calculation decision rules are replaced by adaptive decision rules based on diagnostic calculations. This framework is then more like the adaptive learning category described below, but goes beyond statistical learning in two ways: (i) agents balance the costs and benefits of improved calculations, and (ii) agents employ a structural model which allows them to incorporate anticipated structural change. 9 The term "eductive" is due to Binmore (1987).
Continuing in this way, if agents conjecture that all other agents form expectations according to the rule E_{t-1}^N p_t = a^N + b^{N'} w_{t-1}, then it would be rational to instead form expectations according to

E_{t-1}^{N+1} p_t = (μ + α a^N) + (γ + α b^N)' w_{t-1}.

Letting φ^{N'} = (a^N, b^{N'}), the relationship between Nth-order expectations and (N + 1)th-order expectations is given by

φ^{N+1} = T(φ^N),   N = 1, 2, 3, ...,   (10)

where

T(φ)' = (T_a(a, b), T_b(a, b)') = (μ + α a, γ' + α b').
(11)
One might then say that the REE is "expectationally stable" if lim_{N→∞} φ^N = φ̄ = (ā, b̄')'. The interpretation is that if this stability condition is satisfied, then agents can be expected to coordinate, through a process of reasoning, on the REE 10. Clearly for the problem at hand the stability condition is |α| < 1, and if this condition is met then there is convergence globally from any initial φ^0. For the Lucas supply model example above, this condition is always satisfied. For the cobweb model, satisfaction of the stability condition depends on the relative slopes of the supply and demand curves. In fact we shall reserve the term "expectational stability" for a related concept based on the corresponding differential equation. The differential equation version gives the appropriate condition for convergence to an REE under the adaptive learning rules. To distinguish the concepts clearly we will thus refer to stability under the iterations (10) as iterative expectational stability or iterative E-stability. The concept can be and has been applied to more general models. Let φ denote a vector which parameterizes the expectation function and suppose that T(φ) gives the parameters of the true conditional expectation when all other agents follow the expectation function with parameters φ. An REE will be a fixed point φ̄ of T (and in general there may be multiple REE of this form). The REE is said to be iteratively E-stable if φ^N → φ̄ for all φ^0 in a neighborhood of φ̄.
10 Interpreting convergence of iterations of (10) as a process of learning the REE was introduced in DeCanio (1979) and was one of the learning rules considered in Bray (1982). [Section 6 of Lucas (1978) also considered convergence of such iterations.] DeCanio (1979) and Bray (1982) give an interpretation based on real time adaptive learning in which agents estimate the parameters of the forecast rule, but only alter the parameters used to make forecasts after estimates converge in probability. The eductive argument presented here is based on Evans (1983), where the term "expectational stability" was introduced. Evans (1985, 1986) used the iterative E-stability principle as a selection device in models with multiple REE. Related papers include Champsaur (1983) and Gottfries (1985).
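The iteration (10)-(11) is straightforward to simulate. The following sketch uses hypothetical parameter values (μ = 2, α = -0.5, γ = 0.8, chosen only for illustration; any |α| < 1 gives global convergence, and with a scalar w_{t-1} the map acts coordinate by coordinate) and checks that the iterates approach the REE values ā = (1 - α)^{-1} μ, b̄ = (1 - α)^{-1} γ.

```python
# Iterative E-stability in the cobweb reduced form (4):
# phi^{N+1} = T(phi^N), with T(a, b) = (mu + alpha*a, gamma + alpha*b).
# Parameter values are hypothetical; cobweb case has alpha < 0, |alpha| < 1.
mu, alpha, gamma = 2.0, -0.5, 0.8

def T(a, b):
    """One round of the eductive reasoning map (11)."""
    return mu + alpha * a, gamma + alpha * b

a, b = 10.0, -3.0                 # arbitrary initial conjecture phi^0
for _ in range(200):              # iterate phi^{N+1} = T(phi^N)
    a, b = T(a, b)

a_bar = mu / (1.0 - alpha)        # REE intercept: a = mu + alpha*a
b_bar = gamma / (1.0 - alpha)     # REE slope: b = gamma + alpha*b
print(abs(a - a_bar) < 1e-10, abs(b - b_bar) < 1e-10)   # prints: True True
```

With |α| = 0.5 the distance to the fixed point shrinks by half at every round of reasoning, so 200 iterations reduce any initial conjecture to the REE up to rounding error; with |α| > 1 the same loop would diverge.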
An apparent weakness of the argument just given is that it assumes homogeneous expectations of the agents. In fact, the eductive argument based on iterative E-stability is closely related to the concept of rationalizability used in game theory, which allows for heterogeneity of the expectations of agents. The issue of rationalizability in the cobweb model was investigated by Guesnerie (1992). In Guesnerie's terminology the REE is said to be strongly rational if for each agent the set of rationalizable strategies is unique and corresponds to the REE. Guesnerie showed that if |α| < 1 in Equation (4) then the REE is strongly rational, so that in this case the eductive arguments are indeed compelling. Guesnerie (1992) shows that the strong rationality argument can be extended to allow also for heterogeneity in the economic structure, e.g. a different supply curve for each agent, due to different cost functions. The argument can also be extended to cases with multiple REE by making the argument local. In Evans and Guesnerie (1993) the argument is extended to a multivariate setting and the relationship between strong rationality and iterative E-stability is further examined. If the model is homogeneous in structure, then (even allowing for heterogeneity in beliefs) an REE is strongly rational if and only if it meets the iterative E-stability condition. However, if heterogeneity in the structure is permitted, then iterative E-stability is a necessary but not sufficient condition for strong rationality of the REE. For an investigation of strong rationality in univariate models with expectations of future variables, see Guesnerie (1993). Guesnerie (1996) develops an application to Keynesian coordination problems.

1.3.3. Adaptive approaches
We come now to adaptive approaches to learning, which have been extensively investigated over the last 15 years. In principle, there is a very wide range of adaptive formulations which are possible. As Sargent (1993) has emphasized, in replacing agents who are fully "rational" (i.e. have "rational expectations") with agents who possess bounded rationality, there are many ways to implement such a concept 11. One possibility is to extend the adaptive expectations idea by considering generalized expectation functions, mapping past observations of a variable into forecasts of future values of that variable, where the expectation function is required to satisfy certain reasonable axioms (including bounded memory in the sense of a fixed number of past observations). This approach was taken, in the context of nonstochastic models, in the early work by Fuchs (1979) and Fuchs and Laroque (1976), and the work was extended by Grandmont (1985) and Grandmont and Laroque (1986). Under appropriate assumptions it can be shown that the resulting dynamic systems can converge to perfect
11 Sargent (1993) provides a wide-ranging overview of adaptive learning. See Honkapohja (1996) for a discussion of Sargent's book. Adaptive learning is also reviewed in Evans and Honkapohja (1995a) and Marimon (1997). Marcet and Sargent (1988) and Honkapohja (1993) provide concise introductions to the subject.
foresight steady states or cycles. Using a generalization of adaptive expectations, the conditions under which learning could converge to perfect foresight cycles were also investigated by Guesnerie and Woodford (1991). A second approach is to regard agents as statisticians or econometricians who estimate forecasting models using standard statistical procedures and who employ these techniques to form expectations of the required variables. This line of research has naturally focussed on stochastic models, though it can also be applied to nonstochastic models. Perhaps the greatest concentration of research on learning in macroeconomics has been in this area, and this literature includes, for example, Bray (1982), Bray and Savin (1986), Fourgeaud, Gourieroux and Pradel (1986), Marcet and Sargent (1989c), and Evans and Honkapohja (1994b,c, 1995c). A third possibility is to draw on the computational intelligence 12 literature. Agents are modeled as artificial systems which respond to inputs and which adapt and learn over time. Particular models include classifier systems, neural networks and genetic algorithms. An example of such an approach is Arifovic (1994). Cho and Sargent (1996b) review the use of neural networks, and the range of possibilities is surveyed in Sargent (1993) 13. We discuss these approaches in the final section of this paper. Finally, we remark that not all approaches fall neatly into one of the classes we have delineated. For example, Nyarko (1997) provides a framework which is both eductive and adaptive. Agents have hierarchies of beliefs and actions are consistent with Bayesian updating. For a class of models which includes the cobweb model, conditions are given for convergence to the Nash equilibrium of the true model. The focus of this survey is on adaptive learning and the main emphasis is on statistical or econometric learning rules for stochastic models. We now illustrate this approach in the context of the economic examples above.

1.4. Examples of statistical learning rules

1.4.1. Least squares learning in the Muth model
Least squares learning in the context of the Muth (or cobweb) model was first analyzed by Bray and Savin (1986) and Fourgeaud, Gourieroux and Pradel (1986). They ask whether the REE in that model is learnable in the following sense. Suppose that firms believe prices follow the process

p_t = a + b' w_{t-1} + η_t,
(12)
corresponding to the unique REE, but that a and b are unknown to them. Suppose that firms act like econometricians and estimate a and b by running least squares
12 This term is now more common than the equivalent term "artificial intelligence". 13 Spear (1989) takes yet another viewpoint and looks at bounded rationality and learning in terms of computational constraints.
regressions of p_t on w_{t-1} and an intercept using data {p_i, w_i}_{i=0}^{t-1}. Letting (a_{t-1}, b_{t-1}) denote their estimates at t - 1, their forecasts are given by
E_{t-1}^* p_t = a_{t-1} + b_{t-1}' w_{t-1}.
(13)
The values for (a_{t-1}, b_{t-1}) are given by the standard least-squares formula

(a_{t-1}, b_{t-1}')' = (Σ_{i=1}^{t-1} z_{i-1} z_{i-1}')^{-1} (Σ_{i=1}^{t-1} z_{i-1} p_i),   (14)

where z_t' = (1, w_t').
Equations (4), (13) and (14) form a fully specified dynamic system, and we can ask: will (a_t, b_t')' → (ā, b̄')' as t → ∞? The above papers showed that if α < 1 then convergence occurs with probability 1. It is notable that this stability condition is weaker than the condition |α| < 1 obtained under the eductive arguments. Since α < 0 always holds in the Muth model (provided only that supply and demand curves have their usual slopes), it follows that least squares learning always converges with probability one in the Muth model. The stability condition can be readily interpreted using the expectational stability condition, formulated as follows. As earlier, we consider the mapping (11) from the perceived law of motion (PLM), parameterized by φ' = (a, b'), to the implied actual law of motion (ALM) which would be followed by the price process if agents held those fixed perceptions and used them to form expectations. Consider the differential equation

dφ/dτ = T(φ) - φ,

where τ denotes "notional" or "artificial" time. We say that the REE φ̄ = (ā, b̄')' is expectationally stable or E-stable if φ̄ is locally asymptotically stable under this differential equation. Intuitively, E-stability determines stability under a stylized learning rule in which the PLM parameters (a, b) are slowly adjusted in the direction of the implied ALM parameters. It is easily verified that for the Muth model the E-stability condition is simply α < 1, the same as the condition for stability under least-squares learning. The formal explanation for the reason why E-stability provides the correct stability condition is based upon the theory of stochastic approximation and will be given in later sections. It should be emphasized that in their role as econometricians the agents treat the parameters of Equation (12) as constant over time. This is correct asymptotically, provided the system converges.
However, during the transition the parameters of the ALM vary over time because of the self-referential feature of the model. Bray and Savin (1986) consider whether an econometrician would be able to detect the transitional misspecification and find that in some cases it is unlikely to be spotted 14.

14 Bullard (1992) considers some recursive learning schemes with time-varying parameters. However, the specification does not allow the variation to die out asymptotically.
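The dynamic system (4), (13), (14) can also be simulated directly. The following sketch uses hypothetical parameter values (μ = 2, α = -0.5, γ = 0.8, a scalar w_{t-1}, and a small white noise η_t); the 2×2 OLS normal equations are solved by hand so that the sketch is self-contained, and the estimates approach ā = (1 - α)^{-1} μ and b̄ = (1 - α)^{-1} γ as the convergence result predicts.

```python
import random

# Least squares learning in the Muth model:
#   p_t = mu + alpha * E*_{t-1} p_t + gamma * w_{t-1} + eta_t,
# with forecasts E*_{t-1} p_t = a_{t-1} + b_{t-1} * w_{t-1} from an OLS
# regression of p on (1, w). Parameter values are hypothetical (alpha < 0).
random.seed(0)
mu, alpha, gamma, sigma_eta = 2.0, -0.5, 0.8, 0.1
a, b = 0.0, 0.0                       # initial estimates (a_0, b_0)
S11 = S1w = Sww = S1p = Swp = 0.0     # running sums for the normal equations
w_prev = random.gauss(0.0, 1.0)
for t in range(1, 20001):
    forecast = a + b * w_prev                       # Eq. (13), data through t-1
    p = mu + alpha * forecast + gamma * w_prev + random.gauss(0.0, sigma_eta)
    # update normal equations with regressor z = (1, w_prev), regressand p
    S11 += 1.0; S1w += w_prev; Sww += w_prev * w_prev
    S1p += p;   Swp += w_prev * p
    det = S11 * Sww - S1w * S1w
    if det > 1e-12:                                 # solve the 2x2 system by hand
        a = (Sww * S1p - S1w * Swp) / det
        b = (S11 * Swp - S1w * S1p) / det
    w_prev = random.gauss(0.0, 1.0)

a_bar, b_bar = mu / (1 - alpha), gamma / (1 - alpha)   # REE values
print(round(a, 2), round(b, 2))       # estimates near (1.33, 0.53)
```

Note the timing: the forecast for period t is computed before the period-t observation is added to the regression, which is the "standard" timing assumption used throughout this section.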
Consider now the model (5) and suppose that agents have a PLM (perceived law of motion) of the form

y_t = a + b y_{t-1} + c' w_t + η_t,   (15)
and that they estimate the parameters a, b, and c by a least squares regression of y_t on y_{t-1}, w_t and an intercept. Letting

φ_t' = (a_t, b_t, c_t'),   z_t' = (1, y_{t-1}, w_t'),

the estimated coefficients are given by

φ_t = (Σ_{i=0}^{t-1} z_i z_i')^{-1} (Σ_{i=0}^{t-1} z_i y_i),   (16)

and expectations are given by E_t^* y_{t+1} = a_t + b_t y_t + c_t' ρ w_t, where for convenience we are assuming that ρ is known. For simplicity, estimates φ_t are based only on data through t - 1. Substituting this expression for E_t^* y_{t+1} into Equation (5) and solving for y_t yields the ALM (actual law of motion) followed by y_t under least squares learning. This can be written in terms of the T-map from the PLM to the ALM:

y_t = T(φ_t)' z_t + W(φ_t) v_t,   (17)

where

T(φ)' = ((1 - βb)^{-1}(α + βa), (1 - βb)^{-1} δ, (1 - βb)^{-1}(κ + βρ'c)'),   (18)

and

W(φ) = (1 - βb)^{-1}.   (19)
Note that fixed points φ̄ = (ā, b̄, c̄')' of T(φ) correspond to REE. The analysis in this and more general models involving least squares learning is facilitated by recasting Equation (16) in recursive form. It is well known, and can easily be verified by substitution, that the least squares formula (16) satisfies the recursion

φ_t = φ_{t-1} + γ_t R_t^{-1} z_{t-1} (y_{t-1} - φ_{t-1}' z_{t-1}),
R_t = R_{t-1} + γ_t (z_{t-1} z_{t-1}' - R_{t-1}),
(20)
for γ_t = 1/t and suitable initial conditions 15. Using this RLS (Recursive Least Squares) set-up also allows us to consider more general "gain sequences" γ_t. The dynamic system to be studied under RLS learning is thus defined by Equations (17)-(20).
15 R_t is an estimate of the moment matrix for z_t. For suitable initial conditions R_t = t^{-1} Σ_{i=0}^{t-1} z_i z_i'.
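The claim that recursion (20) with γ_t = 1/t reproduces the batch least-squares formula can be checked numerically. The sketch below uses a generic two-regressor problem with made-up data (the data-generating coefficients 0.5 and 1.5 are arbitrary); initial conditions for φ and R are computed from a short initial sample, as the footnote requires, after which the recursion tracks batch OLS exactly.

```python
import random

# Recursive least squares (RLS) with gain 1/t reproduces batch OLS exactly,
# given initial conditions (phi, R) computed from an initial block of data.
random.seed(1)
N = 200
Z = [(1.0, random.gauss(0.0, 1.0)) for _ in range(N)]            # z_i' = (1, x_i)
Y = [0.5 * z[0] + 1.5 * z[1] + random.gauss(0.0, 0.2) for z in Z]

def solve2(m, v):
    """Solve a 2x2 linear system m * x = v."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det)

def batch(n):
    """Batch OLS, formula (16), using data i = 0, ..., n-1."""
    m = [[sum(Z[i][r] * Z[i][c] for i in range(n)) for c in (0, 1)] for r in (0, 1)]
    v = [sum(Z[i][r] * Y[i] for i in range(n)) for r in (0, 1)]
    return solve2(m, v)

t0 = 3                                 # initialize from the first t0 observations
phi = list(batch(t0))
R = [[sum(Z[i][r] * Z[i][c] for i in range(t0)) / t0 for c in (0, 1)] for r in (0, 1)]

# recursion (20): update R first, then phi, with gain 1/t
for t in range(t0 + 1, N + 1):
    z, y = Z[t - 1], Y[t - 1]
    for r in (0, 1):
        for c in (0, 1):
            R[r][c] += (z[r] * z[c] - R[r][c]) / t
    err = y - phi[0] * z[0] - phi[1] * z[1]
    upd = solve2(R, (z[0] * err, z[1] * err))     # R_t^{-1} z_{t-1} * error
    phi[0] += upd[0] / t
    phi[1] += upd[1] / t

print(max(abs(p - q) for p, q in zip(phi, batch(N))))   # agreement to rounding error
```

The order of the updates matters: R must be brought up to R_t before the φ update, since (20) uses R_t (not R_{t-1}) in the coefficient adjustment.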
Marcet and Sargent (1989c) showed that such dynamic systems fit into the framework of stochastic recursive algorithms which can be analyzed using the stochastic approximation approach. This technique, which is described in the next section, associates with the system an ordinary differential equation (ODE) which controls the motion of the system. In particular, only asymptotically stable zeros φ̄ of the differential equation are possible limit points of the stochastic dynamic system, such that φ_t → φ̄. In the case at hand the ODE is

dφ/dτ = R^{-1} M_z(φ) (T(φ) - φ),
dR/dτ = M_z(φ) - R,

where

M_z(φ) = lim_{t→∞} E[z_t(φ) z_t(φ)']
for z_t(φ)' = (1, y_{t-1}(φ), w_t') and y_t(φ) = T(φ)' z_t(φ) + W(φ) v_t. Here T(φ) is given by Equation (18). Furthermore, as Marcet and Sargent (1989c) point out, local stability of the ODE is governed by dφ/dτ = T(φ) - φ. It thus follows that E-stability governs convergence of RLS learning to an REE of the form (7). For the model at hand it can be verified that if there are two stationary REE of the form (7), then only one of them is E-stable, so that only one of them is a possible limit point of RLS learning. This is an example of how RLS learning can operate as a selection criterion when there are multiple REE 16.
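This selection can be illustrated numerically. The sketch below uses hypothetical parameter values α = 1, β = -0.8, δ = -0.3 and, purely for simplicity, drops the exogenous w_t so that the PLM is y_t = a + b y_{t-1}; the quadratic βb̄^2 - b̄ + δ = 0 then has the two stationary roots b̄ = -0.5 and b̄ = -0.75, and Euler iteration of the E-stability ODE dφ/dτ = T(φ) - φ settles only on the first.

```python
# E-stability as a selection criterion in model (5) without exogenous shocks:
# PLM y_t = a + b*y_{t-1};  T(a, b) = ((alpha + beta*a)/(1 - beta*b),
# delta/(1 - beta*b)). Hypothetical parameters chosen so that the quadratic
# beta*b^2 - b + delta = 0 has two roots inside the unit circle.
alpha, beta, delta = 1.0, -0.8, -0.3

def T(a, b):
    return (alpha + beta * a) / (1.0 - beta * b), delta / (1.0 - beta * b)

def dTb(b):
    """Derivative of the b-component of the T-map."""
    return delta * beta / (1.0 - beta * b) ** 2

roots = (-0.5, -0.75)                  # roots of beta*b^2 - b + delta = 0
for b_bar in roots:
    assert abs(beta * b_bar ** 2 - b_bar + delta) < 1e-12

print(dTb(-0.5), dTb(-0.75))           # E-stability requires dTb < 1 at the root

# Euler iteration of d(phi)/d(tau) = T(phi) - phi, started near b = -0.75:
# the path escapes the E-unstable root and settles on the E-stable b = -0.5.
a, b = 0.8333, -0.74
h = 0.05                               # step size in notional time tau
for _ in range(4000):
    Ta, Tb = T(a, b)
    a, b = a + h * (Ta - a), b + h * (Tb - b)
print(round(b, 3))                     # -> -0.5
```

Because the b-component of the T-map depends only on b, its derivative at each root decides the issue: it is 2/3 at b̄ = -0.5 and 3/2 at b̄ = -0.75, so only the first root is E-stable and hence a possible limit of RLS learning.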
1.4.3. Learning a steady state

We now consider adaptive learning of a steady state in nonstochastic nonlinear models of the form (3):
y_t = f(E_t^* y_{t+1}).

The basic OG model and the extensions mentioned above fit this framework. One natural adaptive learning rule is to forecast y_{t+1} as the average of past observed values, E_t^* y_{t+1} = t^{-1} Σ_{i=0}^{t-1} y_i for t = 1, 2, 3, .... Since the model is nonstochastic, the traditional adaptive expectations formula E_t^* y_{t+1} = E_{t-1}^* y_t + γ(y_{t-1} - E_{t-1}^* y_t) for fixed 0 < γ ≤ 1 also has the potential to converge to a perfect foresight steady state.
16 Marcet and Sargent (1989c) focused on set-ups with a unique REE. Evans and Honkapohja (1994b) showed how to use this framework in linear models with multiple REE and established the connection between RLS learning and E-stability in such models. In these papers convergence with probability 1 is shown when a "Projection Facility" is employed. Positive convergence results when learning does not incorporate a projection facility are given in Evans and Honkapohja (1998b), which also gives details for this example. See also Section 2.4.2 for discussion.
Both of these cases are covered by the following recursive formulation, in which for convenience we use φ_t to denote the forecast at time t of y_{t+1}: E_t^* y_{t+1} = φ_t, where

φ_t = φ_{t-1} + γ_t (y_{t-1} - φ_{t-1}),

and where the gain sequence γ_t satisfies

0 < γ_t ≤ 1 and Σ_{t=1}^{∞} γ_t = +∞.
The choice γ_t = t^{-1} gives the adaptive rule in which the forecast of y_{t+1} is the simple average of past values. The "fixed-gain" choice γ_t = γ, for 0 < γ ≤ 1, corresponds to adaptive expectations. For completeness we will give the adaptive learning results both for the "fixed-gain" case and for the "decreasing-gain" case in which

lim_{t→∞} γ_t = 0,
which is obviously satisfied by γ_t = t^{-1}. In specifying the learning framework as above, we have followed what we will call the "standard" timing assumption, made in the previous subsection, that the parameter estimate φ_t depends only on data through t - 1. This has the advantage of avoiding simultaneity between y_t and φ_t. However, it is also worth exploring here the implications of the "alternative" assumption in which φ_t = φ_{t-1} + γ_t (y_t - φ_{t-1}). We will see that the choice of assumptions can matter in the fixed-gain case, but is not important if γ_t = t^{-1} or if γ_t = γ > 0 is sufficiently small.

Standard timing assumption. Combining equations, and noting that y_t = f(φ_t), the system under adaptive learning follows the nonlinear difference equation

φ_t = φ_{t-1} + γ_t (f(φ_{t-1}) - φ_{t-1}).
(21)
Note that in a perfect foresight steady state, y_t = φ_t = ȳ, where ȳ = f(ȳ). In the constant-gain case we have φ_t = (1 - γ) φ_{t-1} + γ f(φ_{t-1}) and it is easily established that a steady state φ̄ = ȳ is locally stable under Equation (21) if and only if |1 + γ(f'(φ̄) - 1)| < 1, i.e. iff 1 - 2/γ < f'(φ̄) < 1. Note that 1 - 2/γ → -∞ as γ → 0 17. Under the decreasing-gain assumption lim_{t→∞} γ_t = 0, it can be shown that a steady state φ̄ = ȳ is locally stable under (21) if and only if f'(ȳ) < 1. Thus the stability condition under decreasing gain corresponds to the small-gain limit of the condition for constant gain.
17 Guesnerie and Woodford (1991) show how to generalize this condition for equilibrium k-cycles.
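The stability condition can be seen in a simulation. The sketch below is a minimal illustration with a hypothetical reduced-form map shaped as in Figure 1c, f(φ) = (φ - g)^{1/3} with g = 0.1 (increasing, concave, through the origin in φ - g), which has a low interior steady state with f' > 1 and a high one with f' < 1; the decreasing-gain recursion (21) is run from two starting points.

```python
# Decreasing-gain learning of a steady state, Eq. (21):
#   phi_t = phi_{t-1} + (1/t) * (f(phi_{t-1}) - phi_{t-1}).
# Hypothetical seignorage-style map with two interior steady states (Fig. 1c).
g = 0.1

def f(phi):
    return (phi - g) ** (1.0 / 3.0) if phi > g else 0.0

def bisect(lo, hi):
    """Locate a steady state f(n) = n by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (f(lo) - lo) * (f(mid) - mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

n_L = bisect(0.1001, 0.2)   # low steady state: f' > 1, unstable
n_H = bisect(0.5, 2.0)      # high steady state: f' < 1, stable

def learn(phi, steps=100000):
    """Run recursion (21) with gain 1/t from initial forecast phi."""
    for t in range(2, steps):
        phi = phi + (f(phi) - phi) / t
        if phi <= g:        # forecast leaves the domain: real balances collapse
            return phi
    return phi

print(abs(learn(0.5) - n_H) < 0.02)   # converges to the high steady state
print(learn(0.1005) <= g)             # a path starting below n_L collapses
```

Both printed checks come out True: any start between the two steady states is drawn to n_H, while a start just below n_L moves away from it, in line with the condition f'(φ̄) < 1.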
Alternative timing assumption. Now instead we have the implicit equation

φ_t = φ_{t-1} + γ_t (f(φ_t) - φ_{t-1}).
In general there need not be a unique solution for φ_t given φ_{t-1}, though this will be assured if γ_t is sufficiently small and if f'(φ) is bounded. Assuming uniqueness and focusing on local stability near φ̄ we can approximate this equation by φ_t - φ̄ = (1 - γ_t f'(φ̄))^{-1} (1 - γ_t)(φ_{t-1} - φ̄). Under the fixed-gain assumption this leads to the stability condition that either f'(φ̄) < 1 or f'(φ̄) > 2/γ - 1 (these possibilities correspond to the cases γ f'(φ̄) < 1 and γ f'(φ̄) > 1). Under decreasing gain the condition is again simply f'(φ̄) < 1 (which again is also the small-gain limit for the constant-gain case) 18.

Summary under small gain. Thus for the small-gain case, i.e. assuming either decreasing gain or a sufficiently small constant gain, the condition for local stability under adaptive learning is not affected by the timing assumption and is simply f'(φ̄) < 1. Returning to our various examples, it follows that the steady states in Figures 1a and 1b are stable under adaptive learning. In Figure 1c, the high-output, low-inflation steady state is locally stable, while the low-output, high-inflation steady state is locally unstable. Finally, in Figure 1d the high- and low-output steady states n_L and n_H are locally stable, while n_U is locally unstable. As is clear from the above discussion, in the case of sufficiently large constant gains the stability condition is more complex and can depend sensitively on timing assumptions. [Lettau and Van Zandt (1995) analyze the possibilities in detail for some frameworks.] Our treatment concentrates on the decreasing-gain case in large part because in stochastic models, such as the linear models discussed above, decreasing gain is required to have the possibility of convergence to an REE. This also holds if intrinsic noise is introduced into the nonlinear models of this section, e.g. changing model (2) to y_t = E_t^* f(y_{t+1}) + v_t.
Even if v_t is iid with arbitrarily small support, the above learning rules with constant gain cannot converge to an REE, while with decreasing gain and appropriate assumptions we still obtain (local) convergence to the (noisy) steady state if the stability condition is met 19. Finally, we remark that the key stability condition, f'(φ̄) < 1 for stability of a steady state under adaptive learning with small gain, corresponds to the E-stability condition. In this case the PLM is taken to be simply y_t = φ for an arbitrary φ. Under this PLM the appropriate forecast E*_t f(y_{t+1}) is f(φ) and the implied ALM is y_t = f(φ).
18 The formal results for the decreasing-gain case can be established using the results of Section 2. Alternatively, for a direct argument see Evans and Honkapohja (1995b).
19 However, we do think the study of constant-gain learning is important also for stochastic models. For example, Evans and Honkapohja (1993a) explore its possible value if either (i) the model is misspecified or (ii) other agents use constant-gain learning.
Ch. 7: Learning Dynamics
The T-map from the PLM to the ALM is just f(φ), so the E-stability differential equation is dφ/dτ = f(φ) − φ, giving the stability condition f'(φ̄) < 1. Stability under adaptive learning and E-stability for cycles and sunspots in nonlinear models are reviewed in Section 4.

1.4.4. The seignorage model of inflation
The preceding analysis of learning a steady state is now illustrated by a summary discussion of the implications of adaptive learning for the seignorage model of inflation. This model is chosen because of its prominence in the macroeconomic literature. The seignorage model is discussed, for example, in Bruno (1989) and in Blanchard and Fischer (1989), pp. 195–201. In this model there is a fixed level of government expenditures g financed by seignorage, i.e. g = (M_t − M_{t−1})/p_t or g = M_t/p_t − (p_{t−1}/p_t)(M_{t−1}/p_{t−1}). The demand for real balances is given by M_t/p_t = S(E*_t p_{t+1}/p_t) and it is assumed that S' < 0. The model can be solved for inflation π_t = p_t/p_{t−1} as a function of E*_t π_{t+1} ≡ E*_t p_{t+1}/p_t and E*_{t−1} π_t, or equivalently (in the nonstochastic case) can be written in terms of M_t/p_t and E*_t(M_{t+1}/p_{t+1}). For the Overlapping Generations version of this model, given as Extension 1 of Section 1.2.3, n_t = M_t/p_t and the model was written as n_t = f(E*_t n_{t+1}). In order to apply directly the previous section's analysis of adaptive learning, we initially adopt this formulation. The central economic point, illustrated in Figure 1c, is that for many specifications there can be two steady states: a high real balances (low-inflation) steady state n_t = n_H, π_t = π_1 satisfying 0 < f'(n_H) < 1, and a low real balances (high-inflation) steady state n_t = n_L, π_t = π_2 satisfying f'(n_L) > 1. In the economic literature [e.g. Bruno (1989)] the possibility has been raised that convergence to n_L provides an explanation of hyperinflation. The analysis of the previous section shows that this is unlikely unless the gain parameter is large (and the alternative timing assumption is used). In the small-gain case, the low-inflation/high real balances steady state n_H is locally stable under learning and the high-inflation/low real balances steady state n_L is locally unstable, in line with E-stability. Suppose instead that the model is formulated in terms of inflation rates.
In this case the reduced form is

π_t = S(E*_{t−1} π_t) / (S(E*_t π_{t+1}) − g).
In the usual cases considered in the literature we now have that h(π) ≡ S(π)/(S(π) − g) is increasing and convex (for π not too large), and of course for small deficits g we have the two steady-state inflation rates as fixed points of h(π). The low-inflation (high real balances) steady state π_1 then satisfies 0 < h'(π_1) < 1 and the high-inflation (low real balances) steady state π_2 satisfies h'(π_2) > 1. E-stability for the PLM π_t = φ is determined by dφ/dτ = h(φ) − φ, so that again the low-inflation steady state is E-stable and the high-inflation steady state is not.
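As an illustration, one can locate the two fixed points of h numerically and verify the derivative conditions. The linear demand schedule S(π) = a − bπ and the parameter values below are hypothetical, chosen only so that two steady states exist; they do not come from the chapter.

```python
def S(pi, a=2.0, b=0.5):
    """Hypothetical linear real-balance demand S(pi) = a - b*pi, with S' < 0."""
    return a - b * pi

g = 0.2   # seignorage-financed deficit (illustrative value)

def h(pi):
    """Steady-state map h(pi) = S(pi) / (S(pi) - g)."""
    return S(pi) / (S(pi) - g)

def fixed_point(lo, hi, tol=1e-12):
    """Bisection on h(pi) - pi over an interval bracketing exactly one root."""
    f = lambda p: h(p) - p
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def hprime(pi, eps=1e-6):
    """Central-difference approximation to h'(pi)."""
    return (h(pi + eps) - h(pi - eps)) / (2 * eps)

pi_low = fixed_point(1.0, 2.0)    # low-inflation steady state pi_1
pi_high = fixed_point(3.0, 3.5)   # high-inflation steady state pi_2
# E-stability of dphi/dtau = h(phi) - phi: stable iff h'(pi) < 1
assert hprime(pi_low) < 1 < hprime(pi_high)
```

For these parameters the fixed points are roughly π_1 ≈ 1.16 and π_2 ≈ 3.44, with h'(π_1) < 1 < h'(π_2), so only the low-inflation steady state is E-stable, in line with the text.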
Under adaptive learning of the inflation rate we assume that E*_t π_{t+1} = φ_t, with either the standard timing φ_t = φ_{t−1} + γ_t(π_{t−1} − φ_{t−1}) or the alternative timing assumption φ_t = φ_{t−1} + γ_t(π_t − φ_{t−1}). This set-up has been examined in detail by Lettau and Van Zandt (1995). They find that in the constant-gain case, for some values of γ, the high-inflation steady state can be stable under learning under the alternative timing assumption. However, their analysis confirms that with small constant gain or decreasing gain the low-inflation steady state is always locally stable under adaptive learning and the high-inflation steady state is always locally unstable under adaptive learning. To conclude the discussion we make two further points. First, in some papers the learning is formulated in terms of price levels, rather than real balances or inflation rates, using least squares regressions of prices on lagged prices. Such a formulation can be problematic since under systematic inflation the price level is a nonstationary variable 20. Second, the seignorage model has been the subject of experimental studies, see Marimon and Sunder (1993) and Arifovic (1995). Their results suggest convergence to the low-inflation, high-output steady state. Such results accord with the predictions of decreasing or small-constant-gain learning.

1.5. Adaptive learning and the E-stability principle
We have seen that when agents use statistical or econometric learning rules (with decreasing gain), convergence is governed by the corresponding E-stability conditions. This principle, which we will treat as a unifying principle throughout this paper, can be stated more generally. Consider any economic model and its REE solutions. Suppose that a particular solution can be described as a stochastic process with a particular parameter vector φ̄ (e.g. the parameters of an autoregressive process or the mean values over a k-cycle). Under adaptive learning our agents do not know φ̄ but estimate it from data using a statistical procedure such as least squares. This leads to estimates φ_t at time t and the question is whether φ_t → φ̄ as t → ∞. For a wide range of economic examples and learning rules we will find that convergence is governed by the corresponding E-stability condition, i.e. by local asymptotic stability of φ̄ under the differential equation

dφ/dτ = T(φ) − φ,   (22)
where T is the mapping from the PLM φ to the implied ALM T(φ). The definition of E-stability based on the differential equation (22) is the formulation used in Evans (1989) and Evans and Honkapohja (1992, 1995a). This requirement of E-stability is less strict than the requirement of iterative E-stability based on
20 See also the discussion in Section 5.1.4.
Equation (10) 21. As became evident from the results of Marcet and Sargent (1989c), it is the differential equation formulation (22) which governs convergence of econometric learning algorithms. This form of E-stability has been systematically employed as a selection rule with multiple REE in linear models by Evans and Honkapohja (1992, 1994b) and Duffy (1994), and in nonlinear models by Evans (1989), Marcet and Sargent (1989a), and Evans and Honkapohja (1994c, 1995b,c). Of course, there may be alternative ways to parameterize a solution, and this may affect stability under learning. In particular, agents may use perceived laws of motion that have more parameters than the REE of interest, i.e. overparameterization of the REE may arise. This leads to a distinction between weak vs. strong E-stability. An REE is said to be weakly E-stable if it is E-stable as above, with the perceived law of motion taking the same form as the REE. Correspondingly, we say that an REE is strongly E-stable if it is also locally E-stable for a specified class of overparameterized perceived laws of motion. (The additional parameters then converge to zero.) 22 We remark that, since it may be possible to overparameterize solutions in different ways, strong E-stability must always be defined relative to a specified class of PLMs 23. Finally, as a caveat it should be pointed out that, although the bulk of work suggests the validity of the E-stability principle, there is no fully general result which underpins our assertion. It is clear from the preceding section that the validity of the principle may require restricting attention to the "small-gain" case (gain decreasing to zero or, if no intrinsic noise is present, a sufficiently small constant gain). Another assumption that will surely be needed is that the information variables, on which the estimators are based, remain bounded. To date only a small set of statistical estimators has been examined.
We believe that obtaining precise general conditions under which the E-stability principle holds is a key subject for future work.
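In practice, E-stability of a candidate solution can be checked by integrating Equation (22) numerically. The sketch below does this for a hypothetical linear T-map T(φ) = Aφ + c (the matrix and intercept are our own illustrative choices), for which E-stability amounts to the eigenvalues of A having real parts less than one.

```python
# Hypothetical linear PLM-to-ALM mapping T(phi) = A*phi + c.
# E-stability requires the eigenvalues of A (here about 0.57 and 0.23)
# to have real parts less than one.
A = [[0.5, 0.2],
     [0.1, 0.3]]
c = [1.0, 2.0]

def T(phi):
    """Apply the T-map to a 2-vector of PLM parameters."""
    return [sum(A[i][j] * phi[j] for j in range(2)) + c[i] for i in range(2)]

# Euler integration of the E-stability ODE dphi/dtau = T(phi) - phi
phi = [0.0, 0.0]
step = 0.1
for _ in range(2000):
    Tphi = T(phi)
    phi = [phi[i] + step * (Tphi[i] - phi[i]) for i in range(2)]
# phi now approximates the REE fixed point (I - A)^{-1} c = (10/3, 10/3)
```

Because both eigenvalues of A are below one, the trajectory converges to the fixed point from any nearby starting value; an unstable REE would instead repel the trajectory.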
1.6. Discussion of the literature

In the early literature the market model of Muth (1961), the overlapping generations model and some linear models were the most frequently used frameworks to analyze learning dynamics. Thorough treatments of learning dynamics in the Muth model were given by Bray and Savin (1986) and Fourgeaud, Gourieroux and Pradel (1986). Interestingly, without mentioning rational expectations, Carlson (1968) proposed that price expectations be formed as the mean of observed past prices in his study of the linear
21 There is a simple connection between E-stability based on Equation (22) and the stricter requirement of iterative E-stability. An REE φ̄ is E-stable if and only if all eigenvalues of the derivative map DT(φ̄) have real parts less than one. For iterative E-stability the requirement is that all eigenvalues of DT(φ̄) lie inside the unit circle.
22 Early applications of the distinction between weak and strong stability, introduced for iterative E-stability in Evans (1985), include Evans and Honkapohja (1992), Evans (1989) and Woodford (1990).
23 In an analogous way, E-stability can also be used to analyze non-REE solutions which are underparameterized. See Section 5.3 below.
non-stochastic cobweb (or Muth) model. Auster (1971) extends the convergence result to the corresponding nonlinear setup. Lucas (1986) is an early analysis of the stability of steady states in an OG model. Grandmont (1985) considers the existence of deterministic cycles for the basic OG model. He also examines learning using generalizations of adaptive expectations to finite-memory nonlinear forecast functions. Guesnerie and Woodford (1991) propose a generalization of adaptive expectations allowing possible convergence to deterministic cycles. Convergence of learning to sunspot equilibria in the basic OG model was first discovered by Woodford (1990). Linear models more general than the Muth model were also considered under learning in the early literature. Marcet and Sargent (1989c) proposed a general stochastic framework and technique for the analysis of adaptive learning. This technique, studied e.g. in Ljung (1977), is known as recursive stochastic algorithms or stochastic approximation. (Section 2 discusses this methodology.) Their paper includes several applications to well-known models. Margaritis (1987) applied Ljung's method to the model of Bray (1982). Grandmont and Laroque (1991) examined learning in a deterministic linear model with a lagged endogenous variable for classes of finite-memory rules. Evans and Honkapohja (1994b) considered extensions of adaptive learning to stochastic linear models with multiple equilibria. Other early studies of learning include Taylor (1975), who examines learning and monetary policy in a natural rate model, the analysis of learning in a model of the asset market by Bray (1982), and the study by Blume and Easley (1982) of convergence of learning in dynamic exchange economies. Bray, Blume and Easley (1982) provide a detailed discussion of the early literature. The collection Frydman and Phelps (1983) contains several other early papers on learning.
Since the focus of this survey is on adaptive learning in stochastic models we will not comment here on the more recent work in this approach. The comments below provide references to approaches and literature that will not be covered in detail in later sections. For Bayesian learning the first papers include Turnovsky (1969), Townsend (1978, 1983), and McLennan (1984). Bray and Kreps (1987) discuss rational learning and compare it to adaptive approaches. Nyarko (1991) shows in a monopoly model that Bayesian learning may fail to converge if the true parameters are outside the set of possible prior beliefs. Recent papers studying the implications of Bayesian learning include Feldman (1987a,b), Vives (1993), Jun and Vives (1996), Bertocchi and Yong (1996) and the earlier mentioned paper by Nyarko (1997). A related approach is the notion of rational beliefs introduced by Kurz (1989, 1994a,b). The collection Kurz (1997) contains many central papers in this last topic. The study of finite-memory learning rules in nonstochastic models was initiated in Fuchs (1977, 1979), Fuchs and Laroque (1976), and Tillmann (1983) and it was extended in Grandmont (1985) and Grandmont and Laroque (1986). These models can be viewed as a generalization of adaptive expectations. A disadvantage is that the finite-memory learning rules cannot converge to an REE in stochastic models, cf.
e.g. Evans and Honkapohja (1995c). Further references on expectation formation and learning in nonstochastic models are Grandmont and Laroque (1990, 1991), Guesnerie and Woodford (1991), Moore (1993), Böhm and Wenzelburger (1995), and Chatterji and Chattopadhyay (1997). Learning in games has been the subject of extensive work in recent years. A small sample of papers is Milgrom and Roberts (1990, 1991), Friedman (1991), Fudenberg and Kreps (1993, 1995), Kandori, Mailath and Rob (1993), and Crawford (1995). Recent surveys are given in Marimon and McGrattan (1995), Marimon (1997), and Fudenberg and Levine (1998). Kirman (1995) reviews the closely related literature on learning in oligopoly models. Another related recent topic is social learning, see e.g. Ellison and Fudenberg (1995) and Gale (1996).
2. General methodology: recursive stochastic algorithms

2.1. General setup and assumptions

In the first papers on adaptive learning, convergence was proved directly and the martingale convergence theorem was the basic tool, see e.g. Bray (1982), Bray and Savin (1986), and Fourgeaud, Gourieroux and Pradel (1986). Soon it was realized that it is necessary to have a general technique to analyze adaptive learning in more complex models. Marcet and Sargent (1989b,c) and Woodford (1990) introduced a method, known as stochastic approximation or recursive stochastic algorithms, to analyze the convergence of learning behavior in a variety of macroeconomic models. A general form of recursive algorithms can be described as follows. To make economic decisions the agents in the economy need to forecast the current and/or future values of some relevant variables. The motions of these variables depend on parameters whose true values are unknown, so that for forecasting the agents need to estimate these parameters on the basis of available information and past data. Formally, let θ_t ∈ ℝ^d be a vector of parameters and let

θ_t = θ_{t−1} + γ_t Q(t, θ_{t−1}, X_t)   (23)

be an algorithm describing how agents try to learn the true value of θ. It is written in a recursive form since learning evolves over time. Here γ_t is a sequence of "gains", often something like γ_t = t^{−1}. X_t ∈ ℝ^k is a vector of state variables. Note that in general the learning rule depends on the vector of state variables. This vector is taken to be observable, and we will postulate that it follows the conditionally linear dynamics

X_t = A(θ_{t−1}) X_{t−1} + B(θ_{t−1}) W_t,   (24)
where W_t is a random disturbance term. The detailed assumptions on this interrelated system will be made below 24.

24 Note that somewhat different timing conventions are used in the literature. For example, in some expositions W_{t−1} may be used in place of W_t in Equation (24). The results are unaffected as long as W_t is an iid exogenous process.
Note that the least squares learning systems in Section 1.4 can be written in the form (23) and (24). For example, consider the system given by Equations (17) and (20). Substituting Equation (17) into Equation (20) and setting S_{t−1} = R_t yields an equation of the form (23), with θ_t' = (a_t, b_t, c_t', vec(S_t)') and X_t' = (1, y_{t−1}, w_t', y_{t−2}, w_{t−1}', v_t'), and it can be checked that X_t follows a process of the form (24) 25.
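As a concrete instance of an algorithm of the form (23), the following sketch runs recursive least squares for a simple regression y_t = b z_t + ε_t with no expectational feedback. The data-generating process, the gain γ_t = 1/(t+1), and the initializations are our own illustrative choices.

```python
import random

rng = random.Random(1)
b_true = 2.0          # true regression coefficient (illustrative)
b, R = 0.0, 1.0       # coefficient estimate and second-moment estimate

for t in range(1, 50001):
    gamma = 1.0 / (t + 1)                  # decreasing gain satisfying (A.1)
    z = rng.gauss(0, 1)                    # observable regressor
    y = b_true * z + rng.gauss(0, 0.5)     # data-generating process
    R = R + gamma * (z * z - R)            # recursive second-moment update
    b = b + gamma * (z / R) * (y - b * z)  # recursive least squares update
# b is now close to b_true and R close to E[z^2] = 1
```

Stacking (b_t, R_t) into a single vector θ_t puts these recursions exactly in the form θ_t = θ_{t−1} + γ_t Q(t, θ_{t−1}, X_t) with state X_t = (z_t, y_t); in self-referential models the distribution of X_t would additionally depend on θ_{t−1}, as in Equation (24).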
2.1.1. Notes on the technical literature

The classical theory of stochastic approximation, see Robbins and Monro (1951) and Kiefer and Wolfowitz (1952), was developed for models without full state variable dynamics and feedback from parameter estimates. Recent expositions of stochastic approximation are given e.g. in Benveniste, Metivier and Priouret (1990), Ljung, Pflug and Walk (1992), and Kushner and Yin (1997). A widely cited basic paper is Ljung (1977), which extended stochastic approximation to setups with dynamics and feedback. Ljung's results are extensively discussed in the book by Ljung and Söderström (1983). A further generalization of Ljung's techniques is presented in Benveniste, Metivier and Priouret (1990). A somewhat different approach, based on Kushner and Clark (1978), is developed in Kuan and White (1994). An extension of the algorithms to infinite-dimensional spaces is given in Chen and White (1998). Stochastic approximation techniques were used by Arthur, Ermoliev and Kaniovski (1983, 1994) to study generalized urn schemes. Evans and Honkapohja (1998a,b) and the forthcoming book Evans and Honkapohja (1999a) provide a synthesis suitable for economic theory and applications. The exposition here is based primarily on these last-mentioned sources. Other useful general formulations are Ljung (1977), Marcet and Sargent (1989c), the appendix of Woodford (1990), and Kuan and White (1994).
2.2. Assumptions on the algorithm

Let θ_t ∈ ℝ^d be a vector of parameters and X_t ∈ ℝ^k be a vector of state variables. At this stage it is convenient to adopt a somewhat specialized form of Equation (23), so that the evolution of θ_t is assumed to be described by the difference equation

θ_t = θ_{t−1} + γ_t 𝓗(θ_{t−1}, X_t) + γ_t² ρ_t(θ_{t−1}, X_t).   (25)

Here 𝓗(·) and ρ_t(·) are two functions describing how the vector θ is updated (the second-order term ρ_t(·) is often not present). Note that in Equation (25) the function Q(t, θ_{t−1}, X_t) appearing in Equation (23) has been specialized into first- and second-order terms in the gain parameter γ_t.
25 Here vec denotes the matrix operator which forms a column vector from the matrix by stacking in order the columns of the matrix.
Next we come to the dynamics for the vector of state variables. In most economic models the state dynamics are assumed to be conditionally linear, and we postulate here that X_t follows Equation (24). Without going into details we note here that it is possible to consider more general situations, where X_t follows a Markov process dependent on θ_{t−1}. This is needed in some applications, and the modifications to the analysis are presented in detail in Evans and Honkapohja (1998a). For local convergence analysis one fixes an open set D ⊂ ℝ^d around the equilibrium point of interest. The next step is to formulate the assumptions on the learning rule (25) and the state dynamics (24). We start with the former and postulate the following:

(A.1) γ_t is a positive, nonstochastic, nonincreasing sequence satisfying Σ_{t=1}^∞ γ_t = ∞ and Σ_{t=1}^∞ γ_t² < ∞.

(A.2) For any compact Q ⊂ D there exist C_1, C_2, q_1 and q_2 such that ∀θ ∈ Q and ∀t:
(i) |𝓗(θ, x)| ≤ C_1(1 + |x|^{q_1}),
(ii) |ρ_t(θ, x)| ≤ C_2(1 + |x|^{q_2}).

(A.3) For any compact Q ⊂ D the function 𝓗(θ, x) satisfies ∀θ, θ' ∈ Q and ∀x_1, x_2 ∈ ℝ^k:
(i) |∂𝓗(θ, x_1)/∂x − ∂𝓗(θ, x_2)/∂x| ≤ L_1 |x_1 − x_2|,
(ii) |𝓗(θ, 0) − 𝓗(θ', 0)| ≤ L_2 |θ − θ'|.

Note that (A.1) is clearly satisfied for γ_t = C/t, C constant. (A.2) imposes polynomial bounds on 𝓗(·) and ρ_t(·). (A.3) holds provided 𝓗(θ, x) is twice continuously differentiable (denoted C²) with bounded second derivatives on every Q. For the state dynamics one makes the assumptions:

(B.1) W_t is iid with finite absolute moments.

(B.2) For any compact subset Q ⊂ D: sup_{θ∈Q} |B(θ)| ≤ M and sup_{θ∈Q} |A(θ)| ≤ ρ < 1 for some matrix norm |·|, and A(θ) and B(θ) satisfy Lipschitz conditions on Q.

Remark: In (B.2) the condition on A(θ) is a little stronger than stationarity. However, if at some θ* the spectral radius (the maximum modulus of the eigenvalues)
satisfies r(A(θ*)) < 1, then the condition on A(θ) in (B.2) holds in a neighborhood of θ*. These are fairly general assumptions. In specific models the situation may be a great deal simpler. One easy case arises when the state dynamics X_t do not depend on the parameter vector θ_{t−1}. A classical special case in stochastic approximation, first discussed by Robbins and Monro (1951), arises when the distribution of the state variable X_{t+1} can depend on θ_t but is otherwise independent of the history X_t, X_{t−1}, …, θ_t, θ_{t−1}, ….
In general the recursive algorithm consisting of Equations (25) and (24) for θ_t and X_t, respectively, is a nonlinear, time-varying stochastic difference scheme. At first sight the properties of such systems may seem hard to analyze. It turns out that, due to the special structure of the equation for the parameter vector, the system can be studied in terms of an associated ordinary differential equation, which is derived as follows:
(i) Fix θ and define the corresponding state dynamics X̄_t(θ) = A(θ) X̄_{t−1}(θ) + B(θ) W_t.
(ii) Consider the asymptotic behavior of the mean of 𝓗(θ, X̄_t(θ)), i.e.

h(θ) = lim_{t→∞} E 𝓗(θ, X̄_t(θ)).

The associated differential equation is then defined as dθ/dτ = h(θ). Given assumptions (A.1)–(A.3) and (B.1)–(B.2) it can be shown that the function h(θ) is well-defined and locally Lipschitz.
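The limit h(θ) can be approximated by simulation: fix θ, run the state process X̄_t(θ) for a long time, and average 𝓗(θ, X̄_t(θ)). The sketch below does this for the hypothetical choices 𝓗(θ, x) = x² − θ and A(θ) = θ, B(θ) = 1 (our own illustrative example), for which h(θ) = 1/(1 − θ²) − θ can also be computed analytically.

```python
import random

def h_monte_carlo(theta, T=200000, burn=1000, seed=2):
    """Approximate h(theta) = lim E H(theta, X_t(theta)) for the hypothetical
    choices H(theta, x) = x**2 - theta and X_t = theta*X_{t-1} + W_t,
    with W_t ~ N(0, 1) and |theta| < 1."""
    rng = random.Random(seed)
    x, total, n = 0.0, 0.0, 0
    for t in range(T):
        x = theta * x + rng.gauss(0, 1)   # state dynamics at fixed theta
        if t >= burn:                     # discard transient, then average
            total += x * x - theta
            n += 1
    return total / n

approx = h_monte_carlo(0.5)
exact = 1.0 / (1.0 - 0.25) - 0.5   # stationary E[X^2] = 1/(1 - theta^2)
```

The simulated average settles near the analytic value, illustrating why h(θ) is well-defined under (A) and (B): the fixed-θ state process is stationary and its moments converge.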
2.3. Convergence: the basic results

2.3.1. ODE approximation

The basic idea in the "ODE method" is to write the algorithm in the form

θ_{t+1} = θ_t + γ_{t+1} h(θ_t) + ε_t,

where

ε_t = γ_{t+1} [𝓗(θ_t, X_{t+1}) − h(θ_t) + γ_{t+1} ρ_{t+1}(θ_t, X_{t+1})].

Thus ε_t is essentially the approximation error between the algorithm and a standard discretization of the associated ODE. In proving the validity of the method the main difficulty is in showing that the cumulated approximation errors Σ_t ε_t are bounded. The precise details of the proof are very lengthy indeed [see Evans and Honkapohja (1998a) for an exposition], and they must be omitted here.
2.3.2. Asymptotic analysis

We now make the assumption that we have an equilibrium point θ* of the associated ODE which is locally asymptotically stable for the ODE. It turns out that, in a particular sense, the time path of θ_t generated by the algorithm will converge to θ*, provided for its starting point (x, a) the component a is sufficiently close to θ*. In doing this, the Lyapunov stability theory for ODEs in the form of so-called converse Lyapunov theorems is needed 26. Suppose that θ* is locally asymptotically stable for the associated ODE dθ/dτ = h(θ(τ)). Then a version of the converse Lyapunov theorems states that on the domain of attraction 𝒟 of θ* for the ODE there exists a C² Lyapunov function U(θ) having the properties
(a) U(θ*) = 0 and U(θ) > 0 for all θ ∈ 𝒟, θ ≠ θ*,
(b) U'(θ) h(θ) < 0 for all θ ∈ 𝒟, θ ≠ θ*,
(c) U(θ) → ∞ if θ → ∂𝒟 or |θ| → ∞ 27.

Introduce now the notation K(c) = {θ : U(θ) ≤ c}, c > 0, for the contour sets of the Lyapunov function. Also let P_{n,x,a} be the probability distribution of (X_t, θ_t)_{t≥n} with X_n = x, θ_n = a. The following theorem is the basic convergence result for the recursive stochastic algorithms (25) and (24):

Theorem 1. Let θ* be an asymptotically stable equilibrium point of the ODE dθ/dτ = h(θ(τ)). Suppose assumptions (A) and (B) are satisfied on D = int(K(c)) for some c > 0. Suppose that for 0 < c_1 < c_2 we have K(c_2) ⊂ D. Then
(i) ∀a ∈ K(c_1), n ≥ 0, x one has P_{n,x,a}{θ_t leaves K(c_2) in finite time or θ_t → θ*} = 1, and
(ii) for any compact Q ⊂ D there exist constants B_2 and s such that ∀a ∈ Q, n ≥ 0, x:

P_{n,x,a}{θ_t → θ*} ≥ 1 − B_2 (1 + |x|^s) J(n),

where J(n) is a positive decreasing sequence with lim_{n→∞} J(n) = 0.
Remark: J(n) is in fact given by J(n) = (1 + Σ_{t=n+1}^∞ γ_t²) Σ_{t=n+1}^∞ γ_t².
To interpret the results one first fixes the contour sets K(c_1) ⊂ K(c_2). The theorem states two things. First, the algorithm either converges to θ* or diverges outside K(c_2). Second, the probability of converging to θ* is bounded from below by a sequence of

26 For converses of the Lyapunov stability theorems see Hahn (1963, 1967).
27 ∂𝒟 denotes the boundary of 𝒟.
numbers which tends to unity as n → ∞. In other words, if at some large value of t the algorithm has not gone outside K(c_2), then it will converge to θ* with high probability.

2.4. Convergence: further discussion

2.4.1. Immediate consequences

The following two results are special cases for obtaining statements about convergence when starting at time 0. The first result is an immediate consequence of the second part of Theorem 1:

Corollary 2. Suppose γ_t = αγ̃_t, where γ̃_t satisfies (A.1). Let the initial value of θ belong to some compact Q ⊂ D. Then ∀δ > 0 ∃α* such that ∀ 0 < α < α* and a ∈ Q:

P_{0,x,a}{θ_t → θ*} ≥ 1 − δ.

This is the case of slow adaptation. For slow enough adaptation the probability of convergence can be made "very close" to one. For general adaptation speeds and with additional assumptions it is possible to obtain convergence with positive probability:
P0..... { 0 ~ 0 " } > 0 for all a C Qo and x c Jo. It must be emphasized that it is not in general possible to obtain bounds close to unity even for the most favorable initial conditions at this level of generality. The reason is that for small values of t the ODE does not approximate well the algorithm. For early time periods sufficiently large shocks may displace Ot outside the domain of attraction of the ODE. 2.4.2. Algorithms with a projection facility In the earlier literature [e.g., Marcet and Sargent (1989b,c), Evans and Honkapohja (1994b,c, 1995c)] this problem was usually avoided by an additional assumption, which is called the Projection Facility (PF). It is defined as follows: For some 0 < Cl < c2, with K(c2) c D, the algorithm is followed provided Ot c int(K(c2)).
Otherwise, it is projected to some point in K(c_1). An alternative to the PF, see e.g. Ljung (1977), is to introduce the direct boundedness assumption that the algorithm visits a small neighborhood of the equilibrium point infinitely often. This condition is often impossible to verify. The hypothesis of a PF has been criticized as being inappropriate for decentralized markets [see Grandmont (1998), Grandmont and Laroque (1991) and Moreno and Walker (1994)]. The basic results above do not invoke the projection facility, which in fact has a further strong implication. With a PF the probability of convergence to a stable equilibrium point can be made equal to unity:

Corollary 4. Consider the general algorithm augmented by a projection facility. Then

∀x, a: P_{0,x,a}{θ_t → θ*} = 1.

We omit the proof, which is a straightforward consequence of the main theorems, see Evans and Honkapohja (1998a). Finally, we note here that almost sure local convergence can be obtained in some special models, provided that the support of the random shock is sufficiently small, see Evans and Honkapohja (1995c). Also for nonstochastic models there is no need to have a PF when one is interested in local stability. However, for some nonstochastic models problems with continuity of the functions in the learning algorithm may arise 28.
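A projection facility is straightforward to implement in simulation: whenever the estimate exits the larger region K(c_2), it is reset to a point in the smaller region K(c_1). The sketch below augments decreasing-gain learning of the steady state of the illustrative map f(φ) = 0.5φ + 1 (steady state 2) with such a device; the map, the regions, and the shock distribution are our own assumptions.

```python
import random

def learn_with_pf(T=10000, c1=1.0, c2=5.0, seed=3):
    """Decreasing-gain learning of the steady state of the illustrative
    map f(phi) = 0.5*phi + 1 (steady state 2), augmented with a projection
    facility: if the estimate leaves the region |phi - 2| <= c2 it is
    reset to the boundary of the smaller region |phi - 2| <= c1."""
    rng = random.Random(seed)
    phi = 0.0
    for t in range(1, T + 1):
        shock = rng.gauss(0, 0.5)
        phi += (1.0 / t) * (0.5 * phi + 1 + shock - phi)
        if abs(phi - 2.0) > c2:    # estimate left K(c2):
            phi = 2.0 + c1         # project back into K(c1)
    return phi

phi_hat = learn_with_pf()
```

With the PF in place the estimate can never escape K(c_2), which is exactly why Corollary 4 can upgrade convergence "with high probability" to convergence with probability one, and also why the device has been criticized as an artificial restraint in decentralized settings.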
2.5. Instability results

We will now consider the instability results, which will, broadly speaking, state the following: (i) the algorithm cannot converge to a point which is not an equilibrium point of the associated ODE, and (ii) the algorithm will not converge to an unstable equilibrium point of the ODE. We will have to adopt a new set of conditions 29. Let again θ_t ∈ ℝ^d be a vector of parameters and adopt the general form (23) for the algorithm, i.e. θ_t = θ_{t−1} + γ_t Q(t, θ_{t−1}, X_t). Below we will impose assumptions directly on Q(·). Again, X_t ∈ ℝ^k is the vector of observable state variables with the conditionally linear dynamics (24), i.e. X_t = A(θ_{t−1}) X_{t−1} + B(θ_{t−1}) W_t. Select now a domain D* ⊂ ℝ^d such that all the eigenvalues of A(θ) are strictly inside the unit circle ∀θ ∈ D*. The final domain of interest will be an open and connected set
28 For example, the moment matrix in recursive least squares can become singular asymptotically. See Grandmont and Laroque (1991) and Grandmont (1998) for a discussion. Evans and Honkapohja (1998b) and Honkapohja (1994) discuss the differences between stochastic and nonstochastic models.
29 The main source for the instability results is Ljung (1977). (We will adopt his assumptions A.) A slightly different version of Ljung's results is given in the appendix of Woodford (1990). For an instability result with decreasing gain in a nonstochastic setup see Evans and Honkapohja (1999b).
D ⊂ D*, and the conditions below will be postulated for D. We introduce the following assumptions:

(C.1) W_t is a sequence of independent random variables with |W_t| < C with probability one for all t.

(C.2) Q(t, θ, x) is C¹ in (θ, x) for θ ∈ D. For fixed (θ, x) the derivatives are bounded in t.

(C.3) The matrices A(θ) and B(θ) are Lipschitz on D.

(C.4) lim_{t→∞} E Q(t, θ, X̄_t(θ)) = h(θ) exists for θ ∈ D, where X̄_t(θ) = A(θ) X̄_{t−1}(θ) + B(θ) W_t.

(C.5) γ_t is a decreasing sequence with the properties Σ_{t=1}^∞ γ_t = ∞, Σ_{t=1}^∞ γ_t^p < ∞ for some p, and lim sup_{t→∞} |γ_t^{−1} − γ_{t−1}^{−1}| < ∞.

With these assumptions the following theorem holds [see Ljung (1977) for a proof]:

Theorem 5. Consider the algorithm with assumptions (C). Suppose at some point θ* ∈ D we also have the validity of the conditions
(i) Q(t, θ*, X̄_t(θ*)) has a covariance matrix that is bounded below by a positive definite matrix, and
(ii) E Q(t, θ, X̄_t(θ)) is C¹ in θ in a neighborhood of θ*, and the derivatives converge uniformly in t.
Then if h(θ*) ≠ 0, or if ∂h(θ*)/∂θ' has an eigenvalue with positive real part, Pr(θ_t → θ*) = 0.

In other words, the possible rest points of the recursive algorithm consist of the locally stable equilibrium points of the associated ODE 30. It is worth mentioning the role of condition (i) in the theorem. It ensures that even at large values of t some random fluctuations remain, so that the system cannot stop at an unstable point or a nonequilibrium point. For example, if there were no randomness at all, then with an initial value precisely at an unstable equilibrium the algorithm would not move off that point. If the system is nonstochastic, the usual concept of instability, which requires divergence from nearby starting points, is utilized instead.

2.6. Further remarks

The stability results above and these instability results are the main theorems from the theory of recursive algorithms that are used in the analysis of adaptive learning in economics.

30 This assumes that the equilibrium points are isolated. There are more general statements of the result.
Ch. 7: Learning Dynamics
We note here that there exist some extensions yielding convergence to more general invariant sets of the ODE under further conditions. If the invariant set consists of isolated fixed points and the dynamics can be shown to remain in a compact domain, then it is possible to prove a global result that learning dynamics converges to the set of locally stable fixed points^31. Another global convergence result, for a unique equilibrium under rather strong conditions, will be discussed in a moment. As already mentioned in Section 1.5 of the Introduction, a simpler way of obtaining the appropriate convergence condition for adaptive learning is the concept of expectational stability. The method for establishing the connection between E-stability and convergence of real-time learning rules naturally depends on the type of PLM that agents are presumed to use. For nonlinear models one usually has to be content with specific types of REE, whereas for linear models the entire set of REE can be given an explicit characterization and one can be more systematic. We remark that the parameterization of the REE and the specification of who is learning what (i.e. the perceived law of motion) can in principle affect the stability conditions. This situation is no different from other economic models of adjustment outside a full equilibrium. However, it is evident that the local stability condition, namely that the eigenvalues of the linearization $DT(\cdot)$ have real parts less than one, is invariant to one-to-one transformations $\phi \to \tilde\phi = f(\phi)$, where $f$ and $f^{-1}$ are both $C^1$. Recall, however, that if agents overparameterize the solution this may affect the stability condition, which is captured by the distinction between weak and strong E-stability.
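The invariance of the eigenvalue condition under reparameterization can be checked numerically. The sketch below uses a hypothetical two-dimensional T-map and a nonlinear change of coordinates (both illustrative choices of ours, not from the text): the Jacobian of $f \circ T \circ f^{-1}$ at $f(\phi^*)$ is similar to the Jacobian of $T$ at $\phi^*$, so the eigenvalues, and hence the stability condition, coincide.

```python
import numpy as np

def jacobian(g, x, h=1e-6):
    """Central finite-difference Jacobian of g at x."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (g(x + e) - g(x - e)) / (2 * h)
    return J

# Hypothetical T-map with fixed point phi* = (2.0, 1.6); DT has eigenvalues 0.5.
A, mu, CB = 0.5, 1.0, 0.8
T = lambda phi: np.array([mu + A * phi[0], CB + A * phi[1]])
phi_star = np.array([mu / (1 - A), CB / (1 - A)])

# A one-to-one C^1 reparameterization with a C^1 inverse.
f = lambda phi: np.array([phi[0] + 0.1 * phi[1] ** 3, phi[1]])
f_inv = lambda psi: np.array([psi[0] - 0.1 * psi[1] ** 3, psi[1]])
T_tilde = lambda psi: f(T(f_inv(psi)))      # the T-map in the new coordinates

eig = np.sort(np.linalg.eigvals(jacobian(T, phi_star)).real)
eig_new = np.sort(np.linalg.eigvals(jacobian(T_tilde, f(phi_star))).real)
assert np.allclose(eig, eig_new, atol=1e-4)  # same eigenvalues in both charts
```

Since the eigenvalues of $DT$ at $\phi^*$ lie below one here, the stability verdict is unchanged by the transformation, as asserted above.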
2.7. Two examples

2.7.1. Learning noisy steady states

We consider univariate nonlinear models of the form

$y_t = H(E_t^* G(y_{t+1}, v_{t+1}), v_t),$  (26)

where $v_t$ is an iid shock. Here $E_t^* G(y_{t+1}, v_{t+1})$ denotes subjective expectations of a (nonlinear) function of the next period's value $y_{t+1}$ and the shock $v_{t+1}$. In an REE, $E_t^* G(y_{t+1}, v_{t+1}) = E_t G(y_{t+1}, v_{t+1})$, the true conditional expectation. As mentioned previously, various overlapping generations models provide standard examples that fit this framework.
^31 Woodford (1990) and Evans, Honkapohja and Marimon (1998a) are examples of the use of this kind of result.
A noisy steady state for Equation (26) is given by a value $\theta^*$ such that

$\theta^* = E\,G(H(\theta^*, v), v), \qquad y_t = H(\theta^*, v_t).$
Note that $y_t$ is an iid process. For learning a steady state the updating rule is

$\theta_t = \theta_{t-1} + t^{-1}[G(y_t, v_t) - \theta_{t-1}],$  (27)

which is equivalent to taking sample means. Contemporaneous observations are omitted for simplicity, so that we set $E_t^* G(y_{t+1}, v_{t+1}) = \theta_{t-1}$^32. Thus $y_t = H(\theta_{t-1}, v_t)$, which is substituted into Equation (27) to obtain a stochastic approximation algorithm of the form (24), (25). The convergence condition for such a "noisy" steady state is
$\frac{d}{d\theta}\,E\,G(H(\theta, v), v)\Big|_{\theta=\theta^*} < 1.$
This condition can also be obtained from the E-stability equation, since the T-map is in fact $T(\theta) = E\,G(H(\theta, v), v)$. The extension to the E-stability of cycles is discussed in Evans and Honkapohja (1995c) and in Section 4 below.
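The updating rule (27) is straightforward to simulate. The sketch below uses illustrative functional forms of our own, $G(y, v) = y$ and $H(\theta, v) = A + B\theta + v$ with $B < 1$, so that $T(\theta) = A + B\theta$, the noisy steady state is $\theta^* = A/(1-B)$, and the convergence condition $T'(\theta^*) < 1$ holds.

```python
import numpy as np

# Illustrative choices (our assumption, not from the chapter):
# G(y, v) = y,  H(theta, v) = A + B*theta + v  =>  T(theta) = A + B*theta.
A, B = 1.0, 0.5            # B < 1: the convergence condition T' < 1 holds
theta_star = A / (1 - B)   # noisy steady state, here 2.0

rng = np.random.default_rng(0)
theta = 0.0                # initial subjective expectation
for t in range(1, 20001):
    v = rng.normal(scale=0.1)
    y = A + B * theta + v                 # y_t = H(theta_{t-1}, v_t)
    theta += (1.0 / t) * (y - theta)      # updating rule (27) with G(y, v) = y

assert abs(theta - theta_star) < 0.05    # theta_t converges to theta*
```

With the decreasing gain $t^{-1}$ the iterate behaves like a sample mean, which is exactly the interpretation given in the text.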
2.7.2. A model with a unique REE

The market model of Muth (1961) was introduced in Section 1.2.1 above. We consider briefly its generalization to simultaneous equations, e.g. to multiple markets, discussed earlier in Evans and Honkapohja (1995a, 1998b):

$y_t = \mu + A E_{t-1}^* y_t + C w_t, \qquad w_t = B w_{t-1} + v_t.$

Here $y_t$ is an $n \times 1$ vector of endogenous variables, $w_t$ is an observed $p \times 1$ vector of stationary exogenous variables, and $v_t$ is a $p \times 1$ vector of white noise shocks with finite moments. The eigenvalues of the $p \times p$ matrix $B$ are assumed to lie inside the unit circle. For simplicity, the matrix $B$ is assumed to be known. $E_{t-1}^* y_t$ denotes the expectations of agents held at time $t-1$ based on their perceived law of motion. Assume also that $I - A$ is invertible.
^32 This avoids a simultaneity between $y_t$ and $\theta_t$; see Section 1.4.3 for further discussion and references.
This model has a unique REE

$y_t = \bar a + \bar b w_{t-1} + \eta_t,$

where $\bar a = (I - A)^{-1}\mu$, $\bar b = (I - A)^{-1}CB$ and $\eta_t = C v_t$. Is this REE expectationally stable? Consider perceived laws of motion of the form

$y_t = a + b w_{t-1} + \eta_t$

for arbitrary $n \times 1$ vectors $a$ and $n \times p$ matrices $b$. The corresponding expectation function is $E_{t-1}^* y_t = a + b w_{t-1}$, and one obtains the actual law of motion

$y_t = (\mu + A a) + (A b + C B)w_{t-1} + \eta_t,$

where $\eta_t = C v_t$. The T mapping is thus

$T(a, b) = (\mu + A a,\ A b + C B).$
E-stability is determined by the differential equations

$\frac{da}{d\tau} = \mu + (A - I)a, \qquad \frac{db}{d\tau} = CB + (A - I)b.$

This system is locally asymptotically stable if and only if all eigenvalues of $A - I$ have negative real parts, i.e. if the eigenvalues of $A$ have real parts less than one. In real-time learning the perceived law of motion is time-dependent:
$y_t = a_{t-1} + b_{t-1} w_{t-1} + \eta_t,$
where the parameters $a_t$ and $b_t$ are updated by running recursive least squares (RLS). Letting $\phi_t = (a_t, b_t)$, $z_t' = (1, w_t')$ and $e_t = y_t - \phi_{t-1}' z_{t-1}$, RLS can be written

$\phi_t = \phi_{t-1} + t^{-1} R_t^{-1} z_{t-1} e_t',$
$R_t = R_{t-1} + t^{-1}(z_{t-1} z_{t-1}' - R_{t-1}).$
This learning rule is complemented by the short-run determination of the value for $y_t$, which is

$y_t = T(\phi_{t-1}) z_{t-1} + C v_t,$
where $T(\phi) = T(a, b)$ as given above. In order to convert the system into standard form (25) we make a timing change in the system governing $R_t$. Thus we set $S_{t-1} = R_t$, so that

$S_t = S_{t-1} + t^{-1}(z_t z_t' - S_{t-1}) - t^{-2}\left(\frac{t}{t+1}\right)(z_t z_t' - S_{t-1}).$

The last term is then of the usual form with $\rho_t(S_{t-1}, z_t) = -\frac{t}{t+1}(z_t z_t' - S_{t-1})$. The model is of the form (25) with $\theta_t = \mathrm{vec}(\phi_t, S_t)$ and $X_t' = (1, w_t', w_{t-1}', v_t')$. The dynamics for
the state variable are driven by the exogenous processes, and one can verify that the basic assumptions for the convergence analysis are met. The associated ODE can be obtained as follows. Substituting in for $e_t$ and $y_t$ one obtains for $\phi_t$

$\phi_t' = \phi_{t-1}' + t^{-1} S_{t-1}^{-1} z_{t-1}\,[T(\phi_{t-1}) z_{t-1} + C v_t - \phi_{t-1}' z_{t-1}]'$
$\phantom{\phi_t'} = \phi_{t-1}' + t^{-1} S_{t-1}^{-1} z_{t-1} z_{t-1}'\,[T(\phi_{t-1}) - \phi_{t-1}]' + t^{-1} S_{t-1}^{-1} z_{t-1} v_t' C'.$

Taking expectations and limits one obtains the ODE as

$\frac{d\phi'}{d\tau} = R^{-1} M_z [T(\phi) - \phi]', \qquad \frac{dR}{d\tau} = M_z - R,$
where $M_z$ denotes the positive definite matrix $M_z = E z_t z_t'$. The second equation is independent of $\phi$ and is clearly globally asymptotically stable. Moreover, since $R \to M_z$, the stability of the first equation is governed by the E-stability equation

$\frac{d\phi}{d\tau} = T(\phi) - \phi.$

Its local stability condition is that the eigenvalues of $A$ have real parts less than one; see above. Thus the E-stability condition is the convergence condition for the RLS learning algorithm in this model. In the next section we establish a global result that is applicable to this model.
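The RLS learning scheme of this subsection can be illustrated with a scalar ($n = p = 1$) simulation; the parameter values below are our own choices, with $A < 1$ so that the E-stability condition holds. A small offset in the gain keeps $R_t$ nonsingular during the first periods.

```python
import numpy as np

# Scalar illustration: y_t = mu + A*E*_{t-1} y_t + C*w_t, w_t = B*w_{t-1} + v_t.
mu, A, C, B = 1.0, 0.5, 1.0, 0.8
a_bar, b_bar = mu / (1 - A), C * B / (1 - A)     # REE values: 2.0 and 1.6

rng = np.random.default_rng(1)
phi = np.zeros(2)            # beliefs (a_t, b_t)
R = np.eye(2)                # moment-matrix estimate
w_prev = 0.0
for t in range(1, 50001):
    w = B * w_prev + rng.normal(scale=0.5)
    z = np.array([1.0, w_prev])                  # regressors dated t-1
    y = mu + A * (phi @ z) + C * w               # ALM given current beliefs
    gain = 1.0 / (t + 10)                        # offset keeps R invertible
    R += gain * (np.outer(z, z) - R)
    phi += gain * np.linalg.solve(R, z) * (y - phi @ z)
    w_prev = w

assert abs(phi[0] - a_bar) < 0.15 and abs(phi[1] - b_bar) < 0.15
```

Since the single "eigenvalue" $A = 0.5$ is below one, the associated ODE is stable and the beliefs settle on the REE coefficients, in line with the convergence result just stated.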
2.8. Global convergence

In this section we provide a stronger set of conditions than conditions (A) and (B) of Section 2.2, which guarantee global convergence of the recursive algorithm

$\theta_t = \theta_{t-1} + \gamma_t \mathcal{H}(\theta_{t-1}, X_t) + \gamma_t^2 \rho_t(\theta_{t-1}, X_t).$

The new assumptions are:
(D.1) The functions $\mathcal{H}(\theta, x)$ and $\rho_t(\theta, x)$ satisfy, for all $\theta, \theta' \in \mathbb{R}^d$ and all $x, x_1, x_2 \in \mathbb{R}^k$:
(i) $|\mathcal{H}(\theta, x_1) - \mathcal{H}(\theta, x_2)| \le L_1(1 + |\theta|)\,|x_1 - x_2|\,(1 + |x_1|^{p_1} + |x_2|^{p_2})$,
(ii) $|\mathcal{H}(\theta, 0) - \mathcal{H}(\theta', 0)| \le L_2\,|\theta - \theta'|$,
(iii) $|\partial\mathcal{H}(\theta, x)/\partial x - \partial\mathcal{H}(\theta', x)/\partial x| \le L_2\,|\theta - \theta'|\,(1 + |x|^{p_2})$,
(iv) $|\rho_t(\theta, x)| \le L_2(1 + |\theta|)(1 + |x|^q)$
for some constants $L_1, L_2, p_1, p_2$ and $q$.
(D.2) The dynamics for the state variable $X_t \in \mathbb{R}^k$ are independent of $\theta$ and satisfy (B.1) and (B.2) above.

With these conditions one has the following global result:
Theorem 6. Under assumptions (A.1), (D.1) and (D.2), assume that there exists a unique equilibrium point $\theta^* \in \mathbb{R}^d$ of the associated ODE. Suppose that there exists a positive $C^2$ function $U(\theta)$ on $\mathbb{R}^d$ with bounded second derivatives satisfying
(i) $U'(\theta)h(\theta) < 0$ for all $\theta \ne \theta^*$,
(ii) $U(\theta) = 0$ iff $\theta = \theta^*$,
(iii) $U(\theta) \ge a|\theta|^2$ for all $\theta$ with $|\theta| \ge \rho_0$, for some $a, \rho_0 > 0$.
Then the sequence $\theta_t$ converges almost surely to $\theta^*$.
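The Lyapunov conditions of Theorem 6 can be illustrated concretely. The sketch below takes a simple scalar $h(\theta)$ of our own choosing (the E-stability ODE of a linear model) together with the candidate $U(\theta) = (\theta - \theta^*)^2$, and checks conditions (i), (ii) and (iii) numerically on a grid.

```python
import numpy as np

# Hypothetical scalar ODE right-hand side h(theta) = T(theta) - theta with
# T(theta) = alpha + beta*theta, beta < 1, so theta* = alpha / (1 - beta).
alpha, beta = 1.0, 0.7
theta_star = alpha / (1.0 - beta)

h = lambda th: alpha + beta * th - th            # h(theta*) = 0
U = lambda th: (th - theta_star) ** 2            # candidate Lyapunov function
dU = lambda th: 2.0 * (th - theta_star)          # U'(theta)

grid = np.linspace(theta_star - 50, theta_star + 50, 10001)
grid = grid[np.abs(grid - theta_star) > 1e-8]    # exclude theta* itself

assert np.all(dU(grid) * h(grid) < 0)            # condition (i): U'h < 0
assert U(theta_star) == 0                        # condition (ii)
# condition (iii): quadratic growth far from the origin, e.g. U >= |theta|^2/4
far = grid[np.abs(grid) >= 4 * abs(theta_star)]
assert np.all(U(far) >= 0.25 * far ** 2)
```

With these conditions verified, the theorem delivers global almost-sure convergence of $\theta_t$ to $\theta^*$ for this illustrative case.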
A proof is outlined in Evans and Honkapohja (1998b). In that paper it is also shown how this theorem can be used to establish global convergence in the multivariate linear model of Section 2.7.2.

3. Linear economic models
3.1. Characterization of equilibria

Many linear rational expectations models have multiple solutions, and this is one of the reasons why the study of learning in such models is of considerable interest, as previously noted. Consider the following specification:

$y_t = \alpha + \sum_{i=1}^{\ell} \delta_i y_{t-i} + \sum_{i=0}^{k} \beta_i E_{t-1} y_{t+i} + v_t,$  (28)
in which a scalar endogenous variable $y_t$ depends on its lagged values, on expectations of its current and future values, and on a white noise shock $v_t$. Here $E_{t-1} y_{t+i}$ denotes the expectation of $y_{t+i}$ based on the time $t-1$ information set. For this model it is possible to give a complete characterization of the solutions, using the results of Evans and Honkapohja (1986) and Broze, Gourieroux and Szafarz (1985). The technique is based on the method of undetermined coefficients, but rather than guessing a solution of a particular kind it is applied systematically to find a representation for all possible solutions. Every solution can be written in the form
$y_t = -\beta_k^{-1}\alpha + \beta_k^{-1}(1 - \beta_0)y_{t-k} - \beta_k^{-1}\sum_{i=1}^{\ell}\delta_i y_{t-k-i} - \beta_k^{-1}\sum_{i=1}^{k-1}\beta_i y_{t-k+i} + v_t + \sum_{i=1}^{k} c_i v_{t-i} + \sum_{i=1}^{k} d_i \varepsilon_{t-i},$  (29)
where $\varepsilon_t$ is an arbitrary martingale difference sequence, i.e. a stochastic process satisfying $E_{t-1}\varepsilon_t = 0$, and where $c_1, \ldots, c_k, d_1, \ldots, d_k$ are arbitrary^33. Various particular

^33 There is an extensive literature on solution techniques for linear RE models and different possible representations of the solutions. Some central references are Gourieroux, Laffont and Monfort (1982), Broze, Gourieroux and Szafarz (1990), Whiteman (1983), McCallum (1983), Pesaran (1981), d'Autume (1990), and Taylor (1986).
solutions can be constructed from Equation (29) by choosing the values for the $c_i$ and $d_i$ and the $\varepsilon_t$ process appropriately.
In the literature attention is most often focused on so-called minimal state variable (MSV) solutions to Equation (28)^34. These solutions are of the form

$y_t = a + \sum_{i=1}^{\ell} b_i y_{t-i} + v_t.$
Many macroeconomic models have expectations for which the information set includes the current values of the variables. A characterization similar to Equation (29) is available for such models. Some models in the literature have mixed datings of expectations and/or incorporate exogenous processes other than white noise. Although there exists a general characterization of the set of solutions in Broze, Gourieroux and Szafarz (1985), it is often easier to be creative and derive the representation by the principles outlined above. The references in the footnote above provide detailed discussions of the methods in particular frameworks.

3.2. Learning and E-stability in univariate models
In this section we give a comprehensive analysis of adaptive learning dynamics for some specific linear setups. Although these models appear to be relatively simple, they cover a large number of standard macroeconomic models that have been developed in the literature. Another advantage of focusing initially on simple models is that we can study the learning dynamics for the full set of solutions and obtain complete analytic results. It is possible to generalize the analysis of learning to more general setups (including various multivariate models) and derive, for example, the conditions for stability of specific solutions under learning. However, these conditions easily become abstract, so that analytic results are limited and it becomes necessary to resort to numerical methods.

3.2.1. A leading example
Consider the univariate model

$y_t = \alpha + \beta_0 E_{t-1}^* y_t + \beta_1 E_{t-1}^* y_{t+1} + v_t,$  (30)

where $v_t$ is assumed to be an exogenous process satisfying $E_{t-1} v_t = 0$.
34 The terminology is due to McCallum (1983), but our usage differs from his in that we only use his primary solution principle to define MSV solutions. McCallum also introduces a subsidiary principle, and he defines MSV solutions as those that satisfy both principles. McCallum (1997) argues that his MSV criterion provides a classification scheme for delineating the bubble-free solution.
Example 3.1. Sargent and Wallace (1975) "ad hoc" model:

$q_t = a_1 + a_2(p_t - E_{t-1}^* p_t) + u_{1t},$ where $a_2 > 0$;
$q_t = b_1 + b_2(r_t - (E_{t-1}^* p_{t+1} - E_{t-1}^* p_t)) + u_{2t},$ where $b_2 < 0$;
$m = c_0 + p_t + c_1 q_t + c_2 r_t + u_{3t},$ where $c_1 > 0$, $c_2 < 0$.

Here $q$, $p$, and $m$ are the logarithms of output, the price level and the money stock, respectively, and the money stock is assumed constant; $r_t$ is the nominal interest rate. This fits the reduced form (30) with $y_t = p_t$, and $\beta_1 > 0$ and $\beta_0 + \beta_1 < 1$.

Example 3.2. Real balance model [Taylor (1977)]:

$q_t = a_1 + a_2(m - p_t) + u_{1t},$ where $a_2 > 0$;
$q_t = b_1 + b_2(r_t - (E_{t-1}^* p_{t+1} - E_{t-1}^* p_t)) + b_3(m - p_t) + u_{2t},$ where $b_2 < 0$, $b_3 > 0$;
$m = c_0 + p_t + q_t + c_2 r_t + c_3(m - p_t) + u_{3t},$ where $c_2 < 0$, $0 < c_3 < 1$.

The reduced form is Equation (30) with $y_t = p_t$ and $\beta_1 = -\beta_0$, where

$\beta_0 = b_2\left(b_3 + b_2(1 - a_2 - c_3)c_2^{-1} - a_2\right)^{-1}.$
For appropriate choice of structural parameters, any value $\beta_0 \ne 0$ is possible.

3.2.1.1. A characterization of the solutions. The set of stochastic processes

$y_t = -\beta_1^{-1}\alpha + \beta_1^{-1}(1 - \beta_0)y_{t-1} + v_t + c_1 v_{t-1} + d_1 \varepsilon_{t-1}$  (31)
characterizes the possible REE. $c_1$ and $d_1$ are free, and $\varepsilon_t$ is an arbitrary process satisfying $E_{t-1}\varepsilon_t = 0$. $\varepsilon_t$ is often referred to as a "sunspot", since it can be taken to be extrinsic to the model. We will refer to Equation (31) as the ARMA(1,1) set of solutions. These solutions can be either stochastically (asymptotically) stationary or explosive, depending on the parameter values. The ARMA(1,1) solutions are stationary if $|\beta_1^{-1}(1 - \beta_0)| < 1$. Choosing $d_1 = 0$ and $c_1 = -\beta_1^{-1}(1 - \beta_0)$ gives an ARMA(1,1) process with a common factor in the autoregressive and moving average lag polynomials. When cancelled this yields the MSV solution^35

$y_t = \frac{\alpha}{1 - \beta_0 - \beta_1} + v_t.$  (32)
The MSV solution is, of course, often the solution chosen in applied work, and it is the unique non-explosive solution if $|\beta_1^{-1}(1 - \beta_0)| > 1$. Various terminologies are in use

^35 See Evans and Honkapohja (1986, 1994b) for details of this technique in general setups.
for this situation: the model is equivalently said to be "saddle-point stable" or "regular", and the MSV solution is said to be "locally determinate". If instead $|\beta_1^{-1}(1 - \beta_0)| < 1$, then the model is said to be "irregular" and the MSV solution is described as "locally indeterminate". It is precisely in this case that the ARMA solutions are stationary. We will now consider the E-stability of the various solutions, taking no account of whether the ARMA(1,1) solutions are stationary or explosive (an issue to which we will return). Obviously for this model the MSV solution is always stationary.

3.2.1.2. E-stability of the solutions. Posit a PLM (perceived law of motion) of the same form as the MSV solution:
$y_t = a + v_t.$  (33)
Under this PLM we obtain

$y_t = \alpha + (\beta_0 + \beta_1)a + v_t$
as the actual law of motion (ALM) implied by the PLM (33). For E-stability one examines the differential equation

$\frac{da}{d\tau} = \alpha + (\beta_0 + \beta_1)a - a$  (34)

with unique equilibrium $\bar a = \alpha/(1 - \beta_0 - \beta_1)$. The E-stability condition is

$\beta_0 + \beta_1 < 1.$  (35)
Next consider PLMs of the ARMA(1,1) form

$y_t = a + b y_{t-1} + c v_{t-1} + d\varepsilon_{t-1} + v_t,$  (36)

where $\varepsilon_t$ is an arbitrary process satisfying $E_{t-1}\varepsilon_t = 0$, assumed observable at $t$. The implied ALM is

$y_t = \alpha + \beta_0 a + \beta_1 a(1 + b) + (\beta_0 b + \beta_1 b^2)y_{t-1} + (\beta_0 c + \beta_1 bc)v_{t-1} + (\beta_0 d + \beta_1 bd)\varepsilon_{t-1} + v_t.$  (37)
The mapping from PLM to ALM thus takes the form

$T(a, b, c, d) = (\alpha + \beta_0 a + \beta_1 a(1 + b),\ \beta_0 b + \beta_1 b^2,\ \beta_0 c + \beta_1 bc,\ \beta_0 d + \beta_1 bd),$  (38)

and we therefore consider the differential equation

$\frac{d}{d\tau}(a, b, c, d) = T(a, b, c, d) - (a, b, c, d).$  (39)
Note first that $(a, b)$ form an independent subsystem, $\frac{d}{d\tau}(a, b) = T_{ab}(a, b) - (a, b)$. Evaluating the roots of $DT_{ab} - I$ at the ARMA(1,1) solution values $a = -\beta_1^{-1}\alpha$, $b = \beta_1^{-1}(1 - \beta_0)$, it follows that E-stability for the ARMA(1,1) solutions requires

$\beta_0 > 1, \qquad \beta_1 < 0.$  (40)

It is then further possible to show that if $(a, b)$ converge to the ARMA(1,1) solution values, then under Equation (39) $(c, d)$ also converge to some value [see Evans and
Honkapohja (1992) for details]. Hence Equations (40) are the conditions for the ARMA(1,1) solution set to be E-stable.

3.2.1.3. Strong E-stability. Reconsider the MSV solution. Suppose agents allow for the possibility that $y_t$ might depend on $y_{t-1}$, $v_{t-1}$ and $\varepsilon_{t-1}$ as well as an intercept and $v_t$. Is the MSV solution locally stable under the dynamics (39)? Evaluating $DT - I$ at $(a, b, c, d) = (\alpha/(1 - \beta_0 - \beta_1), 0, 0, 0)$ one obtains for the MSV solution the strong E-stability conditions:

$\beta_0 + \beta_1 < 1, \qquad \beta_0 < 1.$  (41)
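These derivative calculations are easy to verify numerically. The sketch below (illustrative parameter values of our own) forms the Jacobian of the $(a, b)$ subsystem of the T-map (38) by finite differences and checks the sign of the largest real part of the eigenvalues of $DT_{ab} - I$: at the ARMA(1,1) values this reproduces conditions (40), while at the MSV values it reproduces the strong E-stability requirement that $\beta_0 < 1$ together with $\beta_0 + \beta_1 < 1$.

```python
import numpy as np

def jac(g, x, h=1e-6):
    """Central-difference Jacobian of g at x."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (g(x + e) - g(x - e)) / (2 * h)
    return J

def max_eig(alpha, b0, b1):
    """Largest real part of the eigenvalues of DT_ab - I at the MSV and
    ARMA(1,1) solution values (negative means locally E-stable)."""
    T_ab = lambda p: np.array([alpha + b0 * p[0] + b1 * p[0] * (1 + p[1]),
                               b0 * p[1] + b1 * p[1] ** 2])
    msv = np.array([alpha / (1 - b0 - b1), 0.0])
    arma = np.array([-alpha / b1, (1 - b0) / b1])
    m = lambda p: max(np.linalg.eigvals(jac(T_ab, p) - np.eye(2)).real)
    return m(msv), m(arma)

# beta_0 + beta_1 < 1 and beta_0 < 1: MSV strongly E-stable, ARMA E-unstable.
msv1, arma1 = max_eig(alpha=1.0, b0=0.5, b1=0.2)
assert msv1 < 0 < arma1

# beta_0 > 1, beta_1 < 0 (conditions (40)): ARMA set E-stable, while the MSV
# solution fails strong E-stability since beta_0 > 1.
msv2, arma2 = max_eig(alpha=1.0, b0=1.5, b1=-1.0)
assert arma2 < 0 < msv2
```

In closed form the eigenvalues of $DT_{ab}$ at the ARMA values are $1 + \beta_1$ and $2 - \beta_0$, which is precisely how conditions (40) arise; at the MSV values they are $\beta_0 + \beta_1$ and $\beta_0$.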
These conditions are stronger than the weak E-stability condition (35). For the ARMA(1,1) solution class one obtains that the solutions are never strongly E-stable if one allows for PLMs of the form

$y_t = a + b_1 y_{t-1} + b_2 y_{t-2} + c v_{t-1} + d\varepsilon_{t-1} + v_t.$  (42)
The argument here is more difficult (since the linearization of the differential equation subsystem in $(b_1, b_2)$ has a zero eigenvalue), and is given in Evans and Honkapohja (1999a). See Evans and Honkapohja (1992, 1994b) for related arguments. (In fact the lack of strong E-stability is also delicate, since the differential system based on Equation (42) exhibits one-sided stability/instability.)^36

3.2.1.4. E-stability and indeterminacy. The overall situation for the model (30) is shown in Figure 2^37. In terms of E-stability, there are four regions of the parameter space. If $\beta_0 + \beta_1 > 1$ and $\beta_1 > 0$ then none of the REE are E-stable. If Equation (41) holds then the MSV solution is strongly E-stable, while the ARMA(1,1) solutions are E-unstable. If $\beta_0 + \beta_1 < 1$ and $\beta_0 > 1$ then the MSV solution is weakly but not strongly E-stable, and the ARMA(1,1) solutions are also weakly E-stable. Finally, if $\beta_0 + \beta_1 > 1$ and $\beta_1 < 0$ then the ARMA(1,1) solution set is weakly E-stable, while the MSV solution is E-unstable. In Figure 2 the region of indeterminacy (in which there are multiple stationary solutions) is marked by the shaded cones extending up and down from the point $(1, 0)$. Outside this region, the MSV solution is the unique stationary solution, while inside the indeterminacy region the ARMA solutions as well as the MSV solution are stationary. For this framework the connection between indeterminacy and E-stability can be summarized as follows. In the special case $\beta_0 = 0$, indeterminacy arises iff
^36 Under Equation (42) the strong E-stability condition for the MSV solution remains (41). ^37 We comment briefly on the relationship of the results given here and those in Evans (1985) and Evans and Honkapohja (1995a). The results in Evans (1985) are based on iterative E-stability, which is a stronger stability requirement. In addition, both Evans (1985) and Evans and Honkapohja (1995a) used a stronger definition of weak E-stability for the MSV solution, using PLMs with $y_{t-1}$ included.
[Fig. 2. Regions of the $(\beta_0, \beta_1)$ parameter plane, labelled by the stationarity and E-stability properties of the solutions: MSV solution strongly E-stable with ARMA solutions explosive and E-unstable; all solutions E-unstable with ARMA solutions stationary; all solutions E-unstable with ARMA solutions explosive; MSV solution E-unstable with ARMA solutions explosive and weakly E-stable; MSV solution strongly E-stable with ARMA solutions stationary and E-unstable; MSV solution weakly but not strongly E-stable with ARMA solutions stationary and weakly E-stable.]

$|\beta_1| > 1$, but the ARMA solutions are never E-stable. However if $\beta_0 > 1$, cases of (weakly) E-stable ARMA solutions arise in the right-hand half of the lower cone of indeterminacy. Thus in general there is no simple connection between weak E-stability and determinacy^38. Applying these results to Examples 3.1 and 3.2, we have the following. In the Sargent-Wallace "ad hoc" model, the MSV solution is uniquely stationary and it is strongly E-stable, while the other solutions are E-unstable. In the Taylor real-balance model we have $\beta_1 = -\beta_0$. There are three cases: (i) if $\beta_0 < \frac{1}{2}$ then the MSV solution is uniquely stationary and is strongly E-stable, while the other solutions are E-unstable; (ii) if $\frac{1}{2} < \beta_0 < 1$, the ARMA(1,1) solutions are also stationary, but are E-unstable, while the stationary MSV solution is strongly E-stable;
38 However, we know of no cases in which a set of ARMA solutions is strongly E-stable. See Evans and Honkapohja (1994b).
(iii) if $\beta_0 > 1$ then the MSV solution is stationary and weakly (but not strongly) E-stable, and the ARMA(1,1) solutions are also stationary and weakly (but not strongly) E-stable.

3.2.2. The leading example: adaptive learning

3.2.2.1. Adaptive and statistical learning of the MSV solution. Since the MSV solution is an iid process, the natural statistical estimate is the sample mean

$a_t = t^{-1}\sum_{i=1}^{t} y_{t-i},$

which is, in recursive form,

$a_t = a_{t-1} + t^{-1}(y_t - a_{t-1}).$  (43)

Inserting $y_t = \alpha + (\beta_0 + \beta_1)a_{t-1} + v_t$ into the recursive equation we obtain the dynamic equation

$a_t = a_{t-1} + t^{-1}\big(\alpha + (\beta_0 + \beta_1)a_{t-1} - a_{t-1} + v_t\big).$  (44)
Thus the model (30) with PLM (33) and learning rule (43) leads to the stochastic recursive algorithm (44), which can be analyzed using the tools of Section 2. The associated ODE is just the E-stability equation (34). It follows that if the E-stability condition $\beta_0 + \beta_1 < 1$ is met, then there will be convergence of $a_t$ to $\bar a = \alpha/(1 - \beta_0 - \beta_1)$ and hence of the process followed by $y_t$ to the MSV solution. Indeed, for this set-up there is a unique zero of the ODE and under the E-stability condition it is globally stable. Thus if $\beta_0 + \beta_1 < 1$ then $a_t \to \bar a$ with probability 1 globally, i.e. for any initial conditions.
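Recursion (44) can be simulated directly. The parameter values below are our own illustration with $\beta_0 + \beta_1 < 1$, so that $a_t$ should converge to $\bar a = \alpha/(1 - \beta_0 - \beta_1)$ from any starting point.

```python
import numpy as np

alpha, b0, b1 = 1.0, 0.2, 0.3          # E-stability: b0 + b1 = 0.5 < 1
a_bar = alpha / (1 - b0 - b1)          # MSV intercept, here 2.0

rng = np.random.default_rng(2)
a = 5.0                                 # arbitrary initial estimate
for t in range(1, 50001):
    y = alpha + (b0 + b1) * a + rng.normal(scale=0.5)   # ALM given a_{t-1}
    a += (1.0 / t) * (y - a)                            # sample-mean rule (43)

assert abs(a - a_bar) < 0.1             # global convergence to a_bar
```

The same run started from any other initial estimate converges to the same limit, illustrating the global nature of the result.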
Yt = --/311a +/311( 1 --/30)Yt-1 +Vt.
(45)
We also restrict attention to the "irregular" case 1/311(1 -/30)[ < 1, so that we are considering an asymptotically stationary solution. We remark that if the model (30) is regarded as defined for t /> 1, then the solution set (45) has an arbitrary initial condition Yo, the influence of which dies out asymptotically (in the irregular case). In nonstochastic models, vt - 0 and the MSV solution is the steady state Yt = a/(1 -/30 -/31). The solutions (45) then constitute a set of paths, indexed by
494
G.W. Evans and X H o n k a p o h j a
the initial Y0, converging to the steady state, and, as mentioned above, the steady state is then said to be "indeterminate". Thus the question we are now considering is the stochastic analogue to whether an adaptive learning rule can converge to an REE in the "indeterminate" case. 3.2.2.2.1. Recursiue least squares learning: the AR(1) case. We thus assume that agents have a PLM o f the AR(1) form yt = a + byt
l + Ut .
Agents estimate a and b statistically and at time t - 1 forecast Yt and Yt+l using the PLM Yt = at 1 + bt-lYt-~ + Or, where at-l, bt 1 are the estimates o f a and b at time t - 1. Inserting the corresponding forecasts into the model (30) it follows that Yt is given by Yt = Ta(at-l,bt 1)+ Tb(at l,bt-1)Yt
l+Ut,
(46)
where Ta(a,b) = a +/3oa+/31a(1 +b),
Tb(a,b) = fiob+[31b 2.
We assume that (at, bt) are estimated by ordinary least squares. Letting ¢~ = (at, bt),
zt' 1 : (1,yt 1),
least squares can be written in recursive form as Ot = Ot-i + t IRtlz t l(yt - z ~ Rt = Rt l + t
1
10t-I),
t
(47)
(zt 1Zt_l - Rt-l).
Equations (47) and (46) define a stochastic recursive algorithm, and the tools of Section 2 can be applied. In particular, for regions o f the parameter space in which Ibl < 1 we obtain the associated differential equation de d r - R - ' M z ( ¢ ) ( T ( O ) - ¢)'
dR dT - Mz(¢) - R ,
(48)
where M~(¢) = E[zt(¢)zt(¢)'] and zt(¢) is defined as the process for zt under Equation (46) with fixed Ct = ¢. Here ¢' = (a, b) and T(¢) = (Ta(a,b), Tb(a, b)). Provided that /30 +/31 < 1 and /30 > 1, the A R ( I ) solution (45) is stationary and weakly E-stable and it can be shown that the ODE (48) is locally stable at (a, b) = (-/311 a,/311 (1 -/30))39. It follows that under these conditions the solution (45) is locally stable under least squares learning. 3.2.2.2.2. Learning sunspot solutions. Consider now the full class o f ARMA(1,1) solutions (31). Assuming that ut and the sunspot variable el are observable at t, we
39 That stability of ODEs of the form (48) is governed by the stability of the differential equation de -- T(¢) - ¢ is shown in Marcet and Sargent (1989c). ~7
Ch. 7:
495
Learning Dynamics
can consider least squares learning which allows for this more general dependence. We now set (9~ = (at, bt, ct, dr),
z~l = (1,yt-1, vt-1, et 1),
(49)
and we continue to assume that agents use recursive least squares to update their coefficient estimates, ~bt. Thus under least squares learning the dynamic system is given by Equations (47), (49) and the equation Yt = T(Ot)'zt 1 + ut, where T(q)) is given by Equation (38). This again defines a stochastic recursive algorithm and for q~ = ( a , b , c , d ) with [b I < 1 the ODE is again o f the form (48). It is again straightforward to show that local stability of the ODE is governed by the differential equation dqVdr = T(~b) - q~ defining E-stability. There is a technical problem in applying the stochastic approximation tools, however: the assumptions are not satisfied at the ARMA(1,1) set of solutions since they include an unbounded continuum. Although this prevents a formal proof of convergence to the ARMA(1,1) set under least-squares learning, simulations appear to show that there is indeed convergence to an ARMA(1,1) solution if the E-stability condition is met 4°. See Evans and Honkapohja (1994b) for an illustration of convergence to sunspot solutions in a related model. 3.2.3. L a g g e d e n d o g e n o u s variables
In many economic models, the economy depends on lagged values as well as on expectations o f the future. We therefore extend our model (30) to allow direct feedback from Yt-~ and consider models of the form Yt = a + 6yt-1 + [3oE?_lyt + [31E[_lyt+l + or.
(50)
This reduced form is analyzed in Evans and Honkapohja (1992). E x a m p l e 3.3. Taylor (1980) o v e r l a p p i n g c o n t r a c t model: i 1 * Xt = ~ X t 1 + " ~ E t - l X t + l
+
i * * ~Y(Et-lqt + E~-lqt+~) + Ul~,
w, = ½(xt +xt 1), qt = k + mt - wt + uet, mt = Fn + (1 - ~ ) wt,
where xt is the (log) contract wage at time t, wt is the (log) average wage level, qt is aggregate output, and mt is money supply. 0 < ~ < 1, and 1 - cp is a measure of accommodation of price shocks. The reduced form is: xt = a + ½(1 - ½cPY)xt 1 1 ~cP~Et_~xt * + ½(1 1 ~q~y)E t* lXt+l + yr.
4o Recently Heinemaun (1997b) has looked at the stability under learning of the solutions to this model when agents use a stochastic gradient algorithm.
496
G. W. Evans and S. Honkapohja
E x a m p l e 3.4. Real balance model with p o l i c y feedback: Augment Example 3.2 with a policy feedback mt = m + dpt 1 + u4t. In this example /31 = --/30 and any value of fi0 ~ 0 can arise. 3.2.3.1. A characterization o f the solutions. The MSV solutions are o f the AR(1) form.
Guess a solution of the form (51)
Yt = ~P+PYt 1 +vt.
A solution o f this form must satisfy /31p 2 +([30 - 1)/9+ b = 0,
a(1 -[30 -/31(1 + p ) ) - I = ~fl.
If ( / 3 0 - 1) 2 -4/316 > 0 there are two solutions Pl and P2 o f the form (51). One also has the ARMA(2,1) class Yt = -/31~ a + /311( 1 - /3o)Yt 1 - (~/311yt-2 +
Ut + C l O t
1 + d, et-1,
(52)
where ~t is an arbitrary sunspot and where cl and dl are arbitrary. 3.2.3.2. Stability under learning o f the AR(1) M S V solutions. Assume that agents have a PLM of the AR(1) form yt = a + blyt 1 + or. The (weak) E-stability conditions are /3o+fil-l+/31bl
<0,
fio-l+2/31bl <0,
(53)
which are to be evaluated at the A R ( I ) REE bl = Pl and bl =/32. For real-time learning agents are regressing Yt on yt-1 and an intercept. Applying the results o f the previous section, it can be shown that an MSV AR(1) solution is locally stable under least squares learning if and only if it is E-stable. It is also easily verified that only one o f the two MSV solutions can be E-stable 41 . Hence local stability under least squares learning operates as a selection criterion which chooses between the two MSV solutions 42. Strong E-stability o f the MSV solutions and weak E-stability for the ARMA(2,1) class is analyzed in Evans and Honkapohja (1992). 3.2.3.3. Discussion o f examples. Example 3.3, Taylor's overlapping contracts model,
illustrates a situation which often arises. The model is saddle-point stable, i.e. there is a unique non-explosive solution and this solution is an MSV solution. In addition,
41 Recall that our use here of the term "MSV" solutions does not require them to satisfy the subsidiary principle of McCallum (1983). For a discussion of the relationship between E-stability and McCallum's MSV criterion see Evans and Honkapohja (1992). 42 Moore (1993) has this outcome in a model of aggregate externalities.
497
Ch. 7: Learning Dynamics
this is the solution which would be selected by learning dynamics: only the unique stationary MSV solution can be strongly or even weakly E-stable. Example 3.4, the real balances model augmented with a monetary policy feedback, illustrates the broader range of phenomena which can arise more generally. In particular, for appropriate choices o f (5 and /30 (with/31 = -[3o) one can obtain (i) a (weakly) E-stable ARMA(2,1) class o f solutions, (ii) an explosive AR(1) solution which is strongly E-stable with all other solutions E-unstable 43, or (iii) all solutions stationary, with a unique strongly E-stable AR(1) solution and all other solutions E-unstable. See Evans and Honkapohja (1992) for details. 3.3. Univariate models - f u r t h e r extensions and examples
The general principles developed in the preceding example can be readily extended to other univariate models which alter or extend the framework. 3.3.1. Models with t dating o f expectations
In many economic models the variable of interest Yt depends on expectations of future variables which are formed at time t. This means that these expectations may depend on exogenous variables dated at time t and also on Yt itself. The simplest example is the "Cagan model": (54)
yt = [3Etyt+l + 3,wt + vt,
where wt is an exogenous stochastic process which is observed at time t and vt is an unobserved white noise shock. For simplicity we will focus on the case in which wt follows a stationary AR(1) process (55)
Wt = a + ~PWt I + ut,
where ut is white noise and [~Pl < 1. Example 3.5. Cagan model o f inflation: The demand for money is a linear function o f expected inflation mt - p t = -Y(E[pt+l - P t ) + tk,
y > O,
where mt is the log o f the money supply at time t, and Pt is the log of the price level at time t. This can be solved for the above form with yt =--pt, wt =- mr, [3 = 7/(1 + y), and )~ = 1/(1 + 7).
43 Adaptive learning methods can be extended to show convergenceto explosive REE such as this one see Evans and Honkapohja (1994a). For further analysis of learning in nonstationary setups, see Zenner (1996).
498
G. W. Evans and S. Honkapohja
E x a m p l e 3.6. P P P model o f the exchange rate: In a small open economy with flexible exchange rates, perfect capital mobility, exogenous output and purchasing power parity (PPP) we have m t - P t - a - c i t + ot, c > O it = tt + E t el+l - et Pt = fix + ex. Here it is the nominal interest rate, it is the foreign nominal interest rate, et is the log of the exchange rate, mt is the log of the money snpply, pt is the log of the price level, and/Sx is the log of the foreign price level, rot, Or, 19t and ix are treated as exogenous. The
first equation represents monetary equilibrium, the second is the open parity condition and the third is the PPP equation. Solving for et we arrive at the form (54), with Yt =- et, wt a linear combination of the exogenous variables, and/3 = c/(1 + c).
Example 3.7. Asset pricing with risk neutrality: Under risk neutrality and appropriate assumptions, all assets earn expected rate of return $1+r$, where $r > 0$ is the real net interest rate, assumed constant. If an asset pays dividend $d_t$ at the end of period $t$ then its price $p_t$ at $t$ is given by
$$p_t = (1+r)^{-1}(E_t^* p_{t+1} + d_t).$$
We again have the form (54), with $y_t \equiv p_t$, $w_t \equiv d_t$ and $\beta = (1+r)^{-1}$. There is a unique MSV solution to the model given by $y_t = a + b w_t + v_t$, where
$$a = (1-\beta)^{-1}\alpha\beta b \quad \text{and} \quad b = (1-\beta\psi)^{-1}\lambda.$$
In the context of the model (54) and particularly in the case of the asset pricing application, the MSV solution is often referred to as the fundamental solution 44. Is the fundamental solution stable under least squares learning? We first obtain the (weak) E-stability conditions. The T-map is
$$T_a(a, b) = \beta a + \alpha\beta b,$$
$$T_b(a, b) = \beta\psi b + \lambda,$$
and it is easily verified that the fundamental solution is E-stable if $\beta < 1$ and $\beta\psi < 1$. We have assumed $|\psi| < 1$ and in all three of our economic examples above $0 < \beta < 1$. Thus the E-stability conditions are met. Under least squares learning agents at time $t$ estimate the model $y_t = a + b w_t + v_t$ by running a least squares regression of $y_t$ on an intercept and $w_t$ using the data available.
44 The solution can also be computed as the present value of expected dividends.
Ch. 7: Learning Dynamics
Let $(a_t, b_t)$ denote the least squares estimates using data on $(w_i, y_i)$, $i = 1, \ldots, t-1$ 45. Expectations are then given by
$$E_t^* y_{t+1} = a_t + b_t E_t w_{t+1} = (a_t + b_t\alpha) + b_t\psi w_t,$$
where for simplicity we treat $\psi$ and $\alpha$ as known, and under learning $y_t$ is given by
$$y_t = T_a(a_t, b_t) + T_b(a_t, b_t) w_t + v_t.$$
Applying the standard stochastic approximation results, it follows that the fundamental solution is locally stable under least squares learning provided the E-stability condition above is met. This holds for each of the above economic examples.
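This convergence can be illustrated with a small simulation. The sketch below is not from the original text: it assumes illustrative parameter values ($\beta = 0.5$, $\psi = 0.5$, $\alpha = \lambda = 1$, shock standard deviations 0.1), generates $y_t$ each period from the ALM implied by the previous period's estimates, and re-runs the regression of $y_t$ on an intercept and $w_t$:

```python
import numpy as np

rng = np.random.default_rng(0)
beta, psi, alpha, lam = 0.5, 0.5, 1.0, 1.0   # assumed illustrative values
b_bar = lam / (1 - beta * psi)               # fundamental slope
a_bar = alpha * beta * b_bar / (1 - beta)    # fundamental intercept

a, b = 0.0, 0.0                # initial perceptions
w = alpha / (1 - psi)          # start w_t at its unconditional mean
X, Y = [], []
for t in range(5000):
    w = alpha + psi * w + 0.1 * rng.standard_normal()
    # ALM implied by the time-(t-1) estimates:
    # y_t = T_a(a,b) + T_b(a,b) w_t + v_t
    y = (beta * a + alpha * beta * b) + (beta * psi * b + lam) * w \
        + 0.1 * rng.standard_normal()
    X.append([1.0, w]); Y.append(y)
    # least squares regression of y on an intercept and w, data through t
    (a, b), *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)

print(a, b)   # the estimates approach the fundamental solution (a_bar, b_bar)
```

Since $\beta < 1$ and $\beta\psi < 1$ here, the E-stability conditions hold and the recursion settles near the fundamental coefficients.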
3.3.1.1. Alternative dating. In the above models we have treated $E_t^* y_{t+1}$ as an expectation formed at time $t$ using all information dated $t$ or earlier. The analysis of the fundamental solution and its stability under learning is little changed if we replace this by the assumption that the values of current variables are unknown when the expectation is formed. Thus if we consider instead
$$y_t = \beta E_{t-1}^* y_{t+1} + \lambda w_t + v_t,$$
the fundamental solution is of the form $y_t = \bar{a} + \bar{b} w_{t-1} + \eta_t$. Although the T-mapping is somewhat different, it is readily determined that the E-stability conditions are identical.
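As a sketch of why the conditions coincide (a derivation not spelled out in the text, so treat the algebra as assumption-laden): with the PLM $y_t = a + b w_{t-1}$, substituting $E_{t-1}^* y_{t+1} = a + b(\alpha + \psi w_{t-1})$ into the model gives ALM coefficients $T_a = \beta a + (\beta b + \lambda)\alpha$ and $T_b = (\beta b + \lambda)\psi$, whose relevant partials are again $\beta$ and $\beta\psi$:

```python
# T-map for the t-1 dating model y_t = beta*E_{t-1}y_{t+1} + lam*w_t + v_t
# under the PLM y_t = a + b*w_{t-1}, with w_t = alpha + psi*w_{t-1} + u_t.
# Parameter values are purely illustrative.
def T(a, b, beta=0.9, lam=1.0, alpha=1.0, psi=0.5):
    Ta = beta * a + (beta * b + lam) * alpha   # ALM intercept
    Tb = (beta * b + lam) * psi                # ALM coefficient on w_{t-1}
    return Ta, Tb

eps = 1e-6
dTa = (T(1 + eps, 0)[0] - T(1, 0)[0]) / eps   # equals beta
dTb = (T(0, 1 + eps)[1] - T(0, 1)[1]) / eps   # equals beta*psi
print(dTa, dTb)
```

The E-stability conditions are therefore $\beta < 1$ and $\beta\psi < 1$, exactly as with $t$ dating.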
3.3.2. Bubbles 46

There are other solutions to (54) besides the fundamental solution. For simplicity let us focus on the case in which $w_t \equiv 1$, i.e. the model
$$y_t = \beta E_t^* y_{t+1} + \lambda + v_t,$$
and we assume $0 < \beta < 1$. The fundamental solution is $y_t = (1-\beta)^{-1}\lambda + v_t$ and the general solution takes the form
$$y_t = -\beta^{-1}\lambda + \beta^{-1} y_{t-1} - \beta^{-1} v_{t-1} + d\varepsilon_t,$$
where $\varepsilon_t$ is an arbitrary sunspot and $d$ is arbitrary. Solutions other than the fundamental solution are explosive and are often referred to as "rational bubbles" or "explosive bubbles". Under the alternative dating the model is
$$y_t = \beta E_{t-1}^* y_{t+1} + \lambda + v_t.$$
45 For technical simplicity we assume that the data point $(w_t, y_t)$ is not available for the least squares estimates at $t$ of the coefficients $(a_t, b_t)$, though we do allow the time $t$ forecasts to depend on $w_t$. This avoids simultaneity between $y_t$ and $b_t$. With additional technical complexity this simultaneity can be permitted, e.g. Marcet and Sargent (1989c).
46 Salge (1997) is a recent review of the literature on asset pricing and bubbles.
Note that this model is a special case of the Leading Example (30) with $\beta_0 = 0$ and $\beta_1 = \beta$. The bubbles solutions take the form (31), i.e.
$$y_t = -\beta^{-1}\lambda + \beta^{-1} y_{t-1} + v_t + c_1 v_{t-1} + d\varepsilon_{t-1}.$$
Are the bubble solutions stable under learning? We consider the case with $t-1$ dating of expectations 47 and confine attention to E-stability. The results are apparent from Figure 2. Since $\beta_0 = 0$ and $0 < \beta_1 < 1$ the bubble solutions are never stable under learning. Note that the fundamental solution is strongly E-stable in this model. Least squares learning will locally converge to the fundamental solution, but not to a bubble solution. Although this analysis casts doubt on the possibility of adaptive learning converging to explosive bubbles, several points should be borne in mind. First, given initial expectations near an explosive path, least squares learning will not necessarily converge to the fundamental solution. Least squares learning may instead evolve along nonrational explosive paths as the economy is pushed further and further from any REE. Secondly, this section has focused on a simple set-up, which includes the standard asset pricing model in which the issue of bubbles is most often discussed. The previous section showed that in models with a lagged dependent variable, it is possible for learning to converge to an explosive REE, even when there is a unique non-explosive path 48. A more complex model of asset pricing may thus generate bubbles which are stable under adaptive learning. The issue of learning in asset pricing models with feedback has been investigated by Timmermann (1994).
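The instability of the bubble class can be seen in a minimal sketch of the E-stability dynamics for the AR(1) coefficient. For PLMs $y_t = a + b y_{t-1}$ in this model (where $\beta_0 = 0$), the ALM coefficient on $y_{t-1}$ is $\beta_1 b^2$, so the E-stability ODE for $b$ is $db/d\tau = \beta_1 b^2 - b$, with rest points $b = 0$ (the fundamental solution) and $b = \beta_1^{-1}$ (the bubble). The value $\beta_1 = 0.95$ below is purely illustrative:

```python
def integrate(b0, beta=0.95, h=0.01, steps=4000, cap=10.0):
    # Euler integration of the E-stability ODE db/dtau = beta*b**2 - b
    b = b0
    for _ in range(steps):
        b += h * (beta * b**2 - b)
        if abs(b) > cap:            # diverging: the path is E-unstable
            return float('inf')
    return b

bubble = 1 / 0.95
print(integrate(0.3))            # flows to 0: fundamental solution is E-stable
print(integrate(bubble - 0.05))  # just below the bubble: also flows to 0
print(integrate(bubble + 0.05))  # just above the bubble: diverges
```

The bubble rest point has $dT_b/db = 2\beta_1 b = 2 > 1$, so it repels perceptions from either side, consistent with the discussion above.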
3.3.3. A monetary model with mixed datings
In some cases reduced forms include a mixture of dates at which expectations are formed. An example is the Duffy (1994) analysis of the Farmer (1991) model 49. A version of the OG production model is considered in which the output of the single perishable good depends on current and lagged labor input. This gives two ways for agents to store wealth: holding money and holding inventories of goods in process. As in the basic OG model, one way to summarize the model is to obtain the demand for real balances as a function of expected inflation, $m_t = f(E_t^* p_{t+1}/p_t)$. If the money stock is constant, then $m_t = M/p_t$ and the reduced form can be solved for $p_t$ in terms of
47 With t dating bubbles are also unstable under learning, see Evans (1989). 48 Adam and Szafarz (1992) also emphasize the importance of lagged endogenous variables. 49 Another example is the analysis of Muth's inventory model presented in Evans (1989). This paper also shows how to apply E-stability to models involving conditional variances as well as conditional expectations.
$E_t^* p_{t+1}$. Alternatively the model can be written in terms of the inflation rate $\pi_t = p_t/p_{t-1}$ as
$$\pi_t = \frac{f(E_{t-1}^* \pi_t)}{f(E_t^* \pi_{t+1})}.$$
Note that the model has a perfect foresight steady state of $\pi_t = 1$ (this is the MSV solution). Linearizing around $\pi = 1$ yields
$$\pi_t = 1 + \beta_0 E_{t-1}^* \pi_t - \beta_0 E_t^* \pi_{t+1}, \quad \text{where } \beta_0 = \frac{f'(1)}{f(1)}.$$
While in the standard OG framework $f'(1)/f(1)$ can be positive or negative but cannot exceed 1, Farmer shows that in his model $f'(1)/f(1) > 1$ is possible for appropriate specification of technology and preferences. The listing of solutions and the analysis of learning can proceed along the lines presented in Section 3.2; the principal difference is that $E_t^* \pi_{t+1}$ appears here in place of $E_{t-1}^* \pi_{t+1}$ (the model here also has no random shock, but this could be easily introduced). In addition to the MSV solution there are perfect foresight AR(1) solutions of the form $\pi_t = \beta_0^{-1} + \beta_0^{-1}(\beta_0 - 1)\pi_{t-1}$, and if $\beta_0 > \frac{1}{2}$, these constitute a family of paths (indexed by the initial $\pi_0$) converging to $\pi_t = 1$ (i.e., if $\beta_0 > \frac{1}{2}$ the steady state is indeterminate). It is easily verified that the MSV solution $\pi_t = 1$ is E-stable under the PLM $\pi_t = a$ for any value of $\beta_0$. If instead the PLM allows for the possibility of an AR(1) path $\pi_t = a + c\pi_{t-1}$ then the solution $\pi_t = 1$ remains E-stable only if $\beta_0 < 1$. Thus, this constitutes a strong E-stability condition for the MSV solution. However, the AR(1) continuum is weakly E-stable if $\beta_0 > 1$ 50. This model provides an example, based on a version of the OG model with fully specified microfoundations, of the possibility of learning converging to an indeterminate steady state. If random noise were present, the convergence would be to a member of the ARMA(1,1) class of solutions as in our Leading Example.

3.3.4. A linear model with two forward leads
In more complicated linear models it is possible for there to be multiple strongly E-stable solutions. In this set-up there will be two or more solutions which are locally stable under least squares learning. To see that this is possible, consider a model in which $E_{t-1}^* y_{t+2}$ is also included:
$$y_t = \alpha + \delta y_{t-1} + \beta_0 E_{t-1}^* y_t + \beta_1 E_{t-1}^* y_{t+1} + \beta_2 E_{t-1}^* y_{t+2} + v_t. \qquad (56)$$
There are again various classes of solutions: an ARMA(3,2) class of solutions, 1 to 3 ARMA(2,1) classes of solutions (which in general depend on sunspot variables,

50 Strong E-stability of the AR(1) solutions would look at PLMs of the form $\pi_t = a + c\pi_{t-1} + d\pi_{t-2}$. As in the "Leading Example" of the earlier section, it can be shown that the corresponding strong E-stability equation at the AR(1) solutions has a zero root and exhibits one-sided stability/instability.
i.e. on arbitrary martingale difference sequences), and up to 3 AR(1) solutions. We focus on these latter solutions - the MSV solutions. [See Evans and Honkapohja (1994b) for a discussion of the other solutions and a full treatment of details.] AR(1) solutions of the form (51), i.e. $y_t = \varphi + \rho y_{t-1} + v_t$, must satisfy
$$\rho = \delta + \beta_0\rho + \beta_1\rho^2 + \beta_2\rho^3.$$
Depending on whether or not the cubic has a pair of complex roots, there are 1 or 3 AR(1) solutions. The main new result is that for an open set of parameter values two of the AR(1) solutions are strongly E-stable. For weak E-stability we consider PLMs of the form $y_t = a + b_1 y_{t-1} + v_t$.
This leads to the T-map which specifies the ALM $y_t = T_a(a, b_1) + T_{b_1}(a, b_1) y_{t-1} + v_t$:
$$T_a(a, b_1) = \alpha + \beta_0 a + \beta_1(a b_1 + a) + \beta_2(b_1(b_1 a + a) + a),$$
$$T_{b_1}(a, b_1) = \delta + \beta_0 b_1 + \beta_1 b_1^2 + \beta_2 b_1^3.$$
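A numerical sketch (with parameter values chosen purely for illustration, not taken from the text) finds the AR(1) coefficients as roots of the cubic and checks the weak E-stability condition $T_{b_1}'(b_1) < 1$ at each root; two of the three roots pass the check:

```python
import numpy as np

# assumed illustrative parameters for model (56)
alpha, delta, beta0, beta1, beta2 = 0.0, -0.5, 1.75, 0.75, -0.5

# AR(1) coefficients solve b = delta + beta0*b + beta1*b**2 + beta2*b**3,
# i.e. the cubic beta2*b**3 + beta1*b**2 + (beta0 - 1)*b + delta = 0
roots = np.roots([beta2, beta1, beta0 - 1.0, delta])
real = np.sort(roots[np.abs(roots.imag) < 1e-8].real)

def Tb_prime(b):
    # derivative of T_b(b) = delta + beta0*b + beta1*b**2 + beta2*b**3
    return beta0 + 2 * beta1 * b + 3 * beta2 * b**2

for b in real:
    print(b, Tb_prime(b) < 1.0)   # two roots satisfy the condition
```

This illustrates the stability of the $b_1$ component only; the full (weak and strong) conditions also involve the intercept and additional lags, as discussed next.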
The key to the result is that stability of the $b_1$ parameter is determined by a cubic which may have two stable roots, depending on the parameters. For strong E-stability we allow the ALM to depend on additional lags $y_{t-i}$ as well as on lags $v_{t-i}$ and $\varepsilon_{t-i}$. For the strong E-stability conditions and for conditions under which there are two strongly E-stable stationary AR(1) solutions, see Evans and Honkapohja (1994b).

Example 3.8. Dornbusch-type model with policy feedback: The equations are
$$p_t - p_{t-1} = \phi E_{t-1} d_t,$$
$$d_t = -\gamma(r_t - E_t p_{t+1} + p_t) + \eta(e_t - p_t),$$
$$r_t = \lambda^{-1}(p_t - \theta p_{t-1}),$$
$$r_t = E_t e_{t+1} - e_t.$$
The first equation is a Phillips curve, the second the IS curve for an open economy, the third is the LM curve in which monetary policy reacts to $p_{t-1}$, and the last equation is the open parity condition. The reduced form for $p_t$ can be put in the required form (56). Evans and Honkapohja (1994b) show that the case of two strongly E-stable AR(1) solutions, as well as weakly E-stable ARMA solution classes depending on a sunspot variable, can arise for appropriate parameter values.

3.4. Multivariate models
Systematic examination of univariate linear models is illuminating because they can be used to illustrate the wide range of outcomes for macroeconomic models with
adaptive learning and because numerous textbook examples fit this framework. For serious applied work a multivariate set-up is needed. Consider the following example.

Example 3.9. Ad hoc sticky-price model with policy feedback:
$$p_t - p_{t-1} = a_0 + a_1 q_t + (E_{t-1} p_{t+1} - E_{t-1} p_t) + v_{1t},$$
$$q_t = b_0 - b_1(r_t - E_{t-1} p_{t+1} + E_{t-1} p_t) + v_{2t},$$
$$m_t - p_t = c_0 + c_1 q_t - c_2 r_t + v_{3t},$$
$$m_t = d_0 + d_1 p_{t-1} + d_2 q_{t-1} + d_3 r_{t-1} + d_4 m_{t-1} + v_{4t}.$$
The first equation is a standard but ad hoc rational expectations Phillips curve in which inflation, $p_t - p_{t-1}$, depends on expected inflation, $E_{t-1} p_{t+1} - E_{t-1} p_t$, and on aggregate real output, $q_t$. The second equation is the IS curve, relating $q_t$ to the ex ante real interest rate $r_t - E_{t-1} p_{t+1} + E_{t-1} p_t$. The third equation is the LM curve, equating the supply of real money balances $m_t - p_t$ to the demand for them. Here $r_t$ is the nominal interest rate. The last equation is the monetary policy feedback rule, relating the nominal supply of money to lagged prices $p_{t-1}$ and lagged values of the other variables. Each equation is also subject to an unobservable iid random shock, $v_{it}$. We assume $a_1, b_1, c_1, c_2 > 0$. Letting $y_t = (p_t, q_t, r_t, m_t)'$ and $v_t = (v_{1t}, v_{2t}, v_{3t}, v_{4t})'$ this model can be put in the form
$$y_t = \alpha + \beta_0 E_{t-1} y_t + \beta_1 E_{t-1} y_{t+1} + \delta y_{t-1} + \zeta v_t,$$
where the coefficients are appropriately sized matrices. More generally, allowing for a vector of exogenous variables $w_t$ we can consider the following general form:
$$y_t = \alpha + \beta_0 E_{t-1} y_t + \beta_1 E_{t-1} y_{t+1} + \delta y_{t-1} + \kappa w_t + \zeta v_t, \qquad (57)$$
$$w_t = \rho w_{t-1} + e_t.$$
Here $y_t$ is an $n \times 1$ vector of endogenous variables and $w_t$ is a vector of exogenous variables which we assume to follow a stationary VAR, so that $e_t$ is white noise. This set-up, apart from the dating of expectations, is close to McCallum (1983).

3.4.1. MSV solutions and learning
We consider the MSV (minimal state variable) solutions which are of the following form:
$$y_t = a + b y_{t-1} + c w_{t-1} + \kappa e_t + \zeta v_t, \qquad (58)$$
$$w_t = \rho w_{t-1} + e_t, \qquad (59)$$
where $a$, $b$, and $c$ are to be determined by the method of undetermined coefficients. Note that the solutions are in the form of a VAR (vector autoregression). Computing
G.W. Evans and S. Honkapohja
504
$E_{t-1} y_t$ and $E_{t-1} y_{t+1}$ and inserting into Equation (57), it follows that REE of this form must satisfy the matrix equations
$$(I - \beta_0 - \beta_1 b - \beta_1)a = \alpha, \qquad (60)$$
$$\beta_1 b^2 + (\beta_0 - I)b + \delta = 0, \qquad (61)$$
$$(I - \beta_0 - \beta_1 b)c - \beta_1 c\rho = \kappa\rho. \qquad (62)$$
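For a numerical illustration, the system (60)-(62) can be solved by iterating the $b$-equation to a fixed point and then solving the linear equations for $a$ and $c$; the bivariate matrices below are assumed purely for illustration:

```python
import numpy as np

# assumed illustrative matrices for a bivariate case of model (57),
# with a scalar exogenous process w_t (rho scalar)
I2 = np.eye(2)
alpha = np.array([1.0, 1.0])
beta0 = np.array([[0.2, 0.05], [0.0, 0.2]])
beta1 = np.array([[0.3, 0.0], [0.05, 0.3]])
delta = 0.1 * I2
kappa = np.array([[0.5], [0.2]])
rho = 0.5

# iterate b -> beta1 b^2 + beta0 b + delta; a fixed point solves (61)
b = np.zeros((2, 2))
for _ in range(500):
    b = beta1 @ b @ b + beta0 @ b + delta

# given b, (60) and (62) are linear in a and c
a = np.linalg.solve(I2 - beta0 - beta1 @ b - beta1, alpha)
c = np.linalg.solve(I2 - beta0 - beta1 @ b - rho * beta1, kappa * rho)

# verify that b satisfies the quadratic matrix equation (61)
resid = beta1 @ b @ b + (beta0 - I2) @ b + delta
print(np.abs(resid).max())   # essentially zero
```

Iterating the $b$-map in this way is one version of the idea, noted below, of using the E-stability algorithm itself to compute a solution.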
In general there are multiple solutions and various techniques are available for solving models of this form. For example, in the regular case there is a unique stationary solution and a modification of the Blanchard and Kahn (1980) technique can be used. See Evans and Honkapohja (1999a) and Christiano and Valdivia (1994). It is also possible to use the E-stability algorithm itself to find a solution. Sargent (1993) noted the possibility of using learning rules to solve RE models, and an advantage of this procedure is that it yields only solutions which are stable under learning. Finally, using numerical techniques one can directly compute E-stable equilibria in applied large-scale macroeconomic models, see Garratt and Hall (1997) and Currie, Garratt and Hall (1993) who also investigate dynamics of learning transitions during regime changes. With PLMs of the form (58), the mapping from the PLM to the ALM is
$$T\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} \alpha + (\beta_0 + \beta_1 + \beta_1 b)a \\ \beta_1 b^2 + \beta_0 b + \delta \\ \beta_0 c + \beta_1 b c + \beta_1 c\rho + \kappa\rho \end{pmatrix}. \qquad (63)$$
Expectational stability is determined by the matrix differential equation
$$\frac{d}{d\tau}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = T\begin{pmatrix} a \\ b \\ c \end{pmatrix} - \begin{pmatrix} a \\ b \\ c \end{pmatrix}. \qquad (64)$$
To analyze the local stability of (64) at an RE solution $\bar a$, $\bar b$, $\bar c$ one linearizes the system at that RE solution. In Evans and Honkapohja (1999a) we show the following result:

Proposition 7. An MSV solution $\bar a$, $\bar b$, $\bar c$ to Equation (57) is E-stable if (i) all the eigenvalues of $\partial \operatorname{vec} T_b(\operatorname{vec} \bar b)/\partial(\operatorname{vec} b)'$ have real parts less than 1, (ii) the eigenvalues of $\partial \operatorname{vec} T_c(\operatorname{vec} \bar b, \operatorname{vec} \bar c)/\partial(\operatorname{vec} c)'$ have real parts less than 1, and (iii) the eigenvalues of the matrix $\beta_0 + \beta_1 + \beta_1 \bar b$ have real parts less than 1.

Under least squares learning the perceived law of motion
$$y_t = a_{t-1} + b_{t-1} y_{t-1} + c_{t-1} w_{t-1} + \kappa e_t + \zeta v_t$$
is used by agents to make forecasts, where the parameters $a_t$, $b_t$ and $c_t$ are updated by running multivariate recursive least squares (RLS). (Although $\kappa$ and $\zeta$ may be
Ch. 7:
Learning Dynamics
505
unknown, these do not affect the agents' forecasts.) Letting $\phi_t' = (a, b, c)$ and $z_t' = (1, y_t', w_t')$, RLS can still be written as Equation (47), and $y_t$ is determined by the ALM $y_t = T(\phi_{t-1})' z_{t-1} + \kappa e_t + \zeta v_t$. The tools already developed extend to the multivariate case and can be used to show that an E-stable solution is locally stable under recursive least squares learning.

3.4.2. Multivariate models with time t dating
Many multivariate models, such as the RBC (Real Business Cycle) model and those described in Farmer (1993), have $y_t$ depend on expectations $E_t y_{t+1}$, i.e. where the vector $y_t$ is part of the information set.

Example 3.10. Real Business Cycle Model: The equilibrium equations of the standard RBC model, once linearized around the steady state, can be written in the form
$$c_t = \beta_{11} E_t c_{t+1} + \beta_{13} E_t s_{t+1},$$
$$k_t = \delta_{21} c_{t-1} + \delta_{22} k_{t-1} + \delta_{23} s_{t-1}, \qquad (65)$$
$$s_t = \rho s_{t-1} + v_t,$$
where $c_t$ is consumption, $k_t$ is the capital stock and $s_t$ is the productivity shock (with all variables in log deviation from the mean form). This is a special case of the general set-up
$$y_t = \alpha + \beta E_t y_{t+1} + \delta y_{t-1} + \kappa w_t + \zeta v_t, \qquad (66)$$
$$w_t = \rho w_{t-1} + e_t,$$
and the MSV solutions now have the form
$$y_t = a + b y_{t-1} + c w_t + d v_t. \qquad (67)$$
Solutions can be computed using the Blanchard-Kahn technique, e.g. see Farmer (1993). In Evans and Honkapohja (1999a) we obtain corresponding E-stability conditions and show that they govern convergence of least squares learning. It is straightforward to use iterations of the E-stability algorithm to compute a solution in VAR form for numerical specifications of the RBC model.

3.4.3. Irregular models
If the model (57) or (66) is not regular then there exist multiple stationary solutions and the solutions may depend on sunspot variables. For example, Benhabib and Farmer (1994), Farmer (1993) and Farmer and Guo (1994) have emphasized variations of the RBC model incorporating increasing returns which can lead to irregular models. In
the irregular case, models of the form (66), for example, can have solutions of the form
$$y_t = a + b y_{t-1} + c w_t + d v_t + f \varepsilon_t,$$
where $\varepsilon_t$ is a sunspot variable and $f \neq 0$. The general techniques for studying least-squares learning can be extended to this set-up. The detailed information assumptions on expectation formation play an important role in determining the conditions for stability under learning in these cases. Based on the results in Evans and Honkapohja (1999a) it appears that requiring a solution to be locally stable under adaptive learning imposes additional substantive requirements on the reduced-form, and hence the underlying structural, parameter values.
4. Learning in nonlinear models

4.1. Introduction
Our aim here is to discuss in detail how the analysis of learning behavior is carried out for various nonlinear models. As a vehicle we make use of a few simple overlapping generations (OG) models. These models are convenient illustrations of learning and adjustment of expectations, since basic OG models usually have a one-step ahead forward-looking reduced form. OG models are usually nonlinear, and they often have multiple REE. The models may have different types of equilibria, such as steady states, indeterminate paths to steady states, cycles, and sunspots. Thus these models can exhibit phenomena known as indeterminacy and endogenous fluctuations, see e.g. Boldrin and Woodford (1990) and Guesnerie and Woodford (1992) for a review 51. It should be noted, though, that such phenomena are not restricted to OG models. For example, Howitt and McAfee (1992) obtain multiple steady states and endogenous fluctuations in a model of search externalities in the labor market. These fluctuations are stable under a learning process. Evans, Honkapohja and Romer (1998b) consider an endogenous growth model with complementarities between different types of capital goods. Their model has equilibria with self-fulfilling fluctuations which are stable under learning. There are numerous other nonlinear models in the literature 52, and below we discuss two other models in some detail.
51 Note that these fluctuations are self-fulfilling. A different view is in e.g. Heymann and Sanguinetti (1997) who develop models in which (nonrational) transitory fluctuations can arise when agents asymptotically learn a steady state.
52 For example, Balasko (1994) and Balasko and Royer (1996) consider expectational stability in some Walrasian models.
Although a large part of our focus is on stability of equilibria under adaptive learning, it should be noted that instability of REE has also been analyzed. This can play an important role for economic results. Woodford (1990) develops an OG model in which the steady state is unstable under certain conditions, and the economy converges to a sunspot solution as a result of learning behavior. Howitt (1992) shows the instability of the steady state under learning behavior when monetary policy is carried out by means of interest rate control in a conventional macroeconomic model. Bullard (1994) shows that in an OG model non-rational limit-cycle trajectories can emerge from learning with sufficiently rapid money growth. Instability results for nonstochastic economies are emphasized by Bénassy and Blad (1989), Grandmont and Laroque (1991) and Grandmont (1998). Interestingly, Chatterji and Chattopadhyay (1997) show that global stability may prevail in spite of local instability 53.

4.2. Steady states and cycles in models with intrinsic noise
We will mostly consider models with intrinsic noise. These arise, for example, when a stochastic taste or productivity shock is introduced into the basic OG model. We first present a general one-step ahead forward-looking model with random shocks, after which we provide several economic examples of nonlinear models with and without noise. Consider univariate models of the form
$$y_t = E_t^* G(y_{t+1}) + v_t, \qquad (68)$$
where $G$ is a nonlinear function and $v_t$ is an exogenous shock. We assume that $v_t$ is iid with mean $E(v_t) = 0$. Here $E_t^* G(y_{t+1})$ denotes the expectation of $G(y_{t+1})$ formed at time $t$ (which is nonrational outside an equilibrium). These forms are often inadequate for economic models, see below. In fact, adding productivity shocks into even simple versions of the basic OG model requires the more general reduced form
$$y_t = H(E_t^* G(y_{t+1}, v_{t+1}), v_t). \qquad (69)$$
We will make the assumption that the mappings $G$ and $H$ are twice continuously differentiable on some open rectangles (possibly infinite). The analysis of learning a steady state for models (68) and (69) was already considered in Section 2.7.1. Before taking up the issue of adaptive learning we consider some RE solutions to economic models taking the above form.

4.2.1. Some economic examples
Example 4.1. In the basic OG model with production introduced in Section 1.2.3 agents supply labor $n_t$ and produce (perishable) output when young and consume $c_{t+1}$

53 Evans and Honkapohja (1998b) and Honkapohja (1994) discuss the differences between the nonstochastic and stochastic economies.
when old. Output is equal to labor supply and there is a fixed quantity of money $M$. Holding money is the only mechanism for saving and thus the budget constraints are $p_{t+1} c_{t+1} = M$ and $p_t n_t = M$. We now introduce a random taste shock by making the utility function
$$U(c_{t+1}) - V(n_t) + \varepsilon_t \ln(n_t).$$
Here $\varepsilon_t$ is an iid positive random shock to the disutility of labor and we assume that $\varepsilon_t$ is known to the agents who are young at time $t$ 54. The first-order condition for a maximum thus is
$$V'(n_t) - \varepsilon_t/n_t = E_t^* \frac{p_t}{p_{t+1}} U'(c_{t+1}).$$
Combining with the market clearing condition $c_{t+1} = n_{t+1}$ and using $p_t/p_{t+1} = n_{t+1}/n_t$ we get
$$n_t V'(n_t) - \varepsilon_t = E_t^*(n_{t+1} U'(n_{t+1})).$$
Finally, if we change variables from $n$ to $y = \vartheta(n)$, where $\vartheta(n) \equiv n V'(n)$ (note that $\vartheta(n)$ is increasing for all $n \geq 0$), we obtain Equation (68) where $v_t \equiv \varepsilon_t - E(\varepsilon_t)$.

Example 4.2. The technique used in Example 4.1 to transform the model to the form (68) cannot always be used when there are intrinsic shocks. (It should be apparent that the technique required very special assumptions on utility.) The more general form (69) is usually required. As an illustration consider the case of additive productivity shocks. We return to the assumption that utility is given by $U(c_{t+1}) - V(n_t)$, but now assume that output $q_t$ is given by
$$q_t = n_t + \lambda_t,$$
where $\lambda_t$ is an iid positive productivity shock. The budget constraints are now $p_{t+1} c_{t+1} = M$ and $p_t q_t = M$, and the first-order condition plus the market clearing condition $q_{t+1} = c_{t+1}$ and $p_t/p_{t+1} = q_{t+1}/q_t$ yields
$$(n_t + \lambda_t) V'(n_t) = E_t^*((n_{t+1} + \lambda_{t+1}) U'(n_{t+1} + \lambda_{t+1})).$$
Since $(n + \lambda) V'(n)$ is strictly increasing in $n$, and letting $v_t \equiv \lambda_t - E(\lambda_t)$, this equation can be solved for $n_t$ and put in the form (69) where $y_t \equiv n_t$.
54 Letting $\tilde{V}(n) \equiv V(n) - \varepsilon \ln(n)$ we have $\tilde{V}'(n) = V'(n) - \varepsilon/n$ and $\tilde{V}''(n) = V''(n) + \varepsilon/n^2$. Under the standard assumptions $V', V'' > 0$, with $\varepsilon > 0$, we have $\tilde{V}'(n) > 0$ for $n$ sufficiently large and $\tilde{V}''(n) > 0$ for all $n \geq 0$. Thus the marginal disutility of labor $\tilde{V}'(n)$ may be negative at small $n$ but we have the required assumptions needed for a well-defined interior solution to the household maximization problem.
Example 4.3. Increasing social returns: A different extension of the basic OG model to incorporate increasing social returns is obtained in Evans and Honkapohja (1995b). This model was already sketched in Section 1.2.3. Assuming that substitution effects dominate in consumption behavior, this model can have up to three interior steady states.

Example 4.4. Hyperinflation or seignorage: Consider the basic OG model with government consumption financed by money creation. This model was also introduced in Section 1.2.3. Using the first-order condition
$$n_t V'(n_t) = E_t^*((n_{t+1} - g_{t+1}) U'(n_{t+1} - g_{t+1}))$$
and assuming further that $g_t = g + v_t$, where $v_t$ is iid with $E v_t = 0$ and "small" compact support, one obtains a special case of the reduced form (69):
$$n_t = H(E_t^* G(n_{t+1}, v_{t+1})),$$
where the parameter $g$ has been absorbed into the function $G$. We note here two extensions of this model. First, we sketch below the analysis of Bullard (1994) which is based on the same model but with the alternative assumption that money growth is constant while government expenditure adjusts to satisfy the budget constraint. Second, Evans, Honkapohja and Marimon (1998a) consider the same model but with a constitutional limitation on government consumption which cannot exceed a given fraction of GDP. This extension leads to the possibility of a further constrained steady state which is usually stable under steady-state learning behavior.

4.2.2. Noisy steady states and cycles

For models with intrinsic noise we consider the REE which are analogs of perfect foresight steady states and cycles. We start with the simplest case: a noisy steady state for the model (68). Under rational expectations we have $y_t = E_t G(y_{t+1}) + v_t$, and we look for a solution of the form $y_t = \bar y + v_t$. It follows that $\bar y$ must satisfy $\bar y = E G(\bar y + v_t)$. In general, a solution of this form may not exist even if $G$ has a fixed point $\hat y$, i.e.
if $\hat y = G(\hat y)$, but an existence result is available for the case of "small noise". More generally, consider noisy cycle REE for the model (69). A noisy $k$-cycle is a stochastic process of the form
$$y_t = y^i(v_t) \ \text{ for } t \bmod k = i, \quad i = 1, \ldots, k-1, \qquad (70)$$
$$y_t = y^k(v_t) \ \text{ for } t \bmod k = 0,$$
where the $k$ functions $y^i(v_t)$ satisfy
$$y^i(v_t) = H(EG(y^{i+1}(v_{t+1}), v_{t+1}), v_t) \ \text{ if } t \bmod k = i, \quad i = 1, \ldots, k-1, \qquad (71)$$
$$y^k(v_t) = H(EG(y^1(v_{t+1}), v_{t+1}), v_t) \ \text{ for } t \bmod k = 0.$$
In a noisy $k$-cycle, the expectations $EG(y_{t+1}, v_{t+1})$ follow a regular cycle. We will use the notation
$$\theta_i = EG(y^i(v_t), v_t) \ \text{ if } t \bmod k = i, \quad \text{for } i = 1, \ldots, k-1,$$
$$\theta_k = EG(y^k(v_t), v_t) \ \text{ if } t \bmod k = 0,$$
so that $y^i(v_t) = H(\theta_{i+1}, v_t)$, $i = 1, \ldots, k-1$, and $y^k(v_t) = H(\theta_1, v_t)$. Thus a noisy $k$-cycle is equivalently defined by $(\theta_1, \ldots, \theta_k)$ such that
$$\theta_i = EG(H(\theta_{i+1}, v_t), v_t) \ \text{ for } i = 1, \ldots, k-1, \qquad (72)$$
$$\theta_k = EG(H(\theta_1, v_t), v_t).$$
A noisy steady state corresponds to the case $k = 1$ and is thus defined by a function $y(v_t)$ such that $y(v_t) = H(EG(y(v_{t+1}), v_{t+1}), v_t)$. Letting $\theta = EG(y(v_{t+1}), v_{t+1})$ it is seen that a noisy steady state is equivalently defined by a value $\theta$ satisfying $\theta = EG(H(\theta, v_t), v_t)$. It can be shown that noisy $k$-cycles exist near $k$-cycles of the corresponding nonstochastic model, provided the noise is sufficiently small in the sense that it has small bounded support 55.
4.2.3. Adaptive learning algorithms

We now introduce adaptive learning for noisy $k$-cycles (for the analysis of stability of a noisy steady state see Section 2.7.1 above). Suppose agents believe they are in a noisy $k$-cycle. They need to make 1-step ahead forecasts of $G(y_t, v_t)$ and at time $t$ have estimates $(\theta_{1t}, \ldots, \theta_{kt})$ for the expected values of $G(y_t, v_t)$ at the different points of the $k$-cycle, $k \geq 1$. That is, $\theta_{it}$ is their estimate at $t$ for $EG(y_t, v_t)$ if $t \bmod k = i$, for $i = 1, \ldots, k-1$, and $\theta_{kt}$ is their estimate at $t$ for $EG(y_t, v_t)$ if $t \bmod k = 0$. We assume that $G(y_t, v_t)$ is observable and used to update their estimates. Since $v_t$ is iid, in an REE noisy $k$-cycle the values of $G(y_t, v_t)$ are independently distributed across time and are identically distributed for the same values of $t \bmod k$.
55 For a rigorous statement and proof see Evans and Honkapohja (1995c).
A natural estimator of $(\theta_1, \ldots, \theta_k)$ is then given by separate sample means for each stage of the cycle:
$$\theta_{it} = (\#N_i)^{-1} \sum_{j \in N_i} G(y_j, v_j), \qquad (73)$$
where
$$N_i = \{j = 1, \ldots, t-1 \mid j \bmod k = i\} \ \text{ for } i = 1, \ldots, k-1,$$
$$N_k = \{j = 1, \ldots, t-1 \mid j \bmod k = 0\}.$$
Here $\#N_i$ denotes the cardinality of the set $N_i$, i.e. the number of elements in the set. Given the estimates the ALM takes the form
$$y_t = H(\theta_{i+1,t}, v_t) \ \text{ if } t \bmod k = i, \quad \text{for } i = 1, \ldots, k-1, \qquad (74)$$
$$y_t = H(\theta_{1t}, v_t) \ \text{ if } t \bmod k = 0.$$
The system consisting of Equations (73) and (74) can be put into a recursive form, and the standard techniques of Section 2 can be applied 56. The details are given in Evans and Honkapohja (1995c).
4.2.4. E-stability and convergence

Before stating the formal convergence results under adaptive learning, we derive the appropriate stability condition using the E-stability principle. Recall that under this principle we focus on the mapping from a vector of parameters characterizing the Perceived Law of Motion (PLM) to the implied parameter vector characterizing the Actual Law of Motion (ALM). Although in a noisy $k$-cycle the solution is given by $k$ functions, $y^i(v_t)$, $i = 1, \ldots, k$, what matters to the agents are only the expected values of $G(y_{t+1}, v_{t+1})$. If agents believe they are in a noisy $k$-cycle, then their PLM is adequately summarized by a vector $\theta = (\theta_1, \ldots, \theta_k)$, where
$$\theta_i = G(y_t, v_t)^e \ \text{ if } t \bmod k = i, \quad \text{for } i = 1, \ldots, k-1,$$
$$\theta_k = G(y_t, v_t)^e \ \text{ if } t \bmod k = 0.$$
If agents held these (in general nonrational) perceptions fixed, then the economy would follow an actual (generally nonrational) $k$-cycle
$$y_t = H(\theta_{i+1}, v_t) \ \text{ if } t \bmod k = i, \quad \text{for } i = 1, \ldots, k-1,$$
56 Guesnerie and Woodford (1991) look at analogous fixed-gain learning rules in a nonstochastic system.
$$y_t = H(\theta_1, v_t) \ \text{ if } t \bmod k = 0.$$
The corresponding parameters $\theta^* = (\theta_1^*, \ldots, \theta_k^*)$ of the ALM induced by the PLM are given by the expected values of $G(y_t, v_t)$ under this law of motion:
$$\theta_i^* = EG(H(\theta_{i+1}, v_t), v_t) \ \text{ if } t \bmod k = i, \quad \text{for } i = 1, \ldots, k-1,$$
$$\theta_k^* = EG(H(\theta_1, v_t), v_t) \ \text{ if } t \bmod k = 0.$$
Thus the mapping $\theta^* = T(\theta)$ from the PLM to the ALM is given by
$$T(\theta) = (R(\theta_2), \ldots, R(\theta_k), R(\theta_1)), \quad \text{where}$$
R(Oi) = E(G(H(Oi, or), vt)), assuming k > 1. For k = 1 we have simply T(O) = R(O) = E(G(H(O, or), vt). With this formulation of the T mapping the definition of E-stability is based on the differential equation (22) with ~0 = 0. It is easily verified using Equation (72) that fixed points of T(O) correspond to REE noisy k-cycles. A REE noisy k-cycle 0 is said to be E-stable if Equation (22) is locally asymptotically stable at 0 57 Proposition 8. Consider an REE noisy k-cycle of the model (69) with expectation parameters 0 = (01 Ok). Let ~ = 1-[I~_1R'(Oi). Then 0 is E-stable if and only if ....
,
~<1 -(cos(Jr/k)) -k < ~ < 1
ifk-l
ork=2
i f k > 2.
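The condition in Proposition 8 is straightforward to check numerically. The following minimal sketch (the function name and the illustrative R′ values are ours, not from the text) evaluates the cycle multiplier λ = R′(θ̄_1)⋯R′(θ̄_k) against the bounds above.

```python
import math

def e_stable_cycle(R_primes):
    """Check the Proposition 8 E-stability condition for a noisy k-cycle.

    R_primes: derivatives R'(theta_i) along the cycle, i = 1..k.
    Returns True iff the cycle multiplier lambda satisfies
      lambda < 1                          for k = 1 or k = 2,
      -(cos(pi/k))**(-k) < lambda < 1     for k > 2.
    """
    k = len(R_primes)
    lam = math.prod(R_primes)
    if k <= 2:
        return lam < 1.0
    lower = -(math.cos(math.pi / k)) ** (-k)
    return lower < lam < 1.0

# A 2-cycle with lambda = 0.25 is E-stable; a steady state with R' = 1.5 is not.
print(e_stable_cycle([0.5, 0.5]))         # True
print(e_stable_cycle([1.5]))              # False
# For k = 3 the lower bound is -(cos(pi/3))**(-3) = -8.
print(e_stable_cycle([-5.0, 1.0, 1.0]))   # True:  -8 < -5 < 1
print(e_stable_cycle([-9.0, 1.0, 1.0]))   # False: -9 < -8
```

Note that for k > 2 a sufficiently negative multiplier destroys E-stability even though λ < 1.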
4.2.4.1. Weak and strong E-stability. In the context of k-cycles the distinction between weak and strong stability, discussed in Section 1.5, arises naturally as follows. A k-cycle can always be regarded as a degenerate nk-cycle for any integer n > 1. Thus a 2-cycle (θ_1, θ_2) is also a 4-cycle taking values (θ_1, θ_2, θ_1, θ_2), a 6-cycle taking values (θ_1, θ_2, θ_1, θ_2, θ_1, θ_2), etc. Define k as the primitive period of the cycle if the cycle is not also an m-cycle for some m < k (e.g. the primitive period is 2 in the example just given if θ_1 ≠ θ_2). Consider now a noisy k-cycle REE of the model (69) with primitive period k and expectation parameters θ̄ = (θ̄_1, ..., θ̄_k). θ̄ is said to be weakly E-stable if it is E-stable when regarded as a k-cycle, and strongly E-stable if it is E-stable when
57 The next two propositions about E-stability of steady states and cycles are proved in Evans and Honkapohja (1995c).
regarded as an nk-cycle for every positive integer n. Conditions for strong E-stability are given by the following result:

Proposition 9. Consider an REE noisy k-cycle of the model (69), with primitive period k, and with expectation parameters θ̄ = (θ̄_1, ..., θ̄_k). θ̄ is strongly E-stable if and only if |λ| < 1.
4.2.4.2. Convergence. For this framework the preceding E-stability conditions provide the appropriate convergence condition under adaptive learning for Equations (73) and (74). The associated differential equation turns out to be just the equation defining E-stability, dθ/dτ = T(θ) − θ, so that we have:

Proposition 10. Consider an REE noisy k-cycle of the model (69), with primitive period k, and with expectation parameters θ̄ = (θ̄_1, ..., θ̄_k). Suppose that θ̄ is weakly E-stable. Then θ̄ is locally stable under adaptive learning. If instead θ̄ is not weakly E-stable, then θ_t = (θ_{1t}, ..., θ_{kt}) converges to θ̄ with probability 0 58.

Of course, by "locally stable" under adaptive learning we mean the various more specific statements made explicit in Section 2: (1) convergence with positive probability for nearby initial points; (2) convergence with probability close to 1 for sufficiently low adaptation rates; (3) convergence with probability 1 if a sufficiently small Projection Facility is used.
4.2.4.3. The case of small noise. If v_t ≡ 0, then under perfect foresight we have y_t = F(y_{t+1}), where F(y) ≡ H(G(y, 0), 0). The E-stability condition is then determined in terms of λ = F′(ŷ_1)F′(ŷ_2)⋯F′(ŷ_k) for a perfect foresight k-cycle (ŷ_1, ..., ŷ_k). Recall that noisy k-cycles exist nearby if the noise is sufficiently "small". It can be shown that the E-stability conditions are "inherited" from the perfect foresight case. Moreover, for small enough noise it is also possible to show convergence from nearby initial points with probability 1 without a projection facility; see Evans and Honkapohja (1995c).

4.2.5. Economic models with steady states and cycles

4.2.5.1. Economic examples continued. These general convergence and non-convergence results can easily be applied to the four economic examples in Section 4.2.1. The technique is to convert each model to the form (69) or its simpler versions. For Examples 4.1 and 4.2, describing two formulations of preference shocks, it may be shown that with "small enough" noise there exists a stable noisy steady state in the neighborhood of the corresponding steady state in the nonstochastic model. (However, with sufficient noise the stability condition can be altered.) In Example 4.3, the model of increasing social returns, the multiple interior steady states can be divided into
58 A corresponding result holds also for strong E-stability when the learning rule is overparameterized.
locally stable and unstable ones. For the hyperinflation model (Example 4.4) the low-inflation steady state is stable under learning, whereas the high-inflation steady state is not. The same models furnish examples of RE cycles that are stable or unstable under learning. For example, consider the basic OG model in Example 4.1 without any shocks. It is well known that this model can have deterministic cycles as REE, provided that over a suitable range the offer curve slopes downward sufficiently steeply 59. The stability results for learning behavior in Propositions 8-10 provide stability conditions for cycles in the basic OG model. Moreover, an E-stable deterministic cycle remains E-stable in a model in which a sufficiently small preference shock has been added.
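The dichotomy between stable and unstable steady states under learning can be illustrated by direct simulation. In the sketch below, F(y) = 0.3 + 0.7y² is a hypothetical temporary-equilibrium map of our own choosing (not one of the chapter's calibrated examples); it has a steady state at y = 3/7 ≈ 0.4286 with F′ = 0.6 < 1 (E-stable) and one at y = 1 with F′ = 1.4 > 1 (E-unstable).

```python
def F(y):
    # Hypothetical temporary-equilibrium map y_t = F(theta_t); our choice.
    # Fixed points: y = 3/7 (F' = 0.6, E-stable) and y = 1 (F' = 1.4, E-unstable).
    return 0.3 + 0.7 * y * y

def learn(theta0, n_steps, escape=2.0):
    """Decreasing-gain learning of a perceived steady state theta.

    Each period agents observe y_{t-1} = F(theta_{t-1}) and update via
    theta_t = theta_{t-1} + (y_{t-1} - theta_{t-1})/t, stopping early if
    the estimate escapes past `escape`.
    """
    theta = theta0
    for t in range(1, n_steps + 1):
        theta += (F(theta) - theta) / t
        if theta > escape:
            break
    return theta

low = learn(0.8, 1_000_000)    # starts in the basin of the E-stable steady state
high = learn(1.05, 1_000_000)  # starts just above the E-unstable steady state
print(low)                     # converges toward 3/7 ~ 0.4286
print(high > 1.05)             # True: estimates drift away from the unstable point
```

The second path mimics the cumulative-process instability of the high-inflation steady state: once expectations start above it, the learning dynamics move further away.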
4.2.5.2. Other economic models.

Example 4.5. Instability of Interest Rate Pegging: The argument that tight interest rate control is not a feasible monetary policy has recently been re-examined by Howitt (1992) for some alternative economies with learning behavior. One of the models has both short- and far-sighted agents. The former live for two periods, selling their endowment e when young and consuming only in old age the proceeds of the sale. The latter have a constant endowment y and an objective function E_t* Σ_{j=0}^∞ β^j u(c_{t+j}). (Here E_t* denotes possibly nonrational expectations.) They face a finance constraint implying that M_t = P_t y, since current consumption and investment in bonds are paid for by initial money, a transfer and initial bonds (with interest). M_t is the end-of-period money holding. Denoting the nominal interest factor on bonds by R, the first-order condition for the far-sighted agent is

u′(c_t) = R E_t*[β u′(c_{t+1}) / π_{t+1}], where π_{t+1} = P_{t+1}/P_t.

Market clearing for goods yields c_t = y + e(1 − 1/π_t). The finance constraint implies that inflation equals money growth. With a pegged interest factor R the model has a unique perfect foresight steady state with inflation factor π* = βR. The analysis of this model proceeds by defining the variable

x_t ≡ β u′{y + e[1 − (1/π_t)]} / π_t,

so that the first-order condition gives

u′[y + e(1 − 1/π_t)] = R x̂_{t+1}, x̂_{t+1} ≡ E_t* x_{t+1}.

This equation defines a function π_t = π(x̂_{t+1}). Introducing the notation h(x) = βR/π(x), the model yields the dynamic equation

x_t = x̂_{t+1} h(x̂_{t+1}), where h, h′ > 0, h(x*) = 1, and where x* = π^{−1}(π*).

59 See Grandmont (1985) for details.
It is easy to verify that if, for example, agents try to learn a steady state, then x* will be unstable 60. This is easily seen by noting that the derivative of F(x̂_{t+1}) = x̂_{t+1} h(x̂_{t+1}) is greater than unity at x*.

Example 4.6. Learning Equilibria: The model of Bullard (1994) is obtained from the model of Example 4.4 of Section 4.2.1 by replacing the assumption of constant (real) government spending by constant nominal money growth θ = M_t/M_{t−1}. Government spending is then made endogenous, so that the budget constraint is satisfied. Bullard's model can be described in terms of the savings (or money demand) function M_t/P_t = S(P_t/E_t* P_{t+1}) and the forecast of the inflation factor β_t = E_t* P_{t+1}/P_t. For the latter it is postulated that agents run a first-order autoregression using data through t − 1. This system can be written as a system of three nonlinear difference equations:

β_t = β_{t−1} + g_{t−1} [θ S(β̃_{t−1}^{−1}) / S(β_{t−1}^{−1}) − β_{t−1}],
β̃_t = β_{t−1},
g_t = g_{t−1} / (1 + g_{t−1}).
Bullard shows that if the money growth rate θ is not too large, this system is stable, with inflation given by β* = θ. However, if θ is increased beyond a critical value, the steady state becomes unstable. In fact, the system undergoes a Hopf bifurcation, and the learning dynamics converge to a limit cycle 61. There is a multiplicity of these "learning equilibria", depending on the starting point. When the estimates of agents attempting to learn a steady state converge to a (nonrational) limit cycle, it is possible that forecasting errors become large and exhibit some regularities. If such a regularity is found, then agents would try to exploit it and stop using the previous learning rule. However, Bullard shows that for carefully chosen savings functions the forecast errors can exhibit a complex pattern, so that agents do not necessarily find regularities they could exploit. Schönhofer (1996) examines this issue further and shows that the forecast errors can even be chaotic.
4.3. Learning sunspot equilibria

In this subsection our interest lies in the analysis of learning of equilibria which are influenced by extraneous random phenomena, often referred to as "sunspots". To
60 We note here that Howitt (1992) also considers other, more general learning rules. This is possible since the system is one-dimensional and relatively simple. 61 The possibility of convergence to a non-REE limit cycle, under learning with decreasing gain, arises because the regressor, i.e. the past price level, is nonstationary.
simplify the discussion we will assume that preference or technology shocks do not appear in the model 62 and focus on the class of models

y_t = E_t* f(y_{t+1}).   (75)
The rational expectations equilibria for Equation (75) satisfy y_t = E_t f(y_{t+1}), where E_t denotes the conditional expectation given information at time t. Rational expectations equilibria which depend on extraneous random phenomena or 'sunspots' have received a great deal of attention in the recent literature after the initial investigations by Shell (1977), Azariadis (1981), and Cass and Shell (1983). The existence of such sunspot equilibria has, in particular, been much studied 63. The seminal work of Woodford (1990) demonstrates that, for appropriate specifications, learning can converge to sunspot solutions in the basic OG model. The Woodford (1990) representation of learning in the OG monetary model provides a careful treatment which precisely reflects the information available to households 64. Our presentation here follows Evans and Honkapohja (1994c), which is developed in terms of the simple reduced form (75). These rational solutions, together with the deterministic cycles studied above, can be viewed as a modern formulation of a long tradition in economics which emphasizes the possibility of endogenous fluctuations in market economies. The nonlinearity of the economic model is a key element in generating the possibility of these equilibria, even though extraneous variables can sometimes appear as part of rational solutions in linear models as well (see Section 3).

4.3.1. Existence of sunspot equilibria

We begin by reviewing some results on the existence of sunspot equilibria. The definition of a sunspot equilibrium involves the idea that economic agents in the model condition their expectations on some (random) variable s_t which otherwise does not have any influence on the model economy. Although different types of sunspot solutions have been considered in the literature, we will focus here on REE that take the form of a finite Markov chain.
For most of the analysis we simplify even further by assuming that the extraneous random variable is a 2-state Markov chain with a constant transition matrix Π = (π_ij), 0 < π_ij < 1, i, j = 1, 2. Here π_ij denotes the probability that s_{t+1} = j given that the current state is s_t = i 65. A 2-state Markov chain is defined by the probabilities π_11 and π_22, since π_12 = 1 − π_11 and π_21 = 1 − π_22.
62 Intrinsic shocks could be easily introduced, see Evans and Honkapohja (1998a). 63 See Chiappori and Guesnerie (1991) and Guesnerie and Woodford (1992) for recent surveys. 64 An interpretation of the Woodford (1990) results in terms of E-stability was presented in Guesnerie and Woodford (1992). 65 Guesnerie and Woodford (1992) discuss possible interpretations of s_t.
One defines a (2-state) Stationary Sunspot Equilibrium (SSE) (y_1*, y_2*) with transition probabilities π_ij by means of the equations

y_1* = π_11 f(y_1*) + (1 − π_11) f(y_2*),
y_2* = (1 − π_22) f(y_1*) + π_22 f(y_2*).   (76)
In the SSE, y_t = y_i* if s_t = i. These equations are clearly a special case of an REE for Equation (75), in which y_{t+1} = y_j* with probability π_ij given that y_t = y_i*. They have the geometric interpretation that the two values (y_1*, y_2*) must be convex combinations of f(y_1*) and f(y_2*). This observation makes it possible to construct economic examples of SSEs as follows. Assume that f(ȳ_1) < f(ȳ_2) for two points ȳ_1 and ȳ_2. Then there exist 0 < π_ij < 1 such that (ȳ_1, ȳ_2) is an SSE with transition probabilities π_ij if and only if the points ȳ_1 and ȳ_2 both lie in the open interval (f(ȳ_1), f(ȳ_2)). Note that in this construction the two points ȳ_1 and ȳ_2 need not be near any deterministic equilibria. A large part of the literature has focused on the existence of SSEs in small neighborhoods around deterministic cycles or steady states of model (75). To make this notion precise we say that an SSE y = (y_1, y_2) is an ε-SSE relative to ŷ = (ŷ_1, ŷ_2) if y lies in an ε-neighborhood of ŷ. Next, it may be noted that a deterministic equilibrium 2-cycle (ŷ_1, ŷ_2), with ŷ_1 = f(ŷ_2) and ŷ_2 = f(ŷ_1), is a limiting case of an SSE as π_11, π_22 → 0. Similarly, a pair of distinct steady states, i.e. a pair (ŷ_1, ŷ_2) satisfying ŷ_1 ≠ ŷ_2, ŷ_1 = f(ŷ_1) and ŷ_2 = f(ŷ_2), is a limiting case of SSEs as π_11, π_22 → 1. It is easy to derive the following results for ε-SSEs near deterministic equilibria:
(i) If f′(ŷ_1)f′(ŷ_2) ≠ 1 holds for a 2-cycle (ŷ_1, ŷ_2), there is an ε > 0 such that for all 0 < ε′ < ε there exists an ε′-SSE relative to (ŷ_1, ŷ_2).
(ii) If f′(ŷ_1) ≠ 1 and f′(ŷ_2) ≠ 1 at a pair of distinct steady states (ŷ_1, ŷ_2), there is an ε > 0 such that for all 0 < ε′ < ε there exists an ε′-SSE relative to (ŷ_1, ŷ_2) 66.
(iii) There is an ε > 0 such that for all 0 < ε′ < ε there exists an ε′-SSE relative to a single steady state ŷ if and only if |f′(ŷ)| > 1.
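Equations (76) can be solved numerically. The sketch below is our illustration: f(y) = 3.2y(1 − y) is a hypothetical stand-in for the reduced-form map, and Newton's method with a finite-difference Jacobian locates an SSE near the perfect foresight 2-cycle of f, taking π_11 = π_22 = 0.05.

```python
def f(y):
    # Hypothetical nonlinear map for the reduced form y_t = E_t* f(y_{t+1}).
    return 3.2 * y * (1.0 - y)

def sse_residual(y1, y2, p11, p22):
    # Residuals of the SSE equations (76).
    r1 = p11 * f(y1) + (1.0 - p11) * f(y2) - y1
    r2 = (1.0 - p22) * f(y1) + p22 * f(y2) - y2
    return r1, r2

def solve_sse(y1, y2, p11, p22, tol=1e-12, h=1e-7):
    """Newton iteration on the 2x2 system with a finite-difference Jacobian."""
    for _ in range(100):
        r1, r2 = sse_residual(y1, y2, p11, p22)
        if abs(r1) < tol and abs(r2) < tol:
            break
        # Finite-difference Jacobian entries.
        a11 = (sse_residual(y1 + h, y2, p11, p22)[0] - r1) / h
        a12 = (sse_residual(y1, y2 + h, p11, p22)[0] - r1) / h
        a21 = (sse_residual(y1 + h, y2, p11, p22)[1] - r2) / h
        a22 = (sse_residual(y1, y2 + h, p11, p22)[1] - r2) / h
        det = a11 * a22 - a12 * a21
        # Newton step: solve J * d = -r.
        y1 -= (a22 * r1 - a12 * r2) / det
        y2 -= (a11 * r2 - a21 * r1) / det
    return y1, y2

# Start near the perfect foresight 2-cycle of f (about (0.513, 0.799)).
y1, y2 = solve_sse(0.51, 0.80, p11=0.05, p22=0.05)
r1, r2 = sse_residual(y1, y2, 0.05, 0.05)
print(abs(r1) < 1e-8 and abs(r2) < 1e-8, y1 != y2)
```

The solution found in this way is a genuine two-valued SSE, not the degenerate case y_1* = y_2*, illustrating result (i) above.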
The overlapping generations models sketched above provide simple examples of SSEs and ε-SSEs, since these models can exhibit multiple steady states, steady states with |f′(ŷ)| > 1, and cycles. To conclude the discussion on the existence of SSEs we remark here that for fully specified models it is sometimes possible to utilize arguments based on global analysis (such as the index theorem of Poincaré and Hopf) to prove the existence of SSEs; see the surveys cited previously.
4.3.2. Analysis of learning

4.3.2.1. Formulation of the learning rule. For learning sunspot equilibria the agents must have a perceived law of motion that in principle can enable them to learn such

66 This kind of SSE may be called an "animal spirits" cycle in accordance with Howitt and McAfee (1992).
an REE: If agents believe that the economy is in an SSE, a natural estimator for the value of y_t in the two different sunspot states is, for each state of the sunspot process, the average of the observations of y_t which have arisen in that state. This is a form of state-contingent averaging. Thus let φ_t = (φ_{1t}, φ_{2t}) be the estimates of the values that y_t takes in states 1 and 2 of the sunspot. Let also ψ_{jt} = 1 if s_t = j and ψ_{jt} = 0 otherwise be the indicator function for state j of the sunspot. The learning rules based on state-contingent averaging can be written in the form

φ_{jt} = φ_{j,t−1} + t^{−1} q_{j,t−1}^{−1} ψ_{j,t−1} (y_{t−1} − φ_{j,t−1} + ε_{t−1}),
q_{jt} = q_{j,t−1} + t^{−1} (ψ_{j,t−1} − q_{j,t−1}),   (77)
y_t = ψ_{1t}[π_11 f(φ_{1t}) + (1 − π_11) f(φ_{2t})] + ψ_{2t}[(1 − π_22) f(φ_{1t}) + π_22 f(φ_{2t})]

for j = 1, 2. We note here that in the learning rules agents are assumed to use observations only through period t − 1. This is to avoid a simultaneity between y_t and the expectations E_t* f(y_{t+1}). Equations (77) are interpreted as follows. t q_{j,t−1} is the number of times state j has occurred up to time t − 1. The recursion for the fraction of observations of state j is the second of Equations (77). The first equation is then a recursive form for the state averages, with one modification to be discussed shortly. Finally, the third of Equations (77) gives the temporary equilibrium for the model, since the right-hand side is the expectation of the value of f(y_{t+1}) given the forecasts φ_{jt}. We make a small modification in the learning rule by including a random disturbance ε_t in the algorithm. This can be interpreted as a measurement or observation error, and it is assumed to be iid with mean 0 and bounded support (|ε_t| ≤ C, C > 0, with probability 1) 67.
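A direct simulation of the learning rule (77) illustrates the mechanics of state-contingent averaging. Everything below is illustrative: f(y) = 3.2y(1 − y) is a hypothetical map, the measurement error ε_t is set to zero, and the timing of the frequency update is a slight simplification of (77) that keeps the gain well defined from the first period.

```python
import random

def f(y):
    # Hypothetical reduced-form map for y_t = E_t* f(y_{t+1}); our choice.
    return 3.2 * y * (1.0 - y)

def T(phi1, phi2, p11, p22):
    # T-map: the actual law of motion implied by the forecasts (phi1, phi2).
    return (p11 * f(phi1) + (1.0 - p11) * f(phi2),
            (1.0 - p22) * f(phi1) + p22 * f(phi2))

def learn_sse(phi1, phi2, p11, p22, n_periods, seed=0):
    """State-contingent averaging in the spirit of Equations (77), epsilon_t = 0."""
    rng = random.Random(seed)
    s_prev, y_prev = 1, phi1          # period-0 state and outcome
    q1 = q2 = 0.5                     # state-frequency estimates
    for t in range(1, n_periods + 1):
        # Update frequencies first, then the state-contingent average
        # (a timing simplification; it bounds the gain by 1).
        q1 += (1.0 / t) * ((1.0 if s_prev == 1 else 0.0) - q1)
        q2 += (1.0 / t) * ((1.0 if s_prev == 2 else 0.0) - q2)
        if s_prev == 1:
            phi1 += (1.0 / t) * (1.0 / q1) * (y_prev - phi1)
        else:
            phi2 += (1.0 / t) * (1.0 / q2) * (y_prev - phi2)
        # Draw the sunspot state and compute the temporary equilibrium y_t.
        stay = p11 if s_prev == 1 else p22
        s = s_prev if rng.random() < stay else 3 - s_prev
        y1, y2 = T(phi1, phi2, p11, p22)
        y_prev = y1 if s == 1 else y2
        s_prev = s
    return phi1, phi2

phi1, phi2 = learn_sse(0.55, 0.76, p11=0.05, p22=0.05, n_periods=100_000)
t1, t2 = T(phi1, phi2, 0.05, 0.05)
print(abs(t1 - phi1) < 0.05 and abs(t2 - phi2) < 0.05)  # near a fixed point of T
```

Starting near an SSE, the estimates settle close to a fixed point of the T-map, consistent with the local convergence result stated below for weakly E-stable SSEs.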
4.3.2.2. Analysis of convergence. We now show that, under a stability condition, the learning rule (77) above converges locally to an SSE. For this we utilize the local convergence results reviewed in Section 2. First introduce the variables

θ_t′ = (φ_{1t}, φ_{2t}, q_{1t}, q_{2t}),   X_t′ = (ψ_{1,t−1}, ψ_{2,t−1}, ε_{t−1})

and the functions

Q_j(θ_{t−1}, X_t) = ψ_{j,t−1} q_{j,t−1}^{−1} (y_{t−1} − φ_{j,t−1} + ε_{t−1}),   j = 1, 2,
Q_{2+i}(θ_{t−1}, X_t) = ψ_{i,t−1} − q_{i,t−1},   i = 1, 2.

For the state dynamics we note simply that X_t is a Markov process independent of θ_t. The system is then in a standard form for recursive algorithms 68.

67 The observation error is needed only for the instability result. 68 The formal analysis requires an extension of the basic framework of Section 2 to non-iid shocks, or alternatively to Markovian state dynamics, as summarized in the appendix of Evans and Honkapohja (1998b) and treated in detail in Evans and Honkapohja (1998a). The formal details for the former approach are given in Woodford (1990) and Evans and Honkapohja (1994c).
The associated differential equation governing local convergence is dθ/dτ = h(θ), where

h_1(θ) = π̄_1 q_1^{−1} [π_11 f(φ_1) + (1 − π_11) f(φ_2) − φ_1],
h_2(θ) = π̄_2 q_2^{−1} [(1 − π_22) f(φ_1) + π_22 f(φ_2) − φ_2],
h_3(θ) = π̄_1 − q_1,
h_4(θ) = π̄_2 − q_2.

Here (π̄_1, π̄_2) is the limiting distribution of the states of the Markov chain. Clearly, at the equilibrium point q_1 = π̄_1, q_2 = π̄_2 and (φ_1, φ_2) is an SSE. In the ODE dθ/dτ = h(θ) the subsystem consisting of the last two components of h(θ) is independent of (φ_1, φ_2), and one has global stability for it in the domain q_i ∈ (0, 1), i = 1, 2. It follows that the entire ODE is locally stable provided DT(φ_1, φ_2) has all eigenvalues with real parts less than unity, where

T(φ_1, φ_2) = (T_1(φ_1, φ_2), T_2(φ_1, φ_2)) = (π_11 f(φ_1) + (1 − π_11) f(φ_2), (1 − π_22) f(φ_1) + π_22 f(φ_2)).   (78)

Note that the function T(φ_1, φ_2) = [T_1(φ_1, φ_2), T_2(φ_1, φ_2)] defines the mapping from the perceived law of motion [y_{t+1} = φ_1 if s_{t+1} = 1, y_{t+1} = φ_2 if s_{t+1} = 2] to the actual law of motion [y_{t+1} = φ_1′ if s_{t+1} = 1, y_{t+1} = φ_2′ if s_{t+1} = 2], where (φ_1′, φ_2′) = T(φ_1, φ_2). The condition on the eigenvalues can thus be used to define the concept of E-stability for sunspot equilibria. We have obtained the following result:
Proposition 11. The learning rule (77) converges locally to an SSE (y_1*, y_2*) provided it is weakly E-stable, i.e. provided the eigenvalues of DT(y_1*, y_2*) have real parts less than one.
Remark: The notion of convergence is as in Theorem 1 in Section 2. If the algorithm is augmented with a projection facility, almost sure convergence is obtained. It is also possible to derive an instability result along the lines of Evans and Honkapohja (1994c) for SSEs which are not weakly E-stable:

Proposition 12. Suppose that an SSE (y_1*, y_2*) is weakly E-unstable, so that DT(y_1*, y_2*) has an eigenvalue with real part greater than unity. Then the learning dynamics (77) converge to (y_1*, y_2*) with probability zero.

The stability result can also be developed for the general model (26) or (69). In this framework sunspot equilibria are noisy, because the equilibrium is influenced both by the sunspot variable and by the exogenous preference or technology shock. This is discussed in Evans and Honkapohja (1998a).
4.3.3. Stability of SSEs near deterministic solutions
The preceding result shows that local convergence to SSEs can be studied using E-stability based on Equation (78). Computing DT we have

DT(y) = ( π_11 f′(y_1)        (1 − π_11) f′(y_2)
          (1 − π_22) f′(y_1)   π_22 f′(y_2) ).
The analysis of E-stability of SSEs near deterministic solutions (ε-SSEs) is based on two observations. First, DT(y) can be computed for the deterministic solutions, which are limiting cases of ε-SSEs. Second, under a regularity condition, the fact that eigenvalues are continuous functions of the matrix elements provides the E-stability conditions for ε-SSEs in a neighborhood of the deterministic solution. This approach yields the following results:
(i) Given a 2-cycle ŷ = (ŷ_1, ŷ_2) with f′(ŷ_1)f′(ŷ_2) ≠ 0, there is an ε > 0 such that for all 0 < ε′ < ε all ε′-SSEs relative to ŷ are weakly E-stable if and only if ŷ is weakly E-stable, i.e., it satisfies f′(ŷ_1)f′(ŷ_2) < 1.
(ii) Given two distinct steady states ŷ_1 ≠ ŷ_2, there is an ε > 0 such that for all 0 < ε′ < ε all ε′-SSEs relative to ŷ = (ŷ_1, ŷ_2) are weakly E-stable if and only if both steady states are weakly E-stable, i.e., f′(ŷ_1) < 1 and f′(ŷ_2) < 1.
Analogous results are available when a 2-cycle is strongly E-stable or a pair of distinct steady states is strongly E-stable [see Evans and Honkapohja (1994c) for the definition and details]. For the case of a single steady state the situation is more complex, but the following partial result holds: Let ŷ be a weakly E-unstable steady state, i.e. f′(ŷ) > 1. Then there exists an ε > 0 such that for all 0 < ε′ < ε all ε′-SSEs relative to ŷ are weakly E-unstable. One may recall from Proposition 3 that SSEs near a single steady state ŷ also exist when f′(ŷ) < −1. For this case it appears that both E-stable and E-unstable ε-SSEs relative to ŷ may exist. However, it can be shown that there is a neighborhood of ŷ such that SSEs in the neighborhood are E-unstable in a strong sense.

4.3.4. Applying the results to OG and other models

The hyperinflation model, Example 4.4 in Section 4.2.1, has often been used as an economic example for sunspot equilibria.
This construction relies on the two distinct steady states of the model. The application of the results above shows that such equilibria near a pair of steady states are unstable under learning. In order to construct a robust example of such "animal spirits" sunspot solutions it is necessary to have a pair of steady states that are both stable when agents try to learn them. Since under certain regularity conditions two stable steady states are separated by an unstable one, the construction of a robust example of sunspot equilibria based on distinct steady states normally requires the existence of at least three steady states.
The model of increasing social returns, Example 4.3 in Section 4.2.1, is a simple OG model with this property. Evans and Honkapohja (1993b) develop this extension and provide simulations illustrating convergence to such an SSE. Other similar robust examples of these endogenous fluctuations are the "animal spirits" equilibria in Howitt and McAfee (1992) in a model of search externalities, and equilibrium growth cycles in Evans, Honkapohja and Romer (1998b) in a model of endogenous growth with complementary capital goods. Alternatively, stable sunspot solutions can be obtained when the model exhibits a k-cycle which is by itself stable under learning. If such a k-cycle is found, then normally there also exist stable sunspot solutions nearby, provided agents allow for the possibility of sunspots in their learning behavior. OG models with downward-sloping offer curves provide simple examples of sunspot equilibria near deterministic cycles. In addition to these local results, the original analysis of Woodford (1990) showed how to use index theorem results to obtain global stability results for SSEs in the OG model.
5. Extensions and recent developments
In this section we take up several further topics that have been analyzed in the area of learning dynamics and macroeconomics. These include some alternative learning algorithms, heterogeneity of learning rules, transitions and speed of convergence results, and learning in misspecified models.

5.1. Genetic algorithms, classifier systems and neural networks

Some of the models for learning behavior have their origins in computational intelligence. Genetic algorithms and classifier systems have found some applications in economics.

5.1.1. Genetic algorithms
Genetic algorithms (GA) were initially designed for finding optima in non-smooth landscapes. We describe the main features of GAs using the Muth market model, which is one of the very first applications of GAs to economics. The exposition follows Arifovic (1994). We thus consider a market with n firms with quadratic cost functions C_it = x q_it + (1/2) y n q_it^2, where q_it is the production of firm i, and x and y are parameters. Given price expectations P_t^e the expected profit of firm i is Π_it^e = P_t^e q_it − x q_it − (1/2) y n q_it^2, and one obtains the supply function of firm i as q_it = (yn)^{−1}(P_t^e − x). The demand function is taken to be P_t = A − B Σ_{i=1}^n q_it, and the RE solution P_t = P_t^e yields q_it = q̄ = (A − x)/[n(B + y)].
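As a benchmark for what the GA is supposed to find, the REE of this market is easily computed. The parameter values below are our illustrative choices:

```python
# Illustrative parameters for the Muth market model (our choices).
A, B = 10.0, 1.0      # demand: P_t = A - B * sum_i q_it
x, y = 1.0, 1.0       # cost:   C_it = x*q_it + 0.5*y*n*q_it**2
n = 5                 # number of firms

# REE: with P_t^e = P_t, each firm supplies q = (A - x)/(n*(B + y)).
q_ree = (A - x) / (n * (B + y))
p_ree = A - B * n * q_ree

# Check: the supply schedule q = (P^e - x)/(y*n) reproduces q_ree
# when expectations equal the market-clearing price.
q_supplied = (p_ree - x) / (y * n)
print(q_ree, p_ree, abs(q_supplied - q_ree) < 1e-12)   # 0.9 5.5 True
```

With these numbers the REE has each firm producing 0.9 at a price of 5.5, so convergence of a learning algorithm can be judged against this target.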
Arifovic (1994) considers some alternative GAs. We outline here her "single-population" algorithm. Formally, there is a population A_t of 'chromosomes' A_it, which are strings of length ℓ of the binary characters 0, 1:

A_it = (a_it^1, ..., a_it^ℓ), where a_it^k = 0 or 1.

To each chromosome A_it one associates a production decision of firm i by the formula

q_it = x_it / K, where x_it = Σ_{k=1}^ℓ a_it^k 2^{k−1}.

Here K is a norming factor 69. Short-run profits μ_it = Π_it = P_t q_it − C_it provide a measure of 'fitness' for the alternative chromosomes (production decisions). Here P_t is the short-run equilibrium price, given the configuration of n chromosomes. The basic idea in a genetic algorithm is to apply certain genetic operators to different chromosomes in order to produce new chromosomes. In these operators the fitness measure provides a criterion of success, so that chromosomes with higher fitness have a better chance of producing offspring in the population. The following operators are used by Arifovic (1994):
(1) Reproduction: Each chromosome A_it produces copies with a probability which depends on its fitness. The probability of a copy c_it is given by P(c_it) = μ_it / (Σ_{i=1}^n μ_it). The resulting n copies constitute a 'mating pool'.
(2) Crossover: Two strings are selected randomly from the pool. Next, one selects a random cutoff point, and the tails of the selected chromosomes are interchanged to obtain new chromosome strings. Example: if there are two strings [110101111] and [001010010], and tails of length 4 are interchanged, then the new strings are [110100010] and [001011111]. Altogether n/2 pairs are selected (assume that n is even, for simplicity).
(3) Mutation: For each string created in step 2, in each position 0 and 1 is changed to the alternative value with a small probability.
These are standard genetic operations. In her analysis Arifovic (1994) adds another operator which is not present in standard GAs 70.
(4) Election: The new 'offspring' created by the preceding three operators are tested against their 'parents', using the profit measured at the previous price as the fitness criterion. The rules for replacement are:
- if one offspring is better than both parents, replace the less-fit parent,
69 Note that for large ℓ the expressions x_it can approximate any real number over the range of interest. 70 The market model does not converge when this operator is absent. Since mutation is always occurring, unless it is made to die off asymptotically, something like the election operator must be utilized to get convergence.
- if both offspring are better, replace both parents,
- if the parents are better than the offspring, they stay in the population.
These four operations determine a new population of size n and, given this configuration, a new short-run equilibrium price is determined by the equality of demand and output. After this the genetic operators are applied again, using the new market price and profits as the fitness measure. Arifovic (1994) shows by simulations that this algorithm converges to the RE solution irrespective of the model parameter values 71. This result is remarkable, since it happens in spite of the myopia in the fitness criterion. (The system, however, has no stochastic shocks.) For some specifications it also turns out that the time paths of the GA correspond reasonably well with certain experimental results for the market model. These genetic operations can be given broad interpretations in terms of economic behavior. First, reproduction corresponds to imitation of those who have done well. Second, crossover and mutation are like testing new ideas and making experiments. Finally, election means that only promising ideas are in fact utilized. To conclude this discussion we remark that, as a model of learning, the genetic algorithm is probably best interpreted as a framework of social rather than individual learning, cf. Sargent (1993). Indeed, individual firms are like individual chromosomes, which are replaced by new ones according to the rules of the algorithm.
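The four operators can be sketched compactly. The following is our stylized rendering of a single-population GA for this market; the parameter values, the shifted fitness weights, and the simplified election step (keep the two fittest among parents and offspring at the previous price) are our implementation choices, not Arifovic's exact specification.

```python
import random

rng = random.Random(1)

# Market parameters (our illustrative choices).
A, B, x, y = 10.0, 1.0, 1.0, 1.0
n, L = 10, 20                    # number of firms, chromosome length
NORM = (2 ** L - 1) / 2.0        # decode() maps chromosomes onto [0, 2]

def decode(bits):
    # Binary string -> production decision q_i (the x_it / K scaling).
    return sum(b << k for k, b in enumerate(bits)) / NORM

def profit(q, p):
    # Short-run profit at price p with quadratic costs.
    return p * q - (x * q + 0.5 * y * n * q * q)

def evolve(pop, pmut=0.005):
    p = A - B * sum(decode(c) for c in pop)      # market-clearing price
    fits = [profit(decode(c), p) for c in pop]
    # (1) Reproduction: fitness-proportional selection into a mating pool
    #     (fitness is shifted so the weights are positive).
    m = min(fits)
    pool = rng.choices(pop, weights=[fi - m + 1e-9 for fi in fits], k=n)
    new_pop = []
    for i in range(0, n, 2):
        c1, c2 = list(pool[i]), list(pool[i + 1])
        # (2) Crossover: swap tails at a random cutoff point.
        cut = rng.randrange(1, L)
        c1[cut:], c2[cut:] = c2[cut:], c1[cut:]
        # (3) Mutation: flip each bit with a small probability.
        for c in (c1, c2):
            for k in range(L):
                if rng.random() < pmut:
                    c[k] ^= 1
        # (4) Election (simplified): keep the two fittest among parents
        #     and offspring, judged at the previous price p.
        quartet = sorted([pool[i], pool[i + 1], c1, c2],
                         key=lambda c: profit(decode(c), p), reverse=True)
        new_pop.extend(quartet[:2])
    return new_pop

pop = [[rng.randint(0, 1) for _ in range(L)] for _ in range(n)]
for _ in range(200):
    pop = evolve(pop)
q_mean = sum(decode(c) for c in pop) / n
q_ree = (A - x) / (n * (B + y))   # REE benchmark quantity = 0.45
```

In simulations of this kind the election operator does the heavy lifting: without it, persistent mutation keeps the population from settling at the REE quantity.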
5.1.2. Classifier systems

Classifier systems provide a different variety of learning algorithms, which can be made more akin to the thought processes of individuals than a GA. This allows a direct behavioral interpretation, with individual economic agents doing the learning. A classifier system consists of an evolving collection of 'condition-action statements' (i.e. decision rules) which compete with each other in certain specified ways. The winners become the active decisions at the different stages. The strengths (or utility and costs) of the possible classifiers are a central part of the system, and accounts are kept of these strengths. When a 'message' indicating current conditions arrives, one or more classifiers are activated as the possible decisions given the signal. Next, the competition stage starts to select the active classifier. The strengths are updated according to the performance of the active classifier. (The updating rules in fact mimic the updating of parameter estimates in stochastic approximation.) Typically, there are also ways of introducing new classifiers 72. A well-known economic application of classifier systems is Marimon, McGrattan and Sargent (1989). They introduce classifier-system learning into the model of money
71 This finding is consistent with the E-stability condition and corresponds to the least squares learning results, see Sections 1.4.1 and 2.7.2: downward sloping demand and upward sloping supply is sufficient for global convergence. 72 Sargent (1993), pp. 77-81, and Dawid (1996), pp. 13-171 provide somewhat more detailed descriptions of classifier systems.
and matching due to Kiyotaki and Wright (1989). Using simulations, Marimon et al. show that learning converges to a stationary Nash equilibrium in the Kiyotaki-Wright model and that, when there are multiple equilibria, learning selects the fundamental low-cost solution. Another recent application is Lettau and Uhlig (1999). They utilize a classifier system as a rule-of-thumb decision procedure in the usual dynamic programming setup for consumption-saving decisions. The system does not fully converge to the dynamic programming solution, and Lettau and Uhlig suggest that this behavior can account for the 'excess' sensitivity of consumption to current income.

5.1.3. Neural networks
Another very recent approach to learning models based on computational intelligence has been the use of neural networks 73. The basic idea in neural networks is to represent an unknown functional relationship between inputs and outputs in terms of a network structure. In general the networks can consist of several layers of nodes, called neurons, and connections between these neurons. The simplest example of a network is the perceptron, which is a single neuron receiving several input signals and sending out a scalar output. In feedforward networks information flows only forward, from one layer of neurons to a subsequent one. Such a network usually has several layers of neurons, organized so that neurons in the same layer are not connected to each other, and neurons in later layers do not feed information back to earlier layers in the structure. In network structures signals are passed along specified connections between the different neurons in the network. In each neuron the input signals are weighted by some weights, and the aggregate is processed through an activation function of that neuron. The processed signal is the output from that neuron, and it is sent to the further neurons connected to it or, if the neuron is at the terminal layer, emitted as a component of the output of the whole network. An important property of these networks is that they can provide good approximations of the unknown functional relation between the inputs and the outputs. To achieve this the networks must be 'trained': the weights for inputs at each neuron must be determined so that, given the training data, the network approximates well the functional relation present in the input and output data. This training is often based on numerical techniques such as the gradient method, and in fact many training schemes can be represented as stochastic approximation algorithms.
The training can be done with a fixed data set, so that it is then an 'off-line' algorithm, or it may be done 'on-line' as a recursive scheme. In the latter case the basic setup corresponds closely to adaptive learning.
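To fix ideas, the following toy example (our own illustration, not drawn from the papers cited; the OR-classification task and all parameter values are assumptions) trains a single sigmoid neuron on-line, with a weight recursion of the stochastic approximation form just described:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_neuron(data, steps=5000, seed=0):
    """On-line training of a single sigmoid neuron by stochastic gradient,
    with a slowly decreasing gain (a stochastic approximation algorithm)."""
    rng = random.Random(seed)
    w = [0.0, 0.0]  # input weights
    b = 0.0         # bias
    for t in range(1, steps + 1):
        x, target = data[rng.randrange(len(data))]
        out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        gain = 1.0 / (1.0 + 0.0005 * t)            # decreasing gain sequence
        grad = (target - out) * out * (1.0 - out)  # gradient of squared error
        w[0] += gain * grad * x[0]
        w[1] += gain * grad * x[1]
        b += gain * grad
    return w, b

# Train on the (linearly separable) OR relation between two binary inputs.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_neuron(data)
preds = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in data]
```

The on-line weight update here is exactly a recursive scheme with decreasing gain; an off-line variant would instead iterate repeatedly over the fixed data set.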
73 The use of neural networks in economics is discussed e.g. in Beltratti, Margarita and Terna (1996), Cho and Sargent (1996b), and Sargent (1993). White (1992) is an advanced treatise discussing the relationship of neural networks to statistics and econometrics.
Ch. 7: Learning Dynamics
In economic theory, neural networks have very recently been utilized as representations of approximate functional forms, as computational devices, and as an approach to bounded rationality and learning. One use of neural networks has been the computation of (approximate) solutions to economic models; see e.g. Beltratti, Margarita and Terna (1996) for various illustrations from economics and finance. Another use of neural networks has been in modelling bounded rationality and learning. Cho (1995) uses perceptrons in the repeated prisoner's dilemma game: the perceptrons classify the past data and, through a threshold, this leads to a decision in accordance with the output of the perceptron. Such strategies are quite simple, and thus the modeled behavior is very much boundedly rational. Nevertheless, the efficient outcomes of the game can be recovered by use of these simple strategies. Cho and Sargent (1996a) apply this approach to study reputation issues in monetary policy. Other papers using neural networks as a learning device in macroeconomic models include Barucci and Landi (1995), Salmon (1995), Packalén (1997) and Heinemann (1997a). The last two studies look at connections to E-stability in the Muth model.

5.1.4. Recent applications of genetic algorithms
The paper by Arifovic (1994) demonstrated the potential of GAs to converge to the REE, and a natural question is whether such convergence occurs in other models and whether, when there are multiple equilibria, there is a one-to-one correspondence between solutions which are stable under statistical or econometric learning rules and solutions which are stable under GAs. The expectational stability principle, which states that there is a close connection between stability under adaptive learning rules and expectational stability, would argue for a tight correspondence between stability under econometric learning and under GAs. One setup in which this question can be investigated is the OG model with seignorage, in which a fixed real deficit is financed by printing money. Recall that, provided the level of the deficit is not too large, there are two REE monetary steady states. E-stability and stability under adaptive learning were discussed in Sections 1.4.3 and 1.4.4. Under small-gain adaptive learning of the inflation rate, the low-inflation steady state is locally stable while the high-inflation steady state is locally unstable, consistent with the E-stability results. Learning in this model was actually first investigated under least-squares learning by Marcet and Sargent (1989a). They assumed that agents forecast inflation according to the perceived law of motion p_{t+1} = β_t p_t, where β_t is given by the least-squares regression (without intercept) of prices on lagged prices. They showed that there could be convergence only to the low-inflation steady state, never to the high-inflation steady state. In addition, in simulations they found some cases with unstable paths leading to expected inflation rates at which there was no temporary equilibrium (i.e., at which it was impossible to finance the deficit through money creation). Arifovic (1995) sets up the GA so that the chromosome represents the first-period consumption of the young.
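To make the genetic operators concrete, the following schematic sketch (our own construction, not Arifovic's code; the bit-string encoding of a consumption level is retained, but the fitness function is a toy stand-in for ex-post utility and all parameter values are illustrative) applies reproduction, crossover, mutation and the election operator:

```python
import random

BITS = 8  # chromosome length

def decode(bits):
    """Map a bit string to a consumption level in [0, 1]."""
    return sum(b << i for i, b in enumerate(bits)) / (2 ** BITS - 1)

def fitness(bits):
    # Toy stand-in for ex-post utility, peaking at consumption 0.3
    # (the OG economy itself is not modelled here).
    c = decode(bits)
    return 1.0 - (c - 0.3) ** 2

def ga_generation(pop, rng, mut_prob=0.01):
    weights = [fitness(s) for s in pop]
    new_pop = []
    while len(new_pop) < len(pop):
        # Reproduction: parents drawn with probability proportional to fitness.
        p1 = rng.choices(pop, weights)[0]
        p2 = rng.choices(pop, weights)[0]
        # One-point crossover.
        cut = rng.randrange(1, BITS)
        child = p1[:cut] + p2[cut:]
        # Mutation: flip each bit with a small probability.
        child = [b ^ 1 if rng.random() < mut_prob else b for b in child]
        # Election operator: admit the offspring only if its potential
        # fitness is at least that of the better parent.
        if fitness(child) >= max(fitness(p1), fitness(p2)):
            new_pop.append(child)
        else:
            new_pop.append(max([p1, p2], key=fitness))
    return new_pop

rng = random.Random(1)
pop = [[rng.randint(0, 1) for _ in range(BITS)] for _ in range(30)]
for _ in range(200):
    pop = ga_generation(pop, rng)
mean_c = sum(decode(s) for s in pop) / len(pop)  # should cluster near the fitness peak
```

With the election operator screening out fitness-reducing offspring, the population settles near the fitness maximum; removing that operator typically leaves more persistent fluctuations.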
Using GA simulations (with an election operator),
G. W. Evans and S. Honkapohja
she also finds convergence to the low-inflation steady state and never to the high-inflation steady state. There are some differences in detail from least-squares learning. From some starting points which lead to unstable paths under (Marcet-Sargent) least-squares learning there was convergence under GA learning. It is possible that some of these apparent discrepancies arise from the particular least-squares learning scheme followed. Since the price level in either steady state is a trended series, whereas the inflation rate is not, it would be more natural for an econometrician to estimate the inflation rate by its sample mean rather than by a regression of prices on past prices. In any case, there does appear to be a close connection in this model between the local stability properties of statistical and GA learning, and the key features of learning dynamics are revealed by E-stability. In Bullard and Duffy (1998a), GAs are used to look at the issue of convergence to cycles in the standard deterministic OG endowment model with money. Recall that Grandmont (1985) showed that for appropriate utility functions it is straightforward to construct models in which there are regular perfect foresight cycles. Recall also that Guesnerie and Woodford (1991) and Evans and Honkapohja (1995c) provide local stability conditions for the convergence of adaptive and statistical learning rules to particular RE k-cycles. For "decreasing-gain" rules these are the E-stability conditions which are given in the above section on nonlinear models. It is therefore of interest to know whether GAs exhibit the same stability conditions. In Bullard and Duffy (1998a) agent i uses the following simple rule for forecasting next period's price: F_i[P(t+1)] = P(t − k_i − 1). Different values of k_i are consistent with different perfect foresight cycles. (Note that every value of k_i is consistent with learning steady states.)
The value of k_i used by agent i is coded as a bit string of length 8, so that the learning rule is in principle capable of learning cycles up to order 39. Given their price forecast, each agent chooses its optimal level of saving when young, and total saving determines the price level. A GA is used to determine the values of k_i used in each generation. Note that in this setup [in contrast to the approach in Arifovic (1994, 1995)] the GA operates on a forecast rule used by the agent, rather than directly on its decision variable 74. The question they ask is: starting from a random assignment of bit strings, will the GA converge to cycles? To answer this question they conduct GA simulations for a grid of values of the parameter specifying the relative risk aversion of the old. Their central finding is that, with only a handful of exceptions, there is convergence either to steady states or 2-cycles, but not to higher-order cycles. This finding raises the possibility that GAs may have somewhat different stability properties than other learning rules. However, the results are based on simulations using a GA
74 This makes GA learning closer in spirit to least squares and other adaptive learning of forecast rules. Using GAs to determine forecast rules was introduced in Bullard and Duffy (1994). Bullard and Duffy (1998b) show how to use GAs to directly determine consumption plans in n-period OG endowment economies.
with a particular specification of the initial conditions and the forecast rule. Thus many issues concerning stability under GAs remain to be resolved 75. We close this section with a brief description of two other recent papers which use GAs in macroeconomic learning models. Arifovic (1996) considers an OG model with two currencies. This model possesses a continuum of stationary perfect foresight solutions indexed by the exchange rate. In the GA set-up each agent has a bit string which determines the consumption level and the portfolio fractions devoted to the two currencies. Fitness of string i used by a member of generation t − 1 is measured by its ex-post utility and is used to determine the proportion of bit strings in use in t + 1 according to genetic-operator updating rules. The central finding is that the GA does not settle down to a nonstochastic stationary perfect foresight equilibrium, but instead exhibits persistent fluctuations in the exchange rate driven by fluctuations in portfolio fractions. Arifovic, Bullard and Duffy (1997) incorporate GA learning in a model of economic development based on Azariadis and Drazen (1990). This model, which emphasizes the roles of human capital and threshold externalities, has two perfect foresight steady states: a low-income zero-growth steady state and a high-income positive-growth steady state. In the GA set-up the bit strings encode the fraction of their time young agents spend in training and the proportion of their income they save 76. The central finding, based on simulations, is that, starting from the low-income steady state, economies eventually make a transition to the high-income steady state after a long but unpredictable length of time. These examples illustrate that GAs can be readily adapted to investigate a wide range of macroeconomic models. An advantage of GAs in economics is that they automatically allow for heterogeneity. A disadvantage is that there are no formal convergence results.
Although in some cases there are supporting theoretical arguments, the findings in economics to date rely primarily on simulations. This literature is growing fast. Dawid (1996) provides an overview of GAs and discusses their applications to both economic models and evolutionary games. Lettau (1997) considers the effects of learning via genetic algorithms in a model of portfolio choice.

5.2. Heterogeneity in learning behavior
In most of the literature on statistical and econometric learning it is assumed that the learning rules of economic agents are identical. This parallels, and goes beyond, the standard representative-agent assumption. Some studies have considered models in which agents have different learning rules. An early example is Bray and Savin (1986), who allow for agents to have heterogeneous priors in the context of the Muth model. Howitt (1992) incorporates different learning rules in his analysis of the instability of interest rate pegging. Evans, Honkapohja and Marimon (1998a)
75 GA learning of 2-cycles has also recently been investigated in Arifovic (1998).
76 In this model all of the standard genetic operators are used except the election operator.
extend the deficit financing inflation model to include a continuum of agents with identical savings functions but different learning rules. Marcet and Sargent (1989b) consider a model in which two classes of agents with different information form different expectations. Soerensen (1996) looks at adaptive learning with heterogeneous expectations in a nonstochastic OG model. In this literature there are two techniques for setting up and analyzing models with heterogeneous learning. First, as pointed out by Marcet and Sargent (1989c), when setting up the problem as a recursive algorithm it is straightforward to allow for a finite range of possibly heterogeneous expectations by expanding the state vector accordingly. This is easily done when there are a finite number of different agent types. Second, in some models it may be possible to aggregate the different learning rules and obtain, for mean expectations, a rule that is amenable to standard techniques. Evans, Honkapohja and Marimon (1998a) is an example of this latter methodology. The stability conditions for learning are in general affected by behavioral heterogeneity. However, many models with heterogeneous agents make the assumption that the dynamics of endogenous variables in the reduced form depend only on average expectations 77. It turns out that, when the basic framework is linear, the stability condition for convergence of learning with heterogeneous expectations is identical to the corresponding condition when homogeneous expectations are imposed; see Evans and Honkapohja (1997). Finally, we remark that the models based on GAs and classifier systems discussed above can incorporate heterogeneity in learning behavior, as can the approach developed in Brock and Hommes (1997). Using the latter approach, Brock and de Fontnouvelle (1996) obtain analytical results on expectational diversity.
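The point that heterogeneity leaves the stability condition unchanged when the reduced form depends only on average expectations is easy to illustrate numerically. In the following sketch (our own toy model; all names and parameter values are illustrative), a linear reduced form depends only on the mean forecast while each agent uses a different decreasing-gain sequence, and all forecasts converge to the REE value μ/(1 − α):

```python
import random

def simulate(mu=2.0, alpha=0.5, n=5, T=20000, seed=0):
    """Reduced form p_t = mu + alpha * (mean expectation) + noise,
    with agent i updating its forecast using its own gain c_i / t."""
    rng = random.Random(seed)
    gains = [2.0 + i for i in range(n)]              # heterogeneous gain scales
    exp = [rng.uniform(0.0, 8.0) for _ in range(n)]  # diverse initial forecasts
    for t in range(1, T + 1):
        p = mu + alpha * (sum(exp) / n) + rng.gauss(0.0, 0.1)
        for i in range(n):
            g = min(1.0, gains[i] / t)               # decreasing-gain sequence
            exp[i] += g * (p - exp[i])
    return exp

forecasts = simulate()
ree = 2.0 / (1 - 0.5)  # REE value mu / (1 - alpha) = 4
```

Since |alpha| < 1, the homogeneous-expectations stability condition holds, and every agent's forecast converges despite the different gains and initial beliefs.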
5.3. Learning in misspecified models
In most of the literature it has been assumed that agents learn based on a PLM (perceived law of motion) that is well specified, i.e. nests an REE of interest. However, economic agents, like econometricians, may fail to correctly specify the actual law of motion, even asymptotically. It may still be possible to analyze the resulting learning dynamics. An early example of this idea, in the context of a duopoly model, is Kirman (1983). Maussner (1997) is a recent paper focusing on monopolistic competition. As an illustration, consider the Muth model of Sections 1.2.1 and 1.4.1 with reduced form (4). Agents were assumed to have a PLM of the form p_t = a + b'w_{t−1} + η_t, corresponding to the REE. Suppose that instead their PLM is p_t = a + η_t, so that
77 Frydman (1982) and some papers in the volume Frydman and Phelps (1983) have stressed the importance of average opinions.
agents do not recognize the dependence of price on w_{t−1}, and that they estimate a by least squares. Then a_t = a_{t−1} + t^{−1}(p_t − a_{t−1}), and the PLM at time t − 1 is p_t = a_{t−1} + η_t with corresponding forecasts E*_{t−1} p_t = a_{t−1}. Thus the ALM is p_t = μ + α a_{t−1} + γ'w_{t−1} + η_t, and the corresponding stochastic recursive algorithm is a_t = a_{t−1} + t^{−1}(μ + (α − 1)a_{t−1} + γ'w_{t−1} + η_t). The associated ODE is da/dτ = μ + (α − 1)a, and thus from Section 2 it follows that a_t → ā = (1 − α)^{−1}μ almost surely. (We remark that the ODE da/dτ can also be interpreted as the E-stability equation for the underparameterized class of PLMs here considered.) In this case we have convergence, but it is not to the unique REE, which is p_t = (1 − α)^{−1}μ + (1 − α)^{−1}γ'w_{t−1} + η_t. Agents make systematic forecast errors, since their forecast errors are correlated with w_{t−1} and they would do better to condition their forecasts on this variable. However, we have ruled this out by assumption: we have restricted PLMs to those which do not depend on w_{t−1}. Within the restricted class of PLMs we consider, agents in fact converge to one which is rational given this restriction. The resulting solution when the forecasts are E*_{t−1} p_t = ā is p_t = (1 − α)^{−1}μ + γ'w_{t−1} + η_t. We might describe this as a restricted perceptions equilibrium, since it is generated by expectations which are optimal within a limited class of PLMs. The basic idea of a restricted perceptions equilibrium is that we permit agents to fall short of rationality specifically in failing to recognize certain patterns or correlations in the data. Clearly, for this concept to be "reasonable" in a particular application, the pattern or correlation should not be obvious. In a recent paper, Hommes and Sorger (1998) have proposed the related, but in general more stringent, concept of consistent expectations equilibria. This requires that agents correctly perceive all autocorrelations of the process.
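The convergence result for the underparameterized PLM is easy to verify by simulation. The following sketch (our own illustration; parameter values are arbitrary) runs the recursion for a_t and tracks the covariance of forecast errors with w_{t−1}, which remains bounded away from zero in the restricted perceptions equilibrium:

```python
import random

def simulate_rpe(mu=1.0, alpha=0.5, gamma=0.8, T=50000, seed=42):
    """Muth reduced form p_t = mu + alpha * E*_{t-1} p_t + gamma * w_{t-1} + eta_t,
    with the underparameterized PLM p_t = a + eta_t and the recursion
    a_t = a_{t-1} + t^{-1} (p_t - a_{t-1})."""
    rng = random.Random(seed)
    a = 0.0        # estimated intercept (the only PLM parameter)
    w_prev = 0.0   # w_{t-1}, observable but ignored by the PLM
    cov = 0.0      # running mean of (forecast error) * w_{t-1}
    for t in range(1, T + 1):
        eta = rng.gauss(0.0, 0.2)
        p = mu + alpha * a + gamma * w_prev + eta  # actual law of motion
        err = p - a                                # forecast error
        cov += (err * w_prev - cov) / t
        a += err / t                               # sample-mean update
        w_prev = rng.gauss(0.0, 1.0)
    return a, cov

a_limit, err_w_cov = simulate_rpe()
# a_t approaches (1 - alpha)^{-1} mu = 2, yet forecast errors remain
# correlated with w_{t-1} (covariance near gamma = 0.8).
```

The estimate converges to the restricted perceptions value, while the nonzero covariance shows the systematic forecast error that a well-specified PLM would eliminate.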
The restricted perceptions equilibrium concept is closely related to the notion of reduced-order limited-information REE introduced in Sargent (1991). Sargent considers the Townsend (1983) model in which two classes of agents have different information sets and each class forms expectations based on a PLM which is a fixed-order vector ARMA process, e.g. a first-order AR process. This gives a mapping from the PLM to the ALM, and a fixed point of this map is a limited-information REE, which was studied under learning in Marcet and Sargent (1989b). Sargent shows that this solution has reduced order, i.e. agents could make better forecasts using a higher-order
ARMA process. In Sargent (1991), agents use an ARMA process, which is shown to yield a full-order equilibrium 78. Some recent literature has explored learning dynamics in economies which are subject to recurrent structural shifts. As pointed out in Evans and Honkapohja (1993a), there are in principle two approaches if agents understand that these shifts will recur. One approach is for them to construct a hypermodel which allows for the structural shifts. If the agents misspecify such a model, they may converge to a restricted perceptions equilibrium, as above. An alternative approach is to allow for the structural shifts by using a constant- or non-decreasing-gain learning algorithm which can potentially track the structural change. The constant-gain procedure was followed in Evans and Honkapohja (1993a). The choice of gain parameter involves a trade-off between tracking ability and forecast variance, and an equilibrium in this class of learning rules was obtained numerically. In this kind of framework, policy can exhibit hysteresis effects if the model has multiple steady states. The recent analysis of Sargent (1999) also employs a constant-gain algorithm. In two recent papers the agents use algorithms in which the gain parameter is reset as a result of structural change. Timmermann (1995) looks at an asset pricing model with decreasing gain between structural breaks. It is assumed that agents know when a structural change has occurred and reset their gain parameters accordingly. This leads to persistent learning dynamics with greater asset price volatility 79. Marcet and Nicolini (1998) consider the inflation experience of some Latin American countries. Using an open-economy version of the seignorage model in which the level of seignorage is exogenous and random, they assume that agents use a decreasing gain unless recent forecast errors are high, in which case they revert to a higher fixed gain.
They show that under this set-up the learning rule satisfies certain reasonable properties. Under their framework, recurrent bouts of hyperinflation are possible and are better explained than under rational expectations.

5.4. Experimental evidence
Since adaptive learning can have strong implications for economic dynamics, experimental evidence in dynamic expectations models is of considerable interest. However, to date only a relatively small number of experiments have been undertaken. The limited evidence available seems to show that, when convergent, time paths from experimental data converge towards steady states which are stable under small-gain adaptive learning. Perhaps the clearest results are from experiments based on
78 Evans, Honkapohja and Sargent (1993) consider an equilibrium in which a proportion of agents have perfect foresight and the rest, econometricians, have the optimal model from a restricted class of PLMs. Mitra (1997) considers a model with these two types of agents in which the econometricians choose an optimal memory length.
79 In Timmermann (1993, 1996) excess asset price volatility is shown during the learning transition in a model with no structural breaks.
the hyperinflation (seignorage) OG model. Recall that in this model the high real balance/low-inflation steady state is E-stable, and thus stable under adaptive learning, whereas the low real balance/high-inflation steady state is unstable 80. This theoretical result is strongly supported by the experiments described in Marimon and Sunder (1993) [related experiments are reported in Arifovic (1995)]: convergence is always to the high real balance steady state and never to the low real balance steady state. Marimon, Spear and Sunder (1993) consider endogenous fluctuations (2-cycles and sunspot equilibria) in the basic OG model. Their results are mixed: persistent, belief-driven cycles can emerge, but only after the pattern has been induced by corresponding fundamental shocks. These papers also consider some aspects of transitional learning dynamics. One aspect that clearly emerges is that heterogeneity of expectations is important: individual data show considerable variability. Arifovic (1996) conducts experiments in the 2-currency OG model in which there is a continuum of equilibrium exchange rates. These experiments exhibit persistent exchange rate fluctuations, which are consistent with GA learning. For the same model, using a Newton method for learning decision rules, simulations by Sargent (1993), pp. 107-112, suggest path-dependent convergence to a nonstochastic REE. These results raise several issues. First, it would be useful to simulate learning rules like the Newton method with heterogeneous agents and alternative gain sequences. Second, given the existence of sunspot equilibria in models of this type, one should also investigate whether such solutions are stable under adaptive learning. Finally, Marimon and Sunder (1994) and Evans, Honkapohja and Marimon (1998a) introduce policy changes into experimental OG economies with seignorage. The former paper considers the effects of preannounced policy changes.
The results are difficult to reconcile with rational expectations, but the data are more consistent with an adaptive learning process. The latter paper introduces a constitutional constraint on seignorage which can lead to three steady states, two of which are stable under learning. The experiments appear to confirm that these are the attractors. The learning rules in this paper incorporate heterogeneity with random gain sequences, inertia and experimentation. This generates considerable diversity and variability during the learning transition, which has the potential to match many aspects of experimental data.

5.5. Further topics
The speed of convergence for learning algorithms is evidently an important issue for the study of learning behavior. The self-referential nature of many learning models invalidates the direct application of the corresponding results from classical statistics. At present very few studies exist on this subject. An analytic result on asymptotic speed of convergence for stochastic approximation algorithms is provided in Benveniste,
80 At least provided the gain is sufficiently small. See Sections 1.4.3 and 1.4.4.
Metivier and Priouret (1990), on pp. 110 and 332. In particular, suppose that the gain sequence is γ_t = C/t. Then, provided the real parts of all eigenvalues of the derivative of the associated ODE are less than −0.5, asymptotic convergence occurs at rate √t. (No analytic results are available in this case if the eigenvalue condition fails.) Marcet and Sargent (1995) have applied this result to adaptive learning in a version of the Cagan inflation model. They also carried out Monte Carlo simulations. The numerical results appear to accord with the analytics if the model satisfies the eigenvalue condition. However, the speed of convergence can be very slow when the eigenvalue condition fails 81. In the discussion of statistical learning procedures it is a standard assumption that the PLM can be specified parametrically. However, just as an econometrician may not know the appropriate functional form, it may be reasonable to assume that agents face the same difficulty. In this case a natural procedure is to use nonparametric techniques. This is discussed in Chen and White (1998). As an illustration, consider learning a noisy steady state in a nonlinear model (26) of Section 2.7.1, which we repeat here for convenience: y_t = H(E*_t G(y_{t+1}, v_{t+1}), v_t). Previously, the shock was assumed to be iid, and in this case a noisy steady state y(v_t) could be described in terms of a scalar parameter θ* = EG(y(v), v) (here the expectation is taken with respect to the distribution of v). Chen and White (1998) instead consider the case where v_t is an exogenous, stationary and possibly nonlinear AR(1) process. A natural PLM is now of the form E*_t G(y_{t+1}, v_{t+1}) = θ(v_t), and under appropriate assumptions there exists an REE θ(v_t) in this class. Agents are assumed to update their PLM using recursive kernel methods of the form
θ_t(v) = θ_{t−1}(v) + t^{−1} [G(y_t, v_t) − θ_{t−1}(v)] K((v_t − v)/h_t)/h_t,
where K(·) is a kernel function (i.e. a density which is symmetric around zero) and {h_t} is a sequence of bandwidths (i.e. a sequence of positive numbers decreasing to zero). Chen and White establish that, under a number of technical assumptions and an E-stability-like condition, the learning mechanism converges to θ(v_t) almost surely, provided a version of the projection facility is employed. Another new approach employs models in which agents choose a predictor from some class of expectation functions. Brock and Hommes (1997) suggest the notion of an adaptively rational expectations equilibrium in which agents make a choice among finitely many expectations functions on the basis of past performance. This choice is coupled with the dynamics of endogenous variables, and the resulting dynamics can sometimes lead to complicated global dynamics. A related paper is Hommes and Sorger (1998). The approach is similar in spirit to models of choice of forecasting functions in the presence of nonlinear dynamics or structural shifts, cf. Evans and Honkapohja (1993a), Marcet and Nicolini (1998), and Mitra (1997).
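A minimal sketch of such a recursive kernel update, evaluated at a single point v0, is given below (our own toy version, not Chen and White's setup; the AR(1) law of motion for v_t, the Gaussian kernel, the bandwidth sequence h_t = t^{−0.2} and all parameter values are assumptions for illustration):

```python
import math
import random

def kernel(u):
    """Gaussian kernel: a density symmetric around zero."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def recursive_kernel(v0=0.5, rho=0.7, T=200000, seed=3):
    """Estimate theta(v0) = E[z_{t+1} | v_t = v0] for an AR(1) process v_t,
    via theta_t = theta_{t-1} + t^{-1} (z - theta_{t-1}) K((v_t - v0)/h_t)/h_t."""
    rng = random.Random(seed)
    theta = 0.0
    v = 0.0
    for t in range(1, T + 1):
        v_next = rho * v + rng.gauss(0.0, 0.5)
        z = v_next + rng.gauss(0.0, 0.1)  # the quantity being forecast
        h = t ** -0.2                     # bandwidths decreasing to zero
        theta += (z - theta) * kernel((v - v0) / h) / (h * t)
        v = v_next
    return theta

est = recursive_kernel()  # true conditional mean is rho * v0 = 0.35
```

The kernel weight concentrates the updates on dates when v_t is near v0, so the recursion learns the conditional expectation at that point without any parametric functional form.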
81 Vives (1993) has established a similar asymptotic speed of convergence result for Bayesian learning.
6. Conclusions
Increasingly, macroeconomists are investigating models in which multiple rational expectations equilibria can arise. Traditionally, this was considered theoretically awkward: which solution would the economy follow? Examining adaptive learning in such circumstances is particularly fruitful. Requiring stability of equilibria under adaptive learning can greatly reduce the degree of multiplicity. In some models there is a unique equilibrium which is (locally) stable under learning, while other models can have more than one stable equilibrium. Even in the latter case, incorporating learning dynamics provides a resolution of the indeterminacy issue, since models with multiple stable equilibria are converted into models with path dependence. The dynamics of such an economy are determined by its initial conditions (including expectations) and by the equations of motion which include the learning rules as well as the usual structural equations of the model. In particular, the ultimate equilibrium can in part be determined by the sequence of random shocks during the transition. As was indicated above, there is some experimental evidence supporting the important role played by adaptive learning in models with multiplicity. A number of important policy issues can arise in such models, and learning dynamics need to be taken into account in formulating economic policies. In some cases policy rules can lead to unstable economic systems even though the equilibria themselves may seem satisfactory. In cases with multiple stable equilibria, the path dependence exhibited in models with adaptive learning can lead to hysteresis effects with changes in policy. In addition, temporarily inefficient policies may be necessary to guide the economy to a superior equilibrium. Finally, even in cases with a unique equilibrium, learning dynamics can be important in characterizing data in situations where there are sudden changes in policy regimes. 
The dynamics with learning can be very different from fully rational adjustments after such a change. Although our discussion has focused most heavily on asymptotic convergence to REE, some of these other issues, which have been less studied, are likely to receive more attention in the future. Learning dynamics is a new area of research where many issues are still open and new avenues no doubt remain to be discovered. We look forward to future work with excitement.
References
Adam, M., and A. Szafarz (1992), "Speculative bubbles and financial markets", Oxford Economic Papers 44:626-640.
Amman, H.M., D.A. Kendrick and J. Rust, eds (1996), Handbook of Computational Economics, vol. 1 (Elsevier, Amsterdam).
Arifovic, J. (1994), "Genetic algorithm learning and the cobweb model", Journal of Economic Dynamics and Control 18:3-28.
Arifovic, J. (1995), "Genetic algorithms and inflationary economies", Journal of Monetary Economics 36:219-243.
Arifovic, J. (1996), "The behavior of the exchange rate in the genetic algorithm and experimental economies", Journal of Political Economy 104:510-541.
Arifovic, J. (1998), "Stability of equilibria under genetic algorithm adaption: an analysis", Macroeconomic Dynamics 2:1-21.
Arifovic, J., J. Bullard and J. Duffy (1997), "The transition from stagnation to growth: an adaptive learning approach", Journal of Economic Growth 2:185-209.
Arthur, W.B. (1994), Increasing Returns and Path Dependence in the Economy (The University of Michigan Press, Ann Arbor, MI).
Arthur, W.B., Y.M. Ermoliev and Y.M. Kaniovski (1983), "On generalized urn schemes of the Polya kind", Kibernetica 19:49-56.
Arthur, W.B., Y.M. Ermoliev and Y.M. Kaniovski (1994), "Strong laws for a class of path-dependent stochastic processes with applications", in: Arthur (1994), chap. 10, pp. 185-201.
Auster, R. (1971), "The invariably stable cobweb model", Review of Economic Studies 38:117-121.
Azariadis, C. (1981), "Self-fulfilling prophecies", Journal of Economic Theory 25:380-396.
Azariadis, C., and A. Drazen (1990), "Threshold externalities in economic development", The Quarterly Journal of Economics 104:501-526.
Balasko, Y. (1994), "The expectational stability of Walrasian equilibria", Journal of Mathematical Economics 23:179-203.
Balasko, Y., and D. Royer (1996), "Stability of competitive equilibrium with respect to recursive and learning processes", Journal of Economic Theory 68:319-348.
Barnett, W., J. Geweke and K. Shell, eds (1989), Economic Complexity: Chaos, Sunspots, Bubbles, and Nonlinearity (Cambridge University Press, Cambridge).
Barnett, W., et al., eds (1991), Equilibrium Theory and Applications, Proceedings of the Sixth International Symposium in Economic Theory and Econometrics (Cambridge University Press, Cambridge).
Barucci, E., and L.
Landi (1995), "Non-parametric versus linear learning devices: a procedural perspective", Working paper (University of Florence).
Beltratti, A., S. Margarita and P. Terna (1996), Neural Networks for Economic and Financial Modelling (International Thompson Computer Press, London).
Bénassy, J., and M. Blad (1989), "On learning and rational expectations in an overlapping generations model", Journal of Economic Dynamics and Control 13:379-400.
Benhabib, J., and R.E. Farmer (1994), "Indeterminacy and increasing returns", Journal of Economic Theory 63:19-41.
Benveniste, A., M. Metivier and P. Priouret (1990), Adaptive Algorithms and Stochastic Approximations (Springer, Berlin).
Bergström, V., and A.E. Vredin, eds (1994), Measuring and Interpreting Business Cycles (Oxford University Press, Oxford).
Bertocchi, G., and W. Yong (1996), "Imperfect information, Bayesian learning and capital accumulation", Journal of Economic Growth 1:487-503.
Binmore, K. (1987), "Modeling rational players", Economics and Philosophy 3:179-214.
Blanchard, O.J., and S. Fischer (1989), Lectures on Macroeconomics (MIT Press, Cambridge, MA).
Blanchard, O.J., and C.M. Kahn (1980), "The solution of linear difference models under rational expectations", Econometrica 48:1305-1311.
Blume, L.E., and D. Easley (1982), "Learning to be rational", Journal of Economic Theory 26:340-351.
Böhm, V., and J. Wenzelburger (1995), "Expectations, forecasting, and perfect foresight - a dynamical systems approach", Discussion Paper 307 (University of Bielefeld).
Boldrin, M., and M. Woodford (1990), "Equilibrium models displaying endogenous fluctuations and chaos", Journal of Monetary Economics 25:189-222.
Bossaerts, P. (1995), "The econometrics of learning in financial markets", Econometric Theory 11:151-189.
Ch. 7: Learning Dynamics
Bray, M. (1982), "Learning, estimation, and the stability of rational expectations equilibria", Journal of Economic Theory 26:318-339.
Bray, M., and D.M. Kreps (1987), "Rational learning and rational expectations", in: Feiwel (1987), chap. 19, pp. 597-625.
Bray, M., and N. Savin (1986), "Rational expectations equilibria, learning, and model specification", Econometrica 54:1129-1160.
Bray, M., L.E. Blume and D. Easley (1982), "Introduction to the stability of rational expectations", Journal of Economic Theory 26:313-317.
Brock, W.A., and P. de Fontnouvelle (1996), "Expectational diversity in monetary economics", Working Paper SSRI 9624 (University of Wisconsin-Madison).
Brock, W.A., and C.H. Hommes (1996), "Models of complexity in economics and finance", Working paper (University of Wisconsin, Madison).
Brock, W.A., and C.H. Hommes (1997), "A rational route to randomness", Econometrica 65:1059-1095.
Broze, L., C. Gourieroux and A. Szafarz (1985), "Solutions of dynamic linear rational expectations models", Econometric Theory 1:341-368.
Broze, L., C. Gourieroux and A. Szafarz (1990), Reduced Forms of Rational Expectations Models, Fundamentals of Pure and Applied Economics (Harwood Academic Publishers).
Bruno, M. (1989), "Econometrics and the design of economic reform", Econometrica 57:275-306.
Bullard, J. (1992), "Time-varying parameters and nonconvergence to rational expectations under least squares learning", Economics Letters 40:159-166.
Bullard, J. (1994), "Learning equilibria", Journal of Economic Theory 64:468-485.
Bullard, J., and J. Duffy (1994), "Using genetic algorithms to model the evolution of heterogeneous beliefs", Working paper (Federal Reserve Bank of St. Louis).
Bullard, J., and J. Duffy (1998a), "Learning and the stability of cycles", Macroeconomic Dynamics 2:22-48.
Bullard, J., and J. Duffy (1998b), "A model of learning and emulation with artificial adaptive agents", Journal of Economic Dynamics and Control 22:179-207.
Carlson, J. (1968), "An invariably stable cobweb model", Review of Economic Studies 35:360-363.
Cass, D., and K. Shell (1983), "Do sunspots matter?", Journal of Political Economy 91:193-227.
Champsaur, P. (1983), "On the stability of rational expectations equilibria", Working Paper 8324 (CORE).
Champsaur, P., et al., eds (1990), Essays in Honor of Edmond Malinvaud, vol. 1, Microeconomics (MIT Press, Cambridge, MA).
Chatterji, S., and S.K. Chattopadhyay (1997), "Global stability in spite of 'local instability' with learning in general equilibrium models", Working Paper WP-AD 97-11 (IVIE).
Chen, X., and H. White (1998), "Nonparametric adaptive learning with feedback", Journal of Economic Theory 82:190-222.
Chiappori, P.A., and R. Guesnerie (1991), "Sunspot equilibria in sequential market models", in: Hildenbrand and Sonnenschein (1991), pp. 1683-1762.
Cho, I.-K. (1995), "Perceptrons play the repeated prisoner's dilemma", Journal of Economic Theory 67:266-284.
Cho, I.-K., and T.J. Sargent (1996a), "Learning to be credible", Working paper (Brown University).
Cho, I.-K., and T.J. Sargent (1996b), "Neural networks for encoding and adapting in dynamic economies", in: Amman, Kendrick and Rust (1996), pp. 441-470.
Christiano, L.J., and V. Valdivia (1994), "Notes on solving models using a linearization method", mimeograph (Northwestern University).
Crawford, V.P. (1995), "Adaptive dynamics in coordination games", Econometrica 63:103-143.
Currie, D., A. Garratt and S. Hall (1993), "Consistent expectations and learning in large scale macroeconometric models", in: Honkapohja and Ingberg (1993), pp. 21-42.
d'Autume, A. (1990), "On the solution of linear difference equations with rational expectations", Review of Economic Studies 57:672-688.
G. W. Evans and S. Honkapohja
Dawid, H. (1996), Adaptive Learning by Genetic Algorithms: Analytical Results and Applications to Economic Models (Springer, Berlin).
DeCanio, S. (1979), "Rational expectations and learning from experience", The Quarterly Journal of Economics 94:47-57.
Dixon, H., and N. Rankin, eds (1995), The New Macroeconomics: Imperfect Markets and Policy Effectiveness (Cambridge University Press, Cambridge).
Duffy, J. (1994), "On learning and the nonuniqueness of equilibrium in an overlapping generations model with fiat money", Journal of Economic Theory 64:541-553.
Ellison, G., and D. Fudenberg (1995), "Word-of-mouth communication and social learning", Quarterly Journal of Economics 110:93-125.
Evans, G.W. (1983), "The stability of rational expectations in macroeconomic models", in: Frydman and Phelps (1983), chap. 4, pp. 67-94.
Evans, G.W. (1985), "Expectational stability and the multiple equilibria problem in linear rational expectations models", The Quarterly Journal of Economics 100:1217-1233.
Evans, G.W. (1986), "Selection criteria for models with non-uniqueness", Journal of Monetary Economics 18:147-157.
Evans, G.W. (1989), "The fragility of sunspots and bubbles", Journal of Monetary Economics 23:297-317.
Evans, G.W., and R. Guesnerie (1993), "Rationalizability, strong rationality, and expectational stability", Games and Economic Behavior 5:632-646.
Evans, G.W., and S. Honkapohja (1986), "A complete characterization of ARMA solutions to linear rational expectations models", Review of Economic Studies 53:227-239.
Evans, G.W., and S. Honkapohja (1992), "On the robustness of bubbles in linear RE models", International Economic Review 33:1-14.
Evans, G.W., and S. Honkapohja (1993a), "Adaptive forecasts, hysteresis and endogenous fluctuations", Federal Reserve Bank of San Francisco Economic Review 1993(1):3-13.
Evans, G.W., and S. Honkapohja (1993b), "Learning and economic fluctuations: using fiscal policy to steer expectations", European Economic Review 37:595-602.
Evans, G.W., and S. Honkapohja (1994a), "Convergence of least squares learning to a non-stationary equilibrium", Economics Letters 46:131-136.
Evans, G.W., and S. Honkapohja (1994b), "Learning, convergence, and stability with multiple rational expectations equilibria", European Economic Review 38:1071-1098.
Evans, G.W., and S. Honkapohja (1994c), "On the local stability of sunspot equilibria under adaptive learning rules", Journal of Economic Theory 64:142-161.
Evans, G.W., and S. Honkapohja (1995a), "Adaptive learning and expectational stability: an introduction", in: Kirman and Salmon (1995), chap. 4, pp. 102-126.
Evans, G.W., and S. Honkapohja (1995b), "Increasing social returns, learning and bifurcation phenomena", in: Kirman and Salmon (1995), chap. 7, pp. 216-235.
Evans, G.W., and S. Honkapohja (1995c), "Local convergence of recursive learning to steady states and cycles in stochastic nonlinear models", Econometrica 63:195-206.
Evans, G.W., and S. Honkapohja (1997), "Least squares learning with heterogeneous expectations", Economics Letters 52:197-201.
Evans, G.W., and S. Honkapohja (1998a), "Convergence of learning algorithms without a projection facility", Journal of Mathematical Economics 30:59-86.
Evans, G.W., and S. Honkapohja (1998b), "Economic dynamics with learning: new stability results", Review of Economic Studies 65:23-44.
Evans, G.W., and S. Honkapohja (1999a), Learning and Expectations in Macroeconomics, book manuscript (Eugene, OR and Helsinki).
Evans, G.W., and S. Honkapohja (1999b), "Convergence for difference equations with vanishing time dependence, with applications to adaptive learning", Economic Theory, forthcoming.
Evans, G.W., and G. Ramey (1992), "Expectations calculation and currency collapse", American Economic Review 82:207-224.
Evans, G.W., and G. Ramey (1995), "Expectation calculation, hyperinflation and currency collapse", in: Dixon and Rankin (1995), chap. 15, pp. 307-336.
Evans, G.W., and G. Ramey (1998), "Calculation, adaptation and rational expectations", Macroeconomic Dynamics 2:156-182.
Evans, G.W., S. Honkapohja and T.J. Sargent (1993), "On the preservation of deterministic cycles when some agents perceive them to be random fluctuations", Journal of Economic Dynamics and Control 17:705-721.
Evans, G.W., S. Honkapohja and R. Marimon (1998a), "Convergence in monetary inflation models with heterogeneous learning rules", Discussion paper 386 (Department of Economics, University of Helsinki).
Evans, G.W., S. Honkapohja and P.M. Romer (1998b), "Growth cycles", American Economic Review 88:495-515.
Farmer, R.E. (1991), "Sticky prices", The Economic Journal 101:1369-1379.
Farmer, R.E. (1993), The Economics of Self-Fulfilling Prophecies (MIT Press, Cambridge, MA).
Farmer, R.E., and J.-T. Guo (1994), "Real business cycles and the animal spirits hypothesis", Journal of Economic Theory 63:42-72.
Feiwel, G.R., ed. (1987), Arrow and the Ascent of Modern Economic Theory (New York University Press, New York).
Feldman, M. (1987a), "Bayesian learning and convergence to rational expectations", Journal of Mathematical Economics 16:297-313.
Feldman, M. (1987b), "An example of convergence to rational expectations with heterogeneous beliefs", International Economic Review 28(3):635-650.
Fourgeaud, C., C. Gourieroux and J. Pradel (1986), "Learning procedures and convergence to rationality", Econometrica 54:845-868.
Friedman, D. (1991), "Evolutionary games in economics", Econometrica 59:632-665.
Frydman, R. (1982), "Towards an understanding of market processes: individual expectations, learning, and convergence to rational expectations equilibrium", American Economic Review 72:652-668.
Frydman, R., and E.S. Phelps (1983), Individual Forecasting and Aggregate Outcomes: "Rational Expectations" Reexamined (Cambridge University Press, Cambridge).
Fuchs, G. (1977), "Formation of expectations: a model in temporary general equilibrium theory", Journal of Mathematical Economics 4:167-187.
Fuchs, G. (1979), "Is error learning behavior stabilizing?", Journal of Economic Theory 20:300-317.
Fuchs, G., and G. Laroque (1976), "Dynamics of temporary equilibria and expectations", Econometrica 44:1157-1178.
Fudenberg, D., and D.M. Kreps (1993), "Learning mixed equilibria", Games and Economic Behavior 5:320-367.
Fudenberg, D., and D.M. Kreps (1995), "Learning in extensive-form games I. Self-confirming equilibria", Games and Economic Behavior 8:20-55.
Fudenberg, D., and D.K. Levine (1998), Theory of Learning in Games (MIT Press, Cambridge, MA).
Fuhrer, J.C., and M.A. Hooker (1993), "Learning about monetary regime shifts in an overlapping wage contract model", Journal of Economic Dynamics and Control 17:531-553.
Gale, D. (1996), "What have we learned from social learning?", European Economic Review 40:617-628.
Garratt, A., and S. Hall (1997), "E-equilibria and adaptive expectations: output and inflation in the LBS model", Journal of Economic Dynamics and Control 21:1149-1171.
Geanakoplos, J.D., and H.M. Polemarchakis (1991), "Overlapping generations", in: Hildenbrand and Sonnenschein (1991), chap. 35, pp. 1899-1960.
Gottfries, N. (1985), "Multiple perfect foresight equilibriums and convergence of learning processes", Journal of Money, Credit and Banking 17:111-117.
Gourieroux, C., J. Laffont and A. Monfort (1982), "Rational expectations in dynamic linear models: analysis of the solutions", Econometrica 50:409-425.
Grandmont, J.-M. (1985), "On endogenous competitive business cycles", Econometrica 53:995-1045.
Grandmont, J.-M. (1998), "Expectations formation and stability of large socioeconomic systems", Econometrica 66:741-781.
Grandmont, J.-M., and G. Laroque (1986), "Stability of cycles and expectations", Journal of Economic Theory 40:138-151.
Grandmont, J.-M., and G. Laroque (1990), "Stability, expectations, and predetermined variables", in: Champsaur et al. (1990), chap. 3, pp. 71-92.
Grandmont, J.-M., and G. Laroque (1991), "Economic dynamics with learning: some instability examples", in: Barnett et al. (1991), chap. 11, pp. 247-273.
Griliches, Z., and M. Intriligator (1986), Handbook of Econometrics, vol. 3 (North-Holland, Amsterdam).
Guesnerie, R. (1992), "An exploration of the eductive justifications of the rational-expectations hypothesis", American Economic Review 82:1254-1278.
Guesnerie, R. (1993), "Theoretical tests of the rational-expectations hypothesis in economic dynamical models", Journal of Economic Dynamics and Control 17:847-864.
Guesnerie, R. (1996), "Coordination problems with fixed and flexible wages: the role of the Keynesian multiplier", mimeograph (DELTA, ENS).
Guesnerie, R., and M. Woodford (1991), "Stability of cycles with adaptive learning rules", in: Barnett et al. (1991), pp. 111-134.
Guesnerie, R., and M. Woodford (1992), "Endogenous fluctuations", in: Laffont (1992), chap. 6, pp. 289-412.
Hahn, W. (1963), Theory and Application of Liapunov's Direct Method (Prentice-Hall, Englewood Cliffs, NJ).
Hahn, W. (1967), Stability of Motion (Springer, Berlin).
Heinemann, M. (1997a), "Adaptive learning of rational expectations using neural networks", Working paper (University of Hannover).
Heinemann, M. (1997b), "Convergence of adaptive learning and expectational stability: the case of multiple rational expectations equilibria", Working paper (University of Hannover).
Heymann, D., and P. Sanguinetti (1997), "Business cycles from misperceived trends", Working paper (University of Buenos Aires).
Hildenbrand, W., and H. Sonnenschein, eds (1991), Handbook of Mathematical Economics, vol. IV (North-Holland, Amsterdam).
Hommes, C.H., and G. Sorger (1998), "Consistent expectations equilibria", Macroeconomic Dynamics 2:287-321.
Honkapohja, S. (1993), "Adaptive learning and bounded rationality: an introduction to basic concepts", European Economic Review 37:587-594.
Honkapohja, S. (1994), "Expectations driven nonlinear business cycles: comments", in: Bergström and Vredin (1994), chap. 19, pp. 256-262.
Honkapohja, S. (1996), "Bounded rationality in macroeconomics: a review essay", Journal of Monetary Economics 35:509-518.
Honkapohja, S., and M. Ingberg, eds (1993), Macroeconomic Modeling and Policy Implications (North-Holland, Amsterdam).
Howitt, P. (1992), "Interest rate control and nonconvergence to rational expectations", Journal of Political Economy 100:776-800.
Howitt, P., and R.P. McAfee (1992), "Animal spirits", American Economic Review 82:493-507.
Jun, B., and X. Vives (1996), "Learning and convergence to a full-information equilibrium are not equivalent", Review of Economic Studies 63:653-674.
Kandori, M., G.J. Mailath and R. Rob (1993), "Learning, mutation, and long run equilibria in games", Econometrica 61:29-56.
Kiefer, J., and J. Wolfowitz (1952), "Stochastic estimation of the maximum of a regression function", Annals of Mathematical Statistics 23:462-466.
Kirman, A.P. (1983), "On mistaken beliefs and resultant equilibria", in: Frydman and Phelps (1983), chap. 8, pp. 147-166.
Kirman, A.P. (1995), "Learning in oligopoly: theory, simulation, and experimental evidence", in: Kirman and Salmon (1995), chap. 5, pp. 127-178.
Kirman, A.P., and P. Salmon, eds (1995), Learning and Rationality in Economics (Basil Blackwell, Oxford).
Kiyotaki, N., and R. Wright (1989), "On money as a medium of exchange", Journal of Political Economy 97:927-954.
Kreps, D.M., and K. Wallis, eds (1997), Advances in Economics and Econometrics: Theory and Applications, vol. I (Cambridge University Press, Cambridge).
Kuan, C.-M., and H. White (1994), "Adaptive learning with nonlinear dynamics driven by dependent processes", Econometrica 62:1087-1114.
Kurz, M. (1989), "Bounded ability of agents to learn the equilibrium price process of a complex economy", Technical report 540 (IMSSS, Stanford University).
Kurz, M. (1994a), "Asset prices with rational beliefs", Working paper (Stanford University).
Kurz, M. (1994b), "On the structure and diversity of rational beliefs", Economic Theory 4:1-24.
Kurz, M., ed. (1997), Endogenous Economic Fluctuations: Studies in the Theory of Rational Beliefs (Springer, Berlin).
Kushner, H., and D. Clark (1978), Stochastic Approximation Methods for Constrained and Unconstrained Systems (Springer, Berlin).
Kushner, H.J., and G.G. Yin (1997), Stochastic Approximation Algorithms and Applications (Springer, Berlin).
Laffont, J.-J., ed. (1992), Advances in Economic Theory: Sixth World Congress, vol. 2 (Cambridge University Press, Cambridge).
Lettau, M. (1997), "Explaining the facts with adaptive agents: the case of mutual fund flows", Journal of Economic Dynamics and Control 21:1117-1147.
Lettau, M., and H. Uhlig (1999), "Rules of thumb versus dynamic programming", American Economic Review 89:148-174.
Lettau, M., and T. Van Zandt (1995), "Robustness of adaptive expectations as an equilibrium selection device", Working paper (Northwestern University).
Ljung, L. (1977), "Analysis of recursive stochastic algorithms", IEEE Transactions on Automatic Control 22:551-575.
Ljung, L., and T. Söderström (1983), Theory and Practice of Recursive Identification (MIT Press, Cambridge, MA).
Ljung, L., G. Pflug and H. Walk (1992), Stochastic Approximation and Optimization of Random Systems (Birkhäuser, Basel).
Lucas Jr, R.E. (1973), "Some international evidence on output-inflation tradeoffs", American Economic Review 63:326-334.
Lucas Jr, R.E. (1978), "Asset prices in an exchange economy", Econometrica 46:1429-1445.
Lucas Jr, R.E. (1986), "Adaptive behavior and economic theory", Journal of Business 59(Suppl.):S401-S426.
Marcet, A. (1994), "Simulation analysis of dynamic stochastic models: applications to theory and estimation", in: Sims (1994), pp. 81-118.
Marcet, A., and D.A. Marshall (1992), "Convergence of approximate model solutions to rational expectations equilibria using the method of parameterized expectations", Working Paper WP73 (Northwestern University).
Marcet, A., and J.P. Nicolini (1998), "Recurrent hyperinflations and learning", Working Paper 1875 (CEPR).
Marcet, A., and T.J. Sargent (1988), "The fate of systems with 'adaptive' expectations", AEA Papers and Proceedings 78(2):168-172.
Marcet, A., and T.J. Sargent (1989a), "Convergence of least squares learning and the dynamic of hyperinflation", in: Barnett et al. (1989), pp. 119-137.
Marcet, A., and T.J. Sargent (1989b), "Convergence of least-squares learning in environments with hidden state variables and private information", Journal of Political Economy 97:1306-1322.
Marcet, A., and T.J. Sargent (1989c), "Convergence of least-squares learning mechanisms in self-referential linear stochastic models", Journal of Economic Theory 48:337-368.
Marcet, A., and T.J. Sargent (1995), "Speed of convergence of recursive least squares: learning with autoregressive moving-average perceptions", in: Kirman and Salmon (1995), chap. 6, pp. 179-215.
Margaritis, D. (1987), "Strong convergence of least squares learning to rational expectations", Economics Letters 23:157-161.
Marimon, R. (1997), "Learning from learning in economics", in: Kreps and Wallis (1997), chap. 9, pp. 278-315.
Marimon, R., and E.R. McGrattan (1995), "On adaptive learning in strategic games", in: Kirman and Salmon (1995), chap. 3, pp. 63-101.
Marimon, R., and S. Sunder (1993), "Indeterminacy of equilibria in a hyperinflationary world: experimental evidence", Econometrica 61:1073-1107.
Marimon, R., and S. Sunder (1994), "Expectations and learning under alternative monetary regimes: an experimental approach", Economic Theory 4:131-162.
Marimon, R., E.R. McGrattan and T. Sargent (1989), "Money as medium of exchange with artificially intelligent agents", Journal of Economic Dynamics and Control 14:329-373.
Marimon, R., S.E. Spear and S. Sunder (1993), "Expectationally driven market volatility: an experimental study", Journal of Economic Theory 61:74-103.
Maussner, A. (1997), "Learning to believe in nonrational expectations that support Pareto-superior outcomes", Journal of Economics 65:235-256.
McCallum, B.T. (1983), "On nonuniqueness in linear rational expectations models: an attempt at perspective", Journal of Monetary Economics 11:139-168.
McCallum, B.T. (1997), "Alternative criteria for identifying bubble-free solutions in rational expectations models", Working paper (Carnegie Mellon and NBER).
McLennan, A. (1984), "Price dispersion and incomplete learning in the long run", Journal of Economic Dynamics and Control 7:331-347.
Milgrom, P., and J. Roberts (1990), "Rationalizability, learning, and equilibrium in games with strategic complementarities", Econometrica 58:1255-1277.
Milgrom, P., and J. Roberts (1991), "Adaptive and sophisticated learning in normal form games", Games and Economic Behavior 3:82-100.
Mitra, K. (1997), "Least squares prediction and nonlinear dynamics under non-stationarity", mimeograph (University of Helsinki).
Moore, B.J. (1993), "Least-squares learning and the stability of equilibria with externalities", Review of Economic Studies 60:197-208.
Moore, B.J., and H. Schaller (1996), "Learning, regime switches, and equilibrium asset pricing dynamics", Journal of Economic Dynamics and Control 20:979-1006.
Moore, B.J., and H. Schaller (1997), "Persistent and transitory shocks, learning and investment dynamics", Working paper (Carleton University).
Moreno, D., and M. Walker (1994), "Two problems in applying Ljung's 'projection algorithms' to the analysis of decentralized learning", Journal of Economic Theory 62:420-427.
Muth, J.F. (1961), "Rational expectations and the theory of price movements", Econometrica 29:315-335.
Nyarko, Y. (1991), "Learning in mis-specified models and the possibility of cycles", Journal of Economic Theory 55:416-427.
Nyarko, Y. (1997), "Convergence in economic models with Bayesian hierarchies of beliefs", Journal of Economic Theory 74:266-296.
Packalén, M. (1997), "Adaptive learning of rational expectations: a neural network approach", mimeograph (University of Helsinki).
Pesaran, H. (1981), "Identification of rational expectations models", Journal of Econometrics 16:375-398.
Robbins, H., and S. Monro (1951), "A stochastic approximation method", Annals of Mathematical Statistics 22:400-407.
Salge, M. (1997), Rational Bubbles: Theoretical Basis, Economic Relevance and Empirical Evidence with Special Emphasis on the German Stock Market (Springer, Berlin).
Salmon, M. (1995), "Bounded rationality and learning; procedural learning", in: Kirman and Salmon (1995), chap. 8, pp. 236-275.
Sargent, T.J. (1987), Macroeconomic Theory, 2nd edition (Academic Press, New York).
Sargent, T.J. (1991), "Equilibrium with signal extraction from endogenous variables", Journal of Economic Dynamics and Control 15:245-273.
Sargent, T.J. (1993), Bounded Rationality in Macroeconomics (Oxford University Press, Oxford).
Sargent, T.J. (1999), The Conquest of American Inflation (Princeton University Press, Princeton, NJ).
Sargent, T.J., and N. Wallace (1975), "'Rational expectations', the optimal monetary instrument and the optimal money supply rule", Journal of Political Economy 83:241-254.
Schönhofer, M. (1996), "Chaotic learning equilibria", Discussion Paper 317 (University of Bielefeld).
Shell, K. (1977), "Monnaie et allocation intertemporelle", Working paper (CNRS Séminaire de E. Malinvaud, Paris).
Sims, C.A., ed. (1994), Advances in Econometrics, Sixth World Congress, vol. 2 (Cambridge University Press, Cambridge).
Soerensen, J.P. (1996), "An economy with heterogeneous agents", Working paper (University of Edinburgh).
Spear, S.E. (1989), "Learning rational expectations under computability constraints", Econometrica 57:889-910.
Taylor, J.B. (1975), "Monetary policy during a transition to rational expectations", Journal of Political Economy 83:1009-1021.
Taylor, J.B. (1977), "Conditions for unique solutions in stochastic macroeconomic models with rational expectations", Econometrica 45:1377-1386.
Taylor, J.B. (1980), "Aggregate dynamics and staggered contracts", Journal of Political Economy 88:1-23.
Taylor, J.B. (1986), "New approaches to stabilization policy in stochastic models of macroeconomic fluctuations", in: Griliches and Intriligator (1986), chap. 34, pp. 1997-2055.
Tillmann, G. (1983), "Stability in a simple pure consumption loan model", Journal of Economic Theory 30:315-329.
Timmermann, A.G. (1993), "How learning in financial markets generates excess volatility and predictability in stock prices", Quarterly Journal of Economics 108:1135-1145.
Timmermann, A.G. (1994), "Can agents learn to form rational expectations? Some results on convergence and stability of learning in the UK stock market", Economic Journal 104:777-797.
Timmermann, A.G. (1995), "Volatility clustering and mean reversion of stock returns in an asset pricing model with incomplete learning", Working paper (University of California, San Diego).
Timmermann, A.G. (1996), "Excessive volatility and predictability of stock prices in autoregressive dividend models with learning", Review of Economic Studies 63:523-557.
Townsend, R.M. (1978), "Market anticipations, rational expectations, and Bayesian analysis", International Economic Review 19:481-494.
Townsend, R.M. (1983), "Forecasting the forecast of others", Journal of Political Economy 91:546-588.
Turnovsky, S. (1969), "A Bayesian approach to the theory of expectations", Journal of Economic Theory 1:220-227.
Vives, X. (1993), "How fast do rational agents learn?", Review of Economic Studies 60:329-347.
White, H. (1992), Artificial Neural Networks: Approximation and Learning Theory (Basil Blackwell, Oxford).
Whiteman, C. (1983), Linear Rational Expectations Models (University of Minnesota Press, Minneapolis, MN).
Woodford, M. (1990), "Learning to believe in sunspots", Econometrica 58:277-307.
Zenner, M. (1996), Learning to Become Rational: The Case of Self-Referential Autoregressive and Non-Stationary Models (Springer, Berlin).
Chapter 8
MICRO DATA AND GENERAL EQUILIBRIUM MODELS*
MARTIN BROWNING
Institute of Economics, Copenhagen University, Copenhagen, Denmark
email: [email protected]

LARS PETER HANSEN
University of Chicago, Chicago, IL, USA
email: [email protected]

JAMES J. HECKMAN
University of Chicago, Chicago, IL, USA
email: [email protected]
Contents
Abstract 544
Keywords 544
Introduction 545
1. Stochastic growth model 547
1.1. Single consumer model 548
1.1.1. Parameterizations 548
1.1.2. Steady states 549
1.1.3. Micro evidence 551
1.2. Multiple agents 552
1.2.1. Preferences 552
1.2.2. Labor income 553
1.2.3. Market structure 553
1.2.4. Preference homogeneity 553
1.2.5. Risk aversion or intertemporal substitution? 556
1.2.6. Preference heterogeneity 558
1.2.7. Questionnaire evidence on the scale and distribution of risk aversion 564
1.3. Incomplete markets 566
1.3.1. Microeconomic uncertainty 567
1.3.1.1. Estimated processes for wages and earnings 567
* We thank Marco Cagetti, John Heaton, Jose Scheinkman, John Taylor, Edward Vytlacil and Noah Williams for comments. Hansen and Heckman gratefully acknowledge funding support by the National Science Foundation.
Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford © 1999 Elsevier Science B.V. All rights reserved 543
M. Browning et al.
1.3.1.2. Missing risks 572
1.3.1.3. Statistical decompositions 572
1.3.2. Limited commitment and private information 574
1.3.2.1. Limited commitment 574
1.3.2.2. Private information 575
2. Overlapping generations model 576
2.1. Motivation 576
2.2. Economic models of earnings and human capital investment 577
2.2.1. The Ben-Porath framework 582
2.2.2. The HLT model of earnings, schooling and on-the-job training 587
2.3. Structure of the model 587
2.3.1. Equilibrium conditions under perfect foresight 591
2.3.2. Linking the earnings function to prices and market aggregates 592
2.4. Determining the parameters of OLG models 592
3. Micro evidence 594
3.1. Introduction 594
3.2. Defining elasticities 594
3.2.1. Frisch demands 595
3.2.2. Other demand functions 597
3.2.3. An example 599
3.2.4. The life-cycle participation decision 601
3.3. Consumption estimates 605
3.4. Labor supply 614
3.4.1. Labor supply estimates 615
3.5. Heterogeneity in the marginal rate of substitution between goods and leisure 620
Summary and conclusion 623
References 625
Abstract
Dynamic general equilibrium models are required to evaluate policies applied at the national level. To use these models to make quantitative forecasts requires knowledge of an extensive array of parameter values for the economy at large. This essay describes the parameters required for different economic models and assesses the discordance between the macromodels used in policy evaluation and the microeconomic models used to generate the empirical evidence. For concreteness, we focus on two general equilibrium models: the stochastic growth model extended to include some forms of heterogeneity and the overlapping generations model enriched to accommodate human capital formation.

Keywords
general equilibrium models, microeconomic evidence, stochastic growth model, overlapping generations model, calibration
Ch. 8: Micro Data and General Equilibrium Models
Introduction
An extensive literature in macroeconomics and public finance uses dynamic stochastic general equilibrium models to study consumption, savings, capital accumulation, and asset pricing and to analyze alternative policies. Except for a few special cases, the economies studied cannot be analyzed using "paper and pencil" style analysis. It is often difficult to produce general theorems that are true for all parameter values of dynamic general equilibrium models. This is a general feature of non-linear dynamic models in economics as well as in the physical sciences. For such models, knowing which parameters govern behavior is essential for understanding their empirical content and for providing quantitative answers to policy questions.

For the numerical output of a dynamic equilibrium model to be interesting, the inputs need to be justified as empirically relevant. There are two sources of information that are commonly used in rationalizing parameter values. One is the behavior of time series averages of levels or ratios of key variables. These time series averages are often matched to the steady-state implications of versions of the models that abstract from uncertainty. The other input is from microeconomic evidence. In this essay we discuss the use of evidence from both sources, concentrating mostly on microeconomic evidence. See King and Rebelo (1999) and Taylor (1999) for extensive discussions of calibrating real-business cycle and staggered contract models, respectively.

It was once believed to be a simple task to extract the parameters needed in general equilibrium theory from a large warehouse of stable micro empirical regularities. Indeed, Prescott (1986) argued that:

The key parameters of growth models are the intertemporal and intratemporal elasticities of substitution.
As Lucas (1980) emphasizes, "On those parameters, we have a wealth of inexpensive available data from census and cohort information, from panel data describing market conditions and so forth".

While this Lucas-Prescott vision of calibration offers an appealing defense for building models with microeconomic foundations, implementing it in practice exposes major discrepancies between the micro evidence and the assumptions on which the stylized dynamic models are based. The microeconomic evidence is often incompatible with the macroeconomic model being calibrated. For example, a major finding of modern microeconomic data analysis is that preferences are heterogeneous. For reasons of computational tractability, dynamic general equilibrium model-builders often abstract from this feature or confront it in only a limited way. This chapter explores the discordance between micro evidence and macro use of it and suggests ways in which it can be diminished.

Our chapter raises warning flags about the current use of micro evidence in dynamic equilibrium models and indicates the dangers in, and limitations of, many current practices. It also exposes the weak micro empirical foundations of many widely used general equilibrium modeling schemes. The decision to incorporate micro evidence in an internally consistent manner may alter the structure and hence the time series implications of the model. While
546
M. Browning et al.
steady-state approximations may be useful for some purposes, compositional changes in labor supply or in market demand alter the microeconomic elasticities that are relevant for macroeconomics. Like several of the other contributions to this Handbook, ours is more of a guide for future research than a summary of a mature literature. Because the micro empirical literature and the macro general equilibrium literature have often moved in different directions, it is not surprising that they are currently so detached. The main goal of this essay is to foster the process of reattachment. Macro general equilibrium models provide a framework within which micro empirical research can be fruitfully conducted. At the same time, dynamic general equilibrium theory will be greatly enriched if it incorporates the insights of the micro empirical literature. The micro foundations of macroeconomics are more fruitfully built on models restructured to incorporate microeconomic evidence. This essay explores three challenges for closing the gap between empirical microeconomics and dynamic macroeconomic theory:
• Heterogeneity: Any careful reading of the empirical microeconomics literature on consumption, saving, and labor supply reveals quantitatively important heterogeneity in agent preferences, in constraints, in dimensions of labor supply and skill, and in human capital accumulation processes. Accounting for heterogeneity is required to calibrate dynamic models to microeconomic evidence.
• Uncertainty: Modern macroeconomics is based on models of uncertainty. Aggregating earnings across members in a household and across income types may create a disconnect between uncertainty as measured by innovations in time series processes of earnings and income equations and actual innovations in information. Government or interfamily transfers provide insurance that should be accounted for.
Alternative risk components, such as risks from changes in health, risks from unemployment and job termination, and risks from changes in family structure, have different degrees of predictability and are difficult to quantify. Measuring the true components of both micro and macro uncertainty and distinguishing them from measurement error and model misspecification remains an empirical challenge that is just beginning to be confronted.
• Synthesis: Synthesizing evidence across micro studies is not a straightforward task. Different microeconomic studies make different assumptions, often implicit, about the economic environments in which agents make their decisions. They condition on different variables and produce parameters with different economic interpretations. A parameter that is valid for a model in one economic environment cannot be uncritically applied to a model embedded in a different economic environment. Different general equilibrium models make different assumptions and require different parameters, many of which have never been estimated in the micro literature. In order to be both specific and constructive, in this essay we limit ourselves to two prototypical general equilibrium models: (a) a stochastic growth model and (b) a perfect foresight overlapping generations model. The first model is sufficiently rich to enable us to explore implications of uncertainty, market structure and some
forms of heterogeneity in the preferences and opportunities of microeconomic agents. The second model introduces explicit life-cycle heterogeneity and demographic structures in appealing and tractable ways. We consider a recent version of the second model that introduces human capital formation, heterogeneity in skills, and comparative advantage in the labor market. These attributes are introduced to provide a framework for analyzing labor market policies, to account for a major source of wealth formation in modern economies, and to account for the phenomenon of rising wage inequality observed in many countries. The plan of this chapter is as follows. We first present two basic theoretical models analyzed in this chapter and the parameters required to implement them. We summarize the main lessons from the micro literature that pertain to each model and their consequences for the models we consider. The models are presented in Sections 1 and 2, respectively, with some accompanying discussion of the relevant micro literature. Section 3 presents further discussion of the micro evidence on intertemporal substitution elasticities.
1. Stochastic growth model

This part of the chapter presents alternative variants of a Brock-Mirman (1972) stochastic growth model and discusses the parameters needed to calibrate them. We explicitly consider the consequences of heterogeneity for the predictions of this model and for the practice of research synthesis. It is often not the median or "representative" preferences that govern behavior asymptotically; rather it is the extremes. The agents with the smallest rates of time preference or smallest risk aversion may dominate the wealth accumulation process, but not the supply of labor. Understanding the source and magnitude of the heterogeneity is required before microeconomic estimates can be "plugged" into macroeconomic models. We also explore the measurement of microeconomic uncertainty needed to quantify the importance of precaution in decision-making and to calibrate equilibrium models with heterogeneous agents. We use the Brock-Mirman (1972) stochastic growth model as a starting point for our discussion because of its analytical tractability. Moreover, it is the theoretical framework for the real-business cycle models of Kydland and Prescott (1982) and Hansen (1985) and for subsequent multiple-consumer extensions of it by Aiyagari (1994), Krusell and Smith (1998) and others. Our use of the stochastic growth model is not meant as an endorsement of its empirical plausibility. Much is known about its deficiencies as a model of fluctuations [e.g., see Christiano (1988), Watson (1993), and Cogley and Nason (1995)] or as a model of security market prices implied by a Lucas-Prescott (1971) type of decentralization [e.g., see Hansen and Singleton (1982, 1983), Mehra and Prescott (1985), Weil (1989), Hansen and Jagannathan (1991), and Heaton and Lucas (1996)]. Nevertheless, the Brock-Mirman model and its extensions provide a convenient and widely used starting point for investigating the difficulties
in obtaining plausible parameter configurations from microeconomic data and from aggregate time series data.

1.1. Single consumer model
Suppose that there is a single infinitely-lived consumer. This consumer supplies labor and consumes in each period, evaluating consumption flows using a von Neumann-Morgenstern discounted utility function:

\sum_{t=0}^{\infty} \beta^t U(c_t),
where c_t is consumption at date t, U is an increasing concave function, and 0 < \beta < 1 is a subjective discount factor. Labor h_t is either supplied inelastically, or else preferences are modified to incorporate the disutility of work (utility of leisure):

\sum_{t=0}^{\infty} \beta^t U(c_t, h_t).
Production takes place according to a two-input production function:

c_t + (k_t - \lambda k_{t-1}) = d_t f(k_{t-1}, h_t),   (1.1)

where k_t is capital and d_t is a technology shock, which is a component of a Markov process {x_t}. The depreciation rate is 1 - \lambda. Associated with this Markov process is a sequence of information sets {I_t}. In what follows we sometimes adopt the common and convenient Cobb-Douglas specification of production 1:

f(k, h) = k^\theta h^{1-\theta}.   (1.2)
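To fix notation, the resource constraint (1.1) and the Cobb-Douglas technology (1.2) can be written in a few lines of code. The parameter values (theta = 0.36, lam = 0.9) are illustrative assumptions, not calibrated estimates:

```python
def f(k, h, theta=0.36):
    """Cobb-Douglas technology (1.2): f(k, h) = k**theta * h**(1 - theta)."""
    return k ** theta * h ** (1 - theta)

def next_capital(k_prev, h, c, d, lam=0.9, theta=0.36):
    """Resource constraint (1.1), c_t + (k_t - lam * k_{t-1}) = d_t * f(k_{t-1}, h_t),
    solved for k_t. The depreciation rate is 1 - lam."""
    return d * f(k_prev, h, theta) + lam * k_prev - c
```

With labor supplied inelastically (h = 1), this gives the one-sector capital transition used throughout this section.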
1.1.1. Parameterizations
We first present the basic utility functions that have been used in empirical work and in many versions of the Brock-Mirman model. We briefly review the microeconometric evidence on preferences, paying particular attention to the interactions between consumption and labor supply. This evidence is discussed more extensively in Section 3. For convenience, some models abstract from the labor supply decision
1 In Section 2 we will study deficiencies of this Cobb-Douglas specification. In particular, labor is not homogeneous and an efficiency units assumption to adjust labor to homogeneous units is inconsistent with the evidence from factor markets. Comparative advantage and sectoral choices by agents are key features of modern labor markets.
and use an iso-elastic one-period utility function defined over a single non-durable consumption good 2:

U(c) = \frac{c^{1-\rho} - 1}{1-\rho},   (1.3)
for \rho \geq 0. This specification is used in part because, given intertemporal additivity of preferences, it is homothetic and hence leads to simple steady-state characterizations. To obtain a more interesting model of economic fluctuations, including fluctuations in total or average hours worked, Kydland and Prescott (1982) introduced leisure into the preferences of a Brock-Mirman model [see also King, Plosser and Rebelo (1988a,b) and Cooley and Prescott (1995)]. Most subsequent investigators assume that the one-period utility function can be written in the form

U(c, h) = \frac{\{c^\alpha [\psi(h)]^{1-\alpha}\}^{1-\rho} - 1}{1-\rho},   (1.4)

where h is hours of work, \psi is decreasing and positive, and \alpha is in the interval (0, 1) 3. When \rho = 1, we obtain the additively-separable model:

U(c, h) = \alpha \log(c) + (1 - \alpha) \log[\psi(h)].
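The role of the \rho = 1 case can be made concrete: the iso-elastic form (1.3) converges to log utility as \rho approaches 1, which is why the additively-separable log specification appears as a special case. A minimal sketch (the function name is ours):

```python
import math

def crra(c, rho):
    """Iso-elastic one-period utility (1.3). The rho = 1 branch is log(c),
    the limit of (c**(1 - rho) - 1) / (1 - rho) as rho -> 1."""
    if rho == 1.0:
        return math.log(c)
    return (c ** (1.0 - rho) - 1.0) / (1.0 - rho)
```

Evaluating `crra` at values of rho slightly away from 1 confirms the continuity of the specification at the log case.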
1.1.2. Steady states

With specification (1.4), the marginal rate of substitution between consumption and work is:

mrs = \frac{(1-\alpha)\, \psi'(h)\, c}{\alpha\, \psi(h)},   (1.5)

and hence is linear in consumption. Suppose that there is geometric growth in the technology process {d_t}. Given the Cobb-Douglas production function (1.2), a steady state exists in which hours worked, the consumption-capital ratio and the implied firm
2 Some consumption models allow for many goods. For example, many dynamic international trade models follow the tradition of static models and allow that "traded" and "non-traded" goods enter the utility function differently; see, for example, Backus, Kehoe and Kydland (1995) and Stockman and Tesar (1995). 3 For some dynamic equilibrium models, consumption and labor supply are composites. For example, Kydland and Prescott (1982) have preferences defined over a weighted sum of current and lagged labor supply and Eichenbaum and Hansen (1990) and Hornstein and Praschnik (1994) define consumption as a CES aggregator of the flow of services from durables and non-durables. Auerbach and Kotlikoff (1987) use a CES version of Equation (1.4) in their overlapping generations model.
expenditure share on labor costs are constant. Steady-state calibration proceeds as follows. Steady states and steady-state ratios are measured by time series averages. The production function parameter \theta is pinned down by labor's share of output, and the depreciation factor for capital from the steady-state investment-capital ratio. For a given \psi, say \psi(h) = 1 - h, the parameter \alpha may be solved out by equating minus the marginal disutility of work (1.5) with the marginal product of labor. This yields

\frac{c + i}{c} = -\frac{(1-\alpha)\, \psi'(h)\, h}{(1-\theta)\, \alpha\, \psi(h)},
where i is steady-state investment 4. An important question this theory has to confront is whether the functional forms for preferences over consumption and labor supply used to interpret aggregate time series data as steady states are compatible with microeconomic evidence on the functional form of preferences. The time series evidence measures the fraction of available time an "average" worker spends in market work. The claim in the real-business cycle literature is that per capita leisure has remained relatively constant in the post-war period while real wages have been rising at the same rate as output. However, this stability in average hours worked per capita masks divergent trends for males and females. A central finding from the empirical micro literature is that the time series of labor supply behavior for men and for women are different and neither is close to a stationary time series [see Pencavel (1986) and Killingsworth and Heckman (1986)]. If preference parameters are to be based on microeconomic evidence, two questions have to be answered. First, do the functional forms found in the micro literature produce growth steady states? Second, given the changes in the composition of the labor force, whose labor elasticities should be used in calibrating a macroeconomic model? The answers to these questions rely in part on the relative quality of the aggregate time series and the microeconomic evidence. Durlauf and Quah (1999) raise serious doubts about the validity of the steady-state approximation as an accurate factual description of modern economies. Note further that the functional form restrictions required for the conjectured steady states apply to a fictitious composite household model of consumption and leisure. In practice the microeconomic evidence is extracted separately for men and women using preference specifications that are
4 Consideration of household production and the possibility of substituting work time in the home for expenditures on goods lead authors such as Benhabib, Rogerson and Wright (1991), Greenwood and Hercowitz (1991) and Greenwood, Rogerson and Wright (1995) to allow for derived utility functions over consumption and market hours that are somewhat more general than the class of models considered here. Their home production specification introduces technological progress into the "reduced-form" depiction of the preferences for consumption and labor supply and loosens the restrictions needed for the existence of a steady state of the desired form. See Eichenbaum and Hansen (1990) for a similar development for modeling the preferences for durable and non-durable consumption goods.
outside the form given in Equation (1.4). For example, MaCurdy (1983) reports that a specification of male labor supply with

U(c, h) = \frac{1}{1-\rho} \left[ \frac{c^{1-a_c}}{1-a_c} - \frac{h^{1+a_h}}{1+a_h} \right]^{1-\rho}

is consistent with monthly male labor supply data from the US where \rho = 0.14, a_c = 0.66 and a_h = 0.16, and these parameters are precisely estimated. Note in particular that a_c \neq 1. The marginal rate of substitution between consumption and work is:

mrs = -\frac{h^{a_h}}{c^{-a_c}},

and this empirical specification is not consistent with steady-state growth because a_c \neq 1. It is, however, consistent with the well-known observation that male hours of work per capita have declined over time.
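The steady-state calibration steps described above can be sketched numerically. Under the specification \psi(h) = 1 - h (so \psi'(h) = -1), the condition equating minus the marginal disutility of work with the marginal product of labor can be solved for \alpha in closed form. All input values below are illustrative placeholders, not estimates:

```python
def calibrate_alpha(h, c_over_y, theta):
    """Solve the steady-state condition (with psi(h) = 1 - h):
        (c + i)/c = (1 - alpha) * h / ((1 - theta) * alpha * (1 - h))
    for alpha, given average hours h, the consumption-output ratio c/y,
    and capital's share theta.  Rearranging gives alpha = A / (R + A)
    with R = y/c and A = h / ((1 - theta) * (1 - h))."""
    R = 1.0 / c_over_y                    # (c + i)/c equals y/c in steady state
    A = h / ((1.0 - theta) * (1.0 - h))
    return A / (R + A)

theta = 1.0 - 0.64                        # capital share implied by labor's share
alpha = calibrate_alpha(h=1.0 / 3.0, c_over_y=0.75, theta=theta)
```

Plugging the solved alpha back into the steady-state condition verifies the closed form.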
1.1.3. Micro evidence
Our more detailed discussion of the microeconomic evidence presented in Section 3 establishes the following additional empirical conclusions:
• Most of the responsiveness of labor supply with respect to wage changes is due to entry into and exit from employment; yet most of the micro evidence on intertemporal labor supply elasticities is presented for continuously working, continuously married prime-age males - the demographic group least responsive to wage changes, especially at the extensive margin 5.
• There is evidence that consumption is complementary with male labor supply, while the evidence is mixed on the interaction between consumption and female labor supply. At present there are no reliable estimates of this interaction. Yet the difference between male and female labor supply highlights the problem of pooling the labor supply of diverse groups into one aggregate.
5 Rogerson (1988) designed an aggregate model of labor supply that focuses exclusively on the extensive margin. Individuals are allocated randomly to jobs that require a fixed number of hours. The number of jobs fluctuates over time but not the number of hours per job. Hansen (1985) adapted this framework to the Brock-Mirman stochastic growth model. While these models successfully emphasize the extensive margin, they are not well suited to capture differential labor supply responses between men and women. We discuss this model further in Section 3.
• The elasticity of intertemporal substitution (eis = -1/\rho) as determined from consumption is usually poorly determined. If constancy across the population is imposed on this elasticity, then there is no strong evidence against the view that this elasticity is slightly above minus one. There is, however, evidence that the eis varies both with observable demographics and with the level of wealth, so that the homothetic iso-elastic form is rejected and an assumption that the eis is minus one for all demographic groups is not consistent with the evidence. The same evidence suggests that low wealth households are relatively more averse to consumption fluctuations than are high wealth households.
• For leisure, the elasticity of intertemporal substitution is between 0.1 and 0.4 for annual hours for men and 1.61 for women. There is evidence that these elasticities are larger for shorter units within a year. Because these labor supply elasticities ignore the entry and exit decision, they provide only part of the information needed to construct the aggregate labor supply curve.

1.2. Multiple agents
Heterogeneity in preferences, discount rates, and risk aversion parameters is found in numerous micro studies. As a step towards achieving a better match between microeconomic evidence and dynamic stochastic economics, it is fruitful to explore macro general equilibrium models with explicit heterogeneity. Such models are of considerable interest in their own right and often produce rather different outcomes than their single consumer counterparts. Adding heterogeneity enriches the economic content of macro models, and calls into question current practices for obtaining parameter estimates used in general equilibrium models. We start with a very simple specification. Consider a large population with J types of agents indexed by j. We abstract from labor supply as in the Brock-Mirman (1972) stochastic growth model and we also ignore human capital accumulation. Instead we suppose initially that labor is supplied inelastically. Following Aiyagari (1994), we adopt the simple version of the Brock-Mirman model in which individual agents confront stochastic productivity shocks y_{j,t} to their labor supply. This scheme produces idiosyncratic shocks in labor income in spite of the presence of a common wage (per unit productivity) and leads to a simple analysis. Later on, we explore complications caused by the addition of the labor supply decision.

1.2.1. Preferences
We follow the common practice of using preferences with a constant elasticity of intertemporal substitution but we let the eis and the subjective rate of time discount differ among individuals:

E \sum_{t=0}^{\infty} (\beta_j)^t \, \frac{(c_{j,t})^{1-\rho_j} - 1}{1-\rho_j},   (1.6)
where (\beta_j, \rho_j) differ by consumer type. The evidence discussed both here and in Section 3 documents that such heterogeneity is empirically important.

1.2.2. Labor income

Assume that {x_t} is a Markov process governing aggregate shocks and that the productivity for type j at time t+1, y_{j,t+1}, is a component of a person-specific state vector s_{j,t+1}. The probability distribution of s_{j,t+1} given current period state vectors s_{j,t} and x_t is denoted by F_j(\cdot \mid s_{j,t}, x_t). The income of person j at time t is w_t y_{j,t}, where w_t is the endogenously determined wage rate at time t. Aggregate or average labor supply corrected for efficiency is given by

h_t = \frac{1}{J} \sum_{j=1}^{J} y_{j,t},

where J is the number of agents in the economy and the individual labor supply is normalized to be unity. In equilibrium, wages satisfy the aggregate marginal product condition

w_t = (1 - \theta)\, d_t (k_{t-1}/h_t)^\theta.
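These two objects can be sketched numerically; the productivity draws and technology state below are arbitrary illustrative values:

```python
import numpy as np

theta, d, k_prev = 0.36, 1.0, 3.0      # illustrative technology state (not calibrated)
y = np.array([0.7, 1.0, 1.3])          # productivities y_{j,t} for J = 3 agent types

h = y.mean()                           # aggregate labor supply in efficiency units
w = (1.0 - theta) * d * (k_prev / h) ** theta   # marginal-product wage condition
income = w * y                         # person j earns w_t * y_{j,t}
```

Note that incomes are idiosyncratic even though the wage per efficiency unit is common to all agents.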
1.2.3. Market structure

Depending on the institutional environment confronting agents, different interactions among them may emerge. Market interactions can be limited by informational constraints and the ability to commit to contractual arrangements. Here, we consider a variety of market contexts and their implications for behavior and calibration. We initially explore a complete market model as a benchmark. Suppose consumers trade in a rich array of security markets. We introduce a common sequence of information sets and use I_t to denote information available at date t. Consumers can make state-contingent contracts conditioned on information available in subsequent time periods. Given the ability to make such contracts, we obtain a large array of equilibrium security market prices. Moreover, in studying consumption allocations, we may simplify the analysis by exploiting the implications of Pareto efficiency. Although our interest is in economies with heterogeneous consumers, we begin our exposition of Pareto efficient models by first considering agents with homogeneous preferences but heterogeneous endowments.

1.2.4. Preference homogeneity

Suppose initially that consumers have common preferences (\beta, \rho). Endowments may differ; in this case, preferences aggregate in the sense of Gorman (1953). At a
mechanical level, this can be checked as follows. The intertemporal marginal rates of substitution are equated, so:

m_{t+1,t} = \beta \left( \frac{c_{j,t+1}}{c_{j,t}} \right)^{-\rho} \quad \text{for all } j,

thus

c_{j,t+1} = \left( \frac{m_{t+1,t}}{\beta} \right)^{-1/\rho} c_{j,t}.

Averaging over the consumption of each type, we find that

\left( \frac{m_{t+1,t}}{\beta} \right)^{-1/\rho} c_{a,t} = c_{a,t+1},

where c_{a,t} denotes consumption averaged across types. We may solve this equation to obtain an alternative expression for the common marginal rate of substitution:

m_{t+1,t} = \beta \left( \frac{c_{a,t+1}}{c_{a,t}} \right)^{-\rho}.

This result is due to Rubinstein (1974). Under the stated conditions, there exists an aggregate based on preferences that look like the common individual counterparts. An alternative way to view this economy is as an example of Wilson's (1968) theory of syndicates. With the marginal rates of substitution equated across consumers, we are led to a solution whereby individual consumption is a constant fraction of the aggregate over time:

c_{j,t} = \kappa_j c_{a,t},
or equivalently that the allocation risk-sharing rules are linear 6. Armed with this result, the general equilibrium of this model can be computed as follows. Given the aggregate endowment process and the capital accumulation process, we may solve for the optimal aggregate consumption process. This may be thought of as a special case of a Brock-Mirman style stochastic growth model in which the fictitious consumer has preferences that coincide with those of the individual identical agents. The solution to this problem yields the equilibrium processes for aggregate consumption, aggregate investment and the aggregate capital stock 7. Notice that we can compute the aggregate quantities without simultaneously solving for the equilibrium prices. It is not necessary to ascertain how wealth is allocated across consumers because we can construct well defined aggregates 8.
6 The reference to this as a risk-sharing rule is misleading. Consider economies in which endowments of individuals oscillate in a deterministic way, but the aggregate endowment is constant. Then consumption allocations will be constant as implied by the linear allocation rule, but there is no risk.
7 See also Lucas and Prescott (1971).
8 The simplicity here is overstated in one important respect. In characterizing the aggregate endowment behavior, we either must appeal to a cross-sectional version of the Law of Large Numbers, or we must keep track of the idiosyncratic state variables needed to forecast individual endowments.
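The linear sharing rule can be checked numerically: if individual consumptions are fixed fractions \kappa_j of the aggregate, every agent's intertemporal marginal rate of substitution coincides with the one computed from aggregate consumption. The consumption path below is simulated for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
beta, rho = 0.96, 2.0                   # common preferences (illustrative values)
kappa = np.array([0.2, 0.3, 0.5])       # sharing fractions, summing to one

c_a = np.cumprod(1.0 + 0.02 * rng.standard_normal(10))  # aggregate consumption path
c_j = np.outer(kappa, c_a)              # linear rule: c_{j,t} = kappa_j * c_{a,t}

m_j = beta * (c_j[:, 1:] / c_j[:, :-1]) ** (-rho)       # individual MRS processes
m_a = beta * (c_a[1:] / c_a[:-1]) ** (-rho)             # aggregate-based MRS
```

Because the \kappa_j cancel in consumption ratios, each row of `m_j` equals `m_a`, which is the Rubinstein (1974) aggregation result in miniature.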
Given our assumption of homothetic preferences, the equilibrium allocation of consumption assigns a constant (over time and across states) fraction of aggregate consumption to each person. Each consumer is endowed with an initial asset stock along with his or her process for consumption endowments. To determine the equilibrium allocation of consumption across people we must solve for the equilibrium valuation of the consumption endowments. With this valuation in hand, the individual consumption assignments are readily deduced from the intertemporal budget constraint. Then we can consider the equilibrium pricing of state-contingent claims to consumption. Following Rubinstein (1974) and Hansen and Singleton (1982), pricing implications for this economy may be obtained by using the equilibrium consumption vector and forming the equilibrium intertemporal marginal rates of substitution: the equilibrium versions of {m_{t+1,t}}. From this process we can construct the pricing operator \mathcal{P}_t. Let z_{t+1} represent a claim to consumption at time t + 1. For instance, z_{t+1} may be the payoff (in terms of time t + 1 consumption) to holding a security between dates t and t + 1. For securities with longer maturities than one time period, we can interpret z_{t+1} as the liquidation value of the security (the dividend at time t + 1 plus the price of selling the security at t + 1). The consumption claim z_{t+1} may depend on information that is only observed at date t + 1, and hence is a random variable in the information set I_{t+1}. The equilibrium restrictions of our model imply that the price at time t can be expressed as

\mathcal{P}_t(z_{t+1}) = E(m_{t+1,t}\, z_{t+1} \mid I_t),   (1.7)
where \mathcal{P}_t(z_{t+1}) is the date t price quoted in terms of date t consumption. Thus the pricing operator \mathcal{P}_t assigns time t equilibrium prices to these contingent consumption claims in a linear manner 9. The intertemporal marginal rate of substitution, m_{t+1,t}, acts like a state-contingent discount factor. Since it is stochastic, in addition to discounting the future, it captures risk adjustments for securities with uncertain payouts. For this economy we may extract preference parameters using Euler equation estimation. Let Z_{t+1} denote a vector of one-period (gross) payoffs and Q_t the corresponding price vector. The preference parameter vector (\beta, \rho) can be identified from the unconditional moment restriction:

E\left[ \beta \left( \frac{c_{a,t+1}}{c_{a,t}} \right)^{-\rho} Z_{t+1} - Q_t \right] = 0.   (1.8)
9 We are being deliberately vague about the domain of this operator. Since m_{t+1,t} is positive, any non-negative payoff can be assigned an unambiguous but possibly infinite value. To rule out value ambiguities that arise when the positive part of the payoff has value +\infty and the negative part -\infty, additional restrictions must be imposed on payoffs that depend on properties of the candidate discount factors.
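To make the Euler-equation identification strategy concrete, the sketch below constructs synthetic payoff data priced exactly by a known discount factor m_{t+1,t} = \beta (c_{t+1}/c_t)^{-\rho}, then recovers (\beta, \rho) by minimizing a quadratic form in the sample moments over a parameter grid. Everything here (the data-generating process, the two payoffs, the grid) is an illustrative construction, not the chapter's procedure; applied work uses GMM with instruments in the spirit of Hansen and Singleton (1982):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500
growth = np.exp(0.2 * rng.standard_normal(T))   # gross consumption growth (stylized)

beta0, rho0 = 0.95, 2.0                         # "true" parameters to recover
m0 = beta0 * growth ** (-rho0)                  # stochastic discount factor
z = np.stack([np.ones(T), growth])              # two payoffs: a bond and a growth claim
q = m0 * z                                      # prices satisfying E[m z - q] = 0 by construction

def J(beta, rho):
    """Quadratic form in the sample analogue of the moment condition."""
    g = np.mean(beta * growth ** (-rho) * z - q, axis=1)
    return g @ g

betas = np.linspace(0.90, 1.00, 101)
rhos = np.linspace(0.5, 4.0, 71)
obj = np.array([[J(b, r) for r in rhos] for b in betas])
i, k = np.unravel_index(obj.argmin(), obj.shape)
beta_hat, rho_hat = betas[i], rhos[k]           # recovers (beta0, rho0) up to grid resolution
```

With exactly identified moments and noiseless pricing, the objective is (numerically) zero at the true parameters, so the grid minimizer sits on top of them.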
The asset payoffs may consist of multiple security market returns, or they may be synthetic payoffs constructed by an econometrician 10. We obtain the same preference parameters when the model is estimated using aggregate or individual data on consumption. In principle, this provides a way to test the assumed preference specification given the market environment if there is access to micro data. As noted by Rubinstein (1974), Hansen (1987), and in effect, Gorman (1953), this result is special, even under preference homogeneity. It is not applicable to an arbitrary concave increasing one-period utility function unless the consumers have the same initial wealth. Given knowledge of the technology parameters, the stochastic process for aggregate capital stocks may be deduced as in Brock and Mirman (1972) by endowing the representative agent with the preference parameters and labor supply h_t. From this solution we may solve for equilibrium aggregate consumption, equilibrium stochastic discount factors (from intertemporal marginal rates of substitution), equilibrium wages (from marginal products), the initial wealth distribution and hence the sharing parameters (the \kappa_j).

1.2.5. Risk aversion or intertemporal substitution?
By estimating Euler equations, econometricians may identify preference parameters without having to solve the decision problem confronting individual agents. In particular, parameters can be identified without any need to measure microeconomic uncertainty or wealth. Of course, if preferences are misspecified by an econometrician, estimated Euler equations will not recover the true preference parameters of individual agents. We now consider a misspecification of particular substantive interest. The parameter \rho is associated with two conceptually distinct aspects of preferences: risk aversion and intertemporal substitution along certain paths. The link between attitudes towards risk and intertemporal fluctuations in consumption over time is indissolubly embedded in models that simultaneously assume separability over time and over states [see Gorman (1968)] 11. It is the latter type of separability that is the key assumption in expected utility models. Hall (1988) and Epstein and Zin (1989, 1991) argue that it is fruitful to disentangle attitudes toward risk from intertemporal substitution as they correspond to two different aspects of consumer behavior. Concern about intertemporal substitution comes into play even in economies with deterministic movements in technologies. These considerations led Epstein and Zin (1989) to use
10 Synthetic payoffs and prices are constructed as follows. Multiply a single gross return, say 1 + r_{t+1}, by instrumental variables in the conditioning information set I_t. Since this conditioning information is available to the consumer at purchase date t, the price of the scaled return is given by the instrumental variable used in the scaling. By creating enough synthetic securities, unconditional moment condition (1.8) can be made to imitate the conditional pricing relation [see Hansen and Singleton (1982) and Hansen and Richard (1987)].
11 The parameter \rho also governs precautionary savings or prudence [see Kimball (1990)]. This latter link is readily broken by adopting a more flexible parameterization of expected utility.
a recursive utility formulation due to Kreps and Porteus (1978) in which preferences are represented using "continuation utility" indices, which measure the current period value of a consumption plan from the current period forward. The continuation utility index, V_{j,t}, for person j is obtained by iterating on the recursion:

V_{j,t} = \left[ (c_{j,t})^{1-\rho} + \beta \left\{ \mathcal{R}_t(V_{j,t+1}) \right\}^{1-\rho} \right]^{1/(1-\rho)},   (1.9)

where \mathcal{R}_t makes a risk adjustment on tomorrow's utility index:

\mathcal{R}_t(V_{j,t+1}) = \left\{ E\left[ (V_{j,t+1})^{1-\alpha} \mid I_t \right] \right\}^{1/(1-\alpha)}.   (1.10)
Observe that the utility index today is homogeneous of degree one in current and future (state-contingent) consumption. This specification of preferences nests our specification (1.6) with a common \rho provided that \alpha = \rho. By allowing \alpha to be distinct from \rho we break the connection between risk aversion and intertemporal substitution. The parameter \alpha is irrelevant in an environment without uncertainty, but the intertemporal elasticity of substitution (-1/\rho) remains important. The parameter \alpha makes an incremental risk adjustment, which is absent from the standard von Neumann-Morgenstern formulation. Some of our previous analysis carries over directly to this recursive utility formulation. The efficient allocation for individual consumptions and individual utility indices satisfies
c_{j,t} = \kappa_j c_{a,t}, \qquad V_{j,t} = \kappa_j V_{a,t}

for some numbers \kappa_j, where V_{a,t} is constructed using the process for the representative consumer {c_{a,t+k} : k = 0, 1, \ldots} in place of {c_{j,t+k} : k = 0, 1, \ldots}. With this modification, the procedure we have previously described for solving out Brock-Mirman economies applies for preferences of the type (1.9). The intertemporal marginal rates of substitution are, however, altered, and this complicates the construction of the one-period stochastic discount factors. With the Kreps-Porteus utility recursion,

m_{t+1,t} = \beta \left( \frac{c_{a,t+1}}{c_{a,t}} \right)^{-\rho} \left[ \frac{V_{a,t+1}}{\mathcal{R}_t(V_{a,t+1})} \right]^{\rho - \alpha},
which now depends on the continuation utility V_{a,t+1}. The same formula works if individual consumptions and continuation utilities are used in place of the aggregates. Micro- or macroeconomic estimation procedures based on Euler equations that erroneously assume that \alpha = \rho generally fail to produce a usable estimate of either \alpha or \rho unless they are equal. Even if a risk-free asset is used in estimating the Euler equation (1.8), the intertemporal substitution parameter will not be identified. The
presence of the risk aversion parameter \alpha (\neq \rho) will alter the Euler equation if there is any uncertainty affecting the consumption decisions of the individual [see Epstein and Zin (1991)]. For this case, a correctly specified Euler equation contains the continuation utility (V_{j,t+1}) and its conditional moment 12. If the risk adjustment is logarithmic (\alpha = 1), then a logarithmic version of the Euler equation will recover \rho provided the return on the wealth portfolio is used as the asset return instead of the risk-free return [see Equation 18 in Epstein and Zin (1991)] 13. Since continuation utilities now enter correctly specified Euler equations, one way to modify the unconditional moment restrictions used in estimation is to solve recursion (1.9) for values of the preference parameters. This solution requires knowledge of the equilibrium consumption process for either individuals or the aggregate. Thus it is no longer possible to separate the estimation of preference parameters from the estimation of the other features of the model, as is conventional in the standard Euler equation approach. In particular, it is necessary to specify the underlying uncertainty individuals confront. Given that this explicit computation of the full model is required, there are other, more direct, approaches to estimation than plugging solved continuation utilities into Euler equations in a two-stage procedure 14. Barsky, Juster, Kimball and Shapiro (1997) pursue an alternative way of measuring risk preferences (independent of intertemporal substitution) that is based on confronting consumers with hypothetical employment gambles. We discuss their work below.
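Solving recursion (1.9)-(1.10) given a consumption process is straightforward in simple settings. With i.i.d. gross consumption growth drawn from a finite distribution and homogeneity of degree one, guessing V_t = v * c_t reduces (1.9) to a one-dimensional fixed-point problem in v. All parameter values below are illustrative assumptions:

```python
import numpy as np

beta, rho, alpha = 0.96, 0.5, 5.0     # alpha != rho separates risk aversion from substitution
g = np.array([0.97, 1.00, 1.03])      # i.i.d. gross consumption-growth states
p = np.array([0.25, 0.50, 0.25])      # state probabilities

def R(v_next):
    """Risk adjustment (1.10): R(V) = (E[V**(1 - alpha)])**(1 / (1 - alpha))."""
    return (p @ v_next ** (1.0 - alpha)) ** (1.0 / (1.0 - alpha))

# Guess V_t = v * c_t; next period's index is v * g * c_t, so (1.9) becomes
# v = (1 + beta * R(v * g)**(1 - rho))**(1 / (1 - rho)).  Iterate to the fixed point:
v = 1.0
for _ in range(2000):
    v = (1.0 + beta * R(v * g) ** (1.0 - rho)) ** (1.0 / (1.0 - rho))
```

Because R is homogeneous of degree one, the fixed point has the closed form v**(1 - rho) = 1 / (1 - beta * R(g)**(1 - rho)), which the iteration reproduces; changing alpha with rho fixed moves v only through the risk adjustment, and with deterministic growth alpha is irrelevant, as the text notes.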
1.2.6. Preference heterogeneity

Even when we restrict ourselves to state-separable power-utility functions, once we allow for heterogeneity in preferences we must modify our aggregation theorem and our method for solving the general equilibrium model. We continue to impose the complete market structure, but drop back to the simple additively-separable preference
12 Epstein and Zin (1991) present a clever solution to this problem whereby they derive an alternative Euler equation that depends instead on the one-period return on a hypothetical wealth portfolio. In practice it is difficult to construct a reliable proxy that is compatible with the observed consumption data. In their Euler equation estimation using aggregate data, Epstein and Zin (1991) used the value-weighted return on the New York Stock Exchange, but this proxy covers only a component of wealth in the actual economy.
13 When the risk adjustment is made using the negative exponential counterpart to Equation (1.10), Euler equation (1.8) continues to apply but with an endogenously determined, distorted conditional expectation operator. This risk adjustment is equivalent to inducing a specific form of pessimism. See Hansen, Sargent and Tallarini (1999) for a discussion of this point.
14 An interesting question is what security data are needed to identify the risk adjustment in the utility index. Epstein and Melino (1995) address this question without imposing parametric restrictions on the risk adjustment. Not surprisingly, the degree of identification depends on the richness of the security market returns used in the investigation. When investors have access to more security markets and the resulting returns are observed by an econometrician, the range of admissible risk adjustments shrinks. Hansen, Sargent and Tallarini (1999) illustrate this point in the parametric context of a permanent income model with an exponential risk adjustment.
specification. It is again convenient to pose the equilibrium problem as an optimal resource allocation problem for the purpose of computing equilibrium quantities. To accomplish this in a world of heterogeneous preferences we use a method devised by Negishi (1960) and refined by Constantinides (1982), Lucas and Stokey (1984) and others. Using standard Pareto efficiency arguments, and assuming interior solutions, consumers equate their marginal rates of substitution:

(β_j)^t (c_{j,t} / c_{j,0})^{−ρ_j} = (β_1)^t (c_{1,t} / c_{1,0})^{−ρ_1}.

For each individual j, we assign a time-t Pareto weight ω_{j,t} with a deterministic equation of evolution:

ω_{j,t} = (β_j / β_1) ω_{j,t−1}.    (1.11)

Equating marginal rates of substitution we obtain ω_{j,t} (c_{j,t})^{−ρ_j} = ω_{1,t} (c_{1,t})^{−ρ_1}. We may thus characterize Pareto efficient allocations by combining evolution equation (1.11) with the solution to the static deterministic optimization problem:

max over c_1, c_2, ..., c_J of  Σ_{j=1}^J ω_j [(c_j)^{1−ρ_j} − 1] / (1 − ρ_j)   subject to   Σ_{j=1}^J c_j = c.    (1.12)
The solution to problem (1.12) is obtained from the following argument. Let μ denote the common marginal utility across individuals:

ω_j (c_j)^{−ρ_j} = μ.

Then

c_j = (ω_j / μ)^{1/ρ_j}.    (1.13)

By first averaging this expression over individuals, we may compute μ by solving the non-linear equation

Σ_{j=1}^J (ω_j / μ)^{1/ρ_j} = c.
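Numerically, this argument reduces to one-dimensional root finding: each candidate value of the common marginal utility μ implies individual demands (ω_j/μ)^{1/ρ_j}, which are summed and compared to aggregate consumption. A minimal sketch, with illustrative weights and curvature parameters that are assumptions rather than values taken from the chapter:

```python
import numpy as np

def allocate(c_agg, omega, rho):
    """Split aggregate consumption c_agg across agents with Pareto weights
    omega_j and curvatures rho_j by solving omega_j c_j^(-rho_j) = mu
    together with sum_j c_j = c_agg.  Total demand
    sum_j (omega_j/mu)^(1/rho_j) is strictly decreasing in mu, so
    bisection on mu works."""
    omega = np.asarray(omega, dtype=float)
    rho = np.asarray(rho, dtype=float)
    demand = lambda mu: np.sum((omega / mu) ** (1.0 / rho))
    lo, hi = 1e-12, 1e12              # bracket for the common marginal utility
    for _ in range(200):
        mu = 0.5 * (lo + hi)
        if demand(mu) > c_agg:
            lo = mu                   # demand too high: raise marginal utility
        else:
            hi = mu
    mu = 0.5 * (lo + hi)
    return (omega / mu) ** (1.0 / rho)

# Equal Pareto weights; agent 1 logarithmic (rho = 1), agent 2 has rho = 2.
shares = {c: allocate(c, [0.5, 0.5], [1.0, 2.0])[1] / c for c in (1.0, 4.0, 10.0)}
```

With these assumed values, the computed share of the higher-ρ agent falls as aggregate consumption grows, the non-linearity in the allocation rule that the chapter emphasizes below.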
Plugging the solution for the common marginal utility μ back into Equation (1.13), we obtain the allocation equations:

c_j = φ_j(c; ω),   j = 1, ..., J,

where ω denotes the vector of Pareto weights. The allocation rules φ_j are increasing and must average out to unity. Substituting these rules back into the original objective function we construct a utility function for aggregate consumption:

U(c; ω) = Σ_{j=1}^J ω_j [φ_j(c; ω)^{1−ρ_j} − 1] / (1 − ρ_j).

It is straightforward to verify that U(c; ω) is concave and strictly increasing. By the Envelope Theorem,

∂U(c; ω)/∂c = ω_j [φ_j(c; ω)]^{−ρ_j} = μ,    (1.14)

which is the common marginal utility. The "mongrel" function U will generally depend on the Pareto weights in a non-trivial manner. We may use this constructed utility function to deduce optimal allocations in the following way. Given any admissible initial ω_0, solve the optimal resource allocation problem using the preference ordering induced by the von Neumann–Morgenstern "mongrel" utility function:

E Σ_{t=0}^∞ (β_1)^t U(c_t; ω_t),
subject to the equation of motion (1.1) and the evolution equation (1.11) for the vector of Pareto weights. If resources are allocated efficiently, we obtain an alternative (to Gorman) justification of the representative consumer model. This justification carries over to the derived pricing relations as well. Prices may be deduced from the marginal rates of substitution implied by the mongrel preference ordering. This follows directly from the link between individual and aggregate marginal rates of substitution given in Equation (1.14). This construction justifies a two-step method for computing efficient allocations of resources when preferences are heterogeneous. In the first step we compute a mongrel utility function for a fictitious representative consumer along with a set of allocation rules from a static and deterministic resource allocation problem. The mongrel utility function may be used to deduce equilibrium aggregate consumption and investment rules and equilibrium prices. The static allocation rules may be used for each state and date to allocate aggregate consumption among the individual consumers. These computations are repeated for each admissible choice of Pareto weights. To
compute a particular general equilibrium, Pareto weights must be found that satisfy the intertemporal budget constraints of the consumers with equality. This economy has the following observable implications. First, under discount factor homogeneity, the procedure just described can be taken as a justification for using a representative agent model to study aggregate consumption, investment and prices. Microeconomic data are not required to calibrate the model provided that the mongrel preferences used to compute the general equilibrium are not used for welfare analyses. Second, one can use microeconomic and macroeconomic data together to test whether individual consumption data behave in a manner consistent with this model. If discount factors are heterogeneous, and if this economy runs for a long period of time, in the long run the consumer with the largest discount factor essentially does all of the consuming 15. This follows directly from Equation (1.11). Thus it is the discount factor of the (eventually) wealthiest consumer that should be of interest to the calibrator of a representative agent model, provided that the aim is to confront aggregate time series data and security market data. Since in the US economy 52% of the wealth is held by 5% of the households, and there is evidence, discussed below, that wealthy families have lower discount rates, this observation is empirically relevant. A research synthesis that uses a median or trimmed mean discount factor estimated from micro data to explain long-run aggregate time series data would be seriously flawed. This raises the potential problem that the estimated extreme value may be a consequence of sampling error or measurement error instead of genuine preference heterogeneity.
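The long-run dominance of the most patient consumer can be seen directly from recursion (1.11): relative Pareto weights scale as (β_j/β_1)^t, so with logarithmic preferences, where consumption shares equal normalized weights, the share of the highest-β agent converges to one. A small sketch with hypothetical discount factors:

```python
import numpy as np

# Two log-utility agents (rho = 1), so consumption shares equal normalized
# Pareto weights.  Recursion (1.11) scales agent j's weight by beta_j each
# period; the discount factors below are hypothetical, for illustration only.
beta = np.array([0.95, 0.97])
omega0 = np.array([0.5, 0.5])

def share_of_patient(t):
    """Consumption share of the beta = 0.97 agent after t periods."""
    w = omega0 * beta ** t        # omega_{j,t} up to a common scale factor
    return float(w[1] / w.sum())

trajectory = [share_of_patient(t) for t in (0, 50, 200, 1000)]
```

The share starts at one half and rises monotonically toward one: in the long run the most patient consumer does essentially all of the consuming, which is why a median or trimmed-mean discount factor is the wrong calibration target for long-run aggregates.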
If the aim is to evaluate the impact of macroeconomic policies on the welfare of the person with the median discount rate, then the welfare evaluation should be performed outside the representative consumer framework used for matching aggregate time series data. To be accurate, it would have to recognize the diversity of subjective discount rates in the population. With discount factor homogeneity, individual consumption will be a time-invariant function of aggregate consumption. This occurs because evolution equation (1.11) implies that the Pareto weights are invariant over time. In spite of this invariance, if there is heterogeneity in intertemporal substitution elasticities in an economy with growth, it may still be the case that one type of consumer eventually does most of the consuming because of non-linearity in the allocation rule. To demonstrate this we follow Dumas (1989) and suppose that we have two types of consumers, both facing a common discount factor but with differing curvature parameters ρ_1 and ρ_2. The resulting allocations are

15 Lucas and Stokey (1984) use this as a criticism of models with discount factors that are independent of consumption.

Fig. 1.1. Fraction of aggregate consumption assigned to agent 2, plotted against aggregate consumption for ρ_2 = 1 (solid curve), ρ_2 = 1.5 (dash-dotted curve), and ρ_2 = 2 (dashed curve). In all cases ρ_1 = 1. Pareto weights are ½ for both agents.

depicted in Figures 1.1 and 1.2 for a Dumas-style economy. With all consumers facing a common β, to explain the data on aggregate quantities and security market prices it is again the preferences of the eventually wealthiest consumer that matter, and not the preferences of the average or median person. When we add labor supply into this setup, the lessons for picking parameters are different. Suppose for simplicity that individual preferences are additively separable between consumption and labor supply, so that for an individual of type i:

U_i(c, h) = α_i [c^{1−ρ_i} − 1] / (1 − ρ_i) + (1 − α_i) [ψ_i(h)^{1−ρ_i} − 1] / (1 − ρ_i).

Let the preference parameter for the first type satisfy ρ_1 = 1 and assume ρ_1 < ρ_2. Consider a social planner allocating consumption and hours across individuals in a Pareto efficient manner. Provided that hours can be converted into an efficiency-units standard and the hours allocations for each individual are away from corners (assumptions we question in Sections 2 and 3), we can derive an aggregation result for the disutility of aggregate hours using the same techniques just described for aggregate consumption. While person 1 eventually does most of the consuming in the
Fig. 1.2. Consumption assigned to each agent when aggregate consumption grows over time. Pareto weights are 1 for both agents; the growth rate is 3% per time period. Top panel: both agents have ρ = 1; middle panel: ρ = 1 (solid) and ρ = 1.5 (dotted); bottom panel: ρ = 1 (solid) and ρ = 2 (dotted). Solid curve: agent 1; dotted curve: agent 2. The dotted and solid curves coincide in the top panel.
economy, the leisure preferences of person 2 will figure prominently in the aggregate (mongrel) preference ordering. Thus to construct the mongrel representative agent for this economy requires that the analyst use the intertemporal consumption elasticity of the rich person, but an intertemporal elasticity for labor supply that recognizes both agent types. In the presence of heterogeneity in preferences, it will sometimes be necessary to apply different weighting schemes across the population for consumption elasticities than for labor supply elasticities in order to construct an aggregate that fits the data 16. Tests of the complete-market model usually focus on linear allocation rules, whereas accounting for preference heterogeneity leads one to recognize important
16 In a somewhat different setting, Kihlstrom and Laffont (1979) use preference heterogeneity to build a model in which more risk averse individuals become workers and less risk averse individuals become entrepreneurs.
non-linearities in allocation rules 17. In models with endowment uncertainty but heterogeneity in discount rates and intertemporal elasticities of substitution, the efficient allocation makes individual consumption a deterministic function of aggregate consumption alone. Even this implication can be altered, however, if von Neumann–Morgenstern preferences are replaced with a more general recursive formulation in which [as in Epstein and Zin (1991)] future utilities are adjusted for risk. In this case the evolution of the Pareto weights is stochastic [see Dumas, Uppal and Wang (1997) and Anderson (1998)]. As a consequence, allocation rules depend not only on current aggregate consumption, but also on the past history of the aggregate state variables (the capital stock and x_t). The deterministic relationship between individual consumption and aggregate allocations will also be altered if time non-separabilities are introduced in the form of habit persistence or durability in consumption goods. In these cases, past histories of consumption also enter into the allocation rules. Thus the fix-up of mongrel preferences has to be substantially altered when we consider more general preference specifications. We now present our first bit of evidence on the empirical importance of preference heterogeneity. This evidence demonstrates that our concerns about this issue are not purely aesthetic.

1.2.7. Questionnaire evidence on the scale and distribution of risk aversion
In an innovative paper, Barsky, Juster, Kimball and Shapiro (1997) elicit risk preferences from hypothetical questions administered to a sample of respondents in the University of Michigan Health and Retirement Survey (HRS). The aim of this study was to extract the degree of relative risk aversion without linking it to the elasticity of intertemporal substitution. Respondents were asked about their willingness to participate in large gambles of various magnitudes. For example, adult respondents are asked to imagine themselves as the sole earners in their families and are asked to choose between a job with their current annual family income guaranteed for life versus a prospect with a 50-50 chance of doubling family income and a 50-50 chance of reducing family income by a third. Respondents who take the gamble offered in the first question are then asked if they would take an option with the same gain as offered in the first question, and the same probability of gain, but a greater loss that cuts income in half. Respondents who decline the gamble offered in the first question are offered a second option with the same gain (and probability of gain) as the first option, but with the loss reduced to 20 percent. Answers to these questions enable one to bound the coefficient of relative risk aversion. The results of this hypothetical exercise are reported in Table 1.1 for a variety of demographic groups. The two notable features of this table are: (a) the substantial proportion of risk averse people, and (b) the heterogeneity in risk aversion both across demographic groups and within them.

17 See Attanasio (1999) for a survey of the literature testing for linear allocation rules.
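The bounds come from an indifference condition. With constant relative risk aversion γ and current income normalized to one, a respondent is exactly indifferent to a 50-50 gamble of doubling income versus losing fraction x of it when 0.5·u(2) + 0.5·u(1−x) = u(1). Solving this condition for the three downside fractions described above (one half, one third, one fifth) reproduces the interval endpoints that appear in Table 1.1; the sketch below is our own illustration, not the authors' code:

```python
import math

def u(c, gamma):
    """CRRA utility normalized so that u(1) = 0; log utility at gamma = 1."""
    if abs(gamma - 1.0) < 1e-12:
        return math.log(c)
    return (c ** (1.0 - gamma) - 1.0) / (1.0 - gamma)

def threshold(x):
    """Gamma at which a respondent with income 1 is indifferent between the
    status quo and a 50-50 gamble of doubling income or losing fraction x.
    The expected gain 0.5 u(2) + 0.5 u(1-x) is decreasing in gamma."""
    def gain(gamma):
        return 0.5 * u(2.0, gamma) + 0.5 * u(1.0 - x, gamma)
    lo, hi = 0.01, 20.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gain(mid) > 0.0:
            lo = mid              # still accepts the gamble: raise gamma
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Downside fractions of 1/2, 1/3 and 1/5 give the interval endpoints of
# Table 1.1: gamma = 1, gamma = 2 and (approximately) gamma = 3.76.
endpoints = [threshold(x) for x in (0.5, 1.0 / 3.0, 0.2)]
```

The one-third and one-half cut-offs solve in closed form (γ = 2 and γ = 1 exactly), which is a useful check on the numerical root finder.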
Table 1.1
Risk aversion by demographic group a

Demographic group    [3.76, ∞]   (2.00, 3.76)   (1, 2)   (0, 1)   Number of responses
Under 50 years          58.5         14.4         13.8     13.1        1147
50 to 54 years          61.9         12.0         12.2     13.7        3800
55 to 59 years          66.0         11.5          9.8     12.5        4061
60 to 64 years          69.3          9.5          9.4     11.6        2170
65 to 69 years          66.6         12.0          9.2     12.0         390
Over 70 years           68.3          6.4          9.3     15.8         139
Female                  65.1         11.8         11.0     11.9        6448
Male                    64.0         11.2         10.7     13.9        5259
White                   64.9         12.5         10.7     11.8        8508
Black                   66.7          9.1         10.6     13.3        1884
Other                   62.3         10.0         13.7     13.7         109
Asian                   57.9         10.3         11.1     20.6         126
Hispanic                59.3          9.2         12.6     18.7        1054
Protestant              66.2         11.5         10.8     11.4        7404
Catholic                62.3         10.8         11.4     15.3        3185
Jewish                  56.3         13.2         11.1     19.2         197
Other                   61.6         14.3          9.6     14.3         900

Entries in the four interval columns are percentages of respondents whose implied coefficient of relative risk aversion falls in the indicated interval.

a Source: Barsky, Juster, Kimball and Shapiro (1997), Table III. The p-value for the hypothesis that mean risk tolerance is equal across age groups is 0.0001; across sexes, 0.015; across races, 0.0001; and across religions, 0.0001.
There are serious questions about the relationship between actual risk-taking behavior and the responses elicited from questionnaires. In addition, there are serious questions about the magnitude of the gambles in these hypothetical choice experiments. Preferences that exhibit constant relative risk aversion link the aversion to small gambles to the aversion to large gambles, and hence justify calibrating risk aversion to large gambles. Behavioral responses to small bets may be different from the responses to large ones, and the bets studied in this survey are indeed substantial. In fact, Epstein and Melino (1995) provide empirical evidence that risk aversion may be much larger for small gambles than for large ones. Nonetheless, the results summarized in Table 1.1 are very suggestive of considerable heterogeneity in the population. We present further evidence on preference heterogeneity in Section 3.
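The tight CRRA link between small and large gambles can be made concrete: the same γ that rationalizes behavior toward the large HRS gamble pins down attitudes toward arbitrarily small bets through the Arrow–Pratt approximation. The sketch below is our own illustration, with hypothetical stake sizes:

```python
import math

def ce(gamble, gamma):
    """Certainty equivalent of a list of (prob, consumption) pairs under CRRA."""
    if abs(gamma - 1.0) < 1e-12:
        return math.exp(sum(p * math.log(c) for p, c in gamble))
    m = sum(p * c ** (1.0 - gamma) for p, c in gamble)
    return m ** (1.0 / (1.0 - gamma))

gamma = 4.0      # hypothetical: roughly, a respondent in the [3.76, inf) group
income = 1.0

# Large HRS-style gamble: 50-50 double income or lose one third.
large = [(0.5, 2.0), (0.5, 2.0 / 3.0)]
# Small gamble with the same 2:1 win/loss ratio but tiny stakes (assumed).
small = [(0.5, 1.0 + 0.02), (0.5, 1.0 - 0.01)]

risk_premium_large = sum(p * c for p, c in large) - ce(large, gamma)
risk_premium_small = sum(p * c for p, c in small) - ce(small, gamma)

# Arrow-Pratt: for small bets the premium is approximately
# 0.5 * gamma * variance / income.
var_small = 0.5 * (0.02 - 0.005) ** 2 + 0.5 * (-0.01 - 0.005) ** 2
approx = 0.5 * gamma * var_small / income
```

Under CRRA the small-gamble premium is pinned down by the large-gamble calibration through γ alone; the evidence cited above suggests actual behavior toward small bets need not obey this link.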
We next consider more general market environments without full insurance, and the empirical challenges that arise in constructing and calibrating general equilibrium models in such environments.
1.3. Incomplete markets

While the multiple consumer, Pareto optimal economy is pedagogically convenient, it assumes the existence of a rather large collection of markets. Moreover, it eliminates by assumption many interesting policy questions, such as those having to do with borrowing constraints or adverse selection. We now consider what happens when most of the markets assumed to be open in the Pareto optimal economy are closed down. Following Miller (1974), Bewley (1977), Scheinkman and Weiss (1986), Aiyagari (1994) and Krusell and Smith (1998), among others, we suppose that consumers can only purchase and sell shares of the capital stock and are not permitted to trade claims to their future individual productivities. Moreover, only non-negative amounts of capital can be held. This is an economy with a "borrowing" constraint and other forms of market incompleteness that hold simultaneously. We discuss later how these constraints can arise. In an environment with incomplete markets, we can no longer exploit the convenient Pareto formulation. The economy we consider is one in which prices and quantities must be computed simultaneously. Under the efficiency units assumption, the current-period wage rate satisfies a standard marginal productivity condition. The gross return to holding capital must satisfy

1 + r_{t+1} = θ d_{t+1} (k_t / h_{t+1})^{θ−1} + 1 − δ.
For the special case of Aiyagari's economy, there is no aggregate uncertainty. As a consequence, the equilibrium rate of return to capital will be riskless. In Krusell and Smith (1998), there is aggregate uncertainty, but this uncertainty is sufficiently small that there is little difference between the risky return on capital and a risk-free security 18. Given that only non-negative amounts of capital can be held, the familiar consumption Euler equation is replaced by

E[ β_j (c_{j,t+1} / c_{j,t})^{−ρ_j} (1 + r_{t+1}) | I_t ] ≤ 1,    (1.15)
where equality holds when the consumer chooses positive holdings of the capital stock. The familiar interior solution Euler equation no longer characterizes some of the agents in the economy [see, for example, Zeldes (1989)].
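The role of inequality (1.15) can be checked on a solved example. The sketch below computes the consumption policy for a stripped-down income-fluctuation problem (iid two-point labor income, a no-borrowing constraint, and hypothetical parameter values, none taken from the papers cited) by the endogenous grid method; at low cash-on-hand the constraint binds and the Euler expression falls strictly below one, while at interior points it is approximately one.

```python
import numpy as np

# Income-fluctuation problem with a no-borrowing constraint:
#   max E sum_t beta^t c_t^(1-rho)/(1-rho),  x' = R (x - c) + y',  x - c >= 0,
# with iid two-point labor income.  All parameter values are hypothetical.
beta, R, rho = 0.96, 1.02, 2.0
ys = np.array([0.5, 1.5])                  # equiprobable income draws (assumed)
a_grid = np.linspace(1e-8, 20.0, 200)      # exogenous grid over savings a' = x - c
eval_x = np.linspace(0.05, 20.0, 100)      # grid used to check convergence

# Endogenous grid method, starting from the "consume everything" policy.
# Prepending the point (0, 0) makes the interpolated policy equal the line
# c = x below the first endogenous grid point, which is exactly the region
# where the constraint a' >= 0 binds.
x_pts = np.concatenate(([0.0], a_grid))
c_pts = x_pts.copy()
for _ in range(5000):
    x_next = R * a_grid[:, None] + ys[None, :]
    c_next = np.interp(x_next, x_pts, c_pts)
    emu = np.mean(c_next ** (-rho), axis=1)          # E[u'(c')] at each a'
    c_today = (beta * R * emu) ** (-1.0 / rho)       # invert u' in the Euler equation
    x_new = np.concatenate(([0.0], c_today + a_grid))
    c_new = np.concatenate(([0.0], c_today))
    diff = np.max(np.abs(np.interp(eval_x, x_new, c_new)
                         - np.interp(eval_x, x_pts, c_pts)))
    x_pts, c_pts = x_new, c_new
    if diff < 1e-10:
        break

def euler_lhs(x):
    """E[beta (c'/c)^(-rho) (1 + r)] at cash on hand x, as in (1.15)."""
    c = np.interp(x, x_pts, c_pts)
    c_prime = np.interp(R * (x - c) + ys, x_pts, c_pts)
    return float(beta * R * np.mean((c_prime / c) ** (-rho)))
```

At x = 10 the household saves, so euler_lhs(10.0) is approximately one; at x = 0.3 the constraint binds, consumption equals cash on hand, and the expression is strictly below one, so the interior-solution Euler equation fails for such agents.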
18 As a consequence, like the original Brock–Mirman model, theirs is a poor model of the aggregate return on equity measured using stock market data.
In an attempt to explain the wealth distribution, Krusell and Smith (1998) introduce a form of discount factor heterogeneity modeled as a persistent stochastic process. They do not, however, make specific use of the preference heterogeneity measured from microeconomic consumption studies in their calibration. Furthermore, as we will see, calibrating to microeconomic data in environments with less than full insurance requires more than just extracting preference parameters; it requires measuring the uninsured uncertainty facing agents.

1.3.1. Microeconomic uncertainty
Aiyagari (1994) and Krusell and Smith (1998) attempt to quantify the impact of the precautionary motive for savings on both the aggregate capital stock and the equilibrium interest rate, assuming that the source of uncertainty is in individual labor market productivity. In order to accomplish this task, these analysts require a measure of the magnitude of microeconomic uncertainty and of how that uncertainty evolves over the business cycle. Euler equation estimates of preference parameters must be supplemented by measures of individual uncertainty. This introduces the possibility of additional sources of heterogeneity because different economic agents may confront fundamentally different risks. To calibrate the macroeconomic model it becomes crucial to measure the distribution of individual shocks. The income of person j at time t+1 is w_{t+1} y_{j,t+1}, and its distribution depends in part on the aggregate state variable x_t. In practice, household income arises from many sources with possibly different risks, and people in different occupations face quantitatively important differences in the uncertainty they confront [see the evidence in Carroll and Samwick (1997)]. Aggregating income from all sources or pooling agents in different risk classes is a potentially dangerous practice that may mask the true predictability of the individual components. Aggregates of income sources may not accurately represent the true economic uncertainty facing agents. The persistence in the idiosyncratic shocks and the manner in which aggregate state variables shift the distributions of idiosyncratic shocks are known to have an important impact on consumption allocations in incomplete market models [see Mankiw (1986) and Constantinides and Duffie (1996)]. Aggregating across risk components can alter the measured predictability. We now present evidence on the time series processes of labor earnings and wage innovations, drawing on research by MaCurdy (1982) and Abowd and Card (1989).
Hubbard, Skinner and Zeldes (1994) consider measures of uncertainty for other sources of household income. We summarize the econometric evidence on the form of the forecasting equation, represented as an ARMA model, and on the magnitude of the innovation variance, which is often used as a measure of uncertainty.

1.3.1.1. Estimated processes for wages and earnings. There is agreement among micro studies of nationally based representative samples that differences in the residuals of male log earnings or wage rates from a Mincer earnings function are
Table 1.2
Estimated ARMA processes for residuals from first-differenced Mincer log wage or log earnings equations a

                        ARMA     a              m1             m2              E(ε²_{i,t}) b
Log hourly wage rates   (0,2)    –              −0.484 (17)    −0.066 (2.7)    0.061 (17)
                        (1,1)    0.122 (2.6)    −0.608 (13)    –               0.061 (16)
Log annual earnings     (0,2)    –              −0.411 (14)    −0.106 (3.8)    0.054 (15)
                        (1,1)    0.216 (3.95)   −0.621 (13)    –               0.056 (14)

a Source: MaCurdy (1982), Tables 3 and 4. The estimated model is Δu_{i,t} = a Δu_{i,t−1} + ε_{i,t} + m1 ε_{i,t−1} + m2 ε_{i,t−2}; t-statistics in parentheses.
b Innovation variance, computed assuming stationarity of the innovation process.
adequately represented by either an MA(2) process or an ARMA(1,1) process. There is surprisingly little information available about the time series processes of female earnings and wage rates. The representations for the change in log earnings and wage rates for married males that receive the most support in the studies of MaCurdy (1982) and Abowd and Card (1989) are:

Δu_{i,t} = ε_{i,t} + m1 ε_{i,t−1} + m2 ε_{i,t−2}    (1.16)

or

Δu_{i,t} = a Δu_{i,t−1} + ε_{i,t} + m1 ε_{i,t−1},    (1.17)

where

Δu_{i,t} = u_{i,t} − u_{i,t−1}

and u_{i,t} is the residual of a Mincer regression for log earnings or wage rates for person i. (See Section 2.2 for a discussion of Mincer earnings models.) Estimates of the parameters of these models are presented in Table 1.2 [taken from MaCurdy (1982)]. He reports that he is unable to distinguish between these two representations of the time series process of residuals. In Table 1.3 we report MaCurdy's (1982) estimates when the autoregressive unit root specification is not imposed in Equation (1.17). The freely estimated autoregressive coefficients are close to, but slightly less than, one, and they are estimated with enough accuracy to reject the unit root model using statistical tests. The analysis of Abowd and Card (1989) is generally supportive of the results reported by MaCurdy (1982), except that MaCurdy finds that the coefficients m1, m2 are constant over time whereas Abowd and Card report a rejection of the overall
Table 1.3
Estimated ARMA processes for residuals from levels of Mincer log wage or log earnings equations a

                        ARMA     a             m1             m2              E(ε²_{i,t}) b
Log hourly wage rates   (1,2)    0.025 (2.8)   −0.46 (13.4)   −0.053 (1.78)   0.061 (17)
Log annual earnings     (1,2)    0.026 (2.5)   −0.39 (12.2)   −0.094 (3.13)   0.055 (18.3)

a Source: MaCurdy (1982), Tables 5 and 6. The estimated model is u_{i,t} = (1 − a) u_{i,t−1} + ε_{i,t} + m1 ε_{i,t−1} + m2 ε_{i,t−2}; t-statistics in parentheses.
b Innovation variance, computed assuming stationarity of the innovation process.
hypothesis of stationarity for the model. There is no necessary contradiction between the two studies because MaCurdy does not require that the variances of the ε_{i,t} be constant, nor does he report evidence on the question. However, he uses the assumption of constancy in the innovation variances of the earnings processes to report the innovation variances given in the final column of Tables 1.2 and 1.3. From the vantage point of macroeconomics, time series variation in the innovation variances is of interest, especially the extent to which the variances fluctuate over the business cycle. Figures 1.3 and 1.4 demonstrate how the conditional variance of Δu_{i,t} changes over the business cycle. In periods of rising unemployment, the innovation variance in log wage equations increases. This evidence is consistent with the notion that microeconomic uncertainty is greater in recessions than in booms 19. As we have previously noted, the models described in this section take households as the decision unit. This led researchers such as Heaton and Lucas (1996) and Hubbard, Skinner and Zeldes (1994) to present estimates of pooled family income processes 20. They do not report separate estimates of the earnings processes for husbands and wives. In samples for earlier periods, there is evidence of negative covariance between spousal earnings [Holbrook and Stafford (1971)]. In later samples, there is evidence of positive covariance [Juhn and Murphy (1994)] 21. We question whether pooling household earnings processes is a sound practice for extracting components of risk
19 The evidence for professional and educated workers reported by Lillard and Weiss (1979) and Hause (1980) suggests the presence of a person-specific growth trend. This growth trend is not found to be important in national samples across all skill groups and for the sake of brevity we do not discuss specifications of the earnings functions with this growth trend. 20 Also, to obtain a better match with their model, Heaton and Lucas (1996) look at individual income relative to aggregate income. In effect they impose a cointegration relation between individual and aggregate log earnings. 21 However, one should not make too much of this difference. The Holbrook and Stafford study reports a relationship for panel data; the Juhn and Murphy study is for cross-sectional data.
Fig. 1.3. Variance of change in residuals of annual log earnings.
Fig. 1.4. Variance of change in residuals of annual log earnings.
facing households in models designed to track the wealth distribution of the economy. Each income source is likely to have its own component of uncertainty. Further research on this topic would be highly desirable.

1.3.1.2. Missing risks. Most of the microeconomic evidence is based on samples of annual earnings or average hourly wages for continuously working, continuously married males. Risks associated with long-term job loss due to job displacement, illness or marital disruption are typically ignored. On these grounds the estimated innovation variances of narrowly defined earnings processes are likely to understate the risks confronted by agents. Many labor force and life-cycle sources of risk are abstracted from in these studies. Carroll (1992), Hubbard, Skinner and Zeldes (1994), and Lillard and Weiss (1997) make this point, and estimate additional components of uncertainty.

1.3.1.3. Statistical decompositions. Statistical decompositions of wage and earnings processes are intrinsically uninformative about the information available to economic agents. As in Friedman (1957), all of the components of Equations (1.16) or (1.17) could be known and acted on by agents. Estimated innovation variances include measurement error components, factors known to the agents but unknown to the econometrician, and true components of uncertainty. On these grounds, the estimated variances in empirical earnings equations likely overstate the true uncertainty facing agents. To demonstrate the value of the cross-equation restrictions connecting consumption and earnings in identifying the innovation in earnings, consider the following example based on the permanent income model of consumption. Suppose that the first difference of the level of labor income (earnings) e_t evolves according to 22

Δe_t = φ + Σ_{j=0}^∞ a_j · η_{t−j},
where {η_t} is a non-degenerate, stationary, multivariate martingale difference sequence with a finite second moment. We impose as a normalization that the covariance matrix of η_t is the identity. The a_j are vectors of moving-average coefficients. The martingale difference sequence {η_t} is adapted to the sequence of information sets available to the consumer. Multiple components are introduced into η_t to capture the multiple sources of uncertainty in, say, household income 23. For simplicity, we assume the preferences for consumption are quadratic, as in Flavin (1981) or Hansen (1987), and that the real interest rate r is constant 24. Then from those
22 This specification includes ones used by Friedman in his Ph.D. thesis; see Friedman and Kuznets (1945). In contrast to the processes fit by MaCurdy (1982), this specification is written in terms of first differences of income levels instead of first differences of logarithms.
23 For a related discussion, see Blundell and Preston (1998).
24 See Hansen (1987) for a general equilibrium interpretation of this model.
analyses we know that the change in consumption from date t − 1 to t, c_t − c_{t−1}, is just the change in the flow of discounted current and future income:

c_t − c_{t−1} = [r / (1 + r)] Σ_{j=0}^∞ (1 + r)^{−j} a_j · η_t.    (1.18)

Not only is consumption a martingale, as noted by Hall (1978), but the composite income–consumption process must be present-value neutral [see Hansen, Roberds and Sargent (1991)]. That is, consumption and income must respect a present-value budget constraint for all realizations of the shock vector η_t. Relation (1.18) respects this constraint by taking account of the fact that the shock η_t alters income in future time periods. Even in the absence of measurement error in consumption and income, the entire innovation vector η_t cannot necessarily be identified by an econometrician using data on income alone. For instance, if η_t has more than one entry, fitting a univariate time series process to income will not reveal this vector of new information pertinent to the consumer. By looking instead at consumption, η_t can at least be partially identified because consumption is a martingale adapted to the true information set of the consumer. It follows from Equation (1.18) that the first difference of consumption reveals one linear combination of the shock vector η_t 25. Suppose now that measured earnings are

e*_t = e_t + v_t,
where the measurement error {v_t} has mean zero and finite variance and is independent of the process {η_t}, but the serial correlation properties of {v_t} are unrestricted. Then from the autocovariances of {Δe*_t} one cannot identify the moving-average coefficients (the a_j) even if {η_t} is a scalar process. Thus the strategy of estimating the variance of the innovations in information from the covariances of the error processes of measured earnings equations, as used by Carroll (1992), Hubbard, Skinner and Zeldes (1994) and others, fails, because it is impossible to separate the measurement error from the earnings innovations. Again consumption data are informative because agents respond to innovations in earnings but not to the measurement error. Suppose that consumption is also contaminated by measurement error. Thus we write

c*_t = c_t + w_t,
where {c*_t} is measured consumption and {w_t} is a measurement error process that is independent of {η_t} and {v_t}. We may now use the cross covariances between the

25 Hansen, Roberds and Sargent (1991) use this result and generalizations of it to deduce testable implications of present-value budget balance.
M. Browning et al.
574
{Δc*_t} and {Δe*_t} to obtain at least partial identification of the income information structure confronting the consumer. This follows from the formula
\mathrm{Cov}(\Delta e^*_{t+k}, \Delta c^*_t) = \frac{r}{1+r}\, a_k \cdot \left[\sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^{j} a_j\right],

which uses the fact that η_t has the identity as its covariance matrix and is valid for k ≥ 0. Given prior knowledge of the constant interest rate r, we may use this formula and formula (1.18) to deduce the variance of Δc_t:

\mathrm{Var}(\Delta c_t) = \left(\frac{r}{1+r}\right)^{2}\left[\sum_{k=0}^{\infty}\left(\frac{1}{1+r}\right)^{k} a_k\right]'\left[\sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^{j} a_j\right] = \frac{r}{1+r}\sum_{k=0}^{\infty}\left(\frac{1}{1+r}\right)^{k}\mathrm{Cov}(\Delta e^*_{t+k}, \Delta c^*_t).
From these calculations, then, we may again infer how the true income process responds to a linear combination of the η_t shock vector 26. If the {η_t} process is scalar, then we have full identification of the information structure confronting the consumer that is pertinent for the evolution of income. Thus the use of consumption data, even if measured with error, can help to identify the true income uncertainty confronting economic agents. We conclude this subsection with a brief discussion of the sources of market incompleteness.
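A small Monte Carlo exercise can illustrate this identification argument. In the sketch below, the MA(2) income process, the scalar-η_t restriction, and every parameter value are assumptions chosen for illustration; the code simulates the model with measurement error in both series and compares sample cross covariances against the population formula:

```python
import numpy as np

# Monte Carlo check (a sketch; all parameter values are assumed) that cross
# covariances of measured earnings and consumption growth recover the
# moving-average coefficients up to the common annuity factor r/(1+r).
rng = np.random.default_rng(0)
r = 0.05
b = 1.0 / (1.0 + r)                       # discount factor 1/(1+r)
a = np.array([1.0, 0.6, 0.3])             # moving-average coefficients a_0, a_1, a_2
T = 2_000_000

eta = rng.standard_normal(T)              # scalar income innovations, unit variance
v = rng.standard_normal(T)                # measurement error in earnings levels
w = rng.standard_normal(T)                # measurement error in consumption levels

de = np.convolve(eta, a)[:T]              # true earnings growth: sum_j a_j * eta_{t-j}
S = float((a * b ** np.arange(a.size)).sum())
dc = (r / (1.0 + r)) * S * eta            # consumption growth, Equation (1.18)

de_star = de + np.diff(v, prepend=0.0)    # measured earnings growth
dc_star = dc + np.diff(w, prepend=0.0)    # measured consumption growth

samples, theories = [], []
for k in range(a.size):
    samples.append(float(np.mean(de_star[k:] * dc_star[:T - k])))  # sample Cov (means ~ 0)
    theories.append((r / (1.0 + r)) * a[k] * S)                    # population formula
```

With a long sample, the sample covariances settle close to (r/(1+r)) a_k Σ_j (1+r)^{-j} a_j for k = 0, 1, 2, while, as the text notes, the autocovariances of Δe*_t alone would confound the a_j with the measurement error.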
1.3.2. Limited commitment and private information

In our incomplete markets model, we made no attempt to justify the form of market incompleteness. Two common justifications include problems of enforceability and observability. Measurements of microeconomic uncertainty are critical ingredients in economies that explicitly account for limited commitment and private information.
1.3.2.1. Limited commitment. Kehoe and Levine (1993) and Kocherlakota (1996) propose the following alternative to the incomplete markets models we have considered thus far. Suppose consumers are permitted to walk away from obligations, but when they do so they are excluded from future participation in markets. Instead they are restricted to use their own backyard storage technologies or simply consume only their

26 Hansen, Roberds and Sargent (1991) use a closely related argument to show that present-value budget balance has testable implications when consumption is a martingale. They do not, however, explore the ramifications of measurement error in consumption and income.
labor income. As a consequence, in equilibrium, consumers are guaranteed a lower bound on their discounted utilities at each date and state. Consumption allocations are obtained by solving Pareto problems subject to utility lower bounds implied by the utility threat points: the points at which consumers are indifferent between honoring their obligations and defaulting. While there is no default in equilibrium, allocations are altered by the presence of the utility threat points. Alvarez and Jermann (1999) consider an economy with limited commitment in which consumers have no access to backyard storage technologies and hence are punished by constraining future consumption to equal future income, period by period. They show that the resulting allocations may be decentralized by introducing person-specific solvency or borrowing constraints 27. Euler equations are replaced by Euler inequalities, as in Luttmer's (1996) work on asset pricing when investors face solvency constraints. Alvarez and Jermann (1999) are able to mimic asset pricing features of an economy with solvency constraints, but with different predictions about when agents will be up against financial market constraints. Individual Euler equations linking consumption to asset prices continue to characterize the behavior of some individuals in each time period. When and which consumers are constrained in their ability to borrow can be ascertained by computing the utility threat points, which in turn depend on the microeconomic uncertainty they would be forced to confront in the absence of risk sharing. Thus the same problem of measuring uncertainty discussed for economies with incomplete markets applies to economies in which limited commitment is the only source of financial market frictions.

1.3.2.2. Private information. In the limited commitment economies considered by Kehoe and Levine (1993) and by Alvarez and Jermann (1999), idiosyncratic endowment shocks are publicly observed.
Suppose instead that they are only known to the individual agents and not to the public. Again, the options and information available to agents matter. For instance, suppose that a capital accumulation technology is only available to the society as a whole and not to individuals. In this case, individual consumption contracts can be enforced because the agents are unable to privately transfer consumption from one period to the next. There is a substantial body of work on optimal resource allocation subject to incentive constraints that relies on the enforceability of consumption contracts [see Green (1987), Phelan and Townsend (1991), Atkeson and Lucas (1992) among others]. In general, the efficient allocations do not have decentralizations that look like the incomplete market structure previously described. Economies with a simple security market structure are not Pareto efficient even after accounting for the incentive constraints [e.g., see Lucas (1992)]. Of course, security markets could be supplemented by other institutions designed to reduce or
27 The person-specific nature of the solvency constraint stretches the notion of a decentralized economy a bit. As an alternative, we may view these limited commitment economies as (constrained) efficient benchmarks to which we might compare alternative institutional arrangements.
eliminate the efficiency wedge. In contrast, when a capital accumulation technology is privately available, individual agents can hide their consumption from the public. Thus individual consumption contracts are no longer enforceable. For some special versions of these environments, Allen (1985) and Cole and Kocherlakota (1997) show that the incomplete security market economy we described previously fully decentralizes the Pareto efficient allocations subject to incentive constraints. Even when there is an efficiency wedge, the specification of microeconomic uncertainty is a critical ingredient in both the decentralized economy and in the Pareto efficient economy subject to incentive constraints. The problems of measuring microeconomic uncertainty arise in private information economies as well.
2. Overlapping generations model

2.1. Motivation
The overlapping generations model (OLG), now widely used in macroeconomics and public finance, captures heterogeneity among cohorts, something not captured by the models considered in Section 1. Thus it provides valuable information on the cohort-specific consequences of economic policies and of macro disturbances. Influential versions of the OLG model by Auerbach and Kotlikoff (1987) and Fullerton and Rogers (1993) are widely used to evaluate a variety of policy reforms, including tax and social security policies. The versions of this model that are widely used are based on perfect foresight. A major computational advantage of this form of the OLG model is that it is relatively straightforward to calculate transitional paths for high-dimensional specifications, as demonstrated by Greenwood and Yorukoglu (1997). This is in contrast with the stochastic growth model with incomplete markets, where computation with high-dimensional state vectors is still a very difficult problem. However, perfect foresight is a strong assumption. Huggett (1996) considers a version of an OLG model with idiosyncratic uncertainty, but he only analyzes steady states. Most empirical implementations of the OLG model ignore decisions to accumulate human capital by taking skills to be exogenous endowments. Yet human capital is a more substantial component of total wealth than physical capital. For this reason, we present a generalization of the Auerbach-Kotlikoff overlapping generations model due to Heckman, Lochner and Taber (1998) (HLT) that incorporates human capital in addition to the physical capital that is the centerpiece of the Auerbach-Kotlikoff analysis. We consider the evidence required to empirically implement this and other versions of the OLG model using either micro or macro data.
We use our exposition of this extended version of the OLG model to consider the benefits of introducing richer forms of heterogeneity in skills and human capital production technologies and to consider choices at both the intensive and extensive margins. Schooling decisions are made at the extensive margin. Moreover, investment
in schooling and on-the-job training are distinct activities and produce distinct skills, and output is produced by a skill mix supplied by different agents. The following facts motivate our choice of models:
• Empirical evidence in labor economics demonstrates that comparative advantage in factor markets is the rule and that different persons specialize in different skills [see Sattinger (1993)]. Widely-used efficiency units corrections that collapse skill to a single dimension explain neither cross-section wage distributions nor the evolution of wage distributions over time.
• There are substantial cohort effects in earnings functions.
• Heterogeneity in saving behavior and wealth accumulation helps in fitting OLG models to data. Life-cycle models with preference homogeneity in each cohort do not generate enough dispersion in savings to explain the observed capital holdings.
To accommodate these facts, we follow HLT by embedding an extension of the model of Ben-Porath (1967) of individual human capital production into a general equilibrium setting. In Ben-Porath's model, in the stochastic growth model discussed in Section 1, and in the Auerbach-Kotlikoff and Fullerton-Rogers models, wage inequality can only be generated by differences in amounts of a common skill (the so-called "efficiency units" model). All skill commands the same price. In the model considered by HLT and in this part of our chapter, different levels of schooling enable individuals to invest in different skills through on-the-job training in the post-schooling period. In the aggregate, the skills of different schooling groups are not perfect substitutes 28. Within schooling groups, however, persons with different amounts of skill are perfect substitutes 29. For our purposes, this model provides a useful platform for integrating empirical results from labor economics into a general equilibrium, macroeconomic model.
Before turning to the details of the HLT model, it is fruitful to place their work in the context of current models of earnings determination widely used in the literature.

2.2. Economic models of earnings and human capital investment
Three different frameworks for interpreting wage and earnings functions coexist in the literature on empirical labor economics. Their coexistence in the literature is a constant source of confusion. The only agreement among users of these alternative frameworks is their agreement to call the coefficient on schooling in a log wage equation a "rate of return", although the conditions required to do so are known to be restrictive 30.
28 This specification is consistent with evidence that the large increase in the supply of educated labor consequent on the baby boom depressed the returns to education. [See Freeman (1976), Autor, Katz and Krueger (1997) and Katz and Murphy (1992).] 29 This specification accords with the empirical evidence summarized in Hamermesh (1993, p. 123) that persons of different ages but with the same education levels are highly substitutable for each other. 30 In order to interpret the schooling coefficient as an internal rate of return, it is necessary to assume an environment of perfect certainty and to further assume that earnings are multiplicatively separable
The first two frameworks are scalar and vector attribute pricing equations, respectively. The third framework accounts for personal investment, and the estimated coefficients on attributes in fitted earnings functions are not prices, although they are often interpreted as such. The most widely used empirical framework for wages in aggregative analysis writes the wage w_i of person i as the product of the price R of labor services denominated in "efficiency units" and the quantity of efficiency units Q_i embodied in the agent:

w_i = R Q_i,   (2.1)
where Q_i is a function of characteristics of individuals such as education and work experience. In this framework, R = F_Q(Q, K), where F is the aggregate production function, K is the aggregate capital stock, and Q = \sum_{i=1}^{N} Q_i, where N is the number of agents working in the economy. Equation (2.1) underlies the calibrated real-business cycle models of Kydland (1984, 1995) and a variety of microeconomic studies. [See the one-sector model in Heckman and Sedlacek (1990) or the aggregative models of Stokey and Rebelo (1995), Lucas (1988), Uzawa (1965) and Caballe and Santos (1993).] In logs, Equation (2.1) has the strong implication that aggregate shocks operate only through the intercepts of log wage equations. This framework cannot explain the well-documented rise in inequality in means among skill classes, or the increases in variances of log wage equations that are central features of many modern economies [see, e.g., Katz and Murphy (1992)], unless quality changes within skill groups. Since the mean of log wages and the variance of log wages increase greatly over short periods of time for groups with stable skill characteristics, such an explanation of rising wage inequality is implausible. It is tested and rejected in the study of Katz and Murphy (1992). Real wages respond to aggregate unemployment (see Table 2.1). However, the response to local unemployment rates varies by gender, education and age (see Table 2.2). The wages of unskilled persons are much more cyclically sensitive to local unemployment rates. This evidence does not support a one-dimensional efficiency units model. In addition, an efficiency units model is inconsistent with the well-documented evidence on comparative advantage in the labor market. [See Heckman and Sedlacek (1985), or Sattinger (1993).] A model in which persons choose sectors based on their comparative advantage in them explains the distribution of US wage data whereas a model with efficiency units does not [Heckman and Sedlacek (1990)].
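The intercept-only implication is easy to verify numerically. In the sketch below, the distribution of efficiency units and the size of the price shock are hypothetical values chosen for illustration:

```python
import numpy as np

# Efficiency-units model (2.1): w_i = R * Q_i, so log w_i = log R + log Q_i.
# An aggregate shock to R moves only the intercept of the log wage equation.
rng = np.random.default_rng(0)
Q = np.exp(rng.normal(0.0, 0.4, size=5_000))  # hypothetical efficiency units

var_before = float(np.var(np.log(1.0 * Q)))   # skill price R = 1.0
var_after = float(np.var(np.log(1.5 * Q)))    # aggregate shock raises R to 1.5

# The variance of log wages is unchanged (up to floating point), so (2.1)
# cannot generate rising log wage dispersion for groups with stable skills.
```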
Finally, the evidence reported in Topel (1986) and Heckman, Layne-Farrar and Todd (1996) demonstrates that the wages of different skill groups respond differently to local labor market shocks. This is inconsistent with the efficiency units wage specification (2.1). in schooling and experience and that all persons of all education levels have the same post-school experience profiles. See Heckman and Klenow (1997) and Heckman, Lochner and Taber (1999) for a derivation.
Table 2.1
Elasticity of real wages with respect to aggregate unemployment: first-differenced panel data

Author | Data set | Elasticity (S.E.)
Bils (1985) | NLS young men (whites), 1966-1980, 10 changes | -0.089 (0.019)
Rayack (1987) | PSID males (whites), 1968-1980, 12 changes | -0.081 (0.016)
Blank (1990) | PSID males (whites), 1969-1982, 13 changes, pairwise balanced | -0.081 a (0.043)
Solon, Barsky and Parker (1994) | PSID males, 1967-1987, 20 changes | -0.085 (0.022)

a Blank (1990) regresses the change in real wages on the percentage change in real GNP. Her estimate is transformed to an unemployment elasticity using an estimated "Okun" coefficient of 0.30.
An alternative specification of the wage equation that meets some of these objections to the efficiency units wage model is the Gorman-Lancaster model of earnings. [Welch (1969) applies this model.] In this framework, the wage of person i is a function of characteristics embodied in the person times their prices. Letting A_i be a J x 1 vector of attributes for person i and R_A a 1 x J vector of their economy-wide prices,

w_i = R_A A_i.   (2.2)
The prices are determined by the aggregate production function using the aggregates

A_j = \sum_{i=1}^{N} A_{ij},   j = 1, \ldots, J,

and

R_j = f_j(A_1, \ldots, A_J, K),
where R_j is the jth entry of R_A. In principle, this model can explain rising inequality in log earnings if aggregate shocks, or local labor market shocks, do not operate uniformly on the arguments of the production function, A_j, j = 1, \ldots, J, and persons possess more than one type of skill. Equation (2.2) is inconsistent with the evidence on comparative advantage in the labor market, however. Heckman and Scheinkman (1987) demonstrate that for the US economy the skill prices R_A are not uniform across different sectors for a variety of different types of sectors. Since prices are not uniform, a model of sectoral self-selection and comparative advantage in the labor market is required. Heckman and Sedlacek (1985, 1990) find that such a model better explains both the cross section and the time series of the US wage distribution than do efficiency units models. The usual least squares estimates of wage equations reported in the literature do not estimate skill prices unless analysts account for self-selection across sectors where skill prices differ.
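By contrast, a sketch of specification (2.2) with two attributes shows that a price shock loading on a single attribute changes log wage dispersion even with fixed attribute endowments. The attribute distribution and price vectors below are hypothetical:

```python
import numpy as np

# Gorman-Lancaster pricing (2.2): w_i = R_A A_i with two embodied attributes.
# Attribute draws and price vectors are hypothetical illustrations.
rng = np.random.default_rng(1)
A = np.abs(rng.normal(1.0, 0.5, size=(10_000, 2)))  # attribute vector per person

R_before = np.array([1.0, 1.0])
R_after = np.array([1.0, 2.0])   # shock doubles the price of the second attribute

var_before = float(np.var(np.log(A @ R_before)))
var_after = float(np.var(np.log(A @ R_after)))
# Because the shock loads on only one attribute, the dispersion of log wages
# changes, unlike in the one-dimensional efficiency units model (2.1).
```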
Table 2.2
Elasticities of wages, hours, and earnings with respect to state unemployment rates a

Category | Hourly wage: Actual (1) | Adjusted (2) | Annual hours: Actual (3) | Adjusted (4) | Annual earnings: Actual (5) | Adjusted (6)
1. All | -0.07 (0.02) | -0.08 (0.02) | -0.11 (0.01) | -0.12 (0.01) | -0.18 (0.02) | -0.20 (0.02)
2. By gender
a. Women | -0.06 (0.02) | -0.06 (0.02) | -0.08 (0.02) | -0.09 (0.02) | -0.14 (0.03) | -0.16 (0.03)
b. Men | -0.08 (0.02) | -0.09 (0.02) | -0.13 (0.01) | -0.15 (0.01) | -0.21 (0.02) | -0.24 (0.02)
3. By education
a. <12 years | -0.04 (0.03) | -0.06 (0.02) | -0.14 (0.04) | -0.19 (0.03) | -0.18 (0.05) | -0.25 (0.04)
b. 12-15 years | -0.09 (0.02) | -0.09 (0.02) | -0.13 (0.02) | -0.13 (0.01) | -0.22 (0.02) | -0.23 (0.02)
c. 16+ years | -0.01 (0.02) | -0.05 (0.02) | -0.02 (0.02) | -0.06 (0.02) | -0.03 (0.03) | -0.12 (0.03)
4. By age
a. Age 16-29 | -0.12 (0.02) | -0.13 (0.02) | -0.16 (0.02) | -0.18 (0.02) | -0.28 (0.04) | -0.31 (0.03)
b. Age 30-44 | -0.06 (0.02) | -0.05 (0.02) | -0.10 (0.01) | -0.10 (0.01) | -0.16 (0.03) | -0.15 (0.03)
c. Age 45-65 | -0.03 (0.02) | -0.03 (0.02) | -0.06 (0.02) | -0.07 (0.02) | -0.09 (0.03) | -0.10 (0.03)
5. By number of employers last year
a. One | -0.07 (0.02) | -0.07 (0.02) | -0.10 (0.01) | -0.11 (0.01) | -0.16 (0.02) | -0.18 (0.02)
b. Two or more | -0.14 (0.03) | -0.14 (0.03) | -0.20 (0.02) | -0.21 (0.02) | -0.34 (0.04) | -0.35 (0.04)

a Source: Card (1995). Standard errors are given in parentheses. Table entries are elasticities of the variables indicated in the column headings with respect to the state unemployment rate. Estimates are based on 51 state observations for 1979, 1982, 1985, 1988, and 1991. Unadjusted data are means of log hourly wages, log annual hours, and log annual earnings for each state-year cell. Adjusted data are means of regression-adjusted wages, hours, and earnings. All models include state and year dummies.
A third specification of the earnings equation is the "Mincer model" that is widely cited as a rationale for wage equations in empirical labor economics. Actually there are two Mincer models that are algebraically similar but economically distinct. The first Mincer model (1958) assumes that everyone is alike, that the economy is stationary and that persons live forever. However, different levels of schooling are associated with different skill levels. Mincer's model is an equalizing-differentials or arbitrage pricing model for the lifetime permanent wage of a person of schooling S, where r is an externally determined interest rate:
\frac{w(S)\, e^{-rS}}{r} = w(0),   (2.3)
and w(0) is the benchmark no-schooling wage. Thus ln w(S) = ln[r w(0)] + rS. Because people are alike, allocations to schooling are demand driven via the aggregate production function defined on the aggregate stocks of skills
S(j) = N_j S_j,   j = 1, \ldots, J,

where N_j is the number of persons in schooling group j, S_j is the schooling level of persons in schooling group j, and w(S_j) = F_j(S(1), \ldots, S(J), K), the marginal product of schooling at level j. The evidence against the equalizing differentials model is overwhelming [see, e.g., Murphy and Topel (1987)] and for that reason Equation (2.3) is not useful as a framework for interpreting earnings data. A second Mincer model (1974) is widely appealed to although it is not well understood. It is an accounting identity that writes, in discrete time, potential earnings at age a, E(a), as
E(a) = \sum_{j=0}^{a-1} r(j)\, C(j) + E(0),   (2.4)
where C(j) is the cost of investment in period j, r(j) is the average return on investment in period j, and E(0) is initial earnings potential. Schooling is defined as occurring in periods in which all potential earnings are invested (C(j) = E(j)). After schooling, C(j) < E(j). Let k(j) be the fraction of potential earnings invested, so k(j) = 1 during schooling and k(j) < 1 afterward. Observed earnings at age a, w(a), reflect investment because part of potential earnings is invested:
w(a) = E(a)[1 - k(a)].   (2.5)
Mincer's distinction between observed earnings and potential earnings is an important insight, especially for young persons. Even if E(a) can be written as the product of prices and attributes as in specifications (2.1) or (2.2), w(a) cannot, and coefficients on attributes do not identify measured attribute prices.
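The wedge between potential and observed earnings can be traced out with a few lines of arithmetic. The sketch below applies identities (2.4) and (2.5) recursively; the initial potential E(0), the per-period return, and the investment profile k(a) are all hypothetical values chosen only for illustration:

```python
# Accounting identities (2.4)-(2.5): potential earnings E(a) cumulate returns on
# past investment C(j) = k(j) E(j); observed earnings are w(a) = E(a)[1 - k(a)].
# All parameter values here are hypothetical.
E0 = 10_000.0     # initial earnings potential E(0) (assumed)
ret = 0.10        # average per-period return r(j) on investment (assumed)

E = [E0]
w = []
for a in range(20):
    # Full-time schooling (k = 1) for the first 4 periods, then declining
    # post-school investment, in the spirit of Mincer's specification (2.6).
    k = 1.0 if a < 4 else max(0.0, 0.5 * (1.0 - (a - 4) / 16.0))
    w.append(E[-1] * (1.0 - k))          # observed earnings, Equation (2.5)
    E.append(E[-1] + ret * k * E[-1])    # identity (2.4) in recursive form

# Observed earnings are zero during schooling, then lie below potential
# earnings as long as post-school investment continues.
```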
Mincer assumes that schooling is exogenously determined and that in post-school periods τ, with 0 ≤ τ ≤ T̂, where T̂ + S = T is the total length of working and schooling life and τ is work experience,

k(\tau) = \left(1 - \frac{\tau}{\hat{T}}\right) k(0).   (2.6)
This relationship is assumed to be the same at all schooling levels. Heckman, Lochner and Taber (1998) test and reject this specification in US data for both males and females. Collecting results, a simple recursion shows that

\ln w(\tau) = \alpha_0 + \alpha_1 S + \alpha_2 \tau + \alpha_3 \tau^2,   (2.7)
where α_1 is the average rate of return to schooling and α_2 and α_3 are functions of k(0), T̂ and the average rate of return to post-school investment 31. Schooling is determined outside the model, and there is no direct relationship between the earnings function and the schooling equation. The coefficients α_0, α_1, α_2 and α_3 are reduced-form coefficients in the sense of Marschak (1953). They are not the policy-invariant structural parameters of Lucas and Sargent (1981). As preferences, technology and policies change, so do these coefficients. Heckman, Lochner and Taber (1998) demonstrate the empirical importance of this point when they estimate a structural earnings equation and test it against the Mincer model. They demonstrate how the coefficients of the Mincer model change in response to changes in interest rates, skill prices and aggregate shocks. In addition, the assumption that k(0) does not depend on S, which is central to Mincer's claim that log earnings profiles are parallel in experience across schooling groups, although not parallel in terms of age, receives limited empirical support in recent US data [Heckman and Todd (1997)]. At a purely empirical level, Equation (2.7) omits an important interaction between schooling and work experience.

2.2.1.
The Ben-Porath framework
The model of Ben-Porath (1967) provides a more rigorous formulation of the earnings equation that combines a theory of earnings with a theory of schooling and on-the-job training. It accounts for Mincer's important insight that measured earnings are less than potential earnings, so that earnings functions are not just pricing functions for observed characteristics. The Ben-Porath model assumes income-maximizing agents who live T periods, who face a parametric interest rate r and no credit constraints. The
31 More precisely, α_1 = r_s, the "average rate of return to schooling"; α_2 = r_p k(0)[1 + (1/2T̂)] + k(0)/T̂ and α_3 = -r_p k(0)/2T̂, where r_p is the average rate of return on all post-school investments, T̂ is the length of life spent working after schooling and k(0) is defined in the text. Given α_2, α_3 and T̂, r_p and k(0) are identified from a least squares regression assuming that the error term is orthogonal to the regressors.
model is an efficiency units model that writes the potential wage, E, for a person with human capital H at age a in period t as

E(H, a, t) = R(t) H(a),

where R(t) is the marginal product of aggregate human capital. Versions of this model are widely used in modern growth theory. [See, e.g., Lucas (1988), Stokey and Rebelo (1995), and the references listed below.] Let I(a, t) denote the proportion of work time spent investing at age a in period t, where 1 is the maximum amount of time available, and let D(a, t) be the tuition cost of investment, or goods paid to receive training, with price P_D(t). Let F(I(a, t), H(a, t), D(a, t)) be the production function of human capital for individuals, which is assumed to be concave in I and D for fixed H. The net earnings of the individual are

w(a, t) = R(t)\, H(a, t)[1 - I(a, t)] - P_D(t)\, D(a, t).   (2.8)
Ignoring taxes, and for simplicity assuming a stationary economy, the agent maximizes earnings over the life cycle and solves

\max_{I(a),\, D(a)} \int_0^{T} e^{-ra}\, \{R H(a)[1 - I(a)] - P_D D(a)\}\, da   (2.9)

subject to \dot{H}(a) = F(I(a), H(a), D(a)) - \sigma H(a) and the initial condition H(0), where σ is the rate of depreciation on human capital 32. Let μ(a) be the multiplier associated with the dynamic constraint. In this framework schooling occurs when I = 1. If σ = 0, schooling occurs only once, at the beginning of life, if it occurs at all. [See Weiss (1986).] Schooling ends at age a* where I(a*) = 1 and
\mu(a^*)\, F_I(I(a^*), H(a^*), D(a^*)) = R\, H(a^*),   (2.10)
so if "0" is the beginning of life, a* is also the length of time spent in school. The functional form for F that is most commonly used in the literature writes

\dot{H} = A I^{\alpha} H^{\beta} - \sigma H.   (2.11)
In the post-school period, measured earnings are R H(a)[1 - I(a)] - P_D D(a) if the goods costs of investment are subtracted from earnings. Otherwise, measured earnings

32 Strictly speaking, the Ben-Porath model writes F = F(I(a) H(a), D(a)), but the more general specification is a minor extension of this technology. This technology is sometimes called a "neutrality" model because human capital accumulation raises the marginal cost of human capital investment and the marginal productivity of human capital investment in the same proportion.
are R H(a)[1 - I(a)]. The Ben-Porath model provides a theory of I(a) that is missing in Mincer's analysis 33. It recognizes the distinction between potential and actual earnings. It tightly links schooling and on-the-job training decisions through the human capital production function and explicitly links those decisions to the earnings function. It is a theory with testable cross-equation restrictions once the functional form of F is specified. It provides a framework for testing the relationship among earnings, schooling and training dynamics, and in this regard it is a scientific success. It is important to distinguish the investment behavior generated by the Ben-Porath model from that produced by a learning-by-doing model of the sort developed in Heckman (1971), Shaw (1989) or Altug and Miller (1990, 1998). In the common form of that model, an hour of work experience in any sector of the economy produces the same growth in wages. Investment time and work time in final goods production are bundled. This model is based on an efficiency units assumption that is inconsistent with the evidence on comparative advantage in labor markets in modern economies. In particular, the model abstracts from comparative advantage in the labor market both in the production of output and in the production of skills. Implicit in the model is a "free lunch" assumption: since learning is uniform across sectors, there is no cost of learning, unlike what is assumed in the Mincer or Ben-Porath models. This is a consequence of the Leontief assumption that bundles work and investment in the same proportion in all sectors. Once heterogeneity in learning opportunities across sectors is recognized, and the Leontief assumption is relaxed, the "learning by doing" model becomes similar to the Ben-Porath model and learning becomes a costly activity rather than a free lunch. Cossa, Heckman and Lochner (1998) discuss these points and develop a test between the two specifications.
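The mechanics of the accumulation equation can be traced by iterating a discrete-time analogue of Equation (2.11) under an assumed, linearly declining investment profile. This is a sketch of the law of motion only, not a solution of the optimization problem (2.9), and the parameter magnitudes are merely loosely inspired by the estimates discussed below:

```python
# Discrete-time analogue of the Ben-Porath technology (2.11) under an imposed,
# linearly declining investment profile I(a). Parameters are illustrative
# assumptions; the optimal program (2.9) is not solved here.
A, alpha, beta, sigma = 0.08, 0.94, 0.85, 0.0
R = 1.0        # rental rate on human capital (normalized)
H = 10.0       # initial human capital stock (assumed)

earnings = []
for age in range(40):
    I = max(0.0, 0.5 - 0.0125 * age)        # share of work time spent investing
    earnings.append(R * H * (1.0 - I))      # measured earnings R*H(a)*[1 - I(a)]
    H = A * I ** alpha * H ** beta + (1.0 - sigma) * H  # law of motion (2.11)

# Earnings rise with age as investment declines and the stock H accumulates.
```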
Estimates of this technology for discrete-time versions of Equation (2.11) are reported in Table 2.3. Until recently, all the estimates reported in the literature were for males and assumed neutrality (so F = F(IH, D)). Only Heckman (1976) and Rosen (1976) estimate non-neutral Ben-Porath models. Rosen (1976) imposes a different restriction on the model [α = 1, β = 1 in the notation of Equation (2.11)]. Heckman (1976) allows α and β to be freely specified. The recent analysis of Heckman, Lochner and Taber (1998) estimates an unrestricted model for males and females for different education and ability groups. See Table 2.4 for a summary of their estimates for each gender and ability group. They are not able to reject the hypothesis that a neutral Ben-Porath model (α = β) describes the human capital accumulation process for persons of different ability and education groups. There are several important limitations of the original Ben-Porath framework. First, like model (2.1), the Ben-Porath model is based on an efficiency units assumption for the labor market. There is no room for the operation of comparative advantage in such labor markets. The market affects skill prices identically at all levels of human
33 Mincer simply postulates the functional form of k(a) = [I(a) R H(a) + P_D D(a)] / R H(a), assuming that earnings are net of both types of cost.
[Table 2.3: estimates of discrete-time versions of the human capital production technology (2.11); the entries are not recoverable in this copy.]
Table 2.4
Estimated parameters for the human capital production function a

Parameter | Males: High school (S = 1) | Males: College (S = 2) | Females: High school (S = 1) b | Females: College (S = 2) b
α | 0.945 (0.017) | 0.939 (0.026) | 0.967 | 0.968
β | 0.832 (0.253) | 0.871 (0.343) | 0.810 | 1.000
A(1) | 0.081 (0.045) | 0.081 (0.072) | 0.079 | 0.057
H0(1) c | 9.530 (0.309) | 13.622 (0.977) | 6.696 | 8.347
A(2) | 0.085 (0.053) | 0.082 (0.074) | 0.082 | 0.057
H0(2) c | 12.074 (0.403) | 14.759 (0.931) | 7.806 | 9.453
A(3) | 0.087 (0.056) | 0.082 (0.077) | 0.084 | 0.058
H0(3) c | 13.525 (0.477) | 15.614 (0.909) | 8.777 | 11.563
A(4) | 0.086 (0.054) | 0.084 (0.083) | 0.086 | 0.058
H0(4) c | 12.650 (0.534) | 18.429 (1.095) | 9.689 | 13.061
a Source: Heckman, Lochner and Taber (1998), Table 1. Human capital production: H^S_{a+1} = A_S(θ)(I^S_a)^{α_S}(H^S_a)^{β_S} + (1 - σ)H^S_a, with S = 1, 2. Standard errors are given in parentheses.
b Heckman, Lochner and Taber (1999) do not report the standard errors for females.
c Initial human capital for a person of the given ability quantile, using ability levels from the NLSY.

capital, since in an efficiency units model different amounts of human capital represent different amounts of the same skill. Only if agents at different skill levels invest differently in response to common aggregative shocks will means and variances of measured earnings vary across skill groups. [Heckman, Lochner and Taber (1998) present empirical evidence on this issue.] Second, in smooth Ben-Porath problems where F is strongly concave in I, the proportion of time spent investing, I(a), gradually declines from 1 in the post-school period. Thus the model predicts that earnings do not jump after the end of schooling but gradually increase from 0, a phenomenon not actually observed in the data. For all of these reasons, the Ben-Porath model is not consistent with the available evidence on life-cycle labor earnings. Heckman, Lochner and Taber (1998) extend the model, decouple the schooling decision from on-the-job training investment, and present a model in which earnings jump to a substantial positive number upon completion of schooling. An additional problem with the Ben-Porath model is that it makes no distinction between human capital as an input that facilitates subsequent learning and human capital as a direct productive service used to produce market goods. At an intuitive level, schooling serves both purposes, but the human capital acquired on the job affects subsequent learning differently than the general human capital acquired at school. We next turn to the framework of Heckman, Lochner and Taber (1998) that solves these problems with the Ben-Porath model and enables them to develop an empirically
concordant model of earnings that provides a framework for analyzing labor markets with heterogeneous human capital and comparative advantage.
2.2.2. The HLT model of earnings, schooling and on-the-job training

HLT extend the Ben-Porath framework in several ways to account for the central facts of modern labor markets. (1) It distinguishes between schooling capital and job-training capital at a given schooling level. Schooling capital is an input to the production of capital acquired on the job, but the tight link between schooling and on-the-job training investments embodied in Equation (2.10) is broken. Earnings can now jump after completion of schooling instead of gradually increasing from zero, as occurs when the two types of human capital are assumed to be the same. (2) Among persons of the same schooling level, there is heterogeneity both in initial stocks of human capital and in the ability to produce job-specific human capital. (3) Skills produced at different schooling levels command different prices, and wage inequality among persons is generated by differences in skill levels, differences in investment, and differences in the prices of alternative skill bundles. (4) There is a labor-leisure choice in addition to a labor-investment choice^34. (5) The model is embedded in a general equilibrium setting so that the relationship between capital markets and human capital markets at different skill levels is explicitly developed. All of these extensions recognize heterogeneity in skills across persons and emphasize choices at both the intensive and extensive margins. These features of their work undermine the representative agent paradigm. We now present these extensions in greater depth.
2.3. Structure of the model

Assuming perfect certainty, we first derive the optimal consumption, on-the-job investment, and schooling choices for a given individual of type θ who takes skill prices as given. We then aggregate the model to produce a general equilibrium model. Throughout this section we simplify the tax code and assume that income taxes are proportional^35. Retirement is mandatory. In the first portion of the life cycle, a prospective student decides whether or not to remain in school. Once he has left school, he cannot return. He chooses the schooling option that gives him the highest level of lifetime utility. Define $K^S_{a,t}$ as the stock of physical capital held at time t by a person of age a and schooling level S; $H^S_{a,t}$ is the stock of human capital of type S at age a at time t. The optimal life-cycle problem can be solved in two stages. First, condition on schooling and solve for the optimal path of consumption ($C^S_{a,t}$), leisure ($L^S_{a,t}$) and post-school
^34 Heckman (1975, 1976), Blinder and Weiss (1976) and Ryder, Stafford and Stephan (1976) extend the Ben-Porath model to include labor supply.
^35 Heckman, Lochner and Taber (1998) relax this assumption.
investment ($I^S_{a,t}$) for each schooling level, S. Total time is normalized to unity, so work time is $h^S_{a,t} = 1 - L^S_{a,t} - I^S_{a,t}$. Individuals then select among schooling levels to maximize lifetime welfare. Given S, an individual of age a at time t has the value function
$$V_{a,t}(H^S_{a,t}, K^S_{a,t}, S) = \max_{C^S_{a,t},\,L^S_{a,t},\,I^S_{a,t}} U(C^S_{a,t}, L^S_{a,t}) + \delta V_{a+1,t+1}(H^S_{a+1,t+1}, K^S_{a+1,t+1}, S), \qquad (2.12)$$

where U is strictly concave and increasing and δ is a time-preference discount factor. This function is maximized subject to the budget constraint

$$K^S_{a+1,t+1} \leq K^S_{a,t}\left[1 + (1-\tau)r_t\right] + (1-\tau)R^S_t H^S_{a,t}(1 - I^S_{a,t} - L^S_{a,t}) - C^S_{a,t}, \qquad (2.13)$$
where τ is the proportional tax rate on capital and labor earnings, $R^S_t$ is the rental rate on human capital of type S, and $r_t$ is the net return on physical capital at time t. We abstract from all activities of government except taxation. On-the-job human capital for a person of schooling level S accumulates through the human capital production function

$$H^S_{a+1,t+1} = A^S(\theta)\,(I^S_{a,t})^{\alpha_S}(H^S_{a,t})^{\beta_S} + (1-\sigma_S)H^S_{a,t}, \qquad (2.14)$$
where the conditions $0 < \alpha_S < 1$ and $0 \leq \beta_S \leq 1$ guarantee that the problem is concave in the control variable, and $\sigma_S$ is the rate of depreciation of job-S-specific human capital. This functional form is widely used in both the empirical literature and the literature on human capital accumulation. (See the survey of estimates in Tables 2.3 and 2.4.) For simplicity, we ignore the input of purchased goods into the production of human capital on the job. For an analysis of post-school investment this is not restrictive, since we can always introduce goods and solve them out as a function of $I^S_{a,t}$, thereby reinterpreting $I^S_{a,t}$ as a goods-time investment composite. HLT explicitly allow for tuition costs of college, which we denote by $D^S_t$. The same good that is used to produce capital and final output is used to produce schooling human capital. After completion of schooling, time is allocated to two activities: on-the-job investment, $I^S_{a,t}$, and work, $h^S_{a,t} = 1 - I^S_{a,t} - L^S_{a,t}$, both of which must be non-negative. The agent solves a life-cycle optimization problem given initial stocks of human and physical capital, $H^S(\theta)$ and $K_0$, as well as an ability parameter that governs the production of human capital on the job, $A^S(\theta)$. $H^S(\theta)$ and $A^S(\theta)$ represent the ability to "earn" and the ability to "learn", respectively, measured after completing school. They embody the contribution of schooling to subsequent learning and earning in the schooling-level-S-specific skills as well as any initial endowments. HLT, Auerbach and Kotlikoff (1987) and Fullerton and Rogers (1993) abstract from short-run credit constraints that are often featured in the literature on schooling and human capital accumulation. Their models are consistent with the
evidence presented in Cameron and Heckman (1998a,b) that long-run family factors correlated with income (the θ operating through $A^S(\theta)$ and $H^S(\theta)$) affect schooling, but that short-term credit constraints are not empirically important. Such long-run factors account for the empirically well-known correlation between schooling attainment and family income. The mechanism generating the income-schooling relationship is through family-acquired human capital and not credit rationing. The α and β are also permitted to be S-specific, which emphasizes that schooling affects the process of learning on the job in a variety of ways. Conditional on the choice of schooling, the following first-order conditions govern the model:

$$U_{C^S_{a,t}} = \delta\,\frac{\partial V_{a+1,t+1}}{\partial K_{a+1,t+1}}, \qquad (2.15)$$

$$U_{L^S_{a,t}} \geq \delta\,\frac{\partial V_{a+1,t+1}}{\partial K_{a+1,t+1}}\,(1-\tau)R^S_t H^S_{a,t}. \qquad (2.16)$$

A strict inequality implies no work, consistent with models of retirement and labor force withdrawal;

$$\delta\,\frac{\partial V_{a+1,t+1}}{\partial H_{a+1,t+1}}\,A^S(\theta)\,\alpha_S\,(I^S_{a,t})^{\alpha_S-1}(H^S_{a,t})^{\beta_S} = \delta\,\frac{\partial V_{a+1,t+1}}{\partial K_{a+1,t+1}}\,(1-\tau)R^S_t H^S_{a,t} \qquad (2.17)$$

(marginal return to investment time equals marginal cost);

$$\frac{\partial V_{a,t}}{\partial K_{a,t}} = \delta\,\frac{\partial V_{a+1,t+1}}{\partial K_{a+1,t+1}}\left[1 + r_t(1-\tau)\right] \qquad (2.18)$$

(intertemporal arbitrage in returns on physical capital);

$$\frac{\partial V_{a,t}}{\partial H_{a,t}} = \delta\,\frac{\partial V_{a+1,t+1}}{\partial K_{a+1,t+1}}\,R^S_t(1 - I^S_{a,t} - L^S_{a,t})(1-\tau) + \delta\,\frac{\partial V_{a+1,t+1}}{\partial H_{a+1,t+1}}\left[A^S(\theta)\,\beta_S\,(I^S_{a,t})^{\alpha_S}(H^S_{a,t})^{\beta_S-1} + (1-\sigma_S)\right] \qquad (2.19)$$

(the marginal value of human capital is the return to current and future earnings). At the end of the endogenously determined working life, the final term, which is the contribution of human capital to earnings, has zero marginal value. At the beginning of life, agents choose the value of S that maximizes lifetime utility:

$$\hat{S} = \operatorname*{Argmax}_S\left[V^S(\theta) - \varepsilon_S\right], \qquad (2.20)$$
where $V^S(\theta)$ is now the value of schooling at level S inclusive of $D^S$, the discounted direct cost of schooling, and $\varepsilon_S$ represents non-pecuniary costs. Discounting of
$V^S$ is back to the beginning of life to account for different ages of completing school. Tuition costs are permitted to change over time so that different cohorts face different environments for schooling costs. Given optimal investment in physical capital, schooling, investment in job-specific human capital, and consumption, one can compute the path of savings. For a given return on capital and rental rates on human capital, the solution to the S-specific optimization problem is unique given concavity of the production function (2.14) in terms of $I^S_{a,t}$ ($0 < \alpha_S < 1$), given the restriction that human capital is self-productive, but not too strongly ($0 \leq \beta_S \leq 1$), given that investment is in the unit interval ($0 \leq I^S_{a,t} \leq 1$), and given concavity of U in terms of C and L^36. The choice of S is unique almost surely if $\varepsilon_S$ is a continuous random variable. The dynamic problem is of split-endpoint form. The initial condition for human and physical capital is given, and optimality implies that investment is zero at the end of life. For any terminal value of $H^S$ and $K^S$, one can solve backward to the initial period and obtain the implied initial conditions^37. One can iterate until the simulated initial condition equals the prespecified value. The prices of skills and capital are determined as derivatives of an aggregate production function. In order to compute rental prices for capital and the different types of human capital, it is necessary to construct aggregates of each of the skills. At the micro level, agents specialize in choosing one skill among a variety of possible skills. At the macro level, different skills associated with different schooling levels have different effects on aggregate output. They are not perfect substitutes. Given the solution to the individual's problem for each value of θ and each path of prices, one can use the distribution of θ, G(θ), to construct aggregates of human and physical capital.

The population at any time is composed of ā overlapping generations, each with an identical ex ante distribution of heterogeneity, G(θ). Human capital of type S is a perfect substitute for any other human capital of the same schooling type, whatever the age or experience level of the agent, but it is not perfectly substitutable with human capital from other schooling levels. Cohorts differ from each other only because they face different price paths and policy environments within their lifetimes. Assuming perfect foresight [as in Auerbach and Kotlikoff (1987)], let c index cohorts, and denote the date at which cohort c is born by $t_c$. Their first period of life is $t_c + 1$. Let $P_{t_c}$ be the vector of paths of rental prices of physical and human capital confronting cohort c over its lifetime from time $t_c + 1$ to $t_c + \bar{a}$. The rental rate on physical capital at time t is $r_t$. The rental rate on human capital is $R^S_t$. The choices made by individuals depend on the prices they face, $P_{t_c}$, their type, θ, and hence their endowment and their non-pecuniary costs of schooling, $\varepsilon_S$. Let $H^S_{a,t}(\theta, P_{t_c})$
^36 Heckman, Lochner and Taber (1998) use a present-value formulation in their model because consumption is separable from investment in their setup.
^37 One can solve this problem numerically using the method of "shooting" or the methods described by Santos (1999) or Judd (1998).
and $K^S_{a,t}(\theta, P_{t_c})$ be the amounts of human and physical capital possessed, respectively, and let $I^S_{a,t}(\theta, P_{t_c})$ be the time devoted to investment, by an individual with schooling level S, at age a, of type θ, in cohort c. By definition, the age at time t of a person born at time $t_c$ is $a = t - t_c$. Let $N^S(\theta, t_c)$ be the number of persons of type θ, in cohort c, of schooling level S. For simplicity suppose that retirement is mandatory at age $a_R$, although this is not strictly required. In this notation, the aggregate stock of employed human capital of type S at time t is cumulated over the non-retired cohorts in the economy at time t:

$$\bar{H}^S_t = \sum_{t_c = t - a_R}^{t-1} \int H^S_{t-t_c,t}(\theta, P_{t_c})\left[1 - I^S_{t-t_c,t}(\theta, P_{t_c}) - L^S_{t-t_c,t}(\theta, P_{t_c})\right] N^S(\theta, t_c)\,dG(\theta),$$

where $a = t - t_c$ and $S = 1, \ldots, \bar{S}$, where $\bar{S}$ is the maximum number of years of schooling. The aggregate potential stock of human capital of type S is obtained by setting $I^S_{t-t_c,t}(\theta, P_{t_c}) = 0$ and $L^S_{t-t_c,t}(\theta, P_{t_c}) = 0$ in the preceding expression:

$$\bar{H}^{S(\mathrm{potential})}_t = \sum_{t_c = t - a_R}^{t-1} \int H^S_{t-t_c,t}(\theta, P_{t_c})\,N^S(\theta, t_c)\,dG(\theta).$$

Allowing for human capital investment on the job produces a model of endogenous utilization rates. The aggregate capital stock is the capital held by persons of all ages:

$$\bar{K}_t = \sum_{t_c = t - a_R}^{t-1} \sum_{S=1}^{\bar{S}} \int K^S_{t-t_c,t}(\theta, P_{t_c})\,N^S(\theta, t_c)\,dG(\theta).$$
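To fix ideas, the aggregation step just described can be sketched numerically: discretize G(θ), posit stocks and policies by age and ability type, and cumulate over the non-retired cohorts. Everything below (cohort sizes, stocks, investment and leisure paths) is invented for illustration; in the model these objects come from the individual life-cycle solution.

```python
import numpy as np

# Illustrative sketch: aggregate employed human capital of one schooling
# type S over non-retired cohorts and discretized ability types theta.
# All inputs are hypothetical placeholders.
a_R = 40                               # working lifetime (mandatory retirement)
n_theta = 5                            # discretized ability types
dG = np.full(n_theta, 1.0 / n_theta)   # weights approximating G(theta)
N = np.full((a_R, n_theta), 100.0)     # cohort sizes N^S(theta, t_c)

# Hypothetical policies: H rises with age and ability; investment time I
# declines with age (as in Ben-Porath-type models); leisure L is flat.
ages = np.arange(a_R)
H = 1.0 + 0.05 * ages[:, None] + 0.2 * np.arange(n_theta)[None, :]
I = np.tile((0.5 - 0.012 * ages)[:, None], (1, n_theta))
L = np.full((a_R, n_theta), 0.3)

# Employed stock: sum over ages a = t - t_c and types of H * (1 - I - L) * N dG
H_employed = np.sum(H * (1.0 - I - L) * N * dG[None, :])
# Potential stock: shut down investment and leisure (I = L = 0)
H_potential = np.sum(H * N * dG[None, :])

utilization = H_employed / H_potential  # endogenous utilization rate
```

The ratio of the two aggregates is the utilization rate the text emphasizes; in equilibrium the employed aggregates feed into the production function to generate skill prices.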
2.3.1. Equilibrium conditions under perfect foresight

To close the model, it is necessary to specify the aggregate production function $F(\bar{H}^1_t, \ldots, \bar{H}^{\bar{S}}_t, \bar{K}_t, t)$, which is assumed to exhibit constant returns to scale. The equilibrium conditions require that marginal products equal pre-tax prices: $R^S_t = F_{\bar{H}^S}(\bar{H}^1_t, \ldots, \bar{H}^{\bar{S}}_t, \bar{K}_t, t)$, $S = 1, \ldots, \bar{S}$, and $r_t = F_{\bar{K}}(\bar{H}^1_t, \ldots, \bar{H}^{\bar{S}}_t, \bar{K}_t, t)$. For their two-skill economy, HLT specialize the production function to

$$F(\bar{H}^1_t, \bar{H}^2_t, \bar{K}_t) = a_3\left\{a_2\left[a_1(\bar{H}^1_t)^{v_1} + (1-a_1)(\bar{H}^2_t)^{v_1}\right]^{v_2/v_1} + (1-a_2)\bar{K}_t^{v_2}\right\}^{1/v_2}. \qquad (2.21)$$

This specification is general enough to accommodate all of the models used in the applied general equilibrium literature based on OLG models. When $v_1 = v_2 = 0$, the technology is Cobb-Douglas. Auerbach and Kotlikoff (1987) assume efficiency units, so different labor skills are perfect substitutes ($v_1 = 1$). In addition, they assume a Cobb-Douglas aggregate technology relating human capital and physical capital
($v_2 = 0$). When $v_2 = 0$, the model is consistent with the constancy of capital's share irrespective of the value of $v_1$. Heckman, Lochner and Taber (1998) estimate Equation (2.21) and report that $v_2 = 0$ but $v_1 = 0.306$, so the elasticity of substitution between skill groups, $1/(1-v_1)$, is 1.44, where $\bar{H}^1$ is the aggregate stock of high school human capital and $\bar{H}^2$ is the aggregate stock of college human capital. They document (a) that the two skills cannot be aggregated into a composite efficiency unit; (b) that the stock of skilled human capital is increasing over time, so that a growth steady state is factually implausible for modern economies; and (c) that even in principle a growth steady state does not exist because the elasticities of substitution among skill groups are not unity.
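The role of the nesting parameters can be checked numerically. The sketch below evaluates a nested CES technology of the form (2.21), using $v_1 = 0.306$ (HLT's reported estimate) but made-up values for $a_1, a_2, a_3$, $v_2$ and the input quantities; it verifies constant returns to scale via Euler's theorem and computes the implied elasticity of substitution between the two skills.

```python
# Sketch of the nested CES technology in Equation (2.21). v1 = 0.306 is the
# HLT estimate; a1, a2, a3, v2 and the input quantities are made-up values
# (v2 = 0 is the Cobb-Douglas limit, so a small positive v2 is used here).
a1, a2, a3 = 0.6, 0.7, 1.0
v1, v2 = 0.306, 0.1

def F(H1, H2, K):
    inner = a1 * H1**v1 + (1 - a1) * H2**v1          # CES skill aggregate
    return a3 * (a2 * inner**(v2 / v1) + (1 - a2) * K**v2)**(1 / v2)

H1, H2, K = 3.0, 2.0, 5.0
Y = F(H1, H2, K)

# Pre-tax skill prices are marginal products (central finite differences)
eps = 1e-6
R1 = (F(H1 + eps, H2, K) - F(H1 - eps, H2, K)) / (2 * eps)
R2 = (F(H1, H2 + eps, K) - F(H1, H2 - eps, K)) / (2 * eps)
r = (F(H1, H2, K + eps) - F(H1, H2, K - eps)) / (2 * eps)

# Constant returns to scale: Euler's theorem, Y = R1*H1 + R2*H2 + r*K
assert abs(R1 * H1 + R2 * H2 + r * K - Y) < 1e-4

# Elasticity of substitution between the two skills: 1/(1 - v1) ~ 1.44
sigma = 1.0 / (1.0 - v1)
```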
2.3.2. Linking the earnings function to prices and market aggregates

The earnings at time t of a person of type θ and age a from cohort c with human capital $H^S_{a,t}(\theta, P_{t_c})$ are

$$W^S(a, t, c) = R^S_t H^S_{a,t}(\theta, P_{t_c})\left[1 - I^S_{a,t}(\theta, P_{t_c}) - L^S_{a,t}(\theta, P_{t_c})\right]. \qquad (2.22)$$

They are determined by aggregate rental rates ($R^S_t$), individual endowments ($H^S_{a,t}(\theta, P_{t_c})$), individual investment decisions ($I^S_{a,t}(\theta, P_{t_c})$) and leisure decisions ($L^S_{a,t}(\theta, P_{t_c})$). The last three components depend on agent expectations of future prices. Different cohorts facing different price paths will invest differently and have different human capital stocks. This insight rationalizes the evidence on cohort effects in earnings reported by MaCurdy and Mroz (1995) and Beaudry and Green (1997), among others. An essential idea introduced in the HLT paper, which is absent from currently used specifications of earnings equations in labor economics, is that utilized skills, not potential skills, determine earnings^38. The utilization rate is an object of choice linked to personal investment and labor supply decisions, and it is affected both by individual endowments and by aggregate skill prices. As the quantity of aggregate skill changes, so do aggregate skill prices. This affects schooling decisions, investment decisions, labor supply decisions, measured wages, and savings decisions.
2.4. Determining the parameters of OLG models

This section discusses how to choose the parameters of the OLG model just presented and of various other versions of the model. Many of the problems that arise in determining the parameters of this model also arise in determining the parameters of the stochastic growth model. A Cobb-Douglas assumption for physical capital and a composite of labor is an appropriate description for the US economy, where the constancy of capital's share is a well-established empirical regularity. However, the evidence from Europe
^38 This idea is central to the Becker-Chiswick (1966) and Mincer (1974) models.
is much less clear cut. The elasticity of substitution among the various skill groups is not infinite (as in the efficiency-units model), nor is it one (as in Cobb-Douglas specifications). A major problem that arises in the homogeneous-preference form of this class of models is the inability to reproduce the aggregate capital-output ratio and the distribution of wealth holdings from a pure life-cycle model. See, for example, the discussion in Auerbach and Kotlikoff (1987), Hubbard, Skinner and Zeldes (1994) or Huggett (1996). As noted in Section 1, about half of physical wealth is held by the top 5% of families. While Huggett (1996) finds that the Gini coefficient for physical wealth is improved by accounting for earnings uncertainty, the resulting wealth distribution still misses on both tails. For instance, when the utility curvature parameter ρ = 1.5, the top 5% of the wealth distribution accounts for only about a third of total wealth (depending on the magnitude of the borrowing constraint), and the fraction of people with zero or negative wealth is too large. One appealing way to resolve this problem is to introduce preference heterogeneity beyond that induced by the overlapping generations. Persons with low rates of time preference or low risk preferences may account for the bulk of life-cycle saving^39. What is cause and what is effect, however, has not been sorted out in the literature. Do people endowed with low subjective discount rates simply choose to accumulate more wealth, or do subjective discount rates change as people become wealthier? The ideal data set for the purpose of estimating a general equilibrium model with heterogeneous skills would combine micro data on firms, data on the earnings of workers, their life-cycle consumption and wealth holdings, and macro data on prices and aggregates. With such data, one could estimate all the parameters of our model and the distribution of wages, wealth, and earnings.
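The wealth-concentration moments discussed here (the top-5% share and the Gini coefficient) are easy to compute from any simulated cross-section. The sketch below uses a lognormal draw purely as a stand-in for a model-generated wealth distribution; none of the numbers come from the text.

```python
import numpy as np

# Hypothetical wealth cross-section to illustrate the calibration targets
# discussed above (top-5% share, Gini coefficient). The lognormal draw is
# a stand-in, not the output of any model in the chapter.
rng = np.random.default_rng(42)
wealth = np.sort(rng.lognormal(mean=0.0, sigma=1.5, size=100_000))

# Share of total wealth held by the top 5% of the (sorted) distribution
top5_share = wealth[int(0.95 * wealth.size):].sum() / wealth.sum()

# Gini coefficient computed from the sorted sample
n = wealth.size
gini = (2 * np.arange(1, n + 1) - n - 1) @ wealth / (n * wealth.sum())
```

A pure life-cycle model that fails on these two statistics at once (as the text notes) is missing concentration at the top and mass at zero simultaneously.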
Using the micro data joined with aggregate prices, one could estimate the parameters of the micro model. Using the estimated micro functions, it would be possible to construct aggregates of human capital that could be used in determining the output technology. The estimated aggregates should match measured empirical aggregates and, when inserted in the aggregate technology, should also reproduce the market prices used in estimation. Two practical obstacles prevent implementation of this approach. (1) Analysts typically do not have information on individual consumption linked to labor earnings and labor supply. (2) The data on market wages do not reveal skill prices, as is evident from the distinction between $R^S_t$ and $W^S(a, t, c)$ in Equation (2.22). Since prices cannot be directly equated with wages, it is apparently not possible to estimate aggregate stocks of human capital to use in determining the aggregate technology. These obstacles led Heckman, Lochner and Taber (1998) to propose and implement an alternative
^39 Other ingredients may also work. In a partial equilibrium context, Carroll (1992) claims that small probabilities of bad income shocks can better account for the wealth holdings of low-income consumers. Hubbard, Skinner and Zeldes (1994) offer another explanation based in part on uncertainty from multiple sources of income.
for assembling information from different data sets to check the consistency of the constructed model with the available micro and macro data. They require that the econometric procedures used to produce the micro-based parameters employed in their model, including the implicit assumptions made about the economic environment and expectations, recover the parameters estimated from synthetic micro data sets generated by the model. In estimating skill-specific human capital production functions, HLT demonstrate that it is necessary to account for heterogeneity in ability, in the technology required to produce skills, and in endowments.
3. Micro evidence

3.1. Introduction
This part of the chapter presents additional evidence from the microeconomic literature on the parameter values that are required to implement the dynamic general equilibrium (DGE) models analyzed in Sections 1 and 2. Several conceptually distinct labor supply and consumption demand elasticities are presented. We discuss the problem of research synthesis and issue a warning against uncritical use of the existing micro evidence in standard general equilibrium models. We also present further evidence on preference heterogeneity. We start by defining a variety of conceptually different elasticities that are frequently confused in both the micro and macro literatures.

3.2. Defining elasticities
When considering the estimation of parameters for ultimate use in a DGE model, it is important to keep track of exactly what is being held constant (the "conditioning variables") in the process of estimation, in order to ascertain whether the parameter being estimated corresponds to the parameter required in a general equilibrium model. We illustrate this point by considering a simple model of consumption and labor supply, but our discussion applies much more generally. We derive three frequently estimated elasticities, typically formulated for a model in which an agent sells labor on a spot market without any transactions costs or fixed costs of employment. Although these elasticities are often referred to by the same name, they correspond to different choices of conditioning variables and hence to distinct conceptual experiments. We then consider aggregate labor supply response measures that account for heterogeneity and dichotomous work-no work decisions. In later sections we present parameter estimates of these elasticities, taking care to specify which elasticity is being estimated. Suppose that in a given period t a person chooses current non-durable consumption $c_t$ and hours of market work $h_t = T - l_t$. Preferences are intertemporally additive but the within-period utility function $U(c_t, h_t)$ is not. In what follows we separate the
within-time-period decision of how to allocate consumption and leisure, given current-period net expenditures, from the intertemporal savings decision. This leads us to define "total net expenditure" or "net dissavings" $e_t = p_t c_t - w_t h_t$, where $p_t$ is the time-t price of consumption and $w_t$ is the time-t wage rate. These prices can be denominated in any convenient unit of account, including time-t dollars. Deciding how much to consume and how much to work for a given amount of net dissavings is one part of the decision problem confronting the consumer. In a model with a more fully specified security market, the link between net dissavings and the market opportunities for investment is clearer. As we discussed in Section 1, it is conventional in the RBC literature to consider a wide array of alternative security markets. Following Heckman (1974), the impact of the allocation of $e_t$ over time is conveniently determined by the life-cycle evolution of the shadow price $\lambda_t$ of net dissavings: the marginal utility of income. Changing the specification of the security market environment alters the evolution of $\{\lambda_t\}$. For instance, suppose that the consumer/investor has access to a one-period risk-free security with rate of return $r_{t+1}$. Then the marginal utility of income satisfies the stochastic difference equation

$$\lambda_t = \beta(1 + r_{t+1})\,E(\lambda_{t+1} \mid I_t), \qquad (3.1)$$

where β is the subjective discount factor, provided there is no binding borrowing constraint [MaCurdy (1983)]. Under the permanent income restriction that $\beta(1 + r_{t+1}) = 1$, this yields the familiar conclusion that the marginal utility of income is a martingale [MaCurdy (1978), Hall (1978)]. Including risky securities imposes further restrictions on the evolution of $\{\lambda_t\}$. When the rate of return is risky, Equation (3.1) becomes

$$\lambda_t = \beta\,E\left[(1 + \tilde{r}_{t+1})\lambda_{t+1} \mid I_t\right].$$

Including additional risky securities, we obtain

$$\lambda_t = \beta\,E\left[(1 + \tilde{r}^{\,j}_{t+1})\lambda_{t+1} \mid I_t\right]$$

for $j = 1, \ldots, J$, where J is the number of securities. The presence of short-sale constraints may convert these arbitrage equalities into strict inequalities. In the limiting complete-market case, the ratio $\beta(\lambda_{t+1}/\lambda_t)$ is the same for all consumers and equals the market stochastic discount factor for pricing single-period securities described in Equation (1.7) in Section 1 [Altug and Miller (1990, 1998)]. We next consider Frisch demand functions that condition on the shadow price $\lambda_t$. While a multiplier $\lambda_t$ can be defined for all environments, its interpretation is more complicated in environments with borrowing or short-sale constraints.

3.2.1. Frisch demands
Assuming interior solutions for consumption and hours ($c_t > 0$ and $T > h_t > 0$), and Equation (3.1) or some generalization consistent with no corner solutions in
intertemporal financial transfers, the household's optimal consumption and hours of work satisfy two first-order conditions:

$$U_c(c_t, h_t) = \lambda_t p_t, \qquad (3.2)$$

$$U_h(c_t, h_t) = -\lambda_t w_t. \qquad (3.3)$$

If the utility function $U(c_t, T - l_t)$ is strictly concave in consumption and leisure, we can invert Equations (3.2) and (3.3) to give the Frisch (or λ-constant) consumption and labor supply functions (where we now drop the t subscripts):

$$c = c(p, w, \lambda), \qquad (3.4)$$

$$h = h(p, w, \lambda). \qquad (3.5)$$
From the integrability conditions, these functions are homogeneous of degree zero in the price, the wage and the inverse of λ; symmetric ($c_w = -h_p$); and satisfy negativity (which implies $c_p < 0$ and $h_w > 0$). The parameters of c(·) and h(·) have been the subject of intensive empirical investigation, and we review this empirical literature in Sections 3.3 and 3.4. A common assumption in both the empirical literature in microeconomics and in the macroeconomic literature is that consumption and labor supply are additively separable within the period; this is equivalent to assuming that $c_w = h_p = 0$. We do not invoke this assumption here, and we note below that the micro evidence speaks against it. The case in which an individual or household chooses not to work, $h = 0$, is also of considerable interest. In the absence of fixed costs of entry and exit, a person chooses not to work if

$$U_c(c, 0) = \lambda p, \qquad U_h(c, 0) \leq -\lambda w,$$

that is, if the reservation value of leisure at zero hours of work is greater than the market wage. We say more about this case in Section 3.2.4. Associated with the Frisch consumption and labor supply functions are the Frisch (or λ-constant) price and wage elasticities:

$$\varphi(p, w, \lambda) = \frac{\partial \ln c}{\partial \ln p} = c_p(p, w, \lambda)\,\frac{p}{c}, \qquad \theta(p, w, \lambda) = \frac{\partial \ln h}{\partial \ln w} = h_w(p, w, \lambda)\,\frac{w}{h}. \qquad (3.6)$$

These elasticities consider changes in demands and supplies for a particular good when its own price is changed but other prices are held constant. For instance, θ, the Frisch (or λ-constant) elasticity of labor with respect to the nominal wage, holds the nominal price of consumption constant and hence also measures the hours response to
increases in the real wage. As we will see, the conditioning on λ gives both elasticities an intertemporal character. To construct the intertemporal elasticity for consumption, we consider

$$\eta(p, w, \lambda) = \frac{\partial \ln c}{\partial \ln \lambda} = c_\lambda(p, w, \lambda)\,\frac{\lambda}{c}.$$

There is an obvious counterpart for labor supply. Assuming no binding borrowing or short-sale constraints, the change in λ can be thought of as arising from a change in the interest rate or any other factor that alters λ through the forward-looking relation (3.1). In contrast to η, φ is defined for a change that holds λ fixed. For this reason, φ is sometimes referred to as an intertemporal elasticity of substitution for consumption. When the utility function U is additively separable between consumption and hours, φ and η coincide, but in general they do not. Instead, by the homogeneity of degree zero of the Frisch demand functions, we have the relation

$$c_p(p, w, \lambda)\,\frac{p}{c} + c_w(p, w, \lambda)\,\frac{w}{c} = c_\lambda(p, w, \lambda)\,\frac{\lambda}{c} = \eta(p, w, \lambda),$$

or

$$\varphi(p, w, \lambda) + c_w(p, w, \lambda)\,\frac{w}{c} = \eta(p, w, \lambda).$$

Thus, the intertemporal elasticity η can also be viewed as the consumption response when both the price and the wage change proportionately.

3.2.2. Other demand functions
Conditioning on the multiplier λ provides a convenient way of estimating the parameters required to implement many of the models discussed in Sections 1 and 2. To apply this strategy requires that there be no binding constraints on transferring resources over time. Within-period elasticities can be defined even if Euler inequalities rather than Euler equations characterize consumer behavior. Assuming equalities hold, the within-period elasticities can be derived by substituting total net expenditures $e = pc - wh$ for λ, by inverting the following expression:

$$e = p\,c(p, w, \lambda) - w\,h(p, w, \lambda) = \psi(p, w, \lambda) \;\Rightarrow\; \lambda = \mu(p, w, e). \qquad (3.7)$$

Substituting into the Frisch consumption and labor supply functions, we obtain the within-period uncompensated or Marshallian consumption and labor supply functions:

$$c = c[p, w, \mu(p, w, e)] = c^*(p, w, e), \qquad (3.8)$$

$$h = h[p, w, \mu(p, w, e)] = h^*(p, w, e). \qquad (3.9)$$
This derivation is more restrictive than necessary because it assumes that Frisch demands exist. An alternative derivation that stresses the more general nature of these
demand functions maximizes within-period utility subject to a period-specific budget constraint that may be augmented beyond (or below) current period earnings. Under this interpretation, e is the supplement to current earnings that governs within-period choices and e may be either exogenously or endogenously determined 4°. Under either interpretation, the parameters of c*(.) and h*(.) can be estimated using only cross-section data. However, they do not recover all the parameters of the original functions c (.) and h (.) except under very stringent assumptions. In particular, because they condition on the within-period allocation of expenditures, they are uninformative on interperiod substitution unless strong functional form assumptions are maintained. On the other hand, those parameters that are identified from Equations (3.8) and (3.9) are robust to misspecifying the asset market structure used to characterize the evolution of the marginal utility of income. For example, these relationships are valid whether or not the consumer is up against a borrowing constraint. A third labor supply function can be derived directly by equating the intratemporal marginal rate of substitution between hours and consumption to the real wage. By forming ratios of Equations (3.2) and (3.3), we eliminate the marginal utility of income ~t. Solving for labor supply in terms of the price of consumption, the wage and consumption gives: h = h** (p, w, c),
(3.10)
where now we condition labor supply responses on consumption Alternatively, we could solve for consumption by conditioning on the wage and hours. Since these demand functions are readily derived from the marginal rate of substitution first-order condition, they are known as m-demand/supply functions [see Browning (1998)]. They are valid whether or not the intertemporal Euler equations are binding. As noted in Section 1, these demand relations are also among those used by macroeconomists to calibrate parameters from steady-state relations using time series averages. The two conditional labor supply wage elasticities associated with Equations (3.10) and (3.9) generally differ from each other and from the corresponding Frisch elasticity given in Equation (3.6). Assuming that there are no binding constraints connecting transfers among periods, the three demand functions are connected by the identities h ( p , w, )0 =- h* [p, w, ~p(p, w, ~)] = h** [p, w, c ( p , w, ~)]
(3.11)
so that the wage effects are related by differentiating with respect to w:

h_w(p, w, λ) = h*_w(p, w, e) + h*_e(p, w, e) ψ_w(p, w, λ)
             = h**_w(p, w, c) + h**_c(p, w, c) c_w(p, w, λ).
(3.12)
40 Sometimes, analysts condition on full income (e + wT) and not e. Assuming normality of leisure, the wage effect holding e + wT constant is greater than or equal to the wage effect holding e constant, since to hold e + wT fixed while w goes up one must reduce e. Full income-constant (or Becker) labor supply functions overstate the Marshallian (or e-constant) labor supply responses.
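Footnote 40's ordering of the Becker and Marshallian wage effects can be checked with a small Python sketch. The within-period Cobb-Douglas preferences and all parameter values below are illustrative assumptions, not the chapter's specification:

```python
# Within-period Cobb-Douglas preferences over consumption and leisure
# (an illustrative assumption): u = alpha*ln(c) + (1-alpha)*ln(T-h),
# budget p*c = w*h + e, full income M = e + w*T.
# Interior solution: h = alpha*T - (1-alpha)*e/w.

alpha, T, e, w = 0.5, 1.0, 0.3, 1.0

def hours(w, e, alpha=alpha, T=T):
    """Marshallian labor supply holding the supplement e fixed."""
    return alpha * T - (1.0 - alpha) * e / w

dw = 1e-6

# Wage effect holding e fixed (Marshallian):
marshall = (hours(w + dw, e) - hours(w, e)) / dw

# Wage effect holding full income M = e + w*T fixed (Becker):
# raising w while keeping M fixed means lowering e by T*dw.
M = e + w * T
becker = (hours(w + dw, M - (w + dw) * T) - hours(w, e)) / dw

print(marshall, becker)
# With leisure normal, the Becker response exceeds the Marshallian one.
assert becker >= marshall
```

Here the Marshallian effect is (1 − α)e/w² while the Becker effect is (1 − α)(e + wT)/w², so the gap is exactly the (1 − α)T/w term induced by holding full income rather than the supplement constant.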
Ch. 8: Micro Data and General Equilibrium Models
The first equation states that the Frisch (or λ-constant) wage response for labor supply equals the within-period wage response holding the resource flow constant plus an intertemporal net savings response that accounts for how savings are altered by the wage change. The second equation shows that the Frisch wage effect can be decomposed into a within-period effect of wages on hours holding current consumption constant plus an intertemporal response of consumption to wages. From cross-section data on consumption expenditures and labor supply we can estimate h*(p, w, e) and h**(p, w, c), assuming either exogeneity of the conditioning variables or access to valid instruments for the endogenous regressors. Each demand equation can be used to bound the intertemporal Frisch response h_w(p, w, λ). More precisely, from Equation (3.7) we have
ψ_w(p, w, λ) = p c_w(p, w, λ) − w h_w(p, w, λ) − h.

The sign of ψ_w(·), the effect of wages on borrowing, or dissaving, is ambiguous if, as is widely assumed, consumption and labor supply are Frisch complements (so c_w(·) > 0). However, it is unlikely that the cross effect on the right hand side will outweigh the other two terms. In a period of high wages it seems plausible that savings increase, rather than decrease or remain constant. Thus ψ_w < 0. In this case, h_w(·) ≥ h*_w(·) since h*_e(·) is negative if leisure is a normal good. The m-supply response h**_w(p, w, c) gives an upper bound for the Frisch response h_w(·) if leisure and consumption are normal (so that h**_c(·) < 0) and consumption and labor supply are complements (so c_w(·) > 0). Thus h*_w ≤ h_w ≤ h**_w, so we can bound h_w from the cross-sectional relationships. Obviously, if consumption and leisure are additively separable within periods, the m-supply response function is the Frisch response. In general, we must take care to specify exactly what is being held constant (the marginal utility of income, total net expenditure or consumption) in the empirical study we examine when we use an estimated wage elasticity. The empirical literature presents estimates of all of these elasticities and more, as we discuss below, and often does not distinguish among them.
3.2.3. An example

To make the discussion of Section 3.2.2 more concrete, consider the following utility function which, as we have previously noted, is sometimes used in DGE models:
U(c, h) = {[c^α (T − h)^(1−α)]^(1−ρ) − 1} / (1 − ρ),   0 < α < 1,   ρ > 0,   (3.13)
where T is the time available for work. The conditions on the admissible parameter values are produced by monotonicity and strict concavity. The associated Frisch consumption and labor supply functions are:

ln c = α_c + β_c ln p + γ_c ln w + (β_c + γ_c) ln λ,
(3.14)
M. Browning et al.
ln l = ln(T − h) = α_h + β_h ln p + γ_h ln w + (β_h + γ_h) ln λ,
(3.15)
where

β_c = (αρ − α − ρ)/ρ,   γ_c = −(1 − α)(1 − ρ)/ρ,
β_h = −α(1 − ρ)/ρ,   γ_h = [α(1 − ρ) − 1]/ρ.
The monotonicity and concavity conditions on the utility function imply that β_c and γ_h are both negative. Note as well that

(β_c + γ_c) = (β_h + γ_h) = −1/ρ < 0,
so that both consumption and leisure are normal goods (that is, increases in lifetime wealth lead to decreases in the marginal utility of income and consequent increases in both consumption and leisure). When ρ = 1, the utility function U is additively separable in consumption and leisure:

U(c, h) = α ln(c) + (1 − α) ln(T − h),

and as a consequence, γ_c = β_h = 0 and γ_h = β_c = −1. The Frisch labor supply wage elasticity is given by:
∂ ln h / ∂ ln w = −γ_h (T − h)/h.   (3.16)
The Frisch elasticity for consumption, holding the wage and the marginal utility of income constant, is given by β_c; and the elasticity of intertemporal substitution is given by −1/ρ, which is the coefficient on ln λ in Equation (3.14). This latter elasticity can alternatively be viewed as the elasticity of intertemporal substitution holding the real wage constant. This may be seen by rewriting the consumption function as

ln c = α_c + (β_c + γ_c) ln p + γ_c ln(w/p) + (β_c + γ_c) ln λ.   (3.17)

Both ln p and ln λ now have a common coefficient:

β_c + γ_c = −1/ρ,

which shows that the two elasticities are the same.
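The coefficient formulas behind Equations (3.14) and (3.15) are easy to verify numerically. A minimal Python sketch (the parameter values are arbitrary illustrations):

```python
# Frisch demand coefficients for the utility function
# U = ([c^alpha * (T-h)^(1-alpha)]^(1-rho) - 1)/(1-rho),
# obtained from the log-linear first-order conditions.

def coeffs(alpha, rho):
    beta_c = (alpha * rho - alpha - rho) / rho
    gamma_c = -(1 - alpha) * (1 - rho) / rho
    beta_h = -alpha * (1 - rho) / rho
    gamma_h = (alpha * (1 - rho) - 1) / rho
    return beta_c, gamma_c, beta_h, gamma_h

alpha, rho = 0.4, 2.5
bc, gc, bh, gh = coeffs(alpha, rho)

# Both sums equal -1/rho, the common coefficient on ln(lambda) and,
# after the (3.17) rewriting, on ln(p).
assert abs((bc + gc) - (-1 / rho)) < 1e-12
assert abs((bh + gh) - (-1 / rho)) < 1e-12

# rho = 1 (additively separable log utility): gamma_c = beta_h = 0,
# gamma_h = beta_c = -1, as stated in the text.
bc1, gc1, bh1, gh1 = coeffs(alpha, 1.0)
assert (gc1, bh1, bc1, gh1) == (0.0, 0.0, -1.0, -1.0)
```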
The consumption elasticities β_c and β_c + γ_c result from two different conceptual experiments, as reflected by their different conditioning variables. They coincide only in very special cases. When the consumer does not care about non-work time (leisure) (α = 1), only the consumption equation (3.14) is relevant and γ_c = 0. In this case both consumption elasticities coincide and are given by −1/ρ. This is the usual definition for an iso-elastic consumption utility function. Alternatively, when U(c, h) is additively separable (ρ = 1), γ_c is again zero and β_c = −1. When consumption and labor supply are not additively separable within the period, we have two distinct Frisch intertemporal substitution elasticities for consumption: one holding the current wage constant (β_c) and one holding the real wage constant (β_c + γ_c). The utility function given in Equation (3.13) is very restrictive since it confounds intertemporal substitution possibilities - a high value for ρ implies a low propensity to substitute across time - and within-period substitution possibilities - the cross elasticities (β_h and γ_c) are positive if and only if ρ > 1. When ρ > 1, market work and consumption are (Frisch) complements; that is, a rise in the wage leads to an increase in both consumption and labor supply; market goods substitute for the home production foregone when someone works. As discussed in the previous subsection, we can also derive within-period labor supply functions that condition on either net dissaving or consumption. For the utility function (3.13), the e-constant demand function is not very illuminating, but the c-constant function is given by
ln(T − h) = ln((1 − α)/α) + ln(c) − ln(w/p).   (3.18)
This equation clearly demonstrates that we cannot generally identify the preference parameters for the intertemporal allocation from within-period information. Indeed, in this case the parameter ρ cannot be identified from the c-constant function, and thus none of the parameters of the Frisch consumption and labor supply functions can be deduced. While the parameter α can be inferred from this intratemporal relation, this particular choice of functional form implicitly imposes the requirement that, at a fixed real wage, consumption and leisure move together - a prediction that is at odds with the evidence surveyed in Section 3.3 41.
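The bounds h*_w ≤ h_w ≤ h**_w of Section 3.2.2 can be checked numerically for the utility function (3.13). The sketch below solves the log-linear Frisch first-order conditions with α = 0.5 and ρ = 2 (so ρ > 1 and consumption and labor are Frisch complements); all parameter values are illustrative assumptions:

```python
import math

# Check h*_w <= h_w <= h**_w for U(c,h) in (3.13), alpha=0.5, rho=2.
alpha, rho, T, p, lam = 0.5, 2.0, 1.0, 1.0, 1.0

def frisch(w):
    """Solve the two log-linear first-order conditions for (c, h)."""
    a11 = alpha * (1 - rho) - 1          # elasticity of U_c w.r.t. c
    a12 = (1 - alpha) * (1 - rho)        # elasticity of U_c w.r.t. leisure
    a21 = alpha * (1 - rho)              # elasticity of -U_h w.r.t. c
    a22 = (1 - alpha) * (1 - rho) - 1    # elasticity of -U_h w.r.t. leisure
    b1 = math.log(lam * p / alpha)       # from U_c = lambda * p
    b2 = math.log(lam * w / (1 - alpha)) # from -U_h = lambda * w
    det = a11 * a22 - a12 * a21          # equals rho
    lnc = (a22 * b1 - a12 * b2) / det
    lnL = (-a21 * b1 + a11 * b2) / det
    return math.exp(lnc), T - math.exp(lnL)

w, dw = 1.0, 1e-6
c0, h0 = frisch(w)
e0 = p * c0 - w * h0                     # net dissaving at the optimum

# Frisch wage response (marginal utility of income held fixed):
h_w = (frisch(w + dw)[1] - h0) / dw

# e-constant response: within-period Cobb-Douglas gives
# h = alpha*T - (1-alpha)*e/w, so h*_w = (1-alpha)*e/w^2.
hstar_w = (1 - alpha) * e0 / w ** 2

# c-constant (m-supply) response from (3.18):
# h = T - (1-alpha)*p*c/(alpha*w), so h**_w = (1-alpha)*p*c/(alpha*w^2).
hstarstar_w = (1 - alpha) * p * c0 / (alpha * w ** 2)

print(hstar_w, h_w, hstarstar_w)
assert hstar_w <= h_w <= hstarstar_w
```

With these values the consumer dissaves (e > 0), savings fall with the wage at fixed λ, and the Frisch response lands strictly between the e-constant and c-constant responses, as the bounding argument predicts.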
3.2.4. The life-cycle participation decision

The discussion so far assumes that solutions for consumption and leisure are interior. Such an assumption is congenial to the representative agent approach to macroeconomics but is grossly inconsistent with the microeconomic evidence on labor supply summarized by Pencavel (1986) and Killingsworth and Heckman (1986). Summarizing the Ph.D. research of Coleman (1984), Heckman (1984) noted that even
41 In Section 1 [Equation (1.4)] we described an extension of this functional form that macroeconomists sometimes use because it still accommodates steady state calibration.
for prime age males, variations in employment contribute about 50% of the total variation in person hours over the business cycle. A central finding of the modern labor supply literature summarized in Heckman (1978, 1993) and Blundell and MaCurdy (1999) is that most of the curvature in the labor supply-wage relationship comes from choices at the extensive (entry-exit) margin. Accounting for entry and exit decisions forces analysts to introduce heterogeneity among agents. It is implausible that all agents either work or do not work in a given period. Some mechanism must be introduced to account for why some agents work while others do not. In the macro literature building on Rogerson (1988), an assumption of fixed costs of work or some other source of non-convexity is introduced so that it is optimal for individual agents to work either full time or not at all. The mechanism used to allocate work across people is a lottery embedded in a complete contingent claims market. [See Rogerson (1988) or the survey in Hansen and Prescott (1995).] Ex ante identical persons get different draws from the lottery. Draws are independent over time. Under additive separability in consumption and leisure, the winners of the lottery are those denied work; they get the same consumption bundle as workers but enjoy more leisure than workers. Rogerson (1988) shows how lotteries can be priced in a complete market RBC model of the sort considered in Section 1 42. Under conditions presented in his paper, a decentralized mechanism exists and a competitive equilibrium can be supported by the pricing system. This description of the employment allocation mechanism strains credibility and is at odds with the micro evidence on individual employment histories.
Heckman and Willis (1977), Heckman (1982) and Clark and Summers (1979) document that employment indicator variables for persons are highly correlated over time - contrary to the Bernoulli assumption implicit in Rogerson (1988) and Hansen and Prescott (1995). Over long stretches of time, some people work all of the time while others never work. Heckman (1982) explicitly tests and rejects the Bernoulli assumption. This persistence in employment status remains even after controlling for commonly observed characteristics such as education, age and work experience. Persistence in the employment or non-employment state is a central feature of the micro data. At a minimum, individual-specific lotteries with outcomes strongly correlated over time are required to account for the micro data on employment histories, complicating the analysis of equilibrium lottery pricing. The heterogeneity in employment experience among persons of the same apparent demographic and productivity characteristics suggests that problems with adverse selection and moral hazard are likely to render competitive lotteries of the sort analyzed by Rogerson (1988) infeasible. The micro literature explains cross-section heterogeneity in employment experiences, and the persistence of employment status over time, by introducing temporally invariant person-specific unobserved heterogeneity. The evidence in Heckman and
42 Hansen and Prescott (1995) extend Rogerson's model to account for non-separabilities in consumption and leisure. This produces leisure-dependent consumption allocations.
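The contrast between an i.i.d. employment lottery and the persistence found in the micro data is easy to see in a small simulation; the participation probabilities and the two-type heterogeneity scheme below are invented for illustration:

```python
import random

random.seed(0)
N_PERSONS, N_PERIODS, P_WORK = 5000, 10, 0.6

def autocorr(histories):
    """First-order autocorrelation of the pooled employment indicator."""
    x, y = [], []
    for h in histories:
        x.extend(h[:-1])
        y.extend(h[1:])
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    vx = sum((a - mx) ** 2 for a in x) / len(x)
    vy = sum((b - my) ** 2 for b in y) / len(y)
    return cov / (vx * vy) ** 0.5

# (i) Bernoulli lottery: draws independent across periods, as in the
# Rogerson-style allocation mechanism.
iid = [[int(random.random() < P_WORK) for _ in range(N_PERIODS)]
       for _ in range(N_PERSONS)]

# (ii) Person-specific invariant heterogeneity e: each person's
# employment probability is fixed over time (0.9 vs 0.1 are invented).
types = [0.9 if random.random() < P_WORK else 0.1 for _ in range(N_PERSONS)]
het = [[int(random.random() < pi) for _ in range(N_PERIODS)] for pi in types]

print(autocorr(iid), autocorr(het))  # near zero vs strongly positive
```

With independent draws the pooled autocorrelation of the employment indicator is near zero, while the invariant person-specific heterogeneity generates strong positive serial correlation of the kind documented by Heckman and Willis (1977) and Heckman (1982).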
Willis (1977) and Heckman (1982) indicates that such invariant components account for much of the persistence in employment status over time. Invariant unobservables can be introduced into all three types of demand functions analyzed in Section 3.2.2. Within the Frisch framework, if we abstract from any fixed costs of work or other non-convexities, a person will not work in period t if
U_c(c_t, 0) = λ_t p_t   (3.19)

and

U_h(c_t, 0) ≤ −λ_t w_t.   (3.20)
We can solve out for the virtual wage, w*_t, that makes inequality (3.20) an exact equality at zero hours of work and use w*_t in the Frisch consumption demands. Thus, unless there is contemporaneous additive separability, estimation of the consumption demand equation will depend on virtual wages, which equal actual wages if a person works. Similarly, in the absence of contemporaneous additive separability, the employment decision will depend on p_t as well as on λ_t and w_t. To see this, solve the first-order condition for consumption (3.19) for c_t(λ_t, p_t; h_t = 0) and substitute into inequality (3.20). This produces an equation characterizing life-cycle employment that depends on p_t, among other factors. The employment decision is discontinuous in terms of w_t: below or at w*_t, persons will not work; above w*_t, persons work at age t. Introduce a person-specific time-invariant random variable e to account for heterogeneity in the population. This is unobserved by the econometrician. Instead the econometrician observes a vector X of demographic characteristics that partially predict e. Associated with this random variable is a distribution of types in the population. Let d_t = 1 if a person is employed at time t; d_t = 0 otherwise. Accounting for heterogeneity, e, which for simplicity is assumed to be scalar, we may write this inequality as
d_t = 0 if U_h(c_t(λ_t(e), p_t; h_t = 0), 0; e) ≤ −λ_t(e) w_t,   d_t = 1 otherwise,

or more succinctly

d_t = 1[U_h(c_t(λ_t(e), p_t; h_t = 0), 0; e) > −λ_t(e) w_t],

where 1(·) is the indicator function and where we note that λ_t(e) does not depend on w_t if the inequality is strict 43. However, standard results in consumer theory demonstrate
43 The assumption that e is scalar is only a simplifying device. It is not strictly required. Note that λ_t(e) is a function of e and initial assets as well as prices and wages in periods where persons work. For simplicity of notation we only exhibit the dependence on e.
that λ_t(e) depends on e and current and future prices and wages for periods in which persons work and consume. The set of e values for which d_t = 0 at wage w_t is thus given by

e̲_t = {e | U_h[c_t(λ_t(e), p_t; h_t = 0), 0; e] ≤ −λ_t(e) w_t},

which depends implicitly on all prices and wages over the life cycle in periods outside of t in which the consumer works and consumes, as well as initial endowments. Let the boundary of the set e̲_t for persons with potential market wage w_t be
B(e̲_t) = {e | U_h[c_t(λ_t(e), p_t; h_t = 0), 0; e] = −λ_t(e) w_t}.

Assume, for simplicity, that prices and wages are independent of e and that demographic characteristics X do not enter preferences directly, except through their effect on e. These assumptions simplify the notation and are easily relaxed. Then, in period t, the proportion not working is

Pr(d_t = 0 | X) = ∫_{e̲_t} dF(e | X),
where F(e | X) is the conditional distribution of e. The life-cycle wage response of participation is
∂Pr(d_t = 0 | X)/∂w_t = f(B(e̲_t) | X) ∂B(e̲_t)/∂w_t,   (3.21)

where we are assuming that there is only one boundary point, an upper boundary point, and that the distribution is continuous at that point with density f:
dF(e | X)/de |_{e = B(e̲_t)} = f(B(e̲_t) | X).

For more general boundary sets, we require a notion of the density added or subtracted as the boundary is changed by the wage 44. Aggregating over age, wage and X groups, as in Section 2, produces the aggregate proportion of people who do not work.
44 We require evaluation of the limit

lim_{Δw_t → 0} [∫_{e̲_t(w_t + Δw_t)} dF(e | X) − ∫_{e̲_t(w_t)} dF(e | X)] / Δw_t,

where we make the dependence of e̲_t on w_t explicit. This expression is easily generalized to allow for vector e.
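Equation (3.21) can be illustrated with scalar heterogeneity e distributed standard normal and a hypothetical log-linear boundary B(w); both the distribution and the boundary coefficients are our assumptions, chosen only to make the formula concrete:

```python
import math

# Hypothetical reservation boundary: persons with heterogeneity e below
# B(w) do not work; B falls with the wage (b0, b1 are invented values).
b0, b1 = 0.2, 0.8
B = lambda w: b0 - b1 * math.log(w)

# Standard normal cdf and pdf, standing in for F(e | X) and f(e | X).
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def pr_nonwork(w):
    # Pr(d_t = 0 | X): integral of dF(e | X) over the set e <= B(w).
    return Phi(B(w))

w, dw = 1.5, 1e-6

# Equation (3.21): density at the boundary times dB/dw.
analytic = phi(B(w)) * (-b1 / w)

# Finite-difference check of the same derivative.
numeric = (pr_nonwork(w + dw) - pr_nonwork(w)) / dw

print(analytic, numeric)
assert abs(analytic - numeric) < 1e-5
```

The derivative is negative here because the boundary falls with the wage: a higher wage shrinks the set of non-workers, so participation rises at the extensive margin.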
Aggregate labor supply is constructed accounting for both the choice at the intensive margin and the choice at the extensive margin. See Heckman (1978) or Pencavel (1986) for more details. Aggregate employment-wage parameters combine preference and distribution parameters and cannot be directly compared with the interior solution elasticities unless the distribution of taste parameters is accounted for 45. A similar analysis can be performed for the e-constant and c-constant demand functions 46. In each case we can define a reservation wage and a set of e values such that, for persons with a given e_t(e) or c_t(e), respectively, and for the other prices and X variables, persons work or do not work in period t. In general e_t and c_t depend on e, just as λ_t depends on e. Employment and non-employment proportions are determined by integration of dF(e | X) over the appropriate sets. We note, however, that all three interpretations of the leisure demand function produce the same aggregate employment elasticity provided that all three exist 47. We now turn to the empirical evidence on these elasticities, starting with the eis for consumption.

3.3. Consumption estimates

In this section we present estimates from micro data of the parameters of preferences for consumption. Given our discussion in the previous subsection, it will be clear that this will sometimes require us to also consider labor supply, although the great majority of consumption studies assume within-period additivity between consumption and labor supply. We discuss labor supply estimates in the next subsection. Here we shall concentrate on two aspects of these estimates. The first is the conditioning variables used - in particular whether within-period additivity is assumed
45 Our analysis generalizes to an environment of uncertainty. For a model of perfect insurance, the analysis in the text applies without any modification. For a model of less than perfect insurance, λ_t is replaced by the derivative of the value function with respect to current assets, and variation in w_t is taken holding any effect of the current wage on future values constant.
46 Thus for the c-constant functions, the set of e values for which persons with consumption c_t(e) and w_t, p_t do not work is

e̲_t = {e | U_h(c_t(e), 0; e) / U_c(c_t(e), 0; e) ≤ −w_t / p_t},

and Pr(d_t = 0 | X) = ∫_{e̲_t} dF(e | X). For the e-constant functions, the e set for non-workers with earnings e_t(e) is

e̲_t = {e | U_h(e_t(e)/p_t, 0; e) / U_c(e_t(e)/p_t, 0; e) ≤ −w_t / p_t},

and the probability of non-participation is Pr(d_t = 0 | X) = ∫_{e̲_t} dF(e | X). Again, e_t(e) and c_t(e) depend in general on initial endowments, assets and prices in periods with interior solution demand functions.
47 Note that the e- and c-constant demands can exist when the Frisch demands do not.
or not, and whether the wage or labor supply is held fixed. The second is accounting for heterogeneity. Examples of observable heterogeneity include household composition, health status, age, and cohort (year of birth). To illustrate how we might incorporate heterogeneity, consider the second-order approximation to the Euler equation for the iso-elastic (constant relative risk aversion) utility function, assuming within-period additivity between consumption and labor supply [see, for example, Browning and Lusardi (1996), Equation (2.4)]:

Δln c_{t+1} = (1/ρ)[ln(β_{t+1,t}) + r_{t+1} + ½σ_t²] + u_{t+1},   (3.22)

where r_{t+1} is a (lending, net of tax) real rate of interest 48. The coefficient β_{t+1,t} is the time preference discount factor between periods t and t + 1, −1/ρ is the eis (ρ is the CRRA parameter), and σ_t² is the variance of the Euler equation error conditioned on time t information. When the rate of return is riskless, σ_t² is proportional to the conditional variance in consumption growth; it captures the precautionary motive and usually depends on factors such as the uncertainty associated with, for example, future income and health, as well as the insurance possibilities open to the agent and the income realization and level of assets in period t. Now suppose that the discount factor is given by

β_{t+1,t} = β_0 exp(β_z Δz_{t+1} + ε_β),
(3.23)
where β_0 is the baseline discount factor. The variable z_t is a demographic variable that changes the utility of consumption in period t. For example, it is generally believed that children increase consumption, so that if z is a dummy variable for the presence of children then β_z will be positive. The (zero mean) variable ε_β captures (unobservable) variations in the discount factor. Similarly, let

½σ_t² = θ y_t + ε_σ,   (3.24)

where y_t is the level of a variable such as income or assets in period t. The variable ε_σ captures (fixed) differences in future risk and insurance arrangements. For example, for tenured university professors this is virtually zero, while it may be quite high for young workers. Unlike its counterpart for the discount factor, this variable is unlikely to have zero mean. Combining Equations (3.22) to (3.24) we have the structural Euler equation:

Δln c_{t+1} = (1/ρ)(ln β_0 + ε_β + ε_σ) + (1/ρ) r_{t+1} + (1/ρ) β_z Δz_{t+1} + (1/ρ) θ y_t + u_{t+1},   (3.25)

and an associated reduced form

Δln c_{t+1} = a_0 + a_r r_{t+1} + a_z Δz_{t+1} + a_y y_t + u_{t+1}.   (3.26)
Note that in this form, lagged levels variables (in this case, y_t, or variables correlated with the permanent differences in discount factors ε_β and expected consumption

48 If agents cannot carry forward debt at this rate then there is an extra non-negative Lagrange multiplier term in this equation [see Browning and Lusardi (1996)].
variance ε_σ) may be correlated with consumption growth - this causes obvious problems in choosing variables for orthogonality conditions to identify the model. If we can recover consistent estimates of the parameters in Equation (3.26) then we can recover some of the parameters of Equation (3.25). In particular we can identify the eis (= −1/ρ) and the observable variation in the discount factor, β_z. We cannot, however, identify the mean discount factor β_0 without further analysis. We turn now to some micro-based estimates of the parameters of the Euler equation. Most empirical consumption Euler equations do not condition on either the wage or labor supply. This implicit assumption that consumption and labor supply are additively separable within the period is very convenient but probably unwarranted. Evidence from consumption studies against the additive form will be presented below, but there is also other evidence. For example, using family expenditure data, Browning and Meghir (1991) show that demand patterns depend significantly on male and female labor force status in an intuitively plausible way (for example, the budget shares of transport, clothing, and eating out all rise with labor supply). If this is the case then preferences over non-durables are not (weakly) separable from labor supply, and consequently preferences over some non-durable composite commodity cannot be additively separable from labor supply. Note that since this result is independent of the normalization of the utility function, this rules out any separable utility function, which unfortunately includes many functional forms assumed in the DGE literature. In Table 3.1 we present evidence from consumption Euler equation estimates based on micro data that maintain within-period additivity between goods and leisure 49.
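To see which parameters of Equation (3.25) the reduced form (3.26) recovers, one can simulate a panel of consumption growth under the structural equation and run least squares. In the sketch below all regressors are drawn independently of the errors, sidestepping the endogeneity problem just noted, and all parameter values are invented. The slopes recover 1/ρ, β_z/ρ and θ/ρ, while β_0 is absorbed into the intercept together with the means of ε_β and ε_σ and so is not separately identified:

```python
import math
import random

random.seed(1)
rho, beta0, beta_z, theta = 2.0, 0.98, 0.1, 0.05   # invented values
n = 100_000

# Regressors and shocks, all mutually independent (an assumption that
# removes the endogeneity of y_t discussed in the text).
r   = [random.gauss(0.03, 0.02) for _ in range(n)]     # real interest rate
dz  = [float(random.random() < 0.5) for _ in range(n)]  # demographic change
y   = [random.gauss(1.0, 0.3) for _ in range(n)]       # income level
eps = [random.gauss(0.0, 0.01) for _ in range(n)]      # eps_beta + eps_sigma
u   = [random.gauss(0.0, 0.05) for _ in range(n)]      # expectation error

# Structural Euler equation (3.25):
dlnc = [(math.log(beta0) + eps[i] + r[i] + beta_z * dz[i] + theta * y[i]) / rho
        + u[i] for i in range(n)]

def slope(x, dep):
    """OLS slope for one regressor; valid one at a time here because
    the regressors are mutually independent."""
    mx, md = sum(x) / n, sum(dep) / n
    cov = sum((a - mx) * (b - md) for a, b in zip(x, dep)) / n
    var = sum((a - mx) ** 2 for a in x) / n
    return cov / var

a_r, a_z, a_y = slope(r, dlnc), slope(dz, dlnc), slope(y, dlnc)
print(a_r, a_z, a_y)   # close to 1/rho, beta_z/rho, theta/rho
```

The intercept a_0 mixes ln β_0 with the error means, which is why β_0 itself is not identified from (3.26) without further assumptions.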
Almost always, the principal focus of these papers is on tests of the orthogonality conditions implied by the standard additive model, so that either anticipated income growth or lagged income is also included on the right hand side. We restrict attention to those papers that explicitly report all of the effects of demographics. Most papers assume an iso-elastic utility function, so that the equation estimated is similar to that given in Equation (3.26), usually without the lagged income term. The exceptions are Hall and Mishkin (1982), who use a quadratic utility function, and Attanasio and Browning (1995), who use a generalization of the iso-elastic form. Table 3.1 reveals that most investigators find significant observable heterogeneity in the discount factor [the a_z in Equation (3.26)]. A typical example is Zeldes (1989), who finds that the elasticity of food consumption with respect to "needs" (effectively, the number of adult equivalents) is about 0.24 with a standard error of 0.03. Lawrance (1991) also allows for variation in the intercept with the level of presample income (ostensibly to capture differences in the discount factor that are correlated with lifetime wealth), education, and race. In terms of Equations (3.23) and (3.24) these factors could capture the variation due to ε_β or ε_σ (or even a propensity to be liquidity constrained). Lawrance assumes that only the discount factor varies across the population and identifies the baseline discount factor by estimating the mean of

49 We include Attanasio and Browning (1995) since they present results with and without the separability assumption, so it provides a bridge between the two sets of results.
[Table 3.1 - Consumption Euler equation estimates based on micro data that maintain within-period additivity between goods and leisure; table contents not recoverable from this extraction.]
the expected consumption variance under a normality assumption. Although the point estimates sometimes indicate large variations in the discount factor with presample income levels, these are not generally statistically significant (perhaps because of the correlation between income and education). Her preferred estimates [Lawrance (1991), Table 4] suggest that bottom income decile households have annual discount factors that are about two percentage points lower than top decile households, and also that households with a college educated head have discount rates that are about two percentage points lower than otherwise comparable households. Even if one rejects the scale of her estimates because it relies on a normality assumption, her evidence of heterogeneity in discount rates is still valid. There are few attempts to capture unobserved heterogeneity in the papers listed in Table 3.1. The only examples of explicit allowance for unobserved heterogeneity are the inclusion of a fixed effect in the Euler equation in Zeldes (1989) and Keane and Runkle (1992). In terms of Equation (3.26) these pick up the means (over time) of ε_β or ε_σ. Neither of the two papers finds that the fixed effects are statistically significant. This is at odds with the Lawrance evidence, but it may simply reflect the problem of trying to estimate preference heterogeneity with a fixed effect, which can include measurement error components as well. It is difficult to credibly estimate subjective discount factors (that is, the parameters and distribution of ε_β in Equation (3.23)) from Euler equations. Yet the mean and distribution of the discount factor are, potentially, important elements in any DGE model. Where, then, can we look for estimates of the distribution of discount factors based on micro data? Recently two different approaches have been tried.
Gourinchas and Parker (1996) use a mixture of calibration and estimation on the US CEX data to estimate the discount factor for groups characterized either by their education or their occupation. Their method exploits the changing relative strengths (suggested by theory) of the precautionary motive and the life-cycle (saving for retirement) motive as agents age. Their results depend on the shapes of the income and expenditure paths over the life-cycle and retirement "needs" rather than on period to period changes as in an Euler equation approach. They find that high school and college graduates have discount factors of 0.96 and 0.97, respectively, when the real rate of interest is set to 3% and there is assumed to be no uncertainty in the return on capital. Although the estimates are sensitive to modelling assumptions (particularly the choice of the real rate used in the model), the use of lifetime income and expenditure patterns is a distinct improvement on the use of Euler equations to estimate the discount factor. An even more promising approach is that of Samwick (1997), who uses wealth holdings at different ages as observed in the USA in 1992 to infer the underlying distribution of discount factors. For example, a household that is close to retirement and has a low wealth to permanent income ratio is inferred to have a high discount rate, since it will necessarily have lower consumption in the retirement period. Samwick finds that the distribution has three components. First, about 70% of households are characterized by an approximately normal distribution centered on 5% with a standard deviation of about five percentage points (so that most agents in this group have
discount rates of between −5% and 15%). Then there is a small group of wealthy households (about 5% of the total) who have large wealth and consequently large negative discount rates (below −15%). Finally there is a group (about one quarter of the total) who have very low wealth and consequently very high discount rates (above 20%). Although the numerical values of the estimates may not be robust to the modelling choices that Samwick makes (concerning initial assets, other forms of wealth, income processes, etc.), the evidence of heterogeneity in discount rates is likely to be robust and his approach merits further investigation. Consider next the other important parameter, the eis. As can be seen from the final column of Table 3.1, many of these studies report an estimate of the coefficient on the real interest rate (= −eis), although these estimates are usually not very well determined 50. All of the PSID studies listed in Table 3.1 use cross-section variation in the real rate (due to differences in marginal tax rates) to aid in identification. As is typical of the entire microeconomic consumption and labor supply literature, most investigators do not introduce prices in their models and instead include time dummies that "wipe out" the real interest rate coefficient, so that it is open to question whether these estimates genuinely capture an intertemporal substitution effect. Attanasio and Browning (1995) use quarterly data from 17 years, which gives a great deal of time series variation in the real rate of return to capital (the quarterly rate varies from −8.6% to 2.4%). They also estimate a more general version of the iso-elastic form that allows for an eis that varies with the level of consumption (here seen as a proxy for lifetime wealth) and demographics. They find that there is significant variation in the eis with demographics and also that the iso-elastic form is rejected against the more general form.
Thus the eis is not constant over the population even when demographics are held constant. It is an open question whether this is due to everyone having the same generalized iso-elastic utility function or whether it reflects everyone having iso-elastic utility with unobserved variation in the parameters that is correlated with lifetime wealth (or some form of misspecification). In Table 3.2 we present results on consumption for those studies that allow for non-separabilities between consumption and labor supply. Since the effects of other demographics are generally similar to those for the separable case, we concentrate here on the effects of non-separabilities between consumption and labor supply. One thing to note here is that most authors are more interested in qualitative results (for example, "is there excess sensitivity?" or "is consumption additively separable from male labor supply?") than in precise estimation of specific parameters. This is reflected

50 One thing to bear in mind is that if we use different commodities then we should expect different estimates of the eis; see Atkeson and Ogaki (1996) and Browning and Crossley (1997). The latter show that if goods are additively separable within the period then the eis for a sub-component of non-durables (for example, food) is equal to the eis for non-durables as a whole multiplied by the (uncompensated) expenditure elasticity for the sub-component (this is an exact version of Pigou's Law that income elasticities are proportional to own price elasticities if preferences are additive). Thus we would expect that the eis in studies that use food as the consumption good should be, say, half of the eis in studies that use a broader measure.
in the fact that many of the estimates are rather implausible. The table also reflects the wide gulf between calibrators and micro-econometricians in that there are no Euler equation estimates of the most widely used functional forms found in the DGE models that do not assume additivity between consumption and labor supply. Of the five studies that present separability tests for male labor supply, three - Attanasio and Weber (1993), Blundell, Browning and Meghir (1994) and Attanasio and Browning (1995) - report significant complementarities between consumption and male labor supply. Moreover, the effects are quite large; for example, Attanasio and Browning (1995) have consumption falling 28% if an anticipated change from full-time work to no work takes place 51. However, two other studies, Browning, Deaton and Irish (1985) and Meghir and Weber (1996), find that consumption is additively separable from male labor supply. The former study conditions on the real wage rather than labor supply, so that it is estimating a different parameter. The specification in Meghir and Weber (1996) is very different from the others used: they model three non-durable goods (food at home, transport, and services) rather than a single composite; they allow for interactions between the wife's labor force status and various parameters; they condition on other quantities ("food out", clothing, and fuel); and they use the panel aspect of the CEX. Although this evidence on the separability between consumption and male labor supply is a bit ambiguous, taken as a whole it supports the notion that consumption and male labor supply are Frisch complements. The evidence concerning female labor supply is even more mixed.
Given the evidence that demands do not appear to be separable from female labor supply within the period [see, for example, Browning and Meghir (1991)], given the presumed presence of costs of going to work (including day care for children, since this is included in "services"), and given the possibility of substituting home production for market purchases, it might be supposed that we would find strong complementarities between consumption (strictly, market purchases of non-durables) and female labor supply. However, this is not the case. Of the five studies that allow for non-separabilities between consumption and female labor supply, two [Altug and Miller (1990) and Blundell et al. (1994)] find them to be Frisch complements (but, for the former, only if it is assumed that male labor supply is additively separable from both consumption and female labor supply, which is contrary to the male labor supply results reported in the last paragraph); two others [Attanasio and Weber (1995) and Attanasio and Browning (1995)] find weak evidence in favor of substitutability; and one [Meghir and Weber (1996)] finds very strong evidence of non-separabilities (which cannot be characterized in the usual way). The papers listed in Table 3.2 also present estimates of the eis for consumption. The largest effects are found in Blundell et al. (1994) 52. They permit the eis to vary with the level of consumption, labor force status, and demographics. Generally, the variation
51 This is interpreted as being due to non-separabilities. A plausible alternative is that households in which the husband is out of work may be liquidity constrained and this explains the low consumption.
52 Note, however, that they include a dummy variable for the post-1980 period and this "sharpens up" their results considerably.
[Table 3.2]
with consumption is greater than the variation with the other observable factors, and quite high values (in absolute value) are found for some high consumption households; the range in the sample is from -2.9 (the first decile) to -0.96 (the ninth decile). If this dependence on consumption actually reflects persistent heterogeneity in the eis, then this may go some way to rationalizing the different portfolios that high consumption and low consumption households hold. Four main conclusions concerning consumption can be drawn from these studies:
• The intertemporal allocation of consumption varies significantly with variations in demographics. In particular, consumption increases with household size.
• There is consistent evidence that consumption is complementary with male labor supply. The estimates suggest large decreases in consumption in response to an anticipated change from full-time work to no work.
• The evidence is mixed on the interaction between consumption and female labor supply. At present we believe that there are no reliable estimates of this interaction.
• The elasticity of intertemporal substitution (eis) is usually poorly determined. Two studies, however, find significant variation in the eis with demographics, labor force status, and the level of consumption, with quite high elasticities for some high consumption households. If constancy is imposed on the eis, then one characterization of the literature is that there is no strong evidence against the view that the eis for non-durables (holding labor supply constant) is a bit less than -1 and the eis for food is -0.5.
The first finding is problematic for calibrating infinitely-lived agent models using microeconomic data. The dynasty models are not designed for dealing with changes in household size over the life cycle. The second two findings, taken together, demonstrate that separate treatment of male and female labor supply is needed if microeconomic evidence is to be used in calibration.
Yet, as noted in Section 1 of this chapter, the convention in the macroeconomics literature is to pool genders to exploit the rough constancy in hours per head across both groups and to calibrate preference and production parameters using the constancy of aggregate person hours per capita. We next turn to the estimates of labor supply elasticities.
3.4. Labor supply
The literature on labor supply offers an interesting contrast with the literature on consumption just surveyed. There are only a handful of intertemporal labor supply estimates, which we summarize here. In contrast, there is a vast predecessor literature on static labor supply models. The difficulty in using this literature arises from the lack of a clear framework for interpreting the estimates. Much of the literature proceeds by estimating a relationship between hours and wages, holding constant demographics and some measure of wealth or unearned income, but not holding fixed the marginal utility of expenditure, as is required to isolate the eis. In this section we show that the static literature provides a lower bound for the eis for labor supply.
3.4.1. Labor supply estimates
Table 3.3 presents the estimates of the Frisch labor supply elasticity for males, females, and the aggregate per capita labor supply of households. As is typical of the entire published literature, most of these estimates are based on hours worked within a year for continuously married prime age males who work some time in each year for long stretches of time - typically six to ten years. Only Heckman and MaCurdy (1980) estimate the Frisch labor supply elasticity for married females. In all of the male micro samples, observations with no earnings in the year are excluded. The estimated elasticities for males are small but are consistent with each other. For married females, the Frisch elasticity is much larger. Only Ghez and Becker (1975) examine whether the Frisch elasticity depends on demographics; they find that the Frisch elasticity declines with educational attainment for males. For black males, the Frisch elasticity is negative. Ghez and Becker (1975) also report evidence that suggests that male and female

Table 3.3
Estimates of elasticity of intertemporal substitution in labor supply

Authors | Estimate | Frisch labor supply elasticity | Dependence on demographics
MaCurdy (1981) | Life cycle labor supply for males | [0.10, 0.40] | None assumed
Altonji (1986) | Life cycle labor supply for males | [0, 0.35] (for labor supply) | None assumed
Heckman and MaCurdy (1982) | Life cycle labor supply for females | 1.61 a (labor supply) | None assumed
Lucas and Rapping (1970) b | US aggregate time series (males and females); person-hours | 1.40 (labor supply, short run) | None assumed
Ghez and Becker (1975) | Males | Whites: 0.49 (grade school), 0.36 (high school), 0.30 (college); Blacks: roughly constant; overall: 0.39 (whites), -0.106 (nonwhites) | Declines with educational attainment (whites)

a This number is obtained from that reported by Heckman and MaCurdy (1982), who estimate a demand for leisure, by multiplying their eis for leisure by 4 to obtain the implied elasticity of labor supply H = T − L: ∂ln H/∂ln w = −[L/(T − L)] ∂ln L/∂ln w, where, on average, L/(T − L) ≈ 4 in their sample.
b The interpretation of the Lucas and Rapping parameter as a Frisch parameter is based on MaCurdy (1985). Their estimate is for aggregate person-hours and includes entry and exit. For this reason, their estimated elasticity is higher than the estimates that use workers who are continuously working. See Heckman (1978).
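The conversion in footnote a of Table 3.3 is just a rescaling of a leisure-demand elasticity into a labor supply elasticity; a sketch with an assumed leisure elasticity (the -0.40 below is a made-up input, not the published estimate):

```python
# Footnote a of Table 3.3: converting a Frisch leisure-demand elasticity into
# the implied Frisch labor supply elasticity for H = T - L, using
# dlnH/dlnw = -(L/(T-L)) dlnL/dlnw.
eis_leisure = -0.40        # assumed wage elasticity of leisure demand
leisure_per_work_hour = 4  # L / (T - L), roughly 4 on average in their sample
frisch_labor = -leisure_per_work_hour * eis_leisure
print(round(frisch_labor, 2))  # 1.6, in the neighborhood of the 1.61 in the table
```

Because leisure hours are several times work hours for this sample, even a modest leisure-demand elasticity translates into a large labor supply elasticity.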
leisure time are direct substitutes in within-period household utility. Abowd and Card (1989) report very large and statistically imprecisely determined estimates of the Frisch elasticity for males. We do not discuss their study in detail because of the imprecision of their estimates. One interesting finding implicit in the Abowd and Card (1989) study is that the Frisch elasticity increases the smaller the time unit of labor supply that is analyzed: the Frisch elasticity is smallest for biennial labor supply observations and largest for six-month intervals. This suggests that an instantaneously additively-separable model is inappropriate and that time is more substitutable over shorter intervals than longer ones. Further suggestive evidence on this point is given by MaCurdy's (1983) very high reported elasticities for monthly hours of work. It is important to note that none of these elasticities accounts for the participation response [that is, none computes Equation (3.21) or an elasticity version of it] 53. They are all taste-constant responses to wages, whereas the participation elasticity includes a component that arises from changing the sample composition of the tastes of workers through entry-exit decisions when wages change. The cross-section evidence we survey below suggests that the participation response elasticities are substantially higher. There is a vast literature on static labor supply models that is not directly relevant to the calibration of macro general equilibrium models, although it is often used this way, especially by economists in public finance; see, for example, Auerbach and Kotlikoff (1987) or Fullerton and Rogers (1993). In their simplest form, static models relate hours worked at age t to wages at age t and either asset income or assets at t (A_t). The resulting labor supply equation is not captured by any of the four models of labor supply discussed in Section 3.2.
It does not condition on current expenditure; it does not condition on current consumption; it does not condition on the marginal utility of wealth, although A_t may be a partial proxy for λ_t; it sometimes conditions not on A_t but rather on earnings from asset income; and it typically does not allow for corner solutions. To see what is estimated in this literature, consider a simple model in which consumption and leisure are contemporaneously additively separable. Suppose preferences for leisure are given by a simple iso-elastic specification. We initially assume perfect certainty with a constant real interest rate r, and we restrict this rate to satisfy β(1 + r) = 1, where β is the subjective discount factor. Also, for the time being we abstract from cohort effects in wealth accumulation. Then we may write the λ-constant or Frisch function for leisure demand as

ln l_t = α_0 + α_1 ln w_t + α_1 ln λ_t + ε_t,

where T − h_t = l_t and ε_t is a mean zero measurement error. Instead of running a regression based on this equation, run a misspecified regression of ln l_t on ln w_t and A_t:

ln l_t = α_0 + α_1 ln w_t + α_2 A_t + {α_1 ln λ_t − α_2 A_t + ε_t},
53 Only Lucas and Rapping (1970) use total person hours as the dependent variable. Implicitly they estimate effects inclusive of entry and exit. See Heckman (1978).
where the expression in braces is the composite error term for the model. Observe that the true value of α_2 is zero, since once we condition on λ_t, wealth does not enter the Frisch labor supply function. We cannot, however, observe λ_t and consequently cannot condition on it. Instead, we include A_t in a misspecified regression equation. By a standard specification-error analysis, least squares estimates of α_1 and α_2, denoted by â_1 and â_2, and obtained from a cross section, converge under the usual regularity conditions to

plim â_1 = α_1 [1 + (σ_AA σ_wλ − σ_wA σ_Aλ)/|D|],

where the covariances are formed over age groups in the cross section: σ_ww = Var(ln w_t); σ_wλ = Cov(ln w_t, ln λ_t); σ_wA = Cov(ln w_t, A_t); σ_Aλ = Cov(A_t, ln λ_t); σ_AA = Var(A_t); and |D| = σ_ww σ_AA − (σ_wA)² > 0 is the determinant of the covariance matrix of ln w_t and A_t. Moreover,

plim â_2 = α_1 (σ_ww σ_Aλ − σ_wλ σ_wA)/|D|.

Some of these covariances can be signed. From concavity of preferences, σ_Aλ < 0 (diminishing marginal utility of wealth); assuming that goods and leisure are normal in all periods, σ_wλ ≤ 0; and in a cross section it usually happens that σ_wA > 0 (higher wage people have higher wealth). Thus, in general, the bias of â_1 for α_1 is ambiguous. Observe, however, that there is no bias if income effects are small. If they are negligible, σ_wλ = σ_Aλ = 0 and the cross-section wage elasticity in a leisure demand equation recovers the Frisch labor supply parameter. Moreover, in an environment of perfect certainty, the weaker the correlation over time in wages, the weaker is σ_wλ, the more likely is the bias of â_1 for α_1 to be downward and the more likely is â_2 to be negative. If A_t is omitted from the model, as was the common practice in many of the cross-section studies surveyed in Table 3.4, the OLS estimate of α_1 when λ_t is omitted from the regression (â_1') has:

plim â_1' = α_1 (1 + σ_wλ/σ_ww),     (3.27)

which is definitely downward biased since σ_wλ < 0. From â_1' we can produce a lower bound on α_1 54. The preceding argument remains valid if there are cohort effects, or if we introduce age effects back into the model by relaxing the restriction that β(1 + r) = 1, provided that the assumed properties of the covariances also apply to age-adjusted versions of
54 A version of this bounding argument was first presented in Heckman (1971). The argument is more general and does not require contemporaneous additive separability.
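The omitted-variable result in Equation (3.27) can be checked with a small Monte Carlo: regressing log leisure on log wages while omitting ln λ drives the slope toward α_1(1 + σ_wλ/σ_ww). All moments and parameter values below are assumptions chosen for illustration.

```python
import numpy as np

# Monte Carlo check of Equation (3.27): omitting ln(lambda) from the
# cross-section leisure-demand regression biases the wage coefficient
# toward zero when Cov(ln w, ln lambda) < 0.
rng = np.random.default_rng(0)
n = 200_000

sigma_ww, sigma_wl = 1.0, -0.3            # Var(ln w), Cov(ln w, ln lambda): assumed
cov = np.array([[sigma_ww, sigma_wl],
                [sigma_wl, 1.0]])
ln_w, ln_lam = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

a0, a1 = 1.0, -0.5                        # assumed Frisch leisure-demand parameters
ln_l = a0 + a1 * ln_w + a1 * ln_lam + rng.normal(0.0, 0.1, n)

# Misspecified cross-section regression: ln l on ln w only
slope = np.cov(ln_w, ln_l)[0, 1] / np.var(ln_w)

# Equation (3.27): plim of the OLS slope is a1 * (1 + sigma_wl / sigma_ww),
# here about -0.35, closer to zero than the true a1 = -0.5
predicted = a1 * (1 + sigma_wl / sigma_ww)
print(round(slope, 2), round(predicted, 2))
```

The simulated slope shrinks in absolute value relative to the true Frisch parameter, which is the sense in which the cross-section estimate bounds it.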
[Table 3.4]
them. In the case of secular exponential growth in wealth across cohorts, the coefficient on age includes a growth rate of wealth term. Provided that σ_wλ < 0, the argument also extends to an uncertain environment with less than full insurance. In Table 3.4 we present cross-section estimates from 15 studies. The estimates correspond to â_1 or â_1' above. The first nine studies in the table produce estimates of ∂ln h/∂ln w = −(l/h) ∂ln l/∂ln w, where T − h = l. Thus, the upper bounds for Frisch leisure demands are lower bounds for Frisch labor supply elasticities. The first nine studies are the traditional least squares estimates, which are plagued by bad data on asset income. All but one of the numbers reported in the first nine rows of this table are below the estimates of the Frisch labor supply elasticity reported in Table 3.3, vindicating our analysis and suggesting that the cross-section labor supply estimates constitute lower bounds for the Frisch labor supply parameter. Studies (10) and (11) of Table 3.4 report that cross-sectional annual-hours-of-work labor supply elasticities for married women are as negative as they are for married men. Studies (12) and (13) reveal that labor supply elasticities are more positive when participation is the dependent variable. (But recall from our earlier discussion that these elasticities are not the eis or the Hicks-Slutsky elasticity.) This point is emphasized in the studies of Heckman (1978, 1993) and Blundell and MaCurdy (1999). Study (12) reveals that labor supply is more elastic the lower the wage. Study (14) attempts to replicate the influential cross-sectional estimates of Hausman (1981), which are sometimes used by macroeconomists as measures of the Frisch parameter. His estimates are much higher than the other estimates reported in the table. In a careful analysis, the authors of study (14) are unable to replicate Hausman's reported estimates using his own data.
Instead, their unrestricted estimates of labor supply elasticities are negative and inconsistent with utility maximization. When the restrictions of utility maximization are imposed, the model exhibits a zero labor supply elasticity for males. It is the estimates in row (14), and not Hausman's (1981), that fall in line with the other estimates reported in the tables and that should be used for evidence on uncompensated static labor supply. Study (15) summarizes the literature that establishes that the traditional labor supply model does not satisfy Slutsky symmetry or integrability conditions. However, Heckman (1971) establishes that when the cross-section model is embedded in a life-cycle setting, the evidence is much stronger for the traditional household model of labor supply.

3.5. Heterogeneity in the marginal rate of substitution between goods and leisure
Beginning with the work of Heckman (1974), and continuing in the work of Burtless and Hausman (1978), Hausman (1981), MaCurdy (1983) and MaCurdy, Green and Paarsch (1990), labor economists and econometricians have estimated the extent of variability in the marginal rate of substitution between goods and leisure. Heckman (1974) builds a life-cycle model of labor supply, asset accumulation, and the demand for child care in an environment of perfect certainty. In his model the
marginal rate of substitution between goods and leisure is explicitly parameterized and allowed to depend on both observed and unobserved factors. In later work, MaCurdy (1983) applies and extends this framework to an environment of uncertainty using Euler equation methods. Other researchers have used static one-period models of labor supply to document heterogeneity in the preference for leisure. For brevity we only summarize the static labor supply evidence reported by Heckman (1974), who presents the clearest evidence on preference heterogeneity. The marginal rate of substitution function (or slope of the indifference curve) at a given level of prework income Y is
m = m(Y, h),     (3.28)
where we ignore variations in the prices of other goods and where h is hours of work 55. A consumer possesses a family of indifference curves indexed by the level indicator Y, the no-work level of income or consumption. We know that ∂m/∂Y > 0 if leisure is a normal good. From diminishing marginal rate of substitution between goods and time, we know that ∂m/∂h > 0. If a consumer faces a parametric wage w at initial income position Y, she works if w > m(Y, 0). If this inequality applies, the consumer's decision of how much to work is characterized by
w = m(Y*, h*),     (3.29)
where Y* is a level index appropriate to the indifference curve and is the amount of income (or consumption) that would make the consumer indifferent between working h* hours at wage rate w to gain total resources wh* + Y, and not working and receiving income Y*. Without knowledge of Y*, one cannot deduce the relationship between w and h* predicted by consumer optimization. However, given Y*, we know that optimality also requires that h* satisfy

wh* + Y = ∫_0^{h*} m(Y*, h) dh + Y*.     (3.30)
From Equation (3.29), if m is monotonic in Y* we may solve for Y* as a function of w and h: Y* = g(w, h). Using this value in Equation (3.30), we implicitly define the labor supply function by

wh + Y = ∫_0^h m[g(w, s), s] ds + g(w, h).
55 In the life-cycle version of the model, Y is determined by a two-stage budgeting argument.
[Table 3.5]
A wide variety of functional forms may be used to specify m. Heckman's (1974) preferred specification is

ln m = α_0 + α_1 Y + α_2 h + α_3 Z + u,     (3.31)

where Y is the prework level of income, h is hours of work, and Z is a vector of variables to be discussed more fully below. A random variable u, with zero mean and variance σ_u², reflects variation in preferences for work among individuals. The previous analysis leads us to the prediction that α_1 > 0 (normality of leisure) and α_2 > 0 (diminishing marginal rate of substitution between goods and leisure). The resulting static labor supply function is implicitly defined by

u = ln w − α_0 − α_2 h − α_3 Z − α_1 [wh + Y − (w/α_2)(1 − e^{−α_2 h})].     (3.32)
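Equation (3.32), read as u = ln w − α_0 − α_2 h − α_3 Z − α_1[wh + Y − (w/α_2)(1 − e^{−α_2 h})], defines hours only implicitly, but it can be solved numerically by bisection. The parameter values below are illustrative assumptions chosen so that an interior solution exists, not Heckman's estimates.

```python
import math

# Numerically solving the implicit labor supply curve (3.32) for hours h*.
# All parameter values are illustrative assumptions, not estimates.
a0, a1, a2, a3Z, u = 0.0, 0.01, 0.5, 0.0, 0.0
w, Y = 2.0, 10.0           # wage and prework income

def f(h):
    # (3.32) rearranged: f(h) = 0 at an interior optimum.
    y_star = w * h + Y - (w / a2) * (1.0 - math.exp(-a2 * h))
    return math.log(w) - a0 - a2 * h - a3Z - a1 * y_star - u

# The consumer works because w > m(Y, 0) = exp(a0 + a1 * Y); f is decreasing
# in h here, so bisect on a bracket where f changes sign.
assert math.log(w) > a0 + a1 * Y and f(0.0) > 0 > f(2.0)
lo, hi = 0.0, 2.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
h_star = 0.5 * (lo + hi)
print(round(h_star, 2))    # interior hours choice, about 1.17 with these values
```

Raising Y lowers the root, which is the income effect (normality of leisure, α_1 > 0) working through the Y* term.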
This labor supply curve can become backward bending beyond a certain value of hours worked. Estimates for this specification are presented in Table 3.5, where the price of child care is also introduced as a determinant of the marginal rate of substitution. This table reveals that the estimated marginal rate of substitution function for married women has considerable heterogeneity, depending both on observed variables (education, the number of children, and the wealth level of the family) and on unobservables. Children, education, and asset income all raise the value of leisure. The standard deviation of the unobservables is also large. This heterogeneity arises from variation over people and not from sampling variation. Figure 3.1 plots the median, third quartile, and first quartile of the population marginal rate of substitution for a group with median non-labor income and other characteristics, excluding education from preferences. Population variation in tastes is a central empirical regularity in the micro literature. Moreover, as documented in Heckman and Willis (1977) and Heckman (1982), these tastes are stable over time, giving rise to persistence in employment status over time.
Summary and conclusion

This chapter has documented the empirical evidence supporting heterogeneity in preferences, constraints and skills, and some of their consequences for modern macroeconomics. It has also examined the problems of measuring uncertainty from time series processes of earnings and wage functions. Finally, it has discussed the problem of extracting the parameter values required in dynamic general equilibrium theory from the "large shelf" of micro estimates. The gulfs between the theoretical environments presumed by the dynamic economic models and the estimation environments used by empirical microeconomists leave the shelf of directly usable numbers virtually empty. What is there requires careful interpretation and qualification.
[Figure 3.1]
While dynamic general equilibrium models may suggest new directions for empirical macroeconomic research, it is essential to build the dynamic economic models so that the formal incorporation of microeconomic evidence is more than an afterthought. Macroeconomic theory will be enriched by learning from many of the lessons from modern empirical research in microeconomics. At the same time, microeconomics will be enriched by conducting research within the paradigm of modern dynamic general equilibrium theory, which provides a framework for interpretation and synthesis of the micro evidence across studies.
References

Abowd, J., and D. Card (1989), "On the covariance structure of earnings and hours changes", Econometrica 57(2):411-445.
Aiyagari, S.R. (1994), "Uninsured idiosyncratic risk and aggregate saving", Quarterly Journal of Economics 109(3):659-684.
Allen, F. (1985), "Repeated principal-agent relationships with lending and borrowing", Economics Letters 17:27-31.
Altonji, J. (1986), "Intertemporal substitution in labor supply: evidence from micro data", Journal of Political Economy 94(2), pt. 2:S176-S215.
Altug, S., and R.A. Miller (1990), "Household choices in equilibrium", Econometrica 58(3):543-570.
Altug, S., and R.A. Miller (1998), "The effect of work experience on female wages and labor supply", Review of Economic Studies 65(222):45-85.
Alvarez, F., and U.J. Jermann (1999), "Efficiency, equilibrium, and asset pricing with risk of default", Econometrica, forthcoming.
Anderson, E. (1998), "Uncertainty and the dynamics of Pareto optimal allocations", University of Chicago Dissertation.
Ashenfelter, O., and J.J. Heckman (1973), "Estimating labor supply functions", in: G.G. Cain and H.W. Watts, eds., Income Maintenance and Labor Supply (Markham, Chicago, IL) 265-278.
Atkeson, A., and R.E. Lucas Jr (1992), "On efficient distribution with private information", Review of Economic Studies 59:427-453.
Atkeson, A., and M. Ogaki (1996), "Wealth-varying intertemporal elasticities of substitution: evidence from panel and aggregate data", Journal of Monetary Economics 38:507-534.
Attanasio, O.P. (1999), "Consumption", ch. 11, this Handbook.
Attanasio, O.P., and M. Browning (1995), "Consumption over the life cycle and over the business cycle", American Economic Review 85(5):1118-1137.
Attanasio, O.P., and G. Weber (1993), "Consumption growth, the interest rate, and aggregation", Review of Economic Studies 60(3):631-649.
Attanasio, O.P., and G. Weber (1995), "Is consumption growth consistent with intertemporal optimization? Evidence from the consumer expenditure survey", Journal of Political Economy 103(6):1121-1157.
Auerbach, A.J., and L.J. Kotlikoff (1987), Dynamic Fiscal Policy (Cambridge University Press, Cambridge).
Autor, D., L. Katz and A. Krueger (1997), "Computing inequality: have computers changed the labor market?", Working Paper No. 5956 (NBER, March).
Backus, D.K., P.J. Kehoe and F.E. Kydland (1995), "International business cycles: theory and evidence", in: T.F. Cooley, ed., Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ) ch. 11.
Barsky, R., T. Juster, M. Kimball and M. Shapiro (1997), "Preference parameters and behavioral heterogeneity: an experimental approach in the health and retirement survey", Quarterly Journal of Economics 112(3):537-580.
Beaudry, P., and D. Green (1997), "Cohort patterns in Canadian earnings: assessing the role of skill premia in inequality trends", Working Paper No. 6132 (NBER).
Becker, G., and B. Chiswick (1966), "Education and the distribution of earnings", American Economic Review 56(1/2):358-369.
Ben-Porath, Y. (1967), "The production of human capital and the life cycle of earnings", Journal of Political Economy 75(4):352-365.
Benhabib, J., R. Rogerson and R. Wright (1991), "Homework in macroeconomics: household production and aggregate fluctuations", Journal of Political Economy 99(6):1166-1187.
Bewley, T. (1977), "The permanent income hypothesis: a theoretical formulation", Journal of Economic Theory 16:252-292.
Bils, M.J. (1985), "Real wages over the business cycle: evidence from panel data", Journal of Political Economy 93(4):666-689.
Blank, R. (1990), "Why are real wages cyclical in the 1970s?", Journal of Labor Economics 8(1):16-47.
Blinder, A., and Y. Weiss (1976), "Human capital and labor supply: a synthesis", Journal of Political Economy 84(3):449-472.
Blundell, R., and T.E. MaCurdy (1999), "Labor supply: a review of alternative approaches", in: O. Ashenfelter and D. Card, eds., Handbook of Labor Economics, vol. 3A (North-Holland, Amsterdam) 1559-1695.
Blundell, R., and I. Preston (1998), "Consumption inequality and income uncertainty", Quarterly Journal of Economics 113(2):603-640.
Blundell, R., M. Browning and C. Meghir (1994), "Consumer demand and the life-cycle allocation of household expenditures", Review of Economic Studies 61(1):57-80.
Boskin, M.J. (1983), "The economics of labor supply", in: G.G. Cain and H.W. Watts, eds., Income Maintenance and Labor Supply (Markham, Chicago, IL) 163-181.
Bowen, W., and T.A. Finegan (1969), The Economics of Labor Force Participation (Princeton University Press, Princeton, NJ).
Brock, W.A., and L.J. Mirman (1972), "Optimal economic growth and uncertainty: the discounted case", Journal of Economic Theory 4(3):479-513.
Brown, C. (1976), "A model of optimal human-capital accumulation and the wages of young high school graduates", Journal of Political Economy 84(2):299-316.
Browning, M. (1998), "Modelling commodity demands and labour supply with m-demands", mimeo, Institute of Economics, University of Copenhagen.
Browning, M., and T. Crossley (1997), "Shocks, stocks and socks: consumption smoothing and the replacement of durables during an unemployment spell", mimeo, McMaster University.
Browning, M., and A. Lusardi (1996), "Household saving: micro theories and micro facts", Journal of Economic Literature 34(4):1797-1855.
Browning, M., and C. Meghir (1991), "The effects of male and female labor supply on commodity demands", Econometrica 59(4):925-951.
Browning, M., A. Deaton and M. Irish (1985), "A profitable approach to labor supply and commodity demands over the life cycle", Econometrica 53(3):503-543.
Burtless, G., and D. Greenberg (1983), "Measuring the impact of NIT experiments on work effort", Industrial and Labor Relations Review 36(4):592-605.
Burtless, G., and J. Hausman (1978), "The effect of taxes on labor supply: evaluating the Gary negative income tax experiment", Journal of Political Economy 86(6):1103-1130.
Caballe, J., and M. Santos (1993), "On endogenous growth with physical and human capital", Journal of Political Economy 101(6):1042-1067.
Cameron, S., and J.J. Heckman (1998a), "Life cycle schooling and dynamic selection bias: models and evidence for five cohorts of American males" (first presented at the University of Wisconsin, June 1990), Journal of Political Economy 106(2):262-333.
Cameron, S., and J.J. Heckman (1998b), "Should college attendance be further subsidized to reduce
rising wage inequality?", in: M. Kosters, ed., Financing College Tuition: Government Policies and Social Priorities (AEI Press, Washington, DC).
Card, D. (1995), "The wage curve: a review", Journal of Economic Literature 33(2):785-799.
Carroll, C.D. (1992), "The buffer-stock theory of saving: some macroeconomic evidence", Brookings Papers on Economic Activity 1992(2):61-156.
Carroll, C.D., and A.A. Samwick (1997), "The nature of precautionary wealth", Journal of Monetary Economics 40(1):41-71.
Christiano, L.J. (1988), "Why does inventory investment fluctuate so much?", Journal of Monetary Economics 21(2/3):247-280.
Clark, K.B., and L.H. Summers (1979), "Labor market dynamics and unemployment: a reconsideration", Brookings Papers on Economic Activity 1979(1):13-60.
Cogley, T., and J.M. Nason (1995), "Effects of the Hodrick-Prescott filter on trend and difference stationary time series: implications for business cycle research", Journal of Economic Dynamics and Control 19(1/2):253-278.
Cole, H.L., and N. Kocherlakota (1997), "Efficient allocations with hidden income and hidden storage", Federal Reserve Bank of Minneapolis Research Department Staff Report 238.
Coleman, T. (1984), "Essays on aggregate labor market business cycle fluctuations", Ph.D. Dissertation (University of Chicago).
Constantinides, G.M. (1982), "Intertemporal asset pricing with heterogeneous consumers and without demand aggregation", Journal of Business 55(2):253-267.
Constantinides, G.M., and D. Duffie (1996), "Asset pricing with heterogeneous consumers", Journal of Political Economy 104(2):219-240.
Cooley, T.F., and E.C. Prescott (1995), "Economic growth and business cycles", in: T.F. Cooley, ed., Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ).
Cossa, R., J.J. Heckman and L. Lochner (1998), "The effects of EITC on human capital production", unpublished manuscript (University of Chicago).
DaVanzo, J., D.N. DeTray and D.H. Greenberg (1973), "Estimating labor supply response: a sensitivity analysis", R-1372-OEO (The Rand Corporation).
Dickinson, J. (1974), "Labor supply of family members", in: J.N. Morgan and G.J. Duncan, eds., Five Thousand American Families: Patterns of Economic Progress (University of Michigan Survey Research Center, Ann Arbor, MI) ch. 1, 177-250.
Dumas, B. (1989), "Two-person dynamic equilibrium in the capital market", The Review of Financial Studies 2(2):157-188.
Dumas, B., R. Uppal and T. Wang (1997), "Efficient intertemporal allocations with recursive utility", unpublished manuscript.
Durlauf, S.N., and D.T. Quah (1999), "The new empirics of economic growth", ch. 4, this Handbook.
Eichenbaum, M., and L.P. Hansen (1990), "Estimating models with intertemporal substitution using aggregate time series data", Journal of Business and Economic Statistics 8(1):53-69.
Epstein, L.G., and A. Melino (1995), "A revealed preference analysis of asset pricing under recursive utility", Review of Economic Studies 62(4):597-618.
Epstein, L.G., and S.E. Zin (1989), "Substitution, risk aversion, and the temporal behavior of consumption and asset returns: a theoretical framework", Econometrica 57(4):937-969.
Epstein, L.G., and S.E. Zin (1991), "Substitution, risk aversion, and the temporal behavior of consumption and asset returns: an empirical analysis", Journal of Political Economy 99(2):263-286.
Flavin, M. (1981), "The adjustment of consumption to changing expectations about future income", Journal of Political Economy 89(5):974-1009.
Freeman, R. (1976), The Overeducated American (Basic Books, New York).
Friedman, M. (1957), A Theory of the Consumption Function (Princeton University Press, Princeton, NJ).
Friedman, M., and S. Kuznets (1945), Income from Independent Professional Practice (NBER, New York).
M. Browning et al.
Fullerton, D., and D. Rogers (1993), Who Bears the Lifetime Tax Burden? (Brookings Press, Washington, DC).
Ghez, G., and G.S. Becker (1975), The Allocation of Time and Goods over the Life Cycle (NBER/Columbia University Press, New York).
Gorman, W.M. (1953), "Community preference fields", Econometrica 21(1):63-80.
Gorman, W.M. (1968), "The structure of utility functions", Review of Economic Studies 35:367-390.
Gourinchas, P.-O., and J. Parker (1996), "Consumption over the life-cycle", mimeograph (MIT).
Green, E. (1987), "Lending and the smoothing of uninsurable income", in: E.C. Prescott and N. Wallace, eds., Contractual Arrangements for Intertemporal Trade (University of Minnesota Press, Minneapolis) 3-25.
Greenwood, J., and Z. Hercowitz (1991), "The allocation of capital and time over the business cycle", Journal of Political Economy 99(6):1188-1214.
Greenwood, J., and M. Yorukoglu (1997), "1974", Carnegie-Rochester Conference Series on Public Policy.
Greenwood, J., R. Rogerson and R. Wright (1995), "Household production in real business cycle theory", in: T.F. Cooley, ed., Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ) ch. 6.
Haley, W.J. (1976), "Estimation of the earnings profile from optimal human capital accumulation", Econometrica 44:1223-1238.
Hall, R.E. (1978), "Stochastic implications of the life cycle-permanent income hypothesis: theory and evidence", Journal of Political Economy 86(6):971-987.
Hall, R.E. (1988), "Intertemporal substitution in consumption", Journal of Political Economy 96(2):339-357.
Hall, R.E., and F.S. Mishkin (1982), "The sensitivity of consumption to transitory income: estimates from panel data on households", Econometrica 50(2):461-481.
Hamermesh, D. (1993), Labor Demand (Princeton University Press, Princeton, NJ).
Hansen, G.D. (1985), "Indivisible labor and the business cycle", Journal of Monetary Economics 16(3):309-327.
Hansen, G.D., and E.C. Prescott (1995), "Recursive methods for computing equilibria of business cycle models", in: T.F. Cooley, ed., Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ) 39-64.
Hansen, L.P. (1987), "Calculating asset prices in three example economies", in: T.F. Bewley, ed., Advances in Econometrics, Fifth World Congress (Cambridge University Press, Cambridge) 207-243.
Hansen, L.P., and R. Jagannathan (1991), "Implications of security market data for models of dynamic economies", Journal of Political Economy 99(2):225-262.
Hansen, L.P., and S.F. Richard (1987), "The role of conditioning information in deducing testable restrictions implied by dynamic asset pricing models", Econometrica 55(3):587-613.
Hansen, L.P., and K.J. Singleton (1982), "Generalized instrumental variables estimation of nonlinear rational expectations models", Econometrica 50(5):1269-1286.
Hansen, L.P., and K.J. Singleton (1983), "Stochastic consumption, risk aversion and the temporal behavior of asset returns", Journal of Political Economy 91(2):249-265.
Hansen, L.P., W. Roberds and T.J. Sargent (1991), "Time series implications of present-value budget balance and of martingale models of consumption and taxes", in: L.P. Hansen and T.J. Sargent, Rational Expectations Econometrics (Westview Press, Boulder, CO) 121-161.
Hansen, L.P., T.J. Sargent and T.D. Tallarini Jr (1999), "Robust permanent income and pricing", Review of Economic Studies, forthcoming.
Hause, J.C. (1980), "The fine structure of earnings and the on-the-job training hypothesis", Econometrica 48(4):1013-1029.
Hausman, J. (1981), "Labor supply", in: H. Aaron and J. Pechman, eds., How Taxes Affect Economic Behavior (The Brookings Institution, Washington, DC) 27-72.
Heaton, J., and D.J. Lucas (1996), "Evaluating the effects of incomplete markets on risk sharing and asset pricing", Journal of Political Economy 104(3):443-487.
Ch. 8: Micro Data and General Equilibrium Models
Heckman, J.J. (1971), "Three essays on the demand for goods and the supply of labor", unpublished Ph.D. Thesis (Princeton University, Princeton, NJ).
Heckman, J.J. (1974), "Effects of child-care programs on women's work effort", in: T.W. Schultz, ed., Economics of the Family: Marriage, Children, and Human Capital (University of Chicago Press, Chicago, IL) 491-518.
Heckman, J.J. (1975), "Estimates of a human capital production function embedded in a life-cycle model of labor supply", in: N. Terleckyj, ed., Household Production and Consumption (Columbia University Press, New York) 227-264.
Heckman, J.J. (1976), "A life-cycle model of earnings, learning, and consumption", Journal of Political Economy 84(4, pt. 2):S11-S44.
Heckman, J.J. (1978), "A partial survey of recent research on the labor supply of women", American Economic Review 68(Suppl.):200-207.
Heckman, J.J. (1982), "Heterogeneity and state dependence", in: S. Rosen, ed., Studies in Labor Markets (University of Chicago Press, Chicago, IL).
Heckman, J.J. (1984), "Comments on the Ashenfelter and Kydland papers", Carnegie-Rochester Conference Series on Public Policy 21:209-224.
Heckman, J.J. (1993), "What has been learned about labor supply in the past twenty years?", American Economic Review Papers and Proceedings 83(2):116-121.
Heckman, J.J., and P.J. Klenow (1997), "Human capital policy", in: M. Boskin, ed., Capital Formation, Hoover Economic Growth Conference (Hoover Institution, 1998).
Heckman, J.J., and T.E. MaCurdy (1980), "A life cycle model of female labor supply", Review of Economic Studies 47:47-74. Corrigendum: 1982, Review of Economic Studies 49:659-660.
Heckman, J.J., and T.E. MaCurdy (1982), "Corrigendum on a life cycle model of female labour supply", Review of Economic Studies 49:659-660.
Heckman, J.J., and J. Scheinkman (1987), "The importance of bundling in a Gorman-Lancaster model of earnings", Review of Economic Studies 54(2):243-255.
Heckman, J.J., and G. Sedlacek (1985), "Heterogeneity, aggregation and market wage functions: an empirical model of self-selection in the labor market", Journal of Political Economy 93(6):1077-1125.
Heckman, J.J., and G. Sedlacek (1990), "Self-selection and the distribution of hourly wages", Journal of Labor Economics 8(1, pt. 2):S329-S363.
Heckman, J.J., and P. Todd (1997), "Forty years of Mincer earnings functions", unpublished manuscript (University of Chicago).
Heckman, J.J., and R. Willis (1977), "A beta-logistic model for the analysis of sequential labor force participation by married women", Journal of Political Economy 85(1):27-58.
Heckman, J.J., A. Layne-Farrar and P. Todd (1996), "Human capital pricing equations with an application to estimating the effect of schooling quality on earnings", Review of Economics and Statistics 78(4):562-610.
Heckman, J.J., L. Lochner and C. Taber (1998), "Explaining rising wage inequality: explorations with a dynamic general equilibrium model of labor earnings with heterogeneous agents", Review of Economic Dynamics 1(1):1-58.
Heckman, J.J., L. Lochner and C. Taber (1999), "General equilibrium cost benefit analysis of education and tax policies", in: G. Ranis and L.K. Raut, eds., Trade, Growth and Development: Essays in Honor of Professor T.N. Srinivasan (Elsevier Science, Amsterdam) 291-349.
Holbrook, R., and F. Stafford (1971), "The propensity to consume separate types of income: a generalized permanent income hypothesis", Econometrica 39(1):1-21.
Hornstein, A., and J. Praschnik (1994), "The real business cycle: intermediate inputs and sectoral comovement", Discussion Paper 89 (Institute for Empirical Macroeconomics, Federal Reserve Bank of Minneapolis).
Hubbard, R.G., J.S. Skinner and S.P. Zeldes (1994), "The importance of precautionary motives in explaining individual and aggregate saving", Carnegie-Rochester Conference Series on Public Policy 40:59-125.
Huggett, M. (1996), "Wealth distribution in life-cycle economies", Journal of Monetary Economics 38(3):469-494.
Judd, K. (1998), Numerical Methods in Economics (MIT Press, Cambridge, MA).
Juhn, C., and K. Murphy (1994), "Relative wages and skill demand, 1940-1990", in: L.C. Solmon and A.E. Levenson, eds., Labor Markets, Unemployment Policy, and Job Creation (Westview Press, Boulder, CO).
Juhn, C., K. Murphy and R. Topel (1991), "Why has the natural rate of unemployment increased over time?", Brookings Papers on Economic Activity 1991(2):75-126.
Katz, L., and K. Murphy (1992), "Changes in relative wages, 1963-1987: supply and demand factors", Quarterly Journal of Economics 107(1):35-78.
Keane, M.P., and D. Runkle (1992), "On the estimation of panel data models with serial correlation when instruments are not strictly exogenous", Journal of Business and Economic Statistics 10(1):1-9.
Kehoe, T.J., and D.K. Levine (1993), "Debt-constrained asset markets", Review of Economic Studies 60(4):865-888.
Kihlstrom, R.E., and J.J. Laffont (1979), "A general equilibrium entrepreneurial theory of firm formation based on risk aversion", Journal of Political Economy 87(4):719-748.
Killingsworth, M.R., and J.J. Heckman (1986), "Female labor supply: a survey", in: O. Ashenfelter and R. Layard, eds., Handbook of Labor Economics, vol. 1 (North-Holland, Amsterdam) ch. 2, 103-204.
Kimball, M.S. (1990), "Precautionary saving in the small and in the large", Econometrica 58(1):53-73.
King, R.G., and S.T. Rebelo (1999), "Resuscitating real business cycle models", ch. 14, this Handbook.
King, R.G., C.I. Plosser and S.T. Rebelo (1988a), "Production, growth and business cycles I. The basic neoclassical model", Journal of Monetary Economics 21(2/3):191-232.
King, R.G., C.I. Plosser and S.T. Rebelo (1988b), "Production, growth and business cycles II. New directions", Journal of Monetary Economics 21(2/3):309-341.
Kocherlakota, N. (1996), "Implications of efficient risk sharing without commitment", Review of Economic Studies 63(4):595-609.
Kosters, M.H. (1966), "Effects of an income tax on labor supply", in: A.C. Harberger and M.J. Bailey, eds., The Taxation of Income From Capital, Studies of Government Finance (Brookings Institution, Washington, DC) 301-324.
Kreps, D.M., and E.L. Porteus (1978), "Temporal resolution of uncertainty and dynamic choice theory", Econometrica 46(1):185-200.
Krusell, P., and A.A. Smith (1998), "Income and wealth heterogeneity in the macroeconomy", Journal of Political Economy 106(5):867-896.
Kydland, F.E. (1984), "Labor-force heterogeneity and the business cycle and a clarification", Carnegie-Rochester Conference Series on Public Policy 21:173-208.
Kydland, F.E. (1995), "Business cycles and aggregate labor market fluctuations", in: T.F. Cooley, ed., Frontiers of Business Cycle Research (Princeton University Press, Princeton, NJ) 126-156.
Kydland, F.E., and E.C. Prescott (1982), "Time to build and aggregate fluctuations", Econometrica 50(6):1345-1370.
Lawrance, E. (1991), "Poverty and the rate of time preference", Journal of Political Economy 99(1):54-77.
Lillard, L., and Y. Weiss (1979), "Components of variation in panel earnings data: American scientists 1960-70", Econometrica 47(2):437-454.
Lillard, L., and Y. Weiss (1997), "Uncertain health and survival: effects on end-of-life consumption", Journal of Business and Economic Statistics 15(2):254-268.
Lucas Jr, R.E. (1980), "Methods and problems in business cycle theory", Journal of Money, Credit and Banking 12(4, pt. 2):696-715; reprinted 1981, in: R.E. Lucas, ed., Studies in Business-Cycle Theory (Massachusetts Institute of Technology Press, Cambridge, MA) 271-296.
Lucas Jr, R.E. (1988), "On the mechanics of economic development", Journal of Monetary Economics 22(1):3-42.
Lucas Jr, R.E. (1992), "On efficiency and distribution", The Economic Journal 102(411):233-247.
Lucas Jr, R.E., and E.C. Prescott (1971), "Investment under uncertainty", Econometrica 39(5):659-681.
Lucas Jr, R.E., and L. Rapping (1970), "Real wages, employment and inflation", in: E. Phelps, ed., Microeconomic Foundations of Employment and Inflation (Norton, New York) 257-308.
Lucas Jr, R.E., and T. Sargent (1981), Rational Expectations and Econometric Practice (University of Minnesota Press, Minneapolis, MN).
Lucas Jr, R.E., and N. Stokey (1984), "Optimal growth with many consumers", Journal of Economic Theory 32(1):139-171.
Lusardi, A. (1996), "Permanent income, current income and consumption: evidence from two panel data sets", Journal of Business and Economic Statistics 14(1):81-90.
Luttmer, E.G.J. (1996), "Asset pricing in economies with frictions", Econometrica 64(6):1439-1467.
MaCurdy, T.E. (1978), "Two essays on the life cycle", Ph.D. Thesis (University of Chicago, Department of Economics).
MaCurdy, T.E. (1981), "An empirical model of labor supply in a life cycle setting", Journal of Political Economy 89(6):1059-1085.
MaCurdy, T.E. (1982), "The use of time series processes to model the error structure of earnings in a longitudinal data analysis", Journal of Econometrics 18(1):83-114.
MaCurdy, T.E. (1983), "A simple scheme for estimating an intertemporal model of labor supply and consumption in the presence of taxes and uncertainty", International Economic Review 24(2):265-289.
MaCurdy, T.E. (1985), "Interpreting empirical models of labor supply in an intertemporal framework with uncertainty", in: J.J. Heckman and B. Singer, eds., Longitudinal Analysis of Labor Market Data (Cambridge University Press, Cambridge).
MaCurdy, T.E., and T.A. Mroz (1995), "Measuring microeconomic shifts in wages from cohort specification", unpublished manuscript (Stanford University).
MaCurdy, T.E., D. Green and H. Paarsch (1990), "Assessing empirical approaches for analyzing taxes and labor supply", Journal of Human Resources 25(3):415-490.
Mankiw, N.G. (1986), "The equity premium and the concentration of aggregate shocks", Journal of Financial Economics 17(2):211-219.
Marschak, J. (1953), "Economic measurements for policy and prediction", in: W. Hood and T. Koopmans, eds., Studies in Econometric Method (Wiley, New York).
McElroy, M. (1981), "Empirical results from estimates of joint labor supply functions of husbands and wives", in: R. Ehrenberg, ed., Research in Labor Economics 4:53-64.
Meghir, C., and G. Weber (1996), "Intertemporal nonseparability or borrowing restrictions? A disaggregate analysis using a U.S. consumption panel", Econometrica 64(5):1151-1181.
Mehra, R., and E.C. Prescott (1985), "The equity premium: a puzzle", Journal of Monetary Economics 15(2):145-161.
Miller, B.L. (1974), "Optimal consumption with a stochastic income stream", Econometrica 42(2):253-266.
Mincer, J. (1958), "Investment in human capital and the personal income distribution", Journal of Political Economy 66:281-302.
Mincer, J. (1974), Schooling, Experience and Earnings (Columbia University Press, New York).
Moffitt, R.A., and K.C. Kehrer (1981), "The effect of tax and transfer programs on labor supply: the evidence from the income maintenance experiments", in: R.G. Ehrenberg, ed., Research in Labor Economics, vol. 4 (JAI Press, Greenwich, CT) 103-150.
Mroz, T.A. (1984), "The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions", unpublished Ph.D. Thesis (Stanford University).
Murphy, K., and R. Topel (1987), "Unemployment, risk and earnings: testing for equalizing wage differences in the labor market", in: K. Lang and J.S. Leonard, eds., Unemployment and the Structure of Labor Markets (Blackwell, Oxford) 103-140.
Nakamura, A., and M. Nakamura (1981), "A comparison of the labor force behavior of married women in the United States and Canada, with special attention to the impact of income taxes", Econometrica 49(2):451-490.
Negishi, T. (1960), "Welfare economics and existence of an equilibrium for a competitive economy", Metroeconomica 12:92-97.
Pencavel, J. (1986), "Labor supply of men: a survey", in: O. Ashenfelter and R. Layard, eds., Handbook of Labor Economics, vol. 1 (North-Holland, Amsterdam) ch. 1, 3-102.
Phelan, C., and R.M. Townsend (1991), "Computing multi-period, information-constrained optima", Review of Economic Studies 58(5):853-881.
Prescott, E.C. (1986), "Theory ahead of business cycle measurement", Federal Reserve Bank of Minneapolis Quarterly Review 10(4):9-22.
Rayack, W. (1987), "Sources and centers of cyclical movement in real wages: evidence from panel data", Journal of Post Keynesian Economics 10(1):3-21.
Rogerson, R. (1988), "Indivisible labor, lotteries and equilibrium", Journal of Monetary Economics 21(1):3-16.
Rosen, S. (1976), "A theory of life earnings", Journal of Political Economy 84(Suppl.):345-382.
Rubinstein, M. (1974), "An aggregation theorem for securities markets", Journal of Financial Economics 1:225-244.
Ryder, H., F. Stafford and P. Stephan (1976), "Labor, leisure and training over the life cycle", International Economic Review 17:651-674.
Samwick, A. (1997), "Discount rate heterogeneity and social security reform", mimeograph (Dartmouth College).
Santos, M.S. (1999), "Numerical solution of dynamic economic models", ch. 5, this Handbook.
Sattinger, M. (1993), "Assignment models of the distribution of earnings", Journal of Economic Literature 31(2):831-880.
Scheinkman, J.A., and L. Weiss (1986), "Borrowing constraints and aggregate economic activity", Econometrica 54(1):23-45.
Shaw, K. (1989), "Life cycle labor supply with human capital accumulation", International Economic Review 30(2):431-456.
Shea, J. (1995), "Union contracts and the life-cycle permanent income hypothesis", American Economic Review 85(1):186-200.
Solon, G., R. Barsky and J.A. Parker (1994), "Measuring the cyclicality of real wages: how important is composition bias?", Quarterly Journal of Economics 109(1):1-25.
Stockman, A.C., and L. Tesar (1995), "Tastes and technology in a two country model of the business cycle: explaining international co-movements", American Economic Review 85(1):168-185.
Stokey, N.L., and S.T. Rebelo (1995), "Growth effects of flat-rate taxes", Journal of Political Economy 103(3):519-550.
Taylor, J.B. (1999), "Staggered wage setting in macroeconomics", ch. 15, this Handbook.
Topel, R. (1986), "Local labor markets", Journal of Political Economy 94(3):111-143.
US Bureau of the Census (1960), 1960 Census Public Use Sample (United States Government Printing Office, Washington, DC).
Uzawa, H. (1965), "Optimum technical change in an aggregative model of economic growth", International Economic Review 6(1):18-31.
Watson, M.W. (1993), "Measures of fit for calibrated models", Journal of Political Economy 101(6):1011-1041.
Weil, P. (1989), "The equity premium puzzle and the risk-free rate puzzle", Journal of Monetary Economics 24(3):401-421.
Weiss, Y. (1986), "The determination of life cycle earnings: a survey", in: O. Ashenfelter and R. Layard, eds., Handbook of Labor Economics, vol. 1 (Elsevier Science, New York) part 5, 603-640.
Welch, F. (1969), "Linear synthesis of skill distribution", Journal of Human Resources 4(3):311-325.
Wilson, R. (1968), "The theory of syndicates", Econometrica 36(1):119-132.
Zeldes, S.P. (1989), "Consumption and liquidity constraints: an empirical investigation", Journal of Political Economy 97(2):305-346.
Chapter 9
NEOCLASSICAL GROWTH THEORY
ROBERT M. SOLOW
Massachusetts Institute of Technology, Department of Economics, E52-383B, Cambridge, MA 02139, USA
Contents
Abstract
Keywords
1. Introduction
2. The Harrod-Domar model
3. The basic one-sector model
4. Completing the model
5. The behaviorist tradition
6. The optimizing tradition
7. Comparing the models
8. The Ramsey problem
9. Exogenous technological progress
10. The role of labor-augmentation
11. Increasing returns to scale
12. Human capital
13. Natural resources
14. Endogenous population growth and endogenous technological progress in the neoclassical framework
15. Convergence
16. Overlapping generations
17. Open questions
References
Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford © 1999 Elsevier Science B.V. All rights reserved
Abstract
This chapter is an exposition, rather than a survey, of the one-sector neoclassical growth model. It describes how the model is constructed as a simplified description of the real side of a growing capitalist economy that happens to be free of fluctuations in aggregate demand. Once that is done, the emphasis is on the versatility of the model, in the sense that it can easily be adapted, without much complication, to allow for the analysis of important issues that are excluded from the basic model. Among the issues treated are: increasing returns to scale (but not to capital alone), human capital, renewable and non-renewable natural resources, endogenous population growth and technological progress. In each case, the purpose is to show how the model can be minimally extended to allow incorporation of something new, without making the analysis excessively complex. Toward the end, there is a brief exposition of the standard overlapping-generations model, to show how it admits qualitative behavior generally absent from the original model. The chapter concludes with brief mention of some continuing research questions within the framework of the simple model.
Keywords
growth, technological progress, neoclassical model
JEL classification: O4, E1
1. Introduction
As part of macroeconomics, growth theory functions as the study of the undisturbed evolution of potential (or normal capacity) output. The force of "undisturbed" in this context is the maintained assumption that the goods and labor markets clear, i.e., that labor and capital are always fully or normally utilized (or, at the very minimum, that the degree of utilization does not vary). The scope of specifically "neoclassical" growth theory is harder to state, because it is a matter of judgment or convention how much more of the neoclassical general equilibrium apparatus to incorporate in a model of undisturbed growth. As in most of macroeconomics, modeling strategy in growth theory tends to be weighted away from generality and toward simplicity, because the usual intention is to compare model with data at an early stage. Simplicity does not mean rigidity. On the contrary, it will emerge from this review that the neoclassical growth model is extraordinarily versatile. Like one of those handy rotary power tools that can do any of a dozen jobs if only the right attachment is snapped on, the simple neoclassical model can be extended to encompass increasing and decreasing returns to scale, natural resources, human capital, endogenous population growth and endogenous technological change all without major alteration in the character of the model. In this survey, the completely aggregated one-sector model will be the main focus of attention. Models with several sectors (agriculture and industry, consumption goods and capital goods) have attracted attention from time to time, but they tend to raise different issues. To discuss them would break continuity. The main loss from this limitation is that the important literature on open-economy aspects of growth theory has to be ignored. [The main reference is Grossman and Helpman (1991) and the work they stimulated.] 
Apart from the underlying restriction to "equilibrium growth" (meaning, in practice, the full utilization already mentioned), the most important neoclassical attribute is the assumption of diminishing returns to capital and labor. Here "capital" means (the services of) the stock of accumulated real output in the strictest one-good case, or the complex of stocks of all accumulatable factors of production, including human capital and produced knowledge, when they are explicitly present. The further assumption of constant returns to scale is typically neoclassical, no doubt, but it is not needed in some unmistakably neoclassical approaches to growth theory. The text will be neutral as between the ultra-strong neoclassical assumption that the economy traces out the intertemporal utility-maximizing program for a single immortal representative consumer (or a number of identical such consumers) and the weaker assumption that saving and investment are merely common-sense functions of observables like income and factor returns. The long-run implications tend to be rather similar anyway. Much of growth theory, neoclassical or otherwise, is about the structural characteristics of steady states and about their asymptotic stability (i.e., whether equilibrium paths from arbitrary initial conditions tend to a steady state). The precise definition of a steady state may differ from model to model. Most often it is an evolution
along which output and the stock of capital grow at the same constant rate. It would be possible to pay much more attention to non-steady-state behavior, by computer simulation if necessary. The importance of steady states in growth theory has both theoretical and empirical roots. Most growth models have at least one stable steady state; it is a natural object of attention. Moreover, ever since Kaldor's catalogue of "stylized facts" [Kaldor (1961)], it has generally, if casually, been accepted that advanced industrial economies are close to their steady-state configurations, at least in the absence of major exogenous shocks. The current vogue for large international cross-section regressions, with national rates of growth as dependent variables, was stimulated by the availability of the immensely valuable Summers-Heston (1991) collection of real national-accounts data for many countries over a fairly long interval of time. The results of all those regressions are neither impressively robust nor clearly causally interpretable. Some of them do suggest, however, that the advanced industrial (OECD) economies may be converging to appropriate steady states. There is nothing in growth theory to require that the steady-state configuration be given once and for all. The usefulness of the theory only requires that large changes in the determinants of steady states occur infrequently enough that the model can do meaningful work in the meanwhile. Then the steady state will shift from time to time whenever there are major technological revolutions, demographic changes, or variations in the willingness to save and invest. These determinants of behavior have an endogenous side, no doubt, but even when established relationships are taken into account there will remain shocks that are too deep or too unpredictable to be endogenized. 
No economy is a close approximation to Laplace's clockwork universe, in which knowledge of initial positions and velocities is supposed to determine the whole future.
2. The Harrod-Domar model
This survey is not intended as a history of thought. It is worth saying, however, that neoclassical growth theory arose as a reaction to the Harrod-Domar models of the 1940s and 1950s [Harrod (1939), Domar (1946)]. (Although their names are always linked, the two versions have significant differences. Harrod is much more concerned with sometimes unclear thoughts about entrepreneurial investment decisions in a growing economy. Domar's more straightforward treatment links up more naturally with recent ideas.) Suppose that efficient production with the aggregate technology requires a constant ratio of capital to (net) output, say Y = vK. Suppose also that net saving and investment are always a fixed fraction of net output, say I = dK/dt = sY. Then, in order for the utilization rate of capital to stay constant, capital and output have to grow at the proportional rate sv. Since labor input is proportional to output, employment would then grow at the same rate. If labor productivity were increasing at the rate m, the growth rate of employment would be sv - m. Let the growth of the labor force, governed
mainly by demography, be n. Then the persistence of any sort of equilibrium requires that sv = m + n; if sv > m + n there would be intensifying labor shortage, limiting the use of capital, while if sv < m + n there would be increasing unemployment. But the equilibrium condition sv = m + n is a relation among four parameters that are treated, within the Harrod-Domar model, as essentially independently determined constants: s characterizes the economy's propensity to save and invest, v its technology, n its demography, and m its tendency to innovate. There is no reason for them to satisfy any particular equation. An economy evolving according to Harrod-Domar rules would be expected to alternate long periods of intensifying labor shortage and long periods of increasing unemployment. But this is an unsatisfactory picture of 20th century capitalism. The Harrod-Domar model also tempted many - though not its authors - to the mechanical belief that a doubling of the saving-investment quota (s) would double the long-term growth rate of a developing or developed economy. Experience has suggested that this is far too optimistic. There was a theoretical gap to be filled. The natural step is to turn at least one of the four basic parameters into an equilibrating variable. Every one of those four parameters has its obvious endogenous side, and models of economic growth have been built that endogenize them. Within the neoclassical framework, most attention has been paid to treating the capital intensity of production and the rate of saving and investment as variables determined by normal economic processes. The tradition of relating population growth to economic development goes back a lot further; and labor-force participation clearly has both economic and sociological determinants. The case of technological progress is interesting. Most neoclassical growth theory has treated it as exogenous.
Some authors, e.g., Fellner (1961) and von Weizsäcker (1966), have discussed the possibility that internal economic factors might influence the factor-saving bias of innovations. It is then no distance at all to the hypothesis that the volume of innovation should be sensitive to economic incentives. This was certainly widely understood. But there was little or no formal theorizing about the rate of endogenous technological progress until the question was taken up in the 1980s. The original references are Lucas (1988) and Romer (1986), but there is now a vast literature.
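To make the knife-edge condition sv = m + n concrete, here is a minimal numerical sketch. The function and all parameter values are illustrative assumptions of this exposition, not taken from Harrod or Domar.

```python
# Illustrative sketch of the Harrod-Domar knife-edge sv = m + n.
# With Y = v*K and I = s*Y, capital and output grow at rate s*v; labor demand
# then grows at s*v - m (productivity rises at m), while labor supply grows
# at n. Unless s*v = m + n exactly, the employment ratio drifts without bound.

import math

def employment_ratio(s, v, m, n, t):
    """Labor demanded relative to labor supplied at time t (both normalized
    to 1 at t = 0): exp((s*v - m - n) * t)."""
    return math.exp((s * v - m - n) * t)

# Parameter values are made up for illustration only.
s, v, m, n = 0.20, 0.25, 0.02, 0.03   # here s*v = 0.05 = m + n: balanced
print(employment_ratio(s, v, m, n, t=50))      # stays at 1.0

# Double the saving rate: s*v = 0.10 > m + n, intensifying labor shortage.
print(employment_ratio(2 * s, v, m, n, t=50))  # grows like e^2.5, about 12.2
```

The point of the sketch is only that nothing in the model pulls the four parameters toward balance; any small mismatch compounds exponentially.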
3. The basic one-sector model
The model economy has a single produced good ("output") whose production per unit time is Y(t). The available technology allows output to be produced from current inputs of labor, L(t), and the services of a stock of "capital" that consists of previously accumulated and partially depreciated quantities of the good itself, according to the production function Y = F(K, L). (Time indexes will be suppressed when not needed.) The production function exhibits (strictly) diminishing returns to capital and labor separately, and constant returns to scale. (More will be said about this later.)
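A Cobb-Douglas technology makes these assumptions concrete; the functional form and the exponent are my illustrative choices here, not something the chapter imposes.

```python
# Illustrative check: F(K, L) = K**a * L**(1 - a), 0 < a < 1, has constant
# returns to scale and strictly diminishing returns to each factor, so output
# per worker depends only on capital intensity k = K/L.

def F(K, L, a=0.3):
    return K**a * L**(1 - a)

def f(k, a=0.3):
    """Intensive form: f(k) = F(k, 1) = k**a."""
    return k**a

# Constant returns to scale: F(2K, 2L) = 2 F(K, L).
assert abs(F(8.0, 4.0) - 2 * F(4.0, 2.0)) < 1e-9

# The per-worker reduction Y = L f(K/L).
K, L = 8.0, 4.0
assert abs(F(K, L) - L * f(K / L)) < 1e-9

# Diminishing returns to capital: the marginal product a*k**(a-1) falls in k.
mpk = lambda k, a=0.3: a * k**(a - 1)
assert mpk(1.0) > mpk(4.0) > 0
```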
Constant returns to scale allows the reduction

Y = F(K, L) = L F(K/L, 1) = L F(k, 1) = L f(k),
and thus finally that y = f(k), where y is output per unit of labor input, i.e., productivity, and k is the ratio of capital to labor input, i.e., capital intensity. From diminishing returns, f(k) is increasing and strictly concave. The usual further assumptions (Inada conditions) about f(k) eliminate uninteresting possibilities: f(0) ≥ 0, f'(0) = ∞, f'(∞) = 0. These are overly strong: the idea is that the marginal product of capital should be large at very low capital intensity and small at very large capital intensity. [No more than continuity and piecewise differentiability is required of f(·), but nothing is lost by assuming it to be at least twice continuously differentiable, so strict diminishing returns means that f''(k) < 0.] The universal assumption in growth theory is that each instant's depreciation is just proportional to that instant's stock of capital, say D = dK. This is known to be empirically inaccurate, but it is the only assumption that makes depreciation independent of the details of the history of past gross investment. The convenience is too great to give up. Since this point is usually glossed over, it is worth a moment here. A much more general description is that there is a non-increasing survivorship function j(a), with j(0) = 1 and j(A) = 0. (A may be infinite.) The interpretation is that j(a) is the fraction of any investment that survives to age a. Then if I(t) is gross investment, K(t) = ∫_0^A I(t - a) j(a) da. Now differentiation with respect to time and one integration by parts with respect to a leads to

K'(t) = I(t) - ∫_0^A I(t - a) d(a) da,    (3.1)
where d(a) = −j'(a) is the rate of depreciation at age a. So net investment at time t depends on the whole stream of gross investments over an interval equal to the maximum possible lifetime of capital. It can be checked that only exponential survivorship, j(a) = e^(−da), simplifies to K' = I(t) − dK(t). This assumption will be maintained for analytical convenience. The more complicated formula could easily be adapted to computer simulation.

4. Completing the model

At each instant, current output has to be allocated to current consumption or gross investment: Y = C + I. It follows that

K' = Y − dK − C = F(K, L) − dK − C.    (4.1)
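The remark above, that only exponential survivorship collapses the history-dependent formula (3.1) into K' = I(t) − dK(t), is easy to check by the sort of computer simulation just suggested. In the sketch below, the gross-investment history I(t), the parameter values, and the alternative "one-hoss-shay" survivorship function are all illustrative assumptions, not from the text:

```python
import math

d = 0.1      # depreciation parameter for the exponential case (illustrative)
da = 0.01    # age step for the Riemann sums

def I(t):
    # an arbitrary smooth gross-investment history (pure illustration)
    return 1.0 + 0.5 * math.sin(0.3 * t)

def K(t, j, A):
    # K(t) = integral over [0, A) of I(t - a) j(a) da
    return sum(I(t - i * da) * j(i * da) * da for i in range(int(A / da)))

def Kdot(t, j, A, h=1e-3):
    # numerical time-derivative of the capital stock
    return (K(t + h, j, A) - K(t - h, j, A)) / (2.0 * h)

t = 10.0

# Exponential survivorship: net investment collapses to I(t) - d K(t).
j_exp = lambda a: math.exp(-d * a)
gap_exp = abs(Kdot(t, j_exp, 100.0) - (I(t) - d * K(t, j_exp, 100.0)))

# "One-hoss-shay" survivorship (capital lives exactly 15 years): here
# K' = I(t) - I(t - 15), which no rule of the form I - dK reproduces,
# even with d set to the reciprocal of the mean lifetime.
j_shay = lambda a: 1.0
gap_shay = abs(Kdot(t, j_shay, 15.0) - (I(t) - K(t, j_shay, 15.0) / 15.0))

print(gap_exp, gap_shay)   # the first gap is tiny, the second is not
```

The exponential case matches the simple rule up to discretization error; the fixed-lifetime case does not, which is exactly why net investment there depends on the whole past investment stream.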
Ch. 9: Neoclassical Growth Theory

If the labor force is exogenous and fully employed, L(t) is a given function of time. (In this completely aggregated context, the clearing of the markets for goods and labor amounts to the equality of saving and investment at full employment.) Then any systematic relationship that determines C(t) as a function of K(t) and t converts Equation (4.1) into an ordinary differential equation that can be integrated to determine the future path of the economy, given L(t) and the initial value of K. Suppose L(t) = e^(nt). Then simple transformations convert Equation (4.1) into autonomous per capita terms:

k' = f(k) − (d + n)k − c,    (4.2)
where, of course, c = C/L. The last component to be filled in is a rule that determines consumption per capita. Here there are two distinct strategies plus some intermediate cases. The simplest possibility, as mentioned earlier, is just to introduce a plausible consumption function with some empirical support. This was the earliest device [Solow (1956), Swan (1956)]. The other extreme, now more common, is to imagine the economy to be populated by a single immortal representative household that optimizes its consumption plans over infinite time in the sort of institutional environment that will translate its wishes into actual resource allocation at every instant. The origins are in Ramsey (1928), Cass (1965) and Koopmans (1965), but there is a large contemporary literature on this basis. For excellent surveys with further references, see Barro and Sala-i-Martin (1995), Blanchard and Fischer (1989, Ch. 2), and D. Romer (1996, Chs. 1, 2).
5. The behaviorist tradition
The two simplest examples in the "behaviorist" tradition are (a) saving-investment is a given fraction of income-output, and (b) saving-investment is a given fraction (which may be unity) of non-wage income, however the distribution of income between wages and profit or interest is determined in the society at hand. The case where different fractions of wage and non-wage income are saved amounts to a mixture of (a) and (b) and does not need to be examined separately. [Complications arise if the correct distinction is between "workers" and "capitalists" instead of wages and non-wages, because workers who save must obviously become partial capitalists. See Samuelson and Modigliani (1966), and also Bertola (1994).] In all this, an important role is played by the maintained assumption that investment always equals saving at full utilization. Under the first of these hypotheses, (4.2) becomes

k' = f(k) − (d + n)k − (1 − s)f(k) = sf(k) − (d + n)k,
(5.1)
where of course s is the fraction of output saved and invested. The conditions imposed on f(k) imply that the right-hand side (RHS) of Equation (5.1) is positive for small k
[Fig. 1. The curves sf(k) and (d + n)k against k.]

because f'(0) > (d + n)/s, first increasing and then decreasing because f''(k) < 0, and eventually becomes and remains negative because f'(k) becomes and remains very small. It follows that there is a unique k* > 0 such that k'(t) > 0 when k(t) < k* and k'(t) < 0 when k(t) > k*. Thus k* is the globally asymptotically stable rest point for k (leaving aside the origin, which may be an unstable rest point if f(0) = 0; the phase diagram, Figure 1, drawn for the case f(0) = 0, makes this clear). The properties of k* will be discussed later. For now it is enough to note that, starting from any initial capital intensity, the model moves monotonically to a predetermined capital intensity defined from Equation (5.1) by sf(k*) − (d + n)k* = 0. [Note that k* precisely validates the Harrod-Domar condition because f(k)/k corresponds precisely to v, now a variable. The depreciation rate appears only because Equation (5.1) makes gross saving proportional to gross output instead of net saving proportional to net output.] When the economy has reached the stationary capital intensity k*, the stock of capital is growing at the same rate as the labor force, namely n, and, by constant returns to scale, so is output. The only sustainable growth rate is the exogenously given n, and productivity is constant. A reasonable model of growth must obviously go beyond this.

The second hypothesis mentioned earlier, that saving-investment is proportional to non-wage income, requires a theory of the distribution of income between wages and profits. The usual presumption is the perfectly competitive one: profit (per capita) is kf'(k) because f'(k) is the marginal product of capital. Some further generality is almost costlessly available: if the economy is characterized by a constant degree of monopoly in the goods market and monopsony in the labor market, then profit per capita will be proportional to kf'(k) with a factor of proportionality greater than one. If s_k is the fraction of profits saved and invested (or the product of that fraction and the monopoly-monopsony factor), Equation (4.2) can be replaced by
k' = s_k k f'(k) − (d + n)k.
(5.2)
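For a Cobb-Douglas technology f(k) = k^b, the root of sf(k*) = (d + n)k* in Equation (5.1) has the closed form k* = [s/(d + n)]^(1/(1−b)), and the equation can be integrated directly to exhibit the monotone global convergence just described. A minimal sketch, with parameter values that are illustrative assumptions only:

```python
s, b, d, n = 0.2, 0.3, 0.05, 0.02       # illustrative parameters
f = lambda k: k ** b

k_star = (s / (d + n)) ** (1.0 / (1.0 - b))   # root of s f(k*) = (d + n) k*

def path(k0, dt=0.01, T=400.0):
    # Euler integration of k' = s f(k) - (d + n) k
    k, ks = k0, []
    for _ in range(int(T / dt)):
        k += dt * (s * f(k) - (d + n) * k)
        ks.append(k)
    return ks

low = path(0.2 * k_star)    # start below k*: k rises monotonically toward k*
high = path(5.0 * k_star)   # start above k*: k falls monotonically toward k*
print(low[-1], high[-1], k_star)
```

Both trajectories approach the same k*, from below and from above respectively, with no overshooting, which is the global stability claimed in the text.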
[Fig. 2. The curve s_k k f'(k) and the ray (d + n)k against k.]

The analysis is not very different from that of Equation (5.1). Indeed if F(K, L) is Cobb-Douglas with elasticities b and 1 − b, so that kf'(k) = by, Equations (5.1) and (5.2) coincide, with s replaced by s_k b. More generally, the conditions imposed on f(k) do not quite pin down the behavior of kf'(k), though they help. For instance, kf'(k) < f(k) as long as the marginal product of labor is positive; so the fact that in Figure 2 the graph of sf(k) eventually falls below the ray (n + d)k, irrespective of s, implies that the RHS of Equation (5.2) becomes and remains negative for large k. The other Inada condition is more complicated. Obviously f(0) = 0 implies that kf'(k) goes to zero at the origin. Now the derivative of kf'(k) is f'(k) + kf''(k) ... 1/b). This makes at most a trivial difference in the qualitative behavior
of the solution of Equation (5.2). For some parameter choices the origin is the only steady state; for the rest there is one and only one non-zero steady state, and it is an attractor. So nothing special happens. In discrete time, however, the qualitative possibilities are diverse and complex. The discrete analogue of Equation (5.2) can easily exhibit periodic or chaotic dynamics (and even more so if there is saving from wages). It is not clear how much "practical" macroeconomic significance one should attach to this possibility; but it is surely worth study. For an excellent treatment, see Böhm and Kaas (1997). Since kf'(k)/f(k) = e(k), the elasticity of f(·) with respect to k, the RHS of Equation (5.2) could be translated as s_k e(k) f(k) − (n + d)k. As this suggests, a wide variety of assumptions about (market-clearing) saving and investment can be incorporated in the model if Equation (5.1) is generalized to
k'(t) = s(k) f(k) − (n + d)k.
(5.3)
For example, suppose s(k) is zero for an initial interval of low values of k and y, and thereafter rises fairly steeply toward the standard value s. This pattern might correspond to a subsistence level of per capita income, below which no saving takes place. The modified phase diagram now has two non-zero steady-state values of k, the larger of which is as before. The smaller steady state is now unstable, in the sense that a small upward perturbation will launch a trajectory toward the stable steady state, while a small downward perturbation will begin a path leading to k = y = 0. This is a sort of low-equilibrium trap; similar variations can be arranged by making n a function of, say, the wage rate, and thus of k. The details are straightforward.
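The low-equilibrium trap can be simulated directly from Equation (5.3). The particular s(k) below, zero beneath a subsistence capital intensity and rising steeply toward the standard value above it, is an assumed functional form for illustration, as are the parameter values:

```python
s_bar, b, d, n = 0.25, 0.3, 0.05, 0.02
f = lambda k: k ** b

def s(k):
    # no saving below a subsistence capital intensity (here k = 1),
    # rising steeply toward s_bar above it: an assumed functional form
    return 0.0 if k < 1.0 else s_bar * min(1.0, 2.0 * (k - 1.0))

def run(k0, dt=0.01, T=600.0):
    # Euler integration of k' = s(k) f(k) - (d + n) k
    k = k0
    for _ in range(int(T / dt)):
        k += dt * (s(k) * f(k) - (d + n) * k)
    return k

k_trap = run(1.05)   # a little below the unstable steady state: collapses
k_good = run(1.30)   # a little above it: converges to the high steady state
print(k_trap, k_good)
```

With these numbers the unstable steady state lies between k = 1.1 and k = 1.2: a small downward perturbation from it launches a path toward k = 0, and a small upward perturbation launches one toward the stable steady state, exactly the trap described above.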
6. The optimizing tradition

These formulations all allocate current output between consumption and investment according to a more or less mechanical rule. The rule usually has an economic interpretation, and possibly some robust empirical validity, but it lacks "microfoundations". The current fashion is to derive the consumption-investment decision from the decentralized behavior of intertemporal-utility-maximizing households and perfectly competitive profit-maximizing firms. This is not without cost. The economy has to be populated by a fixed number of identical immortal households, each endowed with perfect foresight over the infinite future. No market imperfections can be allowed on the side of firms. The firms have access to a perfect rental market for capital goods; thus they can afford to maximize profits instant by instant. For expository purposes, nothing is lost by assuming there to be just one household and one firm, both price-takers in the markets for labor, goods, loans and the renting of capital. The firm's behavior is easy to characterize because it can afford to be myopic under these assumptions. To unclutter the notation, take d = 0. The market provides a real wage w(t) (in terms of the produced commodity) and a real (own) rate of interest i(t).
It is always profit-maximizing for the firm to hire labor and rent capital up to the point where

i(t) = f'(k(t)),
(6.1a)
w(t) = f(k(t)) − kf'(k(t)),
(6.1b)
the two RHS expressions being the marginal products of capital and labor. [To allow for depreciation, just subtract d from the RHS of Equation (6.1a).] As before, the size of the household grows like e^(nt), and each member supplies one unit of labor per unit time, without disutility. (This last simplifying assumption is dispensable.) The household's preferences are expressed by an instantaneous utility function u(c(t)), where c(t) is the flow of consumption per person, and a discount rate for utility, denoted by r. The household's objective at time 0 is the maximization of

U = ∫₀^∞ e^(−rt) u(c(t)) e^(nt) dt.    (6.2)
(The term e^(nt) can be omitted, defining a slightly different but basically similar optimization problem for the household or clan.) The maximizing c(t) must, of course, satisfy a lifetime budget constraint that needs to be spelled out. Let J(t) = ∫₀^t i(s) ds, so that e^(−J(t)) is the appropriate factor for discounting output from time t back to time zero. The household's intertemporal budget constraint requires that the present value (at t = 0) of its infinite-horizon consumption program should not exceed the sum of its initial wealth and the present value of its future wage earnings. In per-capita terms this says
∫₀^∞ e^(−J(t)) c(t) e^(nt) dt = k0 + ∫₀^∞ e^(−J(t)) w(t) e^(nt) dt,    (6.3)
where non-satiation is taken for granted, so the budget constraint holds with equality, and k0 is real wealth at t = 0. Maximization of Equation (6.2) subject to condition (6.3) is standard after introduction of a Lagrange multiplier, and leads to the classical (Ramsey) first-order condition

[−c u''(c)/u'(c)] (1/c) c'(t) = i(t) − r.    (6.4)
The first fraction is the (absolute) elasticity of the marginal utility of consumption. So the optimizing household has increasing, stationary, or decreasing consumption according as the current real interest rate (real return on saving) exceeds, equals, or falls short of the utility discount rate. For a given discrepancy, say a positive one, consumption per head will grow faster the less elastic the marginal utility of consumption.
[Fig. 3. Phase diagram in the (k, c) plane, marking k0, k*, c0, c*.]

In the ubiquitous special case of constant elasticity, i.e., if u(c) = (c^(1−h) − 1)/(1 − h), Equation (6.4) becomes

(1/c) c'(t) = (i(t) − r)/h = (f'(k(t)) − r)/h    (6.3a)
by Equation (6.1a). Under these rules of the game, the trajectory of the economy is determined by Equation (6.4) or, for concreteness, Equations (6.3a) and (4.2), reproduced here with d = 0 as

k'(t) = f(k(t)) − nk(t) − c(t).
(6.5)
The phase diagram in c and k is as shown in Figure 3. c'(t) = 0 along the vertical line defined by f'(k*) = r, with c increasing to the left of the line and decreasing to the right. k'(t) = 0 along the locus defined by c = f(k) − nk, with k decreasing above the curve and increasing below it. Under the normal assumption that r > n [otherwise Equation (6.2) is unbounded along feasible paths] the intersection of the two loci defines a unique steady state:
k = k*, where f'(k*) = r;    c* = f(k*) − nk*.    (6.4a,b)
7. Comparing the models
This steady state is exactly like the steady state defined by a "behaviorist" model: capital per head and output per head are both constant, so capital and output grow
at the same rate as the labor force, namely n. In the steady state, the ratio of saving and investment to output is a constant, nk*/f(k*). The steady-state investment rate is higher the higher k* turns out to be, and thus, from Equation (6.4a), the lower is r. So far as steady-state behavior is concerned, choosing a value for s is just like choosing a value for r, a higher s corresponding to a lower r. Out-of-steady-state behavior differs in the two schemes. In the usual way, it is shown that the steady state or singular point (c*, k*) is a saddle-point for the differential equations (4.2) and (6.3a). History provides only one initial condition, namely k0. If the initial value for c is chosen anywhere but on the saddle path, the resulting trajectory is easily shown to be non-optimal for the household (or else ultimately infeasible). The appropriate path for this economy is thus defined by the saddle path, which leads asymptotically to the steady state already discussed. Of course the saving-investment rate is not constant along that path, although it converges to the appropriate constant value (from above if k0 < k*).
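The saddle path itself can be computed by a shooting experiment: integrate Equations (6.3a) and (6.5) forward from k0 under trial values of c(0) and bisect, using the fact that too low a c(0) sends k past k* while too high a c(0) exhausts the capital stock in finite time. The Cobb-Douglas technology, CRRA utility (the elasticity h of the text is written theta here), and all parameter values below are assumptions for illustration:

```python
b, r, n, theta = 0.3, 0.05, 0.01, 2.0   # technology, discount rate, n, CRRA
f = lambda k: k ** b
fp = lambda k: b * k ** (b - 1.0)

k_star = (b / r) ** (1.0 / (1.0 - b))   # f'(k*) = r
c_star = f(k_star) - n * k_star

def classify(c0, k0, dt=0.01, T=300.0):
    # forward Euler on k' = f(k) - n k - c and c'/c = (f'(k) - r)/theta
    k, c = k0, c0
    for _ in range(int(T / dt)):
        if k < 1e-6:
            return "high"               # c(0) too high: capital exhausted
        if k > 1.001 * k_star:
            return "low"                # c(0) too low: k overshoots k*
        k, c = (k + dt * (f(k) - n * k - c),
                c + dt * c * (fp(k) - r) / theta)
    return "saddle"

k0 = 0.5 * k_star
lo, hi = 1e-3, c_star                   # the saddle value of c(0) lies between
for _ in range(30):
    mid = 0.5 * (lo + hi)
    if classify(mid, k0) == "low":
        lo = mid
    else:
        hi = mid
c0 = 0.5 * (lo + hi)
print(c0, c_star)   # saddle-path c(0) from k0 < k* lies below c*
```

Any c(0) off the bisected value visibly diverges within a finite horizon, which is the numerical face of the saddle-point instability: history supplies k0, and optimality (or feasibility) pins down c(0).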
8. The Ramsey problem

The optimizing model just described originated with Ramsey (1928) and was further developed by Cass (1965) and Koopmans (1965). They regarded it, however, as a story about centralized economic planning. In that version, Equation (6.2) is a social welfare indicator. A well-meaning planner seeks to choose c(t) so as to maximize Equation (6.2), subject only to the technologically determined constraint (4.2) and the initial stock of capital. [In that context, Ramsey thought that discounting future utilities
was inadmissible. He got around the unboundedness of Equation (6.2) when r = 0 by assuming that u(·) had a least upper bound B and then minimizing the undiscounted integral of B − u(c(t)), either omitting the factor e^(nt) on principle or assuming n = 0. The undiscounted case can sometimes be dealt with, despite the unbounded integral, by introducing a more general criterion of optimality known as "overtaking". For this see von Weizsäcker (1965).] Then straightforward appeal to the Euler equation of the calculus of variations or to the Maximum Principle leads precisely to the conditions (6.4) (or 6.3a) and (6.5) given above. A transversality condition rules out trajectories other than the saddle path. The competitive trajectory is thus the same as the planner's optimal trajectory. One can say either that the solution to the planning problem can be used to calculate the solution to the competitive outcome, or that the competitive rules offer a way to decentralize the planning problem. Lest this seem too easy, it should be remembered that the competitive solution simply presumes either that the household have perfect foresight out to infinity or that all the markets, for every value of t, are open and able to clear at time zero. That strikes many workers in this field as natural and some others as gratuitous.
9. Exogenous technological progress

These models eventuate in a steady state in which y, k and c are constant, i.e., aggregate output and capital are growing at the same rate as employment and the standard of living is stationary. That is not what models of growth are supposed to be about. Within the neoclassical framework, this emergency is met by postulating that there is exogenous technological progress. The extensive and intensive production functions are written as F(K, L; t) and f(k; t), so the dependence on calendar time represents the level of technology available at that moment. So general an assumption is an analytical dead end. The behaviorist version of the model can only be dealt with by simulation; the optimizing version leads to a complicated, non-autonomous Euler equation. The standard simplifying assumption is that technological progress is "purely labor-augmenting", so that the extensive production function can be written in the form Y(t) = F(K(t), A(t)L(t)). Technological progress operates as if it just multiplied the actual labor input by an increasing (usually exponential) function of time. The quantity A(t)L(t) is referred to as "labor in efficiency units" or "effective labor". It will be shown below that this apparently gratuitous assumption is not quite as arbitrary as it sounds. If y(t) is now redefined as Y(t)/A(t)L(t) = Y(t)/(e^(at) e^(nt)) = Y(t)/e^((a+n)t), and similarly for k and c, the basic differential equation (5.1) of the behaviorist model is replaced by
k' = sf(k) − (a + n + d)k,
(9.1)
the only change from Equation (5.1) being that the rate of growth of employment in efficiency units replaces the rate of growth in natural units. [One can write f(k)
rather than f(k; t) because the time-dependence is completely absorbed by the new version of k.] Under the standard assumptions about f(·) there is once again a unique non-trivial steady-state value k*, defined as the non-zero root of sf(k*) = (a + n + d)k*. This steady state attracts every path of the model starting from arbitrary k0 > 0. The difference is that in this steady state aggregate capital, output and consumption are all proportional to e^((a+n)t), so that capital, output and consumption per person in natural units are all growing at the exponential rate a, to be thought of as the growth rate of productivity. This growth rate is obviously independent of s. The effect of a sustained step increase in s, starting from a steady state, is a temporary increase in the aggregate and productivity growth rates that starts to taper off immediately. Eventually the new path approaches its own steady state, growing at the same old rate, but proportionally higher than the old one. [There is a possibility of overinvestment, if f'(k*) < a + n + d, in which case higher s increases output but decreases consumption. This will be elaborated later, in connection with the Diamond overlapping-generations model.] The situation is slightly more complicated in the optimizing version of the model because the argument of c(t) must continue to be current consumption per person in natural units, i.e. consumption per effective unit of labor multiplied by e^(nt). This does not change the structure of the model in any important way. The details can be found in D. Romer (1996, Ch. 2). It goes without saying that the introduction of exogenous technical progress achieves a steady state with increasing productivity, but does not in any way explain it. Recent attempts to model explicitly the generation of A(t) fall under the heading of "endogenous growth models" discussed in the original papers by Lucas (1988) and P.M. Romer (1990), and in the textbooks of Barro and Sala-i-Martin (1995) and D. Romer (1996). A few remarks about endogenizing aspects of technical progress within the neoclassical framework are deferred until later.
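The effect of a sustained step increase in s can be traced by integrating Equation (9.1) from the old steady state. For Cobb-Douglas f(k) = k^b, productivity Y/L = A(t)f(k) grows at a + b k'/k, so the growth rate jumps at the moment of the change and then tapers back to a. Parameter values are illustrative assumptions:

```python
a, n, d, b = 0.02, 0.01, 0.05, 0.3       # illustrative parameters
f = lambda k: k ** b
k_star = lambda s: (s / (a + n + d)) ** (1.0 / (1.0 - b))

s_new, dt = 0.3, 0.01
k = k_star(0.2)                          # start in the old steady state (s = 0.2)
g = []                                   # growth rate of productivity A(t) f(k)
for _ in range(int(150.0 / dt)):
    kdot = s_new * f(k) - (a + n + d) * k
    g.append(a + b * kdot / k)           # d ln(A f(k))/dt = a + b k'/k
    k += dt * kdot
print(g[0], g[-1])
```

The simulated growth rate jumps above a immediately, declines monotonically, and returns to a while k settles at the new, proportionally higher steady-state value, exactly the level effect described in the text.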
10. The role of labor-augmentation
The question remains: what is the role of the assumption that exogenous technical progress is purely labor-augmenting? It is clear that either version of the model, and especially easily the behaviorist version, can be solved numerically without any such assumption. It is just a matter of integrating the differential equation k' = sf(k; t) − (n + d)k. The deeper role of labor-augmentation has to do with the importance, in theory and in practice, attached to steady states. It can be shown that purely labor-augmenting technical progress is the only kind that is compatible with the existence of a steady-state trajectory for the model. This observation was due originally to Uzawa (1961). Since the proof is not easily accessible, a compact version is given here. To begin with, it is worth noting that labor-augmenting technical progress is often described as "Harrod-neutral" because Roy Harrod first observed its particular
significance for steady states. We have defined a steady state as a growth path characterized by a constant ratio of capital to output. For a well-behaved f(k; t) it is clear that constancy of the average product of capital is equivalent to constancy of the marginal product of capital. So a steady state might just as well be characterized as a path with constant marginal product of capital. In the same way, because y/k is monotone in k, one can express k as a function of k/y and t, and therefore y = f(k; t) = f(k(k/y; t); t) = g(z; t), where z stands for k/y. Now a straightforward calculation leads to dy/dk = g_z/(g + z g_z). The requirement that dy/dk be independent of time for given k/y says that the RHS of this equation is independent of t and therefore equal to a function of z alone, say c(z). Thus one can write g_z(z; t) = c(z)[g(z; t) + z g_z(z; t)], and finally that g_z(z; t)/g(z; t) = c(z)/(1 − z c(z)). The RHS depends only on z, and thus also the LHS, which is ∂(ln g)/∂z. Integrating this last, one sees that ln g(z; t) must be the sum of a function of t and a function of z, so that g(z; t) = y = A(t)h(z). Finally z = k/y = h⁻¹(y/A), whence k/A = (y/A) h⁻¹(y/A) = j(y/A) and y/A = j⁻¹(k/A). This is exactly the purely-labor-augmenting form: Y = F(K, AL) means Y = AL F(K/AL, 1) or

y/A = f(k/A).

The assumption that technical progress is purely labor-augmenting is thus just as arbitrary as the desire that a steady-state trajectory should be admissible. That property brings along the further simplifications.
11. Increasing returns to scale

There is an almost exactly analogous, and less well understood, way of dealing with increasing returns to scale. Leaving the model in extensive variables, one sees that the equation K'(t) = sF[K(t), A(t)L(t)] can be integrated numerically for any sort of scale economies. Trouble arises only when one looks for steady-state trajectories, as a simple example shows. Suppose F is homogeneous of degree h in K and AL. If K and AL are growing at the same exponential rate g, Y = F(K, AL) must be growing at the rate gh. Unless h = 1, steady-state trajectories are ruled out. There is a simple way to restore that possibility. Let h be a positive number not equal to 1 and suppose the production function F[K, AL^h] is homogeneous of degree 1 in K and AL^h. Production exhibits increasing returns to scale in K and L if h > 1: doubling K and L will more than double AL^h and thus more than double output, though F is generally not homogeneous of any degree in K and L. (Obviously, if F is Cobb-Douglas with exponents adding to more than 1, it can always be written in this special form.) But now, if A grows at the exponential rate a and L at the rate n, it is clearly possible to have a steady state with Y and K growing at rate g = a + nh. (The same goes for h < 1, but the case of increasing returns to scale is what attracts attention.) It is an interesting property of such a steady state that productivity, i.e., output per unit of labor in natural units, Y/L, grows at the rate g − n = a + (h − 1)n. Thus the model with increasing returns to scale predicts that countries with faster growth of the labor force
will have faster growth rates of productivity, other things equal. This seems empirically doubtful. This discussion speaks only to the existence of a steady state with non-constant returns to scale. More is true. Within the behaviorist model, the steady state just described is a global attractor (apart from the trivial trap at the origin). To see this it is only necessary to redefine y as Y/AL^h and k similarly. The standard calculation then shows that k' = sf(k) − (a + hn + d)k, with a unique stable steady state at k*, defined as the unique non-zero root of the equation sf(k*) = (a + hn + d)k*. Note that, with h > 1, a higher n goes along with a smaller k* but a higher productivity growth rate. The appropriate conclusion is that the neoclassical model can easily accommodate increasing returns to scale, as long as there are diminishing returns to capital and augmented labor separately. Exactly as in the case of exogenous technical progress, a special functional form is needed only to guarantee the possibility of steady-state growth. The optimizing version of the model requires more revision, because competition is no longer a viable market form under increasing returns to scale; but this difficulty is not special to growth theory.
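The increasing-returns steady state can be checked by integrating the extensive equation K' = sY − dK directly, with an assumed Cobb-Douglas F of degree 1 in K and AL^h: the growth rate of Y/L should settle at a + (h − 1)n. Parameter values are illustrative:

```python
import math

a, n, d, s, h, b = 0.01, 0.02, 0.05, 0.2, 1.25, 0.3   # h > 1: increasing returns
# Y = F(K, A L^h) with F Cobb-Douglas of degree one: Y = K^b (A L^h)^(1-b)

dt, T = 0.01, 400.0
K, t, snap = 1.0, 0.0, []
while t < T:
    A, L = math.exp(a * t), math.exp(n * t)
    Y = K ** b * (A * L ** h) ** (1.0 - b)
    snap.append((t, Y / L))                  # productivity in natural units
    K += dt * (s * Y - d * K)
    t += dt

(t1, y1), (t2, y2) = snap[-2000], snap[-1]   # asymptotic growth rate of Y/L
rate = (math.log(y2) - math.log(y1)) / (t2 - t1)
print(rate, a + (h - 1.0) * n)
```

With h > 1 the simulated productivity growth rate exceeds a: faster labor-force growth raises productivity growth, which is the (empirically doubtful) prediction noted above.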
12. Human capital

Ever since the path-breaking work of T.W. Schultz (1961) and Gary Becker (1975) it has been understood that improvement in the quality of labor through education, training, better health, etc., could be an important factor in economic growth, and, more specifically, could be analogized as a stock of "human capital". Empirical growth accounting has tried to give effect to this insight in various ways, despite the obvious measurement difficulties. (For lack of data it is often necessary to use a current flow of schooling as a surrogate for the appropriate stock.) See, for just a few of many examples, Denison (1985) and Collins and Bosworth (1996). The U.S. Bureau of Labor Statistics, in its own growth-accounting exercises, weights hours worked with relative wage rates, and other techniques have been tried. These considerations began to play a central role in theory with the advent of endogenous growth theory following Romer and Lucas, for which references have already been given. Here there is need only for a sketch of the way human capital fits into the basic neo-classical model. Corresponding empirical calibration can be found in Mankiw, Romer and Weil (1992) and Islam (1995). Let H(t) be a scalar index of the stock of human capital, however defined, and assume as usual that the flow of services is simply proportional to the stock. Then the extensive production function can be written as Y = F(K, H, L). If there is exogenous technical progress, L can be replaced by AL as before. Assume that F exhibits constant returns to scale in its three arguments. (If desired, increasing returns to scale can be accommodated via the device described in the preceding section.) Then the intensive production function is y = F(k, h, 1) = f(k, h). In the endogenous-growth literature, it
[Fig. 4. The loci k' = 0 and h' = 0 in the (h, k) plane.]
is more usual to start with the assumption that Y = F(K, HL), so that HL is interpreted as quality-adjusted labor input. The really important difference is that it is then assumed that F is homogeneous of degree 1 in the two arguments K and HL. Obviously this implies that there are constant returns to K and H, the two accumulatable inputs, taken by themselves. This is a very powerful assumption, not innocent at all. Within the neo-classical framework, the next step is a specification of the rules according to which K and H are accumulated. Simple symmetry suggests the assumption that fractions s_K and s_H of output are invested (gross) in physical and human capital. (This is undoubtedly too crude; a few qualifications will be considered later.) Under these simple assumptions, the model is described by two equations:
k' = s_K f(k, h) − (a + n + d_K)k,
h' = s_H f(k, h) − (a + n + d_H)h.
(12.1)
As usual, a + n is the rate of growth of the (raw) labor supply in efficiency units, and d_K and d_H are the rates of depreciation of physical and human capital. Under assumptions on f(·,·) analogous to those usually made on f(·), there is just one non-trivial steady state, at the intersection in the (h, k) plane of the curves defined by setting the LHS of Equation (12.1) equal to zero. In the Cobb-Douglas case [f(k, h) = k^b h^c, b + c < 1] the phase diagram is easily calculated to look like the accompanying Figure 4. With more effort it can be shown, quite generally, that the locus of stationary k intersects the locus of stationary h from below; since both curves emanate from the origin, the qualitative picture must be as in Figure 4. Thus the steady state at (h*, k*) is stable. [It is obvious from Equation (12.1) that k*/h* = s_K/s_H if the depreciation rates are equal; otherwise the formula is only slightly more complicated.] Thus, starting from any initial conditions, K, H and Y eventually grow at the same rate, a + n. This model with human capital is exactly analogous to the model without it. But this model is unsatisfactory in at least two ways. For one thing, the production of human capital is probably not fruitfully thought of, even at this level of abstraction, as a simple diversion of part of aggregate output. It is not clear how to model the production
of human capital. The standard line taken in endogenous-growth theory has problems of its own. (It simply assumes, entirely gratuitously, that the rate of growth of human capital depends on the level of effort devoted to it.) Nothing further will be said here about this issue. The second deficiency is that, if investment in physical capital and in human capital are alternative uses of aggregate output, the choice between them deserves to be modeled in some less mechanical way than fixed shares. One alternative is to treat human capital exactly as physical capital is treated in the optimizing-competitive version of the neo-classical model. Two common-sense considerations speak against that option. The market for human capital is surely as far from competitive as any other; and reverting to infinite-horizon intertemporal optimization on the part of identical individuals is not very attractive either. It is possible to find alternatives that give some economic structure to the allocation of investment resources without going all the way to full intertemporal optimization. For example, if in fact one unit of output can be transformed into either one unit of physical or one unit of human capital, market forces might be expected to keep the rates of return on the two types of investment close to one another as long as both are occurring. This implies, given equal depreciation rates, that f_1(k, h) = f_2(k, h) at every instant. The condition f_12 > 0 is sufficient (but by no means necessary) for the implicit function theorem to give k as a function of h. If F(K, H, L) is Cobb-Douglas, k is proportional to h; the same is true for a wider class of production functions including all CES functions. The simplest expedient is to combine this with something like k' = sf(k, h) − (a + n + d)k, with h replaced by h(k). Then physical investment is a fraction of output, and human-capital investment is determined by the equal-rate-of-return condition. In the Cobb-Douglas case, this amounts to the one-capital-good model with a Cobb-Douglas exponent equal to the sum of the original exponents for k and h. It happens that this set-up reproduces exactly the empirical results of Mankiw, Romer and Weil (1992), with the original exponents for k and h each estimated to be about 0.3. A more symmetric but more complicated version is to postulate that aggregate investment is a fraction of output, with the total allocated between physical and human capital so as to maintain equal rates of return. With depreciation rates put equal for simplicity, this reduces to the equation k' + h' = sf(k, h) − (a + n + d)(k + h), together with h = h(k). The Cobb-Douglas case is, as usual, especially easy. But the main purpose of these examples is only to show that the neoclassical model can accommodate a role for human capital, with frameworks ranging from rules of thumb to full optimization.
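The fixed-shares system (12.1) can be integrated to confirm both convergence and the steady-state ratio k*/h* = s_K/s_H under equal depreciation rates. The Cobb-Douglas form and all parameter values here are assumed for illustration:

```python
a, n, d = 0.02, 0.01, 0.05            # common depreciation rate d = d_K = d_H
sK, sH, b, c = 0.2, 0.1, 0.3, 0.3     # saving shares and Cobb-Douglas exponents
f = lambda k, h: k ** b * h ** c

dt, k, h = 0.01, 0.5, 0.5
for _ in range(int(600.0 / dt)):      # Euler integration of system (12.1)
    k, h = (k + dt * (sK * f(k, h) - (a + n + d) * k),
            h + dt * (sH * f(k, h) - (a + n + d) * h))
print(k, h, k / h)   # with equal depreciation rates, k*/h* = sK/sH
```

From any positive starting point the pair (k, h) settles at the unique non-trivial steady state, so K, H and Y all end up growing at the common rate a + n, as the text states.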
13. Natural resources
There is a large literature on the economics of renewable and nonrenewable resources, some of it dealing with the implications of resource scarcity for economic growth. [An early treatise is Dasgupta and Heal (1979). See also the Handbook of Natural
656
R.M. Solow
Resource and Energy Economics, edited by Kneese and Sweeney (1989) for a more recent survey.] This is too large and distant a topic to be discussed fully here, but there is room for a sketch of the way natural resources fit into the neoclassical growth-theoretic framework. The case of renewable natural resources is simplest. Some renewable resources, like sunlight or wind, can be thought of as providing a technology for converting capital and labor (and a small amount of materials) into usable energy. They require no conceptual change in the aggregate production function. More interesting are those renewable resources - like fish stocks and forests - that can be exploited indefinitely, but whose maximal sustainable yield is bounded. Suppose the production function is Y = F(K, R, e^{(g+n)t}), with constant returns to scale, where R is the input of a renewable natural resource (assumed constant at a sustainable level) and the third input is labor in efficiency units. If a constant fraction of gross output is saved and invested, the full-utilization dynamics are K' = sF(K, R, e^{(g+n)t}) - dK, where R is remembered to be constant. For simplicity, take F to be Cobb-Douglas with elasticities a, b and 1 - a - b for K, R and L respectively. The model then looks very much like the standard neoclassical case with decreasing returns to scale. It is straightforward to calculate that the only possible exponential path for K and Y has them both growing at the rate h = (1 - a - b)(g + n)/(1 - a). If intensive variables are defined by y = Y e^{-ht} and k = K e^{-ht}, the usual calculations show that this steady state is stable. In it, output per person in natural units is growing at the rate h - n = [(1 - a - b)g - bn]/(1 - a). For this to be positive, g must exceed bn/(1 - a - b). This inequality is obviously easier to satisfy the less important an input R is, in the sense of having a smaller Cobb-Douglas elasticity, i.e., a smaller competitive share.
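The asymptotic growth rate h = (1 - a - b)(g + n)/(1 - a) is easy to verify by simulation. The sketch below (parameter values are my own illustrative assumptions) integrates K' = sF - dK with R held fixed, runs off the transient, and measures the terminal growth rate of K.

```python
import math

# Numerical check of the asymptotic growth rate (illustrative
# parameters): with R fixed and F Cobb-Douglas, K settles onto an
# exponential path with rate h = (1-a-b)(g+n)/(1-a).

def terminal_growth_rate(s=0.2, a=0.3, b=0.1, g=0.02, n=0.01, d=0.05,
                         R=1.0, K0=1.0, T=1000.0, dt=0.01):
    K, t = K0, 0.0
    while t < T:                       # run off the transient
        Y = K**a * R**b * math.exp((g + n) * t) ** (1 - a - b)
        K += dt * (s * Y - d * K)
        t += dt
    K1 = K
    for _ in range(100):               # then measure growth over one time unit
        Y = K**a * R**b * math.exp((g + n) * t) ** (1 - a - b)
        K += dt * (s * Y - d * K)
        t += dt
    return math.log(K / K1) / (100 * dt)

h_sim = terminal_growth_rate()
h_theory = (1 - 0.3 - 0.1) * (0.02 + 0.01) / (1 - 0.3)
```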
If the resource in question is nonrenewable, the situation is quite different. In the notation above, R ≥ 0 stands for the rate of depletion of a fixed initial stock S0 given at t = 0. Thus the stock remaining at any time t > 0 is S(t) = ∫_t^∞ R(u) du, assuming eventual exhaustion, so that R(t) = -S'(t). Along any non-strange trajectory for this economy, R(t) must tend to zero. Even if F(K, 0, AL) = 0, it is possible in principle for enough capital formation and technological progress to sustain growth. But this has not been felt to be an interesting question to pursue. It depends so much on the magic of technological progress that both plausibility and intellectual interest suffer. The literature has focused on two other questions. First, taking L to be constant, and without technological progress, when is a constant level of consumption per person sustainable indefinitely, through capital accumulation alone? The answer is: if the asymptotic elasticity of substitution between K and R exceeds 1, or equals 1 and the elasticity of output with respect to capital exceeds that with respect to R. For representative references, see Solow (1974), Dasgupta and Heal (1979), and Hartwick (1977). Second, and more interesting, how might such an economy evolve if there is a "backstop" technology in which dependence on nonrenewable resources is replaced by dependence on renewable resources available at constant cost (which may decrease
657
Ch. 9: Neoclassical Growth Theory
through time as technology improves). In pursuing these trails, capital investment can be governed either by intertemporal optimization or by rule of thumb. The depletion of nonrenewable resources is usually governed by "Hotelling's rule" that stocks of a resource will rationally be held only if they appreciate in value at a rate equal to the return on reproducible capital; in the notation above, this provides one differential equation: dF_R/dt = F_R F_K. The other comes from any model of capital investment.
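A minimal illustration of Hotelling's rule (a toy setup of my own, not the chapter's model): with a constant return r on reproducible capital, the resource price must grow at rate r, and with isoelastic demand R_t = p_t^(-eps) the initial price p0 is pinned down by exhaustion of the initial stock S0.

```python
# Toy Hotelling depletion path (illustrative assumptions throughout).
# The price grows at rate r by construction; p0 is found by bisection
# so that cumulative depletion just exhausts the stock S0.

def cumulative_depletion(p0, r=0.05, eps=2.0, horizon=500):
    return sum((p0 * (1 + r) ** t) ** (-eps) for t in range(horizon))

def solve_p0(S0, r=0.05, eps=2.0, lo=1e-6, hi=1e6):
    # bisect in logs: depletion is decreasing in p0
    for _ in range(100):
        mid = (lo * hi) ** 0.5
        if cumulative_depletion(mid, r, eps) > S0:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

p0 = solve_p0(S0=10.0)
```

A higher S0 lowers p0, and depletion R_t falls toward zero at the geometric rate (1+r)^(-eps), the qualitative behavior described in the text.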
14. Endogenous population growth and endogenous technological progress in the neoclassical framework

Making population growth and technological progress endogenous is one of the hallmarks of the "New" growth theory [see Barro and Sala-i-Martin (1995) for references]. Needless to say, one way of endogenizing population growth goes back to Malthus and other classical authors, and has been adapted to the neoclassical-growth framework from the very beginning. There was also a small literature on endogenous technical progress in the general neoclassical framework, e.g., Fellner (1961) and von Weizsäcker (1966), but it was concerned with the likely incidence of technical change, i.e., its labor-saving or capital-saving character, and not with its pace. However, the same simple device used in the case of population can also be used in the case of technology. It is outlined briefly here for completeness. The Malthusian model can be simplified to say just that the rate of population (labor-force) growth is an increasing function of the real wage; and there is at any time a subsistence wage - perhaps slowly changing - at which the population is stationary. In the neoclassical model, the real wage is itself an increasing function of the capital intensity (k), so the subsistence wage translates into a value k0 that separates falling population from growing population. There is no change in the derivation of the standard differential equation, except that the rate of growth of employment is now n(k), an increasing function of k vanishing at k0. One might wish to entertain the further hypothesis that there is a higher real wage, occurring at a higher capital intensity k1 such that n(k) is decreasing for k > k1, and may fall to zero or even beyond. Technical progress can be handled in the same way. Imagine that some unspecified decision process makes the rate of (labor-augmenting) technological progress depend on the price configuration in the economy, and therefore on k.
(The plausibility of this sort of assumption will be discussed briefly below.) In effect we add the equation A'(t) = a(k)A(t) to the model. The remaining calculations are as before, and they lead to the equation k' = sf(k) - (d + n(k) + a(k))k,
(14.1)
where it is to be remembered that k = K/AL stands for capital per unit of labor in efficiency units.
Fig. 5. [Phase diagram: the curves sf(k) and (d + n(k) + a(k))k, intersecting at the steady states k1, k2, k3.]
The big change is that the last term in Equation (14.1) is no longer a ray from the origin, and may not behave simply at all. It will start at the origin. For small k, it will no doubt be dominated by the Malthusian decline in population and will therefore be negative. One does not expect rapid technological progress in poor economies. For larger values of k, n(k) is positive and so, presumably, is a(k). Thus the last term of Equation (14.1) rises into positive values; one expects it to intersect sf(k) from below. Eventually - the "demographic transition" - n(k) diminishes back to zero or even becomes negative. We have no such confident intuition about a(k). On the whole, the most advanced economies seem to have faster growth of total factor productivity, but within limits. Figure 5 shows one possible phase diagram, without allowing for any bizarre patterns. The steady state at the origin is unstable. The next one to the right is at least locally stable, and might be regarded as a "poverty trap". The third steady state is again unstable; in the diagram it is followed by yet another stable steady state with a finite basin of attraction. Depending on the behavior of a(k), there might be further intersections. For a story rather like this one, see Azariadis and Drazen (1990). There are many other ideas that lead to a multiplicity of steady states. The interesting aspect of this version of the model is that k is output per worker in efficiency units. At any steady state k*, output per worker in natural units is growing at the rate a(k*). It is clear from the diagram that a change in s, for instance, will shift k* and thus the steady-state growth rate of productivity. It will also shift n(k*) and this is a second way in which the aggregate growth rate is affected. So this is a neoclassical model whose growth rate is endogenous. The question is whether the relation A' = a(k)A has any plausibility. The Malthusian analogue L' = n(k)L has a claim to verisimilitude.
Birth and death rates are likely to
depend on income per head; more to the point, births and deaths might be expected to be proportional to the numbers at risk, and therefore to the size of the population. One has no such confidence when it comes to technical change. Within the general spirit of the neoclassical model, something like a(k) seems reasonable; k is the natural state variable, determining the relevant prices. But the competitive market form seems an inappropriate vehicle for studying the incentive to innovate. And why should increments to productive knowledge be proportional to the stock of existing knowledge? No answer will be given here, and there may be no good answer. The relevant conclusion is that population growth and technological progress can in principle be endogenized within the framework of the neoclassical growth model; the hard problem is to find an intuitively and empirically satisfying story about the growth of productive technology.
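The multiplicity of steady states in Figure 5 is easy to reproduce numerically. In the sketch below the functional forms for n(k) and a(k) are my own assumptions, chosen only to generate the pattern described in the text: a Malthusian dip for small k, a demographic transition, and technical progress that picks up only in rich economies.

```python
import math

# Illustrative phase-diagram computation for Equation (14.1); the
# functional forms and parameters are assumptions, not the chapter's.

d, s = 0.05, 0.3

def f(k):
    return k**0.5

def n(k):   # Malthusian rise, then a "demographic transition"
    return 0.11 * math.tanh(2 * (k - 1)) - 0.08 * (1 + math.tanh(k - 8)) / 2

def a(k):   # productivity growth picks up only in rich economies
    return 0.04 * (1 + math.tanh(k - 12)) / 2

def g(k):   # right-hand side of Equation (14.1)
    return s * f(k) - (d + n(k) + a(k)) * k

# interior steady states = sign changes of g on a grid over (0, 20]
grid = [0.01 + 0.01 * j for j in range(2000)]
steady_states = [k for k, k2 in zip(grid, grid[1:]) if g(k) * g(k2) < 0]
```

With these forms there are three interior steady states, alternately stable (a "poverty trap"), unstable, and stable again, as in the figure.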
15. Convergence

The simplest form of the neoclassical growth model has a single, globally stable steady state; if the model economy is far from its steady state, it will move rapidly toward it, slowing down as it gets closer. Given the availability of the Summers-Heston cross-country data set of comparable time series for basic national-product aggregates, it is tempting to use this generalization as a test of the neoclassical growth model: over any common interval, poorer countries should grow faster than rich ones (in per capita terms). This thought has given rise to a vast empirical literature. Useful surveys are Barro and Sala-i-Martin (1995) and Sala-i-Martin (1996), both of which give many further references. The empirical findings are too varied to be usefully discussed here, but see chapter 4 by Durlauf and Quah in this volume, and also chapter 10. Sala-i-Martin distinguishes between β-convergence and σ-convergence. The first is essentially the statement given above; it occurs when poor countries tend to grow faster than rich ones. On the other hand, σ-convergence occurs within a group of countries when the variance of their per capita GDP levels tends to get smaller as time goes on. Clearly β-convergence is a necessary condition for σ-convergence; it is not quite sufficient, however, though one would normally expect β-convergence to lead eventually to σ-convergence. Something can be said about the speed of convergence if the neoclassical model holds. Let g_{t,T} stand for the economy's per capita growth rate over the interval from t to t + T, meaning that g_{t,T} = T^{-1} log[y(t+T)/y(t)]. Then linearizing the neoclassical model near its steady state yields an equation of the form g_{t,T} = const. - T^{-1}(1 - e^{-βT}) log y(t).
(15.1)
Obviously g_{t,0} = const. - β log y(t). Moreover, in the Cobb-Douglas case with f(k) = k^b, it turns out that β = (1 - b)(d + n + a). Another way to put this is that the solution to the basic differential equation, near the steady state at k*, is approximately k(t) - k* ≈ e^{-(1-b)(d+n+a)t}(k0 - k*).
(15.2)
Since b is conventionally thought to be near 0.3, this relation can be used to make β-convergence into a tighter test of the neoclassical model. [It usually turns out that b must be considerably larger than that to make the model fit; this has led to the thought that human capital should be included in k, in which case the magnitudes become quite plausible. On this see Mankiw, Romer and Weil (1992).] One difficulty with all this is that different countries do not have a common steady state. In the simplest model, the steady-state configuration depends at least on the population growth rate (n) and the saving-investment rate (s) or the utility parameters that govern s in the optimizing version of the model. One might even be permitted to wonder if countries at different levels of development really have effective access to a common world technology and its rate of progress; "backwardness" may not be quite the same thing as "low income". In that case, an adequate treatment of convergence across countries depends on the ability to control for all the determinants of the steady-state configuration. The empirical literature consists largely of attempts to deal with this complex problem. On this, see again chapter 10 in this Handbook. The natural interim conclusion is that the simple neoclassical model accounts moderately well for the data on conditional convergence, at least once one allows for the likelihood that there are complex differences in the determination of steady states in economies at different stages of development. The main discrepancy has to do with the speed of convergence. This is perhaps not surprising: actual investment paths will follow neither optimizing rules nor simple ratios to real output. Outside the simplest neoclassical growth model, there may even be multiple steady states, and this clearly renders the question of β-convergence even more complicated.
This possibility leads naturally to the notion of club-convergence: subsets of "similar" countries may exhibit β-convergence within such subsets but not between them. Thus the states of the United States may exhibit convergence, and also the member countries of the OECD, but not larger groupings. This is discussed in Galor (1996). See also Azariadis and Drazen (1990) for a model with this property.
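The convergence speed in Equations (15.1) and (15.2) can be checked directly. The sketch below (parameter values are my own illustrative assumptions) integrates the basic differential equation from a point near the steady state and compares the measured decay rate of the deviation with β = (1 - b)(d + n + a).

```python
import math

# Numerical check of the local convergence speed (illustrative
# parameters): with f(k) = k^b the linearized model predicts
# beta = (1 - b)(d + n + a).

s, b, d, n, a = 0.2, 0.3, 0.05, 0.01, 0.02
total = d + n + a
k_star = (s / total) ** (1 / (1 - b))

def k_after(t, k0, dt=0.001):
    """Euler-integrate k' = s*k^b - total*k for t time units."""
    k = k0
    for _ in range(int(round(t / dt))):
        k += dt * (s * k**b - total * k)
    return k

dev0 = 1.02 * k_star - k_star                  # small initial deviation
dev1 = k_after(1.0, 1.02 * k_star) - k_star    # deviation one unit later
beta_hat = -math.log(dev1 / dev0)
beta_theory = (1 - b) * total
```

With b = 0.3 the predicted rate is quite fast, which is exactly the discrepancy with the empirical estimates discussed in the text.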
16. Overlapping generations

Literal microfoundations for the optimizing version of the standard neoclassical model usually call for identical, immortal households who plan to infinity. An equivalent - and equally limiting - assumption involves a family dynasty of successive generations with finite lives, each of which fully internalizes the preferences of all succeeding generations. An alternative model avoids some restrictiveness by populating the economy with short-lived overlapping generations, each of which cares only about its own consumption and leaves no bequests. The simplest, and still standard, version involves two-period lives, so that two generations - young and old - coexist in each period. As the previous sentence suggests, overlapping-generations models are written in discrete time, although this is not absolutely necessary [Blanchard and Fischer (1989),
p. 115]. There is a large literature beginning with Samuelson (1958) [anticipated in part by Allais (1947)]. An excellent exposition is to be found in D. Romer (1996, Ch. 2), and a full treatment in Azariadis (1993), where further references can be found. The OG model has uses in macroeconomic theory generally [for instance Grandmont (1985), Hahn and Solow (1996)], but here attention is restricted to its use in growth theory, beginning with Diamond (1965). There is only one produced good, with the production function Y_t = F(K_t, A_t N_t) as usual. In the standard notation, we can write y_t = f(k_t), where y and k are output and capital per unit of labor in efficiency units. In each period, then, the competitively determined interest rate is r_t = f'(k_t) and the wage in terms of the single good is A_t w_t = A_t(f(k_t) - k_t f'(k_t)). Note that w_t is the wage per efficiency unit of labor; a person working in period t earns A_t w_t. N_t families are born at the beginning of period t and die at the end of period t + 1. Set N_t = (1 + n)^t, so the total population in period t is (1 + n)^{t-1} + (1 + n)^t. Each family supplies one unit of labor inelastically when it is young, earns the going (real) wage A_t w_t, chooses how much to spend on the single good for current consumption c_t1, earns the going rate of return r_{t+1} on its savings (A_t w_t - c_t1), and spends all of its wealth on consumption when old, so that c_t2 = (1 + r_{t+1})(A_t w_t - c_t1). Note that savings in period t are invested in period t + 1. As with other versions of growth theory, it is usual to give each household the same time-additive utility function u(c_t1) + (1 + i)^{-1} u(c_t2). It is then straightforward to write down the first-order condition for choice of c_t1 and c_t2. It is u'(c_t2)/u'(c_t1) = (1 + i)/(1 + r_{t+1}); together with the family's intertemporal budget constraint it determines c_t1 and c_t2, and therefore the family's savings in period t as a function of r_{t+1} and A_t w_t.
In the ever-popular special case that u(x) = (1 - m)^{-1} x^{1-m} (so that m is the absolute elasticity of the marginal utility function), it follows directly that the young family saves a fraction s(r) of its wage income, where
s(r) = (1 + r)^{(1-m)/m} / [(1 + r)^{(1-m)/m} + (1 + i)^{1/m}]
(16.1)
and the obvious time subscripts are omitted. The formulas for young and old consumption follow immediately. Now assume that A_t = (1 + a)^t as usual. Since the savings of the young finance capital investment in the next period, we have K_{t+1} = s(r_{t+1}) A_t w_t N_t. Remembering that k is K/AN, we find that k_{t+1} = (1 + n)^{-1}(1 + a)^{-1} s(r_{t+1}) w_t.
(16.2)
Substitution of r_{t+1} = f'(k_{t+1}) and w_t = f(k_t) - k_t f'(k_t) leaves a first-order difference equation for k_t. In simple special cases, the difference equation is very well behaved. For instance (see D. Romer 1996 for details and exposition), if f(·) is Cobb-Douglas and u(·) is
Fig. 6. [k_{t+1} plotted against k_t: a staircase path k0, k1, k2, k3, ... converging to the stationary state k*.]
logarithmic, the difference equation takes the form k_{t+1} = const.·k_t^b and the situation is as in Figure 6. (Note that logarithmic utility implies that the young save a constant fraction of their earnings, so this case is exactly like the standard neoclassical model.) There is one and only one stationary state for k, and it is an attractor for any initial conditions k0 > 0. Exactly as in the standard model, k* decreases when n or a increases and also when i increases (in which case s naturally decreases). Since k = K/AN, the steady-state rate of growth of output is a + n and the growth rate of labor productivity is a, both independent of i (or s). There are, however, other possibilities, and they can arise under apparently "normal" assumptions about utility and production. Some of these possibilities allow for a multiplicity of steady states, alternately stable and unstable. This kind of configuration can arise just as easily in the standard model when the saving rate is a function of k. They are no surprise. The novel possibility is illustrated in Figure 7. The curve defined by Equation (16.2) in the (k_t, k_{t+1}) plane may bend back, so that in some intervals - in the diagram, when k_t is between k_m and k_M - k_t is compatible with several values of k_{t+1}. The difference equation can take more than one path from such a k_t. This is the situation that gives rise to so-called "sunspot" paths. See Cass and Shell (1983), Woodford (1991), and an extensive treatment in Farmer (1993). The mechanism is roughly this. Suppose s'(r) < 0. Then a young household at time t that expects a low value of r_{t+1} will save a lot and help to bring about a low value of r. If it had expected a high value of r next period, it would have saved only a little and helped to bring about a high value of r.
The possibility exists that the household may condition its behavior on some totally extraneous phenomenon (the "sunspot" cycle) in such a way that its behavior validates the implicit prediction and thus confirms the significance of the fundamentally irrelevant signal. In this particular model, the sunspot phenomenon seems to require that saving be highly sensitive to the interest
Fig. 7. [The curve of k_{t+1} against k_t bends back: for k_t between k_m and k_M, several values of k_{t+1} are possible.]
rate, and in the "wrong" direction at that. This goes against empirical findings, so that indeterminacy of this kind may not be central to growth theory, even if it is significant for short-run macroeconomic fluctuations.
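The Cobb-Douglas/logarithmic case can be simulated directly. The sketch below (parameter values are my own illustrative assumptions) evaluates Equation (16.1), confirms that logarithmic utility (m = 1) makes the saving rate a constant 1/(2 + i) regardless of r, and iterates k_{t+1} = C·k_t^b to its unique stationary state.

```python
# Diamond OG dynamics in the Cobb-Douglas / logarithmic case
# (illustrative parameterization).  With m = 1, Equation (16.1) gives
# the constant saving rate s = 1/(2+i), and Equation (16.2) with
# w_t = (1-b)*k_t^b becomes k_{t+1} = C * k_t^b.

def s_rate(r, i, m):
    """Saving rate of the young, Equation (16.1)."""
    num = (1 + r) ** ((1 - m) / m)
    return num / (num + (1 + i) ** (1 / m))

b, n, a, i = 0.3, 0.01, 0.02, 0.04
s = s_rate(0.10, i, m=1.0)             # with m = 1 this is 1/(2+i) for any r
C = s * (1 - b) / ((1 + n) * (1 + a))  # so k_{t+1} = C * k_t^b

k, path = 0.05, [0.05]
for _ in range(60):
    k = C * k**b
    path.append(k)

k_star = C ** (1 / (1 - b))            # unique stationary state
```

With m > 1 the same formula gives s'(r) < 0, the configuration the sunspot mechanism requires; with m < 1, s'(r) > 0.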
17. Open questions
This survey has stayed close to basics and has not attempted anything like a complete catalogue of results in growth theory within the neoclassical framework. In that spirit, it seems appropriate to end with a short list of research directions that are currently being pursued, or seem worth pursuing. The role of human capital needs clarification, in both theoretical and empirical terms. Human capital is widely agreed to be an important factor in economic growth. Maybe more to the point, it seems to offer a way to reconcile the apparent facts of convergence with the model. One difficulty is that the measurement of human capital is insecure. See Judson (1996) and Klenow and Rodriguez-Clare (1998). School enrollment data are fairly widely available, but they clearly represent a flow, not a stock. Direct measurement of the stock runs into deep uncertainty about depreciation and obsolescence, and about the equivalence of schooling and investment in human capital. Mention has already been made of the use of relative wages as indicators of relative human capital; the well-known Mincer regressions can also be used, as in Klenow and Rodriguez-Clare (1998). (Better measurement might throw some light on the way human capital should enter the production function: as a labor-augmentation factor, as a separate factor of production, or in some other way. On this, as on several
other matters, the distinction between "neoclassical" and "endogenous" growth theory seems to be artificial.) It was suggested in the text above that there is no mechanical obstacle to the endogenization of technical change within the neoclassical model. But the analytical devices mentioned by example were all too mechanical. The modeling of technological progress should be rethought, and made more empirical using whatever insights come from micro-studies of the research-and-development process. It seems pretty clear that the endogenous-growth literature has been excessively generous in simply assuming a connection between the level of innovative effort and the rate of growth of the index of technology. It is impossible to know what further empirical work and theoretical modeling will suggest about the nature of that connection. But it is a central task to find out. One of the earliest stories about endogenous technical change was Arrow's model of "learning by doing" (Arrow 1962), which is well within the neoclassical framework. It, too, was rather mechanical, with automatic productivity increase as simple fall-out from gross investment. Many economists have found the basic idea to be plausible; it is a source of technical change that is entirely independent of R&D. But very little econometric work has taken off from learning-by-doing, and there seems to have been no attempt to test it. Recently there have been renewed efforts to elaborate and improve the underlying idea [Young (1993), Solow (1997)]. The next step should probably be empirical. Very similarly, the notion that technical change has to be "embodied" in new investment in order to be effective seems instantly plausible [Solow (1960), Jorgenson (1966)]. For many years, however, it proved to be impossible to verify its importance in macroeconomic time series data. Just recently there has been a revival of interest in this question. 
Lau (1992) may have isolated a significant embodiment-effect in a combined time-series analysis involving several advanced and newly-industrialized countries. And Wolff (1996) claims to have found in the embodiment-effect an explanation of the productivity slowdown that occurred almost worldwide in the early 1970s. Hulten (1992), on the other hand, came to different conclusions using different methods and data. The interesting possibility of using changes in the relative prices of investment goods and consumption goods to isolate the embodiment effect has opened up new vistas. Greenwood, Hercowitz and Krusell (1997) is a pioneering reference. See also the survey by Hercowitz (1998) and an (as yet) unpublished paper by Greenwood and Jovanovic (1997). Some new theory might help develop this work further. A powerful embodiment effect (and the same could be said about learning by doing) will strengthen the connection between short-run macroeconomic fluctuations and long-run growth. As things stand now, the only effect of business cycles on the growth path comes through the "initial" value of the stock of capital. These more sophisticated mechanisms would also link growth to cycle through the level of achieved technology. There are no doubt other ways in which better integration of growth theory and business-cycle theory would improve both of them. A last issue that needs exploring is the matter of increasing returns to scale. It was shown earlier that the neoclassical model can easily accommodate increasing (or,
for that matter, decreasing) returns to scale, just as a matter of modeling production. The important question lies elsewhere. The ubiquity of increasing returns to scale implies the ubiquity of imperfect competition as a market form. There is plenty of microeconomic theory to link imperfect competition with investment and perhaps with innovative activity. The systematic relation - if any - between imperfect competition and growth has not been fully explored. What there is has come mostly through the endogenous-growth literature [Aghion and Howitt (1992, 1998), Romer (1990)], and there it has been an appendage to specialized models of the R&D process. Imperfect competition is finding its way slowly into general macroeconomics. Growth theory should not be far behind.
References

Aghion, P., and P. Howitt (1992), "A model of growth through creative destruction", Econometrica 60:323-351.
Aghion, P., and P. Howitt (1998), Endogenous Growth Theory (MIT Press, Cambridge, MA).
Allais, M. (1947), Économie et Intérêt (Imprimerie Nationale, Paris).
Arrow, K. (1962), "The economic implications of learning by doing", Review of Economic Studies 29:155-173.
Azariadis, C. (1993), Intertemporal Macroeconomics (Blackwell, Oxford).
Azariadis, C., and A. Drazen (1990), "Threshold externalities in economic development", Quarterly Journal of Economics 105:501-526.
Barro, R.J., and X. Sala-i-Martin (1995), Economic Growth (McGraw-Hill, New York).
Becker, G. (1975), Human Capital, 2nd edition (National Bureau of Economic Research/Columbia University Press, New York).
Bertola, G. (1994), "Wages, profits and theories of growth", in: L. Pasinetti and R. Solow, eds., Economic Growth and the Structure of Long-Term Development (St. Martin's Press, New York) 90-108.
Blanchard, O.J., and S. Fischer (1989), Lectures on Macroeconomics (MIT Press, Cambridge, MA).
Böhm, V., and L. Kaas (1997), "Differential savings, factor shares, and endogenous growth cycles", Working Paper (Department of Economics, University of Bielefeld).
Cass, D. (1965), "Optimum growth in an aggregative model of capital accumulation", Review of Economic Studies 32:233-240.
Cass, D., and K. Shell (1983), "Do sunspots matter?", Journal of Political Economy 91:193-227.
Collins, S., and B. Bosworth (1996), "Economic growth in East Asia: accumulation vs. assimilation", Brookings Papers on Economic Activity 1996(2):135-191.
Dasgupta, P., and G. Heal (1979), Economic Theory and Exhaustible Resources (Cambridge University Press, Cambridge).
Denison, E.F. (1985), Trends in American Economic Growth, 1929-1982 (The Brookings Institution, Washington, DC).
Diamond, P.A. (1965), "National debt in a neoclassical growth model", American Economic Review 55:1126-1150.
Domar, E.
(1946), "Capital expansion, rate of growth and employment", Econometrica 14:137-147.
Farmer, R. (1993), The Macroeconomics of Self-fulfilling Prophecies (MIT Press, Cambridge, MA).
Fellner, W. (1961), "Two propositions in the theory of induced innovations", Economic Journal 71:305-308.
Galor, O. (1996), "Convergence? Inferences from theoretical models", Working Paper No. 1350 (Centre for Economic Policy Research, London).
Grandmont, J.-M. (1985), "On endogenous competitive business cycles", Econometrica 53:995-1045.
Greenwood, J., and B. Jovanovic (1997), "Accounting for growth", unpublished.
Greenwood, J., Z. Hercowitz and P. Krusell (1997), "Long-run implications of investment-specific technological change", American Economic Review 87:342-362.
Grossman, G.M., and E. Helpman (1991), Innovation and Growth in the Global Economy (MIT Press, Cambridge, MA).
Hahn, F., and R.M. Solow (1996), A Critical Essay on Modern Macroeconomic Theory (MIT Press, Cambridge, MA).
Harrod, R. (1939), "An essay in dynamic theory", Economic Journal 49:14-33.
Hartwick, J. (1977), "Intergenerational equity and the investing of rents from exhaustible resources", American Economic Review 67:972-974.
Hercowitz, Z. (1998), "The 'embodiment' controversy: a review essay", Journal of Monetary Economics 41:217-224.
Hulten, C. (1992), "Growth accounting when technical change is embodied in capital", American Economic Review 82:964-980.
Islam, N. (1995), "Growth empirics: a panel data approach", Quarterly Journal of Economics 110:1127-1170.
Jorgenson, D. (1966), "The embodiment hypothesis", Journal of Political Economy 74:1-17.
Judson, R. (1996), "Measuring human capital like physical capital: what does it tell us?", Working Paper (Federal Reserve Board, Washington, DC).
Kaldor, N. (1961), "Capital accumulation and economic growth", in: F.A. Lutz and D.C. Hague, eds., The Theory of Capital (St. Martin's Press, New York).
King, R.G., and S.T. Rebelo (1993), "Transitional dynamics and economic growth in the neoclassical model", American Economic Review 83:908-931.
Klenow, P.J., and A. Rodriguez-Clare (1998), "The neoclassical revival in growth economics: has it gone too far?", in: NBER Macroeconomics Annual 1997 (MIT Press, Cambridge, MA).
Kneese, A., and J. Sweeney, eds. (1989), Handbook of Natural Resource and Energy Economics, 3 volumes (Elsevier Science, Amsterdam).
Koopmans, T.C. (1965), "On the concept of optimal economic growth", in: Scientific Papers of Tjalling C. Koopmans (Springer, New York).
Lau, L. (1992), "The importance of embodied technical progress: some empirical evidence from the group-of-five countries", Publication No. 296, mimeograph (Center for Economic Policy Research, Stanford University, Stanford, CA).
Lucas, R. (1988), "On the mechanics of economic development", Journal of Monetary Economics 22:3-42.
Mankiw, N.G., D. Romer and D.N. Weil (1992), "A contribution to the empirics of economic growth", Quarterly Journal of Economics 107:407-448.
Ramsey, F.P. (1928), "A mathematical theory of saving", Economic Journal 38:543-559.
Romer, D. (1996), Advanced Macroeconomics (McGraw-Hill, New York).
Romer, P.M. (1986), "Increasing returns and long-run growth", Journal of Political Economy 94:1002-1037.
Romer, P.M. (1990), "Endogenous technological change", Journal of Political Economy 98:S71-S102.
Sala-i-Martin, X. (1996), "The classical approach to convergence analysis", Economic Journal 106:1019-1036.
Samuelson, P.A.
(1958), "An exact consumption-loan model of interest, with or without the social contrivance of money", Journal of Political Economy 66:467-482.
Samuelson, P.A., and F. Modigliani (1966), "The Pasinetti paradox in neoclassical and more general models", Review of Economic Studies 33:269-301.
Schultz, T.W. (1961), "Investment in human capital", American Economic Review 51:1-17.
Solow, R.M. (1956), "A contribution to the theory of economic growth", Quarterly Journal of Economics 70:65-94.
Ch. 9: Neoclassical Growth Theory
Solow, R.M. (1960), "Investment and technical progress", in: K. Arrow, S. Karlin and P. Suppes, eds., Mathematical Methods in the Social Sciences (Stanford University Press, Palo Alto, CA) 89-104.
Solow, R.M. (1974), "Intergenerational equity and exhaustible resources", Review of Economic Studies 41:29-45.
Solow, R.M. (1997), Learning from 'Learning by Doing' (Stanford University Press, Palo Alto, CA).
Summers, R., and A. Heston (1991), "The Penn World Table (Mark 5): an expanded set of international comparisons, 1950-1988", Quarterly Journal of Economics 106:327-368.
Swan, T.W. (1956), "Economic growth and capital accumulation", Economic Record 32:334-361.
Uzawa, H. (1961), "Neutral inventions and the stability of growth equilibrium", Review of Economic Studies 28:117-124.
von Weizsäcker, C. (1965), "Existence of optimal programs of accumulation for an infinite time horizon", Review of Economic Studies 32:85-104.
von Weizsäcker, C. (1966), "Tentative notes on a two-sector model with induced technical progress", Review of Economic Studies 33:245-252.
Wolff, E. (1996), "The productivity slowdown: the culprit at last? Follow-up on Hulten and Wolff", American Economic Review 86:1239-1252.
Woodford, M. (1991), "Self-fulfilling expectations and fluctuations in aggregate demand", in: N.G. Mankiw and D. Romer, eds., New Keynesian Economics, vol. 2 (MIT Press, Cambridge, MA) 77-110.
Young, A. (1993), "Invention and bounded learning by doing", Journal of Political Economy 101:443-472.
Chapter 10
EXPLAINING CROSS-COUNTRY INCOME DIFFERENCES

ELLEN R. McGRATTAN and JAMES A. SCHMITZ, Jr.

Federal Reserve Bank of Minneapolis
Contents

Abstract 670
Keywords 670
1. Introduction 671
2. Some basic facts 674
3. Accounting 678
   3.1. Levels accounting 678
   3.2. Growth accounting 687
4. Growth regressions 688
5. Quantitative theory 695
   5.1. Effects of policy on disparity 695
      5.1.1. Policies distorting investment 695
      5.1.2. Policies affecting trade 702
      5.1.3. Other policies 707
   5.2. Effects of policy on growth 709
      5.2.1. Policies in a two-sector AK model 709
      5.2.2. Policies in an R&D model 715
6. Two growth models and all of the basic facts 720
   6.1. An exogenous growth model 724
   6.2. An endogenous growth model 730
7. Concluding remarks 733
Acknowledgements 734
References 734
Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford © 1999 Elsevier Science B.V. All rights reserved
E.R. McGrattan and J.A. Schmitz, Jr.
Abstract

This chapter reviews the literature that tries to explain the disparity and variation of GDP per worker and GDP per capita across countries and across time. There are many potential explanations for the different patterns of development across countries, including differences in luck, raw materials, geography, preferences, and economic policies. We focus on differences in economic policies and ask to what extent differences in policies across countries can account for the observed variability in income levels and their growth rates. We review estimates for a wide range of policy variables. In many cases, the magnitude of the estimates is under debate. Estimates found by running cross-sectional growth regressions are sensitive to which variables are included as explanatory variables. Estimates found using quantitative theory depend in critical ways on values of parameters and measures of factor inputs for which there is little consensus. In this chapter, we review the ongoing debates of the literature and the progress that has been made thus far.
Keywords: cross-country income differences, growth accounting, growth regressions, endogenous growth theory

JEL classification: E62, E65, O11, O41, O47
Ch. 10: Explaining Cross-Country Income Differences
1. Introduction
Gross domestic product (GDP) per worker of rich countries like the USA is about 30 times that of poor countries like Ethiopia. The fastest growing countries now grow at 9 percent per year, whereas 100 years ago the highest rates of growth were around 2 percent. Over the postwar period, there is virtually no correlation between income levels and subsequent growth rates, and growth rates show very little persistence. This chapter reviews the literature that tries to explain these and other facts about the cross-country income distribution. There are many potential explanations for the different patterns of development across countries, including differences in luck, raw materials, geography, preferences, and economic policies. As in most of the literature, we focus on economic policy and ask to what extent differences in policies across countries can account for the variability in levels of income and their growth rates. Are policies responsible for only a few percent of the income differences or for most of the variation? If they are responsible for most of the variation, which policies are particularly helpful or harmful? We show that while some progress has been made in answering these questions, it has been fairly limited. There are estimates of the effects of policy on income and growth for a wide range of policy variables. However, in most cases, their magnitudes are under debate. Moreover, there is little consensus concerning methodology. We review two approaches used to obtain estimates of the effects of policy on income and growth. The most widely used approach is to run cross-sectional regressions of growth rates on initial levels of income, investment rates, and economic policy or political variables. [See, for example, Kormendi and Meguire (1985), Barro (1991), and Barro and Sala-i-Martin (1995).]
Policy variables found to have a significant effect on growth in these regressions include measures of market distortions such as the average government share in GDP or the black market premium, measures of political rights or stability, and measures of financial development. For example, Barro and Lee (1994) show that as a result of differences in the ratio of government consumption to GDP and in the black market premium between a group of East Asian countries and a group of sub-Saharan African countries, the East Asian countries were predicted to grow 3.5 percent per year faster. The actual difference in growth rates was 8.1 percent per year. Thus, these differences in the two variables account for a large fraction of the difference in growth rates. In this literature, the estimated coefficients on variables designated as policy variables have been shown to be sensitive to which variables are included in the regression. Levine and Renelt (1992) find that a large number of policy variables are not robustly correlated with growth. Hence, estimates of the impact of economic policy on growth are under debate. Another approach to calculating the effects of economic policy, which we call quantitative theory, is to specify explicit models of economic development, parameterize them, and derive their quantitative implications. In our review of quantitative theory, we start with studies that explore the extent to which differences in economic
policies account for differences in levels of income. We consider the effects of fiscal policies, trade policies, policies affecting labor markets, and policies impeding efficient production. [Examples of such studies include Chari et al. (1997) on investment distortions, Romer (1994) on tariffs, Hopenhayn and Rogerson (1993) on labor market restrictions, Parente and Prescott (1994, 1997) on barriers to technology adoption, and Schmitz (1997) on inefficient government production.] To illustrate the quantitative effects of some of these policies, we derive explicit formulas for cross-country income differences due to inefficient government production, taxes on investment, and tariffs. These formulas show that measured differences in policies can potentially explain a significant fraction of observed income disparity. However, there is also debate in this literature about the magnitude of the impact of policy on income. Much of the debate centers around the choice of model parameters. For example, measured differences in investment distortions can account for a significant fraction of observed income disparity if shares on accumulable factors are on the order of 2/3 or larger. Shares on the order of 1/3 imply very little disparity in incomes. Measured differences in tariff rates imply significant differences in incomes if the number of imports is assumed to vary significantly with the tariff rate. Otherwise, the effects of tariffs are very small. We also review studies in quantitative theory that explore the extent to which differences in economic policies account for differences in growth rates of income. We review two standard endogenous growth models: a two-sector "AK" model and a model of research and development (R&D). For the AK model, we consider the effects of changes in tax rates on long-run growth rates as in King and Rebelo (1990), Lucas (1990), Kim (1992), Jones et al. (1993), and Stokey and Rebelo (1995).
To illustrate the quantitative effects of these tax policies, we derive explicit formulas for the steady-state growth rate in terms of tax rates and parameters of the model. Here, too, the estimated impact of tax changes on growth varies dramatically in the literature. For example, the predicted decline in the growth rate after an increase in the income tax rate from 0 percent to 20 percent ranges from 0.7 percentage points to 4 percentage points. Using the explicit formulas, we show how the estimates of tax effects on growth are sensitive to certain model parameters. Unlike the AK model, there has been little work to date assessing the effects of policy changes on growth rates in the R&D models. [See, for example, Romer (1990), Grossman and Helpman (1991a, b), and Aghion and Howitt (1992).] This is likely due to the fact that the main quantitative concern for these models has been their predicted scale effects. That is, most of these models predict that the growth rate increases with the number of people working in R&D. We describe a discrete-time version of the model in Romer (1990) and Jones' (1995a) version of the model which eliminates scale effects. [See also Young (1998).] We also discuss the possible growth effects of policies such as the subsidization of R&D and show that these effects depend critically on certain model assumptions. Both approaches to estimating the effects of policy, then, the growth regression approach and the quantitative theory approach, have provided estimates of the impact
of policy on income and growth. But, as the examples above indicate, within each approach, the magnitude of the impact of policy is under some debate. Moreover, in comparing the two approaches, we need to compare more than the precision of their estimates of policy's effect on incomes and growth. For example, the growth regression literature has come under considerable criticism because of econometric problems. [See, for example, Mankiw (1995), Kocherlakota (1996), Sims (1996), and Klenow and Rodriguez-Clare (1997a).] One serious problem is the endogeneity of right-hand-side variables in these regressions. The quantitative theory approach is not subject to such econometric criticisms. Hence, while the growth regression approach is the most widely used approach, we think the quantitative theory approach will ultimately be the predominant one. Thus, we place more emphasis on it in our review. The rest of our review proceeds as follows. Section 2 presents some basic facts about the cross-country income distribution using data on GDP per worker for 1960-1990 compiled by Summers and Heston (1991) and on GDP per capita for 1820-1989 compiled by Maddison (1991, 1994). In Section 3, we review the accounting literature, which has been a source of data on factor inputs and total factor productivity. Studies in the accounting literature attempt to apportion differences in country income levels or growth rates to technological progress and factor accumulation. [See, for example, Krueger (1968), Christensen et al. (1980), Elias (1992), Mankiw et al. (1992), Young (1995), Hsieh (1997), Klenow and Rodriguez-Clare (1997a), and Hall and Jones (1998).] These studies do not directly address why factor inputs differ across countries, but they do provide measures of labor and capital inputs, estimates of the shares of these inputs, and thus an estimate of either the level or the growth rate of total factor productivity (TFP).
We show that, as yet, there is still no consensus on the level or growth of human capital and TFP or on the size of factor shares. The remainder of the chapter is concerned with estimating the effects of policy on income and growth. In Section 4, we review the empirical growth literature. In Section 5 we review studies applying the quantitative theory approach - considering first those concerned with differences in income levels and then those concerned with growth. The two literatures within quantitative theory, that examining disparity and that examining growth, have developed in large part separately from each other. There have been few attempts to account for more than one key regularity in the data and few attempts to compare the implications of competing theories for data. We conclude the chapter by considering the implications of two standard growth models, the neoclassical exogenous growth model and the AK model, for some of the basic features of the data from Maddison (1991, 1994) and Summers and Heston (1991). To make a direct comparison, we use the same tax processes as inputs in both models. We show that these models do fairly well in accounting for the large range in relative incomes, the lack of correlation in incomes and subsequent growth rates, and the lack of persistence in growth rates. However, both models have trouble replicating the large increase in maximal growth rates observed over the past 120 years.
E.R. McGrattan and J.A. Schmitz, Jr
674
Fig. 1. GDP per capita, 1820-1989. (Squares: per capita GDP in 1985 US dollars, log scale, for country-year observations, 1820-1989. Inset: the 1989 distribution of relative GDP per capita, world average = 1.)
2. Some basic facts
In this section, we review some basic facts about the distribution of country incomes and their growth rates. We have historical data for various years over the period 1820-1989 from Maddison (1994) for 21 countries. For the period 1870-1989, data are available from Maddison (1991) in all years for 16 countries. More recent data are from the Penn World Table (version 5.6) of Summers and Heston (1991) and cover as many as 152 countries over the period 1950-1992 1. These data show that disparity in incomes is large and has grown over time, that there is no correlation between income levels and subsequent growth rates, that growth rate differences are large across countries and across time, and that the highest growth rates are now much higher than those 100 years ago. These basic features of the data are summarized in Figures 1-4. [See Parente and Prescott (1993) for a related discussion.] In Figure 1, we provide two perspectives on the disparity of per capita GDP across countries. First, we plot per capita GDP in 1985 US dollars for 21 countries for various years between 1820 and 1989. These data are taken from Maddison (1994). Each country-year observation is represented by a square. Second, for 1989, we display the distribution of relative GDP per capita using the 137 countries with available data in the Summers and Heston data set (variable RGDPCH). To construct the relative GDP,
1 All of the data used in this chapter are available at our web site.
we divide a country's per capita GDP by the geometric average for all 137 countries. A value of 8 implies that the country's per capita GDP is 8 times the world average, and a value of 1/8 implies that the country's per capita GDP is 1/8 of the world average. One noteworthy feature of Figure 1 is the increase in disparity in GDP per capita over the last 170 years in Maddison's (1994) 21-country sample 2. The ratio of the highest per capita GDP to the lowest in 1820 is 3.0, whereas the ratio in 1989 is 16.7. Hence, the range of GDPs per capita in this sample increased by a factor of 5.6 (16.7 ÷ 3.0). If we consider the Summers and Heston (1991) sample of 137 countries in 1989 (shown in the insert of Figure 1), we find that the average GDP per capita for the top 5 percent of countries is nearly 34 times that of the bottom 5 percent. Another notable aspect of the 1989 distribution is its near uniformity in the range 1/8 to 8. Thus, it is not true that being very rich (having a per capita GDP from 4 to 8 times the world average) or being very poor (having a per capita GDP from 1/8 to 1/4 of the world average) is uncommon. Furthermore, over the period 1960-1990, the ratio of the relative incomes of rich to poor has been roughly constant; 1989 is not an unusual year 3. The data that we plot in Figure 1 are GDP per capita since we do not have data on the number of workers prior to 1950. However, much of our analysis in later sections will deal with GDP per worker. If we instead use GDP per worker to obtain an estimate of disparity in 1989, we get a similar estimate to that found with GDP per capita. In 1989 the average GDP per worker for the most productive 5 percent of the countries is about 32 times that of the least productive 5 percent. Next consider Figure 2, which has in part motivated the cross-sectional growth literature. Figure 2 presents average annual growth rates in GDP per worker over the 1960-1985 period versus the relative GDP per worker in 1960.
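The relative-GDP measure above (each country's per capita GDP divided by the geometric average of the sample) can be sketched as follows; the country labels and dollar figures are hypothetical, not values from the Summers and Heston data:

```python
import math

def relative_gdp(gdp_per_capita):
    """Divide each country's GDP per capita by the geometric average
    of the whole sample, as done for the 1989 distribution in Figure 1."""
    logs = [math.log(y) for y in gdp_per_capita.values()]
    geo_avg = math.exp(sum(logs) / len(logs))
    return {c: y / geo_avg for c, y in gdp_per_capita.items()}

# Hypothetical 1985-dollar figures, for illustration only.
sample = {"A": 16000.0, "B": 4000.0, "C": 1000.0}
rel = relative_gdp(sample)
# The geometric average here is 4000, so country A sits at 4 times the
# sample average, B at 1, and C at 1/4.
```

Using the geometric rather than arithmetic average keeps the measure symmetric in logs: a country at 4 is as far above the average as a country at 1/4 is below it.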
For this period, data are available from Summers and Heston (1991) for 125 countries. There are two key features to note. First, there is no correlation between 1960 productivity levels and subsequent growth rates. The correlation is 0.01. Second, the range in average annual growth rates is large. Even over a 25-year period, some countries had average growth rates of over 5 percent per year while some countries had average annual growth rates that were negative. These features of the data are also found for GDP per capita and for the subset of the Summers and Heston countries that have data available through 1990. [For example, see Barro and Sala-i-Martin (1995), who use GDP per capita.] Figure 3 presents average annual growth rates of GDP per worker for a country over 1973-1985 versus average annual growth rates over 1961-1972 for the same sample of countries used in Figure 2. As Easterly et al. (1993) note, the correlation between growth rates in the two subperiods is low. The correlation in this case is 0.16. A striking feature of the growth rates is the magnitudes across subperiods. For example, Saudi
2 Prescott (1998) calculates the disparity between western and eastern countries and finds a significant increase in disparity over the past 200 years.
3 The same is true of the Maddison 21-country sample. The ratio of the highest to lowest per capita GDP was 19.0, 19.6, and 16.7 in 1950, 1973, and 1989, respectively.
Fig. 2. Growth versus initial GDP per worker, 1960-1985. (Average annual growth rate of GDP per worker, 1960-1985, plotted against relative GDP per worker in 1960.)

Fig. 3. Persistence of growth rates, 1960-1985. (Growth rates of GDP per worker, 1973-85, plotted against growth rates of GDP per worker, 1961-72.)
Fig. 4. Maximum GDP per capita growth, 1870-1990. (Fastest growers by decade: United States, 1870-80; Canada, 1880-1910; Japan, 1910-20, 1930-40, and 1950-70; Austria, 1920-30; Switzerland, 1940-50; Jordan, 1970-80.)

Arabia grew at a rate of 8.2 percent in the first half of the sample and then at a rate of -1.8 percent in the second half. Guinea's growth rate in the first half of the sample was about 0 and jumped to 4.2 percent in the second half. Figure 4 plots growth rates of the fastest growing countries over time. Starting in 1870, for each country for which data are available, we calculate the average annual growth rate within each decade between 1870 and 1990. For each decade, we select the country that achieved the maximum growth rate and plot this growth rate along with the country names in Figure 4. For example, the USA achieved the maximum average annual growth over the 1870-1880 decade, about 2.5 percent. The sample of countries in Figure 4 is from two sources. From 1870 to 1950, the data are GDP per capita from Maddison (1991). Over this period, there are only 16 countries 4. From 1950 to 1990, the data are GDP per capita from Summers and Heston (1991). We included all countries with available data. The pattern in Figure 4 is striking. The maximum average annual growth rates over a decade have increased dramatically through time, from the 2-3 percent range in the late 1800s to the 8-9 percent range that we currently observe. An obvious concern is that the pattern in Figure 4 is driven by the fact that the sample of countries increased dramatically after 1950. The countries in Maddison (1991) are the ones that are the
4 Unlike the 21-country sample used for Figure 1, the data from Maddison (1991) are primarily rich countries.
most productive today; they are the most productive today because they had the greatest growth in productivity from 1870 to 1950. There may have been episodes during this period in which some of the poorer countries had miraculous growth rates. But, to our knowledge, no such episodes have been identified. Thus, we suspect that if data for all countries were available back to 1870 and we again drew Figure 4, the picture would not change very much 5. Before reviewing the progress that has been made in estimating the effects of policy on income and growth, we review the levels and growth accounting literatures. The objective of these literatures is to estimate the contributions of physical capital, labor, educational attainment, and technological progress to differences in levels or growth rates of output. While they do not directly address why factor inputs differ across countries, the accounting exercises are nonetheless important steps to explaining cross-country income differences. For example, to estimate the effects of policy in quantitative theories, reliable estimates for certain parameters, like the capital shares, are needed. The accounting exercises provide careful measures of labor and capital inputs, estimates of the shares of these inputs, and an estimate of TFP or its growth rate.
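The Figure 4 calculation (average annual growth within each decade, then the fastest grower per decade) can be sketched as follows; the data shape and country labels here are hypothetical:

```python
def decade_max_growth(series, start, end):
    """For each decade in [start, end), compute each country's average
    annual growth rate of GDP per capita and return the fastest grower.
    `series` maps country -> {year: GDP per capita}; a country enters a
    decade's comparison only if both decade endpoints are observed."""
    out = {}
    for d0 in range(start, end, 10):
        d1 = d0 + 10
        best = None
        for country, ys in series.items():
            if d0 in ys and d1 in ys:
                # Geometric average annual growth rate over the decade.
                g = (ys[d1] / ys[d0]) ** (1 / 10) - 1
                if best is None or g > best[1]:
                    best = (country, g)
        if best is not None:
            out[(d0, d1)] = best
    return out

data = {"X": {1870: 100.0, 1880: 128.0}, "Y": {1870: 100.0, 1880: 110.0}}
top = decade_max_growth(data, 1870, 1880)
# Country X grows by 28 percent over the decade, about 2.5 percent per year.
```

The geometric (compound) averaging matters: arithmetic averaging of annual rates would overstate growth for volatile series.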
3. Accounting We start this section with some results of levels accounting. We show that the estimates of TFP are sensitive to the measurement of human capital and the shares of income to physical and human capital. As yet, there is little consensus on the size of the stock of human capital or on the magnitude of the factor shares. Thus, when we calculate the fraction of income differences explained by differences in observed factor inputs, we find a wide range of estimates. We then discuss some recent work in growth accounting estimating the growth in TFP for the East Asian newly industrialized countries. Here, there is less disagreement about whether good economic performances were due in large part to factor accumulation or to total factor productivity.
3.1. Levels accounting
The objective in levels accounting studies is to apportion differences in income levels to differences in levels of total factor productivity and factor inputs. Typically, the
5 In fact, if we use the Maddison (1991) data, which are available until 1980, to construct growth rates between 1950 and 1980, the pattern is the same for all years except 1970-1980.
starting point is an aggregate production function F - assumed to be the same across countries - of the form

Y = F(K, H, L, A),   (3.1)
where Y is output, K is the stock of physical capital, H is the stock of human capital, L is the labor input, A is an index of the technology level, and income is defined to be output per worker (Y/L). These studies construct measures of K, H, and L and treat A as a residual in Equation (3.1). Many levels accounting studies assume that the production function has a Cobb-Douglas form given by
Y = K^αk H^αh (AL)^(1-αk-αh),   (3.2)
where αk and αh are capital shares for physical and human capital, respectively, and αk + αh < 1. Equation (3.2) is then rearranged to get
y = A (K/Y)^(αk/(1-αk-αh)) (H/Y)^(αh/(1-αk-αh)),   (3.3)
where y = Y/L. With measures of K/Y and H/Y, these studies ask, To what extent do cross-country variations in these capital intensities account for the large variation in y? 6 There is substantial disagreement on the answer to this question. For example, Mankiw et al. (1992) argue that differences in K/Y and H/Y can account for a large fraction of the disparity in y, whereas Klenow and Rodriguez-Clare (1997b) and Hall and Jones (1998) argue that they account for much less. In this section, we ask the following question: To what extent can differences in capital intensities account for the income disparity between the richest and poorest countries? To be precise, we calculate the ratio
y_rich / y_poor = [(1/Nr) Σ_{i ∈ rich} (Ki/Yi)^(αk/(1-αk-αh)) (Hi/Yi)^(αh/(1-αk-αh))] / [(1/Np) Σ_{i ∈ poor} (Ki/Yi)^(αk/(1-αk-αh)) (Hi/Yi)^(αh/(1-αk-αh))],   (3.4)

where the "rich" are the Nr most productive countries and the "poor" are the Np least productive countries. Note that Equation (3.4) assumes no differences in technology A
6 A notable exception is Krueger (1968), who does not have measures of physical capital. She estimates income levels that could be attained in 28 countries if each country had the same physical capital per worker and natural resources as the USA, but each country had its own human resources. Krueger finds that there would still be large per capita GDP differences between the USA and many of these countries even if they had the physical capital and natural resources of the USA. Using logged differences in incomes, her findings imply that the fraction of the income disparity explained by differences in human capital is in the range of 20 to 40 percent.
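Equation (3.4) can be evaluated directly once capital intensities and shares are chosen. In this sketch the (K/Y, H/Y) pairs are hypothetical stand-ins, not the measured intensities used in the text:

```python
def predicted_disparity(rich, poor, alpha_k, alpha_h):
    """Evaluate Equation (3.4): the rich/poor income ratio implied only
    by capital intensities, assuming a common technology level A.
    `rich` and `poor` are lists of (K/Y, H/Y) pairs."""
    ek = alpha_k / (1 - alpha_k - alpha_h)
    eh = alpha_h / (1 - alpha_k - alpha_h)

    def avg_term(group):
        # Group average of (K/Y)^ek * (H/Y)^eh, as in Equation (3.4).
        return sum((ky ** ek) * (hy ** eh) for ky, hy in group) / len(group)

    return avg_term(rich) / avg_term(poor)

# With alpha_k = alpha_h = 1/3, both exponents equal 1, so the predicted
# ratio is simply linear in the capital intensities:
ratio = predicted_disparity([(3.0, 2.0)], [(1.0, 0.5)], 1/3, 1/3)
# (3 * 2) / (1 * 0.5) = 12
```

The exponents αk/(1-αk-αh) and αh/(1-αk-αh) explain why the results below are so sensitive to the shares: as αk + αh approaches 1, small differences in intensities are amplified enormously.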
Fig. 5. Physical capital-output ratio versus income, 1985. (Capital-output ratio plotted against relative GDP per worker in 1985; ρ = 0.67.)
across countries. Thus, if we use observations of K/Y and H/Y in Equation (3.4), the ratio is a prediction of the disparity in income levels due only to variations in capital intensities. In our calculations, we use the measures of capital intensities in Mankiw et al. (1992), Klenow and Rodriguez-Clare (1997b), and Hall and Jones (1998). The measure of K/Y is very similar across these three studies. Therefore, we use the same K/Y for all of the calculations that we do. We construct estimates of the capital stock for each country using the perpetual inventory method. With data on investment, an initial capital stock, and a depreciation rate, we construct a sequence of capital stocks using the following law of motion for K_t 7:

K_{t+1} = (1 - δ)K_t + X_{kt},   (3.5)
where δ is the rate of depreciation. We choose a depreciation rate of 6 percent. For the initial capital stock, we assume that the capital-output ratio in 1960 is equal to the capital-output ratio in 1985 in order to get our estimate 8. In Figure 5, we plot the physical capital-output ratio, K/Y, for 1985 versus the relative GDP per worker in 1985 for all countries that have complete data on GDP

7 We use I × RGDPCH × POP from the Penn World Table of Summers and Heston (1991) for investment.
8 This way of estimating the final capital-output ratio leads to a good approximation if the economy is roughly on a balanced growth path. As a check, we tried other initial capital stocks and found that the final capital-output ratio was not sensitive to our choices.
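A minimal sketch of the perpetual inventory recursion in Equation (3.5), with the 6 percent depreciation rate used in the text and a hypothetical investment series:

```python
def perpetual_inventory(investment, k0, delta=0.06):
    """Build a capital-stock series from Equation (3.5),
    K_{t+1} = (1 - delta) K_t + X_t, starting from an initial stock k0.
    The investment series here is illustrative, not actual data."""
    capital = [k0]
    for x in investment:
        capital.append((1 - delta) * capital[-1] + x)
    return capital

ks = perpetual_inventory([10.0, 10.0, 10.0], k0=100.0)
# 100 -> 104 -> 107.76 -> 111.2944
```

Because depreciation shrinks the influence of the initial stock geometrically, the choice of k0 matters less the longer the investment series, which is why the footnote finds the 1985 capital-output ratio insensitive to the 1960 starting value.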
per worker and investment [variables RGDPW and I in Summers and Heston (1991)] over the sample period 1960-1985. There are 125 countries in the sample. The figure shows that capital-output ratios for the most productive countries are on the order of 3, whereas capital-output ratios for the least productive countries are around 1 or below. The correlation between the capital-output ratio and the logarithm of relative GDP per worker is 0.67. We next consider measures of H/Y, which vary a lot across the three studies. We start with the measure used by Mankiw et al. (1992). Motivated by the work of Solow (1956), Mankiw et al. (1992) assume that

H/Y = sh / (n + g + δ),   (3.6)
where sh is the fraction of income invested in human capital, g is the growth rate of world-wide technology, n is the growth rate of the country's labor force, and δ is the rate at which both physical and human capital depreciate. The expression in Equation (3.6) is a steady-state condition of Solow's (1956) model augmented to include human capital as well as physical capital. Mankiw et al. (1992) use the following measure for sh:

sh = secondary school enrollment rate × (15-19 population / working-age population),   (3.7)
which approximates the percentage of the working-age population that is in secondary school. To construct this measure, we use Equation (3.7) with secondary school enrollment rates from Barro and Lee (1993) [variables Sxx, xx=60, 65, ..., 85] and population data from the United Nations (1994). We construct sh for each of the six years (1960, 1965, ..., 1985) in which data are available and take an average 9. This investment rate is divided by n + g + δ with g = 0.02 and δ = 0.03 as in Mankiw et al. (1992) and n given by the growth rate of the country's labor force constructed from the Summers and Heston data set 10. In Figure 6, we plot average secondary school enrollment rates [the average of variables Sxx, xx=60, 65, ..., 85 from Barro and Lee (1993)] versus the relative GDP per worker in 1985. Figure 6 has two noteworthy features. First, there is a very strong correlation between the secondary enrollment rate and the logarithm of output per worker across countries. The correlation is 0.83. Second, there is a large range in secondary enrollment rates. There are many countries with secondary enrollment rates under 10 percent, and as many with rates over 60 percent. Weighting the enrollment
9 Data are unavailable in all years for Namibia, Reunion, Seychelles, Puerto Rico, Czechoslovakia, Romania, and the USSR. 10 Mankiw et al. (1992) use working-age population, while we construct growth rates of the labor force using Summers and Heston's (1991) RGDPCHxPOP/RGDPW. The results are quantitatively similar.
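Equations (3.6) and (3.7) combine to give a steady-state H/Y. A sketch with hypothetical enrollment and population inputs (not the actual Barro-Lee or United Nations series), using the text's g = 0.02 and δ = 0.03:

```python
def human_capital_output_ratio(enroll_rates, pop_15_19, pop_working_age,
                               n, g=0.02, delta=0.03):
    """Steady-state H/Y from Equations (3.6)-(3.7): s_h is the secondary
    enrollment rate scaled by the 15-19 share of the working-age
    population, averaged over the available years, then divided by
    n + g + delta. All inputs here are illustrative."""
    s_h_by_year = [e * p1519 / pwork
                   for e, p1519, pwork in zip(enroll_rates, pop_15_19,
                                              pop_working_age)]
    s_h = sum(s_h_by_year) / len(s_h_by_year)
    return s_h / (n + g + delta)

hy = human_capital_output_ratio(
    enroll_rates=[0.5, 0.6], pop_15_19=[10.0, 10.0],
    pop_working_age=[50.0, 50.0], n=0.01)
# s_h = (0.10 + 0.12) / 2 = 0.11, so H/Y = 0.11 / 0.06
```

Note that a faster-growing labor force (higher n) lowers the implied H/Y for a given enrollment rate, since new workers dilute the stock.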
Fig. 6. Secondary enrollment versus income, 1960-1985. (Average secondary school enrollment rate plotted against relative GDP per worker in 1985; ρ = 0.83.)

rates by the population [as in Equation (3.7)] and deflating them by n + g + δ [as in Equation (3.6)] does little to change the pattern displayed in Figure 6. Hence, there are large differences between the human capital-output ratios for the most productive and least productive countries, with the correlation between H/Y and the logarithm of GDP per worker equal to 0.79. With this measure of H/Y for Mankiw et al. (1992), the K/Y series described above, and values for the capital shares αk and αh, we can calculate the ratio of incomes in Equation (3.4). In this calculation, Nr and Np in Equation (3.4) are the richest 5 percent of countries and the poorest 5 percent of countries, respectively. In Table 1, we report our results. In the first row of the table, we assume that αk = 0.31 and αh = 0.28 as estimated by Mankiw et al. (1992). In this case, the predicted income disparity - assuming only differences in capital stocks - between the richest and poorest countries in 1985 is 12.8. The actual ratio is 31.4. The numbers in the last column of Table 1 are the ratios of predicted to actual income disparity - both in logarithms. This is a measure of the gap in productivities attributable to variation in human and physical capital. For Mankiw et al.'s (1992) human capital measure and parameter values, we find that 74 percent [that is, log(12.8)/log(31.4)] of the gap in actual incomes can be explained by differences in capital intensities 11.
11 Mankiw et al. (1992) run a regression of the logarithm of output per worker on their measures of K/Y and H/Y. They find an R2 statistic of 0.78 and parameter estimates α̂k = 0.31 and α̂h = 0.28. They
Ch. 10: Explaining Cross-Country Income Differences
683
Table 1
Income disparity due to different factor intensities, 1985

Human capital measure based on:         Physical       Human          Predicted income  Percentage difference
                                        capital share  capital share  disparity a       explained b
Mankiw et al. (1992)                    0.31           0.28           12.8              74
                                        1/3            1/3            33.7              102
Variation on Mankiw et al.              0.31           0.28           5.4               49
                                        1/3            1/3            9.9               66
                                        1/3            0.43           30.1              99
Klenow and Rodriguez-Clare (1997b) c    0.30           NA             3.4               36
Hall and Jones (1998) c                 1/3            NA             4.0               40

a Income disparity is defined to be the ratio of the average income of the richest 5 percent of countries to the average income of the poorest 5 percent (where income is output per worker).
b The percentage difference explained is defined to be the logarithm of the predicted income disparity divided by log(31.4), which is the logarithm of the actual income disparity.
c NA means not applicable. No value of αh is reported because the production function used in this study can be written as Y = K^αk (AL g(s))^(1-αk), where the function g does not depend on either capital share. Thus, the income disparity does not depend on αh.
We also find that the correlation between the predicted and actual logarithms of GDP per worker is 0.84. We should note, however, that the results are very sensitive to the choice of capital shares. For example, suppose that we use slightly higher values for the capital shares; say, αk = αh = 1/3. The results of this case are reported in the second row of Table 1. In this case, the prediction for the ratio of productivities of the top 5 percent to the bottom 5 percent is 33.7 - more than twice what it was in the case with αk = 0.31 and αh = 0.28, and almost exactly in line with the data. Klenow and Rodriguez-Clare (1997b) argue that Mankiw et al.'s (1992) measure of human capital overstates the true variation in educational attainment across the world because it excludes primary school enrollment, which varies much less than does secondary. Figure 7 plots the primary enrollment rates [the average of variables Pxx, xx = 60, 65, ..., 85 from Barro and Lee (1993)] versus GDP per worker. Again, there is a strong positive correlation. But note that there is much less variation in primary enrollment rates than in secondary enrollment rates, which are displayed in Figure 6. Only ten countries have a rate below 0.40. Suppose that we redo our calculation of the ratio y_rich/y_poor using a measure of s_h in Equation (3.6) that includes primary, secondary, and post-secondary enrollment rates.
view the high R2 statistic and the reasonable estimate for physical capital's share as strong evidence that variation in factor inputs can account for most of the variation in output per worker.
[Figure 7: scatter plot of the primary enrollment rate against relative GDP per worker in 1985 (log scale); ρ = 0.74.]
Fig. 7. Primary enrollment versus income, 1960-1985.
In particular, suppose that we use the fraction of 5- through 64-year-olds who are enrolled in school averaged over the period 1960-1985; this is a weighted average of the three enrollment rates in Barro and Lee (1993). In Table 1, in the row marked "Variation on Mankiw et al.," we report the predicted disparity, which is only 5.4. This ratio implies that roughly half (49 percent) of the observed disparity in incomes is explained by differences in capital-output ratios. Thus, adding primary and tertiary enrollment rates to the measure of s_h significantly reduces the contribution of human capital to income differences. (Compare the first and third rows of Table 1.) Although the predicted disparity is smaller, we still find a strong positive correlation between y_i and (K_i/Y_i)^(αk/(1-αk-αh)) (H_i/Y_i)^(αh/(1-αk-αh)). The correlation in this case is 0.79. As before, the magnitude of this disparity depends critically on the capital shares. Making a slight change from αk = 0.31 and αh = 0.28 to αk = 1/3 and αh = 1/3 leads to an increase in the percentage explained from 49 percent to 66 percent. If we choose αk = 1/3 and αh = 0.43, then almost all of the income disparity can be explained by differences in capital stocks across countries. As Mankiw (1997) notes, we have little information about the true factor shares - especially for human capital. Klenow and Rodriguez-Clare (1997b) and Hall and Jones (1998) argue that a more standard way of measuring human capital is to use estimates of the return to schooling from wage regressions of log wages on years of schooling and experience. [See Mincer
(1974).] For example, Klenow and Rodriguez-Clare (1997b) report estimates of the human capital-output ratio constructed as follows:

H/Y = [ e^(γ1 s) Σ_i ω_i e^(γ2 exp_i + γ3 exp_i²) ]^(1-αk) (AL/Y),   (3.8)
where s is the average years of schooling in the total population over age 25 taken from Barro and Lee (1993) [variable HUMAN85], exp_i is a measure of experience for a worker in age group i and is equal to (age_i - s - 6), and ω_i is the fraction of the population in the ith age group. The age groups are {25-29, 30-34, ..., 60-64} and age_i ∈ {27, 32, ..., 62}. The coefficients on schooling and experience in Equation (3.8) are given by γ1 = 0.095, γ2 = 0.0495, and γ3 = -0.0007, which are averaged estimates from regressions of log wages on schooling and experience. The measure of the human capital-output ratio used by Hall and Jones (1998) does not depend on experience and is given by
H/Y = ( e^φ(s) )^(1-αk) (AL/Y),   (3.9)
where s is the average years of schooling in the total population over age 25 taken from Barro and Lee (1993) [variable HUMAN85] and φ(·) is a continuous, piecewise linear function constructed to match rates of return on education reported in Psacharopoulos (1994) 12. For schooling years between 0 and 4, the return to schooling φ′(s) is assumed to be 13.4 percent, which is an average for sub-Saharan Africa. For schooling years between 4 and 8, the return to schooling is assumed to be 10.1 percent, which is the world average. With 8 or more years, the return is assumed to be 6.8 percent, which is the average for the OECD countries. It turns out that the measures of H/Y constructed by Klenow and Rodriguez-Clare (1997b) and Hall and Jones (1998) are very similar. If we set γ2 and γ3 equal to 0 in Equation (3.8) and ignore experience, then we get roughly the same capital intensities as those constructed by Klenow and Rodriguez-Clare (1997b). Similarly, if we set φ(s) = 0.095s in Equation (3.9) and assume the same rate of return on education across countries, then we get roughly the same capital intensities as those constructed by Hall and Jones. As a result, the residuals, A, constructed in these two studies are very similar. The correlation between the two residual series is 0.88 if we use the countries that appear in both data sets. We now see how the measures of human capital defined by Klenow and Rodriguez-Clare (1997b) and Hall and Jones (1998) affect the ratio y_rich/y_poor in Equation (3.4).
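The two schooling-based constructions can be sketched as follows (a minimal illustration; the function names and the equal population weights in the usage line are our own, while the coefficients and the piecewise returns are those reported above):

```python
import math

G1, G2, G3 = 0.095, 0.0495, -0.0007  # returns to schooling and experience

def kr_schooling_term(s, weights):
    """Bracketed term of Equation (3.8): e^(G1*s) * sum_i w_i * e^(G2*exp_i + G3*exp_i^2),
    where exp_i = age_i - s - 6 and `weights` maps age-group midpoints
    {27, 32, ..., 62} to population shares."""
    inner = sum(w * math.exp(G2 * (a - s - 6) + G3 * (a - s - 6) ** 2)
                for a, w in weights.items())
    return math.exp(G1 * s) * inner

def phi(s):
    """Hall-Jones piecewise linear return to schooling: 13.4 percent for
    the first 4 years, 10.1 percent for years 4 through 8, and 6.8 percent
    beyond 8 years."""
    return (0.134 * min(s, 4)
            + 0.101 * max(min(s, 8) - 4, 0)
            + 0.068 * max(s - 8, 0))

# Per-worker human capital entering Equation (3.9) is then e^(phi(s)).
weights = {a: 1 / 8 for a in range(27, 63, 5)}  # hypothetical equal shares
h = math.exp(phi(8.0))  # phi(8) = 0.134*4 + 0.101*4 = 0.94
```

Setting G2 = G3 = 0 in kr_schooling_term, or replacing phi(s) by 0.095s, shows directly why the two measures line up so closely.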
12 Substituting H/Y in Equation (3.9) into Equation (3.3) and simplifying gives y = A (K/Y)^(αk/(1-αk)) e^φ(s), which is the form of GDP per worker used in Hall and Jones (1998). We have written their implied H/Y so as to compare it to that of Mankiw et al. (1992).
[Figure 8: scatter plot of the Mankiw-Romer-Weil and Klenow-Rodriguez human capital-output ratios against relative GDP per worker in 1985 (log scale), with an exponential fit for each series.]
Fig. 8. Human capital-output ratio versus income, 1985.

For our calculations, we assume, as they do, that αk is the same across all countries. In Table 1, we report our predictions of income disparity and the gap in productivities attributable to differences in capital intensities. For Klenow and Rodriguez-Clare (1997b), we find that only 36 percent of the gap in productivities is attributable to differences in capital stocks. For Hall and Jones, 40 percent of the difference in productivities is explained by differences in capital stocks 13. To see why the results reported in Table 1 are so different across studies, consider the data in Figure 8. We plot the human capital-output ratios of Mankiw et al. (1992) and Klenow and Rodriguez-Clare (1997b). Due to data availability, only 117 of the original 125 countries in our sample are included. Both measures of human capital to output are plotted against the relative GDP per worker in 1985. For both series, we fit an exponential curve. As is clear from the figure, there is much larger variation in the human capital-output ratio of Mankiw et al. (1992) than in that of Klenow and Rodriguez-Clare (1997b). For Klenow and Rodriguez-Clare (1997b), the correlation between H/Y and GDP per worker is close to zero. In fact, if we had used αk = 1/3 and αh = 1/3 when constructing H/Y for Klenow and Rodriguez-Clare (1997b), we would
13 Klenow and Rodriguez-Clare (1997b) and Hall and Jones (1998) find that the average contribution of A to differences in y is about ½ (that is, the mean of the A_i's relative to A for the USA is approximately ½).
have found a negative correlation between the human capital-output ratio and GDP per worker.
3.2. Growth accounting

The objective in growth accounting is to estimate the contributions of technological progress and factor accumulation to differences in growth rates of output. As we saw in Figure 4, the growth rates of the fastest growing countries were on the order of 8 or 9 percent in the post-World War II period. These growth rates far exceed those of the fastest growing countries a century before. During three of the four decades between 1950 and 1990, East Asian countries led the pack. During the 1950s and 1960s, Japan had the highest growth rate. During the 1980s, South Korea had the highest growth rate. Among countries growing at 6 percent or better over the 1950-1990 period are three other East Asian countries, namely, Hong Kong, Singapore, and Taiwan. A question which has interested many people, then, is: Is factor accumulation or TFP growth responsible for growth rates of 8 or 9 percent? The studies of Young (1995) and Hsieh (1997) focus on the newly industrialized countries in East Asia, namely, Hong Kong, Singapore, South Korea, and Taiwan. Young (1995) finds that the extraordinary growth performances of the Asian countries are due in large part to factor accumulation. The output growth rates over the period 1966-1990 for Hong Kong, Singapore, South Korea, and Taiwan are 7.3, 8.7, 10.3, and 9.4 percent, respectively 14. The estimates of average TFP growth over the same period for Hong Kong, Singapore, South Korea, and Taiwan are 2.3, 0.2, 1.7, and 2.6 percent, respectively. Hsieh (1997) estimates TFP growth for the East Asian countries using both the primal approach, as in Young (1995), and the dual approach. The primal approach is to use the growth rates of quantities of capital and labor to back out measures of TFP growth, whereas the dual approach is to use the growth rates of the prices of these factors. Hsieh finds TFP growth rates for Hong Kong, Singapore, South Korea, and Taiwan of 2.7, 2.6, 1.5, and 3.7 percent, respectively 15.
As these estimates suggest, there is much more agreement between these two studies than among the levels accounting studies reviewed above. Both agree that factor accumulation was much more important than TFP growth. There is some disagreement on the estimate for Singapore. Young (1995) finds that factor accumulation, especially of capital, is the whole story behind Singapore's high growth rates, whereas Hsieh (1997) finds that a significant fraction is due to TFP growth. Hsieh argues that while capital increased significantly, the real return to capital did not fall. The higher the growth rate in the real return to capital, the higher Hsieh's estimate of TFP growth would be.
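The primal and dual decompositions can be sketched as follows (a stylized illustration; the capital share of 1/3, the function names, and the example growth rates are our own choices):

```python
def tfp_growth_primal(g_y, g_k, g_l, alpha=1/3):
    """Primal accounting: TFP growth is output growth net of
    share-weighted growth in the quantities of capital and labor."""
    return g_y - alpha * g_k - (1 - alpha) * g_l

def tfp_growth_dual(g_r, g_w, alpha=1/3):
    """Dual accounting: share-weighted growth in factor prices (the rental
    rate of capital and the wage); this equals the primal residual when
    factor payments exhaust output."""
    return alpha * g_r + (1 - alpha) * g_w

# If output grows 8 percent while capital grows 12 percent and labor
# grows 3 percent, the primal residual is roughly 2 percent.
print(tfp_growth_primal(0.08, 0.12, 0.03))
```

Hsieh's point about Singapore falls out of the dual formula: if capital deepening had truly swamped TFP, the rental rate r should have fallen, so a flat real return to capital pushes the dual residual up.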
14 The data on output for South Korea and Taiwan do not include agriculture, and the period for the Hong Kong data is 1966-1991. 15 The period for Singapore used in Hsieh (1997) is 1971-1990. Using the primal approach yields an estimate of -0.7 for this shorter sample.
While growth rates of TFP on the order of 2 percent are high, they are not extraordinarily high. In the USA, for example, Christensen et al. (1980) find growth rates of TFP of 1.4 percent over the period 1947-1973 when growth rates in output were around 4 percent. The growth rates in output for the East Asian countries over the period 1966-1990 were significantly higher since the growth in capital and labor was extraordinarily high. In the remainder of the chapter, we turn to the literatures which directly estimate the impact of policy on income and growth.
4. Growth regressions

In this section, we review a literature - the cross-sectional growth literature - that quantifies the effects of observed policies on country growth rates. We begin with a brief overview of the literature. We discuss the motivation for the studies in this literature and the typical growth regression that is run. We then describe the results of Barro and Lee (1994) and their estimates of the effects of policies on growth. Finally, we discuss some critiques of the methodology used in the literature. As we noted in Section 2, average growth rates vary widely across countries and are uncorrelated with initial income levels. (See Figure 2.) The fact that income levels and subsequent growth rates are uncorrelated was at one time thought to be a puzzle for standard growth theory, which predicted that poor countries should grow faster than rich countries in per capita terms. Such a prediction would imply a negative correlation between income levels and subsequent growth rates. This result depends, of course, on countries having the same steady-state income levels. If countries do not converge to the same steady-state income levels, the pattern predicted by theory is potentially consistent with Figure 2 16. Analyses in the growth regression literature attempt to uncover the relationship between initial incomes and subsequent growth rates, holding constant variables that determine countries' long-run steady-state income levels. The typical exercise involves regressing the growth rate of per capita GDP on the initial level of GDP, initial factor endowments such as educational levels, and control variables which are assumed to be determinants of the steady-state level of per capita output. Without the initial factor endowments and control variables, the coefficient on initial GDP is positive (as suggested by Figure 2). With these variables included, the coefficient on initial GDP is negative.
The set of control variables typically includes the ratio of investment to GDP, measures of market distortions such as the ratio of government consumption to GDP and the black market premium on foreign exchange, measures of political instability, and measures of financial development. Again, the purpose of these variables is to sort countries into more homogeneous groups, that is, groups that have similar steady
16 In Section 5, we provide a different explanation for this fact.
[Figure 9: scatter plot of the average government consumption share of GDP, 1965-1984, against relative GDP per capita in 1985 (log scale); ρ = -0.62.]
Fig. 9. Government share versus income, 1965-1985.

states. Thus, one would expect that those control variables that are highly correlated with income would be significant in the regression. In Section 3, we saw that the average ratio of investment to GDP as constructed by Summers and Heston (1991) is highly positively correlated with income. Variables proxying market distortions are negatively correlated with income. In Figures 9 and 10, we plot the ratio of government consumption to GDP and the logarithm of 1 plus the black market premium, respectively, versus per capita GDP 17. We see that averages of both of these measures over the period 1965-1984 are negatively correlated with per capita GDP in 1985. In Figure 11, we plot Gastil's (1987) index of political rights averaged over the period 1972-1984 versus per capita GDP in 1985. A value of 7 for the index indicates that citizens of the country have relatively few democratic rights, such as freedom of the press, freedom of speech, and so on, whereas a value of 1 indicates the most freedom. The correlation between this index and relative income is strongly negative. Finally, in Figure 12, we plot King and Levine's (1993) measure of the ratio of liquid liabilities
17 In Figures 9-12, we use an average of GVXDXE5x, x = 65-69, ..., 80-84 for government share of GDP, an average of BMPxL, x = 65-69, ..., 80-84 for the logarithm of one plus the black market premium, an average of PRIGHTSx, x = 72-74, 75-79, 80-84 for the index of political rights, and an average of LLYx, x = 65-69, ..., 80-84 for the measure of liquid liabilities. These data are all taken from the data set of Barro and Lee (1993).
[Figure 10: scatter plot of the logarithm of one plus the black market premium against relative GDP per capita in 1985 (log scale); ρ = -0.50.]
Fig. 10. Black market premium versus income, 1965-1985.
[Figure 11: scatter plot of the Gastil index of political rights (1 = most rights, 7 = fewest) against relative GDP per capita in 1985 (log scale); ρ = -0.75.]
Fig. 11. Political rights versus income, 1972-1985.
[Figure 12: scatter plot of the ratio of liquid liabilities to GDP, 1965-1984 average, against relative GDP per capita in 1985 (log scale); ρ = 0.60.]
Fig. 12. Financial development versus income, 1965-1985.

to GDP averaged over the period 1965-1984 versus the relative GDP per capita in 1985. Liquid liabilities are the sum of currency held outside the banking system and demand and interest-bearing liabilities of banks and nonbank financial intermediaries. We see that this measure of financial development is positively correlated with per capita GDP. We now turn to a specific example of a growth regression given in Barro and Lee (1994). Their preferred regression equation is given by

g = -0.0255 log(GDP) + 0.0801 log(LIFE) + 0.0138 MALE_SEC - 0.0092 FEM_SEC
    (0.0035)           (0.0139)           (0.0042)           (0.0047)
                                                                           (4.1)
  + 0.0770 I/Y - 0.1550 G/Y - 0.0304 log(1 + BMP) - 0.0178 REV,
    (0.0270)     (0.0340)     (0.0094)              (0.0089)
where g is the growth rate of per capita GDP, MALE_SEC and FEM_SEC are male and female secondary school attainment, respectively, LIFE is life expectancy, I/Y is the ratio of gross domestic investment to GDP, G/Y is the ratio of government consumption to GDP less the ratio of spending on defense and noncapital expenditures on education to GDP, BMP is the black market premium on foreign exchange, REV is the number of successful and unsuccessful revolutions per year, and means have been subtracted
for all variables. Eighty-five countries were included over the period 1965-1975 and 95 countries over the period 1975-1985 18. These results show that countries with a higher I/Y, a lower G/Y, a lower black market premium, and greater political stability had on average better growth performances. In Table 2, we reprint results from Barro and Lee (1994) which show the fitted growth rates for the regression equation in Equation (4.1) and the main determinants of these growth rates. The fitted growth rates are reported in the second-to-last column, and the actual growth rates are reported in the last column. The first five columns of the table are the sources of growth. To obtain their contributions to the fitted growth rate, one multiplies the values of explanatory variables for a specific group of countries (expressed relative to the sample mean) by the coefficients in Equation (4.1). The net convergence effect adds up the contributions of initial per capita GDP, secondary schooling, and life expectancy. The contributions of all other variables are shown separately. Variables used as proxies for market distortions, namely the government share in output and the black market premium, account for a large fraction of the differences in observed growth rates. For example, over the period 1975-1985, differences in G/Y accounted for a 2 percent per year difference in growth rates between the fast-growing East Asian countries and the slow-growing sub-Saharan African countries. Over the same period, differences in the black market premium accounted for a 1.5 percent per year difference in growth rates between the East Asian countries and the slow-growing sub-Saharan African countries. Together these variables account for a difference of 3.5 percent. The actual difference in growth rates was 8.1 percent. Table 2 provides results for only one regression.
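The contribution calculation just described can be sketched in a few lines (the coefficients are those in Equation (4.1); the dictionary keys and the demeaned values in the example are our own labels and hypothetical inputs):

```python
# Point estimates from Equation (4.1) of Barro and Lee (1994).
COEF = {
    "log_gdp": -0.0255,    # initial log per capita GDP
    "log_life": 0.0801,    # log life expectancy
    "male_sec": 0.0138,    # male secondary school attainment
    "fem_sec": -0.0092,    # female secondary school attainment
    "i_y": 0.0770,         # investment/GDP
    "g_y": -0.1550,        # government consumption/GDP (net of defense, education)
    "log1p_bmp": -0.0304,  # log(1 + black market premium)
    "rev": -0.0178,        # revolutions per year
}

def contributions(x):
    """Contribution of each regressor, expressed as a deviation from the
    sample mean, to the fitted growth rate."""
    return {k: COEF[k] * x[k] for k in COEF}

def fitted_growth(x):
    return sum(contributions(x).values())

# Hypothetical country: I/Y 10 points above the sample mean, G/Y 5 points
# above, everything else at the mean.
x = {k: 0.0 for k in COEF}
x["i_y"], x["g_y"] = 0.10, 0.05
print(fitted_growth(x))  # about -0.00005: the two effects nearly cancel
```

Summing the contributions of the convergence variables (log_gdp, schooling, log_life) reproduces the "net convergence effect" column of Table 2.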
The literature, however, is voluminous and there have been many other policy variables identified as potentially important sources of growth. Examples include measures of fiscal policy, trade policy, monetary policy, and so on. In many cases, variables to include are suggested by theory. For example, King and Levine (1993) include the measure of the state of financial development in Figure 12 in the growth regressions that they run. They motivate inclusion of such a variable with a model of R&D in which financial institutions play a central role because they facilitate innovative activity. [See also Greenwood and Jovanovic (1990).] Another example is income inequality which Alesina and Rodrik (1994) and Persson and Tabellini (1994) argue is harmful for growth. They include measures of within-country income inequality in the regressions they run on the basis of simple political economy models of taxation. In their models, growth
18 The school attainment variables and log(GDP) are the observations for 1965 in the 1965-1975 regression equation and for 1975 in the 1975-1985 regression equation. The life expectancy variable is an average for the five years prior to each of the two decades, namely, 1960-1964 in the first regression equation and 1970-1974 in the second regression equation. Variables I/Y and G/Y are sample averages for 1965-1975 and 1975-1985 in the regression equations for the two decades, respectively. The revolution variable is the average number over 1960-1985. For the regression, lagged explanatory variables are used as instruments.
[Table 2: fitted growth rates for the regression in Equation (4.1) and the contributions of their sources, reprinted from Barro and Lee (1994).]
depends on tax policies which are voted upon. The lower the capital stock of the median voter, the higher the tax rate and the lower the growth rate, given that tax proceeds are redistributed. In both cases, the theory and data analysis are only loosely connected. Many of the explanatory variables in the regressions are not variables in the models, and relations such as Equation (4.1) are not equations derived directly from the theory. The exercise of Barro and Lee (1994) and others in this literature suggests that differences in policies play an important role in the variation in country growth rates. However, as we noted earlier, the magnitudes are debated. For example, Levine and Renelt (1992) show that the results of such regressions are sensitive to the list of variables included. They identify more than 50 variables that have been found to be significantly correlated with growth in at least one cross-sectional growth regression. From their extreme-bound robustness tests, Levine and Renelt (1992) conclude that a large number of fiscal and trade policy variables and political indicators are not robustly correlated with growth. The list of variables that are not robustly related to the growth rate in per capita GDP includes the ratio of government consumption expenditures to GDP, the black market premium, and the number of revolutions and coups - the main variables used in the Barro and Lee (1994) regression. Sala-i-Martin (1997) uses a weaker notion of robustness but still finds that the main variables in Barro and Lee (1994) are not robustly correlated with growth. There are also deeper methodological debates with the growth regression approach. First, there are many econometric problems, such as endogeneity of right-hand-side variables, too few observations, omitted variables, and multicollinearity, which call into question the estimates found in this literature. The problem most emphasized is the endogeneity of regressors.
[See, for example, Mankiw (1995), Kocherlakota (1996), Sims (1996), Klenow and Rodriguez-Clare (1997a), and Bils and Klenow (1998).] Consider, for example, the black market premium which is sometimes included in the regressions. Most theories say that this ratio is jointly determined with the growth rate, with changes in both induced by changes in some policy. To deal with this problem, researchers use instrumental variable methods. However, their choices of instruments (e.g., political variables or lagged endogenous variables) have been criticized because they are not likely to be uncorrelated with the error terms in the regressions. As Sims (1996) emphasizes, to say more about the characteristics of the instruments, one must be specific about the equations determining all of the other variables, that is, those equations that are not estimated. Sims (1996) concludes that the coefficient on the policy variable of interest "represents, at best, a small piece of the story of how policy-induced changes ... influence output growth and at worst an uninterpretable hodgepodge." We turn next to an approach that is not subject to these same criticisms. The approach puts forward fully articulated economic models relating fundamentals, such as preferences, technologies, and policies, to quantifiable predictions for output per worker. Using quantitative theory, we try to tighten the link between theory and data, making the mapping between policies and GDP very explicit.
5. Quantitative theory

In this section, we consider explicit models that map assumptions about preferences, technologies, and policies to predictions for GDP. We make no attempt here to review all models of growth and development. Instead, we focus on several standard models and their quantitative implications. Policies that we consider include taxes on investment, government production, tariffs, labor market restrictions, granting of monopolies, monetary policies, and subsidies to research and development. We first consider implications for disparity of incomes and then implications for growth in incomes. We derive specific answers to the question: How much of the cross-country differences in income levels and growth rates can be explained by differences in particular economic policies? We also discuss assumptions that are critical for the results.

5.1. Effects of policy on disparity

In this section, we consider theories of income disparity and their quantitative predictions. By disparity, we mean the ratio of GDP per worker of the most productive countries to the least productive. As we saw in Section 2, the productivity levels of the most productive 5 percent of countries are on the order of 30 times that of the least productive. We ask, How much of this difference is due to policies such as taxes on investment, inefficient government production, trade restrictions, labor market restrictions, and granting of monopolies? To illustrate the quantitative effects of some of these policies, we derive explicit formulas for cross-country income differences. We show, under certain assumptions, that measured differences in policies imply significant income disparity.

5.1.1. Policies distorting investment

In this section, we work with the neoclassical growth model and derive formulas for income differences due to policies distorting investment.
Many have pointed to disincentives for investment such as government taxation of capital, corruption, and inefficient bureaucracies as possible explanations for differences in observed income levels. [See, for example, de Soto (1989), who describes inefficiency and corruption in Peru.] Such distortions on investment seem a natural candidate to generate variations in income given the large differences in capital-output ratios across countries (see Figure 5) and the strong association between growth rates and investment rates - especially for investment in machinery - as found in DeLong and Summers (1991, 1993). Schmitz (1997) studies one type of distortion on investment which occurs when governments produce a large share of investment goods and bar private production of these goods. In Egypt, for example, the government share of investment production has been close to 90 percent. This is in contrast to the USA and many European
[Figure 13: scatter plot of the public enterprise share of manufacturing output (percent) against relative GDP per worker in 1985 (log scale); ρ = -0.47.]
Fig. 13. Public enterprise share versus income, various years.

countries where the government share of investment production is close to zero. Schmitz (1996) presents evidence on the government's share of manufacturing output, where a subset of investment goods is produced. He shows that there is a negative correlation between the government's share of manufacturing output and productivity in a country. Figure 13 documents this pattern. In Figure 13, we display the public enterprise share of manufacturing output versus relative incomes for various years. The correlation between the government share of output in manufacturing and relative incomes is -0.47. Figure 13 then suggests that some governments produce a large share of investment goods. One expects there to be a large impact on productivity due to this policy because if the government produces investment goods inefficiently, this will have an impact on capital per worker. Unfortunately, it is hard to find specific measures for the many other distortions to investment. However, in many models, differences in distortions on investment across countries imply differences in the relative price of investment to consumption. Jones (1994) uses the PPP-adjusted price of investment divided by the PPP-adjusted price of consumption as a comprehensive measure of the many distortions in capital formation. He does so for various components of domestic capital formation like transportation equipment, electrical machinery, nonelectrical machinery, and nonresidential construction. When he includes these relative price variables in a growth regression of the type studied in Barro (1991), he finds a strong negative relationship between growth and the price of machinery.
[Figure 14: scatter plot of the relative price of investment to consumption goods against relative GDP per worker in 1985 (log scale); ρ = -0.65.]
Fig. 14. Relative price of investment versus income, 1985. Chari et al. (1997) use a similar measure o f relative prices for the tax on investment in a standard neoclassical growth model. In particular, they use the relative price o f investment goods to consumption goods from the Summers and Heston data set (PI/PC). Figure 14 presents this relative price in 1985 versus the relative GDP per worker (for the sample o f 125 countries with complete data on GDP per worker over the period 1960-1985). There are two aspects o f this figure worth noting. First, there is a very strong negative correlation between relative investment prices and the relative GDP per worker. The correlation is -0.65. Second, there is a large range in relative prices. Assuming the relative price o f investment to consumption is a good measure o f investment distortions, one expects that the large variation in prices implies a large variation in cross-country incomes. Using the following simple two-sector model, we can show how investment distortions such as those studied in Schmitz (1997) and Chari et al. (1997) affect income. The representative household chooses sequences o f consumption and investment to maximize
\sum_{t=0}^{\infty} \beta^t U(C_t),   (5.1)

where C_t is consumption at date t. The household's budget constraints are given by

C_t + p_t X_t = r_t K_t + w_t L,   t \ge 0,   (5.2)
where the subscript t indexes time, p is the relative price of investment to consumption, X is investment, r is the rental rate of capital, K is the capital stock, w is the wage rate, and L is the constant labor input. The capital stock is assumed to depreciate at rate \delta and to have the following law of motion: K_{t+1} = (1 - \delta) K_t + X_t. The economy is assumed to have two sectors: one for producing consumption goods and one for producing investment goods. The capital good can be allocated to either sector. The aggregate capital stock satisfies K_c + K_x = K, where K_c and K_x are the capital stocks used to produce consumption and investment goods, respectively. Similarly, the aggregate labor input satisfies L_c + L_x = L, where L_c and L_x are the labor inputs used to produce consumption and investment goods, respectively. Production functions in both sectors are assumed to be Cobb-Douglas. Firms in the consumption-good sector choose K_c and L_c to maximize profits; that is,

\max_{K_c, L_c} C - r K_c - w L_c, subject to C = A_c K_c^{a} L_c^{1-a},   (5.3)
where A_c is an index of the technology level in the consumption-good sector. Similarly, firms in the investment-good sector choose K_x and L_x to maximize profits:

\max_{K_x, L_x} p X - r K_x - w L_x, subject to X = A_x K_x^{\gamma} L_x^{1-\gamma},   (5.4)
where A_x is an index of the technology level in the investment-good sector. Note that the above economy with different productivity factors in the consumption- and investment-good sectors is equivalent to one in which the productivity factors are the same but there are distortions on investment. Suppose that the productivity factors in the two sectors are the same (that is, A_x = A_c). But suppose that of the K_x units of capital used in the investment sector, K_x/(1 + \tau_x) units are used in production and the remaining \tau_x K_x/(1 + \tau_x) units are needed to overcome regulatory barriers, with the same true of the labor input. Then the investment technology is given by

X = A_c \left( \frac{K_x}{1+\tau_x} \right)^{\gamma} \left( \frac{L_x}{1+\tau_x} \right)^{1-\gamma} = \frac{A_c}{1+\tau_x} K_x^{\gamma} L_x^{1-\gamma}.

[This is a version of the economy in Chari et al. (1997).] By setting A_c/(1 + \tau_x) = A_x, we see that the two specifications are the same 19. We now derive an explicit formula for differences in GDP per worker due to differences in productivity factors A_x in the investment sector across countries as in
19 If we allow for trade, however, the interpretation may matter. If 1/(1 + \tau_x) is a measure of resources used to overcome regulatory restrictions rather than a country-specific productivity factor, it is easy to imagine that such distortions also apply to imported goods.
Schmitz (1997). We compare the aggregate productivity of a country like the USA, in which the government produces no investment goods, to that of a country like Egypt, in which the government produces the vast majority of the investment goods. For simplicity, we assume that it produces all of the investment goods. We also assume that a = \gamma, that A_x = A_g for Egypt, and that A_x = A_p for the USA, where A_g denotes government productivity and A_p denotes private productivity. The only difference between countries is the productivity factor in the investment sector. We compare steady-state GDP across the two countries. With capital shares equal in the two sectors, the capital-labor ratios are equated and are equal to the economy-wide capital-labor ratio k = K/L, which is proportional to A_x^{1/(1-a)}. Outputs in the consumption and investment sectors are therefore given by C = A_c k^a L_c and X = A_x k^a L_x, respectively. In comparing GDPs across countries, it is common practice to use a set of world prices. For the model, we assume that the world price of investment equals the US price (that is, A_c/A_p). Let y denote GDP per worker in international prices. In this case, y = C/L + A_c X/(A_p L), and therefore the relative productivities are given by

\frac{y(A_x = A_p)}{y(A_x = A_g)} = \frac{(A_p/A_g)^{a/(1-a)}}{1 - (L_x/L)\left[1 - (A_g/A_p)^{a/(1-a)}\right]},   (5.5)
where y(A_x = A_p) is the GDP per worker for the country with investment goods produced privately and y(A_x = A_g) is the GDP per worker for a country with investment goods produced by the government. Note that L_x/L in this model is the same in both countries. If the government produces goods less efficiently than the private sector, then A_g < A_p. In this case, one can show that, for all values of a in (0,1) and all values of L_x/L in (0,1), the ratio in Equation (5.5) exceeds one. Estimates of the relative productivity factors A_p/A_g can be found in Krueger and Tuncer (1982) and Funkhouser and MacAvoy (1979). Their estimates lie between 2 and 3. The fraction of labor in the investment sector is equal to the share of investment in output. Suppose this share is 1/3. Suppose also that the capital share a is 1/3. If private producers have a productivity factor that is 2 times as large as that of government producers, then the model predicts that a country with no government production of investment has a labor productivity 1.57 times that of a country whose investment is entirely produced by the government. If private producers are 3 times as productive as government producers, then a country with no government production of investment has a labor productivity that is 2 times that of a country where the government produces all investment goods. During the 1960s, when Egypt was aggressively pursuing government production of investment, productivity in the USA was about 8 times that of Egypt. The calculations above indicate that this policy makes the USA about 2 times as productive as Egypt. What fraction of the productivity gap should be attributed to this policy? One way to measure the fraction of the gap in productivity attributable to this policy is to take the logarithm of the ratio of output per worker in the model and divide this by the
logarithm of the ratio of output per worker in the data. Under the assumption that the productivity factor for private firms is twice as large as that for the government, this results in ln(1.57)/ln(8) ≈ 0.22. Under the assumption that the multiple is 3, we have ln(2)/ln(8) ≈ 0.33. Hence, under this measure, the policy accounts for between 22 and 33 percent of the productivity gap. As we noted above, the formula in Equation (5.5) also applies to the case with variations in distortions as described in Chari et al. (1997). The ratio A_i/A_j is simply replaced by the ratio (1 + \tau_{xj})/(1 + \tau_{xi}). In another version of their model, Chari et al. (1997) allow for distortions such as bribes that have to be paid to undertake investments. Under this interpretation, bribes are simply transfers from one agent to another. In this case, the budget constraints of the household are given by

C_t + p_t X_t = r_t K_t + w_t L + T_t,   t \ge 0,   (5.6)
where T_t is the value of these transfers at date t. In this case, the profit-maximization problem solved by the investment-goods firm is given by

\max_{K_x, L_x} \frac{p X}{1+\tau_x} - r K_x - w L_x, subject to X = A_x K_x^{\gamma} L_x^{1-\gamma},   (5.7)

and the problem of the consumption-goods firm is the same as before. The specification in Equation (5.7) implies that bribes are proportional to the scale of the investment. We now derive an explicit formula for differences in income due to differences in investment distortions that, like bribes, are simply transfers from one agent to another. For now, we assume that a = \gamma. In this case, the relative price of investment to consumption p is proportional to the distortion 1 + \tau_x in equilibrium. For the model, we assume that the world prices of consumption and investment goods are one (so that the world price of investment equals the price of a country with a distortion of zero). If we assume that all investment is measured in national income accounts, then GDP per worker in the model is given by C/L + X/L. Assuming that the only difference across countries is the level of \tau_x that they face, we find that the ratio of productivities of countries i and j is given by

\frac{y_i}{y_j} = \left( \frac{1 + \tau_{xj}}{1 + \tau_{xi}} \right)^{a/(1-a)}   (5.8)
in the steady state. If we assume, as Chari et al. (1997) do, that half of the capital stock is organizational capital and is therefore not measured in national income accounts, then GDP per worker in the model is given by C/L + ½X/L. [See Prescott and Visscher (1980) for a discussion of the concept of organization capital.] In this case, the ratio of productivities of countries i and j is

\frac{y_i}{y_j} = \frac{a(1 + \tau_{xi})^{a/(a-1)} - b(1 + \tau_{xi})^{1/(a-1)}}{a(1 + \tau_{xj})^{a/(a-1)} - b(1 + \tau_{xj})^{1/(a-1)}},
where a and b are positive constants that depend on \beta, \delta, and a (and on the growth rates of population and world-wide technology, from which we have abstracted here). For the
parameters used in Chari et al. (1997), a is about 5 times larger than b. Therefore, the ratio of measured incomes is approximately equal to the expression in Equation (5.8). Consider again the data in Figure 14. Is the range in relative prices large enough to account for the 30-fold difference in relative incomes? It is, if one views K as a broad measure of capital that includes not only physical capital but also stocks of human capital and organizational capital. For example, if we assume a capital share on the order of 2/3, then differences in relative prices (and hence, differences in the ratio (1 + \tau_{xi})/(1 + \tau_{xj})) on the order of 5 or 6 imply a factor of 30 difference in incomes, since we square relative prices. In Figure 14, we see that four of the poor countries have relative prices exceeding 4. If we compare these countries with the richest countries, whose relative prices fall below 1, we can get relative productivities on the order of 30. There is a potential bias in the measure of distortions that we plot in Figure 14. If consumption goods are largely nontraded labor-intensive services, we would expect them to be systematically cheaper in capital-poor countries. In this case, the relative price overstates the real distortion. To demonstrate this, we can use the steady-state conditions of the model to derive an expression for the relative price of investment to consumption in terms of the distortions \tau_x. The expression is given by
p = B (1 + \tau_x)^{(1-a)/(1-\gamma)},

where B depends on parameters assumed to be the same across countries. Above we assumed a = \gamma and therefore had p = B(1 + \tau_x). However, if production of investment goods is more capital intensive than production of consumption goods (\gamma > a), then the ratio of prices of two countries is larger than the ratio of their true distortions; that is, p_i/p_j > (1 + \tau_{xi})/(1 + \tau_{xj}) where \tau_{xi} > \tau_{xj}. Chari et al. (1997) find for Mexico and the USA that, if anything, the relative prices understate the true distortion, since estimates of capital shares imply a > \gamma for both countries. The estimates derived in this section illustrate that the effects of certain policies distorting investment are potentially large. Inefficient government production can explain 22 to 33 percent of the productivity gap between countries like Egypt and the USA. For more comprehensive measures of distortions like the relative price of investment to consumption, the implied differences in incomes across countries are large if we assume that the distortions affect not only physical capital but also human and organizational capital. However, the estimates of the impact of policy on income found above are sensitive to choices of the capital share and to magnitudes of measured versus unmeasured capital. For example, if we assume a capital share of 1/3, then differences in relative prices on the order of 5 imply differences in incomes on the order of the square root of 5, which is significantly smaller than differences on the order of the square of 5. Before closing this section, we have three general comments concerning this literature. First, a number of quantitative studies have extended the basic neoclassical model explored above. One aim of these studies is to ask whether observed policy
differences have a larger impact on measured incomes in the extended models as compared to the standard model. Jovanovic and Rob (1998) extend the basic model to include vintage capital. The extended vintage capital model yields predictions for income disparity that are similar to those of the standard model. Parente et al. (1997) introduce home production into the standard model. Policies that influence capital accumulation now also have an impact on the mix of market and non-market activity. Their model can imply (for a given difference in policies) a much larger difference in income disparity across countries than does the standard model. Second, the analysis above illustrates that theories of the kind described here cannot rely on variations in TFP (that is, variations in A_c) alone to explain income differences. For example, the following is true in the steady state with a = \gamma:

\frac{a}{1+\tau_x} \frac{Y}{K} = \frac{1}{\beta} - (1 - \delta),   (5.9)
where Y = A_c K^a L^{1-a}. This condition shows why variation across countries in the residual A_c is not enough. There are large differences in K/Y across countries. With \tau_x constant, this model predicts that K/Y is constant. Thus, we need variation in some intertemporal distortion (for example, \tau_x) in order to generate differences in the capital-output ratio 20. Third, to simplify matters, we assumed no cross-country variation in TFP in deriving our predictions of income differences. However, when there is unmeasured capital, it is hard to distinguish between an economy with a small capital share and variations in both A_c and \tau_x (where A_c and \tau_x are correlated) and an economy with a larger capital share and only variations in \tau_x. For example, if Klenow and Rodriguez-Clare (1997b) or Hall and Jones (1998) were to construct measures of TFP simulated from a stochastic version of the model above with half of the capital stock unmeasured, they would conclude that TFP accounts for much of the variation in output per worker even if it accounted for none of the variation in output per worker. Thus, one must be cautious when interpreting the results of Klenow and Rodriguez-Clare (1997b) and Hall and Jones (1998). And, as Prescott (1998) points out, a theory of TFP differences is still needed. We turn next to the trade literature and again derive formulas relating policies to differences in income levels.

5.1.2. Policies affecting trade

The earliest research using models to measure the impact of policies on country income and welfare focused on trade policies 21. The trade literature is large. In this section, we

20 Even if we do not abstract from growth, the theory with only variations in A_c will do poorly. The capital-output ratio is highly correlated with income, but growth rates of the poor and rich are not very different.

21 In fact, Johnson (1960) discusses the work of Barone (1913), who attempts to measure the impact of tariffs on income and welfare.
[Figure 15 (scatter plot): tariff rates on capital goods and intermediate goods (vertical axis, 0 to 0.5) versus relative GDP per worker in 1980 (horizontal axis, log scale from 1/8 to 8); the correlation shown is ρ = -0.38. Caption: Fig. 15. Tariff rates versus income, 1980.]

provide a broad historical outline of this literature. We first discuss several measures of trade restrictions and their relationship to country productivity. We then discuss work relating trade restrictions to differences in income levels. Figure 15 presents measures of tariff rates on capital goods and intermediate goods constructed by Lee (1993) versus the relative GDP per worker for 91 countries in 1980. When plotting the data, we dropped the observation for India, where income in 1980 was approximately 1/2 of the world average and the tariff rate was 132 percent. This point was dropped so we could more easily view the other data points. Not surprisingly, there is a negative relationship between tariff rates and incomes. The correlation between tariff rates and incomes is -0.38. As Easterly and Rebelo (1993) point out, taxes on international trade fall as a share of government revenue as income rises, while the share of income taxes rises. For many of the low- and middle-income countries, the tariff rates are in the range of 25 to 50 percent. But rates among the rich are, in general, quite low. In Figure 16, we present additional evidence on trade restrictions. We plot Sachs and Warner's (1995) measure of a country's "openness" for the period 1950-1994 against relative GDP per worker in 1985. A country is open if (i) nontariff barriers cover less than 40 percent of its trade; (ii) its average tariff rates are less than 40 percent; (iii) any black market premium in it was less than 20 percent during the 1970s and 1980s; (iv) the country is not socialist under Kornai's (1992) classification; and (v) the country's government does not monopolize major exports. Sachs and Warner construct
[Figure 16 (scatter plot): fraction of years open, 1950-1994 (vertical axis, 0 to 1), versus relative GDP per worker in 1985 (horizontal axis, log scale from 1/8 to 8); the correlation shown is ρ = 0.64.]
an index that measures the fraction of years in the period that a country has been "open."

Fig. 16. Fraction of years open versus income, 1950-1994.

As we see from Figure 16, the correlation between the Sachs and Warner index and GDP per worker is strongly positive; economies with policies that promote trade are those with high productivities. In an early paper, Johnson (1960) reviews and extends prior studies that measure the cost of protection. His measure of the cost of protection is defined to be "the goods that could be extracted from the economy in the free-trade situation without making the country worse off than it was under protection - some variation of the Hicksian compensating variation" (p. 329). In the two-good version of his general equilibrium model, the cost of protection as a percentage of national income is

Cost = ½ \tau^2 \eta V,

where \tau is the tariff on imports (measured as a fraction of the tariff-inclusive domestic price), \eta is the compensated elasticity of demand for imports, and V is the ratio of imports at market prices to domestic expenditure. Johnson argues that the cost is small, given that it is an elasticity multiplied by three fractions, each of which is small. The example he gives is a tariff of 33⅓ percent and an import share of 25 percent. To obtain a cost of 4 percent of national income, the compensated demand elasticity has to be slightly above 5 - a value he dismisses as implausibly high. When Johnson extends the analysis to many goods, he cannot conclude as easily that the cost of protection is small. However, when he analyzes data from two studies
on Australia's and Canada's commercial policies, he concludes that the cost is small in both countries. During the 1960s, a number of studies continued the work reviewed by Johnson. A good reference is Balassa (1971). The findings of Balassa (1971) are similar to those of Johnson (1960). The cost of protection is on the order of a few percent of GDP, with the highest cost being 9.5 percent of income in Brazil. A further development in the quantitative study of tariffs was the computational general equilibrium (CGE) literature. There are a number of good surveys of this literature, such as Shoven and Whalley (1984). Some notable contributions to this literature are Cox and Harris (1985), Whalley (1985), and the papers in Srinivasan and Whalley (1986). Most of the CGE literature found - as did the earlier literature - that reductions in observed tariffs would lead to small increases in welfare and income, typically on the order of 1 percent of GDP. Since the mid-1980s, there have been attempts to extend the models in this literature under the presumption that larger gains in income follow tariff reductions. One avenue has been to develop dynamic models in which the capital stock adjusts to the reductions in tariffs. The CGE literature typically studied static models, so, for example, the models did not consider the response of capital stocks to changes in tariffs. A recent paper that looks at such responses of capital stocks is Crucini and Kahn (1996). This study examines the increase in tariffs that followed passage of the Smoot-Hawley tariff during the Great Depression. They find that if "tariffs had remained permanently at levels prevailing in the early 1930s [due to the Smoot-Hawley tariff], steady-state output [in the USA] would have declined by as much as 5 percent" as a result of the higher tariffs (p. 428).
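The capital-adjustment channel can be illustrated with a back-of-the-envelope steady-state calculation (a sketch of ours, not Crucini and Kahn's actual model): treat a permanent tariff as a uniform wedge on the price of investment, so that the steady-state condition of the neoclassical model (cf. Equation 5.9) implies output proportional to (1 + τ)^(-a/(1-a)). The capital share and wedge size below are illustrative assumptions:

```python
# Steady-state effect of a permanent investment wedge tau, capital share a.
# From the Euler condition a*Y/((1+tau)*K) = 1/beta - 1 + delta, steady-state
# output per worker satisfies y proportional to (1+tau)**(-a/(1-a)).
def ss_output_ratio(tau_new, tau_old, a):
    # Ratio of new to old steady-state output after the wedge changes.
    return ((1.0 + tau_new) / (1.0 + tau_old)) ** (-a / (1.0 - a))

a = 1.0 / 3.0  # capital share (illustrative assumption)
drop = 1.0 - ss_output_ratio(0.10, 0.0, a)  # a 10 percent wedge, from zero
print(f"steady-state output falls by {100 * drop:.1f}%")
```

Under these assumptions a 10 percent investment wedge lowers steady-state output by roughly 5 percent, the same order of magnitude as the Crucini-Kahn estimate, though their dynamic model is far richer than this static comparison.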
At least for this episode then, considering changes in the capital stock does not significantly change the conclusion that the effects of tariffs on income are small. Another recent paper is that by Stokey (1996), who examines dynamic gains from trade as capital stocks adjust. Stokey finds larger gains from capital adjustment than do Crucini and Kahn. Another avenue that has been pursued is to allow for changes in the set of goods available in the economy as tariffs change. One example is Romer (1994), who argues that tariffs may have a large impact on productivity. He constructs an example of a small open economy which imports specialized capital inputs to use in a love-for-variety production function. Foreign entrepreneurs that sell the capital inputs face fixed costs of exporting to the small open economy. In the model, increases in tariffs result in a narrowing of goods imported and a fall in productivity. Romer's back-of-the-envelope calculations show that the effects on productivity may be large. Here, we review his calculations and discuss Klenow and Rodriguez-Clare's (1997c) study of this mechanism for Costa Rica. Romer (1994) considers a small open economy that produces a single good according to the production function

y = L^{1-a} \int_0^N x_i^a \, di,
where L is the labor input and x_i is the input of the ith specialized capital good, i \in [0, N]. The capital goods are imported from abroad. The number of types of goods imported, N, is not a priori fixed; in equilibrium, it will depend on the tariff rate. Each specialized capital good is supplied by a foreign monopolist. The foreign monopolist faces a constant marginal cost of producing each unit equal to c and a fixed cost to export equal to \phi(i) = \mu i, where \mu is a positive constant. The small open economy charges a tariff of \tau percent on all purchases of the specialized capital goods. Let the timing of events be as follows. The small open economy announces a tariff \tau. Given this \tau, foreign entrepreneurs decide whether or not to export to the country. Because of the symmetry of the capital goods in final production, all foreign entrepreneurs that export face the same demand curve and earn the same revenue. Profits differ, of course, since fixed costs differ. Marginal entrepreneurs are those whose profit just covers their fixed cost. The product of the marginal entrepreneur is N. The problem facing foreign entrepreneur i, if he enters, is

\max_{x_i} (1 - \tau) p(x_i) x_i - c x_i,

where the inverse demand function p(x_i) = a L^{1-a} x_i^{a-1} is derived from the marginal productivity condition for capital. It is easy to show that the profit-maximizing price is a simple markup over marginal cost and that the profit-maximizing quantity is

x(\tau) = L \left( \frac{(1-\tau) a^2}{c} \right)^{1/(1-a)},
which depends on the level of the tariff \tau. Since the tariff is the same for all producers, we have dropped the index i on x. Setting gross profit equal to fixed cost, we can solve for the marginal product N as a function of x(\tau); that is,

N(\tau) = \frac{(1-a) \, c \, x(\tau)}{a \mu}.
With these expressions for x(\tau) and N(\tau), we can write GDP in equilibrium as y = L^{1-a} N(\tau) [x(\tau)]^a.
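The equilibrium objects above can be sketched numerically. The normalizations L = c = μ = 1 below are illustrative choices of ours (the text pins down only the exponent a and the tariff τ):

```python
# Romer (1994)-style small open economy with imported specialized inputs.
a = 0.5  # capital-goods exponent, as in Romer's rough calculation
L, c, mu = 1.0, 1.0, 1.0  # normalizations (illustrative, not from the text)

def x(tau):
    # Profit-maximizing quantity of each specialized input.
    return L * ((1.0 - tau) * a * a / c) ** (1.0 / (1.0 - a))

def N(tau):
    # Marginal variety: gross profit of entrant N equals fixed cost mu*N.
    return (1.0 - a) * c * x(tau) / (a * mu)

def y(tau, n=None):
    # GDP: y = L^(1-a) * N * x^a; pass a fixed n for the 'traditional' case.
    n = N(tau) if n is None else n
    return L ** (1.0 - a) * n * x(tau) ** a

for tau in (0.25, 0.50):
    full = y(0.0) / y(tau)            # varieties adjust with the tariff
    trad = y(0.0) / y(tau, n=N(0.0))  # varieties fixed at the free-trade set
    print(tau, round(full, 2), round(trad, 2))
```

With varieties adjusting, the ratios come out to about 2.4 (τ = 0.25) and 8 (τ = 0.5); holding N fixed at N(0), they fall to about 1.3 and 2 - the comparison formalized in Equations (5.10) and (5.11) below.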
What is the impact of tariffs on GDP? One way to measure the impact is to compare the GDP of a country with no tariffs to one with tariff rate \tau; that is,

\frac{y(\tau = 0)}{y(\tau > 0)} = \frac{N(0)}{N(\tau)} \left( \frac{x(0)}{x(\tau)} \right)^a = (1 - \tau)^{-(1+a)/(1-a)}.   (5.10)
This expression assumes that the labor input is the same in the two countries. Before making some back-of-the-envelope calculations with this ratio, let us present another formula. Romer (1994) argues that the effects of tariffs on GDP can be large and, in particular, much larger than traditional analyses have suggested. In the traditional calculations, the implicit assumption is that the set of products does not change with tariffs. In the context of the above model, the traditional analysis assumes a different timing of events. The timing in the traditional analysis assumes that entrepreneurs decide to export or not, assuming there is a zero tariff. After this decision is made, the small open economy posts an unanticipated tariff of \tau. Because the fixed costs are sunk, entrepreneurs continue to export (as long as net profits are positive). What is the impact of tariffs in this case? In this case, N remains fixed at N(0), and the relevant ratio is

\frac{y(\tau = 0)}{y(\tau > 0)} = \left( \frac{x(0)}{x(\tau)} \right)^a = (1 - \tau)^{-a/(1-a)}.   (5.11)
Note that the key difference between Equations (5.10) and (5.11) is that N(0) replaces N(\tau). In essence, the key difference between these formulas is the exponent on (1 - \tau). In the former case, where the number of imports varies with the tariff rate, the exponent is larger. To do some rough calculations, Romer (1994) assumes that a = 1/2 and that \tau is 25 percent. Using the formula in Equation (5.10), we find that GDP is 2.4 times higher without tariffs than with tariffs. Using the formula in Equation (5.11), we find that GDP is only 1.3 times higher without tariffs than with tariffs. Thus, we significantly underestimate the effect on GDP if we do not allow the number of goods to vary with the tariff rate. Furthermore, the result is nonlinear. If we assume that \tau is 50 percent, then the first formula yields a ratio of 8, while the second yields a ratio of 2. Two conclusions can be drawn from these simple calculations. First, the effects of tariffs on productivity may be much larger when we consider that the set of products changes with tariffs. Second, the effects of tariffs on GDP are potentially large. The rough calculations that we did above use rates in the range observed for the low- and middle-income countries. (See Figure 15.) Romer's (1994) estimates led Klenow and Rodriguez-Clare (1997c) to consider the effects of tariffs in Costa Rica. Klenow and Rodriguez-Clare (1997c) find that considering changes in the set of goods imported can significantly change the traditional cost-of-tariff calculation. For example, they find that their cost-of-tariff calculation leads to a loss from trade protection that is up to 4 times greater than the traditional calculation. In the particular case they study, the Costa Rican tariff reform in the late 1980s, the traditional calculation leads to rather small gains from tariff reduction.
Hence, Klenow and Rodriguez-Clare's (1997c) estimates of the gain are also rather small - just a few percent of GDP. Still, it may be that in other countries or time periods, their formula may imply gains from tariff reductions that are a large fraction of GDP.

5.1.3. Other policies
There are many other studies that have examined the quantitative impact of particular policies on income. In labor economics, there are studies of the effects of labor
market restrictions such as impediments to hiring and firing workers on productivity and income. In industrial organization, there are studies assessing the quantitative effects of policies toward monopoly. In public finance, there are studies concerned with the quantitative effects of tax policies on income. In this section, we discuss some examples. In many countries (developed and less developed), there are legal restrictions on the actions of employers. These laws range from requiring payment of termination costs when firing employees to prohibiting firms from closing plants. Putting such legal restrictions on the actions of employers obviously influences their decisions to hire employees. The laws, then, have implications for the equilibrium level of employment. A number of studies have tried to quantify the effects of such laws on aggregate employment and income. For example, Hopenhayn and Rogerson (1993) study the costs of imposing firing costs on firms. They construct a general equilibrium model and use it to study the consequences of a law that imposes a tax equal to one year's wages if a firm fires an employee. They find that such a policy reduces employment by about 2.5 percent and reduces average productivity by 2 percent 22. An old issue is the relationship between monopoly and economic progress. In much of the R&D literature discussed later, there is an emphasis on the idea, attributed to Schumpeter, that entrepreneurs need to capture rents in order to innovate and introduce new products. Hence, this idea suggests that monopoly leads to economic progress. There is, of course, some truth to this idea. But for developing countries, in which the issue is primarily one of technology adoption and not creation, the idea may be of little quantitative importance. Developing countries need to worry less about the incentives to invent new products than do developed countries. Hence, if monopolies have costs as well, monopolies may be more costly in developing countries.
But the cost of monopoly is low in most models. The cost of monopoly is usually due to a restriction on output. The costs of such output restrictions are usually estimated to be a small share of GDP. Bigger costs would emerge if monopoly were tied to restrictions on technology adoption. Parente and Prescott (1997) present a new model that argues that monopoly does restrict technology adoption. They study the consequences of giving a group the right to use a particular technology. If the group is given such a right, then it may try to block the adoption of new technologies that would reduce the gain from the monopoly right. Moreover, the group may use existing technologies inefficiently. There is also a branch of the CGE literature that studies public finance issues. Among the policies that have been quantitatively explored in this literature are the abolition of government taxes, indexation of tax systems to inflation, and replacement of income taxes with consumption taxes. A good survey of some of this literature is contained in Shoven and Whalley (1984).
22 Other work in this area includes Bertola (1994) and Loayza (1996) who study the effects of certain labor market restrictions on growth.
Ch. 10: Explaining Cross-Country Income Differences
709
5.2. Effects of policy on growth

Up to now, we have focused on disparity in the levels of income across countries. However, much of the recent literature has focused instead on income growth. Of particular interest is the significant increase in the standard of living of the richest countries over the past 200 years and the recent growth miracles in East Asia. (See Figures 1 and 4.) An objective in this literature - typically referred to as the endogenous growth literature - is to develop models in which growth rates are endogenously determined. One of the main questions of this literature has been, What are the determinants of the long-run growth rate? To illustrate the kinds of quantitative predictions that have been found, we analyze two prototype endogenous growth models. The first is a two-sector model with growth driven by factor accumulation. The second model assumes that growth is driven by research and development. For both models, we derive steady-state growth rates and show how they depend on economic policies. Under certain assumptions, measured differences in policies imply significant differences in growth rates.
5.2.1. Policies in a two-sector AK model

In this section, we analyze the balanced growth predictions of a prototype two-sector endogenous growth model 23. There are three main differences between this model and the exogenous growth model discussed in Section 5.1.1. First, here we assume that there are constant returns to scale in accumulable factors. Second, we introduce elastic labor supply. Adding elastic labor supply does not change the results of Chari et al. (1997) significantly, but it does have a large effect on the predictions of the endogenous growth models. Third, we add taxes on factor incomes as in Stokey and Rebelo (1995). We assume that there is a representative household which maximizes
\sum_{t=0}^{\infty} \beta^t U(c_t, \ell_t)\, N_t, \qquad (5.12)
where c_t is consumption per household member, ℓ_t is the fraction of time devoted to work, and N_t is the total number of household members. Since we are using a representative household in our analysis, we refer to the units of c as per capita and to N_t as the total population at date t. As before, we assume here that the growth rate of the population is constant and equal to n. For our calculations below, we assume that U(c, ℓ) = {c(1-ℓ)^ψ}^{1-σ}/(1-σ) with ψ > 0.
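With this utility function, the balanced-growth Euler step used repeatedly below (it reappears as Equation (5.30) in the R&D model) follows directly. A sketch, assuming a constant after-tax return r and a constant fraction of time worked ℓ:

```latex
% Marginal utility of consumption for U(c,\ell) = [c(1-\ell)^{\psi}]^{1-\sigma}/(1-\sigma):
%   U_c(c,\ell) = c^{-\sigma}(1-\ell)^{\psi(1-\sigma)}.
% With \ell constant and c_{t+1}/c_t = 1+g, the Euler equation
%   U_c(c_t,\ell) = \beta (1+r)\, U_c(c_{t+1},\ell)
% reduces to
\[
  (1+g)^{\sigma} = \beta(1+r)
  \quad\Longrightarrow\quad
  g = \left[\beta(1+r)\right]^{1/\sigma} - 1 .
\]
```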
23 For more discussion of this model, see Rebelo (1991) and Jones and Manuelli (1997). In Section 6, we analyze simulations of the model using as inputs the process for investment tax rates estimated in Chari et al. (1997).
E.R. McGrattan and J.A. Schmitz, Jr.
There are two sectors of production in the economy. Firms in sector 1 produce goods which can be used for consumption or as new physical capital. The production technology in this sector is given by

c + x_k = y = A (k v)^{a_k} (h \ell u)^{a_h}, \qquad (5.13)
where x_k is per capita investment in physical capital; A is the index of the technology level; v and u are the fractions of physical capital and labor, respectively, allocated to sector 1; and k and h are the per capita stocks of physical and human capital, respectively. In this case, we assume constant returns to the accumulable factors; that is, a_k + a_h = 1. The human capital investment good is produced in sector 2 with a different production technology, namely,

x_h = B (k(1-v))^{\theta_k} (h \ell (1-u))^{\theta_h}, \qquad (5.14)
where x_h is per capita investment in human capital and B is the index of the technology level. Again, we assume constant returns in accumulable factors, so that θ_k + θ_h = 1. As do Uzawa (1965) and Lucas (1988), we allow for the possibility that the production of human capital is relatively intensive in human capital (that is, θ_h > a_h). Note that if a_k = θ_k and A = B, then this model is equivalent to a one-sector endogenous growth model. The laws of motion for the per capita capital stocks k and h are given by

(1+n) k_{t+1} = (1-\delta_k) k_t + x_{kt}, \qquad (5.15)
(1+n) h_{t+1} = (1-\delta_h) h_t + x_{ht}, \qquad (5.16)
where the term (1+n) appears because we have written everything in per capita terms. Households supply labor and capital to the firms in the two sectors. Their income and investment spending are taxed. A typical household's budget constraint is given by

c_t + (1+\tau_{xkt}) x_{kt} + (1+\tau_{xht}) q_t x_{ht} \le (1-\tau_{k1t}) r_{1t} k_t v_t + (1-\tau_{k2t}) r_{2t} k_t (1-v_t)
\qquad\qquad + (1-\tau_{h1t}) w_{1t} \ell_t h_t u_t + (1-\tau_{h2t}) w_{2t} \ell_t h_t (1-u_t) + T_t, \qquad (5.17)

where q is the relative price of goods produced in the two sectors, τ_xk is a tax on physical capital investment, τ_xh is a tax on human capital investment, r_j is the rental rate on physical capital in sector j, w_j is the wage rate in sector j, τ_kj is a tax on income from physical capital used in sector j, τ_hj is a tax on income from human capital used in sector j, and T is per capita transfers. We assume that households maximize Equation (5.12) subject to Equations (5.15), (5.16), and (5.17), the processes for the tax rates τ_xk, τ_xh, τ_kj, τ_hj, j = 1,2, and
Table 3A
Parameter values for tax experiments in the two-sector endogenous growth model

Parameters                      King and Rebelo (1990)  Lucas (1990)  Kim (1992)  Jones et al. (1993)
Capital shares
  Sector 1 (a_k)                0.33                    0.24          0.34        0.36
  Sector 2 (θ_k)                0.33                    0.0           0.34        0.17
Depreciation rates
  Physical capital (δ_k)        0.1                     0.0           0.05        0.1
  Human capital (δ_h)           0.1                     0.0           0.01        0.1
Preferences
  Discount factor (β)           0.988                   0.98          0.99        0.98
  Share on leisure (ψ)          0.0                     0.5           0.0         5.0
  Risk aversion (σ)             1.0                     2.0           1.94        1.5
  Growth in population (n)      0.0                     0.014         0.0         0.0
Technology level (B)            0.126                   0.078         0.048       0.407
given factor prices. Assuming competitive markets, one finds that factor prices in equilibrium are marginal products derived using the technologies in Equation (5.13) and Equation (5.14). We turn now to some calculations. Following Stokey and Rebelo (1995), we parameterize the model to mimic different studies in the literature. In Table 3A, we display four such parameterizations, corresponding to the studies of King and Rebelo (1990), Lucas (1990), Kim (1992), and Jones et al. (1993). For all four models and all of the numerical experiments we run, we normalize the scale of technology in sector 1 with A = 1 and adjust B so as to achieve a particular growth rate in our baseline cases. Although there are slight differences between the model described above and those we are comparing it to, when we run the same numerical experiments as these studies, we find comparable results. Here we run the same numerical experiment for all four models. The experiment is motivated by the data on income tax revenues and growth rates for the USA reported in Stokey and Rebelo (1995). Stokey and Rebelo note that in the USA, there was a large increase in the income tax rate during World War II. Despite this, there was little or no change in the long-run US growth rate. Stokey and Rebelo argue that this evidence
Table 3B
Steady-state growth for a 0 percent and a 20 percent income tax in the two-sector endogenous growth model

                          King and Rebelo (1990)  Lucas (1990)  Kim (1992)  Jones et al. (1993)
Steady-state growth rate
  Tax rate = 0            2.00                    2.00          2.00        2.00
  Tax rate = 0.2          -0.62                   1.17          1.31        -1.99
Ratio of incomes
  After 30 years          2.18                    1.28          1.23        3.31
  After 200 years         182                     5.12          3.89        2924
suggests that the models in the literature predict implausibly large growth effects of fiscal policies 24. Suppose that we parameterize our model using the values given in Table 3A. The parameter B is set so as to achieve a steady-state growth rate of 2 percent when all tax rates are 0. Now consider an increase in the tax rates τ_k1, τ_k2, τ_h1, and τ_h2 from 0 percent to 20 percent. In Table 3B, we display the after-tax steady-state growth rates for all four parameterizations. The new growth rates range from a value of -1.99 for Jones et al.'s (1993) parameters to 1.31 for Kim's (1992) parameters. To get some sense of the magnitudes, imagine two countries that start out with the same output per worker, but one follows a 0 percent tax policy and the other a 20 percent tax policy. After 30 years, one would predict that their incomes differ by a factor of 1.23 using Kim's (1992) parameters and 3.31 using Jones et al.'s (1993) parameters. After 200 years, the factors would be 3.89 versus 2,924. Thus, there is a large difference between the predictions of Lucas (1990) or Kim (1992) and King and Rebelo (1990) or Jones et al. (1993) if growth rates are compounded over many years. Table 3B shows clearly that the estimated impact of policy on growth varies dramatically in the literature. Here, too, there is still much debate about the magnitude of the estimates of policy effects. To get some sense of why the results are so different, we consider two special cases of the model and derive explicit formulas for the growth rate of productivity in the steady state. Suppose first that incomes from capital and labor used in sector j are
24 The small change in growth could also be due to the fact that there were other policy changes, such as lower tariffs or increased public spending on education as in Glomm and Ravikumar (1998), that had offsetting effects on the growth rate.
taxed at the same rates. That is, let τ_j = τ_kj = τ_hj. Suppose that tax rates on physical and human capital investment are equal; that is, τ_x = τ_xk = τ_xh. Suppose also that the capital shares are equal in the two sectors, with α = a_k = θ_k. Finally, assume that the depreciation rates are equal for physical and human capital, and let δ = δ_k = δ_h. In this case, the steady-state growth rate for output per worker is given by

g = \left\{ \beta \left[ 1 - \delta + \frac{[A\alpha(1-\tau_1)]^{\alpha}\,[B(1-\alpha)(1-\tau_2)]^{1-\alpha}\,\ell(\tau)^{1-\alpha}}{1+\tau_x} \right] \right\}^{1/\sigma} - 1, \qquad (5.18)
where τ = (τ_x, τ_1, τ_2) is the vector of tax rates and ℓ(τ) denotes the fraction of time spent working in the steady state, which is a function of the tax rates. From the expression in Equation (5.18), we see that the predicted effects of a tax increase depend on the discount factor, the depreciation rate, the capital share, and the elasticity of labor. The parameters of King and Rebelo (1990) fit the special case in Equation (5.18). But they further assume that labor is supplied inelastically, and therefore, ℓ(τ) = 1. Consider two variations on King and Rebelo's (1990) parameter values given in Table 3A. First, suppose they had assumed δ = 0 rather than δ = 0.1. Using the formula in Equation (5.18) with α = 0.33, δ = 0, β = 0.988, σ = 1, A = 1, and B = 0.0154, we find that the pre-tax growth rate is 2 percent and the after-tax growth rate is 1.36 percent, which is significantly higher than -0.62. (See Table 3B.) Now consider increasing σ. If we set δ = 0, σ = 2, and B = 0.032 so as to get a pre-tax growth rate of 2 percent, then the after-tax growth rate is 1.48 percent, which is even higher than the estimate found with Kim's (1992) parameter values. We now consider a second special case. Suppose that the sector for producing human capital uses no physical capital. In this case, the steady-state growth rate for output per worker is given by

g = \left\{ \beta \left[ 1 - \delta + \frac{(1-\tau_{h2})\, B\, \ell(\tau)}{1+\tau_{xh}} \right] \right\}^{1/\sigma} - 1, \qquad (5.19)
where τ = (τ_x, τ_k1, τ_k2, τ_h1, τ_h2) is the vector of tax rates and ℓ(τ) is the time spent working in the steady-state equilibrium. The parameters of Lucas (1990) fit this special case. In this case, no physical capital is allocated to sector 2, and therefore, changes in τ_k2 have no effect at all. Furthermore, changes in tax rates in sector 1 affect growth only if they affect the supply of labor. If labor is inelastically supplied, the taxes levied on factors in sector 1 have no growth effects at all. Lucas (1990) chooses a near-inelastic labor supply (ψ = 0.5). Suppose, for his case, we use ψ = 5, implying an elastic labor supply as in Jones et al. (1993), and set B = 0.219 to hit the baseline growth rate. With these changes, the steady-state labor supply ℓ is 0.209 when the tax rates are 20 percent and 0.283 when the tax rates are 0 percent. Using the formula in Equation (5.19), we find that the pre-tax growth
rate is 2 percent and that the after-tax growth rate is 0.79 percent. Thus, the growth effects are sensitive to the choice of labor elasticity. The formulas in Equation (5.18) and Equation (5.19) illustrate how sensitive the quantitative predictions are to certain parameter assumptions. In particular, the predictions are sensitive to choices of the labor elasticity, depreciation rates, and the intertemporal elasticity of substitution. Stokey and Rebelo (1995) attribute the wide range of estimates of the potential growth effects of tax increases cited in the literature to different assumptions for these parameters. The conclusion that Stokey and Rebelo (1995) draw from the US time series evidence is that tax reform would have little or no effect on growth rates in the USA. They do not dispute that the two-sector endogenous growth model yields a good description of the data if it is parameterized as in Lucas (1990) or Kim (1992). Jones (1995b), however, uses the US time series as evidence that the model is not a good description of the data. He notes that after World War II, we saw large increases in the investment-output ratio in France, Germany, Great Britain, Japan, and the USA. But growth rates in these countries changed little. If the data were well described by a one-sector AK growth model, then Jones (1995b) argues that we should have seen larger increases in the growth rate accompanying the increases in the investment-output ratio. The model Jones (1995b) works with is a one-sector version of the model above in which labor is supplied inelastically and the total population is constant. Suppose that A = B, α = a_k = θ_k, δ = δ_k = δ_h, ψ = 0, and n = 0. In this case, the ratio of human to physical capital is given by the ratio of their relative shares (1-α)/α. Here, as in the AK model, total output can be written as a linear function of k, namely, as A k^α h^{1-α} = A[(1-α)/α]^{1-α} k. Thus, the growth rate in output is equal to the growth rate in capital.
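The sensitivity calculations above, and the compounding behind Table 3B, can be reproduced numerically. The sketch below implements the two special-case growth formulas, Equations (5.18) and (5.19), with all investment taxes set to zero as in the experiments (the function names are ours):

```python
# Numerical check of the two special-case growth formulas.
# Parameter values are those given in the text and Table 3A.

def growth_equal_shares(A, B, alpha, delta, beta, sigma, tau1, tau2, ell=1.0, tau_x=0.0):
    """Steady-state growth rate, Equation (5.18): equal capital shares and
    equal depreciation rates across sectors."""
    r = ((A * alpha * (1 - tau1)) ** alpha
         * (B * (1 - alpha) * (1 - tau2)) ** (1 - alpha)
         * ell ** (1 - alpha)) / (1 + tau_x)
    return (beta * (1 - delta + r)) ** (1 / sigma) - 1

def growth_no_k_in_sector2(B, delta, beta, sigma, tau_h2, ell, tau_xh=0.0):
    """Steady-state growth rate, Equation (5.19): human capital produced
    without physical capital."""
    r = (1 - tau_h2) * B * ell / (1 + tau_xh)
    return (beta * (1 - delta + r)) ** (1 / sigma) - 1

# King and Rebelo variation 1: alpha = 0.33, delta = 0, beta = 0.988, sigma = 1
g0 = growth_equal_shares(1, 0.0154, 0.33, 0.0, 0.988, 1, 0.0, 0.0)
g1 = growth_equal_shares(1, 0.0154, 0.33, 0.0, 0.988, 1, 0.2, 0.2)
print(f"{g0:.4f} {g1:.4f}")   # roughly 0.02 and 0.0136

# King and Rebelo variation 2: sigma = 2, B = 0.032
g2 = growth_equal_shares(1, 0.032, 0.33, 0.0, 0.988, 2, 0.2, 0.2)
print(f"{g2:.4f}")            # roughly 0.0148

# Lucas case with elastic labor (psi = 5): ell falls from 0.283 to 0.209
g3 = growth_no_k_in_sector2(0.219, 0.0, 0.98, 2, 0.0, 0.283)
g4 = growth_no_k_in_sector2(0.219, 0.0, 0.98, 2, 0.2, 0.209)
print(f"{g3:.4f} {g4:.4f}")   # roughly 0.02 and 0.0079

# Compounding check for Table 3B (Jones et al. column): 2.00% vs. -1.99%
print(f"{(1.02/0.9801)**30:.2f} {(1.02/0.9801)**200:.0f}")   # roughly 3.31 and 2924
```

These calculations reproduce the 1.36 and 1.48 percent after-tax rates of the King-Rebelo variations, the 0.79 percent rate of the Lucas variation, and the income ratios in Table 3B.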
From Equation (5.15), we can derive the steady-state growth rate in capital, which we denote by g, by dividing both sides of the equation by k_t and subtracting 1. The growth rate in this case is

g = \tilde{A}\, \frac{x_k}{c + x_k + x_h} - \delta, \qquad \tilde{A} = A\left[\frac{1-\alpha}{\alpha}\right]^{1-\alpha}, \qquad (5.20)

where we have used the steady-state relation between capital and total output c + x_k + x_h. Jones (1995b) points out that while investment-output ratios have risen over the postwar period, growth rates have stayed roughly constant or have fallen. The formula in Equation (5.20) implies the opposite: increases in investment-output ratios should be accompanied by increases in growth rates. There are several caveats to be noted with Jones' (1995b) argument. First, in countries such as the USA, the changes in the investment-output ratio are not that large, and by Equation (5.20) we would not expect a large change in the growth rate. Suppose α = 1/3 and A is set equal to 1/4 to get a capital-output ratio of roughly 2.5. Suppose also that the depreciation rate is 5 percent. These values would imply that an increase in the investment-output ratio from 16.5 percent to 18.1 percent, as reported
by Jones (1995b) for the USA over the period 1950-1988, should lead to a change in the growth rate from 1.55 percent to 2.18 percent. Given the size of growth rate variations in the data, it is hard to detect such a small change in the long-run growth rate over such a short period of time. Second, the relationship between growth rates and the investment-output ratio is not given by Equation (5.20) once we relax many of the assumptions imposed by Jones (1995b). For example, if labor is elastically supplied or the two sectors of the model have different capital shares, then Equation (5.20) does not hold. In such cases, we have to be explicit about what is changing investment-output ratios in order to make quantitative predictions about the growth rates. If, for example, we use Lucas' (1990) model to investigate the effects of income tax changes, we find a small effect on growth rates but a big effect on investment-output ratios. In this section, we discussed the effects of changes in tax rates on growth. The AK model has also been used to study the effects of monetary policy on growth. For example, Chari et al. (1995) consider an AK model with several specifications of the role for money. In all cases, they find that changes in the growth rate of the money supply have a quantitatively trivial effect on the growth rate of output. As we saw above, large growth effects require large effects on the real rate of return. Changes in tax rates can have a potentially large effect on the real rate of return, but changes in inflation rates do not. On the other hand, Chari et al. (1995) find that monetary policies that affect financial regulations, such as reserve requirements on banks, can have nontrivial effects (on the order of a 0.2 percentage point fall in the growth rate with a rise in inflation from 10 to 20 percent) if the fraction of money held as reserves by banks is high (on the order of 0.8).
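Jones's first caveat can be checked directly from Equation (5.20). A minimal sketch, assuming α = 1/3, A = 1/4, and δ = 0.05 — values consistent with the capital-output ratio of roughly 2.5 and the two growth rates reported in the text:

```python
# Growth rate implied by Equation (5.20): g = A_tilde * (investment/output) - delta,
# where A_tilde = A * ((1 - alpha)/alpha)**(1 - alpha) is output per unit of
# physical capital in the one-sector reduction of the model.

alpha, A, delta = 1/3, 1/4, 0.05
A_tilde = A * ((1 - alpha) / alpha) ** (1 - alpha)

print(f"capital-output ratio: {1/A_tilde:.2f}")           # roughly 2.5
for inv_output in (0.165, 0.181):
    g = A_tilde * inv_output - delta
    print(f"x_k/y = {inv_output:.3f}  ->  g = {100*g:.2f}%")
```

The two printed growth rates match the 1.55 percent and 2.18 percent figures in the text, confirming that even the full postwar rise in the US investment-output ratio implies only a modest change in the predicted growth rate.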
These effects are small, however, relative to the effects of fiscal policy that have been found.

5.2.2. Policies in an R&D model
A large literature has developed theoretical models of endogenous growth based on devoting resources to R&D. This literature includes new product development models [such as in Romer (1990)] and quality-ladder models [such as in Grossman and Helpman (1991a, b) and Aghion and Howitt (1992)]. As compared to the theoretical literature that explores the quantitative link between policies and disparity (as in Section 5.1) and the two-sector endogenous growth literature that explores the quantitative link between policies and growth (as in Section 5.2.1), this R&D literature has far fewer studies exploring the quantitative link between policies and growth. This is likely due to the fact that the main quantitative concern for these models has been their predicted scale effects. Though there has been little explicit analysis of the effect of policy in these models, we think it is important to review this literature. We begin by describing a discrete-time version of the model in Romer (1990). Recall that in Section 5.1.2 we considered the problem of a small open economy importing intermediate goods that had already been developed in the rest of the world. Here we focus on the R&D activity. Technological innovation - new blueprints for intermediate
inputs - is the driving force behind growth in this model. We show that the model implies a scale effect: the growth rate increases with the number of people working in R&D. This implied scale effect has been criticized by Jones (1995a), who offers a possible solution without significantly changing the model. [See also Young's (1998) model of quality ladders.] We review Jones' (1995a) model, in which there is no scale effect. We lastly turn to the evidence on this point. The discrete-time version of the economy in Romer (1990) that we consider has three production sectors. In the research sector, firms use existing blueprints and human capital to produce new blueprints. In the intermediate goods sector, firms use existing blueprints and capital to produce intermediate capital goods. In the final goods sector, firms use intermediate capital goods, labor, and human capital to produce a final good that can be consumed or used to produce new capital. In addition, there is a household sector. Households buy consumption and investment goods with wages, rental earnings, and profits. Consider first the problem of the final goods producers. Their production function is given by

y = H_y^{\alpha} L^{\gamma} \int_0^N x_i^{1-\alpha-\gamma} \, di,

where H_y is human capital devoted to final goods production, L is labor, N is the total number of intermediate goods currently in existence, and x_i is the quantity of the ith intermediate good. Final goods producers choose inputs to maximize their profits and, therefore, solve

\max_{H_y, L, \{x_i\}} \; H_y^{\alpha} L^{\gamma} \int_0^N x_i^{1-\alpha-\gamma} \, di - w_H H_y - w_L L - \int_0^N p_i x_i \, di, \qquad (5.21)
where w_H is the price of a unit of human capital, w_L is the wage rate for labor, p_i is the price of intermediate good i, and the final good is the numeraire. Profit maximization implies that

p_i = (1-\alpha-\gamma)\, H_y^{\alpha} L^{\gamma} x_i^{-\alpha-\gamma} \qquad (5.22)

and that

w_H = \alpha H_y^{\alpha-1} L^{\gamma} \int_0^N x_i^{1-\alpha-\gamma} \, di.
Consider next the problem of intermediate goods producers. We start by assuming that the blueprint for intermediate good i has been purchased. The technology available to the producer of intermediate good i is linear and is given by

x_i = k_i, \qquad (5.23)
where k_i is the capital input. Conditional on having purchased blueprint i, the producer of intermediate good i maximizes profits π_i:

\pi_i = \max_{x_i} \; p(x_i)\, x_i - r k_i \qquad (5.24)
subject to Equation (5.23), where p(·) is the demand function given by Equation (5.22) and r is the rental rate for capital. The decision to purchase a blueprint is based on a comparison of the cost of the blueprint versus the benefit of a discounted stream of profits from using the blueprint. Free entry into intermediate good production implies that

P_{Nt} = \sum_{j=t+1}^{\infty} \left( \prod_{s=t+1}^{j} \frac{1}{1+r_s} \right) \pi_j,

where P_{Nt} is the price of blueprint N at date t and π_j are profits at date j.
where PN~ is the price of blueprint N at date t and ~j are profits at date j. Next we consider the problem of research firms who produce new blueprints and sell them to intermediate goods producers. Given an input of human capital,/:/, a firm can produce 6I-IN new blueprints, where 6 is a productivity parameter and N is the total stock of blueprints in the economy. Let Hut denote the aggregate human capital input in R&D; then the stock of blueprints evolves according to Nt+l = Nt + 6HNtNt.
(5.25)
In equilibrium, it must be true that WH = PN 6N.
Lastly, consumers maximize expected utility subject to their budget constraint. Preferences for the representative household over consumption streams are given by

\sum_{t=0}^{\infty} \beta^t \frac{C_t^{1-\sigma} - 1}{1-\sigma},

where C_t are units of consumption at date t. Denoting the interest rate by r_t, one finds that the maximization of utility subject to the household's budget constraint implies that

U'(C_t) = \beta\, U'(C_{t+1})(1+r_{t+1}). \qquad (5.26)
We now compute a steady-state equilibrium growth rate for output. Assume that the total stock of human capital H = H_N + H_y and the supply of labor L are both fixed. Romer (1990) shows that a symmetric equilibrium exists in which output Y, consumption C, and the number of blueprints N all grow at the same rate. Denote
this growth rate by g, and denote the quantities, prices, and profits in the intermediate good sector by x̄, p̄, and π̄. From Equation (5.25), we know that g = δH_N. Thus, to compute the growth rate of output, we need to derive the stock of human capital devoted to R&D in equilibrium. The returns to human capital in the research sector and in the final goods sector must be equal in equilibrium; therefore,

P_N \delta N = \alpha H_y^{\alpha-1} L^{\gamma} N \bar{x}^{1-\alpha-\gamma}. \qquad (5.27)
Using Equations (5.22), (5.23), and the first-order condition from Equation (5.24), we have that

\bar{\pi} = (\alpha+\gamma)(1-\alpha-\gamma)\, H_y^{\alpha} L^{\gamma} \bar{x}^{1-\alpha-\gamma}.
Equating the price of blueprints to the discounted value of the profits from use of the blueprints implies that PN
=
1
1
--f'E : Y
--{(a F
q-
~/)(1 - a -
y)H~LY21-a-Y}.
(5.28)
Substituting Equation (5.28) in Equation (5.27) and simplifying yields the following expression for human capital in production:

H_y = \frac{\alpha r}{\delta(1-\alpha-\gamma)(\alpha+\gamma)}.

Therefore, the growth rate is

g = \delta\left( H - \frac{\alpha r}{\delta(1-\alpha-\gamma)(\alpha+\gamma)} \right) = \delta H - \Lambda r, \qquad (5.29)
where \Lambda = \alpha/[(1-\alpha-\gamma)(\alpha+\gamma)]. From the household's first-order condition in Equation (5.26), we have that

g = [\beta(1+r)]^{1/\sigma} - 1. \qquad (5.30)
Thus, in Equations (5.29) and (5.30), we have two equations from which we can determine the growth rate g and the interest rate r on a balanced growth path. Notice that g depends positively on the stock of human capital H. Thus, there is a scale effect, as we noted above. As Jones (1995a) points out, one need not even proceed past the specification of growth in the number of blueprints and the description of technologies to know that there is a scale effect. The main assumption of the model is that a doubling of the number of people working on R&D implies a doubling of the growth rate, by Equation (5.25). However, in many countries, particularly the OECD countries, there has been a dramatic increase in the number of scientists and engineers
and a dramatic increase in the resources devoted to R&D, with little or no increase in growth rates over a sustained period. Within the context of Romer's (1990) model that we just described, Jones (1995a) offers a possible solution to the problem of the existence of a scale effect. In particular, he assumes that the evolution of blueprints is given by

N_{t+1} = N_t + \delta H_{Nt}^{\lambda} N_t^{\phi}, \qquad (5.31)
with 0 < λ ≤ 1 and φ < 1, rather than by Equation (5.25). Jones (1995a) also assumes that γ = 0 and that the growth rate of H is given by n, where H is now interpreted to be the total labor force. On a balanced growth path, H_{Nt}^{λ} must grow at the same rate as N_t^{1-φ}. [See Equation (5.31).] Thus, it follows that

g = \frac{\lambda n}{1-\phi}, \qquad (5.32)
where g is the growth rate of blueprints and output per worker. Note that g now depends on the growth rate of the labor force rather than the total number of researchers. Thus, the scale effect is removed. Is the relationship in Equation (5.32) consistent with the data? The answer to this question depends a lot on how the model is interpreted. For example, if we interpret this model as one of a typical country, then the answer is no. The correlation between growth rates of GDP per worker and growth rates of the labor force over the period 1960-1985 is -0.12 (based on all countries with GDP per worker available). The relationship in Equation (5.32) implies a positive correlation. If we interpret this model as one relevant only for countries in which there is a lot of activity in R&D, we still find that the correlation between the growth rates of GDP per worker and the labor force is around zero or slightly negative. Suppose, finally, that we view the model as one relevant for the world economy. Using the data of Bairoch (1981), we see that the growth rate in real GDP per capita for the world and the growth rate of the world population followed the same trend over the period 1750-1990. Hence, under this interpretation, the model is roughly consistent with the data, assuming that the growth rate of the world population is a good proxy for the growth rate of the number of researchers world-wide. With growth in output per worker determined by a variable assumed to be exogenous, namely the growth rate in the labor force, policies intended to encourage innovation, such as subsidies to R&D or capital accumulation, have no growth effects in Jones' (1995a) model. However, we do know that there are many countries that have a large fraction of R&D spending financed by their government. Examples include France, Germany, Japan, and the USA.
In fact, Eaton and Kortum (1996) attribute more than 50 percent of the growth in each of the OECD countries to innovation in Germany, Japan, and the USA. But the effects of policies encouraging innovation are still being debated. [See, for example, Aghion and Howitt (1998).]
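Before leaving the R&D model, note that the scale effect can be made concrete by solving Equations (5.29) and (5.30) jointly for g and r. The sketch below uses illustrative parameter values of our own choosing (the values of α, γ, δ, β, σ, and the two levels of H are hypothetical, not taken from the chapter):

```python
# Solve the balanced-growth system of Romer's model:
#   (5.29)  g = delta*H - Lambda*r,  Lambda = alpha/((1-alpha-gamma)*(alpha+gamma))
#   (5.30)  g = (beta*(1+r))**(1/sigma) - 1
# Doubling H shifts (5.29) up, raising the equilibrium growth rate: the scale effect.

alpha, gamma = 0.36, 0.30              # hypothetical factor shares
delta, beta, sigma = 0.05, 0.97, 2.0   # hypothetical research productivity, preferences

Lambda = alpha / ((1 - alpha - gamma) * (alpha + gamma))

def balanced_growth(H):
    """Bisection on r for the root of (5.29) minus (5.30)."""
    f = lambda r: (delta * H - Lambda * r) - ((beta * (1 + r)) ** (1 / sigma) - 1)
    lo, hi = 0.0, 1.0                  # f(lo) > 0 > f(hi) for these parameters
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    r = (lo + hi) / 2
    return (beta * (1 + r)) ** (1 / sigma) - 1

g5, g10 = balanced_growth(5.0), balanced_growth(10.0)
print(f"g(H=5) = {g5:.3f}, g(H=10) = {g10:.3f}")   # growth rises with H
```

Doubling the stock of human capital roughly doubles the balanced-growth rate under these parameters, which is exactly the scale effect that Jones (1995a) criticizes.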
6. Two growth models and all of the basic facts

In Section 5.1, we reviewed the literature that studies the implied disparity in incomes in various parameterized models, while in Section 5.2, we reviewed the literature that studies the implied growth rates in various parameterized models. These studies typically focus on one or the other dimension of the income distribution - that is, disparity or growth. As Lucas (1988) argues, "the study of development will need to involve working out the implications of competing theories for data other than those they were constructed to fit, and testing these implications against observation" (p. 5). We turn to that task in this section. We look at implications of two of the above parameterized models for numerous dimensions of the income distribution over the last 100 or so years (in particular, the features in Figures 1-4). As we just mentioned, few studies actually perform this exercise. A big difficulty in performing such an exercise is coming up with reasonable measures of factor inputs, such as human capital, and of economic policies for such a long period of time. We do not solve that problem here. What we do is take the measure of distortions on capital investment that Chari et al. (1997) use for the post-World War II period and suppose the process applies to the last 200 years or so 25. Our purpose in this section is not to argue that investment distortions were the only factor determining variations in incomes but only to show what can be learned by conducting the exercise that Lucas (1988) suggests. For example, we learn that the parameterized models, with the distortions from Chari et al. (1997), do a reasonable job in explaining some figures, but not all. The models that we analyze here are a standard exogenous growth model and a standard AK endogenous growth model. For both models, we generate panel data sets and compare them to the data compiled by Maddison (1991, 1994) and Summers and Heston (1991).
This is done by producing analogues of Figures 1-4 for the two models. The exogenous growth model can be written succinctly as the following maximization problem:

\max_{\{c_t, \ell_t, x_{kt}, x_{ht}\}} \; \sum_{t=0}^{\infty} \hat{\beta}^t \left( c_t (1-\ell_t)^{\psi} \right)^{1-\sigma} / (1-\sigma)

subject to

c_t + (1+\tau_{xt})(x_{kt} + x_{ht}) \le w_t \ell_t + r_{kt} k_t + r_{ht} h_t + T_t,
(1+g)(1+n) k_{t+1} = (1-\delta) k_t + x_{kt}, \qquad (1+g)(1+n) h_{t+1} = (1-\delta) h_t + x_{ht},
r_{kt} = F_1(\bar{k}_t, \bar{h}_t, \bar{\ell}_t), \qquad r_{ht} = F_2(\bar{k}_t, \bar{h}_t, \bar{\ell}_t), \qquad w_t = F_3(\bar{k}_t, \bar{h}_t, \bar{\ell}_t),
T_t = \tau_{xt}(\bar{x}_{kt} + \bar{x}_{ht}), \qquad F(k,h,\ell) = A k^{a_k} h^{a_h} \ell^{1-a_k-a_h}, \qquad (6.1)

25 In recent work, Jones et al. (1998) have included stochastic tax and productivity processes in an endogenous growth model in order to study the effects of uncertainty on the growth rate.
with x_{kt}, x_{ht} ≥ 0, β̂ = β(1+g)^{1-σ}(1+n), and a_k + a_h < 1. Original variables have been converted to per capita terms and, if necessary, divided by the level of technology in order to make them stationary (for example, ℓ_t = L_t/N_t, c_t = C_t/(A_tN_t), k_t = K_t/(A_tN_t), and so on, where N_t = (1+n)^t is the total population and A_t = A(1+g)^t is the level of technology). A bar over a variable denotes the economy-wide level 26. The endogenous growth model can be written succinctly as the following maximization problem:
\max_{\{c_t, \ell_t, x_{kt}, x_{ht}, v_t, u_t\}} \; \sum_{t=0}^{\infty} \hat{\beta}^t \left( c_t (1-\ell_t)^{\psi} \right)^{1-\sigma} / (1-\sigma)

subject to

c_t + (1+\tau_{xt})(x_{kt} + q_t x_{ht}) \le r_{1t} k_t v_t + r_{2t} k_t (1-v_t) + w_{1t} \ell_t h_t u_t + w_{2t} \ell_t h_t (1-u_t) + T_t,
(1+n) k_{t+1} = (1-\delta) k_t + x_{kt}, \qquad (1+n) h_{t+1} = (1-\delta) h_t + x_{ht},
r_{1t} = F_1(\bar{k}_t \bar{v}_t, \bar{h}_t \bar{\ell}_t \bar{u}_t), \qquad w_{1t} = F_2(\bar{k}_t \bar{v}_t, \bar{h}_t \bar{\ell}_t \bar{u}_t),
r_{2t} = q_t G_1(\bar{k}_t (1-\bar{v}_t), \bar{h}_t \bar{\ell}_t (1-\bar{u}_t)), \qquad w_{2t} = q_t G_2(\bar{k}_t (1-\bar{v}_t), \bar{h}_t \bar{\ell}_t (1-\bar{u}_t)),
q_t = F_1(\bar{k}_t \bar{v}_t, \bar{h}_t \bar{\ell}_t \bar{u}_t) / G_1(\bar{k}_t (1-\bar{v}_t), \bar{h}_t \bar{\ell}_t (1-\bar{u}_t)),
T_t = \tau_{xt}(\bar{x}_{kt} + q_t \bar{x}_{ht}), \qquad F(K,H) = A K^{a_k} H^{a_h}, \qquad G(K,H) = B K^{\theta_k} H^{\theta_h}, \qquad (6.2)

with x_{kt}, x_{ht} ≥ 0, β̂ = β(1+n), a_k + a_h = 1, and θ_k + θ_h = 1. Variables are in per capita units, and a bar over a variable denotes the economy-wide level. In order to simulate the models, we need to choose parameter values and a process for the policy variable τ_x. In Table 4A, we report the parameter values that we use. Many of the parameter values are chosen to be the same in the two models. In particular, we choose physical capital shares equal to 1/3, depreciation rates of 6 percent on both types of capital stocks, a discount factor equal to 97 percent, and the weight on leisure in utility equal to 3. These parameter values fall in the ranges typically used in the literature. We set the growth rate of the population equal to 1.5 percent, which is consistent with the data reported in Maddison (1994). The growth rate in technology in the exogenous growth model is set equal to 1.4 percent to achieve the same long-run growth patterns seen in Maddison's (1994) sample. In the endogenous growth model, we set A = B and a_k = θ_k so as to mimic a one-sector model. We then experiment with a different value of θ_k - one that implies that the human capital sector is human capital-intensive - to see if the results are affected. For the risk aversion parameter, we experiment with σ = 2 and σ = 5. As we showed earlier, growth rates in the endogenous growth model are very sensitive to this parameter. If we choose a value that is too small (for example, near 1), then the
26 The model in Section 5.1.1 with a - g is a simplified version of the model specified here. Under certain specifications of the parameters, they are isomorphic to each other.
E.R. McGrattan and J.A. Schmitz, Jr.
722
Table 4A
Parameter values used in simulations of the exogenous and endogenous growth models

A. Exogenous growth model
  Production:  y = A k^{a_k} h^{a_h}                                A = 1, a_k = ½, a_h = ½
  Evolution of capital:
    (1 + n)(1 + g) k_{t+1} = (1 − δ_k) k_t + x_{kt}                 n = 0.015, g = 0.014, δ_k = 0.06
    (1 + n)(1 + g) h_{t+1} = (1 − δ_h) h_t + x_{ht}                 δ_h = 0.06
  Preferences:  Σ_t β̂^t {c_t(1 − ℓ_t)^ψ}^{1−σ}/(1 − σ),            β = 0.97, ψ = 3, σ = 2 or 5
    β̂ = β(1 + g)^{1−σ}(1 + n)

B. Endogenous growth model
  Production:  y = A (k v)^{a_k} (h ℓ u)^{a_h}                      A = 1, a_k = ½, a_h = ½
    x_h = B (k(1 − v))^{θ_k} (h ℓ(1 − u))^{θ_h}                     B = 1, θ_k = ½, θ_h = ½, or θ_k = 0.03, θ_h = 0.97
  Evolution of capital:
    (1 + n) k_{t+1} = (1 − δ_k) k_t + x_{kt}                        n = 0.015, δ_k = 0.06
    (1 + n) h_{t+1} = (1 − δ_h) h_t + x_{ht}                        δ_h = 0.06
  Preferences:  Σ_t β̂^t {c_t(1 − ℓ_t)^ψ}^{1−σ}/(1 − σ),            β = 0.97, ψ = 3, σ = 5
    β̂ = β(1 + n)
disparity after 200 years is much greater than that actually observed. A value of 5 gives reasonable predictions for the distribution of incomes over time in the endogenous growth model. Results in the exogenous growth model are much less sensitive to this choice. However, the variation in growth rates is still affected significantly. For both models, we conduct the same experiment. We assume that all countries face the same process for investment distortions. All other tax rates are assumed to be equal to 0. Recall that relative prices of investment to consumption can be used as a measure of the distortion on investment. With data on relative prices of investment to consumption over the sample period 1960-1985, Chari et al. (1997) estimate a regime-switching process for the relative price of investment to consumption. In particular, they assume that, conditional on being in regime R, the relative price 1 + τ_x follows an autoregressive process given by

  (1 + τ_{x,t+1}) = p_R(1 + τ_{x,t}) + τ̄_R(1 − p_R) + σ_R ε_{t+1},
Ch. 10: Explaining Cross-Country Income Differences
Table 4B
Stochastic process for distortions used in simulations of the exogenous and endogenous growth models

  Autoregressive parameters:
    (1 + τ_{x,t}) = p_R(1 + τ_{x,t−1}) + τ̄_R(1 − p_R) + σ_R ε_t
      p_P = 0.993, τ̄_P = 1.976, σ_P = 0.074
      p_T = 0.865, τ̄_T = 2.459, σ_T = 0.789
  Switching probability parameters:
    π_{RR′}(m) = a_R + b_R(m − 1)
      a_P = 0.244, b_P = −0.012
      a_T = 0.350, b_T = −0.016
where ε_t is i.i.d. and drawn from a standard normal distribution. The probability of switching regimes depends on the number of periods since the last regime switch. Let m denote the number of periods since the last switch. The probability of switching from regime R to R′, conditional on having been in R for m periods, is π_{RR′}(m) = a_R + b_R(m − 1).

Chari et al. (1997) find that the data are well characterized by a process with two regimes. In one regime, the distortions are highly persistent over time, and in the other, they are more volatile. Chari et al. (1997) refer to the regimes as the persistent and the turbulent regimes. In Table 4B, we display the maximum likelihood parameter estimates of Chari et al. (1997) for the process governing the distortion on investment. The subscript P indicates values for the persistent regime, and the subscript T indicates values for the turbulent regime. Conditional on being in the persistent regime, the coefficient on the lagged relative price is 0.993, and the standard deviation of the innovation is 0.074. Conditional on being in the turbulent regime, the coefficient on the lagged relative price is 0.865, and the standard deviation of the innovation is 0.789. The unconditional variance of the relative price is 2.47 in the turbulent regime and 0.39 in the persistent regime. Thus, the relative price fluctuates a lot more in the turbulent regime than it does in the persistent regime. Notice that in the turbulent regime, relative prices show more mean reversion than they do in the persistent regime. Note also that the unconditional mean of the relative price is 2 in the persistent regime and 2.5 in the turbulent regime. The parameters of the switching probability functions are given in Table 4B. The probability of switching from the persistent to the turbulent regime, conditional on having switched to the persistent regime in the previous period, is 0.244, while this probability is 0.016, conditional on having been in the persistent regime for 20 periods or more. The probability of switching from the turbulent to the persistent regime, conditional on having switched to the turbulent regime in the previous period, is 0.350, while this probability is 0.046, conditional on having been in the turbulent regime for 20 periods or more.
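The distortion process described here is straightforward to simulate. The sketch below is a minimal illustration, not the authors' code: the parameter values are taken from Table 4B, the function and variable names are our own, and the switching probability is written as a_R + b_R(m − 1) with the negative b_R estimates from the table.

```python
import random

# Table 4B estimates: P = persistent regime, T = turbulent regime
PARAMS = {
    "P": {"p": 0.993, "tau_bar": 1.976, "sigma": 0.074, "a": 0.244, "b": -0.012},
    "T": {"p": 0.865, "tau_bar": 2.459, "sigma": 0.789, "a": 0.350, "b": -0.016},
}

def switch_prob(regime, m):
    """Probability of leaving `regime` after m periods: a_R + b_R*(m - 1), clipped to [0, 1]."""
    par = PARAMS[regime]
    return min(max(par["a"] + par["b"] * (m - 1), 0.0), 1.0)

def simulate_relative_price(n_periods, price=1.0, regime="P", m=20, seed=0):
    """Simulate the relative price 1 + tau_x under the two-regime AR(1) process."""
    rng = random.Random(seed)
    path = []
    for _ in range(n_periods):
        # regime switch depends on the number of periods since the last switch
        if rng.random() < switch_prob(regime, m):
            regime, m = ("T" if regime == "P" else "P"), 1
        else:
            m += 1
        par = PARAMS[regime]
        price = par["p"] * price + par["tau_bar"] * (1 - par["p"]) + par["sigma"] * rng.gauss(0.0, 1.0)
        path.append(price)
    return path
```

With these estimates, switch_prob("P", 20) = 0.016 and switch_prob("T", 20) = 0.046, and the implied unconditional variances σ_R²/(1 − p_R²) are roughly 0.39 and 2.47, matching the figures quoted in the text.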
Thus, the probability of leaving the persistent regime is lower
than the probability of leaving the turbulent regime. Notice that the turbulent regime is aptly named because of two characteristics: conditional on being in the regime, relative prices fluctuate more, and the probability of leaving the regime is higher.

Using estimates for the process for 1 + τ_x as inputs, we simulate an artificial panel data set for both the exogenous growth model and the endogenous growth model. We assume that all countries have the same factor endowments and investment distortions (that is, values for k, h, τ_x, R, and m) in the year 1750, and as we noted above, we assume the same process for investment distortions. This choice of initial year is motivated by Bairoch's (1981) GNP per capita data. His numbers show that the average standards of living of the developed countries and the Third World were very similar in 1750. The initial conditions for the relative price are set as follows. We set the relative price equal to 1 (and, hence, the tax on investment equal to 0) for all countries in 1750 and assume that they are in the persistent regime and have been for 20 or more periods. For k and h, we use the corresponding steady-state values (with h normalized to be 1 in 1750 in the endogenous growth model). With these initial conditions, we produce two panel data sets - one for each model - for the period 1750-1990 for 1000 countries.

6.1. An exogenous growth model
In this section, we describe simulation results for the exogenous growth model. We start with the case in which σ = 2. Results in this case are summarized in Figures 17-20. We then describe how the results change when we increase σ to 5. Results in that case are given in Figures 21-24. Both sets of results are compared to their analogues in the data, namely, Figures 1-4, which are discussed in Section 2. In Figure 17, we display the time series of income distributions for the model. We display the 25th and 75th percentiles of the distribution as well as the 10th and 90th percentiles. We plot the percentiles since we are comparing the model to a very incomplete set of data back to 1820. Recall that we have per capita GDP data for only 21 countries. Since we do have a number of the very poor countries and many of the rich countries in this set of 21, it is likely that a good comparison can be made with the 10th and 90th percentiles for our model. We include the 25th and 75th percentiles, however, in order to give some feeling for the size of the tails of the distribution. The model shows a gradual fanning out of the distribution, as is observed in the data. In 1820, the country at the 90th percentile has a per capita GDP equal to 4.3 times that of the country at the 10th percentile. By 1989, this factor is 16.5, which is very close to the ratio of 16.7 in the data. The insert in Figure 17 is a snapshot of the distribution of per capita output in 1989. We keep the axes the same in Figure 17 as in Figure 1 to make it easier to compare the figures. As a result, only 94.3 percent of the countries are shown for the model, since there were some outliers with per capita output greater than 8 times the world average or less than 1/8 of the world average. In the data, the distribution of per capita GDP is close to uniform. The distribution of per capita output in the model in 1989 is closer to normal.
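The dispersion statistics used above (the ratio of the 90th to the 10th percentile of per capita GDP) are simple cross-sectional computations. A sketch, with a hypothetical log-normal cross-section standing in for the model's simulated 1989 panel:

```python
import numpy as np

def percentile_ratio(incomes, hi=90, lo=10):
    """Ratio of the hi-th to the lo-th percentile of a cross-section of incomes."""
    return float(np.percentile(incomes, hi) / np.percentile(incomes, lo))

# Hypothetical cross-section of 1000 country incomes (not the model's output):
# log incomes with standard deviation s give a 90/10 ratio of about exp(2.563 * s).
rng = np.random.default_rng(0)
incomes = np.exp(rng.normal(0.0, 1.1, size=1000))
ratio_90_10 = percentile_ratio(incomes)
```

Under log-normality the 90/10 ratio is exp(s(z_{.90} − z_{.10})) ≈ exp(2.563 s), so a 1989 ratio of 16.5 corresponds to a log income standard deviation of about 1.1.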
Fig. 17. GDP per capita, 1820-1989, for exogenous growth model with σ = 2. [Figure: 25-75 and 10-90 percentile bands of the cross-country distribution; inset shows the 1989 distribution of relative GDP per capita (world average = 1).]
Next consider the predicted correlation between incomes in 1960 and subsequent growth rates. In Figure 18, we plot the model's growth rates in incomes for the period 1960 to 1985 versus the relative incomes in 1960. Again, we keep the axes the same as in Figure 2, which shows the relationship between growth and initial income for the data. In this case, all but 1 percent of the countries are shown. The pattern for the model looks like a cloud - similar to that in Figure 2. The correlation between initial incomes and growth rates is negative for the model, but only slightly. As noted before, the transition dynamics occurring when capital is off its steady state lead to a negative correlation between initial capital and subsequent income growth. Here, however, there are two forces at work: transition dynamics of capital and stochastic disturbances in investment distortions. Because the data on the relative price of investment to consumption display large fluctuations, output displays large fluctuations. Therefore, we find a lot of mobility of countries and little correlation between initial incomes and subsequent growth rates. Note also that, in the data, countries with the most persistent relative prices of investment to consumption over time are the richest and the poorest. Countries with relative prices that vary significantly over time are middle-income countries. These features of the data are well mimicked by the model because countries with policies that switch regimes frequently are middle-income countries. As a result, growth rates for the middle-income countries show the greatest variation.
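The correlation statistic behind this comparison is computed from the simulated panel in the obvious way. A sketch with hypothetical arrays (the names are ours, not the authors'):

```python
import numpy as np

def growth_vs_initial_corr(y_1960, y_1985, years=25):
    """Correlation between log initial income and the annualized percent growth rate."""
    growth = 100.0 * (np.log(y_1985) - np.log(y_1960)) / years
    return float(np.corrcoef(np.log(y_1960), growth)[0, 1])

# Pure transition dynamics (poor countries grow fastest) give a correlation of -1;
# large stochastic policy shocks push the correlation toward zero, as in the model.
y0 = np.array([1.0, 2.0, 4.0])
y1 = y0 * np.exp(25 * np.array([0.03, 0.02, 0.01]))  # growth falls with initial income
```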
Fig. 18. Growth versus initial GDP per worker, 1960-1985, for exogenous growth model with σ = 2. [Figure: scatter of annualized growth rates, 1960-1985, against relative GDP per worker in 1960.]

In Figure 19, we plot the growth rates of GDP for two subperiods, 1961-1972 and 1973-1985. Here, as in the data, we find a weakly positive correlation between the growth rates in these subperiods. The correlation for the growth rates in the model is 0.21, whereas the correlation for the data is 0.16. This lack of persistence is evident in both Figure 19 for the model and Figure 3 for the data. For the model, countries with very different growth rates in the two subsamples are typically in a turbulent regime with tax rates falling (rising) over the first half of the sample and rising (falling) over the second half. Note, however, that although the growth rates are not correlated across the subsamples, the average investment-output ratios are correlated - both in the data and in the model. In the model, there is uncertainty about future tax rates which keeps agents from rapidly changing their saving behavior in the face of rising or falling rates. Another feature of the model's simulation that is similar to the data is the range of growth rates. In Figure 19, we see that most of the growth rates calculated for the model fall in the range of those observed in the data: -5 percent to 10 percent. Only 1.6 percent of the model countries have growth rates that fall outside of this range. We will see shortly how important the choice of σ is for this result. Next we construct maximum growth rates of GDP per capita for the model. To avoid relying too heavily on growth rates for outlier countries, we take an average of the top 2½ percent of growth rates over each decade and call these the maximum growth rates. In Figure 20, we plot these growth rates. Notice that the growth rates for the model
Fig. 19. Persistence of growth rates, 1960-1985, for exogenous growth model with σ = 2. [Figure: scatter of growth rates of GDP per worker, 1973-1985 against 1961-1972.]

Fig. 20. Maximum GDP per capita growth, 1870-1990, for exogenous growth model with σ = 2. [Figure: maximum growth rate of GDP per capita in each decade, plotted by year.]
Fig. 21. GDP per capita, 1820-1989, for exogenous growth model with σ = 5. [Figure: 25-75 and 10-90 percentile bands; inset shows the 1989 distribution of relative GDP per capita (world average = 1).]
are higher throughout the sample period than those observed in the data presented in Figure 4. Furthermore, the model growth rates show no significant upward trend. Although there is a lot of mobility of countries in the model, maximal decade growth rates are higher than 6 percent over the whole simulated sample period, 1750-1990. In the model, these high growth rates are tied to falling tax rates, which are the only impediment to faster growth. Obviously, the model has to be modified to incorporate the idea that higher growth rates are achievable only when outside opportunities (for example, world technology) are better. In Figures 21-24, we show results for the same experiment described above. In this case, we set the risk aversion parameter σ equal to 5. There are several differences between the cases with σ = 2 and σ = 5 worth noting. First, the range of the distribution over time displayed in Figure 21 is significantly smaller than in the case with σ = 2. This can be seen by comparing Figures 17 and 21. In 1989, output per capita for a country in the 90th percentile is only 7.1 times that of a country in the 10th percentile. Second, variation in growth rates is also reduced, as is clear when we compare Figures 22 and 23 with Figures 18 and 19. It is not surprising, then, that we find that the maximum growth rate is smaller the larger is σ. This is evident when we compare Figures 20 and 24. In summary, in both simulations (with σ = 2 and σ = 5), we find a large range in the distribution of 1989 incomes, little correlation between incomes and subsequent
Fig. 22. Growth versus initial GDP per worker, 1960-1985, for exogenous growth model with σ = 5. [Figure: scatter of annualized growth rates, 1960-1985, against relative GDP per worker in 1960.]
Fig. 23. Persistence of growth rates, 1960-1985, for exogenous growth model with σ = 5. [Figure: scatter of growth rates of GDP per worker, 1973-1985 against 1961-1972.]
Fig. 24. Maximum GDP per capita growth, 1870-1990, for exogenous growth model with σ = 5. [Figure: maximum growth rate of GDP per capita in each decade, plotted by year.]
growth rates, and little persistence in growth rates. Yet, in both simulations, we find little agreement between maximal growth rates in the model and those in the data.
6.2. An endogenous growth model

Results for the endogenous growth model with a_k = θ_k are reported in Figures 25-28. These figures can be compared to Figures 1-4 for the data and Figures 21-24 for the exogenous growth model with σ = 5. In Figure 25, we display four of the percentiles of the distribution of incomes for the model over the period 1820-1989. Here, as in the data, there is a gradual fanning out of the distribution. However, by 1989, the ratio of GDP per capita for the country at the 90th percentile to that of the 10th percentile is 43.9, which exceeds the ratio in the data. With constant returns to scale in accumulable factors, the model predicts that the disparity of incomes increases with time. How quickly this occurs depends on choices of risk aversion, depreciation, and labor supply elasticity, as the formulas derived in Section 5.2.1 make clear. Part of the distribution for 1989 is displayed in the insert of Figure 25. Only 82.4 percent of the countries have a relative GDP per worker in the range of 1/8 to 8. However, the distribution of incomes is roughly uniform, as it is in the data. In Figure 26, we plot the relative GDPs per worker in 1960 and the annualized growth rates for 1985 over 1960. As with the data displayed in Figure 2, growth rates
Fig. 25. GDP per capita, 1820-1989, for endogenous growth model with a_k = θ_k. [Figure: 25-75 and 10-90 percentile bands; inset shows the 1989 distribution of relative GDP per capita (world average = 1).]
Fig. 26. Growth versus initial GDP per worker, 1960-1985, for endogenous growth model with a_k = θ_k. [Figure: scatter of annualized growth rates, 1960-1985, against relative GDP per worker in 1960.]
Fig. 27. Persistence of growth rates, 1960-1985, for endogenous growth model with a_k = θ_k. [Figure: scatter of growth rates of GDP per worker, 1973-1985 against 1961-1972.]

for countries with low initial GDPs per worker are not systematically higher than those for countries with high initial GDPs. Another feature that is similar to the data in Figure 2 is the range of growth rates. For most countries, growth rates fall in the range of -1 to 3. In Figure 27, we plot the growth rates of GDP over the subperiods 1961-1972 and 1973-1985. The correlation across subperiods is 0.78 - which is significantly higher than the correlation of 0.16 in the data. As a_k + a_h approach 1, the transition dynamics in the growth model become much slower, and the growth rates vary much less. For this reason, we find a much smaller correlation in the exogenous growth model than in the endogenous growth model. In Figure 28, we plot the maximum growth rates in each decade for the simulation. Comparable figures for the data and the exogenous growth model are Figures 4 and 24, respectively. Notice that after 1880, there is no trend in maximum growth rates. High growth does not persist because the optimal investment strategy is to bring the ratio of physical to human capital, k_t/h_t, back to a constant level. Once this occurs, there is little variation in the growth rates across alternative distortion levels. Therefore, we do not come close to mimicking the pattern of increasing growth rates seen in Figure 4. When we simulate an artificial panel data set for the two-sector endogenous growth model, with θ_k = 0.03 and θ_h = 0.97, we find results very similar to those displayed in Figures 25-28. As in the case with a_k = θ_k, we find a large range in the distribution of 1989 incomes and little correlation between incomes and subsequent growth rates.
Fig. 28. Maximum GDP per capita growth, 1870-1990, for endogenous growth model with a_k = θ_k. [Figure: maximum growth rate of GDP per capita in each decade, plotted by year.]

Yet in both simulations, we find more persistence in growth rates than is found in the data and little agreement between maximal growth rates in the model and those in the data.

In this section, we have worked out the implications of two standard growth models for the basic facts on income described in Section 2. We went beyond what is typically done and considered both the time-series and cross-sectional aspects of the data generated by the model. For the basic facts that we consider, the exogenous growth model does a better job of simultaneously accounting for the dispersion of incomes over time, the lack of persistence in growth rates, and the range in cross-country growth rates than the AK model does. However, more exploration of these models is needed before we can definitively say how well the models do in explaining cross-country income differences. What we found is that the models - despite their simplicity - do fairly well in mimicking some of the basic features of the data.
7. Concluding remarks

In this chapter, we have reviewed some of the basic facts about cross-country incomes and some studies in the recent literature on growth and development meant to explain these facts. As we have noted throughout, there are still many open issues and unanswered questions. Quantifying the role of economic policies for growth and
development depends on policy variables that are difficult to measure and on models whose predictions rely on controversial parameterizations. What we have done in this chapter is to review the progress made thus far. Further progress can be made with better measures of factor inputs, especially human capital, better measures of policy variables, and greater synthesis of theory and data.
Acknowledgements

The views expressed herein are those of the authors and not necessarily those of the Federal Reserve Bank of Minneapolis or the Federal Reserve System. We thank Francesco Caselli, Bill Easterly, Boyan Jovanovic, Pete Klenow, Narayana Kocherlakota, and Ed Prescott for helpful comments. Data and computer programs used for this chapter are available at our web site: http://research.mpls.frb.fed.us
and are listed with Staff Report 250.
References

Aghion, P., and P. Howitt (1992), "A model of growth through creative destruction", Econometrica 60:323-351.
Aghion, P., and P. Howitt (1998), Endogenous Growth Theory (MIT Press, Cambridge).
Alesina, A., and D. Rodrik (1994), "Distributive politics and economic growth", Quarterly Journal of Economics 109:465-490.
Bairoch, P. (1981), "The main trends in national economic disparities since the industrial revolution", in: P. Bairoch and M. Lévy-Leboyer, eds., Disparities in Economic Development since the Industrial Revolution (St. Martin's Press, New York).
Balassa, B.A. (1971), The Structure of Protection in Developing Countries (The Johns Hopkins Press, Baltimore).
Barone, E. (1913), Principi di Economia Politica (Athenaeum, Rome).
Barro, R.J. (1991), "Economic growth in a cross section of countries", Quarterly Journal of Economics 106:407-443.
Barro, R.J., and J.-W. Lee (1993), "International comparisons of educational attainment", Journal of Monetary Economics 32:363-394.
Barro, R.J., and J.-W. Lee (1994), "Sources of economic growth", Carnegie-Rochester Conference Series on Public Policy 40:1-46.
Barro, R.J., and X. Sala-i-Martin (1995), Economic Growth (McGraw-Hill, New York).
Bertola, G. (1994), "Flexibility, investment, and growth", Journal of Monetary Economics 34:215-238.
Bils, M., and P.J. Klenow (1998), "Does schooling cause growth or the other way around?", Working paper 6393 (National Bureau of Economic Research).
Chari, V.V., L.E. Jones and R.E. Manuelli (1995), "The growth effects of monetary policy", Federal Reserve Bank of Minneapolis Quarterly Review 19:18-32.
Chari, V.V., P.J. Kehoe and E.R. McGrattan (1997), "The poverty of nations: a quantitative investigation", Research Department Staff Report 204 (Federal Reserve Bank of Minneapolis).
Christensen, L.R., D. Cummings and D.W. Jorgenson (1980), "Economic growth, 1947-73: an international comparison", in: J.W. Kendrick and B.N. Vaccara, eds., New Developments in Productivity Measurement and Analysis (University of Chicago Press, Chicago, IL).
Cox, D., and R. Harris (1985), "Trade liberalization and industrial organization: some estimates for Canada", Journal of Political Economy 93:115-145.
Crucini, M.J., and J. Kahn (1996), "Tariffs and aggregate economic activity: lessons from the Great Depression", Journal of Monetary Economics 38:427-467.
de Soto, H. (1989), The Other Path: The Invisible Revolution in the Third World (Harper & Row, New York).
DeLong, J.B., and L.H. Summers (1991), "Equipment investment and economic growth", Quarterly Journal of Economics 106:445-502.
DeLong, J.B., and L.H. Summers (1993), "How strongly do developing economies benefit from equipment investment?", Journal of Monetary Economics 32:395-415.
Easterly, W., and S.T. Rebelo (1993), "Fiscal policy and economic growth: an empirical investigation", Journal of Monetary Economics 32:417-458.
Easterly, W., M. Kremer, L. Pritchett and L.H. Summers (1993), "Good policy or good luck? Country growth performance and temporary shocks", Journal of Monetary Economics 32:459-483.
Eaton, J., and S. Kortum (1996), "Trade in ideas: patenting and productivity in the OECD", Journal of International Economics 40:251-278.
Elias, V.J. (1992), Sources of Growth: A Study of Seven Latin American Economies (ICS Press, San Francisco).
Funkhouser, R., and P.W. MacAvoy (1979), "A sample of observations on comparative prices in public and private enterprises", Journal of Public Economics 11:353-368.
Gastil, R.D. (1987), Freedom in the World: Political Rights and Civil Liberties, 1986-1987 (Greenwood Press, New York).
Glomm, G., and B. Ravikumar (1998), "Flat-rate taxes, government spending on education, and growth", Review of Economic Dynamics 1:306-325.
Greenwood, J., and B. Jovanovic (1990), "Financial development, growth, and the distribution of income", Journal of Political Economy 98(Part 1):1076-1107.
Grossman, G.M., and E. Helpman (1991a), Innovation and Growth in the Global Economy (MIT Press, Cambridge).
Grossman, G.M., and E. Helpman (1991b), "Quality ladders in the theory of growth", Review of Economic Studies 58:43-61.
Hall, R.E., and C.I. Jones (1998), "Why do some countries produce so much more output per worker than others?", Working Paper (Stanford University).
Hopenhayn, H., and R. Rogerson (1993), "Job turnover and policy evaluation: a general equilibrium analysis", Journal of Political Economy 101:915-938.
Hsieh, C.-T. (1997), "What explains the industrial revolution in East Asia? Evidence from factor markets", Working Paper (University of California, Berkeley).
Johnson, H.G. (1960), "The cost of protection and the scientific tariff", Journal of Political Economy 68:327-345.
Jones, C.I. (1994), "Economic growth and the relative price of capital", Journal of Monetary Economics 34:359-382.
Jones, C.I. (1995a), "R&D-based models of economic growth", Journal of Political Economy 103:759-784.
Jones, C.I. (1995b), "Time series tests of endogenous growth models", Quarterly Journal of Economics 110:495-525.
Jones, L.E., and R.E. Manuelli (1997), "The sources of growth", Journal of Economic Dynamics and Control 21:75-114.
Jones, L.E., R.E. Manuelli and P.E. Rossi (1993), "Optimal taxation in models of endogenous growth", Journal of Political Economy 101:485-517.
Jones, L.E., R.E. Manuelli and E. Stacchetti (1998), "Technology and policy shocks in models of endogenous growth", Working Paper (Northwestern University).
Jovanovic, B., and R. Rob (1998), "Solow vs. Solow", Working Paper (New York University).
Kim, S.-J. (1992), "Taxes, growth, and welfare in an endogenous growth model", Ph.D. Dissertation (University of Chicago).
King, R.G., and R. Levine (1993), "Finance, entrepreneurship, and growth: theory and evidence", Journal of Monetary Economics 32:513-542.
King, R.G., and S.T. Rebelo (1990), "Public policy and economic growth: developing neoclassical implications", Journal of Political Economy 98(Part 2):S126-S150.
Klenow, P.J., and A. Rodriguez-Clare (1997a), "Economic growth: a review essay", Journal of Monetary Economics 40:597-617.
Klenow, P.J., and A. Rodriguez-Clare (1997b), "The neoclassical revival in growth economics: has it gone too far?", NBER Macroeconomics Annual 1997 (MIT Press, Cambridge).
Klenow, P.J., and A. Rodriguez-Clare (1997c), "Quantifying variety gains from trade liberalization", Working Paper (University of Chicago).
Kocherlakota, N.R. (1996), "Comment on R.J. Barro, Inflation and growth", Federal Reserve Bank of St. Louis Review 78:170-172.
Kormendi, R.C., and P. Meguire (1985), "Macroeconomic determinants of growth: cross-country evidence", Journal of Monetary Economics 16:141-163.
Kornai, J. (1992), The Socialist System: The Political Economy of Communism (Princeton University Press, Princeton).
Krueger, A.O. (1968), "Factor endowments and per capita income differences among countries", Economic Journal 78:641-659.
Krueger, A.O., and B. Tuncer (1982), "Growth of factor productivity in Turkish manufacturing industries", Journal of Development Economics 11:307-325.
Lee, J.-W. (1993), "International trade, distortions, and long-run economic growth", International Monetary Fund Staff Papers 40:299-328.
Levine, R., and D. Renelt (1992), "A sensitivity analysis of cross-country growth regressions", American Economic Review 82:942-963.
Loayza, N.V. (1996), "The economics of the informal sector: a simple model and some empirical evidence from Latin America", Carnegie-Rochester Conference Series on Public Policy 45:129-162.
Lucas Jr, R.E. (1988), "On the mechanics of economic development", Journal of Monetary Economics 22:3-42.
Lucas Jr, R.E. (1990), "Supply-side economics: an analytical review", Oxford Economic Papers 42:293-316.
Maddison, A. (1991), Dynamic Forces in Capitalist Development: A Long-Run Comparative View (Oxford University Press, New York).
Maddison, A. (1994), "Explaining the economic performance of nations, 1820-1989", in: W.J. Baumol, R.R. Nelson and E.N. Wolff, eds., Convergence of Productivity: Cross-National Studies and Historical Evidence (Oxford University Press, New York).
Mankiw, N.G. (1995), "The growth of nations", Brookings Papers on Economic Activity 1995(1):275-310.
Mankiw, N.G. (1997), Comment on P.J. Klenow and A. Rodriguez-Clare, "The neoclassical revival in growth economics: has it gone too far?", NBER Macroeconomics Annual 1997 (MIT Press, Cambridge).
Mankiw, N.G., D. Romer and D.N. Weil (1992), "A contribution to the empirics of economic growth", Quarterly Journal of Economics 107:407-437.
Mincer, J. (1974), Schooling, Experience, and Earnings (Columbia University Press, New York).
Parente, S.L., and E.C. Prescott (1993), "Changes in the wealth of nations", Federal Reserve Bank of Minneapolis Quarterly Review 17:3-16.
Parente, S.L., and E.C. Prescott (1994), "Barriers to technology adoption and development", Journal of Political Economy 102:298-321.
Parente, S.L., and E.C. Prescott (1997), "Monopoly rights: a barrier to riches", Research Department Staff Report 236 (Federal Reserve Bank of Minneapolis); American Economic Review, forthcoming.
Ch. 10: Explaining Cross-Country Income Differences
Parente, S.L., R. Rogerson and R. Wright (1997), "Homework in development economics: household production and the wealth of nations", Working Paper (University of Pennsylvania).
Persson, T., and G. Tabellini (1994), "Is inequality harmful for growth?", American Economic Review 84:600-621.
Prescott, E.C. (1998), "Needed: A theory of total factor productivity", International Economic Review 39:525-551.
Prescott, E.C., and M. Visscher (1980), "Organization capital", Journal of Political Economy 88:446-461.
Psacharopoulos, G. (1994), "Returns to investment in education: a global update", World Development 22:1325-1343.
Rebelo, S.T. (1991), "Long-run policy analysis and long-run growth", Journal of Political Economy 99:500-521.
Romer, P.M. (1990), "Endogenous technological change", Journal of Political Economy 98(Part 2):S71-S102.
Romer, P.M. (1994), "New goods, old theory, and the welfare costs of trade restrictions", Journal of Development Economics 43:5-38.
Sachs, J.D., and A.M. Warner (1995), "Economic reform and the process of global integration", Brookings Papers on Economic Activity 1995(1):1-95.
Sala-i-Martin, X. (1997), "I just ran two million regressions", American Economic Review 87:178-183.
Schmitz Jr, J.A. (1996), "The role played by public enterprises: how much does it differ across countries?", Federal Reserve Bank of Minneapolis Quarterly Review 20:2-15.
Schmitz Jr, J.A. (1997), "Government production of investment goods and aggregate labor productivity", Research Department Staff Report 240 (Federal Reserve Bank of Minneapolis).
Shoven, J.B., and J. Whalley (1984), "Applied general equilibrium models of taxation and international trade: an introduction and survey", Journal of Economic Literature 22:1007-1051.
Sims, C.A. (1996), "Comment on R.J. Barro, Inflation and growth", Federal Reserve Bank of St. Louis Review 78:173-178.
Solow, R.M. (1956), "A contribution to the theory of economic growth", Quarterly Journal of Economics 70:65-94.
Srinivasan, T.N., and J. Whalley (1986), General Equilibrium Trade Policy Modeling (MIT Press, Cambridge).
Stokey, N.L. (1996), "NAFTA and Mexican development", Discussion Paper 108 (Institute for Empirical Macroeconomics).
Stokey, N.L., and S.T. Rebelo (1995), "Growth effects of flat-rate taxes", Journal of Political Economy 103:519-550.
Summers, R., and A. Heston (1991), "The Penn World Table (Mark 5): an expanded set of international comparisons, 1950-1988", Quarterly Journal of Economics 106:327-368.
United Nations (1994), The Sex and Age Distribution of the World Populations, the 1994 Revision (United Nations, New York).
Uzawa, H. (1965), "Optimal technical change in an aggregative model of economic growth", International Economic Review 6:18-31.
Whalley, J. (1985), Trade Liberalization Among Major World Trading Areas (MIT Press, Cambridge).
Young, A. (1995), "The tyranny of numbers: confronting the statistical realities of the East Asian growth experience", Quarterly Journal of Economics 110:641-680.
Young, A. (1998), "Growth without scale effects", Journal of Political Economy 106:41-63.
AUTHOR INDEX
Abel, A.B. 818, 831, 834, 835, 994, 1069, 1237, 1251, 1253, 1265, 1266, 1268, 1271, 1272, 1284, 1285, 1651
Abowd, J. 567, 568, 570, 571, 616, 759
Abraham, J. 1039
Abraham, K.G. 1058
Abraham, K.J. 1183, 1221
Abramovitz, M. 208
Abramowitz, M. 865, 887
Acemoglu, D. 852, 1215
Adam, M. 500
Adams, C. 1538
Adelman, F.L., see Adelman, I. 9
Adelman, I. 9
Agénor, P.R. 1543, 1572
Aghion, P. 264, 665, 672, 715, 719, 1157, 1208, 1210, 1213, 1377, 1450, 1454, 1465
Aiyagari, S.R. 442, 547, 552, 566, 567, 983, 1140, 1293, 1631
Aizenman, J. 1497, 1538, 1540
Akaike, H. 217
Akerlof, G. 1344
Akerlof, G.A. 198, 397, 1034, 1035, 1039, 1157, 1200
al Nowaihi, A. 1415, 1422, 1437
Alesina, A. 162, 277-279, 692, 1404, 1416, 1422-1426, 1430, 1432, 1438, 1439, 1446, 1449, 1450, 1454, 1460, 1461, 1464-1466, 1469, 1471, 1518, 1522, 1540
Alesina, A., see Tabellini, G. 1456, 1465
Alessie, R. 774, 775
Allais, M. 661, 1309
Allen, D.S. 871
Allen, F. 576
Almeida, A. 1432, 1495
Alogoskoufis, G.S. 166, 214, 215
Altonji, J. 615
Altonji, J., see Hayashi, F. 796
Altonji, J.G. 789
Altug, S. 584, 595, 611, 612, 785, 786, 792
Alvarez, F. 575, 996
Ambler, S. 944, 1062, 1067
American Psychiatric Association 1325
Amman, H.M. 368, 535
Anderson, E. 564
Anderson, E.W. 368, 369
Ando, A., see Modigliani, F. 762
Andolfatto, D. 994, 1158, 1173, 1203, 1207, 1221
Andres, J., see Blanchard, O.J. 1214
Araujo, A. 323
Arellano, M. 787
Arifovic, J. 455, 465, 472, 521-523, 525-527, 531
Arrow, K. 664, 1033, 1042
Arrow, K.J. 1218
Arthur, W.B. 454, 476, 534
Ascari, G. 1041
Aschauer, D.A. 1656, 1657
Asea, P., see Mendoza, E. 1439
Ashenfelter, O. 618, 1038, 1039
Askildsen, J.E. 1074
Atkeson, A. 575, 610, 786, 847, 1298, 1675, 1718, 1720
Atkinson, A.B. 1673, 1676, 1680, 1682, 1718
Attanasio, O.P. 564, 607, 608, 610-613, 752, 753, 756, 759, 769, 777, 779, 781, 783, 784, 787, 789-794, 796, 797, 802, 1264, 1655
Auerbach, A.J. 380, 549, 576, 588, 590, 591, 593, 616, 821, 1624, 1634, 1635, 1639, 1652, 1718
Auerbach, A.J., see Feldstein, M.S. 904, 906
Auernheimer, L. 1449
Auster, R. 474
Autor, D. 577
Axilrod, S.H. 1493
Azariadis, C. 262, 264, 271, 289, 389, 395, 516, 527, 658, 660, 661, 1035
Bacchetta, P. 1344
Bacchetta, P., see Feldstein, M. 1637
Bachelier, L. 1316
Backus, C.K. 549
Backus, D. 1017, 1031, 1270, 1405, 1414, 1415
Backus, D.K. 9, 42, 45, 938, 1316, 1708
Bade, R. 1432, 1438
Bagehot, W. 155, 1485, 1515
Bagwell, K. 1125
Bagwell, K., see Bernheim, B.D. 1647
Bailey, M.J. 1643
Bairoch, P. 719, 724
Baker, J.B. 1125
Balasko, Y. 427, 506
Balassa, B.A. 705
Balke, N.S. 6, 61, 114, 204, 205, 221
Ball, L. 42, 72, 199, 1023, 1037, 1039, 1041, 1127, 1415, 1499, 1504, 1542, 1632, 1650, 1651
Ball, R. 1321
Ballard, C. 1639
Banerjee, A., see Aghion, P. 1377
Bange, M.M., see De Bondt, W.F. 1321
Banks, J. 751, 758, 759, 770, 783, 788, 790-792
Banks, J., see Attanasio, O.P. 756, 759, 793, 794
Banerjee, A.V. 1332
Bansal, R. 1255
Barberis, N. 1294, 1322
Barclays de Zoete Wedd Securities 1238
Barkai, H. 1572
Barnett, S. 831
Barnett, W. 538, 540
Barone, E. 702
Barro, R.J. 101, 157, 158, 173, 237, 245, 246, 252, 269, 271, 272, 277-281, 284, 643, 651, 657, 659, 671, 675, 681, 683-685, 688, 689, 691-694, 696, 943, 974, 1023, 1055, 1155, 1404, 1405, 1411, 1412, 1414, 1415, 1425, 1438, 1439, 1466, 1485-1489, 1637, 1641, 1642, 1645, 1662, 1675, 1702, 1705, 1707
Barsky, R. 43, 558, 564, 565
Barsky, R., see Solon, G. 579, 1058, 1102, 1106
Barsky, R., see Warner, E.J. 1019
Barsky, R.B. 182, 215, 216, 1149, 1237, 1277, 1294-1296, 1653
Barth, J.R. 1657
Bartle, R.G. 76
Barucci, E. 525
Basar, T. 1449
Basu, S. 399, 402, 433, 983, 992, 994, 1069, 1080-1082, 1096, 1097, 1117, 1142
Bates, D.S. 1310, 1324
Baumol, W.J. 252, 269
Baxter, M. 9, 11, 12, 45, 203, 380, 430, 934, 938, 974, 980, 992, 1296, 1404
Bayoumi, T. 161, 211, 216, 217, 219
Bayoumi, T., see Mussa, M. 208
Bazaraa, M.S. 331
Bean, C., see Blanchard, O.J. 1214
Bean, C.R. 785, 1497
Beaudry, P. 99, 395, 413, 592, 1264
Beaulieu, J.J. 801, 802, 876
Becker, G. 592, 653
Becker, G.S. 317, 1645
Becker, G.S., see Ghez, G. 615, 752, 759
Becker, R. 369
Beetsma, R. 1411, 1436, 1438
Bekaert, G. 1281
Bell, D.E. 1313
Bellman, R. 336, 340
Belsley, D. 882, 887, 888, 892
Beltratti, A. 524, 525
Ben-David, D. 265, 278
Ben Porath, Y. 577, 582
Benabou, R. 1017, 1018, 1031, 1128, 1129, 1469, 1472, 1473
Bénabou, R. 268
Benartzi, S. 1290, 1312, 1313
Bénassy, J. 507
Bénassy, J.-P. 1506
Benhabib, J. 283, 395, 399-405, 408, 412-414, 417, 419, 421, 423-427, 431, 433-435, 437, 442, 505, 550, 847, 1145, 1449, 1465, 1467, 1472
Benigno, P., see Missale, A. 1450
Benjamin, D. 161
Bennett, R. 395
Bensaid, B. 1446, 1449
Benveniste, A. 476, 531
Benveniste, L.M. 321
Bergen, M., see Dutta, S. 1019, 1020
Bergen, M., see Levy, D. 1014, 1015, 1019
Bergen, P.R. 1041
Berger, L.A. 1330
Bergström, V. 538
Bernanke, B.S. 68, 72, 76, 83, 89, 91-93, 114, 144, 178, 182-184, 800, 856, 857, 1036, 1343, 1345, 1346, 1352, 1357, 1361, 1363, 1365, 1369, 1371, 1373, 1376-1378, 1495, 1578
Bernard, A.B. 254, 271, 287, 288
Bernard, V.L. 1321
Bernheim, B.D. 1646, 1647, 1649, 1654, 1659, 1660
Berry, M., see Dreman, D. 1320
Berry, T.S. 1618
Bertocchi, G. 474
Bertola, G. 643, 708, 801, 821, 834, 835, 840, 843, 1187, 1222, 1472, 1580
Bertsekas, D.P. 326
Besley, T. 856
Betts, C.M. 217
Beveridge, S. 1062, 1143
Bewley, T. 566, 1155
Bhaskar, V. 1037
Bianchi, M. 290, 292
Bikhchandani, S. 1332
Bils, M. 694, 910, 912, 983, 1053, 1059, 1069, 1070, 1072, 1075, 1076, 1078-1081, 1085, 1087, 1102, 1104, 1119, 1120, 1130
Bils, M.J. 579
Binder, M. 271, 1092
Binmore, K. 462
Binmore, K.G. 1188
Bisin, A. 427
Bismut, C., see Benabou, R. 1017, 1018, 1031
Bizet, D. 380
Bjorck, A., see Dahlquist, G. 337
Black, F. 417, 1280, 1310, 1331, 1507
Blackwell, D. 320
Blad, M., see Bénassy, J. 507
Blanchard, O.J. 40-42, 211, 216, 217, 391, 416, 471, 504, 643, 660, 818, 852, 877, 887, 888, 890, 892, 906, 912, 1013, 1030, 1033, 1034, 1036, 1041, 1112, 1130, 1162, 1173, 1176, 1183, 1184, 1194, 1202, 1214, 1221, 1266, 1491, 1634, 1635, 1645, 1650
Blanchard, O.J., see Missale, A. 1450
Blank, R. 579
Blinder, A. 587, 750, 1018-1020, 1038
Blinder, A.S. 41, 876, 881, 887, 893, 903, 904, 907, 908, 910, 1018, 1085, 1118, 1344, 1485, 1499, 1660
Blinder, A.S., see Bernanke, B.S. 83, 91, 93
Bliss, C. 1461, 1465
Bliss, R., see Fama, E.F. 1280
Blomstrom, M. 277, 279, 280
Bloomfield, A. 156
Blume, L.E. 321, 322, 474
Blume, L.E., see Bray, M. 474
Blundell, R. 572, 602, 611, 612, 620, 764, 770, 779, 781, 783, 788, 790-792, 797
Blundell, R., see Banks, J. 758, 759, 770, 783, 788, 790-792
Boadway, R. 1463
Bodnar, G. 1318
Böhm, V. 475, 646
Bohn, H. 1465, 1622, 1650, 1691
Boldrin, M. 362, 399, 400, 506, 962, 1062, 1284, 1297, 1465
Bolen, D.W. 1325
Bollerslev, T. 1236, 1280
Bolton, P., see Aghion, P. 1377, 1450, 1454, 1465
Bona, J.L. 313
Boothe, P.M. 1658
Bordo, M.D. 152, 155-160, 162, 164-167, 182, 184, 185, 194, 202-204, 207-209, 211, 215, 217-221, 1404, 1438, 1590
Bordo, M.D., see Bayoumi, T. 161
Bordo, M.D., see Betts, C.M. 217
Borenstein, S. 1124
Boschan, C., see Bry, G. 8
Boschen, J.F. 139
Boskin, M.J. 618
Bossaerts, P. 454
Bosworth, B., see Collins, S. 653
Bourguignon, F., see Levy-Leboyer, M. 222
Bovenberg, A.L., see Gordon, R.H. 1637
Bovenberg, L., see Beetsma, R. 1411
Bowen, W. 619
Bowman, D. 1313
Boyd, W.H., see Bolen, D.W. 1325
Boyle, M., see Paulin, G. 751
Boyle, P. 380
Boyle, P.P., see Tan, K.S. 334
Brainard, W.C. 817
Brauch, R., see Paulin, G. 751
Braun, R.A. 974
Braun, S.N., see Krane, S.D. 876, 877
Brav, A. 1290
Bray, M. 454, 463, 465, 466, 473-475, 527
Brayton, F. 1043, 1344, 1485
Brayton, F., see Hess, G.D. 1485, 1509
Breeden, D. 1246
Breiman, L. 289
Bresnahan, T.F. 911, 912
Bretton Woods Commission 208
Broadbent, B. 1412
Broadbent, B., see Barro, R.J. 1412
Broadie, M., see Boyle, P. 380
Brock, W.A. 319, 407, 455, 528, 532, 547, 552, 556, 942, 951, 1507
Brown, C. 585
Brown, P., see Ball, R. 1321
Brown, S. 1242
Browning, E. 1463
Browning, M. 598, 606, 607, 610-612, 750, 752, 771, 778, 787, 792, 798, 803
Browning, M., see Attanasio, O.P. 607, 608, 610, 611, 613, 779, 789, 791, 1655
Browning, M., see Blundell, R. 611, 612, 779, 781, 783, 790, 791
Broze, L. 487, 488
Brugiavini, A. 775
Brugiavini, A., see Banks, J. 770, 788
Brumberg, R., see Modigliani, F. 761
Brumelle, S.L., see Puterman, M.L. 336, 338
Brunner, A.D. 104
Brunner, K. 179, 183, 191, 1025, 1491
Bruno, M. 471, 1090, 1496, 1538, 1539, 1543, 1553
Bry, G. 8
Bryant, R.C. 1043, 1491, 1497, 1516-1518
Bryant, R.R. 1313
Buchanan, J.M. 1631, 1642
Buchholz, T.G. 1643
Buckle, R.A. 1019
Bufman, G. 1543
Buiter, W. 1030, 1521
Bulirsch, R., see Stoer, J. 334
Bull, N. 1675, 1711
Bullard, J. 466, 507, 509, 515, 526
Bullard, J., see Arifovic, J. 527
Bulow, J. 1448, 1449
Burdett, K. 1173, 1196
Bureau of the Census 1618, 1619
Burns, A.F. 5, 8, 931, 934
Burns, A.F., see Mitchell, W.C. 8, 44
Burnside, C. 399, 930, 980-985, 994, 1078, 1142, 1162
Burtless, G. 618, 620
Butkiewicz, J.L. 1621
Caballe, J. 578
Caballero, R.J. 399, 749, 771, 794, 801, 802, 821-823, 828, 830, 832, 834-838, 840-842, 844, 846, 847, 852, 855, 856, 994, 1032, 1157, 1158, 1160, 1187, 1210, 1211, 1213, 1472
Caballero, R.J., see Bertola, G. 801, 821, 834, 840, 843, 1187
Cagan, P. 157, 161, 203, 1534
Cage, R., see Paulin, G. 751
Calmfors, L. 1214
Calomiris, C.W. 169, 181, 183, 187, 191, 1376
Calvo, G.A. 389, 397, 408, 419, 422, 1030, 1032, 1034, 1114, 1346, 1360, 1363, 1389, 1400, 1415, 1428, 1445-1447, 1449, 1450, 1535, 1538, 1539, 1546, 1552, 1554, 1557, 1563, 1564, 1568, 1569, 1571-1573, 1582, 1583, 1587-1589, 1591, 1592, 1596, 1597, 1599-1603, 1605
Cameron, S. 589
Campbell, J. 92
Campbell, J.R. 846, 847, 994
Campbell, J.Y. 763, 764, 769, 784, 930, 961, 1120, 1140, 1141, 1145, 1150, 1235-1238, 1251, 1255, 1257, 1258, 1261, 1264-1266, 1268, 1270, 1272, 1274, 1275, 1280, 1284, 1286, 1290, 1320, 1655
Canavese, A.J. 1543
Canetti, E.D., see Blinder, A.S. 1018, 1118
Canjels, E. 55
Canova, F. 283, 376, 377, 379
Cantor, R. 1344
Canzoneri, M.B. 159, 160, 1405, 1414, 1415, 1507, 1508
Capie, F. 154, 163, 222, 1438
Caplin, A. 849, 850
Caplin, A.S. 801, 910, 1031, 1032
Card, D. 580, 1016, 1148
Card, D., see Abowd, J. 567, 568, 570, 571, 616, 759
Card, D., see Ashenfelter, O. 1038, 1039
Cardia, E. 1655
Cardia, E., see Ambler, S. 1062, 1067
Cardoso, E. 1543
Carey, K., see Bernanke, B.S. 178, 182
Carlson, J. 473
Carlson, J.A. 904
Carlson, J.A., see Buckle, R.A. 1019
Carlson, J.B. 104
Carlstrom, C. 1348, 1357, 1368, 1378, 1379
Carlton, D. 1129
Carlton, D.W. 1018-1020
Carmichael, H.L. 1155
Carpenter, R.E. 876, 881, 912, 1344
Carroll, C.D. 567, 572, 573, 593, 759, 762, 769, 771, 785, 788, 793, 1264, 1344, 1653, 1655
Case, K.E. 1323
Casella, A. 1463, 1465
Caselli, F. 277-279, 283, 284, 286
Cass, D. 244, 246, 247, 295, 389, 516, 643, 649, 662, 942, 948, 1673
Cass, D., see Balasko, Y. 427
Castañeda, A. 380
Cazzavilan, G. 426
Cecchetti, S.G. 182, 217, 876, 1015, 1016, 1018, 1019, 1251, 1265, 1270, 1272, 1294, 1296
Cecchetti, S.G., see Ball, L. 1037
Chadha, B. 1031, 1542
Chah, E.Y. 775
Chamberlain, G. 283, 286, 785
Chamberlain, T.W. 1334
Chamley, C. 400, 851, 1439, 1673, 1675, 1693, 1697, 1699
Champsaur, P. 463, 538
Chan, L. 1321
Chan, L.K.C. 1653
Chandler, L.V. 176
Chang, C.C.Y., see Chamberlain, T.W. 1334
Chari, V.V. 72, 124, 397, 422, 672, 697, 698, 700, 701, 709, 715, 720, 722, 723, 974, 1036, 1037, 1040-1042, 1371, 1448, 1449, 1459, 1488, 1489, 1578, 1673-1676, 1691, 1699, 1708-1710, 1720, 1723
Chari, V.V., see Atkeson, A. 1675, 1718, 1720
Chatterjee, S. 996, 1126
Chatterji, S. 475, 507
Chattopadhyay, S.K., see Chatterji, S. 475, 507
Chen, N. 1281
Chen, X. 476, 532
Cheung, C.S., see Chamberlain, T.W. 1334
Chevalier, J.A. 1122, 1123
Chiappori, P.A. 391, 395, 516
Childs, G.D. 882
Chinn, M., see Frankel, J. 1497
Chirinko, R.S. 815, 817, 1058, 1066, 1086, 1344, 1367
Chiswick, B., see Becker, G. 592
Cho, D. 278
Cho, I.-K. 455, 465, 524, 525
Cho, J.O. 974, 976, 1025, 1036
Cho, J.O., see Bils, M. 983, 1075, 1079, 1104
Chou, R.Y. 1236, 1280
Chou, R.Y., see Bollerslev, T. 1236, 1280
Choudhri, E.U., see Bordo, M.D. 184, 194
Chow, C.-S. 326, 334
Chow, G.C. 1294
Christensen, L.R. 673, 688
Christiano, L.J. 43, 67-70, 83, 84, 89, 91-94, 99, 108, 109, 114, 115, 124, 137, 143, 144, 314, 329, 330, 339, 347, 349, 350, 355, 362, 364, 367, 369, 370, 376, 377, 379, 426, 504, 547, 764, 881, 888, 909, 952, 962, 974, 1011, 1017, 1018, 1021, 1030, 1038, 1089, 1100, 1296, 1365, 1369, 1708, 1736
Christiano, L.J., see Aiyagari, S.R. 1140
Christiano, L.J., see Boldrin, M. 962, 1284, 1297
Christiano, L.J., see Chari, V.V. 72, 1449, 1673, 1675, 1676, 1691, 1699, 1708-1710, 1720, 1723
Chung, K.L. 299
Clarida, R. 95, 96, 136, 422, 1364, 1368, 1486
Clark, D., see Kushner, H. 476
Clark, J.M. 816
Clark, K.B. 602, 1173
Clark, P.B., see Mussa, M. 208
Clark, T.A. 173
Clark, T.E. 1091, 1485
Cochrane, J. 1120
Cochrane, J.H. 101, 211, 796, 1234, 1246, 1249, 1296
Cochrane, J.H., see Campbell, J.Y. 1237, 1251, 1284, 1286
Coe, D.T. 265
Cogley, T. 211, 395, 547, 967, 1142, 1503
Cohen, D. 271
Cohen, D., see Greenspan, A. 798, 844, 847
Cohn, R., see Modigliani, F. 1321
Cole, H.L. 576, 1163, 1194, 1201-1203, 1207, 1446, 1449, 1603
Cole, H.L., see Chari, V.V. 1459
Coleman, T. 601
Coleman, W.J. 367, 380
Coleman II, W.J. 114
Coleman II, W.J., see Bansal, R. 1255
Collins, S. 653
Conference Board 43
Congressional Budget Office 1618, 1619, 1621, 1624-1627, 1639, 1640, 1660
Conley, J.M., see O'Barr, W.M. 1332
Conlon, J.R. 1032
Constantinides, G.M. 559, 567, 781, 803, 1237, 1284, 1291, 1293
Constantinides, G.M., see Ferson, W.E. 1284
Contini, B. 1177, 1178, 1180, 1200, 1222
Cook, T. 194, 195, 1493
Cooley, T.F. 42, 69, 97, 101, 115, 124, 137, 376, 380, 408, 411, 549, 847, 954, 962, 974, 1376, 1463, 1736
Cooley, T.F., see Cho, J.O. 974, 976, 1025, 1036
Cooper, R. 204, 398, 824
Cooper, R., see Azariadis, C. 395
Cooper, R., see Chatterjee, S. 996, 1126
Cootner, P.H. 1316
Corbo, V. 1543, 1554
Correia, I. 974, 1537, 1675, 1720, 1733
Cossa, R. 584
Council of Economic Advisers 1639
Cox, D. 705
Cox, W.M. 1621
Cox Edwards, A., see Edwards, S. 1543, 1554, 1555, 1575
Crawford, V.P. 475
Crossley, T., see Browning, M. 610, 798
Croushore, D. 1485, 1653
Crucini, M.J. 178, 705
Crucini, M.J., see Baxter, M. 1296
Cukierman, A. 1404, 1414, 1415, 1432, 1437, 1438, 1450, 1456, 1463, 1465
Cukierman, A., see Alesina, A. 1424, 1426
Cukierman, A., see Brunner, K. 1025
Cummings, D., see Christensen, L.R. 673, 688
Cummins, J.G. 822, 856, 1344
Cunliffe Report 161
Currie, D. 454, 504
Cushman, D.O. 95, 96
Cutler, D.M. 797, 1290, 1320, 1321, 1624
Cyrus, T., see Frankel, J.A. 280
Dahlquist, G. 337
Daniel, B.C. 1647
Daniel, K. 1322
Danthine, J.-P. 329, 370, 952, 962, 1002, 1157
Darby, M.R. 166
Dasgupta, P. 655, 656
d'Autume, A. 487
DaVanzo, J. 618
Daveri, F. 1220
Davidson, J. 750
Davies, J.B. 766
Davis, D. 1033
Davis, P.J. 333
Davis, S.J. 1151, 1152, 1160, 1161, 1176, 1178, 1180, 1194, 1199
Davis, S.J., see Attanasio, O.P. 796, 797
Davutyan, N. 156
Dawid, H. 523, 527
De Bondt, W.F. 1307, 1320, 1321, 1323
de Fontnouvelle, P., see Brock, W.A. 528
De Fraja, G. 1037
De Gregorio, J. 1546, 1551, 1573, 1575, 1577
de Haan, J., see Eijffinger, S. 1404, 1438
de la Torre, M. 41
De Melo, J., see Corbo, V. 1543
de Melo, J., see Hanson, J. 1543
De Melo, M. 1535, 1551
De Pablo, J.C. 1543
de Soto, H. 695
Deaton, A. 752, 756, 764, 771, 775, 776, 783, 785, 787, 794, 798, 1344
Deaton, A., see Blinder, A. 750
Deaton, A., see Browning, M. 611, 612, 752, 787, 792
Deaton, A.S. 1264
Deaton, A.S., see Campbell, J.Y. 764
Debelle, G. 1489, 1518, 1522
DeCanio, S. 454, 463
DeCecco, M. 155
Degeorge, F. 1321
DeKock, G. 158
DeLong, J.B. 252, 279, 695, 1042, 1290, 1324
DeLong, J.B., see Barsky, R.B. 1237, 1277, 1294-1296
den Haan, W.J. 271, 347, 354, 369, 994, 1166, 1194, 1203, 1204, 1206, 1207
Denardo, E.V. 320
Denison, E.F. 237, 653
Denizer, C., see De Melo, M. 1535, 1551
Denson, E.M. 40
Desdoigts, A. 290
DeTray, D.N., see DaVanzo, J. 618
Devereux, M. 952, 1466, 1471
Devereux, M., see Alessie, R. 775
Devereux, M., see Beaudry, P. 395, 413
Devereux, M.B. 1126
Devereux, M.B., see Beaudry, P. 99
Devine, T.J. 1166
Dewatripont, M., see Aghion, P. 1157
Dezhbakhsh, H. 1039
Di Tella, G., see Canavese, A.J. 1543
Diamond, P. 796
Diamond, P., see Shafir, E. 1316
Diamond, P.A. 661, 1157, 1161, 1162, 1173, 1188, 1634, 1645, 1684, 1718
Diamond, P.A., see Blanchard, O.J. 41, 42, 1162, 1173, 1183, 1184, 1194, 1202, 1221
Diaz-Alejandro, C.F. 1543
Diaz-Gimenez, J., see Castañeda, A. 380
Dickens, W.T., see Akerlof, G.A. 198
Dickey, D.A. 53, 54, 212
Dickinson, J. 618
Dicks-Mireaux, L., see Feldstein, M. 1633
Diebold, F.X. 6, 11
Dielman, T., see Kallick, M. 1325
Dixit, A. 824, 829, 844, 1115, 1121, 1126
Dixit, A.K., see Abel, A.B. 835
Dixon, H. 537
Dodd, D.L., see Graham, B. 1323
Dolado, J. 1437
Dolado, J.J. 1214
Dolde, W. 1318
Dolde, W.C., see Tobin, J. 773
Domar, E. 640
Domberger, S. 1019
Dominguez, K. 164, 182
Domowitz, I. 1020, 1083, 1093
Doms, M. 823, 838
Donaldson, J.B., see Constantinides, G.M. 1293
Donaldson, J.B., see Danthine, J.-P. 329, 370, 952, 962, 1002, 1157
Doob, J.L. 299
Dornbusch, R. 198, 1043, 1543, 1562, 1563, 1565, 1568, 1582, 1590, 1637
Dotsey, M. 370, 952, 974, 1032, 1043, 1522, 1652
Drazen, A. 1463, 1465, 1541, 1580
Drazen, A., see Alesina, A. 162, 1450, 1461, 1465, 1540
Drazen, A., see Azariadis, C. 262, 264, 271, 289, 527, 658, 660
Drazen, A., see Bertola, G. 1580
Drazen, A., see Calvo, G.A. 1571
Dreman, D. 1320, 1323
Dreze, J. 770
Driffill, J., see Backus, D. 1405, 1414, 1415
Driskill, R.A. 1042
Drudi, F. 1450
Drugeon, J.P. 426
Dueker, M.J. 1485
Duffie, D. 380
Duffie, D., see Constantinides, G.M. 567, 781, 1237, 1291
Duffy, J. 257, 439, 473, 500
Duffy, J., see Arifovic, J. 527
Duffy, J., see Bullard, J. 526
Duguay, P. 215
Dumas, B. 561, 564
Dunlop, J.T. 939, 1059
Dunn, K.B. 800, 1284
Dunne, T., see Doms, M. 823, 838
Dupor, B. 994
Durkheim, E. 1331
Durlauf, S.N. 254, 262-264, 268, 270, 271, 287, 289, 303, 550, 905-907
Durlauf, S.N., see Bernard, A.B. 254, 271, 287, 288
Dutta, P.K. 380
Dutta, S. 1019, 1020
Dutta, S., see Levy, D. 1014, 1015, 1019
Dutton, J. 156
Dyl, E.A. 1334
Dynan, K.E. 770
Easley, D., see Blume, L.E. 321, 322, 474
Easley, D., see Bray, M. 474
Easterly, W. 277-279, 281, 675, 703, 1538, 1547, 1553, 1560, 1561
Easterly, W., see Bruno, M. 1553
Eaton, J. 719
Eberly, J.C. 801, 802, 1344
Eberly, J.C., see Abel, A.B. 831, 834, 835, 994
Echenique, F. 1551, 1561
Eckstein, O. 1344
Eden, B. 1019, 1023
Edin, D.A. 1457
Edwards, S. 1538, 1543, 1554, 1555, 1575, 1578-1580
Edwards, S., see Cukierman, A. 1456, 1465
Edwards, W. 1322
Eichenbaum, M. 83, 94, 96, 99, 100, 137, 184, 549, 550, 785, 799, 800, 803, 885, 888, 905-907, 912, 957, 1084
Eichenbaum, M., see Aiyagari, S.R. 1140
Eichenbaum, M., see Burnside, C. 399, 930, 980-985, 994, 1078, 1142, 1162
Eichenbaum, M., see Chari, V.V. 72, 1449
Eichenbaum, M., see Christiano, L.J. 43, 67-70, 83, 84, 89, 91-94, 99, 108, 115, 124, 137, 143, 144, 376, 377, 379, 764, 974, 1011, 1021, 1038, 1089, 1100, 1365, 1369, 1708, 1736
Eichenbaum, M.S., see Christiano, L.J. 881, 888
Eichengreen, B. 152, 154-157, 160, 162-164, 168, 178, 185, 187, 189, 204, 208, 209, 211, 219, 1449, 1465, 1590
Eichengreen, B., see Bayoumi, T. 211, 216, 217, 219
Eichengreen, B., see Bordo, M.D. 162
Eichengreen, B., see Casella, A. 1463, 1465
Eijffinger, S. 1404, 1432, 1438
Eisner, R. 817, 1310, 1621, 1622
Ekeland, I. 1689
El Karoui, N. 835
Elias, V.J. 673
Ellison, G. 475, 1124
Ellson, R.E., see Bordo, M.D. 157
Elmendorf, D.W. 1439
Elmendorf, D.W., see Ball, L. 1650, 1651
Elmendorf, D.W., see Feldstein, M. 1656
Emery, K.M. 215
Emery, K.M., see Balke, N.S. 114
Engel, E., see Caballero, R.J. 801, 802, 821, 835-838, 840-842, 994, 1032, 1158
Engelhardt, G. 1344
Engle, R., see Bollerslev, T. 1280
Engle, R.F. 50
Engle, R.F., see Chou, R.Y. 1236, 1280
Englund, P. 9
Epstein, L.G. 556, 558, 564, 565, 744, 769, 1250, 1256
Erceg, C. 1041
Erceg, C.J., see Bordo, M.D. 182
Eriksson, C. 1208
Erlich, D. 1314
Ermoliev, Y.M., see Arthur, W.B. 476
Escolano, J. 1718
Esquivel, G., see Caselli, F. 277-279, 283, 284, 286
Esteban, J.-M. 264
Estrella, A. 43, 1281, 1485
Evans, C. 982
Evans, C.L. 105
Evans, C.L., see Bordo, M.D. 182
Evans, C.L., see Christiano, L.J. 67, 68, 70, 83, 84, 89, 91-94, 99, 108, 137, 143, 144, 1011, 1021, 1038, 1089, 1100, 1365, 1369
Evans, C.L., see Eichenbaum, M. 83, 94, 96, 137
Evans, G.W. 425, 426, 453-455, 461-465, 468, 470, 472-478, 480, 481, 483, 484, 487, 489-492, 495-497, 500, 502, 504-507, 509-513, 516, 518-521, 526-528, 530-532, 1025, 1125
Evans, M. 182
Evans, P. 283, 1635, 1647, 1656-1659
Faig, M. 1675, 1720
Fair, R. 1416, 1425
Fair, R.C. 876, 1077, 1491
Fair, R.C., see Dominguez, K. 182
Falcone, M. 326
Fallick, B.C. 855
Fama, E.F. 1235, 1280, 1281, 1307, 1316, 1320-1323
Farber, H. 1200
Farmer, R. 662, 1002
Farmer, R.E. 391, 395, 396, 411-414, 427-430, 434, 437, 500, 505
Farmer, R.E., see Benhabib, J. 395, 399-402, 408, 412-414, 417, 425, 427, 431, 433-435, 442, 505
Farrell, J. 1121
Faust, J. 69, 217, 1416, 1425, 1437
Fauvel, Y. 1573
Favaro, E. 1554, 1555
Fay, J.A. 1077, 1103
Fazzari, S.M. 818, 1344
Fazzari, S.M., see Carpenter, R.E. 881, 912, 1344
Fazzari, S.M., see Chirinko, R.S. 1066, 1086
Featherstone, M. 1332
Federal Reserve Board 176
Feenberg, D. 60
Feenstra, R. 1569
Feenstra, R.C., see Bergen, P.R. 1041
Feiwel, G.R. 535
Feldman, M. 474
Feldstein, M. 44, 197, 1485, 1497, 1498, 1622, 1631, 1633, 1636, 1637, 1639, 1656, 1660
Feldstein, M.S. 904, 906
Felli, E. 1083, 1122
Fellner, W. 641, 657
Ferejohn, J. 1425
Fernald, J.G., see Basu, S. 399, 402, 433, 994, 1117, 1142
Fernandez, R. 1543, 1562
Ferris, S.P. 1314
Ferson, W.E. 1284
Festinger, L. 1314
Fethke, G. 1037
Fiebig, D.G., see Domberger, S. 1019
Filippi, M., see Contini, B. 1177, 1178, 1180, 1222
Fillion, J.F. 1498
Finch, M.H.J. 1543
Finegan, T.A., see Bowen, W. 619
Finn, M. 981, 1091
Fiorina, M. 1425
Fischer, A.M., see Dueker, M.J. 1485
Fischer, S. 182, 197, 202, 215, 216, 1025, 1026, 1155, 1404, 1405, 1438, 1449, 1489, 1496, 1498, 1538, 1542, 1547, 1561, 1582
Fischer, S., see Blanchard, O.J. 471, 643, 660, 1013, 1033, 1034, 1036, 1491, 1635
Fischer, S., see Bruno, M. 1538
Fischer, S., see Debelle, G. 1489, 1518, 1522
Fischhoff, B. 1319, 1326
Fischhoff, B., see Lichtenstein, S. 1318
Fishe, R.P.H. 173
Fisher, I. 154, 157, 203, 1316, 1321, 1343, 1372, 1377, 1485
Fisher, J. 92
Fisher, J., see Boldrin, M. 962, 1284, 1297
Fisher, J., see Christiano, L.J. 314, 347, 349, 350, 355, 362, 364, 962, 1296
Fisher, J.D.M. 910, 1368, 1375, 1376, 1378
Fisher, J.D.M., see Campbell, J.R. 846
Fishlow, A. 155
Flandreau, M. 154
Flannery, B.P., see Press, W.H. 329-334, 343, 348, 356, 365
Flavin, M. 572, 749, 763, 784
Flemming, J.S. 773
Flood, R.P. 152, 158, 202, 408, 1428, 1429, 1438, 1507, 1595, 1596
Flood, R.P., see Garber, P.M. 165
Florovsky, G. 1326
Forbes, K. 277, 278
Ford, A.G. 155
Fore, D., see Roseveare, D. 1626
Foresi, S., see Backus, D.K. 1316
Forteza, A., see Echenique, F. 1551, 1561
Fortune, P. 1310
Foufoula-Georgiou, E., see Kitanidis, P.K. 326
Fourgeaud, C. 454, 465, 473, 475
Fox, B.L. 326
Foxley, A. 1543
Frankel, J. 1497
Frankel, J.A. 280, 281, 1590, 1637
Franses, P.H. 289
Fratianni, M. 1431
Freeman, R. 577
Fregert, K. 1016
French, K. 1280
French, K.R., see Fama, E.F. 1235, 1281, 1320, 1323
Frenkel, J.A. 203, 1630
Frenkel, J.A., see Aizenman, J. 1497
Frennberg, P. 1238
Friedman, B.M. 43, 44, 1632, 1642
Friedman, D. 475
Friedman, J.H., see Breiman, L. 289
Friedman, M. 46, 48, 61, 137, 154, 160, 162, 168, 172, 176, 179, 180, 185, 189, 195, 203, 222, 275, 376, 572, 761, 762, 943, 1011, 1173, 1325, 1485, 1488, 1496, 1537, 1674, 1720
Froot, K. 1266, 1316
Frydman, R. 453, 454, 474, 528, 536, 539
Fuchs, G. 464, 474
Fudenberg, D. 455, 475, 1155
Fudenberg, D., see Ellison, G. 475
Fuerst, T. 99, 974, 1378
Fuerst, T., see Carlstrom, C. 1348, 1357, 1368, 1378, 1379
Fuhrer, J.C. 454, 905, 908, 1039, 1040, 1491, 1518
Fuhrer, J.C., see Carroll, C.D. 769, 785
Fukuda, S.-i. 875
Fuller, W.A., see Dickey, D.A. 53, 54, 212
Fullerton, D. 576, 588, 616
Funkhouser, R. 699
Futia, C. 299
Galbraith, J.K. 1182
Gale, D. 389, 475, 849, 851, 1376
Gale, D., see Chamley, C. 851
Gale, W.G. 1646
Galeotti, M. 909, 1086, 1124
Gali, J. 67, 69, 217, 395, 405-407, 426, 429, 434, 993, 994, 1117, 1119, 1120, 1129
Gali, J., see Benhabib, J. 424
Gali, J., see Clarida, R. 96, 136, 422, 1364, 1368, 1486
Gallarotti, G.M. 154
Gallego, A.M. 321, 322
Galor, O. 262, 263, 272, 660
Gandolfi, A.E., see Darby, M.R. 166
Garber, P.M. 165, 1323, 1543
Garber, P.M., see Eichengreen, B. 187, 189
Garber, P.M., see Flood, R.P. 408, 1595, 1596
Garcia, R. 790
Garibaldi, P. 1180, 1222
Garratt, A. 504
Garratt, A., see Currie, D. 454, 504
Garriga, C. 1675, 1718
Gaspar, J. 324, 369
Gastil, R.D. 689
Gatti, R., see Alesina, A. 1432
Gavin, W. 1485
Geanakoplos, J.D. 395, 458, 1322
Gear, C.W. 346
Geczy, C.C., see Brav, A. 1290
Gelb, A., see De Melo, M. 1535, 1551
Genberg, H. 165, 1428
Geoffard, P.Y., see Chiappori, P.A. 391
Gerlach, S., see Bacchetta, P. 1344
Gersbach, H. 1376
Gertler, M. 83, 92-94, 1040, 1343, 1348, 1366, 1373, 1374, 1376-1378
Gertler, M., see Aiyagari, S.R. 1293, 1631
Gertler, M., see Bernanke, B.S. 92, 144, 183, 856, 857, 1036, 1345, 1346, 1352, 1357, 1365, 1369, 1371, 1373, 1376-1378, 1578
Gertler, M., see Clarida, R. 95, 96, 136, 422, 1364, 1368, 1486
Geweke, J. 34, 334
Geweke, J., see Barnett, W. 540
Geweke, J.F. 89
Ghali, M., see Surekha, K. 908
Ghez, G. 615, 752, 759
Ghezzi, P. 1572
Ghosh, A.R. 202, 207, 208
Giavazzi, F. 167, 203, 1438, 1446, 1449, 1580
Giavazzi, F., see Missale, A. 1450
Gibson, G.R. 1307
Gigerenzer, G. 1308, 1318
Gilbert, R.A. 195
Gilchrist, S. 847, 1344
Gilchrist, S., see Bernanke, B.S. 856, 1036, 1345, 1373, 1376
Gilchrist, S., see Gertler, M. 83, 92-94, 1366, 1373, 1374, 1376
Gill, P.E. 329
Gilles, C., see Coleman II, W.J. 114
Gilson, R.J. 1154
Giovannini, A. 156, 158, 160, 166, 169, 380
Giovannini, A., see Giavazzi, F. 167
Gizycki, M.C., see Gruen, D.K. 1316
Glasserman, P., see Boyle, P. 380
Glazer, A. 1456, 1465
Glomm, G. 712, 1472
Glosten, L. 1280
Goetzmann, W., see Brown, S. 1242
Goetzmann, W.N. 1242, 1252, 1314, 1320, 1333
Goff, B.L. 159
Gokhale, J. 750
Gokhale, J., see Auerbach, A.J. 1624
Goldberg, P.K., see Attanasio, O.P. 777
Goldfajn, I., see Dornbusch, R. 1590
Goldstein, M., see Mussa, M. 208, 1637
Gomes, J. 994, 1159
Gomme, P. 962, 1062
Gomme, P., see Andolfatto, D. 1173
Gomme, P., see MacLeod, W.B. 1157
Goodfriend, M. 88, 120, 121, 156, 173, 191, 194-196, 764, 1013, 1117, 1346, 1509, 1514, 1515
Goodhart, C., see Capie, F. 154
Goodhart, C.A.E. 193
Goodhart, C.A.E., see Almeida, A. 1432, 1495
Goodhart, C.E.A. 1438, 1495, 1507, 1508, 1514
Goodman, A. 797
Goolsbee, A. 839, 843, 848
Gordon, D.B. 128, 134
Gordon, D.B., see Barro, R.J. 158, 1155, 1405, 1411, 1415, 1438, 1485-1489
Gordon, D.B., see Leeper, E.M. 69
Gordon, R. 1030
Gordon, R.H. 1637
Gordon, R.J. 40, 46, 48, 49, 181, 1542
Gordon, R.J., see Balke, N.S. 6, 61, 204, 205, 221
Gorman, W.M. 553, 556, 782, 803
Gorton, G., see Calomiris, C.W. 181
Gottfries, N. 463, 1121, 1122
Gould, D.M. 1551, 1559, 1561
Gourieroux, C. 487
Gourieroux, C., see Broze, L. 487, 488
Gourieroux, C., see Fourgeaud, C. 454, 465, 473, 475
Gourinchas, P.-O. 609, 1344
Graham, B. 1323
Graham, F.C. 1656, 1657
Grandmont, J.-M. 439, 454, 460, 464, 474, 475, 481, 507, 514, 526, 661
Granger, C. 34
Granger, C.W.J. 881, 903
Granger, C.W.J., see Engle, R.F. 50
Gray, J.A. 1025, 1026, 1038
Green, D., see MaCurdy, T.E. 619, 620
Green, E. 575
Green, H., see Beaudry, P. 592
Greenberg, D., see Burtless, G. 618
Greenberg, D.H., see DaVanzo, J. 618
Greenspan, A. 199, 798, 844, 847, 1630
Greenwald, B. 857, 1122, 1377
Greenwood, J. 380, 550, 576, 664, 692, 962, 980, 995
Greenwood, J., see Cooley, T.F. 847
Greenwood, J., see Gomes, J. 994, 1159
Greenwood, J., see Gomme, P. 962, 1062
Gregory, A.W. 376, 377
Gregory, A.W., see Devereux, M. 952
Grier, K.B. 253
Griffiths, M., see Dolado, J. 1437
Griliches, Z. 541
Grilli, V. 95, 1404, 1432, 1438, 1439, 1465
Grilli, V., see Alesina, A. 1430
Grilli, V., see DeKock, G. 158
Grilli, V., see Drazen, A. 1463, 1465, 1541
Grilli, V.U. 169
Gros, D., see Adams, C. 1538
Gross, D. 857, 1344
Gross, D.B., see Goolsbee, A. 839
Grossman, G.M. 264, 639, 672, 715, 1210, 1464
Grossman, H.J. 158, 1415, 1449
Grossman, S.J. 801, 1237, 1242, 1246, 1268, 1291, 1293
Grout, P.A. 852
Gruen, D.K. 1316
Guerra, A. 1546, 1606, 1607
Guesnerie, R. 439, 454, 460, 464, 465, 474, 475, 506, 511, 516, 526
Guesnerie, R., see Chiappori, P.A. 391, 395, 516
Guesnerie, R., see Evans, G.W. 464
Guidotti, P.E. 1537, 1588, 1603, 1675, 1720
Guidotti, P.E., see Calvo, G.A. 1447, 1450
Guidotti, P.E., see De Gregorio, J. 1546, 1551, 1573, 1575, 1577
Guiso, L. 772
Guiso, L., see Galeotti, M. 909
Gulde, A.M., see Ghosh, A.R. 202, 207, 208
Gultekin, M. 1317
Gultekin, N.B., see Gultekin, M. 1317
Guo, J.-T., see Farmer, R.E. 395, 427-430, 434, 505
Guo, J.-T. 416, 427
Gurley, J.G. 1507
Gust, C. 1041
Guttman, R., see Erlich, D. 1314
Haberler, G. 185
Hahn, F. 661
Hahn, T., see Cook, T. 194, 1493
Hahn, W. 479
Hairault, J.-O. 1036
Haldane, A.G. 1432, 1438, 1485, 1495, 1497
Haley, W.J. 585
Hall, G. 911
Hall, R.E. 399, 556, 573, 595, 607, 608, 673, 679, 680, 683-686, 702, 765, 767-769, 784, 789, 791, 794, 817, 856, 930, 982, 1068, 1070, 1079, 1089, 1092, 1095, 1096, 1141-1143, 1145, 1151-1153, 1157, 1160-1164, 1200, 1261, 1485, 1493, 1498, 1655, 1656
Hall, S., see Currie, D. 454, 504
Hall, S., see Garratt, A. 504
Hallerberg, M. 1460, 1465
Haltiwanger, J., see Caballero, R.J. 821, 837, 838, 840-842, 1158
Haltiwanger, J., see Cooper, R. 824
Haltiwanger, J.C. 881
Haltiwanger, J.C., see Abraham, K.G. 1058
Haltiwanger, J.C., see Davis, S.J. 1151, 1152, 1160, 1161, 1176, 1178, 1180, 1194, 1199
Hamermesh, D. 577
Hamilton, A. 1659
Hamilton, J. 963
Hamilton, J.D. 12, 72, 80, 182, 1118, 1265
Hammerlin, G. 344
Hammour, M.L., see Caballero, R.J. 846, 847, 852, 855, 856, 1157, 1158, 1160, 1187, 1210, 1211, 1213, 1472
Hannerz, U. 1332
Hansen, B. 1194
Hansen, B.E. 38, 39
Hansen, G.D. 547, 551, 602, 976, 977, 1200
Hansen, G.D., see Cooley, T.F. 69, 97, 101, 115, 124, 137, 380, 408, 411, 974, 1736
Hansen, L.P. 547, 555, 556, 558, 572-574, 768, 769, 784, 882, 915, 1234, 1246, 1249, 1250, 1261, 1294, 1295
Hansen, L.P., see Anderson, E.W. 368, 369
Hansen, L.P., see Cochrane, J.H. 1234, 1246, 1249
Hansen, L.P., see Eichenbaum, M. 549, 550, 785, 799, 800, 803
Hanson, J. 1543
Hansson, B., see Frennberg, P. 1238
Harberger, A.C. 1554, 1590
Harden, I., see von Hagen, J. 1439, 1460, 1465
Hardouvelis, G.A. 1281
Hardouvelis, G.A., see Estrella, A. 43, 1281
Harris, R., see Cox, D. 705
Harrison, A. 277, 279, 280
Harrison, S.G., see Christiano, L.J. 426
Harrison, S.H. 402
Harrod, R. 640
Hart, O. 852, 1154
Hartwick, J. 656
Harvey, A.C. 9
Harvey, C.R. 1236, 1280
Hashimoto, M. 1152
Hassett, K.A. 815, 818, 843, 1344
Hassett, K.A., see Auerbach, A.J. 821
Hassett, K.A., see Cummins, J.G. 822, 856, 1344
Hassett, K.A., see Fallick, B.C. 855
Hassler, J. 9, 1238
Haug, A.A., see Dezhbakhsh, H. 1039
Haugen, R.A., see Ferris, S.P. 1314
Hause, J.C. 569
Hausman, J. 620
Hausman, J., see Burtless, G. 620
Hawley, C.B., see O'Brien, A.M. 776
Hayashi, F. 773, 775, 776, 785, 788, 790, 796, 800, 818, 1649
Head, A., see Devereux, M.B. 1126
Heal, G., see Dasgupta, P. 655, 656
Heal, G.M., see Ryder Jr, H.E. 1284
Heaton, J. 380, 547, 569, 803, 1242, 1255, 1284, 1293
Heckman, J.J. 576, 578, 579, 582, 584-587, 590, 592, 593, 595, 601-603, 605, 615-617, 620-624, 752, 759, 1166
Heckman, J.J., see Ashenfelter, O. 618
Heckman, J.J., see Cameron, S. 589
Heckman, J.J., see Cossa, R. 584
Heckman, J.J., see Killingsworth, M.R. 550, 601, 1148
Heijdra, B.J. 1119, 1120, 1126
Heinemann, M. 495, 525
Hellwig, M., see Gale, D. 1376
Helpman, E. 203, 1580
Helpman, E., see Coe, D.T. 265
Helpman, E., see Drazen, A. 1580
Helpman, E., see Grossman, G.M. 264, 639, 672, 715, 1210, 1464
Hendershott, P.H. 1333
Henderson, D.W. 1497
Henderson, D.W., see Bryant, R.C. 1491, 1497, 1516
Henderson, D.W., see Canzoneri, M.B. 160, 1507, 1508
Hendry, D., see Davidson, J. 750
Hercowitz, Z. 664
Hercowitz, Z., see Barro, R.J. 1023
Hercowitz, Z., see Greenwood, J. 550, 664, 962, 980
Herrendorf, B. 1415, 1436, 1438
Hess, G.D. 9, 1485, 1509
Hester, D.A. 871
Heston, A., see Summers, R. 238, 301, 640, 673-675, 677, 680, 681, 689, 720
Hetzel, R.L. 180
Heymann, D. 506, 1539, 1540, 1543
Hibbs, D. 1400, 1425
Hildenbrand, W. 535, 537
Himarios, D., see Graham, F.C. 1656, 1657
Himmelberg, C.P., see Gilchrist, S. 1344
Hiriart-Urruty, J.B. 331
Hirschhorn, E., see Cox, W.M. 1621
Hirschman, A. 1540
Hirshleifer, D., see Bikhchandani, S. 1332
Hirshleifer, D., see Daniel, K. 1322
Hobijn, B., see Franses, P.H. 289
Hodrick, R. 9, 12, 34, 428, 931, 932
Hodrick, R.J., see Bekaert, G. 1281
Hodrick, R.J., see Flood, R.P. 1507
Hoelscher, G. 1658
Hoffmaister, A. 1561, 1589
Hoffman, D.L. 51, 412
Hoffmann, K.-H., see Hammerlin, G. 344
Holbrook, R. 569
Holmstrom, B. 1376, 1417, 1418, 1425
Holt, C.A., see Davis, D. 1033
Holt, C.C. 882, 885, 888, 909, 910, 912
Holtham, G., see Bryant, R.C. 1491, 1497, 1516
Holtz-Eakin, D., see Blinder, A.S. 41
Hommes, C.H. 529, 532
Hommes, C.H., see Brock, W.A. 455, 528, 532
Honkapohja, S. 464, 481, 507, 535
Honkapohja, S., see Evans, G.W. 425, 426, 454, 455, 461, 464, 465, 468, 470, 472-478, 480, 481, 483, 484, 487, 489-492, 495-497, 502, 504-507, 509-513, 516, 518-521, 526-528, 530-532, 1025
Hooker, M.A., see Fuhrer, J.C. 454
Hooper, P., see Bryant, R.C. 1043, 1491, 1497, 1516-1518
Hopenhayn, H. 672, 708, 994
Hopenhayn, H.A. 844
Horioka, C., see Feldstein, M. 1636
Horn, H. 1415
Hornstein, A. 549, 996
Hornstein, A., see Fisher, J.D.M. 910
Horvath, M. 994
Horvath, M., see Boldrin, M. 962, 1062
Hoshi, T. 1344
Hosios, A.J. 1193, 1224
Hotz, V.J. 792, 803
Houthakker, H.S. 803
Howard, R. 336
Howitt, P. 389, 399, 455, 506, 507, 514, 515, 517, 521, 527, 1174, 1508
Howitt, P., see Aghion, P. 264, 665, 672, 715, 719, 1208, 1210, 1213
Howrey, E.P., see Fair, R.C. 1491
Hoynes, H.W., see Attanasio, O.P. 753
Hsieh, C.-T. 673, 687
Hubbard, R.G. 567, 569, 572, 573, 593, 771, 776, 794, 797, 856, 1344, 1376, 1660
Hubbard, R.G., see Cummins, J.G. 822, 1344
Hubbard, R.G., see Domowitz, I. 1020, 1083, 1093
Hubbard, R.G., see Fazzari, S.M. 818, 1344
Hubbard, R.G., see Gertler, M. 1376
Hubbard, R.G., see Hassett, K.A. 815, 818, 843, 1344
Huberman, G., see Kahn, C. 1154
Huffman, G.W. 437
Huffman, G.W., see Greenwood, J. 380, 962, 980
Huggett, M. 380, 576, 593
Hulten, C. 664
Hultgren, T. 1100
Humphrey, T.M. 1485
Humphreys, B.R. 909
Hurd, M.D. 780
Hybels, J., see Kallick, M. 1325
Hyslop, D., see Card, D. 1016
Ibbotson, R. 1321
Iden, G., see Barth, J.R. 1657
Ikenberry, G.J. 163
Im, K. 283
Imrohoroglu, A. 797
Ingberg, M., see Honkapohja, S. 535
Ingram, B. 984
Inman, R., see Bohn, H. 1465
Intriligator, M., see Griliches, Z. 541
Ireland, P.N. 129, 194, 1036, 1492, 1494, 1497
Irish, M., see Browning, M. 611, 612, 752, 787, 792
Irons, J., see Faust, J. 1416, 1425
Irwin, D.A. 178
Isard, P., see Flood, R.P. 158, 1429, 1438
Islam, N. 283-285, 287, 653
Ito, T. 1425
Iwata, S., see Hess, G.D. 9
Jackman, R. 1221
Jackman, R., see Layard, R. 1098, 1176, 1177, 1221
Jackwerth, J.C. 1310
Jaeger, A., see Harvey, A.C. 9
Jaffee, D.M. 1376
Jagannathan, R., see Glosten, L. 1280
Jagannathan, R., see Hansen, L.P. 547, 1234, 1246, 1249
James, H., see Bernanke, B.S. 183, 184
James, W. 1330
Janis, I. 1332
Jappelli, T. 776, 780, 790, 1344
Jappelli, T., see Guiso, L. 772
Jeanne, O. 156, 1041
Jeanne, O., see Bensaid, B. 1446, 1449
Jefferson, P.N. 1485, 1509
Jegadeesh, N. 1321
Jegadeesh, N., see Chan, L. 1321
Jensen, H. 1415, 1427
Jensen, H., see Beetsma, R. 1436, 1438
Jensen, M. 1344
Jeon, B.N., see von Furstenberg, G.M. 1333
Jermann, U.J. 1296
Jermann, U.J., see Alvarez, F. 575
Jermann, U.J., see Baxter, M. 980, 992
Jewitt, I., see Buiter, W. 1030
Jimeno, J.F., see Blanchard, O.J. 1214
Jimeno, J.F., see Dolado, J.J. 1214
John, A., see Cooper, R. 398
Johnson, P.G., see Banks, J. 751
Johnson, H.G. 702, 704, 705
Johnson, P., see Goodman, A. 797
Johnson, P.A., see Durlauf, S.N. 254, 263, 264, 270, 271, 289, 303
Johnson, S.A. 345, 381
Jones, C.I. 237, 264, 290, 292, 672, 696, 714-716, 718, 719
Jones, C.I., see Hall, R.E. 673, 679, 680, 683-686, 702, 856
Jones, L.E. 245, 257, 261, 380, 672, 709, 711-713, 720, 1675, 1711
Jones, L.E., see Chari, V.V. 715, 1578
Jones, M. 1540
Jonsson, G. 1404, 1411, 1415, 1426, 1438
Jonung, L. 159, 1485
Jonung, L., see Bordo, M.D. 152, 215, 217, 220, 221
Jonung, L., see Fregert, K. 1016
Jorda, O. 881
Jorgenson, D. 664
Jorgenson, D.W. 817
Jorgenson, D.W., see Christensen, L.R. 673, 688
Jorgenson, D.W., see Hall, R.E. 817
Jorion, P., see Goetzmann, W.N. 1242, 1252, 1320
Jovanovic, B. 702, 848, 1200
Jovanovic, B., see Greenwood, J. 664, 692
Judd, J.P. 1485, 1487, 1512, 1516
Judd, K. 590, 1652
Judd, K., see Bizet, D. 380
Judd, K.J., see Gaspar, J. 324, 369
Judd, K.L. 314, 324, 340, 343, 347, 348, 350, 354, 1673, 1675, 1694
Judson, R. 663
Judson, R., see Porter, R. 1509
Juhn, C. 569, 619
Jun, B. 474
Juster, F.T. 777
Juster, T., see Barsky, R. 558, 564, 565
Kaas, L., see Böhm, V. 646
Kafka, A. 1543
Kahaner, D. 329, 333
Kahn, C. 1154
Kahn, C.M., see Blanchard, O.J. 391, 504
Kahn, J., see Crucini, M.J. 178, 705
Kahn, J.A. 897, 910
Kahn, J.A., see Bils, M. 910, 912, 1053, 1078, 1079, 1085
Kahneman, D. 1308, 1309, 1311
Kahneman, D., see Thaler, R.H. 1313
Kahneman, D., see Tversky, A. 1308, 1315, 1319, 1330
Kalaba, R., see Bellman, R. 340
Kaldor, N. 237, 238, 240, 640, 941
Kalecki, M. 1054
Kallick, M. 1325
Kamihigashi, T. 428
Kaminsky, G.L. 1550, 1553, 1590
Kandel, S. 1235, 1252, 1253, 1265, 1270, 1272
Kandori, M. 475
Kane, A., see Chou, R.Y. 1236, 1280
Kaniovski, Y.M., see Arthur, W.B. 476
Kaplan, S.N. 856, 1344
Karatzas, I., see El Karoui, N. 835
Karras, G., see Cecchetti, S.G. 217
Kashyap, A.K. 137, 877, 881, 886, 903, 906, 912, 1018, 1344, 1374, 1376
Kashyap, A.K., see Cecchetti, S.G. 876
Kashyap, A.K., see Hoshi, T. 1344
Kashyap, A.K., see Hubbard, R.G. 1344
Katz, L. 577, 578
Katz, L., see Autor, D. 577
Katz, L.F., see Abraham, K.J. 1183, 1221
Katz, L.F., see Cutler, D.M. 797
Katz, L.F., see Blanchard, O.J. 1176
Kaufman, H. 1344
Keane, M.P. 608, 609, 786, 790
Keefer, P., see Knack, S. 1466, 1471
Kehoe, P.J., see Atkeson, A. 847, 1675, 1718, 1720
Kehoe, P.J., see Backus, D.K. 549
Kehoe, P.J., see Backus, D.K. 9, 42, 45, 938, 1708
Kehoe, P.J., see Chari, V.V. 124, 397, 422, 672, 697, 698, 700, 701, 709, 720, 722, 723, 974, 1036, 1037, 1040-1042, 1371, 1448, 1449, 1488, 1489, 1673-1676, 1691, 1699, 1708-1710, 1720, 1723
Kehoe, P.J., see Cole, H.L. 1449
Kehoe, T.J. 314, 380, 389, 391, 574, 575
Kehoe, T.J., see Cole, H.L. 1446, 1449, 1603
Kehrer, K.C., see Moffitt, R.A. 618
Kelly, M. 271
Kemmerer, E.W. 173
Kenen, P.B. 165, 1496
Kennan, J. 803
Kessler, D. 1646
Keynes, J.M. 158, 161, 1055, 1059, 1537
Kiefer, K. 476
Kiefer, N.M., see Burdett, K. 1173
Kiefer, N.M., see Devine, T.J. 1166
Kiguel, M. 1535, 1543, 1546, 1554, 1555
Kihlstrom, R.E. 563
Kiley, M.T. 422, 423, 1041, 1117, 1129
Killian, L. 87
Killingsworth, M.R. 550, 601, 1148
Kim, J. 129, 1036
Kim, K. 377, 379
Kim, M., see Nelson, C.R. 1320
Kim, S. 95
Kim, S.-J. 672, 711-714
Kimball, M., see Barsky, R. 558, 564, 565
Kimball, M., see Carroll, C.D. 762, 771
Kimball, M.S. 556, 770, 1036, 1041, 1056, 1114, 1117, 1127, 1653
Kimball, M.S., see Basu, S. 983, 992, 994, 1069, 1080, 1081, 1117
Kimbrough, K.P. 1537, 1675, 1676, 1720, 1732
Kindahl, J., see Stigler, G. 1018
Kindleberger, C.P. 162
King, M. 199, 1333, 1485, 1489
King, R.G. 9, 46, 54, 69, 101, 278, 369, 391, 429, 435, 545, 549, 649, 672, 689, 692, 711-713, 929, 931, 932, 939, 941, 945, 953, 954, 971, 995, 1036, 1041, 1043, 1062, 1140, 1364, 1367, 1491
King, R.G., see Barro, R.J. 974
King, R.G., see Baxter, M. 9, 11, 12, 430, 934, 974
King, R.G., see Dotsey, M. 974, 1032, 1043
King, R.G., see Goodfriend, M. 1013, 1117, 1346, 1515
King, S. 101
Kirby, C. 1320
Kirman, A.P. 475, 528, 536, 539-541
Kitanidis, P.K. 326
Kiyotaki, N. 524, 852, 857, 1353, 1356, 1376, 1378, 1379
Kiyotaki, N., see Blanchard, O.J. 1033, 1034
Kiyotaki, N., see Boldrin, M. 399
Kleidon, A.W. 1320
Klein, B. 202, 215, 216
Klein, L. 941
Klemperer, P.D. 1121
Klenow, P.J. 663, 673, 679, 680, 683-686, 694, 702, 705, 707
Klenow, P.J., see Bils, M. 694
Klenow, P.J., see Heckman, J.J. 578
Klock, M., see Silberman, J. 1316
Knack, S. 1466, 1471
Kneese, A. 656
Knowles, S. 277, 278
Kocherlakota, N. 574, 954, 985, 1234, 1251, 1253
Kocherlakota, N., see Cole, H.L. 576
Kocherlakota, N., see Ingram, B. 984
Kocherlakota, N.R. 271, 673, 694
Kochin, L., see Benjamin, D. 161
Kollintzas, T. 904-907
Kollman, R. 1085
Kon-Ya, F., see Shiller, R.J. 1316
Konings, J., see Garibaldi, P. 1180, 1222
Koopmans, T. 931, 942, 948
Koopmans, T.C. 244, 246, 247, 295, 643, 649, 1673
Koopmans, T.J. 9
Kormendi, R.C. 278-281, 671, 1656, 1657
Kornai, J. 703
Kortum, S., see Eaton, J. 719
Kosobud, R., see Klein, L. 941
Kosters, M.H. 618
Kotkin, B., see Bellman, R. 340
Kotlikoff, L. 1448, 1449, 1465
Kotlikoff, L., see Hayashi, F. 796
Kotlikoff, L.J. 780, 1624, 1646
Kotlikoff, L.J., see Auerbach, A.J. 380, 549, 576, 588, 590, 591, 593, 616, 1624, 1634, 1635, 1639, 1652, 1718
Kotlikoff, L.J., see Gokhale, J. 750
Koyck, L.M. 816
Kramer, C., see Flood, R.P. 1596
Krane, S.D. 876, 877
Kremer, M., see Blanchard, O.J. 852
Kremer, M., see Easterly, W. 277, 278, 281, 675
Kreps, D.M. 540, 557, 1256
Kreps, D.M., see Bray, M. 474
Kreps, D.M., see Fudenberg, D. 475
Krieger, S. 380, 843, 847
Krishnamurthy, A. 1376, 1378
Kroner, K.F., see Bollerslev, T. 1236, 1280
Krueger, A., see Autor, D. 577
Krueger, A.O. 673, 679, 699
Krueger, J.T. 104, 105
Krugman, P. 1215, 1536, 1590, 1592, 1594, 1596, 1601, 1605, 1606, 1632
Krusell, P. 380, 547, 566, 567, 994, 1293, 1445, 1473
Krusell, P., see Greenwood, J. 664
Kuan, C.-M. 476
Kugler, P. 1281
Kuh, E., see Meyer, J.R. 817
Kumhof, M. 1596
Kurz, M. 474
Kushner, H. 476
Kushner, H.J. 476
Kusko, A.L. 1327
Kuttner, K., see Evans, C.L. 105
Kuttner, K.N., see Friedman, B.M. 43, 44
Kuttner, K.N., see Krueger, J.T. 104, 105
Kuznets, S. 941
Kuznets, S., see Friedman, M. 572
Kwiatkowski, D. 212
Kydland, F.E. 9, 42, 158, 428, 547, 549, 578, 929, 953, 956, 957, 962, 980, 981, 1058, 1059, 1140, 1141, 1145, 1167, 1195, 1400, 1405, 1415, 1449, 1485, 1486, 1488, 1557, 1561, 1673, 1708
Kydland, F.E., see Backus, D.K. 549
Kydland, F.E., see Backus, D.K. 1708
Kydland, F.E., see Bordo, M.D. 158, 160, 185, 215, 1438
Kydland, F.E., see Hotz, V.J. 792, 803
Kyle, A.S., see Campbell, J.Y. 1290
La Porta, R. 1240, 1320
Labadie, P., see Giovannini, A. 380
Labadie, P.A., see Coleman II, W.J. 114
Lach, S. 1019
Ladron de Guevara, A. 317
Laffont, J., see Gourieroux, C. 487
Laffont, J.J., see Kihlstrom, R.E. 563
Laffont, J.-J. 538
Lahiri, A. 1539, 1571, 1578, 1579, 1597
Lai, K.S. 876
Laibson, D. 1653
Laidler, D. 1485
Lakonishok, J. 1323
Lakonishok, J., see Chan, L. 1321
Lam, P.-S., see Cecchetti, S.G. 1251, 1265, 1270, 1272, 1294, 1296
Lam, P.S. 802
Lambert, J.D. 346
Lambertini, L. 1457, 1465
Lamo, A.R. 290
Lamont, O.A., see Kashyap, A.K. 881, 912, 1344, 1374
Landi, L., see Barucci, E. 525
Lane, P. 1472
Langer, E.J. 1329
Lansing, K., see Guo, J.-T. 416
Lapham, B.J., see Devereux, M.B. 1126
Laroque, G., see Fuchs, G. 464, 474
Laroque, G., see Grandmont, J.-M. 464, 474, 475, 481, 507
Laroque, G., see Grossman, S.J. 801
Lau, L. 664
Lau, S.H.P. 1037
Lawrance, E. 607-609
Layard, R. 1098, 1176, 1177, 1221
Layard, R., see Jackman, R. 1221
Layne-Farrar, A., see Heckman, J.J. 578
Lazaretou, S. 159
Lazear, E.P. 1660
Lazear, E.P., see Hall, R.E. 1152
League of Nations 162
Leahy, J. 844, 1332
Leahy, J., see Caballero, R.J. 823, 828, 830
Leahy, J., see Caplin, A. 849, 850
Leamer, E.E. 282
Lebow, D.E. 215, 1016
Lebow, D.E., see Blinder, A.S. 1018, 1118
Lee, C. 1324
Lee, J.-W., see Barro, R.J. 277-281, 671, 681, 683-685, 688, 689, 691-694
Lee, J.-W. 703
Lee, J.Y. 395
Lee, K. 284
Lee, T.H., see Granger, C.W.J. 881, 903
Leeper, E.M. 69, 74, 83, 93, 101, 128, 132, 134, 137, 418, 420, 1036, 1089, 1369, 1518, 1520, 1631
Leeper, E.M., see Faust, J. 69, 217
Leeper, E.M., see Gordon, D.B. 128, 134
Lefort, F., see Caselli, F. 277-279, 283, 284, 286
Lehmann, B.N. 1321
Leibfritz, W., see Roseveare, D. 1626
Leiderman, L. 1432, 1438, 1495, 1543
Leiderman, L., see Bufman, G. 1543
Leiderman, L., see Calvo, G.A. 1552, 1600
Leiderman, L., see Kaminsky, G.L. 1550
Leijonhufvud, A. 152, 202, 215
Leijonhufvud, A., see Heymann, D. 1539, 1540
Lemarechal, C., see Hiriart-Urruty, J.B. 331
LeRoy, S.F. 1235, 1319
Lettau, M. 470, 472, 524, 527, 1293, 1297
Leung, C. 271
Levhari, D. 1450, 1465
Levin, A. 283, 1017, 1031, 1035, 1036, 1038
Levin, A., see Brayton, F. 1043, 1344, 1485
Levine, D.K., see Fudenberg, D. 455, 475
Levine, D.K., see Kehoe, T.J. 380, 389, 391, 574, 575
Levine, J. 1332
Levine, P., see al Nowaihi, A. 1415, 1422, 1437
Levine, R. 269, 277-282, 390, 423, 671, 694, 1376
Levine, R., see King, R.G. 278, 689, 692
Levy, D. 1014, 1015, 1019
Levy, D., see Carpenter, R.E. 876
Levy, D., see Dutta, S. 1019, 1020
Levy-Leboyer, M. 222
Lévi-Strauss, C. 1331
Lewis-Beck, M. 1425
Li, J.X. 326
Li, Y., see Johnson, S.A. 345, 381
Lichtenstein, S. 1318
Lichtenstein, S., see Fischhoff, B. 1319
Lilien, D.M. 1160, 1183, 1221
Lilien, D.M., see Hall, R.E. 1153
Lillard, L. 569, 572
Limongi, F., see Przeworski, A. 1466
Lin, C., see Levin, A. 283
Lindbeck, A. 1098, 1425, 1465
Lindert, P. 156
Lioni, G., see Contini, B. 1177, 1178, 1180, 1222
Lippi, F. 1432
Lippi, F., see Cukierman, A. 1438
Lippi, M. 217
Lipsey, R.E., see Blomstrom, M. 277, 279, 280
Liu, C.Y., see Conlon, J.R. 1032
Liviatan, N., see Cukierman, A. 1437
Liviatan, N., see Kiguel, M. 1535, 1543, 1546, 1554, 1555
Lizondo, J.S. 1538
Lizzeri, A. 1459
Ljung, L. 474, 476, 481, 482
Ljungqvist, L. 1214
Lo, A.W. 1321
Lo, A.W., see Campbell, J.Y. 1255, 1257, 1258, 1261, 1266, 1270, 1320
Loayza, N.V. 708
Lochner, L., see Cossa, R. 584
Lochner, L., see Heckman, J.J. 576, 578, 582, 584, 586, 587, 590, 592, 593
Lockwood, B. 1411, 1415
Lockwood, B., see Herrendorf, B. 1436, 1438
Lohmann, S. 1416-1418, 1425, 1431, 1438
Londregan, J., see Alesina, A. 1425
Long, J. 929, 952, 953, 994
Loomes, G. 1313
Lopez-de-Silanes, F., see La Porta, R. 1240
Lorentz, A.L. 344
Lothian, J.R., see Darby, M.R. 166
Loury, G.C. 299
Lovell, M.C. 881, 893, 908, 910
Lown, C., see Bernanke, B.S. 1343
Lucas, D.J. 1035, 1036, 1042
Lucas, D.J., see Heaton, J. 380, 547, 569, 1255, 1293
Lucas, R. 398, 424, 425, 641, 651, 929, 932, 953
Lucas, R.E. 46, 50, 380, 1158, 1446, 1449
Lucas, R.E., see Stokey, N.L. 314, 318-321, 346, 951, 998, 999
Lucas Jr, R.E. 67, 88, 158, 238, 245, 264, 265, 293, 454, 457, 463, 474, 545, 547, 554, 559, 561, 575, 578, 582, 583, 615, 616, 672, 710-715, 720, 797, 1022-1024, 1043, 1195, 1268, 1489, 1490, 1495, 1500, 1592, 1673, 1675, 1699, 1711, 1723, 1728
Lucas Jr, R.E., see Atkeson, A. 575
Lucas Jr, R.E., see Stokey, N.L. 271, 299
Ludvigson, S. 785, 788, 1344, 1652
Lundvik, P., see Hassler, J. 9, 1238
Lusardi, A. 608, 790, 791
Lusardi, A., see Browning, M. 606, 771
Lusardi, A., see Garcia, R. 790
Luttmer, E.G.J. 575
Lyons, R.K., see Caballero, R.J. 399
Maberly, E.D., see Dyl, E.A. 1334
Macaulay, F.R. 173
MacAvoy, P.W., see Funkhouser, R. 699
Maccini, L.J. 881, 893, 894, 903, 907
Maccini, L.J., see Blinder, A.S. 887, 904, 910, 1344
Maccini, L.J., see Durlauf, S.N. 905-907
Maccini, L.J., see Haltiwanger, J.C. 881
Maccini, L.J., see Humphreys, B.R. 909
MacDonald, R., see Bordo, M.D. 156
Mace, B.J. 796
Mackay, D. 1307
MacKinlay, A.C., see Campbell, J.Y. 1255, 1257, 1258, 1261, 1266, 1270, 1320
MacKinlay, A.C., see Lo, A.W. 1321
MacLeod, W.B. 1157, 1186
MaCurdy, T.E. 551, 567-569, 572, 592, 595, 615, 616, 619-621, 752, 759, 767, 792, 975, 1148, 1149
MaCurdy, T.E., see Attanasio, O.P. 792
MaCurdy, T.E., see Blundell, R. 602, 620
MaCurdy, T.E., see Heckman, J.J. 615
Maddala, G.S. 275
Maddison, A. 288, 673-675, 677, 678, 720, 721
Madison, J. 1659
Mailath, G.J., see Kandori, M. 475
Makhija, A.K., see Ferris, S.P. 1314
Malcomson, J.M., see MacLeod, W.B. 1157, 1186
Malinvaud, E., see Blanchard, O.J. 1214
Malkiel, B. 1316
Mankiw, N.G. 135, 158, 159, 173, 216, 244-246, 252-255, 269-271, 277-279, 289, 397, 567, 653, 655, 660, 673, 679-686, 694, 749, 785, 790, 800, 961, 1281, 1290, 1292, 1638, 1702, 1742
Mankiw, N.G., see Abel, A.B. 1266, 1651
Mankiw, N.G., see Ball, L. 42, 1023, 1632, 1650, 1651
Mankiw, N.G., see Barro, R.J. 1637
Mankiw, N.G., see Barsky, R.B. 1653
Mankiw, N.G., see Campbell, J.Y. 769, 784, 1261, 1264, 1290, 1655
Mankiw, N.G., see Elmendorf, D.W. 1439
Mankiw, N.G., see Hall, R.E. 1485, 1493, 1498
Mankiw, N.G., see Kimball, M.S. 1653
Mann, C.L., see Bryant, R.C. 1043, 1491, 1497, 1516-1518
Manuelli, R.E., see Chari, V.V. 715, 1578
Manuelli, R.E., see Jones, L.E. 245, 257, 261, 380, 672, 709, 711-713, 720, 1675, 1711
Mao, C.S., see Dotsey, M. 370, 952
Marcet, A. 314, 326, 348, 351, 454, 455, 464, 465, 468, 473-476, 480, 494, 499, 525, 528-530, 532, 1675, 1705, 1707
Marcet, A., see Canova, F. 283
Marcet, A., see den Haan, W.J. 347, 354, 369
Margarita, S., see Beltratti, A. 524, 525
Margaritis, D. 474
Mariano, R.S., see Seater, J.J. 1656, 1657
Mariger, R.P. 1344
Marimon, R. 455, 464, 472, 475, 523, 531, 1214
Marimon, R., see Evans, G.W. 483, 509, 527, 528, 531
Marion, N., see Flood, R.P. 1429, 1438
Mark, N.C., see Cecchetti, S.G. 1251, 1265, 1270, 1272, 1294, 1296
Marris, S. 1632
Marschak, J. 582, 1043
Marshall, A. 203
Marshall, D.A., see Bekaert, G. 1281
Marshall, D.A., see Marcet, A. 326, 348, 351, 455
Marston, R., see Bodnar, G. 1318
Marston, R.C. 164
Martin, J.P. 1181
Mas-Colell, A., see Kehoe, T.J. 380
Masciandaro, D., see Grilli, V. 1404, 1432, 1438, 1439, 1465
Masson, A., see Kessler, D. 1646
Masson, P., see Chadha, B. 1542
Masson, P.R. 1554, 1588
Matheny, K.J. 395, 441
Matsukawa, S. 1037
Matsuyama, K. 395, 399
Matthieson, D., see Mussa, M. 208
Mauro, P. 277
Mauro, P., see Easterly, W. 1538
Maussner, A. 528
Mayhew, S. 1310
McAfee, R.P., see Howitt, P. 389, 399, 506, 517, 521
McCallum, B.T. 83, 173, 184, 198, 203, 408, 487, 488, 496, 503, 1022, 1026, 1043, 1411, 1426, 1432, 1437, 1438, 1485, 1487, 1488, 1490, 1491, 1493, 1495, 1500, 1502, 1506-1510, 1512, 1515-1519, 1631
McCulloch, J.H., see Dezhbakhsh, H. 1039
McElroy, M. 619
McFadden, D. 1314, 1316, 1328
McGrattan, E.R. 348, 974
McGrattan, E.R., see Anderson, E.W. 368, 369
McGrattan, E.R., see Chari, V.V. 124, 397, 422, 672, 697, 698, 700, 701, 709, 720, 722, 723, 974, 1036, 1037, 1040-1042, 1371
McGrattan, E.R., see Marimon, R. 455, 475, 523
McGuire, W.J. 1332
McIntire, J.M., see Carlson, J.B. 104
McKelvey, R.D. 380
McKibbin, W.J., see Henderson, D.W. 1497
McKinnon, R. 1496
McKinnon, R.I. 166, 207
McLaughlin, K.J. 1016, 1152
McLean, I., see Eichengreen, B. 157
McLennan, A. 474
McLennan, A., see McKelvey, R.D. 380
McManus, D.A. 908
Means, G.C. 1082
Meckling, W., see Jensen, M. 1344
Medeiros, C. 1554, 1555
Medoff, J.L., see Fay, J.A. 1077, 1103
Meehl, P. 1319
Meghir, C. 611, 613, 775, 804
Meghir, C., see Arellano, M. 787
Meghir, C., see Attanasio, O.P. 793, 794
Meghir, C., see Blundell, R. 611, 612, 779, 781, 783, 790-792
Meghir, C., see Browning, M. 607, 611, 778
Meguire, P., see Kormendi, R.C. 278-281, 671, 1656, 1657
Mehra, R. 547, 961, 1234, 1236, 1249, 1251, 1264, 1268, 1270, 1272, 1289, 1312
Mehra, R., see Constantinides, G.M. 1293
Mehra, R., see Danthine, J.-P. 329, 370, 952
Meigs, A.J. 191
Melenberg, B., see Alessie, R. 774
Melino, A., see Blanchard, O.J. 912
Melino, A., see Epstein, L.G. 558, 565
Melino, A., see Grossman, S.J. 1242
Melnick, R., see Bruno, M. 1539
Meltzer, A.H. 162, 169, 174-176, 178, 179, 185, 204, 215-217, 222, 1466, 1485, 1543
Meltzer, A.H., see Brunner, K. 179, 183, 191, 1025
Meltzer, A.H., see Cukierman, A. 1414, 1450, 1463
Mendoza, E. 1439, 1571, 1579
Mendoza, E., see Calvo, G.A. 1591, 1600, 1601
Meredith, G., see Chadha, B. 1542
Merton, R. 1275
Merton, R.K. 389, 1333
Merz, M. 994, 1158, 1173, 1203, 1207
Metivier, M., see Benveniste, A. 476, 531
Metzler, L.A. 867
Meyer, J.R. 817
Mihov, I., see Bernanke, B.S. 72, 76, 83, 89, 114, 1365, 1369
Milesi-Ferretti, G.-M., see Mendoza, E. 1439
Milesi-Ferretti, G.-M. 1425, 1426, 1597
Milgrom, P. 475, 1322
Millard, S.P. 1217, 1220
Miller, B.L. 566
Miller, M., see Lockwood, B. 1411, 1415
Miller, M., see Modigliani, F. 1343
Miller, R.A., see Altug, S. 584, 595, 611, 612, 785, 786, 792
Mills, F. 1082
Mills, J., see Erlich, D. 1314
Mills, L.O., see Boschen, J.F. 139
Mills, T.C. 204
Mills, T.C., see Capie, F. 163, 1438
Mincer, J. 581, 592, 684
Minehart, D., see Bowman, D. 1313
Mirman, L., see Levhari, D. 1450, 1465
Mirman, L.J., see Brock, W.A. 319, 547, 552, 556, 942, 951
Miron, J.A. 173, 216, 876, 907, 1242
Miron, J.A., see Barsky, R.B. 1149
Miron, J.A., see Beaulieu, J.J. 876
Miron, J.A., see Feenberg, D. 60
Miron, J.A., see Mankiw, N.G. 173, 216, 1281
Mirrlees, J.A. 1154
Mirrlees, J.A., see Diamond, P.A. 1684
Mishkin, F.S. 101, 183, 216, 1023, 1380, 1432, 1438
Mishkin, F.S., see Bernanke, B.S. 1495
Mishkin, F.S., see Estrella, A. 1485
Mishkin, F.S., see Hall, R.E. 607, 608, 789, 1655
Mishra, D. 1416, 1425
Missale, A. 1450
Mitchell, B.R. 222
Mitchell, W.C. 8, 44, 1053
Mitchell, W.C., see Burns, A.F. 5, 8, 931, 934
Mitra, K. 530, 532
Mnookin, R.H., see Gilson, R.J. 1154
Modiano, E.M. 1543
Modigliani, F. 761, 762, 780, 1321, 1343, 1646, 1656, 1657
Modigliani, F., see Dreze, J. 770
Modigliani, F., see Holt, C.C. 882, 885, 888, 909, 910, 912
Modigliani, F., see Jappelli, T. 780
Modigliani, F., see Samuelson, P.A. 643
Moffitt, R. 752, 787
Moffitt, R.A. 618
Moler, C., see Kahaner, D. 329, 333
Mondino, G. 1540
Monfort, A., see Gourieroux, C. 487
Monro, S., see Robbins, H. 476, 478
Montgomery, E. 1017, 1018
Montiel, P. 1539
Montiel, P., see Agénor, P.R. 1543
Montrucchio, L. 330
Montrucchio, L., see Boldrin, M. 362
Moore, B.J. 455, 475, 496
Moore, G.H. 1059
Moore, G.H., see Zarnowitz, V. 40
Moore, G.R., see Fuhrer, J.C. 905, 908, 1039, 1040, 1518
Moore, J., see Kiyotaki, N. 852, 857, 1353, 1356, 1376, 1378, 1379
Moreno, D. 481
Morgan, D. 1374
Morrison, C.J. 1086
Mortensen, D.T. 1157, 1158, 1162, 1163, 1173, 1182, 1183, 1187, 1188, 1194, 1198, 1203, 1208, 1217, 1220, 1222
Mortensen, D.T., see Burdett, K. 1173, 1196
Mortensen, D.T., see Millard, S.P. 1217, 1220
Morton, T.E. 338
Mosser, P.C. 910
Motley, B., see Judd, J.P. 1485, 1487, 1512, 1516
Mroz, T.A. 618
Mroz, T.A., see MaCurdy, T.E. 592, 752
Muellbauer, J., see Deaton, A. 783
Mueller, D. 1464
Mulligan, C.B. 346, 1150
Mundell, R.A. 1496
Murphy, K. 581
Murphy, K., see Juhn, C. 569, 619
Murphy, K., see Katz, L. 577, 578
Murphy, K.M. 262, 278, 1082
Murray, C.J., see Nelson, C.R. 11
Murray, W., see Gill, P.E. 329
Musgrave, R.A. 1631, 1661
Mussa, M. 208, 1404, 1637
Mussa, M., see Flood, R.P. 152, 202, 1428
Mussa, M.L., see Frenkel, J.A. 203
Muth, J.F. 457, 473, 484
Muth, J.F., see Holt, C.C. 882, 885, 888, 909, 910, 912
Myerson, R. 1459
Nakamura, A. 618
Nakamura, M., see Nakamura, A. 618
Nalebuff, B., see Bliss, C. 1461, 1465
Nance, D.R. 1318
Nankervis, J.C., see McManus, D.A. 908
Nash, S., see Kahaner, D. 329, 333
Nason, J.M., see Cogley, T. 395, 547, 967, 1142, 1503
Natanson, I.P. 342
NBER 8
Neale, M.A., see Northcraft, G.B. 1315
Negishi, T. 559
Nelson, C.R. 11, 211, 213, 264, 969, 1264, 1320
Nelson, C.R., see Beveridge, S. 1062, 1143
Nelson, D.B. 182
Nelson, E. 1035
Nerlove, M. 283, 284
Neumann, G.R., see Burdett, K. 1173
Neusser, K. 941
Neves, J., see Correia, I. 974
Neves, P., see Blundell, R. 792
Ng, S., see Garcia, R. 790
Nickell, S., see Layard, R. 1098, 1176, 1177, 1221
Nickell, S.J. 823
Nicolini, J.P., see Marcet, A. 455, 530, 532
Niederreiter, H. 334
Nilsen, O.A., see Askildsen, J.E. 1074
Nishimura, K., see Benhabib, J. 403-405, 425, 435
Nordhaus, W. 1400, 1425
North, D. 1449
Northcraft, G.B. 1315
Novales, A. 803
Nurkse, R. 163, 203
Nyarko, Y. 465, 474
O'Barr, W.M. 1332
O'Brien, A.M. 776
O'Brien, A.P. 181
Obstfeld, M. 159, 164, 165, 407, 1411, 1415, 1429, 1438, 1449, 1507, 1571, 1588, 1590, 1592, 1630
Obstfeld, M., see Froot, K. 1266
O'Connell, S.A. 1650
Odean, T. 1314, 1323
O'Driscoll, G.P. 1643
OECD 1181, 1182, 1215, 1620
Office of Management and Budget 1622
Officer, L. 155
Ogaki, M., see Atkeson, A. 610, 786
Ohanian, L.E. 1036
Ohanian, L.E., see Cooley, T.F. 42, 962, 974
O'Hara, M., see Blume, L.E. 321, 322
Ohlsson, H., see Edin, D.A. 1457
Okina, K. 1508
Okun, A.M. 1014, 1541
Oliner, S.D. 137, 820, 1374, 1376
Oliner, S.D., see Cummins, J.G. 856
Olsder, G., see Basar, T. 1449
Olshen, R.A., see Breiman, L. 289
Oppers, S. 154
Orphanides, A. 198, 1485
Ortega, E., see Canova, F. 376, 377, 379
Ortigueira, S., see Ladron de Guevara, A. 317
Ostry, J. 1568
Ostry, J., see Montiel, P. 1539
Ostry, J.D., see Ghosh, A.R. 202, 207, 208
Owen, P.D., see Knowles, S. 277, 278
Ozler, S. 1457, 1465
Ozler, S., see Alesina, A. 277-279, 1460, 1466, 1471
Paarsch, H., see MaCurdy, T.E. 619, 620
Pacelli, L., see Contini, B. 1177, 1178, 1180, 1222
Packalén, M. 525
Padilla, J., see Dolado, J. 1437
Pagan, A., see Kim, K. 377, 379
Pagan, A.R. 9, 69, 108
Pagano, M., see Giavazzi, F. 203, 1438, 1446, 1449, 1580
Pagano, M., see Jappelli, T. 776
Papageorgiou, A. 334
Papageorgiou, C., see Duffy, J. 257
Paquet, A., see Ambler, S. 944
Parekh, G. 87, 109
Parente, S.L. 672, 674, 702, 708
Parke, W.R., see Davutyan, N. 156
Parker, J., see Barsky, R. 43
Parker, J., see Gourinchas, P.-O. 609, 1344
Parker, J.A. 1120
Parker, J.A., see Solon, G. 579, 1058, 1102, 1106
Parkin, M. 1037, 1412, 1415, 1506
Parkin, M., see Bade, R. 1432, 1438
Pashardes, P., see Blundell, R. 781
Paskov, S.H. 334
Patel, J., see Degeorge, F. 1321
Patinkin, D. 407, 1506, 1507, 1630, 1643
Paulin, G. 751
Paxson, C., see Deaton, A. 798
Paxson, C., see Ludvigson, S. 788
Pazos, F. 1534
Peles, N., see Goetzmann, W.N. 1314
Pencavel, J. 550, 601, 605, 975, 1148
Peralta-Alva, A. 374
Perli, R. 402, 431, 435
Perli, R., see Benhabib, J. 425, 426, 437
Perotti, R. 1466, 1469, 1472
Perotti, R., see Alesina, A. 1439, 1464, 1465
Perron, P. 264
Perry, G.L., see Akerlof, G.A. 198
Persson, M. 1447, 1449
Persson, T. 278, 692, 1400, 1403, 1413, 1415-1418, 1420, 1421, 1425, 1433, 1435, 1437-1440, 1442, 1445, 1448-1450, 1454, 1456, 1459, 1460, 1465, 1466, 1469, 1470, 1490
Persson, T., see Englund, P. 9
Persson, T., see Hassler, J. 9, 1238
Persson, T., see Horn, H. 1415
Persson, T., see Kotlikoff, L. 1448, 1449, 1465
Persson, T., see Persson, M. 1447, 1449
Pesaran, H. 487
Pesaran, M.H., see Binder, M. 271
Pesaran, M.H., see Im, K. 283
Pesaran, M.H., see Lee, K. 284
Pestieau, P.M. 1718
Petersen, B.C., see Carpenter, R.E. 881, 912, 1344
Petersen, B.C., see Domowitz, I. 1020, 1083, 1093
Petersen, B.C., see Fazzari, S.M. 818, 1344
Petterson, P. 1457
Pflug, G., see Ljung, L. 476
Phaneuf, L. 1028, 1039, 1041
Phelan, C. 380, 575, 796
Phelan, C., see Atkeson, A. 1298
Phelps, E. 944, 1025, 1026, 1039
Phelps, E.S., see Frydman, R. 453, 454, 474, 528, 536, 539
Phelps, E.S. 46, 168, 1059, 1098, 1121, 1122, 1157, 1173, 1176, 1192, 1220, 1537, 1538, 1720, 1724
Philippopoulos, A., see Lockwood, B. 1415
Phillips, A.W. 1510
Phillips, A.W.H. 46
Phillips, L.D., see Lichtenstein, S. 1318
Phillips, P.C.B., see Kwiatkowski, D. 212
Picard, P. 1157
Pieper, P.J., see Eisner, R. 1621
Piketty, T., see Aghion, P. 1377
Pindyck, R. 1072
Pindyck, R.S. 835, 910, 912
Pindyck, R.S., see Abel, A.B. 835
Pindyck, R.S., see Caballero, R.J. 844
Pippenger, J. 156
Pischke, J.-S., see Jappelli, T. 790
Pischke, J.-S. 764
Pissarides, C.A. 774, 1163, 1173, 1183, 1184, 1188, 1193, 1194, 1200, 1203, 1207, 1209, 1220
Pissarides, C.A., see Garibaldi, P. 1180, 1222
Pissarides, C.A., see Jackman, R. 1221
Pissarides, C.A., see Mortensen, D.T. 1158, 1182, 1183, 1194, 1198, 1203, 1208
Plosser, C.I. 952, 954, 958, 961, 963, 1094, 1658
Plosser, C.I., see King, R.G. 9, 54, 369, 391, 429, 435, 549, 929, 931, 941, 945, 954, 995
Plosser, C.I., see Long, J. 929, 952, 953, 994
Plosser, C.I., see Nelson, C.R. 11, 211, 213, 264, 969
Plutarchos, S., see Benhabib, J. 437
Polemarchakis, H.M., see Geanakoplos, J.D. 395, 458
Policano, A., see Fethke, G. 1037
Pollak, R.A. 803
Pollard, S. 161
Poole, W. 192, 1514, 1515
Poonia, G.S., see Dezhbakhsh, H. 1039
Popper, K. 376
Porter, R. 1509
Porter, R.D., see LeRoy, S.F. 1235, 1319
Porteus, E.L., see Kreps, D.M. 557, 1256
Portier, F. 1068, 1126
Portier, F., see Hairault, J.-O. 1036
Posen, A. 1404, 1426, 1432, 1438
Posen, A., see Mishkin, F.S. 1432, 1438
Poterba, J.M. 159, 1235, 1320, 1465, 1648, 1655
Poterba, J.M., see Cutler, D.M. 1290, 1320, 1321
Poterba, J.M., see Feldstein, M. 1633
Poterba, J.M., see Kusko, A.L. 1327
Power, L., see Cooper, R. 824
Pradel, J., see Fourgeaud, C. 454, 465, 473, 475
Praschnik, J., see Hornstein, A. 549
Prati, A. 162
Prati, A., see Alesina, A. 1446, 1449
Prati, A., see Drudi, F. 1450
Prescott, E.C. 178, 365, 545, 675, 700, 702, 930, 934, 952, 954, 956, 957, 961, 963, 982, 1033, 1296, 1488, 1489, 1710
Prescott, E.C., see Chari, V.V. 1488, 1489, 1674
Prescott, E.C., see Cooley, T.F. 376, 549, 954
Prescott, E.C., see Hansen, G.D. 602
Prescott, E.C., see Hodrick, R. 9, 12, 34, 428, 931, 932
Prescott, E.C., see Kydland, F.E. 9, 42, 158, 428, 547, 549, 929, 953, 956, 957, 962, 980, 981, 1058, 1059, 1140, 1141, 1145, 1167, 1195, 1400, 1405, 1415, 1449, 1485, 1486, 1488, 1673, 1708
Prescott, E.C., see Lucas Jr, R.E. 547, 554
Prescott, E.C., see Mehra, R. 547, 961, 1234, 1236, 1249, 1251, 1264, 1268, 1270, 1272, 1289, 1312
Prescott, E.C., see Parente, S.L. 672, 674, 708
Prescott, E.C., see Stokey, N.L. 951, 998, 999
Prescott, E.S. 380
Press, W.H. 329-334, 343, 348, 356, 365
Preston, I., see Banks, J. 759, 783, 790, 791
Preston, I., see Blundell, R. 572, 764, 797
Priouret, P., see Benveniste, A. 476, 531
Pritchett, L. 237
Pritchett, L., see Easterly, W. 277, 278, 281, 675
Przeworski, A. 1466
Psacharopoulos, G. 685
Puterman, M.L. 336, 338, 339
Quadrini, V., see Cooley, T.F. 1376
Quadrini, V., see Krusell, P. 1445, 1473
Quah, D. 254, 263, 268, 272, 275, 283, 287, 288, 290-292, 294, 299
Quah, D., see Leung, C. 271
Quah, D.T., see Blanchard, O.J. 211, 216, 217
Quah, D.T., see Durlauf, S.N. 550
Quandt, R.E. 34
Quattrone, G.A. 1329
Rabin, M. 1319
Rabin, M., see Bowman, D. 1313
Rabinowitz, P., see Davis, P.J. 333
Radner, R. 952
Radner, R., see Benhabib, J. 1465
Ramey, G. 281, 852, 1157, 1159
Ramey, G., see den Haan, W.J. 994, 1166, 1194, 1203, 1204, 1206, 1207
Ramey, G., see Evans, G.W. 455, 461, 462
Ramey, V.A. 67, 876, 885, 897, 902, 905-907, 909, 911, 914, 1084, 1089
Ramey, V.A., see Bresnahan, T.F. 911, 912
Ramey, V.A., see Chah, E.Y. 775
Ramey, V.A., see Ramey, G. 281
Ramos, J. 1543
Ramsey, F. 643, 649
Ramsey, F.P. 1673
Rankin, N., see Dixon, H. 537
Rankin, N. 1025
Rapping, L., see Lucas Jr, R.E. 615, 616
Rasche, R.H., s e e Hoffman, D.L. 51,412 Ratti, R.A. 1497 Ravikumar, B., s e e Chatterjee, S. 1126 Ravikumar, B., s e e Glomm, G. 712, 1472 Rawls, J. 1662 Ray, D., s e e Esteban, J.-M. 264 Rayack, W. 579 Razin, A. 1715 Razin, A., s e e Frenkel, J.A. 1630 Razin, A., s e e Helpman, E. 203, 1580 Razin, A., s e e Mendoza, E. 1439 Razin, A., s e e Milesi-Ferretti, G.-M. 1597 Rebelo, S.T. 245, 260, 261, 709, 952, 1546, 1568, 1578-1581, 1606 Rebelo, S.T., s e e Burnside, C. 399, 930, 980, 982-985, 994, 1078, 1142 Rebelo, S.T., s e e Correia, I. 974 Rebelo, S.T., s e e Easterly, W. 703 Rebelo, S.T., s e e Gomes, J. 994, 1159 Rebelo, S.T., s e e King, R.G. 9, 54, 369, 391, 429, 435, 545, 549, 649, 672, 711-713, 929, 932, 945, 954, 995, 1062, 1140 Rebelo, S.T., s e e Stokey, N.L. 578, 583, 672, 709, 711,714, 954 Redish, A. 154, 155, 166 Redish, A., s e e Betts, C.M. 217 Redmond, J. 161 Reichenstein, W. 101 Reichlin, L., s e e Evans, G.W. 1125 Reichlin, L., s e e Lippi, M. 217 Reid, B.G., s e e Boothe, EM. 1658 Reinhart, C.M. 1545, 1546, 1551, 1553, 1561, 1572, 1573 Reinhart, C.M., s e e Calvo, G.A. 1538, 1539, 1552, 1588, 1600 Reinhart, C.M., s e e Kaminsky, G.L, 1553, 1590 Reinhart, C.M., s e e Ostry, J. 1568 Renelt, D., s e e Levine, R. 269, 277-282, 390, 423, 671,694 Reserve Bank of New Zealand 1500 Resnick, L.B., s e e Levine, J. 1332 Restoy, E 1272 Revelli, R., s e e Contini, B. 1177, 1178, 1180, 1200, 1222 Revenga, A., s e e Blanchard, O.J. 1214 Rey, E, s e e Aghion, E 1157 Ricardo, D. 1642 Rich, G. 1514 Richard, S.E, s e e Hansen, L.E 556 Richards, S., s e e Meltzer, A.H. 1466
1-23 Rietz, T. 1252, 1272, 1296 Riley, J. 1461, 1465 Rios-Rull, J. 943 Rios-Rull, J.-V., s e e Castafieda, A. 380 Rios-Rtdl, J.-V. 380 Rios-Rull, V., s e e Krusell, E 1445, 1473 Ritter, J.R. 1321 Ritter, J.R., s e e Ibbotson, R. 1321 Rivers, D. 840 Rivlin, T.J. 343 Rob, R., s e e Jovanovic, B. 702 Rob, R., s e e Kandori, M. 475 Robb, R., s e e Heckman, J.J. 752 Robbins, H. 476, 478 Roberds, W., s e e Hansen, L.E 573, 574 Roberts, H.V. 1307 Roberts, J., s e e Milgrom, P. 475 Roberts, J.M. 1013, 1033, 1040, 1116, 1118, 1505 Roberts, J.O., s e e Lebow, D.E. 215 Roberts, K. 1466 Robertson, J.C., s e e Pagan, A.R. 69, 108 Robinson, D. 217 Robinson, J. 1054, 1120 Robinson, S., s e e Meltzer, A.H. 204, 216, 217, 222 Rockafellar, R.T. 325 Rockoff, H. 155, 157 Rockoff, H., s e e Bordo, M.D. 160 Rodriguez, C.A. 1562, 1563, 1565, 1568 Rodriguez-Clare, A., s e e Klenow, EJ. 663, 673, 679, 680, 683-686, 694, 702, 705, 707 Rodrik, D., s e e Alesina, A. 278, 692, 1466, 1469 Rogers, C. 1449, 1450 Rogers, D., s e e Fullerton, D. 576, 588, 616 Rogerson, R. 551,602, 976-978, 1145 Rogerson, R., s e e Benhabib, J. 402, 550, 1145 Rogerson, R., s e e Bertola, G. 1222 Rogerson, R., s e e Cho, J.O. 976 Rogerson, R., s e e Cole, H.L. 1163, 1194, 1201-1203, 1207 Rogerson, R., s e e Greenwood, J. 550, 995 Rogerson, R., s e e Hopenhayn, H. 672, 708, 994 Rogerson, R., s e e Parente, S.L. 702 Rogoff, K. 961, 1415-1418, 1420, 1422, 1425, 1429, 1432, 1434, 1438 Rogoff, K., s e e Bulow, J. 1448, 1449 Rogoff, K., s e e Canzoneri, M.B. 1507, 1508
Rogoff, K., see Obstfeld, M. 407, 1507, 1590, 1630
Rojas-Suarez, L. 1575
Roldos, J. 1578
Roll, R. 1328
Romer, C.D. 6, 69, 92, 137, 183, 187, 204, 205, 1618
Romer, D. 237, 643, 649, 651, 661, 930, 1013, 1034, 1140, 1157, 1163, 1635, 1661
Romer, D., see Ball, L. 1023, 1037, 1041, 1127
Romer, D., see Frankel, J.A. 280, 281
Romer, D., see Mankiw, N.G. 244-246, 252-255, 269-271, 277-279, 289, 653, 655, 660, 673, 679-683, 685, 686, 1638
Romer, D.H., see Romer, C.D. 69, 92, 137
Romer, P.M. 238, 245, 260, 261, 264, 265, 271, 278, 280, 398, 424, 425, 641, 651, 665, 672, 705-707, 715-717, 719, 1638
Romer, P.M., see Evans, G.W. 425, 426, 506, 521
Rose, A., see Akerlof, G.A. 1200
Rose, A.K., see Eichengreen, B. 1590
Rose, A.K., see Frankel, J.A. 1590
Rosen, A., see Meehl, P. 1319
Rosen, S. 584, 585, 976
Rosensweig, J.A. 1659
Rosenthal, H., see Alesina, A. 1425, 1426
Roseveare, D. 1626
Ross, L. 1319
Ross, S., see Brown, S. 1242
Ross, S.A. 1331
Rossana, R.J. 879, 881, 886, 907
Rossana, R.J., see Maccini, L.J. 881, 893, 894, 903, 907
Rossi, P.E., see Jones, L.E. 380, 672, 711-713, 1675, 1711
Rotemberg, J.J. 67, 68, 395, 397, 406, 407, 423, 429, 434, 838, 910, 974, 996, 1020, 1033, 1034, 1036, 1040, 1041, 1043, 1044, 1055, 1056, 1058, 1062, 1063, 1067-1069, 1074, 1081, 1082, 1088-1090, 1092, 1093, 1106, 1107, 1114, 1116, 1118, 1123-1125, 1129, 1143, 1144, 1365, 1464, 1492, 1494, 1497
Rotemberg, J.J., see Mankiw, N.G. 785
Rotemberg, J.J., see Pindyck, R. 1072
Rotemberg, J.J., see Poterba, J.M. 159
Rothschild, M. 823
Rotwein, E. 1011
Roubini, N. 1439, 1465 Roubini, N., s e e Alesina, A. 277-279, 1404, 1423, 1425, 1460, 1466, 1471 Roubini, N., s e e Grilli, V. 95 Roubini, N., s e e Kim, S. 95 Rouwenhorst, K.G. 1296 Royer, D., s e e Balasko, Y. 506 Rubinstein, A. 1188 Rubinstein, A., s e e Binmore, K.G. 1188 Rubinstein, M. 554-556 Rubinstein, M., s e e Jackwerth, J.C. 1310 Rudd, J.B., s e e Blinder, A.S. 1018, 1118 Rudebusch, G.D. 69, 104, 196, 1493 Rudebusch, G.D., s e e Diebold, EX. 6 Rudebusch, G.D., s e e 0liner, S.D. 137, 820, 1374, 1376 Rudebusch, R.G. 11 Rudin, J. 1040 Ruhm, C. 1152 Runkle, D., s e e Glosten, L. 1280 Runkle, D., s e e Keane, M.E 608, 609, 786, 790 Runkle, D.E. 789, 790, 1655 Runkle, D.E., s e e Geweke, J.E 89 Runkle, D.E., s e e Mankiw, N.G. 135 Russek, ES., s e e Barth, J.R. 1657 Rust,, J., s e e Amman, H.M. 535 Rust, J. 314, 317, 336 Rustichini, A., s e e Benhabib, J. 400, 847, 1449, 1467, 1472 Rustichini, A., s e e Boldrin, M. 400, 1465 Ryder, H. 587 Ryder Jr, H.E. 1284
Sabelhaus, J., see Gokhale, J. 750
Sachs, J. 1590, 1591
Sachs, J., see Bruno, M. 1090
Sachs, J., see Roubini, N. 1439, 1465
Sachs, J.D. 252, 703
Sack, B., see Galeotti, M. 909
Sadka, E., see Razin, A. 1715
Sahay, R. 1535
Sahay, R., see Fischer, S. 1538, 1547, 1561
Saint Marc, M. 222, 223
Saint-Paul, G. 1162, 1472
Saint-Paul, G., see Blanchard, O.J. 1214
Sakellaris, P., see Barnett, S. 831
Sala-i-Martin, X. 269, 277, 279-282, 659, 694
Sala-i-Martin, X., s e e Barro, R.J. 237, 245, 246, 252, 269, 271, 272, 278, 284, 643, 651,657, 659, 671,675, 1637 Salge, M. 499 Salmon, , R, s e e Kirman, A.P. 536, 539-541 Salmon, C.K., s e e Haldane, A.G. 1485, 1497 Salmon, M. 525 Saloner, G., s e e Rotemberg, J.J. 910, 1058, 1093 Salter, W.E.G. 848 Sampson, L., s e e Fauvel, Y. 1573 Samuelson, P.A. 46, 643, 661, 1311, 1634 Samwick, A. 609 Samwick, A.A., s e e Carroll, C.D. 567 Sandmo, A., s e e Atkinson, A.B. 1718 Sandroni, A. 1293 Sanguinetti, P. 1540 Sanguinetti, E, s e e Heymann, D. 506 Sanguinetti, R, s e e Jones, M. 1540 Santaella, J. 1543 Santos, M., s e e Caballe, J. 578 Santos, M.S. 321-323, 326, 327, 335, 353, 354, 382, 590, 1266 Santos, M.S., s e e Bona, J.L. 313 Santos, M.S., s e e Ladron de Guevara, A. 317 Santos, M.S., s e e Peralta-Alva, A. 374 Sargent, T. 162, 198, 929 Sargent, T., s e e Ljungqvist, L. 1214 Sargent, T., s e e Lucas Jr, R.E. 582 Sargent, T., s e e Marimon, R. 455, 523 Sargent, T.J. 73, 121, 135, 417, 418, 453, 455, 457, 458, 464, 465, 489, 504, 523, 524, 529-531,763, 888, 1023, 1024, 1145, 1506, 1507, 1519, 1542, 1543, 1630, 1631 Sargent, T.J., s e e Anderson, E.W. 368, 369 Sargent, T.J., s e e Cho, I.-K. 455, 465, 524, 525 Sargent, T.J., s e e Evans, G.W. 530 Sargent, T.J., s e e Hansen, L.R 558, 573, 574, 882, 915, 1294, 1295 Sargent, T.J., s e e Marcet, A. 454, 464, 465, 468, 473476, 480, 494, 499, 525, 528, 529, 532, 1675, 1705, 1707 Sattinger, M. 577, 578 Saunders, A. 181 Sauvy, A. 222 Savage, L.J. 1308, 1324 Savage, L.J., s e e Friedman, M. 1325 Savastano, M.A. 1589 Savastano, M.A., s e e Masson, RR. 1554, 1588
Savin, N., see Bray, M. 454, 465, 466, 473, 475, 527
Savin, N., see Ingram, B. 984
Savin, N.E., see McManus, D.A. 908
Savouri, S., see Jackman, R. 1221
Sayers, R.S. 156
Sbordone, A. 983
Sbordone, A., see Cochrane, J. 1120
Sbordone, A.M. 1078, 1099, 1108, 1118, 1128
Scammell, W.M. 156
Scarpetta, S. 1214
Schaling, E. 1437
Schaling, E., see Eijffinger, S. 1432, 1438
Schaller, H., see Moore, B.J. 455
Scharfstein, D., see Chevalier, J.A. 1122, 1123
Scharfstein, D., see Hoshi, T. 1344
Scheinkman, J., see Ekeland, I. 1689
Scheinkman, J., see Heckman, J.J. 579
Scheinkman, J.A. 566
Scheinkman, J.A., see Benveniste, L.M. 321
Schiantarelli, F., see Galeotti, M. 909, 1086, 1124
Schmidt, P., see Kwiatkowski, D. 212
Schmidt-Hebbel, K., see Easterly, W. 1538
Schmitt-Grohé, S. 406, 407, 416, 418, 429, 431, 435
Schmitt-Grohé, S., see Benhabib, J. 419, 421, 423
Schmitz Jr, J.A. 672, 695-697, 699
Schnadt, N., see Capie, F. 154
Scholes, M., see Black, F. 1310, 1331
Scholz, J.K., see Gale, W.G. 1646
Schönhofer, M. 515
Schönbach, P., see Erlich, D. 1314
Schotter, A. 1415
Schuh, S. 877, 881, 912
Schuh, S., see Davis, S.J. 1151, 1152, 1160, 1161, 1178, 1194, 1199
Schuh, S., see Fuhrer, J.C. 905, 908
Schuh, S., see Humphreys, B.R. 909
Schultz, T.W. 653
Schumaker, L.L. 344, 345
Schwartz, A., see Thaler, R.H. 1313
Schwartz, A.J. 156, 161, 173, 180, 204, 1515
Schwartz, A.J., see Bordo, M.D. 159, 165, 184, 194, 203, 204, 208, 217, 1404, 1590
Schwartz, A.J., see Darby, M.R. 166
Schwartz, A.J., see Friedman, M. 61, 137, 154, 162, 172, 176, 179, 180, 185, 189, 222
1-26 Schwert, G.W. 1236, 1280 Schwert, G.W., s e e French, K. 1280 Seater, J.J. 1621, 1654, 1656, 1657 Sedlacek, G., s e e Heckrnan, J.J. 578, 579 Sedlacek, G.J., s e e Hotz, VJ. 792, 803 Segal, I.B. 1157 Senhadji, A.S., s e e Diebold, EX. 11 Sentana, E., s e e King, M. 1333 Seppala, J., s e e Marcet, A. 1675, 1705, 1707 Seslnick, D. 746, 751 Shafir, E. 1316, 1324, 1329 Shafir, E., s e e Tversky, A. 1324 Shapiro, C. 1157 Shapiro, C., s e e Farrell, J. 1121 Shapiro, M. 938, 980 Shapiro, M., s e e Barsky, R. 558, 564, 565 Shapiro, M.D. 138, 818, 1069, 1075, 1655 Shapiro, M.D., s e e Dominguez, K. 182 Shapiro, M.D., s e e Mankiw, N.G. 135 Shapiro, M.D., s e e Ramey, VA. 67, 1089 Sharma, S., s e e Masson, ER. 1554, 1588 Sharpe, S. 1344 Shaw, E.S., s e e Gurley, J.G. 1507 Shaw, K. 584 Shay, R.R, s e e Juster, ET. 777 Shea, J. 402, 608, 790, 983, 1117 Sheffrin, S.M., s e e Driskill, R.A. 1042 Shefiin, H. 1313, 1317, 1321, 1330 Shell,, K., s e e Barnett, W. 540 Shell, K. 389, 391,516 Shell, K., s e e Balasko, Y. 427 Shell, K., s e e Cass, D. 389, 516, 662 Shepard, A., s e e Borenstein, S. 1124 Sherali, D.H., s e e Bazaraa, M.S. 331 Sheshinski, E. 1031, 1037 Sherry, C.M., s e e Bazaraa, M.S. 331 Shiller, R.J. 173, 1234, 1235, 1238, 1249, 1290, 1316, 1317, 1319, 1320, 1323, 1324, 1327, 1330-1332 Shiller, R.J., s e e Campbell, J.Y. 1235, 1265, 1280, 1320 Shiller, R.J., s e e Case, K.E. 1323 Shiller, R.J., s e e Grossman, S.J. 1242, 1246, 1268, 1291 Shin, M.C., s e e Puterman, M.L. 339 Shin, Y., s e e Im, K. 283 Shin, Y., s e e Kwiatkowski, D. 212 Shleifer, A. 1317, 1324 Shleifer, A., s e e Barberis, N. 1294, 1322 Shleifer, A., s e e Bernheim, B.D. 1646 Shleifer, A., s e e DeLong, J.B. 1290, 1324
Shleifer, A., s e e La Porta, R. 1240 Shleifer, A., s e e Lakonishok, J. 1323 Shleifer, A., s e e Lee, C. 1324 Shleifer, A., s e e Murphy, K.M. 262, 278, 1082 Shoemaker, C.A., s e e Johnson, S.A. 345, 381 Shor, N.Z. 331 Shoven, J.B. 705, 708 Shoven, J.B., s e e Ballard, C. 1639 Sibert, A., s e e Rogoff, K. 1416, 1417, 1420, 1425 Sichel, D., s e e Oliner, S.D. 820 Siegel, J.J. 1312, 1313 Silberman, J. 1316 Simkins, S. 931 Simmons, B. 163 Simon, H.A., s e e Holt, C.C. 882, 885, 888, 909, 910, 912 Simons, H.C. 852, 1485 Simonsen, M.H., s e e Dornbusch, R. 1543, 1565 Sims, C.A. 34, 44, 69, 83, 93, 95, 99, 105, 121, 128, 129, 131, 132, 134, 144, 397, 418, 539, 673,694, 1509, 1518, 1520, 1631 Sims, C.A., s e e Hayashi, E 788 Sims, C.A., s e e Leeper, E.M. 69, 74, 83, 93, 101, 128, 132, 134, 1036, 1089, 1369 Sinai, A., s e e Eckstein, O. 1344 Singer, B. 292 Singer, B., s e e Heckman, J.J. 1166 Singleton, K. 1270 Singleton, K.J., s e e Duma, K.B. 800, 1284 Singleton, K.J., s e e Hansen, L.E 547, 555, 556, 768, 769, 784, 882, 1234, 1246, 1250, 1261 Siow, A., s e e Altonji, J.G. 789 Skinner, B.E 1328 Skinner, J.S. 771,772 Skinner, J.S., s e e Hubbard, R.G. 567, 569, 572, 573,593,771,776, 794, 797, 1660 Slade, M.E. 1015 Slemrod, J., s e e Shapiro, M.D. 1655 Slovic, E, s e e Fischhoff, B. 1319 Small, D.H., s e e Hess, G.D. 1485, 1509 Small, D.H., s e e Orphanides, A. 1485 Smetters, K.A. 1647 Smith, A.A., s e e Krusell, E 380, 547, 566, 567, 994 Smith Jr, A.A., s e e Krusell, E 1293 Smith, C.W., s e e Nance, D.R. 1318 Smith, E.L. 1312
Smith, G.W., s e e Devereux, M. 952 Smith, G.W., s e e Gregory, A.W. 376, 377 Smith, R., s e e Alogoskoufis, G.S. 166, 214 Smith, R.E, s e e Lee, K. 284 Smithson, C.W., s e e Nance, D.R. 1318 Snower, D., s e e Blanchard, O.J. 1214 Soares, J., s e e Cooley, T.E 1463 S6derlind, P., s e e Hassler, J. 9, 1238 S6derstr6m, T., s e e Ljung, L. 476 Soerensen, J.E 528 Solnick, A., s e e Judd, K.L. 340 Solon, G. 579, 1058, 1102, 1106 Solon, G., s e e Barsky, R. 43 Solow, R.M. 237, 244, 246, 257, 643, 656, 664, 681, 929, 930, 942, 950-952, 1140, 1207, 1638 Solow, R.M., s e e Blanchard, O.J. 1214 Solow, R.M., s e e Blinder, A.S. 1660 Solow, R.M., s e e Hahn, E 661 Solow, R.M., s e e Samuelson, EA. 46 Sommariva, A. 222 Sonnenschein, , H., s e e Hildenbrand, W. 535, 537 Sorger, G., s e e Hommes, C.H. 529, 532 Souleles, N., s e e Jappelli, T. 790 Spear, S.E. 465 Spear, S.E., s e e Marimon, R. 455, 531 Spiegel, M.M., s e e Benhabib, J. 283 Spilerman, S., s e e Singer, B. 292 Spulber, D., s e e Caplin, A.S. 801, 1031, 1032 Spynnewin, E 803 Srba, E, s e e Davidson, J. 750 Srinivasan, T.N. 705 Stacchetti, E., s e e Jones, L.E. 720 Stafford, E, s e e Holbrook, R. 569 Stafford, E, s e e Ryder, H. 587 Staiger, D. 49, 50 Staiger, R. 1415 Staiger, R.W., s e e Bagwell, K. 1125 Stambaugh, R.E, s e e French, K. 1280 Stambaugh, R.E, s e e Kandel, S. 1235, 1252, 1253, 1265, 1270, 1272 Stark, T., s e e Croushore, D. 1485 Starr, R.M., s e e Chah, E.Y. 775 Startz, R., s e e Nelson, C.R. 1264 Statman, M., s e e Shefrin, H. 1313, 1317, 1330 Stedinger, J.R., s e e Johnson, S.A. 345,381 Stein, J.C., s e e Kashyap, A.K. 137, 881,912, 1344, 1374, 1376 Stengel, R.E 904
1-27 Stephen, P., s e e Ryder, H. 587 Sterling, A., s e e Modigliani, E 1656, 1657 Stigler, G. 1018 Stigler, G.J. 1173 Stigler, S.M. 275 Stiglitz, J., s e e Dixit, A. 1115, 1121, 1126 Stiglitz, J., s e e Greenwald, B. 857, 1122, 1377 Stiglitz, J., s e e Jaffee, D.M. 1376 Stiglitz, J.E. 1675, 1696, 1718 Stiglitz, J.E., s e e Atkinson, A.B. 1673, 1676, 1680, 1682, 1718 Stiglitz, J.E., s e e Shapiro, C. 1157 Stock, J.H. 9, 11, 39, 43, 45, 50-54, 821, 878, 919, 934, 938, 939, 1011, 1021, 1404, 1674 Stock, J.H., s e e Feldstein, M. 44, 1485, 1497, 1498 Stock, J.H., s e e King, R.G. 54, 941 Stock, J.H., s e e Staiger, D. 49, 50 Stockman, A. 1578 Stockman, A.C. 549 Stockman, A.C., s e e Baxter, M. 203, 938, 1404 Stockman, A.C., s e e Darby, M.R. 166 Stockman, A.C., s e e Gavin, W. 1485 Stockman, A.C., s e e Ohanian, L.E. 1036 Stocks, Bonds, Bills and Inflation 1639 Stockton, D.J., s e e Lebow, D.E. 215, 1016 Stoer, J. 334 Stoker, T., s e e Blundell, R. 770, 788 Stokey, N., s e e Alvarez, E 996 Stokey, N., s e e Lucas Jr, R.E. 559, 561 Stokey, N., s e e Milgrom, E 1322 Stokey, N.L. 271,299, 314, 318-321,346, 578, 583, 672, 705, 709, 711, 714, 951, 954, 998, 999, 1674 Stokey, N.L., s e e Lucas, R.E. 380, 1446, 1449 Stokey, N.L., s e e Lucas Jr, R.E. 158, 1673, 1675, 1699, 1723, 1728 Stone, C.J., s e e Breiman, L. 289 Strang, G. 82 Strongin, S. 83-85, 87, 114 Strotz, R.H. 1653 Strotz, R.H., s e e Eisner, R. 1310 Stroud, A.H. 334 Stuart, A. 1485 Stulz, R.M. 1317 Sturzenegger, E, s e e Dornbusch, R. 1543 Sturzenegger, E, s e e Guo, J.-T. 427 Sturzenegger, E, s e e Mondino, G. 1540
1-28 Suarez, J. 1378 Subrahmanyam, A., s e e Daniel, K. 1322 Sugden, R., s e e Loomes, G. 1313 Suits, D., s e e Kallick, M. 1325 Summers, L.H. 961 Summers, L.H., s e e Abel, A.B. 1266, 1651 Summers, L.H., s e e Alesina, A. 1432 Summers, L.H., s e e Bernheim, B.D. 1646 Summers, L.H., s e e Blanchard, O.J. 416, 1635 Summers, L.H., s e e Carroll, C.D. 759, 793, 1655 Summers, L.H., s e e Clark, K.B. 602, 1173 Summers, L.H., s e e Cutler, D.M. 1290, 1320, 1321 Summers, L.H., s e e DeLong, J.B. 279, 695, 1042, 1290, 1324 Summers, L.H., s e e Easterly, W. 277, 278, 281, 675 Summers, L.H., s e e Kotlikoff, L.J. 780, 1646 Summers, L.H., s e e Mankiw, N.G. 785 Summers, L.H., s e e Poterba, J.M. 1235, 1320, 1648 Summers, R. 238, 301, 640, 673 675, 677, 680, 681,689, 720 Sun, I". 1270 Sundaram, R.K., s e e Dutta, RK. 380 Sundaresan, S.M. 1284 Sunder, S., s e e Marimon, R. 455, 472, 531 Surekha, K. 908 Sussman, O., s e e Suarez, J. 1378 Svensson, , L.E.O., s e e Leiderman, L. 1432, 1438 Svensson, J. 1466, 1471, 1472 Svensson, L.E.O. 156, 197, 417, 1033, 1034, 1273, 1411, 1432, 1434, 1489, 1493, 1494, 1498, 1504 Svensson, L.E.O., s e e Englund, R 9 Svensson, L.E.O., s e e Kotlikoff, L. 1448, 1449, 1465 Svensson, L.E.O., s e e Leiderman, L. 1495 Svensson, L.E.O., s e e Persson, M. 1447, 1449 Svensson, L.E.O., s e e Persson, T. 1449, 1450, 1454, 1456, 1465 Swagel, R, s e e Alesina, A. 277-279, 1460, 1466, 1471 Swan, T.W. 244, 246, 247, 643 Sweeney,, J., s e e Kneese, A. 656 Swoboda, A., s e e Genberg, H. 165 Symansky,, S.A., s e e Bryant, R.C. 1491, 1497, 1516
Szafarz, A., s e e Adam, M. 500 Szafarz, A., s e e Broze, L. 487, 488 Tabellini,, G., s e e Persson, I". 1400 Tabellini, G. 1414, 1415, 1450, 1456, 1464, 1465 Tabellini, G., s e e Alesina, A. 1446, 1449, 1450, 1454, 1465, 1518, 1522 Tabellini, G., s e e Cukierman, A. 1456, 1465 Tabellini, G., s e e Daveri, E 1220 Tabellini, G., s e e Edwards, S. 1538 Tabellini, G., s e e Grilli, V 1404, 1432, 1438, 1439, 1465 Tabellini, G., s e e Ozler, S. 1457, 1465 Tabellini, G., s e e Persson, T. 278, 692, 1403, 1413, 1415-1418, 1420, 1421, 1425, 1433, 1435, 1437-1440, 1442, 1445, 1448, 1449, 1459, 1460, 1466, 1469, 1470, 1490 Taber, C., s e e Heckman, J.J. 576, 578, 582, 584, 586, 587, 590, 592, 593 Taguas, D., s e e Blanchard, O.J. 1214 Tallarini Jr, T.D., s e e Hansen, L.E 558, 1294, 1295 Tallman, E.W., s e e Rosensweig, J.A. 1659 Talvi, E. 1543, 1571, 1604 Tan, K.S. 334 Tanner, S., s e e Banks, J. 758, 792 Tanzi, V 1741 Tarshis, L. 939, 1059 Tauchen, G. 367 Taylor, A., s e e Obstfeld, M. 164, 165 Taylor, C. 1330 Taylor, J.B. 46, 182, 314, 397, 408, 417, 422, 454, 474, 487, 489, 495, 545, 1011, 1013, 1015, 1017, 1025, 1027-1031, 1037-1039, 1042, 1043, 1113, 1364, 1411, 1485, 1487, 1488, 1490, 1497, 1505, 1507, 1512, 1513, 1516, 1518, 1542, 1582 Taylor, J.B., s e e Phelps, E. 1025, 1026 Taylor, L.D., s e e Houthakker, H.S. 803 Taylor, S.E. 1330 Tejada-Guibert, J.A., s e e Johnson, S.A. 345, 381 Teles, E, s e e Correia, I. 1537, 1675, 1720, 1733 Telmer, C.I., s e e Backus, D.K. 1316 Temin, R 162, 179, 180, 183, 184 Temple, J. 276 Terlizzese, D., s e e Guiso, L. 772 Terna, P., s e e Beltratti, A. 524, 525 Terrones, M. 1425
Teruyama, H., s e e Fukuda, S.-i. 875 Tesar, L., s e e Mendoza, E. 1439 Tesar, L., s e e Stockman, A.C. 549 Tetlow, R., s e e Fillion, J.E 1498 Teukolsky, S.A., s e e Press, W.H. 329-334, 343, 348, 356, 365 Thaler, R., s e e Froot, K. 1316 Thaler, R., s e e Lee, C. 1324 Thaler, R.H. 1313, 1317 Thaler, R.H., s e e Benartzi, S. 1290, 1312, 1313 Thaler, R.H., s e e De Bondt, W.E 1307, 1320, 1323 Thaler, R.H., s e e Shefrin, H. 1317 Thaler, R.H., s e e Siegel, J.J. 1312 The Economist 1238, 1632 Theunissen, A.J., s e e Whittaker, J. 1508 Thomas, J. 994 Thomas, J.K., s e e Bernard, V.L. 1321 Thomas, T.J. 161 Thompson, S.C., s e e Taylor, S.E. 1330 Thomson, J.B., s e e Carlson, J.B. 104 Thornton, H. 1485 Thurow, L. 759 Tieslau, M.A., s e e Hoffman, D.L. 412 Tillmann, G. 474 Timberlake, R.H. 169, 174 Timmermarm, A.G. 454, 455, 500, 530 Tinbergen, J. 817 Tirole, J. 1266, 1650 Tirole, J., s e e Fudenberg, D. 1155 Tirole, J., s e e Holmstrom, B. 1376 Titman, S., s e e Jegadeesh, N. 1321 Tobin, J. 773, 817, 818, 1643 Tobin, J., s e e Brainard, W.C. 817 Tobin, J . , s e e Eichengreen, B. 168 Todd, E, s e e Heckman, J.J. 578, 582 Todd, R., s e e Christiano, L.J. 1365 Toharia, D., s e e Blanchard, O.J. 1214 Toma, M. 174, 177, 187, 190 Toma, M., s e e Goff, B.L. 159 Tommasi, M. 1540 Tommasi, M., s e e Jones, M. 1540 Tommasi, M., s e e Mondino, G. 1540 Topel, R. 578 Topel, R., s e e Juhn, C. 619 Topel, R., s e e Murphy, K. 581 Tornell, A. 1466, 1472, 1590 Tornell, A., s e e Lane, E 1472 Tornell, A., s e e Sachs, J. 1590, 1591
1-29 Townsend, R.M. 453, 461,474, 529, 795, 796, 1350, 1376 Townsend, R.M., s e e Phelan, C. 380, 575, 796 Traub, J.E 338 Traub, J.E, s e e Papageorgiou, A. 334 Trehan, B. 159 Tria, G., s e e Felli, E. 1083, 1122 Triffin, R. 157, 165 Trostel, EA. 1652 Tryon, R., s e e Brayton, E 1043, 1344, 1485 Tsiddon, D. 1031 Tsiddon, D., s e e Lach, S. 1019 Tsitsiklis, J.N., s e e Chow, C.-S. 326, 334 Tsutsui, Y., s e e Shiller, R.J. 1316 Tullio, G. 156 Tullio, G., s e e Sommariva, A. 222 Tullock, G., s e e Grief, K.B. 253 Tuncer, B., s e e Krueger, A.O. 699 Turnovsky, S. 474 Tversky, A. 1308, 1315, 1319, 1324, 1330 Tversky, A., s e e Kahneman, D. 1308, 1309, 1311 Tversky, A., s e e Quattrone, G.A. 1329 Tversky, A., s e e Shafir, E. 1316, 1324, 1329 Tversky, A., s e e Thaler, R.H. 1313 Tybout, J., s e e Corbo, V 1543 Tylor, E.B. 1331 Uhlig, H. 70 Uhlig, H., s e e Lettan, M. 524, 1297 Uhlig, H., s e e Taylor, J.B. 314 United Nations 681 Uppal, R., s e e Dumas, B. 564 Uribe, M. 1539, 1578, 1589 Uribe, M., s e e Benhabib, J. 419, 421,423 Uribe, M., s e e Mendoza, E. 1571, 1579 Uribe, M., s e e Schmitt-Groh6, S. 416, 418, 431 US Bureau of the Census 585 Uzawa, H. 578, 651, 710 Valdes, R., s e e Dornbusch, R. 1590 Valdivia, V., s e e Christiano, L.J. 504 Van Huyck, J.B., s e e Grossman, H.J. 158, 1415, 1449 van Wincoop, E., s e e Beaudry, R 1264 Van Zandt, T., s e e Lettau, M. 470, 472 Vasicek, O. 1270 V6gh, C., s e e Guidotti, RE. 1675, 1720
1-30 V6gh, C.A. 1535, 1538, 1542, 1543, 1546, 1550, 1554, 1588 V6gh, C.A., s e e Bordo, M.D. 158 V~gh, C.A., s e e Calvo, G.A. 1428, 1535, 1538, 1539, 1546, 1554, 1557, 1563, 1564, 1568, 1571, 1572, 1582, 1587-1589, 1597, 1605 V~gh, C.A., s e e De Gregorio, J. 1546, 1551, 1573, 1575, 1577 V~gh, C.A., s e e Edwards, S. 1578-1580 V6gh, C.A., s e e Fischer, S. 1538, 1547, 1561 V~gh, C.A., s e e Guidotti, RE. 1537, 1588, 1603 V6gh, C.A., s e e Hoffmaister, A. 1561, 1589 V6gh, C.A., s e e Lahiri, A. 1597 V6gh, C.A., s e e Rebelo, S.T. 1546, 1568, 1578, 1579, 1581, 1606 V6gh, C.A., s e e Reinhart, C.M. 1545, 1546, 1551, 1553, 1561, 1572, 1573 V6gh, C.A., s e e Sahay, R. 1535 Vela, A., s e e Santaella, J. 1543 Velasco, A. 416, 1446, 1449, 1450, 1459, 1465, 1540 Velasco, A., s e e Sachs, J. 1590, 1591 Velasco, A., s e e Tommasi, M. 1540 Velasco, A., s e e Tornell, A. 1466, 1472, 1590 Venable, R., s e e Levy, D. 1014, 1015, 1019 Venegas-Martinez, E 1571 Ventura, G., s e e Huggett, M. 380 Veracierto, M. 994 Verdier, T., s e e Saint-Paul, G. 1472 Vetterling, WT., s e e Press, WH. 329-334, 343, 348, 356, 365 Viana, L. 1543 Vickers, J. 1414, 1415 Vigo, J., s e e Santos, M.S. 321,322, 326, 327, 335 Vinals, J., s e e Goodhart, C.E.A. 1438, 1495 Vishny, R.V~, s e e Barbefis, N. 1294, 1322 Vishny, R.W., s e e La Porta, R. 1240 Vishny, R.W., s e e Lakonishok, J. 1323 Vishny, R.W., s e e Murphy, K.M. 262, 278, 1082 Vishny, R.W., s e e Shleifer, A. 1324 Visseher, M., s e e Prescott, E.C. 700 Vives, X. 474, 532 Vires, X., s e e Jun, B. 474 Volcker, RA. 1630 von Furstenberg, G.M. 1333 von Hagen, J. 1439, 1460, 1465 yon Hagen, J., s e e Eichengreen, B. 1465 von Hagen, J., s e e Fratianni, M. 1431
yon Hagen, J., s e e Hallerberg, M. 1460, 1465 von Weizs/icker, C. 641,650, 657 Vredin,, A.E., s e e Bergstr6m, V. 538 Vuong, Q.H., s e e Rivers, D. 840 Waehtel, E 1658 Wachtel, E, s e e Evans, M. 182 Wachter, S.M., s e e Goetzmann, W.N. 1333 Wadhwani, S., s e e King, M. 1333 Wagner, R.E., s e e Buchanan, J.M. 1631 Waldmann, R.J., s e e DeLong, J.B. 1290, 1324 Walk, H., s e e Ljung, L. 476 Walker, M., s e e Moreno, D. 481 Wallace, N., s e e Sargent, T.J. 417, 418, 489, 1024, 1506, 1507, 1519, 1630 Waller, C. 1431 Waller, C., s e e Fratianni, M. 1431 Wallis,, K., s e e Kreps, D.M. 540 Walsh, C.E. 1433, 1434, 1437, 1438, 1490 Walsh, C.E., s e e Trehan, B. 159 Walsh, C.E., s e e Waller, C. 1431 Wang, EA. 1322 Wang, J. 1237, 1293 Wang, L.-T., s e e Dezhbakhsh, H. 1039 Wang, T., s e e Dumas, B. 564 Warner, A.M., s e e Sachs, J.D. 252, 703 Warner, E.J. 1019 Wascher, W, s e e Lebow, D.E. 1016 Watson, J., s e e den Haan, WJ. 994, 1166, 1194, 1203, 1204, 1206, 1207 Watson, J., s e e Ramey, G. 852, 1157, 1159 Watson, M.W, 6, 50, 547, 931 Watson, M.W, s e e Bernanke, B.S. 144 Watson, M.W., s e e Blanchard, O.J. 1266 Watson, M.W, s e e Canjels, E, 55 Watson, M.W., s e e King, R.G. 46, 54, 939, 941 Watson, M.W., s e e Staiger, D. 49, 50 Watson, M.W., s e e Stock, J.H. 9, 43, 45, 5~52, 821, 878, 919, 934, 938, 939, 1011, 1021, 1404, 1674 Webb, S., s e e Goodman, A. 797 Webber, A., s e e Capie, E 222 Weber, G. 774 Weber, G., s e e Alessie, R. 774, 775 Weber, G., s e e Attanasio, O.E 611-613, 756, 769, 781, 783, 784, 787, 790, 791, 793, 794, 1264, 1655 Weber, G., s e e Blundell, R. 781 Weber, G., s e e Brugiavini, A. 775 Weber, G., s e e Meghir, C. 611,613, 775, 804
Weber, M. 1331 Weder, M. 403, 437 Wehrs, W., s e e Carlson, J.A. 904 Weibull, J., s e e Lindbeck, A. 1465 Weil, D.N., s e e Mankiw, N.G. 173, 216, 244246, 252-255,269-271,277-279, 289, 653, 655, 660, 673, 67%683, 685, 686, 1638 Weil, P. 547, 1235, 1250, 1253, 1256, 1647 Weil, E, s e e Blanchard, O.J. 1650 Weil, E, s e e Restoy, E 1272 Weingast, B., s e e North, D. 1449 Weinstein, M.M. 182 Weisbrod, S.R., s e e Rojas-Suarez, L. 1575 Weiss, A., s e e Greenwald, B. 1122 Weiss, L., s e e Scheinkman, J.A. 566 Weiss, Y. 583 Weiss, Y., s e e Blinder, A. 587 Weiss, Y., s e e Lillard, L. 569, 572 Weiss, Y., s e e Sheshinski, E. 1031, 1037 Weitzman, M.L. 1689 Welch, E 579 Welch, I., s e e Bikhchandani, S. 1332 Wen, J.E, s e e Devereux, M. 1466, 1471 Wen, L. 427, 431 Wenzelburger, J., s e e B6hm, V. 475 Werner, A., s e e Dornbusch, R. 1543, 1563, 1568 West, K.D. 871, 876, 880, 882, 885, 887, 888, 894, 896, 897, 900, 902, 905-908, 913, 919, 1028, 1041, 1320, 1497 Whalley, J. 705 Whalley, J., s e e Ballard, C. 1639 Whalley, J., s e e Shoven, J.B. 705, 708 Whalley, J., s e e Srinivasan, T.N. 705 Wheatley, S. 1242, 1261 Wheelock, D.C. 177, 179 Wheelock, D.C., s e e Calomiris, C.W. 187, 191 Whinston, M.D., s e e Segal, I.B. 1157 White, E., s e e Bordo, M.D. 159 White, E.N. 180 White, H. 524 White, H., s e e Chen, X. 476, 532 White, H., s e e Kuan, C.-M. 476 Whited, T. 1344 Whited, T., s e e Hubbard, R.G. 1344 Whiteman, C. 487 Whitt, W. 326 Whittaker, J. 1508 Wickens, M.R., s e e Robinson, D. 217 Wicker, E. 162, 176, 177, 179-181, 1543
1-31 Wicksell, K. 203, 1485, 1631 Wieland, V., s e e Orphanides, A. 1485 Wigmore, B.A. 163, 183 Wilcox, D. 1242 Wilcox, D., s e e Kusko, A.L. 1327 Wilcox, D.W. 1655 Wilcox, D.W., s e e Carroll, C.D. 769, 785 Wilcox, D.W., s e e Cecchetti, S.G. 876 Wilcox, D.W., s e e Kashyap, A.K. 137, 877, 886, 903,906, 912 Wilcox, D.W., s e e Orphanides, A. 198, 1485 Wilcox, D.W., s e e West, K.D. 908 Wildasin, D., s e e Boadway, R. 1463 Wilkinson, M. 881 Williams, J.C., s e e Brayton, E 1043, 1344, 1485 Williams, J.C., s e e Gilchrist, S. 847 Williams, J.C., s e e Wright, B.D. 347, 348 Williamson, J. 1597 Williamson, O.E. 852 Williamson, S. 1376 Willis, R., s e e Heckman, J.J. 602, 623 Wilson, B., s e e Saunders, A. 181 Wilson, C.A. 408 Wilson, R. 554, 796 Winter, S.G., s e e Phelps, E.S. 1121 Woglom, G. 1127 Wohar, M.E., s e e Fishe, R.P.H. 173 Wojnilower, A. 1344 Wolf, H., s e e Dornbusch, R. 1543 Wolf, H., s e e Ghosh, A.R. 202, 207, 208 Wolff, E. 664 Wolfowitz, J., s e e Kiefer, J. 476 Wolinsky, A. 1188 Wolinsky, A., s e e Binmore, K.G. 1188 Wolinskry, A., s e e Rubinstein, A. 1188 Wolman, A.L., s e e Dotsey, M. 974, 1032, 1043 Wolman, A.L., s e e King, R.G. 1036, 1041, 1043, 1364, 1367 Wolters, J., s e e Ttdlio, G. 156 Wong, K.-E 108 Wood, G.E., s e e Capie, E 163, 1438 Wood, G.E., s e e Mills, T.C. 204 Woodford, M. 389, 395, 406, 407, 409, 418, 421~423, 439, 454, 473-476, 481,483,507, 516, 518, 521,662, 1036, 1157, 1507, 1509, 1518-1520, 1537, 1630, 1675, 1676, 1720, 1731 Woodford, M., s e e Bernanke, B.S. 1361, 1363 Woodford, M., s e e Boldrin, M. 506
1-32 Woodford, M., s e e Farmer, R.E. 395, 396 Woodford, M., s e e Guesnerie, R. 439, 454, 460, 465, 474, 475, 506, 511, 516, 526 Woodford, M., s e e Kehoe, T.J. 380 Woodford, M., s e e Lucas Jr, R.E. 1023 Woodford, M., s e e Rotemberg, J.J. 67, 68, 395, 406, 407, 429, 434, 974, 996, 1020, 1041, 1043, 1044, 1055, 1056, 1062, 1063, 1067-1069, 1074, 1081, 1082, 1088-1090, 1092, 1093, 1106, 1107, 1118, 1123-1125, 1129, 1143, 1144, 1365, 1492, 1494, 1497 Woodford, M., s e e Santos, M.S. 1266 Woodward, EA., s e e Baker, J.B. 1125 Wooldridge, J., s e e Bollerslev, T. 1280 Wozniakowski, H., s e e Traub, J.E 338 Wright, B.D. 347, 348 Wright, M.H., s e e Gill, RE. 329 Wright, R. 1158 Wright, R., s e e Benhabib, J. 402, 550, 1145 Wright, R., s e e Boldrin, M. 399 Wright, R., s e e Burdett, K. 1196 Wright, R., s e e Greenwood, J. 550, 995 Wright, R., s e e Hansen, G.D. 976 Wright, R., s e e Kiyotaki, N. 524 Wright, R., s e e Parente, S.L. 702 Wright, R., s e e Rogerson, R. 978 Wurzel, E., s e e Roseveare, D. 1626 Wynne, M. 974 Wynne, M.A., s e e Huffman, G.W. 437 Wyplosz, C., s e e Eichengreen, B. 168, 1590 Xie, D. 425 Xie, D., s e e Benhabib, J. 425 Xie, D., s e e Rebelo, S.T. 952 Xu, Y. 344 Yashiv, E. 1200 Yellen, J.L., s e e Akerlof, G.A. 397, 1034, 1035, 1039, 1157, 1200 Yeo, S., s e e Davidson, J. 750 Yi, K.-M., s e e Kocherlakota, N.R. 271
Yin, G.G., s e e Kushner, H.J. 476 Yong, W, s e e Bertocchi, G. 474 Yorukoglu, M., s e e Cooley, T.E 847 Yorukoglu, M., s e e Greenwood, J. 576 Yotsuzuka, T. 1649 Young, A. 664, 672, 673, 687, 716 Yomag, J., s e e Wachtel, P. 1658 Yu, B., s e e Hashimoto, M. 1152 Yun, T. 1026, 1036 Zarazaga, C.E. 1540 Zarazaga, C.E., s e e Kydland, EE. 1557, 1561 Zarnowitz, V 9, 40 Zeckhauser, R., s e e Degeorge, E 1321 Zeckhauser, R.J., s e e Abel, A.B. 1266, 1651 Zeira, J., s e e Galor, O. 262, 263 Zejan, M., s e e Blomstrom, M. 277, 279, 280 Zeldes, S.R 566, 607-609, 771,789, 790, 802, 1344, 1655 Zeldes, S.P., s e e Barsky, R.B. 1653 Zeldes, S.R, s e e Hubbard, R.G. 567, 569, 572, 573, 593, 771,776, 794, 797 Zeldes, S.E, s e e Mankiw, N.G. 790, 1290 Zeldes, S.P., s e e Miron, J.A. 876, 907 Zeldes, S.P., s e e O'Connell, S.A. 1650 Zellner, A. 34 Zenner, M. 497 Zha, T., s e e Cushman, D.O. 95, 96 Zha, T., s e e Leeper, E.M. 69, 74, 83, 93, 101, 128, 132, 134, 1089, 1369 Zha, T., s e e Sims, C.A. 69, 83, 93, 99, 128, 129, 131, 132, 134, 144 Zhang, L., s e e Lockwood, B. 1411, 1415 Zhou, Z., s e e Grossman, S.J. 1237, 1293 Zhu, X. 1708 Zilcha, I., s e e Becker, R. 369 Zilibotti, E, s e e Gali, J. 405, 426 Zilibotti, E, s e e Marimon, R. 1214 Zin, S.E., s e e Epstein, L.G. 556, 558, 564, 744, 769, 1250, 1256 Zingales, L., s e e Kaplan, S.N. 856, 1344
SUBJECT INDEX
accelerator 884, 890, 896, 909 accelerator model 816, 817 accelerator motive 867, 902 activist vs. non-activist policies 1485 actual law of motion (ALM) 466, 472, 490, 511 adaptive expectations 453, 465 adaptive learning 464, 472, 493, 510 stability under 471 adaptively rational expectations equilibrium 532 adjustment costs 800, 1072 employment 1075 hours 1075 in investment 1296 non-convex 821, 839 production 867, 892, 893, 900 hazard 835, 836, 840 speed of 881, 889, 908 age distribution 753, 848 aggregate convexity 843 aggregate demand 1617, 1628, 1630 aggregate human capital 583, 590-593 aggregate productivity 1195 aggregate productivity shock 1204 heterogeneous 1214 aggregate shocks 578, 582, 865 aggregation 548-594, 604, 605, 614, 615, 745, 781, 804, 836, 849, 910 across commodities 782 AK model 672, 673, 709-715, 720, 733 allocation rules 1688, 1723 alternative dating 499 amplification 841, 1145, 1158, 1159, 1161 anchoring 1314-1317, 1322 animal spirits 395, 517, 521, 941 anomalies 1307, 1308, 1316, 1317, 1321, 1322, 1333, 1334 approximation error 326-345, 351-382 arbitrage 1246 ARMA models 489, 496, 501 Arrow-Debreu equilibrium 795
asset-price channel 1378 asset prices, variable 1356 asset pricing models with feedback 500 asset pricing with risk neutrality 498 associated differential equation 519 asymmetric fixed costs 825 asymmetry in adjustment of employment 1158 asymptotic stability 479, 639 autarky 853 automatic stabilizers 1660 average cohort techniques 787
backlog costs 884 backstop technology 656 balance-of-payments (BOP) crises 1534, 1535, 1553 balanced-budget rule 1631 balanced growth path 50, 392, 393, 424, 425, 427 band-pass filter, s e e BP filter bank lending channel 1376 Barro, R. 1640, 1642-1646 Bayesian learning 474 Bayesian updating 461,465 Belgium 1619 Bellman's Principle of Optimality 998 bequest motive 745, 780, 1624, 1646, 1647 strategic 1646 best practice 848 /3-convergence 659 Beveridge curve 1194, 1196, 1221, 1222 bilateral bargaining problem 1157 black market premium 671,688, 689, 691-694, 703 Blanchard-Kahn technique 505 Bolivia 1631 boom-recession cycle 1550, 1552, 1581 bootstrap methodology 79 BOP crises, s e e balance-of-payments crises borrowers' net worth 1345 1-33
1-34 borrowing constraint 566, 575, 593, 595, 597, 598, 772, 775, 1293 s e e also capital market imperfections; credit market imperfections; liquidity constraints Boschen-Mills index 13%142 bottlenecks 842, 843 bounded rationality 454, 464 BP (band-pass) filter 12, 933, 934 Bretton Woods 152, 153, 163-168, 188, 190, 192, 199, 202-204, 206~09, 211,213, 215, 218-220 Brownian motion 825, 845 regulated 845 bubble-free solution 1524 bubble solutions 1522 bubbles 499 explosive 499 budget deficit 1619 budget surplus 1619 buffer-stock saving 771, 1653, 1654 building permits 45 Burns-Mitchell business cycle measurement 932 business cycles 865, 927-1002, 1620, 1621, 1659 see also cycles; fluctuations in aggregate activity facts about 934, 938, 939, 956 general equilibrium models 67 in RBC model 968 measuring 932 persistence of 939 table of summary statistics 956, 957 US facts 934 USA 935 938, 956 Cagan model of inflation 497 calculation equilibrium 462 calibration 545, 550, 567, 601,614, 616 Canada 45 capacity utilization 41,427, 431,930 modeling of 980 rate of 981 steady-state rate of 984 capital 1617, 1687 broad measure 701 desired 816, 842 frictionless 832, 838 human 673, 678, 679, 681-687, 701, 710, 713, 714, 716 718, 720, 732, 734
Subject Index
s e e also human capital organizational 700, 701 physical 678-683, 701, 710, 713, 714, 721, 732 specific 1154 stock of 1629, 1630, 1632, 1633, 1636-1638, 1648, 1652, 1656 target 820 unmeasured 701,702 vintage 702 capital accumulation 942, 1203 general equilibrium nature of 946 optimal 946 perpetual inventory method 944 capital budgeting 1623 capital controls 1588 capital imbalances, establishments' 837 capital intensities 641,644, 679, 680, 682, 685, 686 capital investment decision 1349 capital/labor substitution 856 capital market imperfections 1648, 1649 see also borrowing constraint; credit market imperfections capital taxation 1661, 1708 optimality of zero 1693 capital utilization 848 CARA utility 794 cash-in-advance constraint 397, 1722 cash-credit model 1720, 1721 "catching up with the Joneses" 1284 certainty equivalence 762 Chamley result 1698 characteristics model 578, 579, 582, 602 characterization of equilibria 487, 489 Cholesky factor 80 classification 262, 289, 303 classifier systems 465, 523 closed economy 1714 closed-form solution 769 club-convergence 660 Cobl~Douglas production function in RBC model 944, 950 "cobweb" model 456 coefficient of relative risk aversion 1249 cohort data 781 cohort effects 576, 577, 59~592, 617, 753, 754 cointegration 50, 750, 820, 838, 877-881, 885-887, 903, 1266 collateral 857
1-35
Subject lndex
commitment 574, 575, 1488, 1523 technology 1688, 1723 vs. flexibility 1489 commodity space 1686 comparative advantage 547, 548, 577-579, 584, 587 comparative dynamics measured by impulse response 967, 968, 970 competitive equilibrium 844, 845, 1677, 1688, 1722 competitive trajectory 650 complementarity 1161 complements 599, 601,611-613,855 complete markets 553, 558, 563, 595, 602, 786, 1688 computation of (approximate) solutions 525 computational general equilibrium (CGE) 705, 7O8 computational intelligence 465 computational tool 455 conditionally linear dynamics 475, 481 conditioning 556, 594, 597-599, 601,605,612, 613 consistent expectations equilibria 529 constant returns to scale 639, 831, 1687 in RBC model production function 995 consumer expectations 45 consumer theory 603 consumer's budget constraint 1264, 1712, 1728 consumption 40, 545, 546, 548-558, 560-564, 566, 567, 572 576, 587, 590, 594-603, 605-614, 616, 621, 1276 behavior in US business cycles 938 empirical 1344 estimates 605-614 'excess' sensitivity 524 growth 1233, 1242, 1276 inequality in 797 permanent-income hypothesis 943 private 1687 procyclical 433-435 smoothing 805 in RBC model 967 time-averaged data 1242 consumption-based asset pricing 1249 consumption expenditure 745 Consumption Expenditure Survey (CEX) 750 consumption per capita 643 consumption taxes 1692 contract multiplier 1028
contractual problems 849 control rights 852 control variables 688, 689 convergence 240, 245-276, 284-288, 290, 295, 296, 659 global 486 local 519 probability of 480 speed of 531,659 convergence analysis 454, 477-479 convertibility 153, 160 convertibility rules 209, 213 convex adjustment costs 818, 823 coordination failures 461 coordination of beliefs 391 comer solutions 804 cost of capital 817, 1344 cost shifters 906, 912 cost shock 867, 884, 899, 907, 908, 912 Costa Rican tariff reform 707 costly state verification 1349 creative destruction 848, 1210, 1213 credibility 1536, 1603 credit chains 1378 credit constraints 856 credit market 847 imperfections 1343 see also borrowing constraint; capital market imperfections segmentation 1575, 1577 cross-country regression 276, 281 cross-section least-squares regression 269 cross-sectional density 840 of establishments' capital imbalances 837 cross-sectional growth regression 252, 269273, 275, 276, 284-289, 671,675, 694 literature 688 crossover 522 crowding out 1632, 1633, 1636, 1638, 1648, 1652, 1654 currency crises 1534 current account deficit 1598 Current Population Survey 796 curse of dimensionality 843, 847 customer markets 1120 cycles 460, 507, 509, 526, 865 deadweight loss 1631, 1632, 1639, 1640, 1662 debt contract 1350 debt-deflation 1372
1-36
Subject Index
debt neutrality 1644 debt income ratio 1630 debt-output ratio 1619 decentralized economy 547, 575, 576, 602 decision rule 888-890 deficits 1617 nominal 1621 real 1621 demand shocks 865, 884, 88%892, 895, 898, 1055 demographic transition 658 demographic variables 793 demographics 547, 551-615, 744 and retirement behavior 758 depreciation 642, 1633 detrending and business cycle measurement 932 difference models of habit 1284 difference-stationary models 764 difference-stationary process 211,215, 1497 differential equation 472 diminishing returns 639 separately to capital and augmented labor 653 dirty floating 1587 discount factor 548, 555-557, 561, 567, 588, 595, 606, 607, 609, 610, 616 disinflation, output costs of 1542 disjunction effect 1324 disparity in GDP 675 disparity in incomes 674 distribution dynamics 263, 290 295, 299 distribution of country incomes 674 distribution of relative GDP 674 dividend growth 1233, 1242, 1276 dollarization 1589 domestic debt 1595, 1601 domestic policy regime 153, 202 Dornbusch-type model 502 DSGE, see dynamic stochastic general equilibrium models durability 798, 1242 durable goods 549, 746, 799, 1550, 1552, 1573, 1575 dynamic economic models 312, 313 Dynamic New Keynesian (DNK) framework 1346 dynamic programming 834 dynamic stochastic general equilibrium (DSGE) models 930, 1139, 1145, 1150, 1157, 1166
models with job search
1158
earnings 546, 567-573, 577-588, 592, 593, 598, 605, 615, 623 see also wages structural equation 582 variance 569 572, 578, 586 econometric approaches 237 economic growth 1617, 1641, 1651 economic relationship 852 education 577, 578, 580, 584, 602, 607, 609, 613, 615, 622, 623,653 eductive approaches 462, 464 effective labor 650 efficiency of terminations 1152 efficiency units 566, 658 s e e also labor in efficiency units efficiency wages 577, 578, 1098, 1157, 1159, 1160 efficient equilibrium 854 efficient markets 1307, 1308, 1316, 131%1322, 1333 elastic labor supply 1145 elasticity 545, 546, 550-552, 563, 579, 580, 592-594, 596-601,605, 607, 610, 614-617, 620 of capital supply 1714 long run 838 of demand, varying 1119 ofintertemporal substitution 552, 557, 561, 564, 597, 600, 601, 614, 615, 769, 791, 1148, 1250 of investment 857 of labor supply schedule 1147 of substitution 645 election 522 embodied technology 1207 embodiment-effect 664 employment 39 employment contract 1153 employment fluctuations 1173, 1194 employment protection 1215, 1217 employment relationship 1157 endogenous fluctuations 506, 531 endogenous growth models 238, 241,243,245, 257, 259, 261, 264, 265, 269, 271, 297, 506, 651,653, 1711 entry 1067 variable 1125 entry and exit 551,602, 615, 616, 824, 844 envelope theorem in RBC model 998
1-37
Subject Index
"episodic" approach 1560 e-SSE 520 Epstein Zin-Weil model 1259 equipment 840 equity premium puzzle 1234, 1245, 1249, 1250 error correction model 750 E-stability 463, 466, 468, 471-473, 488, 490, 491,504, 511 iterative 463 strong 473,483, 491,512 weak 473, 483,512 Euler equation 314, 345-347, 349 352, 354, 355, 364, 368, 371, 373, 374, 381, 382, 555358, 566, 567, 575, 597, 598, 606, 607, 609, 611,621,650, 765, 767, 794, 805 undistorted 1713 Euler equations 745, 791 excess bond returns 1276, 1277, 1280 excess sensitivity 772, 784, 785, 790 excess smoothness puzzle 747 excess stock returns 1249, 1276, 1277 excess volatility 1319, 1320 excessive destruction 856 exchange rate 527, 531, 1658 anchor 1588 and markups 1122 arrangements 167, 203 exchange-rate-based stabilization 1535, 1543, 1553, 1559 empirical regularities 1546 existence of competitive equilibrium in RBC model 1002 exit, delayed 850 see also entry and exit exogenous growth models 261 exogenous technological progress 650 expectation functions 453,461,464 expectational stability, s e e E-stability expectations, average 528 expectations hypothesis of term structure 1281 experience 582, 584, 590, 602 experimental evidence 530 exports 41 extensive margins 843 external effects 390, 399-401, 403-405, 424~27, 431,433-435, 437 external finance premium 1345 external habit models 1284 externalities in RBC model 1002
factor-saving bias 641 factors of production 909 Family Expenditure Survey (FES) 746, 750 family income 564, 569, 589 Federal Reserve 153, 168, 169, 172-182, 184-202, 219 feedback derivative 1510 proportional 1510 feedback rule 68, 71 feedforward networks 524 financial accelerator 1345 financial development 671,688, 692 financial markets, role in economic growth 1376 firing cost 1186, 1214, 1222 fiscal authorities 1524 fiscal deficits 1538, 1594, 1604 fiscal increasing returns 416 fiscal policy 672, 692, 694, 712, 715, 1580, 1617, 1624 countercyclical 1617, 1660 fiscal theory of price-level determination 1520, 1524 fixed costs 390, 426, 435, 828, 848, 911 flow-fixed costs 831 fixed effect 787 flexible accelerator 816, 865, 893, 903 flexible cyclical elasticity 842 flexible neoclassical model 817 floating exchange rate 1582 fluctuations in aggregate activity 547, 549, 552, 556, 569, 1053 see also business cycles induced by markup variation 1055, 1104 France 45 free entry condition 844, 845 frictionless neoclassical model 817 Friedman rule 1720 Frisch demands 595-597, 603 Frisch labor supply 1146 full-order equilibrium 530 functional forms 550, 583,584, 588, 598, 601, 607, 611,623 fundamental solution 498 fundamental transformation 852 gain sequence 469, 475 decreasing 469 fixed 469 small 470
1-38 general equilibrium 543 625, 888 generational accounting 1624 genetic algorithms 465, 521,525 Germany 45, 1631 global culture 1332, 1333 GLS 788 gold standard 153-190, 199520 Golden Rule 1650 Gorman Lancaster technology 800 government budget constraint 1687, 1719 consumption 671,691,694, 1687, 1736 rate to GDP 688, 689 debt 1617, 1687 production 672, 695, 701 production of investment 699 purchases 41 purchases and markups 1120 share 692 in GDP 671,689 in investment 695, 696 in manufacturing output 696 in output 693 gradual adjustments 823 gradualism 849 Granger causality 34 Great Depression 153, 163, 175, 178, 180-184, 199, 200, 213, 1343 Great Inflation of the 1970s 153 great ratios of macroeconomics 939, 940 gross domestic product (GDP) per capita 674 per worker 671 gross substitutes 1731 growth cycles 9 growth accounting 678, 687, 688 growth miracles in East Asia 709 growth-rate targets 1524 maximum growth rate 677, 726, 728, 732 habit formation 798, 802, 1237, 1284 habits 564, 802 Harrod Domar models 640 hazard rate constant 839 effective 836 increasing 840 hedging demand 1275 Herfindahl index 824 heterogeneity 546, 547, 552, 553
Subject Index
in firms 1366 in learning 527 in values of job matches 1152 of preferences 545, 558, 563 unobserved 779, 831 heterogeneous agents 843, 1237, 1290 heterogeneous consumers 1686 Hicks composite commodity 766 Hicksian demand decomposition in RBC model 971 hiring rate 1161 histogram 840 historical counterfactual simulations 1523 history-dependent aggregate elasticity 841 Hodrick Prescott filter, see HP filter hold-up problems 852 home production 402, 417, 431,702 home sector 435 homotheticity 1725, 1728, 1733 Hotelling's rule 657 HP (Hodrick Prescott) filter 12, 932, 933 human capital 527, 546, 547, 576, 577, 583 592, 594, 639, 653, 1638, 1712 hump-shaped impulse responses 405, 436, 1374 hump-shaped profiles 755 hyperinflation (seignorage) 509, 520, 531, 1631 hysteresis and threshold effects 455, 530 i.i.d, model 1739 identification problem 75-78 global identification 76, 77 local identification 76 underidentification 76, 77 idiosyncratic risk 795, 1290 idiosyncratic shocks 840 in productivity 1183 imbalances 826 imperfect competition 665 implementability constraint 1677, 1689, 1719, 1729 implicit collusion 1123 imports 41 impulse 1140 impulse response measure of comparative dynamics 967 to productivity in RBC model 967 impulse response functions 74, 81, 85, 86, 90, 98, 100, 102, 107, 110, 112, 133, 140, 397, 411,430, 431,880, 894
Subject Index inaction range 832 Inada conditions 645 income distribution, cross-country 671 income elasticity 1681 income inequality 797 income processes 569, 574, 610 income tax 672 income uncertainty 1652 incomplete contracts 853, 854, 856 incomplete markets 566-576, 1742 indeterminacy 491, 494, 506, 1161, 1506, 1691 nominal 418, 1506, 1524 of price level 215, 216, 415, 417, 419, 423 real 413, 415, 416, 418, 419, 423 indicator, cyclical 1062 indivisible labor model, role in RBC model 977 industry equilibrium 888, 889 inequality 745, 795 infinite-horizon consumption program 647 inflation 42, 1534, 1536, 1630 and business cycles 939 and markups 1128 inertia 1562 level 198 persistence 166, 211,213-215 rate 1738 autocorrelation 1738, 1739 variability 207 inflation correction 1621 inflation forecast targeting 1504 inflation-indexed bonds 1271 inflation-indexed consul 1269 inflation targeting 1499, 1505 vs. price-level targeting 1497 inflation tax 1538, 1720 inflationary expectations 1281 information externality 849 information pooling 849 information set 455 informational problems 849-851,858 infrequent actions 825 instability 481,519 of interest rate pegging 514 of REE 507 institutional factors 852 instrument feasibility 1507 instrument instability 1517 instrument variable 1492, 1524 instrumental variables (IV) estimator 787
1-39 instrumental variables (IV) regression 1261 insurance 745, 795 integrated world capital market 1297 interest rate 43, 1620, 1621, 1629, 1630, 1634, 1635, 1637, 1639, 1648, 1652, 1653, 1657-1659 nominal 1524 interest rate instrument 1514 interest rate policy 1596 interest rate smoothing 1509 intermediate-goods result 1684, 1720, 1733 intermediate-goods taxation 1676 intermediate input use 1081 internal habit models 1284 international capital flows 1636-1638 International Financial Statistics (IFS) 1238 international reserves 1594 intertemporal allocation 761 intertemporal budget 555, 561,647, 661 intertemporal budget constraint 1259, 1268 intertemporal CAPM 1275 intertemporal channel 1142 intertemporal elasticity of labor supply 1149 of substitution in leisure 1147 intertemporal marginal rate of substitution 1245 intertemporal non-separabilities 775 intertemporal optimization 745 intertemporal substitution 1055, 1150 "intervention" policy 1587 intradistribution dynamics 274, 292 intratemporal first-order conditions 775 inventories, target 894 inventories of finished goods 887 inventory fluctuations 1084 procyclieal 872-882, 898, 900, 909 inventory investment 865 inventory-sales ratio 871 inventory sales relationship 867 investment 40, 641 collapse 851 competitive equilibrium 844 delays 1365 distortions 672, 695-698 empirical 1344 expected 839 frictionless 832 lumpy 822, 823 share in output 693,699 spike 823, 824, 857
1-40 investment (cont'd) tax incentives 843 US manufacturing 840 investment episode 823 investment-output ratio 714 irrational expectations 1237, 1293 irregular models 490, 493, 505 irreversibility constraint 832 irreversible investment 822, 828, 832 iso-elastic utility function 606, 607, 610 Italy 1619 Ito's lemma 825 Japan 45 Jensen's inequality 1247 job-finding rate, cyclical behavior of 1162 job loss 1151 job search 1143, 1150, 1158, 1162 job-specific capital 1152 job to job flows 1198, 1200 job worker separations 1184 jobs creation 846, 1150, 1158, 1161, 1173, 1176, 1178, 1185, 1201, 1219 cost 1187, 1193, 1215, 1222 creation and destruction, international comparison 1178 destruction 846, 1150, 1158, 1160, 1166, 1173, 1176, 1178, 1185, 1197, 1201, 1219 rate 1151, 1152 flow 1197 international comparison 1180 reallocafion 1222 termination 1152 cost 1193 joint production 853 joint surplus 1157 just-in-time 871 Kaldor facts about economic growth 941 Keynes, J.M. 1660 Keynesian analysis 1628 Keynesian consumption function 761 Kreps-Porteus axiomatization 744 Krugman model 1592 Kuhn-Tucker multiplier 774 labor 1687 bargaining strength 1219 labor-augmentation 651
Subject Index
labor contract 1154 labor force 1174 labor force status 602, 603, 607, 611, 614, 623 labor hoarding 1076, 1078, 1097 labor in efficiency units 650 see also efficiency units labor income 1237, 1275, 1290 labor market 855 policy 1214 restrictions 672, 695 labor power 1220 labor productivity 42 labor regulations 852 labor share 1059 labor supply 546 553, 562, 577, 585, 587, 592, 594, 596, 598, 599, 601,602, 605, 606, 608, 610-621,623, 744, 777, 792, 1150, 1296 elasticity 975, 1371 in RBC model 975 empirical 1148 endogenous in RBC model 945 extensive margin 976 female 611 fixed costs of working 976 indivisible labor model 976 male 611 substitution effect 975 unobserved effort of 930 labor tax rate, autocorrelation 1739 lack of credibility 1569, 1572, 1581 Latin America 1543 laws of large numbers 837 leading example 488, 493 learning 453, 488 by doing 664 in games 475 in misspecified models 528 least squares learning 465, 467, 526 social 849 stability under 496 statistical 493 learning dynamics, persistent 455 learning equilibria 515 learning rules 439, 454 econometric 472 finite-memory 474 fixed-gain 511 statistical 465 learning sunspot solutions 494 learning transition 531
S u b j e c t Index
Legendre Clebsch condition 904 levels accounting 678-687 leverage 1280 life cycle 583, 586-588, 593, 595, 601, 603, 604, 609, 615, 620, 621, 744, 749, 752, 754, 760, 792, 793 life cycle-permanent income model 760 life expectancy 691 693 lifetime budget constraint 647 see also intertemporal budget likelihood fimction 840 linear allocation rules 554, 563, 564 linear commodity taxes 1677 linear filter 11 linear model 467, 487, 842 with two forward leads 501 linear-quadratic model 457, 865, 876, 882, 903, 904 liquidity 1255, 1591 liquidity constraints 745, 772, 773, 789, 1654 see also borrowing constraint liquidity variables 817 log-linearization 788 long-term bonds 1255, 1280 low-equilibrium trap 646 Lucas aggregate supply model 457 Lucas critique 1491 Lucas program 67 lumpy project 823 Lyapanov theorems 479 M2 44 M1 velocity 50 machinery, price of 696 macroeconomics 639 magical thinking 1328, 1329 maintenance 823 maintenance investment 839 major and infrequent adjustments 823 managed float 152, 153, 167, 202, 204, 207 manufacturers 870 marginal cost schedule 1054 declining 1066 marginal production costs 867, 890, 892, 896, 899, 902, 905, 907 marginal profitability of capital 830 marginal rate of substitution 549, 551,554-557, 559, 560, 598, 622, 765 heterogeneity 620-623 marginal utility 767 market capitalization 1239
1-41 market clearing 1021-1024, 1026, 1035 expected 1021, 1024-1027 market imperfection 390, 405,424, 426, 433 market structure 546, 553,558, 575, 598 market tightness 1185 market work 550, 594, 601 Markov chain 1708, 1736 Markov process 1264 markup 399, 400, 406, 407, 426, 429, 431, 1053 average 1068 countercyclical 406, 1113 for France 1068 cyclical 1092 desired 1056 measurement 1058 models of variation 1055, 1112 procyclical 1113, 1128 variable 406, 407 variation in desired 1129 Marshallian demands 597 martingale 767 martingale difference sequence 487 match capital 1152 matching function 1183 matching model 1163 Maximum Principle 650 measure of financial development 691 measurement error 518, 546, 561, 572-574, 609, 616, 1242 "mechanical" approach 1560 mechanism design 1154 Medicare 1622, 1626 men 550, 552, 607, 615, 620 mental compartments 1317 menu costs 397 microeconomic data 543-625, 745 microeconomic lumpiness 824 microfoundations 761 military purchases 1088 Mincer model 568, 569, 581,582, 584, 592 minimal state variable solutions, see MSV solutions mismatch 1221 mismeasurement of average inflation 1254 Modigliani Miller theorem 1343 monetary accommodation 1539 monetary base 44, 1507, 1524 monetary economies 1720 monetary model with mixed datings 500
1-42 monetary policy 692, 695, 715, 1012, 10241037, 1281, 1630, 1660, 1720 optimal, cyclical properties of 1736 monetary policy rule 1364 monetary policy shocks 65 145 effect 69 on exchange rates 9 4 4 6 on US domestic aggregates 91-94 on volatility 123-127 identification schemes 68-70, 1369 Bernanke-Mihov critique 115-123 Bernanke-Mihov test 11%121 empirical results 121 123 Coleman, Gilles and Labadie 114, 115 narrative approach 136-141 see also Romer and Romer shock pitfalls 134-136 plausibility 100-104 assessment strategies 114-123 problems 143 145 interpretations 71-73 non-recursive approaches 127-134 output effects 1129 recursiveness assumption 78 127 see also recursiveness assumption responses to 1368 monetary regimes 153, 168, 178, 202, 204, 211,216, 220 money 44, 1011-1013, 1020-1029, 1031-1033, 1035, 1036, 1040, 1041 money anchor 1588 money-based stabilization 1535, 1543, 1554, 1558, 1582 money demand 50, 598, 1603, 1736 consumption elasticity of 1725 interest elasticity of 1736 money growth rate 1738 money-in-the-utility-function model 1720, 1728 money supply 1536 money velocity 1588 s e e also M1 velocity monopolies 695 monopolistic competition 1033-1036, 1041, 1042 monotonicity 830 Morgan Stanley Capital International (MSCI) 1238 MSV (minimal state variable) solutions 488, 493,502 and learning 503
Subject Index
locally (in)determinate 490 non-MSV solutions 493 multiple competitive equilibria 1679 multiple equilibria 1539, 1603 multiple REE, see under R E E multiple solutions 487, 1506, 1524 multiple steady states 460 multiple strongly E-stable solutions 501 multiplicity of steady states 658, 662 multivariate models 502 with time t dating 505 mutation 522 Muth model 465, 484, 525 myopia 1653, 1654 Nash bargain, generalized 1189 National Account 751,752 national accounting identities 1628 National Bureau of Economic Research (NBER) 6,8 national income 1617 national saving 1628, 1629, 1637, 1639, 1641, 1652, 165%1662 natural experiments 822 natural rate 1176 natural resources 639 negative income tax experiments 1148 neoclassical exogenous growth model 243, 261, 673 neoclassical growth model 245, 246, 252, 259, 269, 272, 276, 639, 695, 697, 701, 1140 basis for RBC model 942 neoclassical theory of investment 817 net convergence effect 692, 693 net present value rule 835 net worth and the demand for capital 1352 neural networks 465, 524 neurons 524 noise case of small 513 intrinsic 507 noise traders 1290 noisy k-cycle 513 noisy steady states 483, 509 nominal anchor 207, 211, 215, 216, 1535, 1542, 1557 nominal income targeting 1505 non-durables 746 non-nested models 840 non-random attrition 787 non-Ricardian policy 418
1-43
Subject Index
non-Ricardian regime 418 non-separability of consumption and leisure 759 non-state-contingent nominal claims 1722 Non-Accelerating Inflation Rate of Unemployment (NAIRU) 46 nonlinear models 468 nonlinearity 828, 839 nonparametric techniques 532 numerical algorithms 320, 324, 326, 328, 348, 358, 378 numerical solutions 318, 326, 352, 805 obsolescence 848 OECD 685, 718, 719, 1174 OECD adult equivalence scale 757 oil prices, effects of 1089 one-sector model 639 one-step-ahead forward-looking reduced form 506 open economy 1714 open market operations 1722 openness 703 operationality 1486, 1523 opportunism 851, 858 opportunity costs 854 optimal control 1490 optimal debt policy 1639, 1659, 1660, 1662 optimal fiscal policy 1686 optimal investment path 834 optimal national saving 1617 optimal tax theory 1692 optimal trajectory 650 optimal wedges 1692 optimum quantity of money rule 1537 option to wait 832, 834 ordinary differential equation (ODE) approximation 468, 478 orthogonality conditions 785 out-of-sample forecasting 840 out-of-steady-state behavior 649 output 1687 output levels 206 output variability 208, 211 overconfidence 1319-1323, 1325, 1326, 1328 overhead labor 1065 overidentifying restrictions 768 overlapping contracts models 495, 1582 overlapping generations model 390, 395, 397, 398, 427, 458, 546, 549, 576-594, 660, 1634, 1635, 1645-1647
overparametrization 473 overreaction 1319-1322 overtaking 650 overvaluation 1563
panel data 275, 283-287, 295, 781 Pareto weights 559-564, 796 partial adjustment model 821, 838 participation 574, 601, 1218 path dependence of adaptive learning dynamics 455 peacetime 1699 Penn World Table 674, 680 pent-up demand 841 perceived law of motion (PLM) 466, 472, 490, 511 perceptron 524 perfect competition 831 perfect foresight 650 perfect insulation 846 perfect-insurance hypothesis 796 periodic or chaotic dynamics 646 see also cycles permanent-income hypothesis 749, 1641, 1662 permanent shocks 216-219 perpetual inventory method 680 persistence 870-882, 891, 893, 900, 902, 904, 1142, 1162, 1166, 1739 of business cycles, see persistence under business cycles of fluctuations 527 of inflation 1537 peso problem 1252 pessimism 1295 Phillips curve 46, 1056, 1363, 1542 planner's problem in RBC model 997, 1002 policy 455 affecting labor markets 672 distorting investment 695 impeding efficient production 672 policy accommodation 1538 policy function 320-381 political rights 671, 689 political stability 671, 688, 692 Ponzi scheme 1650 population aging 1625, 1640 population growth 941 endogenous 639 power utility 1249
precautionary saving 744, 770, 1253, 1288, 1653 preference parameters 550, 555, 556, 558, 567, 601, 605 preferences 546-550, 552, 553, 556-558, 564, 565, 567, 572, 582, 593, 601, 604, 605, 607, 608, 610, 614, 616, 617, 623 additive 594 conditional 778 functional forms 550 Gorman polar 766, 783 heterogeneity 545, 552, 558-565, 567, 593, 594, 609, 621, 623 homogeneity 553-556, 577 of representative agent in RBC model 942 quadratic 762, 770 present-value model of stock prices 1264 log-linear approximation 1265 present-value neutrality 573 price elasticity 1681 price functions 1723 price puzzle 97-100 price rules 1688 price-cost margin 1053 see also markup price-dividend ratio 1265, 1266, 1276 prices 42 of machinery 696 of raw materials 1082 pricing, equilibrium 555, 602, 845 primal approach 1676 primary budget 1619 principal-agent problems 1345 principles of optimal taxation 1676 private and public saving 1629 private information 574-576, 849 production costs, non-convex 897, 911 production economy 1686 production efficiency 1684, 1735 production function 548-550, 578, 579, 581, 583-586, 588, 590, 591, 594 non-Cobb-Douglas 1064 production possibilities surface 401 production smoothing 876, 877, 884, 895, 1085 production to order 887 production to stock 887 productivity 552, 553, 566, 583, 602, 1057 cyclical 938, 1094 deterministic growth of 943 general 1192, 1193
growth of 942 shocks 930, 943, 965, 972 amplification of 963 modeled as first-order autoregressive process 963 persistence of (serial correlation) 952, 963 RBC model's response to 964 remeasurement of 982 slowdown 664 profit function 830 profits 1057 cyclical 1100 projection facility (PF) 480 propagation of business cycles 865 propensity to consume 762 property rights 852, 856 proportional costs 825 proportional taxes 1687 prospect theory 1308-1313 protection of specific investments 1154 "provinces" effect 1540 proxies for capital utilization 1080 prudence 771 PSID 783 public consumption 1581 public debt 1601, 1603 public finance 1676 public saving 1629, 1641 putty-clay models 847, 848 q-theory 817 see also Tobin's q average q 817, 818 "flexible q" 818 marginal q 818 fragility of 828 quadratic adjustment cost model 823, 838 Quandt Likelihood Ratio (QLR) 34 quantitative performance 1578, 1581 quantitative theory 671-673, 695-719 see also dynamic stochastic general equilibrium models quasi-magical thinking 1329, 1330 Ramsey allocation problem 649, 1679, 1691, 1692, 1713, 1719, 1723, 1729 Ramsey equilibrium 1678, 1688, 1723, 1729, 1732 Ramsey growth model 1651, 1652 Ramsey prices 1679
random walk 767, 1316, 1319, 1702, 1706, 1738, 1742 geometric 825 range of inaction 826 rate of arrival of shocks 1193 rate of discount 1193 rate of return 566, 577, 582, 595, 606, 610 ratio models of habit 1284 rational bubbles 499, 1266 rational expectations 453 transition to 454 rational learning 461 rationalizability 464 rationing 857 RBC models, see real business cycle Reagan, R. 1641 real balance model 489, 496 real business cycle (RBC) 394, 402, 413, 427, 428, 437, 442, 505, 843, 928, 1296 amplification of productivity shocks in 958, 967 as basic neoclassical model 942 baseline model 1143, 1709, 1736 failures 1144 calibration 953-955, 959 competitive equilibrium 999 concave planning problem 1002 contingent rules 1000 criticisms 961 depreciation rate of capital 944 discount factor 942 modified 945 endowments in 943 extensions 994 firm's problem 1001 government spending and taxes in 974 high risk aversion model 1709 high-substitution version calibration 985, 987 decision rules for 985 ingredients of 984 probability of technical regress 989, 990 role of capacity utilization in 985 role of indivisible labor in 985 sensitivity to measurement of output 992 sensitivity to parameters 990, 991 simulation of 986 household's problem 1000 importance of consumption smoothing in 967
  Inada conditions on production function 996
  interest rate effects 973
  internal propagation in 967
  labor
    demand for 956
    supply of 956
  Lagrangian for 946
  lifetime utility 996
  market clearing 1001
  production function in 943
  RBC model as basic neoclassical model 942
  simulations of 957
  solution
    certainty equivalence 952
    dynamic programming 951
    linear approximations 949
    loglinear approximations 952
    rational expectations 951
  steady state of 947
  transformation to eliminate growth 944
  transitional dynamics of 948
  transversality condition for 946
  wage effect in 973
  wealth effects in 971
  with nominal rigidities 974
real exchange rate 1547
real interest rate 1220, 1233, 1276, 1286
  measurement of 939
real marginal cost 1053
real shocks 1174
real wage 1296
reallocation of workers 1160, 1183, 1199
recession now versus recession later 1535, 1557
recursive algorithm 468, 475, 479, 486
recursive least squares 467
recursive least squares learning 494
recursive utility 557
recursiveness assumption 68, 73, 78-127
  benchmark identification schemes 83-85
  FF policy shock 87, 88
  influence of federal funds futures data 104-108
  NBR policy shock 88
  NBR/TR policy shock 89
  problems 97
  results 85
  robustness 96, 97
  sample period sensitivity 108-114
  relation with VARs 78-83
REE (rational expectations equilibria) 452
  cycles 458
  multiple 454, 467
  reduced order limited information 529
  unique 484
reflecting barriers 828
regime switching 426
regression tree 289
regular models 490
regulation barrier 832
relative price of investment to consumption 696-698, 700, 701
reluctance to invest 828, 832
renegotiation 1153, 1155
renewable/nonrenewable resources 655, 656
rental prices 588, 590, 592
  of capital 1000
reorganization 1160, 1161
representative agent 556, 557, 560, 561, 563, 587, 601, 838, 1249, 1259, 1268
  in RBC model
    altered preferences in indivisible labor 977
    altruistic links 943
    preferences of 942
representative household 643
representativeness heuristic 1319, 1322, 1327
reproduction 522
research and development (R&D) 664, 672, 692, 695, 708, 709, 715-719
residence-based taxation 1715
restricted perceptions equilibrium 529
restrictions in job separation 1222
restrictions on government policy 1707
retailers 869
retirements 839
returns to scale 639
  decreasing 656
  increasing 652, 653, 664, 828, 830, 1066
  social 460, 509, 521
Ricardian equivalence 418, 1617, 1640-1659, 1661
Ricardian regime 418
Ricardo, D. 1640
risk 546, 547, 552, 554-558, 563-567, 569, 572, 575, 593, 606
risk adjustment 555, 557, 558
risk aversion 547, 552, 556-558, 564-566, 606, 771
risk premium 1246, 1247, 1250
risk price 1236, 1280
risk-sharing in indivisible labor version of RBC model 977
riskfree rate puzzle 1235, 1252
robustness approach 1491, 1523
Romer and Romer shock 137-142
rule-like behavior 1487, 1522
rule-of-thumb decision procedure 524
rules 152-154, 156, 158, 160, 166, 168, 184, 200, 208, 219, 220
rules vs. discretion 1485
Rybczynski theorem 404
(S,s) model 801, 802, 831, 910, 911
sacrifice ratio 1541
saddle point 405, 649
saddle point stability 490
Sargent and Wallace model 489
saving 641
  private 1628, 1629, 1632-1634, 1637, 1641, 1648
'saving for a rainy day' equation 764
scale effects 672, 715, 716, 718, 719
school attainment 691
school enrollment 681, 684
  post-secondary 683
  primary 683
  secondary 681-683
schooling 576-578, 581-592
sclerosis 856
scrapping 844, 847, 855, 856
  endogenous 844
search and matching approach 1173, 1183
search efficiency 1162
search equilibrium 1186
search externalities 506
seasonal adjustment 1242
seasonal variations in work volume 1149
second-best solutions 849
secondary job loss 1163
sector-specific external effects 402
sectoral shifts hypothesis 1221
securities market 1722
seignorage model 460, 471, 509, 525, 530, 1741
selection criterion 468
selection device 454
self-fulfilling fluctuations 506
separability 556, 602, 603, 607, 608, 612, 613, 617, 1725, 1728, 1733
  tests 611
separation rate 1151
Sharpe ratio 1249
shock absorber 1699, 1710, 1739
shock propagation 1203
shocks and accommodation 1539
shopping-time model 1720, 1732
shopping-time monetary economy 1732
short-term bonds 1280
short-term maturity debt 1603
σ-convergence 659
Sims-Zha model 128-134
  empirical results 131-134
skill-biased technology shock 1215, 1216, 1218
skills 546, 547, 569, 576-579, 581, 582, 584, 586-588, 590-594, 623
slow adaption 480
slow speed of adjustment 877, 894
small durables 798
small open economy 1715
small sample 820
small versus large firms 1373
smooth pasting conditions 827
Social Security 1619, 1622, 1624, 1626, 1635
Solow residual 930, 1140, 1141
  as productivity measure 962
  in growth accounting 962
  mismeasurement 962
solvency conditions 575
specificity 851, 852, 856
spectral analysis 11
SSE, see stationary sunspot equilibria
stability conditions 454
stabilization 1534, 1562
stabilization goals 153
stabilization time profiles 1547
stable equilibrium point 481
stable roots 393
staggered contracts model 1012, 1013, 1024, 1027, 1030, 1032, 1039
staggered price and wage setting 1012, 1013, 1027, 1030, 1031, 1033, 1035-1037, 1040
staggered price setting 397, 422, 423, 1129, 1363
staggered-prices formulation 1582
standardized employment deficit 1621
state-contingent claims 555, 602
state-contingent returns on debt 1687, 1699
state-dependent pricing 1031, 1032
state dynamics 477
state prices 1294
stationary distribution of RBC model 999
stationary sunspot equilibria (SSE) 408, 517
  ε-SSE 517
  near deterministic solutions 520
steady states 468, 507, 525, 549-551, 576, 592, 598, 639
  of RBC model 944
sterilization 1595
sticky price models 503, 1113
stochastic approximation 468, 475, 476
stochastic discount factor 1234, 1245
  log-normal 1246
stochastic growth model 546-577, 592
stochastic simulations 1516, 1523
stock market 1310, 1312, 1313, 1315, 1316, 132~1328, 1331, 1333
stock market volatility puzzle 1235, 1236, 1268, 1276, 1280
stock prices 43
stock return 1233, 1240
stockout costs 884, 885
Stolper-Samuelson theorem 404
Stone price index 783
storage technologies 574, 575
strategic complementarity 1129
strategic delays 858
strong rationality 464
structural model 462
structural shifts 530
structures 840
subgame perfection 1679
subjective discount factor 548, 552, 561, 593, 595, 609, 616
subsistence wage 657
substitutes 577, 590, 591, 613, 616
sunk costs 858
sunspot equilibria 454, 515
sunspot paths 662
sunspot solutions 495
  see also learning sunspot solutions
sunspots 489, 515
supply of capital 846
supply price of labor 1192, 1193
supply shocks 1129
supply-side responses 1577
surplus 853
surplus consumption ratio 1286
survivorship bias 1242
sustainability 1597
T-mapping 467, 471, 512
Tanzi effect 1741
target points 826
target variables 1492, 1523
tariff 672, 695, 703-707
taste shift 778
tax
  see also labor tax rate; capital taxation
  distortionary 1651, 1652, 1654
  on capital income 1686
  on employment 1220
  on international trade 703
  policy 672, 708
  rate 1441
  on private assets 1709
  reforms 822
  smoothing 1655, 1659, 1662, 1705
    intertemporal 1617
  source-based 1715
  system 1679
Taylor expansion 1265
Taylor rule 1364
technological change 1708
technological embodiment 848
technological progress 641, 1207, 1213
  disembodied 1207, 1208
  endogenous 639
  Harrod-neutral, Hicks-neutral 944
  labor-augmenting 944
  purely labor-augmenting 650
technological regress, probability of in RBC models 930
technology adoption 672, 708
technology shocks 1141, 1142, 1736
temporariness hypothesis 1569, 1572
temporary shocks 216
temporary work 1165
term premium 1255
term structure of interest rates 1270
termination costs 708
thick-market externality 1161
threshold externalities 527
thresholds 258-262, 276, 289
time-additive utility function 661
time aggregation 881
time-consistent behavior 1488
time dependency 799
time-dependent pricing 1031, 1032
time-inconsistent behavior 1653
time preference 547, 588
time preference rate 1253
time series 264, 272, 287, 288
time series volatility 756
time to build 832, 850
time-varying aggregate elasticity 841
timing assumption 469
Tobin's q 817, 1296
  see also q-theory
total factor productivity (TFP) 42, 673, 678, 687, 688, 702
trade deficit 1630, 1658, 1659
trade policy 672, 692, 694, 702
training 577, 58~584, 586-592, 653
transition rates 1166
transversality conditions 392, 393, 400, 650
Treasury bills 1233
trend-stationary models 764
trend-stationary process 10, 211, 1497
trigger points 830
tuition costs 583, 588, 590
twin deficits 1630
two-stage least squares estimation 1261
uncertainty 545-547, 556, 558, 564, 566, 567, 569, 572, 574, 575, 593, 605, 606, 620, 621, 623, 744, 1627, 1653
underinvestment 852, 854
underreaction 1319-1322
unemployment 546, 569-571, 578, 579, 1143, 1150, 1158, 1161, 1162, 1173, 1174, 1194, 1214
  experiences of OECD countries 1213
  natural level 1157
  rise in 1182
  serial correlation 1163
unemployment compensation 1217
unemployment income 1214
unemployment inflow and outflow rates 1181
unemployment rates 1176
unemployment spell duration hazard 1184
unemployment skill profile 1216
unified budget 1619
uniform commodity taxation 1676, 1726
union bargaining 1098
uniqueness of equilibrium in RBC model 1002
unit root 11
United Kingdom 45
univariate models 488, 497
unstable equilibrium point 481
utility function 548-550, 556-558, 560, 594, 596, 597, 599-601, 606, 607, 610
  momentary in RBC model 944
  offsetting income and substitution effects 944
utility recursion 557
utilization of capital 1079
vacancies 41, 1194
vacancy chain 1200
vacancy duration hazard 1184
value function 319-327, 329, 335, 336, 340, 345, 351-355, 357-359, 365, 368, 378
value matching 827
variable costs 828
variety, taste for 705
vector autoregression (VAR) 73, 438
  definition 73
vintage capital models 847, 848
volatility
  employment 1157
  inventories 869, 870
  monetary aggregates 1599
vote share 1455
wage bargaining 1130
wage contract 1173, 1186
wage inequality 1182, 1214, 1218, 1219
wages 42, 547, 550-553, 556, 566-569, 572, 577-579, 581, 587, 593, 595-601, 603-607, 611, 612, 616, 617, 619, 621, 623, 1181, 1629, 1637
  see also earnings
  cyclical 939
  equilibrium 556
  fixed 1157
  marginal 1069
  rigidity 1055
war of attrition 1540
wars 1619, 1642, 1656, 1661-1663, 1699
wealth distribution 556, 561, 567, 572, 593
wealth-output ratios 1240
wealth shock 1372
welfare costs of macroeconomic fluctuations 1297
welfare theorems, role in RBC analysis 1001
wholesalers 869
within-period responses 599
women 550, 552, 607, 615, 620, 623
worker flows 1180
  into unemployment 1164
worker turnover 1176
works in progress inventories 887
yield spread 1256, 1280