NBER Macroeconomics Annual 2001 Editors Ben S. Bernanke and Kenneth Rogoff
THE MIT PRESS Cambridge, Massachusetts London, England
NBER/Macroeconomics Annual, Number 16, 2001
ISSN: 0889-3365 E-ISSN 1537-2642
ISBN: Hardcover 0-262-02520-5 Paperback 0-262-52323-X
Published annually by The MIT Press, Cambridge, Massachusetts 02142
An electronic, full-text version of NBER/Macroeconomics Annual is available from MIT Press Journals when purchasing a subscription.
Subscription Rates
Hardcover/Print and Electronic: $62.00
Paperback/Print and Electronic: $32.00
Outside the U.S. and Canada add $10.00 for postage and handling. Canadians add 7% GST.
Subscription and address changes should be addressed to: MIT Press Journals, Five Cambridge Center, Cambridge, MA 02142-1407; phone 617-253-2889; fax 617-577-1545; e-mail [email protected]. Claims will be honored free of charge if made within three months of the publication date of the issue. Claims may be submitted to [email protected]. Prices are subject to change without notice.
In the United Kingdom, continental Europe, and the Middle East and Africa, send back volume orders and business correspondence to: The MIT Press, Ltd., Fitzroy House, 11 Chenies Street, London WC1E 7ET England; phone 44-020-7306-0603; fax 44-020-7306-0604; e-mail [email protected].
In the United States and for all other countries, send single copy and back volume orders to: The MIT Press, Five Cambridge Center, Cambridge, MA 02142; toll-free book orders phone 800-356-0343; fax 617-625-6660; e-mail [email protected].
Copyright Information
Permission to photocopy articles for internal or personal use, or the internal or personal use of specific clients, is granted by the copyright owner for users registered with the Copyright Clearance Center (CCC) Transactional Reporting Service, provided that the fee of $10.00 per copy is paid directly to CCC, 222 Rosewood Drive, Danvers, MA 01923. The fee code for users of the Transactional Reporting Service is: 0889-3365/01 $10.00. For those organizations that have been granted a photocopy license with CCC, a separate system of payment has been arranged.
© 2002 by the National Bureau of Economic Research and the Massachusetts Institute of Technology.
NBER BOARD OF DIRECTORS BY AFFILIATION OFFICERS Carl F. Christ, Chairman Michael H. Moskow, Vice Chairman Martin Feldstein, President and Chief Executive Officer Susan Colligan, Vice President for Administration and Budget and Corporate Secretary Robert Mednick, Treasurer Kelly Horak, Controller and Assistant Corporate Secretary Gerardine Johnson, Assistant Corporate Secretary DIRECTORS AT LARGE Peter C. Aldrich Elizabeth E. Bailey John H. Biggs Andrew Brimmer Carl F. Christ John S. Clarkeson Don R. Conlan George C. Eads
Martin Feldstein Stephen Friedman Judith M. Gueron George Hatsopoulos Karen N. Horn Judy C. Lewent John Lipsky Michael H. Moskow
Alicia H. Munnell Rudolph A. Oswald Robert T. Parry Peter G. Peterson Richard N. Rosett Kathleen P. Utgoff Marina v.N. Whitman Martin B. Zimmerman
DIRECTORS BY UNIVERSITY APPOINTMENT
George Akerlof, California, Berkeley
Jagdish Bhagwati, Columbia
William C. Brainard, Yale
Michael J. Brennan, California, Los Angeles
Glen G. Cain, Wisconsin
Franklin Fisher, Massachusetts Institute of Technology
Saul H. Hymans, Michigan
Marjorie B. McElroy, Duke
Joel Mokyr, Northwestern
Andrew Postlewaite, Pennsylvania
Nathan Rosenberg, Stanford
Michael Rothschild, Princeton
Craig Swan, Minnesota
David B. Yoffie, Harvard
Arnold Zellner, Chicago
DIRECTORS BY APPOINTMENT OF OTHER ORGANIZATIONS
Mark Drabenstott, American Agricultural Economics Association
Gail D. Fosler, The Conference Board
A. Ronald Gallant, American Statistical Association
Robert S. Hamada, American Finance Association
Robert Mednick, American Institute of Certified Public Accountants
Angelo Melino, Canadian Economics Association
Richard D. Rippe, National Association for Business Economics
John J. Siegfried, American Economic Association
David A. Smith, American Federation of Labor and Congress of Industrial Organizations
Josh S. Weston, Committee for Economic Development
Gavin Wright, Economic History Association
DIRECTORS EMERITI Thomas D. Flynn Franklin A. Lindsay Bert Seidman Lawrence R. Klein Paul W. McCracken Eli Shapiro Since this volume is a record of conference proceedings, it has been exempted from the rules governing critical review of manuscripts by the Board of Directors of the National Bureau (resolution adopted 8 June 1948, as revised 21 November 1949 and 20 April 1968).
Contents
Editorial: Ben S. Bernanke and Kenneth Rogoff
Abstracts
IS GROWTH EXOGENOUS? TAKING MANKIW, ROMER, AND WEIL SERIOUSLY
Ben S. Bernanke and Refet S. Gürkaynak
COMMENTS: Francesco Caselli; David Romer
DISCUSSION
LONG-TERM CAPITAL MOVEMENTS
Philip R. Lane and Gian Maria Milesi-Ferretti
COMMENTS: Kristin J. Forbes; Jeffrey Frankel
DISCUSSION
DO WE REALLY KNOW THAT OIL CAUSED THE GREAT STAGFLATION? A MONETARY ALTERNATIVE
Robert B. Barsky and Lutz Kilian
COMMENTS: Olivier Blanchard; Alan S. Blinder
DISCUSSION
THE COST CHANNEL OF MONETARY TRANSMISSION
Marvin J. Barth III and Valerie A. Ramey
COMMENTS: Charles L. Evans; Simon Gilchrist
DISCUSSION
THE 6D BIAS AND THE EQUITY-PREMIUM PUZZLE
Xavier Gabaix and David Laibson
COMMENTS: Anthony W. Lynch; Monika Piazzesi
DISCUSSION
EVOLVING POST-WORLD WAR II U.S. INFLATION DYNAMICS
Timothy Cogley and Thomas J. Sargent
COMMENTS: Christopher A. Sims; James H. Stock
DISCUSSION
Editorial, NBER Macroeconomics Annual 2001
In recent decades there has been an extensive rethinking of many major issues in macroeconomics, including the sources of long-term economic growth, the nature of the monetary policy transmission mechanism, the effect of supply shocks on the economy, and the relationship between consumption spending and asset prices, among others. So it is remarkable that so many of the papers in this year's NBER Macroeconomics Annual are able to offer fresh perspectives on often-studied topics.
In their paper, Ben Bernanke and Refet Gürkaynak revisit the conclusions of a well-known 1992 article by N. Gregory Mankiw, David Romer, and David Weil (MRW), who argued that the cross-country data on economic growth are well explained by Robert Solow's neoclassical growth model, augmented to take account of human-capital formation. Bernanke and Gürkaynak show first that, in principle, the MRW framework can be used to evaluate any growth model that admits a balanced growth path, not just the Solow model. Using their generalized version of the MRW framework, they then investigate how well both the Solow model and some alternative models of endogenous growth fit the cross-country data, drawn from the Penn World Tables. Their tests strongly reject the hypothesis that the data can be well described by a steady-state version of the Solow neoclassical growth model, since (contrary to a central implication of that model) they find that behavioral variables such as a country's aggregate saving rate are strong predictors of long-run rates of output growth.
As noted, Bernanke and Gürkaynak's rejection of the Solow model requires the auxiliary hypothesis that the economies in the sample are in steady state. To develop a test that does not require the steady-state assumption, these authors also look directly at estimates of total factor productivity (TFP) growth by country, constructed using new estimates
of labor's share of national income. They find that, like rates of output growth, rates of TFP growth are also strongly correlated with savings rates and other behavioral variables, a result that once again tends to favor models that exhibit endogenous growth.
In his discussion, David Romer praised the paper for raising anew the possible importance of capital-stock externalities in the growth process. He noted, though, that, once the dubious steady-state assumption is dropped, the only tests in the paper that potentially discriminate among alternative growth models are those based on TFP growth rates, which are exceptionally difficult to measure accurately. He therefore cautioned against drawing strong conclusions about alternative models from the evidence of this paper.
While the oil price fluctuations of the past four decades may have wrought serious damage to the world economy, empirical macroeconomists have always considered sharp swings in oil prices to be a blessing for their research, as such price movements are one of the few important influences on the macroeconomy that economists have been willing to treat as exogenous. Indeed, the exogeneity of oil price shocks has been largely unquestioned in conventional macroeconomic analyses, ranging from standard textbook treatments to sophisticated econometric models.
The central claim of the paper by Robert Barsky and Lutz Kilian is that oil price shocks are not reasonably taken as exogenous, but are in fact usually endogenous to the state of aggregate demand in the major oil-consuming countries. They focus in particular on the infamous OPEC price increases of the 1970s, writing, "Our analysis suggests that—although political factors were not entirely absent from the decision-making process of OPEC—the two major OPEC oil price increases in the 1970s would have been far less likely in the absence of conducive macroeconomic conditions resulting in excess demand in the oil market."
Contrary to the conventional wisdom, they argue that the great stagflation of the 1970s was not a result of political events in the Middle East, but instead was set off by excessively expansionary monetary policy in the late 1960s and early 1970s. This monetary ease set off a boom in commodity prices, including the price of oil; the stagflationary impact of commodity price increases in turn promoted accommodative monetary policy. Barsky and Kilian offered a variety of evidence, both direct and indirect, to support their thesis. The discussants, Alan Blinder and Olivier Blanchard, argued for a more nuanced interpretation that treats at least some part of major price increases as exogenous. Nevertheless, Barsky and Kilian's analysis poses an important challenge for traditional views on the role of oil prices in the macroeconomy. Empirical analyses of the patterns of monetary policy transmission by structural vector autoregression (SVAR) methods have typically been
characterized by the so-called price puzzle. The price puzzle is the finding that unexpected increases in the short-term interest rate (a tightening of monetary policy) tend to be followed by moderate increases in inflation, rather than decreases as predicted by conventional macro models. Some economists have taken the price puzzle as evidence that the SVAR analyses are in fact poorly identified and unreliable; others have suggested various "fixes" to try to eliminate this apparently anomalous result. The paper by Marvin Barth and Valerie Ramey is among the first to explore the possibility that the price puzzle is not a puzzle or statistical quirk after all, but reflects a genuine inflationary impact of increases in the short-term interest rate. They argue that, by increasing the cost of credit and hence firms' overall costs of production, increases in interest rates can in principle lead to price increases rather than decreases (and thus to decreased output for supply-side reasons), a mechanism they refer to as the "cost channel" of monetary policy. The authors present a variety of aggregate and industry-level evidence to support their view that the cost-side theories of monetary policy transmission deserve serious consideration. Though their approach still requires a methodology for identifying aggregate demand and aggregate supply shocks, the disaggregated data in particular allow them to extract more information than is usually possible. Using two-digit industry-level data, for example, they find evidence that monetary policy directly affects prices through interest costs. Indeed, a rise in interest rates appears to predict both falling output and rising price-wage ratios in many industries. It is interesting that this "cost channel" effect appears to be most pronounced prior to 1980, during an era when monetary policy was particularly volatile.
Many of the most troubling empirical puzzles in macroeconomics and finance can be traced to inconsistencies between the data and economists' canonical models of intertemporal consumption choice. The equity premium puzzle (if there still is one) fundamentally derives from the fact that aggregate consumption seems too smooth and predictable for consumption risk to explain the large observed excess returns to equity, relative to bonds. Researchers have experimented with a wide variety of models that potentially magnify the effects of consumption volatility on asset prices, through mechanisms such as habit persistence in consumption. One channel that has been explored by Anthony Lynch and others is based on the view that individuals do not continuously adjust their consumption rates, but (because of cognitive and other costs) do so only periodically. In their paper, Xavier Gabaix and David Laibson show that, in such a model, consumption volatility can have a large effect on asset price returns, even if consumers have "reasonable" rates of risk aversion. Using a continuous-time version of Lynch's period-adjustment model, Gabaix and Laibson are able to get remarkably simple and elegant analytic results. With the further assumption of limited participation (that is, that only a portion of the population holds equities), the authors argue that the model is able to match the key statistical properties of aggregate consumption and equity returns. One important empirical prediction of the model is that aggregate consumption should respond to lagged changes in equity returns, as individual consumers are not able to respond immediately to new information. This prediction seems to fly in the face of the conventional wisdom that aggregate consumption changes are largely unpredictable. However, Gabaix and Laibson revisit this literature and argue that in fact this prediction of their model is broadly consistent with the data.
Data on international capital flows are notoriously poor; for example, the sum of world current accounts typically equals a large negative number, rather than zero as it should. Data on countries' net holdings of foreign assets are even worse, largely because it is extremely difficult for statistical authorities to take account of capital gains and losses on existing foreign assets. Thus simply cumulating current-account surpluses (already badly measured for many countries) does not give an accurate picture of individual countries' true net indebtedness vis-a-vis the rest of the world. Famously, the cumulative U.S. current account went negative in the mid-eighties, many years before net interest payments from abroad became negative, most likely because U.S. citizens experienced particularly large capital gains on foreign investments made during the first two decades after World War II.
Efforts have been made in recent years, particularly in OECD countries, to improve the net-foreign-assets data by using information from equity markets and other financial markets to adjust valuations of external holdings. In their paper, Philip Lane and Gian Maria Milesi-Ferretti discuss and apply a new dataset that they have developed that similarly upgrades the data for net foreign assets of developing countries. Construction of this dataset is an important achievement, since it is precisely for developing countries that foreign-asset accumulation or indebtedness is often of greatest macroeconomic significance. Lane and Milesi-Ferretti use their new dataset to explore a number of central theories in international finance, some of which heretofore have not been directly testable. For example, previous authors have tested for portfolio balance effects using cumulated current accounts, finding little evidence of any effect of external asset holdings on interest rates. The results here, while in need of further refinement, instead tend to confirm the prediction of the portfolio balance theory that real interest differentials should be inversely related to net foreign-asset positions. The authors also explore modern theories of the trade balance and real exchange rates that assign a central role to net foreign-asset positions. This paper, and especially the new dataset it introduces, will undoubtedly spark considerable further empirical exploration of these issues.
Timothy Cogley and Thomas Sargent ask whether there is a danger of "recidivism" in monetary policy as inflation rates remain low and stable. In particular, given the observed behavior of inflation, unemployment, and interest rates, might the monetary authorities begin to believe (as they purportedly did in the 1960s) that the economy exhibits an exploitable Phillips-curve relation in the long run? The authors explore this question using a computer-intensive, nonlinear Bayesian modeling approach which is designed to allow for the effects of shifting beliefs and preferences of the monetary authorities on the joint dynamics of inflation, unemployment, and interest rates. (Formally, their model is a vector autoregression with parameter drift.) They use their methods to develop some interesting stylized facts, such as the observation that the mean and persistence of inflation are positively correlated. Cogley and Sargent interpret their results through the lens of the traditional Solow-Tobin test of the natural-rate hypothesis, a test that Sargent has argued is valid only if the inflation process is highly persistent. During the 1960s inflation was low and not persistent, and thus the Solow-Tobin test (erroneously) rejected the natural-rate hypothesis. According to the authors, this rejection led the monetary authorities to believe in the existence of an exploitable Phillips curve and thus to engage in policies that created high and persistent inflation during the 1970s.
The persistence of inflation during the 1970s in turn led the Fed to "discover" the natural-rate hypothesis (as the Solow-Tobin test became valid), which led to a change in Fed behavior and the low-inflation regime that has existed since about 1983. The danger, for which Cogley and Sargent's methods provide some evidence, is that the long period of low and nonpersistent inflation may lead the Fed once again to conclude that an exploitable trade-off exists. Discussant Christopher Sims took issue with the authors' assumption that the data are characterized by parameter drift rather than stochastic volatility. The authors conceded the possibility that the data might be better described by constant parameters and changing volatility rather than the reverse; but they justified their modeling choice by noting that their interest was in the implication of the classic Lucas critique that changes in policy regimes will lead to changes in reduced-form parameters. While further generalization may prove useful, Cogley and Sargent have both made a useful contribution to econometric modeling and drawn an important lesson for policy.
The editors would like to take this opportunity to thank Martin Feldstein and the National Bureau of Economic Research for their continued support of this conference and publication; the NBER's conference staff for excellent logistical support; and the National Science Foundation for financial assistance. Doireann Fitzgerald did an excellent job as conference rapporteur and editorial assistant for this volume. This volume is Ben Bernanke's last as coeditor. He would like to express his personal thanks to the NBER, to his coeditors Julio Rotemberg and Kenneth Rogoff, and to the authors, discussants, and conference participants who have made his stint at the Macro Annual an enjoyable one. Mark Gertler will replace Bernanke as coeditor for Volume 17. Ben S. Bernanke and Kenneth Rogoff
Abstracts
Is Growth Exogenous? Taking Mankiw, Romer, and Weil Seriously
BEN S. BERNANKE AND REFET S. GURKAYNAK
Is long-run economic growth exogenous? To address this question, we show that the empirical framework of Mankiw, Romer, and Weil (1992) can be extended to test any growth model that admits a balanced growth path, and we use that framework both to revisit variants of the Solow growth model and to evaluate simple alternative models of endogenous growth. To allow for the possibility that economies in our sample are not on their balanced growth paths, we also study the cross-sectional behavior of total-factor-productivity growth, which we estimate using alternative measures of labor's share. Our broad conclusion, based on both model estimation and growth accounting, is that long-run growth is significantly correlated with behavioral variables such as the savings rate, and that this correlation is not easily explained by models in which growth is treated as the exogenous variable. Hence, future empirical studies should focus on models that exhibit endogenous growth.
Long-Term Capital Movements
PHILIP R. LANE AND GIAN MARIA MILESI-FERRETTI
International financial integration allows countries to become net creditors or net debtors with respect to the rest of the world. In this paper, we show that a small set of fundamentals (shifts in relative output levels, the stock of public debt, and demographic factors) can do much to explain the evolution of net foreign-asset positions. In addition, we highlight that "external wealth" plays a critical role in determining the behavior of the trade balance, both through shifts in the desired net foreign-asset position and through the investment returns generated on the outstanding stock of net foreign assets. Finally, we provide some evidence that a portfolio balance effect exists: real interest-rate differentials are inversely related to net foreign-asset positions.
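The gap between cumulated current accounts and valuation-adjusted net foreign-asset positions, which the editorial cites as the motivation for this dataset, can be illustrated with a small accounting sketch. It assumes the standard identity that the change in net foreign assets equals the current account plus capital gains or losses on existing holdings. The function name and all numbers are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
from itertools import accumulate


def net_foreign_assets(initial_nfa, current_accounts, capital_gains):
    """Return the NFA path under NFA_t = NFA_{t-1} + CA_t + KG_t.

    The identity is an assumption of this sketch; inputs are made-up numbers.
    """
    flows = [ca + kg for ca, kg in zip(current_accounts, capital_gains)]
    # accumulate(..., initial=...) yields the starting stock first; drop it.
    return list(accumulate(flows, initial=initial_nfa))[1:]


# Hypothetical country: persistent current-account deficits that are
# offset by capital gains on its existing foreign assets.
current_accounts = [-2.0, -3.0, -3.0]   # billions, made up
capital_gains = [3.0, 3.0, 2.0]         # billions, made up

cumulated_ca = net_foreign_assets(0.0, current_accounts, [0.0, 0.0, 0.0])
valuation_adjusted = net_foreign_assets(0.0, current_accounts, capital_gains)

print(cumulated_ca)         # position implied by cumulating current accounts
print(valuation_adjusted)   # position once capital gains are included
```

In this toy case the cumulated current account turns sharply negative while the valuation-adjusted position does not, which is the kind of discrepancy the editorial describes for the United States in the mid-eighties.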
Do We Really Know that Oil Caused the Great Stagflation? A Monetary Alternative
ROBERT B. BARSKY AND LUTZ KILIAN
This paper argues that major oil price increases were not nearly as essential a part of the causal mechanism that generated the stagflation of the 1970s as is often thought. There is neither a theoretical presumption that oil supply shocks are stagflationary nor robust empirical evidence for this view. In contrast, we show that monetary expansions and contractions can generate stagflation of realistic magnitude even in the absence of supply shocks. Furthermore, monetary fluctuations help to explain the historical movements of the prices of oil and other commodities, including the surge in the prices of industrial commodities that preceded the 1973-1974 oil price increase. Thus, they can account for the striking coincidence of major oil price increases and worsening stagflation.
The Cost Channel of Monetary Transmission
MARVIN J. BARTH III AND VALERIE A. RAMEY
This paper presents evidence that the cost channel may be an important part of the monetary transmission mechanism. We first highlight three puzzles that might be explained by a cost channel of monetary transmission. We then provide evidence on the importance of working capital and argue why monetary contractions can affect output through a supply channel as well as the traditional demand-type channels. Using a vector autoregression analysis, we investigate the effects across industries. Following a monetary contraction, many industries exhibit periods of falling output and rising price-wage ratios, consistent with a supply shock. The effects are noticeably more pronounced during the period before 1979.
The 6D Bias and the Equity-Premium Puzzle
XAVIER GABAIX AND DAVID LAIBSON
If decision costs lead agents to update consumption every D periods, then econometricians will find an anomalously low correlation between equity returns and consumption growth (Lynch, 1996). We analytically characterize the dynamic properties of an economy composed of consumers who have such delayed updating. In our setting, an econometrician using an Euler equation procedure would infer a coefficient of relative risk aversion biased up by a factor of 6D. Hence with quarterly data, if agents adjust their consumption every D=4 quarters, the imputed coefficient of relative risk aversion will be 24 times greater than the true value. High levels of risk aversion implied by the equity premium and violations of the Hansen-Jagannathan bounds cease to be puzzles. The neoclassical model with delayed adjustment explains the consumption behavior of shareholders. Once limited participation is taken into account, the model matches most properties of aggregate consumption and equity returns, including new evidence that the covariance between ln(C_{t+h}/C_t) and R_{t+1} slowly rises with h.
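The arithmetic in this abstract can be checked in a few lines of Python. This is only a sketch of the stated relationship (the imputed coefficient equals 6D times the true coefficient); the function name and the example true value of 2 are hypothetical and do not come from the paper.

```python
def imputed_risk_aversion(true_gamma: float, d_periods: float) -> float:
    """Risk aversion an Euler-equation econometrician would infer,
    taking the abstract's claim of an upward bias factor of 6D at face value."""
    return 6.0 * d_periods * true_gamma


# Quarterly data with consumption updated every D = 4 quarters.
bias_factor = imputed_risk_aversion(1.0, 4)
print(bias_factor)  # 24.0, the factor quoted in the abstract

# Hypothetical true coefficient of 2: the econometrician would infer 48.
print(imputed_risk_aversion(2.0, 4))
```

Read the other way, a large imputed coefficient divided by 6D gives the much smaller true value that the delayed-adjustment model would attribute to consumers.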
Evolving Post-World War II U.S. Inflation Dynamics
TIMOTHY COGLEY AND THOMAS J. SARGENT
For postwar U.S. data, this paper uses Bayesian methods to account for the four sources of uncertainty in a random coefficients vector autoregression for inflation, unemployment, and an interest rate. We use the model to assemble evidence about the evolution of measures of the persistence of inflation, prospective long-horizon forecasts (means) of inflation and unemployment, statistics for testing an approximation to the natural-unemployment-rate hypothesis, and a version of the Taylor rule. We relate these measures to stories that interpret the conquest of U.S. inflation under Volcker and Greenspan as reflecting how the monetary policy authority came to learn an approximate version of the natural-unemployment-rate hypothesis. We study Taylor's warning that defects in that approximation may cause the monetary authority to forget the natural-rate hypothesis as the persistence of inflation attenuates.
Ben S. Bernanke and Refet S. Gurkaynak PRINCETON UNIVERSITY
Is Growth Exogenous? Taking Mankiw, Romer, and Weil Seriously
We thank Alan Heston and Robert Summers for providing us with preliminary data, Peter Bondarenko for expert research assistance, and the conference discussants, Robert Solow, and Princeton colleagues for useful comments. Bernanke gratefully acknowledges the support of the National Science Foundation, and Gürkaynak the support of an SSRC Program in Applied Economics Fellowship.
1. Introduction
"This paper takes Robert Solow seriously." Thus begins one of the most influential and widely cited pieces in the empirical growth literature, a 1992 article by N. Gregory Mankiw, David Romer, and David Weil. In brief, Mankiw, Romer, and Weil (1992), henceforth MRW, performed an empirical evaluation of a "textbook" Solow (1956) growth model using the Penn World Tables, a multicountry data set constructed by Summers and Heston (1988) for the years 1960-1985. MRW found support for the Solow model's predictions that, in the long-run steady state, the level of real output per worker by country should be positively correlated with the saving rate and negatively correlated with the rate of labor-force growth. However, their estimates of the textbook Solow model also implied a capital share of factor income of about 0.60, high compared to the conventional value (based on U.S. data) of about one-third. To address this possible inconsistency, MRW considered an augmented version of the Solow model, in which human capital enters as a factor of production in symmetrical fashion with physical capital and raw labor. They found that the augmented Solow model fits the data better and yields an estimated capital share more in line with conventional wisdom. They concluded (abstract, p. 407) that "an augmented Solow model that includes accumulation of human as well as physical capital provides an excellent description of the cross-country data." Numerous authors have since used the MRW framework to study the significance of additional factors to growth (see Durlauf and Quah, 1999, for references). Islam (1995) and others have extended the MRW analysis to panel data.
That MRW's augmented Solow model fits the cross-country data well is an interesting finding (and, as they point out, the results could have been otherwise). However, as we will discuss in some detail below, it is not entirely clear to what degree the good fit of the MRW specification may be attributed to elements that are common to many models of economic growth (such as the Cobb-Douglas production structure), and how much of the fit is due to elements that are specific to the Solow formulation (such as the exogeneity of steady-state growth rates). Indeed, as we will show, MRW's basic estimation framework is broadly consistent with any growth model that admits a balanced growth path, a category that includes virtually all the growth models in the literature (see footnote 1). Hence, one might argue that MRW do not actually test the Solow model, in the sense of distinguishing it from possible alternative models of economic growth. On the other hand, the fact that the MRW framework is for the most part not specific to the Solow model is also a potential strength, as it implies that their approach can in principle be used to evaluate not only that model but other candidate growth models as well.
Because the policy implications of the Solow model and other growth models (especially endogenous-growth models) differ markedly, assessing the empirical relevance of alternative models is an important task. In this paper we modestly extend the empirical framework introduced by MRW and use it to reevaluate both the Solow model and some alternatives.
In particular, we re-examine the crucial prediction of the Solow model, that long-run economic growth is determined solely by exogenous technical change and is independent of variables such as the aggregate saving rate, schooling rates, and the growth rate of the labor force. To anticipate our conclusion, we find strong statistical evidence against the basic Solow prediction. In particular, we find that a country's rate of investment in physical capital is strongly correlated with its long-run growth rate of output per worker, and that rates of human-capital accumulation and population growth are also correlated, though somewhat less strongly, with the rate of economic growth.
Footnote 1. Durlauf and Quah (1999) derive a general framework that nests a variety of alternative growth models, including alternative versions of the Solow model.
The rest of the paper is organized as follows. Section 2 reconsiders the MRW empirical framework. We show that the assumptions underlying
their specification can be broken into two parts: those that apply to any growth model admitting a balanced growth path (BGP), and those that are specific to the Solow model. This discussion paves the way for subsequent reanalysis of both the Solow model and some simple alternatives.
The empirics of the Solow model, under the maintained assumption of steady states, are revisited in Section 3. We first replicate and extend the MRW results, using more recent data and a longer sample period. We find that both the textbook and augmented Solow models perform slightly less well with updated data, and that parameter restrictions of the model that MRW found to be consistent with the data are now typically rejected. However, we do not consider these results to be particularly informative about the applicability of the Solow model, particularly its strong implication that long-run growth is exogenous. Instead, we propose a more powerful test of the Solow model, based on its prediction that in the steady state national growth rates should be independent of variables such as the saving rate and the rate of human-capital formation. We find a strong rejection of the joint hypothesis that the Solow model is correct and that the economies in our sample are in steady states.
Section 4 uses our version of the MRW framework to consider some simple alternative growth models: the Uzawa (1965)-Lucas (1988) two-sector model with human-capital formation, and the so-called AK model. Both models have some explanatory power, in the sense that rates of human-capital formation (Uzawa-Lucas) and of physical-capital accumulation (AK) both appear to be strongly related to output growth in the long run. However, neither model is a complete description of the cross-country data; in particular, the overidentifying restrictions imposed by each model are decisively rejected.
All the analysis through Section 4 is based on the assumption that the economies in the sample are on balanced growth paths.
If all or some of the economies were in fact in transition to a balanced growth path during the sample period, our tests are invalid. MRW study the issue of non-steady-state behavior by estimating rates of convergence and relating these to the parameters of the model. We take a more direct approach: According to the Solow model, total factor productivity (TFP) growth rates should be independent of behavioral variables such as the saving rate, whether the economy is in a steady state or not. In Section 5 we construct estimates of factor shares for more than 50 countries, which allow us to infer long-run TFP growth rates. We also consider TFP growth rates for the full sample, based on a plausible assumption about factor shares. Finally, in Section 6, we verify that long-run TFP growth rates are not statistically independent of national rates of saving and
14 • BERNANKE & GURKAYNAK
other behavioral variables. We do not here take a strong position on the direction of causation between TFP growth and other country characteristics, as either suggests that a richer model than the Solow model is needed to explain long-run growth.
2. A Generalized Mankiw-Romer-Weil Framework
MRW provide an appealing framework for comparing the implications of the Solow model with the cross-country data. In this section we show that their framework is potentially even more fruitful than they claim, in that it can be used to evaluate essentially any growth model that admits a BGP. Indeed, as we will show, the MRW framework can be thought of as consisting of two parts: a general structure that is applicable to any model admitting a BGP, and a set of restrictions imposed on this structure by the specific growth model (such as the Solow model) being studied. Here we develop the point in some generality; in subsequent sections we apply the generalized MRW approach to study both the Solow model and some alternative models of economic growth.
Assume that in a given country at time $t$, the output $Y_t$ depends on inputs of raw labor $L_t$ and three types of accumulated factors: $K_t$, $H_t$, and $Z_t$. The factors $K_t$ and $H_t$ are accumulated through the sacrifice of current output (think of physical capital and human capital, or structures and equipment). The factor $Z_t$, which could be an index of technology, or of human capital acquired through learning-by-doing, is assumed to be accumulated as a byproduct of economic activity and does not require the sacrifice of current output. The four factors of production combine to produce output according to the following standard, constant-returns-to-scale Cobb-Douglas form (note that $Z_t$ multiplies raw labor $L_t$ and thus may also be thought of as an index of labor productivity):
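A form consistent with this description is sketched below. It is a reconstruction from the surrounding text rather than a transcription: the exponents $\alpha$ and $\beta$ on K and H are taken to be the shares referred to later in the section, and the equation number is inferred from the in-text references.

$$
Y_t = K_t^{\alpha} H_t^{\beta} (Z_t L_t)^{1-\alpha-\beta} \qquad (2.1)
$$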
Output may either be consumed or transformed into K-type or H-type capital:
where $C_t$ is consumption and the overdot indicates a time derivative. K-type and H-type capital depreciate at rates $\delta_K$ and $\delta_H$, respectively. Z-type capital does not use up output, but is accumulated according to some yet unspecified relationship that links changes in Z to the current state of the economy:
Behavioral or technological parameters (such as the parameter that links the rate of learning-by-doing to the level of production) may be implicit in $z(\cdot)$. Finally, the labor force grows at exogenous rate $n$:
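The three relations just described (the resource constraint, the accumulation of Z, and labor-force growth) can be sketched as follows. This is a reconstruction inferred from the definitions in the text, not a transcription; the arguments of $z(\cdot)$ are left unspecified, as in the text, and the equation numbers are inferred.

$$
\begin{aligned}
Y_t &= C_t + (\dot{K}_t + \delta_K K_t) + (\dot{H}_t + \delta_H H_t), && (2.2)\\
\dot{Z}_t &= z(\cdot), && (2.3)\\
\dot{L}_t / L_t &= n. && (2.4)
\end{aligned}
$$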
We consider a BGP of this economy in which constant shares of output, denoted by $s_K$ and $s_H$, respectively, are devoted to gross investment in the two capital goods. For now we take these shares to be strictly exogenous. This assumption is harmless for the analysis of the Solow model, which also assumes exogenous saving rates. We examine the case of endogenous saving rates at various points below. Using lowercase letters to denote per-worker quantities, e.g., $y_t = Y_t/L_t$, we can rewrite the production function and the capital accumulation equations in a standard way as
The growth rates of k and h, which are constant along the BGP, are given by
The growth rate of output per worker is
where $g_Z = \dot{Z}_t/Z_t$. The first term on the right-hand side of the expression for $g_k$, equation (2.8), equals $s_K Y_t/K_t$. Since both $g_k$ and $\delta_K + n$ are constant along the BGP, $Y_t/K_t$ must also be constant. Hence Y and K grow at the same rate on the BGP (cf. Barro and Sala-i-Martin, 1999, p. 54). By similar argument, the expression for $g_h$, equation (2.9), implies that Y and H grow at the same rate. Hence, Y, K, and H share a common growth rate, call it $g = g_k = g_h = g_y$. Finally, from the expression for $g_y$, equation (2.10), we see that Z must also grow at the same constant rate, or $g_Z = g$. The requirement that Z grow at a constant rate on the BGP rules out scale effects in the determination of Z; hence the equation for Z reduces to
We can now solve explicitly for the BGP of output per worker. Using the equations for $g_k$ and $g_h$ above, and the fact that these two quantities are equal in the steady state, we find
To simplify the algebra a bit, and for comparability with MRW, suppose that $\delta_K = \delta_H = \delta$, so that $\omega = s_H/s_K$. Solving (2.8) and (2.9) to find the BGP values of $k_t$ and $h_t$ (call them $k_t^*$ and $h_t^*$), we get
The output per worker along the BGP, $y_t^*$, is given (in logs) by
Further, the $t$-period difference in output per worker along the BGP is
To this point we have considered the BGP of a single country. Suppose now that we have a panel of countries, indexed by $i$. Further, suppose that $\ln Z_{it} = z_t + \varepsilon_{it}$,² and that $\ln y_{it} = \ln y_{it}^* + \eta_{it}$, where $\eta_{it}$ is stationary and represents cyclical deviations of output from the BGP. Then equations (2.15) and (2.16) may be written in estimation form as
As we have stressed, our analysis thus far assumes only that the economy is in a BGP and does not rule out endogenous determination of TFP (identified here with $Z_t$). To go from this generalized MRW framework to a specific growth model, additional restrictions are required. For example, in their estimation of the augmented Solow model, MRW specialize further by assuming that $\alpha_i$, $\beta_i$, and (most importantly) $g_i$ are the same for all countries, and that actual output equals BGP output ($\eta_{it} = 0$). [MRW do not write down (2.18) explicitly, but it is implicit in their calculations, as they use average output growth to determine the value of the common growth rate $g$.] Their estimation of the textbook Solow model further assumes that $\beta = 0$, that is, human capital H does not enter as a separate factor of production. In Section 4 we show how this framework can accommodate other models of economic growth. First, though, we revisit the MRW estimates, using updated data.
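The relations described in this section can be collected as follows. This is a reconstruction from the surrounding definitions rather than a transcription of the original displays; the equation numbers are inferred from the in-text references to (2.8)-(2.10) and (2.15)-(2.18). In per-worker terms, the technology and the accumulation equations are

$$
\begin{aligned}
y_t &= k_t^{\alpha} h_t^{\beta} Z_t^{1-\alpha-\beta}, && (2.5)\\
\dot{k}_t &= s_K y_t - (\delta_K + n) k_t, && (2.6)\\
\dot{h}_t &= s_H y_t - (\delta_H + n) h_t, && (2.7)
\end{aligned}
$$

so that the growth rates are

$$
\begin{aligned}
g_k &= s_K k_t^{\alpha-1} h_t^{\beta} Z_t^{1-\alpha-\beta} - (\delta_K + n), && (2.8)\\
g_h &= s_H k_t^{\alpha} h_t^{\beta-1} Z_t^{1-\alpha-\beta} - (\delta_H + n), && (2.9)\\
g_y &= \alpha g_k + \beta g_h + (1-\alpha-\beta) g_Z. && (2.10)
\end{aligned}
$$

On the BGP, $g_k = g_h = g_y = g_Z = g$, which gives

$$
\begin{aligned}
\dot{Z}_t / Z_t &= g, && (2.11)\\
\omega \equiv \frac{h_t}{k_t} &= \frac{s_H}{s_K} \cdot \frac{n + g + \delta_K}{n + g + \delta_H}. && (2.12)
\end{aligned}
$$

With a common depreciation rate $\delta$, setting $g_k = g$ in (2.8) and using $h_t = \omega k_t$ yields

$$
\begin{aligned}
k_t^* &= Z_t \left( \frac{s_K^{1-\beta} s_H^{\beta}}{n+g+\delta} \right)^{1/(1-\alpha-\beta)}, && (2.13)\\
h_t^* &= Z_t \left( \frac{s_K^{\alpha} s_H^{1-\alpha}}{n+g+\delta} \right)^{1/(1-\alpha-\beta)}, && (2.14)\\
\ln y_t^* &= \ln Z_t + \frac{\alpha}{1-\alpha-\beta} \ln s_K + \frac{\beta}{1-\alpha-\beta} \ln s_H - \frac{\alpha+\beta}{1-\alpha-\beta} \ln(n+g+\delta), && (2.15)\\
\ln y_t^* - \ln y_0^* &= g t. && (2.16)
\end{aligned}
$$

Adding country subscripts and the two error terms gives the estimation form:

$$
\begin{aligned}
\ln y_{it} &= z_t + \frac{\alpha_i}{1-\alpha_i-\beta_i} \ln s_{K,i} + \frac{\beta_i}{1-\alpha_i-\beta_i} \ln s_{H,i} - \frac{\alpha_i+\beta_i}{1-\alpha_i-\beta_i} \ln(n_i+g_i+\delta) + \varepsilon_{it} + \eta_{it}, && (2.17)\\
\ln y_{it} - \ln y_{i0} &= g_i t + (\eta_{it} - \eta_{i0}). && (2.18)
\end{aligned}
$$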
3. Replication and Extension of the MRW Results
The original MRW article used cross-national data for the period 1960-1985. In this section we replicate the MRW results for 1960-1985 and extend them through 1995. We find that MRW's conclusions about the fit of the textbook Solow model and the augmented Solow model seem slightly weaker when we use revised and/or extended data, though their main results survive. We also propose a new test of the Solow model based on joint estimation of equations in the form of (2.17) and (2.18).
2. MRW assume (in our notation) that $\ln Z_{i0} = z_0 + \varepsilon_{i0}$. Their assumption implies that $z_t = z_0 + gt$ and $\varepsilon_{it} = \varepsilon_{i0} + (g_i - g)t$, where $g$ is the mean country growth rate. Under the MRW assumption that $g_i = g$, we have simply $\varepsilon_{it} = \varepsilon_{i0}$. We discuss the implications of this error structure further below.
Following MRW we draw our basic data from the Summers-Heston Penn World Tables (PWT), which contain information on real output, investment, and population (among many other variables) for a large number of countries. The data set used in the original MRW study was PWT version 4.0. The PWT data have been revised twice since publication of the MRW article; as of this writing, PWT version 5.6 (which extends coverage of most variables through 1992) is the latest publicly available version. Alan Heston and Robert Summers have also kindly supplied us with a preliminary version of PWT version 6.0, which extends the data through 1998 for most variables.³ In what follows we compare results using all three PWT data sets (4.0, 5.6, and preliminary 6.0).
MRW measure $n$ as the average growth of the working-age population (ages 15 to 64). They obtained these data from the World Bank's World Tables and the 1988 World Development Report. We use the original MRW data on working-age population in conjunction with the PWT 4.0 data set. For analyses using PWT 5.6 and PWT 6.0, we use analogous data taken from the World Bank's World Development Indicators 2000 CD-ROM.
The saving rate relevant to physical capital, $s_K$, is measured as the average share of gross investment in GDP, as in MRW. In open economies, of course, investment and saving need not be equal. However, if the capacity of countries to borrow abroad is limited (for reasons well known from the literature on sovereign debt), MRW's identification of the ratio of investment to GDP with $s_K$ seems defensible, even though technically investment is not fully financed by domestic saving.
Reconciling closed-economy growth models with the existence of international capital flows is a general problem in this literature, and we do not have much to add on the issue here.⁴
3. Of course, Heston and Summers are not responsible for results obtained using these preliminary data.
4. For an open-economy extension of the augmented Solow model of MRW, see Barro, Mankiw, and Sala-i-Martin (1995).
MRW's estimates of the augmented Solow model (with human-capital accumulation) include a variable they call SCHOOL, analogous to our $s_H$, which is the average percentage of a country's working-age population in secondary school. More specifically, MRW define SCHOOL as the percentage of school-age population (12-17) attending secondary school times the percentage of the working-age population that is of secondary-school age (15-19). The age ranges in the two components of SCHOOL are incommensurate, but we are inclined to agree with MRW that the imperfect matchup is not likely to create major biases, and we use the same construct. Data on enrollment rates and on working-age population and its components are from the sources noted two paragraphs above and from the UN World Population Prospects.
With these data we perform the following exercises. First, we replicate the MRW results for the textbook Solow model for their sample period, 1960-1985, for each of their three country samples and using all three vintages of the PWT data. Next, we use the data sets PWT 5.6 and PWT 6.0 to repeat the estimation for the periods 1960-1990 and 1960-1995, respectively. Finally, we repeat these exercises for MRW's augmented Solow model.
The replication of MRW's results for the textbook Solow model and their 1960-1985 sample period is contained in Table 1 (compare MRW's Table I, p. 414 of their article). As in MRW, the three country samples we examine are (1) the non-oil sample, the set of all countries for which complete data are available, excluding oil producers (98 countries); (2) the intermediate sample, which is the non-oil sample excluding countries whose data receive a grade of D from Summers and Heston or whose population is less than one million (75 countries)⁵; and (3) the OECD sample, OECD countries with populations greater than one million (22 countries).⁶ Note that, because of missing data, the sample sizes are in some cases slightly smaller when PWT 5.6 and PWT 6.0 are used for the replication.
When we repeat the MRW estimations using PWT 4.0 (see the three leftmost columns of Table 1), our results are essentially identical to theirs, as expected. In particular, in the restricted regression (which imposes cross-parameter restrictions on the regression coefficients) we find an R² of 0.59 for both the non-oil and intermediate samples, suggesting that the model explains a significant part of the variation in real output per worker among these countries. For the OECD sample, the R² is a much more modest 0.06, as in MRW. The single restriction imposed by the model is not rejected in any of the three samples.
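The unrestricted and restricted regressions described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code or data: the sample values, the true capital share of 1/3, the noise level, and the helper name `ols` are all assumptions made for the example. The restriction is that the coefficients on ln(I/GDP) and ln(n + g + δ) are equal in magnitude and opposite in sign, with g + δ set to 0.05.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 98                      # same size as the non-oil sample; data are synthetic
alpha_true = 1.0 / 3.0          # assumed capital share used to generate the data

# Synthetic investment shares and working-age population growth rates.
s_k = rng.uniform(0.05, 0.35, n_obs)
n = rng.uniform(0.0, 0.04, n_obs)

ln_s = np.log(s_k)
ln_ngd = np.log(n + 0.05)       # g + delta assumed to be 0.05
slope_true = alpha_true / (1.0 - alpha_true)
ln_y = 8.0 + slope_true * (ln_s - ln_ngd) + rng.normal(0.0, 0.3, n_obs)


def ols(X, y):
    """Return OLS coefficients and the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


ones = np.ones(n_obs)
# Unrestricted: separate coefficients on ln(I/GDP) and ln(n + g + delta).
b_u, ssr_u = ols(np.column_stack([ones, ln_s, ln_ngd]), ln_y)
# Restricted: a single coefficient on the difference of the two logs.
b_r, ssr_r = ols(np.column_stack([ones, ln_s - ln_ngd]), ln_y)

# F statistic for the single linear restriction (3 unrestricted parameters).
f_stat = (ssr_r - ssr_u) / (ssr_u / (n_obs - 3))
# Capital share implied by the restricted slope, slope = alpha / (1 - alpha).
alpha_hat = b_r[1] / (1.0 + b_r[1])

print(b_u, b_r, f_stat, alpha_hat)
```

On data generated to satisfy the restriction, as here, the F statistic should usually be small; a large value would indicate rejection of the restriction.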
The primary shortcoming of the results, as identified by MRW, is that the estimated capital share $\alpha$ is about 0.60 in both the non-oil and intermediate samples, a value that seems too high. The estimated $\alpha$ for the OECD sample is a more reasonable 0.36.
5. More recent versions of the PWT data no longer include these grades.
6. Our OECD sample coincides with that of MRW throughout; that is, we do not include countries joining since 1990.
We also obtained estimates for the MRW sample period, 1960-1985, using revised PWT data (see Table 1). The results are again similar to those found by MRW, with two exceptions worth noting: First, when the revised data are used, the overidentifying restriction of the model is rejected for the non-OECD country samples (the p-values are 0.02 and
Table 1 ESTIMATION OF THE TEXTBOOK SOLOW MODEL FOR THREE ALTERNATIVE VINTAGES OF THE PWT DATASETᵃ
Value (Standard Error)
Data vintage                PWT 4.0                                   PWT 5.6                                   PWT 6.0
Parameter                   Non-Oil       Intermediate  OECD          Non-Oil       Intermediate  OECD          Non-Oil       Intermediate  OECD
No. of observations         98            75            22            96            75            22            90            72            21
Constant                    5.62 (1.56)   5.47 (1.52)   7.99 (2.46)   4.44 (1.35)   4.74 (1.39)   8.66 (2.49)   5.06 (1.35)   5.23 (1.46)   9.10 (2.48)
ln(I/GDP)                   1.43 (0.14)   1.32 (0.17)   0.50 (0.43)   0.97 (0.09)   1.02 (0.13)   0.61 (0.53)   0.88 (0.09)   0.93 (0.14)   0.36 (0.37)
ln(n + g + δ)               -1.92 (0.55)  -1.97 (0.53)  -0.75 (0.83)  -2.25 (0.49)  -2.19 (0.49)  -0.66 (0.82)  -2.14 (0.49)  -2.13 (0.51)  -0.53 (0.79)
R²                          0.59          0.59          0.02          0.64          0.62          0.00          0.62          0.56          0.01
Restricted Regression
Constant                    6.87 (0.12)   7.10 (0.15)   8.61 (0.53)   7.74 (0.08)   7.71 (0.11)   8.76 (0.60)   8.31 (0.08)   8.25 (0.12)   9.52 (0.37)
ln(I/GDP) - ln(n + g + δ)   1.49 (0.12)   1.43 (0.14)   0.56 (0.36)   1.07 (0.08)   1.16 (0.11)   0.63 (0.41)   0.98 (0.09)   1.09 (0.12)   0.40 (0.26)
R²                          0.59          0.59          0.06          0.63          0.60          0.06          0.60          0.54          0.06
Test of Restriction
p-Value                     0.42          0.29          0.80          0.02          0.04          0.97          0.02          0.04          0.86
Implied α                   0.60 (0.02)   0.59 (0.02)   0.36 (0.15)   0.52 (0.02)   0.54 (0.02)   0.39 (0.15)   0.49 (0.02)   0.52 (0.03)   0.29 (0.14)
ᵃ Dependent variable: log (GDP per working-age person) in 1985. Standard errors are in parentheses. The investment and population growth rates are averaged over the period 1960-1985. g + δ is assumed to be 0.05.
Table 2 ESTIMATION OF THE TEXTBOOK SOLOW MODEL FOR MORE RECENT SAMPLE PERIODSᵃ
Value (Standard Error)
Sample period (data)        1960-1990 (PWT 5.6)                       1960-1995 (PWT 6.0)
Parameter                   Non-Oil       Intermediate  OECD          Non-Oil       Intermediate  OECD
No. of observations         85            70            22            90            72            21
Constant                    3.59 (1.37)   3.62 (1.36)   7.96 (2.20)   4.16 (1.38)   4.58 (1.44)   7.79 (2.37)
ln(I/GDP)                   0.94 (0.10)   0.95 (0.13)   0.65 (0.47)   1.07 (0.10)   1.11 (0.14)   0.38 (0.37)
ln(n + g + δ)               -2.59 (0.49)  -2.60 (0.47)  -0.97 (0.73)  -2.66 (0.49)  -2.54 (0.50)  -1.07 (0.75)
R²                          0.67          0.66          0.09          0.68          0.65          0.12
Restricted Regression
Constant                    7.84 (0.09)   7.79 (0.12)   8.72 (0.55)   8.24 (0.08)   8.19 (0.12)   9.48 (0.37)
ln(I/GDP) - ln(n + g + δ)   1.09 (0.09)   1.19 (0.11)   0.74 (0.37)   1.22 (0.09)   1.32 (0.12)   0.57 (0.27)
R²                          0.63          0.62          0.13          0.66          0.63          0.14
Test of Restriction
p-Value                     0.00          0.00          0.72          0.00          0.01          0.48
Implied α                   0.52 (0.02)   0.54 (0.02)   0.43 (0.12)   0.55 (0.02)   0.57 (0.02)   0.36 (0.11)
ᵃ Dependent variable: log (GDP per working-age person) in 1990 (PWT 5.6) and 1995 (PWT 6.0). Standard errors are in parentheses. The investment and population growth rates are averaged over the periods 1960-1990 or 1960-1995, depending on the sample. g + δ is assumed to be 0.05.
0.04 respectively for both the PWT 5.6 data and the PWT 6.0 data). This rejection contrasts with MRW's original finding for the same sample period. Second, we find somewhat lower estimates of the capital share, closer to 0.5 than 0.6.
As MRW's results go only through 1985, it is interesting to see whether their findings hold for updated data. Table 2 shows the results of estimating the MRW specification using more recent data and hence longer sample periods. The leftmost three columns of the table show estimates for the 1960-1990 sample period (using PWT 5.6), and the rightmost three columns show the results for 1960-1995 (using PWT 6.0). The end dates were chosen to minimize the effect of missing data at the end of the sample. Qualitatively the results are similar to those in Table 1; indeed, relative to the results for 1960-1985, R² is somewhat higher for both sample periods and each group of countries. However, the overidentifying restriction proposed by MRW is now strongly rejected outside the OECD (the p-values for the non-oil and intermediate samples are respectively 0.00 and 0.00 for 1960-1990, and 0.00 and 0.01 for 1960-1995). The estimated capital shares remain between 0.5 and 0.6 for the large samples, and they rise to about 0.4 for the OECD sample.
As we have noted, the high estimated values of the capital share obtained by MRW for the textbook Solow model led them to consider a variant of the Solow model in which human capital as well as physical capital is accumulated. In terms of our exposition of Section 2, this model allows for a nonzero coefficient $\beta$ on the second form of accumulated capital, while retaining the assumption that technology growth rates are the same for all countries. We also replicated and extended this set of MRW estimates. Our estimates of the augmented Solow model for the 1960-1985 sample period are reported in Table 3, and Table 4 gives the estimates for the 1960-1990 and 1960-1995 sample periods.
As MRW found, the performance of the augmented Solow model, with human capital, is generally better than that of the textbook version. The augmented model explains considerably more of the cross-country variation in output per worker; for example, for the 1960-1995 sample (using PWT 6.0), R² equals 0.75 for the large non-oil sample, 0.77 for the intermediate sample, and 0.45 for the OECD sample. The coefficient on human capital, $\beta$, takes on reasonable values (generally between 0.3 and 0.4), and the estimates of the coefficient on physical capital, $\alpha$, are correspondingly reduced.
There are also some problems, however. First, the overidentifying restriction on the ordinary least-squares (OLS) coefficients is rejected at the 1% level for the broadest sample for the 1960-1990 and 1960-1995 sample periods, and at the 5% level for the 1960-1985 sample using the most recent vintage of the data (PWT 6.0). Second, the estimated capital share $\alpha$ is now unreasonably low in some cases: For 1960-1985, $\alpha$ is estimated to be 0.00 for the OECD sample when PWT 5.6 is used, and -0.03 when PWT 6.0 is used. For 1960-1990 and 1960-1995 respectively, the OECD capital share is estimated to be 0.09 and 0.04.
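The "Implied α" and "Implied β" rows of Tables 1 through 4 can be recovered from the restricted-regression slopes. In the textbook model the slope on ln(I/GDP) - ln(n + g + δ) is α/(1 - α); in the augmented model the two slopes are α/(1 - α - β) and β/(1 - α - β), as in MRW. A minimal sketch follows; the function names are assumptions made for the example, and the inputs are the rounded slopes reported in the tables.

```python
def implied_alpha(slope):
    """Textbook model: invert slope = alpha / (1 - alpha)."""
    return slope / (1.0 + slope)


def implied_shares(slope_k, slope_h):
    """Augmented model: invert slope_k = alpha / (1 - alpha - beta) and
    slope_h = beta / (1 - alpha - beta); returns (alpha, beta)."""
    denom = 1.0 + slope_k + slope_h
    return slope_k / denom, slope_h / denom


# Table 1, PWT 4.0 restricted slopes (non-oil, intermediate, OECD).
textbook = [round(implied_alpha(b), 2) for b in (1.49, 1.43, 0.56)]
print(textbook)  # [0.6, 0.59, 0.36]

# Table 3, OECD restricted slopes for the three PWT vintages.
oecd_slopes = {"PWT 4.0": (0.29, 0.76), "PWT 5.6": (0.00, 1.11), "PWT 6.0": (-0.06, 1.00)}
augmented = {
    vintage: tuple(round(x, 2) for x in implied_shares(*slopes))
    for vintage, slopes in oecd_slopes.items()
}
print(augmented)
# {'PWT 4.0': (0.14, 0.37), 'PWT 5.6': (0.0, 0.53), 'PWT 6.0': (-0.03, 0.52)}
```

The OECD columns illustrate the second problem noted above: once the schooling slope is near or above one, the implied physical-capital share is pushed toward zero.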
Table 3 ESTIMATION OF THE AUGMENTED SOLOW MODEL FOR THREE ALTERNATIVE VINTAGES OF THE PWT DATAᵃ
Value (Standard Error)
Data vintage                PWT 4.0                                   PWT 5.6                                   PWT 6.0
Parameter                   Non-Oil       Intermediate  OECD          Non-Oil       Intermediate  OECD          Non-Oil       Intermediate  OECD
No. of observations         98            75            22            96            75            22            90            72            21
Constant                    6.98 (1.15)   7.87 (1.17)   8.67 (2.17)   6.80 (1.06)   7.94 (1.15)   10.84 (1.91)  6.71 (1.09)   8.38 (1.12)   10.29 (1.93)
ln(I/GDP)                   0.70 (0.13)   0.71 (0.15)   0.28 (0.39)   0.45 (0.09)   0.51 (0.12)   0.19 (0.41)   0.42 (0.10)   0.51 (0.11)   -0.01 (0.30)
ln(n + g + δ)               -1.71 (0.41)  -1.48 (0.40)  -1.06 (0.74)  -1.69 (0.38)  -1.43 (0.39)  -0.67 (0.60)  -1.82 (0.39)  -1.42 (0.38)  -0.78 (0.61)
ln SCHOOL                   0.66 (0.07)   0.73 (0.10)   0.75 (0.29)   0.61 (0.07)   0.72 (0.10)   1.17 (0.28)   0.56 (0.08)   0.71 (0.09)   1.01 (0.27)
R²                          0.78          0.77          0.24          0.80          0.78          0.46          0.76          0.77          0.42
Restricted Regression
Constant                    7.86 (0.14)   7.97 (0.15)   8.71 (0.47)   8.45 (0.10)   8.44 (0.13)   9.20 (0.47)   8.91 (0.10)   8.89 (0.11)   9.73 (0.29)
ln(I/GDP) - ln(n + g + δ)   0.74 (0.12)   0.71 (0.14)   0.29 (0.33)   0.48 (0.09)   0.52 (0.12)   0.00 (0.34)   0.46 (0.10)   0.53 (0.11)   -0.06 (0.24)
ln(SCHOOL) - ln(n + g + δ)  0.66 (0.07)   0.73 (0.09)   0.76 (0.28)   0.63 (0.07)   0.73 (0.09)   1.11 (0.28)   0.58 (0.08)   0.72 (0.08)   1.00 (0.26)
R²                          0.78          0.77          0.28          0.79          0.78          0.47          0.75          0.77          0.45
Test of Restriction
p-Value                     0.45          0.93          0.98          0.12          0.66          0.39          0.05          0.65          0.77
Implied α                   0.31 (0.04)   0.29 (0.05)   0.14 (0.15)   0.23 (0.04)   0.23 (0.05)   0.00 (0.16)   0.23 (0.04)   0.24 (0.04)   -0.03 (0.12)
Implied β                   0.28 (0.03)   0.30 (0.04)   0.37 (0.12)   0.30 (0.03)   0.32 (0.04)   0.53 (0.13)   0.28 (0.04)   0.32 (0.04)   0.52 (0.11)
ᵃ Dependent variable: log (GDP per working-age person) in 1985. Standard errors are in parentheses. The investment and population growth rates are averaged over the period 1960-1985. g + δ is assumed to be 0.05. SCHOOL is the average percentage of the working-age population in secondary school for the period 1960-1985.
Table 4 ESTIMATION OF THE AUGMENTED SOLOW MODEL FOR MORE RECENT SAMPLE PERIODSᵃ
Value (Standard Error)
Sample period (data)        1960-1990 (PWT 5.6)                       1960-1995 (PWT 6.0)
Parameter                   Non-Oil       Intermediate  OECD          Non-Oil       Intermediate  OECD
No. of observations         85            70            22            90            72            21
Constant                    5.42 (1.09)   6.50 (1.23)   10.03 (1.89)  5.81 (1.12)   7.92 (1.07)   9.48 (1.98)
ln(I/GDP)                   0.41 (0.10)   0.52 (0.13)   0.30 (0.39)   0.54 (0.11)   0.60 (0.12)   0.08 (0.31)
ln(n + g + δ)               -2.24 (0.38)  -1.97 (0.40)  -0.90 (0.59)  -2.35 (0.39)  -1.81 (0.36)  -1.19 (0.60)
ln SCHOOL                   0.65 (0.09)   0.72 (0.13)   1.00 (0.30)   0.65 (0.09)   0.85 (0.10)   1.06 (0.32)
R²                          0.80          0.77          0.40          0.80          0.83          0.43
Restricted Regression
Constant                    8.50 (0.11)   8.42 (0.13)   9.08 (0.46)   8.84 (0.10)   8.85 (0.10)   9.61 (0.30)
ln(I/GDP) - ln(n + g + δ)   0.48 (0.11)   0.57 (0.13)   0.20 (0.34)   0.62 (0.11)   0.64 (0.11)   0.09 (0.25)
ln(SCHOOL) - ln(n + g + δ)  0.69 (0.09)   0.79 (0.12)   0.96 (0.28)   0.68 (0.09)   0.88 (0.09)   1.06 (0.30)
R²                          0.78          0.76          0.42          0.79          0.83          0.46
Test of Restriction
p-Value                     0.01          0.12          0.61          0.01          0.39          0.95
Implied α                   0.22 (0.05)   0.24 (0.05)   0.09 (0.15)   0.27 (0.04)   0.25 (0.04)   0.04 (0.12)
Implied β                   0.32 (0.04)   0.33 (0.05)   0.44 (0.12)   0.30 (0.04)   0.35 (0.04)   0.49 (0.11)
ᵃ Dependent variable: log (GDP per working-age person) in 1990 (PWT 5.6) and 1995 (PWT 6.0). Standard errors are in parentheses. The investment and population growth rates are averaged over the periods 1960-1990 or 1960-1995, depending on the sample. g + δ is assumed to be 0.05. SCHOOL is the average percentage of the working-age population in secondary school for the relevant sample period.
3.1 A MORE POWERFUL TEST OF THE SOLOW MODEL
Based on the results so far, one might follow MRW and draw broadly positive conclusions about the fit of the Solow model, especially when augmented with human capital. Notably, a simple regression using only three variates (the saving rate, the schooling rate, and the population growth rate) seems to explain a remarkable share of cross-country variation in the level of output per worker. It is true that the estimates of the production-function coefficients are not always reasonable, and we have found that the overidentifying restriction implied by the Cobb-Douglas structure is often rejected, but problems with estimation of production relationships are not uncommon. Very possibly, these statistical rejections are not of great economic significance.
However, as our exposition in Section 2 suggests, the results shown so far do not constitute the strongest test of the Solow model within this framework. In our view, a better test of the Solow model involves testing the restrictions on the analogue of equation (2.18), the equation explaining long-run growth. In particular, if the hypothesis that the steady state of the Solow model describes the cross-sectional distribution of output per worker is true, then we should not be able to reject the hypothesis that factors such as the saving rate or the rate of human-capital accumulation do not enter into the determination of the long-run growth rate. Formally, equations (2.17) and (2.18), together with the assumptions that all countries share the same production function parameters and long-run growth rate, imply that
where the growth rate $g$ is constant across countries. A straightforward statistical implication of the model, easily tested in this framework, is that the coefficients on variables such as the saving rate, the schooling rate, and the growth rate of the workforce should be zero when they are entered on the right side of (3.2). [More precisely, we divide both sides of (3.2) by the number of periods $t$, so that the annual growth rate is on the right-hand side.] Table 5 reports the results of this test. Equations (3.1) and (3.2) are estimated jointly by seemingly unrelated regression (SUR), with
Table 5 TEST OF EXOGENEITY OF GROWTH IN THE SOLOW MODELᵃ
Value (Standard Error)
Model                       Textbook Solow Model                                    Augmented Solow Model
Parameter                   Non-Oil       Intermediate  OECD          Western       Non-Oil       Intermediate  OECD          Western
No. of observations         90            72            21            22            90            72            21            22
Constant                    -0.01 (0.01)  0.00 (0.01)   0.02 (0.01)   0.02 (0.01)   -0.01 (0.01)  -0.01 (0.01)  0.02 (0.01)   0.02 (0.01)
I/GDP                       0.14 (0.02)   0.14 (0.02)   0.06 (0.04)   0.06 (0.04)   0.12 (0.02)   0.12 (0.02)   0.07 (0.04)   0.05 (0.04)
SCHOOL                      -0.01 (0.05)  -0.04 (0.05)  -0.02 (0.10)  -0.08 (0.10)  0.07 (0.05)   0.05 (0.05)   -0.12 (0.10)  -0.05 (0.11)
n                           0.00 (0.15)   -0.03 (0.15)  -0.40 (0.28)  -0.36 (0.26)  0.03 (0.15)   0.03 (0.15)   -0.38 (0.28)  -0.31 (0.27)
χ²(3)                       80.41         54.57         6.84          3.48          79.68         53.13         8.03          2.90
p                           0.00          0.00          0.08          0.32          0.00          0.00          0.05          0.41
ᵃ SUR estimation of two-equation system of the form of equations (3.1) and (3.2), with coefficients of (3.1) unconstrained. Dependent variable: change in log (GDP per working-age person), 1960-1995. The table shows the results of the estimation of equation (3.2). The final two rows report a test of the prediction of the model that variables other than the constant should be excluded from (3.2). A small value of p implies rejection of the joint hypothesis that the economies are in a steady state and growth is exogenous.
equation (3.2) being augmented by the variables I/GDP, SCHOOL, and the labor-force growth rate $n$.⁷ The prediction of the Solow model (under the auxiliary assumption of steady states) is that the estimated coefficients of the last three variables should all be zero. Table 5 shows the parameter estimates and standard errors for the augmented equation (3.2). The chi-squared test and the associated p-value in the final two rows test the exclusion restriction implied by the model. In brief, the Solow model's implication that growth is exogenous is strongly rejected for the non-oil and intermediate samples. When equation (3.1) takes the form implied by the textbook Solow model, that is, we impose $\beta = 0$, exogeneity of growth is rejected for the OECD sample at the 10% level. When equation (3.1) allows $\beta \neq 0$, the restriction is rejected at the 5% level for the OECD. Inspection of the coefficients and standard errors in Table 5 shows that the principal reason for the rejections is the strong relationship of the saving rate (I/GDP) to the long-run growth rate.
There are at least two possible reasons for the statistical rejections found in Table 5: First, growth may not be truly exogenous, in the sense of the Solow model. Second, the maintained hypothesis that the countries in the sample are in the steady state may be wrong, i.e., we may be picking up transition dynamics. One simple test of the second possibility is to consider only the 22 countries in our sample that are located in the Western Hemisphere. Arguably, the assumption of steady states makes more sense for Western Hemisphere countries than for the rest of the world, as the Americas have not been the scene of major wartime destruction, postcolonial transitions, or (except for Cuba, which is not in our sample) sustained nonmarket experiments during the past century. Interestingly, as Table 5 shows, the restrictions of the Solow model cannot be rejected for the countries of the Western Hemisphere as a group.
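The p-values in the final row of Table 5 can be recovered from the reported chi-squared statistics. The sketch below uses the closed-form upper-tail probability for three degrees of freedom, which needs only the standard library; the function name `chi2_sf_3df` is an assumption made for the example.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability P(X > x) for a chi-squared variate with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Chi-squared(3) statistics reported in Table 5 for the textbook Solow model.
textbook_stats = {"Non-Oil": 80.41, "Intermediate": 54.57, "OECD": 6.84, "Western": 3.48}
p_values = {sample: round(chi2_sf_3df(stat), 2) for sample, stat in textbook_stats.items()}
print(p_values)
# {'Non-Oil': 0.0, 'Intermediate': 0.0, 'OECD': 0.08, 'Western': 0.32}
```

The OECD value of 0.08 is what places the textbook-model rejection at the 10% level rather than the 5% level, while the Western Hemisphere value of 0.32 corresponds to no rejection.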
Thus, it remains possible that the results of this section arise because of transition dynamics, not because the Solow model is fundamentally wrong about long-run growth. In the latter part of the paper we address this issue directly by considering the determinants of TFP growth rather than output.
3.2 ENDOGENOUS SAVINGS RATES? THE RAMSEY MODEL
Our rejection of the Solow model is based on the finding that variables such as saving rates are correlated with growth rates. One possible reason for this correlation is that saving rates are endogenous and depend on rates of growth, rather than the other way around, as in the classic formulation due to Ramsey (1928), Cass (1965), and Koopmans (1965); see, e.g., Barro and Sala-i-Martin (1999, Chapter 2), for an exposition. In the remainder of this section we briefly consider the fit of the Ramsey model to the data.
7. Our focus is not on equation (3.1), but the SUR approach brings efficiency gains in the estimation of that equation too.
Before doing so, however, we should emphasize that the possibility that saving rates are endogenous to growth does not (in our view) invalidate our rejection of the Solow model in the previous section. In brief, there are two possibilities: Either the long-run growth rate is the same for all countries (that is, $g_i = g$ for all $i$), as maintained by MRW, or it is not. If the long-run growth rate is invariant, then differences in growth rates cannot account for differences in savings rates. In any case, the null that the growth rate is the same for all countries is rejected by our test reported above, under the plausible assumption that the long-run average values of I/GDP, SCHOOL, and $n$ are not strongly correlated with the cyclical error term, $(\eta_{it} - \eta_{i0})/t$.
Suppose then that the long-run growth rates differ (exogenously) across countries. This alternative assumption raises both econometric and substantive problems for the MRW analysis of the Solow model. Econometrically, if the growth rate is stochastic, the MRW equation (2.17) is no longer a valid regression, as the error term is correlated with the regressors (see footnote 2). Hence the interpretation of MRW's results favoring the Solow model is problematic. More substantively, "explaining" growth by assuming that growth rates differ exogenously across countries is not particularly helpful. Once it is allowed that long-run growth rates differ across countries, we are naturally pushed to consider explanations for these differences, as offered (for example) by endogenous growth models.
We consider the version of the Ramsey model without human capital, that is, with $\beta = 0$. The relevant equations are
where $\rho$ is the discount rate (of the representative agent), $\sigma$ is the coefficient of relative risk aversion, and $\nu_i$ is a country-specific (but time-independent) error term. Equations (3.3) and (3.4) are the appropriately modified versions of equations (2.17) and (2.18), and equation (3.5) is the
Table 6 ESTIMATES OF THE RAMSEY MODELᵃ
Value (Standard Error)
Parameter                   Non-Oil       Intermediate  OECD          Western
No. of observations         90            72            21            22
z                           8.54 (0.09)   8.73 (0.10)   9.56 (0.06)   8.81 (0.11)
σ                           -0.17 (0.41)  0.16 (0.40)   0.08 (0.51)   0.75 (1.35)
ρ                           0.13 (0.01)   0.11 (0.01)   0.07 (0.01)   0.12 (0.02)
corr(s_K, ŝ_K)              0.49          0.33          0.14          0.15
ᵃ SUR estimation of two-equation system (3.3) and (3.5), with α = 0.35 assumed in both equations. The last row shows the simple correlation of actual and fitted saving rates across countries.
standard expression for the Ramsey steady-state saving rate.⁸ To estimate this system, it is convenient to rewrite (3.4) as
Using (3.6), we substitute for $g_i$ in (3.3) and (3.5). This substitution introduces a measurement error term, $(1/t)(\eta_{i0} - \eta_{it})$; however, this error is probably small for our sample length (35 years) and is zero asymptotically. After making this substitution, we estimate the system (3.3) and (3.5) jointly by nonlinear SUR, to take advantage of possible efficiency gains if the error terms are correlated. As noted above (see also footnote 2), when growth rates vary across countries equation (3.3) is no longer a valid regression, as the error term $\varepsilon_{it} = \varepsilon_{i0} + (g_i - g)t$ is likely to be correlated with the regressors; hence, we impose $\alpha = 0.35$ (a value justified later in the paper) and estimate only the constant term in (3.3). [Estimation of equation (3.5) alone produced similar results to those reported here.] Table 6 shows the results for the period 1960-1995 for four samples (the three MRW samples plus the Western Hemisphere).
8. This savings rate comes from the solution of the consumer optimization problem, $\max \int_0^\infty e^{-\rho t} [(c_t^{1-\sigma} - 1)/(1 - \sigma)] L_t \, dt$, where $c_t$ is per capita consumption. The same maximization problem also applies to the Uzawa-Lucas model introduced in the next section.
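The rewriting of (3.4) used in the substitution above can be sketched as follows. This is a reconstruction, not a transcription: it assumes that (3.4) states $\ln y_{it} - \ln y_{i0} = g_i t + (\eta_{it} - \eta_{i0})$, the country-specific counterpart of (2.18), and solves it for $g_i$.

$$
g_i = \frac{1}{t}\left(\ln y_{it} - \ln y_{i0}\right) + \frac{1}{t}\left(\eta_{i0} - \eta_{it}\right) \qquad (3.6)
$$

Replacing $g_i$ by the observed average growth rate, the first term on the right, leaves the second term as the measurement error, which shrinks as $t$ grows because $\eta_{it}$ is stationary.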
The results provide at best weak support for the view that saving rates are endogenous to growth rates. The link between the growth rate and the saving rate operates most directly through the risk aversion parameter (the reciprocal of the intertemporal elasticity of substitution), $\sigma$. As Table 6 shows, the estimated value of $\sigma$ is much too low (negative, for the largest sample), relative to typical findings, and is poorly identified. (However, estimates of the discount rate $\rho$ are well identified and reasonable in magnitude.) As a measure of fit, the table also reports for each sample the simple correlation of the actual saving rate and the fitted saving rate. This correlation is 0.49 for the largest (non-oil) sample (recall, though, that here the estimated $\sigma$ is negative) and 0.33 for the intermediate sample. For the OECD and Western Hemisphere samples respectively, the correlations of actual and fitted saving are only 0.14 and 0.15. Further, much of the explanation for saving appears to be due to variation in the growth rate of the labor force rather than variation in the growth rate. In short, it appears that one cannot reasonably account for the observed correlation of saving and growth as reflecting the endogenous response of the former to the latter.9,10 More evidence on this point is provided below. In the next section we consider the fit of some alternatives to the Solow model which permit growth as well as saving to be endogenous.

4. Alternative Growth Models

The extended MRW framework provides a means of assessing alternative growth models. In this section we consider the application of the framework to the Uzawa (1965)-Lucas (1988) two-sector growth model with human capital and to a version of the AK model with learning-by-doing. At this point these exercises are meant to be largely illustrative, as the models considered are quite simple.

4.1 THE UZAWA-LUCAS MODEL
In our version of the Uzawa-Lucas model, we assume that production is given by
9. Independent evidence is provided by King and Rebelo (1993), who show that a neoclassical growth model with endogenous savings rates has strong counterfactual implications, such as real interest rates above 100% in early stages of development. 10. Preliminary estimation of the out-of-steady-state dynamics of the savings rate in the Ramsey model also resulted in unreasonable estimates of the coefficient of relative risk aversion and the discount rate.
In equation (4.1), $h_t$ is human capital per worker at time $t$, and $1 - s_H$ is the share of worker time devoted to market production. The factor $A$ is a constant (i.e., it may vary by country but not over time). Long-run growth occurs in this model only through the accumulation of human capital. The human-capital accumulation equation is
where $B$ measures the productivity of educational technology and $s_H$ (as previously defined) is the share of time devoted to education by people of working age (the SCHOOL variable of MRW). Equation (4.1) reduces to equation (2.1) when $Z_t = A(1 - s_H)h_t$ and $\beta = 0$. Since $\dot{Z}_t/Z_t = \dot{h}_t/h_t$, equation (4.2) is equivalent to equation (2.11) with $g(s_K, s_H, \ldots) = Bs_H$. Following the steps of the analysis of Section 2, we obtain the pair of empirical equations for this model corresponding to equations (2.17) and (2.18) respectively:
where $\varepsilon_{it} = \varepsilon_{i0} + (g_i - \bar{g})t = \varepsilon_{i0} + B(s_{Hi} - \bar{s}_H)t$. Thus, as expected, the product $Bs_{Hi}$ appears in the expression for $\ln y_{it}$. Note that (4.4) has no constant term. Both equations also appear likely to exhibit heteroscedasticity; that will be taken care of by our estimation procedure. In principle, the Uzawa-Lucas model allows the rate of human-capital formation and the saving rate in the steady state to be endogenous. To accommodate this endogeneity, we append the following two equations:
where $v_{1i}$ and $v_{2i}$ are error terms. Equation (4.5) is the same as the Ramsey expression (3.5) for the optimal saving rate, and equation (4.6) gives the optimal steady-state rate of human-capital formation. We estimate this variant of the Uzawa-Lucas model in two ways: First, we estimate only equations (4.3) and (4.4), effectively treating $s_{Ki}$ and $s_{Hi}$ as
exogenous. Second, to allow for endogenous rates of saving and human-capital formation, we estimate the system (4.3)-(4.6) simultaneously, making the substitution (3.6) for the growth rate in equations (4.3), (4.5), and (4.6). Again we have the problem that the error term is correlated with the regressors in (4.3), and hence, for both exercises, we simply impose $\alpha = 0.35$.11 Table 7 shows the results of estimation for four samples of countries for the years 1960-1995. The top part of Table 7 shows the results when the savings rates for physical and human capital are treated as exogenous and given; the bottom part allows these variables to be endogenously determined by the utility maximization problem of a representative agent. We find that the parameters $z$ and $B$ are tightly estimated, with similar values independent of whether savings rates are treated as exogenous or endogenous. However, the estimated values of $\sigma$ and $\rho$, shown in the bottom part of Table 7, are found to be inadmissible ($\sigma$ is always estimated to be negative) or implausible. The negative estimates for $\sigma$ result from the fact that human-capital investment rates and population growth rates are negatively correlated in the data, which is inconsistent with equation (4.6) unless $\sigma < 0$. Again, the representative-agent model does not seem to do very well in explaining cross-country variations in saving; future work should consider alternative models of saving, such as the life-cycle model (which focuses on demographics).

In order to assess goodness of fit, Table 7 also shows the cross-sectional correlations of the endogenous variables of the model and their fitted values. In the top half of the table, the correlations of actual and fitted growth rates treat the saving rate and the rate of human-capital formation as exogenous and given. More precisely, this correlation is just the correlation of the actual growth rate and $Bs_{Hi}$.
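The goodness-of-fit measure just described can be sketched in a few lines. The example below is illustrative only: the country data are made up, the helper names are hypothetical, and the value of $B$ is the non-oil point estimate from the top panel of Table 7. It also makes explicit that, for any $B > 0$, the correlation of actual growth with fitted growth $Bs_{Hi}$ equals the correlation of actual growth with the schooling rate itself.

```python
import math


def pearson(x, y):
    """Simple (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fitted_growth(B, schooling_rates):
    """Fitted growth rate g_i = B * s_Hi, schooling rate treated as exogenous."""
    return [B * s for s in schooling_rates]


# Hypothetical cross-section of five countries (not data from the paper).
s_H = [0.04, 0.06, 0.08, 0.10, 0.11]
g_actual = [0.010, 0.012, 0.025, 0.018, 0.030]
B = 0.21  # point estimate for the non-oil sample, top panel of Table 7

corr_fitted = pearson(g_actual, fitted_growth(B, s_H))
corr_schooling = pearson(g_actual, s_H)
print(corr_fitted, corr_schooling)
```

Because correlation is invariant to positive rescaling, this fit statistic does not depend on the estimated value of $B$ when schooling is treated as exogenous; it reflects only how closely growth tracks schooling rates across countries.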
In the bottom part of the table all three variables are treated as endogenous (the rate of population growth is thus the only exogenous source of cross-country variation).

Table 7 ESTIMATES OF THE UZAWA-LUCAS MODEL (a)
Value (Standard Error)

| Parameter | Non-Oil | Intermediate | OECD | Western |
|---|---|---|---|---|
| No. of observations | 90 | 72 | 21 | 22 |
| $s_K$, $s_H$ exogenous | | | | |
| z | 8.53 (0.09) | 8.73 (0.10) | 9.57 (0.06) | 8.79 (0.11) |
| B | 0.21 (0.02) | 0.23 (0.02) | 0.25 (0.02) | 0.15 (0.02) |
| corr($g$, $\hat{g}$) | 0.54 | 0.43 | -0.10 | 0.19 |
| $s_K$, $s_H$ endogenous | | | | |
| z | 8.27 (0.07) | 8.39 (0.08) | 9.61 (0.06) | 8.75 (0.10) |
| B | 0.23 (0.01) | 0.24 (0.02) | 0.26 (0.02) | 0.14 (0.01) |
| $\sigma$ | -4.16 (0.40) | -4.57 (0.48) | -13.89 (2.60) | -5.71 (1.16) |
| $\rho$ | 0.31 (0.02) | 0.33 (0.03) | 0.64 (0.11) | 0.23 (0.03) |
| corr($g$, $\hat{g}$) | 0.25 | 0.27 | 0.39 | 0.22 |
| corr($s_K$, $\hat{s}_K$) | -0.38 | -0.42 | -0.34 | -0.04 |
| corr($s_H$, $\hat{s}_H$) | 0.36 | 0.43 | 0.03 | 0.53 |

(a) Results are derived from SUR estimation of equations (4.3) and (4.4) in the top panel, and (4.3)-(4.6) in the bottom panel, imposing a value of 0.35 for $\alpha$ in all equations.

11. One is tempted to put $Bs_{Hi}$ explicitly in the expression (4.3) and assume that the term is uncorrelated with $\varepsilon_{i0}$, rendering the regression valid. A little reflection shows that this is unreasonable, however. If the term $g_i = Bs_{Hi}$ were uncorrelated with $\varepsilon_{i0}$, it would perforce by definition be correlated with every error term $\varepsilon_{ij}$, $j = -\infty, \ldots, -1, 1, \ldots, \infty$. But the start date of the sample is arbitrary; there is no reason to assume that the error term corresponding to the start date happens to have the unique property of being uncorrelated with the growth rate.

12. Note that these correlations are not comparable with the $R^2$'s obtained in the MRW regressions, which take the level of output per capita rather than its growth rate as the dependent variable. By definition, the steady-state Solow model explains none of the cross-country growth variation examined here.

With saving rates exogenous, the correlation of actual and fitted growth under the Uzawa-Lucas model is 0.54 for the large non-oil sample and 0.43 for the intermediate sample.12 The correlations of actual and fitted growth are much lower for the other two country samples (-0.10 for the OECD sample and 0.19 for the Western Hemisphere sample). For the OECD sample at least, there is probably not enough meaningful variation in measured schooling rates to explain differences in growth. When saving and human-capital formation are allowed to be endogenous (bottom part of Table 7), the results deteriorate markedly, as expected. Conditional on fitted rather than actual schooling rates, the correlation of fitted and actual growth rates is much lower for the two bigger samples (though higher for the OECD and Western Hemisphere). The last two rows, which show the correlations of fitted and actual saving
and schooling rates, make the point that (given the broad patterns in the data) the representative-agent model appears unable to fit both variables simultaneously. In particular, the correlations of fitted and actual savings rates are negative, reflecting the poor fit of $g$ and the negative estimates of $\sigma$ [see equation (4.5)]. We conclude that, conditional on rates of human-capital formation, the Uzawa-Lucas model does a reasonably good job of explaining growth for the non-oil and intermediate samples. However, an optimizing model that assumes that behavioral parameters are the same across countries does not do a good job of explaining cross-country differences in savings rates and rates of human-capital formation. This latter finding is consistent with the relatively weak explanatory power of the Ramsey model above, though at least in that case the correlations of actual and fitted values of saving rates were positive.

4.2 THE AK MODEL
Another standard growth model in the literature is the so-called AK model. One common rationalization of this model is Arrow's (1962) idea of learning-by-doing. Suppose that the production function of the economy is given by (4.1), but that worker skills are proportional to the capital-labor ratio, i.e., $h_t = k_t$. Then the per-worker production function is simply
where $\tilde{A} = A^{1-\alpha}$ is a country-specific constant. Along the BGP the growth rate of the capital-labor ratio and hence of output per worker is $s_K\tilde{A} - (n + \delta)$. Assume that $\tilde{A}_i = \tilde{A}(1 + \varepsilon_i)$ and $\ln \tilde{A} = a$, so that $\ln \tilde{A}_i = a + \varepsilon_i$, approximately. Then the two equations describing the BGP of this model are
We estimated (4.8) and (4.9) simultaneously by SUR and then tested the restriction that $\ln \tilde{A} = a$. Here we treat the saving rate as exogenous. The results are shown in Table 8. As shown by the p-values in the penultimate row of the table, the over-identifying restriction of the model is strongly rejected.
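As a rough illustration of why the restriction fails, the point estimates reported in Table 8 can be compared directly. The sketch below is not the chi-squared test itself (it ignores the standard errors and the covariance between the two estimates); it only computes the gap between the log of the estimated productivity parameter and the estimated constant $a$ for each sample, using the values as read from Table 8.

```python
import math

# Point estimates as read from Table 8: (a, A-tilde) for each sample.
table8_estimates = {
    "Non-Oil": (-0.08, 0.40),
    "Intermediate": (-0.20, 0.37),
    "OECD": (-0.55, 0.27),
    "Western": (-0.08, 0.42),
}

# Under the model's restriction, ln(A-tilde) - a should be zero.
restriction_gaps = {
    sample: math.log(A_tilde) - a
    for sample, (a, A_tilde) in table8_estimates.items()
}

for sample, gap in restriction_gaps.items():
    print(f"{sample}: ln(A) - a = {gap:.2f}")
```

In every sample the gap is large relative to the reported standard errors, which is consistent with the very small p-values in the table.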
Table 8 ESTIMATES OF THE AK MODEL (a)
Value (Standard Error)

| Parameter | Non-Oil | Intermediate | OECD | Western |
|---|---|---|---|---|
| No. of observations | 90 | 72 | 21 | 22 |
| $a$ | -0.08 (0.06) | -0.20 (0.06) | -0.55 (0.06) | -0.08 (0.10) |
| $\tilde{A}$ | 0.40 (0.02) | 0.37 (0.02) | 0.27 (0.01) | 0.42 (0.03) |
| $\chi^2(1)$ | 376.68 | 341.13 | 393.42 | 115.85 |
| p | 0.00 | 0.00 | 0.00 | 0.00 |
| corr($g$, $\hat{g}$) | 0.67 | 0.63 | 0.47 | 0.32 |

(a) Results are derived from SUR estimation of equations (4.8) and (4.9). The tested restriction is that $\ln \tilde{A} = a$.
As above, an alternative way to evaluate the AK model is to see how the growth rates it implies are correlated with observed growth rates. For each country we estimated $\tilde{A}_i$ as the output-capital ratio in 1995 and calculated the forecast growth rate for that country as $\hat{g}_i = s_{Ki}\tilde{A}_i - (n_i + \delta)$. The correlations of this forecast growth rate with the actual growth rate for the four country samples are shown in the last row of Table 8. Reflecting the positive relationship of saving rates and growth rates, these correlations are rather high, ranging from 0.32 for the Western Hemisphere sample to 0.67 for the large non-oil sample. We thus come to mixed conclusions about the AK model. On the one hand, the cross-equation restriction imposed by the model, relating the output-capital ratio and the sensitivity of growth to the saving rate, is strongly rejected by the data. On the other, the key prediction of the model that the saving rate (rate of capital accumulation) is important for explaining the growth as well as the level of per capita output seems to hold considerable validity. We find a similar result linking the saving rate and TFP growth below.
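The forecast growth rate used in this comparison is simple enough to sketch directly. In the example below the function name and the input values are hypothetical, and the depreciation rate is an assumed value, since the rate used for this calculation is not stated in this section.

```python
def ak_forecast_growth(saving_rate, output_capital_ratio, labor_force_growth, depreciation):
    """Forecast BGP growth in the AK model: g = s_K * A - (n + delta).

    The country-specific productivity term A is proxied by the
    output-capital ratio, as described in the text.
    """
    return saving_rate * output_capital_ratio - (labor_force_growth + depreciation)


# Hypothetical country: 20% saving rate, output-capital ratio of 0.40,
# 2% labor-force growth, and an assumed 3% depreciation rate.
g_hat = ak_forecast_growth(0.20, 0.40, 0.02, 0.03)
print(round(g_hat, 4))
```

Because the forecast is linear in the saving rate, any positive cross-country association between saving and growth will tend to raise the correlation of forecast and actual growth, which is the pattern reported in the last row of Table 8.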
5. Estimates of Labor's Share

To this point we have assumed that all the economies in the sample lie on a balanced growth path. At best this can only be an approximation. First, economies are buffeted by a variety of major and minor shocks, as well as changes in institutions and policies; hence, even if our models
are precisely correct, some component of observed economic growth must be accounted for by transition dynamics.13 Second, we cannot take literally the prediction of many endogenous growth models that country growth rates may differ permanently, as that would imply counterfactually that the cross-sectional variance of real GDP per worker grows without bound. Although government policies and private-sector decisions may have highly persistent effects on growth (the prediction of endogenous growth models that we take most seriously), ultimately there must be forces (such as technology transfer from leaders to followers) that dampen the tendency toward divergence.

In the second part of their paper, MRW attempt to estimate directly the speed of convergence to the steady state and to relate their findings to the predictions of the Solow model. Although this exercise is an interesting one, measuring the speed of convergence is a difficult econometric problem, especially in the face of possible parameter heterogeneity and ongoing economic shocks. A more direct way to study the determinants of long-run growth, without having to take a stand on whether the world's economies are currently on a balanced growth path (or whether some are and some aren't), is to obtain country-by-country estimates of the growth of TFP. As is well known, if production is Cobb-Douglas14 and factor markets are competitive,15 then TFP growth rates can be found by standard growth accounting methods, using factor shares to estimate the elasticities of output with respect to capital and labor. In this section we build on the work of Gollin (1998) to calculate labor shares for a sample of countries. Section 6 reports the results of the associated growth accounting exercises.

Studies of labor's share have often found lower values in developing countries than in industrial countries (see, e.g., Elias, 1992).
Taken at face value, this result suggests either that less-developed countries operate different technologies than industrialized countries, or perhaps that the constant-elasticity-of-substitution (CES) or other production-function form is preferable to the Cobb-Douglas. In an important paper, Gollin (1998) presents evidence against the conventional finding. Gollin's key insight is that published series on "employee compensation" may significantly understate total labor compensation, particularly in developing economies, because of the large share of income flowing to workers who are self-employed or employed outside the corporate sector.16

13. Much of macroeconomics is devoted to the study of these short-run dynamics around a steady state, otherwise known as business cycles.

14. The Cobb-Douglas production function may also be viewed as a first-order approximation to more complicated production functions. Below we provide some evidence in favor of the Cobb-Douglas assumption.

15. Some endogenous growth models assume monopolistic competition and payments to factors other than capital and labor. In practice, we expect that the empirical labor share will be a reasonable measure of the Cobb-Douglas coefficients applying to an agglomerate of raw labor and human capital.

Table 9 COST COMPONENTS OF GDP

Indirect taxes, net
    Indirect taxes
    Less: Subsidies
Consumption of fixed capital
Compensation of employees by resident producers
    Resident households
    Nonresidents
Operating surplus
    Corporate and quasicorporate enterprises
    Private unincorporated enterprises
    General government
Statistical discrepancy
Equals Gross Domestic Product

Source: UN National Accounts Statistics

To try to capture the income of the latter group of workers, Gollin employs data from the United Nations System of National Accounts (see United Nations, National Accounts Statistics). Our Table 9 shows the UN's method of breaking down the cost components of GDP. Income received by the self-employed and noncorporate employees is a component of the category operating surplus, private unincorporated enterprises (OSPUE). Gollin considers two measures of labor's share which use data on OSPUE. For the first measure, he attributes all of OSPUE to labor earnings, so that labor's share becomes (corporate) employee compensation plus OSPUE, divided by GDP net of indirect taxes. For his second measure, he assumes that the share of labor income in OSPUE is the same as its share in the corporate sector. Specifically, this measure of the share of labor income can be written
16. Gollin also examines the possibility that differences in sectoral composition might explain cross-country differences in labor share. However, he does not find this factor to be important.
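The OSPUE-based measures described above can be sketched numerically. The first two functions follow the verbal definitions in the text. The third is an assumption: the displayed equation (5.1) is not reproduced here, and the formula below is one way to implement the stated premise that labor's share of OSPUE equals its share in the corporate sector. All function names and the national-accounts figures are hypothetical.

```python
def naive_labor_share(employee_comp, gdp, indirect_taxes):
    """Corporate employee compensation over GDP net of indirect taxes."""
    return employee_comp / (gdp - indirect_taxes)


def ospue_all_to_labor(employee_comp, ospue, gdp, indirect_taxes):
    """First measure: all of OSPUE is treated as labor income."""
    return (employee_comp + ospue) / (gdp - indirect_taxes)


def ospue_measure(employee_comp, ospue, gdp, indirect_taxes):
    """Second measure (assumed form of equation (5.1)).

    If labor's share of OSPUE equals the overall share LS, then
    LS = (EC + LS * OSPUE) / (GDP - IT), which solves to
    LS = EC / (GDP - IT - OSPUE).
    """
    return employee_comp / (gdp - indirect_taxes - ospue)


# Hypothetical national-accounts figures (arbitrary currency units).
GDP, INDIRECT_TAXES, EMPLOYEE_COMP, OSPUE = 100.0, 10.0, 45.0, 20.0

naive = naive_labor_share(EMPLOYEE_COMP, GDP, INDIRECT_TAXES)
first = ospue_all_to_labor(EMPLOYEE_COMP, OSPUE, GDP, INDIRECT_TAXES)
second = ospue_measure(EMPLOYEE_COMP, OSPUE, GDP, INDIRECT_TAXES)
print(naive, first, second)
```

With positive corporate capital income, the second measure lies between the naive share and the all-to-labor share, which matches the intuition that it allows for some noncorporate capital income.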
We view this second measure, which allows for the existence of noncorporate capital income, as more reasonable; we will refer to it as the OSPUE measure. Gollin also considers a third measure of labor's share, which uses data on the ratio of corporate employees to the total labor force less unemployed, available in various issues of the International Labor Organization's Yearbook of Labor Statistics. Specifically, he assumes that corporate and noncorporate workers receive the same average compensation, so that aggregate labor income can be calculated by scaling up corporate employee compensation by the ratio of the total labor force to the number of corporate employees. This measure, which we will refer to as the labor-force correction, is defined by
We have replicated and updated Gollin's calculations for the OSPUE measure and the labor-force correction for our sample of countries. One problem that we noted in doing so is that OSPUE is reported for only about 20 countries; the majority of countries report only the total operating surplus of corporate enterprises and private unincorporated enterprises, that is, we have only the sum of OSPUE and corporate capital income.17 To expand the number of countries for which labor shares could be calculated, we constructed an alternative measure of labor share that combines information about the corporate share of the labor force and the aggregate operating surplus. To do so, we assume that the corporate share of total private-sector income (both capital income and labor income) is the same as the share of the labor force employed in the corporate sector. Total private-sector income is calculated as the sum of the operating surplus and corporate employee compensation. We then compute an imputed OSPUE as the share of noncorporate employees in the labor force times the private-sector income. Using the imputed OSPUE, we then estimate labor's share using equation (5.1), with imputed OSPUE in place of actual OSPUE.

17. The operating surplus of government enterprises is also included in operating surplus. As our dataset does not include economies in which the government controls a large share of enterprises, this component can safely be ignored.

Table 10 reports a variety of data for the 53 countries in our sample for which either (1) OSPUE is available or (2) the share of corporate employees in the labor force is at least half, or both. We impose the second
requirement because we found that, for countries with very low corporate employment shares (for some, this share is below 0.10), the calculated labor shares are often unreasonable (e.g., they may exceed one). This result is not unexpected, for two reasons: First, countries with large informal sectors are likely to have relatively poor economic statistics, all else equal. Second, our estimates which use the labor-force correction scale up corporate employee compensation by the reciprocal of the corporate employee share of the labor force. When the corporate employee share is both small and measured with error, estimates based on the reciprocal of the share will be highly unreliable. We found, on the other hand, that when the corporate employee share exceeds 0.5 or 0.6, the resulting estimated labor shares not only are reasonable in magnitude but also tend to agree closely with alternative measures. All of the analyses reported below use 0.5 as the cutoff for the corporate employee share of the labor force; results for samples based on a 0.6 cutoff are essentially identical. In Table 10 the second column gives the share of the country's labor force employed in the corporate sector. Columns 3 through 6 give four alternative measures of labor's share for each country. Column 3, the naive calculation, is corporate employee compensation divided by GDP net of indirect taxes. As emphasized by Gollin, this estimate is likely to be too low, because it ignores the income of noncorporate employees. We include it for reference and comparison with other measures. Columns 4-6 give our three primary measures of labor's share. Column 4 shows Gollin's OSPUE measure, column 5 our imputed OSPUE measure, and column 6 the measure based solely on the labor-force correction. Columns 2-6 are based on average data for the period 1980-1995, or for a period as close to 1980-1995 as possible. We also calculated country-by-country time series for the labor share (not shown). 
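The reason for the cutoff on the corporate employee share can be illustrated with a short sketch. The labor-force correction scales the naive share by the reciprocal of the corporate employee share, and the imputed OSPUE allocates private-sector income by the noncorporate share of the labor force, both as described above. The function names and the perturbation of plus or minus 0.02 in the measured share are hypothetical; the Algeria figures are the rounded values of the first two columns of Table 10.

```python
def labor_force_correction(naive_share, corporate_share):
    """Scale the naive labor share by the reciprocal of the corporate employee share."""
    if not 0.0 < corporate_share <= 1.0:
        raise ValueError("corporate_share must lie in (0, 1]")
    return naive_share / corporate_share


def imputed_ospue(operating_surplus, employee_comp, corporate_share):
    """Imputed OSPUE: noncorporate labor-force share times private-sector income."""
    private_sector_income = operating_surplus + employee_comp
    return (1.0 - corporate_share) * private_sector_income


def spread_from_share_error(naive_share, corporate_share, error=0.02):
    """Range of corrected shares when the corporate share is mismeasured by +/- error."""
    high = labor_force_correction(naive_share, corporate_share - error)
    low = labor_force_correction(naive_share, corporate_share + error)
    return high - low


# Rounded Table 10 inputs for Algeria: naive share 0.47, corporate share 0.74.
algeria = labor_force_correction(0.47, 0.74)

# Same +/- 0.02 error in the share, applied at a low and a high corporate share.
spread_low_share = spread_from_share_error(0.06, 0.10)
spread_high_share = spread_from_share_error(0.48, 0.80)
print(algeria, spread_low_share, spread_high_share)
```

The corrected share for Algeria computed from rounded inputs is close to the value shown in the labor-force-correction column of Table 10, and the same absolute error in the corporate share produces a far wider range of estimates when that share is small.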
For comparison, columns 7-10 show estimates from previous studies, as reported in Barro and Sala-i-Martin (1999, Table 10.8, pp. 380-381). The year ranges at the head of columns 7-10 correspond to the timing of the data used by the previous studies. We find the results of this exercise encouraging. As Table 10 shows, when alternative measures of labor's share exist, they tend to agree closely, especially when the corporate employee share is greater than 0.6 or so. Two additional findings tend to support Gollin's (1998) conclusion that the Cobb-Douglas assumption of stable income shares is a good one: First, we find no systematic tendency for country labor shares to vary with real GDP per capita or the capital-labor ratio. Indeed, most estimated labor shares lie between 0.6 and 0.8, and the average value of
Table 10 ALTERNATIVE MEASURES OF LABOR'S SHARE
Est. Labor Share
Corporate Employees/LF
Naive
Algeria Australia Austria Belgium Bolivia
0.74 0.84 0.86 0.82 0.55
0.47 0.57 0.61 0.60 0.37
Botswana Burundi Canada Chile Colombia
0.45 0.06 0.91 0.68 0.68
0.39 0.22 0.62 0.42 0.45
0.45 0.75
Congo Costa Rica Denmark Ecuador Egypt
NA 0.72 0.89 0.56 0.56
0.38 0.54 0.64 0.25 0.43
0.47
El Salvador Finland France Germany, W. Greece
0.60 0.85 0.85 0.89 0.52
0.35 0.62 0.61 0.63 0.45
Hong Kong Ireland Israel Italy Ivory Coast
0.88 0.77 0.80 0.72 0.11
0.51 0.58 0.59 0.49 0.43
Country
Actual OSPUE
0.68 0.74
0.71 0.74
0.71 0.68
Imputed OSPUE
LF
0.61 0.66 0.70 0.71
0.63 0.68 0.71 0.73 0.67
0.68 0.59
0.69 0.62 0.65
0.73 0.71
0.74 0.72 0.45 0.77
0.71 0.71 0.69 0.79
0.58 0.73 0.73 0.71 0.86
0.73 0.70 0.65
0.57 0.75 0.73 0.69
1947-73 CCJ
1960-90 Dougherty
0.56
0.55
1940-80 Elias
1966-90 Young
0.48 0.37
0.60 0.61
0.58 0.60 0.63
0.61
0.62
0.60 0.68
Jamaica Japan Jordan Korea, Rep. Malaysia
0.60 0.76 0.67 0.56 0.64
0.53 0.59 0.45 0.48 0.43
Mauritius Mexico Morocco Netherlands New Zealand
0.85 0.59 0.63 0.88 0.80
0.48 0.34 0.36 0.59 0.55
Norway Panama Paraguay Peru Philippines
0.89 0.65 0.62 0.53 0.44
0.55 0.50 0.32 0.31 0.27
Portugal Singapore S. Africa Spain Sri Lanka
0.71 0.85 0.94 0.73 0.62
0.52 0.47 0.59 0.52 0.50
0.72
Sweden Switzerland Trin & Tobago Tunisia UK
0.91 0.85 0.77 0.66 0.89
0.68 0.66 0.55 0.41 0.65
USA Uruguay Venezuela Zambia
0.91 0.74 0.68 0.62
0.65 0.43 0.38 0.48
0.73 0.64
0.77 0.67
0.61
0.58
0.65
0.68 0.66
0.66 0.67
0.57 0.59 0.58 0.67 0.69
0.61 0.73 0.49 0.56
0.63 0.76 0.52 0.59
0.71 0.53 0.62 0.67 0.78
0.73 0.55 0.63 0.70 0.81
0.77
0.74 0.76 0.69
0.75
0.72
0.75 0.78 0.71 0.62 0.74
0.74
0.71 0.58 0.53 0.72
0.71 0.59 0.55 0.78
0.55 0.67
0.31 0.55
0.34
0.59 0.47
0.62
0.61
0.60
0.59 0.45
Sources: Authors' calculations. Studies corresponding to the final four columns are Christensen, Cummings, and Jorgenson (1980); Elias (1992); Dougherty (1991); and Young (1995).
the labor share is 0.65, similar to that observed in the United States and other industrialized countries.18 Second, the time series of labor shares by country tend to be quite stable, with no systematic tendency to rise or fall over time. The comparison of our calculated labor shares to previous studies suggests that the earlier studies took insufficient account of noncorporate employee income (note how close the results of several of the earlier studies are to the naive calculation of labor share, column 3). The exception is the careful work of Young (1995), who obtains numbers similar to ours for Hong Kong and Korea, but a smaller value for Singapore.
6. The Determinants of TFP Growth

In this section we describe our calculations of TFP growth for our sample of countries and report results of regressions of TFP growth on country characteristics. Again, the advantage of looking directly at TFP growth is that it avoids the need to take a stand on whether countries are on a balanced growth path or in transition to a BGP. The labor shares (and by implication, the capital shares) shown in Table 10 are an important input to the calculation of TFP growth. We have output growth from the PWT 6.0 data. The two remaining required inputs to a growth accounting exercise are measures of capital-stock growth and labor-force growth. PWT version 5.6 provides data on capital stocks for a subset of countries, but our prerelease version of PWT 6.0 does not yet have capital-stock data. We estimate capital stocks from available PWT 6.0 data by a perpetual inventory calculation. Here (in contrast to our replication of the MRW results) we assume a depreciation rate of 6%, following Hall and Jones (1999).19 Initial capital stocks are found by the assumption that capital and output grow at the same rate. Specifically, for countries with investment data beginning in 1950 we set the initial capital stock $K_{1949} = I_{1950}/(g + \delta)$, where $g$ is the ten-year growth rate of output (e.g., from 1950 to 1960) and $\delta$ (= 0.06) is the assumed rate of depreciation. We have investment data starting from 1950 for 50 countries, from 1955 for 14 countries, and from 1960 for 26 countries.

18. In the next section, we set the labor share for each country equal to the OSPUE measure, if available; to the imputed OSPUE measure, if OSPUE is unavailable; and finally to the labor-force correction measure if neither OSPUE measure is available. The average labor share derived from this procedure is precisely 0.65.

19. We get similar results if we assume 3% depreciation or if we use PWT version 5.6 instead.

The calculated capital stocks include both residential and nonresidential capital. PWT 5.6 provides data on residential capital per worker
as a fraction of nonresidential capital per worker for 63 countries. For these countries we use the average ratio of nonresidential capital to total capital to impute nonresidential capital stocks in the PWT 6.0 data set. For other countries we assume that residential capital is one-third of the total, about the average value for the countries on which we have data. Labor-force growth unadjusted for quality (that is, assuming a zero return to schooling) is calculated as the rate of growth of the working-age population, as in Section 3. We also compute alternative quality-adjusted measures, as follows: We use the most recent Barro-Lee (2000) data on educational achievement to give larger weight to more-educated workers, assuming social returns to education of 7% per year (results are not sensitive to alternative assumptions). A similar method was employed by Collins and Bosworth (1996) and by Klenow and Rodriguez-Clare (1997). TFP growth rates (reported in the Appendix) are then found by the standard growth-accounting calculation. The Appendix to the working-paper version of this paper gives our estimated TFP growth rates under alternative assumptions and for different subsamples.

With average TFP growth rates by country in hand, we can ask whether these growth rates are independent of variables such as the saving rate, schooling rate, or labor-force growth rate, as the Solow model would predict. As Table 11 shows, the answer is a strong no. The top half of Table 11 shows regression results for the sample of about 50 countries for which we have calculated labor shares (see Table 10). The bottom half uses calculated TFP growth rates under the assumption that labor's share is a fixed 0.65 in each country, an assumption which we believe to be reasonable in light of our labor-share estimates above. The advantage of this assumption is that it allows us to expand the sample to 80 countries or more.
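The perpetual-inventory and growth-accounting steps described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the accumulation timing $K_t = (1 - \delta)K_{t-1} + I_t$ is the usual convention and is assumed here, the function names are hypothetical, and the investment series and growth rates are made up. Only the 6% depreciation rate, the initialization $K_0 = I_1/(g + \delta)$, and the 0.65 labor share come from the text.

```python
def perpetual_inventory(investment, initial_growth, depreciation=0.06):
    """Capital stocks from an investment series by perpetual inventory.

    The stock before the first observation is set to I_1 / (g + delta),
    which is the level at which capital would grow at rate g.
    Assumed timing: K_t = (1 - delta) * K_{t-1} + I_t.
    """
    capital = investment[0] / (initial_growth + depreciation)
    stocks = []
    for inv in investment:
        capital = (1.0 - depreciation) * capital + inv
        stocks.append(capital)
    return stocks


def tfp_growth(output_growth, capital_growth, labor_growth, labor_share=0.65):
    """Growth-accounting residual under Cobb-Douglas with competitive factor markets."""
    capital_share = 1.0 - labor_share
    return output_growth - capital_share * capital_growth - labor_share * labor_growth


# Hypothetical investment series growing at 4% per year.
investment_series = [100.0 * 1.04 ** t for t in range(5)]
capital_series = perpetual_inventory(investment_series, initial_growth=0.04)

# If investment grows at g, the initialization makes capital grow at g as well.
capital_growth_factor = capital_series[1] / capital_series[0]

# Hypothetical average growth rates of output, capital, and labor.
residual = tfp_growth(0.04, 0.04, 0.02)
print(capital_growth_factor, residual)
```

The check on `capital_growth_factor` shows why the initialization is consistent with the assumption that capital and output initially grow at the same rate. A quality-adjusted labor input would replace `labor_growth` with the growth rate of an education-weighted labor measure.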
Note that in either case we are focusing on long-run averages, so that cyclical influences should be minimal. Table 11 shows that, whether we include a human-capital correction or not, and independent of the combination of variates included in the regression, TFP growth is cross-sectionally strongly related (in both the economic and statistical senses) to the saving rate and, in most cases, to the growth rate of the labor force. TFP growth rates also tend to be related to schooling rates, but when both the saving rate and the schooling rate are included in the regression, the coefficient on the schooling rate tends to become statistically insignificant. Further, as might be expected, when the labor force is adjusted for human-capital accumulation, the effect of the schooling variable is reduced. Table 12 repeats the analysis of Table 11 for the 1980-1995 subperiod. The data for this subperiod are probably more reliable (we don't need to worry about whether our estimated initial capital stocks are reasonable,
Table 11 DETERMINANTS OF TFP GROWTH, 1965-1995" Value (Standard Error) Actual Labor Shares
Constant
0.00 (0.00)
SK
0.08 (0.01)
0.00 (0.00)
0.02 (0.00)
n
-0.01 (0.00)
0.00 (0.00)
0.06 (0.02)
0.06 (0.02)
0.07 (0.05)
0.15 (0.05)
%
R2
7% Returns to Education (50 Countries)
No Returns to Education (53 Countries)
Parameter
-0.44 (0.10) 0.33
0.16
0.25
0.34
0.01 (0.01)
0.00 (0.01)
-0.01 (0.00)
0.05 (0.02)
0.07 (0.02)
0.10 (0.04)
0.05 (0.04)
-0.29 (0.10)
-0.36 (0.11)
-0.27 (0.10)
0.41
0.31
0.41
-0.01 (0.01)
0.02 (0.00)
0.14 (0.06)
-0.01 (0.01)
0.00 (0.00)
0.07 (0.02)
0.05 (0.02)
0.28
0.08
0.26
0.28
0.00 (0.01) 0.05 (0.02)
0.08 (0.06)
0.03 (0.05)
-0.32 (0.10)
-0.41 (0.11)
-0.31 (0.11)
0.39
0.27
0.38
0.06 (0.06) -0.45 (0.11)
0.01 (0.01)
Labor Share = 0.65 No Returns to Education (90 Countries) Constant
SK
-0.01 (0.00)
-0.01 (0.00)
0.11 (0.01) 0.21 (0.03)
SH
n R2
0.01 (0.00)
-0.01 (0.00)
-0.01 (0.00)
0.09 (0.02)
0.11 (0.01)
0.49
0.32
0.06
-0.01 (0.00)
-0.01 (0.00)
-0.01 (0.00)
0.09 (0.02)
0.10 (0.01)
0.20 (0.03)
0.07 (0.04)
-0.03 (0.11)
-0.10 (0.13)
0.01 (0.11)
0.48
0.32
0.50
0.07 (0.04) -0.37 (0.14)
7% Returns to Education (81 Countries)
0.50
"Dependent variable: average growth rate of TFP, 1965-1995.
-0.01 (0.00)
0.01 (0.00)
0.17 (0.03)
-0.02 (0.00)
-0.01 (0.00)
0.09 (0.02)
0.10 (0.01)
0.43
0.22
0.09
0.44
-0.01 (0.00)
0.09 (0.02) 0.15 (0.04)
0.04 (0.04)
-0.10 (0.11)
-0.19 (0.13)
-0.08 (0.11)
0.43
0.23
0.43
0.05 (0.04) -0.38 (0.13)
0.00 (0.00)
Table 12 DETERMINANTS OF TFP GROWTH, 1980-1995^a Value (Standard Error) Actual Labor Shares 7% Returns to Education (50 Countries)
No Returns to Education (53 Countries) Constant
SK
-0.01 (0.00)
-0.01 (0.01)
0.10 (0.02) 0.14 (0.06)
SH
-0.02 (0.01)
0.00 (0.01)
0.10 (0.02)
0.07 (0.02)
0.32
0.07
0.35
0.32
0.01 (0.01)
0.00 (0.01)
-0.02 (0.00)
0.07 (0.02)
0.10 (0.02)
0.10 (0.05)
0.05 (0.05)
-0.50 (0.13)
-0.65 (0.13)
-0.50 (0.13)
0.48
0.38
0.48
0.05 (0.06) -0.69 (0.13)
n R2
0.02 (0.00)
0.00 (0.01)
0.02 (0.00)
0.06 (0.08)
-0.02 (0.01)
0.00 (0.01)
0.10 (0.02)
0.07 (0.02)
0.25
-0.01
0.35
0.24
0.00 (0.01) 0.06 (0.02)
0.05 (0.06)
0.02 (0.06)
-0.55 (0.13)
-0.69 (0.13)
-0.55 (0.13)
0.45
0.35
0.44
0.01 (0.07) -0.69 (0.13)
0.01 (0.01)
Labor Share = 0.65 No Return to Education (90 Countries) Constant
SK
-0.02 (0.00)
-0.01 (0.00)
0.13 (0.02) 0.17 (0.04)
SH
n R2
0.01 (0.00)
-0.02 (0.00)
-0.01 (0.01)
0.12 (0.02)
0.11 (0.02)
0.36
0.16
0.13
0.00 (0.01)
-0.01 (0.01)
-0.02 (0.00)
0.10 (0.02)
0.11 (0.02)
0.14 (0.04)
0.03 (0.04)
-0.24 (0.15)
-0.45 (0.15)
-0.24 (0.15)
0.37
0.22
0.37
0.04 (0.04) -0.59 (0.16)
7% Return to Education (81 Countries)
0.36
^a Dependent variable: average growth rate of TFP, 1980-1995.
-0.01 (0.00)
0.01 (0.00)
0.13 (0.05)
-0.02 (0.00)
-0.01 (0.01)
0.11 (0.02)
0.10 (0.02)
0.01 (0.05) -0.59 (0.15)
0.30
0.08
0.15
0.29
0.00 (0.01)
-0.01 (0.01)
0.10 (0.02) 0.09 (0.04)
0.00 (0.05)
-0.32 (0.15)
-0.51 (0.15)
-0.32 (0.15)
0.33
0.18
0.32
for example), it agrees more closely with the period for which we estimated labor shares, and in any case it is interesting to know if the results hold in shorter periods. If anything, the rejection of the Solow prediction seems stronger in the second half of the sample, with saving rates and workforce growth entering with high economic and statistical significance.

Visual inspection of the data is useful to reassure ourselves that the results are not being driven by a few outliers. Figures 1-6 show scatterplots of the bivariate relationships between TFP growth and each of the three variates: SK, SH, and n. To conserve space, we show results only for the larger sample in which we have imposed a fixed labor share of 0.65; the results for the smaller sample with directly estimated labor shares are quite similar, as the reader can verify from the regression results reported in Tables 11 and 12. Figures 1-3 show the results without a quality adjustment for the labor force; Figures 4-6 adjust labor-force quality by assuming a 7% return to a year of schooling. As suggested by the regression results, the weakest relationship is between TFP growth and schooling, especially when the human-capital correction is used (as expected). However, the relationship of TFP growth to both saving rates and workforce growth rates seems to be quite robust. It is difficult to account for these results by appealing to measurement
Figure 1 RELATION OF TFP GROWTH TO SAVING RATE
Figure 2 RELATION OF TFP GROWTH TO SCHOOLING RATE
Figure 3 RELATION OF TFP GROWTH TO LABOR FORCE GROWTH RATE
Figure 4 RELATION OF TFP GROWTH TO SAVING RATE
Figure 5 RELATION OF TFP GROWTH TO SCHOOLING RATE
Figure 6 RELATION OF TFP GROWTH TO LABOR FORCE GROWTH RATE
error: For example, if saving rates are mismeasured, the resulting misestimation of the capital stock should tend to induce a negative relationship between TFP growth and the saving rate, rather than the positive relationship we observe.

7. Conclusion
We have revisited Mankiw, Romer, and Weil's classic empirical study of the Solow model of economic growth. We showed that the MRW framework applies broadly to almost any economic growth model that admits a balanced growth path, and that the restrictions specifically imposed by the Solow model tend to be rejected. In particular, we find that variables such as the saving rate seem to be strongly correlated with long-run growth rates. The correlation of variables like the saving rate with long-run output growth rates is inconsistent with the joint hypothesis that the Solow model is true and the economies being studied are in their respective steady states. The finding that the saving rate and the growth rate of the labor force are correlated with estimated TFP growth is inconsistent with the standard Solow model, even if we do not assume steady states. We also use the MRW framework to consider some alternative models of economic growth, such as the Uzawa-Lucas model and the AK model. These models are rejected as literal descriptions of the data. However, the
implications of these models, that country growth rates depend on behavioral variables such as the rate of human-capital formation and the saving rate, seem more consistent with the data than the Solow model's assumption that growth is exogenous. Future research should consider variants of endogenous growth models to see which, if any, provide a more complete and consistent description of the cross-country data. We believe that the generalized MRW-type framework we have developed here could prove very helpful in assessing the alternative possibilities.
Appendix. Additional Country Data
See Table 13.

Table 13 ESTIMATED TFP GROWTH RATES, 1965-1995
Growth Rate (%/yr)
Actual Labor Shares
Country Algeria Angola Argentina Australia Austria Bangladesh Belgium Benin Bolivia Botswana Brazil Burkina Faso Burundi Cameroon Canada Central Afr. R. Chile Colombia Congo Costa Rica Denmark Dominican Rep. Ecuador Egypt
No Returns to Education
Labor Share = 0.65
7% Return to Education
No Returns to Education
7% Return to Education
0.35
-0.23
-0.22
1.30 1.52
1.10 1.41
1.67
1.41
0.39 -2.05 0.34 1.24 1.33 0.47 1.43 -0.90 0.00 1.66 1.33 -0.07 -0.83 -0.98 0.71 -1.57 1.70 1.22 1.71 -0.54 1.17 0.61 0.97 0.70
-0.02 -0.47
-0.06 -0.92
— — -0.37 — 0.78 — 1.66 1.22 1.72 -0.34 1.31
— — — — 0.40 — 1.37 0.87 1.68 -0.70 1.21
0.81 1.10
0.48 0.06
-0.11 1.06 1.22 0.14 1.20 -1.24 -0.04 1.01 1.13
— — -1.24 0.34 -1.88 1.39 0.87 1.65 -0.87 1.08 0.19 0.49 -0.18
Table 13 CONTINUED Growth Rate (%/yr) Actual Labor Shares
Country
El Salvador Ethiopia Finland France Ghana Greece Guatemala Honduras Hong Kong India Indonesia Ireland Israel Italy Ivory Coast Jamaica Japan Jordan Kenya Korea, Rep. Madagascar Malawi Malaysia Mali Mauritania Mauritius Mexico Morocco Mozambique Nepal Netherlands New Zealand Nicaragua Niger Nigeria Norway Pakistan Panama Papua N. Guinea Paraguay
Labor Share = 0.65
No Returns to Education
7% Return to Education
No Returns to Education
7% Return to Education
-0.53 — 1.63 1.41 — 1.93 — — 2.63 — — 2.56 1.93 1.91 -0.34 0.30 1.92 -0.72 — 2.87 — — 1.73 — — 1.73 0.09 0.80 — — 1.26 0.05 — — — 2.08 — 0.76 — 0.13
-0.85 — 0.97 1.09 — 1.33 — — 2.25 — — 2.12 1.51 1.60
-0.43 -0.56 1.46 1.12 0.56 1.35 0.67 -0.22 3.06 1.31 2.17 2.31 1.81 1.74 -0.35 0.29 1.71 -0.67 1.32 2.87 -1.32 -0.27 1.66 0.07 -1.69 1.87 0.25 0.93 -2.78 -0.30 1.22 0.02 -2.62 -1.93 -1.66 2.18 0.99 0.54 -1.11 0.87
-0.79 — 0.86 0.84 0.15 0.86 0.36 -0.65 2.62 0.91 1.71 1.92 1.42 1.46 — -0.06 1.46 -1.29 1.00 2.13 — -0.37 1.21 -0.03 — 1.46 -0.39 — -2.89 -0.73 0.68 -0.30 -2.89 -2.04 — 1.47 0.47 -0.02 -1.35 0.47
-0.03 1.65 -1.33 — 2.13 — — 1.27 — — 1.36 -0.45 — — — 0.70 -0.29 — — — 1.41 — 0.13 — -0.17
Table 13 CONTINUED Growth Rate (%/yr) Actual Labor Shares
Country Peru Philippines Portugal Rwanda S. Africa Senegal Singapore Spain Sri Lanka Sweden Switzerland Syria Tanzania Thailand Togo Trinidad & Tobago Tunisia Turkey Uganda United Kingdom United States Uruguay Venezuela Zaire Zambia Zimbabwe
Labor Share = 0.65
7% Return to Education
No Returns to Education
7% Return to Education
0.44 0.06 2.44
-0.12 -0.49 1.91
0.24
-0.07
2.09 1.34 1.27 1.44 0.33 —
1.85 0.83 0.91 0.97 -0.08 —
0.34 0.19 2.10 -0.91 0.25 -0.62 3.12 1.25 0.64 1.18 0.05 0.62 -0.70 2.32 -1.69 0.22 1.85 0.55 0.34 1.00 0.99 1.34 -0.33 -3.23 -1.79 1.64
-0.32 -0.42 1.62 -1.12 -0.07 -0.75 2.82 0.76 0.34 0.78 -0.30 0.00 -0.69 1.97 -2.15 -0.12 1.24 0.03 0.04 0.70 0.59 1.02 -0.94 -3.61 -2.22 1.09
No Returns to Education
—
—
0.33 1.82
-0.03 1.23
1.28 1.22 1.29 -0.22 — -1.97 —
0.93 0.76 1.00 -0.72 — -2.44 —
REFERENCES
Arrow, K. J. (1962). The economic implications of learning by doing. Review of Economic Studies 29(June):155-173.
Barro, R. J., and J.-W. Lee. (2000). International data on educational attainment: Update and implications. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 7911.
Barro, R. J., N. G. Mankiw, and X. Sala-i-Martin. (1995). Capital mobility in neoclassical models of growth. American Economic Review 85(March):103-115.
Barro, R. J., and X. Sala-i-Martin. (1999). Economic Growth. Cambridge, MA: MIT Press.
Cass, D. (1965). Optimum growth in an aggregate model of capital accumulation. Review of Economic Studies 32(July):233-240.
Christensen, L. R., D. Cummings, and D. Jorgenson. (1980). Economic growth, 1947-1973: An international comparison. In New Developments in Productivity Measurement and Analysis, J. W. Kendrick and B. Vaccara (eds.). NBER Conference Report. Chicago: University of Chicago Press.
Collins, S. M., and B. Bosworth. (1996). Economic growth in East Asia: Accumulation versus assimilation. Brookings Papers on Economic Activity 1996(2):135-191.
Dougherty, C. (1991). A comparison of productivity and economic growth in the G-7 countries. Ph.D. dissertation. Harvard University.
Durlauf, S. N., and D. T. Quah. (1999). The new empirics of economic growth. In Handbook of Macroeconomics, J. B. Taylor and M. Woodford (eds.). Amsterdam: Elsevier Science, pp. 235-308.
Elias, V. J. (1992). Sources of Growth: A Study of Seven Latin American Economies. San Francisco: ICS Press.
Gollin, D. (1998). Getting income shares right: Self employment, unincorporated enterprise, and the Cobb-Douglas hypothesis. Unpublished paper. Williams College (June).
Hall, R. E., and C. I. Jones. (1999). Why do some countries produce so much more output per worker than others? Quarterly Journal of Economics 114(February):83-116.
Islam, N. (1995). Growth empirics: A panel data approach. Quarterly Journal of Economics 110(November):1127-1170.
King, R. G., and S. T. Rebelo. (1993). Transitional dynamics and economic growth in the neoclassical model. American Economic Review 83(September):908-931.
Klenow, P., and A. Rodriguez-Clare. (1997). The neoclassical revival in growth economics: Has it gone too far? In NBER Macroeconomics Annual, B. Bernanke and J. Rotemberg (eds.). Cambridge, MA: MIT Press, pp. 73-103.
Koopmans, T. C. (1965). On the concept of optimal economic growth. In Scientific Papers of Tjalling C. Koopmans. New York: Springer-Verlag.
Lucas, R. E., Jr. (1988). On the mechanics of economic development. Journal of Monetary Economics 22(June):3-43.
Mankiw, N. G., D. Romer, and D. N. Weil. (1992). A contribution to the empirics of economic growth. Quarterly Journal of Economics 107(May):407-437.
Ramsey, F. (1928). A mathematical theory of saving. Economic Journal 38:543-559.
Solow, R. M. (1956). A contribution to the theory of economic growth. Quarterly Journal of Economics 70(February):65-94.
Summers, R., and A. Heston. (1988). A new set of international comparisons of real product and price levels estimates for 130 countries, 1950-1985. Review of Income and Wealth 34(March):1-26.
Uzawa, H. (1965). Optimal technical change in an aggregative model of economic growth. International Economic Review 6(January):18-31.
Young, A. (1995). The tyranny of numbers: Confronting the statistical realities of the East Asian growth experience. Quarterly Journal of Economics 110(August):641-680.
Comment FRANCESCO CASELLI Harvard University
Paraphrasing MRW, Bernanke and Gürkaynak have titled their paper "Is Growth Exogenous? Taking Mankiw, Romer, and Weil Seriously." I can't resist the temptation to summarize my reactions to their paper by adding my own variation to the paraphrasing theme and title my discussion "Is Growth Exogenous? Taking Mankiw, Romer, and Weil a Bit Too Seriously."

As I understand it, the paper attempts to provide two contributions. The first contribution is methodological, and consists in developing a framework to use cross-country macroeconomic data to test any growth model that admits a balanced growth path. In my comment I will applaud the elegance of the idea, but will argue that, by taking the balanced-growth property too seriously, it makes it virtually inevitable that any growth model will be rejected empirically. The second contribution is to assess the empirical validity of the Solow model, using in part, but not exclusively, the methodology I just mentioned. The results are interesting, but I will argue that the authors take the Solow model a bit too seriously as a potential complete description of the data-generating process. The first two parts of my discussion develop these two points. In the final section I add some idiosyncratic notes on the status of growth empirics.

1. The Methodology
The methodological contribution of the paper is to propose a general strategy to test growth models within the (large) class that admits a balanced growth path (BGP). The starting point is to note that, along a BGP, economies feature constant values of a number of macroeconomic variables, such as the growth rate of GDP, the saving (investment) rate, the rate of growth of the labor force, the ratio of "idea workers" in the labor force, etc. Let me denote by x the vector of such variables that are constant in BGP. Different growth models impose different restrictions on the BGP relationship between the vector x and the level and the growth rate of per capita income.
In general, such restrictions can be represented as a special case of the system
If countries in the international data set have been on a BGP over the period of observation, then the vector x can be estimated, for each country, by its historical average. With such estimates at hand, growth models can be tested by testing the restrictions they impose on f and g. In order to improve efficiency, Bernanke and Gürkaynak propose to estimate the two equations jointly, as a system of seemingly unrelated regressions (SUR).1

This is an elegant and sophisticated construct, which has the great merit of firmly grounding empirical work in theory. I also think it is an excellent idea to estimate the equations describing the BGP jointly, so as to achieve greater efficiency. However, I am concerned that the usefulness of this method may be severely limited by its strong reliance on the BGP property. There are three orders of considerations that make me a bit skeptical about the applicability of the method.

The first and obvious problem is clearly acknowledged by the authors, and that is of course that if the economies in the sample are observed outside their BGP, a rejection of the model based on the failure of the BGP restrictions would be spurious. When rejecting based on Bernanke and Gürkaynak's methodology, one never knows if one is rejecting the model, or just the assumption that countries are on their BGP. This is why, when trying to make the case against the Solow model, Bernanke and Gürkaynak are forced to resort to additional pieces of evidence, collected outside their general methodology (more on this below).

One could argue that, while a rejection is inconclusive because of the transitional-dynamic problem, applications of the method could still be informative in the case of failure to reject. A nonrejection may lead one to increase one's confidence in the joint hypotheses that the particular model that is not being rejected is correct and that the data are drawn from a sample of countries that are on a BGP.
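The transitional-dynamics problem described above can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a textbook discrete-time Solow model with Cobb-Douglas technology, and the parameter values (`alpha`, `n`, `g`, `delta`, the two saving rates, and the common starting capital stock) are hypothetical choices, not taken from the paper. Two economies share the same exogenous rate of technology growth, yet the high-saving one grows faster over a 30-year window because it starts further below its own balanced growth path.

```python
import math

def simulate_log_output_per_worker(saving_rate, years, k0=1.0,
                                   alpha=1/3, n=0.01, g=0.02, delta=0.05):
    """Log output per worker in a discrete-time Solow model (illustrative).

    k is capital per effective worker. Technology is A_t = (1 + g)**t with
    A_0 = 1, so log output per worker equals log(A_t) + alpha * log(k_t).
    All parameter defaults are hypothetical.
    """
    k = k0
    path = []
    for t in range(years + 1):
        path.append(t * math.log(1.0 + g) + alpha * math.log(k))
        # Law of motion for capital per effective worker.
        k = (saving_rate * k ** alpha + (1.0 - delta) * k) / ((1.0 + n) * (1.0 + g))
    return path

def average_growth(path, start, end):
    """Average annual log growth between two dates on a simulated path."""
    return (path[end] - path[start]) / (end - start)

low = simulate_log_output_per_worker(saving_rate=0.10, years=330)
high = simulate_log_output_per_worker(saving_rate=0.30, years=330)

# Over the first 30 years the high-saving economy grows faster ...
growth_low_30 = average_growth(low, 0, 30)
growth_high_30 = average_growth(high, 0, 30)

# ... but far along the path both grow at the common rate log(1 + g).
growth_low_late = average_growth(low, 300, 330)
growth_high_late = average_growth(high, 300, 330)
```

In this sketch a cross-section regression of 30-year growth on the saving rate would find a positive slope even though long-run growth is identical by construction, which is why a rejection based on BGP restrictions alone is hard to interpret.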
1. In some applications there are more than two equations describing the BGP, but that is not critical for the purposes of this discussion.

My second and third points both imply, however, that it is virtually impossible for this method to deliver a nonrejection in a cross-country sample. Specifically, the second point is that the authors' methodology, at least as applied in the paper, seems to depend heavily on testing for exclusion restrictions. In particular, if a growth model does not predict that a variable z should be significant in an estimate of the BGP system of equations, failure of this exclusion restriction leads to a rejection of the model. The very practical problem with this is that 10 years of growth regressions have taught us that a very large number of variables tend to enter significantly into the system, and indeed many different sets of variables enter jointly significantly in growth regressions. I suspect, therefore, that for any possible growth model one can find the right z
that, showing up significantly in the growth regression, will lead to a rejection of the model.2

My third concern with the method's strong reliance on the BGP property derives from the observed behavior of the cross-country distribution of income. Because the cross-country distribution of income is neither exploding nor imploding over the typical sample period used in cross-country growth empirics, a researcher who wants to interpret the data as describing a world of countries in BGP must necessarily assume all countries to share the same BGP growth rate. But then, no cross-country variable should have explanatory power for the cross section of growth rates, a requirement that will obviously always be "rejected."

2. Solow Empirics
One of the contributions of the paper is to revisit and challenge Mankiw, Romer, and Weil's contention that the Solow model (in human-capital-augmented form) works well as a model for growth empirics. The BGP equations are
and Bernanke and Gürkaynak reject the model (mostly) on the ground that sk, sh, n enter the growth equation significantly, while the Solow model predicts that the γ's should all be zero. This is striking in that the very same finding led MRW to conclude that the Solow model performs well. The reason for this apparent inconsistency is of course that MRW thought they were testing the Solow model during the transition to the BGP, where rates of accumulation are indeed expected to have explanatory power for growth rates, while Bernanke and Gürkaynak assume the world to be in steady state, where they do not.

2. What does it mean to test a model? I can think of two criteria. The first criterion is to test the basic insight of the model (e.g., "x affects y"). The second criterion is to test whether the model constitutes an exhaustive description of the data (e.g., "x, and only x, affects y"). I have just argued that it is virtually impossible to fail to reject any growth model on the basis of the second criterion. But I would also argue that that criterion is overly demanding. After all, labor economists do not reject the human-capital model because variables other than education enter the Mincer regression significantly.
Since they are well aware of the transitional-dynamic difficulty, the authors also perform a completely different experiment. They argue that a key property of the Solow model is that rates of TFP growth are exogenous. Hence, they obtain cross-country estimates of TFP growth rates and regress them on sk, sh, and n. Since some of these variables turn out to be significant, they conclude that the data reject the Solow model. To me this is taking Robert Solow too seriously or, to be more precise, too literally. In particular, this is turning a model's useful simplifying assumption into the model's main insight. In my view the key insight of the Solow model is that factor accumulation per se is insufficient to achieve long-run growth, and that long-run growth can only come from growth in TFP. But it is definitely not the key insight of the model that TFP is exogenous: of course growth is not exogenous. Indeed, an implication of the Solow model is that we need to study the determinants of TFP growth. Put differently, it is impossible for me to think of the Solow model as an attempt to fully explain the growth process, much less to be a competitor for models that endogenize TFP growth. On the contrary, the Solow model should be viewed as providing strong motivation for endogenous-growth theory.

Of course, as we explore the determinants of TFP growth, it may well be the case that we discover that the accumulation of some factors has additional indirect growth effects through this channel. This is indeed what Bernanke and Gürkaynak's regression seems to suggest, and from this perspective it may well be the most interesting result in the paper. However, the result should be interpreted with great caution, since we can't be quite sure that the accumulation variables in the TFP-growth equations are not picking up the effects of some omitted variable, a pervasive problem in cross-country growth empirics. Only an instrumental-variables approach can really tackle this issue.

3. The Status of Cross-Country Growth Empirics
The most dramatic feature of cross-country income data is of course the enormous dispersion of per capita income. Per capita income ratios between the richest and poorest countries in the world exceed a factor of 30. As mentioned above, as a first approximation this enormously dispersed distribution has been roughly stable over time, at least since 1960. This stability is at least in part a consequence of largely serially uncorrelated growth rates. The sheer magnitude of the inequality of income, along with the rough stability of the distribution, has recently led several researchers to de-emphasize differences in growth rates and
instead to give first priority to the task of understanding differences in income levels. This research agenda has already delivered some important insights. For example, it has proved useful to conceptualize per capita income Y as Y = F(factors, efficiency). Hence, differences in income across countries are attributed to a combination of differences in factor endowments (or accumulated stocks) and the efficiency with which these factors are used. Using data on the factors, it is then possible to decompose the cross-country variation in income into its two determinants. The emerging consensus is that variation in efficiency explains a very large fraction, indeed, a majority, of the variation in per capita income. The search is now on for the sources of such differences in efficiency levels.3
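The decomposition described here can be sketched in a few lines. The example below is an assumption-laden illustration rather than the procedure of any particular study: it assumes the multiplicative form Y = efficiency × (factor composite), so that logs add up, and it splits the variance of log income using covariances, which is one common convention. The function names and the five-country data are made up.

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def decompose_income_variance(log_income, log_factors):
    """Split var(log Y) into factor and efficiency shares.

    Assumes log Y = log(factor composite) + log(efficiency), so efficiency
    is backed out as a residual. The two shares sum to one by construction.
    """
    log_efficiency = [y - f for y, f in zip(log_income, log_factors)]
    var_income = covariance(log_income, log_income)
    share_factors = covariance(log_income, log_factors) / var_income
    share_efficiency = covariance(log_income, log_efficiency) / var_income
    return share_factors, share_efficiency

# Synthetic log income and log factor composite for five hypothetical countries.
log_income = [6.5, 7.4, 8.3, 9.2, 10.1]
log_factors = [6.9, 7.2, 7.7, 8.0, 8.5]

share_factors, share_efficiency = decompose_income_variance(log_income, log_factors)
```

Because efficiency is a residual, any mismeasurement of the factor composite is attributed to it, so the shares are only as reliable as the factor data.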
3. In developing theories featuring endogenous differences in efficiency and income levels, I think it will continue to be reasonable to write down models admitting a BGP. While certainly not literally true, the BGP property is a convenient approximation when the goal is to focus on level differences.

Comment1 DAVID ROMER University of California, Berkeley

1. I thank Ben Bernanke and Refet Gürkaynak for providing their data.

1. Introduction
I would like to start my comments by trying to set a speed record for discussant unfriendliness: I want to object to Bernanke and Gürkaynak's title. Their stated subject is, "Is Growth Exogenous?" My objection is that no one believes that growth is exogenous: growth, like everything else, has a cause. The assumption of exogenous long-run growth is a useful modeling device, not a serious hypothesis.

Fortunately, Bernanke and Gürkaynak do not actually focus on their stated subject. What they in fact investigate is the role of physical and human capital in differences in growth among countries. They take several distinct approaches to investigating this issue. First, they update a paper that Gregory Mankiw, David Weil, and I wrote concerning capital's importance in cross-country differences in economic performance. Second, they point out that Solow-style models imply that rates of investment in physical and human capital do not affect long-run growth. They
therefore test whether investment rates are correlated with growth over the period 1965-1995. Third, they perform analogous tests concerning the correlates of long-run growth for the Y = AK model and the Uzawa-Lucas model. Finally, they examine whether total factor productivity (TFP) growth from 1965 to 1995 is correlated with the investment measures.

In my comments, I want to focus on Bernanke and Gürkaynak's final approach. My reason for not spending time on their extension of MRW is simply that I felt that if I did discuss MRW I should do so thoroughly; and I did not think that would be the most interesting use of my time here. With regard to the examination of correlations between investment measures and growth, this has the problem (which Bernanke and Gürkaynak recognize) that since countries were not all on their balanced growth paths in 1965, even Solow-style models predict a correlation between investment and growth as a result of transition dynamics. Solow-style models also make this prediction if investment rates over the 1965-1995 period differed from investment rates before 1965. Thus this test does not discriminate among the competing theories. Further, it is dominated by Bernanke and Gürkaynak's final approach of looking directly at TFP growth.

The tests of the Y = AK and Uzawa-Lucas models suffer from the same limitations: with reasonable transition dynamics and the possibility of changes in fundamentals, these types of models do not deliver sharp predictions. More importantly, we already know that the idea that these models apply to individual countries fails fundamentally. The models imply that there are permanent differences in growth rates, and thus make the highly implausible prediction that the variance of income across countries will explode.
They also imply that growth rates should have been rising rapidly over the postwar period as rates of investment in physical and human capital rose, while in fact growth rates have been essentially constant (Jones, 1995).

Because of these considerations, I will concentrate on Bernanke and Gürkaynak's examination of TFP growth. They start this part of their paper by computing TFP growth by country for the period 1965-1995. They then regress TFP growth on measures of physical-capital investment, human-capital investment, and population growth. They find positive and significant coefficients on the investment measures. As they point out, there are two possible reasons for this result. First, physical and human capital could make contributions to output beyond what is measured in the TFP calculations, which employ the standard approach of using earnings to measure marginal products. That is, there could be externalities to capital. Second, capital accumulation could merely be correlated with other influences on TFP growth.
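The TFP calculation described in this paragraph is the standard Solow residual. A minimal sketch follows, assuming Cobb-Douglas technology with a constant labor share. The function name and arguments are hypothetical, and the exponential schooling adjustment is one conventional way to apply a fixed return per year of schooling (such as the paper's 7%), not necessarily the authors' exact formula.

```python
import math

def tfp_growth(y0, y1, k0, k1, l0, l1, years, labor_share=0.65,
               schooling0=0.0, schooling1=0.0, return_to_schooling=0.0):
    """Average annual TFP growth (log points per year) as a Solow residual.

    Output growth minus share-weighted growth of capital and of
    quality-adjusted labor. With return_to_schooling = 0 the labor input
    is the raw labor force.
    """
    # Quality-adjusted labor: each year of schooling raises effective labor
    # by a fixed proportional return (assumed functional form).
    h0 = l0 * math.exp(return_to_schooling * schooling0)
    h1 = l1 * math.exp(return_to_schooling * schooling1)
    g_y = math.log(y1 / y0) / years
    g_k = math.log(k1 / k0) / years
    g_h = math.log(h1 / h0) / years
    return g_y - (1.0 - labor_share) * g_k - labor_share * g_h

def cobb_douglas(a, k, l, labor_share=0.65):
    """Output from a Cobb-Douglas technology, used only to build test data."""
    return a * k ** (1.0 - labor_share) * l ** labor_share

# Made-up economy whose true TFP grows 1% per year for 30 years.
y_start = cobb_douglas(1.0, 100.0, 10.0)
y_end = cobb_douglas(math.exp(0.01 * 30), 250.0, 15.0)

raw = tfp_growth(y_start, y_end, 100.0, 250.0, 10.0, 15.0, years=30)
adjusted = tfp_growth(y_start, y_end, 100.0, 250.0, 10.0, 15.0, years=30,
                      schooling0=4.0, schooling1=6.0, return_to_schooling=0.07)
```

When schooling rises over the period, the quality adjustment assigns part of output growth to labor, so the adjusted residual is smaller than the raw one.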
In my comments, I want to first point out some measurement problems that may introduce important biases into Bernanke and Gürkaynak's procedure. I then want to propose a variant on their methodology that I think is cleaner and that allows one to see the limitations and implications of their approach more clearly.

2. Measurement
There are two potentially important measurement issues in the TFP calculations, one involving human capital and one involving physical capital. Neither issue is specific to Bernanke and Gürkaynak's paper.

The issue involving human capital concerns the production function for human capital. The assumption in the MRW part of the paper is that human-capital production uses physical and human capital just as intensively as goods production. The assumption in the TFP calculations is that to measure human capital, we only need to know how much schooling workers have; this implicitly assumes that physical and human capital play no role in producing human capital. The difference between the two approaches is quantitatively important. For example, the two approaches make very different predictions about what a worker moving from a rich to a poor country will earn. More generally, the implied gap in human-capital stocks between high-saving, high-education and low-saving, low-education countries is much larger under the MRW assumption than under the schooling-only assumption. Thus differences in TFP may be smaller than what Bernanke and Gürkaynak find using their schooling-only assumption. More importantly, some of the relationship between investment and their estimates of TFP growth may actually reflect correlation between investment and measurement error in their estimates of human capital. This problem is not specific to Bernanke and Gürkaynak: many researchers seem to choose a specification for human-capital production largely arbitrarily. But the choice often has important implications.
With regard to physical capital, Bernanke and Gürkaynak use the standard perpetual-inventory approach to construct estimates of capital stocks from investment data. But Pritchett (2000) has recently pointed out that when governments invest, we cannot be confident that one unit of resources devoted to investment produces anything close to one unit of resources' worth of capital. He makes a strong case that for countries with big, bad governments, this issue can be important. Thus differences in TFP may again be smaller than Bernanke and Gürkaynak's estimates imply. And again, some of their estimated relationship between TFP growth and investment may in fact be a relationship between measurement error and investment.
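A minimal sketch of the perpetual-inventory calculation follows, with a hypothetical `efficiency` parameter added to illustrate Pritchett's point that a unit of investment spending may yield less than a unit of capital. The depreciation rate, initial stock, and investment series are made-up values, not those used in the paper.

```python
def perpetual_inventory(investment, initial_capital, depreciation=0.06,
                        efficiency=1.0):
    """Build a capital-stock series from an investment series.

    Each period: K_next = (1 - depreciation) * K + efficiency * I.
    efficiency = 1.0 is the standard assumption; values below one model
    investment spending that only partly turns into usable capital.
    """
    stocks = [initial_capital]
    for spending in investment:
        stocks.append((1.0 - depreciation) * stocks[-1] + efficiency * spending)
    return stocks

# Thirty years of constant investment spending (illustrative numbers).
investment = [20.0] * 30

measured = perpetual_inventory(investment, initial_capital=100.0)
effective = perpetual_inventory(investment, initial_capital=100.0, efficiency=0.5)
```

If true efficiency is below one, the conventionally measured stock (`measured`) overstates the capital actually in place (`effective`), and the gap varies with how much a country invests.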
3. Methodology
Let me now turn to Bernanke and Gürkaynak's methodology. To make the issues clear, I will focus on physical capital and postpone considering human capital until I get to the results.

There are two features of Bernanke and Gürkaynak's approach that seem unappealing. The first is its two-step nature. To try to detect if capital contributes more to output growth than is reflected by its private marginal product, they first subtract capital's direct private contribution from output growth and then regress what is left on investment rates. The second is that there is no clear way to interpret the magnitude of their estimates: when they find a positive correlation between TFP growth and saving rates, there is no obvious way to determine the magnitude of capital's implied additional impact on output.

Robert Solow once commented that the way Milton Friedman differs from the rest of us is that while for the rest of us, everything we see reminds us of sex, everything Friedman sees reminds him of the money supply. Well, as an empirical economist, everything I see reminds me of instrumental variables (IV). I therefore want to propose an IV variation on Bernanke and Gürkaynak's procedure. To do this, suppose log output per worker in country i depends on log capital per worker and other factors:
$$\ln y_i = a + \alpha \ln k_i + e_i. \tag{1}$$
Bernanke and Gürkaynak's procedure would be to impose an α (derived from income data), compute the residual, and regress the residual on the saving rate. I propose instead to estimate (1) by IV, instrumenting for k with the saving rate. As you might expect, one can show that as long as the saving rate is positively correlated with k (which of course it is), Bernanke and Gürkaynak's procedure yields a positive coefficient on the saving rate if and only if the IV approach yields an estimate of α greater than the one Bernanke and Gürkaynak impose.2 That is, the IV estimate

2. To see this, consider the more general model $Y = X\beta + e$, with instruments $W$ of the same dimension as $X$. The Bernanke-Gürkaynak procedure is to impose a $\beta$, say $\bar{\beta}$, compute the residuals $Y - X\bar{\beta}$, and then regress them on $W$. This yields $\hat{\gamma}_{BG} = (W'W)^{-1}W'(Y - X\bar{\beta}) = (W'W)^{-1}[(W'X)(W'X)^{-1}W'Y - (W'X)\bar{\beta}] = (W'W)^{-1}(W'X)(\hat{\beta}_{IV} - \bar{\beta})$. Thus the Bernanke-Gürkaynak estimate is nonzero if and only if the IV estimate of $\beta$ differs from the imposed value.
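The algebra in footnote 2 can be checked numerically in the scalar, no-constant case. The sketch below is illustrative: the data are simulated, and the function names, sample size, and parameter values (a true coefficient of 0.5 and an imposed value of 0.35) are hypothetical.

```python
import random

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def two_step_coefficient(y, x, w, beta_imposed):
    """Impose beta, form residuals y - beta*x, and regress them on w."""
    residual = [yi - beta_imposed * xi for yi, xi in zip(y, x)]
    return dot(w, residual) / dot(w, w)

def iv_estimate(y, x, w):
    """Scalar IV estimate of beta in y = beta*x + e, using w as instrument."""
    return dot(w, y) / dot(w, x)

def simulate(n=200, beta_true=0.5, seed=0):
    """Simulated instrument w, regressor x, and outcome y (all made up)."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.8 * wi + rng.gauss(0.0, 1.0) for wi in w]
    y = [beta_true * xi + rng.gauss(0.0, 1.0) for xi in x]
    return y, x, w

y, x, w = simulate()
beta_imposed = 0.35

gamma_bg = two_step_coefficient(y, x, w, beta_imposed)
beta_iv = iv_estimate(y, x, w)

# Right-hand side of the footnote identity: (W'W)^-1 (W'X) (beta_IV - beta_bar).
implied = dot(w, x) / dot(w, w) * (beta_iv - beta_imposed)
```

The two-step coefficient `gamma_bg` and the rescaled IV gap `implied` agree up to floating-point error, so when the instrument and regressor are positively related the sign of the two-step coefficient matches the sign of the difference between the IV estimate and the imposed value.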
transforms Bernanke and Gürkaynak's regression coefficient onto a scale that is much easier to understand. As a result, IV provides a more direct and easily interpretable way of getting at what Bernanke and Gürkaynak are interested in.

Putting things in this IV framework makes it clear why Bernanke and Gürkaynak say that there are two possible reasons for their results. Positive correlation between the instrument and the residual causes IV to produce upward-biased estimates. Thus an IV estimate of $\alpha$ that exceeds capital's income share could reflect either externalities from capital or simply correlation between the saving rate and other influences on TFP. And unfortunately, positive correlation between the saving rate and the residual is very plausible. This is a simple application of what I call Xavier's law, which states, "When governments screw up, they screw up big time" (Sala-i-Martin, 1991, p. 371). That is, there tends to be positive correlation among a wide range of forces that determine economic success, such as physical-capital accumulation, human-capital accumulation, market orientation, openness to trade, macroeconomic stability, political stability, lack of corruption, cultural attitudes conducive to growth, and so on. As a result, Bernanke and Gürkaynak's procedure, or its IV cousin, is not a reliable way of testing for externalities from capital.

If we decide to go ahead and do the estimation anyway (in either its Bernanke-Gürkaynak or its IV form), we have to decide whether to consider levels or growth rates. Bernanke and Gürkaynak consider growth rates. That is, they consider not equation (1), but

$$\Delta \ln y_i = b + \alpha\,\Delta \ln k_i + \Delta\varepsilon_i \qquad (2)$$
where the changes are computed from 1965 to 1995. With the IV interpretation of what Bernanke and Gürkaynak are doing, we can describe the advantages and disadvantages of moving from levels to growth rates. The change in the capital stock has less variation than the level, and is less correlated with the saving rate; this tends to reduce the precision of the estimates. On the other hand, the change in the residual is likely to have less variation than the level; this tends to increase the estimates' precision. Similarly, the bias of the estimates can either rise or fall, depending on how the covariances of the instrument with the capital-stock variable and the residual change. Thus theory does not provide clear guidance about whether estimation in levels or in growth rates is preferable.

A final issue about the specification that needs to be addressed is the geographic extent of externalities. Externalities from capital surely do not conveniently operate uniformly within a country and then suddenly
stop at borders. Given this, it is unlikely that treating each country as an independent observation, and treating all countries identically, is ideal.

4. Results from the IV Approach

One advantage of being a discussant is that it makes it acceptable to try things out speculatively. Thus, despite the reasons I just gave that the IV approach is likely to produce biased estimates and my uncertainty about the consequences of geographic spillovers, I decided to try the IV estimation anyway. Bernanke and Gürkaynak very kindly and helpfully provided their data. I implemented the IV procedure I just described. The only difference is that my instrument is actually the log of $s_i/(n_i + g + \delta)$, since the Solow model with Cobb-Douglas production structure implies that the log of the BGP capital stock is linear in this variable. Following Mankiw, Romer, and Weil, I set $g + \delta$ to 0.05.

Table 1 reports the results. The top panel looks at levels, and the bottom panel at growth rates. Consider levels first. As a warmup exercise, I start with OLS in the first column.

Table 1 ESTIMATES OF CAPITAL'S IMPACT ON OUTPUT (a)

Levels Estimation
                                 OLS            IV
$\alpha$                         0.69 (0.02)    0.63 (0.03)
R²                               0.90
R² of first-stage regression                    0.82

Growth Rates Estimation
                                 OLS            IV
$\alpha$                         0.63 (0.06)    1.55 (0.34)
R²                               0.60
R² of first-stage regression                    0.11

(a) Standard errors are in parentheses. All regressions include a constant. The sample size is 88.

Since increases in output
coming from sources other than capital accumulation raise the resources available for investment, the OLS estimate is likely to be biased up. And indeed, the OLS estimate of $\alpha$ is quite high: 0.69 (with a standard error of 0.02). What we are mainly interested in, however, is the IV estimate. As the second column reports, it is only slightly smaller than the OLS estimate: 0.63 (0.03).3 There are two possible reasons for the finding that the IV estimate is so much larger than capital's income share: there could be large externalities to capital, or the instrument could be correlated with the error term. As I described, some correlation with the error seems likely. Thus we cannot have confidence in a structural interpretation of the IV estimate.

Now consider growth rates, which are what Bernanke and Gürkaynak focus on. As before, the OLS estimate of $\alpha$ is large and tightly estimated: 0.63 (0.06). The IV estimate, however, is now quite imprecise. Its standard error is 0.34; as a result, the two-standard-error confidence interval has a width of 1.36. The main reason is that the saving rate is a much worse instrument for the change in the capital stock than for its level: the R² of the first-stage regression is 0.11 here, as opposed to 0.82 with levels. The wide confidence interval means that it is essentially impossible to learn anything important about $\alpha$ from this regression. The point estimate is in fact huge: 1.55. Since this is not remotely plausible as an estimate of capital's importance in production, it strongly suggests correlation between the instrument and the error term.

So far I have ignored human capital. To consider it, I adopt the standard production function,
where $S_i$ is years of schooling (and where I have implicitly adopted the schooling-only view of human-capital production). Dividing both sides by $L_i$ and taking logs yields

$$\ln y_i = c + \alpha \ln k_i + (1 - \alpha)\phi S_i + \varepsilon_i \qquad (3)$$
Both $S_i$ and MRW's measure of human-capital investment are measures of time in school. Thus there is little point in instrumenting for $S_i$ with the MRW measure. The instrument list is therefore $\ln[s_i/(n_i + g + \delta)]$ and $S_i$ (and the constant).4 But again there is reason to fear correlation with

3. Even though the IV and OLS estimates are similar, the Hausman test decisively rejects the null that they are equal (t = 4.2).
4. Bernanke and Gürkaynak's human-capital variable is in fact $H_i = \sum_j \ell_{ij} e^{0.07 j}$, where $\ell_{ij}$ is the fraction of workers in country i with j years of schooling. My $S_i$ is therefore $(\ln H_i)/0.07$, which differs slightly from average years of schooling.
Table 2 ESTIMATES OF CAPITAL AND SCHOOLING'S IMPACTS ON OUTPUT (a)

Levels Estimation
                                 OLS            IV
$\alpha$                         0.59 (0.06)    0.33 (0.09)
$\phi$                           0.15 (0.06)    0.27 (0.04)
R²                               0.90
R² of first-stage regression                    0.90

Growth Rates Estimation
                                 OLS            IV
$\alpha$                         0.61 (0.06)    1.48 (0.36)
$\phi$                           0.17 (0.09)    0.12 (0.15)
R²                               0.65
R² of first-stage regression                    0.15

(a) Standard errors are in parentheses. All regressions include a constant. The sample size is 80 for the levels regressions, 77 for the growth regressions.
the residual: time in school is likely to be correlated with the same constellation of variables that may be correlated with investment in physical capital.

Table 2 reports the results of estimating (3) by OLS and IV. With both levels and growth rates, the OLS estimates of the importance of physical and human capital are quite high. The estimated $\alpha$'s are about 0.6, and the $\phi$'s about 0.15. With levels, the IV estimate of $\alpha$ is very much in line with capital's income share: 0.32 (0.09). The estimate of $\phi$, however, is very large: 0.27 (0.04). It would be nice if these estimates could be taken as evidence of an absence of externalities from physical capital and of large externalities from human capital. Unfortunately, a more likely possibility is that the estimates largely reflect differing correlations of the instruments with the error term.
Finally, with growth rates, the estimate of $\alpha$ is again highly imprecise and wildly implausible. The estimate of $\phi$ is reasonable, but also quite imprecise.5
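The link between the two-step procedure and IV discussed in Section 3 (footnote 2) can be checked numerically. The following is a minimal sketch on simulated data, not the data used in this comment: the function name `compare_procedures`, the simulated parameter values, and the imposed share of 0.35 are all illustrative assumptions. With a constant included, the slope from regressing the residual on the instrument should equal the first-stage slope times the gap between the IV estimate and the imposed value.

```python
import numpy as np


def ols(Z, v):
    """Least-squares coefficients from regressing v on the columns of Z."""
    return np.linalg.lstsq(Z, v, rcond=None)[0]


def compare_procedures(y, k, w, alpha_imposed):
    """Return (IV slope, two-step slope, first-stage slope).

    y: log output per worker, k: log capital per worker,
    w: instrument (for example, a log saving-rate measure).
    """
    const = np.ones(len(y))
    X = np.column_stack([const, k])
    W = np.column_stack([const, w])

    # Just-identified IV estimate: solve (W'X) b = W'y.
    beta_iv = np.linalg.solve(W.T @ X, W.T @ y)

    # Two-step procedure: impose alpha, then regress the residual on w.
    residual = y - alpha_imposed * k
    gamma = ols(W, residual)

    # First stage: regress k on the instrument.
    pi = ols(W, k)
    return beta_iv[1], gamma[1], pi[1]


# Simulated data (hypothetical values chosen only for illustration).
rng = np.random.default_rng(0)
n = 88
w = rng.normal(size=n)
k = 0.8 * w + rng.normal(scale=0.5, size=n)
y = 1.0 + 0.6 * k + rng.normal(scale=0.3, size=n)

alpha_iv, gamma_slope, pi_slope = compare_procedures(y, k, w, alpha_imposed=0.35)
print(alpha_iv, gamma_slope, pi_slope * (alpha_iv - 0.35))
```

The last two printed numbers coincide up to floating-point error, which is the sense in which IV merely rescales the two-step coefficient.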
5. Concluding Remarks

Where do we go from here? One could try to use the IV approach to obtain more trustworthy estimates of the social returns to physical and human capital by controlling for variables that are correlated with saving rates and schooling and that affect economic performance. Unfortunately, I am skeptical that such an approach can ever produce reliable estimates. The effects of Xavier's law are sufficiently pervasive that controlling for all the relevant variables is essentially impossible.

Thus, my view is that the solution will have to lie in the instruments rather than the controls. Specifically, I think the identification of the importance of externalities from capital to cross-country income differences is more likely to come not from broad measures of capital accumulation, but from smaller variations that are plausibly uncorrelated with the residual. In other words, I think we should be looking for natural experiments. I also think that any successful effort will have to tackle the issue of the geographic extent of the spillovers.

Despite my reservations about the specifics of their investigation, I want to applaud Bernanke and Gürkaynak for beginning to address the neglected issue of the role of capital externalities in cross-country differences in economic success. Capital externalities were at the heart of early new growth models, and there is plenty of statistical and anecdotal evidence for their importance at the microeconomic level. But recent work on cross-country differences has largely ignored them. By calling attention to their potential importance, Bernanke and Gürkaynak have left us with an important research agenda.

REFERENCES
Jones, C. I. (1995). Time series tests of endogenous growth models. Quarterly Journal of Economics 110(May):495-525.
Pritchett, L. (2000). The tyranny of concepts: CUDIE (cumulated, depreciated, investment effort) is not capital. Journal of Economic Growth 5(December):361-384.
Sala-i-Martin, X. (1991). "Comment." In NBER Macroeconomics Annual, vol. 6. Cambridge, MA: National Bureau of Economic Research, pp. 368-378.

5. Further, since the coefficient on $S_i$ is $(1 - \alpha)\phi$, the combination of an $\alpha$ greater than one and a positive $\phi$ means that the estimates imply that, all else equal, an increase in $S_i$ is associated with lower growth.
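As a worked check of footnote 5, the growth-rate IV point estimates reported in Table 2 can be plugged into the coefficient on schooling. The arithmetic below uses only those point estimates and ignores their sampling error, which the text notes is large:

```latex
(1 - \hat\alpha)\,\hat\phi = (1 - 1.48) \times 0.12 = -0.0576
```

The negative sign is what the footnote refers to: taken at face value, an extra year of schooling would be associated with lower growth.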
Discussion

Greg Mankiw acknowledged that Mankiw, Romer, and Weil had stacked the deck in favor of Solow by not imposing the capital share and allowing the data to choose the parameters. In contrast, authors such as Klenow and Rodriguez-Clare who imposed the capital share found that capital explained much less of cross-country differences. He went on to say that one of the big unanswered questions in the empirical growth literature is how to explain the correlation (pointed out by the authors) between TFP growth and factors that affect capital accumulation. He suggested three hypotheses that could explain this correlation: first, measurement error, as discussed by David Romer; second, externalities to physical- and human-capital accumulation; or third, some mechanism that could lead TFP to feed back into capital accumulation. For example, he noted that in work by David Weil, habit formation in consumption results in positive correlation between TFP growth and capital accumulation. Mankiw said that, in order to move forward, the literature must find instruments that distinguish econometrically among the three explanations. He also wondered whether the IV approach used by Romer in his comment might not be just another way of packaging the OLS correlations presented in MRW.

Ben Bernanke emphasized that the central result of the paper was the finding that there is a correlation between long-run growth rates, on the one hand, and saving rates and population growth on the other. The results of the paper do not distinguish among the three explanations put forward by Mankiw or a fourth possible explanation, that a common factor drives both TFP and savings. Bernanke was not convinced by Romer's IV technique, being skeptical that valid instruments for saving rates exist in country panel data sets.
He suggested three ways of making progress: First, economists should try to write down simple parsimonious models that can account empirically for the broad facts about growth, in the spirit of the modern literature on modeling business cycles. Second, as David Romer said, researchers should try to identify natural experiments at the country level, such as those used by Esther Duflo in her work on the effects of schooling. Finally, timing relationships, between (say) changes in saving rates and changes in growth rates, might in some circumstances be informative. Bernanke emphasized, however, that the paper shows that the key prediction of the Solow model, that there is no long-run growth from factor accumulation, is not a good first approximation to the facts. Bernanke, Mankiw, and Romer discussed various issues concerning how to measure human capital and how to write down the human-capital
production function. Bernanke felt that while it was theoretically possible to construct a measure of the human-capital stock using the resources devoted to human-capital accumulation and the perpetual inventory method, in practice it would be very hard to collect data on inputs other than students' time. Mankiw noted that, in reality, the lack of physical capital inputs to education was a big problem in developing countries. Romer did not see the problem of measuring human-capital stocks as intractable. For example, one could follow Klenow and Rodriguez-Clare and make some simple assumptions about the fraction of total capital devoted to schooling. Alternatively, the U.S. earnings of immigrants could be compared with the earnings of U.S. natives with the same number of years of education. This gives some idea of whether students' time or physical capital is more important in accumulating human capital. The conclusion from such analyses is that both students' time and other inputs matter for human-capital accumulation, although the production function is not the same as for other types of output. Referring to David Romer's concern about countries not being the right unit of observation, Bernanke suggested that introducing borders and distance into the empirical analysis might help to refine the estimates.
Philip R. Lane and Gian Maria Milesi-Ferretti TRINITY COLLEGE DUBLIN AND CEPR, AND INTERNATIONAL MONETARY FUND AND CEPR
Long-Term Capital Movements

We thank Ken Rogoff, Ben Bernanke, and our discussants Kristin Forbes and Jeffrey Frankel for helpful suggestions; Sam Ouliaris for extensive econometric advice; Jerry Coakley, Hamid Faruqee, Eswar Prasad, Morten Ravn, and participants in the Trinity Research Lunch, the Dublin Economics Workshop, the Joint IMF/World Bank seminar, and seminars at the International Finance Division of the Federal Reserve Board, Birkbeck College, and Queen's University Belfast for comments. For help with constructing the public-debt data, we are very grateful to Ilan Goldfajn, Alessandro Missale, Gustavo Morales, Rafael Rodriguez-Balza, Sebastian Sosa, Michel Strawczynski, Mauricio Villafuerte, and several colleagues at the IMF. Mathias Hoffman, Grace Juhn, and Charles Larkin provided excellent research assistance. Lane's work on this paper was partly conducted during a visit to the Research Department of the IMF and is part of a research network on "The Analysis of International Capital Markets: Understanding Europe's Role in the Global Economy," funded by the European Commission under the Research Training Network Programme (Contract No. HPRN-CT-1999-00067). Lane also gratefully acknowledges the support of a TCD Berkeley Fellowship.

1. Introduction

The global integration of capital markets has been one of the biggest stories in the world economy in recent decades. International asset trade offers several potential benefits. Countries can share risks via international portfolio diversification; the efficient allocation of capital to the most productive location is promoted; and consumption can be smoothed across time periods in response to shifts in macroeconomic fundamentals. While risk sharing may be largely accomplished through gross international asset trade, net capital flows will typically be required for the latter two functions.

With respect to net asset trade, the empirical literature initiated by Feldstein and Horioka (1980) has focused on the evolution of current accounts across countries and through time, highlighting the degree of comovement between national saving and domestic investment. Another branch of the literature has investigated whether net capital flows respond appropriately to cyclical macroeconomic shocks, most prominently in the literature that has tested present-value models of the current account (see Obstfeld and Rogoff, 1996).

In this paper, we instead turn our attention to the stocks of external assets and liabilities, studying the long-term factors driving the evolution of countries' net external positions. Our interest in this subject, which has received much less attention in the literature, is based on a number of considerations. First, international macroeconomic theory suggests that a host of long-term fundamentals can lead to countries becoming persistent international net creditors or international net debtors. Such long-term factors can be missed if emphasis is exclusively placed on current-account imbalances, even using long spans of data: for instance, a country may run persistent current-account deficits but still be reducing its external liabilities relative to GDP. Second, if long-term factors are important in determining net foreign-asset positions, short-term flows cannot be properly understood unless the constraints imposed by long-run equilibrium conditions are explicitly taken into account. For example, the implications of a country's current-account deficit depend on whether it is moving the country towards or away from its target long-run net foreign-asset position.

Why then has little attention been devoted to studying such longer-run issues? Paucity of data on foreign-asset and -liability stocks has been a traditional barrier to research on net foreign-asset positions. Only a few countries have published reliable estimates of accumulated stocks, whereas current-account data have been much more widely available. In Lane and Milesi-Ferretti (2001a), we have employed a uniform methodology to generate estimates of foreign-asset and -liability positions for a large number of industrial and developing countries over the past three decades.
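The point that persistent current-account deficits can coexist with falling external liabilities relative to GDP can be illustrated with simple debt-ratio arithmetic. The sketch below is an assumption-laden illustration rather than anything taken from the paper: it ignores valuation effects and capital transfers, holds nominal GDP growth and the deficit ratio constant, and the function name `nfa_ratio_path` and all parameter values are hypothetical.

```python
def nfa_ratio_path(b0, ca_ratio, growth, years):
    """Path of the NFA-to-GDP ratio when NFA changes only through the
    current account (no valuation effects, no capital transfers).

    b0       : initial NFA as a share of GDP (negative for a net debtor)
    ca_ratio : current account as a share of GDP, held constant
    growth   : nominal GDP growth rate, held constant
    """
    path = [b0]
    for _ in range(years):
        # Last year's stock shrinks relative to a larger GDP; the new flow is added.
        path.append(path[-1] / (1.0 + growth) + ca_ratio)
    return path


# Hypothetical debtor: liabilities of 50% of GDP, a 2% deficit every year,
# and nominal GDP growing at 8% a year.
path = nfa_ratio_path(b0=-0.50, ca_ratio=-0.02, growth=0.08, years=10)
for year, ratio in enumerate(path):
    print(year, round(ratio, 3))
```

Under these assumptions the ratio drifts from -0.50 toward the level ca_ratio * (1 + growth) / growth = -0.27, so net liabilities shrink relative to GDP even though the country runs a deficit every year.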
This dataset enables us to analyze the behavior of net foreign-asset positions in a more comprehensive manner than in the efforts of previous researchers.

We address three questions about net foreign-asset positions. First, we try to explain their behavior, across countries and over time, investigating why some countries are net creditors and others net debtors, and why some creditors turn into debtors, like the United States, and vice versa, like Singapore. Identifying the long-term macroeconomic forces underlying the endogenous determination of net foreign-asset positions provides insight into the role played by international financial integration in allowing countries to delink national production and consumption.

Second, we identify two mechanisms that link trade balances to net foreign-asset (NFA) positions. One key channel is that changes in the target long-run NFA position are an important force driving the current account. The other is that, for a given desired NFA position, a country that enjoys high returns on its foreign assets and pays out low returns
on its foreign liabilities can afford to run a smaller trade surplus (or larger trade deficit). In this way, we highlight the role of a state variable (the NFA position) in determining the dynamics of the trade balance.

Third, we explore the relation between NFA positions and the real-interest-rate differential. This is an old question in the portfolio-balance literature: do debtor countries pay a risk premium? The traditional literature attempted to link currency return differentials to outstanding relative stocks of national money, but much less research has been directed at linking differences in real interest rates across countries to long-run net foreign asset positions (Frankel and Rose, 1995).

The structure of the rest of the paper is as follows. In Section 2, we briefly discuss the broad properties of our dataset of foreign assets and liabilities. The determination of long-run NFA positions is investigated in Section 3. Section 4 models the short-run dynamics of the NFA position and the behavior of the trade balance. We turn in Section 5 to the relation between the NFA position and the real-interest-rate differential. Conclusions and directions for future research are offered in Section 6.

2. International Balance Sheets: Stylized Facts

2.1 METHODOLOGY
A country's net external position is the sum of net claims of domestic residents on nonresidents. In line with the way in which transactions are recorded in balance-of-payments statistics, we classify external assets and liabilities into three main categories: foreign direct investment (FDI), portfolio equity (EQ), and debt instruments (DEBT). Foreign exchange reserves (FX) belong in this last category, although we keep them separate in the overall accounting. Hence we define net foreign assets as follows:
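Written out with the category labels just introduced, the definition would take the following form. This is a reconstruction from the surrounding definitions (assets suffixed A, liabilities suffixed L, reserves kept as a separate asset item), so the exact notation is an assumption:

```latex
NFA = FDIA + EQA + DEBTA + FX - FDIL - EQL - DEBTL \qquad (1)
```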
where the letter A indicates assets and the letter L liabilities. The FDI category reflects a "lasting interest" of an entity resident in one economy in an enterprise resident in another economy (IMF, 1993). This includes greenfield investment as well as equity participation giving a controlling stake (typically set at above 10%), while remaining equity purchases are classified under portfolio equity investment.1 The debt category includes trade credits, bank loans, and portfolio bond instruments.

1. This implies that in certain cases the distinction between these two categories can in fact be blurred, but the issue cannot be clarified further in the absence of detailed disaggregated data.
For most industrial countries, estimates of stocks of external assets and liabilities are published by national authorities and collected by the IMF and the OECD, but coverage starts for most countries only in the early eighties. The corresponding measure of NFA is called the international investment position (IIP). For developing countries, however, comprehensive stock data are generally available only for external debt and foreign exchange reserves; IIP availability is limited, especially along the time-series dimension. In addition, the methodologies used to estimate the various stocks of equation (1) often differ across countries (for example, book or market value for equity and FDI), making cross-country comparisons more difficult.

In order to overcome the limitations in existing data, we have constructed data on external assets and liabilities for 66 industrial and developing countries, covering the period 1970-1998. We discuss in detail the methodology we use for estimating net external positions in Lane and Milesi-Ferretti (2001a). Broadly speaking, we rely on stock data, when available, supplemented by cumulative flows data, with appropriate valuation adjustments. The latter are particularly important given the increased role played by portfolio equity and FDI flows during the past decade.

The use of flow data can be better understood by considering the fundamental balance-of-payments identity, which states that the current account, net financial flows, and changes in foreign-exchange reserves sum to zero, with a term capturing "net errors and omissions" acting as the balancing item.2 Financial flows can be divided between FDI, portfolio equity, and debt flows, plus a term capturing capital-account transfers, which include debt forgiveness operations and other transactions that do not give rise to a corresponding asset or liability.
The evolution of net claims on the rest of the world is dictated by the flows of new net claims (which equal the current account balance net of capital transfers $TR_t$) and by capital gains and losses KG on existing claims:
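The accumulation logic described here can be sketched as a running sum: each period's stock equals the previous stock plus the flow of new net claims plus capital gains or losses. The sketch below is only an illustration of that bookkeeping; the function name `cumulate_nfa` and all numbers are hypothetical, and the flow of new net claims is taken as given rather than derived from the current account and capital transfers.

```python
from itertools import accumulate


def cumulate_nfa(initial_stock, net_new_claims, capital_gains):
    """Build a stock series from flows and valuation changes.

    initial_stock  : NFA at the start of the sample
    net_new_claims : per-period flow of new net claims on nonresidents
    capital_gains  : per-period capital gains (+) or losses (-) on existing claims
    """
    changes = [flow + gain for flow, gain in zip(net_new_claims, capital_gains)]
    # accumulate(..., initial=...) returns the starting stock followed by each
    # end-of-period stock.
    return list(accumulate(changes, initial=initial_stock))


# Hypothetical three-period example (units are arbitrary).
stocks = cumulate_nfa(
    initial_stock=-100,
    net_new_claims=[-10, -5, 20],
    capital_gains=[3, -2, 4],
)
print(stocks)
```

Keeping the valuation term separate from the flow term mirrors the distinction in the text between new claims and capital gains on existing claims.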
Our first measure of NFA, CUMCA, is available for all countries and is obtained by cumulating current-account balances, net of capital transfers, with appropriate adjustments designed to take into account valuation effects, debt reduction and debt forgiveness, and other terms subsumed in KG.

2. We assume that errors and omissions reflect changes in the debt assets held by country residents abroad, in line with the capital-flight literature. See Lane and Milesi-Ferretti (2001a) for a discussion of this issue.

For example, we adjust the outstanding stock of equity
assets and liabilities so as to reflect variations in the U.S.$ value of stock-market indices, and the stocks of inward and outward FDI to reflect changes in the cross-country prices of capital goods. A comparison with existing data on stocks of external assets and liabilities provides a satisfactory robustness check on our methodology.

For developing countries, we also construct a second measure, CUMFL, that is obtained as the sum of stocks of the various external assets and liabilities, calculated as adjusted cumulative capital flows or, as is the case for external debt and foreign exchange reserves, as direct stock measures. As is explained in detail in Lane and Milesi-Ferretti (2001a), our CUMCA measure implicitly considers estimates of cumulative unrecorded capital flows as assets held by the country residents abroad. CUMFL instead includes unrecorded capital outflows only to the degree that they are reflected in net errors and omissions, and hence a lower fraction of unrecorded external capital holdings than CUMCA.3 We use these measures to supplement the existing IIP data.

Before turning to the presentation of the data, it is important to point out that the measurement of international current and capital transactions faces severe problems, in particular underrecording of exports and capital outflows, reflected in the existence of a measured "world current account deficit" (over U.S.$70 billion in 1998). These problems are unavoidably reflected in our data, which make use of official sources; even though we try to take account of unrecorded capital outflows to the extent possible, external assets are as a whole underreported.

2.2 NET FOREIGN ASSETS: BROAD TRENDS
The distribution of countries between large and small creditors and debtors in 1975, 1986, and 1997 is depicted in Figure 1.4

3. For developing countries, the CUMCA measure determines the stock of debt assets residually, after subtracting from the estimated net external position the net FDI and equity positions and the difference between reserves and external debt. To understand the difference with CUMFL, consider, for example, the case of a country with a trade deficit entirely financed by a flow of new debt liabilities (and errors and omissions equal to zero). Assume, as has often been the case in developing countries during periods of capital flight, that the change in the stock of external debt (measured by World Bank data) exceeds the recorded debt inflow in the balance of payments. Cumulating the current account (as in CUMCA) implies that the change in the net external position is equal to the recorded flow of new debt, and thus implicitly assumes that the difference between the change in the stock of debt and the flow is offset by an accumulation of debt assets of the country abroad. If debt assets are instead estimated directly as cumulative flows (as is the case for CUMFL), the change in the net external position corresponds to the increase in the stock of external debt.
4. We focus here just on the overall NFA position. See Lane and Milesi-Ferretti (2001b) for a discussion of the composition of the external capital structure.

In industrial countries as a whole the dispersion of net external positions has increased
Figure 1 DISTRIBUTION OF NET FOREIGN-ASSET POSITIONS: (a) INDUSTRIAL COUNTRIES; (b) DEVELOPING COUNTRIES
during the past 25 years, with an increase in the number of relatively large debtors, especially between 1975 and 1986, and in the number of creditors with assets above 10% of GDP. For developing countries, there is a large increase in the number of countries with large external liabilities (over 40% of GDP) between the 1970s and the 1980s, in the aftermath of the debt crisis. More generally, a pattern of increased dispersion in net external positions is also visible, and is especially strong in the 1970s and the 1980s.
Figure 2 plots different NFA measures as a fraction of GDP for a selection of industrial countries for the period 1970-1998. We graph both our estimate CUMCA and the direct estimate of NFA (IIP) when available.5 Only a few countries have remained creditors throughout the past three decades (Germany, Japan, the Netherlands, and Switzerland); the rest of the group is almost evenly split between persistent debtors and switchers. Among the latter, the best-known case is the United States. Figure 3 plots NFA measures for some of the developing nations in our sample, highlighting a number of interesting facts. First, the dynamics of external positions in the countries most affected by the debt crisis is similar, with a sharp worsening during the early 1980s and an improvement later in the decade. Second, net external liabilities measured with CUMFL are significantly larger than with CUMCA in several countries (Argentina, Brazil, Mexico, and Indonesia), reflecting unrecorded capital outflows. The third is the effect of the currency collapse due to the Asian crisis on external liabilities in Indonesia and to a lesser degree in Thailand. Finally, the improvement of Singapore's net external position over time is remarkable.6
3. The Determinants of Net Foreign-Asset Positions

We propose a parsimonious reduced-form model of the NFA position:
where $b_{it}$ is country i's ratio of NFA to GDP in year t, $YC_{it}$ is its output per capita, $GDEBT_{it}$ is its level of public debt, and $DEM_{it}$ is a set of demographic variables. As the discussion in the next subsection makes clear, we have followed the main themes developed in the theoretical literature in selecting these variables as the primary determinants of NFA positions.7

5. In Lane and Milesi-Ferretti (2001a) we explain the most relevant differences between these two measures.
6. Taiwan shows a similar, albeit less dramatic, trend among the economies in our sample.
7. Since we have a limited number of time-series observations, we are constrained in the number of determinants that we can include in our empirical work. As detailed in Section 3.1, these variables can affect NFA positions through several channels as highlighted by a number of theoretical contributions. Building an integrative general-equilibrium model that would nest the various hypotheses is beyond the scope of this paper, and our empirical specification will inevitably not be able to discriminate between all competing theories.

It is important to take note that all variables should be interpreted as measured relative to global values, since common movements in output per capita, demographic trends, and government debt should not
Figure 2 NET FOREIGN ASSETS, INDUSTRIAL COUNTRIES
Figure 3 NET FOREIGN ASSETS, DEVELOPING COUNTRIES
affect NFA positions, but rather will operate via global variables such as the world real interest rate.

3.1 THEORETICAL CHANNELS
Relative output per capita can affect NFA positions through several channels. First, if the domestic marginal product of capital decreases as an economy grows richer, domestic investment will fall and home investors will seek out overseas accumulation opportunities. Second, an increase in domestic income may lead to a rise in the domestic savings rate. This result is most clearly generated in models with habit formation in consumption preferences: as an economy grows, consumption will lag behind output (see, for instance, Carroll, Overland, and Weil, 2000). An alternative explanation has been suggested by Rebelo (1992): under Geary-Stone preferences, the savings rate will also be increasing in income levels, since the marginal utility of extra consumption sharply diminishes once basic consumption needs are satisfied. We note that, even if the increase in the savings rate is temporary, there may be a permanent improvement in the NFA position. A positive relation between relative output per capita and the NFA position is also captured in the traditional stages-of-the-balance-of-payments hypothesis (see Halevi, 1971, and Fischer and Frenkel, 1974). Although these factors point to a positive relation between relative output per capita and the NFA position, an effect operating in the opposite direction may be at work in developing countries operating under credit constraints. In models in which an improvement in net worth or cash flow relaxes financial constraints, an increase in production may allow greater recourse to foreign credit, possibly implying a negative relation between net external assets and relative output, at least over some interval. The second variable we consider is the stock of public debt. In a world that exhibits departures from Ricardian equivalence, higher levels of public debt may be associated with a decline in the external position. 
For instance, in the Blanchard-Yaari finite-horizon model, an increase in public debt is not fully offset by an increase in private asset accumulation, since public debt is perceived as net wealth by current generations, who will bear only part of the tax burden implied by its higher stock (Blanchard, 1985; Faruqee and Laxton, 2000).

Third, demographic factors are also potentially important determinants of net foreign assets. For instance, countries with an aging population can prepare for an increase in the ratio of retirees to workers by accumulating overseas assets to supplement domestic income streams. Domestic investment in these countries will also be curtailed as the marginal product of capital is diminished by a reduction in the growth of (or a decline in) the working-age population and the labor force. At the other end of the population distribution, a society with a high youth dependency ratio may require heavy investment in social infrastructure (education, housing). A high youth dependency ratio may also reduce the savings rate, as households with children attempt to smooth consumption. Accordingly, we may expect to see a decline in NFA in countries experiencing a rise in the youth dependency ratio (see also Taylor, 1994; Taylor and Williamson, 1994; Obstfeld and Rogoff, 1996; Higgins, 1998).

However, the impact of demographic factors on the NFA position is not just a function of the youth and old-age dependency ratios, but also of the age structure of the working-age population (Mundell, 1991). For instance, a relatively young workforce may be associated with relatively low saving and high investment, whereas an older workforce may be associated with a rise in the NFA position, as the saving-for-retirement motive kicks in and domestic investment falls. For this reason, we will employ the entire age distribution in our empirical work.

Finally, some authors have recently modeled the determination of NFA positions in a stylized mean-variance portfolio framework, with the demand and supply for domestic and foreign assets being determined by risk and return characteristics and by the profiles of investors (see Calderon, Loayza, and Serven, 2000; Kraay, Loayza, Serven, and Ventura, 2000; Edwards, 2001). As the preceding discussion has highlighted, our fundamentals (output per capita, public debt, and demography) potentially affect these factors in complex ways. Among the channels not already discussed, output per capita and years to retirement may plausibly affect the degree of risk aversion.
However, the relation between risk aversion and the NFA position depends on whether the "safe" asset is domestic or foreign, which is typically a model-specific choice.

3.2 PREVIOUS EMPIRICAL WORK
Masson, Kremers, and Horne (1994) is one of the very limited number of studies focusing on the evolution of NFA.8 In their country studies of the United States, Japan, and Germany over the period 1960-1985, they relate NFA positions to the overall dependency ratio and the level of government debt, but do not include the level of income per capita.9 They find evidence of a long-run relation between these variables, and highlight the role of feedback mechanisms working through absorption in the adjustment towards the long-run equilibrium. Calderon, Loayza, and Serven (2000) relate the evolution of NFA to composite measures of risk and return; they find support for their specification, particularly for countries with low barriers to international capital movements.

Taylor (1994), Higgins (1998), and Herbertsson and Zoega (1999) have provided some evidence that demographic factors are an important driving force of medium-term current-account behavior. Herbertsson and Zoega (1999) focus in particular on the link between population age structure and public and private saving behavior: they highlight how countries with high youth dependency ratios tend to have larger current-account deficits.10 Employing a demographic specification similar to ours, Taylor (1994) and Higgins (1998) show that the demographic structure is quantitatively important in explaining medium-term current-account behavior.

8. Halevi (1971) and Roldos (1996) provide some empirical evidence on the stages-of-the-balance-of-payments hypothesis.
9. In a study of OECD countries, Bayoumi and Gagnon (1996) also control for fiscal and demographic effects, but their primary focus is on the effects of inflation on NFA positions.

3.3 EMPIRICAL ANALYSIS
Our empirical analysis of the long-run behavior of NFA uses data for 66 countries spanning the period 1970-1998. Throughout our empirical work, we split the sample between industrial and developing countries.11 The industrial countries consist of long-standing members of the OECD, which approximately correspond to the most-developed set of countries at the start of the sample period. We allow for potentially different relations between our fundamentals and NFA positions for the two groups, as well as for differences in data quality. For instance, we have already noted that output per capita may exert different effects in the two groups, and the difference in life expectancy and in retirement patterns means that demographic effects plausibly will also differ across the two samples. Furthermore, differences in the pervasiveness of liquidity constraints and other sources of violation of Ricardian equivalence may induce differences in the relation between net foreign assets and public debt in the two groups.

We use the following variables: NFA as a ratio of GDP (CUMCA and CUMFL measures, as well as the IIP measure for robustness checks), GDP per capita in 1995 U.S. dollars (in log form), the stock of public debt as a fraction of GDP, and the shares of population under 15, over 64, and between 15 and 64 (in 5-year cohorts).12 Public debt is defined as the sum of external public debt, net of foreign-exchange reserves, and gross domestic public debt.13 For industrial countries, the main source of data for public debt is the OECD (general government definition); for developing countries, the data have been constructed using the World Bank's Global Development Finance, the IMF's Government Finance Statistics, and national sources. Unfortunately, the definition of government for developing countries is not homogeneous: it can refer to central government, general government, or the nonfinancial public sector. When data availability was not a constraint, we have used the broadest definition of government. A data appendix detailing sources and definitions for the debt data is available from the authors. Finally, population shares were constructed using the United Nations (2000) Demographic Yearbook (Historical Supplement 1948-1997), supplemented by data from Herbertsson and Zoega (1999).14

10. However, Chinn and Prasad (2000) find instead only weak evidence of a systematic effect of dependency ratios on current-account balances in a wide sample of industrial and developing countries.
11. Industrial countries are the United States, the United Kingdom, Austria, Belgium-Luxembourg, Denmark, France, Germany, Italy, the Netherlands, Norway, Sweden, Switzerland, Canada, Japan, Finland, Greece, Iceland, Ireland, Portugal, Spain, Australia, and New Zealand. Developing countries are Turkey, South Africa, Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, the Dominican Republic, Ecuador, El Salvador, Guatemala, Mexico, Panama, Paraguay, Peru, Uruguay, Venezuela, Jamaica, Trinidad and Tobago, Israel, Jordan, Kuwait, Oman, Saudi Arabia, the Syrian Republic, Egypt, Sri Lanka, Taiwan, India, Indonesia, Korea, Malaysia, Pakistan, Philippines, Singapore, Thailand, Algeria, Botswana, Côte d'Ivoire, Mauritius, Morocco, Zimbabwe, Tunisia, and China.
12. Ideally, we would like to measure net foreign assets relative to a country's total wealth, but this would require data on land values, natural resources, human capital, and the value of domestic assets. In any event, it is plausible that GDP may serve as a reasonable proxy for wealth.
13. We would of course prefer to use net domestic public debt, but data availability for such a measure is much more limited. Since we focus on time-series behavior, and given the strong comovement between the two measures for those countries for which they are both available, we are confident that this choice still allows us to capture the right long-run relation. As we will discuss later, obstacles are more serious when undertaking cross-sectional analysis, because of cross-country differences in the definitions of "government."
14. We thank these authors for kindly sharing their data.

3.3.1 Bivariate Relations

As a precursor to the multivariate econometric work, we begin in Figures 4-6 by showing the bivariate relations between net foreign-asset positions on the one side and output per capita, public debt, and demographic structure on the other. In these graphs, the data are measured in terms of average changes between 1980-1989 and 1990-1998, capturing the medium- or long-term movement in country positions.15

Figure 4 NET FOREIGN ASSETS AND GDP PER CAPITA (AVERAGE CHANGE, 1990-1998 OVER 1980-1989): (a) INDUSTRIAL COUNTRIES; (b) DEVELOPING COUNTRIES
Figure 5 NET FOREIGN ASSETS AND PUBLIC DEBT (AVERAGE CHANGE, 1990-1998 OVER 1980-1989): (a) INDUSTRIAL COUNTRIES; (b) DEVELOPING COUNTRIES
Figure 6 IMPACT OF CHANGE IN DEMOGRAPHICS ON CHANGE IN NET FOREIGN ASSETS (AVERAGE CHANGE, 1990-1998 OVER 1980-1989): (a) INDUSTRIAL COUNTRIES; (b) DEVELOPING COUNTRIES

In each figure, panels (a) and (b) contain observations from the industrial and developing countries respectively. Figure 4a shows a quite striking positive bivariate relation between growth in output per capita and improvement in the NFA position among the industrial nations. A significant positive relation between output per capita and the NFA position is also evident in the developing-country sample in Figure 4b. However, the slope is flatter and the overall fit is much weaker. We will return to the difference in slopes between the industrial and developing samples when interpreting the results of the regression analysis below.

Figure 5 plots the change in the NFA position against the change in the ratio of public debt to GDP. For both industrial and developing countries, we observe an inverse bivariate relation: growth in public debt tends to be associated with a decline in the net foreign-asset position.

We turn to the effect of demographic structure in Figure 6. This figure charts the correlation between the change in the NFA position and the change in the population shares in each age cohort (0-14, 15-19, . . . , 60-64, 65+). For the industrial countries, we see that an increase in the youth dependency ratio is associated with a decline in the net foreign-asset position, as is an increase in the 30-49 age groups (albeit these correlations are weaker). There is a twin-peaks effect here: increases in both the 15-29 and 50-64 age groups are associated with an improvement in net foreign assets. For the developing countries, the effect of demographic structure is more uniform: an increase in the 15-29 population share is associated with a decline in the NFA position, whereas the 30-49 population share exerts a positive effect.

Although these scatter diagrams provide some suggestive evidence, the interpretation of bivariate relations of course should not be pushed too far.
For instance, there is a strong correlation in the data between demographic structure and output per capita, both along the time-series and along the cross-sectional dimension, which could explain the comovements of one of these variables with net foreign assets. To uncover whether all of these variables play a simultaneous role in the dynamics of net foreign assets, we next turn to formal multivariate analysis using panel regressions.

15. This "cross-section in first differences" is essentially a country fixed-effects specification, picking up intra-country time variation. We get similar graphs if we also employ data from the 1970s, but the more recent period offers more complete data and may better capture behavior under integrated capital markets.

3.3.2 Panel Fixed-Effects Regression Analysis

Since we are interested in the role played by shifts in our fundamentals in explaining the dynamic evolution of NFA positions, we focus on a fixed-effects panel specification in this sub-subsection (we consider the cross-section evidence in the next sub-subsection). The country fixed effects also have the merit of soaking up unobserved variables that may lead to permanent differences in measured net foreign-asset positions across countries.16 To control for common global movements, in particular of world GDP per capita, demographics, and public debt, we also include time dummies in all the regressions.

As a precursor to the regression analysis, we explored the univariate time-series properties of the data. We tested for nonstationarity in our series for net foreign assets, demographic variables, government debt, and log GDP per capita, using the NPT 1.1 econometric package (see Chiang and Kao, 2000). The tests were performed separately on the industrial- and the developing-country samples, using the panel unit-root test of Hadri (2000) (allowing for fixed effects and no time trend). For all series in the four samples, the test rejects the null hypothesis of stationarity.17

In light of the evidence on the presence of unit roots, we subsequently tested for panel cointegration among our variables, using tests suggested by Kao (1999) and Pedroni (1999). Both are residual-based tests for which the null hypothesis is lack of cointegration (nonstationarity of residuals). These test statistics are reported in Table 1 and strongly suggest the existence of a cointegrating relation among net foreign assets and our fundamentals.

Having ascertained that the variables display a common trend, we follow Stock and Watson (1993) and estimate their long-run relation using a dynamic ordinary least-squares (DOLS[-1,1]) specification.18 We report estimates for the 1970-1998 and 1980-1998 intervals.
The dataset is more complete for the post-1980 period, and in addition this latter period may better reflect an environment of open capital accounts.19 With respect to the specification, we want to allow the entire age structure to influence the net foreign-asset position, but do not wish to estimate independent parameters for our twelve age cohorts. We therefore follow Higgins (1998) by restricting the coefficients on the population share variables to lie along a cubic polynomial, so that only three composite demographic variables need actually be entered into the regression specification (see the Appendix for details).

16. This may capture both country-specific determinants of net foreign-asset positions and permanent measurement errors in our estimates of national net foreign-asset positions.
17. Other panel unit-root tests gave broadly similar results. The unit-root test results are available from the authors.
18. A DOLS[-2,2] specification gave similar results. Only leads and lags of output growth and changes in public debt are included (including changes in demographic variables makes no difference). Standard errors are corrected for heteroscedasticity.
19. In future work, we plan to look explicitly at measures of capital-account liberalization.
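To make the cubic restriction on the cohort coefficients concrete, the following is a minimal sketch of how twelve population shares can be collapsed into three composite variables. It assumes the cohort coefficients take the form alpha_j = g0 + g1*j + g2*j^2 + g3*j^3 and sum to zero across cohorts, which is our reading of the Higgins (1998) approach; the exact normalization used in the Appendix may differ. The function names and the numerical values of the gammas are hypothetical and chosen only for illustration.

```python
import numpy as np

def composite_demographics(shares):
    """Collapse J cohort shares into three composite variables.

    shares: array of shape (n_obs, J), one row per country-year.
    Assumption (not taken from the text): alpha_j is cubic in the cohort
    index j and sum_j alpha_j = 0, which eliminates the constant g0.
    """
    shares = np.asarray(shares, dtype=float)
    J = shares.shape[1]
    j = np.arange(1, J + 1, dtype=float)
    powers = np.vstack([j, j**2, j**3])            # shape (3, J)
    # D_k = sum_j j^k p_j - (1/J) * (sum_j j^k) * (sum_j p_j), for k = 1, 2, 3
    total = shares.sum(axis=1, keepdims=True)      # shape (n_obs, 1)
    return shares @ powers.T - total * powers.mean(axis=1)

def implied_alphas(gammas, J):
    """Recover the J cohort coefficients implied by (g1, g2, g3)."""
    gammas = np.asarray(gammas, dtype=float)
    j = np.arange(1, J + 1, dtype=float)
    powers = np.vstack([j, j**2, j**3])
    g0 = -(gammas @ powers.mean(axis=1))           # enforces sum_j alpha_j = 0
    return g0 + gammas @ powers

# Illustration with made-up shares for twelve cohorts (0-14, 15-19, ..., 65+).
rng = np.random.default_rng(0)
shares = rng.dirichlet(np.ones(12), size=5)
gammas = np.array([0.5, -0.1, 0.005])              # hypothetical estimates
D = composite_demographics(shares)
alphas = implied_alphas(gammas, 12)
```

Under these assumptions, regressing on the three columns of `D` is equivalent to regressing on all twelve shares with the cubic restriction imposed, and the implied cohort coefficients can be read off afterwards with `implied_alphas`.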
Table 1 KAO (1999) AND PEDRONI (1999) COINTEGRATION TESTS

                                (1)              (2)              (3)              (4)
                                Industrial       Industrial       Developing       Developing
                                1970-98          1980-98          1970-98          1980-98
Kao (1999) DF ρ*-test           10.89 (0.00)     10.42 (0.00)     -15.65 (0.00)    11.62 (0.000)
Kao (1999) ADF stat., 1 lag     -4.24 (0.00)     -4.48 (0.00)     -4.73 (0.00)     -4.17 (0.00)
Kao (1999) ADF stat., 2 lags    -4.36 (0.00)     -4.52 (0.00)     -4.29 (0.00)     -4.61 (0.00)
Pedroni (1999) t-stat. for ρ    -333.6 (0.00)    -237.1 (0.00)    -472.4 (0.00)    -315.2 (0.00)
Note: Cointegration tests are performed on the vector including NFA, log GDP per capita, public debt, and the three composite demographic variables. The table reports the value of the statistic, with p-values in parentheses. The null hypothesis in all tests is lack of cointegration. DF (ADF) stands for (augmented) Dickey-Fuller.
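As an illustration of the estimation step that follows the cointegration tests, the sketch below shows one way a DOLS[-1,1] regression with country and time effects could be set up. It is a simplified stand-in, not the procedure actually used for Tables 2 and 3: it adds leads and lags of the first differences of every regressor, estimates the effects with dummy variables, and does not compute the heteroscedasticity-corrected standard errors mentioned in the text. The function name `panel_dols` and its arguments are hypothetical.

```python
import numpy as np

def panel_dols(y, X, country, year, p=1):
    """Long-run coefficients from a DOLS[-p, p] panel regression.

    y: (n,) dependent variable; X: (n, k) regressors in levels;
    country, year: (n,) integer labels for each observation.
    Country and time effects are handled with dummy variables.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    country = np.asarray(country)
    year = np.asarray(year)
    k = X.shape[1]

    # Map each (country, year) pair to its row so leads and lags stay within country.
    keys = list(zip(country.tolist(), year.tolist()))
    index = {key: i for i, key in enumerate(keys)}

    rows, design = [], []
    for i, (c, t) in enumerate(keys):
        diffs = []
        for s in range(-p, p + 1):
            a = index.get((c, t + s))
            b = index.get((c, t + s - 1))
            if a is None or b is None:
                diffs = None                      # a lead or lag is missing
                break
            diffs.append(X[a] - X[b])             # first difference dated t + s
        if diffs is not None:
            rows.append(i)
            design.append(np.concatenate([X[i]] + diffs))

    rows = np.array(rows)
    Z = np.array(design)
    c_used, t_used = country[rows], year[rows]
    # Full set of country dummies; drop one year dummy to avoid collinearity.
    c_dum = (c_used[:, None] == np.unique(c_used)[None, :]).astype(float)
    t_dum = (t_used[:, None] == np.unique(t_used)[None, 1:]).astype(float)
    A = np.hstack([Z, c_dum, t_dum])
    coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
    return coef[:k]                               # coefficients on the levels of X
```

A call such as `panel_dols(nfa, np.column_stack([log_gdp, debt]), country, year)` would return the two long-run coefficients; the variable names in that call are placeholders for data that the reader would have to supply.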
Tables 2 and 3 report the results of the panel estimation (with fixed country and time effects) for the industrial- and developing-country samples respectively. For the industrial-country sample, we use both our measure of net foreign-asset positions (CUMCA) and, for robustness, a measure that replaces CUMCA by official international investment position data where they are available for most of the sample period (CUMCA + IIP). For developing countries, we employ the two alternative measures of the net foreign-asset position (CUMCA and CUMFL) described in Section 2. We also report results when Singapore is excluded from the sample, since it is an extreme observation with respect to its net foreign-asset position, and its role as a banking center considerably complicates the construction of accurate net-foreign-asset measures (indeed, CUMFL is not available). Finally, in each case, we also report results for balanced samples.

For the industrial-country sample, Table 2 shows a consistently strong positive influence of output per capita on the net foreign-asset position. The stable point coefficient of about 0.9 means that a 10% improvement in a country's relative output per capita is associated with a 9-percentage-point improvement in its ratio of net foreign assets to GDP. This result provides supporting evidence for those theories outlined in Section 3.1 that predict a positive comovement between output per capita and net foreign assets. If we consider the 1970-1998 interval, the results for public debt and
Table 2 DETERMINANTS OF NET FOREIGN ASSETS, INDUSTRIAL COUNTRIES: PANEL DOLS REGRESSIONS WITH FIXED TIME AND COUNTRY EFFECTS

                      (1)               (2)               (3)               (4)               (5)
                      CUMCA             CUMCA             CUMCA+IIP         CUMCA+IIP         CUMCA Balanced
                      1970-98           1980-98           1970-98           1980-98           1972-97
Log GDP per capita    0.9 (12.55)**     0.89 (6.71)**     0.91 (12.63)**    0.91 (7.26)**     0.94 (11.66)**
Public debt           -0.125 (3.1)**    -0.05 (0.9)       -0.124 (3.01)**   -0.07 (1.1)       -0.18 (4.54)**
χ²(demog)             30.1 (0.00)**     2.3 (0.51)        22.1 (0.00)**     4.2 (0.24)        43.6 (0.00)**
Adjusted R2           0.89              0.91              0.89              0.93              0.9
Observations          516               389               516               382               390
Countries             22                22                22                22                15
α(Popul. < 15)        -1.47             -0.81             -1.24             -1.2              -2.26
α(Popul. > 64)        -0.66             -0.59             -1.29             -0.44             -0.05
α max                 1.41 (50-54)      0.46 (35-39)      1.24 (50-54)      0.63 (30-34)      1.24 (50-54)
α min                 -1.49 (15-19)     -0.81 (0-14)      -1.29 (15-19)     -1.2 (0-14)       -2.26 (0-14)

Dynamic ordinary least squares, t-statistics in parentheses [p-value for the χ²(demog) statistic]. * (**) indicates statistical significance at the 5% (1%) confidence level. In regressions (1) and (2) the dependent variable is CUMCA for all countries except Belgium, for which it is the IIP estimate of NFA minus gold. In regression (3) the dependent variable is the IIP estimate of NFA for Belgium, Canada, Italy, Japan, and the United Kingdom, and CUMCA for all other countries. In regression (4) it is the IIP estimate of NFA for Austria, Belgium, Canada, Finland, Germany, Italy, Japan, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United States, and CUMCA for the remaining countries. For definition of α, see Appendix.
demographic structure are also quite strong. In line with our theoretical prior, net foreign assets are negatively related to the size of the government debt. The statistically significant -0.125 point estimate implies that the ratio of net foreign assets to GDP falls by 5 percentage points in a country that experiences a 40-percentage-point increase in its ratio of public debt to GDP (relative to the world average), indicating that government debt is largely absorbed domestically. The relation between net foreign assets and demographic structure also accords with the thrust of the theoretical literature: a decline in the net foreign-asset position occurs if there is an increase in the population shares of younger age cohorts, whereas the net foreign-asset position responds positively to an increase in the share of workers nearing retirement, with
Table 3 DETERMINANTS OF NET FOREIGN ASSETS, DEVELOPING COUNTRIES: PANEL DOLS REGRESSIONS WITH FIXED TIME AND COUNTRY EFFECTS

                      (1)               (2)               (3)               (4)               (5)               (6)               (7)
                      CUMCA             CUMCA             CUMCA             CUMCA             CUMFL             CUMFL             CUMCA
                      1970-98           1980-98           1970-98           1980-98           1970-98           1980-98           1977-97
                      All               All               No Sing.          No Sing.          No Sing.          No Sing.          Balanced
Log GDP per capita    -0.21 (4.59)**    -0.08 (1.05)      -0.29 (6.76)**    -0.2 (2.98)**     -0.31 (6.8)**     -0.25 (3.6)**     -0.26 (3.55)**
Public debt           -0.67 (14.03)**   -0.67 (13.3)**    -0.73 (16.8)**    -0.71 (14.6)**    -0.86 (21.4)**    -0.86 (19.6)**    -0.50 (8.87)**
χ²(demog)             28.7 (0.00)**     21.2 (0.00)**     5.5 (.14)         4.6 (.20)         12.7 (.01)**      6.4 (.10)         38.7 (0.00)**
Adjusted R2           0.83              0.87              0.85              0.88              0.89              0.91              0.89
Observations          779               590               753               572               728               566               416
Countries             39                39                38                38                38                38                16
α(Popul. < 15)        -1.01             -0.38             -0.49             -0.78             -0.9              -1.11             -1.17
α(Popul. > 64)        -0.522            0.158             2.05              2.47              4.33              4.6               0.55
α max                 3.92 (50-54)      3.54 (55-59)      2.05 (65+)        2.47 (65+)        4.33 (65+)        4.6 (65+)         5.66 (55-59)
α min                 -3.92 (20-24)     -3.54 (20-24)     -1.19 (25-29)     -1.1 (20-24)      -1.18 (45-49)     -1.14 (35-39)     -5.67 (20-24)

Dynamic ordinary least squares, t-statistics in parentheses [p-value for the χ²(demog) statistic]. * (**) indicates statistical significance at the 5% (1%) confidence level. In regressions (1)-(4) the dependent variable is CUMCA; in regressions (5) and (6) it is CUMFL. Regressions (3)-(6) exclude Singapore from the sample. For definition of α, see Appendix.
a maximum effect for the 50-54 age group. It is also interesting to note that the over-65 age group exerts a negative effect, consistent with the running down of net foreign assets.

However, as is evident from columns (2) and (4) in Table 2, the significance of the public-debt and demographic results is lost if we just look at the more recent 1980-1998 period. With regard to public debt, the weakening of the conditional correlation is due to just one country, Australia, where public debt exhibits a strong positive comovement with net foreign assets. If Australia is excluded from the sample, the coefficient on public debt rises in absolute value to -0.12 and is strongly statistically significant. Results for the balanced sample are similar to those for the 1970-1998 period for the full sample.20

We next turn to the results for the developing-country sample. First, across columns (1)-(6), we observe a negative relation between output per capita and the net foreign-asset position: as a developing country becomes richer, it typically sees an increase in its net external liabilities. The contrast with the result for the industrial-country sample is quite striking, although the negative coefficient is typically small and is insignificant in column (2). As was noted in Section 3.1, a negative association between output per capita and NFA is consistent with the relaxation of binding credit constraints on developing countries.21

Second, Table 3 shows a very strong inverse relation between public debt and the NFA position. A point estimate in the range [-0.67, -0.86] implies that a 20-percentage-point increase in government debt is associated with a [13.4, 17.2]-percentage-point decline in NFA.
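As a rough check on magnitudes, the arithmetic behind two of the point-estimate interpretations given in the text can be written out as follows. The symbols $\hat{\beta}_{YC}$ and $\hat{\beta}_{GDEBT}$ are illustrative labels for the estimated coefficients on log output per capita and public debt, not notation taken from the paper, and the first line treats a 10% rise in output per capita as a change of 0.10 in its logarithm, which is an approximation.

```latex
\begin{align*}
\Delta b &\approx \hat{\beta}_{YC}\,\Delta \ln y = 0.9 \times 0.10 = 0.09
  && \text{(industrial sample: 9 percentage points of GDP)} \\
\Delta b &= \hat{\beta}_{GDEBT}\,\Delta\mathit{GDEBT}: \quad
  -0.67 \times 20 = -13.4, \qquad -0.86 \times 20 = -17.2
  && \text{(developing sample: percentage points of GDP)}
\end{align*}
```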
This high pass-through from net government liabilities to net external liabilities is also consistent with pervasive credit constraints in developing countries, since credit-market imperfections are understood to be a primary source of deviations from Ricardian equivalence (Bernheim, 1987).22

With respect to the effect of demographic structure on the net foreign-asset positions of developing countries, the evidence in Table 3 shows a pattern similar to that for industrial countries: an increase in the population share of younger age groups is associated with a decline in the net foreign-asset position. A comparison of the α-coefficients between the industrial and developing countries also shows a greater sensitivity of the net foreign-asset position to age structure in the latter group. However, the significance of these demographic effects is weakened when Singapore is excluded from the sample.23 Finally, results for the balanced sample in column (7) are quite similar to those for the full sample, although the magnitude of the public-debt effect falls somewhat, to -0.50.24

We turn now to examining how well our panel specification, which imposes equality of all slope coefficients within our two country groups, can match the dynamics of net foreign assets at the individual-country level. For this purpose, Figures 7 and 8 plot actual and fitted long-run values of net foreign assets for selected industrial and developing countries.25 For the richer countries, the graphs suggest that our specification matches the time-series behavior of NFA quite well in small open economies, but does not do as well for Germany, the United Kingdom, and the United States. For the United States, public debt has been declining and growth has been strong in the late 1990s, and both factors would lead us to expect an improvement in NFA. Instead, the level of U.S. net external liabilities has increased substantially during this period.26 A similar diverging pattern between actual and fitted values occurs in the late 1990s for Japan, for the symmetric reason: faltering GDP growth and rapidly increasing public debt would lead us to expect, ceteris paribus, a worsening in the NFA position, whereas Japan's NFA position improved throughout the period.27

For developing countries, the overall fit shown in Figure 8 is very good, with very few exceptions. One is Venezuela, which has severe measurement problems for its NFA position because of the size of unrecorded assets held abroad. The divergence between Malaysia's actual and fitted values in the 1990s is due to the same factors at work in the United States: our model predicts that fast growth and a declining public debt should be associated with falling external liabilities.

In summary, the data suggest that foreign-asset positions in industrial countries exhibit a strong comovement with relative output per capita, while their quantitative link with public debt is relatively weak. Conversely, public debt is very strongly correlated with the dynamics of net external liabilities in developing countries, while the relation with income per capita along the time-series dimension is weak or negative. In addition, in both samples, the demographic variables generally play an important role in determining NFA positions. Our simple econometric specification captures long-run trends in NFA very well for developing countries and small open industrial economies, but is less successful in explaining the behavior of NFA in larger countries.

20. Belgium-Luxembourg, Denmark, Finland, Greece, Norway, and Portugal were dropped to obtain a balanced sample.
21. Results clearly suggest that the relation between output per capita and net foreign assets over the entire sample of industrial and developing countries is nonmonotonic. To some extent, we capture a nonlinear relation by splitting the sample between industrial and developing countries. We also tried to capture nonlinearities within the developing-country sample by positing the existence of a threshold level of income (varying the choice of threshold), as well as by splitting the developing-country sample into richer and poorer countries based on initial or average income. However, no strong evidence of nonlinearity emerges from the analysis: the relation with income per capita remains weak statistically and economically.
22. In most of the developing countries in our sample, public debt was primarily contracted internationally, given the shallowness of domestic financial markets.
23. Singapore has undergone a dramatic demographic transition, with a rapid aging of the population. Of course, this may in fact represent very good evidence regarding the effect of demography on net foreign assets, since Singapore has also been rapidly accumulating external assets in recent years.
24. The balanced sample for developing countries excludes Algeria, Argentina, Bolivia, Botswana, Brazil, Chile, Côte d'Ivoire, the Dominican Republic, Paraguay, Peru, Trinidad and Tobago, Turkey, and Zimbabwe.
25. Graphs for all other countries are available from the authors. The fitted values are generated from fixed-effects panel OLS regressions: the coefficient estimates are very similar to those obtained from the DOLS specification.
26. See Obstfeld and Rogoff (2000) on the sustainability of the U.S. external position.
27. In part, these patterns can be linked to the increased degree of equity diversification across countries: for example, the strong performance of U.S. equity markets during the 1990s and the weak performance of Japanese markets implied capital gains for foreign holders of U.S. equities and losses for foreign holders of Japanese equities.

Figure 7 ACTUAL AND FITTED VALUES, NET FOREIGN ASSETS, SELECTED INDUSTRIAL COUNTRIES
Figure 8 ACTUAL AND FITTED VALUES, NET FOREIGN ASSETS, SELECTED DEVELOPING COUNTRIES

3.3.3 Cross-Sectional Evidence

The panel data analysis presented in the previous sub-subsection has focused on the evolution of net foreign assets within countries. In this sub-subsection, we investigate the cross-sectional relation between NFA and their determinants, focusing on the 1990s. Table 4 presents results of cross-sectional regressions of net foreign assets on log output per capita, public debt, and demographic variables, where all variables are averages during the period 1990-1998.28 Relative output per capita is the only significant variable in explaining the cross-sectional variation in NFA positions across industrial countries. As in the time-series dimension, richer countries have larger NFA positions, although the cross-section point estimate is 40-50% smaller in magnitude.
Neither public debt nor demography is helpful in explaining the 1990s cross section for industrial countries. Our fundamentals are more successful in explaining cross-country differences in net external positions among developing countries. In contrast to the time-series result, we find a positive association between output per capita and NFA in the cross section, although the point estimate is typically small and not significant in column (6). Similar to the time-series evidence, the cross-sectional effect of public debt is negative and significant: developing countries with larger public debts also have larger net external liabilities.
28. The results are virtually unchanged if we focus on a single year, because these variables move only slowly from year to year.
Table 4 NET FOREIGN ASSETS: CROSS-SECTIONAL REGRESSIONS
(1) CUMCA 1990-98 Industrial
(2) CUMCA+IIP 1990-98 Industrial
0.45 (3.58)**
0.54 (2.92)**
Log GDP per capita
(3) CUMCA 1990-98 Dev.
(4) CUMCA 1990-98 Dev., no Sing.
(5) CUMFL 1990-98 Dev., no Sing.
(6) CUMFL 1990-98 Dev., no Sing.
0.15 (1.6)
-1.87 (2.93)** 0.13 (3.26)**
0.18 (2.32)**
0.17 (2.0)**
-0.11 (0.35)
-0.44 (4.52)**
-0.45 (4.47)**
-0.65 (5.18)**
-0.71 (6.55)**
35.3 (0.00)**
33.6 (0.00)**
36.7 (0.00)**
1.35 (0.28)
Log GDP per capita squared Public debt
0.10 (0.7)
χ²(demog)
3.05 (0.38)
2.21 (0.53)
0.45 22 -1.2 -0.44 0.62 (30-34) -1.2 (0-14)
0.33 22 394.2 -1314.6 424.3 (15-19) -1314.6 (65+)
Adjusted R2 Countries α(Popul. < 15) α(Popul. > 64) α_max α_min
0.62 39 -489.2 1527.8 1527.8 (65+) -511.9 (20-24)
0.57 38 -442.3 1389.0 1389.0 (65+) -464.0 (20-24)
0.63 38 -276.9 921.8 921.8 (65+) -298.1 (35-39)
0.69 38 -2.25 -0.04 1.24 (50-54) -2.25 (0-14)
Ordinary least squares, heteroscedasticity-corrected t-statistics in parentheses [p-value for the χ²(demog) statistic]. * (**) indicates statistical significance at the 5% (1%) confidence level. In regression (1) the dependent variable is CUMCA for all countries except Belgium, for which it is the IIP estimate of net foreign assets minus gold. In regression (2) the dependent variable is the IIP estimate of NFA for Austria, Belgium, Canada, Finland, Germany, Italy, Japan, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United States, and CUMCA for the remaining countries. Regressions (3)-(6) refer to the developing-country sample. In regressions (3) and (4), the dependent variable is CUMCA; in regression (5) it is CUMFL. Regressions (4) and (5) exclude Singapore. For definition of α, see Appendix.
Columns (4)-(6) also suggest a significant effect of the demographic structure on the cross-section distribution of NFA positions among developing countries, with a pattern that is qualitatively similar to that found in the time-series data. The differences in the coefficients on income between the industrial and developing samples, both in the time series and in the cross section, suggest that the underlying relation between NFA and output per capita is nonlinear. We report results using a quadratic cross-sectional relation between output per capita and NFA for developing countries in column (7).29 The specification does pick up a nonmonotonicity, but the turning point is at a low threshold ($1170); only 8 out of the 38 countries are in the region in which the cross-sectional relation between output per capita and NFA is slightly negative.30
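The turning point of a quadratic in log output per capita can be checked with a few lines of arithmetic. The sketch below assumes that the pair -1.87 and 0.13 printed in Table 4 are the linear and squared coefficients of the quadratic specification and that logs are natural logs. Neither assumption is stated explicitly in the text, and the function name is illustrative. Because the published coefficients are rounded, the implied threshold is only approximate.

```python
import math


def turning_point(b_linear: float, b_squared: float) -> float:
    """Output per capita at which b_linear*ln(y) + b_squared*ln(y)**2 has zero slope."""
    if b_squared == 0:
        raise ValueError("no turning point without a quadratic term")
    return math.exp(-b_linear / (2.0 * b_squared))


# Rounded coefficients as printed (assumed mapping to the quadratic column).
central = turning_point(-1.87, 0.13)

# A squared coefficient printed as 0.13 could be anything in [0.125, 0.135).
low = turning_point(-1.87, 0.135)
high = turning_point(-1.87, 0.125)

print(f"central estimate: {central:,.0f}")
print(f"range implied by rounding: {low:,.0f} to {high:,.0f}")
```

The $1170 threshold reported in the text lies inside this range, which is what one would expect if it was computed from unrounded estimates.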
4. The Dynamics of Net Foreign Assets and the Trade Balance
In the previous section, we focused on the long-run behavior of NFA, arguing that it can be characterized as a cointegrating relation $b_{it} = \sigma' Z_{it} + \epsilon_{it}$. In this section, we shift our attention to the adjustment mechanism: namely, the role played by our long-run model in shaping the short-run dynamics of NFA, as well as the implications these dynamics have for the trade balance.
4.1 THE ECM REPRESENTATION
Since the underlying long-run relation is a cointegration equation, we can obtain the "desired" change in NFA, $\Delta b_{it}$, as the fitted values from estimating an error-correction-mechanism representation
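As a sketch, an error-correction form consistent with the description in the text and in the notes to Table 5 (an error-correction coefficient $\lambda$, the lagged change in the dependent variable, contemporaneous changes in the $Z$ variables, and country and time dummies) would be the following. The symbols $\beta$, $\gamma$, $\alpha_i$, $\phi_t$, and $u_{it}$ are assumed names, and the sign convention is chosen so that $\lambda$ corresponds to the negative coefficients reported in Table 5.

```latex
\Delta b_{it} = \lambda \left( b_{i,t-1} - \sigma' Z_{i,t-1} \right)
  + \beta\, \Delta b_{i,t-1} + \gamma' \Delta Z_{it}
  + \alpha_{i} + \phi_{t} + u_{it}, \qquad \lambda < 0 \tag{4}
```

Under this form, and ignoring the short-run terms, a deviation from the long-run relation shrinks by the factor $1 + \lambda$ each year, so its half-life is $\ln 0.5 / \ln(1 + \lambda)$. A coefficient of $\lambda = -0.11$ gives roughly six years.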
In order to keep the model specification as parsimonious as possible, we impose equality of all slope coefficients within the industrial-country sample and within the developing-country sample in estimating this error-correction specification. Table 5 reports the estimated error-correction coefficient $\lambda$ and the overall fit of equation (4) for the different country groups and samples. The specification of the regression also includes the lagged change in the dependent variable and contemporaneous changes in all explanatory variables (coefficients not reported). Results show that deviations of NFA from their long-run trend tend to be quite persistent, with a half-life of 5-6 years, and that the speed of adjustment is quite similar in industrial and developing countries. Given the restrictive specification of the short-run dynamics, the fit of the regressions is remarkably good, especially for developing countries.
29. A similar specification for the whole sample gives statistically weaker results, with an estimated turning point below output per capita of U.S.$1000. It makes little difference to the results if Singapore is included or CUMCA is used as the NFA measure.
30. Caution should be exercised in interpreting these cross-sectional results, because our sample excludes low-income countries, which are typically highly indebted.

Long-Term Capital Movements • 101
Table 5 CHANGES IN NET FOREIGN ASSETS: SPEED OF ADJUSTMENT: PANEL REGRESSIONS, ERROR-CORRECTION SPECIFICATION

(a) Industrial Countries (note a)
                  (1)        (2)        (3)         (4)
                  CUMCA      CUMCA      CUMCA+IIP   CUMCA+IIP
                  1970-98    1980-98    1970-98     1980-98
Error correct.    -0.11      -0.17      -0.12       -0.14
                  (4.11)**   (4.59)**   (4.23)**    (3.34)**
Adjusted R2       0.28       0.30       0.27        0.13
Observations      539        393        537         374
Countries         22         22         22          22

(b) Developing Countries (note b)
                  (1)        (2)        (3)        (4)        (5)        (6)
                  CUMCA      CUMCA      CUMCA      CUMCA      CUMFL      CUMFL
                  All        All        No Sing    No Sing    No Sing    No Sing
                  1970-98    1980-98    1970-98    1980-98    1970-98    1980-98
Error correct.    -0.06      -0.11      -0.10      -0.16      -0.10      -0.15
                  (2.36)*    (2.96)**   (4.99)**   (5.05)**   (4.53)**   (4.66)**
Adjusted R2       0.44       0.45       0.48       0.50       0.54       0.56
Observations      849        612        822        594        786        585
Countries         39         39         38         38         38         38

Ordinary least squares, t-statistics in parentheses [p-value for the χ²(demog) statistic]. * (**) indicates statistical significance at the 5% (1%) confidence level.
a. Regressions also include the lagged first difference in CUMCA, contemporaneous first differences in the other variables belonging to the Z-vector, and country and time dummies. In regressions (1) and (2) the dependent variable is the change in CUMCA for all countries except Belgium, for which it is the change in the IIP estimate of NFA minus gold. In regression (3) the dependent variable is the change in the IIP estimate of NFA for Belgium, Canada, Italy, Japan, and the United Kingdom, and the change in CUMCA for all other countries. In regression (4) it is the change in the IIP estimate of NFA for Austria, Belgium, Canada, Finland, Germany, Italy, Japan, Netherlands, Spain, Sweden, Switzerland, the United States, and the change in CUMCA for the remaining countries.
b. In regressions (1)-(4) the dependent variable is the change in CUMCA; in regressions (5)-(6) it is the change in CUMFL. Regressions also include the lagged first difference in the dependent variable, contemporaneous first differences in the other variables belonging to the Z-vector, and country and time dummies. Regressions (3)-(6) exclude Singapore from the sample.

It is useful to ask how well this simple specification accounts for the dynamics of NFA at the individual-country level. For this purpose, Table 6 reports the country-by-country bivariate correlations between actual and fitted values for changes in NFA for the period 1970-1998. For industrial countries, the model does poorly in explaining the short-run dynamics of the NFA position for most of the large economies (Japan, the United Kingdom, and the United States), while it tracks the smaller open economies, such as Ireland, Portugal, and the Scandinavian countries, quite nicely.31 For developing countries, the model performs remarkably well across the board, explaining a substantial fraction of year-to-year changes in NFA, with very few exceptions.
4.2 IMPLICATIONS FOR THE TRADE BALANCE
The factors driving the NFA position influence the behavior of the trade balance via two channels. First, changes in the desired NFA position require shifts in the trade balance. Second, for a given desired NFA position, there is an inverse relation between the investment returns on the outstanding stock of NFA and the trade balance. In an accounting sense, changes in the NFA position reflect trade imbalances, investment income payments and receipts, and capital gains and losses. Formally,
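The accounting identity can be written out from the definitions given in the next sentence. The form below is a reconstruction under that reading, and the superscripts $c$ and $k$ on the transfer terms are assumed labels for current and capital transfers:

```latex
B_{it} - B_{i,t-1} = TB_{it} + TR^{c}_{it} + TR^{k}_{it} + i_{it} B_{i,t-1} + KG_{it} \tag{5}
```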
where $TB_{it}$ is the balance of trade in goods and services, $TR^{c}_{it}$ ($TR^{k}_{it}$) are net current (capital) transfers, $i_{it}B_{i,t-1}$ is investment income, and $KG_{it}$ is the capital gain/loss on outstanding net external assets. The current account is given by the sum of $TB_{it}$, $TR^{c}_{it}$, and the investment income $i_{it}B_{i,t-1}$.32 Dividing both sides of equation (5) by GDP measured in U.S. dollars, adding together investment income and capital gains, and rearranging terms, we obtain
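Carrying out that division term by term suggests the following form, offered as a reconstruction rather than a quotation. Lowercase letters denote ratios to dollar GDP, $kg_{it}$ is the capital gain expressed as a rate on $B_{i,t-1}$, and $\gamma_{it}$ is the growth rate of dollar GDP. The capital-transfer term $tr^{k}_{it}$ is kept for consistency with equation (5); the definitions in the text do not mention it, so the original may have omitted or absorbed it.

```latex
b_{it} - b_{i,t-1} = tb^{*}_{it} + tr^{k}_{it}
  + \frac{i_{it} + kg_{it} - \gamma_{it}}{1 + \gamma_{it}}\, b_{i,t-1} \tag{6}
```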
31. One reason why the model may not fully capture the dynamics of the NFA position for the former group of countries is that these are financial centers, and high levels of gross international asset trade mean that the impact of volatile revaluation effects on the NFA position is likely to be especially important.
32. The expression $i_{it}B_{i,t-1}$ for investment income implicitly assumes that the dollar yield on external assets and liabilities is the same. We discuss below the implications of this assumption.
Table 6 CORRELATION BETWEEN ACTUAL AND FITTED CHANGE IN NET FOREIGN ASSETS (note a)

Industrial countries   Observ.   Correlation
Australia              24         0.07
Austria                27         0.80
Belgium                16         0.40
Canada                 27         0.17
Denmark                18         0.74
Finland                27         0.71
France                 21         0.55
Germany                27         0.40
Greece                 26         0.68
Iceland                18         0.83
Ireland                27         0.79
Italy                  27         0.69
Japan                  27         0.10
Netherlands            27        -0.31
New Zealand            27         0.58
Norway                 27         0.62
Portugal               25         0.81
Spain                  22         0.46
Sweden                 27         0.72
Switzerland            18        -0.35
United Kingdom         27         0.19
United States          27         0.01

Devel. countries       Observ.   Correlation
Algeria                 8         0.49
Argentina               7         0.90
Bolivia                 4         0.95
Botswana               19         0.67
Brazil                 18         0.79
Chile                  10         0.76
Colombia               27         0.81
Costa Rica             27         0.88
Cote d'Ivoire           8         0.94
Dominic. Rep.           5         0.82
Ecuador                27         0.88
El Salvador            27         0.60
Guatemala              24         0.32
India                  24         0.42
Indonesia              26         0.50
Israel                 27         0.72
Jamaica                27         0.80
Jordan                 23         0.77
Korea                  27         0.77
Malaysia               27         0.56
Mauritius              26         0.81
Mexico                 24         0.17
Morocco                27         0.92
Pakistan               26         0.85
Panama                 27         0.21
Paraguay               22         0.77
Peru                    8         0.80
Philippines            27         0.60
South Africa           27         0.62
Sri Lanka              25         0.78
Taiwan                 23         0.71
Thailand               27         0.44
Trinidad & T.          21         0.75
Tunisia                27         0.76
Turkey                 22         0.48
Uruguay                24         0.87
Venezuela              27         0.34
Zimbabwe               20         0.63

a. Correlation coefficient between actual and fitted values of changes in the ratio of NFA to GDP. Regressions for the period 1970-1998 corresponding to column (1) in Table 5a for industrial countries, and column (5) in Table 5b for developing countries.
where $tb^{*}_{it}$ is the ratio to GDP of the balance of goods, services, and current transfers; $i_{it} + kg_{it}$ is the nominal rate of return on outstanding net foreign assets (nominal yield $i_{it}$ plus capital gains/losses); and $\gamma_{it}$ is the rate of change of GDP measured in current dollars. Note that $1 + \gamma_{it} = (1 + g_{it})(1 + \varepsilon_{it})(1 + \pi^{*}_{t})$, where $g_{it}$ is the real GDP growth rate, $\varepsilon_{it}$ is the rate of real-exchange-rate appreciation of the home country's currency vis-a-vis the U.S. dollar, and $\pi^{*}_{t}$ is U.S. inflation. In turn, we can rearrange equation (6) to relate the transfer-corrected trade balance to our estimate of the change in the NFA position, given in equation (4):
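A rearrangement consistent with the three terms described in the next paragraph is sketched below. It treats the fitted value from the error-correction regression as $\Delta \hat{b}_{it}$ and its residual as $u_{it}$ (both assumed symbols), and it uses $(1 + i_{it} + kg_{it}) = (1 + r_{it})(1 + \pi^{*}_{t})$ to express the adjusted-returns term $\Psi_{it}$ in real terms. The capital-transfer ratio $tr^{k}_{it}$ on the left-hand side is an assumption carried over from the identity in levels.

```latex
tb^{*}_{it} + tr^{k}_{it} = \Delta \hat{b}_{it} - \Psi_{it} + u_{it},
\qquad
\Psi_{it} \equiv \left[ \frac{1 + r_{it}}{(1 + g_{it})(1 + \varepsilon_{it})} - 1 \right] b_{i,t-1}
```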
where $r_{it}$ is the real rate of return on net foreign assets, measured in U.S. dollars.33 The transfer-corrected trade balance is related to three factors. The first term on the RHS of this equation reflects the change in the net foreign-asset position that is required for convergence to its long-run fundamental value, as captured by the ECM representation in Section 4.1; the second term ($-\Psi_{it}$) is the combined effect of overall returns, output growth, and real-exchange-rate changes, interacted with the past NFA position; and the third term is the component of the change in NFA that is not explained by the dynamics of its long-run fundamentals. Consider for example a debtor country for which the rate of return on its net liabilities is higher than its growth rate. In this case, if the fundamental NFA position does not change, the country will need to run a trade surplus equal to $-\Psi_{it}$.
33. In the presence of differences in rates of return between external assets and liabilities, the RHS would also include the term $(r^{L}_{it} - r^{A}_{it})\, l_{i,t-1}$, where $r^{L}_{it} - r^{A}_{it}$ is the rate of return differential between liabilities and assets, and $l_{i,t-1}$ is the stock of gross liabilities. We implicitly include this term in the adjusted returns $\Psi_{it}$.
Figure 9 TRADE BALANCE AND ADJUSTED RETURNS: CROSS-COUNTRY DISPERSION, 1980s AND 1990s
Panels: Adjusted returns, industrial countries; Adjusted returns, developing countries
Note: Number of countries with adjusted returns and trade balance (ratios of GDP), averaged over the corresponding time period, within the given range.
In Figure 9 we show the distribution of adjusted returns $\Psi_{it}$ and the trade balance $tb^{*}_{it}$ among industrial and developing countries for the periods 1980-1989 and 1990-1998.34 The low growth and real depreciation associated with the debt crisis are reflected in the large number of less developed countries with large negative adjusted returns during the 1980s, a number that declines in the 1990s. Among industrial countries one observes an increase in the number of countries with large negative adjusted returns during the 1990s, and correspondingly in the number of countries running large trade surpluses. The increase in rates of return generated by the capital gains on equity holdings during the 1990s is one factor behind this development. Figure 9 also highlights that there is more dispersion in the trade balance among developing than among industrial countries.
34. The construction of the adjusted returns term $\Psi_{it}$ is complicated by the measurement problems associated with capital gains and losses, briefly discussed in Section 2. For industrial countries, the series for $KG_{it}$ includes the difference between the change in the outstanding stock and the flow for portfolio equity investment assets and liabilities, foreign direct investment assets and liabilities, and foreign-exchange reserves. These differences are particularly significant for portfolio equity assets and liabilities, especially during the 1990s, because of the fluctuations in market values generated by stock-market trends and volatility. Our data do not allow us to estimate capital gains and losses on the debt portfolio of industrial countries. For developing countries, the series on capital gains and losses includes one additional item: the impact of cross-currency fluctuations on the outstanding stock of gross external debt (data that are reported in the World Bank's Global Development Finance database).
Figure 10 presents scatter diagrams illustrating the cross-sectional relation between the adjusted-returns term and the trade balance for the industrial and developing countries for the period 1980-1998. The graphs also show a line with a negative slope of 45 degrees that corresponds, for a given level of adjusted returns, to the trade balance that would keep the NFA position constant (in the absence of capital transfers such as debt forgiveness). In both samples there is a strong negative relationship between adjusted returns and trade balance. Some observations are noteworthy. First, the United States's adjusted-returns term is positive, a reflection of the positive rate-of-return differential between its external assets and liabilities. This implies that a trade deficit of 0.5% of GDP over the past two decades would have been consistent with an unchanged NFA position. In fact the trade deficit has been much larger, in connection with the deterioration of the U.S. net external position. Second, Singapore's spectacular increase in its NFA, even given its large positive adjusted-returns term, has required large trade surpluses.
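The trade balance that holds the NFA ratio constant can be illustrated numerically. The sketch below assumes the adjusted-returns term takes the form $\Psi = [(1 + r)/((1 + g)(1 + \varepsilon)) - 1]\, b_{t-1}$, which is a reconstruction from the definitions in the text rather than a formula quoted from it. It also ignores capital transfers and any return differential between assets and liabilities. The function names and the numbers are hypothetical.

```python
def adjusted_returns(r: float, g: float, eps: float, b_prev: float) -> float:
    """Adjusted-returns term Psi: change in NFA/GDP that occurs with a zero trade balance.

    r       real rate of return on net foreign assets (in U.S. dollars)
    g       real GDP growth rate
    eps     rate of real exchange-rate appreciation against the U.S. dollar
    b_prev  NFA-to-GDP ratio at the end of the previous period
    """
    return ((1.0 + r) / ((1.0 + g) * (1.0 + eps)) - 1.0) * b_prev


def stabilizing_trade_balance(r: float, g: float, eps: float, b_prev: float) -> float:
    """Trade balance (ratio to GDP) that keeps NFA/GDP unchanged, absent capital transfers."""
    return -adjusted_returns(r, g, eps, b_prev)


# Hypothetical debtor: net liabilities of 50% of GDP, 6% real return, 3% growth,
# no real appreciation.
debtor_tb = stabilizing_trade_balance(r=0.06, g=0.03, eps=0.0, b_prev=-0.50)

# Hypothetical creditor facing the same rates: it can run a deficit of the same size.
creditor_tb = stabilizing_trade_balance(r=0.06, g=0.03, eps=0.0, b_prev=0.50)

print(f"debtor: {debtor_tb:.2%} of GDP, creditor: {creditor_tb:.2%} of GDP")
```

In this illustration the debtor must run a surplus of about 1.5% of GDP just to keep its position from deteriorating, which corresponds to a point on the 45-degree line described for Figure 10.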
In summary, the results in this section show that the long-run fundamentals driving the NFA positions can also explain an important fraction of short-run changes in countries' external wealth, and that the behavior of the trade balance is tightly related to the dynamics of the NFA position. The extent to which changes in the underlying fundamentals of the net external position and correction in any drift from the long-run equilibrium relation are reflected in the trade balance depends on the adjusted returns on the outstanding NFA position.
5. Net Foreign Assets and Real Interest Differentials
Figure 10 ADJUSTED RETURNS AND THE TRADE BALANCE
Rates of return on assets and liabilities play a crucial role in determining the dynamic behavior of NFA and are likely to be influenced by the level and composition of NFA. For instance, a home bias in asset demand and/or an upward-sloping supply of international funds means that interest rates may be linked to NFA positions: debtor countries should experience higher interest rates than creditor countries. Applications of this portfolio-balance approach have typically related currency returns to shifts in relative asset supplies in different currencies (e.g. a model of dollar interest rates vs. yen interest rates), but the model should hold more generally as a framework for thinking about country risk (Frankel and Rose, 1995). In this spirit, the real interest-rate differential can be written as
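One way to write this decomposition, matching the two terms described in the next sentence, is shown below. The notation is assumed: $r^{*}_{t}$ is the benchmark (world or U.S.) real interest rate and $q_{it}$ is the log real exchange rate, defined so that an increase is a real appreciation.

```latex
r_{it} - r^{*}_{t} = d_{it} - E_{t}\!\left[ \Delta q_{i,t+1} \right]
```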
where $d_{it}$ is the country risk premium and the second term on the right-hand side is (minus) the expected rate of real exchange-rate appreciation. If the rate of real appreciation is zero in a steady state, then the long-run real interest differential just depends on the steady-state country risk premium
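A linear specification that fits the description in the following sentence is sketched here. The intercept $\alpha_{i}$ and slope $\beta$ are assumed names; the text states only that the premium is inversely and linearly related to $bx_{it}$.

```latex
r_{it} - r^{*}_{t} = d_{it} = \alpha_{i} - \beta\, bx_{it}, \qquad \beta > 0
```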
where we model the country risk premium as inversely (and linearly) related to the ratio of NFA to exports, $bx_{it}$.35
5.1 EMPIRICAL RESULTS
We confine attention to the industrial-country sample. Nominal interest rates are yields on government bonds, the same ones employed by Obstfeld and Rogoff (2000, 2001).36 We measure the real interest rate as the December nominal interest rate in year t minus the actual inflation rate in year t + 1. We report the panel fixed-effects results in Table 7, where the DOLS estimator is again employed. In panel (a), we include all countries, and the time dummies soak up the world real interest rate that is common to all countries; in panel (b), we employ the real interest differential vis-a-vis the U.S. The actual ratio of NFA to exports is employed as a regressor in columns (1)-(4), whereas in columns (5)-(8) we use the fitted values generated in Section 3.3.2.37 The results in columns (1)-(2) and (5)-(6) are for 1970-1998; those in columns (3)-(4) and (7)-(8), for 1980-1998. We also enter the stock of public debt and the rate of real exchange-rate appreciation in alternative specifications.38 In line with the portfolio-balance literature, the former is intended to control for variation in the supply of alternative assets; the latter is to proxy for expected changes in the real exchange rate.
35. We use exports rather than GDP as the denominator to better capture the capacity of the economy to make overseas payments. The choice of denominator makes little practical difference for the results.
36. Iceland is excluded from the sample. We thank those authors and Jay Shambaugh for generous assistance with these data.
37. In Section 3.3.2, we regressed the ratio of NFA to GDP on output per capita, the stock of public debt, and demographic variables. We multiply the fitted values from this regression by the ratio of GDP to exports.
Across columns (1)-(8), the results show clear evidence of a portfolio-balance effect in the determination of real interest differentials: for instance, according to the point estimate in column (1) of panel (b), a 20-percentage-point improvement in the ratio of NFA to exports is associated with a 50-basis-point reduction in the real interest-rate differential. The effect is also significant for the 1980-1998 period, and the estimated point coefficient is typically larger for the more recent period. These findings are little affected by inclusion of the stock of public debt and the rate of real exchange-rate appreciation. Even stronger results are obtained when the NFA position is instrumented by the level of GDP per capita, public debt, and demographic variables in columns (5)-(8), suggesting that the relation is not being generated by reverse causality running from the real interest differential to the NFA position.
Figure 11 provides a scatterplot of average net foreign assets and real interest rates over the period 1990-1998, documenting a negative relation between these variables. Table 8 reports cross-section regression results for the same period. In the cross section, net foreign assets again have a significant effect on the real interest-rate differential across all specifications. For instance, the point estimate of -1.07 in column (1) of panel (b) indicates that, all else equal, a country with an average NFA-to-exports ratio that is 50 percentage points above the sample mean enjoys a real interest rate that is 53.5 basis points below the average real interest-rate differential vis-a-vis the U.S.
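The two worked numbers above follow from multiplying a coefficient by a change in the NFA-to-exports ratio. The sketch below reproduces that arithmetic. It assumes the regressions measure the interest rate in percentage points and NFA/exports as a ratio (so 50 percentage points is 0.50), an interpretation inferred from the text's own examples rather than stated in it. The function name is illustrative.

```python
BASIS_POINTS_PER_PCT_POINT = 100


def rate_change_bp(coefficient: float, nfa_ratio_change_pp: float) -> float:
    """Change in the real interest rate, in basis points, implied by a change in NFA/exports.

    coefficient          percentage points of interest per unit of the NFA/exports ratio
    nfa_ratio_change_pp  change in NFA/exports, in percentage points of exports
    """
    change_in_ratio = nfa_ratio_change_pp / 100.0        # 50 pp -> 0.50
    change_in_rate_pp = coefficient * change_in_ratio    # in percentage points
    return change_in_rate_pp * BASIS_POINTS_PER_PCT_POINT


# Cross-section example from the text: coefficient -1.07, ratio 50 pp above the mean.
cross_section_bp = rate_change_bp(-1.07, 50)

# Panel example from the text: a 20 pp improvement goes with a 50 bp reduction.
# Backing out the coefficient this implies:
implied_panel_coefficient = (-50 / BASIS_POINTS_PER_PCT_POINT) / (20 / 100.0)

print(round(cross_section_bp, 1), round(implied_panel_coefficient, 2))
```

The implied panel coefficient of about -2.5 is more than twice the cross-section estimate, in line with the statement that the panel results are the stronger of the two.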
We note also that the stock of public debt typically has a marginally significant positive effect on the real interest-rate differential (at the 10% level), but real exchange-rate appreciation has no effect in the cross-sectional specification. The results in this section provide some suggestive evidence that NFA positions matter in determining real interest-rate differentials, in the spirit of the portfolio-balance literature.39 In future work, it would be
38. In line with the method for measuring expected inflation, the actual rate of real exchange-rate appreciation in year t+1 proxies for the expected rate of real appreciation in year t+1. In panel (a), we use a multivariate CPI-based real-exchange-rate series; in panel (b), the bilateral CPI-based real exchange rate vis-a-vis the U.S.
39. Bayoumi and Gagnon (1996) predict that a country's NFA position should be negatively correlated with its (after-tax) real interest rate. In this case, our estimate of the portfolio balance effect will be understated if a high real interest rate endogenously improves the NFA position. We further note that inflation and real interest rates are negatively correlated in the time-series dimension of our dataset but positively correlated in the cross section.
Table 7 REAL INTEREST RATES AND REAL INTEREST DIFFERENTIALS: PANEL DOLS REGRESSIONS WITH FIXED TIME AND COUNTRY EFFECTS
(1) 1970-98
(2) 1970-98
(3) 1980-98
(4) 1980-98
(5) 1970-98
(6) 1970-98
(7) 1980-98
(8) 1980-98
-1.5 (2.45)*
-1.63 (2.94)**
-2.87 (4.48)**
-2.81 (4.65)**
(a) Real Interest Rate NFA/exports
-1.06 (2.6)*
-0.83 (2.0)*
-1.36 (2.48)*
-0.91 (1.66)
Public debt
3.82 (2.1)*
7.1 (3.4)**
2.98 (2.03)*
3.56 (1.91)*
D(RER)
0.03 (1.2)
0.04 (1.74)
0.02 (-9)
2.64 (1.23)
Adjusted R2 Countries Observations
0.5 21
462
0.56 21
410
0.36 21
362
0.39 21
336
0.54 21
442
0.59 21
410
0.43 21
358
0.46 21
336
(b) Real Interest NFA/ exports
-2.54 (5.41)**
-2.73 (4.3)**
-0.04 (2.15)*
D(RER)
0.58 21 423
0.59 21 403
-2.22 (4.58)**
-2.57 (4.03)**
7.79 (4.82)**
3.18 (1.76)
Public debt
Adjusted R2 Countries Observations
-2.44 (5.5)**
Differential
-0.014 (.78)
0.6 21 344
0.64 21 338
0.6 21 416
-2.77 (4.27)**
-3.19 (4.83)**
-3.24 (5.52)**
2.23 (1.51)
3.18 (1.67)
0.012 (.54)
0.015 (.66)
0.59 21 386
0.63 21 340
0.67 21 319
Sample is industrial countries except Iceland. In panel (a), the dependent variable is the real interest rate; in panel (b), the real interest differential vis-a-vis the United States. In regressions (1)-(4), CUMCA is employed as the measure of NFA; in regressions (5)-(8), it is based on the fitted value from the regression of NFA on GDP per capita, public debt, and demographic variables. In regressions (2), (4), (6), and (8), the multivariate real exchange rate is employed in panel (a), and the bivariate real exchange rate vis-a-vis the United States in panel (b). * (**) indicates statistical significance at the 5% (1%) confidence level.
Figure 11 REAL INTEREST RATES AND NET FOREIGN ASSETS
instructive to experiment with different asset classes and maturities and explore alternative techniques for calculating expected inflation and the expected rate of real appreciation. Moreover, it would be interesting to distinguish between different components of the NFA position (e.g., is it just net external debt that matters? do portfolio equity liabilities and FDI liabilities have different effects?) and to investigate the interaction between NFA positions and other risk factors in determining real interest-rate differentials.

Table 8 REAL INTEREST RATES AND REAL INTEREST DIFFERENTIAL: CROSS-SECTION EVIDENCE (AVERAGE, 1990-98)

                        (1)        (2)        (3)        (4)
(a) Real Interest Rate
NFA/exports             -0.88      -0.88      -1.2       -1.18
                        (2.6)*     (2.68)*    (5.39)**   (5.28)**
Public debt                        1.57                  1.31
                                   (1.55)                (1.67)
D(RER)                             -0.19                 -0.19
                                   (0.9)                 (1.1)
Adjusted R2             0.31       0.35       0.49       0.52
Countries               21         21         21         21

(b) Real Interest Differential
NFA/exports             -1.07      -1.07      -1.27      -1.26
                        (3.62)**   (4.12)**   (6.61)**   (8.21)**
Public debt                        1.72                  1.33
                                   (1.8)                 (1.7)
D(RER)                             -0.08                 -0.1
                                   (.43)                 (.72)
Adjusted R2             0.54       0.59       0.65       0.68
Countries               20         20         20         20

Sample is industrial countries, except Iceland. 1990-1998 averaged data. In panel (a), the dependent variable is the real interest rate; in panel (b) the real interest differential vis-a-vis the U.S. In regressions (1)-(2), CUMCA is employed as the measure of NFA; in regressions (3)-(4) it is based on the fitted value from regression of NFA on GDP per capita, public debt, and demographic variables. In regressions (2) and (4), the multivariate real exchange rate is employed in panel (a), and the bivariate real exchange rate vis-a-vis the U.S. in panel (b). * (**) indicates statistical significance at the 5% (1%) confidence level.

6. Conclusions
Our primary goal in this paper has been to demonstrate the fruitfulness of studying the behavior of a key state variable in international macroeconomics: namely, the net foreign-asset position. We have shown that persistent fundamentals (output per capita, public debt, and demographic variables) have a major influence on the direction of international asset trade. Moreover, we have examined the role played by the desired and actual NFA position in determining the trade balance: the former because trade balances are typically required to accomplish changes in the target NFA position, the latter due to the role played by investment returns on outstanding foreign assets and liabilities. Finally, we have presented evidence that the NFA position is also important in determining international asset prices, exerting a negative influence on real interest-rate differentials.
Given the space limitations, there are many interesting questions concerning foreign-asset and -liability positions that we cannot address in this paper. In other work, we have shown that NFA positions exert an important influence on the long-run behavior of real exchange rates (Lane and Milesi-Ferretti, 2000) and made an initial exploration of the determinants of the structure of the "international balance sheet" between debt, portfolio equity, and foreign direct investment (Lane and Milesi-Ferretti, 2001b). Among the important issues that we must defer to future research is the role played by the level and composition of the external balance sheet in determining the probability of a financial crisis, and an exploration of the factors driving cross-country differences in rates of return on external assets and liabilities.
Appendix
Our demographic specification follows Fair and Dominguez (1991) and Higgins (1998). We divide the population into $J = 12$ age cohorts, and the age variables enter the net-foreign-assets equation as $\sum_{j=1}^{J} \alpha_{j} p_{jt}$, where $p_{jt}$ is the population share of cohort $j$ in period $t$ and $\sum_{j=1}^{J} \alpha_{j} = 0$. We make the restriction that the coefficients lie along a cubic polynomial
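In the $\gamma$ notation used in the rest of this appendix, the cubic restriction of Fair and Dominguez (1991) is usually written as below. This is the standard form from that literature, given here as a sketch rather than a verbatim copy of the appendix equation; indexing the cohorts $j = 1, \dots, J$ is an assumption.

```latex
\alpha_{j} = \gamma_{0} + \gamma_{1}\, j + \gamma_{2}\, j^{2} + \gamma_{3}\, j^{3},
\qquad j = 1, \dots, J
```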
The zero-sum restriction on the coefficients implies that
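Summing a cubic of the form $\alpha_{j} = \gamma_{0} + \gamma_{1} j + \gamma_{2} j^{2} + \gamma_{3} j^{3}$ over $j = 1, \dots, J$ and setting the total to zero pins down the intercept. The expression below follows from that algebra and assumes this cubic form and indexing:

```latex
\gamma_{0} = -\frac{1}{J} \left(
  \gamma_{1} \sum_{j=1}^{J} j
  + \gamma_{2} \sum_{j=1}^{J} j^{2}
  + \gamma_{3} \sum_{j=1}^{J} j^{3}
\right)
```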
In turn, we can estimate $\gamma_{1}, \gamma_{2}, \gamma_{3}$ by introducing the age variables into the estimated equation in the following way:
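Substituting a cubic in $j$ and the zero-sum restriction into $\sum_{j} \alpha_{j} p_{jt}$ gives $\sum_{k=1}^{3} \gamma_{k} D_{kt}$ with $D_{kt} = \sum_{j} j^{k} p_{jt} - \frac{1}{J} \sum_{j} j^{k}$, using $\sum_{j} p_{jt} = 1$. The sketch below builds these three regressors and recovers the cohort coefficients from given $\gamma$ values. It assumes cohorts indexed $j = 1, \dots, J$; the function names, the population shares, and the $\gamma$ values are made up for illustration and are not taken from the paper.

```python
from typing import List, Sequence

J = 12  # number of age cohorts, as in the appendix


def demographic_regressors(shares: Sequence[float]) -> List[float]:
    """Return [D1, D2, D3] for one country-year from its J cohort population shares."""
    if len(shares) != J or abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("expected J population shares that sum to one")
    regressors = []
    for k in (1, 2, 3):
        weighted = sum((j ** k) * p for j, p in enumerate(shares, start=1))
        mean_power = sum(j ** k for j in range(1, J + 1)) / J
        regressors.append(weighted - mean_power)
    return regressors


def cohort_coefficients(g1: float, g2: float, g3: float) -> List[float]:
    """Recover alpha_1..alpha_J; gamma_0 follows from the zero-sum restriction."""
    powers = range(1, J + 1)
    g0 = -(
        g1 * sum(powers)
        + g2 * sum(j ** 2 for j in powers)
        + g3 * sum(j ** 3 for j in powers)
    ) / J
    return [g0 + g1 * j + g2 * j ** 2 + g3 * j ** 3 for j in powers]


# Illustrative inputs only.
shares = [0.12, 0.11, 0.10, 0.10, 0.09, 0.09, 0.08, 0.08, 0.07, 0.06, 0.05, 0.05]
gammas = (0.9, -0.2, 0.01)

alphas = cohort_coefficients(*gammas)
D = demographic_regressors(shares)

direct = sum(a * p for a, p in zip(alphas, shares))       # sum_j alpha_j * p_j
via_regressors = sum(g * d for g, d in zip(gammas, D))    # sum_k gamma_k * D_k
print(abs(sum(alphas)) < 1e-9, abs(direct - via_regressors) < 1e-9)
```

The two checks at the end confirm that the recovered coefficients sum to zero and that the three constructed regressors carry the same information as the twelve cohort shares under the cubic restriction.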
Finally, we can easily recover the implicit $\alpha_{j}$ once we know $\gamma_{0}, \gamma_{1}, \gamma_{2}, \gamma_{3}$.
REFERENCES
Bayoumi, T., and J. Gagnon. (1996). Taxation and inflation: A new explanation for capital flows. Journal of Monetary Economics 38:303-330.
Bernheim, B. D. (1987). Ricardian equivalence: An evaluation of theory and evidence. In NBER Macroeconomics Annual, Vol. 2. Cambridge, MA: National Bureau of Economic Research, pp. 263-303.
Blanchard, O. (1985). Debts, deficits and finite horizons. Journal of Political Economy 93(April):223-247.
Calderon, C., N. Loayza, and L. Serven. (2000). External sustainability: A stock-flow perspective. World Bank Policy Research Working Paper 2281 (January).
Carroll, C., J. Overland, and D. Weil. (2000). Saving and growth with habit formation. American Economic Review 90:341-355.
Chiang, M., and C. Kao. (2000). Nonstationary panel time series using NPT 1.1: A user guide. Syracuse University. Mimeo.
Chinn, M., and E. Prasad. (2000). Medium-term determinants of current accounts in industrial and developing countries: An empirical exploration. IMF Working Paper 00/46 (March).
Edwards, S. (2001). Does the current account matter? Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 8275.
Fair, R., and K. Dominguez. (1991). Effects of the changing US demographic structure on macroeconomic equations. American Economic Review 81:1276-1294.
Faruqee, H., and D. Laxton. (2000). Life cycles, dynasties, saving: Implications for small, open economies. IMF Working Paper 00/126 (July).
Feldstein, M., and C. Horioka. (1980). Domestic savings and international capital flows. Economic Journal 90:314-329.
Fischer, S., and J. Frenkel. (1974). Economic growth and the stages of the balance of payments. In Trade, Stability and Macroeconomics, G. Horwich and P. Samuelson (eds.). New York: Academic Press, pp. 503-521.
Frankel, J., and A. Rose. (1995). Empirical research on nominal exchange rates. In Handbook of International Economics, Vol. 3, G. Grossman and K. Rogoff (eds.). Amsterdam: North-Holland.
Hadri, K. (2000). Testing for stationarity in heterogeneous panel data. Econometrics Journal 3:148-161.
Halevi, N. (1971). An empirical test of the "balance of payments stages" hypothesis. Journal of International Economics 1:102-118.
Herbertsson, T., and G. Zoega. (1999). Trade surpluses and life-cycle saving behavior. Economics Letters 65:227-237.
Higgins, M. (1998). Demography, national savings and international capital flows. International Economic Review 39:343-369.
International Monetary Fund. (1993). Balance of Payments Manual 5. Washington, DC: IMF.
International Monetary Fund. Balance of Payments Statistics, various issues.
International Monetary Fund. International Financial Statistics, various issues.
Kao, C. (1999). Spurious regression and residual-based tests for cointegration in panel data. Journal of Econometrics 90:1-44.
Kraay, A., N. Loayza, L. Serven, and J. Ventura. (2000). Country portfolios. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 7795.
Lane, P. R., and G. M. Milesi-Ferretti. (2000). The transfer problem revisited: Net foreign assets and long-run real exchange rates. Centre for Economic Policy Research Discussion Paper 2511.
Lane, P. R., and G. M. Milesi-Ferretti. (2001a). The external wealth of nations: Measures of foreign assets and liabilities for industrial and developing countries. Journal of International Economics 55(December):263-294.
Lane, P. R., and G. M. Milesi-Ferretti. (2001b). External capital structure: Theory and evidence. In The World's New Financial Landscape: Challenges for Economic Policy, H. Siebert (ed.). Tübingen, Germany: Mohr.
Masson, P., J. Kremers, and J. Horne. (1994). Net foreign assets and international adjustment: The United States, Japan and Germany. Journal of International Money and Finance 13:27-40.
Mundell, R. A. (1991). The great exchange rate controversy: Trade balances and the international monetary system. In International Adjustment and Financing: The Lessons of 1985-1991, F. Bergsten (ed.). Washington: Institute for International Economics.
Obstfeld, M., and K. Rogoff. (1996). Foundations of International Macroeconomics. Cambridge, MA: The MIT Press.
Obstfeld, M., and K. Rogoff. (2000). Perspectives on OECD economic integration: Implications for US current account adjustment. In Global Economic Integration: Opportunities and Challenges. Proceedings of a Symposium Sponsored by the Federal Reserve Bank of Kansas City.
Obstfeld, M., and K. Rogoff. (2001). The six major puzzles in international macroeconomics: Is there a common cause? In NBER Macroeconomics Annual, Vol. 15. Cambridge, MA: National Bureau of Economic Research, pp. 339-390.
Obstfeld, M., and A. M. Taylor. (2000). Real interest equalization and real interest parity over the long run: A reconsideration. Berkeley and Davis: University of California. Mimeo.
Pedroni, P. (1999). Critical values for cointegration tests in heterogeneous panels with multiple regressors. Oxford Bulletin of Economics and Statistics 61:653-678.
Rebelo, S. (1992). Growth in open economies. Carnegie-Rochester Series on Public Policy 36:5-46.
Roldos, J. (1996). Human capital, borrowing constraints and the stages of the balance of payments. International Monetary Fund (February). Mimeo.
Stock, J. H., and M. W. Watson. (1993). A simple estimator of cointegrated vectors in higher order integrated systems. Econometrica 61:783-820.
Taylor, A. M. (1994). Domestic savings and international capital flows reconsidered. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 4892.
Taylor, A. M., and J. G. Williamson. (1994). Capital flows to the New World as an intergenerational transfer. Journal of Political Economy 102:348-371.
United Nations. (2000). Demographic Yearbook: Historical Supplement 1948-1997. CD-ROM.
World Bank. Global Development Finance, various issues.
World Bank. World Development Indicators, various issues.
Comment KRISTIN J. FORBES MIT-Sloan School and NBER
1. Overview of the Paper

This paper is part of an ambitious project by Lane and Milesi-Ferretti attempting to measure, explain, and explore various aspects of international balance sheets. The first paper in the series, "The External Wealth of Nations," documents the compilation of an exciting new dataset on net foreign-asset positions for a sample of 66 industrial and developing
countries from 1970 through 1998. This paper uses this dataset to answer three straightforward questions. First, what determines a country's NFA position? Second, how do changes in a country's net foreign-asset position affect its trade balance? Third and finally, how does a country's NFA position affect its domestic interest rate? The paper presents an extensive series of graphs and empirical tests aimed at answering these three questions. Most of the results are highly significant, economically important, and in agreement with the predictions of standard open-economy macro models. For example, results for the first question suggest that in industrial countries, changes in NFA positions are positively correlated with changes in output per capita. In developing countries, changes in net foreign-asset positions are negatively correlated with changes in output per capita and negatively correlated with changes in public debt. In both groups of countries, NFA positions are highly correlated with demographics. The results for the second question show that countries' net foreign-asset positions are negatively correlated with their trade balance. Finally, results for the third question indicate that countries' NFA positions are negatively correlated with their real interest rates. The authors should be applauded for this paper. They examine important questions that are far from resolved in the open-economy macro literature. In their empirical tests, they are careful to use panel estimation to control for any time-invariant omitted variables, as well as the appropriate time-series techniques to adjust for cointegration. Despite their extremely parsimonious specifications, the graphs of actual and fitted values suggest that their models have a high degree of explanatory power for most countries in the sample. 
Perhaps most noteworthy, the dataset compiled for this paper was a substantial undertaking (which is understated in the paper) and will undoubtedly form the basis of numerous studies examining topics related to net foreign assets. I do, however, have several concerns with the paper's analysis. To correspond to the trio of questions examined in the paper, the remainder of my comments will focus on three of the most problematic issues: nonlinearity, omitted variables, and endogeneity. The comments will conclude with an overall evaluation of the paper.

2. Nonlinearity and Income Divisions

My first set of concerns with the paper is that many of the relationships being tested with linear regressions are nonlinear. This problem arises in each of the three sets of tests, but to make the point clearly, I will focus on one specific nonlinearity: the relationship between a country's GDP per capita and its NFA position. In the theoretical discussion in Section
3.1, the paper points out several ways in which output per capita can affect net foreign-asset positions. For example, "if the domestic marginal product of capital decreases as an economy grows richer, domestic investment will fall and home investors will seek out overseas accumulation opportunities." On the other hand, in credit-constrained countries, "an increase in production may allow greater recourse to foreign credit, possibly implying a negative relation between net external assets and relative output per capita, at least over some interval." These channels linking a country's output and net foreign-asset position could counteract one another, and the relative strength of each channel could vary with a country's income level. For example, the second channel, based on credit constraints, is more likely to occur in developing countries.

In order to adjust for this nonlinear relationship between output and net foreign assets, the authors divide their sample into two groups of countries: industrial and developing. They define industrial countries as "long-standing members of the OECD, which approximately corresponds to the most-developed set of countries at the start of the sample period." The empirical results for the two groups of countries suggest that this relationship between output and net foreign assets is in fact nonlinear and driven by the two theoretical channels discussed above. The relationship between changes in output per capita and changes in net foreign assets is positive and highly significant in industrial countries, and negative and highly significant in developing countries.

But is there any reason to believe that this rough division between "long-standing members of the OECD" and nonmembers accurately captures the true form of the relationship? Each group of countries is extremely diverse. For example, "industrial" countries include the U.S. and Switzerland as well as Greece and Portugal. "Developing" countries include Paraguay and Zimbabwe as well as Singapore and Israel. It is hard to believe the relationship between income and net foreign assets is the same for these diverse members of each country group.

A simple extension to one of the figures in the paper shows that these within-group differences in the relationship between income and net foreign assets can be important and can significantly affect estimates. Figure 1 graphs the average change in a country's NFA position between 1980-1989 and 1990-1998 vs. the average change in its GDP per capita over the same two periods for developing countries. This is the analysis performed in Figure 4(b) of the paper.1

1. Figure 4(b) drops several observations from the sample because those countries do not have sufficient data to include in the subsequent regression analysis. I include the full sample, with no significant effect on the results.

Figure 1 DEVELOPING COUNTRIES

Then, to calculate
the fitted line for the graph, I estimate the linear specification used in the paper and also add a squared term for GDP per capita. Regression results are reported in columns (1) and (2) of Table 1. The nonlinear specification outperforms the linear regression, and the squared term is highly significant. In Figure 1, the fitted regression line including the nonlinear term is clearly a better fit for the data than a straight line. Next, instead of focusing on just developing countries, I repeat this analysis for the entire sample of countries. Figure 2 graphs the relationship between average changes in NFA positions and average changes in GDP per capita for industrial and developing countries. Columns (3) and (4) in Table 1 report regression estimates for the linear regression and with the additional squared term, respectively. Once again, the nonlinear specification outperforms the linear specification, and Figure 2 suggests that the nonlinear fitted line is a much better description of the data. This series of results suggests that the underlying relationship linking changes in NFA positions and GDP per capita is not linear. A simple extension to the panel estimates—just adding a squared term—appears to significantly improve the specification. In the current version of the paper, the authors perform a similar extension to their cross-section estimates [adding a squared term for GDP per capita in column (6) of
Table 4]. The nonlinear term is highly significant, and including this term substantially affects other coefficient estimates. This combination of results suggests that the rough division between industrial and developing countries used in the paper will not accurately capture the relationship between income levels and NFA positions. Instead of using these two rough groups, the paper should try to better specify the underlying, nonlinear relationship between these variables. At the very least, it should include a squared term in the base specification. As shown in the simple tests in Table 1, even the simple extension of including a squared term for income levels can significantly affect coefficient estimates.

Table 1 EVIDENCE OF NONLINEARITY IN THE RELATIONSHIP BETWEEN INCOME PER CAPITA AND NET FOREIGN ASSETS

                        Developing countries       Full sample
                        (1)          (2)           (3)          (4)
Constant                -0.05        -0.05         -0.06        -0.09
                        (-0.80)      (-0.86)       (-1.46)      (-2.07)
Log GDP per capita      0.62         1.62          0.66         1.41
                        (3.15)       (4.30)        (4.09)       (4.68)
Log GDP per capita²                  -2.04                      -1.55
                                     (-3.01)                    (-2.89)
No. of countries        45           45            67           67
Adjusted R²             0.17         0.30          0.19         0.27

Note: t-statistics are in parentheses. Variables calculated as average changes between 1980-1989 and 1990-1998 (to correspond to Figure 4 in the paper). See Figures 1 and 2 of this comment for corresponding data points and fitted regression line.

Figure 2 ALL COUNTRIES

3. Omitted Variables: Investment, at Least

A second concern that I have with this paper is omitted variables. The specifications estimated to answer each of the three motivating questions are extremely parsimonious. For example, the first series of regressions, predicting determinants of a country's NFA position, include only six control variables: income per capita, public debt, and three demographic variables. The second series of regressions, predicting a country's trade balance, include two sets of explanatory variables: a lagged measure of the trade balance and then a set of controls for investment returns. The third series of regressions, predicting real interest-rate differentials, includes at most three controls: NFA, public debt, and the real exchange rate. In all three cases, there are numerous variables that are not included in the regression but could affect the dependent variable and be highly correlated with one or more explanatory variables. As a result, coefficient estimates could be biased. The paper takes an important step toward adjusting for omitted-variables bias by using panel estimation and controlling for any time-invariant country-specific effects.
Panel estimation does not, however, control for any omitted variables that vary over time, which is particularly problematic in this paper, since the time periods are fairly long (generally 28 or 18 years). To make this point about the necessity to include additional controls and sensitivity tests in the regression analysis, I will focus on one omitted variable: domestic investment. This is only one example of several omitted variables that could significantly affect the regression results. Domestic investment is one variable that should be included in estimates predicting a country's NFA position (the first series of tests in the paper). To see the importance of this variable, it is useful to examine the standard balance-of-payments accounting equation used in introductory macroeconomics textbooks:
$$X - M = (TA - G - TR) + (S - I) \qquad (1)$$

where X is exports, M is imports, TA is government tax revenue, G is government spending on goods and services, TR is government transfer payments, S is private savings, and I is domestic investment. The model used to estimate a country's NFA position in the paper is

$$NFA = \alpha + \beta\,YC + \gamma\,GDEBT + \delta'\,DEM + \varepsilon \qquad (2)$$
where NFA is the ratio of net foreign assets to GDP, YC is output per capita, GDEBT is the stock of public debt, and DEM is a set of demographic variables. When equation (2) is estimated in changes (as in the panel specification), it is directly comparable to equation (1). Changes in NFA in equation (2) are highly correlated with the trade surplus in equation (1) (as explored in detail in the second series of tests in the paper). Changes in GDEBT in equation (2) are equivalent to the government budget surplus in equation (1). Changes in DEM are included to capture how changes in the demographic composition of the population affect the savings rate [as written in equation (1)]. Investment, however [the final term in equation (1)], is not included in equation (2). Instead, the paper includes output per capita. It is well documented that investment is highly volatile over time within a given country. Therefore, it is unlikely that the country fixed effects control for movements in this variable. Moreover, investment is undoubtedly correlated with output per capita. Therefore, do estimates of the relationship between output per capita and NFA in equation (2) capture the relationship between these two variables? Or is the coefficient on output per capita actually capturing the effect of investment? Or is the relationship between investment and GDP biasing the coefficient estimates on GDP?

To analyze these questions more formally, Table 2 reports the univariate correlations between NFA (measured by CUMCA), income per capita, and investment as a share of GDP for industrial and developing countries.2 These univariate correlations suggest that NFA are positively correlated with GDP per capita in both industrial and developing countries. This is in contrast to the multivariate panel regression results, where NFA are positively correlated with GDP per capita in industrial countries, but negatively correlated in developing countries.

2. Correlations are calculated across countries and years. Investment as a share of GDP is taken from World Bank (2000). World Development Indicators on CD-ROM, Washington, DC: World Bank.

Table 2 UNIVARIATE CORRELATIONS

(a) Industrial countries: 1970-1998
                    NFA (CUMCA)    GDP per capita    Investment/GDP
NFA (CUMCA)         1.00
GDP per capita      0.45           1.00
Investment/GDP      0.04           -0.17             1.00

(b) Developing countries: 1970-1998
                    NFA (CUMCA)    GDP per capita    Investment/GDP
NFA (CUMCA)         1.00
GDP per capita      0.37           1.00
Investment/GDP      -0.04          0.07              1.00

The univariate correlation
estimates also show that NFA are positively correlated with investment in industrial countries and negatively correlated in developing countries. Moreover, GDP per capita is negatively correlated with investment in industrial countries and positively correlated in developing countries. Although it is impossible to predict how omitting investment will bias the coefficient on GDP per capita in the multivariate context of equation (2), the correlations in Table 2 allow us to predict the bias in a univariate context. The correlations suggest that omitting investment will generate a negative bias in estimates of the effect of GDP on NFA in both industrial and developing countries. Moreover, if these univariate correlations are strong enough and outweigh any counteracting multivariate correlations, that will also be the effect of the omitted-variable bias in the multivariate context.

Table 3 tests these implications. It reports fixed-effects estimates of equation (2) with and without a control for investment for both industrial and developing countries.3 The results agree with the predictions from the univariate correlation analysis. Excluding investment from the model generates a downward bias on the coefficient estimates for GDP per capita. In industrial countries, the effect of the bias is small. In developing countries, however, the effect of the bias is significant and the coefficient on GDP per capita becomes insignificant, while the coefficient on investment is negative and highly significant.

3. These estimates are similar to those reported in column (1) of Tables 2 and 3 in the paper. The only differences between these estimates and those in the paper (to the best of my knowledge) are: (1) these estimates are fixed effects and do not control for cointegration as done in the paper; (2) this sample size is slightly larger than that in the paper.

Table 3 REGRESSION RESULTS: IMPACT OF OMITTING INVESTMENT FROM PREDICTIONS OF NET FOREIGN ASSETS

                       Industrial countries       Developing countries
                       (1)          (2)           (3)          (4)
Log GDP per capita     0.87         0.93          -0.19        -0.04
                       (14.73)      (15.02)       (-4.57)      (-0.88)
Public debt/GDP        -0.13        -0.17         -0.63        -0.63
                       (-4.10)      (-5.20)       (-19.27)     (-19.84)
Investment/GDP                      -0.47                      -1.16
                                    (-2.97)                    (-8.49)
No. of observations    577          535           907          872
No. of countries       22           22            39           38
Within R²              0.46         0.51          0.47         0.54

Note: t-statistics are in parentheses. Dependent variable is CUMCA. Estimates are fixed effects for the full sample from 1970-1998. Period dummies and demographic variables are included in the regressions but are not reported.

This suggests that
when investment is omitted from the equation, estimates of the effect of GDP per capita on NFA in developing countries may be biased and actually be capturing the relationship between investment and NFA. This section has presented theoretical and empirical evidence that omitting one variable from one regression could significantly bias coefficient estimates. Domestic investment in the regressions predicting NFA, however, is only one of a number of potentially important omitted variables. Others are capital-account liberalization, increased trade flows, changes in expected growth rates or returns, income inequality, inflation, and exchange-rate movements. Each of these variables has changed significantly for many countries in the sample over the long periods under consideration and therefore will not be captured in the country effects in the panel estimation. Granted, there are limited degrees of freedom in many of the regressions estimated in the paper, but given the potentially serious biases from excluding these important variables, the paper should carefully address what other variables are omitted and how they might affect the results. Moreover, the paper should add an extensive series of sensitivity tests to see if including any of these variables in the base specification significantly affects results. The NBER Macroeconomics Annual is the ideal forum to perform this sort of detailed robustness analysis and explore a wide variety of potential interactions between variables.
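The omitted-variable argument of this section can be illustrated with a small simulation. The sketch below is not drawn from the paper's data or from the estimates in Table 3: it assumes a hypothetical data-generating process in which an outcome depends on an included regressor x (standing in for output per capita) and an omitted regressor z (standing in for investment), with z correlated with x. The function name `ovb_demo` and the parameters `beta_x`, `beta_z`, and `rho` are illustrative assumptions. It compares the coefficient on x with and without z, and checks the gap against the textbook bias term, beta_z times cov(x, z) / var(x).

```python
import numpy as np


def ovb_demo(beta_x=1.0, beta_z=-1.0, rho=0.5, n=200_000, seed=0):
    """Simulate y = beta_x*x + beta_z*z + e and compare long vs. short OLS.

    All parameter values are hypothetical; they are chosen so that the
    omitted regressor z lowers y and is positively correlated with x,
    which produces a negative bias on the coefficient of x.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # z has unit variance and correlation rho with x
    z = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y = beta_x * x + beta_z * z + rng.standard_normal(n)

    # Long regression: y on a constant, x, and z
    X_long = np.column_stack([np.ones(n), x, z])
    b_long = np.linalg.lstsq(X_long, y, rcond=None)[0]

    # Short regression: y on a constant and x only (z omitted)
    X_short = np.column_stack([np.ones(n), x])
    b_short = np.linalg.lstsq(X_short, y, rcond=None)[0]

    # Textbook omitted-variable bias on the coefficient of x
    predicted_bias = beta_z * np.cov(x, z)[0, 1] / np.var(x, ddof=1)
    return b_long[1], b_short[1], predicted_bias


if __name__ == "__main__":
    long_coef, short_coef, bias = ovb_demo()
    print(f"coefficient on x with z included: {long_coef:.3f}")
    print(f"coefficient on x with z omitted:  {short_coef:.3f}")
    print(f"predicted bias from the formula:  {bias:.3f}")
```

In this stylized setting the sign of the bias is the product of the sign of the omitted variable's effect and the sign of its correlation with the included regressor, which is the logic used above to read the direction of the bias off the correlations in Table 2.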
4. Endogeneity: What is Actually Driving What?

The third major concern that I have with this paper is endogeneity. The paper carefully explains why each of the independent variables could affect the dependent variables in each of the three sets of regressions. There are equally valid reasons, however, why each of the dependent variables could in turn affect many of the explanatory variables. In several parts of the paper, the language suggests that the authors are aware of this problem. For example, when interpreting coefficient estimates, they write that a movement in one variable "is associated with" or "is correlated with" a movement in another variable, instead of saying that a movement in one variable "causes" a movement in the other. In other cases, however, the terminology is less careful and the language interprets coefficient estimates as showing causality. Moreover, the central purposes motivating the paper are not to understand correlations, but rather to better understand what causes changes in a country's NFA position and how changes in NFA positions affect other variables, such as the trade balance and interest-rate differentials. Therefore, in order to answer the key questions motivating the paper, the authors should address potential endogeneity issues in more detail. This section discusses two specific examples in detail and then provides suggestions for dealing with endogeneity.

One of the clearest examples of endogeneity is in the final series of tests in the paper: how a country's NFA position affects its interest-rate differential (versus the global interest rate or the U.S. interest rate). The paper estimates a straightforward regression of the interest-rate differential on NFA, using both panel and cross-country estimation for two different periods. In alternative specifications, there are also controls for movements in the country's real exchange rate and stock of public debt.
Estimates of the coefficient on net foreign assets are negative and usually highly significant. The paper interprets this as "some suggestive evidence that NFA positions matter in determining real interest-rate differentials. . ." But do movements in NFA positions drive movements in the interest-rate differential, or vice versa? Japan is a clear example. Japan has significantly lowered its interest rate since 1990 (from 5.20 in 1990 to 0.01 in 1998) in an attempt to spur domestic growth.4 During this period, Japan has consistently run a large current-account surplus, and its NFA position has increased substantially. (The CUMCA variable rose from 0.14 in 1990 to 0.39 in 1998.) Did the change in Japan's NFA position drive the fall in interest rates? Or did the fall in interest rates drive the change in Japan's NFA position? The specification in the paper assumes the former, while I would argue that the latter channel is more important.

4. Based on the real-interest-rate data used in the paper.

In addition to this model predicting interest-rate differentials, each of the central regressions in the paper could also have problems with endogeneity. For example, in the set of regressions predicting a country's NFA position, two of the explanatory variables are income per capita and public debt. But when a country borrows more from abroad (generating a negative NFA position), couldn't these additional resources spur output growth, especially in a country that was previously liquidity-constrained? And if the borrowing from abroad is largely lending to the government, couldn't this decline in NFA (i.e., increased borrowing from abroad) allow the government to increase its level of public debt? For example, in the last 5 years of the dataset, Argentina's NFA (as measured by CUMCA) fell from -18.2% in 1993 to -27.8% in 1998. Over the same period, Argentina's public debt increased from 23.8% to 28.4% of GDP. Did the changes in Argentina's public debt cause the changes in its NFA position, or vice versa?

Each of these examples suggests that endogeneity could affect regression estimates. The authors should directly address these issues rather than using terms such as "associated with" or "correlated with" when interpreting results. In the theoretical motivation for each set of regressions, they should carefully discuss any channels that could generate feedback from the dependent to the explanatory variables. In the regression estimates, they should attempt to instrument for the variables which are most likely to suffer from serious endogeneity problems. Granted, finding desirable instruments is always difficult in a panel framework, but at the very least the authors should try using lagged values of each of the relevant variables as instruments.

5. Conclusions and Overall Assessment

When I read and assess an empirical paper, I frequently think of it in terms of a four-tiered pyramid. At the base of the pyramid is the paper's motivation. Without a relevant question or interesting issue, a paper has no foundation and will have minimal impact. The second tier on the pyramid is the dataset. Although no dataset is perfect, it is impossible to address certain issues without critical pieces of information of an acceptable quality. The third and fourth tiers of the pyramid are the model and estimation methodology. The model should capture the key relationships between the relevant variables, and the estimation methodology should yield unbiased and efficient estimates. Few empirical papers satisfactorily achieve all four of these levels.
The paper by Lane and Milesi-Ferretti performs well as assessed in terms of this empirical-paper pyramid. The paper clearly satisfies the first criterion: it is built on a strong base. It asks a number of important questions about the determinants of countries' net foreign-asset positions and how changes in these asset positions affect key macroeconomic variables. The paper also performs extremely well on the second tier of the pyramid. It uses an exciting new dataset, undoubtedly compiled with a tremendous amount of effort by the authors, on NFA positions. The paper is weaker, however, on the third and fourth tiers of the pyramid. The models used as the basis for estimation may miss important relationships between key variables. Although cointegration is an excellent start, the estimation methodology may overlook substantial problems. In particular, my comments have focused on three potential problems with the model and estimation: nonlinearity, omitted variables, and endogeneity. To be fair, however, much of the empirical work in macroeconomics does not satisfactorily address these three issues.

Therefore, although my comments have focused on several potential weaknesses with the paper, the paper's accomplishments and valuable contributions are worth reiterating. This paper uses a first-rate new dataset to investigate several important issues relating to international balance sheets. Many of these issues were previously unresolved in the literature, largely due to unsatisfactory data. In terms of the empirical-paper pyramid, the paper satisfies the two most important criteria to form the basis of an important paper: interesting and unresolved questions combined with excellent data. The paper's results will undoubtedly inspire future work investigating a number of these relationships in more detail. The dataset has promising possibilities for future research.
I look forward to seeing the next installment by these authors in their series of papers exploring international balance sheets.
Comment JEFFREY FRANKEL Harvard University
In 1985, U.S. statistics showed that the net international investment position of the United States had turned negative for the first time since World War I. In 1989, it again turned negative for the first time since World War I. How is that possible? In the meantime, a revision had raised the valuation of U.S. assets overseas, by recognizing, for example, increased prices of capital assets acquired in the distant past. This revision was large enough to restore America's net creditor status, though only temporarily.1 The magnitude of this revision is one indicator of how large the measurement errors in these data are, or at least how bad they have been in the past, which in turn is one reason why they have been so little used. The worldwide discrepancy is another tangible illustration of the problem.

That said, I am persuaded that this line of research by Lane and Milesi-Ferretti is a very useful one. In part this is because of the high marginal product of research in an area where few others have explored. But it is also because the authors have been able to put together data for more countries than were available in the early 1980s. And the variation in the data is sufficiently great that measurement error need not necessarily prevent us from learning anything by examining them.

Overall, the results are much better than I would have predicted. There is little modeling as such. Instead, they offer a variety of theoretical reasons for thinking that income per capita, public debt, and demographics should each have effects on the net foreign-asset position, and these tend to be borne out in the empirical results. In the case of industrialized countries, income per capita has a strong positive effect on the asset position, supporting the idea that countries become creditors as they grow rich. (This certainly fits the experience of the Netherlands and the United Kingdom in their heydays, the United States until the 1980s, and Japan. The United States in recent years is a conspicuous outlier.) Public debt seems to have a negative effect on the investment position, as hypothesized. This effect is even stronger among developing countries than among industrialized countries. The authors explain the discrepancy by the argument that credit constraints are pervasive in developing countries.
But I would have thought that credit constraints for these countries are even worse internationally (capital controls, default risk, recurrent crises, absence of international bankruptcy court), and that they would find it even harder to finance budget deficits out of foreign borrowing than out of domestic borrowing. I consider this result to be a bit paradoxical, but it is the same paradox found in the Feldstein-Horioka literature: the saving retention coefficient is even higher for industrialized countries than for developing countries, which seems inconsistent with the higher capital mobility that we expect for industrialized countries.2

1. The U.S. data system tends to collect better data on capital flowing in than on capital flowing out. No comprehensive survey of U.S. residents' holdings of foreign securities had been conducted since World War II, until one was conducted in 1994 (Kester et al., 1995). (Measured U.S. net indebtedness is $1.474 trillion as of end 1999, and still climbing rapidly.)

The big question that this research should try to answer is analogous to the one addressed by the earlier Feldstein-Horioka literature: Are net international investment positions as large as we would expect from neoclassical theory and perfect capital mobility, and if not, why not?

The claim is made that economic theory has stronger predictions about the long-run relationships among asset stocks than about short-run relationships among flows. By way of elaboration, the point is made that theory predicts that the stock of foreign assets should depend negatively on the stock of government debt, but that the relationship between the current-account deficit and the budget deficit depends entirely on the origin of the shocks.3 I think the point is to look at low-frequency relationships, not at long-term capital, whether stock or flow. Perhaps the title of the paper should be Long-Term Movements of Capital, rather than Long-Term Capital Movements.

The third finding is that demographics matter as well, with the young population reducing the asset position and the peak-earning fifties age cohort adding to it.4

Next come estimates with dynamic adjustment. The authors estimate the half-life at five years, and describe this behavior of the investment position as quite persistent. I would have expected slower adjustment, if anything, and am surprised it is not more persistent. I suspect that if adjustment were solely by current-account surpluses and deficits, the half-life might be longer than five years, and that the estimates are picking up variation in exchange rates and asset prices.

In the dynamic estimates, and the panel estimates, the results work less well for the United States, Japan, Germany, and the United Kingdom. Could this be because these are the countries that borrow primarily in their own currency?
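To put the five-year half-life quoted above in perspective, it can be translated into an annual persistence parameter. The sketch below assumes, purely for illustration, that deviations of the asset position from its long-run level follow a first-order autoregression with annual data; the paper's actual dynamic specification is not reproduced here, and the function names are hypothetical.

```python
import math


def ar1_coefficient_from_half_life(half_life_years):
    """Annual AR(1) coefficient rho such that rho**half_life_years == 0.5."""
    return 0.5 ** (1.0 / half_life_years)


def half_life_from_ar1(rho):
    """Years needed for a deviation to shrink by half when it decays as rho**t."""
    return math.log(0.5) / math.log(rho)


if __name__ == "__main__":
    rho = ar1_coefficient_from_half_life(5.0)
    # Share of an initial deviation still present after one year and after ten years
    print(f"implied annual persistence: {rho:.4f}")
    print(f"share remaining after 10 years: {rho ** 10:.4f}")
    print(f"round-trip half-life: {half_life_from_ar1(rho):.2f} years")
```

Under this assumption a five-year half-life corresponds to roughly 87 percent of a deviation surviving from one year to the next, and a quarter of it surviving after a decade.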
A key question is whether we should be thinking of the kind of portfolio balance model where investors diversify across currencies of denomination, or the kind where they diversify across countries of issuance. Among other questions that turn on this decision, if it is a matter of currency risk rather than country (default) risk, it may be necessary to express foreign holdings relative to total portfolio (wealth) rather than relative to income or exports as the authors do throughout.

2. See, e.g., Dooley, Frankel, and Mathieson (1987).
3. The latter point is certainly true. In the 1980s the U.S. current account grew worse when the budget deficit widened, because the origin was fiscal expansion, whereas in the 1990s the current account grew worse when the budget improved, because both were responding to a New Economy investment boom. But is the situation really so different with stocks rather than flows? Mightn't theory predict that the sign of the correlation between the stock of foreign assets and the stock of government debt would reverse, if the driving force were a New Economy boom that raised the return to capital?
4. The paper mentions that "the over-65 age group exerts a negative effect, consistent with the running down of NFA." In the case of those who have newly retired, I would expect a positive effect on the level of assets. Only for the very old, those who have lived longer than expected, might one look for a negative effect.

I see several remaining puzzles and priorities for future research:

1. The relationship between income and investment position appears to have an inverted-U shape. This follows from the finding that the relationship is positive for one income range and negative for another. If so, the relationship would be analogous to the original Kuznets curve, which said that income inequality gets worse at early stages of industrialization, and then starts to get better when income passes a turning point, and to the so-called environmental Kuznets curve, which says that the same pattern holds for pollution. We observe that high debt brings with it vulnerability to financial crisis. Perhaps all three variables (inequality, pollution, and debt) are unpleasant side effects of growth that people are willing to put up with at early stages when maximizing GDP is the overriding goal, but which they can afford to reduce when they get richer. The authors indeed find some evidence of the U-shaped relationship between income and investment in cross-section data. The puzzle is that they do not find it in time-series data.
2. As the authors say, future research should attempt to distinguish among different components of the net investment position, breaking out FDI, equities, and long-term debt from short-term debt, though it might be necessary at the same time to break out gross assets from gross liabilities. I think we have decided, in the aftermath of the financial crises of the 1990s, that the composition of net capital flows is as important as the total magnitude.
3. I would suggest trying a more sophisticated approach to measuring the rate-of-return variable.

A lot can be said on this last problem.
The authors decompose the expected return differential into two components—a real interest differential and expected real appreciation:
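A sketch of such a decomposition, in assumed notation that is not taken from the paper: let $r_t$ and $r_t^{*}$ be the domestic and foreign real interest rates, and let $q_t$ be the log real value of the domestic currency, so that a rise in $q_t$ is a real appreciation. The expected real return on domestic relative to foreign assets can then be written as:

```latex
\[
E_t\bigl(\text{return differential}_{t+1}\bigr)
  \;=\; \underbrace{\bigl(r_t - r_t^{*}\bigr)}_{\text{real interest differential}}
  \;+\; \underbrace{E_t\bigl(q_{t+1} - q_t\bigr)}_{\text{expected real appreciation}}
\]
```

The first term is observable up to the measurement of expected inflation; the second has to be modeled or proxied, which is where the choice of expectations mechanism matters.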
Since the latter component is generally insignificant in their results, they are in effect saying that expected return differentials are determined by
differences in real interest rates. I am not sure if this will give the right answer in general. Interest rates (real as well as nominal) in Japan, for example, have been below those in the U.S. for most of the postwar period, yet this difference has been approximately offset (perhaps more than offset, depending on the measure) by the upward trend in the value of the yen (which in real terms averaged 3% a year). Because yen appreciation was such a strong trend, Japanese bonds paid more than American bonds despite their low interest rate. In other words, real appreciation of the yen may have been large enough to change the sign of the difference in expected returns.5

At the one-year horizon, there is good reason for thinking that speculators expect the real exchange rate to regress gradually toward purchasing power parity (PPP), at least among the dollar and major European currencies. (Forget the yen.) Actually, there are two reasons for thinking so. First, survey data suggest that expectations of market participants are formed in this way. Second, studies of PPP suggest that the actual real-exchange-rate process has an autoregressive component, and rational expectations implies that investors' expectations would in turn be formed in this way.

Let me make a pitch for inverting the equation: running it with rates of return on the left-hand side and asset position on the right. Write the demand for domestic assets as a linear function of the expected return differential:
Then invert:
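A sketch of the two equations in assumed notation (the symbols are illustrative, chosen to agree with the $\beta$, $\beta^{-1}$, and error term used in the list that follows): let $x_t$ be the share of the portfolio held in domestic assets, $r_{t+1}$ the return differential realized ex post, and $\varepsilon_{t+1}$ the investors' prediction error.

```latex
% Asset demand as a linear function of the expected return differential
\begin{equation}
x_t = \alpha + \beta\, E_t\bigl(r_{t+1}\bigr) \tag{1}
\end{equation}

% Inverted form, with the ex post return replacing the expectation:
% r_{t+1} = E_t(r_{t+1}) + \varepsilon_{t+1}
\begin{equation}
r_{t+1} = -\beta^{-1}\alpha + \beta^{-1} x_t + \varepsilon_{t+1} \tag{2}
\end{equation}
```

In this form perfect capital mobility ($\beta \to \infty$) corresponds to the testable restriction $\beta^{-1} = 0$, and under rational expectations $\varepsilon_{t+1}$ is uncorrelated with $x_t$, which is known at time $t$.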
5. Frankel (1991, Section 8.2). When Japan removed its capital controls after 1979, the net flow was out rather than in. So perhaps the real interest differentials are giving us the right answer. (This would be easier to understand if we were talking about flows. The low real interest rate in Japan signals an excess of national saving relative to real investment, and the high real interest rate in the United States signals the reverse; the discrepancy in each country is the net capital flow.)

1. The logic is that measurement errors in the rates of return ($\varepsilon$) are large.
2. If the rates of return are measured as ex post returns, then there is a theoretical argument for believing that these large measurement errors, which are investors' ex post prediction errors, are uncorrelated with the ex ante asset quantity variable. That theoretical reason is, of course, rational expectations. Let us accept the standard rational-expectations methodology for present purposes.
3. This specification readily lends itself to intuitive interpretation as the answer to the question: "If I increase my international indebtedness by one percentage point, by how much do I drive up the interest rate I have to pay?"
4. If one wants to test the null hypothesis of perfect capital mobility, it is much easier to test $\beta^{-1} = 0$ than it is to test $\beta = \infty$.
5. You can have fun imposing the constraint that $\beta$ is determined by optimal portfolio diversification, which can give you the constraint that the coefficient matrix is proportional to the variance of the error term $\varepsilon$ in the same equation. Going to the multidimensional case is optional, where $\beta^{-1}$ is a matrix, proportional to the variance-covariance matrix of $\varepsilon$:
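In assumed notation, with $\rho$ the coefficient of relative risk aversion and $\Omega$ the variance-covariance matrix of the prediction errors (both symbols are illustrative rather than taken from the text), the mean-variance constraint is commonly written as:

```latex
\[
\beta^{-1} = \rho\,\Omega ,
\qquad
\Omega \equiv E_t\bigl(\varepsilon_{t+1}\,\varepsilon_{t+1}'\bigr)
\]
```

The restriction ties the slope of the inverted equation to the second moments of its own residuals, so one risk-aversion parameter replaces a free coefficient matrix.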
Lane and Milesi-Ferretti do invert the equation in Section 5, to the extent of putting the ex post real interest differential on the left-hand side. I might understand proceeding in this manner if the logic were that we are talking about assets other than bonds here (e.g., FDI), so that some broad measure of real return to equity is what matters. That gets into the other point about decomposing the aggregate investment position into FDI vs. bonds etc. But let us stay with the idea of one aggregate asset. If that one asset were short-term default-free bills, then the only source of uncertainty would be in the exchange rate, for those countries able to borrow in their own currency:
This case is particularly simple, and allows one to model and measure the first and second moments with some precision. But it requires also getting data on the stocks of domestic and foreign assets that are outstanding and that thus have to be held by someone, not just net domestic debt to foreigners. Indeed, the net international indebtedness variable, which is the focus of this paper, does not enter into the asset supplies at all. Rather, to get net indebtedness to matter, it has to come in as a determinant of demand rather than supply, assuming a home bias in asset demands. Such a home bias is easy to derive from the optimal diversification framework, by the way, because residents of each country
consume more of their own goods, and so each views the other's currency as somewhat risky.6 I am not recommending that Lane and Milesi-Ferretti go down this route. Their unique contribution is working with the data on the net foreign-investment position. Their title and introduction state explicitly that their motivation is to shift the emphasis from short-term flows to long.7 Long-term loans and bonds, equities, and FDI are as important as short-term bonds. As their graphs show, equities and FDI grew rapidly among emerging markets in the 1990s. In these markets, default risk has been as important as exchange risk. So the authors need not focus on short-term interest rates and exchange rates in measuring expected returns. And they need not get sidetracked cumulating government bond supplies in each country. Even at the stage where the authors continue to aggregate all asset categories together, I would like to propose trying an alternative approach for measuring the aggregate rate of return: the net investment income line of the balance of payments, expressed relative to the net international investment position. There are certainly problems with this strategy. Even if the data are measured accurately, a serious problem arises if investment income and the investment position are of opposite signs, as they were for the United States from 1989 until mid-1998. There is no cure for this problem except to do the disaggregation. In addition, there are serious errors in the measurement of investment income. They are probably a leading source of both the world current-account discrepancy ("horizontal") and the statistical discrepancy in the U.S. balance of payments ("vertical"). Nevertheless, these errors are quite on a par with those in the measurement of the net international-investment position itself, and it seems appropriate to study these two important but neglected series together. 
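The proposed measure, and the sign problem just described, can be sketched in a few lines. The function name is hypothetical, and the figures in the usage lines are made-up round numbers rather than balance-of-payments data.

```python
from typing import Optional


def implied_return(net_investment_income: float,
                   net_position: float) -> Optional[float]:
    """Net investment income divided by the net international investment position.

    Returns None when the ratio has no sensible interpretation as a rate
    of return: a zero position, or income and position of opposite signs
    (for example, a net debtor that still earns positive net income).
    """
    if net_position == 0:
        return None
    if net_investment_income * net_position < 0:
        return None
    return net_investment_income / net_position


# Hypothetical round numbers, for illustration only.
creditor = implied_return(5.0, 100.0)        # net creditor earning net income
debtor = implied_return(-3.0, -60.0)         # net debtor paying net income
opposite_signs = implied_return(10.0, -200.0)  # debtor with positive net income
```

Returning None in the opposite-sign case, rather than a negative "rate of return", makes explicit that the aggregate ratio breaks down there and that only disaggregated assets and liabilities can give a meaningful answer.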
The advantage is that you then can avoid deciding what kind of asset you are thinking about, and also can throw the questions of how to measure the real interest rates and expected changes in the real exchange rate out the window and estimate an equation like (2) above. You can even impose the constraint that $\beta^{-1}$ is proportional to the variance of $\varepsilon$. I look forward to future installments of this work, whether along the lines of my suggestion or not.

6. In the framework of mean-variance optimization with nonstochastic goods prices, the home bias in asset preferences is equal to the home bias in consumption preferences, times a factor equal to 1 minus the reciprocal of the coefficient of relative risk aversion. (See, e.g., Frankel, 1994, p. 11.)
7. They describe the Feldstein-Horioka literature as focusing on short-term capital flows. But in fact Feldstein and Horioka gave as motivation for their paper the observation that the existing interest-rate parity literature focused on short-term capital mobility, and their goal was to address long-term capital mobility. In my view the distinction between short-term and long is misplaced here. Lane and Milesi-Ferretti want to talk about net stocks of assets, whereas the earlier literature they have in mind talks about flows. Perhaps a (second) change of title is in order: it should be something like Long-Term Patterns in International Investment. And similarly, the contribution of Feldstein and Horioka was not to shift attention from short term to long, but rather from prices to quantities.

REFERENCES

Dooley, M., J. Frankel, and D. Mathieson. (1987). International capital mobility in developing countries vs. industrialized countries: What do saving-investment correlations tell us? IMF Staff Papers 34(3, September):503-530.
Frankel, J. (1991). Japanese finance in the 1980s: A survey. In Trade with Japan: Has the Door Opened Wider? P. Krugman (ed.). Chicago: University of Chicago Press, pp. 225-268. Reprinted in Japanese Economy, Vol. 2, P. Drysdale and L. Gower (eds.). Routledge Press, 1998.
Frankel, J. (1994). The internationalization of equity markets: Introduction. In The Internationalization of Equity Markets, J. Frankel (ed.). Chicago: University of Chicago Press.
Kester, A., and Panel on International Capital Transactions. (1995). Following the Money: US Finance in the World Economy. National Research Council. Washington: National Academy Press.
Discussion

Philip Lane responded to Kristin Forbes by saying that the possibility of a nonlinear relationship between net foreign assets and income was not addressed in the time series because of the difficulty of incorporating nonlinearities in a cointegration framework. In particular, in this case the relationship might have more than one turning point. Lane explained that they did not put savings and investment directly into their regressions because they wanted a parsimonious model, and demography and income could affect net foreign assets through many channels, including savings and investment. He agreed with Jeff Frankel that the issue of net investment income relative to net foreign-asset position is as yet unexplored. He said that, in practice, the composition of net foreign assets is important for this relationship.

Mark Gertler said that, leaving aside credit market imperfections, the basic neoclassical model suggests that a country's future investment opportunities should determine its net foreign-asset position. Ideally, the researcher would like to have a measure of Tobin's q by country. He noted that the investment-capital ratio could be a proxy for Tobin's q under certain conditions, which could explain why the investment-capital ratio worked so well in the regressions presented by Kristin Forbes. Rick Mishkin agreed, suggesting that the return to capital relative to the pool of domestic savings is the first-order factor to investigate as a determinant of net foreign-asset positions. David Romer also agreed with Gertler and Mishkin that fundamentals were of first-order importance and should be taken into account more explicitly.

Charles Engel was skeptical of the possibility of estimating long-run equilibrium relationships based on the 30 years of data collected by the authors. First, he thought convergence would be slow, and second, there could be structural breaks in the estimated relationship. He would have preferred to see the authors examine short-run relationships using their data. He was also worried by the fact that the estimated model appeared to work well for small countries, but not for the United States, Japan, and Germany. He was not happy with the use of net foreign assets relative to GDP, instead of wealth. In a stock-market boom, this measure makes the United States look risky, even though most of U.S. stock is held by Americans. He would have preferred to see a better measure of countries' ability to pay off their debts.

Jaume Ventura raised the issue of the direction of causality in the relationship between interest-rate differentials and net foreign-asset positions. The authors assumed that rate differentials were high because of risk premia. But from a portfolio-balance perspective, causality could run in the opposite direction. Lane responded that the interest rates in question were interest rates on bonds, so, given arbitrage, the differential should be determined by expected exchange-rate changes and risk premia alone.

Greg Mankiw said the data set collected by the authors would be very useful. He would have liked to see correlations between net foreign assets and the right-hand-side variables used in growth theory.
This would give some idea of the theories one should look at in trying to explain net foreign-asset positions. In response, Ventura said he had run regressions where net foreign assets relative to wealth (rather than GDP) were the dependent variable, with standard variables from growth regressions on the right-hand side. In these regressions, he noted, wealth explained most of the variation in net foreign assets. The other variables, such as human capital and political institutions, came in with the right sign, but explained little of the variation in net foreign-asset positions.

Michael Klein was curious about what happened to net foreign assets in the run-up to crises. Are changes in net foreign assets generally transitory or persistent around crises?

Lane agreed that in theory fundamentals matter, and countries with high marginal products of capital should see capital inflows. But he said that in practice, things were not so simple, as political-economy considerations were also very important.

Gian Maria Milesi-Ferretti also defended the use of income per capita rather than investment opportunities as a determinant of net foreign-asset positions. In standard open-economy models the two are correlated, so this is appropriate. On breaks in the data, he felt that researchers should not give up estimating long-run relationships on this account; instead, they should investigate whether breakpoints are systematically related to certain variables.
Robert B. Barsky and Lutz Kilian
UNIVERSITY OF MICHIGAN AND NBER; AND UNIVERSITY OF MICHIGAN, EUROPEAN CENTRAL BANK, AND CEPR
Do We Really Know that Oil Caused the Great Stagflation? A Monetary Alternative

1. Introduction

There continues to be considerable interest, both among policymakers and in the popular press, in the origins of stagflation and the possibility of its recurrence. The traditional explanation of the stagflation of the 1970s found in intermediate textbooks is an adverse shift in the aggregate supply curve that lowers output and raises prices on impact.1 Indeed, it is hard to see in such a static framework how a shift in aggregate demand could have induced anything but a move of output and prices in the same direction. This fact has lent credence to the popular view that exogenous oil supply shocks in 1973-1974 and 1978-1979 were primarily responsible for the unique experience of the 1970s and early 1980s. For example, The Economist (November 27, 1999) writes:

Could the bad old days of inflation be about to return? Since OPEC agreed to supply-cuts in March, the price of crude oil has jumped to almost $26 a barrel, up from less than $10 last December and its highest since the Gulf war in 1991. This near-tripling of oil prices evokes scary memories of the 1973 oil shock, when prices quadrupled, and 1979-80, when they also almost tripled. Both previous shocks resulted in double-digit inflation and global recession. . . . Even if the impact will be more modest this time than in the past, dear oil will still leave some mark. Inflation will be higher and output will be lower than they would be otherwise.

We have benefited from comments by numerous colleagues at Michigan and elsewhere. We especially thank Susanto Basu, Ben Bernanke, Olivier Blanchard, Alan Blinder, Mark Gertler, Jim Hamilton, Miles Kimball, Ken Rogoff, Andre Plourde, Matthew Shapiro, and Mark Watson. We acknowledge an intellectual debt to Larry Summers, who stimulated our interest in the endogeneity of oil prices. Allison Saulog provided able research assistance. Barsky acknowledges the generous financial support of the Sloan Foundation. The opinions expressed in this paper are those of the authors and do not necessarily reflect views of the European Central Bank.

1. For example, Abel and Bernanke (1998, p. 433) write that, after a sharp increase in the price of oil, "in the short run the economy experiences stagflation, with both a drop in output and a burst in inflation."
Academic economists, even those who may not fully agree with the prevailing view, have done little to qualify these accounts of stagflation. On the one hand, the recent scholarly literature has focused on the relationship between energy prices and economic activity without explicitly addressing stagflation (see, e.g., Hamilton, 1983, 1985, 1988, 1999; Rotemberg and Woodford, 1996). On the other hand, some authors (e.g., Bohi, 1989; Bernanke, Gertler, and Watson, 1997) have stressed not the direct effects of oil price increases on output and inflation, but possible indirect effects arising from the Federal Reserve's response to the inflation presumably caused by oil price increases. A common thread in the popular press, in textbook treatments, and in the academic literature is that oil price shocks are an essential part of the explanation of stagflation.

In contrast, in this paper we make the case that the oil price increases were not nearly as essential a part of the causal mechanism generating the stagflation of the 1970s as is often thought. We discuss reasons for being skeptical of the importance of commodity supply shocks in general, and the 1973-1974 and 1979-1980 oil price shocks in particular, as the primary explanation of the stagflation of the 1970s. First, we show that there were dramatic and across-the-board increases in the prices of industrial commodities in the early 1970s that preceded the OPEC oil price increases. These price increases do not appear to be related to commodity-specific supply shocks, but are consistent with an economic boom fueled by monetary expansion. Second, there is reason to doubt that the observed high and persistent inflation in the deflator in the early and late 1970s can be explained by the 1973-1974 and 1979-1980 oil price shocks. The argument that oil price shocks caused the Great Stagflation depends on the claim that oil price shocks are inflationary.
Using a simple model, we show that a one-time oil price increase will increase gross output price measures such as the CPI, but not necessarily the price of value added, as proxied by the GDP deflator. Indeed, an oil price increase may lower the deflator. Further, the data show that only two of the five major oil price shocks since 1970 have been followed by significant changes in the inflation rate of the GDP deflator, though in all cases the CPI inflation rate changed sharply relative to the deflator. Although we come to the same conclusion as Blinder (1979) that oil caused a spike in consumer-price inflation during the two most stagflationary episodes, we show that oil prices do not provide a plausible explanation of the sustained inflation that occurred in the GDP deflator as well as in the CPI.

If oil price shocks were not the source of the Great Stagflation, what explains the striking coincidence of the major oil price increases in the 1970s and the worsening of stagflation? In this paper we provide evidence that in the 1970s the rise in oil prices (like that in other commodity prices) was in significant measure a response to macroeconomic forces, ultimately driven by monetary conditions. This view coheres well with existing microeconomic theories about the effect of real-interest-rate variation and output movements on resource prices, and challenges the conventional wisdom that major oil price changes are largely exogenous with respect to macroeconomic variables of OECD countries. It is commonly held that major oil price movements are ultimately due to political events in the Middle East. Our analysis suggests that, although political factors were not entirely absent from the decision-making process of OPEC, the two major OPEC oil price increases in the 1970s would have been far less likely in the absence of conducive macroeconomic conditions resulting in excess demand in the oil market.

The prevailing view that exogenous oil price shocks were the primary culprits of the Great Stagflation of the 1970s goes hand in hand with the perception that monetary factors do not provide an adequate explanation of stagflation. In this paper we develop more fully a latent dissent to the conventional view that monetary considerations cannot account for the historical experience of the 1970s.2 Bruno and Sachs (1985) and, to a lesser extent, Blinder (1979, p. 77) discuss monetary expansion as one important source of stagflation, but their emphasis is on the inadequacy of money as an explanation of the bulk of stagflation and commodity price movements.3 In contrast, we show how in a stylized dynamic model of the macroeconomy stagflation may arise endogenously in response to a sustained monetary expansion even in the absence of supply shocks. The data generated by the model are broadly consistent, both qualitatively and quantitatively, with the dynamic properties of the actual output and inflation data for 1971-1975. Our model captures the notion that economic agents in the 1970s responded only gradually to shifts in the monetary policy regime. We link these shifts to the breakdown of the Bretton Woods system and to changes in policy objectives. Several indicators of monetary policy stance show that monetary policy in the United States, in particular, exhibited a go-and-stop pattern in the 1970s. Moreover, episodes of stagflation were associated with swings in worldwide liquidity that dwarf monetary fluctuations elsewhere in our sample.

2. References that we identify with the traditional view include Samuelson (1974), Blinder (1979), and Bruno and Sachs (1985). Precursors of our alternative explanation of stagflation and its association with oil prices include Friedman (1975), Cagan (1979), McKinnon (1982), Houthakker (1987), and De Long (1997).
3. For example, Bruno and Sachs (1985, p. 6) stress the inadequacy of purely demand-side models of stagflation and propose that contractionary movements in aggregate supply (such as oil price shocks) are needed to explain the slide into stagflation. Blinder (1979, pp. 102, 209) states that the inflation of 1973-1974 was simply not a "monetary phenomenon." As the causes of the inflationary surge in the mid-1970s, and also of the recession that followed, he identifies "special factors" such as food price shocks in 1972-1974, the oil price shock in 1973, and the dismantling of price controls in 1974.

The remainder of the paper is organized as follows. We begin with an outline of the basic facts of the stagflation of the 1970s in Section 2. Section 3 presents a monetary explanation of stagflation. In Section 4 we examine the empirical support for this monetary explanation, and in Section 5 the reasons for the shifts in monetary policy stance that in this view ultimately triggered the Great Stagflation. In Section 6 we discuss theoretical and empirical arguments against the oil-supply-shock explanation of stagflation. Finally, in Sections 7 and 8 we discuss the theoretical reasons for a close relationship between oil prices and macroeconomic variables and provide evidence that oil prices were in substantial part responding to macroeconomic forces, rather than merely political events in the Middle East. Additional evidence from the most recent oil price increase is discussed in Section 9. Section 10 contains the concluding remarks.
2. Basic Facts This section describes some of the salient features of stagflation and of the evolution of oil prices in the postwar period. The 1970s and early 1980s were an unusual period by historical standards. Table 1 describes the pattern of inflation and of GDP growth for each of the NBER business-cycle contractions and expansions. For each phase, we present data on nominal GDP growth and its breakdown into real and price components. Two critical observations arise immediately from Table 1. First, with one exception, the phase average of the rate of inflation rose steadily from 1960.2 to 1981.2, and declined over time thereafter. The exception is that inflation was 2.5 percentage points lower (9.56% compared with 6.98%) during the 1975.1-1980.1 expansion than in the preceding contraction period from 1973 A to 1975.1. The second, and most important, observation is the appearance of stagflation in the data. Stagflation appears in Table 1 as an increase in inflation as the economy moves from an expansion to a contraction phase. There were three episodes in which inflation, as measured by the growth in the GDP deflator, was near 9% per annum. In two of these
Do We Really Know that Oil Caused the Great Stagflation? • 141 Table 1 REAL GROWTH, INFLATION, AND NOMINAL GROWTH IN THE UNITED STATES Percent change per annum
NBER businesscycle dates
State of the economy
Real growth
Inflation
Nominal growth
1960.2-1961.1 1961.1-1969.4 1969.4-1970.4 1970.4-1973.4 1973.4-1975.1 1975.1-1980.1 1980.1-1980.2 1980.2-1981.2 1981.2-1982.4 1982.4-1990.2 1990.2-1991.1 1991.1-2001.1
Contraction Expansion Contraction Expansion Contraction Expansion Contraction Expansion Contraction Expansion Contraction Expansion
-1.03 +4.64 -0.49 +4.34 -1.76 +3.80 -3.46 +0.62 -1.34 +4.07 -1.27 +3.46
+ 1.22 +2.59 +4.93 +5.22 +9.56 +6.98 +8.88 +9.11 +6.07 +3.29 +4.12 +2.10
+0.19 +7.23 +4.44 +9.56 +7.80 + 10.78 +5.42 +9.73 +4.73 +7.36 +2.85 +5.56
Source: Based on quarterly chain-weighted GDP and GDP deflator data from DRI for 1960.1-2001.1. The business-cycle dates are based on the NBER dating. The last expansion is incomplete.
three episodes real output contracted sharply (i.e., in 1973.4-1975.1 and 1980.1-1980.2), and in the third it grew very slowly (i.e., in 1980.21981.2). Indeed, in all but one contraction (i.e., with the exception of the second Volcker recession in 1981.2-1982.4), average inflation during the contraction was higher than during the previous expansion. Figure 1 shows the percentage change in the nominal price of oil since March 1971, when the U.S. became dependent on oil imports from the Middle East (see Section 8). Episodes of so-called oil shocks are indicated by vertical bars and include the 1973-1974 OPEC oil price increase after the October war of 1973, the 1979-1980 price increases following the Iranian revolution in late 1978 and the outbreak of the Iran-Iraq war in late 1980, the collapse of OPEC and of the oil price in early 1986, the oil price spike following the invasion of Kuwait in 1990, and the most recent period of OPEC price management since March 1999. The coincidence of two large increases in the price of imported oil in the 1970s and two periods of strong stagflation has spurred interest in a causal link from "oil shocks" to stagflation, although casual inspection of Figure 1 and Table 1 suggests that this link is far less apparent for other episodes. The decade of the 1970s also coincided with fundamental changes in monetary policy and in attitudes toward inflation, as the Bretton Woods system collapsed. Monetary policy became much more expansionary on average and more unstable in the 1970s than in the 1960s. One reason that these developments are often considered less important in discus-
142 • BARSKY & KILIAN Figure 1 PERCENTAGE CHANGE IN NOMINAL PRICE OF OIL
Source: The underlying oil price series is refiner's acquisition cost of imported crude oil (DRI code: EEPRPI) for January 1974 to July 2000. We use the U.S. producer price index for oil (DRI code: PW561) and the composite index for refiner's acquisition cost of imported and domestic crude oil (DRI code: EEPRPC) to extend the data back to March 1971.
sions of stagflation is the perception that monetary factors are unlikely to generate stagflation of sufficient magnitude (see Blinder, 1979). As we will show in Section 3, this perception is incorrect. Another reason for the popularity of the oil-price-shock explanation of stagflation is the fact that both the phenomenon of stagflation and that of major upheavals in the oil market occurred for the first time in the 1970s, although Table 1 suggests that stagflation predates the first oil price shock of late 1973. Although oil price shocks continue to occur, there have been no major episodes of stagflation since the 1970s. In this paper we question the extent to which we really know that oil price shocks played a central role in generating stagflation. We will show that a monetary approach can explain not only the evolution of the Great Stagflation, but also that of the price of oil during that period. We will present a coherent explanation for the almost simultaneous occurrence
of high oil prices and stagflation in the 1970s, and for the absence of such a relationship in subsequent periods.
3. Purely Monetary Explanation of the Great Stagflation of the 1970s

In this section, we describe a stylized monetary model that illustrates how substantial stagflation may arise even in the absence of supply shocks when inflation is inherently "sluggish" or persistent, and particularly when the monetary authority also follows a rule that prescribes a sharp contractionary response to increases in inflation. In this model as well as in the data, inflation continues to rise after output has reached its maximum and peaks only with a long delay.

Impulse-response estimates from structural vector autoregressions (VARs) indicate that a monetary expansion is followed by a prolonged rise not just in the price level, but in inflation, a phenomenon that Nelson (1998) calls "sluggish inflation." Likewise, it is widely accepted that output exhibits a hump-shaped response to a monetary expansion. An important empirical regularity in VAR studies is that the response of output peaks after about 4-8 quarters, followed by a peak in inflation after about 9-13 quarters (e.g., Bernanke and Gertler, 1995; Christiano, Eichenbaum, and Evans, 1996; Leeper, 1997). Thus, the peak response of output occurs about one year before the inflation response reaches its maximum.4

What is the source of sluggish inflation? Sluggish inflation is not a property of the most commonly used monetary business-cycle models (Taylor, 1979; Rotemberg, 1982, 1996; Calvo, 1983). In these models, both inflation and output jump immediately to their maximal levels, followed by a monotonic decline. Although recent research has demonstrated the inconsistency of the Taylor-Calvo-Rotemberg model with the stylized facts about inflation and output dynamics (see Nelson, 1998), it has not provided a generally appealing alternative. In this paper, we take the position that sluggish inflation reflects the fact that agents learn only gradually about shifts in monetary policy (see Sargent, 1998).
Agents are always processing new information, but especially so in a period following regime changes as dramatic as the changes that occurred in the 1970s. Given the low and stable inflation rates of the 1960s, it is plausible that agents were slow to revise their inflationary expectations when confronted with an unprecedented monetary expansion under Arthur Burns in the early 1970s. This interpretation appears even more plausible considering the financial turmoil and uncertainty associated with the gradual disappearance of the Bretton Woods regime. Similarly, expectations of inflation were slow to adjust in the early 1980s, when Paul Volcker launched a new monetary policy regime resulting in much lower inflation.

4. Nelson (1998) presents estimates that the response of inflation to a monetary innovation peaks after 13 quarters, but his VAR only includes money and the price level.

We propose a stylized model that formalizes the notion that in times of major shifts in monetary policy inflation is likely to be particularly sluggish. Consider a population consisting of two types of firms. A fraction $\omega_t$ of "sleepy" firms is not convinced that a shift in monetary policy has taken place and sets its output price ($p_t^s$) at last period's level adjusted for last period's inflation rate. The remaining fraction $1 - \omega_t$ of "awake" firms is aware of the regime change and sets its output price at $p_t^w = p_t + \beta (y_t - y_t^f)$, where $\beta$ is a constant, $y_t$ the log of real GDP, and $y_t^f$ the log of potential real GDP. As time goes by, the fraction of agents that is unaware of the regime change evolves according to $\omega_t = e^{-\lambda t}$. These considerations imply an aggregate price-setting equation of the form
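A natural reading of the two pricing rules is that the aggregate log price level is their weighted average, with weights $\omega_t$ and $1 - \omega_t$. The display below is a reconstruction on that assumption rather than a quotation. Its second line is simple algebra (valid for $\omega_t > 0$) showing how the weights translate the output gap into a change in inflation; it is not a numbered equation of the model.

```latex
\begin{align}
p_t &= \omega_t \,\bigl(p_{t-1} + \Delta p_{t-1}\bigr)
      + (1 - \omega_t)\,\bigl[\,p_t + \beta\,(y_t - y_t^f)\,\bigr] \tag{1a} \\
\Delta p_t &= \Delta p_{t-1}
      + \beta\,\frac{1 - \omega_t}{\omega_t}\,(y_t - y_t^f)
\end{align}
```

Under this reading, inflation simply repeats last period's rate while all firms are "sleepy" ($\omega_t = 1$), and the output gap matters more and more as the share of "awake" firms grows.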
This price equation is the source of the inflation persistence in our model. Equation (1a) is very much in the spirit of Irving Fisher's (1906) reference to an earlier monetary expansion that "caught merchants napping." Its motivation is closer to that of the Lucas supply schedule (see Lucas, 1972, 1973) than to that of sticky-price models. Agents are always free to adjust prices without paying "menu costs." Moreover, by the choice of appropriate time-varying weights $\omega_t$, our inflation equation may allow for the fact that agents learn more quickly about some shifts in policy than about others. For expository purposes, however, we postulate that these weights evolve exogenously. The second building block of the model is the equation
where $\Delta p_t$ is the inflation rate (which we will associate with the rate of change of the GDP deflator) and $\Delta m_t$ is the rate of nominal money growth. This relationship is a very simple money demand equation. We complete the model by adding a policy reaction function. We posit that the Fed cannot observe the current level of the GDP deflator. We postulate a reaction function under which the Fed targets the rate of inflation. Let $\pi^{new}$ be the steady-state rate of inflation consistent with the initial increase in money growth. That rate may be interpreted as the level of inflation that the Fed is willing to tolerate under the new expansionary regime. The Fed responds to periods of inflation in excess of $\pi^{new}$ by decelerating monetary growth by some small fraction $\gamma$ of last period's excess inflation rate:
where $I(\cdot)$ denotes the indicator function, and $\varepsilon_t$ represents the increase in the money growth rate associated with the Fed's more expansionary policy after the collapse of the Bretton Woods system. Note that, holding constant other demand shifters and given the sluggishness of inflation, this money growth rule may be translated into a more conventional interest-rate rule by inverting an IS curve and observing that high real balances imply low real interest rates. In addition, by way of comparison, we will explore a much simpler model in which money supply growth follows a sequence of exogenous policy shocks $\varepsilon_t$ and in which there is no policy feedback:
The model is parametrized as follows. We postulate that in steady state output grows at 3% per annum. Moreover, prices grow at a steady rate of 3% per annum prior to the monetary expansion. We follow Kimball (1995) in setting $\beta = 0.06$. The single most important parameter in the model is $\lambda$, which determines the fraction of agents "awake." We choose $\lambda$ to give our model the best possible chance to match the timing of business-cycle peaks and troughs in the U.S. inflation and output-gap data for 1971-1975. The resulting value of $\lambda = 0.08$ implies that after two years slightly less than half of economic agents will have adopted the new pricing rule. This rate of transition may appear slow but, as we will discuss below, is consistent with evidence from other sources as well.5 Finally, for the model with policy feedback, we choose $\gamma = 0.05$ for illustrative purposes. This value means that, if, for example, the inflation rate last quarter is 2%, the Fed will decelerate monetary growth by 0.1 percentage point (one-tenth of the initial monetary expansion). Our choice of $\gamma$ ensures that the inflation rate returns to the initial steady-state rate in the long run. This choice is consistent with the interpretation that the Fed, rather than discovering a solution to the dynamic inconsistency problem, found its way back to low inflation using a mechanical rule (see Sargent, 1998).

5. In our model it takes about two years for half of the agents to adopt the new pricing rule. This rate of adaptation may appear very slow, but it is not unlike those found in many other economic contexts. For example, data from the literature on the entry of lower-priced generics into the market for branded drugs show that after two years only about half of the consumers have switched to the lower-priced generic drug (see Griliches and Cockburn, 1994; Berndt, Cockburn, and Griliches, 1996). If it takes so long for agents to adapt in such a simple problem, it does not appear implausible that it would take at least as long in our context.

Given this choice of parameters, consider a one-time shock to $\varepsilon_t$ in period 5, representing a 4-percentage-point-per-annum increase in money growth, beginning in steady state. Figure 2a shows that a monetary expansion produces the essential features discussed above. The model economy displays stagflation, sluggish inflation, and a hump-shaped response of output to a monetary expansion. Most importantly, the output gap rises with the inflation rate initially, with output peaking about two years after the shock, whereas inflation peaks only after three years (close to the 13 quarters reported by Nelson, 1998). Between these two peaks output and inflation move in opposite directions, resulting in stagflation.

We now address the extent to which this stylized monetary model can explain the business-cycle peaks and troughs over the 1971-1975 period. Consider the following thought experiment: Since we know that a strong monetary expansion took place starting in the early 1970s, we, somewhat arbitrarily, propose to date the monetary expansion in the model so that period 5 in Figure 2a corresponds to 1971.1. This thought experiment allows us tentatively to compare the behavior of output and inflation in the model in response to a monetary expansion with the actual U.S. data. Given this interpretation, the monetary model predicts a peak in GDP in 1972.4-1973.1, followed by a peak in deflator inflation in 1974.2 (shortly after the OPEC oil price increase) and a trough in GDP in 1975.2. Note that, although the NBER dates the end of the expansion in late 1973, Hodrick and Prescott (HP) detrended GDP peaks in 1973.1, at the same time as the gap peaks in our model.
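The mechanics of the model and its parametrization can be sketched in a few lines of Python. The sketch is illustrative only and rests on assumed functional forms that are not quoted from the text: the aggregate price equation is taken to be the weighted average of the two pricing rules, the money demand equation is assumed to take the quantity-theory form (output growth equals money growth minus inflation), and the feedback rule is assumed to cumulate, lowering money growth each quarter by $\gamma$ times last period's excess of inflation over $\pi^{new}$. All function and variable names are hypothetical. Rates are in percent per quarter, so 3% per annum is entered as 0.75 and the 4-percentage-point-per-annum shock as 1.0.

```python
import math

LAMBDA = 0.08      # decay rate of the "sleepy" share per quarter (value from the text)
BETA = 0.06        # response of "awake" prices to the output gap (value from the text)
GAMMA = 0.05       # policy feedback coefficient (value from the text)
SHOCK_PERIOD = 5   # period of the one-time shock to money growth (from the text)


def sleepy_share(quarters_since_shift):
    """Share of firms still unaware of the regime change: exp(-lambda * t).

    Equals 1 before and in the period of the shift itself.
    """
    return math.exp(-LAMBDA * max(quarters_since_shift, 0))


def simulate(periods=60, feedback=True, shock=1.0, pi0=0.75, growth=0.75):
    """Return (inflation, gap) paths in percent per quarter.

    ASSUMED equations (a sketch, not the authors' code):
      inflation_t = inflation_{t-1} + BETA * (1 - w_t) / w_t * gap_t
      gap_t       = gap_{t-1} + money_t - inflation_t - growth
      money_t     = money_{t-1} + shock_t
                    - GAMMA * max(inflation_{t-1} - pi_new, 0)   (if feedback)
    """
    pi_new = pi0 + shock   # inflation consistent with the higher money growth
    money = pi0 + growth   # steady-state money growth before the shift
    infl_prev, gap_prev = pi0, 0.0
    inflation, gap = [], []
    for t in range(1, periods + 1):
        if t == SHOCK_PERIOD:
            money += shock  # one-time shock, permanent rise in money growth
        if feedback and infl_prev > pi_new:
            money -= GAMMA * (infl_prev - pi_new)
        w = sleepy_share(t - SHOCK_PERIOD)
        k = BETA * (1.0 - w) / w
        # The two assumed equations are solved jointly for current inflation.
        infl = (infl_prev + k * (gap_prev + money - growth)) / (1.0 + k)
        gap_now = gap_prev + money - infl - growth
        inflation.append(infl)
        gap.append(gap_now)
        infl_prev, gap_prev = infl, gap_now
    return inflation, gap


if __name__ == "__main__":
    print("awake share after 8 quarters:", round(1.0 - sleepy_share(8), 3))
    infl, gap = simulate(feedback=True)
    for t in range(0, 24, 4):
        print(t + 1, round(infl[t], 3), round(gap[t], 3))
```

With $\lambda = 0.08$ the sleepy share after eight quarters is about 0.53, so slightly less than half of firms are awake after two years, as stated in the text. Calling `simulate(feedback=False)` gives the variant without policy response. Before the shock both paths stay at their steady-state values, and in the shock period the gap opens by exactly the size of the shock because prices have not yet begun to adjust.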
Thus, the timing of the cycle that would have been induced by the monetary expansion in 1971.1 (after allowing for the Fed's reaction to the changes in inflation set in motion by this initial expansion) is remarkably close to the timing of the actual business cycle. Note that this coincidence of the timing of the business-cycle peaks and troughs does not occur by construction, but arises endogenously given our choice of parameters. Continuing with the same analogy, we now focus on the magnitude of the output and price movements induced by the monetary expansion. Of particular interest is the ability of the model to match the phase averages for 1971.1-1973.3 and for 1973.4-1975.1. We find that the average per annum inflation rates for 1971.1-1973.3 and 1973.4-1975.1 in the model are fairly close to the U.S. data. The model predicts average inflation rates of 5.1% and 10.4% per annum, respectively, compared
Figure 2 IMPLICATIONS OF A PURELY MONETARY MODEL OF STAGFLATION: (a) WITH POLICY FEEDBACK; (b) WITHOUT POLICY FEEDBACK
Notes: Solid curves: quarterly inflation rate. Dashed curves: output gap. Models described in text. Responses to a permanent 1-percentage-point increase in money growth in period 5.
with 4.9% and 9.6% in the data. Thus, both the model and the data show a substantial increase in inflation during the recession. Similarly, for GDP growth the model fit is not far off. The model predicts 4.8% growth per annum for 1971.1-1973.3 compared with 5.2% in the data, and -3.0% for 1973.4-1975.1 compared with -1.8% in the data. We conclude that the quantitative implications of this model are not far off from the U.S. data, especially considering that we completely abstracted from other macroeconomic determinants. Also note that the Fed inflation target in our model economy becomes binding in early 1973, consistent with the empirical evidence of a monetary tightening in early 1973 in response to actual and incipient inflation (see Section 4). This example illustrates that go-stop monetary policy alone could have generated a large recession in 1974-1975, even in the absence of supply shocks. A question of particular interest is how essential the endogenous policy response of the Fed is for the generation of stagflation. Some authors have argued that the 1974 recession may be understood as a consequence of the Fed's policy response to inflationary expectations (e.g., Bohi, 1989; Bernanke, Gertler, and Watson, 1997). Figure 2b shows that policy reaction is an important, but by no means essential, element of the genesis of stagflation. In fact, a qualitatively similar stagflationary episode would have occurred under the alternative policy rule (1c') without any policy feedback. The main effect of adding policy feedback ($\gamma > 0$) is to increase the amplitude of output fluctuations and to dampen variations in inflation. In the model without policy feedback, holding fixed the remaining parameters, the timing of the cycle induced by the monetary regime change is roughly similar to that in Figure 2a. Figure 2b shows a peak in GDP in 1972.4-1973.1, followed by a peak in inflation in 1974.3 and a trough in GDP in 1975.2.
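The phase averages quoted above for the model with policy feedback can be set side by side with the data in a short script. The numbers are copied from the text; the dictionary layout and the function name are hypothetical and serve only to make the comparison explicit.

```python
# Phase averages in percent per annum, copied from the text
# (model with policy feedback versus U.S. data).
PHASES = ("1971.1-1973.3", "1973.4-1975.1")
MODEL = {"inflation": (5.1, 10.4), "gdp_growth": (4.8, -3.0)}
DATA = {"inflation": (4.9, 9.6), "gdp_growth": (5.2, -1.8)}


def model_minus_data(series):
    """Model value minus data value for each phase, rounded to one decimal."""
    return tuple(round(m - d, 1) for m, d in zip(MODEL[series], DATA[series]))


if __name__ == "__main__":
    for series in MODEL:
        for phase, diff in zip(PHASES, model_minus_data(series)):
            print(f"{series:11s} {phase}: model - data = {diff:+.1f}")
```

On these figures the model overstates inflation in both phases, by 0.2 and 0.8 percentage points, and overstates the depth of the 1973.4-1975.1 contraction by 1.2 percentage points.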
The model without policy feedback predicts average annual inflation rates of 5.1% and 12.0% for 1971.1-1973.3 and 1973.4-1975.1, respectively, compared with 4.9% and 9.6% in the U.S. data. Average output growth per annum over these same subperiods is 4.9% and -2.0%, respectively, in the model, compared with 5.2% and -1.8% in the data. The policy shift associated with the monetary tightening under Paul Volcker in late 1980 provides a second example of the basic mechanism underlying our stylized model. Using the same parametrization as for Figure 2b, our model predicts a sharp recession in late 1982, followed by an output boom in 1985 and an output trough in early 1987. This pattern closely mirrors the movements of HP-filtered actual output. At the same time, inflation in the model falls sharply, reaching its trough in 1984,
followed by a peak in 1986. Actual inflation in the GDP deflator followed a qualitatively similar, but delayed pattern. It reached its trough in 1986, followed by a peak in mid-1988. Thus, the response of actual inflation in this episode is even more sluggish than that in the model.

4. Support for the Monetary Explanation of Stagflation

In this section, we will present four additional pieces of evidence in support of the monetary explanation of stagflation. First, we will examine several indicators of monetary policy stance to show that monetary policy in the United States, in particular, exhibited a go-and-stop pattern in the 1970s. Second, we will show that episodes of "stagflation" were associated with swings in world-wide liquidity which dwarf monetary fluctuations elsewhere in our sample. Third, we will show that there were dramatic and across-the-board increases in the prices of industrial commodities in the early 1970s that preceded the OPEC oil price increases. These price increases do not appear to be related to commodity-specific supply shocks, but are consistent with an economic boom fueled by monetary expansion. Finally, we will document that in early 1973 a broad range of business-cycle indicators started to predict a recession, nine months before the first OPEC oil crisis, but immediately after the Fed began to tighten monetary policy.

4.1 EVIDENCE OF GO-AND-STOP MONETARY POLICY IN THE UNITED STATES
Our evidence is based on two measures of the total stance of monetary policy for this period: one based on the behavior of the Federal Funds rate, the other based on narrative evidence (see Bernanke and Mihov, 1998; Boschen and Mills, 1995). The Bernanke-Mihov index of the overall monetary policy stance shows a strongly expansionary stance from mid-1970 to the end of 1972 (see Figure 3). Interestingly, the Boschen-Mills index, which is based on narrative evidence, is mostly neutral during this period with the exception of 1970-1971. The reason is that the Boschen-Mills index is based on policy pronouncements as opposed to policy actions. Quite simply, the Fed's pronouncements in this period were uninformative at best and probably misleading. Both the Boschen-Mills index and the Bernanke-Mihov index show a sharp tightening of monetary policy in early 1973. The Boschen-Mills indicator, on a scale from +2 (very expansionary) to -2 (very tight), moves from neutral at the end of 1972 to -1 for the first three months of 1973. It then spends the next 6 months at -2, followed by two months at -1, ending the year in neutral. Further, the Bernanke-Mihov index shows a sharp and prolonged contraction in monetary policy by early 1973
Figure 3 INDICATOR OF OVERALL MONETARY POLICY STANCE, JANUARY 1966 TO DECEMBER 1988
Source: Courtesy of B. Bernanke and I. Mihov.
(see Figure 3).6 As noted by Boschen and Mills (1995), this contraction was an explicit response to rising inflation. It occurred long before the disturbances in the oil market in late 1973 and provides an alternative explanation of the recession in early 1974.7 The contractionary response of the Fed in 1973 to the inflationary pressures set in motion by earlier Fed policy is a key element of our monetary explanation of stagflation.8 Note that the observed increase in inflation in 1973 is understated as a result of price controls, and the observed increase in 1974 is overstated due to the lifting of the price controls (see Blinder, 1979).

6. The downturn in the Bernanke-Mihov index in 1973 reflects a sharp rise in the Federal Funds rate. Interestingly, as Figure 5d shows, although the real interest rate rose, it remained negative throughout 1974. Thus, the contractionary effect of the monetary tightening must have worked partly through other channels such as the effect of high nominal interest rates on housing starts in the presence of disintermediation due to interest-rate ceilings.
7. This interpretation is consistent with Bernanke, Gertler, and Watson's (1997) conclusion that the Fed in 1973 was responding to the inflationary signal in non-oil commodity prices, not to the oil price increase as is commonly believed.
8. There is no Romer date for 1973, despite the clear evidence of a shift in policy toward a contractionary stance.
As the U.S. economy slid into recession in 1974, the Fed again reversed course to ward off an even deeper recession. Indicators show a renewed monetary expansion that lasted into the late 1970s. The Bernanke-Mihov index indicates that monetary policy was strongly expansionary from late 1974 into 1977 (see Figure 3). This expansion was not initially reflected in high inflation, in line with our earlier discussion of sluggish inflation. Boschen and Mills record a similar, if somewhat briefer, expansion. Around 1978, the monetary stance turned slightly contractionary, becoming strongly contractionary in late 1979 and early 1980 under Paul Volcker, as inflation continued to worsen. Once again, the monetary policy stance provides an alternative explanation for the genesis of stagflation.

4.2 WORLDWIDE CHANGES IN LIQUIDITY
The changes in monetary policy indicators in the 1970s in the United States, and indeed in many other OECD countries, were accompanied by unusually large swings in global liquidity. One indicator of global liquidity is world money growth. We focus on world (rather than simply U.S.) monetary growth, both because the prices of oil and non-oil commodities are substantially determined in world markets, and because, despite its origins in the U.S., the monetary expansion in the early 1970s was amplified by the workings of the international monetary system, as foreign central banks attempted to stabilize exchange rates in the 1968-1973 period. The counterpart of the foreign-exchange intervention in support of the dollar was the rapid creation of domestic credit in all of the large economies (see McKinnon, 1982; Bruno and Sachs, 1985; Genberg and Swoboda, 1993). Figures 4a and 4b show a suitably updated data set for GNP-weighted world money growth and inflation, as defined by McKinnon (1982). There is evidence of a sharp increase in money growth in 1971-1972 and in 1977-1978 preceding the two primary stagflationary episodes in Table 1. The increase in world money growth is followed by a substantial rise in world price inflation in 1973-1974 and in 1979-1980 (see Figure 4b). The data also show a third major increase in world money-supply growth in 1985-1986. This does not pose a problem for the monetary explanation of stagflation, because 1985-1986 is fundamentally different from 1973-1974 and 1979-1980. The coincidence of substantial money growth and low world inflation constitutes a partial rebuilding of real balances following the restoration of the commitment to low inflation.9

9. This is precisely the standard interpretation of the patterns of inflation and money growth that have been documented for the period following the monetary reform that ended the German hyperinflation (see Barro, 1987, p. 206, Table 8.1).
Figure 4 MEASURES OF WORLD LIQUIDITY
Source: Inflation and money are GNP-weighted growth rates per annum as defined by McKinnon (1982, p. 322), based on IFS data for 1960.1-1989.4.
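The idea of a GNP-weighted world growth rate can be illustrated with a short sketch. It assumes fixed GNP shares as weights; McKinnon's exact weighting scheme and country list are not reproduced here and may differ. The country labels and all numbers below are hypothetical placeholders, not data from the paper.

```python
def gnp_weighted_growth(growth_by_country, gnp_by_country):
    """Weighted average of national growth rates, using GNP levels as weights.

    Both arguments map a country label to a number; only countries present
    in `growth_by_country` enter the average.
    """
    total_gnp = sum(gnp_by_country[c] for c in growth_by_country)
    weighted_sum = sum(
        growth_by_country[c] * gnp_by_country[c] for c in growth_by_country
    )
    return weighted_sum / total_gnp


if __name__ == "__main__":
    # Hypothetical money growth rates (percent per annum) and GNP levels.
    money_growth = {"A": 10.0, "B": 6.0, "C": 12.0}
    gnp = {"A": 50.0, "B": 30.0, "C": 20.0}
    print(round(gnp_weighted_growth(money_growth, gnp), 2))
```

Because the weights are GNP levels, a monetary expansion in a large economy moves the world aggregate more than the same expansion in a small one, which is why an expansion originating in the U.S. features so prominently in such a measure.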
We now turn to the United States, where the monetary expansions of the 1970s originated. Figure 5 shows that U.S. liquidity followed a pattern similar to that of other industrial countries. Figure 5a shows two large spikes in money growth in 1971-1972 and in 1975-1977 that preceded two episodes of unusually high inflation in the GDP deflator in 1974 and in 1980 (see Figure 5b) and that coincided with two episodes of significantly negative growth in real money balances in 1973-1974 and 1978-1980 (see Figure 5c). Figure 5c also shows evidence of a rebuilding of real balances (and possibly of the financial deregulation) after 1980. Additional evidence of excess liquidity in the 1970s is provided by the behavior of the U.S. real interest rate. Figure 5d shows that 1972-1976 and 1976-1980 were periods of abnormally low real interest rates, followed by unusually high real interest rates in 1981-1986. This pattern is consistent with the view that the excess money growth in the early and mid-1970s depressed ex ante real interest rates via a liquidity effect and further depressed ex post interest rates by causing unanticipated inflation. The evidence in Figure 5d also is consistent with the view that the Fed in the 1970s followed an interest-rate rule that was more tolerant of inflation than would have been consistent with a Taylor rule as estimated over the
Figure 5 MEASURES OF U.S. LIQUIDITY
Source: (a) Based on DRI series FM2. (b) Based on DRI series GDPD. (c) Based on DRI series FM2 and PRXHS. (d) Based on DRI series FYGM3 and PRXHS.
Volcker-Greenspan period (see Clarida, Gali, and Gertler, 2000). Finally, the timing in Figure 5d contradicts the view that oil shocks were responsible for the low ex post real interest rates. Real interest rates were negative during 1973, after the evidence of excess money growth, but well before the two major oil price increases. In fact, the 1973-1974 and 1979-1980 oil price increases were followed by a rise in ex post real interest rates.

4.3 MOVEMENTS IN OTHER INDUSTRIAL COMMODITY PRICES
An important additional piece of evidence that has received insufficient attention in recent research is the sharp and across-the-board increase in industrial commodity prices that preceded the increase in oil prices in 1973-1974 (see Figure 6). These increases occurred as early as 1972, well before the October War, and are too broad-based to reflect supply shocks in individual markets. They are, however, consistent with a picture of increased demand driven by the sharp increase in global liquidity documented in Figure 4. There is significant evidence that poor harvests caused food prices to soar in the early 1970s (see Blinder, 1979). Our data set deliberately
Figure 6 NOMINAL PRICE INDEXES FOR CRUDE OIL AND FOR INDUSTRIAL COMMODITIES, JANUARY 1948 TO JULY 2000
Source: All data are logged and de-meaned. The commodity price index excludes oil and food. The index shown is an index for industrial commodity prices (DRI code: PSCMAT). Virtually identical plots are obtained using an index for sensitive materials (DRI code: PSM99Q). The oil price series is defined as in Figure 1.
excludes food-related commodities. Instead, we focus on industrial raw materials. Commodities such as lumber, scrap metal, and pulp and paper, for which there is no evidence of supply shocks, recorded rapid price increases in the early 1970s (see National Commission on Supplies and Shortages, 1976). For example, the price of scrap metal nearly doubled between October 1972 and October 1973, and continued to rise until early 1974, to nearly four times its initial level. The price of lumber almost doubled between 1971 and 1974, as did the price of wood pulp. These commodity price data paint a picture of rapidly rising demand for all commodities in the early 1970s. It is interesting to note that a similar increase did not occur in oil prices until late 1973. Similarly, the 1979 increase in oil prices was preceded by a boom in other commodity prices, consistent with the evidence of monetary expansion, although the commodity price increase is of lesser magnitude. In fact, a striking empirical regularity of the data in Figure 6 is that
increases in other industrial commodity prices tended to precede increases in oil prices over the 1972-1985 OPEC period (and similarly for decreases). This fact is evident for example in 1972, 1978, 1980, 1983, and 1984. A natural question is how the monetary explanation of stagflation proposed here can be reconciled with the delayed response of oil prices relative to other industrial commodities. The explanation appears to be that, unlike other commodity transactions, most crude-oil purchases until the early 1980s did not take place in spot markets, but at long-term contractual prices. The sluggish adjustment of these contractual prices in response to demand conditions in commodity markets tended to delay the response of the oil price relative to the price of more freely traded commodities, until the spot market largely replaced traditional oil contracts in the early 1980s.

4.4 BUSINESS-CYCLE INDICATORS
Finally, the monetary explanation is consistent with evidence of an impending recession long before the first oil price shock in October 1973-January 1974, but shortly after the monetary tightening of early 1973. Both the expected conditions component of the index of consumer confidence and the index of leading indicators peaked in January 1973, when monetary policy switched to a contractionary stance in response to rising inflation. Consumer durables started falling relative to trend in early 1973, as would be expected in response to a monetary tightening. Similar declines can be observed in the numbers of housing starts and motor-vehicle purchases. Figure 7 suggests that both consumers and economic forecasters were expecting a recession many months before the October 1973 war and the subsequent oil embargo, and that this expectation was not driven by concerns over OPEC. The decline in the index of leading indicators continued throughout 1974. Although we cannot tell to what extent the fall in the index of leading economic indicators after September 1973 can be attributed to oil as opposed to money, a full two-thirds of the fall in consumer confidence in 1973-1974 was completed prior to the oil date.

5. What Explains the Initial Monetary Expansion of the 1970s?

The U.S. economy moved from an extended period of low and stable inflation at the beginning of the 1960s to one of high and variable inflation by the end of the decade. The underlying cause of the shift towards higher inflation was the gradual reduction in the United States's commitment to the twin goals of low and stable inflation and the avoidance of "excessive" balance-of-payments deficits. In the late 1960s, the central
Figure 7 BUSINESS-CYCLE INDICATORS WITH OPEC I OIL DATES: (a) EXPECTED CONDITIONS COMPONENT OF CONSUMER CONFIDENCE; (b) REAL DURABLES CONSUMPTION (PERCENT DEVIATION FROM HP-TREND); (c) INDEX OF LEADING ECONOMIC INDICATORS
Sources: (a) Survey of Consumers, University of Michigan; (b), (c) based on DRI data.
bank's commitment to these traditional goals was increasingly diluted by the additional goal of maintaining high employment.10 The dilution of the commitment to controlling inflation and balance-of-payments deficits was behind both the weakening (and ultimately the destruction) of the Bretton Woods system and the initiation of expectations of high and persistent inflation in the late 1960s. The rise in inflationary expectations in turn triggered an inflation trap by raising the cost of subsequent disinflations (also see Christiano and Gust, 2000).11 These inflationary pressures were reinforced by two serious errors of economic analysis on the part of the Federal Reserve: first, a miscalculation of full employment in the wake of the productivity slowdown and of structural changes in the labor market (see Orphanides, 2000; Orphanides et al., 1999);12 and, second, an increased tendency to attribute inflation to "special factors" rather than the underlying monetary environment.13 A third element was the exploitation of the newly unconstrained policy environment in the service of electoral politics.14 A number of authors (see McKinnon, 1982; DeLong, 1997; Mundell, 2000) have suggested that the collapse of the gold exchange standard associated with Bretton Woods played a key causal role in the creation of the inflationary monetary environment of the 1970s.15 Although it is widely accepted that the eventual collapse of the Bretton Woods system was inevitable due to fundamental structural flaws (see the papers in Bordo and Eichengreen, 1993), the timing of its demise was influenced by the same factors that also launched the initial wave of inflation in the mid-1960s to 1970. The collapse was triggered by an excess supply of U.S. dollars resulting both from the expansion of the U.S. monetary base and a reduction of the demand for dollars abroad driven by the expecta-

10. This change in focus can be traced ultimately to the Great Depression and the perception that tight monetary policy had been responsible for excessively high unemployment during the Great Depression. The rise in social and political commitment to full employment was furthered by an intellectual belief in the more or less permanent exploitability of the Phillips curve (see Samuelson and Solow, 1960). The refusal to rein in social spending or to allow a sharp rise in interest rates, as the Vietnam war expanded in the late 1960s, reflected the change in priorities.
11. What makes the expectations-trap hypothesis plausible is evidence that by 1971 the Fed indeed perceived a shift in the public's expectations of inflation. As Christiano and Gust (2000) note, Arthur Burns was concerned about expectations of inflation as early as December 1970. By 1971, he perceived a shift in inflationary expectations due to the steady rise of consumer-price inflation since 1965, well before the commodity supply shocks, the oil shocks, and the monetary expansion of the early 1970s (see Burns, 1978, pp. 118, 126).
12. Orphanides (2000) documents that the measurements of real output available to the Fed following both the 1970 and 1974 recessions were substantially lower than the output data now available. At the same time, official estimates of potential real output were in retrospect far too optimistic, resulting in excessively high estimates of the output gap, defined as the shortfall of actual output relative to potential.
Drawing on evidence from simulated real-time Taylor rules and on the Fed minutes and the recollections of the policymakers involved, Orphanides concludes that the increase in the natural rate of unemployment and the productivity slowdown in the late 1960s and 1970s were two major factors in explaining the inflationary outcomes of the period. 13. For example, Hetzel (1998) makes the case that then chairman Arthur Burns adhered to a special-factors theory of inflation which attributed increases in inflation to a variety of special circumstances ranging from unions and large corporations to government deficits and finally food and oil price increases. Hetzel argues that Burns systematically discounted any direct effects from increases in the money supply on inflation and did not appear to be overly concerned about the extent of the monetary expansion in the early 1970s. For a similar view see Christiano and Gust (2000). 14. For example, DeLong (1997) stresses that the inflation of the early 1970s was fueled in addition by Arthur Burns's efforts to facilitate Nixon's reelection through expansionary monetary policy. Christiano and Gust's (2000) narrative evidence that Burns was not intimidated by Nixon does not contradict this interpretation, because Burns's conservative economic views were closer to Nixon's than to those of his Democratic opponent. 15. The temporal coincidence is indeed an impressive one. The breakdown of the Bretton Woods system was foreshadowed by the introduction in 1968 of a two-tiered system of convertibility with significantly higher prices for private than for official transactions, in response to the declining private-sector confidence in the dollar peg. It became official when President Nixon announced the "closing of the gold window"—ending the convertibility of dollars into gold in August 1971. The relaxation of the convertibility constraint coincided with a dramatic increase in U.S. 
monetary growth (see Figure 3a) and a period of expansionary monetary policy between mid-1970 and late 1972, as indicated by the Bernanke-Mihov index.
Do We Really Know that Oil Caused the Great Stagflation? • 159
tion of an incipient depreciation of the dollar (see McKinnon, 1982; Bruno and Sachs, 1985; Genberg and Swoboda, 1993). In this sense, the breakdown of the Bretton Woods system was endogenous. At the same time, awareness of the loss of prestige that would accompany suspension of convertibility continued until the end to serve as a partial commitment device that contributed significant restraint against higher inflation (see Bordo and Kydland, 1996). By completely removing this constraint, the 1971 closing of the gold window permitted a second round of monetary expansion starting from an already high base. In this sense, the breakdown of the Bretton Woods system may also be considered one of the causes of the monetary expansion. As stressed by Kydland and Prescott (1977) and Barro and Gordon (1983), the incentive of the central bank to stimulate employment in the short run tends to produce an excessively high rate of inflation in the absence of a suitable commitment mechanism. Not until the development of new commitment devices in the form of a lexicographic intellectual commitment to price stability and the cult of the conservative central banker at the beginning of the 1980s (see Rogoff, 1985) was the prevailing inflation reduced to the levels of the early 1960s. The reason that the 1970s are different from the preceding and the following decade thus is the absence of effective constraints on monetary policy. As the global monetary system underwent dramatic changes in the early 1970s, both central bankers and private agents slowly had to adapt to the new rules of the game. The monetary expansion was not immediately understood by market participants and required a process of learning that is reflected in the sluggish adjustment of inflation in the model of Section 3. Furthermore, the Fed itself was operating in a new monetary environment without the traditional constraints and needed to learn about the consequences of its own actions. 
There was a widespread sense that "the rules of economics are not working in quite the way they used to" (see Burns, 1978, p. 118). This element of trial and error is important in understanding the go-and-stop nature of monetary policy in this period and helps to explain why the generally inflationary stance of monetary policy was punctuated by occasional sharp contractions. It also helps to answer the question why the Fed did not learn from its mistakes after the first episode of go-and-stop monetary policy ended in 1974. The data for U.S. monetary growth in Figure 5a show a renewed expansion that coincided with a period from late 1974 until 1977, in which policy indicators signal a second "go" phase for monetary policy. Part of the explanation may be that, at the time, the Fed attributed at least part of the observed stagflation to oil supply shocks and other special factors. More importantly, the Fed lacked a political mandate for
serious reform. The lack of commitment to maintaining low inflation could only be overcome by the experience of double-digit inflation in the late 1970s.16
6. How Convincing is the Aggregate-Supply-Shock Explanation of Stagflation?

The view that the historical pattern of stagflation can be accounted for by the effects of money does not preclude the possibility that oil shocks played a major role in generating the stagflation either directly or indirectly by inducing a policy response. In this section, we will demonstrate that the supply-shock explanation of stagflation is less convincing than commonly thought.

6.1 IS THE TEXTBOOK ANALYSIS OF AGGREGATE SUPPLY SHOCKS CONVINCING? GROSS VS. VALUE-ADDED CONCEPTS OF OUTPUT AND PRICE
The textbook view is that oil price shocks are of necessity inflationary. The only question is the magnitude of the inflationary effect. As we will show, however, this claim is unambiguously true only for the price of gross output, not for the price of value added. The following counterexample demonstrates that oil price shocks may in fact have a deflationary effect on the price of value added, even as they raise the price of gross output.

Suppose gross output $Y$ is given by the production function $Y = Q[V(K, L, x), O]$, where $x$ denotes a technology disturbance, $O$ denotes the quantity of a foreign commodity import ("oil"), and $V(K, L, x)$ is domestic value added. As is standard, we assume separability between $O$ and the other factors in order to ensure the existence of a value-added production function. As is immediately clear, a decline in $O$, under separability, is not a shock to the production function for value added: the ability to produce domestic output is unchanged. It follows that oil shocks cannot play the role of a technology shock in a standard real-business-cycle model (i.e., they do not alter value added, holding constant capital and labor input), although they do lower the quantity of gross output. Following Rotemberg and Woodford (1996), we consider an economy in which symmetric firms produce final output using the gross output production function
16. Sargent (1998) provides a detailed account of competing explanations of the transition back to a low-inflation regime.
$$Y_t = Q[V_t, O_t], \qquad (2)$$

where $O_t$ is the quantity of foreign oil used in production, $Q$ is homogeneous of degree one in its arguments, and $V_t$ is a function of labor hours and capital. The capital stock is assumed to be fixed, ensuring concavity of $V_t$. Let gross output be the numeraire. $V_t$, the value added associated with capital and labor, should be thought of as real GDP. Nominal GDP is given by $P_t Y_t - P^O_t O_t$, where $P^O_t$ is the price of imported oil. Further postulate that the demand for money balances is proportional to nominal gross output:
$$M_t = k P_t Y_t, \qquad (3)$$

where $M_t$ is the money stock, $k$ is a constant of proportionality, and $P_t$ is the price of gross output. Thus, nominal gross output is determined by the money stock alone. Now suppose that labor is supplied inelastically. Further suppose that all markets are perfectly competitive. Logarithmically differentiating (2) and (3) with respect to $P^O_t$, we obtain
where $\Delta$ denotes percent changes, $s_O$ is the cost share of oil in gross output, and $\varepsilon_{O,V}$ is the elasticity of substitution between value added and oil. This means that an increase in the price of imported oil will tend to lower the quantity of gross output and raise the price of gross output. Next consider the deflator for value added, defined as the ratio of nominal to real value added:

$$P^V_t = \frac{P_t Y_t - P^O_t O_t}{V_t}. \qquad (6)$$
Again consider an increase in the price of imported oil. Clearly, under our assumptions the denominator of (6) does not vary with the price of oil. The numerator, however, will fall, since by (3) nominal gross output is determined solely by the money stock, and the cost share of imported oil in gross output is expected to rise in response to an oil price increase (see Gordon, 1984; Rotemberg and Woodford, 1996). Thus, the oil price shock lowers the price of value added, even as it raises the price of gross output.
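The counterexample can be checked numerically. The sketch below is only an illustration under stated assumptions, not a reproduction of the derivation in the text: it assumes a CES form for $Q$ (the text requires only homogeneity of degree one), fixed value added, nominal gross output pinned down by the money stock, and hypothetical parameter values. Under these assumptions the log-linear response of gross output to the oil price works out to $-s_O \varepsilon / [1 - s_O(1 - \varepsilon)]$, with the price of gross output moving by the same amount in the opposite direction. That expression is derived for this sketch and need not coincide with the form of the equations the authors report.

```python
import math


def ces_output(v, o, a, rho):
    """Gross output Y = Q(V, O) for an assumed CES aggregator (degree-one homogeneous)."""
    return (a * v**rho + (1.0 - a) * o**rho) ** (1.0 / rho)


def oil_cost_share(v, o, a, rho):
    """Oil's cost share; under perfect competition it equals oil's output elasticity."""
    oil_term = (1.0 - a) * o**rho
    return oil_term / (a * v**rho + oil_term)


def equilibrium(p_oil, nominal_gross=1.0, v=1.0, a=0.5, rho=-1.0):
    """Solve for oil use when nominal gross output P*Y is fixed by the money stock.

    Assumes rho < 0, i.e. an elasticity of substitution 1/(1 - rho) below one.
    All parameter values are hypothetical.
    """
    if rho >= 0:
        raise ValueError("this sketch only handles rho < 0")
    lo, hi = 1e-12, nominal_gross / p_oil  # spending on oil cannot exceed P*Y
    for _ in range(200):  # bisection on: P_oil * O / (P*Y) = oil cost share
        mid = 0.5 * (lo + hi)
        gap = p_oil * mid / nominal_gross - oil_cost_share(v, mid, a, rho)
        if gap > 0:
            hi = mid
        else:
            lo = mid
    o = 0.5 * (lo + hi)
    y = ces_output(v, o, a, rho)
    share = p_oil * o / nominal_gross
    return {
        "oil": o,
        "gross_output": y,
        "price_gross": nominal_gross / y,                  # P = (P*Y) / Y
        "oil_share": share,
        "deflator_va": nominal_gross * (1.0 - share) / v,  # equation (6)
    }


def loglinear_output_response(share, eps):
    """d ln Y / d ln P_oil implied by the sketch's assumptions."""
    return -share * eps / (1.0 - share * (1.0 - eps))


base = equilibrium(p_oil=0.05)
shock = equilibrium(p_oil=0.10)
for label, result in (("before", base), ("after", shock)):
    print(label, {key: round(value, 4) for key, value in result.items()})
```

With these hypothetical numbers, doubling the oil price raises the price of gross output from 0.625 to about 0.685 while the value-added deflator falls from 0.80 to about 0.73. The fall depends on the oil cost share rising, which in this CES example requires an elasticity of substitution below one.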
This stylized example illustrates that the aggregate-supply-shock analysis of oil price changes is questionable. Whereas aggregate-supply shocks in the textbook model are stagflationary for value added, oil price increases may actually be deflationary. In this sense, in our example they are closer in spirit to aggregate demand shocks than to aggregate supply shocks.17 How realistic is this counterexample? Clearly, to overturn our benchmark result would require a sufficiently sharp fall in real value added in response to an oil price shock, without a commensurate drop in the money stock. We now discuss several mechanisms by which oil price shocks may in principle generate a fall in the quantity of value added. Since oil shocks are not productivity shocks, the key to establishing that oil price shocks affect value added then must be showing that labor and capital inputs change in response to an oil price shock.18 One model that establishes such a link is the sectoral-shifts model of Hamilton (1988). A related channel has been discussed by Bernanke (1983), who shows in a partial-equilibrium model that oil price shocks will tend to lower value added, because firms will postpone investment as they attempt to find out whether the increase in the price of oil is transitory or permanent. Even if we accept the view that an oil price shock lowers real value added, however, there is no presumption that this shock will be stagflationary. First, consider the case of a fixed money supply. It is not enough to show that value added falls in response to an oil price shock. For the price of value added actually to rise when the money supply is fixed, value added must fall by more than the numerator in (6). More generally, the money supply will not be fixed. In that case, the direction of the change in the price deflator also depends on the Fed's reaction to the fall in value added. 
The optimal Fed behavior would be to contract the money supply in response to the fall in value added (see King and Goodfriend, 1997). We have already shown that indeed the Fed was conducting contractionary monetary policy at the time of the oil price shocks. Whether this monetary contraction would have been enough to stabilize the price level, as value added fell, is an empirical question. Either way, monetary policy plays a key role in determining the effect of oil price shocks on inflation.

This discussion shows that the implications of an oil price shock are unambiguous only for the price of gross output measures such as the consumer price index (CPI). Although one could construct other examples, in which oil price shocks are inflationary for the price of value added (measured by the GDP deflator), there is no presumption that in general they are. The direction and strength of the effect of oil price shocks on the GDP deflator is an empirical question.

17. An additional factor that reinforces the aggregate-demand-shock interpretation is the transfer of purchasing power from the United States to OPEC (see Bruno and Sachs, 1985).

18. Under imperfect competition, as noted by Rotemberg and Woodford (1996), an oil price shock does result in a rise in the supply price for all levels of value added. This increase occurs because firms apply the markup to all cost components, including imported oil, not just to capital and labor. The magnitude of this effect, however, is likely to be small for reasonable markup ratios, unless we allow in addition for substantial changes in the markup over time. The latter possibility is discussed by Rotemberg and Woodford (1996), who show that a model involving implicit collusion between oligopolists in the goods market can yield output responses to an oil price shock that are quantitatively important.

6.2 DO OIL PRICE SHOCKS MOVE THE GDP DEFLATOR?
The preceding discussion stressed the important distinction between inflation in prices of gross output (such as the CPI) and of value added (such as the GDP deflator). In this section, we provide some empirical evidence about the timing and relative magnitude of the changes in the GDP deflator and the CPI inflation rates during major oil price changes that sheds light on the relative contributions of oil and money to the inflation in the GDP deflator observed in the 1970s. Figure 8 shows the annualized inflation rates for gross output prices (as measured by the CPI) and the price of value added (as measured by the GDP deflator) for the United States in the period 1960.1-2000.2.19 We use the PRXHS index of consumer prices, which excludes housing and shelter. Despite the obvious differences in the content and construction of these two indices, there is strong comovement in the long run. For our purposes, it will be of interest to focus on five major episodes: the two major oil price increases of 1973-1974 and 1979-1980, the major drop in oil prices in early 1986, the invasion of Kuwait in 1990-1991, and the recent oil price volatility since 1997.

Figure 8 QUARTERLY U.S. INFLATION RATES FOR 1960.1-2000.2
Source: All data are growth rates per annum. All data are taken from the DRI database. We use PRXHS (consumer prices excluding shelter) as the CPI measure, and GDPD as the implicit GDP deflator.

Our first observation is that Figure 8 shows an unusual discrepancy between the deflator and CPI inflation rates during each of the five episodes of interest. CPI inflation rose sharply relative to deflator inflation between 1972 and 1974 and again in 1979 and early 1980. This result is not surprising, as these periods were characterized by major fluctuations in world commodity markets. To the extent that prices of imported oil and other imported commodities enter the CPI but not the deflator, our earlier discussion suggests that we should expect to see a wedge between inflation in the CPI and in the deflator. Moreover, it is well known that especially price-sensitive items such as food (whether imported or not) have higher weights in the CPI than in the deflator, adding to the discrepancy. Similarly, the 1986 and 1990-1991 episodes are characterized by a differential response of CPI and deflator inflation rates. The same differential response occurs after 1997, as oil prices first plummet and then experience a dramatic reversal in 1999 and 2000.

Our second observation is that during the 1970s increases in CPI inflation rates tended to precede increases in the inflation rate of the deflator. Although CPI inflation reached double-digit rates in early 1974, the bulk of the inflation in the deflator only occurred from mid-1974 to 1975. Similarly, although CPI inflation rates rose sharply in 1979-1980, the bulk of the increase in the inflation in the deflator occurred only in mid-1980-1981.20 One possible explanation for this difference in timing is that value added fell for the reasons described by Hamilton (1988), and monetary policy did not contract enough to prevent an increase in the price level. An alternative explanation is that the delayed inflation was caused by the earlier monetary expansion. The latter explanation seems more plausible, given that of the five oil episodes in our sample period only the 1973-1974 and 1979-1980 episodes are associated with large changes in the deflator inflation rate; none of the other major oil price changes is. Of particular interest is the 1986 fall in the oil price following the collapse of OPEC. The fact that the sharp deflation in the CPI in 1986 was accompanied by only a minor reduction in deflator inflation casts doubt on the view that oil was responsible for deflator inflation in earlier periods. Similarly, during 1990-1991 deflator inflation changed little by historical standards. Further evidence against the oil-supply-shock view of stagflation is provided by the events of 1997-2000. During this period oil prices first fell sharply to an all-time low and then rose sharply to heights not seen since 1979-1980. As expected, these oil price swings are reflected in considerable swings in CPI inflation rates in Figure 8, but they have little, if any, effect on deflator inflation.

Figure 8 illustrates that the high correlation of oil price shocks and subsequent increases in deflator inflation that we observe in the 1970s breaks down in other periods. We interpret this evidence as supportive of the view that this relationship is largely coincidental. The monetary explanation of stagflation provides a coherent account of why the 1970s were different, and of what generated the observed dramatic increases in deflator inflation. Thus, the evidence in Figure 8 provides further support for the monetary explanation of the stagflation of the 1970s.

19. Our theoretical counterexample maintained the implicit assumption that no oil is produced domestically. This is not an issue for most OECD countries in the 1970s with the exception of the United States. There are reasons to doubt the quantitative importance of this channel, however, even for the United States, given the small share of domestic oil in U.S. GDP. It may be shown that the inflation rate for the non-oil component of U.S. GDP will be lower than the inflation rate for the total GDP deflator, but that the overall results are qualitatively similar under realistic assumptions.

20. The unusually long delay in the response of inflation to money can be explained by the presence of wage and price controls throughout 1973 and in early 1974. These controls effectively suppressed inflation rates. The lifting of price controls in April 1974 coincided with a sharp increase in deflator (as well as CPI) inflation (see Blinder, 1979). The fact that the increase in deflator inflation rates in 1980-1981 was smaller (if more sustained) than in 1974-1975 also is consistent with this interpretation.

7. The Relationship of Oil Prices and the Macroeconomy: Theory

In Section 6, we showed that, although an oil supply shock may well cause a recession, its effect on the GDP deflator (as opposed to the CPI) is ambiguous in theory and appears to be small in practice. Nevertheless, casual observers continue to be impressed with the coincidence of sharp oil price increases in the 1970s and the worsening of stagflation. In
fact, some observers seem puzzled by the absence of a close link between oil prices and stagflation at other times (for example, The Economist, 1999). In this section, we will argue that the almost simultaneous occurrence of sharp increases in oil prices and worsening stagflation in the 1970s was indeed no coincidence. Unlike conventional accounts based on exogenous oil supply shocks, however, we stress that oil prices were responding in substantial measure to conditions in the oil market, which in turn were greatly affected by macroeconomic conditions (and ultimately by the monetary stance). Put differently, we reject the common notion of a simple one-way causal link from oil prices to the macroeconomy and allow for the possibility that oil prices (like other commodity prices traded in international markets) tend to respond to macroeconomic forces. The view that oil prices contain an important endogenous component is not as radical as it may seem. In fact, the observed behavior of oil and non-oil commodity prices coheres well with economic theory about resource prices (see Heal and Chichilnisky, 1991). Commodity prices rise in response to high output and low real interest rates. Our emphasis on the endogenous response of oil prices to global (and in particular U.S.) macroeconomic conditions does not rule out that political events played a role in the timing of the observed oil price increases, but it suggests that politically motivated increases in the oil price would have been far less likely in the absence of a conducive economic environment created by monetary policy. The starting point for our analysis is the classic resource extraction model of Hotelling (1931). Applying this model to oil, marginal revenue (MR) net of marginal cost of extraction (MCE) must rise at the rate of interest, so that well owners are on the margin indifferent between extracting oil today and extracting oil tomorrow. 
Further, the transversality condition says that, in the limit, no oil should be wasted. Combining these two conditions, for the special case of zero marginal extraction cost, we have
where $p_0^{oil}$ = initial relative price of oil, $S^{oil}$ = fixed stock of oil, $r$ = real interest rate, $y_t$ = aggregate output in period $t$, and $D_t^{oil}$ = demand for oil in period $t$. Under perfect competition, equation (7) implies that the price of oil rises at the rate of interest until the fixed stock of oil is exhausted. For the more general case of positive marginal extraction costs the first-order condition for profit maximization is that $MR - MCE$ must rise at rate $r$.
Note that the required rise over time in MR — MCE may be accomplished by a fall in MCE as new capacity is developed, even without a rise in the oil price (see Holland, 1998). Indeed, this feature of the model allows for the oil price to fall over time. This simple model implies several channels through which monetary policy affects oil prices. First, a one-time permanent drop in r raises the initial price, and implies slower price growth thereafter. Second, a rise in aggregate real income shifts out the flow demand for oil. Since the oil is consumed more rapidly, the price of oil must rise to clear the market. The magnitude of these effects depends on the size and duration of the effects of monetary policy on r and y. Money is not normally thought to permanently change r or y. Thus, the magnitude of price adjustment in response to monetary policy in this model may not be large. Much stronger effects on the price of oil may occur once capacity is modeled explicitly. If marginal costs are increasing in the extraction rate (which—in the limit—may be interpreted as a capacity constraint), a shift in demand for oil in this model may generate sharp increases in the price of oil as well as overshooting of the oil price.21 In the limit, if installed capacity is instantaneously fixed, the price of oil at a moment in time is determined entirely by demand. A rise in real GDP, or a decrease in the real interest rate, shifts the demand curve for oil to the right, sharply raising the market price of the given stock of oil. However, this price increase carries the seeds of its own destruction. If we began in steady state, the shadow price of capacity will now exceed its replacement cost at current levels of capacity. If the price remains high for extended periods, investment in drilling and distribution capacity takes place, and in the long run the price of oil will fall. 
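The first two channels can be illustrated with a small numerical sketch of the zero-extraction-cost case. Everything specific here is an assumption made for illustration and is not taken from the text: time is discrete, the price follows $p_t = p_0 (1 + r)^t$, demand is isoelastic, $D(p, y) = y \, p^{-\eta}$, output $y$ is held constant, and the parameter values are hypothetical. The initial price is then whatever value makes cumulative demand along the price path just exhaust the fixed stock.

```python
def initial_price(stock, r, y, eta):
    """Initial oil price p0 under the assumed isoelastic demand.

    Cumulative demand is sum_t y * (p0 * (1 + r)**t)**(-eta), a geometric series
    equal to y * p0**(-eta) / (1 - decay); setting it equal to the stock gives p0.
    Requires r > 0 and eta > 0.
    """
    decay = (1.0 + r) ** (-eta)  # per-period shrinkage of demand along the price path
    return (y / (stock * (1.0 - decay))) ** (1.0 / eta)


def cumulative_demand(p0, r, y, eta, periods):
    """Oil consumed over a finite horizon along the Hotelling price path."""
    return sum(y * (p0 * (1.0 + r) ** t) ** (-eta) for t in range(periods))


STOCK, ETA = 100.0, 0.5  # hypothetical stock of oil and demand elasticity

high_r = initial_price(STOCK, r=0.05, y=1.0, eta=ETA)
low_r = initial_price(STOCK, r=0.02, y=1.0, eta=ETA)
boom = initial_price(STOCK, r=0.05, y=1.2, eta=ETA)

print("p0 with r = 5%:", round(high_r, 4))
print("p0 with r = 2%:", round(low_r, 4))
print("p0 with r = 5% and 20% higher output:", round(boom, 4))
print("stock used over a long horizon:", round(cumulative_demand(low_r, 0.02, 1.0, ETA, 5000), 4))
```

In this sketch a permanently lower real interest rate raises the initial price and flattens the subsequent price path, and higher output raises the initial price for a given interest rate. Because the shifts in $r$ and $y$ are treated as permanent, the sketch shows the direction of the effects rather than their likely magnitude.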
In addition to the direct effects of real income and real interest rates on the demand for oil, there is an additional effect that links the stability of oil cartels to macroeconomic forces. Standard theoretical models of cartels such as Rotemberg and Saloner (1986) and Green and Porter (1984) predict that cartel stability will be strengthened by low real interest rates. Producers trade off the immediate gains from abandoning the cartel against the present value of the cartel rents forgone. This logic suggests that the unusually low real interest rates in the 1970s, all else equal, should have been conducive to the formation of cartels, and the high real interest rates of the 1980s should have been detrimental. Moreover, Green and Porter show that if producers, rather than observing the cartel's output, only observe a noisy measure of the market-clearing price, cartel activity will be procyclical. The assumption of imperfectly observable output is particularly appealing for crude-oil producers. The actual production level of crude oil can only be estimated in many cases, and reliable output statistics become available only with a long lag. Thus, we would expect strong economic expansions, all else equal, to strengthen oil cartels and major recessions to weaken them.

21. Mabro (1998, p. 16) notes that ". . . exhaustibility as an ultimate outcome in a universal context is not very relevant [for the oil price] because the time horizon involved, even today, is far too long to have a noticeable impact. What matters is the relationship of current productive capacity to current demand and of planned investments in capacity to future demand. It is not the geo-physical scarcity of oil that poses problems . . . but the capacity issue at any given point in time."

8. The Relationship of Oil Prices and the Macroeconomy: Evidence

The view that oil prices are endogenous with respect to U.S. macroeconomic variables such as real interest rates and real GDP has considerable empirical support. The two most prominent increases in the price of oil in 1973-1974 and 1979-1980 were both preceded by periods of economic expansion (see Table 1) and unusually low real interest rates (see Figure 5d). Similarly, the most recent oil price increase coincided with a strong economic expansion. In contrast, the fall in oil prices after 1982 coincided with a severe global recession and unusually high real interest rates. This section analyzes in detail the historical evidence for a link between oil prices and the macroeconomy.

8.1 WHY DID THE 1973-1974 OIL PRICE INCREASE OCCUR WHEN IT DID?
An intriguing question is why the two major and sustained oil price increases of the 1970s occurred when they did. The dominant view in the literature appears to be that the timing was primarily determined by exogenous political events in the Middle East, which are thought to have triggered supply cuts, thereby raising oil prices (see Hamilton, 1999). However, as we will argue, sustained oil price increases are only possible under conditions of excess demand in the oil market. Such conditions are unlikely to occur in the absence of favorable macroeconomic conditions, notably economic expansion and low real interest rates.
Thus the apparent success of OPEC oil producers in raising prices in the 1970s (and their failure to raise prices for sustained periods at other times) is no historical accident. The timing of the oil price increases in the 1970s coincided with periods of unusually strong demand for oil, driven in substantial part by global macroeconomic conditions.

Until the late 1960s, the excess capacity of the U.S. oil industry allowed the U.S. to play the special role of the supplier of last resort to Europe and Japan, in the event that oil supplies were threatened. The fact that the U.S. assumed this role was an inadvertent consequence of the regulatory policies of the Texas Railroad Commission regime, under which rationing of production led to excess capacity (see Hamilton, 1985). The ability of the United States to flood the market with surplus oil served as a deterrent against any attempt to raise international oil prices, and ultimately thwarted the effects of the 1956 and 1967 oil embargoes.

What then were the changes in the world oil market that made the successful 1973 oil price hike possible? The main difference between the early 1970s and earlier periods was that, on top of the long-term trend toward increased energy consumption, there was a dramatic surge in worldwide demand for oil that was fueled by monetary expansion. In March 1971, U.S. oil production for the first time in history reached 100% of capacity (see Yergin, 1992, p. 567).22 The rising demand for oil was at first met with an increase in oil output in the Middle East that kept the price of oil low and falling in real terms (see Figure 9). Oil imports as a share of U.S. oil consumption rose from 19% in 1967 to 36% in 1973 (see Darmstadter and Landsberg, 1976, p. 31). Mabro (1998, p. 11) notes that OPEC's average daily production increased from 23.4 million barrels per day in 1970 to 30.99 million barrels per day in 1973.
All OPEC members but Kuwait, Libya, and Venezuela increased production in this period. As a result, excess capacity was shrinking quickly in the Middle East. Seymour (1980, p. 100) documents that the oil market had been tightening since 1972 in spite of the rapid increases in oil output. In late 1972, all of the main market indicators (tanker freight rates, refined-product prices, and spot crude prices) started rising and continued their climb throughout 1973. While the recoverable reserves in the Middle East were of course huge, available production capacity was lagging consumption. By September-October of 1973, immediately before the oil embargo, both Saudi Arabia and Iran had just about reached their maximum sustainable output. The capacity shortage was not limited to Saudi Arabia and Iran. Had oil prices not risen in late 1973, there would have been virtually no spare productive capacity available anywhere in the world on the basis of the then projected forecasts of oil consumption for the winters of 1973-1974 and 1974-1975 (see Seymour, 1980, p. 100).23

Figure 9 REAL PRICE INDICES FOR CRUDE OIL AND FOR INDUSTRIAL COMMODITIES, JANUARY 1948 TO JULY 2000
Source: See Figure 6 for a description of the data. The price data have been deflated using the CPI index excluding shelter (PRXHS).

22. The normal market response to this shortage would have been rising oil prices. However, U.S. price controls on oil, imposed in 1971 as part of an overall anti-inflation program, were discouraging domestic oil production while stimulating consumption, and left little incentive for exploration or conservation. Moreover, growing environmental concerns held back U.S. oil production, even as new large oil reserves were being discovered in Alaska (see McKie, 1976, p. 73).

23. Our reading of the evidence coincides with contemporary accounts. For example, in November 1968, only one year after the successful defeat of the 1967 oil embargo, State Department officials announced at an OECD meeting that soon the U.S. would not be able to provide extra supply to the world in the event of an embargo (see Yergin, 1992, p. 568). In November 1970, a U.S. diplomat in the Middle East filed a report stating that "the extent of dependence by western industrial countries upon [foreign] oil as a source of energy has been exposed, and the practicality of controlling supply as a means of exerting pressure for raising the price of oil has been dramatically demonstrated" (Yergin, 1992, p. 587).
8.2 IF THE 1973 OIL PRICE INCREASE WAS CAUSED BY DEMAND SHIFTS, WHY DID OIL OUTPUT FALL?
The normal market reaction to the increased demand for oil in the early 1970s should have been an increase in both price and quantity of oil. As we have noted, the data instead show a steady decline of the price of oil in real terms in the early 1970s, followed by a sharp rise in the price of oil in late 1973 and a drop in oil output. This puzzling observation reflected the gradual resolution of a disequilibrium that arose from the peculiar institutional structure of the OPEC oil market at that time. Throughout the 1960s, oil delivery contracts were long-term agreements between OPEC producers and oil companies. Oil producers agreed to supply oil at a price that was fixed in nominal terms for several years in advance. Contracts were periodically renegotiated to take account of changes in economic conditions. As the macroeconomic environment became increasingly unstable in the early 1970s, the renegotiations failed to keep pace with the rapidly changing macroeconomic conditions. The stickiness of the nominal oil price contributed to the observed fall of the real price of oil, as inflation outpaced expectations. OPEC producers became increasingly reluctant to supply additional quantities of oil at prices well below the market-clearing level. By late 1973 this regime came to an abrupt end, when OPEC reneged on its contractual agreements with the oil companies and unilaterally decreed a much higher price of oil. As the price of oil rose sharply, the quantity of oil fell, lending credence to the view that a contemporaneous shift in the supply of oil had taken place. It is common to attribute the fall in oil output and the rise in the price of oil to the 1973 war and the subsequent oil embargo (see Hamilton, 1999). As we will show, this interpretation is by no means obvious, because excess demand in the oil market would have induced an unprecedented increase in oil prices at the end of 1973, even in perfectly competitive markets. 
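The claim that a rising price and a falling quantity need not signal a supply shift can be checked with a small numerical sketch. The linear demand and supply curves and all numbers below are hypothetical and chosen only for illustration. The sketch also abstracts from inflation by holding the contract price fixed, and it assumes that producers meet all demand at the contract price while the contract is in force.

```python
def demand(price, shift=0.0):
    """Hypothetical linear oil demand; `shift` is an outward demand shift."""
    return 100.0 - 2.0 * price + shift


def supply(price):
    """Hypothetical linear oil supply curve; it never shifts in this sketch."""
    return 20.0 + 2.0 * price


def market_clearing_price(shift=0.0):
    """Solve 100 - 2p + shift = 20 + 2p for p."""
    return (80.0 + shift) / 4.0


def disequilibrium_path(shift):
    """Return (price, quantity) at three stages of the story."""
    # Stage 0: initial equilibrium before the demand shift.
    p0 = market_clearing_price(0.0)
    q0 = supply(p0)
    # Stage 1: demand shifts out, but the contract price stays at p0 and
    # producers deliver whatever is demanded at that price.
    p1 = p0
    q1 = demand(p1, shift)
    # Stage 2: the contract price is abandoned; price jumps to the
    # market-clearing level and quantity moves back onto the supply curve.
    p2 = market_clearing_price(shift)
    q2 = supply(p2)
    return [(p0, q0), (p1, q1), (p2, q2)]


for stage, (price, quantity) in enumerate(disequilibrium_path(shift=20.0)):
    print(f"stage {stage}: price = {price:.1f}, quantity = {quantity:.1f}")
```

Between the last two stages the price rises and the quantity falls, which is the pattern usually read as a supply disruption, even though only the demand curve has moved.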
For expository purposes consider a two-period model of the oil market dynamics in the early 1970s (see Figure 10). In period 1, starting from the equilibrium point A, a shift in demand for oil as a result of expansionary monetary policy raises the shadow price for oil. The new market-clearing price at point B, however, is never realized, because the price of oil is effectively held back by long-term contractual agreements (see Penrose, 1976).24 Instead, we move from A to C, corresponding to an increase in the quantity of oil supplied at the old price. In period 2, OPEC reneges on the contractual price, and raises the oil price to the market-clearing level (D = B) while reducing the quantity supplied. The price and quantity movements in period 2 have the appearance of an oil supply shock, yet the supply curve never shifts; we are witnessing the correction of a disequilibrium resulting from the earlier demand shift.

24. The essential point here is that the price of oil in the early 1970s remained substantially below market-clearing level in the presence of excess demand. The assumption of a fixed price is an oversimplification designed to allow us to abstract from the effects of inflation. The price of oil actually fell in real terms in the early 1970s, despite efforts by OPEC to offset these losses (see Figures 6 and 9).

Figure 10 A TWO-PERIOD DISEQUILIBRIUM ANALYSIS OF THE OIL MARKET
Notes: In period 1, starting from the equilibrium point A, a shift in demand for oil as a result of expansionary monetary policy raises the shadow price for oil. Given the fixed contractual price of oil, production increases and we move to C. In period 2, OPEC reneges on the contractual price, and raises the oil price to the market-clearing level D while reducing the quantity supplied.

Our stylized model of the 1973-1974 oil market dynamics is consistent both with the absence of significant increases in the real price of oil and the observed increase in oil production in the early 1970s. It also is consistent with the fall in the quantity of oil produced and the sharp increase in the OPEC oil price in 1973-1974. The 1973-1974 episode illustrates the point that fundamental identification problems need to be addressed before we can assess the effect of exogenous political events in the Middle East on the price of oil. As we have shown, the observed price and quantity movements in 1973-1974 are consistent both with supply interruptions and with the restoration of equilibrium after the removal of price ceilings. Our model also is consistent with the views of oil economists such as Mabro (1998, p. 10) that "a major political crisis will not cause a price shock when capacity cushions exist in other countries, while excess demand would cause prices to flare even in the absence of any political crisis." The fact that the cumulative rise in the oil price did not exceed the cumulative increase in other industrial commodity prices suggests that the actual oil price in January 1974 was probably not far from the market-clearing level. As we will argue later, OPEC market power played a more important role in determining the price of oil only after January 1974, when OPEC attempted to stabilize the price of oil at its peak level, even as the U.S. economy slid into recession and other commodity prices fell sharply.

An alternative interpretation of the oil price increase of 1973-1974 has been proposed by Hamilton (1999). Hamilton stresses the role of oil supply interruptions that are exogenous to the state of the U.S. macroeconomy. He discusses several such supply interruptions that in his view were caused by "military conflicts" and "wars" [including (1) the October 1973 war; (2) the Iranian revolution of late 1978; (3) the outbreak of the Iran-Iraq war in September 1980; and (4) the Gulf war of 1990-1991]. There is some doubt, however, about the extent to which these events were truly exogenous. Hamilton is not explicit about the nature of the causal link from military conflict to exogenous production cutbacks. In some cases, for example in discussing the Gulf war (p. 28) or the Iraq-Iran war (in his Appendix B), he clearly has in mind the physical destruction of oil facilities and the war-induced disruption of oil shipping.25 In contrast, the production cutbacks in late 1973 clearly were not caused directly by military conflict.26 In fact, most of the production cutbacks occurred only after the war (which lasted from October 6 to October 23, 1973) as part of an oil embargo by Arab oil producers. In his Appendix B, Hamilton postulates a causal link from the October war to this oil embargo. This link is questionable.
Unlike the war itself, the oil embargo is not an exogenous political event. There is considerable evidence that oil producers carefully considered the economic feasibility of the oil embargo.27 In fact, the oil embargo was contemplated as early as July 1973, well before the October war (see Arad and Smernoff, 1975, p. 124), and United States officials were aware of that threat.28 Although some countries announced a first stage of production cuts as early as October 18 (in the last week of the war), the embargo was tightened only after hostilities had ended on October 23. Not surprisingly, the oil embargo was lifted without its original political goals being achieved, as soon as oil prices had reached a sufficiently high level. Concern for the Arab cause lasted only as long as it was economically expedient.

Moreover, contrary to popular perception (see the quotation from the Economist in Section 1), the oil embargo was not associated with a quadrupling of oil prices. In actuality, the price increase that coincided with the embargo was only half as large. The other half of the oil price increase occurred well before the embargo. In 1971, the basic structure of the contractual agreements between oil companies and OPEC countries had been renegotiated at conferences in Teheran and Tripoli. These agreements were long-term in nature. Neither the oil companies nor the OPEC governments anticipated the subsequent successive dollar devaluations in 1971 and 1973, the rapid rise in U.S. inflation, and the extraordinary surge in the demand for oil in 1972 and 1973. In response to these events, OPEC countries and oil companies repeatedly renegotiated the conditions of their contracts. Posted prices of light Arabian crude gradually rose from $2.29 in June 1971 to $2.90 in February 1973. In June 1973, pressure mounted to abandon the framework of Teheran and Tripoli and for governments to set posted prices unilaterally. By September 1973, all the OPEC countries were prepared formally to request a revision of the price agreements, as the gap between market prices and posted prices widened. Negotiations opened on October 8, two days after the outbreak of the October 1973 war. OPEC proposed to raise the price of oil to $5.12. The oil companies stalled for time. On October 16, 1973, OPEC renounced the Teheran and Tripoli agreements and unilaterally adopted the proposal they had earlier put before the companies. As Penrose (1976, p. 50) notes, "the October 1973 increases in posted prices were not related to the war, but to the fact that the assumptions underlying the Teheran agreement had proved unjustified. The exporting countries therefore felt that the . . . prices agreed upon in Teheran required adjustment to the new market and monetary conditions. These conditions were not, in their view, of their own making, since they had not cut back supplies. . . ." In fact, the quantity of oil supplied by OPEC had gradually increased from 29.9 million barrels per day in January 1973 to 32.7 million barrels per day in September 1973. Thus, as we showed earlier, not only are there strong reasons to doubt that the second doubling of oil prices on January 1, 1974, was caused by exogenous oil supply cuts, but there is overwhelming evidence that the initial doubling of the oil price to $5.12/barrel was due to increased demand for oil.

25. Hamilton (1999, p. 28) refers to "a number of historical episodes in which military conflicts produce dramatic and unambiguous effects on the petroleum production from particular fields" such as the Iraqi invasion of Kuwait in July 1990.

26. During this war only Syrian and Iraqi oil facilities sustained battle damage. Neither country was a major oil producer, and the loss of oil output was small. The bulk of the reduction in oil output that did occur in late 1973 can be attributed to countries that were not directly involved in the war, but chose to restrict output, notably Saudi Arabia and Kuwait (see U.S. Energy Information Administration, 1994, p. 307).

27. An early example is King Faisal of Saudi Arabia's rejection in 1972 of the use of the oil weapon on economic grounds (see Terzian, 1985, p. 164). That decision was reversed in late 1973, when more than a third of U.S. oil consumption was accounted for by imports. Similarly, during the 1971 Teheran negotiations with the major oil companies, the Gulf states threatened to impose an oil embargo, but never implemented it. Again, during Israel's invasion of Lebanon in 1982 (which coincided with high oil prices, a global recession, and high real interest rates), an oil embargo was considered by the Organization of Arab Petroleum Exporting Countries, but rejected as inconsistent with the economic interests of the organization (see Yergin, 1992, pp. 582, 719; Skeet, 1988, p. 187).

28. Arad and Smernoff (1975, p. 190) note that in July 1973 the Committee on Emergency Preparedness of the National Petroleum Council issued a report that concluded that an interruption of petroleum imports into the U.S. was likely as early as January 1974, based on data on the dependence of the U.S. on oil imports.

8.3 WHY DID THE 1979-1980 OIL PRICE INCREASE OCCUR WHEN IT DID?
We now turn to the second major oil price increase of the 1970s, which took place in 1978-1980. As in the early 1970s, there is clear evidence of an output boom, unusually low real interest rates, and rising inflation prior to 1980. The rapid growth was fueled by the renewed worldwide monetary expansion documented in Section 4. Although this expansion was reflected in a sustained increase in industrial commodity prices in 1976-1979, the increase in other commodity prices was dwarfed by the increase in oil prices that started in late 1978 (see Figure 6). Since the surge in oil prices not only far exceeded inflation adjustments, but also was not supported by a corresponding tightening in other commodity markets, it must have reflected additional developments specific to the oil market. Judging by the increase in other industrial commodity prices in 1978-1979, at best one-third of the actual oil price increase appears to be consistent with the monetary model. In that respect, the second oil crisis appears fundamentally different from the first oil crisis of 1973-1974.29

The inability of the monetary model to explain more than one-third of the oil price increase in 1979-1980 does not imply that the other two-thirds of the increase was due to oil production cutbacks caused by the Iranian revolution in late 1978 and the outbreak of the Iran-Iraq war in September 1980, as suggested by Hamilton (1999). First, taking into account the offsetting production increases by other oil producers such as Saudi Arabia, the production shortfall in early 1979 was not nearly as dramatic as suggested by Hamilton (1999). Global production in October, November, and December 1978 exceeded the September 1978 level. Only in January and February of 1979, at the height of turmoil in Iran, did global oil production fall significantly below its September 1978 level, by 4% and 3%, respectively (see U.S. Energy Information Administration, 1994, p. 312).30 Moreover, total annual OPEC oil production in 1979 was 4% higher than in 1978 (see Skeet, 1988, p. 244).

Second, the timing of the oil price increase suggests that physical production shortfalls narrowly defined are not the cause of the oil price surge. The bulk of the oil price increases occurred well after the Iranian revolution was over and well before the outbreak of the Iran-Iraq war. Specifically, during the Iranian revolution, between October 1978 and April 1979, the average price of U.S. oil imports rose by only about $3/barrel (see DRI database). In February 1979, Iran announced the resumption of exports, and by April 1979, global oil production matched the September 1978 level. The main surge in oil prices began only in May 1979, at a time when global oil production exceeded its September 1978 level (see U.S. Energy Information Administration, 1994, p. 312). Between May and October 1979 alone, oil prices rose from $19 to $25 per barrel. Oil prices continued to climb to almost $34 by April 1980, when the armies of Iran and Iraq were first put on alert (see Terzian, 1985, p. 279). The war broke out in September 1980. In December 1980, oil was still under $36. It finally rose to a peak of $39 in February 1981.

29. Also note that, unlike in 1973-1974 when oil prices doubled in a single day, the oil price increase in 1979-1980 was much more gradual. One reason is that, unlike in the early 1970s, OPEC oil prices had not been held back by what was effectively a price ceiling. Thus, the observed oil price dynamics cannot be explained by a disequilibrium adjustment of the kind described in Figure 10.
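The timing argument can be checked with simple arithmetic on the price levels quoted above. The sketch below is only illustrative: it treats the approximate figures in the text ("almost $34", "still under $36") as point values, so the resulting shares are rough, and the names `milestones` and `cumulative_shares` are hypothetical.

```python
# Approximate oil price levels (dollars per barrel) as quoted in the text.
milestones = [
    ("May 1979", 19.0),
    ("October 1979", 25.0),
    ("April 1980", 34.0),     # text: "almost $34"
    ("December 1980", 36.0),  # text: "still under $36"
    ("February 1981", 39.0),  # peak
]


def cumulative_shares(points):
    """Share of the total rise (first to last point) reached by each later date."""
    start = points[0][1]
    total = points[-1][1] - start
    return [(label, (price - start) / total) for label, price in points[1:]]


shares = cumulative_shares(milestones)
for label, share in shares:
    print(f"{label}: {share:.0%} of the May 1979 to February 1981 rise")
```

On these rounded figures, roughly three quarters of the rise from $19 to $39 had already occurred by April 1980, several months before the war broke out in September 1980.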
One explanation of the additional oil price rise that occurred between mid-1979 and mid-1980 that has been proposed in the literature is a temporary surge in precautionary demand in response to increased uncertainty about future oil supplies and expectations of strong future demand (see Adelman, 1993, p. 428). The uncertainty-based explanation of higher oil prices, however, does not seem plausible in the absence of taut demand conditions in the oil market, which in turn were driven in no small measure by a booming world economy and low real interest rates. The fact that a large number of military conflicts and incidents in the Gulf region in subsequent years did not lead to sustained increases in oil prices suggests that increased Middle East uncertainty appears to have little or no effect on oil prices in the absence of favorable macroeconomic conditions.31 It is no coincidence that oil prices (as well as non-oil commodity prices) peaked shortly after Paul Volcker launched a sharp monetary contraction resulting in a global recession and high real interest rates. Weakening demand played a crucial role in undermining Saudi Arabia's efforts to shore up the oil price between 1982 and 1985 by reducing oil supply. The fact that other OPEC members undercut the official OPEC price in 1982-1985 appears consistent with the view that, in the absence of effective monitoring and punishment, cash-starved oil-producing countries (such as Iraq and Iran) had an incentive to undercut the cartel price in order to increase current revenue. At the same time, competition from other oil producers increased. By the early 1980s, a large number of new oil suppliers such as Egypt, Angola, Malaysia, China, Norway, and the United Kingdom had entered the market in response to the unusually high oil prices of the 1970s, while existing producers including the United States (Alaska), Mexico, and the USSR had invested in new capacity and expanded oil production. By 1982, less than half of world oil was supplied by OPEC, compared with two-thirds in 1977 (see Skeet, 1988, p. 201). The resulting downward pressure on oil prices is consistent with the predictions of the Hotelling model with capacity constraints.

We do not attempt to address in this paper the reasons for the long delay in the decline of oil prices, both in the mid-1970s and in the early 1980s, after the initial monetary expansion was reversed. Although the sharp oil price increases in the 1970s came on the heels of shifts in the demand for oil that, in our view, were directly or indirectly fueled by monetary expansion, OPEC seems to have been adept at restraining official price cutting even in the presence of significant excess capacity.

30. Hamilton notes that Iranian cutbacks in January and February 1979 amounted to almost 9% of the average monthly global oil production for 1978. Using the same data source, after allowing for production increases elsewhere, global oil production in January and February 1979 actually matched or exceeded slightly the average 1978 level (see U.S. Energy Information Administration, 1994, p. 312).
Figure 8 shows that other industrial commodity prices dropped sharply in response to recessions and higher real interest rates, as theory would suggest. Oil prices, however, remained at a much higher level than other commodity prices during 1974-1978 and again during 1981-1985. This differential response after the onset of the 1974-1975 and 1981-1982 recessions is suggestive of the use of OPEC market power to prop up oil prices. As Nordhaus (1980, p. 367) notes, in periods of excess demand, there is little OPEC can do (or would want to do) to impede oil price increases. Once official OPEC prices have risen, however, they tend to be sticky, even when there is a glut in the oil market. Indeed, empirical and anecdotal evidence lends support to the view that OPEC was most influential not in 1973-1974 or in 1979, during the time of the most rapid oil price increases, as popular opinion would suggest, but in preventing oil prices from falling as rapidly as they should have when oil demand subsided (also see Mabro, 1998, pp. 10-11).

31. Examples include the Israeli attack on an Iraqi nuclear reactor in June 1981; a state of near-war between Israel and Syria from April to July 1981 (see Skeet, 1988, p. 181); the invasion of Lebanon by Israel in June 1982; the Iranian Ramadan offensive against Iraq in July 1982 (see Yergin, 1992, p. 764); Iran's threat in July 1983 to blockade the Straits of Hormuz (see Terzian, 1985, p. 323); suicide attacks on the U.S. and French headquarters in Lebanon in October 1983 (see Skeet, 1988, p. 197); the tanker war in the Gulf in February-April 1984, during which at least eleven tankers and the major Iranian oil terminal were hit (see Terzian, 1985, p. 327; Yergin, 1992, p. 743); the Iranian capture of the Fao Peninsula in the southeastern corner of Iraq in February 1986, followed by Iranian artillery and missile attacks on Kuwait's oil ports and Iranian naval attacks on Kuwaiti shipping; the Kuwaiti request for U.S. naval patrols in the Gulf in March 1987 to protect its oil tankers (see Yergin, 1992, p. 765); the Iraqi missile attack on the U.S.S. Stark during the tanker war in May 1987, resulting in the deaths of 36 sailors; and the downing of an Iranian airliner by U.S. forces in the Gulf in July 1988 following skirmishes with Iranian patrols (see Yergin, 1992, p. 766).

9. Lessons from the Most Recent Oil Price Surge

The tripling of oil prices after 1998 provides us with yet another opportunity to test the implications of our explanation of stagflation. Historically, as we have shown in Section 6, oil price increases by themselves have caused excess CPI inflation (relative to inflation in the GDP deflator) for short periods, rather than extended periods of inflation. The current episode is no exception. U.S. data after 1998 show a spike in CPI inflation relative to deflator inflation rates (see Figure 8). In contrast, there has been little movement in the inflation rate of the deflator. This finding is not surprising, because there has not been a major monetary expansion of the kind that was characteristic of the 1970s. Also noteworthy is the fact that, despite higher oil prices, there is no evidence yet of a major contraction, which seems to belie the notion that oil price increases inevitably cause recessions.
Although real interest rates have not been unusually low, cumulative growth rates for the United States have been extraordinarily high, high enough to offset the less than stellar growth performance of Europe and Japan. That increase in output, however, appears to be different from the rapid output growth in the 1970s that was largely fueled by monetary expansion. The very strong real growth in the past several years, especially in the U.S., has reflected an increase in potential output rather than being "demand" generated (see Basu, Fernald, and Shapiro, 2000). Our analysis suggests that this strong growth in output was instrumental in supporting the increase in oil prices in 1999-2000. If the United States had been in a recession during 1999-2000 and U.S. demand for oil had been low, it would have been hard for OPEC to enforce high oil prices over extended periods. The ability of a cartel like OPEC to sustain prices above the competitive level depends on a conducive macroeconomic environment. If there is a significant contraction of the economy, historical experience suggests that OPEC will have an uphill battle maintaining the current level of oil prices.

Both oil and other commodity prices fell sharply after the Asian crisis, yet only oil prices have strongly rebounded. This discrepancy is suggestive of a larger role for OPEC after 1998 than in earlier episodes. One interpretation we can rule out for sure is that OPEC has been reacting to exogenous political events in the Middle East. Certainly, the latest major oil price increase was not preceded by physical production cutbacks "induced by war" along the lines of Hamilton (1999). In fact, oil prices rose during the period of peacemaking between Israel and the Palestinians, but Arab leaders refused to use the "oil weapon" when the recent confrontations erupted (see Washington Post, 2000). This is not surprising, given the already high level of oil prices, and certainly is consistent with our view that in previous episodes political factors were allowed to play a role in setting oil prices only to the extent that they did not conflict with economic objectives and constraints.

What then enabled OPEC to consolidate its power after its influence had been declining since the late 1980s? There are two reasons. First, other oil producers (such as Norway and Mexico) that are not part of OPEC effectively joined forces with OPEC, raising the organization's effective market share, which had declined dramatically in the 1980s (see New York Times, 2001). The consolidation of OPEC is consistent with theoretical models of cartels, such as Green and Porter (1984), that lead us to expect that cartels will flourish in periods of strong economic growth. Second, there is evidence that oil producers across the world, with the possible exception of Saudi Arabia, were once again operating close to capacity, and that few additional oil supplies were likely to be forthcoming in the short run.
This scarcity was arguably driven in important part by strong demand for crude oil. The rise in oil prices coincided with a shortage of oil tankers, and freight rates for crude oil shipments have increased sharply, suggesting high demand for oil. Thus, the problem appears to have been one of insufficient inventories in the face of rapidly rising demand for oil, rather than a global supply cut. This view is further supported by the sharp drop of crude-oil prices in late December 2000 from a peak of more than $37 per barrel to below $27 upon news reports of an impending U.S. recession, despite low inventories, Middle East turmoil, one of the coldest winters in recent memory, and the high likelihood that most of Iraq's oil exports would remain off global markets. Predictably, OPEC will attempt to stem the expected decline in oil prices by announcing production cutbacks, as it did after each of the major oil price increases in the 1970s when demand for oil began to slacken (see New York Times, 2001). How long OPEC will be able to sustain high real oil prices will depend on the depth of the economic
downturn as well as the extent to which new non-OPEC oil supplies will be forthcoming in response to higher oil prices.

10. Concluding Remarks

The origins of stagflation and the possibility of its recurrence continue to be an important concern among policymakers and in the popular press. Our analysis suggests that in substantial part the Great Stagflation of the 1970s could have been avoided, had the Fed not permitted major monetary expansions in the early 1970s. We demonstrated that the stagflation observed in the 1970s is unlikely to have been caused by supply disturbances such as oil shocks. This point is important, because to the extent that stagflation is due to exogenous supply shocks, any attempt to lower inflation would worsen the recession. In contrast, if we are right that stagflation is first and foremost a monetary phenomenon, then stagflation does not present an inevitable "policy dilemma." We conclude that oil price increases by themselves are unlikely to reignite stagflation, as long as the Federal Reserve refrains from excessively expansionary monetary policies. Moreover, a sustained increase in the real price of oil is unlikely in the absence of a conducive macroeconomic environment in OECD countries.

REFERENCES

Abel, A., and B. Bernanke. (1998). Macroeconomics, 3rd ed. Reading, MA: Addison-Wesley.
Adelman, M. A. (1993). The Economics of Petroleum Supply. Cambridge, MA: The MIT Press.
Arad, U. B., and B. J. Smernoff. (1975). American Security and the International Energy Situation. New York: Hudson Institute.
Barro, R. J. (1987). Macroeconomics, 2nd ed. New York: Wiley.
Barro, R. J., and D. Gordon. (1983). A positive theory of monetary policy in a natural rate model. Journal of Political Economy 91:589-610.
Basu, S., J. Fernald, and M. D. Shapiro. (2000). Productivity growth in the 1990s: Technology, utilization or adjustment? Department of Economics, University of Michigan. Mimeo.
Bernanke, B. S. (1983). Irreversibility, uncertainty, and cyclical investment. Quarterly Journal of Economics 98:85-106.
Bernanke, B. S., and M. Gertler. (1995). Inside the black box: The credit channel of monetary policy transmission. Journal of Economic Perspectives 9:27-48.
Bernanke, B. S., M. Gertler, and M. W. Watson. (1997). Systematic monetary policy and the effects of oil price shocks (with discussion). Brookings Papers on Economic Activity 1:91-148.
Bernanke, B. S., and I. Mihov. (1998). Measuring monetary policy. Quarterly Journal of Economics 113(3):869-902.
Berndt, E. R., I. M. Cockburn, and Z. Griliches. (1996). Pharmaceutical innovations and market dynamics: Tracing effects on price indexes for antidepressant drugs. Brookings Papers on Economic Activity: Microeconomics 133-188.
Blinder, A. (1979). Economic Policy and the Great Stagflation. New York: Academic Press.
Bohi, D. R. (1989). Energy Price Shocks and Macroeconomic Performance. Washington: Resources for the Future.
Bordo, M. D., and B. Eichengreen. (1993). A Retrospective on the Bretton Woods System: Lessons for International Monetary Reform. NBER project report. Chicago: University of Chicago Press.
Bordo, M. D., and F. E. Kydland. (1996). The gold standard as a commitment mechanism. In Modern Perspectives on the Gold Standard, T. Bayoumi, B. Eichengreen, and M. P. Taylor (eds.). Cambridge: Cambridge University Press, pp. 55-100.
Boschen, J. F., and L. O. Mills. (1995). The relation between narrative and money market indicators of monetary policy. Economic Inquiry 33(1):24-44.
Bruno, M., and J. Sachs. (1985). Economics of Worldwide Stagflation. Cambridge, MA: Harvard University Press.
Burns, A. F. (1978). Reflections of an Economic Policy Maker. Speeches and Congressional Statements: 1969-1978. Washington: American Enterprise Institute for Public Policy Research.
Cagan, P. (1979). Persistent Inflation: Historical and Policy Essays. New York: Columbia University Press.
Calvo, G. (1983). Staggered prices in a utility-maximizing setting. Journal of Monetary Economics 12(3):383-398.
Christiano, L. J., M. Eichenbaum, and C. L. Evans. (1996). The effects of monetary policy shocks: Some evidence from the flow of funds. Review of Economics and Statistics 78:16-34.
Christiano, L. J., and C. Gust. (2000). The expectations trap hypothesis. International Finance Discussion Paper No. 676. Board of Governors of the Federal Reserve.
Clarida, R., J. Gali, and M. Gertler. (2000). Monetary policy rules and macroeconomic stability: Evidence and some theory. Quarterly Journal of Economics 115:147-180.
Darmstadter, J., and H. H. Landsberg. (1976). The economic background. In The Oil Crisis, R. Vernon (ed.). New York: Norton.
De Long, B. J. (1997). America's peacetime inflation: The 1970s. In Reducing Inflation: Motivation and Strategy, C. Romer and D. Romer (eds.). Chicago: University of Chicago Press.
Economist. (1999). "Oil's Pleasant Surprise." November 27, p. 16.
Fisher, I. (1906). The Rate of Interest. New York: MacMillan.
Friedman, M. (1975). Perspectives on Inflation. Newsweek, June 24, p. 73.
Genberg, H., and A. K. Swoboda. (1993). The provision of liquidity in the Bretton Woods system. In A Retrospective on the Bretton Woods System: Lessons for International Monetary Reform, M. D. Bordo and B. Eichengreen (eds.). NBER project report. Chicago: University of Chicago Press, pp. 269-315.
Gordon, R. J. (1984). Supply shocks and monetary policy revisited. American Economic Review 74:38-43.
Green, E. J., and R. H. Porter. (1984). Noncooperative collusion under imperfect price information. Econometrica 52(1):87-100.
Griliches, Z., and I. Cockburn. (1994). Generics and new goods in pharmaceutical price indexes. American Economic Review 84(5):1213-1232.
Hamilton, J. D. (1983). Oil and the macroeconomy since World War II. Journal of Political Economy 91:228-248.
Hamilton, J. D. (1985). Historical causes of postwar oil shocks and recessions. The Energy Journal 6:97-115.
Hamilton, J. D. (1988). A neoclassical model of unemployment and the business cycle. Journal of Political Economy 96:593-617.
Hamilton, J. D. (1999). What is an Oil Shock? Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 7755.
Hetzel, R. L. (1998). Arthur Burns and inflation. Federal Reserve Bank of Richmond Economic Quarterly 84:21-44.
Heal, G., and G. Chichilnisky. (1991). Oil and the International Economy. Oxford: Clarendon Press.
Holland, S. (1998). Existence of competitive equilibrium in the Hotelling model with capacity constraints. Department of Economics, University of Michigan. Mimeo.
Hotelling, H. (1931). The economics of exhaustible resources. In Microeconomics: Theoretical and Applied, Vol. 1. R. E. Kuenne (ed.). Aldershot: Elgar, 1991, pp. 356-394.
Houthakker, H. (1987). The ups and downs of oil. In Deficits, Taxes, and Economic Adjustments, P. Cagan (ed.). Washington: American Enterprise Institute.
International Monetary Fund. International Financial Statistics, various issues.
Kimball, M. S. (1995). The quantitative analytics of the basic neomonetarist model. Journal of Money, Credit, and Banking 27(4), Part 2: 1241-1277.
King, R., and M. Goodfriend. (1997). The new neoclassical synthesis and the role of monetary policy. In NBER Macroeconomics Annual, No. 12. B. S. Bernanke and J. J. Rotemberg (eds.). Cambridge, MA: MIT Press.
Kydland, F. E., and E. C. Prescott. (1977). Rules rather than discretion: The inconsistency of optimal plans. Journal of Political Economy 85:473-492.
Leeper, E. M. (1997). Narrative and VAR approaches to monetary policy: Common identification problems. Journal of Monetary Economics 40(3):641-657.
Lucas, R. E. (1972). Expectations and the neutrality of money. Journal of Economic Theory 4:103-124.
Lucas, R. E. (1973). Some international evidence on output-inflation tradeoffs. American Economic Review 63:326-334.
Mabro, R. (1998). OPEC behavior 1960-1998: A review of the literature. The Journal of Energy Literature 4:3-27.
McKie, J. W. (1976). The United States. In The Oil Crisis, R. Vernon (ed.). New York: Norton.
McKinnon, R. I. (1982). Currency substitution and instability in the world dollar standard. American Economic Review 72(3):320-333.
Mundell, R. A. (2000). A reconsideration of the twentieth century. American Economic Review 90(3):327-340.
National Commission on Supplies and Shortages. (1976). The Commodity Shortages of 1973-1974. Case Studies. Washington: U.S. Government Printing Office.
Nelson, E. (1998). Sluggish inflation and optimizing models of the business cycle. Journal of Monetary Economics 42:303-322.
New York Times. (2001). For OPEC, cuts in production are a delicate balancing act. January 11, p. 1.
Nordhaus, W. D. (1980). Oil and economic performance in industrial countries. Brookings Papers on Economic Activity 2:341-388.
Orphanides, A. (2000). Activist stabilization policy and inflation: The Taylor rule in the 1970s. Board of Governors of the Federal Reserve. Discussion Paper.
Orphanides, A., R. D. Porter, D. Reifschneider, R. Tetlow, and F. Finan. (1999). Errors in the measurement of the output gap and the design of monetary policy. Board of Governors of the Federal Reserve. Discussion Paper.
Penrose, E. (1976). The development of crisis. In The Oil Crisis, R. Vernon (ed.). New York: Norton.
Roberts, S. (1984). Who Makes the Oil Price? An Analysis of Oil Price Movements 1978-1982. Oxford: Oxford Institute for Energy Studies.
Rogoff, K. (1985). The optimal degree of commitment to an intermediate monetary target. Quarterly Journal of Economics 100:1169-1190.
Rotemberg, J. (1982). Sticky prices in the United States. Journal of Political Economy 90(6):1187-1211.
Rotemberg, J. (1996). Prices, output, and hours: An empirical analysis based on a sticky price model. Journal of Monetary Economics 37(3):505-533.
Rotemberg, J., and G. Saloner. (1986). A supergame-theoretic model of business cycles and price wars during booms. American Economic Review 76:390-407.
Rotemberg, J., and M. Woodford. (1996). Imperfect competition and the effects of energy price increases on economic activity. Journal of Money, Credit, and Banking 28(4), Part 1: 550-577.
Samuelson, P. A. (1974). Worldwide stagflation. In Collected Scientific Papers, Vol. 4. H. Nagatani and K. Crowley (eds.). Cambridge, MA: The M.I.T. Press, 1977.
Samuelson, P. A., and R. M. Solow. (1960). Analytical aspects of anti-inflation policy. American Economic Review 50(2):177-194.
Sargent, T. J. (1998). The Conquest of American Inflation. Hoover Institution.
Seymour, I. (1980). OPEC: Instrument of Change. London: Macmillan.
Skeet, I. (1988). OPEC: Twenty-Five Years of Prices and Politics. Cambridge: Cambridge University Press.
Taylor, J. (1979). Staggered wage setting in a macro model. American Economic Review 69(2):108-113.
Terzian, P. (1985). OPEC: The Inside Story. London: Zed Books.
United States Energy Information Administration. (1994). Historical Monthly Energy Review: 1973-1992. U.S. Government Printing Office.
Washington Post. (2000). Arab leaders meet for summit. October 21.
Yergin, D. (1992). The Prize.
The Epic Quest for Oil, Money, and Power. New York: Simon and Schuster.
Comment
OLIVIER BLANCHARD
Massachusetts Institute of Technology
1. Introduction

Revisionist history is always fun. But it is not always convincing. I have enjoyed thinking about the thesis developed by Barsky and Kilian. But I am not convinced.
2. The Noncontroversial Part: The Role of Money in the Early 1970s

Not all of the paper is revisionist; indeed, some of it is less so than it sounds. There is, I believe, wide agreement that money played a major part in what happened in the early to mid-1970s. Most observers would in particular agree to the following propositions (much of what follows can, for example, be found in the book by Michael Bruno and Jeffrey Sachs on the Economics of Stagflation, published in 1985):

Expansionary monetary policy was an important factor in stimulating growth and reducing unemployment in the United States after the 1970 recession. By 1973, the unemployment rate was 4.9%, down from 6.0% in 1971. U.S. inflation came down until 1972, and then started increasing in 1973, suggesting that the unemployment rate was then close to the natural rate. It is therefore likely that, had the output expansion continued at the same rate after 1973, inflation would have further increased, even absent any changes in the relative price of oil.

Expansionary monetary policy in the United States, and the attempts by foreign central banks to maintain the value of the dollar, led to large induced monetary expansions abroad. Growth in the other major OECD countries (countries which, in contrast to the United States, had not had a recession in 1970) continued to be high. Unemployment continued to be very low. Inflation in the EEC steadily increased, nearly doubling between 1970 and 1973. There again, a slowdown in activity was clearly needed, and would have come, even absent the increase in the price of oil.

This world monetary expansion was associated with low nominal interest rates, and even lower real interest rates. This, combined with strong world demand, was more than enough to trigger an increase in the price of commodities and raw materials some time before the increase in the price of oil.
In short, even absent the increase in the price of oil, 1974 and 1975 would have seen increasing inflation, a slowdown in growth, or both. Most likely, given the attitude of central banks from 1973 on, the outcome would have involved monetary tightening and a slowdown in growth.

3. Controversial Point 1: One Can Explain Stagflation within a Model with Only Monetary Shocks

Let me start with what looks like a semantic issue, but is in fact more.
Barsky and Kilian define "stagflation" as the coincidence of "low or negative output growth" and "high inflation" (i.e., Δu > 0, π high). They argue, rightly, that this is easy to generate in a model with nominal rigidities. In effect, we know that inflation builds up slowly after a monetary expansion. At some point, high inflation leads to a decrease in real money, which in turn leads to a decrease in output growth; at that point there is indeed high inflation and low, possibly negative, output growth.

A more conventional definition of "stagflation," however, is the coincidence of high unemployment and increasing inflation (i.e., u high, and Δπ > 0). Why does this semantic discussion matter? Because (1) what was observed in the 1970s was indeed a combination of high unemployment and increasing inflation, i.e., stagflation according to the second definition; and (2) this combination is very hard to generate in response to changes in money growth alone. Let me develop both points.

On the empirical evidence: The average unemployment rate from 1973 to 1975 was 6.4%, substantially higher than what the natural rate of unemployment had been until then. And, over the same period, the increase in inflation was around 5 percentage points. The period was one of high unemployment and increasing inflation.

On the theoretical proposition: Go back to a conventional Phillips-curve relation:
Inflation minus expected inflation is a decreasing function of the distance between the actual unemployment rate u_t and the natural unemployment rate ū. In the absence of supply shocks, ū is a constant. Stagflation (according to the second definition, and the 1973-1975 facts) implies the coincidence of increasing inflation, viz.
and of unemployment above the natural rate, viz. u_t − ū > 0, so that, by implication,
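The displayed relations in this passage can be sketched as follows. The linear form and the slope coefficient a > 0 are assumptions made here for illustration; the text itself only requires that the relation be decreasing in u_t − ū.

```latex
\begin{align*}
\pi_t - \pi_t^{e} &= -a\,(u_t - \bar{u}), \qquad a > 0
  && \text{(Phillips curve; linear form assumed)} \\
\pi_t - \pi_{t-1} &> 0
  && \text{(increasing inflation)} \\
u_t - \bar{u} &> 0
  && \text{(unemployment above the natural rate)} \\
\Longrightarrow \quad \pi_t^{e} - \pi_{t-1} &> \pi_t - \pi_{t-1} > 0
  && \text{(implication)}
\end{align*}
```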
In words, the expected increase in inflation must exceed the actual increase in inflation, and this at a time at which inflation is increasing. It is difficult to think of expectation formation mechanisms which will naturally deliver this result. The learning model presented by Barsky and Kilian does not. In general, learning does not seem promising here: It seems more likely to lead to the opposite inequality, with the expected increase in inflation lagging behind the actual increase.

So, how can one generate stagflation? By having an increase in the natural rate ū, or, equivalently for our purposes, a positive disturbance to the Phillips-curve relation. A natural candidate is an increase in the price of oil, which generates exactly such an increase in ū. Thus, the traditional focus on supply shocks to explain the 1970s.

This argument, however, suggests one way out for proponents of the monetary-policy explanation. Barsky and Kilian do not push it explicitly, but clearly they could, as it is in the spirit of their paper. If expansionary monetary policy leads to an increase in the relative price of oil, then one indeed can in principle generate stagflation just from monetary shocks. This leads to the second major issue, the degree to which one can think of the large increases in the price of oil in the 1970s as endogenous and triggered by monetary policy.

4. Controversial Point 2: The Increase in the Price of Oil in the 1970s Was an Endogenous Response to a Money-Driven World Boom

Here, theory is on the side of Barsky and Kilian. Oil is a natural resource. Current or anticipated increases in demand or decreases in the real interest rate should both lead to an increase in the current price. And, indeed, the early 1970s were a period of high demand and low real rates. The problem is empirical. I am no expert on the oil market.
But from my reading of the literature, I have the strong feeling that most experts agree: The two increases in the price of oil were not the natural, if delayed, response to demand and interest rates, but were mostly the result of successful cartelization. Agreement among experts is surely no proof. But the hypothesis of endogenous increase in the price of oil runs into a number of obvious problems (my very limited knowledge on these issues is largely based on M. A. Adelman's writings, in particular Adelman, 1993):

The degree to which the price of a good depends on the interest rate depends on the degree to which it is a fixed rather than a renewable resource. And here, the evidence seems to be that oil behaves more like the second than the first. The amount of so-called "proven resources" has consistently increased over time, despite the steady extraction of oil from the ground.

The degree to which the price depends on current and anticipated demand depends on the slope of the marginal-cost curve, relating the cost of extraction to the flow of extraction. Evidence suggests that, in the 1970s, the marginal-cost curve was rather flat: that there were a large number of fields from which oil could be extracted at close to the same cost per barrel, a cost far below the price which prevailed from 1974 on.

The increase in prices in the mid-1970s was associated with a decrease in production, which is not what you would expect to see in response to a shift in demand. The issue is taken up by Barsky and Kilian, who provide a creative, if not totally convincing, answer. But there is another empirical problem along the same lines: The increase in prices in the mid-1970s was associated with a decrease in production for the low-cost OPEC countries, and an increase in the production for the high-cost non-OPEC countries. This is hard to explain without giving some central role to OPEC in the story.

With the world recession of the mid-1970s, the high-demand conditions, which might have justified the high price of oil earlier, largely vanished. And later on in the decade, tighter monetary policy led to much higher real rates. Yet the price of oil remained high. The paper, and the literature, invoke "ratchet effects" (in the form of higher excise taxes, imposed by the oil-producing countries on oil companies); but this begs the question: Why not invoke the same mechanisms for the initial increase in the price of oil?
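For reference, the theoretical claim that opens this section (a lower real interest rate raises the current price of an exhaustible resource) can be illustrated with a textbook Hotelling sketch. Everything below is an assumption made for illustration and is not taken from the paper or the comment: extraction is costless, demand is isoelastic with q = p^(-elasticity), the stock is fixed, and the price rises at the rate of interest until the stock is exhausted over an infinite horizon. The function names are hypothetical.

```python
import math


def initial_price(r, stock, elasticity):
    """Current price p0 on a Hotelling path p(t) = p0 * exp(r * t).

    With demand q = p ** (-elasticity), cumulative extraction equals
    p0 ** (-elasticity) / (elasticity * r); setting it equal to the
    stock and solving for p0 gives the closed form below.
    """
    return (elasticity * r * stock) ** (-1.0 / elasticity)


def cumulative_extraction(p0, r, elasticity, horizon=2000.0, steps=100_000):
    """Midpoint-rule check that the price path exhausts the stock."""
    dt = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        price = p0 * math.exp(r * t)
        total += price ** (-elasticity) * dt
    return total


if __name__ == "__main__":
    for rate in (0.05, 0.02):
        p0 = initial_price(rate, stock=100.0, elasticity=1.0)
        print(f"real rate {rate:.2f}: current price {p0:.3f}")
```

In this sketch the current price is higher when the real rate is lower, which is the direction of the effect the paper relies on. The objections listed above concern whether oil is close enough to a fixed stock for this logic to carry much weight.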
5. Controversial Point 3: The Recession of the Mid-1970s Was Due to a Contraction in Money

Here, the problem is again empirical and obvious. The way monetary contraction is supposed to work is through high interest rates. Nominal interest rates increased substantially in 1974 and 1975. This is shown in Figure 1, which plots short-, medium-, and long-term interest rates from the mid-1960s to the early 1990s. But inflation increased by more, and, based on forecasts of inflation at the time, so did expected inflation. This is shown in Figure 2, which plots short-, medium-, and long-term real interest rates, using inflation forecasts of the time, for the same period. Real interest rates were lowest in 1974 and 1975; indeed, in both years, the short real rate was negative, the longer real rates very close to zero. (The numbers are taken from Blanchard, 1993; the construction of the real rates is described in that paper.)

Can the monetary-contraction story survive Figure 2? Yes, if there is a role for the nominal interest rate, for a given real interest rate. And, based on the research on the interaction between inflation, taxation, and intermediation, we can think of a number of channels through which nominal rather than real rates might matter. One may however doubt that the rather modest increase in nominal rates could have been enough to offset the effects of low real rates and generate a recession of the size observed in the mid-1970s. The burden of proof is on the authors on this one.

Figure 1 NOMINAL INTEREST RATES, SHORT, MEDIUM, AND LONG: UNITED STATES, 1966 TO 1992

Figure 2 REAL INTEREST RATES, SHORT, MEDIUM, AND LONG: UNITED STATES, 1966 TO 1992

6. Controversial Point 4: An Increase in the Price of Oil Should Have No Effect on the GDP Deflator; Yet the GDP Deflator Increased Substantially in the Mid-1970s

The argument that the price of oil should not affect the price of value added (the GDP deflator) is perfectly valid as far as it goes, that is, in partial equilibrium. The general-equilibrium closure offered by Barsky and Kilian, with fixed nominal money and fixed employment, is not convincing. Once one uses a more standard macroeconomic closure, the evidence on inflation appears quite consistent with theory.

Let me use the standard toy model here. Assume that consumers consume a bundle composed of a produced good and energy. The produced good is produced using labor. Denote by p_v the (log) price of the produced good (the GDP deflator), by x the (log) relative price of energy, and by p the (log) consumption price index (the CPI). Then assume
The first equation states that the consumption price index is a weighted average of the price of the produced good and the price of energy, with θ as the share of energy in consumption. The second states that the price of the produced good is equal to the nominal wage (the constant terms are unimportant here, so I set them equal to zero). The third equation states that the consumption wage depends on the output gap y. It introduces nominal wage rigidity in the form of a dependence of the wage on both the current and the lagged consumer price index. The fourth equation is a reduced-form aggregate demand equation giving the demand for the produced good as a function of real money balances. The last equation is a money rule.
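One system consistent with this verbal description is sketched below. It is an illustration, not a transcription of the displayed equations: the weight a on the current CPI, the output-gap coefficient c ≥ 0, the use of p_v to deflate money in aggregate demand, and the form of the money rule are all assumptions. They are chosen so that b = 1 implies constant output and so that the numerical example given in the text is reproduced.

```latex
\begin{align*}
p   &= (1-\theta)\,p_v + \theta\,(p_v + x) = p_v + \theta x
  && \text{(CPI as a weighted average)} \\
p_v &= w
  && \text{(price of the produced good equals the wage)} \\
w   &= a\,p + (1-a)\,p_{-1} + c\,y
  && \text{(wage setting with nominal rigidity)} \\
y   &= m - p_v
  && \text{(reduced-form aggregate demand)} \\
m   &= b\,p_v
  && \text{(money rule; } b = 1 \text{ is full accommodation)}
\end{align*}
```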
Solving the model is straightforward. For simplicity, let me just look here at the limit case where b = 1, so there is full accommodation by the central bank, and y remains constant (the difference with Barsky and Kilian is that constancy of output is the result of monetary accommodation, so the nominal money stock is not constant but endogenous). In this case, inflation is given by
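A form of the solution consistent with the numerical example in the text (θ = 0.05, a = 0.5) is sketched here. It assumes the CPI definition p = p_v + θx and a wage equation w = a p + (1 − a) p_{−1} when output is constant, with the weight a placed on the current CPI; that placement is an assumption.

```latex
\begin{align*}
p - \theta x &= a\,p + (1-a)\,p_{-1}
  && \text{(from } p_v = w \text{ and } p = p_v + \theta x\text{)} \\
\pi \equiv p - p_{-1} &= \frac{\theta}{1-a}\,x
  && \text{(CPI inflation)} \\
\pi_v \equiv p_v - p_{v,-1} &= \pi - \theta\,(x - x_{-1})
  && \text{(GDP-deflator inflation)}
\end{align*}
```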
Then:

An increase in the relative price of energy leads to positive inflation, measured using either the CPI or the GDP deflator.

Even if the share of energy in consumption is small, the effect of the relative price increase on inflation can be large. (For example, if θ = 0.05 and a = 0.5, then a 50% increase in the relative price of energy leads to an increase in inflation of 5%.)

Inflation measured using the GDP deflator lags inflation measured using the CPI.
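These three results can be checked with a short simulation of the full-accommodation case. This is a minimal sketch under stated assumptions: the wage equation is taken to be w = a·p + (1 − a)·p_lagged with output constant, the CPI is p = p_v + θ·x, and the relative price of energy steps up permanently in one period. The function name and default values are hypothetical, chosen to match the numerical example in the text.

```python
def simulate_inflation(theta=0.05, a=0.5, x_shock=0.5, periods=6, shock_period=1):
    """Return (t, CPI inflation, GDP-deflator inflation) for each period.

    Assumed model with constant output (full accommodation):
        p   = p_v + theta * x              # CPI
        p_v = w = a * p + (1 - a) * p_lag  # deflator equals the wage
    Substituting gives p = p_lag + theta * x / (1 - a).
    """
    p_lag = 0.0    # lagged log CPI
    pv_lag = 0.0   # lagged log GDP deflator
    rows = []
    for t in range(periods):
        # The relative price of energy rises at shock_period and stays high.
        x = x_shock if t >= shock_period else 0.0
        p = p_lag + theta * x / (1 - a)
        pv = p - theta * x
        rows.append((t, p - p_lag, pv - pv_lag))
        p_lag, pv_lag = p, pv
    return rows


if __name__ == "__main__":
    for t, cpi_inflation, deflator_inflation in simulate_inflation():
        print(f"t={t}  CPI: {cpi_inflation:.3f}  deflator: {deflator_inflation:.3f}")
```

With these values, CPI inflation jumps to 5% in the shock period while deflator inflation reaches only 2.5%, and the deflator catches up one period later. Both then continue at 5% per period for as long as accommodation lasts.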
This is very much what was observed in the mid-1970s. The mechanism behind the increase in inflation is straightforward. Given the wage, the increase in the price of oil increases the CPI. But the implied decrease in consumption wages leads workers to ask for an increase in nominal wages, which leads in turn to an increase in the GDP deflator. Under the convenient assumption of full accommodation made here, inflation goes on forever. Under the more realistic assumption of partial accommodation, lower output eventually puts an end to these price and wage increases, and inflation eventually stops. But three results remain: inflation lasts for some time, it can be quite large relative to the shock, and CPI inflation leads inflation measured using the GDP deflator.
7. Conclusions

I have argued that money cannot be the main culprit for what happened in the mid-1970s. It cannot explain stagflation. The behavior of interest rates does not fit. And most of the movements in the price of oil had little to do with monetary policy.
Does this mean that there are no mysteries left? The answer is an emphatic no. There is plenty we do not understand about what happened in the 1970s, and more generally about the price of oil and economic activity. The list of puzzles is well known, from the surprising finding by Hamilton that most U.S. postwar recessions have been preceded by an increase in the price of oil (a finding which may turn out to be true once more in the near future), to the apparent asymmetry between the effects of increases and decreases in the price of oil, to the sheer size of the two recessions of the 1970s. I thank the two authors for putting the issues back on the research agenda.

REFERENCES

Adelman, M. (1993). The Economics of Petroleum Supply. Cambridge, MA: The MIT Press.
Blanchard, O. (1993). The equity premium. Brookings Papers on Economic Activity 2:75-138.
Comment
ALAN S. BLINDER
Princeton University
This is a fascinating paper, one which is well worth reading. It more than repays the time you spend digesting it. It was also, obviously, meant to be a provocative paper, and I was, appropriately, provoked. In my book Economic Policy and the Great Stagflation (Blinder, 1979), which I dusted off to prepare this comment, I considered the monetary (dare I say "monetarist"?) explanation of the episode and concluded as follows: "If I were forced to summarize the influence of the Fed on the Great Stagflation, I guess I would stress how little difference it made rather than how much." Barsky and Kilian reach a rather different conclusion. Why? Was I so wrong in 1979?

Frankly, I find very few errors of commission in the paper. So the hunt must be for errors of omission. The paper reads a bit like a selective legal brief prepared by a pair of clever trial lawyers. So I'll try to add a few arguments for the defense (of the conventional wisdom) and point out that some of the conclusions that they portray as contrary to that wisdom are actually quite well known. Leaving out many interesting details, their argument comes in five parts, and I'll take each up in turn.
1. A PURELY MONETARIST MODEL OF GO-STOP POLICY GIVES A GOOD EXPLANATION OF THE STAGFLATIONARY EVENTS OF THE 1970s.

The authors indeed demonstrate that they can calibrate a relatively simple model of go-stop monetary policy that matches the cyclical facts without ever mentioning oil prices. Or can they? When I read, in the last paragraph of Section 3, of a predicted "output trough in early 1987" that "closely mirrors" the facts, I wondered what facts those were. I never noticed a 1987 trough, and neither did the NBER dating committee.

Incidentally, the model doesn't even need the "stop" part of "go-stop" to generate stagflation. As we have all known for years, the "go" part alone is enough. In a textbook aggregate-demand-aggregate-supply diagram, if a burst of demand growth (it need not be generated by money) leads the economy to overshoot its potential output, the adjustment back to equilibrium will entail both falling output and rising prices. Or, moving up a derivative to the Phillips-curve diagram, overshooting the natural rate will lead to an adjustment period in which both unemployment and inflation are rising at the same time. Many of us have demonstrated this well-known conclusion to our Economics 101 students, without using difference equations.[1]

[1] See, for example, W. J. Baumol and A. S. Blinder (1999), Economics: Principles and Policy, 8th ed., Harcourt, pp. 590-591.

2. THERE IS ABUNDANT EVIDENCE IN FAVOR OF THE MONETARY EXPLANATION.

It is one thing to claim that a certain type of model can explain a phenomenon; it is quite another to claim that the proffered explanation is empirically relevant. So Barsky and Kilian marshal a variety of empirical support for their monetary explanation of stagflation. But I am not entirely convinced. For example, they show that several bursts of money growth occurred at approximately the right times to set off go-stop (or just go-adjust) cycles. But they never mention the (often simultaneous and offsetting) shifts of velocity, the very things that eventually led almost all economists and policymakers to abandon monetarism. One example comes in 1970 and 1971, when sharp declines in velocity seemed to validate Arthur Burns's perspicacious (or was it lucky?) prediction that falling velocity would obviate most of the dangers of rapid money growth. A second example is the apparent downward shift of the conventional money demand equation in 1974-1976, which Steve Goldfeld dubbed "The Case of the Missing Money."[2]

[2] S. M. Goldfeld (1976), The Case of the Missing Money, Brookings Papers on Economic Activity 3:683-730.
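The "go-adjust" dynamics referred to in points 1 and 2 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' model: expected inflation is assumed equal to lagged inflation (an accelerationist Phillips curve), and unemployment is assumed to return geometrically toward the natural rate after a demand burst has pushed it below. All names and parameter values are hypothetical.

```python
def go_adjust_path(u_natural=5.0, initial_gap=1.0, persistence=0.6,
                   slope=0.5, initial_inflation=3.0, periods=8):
    """Return a list of (t, unemployment, inflation) after a demand overshoot.

    Assumptions (illustrative only):
      * unemployment starts initial_gap points below the natural rate and
        closes the gap geometrically at rate `persistence`;
      * inflation follows pi_t = pi_{t-1} - slope * (u_t - u_natural).
    """
    path = []
    inflation = initial_inflation
    for t in range(periods):
        unemployment = u_natural - initial_gap * persistence ** t
        inflation = inflation - slope * (unemployment - u_natural)
        path.append((t, unemployment, inflation))
    return path


if __name__ == "__main__":
    for t, u, pi in go_adjust_path():
        print(f"t={t}  unemployment={u:.2f}  inflation={pi:.2f}")
```

Along this path unemployment and inflation rise together in every period, yet unemployment never exceeds the natural rate. The overshooting story therefore delivers rising unemployment with rising inflation, but not unemployment above the natural rate with rising inflation.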
I do not wish to dispute the claim that a go-stop cycle occurred in the United States in 1972-1973. It did. But it seems a grave omission to forget the role of fiscal policy. This was the infamous period in which President Nixon opened the federal coffers to assist his 1972 reelection campaign, and then shut them abruptly in 1973. Years ago, Goldfeld and I constructed econometric-model-based measures of fiscal policy which show a titanic runup in fiscal stimulus from 1969 to a peak in 1971 and then an abrupt plunge to fiscal restraint in 1973.[3] Yes, it was an aggregate-demand story; but money growth was not the only actor.

[3] A. S. Blinder and S. M. Goldfeld (1976), New measures of fiscal and monetary policy, 1958-1973, American Economic Review 66(December):780-796.

3. THE BURST OF WORLD MONEY GROWTH WAS LARGELY DUE TO THE BREAKDOWN OF THE BRETTON WOODS CONSTRAINTS ON MONEY CREATION.

I have no quarrel with highlighting the end of the Bretton Woods system as an explanation for the high money growth rates that followed, not only in the United States but in other countries. But think about the timing for a minute. The old system of fixed exchange rates died not all at once, but rather in stages between August 1971 and early 1973. That seems a bit early to account for rapid money growth in 1971 and 1972, followed by stagflation starting in late 1973. The Nixon reelection campaign (to which his friend Arthur Burns contributed from the Fed) is a more convincing explanation (or at least a partner in crime) for the United States.

4. THE SUPPLY-SHOCK THEORY OF STAGFLATION GIVES NO REASON TO THINK THAT INFLATION MEASURED BY THE GDP DEFLATOR SHOULD RISE.

Barsky and Kilian do shoot at least one hole in the supply-shock theory: It gives no reason to think the GDP deflator (as opposed to some index of consumer prices) should rise more rapidly after an oil shock. Empirically, of course, it did. This is a legitimate criticism of the conventional wisdom, which does not, of course, apply to analyses that focus on consumer-price inflation.[4] Their point is really much simpler than the way they make it in the paper. Just remember that Y = C + I + G + X − IM (another thing we teach our Economics 101 students), and you will notice that IM enters with a minus sign. In the simple, limiting case that all imported oil is consumed and no real quantities are affected by an oil shock, nominal C and nominal IM would rise by identical amounts, leaving nominal GDP and all real magnitudes unchanged. Thus the GDP deflator would be unchanged.

[4] This caused me to check my own past work. Fortunately, it was all about consumer-price inflation.
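The accounting in this limiting case can be verified with a few lines of arithmetic. The sketch below assumes an economy with no investment, government spending, or exports, so that Y = C − IM; households consume a domestic good and imported oil, and real quantities are held fixed. The quantities, prices, and function name are hypothetical.

```python
def deflators(p_dom, q_dom, p_oil, q_oil, base_p_dom=1.0, base_p_oil=1.0):
    """Return (GDP deflator, consumption deflator) relative to base prices.

    Assumed limiting case: I = G = X = 0, all imported oil is consumed,
    and real quantities do not respond to the oil price.
    """
    nominal_c = p_dom * q_dom + p_oil * q_oil
    nominal_im = p_oil * q_oil
    nominal_gdp = nominal_c - nominal_im          # Y = C - IM

    real_c = base_p_dom * q_dom + base_p_oil * q_oil
    real_im = base_p_oil * q_oil
    real_gdp = real_c - real_im

    return nominal_gdp / real_gdp, nominal_c / real_c


if __name__ == "__main__":
    before = deflators(p_dom=1.0, q_dom=100.0, p_oil=1.0, q_oil=10.0)
    after = deflators(p_dom=1.0, q_dom=100.0, p_oil=2.0, q_oil=10.0)
    print("GDP deflator before/after:", before[0], after[0])
    print("Consumption deflator before/after:", before[1], after[1])
```

Doubling the oil price raises nominal consumption and nominal imports by the same amount, so the GDP deflator stays at its base value while the consumption deflator rises.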
This, of course, is a rather unrealistic account of the way our economy reacts to an oil shock. But it does point out that we need a more complicated theory (involving, say, markups and wage-price interactions) to explain why an oil shock boosts the GDP deflator. Having said that, if we resist the conclusion that oil shocks are inherently stagflationary, we have quite a few coincidences to explain, as is clear, e.g., from the work of Jim Hamilton (cited in the paper) or Carruth, Hooker, and Oswald (1998).[5]

[5] A. Carruth, M. Hooker, and A. Oswald (1998), Input prices and unemployment equilibria: Theory and evidence for the United States, Review of Economics and Statistics LXXX, pp. 621-628.

5. THE FACT THAT OIL PRICES ROSE FIRST AND STAGFLATION FOLLOWED LATER IS LARGELY EXPLAINED BY REVERSE CAUSATION: IT WAS MONETARY-INDUCED BOOMS THAT PUSHED OIL PRICES UP.

Thus do Barsky and Kilian elide the aforementioned coincidences. If I may paraphrase Shakespeare, methinks the gentlemen doth protest too much on this point. The simple conventional wisdom holds that movements in the price of oil (which are purely exogenous) drive the economy; or, in a variant, that oil prices drive monetary policy, which in turn drives the economy.[6] The Barsky-Kilian model, by contrast, holds that monetary policy drives the economy, which in turn drives the (now purely endogenous) price of oil. In fact, reasonable people do not have to choose either of these two extremes. Two-way causation is fine, and indeed I think that's where Barsky and Kilian actually come out in the end.

On the "oil prices are endogenous" side, there is surely truth to the notion that strong worldwide macroeconomic conditions made it easier for the OPEC cartel to work. [But didn't Rotemberg and Saloner (1986) argue that cartels break up in booms?[7]] But on the "oil prices are exogenous" side, I think it is a mistake to argue that money growth caused the Yom Kippur War or the fall of the Shah. Indeed, don't events like those give econometricians their best shot at cutting the Gordian knot of simultaneity?

[6] See Bernanke, Gertler, and Watson (1997).
[7] I realize that other theorists have argued that strong market conditions help cartels hold together. That's one of the nice features of economic theory.

Discussion

Robert Barsky explained that the main aim of the paper was to cast doubt on the received wisdom that stagflation in the 1970s was a result of an aggregate supply shift alone. In response to Olivier Blanchard's discussion, he raised the possibility that, with sufficiently responsive expectations on the part of the "awake" firms, the model in the paper might generate expected inflation greater than lagged inflation. On cartels, he said that the links between interest rates, output, and cartel power could go in both directions. In particular, when interest rates are low and output is high, the incentives to cheat might be less.

Barsky responded to Alan Blinder's discussion by saying that monetary overshooting could lead to a stagflation in the model. This result depended on the amount of "sleepiness" in firms and the endogeneity of monetary policy. He stressed the similarity between their paper and Bernanke, Gertler, and Watson (1997), the main difference being that their paper argued that the Fed was responding to commodity price inflation alone, rather than to inflation in both commodity and oil prices as in BGW. Barsky agreed with Blinder that Arthur Burns might have loosened money supply to help Nixon's re-election, but he noted that commodity price inflation gave Burns the excuse he needed.

Mark Gertler suggested that, because of the regulation of depository institutions in the 1970s, increases in nominal interest rates could have had an effect on economic activity even without increases in real rates. For evidence on this point, he suggested that the authors look at what happened to housing in 1973. He agreed with Blanchard that more than a purely monetary story was necessary to explain the combination of low growth and high inflation of the early 1970s. Gertler noted the potential role of the productivity slowdown in 1972 and the resulting decline in growth; it could be argued that the Fed, failing to understand what was happening, eased policy, leading to high inflation. He put forward the possibility that the opposite has happened in recent years, during which productivity has surged and inflation has been low.
Michael Klein remarked that the exchange-rate channel was one way that a monetary expansion might have fed into a rise in the dollar price of oil. Charles Engel followed up on this comment by saying that, until 1990, a dollar depreciation was related to an increase in the price of oil, and vice versa. A possible explanation is that oil was priced in dollars. When the dollar fell, the oil price fell in Europe and Japan, increasing world demand and the world price of oil. This could be consistent with a money-based account if money growth caused the depreciation. The problem is that empirical studies have not found a close connection between monetary policy and exchange-rate changes in the short run.

David Romer agreed with the general point made by the authors that something more than an oil price shock was necessary to explain the 1970s. He was worried by the fact that, though an oil-price-shock story works well for the 1970s, oil prices are rarely invoked to explain the events of the past 20 years. Blinder responded that an oil-price story would work for the 1990 recession. Barsky and Romer disagreed, saying there was no stagflation in 1990. Greg Mankiw suggested that a beneficial oil shock helps to explain the economy's behavior in 1986. Olivier Blanchard said that no doubling of the price of oil had happened outside the 1970s. He maintained that the one thing that would be fatal to the oil story would be a doubling of the price of oil with no subsequent recession. Blinder remarked that 1997-1998 was characterized by a fall in oil prices, a boom, and low inflation.

Ben Bernanke noted that in the VAR literature that tries to look at the effect of oil prices on the real economy, it is hard to find a reliable statistical link between indicators of oil price shocks and subsequent output movements. But there would be some inflationary effect that leads to a policy response. Tom Sargent took up the point, saying that in a model with vintages of capital, it is difficult to find large real effects of oil prices. David Laibson pointed out that in an [S,s] model of capital formation, the effects of oil prices would be nonlinear. Dramatic price shocks in either direction could depress output through scrapping of capital. This nonlinearity could explain why the increases in oil prices of the 1970s depressed activity, while falling prices in the 1980s had ambiguous effects.

Mankiw remarked that the issue of modeling sleepy firms was an important one, as the dynamics of the model are sensitive to how sleepiness is modeled. He said that while economists have a good idea of how to model awake firms, they did not yet have good models of sleepiness.

In response to the general discussion, Lutz Kilian reiterated the key point that the prices of other commodities rose before the price of oil rose in 1973. Oil prices were different in that they remained high while other commodity prices fell.
He suggested that the success of the cartel could explain why oil prices remained high. Kilian questioned the link between events such as the Yom Kippur War and the fall of the Shah of Iran on the one hand, and oil price rises on the other, on the grounds that the timing was not right. He emphasized that embargoes could be endogenous. Barsky agreed, noting that disruptions in the Middle East don't always raise the price of oil. Rather, embargoes are imposed when it is profitable for oil producers to do so.
Marvin J. Barth III and Valerie A. Ramey FEDERAL RESERVE BOARD OF GOVERNORS; AND UNIVERSITY OF CALIFORNIA, SAN DIEGO, AND NATIONAL BUREAU OF ECONOMIC RESEARCH
The Cost Channel of Monetary Transmission

We wish to thank Wouter den Haan, Charles Evans, Jon Faust, Mark Gertler, Simon Gilchrist, James Hamilton, Alex Kane, Garey Ramey, Matthew Shapiro, Christopher Sims, Chris Woodruff, our colleagues at UCSD and the Board of Governors, as well as participants at UC-Irvine, New York University, the Board of Governors, Rutgers, the San Francisco Federal Reserve, the University of Michigan, the July 2000 NBER Economic Fluctuations and Growth Meeting, and the April 2001 NBER Macroeconomics Annual Meeting for useful comments. Christina Romer kindly provided the Greenbook data. Ramey gratefully acknowledges support from a National Science Foundation grant through the National Bureau of Economic Research. The views expressed in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System. The authors' respective email addresses are [email protected] and [email protected].

1. Introduction

Traditional economic models posit that changes in monetary policy exert an effect upon the economy through a demand channel of transmission. This view of monetary policy has a long history that has been fraught with debate over whether monetary policy affects real economic variables, and if so, how powerful these effects may be. Much of this research has been devoted to identification of a demand-side transmission mechanism for monetary policy and quantifying its effects.

Alternatively, some researchers have proposed that there may be important supply-side, or cost-side, effects of monetary policy (e.g., Blinder, 1987; Fuerst, 1992; Christiano and Eichenbaum, 1992; Christiano, Eichenbaum, and Evans, 1997; and Farmer, 1984, 1988a, b). One version of this view, which ignores long-run effects, has been called the "Wright Patman effect," after Congressman Wright Patman, who argued that raising interest rates to fight inflation was like "throwing gasoline on fire" (1970).

This paper presents aggregate and industry-level evidence that suggests that these cost-side theories of monetary policy transmission deserve more serious consideration. It is not the purpose of this paper to deny the existence of demand-side effects. Rather, this paper presents evidence implicating supply-side channels as powerful collaborators in the transmission of the real, short-run effects of monetary policy changes. In fact, for many important manufacturing industries, the evidence presented here implies that a cost channel has been the primary mechanism of monetary transmission.

A cost channel of monetary transmission can potentially explain three important empirical puzzles. The first puzzle, noted by Bernanke and Gertler (1995), is the degree of amplification. Empirical evidence suggests that monetary policy shocks that induce relatively small and transitory movements in open-market interest rates have large and persistent effects on output. Bernanke and Gertler use this result to support their argument that a credit channel working in tandem with the traditional monetary channel better explains the data. A complementary means to explain the observed amplification is to allow monetary policy shocks to have both supply-side and demand-side effects. If this is the case, then a shock to monetary policy could be viewed as shifting both the aggregate supply and aggregate demand curves in the same direction, leading to a large change in output accompanied by a small change in prices.

The response of prices to a monetary contraction is a second empirical puzzle that may be explained by a cost channel. Standard vector autoregression (VAR) methods suggest that the price level rises in the short run in response to a monetary contraction. This price puzzle was first noted by Sims (1992), and has been confirmed by much subsequent work. It is our view that this may result from short-run, cost-push inflation brought on by an increase in interest rates.
A third puzzle, which we will document shortly, is the differential effect of monetary shocks on key macroeconomic variables when compared to other aggregate-demand shifters. Using several measures of aggregate-demand shifters and technology shocks, we show that a monetary shock creates economic responses more similar to those due to a technology shock than to an aggregate-demand shock. These results are consistent with our hypothesis that monetary policy shocks affect the short-run productive capacity of the economy.

The literature offers several theoretical foundations for monetary policy as a cost shock. For example, Bernanke and Gertler's (1989) model contains both a demand and a supply component of balance-sheet effects. Several other credit-channel papers suggest that there might be a cost-side channel of monetary policy (e.g., Kashyap, Lamont, and Stein, 1994; Kashyap, Stein, and Wilcox, 1993; and Gertler and Gilchrist, 1994). Most of these papers are empirical, though, and do not explicitly model
the supply-side effects. Nevertheless, the discussion of the results indicates the possibility of supply-side effects. Consider, for example, Gertler and Gilchrist's (1994) study of the cyclical properties of small vs. large firms, in which they show that a monetary contraction leads to a decrease in the sales of small firms relative to large firms. The implication is that tight credit is impeding the ability of small firms to produce.1

There are several other examples of general equilibrium macroeconomic models that explicitly analyze the supply-side effects of monetary policy through working capital. Blinder (1987), Christiano and Eichenbaum (1992), Christiano, Eichenbaum, and Evans (CEE) (1997), and Farmer (1984, 1988a, b) all begin with the assumption that firms must pay their factors of production before they receive revenues from sales, and must borrow to finance these payments. In most of the models, an increase in the nominal interest rate serves to raise production costs. Thus, a monetary contraction leads to a decline in output through an effect on supply. It is important to note that some type of rigidity is still required for money to be nonneutral. If prices and portfolios adjust immediately, then monetary policy has no initial effect on interest rates, so that neither aggregate demand nor aggregate supply shifts.

The paper proceeds as follows. Section 2 presents aggregate evidence that the effects of monetary policy shocks look more like technology shocks than like demand shocks. Section 3 investigates the importance of working capital in production. Section 4 presents evidence derived from two-digit-level industry data. The results of this analysis show clear indications of the strength of monetary policy as a cost shock at the industry level: many industries display falling output and rising price-wage ratios. Furthermore, the effect appears to be much more pronounced during the period from 1959 to 1979.
This is also the period in which monetary policy shocks have larger and longer effects on output. Section 5 addresses possible alternative explanations of the empirical results presented in the preceding sections. Finally, Section 6 concludes.
2. A Comparison of the Effects of Monetary Shocks and Other Shocks

A useful starting point in an analysis of the supply-side effects of monetary policy is a comparison of the responses engendered by identified technology and demand shocks on key macroeconomic variables with the responses of those variables to an unexpected monetary contraction. Unfortunately, the literature is not replete with universally accepted measures of demand and supply shocks.2 Nor is it clear what exactly is meant by "aggregate demand" and "aggregate supply" shocks in a fully specified dynamic general-equilibrium model. Undaunted, we pursue two alternative strategies. The first extends work by Shapiro and Watson (1988), Gali (1999), and Francis and Ramey (2001) by using long-run restrictions to identify technology (supply) and other shocks. The second approach uses defense buildups as an example of an exogenous nontechnology (demand) shock. Neither approach is completely uncontroversial, but the similarity of results across the two approaches strengthens our case.

1. Some earlier empirical work studied whether rises in interest rates are passed on to prices. Seelig (1974) found small or insignificant effects on markups. Shapiro (1981), on the other hand, estimated a Cobb-Douglas markup equation on aggregate data and found significant interest-rate effects on the price level. To our knowledge, there has been little or no recent empirical work on the subject.

2.1 THE EFFECTS OF SHOCKS IDENTIFIED USING LONG-RUN RESTRICTIONS
In the first approach, we use a VAR with long-run restrictions to investigate the effects of three types of shocks. We follow Gali (1999) in identifying technology shocks as the only shocks that have permanent effects on productivity. This assumption is fairly unrestrictive, as it allows for temporary effects of nontechnology shocks on measured productivity through variations in capital utilization and effort. Using a bivariate system with labor productivity and hours, Gali identified two shocks: a technology shock and another shock to labor, which he interpreted as an aggregate demand shock. Interestingly, it is this second shock that appears to drive the business-cycle movements in the economy. Adding nominal variables to the system did not significantly alter his results. Francis and Ramey (2001) present evidence in support of the plausibility of the technology-shock interpretation by investigating the effect of this shock on other key macro variables, such as consumption, investment, and real wages.

We use a combination of variables from the systems estimated by Gali and by Francis and Ramey in order to compare the effects of the various shocks. Consider the following moving-average representation:
$$y_t = C(L)\,u_t \qquad (1)$$

Here $y_t$ is a 6 × 1 vector consisting of the log differences of labor productivity ($x_t$), hours ($n_t$), real wages ($w_t$), the price level ($p_t$), money supply as measured by M2 ($m_t$), and the level of the federal funds rate ($f_t$). The function $C(L)$ is a polynomial in the lag operator with 6 × 6 matrix coefficients. Shocks to the system, $\varepsilon^x_t, \varepsilon^n_t, \varepsilon^w_t, \varepsilon^p_t, \varepsilon^m_t, \varepsilon^f_t$, are represented by the vector $u_t$. Note that private output is simply the product of output per hour and total hours. In order to impose the restriction that no shocks other than $\varepsilon^x$ have a permanent effect on productivity, we require $C_{1j}(1) = 0$ for $j = 2, 3, \ldots, 6$, where $C(1)$ represents the sum of all moving-average coefficient matrices.3 To derive a shock comparable to Gali's aggregate-demand shock, which is the shock to the hours equation, we further impose the restrictions that $C_{2j}(0) = 0$ for $j = 3, 4, 5, 6$. These restrictions essentially put the labor input variable ahead of the other four variables in the ordering. Finally, we require the shock to the federal funds rate to be contemporaneously uncorrelated with the other system variables (except productivity), and, following Bernanke and Blinder (1992) and Christiano, Eichenbaum, and Evans (1999), we assume that the shocks to that equation represent monetary policy shocks. A unit root in productivity is key to identifying the shock. We also assume that hours, real wages, the price level, and the money supply have unit roots, while the federal funds rate is stationary. As Gali shows, the results are not sensitive to changing these auxiliary assumptions. We include four quarterly lags of each variable in the estimation.

2. We were intrigued by Shea's (1993) input-output instruments, but decided against them for the following reason: Of the 26 industries studied, Shea uses residential construction as an instrument for 16 industries, and transportation equipment for 2 industries. If monetary policy is affecting residential investment and motor vehicles at the same time as it is affecting the cost of working capital in upstream industries such as concrete and tires, then output in residential construction or motor vehicles is not a valid demand instrument for the upstream firm.

To summarize, our goal is to identify three key shocks. The technology shock is found by imposing the long-run restriction that only a technology shock can have a permanent effect on productivity. The nonmonetary, nontechnology shock is assumed to be the shock to the hours equation, which Gali has argued behaves most like a demand shock. Furthermore, Francis and Ramey have found that while this shock is correlated with military dates and oil-shock dates, it is uncorrelated with the Romer dates.
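The long-run restriction can be illustrated in a much smaller setting than the six-variable system used here. The following is a minimal sketch, assuming a bivariate reduced-form VAR in first differences (productivity growth and hours growth) whose lag matrices and residual covariance are already estimated; the numerical values, the function name `long_run_identification`, and the variable names are hypothetical, and the sketch omits the additional short-run restrictions described above.

```python
import numpy as np

def long_run_identification(lag_matrices, sigma):
    """Recover an impact matrix S (u_t = S e_t, E[e e'] = I) such that the
    long-run response matrix C(1) is lower triangular, i.e. only the first
    shock has a permanent effect on the first variable.

    lag_matrices: list of k x k reduced-form VAR coefficient matrices
    sigma:        k x k covariance matrix of the reduced-form residuals
    """
    k = sigma.shape[0]
    # Long-run multiplier of the reduced form: (I - A_1 - ... - A_p)^{-1}
    psi1 = np.linalg.inv(np.eye(k) - sum(lag_matrices))
    # Long-run covariance of the cumulated responses
    long_run_cov = psi1 @ sigma @ psi1.T
    # Its lower-triangular Cholesky factor is the restricted C(1)
    c1 = np.linalg.cholesky(long_run_cov)
    # Back out the contemporaneous impact matrix from C(1) = psi1 @ S
    s = np.linalg.solve(psi1, c1)
    return s, c1

# Hypothetical VAR(1) coefficients and residual covariance, for illustration only
A1 = np.array([[0.2, 0.1],
               [0.3, 0.5]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.8]])

S, C1 = long_run_identification([A1], Sigma)
print("impact matrix S:\n", S)
print("long-run matrix C(1):\n", C1)
```

The zero in the upper-right corner of `C1` corresponds to the requirement that the second shock has no permanent effect on productivity, while `S @ S.T` reproduces the residual covariance, so the identified shocks are orthogonal with unit variance.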
Finally, the monetary policy shock is identified as the shock to the federal funds rate.

We use quarterly data from January 1959 to March 2000 to estimate the model. The Data Appendix gives complete details about the data used, as well as how the standard errors are calculated.

3. Francis and Ramey show that similar results are obtained with the alternative restrictions that only set $C_{12}(1) = 0$ and $C_{1j}(0) = 0$ for $j = 3, 4, 5, 6$.

Figure 1 shows the separate effects of a negative technology shock, a negative demand shock, and a contractionary monetary shock on the variables of interest. First note that all three shocks have a negative impact on private output. Both the technology shock and the monetary shock lead to a sustained fall in output. The demand shock, on the other hand, leads to a less persistent fall. All three shocks also lead to falls in hours. Consistent with Gali's original results, hours first rise in response
Figure 1 MONETARY, TECHNOLOGY, AND DEMAND SHOCKS
Line with circles, technology shock; with squares, monetary shock; with triangles, demand shock. Filled marks significant at 10%; open marks significant at 25%.
to a negative technology shock, but fall immediately in response to the demand shock. The effect of the monetary policy shock on employment is delayed until the third quarter after the shock. It is in the responses of productivity and real wages that the monetary shock really looks more like a technology shock than like a demand shock. The technology shock and the federal-funds-rate shock both cause a fall in productivity, although the effect is less persistent for the monetary shock, as one would expect for a transitory shock. In contrast, after an initial negative effect for three quarters, the demand shock leads productivity to rise. Thus, the usual explanation given for the decline in labor productivity—a fall in capital utilization—does not appear to apply to declines in hours caused by other demand shocks. The response patterns for real wages are very similar to those of productivity as would be predicted by theory. Both a negative technology and monetary shock lead to declines in real wages, and again, the monetary shock is relatively transitory, while the technology shock exhibits more persistence. The responses are consistent with a negative shock to production possibilities that leads to a decline in labor demand. Real wages respond oppositely to a negative demand shock, rising, as would be consistent with a stable production function, and leading to higher labor productivity and hence real wages. The real-wage results are consistent with several other results from the literature. For example, using a standard recursive VAR, Christiano, Eichenbaum, and Evans (1997) also find that real wages decline in response to a contractionary monetary shock. Using Shapiro and Watson's (1988) long-run identifying restrictions that aggregate-demand shocks can have no long-run effect on output, Fleischman (2000) finds that aggregate-demand shocks lead to countercyclical movements in real wages. 
Thus, the response of real wages to monetary shocks is very different from the response to other aggregate-demand shocks. Figure 1 also shows the effects of the three shocks on the price level and the funds rate. A negative technology shock causes a sustained rise in the price level. A monetary policy shock leads to a temporary increase in the price level, whereas the demand shock does not have much effect. Finally, the funds rate falls in response to a negative demand shock, while it rises in response to a negative technology shock or a monetary contraction (by definition). A noticeable pattern emerges from the graphs. The response of variables to a monetary policy shock is typically more similar in sign and pattern to a technology shock but is less persistent. Further, as one might expect under a hypothesis that monetary contractions beget both supply and demand effects, the responses to a monetary contraction
generally lie between, or appear to be a mixture of, technology and demand shocks.

2.2 A COMPARISON OF EXOGENOUS MONETARY VS. DEFENSE SHOCKS
As our second line of attack, we present evidence that monetary shocks differ significantly from demand shocks, identified as exogenous defense buildups, in their effects on output and real wages. To begin, we present some stark evidence in the form of two graphs of variables in the aircraft and parts industry (SIC 372) from the period 1977 to 1995. Military spending is an important component of demand for aerospace goods. At the height of the last buildup, the Department of Defense accounted for almost 60% of total shipments from the aircraft and parts industry. Thus, fluctuations in defense spending are an important exogenous source of demand variation.

Figure 2a charts real defense spending on aircraft and parts plotted against both industrial production and the real product wage in SIC 372. The real product wage is measured as average hourly earnings in the industry divided by the producer price index for aircraft and parts. The graph plots the logarithms of the data, which have not been detrended or normalized. From 1977 to 1988, real defense spending on aircraft and parts rose 375%. From 1988 to 1995 it fell by almost the same amount. As Figure 2a clearly demonstrates, the path of defense spending on aircraft has a strong positive correlation with industrial production of aircraft and parts. The correlation is 0.44. In contrast, the real product wage in the industry moves countercyclically. As defense spending rose, real wages plummeted, and as defense spending collapsed, real wages rose; the correlation between the two series is -0.75. These strongly countercyclical responses to exogenous fluctuations in industry demand are entirely consistent with the effects of a demand shock in a standard neoclassical model with flexible prices. With a stable production function and slow accumulation of capital, an increase in output is necessarily accompanied by a decline in labor productivity and hence a decline in real wages.
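The last step of this argument can be made concrete with a small numerical sketch. It assumes a Cobb-Douglas technology with capital held fixed, which is one example of a stable production function and not necessarily the one the authors have in mind; the capital share of 0.3, the input levels, and the function names are hypothetical.

```python
def output(capital, hours, alpha=0.3):
    """Cobb-Douglas output with capital share alpha (illustrative assumption)."""
    return capital ** alpha * hours ** (1.0 - alpha)

def average_product(capital, hours, alpha=0.3):
    """Output per hour, i.e. measured labor productivity."""
    return output(capital, hours, alpha) / hours

def marginal_product(capital, hours, alpha=0.3):
    """Marginal product of labor, which equals the real product wage
    under competitive, flexible-price conditions."""
    return (1.0 - alpha) * average_product(capital, hours, alpha)

capital = 100.0                      # held fixed: slow capital accumulation
hours_path = [100.0, 110.0, 120.0]   # a demand-driven rise in labor input

for h in hours_path:
    print(
        f"hours={h:6.1f}  output={output(capital, h):7.2f}  "
        f"productivity={average_product(capital, h):.4f}  "
        f"marginal product={marginal_product(capital, h):.4f}"
    )
```

With capital fixed, each increase in hours raises output but lowers both output per hour and the marginal product of labor, which is the countercyclical real-wage pattern described for the aircraft industry.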
These patterns are not consistent with a theory of countercyclical markups. As Ramey and Shapiro (1998) demonstrated more generally, defense spending has similar effects on more aggregate product wages. To highlight the different effects of monetary vs. defense shocks, we compare the impact on real wages and output of a Romer monetary date (Romer and Romer, 1989, 1994) with that of a Ramey-Shapiro military date. In each case, we estimate a system with real wages, output, and the
Figure 2 (a) THE EFFECT OF DEFENSE SPENDING ON AIRCRAFT INDUSTRY OUTPUT AND WAGES (QUARTERLY DATA); (b) THE EFFECT OF ROMER DATES AND RAMEY-SHAPIRO MILITARY DATES
Line with circles, response to a monetary shock; with squares, response to a military shock. Filled marks significant at 10%; open marks significant at 25%.
dummy variable of interest. Eight lags of all variables plus the current value of the dummy variable are included. The comparison is complicated by the fact that the Romer dates signal a contraction in output whereas the Ramey-Shapiro dates, which index sudden political events that lead to defense buildups, signal an expansion of output. Leaving aside important potential issues about asymmetry, for comparability we reverse the sign of the Ramey-Shapiro dates to make the shocks in both experiments contractionary.

Figure 2b graphs the responses of the logarithm of real GDP and real wages to each shock. Although both shocks lead to declines in output, they have opposite impacts on real wages. Defense-induced changes in output are negatively correlated with real wages, while monetary-induced changes in output are positively correlated with real wages.

In contrast with our hypothesis and the evidence presented above, most sticky-price and countercyclical-markup models predict that monetary, government spending, and other demand shocks should have similar economic effects. Rotemberg and Woodford (1991), King and Goodfriend (1997), and various others have argued that either collusive behavior or sticky prices can lead to countercyclical markups. Since the markup is inversely related to the real wage, countercyclical markups imply procyclical wages. One would expect defense-spending changes and money-supply changes to have similar but opposite effects on real wages and markups. The results presented here suggest that the transmission mechanism for monetary policy is very different from the transmission mechanism for other nontechnology shocks.

3. The Mechanics of the Cost Channel

The last section presented qualitative aggregate evidence consistent with a cost channel of monetary policy. We now discuss the quantitative plausibility of the cost-channel hypothesis. The key link in our hypothesis is the role of working capital.
We argue that just as interest rates and credit conditions affect firms' long-run ability to produce by investing in fixed capital, they can also be expected to alter firms' short-run ability to produce by investing in working capital. The data support the importance of investment in working capital, whether measured against sales or against fixed capital. One way to measure the magnitude of working capital is to calculate how many months of final sales are held as working capital. Consider the following two measures of working capital: gross working capital, which is equal to the value of inventories plus trade receivables; and net working capital,
which nets out trade payables. On average over the period 1959 to 2000, gross working capital was equal to 17 months of final sales, and net working capital was equal to 11 months of sales.4 Thus, even the smaller net-working-capital measure implies that nearly a year's worth of final sales is tied up in working capital. The level of investment in working capital is in fact comparable to the investment in fixed capital. In manufacturing and trade, the value of gross working capital equals the value of fixed capital, about $1.5 trillion each.5 There are various ways to incorporate working capital into a model of production. Fuerst (1992), Christiano and Eichenbaum (1992), and CEE (1997) embed a delay between factor payments and sales receipts in their models. They assume that firms must pay workers before selling their goods, so firms must borrow cash from the bank in order to produce. The need to borrow introduces an additional component to the cost of labor. In this setting, the marginal cost of hiring labor is the real wage multiplied by the gross nominal interest rate. In CEE's (1997) version of the model, labor demand is given by
where $\alpha$ is the coefficient on capital in a Cobb-Douglas production function, $\mu$ is a constant markup, $R$ is the gross nominal interest rate, and $W/P$ is the real wage. CEE study a calibrated general equilibrium model in which all of the effects last only one period. They find that the magnitude of the effects on output and labor depends significantly on the labor-supply elasticity. If the labor-supply elasticity is as high as 5, a monetary contraction results in an 83-basis-point rise in the nominal interest rate, a 1.4% decline in hours, and a small rise in prices. It is difficult to find microeconomic evidence in support of such a high elasticity, though. Thus, the cost-channel hypothesis shares the same problem with most economic models that assume workers remain on their labor-supply curves: a high labor-supply elasticity is essential for matching the quantitative aspects of the data.

4. The inventory and sales data are from the BEA, and the trade credit data are from the Flow of Funds.

5. From Quarterly Financial Reports, second quarter 2000.

Equation (2) is useful for considering the possible magnitude of the direct effects on labor demand of a rise in the nominal interest rate, holding real wages constant. Our evidence on working-capital investment suggests that a 1-year lag between paying factors and finally receiving payment is not an unreasonable assumption. Hence, it makes sense to consider an annualized interest rate. If the share of capital is 0.3, then a 100-basis-point increase in the nominal interest rate lowers labor demand by 3%, holding real wages constant. The average rise in the federal funds rate during tightening cycles associated with Romer dates is almost 400 basis points. Thus, the direct effects of monetary-induced jumps in the nominal interest rate can have a significant impact on labor demand and output more generally.

The direct effect on the federal funds rate, however, is likely only part of the story. Insights from the credit channel suggest a mechanism by which shocks that initially work through demand may be propagated through the supply side. As demand falls off, firms are faced with accumulating inventories and accounts receivable, and falling cash flow. The dropoff in internally generated funds as the stock of working capital rises forces firms to turn to external financing precisely when interest rates are increasing. The opportunity cost of internal funds increases directly with the federal funds rate, but when firms are forced to turn to external funds, their marginal financing cost typically jumps discretely, due to information asymmetries between the firms and their creditors. In recent quarters, an industrial company rated BBB was usually charged a spread of about 80 basis points over LIBOR on existing lines of credit.6 Since this spread usually rises during periods of tightening credit or during recessions, firms that are forced to renegotiate their lines of credit at such times will face an even greater jump in marginal financing rates. When added to a 400-basis-point increase in the federal funds rate, equation (2) would imply a 15% decline in labor demand with real wages held constant.
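The magnitudes quoted above can be checked with a short calculation. The sketch below assumes that labor demand is log-linear in the gross nominal interest rate with elasticity $-1/\alpha$, which is what a Cobb-Douglas technology with a constant markup and a working-capital requirement on the wage bill would imply; this functional form is an assumption made for illustration and is not copied from equation (2). The function name and default values are hypothetical.

```python
import math

def labor_demand_log_change(rate_increase_bp, capital_share=0.3):
    """Approximate log change in labor demand from a rise in the annualized
    nominal interest rate, holding the real wage constant.

    Assumes ln L = const - (1 / capital_share) * [ln R + ln(W/P)], where R is
    the gross nominal interest rate (illustrative assumption).
    """
    # Change in ln R when the net rate rises by rate_increase_bp basis points,
    # starting from a gross rate normalized to one.
    delta_log_r = math.log1p(rate_increase_bp / 10_000.0)
    return -delta_log_r / capital_share

# 100 basis points, and 400 basis points plus an 80-basis-point credit spread
for bp in (100, 480):
    change = labor_demand_log_change(bp)
    print(f"{bp:4d} bp rise -> labor demand changes by about {100 * change:.1f}%")
```

Under these assumptions a 100-basis-point increase lowers labor demand by roughly 3%, and a 480-basis-point increase lowers it by roughly 15 to 16%, in line with the figures in the text.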
The time-lag-in-production model nicely captures several features of the data shown in Figures 1 and 2, but it does not explain how a monetary contraction can reduce labor productivity. A model in which working capital has a direct impact on the marginal product of labor can explain such a phenomenon. This can be achieved most directly by including working capital as a factor of production. In fact, there is evidence that this is a valid representation of the role of working capital. Ramey (1989) demonstrates that a model which includes inventories by stage of process as production factors is well supported by the data. 6. Based on data from Loan Pricing Corporation. LIBOR has been the base rate for most firms since the mid-1980s. Prior to that, firms paid a spread over the prime rate, which is typically above a market rate like LIBOR. Hence the above is likely a conservative estimate of the jump in marginal cost of external funding for our data sample.
It is at this point, though, that the cost-channel theory runs into the same problem as the credit-channel theory. Both theories suggest that firms should decrease their inventories and accounts receivable in response to a monetary contraction. In fact, it is well known that aggregate inventories and accounts receivable rise relative to sales in response to a monetary contraction, at least in the short run (see for example Bernanke and Gertler, 1995, and Ramey, 1992). Some of this rise can be explained by other mechanisms coming into play at the same time. For example, if a monetary contraction also works through product demand, then firms may have unanticipated buildups in final-goods inventories just when they would prefer to hold fewer inventories. Similarly, firms' credit-constrained customers may delay their payments, leading to rises in accounts receivable just when firms need the extra liquidity.

The behavior of the various components of working capital in response to a monetary contraction is an important part of the story and deserves a much more detailed analysis than we can offer here. We present one piece of suggestive evidence that there may be something to the story. Raw-material and work-in-process inventories are not as susceptible to unintentional buildup. Thus, it is interesting to study what happens to the ratio of inventories by stage of processing relative to labor hours after a monetary contraction.

To measure the effect of a monetary contraction on inventories relative to hours, we use the monthly VAR model that will be presented in the next section.7 We study the response of manufacturing inventories to hours over two different periods of monthly data: January 1959 to September 1979 and January 1983 to March 2000. The sample was split for two reasons. First, the new BEA data on real inventories extend back only to 1967. Therefore, we use the old BEA data for the early period and the new BEA data for the second period.
Second, as we will demonstrate in the next section, evidence for a cost channel of monetary transmission is much stronger in the pre-Volcker time-series data than in the post-Volcker period.

7. In particular, we estimate equation (3) in which the ratios of materials, work-in-process, and final-goods inventories to hours replace the last two variables in the system.

Figure 3 shows the response of inventories relative to sales for each period. Consider first the period from January 1959 to September 1979 shown in Figure 3a. All types of inventories fall relative to hours in the short run. Interestingly, materials and final goods fall by similar amounts, whereas work in process falls much less. The materials-inventory response stays negative the longest before becoming positive, at about 13
Figure 3 RESPONSE OF MANUFACTURING INVENTORIES BY STAGE OF PROCESS TO A FEDERAL FUNDS RATE SHOCK: (a) EARLY SAMPLE PERIOD, JANUARY 1959 TO SEPTEMBER 1979; (b) LATE SAMPLE PERIOD, JANUARY 1983 TO MARCH 2000.
Line with circles, materials; with squares, work in process; with triangles, final. Filled marks significant at 10%; open marks significant at 25%.
months. If inventories enhance labor productivity, then this fall in the ratios might help explain the decline in labor productivity. The behavior of inventories is completely different in the later period, as shown in Figure 3b. In all cases, inventories rise relative to hours. As we shall argue later, the cost channel appears to be much stronger in the early period. These inventory results also support that view.

Finally, we would like to emphasize that we view the cost channel as being only a short-run phenomenon. The evidence for long-run monetary neutrality is strong, and we are not suggesting that it does not hold. Figure 1, as well as other figures featured later in the paper, shows that the rise in prices is temporary; the price level does finally end up falling. The cost channel may have a larger effect than the demand channel in the short run because of the nature of the commitments. In the short run, firms cannot find alternative sources for working capital, and may have to cut back dramatically on production. The necessary cutbacks may be amplified because the firm may have commitments to long-term capital investment projects that cannot be cut. As time progresses, firms have more flexibility to reduce investment spending. Bernanke and Gertler's (1995) finding of a delayed effect of a monetary contraction on business fixed investment is consistent with this hypothesis.

4. Industry-Level Evidence of Monetary Policy as a Supply Shock

We now explore cross-sectional variation among manufacturing industries for evidence of a cost channel of monetary transmission. There are two motivations for doing so. First, it is interesting to study the extent to which the same patterns we see at the aggregate level also hold at the industry level. Second, if there is heterogeneity in the industry responses, we can determine whether there is a link between the responses and features of the industry that might make the cost channel more important.
The discussion above about the effect of working-capital cost on labor demand easily extends to the industry level. We change the two variables on which we focus, however. First, we use industrial production rather than hours, because of data availability. Since hours and output are so highly correlated, it is doubtful that we will be misled by this change of variables. Second, to facilitate later discussion about the price puzzle and countercyclical markups, it is more convenient to focus on the behavior of the reciprocal of the real product wage, or P/W. The comovement between P/W and output reveals the nature of the monetary transmission mechanism for particular industries. If a monetary contraction affects an industry primarily through a demand channel, then both industrial production and P/W should fall. (That is, the real product wage should rise.) If a monetary contraction affects an industry primarily by raising its working-capital costs, then falling industrial production should be accompanied by rising P/W. Prices should rise relative to wages, because working-capital costs are rising. If both channels are equally strong, we would not expect much movement in P/W.8
4.1 EMPIRICAL FRAMEWORK
We again follow the work of Bernanke and Blinder (1992) and CEE (1999) by identifying monetary shocks as innovations to the federal funds rate (hereafter FFR) after controlling for the Federal Reserve's feedback function. The model features a relatively simple partial identification scheme that allows for control of the price puzzle and flexibility in examining the response of individual time series to monetary policy shocks.
As discussed in the introduction, the price puzzle is the finding that aggregate prices rise in the short run following a monetary contraction identified by the unexplained portion of the FFR. The proposed solution to this puzzle is that the Federal Reserve possesses better information about coming inflation than is captured in a parsimonious VAR and reacts appropriately. CEE, following Sims (1992), improve their model's information set by including commodity prices as a leading indicator of inflation, to which the Federal Reserve passively responds. CEE demonstrate that this eliminates the price puzzle (note that this is not true in pre-1979 subsamples; see Section 4.3).
Although we have argued that a cost channel could explain this type of behavior of prices, in the interest of conservatism we include two controls for incipient inflation to which the Fed might respond: commodity prices and oil-price-shock dummies. Hoover and Perez (1994) note that identified (negative) monetary policy shifts are highly correlated with oil shocks. We control for the cost effects of oil shocks by including dummy variables in each equation that take the value one during a Hoover-Perez date and zero otherwise.9 Based on Hamilton's (1985) evidence that oil price shocks take an average of 9 months to induce recessions, we include current and twelve lags of the Hoover-Perez dummies.10
8. Unfortunately, we cannot use CEE's (1997) study of the response of industry real wages to monetary shocks to assess our hypothesis. They do not compare industry wages with industry price and output. Instead, they examine each industry's wage deflated by a general price deflator.
9. That is, a month identified by Hoover and Perez (1994) as having an exogenous political event that leads to an oil supply shock, based on their reading of Hamilton's (1985) history of postwar oil shocks. Hoover and Perez supplement Hamilton's exogenous political events with the Iraqi invasion of Kuwait in August 1990. Upon the advice of Trevor Reeve, oil economist at the Federal Reserve, we add to this list the election of Hugo Chavez as President of Venezuela in December 1998. Chavez served as the catalyst for OPEC's new-found cohesion in cutting production in early 1999.
Our equation system consists of two blocks. The macro block features aggregate industrial production, the price level, commodity prices, M2, and the FFR, in that order. The second block consists of two equations for variables of interest, one for industry output and one for the industry price-wage ratio. To achieve more efficient estimation and consistent identification of the FFR shock, the coefficients of the series of interest in the macro-variable equations (including the FFR equation) are constrained to be zero for each of the industries examined. This is the approach pursued by Davis and Haltiwanger (1997), who point out that this is in essence a pseudo-panel-data VAR, since the coefficients of the macro-variable equations are fixed across regressions while the coefficients in the series-of-interest equations are allowed to vary across industries. To make the above more explicit, consider the following system of seven equations:
where
Here, $IP_t$ is industrial production (a proxy for output), $P_t$ is the personal consumption expenditure deflator (a monthly measure of general price levels), $PC_t$ is a price index for commodities, $M2_t$ is the monthly average of M2, $FFR_t$ is the monthly average of the FFR, $Q_{it}$ is industrial production in industry i, and $P_{it}/W_{it}$ is the ratio of price to wage in industry i. In (3), $SD_t$ is a matrix of a constant and seasonal dummies, and $HP_t$ is a Hoover-Perez dummy. A is a matrix of endogenous variable coefficients with zero restrictions on the industry variables in the macro-variable equations. Following Bernanke, Gertler, and Watson (1997), we used seven lags.11 All series are in natural-log levels except $FFR_t$. Details on data construction and standard errors are given in the Data Appendix.
10. Hoover and Perez's dummy variable has slightly more explanatory power for industrial production than Hamilton's (1996) net-oil-price-change variable.
11. There is little qualitative difference in the results using as many as 13 lags.
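As a reading aid, system (3) can be sketched in the following form, which is consistent with the variable definitions above. This is only a sketch under stated assumptions: the stacked vector $Z_t$, the matrices $B$ and $C_j$, the error term $u_t$, and the block labels on $A_k$ are names introduced here for illustration and are not the authors' notation. The seven lags, the current and twelve lagged Hoover-Perez dummies, and the zero restrictions on industry variables in the macro equations follow the text.

```latex
% Sketch only: Z_t, B, C_j, u_t and the block superscripts are illustrative
% names, not the authors' notation.
\begin{align*}
Z_t &= \sum_{k=1}^{7} A_k Z_{t-k} + B\,SD_t + \sum_{j=0}^{12} C_j\,HP_{t-j} + u_t, \\
Z_t &= \bigl[\, IP_t,\; P_t,\; PC_t,\; M2_t,\; FFR_t,\; Q_{it},\; P_{it}/W_{it} \,\bigr]', \\
A_k &= \begin{bmatrix} A_k^{mm} & 0 \\ A_k^{im} & A_k^{ii} \end{bmatrix}
\end{align*}
% A_k^{mm} is 5x5 (macro block), the zero block is 5x2 (industry variables
% excluded from the macro equations), A_k^{im} is 2x5, and A_k^{ii} is 2x2.
```

Written this way, the macro block is identical for every industry i, so the identified FFR innovation does not change when the two industry equations are swapped.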
We estimated vector autoregressions of this form for total manufacturing, durable manufacturing, and nondurable manufacturing, 20 two-digit industries, and one three-digit industry within these categories over three sample periods: the entire period from February 1959 to March 2000, and the two subsample periods from February 1959 to September 1979 and from January 1983 to March 2000. Explicitly, we tested the null hypothesis that the change in industry price relative to industry wage is less than or equal to zero following a monetary contraction. We take rejection of this hypothesis as evidence that a cost channel rather than a demand channel is the most important avenue of monetary transmission for that industry.
4.2 INDUSTRY RESULTS
The results are represented in a series of graphs and tables. Figures 4a through 4c show the effect of a positive FFR shock on the price-to-wage ratio and output for the manufacturing aggregates as well as the individual two- and three-digit industries, using our entire data sample for estimation. For 10 of the 21 industries examined and for all three aggregates, the impulse response functions show that in response to a positive shock to the FFR, output falls and prices rise relative to wages. The second and third columns in Table 1 summarize the results by describing the behavior of the data during the first 24 months for each industry. The third column presents the results of a test of the null hypothesis that none of the price levels are significantly above zero during the first 24 months. The results clearly reject the null hypothesis at the 10% level for six of the industries analyzed, and we can thus reject the claim that monetary policy exerts its effects solely through a demand channel of transmission. In fact, for important cyclical industries, and even for manufacturing as a whole, the results indicate that monetary policy's primary effects on real variables are transmitted through a supply-side channel.
There is clear evidence of the importance of a demand channel of transmission for eight industries (food, lumber, pulp and paper, chemicals, hides and skins, primary metals, fabricated metals, and other durables). Recall, however, the nature of our test: it will only show the presence of a supply channel when its effects clearly dominate those of a demand channel, whose existence we do not deny. That the price response of lumber exhibits a typical demand-shock pattern does not imply that there is not a cost channel of transmission for lumber. The lumber industry too may suffer from a cost channel of transmission, but it is the demand channel whose effects dominate.
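The tallies reported in Table 1 (months in the first two years in which the P/W response is positive, and months in which it is significantly positive) can be sketched as a simple count. This is an illustration rather than the authors' procedure: the function name and the representation of significance as a lower confidence bound above zero are assumptions, and the example numbers are made up.

```python
from typing import Sequence


def count_positive_months(
    response: Sequence[float],
    lower_band: Sequence[float],
    horizon: int = 24,
) -> tuple[int, int]:
    """Return (months with response > 0, months significantly > 0).

    `response` is the estimated P/W impulse response by month after the shock.
    `lower_band` is an ASSUMED lower bound of a one-sided 90% confidence band
    (10% significance) for each month; how the paper constructs its standard
    errors is described in its Data Appendix and is not reproduced here.
    Only the first `horizon` months are counted.
    """
    if len(response) != len(lower_band):
        raise ValueError("response and lower_band must have the same length")
    positive = 0
    significant = 0
    for point, lower in zip(response[:horizon], lower_band[:horizon]):
        if point > 0:
            positive += 1
            # Significantly positive: the whole band lies above zero.
            if lower > 0:
                significant += 1
    return positive, significant


# Made-up 6-month response, purely for illustration (not data from the paper).
example_response = [-0.1, 0.2, 0.5, 0.4, 0.1, -0.2]
example_lower = [-0.4, -0.1, 0.1, 0.05, -0.2, -0.5]
print(count_positive_months(example_response, example_lower))  # (4, 2)
```

Applied to each industry's estimated response, the two returned counts correspond to the "P/W>0" and "Significant" columns of Table 1 for one sample period.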
If monetary shocks have an effect primarily through increases in costs, however, prices should rise as output falls. This is exactly what we
Figure 4 INDUSTRY OUTPUT & RELATIVE PRICE RESPONSES TO A FEDERAL FUNDS RATE SHOCK: ENTIRE SAMPLE PERIOD: JANUARY 1959 TO MARCH 2000
Thin line with circles, output: filled, significant at 10%; open, significant at 25%. Thick line with boxes, price/wage: filled, significant at 10%; open, significant at 25%.
Table 1 NUMBER OF PERIODS IN FIRST TWO YEARS P/W RESPONSE IS GREATER THAN ZERO, SIGNIFICANTLY AT 10% LEVEL
                                  Whole sample          Pre-Volcker           Volcker-Greenspan
Industry                          P/W>0   Significant   P/W>0   Significant   P/W>0   Significant
Total mfg.                        16      3             21      7             6       0
Durables                          6       0             18      0             4       0
Nondurables                       7       1             23      1             16      0
Food SIC 20                       1       0             24      15            18      0
Tobacco SIC 21                    0       0             23      2             23      11
Textiles SIC 22                   24      8             14      0             0       0
Apparel SIC 23                    22      18            15      11            0       0
Lumber SIC 24                     0       0             0       0             1       0
Furniture SIC 25                  3       0             21      6             7       1
Pulp & paper SIC 26               2       0             19      3             20      0
Printing & publishing SIC 27      0       0             8       0             0       0
Chemicals SIC 28                  0       0             15      7             9       0
Petroleum & coal SIC 29           1       0             19      13            16      0
Rubber & plastics SIC 30          0       0             19      14            13      5
Leather SIC 31                    0       0             0       0             1       0
Stone, clay, & glass SIC 32       13      0             17      10            18      0
Primary metals SIC 33             0       0             20      17            2       0
Fabricated metals SIC 34          0       0             18      14            6       3
Industrial mach. SIC 35           24      13            17      6             24      0
Electrical mach. SIC 36           24      8             21      0             13      0
Trans. equip. SIC 37              24      20            19      11            6       0
Motor veh. SIC 371                24      20            22      15            14      0
Instruments SIC 38                11      0             14      3             12      0
Other durables SIC 39             0       0             21      0             8       0
Total industries                  12      6             19      15            18      4
Total industries, n ≥ 2           10      6             19      15            16      3
observe in Figure 4c. Look at the price and output responses of motor vehicles: prices rise steeply and then decay slowly after a peak at 9 months; the output response is nearly the mirror image, falling to a trough at 9 months and slowly increasing from there. Nor are the industries showing significant cost effects of monetary policy unimportant or noninfluential. Among those with significant evidence of cost-shock effects are textiles, apparel, industrial machinery, electrical machinery, and transportation equipment. But these results are not limited to the industry level. Total manufacturing exhibits supply-side effects as well. Taken together, this evidence provides a case for a supply-side channel of monetary transmission as a powerful force supplementing the often assumed demand channel in creating real effects.
Some of the industries that exhibit strong cost-side effects run counter to our prior expectations. One such example is motor vehicles and parts, which shows a very pronounced increase in the ratio of price to wages. One might think that an industry governed by such large firms would not experience large cost effects of a monetary contraction, since they have easy access to commercial paper. A possible explanation is that the primary cost-side effect of a monetary contraction is through changes in market interest rates, rather than bank-loan behavior, so that even large firms experience significant increases in their costs. Another possible explanation is that the small companies that supply parts face loan reductions from their banks.
We now explore the extent to which the effects we identified may have changed over the sample period. To this end, we split the sample into the period February 1959 to September 1979 (the pre-Volcker period) and January 1983 to March 2000 (the Volcker-Greenspan period).
We choose these two subsamples based on the works of Faust (1998) and Gordon and Leeper (1994), who report substantial empirical differences between the aggregate effects of VAR-based identification of monetary policy in these two periods. Additionally, the choice of these two subsamples removes the volatility of monetary policy and economic aggregates experienced between late 1979 and 1982 from the data. Figures 5a through 5c show the results for the pre-Volcker period. To conserve space we do not show the graphs for the Volcker-Greenspan period. The information for both periods is summarized in columns 4 through 7 of Table 1. The difference between the two periods is substantial. Overall, we see that the early period through 1979 shows very strong cost-channel effects, whereas the later period shows little evidence of cost-channel effects. In the pre-Volcker period, all three manufacturing aggregates, as well as nearly every industry, exhibit some evidence of a cost-channel
Figure 5 INDUSTRY OUTPUT & RELATIVE PRICE RESPONSES TO A FEDERAL FUNDS RATE SHOCK: EARLY SAMPLE PERIOD: JANUARY 1959 TO SEPTEMBER 1979
Thin line with circles, output: filled, significant at 10%; open, significant at 25%. Thick line with boxes, price/wage: filled, significant at 10%; open, significant at 25%.
price effect. For total manufacturing and for 15 of the individual industries, the price effects are significant at the 10% level. In contrast, only lumber and leather and hides exhibit dominant demand channel effects during this period, and only lumber significantly. During the Volcker-Greenspan period the cost-channel effects are much weaker. While 16 industries exhibit rising prices, only three do significantly, and the paths of relative prices and output are not as clearly consistent with a supply shock as in the pre-Volcker period.
The results of this section display a good deal of heterogeneity, both across time and across industries. The next two sections will explore whether that heterogeneity can be linked to features that would change the strength of the cost channel.
4.3 INTERPRETING THE TIME PATTERN OF THE RESPONSES
In this section we argue that the changes in the responses we observe over time may be linked with a weakening of the cost-channel mechanism in the later period. We discuss institutional changes, and we provide evidence on the changing effect of monetary policy on aggregate variables.
As has been discussed by many observers (e.g., Friedman, 1986), the financial structure of the United States changed significantly during the late 1970s and early 1980s. The private-sector financial innovations beginning in the 1970s and the deregulation of the early 1980s led to more efficient and less regionally segmented financial markets. The banking and credit regulations of the earlier period, which limited the scope of lenders and borrowers to respond to sudden monetary contractions, may have allowed monetary policy to restrict the availability of working capital. In the later period, banks and firms had more alternative sources of funds.
A different type of institutional change also occurred over this time period. Romer and Romer (1993) use a narrative approach to show that during the earlier period, contractionary monetary policy was often accompanied by "credit actions," in which the Federal Reserve sought to limit directly the amount of bank lending. The consequent nonprice rationing led to particularly acute credit crunches, which could have led to severe limitations in working capital.
Finally, the switch from fixed to floating exchange rates during the 1970s may also explain the weakening of the cost channel. With floating exchange rates, a monetary contraction causes the exchange rate to appreciate, making imported materials cheaper. Thus, any direct cost-side effects of a monetary contraction may have been counterbalanced by the exchange-rate effect in the floating-rate period. Thus, well-documented
differences in financial markets, foreign-exchange markets, and Federal Reserve policy, combined with theory postulating the presence of a cost channel of monetary transmission, may explain the variation we see in the effects of monetary policy through time. We now present another type of evidence in support of the view that the nature of the monetary transmission mechanism changed over time. Recall from the introduction that, as noted by Bernanke and Gertler (1995), if monetary policy shifts both supply and demand in the same direction, the effect on output is greater than if it shifts only demand. Thus, if the cost channel of monetary transmission were more important during the earlier subperiod, we might expect that the effects of monetary policy on output would be greater in magnitude and last longer in the earlier period. To test this hypothesis, we estimate the basic macro part of the model for the two subperiods [that is, system (3) minus the last two equations]. Because we wish to compare the magnitudes of the response of output to a given shock to monetary policy, we set the innovation for both periods equal to 25 basis points, the typical interval of change in Federal Reserve policy.12 Figure 6 shows the responses of the FFR, industrial production, and the aggregate price level to a 25-basis-point federal-funds shock for the model estimated over each of the two subsamples (February 1959 to September 1979 and January 1983 to March 2000) and the entire sample (February 1959 to March 2000). Consider first the difference in the behavior of the FFR, in Figure 6a. The peak responses of the early and the later period are very similar, but their duration is very different. The funds rate takes almost 2 years to return to its original level during the early period, but takes only about 9 months to return to normal during the later period. Consider now the impulse responses of output, in Figure 6b. 
Comparison of the figures shows that the trough of output is almost 4 times as deep during the early period as during the later period. Moreover, the duration of the effect on output appears to be much longer during the early period. The trough occurs more than 2 years after the initial shock during the early period, but less than 1 year after the initial shock during the later period. Furthermore, during the early period output is still well below its previous level even 4 years after the shock. During the later period, output rebounds within 2 years of the shock to the FFR. Thus, both in magnitude and duration, a given monetary shock had much greater effects during the earlier period. The difference in the effects cannot be fully accounted for by the difference in the response of the FFR over these two periods.
12. The standard deviation of the innovation to the FFR equation for a regression using the pre-Volcker sample is 25.6 basis points; for a regression on the Volcker-Greenspan period it is 18.2 basis points; and for a regression over the entire data sample it is 46.9 basis points.
Figure 6 AGGREGATE RESPONSES TO A 25-BASIS-POINT FEDERAL-FUNDS-RATE SHOCK, ACROSS DATA SAMPLES
Line with circles: entire sample, January 1959 to March 2000. Line with squares: early sample, January 1959 to September 1979. Line with triangles: later sample, January 1983 to March 2000. Filled marks, significant at 10%; open marks, significant at 25%.
Finally, consider the behavior of prices in response to a monetary contraction. Despite including commodity prices and oil-shock dates in the reaction function, the price puzzle appears to be fully operational in the early period.13 After a contractionary monetary policy shock, prices rise for over 2 years before beginning to fall. By contrast, during the later sample period, prices are mostly unresponsive after a brief positive spike in the first 5 months.
Our finding that in the pre-Volcker period aggregate prices rise in the short run following a monetary contraction is consistent with the results of Hanson (1998). Recall that Sims's (1992) original motivation for including an index of commodity prices in a VAR to identify the Fed's feedback function was as a leading indicator of incipient inflation. Hanson tests a variety of variables (including commodity prices) that might have power to forecast inflation in a similar VAR identification of monetary policy functions, and finds that in the pre-1979 sample period none of these eliminate the price puzzle.
We also explored several alternative specifications of the Federal Reserve's reaction function in search of one that might dissipate the price puzzle in the early sample period.14 The price-puzzle finding was robust to almost all specifications we tried. The only specification we could find that significantly reduced the price puzzle in the early period was one that satisfied all of the following criteria: (1) it included M1 or M2; (2) it included commodity prices in the Federal Reserve's reaction function; (3) it excluded any measure of oil prices, be it dummy variables or a price index; and (4) it used a lag length of 12 or greater.
In this specification, the magnitude of the price-level rise was greatly reduced and was no longer statistically different from zero. We felt, however, that this specification did not make economic sense, because it assumed that the Federal Reserve monitored only general commodity prices without observing oil prices specifically. It is also worth noting that even under this anomalous specification, price-to-wage ratios still rise at the industry level.
We believe that, in combination with Hanson's work, this casts doubt on the now widely accepted view that the price puzzle is the result of the Fed possessing better information about coming inflation than is captured in a simple VAR with aggregate output, prices, and monetary policy variables (like the FFR). The results of this paper suggest that the real solution to the price puzzle may lie instead with a cost channel of monetary transmission, which leads to a short-run increase in prices. As noted previously, if monetary policy does transmit its effects on real variables through a cost channel, then rising prices in the short run following a contractionary policy shock are not a puzzle.
Thus, three pieces of evidence suggest that the cost channel may have been a more important part of the monetary transmission mechanism in the period before 1980. First, the industry-level regressions show that many more industries experienced rising price-wage ratios and falling output after a monetary contraction. Second, we appeal to the restrictive regulations and policy actions during the earlier period as leading to particularly acute credit crunches. Third, we show that the amplification and duration effects on output and the price-puzzle effects are substantially greater during the earlier period.
13. The rise in prices during this period is significant at the 10% level for more than 3 years following the FFR shock.
14. We undertook this exploration at the urging of Christopher Sims.
4.4 ANALYSIS OF THE CROSS-INDUSTRY HETEROGENEITY OF THE RESPONSES
The industry results display a great deal of heterogeneity that can potentially shed light on the monetary transmission mechanism. A comprehensive analysis of the cross-industry heterogeneity in the price-to-wage responses would require estimation of a structural model, since the responses depend on both the demand and supply effects of monetary policy. Such an analysis is beyond the scope of this paper. We can, however, offer suggestive evidence linking balance-sheet variables to the behavior of price-to-wage ratios. Data from the Quarterly Financial Reports (QFR) suggests that the rise in the relative prices of these industries may be directly related to financing costs. The QFR aggregated balance-sheet and income-statement data, back to fourth quarter 1973, for 14 of the two-digit manufacturing industries which we study. For each of these industries, we constructed from these data a measure of interest expense normalized by net industry sales. The Data Appendix contains the details. To compare these measures with the price-to-wage responses previously described, we considered two summary measures of these responses: the peak response, and the integral of the response function. Since the interest-expense time series for all industries exhibit a strong upward trend and several are highly volatile, we smoothed these data using a Hodrick-Prescott filter and took two cross-sectional snapshots of the data, one for each of the subsample periods. Using NBER dates for recessions, we chose the second quarter of a recession for each sample period to take a cross-sectional snapshot. The two periods chosen were
Table 2 INDUSTRY P/W RESPONSE CORRELATION WITH INTEREST EXPENSE
                                   Correlation
Sample   Int. expense quarter      Peak response   Integral
Early    1974:1                    0.529           0.519
Late     1990:4                    0.434           0.395
presumably stressful periods of financing for manufacturers, since the FFR was still high and sales had begun to decline in each industry. The cross-sectional snapshot for the earlier sample period is first quarter 1974, and for the later one, fourth quarter 1990.
Table 2 presents the correlation between the two summary measures of the price-wage response and interest expense as a fraction of net sales. For the early period both summary statistics of the price-wage ratio have a correlation with industry interest expense of just over 0.5. That is, those industries that have the largest relative price responses also tend to have the most burdensome interest expenses, consistent with a cost-channel hypothesis for monetary transmission. Surprisingly, despite the relative weakness of cost-channel effects apparent in the price responses of the later period, Table 2 suggests that the cost-channel effects may still be present. While the correlation across industries between the price-wage response and interest expense does decline slightly, it is still strongly positive at about 0.4.15 Thus, there appears to be a strong link between the response of industry prices and a key balance-sheet variable.
15. While we chose these two snapshots to illustrate periods of stress, the results are nearly identical for other periods.
5. Possible Alternative Explanations
This section considers three possible alternative explanations of price and output responses discussed in the previous sections. The first is that our finding of rising price-to-wage ratios is due mostly to falling wages, rather than rising prices. Wages might be more variable than prices if initial cuts in output involve the elimination of overtime hours and overtime premia. The second alternative explanation is that we are not adequately addressing the Fed's forecasts of future inflation in our estimated reaction function. The third alternative explanation, countercyclical markups, has been the subject of intense research in recent years by several authors. We discuss each of these possible explanations.
5.1 STICKY PRICES AND FLEXIBLE WAGES
One possible explanation for the results of this paper is that the price-wage ratio rises in some industries after a monetary contraction because prices are sticky whereas wages are not. If a monetary contraction reduces the demand for an industry's output, firms respond by lowering their output and consequently labor demand. If, for some reason, prices cannot adjust immediately but wages can, then wages will fall relative to prices.
We consider this explanation to be implausible. Christiano, Eichenbaum, and Evans (1997) show that the behavior of profits is inconsistent with a sticky-price model of money. They show empirically that profits decline significantly in the wake of a monetary contraction. In contrast, a reasonable specification of a sticky-price model predicts rising profits in response to a monetary contraction. Thus, it is unlikely that a sticky-price model can explain these facts.
We also found direct evidence that this type of model cannot explain our results. We investigated the separate responses of nominal prices and wages by industry for the period 1959 to 1979, which had the strongest rises in the price-to-wage ratio.16 We found that the nominal price level itself rises in virtually all of the industries. Nominal wages fall in some industries, but rise or are flat in most industries. It is clear that our earlier results are being driven primarily by rising nominal prices, not by falling nominal wages.
5.2 EXPECTED FUTURE INFLATION
As discussed earlier, a leading explanation for the price puzzle is misspecification of the Federal Reserve reaction function. In particular, if the Fed changes the FFR because it is forecasting future inflation that is not anticipated by a parsimonious VAR, then the incorrectly specified reaction function will make it look as if shocks to the funds rate raised prices. It may be that industrial production, consumer prices, and commodity prices are not sufficient to capture all of the information used by the Fed to forecast future inflation. To address this issue, we include actual Federal Reserve Board forecasts of current and future inflation and output in our policy equation.
16. The graphs showing these results are omitted for space reasons. They are available upon request from the authors.
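The misspecification concern can be illustrated with a small synthetic sketch. It is not the authors' estimation: the data are simulated, a single lagged regressor stands in for the full set of VAR lags, and the names `lagged_output`, `fed_forecast`, and `ols_residuals` as well as all coefficients are hypothetical. It shows only that when the policy rate responds to a forecast that is omitted from the estimated reaction function, the measured "shock" remains correlated with that forecast, and that adding the forecast as a regressor removes the correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic series (hypothetical, not the paper's data).
lagged_output = rng.normal(size=n)      # stands in for lagged VAR variables
fed_forecast = rng.normal(size=n)       # stands in for a Fed inflation forecast
true_shock = 0.1 * rng.normal(size=n)   # the exogenous policy shock
ffr = 0.5 * lagged_output + 2.0 * fed_forecast + true_shock


def ols_residuals(y: np.ndarray, regressors: list[np.ndarray]) -> np.ndarray:
    """Residuals from an OLS regression of y on a constant and the regressors."""
    X = np.column_stack([np.ones_like(y)] + regressors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


# Measured shocks under a parsimonious and an augmented reaction function.
shock_parsimonious = ols_residuals(ffr, [lagged_output])
shock_augmented = ols_residuals(ffr, [lagged_output, fed_forecast])

corr_parsimonious = np.corrcoef(shock_parsimonious, fed_forecast)[0, 1]
corr_augmented = np.corrcoef(shock_augmented, fed_forecast)[0, 1]
print(f"corr(shock, forecast), forecast omitted:  {corr_parsimonious:.3f}")
print(f"corr(shock, forecast), forecast included: {corr_augmented:.3f}")
```

In the augmented regression the residual is orthogonal to the forecast by construction, which is the sense in which including the Fed's own forecasts controls for its anticipation of inflation.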
Romer and Romer (2000) have compiled a series of past forecasts from the Green Books prepared by the Federal Reserve's staff prior to each FOMC meeting. They demonstrate that these forecasts incorporate information not available to private forecasters. We use these monthly forecasts of inflation and output for the current quarter and one quarter ahead. We included these series as exogenous variables in the FFR equation. In doing so, we are making two assumptions. First, only the Federal Reserve has access to its forecasts in the relevant period. Second, the Fed's ex post policy actions do not change its forecasts in subsequent months. While the first assumption is unimpeachable, the second is a bit more dubious. The Federal Reserve staff likely would change its forecasts as new information became available. However, this specification should serve as a convenient benchmark for testing the price-puzzle hypothesis that ex ante the Fed possesses superior knowledge about coming inflation. Specifically, we estimated equation (3) less the last two industry equations and with the following modification to the fifth equation for the FFR:
where $GB\Delta Y_t^{t+i}$ is the Fed Green Book forecast for output growth for quarter t + i made in month t, and similarly, $GB\Delta P_t^{t+i}$ is the forecast of inflation. Figure 7 shows the effect of controlling for the Fed's inflation forecasts on the aggregate results. As the graphs make clear, using a better measure of inflation forecast does not change the results noticeably. Aggregate prices still rise significantly in the first two years following an unanticipated increase in the FFR. Thus, it seems unlikely to us that our results could be explained by a misspecified reaction function.
5.3 COUNTERCYCLICAL MARKUPS
Countercyclical markups have been offered as a possible factor in cyclical fluctuations in recent years by Rotemberg and Woodford (1991, 1992) and Chevalier and Scharfstein (1996) among others. A countercyclical markup is a spread between price and marginal cost (the markup above marginal cost) that increases in recessions and decreases in booms. The
Figure 7 AGGREGATE RESPONSES TO A 25-BASIS-POINT FEDERAL FUNDS RATE SHOCK, USING GREEN BOOK FORECASTS
Sample period: November 1965 to September 1979. Thin line with circles, standard specification: filled, significant at 10%; open, significant at 25%. Thick line with boxes, Green Book specification: filled, significant at 10%; open, significant at 25%.
direct link with the evidence presented here is that the above authors often consider the price-to-wage ratio to be an accurate measure of markup. For example, Rotemberg and Woodford (1992) argue that a theory of countercyclical markups is required in order to explain the increase in real product wages after an increase in military spending. Subsequent work by Ramey and Shapiro (1998), also confirmed in Section 2 of this paper, shows that properly measured real product wages fall in the wake of a military spending increase. Thus, other demand shocks do not appear to be propagated by countercyclical markups.

Chevalier and Scharfstein (1996) present the most compelling evidence of countercyclical markups in their analysis of the pricing behavior of national, regional, and local supermarkets during national and regional downturns. They present a model of capital-market imperfections in which firms with low cash flow sacrifice long-term market share in order to raise short-term profits. Firms implement this policy by raising their markups. In the data, Chevalier and Scharfstein find that leveraged firms do indeed lower their nominal prices less (or raise them more) during recessions than do less leveraged firms. An equally plausible explanation for the price increases observed for leveraged firms is that their marginal costs rose due to increased external financing premiums. In fact, the markup and cost-channel theories are really just variations on a similar theme. The countercyclical-markup hypothesis argues that liquidity constraints lead to higher prices because they raise optimal markups; the cost-channel theory argues that liquidity constraints raise prices because they raise marginal costs. Without an accurate measure of the marginal costs of these firms (including financing costs), one cannot tell whether markups are indeed going up with prices, or whether marginal costs of production and distribution are rising.

6. Concluding Remarks
This paper has presented several types of evidence to suggest that monetary policy has supply-side effects on real variables. We first demonstrated that the response of aggregate economic variables, notably productivity and real wages, to a monetary contraction is more similar to that of a contractionary technology shock than to a contractionary demand shock. Second, we showed that in key manufacturing industries, relative prices rise and output falls following an unanticipated monetary contraction, even after controlling for both the price puzzle and the cost effects of oil shocks. We found that the industry-level evidence for a cost
channel of monetary transmission is much stronger during the period from 1959 to 1979 than from 1983 to 2000, and that during both periods, industry heterogeneity appears to be related to industry debt-service burdens. During the earlier period, many more industries exhibited rising prices in response to a monetary contraction. Moreover, the effects of monetary policy on output were greater and the price puzzle was more pronounced during this earlier period. These results are consistent with a cost channel of monetary transmission.
Data Appendix
Almost all data used in this paper come from one of the following sources: the Bureau of Economic Analysis (BEA) at the Department of Commerce; the Bureau of Labor Statistics (BLS) at the Department of Labor; the Federal Reserve Board (FRB); and the Quarterly Financial Reports (QFR) published by the Bureau of Economic Analysis. The exceptions are: the index of sensitive commodity prices, for which we thank Charles Evans at the Federal Reserve Bank of Chicago; the Hoover-Perez oil dates (monthly, 1947:12, 1953:06, 1956:06, 1957:02, 1969:03, 1970:12, 1974:01, 1978:03, 1979:09, 1981:02, 1990:08), which come from Hoover and Perez (1994) and are supplemented by this paper's authors with 1998:12 (see footnote 9); Romer dates (quarterly, 1947:4, 1955:3, 1968:4, 1974:2, 1978:3, 1979:4, 1988:4), which come from Romer and Romer (1994); and Ramey-Shapiro dates (quarterly, 1950:3, 1965:1, 1980:1), which come from Ramey and Shapiro (1998).

SECTION 2
Figure 1
productivity: index of output per hour in business, BLS; private hours: index of total hours in business, BLS; real wages: nominal hourly compensation in business divided by deflator for private business, BLS; price level: deflator for private business, BLS; money: M2, FRB; federal funds rate, FRB. All data are quarterly series and in logarithms, except the FFR, which is the quarterly average level.
Figure 2a
defense purchases of aircraft and equipment, billions of chained 1992 dollars, BEA; industrial production in SIC 372, FRB; average hourly earnings of production workers in SIC 372, BLS; producer price index for aircraft and parts, BLS. (The price data were missing from September to December 1985. We interpolated the data using the price deflator for transportation equipment, excluding motor vehicles, derived from BEA shipments data.) All data are quarterly averages of monthly data.
Figure 2b
output: GDP in chained 1996 dollars, BEA; real wages: nominal hourly compensation in business divided by deflator for private business, BLS; both quarterly logarithms. Romer dates and Ramey-Shapiro dates are given above.

SECTION 3
Figure 3
The first five variables in the VAR are the same as those for Figures 4-6, described below. The hours variable is defined as the log of the product of average weekly hours and the number of production workers, BLS. Inventories are the log of chain-weighted manufacturing inventories by stage of processing from the BEA. The ratios are created by taking the difference between the log of inventories and the log of hours.

SECTION 4
Figures 4-6
Macroeconomic variables: output: total industrial production, FRB; price level: personal consumption expenditure deflator, BEA; commodity prices: index of sensitive commodity prices, Charles Evans (see above); money: M2, FRB; federal funds rate, FRB.
Industry variables: output: industrial production by two-digit and three-digit SIC code, as well as total manufacturing, durable manufacturing, and nondurable manufacturing, FRB; prices: for total, durable, and nondurable manufacturing, producer price indices from the BLS were used; for two- and three-digit SIC industries, deflators derived from BEA shipments data were used; wages: average hourly earnings of production workers, BLS and DRI. All data are monthly and in logarithms, except the FFR, which is the monthly average level, and the industry price/wage ratio, which is the log difference of the two applicable series.
Table 2
A measure of interest expense is created from QFR two-digit manufacturing industry balance-sheet data and FRB interest-rate data. Actual industry interest expense has only been reported in the QFR since 1998. Gertler and Gilchrist (1994) construct an approximation by multiplying the sum of short-term bank loans and other short-term debt by the commercial-paper rate.
We compared this measure with actual interest expense, reported from 1998 forward, and found that it is too small by an order of magnitude, and that the two measures are uncorrelated. The difference appears to come from interest on longer-term debt. To correct for this discrepancy, we added to the Gertler-Gilchrist measure the difference of total and current liabilities multiplied by the yield on BAA-rated corporate bonds. The resulting measure is of the same order of magnitude as reported interest expense and is highly correlated with it.
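The adjusted measure can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above: the function name, the argument names, and the convention that rates are entered as decimals (0.05 for 5 percent) are assumptions made here, not part of the original data construction.

```python
def approx_interest_expense(cp_rate, st_bank_loans, other_st_debt,
                            baa_yield, total_liabilities, current_liabilities):
    """Approximate an industry's interest expense from balance-sheet items.

    Hypothetical helper: all names are illustrative, and rates are assumed
    to be decimals (e.g. 0.05 for 5 percent).
    """
    # Gertler-Gilchrist component: short-term debt priced at the
    # commercial-paper rate.
    short_term_part = cp_rate * (st_bank_loans + other_st_debt)
    # Added component: longer-term liabilities (total minus current)
    # priced at the BAA corporate bond yield.
    long_term_part = baa_yield * (total_liabilities - current_liabilities)
    return short_term_part + long_term_part


# Made-up numbers, purely to show the call.
example = approx_interest_expense(
    cp_rate=0.05, st_bank_loans=100.0, other_st_debt=50.0,
    baa_yield=0.08, total_liabilities=1000.0, current_liabilities=400.0,
)
print(example)
```

With these made-up inputs the long-term component (0.08 x 600) is several times the short-term component (0.05 x 150), which is the pattern the comparison with reported interest expense points to.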
To be specific, we calculate interest expense as the product of the commercial-paper rate with the sum of short-term bank loans and other short-term debt, added to the product of the yield on BAA-rated corporate bonds with the difference between total and current liabilities. Because the interest-expense series for all industries studied have easily apparent time trends and tend to exhibit significant interquarter volatility, we smoothed the data using a Hodrick-Prescott filter before taking cross-sectional correlations.
Figure 7
Macroeconomic variables as above. Green Book data on Federal Reserve Board staff forecasts for coming inflation and output growth were kindly supplied by David and Christina Romer at the University of California, Berkeley. Because the FOMC meetings do not occur every month, there are several months with missing values for the period November 1965 to September 1979. We filled in the missing values using the last available forecast.

CALCULATION OF STANDARD ERROR BANDS FOR IMPULSE RESPONSE FUNCTIONS
In all figures, significance levels refer to one-tailed hypothesis tests.
Figure 1
The model with long-run restrictions was estimated via the Generalized Method of Moments (GMM), using the IV method suggested by Shapiro and Watson. To calculate standard error bands, we used the estimated mean and variance-covariance matrix of coefficients to generate 500 draws from a normal distribution with the same mean and variance-covariance matrix. We then computed impulse response functions from each of those draws. For each horizon, we sorted the responses and chose the ones corresponding to the percentage bands given in the figure.
Figures 2-7
VARs were estimated using ordinary least squares (OLS). We calculated confidence bands for the impulse responses using Kilian's (1998) bootstrap-after-bootstrap bias correction method. Using the OLS-estimated errors and coefficient matrix, we created 1000 bootstrapped realizations of the endogenous time-series data, with which we reestimated the coefficient matrix for each realization. We then used the bootstrapped coefficient matrices to estimate and correct the asymptotic bias of the OLS coefficient matrix. The bias-adjusted coefficient matrix was then used to create 1000 bootstrapped estimates of the impulse response functions to approximate their asymptotic distribution. For each iteration of both bootstraps, the initial conditions were assumed to be those of the original regression.

REFERENCES
Bernanke, B., and A. Blinder. (1992). The federal funds rate and the channels of monetary transmission. American Economic Review 82(4):901-921.
Bernanke, B., and M. Gertler. (1989). Agency costs, net worth, and business fluctuations. American Economic Review 79(1):14-31.
Bernanke, B., and M. Gertler. (1995). Inside the black box: The credit channel of monetary policy transmission. Journal of Economic Perspectives 9(4):27-48.
Bernanke, B., M. Gertler, and M. Watson. (1997). Systematic monetary policy and the effects of oil price shocks. Brookings Papers on Economic Activity 1:91-142.
Blinder, A. (1987). Credit rationing and effective supply failures. The Economic Journal 97(386):327-352.
Chevalier, J., and D. Scharfstein. (1996). Capital-market imperfections and countercyclical markups: Theory and evidence. American Economic Review 86(4):703-725.
Christiano, L., and M. Eichenbaum. (1992). Liquidity effects and the monetary transmission mechanism. American Economic Review 82(2):346-353.
Christiano, L., M. Eichenbaum, and C. Evans. (1997). Sticky price and limited participation models of money: A comparison. European Economic Review 41(6):1201-1249.
Christiano, L., M. Eichenbaum, and C. Evans. (1999). Monetary policy shocks: What have we learned and to what end? In Handbook of Macroeconomics, vol. 1A, J. B. Taylor and Michael Woodford (eds.). Amsterdam: North-Holland, pp. 65-148.
Davis, S., and J. Haltiwanger. (1997). Sectoral job creation and destruction responses to oil price changes and other shocks. Unpublished manuscript.
Farmer, R. (1984). A new theory of aggregate supply. American Economic Review 74(5):920-930.
Farmer, R. (1988a). Money and contracts. Review of Economic Studies 55(3):431-446.
Farmer, R. (1988b). What is a liquidity crisis. Journal of Economic Theory 46(1):1-15.
Faust, J. (1998). The robustness of identified VAR conclusions about money. Carnegie-Rochester Conference Series on Public Policy 49:207-244.
Fleischman, C. (2000). The causes of business cycles and the cyclicality of real wages. Unpublished manuscript.
Francis, N., and V. Ramey. (2001). Is the technology-driven real business cycle hypothesis dead? Shocks and aggregate fluctuations revisited. Unpublished manuscript.
Friedman, B. (1986). Money, credit and interest rates in the business cycle. In The American Business Cycle: Continuity and Change, Robert J. Gordon (ed.). Chicago: The University of Chicago Press.
Fuerst, T. (1992). Liquidity, loanable funds, and real activity. Journal of Monetary Economics 29(1):3-24.
Gali, J. (1999). Technology, employment, and the business cycle: Do technology shocks explain aggregate fluctuations? American Economic Review 89(1):249-271.
Gertler, M., and S. Gilchrist. (1994). Monetary policy, business cycles, and the behavior of small manufacturing firms. Quarterly Journal of Economics 109(2):309-340.
Gordon, D., and E. Leeper. (1994). The dynamic impacts of monetary policy: An exercise in tentative identification. Journal of Political Economy 102(6):1228-1247.
Hamilton, J. (1985). Historical causes of postwar oil shocks and recessions. The Energy Journal 6(1):97-116.
Hamilton, J. (1996). This is what happened to the oil price-macroeconomy relationship. Journal of Monetary Economics 38(2):215-220.
Hanson, M. (1998). On the identification of monetary policy: The "price puzzle" reconsidered. Unpublished manuscript.
Hoover, K., and S. Perez. (1994). Post hoc ergo propter hoc once more: An evaluation of "Does monetary policy matter?" in the spirit of James Tobin. Journal of Monetary Economics 34(1):89-99.
Kashyap, A., O. Lamont, and J. Stein. (1994). Credit conditions and the cyclical behavior of inventories. Quarterly Journal of Economics 109(3):565-592.
Kashyap, A., J. Stein, and D. Wilcox. (1993). Monetary policy and credit conditions: Evidence from the composition of external finance. American Economic Review 83(3):78-98.
Kilian, L. (1998). Small-sample confidence intervals for impulse response functions. The Review of Economics and Statistics 80(2):218-230.
King, R., and M. Goodfriend. (1997). The new neoclassical synthesis and the role of monetary policy. In NBER Macroeconomics Annual 1997. Cambridge, MA: National Bureau of Economic Research, pp. 231-283.
Ramey, V. (1989). Inventories as factors of production and economic fluctuations. American Economic Review 79(3):338-354.
Ramey, V. (1992). The source of fluctuations in money: Evidence from trade credit. Journal of Monetary Economics 30(2):171-193.
Ramey, V., and M. Shapiro. (1998). Costly capital reallocation and the effects of government spending. Carnegie-Rochester Conference Series on Public Policy 48:145-194.
Romer, C., and D. Romer. (1989). Does monetary policy matter: A new test in the spirit of Friedman and Schwartz. In NBER Macroeconomics Annual 1989. Cambridge, MA: National Bureau of Economic Research, pp. 63-129.
Romer, C., and D. Romer. (1993). Credit channel or credit actions? An interpretation of the postwar transmission mechanism. In Changing Capital Markets: Implications for Monetary Policy. A symposium sponsored by the Federal Reserve Bank of Kansas City, pp. 71-116.
Romer, C., and D. Romer. (1994). Monetary policy matters. Journal of Monetary Economics 34(1):75-88.
Romer, C., and D. Romer. (2000). Federal Reserve private information and the behavior of interest rates. American Economic Review 90(3):429-457.
Rotemberg, J., and M. Woodford. (1991). Markups and the business cycle. In NBER Macroeconomics Annual 1991. Cambridge, MA: National Bureau of Economic Research, pp. 63-129.
Rotemberg, J., and M. Woodford. (1992). Oligopolistic pricing and the effects of aggregate demand on economic activity. Journal of Political Economy 100(6):1153-1207.
Seelig, S. (1974). Rising interest rates and cost push inflation. Journal of Finance 29(4):1049-1061.
Shapiro, M. (1981). Identification and estimation of the "Wright Patman Effect." Unpublished manuscript.
Shapiro, M., and M. Watson. (1988). Sources of business cycle fluctuations. In NBER Macroeconomics Annual 1988. Cambridge, MA: National Bureau of Economic Research, pp. 111-148.
Shea, J. (1993). Do supply curves slope up? Quarterly Journal of Economics 108(1):1-32.
Sims, C. (1992). Interpreting the macroeconomic time series facts: The effects of monetary policy. European Economic Review 36(5):975-1000.
U.S. Congress. Joint Economic Committee. (1970). Report on the January 1970 Economic Report of the President. Washington: U.S. Government Printing Office, pp. 55-56.
Comment¹
CHARLES L. EVANS
Federal Reserve Bank of Chicago

¹ This paper represents the views of the author and should not be interpreted as reflecting the views of the Federal Reserve Bank of Chicago or the Federal Reserve System.

1. Introduction
The cost channel presented by Barth and Ramey is a potentially important component of the monetary transmission mechanism. Casual evidence of this phenomenon often appears in economic discussions between central bankers and the public. For example, Federal Reserve staff regularly collect anecdotal survey information about regional and national economic developments from businesses. During a time of rising short-term interest rates, it is not unusual to hear about rising inventory costs and the increasing likelihood that these higher costs will be passed along to consumers in the form of higher prices. If these high short-term interest rates reflect an attempt to fight inflationary pressures through contractionary monetary policy, these anecdotes suggest that more inflation will be forthcoming, not less. This is the essence of the Wright Patman effect described by Barth and Ramey. Of course, the significance of anecdotes alone is usually unclear. Consequently, Barth and Ramey's empirical analysis of this issue provides useful evidence on the importance of the cost channel for the U.S. economy.

This is an ambitious and useful paper. Barth and Ramey use a variety of identification restrictions to identify technology, aggregate-demand, and monetary-policy shocks. They find that monetary policy shocks induce economic responses that are more like the responses following technology shocks than like those following demand shocks. They interpret the aggregate and industry results as providing support for a cost channel in the monetary transmission mechanism. Their paper is ambitious in trying to identify the effects of this important economic mechanism without an explicit dynamic general equilibrium model. It is very useful to understand how far the evidence can be pushed to argue for the role of a cost channel over other channels. Although I think the cost channel is probably important, my comments focus primarily on the potential contributions of other endogenous mechanisms for explaining the estimated impulse responses. In particular, Barth and Ramey do not spend much time discussing the role of systematic monetary policy or variations in factor utilization. When these features are considered, their empirical results are sometimes more favorable to the cost-channel explanation and other times less favorable. In the absence of a dynamic economic model, it is difficult to quantify each channel's contribution.

2. Sources of Propagation and Multiple Shocks
In dynamic general equilibrium models, exogenous impulses can generate persistent responses in endogenous variables through a variety of propagation and amplification mechanisms. Uniquely identifying the economic mechanisms from a small number of first and second moments of the data is challenging. For example, Sargent (1978) presents a dynamic equilibrium analysis of the labor market in which fluctuations are driven by exogenous impulses to productivity and real wages. There are three sources of propagation in the model: persistence in the exogenous wage process, persistence in the exogenous productivity processes, and costs of adjusting labor hours. Although the model is formally econometrically identified, Sargent displays two sets of parameter estimates which have approximately the same likelihood owing to the finite sample length. The economic differences between these parameter estimates center on the sources of propagation. In one case, persistence comes chiefly from the productivity shocks. In the other, persistence is due mainly to more costly adjustment in labor hours.
In the context of Sargent's model, the real-wage and labor-hours data alone do not provide convincing information on the sources of propagation. A more recent example of this identification problem comes from research by Gali (1999), Basu, Fernald, and Kimball (1998), Dotsey (1999), and Francis and Ramey (2001). Gali (1999) and Basu, Fernald, and Kimball (1998) have found empirically that a positive technology shock leads to a muted response of output initially and a fall in employment. If firms have predetermined prices and are committed to satisfying demand at those prices, then an exogenous increase in productivity will not lead to an increase in output unless demand increases at the predetermined price. This result depends on the way in which monetary policy systematically responds to the state of the economy. If an exogenous money growth rule is assumed, then aggregate demand will not systematically increase following a positive productivity shock. Within the framework of Gali's model, however, Dotsey (1999) demonstrates that empirically plausible Taylor rules (estimated in Clarida, Gali, and Gertler, 2000) lead the monetary authority to reduce short-term interest rates in this situation. The Taylor rule's endogenous response to the state of the economy stimulates aggregate demand enough so that output and employment rise. Dotsey's result highlights the importance of realistically capturing the systematic component of monetary policy in dynamic economies. Conditional on the validity of Gali, Basu, Fernald, and Kimball's empirical facts and of Clarida, Gali, and Gertler's Taylor-rule estimates, Gali's model with sticky prices is incomplete.

In a very different dynamic general equilibrium model, Francis and Ramey (2001) show that a flexible-price economy with habit persistence in consumption preferences and costs of adjusting investment can generate essentially no initial response of aggregate demand following a technology shock. With aggregate demand and output effectively predetermined relative to technology shocks, this leads to a fall in employment following a positive technology shock. With no nominal rigidities in the model, the specification of the monetary policy rule is not necessary to determine real allocations. Francis and Ramey's analysis highlights two alternative propagation mechanisms in order to match the empirical facts. Clearly, more information from additional data sources is required to sort these issues out.

Barth and Ramey's empirical investigation of multiple economic shocks generates two types of information that are not available from a single-shock analysis and that may discriminate among different explanations. The first type of new information is the key insight of their analysis.
If different economic shocks have similar effects on a subset of macroeconomic data, the similarities can imply that a common economic mechanism is responsible. Barth and Ramey focus on the empirical responses of macroeconomic data following three identified shocks: technology, aggregate demand, and monetary policy. Their Figure 1 shows that contractionary monetary policy and technology shocks separately lead to reductions in productivity. Identified contractionary aggregate-demand shocks increase productivity. The similarity between technology and monetary policy shock responses leads to a theoretical discussion of the importance of a supply channel, namely working-capital costs, in the monetary transmission mechanism.

The second type of new information is more subtle and difficult to disentangle qualitatively. If different economic shocks have different effects on a subset of macroeconomic data, the differences may imply that a common economic mechanism is at work in each case. For example, assume the cost channel is important. Barth and Ramey and Christiano, Eichenbaum, and Evans (1997) capture this by assuming firms must finance their wage bill by borrowing working capital. This feature of the economy is active following all realizations of the economic shocks. Barth and Ramey's Figure 1 shows that contractionary demand shocks lead to higher real wages, while contractionary technology and monetary policy shocks lead to lower real wages. If interest rates respond endogenously to the contractionary demand shock, the cost channel may be responsible for the magnitude of the real-wage response, which can be either larger or smaller depending on the direction of the interest-rate response. I discuss this in a more specific context below. From the perspective of a qualitative analysis, different signs in these responses are probably easiest to interpret. But different magnitudes of responses can be enough to identify alternative propagation and amplification mechanisms in the context of a tightly parametrized theoretical analysis.

3. Systematic Monetary-Policy Responses
There is much recent evidence on the systematic response of monetary policy to the state of the economy (e.g., Taylor, 1993; Clarida, Gali, and Gertler, 2000; Christiano, Eichenbaum, and Evans, 1999). Barth and Ramey's VAR impulse response functions seem to be quite consistent with the evidence that U.S. monetary-policy actions are well approximated by Taylor rules. This is not at all surprising. A forward-looking Taylor rule without interest-rate smoothing can have the following form:
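One way to write such a rule, in line with the variable definitions that follow, is given below. The intercept $\alpha$, the response coefficients $\beta$ and $\gamma$, and the time subscripts on $FF_t$ and $\Omega_t$ are assumptions introduced here for illustration and need not match the author's notation.

```latex
FF_t = \alpha
     + \beta \, E\!\left[\, p_{t+s} - p_t - \pi^{*} \,\middle|\, \Omega_t \right]
     + \gamma \, E\!\left[\, y_t - y^{*} \,\middle|\, \Omega_t \right]
```

Under this reading, $\beta > 0$ and $\gamma > 0$ would mean that the funds rate is raised when expected inflation exceeds its target or when output is above potential.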
where $FF$ is the federal funds rate, $y_t - y^*$ is an output gap, $p_{t+s} - p_t - \pi^*$ is an $s$-period-ahead inflation gap, and $\Omega_t$ is the Fed's information set for forming conditional expectations of future variables, as well as possibly latent variables like an output gap. In implementing these rules, policymakers must evaluate these expectations. The resulting policy reaction function is a dynamic feedback rule which has the same form as an interest-rate equation in a VAR. Allowing for interest-rate smoothing enhances the similarities further. In these reaction functions, policy responds systematically to all economic shocks. Focusing on the contractionary aggregate demand shock in Figure 1, output falls while future inflation is essentially unchanged for about two years. The significant fall in the federal funds rate captures
Figure 1 IMPULSE RESPONSES FOLLOWING A RAMEY-SHAPIRO SHOCK
a Taylor-like response to a negative output gap. The subsequent modest rise in inflation is consistent with the expansionary monetary policy. So there appears to be a substantial systematic response of monetary policy following the identified aggregate-demand shock in Barth and Ramey's Figure 1.

Given the inherent uncertainty in defining and identifying a shock as aggregate-demand, Barth and Ramey investigate additional sources of exogenous variation in aggregate demand using Ramey-Shapiro government shocks. Their Figure 2b does not display the interest-rate response. I estimated similar response functions for real GDP, real wages, 3-month Treasury bill rates, and labor productivity following a contractionary Ramey-Shapiro shock. Each equation included eight lags of all four endogenous variables plus the contemporaneous value and eight lags of the Ramey-Shapiro shocks. The sample period is 1949-2000, and Figure 1 of this Comment displays the estimated responses. In response to these contractionary demand shocks, the Taylor-rule responses continue to be evident: the fall in short-term nominal interest rates follows the steep reductions in real GDP. Interestingly, productivity falls procyclically with output following these military shocks. The demand shock in Barth and Ramey's Figure 1 displayed a countercyclical response of productivity, although the statistical significance was not strong. The differences in sign and magnitudes may be suggesting the role of other endogenous propagation mechanisms that lead to procyclical productivity (or less countercyclical responses). The recent literature has assigned an important role to labor hoarding and variations in factor utilization (e.g., Burnside and Eichenbaum, 1996, and Braun and Evans, 1998). In order to find an important role for the cost channel of monetary transmission, it is important to allow for the influences of endogenous monetary policy and the endogenous responses of private agents. Barth and Ramey's theoretical discussion provides a simple framework for thinking about these issues.

4. Theoretical Discussion
The thrust of Barth and Ramey's economic analysis is that contractionary monetary-policy shocks induce economic responses that look more like contractionary technology responses than like contractionary demand responses. The basic insights can be understood from a textbook discussion of a competitive spot labor market. Production depends upon capital $K$, labor $N$, and an exogenous technology variable $z$. We have $Y = zF(K, N)$ with $F$ possessing the usual diminishing marginal products. Labor is elastically supplied and increasing in the real wage. A contractionary technology shock increases marginal costs directly by lowering the marginal productivities of labor and capital. The demand for labor falls at all real wages. With $z$ lower, labor productivity and the real wage fall. To consider what happens following a contractionary aggregate-demand shock requires a bit more definition. If we equate this shock with an exogenous fall in unproductive government purchases, then a contraction represents a reduction in current or future taxes.
This positive wealth effect leads to a reduction in labor supply at all wage rates. Since the production technology $F$ is unaffected, real wages and labor productivity rise.

In arguing that monetary-policy shocks induce economic responses that look like technology shocks, some thought must be given to the source of nominal non-neutralities. Three candidate rigidities are (a) sticky prices, (b) sticky wages, and (c) limited participation with a working-capital channel.

(a) When prices are predetermined, output is determined by aggregate demand and an unanticipated monetary contraction reduces aggregate demand. Firms' reduced labor requirements can be filled at lower real wages (dictated by the labor-supply schedule). Real wages fall, but labor productivity rises due to the diminishing marginal product of labor. Christiano, Eichenbaum, and Evans (1997) provide a quantitative analysis of these effects.

(b) When nominal wages are predetermined, an unanticipated monetary contraction reduces the price level and increases real wages. The resulting fall in labor input again leads to a rise in labor productivity. See Bordo, Erceg, and Evans (2000) for a quantitative analysis of these effects.

(c) In a limited-participation model with a cost channel, an unanticipated monetary contraction increases nominal interest rates. The higher interest costs of financing the wage bill lead to a fall in labor demand at all wage rates. This leads to a fall in the labor input and real wages, and an increase in labor productivity. In this model, real wages fall but productivity rises. Christiano, Eichenbaum, and Evans (1997) provide a quantitative analysis of these effects.

Table 1 of this comment qualitatively summarizes the theoretical implications of these shocks. The key differences among the shock implications are the responses of real wages and labor productivity. The limited-participation analysis embodies the cost-channel mechanism stressed by Barth and Ramey. As the paper discusses and my earlier discussion of propagation mechanisms emphasized, however, we must keep in mind that the simple textbook discussions omit many model features that the literature has stressed.
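The qualitative contrast between a contractionary technology shock and the working-capital mechanism in case (c) can be checked with a small numerical sketch of the textbook labor market. Everything specific in it is an assumption chosen for illustration and does not come from the paper or this comment: Cobb-Douglas production, a constant-elasticity labor-supply curve, the parameter values, and the function names. The variable `R` stands for the gross interest rate firms pay on the borrowed wage bill.

```python
def labor_market(z=1.0, R=1.0, K=1.0, alpha=0.3, eta=1.0):
    """Equilibrium of a competitive spot labor market with a cost channel.

    Assumed functional forms (illustrative only):
      production    Y = z * K**alpha * N**(1 - alpha)
      labor supply  N = w**eta  (increasing in the real wage w)
      labor demand  z * (1 - alpha) * K**alpha * N**(-alpha) = R * w
    R is the gross interest rate on the wage bill financed in advance.
    """
    # Substitute labor supply into labor demand and solve for the real wage.
    w = (z * (1 - alpha) * K**alpha / R) ** (1.0 / (1.0 + alpha * eta))
    N = w**eta
    Y = z * K**alpha * N ** (1 - alpha)
    return {"Y": Y, "N": N, "W/P": w, "Y/N": Y / N}


def signs(base, shocked):
    """Mark each variable '+' if it rose relative to the baseline, else '-'."""
    return {k: "+" if shocked[k] > base[k] else "-" for k in base}


base = labor_market()
tech = signs(base, labor_market(z=0.95))  # contractionary technology shock
cost = signs(base, labor_market(R=1.05))  # higher interest cost of the wage bill
print("technology:  ", tech)
print("cost channel:", cost)
```

In this sketch both shocks lower output, labor input, and the real wage, but only the technology shock lowers output per hour; the interest-cost shock raises it because labor falls along an unchanged production function.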
Table 1 THEORETICAL IMPLICATIONS OF CONTRACTIONARY SHOCKS

                            Y     N     W/P   Y/N
Technology                  ↓     ↓     ↓     ↓
Demand                      ↓     ↓     ↑     ↑
Monetary policy:
  Sticky prices             ↓     ↓     ↓     ↑
  Sticky wages              ↓     ↓     ↑     ↑
  Limited participation     ↓     ↓     ↓     ↑

5. Interpreting the Empirical Results

Table 2 summarizes the qualitative findings of the estimated impulse responses following the identified technology, demand, and monetary policy shocks. In the aggregate analysis, as Barth and Ramey emphasize, the federal funds rate and technology shocks have similar responses. To conclude that the cost channel is an important component of the monetary transmission mechanism, we should also consider (1) the way in which systematic monetary policy influences the economy, and (2) whether other important endogenous mechanisms are missing from the analysis.

Table 2 ESTIMATED RESPONSES FOLLOWING CONTRACTIONARY SHOCKS

                  Y     N     W/P   Y/N   R
Technology        ↓     ↓     ↓     ↓     ↑
Demand            ↓     ↓     ↑     ?     ↓
Federal funds     ↓     ↓     ↓     ↓     ↑

In the first case, Barth and Ramey view the rise in the real wage following a contractionary demand shock as indicating the absence of a cost channel. Such a broad-brush dichotomy of demand and supply mechanisms tries to abstract from the complexities of how the cost channel works after various demand shocks. In fact, the role of systematic monetary policy may very well lead to larger increases in real wages following a demand contraction. As I mentioned above, the demand contraction may initially reduce labor supply, output, and labor input, leading to a rise in real wages. A Taylor-rule response of monetary policy may very well lead to a monetary expansion and lower interest rates. In this context, the cost channel stimulates labor demand at all wage rates due to more favorable financing conditions, and real wages rise further. Thus, a relatively large increase in real wages may signal an especially large role for the cost channel. Or, put another way, the fact that certain aggregate-demand shocks lead to different real wage implications than technology shocks may signal a large role for the cost channel.

In the second case, the negative response of labor productivity following a monetary policy contraction deserves additional investigation. In light of the diminishing returns to labor, this procyclical response indicates that something has been omitted from the theoretical discussion.
There is a large literature that emphasizes the role of variable factor utilization at business and seasonal cycle frequencies (for example, Burnside and Eichenbaum, 1996; Braun and Evans, 1998). Following a monetary contraction, the presence of variable factor utilization would likely reduce the marginal productivity of labor at all wage rates. In equilibrium, real wages and labor productivity would fall. It is important to note that this story does not need to invoke the cost channel in order
to account for the responses in the data. Consequently, variable factor utilization is a competing explanation.

The industry evidence in Barth and Ramey's Table 1 and Figures 4 and 5 is capable of shedding further light on these competing explanations. The larger responses of P/W in selected industries during the 1959-1979 period could be due to a greater dependence on limited working-capital technologies or greater variations in factor utilization rates. The analysis of QFR data begins to get at one side of this issue, but much more should be possible.

Does the factor-utilization story square with the aggregate-demand responses? Here the evidence is mixed. Recall that following a contractionary aggregate demand shock, labor supply falls. A fall in labor input and output should reduce endogenous utilization rates. This reduces labor demand at all wage rates, and intensifies the reductions in output and labor. The qualitative predictions for real wages and labor productivity, however, are ambiguous. The empirical results from the identified VAR shocks and Ramey-Shapiro shocks indicate that real wages rise, but the productivity response may not be robust (comparing their Figure 1 and my Figure 1). Barth and Ramey discount the variable-factor-utilization story in favor of an omitted factor of production such as working capital. A quantitative, dynamic general-equilibrium analysis seems necessary to shed further light here. It is not obvious how an important endogenous mechanism plays a strong role for some shocks, but is presumably rendered mute following other shocks. Stark differences like these have the potential to provide strong identifying power in constructing models of the economy.

Finally, Christiano, Eichenbaum, and Evans (2001) present a model in which variable capacity utilization, the cost channel, and other endogenous propagation mechanisms combine to produce a Wright Patman effect following a monetary policy shock.
In a dynamic general equilibrium model with Calvo price and wage contracts, we introduce habit persistence in consumption preferences, investment adjustment costs, variable capacity utilization, and a cost channel. There are large literatures arguing that each of these features is important for understanding aggregate fluctuations. Including all of these endogenous mechanisms allows the model to capture the hump-shaped responses of output, consumption, investment, and productivity following a monetary policy shock. Introducing additional shocks in a model like this may allow for a fuller assessment of each mechanism's contribution to economic fluctuations. To conclude, Barth and Ramey's analysis of alternative economic shocks in a variety of settings is an important ingredient in the research program to find useful dynamic general equilibrium models for evaluating alternative economic policies. Empirical analyses like this flesh out the broad features of the data which all useful models should capture.
While success along limited dimensions of the data's likelihood surface continues to be relatively easy to attain, this paper helps to open the curtain on the larger challenges facing macroeconomists.

REFERENCES

Basu, S., J. Fernald, and M. Kimball. (1998). Are technology improvements contractionary? Board of Governors of the Federal Reserve System. International Finance Discussion Paper 625.
Bordo, M., C. Erceg, and C. Evans. (2000). Money, sticky wages and the Great Depression. American Economic Review 90(5):1447-1463.
Braun, R. A., and C. Evans. (1998). Seasonal Solow residuals and Christmas: A case for labor hoarding and increasing returns. Journal of Money, Credit, and Banking 30(3):306-330.
Burnside, C., and M. Eichenbaum. (1996). Factor-hoarding and the propagation of business-cycle shocks. American Economic Review 86(5):1154-1174.
Christiano, L., M. Eichenbaum, and C. Evans. (1997). Sticky price and limited participation models: A comparison. European Economic Review 41(6):1201-1249.
Christiano, L., M. Eichenbaum, and C. Evans. (1999). Monetary policy shocks: What have we learned and to what end? In Handbook of Macroeconomics, Vol. 1A. J. B. Taylor and M. Woodford (eds.). Elsevier Science, pp. 65-148.
Christiano, L., M. Eichenbaum, and C. Evans. (2001). Nominal rigidities and the dynamic effects of a shock to monetary policy. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 8403.
Clarida, R., J. Gali, and M. Gertler. (2000). Monetary policy rules and macroeconomic stability: Evidence and some theory. Quarterly Journal of Economics 115:147-180.
Dotsey, M. (1999). Structure from shocks. Federal Reserve Bank of Richmond. Working Paper 99-6.
Francis, N., and V. Ramey. (2001). Is the technology-driven real business cycle hypothesis dead? Shocks from aggregate fluctuations revisited. Unpublished manuscript.
Gali, J. (1999). Technology, employment, and the business cycle: Do technology shocks explain aggregate fluctuations? American Economic Review 89(1):249-271.
Sargent, T. J. (1978). Estimation of dynamic labor demand schedules under rational expectations. Journal of Political Economy 89(6):1009-1044.
Taylor, J. (1993). Discretion versus policy rules in practice. Carnegie-Rochester Conference Series on Public Policy 39:195-214.
Comment
SIMON GILCHRIST
Boston University, NBER, and Federal Reserve Bank of Boston
The authors have written an excellent empirical paper arguing in favor of a cost channel for monetary policy. The paper provides extensive evidence from a variety of data sources and experiments in support of this
view. In my comment, I will provide some discussion of what exactly I think the cost channel is. I will then discuss their evidence and identification, and provide some additional discussion of evidence from the inventory literature, some of which is complementary to their findings, some of which is not. Finally, I will ask what we can expect to obtain by adding a cost channel to a calibrated model.

The basic cost channel is easily understood by examining the first-order condition for labor demand in a model where firms borrow to hire labor inputs. In this case we have $R(W/P) = MPL$, where R is the real interest rate, W/P is the real wage, and MPL is the marginal product of labor. If firms borrow to hire inputs such as labor, then as interest rates rise, labor costs rise and labor demand and real wages fall. Assuming that the marginal product of labor is determined by technology and the capital-labor ratio [e.g., $MPL = A(K/L)^{\alpha}$], labor productivity will rise in response to a tightening of monetary policy as firms move up their labor demand curve in response to increased hiring costs. In contrast, real wages will fall. The implication here is that although real wages move in opposite directions for monetary policy shocks and for other demand shocks, labor productivity moves in the same direction.

The authors' evidence suggests, to the contrary, that both real wages and labor productivity fall in response to a tightening of monetary policy. To explain this result, the authors appeal to the notion that there is another input in the production process, such as working capital or inventories. If firms also borrow to hire this additional factor, demand for this input falls as interest rates rise. If labor productivity is increasing in this input, one could then rationalize a decline in labor productivity in response to monetary policy shocks but not other demand shocks. In this case, monetary policy shocks will have effects more like supply shocks than like demand shocks.
For this to be true, it must be the case that the labor productivity decline owing to the decline in this additional input is large enough to offset the fact that, in the absence of any movements in this other input, labor productivity would rise rather than fall.

This raises the question, what is this additional input? As the authors suggest, a natural candidate is inventories in the production function. The aggregate evidence that inventory-sales ratios are strongly and persistently countercyclical is both good news and bad news here. On the one hand, borrowing for inventories is more plausible than borrowing for labor inputs, though working capital is undoubtedly used to finance some component of labor as well as other input costs. The fact that inventories rise relative to sales during a downturn suggests that borrowing costs do indeed rise in response to tight monetary policy. Additional evidence is provided by the work of Gertler and Gilchrist, who document that short-term debt for large manufacturing firms also rises in response to monetary policy. Thus both inventory input movements and movements in short-term debt in response to monetary policy are consistent with the notion that costs rise as interest rates increase.

These movements are not necessarily consistent with the notion that labor productivity should fall in response to a tightening of monetary policy, however. In particular, because inventory inputs are rising relative to sales following tight monetary policy (this is true of all types: final goods, materials, and work in progress), we would expect inventories to be a poor candidate for explaining the procyclicality of productivity in response to monetary-policy shocks but not other demand shocks. The authors counter this point by providing evidence that the inventory-hours ratio falls rather than rises in response to monetary-policy shocks, this being true during the early part of the sample period but not the later part, consistent with their argument that the cost channel has diminished over time. While this evidence is intriguing, the movements in the inventory-hours ratio only measure the movement of one input relative to another, and not the direct effect of inventories on labor productivity. We need further work to fully assess whether or not inventories, and working capital more generally, provide an argument for procyclical rather than countercyclical productivity in response to monetary policy shocks. (A step along these lines would be to analyze the dynamics of the inventory-sales ratio in response to monetary vs. other demand shocks.)

If the labor-productivity movements cannot be rationalized through a cost channel, then we are left with the puzzling result that labor productivity moves in one direction in response to monetary policy shocks and in another direction in response to other demand shocks. One may be tempted to blame this on faulty identification.
For example, Romer episodes of tight money are highly correlated with large oil price shocks, confounding supply and demand effects. In addition, military buildups, to the extent that they are more likely to provide anticipated movements in demand than do monetary policy shocks, provide dynamics more like those of a government spending shock in a neoclassical model, even in a setting where prices are sticky and markups are otherwise countercyclical. In other words, firms that anticipate demand increases adjust their prices accordingly and are less likely to engage in labor hoarding, which would result in strongly procyclical labor productivity. This seems quite likely to be the case with the aircraft industry example discussed in the paper. Whether or not the military buildups of the Korean and Vietnam wars provide examples of anticipated buildups is more debatable (note it is not the initial increase that needs to be anticipated, but the
future spending path). Countering this argument, however, is the fact that the demand shocks in the Gali decomposition are not particularly persistent and hence not likely to be anticipated. Also, the fact that all of the evidence goes in the same direction makes a more persuasive case for the productivity arguments set forth in the paper.

Setting this issue aside, it is possible that the basic cost channel is in place, i.e., monetary policy is transmitted through the supply side as well as the demand side, without there being the additional productivity mechanism discussed above. In this case, the cost channel still serves as an additional source of amplification and propagation to monetary policy shocks, and can help explain the well-known price puzzle. I find both of these arguments plausible. There is ample evidence in the literature that financial factors impinge on both input choices and output, particularly for credit-constrained firms. Increasing marginal costs of borrowing in the downturn are a natural consequence of credit-market frictions. We would therefore expect firms that face severe frictions in credit markets to be most susceptible to a cost channel. Indeed, as the paper discusses, the evidence on small vs. large firms provided by Gertler and Gilchrist is highly consistent with the notion that small firms face rising borrowing costs, which cause a reduction in their output relative to that of large firms in the wake of tight monetary policy. Somewhat surprisingly, the industry decomposition in the paper does not provide strong support for this notion, unless we truly believe that industries such as motor vehicles are likely to face significant credit frictions. On the other hand, the fact that the effect of monetary policy on the wage-price markup is correlated with the amount of interest expense is consistent with a cost channel.
The fact that the correlation remains unchanged in both the pre- and the post-1980 period contradicts the notion that the cost channel has declined in importance over time, however. Finally, it is worth asking under what conditions a cost channel will be a quantitatively important component of the monetary transmission mechanism. Because the cost channel depends on monetary-induced movements in real interest rates, it must be considered in conjunction with other nominal rigidities which give the monetary authority leverage over real as well as nominal interest rates. A basic experiment analyzing the effect of a monetary policy shock in a dynamic New Keynesian model with sticky output prices augmented to include a cost channel suggests to me that the direct effect of the cost channel may not be particularly strong. In this experiment (details are available on request), I assume that aggregate output is produced by two intermediate input sectors of equal size. Both sectors use capital and labor as inputs and face capital adjustment costs. One sector plans labor one period ahead and
borrows to pay the wage bill, which is financed over the next year. The other sector hires labor contemporaneously and faces no interest expense in its hiring decision. The monetary authority sets nominal interest rates as a function of past interest rates and current inflation. The impulse responses to an innovation in monetary policy for output and inflation are plotted in Figure 1. Consistent with the arguments in the paper, the cost channel adds amplification and reduces the inflation response to monetary policy. The amplification is not particularly large, however, relative to the baseline model, and the inflation dynamics are not appreciably altered. In particular, the model does not rationalize the price dynamics seen in the data. The explanation here is quite simple: countercyclical markups owing to sticky prices are still the dominant mechanism by which policy is transmitted, whereas the real effects of interest rates on labor costs are relatively small. It is quite possible that a richer model would provide better results. Adding sticky wages will help match wage-output-inflation dynamics and will come closer to rationalizing the price puzzle. Adding a financial accelerator mechanism will provide more amplification for the cost channel. In particular, countercyclical borrowing for factor inputs implies a worsening of balance sheets during a downturn, which could suppress economic activity.

Figure 1 IMPULSE RESPONSE TO A MONETARY SHOCK

In this paper, the authors make a strong case that a cost channel is worthy of serious consideration as an important component in the monetary transmission mechanism. Although not all of the evidence is fully rationalized within existing model structures, there is certainly enough evidence here to tempt model builders to incorporate supply as well as demand effects of monetary policy. Whether that can be done in a manner that is consistent with the results in this paper is an interesting topic for future research.
Discussion

In his response, Marvin Barth stressed the point that the paper could explain several empirical puzzles with one simple idea. With respect to the behavior of the inventory-sales ratio, he argued that, at a disaggregated level, the response to a monetary shock is what one would expect to see in response to a cost shock. In particular, inventories of raw materials respond first to the monetary policy shock, followed by work in progress; final-goods inventories exhibit a hump shape, reflecting the falloff in final demand. On the VAR identification scheme, he found it reassuring that the industry-level evidence pointed in the same direction.

Ben Bernanke found the evidence on raw-materials inventories quite strong, as it could explain why productivity seemed to fall in response to a monetary shock and could help to distinguish between shocks to demand and money shocks. Barth explained that the authors had not emphasized this evidence, as they did not have disaggregated inventories by stage of process at the industry level.

Chris Sims suggested an alternative explanation for the pattern of impulse responses to monetary policy shocks. He remarked that the VAR literature found a strong correlation between real effects of monetary policy and the price puzzle, which suggested that there was some confounding of technology and monetary policy shocks. He said that with different identification assumptions that allowed for some simultaneity, the price puzzle in the early part of the sample disappeared. He also claimed that putting money into the policy reaction function eliminated the price puzzle. Barth and Ramey replied that they already had money in the policy reaction function, although they had not tried to deal with the simultaneity issue.
Greg Mankiw asked whether monetary policy shocks have permanent effects on hours and output when Romer dates are used to identify the shocks. He wondered whether the permanent effect could be due to hysteresis, or whether instead it was an indication of the confounding of monetary and real shocks.

Philip Lane suggested that if monetary contraction leads to currency appreciation, there could be a cost effect opposite to that examined by the authors, working through the price of imported intermediates. This effect could be tested by examining individual sectors. Valerie Ramey agreed and noted that the cost channel was stronger in the earlier period when exchange rates were fixed.

Daron Acemoglu wanted to know whether sticky prices or sticky wages were necessary to generate the observed pattern of results by industry. He also asked whether the pattern of industry results correlated with specific industry features, such as the ratio of small to large firms, that could bear on the importance of the cost channel. Kristin Forbes suggested using firm-level data to examine the cost channel.

Olivier Blanchard was interested in the correlation of the inverse real-wage response with interest expense. He suggested looking at the response of the inverse real wage to Ramey-Shapiro dates and Romer dates, to get an idea of the heterogeneity of responses to government spending and monetary shocks.

Susanto Basu suggested another explanation for the finding that responses to monetary policy shocks look like responses to productivity shocks. He raised the possibility that innovations in labor supply, such as those driven by low-frequency demographic movements, could be a fourth structural shock missing from the story.

Mark Gertler said that he had simulated models incorporating the cost channel. In these models, the cost channel did a good job of explaining the sluggish response of prices. But price rigidity cannot be the only friction.
Otherwise, when interest expense rises, real wages fall and reduce marginal cost. He suggested that the reduced-form responses to shocks could depend on the monetary policy reaction function. A change in the policy reaction function in 1979 could explain the differences in the results between the earlier and later periods. Ramey commented that the authors do not believe that the cost channel is the only mechanism through which monetary policy shocks have an effect on the economy. But the cost channel is something that might be a useful component in a parsimonious model, and one that is more important than, for example, variable factor utilization.
Xavier Gabaix and David Laibson MIT; and Harvard University and NBER
The 6D Bias and the Equity-Premium Puzzle

1. Introduction

Consumption growth covaries only weakly with equity returns, which seems to imply that equities are not very risky. However, investors have historically received a very large premium for holding equities. For twenty years, economists have asked why an asset with little apparent risk has such a large required return.1

Grossman and Laroque (1990) argued that adjustment costs might answer the equity-premium puzzle. If it is costly to change consumption, households will not respond instantaneously to changes in asset prices. Instead, consumption will adjust with a lag, explaining why consumption growth covaries only weakly with current equity returns. In Grossman and Laroque's framework, equities are risky, but that riskiness does not show up in a high contemporaneous correlation between consumption growth and equity returns. The comovement is only observable in the long run.

Lynch (1996) and Marshall and Parekh (1999) have simulated discrete-time delayed-adjustment models and demonstrated that these models can potentially explain the equity-premium puzzle.2 In light of the complexity of these models, both sets of authors used numerical simulations.

We thank Ben Bernanke, Olivier Blanchard, John Campbell, James Choi, Karen Dynan, George Constantinides, John Heaton, Robert Lucas, Anthony Lynch, Greg Mankiw, Jonathan Parker, Monika Piazzesi, Ken Rogoff, James Stock, Jaume Ventura, Annette Vissing, and seminar participants at Delta, Insead, Harvard, MIT, University of Michigan, NBER, and NYU for helpful comments. We thank Emir Kamenica, Guillermo Moloche, Eddie Nikolova, and Rebecca Thornton for outstanding research assistance.

1. For the intellectual history of this puzzle, see Rubinstein (1976), Lucas (1978), Shiller (1982), Hansen and Singleton (1983), Mehra and Prescott (1985), and Hansen and Jagannathan (1991). For useful reviews see Kocherlakota (1996) and Campbell (1999).
2. See also related work by Caballero (1995), He and Modest (1995), Heaton and Lucas (1996), Luttmer (1995), and Lynch and Balduzzi (2000).
We propose a continuous-time generalization of Lynch's (1996) model. Our extension provides two new sets of results. First, our analysis is analytically tractable; we derive a complete analytic characterization of the model's dynamic properties. Second, our continuous-time framework generates effects that are up to six times larger than those in discrete-time models.

We analyze an economy composed of consumers who update their consumption every D (as in "delay") periods. Such delays may be motivated by decision costs, attention allocation costs, and/or mental accounts.3 The core of the paper describes the consequences of such delays. In addition, we derive a sensible value of D based on a decision-cost framework.

The 6D bias is our key result. Using data from our economy, an econometrician estimating the coefficient of relative risk aversion (CRRA) from the consumption Euler equation would generate a multiplicative CRRA bias of 6D. For example, if agents adjust their consumption every D = 4 quarters, and the econometrician uses quarterly aggregates in his analysis, the imputed coefficient of relative risk aversion will be 24 times greater than the true value. Once we take account of this 6D bias, the Euler-equation tests are unable to reject the standard consumption model. High equity returns and associated violations of the Hansen-Jagannathan (1991) bounds cease to be puzzles.

The basic intuition for this result is quite simple. If households adjust their consumption every D > 1 periods, then on average only 1/D households will adjust each period. Consider only the households that adjust during the current period, and assume that these households adjust consumption at dates spread uniformly over the period. Normalize the timing so the current period is the time interval [0, 1]. When a household adjusts at time $i \in [0, 1]$, it can only respond to equity returns that have already been realized by time i.
Hence, the household can only respond to fraction i of within-period equity returns. Moreover, the household that adjusts at time i can only change consumption for the remainder of the period. Hence, only a fraction 1 − i of this period's consumption is affected by the change at time i. On average the households that adjust during the current period display a covariance between equity returns and consumption growth that is biased down by factor

\[ \int_0^1 i(1 - i)\,di = \frac{1}{6}. \]
3. See Gabaix and Laibson (2000b) for a discussion of decision costs and attention allocation costs. See Thaler (1992) for a discussion of mental accounts.
The integral is taken from 0 to 1 to average over the uniformly distributed adjustment times. Since only a fraction 1/D of households adjust in the first place, the aggregate covariance between equity returns and consumption growth is approximately (1/6) × (1/D) as large as it would be if all households adjusted instantaneously. The Euler equation for the instantaneous-adjustment model implies that the coefficient of relative risk aversion is inversely related to the covariance between equity returns and consumption growth. If an econometrician used this Euler equation to impute the coefficient of relative risk aversion, and he used data from our delayed-adjustment economy, he would impute a coefficient of relative risk aversion that was 6D times too large.

In Section 2 we describe our formal model, motivate our assumptions, and present our key analytic finding. In Section 2.2 we provide a heuristic proof of our results for the case D > 1. In Section 3 we present additional results that characterize the dynamic properties of our model economy. In Section 4 we close our framework by describing how D is chosen. In Section 5 we consider the consequences of our model for macroeconomics and finance. In Section 6 we discuss empirical evidence that supports the Lynch (1996) model and our generalization. The model matches most of the empirical moments of aggregate consumption and equity returns, including a new test which confirms the 6D prediction that the covariance between $\ln(C_{t+h}/C_t)$ and $R_{t+1}$ should slowly rise with h. In Section 7 we conclude.

2. Model and Key Result

Our framework is a synthesis of ideas from the continuous-time model of Merton (1969) and the discrete-time model of Lynch (1996). In essence we adopt Merton's continuous-time modeling approach and Lynch's emphasis on delayed adjustment.4 We assume that the economy has two linear production technologies: a risk-free technology and a risky technology (i.e., equities).
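As a numerical check on the 6D intuition given in the Introduction, the following Python sketch evaluates the within-period attenuation integral and the implied multiplicative bias. It is illustrative only: the function names are hypothetical, and the calculation relies solely on the uniform-adjustment heuristic described in the Introduction, not on the formal model.

```python
def within_period_attenuation(n=100_000):
    """Midpoint-rule approximation of the integral of i*(1-i) over [0, 1].

    i is the share of within-period returns already realized when a
    household adjusts; (1 - i) is the share of the period's consumption
    that the adjustment can still affect.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        i = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += i * (1.0 - i)
    return total * h


def aggregate_attenuation(D, n=100_000):
    """Covariance attenuation when only a fraction 1/D of households adjust."""
    return within_period_attenuation(n) / D


def imputed_crra(true_crra, D):
    """CRRA an econometrician would impute if it scales inversely with the covariance."""
    return true_crra / aggregate_attenuation(D)


if __name__ == "__main__":
    print(within_period_attenuation())   # close to 1/6
    print(imputed_crra(1.0, D=4))        # close to 24, i.e. 6 * D
```

With D = 4 the sketch reproduces the factor of 24 quoted in the Introduction for quarterly data and annual adjustment.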
The risk-free technology has instantaneous return r. The returns from the risky technology follow a geometric diffusion process with expected return $r + \pi$ and standard deviation $\sigma$. We assume that consumers hold two accounts: a checking account and a balanced mutual fund. A consumer's checking account is used for day-to-day consumption, and this account holds only the risk-free asset.

4. See Calvo (1983), Fischer (1977), and Taylor (1979) for earlier examples of delayed adjustment in macroeconomics.
The mutual fund is used to replenish the checking account from time to time. The mutual fund is professionally managed and is continuously rebalanced so that a share $\theta$ of the mutual-fund assets is always invested in the risky asset.5 The consumer is able to pick $\theta$.6 In practice, the consumer picks a mutual fund that maintains the consumer's preferred value of $\theta$. We call $\theta$ the equity share (in the mutual fund).

Every D periods, the consumer looks at her mutual fund and decides how much wealth to withdraw from it to deposit in her checking account. Between withdrawal periods (i.e., from withdrawal date t to the next withdrawal date t + D), the consumer spends from her checking account and does not monitor her mutual fund. For now we take D to be exogenous. Following a conceptual approach taken in Duffie and Sun (1990), we later calibrate D with a decision-cost model (see Section 4). Alternatively, D can be motivated with a mental-accounting model of the type proposed by Thaler (1992).

Finally, we assume that consumers have isoelastic preferences and exponential discount functions:

\[ U_{it} = E_t \int_{s=t}^{\infty} e^{-\rho (s - t)} \, \frac{c_{is}^{1-\gamma} - 1}{1 - \gamma} \, ds. \]
Here $i$ indexes the individual consumer and $t$ indexes time. We adopt the following notation. Let $w_{it}$ represent the wealth in the mutual fund at date $t$. Between withdrawal dates, $w_{it}$ evolves according to
where $z_t$ is a Wiener process. We can now characterize the optimal choices of our consumer. We describe each date at which the consumer monitors (and in equilibrium withdraws from) her mutual fund as a reset date. Formal proofs of all results are provided in the appendix.

PROPOSITION 1
On the equilibrium path, the following properties hold:
1. Between reset dates, consumption grows at a fixed rate $(1/\gamma)(r - \rho)$.
2. The balance in the checking account just after a reset date equals the net present value (NPV) of consumption between reset dates, where the NPV is taken with the risk-free rate.

5. This assumption can be relaxed without significantly changing the quantitative results. In particular, the consumer could buy assets in separate accounts without any instantaneous rebalancing.
6. The fact that $\theta$ does not vary once it is chosen is optimal from the perspective of the consumer in this model.
The 6D Bias and the Equity-Premium Puzzle • 261
3. At reset date $\tau$, consumption is $c_{i\tau^+} = \alpha w_{i\tau^-}$, where $\alpha$ is a function of the technology parameters, preference parameters, and $D$.
4. The equity share in the mutual fund is $\theta = \pi/(\gamma\sigma^2)$. (1)
Here $c_{i\tau^+}$ represents consumption immediately after reset, and $w_{i\tau^-}$ represents wealth in the mutual fund immediately before reset. Claim 1 follows from the property that between reset dates the rate of return to marginal savings is fixed and equal to $r$. So between reset dates the consumption path grows at the rate derived in Ramsey's (1928) original deterministic growth model: $\dot{c}/c = (1/\gamma)(r - \rho)$.
Claim 2 reflects the advantages of holding wealth in the balanced mutual fund. Instantaneous rebalancing of this fund makes it optimal to store "extra" wealth (i.e., wealth that is not needed for consumption between now and the next reset date) in the mutual fund. So the checking account is exhausted between reset dates. Claim 3 follows from the homotheticity of preferences. Claim 4 implies that the equity share is equal to the same equity share derived by Merton (1969) in his instantaneous-adjustment model. This exact equivalence is special to our institutional assumptions, but approximate equivalence is a general property of models of delayed adjustment (see Rogers, 2001, for numerical examples in a related model). Note that the equity share is increasing in the equity premium ($\pi$) and decreasing in the coefficient of relative risk aversion ($\gamma$) and the variance of equity returns ($\sigma^2$). Combining claims 1-3 implies that the optimal consumption path between date $\tau$ and date $\tau + D$ is $c_{it} = \alpha e^{(1/\gamma)(r-\rho)(t-\tau)} w_{i\tau^-}$ and the optimal balance in the checking account just after reset date $\tau$ is
Claim 3 implies that at reset dates optimal consumption is linear in wealth. The actual value of the propensity to consume, $\alpha$, does not matter for the results that follow. Any linear rule (e.g., a linear rule of thumb) will suffice. In practice, the optimal value of $\alpha$ in our model will be close to the optimal marginal propensity to consume derived by Merton,
Merton's value is exactly optimal in our framework when $D = 0$.

2.1 OUR KEY RESULT: THE 6D BIAS
In our economy, each agent resets consumption at intervals of $D$ units of time. Agents are indexed by their reset time $i \in [0,D)$. Agent $i$ resets consumption at dates $\{i, i + D, i + 2D, \ldots\}$. We assume that the consumption reset times are distributed uniformly.⁷ More formally, there exists a continuum of consumers whose reset indexes $i$ are distributed uniformly over $[0, D)$. So the proportion of agents resetting their consumption in any time interval of length $\Delta t < D$ is $\Delta t/D$. To fix ideas, suppose that the unit of time is a quarter of the calendar year, and $D = 4$. In other words, the span of time from $t$ to $t + 1$ is one quarter of a year. Since $D = 4$, each consumer will adjust her consumption once every four quarters. We will often choose the slightly nonintuitive normalization that a quarter of the calendar year is one period, since quarterly data constitute the natural unit of temporal aggregation with contemporary macroeconomic data. Call $C_t$ the aggregate consumption between $t - 1$ and $t$:
Note that $\int_{s=t-1}^{t} c_{is}\,ds$ is per-period consumption for consumer $i$. Suppose that an econometrician estimates $\gamma$ and $\beta$ using a consumption Euler equation (i.e., the consumption CAPM). What will the econometrician infer about preferences?

THEOREM 2
Consider an economy with true coefficient of relative risk aversion $\gamma$. Suppose an econometrician estimates the Euler equation
7. The results change only a little when we relax the assumption of a uniform distribution. Most importantly, if reset dates were clumped at the end of periods—a natural assumption—then the implied bias would be infinite.
for two assets: the risk-free bond and the stock market. In other words, the econometrician fits $\hat{\beta}$ and $\hat{\gamma}$ to match the Euler equation above for both assets. Then the econometrician will find
plus higher-order terms characterized in subsequent sections.
Figure 1 plots $\hat{\gamma}/\gamma$ as a function of $D$. The formulae for the cases $0 \le D \le 1$ and $D \ge 1$ are taken from Theorem 2. The two formulae paste at the crossover point, $D = 1$. Convexity of the formula below $D = 1$ implies that $\hat{\gamma}/\gamma \ge 6D$ for all values of $D$. The case of instantaneous adjustment (i.e., $D = 0$) is of immediate interest, since it has been solved already by Grossman, Melino, and Shiller (1987). With $D = 0$ the only bias arises from time aggregation of the econometrician's data, not delayed adjustment by consumers. Grossman, Melino, and Shiller show that time aggregation produces a bias of $\hat{\gamma}/\gamma = 2$, matching our formula for $D = 0$. The most important result is the equation for $D \ge 1$, $\hat{\gamma} = 6D\gamma$, which we call the 6D bias. For example, if each period ($t$ to $t + 1$) is a quarter of a calendar year, and consumption is reset every $D = 4$ quarters, then we
FIGURE 1 RATIO OF ESTIMATED $\hat{\gamma}$ TO TRUE $\gamma$
get $\hat{\gamma} = 24\gamma$. Hence $\gamma$ is overestimated by a factor of 24. If consumption is revised every 5 years, then we have $D = 20$, and $\hat{\gamma} = 120\gamma$. Reset periods of 4 quarters or more are not unreasonable in practice. For an extreme case, consider the 30-year-old employee who accumulates balances in a retirement savings account [e.g., a 401(k)] and fails to recognize any fungibility between these assets and his preretirement consumption. In this case, stock-market returns will affect consumption at a considerable lag ($D > 120$ quarters for this example). However, such extreme cases are not necessary for the points that we wish to make. Even with a delay of only 4 quarters, the implications for the equity-premium puzzle literature are dramatic. With a multiplicative bias of 24, econometrically imputed coefficients of relative risk aversion of 50 suddenly appear quite reasonable, since they imply actual coefficients of relative risk aversion of roughly 2. In addition, our results do not rely on the strong assumption that all reset rules are time- and not state-contingent. In Appendix B we incorporate the realistic assumption that all households adjust immediately when the equity market experiences a large (Poisson) shock. In practice, such occasional state-contingent adjustments only slightly modify our results. Our qualitative results are robust to our assumption about the uniform distribution of adjustment dates. For example, if adjustment occurs at the end (or beginning) of the quarter, then the multiplicative bias in the estimated coefficient of relative risk aversion is infinite, since the continuous flow of consumption in the current quarter is unaffected by current asset returns. By contrast, if adjustments occur at exactly the middle of the quarter, then the multiplicative bias is $4D$, since the consumers that do adjust can only respond to half of the stock returns and their adjustment only affects half of the consumption flow (i.e., $\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$).
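As a worked illustration of the arithmetic in this paragraph, the block below uses only the relation $\hat{\gamma} = 6D\gamma$ for $D \ge 1$ and the text's own example of an imputed coefficient of 50 with $D = 4$ quarters.

```latex
\[
\hat{\gamma} = 6D\,\gamma
\quad\Longrightarrow\quad
\gamma = \frac{\hat{\gamma}}{6D}
       = \frac{50}{6 \times 4}
       = \frac{50}{24} \approx 2.1 .
\]
```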
We can also compare the 6D bias analytically with the biases that Lynch (1996) simulates numerically in his original discrete-time model. In Lynch's framework, agents consume every month and adjust their portfolio every $T$ months. Lynch's econometric observation period is the union of $F$ one-month intervals, so $D = T/F$. In Appendix C we show that when $D \ge 1$ Lynch's framework generates a bias which is bounded below by $D$ and bounded above by $6D$. Specifically, an econometrician who naively estimated the Euler equation with data from Lynch's economy would find a bias of $\hat{\gamma}/\gamma = D \cdot 6F^2/[(F+1)(F+2)]$ plus higher-order terms.
Holding $D$ constant, the continuous-time limit corresponds to $F \to \infty$, and for this case $\hat{\gamma}/\gamma = 6D$. The discrete-time case where agents consume at every econometric period corresponds to $F = 1$, implying $\hat{\gamma}/\gamma = D$, which can be derived directly. Finally, the 6D bias complements participation bias (e.g., Vissing, 2000; Brav, Constantinides, and Geczy, 2000). If only a fraction $s$ of agents hold a significant share of their wealth in equities (say s = f), then the covariance between aggregate consumption and returns is lower by a factor $s$. As Theorem 8 demonstrates, this bias combines multiplicatively with our bias: if there is limited participation, the econometrician will find the values of $\hat{\gamma}$ in Theorem 2, divided by $s$. In particular, for $D \ge 1$, he will find $\hat{\gamma} = (6D/s)\,\gamma$.
This formula puts together three important biases generated by Euler-equation (and Hansen-Jagannathan) tests: $\gamma$ will be overestimated because of time aggregation and delayed adjustment (the $6D$ factor), and because of limited participation (the $1/s$ factor).

2.2 ARGUMENT FOR $D \ge 1$
In this section we present a heuristic proof of Theorem 2. A rigorous proof is provided in Appendix A. Normalize a generic period to be one unit of time. The econometrician observes the return of the stock market from 0 to 1:
where $r$ is the risk-free interest rate, $\pi$ is the equity premium, $\sigma^2$ is the variance of stock returns, and $z$ is a Wiener process. The econometrician also observes aggregate consumption over the period:
As is well known, when returns and consumption are assumed to be jointly lognormal, the standard Euler equation implies that⁸

8. $E_{t-1}[\hat{\beta}(C_t/C_{t-1})^{-\hat{\gamma}} R_t^a] = 1$ with $R_t^a = e^{\mu_a - \sigma_a^2/2 + \sigma_a \varepsilon_a}$. The subscripts and superscripts $a$ denote asset-specific returns and standard deviations. As Hansen and Singleton (1983) showed,
We will show that when $D \ge 1$ the measured covariance between consumption growth and stock-market returns, $\mathrm{cov}(\ln[C_1/C_0], \ln R_1)$, will be lower by a factor $6D$ than the instantaneous covariance, $\mathrm{cov}(d\ln C_t, d\ln R_t)/dt$, that arises in the frictionless CCAPM. As is well known, in the frictionless CCAPM
Assume that each agent consumes one unit in period $[-1,0]$.⁹ So aggregate consumption in period $[-1,0]$ is also one: $C_0 = 1$. Since $\ln(C_1/C_0) \simeq C_1/C_0 - 1$, we can write
with $C_{i1} = \int_0^1 c_{is}\,ds$ the time-aggregated consumption of agent $i$ during period $[0,1]$. First, take the case $D = 1$. Agent $i \in [0,1)$ changes her consumption at time $i$. For $s \in [0,i)$, she has consumption $c_{is} = \alpha w_{i\tau^-} e^{(1/\gamma)(r-\rho)(s-\tau)}$, where $\tau = i - D$. Throughout this paper we use approximations to get analytic results. Let $\varepsilon = \max(r, \rho, \theta\pi, \sigma^2, \sigma^2\theta^2, \alpha)$. When we use annual periods, $\varepsilon$ will be
If we evaluate this expression for the risk-free asset and equities, we find that
Note that $\pi + r = \mu_a$.

9. This assumption need not hold exactly. Consumption need be unity only up to $O_{<0}(\sqrt{\varepsilon}) + O(\varepsilon)$ terms, in the notation defined below.
approximately 0.05.¹⁰ For quarterly periods, $\varepsilon$ will be approximately 0.01. We can express our approximation errors in higher-order terms of $\varepsilon$. Since consumption in period $[-1,0]$ is normalized to one, at time $\tau = i - D$, $\alpha$ times wealth will be equal to 1 plus small corrective terms; more formally,
Here $O(\varepsilon)$ represents stochastic or deterministic terms of order $\varepsilon$, and $O_{<0}(\sqrt{\varepsilon})$ represents stochastic terms that depend only on equity innovations that happen before time 0. Hence the $O_{<0}(\sqrt{\varepsilon})$ terms are all orthogonal to equity innovations during period $[0,1]$. Drawing together our last two results, for $s \in [0,i)$,
Without loss of generality, set $z(0) = 0$. So consumer $i$'s mutual fund wealth at date $t = i^-$ is
The consumer adjusts consumption at $t = i$, and so for $s \in [i,1]$ she consumes
The covariance of consumption and returns for agent $i$ is

10. For a typical annual calibration $r = 0.01$, $\rho = 0.05$, $\theta\pi = (0.78)(0.06)$, $\sigma^2 = (0.16)^2$, $\sigma^2\theta^2 = (\pi/\gamma\sigma)^2 = (0.06/(3 \times 0.16))^2$, and $\alpha = 0.04$.
Here and below $\simeq$ means "plus higher-order terms in $\varepsilon$." The covariance contains the multiplicative factor $i$ because the consumption change reflects only return information which is revealed between date 0 and date $i$. The covariance contains the multiplicative factor $1 - i$ because the change in consumption occurs at time $i$, and therefore affects consumption for only the subinterval $[i,1]$. We often analyze "normalized" variances and covariances. Specifically, we divide the moments predicted by the 6D model by the moments predicted by the benchmark model with instantaneous adjustment and instantaneous measurement. Such normalizations highlight the "biases" introduced by the 6D economy. For the case $D = 1$, the normalized covariance of aggregate consumption growth and equity returns is
which is the (reciprocal of the) 6D factor for $D = 1$. Consider now the case $D \ge 1$. Consumer $i \in [0,D)$ resets her consumption at $t = i$. During period 1 (i.e., $t \in [0,1]$) only agents with $i \in [0,1]$ will reset their consumption. Consumers with $i \in (1,D]$ will not change their consumption, so they will have a zero covariance, $\mathrm{cov}(C_{i1}, R_1) = 0$. Hence,
For $D \ge 1$ the covariance of aggregate consumption is just $1/D$ times what it would be if we had $D = 1$:
The $6D$ lower covariance of consumption with returns translates into a $6D$ higher measured CRRA $\hat{\gamma}$. Since $\theta = \pi/(\gamma\sigma^2)$ [equation (1)], we get
The Euler equation (6) then implies
as anticipated. Several properties of our result should be emphasized. First, holding $D$ fixed, the bias in $\hat{\gamma}$ does not depend on either preferences or technology: $r$, $\pi$, $\sigma$, $\rho$, $\gamma$. This independence property will apply to all of the additional results that we report in subsequent sections. When $D$ is endogenously derived, $D$ itself will depend on the preference and technology parameters. For simplicity, the derivation above assumes that agents with different adjustment indexes $i$ have the same "baseline" wealth at the start of each period. In the long run this wealth equivalence will not apply exactly. However, if the wealth disparity is moderate, the reasoning above will
still hold approximately.¹¹ Numerical analysis with 50-year adult lives implies that the actual bias is very close to $6D$, the value it would have if all of the wealth levels were identical period by period.
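The heuristic argument of this section can be checked numerically. The sketch below is an illustration, not the authors' code: it assumes, as in the text, that an agent with reset index $i \in [0,1]$ contributes a normalized covariance of $i(1-i)$, that agents with $i > 1$ contribute zero, and that reset indexes are uniform on $[0, D)$. The function names and the grid size are hypothetical choices.

```python
def normalized_covariance(D, n=100_000):
    """Midpoint-rule average of i*(1-i) over reset indexes i uniform on [0, D).

    Agents with i in [0, 1] reset during the period and contribute i*(1-i);
    agents with i > 1 do not reset and contribute zero. Requires D >= 1.
    """
    if D < 1:
        raise ValueError("the heuristic argument covers only D >= 1")
    step = D / n
    total = 0.0
    for k in range(n):
        i = (k + 0.5) * step
        if i <= 1.0:
            total += i * (1.0 - i)
    return total * step / D


def midpoint_reset_covariance(D):
    """All resets at mid-period: half the return is seen, half the flow is affected."""
    return 0.5 * 0.5 / D


if __name__ == "__main__":
    for D in (1, 4, 20):
        bias_uniform = 1.0 / normalized_covariance(D)
        bias_midpoint = 1.0 / midpoint_reset_covariance(D)
        print(f"D={D}: uniform resets -> {bias_uniform:.2f}, mid-period resets -> {bias_midpoint:.2f}")
```

Under these assumptions the reciprocal of the uniform-reset covariance approaches $6D$, while the mid-period case gives $4D$, in line with the discussion in Section 2.1.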
3. General Characterization of the Economy

In this section we provide a general characterization of the dynamic properties of the economy described above. We analyze four properties of our economy: excess smoothness of consumption growth, positive autocorrelation of consumption growth, low covariance of consumption growth and asset returns, and nonzero covariance of consumption growth and lagged equity returns. Our analysis focuses on first-order effects with respect to the parameters $r$, $\rho$, $\theta\pi$, $\sigma^2$, $\sigma^2\theta^2$, and $\alpha$. Call $\varepsilon = \max(r, \rho, \theta\pi, \sigma^2, \sigma^2\theta^2, \alpha)$. We assume $\varepsilon$ to be small. Empirically, $\varepsilon \simeq 0.05$ with a period length of a year, and $\varepsilon \simeq 0.01$ with a period length of a calendar quarter. All the results that follow (except one¹²) are proved with $O(\varepsilon^{3/2})$ residuals. In fact, at the cost of more tedious calculations, one can show that the residuals are actually $O(\varepsilon^2)$.¹³ The following theorem is the basis of this section. The proof appears in Appendix A.

THEOREM 3
The autocovariance of consumption growth at horizon $h \ge 0$ can be expressed as
where
11. More precisely, it is only important that the average wealth of households that switch on date $t$ not differ significantly from the average wealth of households that switch on any date $s \in [t - D, t + D]$. To guarantee this cross-date average similarity we could assume that each reset interval ends stochastically. This randomness generates "mixing" between populations of households that begin life with different reset dates.
12. Equation (12) is proved to $O(\sqrt{\varepsilon})$, but with more tedious calculations can be shown to be $O(\varepsilon)$.
13. One follows exactly the lines of the proofs presented here, but includes higher-order terms. Calculations are available from the authors upon request.
and $\binom{4}{i} = 4!/[i!\,(4 - i)!]$ is the binomial coefficient.
The expressions above are valid for noninteger values of $D$ and $h$. The functions $d(D)$ and $\Gamma(D,h)$ have the following properties, many of which will be exploited in the analysis that follows¹⁴:
$d \in C^4$.
$d(D) = |D|/2$ for $|D| \ge 2$.
$d(0) = 7/30$.
$\Gamma(D,h) \sim 1/D$ for large $D$.
$\Gamma(D,h) \ge 0$.
$\Gamma(D,h) > 0$ iff $D + 2 > h$.
$\Gamma(D,h)$ is nonincreasing in $h$.
$\Gamma(D,0)$ is decreasing in $D$, but $\Gamma(D,h)$ is hump-shaped for $h > 0$.
$\Gamma(0,h) = 0$ for $h \ge 2$.
$\Gamma(0,0) = 2/3$.
$\Gamma(0,1) = 1/6$.

Figure 2 plots $d(D)$ along with a second function which we will use below.

3.1 $\Gamma(D,0)$
We begin by studying the implications of the autocovariance function, $\Gamma(D,h)$, for the volatility of consumption growth (i.e., by setting $h = 0$). Like Caballero (1995), we also show that delayed adjustment induces excess smoothness. Corollary 4 describes our quantitative result.

COROLLARY 4
In the frictionless economy ($D = 0$), $\mathrm{var}(dC_t/C_t)/dt = \sigma^2\theta^2$. In our economy, with delayed adjustment and time-aggregation bias,
The volatility of consumption, $\sigma^2\theta^2\Gamma(D,0)$, decreases as $D$ increases.
The normalized variance of consumption, $\Gamma(D,0)$, is plotted against $D$ in Figure 3.

14. $\Gamma$ is continuous, so $\Gamma(0,h)$ is intended as $\lim_{D \to 0} \Gamma(D,h)$.
FIGURE 2 THE FUNCTIONS $d(x)$ AND $e(x)$
FIGURE 3 THE NORMALIZED VARIANCE OF CONSUMPTION GROWTH, $\Gamma(D,0)$
For $D = 0$, the normalized variance is $2/3$, well below the benchmark value of 1. The $D = 0$ case reflects the bias generated by time-aggregation effects. As $D$ rises above zero, delayed-adjustment effects also appear. For $D = 0, 1, 2, 4, 20$ the normalized variance takes values 0.67, 0.55, 0.38, 0.22, and 0.04. For large $D$, the bias is approximately $1/D$. Intuitively, as $D$ increases, none of the short-run volatility of the economy is reflected in consumption growth, since only a proportion $1/D$ of the agents adjust consumption in any single period. Moreover, the size of the adjustments only grows as $\sqrt{D}$. So the total magnitude of adjustment is falling as $1/\sqrt{D}$, and the variance falls as $1/D$.

3.2 $\Gamma(D,h)$ WITH $h > 0$
We now consider the properties of the (normalized) autocovariance function $\Gamma(D,h)$ for $h = 1, 2, 4, 8$. Figure 4 plots these respective curves, ordered from $h = 1$ on top to $h = 8$ at the bottom. Note that in the benchmark case (instantaneous adjustment and no time-aggregation bias) the autocovariance of consumption growth is zero. With only time-aggregation effects, the one-period autocovariance is $\Gamma(0,1) = 1/6$, and all $h$-period autocovariances with $h > 1$ are zero.
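The numerical values quoted in Sections 3.1 and 3.2 can be reproduced with a short sketch. The closed forms of $d$ and $\Gamma$ did not survive in the text above, so the code assumes $d(x) = \sum_{i=0}^{4} \binom{4}{i} (-1)^i |x + i - 2|^5 / (2 \cdot 5!)$ and $\Gamma(D,h) = [d(D+h) + d(D-h) - 2d(h)]/D^2$. These forms are a reconstruction chosen because they satisfy the listed properties; they should be checked against Theorem 3 and Appendix A before being relied on. The function names are hypothetical.

```python
from math import comb, factorial


def d(x):
    """Assumed form of d: a symmetric piecewise quintic built from |x + i - 2|**5."""
    total = sum(comb(4, i) * (-1) ** i * abs(x + i - 2) ** 5 for i in range(5))
    return total / (2 * factorial(5))


def gamma_norm(D, h):
    """Assumed normalized autocovariance Gamma(D, h) for D > 0.

    The D -> 0 limit mentioned in the text is approximated by passing a small D.
    """
    if D <= 0:
        raise ValueError("use a small positive D to approximate the D -> 0 limit")
    return (d(D + h) + d(D - h) - 2 * d(h)) / D ** 2


if __name__ == "__main__":
    for D in (1, 2, 4):
        print(f"Gamma({D}, 0) = {gamma_norm(D, 0):.4f}")
    print(f"Gamma(0+, 0) ~ {gamma_norm(0.01, 0):.4f}")
    print(f"Gamma(0+, 1) ~ {gamma_norm(0.01, 1):.4f}")
```

With these assumed forms, the normalized variance at $D = 1, 2, 4$ comes out at about 0.55, 0.38, and 0.22, the small-$D$ limits are close to $2/3$ and $1/6$, and $\Gamma(D,h)$ vanishes once $h \ge D + 2$.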
FIGURE 4 NORMALIZED AUTOCOVARIANCE $\Gamma(D,h)$ WITH $h = 1, 2, 4, 8$
3.3 REVISITING THE EQUITY-PREMIUM PUZZLE
We can also state a formal and more general analogue of Theorem 2.

PROPOSITION 5
Suppose that consumers reset their consumption every $h_a$ periods. Then the covariance between consumption growth and stock-market returns at horizon $h$ will be
The associated correlation is
In the benchmark model with continuous sampling and adjustment, the covariance is just
Moreover, in that model the covariance at horizon h is just
So the effect introduced by the 6D model is captured by the factor $1/b(D)$ which appears in Proposition 5. We compare this benchmark with the effects generated by our discrete-observation, delayed-adjustment model. As the horizon $h$ tends to $+\infty$, the normalized covariance between consumption growth and asset returns tends to
which is true for any fixed value of $h_a$. This effect is due exclusively to time aggregation. Delayed adjustment ceases to matter as the horizon length goes to infinity. Proposition 5 covers the special case discussed in Section 2: horizon $h = 1$, and reset period $h_a = D \ge 1$. For this case, the normalized covariance is approximately equal to
Figure 5 plots the multiplicative covariance bias factor $1/b(h_a/h)$ as a function of $h$, for $h_a = 1$. In the benchmark case (i.e., continuous sampling and instantaneous adjustment) there is no bias; the bias factor is unity. In the case with only time-aggregation effects (i.e., discrete sampling and $h_a = 0$) the bias factor is $1/b(0/h) = 1/2$. Hence, low levels of comovement show up most sharply when horizons are low. For $D \ge 1$ (i.e., $h_a/h \ge 1$), the covariance between consumption growth and stock returns is $6D$ times lower than one would expect in the model with continuous adjustment and continuous sampling.
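The horizon dependence described here can be sketched in a few lines. The code assumes the bias function $b(D) = 6D$ for $D \ge 1$, as stated in the text, and $b(D) = 6/[3(1-D) + D^2]$ for $0 \le D \le 1$. The second branch is a reconstruction: it is chosen because it equals 2 at $D = 0$, pastes to 6 at $D = 1$, and is convex, as the discussion of Figure 1 requires, but it does not appear in the extracted text. The function names are hypothetical.

```python
def b(D):
    """Bias factor b(D): 6D for D >= 1 (from the text); assumed form below D = 1."""
    if D < 0:
        raise ValueError("D must be nonnegative")
    if D >= 1:
        return 6.0 * D
    return 6.0 / (3.0 * (1.0 - D) + D * D)


def covariance_bias_factor(h, h_a=1.0):
    """Multiplicative covariance bias factor 1/b(h_a/h) at measurement horizon h."""
    if h <= 0:
        raise ValueError("horizon h must be positive")
    return 1.0 / b(h_a / h)


if __name__ == "__main__":
    for h in (0.5, 1, 2, 4, 8, 100):
        print(f"h={h}: 1/b(1/h) = {covariance_bias_factor(h):.4f}")
```

Under these assumptions the factor rises with the horizon $h$ and approaches, but stays below, the pure time-aggregation value of $1/2$.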
FIGURE 5 MULTIPLICATIVE COVARIANCE BIAS FACTOR $1/b(1/h)$
We now characterize the covariance between current consumption growth and lagged equity returns.

THEOREM 6
Suppose that consumers reset their consumption every $h_a = Dh$ periods. Then the covariance between $\ln(C_{[t,t+1]}/C_{[t-1,t]})$ and lagged equity returns $\ln R_{[t+s_1,t+s_2]}$ ($s_1 < s_2 \le 1$) will be
with
where
The following corollary will be used in the empirical section.

COROLLARY 7
The covariance between $\ln(C_{[s+h-1,s+h]}/C_{[s-1,s]})$ and lagged equity returns $\ln R_{[s,s+1]}$ will be
In particular, when $h \ge D + 2$, $\mathrm{cov}(\ln[C_{[s+h-1,s+h]}/C_{[s-1,s]}], \ln R_{[s,s+1]}) = \theta\sigma^2$; one sees full adjustment at horizons (weakly) greater than $D + 2$. In practice, Theorem 6 is most naturally applied when the lagged equity returns correspond to specific lagged time periods: $s_2 = s_1 + 1$, $s_1 = 0, -1, -2, \ldots$
FIGURE 6 NORMALIZED COVARIANCE OF CONSUMPTION GROWTH AND LAGGED ASSET RETURNS, $V(D,s,s+1)$, FOR $D = 0.25, 1, 2, 4$
Note that $V(D,s_1,s_2) > 0$ iff $s_2 > -D - 1$. Hence, the covariance in Theorem 6 is positive only at lags 0 through $D + 1$. Figure 6 plots the normalized covariances of consumption growth and lagged asset returns for different values of $D$. Specifically, we plot $V(D,s,s+1)$ against $s$ for $D = 0.25, 1, 2, 4$, from right to left. Consider a regression of consumption growth on some arbitrary (large) number of lagged returns,
One should find
Note that the sum of the normalized lagged covariances is one:
This implies that the sum of the coefficients will equal the portfolio share of the stock market,¹⁵
3.4 EXTENSION TO MULTIPLE ASSETS AND HETEROGENEITY IN D
We now extend the framework to the empirically relevant case of multiple assets with stochastic returns. We also introduce heterogeneity in $D$'s. Such heterogeneity may arise because different $D$'s apply to different asset classes and because $D$ may vary across consumers. Say that there are different types of consumers $l = 1, \ldots, n_l$, and different types of asset accounts $m = 1, \ldots, n_m$. Consumers of type $l$ exist in proportion $p_l$ ($\sum_l p_l = 1$) and look at account $m$ every $D_{lm}$ periods. The consumer has wealth $w_{lm}$ invested in account $m$, and has an associated marginal propensity to consume (MPC), $\alpha_{lm}$. In most models the MPCs will be the same for all assets, but for the sake of behavioral realism and generality we consider possibly different MPCs. For instance, income shocks could have a low $D = 1$, stock-market shocks a higher $D = 4$, and shocks to housing wealth a $D = 40$.¹⁶ Account $m$ has standard deviation
A shock $dz_{mt}$ in wealth account $m$ will get translated at mean interval $\frac{1}{2}\sum_l p_l D_{lm}$ into a consumption shock $dC/C = \sum_l \theta_{lm}\,dz_{mt}$. We can calculate the second moments of our economy.

15. This is true in a world with only equities and riskless bonds. In general, it's more appropriate to use a model with several assets, including human capital, as in the next section.
16. This example implies different short-run marginal propensities to consume out of wealth windfalls in different asset classes. Thaler (1992) describes one behavioral model with similar asset-specific marginal propensities to consume.
THEOREM 8
In the economy described above, we have
and
with
V defined in (14), and d defined in (11).
The function $\Gamma(D,t)$, defined earlier in (10), relates to $\Gamma(D,D',t)$ by $\Gamma(D,D,t) = \Gamma(D,t)$. Recall that $V(D,0,1) = 1/b(D)$. So a conclusion from (19) is that, when there are several types of people and assets, the bias that the econometrician would find is the harmonic mean of the individual biases $b(D_{lm})$, the weights being given by the shares of variance. As an application, consider the case with identical agents ($n_l = 1$; $l$ is suppressed for this example) and different assets with the same MPC, $\alpha_m = \alpha$. Recall that $V(D,0,1) = 1/b(D)$. So the bias $\hat{\gamma}/\gamma$ will be
Hence, with several assets, the aggregate bias is the weighted mean of the biases, the mean being the harmonic mean, and the weight of asset $m$ being the share of the total variance that comes from this asset. This allows us, in Appendix B, to discuss a modification of the model with differential attention to big shocks (jumps). These relationships are derived exactly along the lines of the single-asset, single-type economy of the previous sections. Equation (19) is the covariance between returns, $\ln R^m_{[t+s_1,t+s_2]} = \sigma_m z^m_{[t+s_1,t+s_2]} + O(\varepsilon)$, and the representation formula for aggregate consumption is
where $a(i) = (1 - |i|)^+$. Equation (23) can also be used to calculate the autocovariance (20) of consumption, if one defines
The closed-form expression (21) of $\Gamma$ is derived in Appendix A.

3.5 SKETCH OF THE PROOF
Proofs of the propositions appear in Appendix A. In this subsection we provide intuition for those arguments. We start with the following representation formula for consumption growth. PROPOSITION 9
We have
Note that the order of magnitude of $\theta\sigma \int a(i)\, z_{[t+i-D,t+i]}\,di/D$ is the order of magnitude of $\sigma$, i.e., $O(\sqrt{\varepsilon})$. Asset returns can be represented as $\ln R_{[t+s_1,t+s_2]} = \sigma z_{[t+s_1,t+s_2]} + O(\varepsilon)$. So we get
Here $\lambda(I)$ is the length (the Lebesgue measure) of the interval $I$. Likewise one gets
The bulk of the proof is devoted to the explicit calculation of this last equation and equation (27).
4. Endogenizing D

Until now, we have assumed that $D$ is fixed exogenously. In this section we discuss how $D$ is chosen, and provide a framework for calibrating $D$. Because of delayed adjustment, the actual consumption path will deviate from the first-best instantaneously adjusted consumption path. In steady state, the welfare loss associated with this deviation is equivalent, using a money metric, to a proportional wealth loss of¹⁷
Here AC is the difference between actual consumption and first-best instantaneously adjusted consumption. If the asset is observed every D periods, we have
Equations (28) and (29) are derived in Appendix A. We assume¹⁸ that each consumption adjustment costs a proportion $q$ of the wealth $w$. A

17. This is a second-order approximation. See Cochrane (1989) for a similar derivation.
18. This would come from a utility function
if the adjustments to consumption are made at dates $(\tau_i)_{i \ge 0}$. A session of consumption planning at time $t$ lowers utility by a consumption equivalent of $q e^{-\rho t}$.
sensible calibration of $q$ would be $qw = (1\%)(\text{annual consumption}) = (0.01)(0.04)w = (4 \times 10^{-4})w$. The NPV of costs as a fraction of current wealth is $q\sum_{i \ge 0} e^{-\rho i D}$, implying a total cognitive cost of
The optimal $D$ minimizes the sum of consumption variability costs and cognitive costs, i.e., $D^* = \arg\min_D\, \Lambda_C + \Lambda_q$:
and we find for the optimal D
when $\rho D \ll 1$. We make the following calibration choices: $q = 4 \times 10^{-4}$, $\sigma^2 = (0.16)^2$, $\gamma = 3$, $\rho = 0.01$, $\pi = 0.06$, and $\theta = \pi/(\gamma\sigma^2) = 0.78$. Substituting into our equation for $D$, we find $D \simeq 2$ years. This calibration implies that $D$-values of at least 1 year (or 4 quarters) are quite easy to defend. Moreover, our formula for $D^*$ is highly sensitive to the value of $\theta$. If a liquidity-constrained consumer has only a small
fraction of her wealth in equities (because most of her wealth is in other forms like human capital or home equity) then the value of $D$ will be quite large. If $\theta = 0.05$ because of liquidity constraints, then $D^* \simeq 30$ years. Note that formula (30) would work for other types of shocks than stock-market shocks. With several accounts indexed by $m$, people would pay attention to account $m$ at intervals of length
with $q_m w_m$ representing the cost of evaluating asset $m$, and $\theta_m$ generalized as in equation (18). Equation (31) implies sensible comparative statics on the frequency of reappraisal. Thus we get a mini-theory of the allocation of attention across accounts.¹⁹
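The calibration in this section can be reproduced with a small sketch. Formula (30) did not survive in the text above, so the code assumes $D^* = \frac{2}{\theta\sigma}\sqrt{q/(\gamma\rho)}$. This is what results from minimizing a variability cost that is linear in $D$, $(\gamma/4)\theta^2\sigma^2 D$, plus the cognitive cost $q/(1 - e^{-\rho D}) \approx q/(\rho D)$ for $\rho D \ll 1$. Both the cost form and the resulting formula are assumptions, chosen because they reproduce the two values of $D^*$ quoted in the text. The function name is hypothetical.

```python
from math import sqrt


def optimal_reset_interval(q, gamma, rho, theta, sigma):
    """Assumed approximation D* = (2 / (theta * sigma)) * sqrt(q / (gamma * rho)).

    Units follow the inputs: annual sigma and rho give D* in years.
    """
    if min(q, gamma, rho, theta, sigma) <= 0:
        raise ValueError("all parameters must be positive")
    return (2.0 / (theta * sigma)) * sqrt(q / (gamma * rho))


if __name__ == "__main__":
    # Calibration quoted in the text: q = 4e-4, sigma = 0.16, gamma = 3, rho = 0.01.
    base = dict(q=4e-4, gamma=3.0, rho=0.01, sigma=0.16)
    print(f"theta = 0.78: D* = {optimal_reset_interval(theta=0.78, **base):.2f} years")
    print(f"theta = 0.05: D* = {optimal_reset_interval(theta=0.05, **base):.2f} years")
```

Under these assumptions $D^*$ is inversely proportional to the equity share $\theta$, which is why a liquidity-constrained consumer with a small $\theta$ ends up with a very long reset interval.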
5. Consequences for Macroeconomics and Finance

5.1 SIMPLE CALIBRATED MACRO MODEL
To draw together the most important implications of this paper, we describe a simple model of the U.S. economy. We use our model to predict the variability of consumption growth, the autocorrelation of consumption growth, and the covariance of consumption growth with equity returns. Assume the economy is composed of two classes of consumers: stockholders and nonstockholders.20 The consumers that we model in Section 2 are stockholders. Nonstockholders do not have any equity holdings, and instead consume earnings from human capital. Stockholders have aggregate wealth St, and nonstockholders have aggregate wealth Nt. Total consumption is given by the weighted sum
Recall that $\alpha$ is the marginal propensity to consume. So consumption growth can be decomposed into

19. See Gabaix and Laibson (2000a,b) for a broader theoretical and empirical analysis of attention allocation.
20. This is at a given point in time. A major reason for nonparticipation is that relatively young agents have most of their wealth in human capital, against which they cannot borrow to invest in equities (see Constantinides, Donaldson, and Mehra, 2000).
Here $s$ represents the wealth of stockholders divided by the total wealth of the economy, and $n = 1 - s$ represents the wealth of nonstockholders divided by the total wealth of the economy. So $s$ and $n$ are wealth shares for stockholders and nonstockholders respectively. We make the simplifying approximation that $s$ and $n$ are constant in the empirically relevant medium run. Using a first-order approximation,
If stockholders have loading in stocks $\theta$, the ratio of stock wealth to total wealth in the economy is
To calibrate the economy we begin with the observation that human capital claims about $2/3$ of GDP $Y$. In this model, human capital is the discounted net present value of labor income accruing to the current cohort of nonstockholders. We assume that the expected duration of the remaining working life of a typical worker is 30 years, implying that the human capital of the current workforce is equal to
where $Y$ is aggregate income. Capital income claims $1/3$ of GDP. Assuming that it has the riskiness (and the returns) of the stock market, the amount of capital is
so that the equity share of total wealth is
By assuming that all capital is identical to stock-market capital, we implicitly increase the predicted covariance between stock returns and consumption growth. A more realistic model would assume a more heterogeneous capital stock, and hence a lower covariance between stock returns and consumption growth. In this model economy, we work with data at the quarterly frequency. We assume $\sigma = 0.16/\sqrt{4}$, $\pi = 0.06/4$, $r = 0.01/4$, and $\gamma = 3$, so the equity share [equation (1) above] is $\theta = \pi/(\gamma\sigma^2) = 0.78$. Then equation (32) implies $s = 0.28$. In other words, 28% of the wealth in this economy is owned by shareholders. All of stockholders' claims are in either stock or risk-free bonds. To keep things simple, we counterfactually assume that $N$ and $S$ are uncorrelated. We have to take a stand on the distribution of $D$'s in the economy. We assume that $D$-values are uniformly distributed from 0 to $\bar{D} = 120$ quarters (i.e., 30 years). We adopt this distribution to capture a wide range of investment styles. Extremely active investors will have a $D$-value close to 0, while passive savers may put their retirement wealth in a special mental account, effectively ignoring the accumulating wealth until after age 65 (Thaler, 1992). We are agnostic about the true distribution of $D$-types, and we present this example for illustrative purposes. Any wide range of $D$-values would serve to make our key points. To keep the focus on stockholders, we assume that nonstockholders adjust their consumption instantaneously in response to innovations in labor income, i.e., at intervals of length 0. Theorem 3 implies that the quarterly volatility of aggregate consumption growth is
We assume that the quarterly standard deviation of growth in human capital is $\sigma_N = 0.01$.[21] Our assumptions jointly imply that $\sigma_C = 0.0063$.[22] Most of this volatility comes from variation in the consumption of nonstockholders. Stockholders generate relatively little consumption volatility, because they represent a relatively small share of total consumption and because they only adjust consumption every D periods. This adjustment rule smooths out the response to wealth innovations, since only a fraction 1/D of stockholders adjust their consumption during any single period and the average adjustment is of magnitude $\sqrt{D}$.
21. We calibrate $\sigma_N$ from postwar U.S. data on wage growth. From 1959 to 2000 the standard deviation of per capita real wage growth at the quarterly frequency has been 0.0097 (National Income and Product Accounts, Commerce Department, Bureau of Economic Analysis). If wages follow a random walk, then the standard deviation of growth in human capital, $\sigma_N$, will equal the standard deviation in wage growth.
22. Figure 3 plots the function $\Gamma(D,0)$. Note that $\Gamma(0,0) = 2/3$ and that $\Gamma(D,0) = 1/D$ for large D. In the decomposition of $\sigma_C^2$ above, $n^2\Gamma(0,0)\sigma_N^2 = 0.34 \times 10^{-4}$ and $\Theta^2\sigma^2 \iint_{D,D' \in [0,\bar{D}]} \Gamma(D,D',0)\,dD\,dD'/\bar{D}^2 = 0.049 \times 10^{-4}$.
Our model's implied quarterly consumption volatility, $\sigma_C = 0.0063$, lies below its empirical counterpart. We calculate the empirical $\sigma_C$ using the cross-country panel dataset created by Campbell (1999).[23] We estimate $\sigma_C = 0.0106$ by averaging across all of the countries in Campbell's dataset: Australia, Canada, France, Germany, Italy, Japan, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United States.[24] Part of the gap between our theoretical standard deviation and the empirical standard deviation may reflect measurement error, which should systematically raise the standard deviation of the empirical data. In addition, most of the empirical consumption series include durables, which should raise the variability of consumption growth (Mankiw, 1982). By contrast, the U.S. consumption data omit durables, and for the United States we calculate $\sigma_C = 0.0054$, closely matching our theoretical value. Next, we turn to the first-order autocorrelation of consumption growth, applying again Theorem 3:
Using our calibration choices, our model implies $\rho_C = 0.34$.[25] This theoretical prediction lies well above the empirical estimate of -0.11, found by averaging across the country-by-country autocorrelations in the Campbell dataset. Here too, both measurement error and the inclusion of durables are likely to bias the empirical correlations down. Again, the U.S. data, which omit durables, come much closer to matching our theoretical prediction. In the U.S. data, $\rho_C = 0.22$.
23. We thank John Campbell for sharing this dataset with us.
24. We use quarterly data from the Campbell dataset. The quarterly data begin in 1947 for the United States, and begin close to 1970 for most of the other countries. The dataset ends in 1996.
25. The respective effects are $n^2\sigma_N^2\Gamma(0,1) = 0.077 \times 10^{-4}$ and $\Theta^2\sigma^2 \iint_{D,D' \in [0,\bar{D}]} \Gamma(D,D',1)\,dD\,dD'/\bar{D}^2 = 0.048 \times 10^{-4}$.
We turn now to the covariation between aggregate consumption growth and equity returns, $\mathrm{Cov}(\ln[C_t/C_{t-1}], \ln R_t)$. We find
assuming that in the short run the consumption growth of nonstockholders is uncorrelated with that of stockholders. The covariance estimate of $0.13 \times 10^{-4}$ almost matches the average covariance in the Campbell dataset, $0.14 \times 10^{-4}$. This time, however, the U.S. data do not "outperform" the rest of the countries in the Campbell dataset. For the United States, the covariance is $0.60 \times 10^{-4}$. However, all of these covariances come much closer to matching our model than to matching the benchmark model with instantaneous adjustment and measurement. The benchmark model with no delayed adjustment predicts that the quarterly covariance will be $\theta\sigma^2 = 50 \times 10^{-4}$.
What would an econometrician familiar with the consumption-CAPM literature conclude if he observed quarterly data from our 6D economy, but thought he was observing data from the benchmark economy? First, he might calculate
and conclude that the coefficient of relative risk aversion is over 1000. If he were familiar with the work of Mankiw and Zeldes (1991), he might restrict his analysis to stockholders and calculate
Finally, if he read Mankiw and Zeldes carefully, he would realize that he should also do a continuous-time adjustment (of the type suggested by Grossman, Melino, and Shiller, 1987), leading to another halving of his estimate. But, after all of this hard work, he would still end up with a biased coefficient of relative risk aversion: 300/2 = 150. For this economy, the true coefficient of relative risk aversion is 3!
These observations suggest that the literature on the equity-premium puzzle should be reappraised. Once one takes account of delayed adjustment, high estimates of $\gamma$ no longer seem anomalous. If workers in midlife take decades to respond to innovations in their retirement accounts, we should expect naive estimates of $\gamma$ that are far too high.
Defenders of the Euler-equation approach might argue that economists can go ahead estimating the value of $\gamma$ and simply correct those estimates for the biases introduced by delayed adjustment. However, we do not view this as a fruitful approach, since the adjustment delays are difficult to observe or calibrate. For an active stock trader, knowledge of personal financial wealth may be updated daily, and consumption may adjust equally quickly. By contrast, for the typical employee who invests in a 401(k) plan, retirement wealth may be in its own mental account,[26] and hence may not be integrated into current consumption decisions. This generates lags of decades or more between stock price changes and consumption responses. Without precise knowledge of the distribution of D-values, econometricians will be hard pressed to measure $\gamma$ accurately using the Euler-equation approach.
In summary, our model tells us that high imputed $\gamma$-values are not anomalous and that high-frequency properties of the aggregate data can be explained by a model with delayed adjustment. Hence, the equity premium may not be a puzzle.
Finally, we wish to note that our delayed-adjustment model is complementary to the theoretical work of other authors who have analyzed the equity-premium puzzle.[27] Our qualitative approach has some similarity with the habit-formation approach (e.g., Constantinides, 1990; Abel, 1990; Campbell and Cochrane, 1999). Habit-formation models imply that slow adjustment is optimal because households prefer to smooth the growth rate (not the level) of consumption. In our 6D model, slow adjustment is optimal only because decision costs make high-frequency adjustment too expensive.
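The calibration arithmetic of this section can be retraced with a short sketch. It is illustrative only: the displayed formulas for human capital, capital, and equation (32) are not reproduced here, so the code assumes that labor income (2/3 of GDP) is discounted at the annual risk-free rate over 30 years, that capital income (1/3 of GDP) is capitalized at the annual stock return, and that equation (32) equates the equity share of total wealth with $s\theta$. The variable names are hypothetical. Under these assumptions the quoted values $\theta = 0.78$, $s = 0.28$, and $\theta\sigma^2 = 50 \times 10^{-4}$ are recovered, and the naive estimates of $\gamma$ come out in the neighborhood of the figures in the text.

```python
import math

# Quarterly calibration values quoted in the text.
sigma = 0.16 / math.sqrt(4)   # quarterly std. dev. of equity returns
pi_q = 0.06 / 4               # quarterly equity premium
gamma = 3.0                   # true coefficient of relative risk aversion

# ASSUMPTION (hypothetical reconstruction): labor income, 2/3 of annual GDP,
# is discounted at the annual risk-free rate over 30 years; capital income,
# 1/3 of GDP, is capitalized at the annual stock return r + pi. GDP is set to 1.
r_a, pi_a, years = 0.01, 0.06, 30
human_capital = (2 / 3) * (1 - math.exp(-r_a * years)) / r_a
capital = (1 / 3) / (r_a + pi_a)
equity_share_total_wealth = capital / (capital + human_capital)

theta = pi_q / (gamma * sigma ** 2)        # equity share of stockholders' wealth
s = equity_share_total_wealth / theta      # ASSUMED form of equation (32)
benchmark_cov = theta * sigma ** 2         # covariance with no delayed adjustment

# Naive risk-aversion estimates from the 6D covariance quoted in the text.
model_cov = 0.13e-4
naive_gamma = pi_q / model_cov
# ASSUMPTION: the stockholder-only covariance is the aggregate covariance / s.
stockholder_gamma = s * naive_gamma
time_adjusted_gamma = stockholder_gamma / 2

print(f"theta = {theta:.2f}, s = {s:.2f}, benchmark cov = {benchmark_cov:.4f}")
print(f"naive gamma = {naive_gamma:.0f}, stockholders only = {stockholder_gamma:.0f}, "
      f"after halving = {time_adjusted_gamma:.0f}")
```

The last two estimates differ slightly from the rounded 300 and 150 in the text because the sketch carries unrounded intermediate values.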
6. Review of Related Empirical Evidence
In this section, we review two types of evidence that lend support to our model. In the first subsection we review survey evidence which suggests that investors know relatively little about high-frequency variation in their equity wealth. In the second subsection we show that equity innovations predict future consumption growth.
26. See Thaler (1992).
27. For other proposed solutions to the equity-premium puzzle see Kocherlakota (1996), Benartzi and Thaler (1995), and Barberis, Huang, and Santos (2001).
6.1 KNOWLEDGE OF EQUITY PRICES
Consumers can't respond to high-frequency innovations in equity values if they don't keep close tabs on the values of their equity portfolios. In this subsection, we discuss survey evidence that suggests that consumers may know little about high-frequency variation in the value of their equity wealth.[28] We also discuss related evidence that suggests that consumers may not adjust consumption in response to business-cycle-frequency variation in their equity holdings. All of this evidence is merely suggestive, since survey responses may be unreliable.
The 1998 Survey of Consumer Finances (SCF) was conducted during the last six months of 1998, a period of substantial variation in equity prices. In July the average value of the Wilshire 5000 equity index was 10,770. The index dropped to an average value of 9,270 in September, before rising back to an average value of 10,840 in December. Kennickell, Starr-McCluer, and Surette (2000) analyze the 1998 SCF data to see whether self-reported equity wealth covaries with movements in stock-market indexes. They find that the SCF equity measures are uncorrelated with the value of the Wilshire index on the respondents' respective interview dates. Only respondents who were active stock traders (>12 trades/year) showed a significant correlation between equity holdings and the value of the Wilshire index.
Dynan and Maki (2000) report related results. They analyze the responses to the Consumer Expenditure Survey (CEX) from the first quarter of 1996 to the first quarter of 1999. During this period, the U.S. equity markets rose over 15% during almost every 12-month period. Nevertheless, when respondents were surveyed for the CEX, one-third of stockholders reported no change in the value of their securities during the 12-month period before their respective interviews.[29]
Starr-McCluer (2000) analyzes data from the Michigan Survey Research Center (SRC) collected in the summer of 1997. One of the survey questions asked, "Have you [Has your family] changed the amount you spend or save as a result of the trend in stock prices during the past few years?" Among all stockholder respondents, 85.0% said "no effect." Among stockholder respondents with most of their stock outside retirement accounts, 83.3% said "no effect." Even among stockholders with large portfolios (> $250,000), 78.4% said "no effect."
28. We are grateful to Karen Dynan for pointing out much of this evidence to us.
29. For the purposes of this survey a change in the value of equity securities includes changes due to price appreciation, sales, and/or purchases.
6.2 THE EFFECT OF LAGGED EQUITY RETURNS ON CONSUMPTION GROWTH
Dynan and Maki (2000) analyze household-level data on consumption growth from the CEX, and ask whether lagged stock returns affect future consumption growth. They break their results down for nonstockholders and stockholders. For stockholders with at least $10,000 in securities a 1% innovation in the value of equity holdings generates a 1.03% increase in consumption of nondurables and services. However, this increase in consumption occurs with a lag. One third of the increase occurs during the first 9 months after the equity price innovation. Another third occurs 10 to 18 months after the innovation. Another quarter of the increase occurs 19 to 27 months after the innovation, and the rest of the increase occurs 28 to 36 months after the innovation.
We now turn to evidence from aggregate data. We look for a relationship between equity returns and future consumption growth. Specifically, we evaluate $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$ for $h = 1, 2, \ldots, 25$. Under the null hypothesis of $D = 0$, the quarterly covariance between equity returns and consumption growth is predicted to be
The effects of time-aggregation bias are incorporated into this prediction. An equity innovation during period $t + 1$ only affects consumption after the occurrence of the equity innovation. So the predicted covariance, $\mathrm{Cov}(\ln[C_{t+1}/C_t], \ln R_{t+1})$, is half as great as it would be if consumption growth were measured instantaneously. This time-aggregation bias vanishes once we extend the consumption growth horizon to two or more periods. So, if $D = 0$ and $h \ge 2$,
Hence the assumption $D = 0$ implies that the profile of $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$ for $h \ge 2$ should be flat.
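The halving at $h = 1$ and the flat profile for longer horizons can be traced in a short worked derivation. This is a sketch under simplifying assumptions that are not spelled out in this passage: drift terms are dropped, instantaneous log consumption $c(s)$ is taken to load on return shocks with coefficient $\Theta$ (the equity share of total wealth), and measured log consumption in period $k$ is approximated by the average of $c(s)$ over $[k-1, k]$.

```latex
% Sketch under assumptions: D = 0, drift terms dropped, and measured log
% consumption in period k approximated by the average of c(s) over [k-1, k].
\begin{align*}
\ln R_{t+1} &= \sigma\,[z(t+1) - z(t)], \qquad
c(s) - c(t) = \Theta\sigma\,[z(s) - z(t)] \quad (s \ge t), \\
\operatorname{Cov}\!\left(\ln\frac{C_{t+1}}{C_t},\, \ln R_{t+1}\right)
  &= \Theta\sigma^2 \int_0^1 u \, du = \tfrac{1}{2}\,\Theta\sigma^2, \\
\operatorname{Cov}\!\left(\ln\frac{C_{t+h}}{C_t},\, \ln R_{t+1}\right)
  &= \Theta\sigma^2 \int_0^1 1 \, du = \Theta\sigma^2 \qquad (h \ge 2).
\end{align*}
```

In the second line, consumption at time $t + u$ has absorbed only the fraction $u$ of the period's return shock, which is the source of the factor one half. In the third line every instant of period $t + h$ lies after the whole shock, so the factor is one.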
FIGURE 7 COVARIANCE OF $R_{t+1}$ AND $\ln(C_{t+h}/C_t)$
Notes: 1. Dataset is from Campbell (1999). Full dataset includes Australia, Canada, France, Germany, Italy, Japan, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United States. 2. To identify countries with large stock markets, we ordered the countries by the ratio of stock-market capitalization to GDP (1993). The top half of the countries were included in our large-stock-market subsample: Switzerland (0.87), the United Kingdom (0.80), the United States (0.72), the Netherlands (0.46), Australia (0.42), and Japan (0.40). 3. We assume that households have D-values that are uniformly distributed from 0 to 30 years.
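The model-implied profile in Figure 7 can be approximated with a few lines of code. This is a sketch, not the exact expression from Corollary 7: it assumes that, for a household with reset interval D, the covariance at horizon h is roughly the instantaneous-adjustment covariance times min(h/D, 1), i.e. the share of households that have reset within h quarters, and it ignores time aggregation. The function names and the constants `FULL_COV` (the limiting covariance 0.0014 quoted in the text) and `D_BAR` are illustrative. Averaging over D uniform on [0, 120] gives the closed form $(h/\bar{D})[1 + \ln(\bar{D}/h)]$, which the sketch checks against a numerical average.

```python
import math

FULL_COV = 0.0014   # limiting covariance quoted in the text (instantaneous adjustment)
D_BAR = 120.0       # upper bound of the uniform D distribution, in quarters

def cov_profile_closed_form(h, d_bar=D_BAR, full_cov=FULL_COV):
    """Approximate Cov(ln[C_{t+h}/C_t], ln R_{t+1}) for D ~ Uniform(0, d_bar).

    ASSUMPTION: a household with reset interval D contributes
    full_cov * min(h / D, 1); time aggregation is ignored.
    """
    if h >= d_bar:
        return full_cov
    return full_cov * (h / d_bar) * (1.0 + math.log(d_bar / h))

def cov_profile_numeric(h, d_bar=D_BAR, full_cov=FULL_COV, n=120_000):
    """Same quantity via a midpoint-rule average of min(h / D, 1) over D."""
    step = d_bar / n
    total = 0.0
    for k in range(n):
        d = (k + 0.5) * step
        total += min(h / d, 1.0)
    return full_cov * total / n

for h in (1, 4, 12, 25):
    print(h, round(cov_profile_closed_form(h), 6))
```

Under these assumptions the profile rises slowly with h and stays far below the flat prediction of the instantaneous-adjustment case; at a horizon of 4 quarters it is close to 0.0002.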
Figure 7 plots the empirical values of $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$ for $h \in \{1, 2, \ldots, 25\}$.[30] We use the cross-country panel dataset created by Campbell (1999).[31] Figure 7 plots the value of $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$, averaging across all of the countries in Campbell's dataset: Australia, Canada, France, Germany, Italy, Japan, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United States.[32] Figure 7 also plots the average value of $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$, averaging across all of the countries with large stock markets. Specifically, we ordered the countries in the Campbell dataset by the ratio of stock-market capitalization to GDP in 1993. The top half of the countries were included in our large-stock-market subsample: Switzerland (0.87), the United Kingdom (0.80), the United States (0.72), the Netherlands (0.46), Australia (0.42), and Japan (0.40).
Two properties of the empirical covariances stand out. First, they slowly rise as the consumption growth horizon h increases. Contrast this increase with the counterfactual prediction for the D = 0 case that the covariance should plateau at h = 2. Second, the empirical covariances are much lower than the covariance predicted by the D = 0 case. For example, at a horizon of 4 quarters, the average empirical covariance is roughly 0.0002, far smaller than the theoretical prediction of 0.0014.
Figure 7 also plots the predicted[33] covariance profile implied by the 6D model.[34] To generate this prediction we assume that D-values are uniformly distributed from 0 years to 30 years, as discussed in the previous section. The 6D model predicts that the covariance $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$ slowly rises with the horizon h. To understand this effect, recall that the 6D economy slowly adjusts to innovations in the value of equity holdings. Some consumers respond quickly to equity innovations, either because these consumers have low D-values, or because they have a high D-value and are coincidentally coming up to a reset period. Other consumers respond with substantial lags. For our illustrative example, the full response will take 30 years. For low h, the 6D model predicts that the covariance profile will be close to zero. As h goes to infinity, the covariance profile asymptotes to the prediction of the instantaneous adjustment model, so $\lim_{h\to\infty} \mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1}) = \Theta\sigma^2 = 0.0014$. Figure 7 shows that our illustrative calibration of the 6D model does a fairly good job of matching the empirical covariances.
30. See Hall (1978) for early evidence that lagged stock returns predict future consumption growth. See Lettau and Ludvigson (2001) for a VAR approach that implies that lagged stock returns do not predict future consumption growth. Future work should attempt to reconcile our results with those of Lettau and Ludvigson.
31. We thank John Campbell for giving this dataset to us.
32. Specifically, we calculate $\mathrm{Cov}(\ln R_{t+1}, \ln[C_{t+h}/C_t])$ for each country and each h-quarter horizon, $h \in \{1, 2, \ldots, 25\}$. We then average across all of the countries in the sample. We use quarterly data from the Campbell dataset. The quarterly data begin in 1947 for the United States, and begin close to 1970 for most of the other countries. The dataset ends in 1996.
33. Corollary 7 gives $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1}) = \Theta\sigma^2 \int_{D \in [0,\bar{D}]} [e(1+D) - e(1) - e(1-h+D) + e(1-h)]\,\frac{dD}{D\bar{D}}$.
34. The following approximation for the covariances provides intuition for the orders of magnitude. In normalized units,
When the D's are uniformly distributed in $[0,\bar{D}]$,
This approximation turns out to be quite good for h > 2.
This analysis has shown that the empirical data are completely inconsistent with the standard assumption of instantaneous adjustment. Lagged equity returns affect consumption growth at very long horizons: $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$ rises slowly with h, instead of quickly plateauing at h = 2. This slow rise is a key test of the 6D framework. We conclude from Figure 7 that the 6D model successfully predicts the profile of $\mathrm{Cov}(\ln[C_{t+h}/C_t], \ln R_{t+1})$ for $h = 1, 2, \ldots, 25$. However, the 6D model fails to predict the profile of a closely related quantity, the normalized Euler covariance,
This h-period covariance generalizes the one-period Euler covariance, $\mathrm{Cov}(\ln[C_{t+1}/C_t], \ln R_{t+1})$.[35,36] The standard model with D = 0 predicts that the h-period normalized Euler covariance will equal $[(2h-1)/2h]\,\Theta\sigma^2$ for all (integer) values of h. The factor $(2h-1)/2h$ captures time-aggregation bias, which becomes proportionately less important as the horizon increases. By contrast, the 6D model predicts that, if the D's are uniformly distributed between 0 and $\bar{D}$ (e.g., $\bar{D}$ = 30 years = 120 quarters), the h-period normalized Euler covariance should approximately[37] equal $(h/4\bar{D})[3 - 2\ln(h/\bar{D})]\,\Theta\sigma^2$ for $h < \bar{D}$. For both the standard model (D = 0) and the 6D model, the normalized Euler covariance should rise monotonically with h, but this rise should be much steeper for the standard model. The empirical data match neither prediction. In the twelve-country Campbell data, an initial rise in the Euler covariance from h = 1 to h = 7 is subsequently reversed for larger values of h. For h > 20, the Euler covariances are very small in magnitude, with some negative point estimates.[38]
35. We thank Monika Piazzesi, whose insightful discussion of this paper at the NBER Macroeconomics Annual Conference led us to add analysis of the covariance Euler equation to this final draft.
36. The Euler covariances link the equity premium to the coefficient of relative risk aversion. Consider the h-period Euler equation for a discrete-time model with instantaneous adjustment, $E_{t-1}[\beta^h (C_{t+h}/C_t)^{-\gamma} \exp(\sum_{i=1}^{h} \ln R^a_{t+i})] = 1$ (for all assets $a$). Manipulation of this equation implies
where $\pi$ is the 1-period equity premium.
This result seems to contradict the encouraging results plotted in Figure 7. To understand this tension, we assume stationarity and decompose the h-period Euler covariance:
The h-period Euler covariance (i.e., the left-hand side) is zero for large h's, and the first sum on the right-hand side is positive (this is the quantity plotted in Figure 7). It follows that the second term on the right-hand side should be negative:
37. We use the approximation above,
38. See Cochrane and Hansen (1992) for an early empirical analysis of the multiperiod Euler equation. Daniel and Marshall (1997, 1999) report that consumption Euler equations for aggregate data are not satisfied at the quarterly frequency but improve at the two-year frequency. Our results are consistent with theirs, but we find that this relatively good performance deteriorates as the horizon is lengthened.
which can be verified in our sample.[39] In words, lagged consumption growth negatively predicts the current stock return. Such predictability explains why the Euler covariance does not follow the profile predicted by the 6D model. Of course, this predictability is inconsistent with any model in which the stock market follows a martingale. Alternative frameworks, like Campbell and Cochrane's (1999) model of habit formation, Barberis, Huang, and Santos's (2001) prospect-theory model of asset pricing, or animal-spirits models, are needed to explain why lagged consumption growth negatively forecasts future stock returns.
39. For quarterly horizons $h \in \{5, 10, 15, 20, 25\}$, the average value of
is $\{-0.9, -2.0, -4.6, -2.8, -3.6\} \times 10^{-4}$ for all of the countries in the Campbell dataset, and $\{-1.2, -2.4, -5.0, -3.0, -3.2\} \times 10^{-4}$ for the countries with large stock markets.
7. Conclusion
Grossman and Laroque (1990) argue that adjustment costs might explain the equity-premium puzzle. Lynch (1996) and Marshall and Parekh (1999) have successfully simulated discrete-time delayed-adjustment models which confirm Grossman and Laroque's conjecture. We have described a continuous-time generalization of Lynch's (1996) model. We derive a complete analytic characterization of the model's dynamic properties. In addition, our continuous-time framework generates effects that are up to six times larger than those in discrete-time models.
We analyze an economy composed of consumers who update their consumption every D periods. Using data from our economy, an econometrician estimating the coefficient of relative risk aversion (CRRA) from the consumption Euler equation would generate a multiplicative CRRA bias of 6D. Once we take account of this 6D bias, the Euler equation tests are unable to reject the standard consumption model.
We have derived closed-form expressions for the first and second moments of this delayed-adjustment economy. The model matches most of the empirical moments of aggregate consumption and equity returns, including a new test which confirms the 6D prediction that the covariance between $\ln(C_{t+h}/C_t)$ and $R_{t+1}$ should slowly rise with h. The 6D model fails long-horizon Euler-equation tests, but this failure is due to the interesting empirical regularity that high lagged consumption growth predicts low future equity returns.
Future work should test the new empirical implications of our framework, including the rich covariance lag structure that we have derived. Most importantly, our model implies that standard Euler-equation tests should be viewed very skeptically. Even small positive values of D (e.g., D = 4 quarters) dramatically bias the inferences that economists draw from Euler equations and the related Hansen-Jagannathan bounds.
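The size of the bias can be illustrated numerically. The sketch below is a simplified reading of the mechanism rather than the proof in the appendix: it assumes that reset dates are staggered uniformly, so that a return shock has been absorbed by a share min(x/D, 1) of households x periods after it occurs, and that both the return and consumption are measured over unit periods. The function names are hypothetical. The measured one-period covariance is then the instantaneous covariance times the integral of (1 - x) min(x/D, 1) over x in [0, 1], and the multiplicative bias is the inverse of that integral.

```python
def adjusted_share(lag, D):
    """Share of households that have reset within `lag` periods of a shock.

    ASSUMPTION: reset dates are staggered uniformly across households.
    """
    if D == 0:
        return 1.0
    return min(lag / D, 1.0)

def covariance_attenuation(D, n=10_000):
    """Midpoint-rule value of the integral of (1 - x) * adjusted_share(x, D) on [0, 1].

    The weight (1 - x) counts the pairs (shock date, consumption date) inside
    one measurement period that are separated by a lag of x.
    """
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * step
        total += (1.0 - x) * adjusted_share(x, D)
    return total * step

def crra_bias(D):
    """Ratio of the naive risk-aversion estimate to the true value."""
    return 1.0 / covariance_attenuation(D)

for D in (0, 0.5, 1, 4, 120):
    print(D, round(crra_bias(D), 3))
```

Under these assumptions the ratio equals 2 at D = 0 (pure time aggregation), 6 at D = 1, and 6D for larger D, so D = 4 quarters already gives a factor of 24.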
Appendix A. Proofs
We use approximation to get analytic results. Let $\varepsilon = \max(r, \rho, \theta\pi, \sigma^2, \sigma^2\theta^2, \alpha)$. For annual data $\varepsilon \approx 0.05$. We shall use the notation $f(\varepsilon) = O$
0 and a constant $A > 0$ such that for $\varepsilon < \varepsilon_0$, we have $E_0[f^2]^{1/2} \le A\varepsilon^k$. More concisely, the norms are in the $L_2$ sense. For instance:
We will often replace O
Finally, for $z$ a generic standard Brownian motion, we define $z_{[i,j]} = z(j) - z(i)$, and remark that
as both are equal to the measure of $[i - D, i] \cap [j - D', j]$.
A.1 PROOF OF PROPOSITION 1
Denote by $v(w) = E[\int_0^\infty e^{-\rho t} c_t^{1-\gamma}/(1-\gamma)\,dt]$ the expected value of the utils from consumption under the optimal policy, assuming the first reset date is $t = 0$. So $v(\cdot)$ is the value function that applies at reset dates. Say that the agent puts $S$ in the checking account, and the rest, $w - S$, in the mutual fund. Call $M$ the (stochastic) value of the mutual fund at time $D$. By homotheticity, we have $v(w) = v\,w^{1-\gamma}/(1-\gamma)$. We have
with
Optimizing over $c_t$ for $t \in [0, D)$, we get $c_t^{-\gamma} = E[v'(w')]\,e^{(r-\rho)(D-t)}$, so that consumption growth is that of the Ramsey model: $c_t = \alpha w e^{[(r-\rho)/\gamma]t}$ for some $\alpha$ (by the implicit-function theorem one can show that it is a continuous function of $D$, and it has Merton's value when $D = 0$). To avoid bankruptcy, we need $S \ge S_0 = \int_0^D c_t e^{-rt}\,dt$. Imagine that the consumer starts by putting aside the amount $S_0$. Then, he has to manage optimally the remaining amount, $w - S_0$. Given some strategy, he will end up with a stochastic wealth $w'$, and he has to solve the problem of maximizing $vE[w'^{1-\gamma}/(1-\gamma)]$. But this is a finite-horizon Merton problem with utility derived from terminal wealth, whose solution is well known: the whole amount $w - S_0$ should be put in a mutual fund with constant rebalancing, with a proportion of stocks $\theta = \pi/(\gamma\sigma^2)$. In particular, only the amount $S_0$ is put in the checking account.
A.2 PROOF OF PROPOSITION 9
The basis of our calculations is the representation formula for consumption, Proposition 9. To prove it we shall need the following
LEMMA 10
We have
PROOF If the agent doesn't check her portfolio between $t$ and $t + s$, we have
When the agent checks her portfolio at time $\tau$, she puts a fraction $f = \int_0^D \alpha e^{[-r + (r-\rho)/\gamma]t}\,dt = O(\varepsilon)$ in the checking account, so that
Pasting together (37) and (39) at different time intervals, we see that (37) holds between two arbitrary dates (i.e., possibly including reset dates) t and t + s, and the lemma is proven. We can now proceed to the
PROOF OF PROPOSITION 9 Say that agent $i \in [0, D)$ has her latest reset point before $t - 1$ at $t_i = t - 1 - i$. The following reset points are $t_i + mD$ for $m \ge 1$, and for $s \ge t - 1$ we have [the first $O(\varepsilon)$ term capturing the deterministic increase of consumption between reset dates]
so that, using the notation £m = wit.6(rz[ti+(m_l)Diti+mD]i
and we get
because $t_i = t - 1 - i$. Let $w_{t-D-1} = w_{i_0,\,t-D-1}$, which implies that $w_{i,\,t-D-1} = w_{t-D-1}[1 + O(\varepsilon)]$ for all $i$. Note that $i_0$ is an arbitrarily selected index value. We now get the expression for consumption growth,
Defining $j = D - 1 - i$, and noting that the above expressions paste together, we have
One can likewise calculate
so
A.3 PROOF OF THEOREM 2
Use Proposition 9, $\ln R_{t+1} = \sigma z_{[t,t+1]} + O(\varepsilon)$, to get
with
Using (1) and (6), this leads to the expression (2).
A.4 PROOF OF THEOREM 3
First we need
LEMMA 11 We have, with $d$ defined in (11), for $D \in \mathbb{R}$,
PROOF OF LEMMA 11 Define, for $D \in \mathbb{R}$,
First, note that $g$ is even because $a$ is. In addition, for $D > 2$, $g(D) = 0$: for the integrand to be nonzero in (40), we need both $|i| < 1$ and $|i + D| < 1$, which is impossible for $D > 2$.
For a general $D$, we derive (in the sense of the theory of distributions, with Dirac's $\delta$-function[40]) $g$ over $D$, starting from (40):
by direct calculation (or combinatorial insight) using $a''(x) = \delta(x + 1) - 2\delta(x) + \delta(x - 1)$. We now integrate $g^{(4)}(D)$, which gives
where the $b_i$ are integration constants. But the condition $g(D) = 0$ for $D > 2$ forces the $b_i$'s to be 0, which concludes the proof.
The rest of the proof is in two steps. First we prove (41)-(42), then we calculate this expression of $p(D, t)$.
Step 1. Using (25) at $t$ and $t + h$, we get
with
40. Dirac's $\delta$-function is equal to 0 everywhere except at 0, where $\delta(0) = \infty$.
so using (34) we get
with
Step 2. Our next step is to calculate $p(D,h)$. Start with the case $D > h + 2$: then $(D - |i - j - h|)^+ = D - |i - j - h|$ (as $|i - j - h| \le 1 + 1 + h < D$), and given $\iint_{i,j \in [-1,1]} a(i)a(j)\,di\,dj = (\int_{i \in [-1,1]} a(i)\,di)(\int_{j \in [-1,1]} a(j)\,dj) = 1$, we get
with
Going back to a general D > 0, we get from (42)
because $a$ is even and by an application of change in variables. So from Lemma 11, $p''(D) = d''(D - h) + d''(D + h)$, and
for some real numbers $d_0, d_1$. Equation (43) gives us $d_1 = 0$, since $d'(x) = 1/2$ for $x > 2$. Finally, $p(0) = 0$ gives $A(h) = -d_0 = d(h) + d(-h)$, which concludes the proof.
A.5 PROOF OF COROLLARY 4
$\Gamma(D,0)$ is monotonic by direct calculation from the result in Theorem 3. Theorem 3 also implies
Alternatively, this result can be obtained more directly from the calculation at the end of the proof of Theorem 3.
A.6 PROOF OF PROPOSITION 5
Extend the argument used to prove Theorem 2. To calculate the correlation coefficient, use the variance results from Corollary 4.
A.7 PROOF OF THEOREM 6
Because $V(s_1,s_2) = V(s_1,1) - V(s_2,1)$, it is enough to fix $s_2 = 1$. We use the notation $s = s_1$. Recall (25), so that
with
So, using the Heaviside function ($H(x) = 1$ if $x > 0$, 0 if $x < 0$, so that $H' = \delta$),
and
Introducing the function $e$ defined in (15), which satisfies $e'' = a$, we get
for some constants $W_0, W_1$. Observe that for $s > 1$, (44) gives $W(s) = 0$, so (45) gives us $W_1 = 0$ (and $W_0 = D/2$). This allows us to conclude the proposition.
A.8 PROOF OF COROLLARY 7
Immediate application of the preceding theorem.
A.9 PROOF OF THEOREM 8
The expression (23) is derived exactly as in Proposition 9. The only new work is to calculate $\Gamma(D,D',h)$. Using (34), we get
with
To calculate $p$, we derive (again, $H(x) = 1_{x>0}$ is Heaviside's function)
and
So Lemma 11 gives
where $e_0, e_1$ are functions of $D$ and $h$. As $p = 0$ for $D' = 0$, we get $e_0 = -d(-h) + d(-D - h) = -d(h) + d(D + h)$, as $d$ is even. As we should have $p(D,D,h) = p(D,h)$ for $p$ in (42), we can conclude $e_1 = 0$ and deduce the value of $e_0$, so Theorem 8 is proven.
A.10 DERIVATION OF THE UTILITY LOSSES
A fully rigorous derivation, e.g. of the type used by Rogers (2001), is possible here. Such a derivation begins with the Bellman equation (35), and then uses a Taylor expansion to derive an expression for $v$ of the type $v = v_0 + v_1 D + O(D^2)$. This approach is tedious and not very instructive about the economic origins of the losses, which is why we present the following more heuristic proof.
Equation (28) is standard (e.g., see Cochrane, 1989). For completeness's sake, though, let us mention a way to derive it. We want to calculate $U(C) - U(C')$, where $C = (c_t)_{t \ge 0}$ is the optimum vector of (stochastic) consumption flows, $U(C) = E[\int_0^\infty e^{-\rho t} u(c_t)\,dt]$, and $C'$ is another vector that can be bought with the same Arrow-Debreu prices $p$. For $C$ and $C'$ close, we have
By optimality of $C$ we have $U'(C) = \lambda p$ for some $\lambda$, and $pC = pC'$ = initial wealth = $W$; thus we have $U'(C)(C - C') = 0$. Expressing $U''$ finally gives
A change $\Delta W$ in the initial wealth creates, by homotheticity of the optimal policy, a change in consumption $\Delta c_t/c_t = \Delta W/W$, hence a change in utility
So the suboptimality of plan $C'$ is equivalent to a wealth loss [using $u'(c) = c^{-\gamma}$] of
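where the weights in the mean $\langle \cdot \rangle$ are given by $\langle X_t \rangle = E[\int_0^\infty e^{-\rho t} c_t^{1-\gamma} X_t\,dt] / E[\int_0^\infty e^{-\rho t} c_t^{1-\gamma}\,dt]$. This proves equation (28). We now derive , with $\Delta c_t = c'_t - c_t$. With latest reset at time $\tau$,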
Now application of Lemma 10 gives (sparing the reader the tedious derivation),
Defining ^such that Efa1'7] = cj~ V""**, with *" > 0, we get
The cross term $\langle (w_\tau - w_t)(w'_\tau - w_\tau) \rangle = 0$.
So we have the important (and general in these kinds of problems) fact that the first-order contribution to the welfare loss is the direct impact of the delayed adjustment (the $w_\tau - w_t$ term), whereas the indirect impact (where a suboptimal choice of consumption creates modifications in future wealth) is second order. In other terms,
Appendix B. Model with Immediate Adjustment in Response to Large Changes in Equity Prices
Suppose that people pay greater attention to "large" movements in the stock markets (because they are more salient, or because it is more rational to do so). How does our bias change? We propose the following tractable way to answer this question. Say that the returns in the stock market are
where $j_t$ is a jump process with arrival rate $\lambda$. For instance, such jumps may correspond to crashes, or to "sharp corrections," though we need not have $E[dj_t] < 0$. To be specific, when a crash arrives, the return falls by $J$ (to fix ideas, say $J$ = 0.1-0.3). To model high attention to crashes, we say that consumption adjusts to $dz_t$ shocks every $D$ periods, and adjusts to $dj_t$ shocks immediately ($D = 0$ for those Poisson events). Denote by $\sigma_B^2$ the variance of Brownian shocks, and by $\sigma_J^2 = E[dj_t^2]/dt = \lambda J^2$ the variance of jump shocks. The total variance of the stock market
is
For tractability, we use the approximation $J \ll 1$ (which is reasonable, since a typical value for $J$ is 0.1 to 0.25). We get the analogue of the simple formula (1):
plus higher-order terms in $J$. One can show that formula (22), which was derived in the case of assets with Brownian shocks, carries over to the case of a mix of Brownian shocks and jumps. Thus we get, to first order,
with $b(0) = 2$ and $\sigma_{tot}^2 = \sigma_B^2 + \sigma_J^2$. Thus, the new bias is the harmonic mean of the $b(D) = 6D$ (if $D > 1$) bias for "normal" Brownian shocks, and the shorter $b(0) = 2$ bias of the jump shocks. As a numerical illustration, say a "jump" corresponds to a monthly change in the stock market of more than $J$ = 25% in absolute value. This corresponds, empirically, to an estimate of $\lambda$ = 0.53%/year (5 months since 1925), i.e. a crash every 14 years. Then $\sigma_J^2/\sigma_{tot}^2 = \lambda J^2/\sigma^2 = 0.014$. Take $D$ = 4 quarters as a baseline. The new $\hat{\gamma}/\gamma$ becomes 20.6, which is close to the old ratio of 24.
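The harmonic-mean statement can be made concrete with a small sketch. It assumes, since the displayed formula is not reproduced here, that the weights are the variance shares of the two kinds of shocks, so that the inverse of the combined bias is the share-weighted average of the inverses of 6D and 2. The function name `mixed_bias` and its arguments are illustrative.

```python
def mixed_bias(D, jump_share):
    """Combined CRRA bias when jumps are absorbed immediately.

    ASSUMPTION: weighted harmonic mean, with weights equal to the shares of
    Brownian and jump shocks in total return variance. Valid for D >= 1,
    where the Brownian bias is taken to be 6 * D; the jump bias is 2.
    """
    if D < 1:
        raise ValueError("this sketch only covers D >= 1")
    if not 0.0 <= jump_share <= 1.0:
        raise ValueError("jump_share must lie in [0, 1]")
    bias_brownian = 6.0 * D
    bias_jump = 2.0
    return 1.0 / ((1.0 - jump_share) / bias_brownian + jump_share / bias_jump)

print(round(mixed_bias(4, 0.014), 1))
print(mixed_bias(4, 0.0))
```

With the rounded jump share of 0.014 and D = 4 this sketch returns about 20.8, in the neighborhood of the 20.6 reported above; the small gap may reflect rounding of the inputs or the higher-order terms in $J$.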
Appendix C: Expression of the Bias in the Lynch Setup when $D \ge 1$
In Lynch's (1996) discrete-time setup, agents consume every month and adjust their portfolio every $T$ months. The econometrician observes time-aggregated periods of $F$ months, so $D = T/F$. Say consumer $i \in \{1, \ldots, T\}$ adjusts her consumption at $i + nT$, $n \in \mathbb{Z}$. Say the econometrician looks at period $\{1, \ldots, F\}$. The aggregate per capita consumption over this period is
The returns are
where $r_s = \ln R_s$. Call $C_{iF} = \sum_{s=1}^{F} c_{is}$ the consumption of agent $i$ in the period. For $i > F$, $\operatorname{cov}(C_{iF}, \ln R_F) = 0$, because agent $i$ did not adjust her consumption during the period. For $1 \le i \le F$, we have $c_{it} = 1 + O(\varepsilon)$ (normalizing) when $t < i$, and $c_{it} = 1 + \theta \sum_{s=1}^{i} r_s + O(\varepsilon)$ when $t \ge i$, where the $O(\varepsilon)$ terms incorporate the deterministic part of consumption growth. The stochastic part, in $r_s$, has the order of magnitude $\sigma = O(\varepsilon^{1/2})$, and dominates those terms. Information about stock returns up to $i$ will affect only consumption from time $i$ to $F$, so, denoting by $\Delta C_{iF}$ the difference in total consumption between a given period of length $F$ and the previous one,
So
But given that the mean per-period consumption is $c_{it} = 1 + O(\varepsilon^{1/2})$, the aggregate consumption is $C_F = F + O(\varepsilon^{1/2})$, and
The naive econometrician would predict $\operatorname{cov}(\Delta C_F/C_F, \ln R_F) = \theta\sigma^2 F$. The econometrician estimating $\hat{\gamma} = \pi F/\operatorname{cov}(\Delta C_F/C_F, \ln R_F)$ will get a bias [with $D = T/F$, and as $\theta = \pi/(\gamma\sigma^2)$] of
Holding $D$ constant, the continuous-time limit corresponds to $F \to \infty$, and we find the value $\hat{\gamma}/\gamma = 6D$. The discrete-time case where agents would consume at every econometric period corresponds to $F = 1$, and then one gets $\hat{\gamma}/\gamma = D$, which can be easily derived directly.

REFERENCES
Abel, A. (1990). Asset prices under habit formation and catching up with the Joneses. American Economic Review 80(2):38-42.
Barberis, N., M. Huang, and J. Santos. (2001). Prospect theory and asset prices. Quarterly Journal of Economics 116(1):1-53.
Benartzi, S., and R. Thaler. (1995). Myopic loss aversion and the equity premium puzzle. Quarterly Journal of Economics 110(1):73-92.
Brav, A., G. M. Constantinides, and C. C. Geczy. (2000). Asset pricing with heterogeneous consumers and limited participation: Empirical evidence. University of Chicago. Mimeo.
Caballero, R. J. (1995). Near-rationality, heterogeneity, and aggregate consumption. Journal of Money, Credit and Banking 27(1):29-48.
Calvo, G. (1983). Staggered prices in a utility-maximizing framework. Journal of Monetary Economics 12(3):383-398.
Campbell, J. (1999). Asset prices, consumption, and the business cycle. In Handbook of Macroeconomics, J. Taylor and M. Woodford (eds.). Chapter 19, pp. 1231-1303.
Campbell, J., and J. Cochrane. (1999). By force of habit: A consumption-based explanation of aggregate stock market behavior. Journal of Political Economy 107(2):205-251.
Cochrane, J. H. (1989). The sensitivity of tests of the intertemporal allocation of consumption to near-rational alternatives. American Economic Review 79:319-337.
Cochrane, J. H., and L. Hansen. (1992). Asset pricing explorations for macroeconomics. In 1992 NBER Macroeconomics Annual, O. Blanchard and S. Fischer (eds.). Cambridge, MA: The MIT Press.
Constantinides, G. M. (1990). Habit formation: A resolution of the equity premium puzzle. Journal of Political Economy 98:519-543.
Constantinides, G. M., J. B. Donaldson, and R. Mehra. (2000). Junior can't borrow: A new perspective on the equity premium puzzle. University of Chicago. Mimeo.
Daniel, K., and D. Marshall. (1997). The equity premium puzzle and the risk-free rate puzzle at long horizons. Macroeconomic Dynamics 1:452-484.
Daniel, K., and D. Marshall. (1999). Consumption-based modeling of long-horizon returns. Kellogg Graduate School of Management, Northwestern University. Mimeo.
Duffie, D., and T. S. Sun. (1990). Transactions costs and portfolio choice in a discrete-continuous-time setting. Journal of Economic Dynamics & Control 14(1):35-51.
Dynan, K., and D. M. Maki. (2000). Does stock market wealth matter for consumption? Federal Reserve Board. Mimeo.
Fischer, S. (1977). Long-term contracts, rational expectations, and the optimal money supply rule. Journal of Political Economy 85(1):191-205.
Gabaix, X., and D. Laibson. (2000a). A boundedly rational decision algorithm. AEA Papers and Proceedings, May, pp. 433-438.
Gabaix, X., and D. Laibson. (2000b). Bounded rationality and directed cognition. Harvard University and MIT. Mimeo.
Grossman, S. J., and G. Laroque. (1990). Asset pricing and optimal portfolio choice in the presence of illiquid durable consumption goods. Econometrica 58(1):25-51.
Grossman, S. J., A. Melino, and R. J. Shiller. (1987). Estimating the continuous time consumption based asset pricing model. Journal of Business and Economic Statistics 5:315-327.
Hall, R. E. (1978). Stochastic implications of the life cycle-permanent income hypothesis: Theory and evidence. Journal of Political Economy 86(6):971-987.
Hansen, L. P., and K. Singleton. (1983). Stochastic consumption, risk aversion, and the temporal behavior of asset returns. Journal of Political Economy 91:249-268.
Hansen, L. P., and R. Jagannathan. (1991). Implications of security market data for models of dynamic economics. Journal of Political Economy 99(2):225-262.
He, H., and D. M. Modest. (1995). Market frictions and consumption-based asset pricing. Journal of Political Economy 103:94-117.
Heaton, J., and D. J. Lucas. (1996). Evaluating the effects of incomplete markets on risk sharing and asset pricing. Journal of Political Economy 104(3):443-487.
Kennickell, A. B., M. Starr-McCluer, and B. Surette. (2000). Recent changes in U.S. family finances: Results from the 1998 Survey of Consumer Finances. Federal Reserve Bulletin, January, pp. 1-29.
Kocherlakota, N. R. (1996). The equity premium: It's still a puzzle. Journal of Economic Literature 34(1):42-71.
Lettau, M., and S. Ludvigson. (2001). Understanding trend and cycle in asset values: Bulls, bears, and the wealth effect on consumption. Federal Reserve Bank of New York. Mimeo.
Lucas, R. E. (1978). Asset prices in an exchange economy. Econometrica 46:1429-1446.
Luttmer, E. (1995). Asset pricing in economies with frictions. University of Chicago. Mimeo.
Lynch, A. (1996). Decision frequency and synchronization across agents: Implications for aggregate consumption and equity returns. Journal of Finance 51(4):1479-1497.
Lynch, A., and P. Balduzzi. (2000). Predictability and transaction costs: The impact on rebalancing rules and behavior. Journal of Finance 55(5):2285-2309.
Mankiw, N. G. (1982). Hall's consumption hypothesis and durable goods. Journal of Monetary Economics 10:417-425.
Mankiw, N. G., and S. Zeldes. (1991). The consumption of stockholders and nonstockholders. Journal of Financial Economics 29(1):97-112.
Marshall, D., and N. Parekh. (1999). Can costs of consumption adjustment explain asset pricing puzzles? Journal of Finance 54(2):623-654.
Mehra, R., and E. Prescott. (1985). The equity premium: A puzzle. Journal of Monetary Economics 15:145-161.
Merton, R. (1969). Lifetime portfolio selection under uncertainty: The continuous time case. Review of Economics and Statistics 51:247-257.
Poterba, J. (2000). Stock market wealth and consumption. Journal of Economic Perspectives 14(2):99-118.
Ramsey, F. (1928). A mathematical theory of saving. Economic Journal 38(December):543-559.
Rogers, L.C.G. (2001). The relaxed investor and parameter uncertainty. Finance and Stochastics 5(2):131-154.
Rubinstein, M. (1976). The valuation of uncertain income streams and the pricing of options. The Bell Journal of Economics 7(2):407-425.
Shiller, R. J. (1982). Consumption, asset markets, and macroeconomic fluctuations. Carnegie Mellon Conference Series on Public Policy 17:203-238.
Starr-McCluer, M. (2000). Stock market wealth and consumer spending. Federal Reserve Board of Governors. Mimeo.
Taylor, J. (1979). Staggered wage setting in a macro model. American Economic Review 69(2):108-113.
Thaler, R. (1992). Savings, fungibility, and mental accounts. In The Winner's Curse: Paradoxes and Anomalies of Economic Life. Princeton NJ: Princeton University Press, pp. 107-121.
Vissing, A. (2000). Limited stock market participation and the equity premium puzzle. University of Chicago. Mimeo.
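Returning to Appendix C: the displayed bias expression did not survive in this copy, but the covariance logic described there suggests a candidate that can be checked against the two limits stated in the text ($\hat{\gamma}/\gamma = D$ at $F = 1$ and $\hat{\gamma}/\gamma \to 6D$ as $F \to \infty$). The sketch below is a reconstruction under stated assumptions, not the authors' formula: agent $i \le F$ is assumed to load with weight $\theta$ on the first $i$ monthly returns in each of her $F - i + 1$ remaining months, agents with $i > F$ contribute nothing, and period consumption is normalized to $F$. The names `lynch_bias` and `closed_form` are illustrative.

```python
from fractions import Fraction


def lynch_bias(T, F):
    """gamma_hat / gamma by brute-force summation (assumes T >= F >= 1)."""
    # Agent i in 1..F adjusts at month i: her consumption in months i..F
    # (F - i + 1 months) moves with returns r_1..r_i (i months).
    load = sum(i * (F - i + 1) for i in range(1, F + 1))
    # Per capita (T agents) and per unit of period consumption (about F),
    # measured in units of theta * sigma^2.
    actual = Fraction(load, T * F)
    naive = Fraction(F)  # naive prediction theta * sigma^2 * F, same units
    return naive / actual


def closed_form(T, F):
    """Candidate closed form 6 D F^2 / ((F + 1)(F + 2)) with D = T / F."""
    D = Fraction(T, F)
    return 6 * D * F * F / ((F + 1) * (F + 2))


print(lynch_bias(4, 1), float(lynch_bias(2400, 1200)))
```

Under these assumptions the brute-force sum and the candidate closed form agree exactly, equal $D$ when $F = 1$, and approach $6D$ as $F$ grows with $D$ held fixed.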
Comment ANTHONY W. LYNCH New York University
1. Introduction
Gabaix and Laibson extend some earlier work examining the effects of infrequent consumption decision-making by individuals. Grossman and Laroque (1990) developed a continuous-time model in which an individual adjusts consumption infrequently because of proportional adjustment costs. Marshall and Parekh (1999) present numerical results for an economy composed of heterogeneous agents behaving in this way. Calibrating equity returns to U.S. data, they find that undetectably small consumption adjustment costs can alleviate the equity-premium puzzle by delivering the low volatility of aggregate consumption growth and its low correlation with equity return found in U.S. data.
Agents facing proportional adjustment costs use a state-dependent decision rule. As an alternative, Lynch (1996) examined an economy in which decisions are made at fixed intervals and are unsynchronized across agents. Agents choose nondurable consumption and portfolio composition, and either or both can be chosen infrequently. A small utility cost is associated with both decisions being made infrequently. Calibrating returns to the U.S. economy, Lynch (1996) also found that less frequent and unsynchronized decision making delivers the low volatility of aggregate consumption growth and its low correlation with equity return found in U.S. data. Allowing portfolio rebalancing to occur every period has a negligible effect on the joint behavior of aggregate consumption and returns.
Gabaix and Laibson present a continuous-time generalization of Lynch's model and are able to obtain analytic expressions for the bias to risk aversion imparted by less frequent consumption adjustments. The paper also calibrates a version of the model that incorporates temporal aggregation, delayed adjustment, and nonparticipation in stocks by a fraction of the agents. Consistent with the results in Lynch (1996) and Marshall and Parekh (1999), Gabaix and Laibson also find that a delayed-consumption-adjustment model can help explain the equity-premium puzzle by producing lower consumption-growth volatility and lower contemporaneous covariance of consumption growth with equity returns.
Although not modeled explicitly by Lynch or by Gabaix and Laibson, constant decision intervals arise when it is costly to gather information about wealth innovations and to solve optimization problems.
Duffie and Sun (1990) presented a model of this type and showed that if utility is power, risky-asset return is in geometric Brownian motion, and transaction costs are proportional to wealth, then the optimal decision interval is a constant. This discussion first describes the model and summarizes some of its key implications. Then the calibration and empirical work are discussed. Finally, some general comments and conclusions are presented.
2. Model Setup and Main Results
The economy has a riskless rate $r$ and a risky return that follows geometric Brownian motion with an instantaneous mean return of $\pi + r$ and an instantaneous variance of $\sigma^2$. Agents have power utility and adjust consumption every $D$ periods. At each adjustment time, agents set aside an amount for consumption over the next $D$ periods, which earns the riskless rate $r$ until consumed. The agents place their remaining wealth in an investment portfolio that is continuously rebalanced. Thus, the optimal risky-asset weight $\theta$ is the same as in the $D = 0$ case: $\theta = \pi/(\gamma\sigma^2)$. The economy has a continuum of agents, indexed by adjustment times, which are uniformly distributed over any interval of length $D$.
When $D = 0$, the econometrician still faces temporal aggregation. The variance of log per-period aggregate consumption growth is 2/3 of instantaneous volatility when $D = 0$, is declining in $D$, and is approximately $1/D$ times instantaneous volatility when $D$ is large. Instantaneous log consumption-growth autocorrelation is 0 at all lags when $D = 0$. In contrast, because of temporal aggregation, log per-period consumption growth autocorrelation when $D = 0$ is 1/4 at lag 1 and is 0 at lags of 2 or more. With temporal aggregation and $D > 0$, log per-period consumption growth autocorrelation is positive and decreasing in lag length at lags less than $D + 2$ and is 0 at lags of $D + 2$ or more.
The instantaneous contemporaneous covariance of log consumption growth with log risky-asset return with $D = 0$ is $\theta\sigma^2$. The contemporaneous covariance of log per-period consumption growth with log per-period risky-asset return is $\theta\sigma^2/2$ with $D = 0$ and is $\theta\sigma^2/(6D)$ with $D \ge 1$. Finally, with $D = 0$, the instantaneous covariance of log consumption growth with lagged log risky-asset return is 0 at all lags > 0, while the covariance of log per-period consumption growth with lagged log per-period risky-asset return is positive at lags less than 2 and is 0 at lags of 2 or more. Once $D \ge 1$, the covariance of log per-period consumption growth with lagged log per-period risky-asset return is positive at lags less than $D + 2$ and is 0 at lags of $D + 2$ or more.
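The pure temporal-aggregation benchmark ($D = 0$) quoted above can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: consumption is treated as a random walk in levels observed on `m` sub-steps per period, the loading on returns is normalized to one, and the instantaneous variance is normalized to one per period. The function name `aggregation_moments` is illustrative.

```python
def aggregation_moments(m):
    """Moments of per-period growth in time-averaged consumption when D = 0.

    Consumption is a random walk on m sub-steps per period; each increment
    has variance 1/m. Returns (variance, covariance with the same-period
    return, lag-1 autocorrelation) of the change in period averages.
    """
    step_var = 1.0 / m
    # Weight of the period-t change on the k-th increment of period t-1 ...
    prev_w = [(k - 1) / m for k in range(1, m + 1)]
    # ... and on the k-th increment of period t.
    curr_w = [(m - k + 1) / m for k in range(1, m + 1)]
    variance = step_var * (sum(w * w for w in prev_w) + sum(w * w for w in curr_w))
    # The period-t return loads one-for-one on each period-t increment.
    cov_with_return = step_var * sum(curr_w)
    # The period-(t+1) change loads on period-t increments with the prev_w pattern.
    autocov = step_var * sum(a * b for a, b in zip(curr_w, prev_w))
    return variance, cov_with_return, autocov / variance


variance, cov_with_return, autocorr = aggregation_moments(1000)
print(variance, cov_with_return, autocorr)
```

As `m` grows, the three moments approach 2/3, 1/2, and 1/4, the $D = 0$ values cited in the text; with `m = 1` there is no aggregation and the moments are 1, 1, and 0.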
To summarize, the contemporaneous covariance of consumption growth with risky-asset return is lower than instantaneous due to temporal aggregation alone, lower still due to infrequent adjustment, and decreasing in D. The variance of consumption growth is lower than the instantaneous variance due to temporal aggregation alone, is lower still due to infrequent adjustment, and is decreasing in D. The covariance of consumption growth with lagged consumption growth and with lagged risky-asset return is positive for lags less than 2 due to temporal aggregation alone, and in general is positive for lags less than D + 2 and zero otherwise. While the paper typically fixes the lag or the period length and varies D, it would be useful to examine what happens to the various statistics
of interest as the lag or the period length is varied for fixed $D$. Such an analysis would be helpful for generating testable implications, since one particular distribution for $D$ holds empirically.
The paper's assumption that agents continuously rebalance their investment portfolios seems inconsistent with a fixed adjustment interval, since the assumed fixed interval between consumption adjustments is difficult to justify when agents know the risky-asset return. The paper's closed-form solutions rely on continuous portfolio rebalancing by agents. Restricting the ability of an agent to rebalance her portfolio within her adjustment period may affect the distribution of aggregate consumption growth.
In an economy with infinite-lived agents making unsynchronized consumption and portfolio decisions every $D$ periods, the cross-sectional distribution of agent wealth (expressed as a fraction of total wealth) becomes increasingly dispersed over time, indicating that aggregate consumption growth does not have a steady-state distribution. This concern prompted Lynch (1996) to build an overlapping-generations economy with finite-lived individuals and a deterministically growing wealth endowment for each period's newborn. Lynch finds that the implications for aggregate consumption growth of infrequent consumption adjustments are largely unaffected by the portfolio-rebalancing frequency of agents.

3. Calibration and Empirical Work
Gabaix and Laibson estimate $D$ based on a cost of adjusting consumption of 0.04% of wealth and obtain an estimate of 2 years. This estimate is likely to overstate $D$, since their calculations assume continuous portfolio rebalancing. The paper then calibrates a simple macro model with shareholders who delay consumption and nonshareholders who do not delay. The calibration makes many simplifying assumptions: both groups have the same propensity to consume, and the wealth of the two groups is assumed to be uncorrelated. Some sensitivity analysis would be useful.
While the endogenous adjustment period is calculated to be 2 years, the calibrated model has a continuum of agents and $D$ is uniformly distributed from 0 to 30 years. This switch is not innocuous. For example, the empirical section attempts to explain the pattern of covariances between equity $R_{t+1}$ and $\ln(C_{t+h}/C_t)$ as a function of $h$. The paper uses quarterly data for this section and finds these covariances are roughly increasing in $h$ out to at least 20 quarters, particularly for countries with large stock markets. This model generates the upward-sloping pattern out to at least 20 quarters that is found in the data. But this result
depends critically on $D$ taking values out to at least 20 quarters. If the paper used the calibrated adjustment period of $D = 8$ quarters, the covariance pattern would be flat for $h > 10$ quarters.
In the calibration section, the paper only compares the model with the data on consumption growth's volatility, contemporaneous covariance with return, and first-lag autocorrelation. Then, the empirical section attempts to explain the pattern of covariances between equity $R_{t+1}$ and $\ln(C_{t+h}/C_t)$ and the pattern of $h$-period Euler equations, both as functions of $h$. However, the distinction between the paper's calibration and empirical work seems artificial. Since the model also provides many other moments that could be compared with the data, it would be useful to calibrate a model and then examine how it performs with respect to a wide range of moments for aggregate consumption growth. A more systematic analysis would be instructive. It would also be useful to have standard errors for the data estimates as part of such an analysis. One approach simulates samples using the model and calculates a distribution for the statistic of interest. This distribution can then be used to calculate a p-value for the data estimate under the null that the model holds.
For example, the paper does not examine consumption-growth autocorrelations beyond the first lag, even though the model provides predictions about them. Heaton (1993) finds negative autocorrelation at the 5th lag for quarterly seasonally adjusted consumption changes, negative autocorrelation at the 1st and 4th lags for monthly seasonally adjusted consumption changes, and negative autocorrelation at the 1st, 2nd, and 5th lags for quarterly non-seasonally-adjusted consumption changes. Negative autocorrelation at any lag is inconsistent with the model. So despite the likely role being played by measurement error, Heaton's results are challenges for the paper's delayed-consumption-adjustment model.
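The simulation approach to a p-value mentioned above can be sketched as follows. This is a minimal illustration, not the paper's calibrated model: it simulates only the $D = 0$ temporal-aggregation benchmark (a random walk averaged over `m` sub-steps per period), uses the lag-1 autocorrelation as the statistic, and takes a hypothetical data estimate as input. All function names, the sample length of 190 quarters, and the default settings are assumptions made for the example.

```python
import random


def simulate_growth(n_periods, m, rng):
    """Per-period changes in time-averaged random-walk consumption (D = 0)."""
    level = 0.0
    averages = []
    for _ in range(n_periods + 1):
        total = 0.0
        for _ in range(m):
            level += rng.gauss(0.0, (1.0 / m) ** 0.5)
            total += level
        averages.append(total / m)
    return [b - a for a, b in zip(averages, averages[1:])]


def lag1_autocorr(x):
    """Sample first-order autocorrelation."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    return sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)


def monte_carlo_p_value(data_stat, n_periods, n_sims=200, m=20, seed=0):
    """Two-sided p-value of data_stat under the simulated null distribution."""
    rng = random.Random(seed)
    sims = [lag1_autocorr(simulate_growth(n_periods, m, rng)) for _ in range(n_sims)]
    center = sum(sims) / n_sims
    # Share of simulated statistics at least as far from the simulated mean.
    extreme = sum(1 for s in sims if abs(s - center) >= abs(data_stat - center))
    return extreme / n_sims, sims


p_value, sims = monte_carlo_p_value(data_stat=0.25, n_periods=190)
print(p_value, sum(sims) / len(sims))
```

Replacing `simulate_growth` with a simulator of the delayed-adjustment model, and `lag1_autocorr` with any other moment of interest, would give the kind of model-based test described in the text.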
Thus, while a useful first step, the empirical work is far from conclusive, and more needs to be done to ascertain whether delayed consumption adjustment is playing a role in the U.S. economy and in other economies.

4. General Comments and Conclusions
The assumption of a predetermined delay interval $D$ is difficult to justify. At the very least, agents are likely to adjust consumption after large changes in equity value. While this effect causes all investors to adjust at the same time (see, for example, Marshall and Parekh, 1999), Gabaix and Laibson find the upward bias to risk aversion can still be large. However, their adjustment trigger of 25% monthly return in absolute value is quite extreme, and the upward bias associated with a more modest and reasonable trigger point is likely to be much smaller.
A model in which investors adjust consumption after a large change in equity value is empirically distinguishable from the fixed-decision-interval model. The former predicts that the standard Euler equation should perform well in those periods that are preceded by a period with a large equity return in absolute value. Empirical work is needed to characterize the delayed-adjustment rule (if any) being used by agents in the U.S. economy and other economies.
The model assumes that equity value is in geometric Brownian motion. There is much evidence that equity returns are predictable and heteroscedastic. The implication may be a delay interval $D$ that depends on the same variables that forecast means and variances.
It seems unlikely that agents always take recent wealth innovations into account when making high-frequency consumption decisions. Gabaix and Laibson's analytical results serve to emphasize the potentially large effect of such behavior on the joint distribution of aggregate consumption and equity return. Hopefully their work will prompt more theoretical and especially empirical work directed toward understanding how agents delay adjusting their consumption and how this delay affects aggregate consumption in the U.S. and other countries.

REFERENCES
Duffie, D., and T. Sun. 1990. Transaction costs and portfolio choice in a discrete-continuous-time setting. Journal of Economic Dynamics and Control 14:35-51.
Grossman, S., and G. Laroque. 1990. Asset pricing and optimal portfolio choice in the presence of illiquid durable consumption goods. Econometrica 58:25-51.
Heaton, J. 1993. The interaction of time-nonseparable preferences and time aggregation. Econometrica 61(2):353-385.
Lynch, A. W. 1996. Decision frequency and synchronization across agents: Implications for aggregate consumption and equity return. Journal of Finance 51:1479-1498.
Marshall, D.A., and N.G. Parekh. 1999. Can costs of consumption adjustment explain asset pricing puzzles? Journal of Finance 54:623-654.
Comment MONIKA PIAZZESI UCLA and NBER
1. Introduction
An economy populated by a representative agent with power utility predicts an equity premium which is far below the realized equity premium in postwar data, at least for "reasonable parameters" for the endowment process and the coefficient of relative risk aversion $\gamma$. This is the equity-premium puzzle stated by Mehra and Prescott (1985). Simply increasing $\gamma$ (and somehow arguing that this is "reasonable") does not solve the puzzle, because a high $\gamma$ counterfactually leads to a high risk-free rate. The few models in the literature today that may be considered puzzle-free still rely on high $\gamma$'s. An example is Campbell and Cochrane (1999), who use an average $\gamma$ of 50. An argument that relies on estimation bias for $\gamma$ alone, as suggested by the title of the paper, cannot therefore be enough to reconcile the standard model with the data.
But there is more to the model of Gabaix and Laibson than the title indicates, because it is populated by agents whose heterogeneity matters. The high equity premium and the low risk-free rate are, literally speaking, no puzzles in the model: asset prices are specified exogenously. The endogenous variables in this model are the consumption processes of individual investors. By summing these over a group of investors, the paper obtains a measure of aggregate consumption. The interpretation of this portfolio choice model, or Merton model, as a production economy with exogenous production technologies (or as a small open economy) leads to another endogenous variable: net borrowing by this group of investors (or the current account). The behavior of these endogenous variables (consumption and net borrowing) is what is puzzling in models with exogenous returns (such as Constantinides, 1990). My discussion will thus concentrate on the model-implied behavior of these endogenous variables.
The model is a continuous-time version of Lynch (1996). Agents are indexed by a first adjustment time $i$ and an interval length $D_i$ between adjustments, which together define an (exogenous) adjustment sequence $\{i, i + D_i, i + 2D_i, \ldots\}$.
Between adjustment times, $[i + jD_i, i + (j + 1)D_i)$, $j \in \mathbb{N}$, agents do not know the returns of risky assets and do not trade them. This feature makes assets illiquid. As in a limited-participation model, the Euler equations for only a subset of agents hold at any point in time $t$ in this economy. With adjustment delays, the first-order conditions for risky-asset holdings at time $t$ are only satisfied for those agents that are adjusting at time $t$. The intuition from a closed-economy version of this model tells us that in this case agents need to be compensated to hold these illiquid assets. The resulting equity premium is not so much a risk premium in the usual sense as a liquidity premium.
We need to be careful, however, in applying closed-economy intuition to this setup, because it is not clear whether the implications of the model will survive in a closed-economy setting. The reason is that agents in the model continuously observe the riskless rate, which is assumed to be constant. In a closed economy, the riskless rate responds to stock-market movements and therefore reveals information from other agents in the economy (who get to adjust their consumption earlier in response to these movements). This means that even agents who do not directly observe stock returns can infer from the riskless rate whether the stock market just tanked and thus can adjust their consumption immediately. The closed-economy version of the model with learning will be more difficult to solve, but future research will hopefully tell us how it behaves.
The puzzles lie in the numbers, so I will compare the model's implication with the joint time series of quarterly U.S. aggregate consumption and real stock returns. I will show that adjustment delays alone cannot provide an explanation for the equity premium. The model fails along three main dimensions: (i) consumption growth from the model is too autocorrelated, (ii) the normalized covariance of returns with consumption monotonically increases with horizon in the model, while it is hump-shaped in the data with a peak at 2 years, and (iii) returns are assumed to be i.i.d., while they are predictable in the data.
The reason for (i) is that stock-market shocks trigger a series of individual consumption adjustments in the same direction by agents who only get to adjust later to the shock. The resulting aggregate consumption growth process thus looks autocorrelated and predictable by stock returns. The model does not seem to generate too much predictability for consumption, but it does imply too much autocorrelation for consumption growth.
The reason for (ii) is that as we lower the frequency at which we observe data relative to the frequency at which consumption decisions are made, the model looks more and more like a standard model without adjustment delays. In standard models the covariance between consumption growth and stock returns divided by horizon increases with horizon.
This feature is counterfactual; it is known as the equity-premium puzzle at long horizons and is documented by Cochrane and Hansen (1992).
There is a long list of variables that successfully predict stock returns in (iii). The list includes term spreads (Campbell, 1987), the dividend-earnings ratio (Lamont, 1996), and the consumption/wealth ratio (Lettau and Ludvigson, 2001). I show that even lagged consumption growth (which is a variable directly taken from the model) is a predictor (but of course less successful than other variables).
In addition to these three problems, the model may be relying on large and counterfactual net borrowing from "foreigners" (agents whose consumption is not used to define aggregate consumption) to sustain the exogenously fixed low risk-free rate, but I have not looked at the behavior of net borrowing.
In the process of documenting the properties of the model, I also show that the first three autocorrelations of consumption growth are significantly different from zero in the data. Moreover, consumption growth is heteroscedastic in the data. For example, a GARCH(1,1) is significant. These two properties mean that consumption growth is certainly not i.i.d., an assumption often made by recent consumption-based asset pricing models (following Hall, 1978). Heteroscedasticity may be important for explaining the time variation in expected returns which is not captured in this paper. Models that replicate this time variation typically rely on features of preferences which produce time-varying risk aversion (Campbell and Cochrane, 1999; Barberis, Huang, and Santos, 2001; Veronesi, 2001).
I also show that the cross-correlation of consumption growth and stock returns data seems to be seasonal. This seasonality appears even though the consumption data are seasonally adjusted. At first sight this adjustment looks successful, because the autocorrelation function of consumption growth does not show any obvious seasonal patterns. The cross-moments with returns, however, seem to indicate that it may matter for stock pricing that real-life investors are consuming a seasonal consumption process. This raises the question whether the predictability of consumption growth is a feature of the data that should be matched by an asset-pricing model.
The following discussion will thus concentrate on the autocorrelation and predictability of consumption growth, the predictability of returns, and the equity premium at long horizons. Here, "consumption" always refers to aggregate consumption. I will then return to the interpretation of adjustment delays in terms of cognitive costs that is offered in this paper and suggest extensions.
2. Data and Calibration
The comparison of the model with the data relies on two series: consumption and real stock returns. Consumption is for nondurables and services excluding shoes and clothing, seasonally adjusted in 1996 chain-weighted dollars. The returns are for all stocks traded on NASDAQ, AMEX, and the NYSE. The calculation of real returns relies on the consumer price index. The sample consists of quarterly data from 1953:1 to 2000:3.
Since I use different consumption and returns data than the paper, I also use slightly different parameter values to calibrate the model: $r + \pi = 0.08$, $\sigma = 0.16$, $\gamma = 4$, $D_i = 4$ or $D_i = 10$, $\forall i$. I assume that initial adjustment times $i$ are uniformly distributed over $[0, D]$.
3. Autocorrelation of Consumption Growth
Figure 1 shows the autocorrelation of consumption growth at different lags $h$ together with 95% confidence bounds. The autocorrelation is significant up to the third quarter, which means that consumption growth is definitely not i.i.d. The figure also shows the autocorrelations implied by the model for $D = 4$ and $D = 10$. The general pattern is that the autocorrelation in a model with interval length $D$ between two decisions dies off after $D$ periods. The autocorrelation in the data seems to be best matched by choosing $D = 4$. The first two autocorrelations of 0.85 and 0.57 produced by the model for $D = 4$ are clearly too high compared to the data.
As an aside, I would like to add that autocorrelation is not the only dimension in which consumption growth is not i.i.d. Consumption growth is also heteroscedastic, a property which may be important for explaining the time variation in expected returns (which is not captured by the model). This can be seen from Table 1, which reports the maximum-likelihood estimates of an AR(3) combined with a GARCH(1,1). The estimate of the GARCH parameter $a_2$ is 0.95 and is strongly significant. The autoregressive parameters are partial correlations, so they differ from Figure 1, which shows autocorrelations.

FIGURE 1 AUTOCORRELATION OF CONSUMPTION GROWTH

Table 1 MAXIMUM-LIKELIHOOD ESTIMATES OF $\Delta \log c_t = c_0 + c_1 \Delta \log c_{t-1} + c_2 \Delta \log c_{t-2} + c_3 \Delta \log c_{t-3} + \varepsilon_t$
c_0           c_1           c_2           c_3           a_0           a_1           a_2
0.01 (7.04)   0.31 (4.01)   0.03 (0.33)   0.23 (2.93)   0.00 (0.64)   0.03 (1.19)   0.95 (29.56)
Here $\varepsilon_t$ is conditionally normal with mean 0 and variance $\sigma_t^2 = a_0 + a_1 \varepsilon_{t-1}^2 + a_2 \sigma_{t-1}^2$. The estimation uses quarterly data on U.S. consumption of nondurables and services without shoes and clothing from 1953:1 to 2000:3. t-statistics are in brackets.

4. Equity Premium at Long Horizons
To see how the model behaves as we vary the observation horizon $h$ for a fixed decision interval length $D > 1$, consider the following equation that determines the equity premium in the model:
Figure 2 shows the covariance factor on the right-hand side of this equation, the covariance of consumption growth and stock returns divided by the horizon. In the data, this covariance is hump-shaped as a function of horizon: increasing up to 2 years and then decreasing. The model predicts a monotonically increasing covariance. The reason is that as we lower the observation frequency relative to the decision interval length D, the model behaves more and more like the original Merton model without adjustment delays. Therefore the model predicts a high covariance of consumption growth and stock returns at long horizons, which is counterfactual. The equity premium at long horizons was noted by Cochrane and Hansen (1992) and was seen as causing a problem for the time-aggregation literature because aggregation problems matter less as we lower the frequency at which we observe the data. The same now applies to a model with adjustment delays. Figure 2 does not show the standard errors around the covariance estimates, which get large with horizon to the extent that the hump in the empirical covariance is not significant. The equity premium, however, is not much of a puzzle if we take into account standard errors in this case (as can be seen from the cross-correlation at h = 0 in Figure 5 below).

FIGURE 2 COVARIANCE OF $\log(c_{t+h}/c_t)$ AND $\log R_{t,t+h}$ DIVIDED BY h
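The statistic plotted in Figure 2 can be sketched in a few lines. The block below is only an illustration under stated assumptions: the data are synthetic, the delayed-adjustment rule generating `dlog_c` is a made-up stand-in for the model, and the helper names (`sample_cov`, `rolling_sum`, `horizon_covariance`) are hypothetical rather than taken from the paper.

```python
import random

def sample_cov(x, y):
    """Sample covariance with the usual n - 1 denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def rolling_sum(x, h):
    """Overlapping h-period sums: element t is x[t] + ... + x[t + h - 1]."""
    return [sum(x[t:t + h]) for t in range(len(x) - h + 1)]

def horizon_covariance(dlog_c, log_r, h):
    """cov(log(c_{t+h}/c_t), log R_{t,t+h}) / h from one-period series."""
    return sample_cov(rolling_sum(dlog_c, h), rolling_sum(log_r, h)) / h

# Synthetic example (assumption): consumption growth responds to the average
# of the last D quarterly returns, plus noise.
rng = random.Random(0)
n, D = 400, 4
log_r = [rng.gauss(0.02, 0.08) for _ in range(n)]
dlog_c = [
    0.005 + 0.1 * sum(log_r[max(0, t - D + 1):t + 1]) / D + rng.gauss(0.0, 0.005)
    for t in range(n)
]

cov_by_horizon = {h: horizon_covariance(dlog_c, log_r, h) for h in (1, 2, 4, 8)}
for h, value in cov_by_horizon.items():
    print(h, value)
```

Because the h-period sums overlap, estimates at adjacent dates are serially correlated, which is one reason the standard errors widen with the horizon.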
5. Predictability of Consumption Growth

To look at the predictability of consumption with stock returns, Gabaix and Laibson compute the cumulative covariance of log stock returns $\log R_{t,t+1}$ from time t to time t + 1 with consumption growth $\log(c_{t+1+h}/c_t)$ from time t to time t + 1 + h, for different quarterly horizons h. By decomposing this covariance measure into its individual elements, we get

$$\operatorname{cov}\left(\log\frac{c_{t+1+h}}{c_t},\ \log R_{t,t+1}\right) = \sum_{i=0}^{h} \operatorname{cov}\left(\log\frac{c_{t+1+i}}{c_{t+i}},\ \log R_{t,t+1}\right).$$
From the last equation, we can see that this cumulative covariance measure does not only reflect whether stock returns predict consumption growth, because part of the covariance is due to the contemporaneous covariance $\operatorname{cov}(\log(c_{t+1}/c_t), \log R_{t,t+1})$ between returns and consumption growth. Figure 3 plots this cumulative covariance measure (like Figure 7 in the paper), while Figure 4 plots the individual components in the sum on the right-hand side of the last equation. Both figures are based on U.S. data for nondurables and services instead of the total consumption series from different countries used in the paper. The dashed lines are 95% confidence bounds based on Newey-West standard errors. Figure 3 shows that the contemporaneous covariance estimate in the data is already nonzero, and then the covariance measure increases up to 7 quarters. Beyond that, the covariance slightly decreases with horizon, but confidence bounds become large. The figure shows that the covariance pattern in the data is well replicated by the model if the interval length D between decisions is set to 4 quarters.

FIGURE 3 COVARIANCE OF $\log(c_{t+1+h}/c_t)$ AND $\log R_{t,t+1}$

FIGURE 4 COVARIANCE OF $\log(c_{t+1+h}/c_{t+h})$ AND $\log R_{t,t+1}$
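The relation between the cumulative measure in Figure 3 and the individual components in Figure 4 can be checked numerically. The following is a minimal sketch on synthetic data, not the series used in the comment; the function names and the data-generating rule are assumptions made for illustration. Both quantities are computed on the same sample of dates t so that the components add up exactly to the cumulative covariance.

```python
import random

def sample_cov(x, y):
    """Sample covariance with the usual n - 1 denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def lead_covariances(dlog_c, log_r, h):
    """Components cov(log(c_{t+1+i}/c_{t+i}), log R_{t,t+1}) for i = 0, ..., h.

    dlog_c[t] holds log(c_{t+1}/c_t) and log_r[t] holds log R_{t,t+1};
    every component uses the common sample t = 0, ..., n - h - 1.
    """
    m = len(log_r) - h
    return [sample_cov(dlog_c[i:i + m], log_r[:m]) for i in range(h + 1)]

def cumulative_covariance(dlog_c, log_r, h):
    """cov(log(c_{t+1+h}/c_t), log R_{t,t+1}) on the same common sample."""
    m = len(log_r) - h
    growth = [sum(dlog_c[t:t + h + 1]) for t in range(m)]
    return sample_cov(growth, log_r[:m])

# Synthetic example (assumption): consumption reacts to the current and the
# previous quarter's return.
rng = random.Random(1)
n = 200
log_r = [rng.gauss(0.02, 0.08) for _ in range(n)]
dlog_c = [
    0.005 + 0.02 * log_r[t] + 0.02 * (log_r[t - 1] if t > 0 else 0.0)
    + rng.gauss(0.0, 0.004)
    for t in range(n)
]

h = 7
parts = lead_covariances(dlog_c, log_r, h)
total = cumulative_covariance(dlog_c, log_r, h)
print(total, sum(parts))
```

The first element of `parts` is the contemporaneous covariance, so a nonzero cumulative measure at h = 0 says nothing about predictability on its own.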
The covariance of the total consumption data (used in the paper) with returns seems to increase with horizon. To replicate this, Gabaix and Laibson use a distribution for D over [0,30] years to compute the covariance measure from the model. This is not necessary for data on nondurables and services. Section 6.2 of the paper compares Figure 7 with the plain Merton model with i.i.d. consumption growth. This is not really an appropriate comparison, because it is clear that a model where consumption growth is assumed to be i.i.d. does not imply any predictability. Models with exogenous returns like Constantinides (1990) tend to produce too much predictability, and so they provide a more natural benchmark. The individual covariances in Figure 4 represent the slope of the cumulative covariance function in Figure 3. We can see that the slope is significant and positive for horizons 1, 2, and 4, while it becomes negative at horizon 8. This pattern looks somewhat seasonal, even though the consumption series is seasonally adjusted. This pattern suggests that the covariance increase until h = 7 in Figure 3 may be due to seasonalities. In
this case, it is not clear whether this predictability is a feature that the model should match. More generally, the pattern raises doubts about the use of seasonally adjusted data for tests of consumption-based asset pricing models.

6. Predictability of Returns

Stock returns can be predicted with a large number of variables. Figure 5 shows the cross-correlation between current consumption growth $\log(c_{t+1}/c_t)$ and returns from time t + h to t + h + 1 for varying horizons h together with approximate 95% confidence bounds (computed as $\pm 2/\sqrt{T}$, where T is the number of observations in the sample). The pattern of this cross-correlation for h = 0, -1, -2, -4, -8 shows again that the equity premium is measured with a lot of noise (supposing the standard Euler equation holds) and that consumption growth is predictable with stock returns as documented in Section 5. The interesting stylized fact that emerges from this graph is that the cross-correlation is also significant at h = 4. This means we can use current consumption growth to predict returns one year from now. The model by Gabaix and Laibson is not consistent with this feature of the data, because it assumes that these returns are i.i.d. and thus not predictable. Future research will hopefully show whether adjustment delays can be combined with something else, such as habit formation, so that the extended model can capture this important stylized fact.

FIGURE 5 CROSS-CORRELATION OF $\log(c_{t+1}/c_t)$ WITH $\log R_{t+h,t+1+h}$

7. Some Evidence about Cognitive Costs

The paper assumes that agents do not receive or process stock-market information between any two periods. If this assumption is a good description of individual behavior, real net mutual-fund inflows should react to past stock return information. To check this implication of the model, I collect monthly data on net inflows into stock funds from 1984:1 to 2001:2. These data can be obtained from the Web site of the Investment Company Institute. Real inflows are computed based on the consumer price index. I also subtract a linear trend from the real inflows.

FIGURE 6 CORRELATION BETWEEN Inflows(t) AND $\log R_{t-h,t-h+1}$

Figure 6 shows that only
the contemporaneous correlation between real net inflows and returns is significant, not the correlation between inflows and past returns. While this is certainly not conclusive evidence against cognitive costs, the graph still provides some evidence that investors do not react to past return information when choosing their portfolio.

8. Extension to General Equilibrium

The riskless rate is exogenous in this model and therefore does not reveal any of the information held by agents who have only recently adjusted their portfolios. I doubt this feature of the model will still be true in a closed-economy version where the riskless rate is allowed to move in response to a stock-market crash. This version is not easy to compute, because the wealth distribution matters even without idiosyncratic shocks. It would be interesting to link it to models in the incomplete-market literature (e.g., Krusell and Smith, 1997), which also try to increase individual consumption volatility like Gabaix and Laibson, but with a different mechanism. There is some hope that a combination of the two will be successful at explaining the equity-premium puzzle.

REFERENCES

Barberis, N., M. Huang, and T. Santos. (2001). Prospect theory and asset prices. Quarterly Journal of Economics, February, 116:1-54.

Campbell, J. Y. (1987). Stock returns and the term structure. Journal of Financial Economics 18:373-399.

Campbell, J. Y., and J. H. Cochrane. (1999). By force of habit: A consumption-based explanation of aggregate stock market behavior. Journal of Political Economy 107:205-251.

Cochrane, J. H., and L. P. Hansen. (1992). Asset pricing lessons for macroeconomics. In 1992 NBER Macroeconomics Annual, O. Blanchard and S. Fischer (eds.). Cambridge, MA: The MIT Press, pp. 115-165.

Constantinides, G. M. (1990). Habit formation: A resolution of the equity premium puzzle. Journal of Political Economy 98:519-543.

Hall, R. (1978). Stochastic implications of the life cycle-permanent income hypothesis: Theory and evidence. Journal of Political Economy 86:971-987.

Krusell, P., and A. Smith. (1997). Income and wealth heterogeneity, portfolio choice, and equilibrium asset returns. Macroeconomic Dynamics 1(2):245-272.

Lamont, O. (1996). Earnings and expected returns. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 5671.

Lettau, M., and S. Ludvigson. (2001). Resurrecting the (C)CAPM: A cross-sectional test when risk premia are time-varying. Journal of Political Economy, forthcoming.

Lynch, A. (1996). Decision frequency and synchronization across agents: Implications for aggregate consumption and equity return. Journal of Finance 51:1479-1498.

Mehra, R., and E. Prescott. (1985). The equity premium puzzle. Journal of Monetary Economics 15:145-161.

Veronesi, P. (2001). Belief-dependent utilities, aversion to state uncertainty and asset prices. Chicago Graduate School of Business. Working Paper.
Discussion

David Laibson admitted that how to fix D, the length of the period between readjustments of consumption, is an important question, and that one would not expect everyone in the economy to have the same D. He explained that the assumption of continuous rebalancing was not a crucial one, as it affected only second-order terms. He was very receptive to the idea that some important financial events capture people's attention. An extension to the model to capture this phenomenon through a Poisson arrival rate of important events affects the results only slightly. Laibson recognized that dealing with the long-horizon evidence was important and suggested that the picture would look better using international data. He also said that at long horizons, standard errors become very large, so the evidence neither supported nor rejected the framework.

Robert Barsky suggested that if consumption had to be committed in advance, the effects could be the same as when investors rebalance their portfolios only intermittently because of cognitive costs. He asked whether the model could deal with the puzzle that stocks outperform bonds over long periods. Laibson agreed that cognitive costs are just one possible explanation for delayed adjustment. He guessed that the model had nothing to say about the returns on stocks relative to bonds.

David Romer suggested that the authors should look more carefully at the equity premium over long rather than short horizons, as their explanation seemed to have an effect only at short horizons. He commented that even if the model failed to explain all of the puzzle at long horizons, it was still a useful contribution. He did not see why there should be one single explanation for the entire equity-premium puzzle, a view with which Laibson was sympathetic. Romer also said that the fact that the equity premium had fallen in recent years made him nervous about theories that predict a premium at all times and places.
Xavier Gabaix remarked that, according to recent surveys, it appears that the public's expected return on stocks remains high, even though actual returns have fallen. Nobuhiro Kiyotaki suggested that limited participation can arise endogenously from the costs of rebalancing portfolios. He suggested that
the authors could get a sense of the importance of cognitive costs by looking at the size of asset holdings of participants and nonparticipants in the stock market.

Greg Mankiw was struck by the fact that the model predicted positive autocorrelation of consumption growth, counter to some of the empirical evidence. Laibson responded that he thought the model did reasonably well on this score.

Jim Stock said he would like to see an examination of the temporal aggregation problem in this context.

Gertler suggested that looking at the standard deviation of individual consumption in the model and in the data would be a good way of evaluating the empirical plausibility of the model. Laibson replied that he believed the jumps in consumption predicted by the model were of a reasonable order of magnitude.
Timothy Cogley and Thomas J. Sargent ARIZONA STATE UNIVERSITY; AND STANFORD UNIVERSITY AND HOOVER INSTITUTION
Evolving Post-World War II U.S. Inflation Dynamics

1. Introduction

This paper uses a nonlinear stochastic model to describe inflation-unemployment dynamics in the United States after World War II. The model is a vector autoregression with coefficients that are random walks with reflecting barriers that keep the VAR stable. The innovations in the coefficients are arbitrarily correlated with each other and with innovations to the observables. The model enables us to detect features that have been emphasized in theoretical analyses of inflation-unemployment dynamics. Those analyses involve coefficient drift in essential ways. Thus, DeLong (1997), Taylor (1997, 1998), and Sargent (1999) interpreted the broad movements of the inflation rate in terms of the monetary authority's changing views about the Phillips curve. According to them, the runup in inflation in the late 1960s and 1970s occurred because the monetary authority believed that there was an exploitable trade-off between inflation and unemployment. Its beliefs induced the monetary authority to accept the temptation to inflate more and more until eventually it had attained Kydland-Prescott (1977) time-consistent inflation rates. But the observations of the 1970s taught Volcker and Greenspan the natural-rate hypothesis, which they eventually acted upon to reduce inflation.

We are grateful to Irena Asmundsen, Sergei Morozov, and Chao Wei for excellent research assistance. For comments and suggestions, we thank Charles Evans, Marvin Goodfriend, Lars Hansen, Chang-Jin Kim, Robert King, Charles Nelson, Simon Potter, Martin Schneider, Christopher Sims, James Stock, Harald Uhlig, and seminar participants at ASU, FRB Atlanta, FRB Richmond, Penn, Stanford, UCLA, UC Riverside, UC Santa Barbara, the 2000 SED Meetings, and the 2001 NBER Macro Annual Conference. Sargent thanks the National Science Foundation for research support through a grant to the National Bureau of Economic Research.

Another mechanism was posited by Parkin (1993) and Ireland (1999), who argued that the inflation-unemployment dynamics are driven by exogenous drift in the natural rate of unemployment, for example due to demographic changes. Because the time-consistent inflation rate varies directly with the natural rate of unemployment, Parkin and Ireland attributed the drift in the inflation rate to drift in the natural rate of unemployment.

The DeLong-Taylor-Sargent story makes contact with various elements in Lucas's (1976) critique. It makes the drift in inflation-unemployment dynamics a consequence of the monetary authority's evolving views about the economy. The story attributes alterations in the law of motion for inflation and unemployment to the changing behavior of the monetary authority, which emerges in turn from its changing beliefs. This story is consistent with one way that Lucas (1976) has been read, namely, as an invitation to impute observed drift in coefficients of econometric models to time-series variation in government policy functions.

Sargent's (1999) version of the story focuses on how the coefficient drift over time affected the results of time-series tests of the natural-rate hypothesis. In the late 1960s, Robert Solow and James Tobin proposed a test of the natural-rate hypothesis. Using data through the late 1960s, that test rejected the natural-rate hypothesis in favor of a permanent trade-off between inflation and unemployment. Lucas (1972) and Sargent (1971) criticized that test for not properly stating the implications of the natural-rate hypothesis under rational expectations. In particular, the Solow-Tobin test was correct only if inflation exhibited a unit root. Before the 1970s, postwar U.S. inflation data did not exhibit a unit root, rendering invalid (in the opinion of Lucas and Sargent) Solow's and Tobin's interpretation of their test.
However, in the 1970s, just when U.S. inflation seems to have acquired a unit root, the Solow-Tobin test began accepting the natural-rate hypothesis. Building on Sims (1988) and Chung (1990), Sargent (1999) constructs an adaptive model of the government's learning and policymaking that centers on the process by which the government learns an imperfect version of the natural-rate hypothesis, cast in terms of Solow and Tobin's representation. Parts of Sargent's adaptive story acquire credibility when it is noted how the Solow-Tobin characterization of the natural-rate hypothesis has endured, despite the criticism of Lucas and Sargent. As Hall (1999) and Taylor (1998) lament, that faulty characterization continues to be widely used. For example, see Rudebusch and Svensson (1999) for a widely
cited model that represents the natural-rate hypothesis in the Solow-Tobin form. Fisher and Seater (1993), King and Watson (1994, 1997), Fair (1996), Eisner (1997), and Ahmed and Rogers (1998) construct tests of long-run neutrality that are predicated on the assumption of a unit root in inflation.1 Estrella and Mishkin (1999) use the Solow-Tobin characterization to estimate the natural rate of unemployment. In the discussion following the paper by Estrella and Mishkin, John Williams confesses that the Federal Reserve Board's large-scale macroeconometric model also incorporates this characterization. Hall questions its validity for U.S. data after 1979 and sharply criticizes its continued use.

Taylor (1998) warns that adherence to the erroneous econometric characterization of the natural-rate hypothesis will eventually cause policy to go astray. Because of the diminished serial correlation that he sees in recent inflation data, Taylor is concerned that the disappearance of a unit root in inflation means that the faulty test may soon signal an exploitable trade-off that will once again tempt the monetary authority. The theme of both Hall and Taylor is that failure to remember the theoretical and econometric lessons of the 1970s is likely to resuscitate pressure to inflate emanating from the empirical Phillips curve.

In the same symposium, Friedman (1998) and Solow (1998) made a number of assertions that may have contributed to Taylor's worries. Friedman asserted that the real effects of monetary policy are so long-lasting that "for all practical purposes they might just as well be permanent." Solow (1998) expressed skepticism about the natural-rate hypothesis and suggested that the supporting evidence is specific to the U.S. economy since 1970. He argued that monetary policy can affect the natural rate of unemployment and that the experience of the United States in the 1960s suggests that persistent high unemployment would yield to a revival of aggregate demand.
Taylor's concern is that low inflation would be hard to sustain if belief in a long-run trade-off were again to become influential.

The object of this paper is to develop empirical evidence that is relevant to this discussion.2 Section 2 describes a Bayesian model that we use to summarize the evolution of inflation dynamics. Section 3 reports stylized facts about this evolution, and Section 4 discusses test statistics for the Solow-Tobin version of the natural-rate hypothesis. Section 5 considers Taylor's warning about recidivism on the natural-rate hypothesis. The paper concludes with a summary.

1. Many of these authors pretest for a unit root and apply the Solow-Tobin test only if they fail to reject the null hypothesis. But pretesting could result in a more subtle version of the Lucas-Sargent trap. Unit-root tests have low power and may fail to detect circumstances in which the Solow-Tobin test is inappropriate.

2. Albanesi, Chari, and Christiano (2000) model the inception and termination of inflation in the 1970s with a sunspot variable that shifts expectations between two regimes. Their equilibrium excludes the concerns about model misspecification that are the focus of the present discussion. It is possible that a regime-switching model like theirs can confront the observations about comovements between inflation persistence and mean inflation that we document below.

2. A RANDOM-COEFFICIENTS REPRESENTATION
We use a Bayesian vector autoregression with time-varying parameters to describe the evolution of the law of motion for inflation. We are interested in a random-coefficients representation for some of the reasons expressed in the initial sections of Lucas (1976). The Bayesian framework treats coefficients as random variables, making it attractive for modeling data from economies in which important decision makers, including the monetary authority, are learning.3

2.1. NOTATION AND STATE-SPACE REPRESENTATION

The model has a nonlinear state-space representation. The measurement equation is

$$y_t = X_t' \theta_t + \varepsilon_t, \tag{2.1}$$

where $y_t$ is an $N \times 1$ vector of endogenous variables, $\theta_t$ is a $K \times 1$ vector of coefficients, $X_t'$ is an $N \times K$ matrix of predetermined and/or exogenous variables, and $\varepsilon_t$ is an $N \times 1$ vector of prediction errors. The vector $y_t$ includes inflation and variables useful for predicting inflation. In this paper, we use (2.1) to represent a vector autoregression, so that the right-hand variables are lags of $y_t$. In an unrestricted vector autoregression, each equation contains the same right-hand variables, $X_t' = (I_N \otimes x_t')$. We treat the coefficients of the VAR as a hidden state vector. The state vector $\theta_t$ evolves according to

$$p(\theta_{t+1} \mid \theta_t, V) \propto I(\theta_{t+1})\, f(\theta_{t+1} \mid \theta_t, V), \tag{2.2}$$
where $I(\theta_t) = 0$ if the roots of the associated VAR polynomial are inside the unit circle and 1 otherwise; $V$ is a covariance matrix defined below; and

$$f(\theta_{t+1} \mid \theta_t, V) \sim N(\theta_t, Q). \tag{2.3}$$
3. Our focus in this paper is on the evolution of reduced-form relationships. Structural models involve nonlinear cross-equation restrictions on the evolving parameters, and they require nonlinear filtering methods. We are currently studying nonlinear filters.
Thus, $f(\theta_{t+1} \mid \theta_t, V)$ can be represented as the driftless random walk

$$\theta_t = \theta_{t-1} + v_t, \tag{2.4}$$
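where $v_t$ is an i.i.d. Gaussian process with mean 0 and covariance $Q$. The economy changes over time when news arrives, making $\theta_t$ vary in an unpredictable way. Throughout this paper, we use $f(\cdot)$ to denote a normal density, and $p(\cdot)$ to denote a more general density. We assume that the innovations, $(\varepsilon_t', v_t')'$, are identically and independently distributed normal random variables with mean zero and covariance matrix

$$V = E\left[\begin{pmatrix} \varepsilon_t \\ v_t \end{pmatrix} \begin{pmatrix} \varepsilon_t' & v_t' \end{pmatrix}\right] = \begin{pmatrix} R & C' \\ C & Q \end{pmatrix}, \tag{2.5}$$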
where $R$ is the $N \times N$ covariance matrix for measurement innovations, $Q$ is the $K \times K$ covariance matrix for state innovations, and $C$ is a $K \times N$ cross-covariance matrix. Following the Bayesian literature, we call the $\theta$'s parameters and the elements of $R$, $Q$, and $C$ hyperparameters. We assume that the hyperparameters and initial state $\theta_0$ are independent, that the initial state is a truncated Gaussian random variable, and that the hyperparameters come from an inverse-Wishart distribution. We adopted these parts of the prior mostly because of their convenience in being natural conjugates for our Gaussian virtual prior $f$. Let $f(\theta_0) = N(\bar{\theta}, \bar{P})$ represent a normal prior with mean $\bar{\theta}$ and variance $\bar{P}$. The prior for the initial state is

$$p(\theta_0) \propto I(\theta_0)\, f(\theta_0). \tag{2.6}$$
Our prior for the hyperparameters is

$$p(V) = IW(\bar{V}^{-1}, T_0), \tag{2.7}$$
where $IW(S, df)$ represents the inverse-Wishart distribution with scale matrix $S$ and degrees of freedom $df$. This is a convenient form because it yields an inverse-Wishart posterior when combined with a Gaussian likelihood. Collecting the pieces, the joint prior for $\theta_0, V$ can be represented as

$$p(\theta_0, V) \propto I(\theta_0)\, f(\theta_0)\, p(V). \tag{2.8}$$
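Both pieces are informative, but in the empirical section we set $\bar{\theta}$, $\bar{P}$, $\bar{V}$, and $T_0$ so that they are only weakly informative. We use the following notation to denote partial histories of the variables $y_t$ and $\theta_t$. The vectors

$$Y^T = [y_1', \ldots, y_T']' \tag{2.9}$$

and

$$\theta^T = [\theta_0', \ldots, \theta_T']' \tag{2.10}$$

represent the history of data and states up to date $T$, and

$$Y^{T+1,T+H} = [y_{T+1}', \ldots, y_{T+H}']' \tag{2.11}$$

and

$$\theta^{T+1,T+H} = [\theta_{T+1}', \ldots, \theta_{T+H}']' \tag{2.12}$$

represent potential future trajectories from date $T$ onward. We can use (2.2) to assemble the joint density

$$p(\theta^T, V) \propto I(\theta^T)\, f(\theta^T \mid V)\, p(V), \tag{2.13}$$

where

$$f(\theta^T \mid V) = f(\theta_0) \prod_{t=0}^{T-1} f(\theta_{t+1} \mid \theta_t, V) \tag{2.14}$$

and

$$I(\theta^T) = \prod_{t=0}^{T} I(\theta_t). \tag{2.15}$$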
We call $f$ our virtual prior, and $p$ the prior. The virtual prior $f$ makes $\theta$ a driftless random walk. Multiplying $f(\theta^T \mid V)$ by $I(\theta^T)$ puts zero probability on sample paths of $\{\theta_t\}$ for which $\theta_t$ for any $t > 0$ corresponds to unstable VAR coefficients.4

4. An appendix shows that the model formed by (2.3), (2.13), (2.14), and (2.15) implies the nonlinear transition equation (2.2).
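The effect of the stability indicator can be illustrated with a toy example. The sketch below is not the authors' code: it replaces the VAR with a univariate AR(2) with two drifting coefficients, checks stability through the eigenvalues of the companion matrix (which must lie strictly inside the unit circle, the mirror image of the condition on the roots of the lag polynomial), and imposes the truncation one step at a time by redrawing any random-walk innovation that would produce an unstable coefficient vector. The function names, the step size `step_sd`, and the starting values are all assumptions made for illustration.

```python
import cmath
import random

def is_stable(phi1, phi2):
    """True if y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t is non-explosive.

    The companion-matrix eigenvalues solve z**2 - phi1*z - phi2 = 0 and
    must all have modulus strictly below one.
    """
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    roots = ((phi1 + disc) / 2.0, (phi1 - disc) / 2.0)
    return all(abs(z) < 1.0 for z in roots)

def draw_truncated_step(theta, step_sd, rng, max_tries=10_000):
    """Draw theta_{t+1} from the random walk truncated to the stable region.

    Rejection step: propose theta_t + v with Gaussian v and keep the first
    proposal that passes the stability check.
    """
    for _ in range(max_tries):
        candidate = tuple(x + rng.gauss(0.0, step_sd) for x in theta)
        if is_stable(*candidate):
            return candidate
    raise RuntimeError("no stable draw found; step_sd may be too large")

def simulate_path(theta0, n_steps, step_sd, seed=0):
    """Simulate a coefficient path that never leaves the stable region."""
    if not is_stable(*theta0):
        raise ValueError("initial state must be stable")
    rng = random.Random(seed)
    path = [tuple(theta0)]
    for _ in range(n_steps):
        path.append(draw_truncated_step(path[-1], step_sd, rng))
    return path

path = simulate_path((0.5, 0.2), n_steps=200, step_sd=0.05)
print(len(path), all(is_stable(*theta) for theta in path))
```

For a VAR with more lags or variables the quadratic formula would be replaced by a numerical eigenvalue routine applied to the full companion matrix, but the accept or reject logic is unchanged.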
In (2.2), the truncation of $f(\theta_t \mid \theta_{t-1}, V)$ through multiplication by $I(\theta_t)$ reflects our opinion that explosive representations are implausible for the United States. An unrestricted normal density $f(\theta^T \mid V) = f(\theta_0) \prod_{t=0}^{T-1} f(\theta_{t+1} \mid \theta_t, V)$ for the history of states $\theta^T$ implies a positive probability of explosive autoregressive roots, but an explosive representation implies an infinite variance for inflation, which cannot be optimal for a central bank that minimizes a loss function involving the variance of inflation.5 We restrict the prior to put zero probability on explosive states.

This representation resembles some of the models in Doan, Litterman, and Sims (1984), but with a different prior. Doan et al. were primarily interested in forecasting and recommended a "random walk in variables" prior for the sake of parsimony. We are less interested in forecasting and more interested in summarizing the data in a relatively unconstrained fashion, so we chose the prior described above.

2.2 A LIMITATION OF OUR MODEL: NO STOCHASTIC VOLATILITY
For macroeconomic variables and a period similar to ours, Bernanke and Mihov (1998a, 1998b) and Sims (1999) presented evidence that favors a vector autoregression with time-invariant autoregressive coefficients but a covariance matrix of innovations that fluctuates over time. In contrast, our specification allows the coefficients to vary and assumes a time-invariant but unknown innovation covariance matrix $V$. While our prior fixes $V$, our statistical methods nevertheless allow the data to speak up for volatility or drift in $V$, albeit in a restricted and adaptive way. Our estimates of $V$ conditioned on time $t$ data fluctuate over time in ways that we shall discuss.

We chose our specification partly because we want to focus attention on the coefficient-drift issues raised by Lucas (1976). Our model is rigged to let us detect drifts in the systematic parts of government and private behavior rules that show up in the systematic parts of vector autoregressions. Our prior embodies a prejudice that monetary policy changed systematically during the years that we study. In contradistinction, the interpretation of the evidence favored by Bernanke and Mihov (1998a, 1998b) and Sims (1999) is consistent with a view that while distributions of shocks have evolved, agents' responses to them have been stable.6

5. Alternatively, explosive representations cannot result if the monetary policy rule ensures that inflation is bounded. We do not claim that an integrated representation for inflation is implausible on statistical grounds, only that drift in inflation is hard to reconcile with purposeful central-bank behavior.

6. See Sims (1982) and Sargent (1983) for theoretical settings that, by assuming that the historical sample was produced by optimizing government behavior and stable private-sector responses to it, can explain such a pattern.
2.3 POSTERIOR PREDICTIVE DENSITY
As Bayesians, our goal is to summarize the posterior density for the objects of interest. We are mostly interested in a forward-looking perspective on inflation, so we want posterior predictive densities. In this model, there are four sources of uncertainty about the future. The terminal state $\theta_T$ and the hyperparameters $V$ are unknown and must be estimated. In addition, as time goes forward, the state vector will drift away from $\theta_T$, and the measurement equation will be hit by random shocks. Conditional on prior beliefs and data through date $T$, beliefs about the future can be expressed by the joint posterior distribution,

$$p(Y^{T+1,T+H}, \theta^{T+1,T+H}, \theta^T, V \mid Y^T). \tag{2.16}$$
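Our objective is to characterize (2.16). This is a complicated object, but it can be decomposed into more tractable components. We begin by factoring (2.16) into the product of a marginal and a conditional density,

$$p(Y^{T+1,T+H}, \theta^{T+1,T+H}, \theta^T, V \mid Y^T) = p(\theta^T, V \mid Y^T)\; p(Y^{T+1,T+H}, \theta^{T+1,T+H} \mid \theta^T, V, Y^T). \tag{2.17}$$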
This expression splits the joint density into a factor that represents beliefs about the past and present and another that represents beliefs about the future. The first factor is the joint posterior density for hyperparameters and the history of states. It summarizes current knowledge about system dynamics, based on data and prior beliefs. The second factor reflects the uncertainty about the future that would be present even if the current state and hyperparameters were known with certainty. This factor reflects the influence of future innovations to the state and measurement equations.

Analytical expressions for each piece are unavailable, even for simple cases. Instead, we use Monte Carlo methods to simulate them. The algorithm is split into two parts, corresponding to the components of (2.17). The first part uses the Gibbs sampler to simulate a draw of $\theta^T$ and $V$ from the marginal density, $p(\theta^T, V \mid Y^T)$. The second step plugs that draw into the conditional density $p(Y^{T+1,T+H}, \theta^{T+1,T+H} \mid \theta^T, V, Y^T)$ and generates a trajectory for future data and states.

2.4 BELIEFS ABOUT THE PAST AND PRESENT
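The posterior density for states and hyperparameters can be expressed as

$$\begin{aligned} p(\theta^T, V \mid Y^T) &\propto p(Y^T \mid \theta^T, V)\, p(\theta^T, V) \\ &= f(Y^T \mid \theta^T, V)\, p(\theta^T, V) \\ &\propto I(\theta^T) \left[ f(Y^T \mid \theta^T, V)\, f(\theta^T \mid V)\, p(V) \right]. \end{aligned} \tag{2.18}$$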
The first line follows from Bayes's theorem: $p(\theta^T, V)$ represents a joint prior for hyperparameters and states and $p(Y^T \mid \theta^T, V)$ is a conditional likelihood. Conditional on states and hyperparameters, the measurement equation is linear in observables and has normal innovations. Thus, the conditional likelihood is Gaussian, $p(Y^T \mid \theta^T, V) = f(Y^T \mid \theta^T, V)$, as shown in the second line. The joint prior for hyperparameters and states can be factored into a marginal prior for $V$ and a conditional prior for $\theta^T$, and substituting $I(\theta^T) f(\theta^T \mid V)$ for $p(\theta^T \mid V)$ delivers the expression on the third line. Notice that the expression in brackets on the last line is the joint posterior kernel that would result if the restriction on unstable roots were not imposed. If not for this restriction, the model would have a linear Gaussian state-space representation, with transition equation $f(\theta^T \mid V)$. The posterior kernel associated with this linear transition law is

$$p_L(\theta^T, V \mid Y^T) \propto f(Y^T \mid \theta^T, V)\, f(\theta^T \mid V)\, p(V). \tag{2.19}$$
Substituting this relation into the last equation, the posterior density for the nonlinear model can be expressed as a truncation of the posterior for the unrestricted linear model,

$$p(\theta^T, V \mid Y^T) \propto I(\theta^T)\, p_L(\theta^T, V \mid Y^T). \tag{2.20}$$
Among other things, this means that $p(\theta^T, V \mid Y^T)$ can be represented and simulated in two steps. First, we derive the posterior associated with the linear transition equation, $p_L(\theta^T, V \mid Y^T)$, and then we multiply by $I(\theta^T)$ to rule out explosive outcomes. In the Monte Carlo simulation, this is implemented by simulating the unrestricted posterior and rejecting draws that violate the stability condition. The next subsection describes our method for simulating $p_L(\theta^T, V \mid Y^T)$, and the one after that confirms the validity of our rejection sampling procedure.

2.5 SIMULATING THE UNRESTRICTED POSTERIOR
Following Kim and Nelson (1999), we use the Gibbs sampler to simulate draws from $p_L(\theta^T, V \mid Y^T)$. The Gibbs sampler iterates on two operations. First, conditional on the data and hyperparameters, we draw a history of
states from $p_L(\theta^T \mid Y^T, V)$. Then, conditional on the data and states, we draw hyperparameters from $p_L(V \mid Y^T, \theta^T)$. Subject to regularity conditions (see Roberts and Smith 1992), the sequence of draws converges to a draw from the joint distribution, $p_L(\theta^T, V \mid Y^T)$.

2.5.1. Gibbs Step 1: States Given Hyperparameters

Conditional on data and hyperparameters, the unrestricted transition law is linear and has normal innovations. Thus, the virtual states are Gaussian,
This density can be factored as7
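A reconstruction of this factorization, consistent with the description that follows and with footnote 9, is given below. It is the standard decomposition for Markov states; the exact typography of the original display (2.22) is assumed.

```latex
\[
f(\theta^T \mid Y^T, V) \;=\; f(\theta_T \mid Y^T, V)\,
\prod_{t=1}^{T-1} f(\theta_t \mid \theta_{t+1}, Y^t, V).
\]
```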
The leading factor is the marginal posterior for the terminal state, and the other factors are conditional densities for the preceding time periods. Since the conditional densities on the right-hand side are Gaussian, it is enough to update their conditional means and variances. This can be done via the Kalman filter. Deriving forward and backward recursions for $f(\theta^T \mid Y^T, V)$ is straightforward. Going forward in time, let
represent conditional means and variances. These are computed recursively, starting from $\bar{\theta}$ and $\bar{P}$, by iterating on
7. See Kim and Nelson (1999, Chapter 8).
The matrix $K_t$ is the Kalman gain.8 At the end of the sample, these iterations yield the conditional mean and variance for the terminal state,
This pins down the first factor in (2.22). The remaining factors in (2.22) are derived by working backward through the sample, updating means and variances to reflect the additional information about $\theta_t$ contained in $\theta_{t+1}$.9 Let
represent backward estimates of the mean and variance, respectively. Because the states are conditionally normal, these can be expressed as
Therefore the remaining elements in (2.22) are
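The forward and backward recursions of this subsection, including the backward draw of a state trajectory, can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: it assumes the measurement and state innovations are uncorrelated ($C = 0$), so the gain takes its textbook form rather than the correlated-innovation form referred to in footnote 8. The function name `ffbs_random_walk` and the array layout are hypothetical choices.

```python
import numpy as np

def ffbs_random_walk(y, X, theta0, P0, Q, R, rng):
    """Draw one state path for y_t = X_t' theta_t + eps_t, theta_t = theta_{t-1} + v_t.

    Simplifying assumption: cov(v_t, eps_t) = 0, i.e. C = 0.
    y: (T, n) data, X: (T, n, k) holding X_t', Q: (k, k), R: (n, n).
    Returns the sampled path plus the filtered means and covariances.
    """
    T, k = y.shape[0], theta0.shape[0]
    means = np.zeros((T, k))       # theta_{t|t}
    covs = np.zeros((T, k, k))     # P_{t|t}
    theta, P = theta0, P0

    # Forward pass: Kalman filter for a random-walk state.
    for t in range(T):
        P_pred = P + Q                              # P_{t|t-1}
        H = X[t]                                    # X_t'
        S = H @ P_pred @ H.T + R                    # forecast-error variance
        K = P_pred @ H.T @ np.linalg.inv(S)         # Kalman gain when C = 0
        theta = theta + K @ (y[t] - H @ theta)      # theta_{t|t}
        P = P_pred - K @ H @ P_pred                 # P_{t|t}
        P = 0.5 * (P + P.T)                         # keep numerically symmetric
        means[t], covs[t] = theta, P

    # Backward pass: draw theta_T, then theta_t given the drawn theta_{t+1}.
    draws = np.zeros((T, k))
    draws[-1] = rng.multivariate_normal(means[-1], covs[-1])
    for t in range(T - 2, -1, -1):
        G = covs[t] @ np.linalg.inv(covs[t] + Q)    # P_{t|t} P_{t+1|t}^{-1}
        m = means[t] + G @ (draws[t + 1] - means[t])
        V = covs[t] - G @ covs[t]
        V = 0.5 * (V + V.T)
        draws[t] = rng.multivariate_normal(m, V)
    return draws, means, covs
```

Note that the backward covariances depend only on filter output, while the backward means depend on the realized draw of the next state, as the text goes on to observe.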
Notice that the smoothed covariances depend only on the output of the Kalman filter, but the smoothed conditional means depend on realizations of $\theta_{t+1}$. Accordingly, a random trajectory for states may be drawn from a backward recursion. First, draw $\theta_T$ from (2.25), using (2.24) to compute the mean and variance. Next, conditional on its realization, draw $\theta_{T-1}$ from (2.28), using (2.27) to compute the mean and variance. Then draw $\theta_{T-2}$ conditional on the realization of $\theta_{T-1}$, and so on back to the beginning of the sample.
8. The formula for $K_t$ differs from that given in Anderson and Moore (1979) for the case of correlated innovations because of a difference in assumptions about the timing of innovations.
9. Notice that the backward recursions are not determined by the Kalman smoother. We want the mean and variance for $f(\theta_t \mid \theta_{t+1}, Y^t, V) = f(\theta_t \mid \theta_{t+1}, Y^T, V)$. The Kalman smoother computes the mean and variance for $f(\theta_t \mid Y^T, V)$.
2.5.2 Gibbs Step 2: Hyperparameters Given States
Conditional on $Y^T$ and $\theta^T$, the innovations are observable. Under the unrestricted linear transition law, these are identically and independently distributed normal random variables, and their conditional likelihood is Gaussian. When an inverse-Wishart prior is combined with a Gaussian likelihood, the posterior is also an inverse-Wishart density,
where
and VT is proportional to the usual covariance estimator,
The posterior degree-of-freedom parameter is the sum of the prior degrees of freedom, $T_0$, and the degrees of freedom in the sample, $T$. The posterior scale matrix is the sum of the prior and sample sum-of-squares matrices.10 To sample from an inverse-Wishart distribution, we exploit two facts. First, if a matrix $V$ is distributed as $IW(S, df)$, then $V^{-1}$ is a Wishart matrix with scale matrix $S$ and degrees of freedom $df$. Second, to simulate a draw from the Wishart distribution, we take $df$ independent draws of a random vector $\eta_i$ from a $N(0, S)$ density and form the random matrix $V^{-1} = \sum_{i=1}^{df} \eta_i \eta_i'$. Since $V^{-1}$ is a draw from a Wishart density, $V$ is a draw from an inverse-Wishart density.
2.5.3 Summary of the Gibbs Sampler
To summarize, the Gibbs sampler iterates on two simulations, drawing states conditional on hyperparameters and then hyperparameters conditional on states. After a transitional or "burn-in" period, the sequence of draws approximates a sample from the virtual posterior, $p_L(\theta^T, V \mid Y^T)$.
2.6 REJECTION SAMPLING
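Before turning to the rejection step, the inverse-Wishart draw described at the end of Section 2.5.2 can be sketched in a few lines. This is a minimal illustration that follows the text's parametrization, in which $V^{-1}$ is Wishart with scale matrix $S$; the function name `draw_inverse_wishart` is hypothetical, and the sketch assumes $df$ is at least the dimension of $S$ so that the sum of outer products is invertible.

```python
import numpy as np

def draw_inverse_wishart(S, df, rng):
    """Return V where V^{-1} = sum_{i=1}^{df} eta_i eta_i' and eta_i ~ N(0, S).

    Assumes df >= S.shape[0]; otherwise the sum of outer products is singular.
    """
    k = S.shape[0]
    eta = rng.multivariate_normal(np.zeros(k), S, size=df)  # rows are eta_i'
    V_inv = eta.T @ eta            # sum of outer products: a Wishart(S, df) draw
    return np.linalg.inv(V_inv)    # invert to obtain the inverse-Wishart draw
```

Building the Wishart draw from normal vectors mirrors the construction in the text; a library routine could be substituted, provided its scale-matrix convention is checked against this one.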
The final step is to impose the stability condition, which is done by checking the autoregressive roots at each date and rejecting draws with roots inside the unit circle. The rejection step ensures that the posterior density puts zero probability on explosive outcomes.
10. See Gelman et al. (1995).
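The stability check can be sketched as follows. The sketch works with the eigenvalues of the VAR companion matrix, which are the reciprocals of the roots of the lag polynomial: a root inside the unit circle corresponds to an eigenvalue with modulus greater than one. The helper names (`companion`, `is_stable`, `keep_stable`) and the representation of a draw as a list of per-date lag-matrix lists are assumptions made for illustration.

```python
import numpy as np

def companion(A_list):
    """Stack VAR lag matrices A_1..A_p (each n x n) into the companion matrix."""
    n, p = A_list[0].shape[0], len(A_list)
    top = np.hstack(A_list)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])
    return np.vstack([top, bottom])

def is_stable(A_list):
    """True if every companion eigenvalue lies strictly inside the unit circle."""
    eig = np.linalg.eigvals(companion(A_list))
    return bool(np.max(np.abs(eig)) < 1.0)

def keep_stable(draws):
    """Keep only trajectories whose coefficients are stable at every date.

    draws: list of trajectories; each trajectory is a list (over dates)
    of lag-matrix lists [A_1, ..., A_p].
    """
    return [d for d in draws if all(is_stable(A_t) for A_t in d)]
```

Rejecting a whole trajectory when any single date fails the check is one reading of "checking the autoregressive roots at each date"; it is labeled here as an assumption.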
To confirm the validity of this procedure, we check the conditions associated with rejection sampling.11 The normalized target density is
To perform rejection sampling, we need a candidate density, $g(\theta^T, V)$, that satisfies three properties. The candidate density must be nonnegative and well defined for all $(\theta^T, V)$ for which $p(\theta^T, V \mid Y^T) > 0$, it must have a finite integral, and the importance ratio $R(\theta^T, V)$ must have a known upper bound $M$:
A natural candidate density is the virtual posterior, $p_L(\theta^T, V \mid Y^T)$. Because this is a probability density, it is nonnegative and integrates to 1. Since it is an unrestricted analogue of the target density, it is also well defined for all $(\theta^T, V)$ that occur with positive probability. Finally, the importance ratio is bounded by the reciprocal of the probability of obtaining a stable draw from the virtual posterior,
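The argument can be written out step by step. The following is a reconstruction from the definitions in this subsection, not a quotation of the original displays: the first line normalizes the truncated posterior, and the second forms the importance ratio with $g = p_L$.

```latex
\begin{align*}
p(\theta^T, V \mid Y^T)
  &= \frac{I(\theta^T)\, p_L(\theta^T, V \mid Y^T)}
          {\iint I(\theta^T)\, p_L(\theta^T, V \mid Y^T)\, d\theta^T\, dV}, \\[4pt]
R(\theta^T, V)
  &= \frac{p(\theta^T, V \mid Y^T)}{p_L(\theta^T, V \mid Y^T)}
   = \frac{I(\theta^T)}{\iint I(\theta^T)\, p_L(\theta^T, V \mid Y^T)\, d\theta^T\, dV}
   \;\le\; \frac{1}{\iint I(\theta^T)\, p_L(\theta^T, V \mid Y^T)\, d\theta^T\, dV}
   \;\equiv\; M .
\end{align*}
```

Dividing the ratio by its bound gives $R(\theta^T, V)/M = I(\theta^T)$, which takes only the values 0 and 1.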
The denominator is the expected value of $I(\theta^T)$ under the virtual posterior, or the probability of a stable draw from the unrestricted density. $M$ is finite as long as this probability is nonzero. Rejection sampling proceeds in two steps: draw a trial $(\theta^T_i, V_i)$ from the virtual posterior, and then accept the draw with probability $R(\theta^T_i, V_i)/M$. Since $R(\theta^T_i, V_i)/M = I(\theta^T_i)$, the second step is equivalent to accepting the trial draw whenever it satisfies the stability condition, and rejecting it when it does not.
2.7 BELIEFS ABOUT THE FUTURE
Having processed data through date $T$, the next step is to simulate future data and states. Conditional on hyperparameters and the current state of the system, the posterior density for future data and states is quite tractable.
11. See, e.g., Gelman et al. (1995, pp. 303-305).
This density can be factored into the product of a marginal distribution for future states and a conditional distribution for future data,
Because the states are Markov, the first factor can be factored in turn into
Apart from the restriction on explosive autoregressive roots, $\theta_{T+1}$ is conditionally normal with mean $\theta_T$ and variance $Q$. Similarly, conditional on $\theta_{T+1}$, $V$, and $Y^T$, $\theta_{T+2}$ is normally distributed with mean $\theta_{T+1}$ and variance $Q$, and so on. Therefore, to sample from the virtual posterior for future states, we take $H$ random draws of $v_i$ from the $N(0, Q)$ density and iterate on the state equation,
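Iterating on the state equation amounts to cumulating the drawn innovations from the terminal state, $\theta_{T+i} = \theta_{T+i-1} + v_i$. A minimal sketch follows; the function name `simulate_future_states`, the user-supplied predicate `is_stable`, and the retry cap `max_tries` are hypothetical, and redrawing the entire trajectory after a failed stability check is an assumption about how the rejection is organized.

```python
import numpy as np

def simulate_future_states(theta_T, Q, H, rng, is_stable=lambda th: True, max_tries=1000):
    """Simulate theta_{T+1..T+H} from a driftless random walk with innovation variance Q.

    is_stable is a user-supplied predicate applied to each future state;
    a trajectory is redrawn in full if any date fails the check.
    """
    k = theta_T.shape[0]
    for _ in range(max_tries):
        shocks = rng.multivariate_normal(np.zeros(k), Q, size=H)  # v_1, ..., v_H
        path = theta_T + np.cumsum(shocks, axis=0)                # iterate the state equation
        if all(is_stable(th) for th in path):
            return path
    raise RuntimeError("no stable trajectory found within max_tries")
```

In the application, the predicate would map each state vector into VAR coefficients and check their autoregressive roots.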
The stability restriction is implemented in the same way as in the Gibbs sampler, by checking the autoregressive roots associated with each draw and rejecting explosive draws. Given a trajectory for future states, all that remains is to simulate future data. The second factor in (2.35) can be factored in turn into
Conditional on $\theta^T$, $V$, $Y^T$, and a trajectory for future states, the measurement innovation $\varepsilon_{T+1}$ is normally distributed with mean $C'Q^{-1}v_{T+1}$ and variance $R - C'Q^{-1}C$. Hence $y_{T+1}$ is conditionally normal with mean $X_{T+1}'\theta_{T+1} + C'Q^{-1}v_{T+1}$ and variance $R - C'Q^{-1}C$. Similarly, $\varepsilon_{T+2}$ is conditionally normal with mean $C'Q^{-1}v_{T+2}$ and variance $R - C'Q^{-1}C$, and so on. Therefore, to sample from (2.38), we take $H$ random draws of $\varepsilon_i$ from a $N(C'Q^{-1}v_{T+i}, R - C'Q^{-1}C)$ density and iterate on the measurement equation,
using lags of $y_{T+i}$ to compute $X_{T+i}$.
2.8 COLLECTING THE PIECES
Combining the results of the previous sections, (2.16) can be expressed as
To sample from this distribution, we use the Gibbs sampler to simulate a draw from $p(\theta^T, V \mid Y^T)$. Then, conditional on that draw, we simulate a trajectory for future states, and conditional on both of those we simulate a trajectory for future data. This provides the raw material for our analysis.
3. Stylized Facts about the Evolving Law of Motion
We study data on inflation, unemployment, and a short-term nominal interest rate. Inflation is measured using the CPI for all urban consumers, unemployment is the civilian unemployment rate, and the nominal interest rate is the yield on 3-month Treasury bills. The inflation and unemployment data are quarterly and seasonally adjusted, and the Treasury-bill data are the average of daily rates in the first month of each quarter. The sample runs from 1948.1 to 2000.4. We work with a VAR(2) specification for inflation, the logit of unemployment, and the ex post real interest rate.12
To calibrate the prior, we estimate a time-invariant vector autoregression using data for 1948.1-1958.4. The mean of the virtual prior, $\bar{\theta}$, is the point estimate; $\bar{P}$ is its asymptotic covariance matrix; and $\bar{R}$ is the innovation covariance matrix. To initialize the other hyperparameters, we assume that $C = 0$ and that $Q$ is proportional to $\bar{P}$. To begin conservatively, we start with a minor perturbation from a time-invariant representation, setting $Q = (0.01)^2 \bar{P}$. In other words, our prior is that time variation accounts for only 1% of the standard deviation of each parameter.13 The prior degrees of freedom, $T_0$, are equal to those in the preliminary sample. This is an informative prior, but only weakly so. Because the preliminary sample contains only 4.5 data points per VAR parameter, the prior mean is just a ballpark number and the prior variance allows for a substantial range of outcomes. As time passes, the prior becomes progressively less influential and the likelihood comes to dominate the posterior.
The simulation strategy follows the algorithm described above. Starting in 1965.4, we compute posterior densities for each year through 2000, for a total of 36 years. At each date, we perform 10,000 iterations of the Gibbs sampler, discarding the first 2000 to let the Markov chain converge to its ergodic distribution.14 Then, conditional on those outcomes, we generate 8000 trajectories of future data and states. Each posterior trajectory is 120 quarters long and contains information about both short- and long-run features of the data.
12. The unemployment rate is bounded between 0 and 1, and the logit transformation maps this into $(-\infty, \infty)$, which is more consonant with our Gaussian approximating model. To ensure that posterior draws for unemployment lie between 0 and 1, we simulate logit($u_t$) and use the inverse logit transformation. The non-negativity bound on nominal interest rates is implemented by rejection sampling.
3.1 OBJECTS OF INTEREST
We initially focus on three features of the data: long-horizon forecasts of inflation and unemployment, the spectrum for inflation, and selected parameters of a version of the Taylor rule for monetary policy. The long-horizon forecasts approximate core inflation and the natural rate of unemployment, the spectrum encodes information about the variance, persistence, and predictability of inflation, and the Taylor-rule parameters summarize the changes in monetary policy that underlie the changing nature of inflation. We are interested in these features because they play a role in theories about the rise and fall of U.S. inflation. For example, Parkin (1993) and Ireland (1999) point out that the magnitude of inflationary bias in the Kydland-Prescott (1977) and Barro-Gordon (1983) model depends positively on the natural rate of unemployment. Taylor (1997, 1998) and Sargent (1999) argue that core inflation depends on the monetary authority's beliefs about the natural-rate hypothesis, which in turn depend on the degree of inflation persistence. In particular, the model presented in Sargent (1999) imposes a definite restriction on the joint evolution of core inflation and the degree of persistence, which we discuss below. Changes in beliefs about the natural-rate hypothesis should also be reflected in Taylor-rule parameters.
13. The Gibbs sampler quickly adds more time variation to the system.
14. Recursive mean graphs suggest rough convergence, though some wiggling persists beyond the burn-in period. We checked our results by performing a much longer simulation based on data through 2000.4. The longer simulation involved 106,000 draws from the Gibbs sampler, the first 18,000 of which were discarded to allow for convergence. Smoothed estimates based on this simulation were qualitatively similar to the filtered estimates reported in the text. Indeed, we also performed calculations based on a burn-in period of 98,000 and found that the results were much the same.
3.2 CORE INFLATION AND THE NATURAL RATE OF UNEMPLOYMENT
Beveridge and Nelson (1981) define a stochastic trend in terms of long-horizon forecasts. For a driftless random variable like inflation or unemployment, the Beveridge-Nelson trend is defined as the value to which the series is expected to converge once the transients die out,
Assuming that expectations of inflation and unemployment converge to the core and natural rates as the forecast horizon lengthens, the latter can be approximated using this measure.15 Because the posterior distributions are skewed and have fat tails, we modify the Beveridge-Nelson definition by substituting the posterior median for the mean. We approximate core inflation and the natural rate of unemployment by setting $h = 120$ quarters and finding the median of the posterior predictive density,
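Given a set of simulated future trajectories, this median-based trend is simply the median across draws of the value reached at horizon $h$. A minimal sketch, in which the function name `bn_median_trend` and the array layout (one row per posterior draw, one column per future quarter) are assumptions:

```python
import numpy as np

def bn_median_trend(trajectories, h=120):
    """Median across posterior draws of the simulated value h quarters ahead.

    trajectories: array of shape (n_draws, horizon) for a single series.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    return float(np.median(trajectories[:, h - 1]))   # column h-1 holds horizon h
```

Using the median rather than the mean keeps a handful of extreme trajectories from dominating the estimate, which is the motivation the text gives for the modification.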
Estimates of core inflation and the natural rate are shown in Figure 1. The circles represent inflation, and the crosses unemployment. According to this measure, core inflation was between 1.75% and 4% in the late 1960s. It rose throughout the 1970s and peaked at roughly 8% in 1979-1980. Thereafter it fell quickly, and it has fluctuated between 2.25% and 3.25% since the mid-1980s. Core inflation was just shy of 3% at the end of 2000. The natural rate of unemployment also rose throughout the 1970s, reaching a peak of 6.6% in 1980. It declined gradually in the early 1980s and fluctuated between 5.5% and 6% from the mid-1980s to the mid-1990s. The natural rate again began to fall after 1994 and was a bit less than 5% at the end of 2000. A scatterplot, shown in Figure 2, provides a better visual image of the association between the two. The simple correlation is 0.63, which is rather remarkable given the difficulty of measuring these components.
15. Hall (1999) recommends an unconditional mean of unemployment as an estimator of the natural rate of unemployment.
Figure 1 CORE INFLATION AND THE NATURAL RATE OF UNEMPLOYMENT
Figure 2 CORE INFLATION AND THE NATURAL RATE OF UNEMPLOYMENT
The two series rise and fall together, in accordance with Parkin and Ireland's theory. As a reality check for the model, Figures 3 and 4 report the cyclical components of inflation and unemployment, measured by subtracting the median Beveridge-Nelson trend estimates from the actual values. We include these plots to confirm that the model captures important features of the data. The first figure shows that the estimated peaks and troughs occur at the right times and are of plausible magnitude. For example, unemployment was well above the natural rate following the recessions of 1975 and 1982. Using Okun's law as a rule of thumb, these estimates correspond to "output gaps" of roughly 6.75% and 12.5% respectively. The model also correctly predicts that the high inflation of 1974-1975 and 1980-1981 would be partially reversed.
Figure 4 shows a scatterplot of the cyclical components and illustrates two other characteristics of the data. The first is that the components are asymmetric, with large positive deviations occurring more often than large negative ones. Second, from 1967 until 1983, there were large counterclockwise loops in inflation and unemployment, with increases in inflation leading increases in unemployment. After 1986, the loops were smaller but still mostly counterclockwise. The direction of the loops is consistent with other evidence on the cyclical relation between inflation and economic activity, e.g. as summarized by Taylor (1999). Beveridge-Nelson measures often suggest that all the variation is in the trend, a feature to which many economists object. Our model does not have this feature.
Figure 3 CYCLICAL COMPONENTS OF INFLATION AND UNEMPLOYMENT
Figure 4 CYCLICAL COMPONENTS OF INFLATION AND UNEMPLOYMENT
3.3 THE PERSISTENCE, VARIANCE, AND PREDICTABILITY OF INFLATION
Next we consider the evolution of the second moments of inflation. This information is encoded in the spectrum, and its evolution is illustrated in Figures 5 through 7. Figure 5 shows the median posterior spectrum for each year in the sample. This figure was generated as follows. For each year, we estimated a spectrum for each inflation trajectory in the posterior predictive density. Then we computed a median spectrum by taking the median of the estimates on a frequency-by-frequency basis.16 This yields a single slice of the figure, relating power to frequency for a given year. By repeating this for each year, we produced the three-dimensional surface shown in the figure. We emphasize that these are predictive measures, which represent expected variation going forward in time. That is, the slice associated with a given year represents a prediction about how inflation is likely to vary in the future, conditional on data up to the current date.17
The most significant feature of this graph is the variation over time in the magnitude of low-frequency power. Since the spectral densities have Granger's (1966) typical shape, we can interpret low-frequency power as a measure of inflation persistence. According to this measure, inflation was weakly persistent in the 1960s and 1990s, when there was little low-frequency power, but strongly persistent in the late 1970s, when there was a lot. Indeed, the degree of persistence peaked in 1979-1980, at the same time as the peak in core inflation.
Figures 6 and 7 report results for selected years. Here, circles represent 1965, crosses 1979, and asterisks 2000. Figure 6 plots the spectrum, and Figure 7 plots its logarithm.
16. The ordinates are asymptotically independent across frequencies.
17. We also calculated an alternative local linear approximation using the VAR representation and the mean posterior state at each date. The results were similar to those shown in the figure.
Figure 5 MEDIAN POSTERIOR SPECTRUM FOR INFLATION
Figure 6 MEDIAN POSTERIOR SPECTRUM FOR INFLATION IN SELECTED YEARS
Figure 7 LOG OF THE MEDIAN POSTERIOR SPECTRUM FOR INFLATION IN SELECTED YEARS
To interpret the figures, recall that the total variance is the integral of the spectrum,
and that the log of the univariate innovation variance can be expressed as the integral of the log of the spectrum,
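In symbols, the two identities just described take the following standard form. This is a reconstruction rather than a quotation of the original displays: the notation $f_{\pi\pi}(\omega)$, $\sigma_\pi^2$, and $\sigma_\varepsilon^2$ is assumed, and the factor $2\pi$ inside the logarithm (Kolmogorov's formula) applies under the normalization in which the spectrum integrates to the variance; other normalizations shift that constant.

```latex
\begin{align*}
\sigma_\pi^2 &= \int_{-\pi}^{\pi} f_{\pi\pi}(\omega)\, d\omega, \\
\ln \sigma_\varepsilon^2 &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln\!\big[\, 2\pi f_{\pi\pi}(\omega) \big]\, d\omega .
\end{align*}
```

As a check on the second line, a white-noise series with variance $\sigma^2$ has $f_{\pi\pi}(\omega) = \sigma^2 / 2\pi$, so the integrand is $\ln \sigma^2$ at every frequency and the formula returns $\sigma_\varepsilon^2 = \sigma^2$.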
The function $f_{\pi\pi}(\omega)$ is the spectrum at frequency $\omega$, $\sigma_\pi^2$ is the variance, and $\sigma_\varepsilon^2$ is the error variance for one-step-ahead univariate forecasts of inflation. The former measures long-run uncertainty about inflation; the latter, short-run uncertainty.
Looking first at Figure 6, we can say something about how the total variance has changed over time. Between 1965 and 1979, inflation became smoother but more persistent. That is, there was less variation at high and medium frequencies, especially those associated with business cycles (say 4 to 20 quarters per cycle), but more variation at low frequencies, especially those corresponding to cycles lasting 5 years or more. The increase in low-frequency power was greater in magnitude than the decrease in high-frequency power, so the total variance was greater. Thus, the increase in variance during the late 1960s and 1970s reflected an increase in inflation persistence. Between 1979 and 2000, the spectrum for inflation fell at all frequencies, and therefore so did the total variance. But the decline in power was greatest at low frequencies, especially at those greater than 20 quarters per cycle. In other words, the diminished degree of inflation persistence accounted for most of the decline in variance in this period. Thus the evolution of the variance has been closely associated with that of inflation persistence. Inflation became more persistent and more variable in the 1970s, and less persistent and less variable in the 1980s and 1990s.
Figure 7 is relevant for short-term forecasting and tells a somewhat different story. The increase in the log of low-frequency power between 1965 and 1979 was smaller in magnitude than the decrease in the log of high-frequency power. Thus, although inflation became more persistent and more variable during the 1970s, it also became easier to predict one quarter ahead. In other words, although there was more long-term uncertainty in 1979, there was actually less short-term uncertainty. Between 1979 and 2000, the log spectrum fell at all frequencies, and inflation became even easier to forecast one quarter ahead. By 2000 there was less uncertainty at both long and short horizons.
The next two figures provide more information about prediction errors. Figure 8 is a multivariate analogue of Figure 7 and is related to the total prediction variance for the system. To interpret this figure, recall that the total prediction variance, $|V_{\varepsilon\varepsilon}|$, for a vector time series $y_t$ can be expressed in terms of the log of the determinant of the spectral density,
where $V_{\varepsilon\varepsilon}$ is the covariance matrix for innovations based on the history of $y_t$, and $F_{yy}(\omega)$ is the spectral density matrix. Whittle (1953) interprets $|V_{\varepsilon\varepsilon}|$ as a measure of the total random variation entering the system at each date. Unlike the univariate measure, the total prediction variance increased between 1965 and 1979. For the system as a whole, there was only a slight decrease in variation at business-cycle frequencies, and this was more than offset by a substantial increase in variation at low and high frequencies. Between 1979 and 2000, the system became more predictable, with $\ln|F_{yy}(\omega)|$ falling at all frequencies. This more than reversed the increase in the earlier period. By the end of 2000, the total prediction variance was 40% smaller than in 1979 and 30% smaller than in 1965. Thus, for the system as a whole, the degree of short-term uncertainty has fallen substantially.
Figure 9 reports the variance of VAR forecast errors over the period 1965-2000 and provides more detail about the evolution of short-run uncertainty. At each date, the posterior prediction error variance was computed by averaging across realizations of the posterior predictive density, one quarter ahead. For inflation and ex post real interest rates, there has been a downward trend in short-term uncertainty since 1965, punctuated by an increase in 1974 and again in 1978-1982. According to this measure, the VAR innovation variance for inflation fell by 21% between 1979 and 2000 and by 42% for the period as a whole. In contrast, the forecast error variance for unemployment fluctuated until the early 1980s, rising and falling with the business cycle. Since then it has fallen steadily to less than one-third its peak level. Changes in short-run uncertainty about unemployment account for much of the rise and fall of the total prediction variance.18
Finally, in Figures 10 and 11, we relate changes in core inflation to the evolution of the variance and degree of persistence of inflation. Figure 10 plots core inflation and the spectrum at frequency zero, which summarizes the degree of persistence. The two are very closely related. Both rose in the 1960s and 1970s, and both fell during and after the Volcker disinflation. The simple correlation is 0.915. Because persistence contributes to variance, core inflation also covaries positively with the long-horizon standard deviation of inflation, as shown in Figure 11.19 Again, both measures rose during the 1970s and fell during the 1980s and 1990s. The correlation between the mean and standard deviation is 0.783. This is a bit lower than the previous correlation because the variance includes changes in both low- and high-frequency power, and the latter are less highly correlated with changes in core inflation. Thus the well-known positive correlation between the mean and variance of inflation reflects an even stronger correlation between the mean and degree of persistence.
18. Although our model assumes that $V$ is constant, the figures illustrate that filtered estimates do shift little by little over time, thus introducing a limited degree of variation in shock variances. This variation may reflect a transient adaptation to the kind of shifts emphasized by our discussants.
19. We focus on the long-horizon variance, $\mathrm{var}_t(\pi_{t+120})$, in order to let the transients die out.
Figure 8 LOG DETERMINANT OF THE MEAN POSTERIOR SPECTRAL DENSITY MATRIX IN SELECTED YEARS
Figure 9 STANDARD DEVIATION OF ONE-STEP-AHEAD VAR PREDICTION ERRORS
Figure 10 CORE INFLATION AND INFLATION PERSISTENCE
Figure 11 CORE INFLATION AND THE STANDARD DEVIATION OF INFLATION, 30 YEARS AHEAD
3.4 TAYLOR-RULE PARAMETERS
At the end of the day, we hope to interpret the evolution of inflation dynamics in terms of the changing behavior of central bankers. Accordingly, we also investigate the evolution of the parameters of a Taylor rule. A simple form of the Taylor rule posits that the central bank's nominal interest target, $i_t^*$, varies positively with inflation and negatively with unemployment,
where $\pi^*$, $u^*$, and $r^*$ represent target values for inflation, unemployment, and the real interest rate, respectively. The lags in the relationship reflect the fact that current observations on inflation and unemployment are often unavailable to policymakers, especially early in the quarter.20 Therefore decisions are based on lagged values of inflation and unemployment.
20. This is relevant in our case because the interest rate is sampled in the first month of the quarter.
The basic Taylor rule is usually augmented with a policy shock $\eta_t$ and a partial adjustment formula to allow for interest-rate smoothing,
Cast in this form, the Taylor rule can be represented as the interest-rate equation in a vector autoregression for inflation, unemployment, and nominal interest rates. In an alternative form of the Taylor rule, decisions about the ex ante real interest rate depend on lags of inflation, unemployment, and ex post real rates,
By substituting $\pi_t = E_{t-1}\pi_t + \varepsilon_{\pi t}$, this form can be cast as the real-interest equation in a vector autoregression for inflation, unemployment, and ex post real rates, with a composite innovation consisting of policy shocks and inflation prediction errors,
This is the form of the Taylor rule that we shall study.21 In response to our discussants, we concede that it is controversial to interpret the systematic part of the monetary policy rule as the projection of real interest rates only on past information. By orthogonalizing an innovation covariance matrix in a particular order, many studies attribute part of the contemporaneous covariance among innovations to the monetary rule (i.e., the rule for setting interest rates responds to contemporary information). We also recognize that the shapes of impulse response functions for the response of macroeconomic aggregates to the monetary policy shock can depend sensitively on how much of the contemporaneous innovation volatility is swept into the monetary shock. In defense of our choice, we note that among others McCallum and Nelson (1999) doubt that monetary authorities have timely and reliable enough reports to let them respond to what the vector autoregression measures as contemporaneous information.22
21. Actually, we substitute the logit of unemployment for unemployment.
22. It would have been possible for us to condition on contemporaneous information by using the time $t$ estimate of the $R$ component of $V$ to orthogonalize $R$ as desired, though we have not done so in this paper.
The literature on monetary policy rules emphasizes several aspects of central-bank behavior. We focus on two elements that are especially relevant to the evolution of the law of motion for inflation. One concerns the evolution of target inflation, $\pi^*$, and the other concerns the evolution of the degree of activism. The value of target inflation cannot be identified from the interest-rate equation alone. But assuming that the central bank adjusts interest rates so that inflation eventually converges to its target, this value can be estimated by computing long-horizon forecasts using the entire vector autoregression. Under this assumption, target and core inflation are synonymous. Evidence on this feature of the policy rule is reported above, in Figure 1. Another important issue concerns whether a rule is activist or passivist, a distinction that bears on the determinacy of equilibrium (e.g., see Clarida, Gali, and Gertler, 2000). A policy rule is activist if, other things equal, the central bank increases the nominal interest rate more than one-for-one in response to an increase in inflation, so that the real interest rate increases. A passivist central bank adjusts the nominal interest rate one-for-one or less, so that the real interest rate remains constant or falls as inflation rises. In the real-interest version of the Taylor rule, the degree of activism can be measured by
A policy rule is activist if $A > 0$. Because our version of the Taylor rule is the real-interest equation in the vector autoregression, the posterior density for the activism coefficient can be computed directly from the posterior density for the states. The output of the Gibbs sampler at date $t$ includes the terminal state, $\theta_t$, and for each draw of the terminal state we calculate the implied value for $A$. Conditional on data up to date $t$, this measures the degree of activism that would be forecast going forward from date $t$.
Posterior beliefs about $A$ are illustrated in Figure 12. Because of outliers in the posterior density, the figure graphs the posterior median and interquartile range.23 The figure has two salient features. First, as reported by Clarida, Gali, and Gertler, there have been important changes in the degree of activism over time. Judging by the posterior median, which is marked by circles, the degree of activism declined in the late 1960s and was approximately neutral in the early 1970s. For the remainder of the 1970s, the rule was decidedly passive, allowing real interest rates to fall as inflation rose. Monetary policy started becoming activist in 1981 and continued to grow more activist until the end of Volcker's term. During the first half of Greenspan's term, policy drifted toward a less active stance, perhaps reflecting the "opportunistic" approach to disinflation. But policy has again grown more activist since 1993, surpassing the peak achieved at the end of the Volcker years.
The second notable feature concerns the dispersion of beliefs about the degree of activism. Judging by the interquartile range, beliefs were tightly concentrated only in the 1970s, when monetary policy was passive. At that time, there seemed to be little doubt, for better or worse, about how the Fed was doing business. The periods before and after both involve more uncertainty about the degree of activism. In the 1960s, the lower end of the interquartile range straddled the boundary of the activist region. In the Volcker-Greenspan years, the interquartile range was wider but safely within the activist region.
Figure 13 shows how the activism parameter has covaried with core inflation and the degree of inflation persistence.24 The latter both increased during the 1970s experiment with a passivist monetary rule, and they both fell in the 1980s and 1990s as policy became more activist. The correlation between the degree of activism and core inflation is -0.69 over the full sample and -0.87 in the Volcker-Greenspan era. Similarly, the correlation between the activism and persistence measures is -0.46 over the full sample and -0.76 in the Volcker-Greenspan years. Thus, as one might expect, there is an inverse relation between the degree of activism on the one hand and core inflation and inflation persistence on the other.
23. The outliers result from division by $1 - \rho(1)$, which sometimes takes on values close to zero.
24. The variables are measured in standard units in order to put them on a common basis.
Figure 12 POSTERIOR MEDIAN AND INTERQUARTILE RANGE FOR THE ACTIVISM COEFFICIENT
Figure 13 CORE INFLATION, INFLATION PERSISTENCE, AND POLICY ACTIVISM
4. Testing the Natural-Rate Hypothesis
Figures 14 through 16 summarize the consequences of implementing econometric tests of the natural-rate hypothesis along the lines of Solow (1968), Tobin (1968), Gordon (1970), and many others. They tested the natural-rate hypothesis by regressing inflation on its own lags along with current and lagged values of unemployment,
Figure 14 RECURSIVE TESTS OF THE NATURAL-RATE HYPOTHESIS
Circles represent recursive least-squares estimates with decreasing gain, and diamonds represent constant-gain estimates.
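The two estimators distinguished in Figure 14 differ only in their gain sequence. As a minimal sketch, assume the regressors have already been stacked into a matrix `X` (a constant, lags of inflation, and current and lagged unemployment) and that a vector `r` picks out the inflation-lag coefficients. The recursion below then updates the coefficients and an approximate t-ratio for the sum of those coefficients minus one. The function name `recursive_ls`, its arguments, the gain floor `g0`, and the prediction-error variance estimate are illustrative assumptions, not the authors' code.

```python
import numpy as np

def recursive_ls(y, X, r, n0, g0=0.0):
    """Recursive least squares with gain g_t = max(1/t, g0) (illustrative sketch).

    y  : (T,) dependent variable, e.g. inflation
    X  : (T, k) regressors, e.g. constant, inflation lags, unemployment terms
    r  : (k,) selector such that r @ beta is the sum of inflation-lag coefficients
    n0 : number of initial observations used for a batch OLS start-up
    g0 : gain floor; g0 = 0 reproduces recursive OLS (decreasing gain)
    Returns the coefficient path and approximate t-ratios for r @ beta - 1.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    r = np.asarray(r, dtype=float)
    T = X.shape[0]

    # Start-up: batch OLS on the first n0 observations.
    X0, y0 = X[:n0], y[:n0]
    R = X0.T @ X0 / n0                      # average second-moment matrix
    beta = np.linalg.solve(X0.T @ X0, X0.T @ y0)
    s2 = np.mean((y0 - X0 @ beta) ** 2)     # residual variance estimate

    betas, tstats = [], []
    for t in range(n0, T):
        g = max(1.0 / (t + 1), g0)          # t + 1 observations after this update
        x = X[t]
        err = y[t] - x @ beta               # one-step-ahead prediction error
        R = R + g * (np.outer(x, x) - R)
        beta = beta + g * np.linalg.solve(R, x) * err
        s2 = s2 + g * (err ** 2 - s2)       # assumption: variance tracked from prediction errors
        cov = g * s2 * np.linalg.inv(R)     # approximate coefficient covariance
        tstats.append((r @ beta - 1.0) / np.sqrt(r @ cov @ r))
        betas.append(beta.copy())
    return np.array(betas), np.array(tstats)
```

With `g0 = 0` the coefficient path coincides with ordinary recursive least squares. A small positive `g0` stops the gain from shrinking once `1/t` falls below it, which holds the effective sample size near `1/g0` and so discounts older observations.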
They interpreted the condition $\beta_1(1) = 1$ as evidence in favor of the natural-rate hypothesis, and $\beta_1(1) < 1$ as evidence in favor of a long-run trade-off.25 The outcomes of recursive natural-rate tests are shown in Figure 14. The initial estimates are based on data from 1948 through 1964, allowing for lags at the beginning of the sample. On the right-hand side of equation (4.1), we include two lags of inflation along with the current value and two lags of unemployment. Starting in 1965.1, new data are added one quarter at a time, and $\beta_1(1)$ and its t-ratio are updated using the Kalman filter. The figure plots the resulting sequence of t-statistics for $\beta_1(1) - 1$. Points marked with a circle represent OLS estimates, and those marked with a diamond represent discounted least-squares (DLS) estimates. For the latter, the gain parameter was $g_t = \max(1/t, g_0)$.26 The horizontal line marks the 1% critical value for a one-sided test.

25. The thought experiment in play imagines the consequences of a permanent increase in expected inflation, which is proxied by the lagged inflation terms on the right-hand side. In order for this to be neutral in the long run, it must be the case that this has a one-for-one effect on actual inflation, so that $\beta_1(1) = 1$. Assuming that current unemployment is predetermined with respect to current inflation, this regression can be estimated by least squares. King and Watson (1997) point out that the last assumption follows from the structure of vintage 1960s Keynesian models, in which unemployment and inflation were determined in a block recursive fashion. Unemployment was determined by aggregate demand and Okun's law. Taking unemployment as given, inflation was determined by a Phillips-curve relation for wages and a markup equation for prices.

Sargent (1971) pointed out that this approach is valid only if the sample used to estimate $\beta_1(1)$ contains permanent shifts in inflation. Otherwise the data are uninformative for the thought experiment, and $\beta_1(1)$ could be less than 1 even if there were no long-run trade-off. Thus, as the degree of inflation persistence in the sample varies over time, so too will outcomes of the test. Early versions of the test, based on samples in which there was little inflation persistence, found estimates of $\beta_1(1) < 1$ and were interpreted as evidence in favor of a long-run trade-off. As shown in the figure, the natural-rate hypothesis was strongly rejected through 1973. Later versions were based on samples containing more inflation persistence, and they fail to reject long-run neutrality. Indeed, from the mid-1970s until the mid-1980s there was very little evidence against long-run neutrality. Since then, as the degree of inflation persistence has fallen, evidence against the natural-rate hypothesis has grown.

Figure 15 INFLATION PERSISTENCE AND NRH TEST STATISTICS

Figure 16 CORE INFLATION AND NRH TEST STATISTICS

Figure 15 illustrates the relation between inflation persistence and outcomes of the test.27 The figure confirms that the test statistic is positively related to the degree of persistence, though the relation is nonlinear. Once there was enough persistence to identify the long-run trade-off parameter, the test began to accept long-run neutrality, and further increases in persistence no longer increased the t-ratio. Figure 16 shows that the test statistic is also positively related to core inflation. Without alterations, the model of Sims (1988), Chung (1990), and Sargent (1999) cannot explain that pattern.
In that model, persistence rises and the natural-rate hypothesis is learned as inflation falls, so the model predicts an inverse relation between core inflation and the outcome of the test. The pattern shown in Figure 16 is more consistent with an alternative story, in which the upward drift in inflation taught the government to accept the natural-rate hypothesis via the Solow-Tobin test. Thus, though the Solow-Tobin procedure provided a valid test of the natural-rate hypothesis only when inflation had become sufficiently persistent, by the mid-1970s inflation had become persistent enough to let the test detect the natural rate. Therefore the Solow-Tobin econometric procedures gave policymakers information that should have caused them to stabilize inflation if they had the preferences attributed to them, for example, by Kydland and Prescott (1977). For when a policymaker solves the problem of minimizing an expected discounted sum of a quadratic loss function in inflation and unemployment subject to a Phillips curve like (4.1), and when the policymaker accepts the natural-rate hypothesis in the form in which Solow and Tobin cast it, then for discount factors large enough, the policymaker will soon push average inflation to zero.28 When Volcker took control, the advice to push inflation quickly toward zero came even from those models and optimal-control exercises that took inadequate account of the Lucas critique, because they rested on the Solow-Tobin test.

26. There are only minor differences between the two estimators within the sample, because until recently $1/t > g_0$. The distinction between constant and decreasing gain estimators matters more when we consider the likely outcomes of future tests.

27. These figures refer to discounted least-squares estimates, but the results for OLS estimates are essentially the same.

However, the strong inflation persistence that induced the Solow-Tobin test to detect the natural rate in the mid-1970s depended on the monetary authority's having recently allowed inflation to drift upward, perhaps in response to its earlier erroneous views about an exploitable trade-off. If the government's success in lowering inflation created lower persistence in inflation, the Solow-Tobin test could one day again point to an exploitable trade-off that would tempt later monetary authorities to use inflation to fight unemployment. That possibility has worried John Taylor and others, an issue to which we now turn.

5. Taylor's Warning about Recidivism

Recently, John Taylor (1998) has warned about recidivism on the natural-rate hypothesis. Taylor notes that inflation is lower and more stable in the current monetary regime, and he points out that as such data accumulate, erroneous econometric tests of long-run neutrality may again begin to suggest the existence of a trade-off. To the extent that the tests undermine confidence in the natural-rate hypothesis, they could also undermine support for a low-inflation policy.
28. This is a version of the control problem described by Phelps (1967) and Sargent (1999). Long ago, Albert Ando pointed out that good macroeconometric models had confirmed the absence of a long-run inflation-unemployment trade-off by the early or mid-1970s.

In this section, we offer quantitative evidence to back up Taylor's warning. The evidence is based on the posterior predictive density conditioned on data through the end of 2000. We use this to make predictions about the probability of rejecting the natural-rate hypothesis going forward in time.

Figure 14 suggests that Taylor's concern has some merit, because by the end of the sample conventional tests were close to rejecting $\beta_1(1) = 1$ against $\beta_1(1) < 1$ at the 5% level. The in-sample evidence is marginal,29 however, and it is an open question whether stronger evidence will emerge as data from a low-inflation regime accumulate. To address this question, we compute the posterior predictive density of natural-rate t-ratios going forward in time from 2000.4. Then we calculate the probability, conditioned on what we know now, of rejecting the natural-rate hypothesis at various dates in the future. In this way, we can quantify the risk of backsliding.

Let $t^{T+1,T+H}$ represent a potential future sequence of recursive t-statistics for $\beta_1(1) - 1$,
We want to make statements about how these sequences are likely to evolve. From a Bayesian perspective, the natural way to proceed is to compute the posterior predictive density for these sequences,
To sample from this density, we start with the posterior predictive density for inflation and unemployment and then exploit the fact that t-statistics are deterministic functions of the data.30 Hence we can write
where the function #(•) is nothing more than the output of the recursive least-squares algorithm initialized with estimates through date T. To draw a realization from (5.2), we first draw a trajectory for future inflation and unemployment from their posterior predictive density and then apply the Kalman filter to compute the associated sequence of test statistics. The probability that the test will reject at some future date h is
where $c(\alpha)$ is the normal critical value corresponding to a one-sided test of size $\alpha$. In terms of our sampling strategy, this is the fraction of simulated trajectories in which $\beta_1(1)$ is significantly less than 1 at date $h$, where significance is determined by the usual classical criterion. Thus, we are offering a Bayesian interpretation of judgments based on a classical procedure.

29. In our opinion, strong rejections will be needed to reverse the consensus in favor of the natural-rate hypothesis.

30. Remember, from a Bayesian perspective $\beta(1)$ is random and $\hat{\beta}(1)$ is deterministic.

Figure 17 reports results for a constant-gain estimator. The results for a recursive OLS estimator are similar. We focus on the constant-gain estimator because this holds the effective sample size constant as data accumulate. Thus the increased probability of rejection does not follow simply from an increase in the number of observations. As the figure shows, the probability of rejection remains small in the first two years of the forecast. But then it increases quickly, reaches 50% within 9 years, and approaches 85% in 20 years. The increasing probability of rejection reflects the changing nature of inflation-unemployment dynamics along with the fact that data from new and old regimes are being mixed in different proportions. As time moves forward, data from the old high-inflation, strong-persistence regime are discounted more heavily, and data from the new low-inflation, weak-persistence regime increasingly dominate the sample. The identifying information from the 1970s is lost little by little, and the properties of the Volcker-Greenspan era come more and more into play. This confirms an element of Taylor's warning, that the Solow-Tobin test may once again begin to suggest the existence of a trade-off.

Figure 17 PROBABILITY OF REJECTING THE NATURAL-RATE HYPOTHESIS, CALCULATED FROM THE POSTERIOR PREDICTIVE DISTRIBUTION FOR THE CONSTANT-GAIN ESTIMATOR

6. Concluding Remarks

This paper has used a vector autoregression with random coefficients to measure parameter drift in U.S. inflation-unemployment-interest-rate dynamics. We construct our model to focus on parameter drift because we are sympathetic to the theoretical views expressed in Lucas (1976) and Sargent (1999), which lead us to suspect that evolution in the monetary policy authority's view of the world will make the systematic part of a vector autoregression drift. We have taken seriously our model's description of four sources of uncertainty about the future,31 and have used computer-intensive Bayesian methods to take those uncertainties into account. We use the model to develop a number of stylized facts about the evolution of postwar U.S. inflation and relate them to important issues about learning to detect the natural-rate hypothesis using imperfect tests, and to how the evolving results from those tests were associated with evolution in a description of a monetary policy rule (a Taylor rule). Among other things, we find that the mean and persistence of inflation are strongly positively correlated; that the persistence of inflation is positively associated with statistics that have been used to test for accepting the natural-rate hypothesis; that evolving measures of policy activism in fighting inflation broadly point to more activism, with a lag, somewhat after test statistics began accepting the natural-rate hypothesis; and that recently the degree of persistence in inflation has been drifting downward as inflation has come under control.
We also study John Taylor's warning about recidivism toward an exploitable trade-off between inflation and unemployment. Unfortunately, our statistical model confirms Taylor's concerns. Our model predicts that as observations of lower, more stable inflation accumulate, econometric evidence against the natural-rate hypothesis is likely to develop.32 Against this evidence, we hope that policymakers do not succumb again to the temptation to exploit the Phillips curve.

31. These are: (1) the unknown current location of the VAR coefficients, (2) the unknown covariance matrix of innovations to VAR coefficients and equations, (3) the future evolution of the VAR coefficients, and (4) the stream of future shocks to the VAR equations.

32. Prospects for a gradual backsliding away from the zero-inflation Ramsey outcome toward the higher Nash inflation rate also permeate the "mean dynamics" in the model of Sargent (1999) and Cho, Williams, and Sargent (2001).
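The rejection probabilities discussed in Section 5 are Monte Carlo frequencies: each simulated future path yields a sequence of t-ratios, and the probability of rejection at a given horizon is the share of paths whose t-ratio falls below the lower-tail normal critical value. The following is a minimal sketch of that final counting step only, assuming the t-ratio paths have already been simulated and stored one path per row. The function name `rejection_probability` and the array layout are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def rejection_probability(tstat_paths, alpha=0.05):
    """Share of simulated paths that reject long-run neutrality at each horizon.

    tstat_paths : array of shape (n_paths, horizon); entry [i, h] is the
                  t-ratio for beta_1(1) - 1 on simulated path i at horizon h
    alpha       : size of the one-sided test against beta_1(1) < 1
    """
    tstat_paths = np.asarray(tstat_paths, dtype=float)
    crit = NormalDist().inv_cdf(alpha)        # lower-tail normal critical value
    return (tstat_paths < crit).mean(axis=0)  # rejection frequency by horizon
```

Applied to draws from a posterior predictive density, the returned vector plays the role of the curve in Figure 17. How those draws are generated is outside the scope of this sketch.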
Appendix. A Nonlinear Transition Equation

Our numerical procedures construct a sample using $p(\theta^T \mid V)$ defined by (2.13). This appendix verifies that these procedures are consistent with the nonlinear transition function defined in the text. In particular, we verify the nonlinear transition equation, $p(\theta_{t+1} \mid \theta_t, V) \propto I(\theta_{t+1}) f(\theta_{t+1} \mid \theta_t, V)$, from equations (2.3), (2.13), (2.14), and (2.15). First consider the transition equation for the terminal state,
The joint density in the numerator can be expressed as
The marginal density in the denominator of (A.1) can be expressed as
The ratio between the two is
Next consider the transition equation for the penultimate state,
The joint density in the numerator of (A.5) can be expressed as
where the last equality follows from the fact that $p(\theta_T \mid \theta_{T-1}, V)$ integrates to one. Using the same argument as above, this can be expressed as
The marginal density for $\theta_{T-2}$ is
The ratio between the two is
Continuing a backward recursion implies
Hence, the nonlinear transition equation can indeed be expressed in terms of the truncated linear transition equation.

REFERENCES

Ahmed, S., and J. H. Rogers. (1998). Inflation and the great ratios: Long-term evidence from the U.S. Washington: Board of Governors of the Federal Reserve System. International Finance Discussion Paper 628.
Albanesi, S., V. V. Chari, and L. J. Christiano. (2000). Expectations traps and monetary policy. Northwestern University and University of Minnesota. Mimeo.
Anderson, B. D. O., and J. B. Moore. (1979). Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall.
Barro, R. J., and D. B. Gordon. (1983). A positive theory of monetary policy in a natural rate model. Journal of Political Economy 91:589-610.
Bernanke, B. S., and I. Mihov. (1998a). The liquidity effect and long-run neutrality. In Carnegie-Rochester Conference Series on Public Policy, Vol. 49, B. T. McCallum and C. I. Plosser (eds.). Amsterdam: North Holland, pp. 149-194.
Bernanke, B. S., and I. Mihov. (1998b). Measuring monetary policy. Quarterly Journal of Economics 113(August):869-902.
Beveridge, S., and C. R. Nelson. (1981). A new approach to decomposition of economic time series into permanent and transitory components with particular attention to measurement of the "Business Cycle." Journal of Monetary Economics 7:151-174.
Cho, I. K., N. Williams, and T. J. Sargent. (2001). Escaping Nash inflation. Review of Economic Studies, forthcoming.
Chung, H. (1990). Did policy makers really believe in the Phillips curve? An econometric test. University of Minnesota. PhD Dissertation.
Clarida, R., J. Gali, and M. Gertler. (2000). Monetary policy rules and macroeconomic stability: Evidence and some theory. Quarterly Journal of Economics 115(1):147-180.
DeLong, J. B. (1997). America's only peacetime inflation: The 1970s. In Reducing Inflation: Motivation and Strategy, C. D. Romer and D. Romer (eds.). NBER Studies in Business Cycles, Vol. 30. Chicago: University of Chicago Press.
Doan, T., R. Litterman, and C. Sims. (1984). Forecasting and conditional projections using realistic prior distributions. Econometric Reviews 3(1):1-100.
Eisner, R. (1997). A new view of the NAIRU. In Improving the Global Economy: Keynesianism and Growth in Output and Employment, P. Davidson and J. Kregel (eds.). Cheltenham: Edward Elgar.
Estrella, A., and F. S. Mishkin. (1999). Rethinking the role of the NAIRU in monetary policy: Implications of model formulation and uncertainty. In Monetary Policy Rules, J. B. Taylor (ed.). NBER Conference Report. Chicago: University of Chicago Press.
Fair, R. (1996). Testing the standard view of the long-run unemployment-inflation relation. Cowles Commission. Discussion Paper 1121.
Fisher, M. E., and J. J. Seater. (1993). Long-run neutrality and superneutrality in an ARIMA framework. American Economic Review 83(3):402-415.
Friedman, B. (1998). Introduction. In Inflation, Unemployment, and Monetary Policy, R. M. Solow and J. B. Taylor (eds.). Cambridge, MA: MIT Press.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. (1995). Bayesian Data Analysis. London: Chapman and Hall.
Gordon, R. J. (1970). The recent acceleration of inflation and its lessons for the future. Brookings Papers on Economic Activity 1:8-41.
Granger, C. W. J. (1966). The typical spectral shape of an economic variable. Econometrica 34(1):150-161.
Hall, R. (1999). Comment on rethinking the role of the NAIRU in monetary policy: Implications of model formulation and uncertainty. In Monetary Policy Rules, J. B. Taylor (ed.). NBER Conference Report. Chicago: University of Chicago Press.
Ireland, P. (1999). Does the time-consistency problem explain the behavior of inflation in the United States? Journal of Monetary Economics 44(2):279-292.
Kim, C.-J., and C. R. Nelson. (1999). State-Space Models with Regime Switching. Cambridge, MA: MIT Press.
King, R. G., and M. W. Watson. (1994). The post-war U.S. Phillips curve: A revisionist econometric history. Carnegie-Rochester Conference Series on Public Policy 41:157-219.
King, R. G., and M. W. Watson. (1997). Testing long-run neutrality. Federal Reserve Bank of Richmond Economic Quarterly 83(3):69-101.
Kydland, F., and E. C. Prescott. (1977). Rules rather than discretion: The inconsistency of optimal plans. Journal of Political Economy 85(3):473-491.
Lucas, R. E., Jr. (1972). Econometric testing of the natural rate hypothesis. In The Econometrics of Price Determination, O. Eckstein (ed.). Washington: Board of Governors of the Federal Reserve System.
Lucas, R. E., Jr. (1976). Econometric policy evaluation: A critique. In The Phillips Curve and Labor Markets, K. Brunner and A. Meltzer (eds.). Carnegie-Rochester Series on Public Policy, Vol. 1.
McCallum, B. T., and E. Nelson. (1999). Performance of operational policy rules in an estimated semiclassical structural model. In Monetary Policy Rules, J. B. Taylor (ed.). Chicago: University of Chicago Press, pp. 15-45.
Parkin, M. (1993). Inflation in North America. In Price Stabilization in the 1990s, K. Shigehara (ed.).
Phelps, E. S. (1967). Phillips curves, expectations of inflation, and optimal unemployment over time. Economica 2(3):22-44.
Roberts, G. O., and A. F. M. Smith. (1992). Simple conditions for the convergence of the Gibbs sampler and Metropolis-Hastings algorithms. Stochastic Processes and Their Applications 49:207-216.
Rudebusch, G. D., and L. E. O. Svensson. (1999). Policy rules for inflation targeting. In Monetary Policy Rules, J. B. Taylor (ed.). NBER Conference Report. Chicago: University of Chicago Press.
Sargent, T. J. (1971). A note on the accelerationist controversy. Journal of Money, Credit and Banking 8(3):721-725.
Sargent, T. J. (1983). Autoregressions, expectations, and advice. American Economic Review, Papers and Proceedings 74(2):408-415.
Sargent, T. J. (1999). The Conquest of American Inflation. Princeton, NJ: Princeton University Press.
Sims, C. A. (1982). Policy analysis with econometric models. Brookings Papers on Economic Activity 1:107-152.
Sims, C. A. (1988). Projecting policy effects with statistical models. Revista de Análisis Económico 3:3-20.
Sims, C. A. (1999). Drifts and breaks in monetary policy. Princeton University. Mimeo.
Solow, R. M. (1968). Recent controversy on the theory of inflation: An eclectic view. In Proceedings of a Symposium on Inflation: Its Causes, Consequences, and Control, S. Rousseas (ed.). New York: New York University.
Solow, R. M. (1998). How cautious must the Fed be? In Inflation, Unemployment, and Monetary Policy, R. M. Solow and J. B. Taylor (eds.). Cambridge, MA: MIT Press.
Taylor, J. B. (1997). Comment on America's only peacetime inflation: The 1970s. In Reducing Inflation: Motivation and Strategy, C. D. Romer and D. Romer (eds.). NBER Studies in Business Cycles, Vol. 30. Chicago: University of Chicago Press.
Taylor, J. B. (1998). Monetary policy guidelines for unemployment and inflation stability. In Inflation, Unemployment, and Monetary Policy, R. M. Solow and J. B. Taylor (eds.). Cambridge, MA: MIT Press.
Taylor, J. B. (1999). Staggered price and wage setting in macroeconomics. In Handbook of Macroeconomics, Vol. 1B, J. B. Taylor and M. Woodford (eds.). Amsterdam: North Holland.
Tobin, J. (1968). Discussion. In Proceedings of a Symposium on Inflation: Its Causes, Consequences, and Control, S. Rousseas (ed.). New York: New York University.
Whittle, P. (1953). The analysis of multiple stationary time series. Journal of the Royal Statistical Society, Series B 15:125-139.
Comment CHRISTOPHER A. SIMS Princeton University
1. Introduction

My comments fall under three main headings:

(i) The later, Taylor-rule part of the paper is a structural VAR analysis. It uses nonstandard, and questionable, identifying assumptions without giving us a discussion of why it differs from most of the literature or what motivates the nonstandard specification. It also fails to check its specification as thoroughly as is standard in the structural VAR literature.

(ii) The evidence that monetary policy behavior has changed sharply between early and late postwar periods, or even between interwar and postwar periods, is less strong than might appear from this paper.

(iii) The paper sets a new, and high, standard for descriptive analysis of macroeconomic data. I hope it will be widely copied, and therefore want to be sure to register objections to certain aspects of its technical procedures before it's too late. Some of the questionable aspects of its procedures may have affected its conclusions.

2. Identification

There are several related facts about policy rules and their relation to the data that reflect the identification problem that must be confronted in evaluating claims to estimate a rule.

It is easy to generate "policy shocks" that produce strong price puzzles, particularly in pre-1979 data, as we see from Barth and Ramey's paper in this volume. Identification schemes that produce price puzzles tend also to imply large real effects of monetary policy shocks and small responses of interest rates to lagged inflation, that is, low activism.
No matter what the actual policy rule, it will be possible to estimate a regression of interest rate on "fundamentals" (i.e., not P, M, or other nominal variables: intrinsic state variables) that can play the role of a statistical "interest-rate equation." Yet, in most equilibrium models, if this regression were in fact the policy rule and fiscal policy took the conventionally assumed form, the model's equilibrium would be indeterminate.

Observations from a gold-standard or price-level-targeting policy regime will spuriously imply a nonactivist policy rule unless quite sophisticated simultaneity is recognized in the estimation. This follows because in such regimes high inflation predicts low future inflation, which through the Fisher equation then implies low current nominal interest rates. Such a regime can be generated by a policy reaction function that makes r respond very strongly to the price level or inflation, but the policy reaction function is not recovered by OLS regression.
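The price-level-targeting point can be illustrated with a stylized simulation. The sketch below assumes a log price level that reverts to target as an AR(1) process with coefficient `phi`, a constant real rate, and a Fisher equation linking the nominal rate to expected inflation. All of these, along with the function names, are illustrative assumptions rather than anything estimated in the paper or this comment. Under them, expected inflation is `(phi - 1) * p_t`, and an OLS regression of the nominal rate on current inflation has a population slope of `-(1 - phi) / 2`, far below the one-for-one response that would mark an activist rule.

```python
import numpy as np

def simulate_price_level_target(T=20000, phi=0.9, r_real=2.0, seed=0):
    """Simulate inflation and the nominal rate when the price level mean-reverts."""
    rng = np.random.default_rng(seed)
    p = np.zeros(T)                         # log price level, deviation from target
    for t in range(1, T):
        p[t] = phi * p[t - 1] + rng.normal()
    infl = p[1:] - p[:-1]                   # pi_t = p_t - p_{t-1}
    expected_infl = (phi - 1.0) * p[1:]     # E_t pi_{t+1} under the AR(1) assumption
    nominal = r_real + expected_infl        # Fisher equation with a constant real rate
    return infl, nominal

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))
```

With `phi = 0.9` the fitted slope settles near -0.05, so a naive reading of the regression would classify the regime as passive even though the price level is tightly anchored by construction.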
In other words, there is always an identification problem in determining whether policy is active. The identification problem can be solved, but only by bringing in identifying assumptions that are not testable. One of the identifying assumptions in this paper is that the residual in a VAR ex post real-interest-rate equation with unemployment and CPI on the right is the policy shock, which amounts to a recursive VAR identification scheme. While much of the identified VAR literature relies on this assumption, it can lead to problematic interpretations of the data. Most prominently, price puzzles (inflationary response to monetary tightening) are a common outcome (as, e.g., in Barth and Ramey's paper in this volume) when purely recursive identification schemes are applied to pre-1980 U.S. data. As Leeper and Zha (2001) show, policy rules are estimated as stable and without price puzzles when the fact that policy behavior (at least before 1980) involved responses to the money stock is allowed for and the resulting simultaneity is recognized.

The paper also presents its policy reaction function as a "real-interest-rate rule." The unusual timing of the paper's data (r is not a quarterly average, but rather a monthly average from the first month in the quarter, while the other data are quarterly averages) makes this assertion difficult to interpret. In a continuous-time, or cleanly discrete-time, model, when prices are flexible and money is neutral, the monetary authority simply cannot set the real interest rate. A policy equation with the real rate on the left, even if it has lagged inflation on the right, contradicts the mapping from the economy's real state to its real interest rate. With non-neutralities in the model, nonexistence will no longer be a logical necessity, but there will be a range of models, with weak non-neutralities, for which such policy rules raise existence problems. It
seems unwise to impose a policy rule of this form on the data as an a priori restriction. To understand this problem, consider the simple model
It is easy to understand that this pair of equations leads to nonexistence of a stable rational-expectations equilibrium, because taking the difference of the two equations would force innovations in the real rate to be exact functions of innovations in the policy equation. If we replaced $E_{t-1}\pi_t$ in the first equation with $E_t\pi_{t+1}$, as would be appropriate if the model's data had conventional timing, the system would be well behaved. But of course, if the data had conventional timing, this specification would no longer represent policy setting the real rate. Replacing $E_{t-1}\pi_t$ in the second equation with $\pi_t$ itself is no help, however, as the resulting system still has no solution. It would have been better for the paper to stick with a nominal-rate rule, as does the rest of the structural VAR literature. As it is, the interpretation of all the parts of the paper that depend on this identification is problematic. I agree with the authors that it is reasonable to assert as an identifying assumption that policy responds only to lagged information. This view could have been incorporated into their structure simply by omitting current $\pi_t$ from the reaction function.

Papers in the structural VAR literature almost universally check identification by examining impulse responses, trying to ensure that the estimated system does not have unreasonable properties. It is easy for apparently reasonable identifying restrictions to lead to estimated systems that are implausible, so this type of check is important. This paper does no such checking. Thus we do not know whether the periods of implied low activism also are accompanied by a price puzzle, whether the implied responses of monetary authorities to private shocks are reasonable, or whether the responses of the economy to the policy shocks are reasonable.

Probably the majority view among macroeconomists (and especially within the Fed system?)
is that monetary policy has changed drastically for the better over the last 30 or 40 years—Alan Greenspan is completely different from Arthur Burns. But the most careful statistical assessments of this idea are at best inconclusive, and for the most part suggest on the contrary that changes in the systematic component of policy in this period are modest. Examples of work that comes to this conclusion, using widely different methodologies, are papers by Orphanides (2001), Leeper and Zha (2001), Hanson (2001), and me (Sims, 1999). My own paper
argues that the most important changes between periods can be accounted for as shifts in the variances of the structural disturbances. Time-varying variances are hard to distinguish from parameter variation. Attempts to show shifts in policy behavior should recognize this, in order to come into contact with the literature supporting the opposite view.
3. Time-Varying Descriptive Statistics

The paper implements a novel strategy to summarize the variation in the economy's characteristics over time. It uses descriptive statistics computed from simulated future time paths drawn from the posterior predictive density at each date, displaying how they change over time. The results are thought-provoking and deserve further study. I found particularly interesting the concentration of the posterior on the activism coefficient during the 1970s, followed by widening uncertainty thereafter. Even though the paper's interpretation of its activism coefficient may be dubious, this pattern of increased, then decreased, certainty about important components of inflation dynamics is suggestive. Phenomena like this might have played a role in the inertia of policy at the time and in the subsequent popularity of Monday-morning quarterbacking about it.

The paper sticks entirely to forward-looking data summaries. For many purposes this is appropriate, but such filtered, as opposed to smoothed, estimates of the stochastic properties of the model contain a component of variation that is learning, rather than actual time variation in the behavior of the economy. Commonly graphs like, say, Figure 11 or 12 show quite different time paths when computed on the basis of smoothed estimates. The difference lets us distinguish between best ex post estimates of what was actually happening and best current estimates at the time of what was happening. It would be interesting to see the work extended in that direction.
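The distinction between filtered and smoothed estimates can be made concrete with a scalar stand-in for a drifting coefficient. The sketch below assumes a local-level model, `y_t = theta_t + e_t` with `theta_t = theta_{t-1} + v_t`, and known variances `h` and `q`. It is a toy, not the paper's VAR, and the function name and arguments are hypothetical. It runs a Kalman filter forward and then a Rauch-Tung-Striebel smoother backward.

```python
import numpy as np

def local_level_filter_smoother(y, q, h, m0=0.0, p0=1e6):
    """Kalman filter and RTS smoother for a random-walk state observed with noise.

    y  : (T,) observations
    q  : variance of the state innovation v_t
    h  : variance of the measurement error e_t
    m0, p0 : prior mean and variance of the initial state (diffuse by default)
    Returns filtered means/variances and smoothed means/variances.
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    m_pred, p_pred = np.zeros(T), np.zeros(T)   # E[theta_t | y_1..y_{t-1}] and its variance
    m_filt, p_filt = np.zeros(T), np.zeros(T)   # E[theta_t | y_1..y_t] and its variance
    m, p = m0, p0
    for t in range(T):
        m_pred[t], p_pred[t] = m, p + q         # predict one step ahead
        k = p_pred[t] / (p_pred[t] + h)         # Kalman gain
        m = m_pred[t] + k * (y[t] - m_pred[t])  # update with the new observation
        p = (1.0 - k) * p_pred[t]
        m_filt[t], p_filt[t] = m, p

    m_smooth, p_smooth = m_filt.copy(), p_filt.copy()
    for t in range(T - 2, -1, -1):              # backward pass uses later data
        c = p_filt[t] / p_pred[t + 1]
        m_smooth[t] = m_filt[t] + c * (m_smooth[t + 1] - m_pred[t + 1])
        p_smooth[t] = p_filt[t] + c ** 2 * (p_smooth[t + 1] - p_pred[t + 1])
    return m_filt, p_filt, m_smooth, p_smooth
```

The filtered path at date t uses only data through t, so some of its movement reflects learning. The smoothed path conditions on the whole sample and has weakly smaller variance at every date, coinciding with the filtered estimate only at the final observation.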
4. The "Learning the NRH" Story The paper's Figure 14 confirms a point that Albert Ando has made for a long time: It is hard to blame the inflation of the 1970s on econometric modelers serving up a long-run inflation trade-off. It is an important result of both Chung's thesis—which this paper cites—and Sargent's book that the story that naive econometric Phillips-curve estimation led to the inflation of the 1970s cannot be sustained. This paper proposes a new, incompletely articulated theory. It seems to me more a narrative theory than a time-invariant one that could be tested. The theory used in Chung's thesis, in Sargent's book, and in my
(1988) paper specifies both the (incorrect) model the policymakers use and the correct (natural-rate) model relating unemployment and inflation. It works out the consequences of these assumptions. My paper and Chung's thesis show that such a setup can easily lead to very long (at least millennia), possibly permanent periods of near-Ramsey behavior, with interest rates and inflation low on average. Sargent's book and Chung's thesis show that this setup does poorly at explaining U.S. postwar inflation and unemployment data, because it implies that policy authorities quickly realized the Phillips curve is nearly vertical. It is hard to understand why the paper gives such a prominent role to the f-test for the hypothesis $\beta_1(1) = 1$. Figure 14 shows that the test strongly rejected the null starting in 1973. Not until more than 6 years later, in late 1979, did the "Volcker regime" begin. If the f-test showing neutrality was crucial to producing the Volcker policies, the connection was certainly not a simple one. It seems likely that the connection of this f-test to future changes in policy will be at least as tenuous. My own view, which agrees in many respects with that of Orphanides (2001), is that unemployment rose and inflation rose because of real disturbances that lowered growth. Faced with the simultaneous rise in these two variables, and believing that unemployment affected inflation with a lag, policymakers had to decide whether the rise in unemployment that had already occurred was enough to exert adequate deflationary pressure. Since such "stagflation" had not occurred before on such a scale, they faced a difficult inference problem, which it took them some years to unravel. Note that in this story it is not $\beta_1(1)$ that is crucial, but the relation between $\beta_0$ and $\beta_2(1)$, i.e. the Phillips-curve "natural rate."
I think it likely that careful statistical work using the Phillips curve would have demonstrated much earlier than 1979 that the current levels of unemployment were not exerting much downward pressure on inflation. But policy models at the time were estimating "gap" variables by focusing entirely on real factors—production functions and trend rates of growth. Policymakers realized their mistake only slowly because of excessive reliance on a theory that claimed the "gap" was a function of the level of output and the current level of technology. If they had paid more attention to a wider range of data, they would have seen their mistake earlier. The notion that monetary policy acts on the price level by first affecting unemployment, or a "gap," which then via a Phillips curve affects inflation, is in my view mistaken. But if it had been the basis of a flexibly parameterized dynamic econometric model analyzing inflation, interest rates, and real growth jointly, it probably would not have led to such an acceleration of inflation as actually occurred.
5. Priors The paper uses a prior that makes no attempt to push the parameter estimates toward the unit-root boundary, centers the prior at an OLS estimate (which will tend to be more stationary than the truth when the truth is near the unit-root boundary), and truncates the parameter space to rule out even mildly unstable roots. This is in the name of being "less informative" than, e.g., Doan, Litterman, and Sims. It is always true that there is no unique way to produce an "uninformative" prior, and this is especially true in VARs. A prior like that proposed here, in a model that conditions on initial observations, implies a lot of weight on stationary models, which in turn generally imply that a great deal of sample history is explained by large initial transients. How this happens is elaborated in some earlier work of mine (Sims, 2000). Such a prior is not uninformative, and may easily lead to strange results. In the latter part of the paper simulations are used to give us an idea of how long it is likely to be before f-tests of $\beta_1(1) = 1$ are likely again to accept the null hypothesis. But the prior's concentration on stable models, and the time-variation model's insistence on making the model bounce away from the nonstationary boundary, could be strongly influencing the results of these simulations.
6. Conclusion This paper breaks new ground in interpreting data with a structural VAR and time-varying parameters. Many of the methodological ideas in it are new and worth pursuing. Its choices of prior and identifying assumptions, however, are deviations from standard practice in the structural VAR literature that should not, in my view, be imitated. These aspects of the modeling and interpretation are crucial enough to the paper's substantive conclusions that those conclusions remain doubtful.
REFERENCES
Hanson, M. (2001). Varying monetary policy regimes: A vector autoregressive investigation. Wesleyan University. Discussion Paper.
Leeper, E., and T. Zha. (2001). Modest policy interventions. Indiana University and Federal Reserve Bank of Atlanta. Discussion Paper. http://php.indiana.edu/~eleeper/Papers/lz0101Rev.pdf.
Orphanides, A. (2001). Monetary policy rules, macroeconomic stability, and inflation: A view from the trenches. Board of Governors of the Federal Reserve System. Discussion Paper.
Sims, C. A. (1988). Projecting policy effects with statistical models. Revista de Análisis Económico, pp. 3-20. www.princeton.edu/~sims.
Sims, C. A. (1999). Drift and breaks in monetary policy. Princeton University. Discussion Paper. http://www.princeton.edu/~sims/. Presented at plenary session of the July 1999 meetings of the Econometric Society, Australasian region.
Sims, C. A. (2000). Using a likelihood perspective to sharpen econometric discourse: Three examples. Journal of Econometrics 95(2):443-462. http://www.princeton.edu/~sims/.
Comment JAMES H. STOCK Kennedy School of Government, Harvard University; and NBER
1. Introduction Cogley and Sargent have provided a provocative and innovative contribution on an important problem, understanding the history of inflation in the United States and the evolving role of monetary policy in that history. They make many points in their rich paper, some empirical and some methodological. In this discussion, I focus on four of their most salient empirical findings:
1. The persistence of the postwar inflation process has evolved over the past four decades. In the 1960s, inflation was mean-reverting; in the 1970s and early 1980s, it was highly persistent; and in the past ten to fifteen years it has been mean-reverting, as it was in the 1960s. This view is widely shared (for example, it has also been made by Taylor (1999) and by Brainard and Perry (2000)), and it seems to reflect conventional wisdom across a wide spectrum of views of monetary policy.
2. There is a positive correlation between the level of inflation, as measured by its low-frequency component, and its persistence. This is essentially an implication of the first point, because inflation was low in the 1960s, high in the 1970s and early 1980s, and low again during the 1990s.
3. The inflation process has been unstable, not just as measured by its persistence, but also over its entire spectrum or, equivalently, all its autocorrelations.
4. The reduced-form backward-looking Phillips curve relating inflation to lagged inflation and a measure of real economic activity (in Cogley and Sargent, the unemployment rate) has been unstable over the past four decades.
Cogley and Sargent draw several conclusions from these and related empirical findings. The most immediately relevant for policy bears on Taylor's (1999) warning that the decline in the persistence of inflation might induce revisionism by policymakers, who might return to the belief that there is an exploitable long-run trade-off between unemployment and inflation. The meat of Taylor's warning is that this revisionism (perhaps a better term is recidivism) would lead to the same mistakes and the same bad outcomes that it did in the 1960s and early 1970s. In this, Cogley and Sargent's message is the same as in Sargent's (1999) monograph on the history of U.S. inflation as elaborated on by Cho, Williams, and Sargent (2001). Most of this discussion is devoted to presenting various pieces of empirical evidence that suggest that the foregoing four empirical findings are less clear-cut than Cogley and Sargent make them out to be. Specifically, I shall present evidence, based on hypothesis tests, confidence intervals, and median-unbiased estimates, that:
1. Inflation persistence has been roughly constant, and high, over the past 40 years in the United States.
2. Therefore, there is no correlation between the level of inflation and its persistence.
3. The autocorrelations of inflation are stable; at least, one cannot reject this hypothesis.
4. The reduced-form Phillips curve is stable, once one allows for a time-varying NAIRU, or if one interprets it not just as a relation between the unemployment rate and the rate of inflation, but more broadly as a relation between real economic activity and inflation.
These conclusions are quite at odds with Cogley and Sargent's, and this raises an interesting econometric question as to why my evidence is so different from theirs. The answer, not surprisingly, lies in differences between Cogley and Sargent's Bayesian methods and my frequentist methods.
2. Evaluating Cogley and Sargent's Empirical Results Cogley and Sargent use a sophisticated nonlinear multivariate procedure to characterize inflation dynamics. The methods used here are simpler and univariate, but get at the same issues. The inflation data I consider are for the GDP deflator, quarterly from 1959:1 to 2000:IV, although the results are robust to using other inflation measures.
2.1 PERSISTENCE OF INFLATION
There are a variety of ways to measure persistence, none perfect. The measure I consider is the largest root of an autoregressive representation of inflation. Cogley and Sargent's emphasis is on measurement, not testing, so to make this analysis parallel I consider median-unbiased estimates of the largest autoregressive root of inflation, constructed by inverting the augmented Dickey-Fuller statistic using the procedure developed in Stock (1991). This procedure produces confidence intervals for the largest root as well. Recursive median-unbiased estimates of the largest AR root and 90% confidence intervals for this root are plotted in Figure 1 [these estimates are based on AR(4) models estimated recursively using all the data from 1959:1 through the date indicated on the horizontal axis]. The striking feature of this plot is the stability of the estimates. Because the number of observations increases with the terminal date, the confidence intervals are tighter towards the end of the sample than at the beginning. At all dates since 1976, these intervals include one (the 90% confidence interval is briefly above one in 1975), and the recursive median-unbiased estimate is typically just less than one. The recursive estimates in Figure 1 use all the historical data through the terminal date, and this might miss changes in persistence towards the end of the sample. Figure 2 therefore plots rolling median-unbiased estimates of the largest AR root and the associated 90% confidence interval for AR(4) models estimated using 12 years of data terminating at the date on the horizontal axis. The median-unbiased point estimates and confidence intervals evidently are quite noisy—not surprisingly, because each estimate is based on just 48 observations, quite few for performing inference about large autoregressive roots. Still, the evidence is striking (and is robust to changing the inflation series, the window length, and the number of lags). 
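The recursive and rolling calculations described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain OLS estimates of an AR(p) model and reports the modulus of the largest root of the companion matrix, which is not the median-unbiased estimator of Stock (1991) and is known to be biased downward when the true root is near one. The function names and the 48-observation default window are hypothetical choices that mirror the 12-year quarterly windows mentioned in the text.

```python
import numpy as np

def largest_ar_root(x, p=4):
    """Fit an AR(p) with intercept by OLS and return the largest root modulus.
    Plain OLS, not a median-unbiased estimate (an assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    y = x[p:]
    # Regressors: a constant and lags 1..p of the series.
    lags = [x[p - j:len(x) - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    phi = coef[1:]
    # Companion matrix of the AR polynomial; its eigenvalues are the AR roots.
    companion = np.zeros((p, p))
    companion[0, :] = phi
    companion[1:, :-1] = np.eye(p - 1)
    return float(np.max(np.abs(np.linalg.eigvals(companion))))

def recursive_largest_root(x, first=48, p=4):
    """Expanding-sample estimates: all data from the start through each end date."""
    return np.array([largest_ar_root(x[:end], p) for end in range(first, len(x) + 1)])

def rolling_largest_root(x, window=48, p=4):
    """Fixed-window estimates: the most recent `window` observations at each end date."""
    x = np.asarray(x, dtype=float)
    return np.array([largest_ar_root(x[end - window:end], p)
                     for end in range(window, len(x) + 1)])
```

With only 48 observations per window, estimates from a sketch like this will be noisy, which is consistent with the remark that such samples are small for inference about large autoregressive roots.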
With one brief exception for the samples ending near 1994, the 90% confidence intervals contain a unit root, and the median-unbiased estimate, while variable, exceeds one almost as often as it is less than one. Notably, the median-unbiased estimate exceeds one early in the sample, for 12-year periods ending in 1972 through 1976, and late in the sample, for 12-year periods ending in 1997 through 2000.
2.2 RELATION BETWEEN PERSISTENCE AND THE LEVEL OF INFLATION
The results in Figure 2 suggest that there will be no particular relation between the level of inflation and its persistence as measured by the rolling median-unbiased AR root, because this root is estimated to be essentially one throughout this sample. This is in fact the case; the correlation between the running mean of inflation and the rolling estimate of the largest AR root in Figure 2 over the same 12 years is -0.035.
Figure 1 RECURSIVE MEDIAN-UNBIASED ESTIMATE AND 90% CONFIDENCE INTERVAL FOR LARGEST AR ROOT
Figure 2 ROLLING MEDIAN-UNBIASED ESTIMATE AND 90% CONFIDENCE INTERVAL FOR LARGEST AR ROOT
2.3 INSTABILITY OF INFLATION AT HIGHER FREQUENCIES
Cogley and Sargent examine instability of inflation dynamics, both short- and long-run, via spectral estimates implied by their time-varying VAR. Here, I consider a more tightly parametrized approach and ask whether there appears to have been a break in the parameters of a univariate AR(5) model of the inflation rate. This is readily examined using the Quandt likelihood-ratio (or "sup-Wald") test for parameter stability. Although this test is designed around a single break, it is powerful against slow parameter evolution and multiple breaks as well. A technical issue is that the critical values need to hold when the largest root is one or nearly so; I handle this by using the critical values appropriate if the largest root is in fact one, taken from Banerjee, Lumsdaine, and Stock (1992), rather than the critical values appropriate when the largest root is well less than one. The test, implemented with conventional 15% trimming, fails to reject the hypothesis of parameter stability at the 10% significance level. However, using CPI inflation and different lag specifications can yield a significant break at the 10%, but not 5%, level, with the estimated break date in 1981. This evidence suggests that, on the whole, the inflation process has been stable, although there might have been some changes in its short-run dynamics between the first and the second half of the sample.
2.4 INSTABILITY OF THE PHILLIPS CURVE
Whether the backward-looking Phillips curve, interpreted as the relation between inflation, its lags, and current and past values of the unemployment rate, is unstable has attracted much attention. The evidence I provide here is borrowed from Staiger, Stock, and Watson (2001), who investigate the stability of the backward-looking Phillips relation of the type investigated by Gordon (1997, 1998). A subdebate in this area has been whether the natural rate of unemployment should be estimated as the low-frequency component of the unemployment rate [the approach advocated by Hall (1999) and adopted by Cogley and Sargent] or whether it should be estimated off an estimated drift in the intercept of an empirical Phillips curve [the approach adopted by King, Stock, and Watson (1995), Gordon (1997, 1998), Staiger, Stock, and Watson (1997), and others]. Staiger, Stock, and Watson (2001) adopt Hall's and Cogley and Sargent's approach and estimate the natural rate by applying a low-pass filter to the unemployment rate. Because the natural rate is estimated using only the univariate unemployment rate, it is possible to test separately for drift in the intercept of the Phillips curve and for drift in the slope coefficient; the NAIRU is the sum of the estimated natural rate and the rescaled estimated intercept drift. Thus the NAIRU and the natural rate are separately identified. Their conclusion is that in fact these two series are very close to each other empirically, typically within a few tenths of a percentage point of unemployment. The hypothesis that there is no intercept drift in the Phillips curve, specified as the deviation of the unemployment rate from its univariate long-run trend, cannot be rejected at the 10% significance level. In practice, then, there appears to be little difference between estimates of the natural rate based on Hall's and Cogley and Sargent's idea of the long-run trend in the unemployment rate and the alternative approach of estimating the time-varying NAIRU from intercept drift in the Phillips curve. Staiger, Stock, and Watson (2001) also test for drift in the slope of the Phillips curve and cannot reject the null that the slope is stable. Another way to see whether the Phillips curve has been stable is to see how it has performed for forecasting. Interpreted broadly, the Phillips relation links changes in the rate of inflation to economic activity, of which the unemployment rate is but one measure. In their comparisons of models for forecasting inflation, Stock and Watson (1999, 2001) consider several versions of the backward-looking Phillips curve, each based on different activity measures. They conclude that several activity measures have produced reliable and useful inflation forecasts, at least as measured by pseudo-out-of-sample forecast comparisons with benchmark autoregressive models.
These include a composite index of real economic activity constructed using a large number of income and output measures, as well as simpler single measures such as the rate of capacity utilization. Based on these broader measures of output, the backward-looking Phillips curve has been a reasonably reliable and stable predictive relation over the past three decades.
3. Why Do the Bayesian and Frequentist Results Differ? These conclusions are quite different than Cogley and Sargent's, and the obvious question is, why? There are many differences between my methods and theirs: theirs are Bayesian and multivariate, mine are frequentist and mainly univariate. I believe, however, that there are two main sources of these differences: their prior leads them away from finding persistence, and their specification, by forcing all the time variation to
occur through the dynamics rather than through the innovation variances, confuses changes in persistence with changes in volatility. These views are informed by the recent study by Pivetta and Reis (2001), who compare the frequentist analysis of inflation persistence of the previous section, Cogley and Sargent's Bayesian method, and a more conventional time-varying parameter model of the type used by Brainard and Perry (2000). Although their analysis remains preliminary at the time of writing this comment, Pivetta and Reis' (2001) results suggest that Cogley and Sargent's importance sampling plays an important role in biasing (from a frequentist perspective) their estimates away from a unit root. This forces their posterior to have a low mean persistence, even if the true persistence (from a frequentist perspective) is quite large. The problem that Cogley and Sargent confront is a difficult one, and even among Bayesian econometricians there appears to be no consensus about the best way to place a prior on large autoregressive roots (see the special issue of Econometric Theory in 1994 on Bayesian approaches to unit-root inference and in particular the survey article by Uhlig, 1994). The problem of confounding persistence and volatility is especially important, and Cogley and Sargent recognize this issue. Their persistence measures are based on the spectrum at frequency zero, but this can change either because the persistence has changed or because the entire spectrum has shifted, that is, the volatility of the process has changed. One does not need fancy tests to see that the volatility of the inflation process has changed greatly over the postwar period: the 1960s and 1990s were times of quiescent low inflation, the 1970s and early 1980s, of volatile high inflation. Because the integral of the spectrum is the variance, on using the height of the spectrum as a measure of persistence, quiescence becomes low persistence, and volatility becomes high persistence.
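The confounding of persistence and volatility can be made concrete with a small numerical sketch. It assumes a stationary AR(1) process, which is an illustrative simplification and not Cogley and Sargent's model; the function names and parameter values are hypothetical. For an AR(1) with coefficient rho and innovation variance sigma2, the spectral density at frequency zero is sigma2 / (2*pi*(1 - rho)^2) and the variance is sigma2 / (1 - rho^2).

```python
import numpy as np

def ar1_spectrum_at_zero(rho, sigma2):
    """Spectral density at frequency zero of a stationary AR(1) (illustrative model)."""
    return sigma2 / (2.0 * np.pi * (1.0 - rho) ** 2)

def ar1_variance(rho, sigma2):
    """Unconditional variance of a stationary AR(1)."""
    return sigma2 / (1.0 - rho ** 2)

rho = 0.9                      # same autoregressive root in both regimes (assumed value)
quiet_sigma2, volatile_sigma2 = 1.0, 4.0   # only the innovation variance differs

raw_quiet = ar1_spectrum_at_zero(rho, quiet_sigma2)
raw_volatile = ar1_spectrum_at_zero(rho, volatile_sigma2)

# Dividing by the variance removes the pure scale effect of volatility.
norm_quiet = raw_quiet / ar1_variance(rho, quiet_sigma2)
norm_volatile = raw_volatile / ar1_variance(rho, volatile_sigma2)

raw_ratio = raw_volatile / raw_quiet
```

In this sketch the raw height of the spectrum at zero scales one-for-one with the innovation variance even though the autoregressive root is unchanged, while the variance-normalized height is identical across the two regimes.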
4. Implications and Conclusions The evidence in Figures 1 and 2 suggests that inflation has been highly persistent for the past three decades, and stably so. My interpretation of the widespread view—that of Brainard, Perry, Taylor, Cogley, and Sargent—is that this confuses volatility with persistence. Inflation was low and stable in the 1960s and 1990s, but this does not mean that it was low and mean-reverting. Whether or not the persistence of inflation has evolved, one implication of this discussion is that we need additional investigations of the statistical properties of Cogley and Sargent's method before adopting it for widespread use as a tool for data description.
Finally, let me turn to Taylor's warning, for here I agree with Cogley and Sargent. The fact is that many monetary economists believe inflation to have become less persistent, and this view must be reckoned with. To the extent that this view is held (correctly or not) by policymakers or advisors and to the extent that it encourages a revisionist perspective on the natural rate, then it does raise concerns about inadvertently repeating the inflationary mistakes of the past.
ADDITIONAL REFERENCES
Banerjee, A., R. L. Lumsdaine, and J. H. Stock. (1992). Recursive and sequential tests of the unit root and trend break hypotheses: Theory and international evidence. Journal of Business and Economic Statistics 10:271-288.
Brainard, W. C., and G. L. Perry. (2000). Making policy in a changing world. In Economic Events, Ideas, and Policies: The 1960s and After, G. L. Perry and J. Tobin (eds.). Brookings Institution Press.
Gordon, R. J. (1997). The time-varying NAIRU and its implications for economic policy. Journal of Economic Perspectives 11(1):11-32.
Gordon, R. J. (1998). Foundations of the Goldilocks economy: Supply shocks and the time-varying NAIRU. Brookings Papers on Economic Activity 1998(2):297-333.
King, R. G., J. H. Stock, and M. W. Watson. (1995). Temporal instability of the unemployment-inflation relation. Federal Reserve Bank of Chicago, Economic Perspectives May/June:2-12.
Pivetta, F., and R. Reis. (2001). The persistence of inflation in the U.S. Manuscript, Dept. of Economics, Harvard University.
Staiger, D., J. H. Stock, and M. W. Watson. (1997). The NAIRU, unemployment, and monetary policy. Journal of Economic Perspectives 11 (Winter):33-51.
Staiger, D., J. H. Stock, and M. W. Watson. (2001). Prices, wages and the U.S. NAIRU in the 1990s. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 8320.
Stock, J. H. (1991). Confidence intervals for the largest autoregressive root in U.S. economic time series. Journal of Monetary Economics 28:435-460.
Stock, J. H., and M. W. Watson. (1999). Forecasting inflation. Journal of Monetary Economics 44:293-335.
Stock, J. H., and M. W. Watson. (2001). Forecasting output and inflation: The role of asset prices. Cambridge, MA: National Bureau of Economic Research. NBER Working Paper 8180.
Uhlig, H. (1994). What macroeconomists should know about unit roots: A Bayesian perspective. Econometric Theory 10:645-671.
Discussion Tom Sargent responded to the discussants by saying that his view of events differed fundamentally from theirs. While they believed in conditional heteroscedasticity of shocks, he believed in changing decision rules. He explained that the authors were inspired by a graph of inflation over three centuries, which showed a clear break around 1970.
Rick Mishkin was sympathetic to the suggestion that what happened in the 1970s was that the Federal Reserve thought that the natural rate of unemployment was lower than it actually was. He was not so worried that the Solow-Tobin test would cause problems in the future, as advances since the 1970s in the understanding of the natural-rate hypothesis and in time-series econometrics are unlikely to go away. He also noted that there had been a substantial restructuring of monetary institutions since the 1970s, including increased central-bank independence and an increased emphasis on price stability. He was most worried about recidivism occurring because of policymakers underestimating the natural rate of unemployment, noting the wide confidence intervals on Jim Stock's estimates of the natural rate. Mishkin suggested that inflation targeting was the way to avoid a repeat of the 1970s. Ken Rogoff noted that the view that Japan was stuck in a liquidity trap was a very powerful one in the policy literature. As a result, many policy economists indeed believe that output growth may be harmed if the rate of inflation wanders too close to zero. He also remarked that in countries other than the United States, there had obviously been a lot of institutional change since the 1970s, so it was hard to see how monetary policy could have remained stable. Mark Gertler remarked that he and Richard Clarida had constructed measures of core inflation for Germany similar in spirit to those of Cogley and Sargent, using long-horizon forecasts to get core inflation. The striking difference between the United States and Germany was that although Germany suffered the same shocks as the United States, and policymakers had the same reasons to be confused, core inflation was flat and stationary in Germany. This finding suggested that there was something different about U.S. monetary policy in the 1970s. 
Gertler raised the possibility that the shift to nonborrowed reserves in 1979-1982 could have allowed shocks to have a greater impact, although the policy shift could also have been cover for an attempt to raise interest rates. Chris Sims explained that the mere fact that he believed monetary policy was stable did not mean that he believed it was optimal. On recidivism, he believed that there were dangers in the inertia of orthodoxy. Sargent agreed with Sims on the problems of identification in VARs. He said the problem was more profound than just partitioning contemporaneous correlations, as agents could have more information in their histories than was revealed by the histories of variables in the VAR. This fact generates time aggregation problems.