Journal of Monetary Economics 56 (2009) 137–153
Revisiting the supply side effects of government spending

George-Marios Angeletos (Department of Economics, MIT, 50 Memorial Drive, E52-251, Cambridge, MA 02142, USA, and NBER; corresponding author)
Vasia Panousi (Federal Reserve Board, USA)
Article history: Received 8 November 2007; received in revised form 10 December 2008; accepted 12 December 2008; available online 25 December 2008.

Abstract
We revisit the macroeconomic effects of government consumption in the neoclassical growth model when agents face uninsured idiosyncratic investment risk. Under complete markets, a permanent increase in government consumption has no long-run effect on interest rates and capital intensity, while it increases work hours due to the negative wealth effect. These results are upset once we allow for incomplete markets. The same negative wealth effect now causes a reduction in risk taking and the demand for investment. This leads to a lower risk-free rate and, under certain conditions, also to a lower capital–labor ratio and lower productivity.

© 2009 Elsevier B.V. All rights reserved.
JEL classification: E13; E62
Keywords: Fiscal policy; Government spending; Incomplete risk sharing; Entrepreneurial risk
1. Introduction

Studying the impact of government spending on macroeconomic outcomes is one of the most celebrated policy exercises within the neoclassical growth model: it is important for understanding the business-cycle implications of fiscal policy, the macroeconomic effects of wars, and the cross-section of countries. Some classics include Hall (1980), Barro (1981, 1989), Aiyagari et al. (1992), Baxter and King (1993), Braun and McGrattan (1993), and McGrattan and Ohanian (1999, 2006). These studies have all maintained the convenient assumption of complete markets, abstracting from the possibility that agents' saving and investment decisions, and hence their reaction to changes in fiscal policy, may crucially depend on the extent of risk sharing within the economy.

This paper contributes towards filling this gap. It revisits the macroeconomic effects of government consumption within an incomplete-markets variant of the neoclassical growth model. The key deviation we make from the standard paradigm is the introduction of uninsurable idiosyncratic risk in production and investment. All other ingredients of our model are the same as in the canonical neoclassical growth model:
We are grateful to the editor, Robert King, and an anonymous referee for their feedback and suggestions on how to improve the paper. We also thank Olivier Blanchard, Chris Carroll, Edouard Challe, Ricardo Caballero, Mike Golosov, Ricardo Reis, Iván Werning and seminar participants at MIT, the 2007 conference on macroeconomic heterogeneity at the Paris School of Economics, and the 2007 SED annual meeting for useful comments. Angeletos thanks the Alfred P. Sloan Foundation for a Sloan Research Fellowship that supported this work. The views presented are solely those of the authors and do not necessarily represent those of the Board of Governors of the Federal Reserve System or its staff members.
Corresponding author. Tel.: +1 617 452 3859; fax: +1 617 253 1330.
E-mail addresses: [email protected] (G.-M. Angeletos), [email protected] (V. Panousi).
0304-3932/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.010
firms operate neoclassical constant-returns-to-scale technologies, households have standard CRRA/CEIS preferences, and markets are competitive.

The focus on idiosyncratic production/investment risk is motivated by two considerations. First, this friction is empirically relevant. This is obvious for less developed economies. But even in the United States, privately owned firms account for nearly half of aggregate production and employment. Furthermore, the typical investor—the median rich household—holds a very undiversified portfolio, more than one half of which is allocated to private equity.1 And second, as our paper shows, this friction upsets some key predictions of the standard neoclassical paradigm.

In the standard paradigm, the steady-state values of the capital–labor ratio, productivity (output per work hour), the wage rate, and the interest rate, are all pinned down by the equality of the marginal product of capital with the discount rate in preferences. As a result, any change in the level of government consumption, even if it is permanent, has no effect on the long-run values of these variables.2 On the other hand, because higher consumption for the government means lower net-of-taxes wealth for the households, a permanent increase in government consumption raises labor supply. It follows that employment and, by implication, output and investment increase. But the long-run levels of capital intensity and productivity remain unchanged.

The picture is quite different once we allow for incomplete markets. The same wealth effect that, in response to an increase in government consumption, stimulates labor supply in the standard paradigm, now also discourages investment. This is simply because risk taking, and hence investment, is sensitive to wealth. We thus find very different long-run effects. First, a permanent increase in government consumption necessarily reduces the risk-free interest rate.
And second, unless the elasticity of intertemporal substitution is low enough, it also reduces the capital–labor ratio, productivity, and wages.

The effect on the risk-free rate is an implication of the precautionary motive: a higher level of consumption for the government implies a lower aggregate level of wealth for the households, which is possible in steady state only with a lower interest rate. If investment were risk-free, a lower interest rate would immediately translate to a higher capital–labor ratio. But this is not the case in our model precisely because market incompleteness introduces a wedge between the risk-free rate and the marginal product of capital—this wedge is simply the risk premium on investment. Furthermore, because of diminishing absolute risk aversion, this wedge is higher the lower the wealth of the households. It follows that the negative wealth effect of higher government consumption raises the risk premium on investment and can thereby lead to a reduction in the capital–labor ratio, despite the reduction in the interest rate. We show that a sufficient condition for this to be the case is that the elasticity of intertemporal substitution is sufficiently high relative to the income share of capital—a condition easily satisfied for plausible calibrations of the model.

Turning to employment and output, there are two opposing effects. On the one hand, as with complete markets, the negative wealth effect on labor supply contributes towards higher employment and output. On the other hand, unlike complete markets, the reduction in capital intensity, productivity, and wages contributes towards lower employment and output. Depending on the income and wage elasticities of labor supply, either of the two effects can dominate.

The deviation from the standard paradigm is significant, not only qualitatively, but also quantitatively. For our preferred parametrizations of the model, the following hold.
First, the elasticity of intertemporal substitution is comfortably above the critical value that suffices for an increase in government consumption to reduce the long-run levels of the capital–labor ratio, productivity, and wages. Second, a 1% increase in government spending under incomplete markets has the same impact on capital intensity and labor productivity as a 0.5–0.6% increase in capital-income taxation under complete markets. Third, these effects mitigate, but do not fully offset, the wealth effect on labor supply. Finally, the welfare consequences are non-trivial: the welfare cost of a permanent 1% increase in government consumption is three times larger under incomplete markets than under complete markets.

The main contribution of the paper is thus to highlight how wealth effects on investment due to financial frictions can significantly modify the supply side channel of fiscal policy. In our model, these wealth effects emerge from idiosyncratic risk along with diminishing absolute risk aversion; in other models, they could emerge from borrowing constraints. Also, such wealth effects are relevant for both neoclassical and Keynesian models. In this paper we follow the neoclassical tradition because this clarifies best our contribution: whereas wealth effects have been central to the neoclassical approach with regard to labor supply, they have been muted with regard to investment.

To the best of our knowledge, this paper is the first to study the macroeconomic effects of government consumption in an incomplete-markets version of the neoclassical growth paradigm that allows for uninsurable investment risk. A related, but different, exercise is conducted in Heathcote (2005) and Challe and Ragot (2007). These papers study deviations from Ricardian equivalence in Bewley-type models like Aiyagari's (1994), where borrowing constraints limit the ability of agents to smooth consumption intertemporally.
In our paper, instead, deviations from Ricardian equivalence are not an issue: our model allows households to freely trade a riskless bond, thus ensuring that the timing of taxes and the level of debt have no effect on allocations, and instead focuses on wealth effects on investment due to incomplete risk sharing.
1 See Quadrini (2000), Gentry and Hubbard (2000), Carroll (2000), and Moskowitz and Vissing-Jørgensen (2002). Also note that idiosyncratic investment risks need not be limited to private entrepreneurs; they may also affect educational and occupational choices, or the investment decisions that CEOs make on behalf of public corporations. On this latter point, see Panousi and Papanikolaou (2008) for some supportive evidence.
2 This, of course, presumes that the change in government consumption is financed with lump-sum taxes. The efficiency or redistributive considerations behind optimal taxation are beyond the scope of this paper.
The particular framework we employ in this paper is a continuous-time variant of the one introduced in Angeletos (2007). That paper showed how idiosyncratic capital-income risk can be accommodated within the neoclassical growth model without loss of tractability, studied the impact of this risk on aggregate saving, and contrasted it with the impact of labor-income risk in Bewley-type models (Aiyagari, 1994; Huggett, 1997; Krusell and Smith, 1998). Other papers that introduce idiosyncratic investment or entrepreneurial risk in the neoclassical growth model include Angeletos and Calvet (2005, 2006), Buera and Shin (2007), Cagetti and De Nardi (2006), Covas (2006), and Meh and Quadrini (2006).3 The novelty of our paper is to study the implications for fiscal policy in such an environment.

Panousi (2008) studies the macroeconomic effects of capital taxation within a similar environment as ours. That paper shows that, when agents face idiosyncratic investment risk, an increase in capital taxation may paradoxically stimulate more investment in general equilibrium. This provides yet another example of how the introduction of idiosyncratic investment risk can upset some important results of the neoclassical growth model.

The rest of the paper is organized as follows. Section 2 introduces the basic model, which fixes labor supply so as to focus on the most novel results of the paper. Section 3 characterizes its equilibrium and Section 4 analyzes its steady state. Section 5 examines the steady-state effects of government consumption on the interest rate and capital accumulation. Section 6 considers three extensions that endogenize labor supply. Section 7 examines the dynamic response of the economy to a permanent change in government consumption. Section 8 concludes. All the formal results are explained in the main text, but the complete proofs are relegated to the Online Appendix.

2. The basic model

Time is continuous, indexed by $t \in [0, \infty)$.
There is a continuum of infinitely lived households, indexed by $i$ and distributed uniformly over $[0, 1]$. Each household is endowed with one unit of labor, which it supplies inelastically in a competitive labor market. Each household also owns and runs a firm, which employs labor in the competitive labor market but can only use the capital stock invested by the particular household.4 Households cannot invest in other households' firms and cannot otherwise diversify away from the shocks hitting their firms, but can freely trade a riskless bond. Finally, all uncertainty is purely idiosyncratic, and hence all aggregates are deterministic.

2.1. Households, firms, and idiosyncratic risk

The financial wealth of household $i$, denoted by $x_t^i$, is the sum of its holdings in private capital, $k_t^i$, and the riskless bond, $b_t^i$:
$$x_t^i = k_t^i + b_t^i. \qquad (1)$$
The evolution of $x_t^i$ is given by the household budget:
$$dx_t^i = d\pi_t^i + [R_t b_t^i + \omega_t - T_t - c_t^i]\,dt, \qquad (2)$$
where $d\pi_t^i$ is the household's capital income (i.e., the profits it enjoys from the private firm it owns), $R_t$ is the interest rate on the riskless bond, $\omega_t$ is the wage rate, $T_t$ is the lump-sum tax, and $c_t^i$ is the household's consumption. Finally, the familiar no-Ponzi game condition is also imposed.

Whereas the sequences of prices and taxes are deterministic (due to the absence of aggregate risk), firm profits, and hence household capital income, are subject to undiversified idiosyncratic risk. In particular,
$$d\pi_t^i = [F(k_t^i, n_t^i) - \omega_t n_t^i - \delta k_t^i]\,dt + \sigma k_t^i\,dz_t^i. \qquad (3)$$
Here, $n_t^i$ is the amount of labor the firm hires in the competitive labor market, $F$ is a constant-returns-to-scale neoclassical production function, and $\delta$ is the mean depreciation rate. Idiosyncratic risk is introduced through $dz_t^i$, a standard Wiener process that is i.i.d. across agents and across time. This can be interpreted either as a stochastic depreciation shock or as a stochastic productivity shock, the key element being that it generates risk in the return to capital. The scalar $\sigma$ measures the amount of undiversified idiosyncratic risk and can be viewed as an index of market incompleteness, with higher $\sigma$ corresponding to a lower degree of risk sharing (and $\sigma = 0$ corresponding to complete markets). Finally, without serious loss of generality, we assume a Cobb–Douglas specification for the technology: $F(k, n) = k^\alpha n^{1-\alpha}$, with $\alpha \in (0, 1)$.5

Turning to preferences, we assume an Epstein–Zin specification with constant elasticity of intertemporal substitution (CEIS) and constant relative risk aversion (CRRA). Given a consumption process, the utility process is defined by the
3 Related are also the earlier contributions by Leland (1968), Sandmo (1970), Obstfeld (1994), and Acemoglu and Zilibotti (1997).
4 One can think of a household as a couple, with the wife running the family business and the husband working in the competitive labor market (or vice versa). The key assumption, of course, is only that the value of the labor endowment of each household is pinned down by the competitive wage, and is not subject to idiosyncratic risk.
5 The characterization of equilibrium and the proof of the existence of the steady state extend to any neoclassical production function; it is only the proof of the uniqueness of the steady state that uses the Cobb–Douglas specification.
solution to the following integral equation:
$$U_t = E_t \int_t^\infty z(c_s, U_s)\,ds, \qquad (4)$$
where
$$z(c, U) \equiv \frac{\beta}{1 - 1/\theta}\left[\frac{c^{1-1/\theta}}{\big((1-\gamma)U\big)^{(\gamma - 1/\theta)/(1-\gamma)}} - (1-\gamma)U\right]. \qquad (5)$$
Here, $\beta > 0$ is the discount rate, $\gamma > 0$ is the coefficient of relative risk aversion, and $\theta > 0$ is the elasticity of intertemporal substitution.6 Standard expected utility is nested with $\gamma = 1/\theta$. We find it useful to allow $\theta \neq 1/\gamma$ in order to clarify that the qualitative properties of the steady state depend crucially on the elasticity of intertemporal substitution rather than the coefficient of relative risk aversion (which in turn also guides our preferred parameterizations of the model). However, none of our results rely on allowing $\theta \neq 1/\gamma$. A reader who feels uncomfortable with the Epstein–Zin specification can therefore ignore it, assume instead standard expected utility, and simply replace $\gamma$ with $1/\theta$ (or vice versa) in all the formulas that follow.

2.2. Government

The government consumes output at the rate $G_t$. Government spending is deterministic, it is financed with lump-sum taxation, and it does not affect either the household's utility from private consumption or the production of the economy. The government budget constraint is given by
$$dB_t^g = [R_t B_t^g + T_t - G_t]\,dt, \qquad (6)$$
where $B_t^g$ denotes the level of government assets (i.e., minus the level of government debt). Finally, a no-Ponzi game condition is imposed to rule out explosive debt accumulation.

2.3. Aggregates and equilibrium definition

The initial position of the economy is given by the cross-sectional distribution of $(k_0^i, b_0^i)$. Households choose plans $\{c_t^i, n_t^i, k_t^i, b_t^i\}_{t \in [0,\infty)}$, contingent on the history of their idiosyncratic shocks, and given the price sequence and the government policy, so as to maximize their lifetime utility. Idiosyncratic risk, however, washes out in the aggregate. We thus define an equilibrium as a deterministic sequence of prices $\{\omega_t, R_t\}_{t \in [0,\infty)}$, policies $\{G_t, T_t\}_{t \in [0,\infty)}$, and macroeconomic variables $\{C_t, K_t, Y_t\}_{t \in [0,\infty)}$, along with a collection of individual contingent plans $(\{c_t^i, n_t^i, k_t^i, b_t^i\}_{t \in [0,\infty)})_{i \in [0,1]}$, such that the following conditions hold: (i) given the sequences of prices and policies, the plans are optimal for the households; (ii) the labor market clears, $\int_i n_t^i = 1$, in all $t$; (iii) the bond market clears, $\int_i b_t^i + B_t^g = 0$, in all $t$; (iv) the government budget is satisfied in all $t$; and (v) the aggregates are consistent with individual behavior, $C_t = \int_i c_t^i$, $K_t = \int_i k_t^i$, and $Y_t = \int_i F(k_t^i, n_t^i)$, in all $t$. (Throughout, we let $\int_i$ denote the mean in the cross-section of the population.)
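The exact aggregation in condition (v) relies on constant returns to scale together with firm employment being linear in own capital, so that individual outputs average out to $F(K_t, 1)$. The following sketch checks this numerically; it is an illustration only, not from the paper, and the capital distribution and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.36  # capital income share (illustrative value)

def F(k, n):
    # Cobb-Douglas technology with constant returns: F(k, n) = k^alpha * n^(1-alpha)
    return k**alpha * n**(1 - alpha)

# Arbitrary cross-section of household capital holdings.
k = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)
K = k.mean()          # aggregate capital (mean over the unit continuum)

# Under CRS, optimal employment is linear in own capital: n_i proportional to k_i.
# Normalizing so that aggregate labor equals 1 (labor-market clearing):
n = k / K             # mean(n) = 1 by construction

Y_individual = F(k, n).mean()   # average of individual firm outputs
Y_aggregate = F(K, 1.0)         # output of the aggregate technology

assert abs(Y_individual - Y_aggregate) < 1e-10
```

The equality is exact up to floating-point error: with $n_i = k_i / K$, each firm produces $k_i K^{\alpha-1}$, whose cross-sectional mean is $K^\alpha = F(K, 1)$ regardless of the capital distribution.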
3. Equilibrium

In this section we characterize the equilibrium of the economy. We first solve for a household's optimal plan for given sequences of prices and policies. We then aggregate across households and derive the general-equilibrium dynamics.

3.1. Individual behavior

Since employment is chosen after the capital stock has been installed and the idiosyncratic shock has been observed, optimal employment maximizes profits state by state. By constant returns to scale, optimal firm employment and profits are linear in own capital:
$$n_t^i = \bar n(\omega_t) k_t^i \quad \text{and} \quad d\pi_t^i = \bar r(\omega_t) k_t^i\,dt + \sigma k_t^i\,dz_t^i, \qquad (7)$$
where
$$\bar n(\omega) \equiv \arg\max_n\,[F(1, n) - \omega n] \quad \text{and} \quad \bar r(\omega) \equiv \max_n\,[F(1, n) - \omega n] - \delta.$$
Here, $\bar r_t \equiv \bar r(\omega_t)$ is the household's expectation of the return to its capital prior to the realization of the idiosyncratic shock $z_t^i$, as well as the mean of the realized returns in the cross-section of firms. An analogous interpretation applies to $\bar n_t \equiv \bar n(\omega_t)$.

The key result here is that households face risky, but linear, returns to their capital. To see how this translates to linearity of wealth in assets, let $h_t$ denote the present discounted value of future labor income net of taxes, a.k.a.

6 To make sure that (4) defines a preference ordering over consumption lotteries, one must establish existence and uniqueness of the solution to the integral equation (4); see Duffie and Epstein (1992).
human wealth:
$$h_t = \int_t^\infty e^{-\int_t^s R_j\,dj}\,(\omega_s - T_s)\,ds. \qquad (8)$$
Next, define effective wealth as the sum of financial and human wealth:
$$w_t^i \equiv x_t^i + h_t = k_t^i + b_t^i + h_t. \qquad (9)$$
It follows that the evolution of effective wealth can be described by
$$dw_t^i = [\bar r_t k_t^i + R_t (b_t^i + h_t) - c_t^i]\,dt + \sigma k_t^i\,dz_t^i. \qquad (10)$$
The first term on the right-hand side of (10) measures the expected rate of growth in the household's effective wealth; the second term captures the impact of idiosyncratic risk. The linearity of budgets, together with the homotheticity of preferences, ensures that, for given prices and policies, the household's consumption-saving problem reduces to a tractable homothetic problem as in Samuelson's and Merton's classic portfolio analysis. It then follows that the optimal policy rules are linear in wealth, as shown in the next proposition.

Proposition 1. Let $\{\omega_t, R_t\}_{t \in [0,\infty)}$ and $\{G_t, T_t\}_{t \in [0,\infty)}$ be equilibrium price and policy sequences. Then, equilibrium consumption, investment and bond holdings for household $i$ are given by the following:
$$c_t^i = m_t w_t^i, \quad k_t^i = \phi_t w_t^i, \quad \text{and} \quad b_t^i = (1 - \phi_t) w_t^i - h_t, \qquad (11)$$
where $\phi_t$, the fraction of effective wealth invested in capital, is given by
$$\phi_t = \frac{\bar r_t - R_t}{\gamma \sigma^2}, \qquad (12)$$
while $m_t$, the marginal propensity to consume out of effective wealth, satisfies the recursion
$$\frac{\dot m_t}{m_t} = m_t + (\theta - 1)\rho_t - \theta\beta, \qquad (13)$$
with $\rho_t \equiv \phi_t \bar r_t + (1 - \phi_t) R_t - \frac{1}{2}\gamma \phi_t^2 \sigma^2$ denoting the risk-adjusted return to saving.

Condition (12) simply says that the fraction of wealth invested in the risky asset is increasing in the risk premium $\bar r_t - R_t$ and decreasing in risk aversion $\gamma$ and the amount of risk $\sigma$. Condition (13) is essentially the Euler condition: it describes the growth rate of the marginal propensity to consume as a function of the anticipated path of risk-adjusted returns to saving. Whether higher risk-adjusted returns increase or reduce the marginal propensity to consume depends on the elasticity of intertemporal substitution. To see this more clearly, note that in steady state this condition reduces to $m = \theta\beta - (\theta - 1)\rho$, so that higher $\rho$ decreases $m$ if and only if $\theta > 1$; that is, a higher risk-adjusted return to saving increases the fraction of savings out of effective wealth if and only if the EIS is higher than one. This is due to the familiar tension between the income and substitution effects implied by an increase in the rate of return.
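The formulas in Proposition 1 are simple enough to evaluate directly. The sketch below computes the portfolio share from (12), the risk-adjusted return $\rho$, and the steady-state marginal propensity to consume $m = \theta\beta - (\theta - 1)\rho$; all parameter values are assumptions chosen for illustration, not the paper's calibration.

```python
# Illustrative parameter values (assumptions, not the paper's calibration).
gamma = 2.0    # relative risk aversion
theta = 1.5    # elasticity of intertemporal substitution (EIS)
beta = 0.04    # discount rate
sigma = 0.3    # idiosyncratic capital-income risk
r_bar = 0.08   # expected return to private capital
R = 0.03       # risk-free rate

# Equation (12): fraction of effective wealth invested in private capital.
phi = (r_bar - R) / (gamma * sigma**2)

# Risk-adjusted return to saving (definition below equation (13)).
rho = phi * r_bar + (1 - phi) * R - 0.5 * gamma * phi**2 * sigma**2

# Steady-state marginal propensity to consume, from setting mdot = 0 in (13).
m = theta * beta - (theta - 1) * rho

print(f"phi = {phi:.3f}, rho = {rho:.4f}, m = {m:.4f}")
```

Note that $\rho$ lies between $R$ and $\bar r$ net of the risk adjustment, and that with $\theta > 1$ a higher $\rho$ would lower $m$, consistent with the discussion of condition (13).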
3.2. General equilibrium

Because individual consumption, saving and investment are linear in individual wealth, aggregates at any point in time do not depend on the extent of wealth inequality at that time. As a result, the aggregate equilibrium dynamics can be described with a low-dimensional recursive system.

Define $f(K) \equiv F(K, 1) = K^\alpha$ as the production function in intensive form. From Proposition 1, the equilibrium ratio of capital to effective wealth and the equilibrium risk-adjusted return to savings are identical across agents and can be expressed as functions of the current capital stock and risk-free rate: $\phi_t = \phi(K_t, R_t)$ and $\rho_t = \rho(K_t, R_t)$, where
$$\phi(K, R) \equiv \frac{1}{\gamma\sigma^2}\big(f'(K) - \delta - R\big) \quad \text{and} \quad \rho(K, R) \equiv R + \frac{1}{2\gamma\sigma^2}\big(f'(K) - \delta - R\big)^2.$$
Similarly, the wage is given by $\omega_t = \omega(K_t)$, where $\omega(K) \equiv f(K) - f'(K)K = (1 - \alpha)f(K)$. Using these facts, aggregating the policy rules of the agents, and imposing market clearing for the risk-free bond, we arrive at the following characterization of the general equilibrium of the economy.
Proposition 2. In equilibrium, the aggregate dynamics satisfy the following system:
$$\dot K_t = f(K_t) - \delta K_t - C_t - G_t, \qquad (14)$$
$$\frac{\dot C_t}{C_t} = \theta(\rho_t - \beta) + \frac{1}{2}\gamma\sigma^2\phi_t^2, \qquad (15)$$
$$\dot H_t = R_t H_t - \omega_t + G_t, \qquad (16)$$
$$K_t = \frac{\phi_t}{1 - \phi_t} H_t, \qquad (17)$$
with $\omega_t = \omega(K_t)$, $\phi_t = \phi(K_t, R_t)$, and $\rho_t = \rho(K_t, R_t)$.

This result has a simple interpretation.7 Condition (14) is the resource constraint of the economy; it follows from aggregating budgets across all households and the government, imposing labor- and bond-market clearing, and using the linearity of individual firm employment to individual capital together with constant returns to scale, to get $Y_t = \int_i F(k_t^i, n_t^i) = F(\int_i k_t^i, \int_i n_t^i) = F(K_t, 1)$. Condition (15) is the aggregate Euler condition for the economy; it follows from aggregating consumption and wealth across agents, together with the optimality condition (13) for the marginal propensity to consume. Condition (16) expresses the evolution of the present value of aggregate net-of-taxes labor income in recursive form; it follows from the definition of human wealth combined with the intertemporal government budget, which imposes that the present value of taxes equals the present value of government consumption. Finally, condition (17) represents market clearing in the bond market; more precisely, it follows from aggregating bond holdings and investment across agents to get $B_t = (1 - \phi_t)W_t - H_t$ and $K_t = \phi_t W_t$, using the latter to replace $W_t$ in the former, and imposing $B_t = 0$.

This system characterizes the equilibrium dynamics of the economy under both complete and incomplete markets. In particular, conditions (14), (16) and (17) are exactly the same under either market structure; the key differences between complete and incomplete markets rest in the Euler condition (15) and in the relation between the risk-adjusted return $\rho_t$, the risk-free rate $R_t$, and the marginal product of capital $f'(K_t) - \delta$.

When $\sigma = 0$ (complete markets), arbitrage imposes that $R_t = f'(K_t) - \delta = \rho_t$ and the Euler condition reduces to its familiar complete-market version, $\dot C_t / C_t = \theta(R_t - \beta)$. When instead $\sigma > 0$ (incomplete markets), there are two important changes. First, the precautionary motive for saving introduces a positive drift in consumption growth, represented by the term $\frac{1}{2}\gamma\sigma^2\phi_t^2$ in the Euler condition (15). And second, the fact that investment is subject to undiversifiable idiosyncratic risk introduces a wedge between the risk-free rate and the marginal product of capital, so that $R_t < \rho_t < f'(K_t) - \delta$. It is worth noting here that the first effect is also shared by Aiyagari (1994) and other Bewley-type models that consider labor-income risk, whereas the second effect relies on the presence of capital-income risk.

Finally, note that condition (17) can be solved for $R_t$ as a function of the contemporaneous $(K_t, H_t)$, so that the equilibrium dynamics of the economy reduce to a simple three-dimensional ODE system in $(C_t, K_t, H_t)$. Indeed, the equilibrium dynamics can be approximated with a simple shooting algorithm, similar to the one applied to the complete-markets neoclassical growth model. For any historically given $K_0$, guess some initial values $(C_0, H_0)$ and use conditions (14)–(16) to compute the entire path of $(C_t, K_t, H_t)$ for $t \in [0, T]$, for some large $T$; then iterate on the initial guess till $(C_T, K_T, H_T)$ is close enough to its steady-state value.8

In the special case of a unit EIS ($\theta = 1$), we have that $m_t = \beta$ and hence $C_t = \beta(K_t + H_t)$ for all $t$. One can then drop the Euler condition from the dynamic system and analyze the equilibrium dynamics with a simple phase diagram in the $(K, H)$ space, much like in a textbook exposition of the neoclassical growth model. Either way, this is a significant gain in tractability relative to other incomplete-markets models, where the entire wealth distribution—an infinite dimensional object—is usually a relevant state variable for aggregate equilibrium dynamics. As in Angeletos (2007), the key is that individual policy rules are linear in individual wealth, so that aggregate dynamics are invariant to the wealth distribution.
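The step of solving condition (17) for $R_t$ admits a closed form worth spelling out: since (17) gives $\phi_t = K_t/(K_t + H_t)$, combining with $\phi(K, R) = (f'(K) - \delta - R)/(\gamma\sigma^2)$ yields $R_t = f'(K_t) - \delta - \gamma\sigma^2 K_t/(K_t + H_t)$. A minimal sketch (illustrative parameter values, not the paper's calibration):

```python
# Sketch: inverting the bond-market condition (17) for the risk-free rate.
# From (17), K = phi/(1-phi) * H implies phi = K/(K+H); combined with
# phi(K,R) = (f'(K) - delta - R)/(gamma*sigma^2), this pins down R in closed form.
alpha, delta = 0.36, 0.08   # illustrative technology parameters
gamma, sigma = 2.0, 0.3     # illustrative preference/risk parameters

def fprime(K):
    # marginal product of capital for f(K) = K^alpha
    return alpha * K**(alpha - 1)

def risk_free_rate(K, H):
    phi = K / (K + H)       # capital's share of aggregate effective wealth
    return fprime(K) - delta - gamma * sigma**2 * phi

# Check that this R indeed clears the bond market via phi(K, R).
K, H = 3.0, 6.0             # arbitrary example state
R = risk_free_rate(K, H)
phi = (fprime(K) - delta - R) / (gamma * sigma**2)
assert abs(phi / (1 - phi) * H - K) < 1e-9
```

This is the substitution that reduces the system (14)–(17) to a three-dimensional ODE in $(C_t, K_t, H_t)$ inside a shooting routine.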
4. Steady state

We henceforth parameterize government spending as a fraction $g$ of aggregate output and study the steady state of the economy, that is, the fixed point of the dynamic system in Proposition 2. We now show that the steady state can be characterized as the solution to a system of two equations in $K$ and $R$. First, note that the growth rate of consumption must be zero in steady state. Setting $\dot{C}_t / C_t = 0$ in the Euler condition (15) gives
$$r = \beta - \frac{\gamma \sigma^2 \phi^2}{2\theta}. \tag{18}$$
7 This result is similar to Proposition 2 in Angeletos (2007); here the equilibrium dynamics are simplified by the use of continuous time and modified for the presence of a government.
8 This presumes that a turnpike theorem holds true in our model; we expect this to be the case at least for $\sigma$ small enough, by continuity to the complete-markets case.
G.-M. Angeletos, V. Panousi / Journal of Monetary Economics 56 (2009) 137–153
Using then the facts that $r = R + \frac{1}{2\gamma\sigma^2}\bigl(f'(K) - \delta - R\bigr)^2$ and $\phi = \frac{1}{\gamma\sigma^2}\bigl(f'(K) - \delta - R\bigr)$, we can write the above as follows:

$$f'(K) - \delta - R = \sqrt{\frac{2\theta\gamma\sigma^2(\beta - R)}{\theta + 1}}. \tag{19}$$
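For completeness, the algebra behind this step: substituting $r = R + x^2/(2\gamma\sigma^2)$ and $\phi = x/(\gamma\sigma^2)$, with $x \equiv f'(K) - \delta - R$, into (18) gives

```latex
% Derivation of (19) from (18), with x := f'(K) - \delta - R:
\begin{aligned}
R + \frac{x^2}{2\gamma\sigma^2}
  &= \beta - \frac{\gamma\sigma^2}{2\theta}\,\frac{x^2}{(\gamma\sigma^2)^2}
   = \beta - \frac{x^2}{2\theta\gamma\sigma^2},\\
\frac{x^2}{2\gamma\sigma^2}\Bigl(1 + \frac{1}{\theta}\Bigr)
  &= \beta - R
\quad\Longrightarrow\quad
x = \sqrt{\frac{2\theta\gamma\sigma^2(\beta - R)}{\theta + 1}},
\end{aligned}
```

which is exactly condition (19).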
This condition gives the combinations of $K$ and $R$ that are consistent with stationarity of aggregate consumption (equivalently, with stationarity of aggregate wealth). Second, note that the growth rate of human wealth from (16) must also be zero in steady state. From this we get that

$$H = \frac{\omega - G}{R},$$
which simply states that human wealth must equal the present value of wages net of taxes. Substituting this into the bond-market clearing condition (17), and using $\omega = (1 - \alpha) f(K)$ and $G = g f(K)$, we get the following:

$$K = \frac{\phi(K, R)}{1 - \phi(K, R)} \cdot \frac{(1 - \alpha - g) f(K)}{R}. \tag{20}$$
This condition gives the combinations of $K$ and $R$ that are consistent with stationarity of human wealth and bond-market clearing.

In any steady state, the capital stock and the risk-free rate must jointly solve Eqs. (19) and (20). In the Appendix we further show that a solution to this system exists and is unique. We thus reach the following result.

Proposition 3. The steady state exists and is unique. The steady-state levels of the capital stock and the risk-free rate are given by the solution to the system of equations (19) and (20).

To understand the determination of the steady state of our model and its relation to its complete-markets counterpart, note first that condition (18) imposes $r < \beta$. That is, the risk-adjusted return to saving must be lower than the discount rate. In particular, $r$ must be just low enough to offset the precautionary motive for saving. If the risk-adjusted return were higher than this critical level, consumption (and wealth) would increase over time without bound, which would contradict a steady state. Conversely, if the risk-adjusted return were lower than this level, consumption (and wealth) would shrink to zero, which would once again contradict a steady state. Combining this with the fact that $R < r$, we infer that the risk-free rate is also lower than the discount rate: $R < \beta$. At the same time, because $r < f'(K) - \delta$, it is unclear whether the marginal product of capital is higher or lower than the discount rate. Using these observations, along with the fact that the complete-markets steady state features $f'(K) - \delta = R = \beta$, we conclude that incomplete markets necessarily reduce the risk-free rate but can have an ambiguous effect on the capital stock.
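The two steady-state conditions can also be solved numerically. The sketch below assumes Cobb-Douglas technology and $\theta = 1$, and uses the fact that, at an intersection, (19) pins down $\phi$ directly, reducing (19)–(20) to a single equation in $R$ that can be bisected on $(0, \beta)$. Parameter values echo the baseline calibration, but the code is illustrative, not the Appendix's proof or procedure:

```python
import math

# Illustrative parameters (baseline-style); f(K) = K^alpha is an assumption.
alpha, beta, delta = 0.36, 0.042, 0.08
gamma, sigma, theta, g = 5.0, 0.30, 1.0, 0.25

def premium(R):
    # Right-hand side of (19): the steady-state risk premium f'(K) - delta - R.
    return math.sqrt(2 * theta * gamma * sigma ** 2 * (beta - R) / (theta + 1))

def K1(R):
    # Capital consistent with stationary consumption: f'(K) = delta + R + premium(R).
    return (alpha / (delta + R + premium(R))) ** (1 / (1 - alpha))

def excess(R):
    # On the K1 curve, phi = premium(R)/(gamma*sigma^2); substituting into (20)
    # gives one equation in R:  K^(1-alpha) * (1-phi) * R - phi*(1-alpha-g) = 0.
    phi = premium(R) / (gamma * sigma ** 2)
    return K1(R) ** (1 - alpha) * (1 - phi) * R - phi * (1 - alpha - g)

# excess(R) < 0 near R = 0 and > 0 near R = beta, so bisect on (0, beta).
lo, hi = 1e-6, beta - 1e-9
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if excess(mid) < 0:
        lo = mid
    else:
        hi = mid
R_star = 0.5 * (lo + hi)
K_star = K1(R_star)
K_cm = (alpha / (beta + delta)) ** (1 / (1 - alpha))  # complete-markets benchmark
```

Under these parameters the solver delivers $R^* < \beta$ and, consistent with Proposition 4 below, a capital stock below its complete-markets counterpart.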
In simple words, the precautionary motive guarantees that the interest rate is lower under incomplete markets than under complete markets, but this does not necessarily translate into a higher capital stock, because investment risk introduces a wedge between the marginal product of capital and the interest rate.

A graphical representation of the steady state helps appreciate further this tension between the precautionary motive and the risk premium in our model (and will also facilitate the comparative statics of the steady state). Let $K_1(R)$ and $K_2(R)$
[Fig. 1. A graphical representation of the steady state: the curves $K_1(R)$ and $K_2(R)$ together with the complete-markets MPK curve, the incomplete-markets steady state IM, and its complete-markets counterpart CM. Capital $K$ is on the vertical axis and the risk-free rate $R$ on the horizontal axis.]
denote the functions defined by solving, respectively, Eqs. (19) and (20) for $K$ as a function of $R$. We discuss the properties of these functions in what follows and illustrate them in Fig. 1.9

Consider first the curve $K_1(R)$. When $\sigma = 0$, condition (19) reduces to $f'(K) - \delta = R$. The complete-markets counterpart of $K_1(R)$ is therefore given by a standard curve for the marginal product of capital, represented by the curve MPK in Fig. 1. The positive risk premium introduced on investment when $\sigma > 0$ implies that the curve $K_1(R)$ lies uniformly below the curve MPK. Indeed, the distance between the two curves measures the risk premium, as captured by the right-hand side of (19). Clearly, the latter is decreasing in $R$: the higher the risk-free rate, the lower the risk premium in steady state. To understand the intuition behind this property, take for a moment the interest rate to be exogenously given. Then, an increase in $R$ would lead to an increase in the steady-state level of wealth. Because of diminishing absolute risk aversion, the increase in wealth would stimulate capital accumulation. However, because of diminishing returns to capital accumulation, the ratio of capital to wealth, i.e., the fraction $\phi$, would fall. But then the risk premium, which is given by $\frac{1}{2}\gamma\phi^2\sigma^2$, would also fall. And because wealth explodes as $R \to \beta$, while $K$ remains bounded, it follows that the risk premium must vanish as $R \to \beta$. These observations explain why the distance between the two curves falls monotonically with $R$ and vanishes as $R \to \beta$.

To recap, there are two important economic effects behind the curve $K_1(R)$. On the one hand, a higher $R$ raises the opportunity cost of capital. This effect, which is present under both complete and incomplete markets, tends to discourage investment. On the other hand, a higher $R$ is possible in steady state under incomplete markets only if aggregate wealth is higher in that steady state. This wealth effect, which is present only under incomplete markets, tends to encourage investment. Moreover, from condition (19) it is immediate that $K_1(R)$ is U-shaped, as illustrated in Fig. 1. Therefore, the opportunity-cost effect must dominate for low $R$, while the wealth effect must dominate for high $R$.10

Let us now turn to the curve $K_2(R)$. The complete-markets counterpart of $K_2(R)$ is the vertical line at $R = \beta$: as $\sigma \to 0$, $K_2(R)$ converges to this vertical line, whereas for any $\sigma > 0$, $K_2(R)$ lies to the left of it. In Lemma 1 in the Appendix we show that $K_2(R)$ is monotonically decreasing in $R$, with $K_2(R) \to +\infty$ as $R \to 0$ and $K_2(R) \to 0$ as $R \to \beta$.11 The intuition for the monotonicity of $K_2(R)$ is simple. For given $K$, and hence given $\omega$, an increase in $R$ reduces both $H$ and $\phi(K, R)$, and thereby necessarily reduces the right-hand side of (20). But then, for (20) to hold with the higher $R$, it must be that $K$ also falls, which explains why $K_2(R)$ is decreasing.

Since $K_1(R)$ and $K_2(R)$ are continuous in $R$, and using their limiting properties from above, it is clear that the two curves intersect at least once at some $R \in (0, \beta)$. But, as already mentioned, we further show in the Appendix that this intersection is in fact unique. The incomplete-markets steady state of our model is thus represented by point IM in Fig. 1, while its complete-markets counterpart is represented by point CM. For the particular economy considered in this figure, the steady-state capital stock is lower under incomplete markets than under complete markets. However, the opposite could also be true. Clearly, a sufficient condition for the steady-state capital stock to be lower than under complete markets is that the two curves intersect on the upward-sloping portion of $K_1(R)$, that is, that the wealth effect on investment due to risk aversion dominates the usual opportunity-cost effect.
The following proposition identifies a condition that is both necessary and sufficient for the capital stock to be lower than under complete markets.

Proposition 4. The steady-state level of capital is lower under incomplete markets ($\sigma > 0$) than under complete markets ($\sigma = 0$) if and only if $\theta > \phi / (2 - \phi)$.

This result, which was first reported in Angeletos (2007), highlights how augmenting the neoclassical growth model with idiosyncratic capital-income risk can lead to lower aggregate saving, and thereby to lower aggregate output and consumption, than under complete markets. This result stands in contrast to Aiyagari (1994), which documents how labor-income risk raises aggregate saving.12 We refer the interested reader to that earlier work for a more extensive discussion and quantification of this result. In the remainder of our paper, we focus on the effects of government spending, which is our main question of interest.
5. The long-run effects of government consumption

In this section we study how the steady state changes when the rate of government consumption increases. The analysis will make clear that the different impact that government spending has in our model, as compared to the standard paradigm, originates precisely from the wealth effects that idiosyncratic risk introduces in the demand for investment.

9 For a formal derivation of all the properties discussed here, see Lemma 1 in the Appendix.
10 We verify these properties in Lemma 1 in the Appendix, where we further show that the relative strength of these two effects is such that $\partial K_1 / \partial R > 0$ if and only if $\theta > \phi / (1 - \phi)$. This property will also turn out to be important in the next section, where we analyze the steady-state effects of government spending.
11 Simulations suggest that the curve is also convex, but we have not been able to prove this.
12 In Aiyagari (1994), a precautionary motive implies $R < \beta$, but the absence of investment risk maintains $R = f'(K) - \delta$, from which it is immediate that the capital stock is necessarily higher under incomplete markets.
[Fig. 2. The steady-state effects of government consumption: an increase in $g$ shifts the curve $K_2(R)$ leftwards, from $K_2(R; g_{low})$ to $K_2(R; g_{high})$, along an unchanged $K_1(R)$, moving the steady state from IM$_{low}$ to IM$_{high}$.]
5.1. Characterization

To study the long-run effects of an increase in the level of government consumption, we again use a graphical representation of the steady state, namely Fig. 2, which is a variant of Fig. 1. Let the initial level of government spending be $g = g_{low}$ and suppose that the corresponding steady state is given by point IM$_{low}$ in Fig. 2. Subsequently, let government spending increase to $g = g_{high} > g_{low}$. Note that condition (19) does not depend on $g$, and hence an increase in government consumption does not affect the $K_1(R)$ curve. Rather, it is condition (20), and the $K_2(R)$ curve, that depend on $g$. In particular, because a higher $g$ means lower net-of-taxes labor income, and hence a lower $H$ in steady state for any given $R$, an increase in government consumption causes the $K_2(R)$ curve to shift leftwards, as illustrated in Fig. 2. This leftward shift is a manifestation of the negative wealth effect of higher lump-sum taxes on investment. The new steady state is then represented by point IM$_{high}$.

Clearly, the leftward shift in the $K_2(R)$ curve leads unambiguously to a decrease in $R$. The impact on $K$, on the other hand, is ambiguous. This is because, as explained in the previous section, a reduction in $R$ entails two opposing effects on the demand for investment: the familiar opportunity-cost channel tends to encourage investment, while the novel wealth channel of our model tends to discourage investment. As evident from the figure, if the two curves intersect on the upward-sloping portion of the $K_1(R)$ curve, that is, in the portion where the wealth effect dominates, then the increase in $g$ leads to a reduction in $K$. In the Appendix we show that the intersection occurs in the upward-sloping portion of the $K_1(R)$ curve if and only if $\theta > \phi / (1 - \phi)$. Finally, it is easy to check that $\phi < \alpha$.13 We thus reach the following result.

Proposition 5. In steady state, an increase in government consumption ($g$) necessarily decreases the risk-free rate ($R$), while it locally decreases the capital–labor ratio ($K/N$), labor productivity ($Y/N$), the wage rate ($\omega$), and the saving rate ($s \equiv \delta K / Y$) if and only if $\theta > \phi / (1 - \phi)$. A sufficient condition for the latter is that $\theta > \alpha / (1 - \alpha)$.

This is the key theoretical result of our paper. It establishes that, as long as the EIS is sufficiently high relative to the income share of capital, a permanent increase in the rate of government consumption has a negative long-run effect on both the interest rate and the capital intensity of the economy.

It is important to appreciate how this result deviates from the standard neoclassical paradigm. With complete markets, in steady state the interest rate is equal to the discount rate ($R = \beta$), and the capital–labor ratio is determined by the equality of the marginal product of capital to the discount rate ($f'(K/N) - \delta = \beta$). It follows that, in the long run, government consumption has no effect on $R$, $K/N$, $Y/N$, $\omega$, or $s$. This is true even when labor supply, $N$, is endogenous.14 The only difference is that, with endogenous labor supply, $N$ changes with $g$. In particular, when labor supply is fixed, the increase in government consumption simply leads to a one-to-one decrease in private consumption. When instead labor supply is elastic, the increase in government consumption has a negative wealth effect, inducing agents to
13 To see this, note from condition (20) that $\phi / (1 - \phi) = RK / [(1 - \alpha - g) f(K)]$. Combining this with the fact that $R < f'(K) - \delta < \alpha f(K)/K$, we get $\phi / (1 - \phi) < \alpha / (1 - \alpha - g)$, and hence $\phi < \alpha$.
14 We endogenize labor in Section 6.
work more. The capital stock then increases one-for-one with labor supply, so as to keep the capital–labor ratio and the interest rate invariant with $g$.

In our model, instead, government consumption has non-trivial long-run effects on both the interest rate and the capital intensity of the economy. Building on the earlier discussions, we can now summarize the key mechanism in our model as follows. Because households face consumption risk, they have a precautionary motive to save. Because preferences exhibit diminishing absolute risk aversion, this motive is stronger when the level of wealth is lower. It follows that, by reducing household wealth, higher government spending stimulates precautionary saving. But then the risk-free rate at which aggregate saving can be stationary has to be lower, which explains why the risk-free rate $R$ falls with $g$. At the same time, because of diminishing absolute risk aversion, the reduction in wealth tends to discourage the demand for investment. Provided that the positive effect of the lower opportunity cost of investment is not strong enough to offset this negative wealth effect, the capital–labor ratio $K/N$ also falls with $g$.
5.2. Calibration and numerical simulation

For empirically plausible calibrations of the model, the critical condition $\theta > \phi / (1 - \phi)$ appears to be satisfied quite easily. For example, take the interest rate to be $R = 4\%$ and labor income to be 65% of GDP (as in US data). This implies that $H$ is about 16 times GDP. With a capital–output ratio of 4 (again as in US data), this translates to an $H$ of about 4 times $K$. Since in steady state $\phi / (1 - \phi) = K/H$, this exercise gives a calibrated value for $\phi / (1 - \phi)$ of about 0.25. This critical value is lower than most of the recent empirical estimates of the elasticity of intertemporal substitution, which are in most cases above 0.5 and often even above 1.15 Hence, a negative long-run effect of government consumption on aggregate saving and productivity appears to be the most likely scenario.

In the remainder of this section, we make a first pass at the potential quantitative importance of our results within the context of our baseline model. In the next section we then turn to an enriched version of the model that allows for endogenous labor supply, as well as a certain type of agent heterogeneity. The economy is fully parameterized by $(\alpha, \beta, \gamma, \delta, \theta, \sigma, g)$, where $\alpha$ is the income share of capital, $\beta$ is the discount rate, $\gamma$ is the coefficient of relative risk aversion, $\delta$ is the (mean) depreciation rate, $\theta$ is the elasticity of intertemporal substitution, $\sigma$ is the standard deviation of the rate of return on private investment, and $g$ is the share of government consumption in aggregate output. In our baseline parametrization, we take $\alpha = 0.36$, $\beta = 0.042$, and $\delta = 0.08$; these values are standard in the literature. For risk aversion, we take $\gamma = 5$, a value commonly used in the macro-finance literature to help generate plausible risk premia.
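The back-of-the-envelope calibration of $\phi/(1-\phi)$ described above can be reproduced in a few lines (the inputs are the values quoted in the text; the code is purely illustrative arithmetic):

```python
# Back-of-the-envelope calibration of phi/(1-phi) = K/H from the text's inputs.
R = 0.04            # steady-state interest rate
labor_share = 0.65  # labor income as a fraction of GDP (US data)
K_over_Y = 4.0      # capital-output ratio (US data)

H_over_Y = labor_share / R      # H = (omega - G)/R per unit of GDP: about 16x GDP
K_over_H = K_over_Y / H_over_Y  # about 0.25
phi_ratio = K_over_H            # in steady state, phi/(1-phi) = K/H
```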
For the elasticity of intertemporal substitution, we take $\theta = 1$, a value consistent with recent micro- and macro-estimates.16 For the share of government, our baseline value is $g = 25\%$ (as in the United States) and a higher alternative is $g = 40\%$ (as in some European countries).

What remains is $\sigma$. Unfortunately, there is no direct measure of the rate-of-return risk faced by the "typical" investor in the US economy. However, there are various indications that investment risks are significant. For instance, the probability that a privately held firm survives five years after entry is less than 40%. Furthermore, even conditional on survival, the risks faced by entrepreneurs and private investors appear to be very large: as Moskowitz and Vissing-Jørgensen (2002) document, not only is there dramatic cross-sectional variation in the returns to private equity, but also the volatility of the book value of a (value-weighted) index of private firms is twice as large as that of the index of public firms, one more indication that private equity is riskier than public equity. Note then that the standard deviation of annual returns is about 15% per annum for the entire pool of public firms; it is over 50% for a single public firm (which gives a measure of firm-specific risk); and it is about 40% for a portfolio of the smallest public firms (which are likely to be similar to large private firms). Given this suggestive evidence, and lacking any better alternative, we let $\sigma = 30\%$ for our baseline parameterization and consider $\sigma = 20\%$ and $40\%$ for sensitivity analysis.

Although these numbers are somewhat arbitrary, it is reassuring that the volatility of individual consumption generated by our model is comparable to its empirical counterpart. For instance, using the Consumer Expenditure Survey (CEX), Malloy et al. (2006) estimate the standard deviation of consumption growth to be about 8% for stockholders (and about 3% for non-stockholders).
Similarly, using data that include consumption of luxury goods, Aït-Sahalia et al. (2001) get estimates between 6% and 15%. In our simulations, on the other hand, the standard deviation of individual consumption growth is less than 5% per annum (along the steady state).

Putting aside these qualifications about the parametrization of $\sigma$, we now examine the quantitative effects of government consumption on the steady state of the economy. Table 1 reports the percent reduction in the steady-state values of the capital–labor ratio ($K/N$), labor productivity ($Y/N$), and the saving rate ($s$), relative to what their values would have been if $g$ were 0.17 Complete markets are indicated by CM and incomplete markets by IM.

15 See, for example, Vissing-Jørgensen and Attanasio (2003), Mulligan (2002), and Gruber (2005). See also Guvenen (2006) and Angeletos (2007) for related discussions on the parametrization of the EIS.
16 See the references in footnote 15.
17 Here, since labor supply is exogenously fixed, the changes in $K$ and $Y$ coincide with those in $K/N$ and $Y/N$; this is not the case in the extensions with endogenous labor supply in the next section.
Table 1
The steady-state effects of government consumption.

                      K/N            Y/N            s              $\tau^k_{equiv}$
                      CM     IM      CM     IM      CM     IM      CM
Baseline              0      -10.02  0      -3.73   0      -1.14   17
$\sigma = 40\%$       0      -12.18  0      -4.57   0      -1.21   20
$\sigma = 20\%$       0      -6.78   0      -2.50   0      -0.88   12
$g = 40\%$            0      -17.82  0      -6.82   0      -2.05   28

The table reports the percent reduction in the steady-state values of the capital–labor ratio ($K/N$), labor productivity ($Y/N$), and the saving rate ($s$), relative to what their values would have been if $g = G/Y$ were 0. In the first three lines, $g$ increases from 0% to 25%, whereas in the last line $g$ increases from 0% to 40% in the baseline calibration. CM denotes complete markets, and IM denotes incomplete markets. The tax rate on capital income that would generate the same effects under complete markets is $\tau^k_{equiv}$.
Table 2
Long-run effects of a permanent 1% increase in government consumption.

                          K/N           Y/N           $\tau^k_{equiv}$
                          CM    IM      CM    IM      CM
$g = 25\% \to 26\%$       0     -0.52   0     -0.19   0.75
$g = 40\% \to 41\%$       0     -0.71   0     -0.26   0.80

The table reports the change in the steady-state values of the capital–labor ratio ($K/N$) and labor productivity ($Y/N$) when government spending increases by 1%. CM denotes complete markets, and IM denotes incomplete markets. The tax rate on capital income that would generate the same effects under complete markets is $\tau^k_{equiv}$.
In our baseline parametrization, the capital–labor ratio is about 10% lower when $g = 25\%$ than when $g = 0$. Similarly, productivity is about 4% lower and the saving rate is about 1 percentage point lower. These are significant effects. They are larger (in absolute value) than the steady-state effects of precautionary saving reported in Aiyagari (1994). They are equivalent to the steady-state effects of a 17% marginal tax on capital income in the complete-markets case. (The tax rate on capital income that would generate the same effects under complete markets is given in the last column of the table, as $\tau^k_{equiv}$.) Not surprisingly, the effects are smaller if $\sigma$ is lower (third row) or if $\gamma$ is lower (not reported), because then risk matters less. On the other hand, the effects are larger when $g = 40\%$ (final row): the capital–labor ratio is almost 18% lower, the saving rate is 2 percentage points lower, and the tax on capital income that would have generated the same effects under complete markets is 28%.

Table 2 turns from level to marginal effects: it reports the change in $K/N$, $Y/N$, and $s$ as we increase government spending by 1 percentage point, either from 25% to 26% or from 40% to 41%. In the first case, productivity falls by 0.19%; in the second, by 0.26%. This is equivalent to the effect, under complete markets, of increasing the tax rate on capital income by about 0.75 percentage points in the first case and about 0.8 percentage points in the second.

6. Endogenous labor

In this section we endogenize labor supply in the economy. We consider three alternative specifications that achieve this goal without compromising the tractability of the model.

6.1. GHH preferences

One easy way to accommodate endogenous labor supply in the model is to assume preferences that rule out income effects on labor supply, as in Greenwood et al. (1988). In particular, suppose that preferences are given by $U_0 = E_0 \int_0^\infty e^{-\beta t} u(c_t, l_t)\, dt$, with

$$u(c_t, l_t) = \frac{1}{1-\gamma}\bigl[c_t + v(l_t)\bigr]^{1-\gamma}, \tag{21}$$
where $l_t$ denotes leisure and $v$ is a strictly concave, strictly increasing function.18 The analysis can then proceed as in the benchmark model, with labor supply in period $t$ given by $N_t = 1 - l(\omega_t)$, where $l(\omega) \equiv \arg\max_l \{v(l) - \omega l\}$.
18 To allow for $\theta \neq 1/\gamma$, we let $U_t = E_t \int_t^\infty z(c_t + v(l_t), U_t)\, dt$, with the function $z$ defined as in condition (5).
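For concreteness, the GHH rule $N_t = 1 - l(\omega_t)$ has a closed form once a functional form for $v$ is chosen. The isoelastic $v$ and the parameter values below are illustrative assumptions, not the paper's specification:

```python
# GHH labor supply with an assumed isoelastic leisure utility
# v(l) = psi * l^(1 - 1/nu) / (1 - 1/nu)   (hypothetical functional form).
# The FOC of max_l { v(l) - omega*l } is v'(l) = psi * l^(-1/nu) = omega,
# so leisure demand is l(omega) = (psi/omega)^nu and labor is N = 1 - l(omega).

psi, nu = 0.5, 0.5   # illustrative scale and elasticity parameters

def labor_supply(omega):
    leisure = min(1.0, (psi / omega) ** nu)   # cap leisure at the time endowment
    return 1.0 - leisure
```

Because $v$ enters additively with consumption, wealth does not appear in the rule at all: labor supply responds only to the wage, which is exactly the absence of income effects that the GHH specification is designed to deliver.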
This specification highlights an important difference between complete and incomplete markets with regard to the employment impact of fiscal shocks. Under incomplete markets, an increase in government spending can have a negative general-equilibrium effect on aggregate employment. This is never possible with complete markets, but it is possible with incomplete markets when an increase in $g$ reduces the capital–labor ratio, and thereby the wage rate, which in turn discourages labor supply. Indeed, with GHH preferences, $\theta > \phi / (1 - \phi)$ suffices for both $K/N$ and $N$ to fall with $g$ in both the short run and the long run.

Although it is unlikely that wealth effects on labor supply are zero in the long run, they may well be very weak in the short run. In light of our results, one may then expect that after a positive shock to government consumption both employment and investment could drop on impact under incomplete markets. Indeed, an interesting extension would be to consider a preference specification that allows for weak short-run but strong long-run wealth effects on labor supply, as in Jaimovich and Rebelo (2006).

6.2. KPR preferences

A second tractable way to accommodate endogenous labor supply is to assume that agents have homothetic preferences over consumption and leisure, as in King et al. (1988). The specification assumed in that paper is $U_0 = E_0 \int_0^\infty e^{-\beta t} u(c_t, l_t)\, dt$, with

$$u(c_t, l_t) = \frac{(c_t^{1-\psi} l_t^{\psi})^{1-\gamma}}{1-\gamma}, \tag{22}$$

where $l_t$ denotes leisure and $\psi \in (0, 1)$ is a scalar. This specification imposes expected utility ($\theta = 1/\gamma$). To allow for $\theta \neq 1/\gamma$, we let $U_t = E_t \int_t^\infty z(c_t^{1-\psi} l_t^{\psi}, U_t)\, dt$, with $z$ defined as in (5). The benefit of this specification is that it is standard in the literature (making our results comparable to previously reported results), while it also comes at zero cost in tractability.19 The homotheticity of the household's optimization problem is then preserved and the equilibrium analysis proceeds in a similar fashion as in the benchmark model.20 The only essential novelty is that aggregate employment is now given by $N_t = 1 - L(\omega_t, C_t)$, where

$$L(\omega_t, C_t) = \frac{\psi}{1-\psi} \cdot \frac{C_t}{\omega_t}.$$
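A hypothetical numeric illustration of this labor-supply rule (the value $\psi = 0.75$ follows the calibration discussed below; the wage and consumption inputs are made up for the example):

```python
# KPR labor supply: N_t = 1 - (psi/(1-psi)) * C_t / omega_t.
psi = 0.75   # leisure weight, as in the text's calibration

def labor_supply(omega, C):
    # With psi = 0.75, the leisure demand is 3*C/omega.
    return 1.0 - (psi / (1.0 - psi)) * C / omega

# Wealth effect: for a given wage, higher consumption (wealth) lowers labor supply;
# for given consumption, a higher wage raises it.
```

Unlike the GHH case, consumption enters the rule directly, so fiscal shocks move employment through both the wage and the wealth channel.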
The neoclassical effect of wealth on labor supply is then captured by the negative relationship between $N_t$ and $C_t$ (for given $\omega_t$). For the quantitative version of this economy, we take $\psi = 0.75$. This value, which is in line with King et al. (1988) and Christiano and Eichenbaum (1992), ensures that the steady-state fraction of available time worked approximately matches US data. The rest of the parameters are as in the baseline specification of the benchmark model.

6.3. Hand-to-mouth workers

A third approach is to split the population into two groups. The first group consists of the households that have been modeled in the benchmark model; we will call this group the "investors". The second group consists of households that supply labor but do not hold any assets, and simply consume their entire labor income at each point in time; we will call this group the "hand-to-mouth workers". Their labor supply is given by
(23)
C htm t
denotes the consumption of these agents, o 40 parameterizes the wage elasticity of labor supply, and c 40 where parameterizes the wealth elasticity.21 This approach could be justified on its own merit. In the United States, a significant fraction of the population holds no assets, has limited ability to borrow, and sees its consumption tracking its income almost one-to-one. This fact calls for a richer model of heterogeneity than our benchmark model. But is unclear what the ‘‘right’’ model for these households is. Our specification with hand-to-mouth workers is a crude way of capturing this form of heterogeneity in the model, while preserving tractability. A side benefit of this approach is that it also gives freedom in parameterizing the wage and wealth elasticities of labor supply. Whereas the KPR preference specification imposes o ¼ c ¼ 1, the specification introduced above permits us to pick much lower elasticities, consistent with microevidence. The point is not to argue which parametrization of the laborsupply elasticities is more appropriate for quantitative exercises within the neoclassical growth model; this is the subject of 19 For convenience, we allow agents to trade leisure with one another, so that an individual agent can possibly consume more leisure than her own endowment of time. 20 The proofs are available upon request. 21 Preferences that give rise to this labor supply are ut ¼ czt c nzt n , for appropriate zc ; zn .
Table 3
Long-run effects with endogenous labor.

                             K/N           Y/N           N             Y             $\tau^k_{equiv}$
                             CM    IM      CM    IM      CM    IM      CM    IM      CM
$g = 25\% \to 26\%$   KPR    0     -0.33   0     -0.12   1.40  1.27    1.40  1.15    0.52
                      HTM    0     -0.30   0     -0.11   0.38  0.38    0.38  0.27    0.46
$g = 40\% \to 41\%$   KPR    0     -0.52   0     -0.19   1.76  1.53    1.76  1.34    0.68
                      HTM    0     -0.36   0     -0.13   0.57  0.57    0.57  0.44    0.48

The table reports the change in the steady-state values of the capital–labor ratio ($K/N$), labor productivity ($Y/N$), employment ($N$), and output ($Y$) when government spending increases by 1%. CM denotes complete markets, IM denotes incomplete markets. KPR denotes the case of homothetic preferences, and HTM the case of hand-to-mouth workers. The tax rate on capital income that would generate the same effects under complete markets is $\tau^k_{equiv}$.
a long debate in the literature, to which we have nothing to add. The point here is rather to cover a broader spectrum of empirically plausible quantitative results. For the quantitative version of this economy, we thus take $\epsilon_\omega = 0.25$ and $\epsilon_c = 0.25$, which are in the middle of most micro-estimates.22

What then remains is the fraction of aggregate income absorbed by hand-to-mouth workers. As mentioned above, a significant fraction of the US population holds no assets. For example, using data from both the PSID and the SCF, Guvenen (2006) reports that the lower 80% of the wealth distribution owns only 12% of aggregate wealth and accounts for about 70% of aggregate consumption. Since some households may be able to smooth consumption even when their net worth is zero, 70% is likely to be an upper bound for the fraction of aggregate consumption accounted for by hand-to-mouth agents. We thus opt to calibrate the economy so that hand-to-mouth agents account for 50% of aggregate consumption. This is also the value of the relevant parameter that one would estimate if the model were to match US aggregate consumption data; we can deduce this from Campbell and Mankiw (1989).23

6.4. The long-run effects of government consumption with endogenous labor

Our main theoretical result (Proposition 5) continues to hold in all of the above variants of the benchmark model: in steady state, a higher rate $g$ of government consumption necessarily reduces the interest rate $R$; and it also reduces the capital–labor ratio $K/N$, labor productivity $Y/N$, and the wage rate $\omega$ if and only if the elasticity of intertemporal substitution $\theta$ is higher than $\phi / (1 - \phi)$.24

What is no longer clear is the effect of $g$ on $K$ and $Y$, because now $N$ is not fixed. On the one hand, the reduction in wealth stimulates labor supply, thus contributing to an increase in $N$. This is the familiar neoclassical effect of government spending on labor supply.
On the other hand, as long as $\theta > \phi / (1 - \phi)$, the reduction in capital intensity depresses real wages, contributing towards a reduction in $N$. This is the novel general-equilibrium effect due to incomplete markets. The overall effect of government spending on aggregate employment is therefore ambiguous under incomplete markets, whereas it is unambiguously positive under complete markets.

Other things equal, we expect the negative general-equilibrium effect to dominate, thus leading to a reduction in long-run employment after a permanent increase in government spending, if the wage elasticity of labor supply is sufficiently high relative to its income elasticity. This is clear in the GHH specification, where the wealth effect is zero. It can also be verified for the case of hand-to-mouth workers, where we have freedom in choosing these elasticities, but not in the case of KPR preferences, where both elasticities are restricted to equal one.

Given these theoretical ambiguities, we now seek to get a sense of empirically plausible quantitative effects. As already discussed, the GHH case (zero wealth effects on labor supply) is merely of pedagogical value. We thus focus on the parameterized versions of the other two cases, the economy with KPR (homothetic) preferences and the economy with hand-to-mouth workers. Table 3 presents the marginal effects on the steady-state levels of the capital–labor ratio, productivity, employment, and output for each of these two economies, as $g$ increases from 25% to 26%, or from 40% to 41%.25 The case of KPR preferences is indicated by KPR, and the case with hand-to-mouth workers by HTM. In either case, complete markets are indicated by CM and incomplete markets by IM.
22. See, for example, Hausman (1981), MaCurdy (1981), and Blundell and MaCurdy (1999).
23. Note that the specification of aggregate consumption considered in Campbell and Mankiw coincides with the one implied by our model. Therefore, if one were to run their regression on data generated by our model, one would correctly identify the fraction of aggregate consumption accounted for by hand-to-mouth workers in our model. This implies that it is indeed appropriate to calibrate our model's relevant parameter to Campbell and Mankiw's estimate.
24. This is true as long as the steady state is unique, which seems to be the case but has not been proved as in the benchmark model. Also, in the variant with hand-to-mouth agents, we have to be cautious to interpret φ as the ratio of private equity to effective wealth for the investor population alone.
25. We henceforth focus on marginal rather than level effects just to economize on space.
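The identification logic in footnote 23 (that a Campbell–Mankiw-style regression of aggregate consumption growth on income growth recovers the hand-to-mouth consumption share) can be illustrated with a stylized simulation. This is only a sketch: the data-generating process and all numbers are illustrative assumptions, not taken from the paper, and the instrumental-variables step that Campbell and Mankiw use to deal with endogeneity is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20000
lam = 0.50            # true hand-to-mouth share of aggregate consumption (the calibration target)

# Stylized data-generating process in the spirit of Campbell-Mankiw (1989):
# aggregate consumption growth = lam * income growth + (1 - lam) * innovation,
# where the permanent-income consumers' innovation is orthogonal to income growth.
dy = 0.02 + 0.01 * rng.standard_normal(T)       # income growth
eps = 0.01 * rng.standard_normal(T)             # permanent-income consumers' innovation
dc = lam * dy + (1.0 - lam) * eps

# OLS of consumption growth on income growth recovers lam
X = np.column_stack([np.ones(T), dy])
beta = np.linalg.lstsq(X, dc, rcond=None)[0]
print(round(beta[1], 2))                        # slope estimate close to the true lam of 0.50
```

Because the innovation is orthogonal to income growth by construction here, plain OLS suffices; with endogenous income growth one would instrument, as Campbell and Mankiw do.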
ARTICLE IN PRESS 150
G.-M. Angeletos, V. Panousi / Journal of Monetary Economics 56 (2009) 137–153
Fig. 3. Dynamic responses to a permanent shock with KPR preferences. (a) Aggregate output Y_t. (b) Aggregate employment N_t. (c) Capital–labor ratio K_t/N_t. (d) Investment rate I_t/Y_t. (e) Labor productivity Y_t/N_t. (f) Interest rate R_t.
Regardless of specification, the marginal effects of higher government spending on capital intensity K/N and labor productivity Y/N are negative under incomplete markets (and are stronger the higher is g), whereas they are zero under complete markets. As for aggregate employment N, the wealth effect of higher g turns out to dominate the effect of lower wages under incomplete markets, so that N increases with higher g under either complete or incomplete markets. However, the employment stimulus is weaker under incomplete markets, especially in the economy with hand-to-mouth workers. The same is true for aggregate output: it increases under either incomplete or complete markets, but less so under incomplete markets. Finally, the incomplete-markets effects are on average equivalent to what would have been the effect of increasing the tax rate on capital income by about 0.55% under complete markets.
7. Dynamic responses

The results so far indicate that the long-run effects of government consumption can be significantly affected by incomplete risk sharing. We now examine how incomplete risk sharing affects the entire impulse response of the economy to a fiscal shock.26 Starting from the steady state with g = 25%, we hit the economy with a permanent 1% increase in government spending and trace its transition to the new steady state (the one with g = 26%). We conduct this experiment for both the economy

26. Note that the purpose of the quantitative exercises conducted here, and throughout the paper, is not to assess the ability of the model to match the data. Rather, the purpose is to detect the potential quantitative significance of the particular deviation we took from the standard neoclassical growth model.
Fig. 4. Dynamic responses to a permanent shock with hand-to-mouth agents. (g) Aggregate output Y_t. (h) Aggregate employment N_t. (i) Capital–labor ratio K_t/N_t. (j) Investment rate I_t/Y_t. (k) Labor productivity Y_t/N_t. (l) Interest rate R_t.
with KPR preferences and the economy with hand-to-mouth workers, each parameterized as in the previous section; in either case, the transitional dynamics reduce to a simple system of two first-order ODEs in (K_t, H_t) when θ = 1 (see footnote 27). The results are presented in Figs. 3 and 4. Time in years is on the horizontal axis, while deviations of the macroeconomic variables from their respective initial values are on the vertical axis. The interest rate and the investment rate are in simple differences; the rest of the variables are in log differences. The solid lines indicate incomplete markets, the dashed lines indicate complete markets. As evident in these figures, the quantitative effects of a permanent fiscal shock can be quite different between complete and incomplete markets. The overall picture that emerges is that the employment and output stimulus of a permanent increase in government spending is weaker under incomplete markets than under complete markets. And whereas we already knew this for the long-run response of the economy, now we see that the same is true for its short-run response. This picture holds for both the economy with KPR preferences and the one with hand-to-mouth workers. But there are also some interesting differences between the two. The mitigating effect of incomplete markets on the employment and output stimulus of government spending is much stronger in the economy with hand-to-mouth workers. As a result, whereas the short-run effects of higher government spending on the investment rate and the interest rate are positive under
27. Throughout, we focus on permanent shocks. Clearly, transitory shocks have no impact in the long run. As for their short-run impact, the difference between complete and incomplete markets is much smaller than in the case of permanent shocks. This is simply because transitory shocks have very weak wealth effects on investment as long as agents can freely borrow and lend over time, which is the case in our model. We expect the difference between complete and incomplete markets to be larger once borrowing constraints are added to the model, for then investment will be sensitive to changes in current disposable income even if there is no change in present-value wealth.
complete markets in both economies, and whereas these effects remain positive under incomplete markets in the economy with KPR preferences, they turn negative under incomplete markets in the economy with hand-to-mouth workers. To understand this result, consider for a moment the benchmark model, where there are no hand-to-mouth workers and labor supply is completely inelastic. Under complete markets, a permanent change in government spending would be absorbed one-to-one by private consumption, leaving investment and interest rates completely unaffected in both the short and the long run. Under incomplete markets, instead, investment and the interest rate would fall on impact, as well as in the long run. Allowing labor supply to increase in response to the fiscal shock ensures that investment and the interest rate jump upwards under complete markets. However, as long as the response of labor supply is weak enough, the response of investment and the interest rate can remain negative under incomplete markets. As a final point of interest, we calculate the welfare cost, in consumption-equivalent terms, associated with a permanent 1% increase in government spending. Under complete markets, welfare drops by 0.2%, whereas under incomplete markets it drops by 0.6%. In other words, the welfare cost of an increase in government spending is three times as large under incomplete markets as under complete markets.28 To recap, the quantitative results indicate that a modest level of idiosyncratic investment risk can have a non-trivial impact on previously reported quantitative evaluations of fiscal policy. Note in particular that our quantitative economy with KPR preferences is directly comparable to two classics in the related literature, Aiyagari et al. (1992) and Baxter and King (1993). Therefore, further investigating the macroeconomic effects of fiscal shocks in richer quantitative models with financial frictions appears to be a promising direction for future research.
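The consumption-equivalent metric used in this welfare comparison can be made concrete with a small numerical sketch. Assuming log utility over a deterministic aggregate consumption path (a deliberate simplification; the model's richer preferences and stochastic paths are not reproduced here), the welfare cost λ solves a simple discounted-sum equation. All parameters and paths below are illustrative.

```python
import numpy as np

beta = 0.96                      # annual discount factor (illustrative)
T = 200                          # truncation horizon

t = np.arange(T)
disc = beta ** t

# Illustrative consumption paths: baseline vs. a path that is permanently
# 0.6% lower, mirroring the incomplete-markets welfare drop quoted above.
c_base  = np.ones(T)
c_shock = 0.994 * np.ones(T)

# Under log utility, the consumption-equivalent loss lambda solves
#   sum_t beta^t ln((1 + lambda) * c_shock_t) = sum_t beta^t ln(c_base_t)
dlog = np.sum(disc * (np.log(c_base) - np.log(c_shock))) / np.sum(disc)
lam = np.exp(dlog) - 1.0
print(f"welfare cost: {lam:.3%}")   # ~0.604% of consumption
```

With a flat, permanent consumption gap the equivalent variation essentially equals the gap itself; the discounted-sum formula matters once consumption varies over time, as along the model's transition path.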
8. Conclusion

This paper revisited the macroeconomic effects of government consumption within a tractable incomplete-markets variant of the neoclassical growth model. Because private investment is subject to uninsurable idiosyncratic risk, and because risk-taking is sensitive to wealth, the aggregate level of investment depends on the aggregate level of net-of-taxes household wealth for any given prices. It follows that an increase in government spending can crowd out private investment simply by reducing household net worth. As a result, market incompleteness can seriously upset the supply-side effects of fiscal shocks: an increase in government consumption, even if financed with lump-sum taxation, tends to reduce capital intensity, labor productivity, and wages in both the short run and the long run. For plausible parameterizations of the model, these results appear to have not only qualitative but also quantitative content. These results might, or might not, be bad news for the ability of the neoclassical paradigm to explain the available evidence regarding the macroeconomic effects of fiscal shocks.29 However, the goal of this paper was not to study whether our model could match the data. Rather, the goal was to identify an important mechanism through which incomplete markets modify the response of the economy to fiscal shocks: wealth effects on investment. In our model, these wealth effects originated from uninsured idiosyncratic investment risk combined with diminishing absolute risk aversion. Borrowing constraints could lead to a similar sensitivity of investment to wealth (or cash flow).30 Also, this mechanism need not depend on whether prices are flexible (as in the neoclassical paradigm) or sticky (as in the Keynesian paradigm). The key insights of this paper are thus clearly more general than the specific model we employed, but the quantitative importance of these insights within richer models of the macroeconomy is, of course, a wide-open question.
An important aspect left outside our analysis is the optimal financing of government expenditures. In this paper, we assumed that the increase in government spending is financed with lump-sum taxation, only because we wished to isolate wealth effects from the distortionary and redistributive effects of taxation. Suppose, however, that the government has access to two tax instruments, a lump-sum tax and a proportional income tax.31 Clearly, with complete markets (and no inequality) it would be optimal to finance any exogenous increase in government spending with lump-sum taxes alone. With incomplete markets, however, it is likely that an increase in government spending would be financed with a mixture of both instruments: while using only the lump-sum tax would disproportionately affect the utility of poor agents, using both instruments permits the government to give up some efficiency in exchange for more equality. Further exploring these issues, and the nature of optimal taxation for the class of economies we have studied here, is left for future research.
Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.moneco.2008.12.010.

28. Here we have assumed that government consumption has no welfare benefit, but this should not be taken literally: nothing changes if G_t enters separably in the utility of agents.
29. Whether the evidence is consistent with the neoclassical paradigm is still debatable. For example, using structural VARs with different identification assumptions, Ramey and Shapiro (1997) and Ramey (2006) find that private consumption falls in response to a positive shock to government consumption, as predicted by the neoclassical paradigm, while Blanchard and Perotti (2002) and Perotti (2007) find the opposite result.
30. On this point, see Challe and Ragot (2007).
31. As in Werning (2007), this might be a good proxy for more general non-linear tax schemes.
ARTICLE IN PRESS G.-M. Angeletos, V. Panousi / Journal of Monetary Economics 56 (2009) 137–153
153
References

Acemoglu, D., Zilibotti, F., 1997. Was Prometheus unbound by chance? Risk, diversification, and growth. Journal of Political Economy 105, 709–751.
Aït-Sahalia, Y., Parker, J.A., Yogo, M., 2001. Luxury goods and the equity premium. NBER Working Paper 8417.
Aiyagari, S.R., 1994. Uninsured idiosyncratic risk and aggregate saving. Quarterly Journal of Economics 109, 659–684.
Aiyagari, S.R., Christiano, L.J., Eichenbaum, M., 1992. The output, employment, and interest rate effects of government consumption. Journal of Monetary Economics 30, 73–86.
Angeletos, G.-M., 2007. Uninsured idiosyncratic investment risk and aggregate saving. Review of Economic Dynamics 10, 1–30.
Angeletos, G.-M., Calvet, L.-E., 2005. Incomplete-market dynamics in a neoclassical production economy. Journal of Mathematical Economics 41, 407–438.
Angeletos, G.-M., Calvet, L.-E., 2006. Idiosyncratic production risk, growth, and the business cycle. Journal of Monetary Economics 53, 1095–1115.
Barro, R.J., 1981. Output effects of government purchases. Journal of Political Economy 89, 1086–1121.
Barro, R.J., 1989. The Ricardian approach to budget deficits. Journal of Economic Perspectives 3, 37–54.
Baxter, M., King, R.G., 1993. Fiscal policy in general equilibrium. American Economic Review 83, 315–334.
Blanchard, O., Perotti, R., 2002. An empirical characterization of the dynamic effects of changes in government spending and taxes on output. Quarterly Journal of Economics 117, 1329–1368.
Blundell, R., MaCurdy, T., 1999. Labour supply: a review of alternative approaches. In: Ashenfelter, O., Card, D. (Eds.), Handbook of Labor Economics, vol. 3.
Braun, A.R., McGrattan, E.R., 1993. The macroeconomics of war and peace. NBER Macroeconomics Annual 1993.
Buera, F., Shin, Y., 2007. Financial frictions and the persistence of history: a quantitative exploration. Mimeo, Northwestern University.
Cagetti, M., De Nardi, M., 2006. Entrepreneurship, frictions, and wealth. Journal of Political Economy 114, 835–870.
Campbell, J., Mankiw, N.G., 1989. Permanent income, current income, and consumption. NBER Macroeconomics Annual 1989.
Carroll, C., 2000. Portfolios of the rich. NBER Working Paper 430.
Challe, E., Ragot, X., 2007. The dynamic effects of fiscal shocks in a liquidity-constrained economy. Mimeo, University of Paris-Dauphine/Paris School of Economics.
Christiano, L.J., Eichenbaum, M., 1992. Current real-business-cycle theories and aggregate labor market fluctuations. American Economic Review 82, 430–450.
Covas, F., 2006. Uninsured idiosyncratic production risk with borrowing constraints. Journal of Economic Dynamics and Control 30, 2167–2190.
Duffie, D., Epstein, L.G., 1992. Stochastic differential utility. Econometrica 60, 353–394.
Gentry, W.M., Hubbard, R.G., 2000. Entrepreneurship and household saving. NBER Working Paper 7894.
Greenwood, J., Hercowitz, Z., Huffman, G.W., 1988. Investment, capacity utilization and the real business cycle. American Economic Review 78, 402–417.
Gruber, J., 2005. A tax-based estimate of the elasticity of intertemporal substitution. NBER Working Paper 11945.
Guvenen, F., 2006. Reconciling conflicting evidence on the elasticity of intertemporal substitution: a macroeconomic perspective. Journal of Monetary Economics 53, 1451–1472.
Hall, R., 1980. Stabilization policy and capital formation. American Economic Review 70, 156–163.
Hausman, J., 1981. Labor supply: how taxes affect economic behavior. In: Aaron, H., Pechman, J. (Eds.), Tax and the Economy. Brookings Institution, Washington, DC.
Heathcote, J., 2005. Fiscal policy with heterogeneous agents and incomplete markets. Review of Economic Studies 72, 161–188.
Huggett, M., 1997. The one-sector growth model with idiosyncratic shocks. Journal of Monetary Economics 39, 385–403.
Jaimovich, N., Rebelo, S., 2006. Can news about the future drive the business cycle? American Economic Review, forthcoming.
King, R.G., Plosser, C., Rebelo, S., 1988. Production, growth and business cycles: I. The basic neoclassical model. Journal of Monetary Economics 21, 195–232.
Krusell, P., Smith, A.A., 1998. Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy 106, 867–896.
Leland, H., 1968. Saving and uncertainty: the precautionary demand for saving. Quarterly Journal of Economics 82, 465–473.
MaCurdy, T., 1981. An empirical model of labor supply in a life-cycle setting. Journal of Political Economy 89, 1059–1085.
Malloy, C.J., Moskowitz, T.J., Vissing-Jørgensen, A., 2006. Long run stockholder consumption risk and asset returns. Mimeo.
McGrattan, E., Ohanian, L.E., 1999. The macroeconomic effects of big fiscal shocks: the case of World War II. Federal Reserve Bank of Minneapolis Working Paper 599.
McGrattan, E., Ohanian, L.E., 2006. Does neoclassical theory account for the effects of big fiscal shocks? Evidence from World War II. Staff Report 315, Federal Reserve Bank of Minneapolis Research Department.
Meh, C.A., Quadrini, V., 2006. Endogenous market incompleteness with investment risks. Journal of Economic Dynamics and Control 30, 2143–2165.
Moskowitz, T.J., Vissing-Jørgensen, A., 2002. The returns to entrepreneurial investment: a private equity premium puzzle? American Economic Review 92, 745–778.
Mulligan, C.B., 2002. Capital, interest, and aggregate intertemporal substitution. NBER Working Paper 9373.
Obstfeld, M., 1994. Risk-taking, global diversification, and growth. American Economic Review 84, 1310–1329.
Panousi, V., 2008. Capital taxation with entrepreneurial risk. Mimeo, Federal Reserve Board.
Panousi, V., Papanikolaou, D., 2008. Investment, idiosyncratic risk, and ownership. Mimeo, Federal Reserve Board/Northwestern University.
Perotti, R., 2007. In search of the transmission mechanism of fiscal policy. NBER Macroeconomics Annual 2007.
Quadrini, V., 2000. Entrepreneurship, saving, and social mobility. Review of Economic Dynamics 3, 1–40.
Ramey, V.A., 2006. Identifying government spending shocks: it's all in the timing. Mimeo, UCSD.
Ramey, V.A., Shapiro, M.D., 1997. Costly capital reallocation and the effects of government spending. NBER Working Paper 6283.
Sandmo, A., 1970. The effect of uncertainty on saving decisions. Review of Economic Studies 37, 353–360.
Vissing-Jørgensen, A., Attanasio, O., 2003. Stock-market participation, intertemporal substitution, and risk aversion. American Economic Review 93, 383–391.
Werning, I., 2007. Optimal fiscal policy with redistribution. Quarterly Journal of Economics 122, 925–967.
Journal of Monetary Economics 56 (2009) 154–169
Evaluating the economic significance of downward nominal wage rigidity
Michael W.L. Elsby
Department of Economics, University of Michigan, Ann Arbor, MI 48109-1220, USA; NBER, USA
Article history: Received 8 March 2008 Received in revised form 10 December 2008 Accepted 10 December 2008 Available online 24 December 2008
The existence of downward nominal wage rigidity has been abundantly documented, but what are its economic implications? This paper demonstrates that, even when wages are allocative, downward wage rigidity can be consistent with weak macroeconomic effects. Firms have an incentive to compress wage increases as well as wage cuts when downward wage rigidity binds. By neglecting the compression of wage increases, previous literature may have overstated the costs of downward wage rigidity to firms. Using micro-data from the US and Great Britain, I find evidence for the compression of wage increases when downward wage rigidity binds. Accounting for this brings the estimated increase in aggregate wage growth due to wage rigidity much closer to zero. These results suggest that downward wage rigidity may not provide a strong argument against the targeting of low inflation rates. © 2009 Elsevier B.V. All rights reserved.
JEL classification: E24, E31, J31, J64
Keywords: Wage rigidity; Unemployment; Inflation
A longstanding issue in macroeconomics has been the possible long run disemployment effects of low inflation. The argument can be traced back to Tobin (1972): if workers are reluctant to accept reductions in their nominal wages, a certain amount of inflation may ‘‘grease the wheels’’ of the labor market by easing reductions in real labor costs that would otherwise be prevented. This concern has resurfaced with renewed vigor among economists and policymakers in recent years as inflation has declined and evidence for downward rigidity in nominal wages has accumulated. A stylized fact of recent micro-data on wages is the scarcity of nominal wage cuts relative to nominal wage increases (Lebow et al., 1995; Kahn, 1997; Card and Hyslop, 1997). This evidence dovetails with surveys of wage-setters and negotiators who report that they are reluctant to cut workers’ wages (see Howitt, 2002, for a survey). In an influential study, Bewley (1999) finds that a key reason for this reluctance is the belief that nominal wage cuts damage worker morale, and that morale is a key determinant of worker productivity.1 Exploring the macroeconomic implications of downward nominal wage rigidity from both a theoretical and an empirical perspective, I find that these effects are likely to be small. Section 1 begins by formulating an explicit model of
$ I am particularly grateful to Alan Manning and to the Editor Robert King for detailed comments. I am also grateful to Andrew Abel, Joseph Altonji, David Autor, Marianne Bertrand, Stephen Bond, William Dickens, Juan Dolado, Lorenz Goette, Maarten Goos, Steinar Holden, Chris House, Francis Kramarz, Richard Layard, Stephen Machin, Jim Malcomson, Ryan Michaels, Sendhil Mullainathan, Steve Pischke, Matthew Rabin, Jennifer Smith, and Gary Solon for valuable comments, and seminar participants at Michigan, Birkbeck, Boston Fed, Chicago GSB, the CEPR ESSLE 2004 conference, European Winter Meetings of the Econometric Society 2004, Federal Reserve Board, Oslo, Oxford, Stockholm IIES, Warwick, and Zurich, for helpful suggestions. Any errors are my own. E-mail address:
[email protected]
1. While Bewley's explanation has been influential, it is not the only possible explanation. Other studies have suggested that the tendency for past nominal wages to act as the default outcome in wage negotiations can lead to downward wage rigidity (MacLeod and Malcomson, 1993; Holden, 1994).
0304-3932/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.003
worker resistance to nominal wage cuts.2 Based on Bewley's results, the model makes the simple assumption that wage rigidity arises because the productivity of workers declines sharply following nominal wage cuts.3 Wage rigidity, according to Bewley's evidence, is therefore allocative in the sense of Barro (1977), because it affects the productivity of workers. This simple assumption implies a key insight that has not been recognized in the literature: that nominal wage increases in this environment become irreversible to some degree. A firm that raises the wage today, but reverses its decision by cutting the wage by an equal amount tomorrow, will experience a reduction in productivity: today's wage increase will raise productivity, but tomorrow's wage cut will reduce productivity by a greater amount.4 Section 2 shows that this simple insight equips us with a fundamental prediction: firms will compress wage increases as well as wage cuts in the presence of downward wage rigidity. This occurs through two channels. First, forward-looking firms temper wage increases as a precaution against future costly wage cuts. Raising the wage today increases the likelihood of having to cut the wage, at a cost, in the future. Second, even in the absence of forward-looking behavior, downward wage rigidity raises the level of wages that firms inherit from the past. As a result, firms do not have to raise wages as often or as much to obtain their desired wage level. These two forms of compression of wage increases culminate in the perhaps surprising prediction that worker resistance to wage cuts has no effect on aggregate wage growth in the model. This result challenges a common intuition in the previous empirical literature on downward wage rigidity.
This literature has assumed (implicitly or otherwise) that the existence of downward wage rigidity has no effect on wage increases.5 In addition, many studies go on to report positive estimates of the effect of downward wage rigidity on aggregate wage growth, seemingly in contradiction to the predictions of the model. The model suggests an explanation for this result: neglecting the compression of wage increases leads a researcher to ignore a source of wage growth moderation, and thereby to overstate the increase in aggregate wage growth due to downward wage rigidity. To assess the empirical relevance of firms' compression of wage increases as a response to downward wage rigidity, testable implications of the model are derived to take to the data. The implied percentiles of the distribution of wage growth across workers can be characterized using the model. This reveals that the effects of downward wage rigidity on the compression of wage increases can be determined by observing the effects of the rates of inflation and productivity growth on these percentiles. Higher inflation eases the constraint of downward nominal wage rigidity, which in turn reduces the compression of wage increases, raising the upper percentiles of wage growth. A symmetric logic holds for the effects of productivity growth. Evidence on these predictions is presented in Section 3 using a broad range of micro-data for the US and Great Britain. I find significant evidence for the compression of wage increases related to downward wage rigidity, consistent with the implications of the model. Moreover, accounting for this limits the estimated increase in aggregate real wage growth due to downward wage rigidity from up to 1.5 percentage points to no more than 0.15 of a percentage point, an order of magnitude smaller. Section 4 then considers the implications of these results for the true implied costs of downward wage rigidity to firms.
A simple approximation method allows these costs to be quantified using moments of the available micro-data on wages. This approximation reveals that, in the model, the costs of wage rigidity are driven by the reductions in workers' effort that firms must accept when they reduce wages, contrary to the common intuition that downward wage rigidity increases the cost of labor. In addition, erroneously concluding that downward wage rigidity raises the rate of aggregate wage growth, as previous literature has done, leads to a substantial (more than twofold) overstatement of the costs of downward wage rigidity to firms. Finally, a sense of the magnitude of the implied long-run disemployment effects of wage rigidity can be gleaned from the model. For the rates of inflation and productivity growth observed in the data, the effects of downward nominal wage rigidity under zero inflation are unlikely to reduce employment by more than 0.25 of a percentage point.6,7

2. Given the empirical evidence for worker resistance to wage cuts, it is surprising that there has not yet been an explicit model of such wage rigidity in the literature. The need for such a model has been noted by Shafir et al. (1997, p. 371): "Plausibly, the relationship [between wages and effort] is not continuous: there is a discontinuity coming from nominal wage cuts.... A central issue is how to model such a discontinuity." This sentiment is echoed more recently by Altonji and Devereux (2000, p. 423, note 7), who wrote: "[I]t is surprising to us that there is no rigorous treatment in the literature of how forward-looking firms should set wages when it is costly to cut nominal wages."
3. Bewley also suggests that wage rigidity is enhanced by firms' inability to discriminate pay across workers within a firm. For simplicity, I abstract from this possibility. For models that incorporate this feature, but abstract from downward nominal wage rigidity, see Thomas (2005) and Snell and Thomas (2007).
4. In this sense, the model is formally similar to asymmetric adjustment cost models, such as the investment model of Abel and Eberly (1996) and the labor demand model of Bentolila and Bertola (1990).
5. This is a key identifying assumption in Card and Hyslop (1997). However, their analysis is no more subject to this criticism than other previous empirical work on downward wage rigidity: Kahn (1997), Altonji and Devereux (2000), Nickell and Quintini (2003), Fehr and Götte (2005), and Dickens et al. (2006), among others, implicitly make the same assumption.
6. Despite some common formal elements, the mechanism here is distinct from that emphasized by Caplin and Spulber (1987). They show that the uniformity of the effects of aggregate monetary shocks on individual real prices can yield monetary neutrality in an (s,S) pricing environment. In the current analysis, firms' endogenous wage-setting response to idiosyncratic shocks allows them to obviate much of the cost of worker resistance to wage cuts.
7. This result may also help to reconcile an apparent puzzle in the literature. In contrast to micro-level evidence, empirical support for the macroeconomic effects of downward wage rigidity has been relatively scant (Card and Hyslop, 1997; Lebow et al., 1999; Nickell and Quintini, 2003; Smith, 2004). The results of this paper suggest a simple explanation: since previous studies have ignored the compression of wage increases, researchers have overstated the increase in aggregate wage growth, and thereby the implied costs of downward wage rigidity to firms.
ARTICLE IN PRESS 156
M.W.L. Elsby / Journal of Monetary Economics 56 (2009) 154–169
Based on these results, I conclude that the macroeconomic effects of downward wage rigidity are likely to be small, especially relative to the implications of previous empirical literature on wage rigidity. This suggests that downward nominal wage rigidity does not provide a strong argument against the adoption of a low inflation target. Importantly, however, this result is nevertheless consistent with the diverse body of evidence that suggests workers resist nominal wage cuts. This conclusion therefore complements recent research that has argued for the targeting of low inflation rates in the context of models in which wage rigidity has no allocative effects (see, e.g., Goodfriend and King, 2001). The results of this paper suggest that such a conclusion also extends to a model of allocative wage rigidity based on evidence that workers resist wage cuts (Bewley, 1999).

1. A model of worker resistance to wage cuts

This section presents a simple model of downward nominal wage rigidity based on the observations detailed in the empirical literatures mentioned above. Consider the optimal wage policies of worker–firm pairs for whom the productivity of an incumbent worker (denoted e) depends upon the wage according to

e = ln(ω/b) + c · ln(W/W_{-1}) · 1{W < W_{-1}},
(1)
where W is the nominal wage, W_{-1} the lagged nominal wage, 1{W < W_{-1}} an indicator for a nominal wage cut, ω ≡ W/P the real wage, and b a measure of real unemployment benefits (which is assumed to be constant over time). The parameter c > 0 varies the productivity cost to the firm of a nominal wage cut. The key qualitative feature of this effort function is the existence of a kink at W = W_{-1}, reflecting a worker's resistance to nominal wage cuts. The marginal productivity loss of a nominal wage cut exceeds the marginal productivity gain of a nominal wage increase by a factor of 1 + c > 1. This characteristic is what makes nominal wage increases (partially) irreversible: a nominal wage increase can only be reversed at an additional marginal cost of c. Clearly, this irreversibility is the key feature of the model, and the parameter c determines its importance for wage setting.8 The effort function, (1), can be interpreted as a very simple way of capturing the basic essence of the motivations for downward wage rigidity mentioned in the literature. It is essentially a parametric form of effort functions in the spirit of the fair-wage effort hypothesis expounded by Solow (1979) and Akerlof and Yellen (1986), with an additional term reflecting the impact of nominal wage cuts on effort. Bewley (1999) also advocates such a characterization, but sees wage cuts rather than wage levels as critical for worker morale.9 Given the effort function (1), consider a discrete-time, infinite-horizon model in which price-taking worker–firm pairs choose the nominal wage W_t at each date t to maximize the expected discounted value of profits. For simplicity, assume that each worker–firm's production function is given by a · e, where a is a real technology shock that is idiosyncratic to the worker–firm match, is observed contemporaneously, and acts as the source of uncertainty in the model. It is convenient to express the firm's profit stream in constant date-t prices.
To this end, define the price level at date t as P_t and assume that it evolves according to P_t = e^π P_{t−1}, where π reflects inflation.10 Denoting the nominal counterparts A_t ≡ P_t a_t and B_t ≡ P_t b, and substituting for e_t, the value of a job with lagged nominal wage W₋₁ and nominal productivity A can be written in recursive form11 as

J(W₋₁, A) = max_W { A[ln(W/B) + χ ln(W/W₋₁)·𝟙] − W + βe^{−π} ∫ J(W, A′) dF(A′|A) },   (2)
where β ∈ [0, 1) is the real discount factor of the firm.

1.1. Some intuition for the model

To anticipate the model's results, this subsection provides intuition for each of the predictions of the model. First, the model predicts a spike at zero in the distribution of nominal wage changes. This arises because of the kink in the firm's objective function at the lagged nominal wage. Thus, there will be a range of values (a “region of inaction”) for the nominal shock, A, for which it is optimal not to change the nominal wage. Since A is distributed across firms, a positive fraction of firms each period will draw a realization of A that lies in their region of inaction, and these firms will not change their nominal wage.

Second, if a firm does decide to change the nominal wage, the wage change will be compressed relative to the case without wage rigidity. That nominal wage cuts are compressed is straightforward: wage cuts involve a

8 The precise parametric form of (1) is chosen primarily for analytical convenience. None of the qualitative results emphasized in what follows depends on the specific parametric form of (1); the key is that effort is increasing in the wage and kinked around the lagged nominal wage.
9 In Bewley's words: “The only one of the many theories of wage rigidity that seems reasonable is the morale theory of Solow...” (Bewley, 1999, p. 423), and “The [Solow] theory...errs to the extent that it attaches importance to wage levels rather than to the negative impact of wage cuts” (Bewley, 1999, p. 415). Given the intricacy of Bewley's study, however, he would probably consider (1) a simplification, not least for its neglect of morale as distinct from productivity, and of the internal wage structure of firms as a source of wage rigidity. I argue that it is a useful simplification, as it provides key qualitative insights into the implied dynamics of wage-setting under more nuanced theories of morale.
10 Strictly speaking, π is equal to the logarithm of one plus the inflation rate; thus π approximates the inflation rate only when inflation is low.
11 I adopt the convention of denoting lagged values by the subscript −1 and forward values by a prime, ′.
M.W.L. Elsby / Journal of Monetary Economics 56 (2009) 154–169
discontinuous fall in productivity at the margin, so the firm will be less willing to implement them. It is only slightly less obvious why nominal wage increases are also compressed in this way. The reason is that, in an uncertain world, increasing the wage today increases the likelihood that the firm will have to cut the wage, at a cost, in the future.

An additional, perhaps more fundamental, outcome of the model is that an inability to cut wages will tend to raise the wages that firms inherit from the past. Consequently, even in the absence of the forward-looking motive outlined above, wage increases will be compressed simply because firms do not have to increase wages by as much, or as often, in order to achieve their desired wage level.12,13

A final prediction concerns the effect of increased inflation on these outcomes. Compression of wage increases becomes less pronounced as inflation rises. Higher inflation implies that firms are less likely to cut wages either in the past or in the future. As a result, forward-looking firms no longer need to restrain raises as much as a precaution against future costly wage cuts. Likewise, higher inflation implies that wages inherited from the past are less likely to have been constrained by downward wage rigidity. Thus, firms will raise wages more often to reach their desired wage level.

1.2. The dynamic model

To make the above intuition precise, consider the solution to the full dynamic model, (2). Taking the first-order condition with respect to W, conditional on ΔW ≠ 0, yields

(1 + χ𝟙)(A/W) − 1 + βe^{−π} Δ(W, A) = 0 if ΔW ≠ 0,   (3)

where Δ(W, A) ≡ ∫ J_W(W, A′) dF(A′|A) is the marginal effect of the current wage choice on the future profits of the firm. A key step in solving for the firm's wage policy involves characterizing the function Δ(·). For the moment, however, note that the general structure of the wage policy is as follows:

Proposition 1.
The optimal wage policy in the dynamic model is given by

W = U⁻¹(A)   if A > U(W₋₁)              (raise)
W = W₋₁      if A ∈ [L(W₋₁), U(W₋₁)]    (freeze)
W = L⁻¹(A)   if A < L(W₋₁)              (cut)
   (4)

where the functions U(·) and L(·) satisfy

(U(W)/W) − 1 + βe^{−π} Δ(W, U(W)) ≡ 0,
(1 + χ)(L(W)/W) − 1 + βe^{−π} Δ(W, L(W)) ≡ 0.   (5)
Proposition 1 states that the firm's optimal wage takes the form of a trigger policy. For large realizations of the nominal shock A above the upper trigger U(W₋₁), the firm raises the wage. For realizations below the lower trigger L(W₋₁), the wage is cut. For intermediate values of A, nominal wages are left unchanged.14

To complete the characterization of the firm's wage policy, it is necessary to establish the functions U(·) and L(·). It can be seen from (5) that, in order to solve for these functions, one requires knowledge of the functions Δ(W, U(W)) and Δ(W, L(W)). This is aided by Proposition 2:

Proposition 2. The function Δ(·) satisfies

Δ(W, A) = ∫_{L(W)}^{U(W)} [(A′/W) − 1] dF − ∫_{0}^{L(W)} χ(A′/W) dF + βe^{−π} ∫_{L(W)}^{U(W)} Δ(W, A′) dF,   (6)
which is a contraction mapping in Δ(·), and thus has a unique fixed point.

The first term on the right-hand side of (6) represents tomorrow's expected within-period marginal benefit, given that W′ is set equal to W. To see this, note that the firm will freeze tomorrow's wage if A′ ∈ [L(W), U(W)], and that in this event a wage level of W today will generate a within-period marginal benefit of (A′/W) − 1. Similarly, the second term on the right-hand side of (6) represents tomorrow's expected marginal cost, given that the firm cuts the nominal wage tomorrow. Finally, the last term on the right-hand side of (6) accounts for the fact that, in the event that tomorrow's wage is frozen,

12 Identifying this additional effect is an important benefit of the infinite-horizon model studied here. Although compression due to forward-looking behavior would arise in a “simpler” two-period model, this additional effect will be shown to be an outcome of steady state considerations, which cannot be treated in a two-period context.
13 Compression of wage increases resulting from the impact of downward wage rigidity on past wages is implicit in the myopic model of Akerlof et al. (1996).
14 Concavity of the firm's problem in W ensures that the first-order condition (3) characterizes optimal wage setting when the wage is adjusted. The remainder of the result follows from the continuity of the optimal value of W in A. Intuitively, since the firm's objective, (2), is continuous in A and concave in W, realizations of A just above the upper trigger U(W₋₁) will lead the firm to raise the wage just above the lagged wage. A symmetric logic holds for wage cuts. Formally, continuity of the optimal value of W in the state variable A follows from the theorem of the maximum (see, e.g., Stokey et al., 1989, pp. 62–63).
the marginal effects of W persist into the future in a recursive fashion. It is this recursive property that provides the key to determining the function Δ(·).15

For the purposes of the present paper, a specific form for F(·) is used. Assume that real shocks, a, evolve according to the geometric random walk

ln a′ = μ + ln a − ½σ² + ε′,   (7)

where the innovation ε′ ~ N(0, σ²) and μ reflects productivity growth. Given that prices evolve according to P′ = e^π P, this yields the following process for nominal shocks, A:

ln A′ = μ + π + ln A − ½σ² + ε′,   (8)

where π reflects inflation. Note that this has the simple implication that E(A′|A) = exp(μ + π)·A, so that average nominal productivity rises in line with inflation and productivity growth.

This information can be used to determine the full solution as follows. First, the functions Δ(W, U(W)) and Δ(W, L(W)) can be solved for using equation (6) via the method of undetermined coefficients. Given these, the solutions for U(W) and L(W) can be obtained using the equations in (5). Proposition 3 shows that this method yields a wage policy that takes a simple piecewise linear form.

Proposition 3. If nominal shocks evolve according to the geometric random walk, (8), the functions U(·) and L(·) are of the form

U(W) = Ū·W   and   L(W) = L̄·W,   (9)
where Ū and L̄ are constants that depend upon the parameters of the model, {χ, β, μ, π, σ}.

2. Predictions

Anticipating the empirical results documented later, this section draws out a set of relationships predicted by the model that can be estimated using available data.

2.1. Compression of wage increases

An important outcome of the model is that downward wage rigidity naturally leads firms to reduce the magnitude of wage increases, and that this occurs through two effects. The first channel is implied by the properties of the coefficients of the optimal wage policy, Ū and L̄ in (9). While closed-form solutions for Ū and L̄ are not available, it is straightforward to compute them numerically. Doing so establishes that Ū > 1 > L̄ in the presence of downward wage rigidity (when χ > 0).

It is instructive to contrast this result with some simple special cases of the model. First, consider the frictionless model where χ = 0. Denoting frictionless outcomes with an asterisk, it is straightforward to show that Ū* = 1 = L̄*, so that frictionless wages W* are equal to the nominal shock A, and wage changes fully reflect changes in productivity. The result that L̄ < 1 in the general model therefore means that firms are avoiding wage cuts that they would have implemented in the absence of wage rigidity. This is a simple implication of the discontinuous fall in effort at the margin following a wage cut. Likewise, the result that Ū > 1 when χ > 0 implies that firms are reducing the wage, in the event that they increase pay, relative to a frictionless world. For a given level of the lagged wage, this serves to reduce the magnitude of wage increases, leading to one form of compression of wage increases. To understand the intuition for this result, a useful point of contrast is the special case of the model in which firms are myopic, β = 0. In this case, it is simple to show that Ū_{β=0} = 1 > L̄_{β=0} = 1/(1 + χ).
It follows that the result that Ū > 1 in the general model is driven by the forward-looking behavior of firms. Intuitively, raising the nominal wage today increases the likelihood that a firm will wish to cut the wage, at a cost, in the future.

The second source of compression of wage increases relates to the effect of downward wage rigidity on the lagged wages of firms. Specifically, firms' inability to reduce wages in the past will place upward pressure on the wage that they inherit from previous periods. As a result, firms do not need to raise wages as often, or by as much, to achieve any given level of the current wage, further reducing the magnitude of wage increases.

The joint forces of these two effects culminate in the following, perhaps surprising, result:

Proposition 4. Downward wage rigidity has no effect on aggregate wage growth in steady state.

This result can be interpreted as a simple requirement for the existence of a steady state in which average growth rates are equal. Since productivity shocks grow on average at a constant rate, wages must grow at that same rate in the long run. Even a model with downward wage rigidity must comply with this steady state condition.16

15 Proposition 2 states that this recursive property takes the form of a contraction mapping in Δ. To see that (6) is a contraction, note that Blackwell's sufficient conditions can be verified. Monotonicity of the map in Δ is straightforward. To see that discounting holds, note that βe^{−π} < 1 and the probability that A′ lies in the inaction region [L(W), U(W)] is less than one.
16 A similar result has been established in the investment literature by Bloom (2000).
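Given Proposition 3's proportional triggers, Δ(W, A) depends only on the ratio r = A/W, and the fixed point of the map (6) can be computed by direct iteration, with convergence guaranteed by the contraction property noted in footnote 15. The sketch below uses Monte Carlo integration and purely illustrative parameter values; the triggers Ū = 1.05 and L̄ = 0.85 are hypothetical stand-ins imposed from outside rather than solved jointly with (5):

```python
import numpy as np

# Illustrative parameters (not the paper's calibration)
chi, beta, mu, pi, sigma = 0.2, 0.95, 0.01, 0.02, 0.05
U_bar, L_bar = 1.05, 0.85            # hypothetical proportional triggers

rng = np.random.default_rng(0)
z = rng.standard_normal(400)         # common draws for the expectation over A'
grid = np.linspace(0.3, 3.0, 200)    # grid for r = A/W

def T(delta_vals):
    """One application of the map (6), expressed in terms of r = A/W."""
    shocks = np.exp(mu + pi - 0.5 * sigma**2 + sigma * z)
    r_next = np.clip(np.outer(grid, shocks), grid[0], grid[-1])
    freeze = (r_next >= L_bar) & (r_next <= U_bar)    # wage frozen tomorrow
    cut = r_next < L_bar                              # wage cut tomorrow
    cont = np.interp(r_next.ravel(), grid, delta_vals).reshape(r_next.shape)
    return np.mean((r_next - 1.0) * freeze            # marginal benefit if frozen
                   - chi * r_next * cut               # marginal cost if cut
                   + beta * np.exp(-pi) * cont * freeze,  # recursive term
                   axis=1)

delta = np.zeros_like(grid)
diffs = []
for _ in range(150):
    new = T(delta)
    diffs.append(np.max(np.abs(new - delta)))
    delta = new
```

Successive sup-norm differences shrink geometrically, since the modulus of the map is bounded by βe^{−π} times the probability of inaction.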
Fig. 1. The distribution of real wage growth implied by the model. The histogram is simulated from the model with downward wage rigidity (χ > 0). The solid line is the true density of frictionless wage growth (χ = 0). The dashed line is the density implied by imposing symmetry in the upper tail of the histogram.
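The qualitative features of Fig. 1, the spike at zero and the compressed tails, along with Proposition 4, can be checked with a short simulation of the trigger policy (4) under the proportional rule (9). The trigger constants below are again hypothetical stand-ins rather than values solved from the model:

```python
import numpy as np

def simulate(U_bar, L_bar, mu=0.01, pi=0.02, sigma=0.05,
             n_firms=20000, n_periods=60, burn=10, seed=1):
    """Simulate nominal wage growth under the trigger policy (4): freeze if A
    lies in [L_bar*W_-1, U_bar*W_-1], set W = A/U_bar above the band and
    W = A/L_bar below it. Returns mean log wage growth and the spike at zero."""
    rng = np.random.default_rng(seed)
    lnA = np.zeros(n_firms)
    lnW = np.zeros(n_firms)
    growth = []
    for t in range(n_periods):
        lnA += mu + pi - 0.5 * sigma**2 + sigma * rng.standard_normal(n_firms)
        lnW_new = np.clip(lnW, lnA - np.log(U_bar), lnA - np.log(L_bar))
        if t >= burn:
            growth.append(lnW_new - lnW)
        lnW = lnW_new
    g = np.concatenate(growth)
    return g.mean(), np.mean(g == 0.0)

rigid_mean, rigid_spike = simulate(U_bar=1.05, L_bar=0.85)   # hypothetical triggers
flex_mean, flex_spike = simulate(U_bar=1.0, L_bar=1.0)       # frictionless: W = A
```

Both economies deliver average nominal wage growth of roughly μ + π − σ²/2, as Proposition 4 requires, but only the rigid economy displays a spike of wage freezes at zero.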
Note that this result holds regardless of how forward-looking firms are. Even if firms are myopic (β = 0), so that they do not moderate pay increases relative to the frictionless benchmark, wage rigidity will still have no effect on aggregate wage growth. The reason is that the second channel through which wage increases are compressed then dominates, because firms inherit higher wages from the past.

2.2. Implications for the literature on wage rigidity

Surprisingly, none of the previous research on downward wage rigidity has taken account of the compression of wage increases that is implied by worker resistance to wage cuts (see, among others, Kahn, 1997; Card and Hyslop, 1997; Altonji and Devereux, 2000). This section shows that neglecting this compression can lead a researcher to overstate the effects of downward wage rigidity on the aggregate growth of real wages.

Fig. 1 illustrates the point. It shows three simulated wage growth distributions derived from the model of Section 1. The histogram shows the distribution of real wage growth in the presence of wage rigidity (χ > 0), whereas the solid line illustrates the true frictionless wage growth density (χ = 0). Comparing these two distributions provides a visual impression of the results highlighted above: downward wage rigidity leads to both fewer wage cuts and fewer wage increases. Fig. 1 also includes a “median symmetric” density (dashed line) that is implied if one assumes, erroneously, that downward wage rigidity has no effect on wage increases.17 It can be seen that using the median symmetric counterfactual yields an overestimate of the increase in average wage growth due to wage rigidity. This occurs as a direct result of the compression of the upper tail of wage growth. Neglecting this compression leads to an overstatement of the mass of desired frictionless wage cuts, and thereby of the effects of wage rigidity on average wage growth.
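The pitfall of the “median symmetric” counterfactual can be reproduced mechanically on simulated data: reflect the upper half of the rigid wage-growth distribution around its median and treat the mirror image as the frictionless lower tail. (The simulation again uses the illustrative triggers Ū = 1.05, L̄ = 0.85; the construction, not the particular numbers, is the point.)

```python
import numpy as np

# Simulate wage growth under rigidity with hypothetical triggers
rng = np.random.default_rng(2)
sigma, n_firms = 0.05, 20000
lnA = np.zeros(n_firms)
lnW = np.zeros(n_firms)
g = []
for t in range(40):
    lnA += 0.005 - 0.5 * sigma**2 + sigma * rng.standard_normal(n_firms)
    lnW_new = np.clip(lnW, lnA - np.log(1.05), lnA - np.log(0.85))
    if t >= 10:
        g.append(lnW_new - lnW)
    lnW = lnW_new
g = np.concatenate(g)

# Median-symmetric counterfactual: mirror the upper tail around the median
med = np.median(g)
upper = g[g >= med]
counterfactual = np.concatenate([upper, 2.0 * med - upper])

# Implied "effect of rigidity on mean wage growth"; in the model the true
# effect is zero by Proposition 4
estimated_effect = g.mean() - counterfactual.mean()
```

By construction the counterfactual mean collapses to the observed median, so the estimated effect is simply the gap between the mean and the median of the rigid distribution; whenever rigidity compresses the upper tail, that gap is a biased estimate of the true effect, which is zero here.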
This observation has important implications for the conclusions of the previous literature. Many studies report positive estimates of the increase in aggregate real wage growth driven by downward rigidity as a measure of the costs that wage rigidity imposes on firms. For example, the results of Card and Hyslop (1997) suggest that downward wage rigidity increases average real wage growth by around one percentage point in times of low inflation. Similar exercises are performed in Nickell and Quintini (2003), Fehr and Götte (2005), and Dickens et al. (2006), among others. These results are surprising in light of the model above: if downward wage rigidity had any effect on average real wage growth, it would imply a violation of steady state in the labor market. A natural question, then, is whether it is empirically the case that firms compress wage increases in the face of downward wage rigidity.

2.3. Empirical implications

The model implies two simple approaches to testing the prediction that firms compress wage increases in response to downward wage rigidity. The first is anticipated in Fig. 1: if firms compress wage increases, one should observe the upper tail of the distribution of wage growth shifting inwards as downward wage rigidity binds. The model also suggests when this will occur. When inflation is high, firms' desired wage growth, Δ ln W* = Δ ln A, is unlikely to be negative. Thus, downward rigidity of nominal wages is unlikely to bind now or in the future, firms will not compress wage increases, and

17 This is derived by imposing symmetry in the upper tail of the distribution of wage growth with χ > 0. This is, in fact, the method used by Card and Hyslop (1997) to generate an estimate of the frictionless wage growth distribution.
Table 1
Descriptive statistics of wage growth for the CPS, PSID, and NES.

(a) CPS
Year   Obs.     Spike   Δω<0
1980   25,626    5.70   53.39
1981   28,343    5.79   48.07
1982   27,426   10.41   45.76
1983   26,521   12.73   45.99
1984   26,675   12.76   46.29
1985   13,122   12.28   43.72
1986    6935    13.67   40.63
1987   27,348   13.68   45.94
1988   26,825   12.59   46.43
1989   26,736   11.99   47.90
1990   28,045   11.14   49.11
1991   28,688   11.61   46.52
1992   28,521   13.43   44.94
1993   28,468   13.25   45.73
1994   26,584   11.88   44.49
1995   10,227   12.20   45.32
1996    8458    11.46   44.68
1997   25,386   10.67   41.53
1998   25,255   10.31   38.00
1999   25,489    9.80   41.02
2000   25,215    9.68   44.19
2001   24,574    9.32   42.65
2002   26,575   10.32   42.38

(b) PSID
Year   Obs.   Spike   Δω<0
1971   1520   10.39   34.41
1972   1527   11.59   32.35
1973   1599    8.88   46.34
1974   1676    8.35   56.74
1975   1733    7.39   42.07
1976   1471    7.48   34.33
1977   1468    8.65   36.72
1978   1605    7.35   37.57
1979   1704    6.51   51.35
1980   1756    4.38   52.51
1981   1746    7.22   50.29
1982   1664    8.17   38.58
1983   1606   14.51   44.46
1984   1621   12.95   46.33
1985   1702   11.16   41.07
1986   1830   15.30   42.51
1987   1801   15.16   49.53
1988   1848   15.42   50.87
1989   1863   13.96   53.30
1990   1815   12.01   54.66
1991   2441   13.93   49.77
1992   2441   16.39   45.60

(c) NES
Year   Obs.     Spike   Δω<0
1976   60,318   0.67   41.00
1977   64,838   1.43   77.64
1978   66,168   2.15   33.73
1979   65,619   2.33   38.39
1980   66,574   0.44   46.81
1981   70,431   2.62   40.53
1982   75,745   3.01   49.34
1983   77,910   2.06   19.93
1984   75,652   5.09   41.62
1985   75,311   1.69   50.80
1986   74,487   1.39   18.88
1987   74,848   2.52   24.97
1988   73,440   1.55   20.57
1989   72,278   2.13   44.91
1990   70,752   2.49   50.33
1991   72,065   2.75   26.40
1992   76,335   4.87   30.87
1993   78,171   6.95   27.91
1994   78,167   6.36   48.14
1995   79,644   5.55   51.37
1996   82,489   1.53   32.31
1997   80,221   1.71   33.52
1998   76,999   4.08   51.19
1999   77,227   4.38   25.93

“Obs.” refers to the number of wage change observations per year. The spike is the fraction of wage changes equal to zero. “Δω<0” reports the fraction of real wage cuts each year.
the distribution of wage growth will converge to the solid line shown in Fig. 1. When inflation is low, however, downward wage rigidity will bind for many firms, wage increases will be compressed, and the distribution of wage growth will look like the histogram shown in Fig. 1.18 This suggests a simple visual test of the model: inspect the distribution of real wage growth in high compared to low inflation periods.

Second, the model yields predictions on the effect of inflation and productivity growth on the percentiles of the distribution of real wage growth. These percentiles can be approximated as follows:

Proposition 5. The percentiles of the distribution of real wage growth satisfy

E(Pⁿ | μ, π) ≈ μ − χ·p⁻(μ, π) + constₙ   if Pⁿ > −π,
E(Pⁿ | μ, π) ≈ −π                        otherwise,
E(Pⁿ | μ, π) ≈ μ + χ·p⁺(μ, π) + constₙ   if Pⁿ < −π,   (10)
where p⁻(μ, π) and p⁺(μ, π) are, respectively, the frictionless (χ = 0) probabilities of reducing and increasing the nominal wage.

A number of observations can be gleaned from Proposition 5. First, setting χ = 0 reveals that the frictionless percentiles of real wage growth are simply determined by the rate of productivity growth, μ, as one would expect. Second, the existence of wage rigidity reduces the upper percentiles of wage growth relative to the frictionless case, reflecting firms' compression of wage increases. Moreover, as inflation, π, and productivity growth, μ, rise, the frictionless probability that a firm wishes to reduce nominal wages, p⁻(μ, π), declines. Thus, on average one should observe the upper percentiles of real wage growth rising more than one-for-one with productivity growth, μ, and rising with inflation, π.

18 Likewise, high levels of productivity growth, μ, will also relax the constraint of downward wage rigidity.
Table 2
Summary statistics.

(a) CPS
Variable                     Obs.      Mean       Std. Dev.   Min        Max
Change in log real wage      547042    0.025061   0.310781    -5.72642   4.562072
Age                          547042    38.01381   12.52692    16         65
Female                       547042    0.501651   0.499998    0          1
Education: < High school     546516    0.175773   0.380628    0          1
Education: High school       546516    0.451021   0.497596    0          1
Education: Some college      546516    0.287882   0.452776    0          1
Education: College degree    546516    0.071936   0.258382    0          1
Education: Advanced degree   546516    0.013388   0.114931    0          1
Metropolitan area            521083    0.710509   0.453527    0          1
Non-white                    547042    0.204385   0.403252    0          1
Self-employed                546877    0.000104   0.010209    0          1

(b) PSID
Variable                     Obs.      Mean       Std. Dev.   Min        Max
Change in log real wage      33283     0.022087   0.337482    -3.72463   4.619859
Age                          33283     38.24457   11.53739    18         65
Female                       33283     0.196617   0.397446    0          1
Education: 0–5 grades        30671     0.036256   0.186929    0          1
Education: 6–8 grades        30671     0.114375   0.318272    0          1
Education: 9–11 grades       30671     0.218382   0.413155    0          1
Education: 12 grades         30671     0.448469   0.497346    0          1
Education: Some college      30671     0.130253   0.336587    0          1
Education: College degree    30671     0.040201   0.196433    0          1
Education: Advanced degree   30671     0.012064   0.109171    0          1
Tenure: [1, 1.5] years       30536     0.092907   0.290306    0          1
Tenure: (1.5, 3.5) years     30536     0.204546   0.403376    0          1
Tenure: [3.5, 9.5) years     30536     0.354008   0.47822     0          1
Tenure: [9.5, 19.5) years    30536     0.236999   0.425249    0          1
Tenure: 19.5 years +         30536     0.111541   0.314805    0          1
Self-employed                33257     0.014313   0.118779    0          1

(c) NES
Variable                     Obs.      Mean       Std. Dev.   Min        Max
Change in log real wage      1922184   0.026539   0.190503    -9.9292    9.757886
Age                          1922184   41.01464   11.85092    16         65
Female                       1922184   0.409069   0.491662    0          1
Major union coverage         1922029   0.426511   0.49457     0          1
London dummy                 1919091   0.144433   0.351528    0          1

The CPS sample also contains 2-digit industry classifications and 50 regional dummies; the PSID sample also contains 1-digit industry, 1-digit occupation, and 6 region dummies; the NES sample also contains 2-digit industry, 2-digit occupation, and 10 region dummies.
Downward wage rigidity also implies that a non-negligible range of the lower percentiles of wage growth will correspond exactly to zero nominal wage growth, that is, to real wage growth at minus the rate of inflation, −π. In this regime of Eq. (10), the lower percentiles of wage growth fall one-for-one with the rate of inflation by definition. It is through this effect that increases in inflation “grease the wheels” of the labor market, allowing firms to achieve reductions in labor costs without resorting to costly nominal wage cuts.

Finally, Eq. (10) implies that very low percentiles of the wage growth distribution, those corresponding to nominal wage cuts (real wage cuts of greater magnitude than π), will also rise with inflation and productivity growth. To see this, note that these percentiles are increasing in the frictionless probability of raising wages, p⁺(μ, π), which in turn is increasing in μ and π. This last result can seem odd at first. However, the logic behind it mirrors the intuition for the effects of μ and π on the upper percentiles. When inflation and real wage growth are high, a firm expects that it will likely reverse any nominal wage cut in the near future. As a result, the firm is less inclined to incur the costs of reducing wages, and wage cuts are reduced in magnitude for a given lagged wage. Moreover, high inflation and productivity growth relax the upward pressure that downward wage rigidity places on the wages firms inherit from the past. As a result, firms do not need to reduce wages as often, or by as much, to achieve any given level of the current wage, further reducing the magnitude of wage cuts.
3. Empirical implementation The data used in the empirical analysis are taken from the Current Population Survey (CPS) and the Panel Study of Income Dynamics (PSID) for the US, and from the New Earnings Survey (NES) for Great Britain. For all datasets, the relevant
Fig. 2. US and UK inflation over the sample periods.
wage measure used is the basic hourly wage rate for respondents aged 16–65. The CPS samples are taken from longitudinally linked merged outgoing rotation group files from 1979 to 2002. The PSID data are taken from the random (not poverty) samples for the years 1971–1992. The NES for Great Britain is an individual-level panel for each year running from 1975 through 1999. Since the descriptive properties of wage rigidity in these datasets have been well explored in previous analyses,19 the purpose here is not to provide a full descriptive account of downward wage rigidity. For reference, though, Tables 1 and 2 present summary statistics for wage growth and the key variables used in the forthcoming analysis.

It should be noted, however, that the NES data for Great Britain have a number of key advantages for the purposes of this paper, especially in comparison with the CPS and PSID samples for the US. Most starkly, the NES yields comparatively very large samples: 60,000–80,000 wage change observations each year. A second advantage of the NES data is its sample period, 1975–1999. This is useful because variation in the rate of inflation will be used in what follows to gauge the impact of wage rigidity on wage growth, and the UK experienced significant variation in inflation over this period relative to the US (see Fig. 2). A final key advantage of the NES sample is that measurement error in these data is less problematic than in the individually reported data of the CPS and PSID samples. The reason is that the NES is collected from employers' payroll records, thereby leaving less scope for error (see Nickell and Quintini, 2003, for more on this). This is important because previous empirical studies have gone to some lengths to control for the effects of measurement error (Smith, 2000; Altonji and Devereux, 2000).
The relative accuracy of the NES allows us to concentrate on substantive questions, and is thus an important virtue in this context.20
3.1. The impact of low inflation on wage growth

This section explores whether the empirical predictions of Section 2 are borne out in the data summarized above. First, visual evidence for the compression of wage increases is presented using the empirical distribution of wage growth, as anticipated in Fig. 1. In addition, evidence on the effects of inflation and real growth on the percentiles of wage growth, based on Eq. (10), is also assessed.

Visual evidence from the distribution of wage growth. As noted in Section 2.3, a particularly simple approach is to observe differences in the distribution of wage growth in periods of high inflation compared to periods of low inflation. To this end, Figs. 3(a) and 4(a) present estimates of the density of log real wage growth for periods with different inflation rates using the PSID for the US and the NES for Great Britain.21 Notice that lower inflation leads to a compression of the lower and, more importantly for the purposes of this paper, the upper tail of the wage change distribution, precisely in accordance with the predictions of Section 2.

One could argue, however, that at least some of the observed differences are due to changes in other variables, such as the industrial, age, gender, and regional compositions of the workforce. To address this, a set of micro-level control variables is introduced for each dataset, summarized in Table 2. Changes in these variables are controlled for using the method of DiNardo et al. (1996), henceforth “DFL”. This method is useful because it requires no parametric assumptions on the effects of these controls on wage growth. Given the intrinsically non-linear character of the wage policy (4), this is especially helpful.

19 See Card and Hyslop (1997) for the CPS, Kahn (1997) and Altonji and Devereux (2000) for the PSID, and Nickell and Quintini (2003) for the NES.
20 Nickell and Quintini (2003) compared the accuracy of hourly wage changes in the NES with those obtained from a sample whose payslips were checked in the British Household Panel Study, and found remarkably similar properties in both datasets.
21 The time-varying accuracy of the wage imputation flags in the CPS makes this a less useful exercise for the CPS data.
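For a single discrete characteristic, the DFL re-weighting reduces to scaling each year-t observation by the ratio of base-year to year-t cell shares. The toy example below (hypothetical categories and sample sizes) shows the mechanics; the paper instead estimates the weights with a probit over many controls:

```python
import numpy as np

def dfl_weights(x_t, x_T):
    """DFL weights psi = dF(x | base year T) / dF(x | year t) for a discrete
    characteristic x; re-weighting year-t data by psi matches the base-year
    distribution of x."""
    cells = np.unique(np.concatenate([x_t, x_T]))
    share_t = np.array([(x_t == c).mean() for c in cells])
    share_T = np.array([(x_T == c).mean() for c in cells])
    ratio = {c: sT / st for c, st, sT in zip(cells, share_t, share_T)}
    return np.array([ratio[c] for c in x_t])

# Hypothetical example: the workforce shifts toward service jobs by the base year
rng = np.random.default_rng(3)
x_t = rng.choice(["manual", "service"], size=5000, p=[0.6, 0.4])
x_T = rng.choice(["manual", "service"], size=5000, p=[0.4, 0.6])
w = dfl_weights(x_t, x_T)
```

After re-weighting, the year-t share of each cell matches its base-year share exactly, so composition changes are purged before wage growth densities are compared across periods.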
Fig. 3. Density estimates of log real wage growth distributions (PSID), comparing the 1971–82 and 1983–92 periods. Results use an Epanechnikov kernel over 250 data points with a bandwidth of 0.005. Micro controls for the re-weighted density are age, sex, education, 1-digit industry, 1-digit occupation, region, self-employment, and tenure. (a) Without re-weighting. (b) With re-weighting.
Fig. 4. Density estimates of log real wage growth distributions (NES), comparing the 1976–81, 1982–91, and 1992–2001 periods. Results use an Epanechnikov kernel over 250 data points with a bandwidth of 0.005. Micro controls for the re-weighted density are age, sex, region (including a London dummy), 2-digit industry, 2-digit occupation, and major union coverage. (a) Without re-weighting. (b) With re-weighting.
The DFL procedure is a simple re-weighting of the observed distribution of wage growth to estimate the counterfactual distribution that would prevail if the distribution of worker characteristics did not change.22 Figs. 3(b) and 4(b) display density estimates of the DFL re-weighted distribution of log real wage changes for different inflation periods for the PSID and NES, respectively. Again, one can clearly detect that lower rates of inflation are associated with a compression of both tails of the wage growth distribution, in line with the predictions of Section 2.

Evidence from percentile regressions. To assess whether the distribution of wage growth varies systematically with the impact of downward wage rigidity, the effects of inflation on the percentiles of real wage growth are now estimated based on the results of Proposition 5. In particular, regressions of the following form are estimated:

Pⁿ_rt = αₙ + βₙ λ_rt + ηₙ π_t + z′_rt φₙ + εⁿ_rt,   (11)

where Pⁿ_rt is the nth percentile of the DFL re-weighted real wage growth distribution in region r at time t derived above, and z_rt is a vector of controls that could potentially affect the distribution of wage growth. To estimate (11), measures of frictionless average real wage growth, λ_rt, and of the inflation rate, π_t, are needed. For the latter, the CPI-U-X1 series for the US and the April-to-April log change in the Retail Price Index for Great Britain are used.

22 Denote a “base year”, T (this will be the final sample year), worker characteristics, x, and the year of the relevant x distribution, t_x. The time t counterfactual distribution of wage growth can be written as f(Δ ln ω_t; t_x = T) = ∫ f(Δ ln ω|x) dF(x|t_x = T) = ∫ f(Δ ln ω|x) ψ dF(x|t_x = t), where ψ = dF(x|t_x = T)/dF(x|t_x = t) = [Pr(t_x = T|x)/Pr(t_x = t|x)]·[Pr(t_x = t)/Pr(t_x = T)], and where the second equality follows from Bayes' rule. The weights ψ are estimated using a probit model.
Table 3
The effect of inflation and mean real wage growth on percentiles of real wage growth.

(a) CPS (US, 1980–2002)
Percentile   Coefft. on π_t      Coefft. on λ_rt
P10          -0.124 [0.080]      1.494 [0.160]**
P20          -0.381 [0.054]**    0.821 [0.110]**
P30          -0.312 [0.087]**    0.295 [0.058]**
P40          -0.017 [0.021]      0.662 [0.032]**
P60           0.018 [0.022]      0.792 [0.050]**
P70           0.061 [0.029]*     1.015 [0.068]**
P80           0.101 [0.039]*     1.277 [0.072]**
P90           0.205 [0.061]**    1.704 [0.094]**
wsu = +0.753%    wsb = -0.713%    λ = +0.040%

(b) PSID (US, 1971–1992)
Percentile   Coefft. on π_t      Coefft. on λ_rt
P10          -0.312 [0.094]**    0.695 [0.140]**
P20          -0.551 [0.110]**    0.404 [0.110]**
P30          -0.003 [0.048]      0.902 [0.060]**
P40          -0.013 [0.037]      0.927 [0.042]**
P60           0.022 [0.037]      0.999 [0.046]**
P70           0.090 [0.053]      1.113 [0.048]**
P80           0.205 [0.083]*     1.224 [0.082]**
P90           0.301 [0.092]**    1.507 [0.110]**
wsu = +1.517%    wsb = -1.366%    λ = +0.150%

(c) NES (UK, 1976–1999)
Percentile   Coefft. on π_t      Coefft. on λ_rt
P10          -0.117 [0.050]*     1.086 [0.120]**
P20          -0.212 [0.020]**    0.835 [0.071]**
P30          -0.142 [0.019]**    0.841 [0.051]**
P40          -0.0905 [0.015]**   0.883 [0.036]**
P60           0.0669 [0.013]**   1.065 [0.034]**
P70           0.151 [0.016]**    1.190 [0.043]**
P80           0.184 [0.018]**    1.205 [0.054]**
P90           0.153 [0.033]**    1.057 [0.082]**
wsu = +1.085%    wsb = -1.063%    λ = +0.021%

Least squares regressions weighted by region size. Controls include the variables listed in Table 2, region, and the absolute change in inflation, as well as, for the CPS: 2-digit industry, current and lagged state unemployment rates, and dummies for the years 1989–93 and 1994–95 to control for the effects of imputed wage data; for the PSID: 1-digit industry and occupation; for the NES: 2-digit industry and occupation, current and lagged regional unemployment rates, and a dummy for the year 1977 to control for the incomes policy of that year. Standard errors in brackets are robust to non-independence within years. The statistics wsu, wsb, and λ are described in the main text. *Significant at the 5% level. **Significant at the 1% level.
To measure λ_{rt}, the result of Proposition 4 is invoked—i.e. that wage rigidity has no effect on average wage growth in the model. Thus, λ_{rt} is measured using the observed regional average real wage growth rate.23 In accordance with Eq. (10), (11) is estimated by least squares weighted by the size of the region at each date. The control variables, z_{rt}, used are as follows. First, controls for the absolute change in the rate of inflation are included. This is motivated by the hypothesis that greater inflation volatility will yield greater dispersion in relative wages regardless of the existence of wage rigidity (see Groshen and Schweitzer, 1999). In addition, the current and lagged regional unemployment rates are included. This is motivated by the idea that the existence of downward wage rigidity may have unemployment effects. This will lead to workers "leaving" the wage change distribution, and so any resulting distributional consequences are controlled for.24

Based on the predictions of Proposition 5, the coefficients of interest in (11) for estimating the effects of wage rigidity are η_n and β_n. Recall that the key prediction being tested—that downward wage rigidity leads to compression of wage increases—implies that upper percentiles of real wage growth will rise with inflation, and will rise more than one-for-one with average real wage growth. Thus, the model predicts that η_n > 0 and that β_n > 1 for large n.

The results from estimating (11) for each dataset are reported in Table 3. The results provide strong evidence that the upper tail of the wage growth distribution is compressed as a result of downward wage rigidity, consistent with the predictions of the model. To see this, first consider the results for the upper percentiles of real wage growth. For all datasets, the estimated impact of inflation is positive for the 70th–90th percentiles, and is often significant.
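The region-size-weighted estimation of Eq. (11) amounts to applying the standard weighted-least-squares formula, solving (X′WX)b = X′Wy percentile by percentile. A minimal sketch on synthetic, noise-free data (all variable names are hypothetical):

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'W y."""
    Xw = X * w[:, None]            # each row of X scaled by its weight
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

rng = np.random.default_rng(0)
n = 200
lam = rng.normal(0.02, 0.01, n)    # regional mean real wage growth
infl = rng.uniform(0.01, 0.20, n)  # inflation rate
size = rng.uniform(1.0, 5.0, n)    # region size (the WLS weights)

# a noise-free 70th percentile with slope 1.2 > 1 on lam and 0.3 > 0 on
# inflation: the pattern predicted for upper percentiles when downward
# rigidity compresses wage increases
p70 = 0.01 + 1.2 * lam + 0.3 * infl

X = np.column_stack([np.ones(n), lam, infl])
a_hat, b_hat, eta_hat = wls(X, p70, size)
```

With noise-free data the estimator recovers the planted coefficients exactly, confirming the weighting does not bias the slopes.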
Likewise, the coefficients on aggregate wage growth exceed unity for these upper percentiles of the real wage growth distribution, and are strongly significant. Recall from Section 2.3 that these results are consistent with higher inflation and aggregate wage growth easing the compression of wage increases, as implied by the model of worker resistance to wage cuts. It is worth noting that these effects are particularly significant in the NES data for Great Britain. This is to be expected given the advantages of these data noted above: the British economy experienced large variation in the rate of inflation over the sample period; the data are taken from employer records, minimizing measurement error problems; and the sample sizes are large. These all aid the ability of the regressions based on Eq. (11) to detect the effects of inflation and mean wage growth where they exist. For reference, Table 3 also reports estimates of the effects of inflation and average real wage growth on lower percentiles. Note that the predictions of the model on the η_n and β_n for lower percentiles depend on the position of zero nominal wage growth in the distribution of real wage growth. For percentiles that predominantly lie in the spike at zero nominal wage growth over the sample period, Eq. (10) implies that η_n < 0, and that the effects of λ_{rt} will be attenuated toward zero. For very low percentiles of real wage growth that predominantly lie below the spike at zero, however, Eq. (10) implies that η_n > 0 and β_n > 1.
23 A trimmed mean for regional real wage growth is used to exclude the effects of outliers on λ_{rt}. I trim log real wage growth below −50 log points and above 50 log points. To see that such observations are rare, see Figs. 3 and 4.
24 Additionally, controls for any distortion to the wage growth distributions due to limitations of the datasets used are included.
The results listed in Table 3 for the lower tail of the wage growth distribution are also consistent with the predictions of the model. The spike at zero nominal wage growth appears between the 20th and 30th percentiles in the CPS data, the 10th and 30th percentiles in the PSID, and the 20th and 40th percentiles in the NES. As predicted in Section 2.3, higher inflation has a significantly negative effect, and the effects of aggregate wage growth, λ_{rt}, are attenuated toward zero, at these percentiles. Likewise, for percentiles that lie below the spike at zero nominal wage growth it can be seen that the effect of higher inflation is diminished and the coefficient on average regional wage growth rises above unity once more.

Together, these results provide strong evidence for the prediction that the upper tail of the wage growth distribution will be compressed as a result of downward wage rigidity. In all datasets one can detect statistically significant compression of wage increases as inflation and mean wage growth decline. A natural question in light of this is the economic significance of the estimates listed in Table 3. This is addressed by now estimating the increase in real wage growth implied by these estimates.

The effect of downward wage rigidity on aggregate wage growth. Proposition 4 showed that downward wage rigidity should have no effect on aggregate wage growth. This contrasts with previous literature that has reported positive estimates of the effects of downward wage rigidity on average real wage growth. A possible reason is that this literature has neglected the compression of wage increases as a result of worker resistance to wage cuts. This section derives estimates of the effect of downward wage rigidity on aggregate wage growth based on the results listed in Table 3.
Specifically, the difference between average real wage growth when inflation is low (π_L) and average real wage growth when inflation is high (π_H) is estimated,

λ̂ = Ê(Δln ω | π_L, x, z) − Ê(Δln ω | π_H, x, z).   (12)

To do this, note that the mean of a random variable may be expressed as a simple average of its percentiles, so that Ê(Δln ω | π, x, z) can be estimated as a simple average of the predicted values of the percentiles of wage growth obtained from estimating Eq. (11). These predicted percentiles also allow a discretization of the entire distribution of wage growth, so that the increase in aggregate wage growth due to downward wage rigidity can be decomposed into two components. The first is the increase in wage growth due to restricted nominal wage cuts in times of low inflation. Following the literature, this is referred to as the "wage sweep up" (wsu). The second component is the reduction in average wage growth due to compressed wage increases under low inflation, the "wage sweep back" (wsb).25 The sum of the wsu and the wsb is therefore equal to λ̂. Since the literature has ignored the wsb effect, the wsu provides an estimate of the increase in aggregate wage growth comparable to the estimates in the literature. Therefore, the comparison of wsu with λ̂ provides a sense of the overestimate of the increase in aggregate wage growth implied by ignoring compression of wage increases. This procedure is performed on 99 estimated wage growth percentiles using a value for π_H equal to 20% (the midpoint of the sample maxima in the US and Great Britain; see Fig. 2) and a value for π_L equal to 1% (the sample minimum for both the US and Great Britain). The results are reported in the lower panel of Table 3. Consistent with previous literature, estimates of the wsu due to constrained wage cuts range from 0.75 percentage points to 1.5 percentage points. These values span the estimates of Card and Hyslop (1997), which suggest that the wsu due to low inflation is in the order of 1 percentage point.
However, the wsb due to compressed wage increases is of similar magnitude, ranging from −0.71 to −1.37 percentage points, and serves to offset the effects of constrained wage cuts, exactly along the lines of the predictions of Section 2. Together, these imply that the increase in aggregate wage growth under low inflation lies in the range of 0.02–0.15 percentage points. These values are an order of magnitude smaller than the estimates a researcher would obtain by neglecting the compression of wage increases in times of low inflation. Thus, as anticipated by the theoretical predictions of Section 2, there is abundant empirical evidence that firms compress wage increases in response to downward wage rigidity. Moreover, the evidence is both statistically and economically significant: neglecting the compression of wage increases leads to a substantial overestimate of the increase in wage growth due to downward wage rigidity.
4. Macroeconomic implications

The preceding sections have shown, both as a theoretical and as an empirical matter, that there is little reason to believe that downward wage rigidity imposes costs on firms by raising the rate of growth of real wages. This section turns to the question of how exactly downward wage rigidity imposes costs on firms, and whether or not these costs are large.

25 Specifically, the wsu and wsb are equal to

ŵsu = Ê(Δln ω · 1(Δln ω < −π) | π_L) − Ê(Δln ω · 1(Δln ω < −π) | π_H),
ŵsb = Ê(Δln ω · 1(Δln ω ≥ −π) | π_L) − Ê(Δln ω · 1(Δln ω ≥ −π) | π_H),

where Ê(Δln ω · 1(Δln ω < −π) | π) is estimated from the predicted percentiles of wage growth and 1(·) is the indicator function.
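On an equal-weight grid of predicted percentiles, the decomposition in footnote 25 is mechanical: average the percentiles that represent nominal wage cuts and those that do not, separately at π_L and π_H. The sketch below assumes, as in the footnote, that real wage growth below −π corresponds to a nominal cut:

```python
def decompose(pcts_low, pcts_high, pi_low, pi_high):
    """Footnote-25 decomposition on equal-weight predicted percentiles.

    An observation counts as a nominal wage cut when real wage growth is
    below -pi (so that nominal wage growth is negative).  Returns
    (wsu, wsb, lam_hat), with wsu + wsb = lam_hat by construction.
    """
    n = len(pcts_low)
    def part(q, pi, is_cut):
        return sum(x for x in q if (x < -pi) == is_cut) / n
    wsu = part(pcts_low, pi_low, True) - part(pcts_high, pi_high, True)
    wsb = part(pcts_low, pi_low, False) - part(pcts_high, pi_high, False)
    lam_hat = sum(pcts_low) / n - sum(pcts_high) / n
    return wsu, wsb, lam_hat

# illustrative 99-point grids of predicted real wage growth percentiles
low = [i / 1000 - 0.03 for i in range(99)]    # low-inflation regime
high = [i / 500 - 0.10 for i in range(99)]    # high-inflation regime
wsu, wsb, lam_hat = decompose(low, high, 0.01, 0.20)
```

Because the cut/no-cut indicator partitions each distribution, wsu + wsb = λ̂ holds exactly whatever the predicted percentiles are; the economics lies in how the two pieces move with inflation.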
4.1. Approximating the costs of wage rigidity to a firm

A simple approximation to the reduction in the average value of a match due to wage rigidity can be derived from the model. Denote the latter as C ≡ E(J* − J), where J* is the frictionless (c = 0) value of a match. It is straightforward to show that C can be approximated by26

C(μ, π) ≈ γ(μ, π) · E(ω*)/(1 − βe^μ),  where  γ(μ, π) ≡ c·|E(Δln W*·1(Δln W* < 0) | μ, π)|.   (13)
Eq. (13) is useful from a number of perspectives. First, it shows that the costs of wage rigidity to firms are driven by the reductions in productivity that firms must accept when they reduce wages, rather than by direct increases in the cost of labor. To see this, note from Eq. (2) that C is equal to the average decline in worker effort due to wage cuts. The latter suggests another attractive feature of Eq. (13): it provides an approximation to the costs of wage rigidity to firms that is unaffected by the (simplifying) assumption that workers' productivity depends on the log real wage in (1). Finally, noticing that E(ω*)/(1 − βe^μ) is the discounted value of frictionless real labor costs, Eq. (13) also has the simple interpretation that the reduction in the value of a match due to wage rigidity is approximately equivalent to increasing the level of average real wages by a factor γ, which is equal to the marginal productivity cost of a 1% nominal wage cut, c, times the expected frictionless nominal wage cut, |E(Δln W*·1(Δln W* < 0))|. Thus γ can be interpreted as a compensating wage differential that compensates firms for the costs induced by downward wage rigidity.

To get a quantitative sense for γ(μ, π), it is necessary to quantify c and |E(Δln W*·1(Δln W* < 0) | μ, π)|. To quantify the latter in the model, all one needs is a value for the dispersion of idiosyncratic shocks, σ. The wage growth distributions summarized in Figs. 3 and 4 imply a value for σ approximately equal to 0.1.27 Quantifying the marginal effort cost of a 1% wage cut, c, is less straightforward. In the model, c is closely related to the size of the spike at zero in the distribution of nominal wage growth. Obtaining a value for the spike at zero nominal wage growth is complicated by measurement error in wages, which can bias down the observed spike by making true wage freezes appear as small changes (Altonji and Devereux, 2000).
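To see how γ(μ, π) behaves, one can evaluate c·|E(Δln W*·1(Δln W* < 0))| under a one-shot normal approximation Δln W* ~ N(μ + π, σ²). This is only an illustrative simplification (Table 4 is computed from the full dynamic model), but it delivers the same comparative statics:

```python
from math import erf, exp, pi as PI, sqrt

def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def norm_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * PI)

def gamma(c, sigma, mu, infl):
    """Compensating differential c*|E[X 1(X<0)]| for X ~ N(mu+infl, sigma^2).

    Uses the closed form E[X 1(X<0)] = m*Phi(-m/sigma) - sigma*phi(m/sigma)
    with m = mu + infl.
    """
    m = mu + infl
    cut = m * norm_cdf(-m / sigma) - sigma * norm_pdf(m / sigma)
    return c * abs(cut)

# c = 0.02 is the smallest spike's value in Table 4; sigma = 0.1 is the
# dispersion implied by Figs. 3 and 4
g_zero = gamma(0.02, 0.1, 0.0, 0.0)       # mu = pi = 0
g_boom = gamma(0.02, 0.1, 0.025, 0.05)    # high growth, high inflation
g_defl = gamma(0.02, 0.1, -0.025, -0.05)  # negative growth, deflation
```

γ falls as μ + π rises, matching the first observation about Table 4 below; at μ = π = 0 this rough calculation gives γ of roughly 0.08%, the same order of magnitude as the corresponding Table 4 entry.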
To address this, Table 4 reports the implied values for the compensating wage differential for an array of values for the spike at zero nominal wage growth that would prevail under zero inflation and productivity growth in the model. Values for the spike between 0.075 (approximately the maximum value observed in the NES data; see Table 1) and 0.4 (more than double the largest values observed in the datasets in Table 1) are considered.

A number of observations can be gleaned from Table 4. First, for any value of the spike, the compensating differential imposed by downward wage rigidity declines as inflation and productivity growth rise. The intuition for this is simple: higher inflation and productivity growth imply that desired wage growth is higher, and consequently downward wage rigidity is less binding, thereby imposing smaller costs on firms. Second, higher values of the spike are associated with larger compensating differentials. The simple reason is that larger values of the spike are indicative of larger values of c, which in turn raise the costs of wage rigidity. Importantly, a third implication of Table 4 is that, for rates of inflation and productivity growth observed in the data,28 an upper bound on the costs of wage rigidity to firms is that they are equivalent to an increase in average real labor costs of around 0.68 percentage points. Interestingly, Table 4 also implies that this compensating differential would be much larger in the event of trend deflation in prices or negative productivity growth. For an inflation rate of −5 percent and productivity growth of −2.5 percent, downward wage rigidity could be equivalent to an increase in aggregate real wages of up to 1.5 percentage points.

4.2. Are the costs large or small?

A natural question is whether these costs are large or small. This question is examined from two important perspectives. In this subsection, the implied long run employment effects are addressed.
It is possible to embed the model of an ongoing employment relationship from Section 1 into a simple model of the aggregate labor market in the long run. Assume that there is free entry of firms into the creation of new jobs and that new jobs are ex ante identical. It follows that the expected profits of a firm upon creating a job must equal zero in equilibrium. By reducing expected profits relative to a frictionless environment, downward wage rigidity leads to a reduction in the level of average wages that is consistent with zero profits. The required reduction in average wages is that which is equivalent to the reduction in expected profits due to downward wage rigidity. Eq. (13) tells us exactly that: it says that average wages must fall by the compensating differential factor γ relative to a frictionless world in order to maintain zero expected profits in the presence of downward wage rigidity. Firms

26 Note that, for c ≈ 0, one can write C ≈ (dC/dc|_{c=0})·c. Then note that dC/dc|_{c=0} = E[∂J/∂c + (∂J/∂W)(∂W/∂c)]|_{c=0} = E[Σ_{t=0}^∞ β^t a_t Δln W*_t 1_t] = E(Δln W*·1)E(a)/(1 − βe^μ), where the second equality follows from the envelope theorem and the third follows from the independence of Δln A and a. Noting that, when c = 0, ω* = a leads to Eq. (13).
27 The standard deviations in high inflation periods implied by Figs. 3 and 4 are, respectively, 0.10 in the PSID data and 0.11 in the NES data. These values differ from the standard deviations reported in Table 2 because the latter include outlier wage changes that are likely to be driven by measurement error.
28 Inflation in both countries over the period remained above 1% (see Fig. 2) and annual growth in output per hour remained above 1%.
Table 4
Compensating wage differential γ(μ, π) for different values of inflation (π), real productivity growth (μ), and the spike at zero nominal wage growth for μ = 0, π = 0.

Spike = 0.075 (c = 0.02)
μ \ π     −0.05    −0.025   0        0.025    0.05     0.1
−0.025    0.184%   0.147%   0.113%   0.085%   0.061%   0.029%
0         0.147%   0.113%   0.085%   0.061%   0.043%   0.018%
0.025     0.113%   0.085%   0.061%   0.043%   0.029%   0.011%
0.05      0.085%   0.061%   0.043%   0.029%   0.018%   0.007%

Spike = 0.15 (c = 0.045)
μ \ π     −0.05    −0.025   0        0.025    0.05     0.1
−0.025    0.414%   0.330%   0.255%   0.191%   0.138%   0.064%
0         0.330%   0.255%   0.191%   0.138%   0.096%   0.041%
0.025     0.255%   0.191%   0.138%   0.096%   0.064%   0.025%
0.05      0.191%   0.138%   0.096%   0.064%   0.041%   0.015%

Spike = 0.4 (c = 0.16)
μ \ π     −0.05    −0.025   0        0.025    0.05     0.1
−0.025    1.472%   1.172%   0.907%   0.679%   0.491%   0.229%
0         1.172%   0.907%   0.679%   0.491%   0.342%   0.146%
0.025     0.907%   0.679%   0.491%   0.342%   0.229%   0.090%
0.05      0.679%   0.491%   0.342%   0.229%   0.146%   0.053%
achieve this in the model by reducing the average initial wage. The implied reduction in equilibrium employment due to downward wage rigidity, therefore, is simply equal to the long run percentage point reduction in the average real wage, γ, times the long run elasticity of the effective supply of workers. In their analysis of the employment effects of long run reductions in wages of low skilled workers, Juhn et al. (1991) report estimates of the long run supply elasticity that lie below 0.4 for the US (see pp. 112–121). Applying this upper bound to the values of the compensating differential γ in Table 4 suggests that, for observed rates of inflation and productivity growth, the reduction in employment attributable to downward wage rigidity will lie below 0.4 × 0.68 ≈ 0.27 percentage points in the US.

There is no comparable estimate of the long run labor supply elasticity for Great Britain. However, because the NES data for Great Britain are drawn from payroll records, and therefore are relatively free from measurement error, it is arguable that the observed spike reported in Table 1 is likely to be representative of the true spike. Table 1 reveals that the spike consistently lies below 7 percent in the NES data. Even if the long run elasticity of the supply of workers were as high as 2, the values of γ listed in Table 4 suggest a reduction in employment attributable to downward wage rigidity of approximately 2 × 0.085 = 0.17 percentage points for observed rates of inflation and productivity growth. Compared to either the cyclical or the secular variation in the unemployment rate in the US and Great Britain experienced over the period considered in this paper, this number is very small. The values for the compensating wage differential listed in Table 4 also highlight that matters could be different in a context of trend deflation or negative growth.
The results of Table 4 suggest that in such an environment, the reduction in employment generated by wage rigidity could be as much as 0.4 × 1.5 = 0.6 percentage points.
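The long-run employment bounds above are simple products of a labor supply elasticity and the compensating differential γ; a quick check of the arithmetic, with values taken from Table 4 and the text:

```python
# US: supply elasticity below 0.4 (Juhn et al., 1991), gamma bounded by
# roughly 0.68% for observed inflation and growth (Table 4, spike = 0.4)
us_bound = 0.4 * 0.68            # percentage points

# UK: even with an elasticity as high as 2, the relevant gamma is 0.085%
uk_bound = 2.0 * 0.085

# deflation scenario: elasticity 0.4 times gamma of up to 1.5%
deflation_bound = 0.4 * 1.5
```

These reproduce the 0.27, 0.17 and 0.6 percentage point figures quoted in the text.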
4.3. Overstatement of costs in prior studies

A second sense in which the magnitude of the costs of downward wage rigidity can be assessed is in comparison to the implied costs if one neglects compression of wage increases, as previous literature has done. The preceding empirical results showed that this can lead one to conclude erroneously that downward wage rigidity raises the annual rate of growth of real wages by approximately one percentage point per year when inflation is low. Similar estimates are reported in Card and Hyslop (1997). To a first-order approximation, this implies a rise in average real labor costs equal to29

Ĉ ≈ γ̂ · E(ω*)/(1 − βe^μ),  where  γ̂ ≈ 0.01 · βe^μ/(1 − βe^μ).   (14)

29 If downward wage rigidity raises the rate of real wage growth by γ, this implies an increase in average discounted labor costs equal to E[Σ_{t=0}^∞ β^t e^{μt} (1+γ)^t ω*_t] − E[Σ_{t=0}^∞ β^t e^{μt} ω*_t]. For small γ, the latter is approximately equal to γ·[βe^μ/(1 − βe^μ)]·[E(ω*)/(1 − βe^μ)].
Eq. (14) has an analogous interpretation to Eq. (13). It says that a one percentage point increase in the rate of growth of real wages is equivalent to a permanent increase in the average level of real wages by a factor of 0.01·βe^μ/(1 − βe^μ). To quantify this, note that if workers and firms separate with probability d each year, and the real interest rate is r, then the firm's discount factor is equal to β = (1 − d)/(1 + r). In the US economy, the quarterly separation probability is approximately 0.1, implying a value of d = 0.344 on an annual basis. Setting r = 0.05 yields a value of β = 0.625. Given average productivity growth of 2%, this suggests that ignoring compression of wage increases implies costs of downward wage rigidity equivalent to a 1.76% increase in average real labor costs. For observed values of inflation and growth, the latter is more than double the upper-bound estimate of the true costs of wage rigidity implied by the model above.

The results of Table 4 also provide an important perspective on the overstatement of the costs due to downward wage rigidity in prior research. They suggest that the estimated costs in studies that neglect the compression of wage increases exceed the true costs that would prevail even in the presence of 5% trend deflation and −2.5% real growth. Thus, neglecting the compression of wage increases induced by downward wage rigidity provides a misleading picture of the true costs of wage rigidity imposed on firms.

5. Conclusions

In his presidential address, Tobin (1972) argued that, if workers are reluctant to accept reductions in their nominal wages, a certain amount of inflation may "grease the wheels" of the labor market by easing reductions in real labor costs. Exploring the macroeconomic implications of downward wage rigidity from both a theoretical and an empirical perspective, I find that these effects are likely to be small.
An explicit model of worker resistance to nominal wage cuts reveals that firms will compress wage increases as well as wage cuts in the presence of downward wage rigidity. This compression of wage increases culminates in the prediction that worker resistance to wage cuts has no effect on aggregate wage growth in the model, challenging a common intuition in previous empirical literature on downward wage rigidity. To assess the empirical relevance of these predictions, testable implications of the model are taken to micro-data for the US and Great Britain. These data reveal significant evidence of compression of wage increases related to downward wage rigidity. Moreover, accounting for this limits the estimated increase in aggregate real wage growth due to downward wage rigidity to be much closer to zero. Returning to the model, the implied costs of downward wage rigidity to firms can be approximated using available data. This reveals two senses in which the costs of wage rigidity are small for the rates of inflation and productivity growth observed in the US and Great Britain over recent decades. First, erroneously concluding that downward wage rigidity raises the rate of aggregate wage growth, as previous literature has done, leads to a substantial (more than twofold) overstatement of the costs of wage rigidity to firms. Second, the implied long run disemployment effects of wage rigidity under zero inflation are shown to be unlikely to reduce employment by more than 0.25 of a percentage point. These results suggest that downward wage rigidity does not provide a strong argument against the adoption of a low inflation target.

Stepping back from this, one might ask whether the mechanism put forward in this paper—that firms compress wage increases in the face of worker resistance to wage cuts—really rings true in the real world. Bewley et al. (2000, p. 46) reported survey evidence that firms temper wage increases in response to worker resistance to wage cuts:

[Business leaders] take account of the fact that, if they raise the level of pay today, it will remain high in the future. I hear a lot about this last point now. […] Some say that they are not now increasing pay […] because they know they will not be able to reverse the increases during the next downturn.

In addition, there is evidence of explicitly bargained mediation of wage growth as an alternative to wage cuts:

General Motors Corp.'s historic health care deal with the United Auto Workers will require active workers to forgo $1-an-hour in future wage hikes […] Allen Wojczynski, a 36-year GM employee, said the company's proposal seems acceptable […] He had been expecting the automaker to ask its workers for pay cuts to trim health care costs. 'I could live with it, giving up $1 an hour of my future pay raises,' said Wojczynski […] Detroit News, October 21st 2005.30

Thus, compression of raises is used in practice as an approach to limiting labor costs in the face of poor economic conditions, and thereby can limit the disemployment effects of worker resistance to wage cuts.

References

Abel, A.B., Eberly, J.C., Oct. 1996. Optimal investment with costly reversibility. Review of Economic Studies 63 (4), 581–593.
Akerlof, G.A., Dickens, W.T., Perry, G.L., 1996. The macroeconomics of low inflation. Brookings Papers on Economic Activity 1996 (1), 1–76.
Akerlof, G.A., Yellen, J.L., 1986. Efficiency Wage Models of the Labor Market. Cambridge University Press, Cambridge.
Altonji, J., Devereux, P., 2000. The extent and consequences of downward nominal wage rigidity. In: Research in Labor Economics, vol. 19. Elsevier Science Inc., pp. 383–431.
Barro, R.J., 1977. Long-term contracting, sticky prices, and monetary policy. Journal of Monetary Economics 3, 305–316.
Bentolila, S., Bertola, G., Jul. 1990. Firing costs and labor demand: how bad is eurosclerosis? Review of Economic Studies 57 (3), 381–402.
30
See http://www.detnews.com/2005/autosinsider/0510/21/A01-356532.htm.
Bewley, T.F., 1999. Why Wages Don't Fall During a Recession. Harvard University Press, Cambridge and London.
Bewley, T.F., Akerlof, G.A., Dickens, W.T., Perry, G.L., 2000. Near-rational wage and price setting and the long-run Phillips curve. Brookings Papers on Economic Activity 2000 (1), 1–44.
Bloom, N., November 2000. The real options effect of uncertainty on investment and labor demand. IFS Working Paper.
Caplin, A.S., Spulber, D.F., Nov. 1987. Menu costs and the neutrality of money. Quarterly Journal of Economics 102 (4), 703–726.
Card, D., Hyslop, D., 1997. Does inflation 'grease the wheels of the labor market'? In: Romer, C.D., Romer, D.H. (Eds.), Reducing Inflation: Motivation and Strategy. University of Chicago Press, Chicago.
DiNardo, J., Fortin, N.M., Lemieux, T., Sep. 1996. Labor market institutions and the distribution of wages, 1973–1992: a semiparametric approach. Econometrica 64 (5), 1001–1044.
Dickens, W.T., Goette, L., Groshen, E.L., Holden, S., Messina, J., Schweitzer, M.E., et al., 2006. The interaction of labor markets and inflation: analysis of micro data from the International Wage Flexibility Project. Mimeo, Brookings Institution.
Fehr, E., Götte, L., 2005. Robustness and real consequences of nominal wage rigidity. Journal of Monetary Economics 52, 779–804.
Goodfriend, M., King, R.G., 2001. The case for price stability. In: Herrero, A.G., Gaspar, V., Hoogduin, L., Morgan, J., Winkler, B. (Eds.), Why Price Stability? European Central Bank, Frankfurt am Main, pp. 53–94.
Groshen, E.L., Schweitzer, M.E., 1999. Identifying inflation's grease and sand effects in the labor market. In: Feldstein, M. (Ed.), The Costs and Benefits of Price Stability. NBER Conference Report Series. University of Chicago Press, Chicago and London, pp. 273–308.
Holden, S., 1994. Wage bargaining and nominal rigidities. European Economic Review 38, 1021–1039.
Howitt, P., March 2002. Looking inside the labor market: a review article. Journal of Economic Literature 40, 125–138.
Juhn, C., Murphy, K.M., Topel, R.H., 1991. Why has the natural rate of unemployment increased over time? Brookings Papers on Economic Activity 2, 75–142.
Kahn, S., Dec. 1997. Evidence of nominal wage stickiness from microdata. American Economic Review 87 (5), 993–1008.
Lebow, D.E., Stockton, D.J., Wascher, W., October 1995. Inflation, nominal wage rigidity, and the efficiency of labor markets. Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series 95/45.
Lebow, D.E., Saks, R.E., Wilson, B.A., July 1999. Downward nominal wage rigidity: evidence from the employment cost index. Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series 99/31.
MacLeod, W.B., Malcomson, J.M., September 1993. Investments, holdup, and the form of market contracts. American Economic Review 83 (4), 811–837.
Nickell, S., Quintini, G., October 2003. Nominal wage rigidity and the rate of inflation. Economic Journal 113 (490), 762–781.
Shafir, E., Diamond, P., Tversky, A., 1997. Money illusion. Quarterly Journal of Economics 112 (2), 341–374.
Smith, J., March 2000. Nominal wage rigidity in the United Kingdom. Economic Journal 110, 176–195.
Smith, J., December 2004. How costly is downward nominal wage rigidity in the UK? Mimeo.
Snell, A., Thomas, J.P., 2007. Labour contracts, equal treatment and wage-unemployment dynamics. Mimeo, University of Edinburgh.
Solow, R.M., 1979. Another possible source of wage stickiness. Journal of Macroeconomics 1, 79–82.
Stokey, N.L., Lucas Jr., R.E. (with Prescott, E.C.), 1989. Recursive Methods in Economic Dynamics. Harvard University Press, Cambridge, MA and London.
Thomas, J.P., October 2005. Fair pay and a wage-bill argument for wage rigidity and excessive employment variability. Economic Journal 115, 833–859.
Tobin, J., 1972. Inflation and unemployment. American Economic Review 62 (1/2), 1–18.
Journal of Monetary Economics 56 (2009) 170–183
Labor market dynamics under long-term wage contracting☆

Leena Rudanko
Department of Economics, Boston University, 270 Bay State Road, Boston, MA 02215, USA
Article info
Article history: Received 9 February 2008; received in revised form 15 December 2008; accepted 16 December 2008; available online 25 December 2008.

Abstract
Recent research seeking to explain the strong cyclicality of US unemployment emphasizes the role of wage rigidity. This paper proposes a micro-founded model of wage rigidity—an equilibrium business cycle model of job search, where risk neutral firms post optimal long-term contracts to attract risk averse workers. Equilibrium contracts feature wage smoothing, limited by the inability of parties to commit to contracts. The model is consistent with aggregate wage data if neither worker nor firm can commit, producing too rigid wages otherwise. Wage rigidity does not lead to a substantial increase in the cyclical volatility of unemployment. © 2008 Elsevier B.V. All rights reserved.
JEL classification: E24 E32 J41 J64 Keywords: Wage rigidity Unemployment fluctuations Long-term wage contracts Limited commitment Directed search
1. Introduction

In recent years significant research effort has been devoted to trying to understand the sources of the strong cyclical volatility of unemployment in the US. The standard tool for modeling unemployment, the Mortensen–Pissarides search and matching model (Pissarides, 1985), produces significantly smaller variation in unemployment than observed (Shimer, 2005a). This gap between model and data has led to the view that the observed cyclicality of unemployment is a manifestation of important rigidities affecting wage determination, suggested by the weak cyclicality of aggregate wage data, and not captured by the model (Hall, 2005). However, while imposing exogenous rigidity in wages easily allows the model to produce much larger variation in unemployment, mechanically explaining the puzzle, it does not provide a satisfactory economic answer to the problem. It is well known that outcomes in macroeconomic models with exogenously imposed rigidities can differ substantially from those in models where rigidities are derived from micro-foundations.1 This paper shows that wage rigidity, as derived from a plausible microeconomic foundation and embedded into an equilibrium

☆ This paper is based on the first essay of my dissertation at the University of Chicago, first version dated 2004. I am grateful to Fernando Alvarez, Francois Gourio, Lars P. Hansen, Hanno Lustig, Robert Shimer, Jonathan Thomas, seminar and conference audiences, as well as the editor and referee for comments. Financial support from the Yrjö Jahnsson Foundation, the Finnish Cultural Foundation and the Emil Aaltonen Foundation is gratefully acknowledged. Tel.: +1 617 353 7082. E-mail address:
[email protected] 1 For a striking example in the context of price-setting in the face of menu costs, Caplin and Spulber (1987) illustrate that the real effects of monetary shocks vanish when firms are allowed to optimize in price-setting, rather than being exogenously constrained.
0304-3932/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.009
L. Rudanko / Journal of Monetary Economics 56 (2009) 170–183
model, need not lead to a substantial increase in the cyclical volatility of unemployment. By considering the impact of limited commitment as a contracting friction affecting wage determination in the model, I show that aggregate wage data are not informative about the measure of wage rigidity relevant for unemployment cyclicality. This paper develops an extension of the Mortensen–Pissarides model, where risk neutral firms use optimal long-term contracts to attract risk averse workers. Labor markets are subject to search frictions, but operate competitively (Moen, 1997). To attract workers, firms post vacancies. A vacancy specifies a long-term wage contract and the firm’s choice of contract balances the costs of paying high wages with the benefits of attracting many job applicants. Unemployed workers observe all contracts offered and choose one to apply for, balancing the benefits of high wages with the costs of having to search longer for such jobs. Labor productivity varies over the business cycle and, when risk averse workers cannot smooth consumption privately, efficient wage contracts feature income smoothing. Jobs end due to idiosyncratic separation shocks, leaving the worker to face unemployment on his own, without protection from former employers. The ability to commit to contracts affects equilibrium outcomes. Under commitment, insurance motives lead to a constant contract wage, but when parties cannot commit, their outside options restrict the degree of wage smoothing possible. Three cases are studied: two-sided commitment, one-sided commitment, and two-sided limited commitment. In addition to affecting the cyclicality of the aggregate wage through its effect on wage contracts, the ability to commit also has allocative effects on vacancy creation. The quantitative results show that with the exception of the two-sided limited-commitment contracting environment, the model produces an aggregate wage which is very rigid compared to data. 
Wage rigidity does not come with the substantial increase in the cyclical volatility of unemployment seen in the context of exogenously imposed rigidity, however. Limited commitment works to increase the cyclicality of both the aggregate wage and unemployment, bringing the model closer to data on both dimensions. These seemingly surprising results have a simple explanation: Wage smoothing within contracts translates into significant rigidity in the aggregate wage, but it does not imply that wages are rigid when it comes to hiring new workers. The relevant statistic for hiring decisions is the present value of wages used to attract new workers. Introducing limited commitment makes the present value more rigid, leading to greater variation in vacancy creation. However, it also leads to pro-cyclical adjustments in contract wages, as well as increased cyclicality in starting wages, both of which increase the cyclicality of the aggregate wage. Finally, the results show that while the impact of limited commitment on the aggregate wage is substantial, the impact on unemployment is relatively modest in magnitude. The findings warn against using aggregate wage data to draw inferences about wage rigidity as the cause of unemployment cyclicality. The unemployment dynamics of the model differ from the standard Mortensen–Pissarides model for two main reasons. The first has to do with the incomplete markets environment faced by workers. When risk averse workers cannot smooth their consumption across unemployment and employment spells, they gain less from finding a job with a high wage level than they would if they could smooth their consumption. This change in how workers value wages affects the wage contracts firms find optimal to offer, and distorts the equilibrium toward lower wages. In periods when high productivity bids up the wages firms use to attract new workers, the distortion increases, curbing the wage increase. 
In periods when low productivity causes the wages used to attract new workers to fall, the distortion relaxes, curbing the wage decrease. The resulting rigidity in the present value of wages used to attract new workers translates into increased cyclicality in vacancy creation.2 The second reason has to do with limited commitment exacerbating the above effects. Under full commitment, firms offer a permanently higher wage to workers hired in booms than those hired in recessions. When limited commitment binds in equilibrium, such contracts are no longer feasible. All the firm can do to raise the wages used to attract new workers in booms is to raise the starting wage, prevailing until the first recession arrives and the firm’s participation constraint forces the wage down. Similarly, all the firm can do to lower the wages used to attract new workers in recessions is to lower the starting wage, prevailing until the first boom arrives and the worker’s participation constraint forces the wage up. Creating differences in present values across booms and recessions in such an environment has to involve increasing the dispersion in starting wages across the two states. This exacerbates the distortion due to the incomplete markets environment in booms and relaxes it in recessions, adding to the rigidity in the present value of wages used to attract new workers and hence also the cyclicality of vacancy creation. The principal theoretical contribution of this paper is to embed the two-sided limited-commitment wage contracting problem of Thomas and Worrall (1988) into an equilibrium model of directed search with aggregate shocks.3 The embedding involves incorporating flows in and out of employment relationships and endogenizing the outside options restricting contracting to reflect the equilibrium value of search. As is well known, solving the
2 The impact of incomplete markets is discussed in detail in Rudanko (2008).
3 My model has already been applied by others to study related questions: Reiter (2008) examines business cycles driven by embodied technology shocks and Kudlyak (2007) the cyclicality of wages in individual level data. The empirical studies of Macis (2007) and Haefke et al. (2007) are also closely related. Interestingly, the contracts in the model are also observationally similar to those in MacLeod and Malcomson (1993), with renegotiation by mutual consent. In earlier work, Sigouin (2004) embedded two-sided limited-commitment contracts into an equilibrium model of job search, but in his model unemployment is constant over the business cycle.
two-sided limited-commitment contracting problem is complicated by the fact that the relevant recursive representation is not a contraction, a feature which must here be dealt with jointly with solving for endogenous outside options. Section 2 presents the model. Sections 3 and 4 examine its qualitative and quantitative properties, respectively. Section 5 concludes. Supplementary online appendixes provide proofs of theoretical results, a description of the computational approach, and robustness checks.

2. Model

This section develops an extension of the Mortensen–Pissarides model with optimal dynamic long-term wage contracting. The extension involves introducing risk averse workers facing incomplete asset markets into the model, which makes long-term contracting optimal, as well as incorporating directed search, which gives both firms and workers the opportunity to optimally choose a wage contract to offer and apply for.

2.1. Preferences and technologies

There is a continuum of measure one of workers with preferences $E_t \sum_{\tau=0}^{\infty} \beta^{\tau} u(c_{t+\tau})$, where $u$ is a CRRA utility function. They consume their income each period, so consumption $c$ equals the wage $w$ if the agent is working and $b > 0$ if the agent is unemployed. There is a continuum of entrepreneurs with preferences $E_t \sum_{\tau=0}^{\infty} \beta^{\tau} c^e_{t+\tau}$, where $c^e$ is the sum of cash flows from firms the entrepreneur owns. Entrepreneurs have free access to a constant returns to scale production technology using labor as the only input. The term firm will be used to refer to a single-worker production unit. One entrepreneur can operate an unlimited number of firms. The output of an operating firm during a period is given by $ez$, where $z$ is an aggregate shock, and $e$ is a match specific shock. The aggregate shock $z > b$ follows an $n$-state Markov chain with transition probabilities $\pi(z'|z)$, such that the transition matrix $P$ is monotone.
The match specific shock $e$ is equal to one for all new matches and each period with probability $\delta$ switches to zero permanently. At that time the match is terminated. The cash flow of an operating firm during a period is $z - w$. In the beginning of each employment relationship, a state-contingent long-term wage contract is signed. Contracts are assumed to be conditional on aggregate productivity at the time of contracting and specify wages for all continuation histories of $z$ after that, for as long as the match operates. A match history of shocks for a given production unit that starts production in state $z_0$ and is still producing $t$ periods later is denoted $z^t = (z_0, z_1, \ldots, z_t)$. A wage contract is a set of functions
s(z) = \{ w_t(z^t) \in [\underline{w}, \overline{w}] \text{ for all } z^t,\ t = 0, 1, \ldots \text{ s.t. } z_0 = z \},

where $z$ denotes the state at the time of contracting. One can then define the utility value of a contract to the contracting parties. Suppose that the market value of unemployment to a worker in aggregate state $z$ is $V_u(z)$. The utility value of a new contract $s$ in state $z$ is

V_s(z) = u(w_0(z)) + E_z \sum_{t=1}^{\infty} \beta^t (1-\delta)^{t-1} [ (1-\delta) u(w_t(z^t)) + \delta V_u(z_t) ].   (1)
The contract specifies a wage for all continuation histories of $z$ while $e = 1$. The separation shock hits each period with probability $\delta$, leading to the worker becoming unemployed and receiving the prevailing value of unemployment. The set of feasible wage contracts, $S(z, V_u)$, is defined as all $s(z)$ s.t. $V_s(z) \ge V_u(z)$. The utility value of a new contract cannot be lower than the value of remaining unemployed. Free entry drives an entrepreneur's equilibrium value of searching for a worker to zero. Hence, the present value of profits for a firm from a new contract $s$ in state $z$ is

F_s(z) = z - w_0(z) + E_z \sum_{t=1}^{\infty} \beta^t (1-\delta)^t (z_t - w_t(z^t)).   (2)
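The recursive counterparts of the present values (1) and (2) are easy to compute when the wage depends only on the current aggregate state. Below is a minimal numerical sketch; all parameter values, the transition matrix, the wage schedule and the unemployment values $V_u$ are purely illustrative, not the paper's calibration.

```python
import numpy as np

# Recursive counterparts of Eqs. (1)-(2) for a Markov wage contract w(z).
# Worker:  V_s(z) = u(w(z)) + beta * E_z[(1-delta) V_s(z') + delta V_u(z')]
# Firm:    F_s(z) = z - w(z) + beta * (1-delta) * E_z[F_s(z')]
# All numbers below are illustrative, not the paper's calibration.

beta, delta = 0.9987, 0.1 / 9            # discount factor, separation probability
P = np.array([[0.990, 0.010, 0.000],     # toy transition matrix pi(z'|z)
              [0.005, 0.990, 0.005],
              [0.000, 0.010, 0.990]])
z = np.array([0.97, 1.00, 1.03])         # productivity states
w = np.array([0.96, 0.97, 0.98])         # toy contract wage in each state
V_u = np.array([-30.0, -29.9, -29.8])    # toy values of unemployment
u = np.log                               # log utility (benchmark preferences)

# Both recursions are linear in the unknown values, so solve them directly.
A = np.eye(3) - beta * (1 - delta) * P
V_s = np.linalg.solve(A, u(w) + beta * delta * P @ V_u)  # worker values
F_s = np.linalg.solve(A, z - w)                          # firm values
```

With these toy numbers the contract is feasible in the sense of the text: $V_s(z) \ge V_u(z)$ in every state, and the firm's present value of profits rises with productivity.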
While the match is productive, the firm produces $z_t$ and pays wages $w_t(z^t)$. Once the separation shock hits, the firm is left with the market value of searching for a new worker, zero. Workers and firms face search frictions in the labor market, captured by a matching function. To hire a worker, a firm must post a vacancy specifying a wage contract. Posting a vacancy costs the firm $k$ units during each period of search. For unemployed workers, search is costless. Each firm chooses a contract to offer, considering both the present value of wages the contract entails as well as the likelihood of finding a worker, which varies by contract. Each unemployed worker observes all the contracts offered by firms and chooses one to apply for, considering both the value of the contract and the likelihood of getting the job. The labor market can be thought of as segmenting into contract-specific sub-markets. Given a measure $N_u$ of unemployed workers applying for contract $s$ and a measure $N_v$ of vacancies offering $s$, the measure of matches taking place this period is given by a Cobb–Douglas matching function $m(N_u, N_v) = K N_u^a N_v^{1-a}$, with $0 < K < 1$ and $0 < a < 1$. Defining $\theta = N_v / N_u$ as the contract-specific labor market tightness or vacancy–unemployment ratio, the probability that a
worker finds a job this period in this market is $m(\theta) := m(N_u, N_v)/N_u$ and the probability that a firm finds a worker $q(\theta) := m(N_u, N_v)/N_v$.4 In principle different sub-markets could co-exist at the same time but, as will become clear later, this will not happen in equilibrium. Anticipating such an outcome, the equilibrium definition specifies the labor market as a single $(s(z), \theta(z))$-pair for each $z$. Nevertheless, one has to consider the possibility of alternative markets. Given a value of unemployment $V_u(z)$ for all $z$, the contract $s(z)$ determines the value to a worker from matching based on Eq. (1), and the market tightness $\theta(z)$ determines the probability of matching, $m(\theta(z))$. To maximize utility in the choice of contract to apply for, in the presence of an alternative market, a worker would compare the expected value $m(\theta(z))(V_s(z) - V_u(z))$ across the markets. On the firms' side, the contract value $F_s(z)$ is based on Eq. (2) and the probability of matching is $q(\theta(z))$. To maximize profits, in the presence of an alternative market, a firm would compare the expected value $q(\theta(z)) F_s(z)$ across the markets.

2.2. A competitive search equilibrium

A competitive search equilibrium is defined along the lines of Moen (1997).

Definition 1. An equilibrium of the economy consists of: for each $z$, a search value for unemployed workers $V_u(z)$, a market tightness $\theta(z)$ and a wage contract $s(z) \in S(z, V_u)$ such that

1. Search offers zero profit to an entrepreneur: $q(\theta(z)) F_s(z) - k = 0$, where $F_s(z)$ satisfies (2).
2. No Pareto improving market is possible: $\nexists\, \hat{s}(z) \in S(z, V_u)$ and $\hat{\theta}(z) \ge 0$ s.t.
   $m(\hat{\theta}(z))(V_{\hat{s}}(z) - V_u(z)) > m(\theta(z))(V_s(z) - V_u(z))$ and $q(\hat{\theta}(z)) F_{\hat{s}}(z) - k > 0$,
   where $V_s(z)$, $V_{\hat{s}}(z)$ satisfy (1) and $F_{\hat{s}}(z)$ satisfies (2).
3. The search values of workers are consistent:
   $V_u(z) = u(b) + \beta \sum_{z'} \pi(z'|z) [ m(\theta(z')) V_s(z') + (1 - m(\theta(z'))) V_u(z') ]$.
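The matching probabilities entering these conditions can be sketched numerically. The functional form follows the Cobb–Douglas specification above; the values $K = 0.15$ and $a = 0.72$ are taken from the paper's calibration (Table 1), and the truncation implements footnote 4.

```python
import numpy as np

# Job-finding and vacancy-filling probabilities implied by the
# Cobb-Douglas matching function m(N_u, N_v) = K N_u^a N_v^(1-a),
# written in terms of market tightness theta = N_v / N_u.
# K = 0.15 and a = 0.72 follow the paper's calibration (Table 1).

K, a = 0.15, 0.72

def m(theta):
    """Worker's job-finding probability, truncated at 1 (cf. footnote 4)."""
    return np.minimum(K * theta ** (1 - a), 1.0)

def q(theta):
    """Firm's vacancy-filling probability, truncated at 1."""
    return np.minimum(K * theta ** (-a), 1.0)

theta = np.array([0.5, 1.0, 2.0])
job_finding = m(theta)      # increasing in tightness
vacancy_filling = q(theta)  # decreasing in tightness
```

At the normalized steady state $\theta = 1$, both probabilities equal $K = 0.15$, which is the average per-period job-finding rate the calibration targets.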
The equilibrium conditions can be thought of as arising from a contract-posting setting, where entrepreneurs are free to post any feasible wage contract and each worker directs his search to a single optimal contract. For the purposes of understanding the connection, fix a state $z$. Consider then a labor market offering contract $s$ with prevailing market tightness $\theta$. When a firm considers posting an alternative contract $\hat{s}$, it has beliefs about the market tightness $\theta^B(\hat{s})$ that will prevail in the market for that contract, and will only post $\hat{s}$ if $q(\theta^B(\hat{s})) F_{\hat{s}} - k \ge 0$. If a contract $\hat{s}$ were posted, the unemployed workers' choice of contract to apply for would determine $\hat{\theta}$, the tightness of the market for $\hat{s}$. The beliefs of firms are assumed to be consistent with the idea that the measure of workers applying for the alternative contract would be such that workers are made indifferent between applying for the alternative contract $\hat{s}$ and the equilibrium contract $s$. If the alternative contract implies a high contract value, many workers apply for it. The more workers apply, the harder it becomes for those workers to get the job, and eventually workers are left indifferent between the two contracts. For $(s, \theta)$ to be an equilibrium, there cannot exist a feasible alternative contract $\hat{s}$ that would give the firm strictly positive profits, given the expected tightness of the market for $\hat{s}$.

The model incorporates some features familiar from the implicit contracts literature following Azariadis (1975) and Baily (1974). One such feature is the assumption of risk averse workers and risk neutral firms, which is based on arguments that: (i) selection into entrepreneurship favors less risk averse agents, (ii) entrepreneurs are wealthier than workers, so may behave as though they are less risk averse, and (iii) entrepreneurs have better access to asset markets so can insure away risk.5 Another such feature is that workers cannot save privately.
This modeling choice is driven by the extensive literature documenting that a large fraction of the population holds relatively little wealth, particularly for short-term consumption smoothing purposes, with a significant fraction exhibiting liquidity constrained behavior. Particularly in modeling the unemployed, a plausible starting point seems to be that on average they are not particularly wealthy individuals. Chetty (2008) shows that approximately half of unemployment benefit claimants in the US held no liquid wealth at the time of job loss, exhibiting liquidity constrained behavior. In models with saving, workers tend to accumulate more wealth than these

4 In a discrete time model, the matching function may need to be truncated to make sure $m(\theta), q(\theta) \le 1$. In practice the parameters will be such that truncation will not be necessary.
5 Rudanko (2008) explores a setting where entrepreneurs are also risk averse, differing from workers by having better access to asset markets.
observations suggest, causing a discrepancy between the model and data, which directly affects the job search behavior of individuals.6 The agents in the model face exogenous shocks to labor productivity, both aggregate and idiosyncratic. The role of the idiosyncratic shock is to proxy for the various reasons why matches sometimes end due to person or match specific factors. Note that the separation rate is constant over time. This simplifying assumption is based on the evidence reported by Shimer (2005b), showing that even though the separation rate is counter-cyclical in the data, its contribution to explaining fluctuations in unemployment is small compared to changes in the job-finding rate, which explain the bulk of the variation. Hence, the aggregate shock does not lead to separations in the model.7

3. Equilibrium properties

Equilibrium contracts must be Pareto efficient between worker and firm, and the ability to commit to contracts affects what contracts are feasible. This section begins by analyzing the case of full-commitment contracting, then proceeds to contracting under limited commitment. Because the case of one-sided limited commitment is a relatively simple intermediate case, the analysis tackles directly the case of two-sided limited commitment.

3.1. Wage contracts under commitment

For any state $z$ and feasible worker value $V$, the Pareto frontier of efficient contracts is defined as

f^{FC}(V, z, V_u) = \sup_{s \in S(z, V_u)} \{ F_s(z) \mid V_s(z) \ge V \}.
Because a contract on the frontier cannot be Pareto dominated after any history, $f^{FC}$ must satisfy the functional equation

f^{FC}(V, z, V_u) = \max_{w, \{V(z')\}} \{ z - w + \beta E_z (1-\delta) f^{FC}(V(z'), z', V_u) \}
s.t. V = u(w) + \beta E_z [ (1-\delta) V(z') + \delta V_u(z') ], \quad w \in [\underline{w}, \overline{w}].   (3)
Providing the worker value $V$ is partly accomplished through today's wage $w$, partly through the value of unemployment and partly through future wages, which determine continuation values $V(z')$ for all the states possible next period. The worker receives $V(z')$ if separation does not occur and the value of unemployment $V_u(z')$ if it does. The maximum firm value from providing $V(z')$ next period is given by $f^{FC}(V(z'), z', V_u)$ and if separation occurs the firm is left with zero.

Definition 2. An equilibrium with full commitment is characterized by $V_u(z)$, $\theta(z)$ and $s(z) \in S(z, V_u)$ for all $z$, that satisfy parts 1–3 of Definition 1.

Proposition 3. For any $z$ and $V \ge V_u(z)$, there exists a unique efficient contract and $f^{FC}(V, z, V_u)$ is strictly decreasing, strictly concave and continuously differentiable in $V$.
Proposition 4. For any $V \ge V_u(z)$, the efficient contract under full commitment has a constant wage throughout the contract.

The optimal wage contract implements an efficient risk sharing arrangement between worker and firm, by way of a constant contract wage. The Pareto frontier is concave because workers enjoy diminishing marginal utility from increasing the wage level. The proofs of these and other theoretical results in the paper can be found in a supplementary online appendix.

3.2. Unique contract offered in the labor market

Having established that equilibrium contracts lie on the Pareto frontier, thus pinning down the form of the optimal state-contingent contract, the remaining question is how the surpluses from matching are divided between the worker and firm, determining the level of wages paid. To answer this second question, parts 1 and 2 of the definition of equilibrium can

6 Long-term contracts may well play a role in explaining part of why wealth holdings are low, given the income smoothing they provide, but this remains an avenue for future research. Solving a model with saving becomes significantly more complicated in a business cycle setting, because individual job search decisions depend on individual wealth. Combining these issues with long-term contracting poses a challenge. However, as pointed out by the referee, it would be quite feasible to examine an economy with two groups of workers: one with access to complete markets (pooling labor income risk through a large household), and one with no access to asset markets.
7 The aggregate shock can be interpreted as anything affecting labor productivity. One such possibility is a technology shock, but there is no reason to rule out other options such as policy or taste shocks. Clearly the model's ability to match data will be limited by only considering a shock to labor productivity (whatever the sources). The task is to evaluate whether the model's responses to such a shock are consistent with data.
be collapsed to the problem:

\max_{V(z), \theta(z)} m(\theta(z)) (V(z) - V_u(z)) \quad \text{s.t.} \quad q(\theta(z)) f^{FC}(V(z), z, V_u) = k,   (4)

where existence requires that $V_u(z)$ and $k$ are such that $f^{FC}(V_u(z), z, V_u) \ge k$ for all $z$.

Proposition 5. Given $V_u(z)$, $k$ such that $f^{FC}(V_u(z), z, V_u) \ge k$, (4) has a unique solution.8
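The trade-off behind problem (4) can be illustrated with a toy grid search. The frontier $f$ below is a made-up decreasing function standing in for $f^{FC}$, and the values of $k$, $K$, $a$ and $V_u$ are illustrative; only the structure of the problem is taken from the text.

```python
import numpy as np

# Toy grid search for problem (4): choose the promised worker value V
# (and implied tightness theta) to maximize m(theta) * (V - V_u),
# subject to the firm's zero-profit condition q(theta) * f(V) = k.
# The frontier f is a stand-in for f^FC, not the model's; all numbers toy.

K, a, k, V_u = 0.15, 0.72, 0.2, 0.0

def f(V):
    # Stand-in Pareto frontier: firm value falls as promised value rises.
    return 3.0 - 2.0 * (V - V_u)

V = np.linspace(V_u, V_u + 1.4, 2001)          # grid where f(V) >= k
theta = (K * f(V) / k) ** (1.0 / a)            # from q(theta) f(V) = k
objective = K * theta ** (1 - a) * (V - V_u)   # worker's gain m(theta)(V - V_u)
V_star = V[np.argmax(objective)]
```

Raising $V$ makes the contract more attractive, but the zero-profit condition then forces $\theta$, and hence the job-finding rate, down; with these toy numbers the two forces balance at an interior $V^*$, mirroring the unique solution of Proposition 5.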
For any contract delivering a high value to the worker via high wages, the market tightness must be low for firms to break even in offering such a contract. The low market tightness makes the contract less attractive to workers because their job-finding probability is low. The congestion properties of the labor market, captured by the matching function, imply that as wages rise, the declining job-finding probability eventually begins to dominate the rising contract value to the worker and there is a unique optimal wage level balancing these effects.9 As the quantitative section will illustrate, both the contract value $V(z)$ and market tightness $\theta(z)$ are increasing in productivity. In booms workers are hired faster and with a higher (constant) wage than in recessions. Thus, the restriction made in the definition of equilibrium about the labor market featuring a single contract is innocuous because even if one allowed multiple contracts, the equilibrium would only feature one. For comparing the results of this paper to the literature, it is helpful to note the following result on the close connection between the competitive search equilibrium and an economy with bilateral wage bargaining. The result is immediate when comparing the first order conditions of the two problems.

Proposition 6. The competitive search equilibrium of Definition 1 produces the same equilibrium conditions as result in an environment where search is undirected and, when a worker and firm meet, they bargain over a long-term wage contract $s(z) \in S(z, V_u)$ by maximizing the Nash product $(V_s(z) - V_u(z))^a F_s(z)^{1-a}$, where $a$ is the power of the Cobb–Douglas matching function.

3.3. Contracting under two-sided limited commitment

The quantitative exercises will show that constant-wage contracts often require commitment on both sides to be feasible. A worker hired in a downturn has a permanently low wage and may prefer to quit in a boom.
A firm that hired a worker during a boom with a permanently high wage may prefer to fire the worker in a downturn. If the contracting parties cannot commit, such wage smoothing is not feasible. This section examines contracting when neither the worker nor firm can commit.10 Define $V_s(z^t)$ as the utility value of an on-going contract $s$ to a worker, given that the contract has been in operation $t > 0$ periods and the history of shocks is $z^t$. We have

V_s(z^t) = u(w_t(z^t)) + E_{z_t} \sum_{\tilde{t}=t+1}^{\infty} \beta^{\tilde{t}-t} (1-\delta)^{\tilde{t}-t-1} [ (1-\delta) u(w_{\tilde{t}}(z^{\tilde{t}})) + \delta V_u(z_{\tilde{t}}) ].
For firms, respectively,

F_s(z^t) = z_t - w_t(z^t) + E_{z_t} \sum_{\tilde{t}=t+1}^{\infty} \beta^{\tilde{t}-t} (1-\delta)^{\tilde{t}-t} (z_{\tilde{t}} - w_{\tilde{t}}(z^{\tilde{t}})).
For contracts to be self-enforcing, both workers and firms must always weakly prefer staying in the contract to pursuing their outside option. The set of feasible contracts must hence be restricted as follows:

S_{LC}(z, V_u) = \{ s(z) \in S(z, V_u) \mid V_s(z^t) \ge V_u(z_t),\ F_s(z^t) \ge 0\ \forall z^t,\ t = 0, 1, \ldots \text{ s.t. } z_0 = z \}.

Definition 7. An equilibrium with limited commitment is characterized by $V_u(z)$, $\theta(z)$ and $s(z) \in S_{LC}(z, V_u)$ for all $z$, that satisfy parts 1–3 of Definition 1.

The Pareto frontier under limited commitment is defined as

f^{LC}(V, z, V_u) = \sup_{s \in S_{LC}(z, V_u)} \{ F_s(z) \mid V_s(z) \ge V \},

8 The result holds for more general matching functions than Cobb–Douglas. A sufficient condition is that the elasticity of $q$ is weakly decreasing.
9 Uniqueness occurs for the same reasons as in Moen (1997), with the concave preferences factoring in to reduce the gain to the worker from high wages.
10 The quantitative exercises also consider the case where firms can commit, but workers cannot. The one-sided case is relatively straightforward and discussed briefly at the end of the section.
and now $f^{LC}$ must satisfy the functional equation

f^{LC}(V, z, V_u) = \max_{w, \{V(z')\}} \{ z - w + \beta E_z (1-\delta) f^{LC}(V(z'), z', V_u) \}
s.t. V = u(w) + \beta E_z [ (1-\delta) V(z') + \delta V_u(z') ],
V(z') \ge V_u(z')\ \forall z',
f^{LC}(V(z'), z', V_u) \ge 0\ \forall z',
w \in [\underline{w}, \overline{w}]   (5)
for any feasible $V$. The above equation is similar to Eq. (3), with two additional constraints. The continuation value of the worker $V(z')$ can never be less than the value of unemployment $V_u(z')$, and the value of the contract to the firm $f^{LC}(V(z'), z', V_u)$ can never be negative.

Proposition 8. When a constrained efficient contract exists, the set of values for which it exists is a closed and bounded interval $[V_u(z), \bar{V}(z, V_u)]$. For any $V$ within the interval there is a unique efficient contract, and $f^{LC}(V, z, V_u)$ is strictly decreasing, strictly concave and continuously differentiable with respect to $V$.

In limited-commitment contracts, the promised values assigned to workers are bounded from below by $V_u(z)$ and from above by the fact that firms must make positive profits. Here $\bar{V}(z, V_u)$ is the promised value delivering zero present value of profits to firms, $f^{LC}(\bar{V}(z, V_u), z, V_u) = 0$. The proof of the above result follows Thomas and Worrall (1988), with small extensions.

Lemma 9. For any $V \in [V_u(z), \bar{V}(z, V_u)]$, $(\partial f^{LC} / \partial V)(V, z, V_u) = -1/u'(w)$, where $w$ is the wage the efficient contract specifies for the current period.

Proposition 10. For any $V \in [V_u(z), \bar{V}(z, V_u)]$, the efficient wage contract is characterized by a wage which is constant unless either participation constraint binds. If $f^{LC}(V, z, V_u) \ge 0$ binds, the wage is adjusted down just enough to make it hold with equality. If $V \ge V_u(z)$ binds, the wage is adjusted up just enough to make it hold with equality.

Lemma 9 states the envelope condition for the recursive representation (5). It relates the current period wage $w$ to the slope of the Pareto frontier at promised value $V$, establishing a strictly increasing relationship between the two. To see why the optimal contract has the above properties, consider the first order condition for $V(z')$ in the recursive representation: $1/u'(w) = (1/u'(w'(z')))(1 + \psi(z')) - \eta(z')$, where $\eta(z')$ is the Lagrange multiplier on the worker's participation constraint in state $z'$ and $\psi(z')$ the corresponding multiplier on the firm's constraint. If neither constraint binds, $\eta(z') = \psi(z') = 0$ and constant marginal utility implies a constant wage, $w'(z') = w$. If the worker's participation constraint binds for some future state $z'$, then $\eta(z') > 0$, $\psi(z') = 0$ implies $w'(z') > w$. If the firm's constraint binds, then $\psi(z') > 0$, $\eta(z') = 0$ implies $w'(z') < w$. Contract wages remain constant whenever possible, but the participation constraints restrict contract values to the interval $[V_u(z), \bar{V}(z, V_u)]$ for each $z$. Lemma 9 relates these value intervals to intervals of wages feasible for each aggregate state. The contract wage is constant as long as it lies within the wage interval, but if keeping the wage constant would lead it outside the interval, then the wage adjusts up or down just enough to bring it inside the interval. Analogous results to Propositions 5 and 6 hold also for limited-commitment contracts. The existence of equilibrium requires $f^{LC}(V_u(z), z, V_u) \ge k$ for all $z$, which is a tighter condition than the mere existence of a limited-commitment contract.
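The updating rule in Proposition 10 amounts to clamping the wage to a state-dependent feasible band. A minimal sketch with made-up bands and a made-up shock path (in the model the bands would come from the participation constraints via Lemma 9):

```python
# Sketch of the limited-commitment wage rule: the wage stays constant
# unless it would leave the state-dependent feasible band [w_lo(z), w_hi(z)],
# in which case it moves just far enough to re-enter the band.
# Bands and shock path below are illustrative, not computed from the model.

w_lo = {"recession": 0.94, "boom": 0.97}   # worker's constraint binds at w_lo
w_hi = {"recession": 0.98, "boom": 1.02}   # firm's constraint binds at w_hi

def next_wage(w, z):
    return min(max(w, w_lo[z]), w_hi[z])

path = ["boom", "boom", "recession", "recession", "boom"]
w, wages = 1.00, []
for z in path:
    w = next_wage(w, z)
    wages.append(w)
# The boom hire keeps w = 1.00 until the first recession, where the firm's
# constraint pushes the wage down to 0.98; it then stays there, since 0.98
# remains inside the boom band as well.
```

This reproduces the pattern described in the text: a wage set in a boom prevails until the first recession forces it down, and vice versa for a wage set in a recession.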
The wage contracts inherit the qualitative features of the contracts in Thomas and Worrall (1988). However, in order to discuss the implications of the contracting framework for unemployment and wages in an economy with turnover in the labor force, my model enriches their environment with flows into and out of employment relationships, as well as allowing the outside options restricting contracting to reflect the equilibrium value of search rather than participation in spot labor markets. The quantitative exercises also consider the intermediate case of one-sided limited commitment, where only firms can commit to contracts. Imposing the participation constraints only for workers, the feasible set of contracts then becomes
S_{1LC}(z, V_u) = \{ s(z) \in S(z, V_u) \mid V_s(z^t) \ge V_u(z_t)\ \forall z^t,\ t = 0, 1, \ldots \text{ s.t. } z_0 = z \}.

In this case efficient wage contracts keep the contract wage constant whenever possible, but if the worker's outside option rises sufficiently, as job-finding rates and the contract values used to attract new workers rise in a boom, the wage is adjusted up just enough to match the outside option. When firms are committed, contract wages never fall.
4. Quantitative results

This section examines the quantitative performance of the model and the impact of commitment on business cycle variation in unemployment, vacancies and wages. The solution method is discussed in a supplementary appendix.
Table 1 Calibration and data. Calibration Simulation interval Discount rate Preferences Separation rate Matching function Vacancy cost Productivity
Simulations
10 days b ¼ 0:9987 (1.2% quarterly return) c1g CRRA: log (benchmark), 1g d ¼ 0:1=9 0:01 0:28
mðyÞ ¼ 0:15y k set s.t. steady-state job-finding rate consistent with data z ¼ ½1 Dz; 1; 1 þ Dz0 , with mean one, transition matrix 0 1 0 1 0 1 0 1 0 0 B C B C 0 1 0 A þ ð1 lÞ@ 12 0 12 A P ¼ l@ 0 1 0 0 0 1 Dz ¼ 0:03115, l ¼ 0:98805 to match measured standard deviation, AR1-coefficient of productivity (after aggregating to quarterly, taking logs, filtering) Model-generated data are aggregated to quarterly, logged, HPð105 Þ-filtered. Model-generated moments are averaged over 1000 simulations of 55 year productivity realizations. The initial distribution has full unemployment and the first 400 000 periods thrown away
Data Productivity Wages
Vacancies Unemployment Filtering
BLS reported quarterly real output per worker for the non-farm business sector for 1951–2005, taking logs and HP-filtering, gives standard deviation 0.02 and AR1-coefficient 0.89 BLS reported quarterly compensation and employment for the non-farm business sector for 1951–2005. Wage refers to per worker compensation (no hours choice in model). BLS reports a current dollar compensation index, which is deflated by the CPI for all urban consumers and divided by employment Conference Board’s help-wanted advertising index Current Population Survey All data are logged and HPð105 Þ-filtered
Notes: The calibration seeks consistency with Shimer (2005a) and the subsequent literature. This motivates the calibration of the matching function, separation rate and vacancy cost, as well as the choice of filter. The interval is short both to allow using a Cobb–Douglas matching function, common in empirical studies estimating matching functions (short enough periods are needed to guarantee matching probabilities between zero and one), and because unemployed workers are forced to stay unemployed for at least one period in the model. The elasticity of the matching function, α = 0.72, is based on estimates from time series data on vacancies and unemployment, and the separation rate on unemployment data. (Robustness checks show that the results hold also for lower values of the matching function elasticity.) As in Shimer (2005a), the steady-state θ can be normalized to one (the calibration involves a normalization because the units of vacancies are not pinned down). The vacancy cost is calibrated such that the average job-finding probability equals the empirically observed 0.15 per period. Doing so guarantees that the model is consistent with the average level of unemployment in all the results presented. The transition matrix is chosen such that, conditional on an aggregate shock, if productivity is intermediate it is equally likely to increase or decrease, and if it is high or low it returns to the intermediate value.
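As a concrete check on this specification, the sketch below (pure Python; the values Δz = 0.03115 and λ = 0.98805 are taken from Table 1) builds the three-state transition matrix and verifies that its stationary distribution puts mass (1/4, 1/2, 1/4) on the states, so that mean productivity is one:

```python
# Sketch of the three-state productivity process from Table 1.
# With prob. lam productivity stays put; with prob. 1-lam it moves:
# from the middle state up or down with prob. 1/2 each, and from
# either extreme back to the middle.

def transition_matrix(lam):
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    move = [[0, 1, 0], [0.5, 0, 0.5], [0, 1, 0]]
    return [[lam * identity[i][j] + (1 - lam) * move[i][j]
             for j in range(3)] for i in range(3)]

def stationary(P, iters=10000):
    # Power iteration: pi <- pi P until convergence.
    pi = [1 / 3] * 3
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
    return pi

dz, lam = 0.03115, 0.98805
z = [1 - dz, 1.0, 1 + dz]
P = transition_matrix(lam)
pi = stationary(P)
mean_z = sum(p * zi for p, zi in zip(pi, z))
# Rows of P sum to one, the stationary distribution is (1/4, 1/2, 1/4),
# and mean productivity is one, as the calibration requires.
```

The symmetry of the chain is what delivers mean one: the extremes are equally likely, so the ±Δz deviations cancel.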
4.1. Calibration

The calibration is described in Table 1. It largely follows Shimer (2005a) and the related literature for comparability. However, to avoid taking a stand on the value of unemployment consumption, b, I present results for a range of values, so that the reader can see the sensitivity to this much-stressed variable. The value of unemployment consumption is important for the amplification properties of the model, but the literature remains divided on its appropriate calibration.11 For direct evidence on b, note that the model's workers are excluded from asset markets and consume their income each period, so b corresponds roughly to the consumption of the unemployed relative to that of the employed, because the wage level is close to one. Empirical estimates of how much consumption falls upon entering unemployment range from 5% to 14%.12 The computation strategy used to solve the model in the case of two-sided limited commitment requires restricting the productivity states z to a relatively small set. The baseline calibration uses a three-state process for productivity. Because the degree of discretization may matter for the impact of limited commitment in contracting, a supplementary online
11 While Shimer (2005a) calibrates the parameter to the replacement rate of unemployment insurance, roughly 0.4, which implies little amplification, Hagedorn and Manovskii (2008) have argued that the high observed volatility of unemployment is an indication that workers are relatively indifferent with respect to unemployment, and so the appropriate value for the parameter is instead close to one (productivity of market work). A consensus view has not emerged, beyond the suggestion that some intermediate value may be appropriate. 12 Due to data limitations, empirical estimates are available mainly for food consumption: Aguiar and Hurst (2005) find a 5% drop in food consumption, Gruber (1997) and Stephens (2001), respectively, 7% and 10% drops in food expenditures. As the demand for food is likely to be less elastic than that for other goods, one would expect the total consumption of goods to drop more than that of food. Appropriately, Browning and Crossley (2001) find a 14% drop in a wider measure of consumption expenditures.
[Figure 1 appears here: four panels (FC, 1LC, LC, re-barg) showing aggregate-wage deviations (roughly ±0.05) against time in years (5–55).]
Fig. 1. An illustration of the aggregate wage across contracting environments. Notes: The figure shows how model-generated wage data differ across contracting environments: full commitment (FC), one-sided limited commitment where only firms commit (1LC), two-sided limited commitment (LC) and continual re-bargaining of wages. The gray line indicates the underlying productivity realization, with a 1% deviation corresponding to 0.01. The data are aggregated to quarterly, logged and filtered.
appendix provides robustness checks related to the productivity specification: extending to a four-state process does not change the results in a significant way. Robustness to the estimation errors involved in calculating the moments of the productivity process is also examined. The next section examines how the model compares to data in terms of the cyclical behavior of the aggregate wage and the vacancy–unemployment ratio, the key variable behind variation in unemployment in the model. The central question is whether the magnitudes of the responses of these variables to productivity shocks in the model are consistent with those in the data. To address the question, it seems appropriate to deviate from the majority of the literature examining the unemployment volatility puzzle, which focuses on the ability of the model to generate the same standard deviation for the vacancy–unemployment ratio as observed in the data. Over the 1951–2005 period, the empirical correlation of the vacancy–unemployment ratio with productivity is only 0.4, while in the model with one shock the correlation is approximately one. The low correlation suggests that there may be other shocks causing variation in the data beyond those affecting productivity, and if these are excluded from the model, then the model should only be expected to explain part of the variation. I therefore measure responses instead as regression coefficients of the vacancy–unemployment ratio and the aggregate wage on productivity (all in logs). The main impact of the regression approach is to adjust both empirical targets down, according to the correlations. These elasticities are denoted P(θ|z) and P(w|z); the empirical values are 7.58 and 0.55, respectively.13

4.2. The impact of commitment on the aggregate wage

Enriching the Mortensen–Pissarides model with long-term contracting allows it to reproduce observed patterns in the cyclical behavior of individual-level wages.
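The regression measure can be stated compactly: the elasticity of a variable x with respect to productivity is the OLS slope P(x|z) = cov(x, z)/var(z). A minimal sketch, using illustrative series rather than model output:

```python
# Minimal sketch of the elasticity measure P(x|z) = cov(x, z) / var(z),
# i.e. the OLS slope from regressing a logged, filtered series x on
# logged, filtered productivity z. The series below are illustrative.

def mean(v):
    return sum(v) / len(v)

def projection_coefficient(x, z):
    """OLS slope of x on z: cov(x, z) / var(z)."""
    mx, mz = mean(x), mean(z)
    cov = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / len(z)
    var = sum((zi - mz) ** 2 for zi in z) / len(z)
    return cov / var

# If x responds one-for-two to z, the measured elasticity is 2,
# regardless of how volatile z itself is.
z = [0.01, -0.02, 0.03, 0.0, -0.01]
x = [2 * zi for zi in z]
# projection_coefficient(x, z) -> 2.0
```

Unlike a ratio of standard deviations, this slope scales down with the correlation between x and z, which is exactly why both empirical targets fall under the regression approach.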
The model reproduces the empirical finding that the wages of newly hired workers are more cyclical than those of existing workers (Bils, 1985, and others). It allows aggregate conditions at the time of hiring to have a persistent effect on a worker's wage and, in the case of limited-commitment contracts, it allows the most

13 P(x|z) = cov(x, z)/var(z) for x = θ, w. The regression measure is used here because it seems the more appropriate measure, but the choice does not matter for the main conclusions of the paper; the interested reader will find the corresponding results using standard deviations in a supplementary online appendix on robustness.
extreme aggregate conditions that have prevailed during the life of a contract to affect this wage as well (Beaudry and DiNardo, 1991; McDonald and Worswick, 1999; Grant, 2003; Macis, 2007). The next sections show that a quantitative evaluation of the model relating it to aggregate wage data supports the two-sided limited-commitment model: it is consistent with observed wage cyclicality, as well as with the timing of wage responses to productivity shocks. Fig. 1 illustrates how the time series of the aggregate wage looks in the three different contracting environments. In the full-commitment economy contract wages are constant, with higher productivity at the time of hiring translating into a higher wage level. The aggregate wage is procyclical because of turnover in the labor force. When workers have limited commitment, contract wages remain constant whenever possible, but if the workers' outside option begins to bind when productivity improves, they adjust up together with productivity. Hence the aggregate wage, too, can jump up in the event of a sufficiently large positive productivity shock, reflecting the adjustment in the contract wages of those workers for whom the outside option binds. When both parties have limited commitment, contract wages also adjust down if the firms' outside option begins to bind when productivity falls. Hence the aggregate wage can likewise jump down in the event of a sufficiently large negative productivity shock. The final time series in the figure corresponds to the standard wage-setting mechanism used in the search and matching literature: continual re-bargaining (see e.g. Pissarides, 1985). These wages result from the bargaining environment of Proposition 6, when the feasible set of contracts is restricted to reflect re-bargaining each period. In that case all workers receive the same wage, and that wage is strongly affected by current productivity.
One way in which the environments differ is how quickly the aggregate wage responds to productivity shocks. To compare the models to data on this dimension, Fig. 2 plots empirical and model-generated correlations of the aggregate wage with lagged productivity. The data show a fairly contemporaneous relationship, with slightly more weight on lags. The fixed wage economy has a notable lag in the correlation because the aggregate wage responds only through turnover in the labor force. The continually re-bargained wage represents the opposite extreme with a sharp contemporaneous correlation. The limited-commitment economies, particularly the two-sided limited-commitment economy, come closer to the empirical counterpart by combining the lagged response through turnover with contemporaneous pro-cyclical adjustments in contract wages.
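The diagnostic behind Fig. 2 is simple to compute; the sketch below (pure Python, with illustrative toy series rather than model output) evaluates corr(w_t, z_{t−lag}) for a wage series that tracks productivity with a two-period delay:

```python
# Sketch of the lagged-correlation diagnostic: the correlation of the
# aggregate wage w_t with productivity z_{t-lag}. A peak at a positive
# lag means wages respond to productivity with a delay.

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def lagged_correlations(w, z, max_lag):
    """corr(w_t, z_{t-lag}) for lag = 0..max_lag."""
    return {lag: corr(w[lag:], z[:len(z) - lag])
            for lag in range(max_lag + 1)}

# Toy series in which wages track productivity with a two-period delay:
z = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
w = [0.0, 0.0] + z[:-2]          # w_t = z_{t-2}
cors = lagged_correlations(w, z, 3)
# The correlation peaks at lag 2 (exactly 1.0 there).
```

A fixed-wage economy, where the aggregate wage moves only through turnover, produces a profile peaking at positive lags like this toy example; continual re-bargaining instead concentrates the correlation at lag zero.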
[Figure 2 appears here: five panels (data, FC, 1LC, LC, re-barg) showing the correlation of w_t with z_{t−lag} for lags from −10 to 10.]
Fig. 2. Correlation of the aggregate wage with lagged productivity. Notes: The figure compares correlations of the aggregate wage with lagged productivity in the data versus the different contracting environments: full commitment (FC), one-sided limited commitment where only firms commit (1LC), two-sided limited commitment (LC) and continual re-bargaining of wages.
[Figure 3 appears here: P(w|z) (vertical axis, about 0.1–0.8) plotted against P(θ|z) (horizontal axis, about 2–12) for the two-sided limited commitment, one-sided limited commitment and full commitment economies, with the data point marked.]
Fig. 3. Sensitivity of vacancy–unemployment ratio θ and aggregate wage w to productivity z. Notes: Sensitivities are measured as regression coefficients and denoted by P(θ|z) and P(w|z). The curves in the figure are traced out by varying b, the level of consumption of unemployed agents, in three contracting environments: full commitment, one-sided limited commitment, and two-sided limited commitment. The calibration of the vacancy cost k guarantees that the level of unemployment is constant in the figure.
[Figure 4 appears here: two panels plotting P(θ|z) (top) and P(w|z) (bottom) against b, over 0.2–0.9, for the LC, 1LC and FC economies and the data.]
Fig. 4. Sensitivity of vacancy–unemployment ratio θ and aggregate wage w to productivity z. Notes: Sensitivities are measured as regression coefficients and denoted by P(θ|z) and P(w|z). The calibration of the vacancy cost k guarantees that the level of unemployment is constant in the figure. The kinks in the wage plot are due to the finite number of states in the productivity process.
4.3. The volatility of the aggregate wage versus unemployment

Fig. 3 examines how the quantitative performance of the model compares to the data in terms of matching the business cycle volatility of the aggregate wage and the vacancy–unemployment ratio. It traces out the model-produced volatilities for each of the three contracting environments as a function of b, the consumption of the unemployed. The figure shows that the model framework is capable of producing very rigid wages compared to the data. Note that the model is able to approximate the empirical level of wage volatility only when wage smoothing is restricted by limited commitment on both sides. The figure also makes the important point that the same extent of unemployment cyclicality can well be consistent
Table 2
Impact of risk aversion.

Utility function   Wage setting   b      k      P(w_t|z_t)
Data                                            0.55
Linear             FC             0.87   0.04   0.12
                   1LC            0.87   0.04   0.24
                   LC             0.87   0.04   0.53
                   Re-barg.       0.87   0.04   0.96
γ = 1              FC             0.87   0.05   0.12
                   1LC            0.87   0.05   0.23
                   LC             0.86   0.05   0.49
γ = 5              FC             0.86   0.07   0.12
                   1LC            0.86   0.07   0.17
                   LC             0.85   0.08   0.36
Notes: Each row presents the values of the unemployment consumption b and vacancy cost k that allow the model to match both the observed level of unemployment and the observed volatility of market tightness, given the contracting environment. The lower the value of b required, the more amplification is produced by the contracting environment itself. The vacancy cost k is set s.t. E[m] = 0.15 and the unemployment consumption b is set s.t. P(θ_t|z_t) = 7.583. Preferences are u(c) = c^{1−γ}/(1−γ) for γ = 0, 1, 5. When preferences are linear, the form of the optimal contract is not determined. The table presents four alternatives for wage determination in this case, corresponding to the form of the FC, 1LC, LC contracts, as well as continual re-bargaining of wages. The contract form does not affect the values of b and k in this case, but only the cyclicality of the aggregate wage.
with a range of different degrees of wage cyclicality. Observations on the cyclicality of the aggregate wage are not informative about unemployment. To understand what underlies this crucial figure, Fig. 4 divides it into two parts, allowing the reader to fix a calibration of b and examine the impact of the contracting environment. It is well known that in the standard Mortensen–Pissarides model the cyclicality of the vacancy–unemployment ratio is increasing in the value of b, and the same holds here as well. A high b implies that workers are relatively indifferent between market work and unemployment, causing productivity shocks of a given size to induce larger changes in hiring. Under limited-commitment contracting, the cyclicality of the aggregate wage is also increasing in b, because participation constraints bind more often when workers are more indifferent. The figure shows that limited-commitment constraints increase the volatility of both the aggregate wage and unemployment. This seemingly surprising finding arises because limited commitment increases the cyclicality of contract wages and starting wages, which raises the cyclicality of the aggregate wage, while at the same time it creates rigidity in the present value of wages used to attract new workers, which raises the cyclicality of vacancy creation. The figure also shows that while the impact on the wage is substantial, the impact on unemployment is relatively modest. The aggregate wage strongly reflects the form of wage contracts, while the impact on the present value of wages is modest.
4.4. Effects of risk aversion

The workers' preference for consumption smoothing plays a central role in the model. In fact, if workers have linear preferences, γ = 0, the model can be shown to reproduce the employment dynamics of the standard Mortensen–Pissarides model.14 Qualitatively, one would expect two effects from increased risk aversion: on the one hand, the distortions due to the incomplete-markets environment become amplified, which implies greater cyclical variation in unemployment and reduced variation in the present value of wages used to attract new workers. On the other hand, participation constraints bind less often, which reduces the impact of limited commitment. When risk aversion is sufficiently high, participation constraints cease to bind altogether, making the limited-commitment economies identical to the full-commitment economy, with a very rigid aggregate wage. Table 2 examines the quantitative impact of increasing γ. For each value of γ and each contracting environment, it shows how high a value of b is needed to match the observed volatility of θ. The lower the value, the more amplification is due to the environment itself. To guarantee that the level of unemployment remains fixed in these comparisons, as in the previous figures, the vacancy cost adjusts.15 The table shows that increasing risk aversion increases the volatility of the vacancy–unemployment ratio, especially when limited commitment binds, but that the impact is relatively modest in magnitude. At the same time, the impact on the cyclicality of the aggregate wage is quite substantial. Wage volatility falls

14 When the bargaining power of workers satisfies the Hosios condition, as often imposed.
15 Matching both the level and the volatility determines the two parameters b and k uniquely: while both b and k increase the level of unemployment, they have opposing effects on volatility.
both because the limited-commitment economies become more similar to the full-commitment economy, and because the present value of wages used to attract new workers becomes more rigid. In a full-commitment economy the latter is accomplished through reduced differences in the wage levels of workers hired in booms versus recessions, which also makes the aggregate wage more rigid. These findings reinforce the conclusions that: (i) the model is capable of producing a very rigid aggregate wage, without the substantial increase in the cyclical volatility of unemployment seen with exogenous wage rigidity, and (ii) a given level of unemployment volatility can be consistent with a range of wage volatilities, depending on the environment.

5. Conclusions

The aggregate wage is not informative about the measure of wage rigidity relevant for understanding unemployment fluctuations: how much the present value of wages used to attract new workers varies over the cycle. In an ideal world, one could estimate these present values for individual workers directly, using a sufficiently wide and long panel data set with information on tenure, and compare them for workers hired in booms versus recessions. With limitations on data, a more practical approach is to make assumptions about the type of contract governing the evolution of individual wages in the data, and take advantage of that structure to calculate estimates of these present values. Kudlyak (2007) conducts such an exercise, failing to find evidence of significant rigidity in wages. In related work, Haefke et al. (2007) and Pissarides (2007) interpret the higher cyclical volatility of starting wages, as compared to that of the aggregate wage, as evidence against wage rigidity. Creating the link between the starting wage and the present value requires assumptions about how wages evolve in contracts, and how informative the results are hinges on how accurate those assumptions are.
Here the limited-commitment contracting framework has the advantage of a supporting empirical literature.16 This paper is not the first to look for micro-foundations for wage rigidity, as called for by Hall (2005). Gertler and Trigari (2008) introduce staggered wage bargaining into the Mortensen–Pissarides model, allowing only a subset of firms to adjust their wages each period. In their model hiring decisions are distorted in those firms that are constrained to keep wages fixed. Hall and Milgrom (2008) argue that the threat points in the bargaining problem faced by workers and firms in the Mortensen–Pissarides model are misspecified. Altering these threat points to reflect the delay of negotiations rather than the ending of negotiations creates some added rigidity in the present value of wages. Several authors have also explored the role of informational asymmetries and incomplete markets in creating wage rigidities, interesting approaches that are non-trivial to incorporate into quantitative macroeconomic analysis.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2008.12.009.
References

Aguiar, M., Hurst, E., 2005. Consumption versus expenditure. Journal of Political Economy 113 (5), 919–948.
Azariadis, C., 1975. Implicit contracts and underemployment equilibria. Journal of Political Economy 83 (6), 1183–1202.
Baily, M.N., 1974. Wages and employment under uncertain demand. Review of Economic Studies 41 (1), 37–50.
Beaudry, P., DiNardo, J., 1991. The effect of implicit contracts on the movement of wages over the business cycle: evidence from micro data. Journal of Political Economy 99 (4), 665–688.
Bils, M., 1985. Real wages over the business cycle: evidence from panel data. Journal of Political Economy 93 (4), 666–689.
Browning, M., Crossley, T.F., 2001. Unemployment insurance benefit levels and consumption changes. Journal of Public Economics 80, 1–23.
Caplin, A.S., Spulber, D.F., 1987. Menu costs and the neutrality of money. Quarterly Journal of Economics 102 (4), 703–725.
Chetty, R., 2008. Moral hazard vs. liquidity and optimal unemployment insurance. Journal of Political Economy 116 (2), 173–234.
Gertler, M., Trigari, A., 2008. Unemployment fluctuations with staggered Nash wage bargaining. Unpublished Manuscript, New York University.
Grant, D., 2003. The effect of implicit contracts on the movement of wages over the business cycle: evidence from national longitudinal surveys. Industrial and Labor Relations Review 56, 393–408.
Gruber, J., 1997. The consumption smoothing benefits of unemployment insurance. American Economic Review 87, 192–205.
Haefke, C., van Rens, T., Sonntag, M., 2007. Wage rigidity and job creation. Unpublished Manuscript, Institute for Advanced Studies, Vienna.
Hagedorn, M., Manovskii, I., 2008. The cyclical behavior of equilibrium unemployment and vacancies revisited. American Economic Review 98 (4), 1692–1706.
Hall, R.E., 2005. Employment fluctuations with equilibrium wage stickiness. American Economic Review 95 (1), 50–65.
Hall, R.E., Milgrom, P.R., 2008. The limited influence of unemployment on the wage bargain. American Economic Review 98 (4), 1653–1674.
Kudlyak, M., 2007. The cyclicality of the user cost of labor with search and matching. Unpublished Manuscript, University of Rochester.
Macis, M., 2007. Wage dynamics and insurance. Unpublished Manuscript, University of Chicago.
MacLeod, W.B., Malcomson, J.M., 1993. Investments, holdup, and the form of market contracts. American Economic Review 83 (4), 811–837.
McDonald, J.T., Worswick, C., 1999. Wages, implicit contracts, and the business cycle: evidence from Canadian micro data. Journal of Political Economy 107 (4), 884–892.
Moen, E.R., 1997. Competitive search equilibrium. Journal of Political Economy 105 (2), 385–411.
Pissarides, C.A., 1985. Short-run equilibrium dynamics of unemployment, vacancies, and real wages. American Economic Review 75 (4), 676–690.
16 For a survey see Thomas and Worrall (2007).
Pissarides, C.A., 2007. The unemployment volatility puzzle: is wage stickiness the answer? Unpublished Manuscript, London School of Economics.
Reiter, M., 2008. Embodied technical change and the fluctuations of wages and unemployment. Scandinavian Journal of Economics 109 (4), 695–721.
Rudanko, L., 2008. Aggregate and idiosyncratic risk in a frictional labor market. Unpublished Manuscript, Boston University.
Shimer, R., 2005a. The cyclical behavior of equilibrium unemployment and vacancies. American Economic Review 95 (1), 25–49.
Shimer, R., 2005b. Reassessing the ins and outs of unemployment. Unpublished Manuscript, University of Chicago.
Sigouin, C., 2004. Self-enforcing employment contracts and business cycle fluctuations. Journal of Monetary Economics 51, 339–373.
Stephens, M., 2001. The long run consumption effects of earnings shocks. Review of Economics and Statistics 83, 28–36.
Thomas, J., Worrall, T., 1988. Self-enforcing wage contracts. Review of Economic Studies 55, 541–554.
Thomas, J., Worrall, T., 2007. Limited commitment models of the labour market. Scottish Journal of Political Economy 54, 750–773.
ARTICLE IN PRESS Journal of Monetary Economics 56 (2009) 184–199
The scarring effect of recessions$

Min Ouyang
Department of Economics, 3151 Social Science Plaza B, Irvine, CA 92697-5100, USA
Article info
Article history: Received 22 January 2007; Received in revised form 19 December 2008; Accepted 19 December 2008; Available online 20 January 2009
JEL classification: E32; L16; C61
Keywords: Cleansing effect; Scarring effect; Creative destruction; Learning; Demand shocks

Abstract
According to the conventional view, recessions improve resource allocation by driving out less productive firms. This paper posits an additional scarring effect: recessions impede the development of potentially superior firms by destroying them during their infancy. A model is developed to capture both the cleansing and the scarring effects. A key ingredient of the model is that idiosyncratic productivity is not directly observable, but can be learned over time. When calibrated with statistics on entry, exit and productivity differentials, the model suggests that the scarring effect dominates the cleansing effect, and gives rise to lower average productivity during recessions. © 2009 Elsevier B.V. All rights reserved.
1. Introduction How do recessions affect resource allocation? Economists have long studied this question. Schumpeter (1934) advanced the concept of cleansing: recessions eliminate outdated techniques and obsolescent products, and thus free resources for more productive uses. This idea has been revived during the last decade in an assortment of theoretical work such as Caballero and Hammour (1994, 1996), Hall (2000), and Mortensen and Pissarides (1994). In recent years, however, researchers have begun to explore alternative ways in which recessions might influence allocation. Barlevy (2002, 2003) posits adverse effects caused by credit-market frictions or on-the-job search that offset some or all of the cleansing effect. Related empirical investigations suggest that recessions affect allocation through numerous channels, some with characteristics consistent with cleansing and others consistent with negative effects of the alternatives.1 Three existing empirical findings have pointed to an unexplored channel through which recessions can affect allocation. First, although businesses’ deaths do surge during recessions, the failing ones are not always the least productive. For
$ The author is indebted to Robert King, an anonymous referee, Richard Rogerson, and John Shea for their detailed comments. Ricardo Caballero, Gadi Barlevy, Mark Gertler, John Haltiwanger, Hugo Hopenhayn, Michael Pries, Vincenzo Quadrini, John Rust, Dmitriy Stolyarov, and Daniel Vincent provided generous advice. The author also thanks John Haltiwanger for providing the gross job flows data. Seminar participants at UMD, Yale, USC, UC-Irvine, the Kansas City Fed, the Federal Reserve Bank of San Francisco, the Federal Reserve Bank of Cleveland, McGill, Queen's, UC-San Diego, and UC-Davis provided helpful discussions. A technical appendix is provided at ScienceDirect and on the author's homepage. All errors are the author's. E-mail address:
[email protected]
URL: http://www.socsci.uci.edu/mouyang
1 See Davis et al. (1999) and Barlevy (2002, 2003).
0304-3932/$ - see front matter & 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.014
example, Baden-Fuller (1989) examines British recessions during the 1980s and finds that many closing firms were more profitable than the surviving ones. Second, most businesses fail at a very young age. According to Dunne et al. (1989), over 75 percent of the exiting plants in the U.S. manufacturing sector were aged five years or less (Table 1, p. 676). Third, and most importantly, recessions disproportionately affect businesses in their early years of operation. This is shown in Fig. 1, which plots the quarterly exit rate of U.S. manufacturing plants across three age categories. Evidently, infant plants suffer the most during recessions. For example, in the second quarter of 1984, the exit rate jumped from 1.35% to 3.42% for plants aged one year or less, and from 0.78% to 0.87% for plants aged between one and nine years; but, for plants aged 10 years or more, it rose only from 0.35% to 0.37%. The disproportionate deaths among infant businesses, this paper argues, should play an important role in determining the allocative effect of economic downturns. Infant businesses tend to appear unproductive in the short run, but have the potential to reveal high productivity in the future. Recessions that destroy infant businesses scar the economy by preventing new and innovative businesses from reaching their full potential. This scarring effect offsets the conventional cleansing effect, although both effects take place through the exit of unprofitable firms. Accordingly, the overall impact of recessions on resource allocation depends upon the relative magnitudes of these two competing effects: cleansing and scarring. To understand the scarring effect, consider the life cycle of a firm. A firm usually starts without fully knowing its own quality. Uncertainty may come from the unobserved talent of the manager, unknown appeal of the product, or unpredictable profitability of a retail location.
As the firm operates, realized revenue signals its true quality: high revenue indicates that it is productive and encourages continued operation; low revenue implies otherwise. The longer a firm operates, the more it learns about its true quality. Therefore, potentially good firms, those that do not yet know they are good, must be relatively young. During recessions, profitability declines in general, so that a firm cannot bear to learn as long as during good times. A potentially good firm that would have survived during good times might thus exit during recessions before it learns. At the industry level, the exit of potentially good firms reduces the proportion of good firms at present, as well as in the future, because fewer potentially good firms are left to learn. The reduced proportion of good firms lowers average productivity, which is defined in this paper as a scarring effect. The above story reflects the spirit of learning, theoretically proposed by Jovanovic (1982) and empirically promoted by Caves (1998) and Foster et al. (2008) as a powerful tool to understand firm turnover. In this paper, a simplified learning mechanism from Pries (2004) is combined with the vintage framework of Caballero and Hammour (1994) to capture cleansing and scarring theoretically. The model decomposes firm productivity into two components, vintage and unobservable idiosyncratic productivity, so that an industry's average productivity is determined by the distribution of firms across both dimensions. The idiosyncratic productivity is not directly observable, but can be learned over time. Demand variations serve as the source of economic fluctuations. Lower demand reduces profitability in general, so that firms exit younger. Younger exit ages direct, on the one hand, resources to younger and more productive vintages, causing a
[Figure 1 appears here: quarterly exit-rate series, vertical axis 0–0.06, plotted over 1972–1988.]
Fig. 1. Quarterly plant exit rates in the U.S. manufacturing sector (1972–1988). Dotted line represents the exit rate for plants aged ten years or older; dashed line represents the exit rate of plants aged between one and ten years; solid line represents that for plants younger than a year. All exit rates are employment-weighted. Data source: gross job flows data compiled by Davis et al. (1996).
ARTICLE IN PRESS 186
M. Ouyang / Journal of Monetary Economics 56 (2009) 184–199
cleansing effect that raises average productivity; on the other hand, they truncate the learning process that leads resources toward firms with higher idiosyncratic productivity, creating a scarring effect that pulls down average productivity. Hence, recessions cause two competing effects, cleansing and scarring. The question then becomes: which effect dominates? The paper turns to data to evaluate the scarring effect quantitatively. The model is calibrated to statistics on entry, exit, and cohort productivity differentials observed in the U.S. manufacturing sector. When applied to stochastic demand shocks, the calibrated model suggests that the scarring effect dominates the cleansing effect and generates lower average productivity during recessions. The rest of the paper is organized as follows. Section 2 lays out the model. The cleansing and scarring effects are motivated in Section 3 with comparative static exercises. Section 4 applies the model to stochastic demand shocks, confirms that the cleansing and scarring effects carry over, and studies their quantitative implications. Section 5 concludes.
2. A renovating industry with learning
Consider an industry where labor and capital combine in fixed proportions to produce a homogeneous output. Firms that enter in different periods coexist, each characterized by two components: vintage and idiosyncratic productivity. A firm's vintage is given by an exogenous technological progress that drives the industry's leading technology to grow at a constant rate g > 0. With A_t as the leading technology in period t, A_{t+1} = A_t(1 + g). Firms that enter in period t adopt A_t. Only entrants have access to new technology; incumbents cannot retool. With firm age defined as the number of periods a firm has survived, the vintage of a firm of age a in period t equals A_t(1 + g)^{−a}. At entry, each firm is endowed with idiosyncratic productivity θ.
This idiosyncratic productivity can be the talent of the manager, as in Lucas (1978), or, alternatively, the location of the store, the organizational structure of the production process, or its fitness to the embodied technology. The key assumption regarding θ is that its value, although fixed at the time of entry, is not directly observable. A firm hires one worker. The period-t output of a firm of age a and with idiosyncratic productivity θ is

q_t(a, θ) = A_t(1 + g)^{−a} x_t,
(1)
where x_t = θ + ε_t; x_t captures the influence of θ on output, masked by an independent transitory shock ε_t. With the wage rate normalized to one and the output price denoted P_t, this firm's period-t profit is

π_t(a, θ) = P_t A_t(1 + g)^{−a}(θ + ε_t) − 1.
(2)
Both q_t(a, θ) and π_t(a, θ) are directly observable. A firm knows its vintage and can infer the value of x_t by observing output or revenue. Given knowledge of the distribution of ε_t, a firm uses the value of x_t to learn about θ.
2.1. ''All-or-nothing'' learning
Firms attempt to resolve the uncertainty about θ to decide whether to continue or terminate production. Following Pries (2004), we model an ''all-or-nothing'' learning process, assuming only two values of θ: θ_g for a good firm and θ_b for a bad firm. Moreover, ε_t is distributed uniformly on [−ω, ω], so that a good firm's x_t each period is a random draw from a uniform distribution over [θ_g − ω, θ_g + ω], and a bad firm's is drawn over [θ_b − ω, θ_b + ω]. θ_g, θ_b and ω satisfy 0 < θ_b − ω < θ_g − ω < θ_b + ω < θ_g + ω. Therefore, an observation of x_t within (θ_b + ω, θ_g + ω] indicates that a firm has good idiosyncratic productivity; conversely, an observation of x_t within [θ_b − ω, θ_g − ω) tells that it has bad idiosyncratic productivity. However, an x_t within [θ_g − ω, θ_b + ω] reveals nothing, because the probabilities of falling in this range as a good firm and as a bad firm both equal (2ω + θ_b − θ_g)/2ω. This all-or-nothing learning process simplifies the model considerably, as it gives only three values for θ̃ (the expected θ). Correspondingly, there are three groups of firms in the industry: good firms with θ̃ = θ_g, bad firms with θ̃ = θ_b, and ''unsure firms'' with θ̃ = θ_u, the prior mean of θ. The probability of true idiosyncratic productivity being revealed each period is p ≡ (θ_g − θ_b)/2ω. The unconditional probability of θ = θ_g is exogenous and equals φ. A firm enters the market as unsure; thereafter, each period it stays unsure with probability 1 − p, learns that it is good with probability pφ, and learns that it is bad with probability p(1 − φ).
Therefore, the evolution of θ̃ from the time of entry is a Markov process with values (θ_g, θ_u, θ_b), an initial probability distribution (0, 1, 0), a transition matrix

[ 1      0        0
  pφ     1 − p    p(1 − φ)
  0      0        1 ],   (3)

and a limiting probability distribution, as a goes to ∞, of (φ, 0, 1 − φ). If firms were to live forever, eventually all uncertainty would be resolved, as enough information would be provided to reveal each firm's true idiosyncratic productivity. Suppose that each entering cohort consists of a continuum of firms, so that the law of large numbers applies. Then φ and p are not only probabilities, but also the fraction of firms with θ = θ_g in an entering cohort and the fraction of firms each period that learn their true idiosyncratic productivity. Ignoring firm exit for now, the fractions of good firms, of unsure firms, and of bad
Fig. 2. Dynamics of a birth cohort with all-or-nothing learning: the distance between the bottom curve and the bottom axis measures the density of firms with θ̃ = θ_g; the distance between the top curve and the top axis measures the density of firms with θ̃ = θ_b; the distance between the two curves measures the density of firms with θ̃ = θ_u. This figure is generated assuming a prior fraction of good firms of 0.5 and a learning pace of 0.02.
firms in a cohort of age a are

(φ − φ(1 − p)^a, (1 − p)^a, (1 − φ) − (1 − φ)(1 − p)^a).   (4)
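As a check on the algebra, the cohort dynamics in (4) can be reproduced by iterating the transition matrix in (3). The sketch below does so in Python, using the parameter values from the Fig. 2 caption (a prior fraction of good firms of 0.5 and a learning pace of 0.02); it is illustrative, not part of the paper's code.

```python
import numpy as np

# All-or-nothing learning: a cohort enters unsure; each period a fraction p of
# the unsure firms learns its type, a share phi of them turning out good.
phi, p = 0.5, 0.02  # values from the Fig. 2 caption

# Transition matrix over states (good, unsure, bad), as in Eq. (3).
T = np.array([
    [1.0,     0.0,   0.0],
    [p * phi, 1 - p, p * (1 - phi)],
    [0.0,     0.0,   1.0],
])

dist = np.array([0.0, 1.0, 0.0])  # entrants are all unsure: distribution (0, 1, 0)
for a in range(1, 11):
    dist = dist @ T
    # Closed form from Eq. (4).
    closed = np.array([phi - phi * (1 - p) ** a,
                       (1 - p) ** a,
                       (1 - phi) - (1 - phi) * (1 - p) ** a])
    assert np.allclose(dist, closed)
```

Iterating the chain and evaluating the closed form agree period by period, which is exactly the sense in which firm-level learning is sudden while cohort-level learning is gradual.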
Fig. 2 plots the evolution of the firm distribution within a birth cohort under all-or-nothing learning. The horizontal axis depicts the age of a cohort across time. Clearly, the fractions of firms that know their true idiosyncratic productivity, whether good or bad, grow as a cohort ages. Moreover, the two learning curves that trace the dynamics of the fractions of good firms and bad firms are both concave. This is the decreasing marginal learning captured by Jovanovic (1982): the marginal learning effect decreases as a firm ages, which, in this model, is reflected in the decline in the marginal number of learners as a cohort ages. The convenient feature of all-or-nothing learning is that, on the one hand, firm-level learning occurs suddenly, which makes the cross-sectional firm distribution easy to track, while, on the other hand, cohort-level learning takes place gradually, as in a more standard learning process. However, Fig. 2 can tell us more. Let the horizontal axis depict the cross-sectional distribution of firm ages at any instant; then Fig. 2 captures the firm distribution across ages and idiosyncratic productivity in an industry that features constant entry but no exit. In this industry, cohorts continuously enter at the same size and experience the same dynamics as they age, so that, at any one time, different life stages of different birth cohorts overlap, giving rise to the firm distribution in Fig. 2. Under this interpretation, Fig. 2 indicates that older cohorts contain fewer unsure firms, as they have lived longer and learned more.
2.2. The recursive competitive equilibrium
Within each period, the sequence of events is as follows. First, entry and exit take place after firms observe the aggregate state of the industry. Second, each surviving firm pays a fixed operating cost to produce. Third, the output price is realized. Fourth, firms observe revenue and update their beliefs. Then another period begins. With this setup, this subsection defines a recursive competitive equilibrium, which includes as a key element the law of motion of the aggregate state of the industry. The aggregate state is (F, D). F denotes the firm distribution across vintages and idiosyncratic productivity; in F, the element that measures the number of firms of age a and with belief θ̃ is f(θ̃, a). D is an observable exogenous demand parameter. The law of motion for D is exogenous, described by D's transition matrix. The law of motion for F is endogenous and denoted H: F′ = H(F, D). The sequence of events implies that H captures the influence of entry, exit and learning on the firm distribution. Three assumptions characterize this industry equilibrium: firm rationality, free entry, and competitive pricing.
2.2.1. Firm rationality
Firms are forward-looking price takers and profit maximizers. They predict current and future profitability to make decisions on entry or exit. The relevant state variables for a firm are its vintage, its belief about its true idiosyncratic productivity, and the aggregate state (F, D). Let V(θ̃, a; F, D) be the expected value, for a firm of age a and with belief θ̃, of staying in operation for one more period and optimizing afterward when the aggregate state is (F, D). Then V satisfies

V(θ̃, a; F, D) = E[π(θ̃, a) | F, D] + β E[max(0, V(θ̃′, a + 1; F′, D′)) | F, D],   (5)
subject to F′ = H(F, D), the law of motion for D, and the law of motion for θ̃ driven by all-or-nothing learning. Since firms enter as unsure, the expected value of entry is V(θ_u, 0; F, D). According to the firm rationality condition, entry occurs only when V(θ_u, 0; F, D) > 0; a firm of age a and with belief θ̃ exits if and only if V(θ̃, a; F, D) < 0.²
2.2.2. Free entry
Under the free entry condition, new firms can enter at any instant as long as they bear an entry cost c. This entry cost can be the cost of establishing a particular location, purchasing capital stock, or finding a qualified manager. Let f(θ_u, 0; F, D) be the size of an entering cohort with aggregate state (F, D). The entry cost is assumed to be a linear function of the entry size:

c = c_0 + c_1 f(θ_u, 0; F, D),
with c_0 > 0 and c_1 > 0.
(6)
c_1 > 0 means that the entry cost increases with entry size. This can arise from a limited amount of land available for production sites or, alternatively, an upward-sloping supply curve for the industry's capital stock. Goolsbee (1998) provides supporting evidence for this assumption, showing that higher investment demand raises equipment prices. Goolsbee's finding suggests that, as more firms enter, the capital price rises with capital demand, so that entry becomes more costly. New firms keep entering as long as the expected value of entry exceeds the cost of entry. At the same time, the entry cost keeps rising until it reaches V(θ_u, 0; F, D). At this point, entry stops, and

V(θ_u, 0; F, D) = c_0 + c_1 f(θ_u, 0; F, D).
(7)
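The free-entry condition (7) pins down the entering cohort size whenever entry is profitable: solving the linear cost schedule for the entry size gives f(0) = (V − c_0)/c_1. A minimal sketch, with hypothetical numbers for the value of entry and the cost parameters:

```python
# Free entry: firms enter until the entry cost c0 + c1 * f equals the value of
# entry V(theta_u, 0; F, D), as in Eq. (7). Numbers are illustrative only.
def entry_size(v_entry: float, c0: float, c1: float) -> float:
    """Entry size implied by free entry; zero when entry is unprofitable."""
    return max(0.0, (v_entry - c0) / c1)

f0 = entry_size(v_entry=2.0, c0=0.5, c1=0.1)
assert abs((0.5 + 0.1 * f0) - 2.0) < 1e-9  # cost of entry equals value of entry
```

Note how a higher value of entry (for example, after a rise in D) maps into a larger entering cohort, while a steeper cost schedule c_1 dampens that response.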
2.2.3. Competitive pricing
The output price is determined by

P(F, D) = D / Q(F, D),
(8)
where Q is total industry output, equal to the sum of production over heterogeneous firms. Recall that, according to the sequence of events, production takes place after entry and exit. Let F′ be the updated firm distribution after entry and exit, and f(θ̃, a)′ the element of F′ that measures the number of firms of age a and with belief θ̃.³ Applying (1) gives

Q(F, D) = Q(F′) = Σ_a Σ_{θ̃} A(1 + g)^{−a} θ̃ f(θ̃, a)′,   (9)
where D is an exogenous demand parameter that captures the influence of demand fluctuations on the industry's production profitability. In reality, such demand variations can arise from changes in consumers' taste for an industry's goods or, alternatively, from productivity shocks to a downstream industry that demands this industry's output as one of its inputs. In this model, D equals industry total revenue, and it is the exogenous fluctuations in D that introduce cycles to the industry. Higher D implies higher P, which encourages entry and reduces exit, so that Q rises. Conversely, lower D causes less entry but more exit, and Q falls. With the conditions of firm rationality, free entry, and competitive pricing, the following definition is established:
Definition. A recursive competitive equilibrium is a law of motion H, a value function V, and a pricing function P such that (i) V solves the firm's optimization; (ii) P satisfies (8); and (iii) H is generated by the decision rules suggested by V and the appropriate summing-up of entry, exit and learning.
An additional assumption is made to simplify the model:
Assumption. Given values for other parameters, the value of θ_b is so low that V(θ_b, a; F, D) remains negative for any a and any (F, D).
² Caballero and Hammour (1994) assume myopic exit behavior (a firm exits as soon as its current profit drops below zero), which is justified by their deterministic and smooth demand sequence, under which the exit behavior of a forward-looking firm is similar to that of a myopic firm. In contrast, demand follows a two-state Markov process in our model, so firm exit decisions must be forward-looking. The forward-looking exit behavior incorporates the value of waiting: when demand is low, a firm may choose to stay even if its current profit has dropped below zero, because it recognizes the probability of a future demand recovery.
³ Q is the sum of expected output A(1 + g)^{−a} θ̃ rather than realized output A(1 + g)^{−a}(θ + ε).
This is because, with a continuum of firms in each birth cohort, the law of large numbers applies, so that the production noises and the expectation errors cancel out and the sum of realized output equals the sum of expected output.
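The pricing block (8)–(9) can be illustrated with a toy post-entry/exit distribution f(θ̃, a); the beliefs, masses, and parameter values below are made up for illustration, and only expected output enters Q, in the spirit of footnote 3.

```python
# Aggregate output Q sums expected output A(1+g)^(-a) * theta_tilde over the
# post-entry/exit distribution f(theta_tilde, a), as in Eq. (9); the price then
# follows from Eq. (8): P = D / Q. Hypothetical distribution and parameters.
A, g, D = 1.0, 0.01, 10.0
theta = {'good': 1.5, 'unsure': 1.2}        # beliefs theta_g and theta_u
f = {('good', 3): 0.4, ('unsure', 1): 0.6}  # mass of firms by (belief, age)

Q = sum(A * (1 + g) ** (-a) * theta[belief] * mass
        for (belief, a), mass in f.items())
P = D / Q
assert abs(P * Q - D) < 1e-9  # industry total revenue equals D
```

The last assertion restates the text's observation that D equals industry total revenue: P·Q = D by construction of the pricing rule.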
Under this assumption, bad firms always exit. Thus, at any one time, there are only two types of firms in operation: unsure and good.
3. Cleansing and scarring
Section 2 shows that the firm distribution F enters the model as a state variable, which makes it difficult to characterize the industrial dynamics in response to stochastic demand shocks. However, if shocks are sufficiently persistent, the effects of temporary changes in response to transitory shocks are similar to those induced by permanent shocks. Therefore, this section uses comparative static exercises on the steady-state equilibrium to motivate the cleansing and scarring effects. The next section turns to a numerical analysis of the model to confirm that the two effects carry over with stochastic demand shocks.
3.1. Steady state
A steady state is a recursive competitive equilibrium with time-invariant aggregate states. In a steady state, D is, and is perceived to be, time-invariant: D′ = D; F is also time-invariant: F = H(F, D). Because H is generated by entry, exit, and learning, a steady state must feature time-invariant entry and exit for F = H(F, D) to hold. Thus, a steady state can be summarized by {f(0), a_g, a_u}: f(0) is the time-invariant entry size; a_g is the maximum age for good firms; and a_u is the maximum age for unsure firms. Proposition 1 establishes the existence of a unique steady-state equilibrium for any D.
Proposition 1. With constant D, there exists a unique time-invariant {f(0), a_g, a_u} that satisfies the conditions of firm rationality, free entry and competitive pricing.
A detailed proof is provided in Appendix A. A key step of the proof is to combine the exit condition for unsure firms with that for good firms to get

(θ_u/θ_g)(1 + g)^{a_g − a_u} = 1 + pφβ/(1 + g − β) − pφβ/(1 − β) + pφβ g β^{a_g − a_u}/((1 − β)(1 + g − β)).   (10)

The proof of Proposition 1 shows that, with θ_g > θ_u, (10) determines a unique value for a_g − a_u.
Note that D does not enter (10), so demand has no impact on a_g − a_u. With a_g − a_u determined independently by (10), a_u in the free entry condition and the competitive pricing condition can be replaced by a_g − (a_g − a_u). As a result, those two conditions jointly determine the values of f(0) and a_g. Fig. 3 illustrates the steady-state firm distribution. Like Fig. 2, it can be interpreted in two ways. First, let the horizontal axis depict cohort age across time; then Fig. 3 displays the steady-state life-cycle dynamics of a representative cohort. A cohort enters as unsure in a measure of f(0). As it ages, bad firms exit and the cohort size declines; good firms stay and the density of good firms grows. At age a_u, all unsure firms exit, their vintage too old to survive as unsure, but good
Fig. 3. The steady-state firm distribution and the entry and exit margins. Labeled in the figure are the entry margin, the learning margin (exit of bad firms), the exit margin of unsure firms, and the exit margin of good firms; the horizontal axis is age.
firms can stay. After a_u, learning stops, the cohort contains good firms only, and its size remains constant. Good firms live until a_g; the vintage after a_g is too old even for good firms to survive. Second, let the horizontal axis in Fig. 3 depict cohort age in the cross section. Then Fig. 3 displays the steady-state firm distribution across ages and idiosyncratic productivity at any one time. Firms of different ages coexist. Because older cohorts have lived longer and learned more, their sizes are smaller and their densities of good firms are higher. Cohorts older than a_u are of the same size and contain good firms only. No cohort is older than a_g. Also note that, despite its time-invariant structure, the industry experiences continuous entry and exit at the steady state. From a pure accounting point of view, there are three margins for resources to flow through in this industry: the entry margin, the exit margins of good firms and unsure firms, and the learning margin. At the entry margin, new vintages enter; at the exit margins, old vintages leave. This introduces a force of creative destruction that replaces old vintages with new ones. At the learning margin, bad firms leave. This gives rise to a learning force that keeps good firms and drives out bad firms. Because of creative destruction, average productivity grows at the technological pace g. Because of learning, the proportion of good firms is higher among older cohorts. The two forces, learning and creative destruction, together drive entry, exit, and the related productivity dynamics.
3.2. Comparative statics: cleansing and scarring
This subsection establishes that, across steady states, the model delivers the conventional cleansing effect and an additional scarring effect. The two effects are formalized in Propositions 2 and 3.
Proposition 2. In a steady-state equilibrium, the exit age for firms with a given θ̃ is weakly increasing in demand.
A detailed proof is provided in Appendix A. Proposition 2 states that firms with any belief live longer in a high-demand steady state. Put intuitively, lower demand reduces the price, so that some firms that are viable when demand is high become unviable when demand is low. If this story carries over when D fluctuates stochastically, then the model delivers the conventional cleansing effect, in which average firm age falls so that the average vintage becomes younger and more productive. However, once learning is allowed, the firm distribution across the other dimension, idiosyncratic productivity, must be considered. With only two values for true idiosyncratic productivity, good and bad, this distribution can be summarized by the fraction of good firms. The next proposition establishes how demand affects this fraction.
Proposition 3. In a steady-state equilibrium, the fraction of good firms is weakly increasing in demand.
A detailed proof is presented in Appendix A. The steady-state industry fraction of good firms, including both known and not-yet-known good firms, denoted λ_g, can be shown to be

λ_g = 1 − (1 − φ) / [ pφ(a_u + 1)/(1 − (1 − p)^{a_u + 1}) + (1 − φ) + pφ(a_g − a_u) ].   (11)
There are only two endogenous variables in (11): a_u and (a_g − a_u). Since (a_g − a_u) is independent of D according to (10), dλ_g/dD = (dλ_g/da_u)(da_u/dD). Appendix A shows that dλ_g/da_u ≥ 0, which, together with da_u/dD ≥ 0 established by Proposition 2, implies dλ_g/dD ≥ 0. Put intuitively, demand affects the fraction of good firms (dλ_g/dD) through its impact on the exit age of unsure firms (da_u/dD). To understand this result further, consider Fig. 4, which displays the firm distribution across vintages and idiosyncratic productivity at a high-demand steady state and at a low-demand steady state. Because the entry size scales the sizes of all age cohorts at a steady state, in Fig. 4 the entry sizes of both steady states are normalized to one. Fig. 4 shows that, corresponding to lower demand, the two exit margins shift to the left, creating a cleansing effect that clears out the oldest vintages. However, the leftward shift of the unsure exit margin also reduces the number of older good firms. The latter effect, shown as the shaded area in Fig. 4, is the scarring effect of recessions. The scarring effect stems from learning. New entrants begin unsure of their idiosyncratic productivity, although a proportion φ of them are truly good. Firms learn their true idiosyncratic productivity over time. If firms could live forever, all potentially good firms would eventually discover their true idiosyncratic productivity. However, the finite life span of unsure firms implies that, if potentially good firms do not learn before a_u, they exit at a_u and forever lose the chance to learn. Therefore, a_u represents not only unsure firms' exit age, but also the number of learning opportunities available in a firm's lifetime. A lower a_u gives potentially good firms less time to learn, so that the number of good firms in operation after age a_u is reduced.
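The comparative static behind Proposition 3 can be checked numerically: holding the gap a_g − a_u fixed, as (10) implies, the steady-state fraction of good firms in (11) rises with the unsure firms' exit age a_u. The parameter values below are illustrative, not the paper's calibration.

```python
# Numerical check of the mechanism behind Proposition 3: with (a_g - a_u) fixed
# by Eq. (10), the industry fraction of good firms lambda_g from Eq. (11) is
# increasing in the unsure firms' exit age a_u. Illustrative parameters.
def lambda_g(a_u: int, phi: float = 0.5, p: float = 0.02, gap: int = 20) -> float:
    denom = (p * phi * (a_u + 1) / (1 - (1 - p) ** (a_u + 1))
             + (1 - phi) + p * phi * gap)
    return 1 - (1 - phi) / denom

low, high = lambda_g(10), lambda_g(40)
assert low < high  # a longer learning span raises the fraction of good firms
```

A shorter a_u, as in a low-demand steady state, thus translates directly into a lower λ_g, which is the scarring effect in comparative-static form.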
Hence, the industry suffers from uncertainty: firms that exit at age a_u include some that remain unsure but are potentially good. The number of potentially good firms that exit at a_u depends on the size of the exit margin for unsure firms, which, in turn, is determined by a_u. Lower demand truncates learning by reducing a_u. Consequently, more potentially good firms exit at a_u, fewer good firms become old, and the proportion of good firms in the entire industry declines. To summarize Propositions 2 and 3: a low-demand steady state features a better average vintage, yet a lower proportion of good firms. If these results carry over with stochastic demand shocks, then recessions will have both a
Fig. 4. Cleansing and scarring. The figure plots the distributions of unsure firms and good firms across ages in the two steady states; the labeled regions mark the cleansing effect at the shifted exit margins and the scarring effect among older good firms.
conventional cleansing effect that raises the average vintage, and a scarring effect that lowers average idiosyncratic productivity. As suggested by dλ_g/dD = (dλ_g/da_u)(da_u/dD), the two effects are directly related: it is the cleansing effect (da_u/dD) that truncates learning and prevents more potentially good firms from discovering their true idiosyncratic productivity, which causes the industry fraction of good firms to decline (dλ_g/da_u). When moving beyond steady states and allowing for stochastic demand shocks, the intuition behind ''cleansing and scarring'' still carries over. Again, consider Fig. 4. When demand drops, the exit margins shift to the left, so the cleansing effect takes place immediately. The scarring effect, however, occurs both instantaneously and gradually. At the onset of a recession, the fraction of good firms drops immediately, due to the shift of the exit margin for good firms, which clears out the oldest cohorts that contain good firms only; this is an ''instantaneous scarring'' effect. As the recession persists, another ''lasting scarring'' effect follows. Note that, at the onset of a recession, the firms already in the shaded area in Fig. 4 choose to stay, knowing their true idiosyncratic productivity to be good. These old good firms leave gradually as the recession persists, as their vintages grow more and more unproductive compared with other firms in operation. This creates a lasting scarring effect: the reduced a_u allows fewer potentially good firms to survive past a_u, so that the shaded area is eventually left empty. In summary, the arrival of a recession ''scars'' the industry, and the ''scar'' deepens as the recession persists. Instantaneous and lasting scarring effects together capture the impact of recessions on the industry composition of idiosyncratic productivity.
3.3.
Sensitivity analysis
Three modeling assumptions should be discussed to examine the robustness of the scarring effect.
3.3.1. Entry cost and entry size
One of the key assumptions in the model is that the entry cost increases with entry size: c_1 > 0. Will the scarring effect carry over if the entry cost is independent of entry size? If c_1 = 0, then the conditions of firm rationality, free entry, and competitive pricing that jointly determine {f(0), a_g, a_u} become fully recursive: a_g − a_u is given by (10) independently; with a_u = a_g − (a_g − a_u), the free entry condition determines a_g; then the competitive pricing condition, where D enters, determines f(0). Therefore, when c_1 = 0, D affects f(0) only. A detailed proof is presented in Appendix A. This extreme case is described as ''full insulation'' in Caballero and Hammour (1994): when c_1 = 0, fast entry is costless, so the entry size adjusts proportionally to changes in demand and the exit margins remain unchanged. Therefore, with c_1 = 0, demand variations are entirely reflected as entry fluctuations and the exit margins do not respond. Consequently, there would be neither cleansing nor scarring effects. By contrast, when c_1 > 0, f(0) and a_g are jointly determined by the free entry condition and the competitive pricing condition, so that some of the demand variation is accommodated at the entry margin, while the rest appears as shifts of the exit margins. Therefore, entry and exit both fluctuate over the cycle. The data are consistent with the
case when c_1 > 0. For example, the Business Employment Dynamics (BED) data show that the quarterly plant entry rate and exit rate display very similar volatility in the U.S. economy from 1992 to 2007: the ratio of the standard deviation of the entry rate to that of the exit rate equals 1.00 for the manufacturing sector and 1.02 for the entire private sector.⁴
3.3.2. Productivity composition of entrants
In the model, the proportion of good firms among entrants is exogenous and independent of demand. But demand may well affect entrants' average productivity. This is modeled by Pries and Rogerson (2005), who allow for inspection of unobservable firm quality prior to entry in addition to learning after entry. In that case, recessions raise the entry threshold for expected idiosyncratic productivity, so that only more promising firms enter. This would drive up the average idiosyncratic productivity among entrants during recessions, which works against the scarring effect. However, it is hard to tell whether recessions would in fact lower or raise entrants' average productivity. During recessions, while firms do need to feel more optimistic about themselves to enter, it also becomes cheaper to rent land and easier to find a qualified manager. Thus, it is possible that recessions actually lower the productivity threshold for entry, which would complement the scarring effect by reducing entrants' average productivity. This is suggested by Davis et al. (1996), who find that jobs created during recessions tend to be short-lived, and by Bowlus (1995), who estimates that jobs created during recessions usually come from the lower part of the wage distribution. Further evidence is provided by Jensen et al. (2001), who document that the average productivity of U.S. manufacturing entrants dropped during the 1982 recession (Table 1, p. 327).
3.3.3.
More complicated learning
Under the all-or-nothing learning in the model, the noise is distributed uniformly, and the expected idiosyncratic productivity, θ̃, takes on only two values among operating firms: θ_u and θ_g. This greatly simplifies the analysis, so that the scarring effect can be motivated analytically. However, one must ask whether the scarring effect carries over to more complicated learning. Suppose that the noise masking the true idiosyncratic productivity is distributed normally with mean zero and variance σ: ε ∼ N(0, σ), so that every period a good firm receives a draw from N(θ_g, σ) and a bad firm receives a draw from N(θ_b, σ). In that case, θ̃ can take on any value between θ_b and θ_g, because almost every x changes the perceived likelihood of a firm being actually good or bad. Accordingly, a firm's θ̃ in period t depends on the entire sequence of x's it has received before t. Examining the scarring effect would thus require keeping track of the distribution of x sequences, and would not be feasible analytically. This subsection takes a different approach by examining the scarring effect from another angle. In a vintage world with any type of learning, the steady-state proportion of good firms with demand D, denoted λ_g(D), is

λ_g(D) = [φ + Σ_{a=1}^{ā(D)} φ_a(D) h_a(D)] / [1 + Σ_{a=1}^{ā(D)} h_a(D)],
(12)
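A small numerical illustration of (12) and its competing channels, with made-up cohort profiles rather than calibrated ones: lower demand shortens the maximum age ā and shrinks the relative cohort sizes h_a, which pulls λ_g down, while stronger selection raises the within-cohort good-firm shares φ_a, which pushes it up.

```python
# Eq. (12): lambda_g(D) is a weighted average of cohort good-firm shares phi_a,
# with weights given by relative cohort sizes h_a; the entry cohort is
# normalized to size 1 with good-firm share phi. Toy numbers for illustration.
phi = 0.5

def lambda_g(phi_a, h_a):
    num = phi + sum(pa * ha for pa, ha in zip(phi_a, h_a))
    den = 1.0 + sum(h_a)
    return num / den

# High demand: longer maximum age and larger surviving cohorts.
high = lambda_g(phi_a=[0.6, 0.7, 0.8, 0.9], h_a=[0.9, 0.8, 0.7, 0.6])
# Low demand: shorter maximum age and smaller cohorts, but stronger selection
# raises phi_a in each surviving cohort.
low = lambda_g(phi_a=[0.7, 0.8, 0.9], h_a=[0.7, 0.6, 0.5])
assert 0 < low < 1 and 0 < high < 1
```

Which of `low` and `high` is larger depends on how the three channels net out; that is precisely the quantitative question the section goes on to pose.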
Since φ is exogenous, D affects λ_g through three endogenous variables: ā, the maximum firm age; φ_a, the proportion of good firms in a cohort of age a; and h_a, the size of a cohort of age a relative to the entry size. Lower demand reduces ā, as some of the oldest vintages become unviable, and lowers h_a for any a, because more firms have exited as the cohort ages. The impact of demand on φ_a, however, is negative: lower demand drives stronger selection, and thus raises φ_a for any incumbent cohort.⁵ Therefore, D causes two competing effects on λ_g through its impact on ā, h_a, and φ_a: on the one hand, increases in φ_a raise λ_g; on the other hand, declines in ā and in h_a reduce λ_g by giving incumbent cohorts with higher proportions of good firms less weight in determining λ_g. Does the scarring effect motivated under all-or-nothing learning capture all three channels? The ā channel is fully incorporated: as shown in Fig. 4, firms' maximum life span becomes shorter at a low-demand steady state. However, the h_a channel and the φ_a channel are captured only partially. Again, consider Fig. 4: at a low-demand steady state with all-or-nothing learning, h_a is smaller only for cohorts that contain good firms only, and φ_a is higher only for cohorts that contain only good firms when demand is low but some unsure firms when demand is high. By contrast, with more complicated learning, demand affects h_a and φ_a for all age cohorts. Now the question becomes: with more complicated learning, would the scarring effect disappear because of increases in φ_a for more cohorts, or strengthen because of decreases in h_a for more cohorts? An important remark should be made. Lower demand drives out both potentially good firms and potentially bad firms. It is the exit of potentially bad firms that causes higher φ_a; in that sense, increases in φ_a add to the conventional cleansing effect.
But it is the exit of potentially good firms that induces lower h_a: incumbent cohorts become smaller when demand is low, because some potentially good firms that would have remained in operation had demand been high exit due to low demand; this is the spirit of the scarring effect.
⁴ Data are provided by the Bureau of Labor Statistics. The entry and exit series are seasonally adjusted but not employment-weighted. The standard deviations are those of the de-trended variations calculated using the Hodrick–Prescott filter.
⁵ With normally distributed noise, θ̃ can take on any value between θ_b and θ_g. Each age cohort features a cutoff value of θ̃ such that firms of this age with θ̃ below the cutoff choose to exit. Lower demand raises this cutoff θ̃ for each age cohort, so that the proportion of good firms is higher in any incumbent cohort.
ARTICLE IN PRESS M. Ouyang / Journal of Monetary Economics 56 (2009) 184–199
This point can be emphasized by comparing three worlds: one with vintage only, one with learning only, and one with both vintage and learning. In the world with vintage only, as modeled by Caballero and Hammour (1994), recessions create a cleansing effect but no scarring effect. In the world with learning only, recessions bring a cleansing effect by killing off potentially bad firms, and a scarring effect by driving out potentially good firms. In the world with both vintage and learning, the cleansing and scarring effects both carry over and become stronger: a lower ā destroys marginal vintages in addition to potentially bad firms, which amplifies the cleansing effect, and further reduces older cohorts' weight in determining λ_g by driving out the oldest cohorts, which strengthens the scarring effect. Therefore, cleansing and scarring are present as long as learning takes place, with or without vintage. In summary, all-or-nothing learning provides a convenient framework for motivating the scarring effect analytically. With more complicated learning, the cleansing and scarring effects should carry over and become stronger. Therefore, once again, the question becomes: which effect dominates, cleansing or scarring?

4. With stochastic demand shocks

To evaluate the cleansing and scarring effects quantitatively, this section analyzes a stochastic version of the model, in which demand follows a two-state Markov process with values [D_h, D_l] and a transition probability μ. Accordingly, firms expect the current demand level to persist for another period with probability μ, and to change with probability 1 − μ.

4.1. Calibration

Table 1 summarizes the calibration. With a period taken as a quarter, β is set to 0.99. The value of μ is chosen as 0.95, so that demand switches between a high level and a low level with a constant probability of 0.05 per quarter. Bad firms' idiosyncratic productivity, θ_b, is normalized to one.
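The two-state demand process of Section 4 can be simulated directly. A minimal sketch, using the calibrated persistence μ = 0.95 and the demand levels reported in Table 1 (the function name is ours):

```python
import random

def simulate_demand(T, mu=0.95, d_high=108.7294, d_low=103.9819, seed=0):
    """Simulate the two-state Markov demand path: demand persists with
    probability mu each quarter and switches with probability 1 - mu."""
    rng = random.Random(seed)
    path = [d_high]
    for _ in range(T - 1):
        if rng.random() < mu:
            path.append(path[-1])  # demand persists
        else:
            # switch to the other demand level
            path.append(d_low if path[-1] == d_high else d_high)
    return path
```

Over a long simulated path, the empirical fraction of quarters in which demand persists is close to μ by construction.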
The elasticity of entry cost with respect to entry size, c_1, is chosen based on Goolsbee (1998), who estimates that a 10% increase in demand for equipment raises equipment prices by 7.284% (Table VII, p. 143). Accordingly, c_1 = 0.7284. The rest of the parameters are chosen based on data from the U.S. manufacturing sector. In particular, p, φ, g, and θ_g are calibrated to the observed manufacturing cohort dynamics and productivity differentials. These parameters jointly determine the strengths of learning and creative destruction. Given the calibrations of p, φ, g, θ_g, and c_1, changes in demand together with the fixed component of entry cost generate responses in entry and exit, which cause cleansing and scarring. Therefore, D_h, D_l, and c_0 are calibrated to the observed fluctuations in manufacturing plant entry and exit.

4.1.1. Learning pace (p) and prior proportion of good firms (φ)

Assuming that bad firms always exit, (4) implies that the survival rate of a cohort of age a equals φ + (1 − φ)(1 − p)^a. Dunne et al. (1989) provide corresponding statistics to calibrate p and φ, tracking the exit dynamics of a U.S. manufacturing cohort that entered in 1972. They find that 57.5% of this cohort had exited by 1977 and 78.2% had exited by 1982. This imposes two conditions on p and φ:
φ + (1 − φ)(1 − p)^19 = 1 − 0.575,
φ + (1 − φ)(1 − p)^39 = 1 − 0.782.    (13)
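Conditions (13) form a two-equation system in p and φ that can be solved numerically. A minimal sketch (bisection on p, with φ recovered from the age-19 condition; function names are ours):

```python
def survival(a, p, phi):
    """Share of an entering cohort still alive at age a (quarters)."""
    return phi + (1 - phi) * (1 - p) ** a

def calibrate(s19=1 - 0.575, s39=1 - 0.782):
    """Solve the two conditions in (13) for (p, phi) by bisection on p.

    For a given p, the age-19 condition pins down phi exactly; the
    residual of the age-39 condition is increasing in p on this bracket,
    so bisection converges to the root.
    """
    lo, hi = 0.01, 0.20
    for _ in range(60):
        p = 0.5 * (lo + hi)
        q19 = (1 - p) ** 19
        phi = (s19 - q19) / (1 - q19)  # from survival(19) = s19
        resid = phi + (1 - phi) * (1 - p) ** 39 - s39
        if resid > 0:
            hi = p
        else:
            lo = p
    return p, phi
```

Running `calibrate()` reproduces the values reported in the text, p ≈ 0.0538 and φ ≈ 0.1157.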
This gives p = 0.0538 and φ = 0.1157.

4.1.2. Technological pace (g)

Since only entrants adopt the leading technology in the model, g is chosen to match the observed growth in entrants' productivity. Jensen et al. (2001) estimate that, after controlling for industry and time effects, U.S. manufacturing
Table 1
Calibration.

Parameter                                    Value
Quarterly discount factor: β                 0.9900
Persistence rate of demand: μ                0.9500
Prior probability of being a good firm: φ    0.1157
Quarterly pace of learning: p                0.0538
Quarterly technological pace: g              0.0040
Productivity of bad firms: θ_b               1
Productivity of good firms: θ_g              1.7500
Entry cost parameter: c_1                    0.7284
Entry cost parameter: c_0                    0.1587
High demand: D_h                             108.7294
Low demand: D_l                              103.9819
entrants' productivity grew by 46.8% from 1963 to 1992 (Table 1, p. 327).6 The 46.8% increase in entrants' productivity over a 29-year horizon suggests a quarterly technological pace of 0.004.

4.1.3. Idiosyncratic productivity differential (θ_g)

With θ_b normalized to one, θ_g is calibrated to the observed cohort productivity differentials. In the model, productivity differs across birth cohorts due to two factors: the vintage effect, by which younger cohorts have better technology, and the learning of unobservable fixed idiosyncratic productivity, by which older cohorts possess a higher proportion of good firms. The latter is what Davis and Haltiwanger (1992) call "passive learning". In reality, however, older cohorts may have additional productivity advantages over younger cohorts due to managers' accumulating experience, workers' learning by doing, technology retooling, and the achievement of economies of scale. These additional effects are what Davis and Haltiwanger (1992) call "active learning". Since θ_g captures passive learning only, a careful calibration of θ_g requires controlling for both the vintage effect and the active-learning effect. Accordingly, θ_g is calibrated based on Jensen et al. (2001), who estimate that, after controlling for industry effects, in 1992 the manufacturing incumbents that entered back in 1967 were 15.3% more productive than the new entrants (Table 4, p. 331), although the 1967 vintage is 60.4% less productive than the 1992 vintage (Table 2, p. 329). This implies a productivity differential of 75.7% in total to be explained by active learning and passive learning together. Furthermore, Jensen et al. (2001) report that the incumbents that entered in 1967 and survived through 1992 grew 14.8% more productive over the 25 years (Table 3, p. 330). This 14.8% productivity growth must be driven by active learning because, under passive learning, plant productivity stays constant.
This suggests that active learning causes a productivity differential of 14.8% between new entrants and 25-year-old incumbents, leaving a productivity differential of 60.9% to be accounted for by passive learning. Applying these statistics to all-or-nothing learning gives

[φθ_g + (1 − φ)(1 − p)^100 θ_b] / {[φθ_g + (1 − φ)θ_b][φ + (1 − φ)(1 − p)^100]} = 1.609.    (14)
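Once p and φ are fixed, equation (14) is linear in φθ_g and can be solved in closed form. A sketch with the Section 4.1.1 calibrations (variable names are ours); the computed value, about 1.78, is close to the 1.75 reported below, with the small gap attributable to rounding in the reported inputs:

```python
p, phi = 0.0538, 0.1157  # Section 4.1.1 calibrations
x = (1 - p) ** 100       # (1 - p)^100: 25 years of quarters
ratio = 1.609            # right-hand side of eq. (14)

# Rearranging (14): phi*theta_g + (1-phi)*x = k * (phi*theta_g + (1-phi)),
# where k = ratio * (phi + (1-phi)*x). Solving for theta_g:
k = ratio * (phi + (1 - phi) * x)
theta_g = (1 - phi) * (k - x) / (phi * (1 - k))
```

Substituting the solution back into the left-hand side of (14) reproduces the target ratio of 1.609 exactly.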
Combined with the calibrations of p and φ, (14) suggests θ_g = 1.75, a 75% differential between bad and good idiosyncratic productivity.7

4.1.4. High demand (D_h), low demand (D_l), and entry cost (c_0)

The values of D_h, D_l, and c_0 are calibrated using the steady-state conditions. Our numerical simulations suggest that, along any sample path with unchanging demand, the dynamics of the model eventually converge to constant entry and exit. The firm distributions at these stable points are similar to those at the steady states, which allows the steady-state conditions to be used as an approximation. Let a_g^h, a_u^h, and f^h represent good firms' exit age, unsure firms' exit age, and the entry size corresponding to D_h; and let a_g^l, a_u^l, and f^l be those corresponding to D_l. Applying the calibrations of p, φ, g, and θ_g to (10) gives a_g^h − a_u^h = a_g^l − a_u^l = 120. This leaves a_g^h, a_g^l, f^h, and f^l to be determined. Their values are chosen to match the BED statistics on manufacturing plant entry and exit. Fig. 5 plots, in the top two panels, the quarterly entry and exit rates for the U.S. manufacturing sector from 1992 to 2007: the entry rate averages 3.11%, and the exit rate averages 3.45%. Since both series display a declining trend that is not incorporated in the model, they are de-trended using the Hodrick–Prescott filter. The de-trended variations are presented in the bottom two panels of Fig. 5: the de-trended entry rate fluctuates from −0.29% to 0.37%, and the de-trended exit rate varies from −0.36% to 0.34%. This puts the following restrictions on the values of a_g^h, a_g^l, f^h, and f^l. First, the implied long-run entry and exit rates have to be around 3.11% and 3.45%, respectively. Second, they must match the peak in the exit rate and the trough in the entry rate at the onset of a recession.
That is, when a negative demand shock hits a high-demand equilibrium, the exit rate should rise to 3.79% (3.45% + 0.34%), and the entry rate should drop to 2.82% (3.11% − 0.29%). Third, they must match the trough in the exit rate and the peak in the entry rate during recovery: when a positive demand shock hits a low-demand equilibrium, the exit rate should drop to 3.09% (3.45% − 0.36%) and the entry rate should rise to 3.48% (3.11% + 0.37%). Using a search algorithm that incorporates the related transitory dynamics, we find that these conditions are satisfied by the following combination of parameter values: a_g^h = 165, a_g^l = 164, f^h = 2.6384, and f^l = 2.5452. Details are presented in Appendix A. Because a_g^h and a_g^l also represent the expected maximum life spans corresponding to high and low demand, their values are used to calculate the expected values of entry during expansions and during recessions. Applying the values of entry, the entry sizes, and the calibration of c_1 to (7) gives c_0 = 0.1587. Applying the calibrations of a_g^h, a_g^l, f^h, and f^l to the steady-state competitive pricing condition gives D_h = 108.7294 and D_l = 103.9819.
6 Jensen et al. (2001) report growth in entrants' productivity over different time periods, among which 1963–1992 is the longest. We take statistics over the longest period reported in Jensen et al. (2001) to calibrate g, considering that the technological pace can vary over time.
7 This calibration is lower than that of Davis and Haltiwanger (1999), who assume a high-to-low idiosyncratic productivity ratio of 2.4 based on Bartelsman and Doms (2000), without controlling for the vintage effect or the active-learning effect. However, our calibration of θ_g is consistent with theirs because, in our model, θ_g is meant to capture only the effect of passive learning, one of the many effects driving the observed productivity differentials.
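The quarterly pace in Section 4.1.2 can be reproduced by simple arithmetic from the Jensen et al. (2001) statistic of 46.8% entrant-productivity growth over 1963–1992 (see also footnote 6). A sketch with variable names of our choosing; the text's 0.004 corresponds to the simple, non-compounded average, while a compound rate would be slightly lower:

```python
# 46.8% growth in entrants' productivity over 1963-1992 (29 years).
quarters = 29 * 4                          # 116 quarters
g_simple = 0.468 / quarters                # simple average: ~0.0040
g_compound = 1.468 ** (1 / quarters) - 1   # compounded: ~0.0033
```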
[Fig. 5 appears here, with four panels: entry rate, exit rate, de-trended entry rate, and de-trended exit rate.]
Fig. 5. The quarterly U.S. manufacturing entry and exit rates from 1992 to 2007. The top two panels display the seasonally adjusted, non-employment-weighted raw series; the bottom two panels present the de-trended variations using the Hodrick–Prescott filter. Data source: Business Employment Dynamics, Bureau of Labor Statistics.
With demand equal to industry total revenue, this implies a revenue differential of 4.57% between expansions and recessions in the U.S. manufacturing sector from 1992 to 2007.

4.1.5. Discussion: average firm age and manufacturing revenue differential

Two remarks should be made on the calibration. First, the calibrated maximum age of good firms equals 165 quarters. Since the model assumes that a firm's vintage stays constant throughout its life span, one may object that it is hard to believe that some firms would employ the same technology for over 40 years. In reality, new technology adoption takes place not only through entry and exit, but also through incumbents' retooling. The latter, although not incorporated in the model directly, is controlled for when calibrating θ_g, because technology retooling is one of the factors driving the "active learning" effect on incumbents' productivity growth. In that sense, the model focuses on the technology adoption and the learning of idiosyncratic productivity associated specifically with entry and exit. To further check whether the calibration of a_g^h is reasonable, we compare its implied average firm age with the data. Applying a_g^h = 165 to the steady-state firm distribution gives an average firm age of 51 quarters. This is close to the average plant age reported by Faberman (2003), who studies unemployment records from five U.S. states and finds that the sample manufacturing plants are 58 quarters old on average. Second, the calibrations of high and low demand based on the observed fluctuations in manufacturing entry and exit suggest a differential of 4.57% in total manufacturing revenue from 1992 to 2007.
To check whether this calibration is plausible, we examine the 1992–2007 quarterly series of total manufacturing value of shipments from the Census Bureau, and find that it fluctuates by about 6% around trend.8 The difference between 4.57% and 6%, although quite small, points to the possibility that our calibration underestimates the actual fluctuations in
8 The examined series are seasonally adjusted and de-trended using the Hodrick–Prescott filter.
total manufacturing revenue. This difference can arise from elements missing from the model but present in reality. For example, demand shocks are the only source of industry fluctuations in the model, while technology shocks are an additional driving force of business cycles in reality. A positive technology shock would drive further increases in industry total revenue by raising industry output. Moreover, the wage rate is fixed at one in the model, but tends to co-move positively with demand in reality due to, for example, an upward-sloping labor supply curve (Swanson, 2007). With a varying wage rate, the output price would have to adjust by more to generate the changes in the profit margin that match the observed cyclical entry and exit; consequently, industry total revenue would fluctuate by a larger magnitude.

4.2. Response to a negative demand shock

With all the parameter values assigned, firm value functions are approximated to simulate the model's responses to stochastic demand shocks. The key computational task is to map F, the firm distribution across ages and expected idiosyncratic productivity, together with the demand level D, into a set of value functions V(θ̃, a; F, D). Unfortunately, F is a high-dimensional object, and it is well known that the numerical solution of dynamic programming problems becomes increasingly difficult as the size of the state space grows. Following Krusell and Smith (1998), our computational strategy shrinks F into a limited set of variables and shows that these variables' laws of motion can approximate the equilibrium behavior of firms in the simulated time series. Details are presented in Appendix A. The value functions and decision rules approximated using these variables enable the investigation of the model's dynamics along any particular path of demand realizations and the study of the model's quantitative implications.

4.2.1.
Scarring and cleansing

To assess the effect of a negative demand shock on the key variables of the model, the simulation starts from a random firm distribution and then uses the approximated value functions and decision rules to generate the model's response to the following sequence of demand realizations: demand stays at D_h until the key variables converge; it then drops to D_l and persists afterward. Panel 1 of Fig. 6 illustrates the simulated dynamics of the exit and entry rates in response to a negative demand shock. The quarter labeled 0 denotes the onset of a recession. Panel 1 shows that, when a negative demand shock hits a high-demand long-run equilibrium, the exit rate jumps up, declines afterward, and converges to a level above its initial value when demand was high. Put intuitively, a negative demand shock clears out some firms that would have stayed in operation had demand remained high. In contrast with the exit rate, the entry rate drops initially, recovers gradually, and converges to a level below its initial value. Panel 1 of Fig. 6 thus implies that the conventional cleansing effect carries over with an unexpected persistent negative demand shock. According to the comparative-static exercises in Section 3, recessions bring an additional scarring effect that takes place both instantaneously and gradually by worsening the industry composition of idiosyncratic productivity. Panel 2 of Fig. 6 presents the dynamics of the fraction of good firms (λ_g) when a negative demand shock hits a high-demand long-run equilibrium. At the onset of a recession, λ_g drops due to the "instantaneous" scarring effect. As the recession persists, λ_g recovers temporarily, drops again later, and eventually converges to a level below its initial value when demand was high, as suggested by the "lasting scarring" effect. Panel 2 of Fig. 6 implies that the scarring effect also carries over with an unexpected persistent negative demand shock.
Interestingly, the simulated responses of the exit rate, the entry rate, and the fraction of good firms in Fig. 6 all display transitory dynamics that are not captured by the comparative-static exercises. The exit rate drops after the initial jump; the entry rate recovers after the initial drop; and the response of λ_g appears hump-shaped. These transitory dynamics are driven by the following movements of the exit margins. At the onset of a recession, the exit margins over-shift to ages younger than the exit ages at the low-demand long-run equilibrium. This is because some old good firms (shown in Fig. 4 as the shaded area) choose to stay at this point, knowing that they are good. Their operation raises industry output and lowers the output price, causing the exit margins to over-shift and the entry size to over-drop. As the recession persists, the over-shifted exit margins move back to their stable points quarter by quarter. As unsure firms' exit margin moves to older ages, more good firms are allowed to reach their potential. As good firms' exit margin shifts to older ages, no old good firms exit for several quarters. This gives rise to a temporary "plastic surgery" effect that partially erases the instantaneous scar and drives λ_g to rise after its initial drop. Once the exit margins reach their stable points, old good firms and potentially good firms start exiting. At this point, λ_g falls again, and converges to a lower level as the industry reaches the low-demand long-run equilibrium. To summarize, Panels 1 and 2 of Fig. 6 suggest that, despite some transitory dynamics, both the conventional cleansing effect established in Proposition 2 and the scarring effect established in Proposition 3 carry over with an unexpected persistent negative demand shock.

4.2.2.
Implications for productivity

With firm-level productivity equal to Aθ(1 + g)^(−a), industry average productivity is affected by two components: the leading technology (A), and the firm distribution across vintages (a) and idiosyncratic productivity (θ). Technological
[Fig. 6 appears here, with four panels: Panel 1: Exit and Entry Rates; Panel 2: Proportion of Good Firms; Panel 3: Average Productivity; Panel 4: Scarring Effect.]
Fig. 6. Simulated responses to a negative demand shock: the horizontal axis denotes quarters, with the quarter labeled 0 as the onset of a recession. In Panel 1, the solid line represents the dynamics of the exit rate, and the dashed line is the dynamics of the entry rate. In Panel 3, the dashed line represents the productivity dynamics driven by the cleansing effect alone, and the solid line denotes the productivity dynamics driven by both the cleansing and scarring effects. Panel 4 plots the distance between the two series plotted in Panel 3. See text for more details.
progress drives A, and thus average productivity, to grow at rate g. Demand shocks add fluctuations around this trend by affecting the firm distribution across a and θ. To analyze the cyclical component of average productivity, this subsection examines de-trended average productivity, the average of θ̃(1 + g)^(−a) across heterogeneous firms. In evaluating this measure, recall that there are two competing effects. On the one hand, the cleansing effect lowers the average a, driving average productivity to rise. On the other hand, the scarring effect reduces the average θ, causing average productivity to fall. To separate these two competing effects, two indexes are created: the average of θ̃(1 + g)^(−a) across heterogeneous firms, denoted prod; and the average of (1 + g)^(−a) across heterogeneous vintages, denoted vin. Letting f(θ̃, a) represent the number of firms of age a and with expected idiosyncratic productivity θ̃, prod and vin are
prod = [ Σ_{θ̃,a} θ̃ (1 + g)^(−a) f(θ̃, a) ] / [ Σ_{θ̃,a} f(θ̃, a) ],    vin = [ Σ_{θ̃,a} (1 + g)^(−a) f(θ̃, a) ] / [ Σ_{θ̃,a} f(θ̃, a) ].    (15)
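The two indexes in (15) can be evaluated on any firm distribution. A minimal sketch on a toy, purely hypothetical distribution (the function name, masses, and productivity values are ours, chosen only for illustration):

```python
def prod_and_vin(f, g=0.004):
    """Compute the two indexes in eq. (15).

    f maps (theta, a) -> mass of firms with (expected) idiosyncratic
    productivity theta and age a; g is the quarterly technological pace.
    prod averages theta*(1+g)^(-a); vin averages (1+g)^(-a).
    """
    total = sum(f.values())
    prod = sum(th * (1 + g) ** (-a) * m for (th, a), m in f.items()) / total
    vin = sum((1 + g) ** (-a) * m for (_, a), m in f.items()) / total
    return prod, vin

# Toy distribution: a few young good firms, mostly older average firms.
f_toy = {(1.75, 0): 0.1, (1.0, 10): 0.9}
prod, vin = prod_and_vin(f_toy)
scarring = prod - vin  # the scarring measure plotted in Panel 4
```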
Apparently, prod is affected by both the cleansing and the scarring effects, while vin is driven by the cleansing effect alone. Thus, prod − vin measures the scarring effect on average productivity. Panel 3 of Fig. 6 traces the percentage changes in prod and vin when an unexpected persistent negative demand shock hits a high-demand long-run equilibrium. The initial levels of prod and vin are normalized to one. Panel 3 shows that, at the onset of a recession, vin rises to 1.0012, implying that the cleansing effect alone raises average productivity by 0.12%; however, prod drops to 0.9995, suggesting that the cleansing effect and the instantaneous scarring effect together lower average productivity by 0.05%. As the recession persists, prod recovers temporarily due to the "plastic surgery" effect, and declines again as the lasting scarring effect takes hold. Eventually, vin converges to 1.0013, a 0.13% increase in long-run
average productivity from the cleansing effect alone; but prod converges to 0.9994, a 0.06% decline in long-run average productivity under both the cleansing and scarring effects. Panel 4 of Fig. 6 presents the corresponding dynamics of the scarring effect (prod − vin) on average productivity. The scarring effect reduces average productivity by 0.17% at the onset of a recession and by 0.19% in the long run. In summary, Panels 3 and 4 of Fig. 6 suggest that, with plausible calibrations, the scarring effect dominates the cleansing effect and contributes to lower average productivity during recessions.

5. Conclusion

How do recessions affect resource allocation? This paper posits that learning has important consequences for this question. Recessions create, in addition to the conventional cleansing effect, a scarring effect, by interrupting businesses' learning about their unobservable idiosyncratic productivity. The scarring effect is evaluated quantitatively based on statistics on entry, exit, and productivity differentials from the U.S. manufacturing sector. A plausible calibration of the model suggests that the scarring effect dominates the cleansing effect and gives rise to lower average productivity during recessions. Previous authors have also critiqued the conventional cleansing hypothesis, but the scarring effect differs from their proposed adverse-cleansing effects in important ways. The focus of Ramey and Watson (1997) and Caballero and Hammour (2005) is whether cyclical reallocation is socially efficient: in their models, recessions still promote a more productive allocation of resources, although they are associated with lower welfare. The sullying effect proposed by Barlevy (2002) arises from reduced entry rather than concentrated exit.
Barlevy (2003) analyzes credit-market imperfections rather than the learning of unobservable qualities; moreover, the scarring effect impacts resource allocation both in the present and in the future, a dynamic effect that is missing in Barlevy (2003). Nevertheless, these various adverse-cleansing effects should be viewed as complementary and likely amplify each other in reality. For example, during recessions, credit-market frictions can further tighten young businesses' borrowing constraints, so that more potentially good businesses are driven out before they learn; as a result, credit-market frictions deepen the scarring effect. A couple of extensions could be added to the model. Firm size could be introduced, allowing firms with better vintages or higher expected idiosyncratic productivity to hire more workers. This modification would generate interesting new predictions. With good firms bigger than unsure firms, a firm would increase its employment when it learns that its idiosyncratic productivity is good, giving rise to an additional job-creation margin driven by learning. In that case, recessions would reduce later job creation by driving out potentially good firms in the present. This prediction is consistent with the argument by Caballero and Hammour (2005) that recessions in the U.S. manufacturing sector are usually followed by sluggish job creation during the recovery phase. The model can also be extended into a general-equilibrium framework. As discussed in Section 2, the exogenous demand shocks can be modeled as arising from consumers' taste shocks for an industry's goods, or as driven by productivity shocks to downstream industries that demand the industry's output as one of their inputs. Extending the model into a general-equilibrium framework would raise interesting new questions. For example, can taste shocks or productivity shocks of plausible sizes generate the observed fluctuations in manufacturing plant entry and exit?
Moreover, what is the welfare loss associated with the scarring effect? Such questions are left for future research.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2008.12.014.

References

Baden-Fuller, C., 1989. Exits from declining industries and the case of steel castings. Economic Journal 99, 949–961.
Barlevy, G., 2002. The sullying effect of recessions. Review of Economic Studies 69 (1), 65–96.
Barlevy, G., 2003. Credit market frictions and the allocation of resources over the business cycle. Journal of Monetary Economics 50, 1795–1818.
Bartelsman, E.J., Doms, M., 2000. Understanding productivity: lessons from longitudinal microdata. Journal of Economic Literature 38 (3), 569–594.
Bowlus, A., 1995. Matching workers and jobs: cyclical fluctuations in match quality. Journal of Labor Economics 13 (2), 335–350.
Caballero, R., Hammour, M., 1994. The cleansing effect of recessions. American Economic Review 84 (5), 1350–1368.
Caballero, R., Hammour, M., 1996. On the timing and efficiency of creative destruction. Quarterly Journal of Economics 111 (3), 805–852.
Caballero, R., Hammour, M., 2005. The cost of recessions revisited: a reverse-liquidationist view. Review of Economic Studies 72 (2), 313–341.
Caves, R., 1998. Industry organization and new findings on the turnover and mobility of firms. Journal of Economic Literature 36 (4), 1947–1982.
Davis, S., Haltiwanger, J., 1992. Gross job creation, gross job destruction, and employment reallocation. Quarterly Journal of Economics 107 (3), 819–863.
Davis, S., Haltiwanger, J., 1999. Gross job flows. In: Ashenfelter, O., Card, D. (Eds.), Handbook of Labor Economics, vol. 3. Elsevier, Amsterdam, pp. 2711–2805.
Davis, S., Haltiwanger, J., Schuh, S., 1996. Job Creation and Destruction. MIT Press, Cambridge.
Dunne, T., Roberts, M.J., Samuelson, L., 1989. The growth and failure of US manufacturing plants.
Quarterly Journal of Economics 104 (4), 671–698.
Faberman, R.J., 2003. Job flows and establishment characteristics: variations across metropolitan areas. William Davidson Institute Working Papers 609.
Foster, L., Haltiwanger, J., Syverson, C., 2008. Reallocation, firm turnover, and efficiency: selection on productivity or profitability? American Economic Review 98 (1), 394–425.
Goolsbee, A., 1998. Investment tax incentives, prices, and the supply of capital goods. Quarterly Journal of Economics 113 (1), 121–148.
Hall, R., 2000. Reorganization. Carnegie-Rochester Conference Series on Public Policy 52, 1–22.
Jensen, J.B., McGuckin, R.H., Stiroh, K.J., 2001. The impact of vintage and survival on productivity: evidence from cohorts of US manufacturing plants. Review of Economics and Statistics 83 (2), 323–332.
Jovanovic, B., 1982. Selection and the evolution of industry. Econometrica 50 (3), 649–670.
Krusell, P., Smith Jr., A., 1998. Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy 106 (5), 867–895.
Lucas, R.E., 1978. On the size distribution of business firms. Bell Journal of Economics 9 (2), 508–523.
Mortensen, D., Pissarides, C., 1994. Job creation and job destruction in the theory of unemployment. Review of Economic Studies 61 (3), 397–415.
Pries, M.J., 2004. Persistence of employment fluctuations: a model of recurring firm loss. Review of Economic Studies 71 (1), 193–215.
Pries, M.J., Rogerson, R., 2005. Hiring policies, labor market institutions, and labor market flows. Journal of Political Economy 113 (4), 811–839.
Ramey, G., Watson, J., 1997. Contractual fragility, job destruction, and business cycles. Quarterly Journal of Economics 112 (3), 873–911.
Schumpeter, J.A., 1934. Depressions. In: Douglass, V.B., Edward, C., Seymour, E.H. (Eds.), Economics of the Recovery Program. Whittlesey House, New York, pp. 3–12.
Swanson, E.T., 2007. Real wage cyclicality in the PSID. Scottish Journal of Political Economy 54, 617–647.
ARTICLE IN PRESS Journal of Monetary Economics 56 (2009) 200–209
Why did the average duration of unemployment become so much longer?

Toshihiko Mukoyama a,b,*, Ayşegül Şahin c
a Department of Economics, University of Virginia, P.O. Box 400182, Charlottesville, VA 22904-4182, USA
b CIREQ, Canada
c Federal Reserve Bank of New York, USA
Article history: Received 6 April 2007; received in revised form 11 November 2008; accepted 13 November 2008; available online 27 November 2008.

Abstract: There has been a substantial increase in the average duration of unemployment relative to the unemployment rate in the U.S. over the last 30 years. We evaluate the performance of a standard job-search model in explaining this phenomenon. In particular, we examine whether the increase in within-group wage inequality and the decline in the incidence of unemployment can account for the increase in unemployment duration. The results indicate that these two changes can explain a significant part of the increase over the last 30 years, although the model fails to match the behavior of unemployment duration during the 1980s. © 2008 Elsevier B.V. All rights reserved.

JEL classification: E24; J64

Keywords: Unemployment duration; Wage dispersion; Job search model
1. Introduction

Average unemployment duration and the unemployment rate generally move together; however, recent data indicate that this relationship is changing in the United States. Fig. 1 shows the U.S. unemployment rate (left scale) and average unemployment duration (right scale) from 1948 to 2003.1 Although the two series track each other closely from 1948 to 1990, we observe a clear break in this pattern in recent years. Specifically, the U.S. unemployment rate declined dramatically over the past 20 years, while average unemployment duration remained high through the 1990s. Other studies also document the increase in the average duration of unemployment. Juhn et al. (1991, 2002), Baumol and Wolff (1998), Valletta (1998, 2005), Abraham and Shimer (2001), and Machado et al. (2006) all point out that unemployment duration has become longer in recent years, despite low levels of unemployment.2 This pattern is more apparent if we look at the trends of unemployment duration and the unemployment rate. Fig. 2 compares the Hodrick–Prescott (HP) trends of the two series.3 Casual observation of Fig. 2 suggests that the difference between the trends has been particularly pronounced in recent years.4
Corresponding author at: Department of Economics, University of Virginia, P.O. Box 400182, Charlottesville, VA 22904-4182, USA. Tel.: +1 434 924 6751; fax: +1 434 982 2904. E-mail address: [email protected] (T. Mukoyama).
1 The data in this paper are taken from the Bureau of Labor Statistics (BLS) website, http://www.bls.gov, except where noted otherwise.
2 Appendix A of Mukoyama and Şahin (2008) discusses the related literature in detail.
3 For HP-filtering, we use the smoothing parameter λ = 6.25 for yearly data, following Ravn and Uhlig's (2002) suggestion.
4 In Appendix B of Mukoyama and Şahin (2008), we show that the cyclical components of both series track each other very well, even in recent years.
0304-3932/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.11.002
T. Mukoyama, A. Şahin / Journal of Monetary Economics 56 (2009) 200–209
Fig. 1. Unemployment rate (left scale, %) and average unemployment duration (right scale, weeks), 1948–2003. Data source: Current Population Survey.
Fig. 2. Trends of the unemployment rate (left scale, %) and average unemployment duration (right scale, weeks), 1948–2003. Data source: Current Population Survey.
Fig. 2 indicates that the increase in the average duration occurred around 1980.5 The average value from 1971 to 1980 is 11.5 weeks, while the average after 1981 is 15.5 weeks. Therefore, there has been an increase of about 4.0 weeks after the break.6 We investigate the economic reasons behind the increase in unemployment duration. In particular, we evaluate the performance of the standard job-search model in explaining this phenomenon. We first document two prominent changes in the U.S. labor market that took place recently: the increase in wage dispersion and the decline in the incidence of unemployment. We quantitatively examine how these changes affect unemployed workers' job-search behavior. We argue that these two changes can explain a significant part of the increase in unemployment duration over the last 30 years. However, the model fails to match the behavior of unemployment duration during the 1980s. We conduct some robustness analyses and discuss the generality of our results. Lastly, we examine the link between unemployment duration and the dispersion of wages at a more disaggregated level. Juhn et al. (1991) are among the first to point out that the duration of unemployment has been increasing in recent years. They emphasize the labor-supply response to the change in the wage level. Their work can be considered complementary to ours, which emphasizes the change in wage dispersion. In her comments on Juhn et al. (1991), Yellen (1991) argues that: When there is wage dispersion, so that both good and bad jobs are available for workers with given skills, some workers will choose to remain unemployed, searching for good, rent-paying jobs, rather than work at the poor jobs that are readily available ... The long-term unemployed are searching for work for which they are qualified. In this
5 In Appendix B of Mukoyama and Şahin (2008), we identify the break more formally and reach a similar conclusion.
6 Unemployment duration increased from 13.5 weeks (from 1971 to 1980) to 16.9 weeks (from 1981 to 2002) for males, and from 10.7 weeks (from 1971 to 1980) to 13.5 weeks (from 1981 to 2002) for females.
interpretation, unemployment is a response to wage dispersion rather than to wage levels, contrary to the authors’ labor supply function, in which labor supply depends only on wage levels. (pp. 129–130) Our hypothesis parallels her argument—the recent change in wage distribution may have had a significant impact on unemployment duration. An alternative explanation for the increase in unemployment duration is institutional change. In fact, the unemployment insurance system did change during the post-war period. Theoretically, if unemployment insurance became more generous, it may have lengthened the duration of unemployment. Baicker et al. (1998) describe the changes in the unemployment insurance system since the 1930s. Although there were increases in coverage during the 1970s (mainly affecting public sector workers), they argue that the generosity of unemployment insurance has remained almost constant, and that the ratio of unemployment insurance claims to total unemployment has actually declined over the postwar era (see Fig. 7.2 in their paper). Consistent with this observation, Burtless (1983) argues that the insured unemployment rate has been stable throughout the period from the 1950s to the 1970s.7 Therefore, changes in the unemployment insurance system are not likely to be an explanation for the longer unemployment duration.8 Baumol and Wolff (1998) also support this view. They examine the effect of institutional changes on the duration of unemployment, and conclude that institutional factors like changes in the coverage and generosity of unemployment insurance, in the rate of unionization, and in the minimum wage cannot account for the observed increase in unemployment duration. Another possible explanation is the change in the demographic composition of the U.S. labor force. In general, older workers tend to be unemployed longer, and women tend to be unemployed for a shorter period of time. 
In Appendix C of Mukoyama and Şahin (2008), we examine whether demographic change can account for the increase in unemployment duration, constructing age and gender compositions similar to those in Shimer (1998). We find that demographic change can explain only a minor part of the increase in unemployment duration. Section 2 documents the changes in wage dispersion and the incidence of unemployment. Section 3 sets up a job-search model and quantitatively evaluates the effects of these changes on unemployment duration. Section 4 examines the robustness of the quantitative findings. Section 5 examines whether wage dispersion and unemployment duration are correlated at a disaggregated level. Section 6 concludes.
2. Wage dispersion and the incidence of unemployment over time
In this section, we provide an overview of the evolution of wage dispersion and the incidence of unemployment over time. When we calibrate our job-search model, we use the statistics presented in this section. Many labor economists have documented that there are substantial wage differentials among observationally equivalent workers. Mincer-style wage equations typically explain less than 30% of overall wage variation.9 The remaining variation, which is more than 70%, is often called residual (or within-group) wage inequality.10 Beginning with Katz and Murphy (1992) and Juhn et al. (1993), researchers have noticed that there has been a significant increase in residual wage inequality during recent decades.11 In this subsection, we examine the evolution of residual wage inequality, using the March Current Population Survey (CPS) data covering 1971–2002. In Appendix D of Mukoyama and Şahin (2008), we consider two more data sources: IPUMS Census samples for 1960–2000, and the May CPS samples for 1973–1978 combined with the CPS Outgoing Rotation Group (ORG) files for 1979–2003. We make use of the dataset that Eckstein and Nagypál (2004) put together using annual data drawn from the March CPS covering the period between 1971 and 2002.12 We follow the Eckstein and Nagypál (2004) log wage regression to control for the observables.13 Fig. 3 plots the path of the difference between the 90th and 10th percentiles of the residual wage distributions from two regressions, one for males and one for females. Residual wage inequality rose from 1971 to 2002 for both male and female workers. Male residual wage inequality has been increasing steadily (with the exception of a few years) over this 30-year period, while female residual wage inequality did not change significantly during the 1970s.
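As a rough illustration of this measure, residual inequality can be computed as the 90–10 gap of OLS residuals from a Mincer-style log wage regression. The sketch below uses synthetic data with hypothetical coefficients (all numbers are illustrative, not estimates from the paper or from Eckstein and Nagypál's specification):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
# Hypothetical worker-level data (illustrative coefficients only)
educ = rng.integers(10, 18, size=n).astype(float)
exper = rng.uniform(0, 30, size=n)
log_wage = 0.08 * educ + 0.03 * exper - 0.0005 * exper**2 + rng.normal(0, 0.45, n)

# Mincer-style OLS on observables; the 90-10 gap of the residuals is the
# residual (within-group) inequality measure of the kind plotted in Fig. 3
X = np.column_stack([np.ones(n), educ, exper, exper**2])
coef, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
resid = log_wage - X @ coef
gap_90_10 = np.quantile(resid, 0.9) - np.quantile(resid, 0.1)
```

With a normal residual of standard deviation 0.45, the gap is roughly 2 × 1.28 × 0.45 ≈ 1.15, in the same general range as the series shown in Fig. 3.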
As described in Appendix D of Mukoyama and Şahin (2008), the other two data sources also show an increase in residual wage inequality over recent years, although the magnitude of the change varies substantially across data
7 Baumol and Wolff (1998) report that the insured coverage rate (the percent of unemployed workers receiving benefits) has actually dropped if the data are extended through the 1990s.
8 Ehrenberg and Oaxaca (1976) estimate that moving from a welfare system without unemployment insurance to a system similar to the U.S. unemployment insurance system increases the duration of unemployment by one week for older (age 45–59) males, and by less than one week for other demographic groups. This result suggests that even if the increase in coverage increased the unemployment duration, its quantitative impact would be small.
9 See, for example, the review of the empirical evidence in Mortensen (2003).
10 We use the terms "inequality" and "dispersion" interchangeably.
11 Some recent studies investigate the causes of this increase in within-group wage inequality. For example, Violante (2002) argues that the recent rapid investment-specific technological change is the major cause of the increase in within-group wage inequality.
12 The data and their program are downloaded from http://faculty.wcas.northwestern.edu/een461/QRproject/.
13 See Eckstein and Nagypál (2004) for details. Our Census data regression in Appendix C of Mukoyama and Şahin (2008) controls for more detailed occupational groups.
Fig. 3. 90%–10% residual wage inequality (residual 90/10), male and female, March CPS 1971–2002.
Fig. 4. Incidence of unemployment (α), male and female, 1965–2005.
sources. When we check the robustness of our findings in Section 4, we examine the quantitative implications of these differences. Another important change that we focus on is the variation in the incidence of unemployment over time. Fig. 4 plots the incidence of unemployment (defined as the number of workers unemployed for less than five weeks divided by total employment) using BLS data for men and women. Before 1970, the incidence of unemployment was stable at around 0.025. It increased to 0.035 in the 1980s, then decreased to 0.020 in the 1990s. The relatively small value in recent years reflects the coexistence of a low unemployment rate and a long unemployment duration.14 Next, we formally examine whether these changes in the labor-market environment have had a significant impact on workers' job-search behavior.
3. Model
In this section, we construct a McCall (1970)-style search model.15 Then, we feed the two important changes in the U.S. labor market, the increase in wage dispersion16 and the decline in the incidence of unemployment, into the model and examine how the job-search behavior of unemployed workers changes.
14 In fact, Juhn et al. (2002) argue that the decrease in the unemployment rate observed in the 1990s is driven almost entirely by the decreased incidence of unemployment. Even though the unemployment duration remained as high as its level in the 1980s, the lower rate of separation dragged the unemployment rate down to very low levels.
15 There is a large body of empirical work based on this type of search model. See, for example, Wolpin (1987). Rogerson et al. (2005) demonstrate that this model is a building block of many recent equilibrium search models, such as Mortensen and Pissarides (1994) and Burdett and Mortensen (1998). They also emphasize that this model can be interpreted as the equilibrium of a simple economy.
16 There is a large body of theoretical literature that attempts to explain the existence of wage dispersion among workers with the same characteristics. The most popular approach is to utilize a model of search and matching in a frictional labor market (see, for example, Burdett and Mortensen, 1998). We do not attempt to explain the wage dispersion in this paper; we take the wage dispersion as exogenously given.
3.1. Model setup
We consider a worker who is unemployed and searching for a job. Time is discrete. We assume that there is no borrowing or saving, and that the period utility of an unemployed worker is U_s. For an employed worker receiving wage w, the momentary utility is U_e(w) = ln(w). An unemployed worker receives one wage offer each period and decides whether or not to accept it. If she accepts it, she works at that wage until she is separated from the job. If she rejects it, the search continues in the subsequent period. Separation occurs exogenously with probability α ∈ [0, 1) every period. If separation occurs, the worker is unemployed for at least one period. We assume that the wage offer is independently and identically distributed and follows a lognormal distribution, ln(w) ~ N(μ − σ²/2, σ²). Therefore,

E[w] = e^μ  and  Var[w] = e^{2μ}(e^{σ²} − 1).
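As a quick sanity check on these lognormal moments (an illustrative simulation, not part of the paper), one can draw from the offer distribution and compare the sample moments with e^μ and e^{2μ}(e^{σ²} − 1):

```python
import math
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 0.5                      # illustrative parameter values
# Draw wage offers: ln w ~ N(mu - sigma^2/2, sigma^2)
w = np.exp(rng.normal(mu - sigma**2 / 2, sigma, size=1_000_000))

mean_theory = math.exp(mu)                                 # E[w] = e^mu
var_theory = math.exp(2 * mu) * (math.exp(sigma**2) - 1)   # Var[w]
```

The mean-adjustment in the location parameter (μ − σ²/2) is what keeps E[w] = e^μ, so a mean-preserving spread can be engineered by varying σ alone.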
The worker's problem in each period is characterized by the following Bellman equation:

V(w) = max{ U_e(w) + β[(1 − α)V(w) + α(U_s + β ∫ V(w′) dF(w′))], U_s + β ∫ V(w′) dF(w′) },   (1)

where F(·) is the distribution function of the wage offer and V(w) is the value function of the worker with wage offer w. The implicit assumption here is that the worker views the economy as being in the steady state: she considers σ and α to be constant over time when she makes her decisions. (That is, changes in σ and α are unanticipated.) While this assumption seems innocuous given that changes in σ and α are very slow relative to the length of each unemployment spell, it requires further evaluation. We check the robustness of our results to this assumption in Section 4 by solving a non-stationary version of the model. The optimization problem in (1) has a simple reservation-wage property: the worker accepts the wage offer if w is above the reservation wage and rejects it if it is below the reservation wage. The reservation wage, w̄, solves17

U_e(w̄) − U_s = [β / (1 − β(1 − α))] ∫_{w̄}^{∞} [U_e(w′) − U_e(w̄)] dF(w′).   (2)

Let λ ≡ F(w̄) be the probability that an unemployed worker is still unemployed next period. Given that all unemployed workers solve the optimization problem presented above, the dynamics of the aggregate unemployment rate, u_t, are governed by

u_{t+1} = α(1 − u_t) + λ u_t,

where the first term on the right-hand side is the number of workers separated at time t, and the second term is the number of workers who are unemployed at time t and who rejected the time-t job offer.18 In the steady state, u_t is constant (call it ū), and

ū = α / (1 + α − λ).   (3)

As is clear from (3), ū is increasing in both α and λ. We compute the average unemployment duration as 1/(1 − λ) and the unemployment rate as α/(1 + α − λ) in our simulations.
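Equations (2) and (3) are straightforward to solve numerically. The sketch below is a minimal illustration (not the authors' code): it finds the reservation wage by bisection for the log-utility, lognormal-offer case. The value α = 0.025 is an assumed monthly incidence in the range of Fig. 4, and U_s is taken as −5.58 on the reading that unemployment utility must lie well below employed log-wage utility for jobs to be accepted.

```python
from scipy.stats import norm

def reservation_log_wage(U_s, sigma, alpha, beta, mu=0.0):
    """Solve eq. (2) for c = ln(w_bar) by bisection.

    Offers: ln w ~ N(mu - sigma^2/2, sigma^2); utility U_e(w) = ln(w)."""
    m, s = mu - sigma**2 / 2, sigma
    k = beta / (1 - beta * (1 - alpha))        # discounting factor in eq. (2)

    def gap(c):
        z = (m - c) / s
        # E[(ln w' - c)^+], closed form for a normal distribution
        option = (m - c) * norm.cdf(z) + s * norm.pdf(z)
        return (c - U_s) - k * option          # LHS minus RHS of eq. (2)

    lo, hi = m - 10 * s, m + 10 * s            # gap(lo) < 0 < gap(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gap(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Section 3.2 male calibration; alpha = 0.025 is an assumed 1971 value (Fig. 4)
beta = 0.947 ** (1 / 12)                       # monthly discount factor
U_s, sigma, alpha = -5.58, 0.734, 0.025
c = reservation_log_wage(U_s, sigma, alpha, beta)
lam = norm.cdf((c - (-sigma**2 / 2)) / sigma)  # lambda = F(w_bar)
duration_weeks = (52 / 12) / (1 - lam)         # 1/(1-lambda) months, in weeks
u_bar = alpha / (1 + alpha - lam)              # eq. (3): steady-state rate
```

With these inputs the implied duration comes out close to the 12.3-week 1971 male target of Section 3.2, and the steady-state unemployment rate then follows directly from (3).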
3.2. Calibration
Our calibration strategy is to set the parameters of the model so that it matches the observations in the initial period in our dataset. Here we use the March CPS dataset, which makes the model's starting year 1971.19 Then we vary α and σ over time and see how the average unemployment duration changes as these parameters change. As will become clear later, α can be pinned down directly from the data on the incidence of unemployment, while σ has to be calibrated indirectly so that the model solution matches the observed 90–10% residual wage inequality.
17 See the derivation in Appendix F of Mukoyama and Şahin (2008).
18 See Rogerson et al. (2005) and Section 3.3 for a discussion of these unemployment dynamics.
19 As will be explained in Section 4, we conduct the same experiment for the Census data and the CPS May/ORG data in Appendix G of Mukoyama and Şahin (2008). For the Census data, the initial year is 1970 and for the CPS May/ORG it is 1973.
Eq. (2) is scale-free in the sense that, if μ is replaced by μ + m, and U_s is replaced by U_s + m, the reservation wage w̄ becomes w̄·e^m, and λ remains the same. Therefore, we can normalize μ by setting μ = 0. In our benchmark calibration, we assume that mean wages do not change over time.20 We set the length of one period to one month. Therefore, β = 0.947^{1/12}.21 The other parameters, U_s and σ, are set so that the following two conditions are satisfied:
1. The average unemployment duration, 1/(1 − λ), matches the unemployment duration in 1971: 12.3 weeks for males and 10.1 weeks for females.
2. The 90–10% log wage difference of employed workers in the model matches the 90–10% log wage difference in 1971: 0.95 for both males and females.
Note that when we compute the 90–10% log wage difference in the model, we look at the accepted wage offers, which requires us to solve the model repeatedly until we match the observed 90–10% log wage difference in the data. These two conditions imply U_s = −5.58 and σ = 0.734 for males, and U_s = −4.65 and σ = 0.680 for females.
3.3. Qualitative predictions
Before we conduct the quantitative analysis, we summarize the theoretical predictions of the model.22 Wage dispersion (σ): It is well known that in McCall-style search models with linear utility, a mean-preserving spread in the wage-offer distribution increases the reservation wage. This is because an unemployed worker will tend to wait longer to accept an offer, since the option value of a job opportunity (i.e., the opportunity cost of accepting a job) increases with the variance of the wage offer. When the variance increases, the probabilities of receiving a very low wage and a very high wage both increase. An increase in the probability of a very low-wage offer does not affect the value of waiting, since those offers are always rejected anyway.
The higher probability of a very high-wage offer, however, increases the value of waiting, since those are the offers that workers accept. Therefore, the increase in variance increases the likelihood of a good job opportunity (raises the "option value" of waiting), causing unemployed workers to wait longer in the hope of receiving one.23 Incidence of unemployment (α): It is clear from (2) that w̄ is decreasing in α, so a higher α leads to a shorter duration. The intuition is that a worker becomes less selective about a job when the probability that the job will be terminated is high. The increase in the incidence of unemployment has two opposing effects on the unemployment rate: first, the increase in α has the direct effect of increasing the unemployment rate, since more workers become unemployed; second, it has the indirect effect of decreasing unemployment, since workers become less selective.
3.4. Results
Now we turn to the quantitative evaluation of the changes in wage dispersion and the incidence of unemployment. We compute the model for each year from 1971 to 2002 by changing the values of σ and α. We change σ so that the 90–10% log wage difference matches its value for each year in the data (shown in Fig. 3). Similarly, we set α to its value in the corresponding year in the data (shown in Fig. 4). Fig. 5 presents the average unemployment duration in the data and in the model for males and females, for different specifications of the model. It shows that the benchmark model generates a significant increase in duration for both male and female workers. For males, average unemployment duration increases from 12.4 weeks (from 1971 to 1980) to 14.4 weeks (from 1981 to 2002). For females, it increases from 9.9 weeks (from 1971 to 1980) to 14.0 weeks (from 1981 to 2002).24 Our benchmark model allows both wage dispersion (captured by σ) and the incidence of unemployment (captured by α) to vary as they did in the data.
In order to assess the importance of each change, we consider two alternative calibrations of the model. In particular, the first alternative calibration varies only α and keeps σ constant, while the second varies σ and keeps α constant. In the first alternative calibration (only α changes), there is not much increase in unemployment duration until the mid-1980s. Then unemployment duration starts increasing and stabilizes at a higher level in the mid-1990s. For males, the increase in unemployment duration implied by the change in α is not as large as in the data. For females, almost all of the increase in the data is captured by changing α alone. As for the second alternative calibration (only σ changes), the model generates a monotonically increasing unemployment duration for both males and females. The effect of σ dominates the effect of α for males, while the opposite is true for females. Fig. 6 presents the unemployment rate in the data and the implied unemployment rate ū in the benchmark model for males and females. The model does not fully capture the cyclical fluctuations in the
20 In the robustness analysis in Section 4, we also examine the implication of varying μ over time.
21 The value for the annual discount rate, β, is taken from Cooley and Prescott (1995).
22 These results are standard, and therefore the proofs are omitted.
23 In our model, the utility function is concave, and therefore there is an additional effect in the opposite direction due to the consumption-smoothing motive. In the calibrated version of the model, however, it turns out that the effect of the "option value" dominates. We thank Jiyoon Oh for pointing this out.
24 Recall that in the data, the duration increased by 3.4 weeks for males and 2.8 weeks for females (Footnote 6).
Fig. 5. Unemployment duration for male (left) and female (right). "Data" refers to the March CPS data and "Trend" refers to the HP-filtered trend of the data. There are three model results: "Benchmark" (both α and σ are changed), "α only", and "σ only".
Fig. 6. Unemployment rate for male (left) and female (right): data (March CPS) and model.
unemployment rate, since the job-finding rate does not vary sufficiently at the business-cycle frequency. As discussed by Hall (2005) and Shimer (2005), fluctuations in the job-finding rate are the key to accounting for the cyclical fluctuations in the unemployment rate. Overall, the model performs reasonably well in accounting for the long-run trends in the unemployment duration and the unemployment rate, while it fails to capture the cyclical properties of these variables. One shortcoming of the current model is that the offer arrival rate is taken as constant. Allowing it to vary at the cyclical frequency would improve the model's performance along this dimension.
4. Robustness
In this section, we examine the robustness of our findings by considering two alternative data sources and two model extensions. The details of these robustness checks are described in Appendix G of Mukoyama and Şahin (2008).
4.1. Different datasets
We consider two alternative datasets to calibrate the change in wage dispersion (σ) and examine the model's implications. We find that the quantitative implications of the model change noticeably, depending on the data source used to calibrate our model. The first alternative dataset is the IPUMS Census samples for 1960, 1970, 1980, 1990, and 2000. The Census dataset has the advantage of large samples and more detailed occupational controls. However, it is only available every 10 years.
The Census exhibits a large increase in residual wage inequality in recent years, similar to the March CPS data.25 Not surprisingly, when we use the 90–10% wage difference from the Census data as the measure of wage dispersion, we obtain a significant increase in the unemployment duration. The model implies an increase of 3.4 weeks for males and 8.3 weeks for females from 1970 to 2000. The second alternative data source is the CPS May/ORG data. As pointed out by Lemieux (2006), this dataset exhibits a much smaller increase in residual wage inequality. Naturally, when we use this dataset, our model generates a considerably smaller increase in unemployment duration. The model duration for males increases from 9.6 weeks (from 1973 to 1980) to 10.7 weeks (from 1981 to 2002).26 The increase in unemployment duration in the Census data case is still significant, while the increase in the CPS May/ORG case is much smaller. The CPS May/ORG result is anticipated, given that the CPS May/ORG exhibits a much smaller increase in residual wage dispersion than the other datasets (see Appendix D of Mukoyama and Şahin, 2008). This discrepancy is a reflection of the ongoing debate among labor economists about the nature and the size of the recent increase in wage inequality.27 This debate is beyond the scope of this paper, and here we simply report the results from different datasets instead of taking a particular position.
4.2. Two extensions of the model
We consider two extensions of the benchmark model. Our benchmark model assumes that the economy is always in the steady state and that the worker expects the economic environment not to change over time. In the first extension, we assume that the worker has perfect foresight about future changes in the economic environment. In this setup, we also examine the effect of changes in the average wage level over time. It turns out that these changes do not affect the main implications of the model.
We then consider a model where the unemployed worker can affect the probability of receiving a job offer by exerting search effort. In this extension, the change in unemployment duration is somewhat smaller than in the benchmark model.
4.2.1. Perfect foresight and the effect of the wage level
Instead of assuming that the worker treats α and σ as constant, we assume that the worker foresees the future changes in these parameters. In particular, we assume perfect foresight: the worker knows that α and σ evolve as they did in the data in Section 2.28 When we re-calibrate the value of U_s to match the initial duration of unemployment, the behavior of the model duration of unemployment is almost identical to the benchmark model. In addition, we examine the effect of changes in the average wage level. We construct the average wage (compensation) level from the National Income and Product Accounts (NIPA) following Sullivan (1997).29 Then we change the value of μ following this wage series (we assume perfect foresight about μ as well). We find that the results are again very similar to the benchmark.
4.2.2. Endogenous search effort
Instead of assuming that the worker receives one job offer per period, we assume that the probability of receiving a job offer, p, is a function of the search effort level, a. Following Hopenhayn and Nicolini (1997), the period utility of the unemployed worker is assumed to be U_s − a. We assume that one period is 2 weeks and calibrate the p(a) function so that p(a) = 1/2 at the optimal a in the initial period. As in Section 3, the worker expects σ and α to be constant over time. We calibrate this model using the Census data and find that the increase in the average unemployment duration is 2.3 weeks from 1970 to 2000 (recall that in the benchmark, the increase is 3.4 weeks). Thus, the effect of α and σ on unemployment duration is somewhat smaller.
Since the decrease in α and the increase in σ both increase the value of receiving a wage offer, search effort a increases over time. Consequently, p(a) increases, and average unemployment duration increases less relative to the benchmark case.
5. Examining disaggregated groups
In this section, we examine the link between unemployment duration and the dispersion of wages at a more disaggregated level. We consider two different types of groups: demographic groups and occupational groups. In Appendix H of Mukoyama and Şahin (2008), we show the cross-sectional relationship between unemployment duration and residual wage inequality. The plots consistently show a positive relationship, as is predicted by the theory. A more interesting and relevant question for our macroeconomic observation is whether the groups that experienced higher increases in wage dispersion also experienced higher increases in unemployment duration.
25 See Appendix D of Mukoyama and Şahin (2008).
26 In the data, the duration increases from 13.7 weeks (from 1973 to 1980) to 17.0 weeks (from 1981 to 2002).
27 See Autor et al. (2005) and Lemieux (2006, 2007) for discussion.
28 Within a year, we linearly interpolate the values of α and σ that are used in Section 3.
29 See Eckstein and Nagypál (2004) for the comparison between the NIPA compensation measure and the CPS wage measure.
Fig. 7. Change in the unemployment duration and the dispersion of wages from 1970 to 2000 for different age and sex groups.
Fig. 8. Change in the unemployment duration and the dispersion of wages from the 1970s to the 2000s for different occupation groups.
First, Fig. 7 shows the relationship between the increase in unemployment duration from 1970 to 2000 and the increase in wage dispersion from 1970 to 2000, for each demographic group. The groups exhibit a positive relationship (the correlation coefficient is 0.58), which is consistent with our analysis. Second, Fig. 8 shows the relationship between the increase in unemployment duration from 1978 to 2002 and the increase in wage dispersion from 1970 to 2000 for different occupation groups.30 This plot also exhibits a positive correlation, although the relationship is weaker than for the demographic groups (the correlation coefficient is 0.29).
6. Conclusion
In this paper, we examined the causes of the increase in the U.S. average unemployment duration in recent years. By quantitatively evaluating a search model, we showed that both the decrease in the incidence of unemployment and the increase in wage dispersion can cause the average duration to be significantly longer. Overall, the model performs well in replicating the long-run trends in the unemployment duration and the unemployment rate, while it fails to match the cyclical variations of these variables. In addition, our model does not account for the behavior of unemployment duration in the 1980s. Hall (2005) and Shimer (2005) argue that the key to understanding the cyclical movement of the unemployment rate is the change in the job-finding rate. In the popular Diamond–Mortensen–Pissarides matching model, the arrival rate of job offers can vary due to the vacancy-posting behavior of firms. We abstract from this aspect, assuming that the arrival rate of job offers is constant. An extension along this line is beyond the scope of this paper, but we believe that it has the potential to better match the data.
30
The unemployment duration is computed from the CPS Monthly Basic Studies, as detailed in Appendix H of Mukoyama and S- ahin (2008).
T. Mukoyama, A. Şahin / Journal of Monetary Economics 56 (2009) 200–209
Another limitation of our paper is that we take the wage process and the incidence of unemployment as given. Clearly, wages and unemployment (both incidence and duration) are determined simultaneously in the labor market. Our analysis is a first step towards a better understanding of the interaction between them in the context of the U.S. economy in recent years. A more detailed and complete analysis of this interaction is an important future research agenda.
Acknowledgments

We are grateful to an anonymous referee, Mark Bils, Yongsung Chang, Nikolay Gospodinov, Jeremy Greenwood, Nezih Güner, Mark Huggett, Susumu Imai, Sun-Bin Kim, Bob King, Per Krusell, Lance Lochner, Adi Mayer, Kristy Mayer, Jiyoon Oh, Rob Shimer, Roxanne Stanoprud, Sarah Tulman, Rob Valletta, Gianluca Violante, and seminar participants at the FRS Macro System Meeting, the Montreal Macro Study Group, the Rochester Wegmans Conference, and the SED Meetings for their comments and suggestions. We thank Chris Huckfeldt for excellent research assistance. We are grateful to Melissa Kearney for providing us with CPS May/ORG data. We also thank Emy Sok at the Bureau of Labor Statistics for the help with the BLS data. All errors are our own. The views expressed in this article are those of the authors and do not necessarily reflect the position of the Federal Reserve Bank of New York or the Federal Reserve System.

References

Abraham, K.G., Shimer, R., 2001. Changes in unemployment duration and labor force attachment. In: Krueger, A.B., Solow, R.M. (Eds.), The Roaring Nineties: Can Full Employment Be Sustained? Russell Sage Foundation, New York, pp. 367–420.
Autor, D.H., Katz, L.F., Kearney, M.S., 2005. Trends in U.S. wage inequality: re-assessing the revisionists. NBER Working Paper No. 11627.
Baicker, K., Goldin, C., Katz, L.F., 1998. A distinctive system: origins and impact of US unemployment compensation. In: Bordo, M.D., Goldin, C., White, E.N. (Eds.), The Defining Moment: The Great Depression and the American Economy in the Twentieth Century. University of Chicago Press, Chicago, pp. 227–263.
Baumol, W.J., Wolff, E.N., 1998. Speed of technical progress and length of the average interjob period. Jerome Levy Economics Institute Working Paper No. 237, May.
Burdett, K., Mortensen, D.T., 1998. Wage differentials, employer size, and unemployment. International Economic Review 39, 257–273.
Burtless, G., 1983. Why is insured unemployment so low? Brookings Papers on Economic Activity 1, 225–249.
Cooley, T.F., Prescott, E.C., 1995. Economic growth and business cycles. In: Cooley, T.F. (Ed.), Frontiers of Business Cycle Research. Princeton University Press, Princeton, pp. 1–38.
Eckstein, Z., Nagypál, É., 2004. The evolution of U.S. earnings inequality: 1961–2002. Federal Reserve Bank of Minneapolis Quarterly Review 28, 10–29.
Ehrenberg, R., Oaxaca, R.L., 1976. Unemployment insurance, duration of unemployment and subsequent wage gain. American Economic Review 66, 754–766.
Hall, R.E., 2005. Job loss, job-finding, and unemployment in the U.S. economy over the past fifty years. NBER Macroeconomics Annual, 101–137.
Hopenhayn, H.A., Nicolini, J., 1997. Optimal unemployment insurance. Journal of Political Economy 105, 412–438.
Juhn, C., Murphy, K.M., Topel, R.H., 1991. Why has the natural rate of unemployment increased over time? Brookings Papers on Economic Activity 2, 75–142.
Juhn, C., Murphy, K.M., Pierce, B., 1993. Wage inequality and the rise in returns to skill. Journal of Political Economy 101, 410–442.
Juhn, C., Murphy, K.M., Topel, R.H., 2002. Current unemployment, historically contemplated. Brookings Papers on Economic Activity 1, 79–116.
Katz, L.F., Murphy, K.M., 1992. Changes in relative wages, 1963–1987: supply and demand factors. Quarterly Journal of Economics 107, 35–78.
Lemieux, T., 2006. Increasing residual wage inequality: composition effects, noisy data, or rising demand for skill? American Economic Review 96, 461–498.
Lemieux, T., 2007. The changing nature of wage inequality. NBER Working Paper No. 13523.
Machado, J.A.F., Portugal, P., Guimaraes, J., 2006. U.S. unemployment duration: has long become longer or short become shorter? IZA Discussion Paper No. 2174.
McCall, J.J., 1970. Economics of information and job search. Quarterly Journal of Economics 84, 113–126.
Mortensen, D.T., 2003. Wage Dispersion: Why Are Similar Workers Paid Differently? MIT Press, Cambridge.
Mortensen, D.T., Pissarides, C.A., 1994. Job creation and job destruction in the theory of unemployment. Review of Economic Studies 61, 397–415.
Mukoyama, T., Şahin, A., 2008. Why did the average duration of unemployment become so much longer? Working paper. Available at <http://people.virginia.edu/tm5hs/>.
Ravn, M., Uhlig, H., 2002. On adjusting the HP-filter for the frequency of observations. Review of Economics and Statistics 84, 371–376.
Rogerson, R., Shimer, R., Wright, R., 2005. Search-theoretic models of the labor market: a survey. Journal of Economic Literature 43, 959–988.
Shimer, R., 1998. Why is the U.S. unemployment rate so much lower? In: Bernanke, B.S., Rotemberg, J.J. (Eds.), NBER Macroeconomics Annual 1998. MIT Press, Cambridge, pp. 11–61.
Shimer, R., 2005. The cyclical behavior of equilibrium unemployment and vacancies. American Economic Review 95, 25–49.
Sullivan, D., 1997. Trends in real wage growth. Chicago Fed Letter 1997, 115.
Valletta, R.G., 1998. Changes in the structure and duration of U.S. unemployment, 1967–1998. Federal Reserve Bank of San Francisco Economic Review 1998, No. 3.
Valletta, R.G., 2005. Rising unemployment duration in the United States: causes and consequences. Mimeo, Federal Reserve Bank of San Francisco.
Violante, G.L., 2002. Technological acceleration, skill transferability and the rise in residual inequality. Quarterly Journal of Economics 117, 297–338.
Wolpin, K.I., 1987. Estimating a structural search model: the transition from school to work. Econometrica 55, 801–817.
Yellen, J.L., 1991. Comments and discussion. Brookings Papers on Economic Activity 2, 127–133.
Journal of Monetary Economics 56 (2009) 210–221
U.S. tax policy and health insurance demand: Can a regressive policy improve welfare?

Karsten Jeske (Mellon Capital Management, Investment Research, 50 Fremont Street, Suite 3900, San Francisco, CA 94105, USA)
Sagiri Kitao (Marshall School of Business, University of Southern California, 3670 Trousdale Parkway, Los Angeles, CA 90019, USA)
Article history: Received 19 November 2007; received in revised form 9 December 2008; accepted 10 December 2008; available online 24 December 2008.

Abstract

The U.S. tax policy on health insurance is regressive because it subsidizes only those offered group insurance through their employers, who also tend to have a relatively high income. Moreover, the subsidy takes the form of deductions from the progressive income tax system, giving high income earners a larger subsidy. To understand the effect of the policy, we construct a dynamic general equilibrium model with heterogeneous agents and an endogenous demand for health insurance. A complete removal of the subsidy may lead to a partial collapse of the group insurance market, reduce the insurance coverage and deteriorate welfare. There is, however, room for improving the coverage and welfare by extending a refundable credit to the individual insurance market.

© 2009 Elsevier B.V. All rights reserved.

JEL classification: E21; E62; I10
Keywords: Health insurance; Risk-sharing; Tax policy
1. Introduction

The premium for employer-based health insurance in the U.S. can be both income and payroll tax deductible, while individual health insurance (IHI) purchased outside the workplace does not offer this tax break. This tax policy is regressive in two ways. First, the data indicate that labor income is positively correlated with access to employer-based health insurance, so workers with higher earnings are more likely to enjoy the tax break. Second, conditional on having access to employer-based health insurance, the progressive income tax code in the U.S. makes the policy regressive because high income individuals in a higher marginal tax bracket receive a larger tax break than those in a lower tax bracket.

We show that despite its regressiveness the deduction policy is welfare improving. Our result relies on the key difference between employer-based health insurance and IHI. The former, also called group health insurance (GHI), is required by law not to discriminate among employees based on health status, while in the latter insurance companies have an incentive to price-discriminate and offer favorable terms to healthy individuals. Insurance outside the workplace, therefore, offers less pooling and less risk-sharing. Pooling in GHI, however, relies on healthy agents, who are sensitive to changes in the cost of insurance, to voluntarily cross-subsidize agents with higher health expenditures. Eliminating the tax subsidy can trigger a spiral of adverse selection and a rise in the group insurance premium. We show that completely abolishing the current policy can collapse the pooling in the GHI market and result in a welfare loss due to an increased exposure to health expenditure risks.
* Corresponding author. Tel.: +1 213 740 6884; fax: +1 213 740 6650. E-mail address: [email protected] (S. Kitao).

doi:10.1016/j.jmoneco.2008.12.004
Our work is a contribution to the literature of dynamic equilibrium models with heterogeneous agents in incomplete markets.1 We add to this literature by incorporating idiosyncratic health expenditure risk which is partially insurable according to the endogenous insurance decisions. Our paper is also related to the literature on fiscal policy in incomplete markets.2

Several recent papers have studied the role of health and medical expenditures in Aiyagari–Bewley type models. Livshits et al. (2007) and Chatterjee et al. (2007) argue that health expenditure shocks are an important source of consumer bankruptcies. Hubbard et al. (1995) add health expenditure risk and study the role of the social safety net in discouraging savings by low income households. Palumbo (1999), De Nardi et al. (2005) and Scholz et al. (2006) incorporate heterogeneity in medical expenses to understand the pattern of retirement savings. Our model, differently from these papers, endogenizes the health insurance decision, rather than treating households' out-of-pocket health expenditures as an exogenous shock.3 We take into account important general equilibrium effects of a reform, including the interaction between the health insurance demand and precautionary savings and the effect on the fiscal variables. We also quantify the welfare effect of reforms by computing the transition dynamics.

The paper proceeds as follows. Section 2 introduces a simple two-period model to highlight the intuition of our results. Section 3 introduces the full dynamic model and Section 4 details the parameterization of the model. Section 5 presents the numerical results and the last section concludes.4

2. A simple two-period model

We present a two-period model with endogenous health insurance demand to provide the intuition of our results. We demonstrate that changing the tax treatment of the health insurance premium has ambiguous welfare effects.
Since the subsidy is regressive, it impedes risk-sharing among agents that face income risks. A subsidy, however, can help overcome the adverse selection in the group insurance market and enhance risk-sharing. The simple model highlights this tradeoff faced by a benevolent government.

Suppose there are two firms and a continuum of individuals who live for two periods and consume only in the second period. Assume that ex ante identical agents face an idiosyncratic health risk. With some probability, agents will fall into a bad health state and must pay health expenditures equivalent to a unit of the consumption good in period 2. In period 1, agents observe a noisy signal of their health expenditure shock. Specifically, a measure $1/2$ has a probability $p_H$ of suffering from the expenditure shock and the remaining agents have a probability $p_L$, where $p_H > p_L$. Assume that all agents have access to the market of IHI, where a competitive and risk-neutral insurance company offers an insurance contract at price $p_i$ based on the observed signal $i \in \{L, H\}$. Notice that all risk-averse agents will choose to sign up for insurance. Agents receive a life-time labor income $y$ from their employers. In period 1, one half of the agents are matched with a firm of type 1 that offers a GHI contract at price $p_{GHI}$ to all employees independent of their signals. Workers in firm 1, therefore, have a choice between GHI and IHI. The other half of the agents work in firm 2, which does not offer such group insurance, and thus have access only to IHI.

Consider a policy of providing a subsidy $s^*$ for the purchase of a GHI contract. Let the subsidy be $s^* = (p_H - p_L)/2$. One can show that all agents in firm 1, even those with signal $p_L$, sign up for GHI.5 The average expenditure per agent is $(p_H + p_L)/2$ and the premium is $p_{GHI} = (p_H + p_L)/2 - s^* = p_L$, just low enough to make even the healthy individuals with $p_L$ indifferent between GHI and IHI. For any subsidy value smaller than $s^*$, healthy agents would leave the GHI contract and go to the individual market. Assume that the government imposes a lump-sum tax on workers in firm 1 to finance the cost of the subsidy, i.e. $t = s^*$. Such a policy has no effect on workers in firm 2, so we can focus on the redistributional effect of the policy among those with the GHI offer in firm 1. The subsidy removes a mean-preserving spread in consumption, thus the welfare effect on risk-averse agents is unambiguously positive.

To quantify the welfare effect of such a policy, assume that agents derive utility from consumption in the second period according to the preference $u(c) = c^{1-\sigma}/(1-\sigma)$ with $\sigma = 3$, and earn the life-time income $y = 2$. Suppose $p_L = 0.1$ and $p_H = 0.15$. The welfare effect measured in terms of consumption equivalent variation is 0.03%, but the gain rises to 0.11% with $p_H = 0.20$ and 0.46% with $p_H = 0.30$. The magnitude of the welfare gain depends on the variance of the health shocks that the policy helps alleviate: the greater the uncertainty of the health status, the larger are the potential welfare gains of the subsidy.

Income uncertainty and regressive policy: In addition to the uncertainty about health expenditures, assume that agents are heterogeneous in income as well. Firm 1 pays a wage $y_1$ and firm 2 pays $y_2$, where $y_1 > y_2$, i.e. people with a GHI offer

1 The classic work of Bewley (1986), İmrohoroğlu (1989), Huggett (1993) and Aiyagari (1994) pioneered the literature. For more recent work, see, for example, Fernández-Villaverde and Krueger (2006) and Krueger and Perri (2005).
2 See for example Domeij and Heathcote (2004), Castañeda et al. (2003), Conesa and Krueger (2006) and Conesa et al. (forthcoming).
3 Papers that deal with health insurance policy outside of a heterogeneous agent framework include Gruber (2004), who measures the effects of different subsidy policies for non-group insurance on the fraction of uninsured, using a micro-simulation model. Kotlikoff (1989) builds an OLG model where households face idiosyncratic health shocks and studies the effect of medical expenditures on precautionary savings. He considers different insurance schemes, which agents take as exogenously given.
4 Jeske and Kitao (2008), which accompanies this paper, contains supplementary materials that are not included here due to space constraints. It provides a more detailed description of the model's equilibrium, the calibration strategy and results, as well as a variety of sensitivity analyses and robustness studies of our model results and a discussion of possible extensions of the current paper.
5 For simplicity we assume that whenever agents are indifferent between the two contracts they pick GHI.
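The consumption-equivalent numbers in the paragraph above can be checked in a few lines. This is a sketch under our reading of the setup: absent the subsidy, each signal type buys actuarially fair insurance at its own price, and the CEV scales consumption in the no-subsidy allocation.

```python
# Two-period model of Section 2: CEV welfare gain of the GHI subsidy for
# firm-1 workers, CRRA utility u(c) = c**(1-sigma)/(1-sigma).
# Without the subsidy, consumption is a 50/50 lottery over y - p_L and
# y - p_H; with s* = (p_H - p_L)/2 financed by a lump-sum tax t = s* on
# firm-1 workers, everyone consumes the mean y - (p_H + p_L)/2 for sure.

def u(c, sigma=3.0):
    return c**(1 - sigma) / (1 - sigma)

def cev_subsidy(y, pL, pH, sigma=3.0):
    """Consumption-equivalent gain (in %) of the subsidy for firm-1 workers."""
    eu_no = 0.5 * u(y - pL, sigma) + 0.5 * u(y - pH, sigma)
    eu_sub = u(y - 0.5 * (pL + pH), sigma)
    # scale no-subsidy consumption by (1+g): (1+g)**(1-sigma) * eu_no = eu_sub
    g = (eu_sub / eu_no)**(1 / (1 - sigma)) - 1
    return 100 * g

for pH in (0.15, 0.20, 0.30):
    print(f"pH = {pH:.2f}: gain = {cev_subsidy(2.0, 0.1, pH):+.2f}%")
# reproduces the 0.03%, 0.11% and 0.46% gains reported in the text
```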
ARTICLE IN PRESS 212
K. Jeske, S. Kitao / Journal of Monetary Economics 56 (2009) 210–221
Table 1
Welfare effect of subsidy and effect of income uncertainty.

                               Case A (%)   Case B (%)
All (ex ante)                    +0.024       -0.425
Offered group insurance          +1.462       +1.132
Not offered group insurance      -1.354       -1.354

Notes: In case A (no income uncertainty), the income is constant at 2.0. In case B (income uncertainty), the incomes are y_1 = 2.5 and y_2 = 2.0 with equal probabilities. Welfare effects are expressed in terms of consumption equivalence.
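Table 1 can be reproduced directly. This is a sketch under our reading of the setup: the subsidy is paid per GHI buyer (all of firm 1, half the population) and financed by an equal lump-sum tax on every agent, and firm 1 pays the higher wage (y1 = 2.5, y2 = 2.0 in case B), consistent with y1 > y2 in the text.

```python
# Reproduces Table 1 (a sketch of Section 2's setup, assumptions as above).
# Firm-1 workers buy GHI at p_GHI = p_L and pay the tax t = s*/2, so they
# consume y1 - pL - t for sure; firm-2 workers face the signal lottery
# y2 - p_i - t, i in {L, H}.

def u(c, sigma=3.0):
    return c**(1 - sigma) / (1 - sigma)

def ce(eu, sigma=3.0):
    """Certainty-equivalent consumption of expected utility eu (CRRA)."""
    return ((1 - sigma) * eu)**(1 / (1 - sigma))

def table1(y1, y2, pL=0.1, pH=0.2):
    s = (pH - pL) / 2                  # subsidy per GHI contract
    t = s / 2                          # lump-sum tax on every agent
    # expected utilities without / with the subsidy, by group
    eu0_f1 = 0.5 * (u(y1 - pL) + u(y1 - pH))
    eu1_f1 = u(y1 - pL - t)
    eu0_f2 = 0.5 * (u(y2 - pL) + u(y2 - pH))
    eu1_f2 = 0.5 * (u(y2 - pL - t) + u(y2 - pH - t))
    gain = lambda eu0, eu1: 100 * (ce(eu1) / ce(eu0) - 1)
    return (gain(0.5 * (eu0_f1 + eu0_f2), 0.5 * (eu1_f1 + eu1_f2)),  # all
            gain(eu0_f1, eu1_f1),                                    # offered
            gain(eu0_f2, eu1_f2))                                    # not offered

print("Case A:", table1(2.0, 2.0))   # ~(+0.024, +1.462, -1.354)
print("Case B:", table1(2.5, 2.0))   # ~(-0.425, +1.132, -1.354)
```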
earn more.6 Notice that since people earn more at firm 1, the subsidy is a regressive policy from the perspective of an agent before the realization of the income shock. Consider the same policy of providing the subsidy for GHI, namely a subsidy $s^*$ just large enough to make the agents with a low health risk sign up for the contract. Now assume that the subsidy is financed by a lump-sum tax on every agent in the economy, even those in firm 2. With $p_L$ and $p_H$ set at 0.1 and 0.2, respectively, the welfare effect of the subsidy depends on the degree of income uncertainty, as shown in Table 1. If the wages in the two firms are identical, as in case A, the positive effect of increased risk-sharing in firm 1 dominates, causing an ex ante welfare gain. In contrast, if the income in firm 1 is large enough, as in case B, the welfare loss from reduced risk-sharing over income uncertainty dominates and ex ante welfare falls: the marginal utility of workers in firm 1 with high income is lower, so the welfare gain from pooling the risk in the GHI contract is smaller than the welfare loss of agents in firm 2.

As this basic model demonstrates, the welfare effect of the group insurance subsidy is ambiguous. Determining the welfare effect of the current tax policy requires a quantitative exercise based on a carefully calibrated dynamic model. The basic model provides intuition, but fails to capture key aspects of the macro-economy, the magnitude and persistence of the health risks and income uncertainty that individuals face over the life-cycle, and institutions that make the insurance markets in the U.S. unique. The next section presents a quantitative dynamic general equilibrium model that achieves the task.

3. The full dynamic model

This section presents the dynamic general equilibrium model, which will be used to study the effects of current and alternative insurance policies.

3.1. Demographics, preferences and endowment

The economy is populated by two generations of agents, the young and the old. Young agents supply labor and earn wage income; old agents are retired from market work. Young agents become "old" with probability $\rho_o$ every period and old agents die and leave the economy with probability $\rho_d$. The population is assumed to remain constant. Old agents who die and leave the model are replaced by the entry of new young agents. Bequests are accidental and are transferred to the entire population in a lump-sum manner, denoted by $b$.

Preferences are time-separable with a constant subjective discount factor $\beta$. Instantaneous utility from consumption is defined as $u(c) = c^{1-\sigma}/(1-\sigma)$. Young agents supply labor inelastically and earn the labor income $wz$, which depends on an idiosyncratic stochastic component $z$ and the wage rate $w$. The productivity shock $z$ follows a Markov process that evolves jointly with the probability of being offered employer-based health insurance, which is discussed in Section 3.2.7

3.2. Health and health insurance

In each period, agents face an idiosyncratic health expenditure shock $x$, which follows a finite-state generation-specific Markov process. Young agents have access to the health insurance market, where they can purchase a contract that covers a fraction $q(x)$ of realized medical expenditures $x$. They purchase either IHI or GHI. While everyone has access to the individual market, GHI is available only if such a benefit is offered by the employer. The probability of receiving a GHI offer evolves jointly with the labor productivity shock $z$. As discussed in the calibration section, firms' offer rates differ significantly across earnings groups and the availability of such benefits is highly persistent, while the degree of persistence varies by the earnings level.

GHI costs a premium $p$, which does not depend on any individual states, including current health expenditures $x$. This accounts for the practice that GHI does not price-discriminate among the insured. A fraction $\psi \in [0,1]$ of the premium is paid by the employer as a subsidy. In the IHI market, the premium is given by $p_m(x)$ and depends on the current health expenditure state $x$. This reflects the practice that, in contrast to the group insurance market, there is price differentiation.

6 In the panel data presented below, people with a GHI offer have a labor income about 2.15 times higher than those without a GHI offer.
7 Newly born young agents make a draw from the unconditional distribution of this process. The initial assets of the entrants are assumed to be zero.
IHI contracts are often contingent on age, specific habits (such as smoking) and other conditions, and can rule out payment for preexisting conditions. Health insurance companies are competitive and free to offer contracts to different individuals in both group and individual markets. Therefore the no-profit condition is satisfied for each type of contract and there is no cross-subsidy across different contracts. The premiums for GHI, $p$, and IHI, $p_m(x)$, for each health status are determined as the expected expenditures for each contract plus a proportional markup denoted by $\phi$.

All old agents are enrolled in the Medicare program and assumed not to purchase GHI or IHI. Each old agent pays a fixed premium $p_{med}$ every period and the program covers a fraction $q_{med}(x)$ of the medical expenditures $x$. Young agents pay the Medicare tax at rate $\tau_{med}$ on earnings.

3.3. Firms and production technology

A continuum of competitive firms operate a technology with constant returns to scale, $F(K, L) = AK^{\alpha}L^{1-\alpha}$, where $K$ and $L$ are the aggregate capital and labor efficiency units and $A$ is the total factor productivity, which is assumed to be constant. Capital depreciates at rate $\delta$. If a firm offers GHI, a fraction $\psi \in [0,1]$ of the insurance premium is paid at the firm level. The firm adjusts the wage to ensure the zero-profit condition, subtracting from the wage rate the cost $c_E$, which is just enough to cover the total premium cost for the firm.8

3.4. The government

The government levies taxes on income and consumption to finance expenditures $G$ and the social insurance program. The government budget is balanced every period. Agents' income is taxed according to a progressive tax function $T(\cdot)$ and consumption is taxed at a proportional rate $\tau_c$. The social insurance program guarantees a minimum level of consumption $\bar{c}$ for every agent by supplementing income whenever an agent's disposable assets fall below $\bar{c}$, as in Hubbard et al. (1995).
Social security and Medicare systems are self-financed by proportional taxes $\tau_{ss}$ and $\tau_{med}$ on labor income and the Medicare premium $p_{med}$. Each old agent receives the social security benefit $ss$ every period.

3.5. Households

The state vector of a young agent is given by $s_y = (a, z, x, i_{HI}, i_E)$, where $a$ denotes assets, $z$ the idiosyncratic productivity shock, $x$ the idiosyncratic health expenditure shock from the last period that has to be paid in the current period, and $i_{HI}$ and $i_E$ indicator functions for health insurance coverage and the availability of a GHI offer.

The timing of events is as follows. Young agents observe the state $s_y$ at the beginning of a period, pay last period's health care bill $x$, make the optimal decision of consumption $c$ and savings $a' \geq 0$, pay taxes, receive transfers and decide on whether to be covered by health insurance, $i'_{HI} \in \{0, 1\}$. After they have made all decisions, this period's health expenditure shock $x'$, productivity shock $z'$ and offer status $i'_E$ are revealed. They also learn next period's generation, i.e. whether they retire or not. Agents make the health insurance decision $i'_{HI}$ after they find out whether the employer offers GHI but before the health expenditure shock for the current period $x'$ is known. Agents pay an insurance premium one period before the expenditure payment occurs. The state vector of an old agent is given by $s_o = (a, x)$.9

The maximization problem is written in recursive form with value functions $V_j$, where the subscript $j = y$ or $o$ denotes the young or old generation.

Young agents' problem:
$$V_y(s_y) = \max_{c,\,a',\,i'_{HI}} \{ u(c) + \beta[(1-\rho_o)E[V_y(s'_y)] + \rho_o E[V_o(s'_o)]] \} \quad (1)$$
subject to
$$(1+\tau_c)c + a' + (1 - i_{HI}\,q(x))x = \tilde{w}z - \tilde{p} + (1+r)(a+b) - Tax + T^{SI} \quad (2)$$
where
$$\tilde{w} = \begin{cases} (1 - 0.5(\tau_{med}+\tau_{ss}))\,w & \text{if } i_E = 0 \\ (1 - 0.5(\tau_{med}+\tau_{ss}))(w - c_E) & \text{if } i_E = 1 \end{cases} \quad (3)$$
$$\tilde{p} = \begin{cases} p\,(1-\psi) & \text{if } i'_{HI} = 1 \text{ and } i_E = 1 \\ p_m(x) & \text{if } i'_{HI} = 1 \text{ and } i_E = 0 \\ 0 & \text{if } i'_{HI} = 0 \end{cases} \quad (4)$$
$$Tax = T(y) + 0.5(\tau_{med}+\tau_{ss})(\tilde{w}z - i_E\,\tilde{p}) \quad (5)$$
$$y = \max\{\tilde{w}z + r(a+b) - i_E\,\tilde{p},\; 0\} \quad (6)$$
$$T^{SI} = \max\{0,\; (1+\tau_c)\bar{c} + (1 - i_{HI}\,q(x))x + T(\tilde{y}) - \tilde{w}z - (1+r)(a+b)\} \quad (7)$$
$$\tilde{y} = \tilde{w}z + r(a+b)$$

Old agents' problem:
$$V_o(s_o) = \max_{c,\,a'} \{ u(c) + \beta(1-\rho_d)E[V_o(s'_o)] \} \quad (8)$$
subject to
$$(1+\tau_c)c + a' + (1 - q_{med}(x))x = ss - p_{med} + (1+r)(a+b) - T(r(a+b)) + T^{SI} \quad (9)$$
$$T^{SI} = \max\{0,\; (1+\tau_c)\bar{c} + (1 - q_{med}(x))x + p_{med} - ss - (1+r)(a+b) + T(r(a+b))\} \quad (10)$$

8 The adjusted wage is given as $w_E = w - c_E$, where $w = F_L(K, L)$. $c_E$, the employer's cost of health insurance per efficiency unit, is defined as $c_E = [\mu^{ins}_E\,\psi\,p] / [\sum_{k=1}^{N_z} z_k\,\bar{\pi}_{Z,E}(k \mid i_E = 1)]$, where $\mu^{ins}_E$ is the fraction of workers that purchase GHI, conditional on being offered such benefits, $i_E = 1$ is an indicator for the group insurance offer, and $\bar{\pi}_{Z,E}(k \mid i_E = 1)$ is the stationary probability of drawing productivity $z_k$ conditional on $i_E = 1$.
9 In the actual computation, old agents who just retired in the previous period are distinguished from the rest of the old agents. Their health bills from the previous year, incurred as young agents, are covered by the insurance if $i_{HI} = 1$ and not by Medicare. Therefore, for this age group, the state vector is given by $(a, x, i_{HI})$.
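The pieces of the young agent's budget constraint, Eqs. (2)-(7), can be sketched as a single function. Variable names, the stub tax function and coverage schedule passed in, and all numerical arguments are ours, for illustration only, not the paper's code.

```python
# Sketch of the young agent's budget identity, Eqs. (2)-(7):
# returns the resources available for (1 + tau_c)*c + a'.

def young_budget(a, z, x, i_HI, i_E, i_HI_next, *,
                 w, r, b, p, p_m, psi, q, T,
                 tau_c, tau_med, tau_ss, c_E, c_bar):
    payroll = 0.5 * (tau_med + tau_ss)           # employee half of payroll taxes
    w_tilde = (1 - payroll) * (w - c_E if i_E else w)         # Eq. (3)
    if i_HI_next:                                             # Eq. (4)
        p_tilde = p * (1 - psi) if i_E else p_m(x)
    else:
        p_tilde = 0.0
    y = max(w_tilde * z + r * (a + b) - i_E * p_tilde, 0.0)   # Eq. (6)
    tax = T(y) + payroll * (w_tilde * z - i_E * p_tilde)      # Eq. (5)
    y_tilde = w_tilde * z + r * (a + b)
    oop = (1 - i_HI * q(x)) * x                  # out-of-pocket medical bill
    t_si = max(0.0, (1 + tau_c) * c_bar + oop + T(y_tilde)    # Eq. (7)
               - w_tilde * z - (1 + r) * (a + b))
    # Eq. (2): (1 + tau_c)*c + a' = resources below
    return w_tilde * z - p_tilde + (1 + r) * (a + b) - tax + t_si - oop
```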
Eq. (2) is the budget constraint of a young agent. Consumption, saving, out-of-pocket medical expenditures and the insurance premium are financed by labor income and assets, net of income and payroll taxes $Tax$, plus the social insurance transfer $T^{SI}$ if applicable. The marginal cost of the insurance premium $\tilde{p}$ depends on the state $i_E$ as given in (4).

4. Calibration

In this section, the parametrization of the model is outlined.10 Table 2 summarizes the values of key parameters of the model.

4.1. Demographics, preferences and technology

The model period is one year. Young agents are between the ages of 20 and 64, and old agents are 65 and over. Young agents' probability of aging $\rho_o$ is set at 1/45 so that they stay for an average of 45 years in the labor force before retirement. The death probability $\rho_d$ is calibrated so that old agents above age 65 constitute 20% of the population, based on the panel data set discussed below. Every period a measure $\rho_d \rho_o/(\rho_d + \rho_o)$ of young agents enter the economy to replace the deceased old agents. The subjective discount factor $\beta$ is calibrated to achieve an aggregate capital-output ratio of 3. The risk aversion parameter $\sigma$ is set at 3. Total factor productivity $A$ is normalized so that the average labor income equals one in the benchmark economy. The capital share $\alpha$ is set at 0.33 and the depreciation rate $\delta$ at 0.06.

4.2. Endowment, health insurance and health expenditures

For labor endowment, health expenditure shocks and health insurance, we use earnings and health data from a single source, the Medical Expenditure Panel Survey (MEPS). The MEPS is based on a series of national surveys conducted by the U.S. Agency for Healthcare Research and Quality. It consists of two-year panels since 1996/1997 and includes data on demographics, income and, most importantly, health expenditures and insurance.11 The endowment process is calibrated jointly with the stochastic probability of being offered employer-based health insurance. Wage income of all heads of households (both male and female) is used to calibrate the earnings distribution. The process is specified over five states and each agent belongs to one of five bins of equal size. Table 3 shows the stationary distribution. There is an asymmetry in the earnings distribution between agents with a GHI offer and those without: high earners are more likely to have a GHI offer.

The process of health expenditure shocks is estimated using the MEPS data. We use seven states for the expenditures, with bins of size (20% × 4, 15%, 4%, 1%), separately for the young and old generations. Young agents' expenditure grids are given as $X_y = \{0.000, 0.006, 0.022, 0.061, 0.171, 0.500, 1.594\}$, which are the mean expenditures in each of the seven bins in the first year of the last panel in 2003, normalized in terms of the average earnings. The old generation's expenditure grids are given as $X_o = \{0.007, 0.044, 0.096, 0.185, 0.418, 1.036, 2.265\}$. The unconditional expectations of $x_y$ and $x_o$ are 7% and 18% of average earnings, or about $2,200 and $6,300 in 2003 dollars.

10 Jeske and Kitao (2008) provide a complete description of the calibration and details that are not presented here due to space constraints.
11 We drop the first three panels and use five panels up to 2003/2004 because one crucial variable that is necessary to determine the joint endowment and insurance offer process is missing in those panels.
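The demographic parameters imply each other through the stationarity condition. A quick check (our arithmetic, using the aging rate and the 20% old-population target stated in Section 4.1):

```python
# Stationary demographics: young age into old at rate rho_o, old die at
# rate rho_d, and the population is constant, so flows must balance:
# rho_o * (1 - old_share) = rho_d * old_share.
rho_o = 1 / 45                # 45 expected years in the labor force
old_share = 0.20              # calibration target for the old population

rho_d = rho_o * (1 - old_share) / old_share    # implied death probability
entrants = rho_d * rho_o / (rho_d + rho_o)     # measure of new young agents

print(f"rho_o = {rho_o:.4f}, rho_d = {rho_d:.4f}, entrants = {entrants:.4f}")
# rho_o ~ 2.22% and rho_d ~ 8.89%, matching Table 2
```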
Table 2
Parameters of the model.

Demographics
  rho_o            Aging probability                            2.22%
  rho_d            Death probability after retirement           8.89%
Preferences
  beta             Discount factor                              0.934
  sigma            Relative risk aversion                       3.0
Technology and production
  alpha            Capital share                                0.33
  delta            Depreciation rate of capital                 0.06
Government
  {a_0, a_1, a_2}  Income tax parameters (progressive part)     {0.258, 0.768, 0.716}
  tau_y            Income tax parameter (proportional part)     4.46%
  tau_c            Consumption tax rate                         5.67%
  c_bar            Social insurance minimum consumption         23.9% of average earnings ($7,832)
  tau_ss           Social security tax rate                     10.61%
                   Social security replacement rate             45%
  tau_med          Medicare tax rate                            1.51%
  p_med            Medicare premium                             2.11% of per capita output
Health insurance
  p                Group insurance premium                      6.2% of average earnings ($2,018)
  psi              Firms' group insurance subsidy               80%
  phi              Premium mark-up                              11%
Table 3
Stationary distribution of earnings and group insurance offer.

Earnings grid number                 1      2      3      4      5    Total
Group insurance offered (%)        3.1   11.3   14.9   16.4   16.9     62.6
Group insurance not offered (%)   17.1    8.6    5.0    3.6    3.1     37.4
4.3. Health insurance

The coverage ratios of health insurance are calibrated using the same MEPS panels. We estimate a polynomial $q(x)$, the coverage ratio as a function of expenditures $x$. The mark-up on the insurance premium is set at 11% based on the study by Kahn et al. (2005).

The group insurance premium $p$ is determined in equilibrium to ensure zero profit of the insurance company in the group insurance market. The average annual premium of an employer-based health insurance plan was $2,051 in 1997, or about 7% of average annual labor income (Sommers, 2002). Model simulations yield a premium of 6.2% of average annual labor income. A firm offering employer-based health insurance pays a fraction $\psi$ of the premium. According to the MEPS, the average percentage of the total premium paid by employees varies between 11% and 23% depending on the industry (Sommers, 2002), and the fraction $(1-\psi)$ is set to 20%.

With regard to IHI, the insurance company sets $p_m(x)$ to satisfy the no-profit condition for each contract, that is, $p_m(x) = (1+\phi)E\{q(x')x' \mid x\}/(1+r)$. In the benchmark model, the premiums $p_m(x)$ for the seven expenditure states are given as $\{0.0156, 0.0245, 0.0488, 0.0860, 0.1221, 0.2358, 0.4905\}$, expressed in terms of average earnings.
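The IHI pricing rule can be sketched in a few lines. The three expenditure states, the transition matrix, the flat coverage ratio and the interest rate below are made-up illustrations, not the calibrated seven-state process.

```python
# IHI pricing: p_m(x) = (1 + phi) * E[q(x') x' | x] / (1 + r),
# on an illustrative 3-state expenditure chain (hypothetical numbers).
X = [0.006, 0.061, 0.500]                 # expenditure states (avg-earnings units)
P = [[0.80, 0.15, 0.05],                  # hypothetical transition matrix P[i][j]
     [0.30, 0.55, 0.15],                  # = Prob(x' = X[j] | x = X[i])
     [0.10, 0.30, 0.60]]
q = lambda x: 0.7                         # flat coverage ratio, for illustration
phi, r = 0.11, 0.04                       # 11% markup; illustrative interest rate

covered = [q(x) * x for x in X]           # covered expenditure q(x') x' per state
p_m = [(1 + phi) * sum(pij * c for pij, c in zip(row, covered)) / (1 + r)
       for row in P]
print([round(v, 4) for v in p_m])         # premiums rise with current risk
```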
4.4. Government

Expenditures and taxation: The exogenous government expenditure G is set to 18% of GDP in the benchmark economy. The consumption tax rate τ_c is 5.67%, based on Mendoza et al. (1994). The income tax function consists of two parts, a non-linear progressive income tax and a proportional income tax. The non-linear part captures the progressive income tax schedule in the U.S. following the functional form studied by Gouveia and Strauss (1994), while the proportional part stands in for all other taxes, that is, non-income and non-consumption
ARTICLE IN PRESS 216
K. Jeske, S. Kitao / Journal of Monetary Economics 56 (2009) 210–221
Fig. 1. Health insurance take-up ratios: (a) over earnings z; (b) over expenditures x.
taxes, which for simplicity are lumped together into a single proportional tax τ_y levied on income. The tax function is given as T(y) = a0{y - (y^(-a1) + a2)^(-1/a1)} + τ_y y. Parameter a0 is the limit of marginal taxes in the progressive part as income goes to infinity, a1 determines the curvature of marginal taxes and a2 is a scaling parameter. To preserve the shape of the tax function estimated by Gouveia and Strauss, their parameter estimates {a0, a1} = {0.258, 0.768} are used, and the scaling parameter a2 is chosen within the model such that the share of government expenditures raised by the progressive part of the tax function equals 65%, which matches the fraction of total revenues financed by the income tax (OECD Revenue Statistics, 2002). The rate τ_y in the proportional term is chosen in equilibrium to balance the overall government budget.

Social insurance, social security and Medicare: The minimum consumption floor c̄ for the social insurance is calibrated so that the model achieves the target share of households with a low level of assets. Households with net worth of less than $5,000 constitute 20.0% of the population (Kennickell, 2003, averaged over 1989-2001 SCF data) and this fraction is used as a target to match in the benchmark economy. The replacement rate of the social security benefit is set at 45% based on the study by Whitehouse (2003). In equilibrium, the social security tax τ_ss is adjusted to equate the total benefit payment with the total social security tax revenues. Every old agent is assumed to be enrolled in Medicare Part A and Part B. MEPS data are used to calculate the coverage ratio q_med(x) of Medicare. The annual Medicare premium for Part B was $799.20 in 2004, or about 2.11% of per capita output, and we use this ratio. The Medicare tax rate τ_med is determined so that the Medicare system is self-financed.

5. Numerical results

In this section, we first discuss the features of our benchmark model.
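For concreteness, the income tax function of Section 4.4, T(y) = a0{y - (y^(-a1) + a2)^(-1/a1)} + τ_y y, can be written out in code. This is a minimal sketch with the benchmark parameter values; measuring income in model units is our assumption for the example.

```python
def income_tax(y, a0=0.258, a1=0.768, a2=0.716, ty=0.0446):
    """Gouveia-Strauss progressive schedule plus a proportional tax:
    T(y) = a0 * (y - (y**(-a1) + a2)**(-1/a1)) + ty * y.
    The marginal rate of the progressive part approaches a0 as y grows,
    so the top combined marginal rate is a0 + ty."""
    if y <= 0:
        return 0.0
    progressive = a0 * (y - (y ** (-a1) + a2) ** (-1.0 / a1))
    return progressive + ty * y

# Average tax rates rise with income (progressivity):
low, high = income_tax(0.5) / 0.5, income_tax(2.0) / 2.0
```

The average rate `high` at twice the reference income exceeds the average rate `low` at half of it, and the marginal rate at very large incomes converges to a0 + τ_y, matching the interpretation of a0 given in the text.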
Second, the effects of a policy eliminating the current subsidy are analyzed to understand the role of the policy. Finally, three alternative policies are studied.

5.1. Benchmark model

Although the model is not calibrated to directly target the patterns of health insurance coverage, it succeeds in matching them fairly well, not only qualitatively but in most cases also quantitatively. The take-up ratio, defined as the share of agents holding health insurance, is 75.7% among all young agents in our model (73.1% in the data) and 35.5% conditional on not being offered GHI (34.8% in the data). Fig. 1(a) displays the take-up ratios of the model over earnings together with the same statistics from the MEPS data.12 Both in the data and in the model, the take-up ratios increase in earnings. Agents with higher earnings are more likely to be offered group insurance, and very few agents at the lowest earnings level receive such a benefit, which contributes to the lower take-up ratio among low-earnings agents. Many of them have a low level of assets and are more likely to be eligible for the social insurance. If such agents face a high expenditure shock and can only purchase IHI at a high premium, some may choose to remain uninsured in the hope of having the health cost covered by the government through the social insurance.

12 We estimate the empirical take-up ratios over earnings via a probit model on three regressors: a constant, log of earnings and squared log earnings. Likewise, for the plots over expenditures we use a constant, log of expenditures and squared log expenditures.
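The smoothing described in footnote 12 can be reproduced with a small probit fit of the take-up indicator on a constant, log earnings and squared log earnings. The simulated data below are placeholders, not the MEPS sample, and the Fisher-scoring fitter is a minimal stand-in for a standard probit routine.

```python
import math
import numpy as np

def norm_cdf(z):
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def probit_fit(X, y, iters=30):
    """Fisher-scoring probit MLE (minimal stand-in for a library routine)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        z = X @ beta
        F = np.clip(norm_cdf(z), 1e-10, 1.0 - 1e-10)
        f = np.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        score = X.T @ (f * (y - F) / (F * (1.0 - F)))
        H = X.T @ (X * (f * f / (F * (1.0 - F)))[:, None])
        beta = beta + np.linalg.solve(H, score)
    return beta

# Simulated placeholder data: take-up propensity rising in earnings.
rng = np.random.default_rng(0)
log_e = rng.normal(3.3, 0.6, size=4000)          # log earnings
insured = (-2.5 + 0.9 * log_e + rng.normal(size=4000) > 0).astype(float)

# Probit on a constant, log earnings and squared log earnings (footnote 12).
X = np.column_stack([np.ones_like(log_e), log_e, log_e ** 2])
beta = probit_fit(X, insured)
takeup_curve = norm_cdf(X @ beta)   # smoothed take-up ratio over earnings
```

Plotting `takeup_curve` against earnings yields the kind of smooth, increasing take-up profile shown in Fig. 1(a).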
Fig. 2. Health insurance take-up ratios for agents with GHI offer: (a) over earnings z; (b) over expenditures x.
Fig. 3. Health insurance take-up ratios for agents without GHI offer: (a) over earnings z; (b) over expenditures x.
Fig. 1(b) displays the take-up ratios over health expenditures. The data show a fairly flat take-up ratio in the range of 70-80%, except for agents with very low expenditures. Our model also generates a flat pattern of take-up ratios, although it is slightly off at the very low end, where the data exhibit a drop in coverage. One possible reason for this difference is our assumption that all employers pay 80% of the premium, which is based on the average in the data. In practice, however, different firms cover varying fractions of the premium, and the data may capture agents who decline coverage because their employer's subsidy is less generous. The healthiest agents, with relatively low expected expenditures, may choose not to be insured if the employer subsidy is sufficiently low.

Figs. 2(a) and (b) plot take-up ratios over earnings and expenditures for those offered GHI. The take-up ratio is very high across different levels of both earnings and expenditures. Group insurance is an attractive deal given the firm's subsidy and the tax break. The take-up ratio falls slightly among agents with very low earnings, some of whom may expect to be covered by Medicaid (social insurance in our model). The model somewhat overstates the health insurance demand among them, and our abstraction from heterogeneous subsidy rates may help explain the difference. Also evident in Fig. 2(b) is that the take-up ratio does not vary much over expenditures, both in the data and in the model.

Finally, Figs. 3(a) and (b) plot take-up ratios for those not offered GHI. Over both earnings and expenditures, the model replicates the empirical take-up ratios fairly well, most of the time within 10 percentage points. When agents face higher health expenditures, the demand for insurance is higher. At the same time, however, the premium reflects the higher expected cost of coverage in the individual insurance market, which offsets the rise in demand.
Table 4
Policy experiments.

                                             Benchmark   A         B        C        D
Group insurance premium                      $2,018      $5,316    $2,013   $2,017   $2,016
Take-up ratio: all                           75.7%       48.4%     76.0%    89.8%    94.5%
Take-up ratio: group insurance not offered   35.5%       38.0%     35.4%    73.8%    86.7%
Take-up ratio: group insurance offered       99.0%       57.4%     99.6%    99.0%    99.1%
Purchase group insurance                     99.0%       17.6%     99.6%    99.0%    99.1%
Purchase individual insurance                0.0%        39.8%     0.0%     0.0%     0.0%
Aggregate capital                            -           +0.81%    -0.29%   -0.69%   -1.77%
Aggregate consumption                        -           +0.38%    -0.09%   -0.27%   -0.62%
Interest rate                                4.99%       4.93%     5.02%    5.05%    5.13%
Wage rate                                    -           +0.27%    -0.09%   -0.23%   -0.59%
Income tax rate τ_y                          4.46%       3.74%     4.46%    4.72%    5.11%
Social insurance covered (% of population)   6.34%       6.55%     6.34%    6.24%    6.15%
Notes: Policy A eliminates tax deductibility of group insurance premium. Policy B provides a lump-sum subsidy for the group insurance. Policies C and D provide tax deductibility and refundable credit for individual insurance, respectively.
5.2. Policy experiments

We now conduct experiments to study the effect of health insurance policies. In the experiments, the level of government expenditure G is fixed and the proportional tax rate τ_y is adjusted to balance the government budget.13 In each experiment, the steady state implied by the new policy is computed first and then the transition dynamics are computed. In the latter, we assume that in period 0 the economy is in the steady state of the benchmark economy. In period 1, an unanticipated change of the policy is announced and implemented, and the economy starts to make a transition to the new steady state. Throughout the transition, the proportional tax rate τ_y as well as the payroll taxes τ_ss and τ_med are adjusted to balance the budgets of the overall government, social security and Medicare, respectively. The GHI premium also changes as the insurance demand and the composition of the insured evolve over time.

In order to assess the welfare effect of a reform, the consumption equivalent variation is computed. It measures the increment, in percent of consumption in every state of the world, that has to be given to the agent so that he is indifferent between remaining in the benchmark and moving to another economy which is about to make a transition to the new steady state.

5.2.1. Abolishing tax deductibility of group insurance premium (Policy A)

In order to understand the economic and welfare effects of the current health insurance policy, we ask what agents would do if there were no such policy. The experiment invokes a radical change: the government abolishes the entire deductibility of the group insurance premium for both income and payroll tax purposes. Taxes are now collected on the entire premium, including the employer-paid portion. Hence, taxable income is given as y = wz̃ + r(a + b) + i_E i_HI χp. We allow supply side effects as a result of policy changes, both on the extensive and intensive margins, i.e. the probability of GHI offers and the firms' subsidy rate.14 On the intensive margin, we assume that employers pay χ^m for every dollar of premium above the benchmark premium p_bench, that is, employers pay a total subsidy worth χ p_bench + χ^m (p - p_bench), where p is the new premium. This specification is motivated by the observation that at the margin employers tend to carry a smaller share of the GHI premium, i.e. χ^m < χ. Gruber and McKnight (2003) argue that ''the key dimension along which employers appear to be adjusting their health insurance spending is through the generosity of what they contribute.'' Sommers (2005) argues that wage stickiness prevents firms from reducing wages when the premium increases. We set χ^m at 50%.15 For the extensive margin, to gauge the effect on the offer probability we rely on work by Gruber and Lettau (2004), who run a probit model of GHI offers on the after-tax price of health insurance. They estimate that taking away the income and payroll tax subsidy would reduce the share of workers that are offered GHI by 15.5%. The transition matrix of offer probabilities is adjusted to achieve this ratio.

13 Medicare and social security remain self-financed, and the Medicare tax τ_med and the social security tax τ_ss are adjusted to balance the budgets of these programs. 14 In Jeske and Kitao (2008) we report the results without supply side effects and with effects on only the extensive or the intensive margin. They are qualitatively very similar. 15 Simon (2005) estimates that employers pass 75% of a premium hike on as increased employee contributions (i.e. 25% paid by employers), but we regard this estimate of the employer contribution as too low for our model given that the study focuses on small firms, which tend to be less generous with their group insurance subsidy.
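The marginal subsidy rule above is a simple piecewise-linear schedule. A sketch using the shares χ = 80% and χ^m = 50% from the text and the premiums from Table 4:

```python
def employer_subsidy(p, p_bench, chi=0.80, chi_m=0.50):
    """Employer-paid dollars of the GHI premium in the policy experiments:
    the benchmark share chi applies up to p_bench, and the smaller marginal
    share chi_m applies to every dollar of premium above p_bench."""
    return chi * p_bench + chi_m * (p - p_bench)

# Benchmark premium $2,018 rising to $5,316 under Policy A:
subsidy = employer_subsidy(5316.0, 2018.0)
share = subsidy / 5316.0   # the employer's overall share falls below chi
```

Because the marginal share is only 50%, the employer's overall share of the premium falls from 80% toward 60% as the premium rises, which is how the specification captures the Gruber and McKnight (2003) evidence on premium pass-through.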
Table 5
Welfare effects of policy experiments.

                                             A (%)    B (%)    C (%)    D (%)
All (young)                                  -0.34    +0.07    +0.24    +0.58
Young with group insurance offer             -0.46    +0.06    +0.16    +0.37
Young without group insurance offer          -0.12    +0.09    +0.38    +0.95
Fraction of young agents with welfare gain   19.2     79.9     79.8     99.3
Note: Welfare effects in the top three rows are expressed in terms of consumption equivalence.
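The consumption-equivalent numbers in Table 5 follow the usual homotheticity logic. A sketch assuming CRRA period utility with risk aversion σ (the functional form is our assumption for this example, and the value-function numbers are placeholders):

```python
def cev(v_bench, v_new, sigma):
    """Consumption equivalent variation: the proportional consumption
    increment lam that makes the benchmark agent indifferent, i.e.
    v_bench * (1 + lam)**(1 - sigma) = v_new, under CRRA utility
    u(c) = c**(1 - sigma) / (1 - sigma) with sigma != 1."""
    return (v_new / v_bench) ** (1.0 / (1.0 - sigma)) - 1.0

# Placeholder values: a reform that raises lifetime utility yields lam > 0.
lam = cev(v_bench=-10.0, v_new=-9.8, sigma=3.0)
```

With sigma > 1 the value functions are negative, and a less negative post-reform value yields a positive `lam`, read as a percentage gain in consumption in every state of the world.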
Column A of Table 4 summarizes the results of the policy experiment. The upper section of the table displays statistics on health insurance; the lower section displays changes in aggregate variables and prices. Removing the tax subsidy leads to a partial collapse of the group insurance market. The take-up ratio conditional on being offered group insurance falls from 99.0% in the benchmark to 57.4%. About two-thirds of those who remain insured opt out of the GHI market and purchase IHI. These are the agents in better health, who face a lower premium in the individual insurance market. The exit of healthy agents from the group insurance market significantly deteriorates the health quality of the insured pool, and the group insurance premium soars from the benchmark price of $2,018 to $5,316. The overall coverage ratio falls by more than a third, to 48.4%. The increased exposure to health expenditure shocks raises the precautionary savings demand, and aggregate capital increases by 0.81%. The magnitude of the increase in capital may seem small given the large decrease in coverage. However, many agents who become uninsured by declining the group insurance offer are healthy and, given the persistence of the expenditure process, less concerned about expenditure shocks in the immediate future. The number of agents eligible for social insurance coverage rises from 6.34% of the population to 6.55%; the increase is modest for the same reason. The proportional tax τ_y on income that balances the government budget falls from 4.46% to 3.74%, since the income tax base is expanded by eliminating the deductibility of the GHI premium. Column A of Table 5 shows the welfare effect of the policy change.
Although the wage rate is higher and the tax rate τ_y is lower than in the benchmark, these gains are not enough to compensate for the welfare loss due to the lower insurance coverage and the increased exposure to health expenditure shocks. Not only agents with a group insurance offer but also those with no access to group insurance face a welfare loss, since any group insurance offer they may receive in the future is no longer as attractive, and they also bear greater future expenditure risk. Only 19.2% of young agents would experience a welfare gain from such a reform, and the average welfare effect is negative, on the order of -0.34% in consumption equivalence.

5.2.2. Alternative policies

In this subsection, alternative policies on health insurance are studied. We consider correcting the regressiveness of the group insurance subsidy (Policy B), extending the deduction to individual insurance contracts (Policy C) and providing a refundable credit for the purchase of individual insurance (Policy D).

Policy B. Fixing regressiveness: The government continues to provide a benefit for group insurance, but abolishes the premium deductibility from the progressive income tax and instead provides a lump-sum subsidy for the purchase of group insurance. The amount of the subsidy is determined so that the government maintains budget balance while keeping the income tax rate τ_y unchanged. Compared to the benchmark, this policy is more beneficial to agents with a group insurance offer in lower earnings groups, because under the benchmark their deduction was valued at their lower marginal tax rate. The subsidy under this policy is common across agents and exceeds the tax deduction for agents in a low tax bracket. The take-up ratio of GHI goes up from 99.0% to 99.6%, and the rise to nearly perfect take-up comes from the increased demand of low-earnings agents. Welfare is improved, although the magnitude is relatively small.
Policy C. Extending tax deductibility to the individual insurance market: The government keeps the current tax deductibility for the group insurance premium but aims to provide some benefit for the individual insurance market. One way to do so is to extend the same tax advantage to everyone, i.e. agents who purchase a contract in the individual market can also deduct the premium cost from their income and payroll tax bases. As shown in Column C of Table 4, the policy would increase insurance coverage among agents without access to the group insurance market by 38.3 percentage points to 73.8%, and the overall coverage by 14.1 percentage points to nearly 90%. The fiscal cost of extending the deduction universally is reflected in the higher proportional tax, though the change is small. The coverage rises across different health expenditure levels, which reduces precautionary savings; aggregate capital falls by 0.69%. In terms of welfare, the policy brings a relatively large gain for those without an insurance offer from the employer, at 0.38% in consumption equivalence. Despite the decrease in aggregate consumption in the final steady state, the reform enables agents to smooth consumption across states and enhances overall welfare. The number of agents eligible for social insurance goes down slightly, since agents are better insured and less exposed to a catastrophic health shock that would bring their disposable assets down to the minimum consumption level.
Policy D. Providing credit to the individual insurance market: The government offers a refundable credit of $1,000 towards the purchase of individual insurance for any person not offered group insurance. As shown in Column D of Table 4, there is a significant effect on insurance coverage among those without access to group insurance: the conditional take-up ratio increases from 35.5% to 86.7%. Compared to the tax deduction in Policy C, there is a much larger increase in take-up among agents with low earnings. Providing a refundable credit as in Policy D is more effective than a tax deduction because low-earnings agents, who are more likely to be uninsured, belong to lower tax brackets and benefit less from a deduction. Increased risk-sharing together with the higher tax rate reduces the saving motive, and aggregate capital falls by 1.77%. Despite the increased tax burden and a fall in aggregate consumption, the gain from better insurance coverage and increased protection against expenditure uncertainty dominates the negative effects, and the welfare effect among young agents is positive on average, with a consumption equivalent variation of 0.58%. The number of agents eligible for social insurance falls from 6.34% to 6.15%. The vast majority of agents would support such a reform proposal.

6. Conclusion

Our policy experiments indicate that despite some issues entailed in the current tax treatment of health insurance, providing some form of subsidy and an incentive for group insurance coverage has merit. Employer-provided group insurance has the feature that everyone can purchase a contract at the same premium irrespective of individual characteristics, most importantly current health status. Healthy agents would have an incentive to opt out of this contract and either self-insure or find a cheaper insurance contract in the individual market.
A subsidy on group insurance can encourage them to sign up, maintain the diverse health quality of the insurance pool and alleviate the adverse selection problem that could otherwise plague the group insurance market. We conduct an experiment that confirms this intuition by showing that a complete removal of the subsidy would result in a deterioration of health quality in group insurance, a rise in the premium and a significant reduction in insurance coverage, which together reduce welfare. Jeske and Kitao (2008) show the robustness of our results across a wide variety of alternative model assumptions and parameterizations.

Our work complements existing studies on health insurance policy by highlighting the additional insights one can obtain from a general equilibrium model. An alternative tax treatment of the health insurance premium affects the composition of agents who sign up and therefore the equilibrium insurance premium. Changing the exposure to health expenditure risks affects precautionary saving motives, which in turn affect the level of factor prices, consumption and ultimately welfare. We have also shown that it is important to capture the fiscal consequences of a reform, because providing a subsidy affects the magnitude of distortionary taxation. Moreover, the changes in insurance demand are shown to affect other public welfare programs such as Medicaid.

We also find that there is room for significantly increasing insurance coverage and improving welfare by restructuring the current subsidy system. Extending the benefit to the individual insurance market is more effective if the subsidy is refundable than if it takes the form of a deduction. The refundable credit policy is shown to raise coverage by nearly 20 percentage points and enhance welfare despite an increase in the fiscal burden and a decrease in aggregate output and consumption due to the lower demand for precautionary savings.
Since our focus is on the effect of the tax policy, we chose not to alter other features of the model along the transition. An interesting extension would be to ask how agents' insurance and saving decisions, as well as the government's fiscal position, respond to future changes in the environment, in particular the combination of rising health costs and aging demographics. We leave these subjects for future research.
Acknowledgments

The authors thank an anonymous referee, Robert King, Thomas Sargent, Gianluca Violante and seminar participants at the ASSA meetings in Chicago, the Atlanta Fed, the Chicago Fed, Columbia GSB, Duke, Emory, the European Central Bank, the German Macro Workshop, Illinois at Urbana-Champaign, Michigan, Maryland, NYU, NYU Stern, Pennsylvania, the SED meetings in Vancouver, the University of Tokyo, USC Marshall and Western Ontario for helpful comments, and Katie Hsieh for research assistance. All remaining errors are ours.

References

Aiyagari, R., 1994. Uninsured idiosyncratic risk and aggregate saving. Quarterly Journal of Economics 109, 659-684.
Bewley, T.F., 1986. Stationary monetary equilibrium with a continuum of independently fluctuating consumers. In: Hildenbrand, W., Mas-Colell, A. (Eds.), Contributions to Mathematical Economics in Honor of Gerard Debreu. North-Holland, Amsterdam, pp. 79-102.
Castañeda, A., Díaz-Giménez, J., Ríos-Rull, J.-V., 2003. Accounting for the U.S. earnings and wealth inequality. Journal of Political Economy 111, 818-857.
Chatterjee, S., Corbae, D., Nakajima, M., Ríos-Rull, J.-V., 2007. A quantitative theory of unsecured consumer credit with risk of default. Econometrica 75, 1525-1589.
Conesa, J.C., Kitao, S., Krueger, D. Taxing capital? Not a bad idea after all! American Economic Review, forthcoming.
Conesa, J.C., Krueger, D., 2006. On the optimal progressivity of the income tax code. Journal of Monetary Economics 53, 1425-1450.
De Nardi, M., French, E., Jones, J.B., 2005. Differential mortality, uncertain medical expenses, and the saving of elderly singles. Working paper.
Domeij, D., Heathcote, J., 2004. On the distributional effects of reducing capital taxes. International Economic Review 45, 523-554.
Fernández-Villaverde, J., Krueger, D., 2006. Consumption over the life cycle: facts from Consumer Expenditure Survey data. Review of Economics and Statistics 89, 552-565.
Gouveia, M., Strauss, R.P., 1994. Effective federal individual income tax functions: an exploratory empirical analysis. National Tax Journal 47, 317-339.
Gruber, J., 2004. Tax policy for health insurance. NBER Working Paper No. 10977.
Gruber, J., Lettau, M., 2004. How elastic is the firm's demand for health insurance? Journal of Public Economics 88, 1273-1293.
Gruber, J., McKnight, R., 2003. Why did employee health insurance contributions rise? Journal of Health Economics 22, 1085-1104.
Hubbard, R.G., Skinner, J., Zeldes, S.P., 1995. Precautionary saving and social insurance. Journal of Political Economy 103, 360-399.
Huggett, M., 1993. The risk-free rate in heterogeneous-agent incomplete-insurance economies. Journal of Economic Dynamics and Control 17, 953-969.
İmrohoroğlu, A., 1989. Cost of business cycles with indivisibilities and liquidity constraints. Journal of Political Economy 97, 1364-1383.
Jeske, K., Kitao, S., 2008. U.S. tax policy and health insurance demand: can a regressive policy improve welfare? Supplementary document. Available at http://sagiri.kitao.googlepages.com/JK_supplementary.pdf.
Kahn, J.G., Kronick, R., Kreger, M., Gans, D.N., 2005. The cost of health insurance administration in California: estimates for insurers, physicians, and hospitals. Health Affairs 24, 1629-1639.
Kennickell, A.B., 2003. A rolling tide: changes in the distribution of wealth in the U.S., 1989-2001. Working paper, Federal Reserve Board.
Kotlikoff, L.J., 1989. Health expenditures and precautionary savings. In: Kotlikoff, L.J. (Ed.), What Determines Savings? MIT Press, Cambridge, MA, pp. 141-163.
Krueger, D., Perri, F., 2005. Does income inequality lead to consumption inequality? Evidence and theory. Review of Economic Studies 73, 163-193.
Livshits, I., MacGee, J., Tertilt, M., 2007. Consumer bankruptcy: a fresh start. American Economic Review 97, 402-418.
Mendoza, E.G., Razin, A., Tesar, L.L., 1994. Effective tax rates in macroeconomics: cross-country estimates of tax rates on factor incomes and consumption. Journal of Monetary Economics 34, 297-323.
Palumbo, M.G., 1999. Uncertain medical expenses and precautionary saving near the end of the life cycle. Review of Economic Studies 66, 395-421.
Scholz, J.K., Seshadri, A., Khitatrakun, S., 2006. Are Americans saving ''optimally'' for retirement? Journal of Political Economy 114, 607-643.
Simon, K.I., 2005. Adverse selection in health insurance markets? Evidence from state small-group health insurance reforms. Journal of Public Economics 89, 1865-1877.
Sommers, B.D., 2005. Who really pays for health insurance? The incidence of employer-provided health insurance with sticky nominal wages. International Journal of Health Care Finance and Economics 5, 89-118.
Sommers, J.P., 2002. Estimation of expenditures and enrollments for employer-sponsored health insurance. Agency for Healthcare Research and Quality, MEPS Methodology Report 14.
Whitehouse, E., 2003. The value of pension entitlements: a model of nine OECD countries. OECD Social, Employment and Migration Working Papers No. 9.
Journal of Monetary Economics 56 (2009) 222-230
Technological change and the households' demand for currency

Francesco Lippi (University of Sassari, EIEF and CEPR, Italy) and Alessandro Secchi (Bank of Italy, Italy)
Article history: Received 20 September 2007; received in revised form 5 November 2008; accepted 6 November 2008; available online 21 November 2008.

JEL classification: E5. Keywords: Money demand; transactions technology; inventory models.

Abstract: It is shown that accounting for technology variations, across households and periods, is important to obtain theoretically consistent estimates of the demand for currency. An inventory model is presented in which the withdrawal technology is explicitly modeled. Both the level and the interest rate elasticity of cash holdings depend on the withdrawal technology available to households. Empirical proxies for the household withdrawal technology, based on the diffusion of cash withdrawal points measured at the city level, are used to test the model's predictions on a panel of Italian household data over the 1993-2004 period.
1. Introduction

Cash usage remains intense. The ratio of cash to GDP for the world economy, between 5% and 8% since the 1950s, has displayed an increasing trend over the past 20 years in both high and low income countries (see Fig. 1). Similar patterns emerge from the analysis of individual country data (see Drehmann et al., 2002). Likewise, survey data reported in Table 1 show that despite the strong diffusion of payment and withdrawal instruments that allow households to finance consumption using less cash, such as ATM and POS terminals, the demand for currency by Italian households hovered around 400 euros over the past 10 years.

This paper takes a step towards understanding the households' demand for currency by studying the effects of technical progress in the transactions technology. While technological/financial innovation is often invoked as an important factor affecting money demand, an explicit modeling of the mechanism is rarely found. One problem with the study of money demand is that innovation affects both the extensive and the intensive margin of money demand. The extensive margin gives the proportion of total expenditure that is made using cash. Given this cash expenditure, the household determines the cash inventory to finance it (the intensive margin). Aggregate data do not allow these two choices to be separately analyzed, and even most microdatabases do not contain information on the household expenditures that are made using cash. A dataset of around 50,000 household level observations, spanning the period 1993-2004, is used. The data contain
We thank the Editor and an anonymous referee for useful comments on a previous version of this paper. We also benefited from the comments of Fernando Alvarez, Giuseppe Bertola, Luigi Guiso, Christian Haefke, Tullio Jappelli, Tom Sargent, Pedro Teles, and seminar participants at the Banca d'Italia, the Bank of Portugal, the SED 2005 Budapest Meeting, the European University Institute, the University of Turin, Rome II ''Tor Vergata'', LUISS and the 2005 Vienna Macro Workshop. The views are personal and do not involve the responsibility of the institutions with which we are affiliated. 1 Einaudi Institute for Economics and Finance (EIEF), via Due Macelli 73, 00184 Rome, Italy.
doi:10.1016/j.jmoneco.2008.11.001
Fig. 1. Currency over GDP: world averages 1954-2006. Note: Averages are weighted by the share of a country's GDP in the group; whole sample = 98% of world GDP in 1995. Source: IFS. Shares of world GDP: high income 80.6%, low income 2.9%.
Table 1
Statistics on cash transactions in Italy.

Variable                            1993         1995         1998         2000         2002         2004
Full sample
  Fraction with a bank account      0.84         0.84         0.86         0.85         0.86         0.86
  Fraction with an ATM card         0.34         0.40         0.51         0.53         0.55         0.56
Households with a bank account
  Average currency holdings
    Without ATM card                460 (393)    526 (388)    462 (363)    460 (376)    425 (352)    433 (372)
    With ATM card                   395 (345)    444 (383)    388 (373)    364 (332)    359 (335)    353 (325)
  Cash expenditure per month
    Without ATM card                988 (477)    987 (491)    847 (467)    877 (463)    816 (412)    847 (424)
    With ATM card                   1,234 (560)  1,262 (595)  1,091 (644)  1,099 (596)  989 (544)    948 (548)
  Cash expenditure ratio (a)
    Without ATM card                0.91 (0.24)  0.92 (0.30)  0.92 (0.57)  0.92 (0.34)  0.91 (0.34)  0.90 (0.30)
    With ATM card                   0.85 (0.27)  0.85 (0.27)  0.84 (2.02)  0.79 (0.34)  0.75 (0.36)  0.69 (0.33)
  Bank branches (b)                 0.40 (0.19)  0.45 (0.21)  0.50 (0.22)  0.55 (0.25)  0.58 (0.28)  0.61 (0.30)
  Interest rate (c)                 6.10 (0.42)  5.23 (0.32)  2.15 (0.23)  1.16 (0.22)  0.77 (0.15)  0.33 (0.12)
Full sample observations            8,089        8,135        7,147        8,001        8,011        8,012
Note: Entries are sample averages; standard deviation in parenthesis. Nominal variables are in 2004 euros. Source: Bank of Italy—Survey of Household Income and Wealth. a Ratio to non-durable expenditures. The denominator excludes imputed rents and non-monetary benefits. b Per thousand residents; observations disaggregated at city level (source: Central Credit Register). c Observations disaggregated at provincial level (source: Central Credit Register).
information on the household average cash holdings and the value of the expenditure that is done using cash. This allows us to separate the intensive and the extensive margin. The distinction is essential for estimating the transaction elasticity of the money demand. The analysis also casts light on the relation between the money demand interest elasticity and the development of the transactions technology.
Our model modifies the standard inventory theory by introducing a role for the density of bank branches and ATM terminals in agents' cash holding choices. The key difference with respect to the classic Baumol–Tobin framework, where all withdrawals are assumed to be costly, is that in our setup agents are occasionally given the opportunity to withdraw at essentially no cost, for example when they pass an ATM terminal while shopping. It is shown that in this economy the level of the money demand and its interest elasticity decrease as the frequency of free withdrawal opportunities increases. Thus, advances in the transactions technology may explain the low interest elasticity that emerges from several empirical studies; see, e.g., Daniels and Murphy (1994) for the US. The theory suggests that the money demand level and curvature vary with the development of the transactions technology. This hypothesis is tested on a panel of Italian household data, first used by Attanasio et al. (2002), which includes information on the household's access to transaction services (e.g. whether it owns an ATM card) and on the diffusion of bank branches and ATM terminals. Given the sizeable cross-sectional and time-series variation in the transactions technology faced by households, accounting for these variables is important when estimating the demand for currency. The paper is organized as follows. The next section presents our model. Section 3 uses the suggestions of the theory to estimate the money demand equation. Section 4 discusses the findings and offers some comments on related literature. A concluding section summarizes the findings.

2. Transactions technology in the inventory model

This section modifies the standard cash inventory model to investigate the relation between the withdrawal technology and the demand for currency. Consider the steady state problem of an agent who uses cash to finance an exogenous stream of expenditure, c.
Shopping takes place in one of the several locations of the economy, which may be endowed with a cash dispenser that allows the agent to withdraw cash without incurring a time cost. By contrast, a withdrawal done at a location without a cash dispenser entails a cost b, as in the Baumol–Tobin model. Let T(M, c) be the number of costly withdrawals from the bank that are necessary to finance a consumption flow c when the average money balances are M. It is assumed that T is decreasing in M, so that higher balances allow the agent to finance consumption with fewer withdrawals, and that T is convex in M, so the minimization problem is well behaved. The money demand solves the minimization problem:

min_M  R M + b T(M, c),    (1)

where R is the net nominal interest rate. The optimal choice of M balances the cost of forgone interest against the cost of withdrawals. To analyze the effect of technological change in T on the money demand we present two comparative statics results, one about the level of money demand and the other about its interest rate elasticity. Both results depend critically on the first derivative of T(M). The absolute value of this derivative gives the marginal saving, in terms of costly trips, that the agent reaps by holding one more unit of currency. Let us label −T'(M) the ''marginal withdrawal benefit of money''. Consider two withdrawal technologies, T_i, and the associated money demand schedules, M_i, for i = 1, 2. The first order condition of problem (1) and the assumption that T is convex in M give:

Result 1. A smaller marginal withdrawal benefit of money, −T'(M), reduces money demand. Formally, if −T'_2(M) ≤ −T'_1(M) for all M, then M_2 ≤ M_1 for all R ≥ 0.

The second result relates the interest rate elasticity to the curvature of the cost function T. In particular, the first order condition of problem (1) and its total differential imply

(R/M) (∂M/∂R) = 1 / (M T''/T').    (2)

The expression −M T''/T' ≥ 0 is a measure of the local curvature of the transaction function T; it is also the elasticity of the marginal benefit −T'. Thus Eq. (2) says that if the marginal benefit is more sensitive to M, then the money demand is less sensitive to interest rate changes. This yields:

Result 2. A higher elasticity of the marginal withdrawal benefit reduces the interest elasticity of the money demand. Formally, let M_1 and M_2 denote, respectively, the demand for currency implied by technology T_1 and T_2 when the interest rate is R. If −M_2 T''_2(M_2)/T'_2(M_2) ≥ −M_1 T''_1(M_1)/T'_1(M_1), then |(R/M_2) ∂M_2/∂R| ≤ |(R/M_1) ∂M_1/∂R|.
The next subsections use these results to analyze the effect of technological progress in T on money demand for two alternative withdrawal technology specifications.

2.1. Example 1: Baumol–Tobin with free withdrawals

Consider a Baumol–Tobin setup and assume that in every period the agent has p opportunities to withdraw that come for free. Each withdrawal in excess of p costs b. For concreteness, imagine a shopper who passes by a bank branch once every period. This case is represented by a technology T_p with p = 1. Now suppose that an ATM is installed on the way to her job. This is represented by a new technology with higher p. In general

T_p(M, c) = max{ c/(2M) − p, 0 },    (3)

where T_p denotes the number of costly withdrawals and the parameter p gives the number of free withdrawals per period. Setting p = 0 in (3) stipulates that all trips are costly, as in the Baumol–Tobin model: T_0(M, c) = c/(2M).² Note that T_0 has a marginal benefit −T'_0 with constant elasticity equal to 2, which implies the well known result that the interest elasticity of the money demand is 1/2 (in absolute value). The interpretation of the p > 0 case is that the agent has p free withdrawals, so that if the total number of withdrawals is c/(2M), then she pays only for the excess of c/(2M) over p. The money demand for a technology with p ≥ 0 is given by

M_p(R) = sqrt( b c / (2R) )    for R ≥ R*,
M_p(R) = sqrt( b c / (2R*) )   for R < R*,    where R*(p) ≡ p² 2b/c.    (4)

When p = 0 the forgone interest revenue is small at low values of R, so agents economize on the number of withdrawals and choose a large value of M. Now consider p > 0. In this case there is no reason to have fewer than p withdrawals per unit of time, since these are free. Hence, for R < R* agents choose the same level of money holdings, namely M_p(R) = M_p(R*), since they are not paying for any withdrawal but they are subject to a positive forgone interest. Note that over this range the interest elasticity of money demand is zero. Improvements in the particular technology described in (3) produce a money demand that is lower in level and has a smaller interest rate elasticity (between zero and one-half) because it satisfies the assumptions for Results 1 and 2 presented above. To see this, consider two technologies indexed by 0 ≤ p₁ < p₂. These technologies satisfy the following three properties: (i) A greater value of p represents technological progress, because T_p is decreasing in p. Formally T_{p₂}(M, c) ≤ T_{p₁}(M, c) (with strict inequality for M < c/(2p₁) or, equivalently, R > R₁*).
(ii) A higher value of p decreases the marginal withdrawal benefit of M, −T'_p, and hence decreases money demand by Result 1, at least for some values of M. In particular, 0 = T'_{p₂}(M, c) > T'_{p₁}(M, c) over the range c/(2p₂) < M < c/(2p₁), and the two are equal otherwise. (iii) A greater value of p increases the curvature of T_p, and hence decreases the interest elasticity by Result 2. To see this notice that T_{p₂}(M, c) = g(T_{p₁}(M, c)) for g(t) = max{t − (p₂ − p₁), 0}. As the transformation g is increasing and convex in t, it follows that technologies indexed by a higher value of p have more curvature.

2.2. Example 2: random coupons for free withdrawals

Consider an economy with two locations: the shopping center and the financial district. Let c be the agent's cash consumption per period, all of which takes place in the shopping center. If the agent's cash balance reaches zero, she must walk to the financial district to withdraw more cash, paying the cost b (e.g. the time wasted in this operation). While in the shopping center, however, with probability p ∈ (0, 1) per period the agent receives a storable coupon for a free withdrawal. Think of this as the agent locating an ATM where she can withdraw without paying a fee. It is assumed that she will make use of the coupon (e.g. walk back to this ATM) to refill her balances when they reach zero. Apart from the randomness in the cost of withdrawals (free if a coupon is at hand, costly otherwise) the model is otherwise standard: cash balances follow a saw-tooth pattern and the average money balances (M) and the average withdrawal (W) are related by W = 2M, as in the Baumol–Tobin model.³ Note that a withdrawal of 2M allows the agent to finance consumption for a period of length at least 2M/c, without having to visit the financial district. The probability that the agent does not receive a free coupon for withdrawal during this period is (1 − p)^{2M/c} which, for small p, is approximated by e^{−(2M/c)p}.
This probability gives the fraction of the withdrawals per period (c/2M) which are costly. Hence the transactions technology T_p, which gives the expected number of costly withdrawals for an agent who withdraws 2M and consumes c, is

T_p(M, c) = (c/(2M)) e^{−(2M/c)p}.    (5)
As for the case discussed in Section 2.1, the technology in (5) has the following features: (i) T_p is decreasing in p, so that higher values of p represent technological progress; (ii) the marginal benefit −T'_p is decreasing in p which, by Result 1, implies that the level of money demand decreases as the technology improves; and (iii) the curvature of the cost function, as measured by −MT''/T', is increasing in p which, by Result 2, implies that the interest rate elasticity of money demand decreases as the withdrawal technology improves.⁴ Compared to the model of the first example, which featured an interest elasticity of either 0 or 1/2, the interest rate elasticity here is a continuous decreasing function of p that spans the (0, 1/2] range.

2 An agent with consumption flow c withdraws 2M, which lasts 2M/c periods, has average balances M and makes c/(2M) trips to the bank.
3 Alternatively, and more realistically, one might assume that the coupon cannot be stored, so that it is optimal for the agent to withdraw at the exact time the free withdrawal opportunity materializes. This gives rise to a whole size distribution of withdrawals, where the relationship W = 2M does not hold. See Alvarez and Lippi (2007) for a detailed analysis of this problem.

3. Currency demand and transactions technology

The model suggests that the level and the interest elasticity of the demand for currency depend on the type of withdrawal technology. It predicts that technological improvements, i.e. reductions in the cost of withdrawals, lower the level of the money demand and its interest elasticity (in absolute value). These hypotheses are evaluated using household level data taken from the Survey of Household Income and Wealth (SHIW), a bi-annual survey conducted by the Bank of Italy on a rotating sample of Italian households. The survey collects information on several social and economic characteristics of the household members, such as age, gender, education, employment, income, real and financial wealth, and consumption and saving behavior. Each survey covers a sample of about 8,000 households. We focus on the six surveys conducted from 1993 to 2004 because they include a section on household cash management that contains data on the average amount of cash held by the household and the value of consumption paid with cash. Two additional data sources are the Italian Central Credit Register and the Supervisory Reports to the Bank of Italy. The former includes information on the interest rate paid by banks on checking accounts, disaggregated by year and province (there are 103 provinces). The latter collects the reports that Italian banks file with the Bank of Italy for supervisory reasons and contains information on the supply of various financial services, such as the diffusion of bank branches and of ATM terminals.⁵ Using these data we construct a proxy for the level of the withdrawal technology faced by the household.
The proxy is given by the number of bank branches per capita measured at the city level (around 300 cities per year). This indicator, whose yearly averages and standard deviations are reported in Table 1, highlights the steady diffusion of bank services across the territory over the past 15 years as well as its large cross-section dispersion. The indicator is positively correlated with the number of ATM terminals in the time series and across provinces (the correlation coefficient is between 0.75 and 0.94 in each year).⁶ There are two caveats, however. First, in the time series ATM terminals grow faster than bank branches: the ratio of the total number of ATM terminals to bank branches is 0.6 in 1993 and 1.2 in 2004. Second, the information we have on bank branches is more detailed than the one we have for ATM terminals: the former is available at the city level, while the latter is only available at the province level. The estimates presented below are based on a currency demand specification that relates average cash holdings to the value of cash expenditure (both measured at the household level), to the interest rate paid on deposit accounts and to a proxy for the level of the withdrawal technology. The latter is included both in level and interacted with the interest rate. The currency demand specification also includes demographic controls and year and province dummies that are intended to capture unobserved geographical and time-series factors affecting money demand (e.g. the level of crime).⁷ Following Attanasio et al. (2002) we estimate two separate equations for households with and without an ATM card, as these two groups are endowed with different withdrawal technologies, and we adopt an estimation strategy that allows us to control for sample selection (the Heckman two-step approach).⁸ The inclusion of a measure of the withdrawal technology is a crucial difference with respect to these authors.⁹ The results are presented in Table 2, where we report the OLS second stage estimates.
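The currency demand specification just described can be written schematically as follows (the β notation is ours; demographic controls, dummies and the selection terms are omitted):

```latex
\ln M_{it} \;=\; \beta_0 \;+\; \beta_c \ln c_{it} \;+\; \beta_R \ln R_{it}
\;+\; \beta_{RB}\,(\ln R_{it})\,B_{it} \;+\; \beta_B\, B_{it} \;+\; u_{it},
```

where M_{it} denotes household average currency holdings, c_{it} cash expenditure, R_{it} the deposit interest rate and B_{it} the bank-branch diffusion proxy. The implied interest rate elasticity is ∂ln M/∂ln R = β_R + β_{RB} B_{it}, which shrinks in absolute value as B_{it} rises when β_R < 0 < β_{RB}, which is the pattern the theory predicts.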
The choice to present the OLS coefficients (the so-called direct effect), instead of the marginal effect, is due to our interest in the structural parameters of the inventory problem described in Section 2. Thus the coefficients have the same interpretation as those that would be obtained by applying OLS to a truly random sample of households.¹⁰ The results reported in columns (1)–(3) concern households without an ATM card. These are the households for whom our measure of withdrawal technology—the number of bank branches per capita at the city level—is the most appropriate. In column (1) we report the estimates obtained from a standard specification of the money demand. The specification presented in column (2) adds the technology measure in level. In line with the predictions of the theory, a greater diffusion of bank branches reduces the average currency holdings. The interest rate enters the equation with a negative and statistically significant coefficient—though its magnitude is much smaller than is suggested by the Baumol–Tobin model—and the transaction elasticity is about 0.5, right on top of the square root formula.

4 Some algebra shows that −MT''/T' = 2 + (2pM/c)²/(2pM/c + 1), which is increasing in p.
5 Until the early nineties commercial banks faced restrictions on opening new bank branches in other provinces. A gradual process of liberalization has occurred since then, which has led to an increase in the number of bank branches and a reduction of the interest rate differentials across different areas (see Casolaro et al., 2006 for a review of the main developments in the banking industry during the past two decades).
6 In Italy ATM terminals are owned by banks. About 80% of ATM terminals are located on the premises of a bank branch; the remaining 20% are not (e.g. they are located in airports, shopping malls, etc.).
7 See Lippi and Secchi (2007) for the results of several alternative specifications.
8 The choice to present separate equations for households with and without an ATM card is supported by the results of a series of formal tests that reject the null hypothesis of equality of the currency holding behavior across the two groups. Details are presented in the online appendix.
9 See the online appendix for details.
10 Instead, the marginal effect would be the coefficient of interest if one were interested in predicting the (in sample) conditional mean of M, accounting for both the direct influence of a change in R and the fact that this variation also affects the dependent variable through the Mills ratio (e.g. the participation decision).
Table 2
The demand for currency and withdrawal technology.

                                          Bank account holders without ATM card         Bank account holders with ATM card
                                          (1)            (2)            (3)             (4)            (5)            (6)
log(cash expenditure)                     0.468 (0.013)  0.467 (0.013)  0.466 (0.013)   0.338 (0.010)  0.339 (0.010)  0.338 (0.010)
log(interest rate)                        -0.105 (0.045) -0.110 (0.045) -0.174 (0.048)  0.038 (0.037)  0.036 (0.037)  0.055 (0.039)
log(interest rate) x bank branches (a)    --             --             0.104 (0.021)   --             --             -0.025 (0.021)
Bank branches per capita (a)              --             -0.130 (0.034) -0.134 (0.034)  --             -0.162 (0.031) -0.167 (0.032)
Mills ratios
  Bank account                            0.463 (0.021)  0.464 (0.021)  0.464 (0.021)   0.474 (0.030)  0.476 (0.030)  0.473 (0.030)
  ATM card                                0.248 (0.046)  0.242 (0.046)  0.247 (0.046)   0.399 (0.055)  0.390 (0.055)  0.400 (0.055)
Province and year dummies                 Yes            Yes            Yes             Yes            Yes            Yes
R2                                        0.255          0.256          0.257           0.210          0.211          0.211
Sample size                               17,339         17,339         17,339          22,512         22,512         22,512

Note: The equations are estimated using the Heckman two-step procedure. Bootstrapped standard errors in parentheses. The regressions also include sex, age, education, and work status of the head of the household, together with living location, number of children, number of adults, and number of income recipients in the household. (a) Number of bank branches per capita measured at the city level.
Column (3) considers a specification that allows the technology index to affect both the level and the interest elasticity of the demand for currency, as the theory predicts. The estimates confirm the findings of column (2): a greater diffusion of bank branches reduces the currency demand intercept, and the interest rate (log) level enters the equation with a negative coefficient. The transaction elasticity remains about 0.5. Moreover, the interaction between the interest rate and the diffusion term enters significantly with a positive coefficient. This suggests that the interest elasticity of the demand for currency varies across households, with lower values (in absolute terms) for households which face a superior technology (a greater diffusion of bank branches). The comparison of columns (1) and (2) with column (3) shows that omitting the interaction term yields an estimate of the average interest rate elasticity that neglects an important layer of heterogeneity. In quantitative terms, the estimates in (3) imply that agents facing a less developed technology, e.g. a diffusion value of 0.1 (the 5th percentile), have an interest elasticity of about −0.2. The interest elasticity falls to −0.1 for the median agent (the median of the diffusion indicator is around 0.5) and is basically nil for the households facing the highest levels of development. The regressions in columns (4)–(6) concern households who possess an ATM card. We attempt this estimation exercise even though our index of the development of the withdrawal technology—the diffusion of bank branches per capita at the city level—is not the most appropriate measure of diffusion for this type of household.¹¹ The estimation results should thus be taken with a grain of salt, as they may be subject to a greater amount of measurement error than the ones concerning the households without an ATM card.
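The implied elasticity β_R + β_RB·B can be evaluated at the diffusion values quoted above, using the point estimates as read from column (3) of Table 2 (a back-of-the-envelope sketch; rounding explains the small differences from the figures in the text):

```python
# Point estimates from Table 2, column (3): log(interest rate) level and its
# interaction with the bank branches per capita proxy.
beta_R, beta_RB = -0.174, 0.104

def implied_elasticity(branches):
    """Interest elasticity of currency demand at a given branch diffusion."""
    return beta_R + beta_RB * branches

# 5th percentile, median and top diffusion values cited in the text.
for diffusion in (0.1, 0.5, 1.2):
    print(f"branches per capita {diffusion}: elasticity {implied_elasticity(diffusion):.3f}")
```

The elasticity rises monotonically towards zero as diffusion increases, which is the heterogeneity a constant-elasticity specification would average away.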
In regressions (5) and (6) the level of currency holdings is negatively related to the diffusion of bank branches, with a coefficient magnitude comparable to the one detected for the agents without an ATM card. Instead, the interest rate coefficients (both levels and interactions) are not significantly different from zero. In principle, a zero interest elasticity for agents who face a more advanced withdrawal technology can be explained by the models outlined in Section 2. For instance, the Baumol–Tobin model with free withdrawals predicts that the interest rate range over which the demand for currency has a zero interest elasticity expands with technological advances. Finally, the regression indicates a transaction elasticity of about 0.3. We conclude this section by exploring the robustness of the estimates for the households without an ATM card, those for which our confidence in the indicator of the level of financial technology is high.¹² We begin by assessing whether the estimated coefficients are affected by the choice of the Heckman estimation method. The identification of the currency demand coefficients in the presence of sample endogeneity hinges on the specification of the probit selection equation. In particular, if the first and the second stage estimates have a large set of variables in common, a collinearity problem may occur, as the Mills ratio is approximately a linear function of these variables over a wide range of values (see Puhani, 2000). This problem might be particularly relevant in our case since, due to a limited availability of appropriate instruments, the
11 As mentioned, information on the diffusion of ATM terminals, the natural measure for ATM card holders, is not available at the city level.
12 See Lippi and Secchi (2007) for a similar experiment on the other group of households.
Table 3
The demand for currency and withdrawal technology: robustness. Bank account holders without ATM card.

Estimation method                         Ordinary least squares (1)   Instrumental variables (a) (2)   Household fixed effects (3)
log(cash expenditure)                     0.486 (0.019)                0.487 (0.018)                    0.394 (0.025)
log(interest rate)                        -0.181 (0.092)               -0.241 (0.135)                   -0.330 (0.089)
log(interest rate) x bank branches (b)    0.107 (0.036)                0.103 (0.062)                    0.095 (0.046)
Bank branches per capita (b)              -0.107 (0.058)               -0.153 (0.128)                   -0.397 (0.135)
Province dummies                          Yes                          Yes                              No
Year dummies                              Yes                          Yes                              Yes
R2                                        0.225                        0.225                            0.081
Sample size                               17,339                       17,339                           7,631

Note: Robust standard errors in parentheses. The regressions also include sex, age, education, and work status of the head of the household, together with living location, number of children, number of adults, and number of income recipients in the household. (a) The instruments used for the deposit interest rate and the number of bank branches at the city level are the interest rate lagged value and the number of firms and employees per resident at the city level. (b) Number of bank branches per capita measured at the city level.
identification hinges on the assumption of normality of the errors and is helped by the exclusion from the second stage equations of a variable that measures real financial assets at the household level. To assess the impact of multicollinearity on the baseline results of column (3) of Table 2 we present a plain OLS estimate of the demand for currency in column (1) of Table 3.¹³ The results show that the coefficients on the cash expenditure and the interest rate are not much affected. We consider next the possibility that some of the regressors are not exogenous with respect to the currency demand shocks. This issue might arise both for the number of bank branches per city and for the deposit interest rate at the province level, which might move in response to currency demand shocks that are common to all households of a given city or province. To this end we instrument the interest rate with its previous-year value and the number of bank branches with indicators of industrial activity measured at the city level (number of firms and number of employees). The results, reported in column (2) of Table 3, do not show significant differences with respect to the benchmark estimates. The similarity of the OLS and IV estimates suggests a limited relevance of endogeneity problems, a hypothesis confirmed by standard exogeneity tests (not reported). Finally, column (3) of Table 3 presents a fixed-effect panel estimate on our data, which controls for household-specific unobserved factors. Since the panel dimension is limited to a subset of households, the number of observations for this estimate is smaller. The coefficients of the cash expenditure, the interest rate, and bank branch diffusion are statistically significant and maintain the expected sign. The transaction elasticity is close to the one predicted by the square root formula, in line with all the other specifications.
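The two-step selection correction used for the baseline estimates can be sketched on synthetic data. This is a generic illustration of Heckman's estimator, not the authors' code, dataset or exact specification; all variable names and parameter values below are ours:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 20_000

# Synthetic data: selection (e.g. holding a bank account) depends on z, the
# outcome (e.g. log currency holdings) on x; correlated errors are what bias
# plain OLS on the selected sample.
z = np.column_stack([np.ones(n), rng.normal(size=n)])
x = np.column_stack([np.ones(n), rng.normal(size=n)])
gamma_true, beta_true, rho = np.array([0.3, 1.0]), np.array([1.0, 0.5]), 0.6
v = rng.normal(size=n)
u = rho * v + np.sqrt(1 - rho**2) * rng.normal(size=n)
selected = z @ gamma_true + v > 0
y = x @ beta_true + u                      # observed only when selected

# Step 1: probit maximum likelihood for the selection equation.
q = 2 * selected - 1
gamma_hat = minimize(lambda g: -norm.logcdf(q * (z @ g)).sum(),
                     np.zeros(2), method="BFGS").x

# Step 2: OLS of y on [x, inverse Mills ratio] over the selected sample.
zb = z @ gamma_hat
mills = norm.pdf(zb) / norm.cdf(zb)
X = np.column_stack([x[selected], mills[selected]])
coef, *_ = np.linalg.lstsq(X, y[selected], rcond=None)
# coef[:2] estimates beta_true; coef[2] estimates rho * sigma_u.
```

With the inverse Mills ratio included, the second-stage slopes are consistent despite the endogenous selection; dropping the Mills column reintroduces the selection bias, which is the collinearity trade-off discussed in the text.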
The estimates of the direct (negative) effect of bank branch diffusion on currency holdings and the average interest elasticity (about −0.3) are somewhat larger than the values reported in columns (1) and (2). The coefficient of the interaction term maintains its magnitude and statistical significance. This provides further support to the hypothesis that technological advances reduce the average demand for currency and its interest elasticity (in absolute value).¹⁴

4. Discussion and related literature

The idea that the adoption of advanced withdrawal and payment technologies might have an effect on currency demand is not new. Its empirical relevance has been previously assessed by comparing the average cash holdings of less financially developed individuals (i.e. those who only have a bank account) with those of the more financially developed part of the population (those who also have an ATM or a credit card). Related contributions based on household level data are those of Attanasio et al. (2002) who highlight, based on Italian survey data, that ATM users hold significantly smaller cash balances than non-users. Likewise, Stix (2004) offers evidence concerning Austrian individuals showing that the demand for purse cash is significantly smaller for ATM users. Similar evidence is also reported by Daniels and Murphy (1994) using

13 The standard errors of the OLS and IV estimates presented in Table 3 take into account the possibility of heteroskedasticity and cross correlation of the shocks within a province in a given year.
14 Further robustness results are presented in the online appendix.
two large surveys on US households. According to Duca and Whitesell (1995), who follow a cross-sectional approach based on US household survey data, credit card ownership is also associated with lower money holdings. Overall, the evidence consistently indicates that innovations in withdrawal (ATM cards) and payment instruments (credit cards) reduce the level of money balances that agents hold. The analysis presented in Section 2 confirms the effects of technical progress on average currency holdings. In addition, it shows that the interest elasticity of the demand for currency decreases with developments in the withdrawal technology. As far as the level is concerned we have shown that, in line with the theory, both within the class of individuals who have a bank account (but no ATM card) and within the class of those who also have an ATM card, average currency holdings depend on the diffusion of withdrawal points. The comparison of a standard specification of the currency demand with one that takes into account the level of the withdrawal technology (Table 2) illustrates a novel finding of our analysis. While the interest rate elasticity is constant in the standard specification, in our framework it varies with the diffusion of bank branches. Note that the heterogeneity in the diffusion of bank branches has both a temporal and a geographical dimension. According to our data the average diffusion of bank branches per capita increased from about 0.4 to 0.6 between 1993 and 2004. As far as the cross-sectional distribution of the diffusion of bank branches is concerned, our data indicate that in 1993 the household at the 5th percentile of the distribution of bank branches per capita faced a value of 0.10, and the one at the 95th percentile a value of 0.70 (respectively, 0 and 1.20 in 2004). This implies that in 1993 the interest rate elasticity ranged between −0.17 and −0.11, while in 2004 the equivalent figures were −0.16 and −0.04.
A standard estimate of the money demand equation would neglect this heterogeneity and associate to each household an interest rate elasticity of −0.11 (column (1) of Table 2). A basically zero elasticity was found for agents with an ATM card. This evidence could be due to an imprecise measure of the withdrawal technology available to this class of agents but, as explained in Section 3, it is not necessarily in contradiction with our theory: it is consistent with the theoretical prediction that agents with more developed withdrawal technologies (e.g. ATM card holders) are expected to have a smaller interest elasticity. The findings concerning the elasticity with respect to cash expenditure that emerge from the various specifications indicate values that are close to, sometimes a little below, one-half. The estimates are almost identical to those detected on a cross section of Austrian households by Stix (2004), but differ from the near-unit elasticity that emerges from the analysis of long time series, e.g. Lucas (1988, 2000) and Meltzer (1963), and that is predicted by many theoretical models. The issue is of interest in the debate on the optimality of the Friedman rule (e.g. De Fiore and Teles, 2003). A simple reconciliation between the long-run unit elasticity and the smaller values detected using household data over a short span of years is that the cost of a trip to the bank, b, is linked to the consumption (income) variable in the long run but less so in the cross section or the short run. It is reasonable to presume that the cost b is proportional to aggregate wages and consumption in the long run. Formally, assuming a proportionality relation between b and c yields a unit income elasticity if one maintains the reasonable assumption that the transactions technology T(M, c) in problem (1) is homogeneous of degree zero in M and c, as in the model economies discussed in Section 2.¹⁵
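The reconciliation argument can be made explicit. Writing b = κc for the long run (κ and the display below are our notation, following the logic of footnote 15), the first order condition of problem (1) under degree-zero homogeneity gives:

```latex
R \;=\; -\,b\,T'(M,c)
  \;=\; -\,\frac{b}{c}\,T'\!\Big(\frac{M}{c},\,1\Big)
  \;=\; -\,\kappa\,T'\!\Big(\frac{M}{c},\,1\Big),
```

so M/c is pinned down by R alone and money demand is proportional to c, i.e. the long-run consumption elasticity is one. In the cross section, where b varies little with c, the square-root logic applies instead and the elasticity stays near one-half.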
5. Concluding remarks

This paper contributes to the quest for accurate quantitative estimates of the parameters that govern the money demand function. It is shown that accounting for the transactions technology available to households is important to identify theoretically consistent estimates of the demand schedule. The analysis is guided by a theoretical framework that shows how advances in the withdrawal technology shift the money demand curve downwards and reduce its interest elasticity. This insight is tested by augmenting a standard money demand equation with a proxy for the withdrawal technology faced by households (the number of per capita bank branches measured at the city level) and its interaction with the nominal interest rate paid on deposits. The estimates do not discard the theory. The estimated transaction elasticity of currency is about 0.5. Various estimation exercises show that the interest rate elasticity depends on the withdrawal technology available to households in a way that is consistent with the theory: agents who have access to a superior withdrawal technology (more bank branches or an ATM card) hold a smaller cash balance and have a smaller interest elasticity. Quantitatively, our estimates of the interest rate elasticity range between around 0.2 and almost nil. The quasi-constant cash balance of Italian households shown in Table 1 emerges as the outcome of two opposing forces: lower interest rates, which increase the demand for cash, are countered by improvements in the withdrawal technologies.
Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2008.11.001.
¹⁵ The proof follows from the first-order condition of problem (1): R = b T′(M, c), where T′ denotes the derivative of T with respect to M. The homogeneity of degree zero of T(M, c) implies b T′(M, c) = (b/c) T′(M/c, 1).
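The homogeneity argument in footnote 15 can be spelled out in a few lines. This is our own sketch in the paper's notation, writing T_M for the derivative of T with respect to M:

```latex
% Homogeneity of degree zero, differentiated with respect to M,
% evaluated at \lambda = 1/c:
T(\lambda M,\lambda c)=T(M,c)\;\;\forall \lambda>0
\;\Rightarrow\; \lambda\,T_M(\lambda M,\lambda c)=T_M(M,c)
\;\Rightarrow\; T_M(M,c)=\tfrac{1}{c}\,T_M(M/c,1).
```

Combined with the first-order condition R = b T_M(M, c), this gives R = (b/c) T_M(M/c, 1). If b is proportional to c, say b = κc, then R = κ T_M(M/c, 1) pins down M/c as a function of R alone, so currency holdings are proportional to cash consumption: a unit elasticity, as claimed in the text.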
ARTICLE IN PRESS 230
F. Lippi, A. Secchi / Journal of Monetary Economics 56 (2009) 222–230
References

Alvarez, F., Lippi, F., 2007. Financial innovation and the transactions demand for cash. NBER Working Paper 13416. Econometrica, in press.
Attanasio, O., Guiso, L., Jappelli, T., 2002. The demand for money, financial innovation and the welfare cost of inflation: an analysis with household data. Journal of Political Economy 110 (2), 318–351.
Casolaro, L., Gambacorta, L., Guiso, L., 2006. Regulation, formal and informal enforcement and the development of the household loan market. Lessons from Italy. In: Bertola, G., Grant, C., Disney, R. (Eds.), The Economics of Consumer Credit: European Experience and Lessons from the US. MIT Press, Boston.
Daniels, K.N., Murphy, N.B., 1994. The impact of technological change on the currency behavior of households: an empirical cross section study. Journal of Money, Credit and Banking 26 (4), 867–874.
De Fiore, F., Teles, P., 2003. The optimal mix of taxes on money, consumption and income. Journal of Monetary Economics 50, 871–887.
Drehmann, M., Goodhart, C., Krueger, M., 2002. The challenges facing currency usage: will the traditional transactions medium be able to resist competition from the new technologies? Economic Policy 34, 193–227.
Duca, J.V., Whitesell, W.C., 1995. Credit cards and money demand: a cross-sectional study. Journal of Money, Credit and Banking 27 (2), 604–623.
Lippi, F., Secchi, A., 2007. Technological innovation and the transactions demand for cash. CEPR Discussion Paper 6023.
Lucas Jr., R.E., 1988. Money demand in the United States: a quantitative review. Carnegie-Rochester Conference Series on Public Policy 29, 137–168.
Lucas Jr., R.E., 2000. Inflation and welfare. Econometrica 68 (2), 247–274.
Meltzer, A.H., 1963. The demand for money: the evidence from the time series. Journal of Political Economy 71, 219–246.
Puhani, P., 2000. The Heckman correction for sample selection and its critique. Journal of Economic Surveys 14 (1), 53–68.
Stix, H., 2004. How do debit cards affect cash demand? Survey data evidence. Empirica 31, 93–115.
Journal of Monetary Economics 56 (2009) 231–241
Can aggregation explain the persistence of inflation?

Filippo Altissimo (Brevan Howard Asset Management LLP and CEPR, UK), Benoit Mojon (Banque de France, France), Paolo Zaffaroni (Imperial College London, UK)
Article history: Received 5 December 2006; received in revised form 18 December 2008; accepted 18 December 2008; available online 20 January 2009.

Keywords: Inflation; Persistence; Aggregation; Euro area; Price stickiness

Abstract

An aggregation exercise is proposed that investigates whether the fast average adjustment of the disaggregate inflation series of the euro area CPI is consistent with the slow adjustment of euro area aggregate inflation. Estimating a dynamic factor model for 404 inflation sub-indices of the euro area CPI allows us to decompose the dynamics of inflation sub-indices into a part due to a common macroeconomic shock and a part due to sector-specific idiosyncratic shocks. Although idiosyncratic shocks dominate the variance of sectoral prices, one common factor appears to be the main driver of aggregate dynamics. In addition, the heterogeneous propagation of this common shock across sectoral inflation rates, and in particular its slow propagation to inflation rates of services, generates the persistence of aggregate inflation. We conclude that the aggregation mechanism explains a significant amount of aggregate inflation persistence.
1. Introduction

Recent contributions show that heterogeneity in price stickiness across sectors implies a slower response of aggregate prices to a monetary policy shock than would be the case if prices in all sectors were identically sticky (see Imbs et al., 2005; Carvalho, 2006; Boivin et al., 2006; Clark, 2006 and references therein). In view of this result, it appears legitimate to analyze the role of the aggregation process across heterogeneous sectors for the dynamics of the business cycle. We take a step in this direction by assessing the role of aggregation across heterogeneous sectors for the persistence of aggregate inflation. We empirically investigate whether the prediction that the persistence of the aggregate is (much) larger than the cross-sectoral average persistence holds true for our data, which gather inflation rates from 404 sub-indices of the euro area CPI between 1985 and 2005. It turns out that this prediction is not rejected for our data. In particular, we show that the heterogeneous responses of sectoral inflation rates to a common shock imply an aggregate inflation response that is indeed much slower than if we impose that all sectors respond like the median/average sector. We therefore argue that the aggregation process can resolve the apparent dilemma between the flexibility of sectoral prices¹ and the persistence of macroeconomic inflation.
We are grateful to Laurent Baudry, Laurent Bilke, Hervé Le Bihan and Sylvie Tarrieu (Banque de France), Johanes Hoffman (Bundesbank), Roberto Sabbatini and Giovanni Veronese (Banca d'Italia), and Anna-Maria Agresti and Martin Eiglsperger (ECB) for making the data available to us. We thank Gonzalo Camba-Méndez, Steven Cecchetti, M. Hashem Pesaran, Jim Stock and the participants in the Eurosystem Inflation Persistence Network. Finally we thank the Editor (Bob King) and two anonymous referees for their valuable comments. Any remaining errors are of course the sole responsibility of the authors. The views expressed in the paper are the authors' own and do not necessarily reflect those of the Banque de France, the ESCB or Brevan Howard Asset Management.

Corresponding author. E-mail address: [email protected] (P. Zaffaroni).

¹ See Chari et al. (2000), Bils and Klenow (2004) and Dhyne et al. (2005).
0304-3932/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.013
Table 1
Impulse response function m_k for the Beta(a, b) density with a = 4b.

k      b = 0.2   b = 0.7   b = 1    b = 3    0.8^k
1      0.8       0.8       0.8      0.8      0.8
2      0.72      0.67      0.66     0.65     0.64
5      0.61      0.48      0.44     0.37     0.33
10     0.54      0.33      0.28     0.17     0.11
50     0.39      0.12      0.07     0.01     1.4 × 10⁻⁵
200    0.36      0.09      0.05     0.01     4.1 × 10⁻²⁰
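The entries of Table 1 can be reproduced from the raw moments of the Beta distribution, since m_k = E[a^k] when the AR coefficients are Beta distributed. A minimal sketch (pure Python, function names our own) using the standard product formula for Beta moments:

```python
# Moments m_k = E[a^k] for a ~ Beta(alpha, beta), as used in Table 1.
# For the Beta distribution, E[a^k] = prod_{j=0}^{k-1} (alpha + j) / (alpha + beta + j).

def beta_moment(k: int, alpha: float, beta: float) -> float:
    """kth raw moment of a Beta(alpha, beta) random variable."""
    m = 1.0
    for j in range(k):
        m *= (alpha + j) / (alpha + beta + j)
    return m

def impulse_response(k: int, b: float, mean: float = 0.8) -> float:
    """m_k when a ~ Beta(a, b) with a = (mean/(1-mean)) * b, so that the
    distribution has the given mean for any b; mean = 0.8 gives a = 4b."""
    a = (mean / (1.0 - mean)) * b
    return beta_moment(k, a, b)

if __name__ == "__main__":
    for k in (1, 2, 5, 10, 50, 200):
        print(k, [round(impulse_response(k, b), 2) for b in (0.2, 0.7, 1.0, 3.0)])
```

The smaller is b, the more slowly the computed m_k decays, in line with the asymptotic rate m_k ≈ c k^(−b) cited in the text, while the homogeneous benchmark 0.8^k decays geometrically.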
Our analysis is based on results on the aggregation of time series² and is built upon an explicit model of the aggregation process that is then estimated and simulated. In the model, each sectoral inflation rate is decomposed into its response to an aggregate shock that affects all sectors and a sector-specific shock. Our main findings can be summarized as follows: (i) one common factor accounts for 30 percent of the overall variance of the 404 series; this share is twice as large if one focuses on business cycle and lower frequencies, i.e. on the persistent components of the series; (ii) the propagation mechanism of the common shock is highly heterogeneous across sectors; (iii) the persistence implied by the aggregation exercise mimics remarkably well the persistence observed in aggregate inflation; in particular, the cross-sectional distribution of the sectoral parameters implies an autocorrelation function of aggregate CPI inflation which decays hyperbolically toward zero and displays long memory; and (iv) the high volatility and low persistence observed on average at the level of sectoral inflation are consistent with aggregate smoothness and high persistence. Such results are important for two distinct reasons. First, they show the importance of heterogeneity and aggregation in shaping the dynamic response of macrovariables. This is a clear warning against the naïve use of average/median microparameters in calibrating macromodels. Second, they shed further light on the long-standing debate on the persistence of inflation (see Pivetta and Reis, 2007 and references therein). These results provide support to the view of inflation as a stationary process, albeit with long memory. The paper is organized as follows. Section 2 illustrates the intuition on the theoretical link between aggregation, heterogeneity and persistence.
Section 3 presents the data used in the empirical analysis and addresses the presence of common factors across the price sub-indices. Section 4 introduces the micromodel and the estimation methodology. It then discusses the estimation results and their implications for the aggregate persistence. Section 5 concludes.

2. Aggregation, heterogeneity and persistence: a link

To build the intuition on the theoretical link between heterogeneity, aggregation and persistence, we make use of a simple example which summarizes the results of Zaffaroni (2004) (see also Robinson, 1978; Granger, 1980). Let us consider n units and let the ith unit be described by a first-order autoregressive model

    y_it = a_i y_{i,t-1} + u_t + ε_it,   i = 1, ..., n,   (1)

where u_t, the common shock, is an i.i.d. sequence (0, σ_u²) and the ε_it, the idiosyncratic shocks, are i.i.d. sequences (0, σ_{ε,i}²) for any i, t. The propagation of the shocks is heterogeneous across units and the a_i are i.i.d. with probability density function f(a) over the support (−1, 1). The aggregate is simply the average of the individual units and, by linearity, Y_{n,t} = n⁻¹ Σ_{i=1}^n y_it = U_{n,t} + E_{n,t}, with E_{n,t} = n⁻¹ Σ_{i=1}^n ε_it/(1 − a_i L) and U_{n,t} = u_t + m̂_1 u_{t−1} + m̂_2 u_{t−2} + ⋯, setting m̂_k = n⁻¹ Σ_{i=1}^n a_i^k. When n gets large, by the strong law of large numbers, each m̂_k converges a.s. to the corresponding population moment of the a_i, i.e. m_k = ∫_{−1}^{1} a^k f(a) da. The dynamic pattern of the m_k represents the impulse response of the common shock u_t on the aggregate. Zaffaroni (2004) shows that the rate of decay of the m_k (i.e. the persistence of the aggregate shock) depends only on the behavior of f(a) near unity. To illustrate this result, let us parameterize f(a) as a Beta(a, b) distribution, with a, b > 0.³ The parameter b governs the mass of the Beta distribution around unity and a is set equal to (m/(1 − m))b in order to ensure that the distribution has mean m for any choice of b. Table 1 reports the behavior of m_k for various choices of b when m = 0.8 across all columns. The smaller is b, the larger is the mass of the distribution f(a) around unity (i.e. the larger is the share of microunits with very high persistence), and therefore the more slowly the impulse response converges to zero. In contrast, the last column shows that if all units had the same AR coefficient (equal to m = 0.8) the impulse response would decay to zero very fast, as m^k. Zaffaroni (2004) shows that m_k ≈ c k^{−b} as k → ∞ for some c > 0. This characterization of the impulse response implies a precise representation of the autocovariance function (acf) and spectral density of the limit aggregate U_t, the latter being the limit (in mean square) of U_{n,t}.
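The mechanism in Eq. (1) is easy to see in a small simulation. The sketch below is our own (panel sizes, seed and the choice b = 0.7 are illustrative, not from the paper): it draws heterogeneous AR(1) coefficients from the Beta density of Table 1 and compares the autocorrelation of the aggregate with the geometric decay 0.8^k of a representative unit.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, burn = 2000, 4000, 200
mean_a, b = 0.8, 0.5 + 0.2          # Beta(a, b) with a = (m/(1-m)) b, b = 0.7
a_i = rng.beta((mean_a / (1 - mean_a)) * 0.7, 0.7, size=n)

u = rng.normal(size=T + burn)        # common shock u_t
eps = rng.normal(size=(n, T + burn)) # idiosyncratic shocks eps_it
y = np.zeros((n, T + burn))
for t in range(1, T + burn):
    y[:, t] = a_i * y[:, t - 1] + u[t] + eps[:, t]

Y = y[:, burn:].mean(axis=0)         # the aggregate Y_{n,t}

def acf(x, k):
    """Sample autocorrelation at lag k."""
    x = x - x.mean()
    return (x[:-k] * x[k:]).sum() / (x * x).sum()

# A unit with the mean coefficient would have acf(k) ~ 0.8**k
# (0.8**20 is about 0.012); the aggregate stays correlated much longer.
print([round(float(acf(Y, k)), 2) for k in (1, 5, 20)])
```

The aggregate's autocorrelation at lag 20 remains well above 0.8^20, illustrating how heterogeneity near unity slows the decay even though every individual unit is a short-memory AR(1).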
In particular, Zaffaroni (2004) shows that when 1/2 < b < 1, then var(U_t) < ∞ and cov(U_t, U_{t+k}) ≈ c k^{1−2b} as k → ∞. In other words, when 1/2 < b < 1 the cov(U_t, U_{t+k}) decays toward zero, in agreement with the notion of stationarity, but too slowly to ensure summability, resembling the classical definition⁴ of long memory for the U_t given by

    cov(U_t, U_{t+k}) ≈ c k^{2d−1},   k → ∞,   (2)

with long memory parameter 0 < d < 1/2. It can be shown that there is a simple mapping between the long memory parameter d and the tail index b of the density f(a), which is

    d = 1 − b.   (3)

² See Robinson (1978), Granger (1980) and Zaffaroni (2004).
³ A Beta distribution has density function f(x | a, b) = B⁻¹(a, b) x^{a−1}(1 − x)^{b−1} for 0 ≤ x ≤ 1, where B(·, ·) is the Beta function.
The above results can be generalized to all distributions whose behavior close to unity can be approximated by (1 − a)^{b−1}. Furthermore, the results can also be extended to a general autoregressive moving average (ARMA) framework, where the implications for aggregation depend on the shape of the distribution, around unity, of the largest autoregressive root of the ARMA process. See Zaffaroni (2004) for more details. In the empirical application that follows, we will specifically exploit the theoretical link between the distribution of the micropersistence parameters and the persistence of the aggregate.

3. The data

The data set utilized in our analysis comprises 470 seasonally adjusted and annualized quarter-on-quarter changes of consumer price sub-indices from France, Germany and Italy. The data run from 1985:Q1 to 2004:Q2. Long time series of CPI sub-indices are not readily available in most countries of the euro area. Therefore, we have been obliged to limit our data to France, Germany and Italy. Nevertheless, this has not undermined the representativeness of our exercise, given that those three countries together account for roughly 70 percent of the euro area population and consumption. Unfortunately, the availability and cross-country comparability of the data have induced us to work with consumer price data, which measure the cost of goods and services as purchased by final consumers. The CPI also includes the margins charged by retailers at the various steps of the distribution process. This implies that such consumer prices do not properly capture price dynamics at the production level nor the cost of intermediate inputs paid by firms. Concerning the choice of the sample period, we focus on the post-1985 data for two reasons. First, the German data are not available before 1985. Second, we want to avoid that our results on inflation persistence are impaired by possible deterministic breaks over the estimation period.
Many empirical studies have shown that the mid-eighties marked a significant break in average inflation in most OECD countries.⁵ For all these reasons, we deem the post-1985 sample appropriate to study the persistence of inflation in the euro area. The three original CPI databases together comprise 470 sub-indices, but 66 of these were not suitable for the estimation of ARMA models, either because they have too few observations (e.g. some sub-indices are available only since the year 2000) or because they correspond to items whose prices are set at discrete intervals (e.g. tobacco or postal services). We are left with 404 'well-behaved' series, each having 78 quarterly observations.

3.1. Sectoral inflation series: descriptive statistics

We now turn to the properties of the sub-sector inflation rates. Table 2 reports descriptive statistics of aggregate inflation and of the distribution of the 404 sectoral inflation rates. Let us emphasize four points. First, the sectoral inflation averages are quite dispersed, with 50 percent of their distribution ranging from 1.8 to 4.2 percent. The mean of aggregate inflation is 2.6 percent. Second, sectoral inflation is noticeably more volatile than aggregate inflation. On average, the standard deviation of the inflation sub-indices is equal to 3.6 percent, i.e. nearly three times as large as the standard deviation of the aggregate. This much higher volatility is a common feature of the inflation rates of sectoral prices, as shown in Clark (2006) and Bilke (2005). Third, the persistence of sectoral inflation rates is also clearly smaller than that of the aggregate inflation series. A first measure of persistence is given by the largest root of an ARMA model fitted on the data.
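This largest-root measure can be sketched in code. The paper selects the best ARMA(p, q) by AIC (footnote 6); for simplicity the sketch below, which is our own, restricts attention to pure AR(p) models fitted by least squares, with p chosen by a Gaussian AIC, and reports the modulus of the largest characteristic root.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares AR(p) fit on a demeaned series; returns coefficients
    (lag 1 first) and the residual variance."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    if p == 0:
        return np.array([]), np.var(x)
    Y = x[p:]
    X = np.column_stack([x[p - k: len(x) - k] for k in range(1, p + 1)])
    phi, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return phi, np.var(Y - X @ phi)

def largest_root(x, max_p=4):
    """Modulus of the largest AR root of the AIC-selected AR(p), 0 <= p <= max_p."""
    n = len(x)
    best_aic, best_phi = np.inf, np.array([])
    for p in range(max_p + 1):
        phi, s2 = fit_ar(x, p)
        aic = n * np.log(s2) + 2 * p     # Gaussian AIC up to constants
        if aic < best_aic:
            best_aic, best_phi = aic, phi
    if best_phi.size == 0:
        return 0.0
    # Roots of z^p - phi_1 z^{p-1} - ... - phi_p; max modulus < 1 if stationary.
    roots = np.roots(np.r_[1.0, -best_phi])
    return float(np.max(np.abs(roots)))
```

Applied to a simulated AR(1) with coefficient 0.9, `largest_root` recovers a value near 0.9; the paper's fuller ARMA(p, q) search plays the same role for each sub-index.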
This measure is equal to 0.93 in the case of aggregate inflation, which is roughly equal to the 75th percentile of the distribution of the same measure on the sectoral data.⁶ An alternative measure of persistence is given by the long memory parameter, here denoted d_i for each sub-index inflation rate π_it (see Eq. (2)). If the large majority of the sub-index inflation rates displayed long memory, then the aggregation exercise would be trivial. The last column of the table, which reports the statistics relative to the distribution of the estimated long memory parameters d_i based on the parametric Whittle estimator (see Brockwell and Davis, 1987), confirms that only very few sectoral inflation rates exhibit long memory: d_i is either not different from zero or close to −1/2 for a vast majority of the cross-section.

⁴ When −1/2 < d < 0 in (2), under further regularity conditions, the U_t is said to display anti-persistence, which is a variation of long memory for which the acf becomes summable. We disregard this case because it is not relevant for our empirical application.
⁵ Among others, Corvoisier and Mojon (2005) suggest that Italian, French and German time series of inflation all admit a break in the mid-eighties.
⁶ The largest root is the one associated with the best ARMA(p, q) as selected by the Akaike (AIC) criterion with 0 ≤ p, q ≤ 4, estimated using the ARMAX procedure of MatLab.

However, as shown below, our
Table 2
Descriptive statistics of the 404 sectoral inflation rates (first two columns annualized q-o-q inflation rates).

                               Mean     Stand. dev.   Larg. ARMA root   Long mem. d
Aggregate of 404               2.6      1.1           0.93              0.18

Cross-section characteristics
Weighted mean                  2.6      3.6           0.78              −0.33
Unweighted mean                2.4      3.5           0.72              −0.36
Minimum                        −11.3    0.7           −0.81             −0.50
25th percentile                1.8      1.7           0.71              −0.50
Median                         2.6      2.5           0.83              −0.43
75th percentile                4.2      4.0           0.90              −0.18
Maximum                        8.1      25.4          1.02              0.32

Mean for selected sub-sets
France                         1.8      3.0           0.72              −0.34
Germany                        1.5      3.2           0.71              −0.34
Italy                          3.6      4.3           0.73              −0.36
Processed food                 2.6      3.1           0.68              −0.20
Unprocessed food               2.3      3.9           0.52              −0.33
NE indus. goods                2.0      2.2           0.77              −0.38
Energy                         1.9      8.4           0.78              −0.37
Services                       3.3      3.0           0.74              −0.33

[Fig. 1. First 10 dynamic principal components: variance accounted (y-axis: relative variance accounted; x-axis: frequencies).]
estimate of d for the aggregate inflation rate is 0.18, well above the 75th percentile of the sectoral d_i parameters (−0.18). Finally, we observe sharper differences in persistence across the main sectoral groupings of the CPI (processed food, unprocessed food, energy, non-energy industrial goods (NEIG) and services) than across countries. The gap between the largest ARMA root of the inflation process of unprocessed food prices (0.52) and that of energy (0.78) is wider than the gaps between the roots associated with the ARMA processes fitted on the inflation of German, French and Italian prices. Comparing the main groupings of CPI sub-indices across countries, we also find that the sectoral hierarchy at the euro area level applies within each country. To conclude, we find clear evidence that the inflation rates of the individual sub-indices are far more volatile and much less persistent than the inflation rate of the aggregate CPI index. Moreover, volatility and persistence are more sector than country specific.
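The long memory parameter d reported in Table 2 is estimated in the paper with a parametric Whittle procedure. To illustrate the idea, here is our own sketch of a simpler cousin, the GPH log-periodogram regression, applied to a simulated ARFIMA(0, d, 0) series; the sample size, bandwidth and seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
T, d_true = 4096, 0.3

# ARFIMA(0, d, 0): x_t = (1 - L)^{-d} e_t, with MA(inf) weights
# psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j.
psi = np.ones(T)
for j in range(1, T):
    psi[j] = psi[j - 1] * (j - 1 + d_true) / j
e = rng.normal(size=2 * T)
x = np.convolve(e, psi)[T: 2 * T]        # drop the start-up, keep T points

# GPH regression: log I(lam_j) = const - 2 d log(lam_j) over the first
# m Fourier frequencies, with bandwidth m = sqrt(T).
m = int(np.sqrt(T))
j = np.arange(1, m + 1)
lam = 2 * np.pi * j / T
I = np.abs(np.fft.fft(x - x.mean())[1: m + 1]) ** 2 / (2 * np.pi * T)
slope = np.polyfit(np.log(lam), np.log(I), 1)[0]
d_hat = -slope / 2
print("estimated d:", round(float(d_hat), 2))
```

A series with 0 < d < 1/2, as in Eq. (2), produces a periodogram that diverges like lam^(−2d) at the origin, which is exactly the slope the regression picks up; applied sub-index by sub-index, this is the kind of statistic summarized in the last column of Table 2.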
3.2. Behind the aggregation mechanism

This section assesses the two elements that play a crucial role in shaping the effect of cross-sectional aggregation of time series: the presence of common shocks and the heterogeneity in the propagation mechanism of those shocks. This means that, using the notation of Section 2, we are checking whether there is at least one common shock u_t and whether the a_i differ across i. The analysis here is carried out using non-parametric methods, so that the results do not depend on any
[Fig. 2. Coherence between aggregate and sub-indexes at various frequencies (panels: coherence at 0, coherence at π/6, coherence at π/2).]
parametric assumptions. Following the recent literature on factor models in large cross-sections,⁷ we estimate the first 10 dynamic principal components of the autocovariance structure of the sectoral inflation series. The dynamic principal component analysis provides indications on the number of common shocks explaining the correlation structure in the data (see Forni et al., 2000). Fig. 1 presents the spectrum of the first 10 dynamic principal components⁸ of the 404 inflation time series. The variance of sectoral inflation is strikingly dominated by one common factor. Fig. 1 also shows that this first common factor is the only one whose spectrum is concentrated at low frequencies. Hence this factor drives the persistence observed in sectoral inflation.⁹ The other factors account for a much smaller share of the variance than the first one. They are also more equally relevant at all frequencies, as indicated by their relatively flat patterns in Fig. 1.¹⁰ On the basis of these results, we opt for a factor model of the sectoral inflation series that admits a single common shock, modelling the sectoral inflation time series π_it as
    π_it = δ_0i + C_i(L) u_t + ξ_it,   i = 1, ..., n,   (4)
where u_t is the common shock, ξ_it is a stationary idiosyncratic component, orthogonal to the common component, and δ_0i is the constant mean parameter. C_i(L) is a lag polynomial for the ith unit which represents the propagation of the common shock through the π_it process. In the next section we consider parametric specifications of both C_i(L) and ξ_it. Model (4) is here to be understood as a reduced-form model, obtained once all the simultaneous, endogenous relations are resolved. Indeed, the fact that we are working only with consumer price indexes prevents us from spelling out a complete simultaneous model where all the relations between prices at the various stages of the production process are properly accounted for (i.e. the role of the price of intermediate inputs). Furthermore, our model is designed to be in reduced form because we do not pursue any identification of the nature of the shocks, such as a productivity shock or a monetary policy shock, which would instead require a structural model coupled with a suitable set of identification conditions. However, one might still wonder whether our reduced-form approach can shed some light on which source of shocks is the dominant one, if any. A simple consequence of a purely monetary shock is its neutrality with respect to the long-run dynamics of relative prices, in line with the definition of pure inflation given by Reis and Watson (2008). For this reason, we look at whether the propagation mechanism of the common shock in the cross-section of price sub-indices is homogeneous across items. Still resorting to spectral analysis, we estimate the coherence¹¹ between the first principal component and each one of the 404 series, in order to obtain a 'correlation' at different frequencies. Fig. 2 reports the cross-sectional distribution of the coherence values at three frequencies: 0 (long run), π/6 (three-year periodicity) and π/2 (one-year periodicity), respectively.
⁷ See Forni et al. (2000), Stock and Watson (2002) and the application by Clark (2006) to US disaggregate consumption deflators.
⁸ Dynamic principal components are calculated as the eigenvalue decomposition of the multivariate spectra of the data at each frequency. The acf up to eight lags has been used in the construction of the multivariate spectral matrix. The data have been standardized to have unit variance before estimating the multivariate spectra.
⁹ The height of the spectrum at frequency zero is a well-known non-parametric measure of the persistence of a time series.
¹⁰ Clark (2006) finds similar results on US inflation data.
¹¹ Coherence measures the degree to which two variables move together at a given frequency. It is a quantity always between 0 and 1; see Brockwell and Davis (1987) for details.
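Coherence, as defined in footnote 11, can be estimated with standard spectral tools. The sketch below is our own illustration using SciPy's Welch-based estimator on two synthetic "sectors" (the filter and noise settings are arbitrary): a persistent sector that loads on a common factor through an AR(1) lag structure, and a sector dominated by idiosyncratic noise.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
T = 2048
common = rng.normal(size=T)                       # stand-in for the common factor

# "Slow" sector: the common shock propagates through 1/(1 - 0.9L),
# plus a little idiosyncratic noise.
slow = signal.lfilter([1.0], [1.0, -0.9], common) + 0.5 * rng.normal(size=T)
# "Noisy" sector: a small loading on the factor, swamped by noise.
noisy = 0.2 * common + 2.0 * rng.normal(size=T)

f, c_slow = signal.coherence(common, slow, nperseg=256)
f, c_noisy = signal.coherence(common, noisy, nperseg=256)

# The persistent sector is highly coherent with the factor at low
# frequencies; the noisy sector is weakly coherent everywhere.
print(float(c_slow[1:6].mean()), float(c_noisy[1:6].mean()))
```

Repeating this sub-index by sub-index against the first principal component, and collecting the values at a handful of frequencies, yields cross-sectional distributions of the kind plotted in Fig. 2.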
The figure supports the presumption that the transmission of the common shock to sectoral inflation is highly heterogeneous. Such evidence argues against the idea that our common shock can be identified as a monetary policy one. However, we leave the issue of shock identification to future work. The following section addresses precisely the issue of how to model and estimate, based on a parametric one-factor linear model, the heterogeneity manifested by Fig. 2.

4. Model: specification and estimation

The quarterly rate of change of each sectoral price sub-index π_it is assumed to behave according to the following parametric specification of (4):
    π_it = δ_0i + [Θ_i^u(L)/Φ_i^u(L)] u_t + [Θ_i^ε(L)/Φ_i^ε(L)] ε_it,   (5)
meaning that we are considering a particular case of (4) with C_i(L) = Θ_i^u(L)/Φ_i^u(L) and ξ_it = [Θ_i^ε(L)/Φ_i^ε(L)] ε_it, where as before u_t ~ i.i.d.(0, 1) is the common shock and now ε_it ~ i.i.d.(0, σ_i²) is the idiosyncratic shock for each i = 1, ..., n, also mutually independent from u_t. No distributional assumptions are made. The above polynomials in the lag operator are assumed to satisfy the usual stationarity-invertibility conditions, that is, they have all roots outside the unit circle in the complex plane. Also, Θ_i^x(L) and Φ_i^x(L), respectively of finite order q_i^x and p_i^x, have no common roots for x ∈ {u, ε}, ensuring identification. Thus, each π_it behaves as a stationary ARMA(p_i, q_i) with a possibly non-zero mean δ_0i, where standard arguments on sums of ARMA processes yield q_i ≤ max{q_i^u + p_i^ε, q_i^ε + p_i^u} and p_i ≤ p_i^u + p_i^ε, equality holding if the two AR polynomials, Φ_i^ε(L) and Φ_i^u(L), have no roots in common. Note that we allow full heterogeneity of all the parameters characterizing model (5). The aggregation result requires estimating the cross-sectional distribution of the maximal AR roots of Φ_i^u(L) and Φ_i^ε(L). We assume that the ε_it are mutually orthogonal across units. This is merely a simplifying assumption, not in fact required for the aggregation theory used in this paper to be applicable. In fact, a factor model such as (5) represents a parsimonious way to capture cross-sectional dependence. This enters primarily through the common component, representing a form of pervasive dependence. However, dependence is in principle permitted also through cross-correlated ε_it (i.e.
cov(ε_it, ε_jt) ≠ 0 for i ≠ j) as long as this form of cross-correlation is sufficiently weak and not pervasive in the cross-section, ensuring that model (5) belongs to the class of approximate factor structures.¹² For example, one might consider the case of local correlation arising between the prices of items belonging to the same sector as follows: the n units are classified into m < n sectors, with cov(ε_{i_l,t}, ε_{j_l,t}) = ρ_l ≠ 0 for all units belonging to the lth sector, with 1 ≤ l ≤ m, and cov(ε_{i_l,t}, ε_{j_r,t}) = 0 for all l ≠ r, any i, j. In this case, the approximate factor structure condition is still satisfied as long as the number of units belonging to each sector does not increase too fast with n. However, we do not adopt this type of parameterization here. In fact, allowing cross-sectional dependence among the ε_it is not of primary importance for this paper because, from an econometric perspective, (i) the aggregation mechanism depends solely on the characteristics of the common component (see Zaffaroni, 2004), (ii) our estimation strategy is robust to many forms of cross-sectional dependence among the ε_it, such as an approximate factor structure, and (iii) we are modelling consumer price indexes, which makes it difficult to properly account for input/output relations across sectors.

4.1. Estimation strategy

The estimation of model (5) is non-standard. First, the large dimensionality (large n) rules out recourse to the conventional Kalman filter approach typically used for univariate or small multivariate ARMA models. Second, the recent techniques for the estimation of dynamic factor models, all based on the principal component approach, such as Forni et al. (2000) and Stock and Watson (2002), would be inappropriate in our model due to the presence of sector-specific AR components Φ_i^u(L) and Φ_i^ε(L). We propose instead the following multi-stage procedure:
(i) For each unit i, we estimate the ARMA(p_i, q_i) implied by (5), namely Φ_i(L) π_it = d_0i + Ξ_i(L) z_it, setting d_0i = Φ_i(1) δ_0i and Φ_i(L) = Φ_i^u(L) Φ_i^ε(L), with MA component Ξ_i(L) z_it = Θ_i^u(L) Φ_i^ε(L) u_t + Θ_i^ε(L) Φ_i^u(L) ε_it, for some white-noise sequence z_it and a polynomial Ξ_i(L) of order q_i. Simplification occurs if there are roots in common. Note that here we need not distinguish between the common and the idiosyncratic shock, since this is unnecessary for consistent estimation of the intercept d_0i and of the AR coefficients in Φ_i(L), which are the goal of this stage.

(ii) We average the estimated MA components Ξ̂_i(L) ẑ_it across i, yielding

    x̂_{t,n} = Σ_{i=1}^n w_i Ξ̂_i(L) ẑ_it,   (6)
¹² An approximate factor structure is qualified by the condition that the maximum eigenvalue of the covariance matrix E[e_t e_t′] remains bounded as n increases, where e_t = (ε_1t, ..., ε_nt)′; see Chamberlain and Rothschild (1983).
[Fig. 3. Distribution of the maximal autoregressive root (non-parametric estimate of the reweighted distribution and estimated Beta density).]
where the w_i are the CPI weights. This yields an estimate of the common part, namely the limit of (Σ_{i=1}^n w_i Θ_i^u(L) Φ_i^ε(L)) u_t, where the approximation improves as n grows, since the idiosyncratic part vanishes in mean square as n → ∞. We then fit a finite-order MA to the x̂_{t,n} in order to get, at this stage, an estimate of the common innovation û_t.

(iii) Using the û_t as an (artificial) regressor, we finally fit model (5) to each π_it, with the caution of treating it as an ARMAX(p_i, q_i, q_i) process (an ARMA process with exogenous regressors). Note that, in doing so, we are not imposing that the AR roots of the common and idiosyncratic components coincide, a feature admitted by the so-called Box–Jenkins model structure. At this stage we get all parameter estimates.

Steps (ii) and (iii) of the procedure can be iterated in order to improve the estimate of the common shock as well as of the model parameters. The procedure does not require any distributional assumption, but its validity requires both n and T to diverge to infinity. Details of the algorithm can be found in Altissimo and Zaffaroni (2004). The described iterative procedure is similar to a recent modification of the EM algorithm for the estimation of factor models in the presence of idiosyncratic AR components proposed by Stock and Watson (2005). We set 0 ≤ p_i^u, p_i^ε ≤ 2 and 0 ≤ q_i^u, q_i^ε ≤ 4, implying in stage (i) estimation of ARMA(p_i, q_i) with 0 ≤ p_i, q_i ≤ 4.¹³ The orders of the models (q_i, p_i in step (i), q_i in step (ii) and p_i^u, p_i^ε, q_i^u, q_i^ε in step (iii)) are selected by the AIC criterion. At each iteration and for each i this means that we have estimated 3² × 5² = 225 specifications, given the chosen range for p_i^u, p_i^ε, q_i^u, q_i^ε. Our choice of the orders is made solely to contain computational time and to follow the principle of parsimony, since the larger the orders, the larger the number of parameters to be estimated. However, in terms of the aggregation mechanism, note that no effect would arise from taking arbitrarily large, yet finite, MA and AR orders.
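The logic of steps (i)-(iii) can be sketched in a heavily simplified setting. The code below is our own toy version, not the paper's implementation: each unit is an AR(1) with a heterogeneous loading on a single common shock (rather than a full ARMA), step (i) uses least-squares AR(1) fits, step (ii) averages the residuals with equal weights standing in for the CPI weights, and step (iii) refits each unit with the recovered shock as an exogenous regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 50, 400

# Simulated panel, a pared-down version of model (5):
#   pi_it = a_i * pi_{i,t-1} + c_i * u_t + eps_it
a = rng.uniform(0.2, 0.9, size=n)        # heterogeneous AR coefficients
c = rng.uniform(0.5, 1.5, size=n)        # heterogeneous factor loadings
u = rng.normal(size=T)                   # common shock u_t
eps = rng.normal(size=(n, T))            # idiosyncratic shocks
pi = np.zeros((n, T))
for t in range(1, T):
    pi[:, t] = a * pi[:, t - 1] + c * u[t] + eps[:, t]

w = np.full(n, 1.0 / n)                  # stand-in for the CPI weights

# Step (i): unit-by-unit AR(1) fits; the residuals play the role of the
# estimated MA components Xi_i(L) z_it.
def ar1_fit(x):
    phi = (x[1:] * x[:-1]).sum() / (x[:-1] ** 2).sum()
    return phi, x[1:] - phi * x[:-1]

resids = np.vstack([ar1_fit(pi[i])[1] for i in range(n)])

# Step (ii): weighted average of the residuals, as in Eq. (6); the
# idiosyncratic part averages out, leaving u_t up to scale.
u_hat = (w[:, None] * resids).sum(axis=0)
u_hat /= u_hat.std()

# Step (iii): refit each unit with u_hat as an exogenous regressor (ARX).
def arx_fit(x, z):
    X = np.column_stack([x[:-1], z])
    beta, *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    return beta                          # (AR coefficient, factor loading)

betas = np.array([arx_fit(pi[i], u_hat) for i in range(n)])
print("corr(u_hat, u):", round(float(np.corrcoef(u_hat, u[1:])[0, 1]), 2))
```

Even this crude two-pass version recovers the common innovation and the unit-level AR coefficients quite accurately, which is the essence of why iterating steps (ii) and (iii) works as n and T grow.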
However, in terms of the aggregation mechanism, note that no effect would arise from taking arbitrarily large, yet finite, MA and AR orders.

4.2. Results for sectorial inflation rates

This section briefly describes the estimates of up to 6000 parameters of the model (up to 15 parameters for each of the 404 sub-index time series, namely the AR and MA parameters, the mean, and the idiosyncratic innovation variance). First, the estimated common shock $\hat{u}_t$ turns out to be white, with non-significant autocorrelation, corroborating the i.i.d. hypothesis. Second, the idiosyncratic volatility $\sigma_i$ is substantially larger than the common shock volatility, in fact six times larger: the median of the distribution of the absolute value of the estimated first loading is 0.06, whereas we obtain 0.38 for the standard deviation of the idiosyncratic component. This strikingly confirms that most of the variance of sectoral prices is indeed due to sector-specific shocks. Third, we turn to $\Phi_i^u(L)$, which dominates the dynamic effects of the common shock on sectoral inflation. In Fig. 3 (blue dashed line), we show the kernel estimate of the distribution of the signed modulus of the maximal AR root of this polynomial.^14 It turns out that this distribution is dense near unity, with a median of 0.82 and a long tail to the left. Fourth, we compare the dynamics of sectoral inflation rates and main CPI groupings in Table 3. The table reports the number and the relative
13 We use the armax and bj procedures of MATLAB. 14 We signed the modulus so as to distinguish the effect of a negative root from that of a positive one, and also to account for the effect of complex roots.
F. Altissimo et al. / Journal of Monetary Economics 56 (2009) 231–241
Table 3
Summary statistics of estimated parameters for sectorial inflation rates.

                         All series        Largest root > 0.875       Largest root > 0.925
                         #    CPI weight    #    Freq.  CPI weight     #    Freq.  CPI weight
EA3                      404  1.00          142  0.35   0.48           54   0.27   0.26
Germany                  87   0.42          31   0.34   0.23           11   0.12   0.14
France                   147  0.30          55   0.38   0.15           23   0.17   0.06
Italy                    170  0.28          56   0.35   0.11           20   0.12   0.06
Processed food           65   0.14          9    0.14   0.03           2    0.03   0.01
Unprocessed food         42   0.07          7    0.17   0.01           3    0.07   0.00
Non-energy ind. goods    167  0.33          68   0.42   0.14           25   0.15   0.08
Energy goods             18   0.07          8    0.50   0.05           5    0.31   0.03
Services                 114  0.39          50   0.44   0.25           19   0.17   0.14
frequency of series having roots above given thresholds in each sub-category, as well as the total weight of such series in the overall CPI. Looking at the cross-country pattern, highly persistent sub-sectors in Germany account for a much larger share of the overall euro area CPI; this effect is mainly due to the high persistence (AR root of 0.96) of German housing expenditure inflation,^15 which accounts for around 8 percent of the overall CPI. Furthermore, the service sector turns out to be the most relevant for the dynamics of aggregate inflation, both because it has the largest share of highly persistent series and because it has the highest weight in the consumption basket.

4.3. Results for aggregate inflation rate

Given the estimates of the microparameters, we are now in a position to infer the dynamic properties of the aggregate induced by the behavior of the microtime series; this represents the core result of the paper. We consider three different types of aggregation schemes. First, we reconstruct the aggregate as an exact weighted average of the individual microtime series, and in this way we exactly recover the contribution of the common shock to aggregate inflation and to aggregate persistence. This exact aggregation result corroborates the plausibility of the one-dominant-common-shock specification. Second, we exploit the theoretical link between the distribution of the largest AR root of sectoral inflation rates and the autocovariance structure of the aggregate, as presented in Section 2, to infer the dynamic properties of the latter. We shall call this the asymptotic aggregation result; it is the main finding of the paper. Third, we consider a so-called naïve aggregation scheme based on the (wrong) presumption that the aggregate model has the same functional form as the individual models, in our case an ARMA. This exercise warns against failing to account properly for aggregation effects.
Summarizing the results described in detail below, we claim that the analysis of the microdeterminants of aggregate inflation supports the view that aggregate inflation in our sample period can be well described by a stationary long memory process. We will show that, starting from very simple ARMA processes at the microlevel, we are able to properly reconstruct the dynamic properties of the aggregate. We will also show that the microvolatility and low micropersistence can be squared with the aggregate smoothness and persistence.

4.3.1. Exact aggregation

The aggregate inflation data are defined as the weighted average of the sectoral inflation rates.^16 Therefore, using the estimates of the model in (5), it follows that
$\Pi_{n,t} = \sum_{i=1}^{n} w_i \pi_{it} = \sum_{i=1}^{n} w_i \hat{\delta}_{0i} + \hat{u}_t \sum_{i=1}^{n} w_i \frac{\hat{\Theta}_i^u(L)}{\hat{\Phi}_i^u(L)} + \sum_{i=1}^{n} w_i \frac{\hat{\Theta}_i(L)}{\hat{\Phi}_i(L)} \hat{\varepsilon}_{it} = \hat{\delta}_0 + \hat{C}(L)\hat{u}_t + \hat{\xi}_t,$
where the $w_i$ are the CPI weights. Hence, aggregate inflation is decomposed into two components: one associated with the common shock, $\hat{u}_t$, and its propagation mechanism, $\hat{C}(L) = \sum_{i=1}^{n} w_i \hat{C}_i(L)$, viz. a weighted average of the propagation mechanisms at the microlevel; and a second associated with the idiosyncratic component $\hat{\xi}_t$. Fig. 4 shows the reconstructed aggregate, $\Pi_{n,t}$, versus its common component, $\hat{C}(L)\hat{u}_t$. There is a high correlation between the two components, around 0.76, but the former is clearly more volatile. This suggests that the idiosyncratic component $\hat{\xi}_t$ has not been completely washed out at the aggregate level. However, the effect of the idiosyncratic component has been drastically

15 The sub-index is rent (including imputed rents from owner-occupied houses), which accounts for around 20 percent of the German CPI, while it is only 3 percent of the Italian and French CPIs.
16 Statistical offices do not aggregate inflation rates; rather, they first aggregate price indices and then compute the aggregate inflation rate. Here we ignore the possible effect induced by this non-linear transformation, since it is not relevant for the dynamic properties of the aggregate.
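The decomposition is exact by construction: the aggregate equals the weighted common part plus the weighted idiosyncratic part. A minimal numerical sketch, with illustrative AR(1) filters and weights standing in for the estimated sectoral models:

```python
import numpy as np

# Exact aggregation: Pi_t = C(L)u_t + xi_t, with C(L) the weighted
# average of the sectoral filters. Filters, weights, and volatilities
# below are illustrative stand-ins, not the estimated values.
rng = np.random.default_rng(1)
T, n = 500, 3
w = np.array([0.5, 0.3, 0.2])            # CPI weights (sum to 1)
phis = np.array([0.9, 0.5, 0.2])         # sectoral AR roots for the common part

u = rng.standard_normal(T)               # common shock
common_i = np.zeros((n, T))
idio_i = 0.4 * rng.standard_normal((n, T))
for i in range(n):
    for t in range(1, T):
        common_i[i, t] = phis[i] * common_i[i, t - 1] + u[t]

pi = common_i + idio_i                   # sectoral inflation rates
aggregate = w @ pi                       # the reconstructed aggregate
common = w @ common_i                    # common component C(L)u_t
xi = w @ idio_i                          # averaged idiosyncratic part

corr = np.corrcoef(aggregate, common)[0, 1]  # high, but below 1
```

The identity `aggregate == common + xi` holds exactly, while the correlation between the aggregate and its common component stays below one because the averaged idiosyncratic part has not fully washed out.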
Fig. 4. Aggregate CPI inflation (qoq, annualized) versus the estimated common component (annualized).
reduced: the weighted average of the estimated variances of the idiosyncratic components, namely $\sum_{i=1}^{n} w_i \widehat{\operatorname{var}}(\hat{\xi}_{it})$, is equal to 0.2, which is four times larger than the estimated variance of the average of the idiosyncratic components, namely $\widehat{\operatorname{var}}(\hat{\xi}_t) = \widehat{\operatorname{var}}(\sum_{i=1}^{n} w_i \hat{\xi}_{it})$. Since the idiosyncratic component has little persistence, $\hat{C}(L)\hat{u}_t$ can be interpreted as a measure of 'core inflation', possibly relevant for monitoring and forecasting inflation.

4.3.2. Asymptotic aggregation

In this section, we provide a more formal link between microheterogeneity and aggregate persistence based on the arguments introduced in Section 2. However, unlike in the simple example discussed there, we first need to take into account that CPI sub-sectors have different weights in the aggregate. To overcome this problem, we re-weight the 404 maximal AR roots as a function of the relative weights of the respective sectors. Precisely, we bootstrap a sample of 10,000 draws from the 404 roots with relative frequency equal to the weighting scheme; hence roots associated with sectors with a larger weight $w_i$ will be re-sampled more often. For the bootstrapped sample, the distribution of the roots is estimated in two ways, parametrically and non-parametrically. The non-parametric estimate relies on a bounded kernel in order to account for the bounded support of the roots; this non-parametric estimate of the density associated with the simulated sample is plotted in Fig. 3 (blue dashed line). For the parametric case, mirroring the example of Section 2, we fit the Beta distribution to the data:

$f(a \mid \hat{\alpha}, \hat{\beta}) = B^{-1}(\hat{\alpha}, \hat{\beta})\, a^{\hat{\alpha}-1} (1-a)^{\hat{\beta}-1},$   (7)

where the Beta parameter estimates, obtained by maximum likelihood, are $\hat{\alpha} = 4.7$ and $\hat{\beta} = 0.87$ (standard deviations equal to 0.17 and 0.02, respectively). The fitted Beta density is also plotted in Fig. 3 (green solid line). The behavior near unity of the two estimated densities, parametric and non-parametric, is remarkably similar, and in the following we focus on the estimated Beta.^17 We now have all the ingredients to apply the arguments of Section 2. By (3) we immediately get $\hat{d} = 1 - \hat{\beta} = 0.13$, and plugging this value into (2) yields the (estimated) asymptotic behavior of the acf of the common component of the aggregate, $C(L)u_t$:

$\widehat{\operatorname{cov}}(C(L)u_t,\ C(L)u_{t+k}) \simeq c\, k^{2\hat{d}-1} = c\, k^{-0.74} \quad \text{as } k \to \infty.$   (8)
In other words, the common component of aggregate inflation has long memory parameter $\hat{d} = 0.13$. Therefore, the acf of the common component of aggregate inflation decays toward zero hyperbolically, and is thus markedly different from the acf of the sectoral inflation processes. So far we have backed out the aggregate memory from the microdata. One might wonder what the estimate of the memory would be based directly on the aggregate data. It turns out that the estimate of the memory parameter^18 $d$ using only the aggregate $\Pi_{n,t}$ is equal to 0.18, with a standard error of 0.20, which, given the distribution reported in the last column of Table 2, is reasonably close to the value of 0.13 recovered from the microstructure. Therefore, the aggregate data present a long memory behavior that is not present in the microtime series; this long memory appears to be fully accounted for by aggregation.
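The re-weighting and Beta-fitting steps above can be sketched as follows. The roots and weights are simulated stand-ins for the 404 estimates, and the Beta fit uses method of moments instead of the paper's maximum likelihood, purely to keep the sketch self-contained:

```python
import numpy as np

# Weight-adjusted bootstrap of the maximal AR roots, then a Beta fit.
# 'roots' and 'w' are hypothetical placeholders for the 404 estimated
# roots and the CPI weighting scheme.
rng = np.random.default_rng(2)
roots = rng.beta(4.7, 0.87, size=404)     # stand-in for the estimated roots
w = rng.dirichlet(np.ones(404))           # stand-in for CPI weights

# Roots of heavily weighted sectors are re-sampled more often
sample = rng.choice(roots, size=10_000, replace=True, p=w)

# Method-of-moments estimates for Beta(alpha, beta)
m, v = sample.mean(), sample.var()
k = m * (1 - m) / v - 1
alpha_hat, beta_hat = m * k, (1 - m) * k

# The density near unity behaves as (1 - a)**(beta - 1), so the implied
# long-memory parameter of the aggregate is d = 1 - beta (cf. eq. (3))
d_hat = 1 - beta_hat
```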
17 Using a simple regression method, we can show that the non-parametric estimate of the root distribution behaves near unity as $\hat{f}(a) \simeq c(1-a)^{-0.16}$ as $a \to 1^{-}$, which is remarkably close to the tail behavior implied by (7), namely $f(a \mid \hat{\alpha}, \hat{\beta}) \simeq c(1-a)^{-0.13}$ as $a \to 1^{-}$.
18 We have used the log-periodogram regression estimator of Robinson (1995).
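A textbook version of the log-periodogram regression for the memory parameter $d$ can be sketched as follows; this is a generic GPH-style estimator in the spirit of Robinson (1995), not the paper's exact implementation, and the bandwidth choice is illustrative:

```python
import numpy as np

def log_periodogram_d(x, m=None):
    """Estimate the long-memory parameter d by regressing the log
    periodogram on -2*log(lambda_j) over the first m Fourier
    frequencies (a generic GPH-style sketch)."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    m = m or int(T ** 0.5)                      # rule-of-thumb bandwidth
    lam = 2.0 * np.pi * np.arange(1, m + 1) / T
    dft = np.fft.fft(x - x.mean())[1:m + 1]
    I = np.abs(dft) ** 2 / (2.0 * np.pi * T)    # periodogram ordinates
    X, Y = -2.0 * np.log(lam), np.log(I)
    Xc = X - X.mean()
    return float(Xc @ (Y - Y.mean()) / (Xc @ Xc))  # OLS slope = d-hat

# Sanity check: white noise has d = 0, so the estimate should be small
rng = np.random.default_rng(3)
d_wn = log_periodogram_d(rng.standard_normal(4096), m=128)
```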
Fig. 5. Impulse responses: average of the impulse responses of the individual models versus the impulse response of the average (naïve) model.
4.3.3. Naïve aggregation

The above results are framed in terms of the acf, and they show that the difference in persistence between the micro and macrodynamics is not necessarily an inconsistency. The key element there is the aggregate impulse response function $C(L)$, equal to the limit (in mean square) of $\sum_{i=1}^{n} w_i C_i(L)$ (cf. Eq. (4)). Another way to highlight the effects of aggregation on persistence is to consider the following naïve exercise. We construct a hypothetical average ARMA process (hence called naïve), whose roots are the averages of the individual roots of the 404 estimated ARMAs. This means that we are considering the following impulse response function:

$C^{\text{naive}}(L) = \dfrac{(1 + \bar{\theta}_1^u L) \cdots (1 + \bar{\theta}_{q^u}^u L)}{(1 - \bar{\phi}_1^u L) \cdots (1 - \bar{\phi}_{p^u}^u L)}.$   (9)
Here $\bar{\theta}_h^u$, for $1 \le h \le q^u$, is the (cross-sectionally weighted) average of the $h$th root^19 $\hat{\theta}_{ih}^u$ of the MA polynomial $\Theta_i^u(L)$ corresponding to the common shock $u_t$ (cf. model (5)). Similarly, $\bar{\phi}_k^u$, for $1 \le k \le p^u$, is the average of the $k$th root $\hat{\phi}_{ik}^u$ of the AR polynomial $\Phi_i^u(L)$ corresponding to the common shock $u_t$. The difference between $C(L)$ and $C^{\text{naive}}(L)$ arises whenever $p^u > 0$, that is, when there is an autoregressive component, since the impulse response is a nonlinear function of the autoregressive parameters. The coefficients embedded within the impulse response $C(L)$ decay hyperbolically, a symptom of long memory, whereas the coefficients of the naïve impulse response $C^{\text{naive}}(L)$ decay exponentially. The idea of the exercise is to compare the aggregate response to the common shock when the propagation mechanism is equal across agents with the case of heterogeneous propagation mechanisms, i.e., to quantify the effect of heterogeneity and aggregation.^20 Fig. 5 compares the estimated impulse responses^21 implied by $\hat{C}^{\text{naive}}(L)$ and $\hat{C}(L)$, respectively. The exercise is quite instructive. In the hypothetical case of homogeneity of the micropropagation mechanism, a common shock $u_t$ would be completely absorbed after four years (green dashed line), while in reality, owing to the presence of heterogeneity and of some very persistent microunits, around 20 percent of the shock would not yet have been absorbed (blue solid line). In other words, focusing on the naïve model based on (9) leads one to understate the degree of persistence in the data in a quantitatively important way; once again, this result warns against the practice of using average microbehavior to calibrate macrophenomena.

5. Conclusions

In this paper, exploiting the heterogeneity of inflation dynamics across CPI sub-indices within the euro area, we investigate the role played by cross-sectional aggregation in explaining some of the differences observed between micro and macroinflation dynamics.
We focus in particular on the link between CPI sub-indices and the aggregate CPI. Our results are able to square the volatility and low persistence observed, on average, at the level of sectoral inflation with the smoothness and persistence of the aggregate. The persistence that we obtain through this aggregation exercise mimics remarkably well the persistence observed in aggregate inflation. In particular, aggregate inflation turns out to be well described by a stationary long memory process. The persistence of aggregate inflation is mainly due to the high persistence of some

19 Hereafter, we follow the convention of referring to the inverses of the roots as the roots, so that, for example, the root of the polynomial $1 - \phi L$ will be indicated by $\phi$ rather than by $\phi^{-1}$.
20 See also the presentation of similar exercises, in the context of a microfounded model though on hypothetical distributions, by Carvalho (2006).
21 Here $\hat{C}^{\text{naive}}(L)$ is obtained by plugging the sample weighted means $\bar{\theta}_h^u = \sum_{i=1}^{n} w_i \hat{\theta}_{ih}^u$, $1 \le h \le q^u$, and $\bar{\phi}_k^u = \sum_{i=1}^{n} w_i \hat{\phi}_{ik}^u$, $1 \le k \le p^u$, into (9).
sub-indices mainly concentrated in the service sector, such as housing costs in Germany. Altogether, we show the importance of heterogeneity and aggregation for understanding the persistence of inflation at the macroeconomic level. We leave the design and estimation of stylized models of the business cycle that can be consistent with both heterogeneity at the microlevel and the implied persistence at the macrolevel for future research.

References

Altissimo, F., Zaffaroni, P., 2004. Towards understanding the relationship between aggregate fluctuations and individual heterogeneity. Imperial College London, preprint.
Bilke, L., 2005. Break in the mean and persistence of inflation: a sectoral analysis of French CPI. ECB Working Paper 413.
Bils, M., Klenow, P., 2004. Some evidence on the importance of sticky prices. Journal of Political Economy 112, 947–985.
Boivin, J., Giannoni, M., Mihov, I., 2006. Sticky prices and monetary policy: evidence from disaggregated U.S. data. The American Economic Review, forthcoming.
Brockwell, P.J., Davis, R.A., 1987. Time Series: Theory and Methods. Springer, Berlin.
Carvalho, C., 2006. Heterogeneity in price setting and the real effects of monetary shocks. Princeton University, preprint.
Chamberlain, G., Rothschild, M., 1983. Arbitrage, factor structure and mean-variance analysis on large asset markets. Econometrica 51, 1281–1304.
Chari, V., Kehoe, P., McGrattan, E., 2000. Sticky price models of the business cycle: can the contract multiplier solve the persistence problem? Econometrica 68, 1151–1179.
Clark, T., 2006. Disaggregate evidence on the persistence of US consumer price inflation. Journal of Applied Econometrics 21, 563–587.
Corvoisier, S., Mojon, B., 2005. Breaks in the mean of inflation: how they happen and what to do with them. ECB Working Paper 451.
Dhyne, E., Alvarez, L., Le Bihan, H., Veronese, G., Dias, D., Hoffman, J., Jonker, N., Lünnemann, P., Rumler, F., Vilmunen, J., 2005. Price setting in the euro area: some stylised facts from individual consumer price data. ECB Working Paper 523.
Forni, M., Hallin, M., Lippi, M., Reichlin, L., 2000. The generalized dynamic factor model: identification and estimation. The Review of Economics and Statistics 82, 540–554.
Granger, C., 1980. Long memory relationships and the aggregation of dynamic models. Journal of Econometrics 14, 227–238.
Imbs, J., Mumtaz, H., Ravn, M., Rey, H., 2005. PPP strikes back: aggregation and the real exchange rate. Quarterly Journal of Economics 120, 1–43.
Pivetta, F., Reis, R., 2007. The persistence of inflation in the United States. Journal of Economic Dynamics and Control 31, 1326–1358.
Reis, R., Watson, M., 2008. Relative goods' prices, pure inflation, and the Phillips correlation. Princeton University, preprint.
Robinson, P.M., 1978. Statistical inference for a random coefficient autoregressive model. Scandinavian Journal of Statistics 5, 163–168.
Robinson, P.M., 1995. Log-periodogram regression of time series with long range dependence. Annals of Statistics 23, 1048–1072.
Stock, J., Watson, M., 2002. Macroeconomic forecasting using diffusion indices. Journal of Business and Economic Statistics 20, 147–162.
Stock, J., Watson, M., 2005. An empirical comparison of methods for forecasting using many predictors. Harvard, preprint.
Zaffaroni, P., 2004. Contemporaneous aggregation of linear dynamic models in large economies. Journal of Econometrics 120, 75–102.
Journal of Monetary Economics 56 (2009) 242–254
New Keynesian models, durable goods, and collateral constraints

Tommaso Monacelli

Università Bocconi, IGIER and CEPR, Via Salasco 5, 20136 Milan, Italy
Article history: Received 26 September 2008; Accepted 26 September 2008; Available online 14 October 2008

Abstract: Econometric evidence suggests that, in response to monetary policy shocks, durable and non-durable spending co-move positively, and durable spending exhibits a much larger sensitivity to the shocks. A standard two-sector New Keynesian model with perfect financial markets is at odds with these facts. The introduction of a borrowing constraint, where durables play the role of collateral assets, helps in reconciling the model with the empirical evidence. © 2008 Elsevier B.V. All rights reserved.

JEL classification: E52; E62

Keywords: Durable goods; Sticky prices; Collateral constraint
1. Introduction

New Keynesian (NK) models of the last generation, featuring imperfect competition and price stickiness as central building blocks, have recently become a workhorse reference for the analysis of business cycles and monetary policy.^1 Surprisingly, most of these models have largely ignored the role played by durable goods. In the data, the evolution of durable spending in response to monetary shocks is characterized by two main features. First, durable spending co-moves positively with non-durable spending. Second, the sensitivity of durable spending to monetary shocks is significantly larger than that of non-durable spending. A baseline two-sector NK model with perfect financial markets is generally at odds with these facts: if price stickiness is asymmetric in the two sectors, whenever consumption contracts in one sector it tends to expand in the other. The intuition for this theoretical anomaly lies in a distinctive feature of durable goods under perfect financial markets: namely, that their shadow value (which corresponds to the discounted stream of marginal utilities of the durables) is almost constant.^2 This is due to the stock-flow ratio of durables being particularly high, so that a unit of durables does not add much to overall utility at the margin. As a result, durable consumption is very sensitive to variations in the user cost of durables. Hence if durable prices are flexible (sticky) and non-durable prices sticky (flexible), a monetary contraction lowers (increases) the relative price of durables, and almost invariably the user cost, leading consumption to rise in the flexible-price sector and to fall in
$ I thank the editor and an anonymous referee for their extremely useful comments. I would also like to thank Tim Fuerst, Zvi Hercowitz, Alessandro Notarpietro, and Daniele Terlizzese for valuable insights. All errors are my own responsibility. Appendix A, containing supplementary material, is available via ScienceDirect. Tel.: +39 0258363330; fax: +39 0258363332. E-mail address:
[email protected]
URL: http://www.igier.uni-bocconi.it/monacelli
1 To name a few: Goodfriend and King (1997), Rotemberg and Woodford (1997), Clarida et al. (1999), Woodford (2003).
2 See also Barsky et al. (2007).
0304-3932/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.09.013
the sticky-price one. In a nutshell, a positive correlation between the user cost and the relative price of durables is at the heart of the co-movement problem. This paper shows that the presence of credit market frictions can reconcile an otherwise standard NK model with the empirical evidence on the effects of monetary policy shocks on durable and non-durable spending. In our economy, agents with heterogeneous discount factors trade nominal private debt in equilibrium, with the borrowers being subject to a collateral constraint. As a result, the latter do not act as full consumption smoothers, but exhibit preferences tilted towards current consumption. Importantly, their borrowing limit is endogenously tied to the (expected future) value of the stock of durables. This feature has a twofold implication: first, the shadow value of durables is now linked to the shadow value of borrowing (a marginal unit of durables provides an additional service: it allows borrowing to expand); second, the ability to borrow depends also on the evolution of the asset price, i.e., the relative price of durables. To understand the implications of a borrowing constraint for the transmission mechanism, consider a monetary policy contraction, and let (for the sake of exposition) durable prices be more flexible than non-durable prices, so that the relative price of durables falls in response to the shock. By increasing the shadow value of borrowing, an interest rate hike alters the dynamics relative to perfect financial markets in two main respects. First, it breaks the quasi-constancy of the shadow value of durables. This happens because the latter is now a function not only of the (current and future) marginal utility of durables (which is roughly constant, as under perfect financial markets), but also of the shadow value of borrowing, which varies in response to shocks. Second, a policy rate hike increases the user cost of durables, producing a substitution towards non-durable consumption.
In fact, the user cost and the relative price of durables are negatively correlated under credit market imperfections, in stark contrast with the case of perfect financial markets. The latter effect also helps in reconciling the model with the evidence that durable consumption is the component of spending most sensitive to monetary policy shocks. Asset price movements reinforce the collateral-constraint channel described above. When a monetary policy contraction lowers the relative price of durables, it also lowers the collateral value of the durable stock, thereby affecting the borrowing capability also on the extensive margin. The latter effect is at work, for instance, when durable goods prices are assumed to be relatively more flexible than non-durable prices. The role of durable goods in NK models has only recently received some attention. Erceg and Levin (2006) study optimal monetary policy in a sticky-price model with durable and non-durable goods, but without a borrowing constraint. In a similar environment, Barsky et al. (2007) analyze the transmission of monetary shocks and argue that it is largely affected by the assumption on the degree of stickiness of durable goods prices. Our analysis is related to their work, in that it shows that the critical role played by the stickiness (or lack thereof) of durable goods prices can be de-emphasized by the introduction of credit market imperfections. Campbell and Hercowitz (2006) study the role of collateralized debt in a business cycle model, but their analysis is confined to a one-sector, real business cycle model.^3

2. Monetary shocks and durable spending: the evidence

In this section we document two stylized features that characterize the dynamic evolution of durable and non-durable spending in response to (identified) monetary policy shocks. First, durable spending co-moves positively with non-durable spending in response to those shocks.
Second, the sensitivity of durable spending to policy shocks is significantly larger than that of non-durable spending. This evidence complements that in Erceg and Levin (2006) and Barsky et al. (2007) by also documenting the behavior of household debt. To assess the impact of monetary policy shocks, we estimate a quarterly VAR model for the U.S. economy specified as follows:
$Y_t = \sum_{j=1}^{L} A_j Y_{t-j} + B E_t,$   (1)
where $E_t$ is a vector of contemporaneous disturbances. The vector $Y_t$ comprises six variables: (i) real GDP, (ii) real durable consumption, (iii) real non-durable consumption and services, (iv) the GDP deflator, (v) total real household debt (the sum of mortgage and consumer credit debt), and (vi) the federal funds rate. Except for the funds rate, all variables are in logs and have been deflated by the GDP deflator.^4 The VAR system features a constant, a time trend, and four lags, and is estimated over the sample 1952:1–2007:2. To identify a monetary policy shock, we resort to a standard recursive identification scheme (Christiano et al., 1999). Fig. 1 displays estimated responses of real GDP, real non-durable spending, real durable spending, and total private debt to a one-standard-deviation innovation in the federal funds rate. Dashed lines represent two-standard-error bands. Both components of spending and GDP react negatively to the policy tightening. The smooth and persistent response of these variables is in line with a widespread recent empirical evidence (Rotemberg and Woodford, 1997; Christiano et al., 1999). Household debt is also observed to fall very gradually in response to the shock. Importantly, the fall

3 Most recently, a paper by Carlstrom and Fuerst (2006) addressing the co-movement problem, although written independently, has been brought to my attention. That paper differs from mine in that it emphasizes the possibility that sticky wages, coupled with adjustment costs in the durable sector, may be a candidate explanation for the co-movement problem.
4 The sources of the data are the FRB Flow of Funds and FRED.
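The estimation and identification just described reduce to OLS equation by equation plus a Cholesky factorization of the residual covariance. A bare-bones sketch on simulated placeholder data (simplified here to a constant and two lags, instead of the constant, trend, and four lags used in the text):

```python
import numpy as np

# VAR(L) by OLS with recursive (Cholesky) identification.
# Y is fake persistent data; the paper's system has six variables,
# with the federal funds rate ordered last.
rng = np.random.default_rng(4)
T, k, L = 300, 3, 2
Y = 0.01 * rng.standard_normal((T, k)).cumsum(axis=0)

# Regressors: [1, Y_{t-1}, ..., Y_{t-L}] for t = L, ..., T-1
X = np.hstack([np.ones((T - L, 1))] +
               [Y[L - j - 1:T - j - 1] for j in range(L)])
Ydep = Y[L:]
A, *_ = np.linalg.lstsq(X, Ydep, rcond=None)     # OLS coefficients
resid = Ydep - X @ A
Sigma = resid.T @ resid / (len(resid) - X.shape[1])
B = np.linalg.cholesky(Sigma)                    # impact matrix

# B is lower triangular, so the shock ordered last moves only the last
# variable on impact, as in a standard recursive scheme.
```

Impulse responses then follow by iterating the estimated companion form on the columns of B; confidence bands require bootstrap or Monte Carlo draws, which are omitted here.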
Fig. 1. Estimated impulse responses to a monetary policy tightening (sample 1952:1–2007:2; dashed lines are two-standard-error bands; see text for the VAR model specification). Panels: Real GDP; Real Consumption Durables; Real Consumption ND and Services; Real Total Household Debt.
in durable spending peaks earlier than that of non-durables, and is almost three times larger at the peak. These results are robust to alternative orderings, to fewer or additional lags, and to the introduction of alternative variables.

3. The model

The economy is composed of a continuum of households in the interval $(0,1)$. There are two types of households, named borrowers and savers, of measure $\omega$ and $1-\omega$, respectively. Each individual household's time endowment is normalized to 1. There are also two sectors (producing durable and non-durable goods, respectively), each populated by a large number of monopolistically competitive firms. The two types of households have heterogeneous preferences, with the borrower being more impatient than the saver.^5 All households derive utility from consumption of a non-durable final good and from the services of a durable final good. Debt accumulation reflects intertemporal trading between the borrowers and the savers. The borrowers are subject to a collateral constraint, with the borrowing limit tied to the expected future value of the stock of durables. Notably, in our context, the borrowers differ from so-called "rule-of-thumb" (or Keynesian) consumers (see, e.g., Galí et al., 2007). While the latter are, by assumption, myopic agents who consume only out of their current income, the former are rational forward-looking agents whose ability to smooth consumption intertemporally is limited by the constraint on borrowing (Chah et al., 1995).

3.1. Final good producers

In each sector $(j = c, d)$ a perfectly competitive final good producer purchases $Y_{j,t}(i)$ units of intermediate good $i$. Each producer in sector $j$ operates the production function

$Y_{j,t} \equiv \left( \int_0^1 Y_{j,t}(i)^{(\epsilon_j - 1)/\epsilon_j}\, di \right)^{\epsilon_j/(\epsilon_j - 1)}, \quad \epsilon_j > 1,\ j = c, d,$   (2)
where $Y_{j,t}(i)$ is the quantity of intermediate good $i$ demanded by final good producer $j$ and $\epsilon_j$ is the elasticity of substitution between differentiated varieties in sector $j$. Notice, in particular, that in the durable good sector $Y_{d,t}(i)$ refers to expenditure on the new durable intermediate good $i$ (rather than services). Maximization of profits yields demand functions

5 For earlier models with heterogeneity in discount rates, see Becker (1980), Becker and Foias (1987), Kiyotaki and Moore (1997), Krusell and Smith (1998), Iacoviello (2005), Campbell and Hercowitz (2006).
for the typical intermediate good $i$ in sector $j$:

$Y_{j,t}(i) = \left( \dfrac{P_{j,t}(i)}{P_{j,t}} \right)^{-\epsilon_j} Y_{j,t}, \quad j = c, d,$   (3)

for all $i$. In particular, $P_{j,t} \equiv \left( \int_0^1 P_{j,t}(i)^{1-\epsilon_j}\, di \right)^{1/(1-\epsilon_j)}$ is the price index consistent with the final good producer in sector $j$ earning zero profits.^6

3.2. Borrowers

A typical borrower consumes an index of consumption services of durable and non-durable final goods, defined as

$X_t \equiv \left[ (1-\alpha)^{1/\eta} (C_t)^{(\eta-1)/\eta} + \alpha^{1/\eta} (D_t)^{(\eta-1)/\eta} \right]^{\eta/(\eta-1)},$   (4)
where $C_t$ denotes consumption of the final non-durable good, $D_t$ denotes services from the stock of the final durable good at the end of period $t$, $\alpha > 0$ is the share of durable goods in the composite consumption index, and $\eta \ge 0$ is the elasticity of substitution between the services of non-durable and durable goods.^7 The borrower maximizes the utility program

$E_0 \left\{ \sum_{t=0}^{\infty} \beta^t U(X_t, N_t) \right\}$   (5)
subject to the sequence of budget constraints (in nominal terms):

$P_{c,t} C_t + P_{d,t} (D_t - (1-\delta) D_{t-1}) + R_{t-1} B_{t-1} = B_t + W_t N_t + T_t,$   (6)
where $E_t\{\cdot\}$ denotes (conditional) expectations at any given period $t$, $B_t$ is end-of-period-$t$ nominal one-period debt, $R_{t-1}$ is the nominal lending rate on loan contracts stipulated at time $t-1$, $W_t$ is the nominal wage, $N_t$ is total labor supply, and $T_t$ are lump-sum government transfers/taxes. Labor is assumed to be perfectly mobile across sectors, implying that the nominal wage rate is common across sectors. In real terms (units of non-durable consumption), (6) reads

$C_t + q_t (D_t - (1-\delta) D_{t-1}) + R_{t-1} \dfrac{b_{t-1}}{\pi_{c,t}} = b_t + \dfrac{W_t}{P_{c,t}} N_t + \dfrac{T_t}{P_{c,t}},$   (7)
where $q_t \equiv P_{d,t}/P_{c,t}$ is the relative price of the durable good, $\pi_{c,t} \equiv P_{c,t}/P_{c,t-1}$ is non-durable good (gross) inflation, and $b_t \equiv B_t/P_{c,t}$ is real debt (in units of non-durables). Below we specialize the form of the utility function as follows:

$$U(X_t, N_t) = \log(X_t) - \nu\frac{N_t^{1+\varphi}}{1+\varphi} \quad (8)$$

where $\varphi$ is the inverse elasticity of labor supply and $\nu$ is a parameter that indexes the preference for hours worked of each agent. Private borrowing is subject to an endogenous limit.^{8} At any time $t$, the amount that the borrower agrees to repay in the following period, $R_t B_t$, is tied to the expected future value of the durable stock (after depreciation):

$$R_t B_t \leq (1-\chi)(1-\delta)E_t\{D_t P_{d,t+1}\} \quad (9)$$
where $\chi$ is the fraction of the durable good value that cannot be used as collateral. Notice that expected movements in the relative price of durables directly affect the ability to borrow. This channel will be important in evaluating the transmission of monetary policy shocks in the model. The form of constraint (9) can be rationalized in terms of limited enforcement. Although debt repudiation is in principle feasible for the borrower, this option would entail losing the entire current value of the asset. Hence the provision of collateral acts against that temptation. We assume that, in a neighborhood of the deterministic steady state, equation (9) is always satisfied with equality.^{9} Hence we can rewrite the collateral constraint in real terms (i.e., in units of non-durable consumption) as follows:

$$b_t = (1-\chi)(1-\delta)E_t\left\{\frac{D_t q_{t+1}}{R_t/\pi_{c,t+1}}\right\} \quad (10)$$

^{6} Hence the problem of the final good producer $j$ is $\max\; P_{j,t}Y_{j,t} - \int_0^1 P_{j,t}(i)Y_{j,t}(i)\,di$ subject to (2).
^{7} Throughout the paper we will refer to $D_t$ as durable services (i.e., the end-of-period stock of durables) and define durable investment as the flow variable $D_t - (1-\delta)D_{t-1}$.
^{8} See Kiyotaki and Moore (1997) and Kocherlakota (2000).
^{9} This condition is always satisfied in the steady state (see below). The assumption that it continues to hold also in the neighborhood of the steady state will allow us to employ standard local approximation methods when analyzing equilibrium dynamics. In turn, this will require a bound on the amplitude of the stochastic driving forces in the model. Notice that, although the constraint is assumed to hold with equality, at least locally, variations in its tightness will still be measurable in terms of its corresponding shadow value.
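To fix ideas, the borrowing limit (10) can be evaluated numerically. A minimal sketch, with expectations replaced by point forecasts; the function name and the input values are ours, chosen purely for illustration:

```python
def borrowing_limit(D, q_next, R, pi_c_next, chi, delta):
    """Real borrowing limit from Eq. (10):
    b_t = (1 - chi)(1 - delta) * D_t * q_{t+1} / (R_t / pi_{c,t+1})."""
    return (1 - chi) * (1 - delta) * D * q_next / (R / pi_c_next)

# Illustrative inputs: unit durable stock, constant prices, a 1% nominal rate.
b = borrowing_limit(D=1.0, q_next=1.0, R=1.01, pi_c_next=1.0, chi=0.25, delta=0.01)
```

A higher down-payment fraction $\chi$ or a lower expected resale price $q_{t+1}$ mechanically tightens the limit, which is exactly the relative-price channel stressed in the text.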
Given initial values $\{b_{-1}, D_{-1}\}$, the borrower chooses $\{N_t, b_t, D_t, C_t\}$ to maximize (5) subject to Eqs. (7) and (10). Defining $\lambda_t$ and $\lambda_t\psi_t$ as the multipliers on constraints (7) and (10), respectively, and $U_{i,t}$ as the marginal utility of variable $i = C, N, D$, the efficiency conditions for the above program read

$$-\frac{U_{n,t}}{U_{c,t}} = \frac{W_t}{P_{c,t}} \quad (11)$$

$$U_{c,t} = \lambda_t \quad (12)$$

$$q_t U_{c,t} = U_{d,t} + \beta(1-\delta)E_t\{U_{c,t+1}q_{t+1}\} + (1-\chi)(1-\delta)U_{c,t}q_t\psi_t E_t\{\pi_{d,t+1}\} \quad (13)$$

$$\psi_t = 1 - \beta E_t\left\{\frac{U_{c,t+1}}{U_{c,t}}\frac{R_t}{\pi_{c,t+1}}\right\} \quad (14)$$
Eq. (11) is a standard condition linking the real wage (in units of non-durables) to the borrower's marginal rate of substitution between consumption and leisure. In (12), the borrower's marginal utility of consumption is equated to the shadow value of relaxing the flow budget constraint (7). Eq. (13) requires the borrower to equate the marginal utility of non-durable consumption to the shadow value of durable services. The latter depends on three components: (i) the direct utility gain of an additional unit of durable; (ii) the expected utility stemming from the possibility of expanding future consumption by means of the realized resale value of the durable purchased in the previous period; and (iii) the marginal utility of relaxing the collateral constraint, which is proportional to $\psi_t$ (recall that the impatient agent can obtain new debt only by acquiring durables). Notice that, in the case in which the borrowing constraint is not binding ($\psi_t = 0$ for all $t$), the shadow value of a unit of durable good comprises only terms (i) and (ii). Eq. (14) is a modified version of a typical Euler equation; indeed, it reduces to a standard Euler condition in the case of $\psi_t = 0$ for all $t$. Consider, for the sake of argument, $\psi_t$ rising from zero to a positive value (for any given $R_t$). This implies, from (14), that $U_{c,t} > \beta E_t\{U_{c,t+1}R_t/\pi_{c,t+1}\}$. In other words, the marginal utility of current consumption exceeds the marginal gain of shifting one unit of consumption intertemporally. The higher $\psi_t$, the higher the net marginal benefit of acquiring a unit of the durable asset today, which in turn allows the borrower, by relaxing the collateral constraint at the margin, to purchase additional current consumption. Hence a rise in $\psi_t$ signals a tightening of the collateral constraint.
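The wedge $\psi_t$ in (14) is easy to compute directly. A minimal sketch (point-forecast version of the expectation; parameter values are the baseline ones used later in the paper, and the function name is ours):

```python
def psi(uc_growth, R, pi_c_next, beta=0.98):
    """Shadow value of the collateral constraint, Eq. (14):
    psi_t = 1 - beta * (U_{c,t+1}/U_{c,t}) * R_t / pi_{c,t+1}."""
    return 1 - beta * uc_growth * R / pi_c_next

# With constant consumption (uc_growth = 1), zero inflation, and R = 1/gamma,
# psi is positive whenever the borrower is less patient than the saver:
gamma = 0.99
psi_ss = psi(uc_growth=1.0, R=1 / gamma, pi_c_next=1.0)    # > 0 since beta < gamma
psi_equal = psi(uc_growth=1.0, R=1 / 0.98, pi_c_next=1.0)  # = 0 when beta = "gamma"
```

The second line illustrates the knife-edge case: with equal discount factors the wedge vanishes and (14) collapses to a standard Euler equation.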
User cost: An alternative interpretation of condition (13) is that it requires the borrower to equate the marginal rate of substitution between durable and non-durable consumption, $U_{d,t}/U_{c,t}$, to the user cost of durables $Z_t$, which in this case reads

$$Z_t \equiv q_t\left[1 - (1-\chi)(1-\delta)\psi_t E_t\{\pi_{d,t+1}\}\right] - \beta(1-\delta)E_t\left\{\frac{U_{c,t+1}}{U_{c,t}}q_{t+1}\right\} \quad (15)$$

Under perfect financial markets ($\psi_t = 0$ for all $t$), movements in the user cost are typically dominated by (current and expected) variations in the asset price $q_t$ (Erceg and Levin, 2006). A typical feature of the model with a borrowing constraint is instead that movements in the user cost are also affected by the shadow value $\psi_t$. A de-linking between the user cost $Z_t$ and the asset price $q_t$ will be a defining feature of the dynamics under credit market imperfections (see more below). To understand this point, consider a log-linear approximation of Eqs. (13) and (14) around the deterministic steady state.^{10} Using a hat to denote percent deviations from corresponding steady-state values, we can write the following expression for the user cost of durables:

$$\hat{Z}_t = \Phi^{-1}(1-\delta)\left[\Gamma\hat{q}_t - \beta E_t\{\hat{q}_{t+1}\} + \beta\hat{r}_t + (\gamma-\beta)(\chi\hat{\psi}_t - \hat{z}_t)\right] \quad (16)$$

where

$$\Gamma \equiv \frac{1 - (1-\chi)(1-\delta)(\gamma-\beta)}{1-\delta}, \qquad \Phi \equiv 1 - (1-\delta)\left[\beta + (1-\chi)(\gamma-\beta)\right]$$

$\hat{r}_t \equiv \hat{R}_t - E_t\{\hat{\pi}_{c,t+1}\}$ is the (ex ante) real interest rate in units of non-durables, and $\hat{z}_t \equiv E_t\{(1-\chi)\hat{\pi}_{d,t+1} - \hat{\pi}_{c,t+1}\}$ is a composite term in sectoral inflation. Notice that in the case $\gamma = \beta$, i.e., when heterogeneity in patience rates vanishes (and the collateral constraint becomes locally non-binding, see more below), the expression for the user cost simplifies to

$$\hat{Z}_t \simeq \left[1-\beta(1-\delta)\right]^{-1}\left[\hat{q}_t + \beta(1-\delta)(\hat{r}_t - E_t\{\hat{q}_{t+1}\})\right] \quad (17)$$
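The nesting of (17) inside (16) can be cross-checked numerically. A minimal sketch, assuming the log-linear coefficients as reconstructed above; function names and the illustrative deviations are ours:

```python
def zhat(qhat, qhat_next, rhat, psihat, zhat_infl, beta, gamma, delta, chi):
    """User-cost deviation from Eq. (16), expectations replaced by point forecasts."""
    Gamma = (1 - (1 - chi) * (1 - delta) * (gamma - beta)) / (1 - delta)
    Phi = 1 - (1 - delta) * (beta + (1 - chi) * (gamma - beta))
    return (1 - delta) / Phi * (Gamma * qhat - beta * qhat_next + beta * rhat
                                + (gamma - beta) * (chi * psihat - zhat_infl))

def zhat_frictionless(qhat, qhat_next, rhat, beta, delta):
    """Eq. (17): the gamma = beta limit of Eq. (16)."""
    return (qhat + beta * (1 - delta) * (rhat - qhat_next)) / (1 - beta * (1 - delta))
```

At $\gamma = \beta$ every $(\gamma-\beta)$-weighted term drops out and the two functions coincide; with $\gamma > \beta$, the coefficient on $\hat{\psi}_t$ is exactly the object appearing in condition (18) below.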
Under perfect financial markets, movements in $\hat{Z}_t$ depend positively on the current relative price of durables, but negatively on the expected future price of durables. Intuitively, current demand for durables rises when the expected future price rises, due to the expected asset appreciation. This feature vanishes for $\delta \to 1$, i.e., when durability disappears. Also, the user cost depends positively on the real interest rate, for the latter reflects the opportunity cost of investing in the durable good. Finally, depreciation raises the user cost, because it physically erodes the investment in the durable good.

In the presence of a collateral constraint (Eq. (16)), the expression for the user cost is affected by an additional element, namely the multiplier $\psi_t$. As hinted above, a rise in $\psi_t$ signals that the collateral constraint is tighter, for the higher would be the marginal value for the borrower of tilting the consumption plan towards current consumption. By inspecting equation (16) it is clear that a rise in $\hat{\psi}_t$ induces (ceteris paribus) a rise in $\hat{Z}_t$ if

$$\Phi^{-1}(1-\delta)(\gamma-\beta)\chi > 0 \quad (18)$$

which in turn requires $\Phi > 0$. Notice that condition (18) is always satisfied in the limit case of $\delta = 0$. In that case, in fact, we have

$$\Phi|_{\delta=0} \equiv (1-\gamma) + \chi(\gamma-\beta) > 0$$

In the more general case of $\delta$ small but positive, the same condition is more easily satisfied: (i) the lower the depreciation rate $\delta$ (higher durability); (ii) the higher the inverse LTV ratio $\chi$ (and therefore the lower the ability to translate the value of the collateral into new debt); and (iii) the higher the saver's patience rate $\gamma$ (for any given borrower's patience rate $\beta$), and therefore the stronger the heterogeneity in patience rates. For instance, condition (18) is always satisfied under our baseline parameterization (see below).

3.3. Savers

The economy is composed of a second category of consumers, labeled savers. We assume that the typical saver is the owner of the monopolistic firms in each sector. He/she maximizes the utility program $E_0\{\sum_{t=0}^{\infty}\gamma^t U(\tilde{X}_t, \tilde{N}_t)\}$. The key feature that distinguishes the saver's behavior is the discount factor: we assume that the saver is more patient than the borrower, implying $\gamma > \beta$. Since the saver's optimal program is standard, we refrain from presenting it here and refer the interested reader to the supplementary material in Appendix A.

3.4. Production and pricing of intermediate goods

A typical intermediate good firm $i$ in sector $j$ hires labor (supplied by the borrowers) to operate a linear production function:

$$Y_{j,t}(i) = N_{j,t}(i) \quad (19)$$

^{10} In particular, a sectoral zero-inflation steady state in which the relative price of durables is normalized to 1. See below for a more detailed characterization of the steady state.
where $N_{j,t}(i)$ is total demand for labor by firm $i$ in sector $j$ and, for simplicity, labor productivity is assumed to be constant and normalized to 1 in both sectors. Each firm $i$ has monopolistic power in the production of its own variety and therefore has leverage in setting its price. In so doing, it faces a quadratic cost proportional to output and equal to $(\vartheta_j/2)(P_{j,t}(i)/P_{j,t-1}(i) - 1)^2 Y_{j,t}$, where the parameter $\vartheta_j$ measures the degree of sectoral nominal price rigidity. In the particular case of $\vartheta_j = 0$, prices are flexible. The problem of each monopolistic firm in sector $j$ is similar to the one presented, e.g., in Ireland (2003) for a one-sector economy. Up to a first-order approximation, that problem leads to a forward-looking ("NK") Phillips curve in each sector.^{11} Here we simply recall that in the particular case of flexible prices (in both sectors), the real marginal cost must be constant and equal to the inverse steady-state markup $(\varepsilon_j - 1)/\varepsilon_j$. In this case, the pricing conditions read

$$-\frac{U_{n,t}}{U_{c,t}} = \frac{\varepsilon_c - 1}{\varepsilon_c} \quad \text{if } j = c \quad (20)$$

$$-\frac{U_{n,t}}{U_{c,t}}\,\frac{1}{q_t} = \frac{\varepsilon_d - 1}{\varepsilon_d} \quad \text{if } j = d \quad (21)$$
Notice that, in the durable sector, the real marginal cost is directly affected by movements in the relative price.

3.5. Monetary policy

We assume that monetary policy is conducted by means of a simple Taylor-type rule:

$$R_t = R\left(\frac{\tilde{\pi}_t}{\tilde{\pi}}\right)^{\phi_{\pi}}\exp(\varepsilon_t), \quad \phi_{\pi} > 1 \quad (22)$$

^{11} See also Galí and Gertler (1999).
where $R$ is the steady-state gross nominal interest rate, $\tilde{\pi}_t \equiv \pi_{c,t}^{1-\alpha}\pi_{d,t}^{\alpha}$ is a composite inflation index, and $\varepsilon_t$ is a policy shock which is assumed to evolve according to

$$\exp(\varepsilon_t) = \exp(\varepsilon_{t-1})^{\rho}\,u_t$$

with $u_t$ i.i.d. and $0 < \rho < 1$.

3.6. Market clearing

Equilibrium in the goods market of sector $j = c,d$ requires that the production of the final good be allocated to total households' expenditure and to the resource costs originating from the adjustment of prices:

$$Y_{c,t} = \omega C_t + (1-\omega)\tilde{C}_t + \frac{\vartheta_c}{2}(\pi_{c,t}-1)^2 Y_{c,t} \quad (23)$$

$$Y_{d,t} = \omega(D_t - (1-\delta)D_{t-1}) + (1-\omega)(\tilde{D}_t - (1-\delta)\tilde{D}_{t-1}) + \frac{\vartheta_d}{2}(\pi_{d,t}-1)^2 Y_{d,t} \quad (24)$$
where $Y_{j,t} \equiv \int_0^1 Y_{j,t}(i)\,di = \int_0^1 N_{j,t}(i)\,di = N_{j,t}$ for $j = c,d$. Equilibrium in the debt and labor markets requires, respectively,

$$\omega B_t + (1-\omega)\tilde{B}_t = 0 \quad (25)$$

$$\sum_j N_{j,t} = \omega N_t + (1-\omega)\tilde{N}_t \quad (26)$$

Finally, we abstract from redistribution via fiscal policy. Hence we set the transfer structure as $T_t = \tilde{T}_t = 0$.

4. Deterministic steady state

In the deterministic steady state we assume that inflation is zero in both sectors. Hence $R$ corresponds to the real (gross) rate of interest, and is pinned down by the savers' discount rate, $R = \gamma^{-1}$, via their (standard) consumption Euler condition. Due to the assumed heterogeneity in discount rates (i.e., $\beta < \gamma$), the shadow value of borrowing is always positive. In other words, the borrower will always choose to hold a positive amount of debt. To show this, we evaluate (14) in the steady state (under the assumption of zero inflation) and obtain
$$\psi = \gamma - \beta > 0$$
(27)
Notice that, to ensure a well-defined steady state, both heterogeneity in patience rates and a borrowing limit are required. If discount rates were equal, the steady-state level of debt would be indeterminate (Becker, 1980; Becker and Foias, 1987): in that case $\beta/\gamma = \beta R = 1$ would hold, and the economy would display a well-known problem of dependence of the steady state on the initial conditions.^{12} With different discount rates, and yet still perfect financial markets, the consumption path of the borrower would be tilted downward, and the ratio of consumption to income would asymptotically shrink to zero.^{13} Hence a binding collateral constraint allows a constant consumption path to be compatible with heterogeneity in discount rates. One can show that, under the assumption $\beta < \gamma$, the steady-state level of debt $b$ is stable, i.e., the economy converges to a unique positive finite value $b$ starting from any initial value different from $b$ (see Appendix A). By evaluating (13) in the steady state, and combining with (27), we obtain the borrower's relative consumption of durables:

$$\frac{D}{C} = \frac{\alpha}{1-\alpha}\left\{q\left[1 - (1-\delta)\left(\beta + (1-\chi)(\gamma-\beta)\right)\right]\right\}^{-\eta} \quad (28)$$

If $\delta \to 1$ (no durability), and/or $\gamma = \beta$ (which implies $\psi = 0$ and therefore a non-binding collateral constraint), the durable/non-durable margin depends only on the relative price $q$. Notice that a rise in the down-payment parameter $\chi$ induces a fall in the relative demand for durables. Intuitively, if the ability to transform the collateral into new debt is diminished, durables become less attractive.

^{12} In other words, under $\beta = \gamma$, the economy would constantly replicate the initial (arbitrary) distribution of wealth forever. This problem is analogous to the one that typically arises in small open economies with incomplete markets.
^{13} In this case the assumption $\beta < \gamma$ is equivalent to $\beta R < 1$. In the absence of exogenous growth, this implies that the (gross) growth rate of consumption ($\beta R$) is below the (gross) growth rate of income (which is 1). Hence, the ratio of consumption to output must shrink over time.
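The comparative static just described — a higher $\chi$ lowering the relative demand for durables — can be verified directly from (28). A minimal sketch (function name and inputs are ours; $\alpha$ here is purely illustrative, not the calibrated value):

```python
def durable_ratio(alpha, eta, q, beta, gamma, delta, chi):
    """Borrower's steady-state durable/non-durable ratio, Eq. (28):
    D/C = alpha/(1-alpha) * { q [1 - (1-delta)(beta + (1-chi)(gamma-beta))] }^(-eta)."""
    Z = q * (1 - (1 - delta) * (beta + (1 - chi) * (gamma - beta)))
    return alpha / (1 - alpha) * Z ** (-eta)
```

Raising $\chi$ raises the steady-state user cost $Z$ and so, for any $\eta > 0$, lowers $D/C$ — the fall in the relative demand for durables noted in the text.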
5. Calibration and solution method

The steady-state real rate of interest is pinned down by the saver's discount factor $\gamma$. We choose an annual real rate of return of 4 percent. This implies $(1/\gamma)^4 = 1.04$, and in turn $\gamma = 0.99$. As in Krusell and Smith (1998), we set the borrower's discount factor $\beta = 0.98$. We choose an annual depreciation rate for durable goods of 4 percent, hence $\delta = 0.04/4 = 0.01$. We set $\chi = 0.25$, corresponding to a loan-to-value ratio of 75 percent. The share of durable consumption in the aggregate spending index, governed by $\alpha$, is set in such a way that the steady-state share of durable spending in total private spending is 0.2 (see NIPA tables). The elasticity of substitution between varieties $\varepsilon_j$ is set equal to 6 in both sectors, implying a steady-state markup of 20%. The elasticity of substitution between non-durable and durable services is set to $\eta = 1$, implying a Cobb–Douglas consumption index for $X_t$.^{14} The inverse elasticity of labor supply $\varphi$ is set equal to 1. We set the degree of nominal rigidity in non-durable prices $\vartheta_c$ to generate a frequency of price adjustment of about four quarters. To pin down this value we proceed as follows. Let $\theta$ be the probability of not resetting prices in the standard Calvo–Yun model.^{15} We parameterize $1/(1-\theta) = 4$, which implies $\theta = 0.75$, and therefore an average frequency of price adjustment of one year. Log-linearization of the optimal pricing condition in each sector yields a slope of the Phillips curve equal to $(\varepsilon_j - 1)/\vartheta_j$, whereas the slope of the Phillips curve in the Calvo–Yun model reads $(1-\theta)(1-\gamma\theta)/\theta$. After setting the elasticity $\varepsilon_j$, the resulting stickiness parameter satisfies $\vartheta_j = \theta(\varepsilon_j - 1)/[(1-\theta)(1-\gamma\theta)]$. A price rigidity of four quarters is a standard calibration in the recent literature. Bils and Klenow (2004) document that prices of durable goods are generally more flexible than those of non-durable goods.
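The mapping from the Calvo probability $\theta$ to the Rotemberg parameter $\vartheta_j$, together with condition (18), can be checked under the baseline numbers. A small script (variable names are ours):

```python
gamma, beta, delta, chi = 0.99, 0.98, 0.01, 0.25  # baseline values from this section
theta, eps = 0.75, 6.0                            # Calvo probability, elasticity of substitution

# Rotemberg stickiness implied by equating the sectoral Phillips-curve slopes:
vartheta = theta * (eps - 1) / ((1 - theta) * (1 - gamma * theta))

# Condition (18): the coefficient on the constraint multiplier in the user cost.
Phi = 1 - (1 - delta) * (beta + (1 - chi) * (gamma - beta))
cond18 = (1 - delta) * (gamma - beta) * chi / Phi
```

Under these values $\vartheta_j \approx 58$, and both $\Phi$ and the condition-(18) coefficient are positive, confirming the claim above that the baseline parameterization satisfies (18).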
Nakamura and Steinsson (2008), however, do not report any systematic evidence of larger flexibility of durable prices, and estimate an average frequency of adjustment of about four quarters regardless of durability. In our simulations, then, we experiment with alternative values for the degree of stickiness in durables, ranging from full flexibility ($\vartheta_d = 0$) to sizeable stickiness ($\vartheta_d = \vartheta_c$). As for the monetary policy rule, we set $\phi_{\pi} = 1.5$, which is a standard value in the literature on Taylor rules, and the shock persistence parameter $\rho = 0.5$. When we analyze the economy with heterogeneous agents we normalize the steady-state level of hours worked such that each individual agent chooses to work one-third of her time endowment. This does not entail that "effective" hours worked are the same for the two agents, for the preference parameter $\nu$ will endogenously differ across agents to ensure that both categories choose to work exactly that share of their time endowment. Our solution method consists of taking a log-linear approximation of the equilibrium conditions in the neighborhood of the deterministic steady state, in which condition (27) holds and Eq. (10) is therefore satisfied with equality, at least locally. This local approximation method is accurate to the extent that we limit the exogenous process $\{\varepsilon_t\}$ to be bounded in the neighborhood of the steady state, an assumption which appears reasonable at least in the case of monetary policy shocks.

6. Co-movement problem under perfect financial markets

We start by studying the benchmark case of a standard NK model with perfect financial markets, simply augmented by the presence of a durable goods sector. This version of the model is obtained by evaluating the system of first order conditions (11)–(14) in the particular case of $\psi_t = 0$. A twofold anomaly emerges.
First, when durable prices are flexible, the response of durable spending to a policy shock is countercyclical (and co-moves negatively with non-durable spending). Second, when durable prices are assumed to be sticky, durable consumption correctly contracts in response to a policy tightening, but still exhibits the wrong co-movement with consumption in the non-durable sector. Both results are at odds with the empirical evidence reported in the early part of the paper. Let $V_t \equiv U_{c,t}q_t$ denote the shadow value of one unit of durables. To understand the effect of the policy shock, it is important to recall that a key property of durability is that, under perfect financial markets, $V_t$ is almost constant. In fact, after iterating (13) forward, we can write

$$V_t \equiv U_{c,t}q_t = E_t\left\{\sum_{j=0}^{\infty}\left[\beta(1-\delta)\right]^j U_{d,t+j}\right\} \simeq \text{const.} \quad (29)$$
Hence the right-hand side of (29) depends only on (current and expected future values of) the marginal utility of durables $U_{d,t}$. Given that, for sufficiently small $\delta$, the stock–flow ratio is high for durables, $U_{d,t}$ is a very smooth process. This entails that, for $V_t$ to be constant, any variation in the relative price of durables must be matched by a variation in the marginal utility of non-durable consumption, $U_{c,t}$, of the opposite sign, and therefore by a variation in non-durable consumption of the same sign. In the particular case in which prices are equally flexible in both sectors, but also in the case in which they are equally sticky, non-durable consumption is (almost) invariant to monetary shocks. We will see later that the introduction of a borrowing constraint alters the quasi-constancy of the shadow value $V_t$.

^{14} Ogaki and Reinhart (1998) estimate values for $\eta$ slightly above unity. Qualitatively, however, our results will not hinge on the assumed value for the elasticity of substitution $\eta$.
^{15} See Woodford (2003).

Sticky non-durable prices: Fig. 2 displays the effect on selected variables of a 25 basis point innovation in the policy rule (22) in the model with perfect financial markets. Three limit cases are described: (i) sticky non-durable prices (and flexible durable prices); (ii) sticky durable prices (and flexible non-durable prices); and (iii) prices equally sticky in both sectors. In all cases price stickiness is the equivalent of four quarters.

Fig. 2. Co-movement problem under free borrowing: impulse responses to a monetary policy tightening.

When non-durable goods prices are sticky, the relative price $q_t$ falls substantially in response to the shock. This is caused by the price of durable goods falling relatively more than the price of non-durables. Notice the one-to-one co-movement between the relative price of durables $q_t$ and non-durable consumption, consistent with Eq. (29). Why, then, do consumption and production (employment) both rise in the durable sector? It is useful to rewrite the condition driving the consumption–leisure margin (for a generic agent) as follows:

$$-\frac{U_{n,t}}{U_{c,t}} = \frac{W_t}{P_{c,t}} = \frac{W_t}{P_{d,t}}\,q_t$$
(30)
Price flexibility in the durable sector implies that the real marginal cost is constant in that sector, i.e., $-U_{n,t}/(U_{c,t}q_t) = \text{const}$. Hence, given that the denominator $U_{c,t}q_t$ is (quasi) constant as a result of (29), both the product wage $W_t/P_{d,t}$ in the durable sector and $U_{n,t}$ must be constant. In turn, this implies that total employment must be constant in equilibrium, $N_t \simeq N$. Yet if employment falls in the non-durable sector as a result of the monetary tightening, it must necessarily rise in the durable sector (as we observe in Fig. 2) to keep total employment unchanged. Hence output and expenditure both contract in the non-durable sector, whereas they simultaneously expand in the durable sector.

Sticky durable prices: When durable prices are sticky (and non-durable prices are flexible), the co-movement problem arises again. Notice that the relative price of durables now rises, thereby dictating a rise also in non-durable consumption (and employment). The reason why $q_t$ rises is symmetric to the previous case: now, prices fall relatively more in the non-durable sector. At the same time, flexibility in the non-durable sector implies a constant real marginal cost: $-U_{n,t}/U_{c,t} = \text{const}$. Using condition (30) we can write

$$-U_{n,t} = \overline{U_c q}\,\frac{W_t}{P_{d,t}} \quad (31)$$

where the upper bar indicates that $U_c q$ is a constant (again consistent with Eq. (29)). Sticky durable prices imply that the product wage must fall in that sector (to accompany a fall in the sectoral real marginal cost). From (31), this implies that $U_{n,t}$ must rise, and therefore total employment must fall. But if consumption and employment in the non-durable sector both rise, then employment and expenditure must necessarily fall in the durable sector. In the case of sticky durable prices, the model exhibits a further anomaly: the nominal interest rate falls (see also the case of equal stickiness below). As noted in Barsky et al. (2007), this is due to the real return in units of durables being quasi-constant (once again a consequence of the quasi-constancy of the shadow value of durables), so that the nominal rate tracks expected inflation in durables almost one to one.

Equal stickiness: Finally, Fig. 2 shows that the case of equal price stickiness in both sectors generates a completely flat response of the relative price of durables. As a result, non-durable consumption is also basically constant. This (almost) perfect correlation between non-durable consumption and the relative price follows once again from the property of a constant shadow value of durables, as from Eq. (29). At the same time, however, we observe that durable consumption still falls. In fact, despite a constant relative price of durables, the user cost rises, due to the rise in the real interest rate (there is no distinction in this case between a durable- and a non-durable-based real interest rate). In other words, even in the absence of relative price movements, the model is still not capable of generating a clear positive co-movement between consumption in the two sectors.

7. The role of credit market imperfections

This section argues that the introduction of credit market imperfections can help reconcile an otherwise standard NK model with the empirical evidence on the sectoral transmission of monetary shocks. In addition to the standard effect traditionally related to price stickiness, a monetary tightening now also produces a tightening of the collateral constraint, i.e., a rise in $\psi_t$.
We will label this the collateral-constraint effect. The rise in the shadow value $\psi_t$ has a twofold implication. First, it breaks the quasi-constancy of the shadow value of durables, which is a key ingredient of the co-movement problem under perfect financial markets; second, it increases the user cost of durables, producing a substitution towards non-durables. To understand the first point, recall that the shadow value of durables must satisfy (from (13)):

$$V_t = \frac{U_{d,t} + \beta(1-\delta)E_t\{V_{t+1}\}}{K_t}$$
(32)
where $K_t \equiv 1 - (1-\chi)(1-\delta)\psi_t E_t\{\pi_{d,t+1}\}$ is a composite term that depends (negatively) on the multiplier $\psi_t$ and on the expected rate of inflation in durables. Notice that, under perfect financial markets ($\psi_t = 0$ for all $t$), the composite term $K_t$ is equal to 1 for all $t$. Log-linearizing (32) around the deterministic steady state and iterating forward, one obtains

$$\hat{V}_t = \sum_{j=0}^{\infty}\left(\frac{\beta}{\Gamma}\right)^j E_t\left\{\Theta\hat{U}_{d,t+j} - \hat{K}_{t+j}\right\} \quad (33)$$
where $\Theta \equiv \Phi/[\Gamma(1-\delta)]$. Notice that (33) reduces to (29) in the case $\beta = \gamma$. Under a collateral constraint, the shadow value of durables depends not only on the marginal utility of durables (as under perfect financial markets), but also on the current and expected future values of the shadow value of borrowing (via the term $K_t$), which does fluctuate in response to shocks.

The additional implication stemming from the presence of a collateral constraint is that tighter credit conditions generate (via a rise in $\psi_t$) a rise in the user cost of durables (see our previous analysis). This induces a substitution from durable to non-durable consumption, i.e., from the flexible towards the sticky sector. Importantly, this effect generates a de-linking between the user cost and the relative price of durables (see Eq. (15) above). A tight co-movement between the user cost and the relative price of durables was instead a defining feature of the baseline model with perfect financial markets. It is important to emphasize, though, that the collateral-constraint channel described above is at work for any assumed relative degree of sectoral price stickiness: in other words, this effect is independent of $q_t$. Hence movements in the relative price of durables can strengthen the collateral-constraint effect by altering the value of the collateral asset. In turn, the implied variation in the demand for durables will feed back onto the behavior of relative prices, all in a self-reinforcing fashion.

Fig. 3 depicts impulse responses to a monetary policy tightening (25 basis point innovation) in the model with a binding collateral constraint. Solid and dashed lines denote the typical borrower's and saver's variables, respectively (with the behavior of the saver being illustrative also of the dynamics under perfect financial markets).
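The claim that (33) nests (29) can be verified on its discounting objects. A minimal check (baseline $\beta$, $\delta$, $\chi$; the function name is ours):

```python
beta, delta, chi = 0.98, 0.01, 0.25

def discount_terms(gamma):
    """Objects entering Eq. (33): the discount factor beta/Gamma and the
    weight Theta = Phi / (Gamma * (1 - delta))."""
    Gamma = (1 - (1 - chi) * (1 - delta) * (gamma - beta)) / (1 - delta)
    Phi = 1 - (1 - delta) * (beta + (1 - chi) * (gamma - beta))
    return beta / Gamma, Phi / (Gamma * (1 - delta))

# At gamma = beta the pair collapses to (beta*(1-delta), 1 - beta*(1-delta)):
# exactly the discounting structure of the perfect-markets expression (29),
# with the K-terms dropping out since psi_t = 0.
disc, weight = discount_terms(gamma=beta)
```

With $\gamma > \beta$ the effective discount factor $\beta/\Gamma$ and the weight $\Theta$ shift, and the $\hat{K}$ terms enter, which is precisely how the constraint breaks the quasi-constancy of $V_t$.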
In this experiment we assume that non-durable prices feature a standard four-quarter stickiness, whereas durable prices are more flexible (two-quarter stickiness), so that the relative price of durables tends to move in the right direction (i.e., fall). We further assume that the elasticity of substitution $\eta$ equals 1 and that the share of borrowers is equal to 1/2. Consider now the effect of introducing a collateral constraint. The monetary policy tightening induces a rise in the marginal value of borrowing $\psi_t$. Notice that while the shadow value of durables ($V_t$) is almost constant under perfect financial markets, it rises sharply in the case in which the collateral constraint is binding (as a result of the rise in $\psi_t$).

Fig. 3. Impulse responses to a monetary policy tightening: model with collateral constraint.

As in the case of perfect financial markets, the result of the policy shock is a fall in the relative price of durables. However, the dynamics of the user cost and of the relative price of durables are now de-linked: the relative price of durables falls whereas the user cost rises in response to the shock, in sharp contrast with the baseline economy under perfect financial markets. With a collateral constraint, the fall in the price $q_t$ has an additional effect: that of directly reducing the collateral value, further contributing to a tightening of borrowing conditions. As a result, real debt falls, and the demand for durables drops on impact and then gradually reverts back towards the steady state as the user cost falls over time. In addition, the observed rise in the user cost produces a substitution effect from durables to non-durables. Hence the peak impact on durable consumption is larger than that on non-durable consumption. Notice also that the fall in real debt in response to a policy tightening is qualitatively in line with the empirical evidence discussed in the early part of the paper. Simultaneously, the borrower also reduces the demand for non-durable goods, and to a larger extent than the saver. This is the result of two effects. First, prices are sticky in that sector, so the real interest rate on non-durables rises. Second, and most importantly, the reduced ability to borrow (due to tighter borrowing conditions as well as to the fall in the relative price of durables) also negatively affects the borrower's demand for non-durables. In this vein, the presence of an endogenous collateral constraint generates a complementarity between durable and non-durable demand. As is clear from Fig. 3, the borrower's durable consumption response contrasts sharply with the saver's. Hence it is important to understand whether the co-movement problem disappears also for aggregate consumption.
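The de-linking just described can be illustrated with the log-linear user cost (16). In the sketch below the sizes of the deviations are purely illustrative (they are not taken from Fig. 3); only the parameter values are the baseline ones:

```python
# Baseline parameters (Section 5); deviation sizes below are illustrative only.
gamma, beta, delta, chi = 0.99, 0.98, 0.01, 0.25
Gamma = (1 - (1 - chi) * (1 - delta) * (gamma - beta)) / (1 - delta)
Phi = 1 - (1 - delta) * (beta + (1 - chi) * (gamma - beta))

# A tightening: the relative price of durables falls (qhat < 0) while the
# shadow value of borrowing rises (psihat > 0).
qhat, qhat_next, rhat, psihat, zhat_infl = -0.002, -0.001, 0.001, 0.5, 0.0
Zhat = (1 - delta) / Phi * (Gamma * qhat - beta * qhat_next + beta * rhat
                            + (gamma - beta) * (chi * psihat - zhat_infl))
# Zhat > 0 while qhat < 0: the user cost and the asset price move in
# opposite directions, the de-linking emphasized in the text.
```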
Aggregation requires an understanding of the savers' consumption responses to the policy shock (see Fig. 3 again). Recall that the savers are standard permanent-income agents. Two competing effects drive their demand. On the one hand, they experience a positive income shock, the counterpart of the negative income shock for the borrowers; this effect leads the savers to increase both categories of consumption. On the other hand, the rise in the real interest rate makes them substitute consumption intertemporally, so that, on balance, savers' non-durable consumption is observed to fall initially. At the same time, since the relative price of durables falls, the savers increase their demand for durables. For these agents, in fact, the relevant user cost is the one prevailing in the absence of any collateral constraint, and therefore it depends heavily on the behavior of the relative price.
Fig. 4. Impulse responses to a monetary policy tightening: effect of varying the degree of stickiness in durable prices (model with collateral constraint).
Fig. 4 illustrates how aggregate consumption responds to the policy shock under alternative values of the frequency of adjustment in durable prices (stickiness in non-durables is kept constant at four quarters). A positive co-movement generally arises provided that durable prices display a minimal degree of stickiness. The required degree of stickiness in durables, however, is well below the standard value of four quarters estimated, for instance, in Nakamura and Steinsson (2008). In the baseline case of prices equally sticky for four quarters, the model generates a strong positive co-movement. The latter result contrasts sharply with the one obtained in the baseline NK model with perfect credit markets. Notice that a positive co-movement arises also between non-durable consumption and durable investment (expenditure). The latter falls sharply (in accordance with our introductory empirical evidence) but its fall is short-lived. This is the result of the dynamics of durable services lacking persistence (i.e., not being hump-shaped): durable services fall but then start almost immediately to revert back to the steady state (recall that the household's preferences require smoothing the response of $D_t$, not that of investment). It would be straightforward, however, to amend the model to allow for more persistent dynamics in durable investment.^{16}

The same figure displays the behavior of the nominal interest rate. As in the case of frictionless credit markets, the degree of stickiness in durable prices is relevant in determining the sign of the response: if the stickiness of durable prices is sufficiently high, and as high as that of non-durables, the interest rate slightly falls. It is interesting, however, that there exist values of durable stickiness for which the model delivers a positive co-movement between durable and non-durable consumption while, simultaneously, the nominal interest rate is observed to rise, as conventional wisdom would predict.
8. Conclusions

This paper has shown that credit market imperfections on the household's side can be relevant in accounting for the transmission of monetary policy shocks to durable and non-durable spending. The key idea is that, by affecting credit conditions for constrained agents, monetary policy can have an impact on the intertemporal relative price of durables (the user cost), and therefore on the sectoral allocation of demand. Our conclusions bear implications on two further grounds. First, Barsky et al. (2007) have recently highlighted that the presence of durable goods, despite their smaller relative share in total spending, can substantially alter the transmission of monetary shocks within a standard NK sticky-price model. In particular, if durable prices are flexible, their model exhibits monetary neutrality, while if durable prices are sticky, the model behaves as a standard sticky-price model even if non-durable prices are flexible. Our paper shows that the assumption on the degree of stickiness of durables becomes significantly less crucial once a collateral constraint on borrowing is introduced in the model. Second, a recent research program has tried to assess the empirical validity of dynamic stochastic general equilibrium (DSGE) models via structural estimation methods (Smets and Wouters, 2003; Christiano et al., 2005). In that research program, credit markets are usually assumed to be frictionless. An extension of estimated DSGE models to include a role for collateral constraints (both on the household's and the firm's side) seems of paramount importance in order to improve the ability of such models to provide an adequate representation of the data.

16 There are at least three features that might help in this direction: first, some form of segmentation in the labor market, whereby sectoral labor is either not perfectly mobile across sectors (as in Erceg and Levin, 2006) or not perfectly substitutable in the production function; second, adjustment costs in (the rate of change of) durable investment; third, habit persistence in durable consumption.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2008.09.013.
References

Barsky, R., House, C., Kimball, M., 2007. Sticky price models and durable goods. American Economic Review 97, 984–998.
Becker, R., 1980. On the long-run steady state in a simple dynamic model of equilibrium with heterogeneous agents. Quarterly Journal of Economics 95 (2), 375–382.
Becker, R., Foias, C., 1987. A characterization of Ramsey equilibrium. Journal of Economic Theory 41, 173–184.
Bils, M., Klenow, P., 2004. Some evidence on the importance of sticky prices. Journal of Political Economy 112 (5), 947–985.
Campbell, J., Hercowitz, Z., 2006. The role of collateralized household debt in macroeconomic stabilization. NBER Working Paper No. 11330.
Carlstrom, C., Fuerst, T., 2006. Co-movement in sticky price models with durable goods. Mimeo, Federal Reserve Bank of Cleveland.
Chah, E.Y., Ramey, V., Starr, R.M., 1995. Liquidity constraints and intertemporal consumer optimization: theory and evidence from durable goods. Journal of Money, Credit and Banking 27 (1), 272–287.
Christiano, L., Eichenbaum, M., Evans, C., 1999. Monetary policy shocks: what have we learned and to what end? In: Taylor, J.B., Woodford, M. (Eds.), Handbook of Macroeconomics. Elsevier, Amsterdam, pp. 65–148.
Christiano, L., Eichenbaum, M., Evans, C., 2005. Nominal rigidities and the dynamic effects of a shock to monetary policy. Journal of Political Economy 113, 1–45.
Clarida, R., Galí, J., Gertler, M., 1999. The science of monetary policy: a New Keynesian perspective. Journal of Economic Literature 37, 1661–1707.
Erceg, C., Levin, A., 2006. Optimal monetary policy with durable consumption goods. Journal of Monetary Economics 53, 1341–1359.
Galí, J., Gertler, M., 1999. Inflation dynamics: a structural econometric analysis. Journal of Monetary Economics 44 (2), 195–222.
Galí, J., López-Salido, D., Vallés, J., 2007. Understanding the effects of government spending on consumption. Journal of the European Economic Association 5 (1), 227–270.
Goodfriend, M., King, R., 1997. The new neoclassical synthesis and the role of monetary policy. In: Bernanke, B.S., Rotemberg, J.J. (Eds.), NBER Macroeconomics Annual. MIT Press, Cambridge, MA, pp. 231–283.
Iacoviello, M., 2005. House prices, borrowing constraints, and monetary policy in the business cycle. American Economic Review 95 (3), 739–764.
Ireland, P., 2003. Endogenous money or sticky prices? Journal of Monetary Economics 50 (8), 1623–1648.
Kiyotaki, N., Moore, J., 1997. Credit cycles. Journal of Political Economy 105, 211–248.
Kocherlakota, N., 2000. Creating business cycles through credit constraints. Federal Reserve Bank of Minneapolis Quarterly Review 24 (3), 2–10.
Krusell, P., Smith, A., 1998. Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy 106, 867–896.
Nakamura, E., Steinsson, J., 2008. Five facts about prices: a reevaluation of menu cost models. Quarterly Journal of Economics 123 (4), 1415–1464.
Ogaki, M., Reinhart, C.M., 1998. Measuring intertemporal substitution: the role of durable goods. Journal of Political Economy 106 (5), 1078–1098.
Rotemberg, J., Woodford, M., 1997. An optimization-based econometric framework for the evaluation of monetary policy. In: Bernanke, B.S., Rotemberg, J.J. (Eds.), NBER Macroeconomics Annual. MIT Press, Cambridge, MA, pp. 297–346.
Smets, F., Wouters, R., 2003. An estimated dynamic stochastic general equilibrium model of the euro area. Journal of the European Economic Association 1 (5), 1123–1175.
Woodford, M., 2003. Interest and Prices: Foundations of a Theory of Monetary Policy. Princeton University Press, Princeton, NJ.
Journal of Monetary Economics 56 (2009) 255–266
Money demand heterogeneity and the great moderation Pablo A. Guerron-Quintana Department of Economics, North Carolina State University, Raleigh, NC 27605, USA
Article info

Article history: Received 4 December 2006; received in revised form 20 November 2008; accepted 21 November 2008; available online 6 December 2008.

JEL classification: E41; E47.

Keywords: Financial innovation; Great Moderation; GMM; money demand; sluggish-portfolio adjustment; monetary and technology shocks.

Abstract

A forward-looking model of the demand for money based on heterogeneous and sluggish-portfolio adjustment can simultaneously account for the low short-run and high long-run semi-elasticities reported in the literature. The parameter estimates from the model for the short-run and long-run interest semi-elasticities are 1.04 and 13.16, respectively. A simulated version of the model suggests that the Great Moderation can be partially attributed to financial innovations in the late 1970s. When moving toward a more flexible portfolio, the model can account for almost one-third of the observed decline in the volatilities of output, consumption, and investment.

© 2008 Elsevier B.V. All rights reserved.
1. Introduction

The demand for money has been a long-standing puzzle in the monetary economics literature. Indeed, since the seminal papers of Friedman (1959) and Meltzer (1963), the literature has yielded conflicting estimates for the coefficients of the money demand equation. For example, whereas Lucas (1988) suggests that the long-run interest rate semi-elasticity of money demand falls between 5 and 10, other researchers (e.g., Goldfeld and Sichel, 1990) find that the short-run semi-elasticity lies around 1. Understanding the demand for money is critical because of its implications for welfare analysis (Lucas, 2000), the dynamics of exchange rates (Engel and West, 2005), and optimal monetary policy (Khan et al., 2003). Similarly, the Great Moderation (the sustained decline in the volatility of economic activity; McConnell and Perez-Quiros, 2000; Stock and Watson, 2002) has been a controversial topic without a consensus on its cause. The moderation has been attributed to better monetary policy (Clarida et al., 2000), less volatile productivity shocks (Fernandez-Villaverde and Rubio-Ramirez, 2007; Justiniano and Primiceri, 2008), and unknown forms of good luck reflected in smaller forecast errors (Stock and Watson, 2002). Table 1, which reports the volatilities of some variables in the U.S., shows that real variables are at least 50 percent less volatile after 1984 than before.1 For example, the volatility of output growth declined 54% in the post-1984 sample. We also observe a much less discussed fact: the substantial increase in the variability of real balances. This paper investigates what frictions might account for the considerable difference between the short- and long-run
1 The results in McConnell and Perez-Quiros (2000) and Sims and Zha (2004) suggest 1984 as the most likely year when the Great Moderation started.

doi:10.1016/j.jmoneco.2008.11.003
Table 1
Summary statistics of the Great Moderation in U.S. 1960–2005.

                 σ_y     σ_c     σ_i     σ_π     σ_r     σ_{m/p}
1960–1983 (a)    1.14    0.57    4.37    2.76    3.88    4.16
1984–2005 (b)    0.44    0.52    0.53    0.31    0.50    1.35
1960–2005 (b)    0.81    0.84    0.83    0.91    0.81    1.19

Notes: Standard deviations of (1) annual growth rates of output (y), consumption (c), investment (i), and real balances (m/p), and (2) annualized interest rates (r) and inflation (π). Data sources are described in Section 4 of the main text.
(a) Volatilities expressed in percentage points.
(b) Volatilities expressed as a fraction of those for the period 1960–1983.
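As a quick check on the table's units, the second and third rows are fractions of the first row, so post-1984 volatilities in percentage points can be recovered by multiplication; a minimal back-of-envelope sketch (values transcribed from Table 1):

```python
# Row 1 of Table 1: 1960-1983 volatilities in percentage points.
pre84 = {"y": 1.14, "c": 0.57, "i": 4.37, "pi": 2.76, "r": 3.88, "m/p": 4.16}
# Row 2: 1984-2005 volatilities as fractions of the 1960-1983 values.
frac_post84 = {"y": 0.44, "c": 0.52, "i": 0.53, "pi": 0.31, "r": 0.50, "m/p": 1.35}

# Post-1984 volatilities in percentage points.
post84 = {k: round(pre84[k] * frac_post84[k], 2) for k in pre84}
print(post84["y"])    # about 0.50: output growth roughly half as volatile
print(post84["m/p"])  # about 5.62: real balances became MORE volatile
```

The last line makes the paper's "much less discussed fact" visible in levels: every real variable's volatility shrinks, while that of real balances rises.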
semi-elasticities of the demand for money. Furthermore, I argue that the interaction between money demand and the financial innovations of the late 1970s contributed to the Great Moderation. This paper develops a model in which households divide their money holdings (portfolio) into two parts: money for consumption purchases and money for savings. Re-balancing the composition of the portfolio is costly, however. To capture this cost, the model assumes households use time-dependent rules to re-optimize their money holdings, a supposition that yields non-trivial heterogeneity across households. Moreover, the presence of those costs implies that households look forward when re-balancing their portfolios between cash and deposits. In the model, velocity depends on its own lagged value and a forward-looking element capturing households' expectations of the short-term interest rate. When this equilibrium condition is estimated via the generalized method of moments (GMM), I uncover (1) short-run and long-run interest semi-elasticities of 1.04 and 13.16, respectively; (2) that households re-optimize their money balances on average once every 4.5 quarters; and (3) a significant forward-looking component in money demand. Notably, the estimated short-run semi-elasticity agrees with the findings of Christiano et al. (2005, CEE), and the long-run semi-elasticity is in the range reported in Lucas (1988) and Stock and Watson (1993). Moreover, the estimated portfolio sluggishness is consistent with the microdata-based evidence in Vissing-Jorgensen (2002). There is little statistical evidence against the model.

The late 1970s and early 1980s were characterized by significant financial innovations, such as banking de-regulation and ATMs. These innovations most likely decreased the cost of transacting in financial markets, which resulted in more frequent portfolio re-balancing.
In the model, additional portfolio flexibility allows households to efficiently adjust money balances when shocks buffet the economy and hence facilitates consumption smoothing. Consequently, my model suggests that financial innovations of the 1970s and 1980s contributed to the Great Moderation. To support this hypothesis, two exercises are undertaken. First, I re-estimate the model over the pre-1984 and post-1984 samples. Whereas the short- and long-run interest semi-elasticities in the first subsample equal 3.00 and 16.67, respectively, those elasticities fall to 0.42 and 4.72, respectively, in the second subsample. Furthermore, the frequency of portfolio re-optimization changes from six quarters in the pre-1984 sample to three quarters in the post-1984 sample. This analysis suggests that financial innovations did indeed lead to a more frequent portfolio adjustment. To fully explore the effects of portfolio flexibility, the second step involves embedding the model of money demand into a standard New Keynesian framework. The resulting model is then simulated using different degrees of portfolio sluggishness. This exercise reveals that when moving from high portfolio sluggishness, in which households re-balance their portfolios on average every 6 quarters, to moderate flexibility (portfolios are reviewed every 3 quarters), the model accounts for almost one-third of the observed decline in the volatilities of output, consumption, inflation, and investment. Interestingly, the model is capable of explaining one-third of the rise in the volatility of real balances found in the data (Table 1). As will become apparent below, this increase in volatility is crucial to understanding the workings of the model. The rest of the paper is organized as follows. Section 2 describes the main innovations that hit financial markets in the 1970s and 1980s. The model of money demand and its estimation are discussed in Sections 3 and 4. 
Simulations and the relation of the results to the existing literature are in Section 5. The last section provides concluding remarks.

2. Financial innovations

Undoubtedly, progress in computer and information technology has been one of the main sources of innovation in financial markets. Most of those important innovations happened during the late 1970s and early 1980s, including the production of Intel's 16-bit microprocessor, the 8086, and the introduction of the first personal computers: the Apple II in 1977 and the IBM PC in 1981. These advances in turn had significant effects on the banking sector. To begin with, automated teller machines (ATMs) and electronic fund transfers (EFTs) became a standard in the industry.2 Customers in turn could deposit and withdraw money more quickly and at a reduced fee compared to person-based transactions. Next, financial institutions gained access to statistical software that greatly simplified credit risk assessment (VisiCalc, introduced in 1979, was the first spreadsheet for personal computers). This newly acquired ability enabled financial intermediaries to lend money more efficiently. Firms also benefited from the new risk valuation techniques as they progressively gained access to a market for high-risk debt.3

A related source of innovations resulted from major changes in banking legislation. The Depository Institutions Deregulation and Monetary Control Act (DIDMCA), introduced in 1980, allowed, among other things, NOW and sweep accounts nationwide; eliminated interest rate ceilings on deposits and usury ceilings on loans; and increased deposit insurance to $100,000 per account. In 1982, Congress passed the Depository Institutions Act (Garn-St. Germain), which permitted depository institutions to offer money market deposit accounts (MMDAs). Together, these laws fostered greater integration between capital and credit markets and therefore led to better access to financing (see Campbell and Hercowitz, 2004; Dynan et al., 2006). Financial sophistication also came through the introduction of new banking products and interstate banking, both aimed at boosting banks' profits. For example, in 1975 California allowed lending institutions to issue adjustable-rate mortgages, characterized by low initial interest rates, which made them very popular among homeowners. With respect to branching, states such as Maine and Massachusetts introduced new legislation (1975 and 1982, respectively) that permitted out-of-state companies to buy banks in those states. Finally, the Chicago Board of Trade introduced financial derivatives in the mid-1970s.

[Fig. 1. Plots of velocity (defined as the log of nominal consumption to money), interest rates, the Great Moderation, and some financial innovations. Marked events: 1977: Apple II introduced; 1980: DIDMCA Act passed; 1984: Great Moderation starts.]

Fig. 1 displays velocity (nominal GDP to money), interest rates, and some financial innovations in the U.S. This figure clearly reveals that the arrival of the innovations precedes the Great Moderation and coincides with two changes in the data: (1) a sharp rise followed by a sustained decline in velocity and interest rates, and (2) a tendency of these two variables to co-move, which is quite apparent after 1977, the year when the Apple II was introduced.

2 Sienkiewicz (2002) reports that the number of ATMs rose from fewer than 10,000 terminals in 1978 to roughly 324,000 in 2001.
Such findings suggest that money balances, and hence velocity, became more flexible over time, most likely because of the financial innovations.4 As will become clear, my model precisely predicts that sophistication in financial markets, as captured by flexible portfolio adjustment, indeed implies a more flexible money demand. Yet anecdotal evidence is no guarantee that financial innovations ultimately translated into more flexible portfolio adjustment. Fortunately, the empirical results in Section 4 lend support to the link between financial sophistication and frequency of portfolio re-balancing. Moreover, Vissing-Jorgensen (2002) estimates that the annual cost for participating in the stock market declined 43% between 1984 and 1994. Consistent with this decline in costs, she finds that household participation in the stock market increased from 28% to 44% for the same period. Her results, therefore, indicate that transaction costs declined and households traded more frequently.
3. Model

The basic formulation of my model is based on Altig et al. (2005, ACEL) and Schmitt-Grohe and Uribe (2004). Their models are rich enough to capture the business cycle properties of key variables after monetary and technology shocks.

3 Dynan et al. (2006) report that new issuance of junk bonds went from nothing in the mid-1970s to more than 40 percent of total non-financial bond issuance in 2004.
4 The change in interest rates and velocity may also be a consequence of a switch in monetary policy or smoother shocks hitting the economy. In the simulations in Section 5, I control for such possible scenarios to show that financial innovations alone can generate a sizable smoothing of the economy.
3.1. Households

The model features a continuum of households, indexed by j ∈ (0, 1). Before any uncertainty is revealed, households can buy insurance against idiosyncratic shocks (described below). After learning the current state of nature, households make decisions regarding consumption, C_{j,t}, investment, I_{j,t}, and labor supply, h_{j,t}. Simultaneously, households allocate their beginning-of-period money holdings, M_{j,t}, between deposits at a bank and cash to be used for consumption transactions, Q_{j,t}. By sending resources to the bank, households are entitled to collect a return, R_t, at the end of the period. The model assumes that households face a Calvo lottery every time they optimize transaction balances. Each household is a monopolistic supplier of a differentiated labor service. Moreover, she sets her wage subject to a Calvo scheme similar to that in place in the financial service sector. However, my model allows the Calvo probabilities in each market to differ. Therefore, households make their decisions based on the following maximization problem:

\max_{\{C,M,Q,I,K,w,h\}} E_t \sum_{l=0}^{\infty} \beta^l \left[ \log\left(C_{j,t+l} - b\,C_{t+l-1}\right) - \psi_L \frac{h_{j,t+l}^2}{2} \right]   (1)
subject to

(1 + Z(V_{j,t}))\, p_t C_{j,t} + p_t \Upsilon_t^{-1} I_{j,t} + M_{j,t+1} \leq R_t \left(M_{j,t} - Q_{j,t} + (x_t - 1) M_t^a\right) + A_{j,t} + w_{j,t} h_{j,t} + p_t r_t^k K_{j,t} + Q_{j,t}
and

K_{j,t+1} = (1 - \delta) K_{j,t} + \left[1 - S\left(\frac{I_{j,t}}{I_{j,t-1}}\right)\right] I_{j,t}.

Here, b > 0 is the habit formation parameter; ψ_L > 0; p_t is the price level; and S is a function reflecting the costs associated with adjusting investment. This function is assumed to be increasing and convex, satisfying S = S′ = 0 and S″ ≡ κ > 0 in steady state. A_j is household j's net cash inflow from the insurance markets. The term Υ_t is an investment-specific shock. The quantity (x_t − 1)M_t^a is a lump-sum transfer made to household j by the monetary authority, and x_t is the gross growth rate of the aggregate stock of money, M_t^a. Following Sims (1994), money is introduced into the model by assuming that the purchase of goods requires the payment of transaction services. These transaction costs depend on individual velocity, V_{j,t} ≡ p_t C_{j,t}/Q_{j,t}, through the function Z with positive first, Z′, and second, Z″, derivatives.

3.1.1. Money holdings setting

The main assumption in this paper comes from time-dependent portfolio adjustment: agents re-optimize their cash balances, Q, infrequently, similar in spirit to the price- and wage-setting model of CEE. Specifically, a fraction, 1 − ξ, of randomly chosen households is allowed to re-optimize their balances every period. For inactive households, the literature on portfolio choice provides little guidance regarding their behavior (Campbell and Viceira, 2002). Hence, if a household is not allowed to re-optimize today, her money holdings are adjusted according to the rule Q_{j,t} = π g_c Q_{j,t−1}, where π represents steady-state inflation and g_c is the growth rate of aggregate consumption in the steady state.5 This sluggish-portfolio assumption is likely to capture two important aspects of the economy. First, it measures the degree of access to financial services enjoyed by households. Prior to the widespread use of ATMs, electronic banking, and the branching liberalization of the late 1970s, households spent substantial resources managing their accounts.
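The Calvo lottery plus the indexation rule for inactive households can be sketched in a small simulation. Every value below (ξ, the indexation factor, the optimal balance Q*) is an illustrative placeholder rather than an estimate from the paper; ξ is chosen so that the mean time between re-optimizations, 1/(1 − ξ), is about 4.5 quarters, the full-sample frequency reported in the introduction:

```python
import numpy as np

rng = np.random.default_rng(0)
xi = 0.78        # prob. of NOT re-optimizing this period (illustrative)
pi_gc = 1.01     # steady-state gross inflation times consumption growth
Q_star = 1.2     # balance chosen by re-optimizing (active) households
n, T = 100_000, 40

Q = np.ones(n)   # initial cash balances
for _ in range(T):
    active = rng.random(n) < 1.0 - xi        # Calvo draw each period
    Q = np.where(active, Q_star, pi_gc * Q)  # active: re-optimize; else: index

# Mean time between re-optimizations is 1/(1 - xi); inverting, a reported
# duration pins down the implied non-adjustment probability xi.
for dur in (6.0, 4.5, 3.0):
    print(dur, round(1.0 - 1.0 / dur, 3))  # 6 -> 0.833, 4.5 -> 0.778, 3 -> 0.667
```

The last loop shows the back-of-envelope mapping behind the paper's subsample results: a fall in the average re-optimization interval from six to three quarters corresponds to a drop in ξ from about 0.83 to about 0.67.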
For instance, money deposits and withdrawals had to be made during business days, since banks were closed on weekends. Hence, households had limited access to such services, which is parsimoniously captured in the model by infrequent portfolio re-balancing. Indeed, a decline in ξ implies more frequent portfolio adjustment or higher household participation in the financial system. Second, the time-dependent assumption captures the costs faced by households when assessing the uncertainty surrounding the economy. The presence of large costs makes it harder for households to determine the state of the economy and in particular the risk exposure of banks. As a consequence, households may opt to limit their participation in financial markets. This infrequent participation in turn makes households factor in future outcomes when choosing their portfolio. Moreover, households may keep real balances for purchase purposes as well as for precautionary motives. The financial innovations of the late 1970s most likely contributed to decreasing those costs and ultimately increasing household participation in financial markets.

For tractability reasons, let us assume that agents can contract upon the uncertainty created by the Calvo lotteries with perfectly competitive insurance companies. Such an assumption, which has been extensively used in the sticky price literature (Erceg et al., 2000), aims to reduce the degree of heterogeneity across households resulting from time-dependent portfolio adjustment. In fact, market completeness ensures that households value an extra dollar equally: λ_{j,t} = λ_{k,t} for households j and k, where λ is the budget multiplier (CEE). By employing this assumption, we focus on the direct effects of the Calvo friction on consumption and money balances. The optimal money holdings for household j that can re-optimize at time t obey

R_t = 1 + (Z_{j,t})' (V_{j,t})^2 + \tilde{E}_t \sum_{l=1}^{\infty} (\xi \beta \pi g_c)^l \frac{\lambda_{t+l}}{\lambda_t} \left[ (1 - R_{t+l}) + (Z_{j,t+l})' (V_{j,t+l}^t)^2 \right],   (2)

5 The presence of g_c in the indexation rule implies that there are no distortions from portfolio dispersion along the steady state growth path. Guerron-Quintana (2007a) explores the implications of alternative indexation rules.
where Z_{j,t+l} ≡ Z(V_{j,t+l}^t), V_{j,t+l}^t ≡ p_{t+l} C_{j,t+l} / ((π g_c)^l Q_{j,t}), and Ẽ_t is the expectation operator conditional on the event that household j does not re-optimize her money balances after period t. The Lagrangian multiplier, λ_t, is not indexed by j, reflecting the assumption of insurance markets. In the absence of time-dependent portfolio adjustment, the optimal condition for money balances requires that the cost of an extra unit of balances, R_t, equal the benefit of the decrease in the transaction cost: R_t = 1 + Z′(V_t)(V_t)^2. In contrast, with sluggish-portfolio adjustment, the costs and benefits of an extra unit of money balances extend beyond the current period because, with positive probability, households must retain their current nominal balances, indexed by inflation, next period. In the second period, losses come from the forgone interest income π g_c R_{t+1}, whereas gains come from the savings on transaction costs, π g_c + ((Z_{t+1})′/(π g_c)) (p_{t+1} C_{t+1}/Q_t)^2. Because these costs and gains arrive during the next period with probability ξ, they must be discounted using the household's stochastic discount factor, ξ β λ_{t+1}/λ_t. The combination of these terms corresponds to the second element of Eq. (2). Hence, a direct implication of the infrequent portfolio friction is that households must look forward every time they are allowed to optimize money balances.

Deriving a tractable money demand equation from Eq. (2) presents two challenges. First, the conditional expectation Ẽ_t (Z_{j,t+l})′(V_{j,t+l}^t)^2 must be evaluated only across those histories in which household j has not re-optimized her money balances. However, her consumption, C_{j,t+l}, depends on all continuation histories, including those in which the household can re-optimize her balances. Second, in spite of the insurance market assumption, the presence of the transaction function Z implies that consumption is still heterogeneous across agents.
This is so, since contingent markets equate the marginal utility of wealth across households:

\frac{U_{c,t}(j)}{U_{c,t}(k)} = \frac{1 + Z(V_{j,t}) + Z'(V_{j,t}) V_{j,t}}{1 + Z(V_{k,t}) + Z'(V_{k,t}) V_{k,t}}   (3)
for households j and k. Because money holdings differ across households, so too does consumption. To manage these complications, this paper borrows ideas from the procedure outlined in Woodford (2005), who solves a forward-looking pricing model with heterogeneity. First, take a log-linear approximation of Eq. (2) about the steady state. As shown in the technical appendix, this equation admits a first-order Taylor approximation under mild assumptions. Second, because only active households absorb additional funds in the market after an expansionary monetary shock, other things equal, their velocity is smaller than that of inactive households. The only source of this differential lies in the sluggish adjustment setup. Thus, I guess and verify that, as a log-linear approximation, individual velocity is a function of economy-wide velocity plus a term that depends on individual money holdings relative to economy-wide money balances: \hat{V}_{j,t} = \hat{V}_t^* + C(\hat{Q}_{j,t} - \hat{Q}_t^*), where C is a coefficient to be determined, starred variables refer to aggregate variables, and a hat indicates log-deviations.6 By combining Eqs. (2) and (3), the aggregate money demand equation is given by

\tilde{R}_t = \phi_1 \hat{V}_t + \phi_2 \left(\hat{\pi}_t + \hat{g}_{c,t} + \hat{V}_{t-1}\right) + \phi_3 E_t\left[\hat{\pi}_{t+1} + \hat{g}_{c,t+1} + \hat{V}_{t+1}\right].   (4)
Here, R̃_t represents deviations of the interest rate from its steady state value; the reduced-form coefficients, φ_i and C, depend on the structural parameters (the mappings can be found in the technical appendix). Furthermore, I find that C < 0, which confirms the suspicion that households with larger money holdings have smaller individual velocities. Eq. (4) resembles Goldfeld's (1976) partial adjustment model.7 This new formulation differs from his model in several ways, however. First, unlike Goldfeld's model, money balances do not appear explicitly in the last equation; instead, they enter indirectly through their influence on velocity. By using velocity as the main variable, the model directly takes into account Lucas's (1988) suggestion that recovering the semi-elasticity of money demand requires imposing an income elasticity of 1.8 Second, Eq. (4) involves the growth rate of consumption, whereas the original model focused on the level of output. This result follows from my model's assumption that money is used for transaction purposes. Interestingly, Khan et al. (2003) derive a money demand model where velocity and consumption play a preponderant role. Finally, the new dynamic money equation includes a forward-looking term, V̂_{t+1}, which, as previously discussed, is a consequence of the staggered portfolio adjustment assumption.

With sticky portfolios, only a fraction of the population re-optimizes their balances following a monetary shock. As time goes by, the probability of participating in the financial market rises, which implies that more agents optimize their balances. Hence, we expect that money balances will be mildly responsive in the short run but highly elastic in the long run. Based on this argument, define the interest semi-elasticity of money demand, ζ, as the percentage change in real balances stemming from a permanent increase of 100 basis points in the annualized interest rate:

\zeta = -\frac{1}{4} \frac{\partial \log(Q/P)_t}{\partial R_t}.   (5)

6 In Woodford's (2005) setup, the firm's price decision is a linear function of a price index and individual capital, which is heterogeneous across firms. Guerron-Quintana (2008) shows how this approach can be exploited to solve models with heterogeneous wage settings.
7 Goldfeld's (1976) model is given by ln m_t = b_0 + b_1 ln y_t + b_2 ln r_t + b_3 ln m_{t-1} + b_4 π_t, where m refers to real balances, y to real output, and r to short-term interest rates.
8 Lucas (1988) argues that the empirical evidence overwhelmingly supports the finding of unitary income elasticity. Hence, by imposing an income elasticity of 1, we can obtain sharper estimates for the interest elasticity.
This equation accounts for the quarterly time period of the model. Combining this last definition with Eq. (4), it is straightforward to show that the short- and long-run semi-elasticities are given by
\zeta_{\text{short}} = \frac{1}{4 \phi_2 \lambda_2 (1 - \lambda_2)} \qquad \text{and} \qquad \zeta_{\text{long}} = \frac{1}{4 \phi_2 \lambda_2 (1 - \lambda_1)(1 - \lambda_2)},   (6)
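The mapping from the roots of the characteristic polynomial to the two semi-elasticities in Eq. (6) can be checked numerically. The φ values below are illustrative placeholders, not the paper's GMM estimates, chosen so that φ_2 < 0 and the inverse roots satisfy 0 < λ_1 < 1 < λ_2:

```python
import numpy as np

# Illustrative reduced-form coefficients (NOT the estimated values).
phi1, phi2, phi3 = 1.75, -1.0, -0.625

# lambda_1, lambda_2 are the inverses of the roots of
# (phi3/phi2) z^2 + (phi1/phi2) z + 1 = 0.
z = np.roots([phi3 / phi2, phi1 / phi2, 1.0])
lam1, lam2 = np.sort(1.0 / z)   # here 0 < lam1 < 1 < lam2

zeta_short = 1.0 / (4 * phi2 * lam2 * (1 - lam2))
zeta_long = zeta_short / (1 - lam1)   # extra 1/(1 - lam1) factor from Eq. (6)
print(zeta_short, zeta_long)          # both positive, with zeta_long larger
```

With φ_2 < 0 and λ_2 > 1 the product φ_2 λ_2 (1 − λ_2) is positive, and 1/(1 − λ_1) > 1, which is exactly the ordering 0 < ζ_short < ζ_long argued in the text.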
where λ_1 and λ_2 are the inverses of the roots of the polynomial (φ_3/φ_2) z^2 + (φ_1/φ_2) z + 1 = 0. With these results in hand, we can argue that the estimates in Christiano et al. (1999) correspond to what the model identifies as a short-run semi-elasticity, while the elasticities reported in Lucas (1988) and others reflect a long-run semi-elasticity. For the last statement to be true, we need to show that 0 < ζ_short < ζ_long. This condition follows trivially from the observation that 0 < λ_1 < 1 < λ_2 and φ_2 < 0. Furthermore, as the Calvo probability declines, the distinction between the short- and long-run behavior vanishes, i.e., ζ_short → ζ_long, which results in the flexible money demand: R_t = 1 + Z′(V_t)(V_t)^2. This is precisely what we observe in Fig. 1: as financial innovations arrive, velocity becomes more flexible and more correlated with interest rates.

3.1.2. Wage setting

Each household is a monopolistic supplier of a differentiated labor service, say h_{j,t}. Households sell these labor services to a competitive firm that aggregates labor and sells it to intermediate firms. The technology used by the aggregator is H_t = \left[\int_0^1 h_{j,t}^{1/\lambda_w} \, dj\right]^{\lambda_w}, with 1 ≤ λ_w < ∞. It is straightforward to show that the relation between the labor aggregate and the wage aggregate, W_t, is given by h_{j,t} = [W_t / w_{j,t}]^{\lambda_w/(\lambda_w - 1)} H_t. As in CEE, I assume some households do not change nominal wages (with exogenous probability ξ_w). In those cases, wages are set according to the rule w_{j,t} = π_{t-1} μ_z w_{j,t-1}. The term μ_z is needed to avoid wage dispersion along the steady state growth path, where μ_z corresponds to the growth rate of z_t^+ ≡ Υ_t^{α/(1-α)} z_t and z is a neutral technology shock defined below.

3.2. Firms and the loan market

The economy consists of two types of firms: final- and intermediate-good firms. The former behave competitively while the latter enjoy some monopoly power. At time t, a consumption good, Y_t, is produced by the perfectly competitive firm.
This good is produced by combining a continuum of intermediate goods indexed by i ∈ [0, 1], according to the technology Y_t = [∫₀¹ y_t(i)^{1/λ_f} di]^{λ_f}. Here, 1 ≤ λ_f < ∞ measures the degree of substitutability among intermediate goods. Perfect competition implies that the final-good firm takes output and input prices as given. Each intermediate-good firm has monopoly power in the supply of good i. This good is produced using the technology y_t(i) = K_t(i)^a (z_t h_t(i))^{1−a} − Φ z*_t, where 0 < a < 1, h(i) and K(i) denote labor and capital used by firm i, and Φ is a fixed cost that guarantees profits are zero in the steady state. Intermediate firms rent capital and labor in perfectly competitive factor markets. Furthermore, these firms must borrow the wage bill in advance from a financial intermediary at the gross interest rate, R_t. Finally, prices are set in a Calvo fashion. That is, each period, an intermediate-good firm revises its price with an exogenous probability 1 − ξ_p. If this firm does not re-optimize its price, then its price is updated according to the rule P_t(i) = π_{t−1} P_{t−1}(i), where π_{t−1} is inflation over the previous period. The log-deviations of the growth rates of z and Υ, μ̂_z and μ̂_Υ, follow AR(1) processes: μ̂_{z,t} = ρ_z μ̂_{z,t−1} + ε_{z,t} and μ̂_{Υ,t} = ρ_Υ μ̂_{Υ,t−1} + ε_{Υ,t}, where 0 < ρ_z < 1, and ε_{z,t} and ε_{Υ,t} are iid shocks with standard deviations σ_z and σ_Υ, respectively.

Banks receive M_t − Q_t + (x_t − 1)M_t from households (which includes the monetary transfer from the government, (x_t − 1)M^a_t). Note that the equilibrium condition, M_t = M^a_t, has been imposed. Banks in turn lend money to firms and request a gross interest rate R_t. At the end of the period, the financial intermediary transfers principal plus interest payments to households. Market clearing in the loan market requires W_t H_t = x_t M_t − Q_t.

3.3.
Monetary and fiscal policy

Monetary policy has a parsimonious representation:

x̂_t = x̂_{z,t} + x̂_{Υ,t} + x̂_{M,t},
x̂_{M,t} = ρ_M x̂_{M,t−1} + ε_{M,t},
x̂_{z,t} = ρ_{xz} x̂_{z,t−1} + B_z ε_{z,t} + B¹_z ε_{z,t−1},
x̂_{Υ,t} = ρ_{xΥ} x̂_{Υ,t−1} + B_Υ ε_{Υ,t} + B¹_Υ ε_{Υ,t−1}.    (7)
Here, x represents the gross growth rate of the aggregate stock of money, M^a_{t+1}/M^a_t. The disturbance ε_M denotes a monetary policy shock with standard deviation σ_M. The terms x̂_{z,t} and x̂_{Υ,t} capture the response of the monetary authority to innovations in neutral and capital-embodied technology, respectively. An accommodative monetary policy like (7) improves the model's predictions with respect to the effects of technology shocks (see, for example, ACEL). Finally, the government adjusts lump-sum taxes to ensure that its intertemporal budget constraint is satisfied. The experiments outlined in Section 5 analyze the effects of financial innovations by assuming that every aspect of the economy except the portfolio selection part remains unchanged. Hence, the monetary policy described in Eq. (7) is a good
P.A. Guerron-Quintana / Journal of Monetary Economics 56 (2009) 255–266
Table 2
Parameter values for DSGE model used in simulations in Section 5.

β = 0.99 (discount factor); a = 0.36 (capital share); δ = 0.025 (depreciation); ψ_L = 1 (labor utility); μ_z = 1.00, μ_x = 1.017 (growth rates); b = 0.70 (habit formation); ρ_Υ = 0.24, ρ_m = 0.03, ρ_z = 0.90 (persistence parameters); λ_f = 1.01, λ_w = 1.05 (elasticity goods and labor); ξ_w = 0.72, ξ_p = 0.82 (Calvo probabilities); κ = 3.28 (cost of investment); σ_m = 0.33, σ_Υ = 0.30, σ_z = 0.07 (volatility parameters); B_z = 1.42, B¹_z = 0.25, B_Υ = 0.13, B¹_Υ = 3.00 (monetary rule parameters); ρ_{xz} = 0.33; V = 0.45 (velocity).

Notes: Values taken from Altig et al. (2005) and Christiano et al. (2005).
approximation of the Fed's policy prior to the financial innovations of the early 1980s (see Christiano et al., 1999; Friedman and Kuttner, 1996; Walsh, 2003).

3.4. Parameterization

Table 2 reports the parameters used for the simulations reported in Section 5; the values are taken from CEE and ACEL. A crucial parameter in the model is the Calvo probability of money adjustment, ξ. The related literature offers little guidance on how to choose this parameter. One possibility is to resort to Vissing-Jorgensen's (2002) study of investors' preferences. However, setting ξ to match her findings is a poor approximation, since my study does not directly model the stock market. Here, because the money demand equation (4) is a moment restriction, I choose to estimate it via GMM. Such an approach allows us to infer the semi-elasticities and the Calvo probability associated with money directly from the data.

4. Estimating the demand for money

Data are U.S. quarterly and the sample period covers 1960:I–2005:IV. All data come from the St. Louis Fed. Nominal consumption is measured by personal consumption expenditure on non-durable goods (GCN) plus personal consumption expenditure on services (GCS). The price index is measured by the ratio of nominal to real output. The gross interest rate, R_t, is computed using the 3-month Treasury bill rate (TB3MS). Output corresponds to real chain-weighted output (GDPQ). The results in Goldfeld and Sichel (1990) and Guerron-Quintana (2007a) indicate that: (1) M1 is an inappropriate measure of money for transaction purposes and (2) M1-based money demand is unstable. Although M2 might seem an obvious replacement for M1, this monetary aggregate includes small time deposits issued by financial institutions, which do not measure money spent in transactions as required by my formulation. I propose measuring transaction balances, Q_t in the model, with M2-minus (M2MSL). This monetary indicator, available from the St. Louis Fed, comprises M2 minus small-denomination time deposits. Hence, velocity is defined as the log of the ratio of nominal income to M2-minus. The resulting variable is stationary, which is desirable for GMM estimation.

4.1. Results

Based on the studies of Eichenbaum and Fisher (2007), Gali and Gertler (1999), and Staiger and Stock (1997), I propose the following set of instruments: X_{t−1} = {1, g_{c,t−j}, V_{t−j}, R_{t−j}; j = 1, 2}. As shown in Guerron-Quintana (2007a), using additional lags, alternative instruments, or different measures of interest rates and money (e.g., the fed funds rate and money zero maturity) does not significantly change the results reported in Table 3. Optimal weighting matrices are used in the computation of the GMM estimates. The column J_T corresponds to the p-value of Hansen's statistic for overidentifying restrictions. Values in parentheses are standard deviations. The estimation exercise indicates that all coefficients are statistically significant. Hansen's J statistic does not reject the model. More important, the estimated parameters concur with the model's predictions: (1) the statistical significance of φ₂ indicates that the data support the presence of a forward-looking term in money demand and (2) there is a substantial difference between short- and long-run elasticities; a permanent increase in interest rates of 100 basis points implies an immediate decline of 1.04% and a long-run decline of 13.16% in real balances. The first figure is close to the low estimates reported in the literature (e.g., ACEL; Goldfeld and Sichel, 1990). Taking into account sampling error, the long-run estimate also falls in line with values found in the literature. Mankiw and Summers (1986), for example, find an interest semi-elasticity of 11. Furthermore, Stock and Watson's (1993) study reports long-run semi-elasticities around 12.
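The moment condition underlying this exercise is linear in the φ coefficients, so the mechanics of the estimation can be sketched compactly. Below is an illustrative two-step linear GMM on synthetic AR(1) data (the series, coefficient values, and part of the instrument set are stand-ins, not the paper's data); the expectational term is replaced by its realized value, with the forecast error folded into the residual to which the t-dated instruments are orthogonal:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 3000

def ar1(rho, n):
    """Persistent synthetic series, so that lags are informative instruments."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    return x

V, gc, pi = ar1(0.9, T), ar1(0.5, T), ar1(0.8, T)  # velocity, cons. growth, inflation
phi = np.array([1.5, -0.7, 0.9])                   # made-up (phi1, phi2, phi3)

# Regressors dated t = 2, ..., T-2 (one lead and one lag are needed):
X = np.column_stack([
    V[2:-1],                          # V_t
    V[3:] - gc[3:] - pi[3:],          # V_{t+1} - g_{c,t+1} - pi_{t+1} (realized)
    pi[2:-1] + gc[2:-1] + V[1:-2],    # pi_t + g_{c,t} + V_{t-1}
])
R = X @ phi + 0.1 * rng.standard_normal(len(X))    # synthetic interest rate

# Instruments: a constant and two lags each of the predetermined series
# (inflation stands in for the lagged-rates block, purely for illustration)
Z = np.column_stack([np.ones(len(X)),
                     gc[1:-2], gc[:-3], V[1:-2], V[:-3], pi[1:-2], pi[:-3]])

# Two-step GMM for the linear moment condition E[Z'(R - X phi)] = 0
W1 = np.linalg.inv(Z.T @ Z)                         # first-step weighting
A = X.T @ Z
b1 = np.linalg.solve(A @ W1 @ A.T, A @ W1 @ Z.T @ R)
u = R - X @ b1
S = (Z * u[:, None]).T @ (Z * u[:, None]) / len(u)  # moment covariance estimate
W2 = np.linalg.inv(S)                               # optimal weighting matrix
b2 = np.linalg.solve(A @ W2 @ A.T, A @ W2 @ Z.T @ R)
```

The second step re-weights the moments by the inverse of their estimated covariance, which is the "optimal weighting matrix" referred to in the text; `b2` recovers the φ vector up to sampling error.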
ARTICLE IN PRESS 262
P.A. Guerron-Quintana / Journal of Monetary Economics 56 (2009) 255–266
Table 3
GMM estimates of the money demand equation R̂_t = φ₁V̂_t + φ₃(π̂_t + ĝ_{c,t} + V̂_{t−1}) + E_t[φ₂(V̂_{t+1} − ĝ_{c,t+1} − π̂_{t+1})].

Sample            φ₁           φ₂           φ₃           ζ_short      ζ_long        η′            η″            ξ            J_T
1960:I–2005:IV    1.59 (0.72)  0.70 (0.36)  0.88 (0.35)  1.04 (0.40)  13.16 (3.34)  0.054 (0.54)  0.00 (2e−4)   0.77 (2e−3)  0.14
1960:I–1983:IV    0.91 (0.42)  0.45 (0.21)  0.44 (0.21)  3.00 (1.72)  16.67 (3.92)  0.051 (0.24)  0.00 (1e−4)   0.84 (0.19)  0.27
1984:I–2005:IV    1.47 (0.53)  0.46 (0.29)  0.95 (0.27)  0.42 (0.11)  4.72 (1.23)   0.034 (0.07)  0.004 (5e−4)  0.67 (0.01)  0.87

Notes: Instruments are a constant and two lags of interest rates, the growth rate of consumption, and velocity. Standard errors are in parentheses and J_T is the p-value of Hansen's overidentification test. Data are described in Section 4 of the main text. Definitions: R interest rate, π inflation, g_c growth rate of consumption, ζ semi-elasticity of money demand, η′ and η″ transaction cost parameters, and ξ money demand Calvo probability.
The results described in this section and reported in Table 3 illustrate this paper's first main point: different elasticities reported in the literature are not puzzling after taking into account the short- and long-run dynamics of money. Furthermore, the statistical and economic significance of the model indicates that the forward-looking money equation (4) satisfactorily describes the data covering 1960–2005.

Subsample analysis: As previously discussed, the late 1970s and early 1980s featured continuous financial innovations, including electronic banking and new banking regulations. Additionally, several authors have argued that the U.S. economy became less volatile in the early 1980s, resulting in what Stock and Watson (2002) call the Great Moderation. Motivated by these observations, the money demand equation is re-estimated splitting the sample around 1984 (Sims and Zha, 2004, identify this year as the most likely start of the moderation). The results in Table 3 reveal that the money demand elasticities are closer to each other after 1984.⁹ To observe this pattern, compare ζ_short = 0.42 and ζ_long = 4.72 in the final sample with ζ_short = 3.00 and ζ_long = 16.67 in the initial period. The difference between the short- and long-run elasticities declines by a factor of three in the post-1984 sample. Guerron-Quintana (2007a) shows that this decline in the elasticities is robust to alternative breaking years. The elasticities' closeness in the second sample suggests that financial innovations in the 1970s and 1980s might entail reduced transaction costs, making it easier for households to adjust their money balances.¹⁰ Indeed, we observe in Fig. 1 that the arrival of financial innovations nicely coincides with changes in interest rates and velocity. In the model, easier portfolio adjustment and a diminishing distinction between the short-run and long-run dynamics of money are only possible with a decline in the Calvo probability ξ.
As the next section discusses, this decline is exactly what happens with the implied Calvo probabilities.

Implied frequency of re-optimization: The model's expected time between portfolio rebalancings is given by 1/(1 − ξ). We can use the GMM results to infer the implied frequency of re-optimization. In particular, the structural parameters are computed by solving the non-linear mapping F̃ − F(x) = 0. Here, x = [η′ η″ ξ]′, F̃ is the vector of estimated coefficients from our GMM procedure, η″ is the second derivative of the cost function, and F(x) is the vector mapping the structural parameter space into the reduced-form coefficient space (the interested reader is referred to the technical appendix for further details). According to the results reported in Table 3, the Calvo probability, ξ, ranges from 0.67 to 0.84, meaning that households re-optimize their portfolios every 6 quarters on average at the upper bound, whereas they review their financial decisions every 3 quarters at the lower bound. Under the full-sample scenario, the Calvo probability is 0.77. This large probability (and its implied frequency of re-optimization) is close to the microdata-based values reported in Vissing-Jorgensen (2002). Moreover, the larger Calvo probability in the first subsample implies that households waited longer before re-optimizing in the pre-1984 sample than in the post-1984 sample. Indeed, this result suggests that re-balancing money holdings became easier after 1984, thus confirming the hypothesis that money balances have become more flexible over time. Finally, the estimated slope of the transaction cost function, η′, is close to the 0.036 found in ACEL. For the curvature of the transaction cost function, η″, the estimation yields numbers close to zero, the largest being 0.004. To understand this result, note that the long-run semi-elasticity of money is proportional to the term (2η′ + η″V)⁻¹; here, V is the steady state velocity. Hence, small values of η″ seem necessary to deliver a large long-run semi-elasticity.¹¹
⁹ Lagged inflation and (π̂_{t−1} + ĝ_{c,t−1} + V̂_{t−2}) were included as additional instruments to improve the model's statistical fit.
¹⁰ Using a different methodology, Khan et al. (2003) uncover a decline in the elasticity of money demand.
¹¹ The slope of the transaction cost function is pinned down by the steady state relation R = 1 + η′V².
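The durations quoted above follow from the standard Calvo formula for the expected time between adjustments, 1/(1 − ξ). A quick check using the three estimates from Table 3:

```python
# Expected number of quarters between portfolio re-optimizations: 1 / (1 - xi)
for xi in (0.84, 0.77, 0.67):   # pre-1984, full-sample, post-1984 estimates
    print(f"xi = {xi:.2f}: {1.0 / (1.0 - xi):.2f} quarters")
# -> 6.25, 4.35, and 3.03 quarters, i.e., roughly 6, 4, and 3 quarters
```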
Table 4
Volatilities from DSGE model with different portfolio stickiness.

ξ                      σ_{m/p}  σ_y   σ_c   σ_i   σ_π   σ_r
Baseline model
0.84 (6 quarters)ᵃ     0.20     0.42  0.20  1.14  1.09  2.41
0.75 (4 quarters)ᵇ     1.35     0.78  0.86  0.77  0.85  0.76
0.67 (3 quarters)ᵇ     1.53     0.66  0.77  0.65  0.67  0.58
No nominal frictions
0.84 (6 quarters)ᵃ     0.71     0.27  0.21  0.44  3.04  1.54
0.75 (4 quarters)ᵇ     1.30     0.84  0.86  0.74  0.70  0.64
0.67 (3 quarters)ᵇ     1.45     0.76  0.85  0.63  0.56  0.45

Notes: Standard deviations of (1) annual growth rates of output (y), consumption (c), investment (i), and real balances (m/p) and (2) annualized interest rates (r) and inflation (π). ᵃ Volatilities expressed in percentage points. ᵇ Volatilities expressed as a fraction of those for ξ = 0.84.
5. Financial sophistication and the Great Moderation

In the last section, I argued that financial innovations in the 1970s caused the structural break found in the demand for money. It now remains to show the link between the change in money demand and the Great Moderation in a DSGE framework. Two elements are essential to understanding this connection: (1) sluggish portfolios and (2) households' use of cash to purchase consumption. To begin with, sticky portfolios imply that current money holdings were determined in the past. Unable to replenish cash balances, households must adjust consumption in response to shocks so that marginal utility remains constant, as required by the complete-markets condition (3). Hence, volatility from shocks is passed to consumption. In contrast, when portfolio re-balancing is more flexible, households can efficiently adjust cash balances to buffer shocks and achieve consumption smoothing. In terms of condition (3), money absorbs fluctuations in prices to keep consumption constant. Yet, one may wonder to what extent financial innovations alone can account for the smoothing in the economy. To formally answer this question, consider the following experiment: start by selecting three degrees of portfolio stickiness: ξ = 0.84, 0.75, and 0.67. The first and last values are the pre- and post-1984 GMM estimates, respectively. They are intended to capture the state of the economy before and after the financial innovations. Next, for each Calvo probability, simulate the model outlined in Section 3 using the monetary, neutral technology, and investment-specific shocks reported in ACEL. Finally, compute the theoretical standard deviations for real and nominal variables and compare them to the empirical ones.¹² Table 4 reports the theoretical volatilities for different degrees of financial innovation (baseline model).
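The percentage changes discussed in this section can be read off Table 4 directly. A quick check using the baseline rows (relative volatilities at ξ = 0.67, expressed as a fraction of the ξ = 0.84 case):

```python
# Baseline-model relative volatilities at xi = 0.67 (fraction of xi = 0.84 case)
rel = {"output": 0.66, "consumption": 0.77, "investment": 0.65}
declines = {k: round((1.0 - v) * 100) for k, v in rel.items()}
# declines: output 34%, consumption 23%, investment 35%
# (the text rounds output and investment together to 34%)
```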
To facilitate comparison with the data, the volatilities are reported relative to that of the highest portfolio sluggishness, ξ = 0.84. Two clear patterns emerge from this exercise. First, interest rates are substantially more volatile as portfolio adjustment becomes less frequent. To understand this result, note that large Calvo probabilities reduce participation in financial markets, since only a small fraction of households re-balance portfolios. Hence, following an expansionary monetary policy, interest rates must decline by a large amount to induce the few households active in the market to take the extra money. This large drop translates into substantial interest rate volatility. As the Calvo probability declines, more households are able to accept the additional funds, requiring a smaller drop in the interest rate, i.e., smoother time paths. The second pattern is that moving toward portfolio flexibility drives down the volatility of output and investment growth by 34% and that of consumption growth by 23%; such drops correspond to two-thirds of those found in the data (Table 1). It was already argued that smooth consumption results from households' ability to optimize their portfolio more frequently. Optimality in turn requires that the rate of return on capital co-moves with interest rates.¹³ Other things equal, a decline in the volatility of the latter is accompanied by a smaller volatility in the former. Because the return on capital drives investment, more frequent portfolio re-balancing must be followed by a drop in the volatility of investment. Finally, the resource constraint, y_t = c_t + i_t, indicates that less volatility in consumption and investment delivers smaller fluctuations in output.

To gauge the effects of additional portfolio flexibility, Fig. 2 displays simulated paths for interest rates and growth rates of output, consumption, and real balances under three scenarios: (1) the pre-financial-innovation era characterized by high portfolio inflexibility, ξ = 0.84 (starred lines); (2) the post-financial-innovation period with flexible portfolios, ξ = 0.67 (solid lines); and (3) the same as in case 2 but with shocks 30% smaller (thick, dotted lines). To make the comparison meaningful, the shocks are common across the simulations. For convenience, the horizontal axis has been labeled with the years 1980–2000. The simulations for the last two cases start in 1984 to represent the beginning of the Great Moderation.
¹² Guerron-Quintana (2007b) discusses alternative ways to conduct this experiment.
¹³ Abstracting from adjustments in capital, optimal behavior equates the return on capital with the real interest rate, i.e., r^k_{t+1} + 1 − δ = R_t/π_{t+1}. Here, r^k is the return on capital and δ is the rate of depreciation.
[Fig. 2 panels: Output, Consumption, Real Balances, Interest Rate; vertical axes in percentage points from mean; horizontal axes labeled 1980–2000.]

Fig. 2. Simulated paths from the DSGE model with different portfolio stickiness: high, ξ = 0.84 (starred line); medium, ξ = 0.67 (solid line); and medium with small shocks (dotted line). Annualized levels for interest rates and annual growth rates for output, consumption, and real balances.
This counterfactual exercise shows that (a) if portfolio stickiness had stayed at its pre-financial-innovation levels, the economy would have remained highly volatile; (b) the decline in the probability of portfolio adjustment found in the data has indeed contributed to smoothing the economy; and (c) the smoothing comes at the expense of more volatile real balances. This third result precisely confirms the workings of the model: portfolio flexibility allows households to transfer volatility from consumption to money. It is reassuring to see that real balances have indeed become more volatile in the data. Scenario 3 reveals that when flexible portfolios are combined with smaller shocks, the model predicts an even stronger smoothing of the economy. This last case is most likely what has happened over the past decades in the U.S. Yet the results must be interpreted carefully. ACEL report that monetary and technology shocks combined explain around 50% of the variance of real and nominal variables at business cycle frequencies. Indeed, output growth had a volatility of 1.1% for the 1960–1983 period, while the model predicts a volatility of 0.42% for a Calvo probability of ξ = 0.84; recall that this high probability corresponds to the pre-1984 period. An implication of these observations is that the change in the volatilities reported in Table 4 captures at most half of the variations found in the data. Consequently, when moving from a portfolio sluggishness of roughly 6 quarters to 3 quarters, the model accounts for one-third of the decline in the volatilities of output, consumption, and investment. In terms of real balances, the model predicts an increase in volatility of 50%. This prediction probably overstates reality because of the limited explanatory power of the model. Factoring in this effect implies that the model predicts a volatility increase for real balances of 25%, which is roughly two-thirds of the rise observed in the data (see Table 1).
The baseline model includes several assumptions in addition to sticky portfolio adjustment. To check whether the results are an artifact of the extra frictions, I repeat the simulations but exclude price and wage stickiness, habit formation, and adjustment costs in investment. The columns labeled No nominal frictions in Table 4 display the findings from the new scenario. As before, the volatilities are expressed relative to those under the highest portfolio sluggishness, ξ = 0.84. The new results are in line with those from the baseline model, i.e., a more flexible portfolio implies smoother real variables. Yet the smoothing of output and consumption is smaller than in the baseline scenario, while the smoothing of nominal variables is stronger. For instance, moving portfolio adjustment from every 6 quarters to every 3 quarters decreases the relative volatility of output by 24 percentage points (compare this result to the 34% decline found in the
baseline scenario). These findings are hardly surprising because, in the absence of sluggish price and wage adjustments, the model is closer to displaying Modigliani's (1963) classical dichotomy. In other words, monetary shocks now induce fluctuations mostly in nominal variables. Goldfeld and Sichel (1990) and Dynan et al. (2006) have previously highlighted the potential smoothing effects of financial innovations. However, these authors provide only indirect arguments about the effects of financial sophistication on the Great Moderation. Hence, my results can be seen as a formalization of their intuition. More important, this paper is closely related to those of Jermann and Quadrini (2007) and Mertens (2007). The first paper argues that financial innovations improved debt and equity financing in the business sector. Consequently, firms can dampen the effects of asset price shocks, which in turn translate into smoother production plans. Our studies share the feature that flexible financial markets allow agents (households or firms) to transfer volatility from the real side to the financial sector of the economy (from consumption to real balances in my case, and from production to equity/debt in theirs). Mertens (2007) shows that in the presence of deposit rate ceilings, banks do not have incentives to lend resources to firms. With low financial intermediation, his study continues, firms fail to fully smooth out production. Hence, his results suggest that the removal of interest rate caps (the introduction of the DIDMCA legislation in 1980) contributed to the Great Moderation. Our studies both predict an increase in the volatility of real balances and a smoothing of economic activity following more flexible financial markets. However, the channels behind the moderation are different; his study attributes it to increased flexibility in financial intermediation (reflected in the absence of interest rate caps), while my study gives an essential role to flexibility on the household side of the economy (more frequent portfolio re-balancing). In summary, my study complements the existing literature on financial innovations.
6. Concluding remarks

This paper has expanded the field's understanding of money demand in three ways. First, I put forth a micro-founded dynamic money demand equation. To derive this equation, I show that the tools from price-setting models can help manage the heterogeneity on the economy's household side. The second contribution is to show that my model of money demand simultaneously accounts for low short-run interest elasticities and high long-run elasticities of money demand. Estimates of these elasticities are in line with the findings in the literature. Finally, this paper offers a simple explanation for the change in real balances, output, and consumption volatility after 1984. Given estimates from the pre- and post-1984 samples, consumption and output have been smoother since the financial innovations of the late 1970s facilitated portfolio re-balancing.
Acknowledgments

I am greatly indebted to Martin Eichenbaum for his continuous support and countless suggestions that have improved this paper. I have also benefited from extensive comments by Larry Christiano, the editor Robert King, Alex Monge-Naranjo, and an anonymous referee. Finally, I am grateful to Martha Carrillo, Jesus Fernandez-Villaverde, Tom Grennes, Lyndon Moore, Doug Pearce, Viswanath Pingali, Giorgio Primiceri, Annette Vissing-Jorgensen, and seminar participants at NCSU, Rutgers University, the Bank of Canada, Northwestern University, and the 2007 Summer Meetings of the Econometric Society for comments and suggestions. All remaining errors are mine.

References

Altig, D., Christiano, L., Eichenbaum, M., Linde, J., 2005. Firm-specific capital, nominal rigidities and the business cycle. NBER Working Paper 11034.
Campbell, J., Hercowitz, Z., 2004. The role of collateralized household debt in macroeconomic stabilization. Mimeo, Federal Reserve Bank of Chicago.
Campbell, J., Viceira, L., 2002. Strategic Asset Allocation: Portfolio Choice for Long-Term Investors. Oxford University Press, Oxford.
Christiano, L., Eichenbaum, M., Evans, C., 1999. Monetary policy shocks: What have we learned and to what end? In: Woodford, M., Taylor, J. (Eds.), Handbook of Macroeconomics, vol. 1A. North-Holland, Amsterdam.
Christiano, L., Eichenbaum, M., Evans, C., 2005. Nominal rigidities and the dynamic effects of a shock to monetary policy. Journal of Political Economy 113, 1–45.
Clarida, R., Gali, J., Gertler, M., 2000. Monetary policy rules and macroeconomic stability: evidence and some theory. Quarterly Journal of Economics 115, 147–180.
Dynan, K., Elmendorf, D., Sichel, D., 2006. Can financial innovation help to explain the reduced volatility of economic activity? Journal of Monetary Economics 53, 123–150.
Eichenbaum, M., Fisher, J., 2007. Estimating the frequency of price re-optimization in Calvo-style models. Journal of Monetary Economics 54, 2032–2047.
Engel, C., West, K., 2005. Exchange rates and fundamentals. Journal of Political Economy 113, 485–517.
Erceg, C., Henderson, D., Levin, A., 2000. Optimal monetary policy with staggered wage and price contracts. Journal of Monetary Economics 46, 281–313.
Fernandez-Villaverde, J., Rubio-Ramirez, J., 2007. Estimating macroeconomic models: a likelihood approach. Review of Economic Studies 74, 1059–1087.
Friedman, B., Kuttner, K., 1996. A price target for U.S. monetary policy? Lessons from the experience with money growth targets. Brookings Papers on Economic Activity 1, 77–125.
Friedman, M., 1959. The demand for money: some theoretical and empirical results. Journal of Political Economy 67, 327–351.
Gali, J., Gertler, M., 1999. Inflation dynamics: a structural econometric analysis. Journal of Monetary Economics 44, 195–222.
Goldfeld, S., 1976. The case of the missing money. Brookings Papers on Economic Activity 3, 683–740.
Goldfeld, S., Sichel, D., 1990. The demand for money. In: Friedman, B., Hahn, F. (Eds.), Handbook of Monetary Economics, pp. 299–356.
Guerron-Quintana, P., 2007a. Heterogeneous portfolio adjustment, the demand for money, and the great moderation. Working paper, Department of Economics, North Carolina State University.
Guerron-Quintana, P., 2007b. Financial innovations: an alternative explanation to the great moderation. Working paper, Department of Economics, North Carolina State University.
Guerron-Quintana, P., 2008. Refinements on macroeconomic modelling: the role of non-separability and heterogeneous labor supply. Journal of Economic Dynamics and Control 32, 3613–3630.
Jermann, U., Quadrini, V., 2007. Financial innovations and macroeconomic volatility. Mimeo, University of Southern California.
Justiniano, A., Primiceri, G., 2008. The time varying volatility of macroeconomic fluctuations. American Economic Review 98, 604–641.
Khan, A., King, R., Wolman, A., 2003. Optimal monetary policy. Review of Economic Studies 70, 825–860.
Lucas Jr., R., 1988. Money demand in the United States: a quantitative review. Carnegie-Rochester Conference Series on Public Policy 29, 137–167.
Lucas Jr., R., 2000. Inflation and welfare. Econometrica 68, 247–274.
Mankiw, G., Summers, L., 1986. Money demand and the effects of fiscal policies. Journal of Money, Credit and Banking 18, 415–429.
McConnell, M., Perez-Quiros, G., 2000. Output fluctuations in the United States: What has changed since the early 1980s? American Economic Review 90, 1464–1476.
Meltzer, A., 1963. The demand for money: the evidence from the time series. Journal of Political Economy 71, 219–246.
Mertens, K., 2007. How the removal of deposit rate ceilings has changed monetary transmission in the US: theory and evidence. Mimeo, Department of Economics, Cornell University.
Modigliani, F., 1963. The monetary mechanism and its interaction with real phenomena. Review of Economics and Statistics 45, 79–107.
Schmitt-Grohe, S., Uribe, M., 2004. Optimal operational monetary policy in the Christiano–Eichenbaum–Evans model of the U.S. business cycle. Mimeo, Department of Economics, Duke University.
Sienkiewicz, S., 2002. The evolution of EFT networks from ATMs to new online debit payment products. Discussion paper, Payment Cards Center, Federal Reserve Bank of Philadelphia.
Sims, C., 1994. A simple model for study of the determination of the price level and the interaction of monetary and fiscal policy. Economic Theory 4, 381–399.
Sims, C., Zha, T., 2004. Were there regime switches in U.S. monetary policy? Mimeo, Department of Economics, Princeton University.
Staiger, D., Stock, J., 1997. Instrumental variables regression with weak instruments. Econometrica 65, 557–586.
Stock, J., Watson, M., 1993. A simple estimator of cointegrating vectors in higher order integrated systems. Econometrica 61, 783–820.
Stock, J., Watson, M., 2002. Has the business cycle changed and why? NBER Working Paper 9127.
Vissing-Jorgensen, A., 2002. Towards an explanation of household portfolio choice heterogeneity: nonfinancial income and participation cost structures. NBER Working Paper 8884.
Walsh, C., 2003. Monetary Theory and Policy. MIT Press, Cambridge, MA.
Woodford, M., 2005. Firm-specific capital and the New-Keynesian Phillips curve. Working paper, Department of Economics, Columbia University.
Journal of Monetary Economics 56 (2009) 267–274
Sovereign debt auctions: Uniform or discriminatory?

Menachem Brenner ᵃ, Dan Galai ᵇ, Orly Sade ᵃ,ᵇ

ᵃ Stern School of Business, New York University, 44 West 4th Street, New York, NY 10012, USA
ᵇ Jerusalem School of Business, Hebrew University of Jerusalem, Israel
Article history: Received 28 November 2007; received in revised form 17 December 2008; accepted 18 December 2008; available online 13 January 2009.

JEL classification: G1, F3.

Keywords: Uniform auction; Discriminatory auction; Treasury bonds; T-bills.

Abstract: Many financial assets, especially government bonds, are issued by an auction. An important feature of the design is the auction pricing mechanism: uniform versus discriminatory. Theoretical papers do not provide a definite answer regarding the dominance of one type of auction over the other. We investigate the revealed preferences of the issuers by surveying the sovereign issuers that conduct auctions. We find that the majority of the issuers/countries in our sample use a discriminatory auction mechanism for issuing government debt. We use a multinomial logit procedure and discriminant analysis to investigate the mechanism choice. Interestingly, market-oriented economies and those that practice common law tend to use a uniform method, while economies that are less market oriented and practice civil law tend to use discriminatory price auctions. © 2009 Elsevier B.V. All rights reserved.
1. Introduction
There is a long-standing debate regarding the auction system that a sovereign should use when it issues debt instruments. The most common pricing rules are the uniform and the discriminatory.1 Our objective is to analyze the choices made by countries around the globe and what may explain these choices. As early as 1960, Friedman argued that a discriminatory auction will drive out uninformed participants because of the "winner's curse" and attract better informed, typically large, players. Thus, the discriminatory auction will be more susceptible to collusion than the uniform one. He predicted that the discriminatory auction would lead to lower revenues. Alternatively, a uniform price mechanism would lead to wider participation, which should result in less collusion and higher revenues. It is puzzling, therefore, to find that most countries in our study use the discriminatory price mechanism. The academic literature since Friedman (1960) is not conclusive regarding the optimal pricing mechanism that countries should use in sovereign debt auctions. Both pricing mechanisms are used in practice. Also, several countries in our sample switched from one pricing rule to another (see, for example, the US experiment).2
Corresponding author at: Stern School of Business, New York University, 44 West 4th Street, NY, NY 10012, USA. Tel.: +1 212 998 0336.
E-mail addresses: [email protected] (M. Brenner), [email protected] (D. Galai), [email protected] (O. Sade).
1 In the uniform price auction (UPA), also known as a single price auction, the objects are awarded to the bidders that bid above the market clearing price, and all winning bidders pay the (same) market clearing price. In the discriminatory auction (DA), also known as a multiple price auction, the objects are likewise awarded to the bidders that bid above the clearing price, but each winning bidder pays the price that he bid.
2 The so-called "Salomon Squeeze" in May 1991 (Jagadeesh, 1993) triggered an examination of the auctioning system, in particular the pricing rule. Though the experiment did not show a significant revenue improvement of the uniform auction over the discriminatory one, there were additional considerations in the decision to switch to the uniform auction (Malvey et al., 1995; Malvey and Archibald, 1998).
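The two pricing rules defined in footnote 1 can be illustrated with a minimal sketch (single-unit bids and round numbers invented for illustration; not data from the paper):

```python
def auction_payments(bids, supply, rule):
    """Award `supply` identical units to the highest single-unit bids and
    compute payments under a uniform or discriminatory pricing rule."""
    winners = sorted(bids, reverse=True)[:supply]
    clearing_price = winners[-1]          # lowest accepted bid clears the market
    if rule == "uniform":
        payments = [clearing_price] * len(winners)  # all winners pay the same price
    elif rule == "discriminatory":
        payments = winners[:]                       # each winner pays his own bid
    else:
        raise ValueError(f"unknown rule: {rule}")
    return clearing_price, payments, sum(payments)

bids = [101.5, 101.0, 100.5, 100.0, 99.5]
print(auction_payments(bids, 3, "uniform"))         # (100.5, [100.5, 100.5, 100.5], 301.5)
print(auction_payments(bids, 3, "discriminatory"))  # (100.5, [101.5, 101.0, 100.5], 303.0)
```

In this mechanical example the discriminatory rule raises more revenue, but that need not hold once bidders shade their bids in response to the rule; that strategic response is exactly why the theoretical comparison is inconclusive.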
0304-3932/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.012
Our research consists of two parts. First, we document the recent auction mechanisms employed by treasuries and central banks around the globe (their revealed preferences). In the second part, we analyze, in a cross-sectional setting, the factors that are potentially related to the choice of a mechanism by a country. We use several variables that have been used in the academic literature to study the relationship between financial development and economic growth. Given our results, we provide an explanation, consistent with our empirical findings, that takes into account the bargaining power of the three stakeholders: the issuer, the intermediaries and the investors.3 Though the primary market for government debt is one of the largest financial markets in the world, there is no source of public data that provides information about treasury auctions. This information can only be obtained by collecting data directly from each country. We have contacted treasury ministries and central banks around the globe and received answers from 48 countries. We have screened this unique data set and documented which country uses which mechanism (discriminatory, uniform, both or another pricing rule). Our sample consists of countries from different continents and of different economic sizes, including almost all (83%) OECD countries. Most countries use a discriminatory auction (24), while nine countries use a uniform auction. Some use both mechanisms (9), depending on the security being auctioned, while others use pricing rules which are neither uniform nor discriminatory (6). We investigate the factors which may explain the choice of an auction mechanism by a sovereign. We find that countries that have more market-oriented economies (as measured by capitalization/GDP) and practice common law tend to use a uniform price auction.
In other countries, where the financial environment is less developed and barriers to the public's participation in the auctions (direct or indirect) may exist, the central planner needs to be more attuned to the preferences of the intermediaries; if they prefer a discriminatory price auction, the central planner will adopt this mechanism. Our paper belongs to the growing literature on divisible-unit auctions. The theory does not tell us whether uniform auctions will generate higher revenue than discriminatory ones.4 This remains an empirical issue that our research is trying to contribute to. The relevant empirical work uses either an event study approach (e.g. the US experiment)5 or structural econometric models.6 The novelty of our approach is the application of a cross-section analysis to find explanatory variables for sovereign decisions.7 It makes a contribution to the literature on the relationship between country characteristics and financial development. The paper is organized as follows. Section 2 looks at the auction practices of different countries. Section 3 investigates the factors that affect a country's choice. Section 4 provides concluding remarks.
2. Auction methods used by issuers of government bonds
We first investigated the current practices used worldwide at treasury auctions. Since this information is not available in public databases, we conducted our own survey, which was sent (see Appendix A) via e-mail, mail and fax to central banks and treasuries around the globe.8 We received answers from 48 countries, listed in Table 1. The responses that we received show that 50% of the countries use a discriminatory price auction, about 19% use a uniform one, while about 19% use both methods, depending on the type of debt instruments being issued. The others, about 12%, use a method that is different from the two conventional ones (e.g. Austria).
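These shares follow directly from the counts in Table 1 (24 discriminatory, 9 uniform, 9 both, 6 other, out of 48 responses):

```python
# Counts by pricing rule from the 48 survey responses (Table 1).
counts = {"discriminatory": 24, "uniform": 9, "both": 9, "other": 6}
total = sum(counts.values())
shares = {rule: round(100 * n / total, 1) for rule, n in counts.items()}
print(total)   # 48
print(shares)  # {'discriminatory': 50.0, 'uniform': 18.8, 'both': 18.8, 'other': 12.5}
```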
Interestingly, even among countries with the same currency and relatively similar monetary policy (for example, the EU countries that use the Euro), different types of auction mechanisms are used. Finland, for example, which used a uniform price mechanism, does not use auctions anymore,9 while France and Germany use a discriminatory auction. We also find that in some countries the mechanism being used has changed over time (e.g. the US switched, in the 1990s, from a discriminatory mechanism to a uniform one, while Mongolia switched from a uniform auction to a discriminatory one). In about 50% of our sample, the country employed a different selling mechanism in the past than the one it currently uses. Some countries in our sample use more than one type of pricing rule to sell their debt instruments (e.g. Canada and Brazil). Some use a different auction mechanism to issue debt than to buy back debt (e.g. the USA).10 Given the different practices and the changes introduced by
3 The objective of the issuer, the treasury or the central bank, is to maximize revenues over time. The issuer is not only concerned with the next auction's revenues but also has long-term considerations, like the quality of the secondary market and the likelihood of collusion in the auction or the secondary market. The goal of the intermediaries, who serve as underwriters, dealers and brokers, is to maximize the profit from their activities. The third stakeholder is the public, including financial institutions, who invest in these debt instruments and would naturally like to pay the lowest possible price.
4 See, for example, Wilson (1979), Back and Zender (1993) and Ausubel and Cramton (2002) for theoretical evidence on strategic bidding in multiunit auctions.
5 The main issue with this approach (see in addition Tenorio, 1993; Umlauf, 1993) is that one cannot claim "ceteris paribus", i.e. that the economic conditions have not changed.
6 These papers (e.g. Hortaçsu, 2002) use a bidder's optimality condition to recover the distribution of the marginal valuations of the bidders. At its current stage, this literature does not provide a clear answer with respect to the mechanism choice.
7 A previous cross-country description of auction design issues is given in Bartolini and Cottarelli (1997). While their paper describes various aspects of the auction mechanism, our paper investigates recent practices and focuses on the determinants of the choice of the auction pricing rule.
8 The survey was sent to all the central banks whose e-mail addresses were listed in the Bank for International Settlements international directory, and to the treasuries and central banks whose e-mail addresses were listed on the IMF home page. In some cases, when we did not get a response, we used personal contacts to get answers to the survey.
9 Though it now considers reinstating them in the future.
10 See Han et al. (2007) for a description of the US treasury buyback auctions.
Table 1 Survey answers regarding the type of auctions used to sell sovereign debt in different countries around the world as of April–October 2005.
Discriminatory: Bangladesh+, Belgium+, Cambodia+, Cyprus+, Ecuador, France, Germany+, Greece+, Hungary, Israel+, Jamaica, Latvia+, Lithuania+, Macedonia, Malta+, Mauritius, Mongolia+, Panama+, Poland+, Portugal+, Solomon Islands, Sweden+, Turkey+, Venezuela
Uniform: Argentina, Australia, Colombia, Korea+, Norway, Singapore, Switzerland+, Trinidad and Tobago, USA+
Both: Brazil, Canada+, Ghana, Italy, Mexico+, New Zealand+, Sierra Leone+, Slovenia, United Kingdom+
Other: Austria+, Finland+a, Luxembourg, Fiji+, Ireland+, Japan
The table describes the auction mechanism employed by the countries in our sample. + indicates that the treasury or the central bank has the right to change the quantity being auctioned. For more details about the specific auction in each country, see Appendix A.
a At the time of the survey, Finland indicated that it does not use auctions to sell its debt. Yet after the survey was conducted, we received information that Finland is considering using uniform auctions again in the future.
some countries,11 it is clear that theoretical, experimental and/or empirical research on auction design would be of great interest to a variety of issuers, be they governments or corporations. Thus, we examine the features which make up the profile of a country to see if there are common factors associated with one auction design or another.
3. What may affect the choice of an auction mechanism by a country?
Given the potential consequences of the mechanism choice for the revenue obtained and the subsequent activity in the secondary markets, we investigate the possible factors that may affect this choice. As stated above, the cross-section analysis, done here for the first time, looks for specific characteristics that affect the mechanism choice. There is no auction-related model that provides specific guidelines as to the variables that we should include in the empirical investigation. We decided to use a set of macro variables that have been used in studying macro-finance issues and that seemed appropriate in our context. The first set of variables is related to the risk of the assets being auctioned, more specifically the credit risk of the sovereign. Anecdotal evidence from the UK (Leong, 1999) suggests that the UK took into account the potential level of the "winner's curse", due to the riskiness of the asset auctioned, in its determination of the auction price mechanism. The second set of variables is related to the specific characteristics of the country that issues the debt and the characteristics of its financial markets. We, thus, examined the recent literature which investigates the different global financial systems, trying to explain their growth and efficiency by their legal system and other economic and non-economic variables. La Porta et al.
(1998), Levine (1999) and others argue that legal systems that protect creditors and enforce contracts are likely to encourage greater financial intermediary development than legal and regulatory systems that enforce contracts ineffectively. Following this literature, we use the origin of law as a potential explanatory variable for the auction mechanism design. Rajan and Zingales (1998, 2003) discuss how to measure financial development and suggest that the measures should capture the ease with which any entrepreneur, company or country can raise funds and the confidence with which investors anticipate an adequate return. Allen et al. (2006) find a link between the economic system and the financial system. Here we use two variables: capitalization divided by GDP and the ranking of "ease of doing business". There is a growing literature that connects different aspects of political forces to the structure of financial markets. Examples include Perotti and von Thadden (2006), Pagano and Volpin (2001, 2005), Bolton and Rosenthal (2002) and Biais and Perotti (2002), among others. Given this literature, we collected data that include indexes that rank countries by economic freedom and level of corruption.12
11 We also found that most countries using both mechanisms have the right to change the quantity after viewing the bidding results (67% for the discriminatory and 56% for the uniform), yet some of them do not use this right.
3.1. Data sources (explanatory variables)
We collected several explanatory variables that describe the auctioned assets and the issuer from the World Bank and its International Finance Corporation (IFC), Moody's, the Wall Street Journal and Transparency International. For the specific characteristics of the bonds being auctioned, we use an estimate of sovereign default risk: Moody's sovereign debt ratings (August 2005) and the World Bank Indebtedness Classification (2003).13 The rationale for investigating the effect of sovereign risk on the mechanism choice is the potential relationship between risk and the "winner's curse". We also include variables that describe the legal system, the financial structure and the economic environment of the countries that issue the debt. The legal system of a country can be classified either as civil (Roman) law or as common law. Common law is associated with countries that have a more liberal economic system and a small role for the government, like Britain, the United States and Australia, while civil law is associated with economies where the government plays a larger role, like France, Germany and Japan. Stock market capitalization as a percentage of GDP (World Bank, 2003) serves as a proxy for the degree of development of the financial markets, while GDP (World Bank, 2003) itself serves as a proxy for country size. We also use several indexes that rank the level of competitiveness, economic freedom and corruption. The ease of doing business 2006 index (source: IFC) ranks countries on their ease of doing business from 1 to 175. A high ranking means the regulatory environment is conducive to the operation of business.
The CPI Corruption 2005 Index (source: Transparency International) aims to measure the overall extent of corruption (frequency and/or size of bribes) in the public and political sectors. The index ranks countries from 1 to 158. The Index of Economic Freedom 2006 (source: The Heritage Foundation/Wall Street Journal) uses 50 independent variables divided into 10 broad factors of economic freedom to rank 161 countries.
3.2. Empirical findings—a univariate investigation
We divided our sample into three categories according to the pricing mechanism: those that use the discriminatory auction, those that use the uniform auction and those that use both. Table 2 provides the means and medians of these variables by auction mechanism. First, we find that countries that use a discriminatory auction have, on average, a significantly lower capitalization to GDP ratio compared with countries that use a uniform auction (p = 0.03) and countries that use both (p = 0.04). There is no significant difference in the averages of this ratio between countries that use both mechanisms and those that use the uniform one. Second, we find that the type of law practiced in countries that use a discriminatory auction is significantly (p = 0.038) different from the legal system in countries that use a uniform auction. Specifically, we find that countries that use a discriminatory auction tend to be countries with a civil law system.14 Third, we do not find GDP to be significantly different between countries that use the discriminatory auction and countries that use the uniform one.
Fourth, though we find the frequency measure of the indebtedness classification to be higher for countries that use a discriminatory auction compared with those that use a uniform one, the difference is only marginally significant.15 Fifth, we find, using a standard non-parametric test, that the ranking in the ease of doing business index is significantly higher for countries that use a uniform auction than for those that use a discriminatory one. Though we find that a lower corruption index level and a higher level of the economic freedom index are associated with countries that employ a uniform auction compared with the discriminatory one, these differences are not statistically significant. In summary, the univariate investigation indicates that the variables associated with the development of financial markets (capitalization to GDP, ease of doing business and the type of law employed) are statistically significant.
3.3. A multivariate investigation—multinomial logit and discriminant analysis
To examine which variables affect the mechanism choice, we also conducted a multinomial regression analysis.16 Our dependent variable, the auction mechanism, was classified into four categories: uniform, discriminatory, both types, other
12 While we would like to have additional variables, such as the number of participants in the auction markets and their relative share in dollar terms, this information is not only unavailable to us but is also unavailable to most issuers (e.g. central banks and treasuries), since the buyers may also represent other participants. For a discussion of data issues see Fleming (2007). 13 In 2003, countries with a present value of debt service greater than 220% of exports or 80% of GNI were classified as severely indebted, countries whose present value of debt service exceeded 132% of exports or 48% of GNI were classified as moderately indebted and countries that did not fall into either group were classified as less indebted.
14 The same applies to the difference between countries that use a discriminatory auction versus countries that use both types of auctions. 15 Moody's rating of over 60% of the countries that use the uniform price mechanism is Aaa. This is true for only 17% of the countries that use the discriminatory mechanism. 16 Multinomial logit models are an extension of logistic models to more than two alternatives.
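The single-variable discriminant classification used for robustness in this section amounts to a one-variable cutoff rule on capitalization/GDP. A toy sketch with invented numbers (not the paper's sample):

```python
# Hypothetical (cap/GDP ratio, observed mechanism) pairs -- illustrative only.
sample = [
    (0.28, "discriminatory"), (0.35, "discriminatory"), (0.15, "discriminatory"),
    (0.40, "discriminatory"), (0.33, "discriminatory"),
    (1.10, "uniform"), (0.95, "uniform"), (0.80, "uniform"), (0.45, "uniform"),
]

def predict(cap_gdp, cutoff=0.5):
    # Deeper stock markets (relative to GDP) predict a uniform price auction.
    return "uniform" if cap_gdp > cutoff else "discriminatory"

hits = sum(predict(ratio) == mechanism for ratio, mechanism in sample)
print(f"{hits}/{len(sample)} classified correctly")  # 8/9 classified correctly
```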
Table 2 Sovereign classification by auction method and by country characteristics.

                                                  Discriminatory (N=24)      Uniform (N=9)              Both (N=9)
% of civil law                                    83a                        44%                        43%
Average stock market capitalization (% of GDP)    38%b (std = 32%)           97% (std = 69%)            54% (std = 42%)
Median stock market capitalization (% of GDP)     28c                        101                        42
Average GDP                                       2.49E+11 (std = 5.80E+11)  1.43E+12 (std = 3.56E+12)  5.54E+11 (std = 6.36E+11)
% of indebtedness classification                  67d                        33                         44
Average ranking of ease of doing business         56e                        25f                        62
Median ranking of ease of doing business          52g                        11h                        70
Average ranking of corruption index               61i                        33                         44
Median ranking of corruption index                51j                        17                         40
Average ranking of economic freedom index         55k                        39                         51
Median ranking of economic freedom index          44l                        30                         42

This table provides descriptive statistics of the countries according to the auction mechanism employed by them and the country classification on several dimensions. Indebtedness classification (source: World Bank, 2003): the World Bank classifies countries by their level of indebtedness for the purpose of developing debt management strategies, using a three-point scale: severely indebted (S), moderately indebted (M) and less indebted (L). The indebtedness classification serves as a proxy for the riskiness of the country. Civil (Roman) law versus common law: this variable was proposed by La Porta et al. (1998). Stock market capitalization as a percentage of GDP (source: World Bank, 2003): market capitalization is the share price times the number of shares outstanding. GDP (source: World Bank, 2003) is measured in current US$. Ease of doing business 2006 (source: IFC, published in 2005): the ease of doing business index ranks economies from 1 to 155. The CPI corruption index 2005 (source: Transparency International) aims to measure the overall extent of corruption (frequency and/or size of bribes) in the public and political sectors; the index ranks countries from 1 to 158. The index of economic freedom 2006 (source: the Heritage Foundation/Wall Street Journal) uses 50 independent variables divided into 10 broad factors of economic freedom to rank 161 countries.
a Based on 23 observations since we do not have the classification of the source of law for the Solomon Islands.
b Based on 19 observations since data were not available for Cambodia, Macedonia, Malta, Cyprus and the Solomon Islands.
c Based on 19 observations since data were not available for Cambodia, Macedonia, Malta, Cyprus and the Solomon Islands.
d Based on 21 observations since data were not available for Malta, Cyprus and the Solomon Islands.
e Based on 22 observations since data were not available for Malta and Cyprus.
f Based on eight observations since data were not available for Trinidad and Tobago.
g Based on 22 observations since data were not available for Malta and Cyprus.
h Based on eight observations since data were not available for Trinidad and Tobago.
i Based on 23 observations since data were not available for the Solomon Islands.
j Based on 23 observations since data were not available for the Solomon Islands.
k Based on 23 observations since data were not available for the Solomon Islands.
l Based on 23 observations since data were not available for the Solomon Islands.
types (Beck et al., 2000). We estimated four different models with different sets of independent variables. In Table 3, we present the values of the coefficients and the statistical significance only for the comparison between the uniform auction and the discriminatory one. Our main finding is that capitalization/GDP is positively and significantly correlated with the choice of a uniform auction rather than the discriminatory one. The dummy variable for civil law versus common law is significantly correlated with the bidding system.17 Neither GDP by itself nor the dummy for the indebtedness classification is significantly correlated with the mechanism choice.18 For robustness, we also conducted a discriminant analysis, which is used to classify cases into categories of a categorical dependent variable. The results that we obtain are consistent with our multinomial logit results. We find that we can correctly classify 82% of the observations using only the capitalization/GDP ratio. Moreover, adding other variables from our list does not improve our ability to classify (the Wilks' lambda test is significant at 0.007). Why does the financial market development factor play such an important role in the auction design decision of the issuer? Why do countries with less developed financial markets choose the discriminatory auction? Our conjecture is related to the bargaining power of the different financial players in the market. In many countries, the issuer cannot rely on sufficient (at a desirable minimum price) direct investor participation and needs the help of the intermediaries to sell the issue. If the intermediaries prefer a discriminatory auction, then the issuer has an incentive to use this auction system.19 Why would dealers/intermediaries prefer a discriminatory mechanism? One possible explanation is that this mechanism does not result in one known equal price for all investors, which helps them to sell at a higher price in the secondary market.
Another possible explanation relates to Friedman’s argument that the discriminatory mechanism reduces the
17 When the two variables are used together, only capitalization/GDP remains significant. This could be due to multicollinearity; the Pearson correlation between these two variables (legal system and the capitalization/GDP ratio) is 0.354, which is significant. 18 We also examined the choice between using both mechanisms versus using only the discriminatory one. The only variable that is significant and negatively correlated with the decision to use "both" mechanisms rather than the discriminatory one is the dummy variable for civil law. 19 For part of our sample, we were able to collect the total size of government debt, and indeed those countries that use a discriminatory price mechanism have, on average, a larger government debt to GDP ratio.
Table 3 What explains auction type choices—multinomial analysis.

Variables                              1                2                  3                  4
Constant                               2.572 (2.995)    0.503 (0.765)      0.110 (0.154)      1.535 (1.233)
Cap/GDP                                0.030 (2.579)    –                  –                  0.025 (2.075)
Dummy (indebtedness classification)    –                1.069 (1.085)      –                  –
GDP                                    –                3.66e−13 (0.847)   7.60e−13 (1.459)   –
Dummy (civil law)                      –                1.823 (2.020)      1.140 (1.071)      –
Pseudo R²                              0.096            0.088              0.126              0.106
Prob > χ²(n)                           0.023            0.115              0.057              0.069

For completeness and statistical accuracy, we conducted a multinomial analysis that included four auction categories: uniform, discriminatory, both and other mechanisms. We present here only the comparison between the uniform and the discriminatory mechanism; the discriminatory mechanism is the comparison group. The independent variables are as follows: a dummy for the indebtedness classification (source: World Bank, 2003); the civil (Roman) law versus common law variable proposed by La Porta et al. (1998), with which we examine whether the auction mechanism is associated with the legal system in a country; stock market capitalization as a percentage of GDP (source: World Bank, 2003); and GDP (source: World Bank, 2003), measured in current US$. z values are in parentheses. We estimated four different specifications. Significant at the 5% level. Significant at the 10% level.
number of potential bidders and hence the number of potential competitors, which could result in their paying lower prices. A study by Sade et al. (2006) showed that in the discriminatory mechanism, on average, the participants collude more and pay lower prices. On the other hand, in countries with well developed financial markets, the intermediaries have less bargaining power in setting the auction mechanism, since the central planner can rely on public participation.20 Given the intermediaries' assumed preferences on one hand, the investors'/public's assumed preferences on the other hand and the issuer's objective, it is clear why the bargaining power of the three different stakeholders may affect the choice of the auction mechanism.21 To provide additional support for our conjecture that bargaining power may drive the observed results, we searched for a proxy for the relative power of the dealers. A suitable proxy, in our opinion, is the level of concentration of the banking system. In many countries, though not in the US, the commercial banks serve as the dealers in the bond market. Thus, the higher the concentration, the higher is their bargaining power. We use the 2004 bank concentration measure from the updated version of the "New Database on Financial Development and Structure" of the World Bank, constructed by Beck, Demirgüç-Kunt and Levine. Their bank concentration measure is calculated as the value of the assets of the three largest banks as a share of all commercial bank assets in the country. For each auction mechanism, we counted the number of countries whose concentration value is above the median of all countries in the sample and divided this number by the total number of countries that use the respective mechanism.
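The concentration proxy and the above-median comparison just described can be sketched as follows (bank asset figures are invented for illustration, not values from the Beck, Demirgüç-Kunt and Levine database):

```python
def top3_share(bank_assets):
    """Assets of the three largest banks as a share of all commercial bank assets."""
    return sum(sorted(bank_assets, reverse=True)[:3]) / sum(bank_assets)

# country -> (auction mechanism, commercial bank asset sizes); illustrative only
countries = {
    "A": ("discriminatory", [50, 30, 20, 5, 5]),
    "B": ("discriminatory", [40, 35, 15, 10]),
    "C": ("uniform",        [20, 15, 15, 15, 15, 10, 10]),
    "D": ("uniform",        [25, 20, 20, 20, 15]),
}

concentration = {c: top3_share(assets) for c, (_, assets) in countries.items()}
median = sorted(concentration.values())[len(concentration) // 2 - 1]  # lower median
for mechanism in ("discriminatory", "uniform"):
    group = [c for c, (m, _) in countries.items() if m == mechanism]
    above = sum(concentration[c] > median for c in group)
    print(mechanism, above / len(group))
# prints:
# discriminatory 1.0
# uniform 0.0
```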
We find that the sample of countries that use the discriminatory mechanism has a higher proportion of countries whose concentration level is above the sample median (0.55), while this ratio is lower for countries that use the uniform one (0.44).22 Finally, for whatever it is worth, we would like to cite the following quote made in reference to the Treasury's move from a discriminatory auction to a uniform one: "But some primary dealers responded to the Treasury's trial balloon last week by saying that nobody will bid for these bonds at a Dutch auction. Are they wrong?" (WSJ/Diana B. Henriques, Treasury's Troubled Auctions, 1991).
4. Summary and conclusions
In auctioning financial assets, governments face a major decision: what is the optimal pricing mechanism to sell their debt? Should it be a uniform price auction or a discriminatory one? The existing theoretical and empirical work is ambivalent about the method that a sovereign should use. 20 An argument consistent with this conjecture is made by Brenner et al. (2007) in an experimental study. They show that when investors are given the choice between a uniform auction and a discriminatory one, they prefer to participate in a uniform auction and are willing to pay higher prices. It is suggested that a possible reason for such a preference is that uniform auctions are perceived as "fair" and transparent by the participants. See also Garbade (2004) for a description of the 1959 testimony by Robert Anderson, Secretary of the Treasury, who suggested that small banks, corporations and individuals do not have the "professional capacity" to bid at a discriminatory auction. 21 It could be argued that the main consideration in choosing a discriminatory auction in the US Treasury buyback program is the dealers' bargaining power. 22 Though this result is statistically insignificant, possibly due to the sample size, it supports our conjecture.
We find that most countries use the discriminatory method and fewer use the uniform one. We also find that most market-oriented economies use the uniform price mechanism and that countries that use the uniform price mechanism tend to be "common law" countries and have, on average, a more favorable ranking for "ease of doing business" and economic freedom, and a lower level of corruption. Using multinomial analysis, we find that capitalization/GDP is correlated with the mechanism choice. This is supported by a discriminant analysis. So why do we find so many countries using the discriminatory pricing method? Our conjecture is that the financial markets in many of these countries are dominated by a few large financial intermediaries and it is in their interest (paying lower prices) to have a discriminatory auction rather than a uniform one. These few institutions are better informed than the rest of the public because they hold a large portion of the potential bids, either as proprietary bidders or as agents for other bidders. This conjecture is supported by our tests, which show that the discriminatory method is used more in countries which have less developed financial markets.23 Future research should use additional variables to investigate further the linkage between auction design, financial markets and economic variables, and why so many countries use the discriminatory method. The effect of the secondary market on auction design is an interesting topic, and so is a study of the switches that some countries have made from one auction type to another, the reasons behind them and their consequences.
Acknowledgments We would like to thank the editor, Robert King and an anonymous referee for their helpful comments and suggestions. We benefited from discussions with Bill Allen, Bruno Biais, Peter Cramton, Kenneth Garbade, Avner Kalay, Marco Pagano, Michal Passerman, Jesus M. Salas, Anthony Saunders, Raghu Sundaram, Avi Wohl, Yishay Yafeh and Jaime Zender. We thank Moran Ofir for her excellent research assistance. We would also like to thank the participants of the 2006 European Finance Association Meeting in Zurich, MTS 2006, Istanbul and FUR XIII 2006, Rome. We also benefited from comments received from the participants of seminars at Tel-Aviv University, IDC (Israel), NYU, the University of Colorado at Boulder, University of Massachusetts at Amherst, University of Utah, University of Houston and the Federal Reserve Bank of NY. We thank ‘‘The Caesarea Edmond Benjamin de Rothschild Center for Capital Markets and Risk’’ at IDC, the Krueger Center for Finance and the Zagagi Center at the Hebrew University of Jerusalem for partial financial support.
Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2008.12.012.
References

Allen, F., Bartiloro, L., Kowalewski, O., 2006. Does economic structure determine financial structure? Working Paper, Wharton School, University of Pennsylvania.
Ausubel, L., Cramton, P., 2002. Demand reduction and inefficiency in multi-unit auctions. Working Paper, University of Maryland.
Back, K., Zender, J.F., 1993. Auctions of divisible goods: on the rationale for the treasury experiment. Review of Financial Studies 6, 733–764.
Bartolini, L., Cottarelli, C., 1997. Treasury bill auctions: issues and uses. In: Blejer, M.I., Ter-Minassian, T. (Eds.), Macroeconomic Dimensions of Public Finance: Essays in Honor of Vito Tanzi. Routledge, London, pp. 267–336.
Beck, T., Demirgüç-Kunt, A., Levine, R., 2000. A new database on financial development and structure. World Bank Economic Review 14, 597–605.
Biais, B., Perotti, E., 2002. Machiavellian privatization. American Economic Review 92, 240–258.
Bolton, P., Rosenthal, H., 2002. Political intervention in debt contracts. Journal of Political Economy 110, 1103–1134.
Brenner, M., Galai, D., Sade, O., 2007. Endogenous bidder preferences in divisible good auctions: discriminatory versus uniform. Working Paper, New York University.
Fleming, M., 2007. Who buys treasury securities at auction? Federal Reserve Bank of New York Current Issues in Economics and Finance 13, 1–7.
Friedman, M., 1960. A Program for Monetary Stability. Fordham University Press, New York.
Garbade, K., 2004. The institutionalization of treasury note and bond auctions, 1970–75. Federal Reserve Bank of New York Economic Policy Review, 29–45.
Han, B., Longstaff, F., Merrill, C., 2007. The "cherry-picking" option in the US treasury buyback auctions. Journal of Finance 62, 2673–2693.
Hortaçsu, A., 2002. Mechanism choice and strategic bidding in divisible good auctions: an empirical analysis of the Turkish treasury auction market. Working Paper, University of Chicago.
Jegadeesh, N., 1993. Treasury auction bids and the Salomon squeeze. Journal of Finance 48, 1403–1419.
La Porta, R., Lopez-de-Silanes, F., Shleifer, A., Vishny, R., 1998. Law and finance. Journal of Political Economy 106, 1113–1155.
Leong, D., 1999. Debt management: theory and practice. Treasury Occasional Paper No. 10.
Levine, R., 1999. Law, finance and economic growth. Journal of Financial Intermediation 8, 8–35.
Malvey, P., Archibald, C., Flynn, S.T., 1995. Uniform-Price Auctions: Evaluation of the Treasury Experience. Office of Market Finance, US Treasury.
Malvey, P., Archibald, C., 1998. Uniform-Price Auctions: Update of the Treasury Experience. Office of Market Finance, US Treasury.
Pagano, M., Volpin, P., 2001. The political economy of finance. Oxford Review of Economic Policy 17, 502–519.
23 An additional explanation for the choice of a given auction method has to do with the evolution of financial markets around the globe. Since the development of financial markets elsewhere has, by and large, lagged behind the US, many countries may simply have followed the pre-change US example without questioning its rationale or whether it fits their own market structure.
Pagano, M., Volpin, P., 2005. The political economy of corporate governance. American Economic Review 95, 1005–1030.
Perotti, E., von Thadden, E., 2006. The political economy of corporate control and labor rents. Journal of Political Economy 114, 145–174.
Rajan, R., Zingales, L., 1998. Financial dependence and growth. American Economic Review 88, 559–586.
Rajan, R., Zingales, L., 2003. The great reversals: the politics of financial development in the twentieth century. Journal of Financial Economics 69, 5–50.
Sade, O., Schnitzlein, C., Zender, J., 2006. Competition and cooperation in divisible good auctions: an experimental examination. Review of Financial Studies 19, 195–235.
Tenorio, R., 1993. Revenue-equivalence and bidding behavior in a multi-unit auction market: an empirical analysis. Review of Economics and Statistics 75, 302–314.
Umlauf, S., 1993. An empirical study of the Mexican treasury bill auction. Journal of Financial Economics 33, 313–340.
Wilson, R., 1979. Auctions of shares. Quarterly Journal of Economics 93, 675–698.
Journal of Monetary Economics 56 (2009) 275–282
Identifying the interdependence between US monetary policy and the stock market$

Hilde C. Bjørnland a,b, Kai Leitemo a,c

a Norwegian School of Management (BI), Nydalsveien 37, N-0442 Oslo, Norway
b Norges Bank, P.O. Box 1179 Sentrum, 0107 Oslo, Norway
c Bank of Finland, P.O. Box 160, 00101 Helsinki, Finland
Article history: Received 27 October 2005; received in revised form 2 December 2008; accepted 2 December 2008; available online 10 December 2008.

Abstract

We estimate the interdependence between US monetary policy and the S&P 500 using structural vector autoregressive (VAR) methodology. A solution is proposed to the simultaneity problem of identifying monetary and stock price shocks by using a combination of short-run and long-run restrictions that maintains the qualitative properties of a monetary policy shock found in the established literature [Christiano, L.J., Eichenbaum, M., Evans, C.L., 1999. Monetary policy shocks: what have we learned and to what end? In: Taylor, J.B., Woodford, M. (Eds.), Handbook of Macroeconomics, vol. 1A. Elsevier, New York, pp. 65–148]. We find great interdependence between the interest rate setting and real stock prices. Real stock prices immediately fall by seven to nine percent in response to a monetary policy shock that raises the federal funds rate by 100 basis points. A stock price shock increasing real stock prices by one percent leads to an increase in the interest rate of close to 4 basis points. © 2008 Elsevier B.V. All rights reserved.
JEL classification: E61; E52; E43

Keywords: VAR; Monetary policy; Asset prices; Identification
1. Introduction It is commonly accepted that monetary policy influences private-sector decision making. If prices are not fully flexible in the short run, as assumed by the New Keynesian theory framework, the central bank can temporarily influence the real interest rate and therefore have an effect on real output in addition to nominal prices. It is commonly believed that central banks have clear objectives for exerting control over real interest rates, including low and stable inflation and production close to the natural rate. In order to best fulfill these objectives, the central bank must monitor, respond to and influence private-sector decisions appropriately. Thus, the central bank and private sector mutually affect each other, leading to considerable interdependence between the two sectors. In the financial markets, where information is readily available and
$ We thank the Editor Martin Eichenbaum, two anonymous referees, Ida Wolden Bache, Petra Geraats, Bruno Gerard, Steinar Holden, Jan Tore Klovland, Jesper Lindé, Roberto Rigobon, Erling Steigum, Kjetil Storesletten, Øystein Thøgersen, Karl Walentin and seminar participants at the Econometric Society World Congress 2005, Cambridge University, the Norwegian School of Economics and Business Administration, the Norwegian School of Management BI and the University of Oslo for valuable comments and suggestions. We gratefully acknowledge financial support from the Norwegian Financial Market Fund under the Norwegian Research Council. The usual disclaimer applies. The views expressed in this paper are those of the authors and should not be attributed to Norges Bank or the Bank of Finland. Corresponding author at: Norwegian School of Management (BI), Nydalsveien 37, N-0442 Oslo, Norway. E-mail addresses:
[email protected] (H.C. Bjørnland),
[email protected] (K. Leitemo).
0304-3932/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2008.12.001
prices are sensitive to agents’ expectations about the future, this interdependence should largely be simultaneous. Allowing for simultaneity between monetary policy and financial markets is therefore likely to be both quantitatively and qualitatively important when measuring the degree of interdependence. The aim of this paper is to explore just how important this link is. Analyses of the effects of monetary policy have to a large extent been addressed in terms of vector autoregressive (VAR) models, initiated by Sims (1980). Yet, studies that use VAR models to identify interdependence have found little interaction between monetary policy and asset prices, see for instance Lee (1992), Thorbecke (1997) and Neri (2004) among others. Furthermore, the effect of a monetary policy shock on real stock prices is often counterintuitive and may imply permanent effects on stock returns, which is clearly unrealistic. However, these conventional VAR studies have not allowed for simultaneous interdependence, as the structural shocks have been recovered using recursive, short-run restrictions on the interaction between monetary policy and asset prices. Their implausible results provide a strong motivation for moving to an alternative specification. In this study we analyze the interaction between asset prices and monetary policy in the US, represented by the S&P 500 and the federal funds rate, respectively, using a VAR model that takes full account of the potential simultaneity of interdependence. We solve the simultaneity problem by imposing a combination of short-run and long-run restrictions on the multipliers of the shocks, leaving the contemporaneous relationship between the interest rate and real stock prices intact. Identification is instead achieved by assuming that monetary policy can have no long-run effect on real stock prices, which is a common long-run neutrality assumption. 
By using only one long-run restriction, the simultaneity problem is addressed without deviating too far from the established literature (i.e., Christiano et al., 1999, 2005) of identifying monetary policy shocks. Contrary to previous studies, we find strong interaction effects between the stock market and interest rate setting. Much of this interaction is found to be simultaneous. These results are achieved without considerably altering conventional views on how monetary policy affects macroeconomic variables, as previously found in the VAR literature. Section 2 provides a brief survey of theoretical, methodological and empirical arguments related to the interaction between asset prices and monetary policy. Section 3 presents the identification scheme used in this VAR study to identify the interdependence between monetary policy and the stock market. Section 4 presents and discusses the empirical results, including issues pertaining to robustness. Section 5 concludes. 2. Monetary policy and stock price interaction: a short overview Economic theory suggests several reasons why there should be interaction effects between monetary policy and asset prices, in particular, stock prices. Since stock prices are determined in a forward-looking manner, monetary policy, and in particular surprise policy moves, is likely to influence stock prices through the interest rate (discount) channel, and indirectly through its influence on the determinants of dividends and the stock return premium by influencing the degree of uncertainty faced by agents. Asset prices may influence consumption through a wealth channel and investments through the Tobin Q effect and, moreover, increase a firm’s ability to fund operations (credit channel). 
The monetary policymaker who manages aggregate demand in an effort to control inflation and output thus has incentives to monitor asset prices in general, and stock prices in particular, and use them as short-run indicators for the appropriate stance of monetary policy.1 Therefore, there is likely to be considerable interdependence between stock price formation and monetary policymaking.2 Empirical modelers should thus be open to the potential influence of asset prices on monetary policymaking. 2.1. Empirical evidence Compared to the vast number of studies that analyze the influence of monetary policy actions on the macroeconomic environment, there are relatively few that attempt to model interactions between monetary policy and asset prices. Early attempts, like Geske and Roll (1983) and Kaul (1987), examine the causal chain between monetary policy and stock market returns separately (see Sellin (2001) for a comprehensive survey). More recently, empirical studies have tended to use a joint estimation scheme like the vector autoregressive approach, see e.g., Lee (1992), Patelis (1997), Thorbecke (1997), Millard and Wells (2003) and Neri (2004) among others. All of these studies find that monetary policy shocks account for only a small part of the variation in stock returns. However, stock prices tend to respond with a significant delay, which is difficult to understand from the perspective of financial market theory, which predicts 1
1 See Vickers (2000) for an overview of the use of asset prices in monetary policy in inflation-targeting countries. 2 The form of interaction is further complicated by issues of whether asset prices should be included in the central bank loss function (see e.g., Bernanke and Gertler, 1999; Carlstrom and Fuerst, 2001), how to use asset price information efficiently and whether asset prices convey information that is not available elsewhere (e.g., Faia and Monacelli, 2007), whether the credit channel is important (see Bernanke et al., 2000; Bernanke and Gertler, 1989) and whether asset prices include expectations-driven sunspot components that may influence target variables more than is reflected by the fundamental part of the asset price (see e.g., Cecchetti et al., 2000; Bernanke and Gertler, 2001).
that asset prices respond to news on impact. A major shortcoming in these papers is that shocks are identified using Cholesky decomposition, imposing a recursive ordering of the identified shocks. In many of these papers, stock prices are ordered last, implying that they react contemporaneously to all other shocks, but that variables ordered ahead of the stock market (i.e., monetary policy stance) react with a lag to stock market news. Hence, simultaneous interdependence is ruled out by assumption. In contrast to the studies referred to above, Lastrapes (1998) and Rapach (2001) identify monetary shocks in a VAR model using solely long-run (neutrality) restrictions. Both find that monetary shocks have a considerably stronger effect on the stock market. However, the reverse causation, from the stock market to systematic monetary policy, is either ignored or addressed rudimentarily.3 Recently, the simultaneity problem has been addressed using high-frequency observation (i.e., daily data), to analyze how asset prices are associated with particular policy actions in the short run. In an influential paper, Rigobon and Sack (2004) use an identification technique based on the heteroscedasticity of shocks present in high-frequency data to analyze the impact effect of monetary policy on the stock market. They find that following a surprise interest rate increase, stock prices decline significantly. Furthermore, using the same method, but analyzing reverse causation, Rigobon and Sack (2003) find that stock market movements have a significant impact on short-term interest rates, driving them in the same direction as changes in stock prices. These results are somewhat stronger than results found in more conventional ‘‘event studies’’ like Bernanke and Kuttner (2005). 
The studies cited above are useful for quantifying the immediate (short-run) effect of a specific action, such as a monetary policy surprise, but they ignore the dynamic adjustments that follow the initial shock. Furthermore, they rarely allow for two-way causation, focusing exclusively on the effects of either monetary policy shocks or stock price shocks. To take both into account, one should identify the shocks within a system, such as the structural VAR of the present study.
3. The identified VAR model

The VAR model comprises monthly data on the annual change in the log of consumer prices (π_t), the annual change in the log of the commodity price index in US dollars (Economist Commodity price index, all items, Δc_t), the log of the (detrended)4 industrial production index (y_t), the federal funds rate (i_t) and the log of the S&P 500 stock price index (s_t). Stock prices are deflated by the CPI, so that they are measured in real terms, and then differenced (to denote monthly changes, i.e., Δs_t). The federal funds rate and the stock price index are observed daily but averaged monthly, so as to reflect the same information content as the other monthly variables.
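The transformations just described can be sketched as follows. This is a minimal sketch with hypothetical column names (the paper does not publish code); the deflation, differencing and linear detrending steps follow the text.

```python
# Sketch of the variable construction described above. Column names
# (cpi, comm, ip, ffr, sp500) are hypothetical placeholders for the
# monthly CPI, commodity price index, industrial production, federal
# funds rate and S&P 500 series.
import numpy as np
import pandas as pd

def build_var_data(df: pd.DataFrame) -> pd.DataFrame:
    """df: monthly data with columns cpi, comm, ip, ffr, sp500."""
    out = pd.DataFrame(index=df.index)
    # annual (12-month) log changes for consumer and commodity prices
    out["pi"] = np.log(df["cpi"]).diff(12)
    out["dc"] = np.log(df["comm"]).diff(12)
    # log industrial production, detrended with a linear trend
    ly = np.log(df["ip"])
    t = np.arange(len(ly))
    coefs = np.polyfit(t, ly, 1)
    out["y"] = ly - np.polyval(coefs, t)
    # monthly change in the log of CPI-deflated (real) stock prices
    out["ds"] = np.log(df["sp500"] / df["cpi"]).diff(1)
    out["i"] = df["ffr"]
    return out.dropna()
```

The first twelve months are lost to the annual differences, which `dropna()` removes.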
3.1. Identification

Let Z_t be the (5×1) vector of macroeconomic variables discussed above, ordered as follows: Z_t = [y_t, π_t, Δc_t, Δs_t, i_t]′. Specified this way, the VAR is assumed to be stable5 and can be inverted and written in terms of its moving average (MA) representation (ignoring any deterministic terms):

Z_t = B(L)v_t,   (1)

where v_t is a (5×1) vector of reduced-form residuals assumed to be identically and independently distributed, v_t ~ iid(0, Ω), with positive-definite covariance matrix Ω, and B(L) is the (5×5) convergent matrix polynomial in the lag operator L, B(L) = Σ_{j=0}^∞ B_j L^j. We assume that the underlying orthogonal structural disturbances (ε_t) can be written as linear combinations of the innovations (v_t), i.e., v_t = Sε_t, where S is the (5×5) contemporaneous matrix; (1) can then be written in terms of the structural shocks as

Z_t = C(L)ε_t,   (2)

where B(L)S = C(L). To identify S, the ε_t are first assumed to be normalized to have unit variance. We order the vector of uncorrelated structural shocks as ε_t = [ε_t^y, ε_t^π, ε_t^c, ε_t^SP, ε_t^MP]′, where ε^MP is the monetary policy shock and ε^SP the stock price shock, while the remaining shocks are identified from their respective equations but left uninterpreted. We follow the standard closed economy literature (Christiano et al., 1999, 2005) and identify monetary policy shocks by assuming that macroeconomic variables do not react simultaneously to policy variables, while a simultaneous reaction from the macroeconomic environment to policy variables is allowed for. This is taken care of by placing the three macroeconomic variables above the interest rate in the ordering and assuming three zero restrictions on the relevant coefficients in the fifth

3 Another strand of literature estimates the contribution of asset prices in (Taylor-type) interest rate reaction functions (i.e., Chadha et al., 2003), but is subject to the same simultaneity problem as conventional VARs (see Rigobon and Sack (2003) for a more critical review). 4 Giordani (2004) has argued that if one follows the model set up in Svensson (1997) as the data-generating process in monetary policy studies, the output gap, rather than the level of output, should be included in the VAR. 5 This will be discussed further in Section 4 below.
column in the S matrix, as follows:

⎡ y_t  ⎤          ⎡ S11  S12? →  0    0    0   ⎤ ⎡ ε_t^y  ⎤
⎢ π_t  ⎥          ⎢ S21  S22   0    0    0   ⎥ ⎢ ε_t^π  ⎥
⎢ Δc_t ⎥ = B(L)  ⎢ S31  S32  S33   0    0   ⎥ ⎢ ε_t^c  ⎥.   (3)
⎢ Δs_t ⎥          ⎢ S41  S42  S43  S44  S45 ⎥ ⎢ ε_t^SP ⎥
⎣ i_t  ⎦          ⎣ S51  S52  S53  S54  S55 ⎦ ⎣ ε_t^MP ⎦
Similar recursive restrictions are imposed on the relationship between the stock price shock and the macroeconomic variables. By placing three zero restrictions in the fourth column in (3), macroeconomic variables react with a lag to the stock price shock, while stock prices can respond immediately to all variables (as there are no zeros in the fourth row). Regarding the interaction between monetary policy and stock price shocks, the standard practice in the VAR literature has been to assume either that real stock prices respond with a lag to monetary policy shocks (S45 = 0), or that monetary policy responds with a lag to stock price shocks (the stock price is ordered below the interest rate). Only the latter allows for an immediate reaction in real stock prices to a monetary policy shock. However, as discussed above, such identifications rule out a potentially important channel for interaction between monetary policy and stock prices, which, if empirically relevant, would bias the results. We therefore impose the alternative identifying restriction that a monetary policy shock can have no long-run effect on the level of real stock prices. The restriction is applied by setting the infinite sum of relevant lag coefficients in (2), Σ_{j=0}^∞ C_{45,j}, equal to zero. Writing the long-run expression of C(L) as B(1)S = C(1), where B(1) = Σ_{j=0}^∞ B_j and C(1) = Σ_{j=0}^∞ C_j denote the (5×5) long-run matrices of B(L) and C(L), respectively, the long-run restriction C_45(1) = 0 implies

B_41(1)S15 + B_42(1)S25 + B_43(1)S35 + B_44(1)S45 + B_45(1)S55 = 0.   (4)
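The mechanics of a long-run zero restriction of this kind can be illustrated in a bivariate Blanchard–Quah-style sketch (our own simplification, not the paper's five-variable scheme; all numbers are hypothetical): with reduced-form covariance Ω and long-run multiplier B(1), choosing S = B(1)⁻¹ chol(B(1)ΩB(1)′) makes C(1) = B(1)S lower triangular, so the second shock has no long-run effect on the first variable.

```python
# Bivariate illustration of identification via a long-run zero
# restriction (a simplified sketch, not the paper's full scheme).
import numpy as np

def long_run_identify(B1: np.ndarray, Omega: np.ndarray) -> np.ndarray:
    """Return S such that S S' = Omega and C(1) = B1 @ S is
    lower triangular (shock 2 is long-run neutral for variable 1)."""
    C1 = np.linalg.cholesky(B1 @ Omega @ B1.T)  # lower-triangular C(1)
    return np.linalg.solve(B1, C1)              # S = B(1)^{-1} C(1)

# Example with hypothetical numbers:
B1 = np.array([[1.2, 0.4], [0.3, 1.1]])
Omega = np.array([[1.0, 0.3], [0.3, 0.5]])
S = long_run_identify(B1, Omega)
assert np.allclose(S @ S.T, Omega)        # reproduces reduced-form covariance
assert np.isclose((B1 @ S)[0, 1], 0.0)    # zero long-run multiplier
```

The five-variable case above works the same way, except that the Cholesky zeros handle the first rows and only the single restriction C_45(1) = 0 comes from the long run.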
To sum up, the system is now just identifiable.6 Recursive Cholesky restrictions identify the non-zero parameters above the interest rate equation, whereas the remaining parameters are uniquely identified from the long-run restriction C_45(1) = 0. That is, we assume that both stock price and monetary policy shocks have no immediate impact on the macroeconomic variables. The stock price shock is further differentiated from the monetary policy shock by allowing it to have a long-run impact on real stock prices. Note that the responses to the monetary policy shock (or the stock price shock) are invariant to the ordering of the first three variables. This follows from a generalization of Christiano et al. (1999, Proposition 4.1) and is available on request.

4. Empirical modeling and results

The model is estimated using monthly data from 1983M1 to 2002M12. As noted in Section 3, the choice of data and transformations reflects the data-generating process in Svensson's (1997) model, where annual inflation rates and the output gap (obtained by detrending output with a linear trend) are included in the VAR. It is, however, important that the variables in the VAR be stationary; otherwise the MA representation of the VAR may be non-convergent. This is confirmed using unit root tests, with the exception of consumer price inflation, which displays clear evidence of a stochastic trend drifting downwards (possibly reflecting a fall in the inflation target). We therefore remove the non-stationarity in inflation by taking first differences (the appendix shows that the results are robust to all our suggested data transformations). Lag reduction tests suggest that four lags are accepted at the one-percent level by all tests.
Using four lags, the VAR satisfies the stability condition (no eigenvalues, i.e., inverse roots, of the AR characteristic polynomial lie outside the unit circle) and basic diagnostic tests (i.e., there is no evidence of autocorrelation or heteroscedasticity in the model residuals).

4.1. Cholesky decomposition

To motivate our structural identification scheme, Fig. 1 gives an account of the impulse responses of interest rates and real stock prices to both a monetary policy shock and a stock price shock under standard Cholesky decomposition. These are shown for two different orderings of variables, with the interest rate and the stock price alternating as the ultimate and penultimate variables. The shocks are normalized so that the monetary policy shock increases the interest rate by one percentage point in the first month, while the stock price shock increases real stock prices by one percent in the first month. Both orderings produce almost identical impulse responses. Neither the monetary policy shock nor the stock price shock has any important contemporaneous effect on the other variables. Assuming that both the stock market and the monetary policymaker react to shocks in the other sector so that interaction is important, the restriction imposed by either Cholesky ordering distorts the estimates of the two shocks in such a way that the degree of interaction will seem unimportant. Furthermore, the effect of a contractionary monetary policy shock on real stock prices is counterintuitive, implying a permanent positive effect on stock returns, which is clearly unrealistic.
6 Note that (4) reduces to B_44(1)S45 + B_45(1)S55 = 0 with the zero contemporaneous restrictions applied.
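Spelling out the reduction in footnote 6: with the contemporaneous zeros S15 = S25 = S35 = 0 imposed, restriction (4) pins down S45 in terms of S55:

```latex
% (4) collapses to B_{44}(1)S_{45} + B_{45}(1)S_{55} = 0, hence
S_{45} \;=\; -\,\frac{B_{45}(1)}{B_{44}(1)}\, S_{55}.
```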
Fig. 1. Impulse responses of interest rates and real stock prices to a monetary policy shock (upper panels) and a stock price shock (lower panels) under standard Cholesky decompositions for two different orderings of variables, with the interest rate and the stock price alternating as the ultimate and penultimate variables.
4.2. Structural identification scheme

Turning to the structural model, Fig. 2 shows the impulse responses for the federal funds rate, the real stock price, annual inflation and industrial production (gap) from a monetary policy shock (top panels) and a stock price shock (lower panels). The responses are graphed with probability bands represented as .16 and .84 fractiles (as suggested by Doan, 2004).7 The shocks are again normalized in the first month. The upper panels of Fig. 2 show that the monetary policy shock temporarily increases interest rates. As is commonly found in the literature, output falls temporarily and reaches its minimum after a year and a half. The negative effect on output is clearly significantly different from zero, but after 4 years the effect has essentially died out. Initially, inflation increases. This increase, a "price puzzle" (see Eichenbaum, 1992), may be due to a cost channel of the interest rate (see Ravenna and Walsh, 2006; Chowdhury et al., 2006). However, after 4–6 months inflation starts to decline, so that in the longer run prices fall following a contractionary monetary policy shock. Turning to real stock prices, the monetary policy shock has a strong impact on stock returns, which immediately fall by about nine percent for each (normalized) 100 basis-point increase in the federal funds rate. This is consistent with results found in Rigobon and Sack (2004), who focus on short-run responses,8 but much larger than those found in traditional VARs. The fall in real stock prices is consistent with the increase in the discount rate of dividends associated with the increase in the federal funds rate, but also with temporarily reduced output and a higher cost of borrowing, which are likely to reduce expected future dividends. Following the initial shock, real stock prices fall for an additional month or so, before returning towards the average level as the long-run restriction bites.
Although interpretations of this result should be made with care, a potential explanation might be that as the interest rate gradually falls, the discounted value of expected future dividends increases while output and profits build up, leading to a normalization of real stock prices.
7 This is the Bayesian simulated distribution obtained by Monte Carlo integration with 2500 replications, using the approach for just-identified systems. The draws are made directly from the posterior distribution of the VAR coefficients (see Doan, 2004). 8 Rigobon and Sack (2004) find that ‘‘a 25 basis point increase in the three-month interest rate results in a 1.9% decline in the S&P 500 index and a 2.5% decline in the Nasdaq index.’’
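The band construction in footnote 7 reduces to taking pointwise fractiles over simulated impulse responses; a minimal sketch (function name and synthetic draws are ours; the posterior draw mechanism itself is not reproduced):

```python
# Error bands as .16 and .84 fractiles of simulated impulse responses,
# one row per Monte Carlo draw from the posterior of the VAR
# coefficients (here replaced by synthetic draws).
import numpy as np

def fractile_bands(draws: np.ndarray, lo: float = 16, hi: float = 84):
    """draws: (n_draws, n_horizons) array of simulated responses.
    Returns the lower and upper pointwise fractile bands."""
    return np.percentile(draws, [lo, hi], axis=0)

rng = np.random.default_rng(1)
draws = rng.normal(size=(2500, 49))     # 2500 replications, 49 horizons
lower, upper = fractile_bands(draws)
assert lower.shape == (49,) and (lower < upper).all()
```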
Fig. 2. Impulse responses for the federal funds rate, the real stock price, annual inflation and industrial production (gap) from a monetary policy shock (top panels) and a stock price shock (lower panels).
The lower panels of Fig. 2 show that a positive stock price shock increases both inflation and output in the short run. This is consistent with the view that a rise in real stock prices increases consumption through a wealth effect and investment through a Tobin Q effect, thus affecting aggregate demand. Due to nominal rigidities, prices react slowly and inflation rises in the intermediate run. The increase in inflation may, however, also be partly driven by the increase in the interest rate itself due to the initial price puzzle in the model. In any case, the response of the interest rate is consistent with an inflation-targeting central bank raising interest rates to curb the inflationary effects of increased aggregate demand. Stock price shocks are important indicators for the interest rate setting. A shock that increases real stock prices by one percent causes the interest rates to increase immediately by just less than four basis points, increasing to seven basis points
within a year. By increasing the interest rate, the FOMC achieves a reduction in aggregate demand through the usual interest rate channels and by reducing the positive impact on real stock prices. Again, our results are consistent with studies that focus on short-run responses (i.e., Rigobon and Sack, 2003),9 but larger than those found in traditional VAR analyses. How can one interpret the stock price shock? Under the "news" interpretation, the shock contains information about the future that is not yet incorporated in current macroeconomic variables, leading to a delayed but persistent change in productivity (see Beaudry and Portier, 2006). If the shock is non-fundamental (a sunspot), the innovation in real stock prices is driven purely by expectations, with no permanent effects on output. Although differentiating between the two interpretations would be interesting, the current framework does not allow us to do so, since under both interpretations the shock may have non-permanent effects on output (i.e., some productivity shocks may have only persistent but non-permanent effects on technology). Under both interpretations, however, the shock may contain vital information for the central bank, for reasons outlined in Section 2. One objection to our identification of the stock price shock is that the shock could have an immediate effect on other variables like production and consumption (i.e., Jaimovich and Rebelo, 2006). This is ruled out by our identification scheme. However, it is not unlikely that consumer prices, consumption and investment decisions are subject to implementation lags comparable to the model's monthly frequency (see Woodford (2003) and Svensson and Woodford (2005) for arguments),10 supporting our identifying assumptions.
The results reported so far suggest a great interdependence between the effects of the shocks.11 How can the zero interdependence found using the Cholesky decomposition above be reconciled with the large interdependence found in the present structural model? To understand this, assume for simplicity a system of two variables, the interest rate ($i_t$) and the real stock price ($s_t$). The reduced-form residuals are functions of the orthogonal structural shocks, the monetary policy shock $\varepsilon^{MP}_t$ and the stock price shock $\varepsilon^{SP}_t$:

$$u_{i,t} = \varepsilon^{MP}_t + a\,\varepsilon^{SP}_t, \qquad u_{s,t} = b\,\varepsilon^{MP}_t + \varepsilon^{SP}_t, \tag{5}$$

with covariance given by

$$\operatorname{cov}(u_{i,t}, u_{s,t}) = E\big[(\varepsilon^{MP}_t + a\,\varepsilon^{SP}_t)(b\,\varepsilon^{MP}_t + \varepsilon^{SP}_t)\big] = b\,\omega^2_{MP} + a\,\omega^2_{SP}.$$

Hence, a covariance close to zero implies either that the interdependence is zero, $\operatorname{cov}(u_{i,t}, u_{s,t}) = 0$ with $a = b = 0$ in (5) (as imposed using the Cholesky decomposition), or that the effects are opposite in sign and cancel out, $b = -(\omega^2_{SP}/\omega^2_{MP})\,a$. Only the structural identification scheme suggested here allows for the latter case.

4.3. Robustness

We study the robustness of the results using plausible alternative models. We first study alternative specifications of the baseline monthly model: varying the sample period and lag length, allowing for dummies (for specific events such as the stock market crashes of 1987 and 2001), using different transformations of the variables (first differences, no detrending, etc.) and changing the ordering of the variables. The same model is then estimated on quarterly data, allowing us to substitute GDP for industrial production as well as to expand the dimension of the model by including consumption and investment. The results remain robust to all of these variations. Among the monthly specifications, the baseline model has an average response across the models. The responses are marginally smaller in the quarterly model, but are robust to the various transformations. These results are reported in the supplementary data in the online version of this article.

5. Conclusion

We find that there is substantial simultaneous interaction between interest rate setting and shocks to real stock prices in the US. Just as monetary policy is important for the determination of stock prices, the stock market is an important source of information for the conduct of monetary policy. This result holds across many plausible alternative model specifications that allow for the possibility of simultaneous interaction.

Appendix A. Supplementary materials

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2008.12.001.
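The cancellation case in the two-variable example above is easy to verify numerically. A minimal sketch, with purely hypothetical parameter values chosen only to illustrate the algebra:

```python
import numpy as np

# Hypothetical structural shock variances and interdependence
# coefficient a (none of these values come from the paper).
w2_mp, w2_sp = 1.5, 0.8    # var of monetary policy / stock price shock
a = 0.04                   # stock price shock -> interest rate residual
b = -(w2_sp / w2_mp) * a   # the cancelling value from the text

# Reduced-form residuals u_t = A @ eps_t, so Cov(u) = A diag(w2) A'.
A = np.array([[1.0, a],
              [b, 1.0]])
D = np.diag([w2_mp, w2_sp])
Cov_u = A @ D @ A.T

# The off-diagonal element equals b*w2_mp + a*w2_sp, which vanishes
# here even though a and b are both nonzero: a Cholesky scheme that
# forces a = b = 0 would read this as no interdependence at all.
assert abs(Cov_u[0, 1]) < 1e-12
assert a != 0.0 and b != 0.0
```

A near-zero reduced-form covariance is therefore compatible with large, mutually offsetting structural responses, which is exactly the case the structural identification scheme is designed to detect.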
9 Rigobon and Sack (2003) find that a "5 percent rise in stock prices over a day causes the probability of a 25 basis point interest rate hike to increase by a half, while a similar-sized movement over a week has a slightly larger effect on anticipated policy actions."
10 We have experimented with an alternative identification scheme where output is ordered below real stock prices (i.e., allowing for an immediate impact of the stock price shock on output, but restricting real stock prices from responding on impact to output shocks). We find no significant impact effects on output from a stock price shock, taking this as support of our assumption about implementation lags in output.
11 The error variance decomposition (obtained on request) suggests that monetary and stock price shocks together account for almost all variation in the federal funds rate and stock prices on impact, leaving the other shocks to influence these variables only in the long run.
References

Beaudry, P., Portier, F., 2006. Stock prices, news and economic fluctuations. American Economic Review 96, 1293–1307.
Bernanke, B.S., Gertler, M., 1989. Agency costs, net worth, and business fluctuations. American Economic Review 79, 14–31.
Bernanke, B.S., Gertler, M., 1999. Monetary policy and asset volatility. Federal Reserve Bank of Kansas City Economic Review 84 (Fourth Quarter), 17–51.
Bernanke, B.S., Gertler, M., 2001. Should central banks respond to movements in asset prices? American Economic Review 91, 253–257.
Bernanke, B.S., Gertler, M., Gilchrist, S., 2000. The financial accelerator in a quantitative business cycle framework. In: Taylor, J.B., Woodford, M. (Eds.), Handbook of Macroeconomics, vol. 1C. Elsevier, New York, pp. 1341–1393.
Bernanke, B.S., Kuttner, K.N., 2005. What explains the stock market's reaction to Federal Reserve policy? Journal of Finance 60, 1221–1257.
Carlstrom, C.T., Fuerst, T.S., 2001. Monetary policy and asset prices with imperfect credit markets. Federal Reserve Bank of Cleveland Economic Review 37, 51–59.
Cecchetti, S.G., Genberg, H., Lipsky, J., Wadhwani, S., 2000. Asset Prices and Central Bank Policy. The Geneva Report on the World Economy No. 2, ICMB/CEPR.
Chadha, J.S., Sarno, L., Valente, G., 2003. Monetary policy rules, asset prices and exchange rates. CEPR Discussion Paper No. 4114.
Chowdhury, I., Hoffmann, M., Schabert, A., 2006. Inflation dynamics and the cost channel of monetary transmission. European Economic Review 50, 995–1016.
Christiano, L.J., Eichenbaum, M., Evans, C.L., 1999. Monetary policy shocks: what have we learned and to what end? In: Taylor, J.B., Woodford, M. (Eds.), Handbook of Macroeconomics, vol. 1A. Elsevier, New York, pp. 65–148.
Christiano, L.J., Eichenbaum, M., Evans, C.L., 2005. Nominal rigidities and the dynamic effects of a shock to monetary policy. Journal of Political Economy 113, 1–45.
Doan, T., 2004. RATS Manual, Version 6. Estima, Evanston, IL.
Eichenbaum, M., 1992. Comment on "Interpreting the macroeconomic time series facts: the effects of monetary policy". European Economic Review 36, 1001–1011.
Faia, E., Monacelli, T., 2007. Optimal interest rate rules, asset prices and credit frictions. Journal of Economic Dynamics and Control 31 (10), 3228–3254.
Geske, R., Roll, R., 1983. The fiscal and monetary linkage between stock returns and inflation. Journal of Finance 38, 1–33.
Giordani, P., 2004. An alternative explanation of the price puzzle. Journal of Monetary Economics 51, 1271–1296.
Jaimovich, N., Rebelo, S., 2006. Can news about the future drive the business cycle? American Economic Review, forthcoming.
Kaul, G., 1987. Stock returns and inflation: the role of the monetary sector. Journal of Financial Economics 18, 253–276.
Lastrapes, W.D., 1998. International evidence on equity prices, interest rates and money. Journal of International Money and Finance 17, 377–406.
Lee, B.-S., 1992. Causal relations among stock returns, interest rates, real activity, and inflation. The Journal of Finance 47, 1591–1603.
Millard, S.P., Wells, S., 2003. The role of asset prices in transmitting monetary and other shocks. Working Paper No. 188, Bank of England.
Neri, S., 2004. Monetary policy and stock prices. Working Paper No. 513, Bank of Italy.
Patelis, A.D., 1997. Stock return predictability and the role of monetary policy. The Journal of Finance 52, 1951–1972.
Rapach, D.E., 2001. Macro shocks and real stock prices. Journal of Economics and Business 53, 5–26.
Ravenna, F., Walsh, C.E., 2006. Optimal monetary policy with the cost channel. Journal of Monetary Economics 53, 199–216.
Rigobon, R., Sack, B., 2003. Measuring the reaction of monetary policy to the stock market. The Quarterly Journal of Economics 118, 639–669.
Rigobon, R., Sack, B., 2004. The impact of monetary policy on asset prices. Journal of Monetary Economics 51, 1553–1575.
Sellin, P., 2001. Monetary policy and the stock market: theory and empirical evidence. Journal of Economic Surveys 15, 491–541.
Sims, C.A., 1980. Macroeconomics and reality. Econometrica 48, 1–48.
Svensson, L.E.O., 1997. Inflation forecast targeting: implementing and monitoring inflation targets. European Economic Review 41, 1111–1146.
Svensson, L.E.O., Woodford, M., 2005. Implementing optimal policy through inflation-forecast targeting. In: Bernanke, B.S., Woodford, M. (Eds.), The Inflation-Targeting Debate. The University of Chicago Press, Chicago.
Thorbecke, W., 1997. On stock market returns and monetary policy. The Journal of Finance 52, 635–654.
Vickers, J., 2000. Monetary policy and asset prices. The Manchester School 68 (1), 1–22.
Woodford, M., 2003. Interest and Prices. Princeton University Press, Princeton and Oxford.